:py:mod:`orcanet.in_out`
========================

.. py:module:: orcanet.in_out

.. autoapi-nested-parse::

   Utility code regarding user input.

   ..
       !! processed by numpydoc !!


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   orcanet.in_out.IOHandler


Functions
~~~~~~~~~

.. autoapisummary::

   orcanet.in_out.get_subfolder
   orcanet.in_out.get_inputs
   orcanet.in_out.split_name_of_predfile
   orcanet.in_out.h5_get_number_of_rows
   orcanet.in_out.use_local_tmpdir


.. py:function:: get_subfolder(main_folder, name=None, create=False)

   
   Get the path to one or all subfolders of the main folder.


   :Parameters:

       **main_folder** : str
           The main folder.

       **name** : str or None
           The name of the subfolder.

       **create** : bool
           Whether to create the subfolder if it does not exist.

   :Returns:

       **subfolder** : str or tuple
           The path of the subfolder. If name is None, all subfolders
           will be returned as a tuple.
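
   A minimal sketch of the behaviour described above, not the OrcaNet
   implementation. The subfolder names in ``SUBFOLDERS`` and the function
   name ``get_subfolder_sketch`` are hypothetical placeholders.

   .. code-block:: python

      import os

      # Hypothetical subfolder names, for illustration only.
      SUBFOLDERS = ("models", "predictions", "plots")


      def get_subfolder_sketch(main_folder, name=None, create=False):
          """Return the path of one subfolder, or of all of them as a tuple."""
          names = SUBFOLDERS if name is None else (name,)
          paths = tuple(os.path.join(main_folder, n) for n in names)
          if create:
              # Create missing folders; existing ones are left untouched.
              for path in paths:
                  os.makedirs(path, exist_ok=True)
          return paths if name is None else paths[0]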


   ..
       !! processed by numpydoc !!

.. py:function:: get_inputs(model)

   
   Get names and keras layers of the inputs of the model, as a dict.


   ..
       !! processed by numpydoc !!

.. py:class:: IOHandler(cfg)


   Access info indirectly contained in the cfg object.


   ..
       !! processed by numpydoc !!
   .. py:method:: get_latest_epoch()

      
      Return the highest epoch/fileno pair of any saved model.


      :Returns:

          **latest_epoch** : tuple or None
              The highest epoch, file_no pair. None if the folder is
              empty or does not exist yet.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_all_epochs()

      
      Get a sorted list of the epoch/fileno pairs of all saved models.


      :Returns:

          **epochs** : List
              The (epoch, fileno) tuples. List is empty if none can be found.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_next_epoch(epoch)

      
      Return the next epoch / fileno tuple.

      It depends on how many train files there are.

      :Parameters:

          **epoch** : tuple or None
              Current epoch and file number.

      :Returns:

          **next_epoch** : tuple
              Next epoch and file number.
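
      The rollover from one epoch to the next can be sketched as follows.
      This is an illustration, not the OrcaNet code: ``n_train_files`` stands
      in for the number of training files, and treating ``None`` as "training
      has not started yet" is an assumption.

      .. code-block:: python

         def next_epoch_sketch(epoch, n_train_files):
             """Return the (epoch, fileno) pair following the given one."""
             if epoch is None:
                 # Assumption: nothing trained yet, so start at the first file.
                 return (1, 1)
             epoch_no, file_no = epoch
             if file_no >= n_train_files:
                 # Last file of this epoch: continue with the next epoch.
                 return (epoch_no + 1, 1)
             return (epoch_no, file_no + 1)


         print(next_epoch_sketch((1, 1), n_train_files=2))  # (1, 2)
         print(next_epoch_sketch((1, 2), n_train_files=2))  # (2, 1)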


      ..
          !! processed by numpydoc !!

   .. py:method:: get_previous_epoch(epoch)

      
      Return the previous epoch / fileno tuple.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_subfolder(name=None, create=False)

      
      Get the path to one or all subfolders of the main folder.


      :Parameters:

          **name** : str or None
              The name of the subfolder.

          **create** : bool
              Whether to create the subfolder if it does not exist.

      :Returns:

          **subfolder** : str or tuple
              The path of the subfolder. If name is None, all subfolders
              will be returned as a tuple.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_model_path(epoch, fileno, local=False)

      
      Get the path to a model (which might not exist yet).


      :Parameters:

          **epoch** : int
              Its epoch.

          **fileno** : int
              Its file number.

          **local** : bool
              If True, will only return the path inside the output_folder,
              i.e. models/models_epochXX_file_YY.h5.

      :Returns:

          **model_path** : str
              The path to the model.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_latest_prediction_file_no(epoch, fileno)

      
      Returns the file number of the latest currently predicted val file.


      :Parameters:

          **epoch** : int
              Epoch of the model that has predicted.

          **fileno** : int
              Fileno of the model that has predicted.

      :Returns:

          **latest_val_file_no** : int or None
              File number of the prediction file with the highest val index.
              Starts from 1, so it matches the number in the file name.
              None if there is no prediction file.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_pred_path(epoch, fileno, pred_file_no)

      
      Gets the path of a prediction file. The ints all start from 1.


      :Parameters:

          **epoch** : int
              Epoch of an already trained nn model.

          **fileno** : int
              File number (train step) of an already trained nn model.

          **pred_file_no** : int
              Val file no of the prediction files that are found in the
              prediction folder.

      :Returns:

          **pred_filepath** : str
              The path.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_pred_files_list(epoch=None, fileno=None)

      
      Returns a sorted list with all pred .h5 files in the prediction folder.
      Does not include the inference files.


      :Parameters:

          **epoch** : int, optional
              Specific model epoch to look pred files up for.

          **fileno** : int, optional
              Specific model file number to look pred files up for.

      :Returns:

          **pred_files_list** : List
              List with the full filepaths of all prediction results files.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_local_files(which)

      
      Get the training or validation file paths for each list input set.

      Returns the path to the copy of the file on the local tmpdir, which
      it will generate if called for the first time.

      :Parameters:

          **which** : str
              Either "train", "val", or "inference".

      :Returns:

          dict
              A dict containing the paths to the training, validation, or
              inference files. Example of the format for two input sets
              with two files each::

                  {
                   "input_A" : ('path/to/set_A_file_1.h5', 'path/to/set_A_file_2.h5'),
                   "input_B" : ('path/to/set_B_file_1.h5', 'path/to/set_B_file_2.h5'),
                  }


      ..
          !! processed by numpydoc !!

   .. py:method:: get_n_bins()

      
      Get the number of bins from the training files.

      Only the first file of each input is looked up; the others should be
      identical.


      :Returns:

          **n_bins** : dict
              Toml-list input names as keys, list of the bins as values.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_file_sizes(which)

      
      Get the number of samples in each training or validation input file.


      :Parameters:

          **which** : str
              Either train or val.

      :Returns:

          **file_sizes** : List
              Its length is equal to the number of files in each input set.


      :Raises:

          ValueError
              If the number of samples in a file differs between the
              inputs.


      ..
          !! processed by numpydoc !!
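
      The consistency check can be sketched without touching any files. In
      this hypothetical helper, ``rows_per_input`` maps each input name to the
      row counts of its files; how OrcaNet obtains these counts is not shown
      here.

      .. code-block:: python

         def file_sizes_sketch(rows_per_input):
             """Return the samples per file, checking all inputs agree."""
             inputs = list(rows_per_input.items())
             _, reference = inputs[0]
             for name, sizes in inputs[1:]:
                 if list(sizes) != list(reference):
                     raise ValueError(
                         f"Input {name!r} has file sizes {list(sizes)}, "
                         f"expected {list(reference)}"
                     )
             return list(reference)


         sizes = file_sizes_sketch({"input_A": [100, 50], "input_B": [100, 50]})
         print(sizes)  # [100, 50]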

   .. py:method:: get_no_of_files(which)

      
      Return the number of training or validation files.

      Only looks up the number of files of one (arbitrary) list input, as
      equal length is checked during read in.

      :Parameters:

          **which** : str
              Either train or val.

      :Returns:

          **no_of_files** : int
              The number of files.


      ..
          !! processed by numpydoc !!

   .. py:method:: yield_files(which)

      
      Yield the training or validation filepaths for every input.

      They will be yielded in the same order as they are given in the
      toml file.

      :Parameters:

          **which** : str
              Either train or val.


      :Yields:

          **files_dict** : dict
              Keys: The name of every toml list input.
              Values: One of the filepaths.
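
      A sketch of the yielded structure, assuming the files are held in a
      dict of equally long tuples as in the ``get_local_files`` example. The
      name ``yield_files_sketch`` is hypothetical.

      .. code-block:: python

         def yield_files_sketch(files):
             """Yield one {input name: filepath} dict per file index."""
             n_files = len(next(iter(files.values())))
             for i in range(n_files):
                 yield {name: paths[i] for name, paths in files.items()}


         files = {
             "input_A": ("path/to/set_A_file_1.h5", "path/to/set_A_file_2.h5"),
             "input_B": ("path/to/set_B_file_1.h5", "path/to/set_B_file_2.h5"),
         }
         for files_dict in yield_files_sketch(files):
             print(files_dict)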


      ..
          !! processed by numpydoc !!

   .. py:method:: get_file(which, file_no)

      
      Get a dict with the n-th files.


      ..
          !! processed by numpydoc !!

   .. py:method:: check_connections(model)

      
      Check if the names and shapes of the samples and labels in the
      given input files work with the model.

      Also takes into account the possibly present sample or label modifiers.

      :Parameters:

          **model** : ks.model
              A keras model.


      :Raises:

          ValueError
              If they don't work together.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_batch()

      
      For testing purposes, return a batch of x_values and y_values.

      This will always be the first batchsize samples and y_values from
      the first file, before any modifiers have been applied.


      :Returns:

          **info_blob** : dict
              X- and y-values from the files. Has the following entries:
              x_values : dict
                  Keys: Names of the input datasets from the list toml file.
                  Values: ndarray, a batch of samples.
              y_values : ndarray
                  From the y_values datagroup of the input files.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_input_shapes()

      
      Get the input names and shapes of the data after the modifier has
      been applied.


      :Returns:

          **input_shapes** : dict
              Keys: Name of the inputs of the model.
              Values: Their shape without the batchsize.


      ..
          !! processed by numpydoc !!

   .. py:method:: print_log(lines, logging=True)

      
      Print and also log to the full log file.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_epoch_float(epoch, fileno)

      
      Make a float value out of epoch/fileno.


      ..
          !! processed by numpydoc !!

   .. py:method:: get_learning_rate(epoch)

      
      Get the learning rate for a given epoch and file number.

      The user learning rate (cfg.learning_rate) can be None, a float,
      a tuple, or a function.

      :Parameters:

          **epoch** : tuple
              Epoch and file number. Both start at 1, i.e. the start of the
              training is (1, 1), the next file is (1, 2), ...
              This is also in the filename of the saved models.

      :Returns:

          **lr** : float
              The learning rate that will be used for the given epoch/fileno.
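
      The dispatch on the type of the user learning rate might look like the
      sketch below. Everything beyond the list of accepted types is an
      assumption: the tuple is read here as (initial rate, fractional
      decrease per file), the function is called with epoch and file number,
      and the ``None`` case is left out. ``n_train_files`` is a hypothetical
      stand-in for the number of training files.

      .. code-block:: python

         def learning_rate_sketch(user_lr, epoch, n_train_files):
             """Return the learning rate for an (epoch, fileno) tuple."""
             epoch_no, file_no = epoch
             if callable(user_lr):
                 # Assumption: the function takes epoch and file number.
                 return user_lr(epoch_no, file_no)
             if isinstance(user_lr, tuple):
                 # Assumption: (initial rate, fractional decrease per file).
                 start_lr, decay = user_lr
                 files_done = (epoch_no - 1) * n_train_files + (file_no - 1)
                 return start_lr * (1 - decay) ** files_done
             # A plain number is used as a constant learning rate.
             return float(user_lr)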


      ..
          !! processed by numpydoc !!


.. py:function:: split_name_of_predfile(file)

   
   Get epoch, fileno, val fileno from the name of a predfile.


   :Parameters:

       **file** : str
           Like this: model_epoch_XX_file_YY_on_USERLIST_val_file_ZZ.h5

   :Returns:

       **epoch, file_no, val_file_no** : tuple(int)
           As integers.
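
   One way to parse the name pattern given above is a regular expression.
   This is a sketch, not the OrcaNet implementation; the name
   ``split_predfile_name_sketch`` and the error raised on a mismatch are
   assumptions.

   .. code-block:: python

      import os
      import re

      # Pattern: model_epoch_XX_file_YY_on_USERLIST_val_file_ZZ.h5
      PRED_NAME = re.compile(
          r"model_epoch_(\d+)_file_(\d+)_on_.+_val_file_(\d+)\.h5$"
      )


      def split_predfile_name_sketch(file):
          """Return (epoch, file_no, val_file_no) as integers."""
          match = PRED_NAME.search(os.path.basename(file))
          if match is None:
              raise ValueError(f"Not a prediction file name: {file}")
          epoch, file_no, val_file_no = (int(g) for g in match.groups())
          return epoch, file_no, val_file_no


      name = "model_epoch_3_file_2_on_my_list_val_file_10.h5"
      print(split_predfile_name_sketch(name))  # (3, 2, 10)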


   ..
       !! processed by numpydoc !!

.. py:function:: h5_get_number_of_rows(h5_filepath, datasets=None)

   
   Gets the total number of rows of a .h5 file.

   Multiple dataset names can be given as a list to check if they all
   have the same number of rows (axis 0).

   :Parameters:

       **h5_filepath** : str
           Filepath of the .h5 file.

       **datasets** : list, optional
           The names of datasets in the file to check.

   :Returns:

       **number_of_rows** : int
           Number of rows of the .h5 file in the first dataset.


   :Raises:

       AssertionError
           If the given datasets do not have the same no of rows.
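
   The row check itself can be sketched on any mapping from dataset names to
   objects with a ``shape`` attribute, which is how an open ``h5py.File``
   behaves for top-level datasets. The sketch below takes such a mapping
   instead of a filepath so that it runs without h5py; checking all datasets
   when ``datasets`` is None is an assumption.

   .. code-block:: python

      from types import SimpleNamespace


      def number_of_rows_sketch(h5_file, datasets=None):
          """Return the length of axis 0, shared by the given datasets."""
          names = list(h5_file.keys()) if datasets is None else list(datasets)
          rows = [h5_file[name].shape[0] for name in names]
          assert all(n == rows[0] for n in rows), (
              f"Datasets differ in number of rows: {dict(zip(names, rows))}"
          )
          return rows[0]


      # Stand-in for an open file with two datasets of five rows each.
      fake_file = {
          "x": SimpleNamespace(shape=(5, 11, 13)),
          "y": SimpleNamespace(shape=(5,)),
      }
      print(number_of_rows_sketch(fake_file, ["x", "y"]))  # 5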


   ..
       !! processed by numpydoc !!

.. py:function:: use_local_tmpdir(files)

   
   Copies given files to the local temp folder.


   :Parameters:

       **files** : dict
           Dict containing the file paths.

   :Returns:

       **files_ssd** : dict
           Dict with updated SSD/scratch filepaths.
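
   A sketch of the copy step, assuming ``files`` maps input names to tuples
   of paths. Unlike the documented function, this hypothetical helper takes
   the target folder as an explicit ``local_dir`` argument rather than
   looking up the local temp folder itself.

   .. code-block:: python

      import os
      import shutil
      import tempfile


      def copy_to_local_dir_sketch(files, local_dir):
          """Copy all files to local_dir and return the updated dict."""
          files_local = {}
          for input_name, paths in files.items():
              copied = []
              for path in paths:
                  # Note: files sharing a base name would overwrite each other.
                  target = os.path.join(local_dir, os.path.basename(path))
                  shutil.copy2(path, target)
                  copied.append(target)
              files_local[input_name] = tuple(copied)
          return files_local


      with tempfile.TemporaryDirectory() as source_dir, \
              tempfile.TemporaryDirectory() as local_dir:
          source = os.path.join(source_dir, "set_A_file_1.h5")
          with open(source, "wb") as f:
              f.write(b"dummy")
          local_files = copy_to_local_dir_sketch({"input_A": (source,)}, local_dir)
          copy_exists = os.path.exists(local_files["input_A"][0])
      print(copy_exists)  # True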


   ..
       !! processed by numpydoc !!