:py:mod:`orcanet.in_out`
========================

.. py:module:: orcanet.in_out

.. autoapi-nested-parse::

   Utility code regarding user input.

   .. !! processed by numpydoc !!


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   orcanet.in_out.IOHandler


Functions
~~~~~~~~~

.. autoapisummary::

   orcanet.in_out.get_subfolder
   orcanet.in_out.get_inputs
   orcanet.in_out.split_name_of_predfile
   orcanet.in_out.h5_get_number_of_rows
   orcanet.in_out.use_local_tmpdir


.. py:function:: get_subfolder(main_folder, name=None, create=False)

   Get the path to one or all subfolders of the main folder.

   :Parameters:

       **main_folder** : str
           The main folder.

       **name** : str or None
           The name of the subfolder.

       **create** : bool
           If the subfolder should be created if it does not exist.

   :Returns:

       **subfolder** : str or tuple
           The path of the subfolder. If name is None, all subfolders
           will be returned as a tuple.

   .. !! processed by numpydoc !!

.. py:function:: get_inputs(model)

   Get names and keras layers of the inputs of the model, as a dict.

   .. !! processed by numpydoc !!

.. py:class:: IOHandler(cfg)

   Access info indirectly contained in the cfg object.

   .. !! processed by numpydoc !!

   .. py:method:: get_latest_epoch()

      Return the highest epoch/fileno pair of any saved model.

      :Returns:

          **latest_epoch** : tuple or None
              The highest epoch, file_no pair. None if the folder is
              empty or does not exist yet.

      .. !! processed by numpydoc !!

   .. py:method:: get_all_epochs()

      Get a sorted list of the epoch/fileno pairs of all saved models.

      :Returns:

          **epochs** : List
              The (epoch, fileno) tuples. List is empty if none can be found.

      .. !! processed by numpydoc !!

   .. py:method:: get_next_epoch(epoch)

      Return the next epoch / fileno tuple.

      It depends on how many train files there are.

      :Parameters:

          **epoch** : tuple or None
              Current epoch and file number.

      :Returns:

          **next_epoch** : tuple
              Next epoch and file number.

      .. !! processed by numpydoc !!

   .. py:method:: get_previous_epoch(epoch)

      Return the previous epoch / fileno tuple.

      .. !! processed by numpydoc !!

   .. py:method:: get_subfolder(name=None, create=False)

      Get the path to one or all subfolders of the main folder.

      :Parameters:

          **name** : str or None
              The name of the subfolder.

          **create** : bool
              If the subfolder should be created if it does not exist.

      :Returns:

          **subfolder** : str or tuple
              The path of the subfolder. If name is None, all subfolders
              will be returned as a tuple.

      .. !! processed by numpydoc !!

   .. py:method:: get_model_path(epoch, fileno, local=False)

      Get the path to a model (which might not exist yet).

      :Parameters:

          **epoch** : int
              Its epoch.

          **fileno** : int
              Its file number.

          **local** : bool
              If True, will only return the path inside the output_folder,
              i.e. models/models_epochXX_file_YY.h5.

      :Returns:

          **model_path** : str
              The path to the model.

      .. !! processed by numpydoc !!

   .. py:method:: get_latest_prediction_file_no(epoch, fileno)

      Return the file number of the latest currently predicted val file.

      :Parameters:

          **epoch** : int
              Epoch of the model that has predicted.

          **fileno** : int
              Fileno of the model that has predicted.

      :Returns:

          **latest_val_file_no** : int or None
              File number of the prediction file with the highest val index.
              Counting starts from 1, so this matches the number in the
              file name. None if there is none.

      .. !! processed by numpydoc !!

   .. py:method:: get_pred_path(epoch, fileno, pred_file_no)

      Get the path of a prediction file. The ints all start from 1.

      :Parameters:

          **epoch** : int
              Epoch of an already trained nn model.

          **fileno** : int
              File number (train step) of an already trained nn model.

          **pred_file_no** : int
              Val file no of the prediction files that are found in the
              prediction folder.

      :Returns:

          **pred_filepath** : str
              The path.

      .. !! processed by numpydoc !!

   .. py:method:: get_pred_files_list(epoch=None, fileno=None)

      Return a sorted list with all pred .h5 files in the prediction folder.
      Does not include the inference files.

      :Parameters:

          **epoch** : int, optional
              Specific model epoch to look pred files up for.

          **fileno** : int, optional
              Specific model file number to look pred files up for.

      :Returns:

          **pred_files_list** : List
              List with the full filepaths of all prediction results files.

      .. !! processed by numpydoc !!

   .. py:method:: get_local_files(which)

      Get the training or validation file paths for each list input set.

      Returns the path to the copy of the file on the local tmpdir, which
      it will generate if called for the first time.

      :Parameters:

          **which** : str
              Either "train", "val", or "inference".

      :Returns:

          dict
              A dict containing the paths to the training or validation
              files the model will use.
              Example for the format for two input sets with two files each::

                  {
                   "input_A" : ('path/to/set_A_file_1.h5', 'path/to/set_A_file_2.h5'),
                   "input_B" : ('path/to/set_B_file_1.h5', 'path/to/set_B_file_2.h5'),
                  }

      .. !! processed by numpydoc !!

   .. py:method:: get_n_bins()

      Get the number of bins from the training files.

      Only the first files are looked up, the others should be identical.

      :Returns:

          **n_bins** : dict
              Toml-list input names as keys, list of the bins as values.

      .. !! processed by numpydoc !!

   .. py:method:: get_file_sizes(which)

      Get the number of samples in each training or validation input file.

      :Parameters:

          **which** : str
              Either train or val.

      :Returns:

          **file_sizes** : List
              Its length is equal to the number of files in each input set.

      :Raises:

          ValueError
              If there is a different number of samples in any of the
              files of all inputs.

      .. !! processed by numpydoc !!

   .. py:method:: get_no_of_files(which)

      Return the number of training or validation files.

      Only looks up the no of files of one (random) list input, as equal
      length is checked during read in.

      :Parameters:

          **which** : str
              Either train or val.

      :Returns:

          **no_of_files** : int
              The number of files.

      .. !! processed by numpydoc !!

   .. py:method:: yield_files(which)

      Yield the training or validation filepaths for every input.

      They will be yielded in the same order as they are given in the
      toml file.

      :Parameters:

          **which** : str
              Either train or val.

      :Yields:

          **files_dict** : dict
              Keys: The name of every toml list input.
              Values: One of the filepaths.

      .. !! processed by numpydoc !!

   .. py:method:: get_file(which, file_no)

      Get a dict with the n-th files.

      .. !! processed by numpydoc !!

   .. py:method:: check_connections(model)

      Check if the names and shapes of the samples and labels in the
      given input files work with the model.

      Also takes into account the possibly present sample or label modifiers.

      :Parameters:

          **model** : ks.model
              A keras model.

      :Raises:

          ValueError
              If they don't work together.

      .. !! processed by numpydoc !!

   .. py:method:: get_batch()

      For testing purposes, return a batch of x_values and y_values.

      This will always be the first batchsize samples and y_values from
      the first file, before any modifiers have been applied.

      :Returns:

          **info_blob** : dict
              X- and y-values from the files. Has the following entries:

              x_values : dict
                  Keys: Names of the input datasets from the list toml file.
                  Values: ndarray, a batch of samples.
              y_values : ndarray
                  From the y_values datagroup of the input files.

      .. !! processed by numpydoc !!

   .. py:method:: get_input_shapes()

      Get the input names and shapes of the data after the modifier has
      been applied.

      :Returns:

          **input_shapes** : dict
              Keys: Name of the inputs of the model.
              Values: Their shape without the batchsize.

      .. !! processed by numpydoc !!

   .. py:method:: print_log(lines, logging=True)

      Print and also log to the full log file.

      .. !! processed by numpydoc !!

   .. py:method:: get_epoch_float(epoch, fileno)

      Make a float value out of epoch/fileno.

      .. !! processed by numpydoc !!

   .. py:method:: get_learning_rate(epoch)

      Get the learning rate for a given epoch and file number.

      The user learning rate (cfg.learning_rate) can be None, a float,
      a tuple, or a function.

      :Parameters:

          **epoch** : tuple
              Epoch and file number. Both start at 1, i.e. the start of the
              training is (1, 1), the next file is (1, 2), ...
              These numbers also appear in the filename of the saved models.

      :Returns:

          **lr** : float
              The learning rate that will be used for the given epoch/fileno.

      .. !! processed by numpydoc !!

.. py:function:: split_name_of_predfile(file)

   Get epoch, fileno, val fileno from the name of a predfile.

   :Parameters:

       **file** : str
           Like this: model_epoch_XX_file_YY_on_USERLIST_val_file_ZZ.h5

   :Returns:

       **epoch, file_no, val_file_no** : tuple(int)
           As integers.

   .. !! processed by numpydoc !!

.. py:function:: h5_get_number_of_rows(h5_filepath, datasets=None)

   Get the total number of rows of a .h5 file.

   Multiple dataset names can be given as a list to check if they all
   have the same number of rows (axis 0).

   :Parameters:

       **h5_filepath** : str
           Filepath of the .h5 file.

       **datasets** : list
           Optional, the names of datasets in the file to check.

   :Returns:

       number_of_rows: int
           Number of rows of the .h5 file in the first dataset.

   :Raises:

       AssertionError
           If the given datasets do not have the same no of rows.

   .. !! processed by numpydoc !!

.. py:function:: use_local_tmpdir(files)

   Copy the given files to the local temp folder.

   :Parameters:

       **files** : dict
           Dict containing the file paths.

   :Returns:

       **files_ssd** : dict
           Dict with updated SSD/scratch filepaths.

   .. !! processed by numpydoc !!
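
The (epoch, fileno) convention described for ``IOHandler.get_next_epoch`` and the file name template given for ``split_name_of_predfile`` can be illustrated with a small standalone sketch. It does not import orcanet and is not the library's implementation. The helper names ``next_epoch`` and ``parse_pred_name``, the regular expression, and the choice to map ``None`` to ``(1, 1)`` are assumptions made for illustration, based only on the descriptions above.

```python
import re
from typing import Optional, Tuple

Epoch = Tuple[int, int]

# Hypothetical pattern, derived only from the documented template
# model_epoch_XX_file_YY_on_USERLIST_val_file_ZZ.h5
_PRED_NAME = re.compile(
    r"model_epoch_(\d+)_file_(\d+)_on_(.+)_val_file_(\d+)\.h5$"
)


def next_epoch(epoch: Optional[Epoch], n_train_files: int) -> Epoch:
    """Advance an (epoch, fileno) pair; both numbers start at 1."""
    if epoch is None:
        # Assumption: nothing has been trained yet, so start at (1, 1).
        return (1, 1)
    ep, fileno = epoch
    if fileno < n_train_files:
        # More train files remain in the current epoch.
        return (ep, fileno + 1)
    # Last file of the epoch: move on to the first file of the next one.
    return (ep + 1, 1)


def parse_pred_name(filename: str) -> Tuple[int, int, int]:
    """Extract (epoch, file_no, val_file_no) as ints from a predfile name."""
    match = _PRED_NAME.search(filename)
    if match is None:
        raise ValueError(f"Not a prediction file name: {filename!r}")
    epoch, fileno, _userlist, val_fileno = match.groups()
    return int(epoch), int(fileno), int(val_fileno)
```

With three train files, ``next_epoch((1, 3), 3)`` rolls over to ``(2, 1)``, which matches the numbering described for ``get_learning_rate``. The parser uses ``search`` anchored at the end of the string, so it accepts either a bare file name or a full path.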