`orcanet.in_out`

Utility code regarding user input.

Module Contents

Classes

`IOHandler`(cfg): Access info indirectly contained in the cfg object.

Functions

`get_subfolder`(main_folder[, name, create]): Get the path to one or all subfolders of the main folder.
`get_inputs`(model): Get names and Keras layers of the inputs of the model, as a dict.
`split_name_of_predfile`(file): Get epoch, fileno, val fileno from the name of a predfile.
`h5_get_number_of_rows`(h5_filepath[, datasets]): Get the total number of rows of a .h5 file.
`use_local_tmpdir`(files): Copy the given files to the local temp folder.

orcanet.in_out.get_subfolder(main_folder, name=None, create=False)

Get the path to one or all subfolders of the main folder.

Parameters

main_folder (str): The main folder.
name (str or None): The name of the subfolder.
create (bool): Whether to create the subfolder if it does not exist.

Returns

subfolder (str or tuple): The path of the subfolder. If name is None, all subfolders are returned as a tuple.

orcanet.in_out.get_inputs(model): Get names and Keras layers of the inputs of the model, as a dict.

class orcanet.in_out.IOHandler(cfg)

Access info indirectly contained in the cfg object.

get_latest_epoch()

Return the highest epoch/fileno pair of any saved model.

Returns

latest_epoch (tuple or None): The highest (epoch, fileno) pair. None if the folder is empty or does not exist yet.

get_all_epochs()

Get a sorted list of the epoch/fileno pairs of all saved models.

Returns

epochs (list): The (epoch, fileno) tuples. The list is empty if none can be found.
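Because each entry is an (epoch, fileno) tuple, Python's built-in tuple ordering is enough to sort the pairs and to pick the latest one. The following is a minimal sketch of that idea and not the OrcaNet implementation; `sort_epochs` and `latest_epoch` are hypothetical helper names.

```python
def sort_epochs(pairs):
    """Return the unique (epoch, fileno) tuples in ascending order."""
    # Tuples compare element by element: first by epoch, then by fileno.
    return sorted(set(pairs))


def latest_epoch(pairs):
    """Return the highest (epoch, fileno) tuple, or None if there is none."""
    return max(pairs) if pairs else None


saved = [(2, 1), (1, 2), (1, 1)]
print(sort_epochs(saved))   # [(1, 1), (1, 2), (2, 1)]
print(latest_epoch(saved))  # (2, 1)
print(latest_epoch([]))     # None
```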

get_next_epoch(epoch)

Return the next epoch / fileno tuple.

The result depends on how many train files there are.

Parameters

epoch (tuple or None): Current epoch and file number.

Returns

next_epoch (tuple): Next epoch and file number.
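The dependence on the number of train files can be sketched as follows. This is an illustration under stated assumptions, not the library code: `next_epoch` is a hypothetical stand-alone function, the number of train files is passed in explicitly, and a value of None is assumed to mean "training has not started", so the result is (1, 1).

```python
def next_epoch(epoch, n_train_files):
    """Return the (epoch, fileno) pair that follows the given one."""
    if epoch is None:
        # Assumption: no previous epoch means training starts at (1, 1).
        return (1, 1)
    ep, fileno = epoch
    if fileno < n_train_files:
        # More files are left in the current epoch.
        return (ep, fileno + 1)
    # The last file was reached, so the next epoch starts at its first file.
    return (ep + 1, 1)


print(next_epoch(None, 3))    # (1, 1)
print(next_epoch((1, 2), 3))  # (1, 3)
print(next_epoch((1, 3), 3))  # (2, 1)
```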

get_previous_epoch(epoch): Return the previous epoch / fileno tuple.

get_subfolder(name=None, create=False)

Get the path to one or all subfolders of the main folder.

Parameters

name (str or None): The name of the subfolder.
create (bool): Whether to create the subfolder if it does not exist.

Returns

subfolder (str or tuple): The path of the subfolder. If name is None, all subfolders are returned as a tuple.

get_model_path(epoch, fileno, local=False)

Get the path to a model (which might not exist yet).

Parameters

epoch (int): Its epoch.
fileno (int): Its file number.
local (bool): If True, return only the path inside the output_folder, i.e. models/models_epochXX_file_YY.h5.

Returns

model_path (str): The path to the model.

get_latest_prediction_file_no(epoch, fileno)

Return the file number of the latest currently predicted val file.

Parameters

epoch (int): Epoch of the model that has predicted.
fileno (int): Fileno of the model that has predicted.

Returns

latest_val_file_no (int or None): File number of the prediction file with the highest val index. Counting starts from 1, so this is the number that appears in the file name. None if there is no prediction file.

get_pred_path(epoch, fileno, pred_file_no)

Get the path of a prediction file. The ints all start from 1.

Parameters

epoch (int): Epoch of an already trained nn model.
fileno (int): File number train step of an already trained nn model.
pred_file_no (int): Val file no of the prediction files that are found in the prediction folder.

Returns

pred_filepath (str): The path.

get_pred_files_list(epoch=None, fileno=None)

Return a sorted list of all pred .h5 files in the prediction folder. Does not include the inference files.

Parameters

epoch (int, optional): Specific model epoch to look pred files up for.
fileno (int, optional): Specific model file number to look pred files up for.

Returns

pred_files_list (list): List with the full filepaths of all prediction results files.

get_local_files(which)

Get the training or validation file paths for each list input set.

Returns the path to the copy of the file on the local tmpdir, which is generated on the first call.

Parameters

which (str): Either “train”, “val”, or “inference”.

Returns

dict: A dict containing the paths to the training or validation files on which the model will be trained. Example of the format for two input sets with two files each:

    {
        "input_A": ('path/to/set_A_file_1.h5', 'path/to/set_A_file_2.h5'),
        "input_B": ('path/to/set_B_file_1.h5', 'path/to/set_B_file_2.h5'),
    }

get_n_bins()

Get the number of bins from the training files.

Only the first files are looked up, the others should be identical.

Returns

n_bins (dict): Toml-list input names as keys, list of the bins as values.

get_file_sizes(which)

Get the number of samples in each training or validation input file.

Parameters

which (str): Either train or val.

Returns

file_sizes (list): Its length is equal to the number of files in each input set.

Raises

ValueError: If there is a different number of samples in any of the files of all inputs.
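The consistency check behind this ValueError can be pictured with the following sketch. It assumes the sample counts have already been read from the files and are passed in as a dict; `check_file_sizes` and the input names are hypothetical and not part of OrcaNet.

```python
def check_file_sizes(sizes_per_input):
    """Return the per-file sample counts if all inputs agree.

    sizes_per_input maps an input name to a list of sample counts,
    one entry per file (hypothetical layout for this sketch).
    """
    if not sizes_per_input:
        return []
    reference_name, reference = next(iter(sizes_per_input.items()))
    for name, sizes in sizes_per_input.items():
        if list(sizes) != list(reference):
            raise ValueError(
                f"Input {name!r} has sample counts {list(sizes)}, "
                f"but input {reference_name!r} has {list(reference)}"
            )
    return list(reference)


print(check_file_sizes({"input_A": [100, 50], "input_B": [100, 50]}))  # [100, 50]
```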

get_no_of_files(which)

Return the number of training or validation files.

Only looks up the number of files of one (arbitrary) list input, since equal length is checked when the files are read in.

Parameters

which (str): Either train or val.

Returns

no_of_files (int): The number of files.

yield_files(which)

Yield the training or validation filepaths for every input, one file at a time.

They will be yielded in the same order as they are given in the toml file.

Parameters

which (str): Either train or val.

Yields

files_dict (dict): Keys: The name of every toml list input. Values: One of the filepaths.
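The yielded dicts can be derived from a dict in the format returned by get_local_files. The sketch below is a stand-alone illustration with a hypothetical function name and made-up paths; it assumes every input has the same number of files.

```python
def yield_file_dicts(files):
    """Yield one dict per file index, mapping input name to a single path.

    files maps each input name to a tuple of paths, all of equal length.
    """
    if not files:
        return
    n_files = len(next(iter(files.values())))
    for file_no in range(n_files):
        # dicts keep insertion order, so inputs appear in their given order.
        yield {name: paths[file_no] for name, paths in files.items()}


files = {
    "input_A": ("path/to/set_A_file_1.h5", "path/to/set_A_file_2.h5"),
    "input_B": ("path/to/set_B_file_1.h5", "path/to/set_B_file_2.h5"),
}
for files_dict in yield_file_dicts(files):
    print(files_dict)
```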

get_file(which, file_no): Get a dict with the n-th files.

check_connections(model)

Check if the names and shapes of the samples and labels in the given input files work with the model.

Also takes into account the possibly present sample or label modifiers.

Parameters

model (ks.model): A Keras model.

Raises

ValueError: If they don't work together.

get_batch()

For testing purposes, return a batch of x_values and y_values.

This will always be the first batchsize samples and y_values from the first file, before any modifiers have been applied.

Returns

info_blob (dict): X- and y-values from the files. Has the following entries:

x_values (dict): Keys: Names of the input datasets from the list toml file. Values: ndarray, a batch of samples.

y_values (ndarray): From the y_values datagroup of the input files.

get_input_shapes()

Get the input names and shapes of the data after the modifier has been applied.

Returns

input_shapes (dict): Keys: Name of the inputs of the model. Values: Their shape without the batchsize.

print_log(lines, logging=True): Print and also log to the full log file.

get_epoch_float(epoch, fileno): Make a float value out of epoch/fileno.
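One plausible way to turn an (epoch, fileno) pair into a float is to count the fraction of training samples seen so far. The sketch below shows that convention; the formula is an assumption made for illustration and may differ from what OrcaNet actually computes, and `epoch_float` and its `file_sizes` argument are hypothetical.

```python
from itertools import accumulate


def epoch_float(epoch, fileno, file_sizes):
    """Map (epoch, fileno) to a float, weighting files by their size.

    epoch and fileno start at 1; file_sizes lists the number of
    samples in each train file (assumed convention for this sketch).
    """
    cumulative = list(accumulate(file_sizes))
    # Fraction of one epoch completed after finishing file number fileno.
    fraction = cumulative[fileno - 1] / cumulative[-1]
    return epoch - 1 + fraction


print(epoch_float(1, 1, [100, 50, 50]))  # 0.5
print(epoch_float(1, 2, [100, 50, 50]))  # 0.75
print(epoch_float(2, 3, [100, 50, 50]))  # 2.0
```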

get_learning_rate(epoch)

Get the learning rate for a given epoch and file number.

The user learning rate (cfg.learning_rate) can be None, a float, a tuple, or a function.

Parameters

epoch (tuple): Epoch and file number. Both start at 1, i.e. the start of the training is (1, 1), the next file is (1, 2), … This is also in the filename of the saved models.

Returns

lr (float): The learning rate that will be used for the given epoch/fileno.
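How the four accepted types might be dispatched is sketched below. The meaning attached to each type is an assumption made for illustration only: a float is treated as a constant rate, a tuple as (initial rate, decay per file), a function as a callable taking (epoch, fileno), and None as an error. `resolve_learning_rate` is a hypothetical name, and the actual OrcaNet behavior may differ.

```python
def resolve_learning_rate(user_lr, epoch, n_train_files):
    """Return the learning rate for an (epoch, fileno) pair.

    All type semantics here are assumptions of this sketch.
    """
    ep, fileno = epoch
    if user_lr is None:
        raise ValueError("No learning rate was specified")
    if callable(user_lr):
        # Assumed: the function receives epoch and file number.
        return float(user_lr(ep, fileno))
    if isinstance(user_lr, tuple):
        # Assumed: (initial rate, fractional decay applied once per file).
        initial, decay = user_lr
        files_done = (ep - 1) * n_train_files + (fileno - 1)
        return initial * (1 - decay) ** files_done
    # Assumed: a plain number is a constant learning rate.
    return float(user_lr)


print(resolve_learning_rate(0.001, (1, 1), n_train_files=2))
print(resolve_learning_rate((0.1, 0.5), (1, 2), n_train_files=2))
print(resolve_learning_rate(lambda ep, fileno: 0.01 * ep, (2, 1), n_train_files=2))
```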

orcanet.in_out.split_name_of_predfile(file)

Get epoch, fileno, val fileno from the name of a predfile.

Parameters

file (str): Like this: model_epoch_XX_file_YY_on_USERLIST_val_file_ZZ.h5

Returns

epoch, file_no, val_file_no (tuple(int)): As integers.
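The file name pattern above can be parsed with a regular expression. The following sketch is one way to do it and is not the library's own code; `split_pred_name` is a hypothetical name, and raising ValueError on a non-matching name is a choice made for this example.

```python
import re

# Matches model_epoch_XX_file_YY_on_USERLIST_val_file_ZZ.h5 at the end of a path.
_PRED_NAME = re.compile(r"model_epoch_(\d+)_file_(\d+)_on_.+_val_file_(\d+)\.h5$")


def split_pred_name(file):
    """Return (epoch, file_no, val_file_no) as integers."""
    match = _PRED_NAME.search(file)
    if match is None:
        raise ValueError(f"Not a prediction file name: {file!r}")
    epoch, file_no, val_file_no = (int(group) for group in match.groups())
    return epoch, file_no, val_file_no


print(split_pred_name("model_epoch_2_file_3_on_mylist_val_file_1.h5"))  # (2, 3, 1)
```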

orcanet.in_out.h5_get_number_of_rows(h5_filepath, datasets=None)

Get the total number of rows of a .h5 file.

Multiple dataset names can be given as a list to check if they all have the same number of rows (axis 0).

Parameters

h5_filepath (str): Filepath of the .h5 file.
datasets (list, optional): The names of datasets in the file to check.

Returns

number_of_rows (int): Number of rows of the first dataset in the .h5 file.

Raises

AssertionError: If the given datasets do not have the same no of rows.

orcanet.in_out.use_local_tmpdir(files)

Copy the given files to the local temp folder.

Parameters

files (dict): Dict containing the file paths.

Returns

files_ssd (dict): Dict with updated SSD/scratch filepaths.
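The copy step can be sketched with the standard library alone. The example below assumes the dict maps each input name to a tuple of paths, as in the get_local_files format, and uses a freshly created temporary directory as the target; `copy_to_tmpdir` and its `tmpdir` argument are hypothetical, and how OrcaNet picks the local temp folder is not shown here.

```python
import os
import shutil
import tempfile


def copy_to_tmpdir(files, tmpdir=None):
    """Copy all files to tmpdir and return a dict with the new paths.

    files maps an input name to a tuple of file paths (assumed layout).
    Files sharing a base name would overwrite each other in this sketch.
    """
    if tmpdir is None:
        tmpdir = tempfile.mkdtemp()
    local_files = {}
    for name, paths in files.items():
        copied = []
        for path in paths:
            target = os.path.join(tmpdir, os.path.basename(path))
            shutil.copy2(path, target)  # copies content and metadata
            copied.append(target)
        local_files[name] = tuple(copied)
    return local_files
```

The returned dict has the same keys and ordering as the input, so it can replace the original dict wherever the file paths are consumed.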
