tools Package

Package for collecting tools, including:

  • Tool used to convert img files to OpenMSI HDF5 files.
  • Simple helper tool to run an analysis.
  • Simple helper tool to run a workflow.
  • Collection of miscellaneous tools.
  • Simple helper tool used to generate a set of PNG images for a global peak analysis (one per global peak), as well as a LaTeX document that summarizes all the images.
  • Simple script to generate thumbnail images.
  • Collection of experimental tools and tools under development.

convertToOMSI Module

Tool used to convert img files to OpenMSI HDF5 files.

For usage information execute: python convertToOMSI --help


Bases: object

Class providing a number of functions for converting various file types to OMSI, including a number of helper functions related to the data conversion.

static check_format(name, format_type)

Helper function used to determine the file format that should be used.

Parameters:
  • name – Name of the folder/file that we should read.
  • format_type – String indicating the format option given by the user. If the format is not determined (i.e., “auto”), this function tries to determine the appropriate format. Otherwise the option is returned as is, because the user explicitly stated which format should be used.

Returns: String indicating the appropriate format, or None if no valid option was found.
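
The fallback behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of check_format itself: resolve_format and the detectors mapping are hypothetical names, and the extension-based detection rules are assumptions chosen only to keep the example runnable.

```python
import os


def resolve_format(name, format_type, detectors):
    """Return the format to use for `name`, or None if none matches.

    `detectors` is a hypothetical mapping from a format name to a callable
    that returns True when it recognizes the given file or folder name.
    """
    # An explicit user choice is returned unchanged.
    if format_type != "auto":
        return format_type
    # Otherwise try each known format in turn.
    for format_name, is_match in detectors.items():
        if is_match(name):
            return format_name
    # No valid option was found.
    return None


# Assumed detection rules that only look at the file extension.
detectors = {
    "img_file": lambda name: os.path.splitext(name)[1].lower() == ".img",
    "imzml_file": lambda name: os.path.splitext(name)[1].lower() == ".imzml",
}

print(resolve_format("sample.img", "auto", detectors))
print(resolve_format("sample.img", "mzml_file", detectors))
print(resolve_format("notes.txt", "auto", detectors))
```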

static convert_files()

Convert all files in the given list of files with their appropriate conversion options.

static create_dataset_list(input_filenames, format_type=None, data_region_option='split+merge')

Based on the list of input_filenames, generate the ConvertSettings.dataset_list, which contains a dictionary describing each conversion job.

Parameters:
  • input_filenames – List of names of files to be converted.
  • format_type – Define which file format should be used. Default value is ‘auto’, indicating that the function should determine the format to be used for each file. See also the ConvertSettings.available_formats parameter.
  • data_region_option – Define how different regions defined for a file should be handled. E.g., one may want to split all regions into individual datasets (‘split’), merge all regions into a single dataset (‘merge’), or do both (‘split+merge’). See also the ConvertSettings.available_region_options parameter for details. By default the function will do ‘split+merge’.

Returns: List of dictionaries describing the various conversion jobs. Each job is described by a dict with the following keys:

  • ‘basename’ : The base name of the file or directory with the data
  • ‘format’ : The data format to be used
  • ‘dataset’ : The index of the dataset to be converted if the input stores multiple data cubes
  • ‘region’ : The index of the region to be converted if the input defines multiple regions
  • ‘exp’ : One of ‘previous’ or ‘new’, defining whether a new experiment should be created or whether the experiment from the previous conversion(s) should be reused.
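
To make the shape of such a job list concrete, the sketch below expands a single input file with several regions into job dictionaries. It is an illustration only: expand_region_jobs is a hypothetical helper, the fixed 'dataset' index of 0 and the rule for assigning 'exp' are assumptions, and representing the merged job with region None follows the description of the 'region' key in ConvertSettings.dataset_list.

```python
def expand_region_jobs(basename, file_format, num_regions, data_region_option):
    """Build job dicts for one input file that defines `num_regions` regions."""
    regions = []
    if "split" in data_region_option:
        regions.extend(range(num_regions))  # one job per region
    if "merge" in data_region_option:
        regions.append(None)  # None stands for all regions merged
    jobs = []
    for region in regions:
        jobs.append({
            "basename": basename,
            "format": file_format,
            "dataset": 0,  # assumed: the input stores a single data cube
            "region": region,
            # Assumed policy: the first job creates a new experiment and
            # all following jobs reuse it.
            "exp": "new" if not jobs else "previous",
        })
    return jobs


for job in expand_region_jobs("sample", "img_file", 2, "split+merge"):
    print(job["region"], job["exp"])
```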

static suggest_chunking(xsize, ysize, mzsize, dtype, print_results=False)

Helper function used to suggest good chunking strategies for a given data cube.

Parameters:
  • xsize – Size of the dataset in x.
  • ysize – Size of the dataset in y.
  • mzsize – Size of the dataset in mz.
  • print_results – Print the results to the console.

Returns: Three tuples:

  • spectrum_chunk : The chunking to be used to optimize selection of spectra.
  • slice_chunk : The chunking to be used to optimize selection of image slices.
  • balanced_chunk : The chunking that would provide a good balance in performance for different selection strategies.
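
The three kinds of chunk shape can be illustrated with a simple heuristic. The sketch below is not the algorithm used by suggest_chunking; it is one plausible approach under stated assumptions: sketch_chunking is a hypothetical name, itemsize (bytes per value) stands in for the dtype argument, the 1 MiB target chunk size is arbitrary, and the 4 x 4 pixel tile of the balanced chunk merely mirrors the default chunks = (4, 4, 2048) setting.

```python
import math


def sketch_chunking(xsize, ysize, mzsize, itemsize, target_bytes=1024 * 1024):
    """Return (spectrum_chunk, slice_chunk, balanced_chunk) shape tuples."""
    # Number of values that fit into one chunk of the target size.
    max_values = max(1, target_bytes // itemsize)

    # Spectrum access: one pixel, as much of the m/z axis as fits.
    spectrum_chunk = (1, 1, min(mzsize, max_values))

    # Slice access: a square tile of pixels for a single m/z bin.
    side = max(1, math.isqrt(max_values))
    slice_chunk = (min(xsize, side), min(ysize, side), 1)

    # Balanced access: a small tile of pixels with a long run along m/z.
    bx, by = min(xsize, 4), min(ysize, 4)
    balanced_chunk = (bx, by, min(mzsize, max(1, max_values // (bx * by))))

    return spectrum_chunk, slice_chunk, balanced_chunk


print(sketch_chunking(100, 200, 100000, itemsize=2))
```

The trade-off is that a chunk shaped like a spectrum makes reading an image slice touch many chunks, and the reverse holds for a slice-shaped chunk; the balanced shape gives up some performance in both cases to avoid the worst case of either.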

static suggest_chunkings_for_files(in_dataset_list)

Helper function used to suggest good chunking strategies for a given set of files.

Parameters: in_dataset_list – Python list of dictionaries describing the settings to be used for the file conversion.
Returns: This function simply prints results to standard out and does not return anything.

static write_data(input_file, data, data_io_option='spectrum', chunk_shape=None, write_progress=True)

Helper function used to implement different data write options.

Parameters:
  • input_file – The input data file.
  • data – The output dataset (either an h5py dataset or an omsi_file_msidata object).
  • data_io_option – String indicating the data write method to be used. One of:
    • spectrum : Write the data one spectrum at a time.
    • all : Write the complete dataset at once.
    • chunk : Write the data one chunk at a time.
  • chunk_shape – The chunking used by the data. Needed to decide how the data should be written when a chunk-aligned write is requested.
  • write_progress (bool) – Write progress in % to standard out while data is being written.
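
The difference between the three write options comes down to which selections of the (x, y, mz) cube are written in each step. The sketch below enumerates those selections; it is an illustration under stated assumptions rather than the code of write_data. iter_write_selections is a hypothetical helper, and it assumes the dataset is three-dimensional with axes ordered (x, y, mz).

```python
import itertools


def iter_write_selections(shape, data_io_option="spectrum", chunk_shape=None):
    """Yield index tuples that together cover a 3D (x, y, mz) dataset."""
    xsize, ysize, mzsize = shape
    if data_io_option == "all":
        # One write that covers the complete dataset.
        yield (slice(0, xsize), slice(0, ysize), slice(0, mzsize))
    elif data_io_option == "spectrum":
        # One write per pixel, each covering the full m/z axis.
        for x, y in itertools.product(range(xsize), range(ysize)):
            yield (x, y, slice(0, mzsize))
    elif data_io_option == "chunk":
        if chunk_shape is None:
            raise ValueError("chunk_shape is required for chunk-aligned writes")
        # One write per chunk; blocks at the upper edges may be smaller.
        starts = [range(0, size, step) for size, step in zip(shape, chunk_shape)]
        for origin in itertools.product(*starts):
            yield tuple(
                slice(start, min(start + step, size))
                for start, step, size in zip(origin, chunk_shape, shape)
            )
    else:
        raise ValueError("unknown data_io_option: %r" % (data_io_option,))


for selection in iter_write_selections((5, 4, 10), "chunk", (4, 4, 8)):
    print(selection)
```

If both the input and the output support NumPy-style indexing (an assumption here), each yielded selection could be used for one copy step of the form data[selection] = input_file[selection].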

class ConvertSettings

Bases: object

This class is used to specify the settings for the data conversion.

add_file_to_db = True
auto_chunk = True
available_error_options = ['terminate-and-cleanup', 'terminate-only', 'continue-on-error']
available_formats = {'imzml_file': <class 'omsi.dataformat.imzml_file.imzml_file'>, 'bruckerflex_file': <class 'omsi.dataformat.bruckerflex_file.bruckerflex_file'>, 'img_file': <class 'omsi.dataformat.img_file.img_file'>, 'mzml_file': <class 'omsi.dataformat.mzml_file.mzml_file'>}
available_io_options = ['chunk', 'spectrum', 'all']
available_region_options = ['split', 'merge', 'split+merge']
check_add_nersc = True
chunks = (4, 4, 2048)
compression = 'gzip'
compression_opts = 4
dataset_list = []

List of Python dictionaries describing the specific conversion settings for each conversion task. Each dictionary contains the following keys:

  • ‘basename’ : Name of the file to be converted
  • ‘format’ : File format to be used (see ConvertSettings.available_formats)
  • ‘exp’ : Indicate the experiment the dataset should be stored with. Valid values are
    • ‘new’ : Generate a new experiment for the dataset
    • ‘previous’ : Use the same experiment as used for the previous dataset
    • 1, 2, 3, ... : Integer value indicating the index of the experiment to be used.
  • ‘region’ : Optional key with index of the region to be converted. None to merge all regions.
  • ‘dataset’ : Optional key with index of the dataset to be converted.
  • ‘omsi_object’: Optional key used to save a pointer to the omsi data object with the converted data
  • ‘dependencies’: Additional dependencies that should be added for the dataset
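
A small check of one dataset_list entry against the key description above might look like the following sketch. validate_job is a hypothetical helper that is not part of the module, and treating 'basename', 'format', and 'exp' as required is an assumption based on the other keys being marked as optional.

```python
def validate_job(job):
    """Return a list of problems found in one dataset_list entry."""
    problems = []
    # Assumed to be required: these keys are not marked optional above.
    for key in ("basename", "format", "exp"):
        if key not in job:
            problems.append("missing key: " + key)
    exp = job.get("exp")
    # An integer index is allowed; bool is excluded because it is an int subclass.
    is_index = isinstance(exp, int) and not isinstance(exp, bool)
    if "exp" in job and exp not in ("new", "previous") and not is_index:
        problems.append("invalid 'exp' value: %r" % (exp,))
    return problems


job = {"basename": "sample.img", "format": "img_file", "exp": "previous", "region": None}
print(validate_job(job))
```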
db_server_url = ''
email_error_recipients = []
email_success_recipients = []
error_handling = 'terminate-and-cleanup'
execute_fpg = True
execute_fpl = False
execute_nmf = True
execute_ticnorm = False
file_user = 'oruebel'
format_option = None
generate_thumbnail = False
generate_xdmf = False
io_block_size_limit = 524288000
io_option = 'spectrum_to_image'
job_id = None
metadata = {}
nmf_num_component = 20
nmf_num_iter = 2000
nmf_timeout = 600
nmf_tolerance = 0.0001
nmf_use_raw_data = False
omsi_output_file = None
classmethod parse_input_args(argv)

Process input parameters and define the script settings.

Parameters: argv – The list of input arguments.
Returns: This function returns the following four values:
  • ‘input_error’ : Boolean indicating whether an error has occurred during the processing of the inputs
  • ‘inputWarning’ : Boolean indicating whether a warning occurred during the processing of the inputs
  • ‘output_filename’ : Name for the output HDF5 file
  • ‘input_filenames’ : List of strings indicating the list of input filenames
classmethod print_help()

Function used to print the help for this script.

recorded_warnings = []
region_option = 'split+merge'
suggest_file_chunkings = False
user_additional_chunks = []

The main function defining the control flow for the conversion.

run_analysis Module

Simple helper tool to run an analysis. This is essentially a shortcut to the omsi/workflow/analysis_driver/omsi_cl_diver module.

run_workflow Module

Simple helper tool to run a workflow. This is essentially a shortcut to the omsi/workflow/analysis_driver/omsi_cl_diver module.