utils

Module utils

npfc.utils.check_arg_bool(value)[source]

Return True if the value is a boolean, raise a TypeError otherwise.

Parameters

value (bool) – the argument to test

Return type

bool

npfc.utils.check_arg_config_file(config_file)[source]

Return True if the config_file exists, raise an error otherwise.

Parameters

config_file – the configuration file

Return type

bool

npfc.utils.check_arg_input_dir(input_dir)[source]

Return True if the input_dir exists.

Parameters

input_dir (str) – the input directory

Return type

bool

Returns

True if the directory exists.

npfc.utils.check_arg_input_file(input_file)[source]

Return True if the input_file exists, raise an error otherwise.

Parameters
  • input_file (str) – the input file

  • input_format – the expected format of the input file

Return type

bool

npfc.utils.check_arg_output_dir(output_dir)[source]

Return True if the output_dir can exist.

If the parent directory of the output dir does not exist, it has to either be created or fail the check.

Parameters
  • output_dir (str) – the output directory

  • create_parent_dir – create the output directory’s parent folder in case it does not exist

Return type

bool

npfc.utils.check_arg_output_file(output_file, create_parent_dir=True)[source]

Return True if the output_file has the expected format (deduced from the file extension).

If the parent directory of the output file does not exist, it has to either be created or fail the check.

Parameters
  • output_file (str) – the output file

  • create_parent_dir (bool) – create the output file’s parent folder in case it does not exist

Return type

bool

npfc.utils.check_arg_output_plot(output_file, create_parent_dir=True)[source]

Return True if the output_plot has the expected format (deduced from the file extension).

Accepted extensions are: svg and png.

If the parent directory of the output file does not exist, it has to either be created or fail the check.

Parameters
  • output_file (str) – the output file

  • create_parent_dir (bool) – create the output file’s parent folder in case it does not exist

Return type

bool
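The check described for output plots can be sketched as follows. This is a minimal illustration, not the npfc implementation: the name `check_output_plot_sketch`, the use of `ValueError`, and the decision to raise when the parent directory is missing and `create_parent_dir` is False are all assumptions.

```python
from pathlib import Path

# Extensions listed in the documentation above.
ACCEPTED_PLOT_EXTENSIONS = {".svg", ".png"}


def check_output_plot_sketch(output_file: str, create_parent_dir: bool = True) -> bool:
    """Hypothetical re-creation of the documented check (not npfc code)."""
    path = Path(output_file)
    # The format is deduced from the file extension only.
    if path.suffix.lower() not in ACCEPTED_PLOT_EXTENSIONS:
        raise ValueError(f"unexpected plot format: '{path.suffix}'")
    parent = path.parent
    if not parent.is_dir():
        if not create_parent_dir:
            # Assumption: failing the check means raising an error.
            raise ValueError(f"parent directory does not exist: '{parent}'")
        parent.mkdir(parents=True, exist_ok=True)
    return True
```

With the default `create_parent_dir=True`, a call such as `check_output_plot_sketch("plots/fig.svg")` would create the `plots` folder if needed and return True.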

npfc.utils.check_arg_positive_number(value)[source]

Return True if the value is a positive number (>0), raise a TypeError otherwise.

Parameters

value (Union[int, float]) – the argument to test

Return type

bool

npfc.utils.decode_mol(string)[source]

Convert a string to an RDKit Mol object.

Parameters

string (str) – a string with a Mol object in bytes with a base64 string representation

Return type

Mol

Returns

a Mol object upon success, None otherwise

npfc.utils.decode_mol_smiles(string)[source]

Convert a Smiles to a Mol.

Parameters

string (str) – a SMILES string

Return type

Mol

Returns

a Mol object upon success, None otherwise

npfc.utils.decode_object(string)[source]

Convert a base64 string to an object.

Parameters

string (str) – a base64 string encoding an object

Return type

object

Returns

an object upon success, None otherwise

npfc.utils.encode_mol(mol)[source]

Convert a molecule to a base64 string.

Parameters

mol (Mol) – the input molecule

Return type

str

Returns

the molecule in base64

npfc.utils.encode_mol_smiles(mol)[source]

Convert a mol to a Smiles.

Parameters

mol (Mol) – the input molecule

Return type

str

Returns

a str object upon success, None otherwise

npfc.utils.encode_object(element)[source]

Convert an object to a base64 string.

Parameters

element (object) – an object to encode

Return type

str

Returns

a base64 string upon success, None otherwise
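A round trip between `encode_object` and `decode_object` might look like the sketch below. The documentation does not say how objects are serialized before base64 encoding; using `pickle` here is an assumption, as are the `_sketch` function names.

```python
import base64
import pickle
from typing import Optional


def encode_object_sketch(element: object) -> Optional[str]:
    """Serialize an object (assumption: with pickle) and return it as base64 text."""
    try:
        return base64.b64encode(pickle.dumps(element)).decode("ascii")
    except Exception:
        # Mirror the documented contract: None when encoding fails.
        return None


def decode_object_sketch(string: str) -> Optional[object]:
    """Reverse of encode_object_sketch; returns None when decoding fails."""
    try:
        return pickle.loads(base64.b64decode(string))
    except Exception:
        return None


original = {"idm": "mol_1", "aidxf": [0, 1, 2]}  # hypothetical payload
encoded = encode_object_sketch(original)
restored = decode_object_sketch(encoded)
```

Unpickling executes arbitrary code, so a scheme like this should only decode strings from trusted sources.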

npfc.utils.fuse_rings(rings)[source]

Check for atom indices in common between rings to aggregate them into fused rings.

Parameters

rings (tuple) – the ring atoms as provided by the RDKit function mol.GetRingInfo().AtomRings() (iterable of iterables of atom indices).

Return type

list

Returns

the fused ring atoms (list of lists of atom indices)
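The aggregation idea can be illustrated with plain sets: each new ring is merged with every group it shares an atom index with. This is a sketch under the assumption that any shared atom index is enough to fuse two rings; `fuse_rings_sketch` is a hypothetical name and the npfc code may order or group its output differently.

```python
from typing import Iterable, List, Set


def fuse_rings_sketch(rings: Iterable[Iterable[int]]) -> List[List[int]]:
    """Merge rings that share at least one atom index into fused groups."""
    fused: List[Set[int]] = []  # invariant: groups are pairwise disjoint
    for ring in rings:
        current = set(ring)
        untouched = []
        for group in fused:
            if group & current:
                current |= group  # absorb every group sharing an atom
            else:
                untouched.append(group)
        fused = untouched + [current]
    return [sorted(group) for group in fused]


# Two six-membered rings sharing atoms 4 and 5, plus an isolated three-membered ring.
rings = ((0, 1, 2, 3, 4, 5), (4, 5, 6, 7, 8, 9), (10, 11, 12))
result = fuse_rings_sketch(rings)
# result == [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 11, 12]]
```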

npfc.utils.get_file_format(input_file)[source]

Deduce how the file should be parsed based on its suffixes.

Compression is only supported for a single file whose original extension is still visible (see the example below).

>>> from pathlib import Path
>>> from npfc import utils
>>> utils.get_file_format(Path('file.csv.gz').suffixes)
('CSV', 'gzip')
>>> utils.get_file_format(Path('file.sdf').suffixes)
('SDF', None)

Parameters

suffixes – suffixes of a file

Return type

tuple

Returns

a tuple with syntax (format, compression)
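The suffix-based deduction can be sketched as below. Only the two cases shown in the documented example are covered; the `COMPRESSIONS` table, the error handling and the name `get_file_format_sketch` are assumptions rather than npfc code.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# Assumption: only gzip appears in the documented example.
COMPRESSIONS = {".gz": "gzip"}


def get_file_format_sketch(suffixes: List[str]) -> Tuple[str, Optional[str]]:
    """Return (format, compression) deduced from a list of file suffixes."""
    if not suffixes:
        raise ValueError("no suffix to deduce a format from")
    compression = COMPRESSIONS.get(suffixes[-1].lower())
    if compression is None:
        format_suffix = suffixes[-1]
    else:
        # The original extension must still be visible before the compression suffix.
        if len(suffixes) < 2:
            raise ValueError("compressed file without a visible format extension")
        format_suffix = suffixes[-2]
    return (format_suffix.lstrip(".").upper(), compression)


csv_gz = get_file_format_sketch(Path("file.csv.gz").suffixes)
sdf = get_file_format_sketch(Path("file.sdf").suffixes)
```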

npfc.utils.get_shortest_path_between_frags(mol, aidxf1, aidxf2)[source]

Return the shortest path within a molecule between two fragments defined by atom indices. The first and last atom indices of the path belong to fragment 1 and fragment 2 respectively, so they should not be counted when estimating the distance between fragments (i.e. distance = len(shortest_path) - 2).

Parameters
  • mol (Mol) – The input molecule.

  • aidxf1 (set) – the atom indices of the first fragment found in the molecule

  • aidxf2 (set) – the atom indices of the second fragment found in the molecule

Return type

tuple

Returns

the atom indices of the shortest path between both fragments. The first index is the attachment point from fragment 1 whereas the last index is the attachment point from fragment 2
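To make the distance convention concrete, here is a sketch that runs without RDKit. It assumes the molecule is given as a hypothetical adjacency dictionary (atom index to bonded atom indices) instead of a Mol object, and uses a breadth-first search started from all atoms of fragment 1 at once; the npfc function may compute the path differently.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Set, Tuple


def shortest_path_between_frags_sketch(
    bonds: Dict[int, Iterable[int]], aidxf1: Set[int], aidxf2: Set[int]
) -> Optional[Tuple[int, ...]]:
    """Breadth-first search from every atom of fragment 1 to the nearest atom of fragment 2."""
    previous = {atom: None for atom in aidxf1}  # also serves as the visited set
    queue = deque(sorted(aidxf1))
    while queue:
        atom = queue.popleft()
        if atom in aidxf2:
            # Walk back to the attachment point on fragment 1.
            path = []
            while atom is not None:
                path.append(atom)
                atom = previous[atom]
            return tuple(reversed(path))
        for neighbor in bonds.get(atom, ()):
            if neighbor not in previous:
                previous[neighbor] = atom
                queue.append(neighbor)
    return None  # the fragments are not connected


# Linear chain 0-1-2-3-4-5 with fragment 1 = {0, 1} and fragment 2 = {4, 5}.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
path = shortest_path_between_frags_sketch(chain, {0, 1}, {4, 5})
distance = len(path) - 2  # atoms 2 and 3 separate the two fragments
```

Here `path` is `(1, 2, 3, 4)`: atom 1 is the attachment point on fragment 1, atom 4 the attachment point on fragment 2, and the distance is 2.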

npfc.utils.parse_argparse_boolstring(value)[source]

Return True or False given the provided string. If the string is actually not a boolean, raise a TypeError.

Parameters

value (str) – the argument to test

Return type

bool

Returns

the parsed boolean as type bool
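A parser of this kind is typically passed to argparse as a `type` callable. The sketch below assumes that only the spellings "true" and "false" (case-insensitive) are accepted, which the documentation does not specify; `parse_boolstring_sketch` and the `--keep` flag are hypothetical names.

```python
import argparse


def parse_boolstring_sketch(value: str) -> bool:
    """Convert 'True'/'False' strings to bool, raise a TypeError for anything else."""
    normalized = value.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise TypeError(f"expected 'True' or 'False', got '{value}'")


parser = argparse.ArgumentParser()
# argparse reports a TypeError raised by a type callable as an invalid argument value.
parser.add_argument("--keep", type=parse_boolstring_sketch, default=False)
args = parser.parse_args(["--keep", "True"])
```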

npfc.utils.raise_timeout(signum, frame)[source]

Raise the TimeoutError once the time limit has been reached.

npfc.utils.random_string(n)[source]

Generate a random string of n lowercase letters and digits.

Parameters

n (int) – the number of characters in the generated string

Return type

str

Returns

a random string
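One way to produce such a string with the standard library is sketched below. The name `random_string_sketch` is hypothetical, and the choice of `random.choices` is an assumption about the approach, not a statement about the npfc code.

```python
import random
import string


def random_string_sketch(n: int) -> str:
    """Return n characters drawn from lowercase ASCII letters and digits."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(random.choices(alphabet, k=n))


token = random_string_sketch(8)
```

The `random` module is not suitable for secrets; for security-sensitive identifiers the `secrets` module would be the safer choice.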

npfc.utils.timeout(time)[source]

This function is used within a with statement:

>>> with timeout(5):
...     do_something()

If the code block execution time exceeds the time threshold, a TimeoutError is raised.

Parameters

time – time in seconds allowed to the code block before aborting its execution
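The `(signum, frame)` signature of `raise_timeout` suggests a signal handler, so a context manager like this could be built on `signal.alarm`. The sketch below rests on that assumption; the `_sketch` names are hypothetical, and the approach only works on Unix-like systems and in the main thread.

```python
import signal
from contextlib import contextmanager


def raise_timeout_sketch(signum, frame):
    """Signal handler: abort the running code block with a TimeoutError."""
    raise TimeoutError("code block exceeded the time limit")


@contextmanager
def timeout_sketch(time: int):
    """Raise a TimeoutError if the with-block runs longer than `time` seconds."""
    # Install the handler and remember the previous one so it can be restored.
    previous_handler = signal.signal(signal.SIGALRM, raise_timeout_sketch)
    signal.alarm(time)  # schedule SIGALRM after `time` whole seconds
    try:
        yield
    finally:
        signal.alarm(0)  # cancel the pending alarm, if any
        signal.signal(signal.SIGALRM, previous_handler)
```

Because `signal.alarm` takes whole seconds, sub-second limits would need `signal.setitimer` instead.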
