Abdulelah-Gani PST Class

Abdulelah-Gani PST fragmentation module.

class AbdulelahGaniPSTModel(subgroups: DataFrame, subgroups_info: DataFrame, allow_overlapping: bool = False, allow_free_atoms: bool = False)

Bases: FragmentationModel

Abdulelah-Gani model dedicated to properties estimation.

Class that builds the detector of primary, secondary and tertiary structures for the Abdulelah-Gani properties estimation model [12].

Parameters:
  • subgroups (pd.DataFrame) – Model’s subgroups. Index: ‘group’ (subgroup names). Mandatory columns: ‘smarts’ (SMARTS representation of the group, used to detect its presence in the molecule).

  • subgroups_info (pd.DataFrame) – Group’s subgroups numbers.

subgroups

Model’s subgroups. Index: ‘group’ (subgroup names). Mandatory columns: ‘smarts’ (SMARTS representation of the group, used to detect its presence in the molecule).

Type:

pd.DataFrame

detection_mols

Dictionary containing all the RDKit Mol objects built from the detection_smarts subgroups column.

Type:

dict

info

Group’s subgroups numbers.

Type:

pd.DataFrame
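As a rough illustration of how the subgroups table and detection_mols relate, the sketch below builds a tiny table with the documented layout (index ‘group’, column ‘smarts’) and turns each SMARTS into an RDKit query molecule. The three groups and their SMARTS are made up for this example and are not the model's real subgroups; the use of the ‘smarts’ column as the source of the dictionary is also an assumption.

```python
import pandas as pd
from rdkit import Chem

# Illustrative table only: these three SMARTS are NOT the model's real subgroups.
subgroups = pd.DataFrame(
    {"smarts": ["[CX4H3]", "[CX4H2]", "[OX2H]"]},
    index=pd.Index(["CH3", "CH2", "OH"], name="group"),
)

# One RDKit query molecule per subgroup, keyed by the group name.
detection_mols = {
    group: Chem.MolFromSmarts(smarts)
    for group, smarts in subgroups["smarts"].items()
}

# Count how many times each pattern matches ethanol.
ethanol = Chem.MolFromSmiles("CCO")
matches = {
    group: len(ethanol.GetSubstructMatches(pattern))
    for group, pattern in detection_mols.items()
}
```

Precompiling the patterns once, as the dictionary does here, avoids parsing the same SMARTS again for every molecule.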

get_groups(identifier: str | rdkit.Chem.rdchem.Mol, identifier_type: str = 'name', solver: ILPSolver = DefaultSolver, search_multiple_solutions: bool = False) → AGaniPSTFragmentationResult | List[AGaniPSTFragmentationResult]

Get the groups of a molecule.

Parameters:
  • identifier (Union[str, Chem.rdchem.Mol]) – Identifier of the molecule. You can use the name of the molecule, its SMILES, or an RDKit Mol object.

  • identifier_type (str, optional) – Identifier type of the molecule. Use “name” if you are providing the molecule’s name, “smiles” if you are providing the SMILES, or “mol” if you are providing an RDKit Mol object, by default “name”

  • solver (ILPSolver, optional) – ILP solver class, by default DefaultSolver

  • search_multiple_solutions (bool, optional) – Whether to search for multiple solutions or not, by default False. If False, the return value is a single AGaniPSTFragmentationResult object; if True, it is a list of AGaniPSTFragmentationResult objects.

Returns:

Fragmentation result. If search_multiple_solutions is False, a single AGaniPSTFragmentationResult object is returned; if True, a list of AGaniPSTFragmentationResult objects is returned.

Return type:

Union[AGaniPSTFragmentationResult, List[AGaniPSTFragmentationResult]]

mol_preprocess(mol: Mol) → Mol

Preprocess the molecule to be ready for the fragmentation.

This method preprocesses the molecule so it is ready for the fragmentation process. The preprocessing steps are:

  1. Kekulize the molecule.

  2. Identify the aromatic rings.

  3. Check the aromaticity of the rings.

  4. Make the rings aromatic or non-aromatic based on the step 3 check.

The criteria used to check the aromaticity of the rings are based on those applied by Abdulelah and Gani in the dataset of the original paper [12].

Parameters:

mol (Chem.rdchem.Mol) – Molecule to preprocess

Returns:

Preprocessed molecule

Return type:

Chem.rdchem.Mol
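The four preprocessing steps listed for mol_preprocess can be sketched with plain RDKit calls, as below. This is not the library's implementation: preprocess_sketch and rdkit_says_aromatic are hypothetical names, the sketch enumerates every ring rather than only candidate aromatic ones, and the ring criterion is a placeholder that simply reuses RDKit's own aromaticity perception, since the authors' criteria from [12] are not reproduced here.

```python
from typing import Callable, Sequence

from rdkit import Chem

RingCheck = Callable[[Chem.Mol, Sequence[int]], bool]


def rdkit_says_aromatic(mol: Chem.Mol, ring: Sequence[int]) -> bool:
    """Placeholder criterion (assumption): trust RDKit's default perception."""
    return all(mol.GetAtomWithIdx(i).GetIsAromatic() for i in ring)


def preprocess_sketch(mol: Chem.Mol,
                      ring_is_aromatic: RingCheck = rdkit_says_aromatic) -> Chem.Mol:
    work = Chem.Mol(mol)  # work on a copy, leave the input untouched

    # Step 1: kekulize and drop every aromatic flag.
    Chem.Kekulize(work, clearAromaticFlags=True)

    # Step 2: collect the rings (atom and bond indexes, same ring order).
    ring_info = work.GetRingInfo()
    atom_rings = ring_info.AtomRings()
    bond_rings = ring_info.BondRings()

    # Step 3: evaluate the criterion on the original, unmodified molecule.
    verdicts = [ring_is_aromatic(mol, ring) for ring in atom_rings]

    # Step 4: restore aromatic flags only on rings that passed the check.
    for atoms, bonds, aromatic in zip(atom_rings, bond_rings, verdicts):
        if not aromatic:
            continue
        for idx in atoms:
            work.GetAtomWithIdx(idx).SetIsAromatic(True)
        for idx in bonds:
            bond = work.GetBondWithIdx(idx)
            bond.SetIsAromatic(True)
            bond.SetBondType(Chem.BondType.AROMATIC)
    return work


benzene = Chem.MolFromSmiles("c1ccccc1")
kept = preprocess_sketch(benzene)
stripped = preprocess_sketch(benzene, ring_is_aromatic=lambda m, r: False)
```

Passing the criterion as a callable keeps the ring decision separate from the flag bookkeeping, so a stricter rule can be swapped in without touching the other steps.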

check_ring_aromaticity(mol: Mol, ring: List[int]) → bool

Verify the aromaticity of a ring according to the authors’ criteria.

Parameters:
  • mol (Chem.Mol) – Molecule

  • ring (list[int]) – List of atom indexes that form the ring

Returns:

True if the ring is aromatic, False otherwise

Return type:

bool