RDKit Mol object GetProp("_Name") is empty. How to get the ID?

I have looked through the documentation but couldn't find anything relevant. In my SDF file the ChEMBL name is empty, but each entry does have a ChEMBL ID, which is what I want. Here is the end of one entry:

 15 12  2  0
 16 13  1  0
 17 10  1  0
 18 16  2  0
 19 16  1  0
 20 19  2  0
 21 18  1  0
 22 20  1  0
  4  5  2  0
 10  9  2  0
 21 22  2  0
M  END
>  <chembl_id>  (4) 
CHEMBL6226

>  <chembl_pref_name>  (4) 
None

$$$$

But

for idx, mol in enumerate(self.hits):
    print(mol.GetProp("_Name"))

prints only whitespace. I need the ChEMBL ID in this case.

There is 1 best solution below

The properties in your SDF are added to the molecules. You can access them in a few different ways:

# Return the properties as a dictionary
prop_dict = mol.GetPropsAsDict()
chembl_id = prop_dict.get('chembl_id')

# Check whether a property exists
has_id = mol.HasProp('chembl_id')

# Get a single property as a string (raises KeyError if it is missing)
chembl_id = mol.GetProp('chembl_id')

# Get a numerical property (assuming 'MolWt' exists in the file)
mol_wt = mol.GetDoubleProp('MolWt')

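The _Name property is read from the title line of each record in the SD file, i.e. the first line of the record. If that line is blank, _Name is empty, which would explain the whitespace you are seeing. In this example the _Name property will be '6602966' (a PubChem ID):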

$$$$
6602966
     RDKit          2D

 27 29  0  0  0  0  0  0  0  0999 V2000
    8.1706    6.9514    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
...
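
Putting the two points together, one possible approach is a small helper that prefers the title-line name and falls back to an SD data field when the title line is blank. This is only a sketch: the function name `get_identifier` and the `fallback_props` parameter are made up for illustration and are not part of RDKit. It relies only on `HasProp` and `GetProp`, and assumes the data field is called `chembl_id` as in the question.

```python
def get_identifier(mol, fallback_props=("chembl_id",)):
    """Return the title-line name if present, else the first non-empty fallback property.

    `mol` is expected to behave like an RDKit Mol, i.e. to provide
    HasProp() and GetProp(). The helper name and the fallback list are
    illustrative assumptions, not part of RDKit.
    """
    # _Name comes from the title line and may be missing or blank
    if mol.HasProp("_Name"):
        name = mol.GetProp("_Name").strip()
        if name:
            return name

    # Otherwise fall back to the SD data fields, in order of preference
    for prop in fallback_props:
        if mol.HasProp(prop):
            value = mol.GetProp(prop).strip()
            if value:
                return value

    # Nothing usable was found
    return None
```

In the loop from the question, `print(get_identifier(mol))` could then replace `print(mol.GetProp("_Name"))`. If other code expects `_Name` to be filled in, the ID can also be copied into it with `mol.SetProp("_Name", chembl_id)`.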