How to unfold only protein atoms using Bio.PDB.Selection?

from Bio.PDB import PDBParser
from Bio.PDB import Selection
structure = PDBParser().get_structure('4GBX', '4GBX.pdb') # load your molecule
atom_list = Selection.unfold_entities(structure[0]['E'], 'A') # 'A' is for Atoms in the chain 'E'

When I unfold chain E of PDB entry 4GBX with the code above, the last two oxygen atoms in atom_list belong to water molecules (heteroatoms) in the same chain. How can I get a list containing only the atoms of protein residues, excluding ligands and water molecules from the selection?


The code from the question, with a loop added to print each atom, its element, and the id of its parent residue:

from Bio.PDB import PDBParser
from Bio.PDB import Selection
structure = PDBParser(QUIET=True).get_structure('4GBX', '4gbx.pdb') # load your molecule
atom_list = Selection.unfold_entities(structure[0]['E'], 'A') # 'A' is for Atoms in the chain 'E'

for atom in atom_list:
    print(atom, atom.element, atom.parent.id)

output:

...........
...........
...........
<Atom CA> C (' ', 93, ' ')
<Atom C> C (' ', 93, ' ')
<Atom O> O (' ', 93, ' ')
<Atom CB> C (' ', 93, ' ')
<Atom OG1> O (' ', 93, ' ')
<Atom CG2> C (' ', 93, ' ')
<Atom O> O ('W', 101, ' ')
<Atom O> O ('W', 102, ' ')

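Answer, as suggested in the comments: keep only the atoms whose residue has a blank hetero flag. The residue id is a tuple (hetero flag, sequence number, insertion code); the flag is ' ' for standard residues, 'W' for water, and 'H_' followed by the residue name for other hetero residues. Note that modified residues stored as HETATM records are dropped by this filter as well.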

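from Bio.PDB import PDBParser
from Bio.PDB import Selection
structure = PDBParser(QUIET=True).get_structure('4GBX', '4gbx.pdb') # load your molecule
atom_list = Selection.unfold_entities(structure[0]['E'], 'A') # 'A' is for Atoms in the chain 'E'

# get_full_id() returns (structure id, model id, chain id, residue id, atom id);
# [3] is the residue id and [3][0] is its hetero flag (' ' for standard residues)
protein_atoms = [atom for atom in atom_list if atom.get_full_id()[3][0] == " "]

for atom in protein_atoms:
    print(atom, atom.element, atom.parent.id)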

output :

...........
...........
...........
<Atom CA> C (' ', 93, ' ')
<Atom C> C (' ', 93, ' ')
<Atom O> O (' ', 93, ' ')
<Atom CB> C (' ', 93, ' ')
<Atom OG1> O (' ', 93, ' ')
<Atom CG2> C (' ', 93, ' ')
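
The same filter can also be applied one level up, on the residues, before collecting atoms. The following is only a sketch of that idea, not part of the original answer: the names protein_atoms and demo are hypothetical helpers introduced here, and the file name is the one assumed in the code above.

```python
def protein_atoms(chain):
    """Collect atoms from residues whose hetero flag is blank.

    `chain` is expected to behave like a Bio.PDB Chain: iterating it
    yields residues, each with an `id` tuple and iterable atoms.
    """
    atoms = []
    for residue in chain:
        hetero_flag = residue.id[0]  # ' ' standard, 'W' water, 'H_...' other hetero
        if hetero_flag == " ":
            atoms.extend(residue)    # iterating a residue yields its atoms
    return atoms


def demo(pdb_path="4gbx.pdb"):
    """Hypothetical usage example; needs Biopython and a local PDB file."""
    from Bio.PDB import PDBParser

    structure = PDBParser(QUIET=True).get_structure("4GBX", pdb_path)
    for atom in protein_atoms(structure[0]["E"]):
        print(atom, atom.element, atom.get_parent().id)
```

Filtering at the residue level avoids calling get_full_id() on every atom, but like the answer above it also drops modified residues that the file stores as HETATM records.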