I am working out the area of interaction between a protein and ligand (in this example, PDB ID = 1r0r), and am experimenting with different dot densities. The file I am using has been fixed with PDBFixer and gives slightly different results to the one you can download from the PDB.
I am doing the following in PyMOL to calculate the area of interaction:
remove solvent
set dot_density, x
create 'sel1', E// #First selection is chain E
create 'sel2', * and not E// #Second selection is everything else, in this case chain I
create 'whole', * #To get the entire surface
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
This should in theory give the area of the chains as standalone objects minus the area of them complexed together, thus leaving me with the area of interaction between them. This works fine with dot density 1 and 2 (though gives me vastly different answers), yet when I ramp it up to dot density 3 I start getting negative numbers and I can't work out why that is. Here is a visualisation of the molecule (selection 1 in pink, selection 2 is teal and the whole selection is pale green):
Whole complex
Interaction area
Using my own fixed file:
[in]
set dot_density, 1
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
[out]
6462.04248046875
[in]
set dot_density, 2
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
[out]
2663.0439453125
[in]
set dot_density, 3
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
[out]
-357.0478515625
[in]
set dot_density, 4
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
[out]
-152.4716796875
Using the file obtained directly from the PDB:
[in]
set dot_density, 1
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
[out]
14053.78369140625
[in]
set dot_density, 2
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
[out]
2272.84228515625
[in]
set dot_density, 3
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
[out]
-735.10107421875
[in]
set dot_density, 4
print cmd.get_area('sel1') + cmd.get_area('sel2') - cmd.get_area('whole')
[out]
124.2021484375
I was hoping for these numbers to be at least similar (there's a difference of over 14,000 square Angstroms between all of these) and it would be great if they were all positive too. The higher dot density is supposed to give a more accurate measurement but I am not buying it right now :P
The end goal is to use a Python script that can take a list of PDB codes and do all this in one go and the script is all written out, I just need to figure out which PyMOL settings I should be using for the results to be reliable. I am also looking to use set dot_solvent, on and compare these results but clearly I'm not quite at that stage yet!
Many thanks,
Amy



this is not an answer, just trying to figure out the question better. To help people to answer. Using same .pdb file
1r0r.pdb, https://www.rcsb.org/structure/1r0r; must underline that is .pdbcmd.fetchwill dowload1r0r.cifthat could give different numbers.with this code:
and this added code (same folder of previous script) ,
runnable.py:I get these results from
pymol.cmd.get_area(....calculating both vand der Walls Surface (vdW) and Solvent Accessible Surface (SASA) , didn't find a way to get the Solvent Excluded Surface / Molecular Surface.NOTE: Not sure I got the names above right I am assuming this pic is correct, please correct me if wrong:
Results:
and the accompaining plot:
that tries to describe pymol calculated surfaces.
Note aside
set dot_densityaccepts both negative and positive integers (not float) but 0 and below seems to give same results while above 4 at visually andget_arearesults are same too. Wasn't able to get any google result for the algorithm used by pymol.Some picture to visually depict the problem:
VdW Surface :
SASA Surface:
dots, dot density -> 0 ; solvent on (like SASA)
dots , dot density -> 4 ; solvent on (like SASA)
dots, dot density -> 4 ; solvent off (like vdW)
Biopython seems to have a SASA module https://biopython.org/docs/dev/api/Bio.PDB.SASA.html .
here the code, not sure its right, please check throughly:
and graph code
runnable_CONVERGE.py:that uses
Bio.PDB.SASA.ShrakeRupley(probe_radius=1.40, n_points=100, radii_dict=None)with n_points [100 optimum as per Biopython documentation] going from 1 to 150 in step of 2 :and to better visualize how
SASAgoes toward the end of the alghoritm:with calculated difference between last and second-last SASA values of
3.4508421817808994and maximum difference between SASA and previous SASA in th last 20 calculated SASA of17.7Biopython
SASA.pyis at https://github.com/biopython/biopython/blob/master/Bio/PDB/SASA.py , quoting it :Retriving the atom coordinates from a hacked version of
SASA.pywe can compare the number of atoms having a Solvent Accessible Surface (calculate with parameter n_points = 800) with the surface (surface_solvent on) generated by pymol for the whole1r0r.pdb.All the atoms (2310) sphere cgo placed at atom coords vs pymol surface:
Only the atoms (995) with a SASA > 0 :
Using parameter
n_points= 1200 , and adding to the cgo spheres (around 1100 circa) the radius of relevant atom element plus the one of the probe , I get this pics of the entire SASA surface in green , compared to the one generated by pymol in rainbow by element (superposed to the first one) :It was kind of tricky to get this image because I did not find a way to combine all the spheres to a single cgo object (https://pymol-users.narkive.com/wtPZzPwg/pymol-combine-two-objects-in-pymol )