[gmx-users] g_sas: unable to reproduce data from original article of the DCLM (Eisenhaber1995)

David van der Spoel spoel at xray.bmc.uu.se
Sat Feb 22 22:14:49 CET 2014


On 2014-02-22 18:34, João M. Damas wrote:
> Dear all,
>
> I have been having some puzzling results when using g_sas to analyze some
> trajectories, so I have decided to go back to basics.
>
> As described in the documentation, g_sas implements the double cubic
> lattice method (DCLM) originally presented in Eisenhaber1995 (doi:
> 10.1002/jcc.540160303 <http://dx.doi.org/10.1002/jcc.540160303>). Table I
> of that article shows a test case of three proteins (PDB code: 4PTI, 3FXN
> and 1TIM), where the DCLM (numerical method) at different point densities
> is compared against an analytic method (also by Eisenhaber, citation 30).
>
> My question was: can g_sas reproduce the areas reported in Table I of
> Eisenhaber1995?
>
> First of all, I went on to reproduce the reported areas with the software
> originally developed by Eisenhaber, which has both the analytical and
> numerical (DCLM) methods implemented - ASC software. As it can be seen in
> Attached Table (see below), running ASC provided results very similar (to
> the tenths of square angstrom), if not equal, to the reported ones. Besides
> assessing the reported areas, I now have a software which allows me to
> benchmark other cases.
>
> The next step was answering my question. For that, I had to make sure g_sas
> was performing the same calculation as ASC software:
>
>     - Both ASC and g_sas took the same input structures (only using the ATOM
>     records). In the case of g_sas, I had to build a .tpr file for the -s flag
>     of g_sas, which was done using pdb2gmx+grompp. Like with the ASC, the g_sas
>     calculation was done using only the heavy atoms (using the Protein-H group
>     for calculation and output groups).
>     - The radii used also had to be the same for both ASC and g_sas. g_sas
>     is still using the vdwradii.dat, so I created a new vdwradii.dat file with
>     the Eisenberg&McLachlan radii that are used in the ASC calculations (both
>     mine and the reported ones). The radii were correctly attributed to the
>     atoms, as checked in the g_sas.log file outputted by the debug mode.
>     - In both cases the probe radius was 1.4 angstroms (0.14 nm as per g_sas
>     input).
>
> The area calculations were then performed using g_sas (number of points
> used were provided using the -ndots flag). Originally, I used GROMACS
> 4.0.4, but afterwards I also performed the calculations using GROMACS
> 4.6.3. Results are presented in Attached Table (see below).
>
>     - GROMACS 4.0.4 g_sas DCLM overestimates the area by about 5, 10 and 20%
>     for 4PTI, 3FXN and 1TIM, respectively, in relation to ASC DCLM. This is
>     also the same overestimation of the areas obtained through an analytical
>     method, since ASC DCLM gives a good estimation of the analytical ones,
>     unlike g_sas.
>     - On the other hand, GROMACS 4.6.3 g_sas DCLM underestimates the area by
>     about 3, 10 and 0.5% for 4PTI, 3FXN and 1TIM, respectively, in relation to
>     ASC DCLM. I have checked that this different behavior appeared from the
>     4.0.X to the 4.5.X versions. This is odd since I have searched the release
>     notes for g_sas and I haven't found any change in this tool... Maybe this
>     is a clue to what is going on?
>     - From my understanding, DCLM implemented on ASC or on g_sas shouldn't
>     give different results. Moreover, even if there could be some differences
>     when using a low number of points (due some factor of randomness I may not
>     be aware), this difference should have already vanished when using more
>     than 1000 points per atom. Take note that increasing the number of points
>     in g_sas does not give areas nearer the analytical results (hence, more
>     dots may not mean more accuracy, maybe this is the reason 24 is the default
>     number...). Can anyone point me out on some mistake I may be doing? Maybe
>     there is a factor that I may be disregarding...
>
> I have some data which point out that these differences (errors?) may lead
> to very large errors when calculating properties like buried/contact areas,
> specially if the buried/contact areas have different magnitudes from the
> molecule's areas. But this may be the theme of a another/follow-up e-mail
> when I get ASC working on my trajectories.
>
> Best regards,
> João M. Damas
>
> --
> Attached Table
> met  - method used, anl=analytical method, num=numerical method (DCLM)
> dots - number of points used for the numerical method
> src  - source of the data, paper=Eisenhaber1995, ASC=ASC software,
> gmx404=g_sas of GMX4.0.4,
>         gmx463=g_sas of GMX4.6.3
> Areas are in square angstroms
> ==================================================
>    met  dots     src      4PTI      3FXN      1TIM
> ==================================================
>    anl         paper   3973.80   6943.80  20002.80
>    anl           ASC   3973.81   6943.80  20002.80
>
>    num   122   paper   3961.40   6968.30  19970.90
>    num   122     ASC   3961.44   6968.33  19970.90
>    num   122  gmx404   4165.31   7727.73  24110.10
>    num   122  gmx463   3825.86   6257.83  19867.90
>
>    num   362   paper   3971.80   6933.40  19997.10
>    num   362     ASC   3971.79   6933.37  19997.10
>    num   362  gmx404   4192.12   7704.33  24161.50
>    num   362  gmx463   3838.35   6248.43  19886.80
>
>    num   642   paper   3967.90   6944.40  19998.70
>    num   642     ASC   3967.78   6944.37  19998.70
>    num   642  gmx404   4187.54   7703.00  24148.90
>    num   642  gmx463   3831.64   6255.64  19882.70
>
>    num  1002   paper   3974.10   6939.10  20012.20
>    num  1002     ASC   3974.07   6939.12  20012.20
>    num  1002  gmx404   4191.68   7702.32  24173.20
>    num  1002  gmx463   3840.58   6250.14  19897.40
>
>    num  1472   paper   3975.70   6943.30  19997.00
>    num  1472     ASC   3975.70   6943.35  19996.90
>    num  1472  gmx404   4197.65   7710.93  24153.70
>    num  1472  gmx463   3841.69   6256.92  19882.80
> ==================================================
>
Thanks for a thorough analysis. Some more points to test:
- compile gromacs in double precision and rerun
- verify there are no periodic boundary effects by making the box larger.


-- 
David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone:	+46184714205.
spoel at xray.bmc.uu.se    http://folding.bmc.uu.se


More information about the gromacs.org_gmx-users mailing list