[gmx-users] Very large fluctuations in dg/dl

John D. Chodera jchodera at gmail.com
Mon May 7 22:26:00 CEST 2007


Hi Patrick,

> I find a reasonable DeltaGsol value of 8.6 kJ/mol for methane  
> (compared
> to 8.7 in Geerke & van Gunsteren, ChemPhysChem 2006, 7, 671 – 678) but
> I get really huge fluctuations in the values of dg/dl:
> lambda=0.00: 5.0 +/- 10.8 (mean +/- standard deviation)
> lambda=0.05: 4.3 +/- 11.2

Is the standard deviation you quote here the standard deviation of  
the dg/dl samples, or the standard deviation of the mean?

If the former, then this behavior is totally expected: While the  
standard deviation of a random variable (your dg/dl samples) may be  
large, with enough sampling, we can get a very precise estimate of  
the mean.  More sampling will not change the standard deviation of  
the dg/dl samples, but it will reduce the standard error in the mean,  
which is what we need for precise estimates of free energy differences.

The uncertainty in the estimate of <dg/dl> is given simply by

d<dg/dl> = sigma / sqrt(N / g)

where here, sigma is the standard deviation of your dg/dl samples, N  
is the number of data points you have collected, and g is something  
called the "statistical inefficiency", which can be estimated from  
the correlation time of your dg/dl samples.  More information on this  
sort of analysis can be found in reference [1] below.

Once you have the uncertainty in each estimate of <dg/dl>, you still  
have to combine these to get the uncertainty estimate for the  
integrated free energy difference using standard propagation of  
error.  This depends on your choice of quadrature for TI.  David  
Mobley has done a lot of this, and I'm sure would be willing to help  
if you had trouble figuring it out.

Good luck!

- John

[1] W. Janke. Statistical analysis of simulations: Data correlations  
and error estimation. In J. Grotendorst, D. Marx, and A. Murmatsu,  
editors, Quantum Simulations of Complex Many-Body Systems: From  
Theory to Algorithms, volume 10, pages 423–445. John von Neumann  
Institute for Computing, 2002.

--
John Chodera <jchodera at gmail.com>             | Mobile    : 415 867-7384
Postdoctoral researcher, Pande lab            | Lab phone : 650.723.1097
Department of Chemistry, Stanford University  | Lab fax   : 650.724.4021
http://www.dillgroup.ucsf.edu/~jchodera
--

Date: Mon, 07 May 2007 16:41:26 +0200
From: Patrick Fuchs <Patrick.Fuchs at ebgm.jussieu.fr>
Subject: [gmx-users] Very large fluctuations in dg/dl
To: gmx-users at gromacs.org
Message-ID: <463F3A96.9050206 at ebgm.jussieu.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Gromacs users,
I have a few questions related to solvation free energy calculation via
thermodynamic integration.
I'm trying to reproduce some literature data (on e.g. methane,
methanol...) using the GROMOS G53a6 force field. I followed the tutorial
of David Mobley (thanks to him BTW), but I used the standard non bonded
options of the G53a6 force field (instead of OPLS). For each lambda
value I do a minimization, a 10 ps NVT followed by a 20 ps NPT
equilibration, and a 1 ns NVT production using the sd integrator. I used
21 lambda values (0.00, 0.05...1.00).
Here's my topology file:
----------------begining of methane.top------------------------
; topology for a methane molecule

; include GROMOS53a6 force field
#include "ffG53a6.itp"

;;;;;;; begin methane definition ;;;;;;;
[ moleculetype ]
; Name           nrexcl
METH             3

[ atoms ]
;nr type resnr residue atom cgnr charge mass    typeB chargeB massB
   1  CH4  1     METH    C1   0    0.0000 16.0430 DUM   0.0000  16.04300
;;;;;; end methane definition ;;;;;;;;

; include water topology
#ifdef FLEX_SPC
#include "flexspc.itp"
#else
#include "spc.itp"
#endif

[ system ]
; name
1 methane molecule in water

[ molecules ]
; name  number
METH    1
SOL     893
-----------------end of methane.top------------------------

And here is my mdp file for lambda=0:
---------------begining of prod.mdp---------------------
title               = production NVT methane/water
cpp                 = /lib/cpp
; OPTIONS FOR BOND CONSTRAINTS
constraints         = all-bonds
; RUN CONTROL PARAMETERS
integrator          = sd
tinit               = 0
dt                  = 0.002
nsteps              = 500000 ; 1000 ps
; NUMBER OF STEPS FOR CENTER OF MASS MOTION REMOVAL
nstcomm             = 100
; OUTPUT CONTROL OPTIONS
nstxout                  = 500
nstvout                  = 500
nstfout                  = 0
nstlog                   = 500
nstenergy                = 100
nstxtcout                = 5000
xtc-precision            = 1000
; NON BONDED STUFF
ns_type             =  grid
nstlist                = 5
rlist                  = 0.8
coulombtype            = generalized-reaction-field
rcoulomb               = 1.4
rvdw                   = 1.4
epsilon_rf             = 54.0
;OPTIONS FOR TEMPERATURE COUPLING
tc_grps                  = system
tau_t                    = 0.1 ; inverse langevin friction cst
ref_t                    = 300
;OPTIONS FOR PRESSURE COUPLING
Pcoupl                   = no
tau_p                    = 0.5
compressibility          = 4.5e-5
ref_p                    = 1.0
; FREE ENERGY CONTROL STUFF
free_energy              = yes
init_lambda              = 0.00
delta_lambda             = 0
sc_alpha                 = 0.5
sc-power                 = 1.0
sc-sigma                 = 0.3
; VELOCITY GENERATION
gen_vel                  = yes
gen_temp                 = 300
gen_seed                 = -1
-----------------end of prod.mdp------------------------
I find a reasonable DeltaGsol value of 8.6 kJ/mol for methane (compared
to 8.7 in Geerke & van Gunsteren, ChemPhysChem 2006, 7, 671 – 678) but
I get really huge fluctuations in the values of dg/dl:
lambda=0.00: 5.0 +/- 10.8 (mean +/- standard deviation)
lambda=0.05: 4.3 +/- 11.2
...
lambda=1.00: -0.3 +/- 4.0
Furthermore, each of these mean value is very slow at converging (1 ns
seems a minimum for certain lambda values...).
I can't get reasonable fluctuations even if I sample more. In addition,
there are very frequent warnings in the log file such as:
----
Large VCM(group rest):      0.01363,      0.00818,      0.01147,
ekin-cm:  3.09490e+00
----
Here are my questions:
1) Has someone an idea of what could be the cause of these [very] large
fluctuations? Does it come from my setup, or is this a normal behavior?
2) Are these 'Large VCM(group rest)' warnings related to the use of sd
integrator (when I switch to md integrator, I no longer get these
warnings) ?
Thanks for your answer,

Patrick
-- 
_______________________________________________________
Patrick FUCHS
Equipe de Bioinformatique Genomique et Moleculaire
INSERM U726, Universite Paris 7
Case Courrier 7113
2, place Jussieu, 75251 Paris Cedex 05, FRANCE
Tel : +33 (0)1-44-27-77-16 - Fax : +33 (0)1-43-26-38-30
E-mail : Patrick.Fuchs at ebgm.jussieu.fr
Web Site: http://www.ebgm.jussieu.fr/~fuchs





More information about the gromacs.org_gmx-users mailing list