# [gmx-users] Very large fluctuations in dg/dl

John D. Chodera jchodera at gmail.com
Mon May 7 22:26:00 CEST 2007

```Hi Patrick,

> I find a reasonable DeltaGsol value of 8.6 kJ/mol for methane
> (compared
> to 8.7 in Geerke & van Gunsteren, ChemPhysChem 2006, 7, 671 – 678) but
> I get really huge fluctuations in the values of dg/dl:
> lambda=0.00: 5.0 +/- 10.8 (mean +/- standard deviation)
> lambda=0.05: 4.3 +/- 11.2

Is the standard deviation you quote here the standard deviation of
the dg/dl samples, or the standard deviation of the mean?

If the former, then this behavior is totally expected: While the
standard deviation of a random variable (your dg/dl samples) may be
large, with enough sampling, we can get a very precise estimate of
the mean.  More sampling will not change the standard deviation of
the dg/dl samples, but it will reduce the standard error in the mean,
which is what we need for precise estimates of free energy differences.

The uncertainty in the estimate of <dg/dl> is given simply by

d<dg/dl> = sigma / sqrt(N / g)

where here, sigma is the standard deviation of your dg/dl samples, N
is the number of data points you have collected, and g is something
called the "statistical inefficiency", which can be estimated from
sort of analysis can be found in reference [1] below.

Once you have the uncertainty in each estimate of <dg/dl>, you still
have to combine these to get the uncertainty estimate for the
integrated free energy difference using standard propagation of
Mobley has done a lot of this, and I'm sure would be willing to help
if you had trouble figuring it out.

Good luck!

- John

[1] W. Janke. Statistical analysis of simulations: Data correlations
and error estimation. In J. Grotendorst, D. Marx, and A. Murmatsu,
editors, Quantum Simulations of Complex Many-Body Systems: From
Theory to Algorithms, volume 10, pages 423–445. John von Neumann
Institute for Computing, 2002.

--
John Chodera <jchodera at gmail.com>             | Mobile    : 415 867-7384
Postdoctoral researcher, Pande lab            | Lab phone : 650.723.1097
Department of Chemistry, Stanford University  | Lab fax   : 650.724.4021
http://www.dillgroup.ucsf.edu/~jchodera
--

Date: Mon, 07 May 2007 16:41:26 +0200
From: Patrick Fuchs <Patrick.Fuchs at ebgm.jussieu.fr>
Subject: [gmx-users] Very large fluctuations in dg/dl
To: gmx-users at gromacs.org
Message-ID: <463F3A96.9050206 at ebgm.jussieu.fr>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Gromacs users,
I have a few questions related to solvation free energy calculation via
thermodynamic integration.
I'm trying to reproduce some literature data (on e.g. methane,
methanol...) using the GROMOS G53a6 force field. I followed the tutorial
of David Mobley (thanks to him BTW), but I used the standard non bonded
options of the G53a6 force field (instead of OPLS). For each lambda
value I do a minimization, a 10 ps NVT followed by a 20 ps NPT
equilibration, and a 1 ns NVT production using the sd integrator. I used
21 lambda values (0.00, 0.05...1.00).
Here's my topology file:
----------------begining of methane.top------------------------
; topology for a methane molecule

; include GROMOS53a6 force field
#include "ffG53a6.itp"

;;;;;;; begin methane definition ;;;;;;;
[ moleculetype ]
; Name           nrexcl
METH             3

[ atoms ]
;nr type resnr residue atom cgnr charge mass    typeB chargeB massB
1  CH4  1     METH    C1   0    0.0000 16.0430 DUM   0.0000  16.04300
;;;;;; end methane definition ;;;;;;;;

; include water topology
#ifdef FLEX_SPC
#include "flexspc.itp"
#else
#include "spc.itp"
#endif

[ system ]
; name
1 methane molecule in water

[ molecules ]
; name  number
METH    1
SOL     893
-----------------end of methane.top------------------------

And here is my mdp file for lambda=0:
---------------begining of prod.mdp---------------------
title               = production NVT methane/water
cpp                 = /lib/cpp
; OPTIONS FOR BOND CONSTRAINTS
constraints         = all-bonds
; RUN CONTROL PARAMETERS
integrator          = sd
tinit               = 0
dt                  = 0.002
nsteps              = 500000 ; 1000 ps
; NUMBER OF STEPS FOR CENTER OF MASS MOTION REMOVAL
nstcomm             = 100
; OUTPUT CONTROL OPTIONS
nstxout                  = 500
nstvout                  = 500
nstfout                  = 0
nstlog                   = 500
nstenergy                = 100
nstxtcout                = 5000
xtc-precision            = 1000
; NON BONDED STUFF
ns_type             =  grid
nstlist                = 5
rlist                  = 0.8
coulombtype            = generalized-reaction-field
rcoulomb               = 1.4
rvdw                   = 1.4
epsilon_rf             = 54.0
;OPTIONS FOR TEMPERATURE COUPLING
tc_grps                  = system
tau_t                    = 0.1 ; inverse langevin friction cst
ref_t                    = 300
;OPTIONS FOR PRESSURE COUPLING
Pcoupl                   = no
tau_p                    = 0.5
compressibility          = 4.5e-5
ref_p                    = 1.0
; FREE ENERGY CONTROL STUFF
free_energy              = yes
init_lambda              = 0.00
delta_lambda             = 0
sc_alpha                 = 0.5
sc-power                 = 1.0
sc-sigma                 = 0.3
; VELOCITY GENERATION
gen_vel                  = yes
gen_temp                 = 300
gen_seed                 = -1
-----------------end of prod.mdp------------------------
I find a reasonable DeltaGsol value of 8.6 kJ/mol for methane (compared
to 8.7 in Geerke & van Gunsteren, ChemPhysChem 2006, 7, 671 – 678) but
I get really huge fluctuations in the values of dg/dl:
lambda=0.00: 5.0 +/- 10.8 (mean +/- standard deviation)
lambda=0.05: 4.3 +/- 11.2
...
lambda=1.00: -0.3 +/- 4.0
Furthermore, each of these mean value is very slow at converging (1 ns
seems a minimum for certain lambda values...).
I can't get reasonable fluctuations even if I sample more. In addition,
there are very frequent warnings in the log file such as:
----
Large VCM(group rest):      0.01363,      0.00818,      0.01147,
ekin-cm:  3.09490e+00
----
Here are my questions:
1) Has someone an idea of what could be the cause of these [very] large
fluctuations? Does it come from my setup, or is this a normal behavior?
2) Are these 'Large VCM(group rest)' warnings related to the use of sd
integrator (when I switch to md integrator, I no longer get these
warnings) ?

Patrick
--
_______________________________________________________
Patrick FUCHS
Equipe de Bioinformatique Genomique et Moleculaire
INSERM U726, Universite Paris 7
Case Courrier 7113
2, place Jussieu, 75251 Paris Cedex 05, FRANCE
Tel : +33 (0)1-44-27-77-16 - Fax : +33 (0)1-43-26-38-30
E-mail : Patrick.Fuchs at ebgm.jussieu.fr
Web Site: http://www.ebgm.jussieu.fr/~fuchs

```