# [gmx-users] bootstrapping of PMF

Jochen Hub jhub at gwdg.de
Fri Nov 18 11:30:26 CET 2016

Hi Alex,

There is no simple answer to your questions. MD simulations often suffer
from long and unknown autocorrelations. Computing reliable errors from
simulations is difficult since it is not clear which simulation frames
are truly statistically independent. With the bootstrapping of
histograms, you get a reasonable error estimate if

1) Your individual histograms are really independent. This may be
violated, for instance, if the starting position for each window is
similar. For example, if the orientation of the peptide with respect to
the surface was always the same at t=0, or if the internal structure of
the peptide was always the same.

2) Your histograms are sufficiently closely spaced, such that at each
position along the reaction coordinate you have several histograms (such
as 5 or 10). If neighboring histograms overlap only at +- sigma (or even
+-2 sigma), this is clearly violated.
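
Condition 2 can be checked roughly by counting, for each bin along the
reaction coordinate, how many windows actually contribute there. The
following is only a sketch: it assumes the histograms are already
available as lists of counts on a common grid (parsing of the histogram
file is left out), and both the function name `coverage_per_bin` and the
5 % threshold are arbitrary choices made for illustration.

```python
def coverage_per_bin(histograms, min_fraction=0.05):
    """Count, for each bin, how many umbrella windows cover it.

    histograms: list of equal-length lists of counts, one list per window.
    A window is said to cover a bin if its count there is at least
    min_fraction of that window's own peak count (an arbitrary cutoff).
    """
    n_bins = len(histograms[0])
    coverage = [0] * n_bins
    for hist in histograms:
        peak = max(hist)
        if peak == 0:
            continue  # empty window contributes nowhere
        for i, count in enumerate(hist):
            if count >= min_fraction * peak:
                coverage[i] += 1
    return coverage


# Toy data: three windows on a four-bin grid.
toy = [[10, 5, 0, 0], [0, 8, 10, 1], [0, 0, 6, 10]]
print(coverage_per_bin(toy))
```

Bins whose coverage is only 1 or 2 are the regions where a bootstrap over
complete histograms is likely to underestimate the error.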

However, getting individual histograms independent from each other is in
practice easier than getting frames from a single simulation independent
(due to the very long autocorrelation within one simulation). Therefore,
bootstrapping complete histograms is in many cases the best one can do
(if the points 1 and 2 are more or less fulfilled).
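
To make the idea concrete, here is a minimal sketch of a bootstrap over
complete windows: N windows are drawn with replacement from the N
available ones, the quantity of interest is recomputed, and the spread
over many such draws is taken as the error. This is the plain
resampling variant, not necessarily the exact scheme gmx wham uses for
b-hist. The names `bootstrap_over_windows` and `pooled_mean` are made up
for this example, and `pooled_mean` is only a toy stand-in for the WHAM
calculation.

```python
import random
import statistics


def bootstrap_over_windows(windows, estimator, n_boot=200, seed=0):
    """Standard error of estimator(windows) from resampling whole windows.

    windows:   list of per-window data sets (each treated as one
               independent data point, never split into frames).
    estimator: function mapping a list of windows to a single number.
    """
    rng = random.Random(seed)
    n = len(windows)
    estimates = []
    for _ in range(n_boot):
        # Draw n windows with replacement; duplicates are allowed.
        resampled = [windows[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resampled))
    return statistics.stdev(estimates)


def pooled_mean(windows):
    """Toy estimator (placeholder for WHAM): mean over all samples."""
    total = sum(sum(w) for w in windows)
    count = sum(len(w) for w in windows)
    return total / count


toy_windows = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
print(bootstrap_over_windows(toy_windows, pooled_mean))
```

Because whole windows are resampled, correlations between frames within
one window do not matter; only correlations between windows (point 1)
would bias the estimate.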

Btw: The integrated autocorrelation times in iact.xvg are mainly
important when you enforce a cyclic (or periodic) PMF, in order to
distribute an offset between the right and left end over the PMF (to get
it cyclic). But they are in most cases by no means suitable for getting
the "true" autocorrelation time, which you would need to compute the
error via binning single long simulations (to make sure your bins are
independent).
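
For reference, the binning (block averaging) of a single long simulation
mentioned above could look roughly like the sketch below. The function
name `block_standard_error` is made up for this example, and the result
is only meaningful if each block is much longer than the true
autocorrelation time, which is exactly the quantity that is hard to get.

```python
import math
import statistics


def block_standard_error(series, n_blocks):
    """Standard error of the mean from block averaging.

    The series is cut into n_blocks consecutive blocks of equal length
    (trailing frames that do not fill a block are dropped), and the block
    means are treated as independent samples.
    """
    block_len = len(series) // n_blocks
    if n_blocks < 2 or block_len < 1:
        raise ValueError("need at least 2 blocks with at least 1 frame each")
    block_means = [
        statistics.fmean(series[i * block_len:(i + 1) * block_len])
        for i in range(n_blocks)
    ]
    return statistics.stdev(block_means) / math.sqrt(n_blocks)


# Toy series with an obvious period of 4 frames.
toy_series = [0, 0, 1, 1, 0, 0, 1, 1]
print(block_standard_error(toy_series, 4))
print(block_standard_error(toy_series, 2))
```

The two calls give different errors for the same data, which illustrates
how strongly the estimate depends on the chosen block length.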

I hope this helps a bit.

Cheers,
Jochen

Am 09/11/2016 um 17:11 schrieb Alex:
> Dear gromacs user,
>
> I have performed a US simulation to find PMF of a peptide adsorbed to a
> solid surface.
>
> I have already evaluated the result by bootstrapping in gmx wham using the
> b-hist method with 600 bootstraps and 1200 bins, also taking the
> integrated autocorrelation time into account.
>
> Here is the command:
>
> gmx wham -hist Histo.xvg -nBootstrap 600 -bins 1200 -bs-method b-hist
> -bsres bsResult.xvg -bsprof bsProfs.xvg -if Fpull.dat -it TPR.dat -min 1.95
> -max 4.7 -ac -o Profile.xvg -zprof0 4.69
>
> And here are the results:
>
> bsProfs.pdf
>
> bsResult.pdf
>
> iact.xvg   integrated autocorrelation time
>
> My first question is whether I have a well-converged PMF, based on the
> above files.
>
> I was also wondering what exactly I should report later, for example in
> a publication: the normal profile.xvg without bootstrapping, or this
> bsProfs.xvg? What is the difference between bsProfs.xvg and the normal
> profile.xvg that we get from a normal gmx wham run without
> bootstrapping?
>
> And why have the first 130 lines of the iact.xvg file been automatically
> commented out from the rest?
>
> And finally, do we always (and in this case) need to correct the whole
> PMF profile by the "$k_{B}T*log[4π(\epsilon)^2]$" factor, in which
> \epsilon is the reaction coordinate, as mentioned here:
> Hub, J. S.; de Groot, B. L.; van der Spoel, D. J. Chem. Theory
> Comput. 2010, 6, 3713-3720
>
>
> Regards,
> Alex
>

--
---------------------------------------------------
Dr. Jochen Hub
Computational Molecular Biophysics Group
Institute for Microbiology and Genetics
Georg-August-University of Göttingen
Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany.
Phone: +49-551-39-14189
http://cmb.bio.uni-goettingen.de/
---------------------------------------------------