[gmx-users] RE: Re: RE: About the binary identical results by restarting from the checkpoint file
Mark Abraham
mark.j.abraham at gmail.com
Sat Jun 15 21:50:52 CEST 2013
On Sat, Jun 15, 2013 at 9:00 PM, Cuiying Jian <cuiying_jian at hotmail.com> wrote:
> Hi Mark,
>
> I tested the simulations again using the Berendsen thermostat. Still, I
> cannot get binary identical results.
> I ran two sets of simulations:
> 1. Using Gromacs 4.5.2 installed on my personal computer:
>
4.6.2, I hope. Nobody is interested in reports about 4.5.2 :-)
> Run 2 simulations using the command:
>   mdrun -s md.tpr -deffnm md -nt 1 -cpt 0 -reprod
> (-nt 1 ensures that only one thread is started.)
> Terminate one simulation manually.
> Restart this simulation by:
>   mdrun -s md.tpr -deffnm md -nt 1 -cpt 0 -cpi md.cpt -reprod -npme 0
> (-npme 0 ensures that the number of PME nodes for the restart is the same
> as that in the checkpoint file.)
> Compare the results with those from the continuous ones.
What does gmxcheck say when comparing the resulting ostensibly equivalent
trajectory files? Please provide a snippet of output if it says things
differ. We want to see how big "different" is. Also the top 20 lines of a
.log file.
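
As a crude complement to gmxcheck, a byte-level comparison shows whether two
output files are literally identical and, if not, where they first diverge.
The following is a minimal sketch in Python, not part of GROMACS; the
function name and the file names in the usage note are hypothetical. It says
nothing about how large the numerical difference is, which is what the
gmxcheck output would show.

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the offset of the first differing byte, or None if identical.

    A length mismatch counts as a difference at the end of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Walk the mismatching chunk to find the exact position.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # Common prefix matches, so one file is simply shorter.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point without a mismatch.
                return None
            offset += len(a)
```

For example, first_difference("cont.trr", "split.trr") would return None only
if the two (hypothetically named) trajectory files match byte for byte. Files
that record wall-clock times, such as .log files, would be expected to differ
even when the trajectories agree.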
Also, you can do the above procedure in a controlled manner in 4.6.2 by
using mdrun -nsteps on the run you wish to stop prematurely.
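
The controlled procedure could be laid out as three command lines: one
continuous run, one run stopped early with -nsteps, and one restart of the
stopped run from its checkpoint. The sketch below only builds and prints the
commands; it does not run anything. It uses only flags that already appear in
this thread. The run names "cont" and "split", the helper name mdrun_args and
the stop point of 5000 steps are assumptions for illustration, as is the
expectation that -deffnm split writes its checkpoint to split.cpt.

```python
def mdrun_args(deffnm, tpr="md.tpr", nsteps=None, restart=False):
    """Build an mdrun command line (as an argv list) for a single-thread run."""
    args = ["mdrun", "-s", tpr, "-deffnm", deffnm,
            "-nt", "1", "-cpt", "0", "-reprod"]
    if nsteps is not None:
        # Stop early at a known step instead of killing the job by hand.
        args += ["-nsteps", str(nsteps)]
    if restart:
        # Assumption: the checkpoint is named after -deffnm.
        args += ["-cpi", deffnm + ".cpt"]
    return args


continuous = mdrun_args("cont")               # reference run, never stopped
stopped = mdrun_args("split", nsteps=5000)    # hypothetical stop point
resumed = mdrun_args("split", restart=True)   # continue from the checkpoint

for cmd in (continuous, stopped, resumed):
    print(" ".join(cmd))
```

The outputs of "cont" and "split" would then be the two ostensibly equivalent
sets of files to hand to gmxcheck.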
Might your FFT library be multi-threading behind your back?
Mark
> 2. Using Gromacs 4.0.7 installed on a cluster (only one processor is used
> during the simulation):
> Run 2 simulations using the command:
>   mdrun_s -v -cpt 0 -s md.tpr -deffnm md -reprod
> Terminate one simulation manually.
> Restart this simulation by:
>   mdrun_s -v -cpt 0 -cpi md.cpt -s md.tpr -deffnm md -reprod
> Compare the results with those from the continuous ones. Still, I cannot
> get binary identical results. As mentioned earlier, the only case in which
> I can get binary identical results is for rigid SPC water molecules (using
> the velocity rescaling thermostat in Gromacs 4.0.7). I guess that this
> problem may also be caused by the LINCS algorithm, which is used to
> constrain all bonds in every case except the rigid water case. Thanks a lot.
> Cheers,
> Cuiying
>
> > Date: Mon, 3 Jun 2013 19:15:12 +0200
> > From: Mark Abraham <mark.j.abraham at gmail.com>
> > Subject: Re: [gmx-users] RE: About the binary identical results by
> > restarting from the checkpoint file
> > To: Discussion list for GROMACS users <gmx-users at gromacs.org>
> > Message-ID:
> > <CAMNuMARBEZ=m=Y_M1=C5PzNcGWV438MvEydOsf56R6yTc681bQ at mail.gmail.com>
> > Content-Type: text/plain; charset=ISO-8859-1
> >
> > On Mon, Jun 3, 2013 at 6:59 PM, Cuiying Jian <cuiying_jian at hotmail.com> wrote:
> >
> > > Hi Mark,
> > >
> > > Thanks for your reply. I tested restarting simulations with .cpt files
> > > in GROMACS 4.6.1 and the problems are still there, i.e. I cannot get
> > > binary identical results between the restarted simulations and the
> > > continuous simulations. The command I used for restarting is the
> > > following (only one processor is used during the simulations):
> > > mdrun -v -s md.tpr -cpt 0 -cpi md.cpt -deffnm md -reprod
> > >
> >
> > This is not generally enough to generate a serial run in 4.6, by the way.
> > GROMACS tries very hard to automatically use all the resources available
> > in the best way. See mdrun -h for various -nt* options, and consult the
> > pre-step-0 part of the .log file for feedback.
> >
> > > For further information, I attach my original .mdp file below:
> > > constraints          = all-bonds   ; convert all bonds to constraints.
> > > integrator           = md
> > > dt                   = 0.002       ; ps
> > > nsteps               = 10000       ; total 20 ps (10000 steps x 0.002 ps).
> > > nstcomm              = 10          ; frequency for center of mass motion removal.
> > > nstxout              = 5           ; frequency to write coordinates to output trajectory (every 0.01 ps).
> > > nstxtcout            = 5           ; frequency to write coordinates to xtc trajectory.
> > > nstvout              = 5           ; frequency to write velocities to output trajectory.
> > > nstfout              = 5           ; frequency to write forces to output trajectory.
> > > nstlog               = 5           ; frequency to write energies to log file.
> > > nstenergy            = 5           ; frequency to write energies to energy file.
> > > nstlist              = 1           ; frequency to update the neighbor list.
> > > ns_type              = grid
> > > rlist                = 1.4
> > > coulombtype          = PME
> > > rcoulomb             = 1.4
> > > vdwtype              = cut-off
> > > rvdw                 = 1.4
> > > pme_order            = 8           ; use 6, 8 or 10 when running in parallel
> > > ewald_rtol           = 1e-5
> > > optimize_fft         = yes
> > > DispCorr             = no          ; don't apply any correction
> > > ;open LINCS
> > > constraint_algorithm = LINCS
> > > lincs_order          = 4           ; highest order in the expansion of the constraint coupling matrix
> > > lincs_warnangle      = 30          ; maximum angle that a bond can rotate before LINCS will complain
> > > lincs_iter           = 1           ; number of iterations to correct for a rotational lengthening in LINCS
> > > ; Temperature coupling is on
> > > Tcoupl               = v-rescale
> > >
> >
> > This coupling algorithm has a stochastic component, and at least at some
> > points in history the random number generator was either not checkpointed
> > properly, or not propagated in parallel properly. I'm not sure offhand if
> > any of that has been fixed yet (I doubt it), but you can test (parts of)
> > this hypothesis by using Berendsen (in any GROMACS 4.x), or really being
> > sure you've run a single thread.
> >
> > If Berendsen is fully reproducible, then the RNG is the issue. While that's
> > irritating, it probably won't get fixed before GROMACS 5 (as a side effect
> > of other stuff going on).
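
Why a lost generator state breaks restarts can be seen in a toy model. The
sketch below is not GROMACS code and does not use the GROMACS random number
generator: it stands in a Gaussian kick per step for the stochastic part of a
thermostat, using Python's standard random module. The function name, seed
and step counts are arbitrary assumptions.

```python
import random


def run(n_steps, rng, x=0.0):
    """Toy stochastic update: add one Gaussian kick per step."""
    for _ in range(n_steps):
        x += rng.gauss(0.0, 1.0)
    return x


# Continuous run: 10 steps drawn from a single generator.
continuous = run(10, random.Random(2013))

# Restart that stores the generator state together with x at step 5.
rng = random.Random(2013)
x_half = run(5, rng)
saved_state = rng.getstate()
rng_restored = random.Random()
rng_restored.setstate(saved_state)
good_restart = run(5, rng_restored, x_half)

# Restart that loses the generator state and reseeds from scratch,
# so it replays the first five kicks instead of drawing kicks 6 to 10.
bad_restart = run(5, random.Random(2013), x_half)

print(continuous == good_restart)
print(continuous == bad_restart)
```

Only the restart that restores the generator state reproduces the continuous
result exactly. A deterministic scheme such as Berendsen has no such state to
lose, which is why it makes a useful control.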
> >
> > Mark
> >
> > > tau_t = 0.1
> > > tc-grps = HEP
> > > ref_t = 300
> > > ; Pressure coupling is on
> > > Pcoupl = parrinello-rahman
> > > Pcoupltype = isotropic
> > > tau_p = 1.0
> > > compressibility = 4.5e-5
> > > ref_p = 1.0
> > > ; generate velocity is on at 300 K.
> > > gen_vel = yes
> > > gen_temp = 300.0
> > > gen_seed = -1
> > >
> > > Is there something wrong with my .mdp file or my command? Thanks a lot.
> > >
> > > Cheers,
> > > Cuiying
> > > > On Sun, Jun 2, 2013 at 10:37 PM, Cuiying Jian <cuiying_jian at hotmail.com> wrote:
> > > >
> > > > > Hi GROMACS Users,
> > > > >
> > > > > These days, I am testing restarting simulations with .cpt files. I
> > > > > already set nstlist=1 in the .mdp file. I can restart my simulations
> > > > > (which are stopped manually) with the following commands (version 4.0.7):
> > > > > mpiexec mdrun_s_mpi -v -s md.tpr -cpt 0 -cpi md.cpt -deffnm md -reprod
> > > > > -reprod is used to force binary identical simulations.
> > > > >
> > > > > During the restarted simulations, the same number of processors is
> > > > > used as in the interrupted simulation. The only case in which I can
> > > > > get binary identical results with those from the continuous
> > > > > simulations (which are not stopped manually) is for SPC water
> > > > > molecules. For any other molecules (like heptane), I can never get
> > > > > binary identical results with those from the continuous simulations.
> > > > >
> > > > > I also tried to get new .tpr files by:
> > > > > tpbconv_s -s md.tpr -f md.trr -e md.edr -c md_c.tpr -cont
> > > > > and then:
> > > > > mpiexec mdrun_s_mpi -v -s md_c.tpr -cpt 0 -cpi md.cpt -deffnm md_c -reprod
> > > > > But I still cannot get binary identical results.
> > > > >
> > > > > I also tested the simulations with only one processor and binary
> > > > > identical results are still not obtained. Using double precision does
> > > > > not solve the problems.
> > > > >
> > > > > I think that the above problems arise because some information may
> > > > > not be stored while the simulations are running.
> > > > >
> > > >
> > > > That seems likely. The leading candidate would be a random number
> > > > generator you're using for a stochastic integrator. Your .mdp file
> > > > would have been useful.
> > > >
> > > > > On the other hand, if I run two independent simulations using exactly
> > > > > the same number of processors, the same commands and the same input
> > > > > files, i.e.
> > > > > mpiexec mdrun_s_mpi -v -s md.tpr -deffnm md -reprod
> > > > > I can always get binary identical results from these two independent
> > > > > simulations.
> > > > >
> > > > > I understand that MD is chaotic and that if we run simulations for a
> > > > > long enough time, the simulation results should converge. Also, there
> > > > > are factors which may affect the reproducibility, as described on the
> > > > > GROMACS website. But for my purpose, I am curious about whether there
> > > > > are certain methods through which I can get binary identical results
> > > > > from restarted simulations and continuous simulations. Thanks a lot.
> > > > >
> > > >
> > > > There are ways to be fully reproducible, but probably not every
> > > > combination of algorithms has that property. 4.0.7 is so old no problem
> > > > will be fixed, unless it can also be shown in 4.6 ;-)
> > > >
> > > > Mark
> > >
> > > --
> > > gmx-users mailing list gmx-users at gromacs.org
> > > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > > * Please search the archive at
> > > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > > * Please don't post (un)subscribe requests to the list. Use the
> > > www interface or send it to gmx-users-request at gromacs.org.
> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >
> >
> >
> > ------------------------------
> > End of gmx-users Digest, Vol 110, Issue 16
> > ******************************************