[gmx-users] RE: Re: RE: About the binary identical results by restarting from the checkpoint file

Mark Abraham mark.j.abraham at gmail.com
Sat Jun 15 21:50:52 CEST 2013


On Sat, Jun 15, 2013 at 9:00 PM, Cuiying Jian <cuiying_jian at hotmail.com> wrote:

> Hi Mark,
>
> I tested the simulations again using the Berendsen thermostat -- still, I
> cannot get binary identical results. I ran two sets of simulations:
>
> 1. Using Gromacs 4.5.2 installed on my personal computer:
>

4.6.2, I hope. Nobody is interested in reports about 4.5.2 :-)


> Run 2 simulations with the command:
>   mdrun -s md.tpr -deffnm md -nt 1 -cpt 0 -reprod
> (-nt 1 ensures that only one thread is started.)
> Terminate one simulation manually, then restart it with:
>   mdrun -s md.tpr -deffnm md -nt 1 -cpt 0 -cpi md.cpt -reprod -npme 0
> (-npme 0 ensures that the number of PME nodes for the restart matches the
> one recorded in the checkpoint file.)
> Compare the results with those from the continuous run.


What does gmxcheck say when comparing the resulting ostensibly equivalent
trajectory files? Please provide a snippet of output if it says things
differ -- we want to see how big "different" is. Please also post the first
20 or so lines of a .log file.
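
For the trajectory comparison, something like this should do (the file
names here are assumptions based on your -deffnm choices; adjust to taste):

  gmxcheck -f md_continuous.trr -f2 md_restarted.trr
  gmxcheck -e md_continuous.edr -e2 md_restarted.edr

That reports per-frame differences in coordinates, velocities, forces and
energies, so we can see whether the divergence appears right at the restart
point or only creeps in later.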

Also, you can do the above procedure in a controlled manner in 4.6.2 by
using mdrun -nsteps on the run you wish to stop prematurely.
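
For example (the step count is illustrative):

  mdrun -s md.tpr -deffnm md -nt 1 -reprod -nsteps 5000
  mdrun -s md.tpr -deffnm md -nt 1 -reprod -cpi md.cpt

The first run should stop cleanly at step 5000 and write a checkpoint; the
second continues from it out to the number of steps in the .tpr.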

Might your FFT library be multi-threading behind your back?
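
(One way to rule that out, assuming an FFTW built with threading support
that honours the usual environment variable, would be:

  export OMP_NUM_THREADS=1
  mdrun -s md.tpr -deffnm md -nt 1 -cpt 0 -reprod

The header of the .log file should also report how many threads each
component is using.)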

Mark

> 2. Using Gromacs 4.0.7 installed on a cluster (only one processor is used
> during the simulation):
> Run 2 simulations with the command:
>   mdrun_s -v -cpt 0 -s md.tpr -deffnm md -reprod
> Terminate one simulation manually, then restart it with:
>   mdrun_s -v -cpt 0 -cpi md.cpt -s md.tpr -deffnm md -reprod
> Compare the results with those from the continuous run. Still, I cannot
> get binary identical results. As mentioned earlier, the only case in which
> I can get binary identical results is for rigid SPC water molecules (using
> the velocity-rescaling thermostat in Gromacs 4.0.7). I guess this problem
> may also be caused by the LINCS algorithm used to constrain all bonds in
> every case except the rigid water one. Thanks a lot.
>
> Cheers,
> Cuiying
>
> > Date: Mon, 3 Jun 2013 19:15:12 +0200
> > From: Mark Abraham <mark.j.abraham at gmail.com>
> > Subject: Re: [gmx-users] RE: About the binary identical results by
> >       restarting      from the checkpoint file
> > To: Discussion list for GROMACS users <gmx-users at gromacs.org>
> > Message-ID:
> >       <CAMNuMARBEZ=m=Y_M1=C5PzNcGWV438MvEydOsf56R6yTc681bQ at mail.gmail.com>
> > Content-Type: text/plain; charset=ISO-8859-1
> >
> > On Mon, Jun 3, 2013 at 6:59 PM, Cuiying Jian <cuiying_jian at hotmail.com> wrote:
> >
> > > Hi Mark,
> > >
> > > Thanks for your reply. I tested restarting simulations from .cpt files
> > > with GROMACS 4.6.1, and the problems are still there, i.e. I cannot get
> > > results from restarted simulations that are binary identical with those
> > > from continuous simulations. The command I used for restarting is the
> > > following (only one processor is used during the simulations):
> > >   mdrun -v -s md.tpr -cpt 0 -cpi md.cpt -deffnm md -reprod
> > >
> >
> > This is not generally enough to generate a serial run in 4.6, by the way.
> > GROMACS tries very hard to automatically use all the resources available
> in
> > the best way. See mdrun -h for various -nt* options, and consult the
> > pre-step-0 part of the .log file for feedback.
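> >
> > If I remember the 4.6 option names correctly, a genuinely serial run
> > looks something like
> >
> >   mdrun -ntmpi 1 -ntomp 1 -s md.tpr -deffnm md -reprod
> >
> > i.e. both the thread-MPI rank count and the OpenMP thread count pinned
> > to 1, rather than left to the auto-detection.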
> >
> > > For further information, I attach my original .mdp file below:
> > >
> > > constraints          = all-bonds   ; convert all bonds to constraints
> > > integrator           = md
> > > dt                   = 0.002       ; ps
> > > nsteps               = 10000       ; total 20 ps
> > > nstcomm              = 10          ; frequency of centre-of-mass motion removal
> > > nstxout              = 5           ; frequency to write coordinates to trr trajectory
> > > nstxtcout            = 5           ; frequency to write coordinates to xtc trajectory
> > > nstvout              = 5           ; frequency to write velocities to output trajectory
> > > nstfout              = 5           ; frequency to write forces to output trajectory
> > > nstlog               = 5           ; frequency to write energies to log file
> > > nstenergy            = 5           ; frequency to write energies to energy file
> > > nstlist              = 1           ; frequency to update the neighbour list
> > > ns_type              = grid
> > > rlist                = 1.4
> > > coulombtype          = PME
> > > rcoulomb             = 1.4
> > > vdwtype              = cut-off
> > > rvdw                 = 1.4
> > > pme_order            = 8           ; use 6, 8 or 10 when running in parallel
> > > ewald_rtol           = 1e-5
> > > optimize_fft         = yes
> > > DispCorr             = no          ; don't apply any correction
> > > ; LINCS settings
> > > constraint_algorithm = LINCS
> > > lincs_order          = 4           ; highest order in the expansion of the constraint coupling matrix
> > > lincs_warnangle      = 30          ; maximum angle that a bond can rotate before LINCS will complain
> > > lincs_iter           = 1           ; number of iterations to correct for rotational lengthening in LINCS
> > > ; Temperature coupling is on
> > > Tcoupl               = v-rescale
> > >
> >
> > This coupling algorithm has a stochastic component, and at least at some
> > points in history the random number generator was either not checkpointed
> > properly, or not propagated in parallel properly. I'm not sure offhand if
> > any of that has been fixed yet (I doubt it), but you can test (parts of)
> > this hypothesis by using Berendsen (in any GROMACS 4.x), or really being
> > sure you've run a single thread.
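> >
> > The Berendsen test is a one-line change in the .mdp (everything else,
> > including tau_t, tc-grps and ref_t, stays as in your file):
> >
> >   Tcoupl = berendsen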
> >
> > If Berendsen is fully reproducible, then the RNG is the issue. While
> that's
> > irritating, it probably won't get fixed before GROMACS 5 (as a side
> effect
> > of other stuff going on).
> >
> > Mark
> >
> > > tau_t                = 0.1
> > > tc-grps              = HEP
> > > ref_t                = 300
> > > ; Pressure coupling is on
> > > Pcoupl               = parrinello-rahman
> > > Pcoupltype           = isotropic
> > > tau_p                = 1.0
> > > compressibility      = 4.5e-5
> > > ref_p                = 1.0
> > > ; velocity generation is on at 300 K
> > > gen_vel              = yes
> > > gen_temp             = 300.0
> > > gen_seed             = -1
> > >
> > > Is there something wrong with my .mdp file or my command? Thanks a lot.
> > >
> > > Cheers,
> > > Cuiying
> > > > On Sun, Jun 2, 2013 at 10:37 PM, Cuiying Jian <cuiying_jian at hotmail.com> wrote:
> > > >
> > > > > Hi GROMACS Users,
> > > > >
> > > > > These days, I am testing restarting simulations from .cpt files. I
> > > > > already set nstlist = 1 in the .mdp file. I can restart my
> > > > > simulations (which are stopped manually) with the following command
> > > > > (version 4.0.7):
> > > > >   mpiexec mdrun_s_mpi -v -s md.tpr -cpt 0 -cpi md.cpt -deffnm md -reprod
> > > > > -reprod is used to force binary identical simulations.
> > > > >
> > > > > During the restarted simulations, the same number of processors is
> > > > > used as in the interrupted simulation. The only case in which I can
> > > > > get results binary identical with those from the continuous
> > > > > simulations (which are not stopped manually) is SPC water molecules.
> > > > > For any other molecules (like n-heptane), I can never get binary
> > > > > identical results matching those from the continuous simulations.
> > > > >
> > > > > I also tried making a new .tpr file with:
> > > > >   tpbconv_s -s md.tpr -f md.trr -e md.edr -c md_c.tpr -cont
> > > > > and then running:
> > > > >   mpiexec mdrun_s_mpi -v -s md_c.tpr -cpt 0 -cpi md.cpt -deffnm md_c -reprod
> > > > > But I still cannot get binary identical results.
> > > > >
> > > > > I also tested the simulations with only one processor, and binary
> > > > > identical results were still not obtained. Using double precision
> > > > > does not solve the problem.
> > > > >
> > > > > I think that the above problems are caused by some information not
> > > > > being stored while the simulations are running.
> > > > >
> > > >
> > > > That seems likely. The leading candidate would be a random number
> > > generator
> > > > you're using for a stochastic integrator. Your .mdp file would have
> been
> > > > useful.
> > > >
> > > > > On the other hand, if I run two independent simulations using
> > > > > exactly the same number of processors, the same commands and the
> > > > > same input files, i.e.
> > > > >   mpiexec mdrun_s_mpi -v -s md.tpr -deffnm md -reprod
> > > > > I always get binary identical results from the two runs.
> > > > >
> > > > > I understand that MD is chaotic, and that if we run a simulation
> > > > > long enough the results should converge statistically. There are
> > > > > also factors that may affect reproducibility, as described on the
> > > > > GROMACS website. But for my purpose, I am curious whether there are
> > > > > methods through which I can get binary identical results from
> > > > > restarted and continuous simulations. Thanks a lot.
> > > > >
> > > >
> > > > There are ways to be fully reproducible, but probably not every
> > > combination
> > > > of algorithms has that property. 4.0.7 is so old no problem will be
> > > fixed,
> > > > unless it can also be shown in 4.6 ;-)
> > > >
> > > > Mark
> > >
> > >
> > >
> > >
> >
> >
>
>
>
>


