[gmx-developers] Does GROMACS pass random numbers through checkpoint/restart files?
David Cerutti
dcerutti at mccammon.ucsd.edu
Mon Apr 21 07:56:27 CEST 2008
Hello,
I'm not a frequent user of gromacs, but I'm on your developers list
nonetheless... cause for introspection.
I noticed a very bizarre yet scary artifact in some simulations I was
doing with a Langevin thermostat, and with some help of the AMBER code
developers I traced it to the fact that I was feeding in the same random
number seed with every restart of a new simulation segment. The effect
was to destabilize my protein to varying degrees, depending on how
frequently I checkpointed the run. The developers are now implementing a
patch to address this behavior in the current release, but a more rigorous
solution, such as the way CHARMM passes the pseudo-random state vector
through its checkpoint files, will not be available until some future
release.
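To make the difference concrete, here is a minimal sketch in Python (not
AMBER, CHARMM, or GROMACS code). It assumes a hypothetical run_segment()
helper that draws one Gaussian number per step as a stand-in for the
Langevin noise term, and it uses Python's own random module rather than
any MD package's generator. Re-seeding with the same value replays the
same noise in every segment, while carrying the generator state through a
"checkpoint" reproduces the uninterrupted stream.

```python
import pickle
import random


def run_segment(rng, n_steps):
    """Hypothetical stand-in for one MD segment: one Gaussian kick per step."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_steps)]


SEED = 12345   # illustrative value, not taken from any real input file
N_STEPS = 5

# Faulty restart: each segment re-seeds with the same value,
# so segment 2 replays exactly the noise of segment 1.
seg1_bad = run_segment(random.Random(SEED), N_STEPS)
seg2_bad = run_segment(random.Random(SEED), N_STEPS)

# Checkpointed restart: save the generator state at the end of segment 1
# and restore it before segment 2.
rng = random.Random(SEED)
seg1_good = run_segment(rng, N_STEPS)
checkpoint = pickle.dumps(rng.getstate())   # stands in for the checkpoint file

restarted = random.Random()
restarted.setstate(pickle.loads(checkpoint))
seg2_good = run_segment(restarted, N_STEPS)

# Reference: the same run done in one uninterrupted piece.
reference = run_segment(random.Random(SEED), 2 * N_STEPS)

print("same-seed restart repeats the noise:", seg1_bad == seg2_bad)
print("state restart matches unbroken run:", seg1_good + seg2_good == reference)
```

With the same seed at every restart, the noise becomes periodic with the
segment length, which is the pattern described above.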
The implications of what I'm finding could be summarized thus: every
simulation done with AMBER, using a Langevin thermostat (probably an
Andersen thermostat as well) is affected by this repeating random numbers
problem UNLESS the user specified ig=-1 (the random seed parameter, which
at -1 sets the seed based on the clock time) or
rewrote the input file with every restart (which is what I'm doing now).
In the extreme case of 1000 steps per restart, the protein will unfold; in
more subtle cases, with perhaps 25,000 to 100,000 steps per restart, it
appears that the protein may be slightly destabilized. I have a 145ns
simulation of a dimer-of-dimers protein that breaks along its weak
interface 110ns into the run, and I now suspect that this tracks back to
the random numbers problem. There was a portion of the run where I lengthened
the segments to 1,000,000 steps, which I now see corresponds to a brief
healing of the tetramer before I went back to 100,000 steps and blew it
apart--wild!
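The workaround of rewriting the input file with a new seed at every
restart could be scripted along the following lines. This is only a
sketch: segment_seed() and BASE_SEED are hypothetical names, the hashing
scheme is my own choice, and writing the value into an actual input file
is left out. The point is that each segment gets a different but
reproducible seed.

```python
import hashlib
import random


def segment_seed(base_seed, segment_index):
    """Hypothetical helper: derive a distinct, reproducible seed per segment."""
    digest = hashlib.sha256(f"{base_seed}:{segment_index}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


BASE_SEED = 12345   # illustrative value
seeds = [segment_seed(BASE_SEED, i) for i in range(4)]

# Each segment now starts from its own point in the random stream.
first_kicks = [random.Random(s).gauss(0.0, 1.0) for s in seeds]

for i, (seed, kick) in enumerate(zip(seeds, first_kicks)):
    print(f"segment {i}: seed={seed} first kick={kick:+.4f}")
```

Unlike passing the full state vector through the checkpoint, this does
not reproduce the uninterrupted random stream; it only avoids repeating
the same numbers in every segment.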
I am writing up some more tests as a communication to make sure that
other investigators know this can happen. I, for one, have just lost over
500ns of MD to this artifact, and I suspect that it lurks in many, many
studies in the literature. One of the developers also commented that
numerous users were frustrated with the Langevin thermostat and so went
back to Berendsen weak-coupling; it may all track back to these random
numbers.
From my reading of the manual, GROMACS may be vulnerable to this
sort of error if the user does not specify ld_seed=-1, unless the random
number state vector is being saved in the checkpoint file for the next
restart. Please let me know how the code handles this issue.
Thanking you,
Dave Cerutti