[gmx-developers] FW: repeatability of runs?

Berk Hess hess at kth.se
Mon Feb 16 19:36:28 CET 2015


On 02/16/2015 07:29 PM, Shirts, Michael R. (mrs5pt) wrote:
>> What version(s) are we talking about here? A released 5.0 or git master
>> code?
> 5.0.4.
>
>> The -reprod option tries to remove any source of divergence, such as
>> FFTW auto-tuning and dynamic load balancing. Any single bit change will
>> cause differences that diverge exponentially due to the chaotic nature
>> of MD. But you should not see any "strange" results, such as
>> pressures/virials that are higher than expected.
> -reprod still gives diverging results, as soon as the 10th step (first
> time energy is summed).  The differences at that point are very small -
> only appearing in the virials.  So I highly doubt there are any problems
> with the physics -- it's purely a mathematical result of a bit getting
> rounded one way or the other, and then chaos taking its toll.
>
> I'll file an issue at (low) priority so that other people can take a look.
> Not having the ability to reproduce (even on the same machine) does make
> it hard to debug things that only show up after multiple ns of simulation.
Note that we don't guarantee that -reprod will give reproducible runs.
But without a GPU (and excluding the possibility of non-deterministic
MPI reductions), I don't see anything that could interfere with
reproducibility. Restarting from a checkpoint file should, however, give
reproducible results. This can help with debugging: you don't need to
rerun all those ns, only the part from the last checkpoint, and you can
then reduce the checkpoint interval until the run crashes within a few
steps of the restart.
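
To illustrate why a non-deterministic reduction order matters, here is a
minimal Python sketch. It is not GROMACS code. The helper names
(sum_in_order, logistic_trajectory) are made up for this example, and the
logistic map is only a toy stand-in for a chaotic system such as MD. The
first part shows that summing the same numbers in a different order can
change the last bits of the result. The second part shows a tiny
difference in the starting value (1e-12 here, chosen arbitrarily) growing
to order one.

```python
def sum_in_order(values, order):
    """Sum values in the given index order, as a reduction might."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


def logistic_trajectory(x0, steps, r=3.9):
    """Iterate the logistic map x -> r*x*(1-x); a toy chaotic system."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        trajectory.append(x)
    return trajectory


# Part 1: floating-point addition is not associative, so the order in
# which contributions are reduced can change the last bits of the sum.
values = [0.1, 0.2, 0.3]
forward = sum_in_order(values, [0, 1, 2])
backward = sum_in_order(values, [2, 1, 0])
print("forward :", repr(forward))
print("backward:", repr(backward))
print("identical:", forward == backward)

# Part 2: in a chaotic system a tiny initial difference grows roughly
# exponentially until the two trajectories are unrelated.
traj_a = logistic_trajectory(0.2, 200)
traj_b = logistic_trajectory(0.2 + 1e-12, 200)
gaps = [abs(a - b) for a, b in zip(traj_a, traj_b)]
first_gap = gaps[0]
max_gap = max(gaps)
print("initial gap:", first_gap)
print("largest gap over 200 steps:", max_gap)
```

This is also why only the trend matters when comparing two such runs: once
the trajectories differ in the last bits, a step-by-step comparison stops
being meaningful after a short time.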

Cheers,

Berk
>
> Best,
> ~~~~~~~~~~~~
> Michael Shirts
> Associate Professor
> Department of Chemical Engineering
> University of Virginia
> michael.shirts at virginia.edu
> (434) 243-1821
>
>


