[gmx-developers] Reproducible runs with DLB

Bogdan Costescu bcostescu at gmail.com
Thu Jul 21 17:26:06 CEST 2011


On Thu, Jul 21, 2011 at 16:30, Mark Abraham <Mark.Abraham at anu.edu.au> wrote:
> Extending the checkpoint file format is not programmer-friendly, never mind
> writing save-and-restore code for DD.

If it were programmer-friendly, wouldn't it have been done
already? :-)

My idea was to save the DD state at the same time as the checkpoint,
so that there is a restart point for both the molecular system state
and the distribution of the atoms over the nodes. But it doesn't have
to be in the same file - the checkpoint file can remain as it is and
an additional file can contain the DD state, as long as the two are
named similarly (e.g. state_stepX.dd) so that it is clear which ones
belong together.
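
To make the naming idea concrete, here is a minimal sketch in C of how
the name of the companion file could be derived from the checkpoint
file name. The helper dd_state_filename and the .dd extension are
hypothetical (this is not an existing GROMACS function); it only swaps
a trailing .cpt for .dd so that both files share the same stem.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GROMACS: build the name of the
 * DD-state side file from a checkpoint file name by replacing the
 * ".cpt" extension with ".dd".
 * Returns 0 on success, -1 if cpt_name does not end in ".cpt" or if
 * the output buffer is too small to hold the result. */
static int dd_state_filename(const char *cpt_name, char *out, size_t out_len)
{
    static const char cpt_ext[] = ".cpt";
    static const char dd_ext[]  = ".dd";
    size_t name_len, ext_len, stem_len;

    assert(cpt_name != NULL && out != NULL);

    name_len = strlen(cpt_name);
    ext_len  = strlen(cpt_ext);

    /* Require a non-empty stem followed by ".cpt". */
    if (name_len <= ext_len ||
        strcmp(cpt_name + name_len - ext_len, cpt_ext) != 0)
    {
        return -1;
    }

    stem_len = name_len - ext_len;

    /* Stem + ".dd" + terminating NUL must fit in the output buffer. */
    if (stem_len + strlen(dd_ext) + 1 > out_len)
    {
        return -1;
    }

    memcpy(out, cpt_name, stem_len);
    strcpy(out + stem_len, dd_ext);
    return 0;
}
```

Called with "state_step5000.cpt", this fills the buffer with
"state_step5000.dd", so a restart could locate the matching DD state
from the checkpoint name alone.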

> I suggest you look at the hidden options to mdrun that allow you to impose a
> particular DD grid that gives satisfactory performance. See "mdrun -h
> -hidden". You might have to reverse engineer how to use these from the code.

I'm already using '-dd x y z' for the tests both with and without DLB.
PME is not used in some of the simulations (so playing with -npme has
no meaning there), and I already mentioned -dlb and -reprod in my
previous message. Are there other options that you are referring to?

I understand that saving the DD state is not an easy feat. Do you
consider it a waste of time? Even if the answer is yes, I would still
be interested in it, as it would allow significantly faster runs that
are still reproducible for my simulations.

Cheers,
Bogdan


