[gmx-developers] Reproducible runs with DLB

Mark Abraham Mark.Abraham at anu.edu.au
Thu Jul 21 23:35:14 CEST 2011


On 22/07/2011 1:26 AM, Bogdan Costescu wrote:
> On Thu, Jul 21, 2011 at 16:30, Mark Abraham <Mark.Abraham at anu.edu.au> wrote:
>> Extending the checkpoint file format is not programmer-friendly, never mind
>> writing save-and-restore code for DD.
> If it were programmer-friendly, wouldn't it have been done
> already? :-)
>
> Saving the DD state was meant to be done at the same time as the
> checkpoint, to have a restart point for both the molecular system state
> and the distribution of the atoms on nodes. But it doesn't have to be
> in the same file: the checkpoint file can remain as it is and an
> additional file can contain the DD state, as long as the two are named
> similarly (e.g. state_stepX.dd) so it is clear which ones belong
> together.
>
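A minimal sketch of the name-based pairing described above, written in Python for brevity. The file names (state_step1000.cpt, state_step1000.dd) and the helper pair_checkpoints are hypothetical illustrations; mdrun does not write a .dd file, and nothing here is part of GROMACS.

```python
from pathlib import Path


def pair_checkpoints(directory):
    """Map each checkpoint stem to (checkpoint, DD-state) paths.

    Assumption: a DD-state file shares the checkpoint's name and differs
    only in its suffix (.dd instead of .cpt). Checkpoints without a
    matching .dd file are skipped.
    """
    directory = Path(directory)
    pairs = {}
    for cpt in sorted(directory.glob("*.cpt")):
        dd = cpt.with_suffix(".dd")  # hypothetical companion file
        if dd.is_file():
            pairs[cpt.stem] = (cpt, dd)
    return pairs
```

A restart would then use only the checkpoints that appear in the returned mapping, so that the system state and the atom distribution always come from the same step.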
>> I suggest you look at the hidden options to mdrun that allow you to impose a
>> particular DD grid that gives satisfactory performance. See "mdrun -h
>> -hidden". You might have to reverse engineer how to use these from the code.
> I'm already using '-dd x y z' for the tests both with and without DLB.
> PME is not used in some of the simulations (so playing with -npme has
> no meaning), and I already mentioned -dlb and -reprod in my previous
> message. Are there other options that you are referring to?

Yes. Check out the command I suggested ("mdrun -h -hidden").

>
> I understand that saving the DD state is not an easy feat. Do you
> consider this to be a waste of time? Even if the answer is yes, I
> would still be interested in it, as it would allow significantly
> faster yet still reproducible runs for my simulations.

Could be done. Not all that easy.

Mark


