[gmx-developers] Reproducible runs with DLB
XAvier Periole
x.periole at rug.nl
Thu Jul 21 23:46:49 CEST 2011
On Jul 21, 2011, at 3:35 PM, Mark Abraham wrote:
> On 22/07/2011 2:02 AM, XAvier Periole wrote:
>>
>> Hi,
>>
>> nothing I can help with here but having the reprod mode running with
>> the dlb would be really useful!
>
> It relies on observing timings... how can that be reproducible?
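One way to see why timing-driven balancing works against bitwise reproducibility: floating-point addition is not associative, so moving a domain boundary changes the order in which contributions are summed, and with it the last bits of the result. The sketch below is a toy illustration in plain Python, not GROMACS code. `plain_sum` and `sum_in_two_domains` are hypothetical helpers, and the split index merely stands in for a boundary that DLB would move according to measured timings.

```python
def plain_sum(values):
    """Accumulate left to right with ordinary floating-point addition."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_in_two_domains(values, boundary):
    """Sum each 'domain' separately, then combine the two partial sums.

    `boundary` is a hypothetical stand-in for a domain boundary.
    """
    return plain_sum(values[:boundary]) + plain_sum(values[boundary:])


contributions = [0.1, 0.2, 0.3]
a = sum_in_two_domains(contributions, 2)  # (0.1 + 0.2) + 0.3
b = sum_in_two_domains(contributions, 1)  # 0.1 + (0.2 + 0.3)
print(a == b)              # False: same numbers, different grouping
print(abs(a - b) < 1e-15)  # True: the difference is at rounding level
```

In a chaotic system such rounding-level differences grow over time, so two runs whose boundaries moved differently would be expected to diverge.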
>
>> And an even more useful option would be to be able to write out
>> conformations more often than in the original run. That would allow
>> one to run long simulations and then go back and zoom in on a
>> particular time period of the simulation where some interesting
>> event occurred.
>
> Hacking some environment variable to do this seems feasible.
So I run a simulation on 128 CPUs using DLB, keep a cpt file, let's say
every hour, and then later decide I want to rerun part of the simulation
writing the xtc file 10 times more often ... is this possible by hacking
some environment variables?
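
For concreteness, the rerun would need a run input in which nstxtcout is 10 times smaller. A minimal sketch of that bookkeeping follows. The helper name `denser_xtc_mdp` and the sample values are made up for illustration, and the code only rewrites .mdp text without calling any GROMACS tool. Whether mdrun can then continue from the hourly cpt file with the modified input, and how closely a DLB run would retrace the original trajectory, is exactly the open question here.

```python
import re


def denser_xtc_mdp(mdp_text, factor):
    """Return mdp text with nstxtcout divided by `factor`.

    Hypothetical helper: it edits text only and runs no GROMACS tool.
    """
    pattern = re.compile(r"^([ \t]*nstxtcout[ \t]*=[ \t]*)(\d+)(.*)$", re.MULTILINE)
    match = pattern.search(mdp_text)
    if match is None:
        raise ValueError("nstxtcout is not set in the mdp text")
    old = int(match.group(2))
    if old <= 0 or old % factor != 0:
        raise ValueError("nstxtcout must be a positive multiple of the factor")
    new = old // factor
    # Replace only the first occurrence and keep any trailing comment intact.
    return pattern.sub(lambda m: f"{m.group(1)}{new}{m.group(3)}", mdp_text, count=1)


# Sample values are arbitrary.
original = "integrator = md\nnstxtcout  = 5000\nnsteps = 1000000\n"
print(denser_xtc_mdp(original, 10))
```

Keeping the new interval an exact divisor of the old one means every frame of the original xtc also appears in the denser output, which makes the two trajectories easy to compare.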
>
> Mark
>
>> XAvier.
>>
>> On Jul 21, 2011, at 9:26 AM, Bogdan Costescu wrote:
>>
>>> On Thu, Jul 21, 2011 at 16:30, Mark Abraham
>>> <Mark.Abraham at anu.edu.au> wrote:
>>>> Extending the checkpoint file format is not programmer-friendly,
>>>> never mind writing save-and-restore code for DD.
>>>
>>> If it were programmer-friendly, wouldn't it have been done
>>> already? :-)
>>>
>>> Saving DD state was meant to be done at the same time as the
>>> checkpoint, to have a restart point for both the molecular system
>>> state and the distribution of the atoms on nodes. But it doesn't
>>> have to be in the same file - the checkpoint file can remain as it
>>> is and an additional one can contain the DD state, as long as they
>>> are named similarly (e.g. state_stepX.dd) to know which ones are to
>>> be used together.
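
The naming idea above can be sketched as follows. This is only an illustration under assumptions: GROMACS does not write a `.dd` file, and both the `state_step<N>.cpt` / `state_step<N>.dd` pattern and the helper `pair_restart_files` are hypothetical, taken from the example name in the suggestion.

```python
import re

# Hypothetical naming scheme: state_step<N>.cpt for the checkpoint and
# state_step<N>.dd for the matching domain decomposition state.
_NAME = re.compile(r"^state_step(\d+)\.(cpt|dd)$")


def pair_restart_files(filenames):
    """Map step number -> (checkpoint, dd_state) for steps that have both files."""
    found = {}
    for name in filenames:
        m = _NAME.match(name)
        if m:
            found.setdefault(int(m.group(1)), {})[m.group(2)] = name
    return {
        step: (files["cpt"], files["dd"])
        for step, files in sorted(found.items())
        if "cpt" in files and "dd" in files
    }


listing = ["state_step1000.cpt", "state_step1000.dd", "state_step2000.cpt", "md.log"]
print(pair_restart_files(listing))
# {1000: ('state_step1000.cpt', 'state_step1000.dd')}
```

A step with a checkpoint but no DD state (step 2000 here) is left out, since a reproducible restart under this scheme would need both files.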
>>>
>>>> I suggest you look at the hidden options to mdrun that allow you
>>>> to impose a particular DD grid that gives satisfactory
>>>> performance. See "mdrun -h -hidden". You might have to reverse
>>>> engineer how to use these from the code.
>>>
>>> I'm already using '-dd x y z' for the tests both with and without
>>> DLB. PME is not used in some of the simulations (so playing with
>>> -npme has no meaning), and -dlb and -reprod I've already mentioned
>>> in my previous message. Are there other options that you are
>>> referring to?
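
As a rough illustration of the option combinations discussed above, the sketch below assembles an mdrun argument list from a fixed DD grid, a DLB setting and the reproducibility switch. Only -dd, -dlb and -reprod are used. The helper `mdrun_args`, the executable name "mdrun" (an MPI build may be named differently) and the 4 x 4 x 8 grid are assumptions for the example, and nothing is executed.

```python
def mdrun_args(grid, dlb="no", reprod=True):
    """Build an mdrun argument list for a fixed DD grid.

    Hypothetical helper: the list is only constructed, never run.
    """
    if dlb not in ("auto", "no", "yes"):
        raise ValueError("dlb must be 'auto', 'no' or 'yes'")
    x, y, z = grid
    args = ["mdrun", "-dd", str(x), str(y), str(z), "-dlb", dlb]
    if reprod:
        args.append("-reprod")
    return args


# Arbitrary example grid: 4 x 4 x 8 = 128 domains.
print(" ".join(mdrun_args((4, 4, 8), dlb="no", reprod=True)))
# mdrun -dd 4 4 8 -dlb no -reprod
print(" ".join(mdrun_args((4, 4, 8), dlb="yes", reprod=False)))
# mdrun -dd 4 4 8 -dlb yes
```

Generating both variants from one function keeps the grid identical between the runs with and without DLB, so any difference in the results comes from the balancing setting alone.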
>>>
>>> I understand that saving the DD state is not an easy feat. Do you
>>> consider this to be a waste of time? Even if the answer is yes, I
>>> would still be interested in it, as it would allow significantly
>>> faster while still reproducible runs for my simulations.
>>>
>>> Cheers,
>>> Bogdan
>>> --
>>> gmx-developers mailing list
>>> gmx-developers at gromacs.org
>>> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>>> Please don't post (un)subscribe requests to the list. Use the
>>> www interface or send it to gmx-developers-request at gromacs.org.
>>
>