[gmx-developers] Reproducible runs with DLB
x.periole at rug.nl
Thu Jul 21 23:46:49 CEST 2011
On Jul 21, 2011, at 3:35 PM, Mark Abraham wrote:
> On 22/07/2011 2:02 AM, XAvier Periole wrote:
>> nothing I can help with here but having the reprod mode running with
>> the dlb would be really useful!
> It relies on observing timings... how can that be reproducible?
>> And an even more useful option would be to be able to write out
>> conformations more often than in the original run. That would allow
>> one to run long simulations and then go back and zoom in on a
>> particular time period where some interesting event occurred.
> Hacking some environment variable to do this seems feasible.
So I run a simulation on 128 CPUs using the dlb, keep my cpt, let's say
every hour, and then I just decide I want to rerun the simulation,
writing the xtc file 10 times more often ... this is possible by hacking
some environment variables?
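
To make the arithmetic concrete, here is a rough Python sketch of what I
have in mind. It is only an illustration: the helper names and the numbers
are made up, nstxtcout is the usual mdp output interval, and nothing here
talks to mdrun itself.

```python
from bisect import bisect_right


def denser_interval(nstxtcout, factor):
    """Output interval (in steps) needed to write xtc frames `factor` times more often."""
    if factor <= 0 or nstxtcout % factor != 0:
        raise ValueError("factor must be positive and divide nstxtcout evenly")
    return nstxtcout // factor


def checkpoint_before(checkpoint_steps, event_step):
    """Latest saved checkpoint step that is not after the event of interest."""
    steps = sorted(checkpoint_steps)
    i = bisect_right(steps, event_step)
    if i == 0:
        raise ValueError("no checkpoint precedes the event")
    return steps[i - 1]
```

The rerun would start from the step returned by checkpoint_before() with
the shorter output interval. It only shows the same event if the restarted
run is reproducible, which is exactly the problem with the dlb.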
>> On Jul 21, 2011, at 9:26 AM, Bogdan Costescu wrote:
>>> On Thu, Jul 21, 2011 at 16:30, Mark Abraham
>>> <Mark.Abraham at anu.edu.au> wrote:
>>>> Extending the checkpoint file format is not programmer-friendly,
>>>> never mind writing save-and-restore code for DD.
>>> If it were programmer-friendly, wouldn't it have been done
>>> already? :-)
>>> Saving DD state was meant to be done at the same time as the
>>> checkpoint to have a restart point for both the molecular system
>>> and the distribution of the atoms on nodes. But it doesn't have to
>>> be in the same file - the checkpoint file can remain as it is and an
>>> additional one can contain the DD state, as long as they are named
>>> similarly (e.g. state_stepX.dd) to know which ones are to be used
>>> together.
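
The naming idea above could look something like this. A minimal sketch
only: the .dd file does not exist in GROMACS today, and both helper names
are hypothetical.

```python
from pathlib import Path


def dd_state_path(checkpoint_path):
    """Name of the hypothetical DD state file that belongs to a checkpoint."""
    # state_stepX.cpt -> state_stepX.dd (same stem, different suffix)
    return Path(checkpoint_path).with_suffix(".dd")


def restart_pair(checkpoint_path):
    """Return (checkpoint, dd_state) if both files exist, otherwise None."""
    cpt = Path(checkpoint_path)
    dd = dd_state_path(cpt)
    return (cpt, dd) if cpt.is_file() and dd.is_file() else None
```

A restart would then use the DD state only when restart_pair() finds both
files, and fall back to the plain checkpoint otherwise.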
>>>> I suggest you look at the hidden options to mdrun that allow you to
>>>> impose a particular DD grid that gives satisfactory performance. See
>>>> "mdrun -h -hidden". You might have to reverse engineer how to use
>>>> these from the code.
>>> I'm already using '-dd x y z' for both the tests, with and without.
>>> PME is not used in some of the simulations (so playing with -npme
>>> has no meaning), and -dlb and -reprod I've already mentioned in my
>>> message. Are there other options that you refer to?
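
For reference, the options mentioned above could be combined roughly as
follows. This is a sketch under assumptions: mdrun_args is a hypothetical
helper that only builds an argument list, and the 4 x 4 x 8 grid is a
made-up example, not the grid used in these tests.

```python
def mdrun_args(dd_grid, dlb="no", reprod=True, npme=None):
    """Build an mdrun argument list with a fixed DD grid (illustrative only)."""
    if dlb not in ("auto", "no", "yes"):
        raise ValueError("dlb must be 'auto', 'no' or 'yes'")
    nx, ny, nz = dd_grid
    args = ["mdrun", "-dd", str(nx), str(ny), str(nz), "-dlb", dlb]
    if npme is not None:
        # Only meaningful when PME is actually used in the simulation.
        args += ["-npme", str(npme)]
    if reprod:
        args.append("-reprod")
    return args


print(" ".join(mdrun_args((4, 4, 8))))
```

Fixing the grid with -dd and switching the dlb off is the combination that
-reprod currently relies on, which is why it is slower than a dlb run.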
>>> I understand that saving of DD state is not an easy feat. Do you
>>> consider this to be a waste of time? Even if the answer is positive
>>> I would still be interested in it, as it would allow significantly
>>> faster while also reproducible runs for my simulations.
>>> gmx-developers mailing list
>>> gmx-developers at gromacs.org
>>> Please don't post (un)subscribe requests to the list. Use the
>>> www interface or send it to gmx-developers-request at gromacs.org.