[gmx-developers] Reproducible runs with DLB

Thu Jul 21 23:46:49 CEST 2011

On Jul 21, 2011, at 3:35 PM, Mark Abraham wrote:

> On 22/07/2011 2:02 AM, XAvier Periole wrote:
>>
>> Hi,
>>
>> nothing I can help with here, but having the reprod mode running with
>> the dlb would be really useful!
>
> It relies on observing timings... how can that be reproducible?
>
>> And an even more useful option would be to be able to write out
>> conformations more often than in the original run. That would allow
>> one to run long simulations and then go back and zoom in on a
>> particular time period of the simulation where some interesting event
>> occurred.
>
> Hacking some environment variable to do this seems feasible.

So I run a simulation on 128 CPUs using the dlb, keep my cpt, let's say,
every hour, and then I just decide I want to rerun the simulation writing
the xtc file 10 times more often ... is this possible by hacking
some environment variables?

>
> Mark
>
>> XAvier.
>>
>> On Jul 21, 2011, at 9:26 AM, Bogdan Costescu wrote:
>>
>>> On Thu, Jul 21, 2011 at 16:30, Mark Abraham <Mark.Abraham at anu.edu.au> wrote:
>>>> Extending the checkpoint file format is not programmer-friendly,
>>>> never mind writing save-and-restore code for DD.
>>>
>>> If it had been programmer-friendly, wouldn't it have been done
>>> already? :-)
>>>
>>> Saving DD state was meant to be done at the same time as the
>>> checkpoint, to have a restart point for both the molecular system
>>> state and the distribution of the atoms on nodes. But it doesn't have
>>> to be in the same file - the checkpoint file can remain as it is and
>>> an additional one can contain the DD state, as long as they are named
>>> similarly (e.g. state_stepX.dd) so that it is clear which ones are to
>>> be used together.
>>>
>>>> I suggest you look at the hidden options to mdrun that allow you to
>>>> impose a particular DD grid that gives satisfactory performance. See
>>>> "mdrun -h -hidden". You might have to reverse engineer how to use
>>>> these from the code.
>>>
>>> I'm already using '-dd x y z' for the tests both with and without
>>> DLB. PME is not used in some of the simulations (so playing with
>>> -npme has no meaning), and -dlb and -reprod I've already mentioned in
>>> my previous message. Are there other options that you are referring
>>> to?
>>>
>>> I understand that saving the DD state is not an easy feat. Do you
>>> consider this to be a waste of time? Even if the answer is yes, I
>>> would still be interested in it, as it would allow significantly
>>> faster, yet still reproducible, runs of my simulations.
>>>
>>> Cheers,
>>> Bogdan
>>> -- 
>>> gmx-developers mailing list
>>> gmx-developers at gromacs.org
>>> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>>> Please don't post (un)subscribe requests to the list. Use the
>>> www interface or send it to gmx-developers-request at gromacs.org.
>>
>