[gmx-developers] restarting from checkpoint--option to append to output files?

Berk Hess hessb at mpip-mainz.mpg.de
Wed Apr 23 10:16:43 CEST 2008


Erik Lindahl wrote:
> Hi,
>
> On Apr 22, 2008, at 6:20 PM, Peter Kasson wrote:
>
>> It's great to have a checkpoint/resume feature in mdrun; one thing I 
>> notice is that the current code essentially treats a resume from 
>> checkpoint as a new run with an exact restart.  It would be nice to 
>> have an option to append to existing files so that one gets a single 
>> continuous trr, xtc, etc.  How complicated would this be?
>>
>> (One could imagine a relatively naive version that takes a -append 
>> flag and starts appending to output files if they exist when resuming 
>> from a checkpoint or a fancier version that stores hashes in the 
>> checkpoint file, verifies the hashes, and then appends only if the 
>> files check out as corresponding to the checkpoint).
>
> We discussed this a bit during the workshop - not sure if you were 
> there for that session.
>
> The big problem is that if something goes wrong (full disk, crashed 
> run, bad gromacs binary, whatever) you screw up your entire 
> trajectory and energy files, and then it's a mess to fix things, 
> rather than just resubmitting with the correct settings.
>
> Another reason for not wanting it by default is that some parallel 
> file systems (some versions of GPFS?) simply don't support append file 
> operations.
>
> Still, I have to confess that I can't remember any good reasons for 
> not even having it as an option ;-)   I'll see if anybody else voices 
> any concerns here, otherwise we could add it as an optional choice to 
> mdrun (pretty nice in combination with the new max run time option).
>
> Cheers,
>
> Erik

For trajectory files this is trivial to implement.
For energy files it is more complicated, since they store the
cumulative sums and sums of squares since the beginning of the
simulation.
It would be much more convenient if energy files stored only the sums
and sums of squares between the last frame and the current frame.
Currently eneconv still does not get the fluctuations right for
concatenated files.
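
A rough sketch of why per-interval sums would be easier to append. This is
an illustration only: the Interval record and the function names are
hypothetical and do not reflect the actual Gromacs energy file format. If
each frame carries the step count, sum, and sum of squares for just the
steps since the previous frame, then frames from separate runs can be
combined by plain addition, and the fluctuation follows from the totals.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass
class Interval:
    """Hypothetical per-frame record: statistics since the previous frame."""
    n: int           # number of steps in the interval
    total: float     # sum of the energy over those steps
    total_sq: float  # sum of the squared energy over those steps


def make_interval(values):
    """Build the per-interval record for one block of step energies."""
    return Interval(len(values), sum(values), sum(v * v for v in values))


def combine(intervals):
    """Merge intervals by addition; it does not matter which file they came from."""
    return Interval(
        sum(i.n for i in intervals),
        sum(i.total for i in intervals),
        sum(i.total_sq for i in intervals),
    )


def fluctuation(stats):
    """RMS fluctuation from the accumulated sums: sqrt(<E^2> - <E>^2)."""
    mean = stats.total / stats.n
    variance = stats.total_sq / stats.n - mean * mean
    return math.sqrt(max(variance, 0.0))  # guard against rounding below zero


# Made-up energies for six steps; the run is restarted after step 4.
energies = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
run1_frames = [make_interval(energies[0:2]), make_interval(energies[2:4])]
run2_frames = [make_interval(energies[4:6])]  # written after the restart

appended = combine(run1_frames + run2_frames)
reference = statistics.pstdev(energies)  # fluctuation of the uninterrupted run
```

With cumulative sums, by contrast, the second run would need the totals
from the end of the first run before it could write a correct frame.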

But I don't know if we want to change the energy file format for Gromacs 4.

Berk.



