[gmx-developers] restarting from checkpoint--option to append to output files?

Erik Lindahl lindahl at cbr.su.se
Tue Apr 22 22:01:35 CEST 2008


Hi,

On Apr 22, 2008, at 6:20 PM, Peter Kasson wrote:

> It's great to have a checkpoint/resume feature in mdrun; one thing I  
> notice is that the current code essentially treats a resume from  
> checkpoint as a new run with an exact restart.  It would be nice to  
> have an option to append to existing files so that one gets a single  
> continuous trr, xtc, etc.  How complicated would this be?
>
> (One could imagine a relatively naive version that takes a -append  
> flag and starts appending to output files if they exist when  
> resuming from a checkpoint or a fancier version that stores hashes  
> in the checkpoint file, verifies the hashes, and then appends only  
> if the files check out as corresponding to the checkpoint).

We discussed this a bit during the workshop - not sure if you were  
there for that session.

The big problem is that if something goes wrong (full disk, crashed
run, bad gromacs binary, whatever) you screw up your entire
trajectory and energy files, and then it's a mess to fix things,
rather than just resubmitting with the correct settings.

Another reason for not wanting it by default is that some parallel  
file systems (some versions of GPFS?) simply don't support append file  
operations.

Still, I have to confess that I can't remember any good reasons for  
not even having it as an option ;-)   I'll see if anybody else voices  
any concerns here, otherwise we could add it as an optional choice to  
mdrun (pretty nice in combination with the new max run time option).
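
To make the "fancier version" from Peter's mail concrete, here is a rough
Python sketch of the idea, not actual mdrun code. It assumes the checkpoint
records each output file's size and a SHA-256 hash at checkpoint time; the
function names (file_state, prefix_hash, open_for_append) are made up for
illustration. On resume it checks that the first recorded_size bytes still
hash to the recorded value, cuts off anything written after the checkpoint,
and only then continues writing:

```python
import hashlib
import os

CHUNK = 1 << 20  # read files in 1 MiB pieces


def file_state(path):
    """Return (size, sha256 hex digest) of the file as it is right now.

    This is what a checkpoint would store for each output file
    (an assumption of this sketch).
    """
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


def prefix_hash(path, size):
    """Return the sha256 hex digest of the first `size` bytes of the file."""
    digest = hashlib.sha256()
    remaining = size
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(CHUNK, remaining))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def open_for_append(path, recorded_size, recorded_hash):
    """Verify the file against the checkpoint record, then reopen it.

    Raises ValueError if the file is missing, shorter than recorded, or
    its first `recorded_size` bytes do not match the recorded hash.
    Anything written after the checkpoint is discarded.
    """
    if not os.path.exists(path):
        raise ValueError(f"{path}: missing, cannot append")
    if os.path.getsize(path) < recorded_size:
        raise ValueError(f"{path}: shorter than at checkpoint time")
    if prefix_hash(path, recorded_size) != recorded_hash:
        raise ValueError(f"{path}: contents do not match the checkpoint")
    f = open(path, "r+b")
    f.truncate(recorded_size)  # drop data written after the checkpoint
    f.seek(recorded_size)      # continue writing from the checkpoint position
    return f
```

A failed check raises instead of writing, so a mismatched file is left
untouched and the run can simply be resubmitted the old way. Rehashing large
trajectories on every restart costs a full read of each file, which is one
argument for keeping this optional.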

Cheers,

Erik


