[gmx-developers] restarting from checkpoint--option to append to output files?
Erik Lindahl
lindahl at cbr.su.se
Tue Apr 22 22:01:35 CEST 2008
Hi,
On Apr 22, 2008, at 6:20 PM, Peter Kasson wrote:
> It's great to have a checkpoint/resume feature in mdrun; one thing I
> notice is that the current code essentially treats a resume from
> checkpoint as a new run with an exact restart. It would be nice to
> have an option to append to existing files so that one gets a single
> continuous trr, xtc, etc. How complicated would this be?
>
> (One could imagine a relatively naive version that takes a -append
> flag and starts appending to output files if they exist when
> resuming from a checkpoint or a fancier version that stores hashes
> in the checkpoint file, verifies the hashes, and then appends only
> if the files check out as corresponding to the checkpoint).
We discussed this a bit during the workshop - not sure if you were
there for that session.
The big problem is that if something goes wrong (full disk, crashed
run, bad GROMACS binary, whatever) you screw up your entire
trajectory and energy files, and then it's a mess to fix things,
rather than just resubmitting with the correct settings.
Another reason for not wanting it by default is that some parallel
file systems (some versions of GPFS?) simply don't support file
append operations.
Still, I have to confess that I can't remember any good reasons for
not even having it as an option ;-) I'll see if anybody else voices
any concerns here, otherwise we could add it as an optional choice to
mdrun (pretty nice in combination with the new max run time option).
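
For illustration only, here is a minimal sketch of the "fancier" variant Peter describes above: record a size and hash for each output file at checkpoint time, and on restart append only if the file still matches. This is not mdrun code; the names file_record and append_if_consistent are hypothetical, and truncating back to the checkpointed size before appending is an assumption about how one might handle data written after the last checkpoint.

```python
import hashlib
import os


def file_record(path):
    """Return the size and SHA-256 digest of an output file at checkpoint time.

    Hypothetical helper: a real implementation would hash in chunks
    rather than read a large trajectory into memory.
    """
    with open(path, "rb") as f:
        data = f.read()
    return {"size": len(data), "sha256": hashlib.sha256(data).hexdigest()}


def append_if_consistent(path, record, new_data):
    """Append new_data only if the file still matches the checkpoint record.

    Anything written after the checkpoint is discarded first, so the
    continued file picks up exactly where the checkpoint left off.
    Returns True on success, False if the file is missing or does not match.
    """
    if not os.path.exists(path) or os.path.getsize(path) < record["size"]:
        return False
    with open(path, "r+b") as f:
        prefix = f.read(record["size"])
        if hashlib.sha256(prefix).hexdigest() != record["sha256"]:
            return False  # file was modified or replaced: refuse to append
        f.truncate(record["size"])  # drop data written after the checkpoint
        f.seek(record["size"])
        f.write(new_data)
    return True
```

If the check fails, the caller can fall back to the current behaviour of writing new output files, so a mismatched or damaged file is never touched.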
Cheers,
Erik