[gmx-developers] Checkpoints & REMD

David van der Spoel spoel at xray.bmc.uu.se
Wed Sep 7 09:29:35 CEST 2011


I have been bitten by this problem before:

[neolith1:native/REMD] % ls -l *cpt
-rw-r--r-- 1 x_davva x_davva 635388 Sep  5 23:18 native10.cpt
-rw-r--r-- 1 x_davva x_davva 635388 Sep  5 23:18 native10_prev.cpt
-rw-r--r-- 1 x_davva x_davva      0 Sep  5 23:18 native11.cpt
-rw-r--r-- 1 x_davva x_davva      0 Sep  5 23:18 native11_prev.cpt

and now it happened again, using gmx 4.5.1 (for consistency). It seems 
that the checkpoint code is not REMD- or multisim-aware, so the check 
for the existence of xxx_prev.cpt is not sufficient.

The problem seems to arise because my jobs are chained in the queueing 
system, so a new job is started even if the previous job crashed. It 
could therefore be prevented by adding checks to the job script that 
the cpt files exist and are consistent with each other.
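
As a rough illustration of such a script-side check, here is a minimal
Python sketch. The function name checkpoint_status and its arguments are
made up for this example, and it assumes the file naming seen in the
listing above (native10.cpt, native10_prev.cpt, ...). It only tests that
files exist and are non-empty; it does not verify that all replicas are
at the same step.

```python
import os


def checkpoint_status(prefix, n_replicas, first=0):
    """Classify the per-replica checkpoints before a chained restart.

    Returns "current" if every <prefix><i>.cpt is usable, "previous" if
    only the complete set of <prefix><i>_prev.cpt files is usable, and
    "broken" otherwise. Hypothetical helper, not part of GROMACS.
    """
    def usable(path):
        # A checkpoint counts as usable if it exists and is not empty.
        return os.path.isfile(path) and os.path.getsize(path) > 0

    indices = range(first, first + n_replicas)
    current = [f"{prefix}{i}.cpt" for i in indices]
    previous = [f"{prefix}{i}_prev.cpt" for i in indices]

    if all(usable(p) for p in current):
        return "current"
    if all(usable(p) for p in previous):
        return "previous"
    return "broken"
```

For the listing above, checkpoint_status("native", 2, first=10) would
give "broken", since both native11 files are empty. The job script could
then exit with an error instead of restarting.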

Nevertheless, it should be quite simple to introduce a multisim check in 
the cpt code before the previous version is erased. Judging from the 
latest (release-4-5-patches) source code, no such check is present.
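
To make the idea concrete, here is a file-level Python sketch of the
logic only. It is not GROMACS code and says nothing about how mdrun
actually writes checkpoints: the function commit_checkpoints and the
temporary ".cpt.new" suffix are assumptions for illustration. The point
is that the previous generation is overwritten only after every replica
has a non-empty new checkpoint.

```python
import os


def commit_checkpoints(names, new_suffix=".cpt.new"):
    """Rotate checkpoints for all replicas, or for none of them.

    names are per-replica base names such as "native10". Each replica is
    assumed (hypothetically) to have written its new checkpoint to
    <name><new_suffix> first. Returns True if the rotation was done.
    """
    new_files = [name + new_suffix for name in names]

    # Phase 1: every replica must have a non-empty new checkpoint.
    for path in new_files:
        if not (os.path.isfile(path) and os.path.getsize(path) > 0):
            return False  # existing .cpt and _prev.cpt stay untouched

    # Phase 2: only now replace the previous generation.
    for name in names:
        current = name + ".cpt"
        previous = name + "_prev.cpt"
        if os.path.exists(current):
            os.replace(current, previous)
        os.replace(name + new_suffix, current)
    return True
```

Inside mdrun the first phase would presumably be a check communicated
between the simulations rather than a look at the file system, but the
ordering of the two phases is the part that matters.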

David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205.
spoel at xray.bmc.uu.se    http://folding.bmc.uu.se
