[gmx-users] REMD restart: writing 2nd traj part on same local disk

pascal.baillod at epfl.ch pascal.baillod at epfl.ch
Mon Nov 13 17:27:15 CET 2006


Dear community,

I am running GROMACS 3.3 REMD on a 32-node cluster, writing data to the local
node disks, with the following command:

mpiexec -np 64 $MDRUN -multi -np 64 -s $INPUT/md.tpr -o $OUT/traj.trr -x
$OUT/traj.xtc -c $OUT/conf.g96 -e $OUT/ener.edr -g $OUT/md.log -replex 5000
-reseed -1 >& log-job &

...where $OUT is the path to a directory on the local node hard disk, and $INPUT
is the path to a /home directory accessible by all nodes and containing all the
.tpr files. For the sake of performance, the node-local hard disks are not
cross-mounted over NFS, and are therefore only visible when logged in to the
corresponding node.

Following a crash, I restarted the simulation by preparing restart .tpr files.
For every temperature X, my script runs the GROMACS tpbconv command on node X,
where the .trr trajectory file of temperature X is stored and can be read, and
writes the restart .tpr file for temperature X somewhere on /home. I can then
restart the simulation from these .tpr files with a command line similar to the
one printed above.
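
The restart preparation described above could be sketched roughly as follows.
This is only an illustration, not the actual script: the helper restart_cmd, the
hosts.txt file, the default paths and the restart$i.tpr output names are all
hypothetical, and it assumes that -multi appended the replica index to each file
name (md0.tpr, traj0.trr, ...). It only prints the commands (dry run) and does
not execute them.

```shell
#!/bin/sh
# Sketch only: host names, file suffixes and output names are assumptions.
INPUT=${INPUT:-/home/user/remd}   # hypothetical shared directory on /home
OUT=${OUT:-/scratch/remd}         # hypothetical node-local directory

# Print the command that would build the restart .tpr for replica $1 on host $2.
# tpbconv reads the old run input (-s), the trajectory (-f) and the energy
# file (-e), and writes the new run input (-o).
restart_cmd() {
    i=$1
    host=$2
    echo "ssh $host tpbconv -s $INPUT/md$i.tpr -f $OUT/traj$i.trr -e $OUT/ener$i.edr -o $INPUT/restart$i.tpr"
}

# Dry run: hosts.txt (hypothetical) lists one host name per replica,
# one per line, starting with the host that holds replica 0.
if [ -f hosts.txt ]; then
    i=0
    while read -r host; do
        restart_cmd "$i" "$host"
        i=$((i + 1))
    done < hosts.txt
fi
```

With two replicas per node, hosts.txt would simply list each host name twice.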

The problem is that the temperature trajectories are no longer processed by the
same nodes as before. In other words, after the restart, node X runs the
temperature Y MD instead of the temperature X MD. I wonder whether that can be
corrected, so as to recover the original assignment (with node X running the 2nd
part of the temperature X MD, as it already ran the 1st part), which would make
it easier to concatenate the 1st and 2nd parts of the simulation.

Thanks for any hint!!

Pascal