[gmx-developers] Restarting with different parallel configuration

Oliver Beckstein oliver at biop.ox.ac.uk
Mon Aug 23 19:29:28 CEST 2004


Hi Josh,

> I am not sure if this is the correct list for this question, but it
> seems to be a good match for this list.  

In the future, I'd send messages like this to gmx-users.

> Is there a method to take the output or checkpoint files (I use .trr)
> from a parallel run of gromacs and use it as input for another parallel
> run that has a different number of nodes?  The goal is to create a

You have to rerun grompp with the new number of nodes, but grompp can
read the starting configuration (including velocities) from a trr file
and the coupling data from the edr file. In this case you might also want
to set "gen_vel = no" and "unconstrained_start = yes" in the mdp file
(see man grompp; using sed comes to mind...).
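
As a rough sketch of the sed idea: the helper below rewrites an existing
"key = value" line in an mdp file, or appends the line if the key is
missing. The function name set_mdp_option and the file name restart.mdp
are made up for illustration; only the two mdp options come from the
text above.

```shell
#!/bin/bash
# Sketch only: set_mdp_option is a hypothetical helper, not part of GROMACS.
# Usage: set_mdp_option FILE KEY VALUE
set_mdp_option() {
    local file=$1 key=$2 value=$3
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        # Key is present: rewrite the whole line via a temporary file.
        sed -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" \
            "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
    else
        # Key is absent: append it at the end of the file.
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example (restart.mdp is an assumed copy of the original mdp file):
#   set_mdp_option restart.mdp gen_vel no
#   set_mdp_option restart.mdp unconstrained_start yes
```

Commented-out mdp lines (starting with ';') are left alone, because the
pattern only matches lines that begin with the key itself.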

> method to make gromacs runs more robust when running on a cluster
> environment in which the number of nodes is dynamic over time.  Any help
> would be greatly appreciated.

Though I haven't done what you want to do, I have a Sun Grid Engine queue
script that automatically restarts jobs that died; I append it below for
your perusal. You might be able to slot in all the grompp stuff in place
of the 'tpbconv...' line.

Good luck,
Oliver

#-------------------------------------------------------------------
 
#!/bin/bash
#$ -r y
#$ -S /bin/bash
#$ -pe mpi-dual 2
#$ -cwd

 DEFFNM=md
 TPR=md.tpr
 TRR=${DEFFNM}.trr
 EDR=${DEFFNM}.edr

 source /opt/gromacs/3.2.1/i686-pc-linux-gnu/bin/GMXRC

# check if SGE restarted the job or if user requested restart by
# qsub -v FORCE_RESTART=1 ...
 if [ "${RESTARTED}" = "1" -o -n "${FORCE_RESTART}" ]; then
    echo "Creating restart run from previous run"
    if [ -e .restart ]; then
       # count restarts in .restart
       N_RESTART=$(cat .restart)
       PREV_TPR=${N_RESTART}_${TPR}
       PREV_TRR=${N_RESTART}_${TRR}
       PREV_EDR=${N_RESTART}_${EDR}
    else
       N_RESTART=0
       PREV_TPR=${TPR}
       PREV_TRR=${TRR}
       PREV_EDR=${EDR}
    fi
    N_RESTART=$((N_RESTART + 1))
    echo ${N_RESTART} > .restart
   
    echo "This is restart number N_RESTART=${N_RESTART}"

    DEFFNM=${N_RESTART}_${DEFFNM}
    TPR=${N_RESTART}_${TPR}
    tpbconv -s "${PREV_TPR}" -f "${PREV_TRR}" -e "${PREV_EDR}" -o "${TPR}"
 fi

 # standard MD stuff
 mpirun C mdrun_mpi -s ${TPR} -deffnm ${DEFFNM}

#-----------------------------------------------------------------------
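
One way the grompp step might be slotted into the script above in place
of the tpbconv line, as an untested sketch. It assumes that grompp accepts
-np, -t and -e as in the GROMACS 3.x series and that SGE exports the
granted slot count as NSLOTS. The function name regenerate_tpr, the
GROMPP override, and the default file names md.mdp, conf.gro and
topol.top are made up for illustration.

```shell
#!/bin/bash
# Sketch only: regenerate_tpr is a hypothetical helper.
# Usage: regenerate_tpr NODES PREV_TRR PREV_EDR NEW_TPR
# MDP, GRO and TOP may be set to override the assumed input file names;
# GROMPP may be set to a different executable (or a stub for a dry run).
regenerate_tpr() {
    local nodes=$1 prev_trr=$2 prev_edr=$3 new_tpr=$4
    "${GROMPP:-grompp}" -np "$nodes" \
        -f "${MDP:-md.mdp}" -c "${GRO:-conf.gro}" -p "${TOP:-topol.top}" \
        -t "$prev_trr" -e "$prev_edr" -o "$new_tpr"
}

# In the restart branch, instead of the tpbconv call:
#   regenerate_tpr "${NSLOTS:-1}" "${PREV_TRR}" "${PREV_EDR}" "${TPR}" || exit 1
```

The mdp file passed here should already have gen_vel and
unconstrained_start set for a continuation run.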


-- 
Oliver Beckstein * oliver at biop.ox.ac.uk
 http://sansom.biop.ox.ac.uk/oliver/




