[gmx-users] Pausing and resuming mdrun

Andrew DeYoung adeyoung at andrew.cmu.edu
Tue Sep 29 20:22:59 CEST 2015


Hi,

I'm running an MD simulation using mdrun (specifically version 4.5.5 -- an
old version).  I would like to pause this simulation -- perhaps for days
or weeks -- so that I can run a different one on this node.

Of course, one way is to just kill the simulation altogether:

kill $ProcessID

where $ProcessID is the process ID of the simulation.  Then, when I want
to resume the simulation, I can just pass the last checkpoint file to
mdrun.  Checkpoint files have been written every 15 minutes (i.e., the
default setting), so with this method I will lose at most 15 minutes of
computation time.
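
For reference, the kill-and-restart route might look roughly like the sketch below. It assumes the checkpoint was written under the name state.cpt in the working directory (an assumption; this message does not name the file) and reuses topol.tpr from the nohup command mentioned later in this message. The -cpi option tells mdrun which checkpoint file to read. The variable names and the guard around the command are illustrative only.

```shell
#!/bin/sh
# Sketch: resume a killed run from its last checkpoint.
# Assumed file names: topol.tpr (from this message) and state.cpt
# (assumed checkpoint name; adjust to whatever your run actually wrote).
tpr=topol.tpr
cpt=state.cpt

if [ -f "$cpt" ] && command -v mdrun >/dev/null 2>&1; then
    # -s names the run input file, -cpi names the checkpoint to continue from.
    nohup mdrun -s "$tpr" -cpi "$cpt" &
    resume_status="started"
else
    # Nothing to resume here: checkpoint missing or mdrun not on PATH.
    resume_status="skipped"
    echo "no $cpt found or mdrun not on PATH; nothing resumed" >&2
fi
echo "resume: $resume_status"
```
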

But is there any way to literally _pause_ the simulation and resume it a
few days or weeks later?

A Unix/Linux question and answer site (
http://unix.stackexchange.com/questions/2107/how-to-suspend-and-resume-processes
) says that one can pause/resume a process with this method:

kill -SIGSTOP $ProcessID
kill -SIGCONT $ProcessID

or:

kill -SIGTSTP $ProcessID
kill -SIGCONT $ProcessID

Another site ( http://www.cyberciti.biz/faq/unix-kill-command-examples/ )
says to just use:

kill -STOP $ProcessID
kill -CONT $ProcessID

My question is, do you think that these Linux methods will work with
mdrun?  I did a test with mdrun on another node, and it seems to work, but
I'm just wondering if there are any dangers in using these methods.

(I am accessing the Linux machine remotely, by SSH.  Sometimes my SSH
connection gives out, so when starting a simulation I always use "nohup
mdrun -s topol.tpr" so that the mdrun process is not terminated when my
SSH connection flakes out.  I'm not sure if this will affect the viability
of the "kill -STOP/CONT" method...)
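
A quick way to check the nohup concern without touching a real simulation is the sketch below. It uses "sleep 300" as a hypothetical stand-in for mdrun and assumes a Linux ps that accepts "-o stat= -p" (as procps does). nohup only makes the process ignore SIGHUP, and SIGSTOP cannot be caught or ignored by any process, so the stand-in should show state T after STOP and leave that state after CONT.

```shell
#!/bin/sh
# Sketch: pause and resume a process that was started under nohup.
# "sleep 300" is a placeholder for the real job (e.g. mdrun).
nohup sleep 300 >/dev/null 2>&1 &
pid=$!

# Suspend: the process gets no CPU time but keeps its memory.
kill -STOP "$pid"
sleep 1
state_stopped=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "after STOP: $state_stopped"

# Resume: the process continues from where it was suspended.
kill -CONT "$pid"
sleep 1
state_resumed=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "after CONT: $state_resumed"

# Clean up the placeholder process.
kill "$pid"
```

Note that a stopped process still holds its memory and does not survive a reboot of the node, so for a pause of days or weeks the checkpoint file remains the fallback.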

Thanks so much,

Andrew DeYoung
Carnegie Mellon University


