[gmx-users] errors on restart
gianluca_santarossa at fastmail.fm
Thu May 18 17:07:09 CEST 2006
Mark Abraham wrote:
>> I can't. I run simulations on a cluster through a queue, and
>> sometimes the jobs are longer than the max time of the queue.
> Yes you can. Do a pilot run and look at the last few lines of the
> logfile - or better one of the crashed runs - you want reasonable
> length so your setup time is amortized over all of the timesteps. That
> will tell you how much simulation time you can do per unit wall clock
> time. Now adjust the number of simulation steps accordingly.
I think you are right. I guess this is the best solution, after all.
The drawback is that I need to do a pilot run for each system I need to
simulate, and for each number of processors I choose, too. Oh, no! It
smells like a benchmark!!!! :P
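
For what it's worth, the arithmetic behind the pilot-run idea can be sketched in a few lines of shell. This is only a rough sketch: the function name steps_for_walltime and all the numbers in the usage comment are made up for illustration, and the safety margin is my own assumption, not something from the advice above.

```sh
#!/bin/sh
# Estimate how many MD steps fit into one queue slot, given the
# throughput measured in a pilot run. Integer arithmetic only.
# steps_for_walltime is a hypothetical helper name.
steps_for_walltime() {
    pilot_steps=$1      # steps completed in the pilot run
    pilot_seconds=$2    # wall-clock seconds the pilot run took
    queue_seconds=$3    # wall-clock limit of the queue
    safety_percent=$4   # share of the limit to actually use, e.g. 90
    echo $(( queue_seconds * safety_percent / 100 * pilot_steps / pilot_seconds ))
}

# Hypothetical usage: 50000 steps took 600 s, the queue allows 24 h,
# and 10% of the slot is kept in reserve for setup and copying files.
#   steps_for_walltime 50000 600 86400 90
```

The result would go into nsteps in the .mdp file. The pilot run should be long enough that setup time does not dominate the measured throughput.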
> That's a reasonable start, but the nature of buffered output is such
> that you can't guarantee that ener.edr and traj.trr are at the same
> point. What you need to do is get gromacs to exit gracefully having
> flushed its buffers. My PBS setup sends a SIGHUP that GROMACS 3.3.1
> reads and does an appropriate end-of-last-step flush and a pirouette
> to finish :-) I suggest passing the SIGHUP, delaying as long as you
> can afford and only then copying the files back. This will work better
> on average. It's probably overkill if you implement the first solution.
I don't know how to do that... Can you help me? (At least, I can learn
something new about scripting...)
If I'm right, a trap is executed only after the currently running command
finishes, so I cannot send a SIGHUP signal from the trap.
On the other hand, I have no control over the signals sent by the queue. From
the FAQ of the cluster:
"To give the application a chance to exit gracefully, LSF first sends a
“friendly” signal (SIGUSR2) to all processes of a job when its time limit
is about to expire. If the job is still running after a short grace
period, LSF sends increasingly “unfriendly” signals (SIGINT, SIGTERM and
SIGKILL). The last one effectively kills the job."
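
One pattern that might get around the trap limitation is to start the job in the background and sit in wait, since wait returns as soon as a trapped signal arrives. The sketch below is untested on a real LSF cluster. It assumes that the job script itself receives the SIGUSR2 and that mdrun reacts to SIGHUP as described for GROMACS 3.3.1 above. The helper name run_forwarding_usr2 and the results directory are hypothetical. Since LSF signals all processes of the job, mdrun may also get the SIGUSR2 directly, and this sketch does not deal with that.

```sh
#!/bin/sh
# Sketch: start a command in the background and pass SIGUSR2 on to it
# as SIGHUP. run_forwarding_usr2 is a hypothetical helper name, not
# part of GROMACS or LSF.
run_forwarding_usr2() {
    "$@" &                 # background job, so the shell is free to run the trap
    child=$!
    trap 'kill -HUP "$child" 2>/dev/null || true' USR2
    status=0
    wait "$child" || status=$?   # returns early (status > 128) on a trapped signal
    # Keep waiting until the child has really exited.
    while kill -0 "$child" 2>/dev/null; do
        status=0
        wait "$child" || status=$?
    done
    trap - USR2
    return "$status"
}

# Hypothetical usage inside the job script:
#   run_forwarding_usr2 mdrun -s topol.tpr
#   cp ener.edr traj.trr "$HOME/results/"   # copy back only after mdrun has exited
```

Copying the files only after the function returns means the copy starts once the program has had its chance to flush and exit, provided that happens within the grace period.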