[gmx-users] Mdrun kills all other(single CPU)mdrun processes started by same father process?

Berk Hess gmx3 at hotmail.com
Thu May 29 16:16:41 CEST 2008





> Date: Thu, 29 May 2008 14:51:13 +0200
> From: spoel at xray.bmc.uu.se
> To: gmx-users at gromacs.org
> Subject: Re: [gmx-users] Mdrun kills all other(single CPU)mdrun processes started by same father process?
> 
> Peter Mueller wrote:
> > Dear Gromacs users,
> > 
> > on a smaller cluster I use a small C + MPI program to start several single-CPU Gromacs mdrun jobs via a system call.
> > (For the cluster admin it's OK, because my system does not scale well and I need lots of small independent trajectories.)
> > All the mdrun jobs are independent and shouldn't know about each other.
> > Just to make sure that the jobs don't kill each other, I included an "MPI_Barrier(MPI_COMM_WORLD)" before the call to "MPI_Finalize()".
> > Now I realized that when the first single-CPU mdrun finishes, a TERM signal is sent to all other mdrun jobs that were started by my C program.
> > After getting the TERM signal, these mdrun jobs do one additional MD step and then write out a summary before stopping.
> > 
> > In the manual page of mdrun I found following hint:
> >    "When mdrun receives a TERM signal, it will
> >      set nsteps to the current step plus one
> >     ... all the usual output will be written to file.
> >     When running with MPI, a signal to one of
> >     the mdrun processes is sufficient, this signal
> >      should not be sent to mpirun or the mdrun
> >     process that is the parent of the others."
> > 
> > I think mdrun is too clever :-). When it finishes, it checks whether the father process started any other mdrun jobs. If this is the case, it sends a TERM signal to all of these mdrun jobs.
> > Because my mdrun jobs are independent, this behavior is wrong in my case.
> > Is there any flag for "mdrun" to avoid this behavior?
> > 
> > 
> > Thanks
> > Peter
> > 
> Why not use  a script or queueing system?
> Much easier for the user.
> 
> -- 
> David van der Spoel, Ph.D.

I agree with David.

But mdrun never sends signals by itself.
The only thing that could happen is that one process finishes with a fatal error,
which would lead to all processes being terminated.

A script is the proper solution for your setup.
But if you really want to use MPI, mdrun already has the option -multi
which starts multiple simulations in parallel with MPI.
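
As a rough illustration of the script approach, here is a minimal Python sketch
that starts several independent commands, each in its own working directory,
and waits for all of them. The function name run_independent_jobs, the
directory layout and the log file name job.log are made up for this example;
the command lists are passed in by the caller, so nothing here depends on a
particular mdrun command line. Each child is started in its own session, so a
signal sent to the launcher's process group is not delivered to the jobs.
Whether this avoids the TERM signal described above has not been tested.

```python
import subprocess
from pathlib import Path


def run_independent_jobs(commands, workdirs):
    """Start each command in its own directory and session, then wait for all.

    `commands` is a list of argument lists, `workdirs` a list of directories
    of the same length. Returns the exit codes in the same order.
    """
    procs = []
    for cmd, workdir in zip(commands, workdirs):
        Path(workdir).mkdir(parents=True, exist_ok=True)
        # Hypothetical log name: stdout and stderr of each job go here.
        log = open(Path(workdir) / "job.log", "w")
        # start_new_session=True gives the child its own session on POSIX,
        # so signals sent to the launcher's process group do not reach it.
        proc = subprocess.Popen(
            cmd,
            cwd=workdir,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,
        )
        procs.append((proc, log))

    exit_codes = []
    for proc, log in procs:
        # A non-zero exit code shows which job, if any, ended with an error.
        exit_codes.append(proc.wait())
        log.close()
    return exit_codes
```

To use it, build one argument list per trajectory (for example the mdrun
command you already pass to the system call) and one directory per job, then
inspect the returned exit codes to see whether any job stopped with an error.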

Berk.

