[gmx-users] Mdrun kills all other (single CPU) mdrun processes started by same father process?
David van der Spoel
spoel at xray.bmc.uu.se
Thu May 29 14:51:13 CEST 2008
Peter Mueller wrote:
> Dear Gromacs users,
>
> On a small cluster, I use a small C + MPI program to start several single-CPU Gromacs mdrun jobs via a system call.
> (For the cluster admin it's OK, because my system does not scale well and I need lots of small independent trajectories.)
> All the mdrun jobs are independent and shouldn't know about each other.
> Just to make sure that the jobs don't kill each other, I included an "MPI_Barrier(MPI_COMM_WORLD)" before the call to "MPI_Finalize()".
> Now I have noticed that when the first single-CPU mdrun finishes, a TERM signal is sent to all other mdrun jobs that were started by my C program.
> After getting the TERM signal, these mdrun jobs do one additional MD step and then write out a summary before stopping.
>
> In the manual page of mdrun I found the following hint:
> "When mdrun receives a TERM signal, it will
> set nsteps to the current step plus one
> ... all the usual output will be written to file.
> When running with MPI, a signal to one of
> the mdrun processes is sufficient, this signal
> should not be sent to mpirun or the mdrun
> process that is the parent of the others."
>
> I think mdrun is too clever :-). When it finishes, it checks whether the parent process started any other mdrun jobs. If so, it sends a TERM signal to all of those mdrun jobs.
> Because my mdrun jobs are independent, this behavior is wrong in my case.
> Is there any flag for "mdrun" to avoid this behavior?
>
>
> Thanks
> Peter
>
Why not use a script or queueing system?
Much easier for the user.
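
As an illustration of the script approach, here is a minimal Python sketch that runs a list of independent commands, a few at a time, and collects their exit codes. The function name run_independent and the max_parallel parameter are hypothetical names chosen for this example; nothing here comes from Gromacs itself. Each job is started in its own session (POSIX only), so a signal sent to the launcher's process group does not reach the jobs. Whether that is what causes the TERM signals described above has not been established in this thread, so treat it as an assumption.

```python
import subprocess
import sys


def run_independent(commands, max_parallel=2):
    """Run each command as a separate process, at most max_parallel at once.

    Returns the exit codes in the same order as the input commands.
    """
    pending = list(enumerate(commands))
    running = {}  # job index -> Popen object
    codes = [None] * len(commands)
    while pending or running:
        # Start new jobs while there are free slots.
        while pending and len(running) < max_parallel:
            idx, cmd = pending.pop(0)
            # start_new_session=True makes the child call setsid() (POSIX),
            # so it does not share the launcher's process group.
            running[idx] = subprocess.Popen(cmd, start_new_session=True)
        # Wait for the oldest running job (not necessarily the first to
        # finish), record its exit code, and free its slot.
        idx, proc = next(iter(running.items()))
        codes[idx] = proc.wait()
        del running[idx]
    return codes


if __name__ == "__main__":
    # Stand-in jobs; replace with real mdrun command lines.
    jobs = [[sys.executable, "-c", "pass"] for _ in range(3)]
    print(run_independent(jobs))
```

For real runs, each entry would be an mdrun command line, for example ["mdrun", "-s", "run1.tpr", "-deffnm", "run1"]. The flags are shown only as an example and should be checked against the installed Gromacs version.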
--
David van der Spoel, Ph.D.
Molec. Biophys. group, Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205. Fax: +4618511755.
spoel at xray.bmc.uu.se spoel at gromacs.org http://folding.bmc.uu.se