[gmx-developers] fatal error with MPI - Scali

David van der Spoel spoel at xray.bmc.uu.se
Mon Apr 26 12:03:31 CEST 2010


We have a problem with parallel mdrun (git HEAD), which crashes with a 
fatal error:

Error on node 6, will try to stop all the nodes
Halting parallel program mdrun_mpi-0(mpi:4088 at n99) on CPU 6 out of 8

gcq#48: "Insane In Tha Membrane" (Cypress Hill)


However, halting does not work, and the job keeps running until the queue 
system kicks it out. Apparently the call to MPI_Abort does not terminate 
the processes. We are using the Scali MPI library:
libmpi.so => /opt/scali/lib64/libmpi.so (0x00002ba2bdd32000)

Has anyone seen this before?
With OpenMPI this does not seem to happen, so it could be a library quirk.
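
One possible way to narrow this down is to guard the abort call with a hard
exit, so that a rank whose abort call hangs or returns still leaves. The
following is only a sketch under assumptions, not what mdrun does: it assumes
POSIX signals, and abort_with_fallback, stuck_abort and demo_exit_status are
hypothetical names made up for illustration. In a real build the callback
would wrap MPI_Abort(MPI_COMM_WORLD, code); here a stand-in that never
returns plays that role so the sketch compiles without MPI.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type. In a real build this would wrap
 * MPI_Abort(MPI_COMM_WORLD, code). */
typedef void (*abort_fn)(int code);

/* Exit code used by the SIGALRM handler. */
static volatile sig_atomic_t fallback_code = 1;

/* SIGALRM handler: leave the process immediately.
 * _exit is async-signal-safe, exit is not. */
static void on_timeout(int sig)
{
    (void)sig;
    _exit(fallback_code);
}

/* Call the library abort. If it has not ended the process after
 * timeout_s seconds, or if it returns, terminate this process directly. */
void abort_with_fallback(abort_fn do_abort, int code, unsigned timeout_s)
{
    fallback_code = code;
    signal(SIGALRM, on_timeout);
    alarm(timeout_s);
    do_abort(code);
    _exit(code); /* reached only if do_abort returned */
}

/* Stand-in for an abort call that hangs (an assumption about the
 * failure mode, not a confirmed description of the Scali library). */
static void stuck_abort(int code)
{
    (void)code;
    for (;;) {
        pause();
    }
}

/* Run the guarded abort in a child process and report how it ended.
 * Returns the child's exit code, or -1 on fork/wait failure. */
int demo_exit_status(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        abort_with_fallback(stuck_abort, 3, 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Note that this only ends the calling process. Whether the launcher then
tears down the remaining ranks depends on the MPI implementation, so it
would be a diagnostic aid rather than a fix.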

-- 
David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone:	+46184714205. Fax: +4618511755.
spoel at xray.bmc.uu.se	spoel at gromacs.org   http://folding.bmc.uu.se


