[gmx-developers] (no subject)

Ken Rotondi ksr at chemistry.umass.edu
Thu Jun 16 20:48:16 CEST 2005


Dear Gromacs community,

Well, it appears we've stumbled across some genuine weirdness here. The 
CVS version of GROMACS runs on our SGI supercomputer:

SGI Origin 3800, 128 R14000 processors at 500 MHz
SGI Message Passing Toolkit 1.6 (SGI's tuned MPI libraries)
FFTW version 2.1.3 (single-precision, shared libraries, MPI-enabled)

The problem is that it won't run on more than 24 processors without 
generating an MPI error. The cause could be in GROMACS or in something 
on the SGI.

Has anybody successfully run a job on an SGI system using more than 24 
processors? I'm not getting any error about system resources, which 
would otherwise be my first guess as to why it dies when given more 
than 24 CPUs. The system prefers to run in multiples of 4 CPUs (each 
processor brick contains 4 CPUs and memory). GROMACS dies with 28 CPUs 
and above, but runs fine with 24.
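
To pin down the exact threshold, the processor count can be stepped in 
multiples of 4 until the first failure. The following is only a minimal 
sketch: the helper names are made up for illustration, and the launch 
command (mpirun arguments, binary name, input file) is a placeholder 
assumption, not the command actually used on this machine.

```python
import subprocess


def first_failing_count(run, counts):
    """Return the first count for which run(count) is falsy, else None."""
    for n in counts:
        if not run(n):
            return n
    return None


def run_mdrun(nprocs):
    # Hypothetical launch line: the binary name, flags, and input file
    # are assumptions and must be adapted to the local installation.
    cmd = ["mpirun", "-np", str(nprocs), "mdrun_mpi", "-s", "topol.tpr"]
    return subprocess.run(cmd).returncode == 0


# Example use on the real machine (not executed here):
#   first_failing_count(run_mdrun, range(4, 129, 4))
```

Passing the launcher in as a callable keeps the search logic separate 
from the site-specific command, so any preparation needed per processor 
count can go inside that callable.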

This is the error message when it dies using more than 24 processors:

MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
MPI: Received signal 10
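
Signal numbers are platform-specific, so "signal 10" has to be read 
against the machine's own numbering. The following is a minimal sketch 
that assumes IRIX uses SVR4-style numbering, in which 10 is SIGBUS 
(on x86 Linux, 10 is SIGUSR1 instead). The table and the function name 
are illustrative assumptions and should be checked against signal.h on 
the system itself.

```python
import re

# Assumption: SVR4-style signal numbers as used on IRIX (partial table).
# Verify against <signal.h> on the target machine.
IRIX_SIGNALS = {
    4: "SIGILL",
    6: "SIGABRT",
    8: "SIGFPE",
    9: "SIGKILL",
    10: "SIGBUS",
    11: "SIGSEGV",
}


def decode_mpi_signal(line, table=IRIX_SIGNALS):
    """Extract the signal number from an 'MPI: Received signal N' line.

    Returns (number, name), or None if the line does not match.
    """
    match = re.search(r"Received signal (\d+)", line)
    if match is None:
        return None
    number = int(match.group(1))
    return number, table.get(number, "unknown")


print(decode_mpi_signal("MPI: Received signal 10"))
```

Under that assumption the message would point to a bus error, which is 
typically a misaligned or otherwise invalid memory access, rather than 
a resource limit.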

It dumps a nasty core file that doesn't contain usable symbols (we 
probably need to recompile without stripping the binaries). I've also 
noticed that it takes a long time to set up before dying, compared to 
the setup of a successful run on 24 processors.

Sorry to post to both developers and users, but I thought it might be a 
CVS issue.

Thanks for any insight/help,

Ken



