[gmx-users] Need help on a SEGV mdrun mpi failure
spoel at xray.bmc.uu.se
Fri Oct 24 19:08:01 CEST 2003
On Fri, 2003-10-24 at 18:20, Mostyn Lewis wrote:
> Sent this last night but it seems to have been lost (maybe because it
> had a 270K attachment of topol.top.bz2?). So here goes again.
> I'm having a problem with a benchmark case which causes SEGV (signal 11)
> in most cases of an MPI run with more than 4 CPUs. The failure is always
> in bondfree.c (gromacs-3.1.4 + gromacs-3.1.5_pre1) in the angles routine
> at line 535
> } /* 168 TOTAL */
> This line has a BAD t2 value which causes an out of bounds reference
> (actually a little later in x=a[XX]+b[XX]; at line 235 in vec.h due to
> the expansion of rvec_inc)
Thanks for the bug report. Had I had this half a year ago, it would
have saved me a lot of time. As it is, this has been fixed in the
CVS version of the code (unless I'm terribly wrong), as I assume that
this is a run with a protein in little water.
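
For anyone following the thread, the failure mode described above (a bad
index reaching an rvec_inc-style update) can be sketched in C. This is
only an illustration under stated assumptions, not the GROMACS source:
the names rvec_inc_sketch, index_ok and add_force_checked are
hypothetical, and the types merely mimic the rvec convention from vec.h.

```c
#include <assert.h>
#include <stdio.h>

#define DIM 3
#define XX 0
#define YY 1
#define ZZ 2

typedef float real;
typedef real rvec[DIM];

/* Add b into a, component by component (in the spirit of rvec_inc). */
static void rvec_inc_sketch(rvec a, const rvec b)
{
    a[XX] += b[XX];
    a[YY] += b[YY];
    a[ZZ] += b[ZZ];
}

/* Hypothetical guard: 1 if idx is a valid row of an array with n rows. */
static int index_ok(int idx, int n)
{
    return idx >= 0 && idx < n;
}

/* Accumulate df into f[idx] only when idx is in range.
 * Returns 1 on success and 0 on a bad index, instead of
 * writing outside the array and risking a SEGV. */
static int add_force_checked(rvec *f, int n, int idx, const rvec df)
{
    assert(f != NULL);
    if (!index_ok(idx, n)) {
        fprintf(stderr, "bad index %d (valid range 0..%d)\n", idx, n - 1);
        return 0;
    }
    rvec_inc_sketch(f[idx], df);
    return 1;
}
```

Without such a check, an index like the bad t2 value would be used
directly in f[t2], and the crash would show up inside the expanded
rvec_inc rather than where the index was computed.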
> I enclose a run below with the grompp and mdrun_mpi output followed by
> some dbx debugging output showing some values. This was on a 24 CPU
> SUN SMP box (Sunfire 6800) using 8 CPUs.
> I get the same failure on a cluster of Linux (2 CPU Xeon) boxes doing
> MPI across Gigabit ethernet. The failure occurs in Linux land using
> Intel/PGI and LAM/mpich combinations, so I think this is problem- and/or
> Gromacs-dependent.
> I'm not a Molecular persona at all, just a humble benchmarker and seek
> help from the enlightened.
> Any files you'd like (topol.top ...) or more debugging are available
> on request.
> Sorry this is so long. Any help would be appreciated.
David van der Spoel, PhD, Assist. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596, 75124 Uppsala, Sweden
phone: 46 18 471 4205 fax: 46 18 511 755
spoel at xray.bmc.uu.se spoel at gromacs.org http://xray.bmc.uu.se/~spoel