[gmx-developers] SEGV mdrun mpi failure
Mostyn Lewis
Mostyn.Lewis at sun.com
Sun Oct 26 02:57:14 CET 2003
Hello,
Well, the CVS Gromacs from Friday did not prevent a SEGV (signal 11) in the
MPI benchmark I had. The values at failure were still the same in angles
(bondfree.c), with a bad t2 value in the rvec_inc(fr->fshift[t2],f_k);
statement at line 500. So I browsed in the debugger a little more and,
purely on a hunch, suspected that the value of ak in the statement
ivec_sub(SHIFT_IVEC(g,ak),jt,dt_kj); was wrong. It was 1 greater
than g->end, so the code was reading g->ishift[3002] and getting a bogus
vector.
*g ->
maxedge = 9
nnodes = 3002
nbound = 3002
start = 0
end = 3001
negc = 3002
g->ishift[3002] was {0, 108081, 2}
g->ishift[3001] was {1, 1, 0}
Anyway, to cut a tedious debugging tirade short, I changed angles to test
if (g && (ak <= g->end)) {
and do_dih_fup to test
if (g && (l <= g->end)) {
and the benchmark ran (up to 16 CPUs) - I tried Linux LAM/icc and
SUN 6800 SUNClusterTools/SUN Forte compilers.
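
To make the idea behind those two tests concrete, here is a minimal,
self-contained sketch of a bounds-checked lookup. It is not the actual
Gromacs code: mini_graph, atom_in_graph and shift_or_null are hypothetical
names, the struct keeps only fields seen in the debugger dump above, and
indexing ishift relative to g->start is an assumption (start was 0 in my
run, so it makes no difference there). The lower-bound check is also an
addition that my change did not have.

```c
#include <assert.h>
#include <stddef.h>

typedef int ivec[3];

/* Hypothetical, cut-down stand-in for the graph structure; only the
 * fields that appear in the debugger dump are kept. */
typedef struct {
    int   nnodes; /* number of entries allocated in ishift */
    int   start;  /* first atom index covered by the graph */
    int   end;    /* last atom index covered by the graph  */
    ivec *ishift; /* one shift vector per covered atom     */
} mini_graph;

/* Returns 1 when atom index a has an entry in g->ishift, else 0. */
static int atom_in_graph(const mini_graph *g, int a)
{
    return g != NULL && a >= g->start && a <= g->end;
}

/* Bounds-checked lookup: a pointer to the shift vector for atom a,
 * or NULL when a lies outside [start, end]. */
static const int *shift_or_null(const mini_graph *g, int a)
{
    if (!atom_in_graph(g, a)) {
        return NULL;
    }
    /* Assumption: ishift is indexed relative to g->start. */
    assert(a - g->start < g->nnodes);
    return g->ishift[a - g->start];
}
```

With the values from the dump (start = 0, end = 3001, nnodes = 3002), an
index of 3002 fails the check, so the lookup returns NULL instead of
reading past the end of the array.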
I don't think this is an MPI problem; it just seems to show up more easily
in this mode.
The benchmark is one of your standard ones, the d.poly-ch2 example.
Is this just a stupid hack or does it have any substance?
Regards,
Mostyn