[gmx-users] More on my segmentation violation problem

James O'Dell jodell at ad.brown.edu
Thu Mar 20 23:54:46 CET 2003


I've recompiled GROMACS to include symbols and have managed to get a
debugger backtrace from the process that is experiencing the
segmentation violation.

#0  0x082052cd in syscall ()
#1  0xbfffe5f8 in ?? ()
#2  0x081e045f in vsyscall (handle=0x834a080, retp=0xbfffe5f8, args=0xbfffe5ec)
    at ../scwrap.c:711
#3  0x081e0495 in score_syscall (handle=0x834a080, retp=0xbfffe5f8)
    at ../scwrap.c:726
#4  0x081e0acf in __nanosleep (req=0xbfffe61c, rem=0xbfffe61c)
    at ../scwrap.c:1166
#5  0x08203c7a in sleep ()
#6  0x081b020e in score_wait_forever () at ../libsc_util.c:154
#7  0x081b04f2 in sc_inspectme (x_display=0xbffffd56 "dev1:0", signal=11)
    at ../libscio.c:243
#8  0x081a8be0 in MPID_SCORE_Exception ()
#9  <signal handler called>
#10 angles (nbonds=27548, forceatoms=0x8eeed60, forceparams=0x8eeaa20,
    x=0x8fc0560, f=0x93e8218, fr=0x8cca460, g=0x8ccaca0, box=0x8ad1c98,
    lambda=0, dvdlambda=0xbfffec98, md=0x8cb61b8, ngrp=2, egnb=0x8ad12c0,
    egcoul=0x8ad12a8, fcd=0x8ad15b0) at ../../include/vec.h:235
#11 0x080898ee in calc_bonds (log=0x8ad1440, cr=0x86af708, mcr=0x0,
    idef=0x8ad402c, x_s=0x8fc0560, f=0x93e8218, fr=0x8cca460, g=0x8ccaca0,
    epot=0x8ad11a0, nrnb=0xbffff1d0, box=0x8ad1c98, lambda=0, md=0x8cb61b8,
    ngrp=2, egnb=0x8ad12c0, egcoul=0x8ad12a8, fcd=0x8ad15b0, step=0,
    bSepDVDL=0) at bondfree.c:109
#12 0x0805dd2d in force (fp=0x8ad1440, step=0, fr=0x8cca460, ir=0x8ad1aa8,
    idef=0x8ad402c, nsb=0x8ad3008, cr=0x86af708, mcr=0x0, nrnb=0xbffff1d0,
    grps=0x8ad1908, md=0x8cb61b8, ngener=2, opts=0x8ad1c28, x=0x8fc0560,
    f=0x93e8218, epot=0x8ad11a0, fcd=0x8ad15b0, bVerbose=0, box=0x8ad1c98,
    lambda=0, graph=0x8ccaca0, excl=0x8adf1c4, bNBFonly=0, lr_vir=0xbffff610,
    mu_tot=0xbffff1c0, qsum=-6.99999762, bGatherOnly=0) at force.c:960
#13 0x0807eade in do_force (log=0x8ad1440, cr=0x86af708, mcr=0x0,
    parm=0x8ad1aa8, nsb=0x8ad3008, vir_part=0xbffff640, pme_vir=0xbffff610,
    step=0, nrnb=0xbffff1d0, top=0x8ad4028, grps=0x8ad1908, x=0x8fc0560,
    v=0x90454e8, f=0x93e8218, buf=0x9363290, mdatoms=0x8cb61b8,
    ener=0x8ad11a0, fcd=0x8ad15b0, bVerbose=0, lambda=0, graph=0x8ccaca0,
    bNS=1, bNBFonly=0, fr=0x8cca460, mu_tot=0xbffff1c0, bGatherOnly=0)
    at sim_util.c:282
#14 0x0805177e in do_md (log=0x8ad1440, cr=0x86af708, mcr=0x0, nfile=21,
    fnm=0x828bd04, bVerbose=1, bCompact=1, bDummies=0, dummycomm=0x0,
    stepout=10, parm=0x8ad1aa8, grps=0x8ad1908, top=0x8ad4028,
    ener=0x8ad11a0, fcd=0x8ad15b0, x=0x8fc0560, vold=0x94f2128, v=0x90454e8,
    vt=0x946d1a0, f=0x93e8218, buf=0x9363290, mdatoms=0x8cb61b8,
    nsb=0x8ad3008, nrnb=0x8ae0260, graph=0x8ccaca0, edyn=0xbffff7f0,
    fr=0x8cca460, box_size=0xbffff790, Flags=0) at md.c:508
#15 0x080508b6 in mdrunner (cr=0x86af708, mcr=0x0, nfile=21, fnm=0x828bd04,
    bVerbose=1, bCompact=1, nDlb=0, nstepout=10, edyn=0xbffff7f0, Flags=0)
    at md.c:193

The code at the point of the violation is in this vicinity:
240       a[YY]=y;
241       a[ZZ]=z;
242     }
243
244     static inline void rvec_sub(const rvec a,const rvec b,rvec c)
245     {
246       real x,y,z;
247       
248       x=a[XX]-b[XX];
249       y=a[YY]-b[YY];

I don't believe this behavior is specific to my hardware or operating
system, since I get approximately the same behavior on an IBM SP.

The segmentation violation seems to happen very early in the run. In this
case I was running on 12 processors. Also, if I perform exactly the same
calculation several times in a row, it will sometimes hit the segmentation
fault and sometimes not. It seems to me that this has all the classic
characteristics of a storage allocation problem in the GROMACS code.
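One cheap way to chase an intermittent fault like this is to validate the bonded atom indices before they are used to index the coordinate array, so the run dies deterministically at the first bad index rather than at a random dereference. The sketch below is not GROMACS code: the names nbonds, forceatoms, and natoms are borrowed from the backtrace, and I am assuming the angle list is packed in (type, ai, aj, ak) groups of four ints, which is consistent with nbonds=27548 being divisible by four.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sanity check modeled on the angles() loop in the backtrace:
 * each group of four ints is assumed to be (type, ai, aj, ak), where the
 * last three are atom indices that must be in [0, natoms) before they are
 * used to index x[]. Aborts loudly on the first out-of-range index. */
static void check_angle_indices(int nbonds, const int *forceatoms, int natoms)
{
    for (int i = 0; i < nbonds; i += 4) {
        for (int k = 1; k <= 3; k++) {
            int a = forceatoms[i + k];
            if (a < 0 || a >= natoms) {
                fprintf(stderr,
                        "bad atom index %d at entry %d (natoms=%d)\n",
                        a, i + k, natoms);
                abort();
            }
        }
    }
}
```

If the indices always check out clean, that would point at the coordinate or force arrays themselves being under-allocated or freed, rather than at a corrupted topology.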

Does anybody have suggestions on how to pursue this further?

Thanks, Jim
