[gmx-users] Problem with gromacs-3.3 using PME
Hector Mtz-Seara
hseara at netscape.net
Fri Jan 6 23:36:42 CET 2006
Hi,

I had a similar problem some time ago; maybe it is the same one. There is a
known bug in PME, already fixed in pme.c: the pme.c that comes with the
download version always has trouble with a pme_order different from 4. You
can take the updated pme.c file to fix this problem.
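
If it really is that bug, a quick way to test (and a stopgap until you swap
in the fixed pme.c) is to rerun with the order the shipped file expects; your
stack trace shows pme_order = 6, so something like this in the .mdp, if
fourth-order interpolation is acceptable for your system:

  ; stopgap only, assuming it is the pme_order != 4 bug;
  ; keep the rest of your PME settings unchanged
  coulombtype = PME
  pme_order   = 4
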
I hope this helps.
Hector
rbjornson at gmail.com wrote:
>Hi,
>
>A couple of weeks ago I sent a message to the list reporting a problem
>I was having with gromacs-3.3 when using PME. Gromacs was segfaulting
>within pme.c after what I believe to be a small number of timesteps.
>
>The initial configuration was: 8 EM64T Xeon CPUs, running RHEL WS
>release 3, using LAM 7.1.1 and the GNU compilers.
>
>I tried varying a number of things:
>- using the Intel compilers instead of GNU
>- compiling with the Intel 32-bit compilers
>- compiling without MPI, running sequentially on EM64T
>- compiling without MPI, running sequentially on a 32-bit Xeon
>
>All of these runs failed when using PME, but ran fine using cut-off.
>I'll append some debugging info at the end of this email.
>
>I then compiled gromacs-3.2.1, both sequentially and using lam, and
>was able to run the pme input case without problem.
>
>It sure looks to me like a bug that is triggered by PME. What is the
>protocol for submitting a bug report? I'd be happy to provide
>whatever debugging info would be helpful.
>
>I'm also curious; what am I giving up by using gromacs-3.2.1 rather than 3.3?
>
>I apologize if this is the incorrect list for this message; perhaps it
>should have gone to the developers list directly. Please let me know
>if that is the case.
>
>Rob Bjornson
>
><begin debugging info>
>
>Here is a sample stack trace from a sequential run on a 32-bit Xeon:
>
>#0 0x0807e46b in spread_q_bsplines (grid=0x82fcf68, idx=0x424c9008,
>charge=0x40c82008, theta=0x828b524, nr=72240, order=6, nnx=0x8330c80,
>nny=0x8331138,
> nnz=0x83315f0) at pme.c:527
>#1 0x080814fd in spread_on_grid (logfile=0x8291068, grid=0x82fcf68,
>homenr=72240, pme_order=6, x=0x409be008, charge=0x40c82008,
>box=0x82911dc,
> bGatherOnly=0, bHaveSplines=0) at pme.c:1180
>#2 0x080817da in do_pme (logfile=0x8291068, bVerbose=0, ir=0x82a2ee0,
>x=0x409be008, f=0x417ba008, chargeA=0x40c82008, chargeB=0x40cc9008,
>box=0x82911dc,
> cr=0x8291008, nsb=0x8292400, nrnb=0xbfffcfe0, vir=0x82fc27c,
>ewaldcoeff=3.47045946, bFreeEnergy=0, lambda=0, dvdlambda=0xbfffca5c,
>bGatherOnly=0)
> at pme.c:1276
>#3 0x0806a83f in force (fplog=0x8291068, step=25, fr=0x82fc178,
>ir=0x82a2ee0, idef=0x8293424, nsb=0x8292400, cr=0x8291008, mcr=0x0,
>nrnb=0xbfffcfe0,
> grps=0x8291ed8, md=0x8291720, ngener=2, opts=0x82a30bc,
>x=0x409be008, f=0x4111a008, epot=0x8291de8, fcd=0x8292318, bVerbose=0,
>box=0x82911dc,
> lambda=0, graph=0x8291858, excl=0x82a1e2c, bNBFonly=0,
>bDoForces=1, mu_tot=0xbfffcb20, bGatherOnly=0, edyn=0xbfffd8d0) at
>force.c:1306
>#4 0x0808f003 in do_force (fplog=0x8291068, cr=0x8291008, mcr=0x0,
>inputrec=0x82a2ee0, nsb=0x8292400, step=25, nrnb=0xbfffcfe0,
>top=0x8293420,
> grps=0x8291ed8, box=0x82911dc, x=0x409be008, f=0x4111a008,
>buf=0x41046008, mdatoms=0x8291720, ener=0x8291de8, fcd=0x8292318,
>bVerbose=0, lambda=0,
> graph=0x8291858, bStateChanged=1, bNS=0, bNBFonly=0, bDoForces=1,
>fr=0x82fc178, mu_tot=0xbfffcfb0, bGatherOnly=0, t=0.0250000004,
>field=0x0,
> edyn=0xbfffd8d0) at sim_util.c:334
>#5 0x08059100 in do_md (log=0x8291068, cr=0x8291008, mcr=0x0,
>nfile=25, fnm=0x82840a0, bVerbose=0, bCompact=1, bVsites=0,
>vsitecomm=0x0, stepout=10,
> inputrec=0x82a2ee0, grps=0x8291ed8, top=0x8293420, ener=0x8291de8,
>fcd=0x8292318, state=0x82911d0, vold=0x412c2008, vt=0x411ee008,
>f=0x4111a008,
> buf=0x41046008, mdatoms=0x8291720, nsb=0x8292400, nrnb=0x82a3188,
>graph=0x8291858, edyn=0xbfffd8d0, fr=0x82fc178, repl_ex_nst=0,
>repl_ex_seed=-1,
> Flags=0) at md.c:622
>#6 0x08057dda in mdrunner (cr=0x8291008, mcr=0x0, nfile=25,
>fnm=0x82840a0, bVerbose=0, bCompact=1, nDlb=0, nstepout=10,
>edyn=0xbfffd8d0, repl_ex_nst=0,
> repl_ex_seed=-1, Flags=0) at md.c:227
>#7 0x0805ad10 in main (argc=3, argv=0xbfffd984) at mdrun.c:253
>
>Examining things under gdb revealed that one element of the idxptr
>array appeared to have been corrupted:
>
> (gdb) print idxptr[0]
>$33 = 1062338964
>(gdb) print idxptr[1]
>$34 = 12
>(gdb) print idxptr[2]
>$35 = 32
>(gdb) print idxptr[-1]
>$36 = 24
>(gdb) print idxptr[-2]
>$37 = 59
>
>(gdb) print nx
>$38 = 60
>(gdb) print ny
>$39 = 60
>(gdb) print nz
>$40 = 60
>
>Here is the code in question (pme.c). The segmentation fault occurs on line
>527, after xidx was (apparently erroneously) set to a very large value
>on line 515. Note that DEBUG wasn't defined for my compilation.
>
>510 for(n=0; (n<nr); n++) {
>511 qn = charge[n];
>512 idxptr = idx[n];
>513
>514 if (qn != 0) {
>515 xidx = idxptr[XX];
>516 yidx = idxptr[YY];
>517 zidx = idxptr[ZZ];
>518 #ifdef DEBUG
>519 range_check(xidx,0,nx);
>520 range_check(yidx,0,ny);
>521 range_check(zidx,0,nz);
>522 #endif
>523 i0 = ii0+xidx; /* Pointer arithmetic */
>524 norder = n*4;
>525 norder1 = norder+4;
>526
>527 i = ii0[xidx];
>528 j = jj0[yidx];
>529 k = kk0[zidx];
>530
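
About the DEBUG note above: if the range checks on lines 519-521 had been
compiled in, the corrupted index would have been reported cleanly instead of
crashing at line 527. A small standalone sketch of that failure mode (not
GROMACS code; the array size and the check are just stand-ins) looks like
this:

  #include <stdio.h>
  #include <stdlib.h>

  enum { NX = 60 };                     /* the nx value seen in gdb */

  static void range_check(int i, int lo, int hi, const char *name)
  {
      if (i < lo || i >= hi) {
          fprintf(stderr, "%s = %d is outside [%d,%d)\n", name, i, lo, hi);
          exit(1);
      }
  }

  int main(void)
  {
      int nnx[3*NX];                    /* stand-in for the wrap-around index table */
      int xidx = 1062338964;            /* the corrupted value reported by gdb      */
      int k, i;

      for (k = 0; k < 3*NX; k++)
          nnx[k] = k % NX;

      range_check(xidx, 0, NX, "xidx"); /* with checking: clean abort and a message  */
      i = nnx[xidx];                    /* without it: wild read, typically a segfault */
      printf("i = %d\n", i);
      return 0;
  }

So rebuilding with DEBUG defined (or making that range check unconditional
while you hunt this down) would at least turn the crash into a readable
error, though the real question is still what overwrote idx[] in the first
place.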