[gmx-developers] ERROR: Catastrophe in realloc: invalid storage ptr error on IBM Power

Marc Baaden baaden at smplinux.de
Thu May 27 13:26:10 CEST 2004


Hi,

  some time ago I posted about an error obtained when running Gromacs 3.2.1
on the Power4 architecture
(http://www.gromacs.org/pipermail/gmx-users/2004-May/010427.html). Several
people suggested trying a debugger. This was not easy on the supercomputer,
because the queuing system limits interactive use, but the people
from the supercomputer center were kind enough to run some tests for me.
From my limited knowledge of the Gromacs source code it is not obvious why
this error occurs, but I thought I would share the details, as there are
certainly people on this list who know the Gromacs code much better than I do.


There are two debug outputs below, from Gromacs builds compiled with different
optimization levels:
gmx_mpi_out_v0.txt    --> optimized version (-O3 etc...)
gmx_mpi_out_vdbg0.txt --> debugging version (nooptimize -g etc...)

The error message is the same in both cases:
Fatal error: realloc for _buf2 (20848 bytes, file fnbf.c, line 259, _buf2=0x0x16746090)
Fatal error: realloc for _buf2 (23296 bytes, file fnbf.c, line 259, _buf2=0x0x1700bcb0)

It seems to occur in this chunk of code (fnbf.c):
...
#ifdef USE_LOCAL_BUFFERS
      /* make sure buffers can hold the longest neighbourlist */
      if (nlist->solvent==esolWATERWATER)			
	sz = 9*nlist->maxlen;
      else if (nlist->solvent==esolWATER) 
	sz = 3*nlist->maxlen;
      else	
        sz = nlist->maxlen;

      if (sz>buflen) {
	buflen=(sz+100); /* use some extra size to avoid reallocating next step */
    	srenew(drbuf,3*buflen);
    	srenew(_buf1,buflen+31);
    	srenew(_buf2,buflen+31);   /* <-- line 259 */
        /* make cache aligned buffer pointers */
        buf1=(real *) ( ( (unsigned long int)_buf1 + 31 ) & (~0x1f) );	 
        buf2=(real *) ( ( (unsigned long int)_buf2 + 31 ) & (~0x1f) );	 
      }	
#endif
...
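
For readers following the pointer arithmetic: the excerpt keeps two pointers
per buffer, the raw one (_buf1, _buf2) that is handed to srenew, and a
32-byte-aligned one (buf1, buf2) derived from it by adding 31 and masking the
low five bits. Below is a minimal standalone sketch of that pattern, not the
Gromacs code. The names aligned_buf, align_up, aligned_buf_grow and
aligned_buf_free are hypothetical, plain realloc stands in for srenew, and
uintptr_t is used where the excerpt casts to unsigned long int. It only
illustrates the rule that the reallocation call must always receive the raw
pointer, never the aligned one; it does not claim to identify the cause of
the error reported here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* 32-byte alignment, matching the "+ 31" and "& ~0x1f" in the excerpt. */
#define CACHE_ALIGN 32u

/* Hypothetical holder for one scratch buffer (not a Gromacs type).
 * raw     : exactly what realloc returned; the only pointer that may be
 *           passed back to realloc or free.
 * aligned : raw rounded up to a CACHE_ALIGN boundary; used for data access. */
typedef struct {
    void   *raw;
    double *aligned;
    size_t  nelem;
} aligned_buf;

/* Round a pointer up to the next CACHE_ALIGN-byte boundary. */
static double *align_up(void *p)
{
    uintptr_t addr = (uintptr_t)p;

    /* The mask trick only works for power-of-two alignments. */
    assert((CACHE_ALIGN & (CACHE_ALIGN - 1u)) == 0u);
    addr = (addr + (CACHE_ALIGN - 1u)) & ~(uintptr_t)(CACHE_ALIGN - 1u);
    return (double *)addr;
}

/* Grow the buffer so that `aligned` can hold nelem doubles.
 * Assumption: the buffer is scratch space, so old contents need not be
 * preserved (the alignment offset may change when the block moves).
 * Returns 0 on success, -1 if realloc fails (buffer left unchanged). */
static int aligned_buf_grow(aligned_buf *b, size_t nelem)
{
    /* Over-allocate by CACHE_ALIGN-1 bytes so rounding up stays in bounds. */
    void *p = realloc(b->raw, nelem * sizeof(double) + (CACHE_ALIGN - 1u));

    if (p == NULL) {
        return -1;
    }
    b->raw     = p;            /* keep the raw pointer for the next realloc */
    b->aligned = align_up(p);  /* recompute the aligned view every time     */
    b->nelem   = nelem;
    return 0;
}

/* Release the buffer through the raw pointer and reset the holder. */
static void aligned_buf_free(aligned_buf *b)
{
    free(b->raw);
    b->raw     = NULL;
    b->aligned = NULL;
    b->nelem   = 0;
}
```

A caller would start from an aligned_buf initialised to {NULL, NULL, 0}, call
aligned_buf_grow whenever the required size exceeds nelem, and read or write
only through the aligned member.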

If anybody has a hint on what is going on there, I'd love to hear about it.

I also wonder whether other people are using gmx 3.2.1 on Power4 or whether
I am the only one. I should point out that I tried this at two different
supercomputer centers, i.e. on two independently set up Power4 systems, and
got the same error message on both.

Thanks in advance,

  Marc

NB: The error occurs whether I run on 1, 2, 4 or 8 processors, and I have also
tried different amounts of allocated memory (up to several gigabytes). The
error message is always the same.
Also, after this error Gromacs keeps running (using CPU) but produces no output.


-------------- next part --------------
llsubmit: Processed command file through Submit Filter: "/usr/local/loadl/Fidris/llsubmit_exit".
NNODES=2, MYRANK=1, HOSTNAME=zahir011
NNODES=2, MYRANK=0, HOSTNAME=zahir011
NODEID=0 argc=16
NODEID=1 argc=16
                         :-)  G  R  O  M  A  C  S  (-:

           Glycine aRginine prOline Methionine Alanine Cystine Serine

                            :-)  VERSION 3.2.1  (-:


      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
          modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
             of the License, or (at your option) any later version.

                     :-)  mdrun_mpi (double precision)  (-:

Option     Filename  Type         Description
------------------------------------------------------------
  -s umb_prod01.tpr  Input        Generic run input: tpr tpb tpa xml
  -o umb_prod01.trr  Output       Full precision trajectory: trr trj
  -x umb_prod01.xtc  Output, Opt. Compressed trajectory (portable xdr format)
  -c umb_prod01.gro  Output       Generic structure: gro g96 pdb xml
  -e umb_prod01.edr  Output       Generic energy: edr ene
  -g umb_prod01.log  Output       Log file
-dgdl umb_prod01.xvg  Output, Opt. xvgr/xmgr file
-field umb_prod01.xvg  Output, Opt. xvgr/xmgr file
-table umb_prod01.xvg  Input, Opt.  xvgr/xmgr file
-rerun umb_prod01.xtc  Input, Opt.  Generic trajectory: xtc trr trj gro g96
                                   pdb
 -ei umb_prod01.edi  Input, Opt.  ED sampling input
 -eo umb_prod01.edo  Output, Opt. ED sampling output
  -j umb_prod01.gct  Input, Opt.  General coupling stuff
 -jo umb_prod01.gct  Output, Opt. General coupling stuff
-ffout umb_prod01.xvg  Output, Opt. xvgr/xmgr file
-devout umb_prod01.xvg  Output, Opt. xvgr/xmgr file
-runav umb_prod01.xvg  Output, Opt. xvgr/xmgr file
 -pi   umbrella.ppa  Input, Opt!  Pull parameters
 -po umb_prod01.ppa  Output, Opt. Pull parameters
 -pd umb_prod01.pdo  Output, Opt. Pull data output
 -pn   umbrella.ndx  Input, Opt!  Index file
-mtx umb_prod01.mtx  Output, Opt. Hessian matrix
 -dn umb_prod01.ndx  Output, Opt. Index file

      Option   Type  Value  Description
------------------------------------------------------
      -[no]h   bool     no  Print help info and quit
      -[no]X   bool     no  Use dialog box GUI to edit command line options
       -nice    int      0  Set the nicelevel
     -deffnm string umb_prod01  Set the default filename for all file options
         -np    int      2  Number of nodes, must be the same as used for
                            grompp
         -nt    int      1  Number of threads to start on each node
      -[no]v   bool    yes  Be loud and noisy
-[no]compact   bool    yes  Write a compact log file
  -[no]multi   bool     no  Do multiple simulations in parallel (only with
                            -np > 1)
   -[no]glas   bool     no  Do glass simulation with special long range
                            corrections
 -[no]ionize   bool     no  Do a simulation including the effect of an X-Ray
                            bombardment on your system


Back Off! I just backed up umb_prod010.log to ./#umb_prod010.log.2#

Back Off! I just backed up umb_prod011.log to ./#umb_prod011.log.2#
Getting Loaded...
Reading file umb_prod01.tpr, VERSION 3.2.1-lmb (single precision)
Loaded with Money


Back Off! I just backed up umb_prod01.edr to ./#umb_prod01.edr.2#
Reading parameter file umbrella.ppa
Reading parameter file umbrella.ppa

Back Off! I just backed up umb_prod01.ppa to ./#umb_prod01.ppa.2#
Sorry couldn't backup umb_prod01.ppa to ./#umb_prod01.ppa.2#
Groups: pullgrp    refgrp
Using 1 pull groups
Groups: pullgrp    refgrp
Using 1 pull groups
Using distance components 0 0 1
Using distance components 0 0 1

Back Off! I just backed up umb_prod01.pdo to ./#umb_prod01.pdo.2#
read_whole_index: 2 groups total
group 0 (pullgrp) 52 elements
group 1 (refgrp) 715 elements
starting mdrun 'FepA + 209DMPC + 214HOH + 17464SOL + ions @ 310K'
20000 steps,     40.0 ps.

Fatal error: realloc for _buf2 (20848 bytes, file fnbf.c, line 259, _buf2=0x0x16746090)
Fatal error: realloc for _buf2 (23296 bytes, file fnbf.c, line 259, _buf2=0x0x1700bcb0)
ERROR: 0031-250  task 0: Terminated
ERROR: 0031-250  task 1: Terminated
ERROR: 0031-365  LoadLeveler unable to run job, reason:
LoadL_starter: Soft WALL CLOCK limit exceeded. Soft limit 1200, Hard limit 1260
LoadL_starter: Hard WALL CLOCK limit exceeded. Soft limit 1200, Hard limit 1260

-------------- next part --------------
llsubmit: Processed command file through Submit Filter: "/usr/local/loadl/Fidris/llsubmit_exit".
NNODES=2, MYRANK=1, HOSTNAME=zahir011
NNODES=2, MYRANK=0, HOSTNAME=zahir011
NODEID=1 argc=16
NODEID=0 argc=16
                         :-)  G  R  O  M  A  C  S  (-:

                  Green Red Orange Magenta Azure Cyan Skyblue

                            :-)  VERSION 3.2.1  (-:


      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
          modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
             of the License, or (at your option) any later version.

                     :-)  mdrun_mpi (double precision)  (-:

Option     Filename  Type         Description
------------------------------------------------------------
  -s umb_prod01.tpr  Input        Generic run input: tpr tpb tpa xml
  -o umb_prod01.trr  Output       Full precision trajectory: trr trj
  -x umb_prod01.xtc  Output, Opt. Compressed trajectory (portable xdr format)
  -c umb_prod01.gro  Output       Generic structure: gro g96 pdb xml
  -e umb_prod01.edr  Output       Generic energy: edr ene
  -g umb_prod01.log  Output       Log file
-dgdl umb_prod01.xvg  Output, Opt. xvgr/xmgr file
-field umb_prod01.xvg  Output, Opt. xvgr/xmgr file
-table umb_prod01.xvg  Input, Opt.  xvgr/xmgr file
-rerun umb_prod01.xtc  Input, Opt.  Generic trajectory: xtc trr trj gro g96
                                   pdb
 -ei umb_prod01.edi  Input, Opt.  ED sampling input
 -eo umb_prod01.edo  Output, Opt. ED sampling output
  -j umb_prod01.gct  Input, Opt.  General coupling stuff
 -jo umb_prod01.gct  Output, Opt. General coupling stuff
-ffout umb_prod01.xvg  Output, Opt. xvgr/xmgr file
-devout umb_prod01.xvg  Output, Opt. xvgr/xmgr file
-runav umb_prod01.xvg  Output, Opt. xvgr/xmgr file
 -pi   umbrella.ppa  Input, Opt!  Pull parameters
 -po umb_prod01.ppa  Output, Opt. Pull parameters
 -pd umb_prod01.pdo  Output, Opt. Pull data output
 -pn   umbrella.ndx  Input, Opt!  Index file
-mtx umb_prod01.mtx  Output, Opt. Hessian matrix
 -dn umb_prod01.ndx  Output, Opt. Index file

      Option   Type  Value  Description
------------------------------------------------------
      -[no]h   bool     no  Print help info and quit
      -[no]X   bool     no  Use dialog box GUI to edit command line options
       -nice    int      0  Set the nicelevel
     -deffnm string umb_prod01  Set the default filename for all file options
         -np    int      2  Number of nodes, must be the same as used for
                            grompp
         -nt    int      1  Number of threads to start on each node
      -[no]v   bool    yes  Be loud and noisy
-[no]compact   bool    yes  Write a compact log file
  -[no]multi   bool     no  Do multiple simulations in parallel (only with
                            -np > 1)
   -[no]glas   bool     no  Do glass simulation with special long range
                            corrections
 -[no]ionize   bool     no  Do a simulation including the effect of an X-Ray
                            bombardment on your system


Back Off! I just backed up umb_prod011.log to ./#umb_prod011.log.1#

Back Off! I just backed up umb_prod010.log to ./#umb_prod010.log.1#
Getting Loaded...
Reading file umb_prod01.tpr, VERSION 3.2.1-lmb (single precision)
Loaded with Money


Back Off! I just backed up umb_prod01.edr to ./#umb_prod01.edr.1#
Reading parameter file umbrella.ppa
Reading parameter file umbrella.ppa

Back Off! I just backed up umb_prod01.ppa to ./#umb_prod01.ppa.1#
Sorry couldn't backup umb_prod01.ppa to ./#umb_prod01.ppa.1#
Groups: pullgrp    refgrp
Using 1 pull groups
Groups: pullgrp    refgrp
Using 1 pull groups
Using distance components 0 0 1
Using distance components 0 0 1

Back Off! I just backed up umb_prod01.pdo to ./#umb_prod01.pdo.1#
read_whole_index: 2 groups total
group 0 (pullgrp) 52 elements
group 1 (refgrp) 715 elements
starting mdrun 'FepA + 209DMPC + 214HOH + 17464SOL + ions @ 310K'
20000 steps,     40.0 ps.

Fatal error: realloc for _buf2 (20848 bytes, file fnbf.c, line 259, _buf2=0x0x16746090)
Fatal error: realloc for _buf2 (23296 bytes, file fnbf.c, line 259, _buf2=0x0x1700bcb0)
ERROR: 0031-250  task 0: Terminated
ERROR: 0031-250  task 1: Terminated
ERROR: 0031-365  LoadLeveler unable to run job, reason:
LoadL_starter: Soft WALL CLOCK limit exceeded. Soft limit 1200, Hard limit 1260
LoadL_starter: Hard WALL CLOCK limit exceeded. Soft limit 1200, Hard limit 1260

-------------- next part --------------
 Dr. Marc Baaden  - Institut de Biologie Physico-Chimique, Paris
 mailto:baaden at smplinux.de      -      http://www.marc-baaden.de
 FAX: +49 697912 39550  -  Tel: +33 15841 5176 ou +33 609 843217

