[gmx-developers] ERROR: Catastrophe in realloc: invalid storage ptr error on IBM Power
Marc Baaden
baaden at smplinux.de
Thu May 27 13:26:10 CEST 2004
Hi,
some time ago I posted about an error obtained when running Gromacs 3.2.1
on the Power4 architecture
(http://www.gromacs.org/pipermail/gmx-users/2004-May/010427.html). Several
people suggested trying a debugger. This was not easy on the supercomputer,
because the queuing system limits interactive use, but the people
at the supercomputer center were kind enough to run some tests for me.
From my limited knowledge of the Gromacs source code it is not obvious why
this error occurs, but I thought I would share the details, as there are certainly
people on this list who have a much better understanding of the Gromacs code.
There are two debug outputs below, from Gromacs compiled with different
optimization levels:
gmx_mpi_out_v0.txt --> optimized version (-O3 etc...)
gmx_mpi_out_vdbg0.txt --> debugging version (nooptimize -g etc...)
The error message is the same in both cases (one line per MPI task):
Fatal error: realloc for _buf2 (20848 bytes, file fnbf.c, line 259, _buf2=0x0x16746090)
Fatal error: realloc for _buf2 (23296 bytes, file fnbf.c, line 259, _buf2=0x0x1700bcb0)
It seems to occur in this chunk of code (fnbf.c):
...
#ifdef USE_LOCAL_BUFFERS
    /* make sure buffers can hold the longest neighbourlist */
    if (nlist->solvent==esolWATERWATER)
      sz = 9*nlist->maxlen;
    else if (nlist->solvent==esolWATER)
      sz = 3*nlist->maxlen;
    else
      sz = nlist->maxlen;
    if (sz>buflen) {
      buflen=(sz+100); /* use some extra size to avoid reallocating next step */
      srenew(drbuf,3*buflen);
      srenew(_buf1,buflen+31);
      srenew(_buf2,buflen+31);   /* <-- line 259 */
      /* make cache aligned buffer pointers */
      buf1=(real *) ( ( (unsigned long int)_buf1 + 31 ) & (~0x1f) );
      buf2=(real *) ( ( (unsigned long int)_buf2 + 31 ) & (~0x1f) );
    }
#endif
...
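For readers who do not know this idiom, here is a minimal standalone sketch of
what the excerpt above appears to do: keep a raw pointer for realloc and derive
a separate 32-byte-aligned pointer for actual use. The names align32,
grow_aligned and demo are hypothetical and not taken from Gromacs, the typedef
of real as double is an assumption for a double-precision build, and plain
realloc stands in for srenew.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef double real;  /* assumption: double-precision build */

/* Hypothetical helper: round a raw pointer up to the next 32-byte boundary. */
static real *align32(void *raw)
{
    return (real *)(((uintptr_t)raw + 31) & ~(uintptr_t)0x1f);
}

/*
 * Hypothetical helper: grow *raw so that n reals fit after alignment.
 * Only the raw pointer is ever handed to realloc(); the aligned pointer is
 * recomputed afterwards. Old contents are not guaranteed to stay at the
 * same offset from the aligned pointer, so treat the buffer as scratch.
 * Returns 0 on success, -1 if realloc() fails (the old block stays valid).
 */
static int grow_aligned(real **raw, real **aligned, size_t n)
{
    void *p = realloc(*raw, n * sizeof(real) + 31);
    if (p == NULL) {
        return -1;
    }
    *raw = p;
    *aligned = align32(p);
    return 0;
}

/* Small self-check: grow twice, as a later step with a longer list would. */
static int demo(void)
{
    real *raw = NULL, *buf = NULL;

    if (grow_aligned(&raw, &buf, 100) != 0) {
        return -1;
    }
    assert(((uintptr_t)buf & 0x1f) == 0);
    buf[99] = 1.0;

    if (grow_aligned(&raw, &buf, 1000) != 0) {
        free(raw);
        return -1;
    }
    assert(((uintptr_t)buf & 0x1f) == 0);
    buf[999] = 2.0;

    free(raw);  /* free the raw pointer, never the aligned one */
    return 0;
}
```

In this sketch the aligned pointer is never passed to realloc or free; doing so
would be one way to provoke an invalid-pointer complaint from the allocator.
The excerpt itself also passes only _buf1 and _buf2 to srenew, so the sketch
illustrates the idiom and does not by itself explain the reported failure.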
If anybody has a hint about what is going on there, I'd love to hear about it.
I also wonder whether other people are using gmx 3.2.1 on Power4 or whether
I am the only one. I should point out that I tried this at two different
supercomputer centers, i.e. on two independently set-up Power4 systems, with
the same error message.
Thanks in advance,
Marc
NB: The error occurs whether I run on 1, 2, 4 or 8 processors, and I have also
tried allocating different amounts of memory (up to several gigabytes). The
error message is always the same.
Also, after this error Gromacs keeps running (using CPU), but produces no
further output.
-------------- next part --------------
llsubmit: Processed command file through Submit Filter: "/usr/local/loadl/Fidris/llsubmit_exit".
NNODES=2, MYRANK=1, HOSTNAME=zahir011
NNODES=2, MYRANK=0, HOSTNAME=zahir011
NODEID=0 argc=16
NODEID=1 argc=16
:-) G R O M A C S (-:
Glycine aRginine prOline Methionine Alanine Cystine Serine
:-) VERSION 3.2.1 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2004, The GROMACS development team,
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) mdrun_mpi (double precision) (-:
Option   Filename        Type          Description
------------------------------------------------------------
-s       umb_prod01.tpr  Input         Generic run input: tpr tpb tpa xml
-o       umb_prod01.trr  Output        Full precision trajectory: trr trj
-x       umb_prod01.xtc  Output, Opt.  Compressed trajectory (portable xdr format)
-c       umb_prod01.gro  Output        Generic structure: gro g96 pdb xml
-e       umb_prod01.edr  Output        Generic energy: edr ene
-g       umb_prod01.log  Output        Log file
-dgdl    umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-field   umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-table   umb_prod01.xvg  Input, Opt.   xvgr/xmgr file
-rerun   umb_prod01.xtc  Input, Opt.   Generic trajectory: xtc trr trj gro g96 pdb
-ei      umb_prod01.edi  Input, Opt.   ED sampling input
-eo      umb_prod01.edo  Output, Opt.  ED sampling output
-j       umb_prod01.gct  Input, Opt.   General coupling stuff
-jo      umb_prod01.gct  Output, Opt.  General coupling stuff
-ffout   umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-devout  umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-runav   umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-pi      umbrella.ppa    Input, Opt!   Pull parameters
-po      umb_prod01.ppa  Output, Opt.  Pull parameters
-pd      umb_prod01.pdo  Output, Opt.  Pull data output
-pn      umbrella.ndx    Input, Opt!   Index file
-mtx     umb_prod01.mtx  Output, Opt.  Hessian matrix
-dn      umb_prod01.ndx  Output, Opt.  Index file
Option        Type    Value       Description
------------------------------------------------------
-[no]h        bool    no          Print help info and quit
-[no]X        bool    no          Use dialog box GUI to edit command line options
-nice         int     0           Set the nicelevel
-deffnm       string  umb_prod01  Set the default filename for all file options
-np           int     2           Number of nodes, must be the same as used for grompp
-nt           int     1           Number of threads to start on each node
-[no]v        bool    yes         Be loud and noisy
-[no]compact  bool    yes         Write a compact log file
-[no]multi    bool    no          Do multiple simulations in parallel (only with -np > 1)
-[no]glas     bool    no          Do glass simulation with special long range corrections
-[no]ionize   bool    no          Do a simulation including the effect of an X-Ray bombardment on your system
Back Off! I just backed up umb_prod010.log to ./#umb_prod010.log.2#
Back Off! I just backed up umb_prod011.log to ./#umb_prod011.log.2#
Getting Loaded...
Reading file umb_prod01.tpr, VERSION 3.2.1-lmb (single precision)
Loaded with Money
Back Off! I just backed up umb_prod01.edr to ./#umb_prod01.edr.2#
Reading parameter file umbrella.ppa
Reading parameter file umbrella.ppa
Back Off! I just backed up umb_prod01.ppa to ./#umb_prod01.ppa.2#
Sorry couldn't backup umb_prod01.ppa to ./#umb_prod01.ppa.2#
Groups: pullgrp refgrp
Using 1 pull groups
Groups: pullgrp refgrp
Using 1 pull groups
Using distance components 0 0 1
Using distance components 0 0 1
Back Off! I just backed up umb_prod01.pdo to ./#umb_prod01.pdo.2#
read_whole_index: 2 groups total
group 0 (pullgrp) 52 elements
group 1 (refgrp) 715 elements
starting mdrun 'FepA + 209DMPC + 214HOH + 17464SOL + ions @ 310K'
20000 steps, 40.0 ps.
Fatal error: realloc for _buf2 (20848 bytes, file fnbf.c, line 259, _buf2=0x0x16746090)
Fatal error: realloc for _buf2 (23296 bytes, file fnbf.c, line 259, _buf2=0x0x1700bcb0)
ERROR: 0031-250 task 0: Terminated
ERROR: 0031-250 task 1: Terminated
ERROR: 0031-365 LoadLeveler unable to run job, reason:
LoadL_starter: Soft WALL CLOCK limit exceeded. Soft limit 1200, Hard limit 1260
LoadL_starter: Hard WALL CLOCK limit exceeded. Soft limit 1200, Hard limit 1260
-------------- next part --------------
llsubmit: Processed command file through Submit Filter: "/usr/local/loadl/Fidris/llsubmit_exit".
NNODES=2, MYRANK=1, HOSTNAME=zahir011
NNODES=2, MYRANK=0, HOSTNAME=zahir011
NODEID=1 argc=16
NODEID=0 argc=16
:-) G R O M A C S (-:
Green Red Orange Magenta Azure Cyan Skyblue
:-) VERSION 3.2.1 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2004, The GROMACS development team,
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) mdrun_mpi (double precision) (-:
Option   Filename        Type          Description
------------------------------------------------------------
-s       umb_prod01.tpr  Input         Generic run input: tpr tpb tpa xml
-o       umb_prod01.trr  Output        Full precision trajectory: trr trj
-x       umb_prod01.xtc  Output, Opt.  Compressed trajectory (portable xdr format)
-c       umb_prod01.gro  Output        Generic structure: gro g96 pdb xml
-e       umb_prod01.edr  Output        Generic energy: edr ene
-g       umb_prod01.log  Output        Log file
-dgdl    umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-field   umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-table   umb_prod01.xvg  Input, Opt.   xvgr/xmgr file
-rerun   umb_prod01.xtc  Input, Opt.   Generic trajectory: xtc trr trj gro g96 pdb
-ei      umb_prod01.edi  Input, Opt.   ED sampling input
-eo      umb_prod01.edo  Output, Opt.  ED sampling output
-j       umb_prod01.gct  Input, Opt.   General coupling stuff
-jo      umb_prod01.gct  Output, Opt.  General coupling stuff
-ffout   umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-devout  umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-runav   umb_prod01.xvg  Output, Opt.  xvgr/xmgr file
-pi      umbrella.ppa    Input, Opt!   Pull parameters
-po      umb_prod01.ppa  Output, Opt.  Pull parameters
-pd      umb_prod01.pdo  Output, Opt.  Pull data output
-pn      umbrella.ndx    Input, Opt!   Index file
-mtx     umb_prod01.mtx  Output, Opt.  Hessian matrix
-dn      umb_prod01.ndx  Output, Opt.  Index file
Option        Type    Value       Description
------------------------------------------------------
-[no]h        bool    no          Print help info and quit
-[no]X        bool    no          Use dialog box GUI to edit command line options
-nice         int     0           Set the nicelevel
-deffnm       string  umb_prod01  Set the default filename for all file options
-np           int     2           Number of nodes, must be the same as used for grompp
-nt           int     1           Number of threads to start on each node
-[no]v        bool    yes         Be loud and noisy
-[no]compact  bool    yes         Write a compact log file
-[no]multi    bool    no          Do multiple simulations in parallel (only with -np > 1)
-[no]glas     bool    no          Do glass simulation with special long range corrections
-[no]ionize   bool    no          Do a simulation including the effect of an X-Ray bombardment on your system
Back Off! I just backed up umb_prod011.log to ./#umb_prod011.log.1#
Back Off! I just backed up umb_prod010.log to ./#umb_prod010.log.1#
Getting Loaded...
Reading file umb_prod01.tpr, VERSION 3.2.1-lmb (single precision)
Loaded with Money
Back Off! I just backed up umb_prod01.edr to ./#umb_prod01.edr.1#
Reading parameter file umbrella.ppa
Reading parameter file umbrella.ppa
Back Off! I just backed up umb_prod01.ppa to ./#umb_prod01.ppa.1#
Sorry couldn't backup umb_prod01.ppa to ./#umb_prod01.ppa.1#
Groups: pullgrp refgrp
Using 1 pull groups
Groups: pullgrp refgrp
Using 1 pull groups
Using distance components 0 0 1
Using distance components 0 0 1
Back Off! I just backed up umb_prod01.pdo to ./#umb_prod01.pdo.1#
read_whole_index: 2 groups total
group 0 (pullgrp) 52 elements
group 1 (refgrp) 715 elements
starting mdrun 'FepA + 209DMPC + 214HOH + 17464SOL + ions @ 310K'
20000 steps, 40.0 ps.
Fatal error: realloc for _buf2 (20848 bytes, file fnbf.c, line 259, _buf2=0x0x16746090)
Fatal error: realloc for _buf2 (23296 bytes, file fnbf.c, line 259, _buf2=0x0x1700bcb0)
ERROR: 0031-250 task 0: Terminated
ERROR: 0031-250 task 1: Terminated
ERROR: 0031-365 LoadLeveler unable to run job, reason:
LoadL_starter: Soft WALL CLOCK limit exceeded. Soft limit 1200, Hard limit 1260
LoadL_starter: Hard WALL CLOCK limit exceeded. Soft limit 1200, Hard limit 1260
-------------- next part --------------
Dr. Marc Baaden - Institut de Biologie Physico-Chimique, Paris
mailto:baaden at smplinux.de - http://www.marc-baaden.de
FAX: +49 697912 39550 - Tel: +33 15841 5176 ou +33 609 843217