[gmx-users] Replica Exchange MD on more than 64 processors
bharat v. adkar
bharat at sscu.iisc.ernet.in
Mon Dec 28 10:00:37 CET 2009
On Mon, 28 Dec 2009, David van der Spoel wrote:
> bharat v. adkar wrote:
>> On Mon, 28 Dec 2009, Mark Abraham wrote:
>>
>> > bharat v. adkar wrote:
>> > > On Sun, 27 Dec 2009, Mark Abraham wrote:
>> > >
>> > > > bharat v. adkar wrote:
>> > > > > On Sun, 27 Dec 2009, Mark Abraham wrote:
>> > > > > > bharat v. adkar wrote:
>> > > > > > > Dear all,
>> > > > > > >
>> > > > > > > I am trying to perform replica exchange MD (REMD) on a 'protein
>> > > > > > > in water' system. I am following the instructions given on the
>> > > > > > > wiki (How-Tos -> REMD). I have to perform the REMD simulation
>> > > > > > > with 35 different temperatures. As per the advice on the wiki,
>> > > > > > > I equilibrated the system at the respective temperatures (a
>> > > > > > > total of 35 equilibration simulations). After this I generated
>> > > > > > > chk_0.tpr, chk_1.tpr, ..., chk_34.tpr files from the
>> > > > > > > equilibrated structures.
>> > > > > > >
>> > > > > > > Now when I submit the final REMD job with the following command
>> > > > > > > line, it gives an error:
>> > > > > > >
>> > > > > > > command line: mpiexec -np 70 mdrun -multi 35 -replex 1000 -s chk_.tpr -v
>> > > > > > >
>> > > > > > > error msg:
>> > > > > > > -------------------------------------------------------
>> > > > > > > Program mdrun_mpi, VERSION 4.0.7
>> > > > > > > Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179
>> > > > > > >
>> > > > > > > Fatal error:
>> > > > > > > Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
>> > > > > > > nlist->jjnr=0x9a400030
>> > > > > > > (called from file ../../../SRC/src/mdlib/ns.c, line 503)
>> > > > > > > -------------------------------------------------------
>> > > > > > >
>> > > > > > > Thanx for Using GROMACS - Have a Nice Day
>> > > > > > > : Cannot allocate memory
>> > > > > > > Error on node 19, will try to stop all the nodes
>> > > > > > > Halting parallel program mdrun_mpi on CPU 19 out of 70
>> > > > > > > ***********************************************************************
>> > > > > > >
>> > > > > > > Each node on the cluster has 8GB of physical memory and 16GB of
>> > > > > > > swap memory. Moreover, when logged onto the individual nodes, it
>> > > > > > > shows more than 1GB of free memory, so there should be no
>> > > > > > > problem with cluster memory. Also, the equilibration jobs for
>> > > > > > > the same system ran on the same cluster without any problem.
>> > > > > > >
>> > > > > > > What I have observed by submitting different test jobs with
>> > > > > > > varying numbers of processors (and numbers of replicas, wherever
>> > > > > > > necessary) is that any job with a total of <= 64 processors runs
>> > > > > > > faithfully without any problem. As soon as the total number of
>> > > > > > > processors is more than 64, it gives the above error. I have
>> > > > > > > tested this with 65 processors/65 replicas as well.
>> > > > > >
>> > > > > > This sounds like you might be running on fewer physical CPUs than
>> > > > > > you have available. If so, running multiple MPI processes per
>> > > > > > physical CPU can lead to memory shortage conditions.
>> > > > >
>> > > > > I don't understand what you mean. Do you mean there might be more
>> > > > > than 8 processes running per node (each node has 8 processors)? But
>> > > > > that also does not seem to be the case, as the SGE (sun grid engine)
>> > > > > output shows only eight processes per node.
>> > > >
>> > > > 65 processes can't have 8 processes per node.
>> > >
>> > > Why can't it? As I said, there are 8 processors per node. What I have
>> > > not mentioned is how many nodes it is using. The jobs got distributed
>> > > over 9 nodes, 8 of which correspond to 64 processors, plus 1 processor
>> > > from the 9th node.
>> >
>> > OK, that's a full description. Your symptoms are indicative of someone
>> > making an error somewhere. Since GROMACS works over more than 64
>> > processors elsewhere, the presumption is that you are doing something
>> > wrong or the machine is not set up in the way you think it is or should
>> > be. To get the most effective help, you need to be sure you're providing
>> > full information - else we can't tell which error you're making or
>> > (potentially) eliminate you as a source of error.
>> >
>> Sorry for not being clear in statements.
>>
>> > > As far as I can tell, job distribution seems okay to me. It is 1 job
>> > > per processor.
>> >
>> > Does non-REMD GROMACS run on more than 64 processors? Does your cluster
>> > support using more than 8 nodes in a run? Can you run an MPI "Hello
>> > world" application that prints the processor and node ID across more
>> > than 64 processors?
>>
>> Yes, the cluster supports runs with more than 8 nodes. I generated a
>> system with 10 nm water box and submitted on 80 processors. It was running
>> fine. It printed all 80 NODEIDs. Also showed me when the job will get
>> over.
>>
>> bharat
>>
>>
>> >
>> > Mark
>> >
>> >
>> > > bharat
>> > >
>> > > > Mark
>> > > >
>> > > > > > I don't know what you mean by "swap memory".
>> > > > >
>> > > > > Sorry, I meant cache memory.
>> > > > >
>> > > > > bharat
>> > > > >
>> > > > > > Mark
>> > > > > >
>> > > > > > > System: Protein + water + Na ions (total 46878 atoms)
>> > > > > > > Gromacs version: tested with both v4.0.5 and v4.0.7
>> > > > > > > compiled with: --enable-float --with-fft=fftw3 --enable-mpi
>> > > > > > > compiler: gcc_3.4.6 -O3
>> > > > > > > machine details: uname -mpio: x86_64 x86_64 x86_64 GNU/Linux
>> > > > > > >
>> > > > > > > I tried searching the mailing list without any luck. I am not
>> > > > > > > sure if I am doing anything wrong in giving the commands.
>> > > > > > > Please correct me if it is wrong.
>> > > > > > >
>> > > > > > > Kindly let me know the solution.
>> > > > > > >
>> > > > > > > bharat
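(For reference: a minimal sketch of the tpr-generation step described in the
quoted post, i.e. one run input per replica temperature. The .mdp and
equilibrated .gro file names used here are assumed, not taken from the
original post:

  # build chk_0.tpr ... chk_34.tpr, one per replica temperature (file names assumed)
  for i in $(seq 0 34); do
      grompp -f chk_${i}.mdp -c equil_${i}.gro -p topol.top -o chk_${i}.tpr
  done

With -multi 35 and -s chk_.tpr, mdrun then reads chk_0.tpr for replica 0,
chk_1.tpr for replica 1, and so on, which matches the "Reading file
chk_N.tpr" lines in the pasted output further down.)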
> Your system is going out of memory: probably too big a system, or all
> replicas are running on the same node.
From the MPI output it doesn't seem that all the replicas, or even more than
one replica, are running on a single processor. Regarding the system, it ran
successfully during equilibration.

I am pasting below the stderr output of one of the jobs with the number of
processors = 66; please also check the attached file "ToAttach.txt".
Again, as a reminder, the cluster here has 8 processors per compute-node.

bharat
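(As a quick cross-check of that placement claim, the NNODES/MYRANK lines in
the pasted output can be counted per host. A sketch, assuming the paste is
saved as the attached ToAttach.txt and standard grep/sed/sort/uniq are
available:

  # count how many MPI ranks landed on each compute node; no host should appear more than 8 times
  grep 'HOSTNAME=' ToAttach.txt | sed 's/.*HOSTNAME=//' | sort | uniq -c
)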
Command line: mpirun -np 66 mdrun -multi 33 -replex 1000 -s chk_.tpr -cpi chkpt -cpt 30 -cpo chkpt
Output:
NNODES=66, MYRANK=0, HOSTNAME=compute-0-2.local
NNODES=66, MYRANK=1, HOSTNAME=compute-0-2.local
NNODES=66, MYRANK=4, HOSTNAME=compute-0-2.local
NNODES=66, MYRANK=3, HOSTNAME=compute-0-2.local
NNODES=66, MYRANK=9, HOSTNAME=compute-0-3.local
NNODES=66, MYRANK=2, HOSTNAME=compute-0-2.local
NNODES=66, MYRANK=5, HOSTNAME=compute-0-2.local
NNODES=66, MYRANK=6, HOSTNAME=compute-0-2.local
NNODES=66, MYRANK=13, HOSTNAME=compute-0-3.local
NNODES=66, MYRANK=11, HOSTNAME=compute-0-3.local
NNODES=66, MYRANK=12, HOSTNAME=compute-0-3.local
NNODES=66, MYRANK=14, HOSTNAME=compute-0-3.local
NNODES=66, MYRANK=28, HOSTNAME=compute-0-12.local
NNODES=66, MYRANK=10, HOSTNAME=compute-0-3.local
NNODES=66, MYRANK=20, HOSTNAME=compute-0-6.local
NNODES=66, MYRANK=21, HOSTNAME=compute-0-6.local
NNODES=66, MYRANK=23, HOSTNAME=compute-0-6.local
NNODES=66, MYRANK=25, HOSTNAME=compute-0-12.local
NNODES=66, MYRANK=26, HOSTNAME=compute-0-12.local
NNODES=66, MYRANK=30, HOSTNAME=compute-0-12.local
NNODES=66, MYRANK=29, HOSTNAME=compute-0-12.local
NNODES=66, MYRANK=24, HOSTNAME=compute-0-12.local
NNODES=66, MYRANK=7, HOSTNAME=compute-0-2.local
NNODES=66, MYRANK=8, HOSTNAME=compute-0-3.local
NNODES=66, MYRANK=18, HOSTNAME=compute-0-6.local
NNODES=66, MYRANK=58, HOSTNAME=compute-0-27.local
NNODES=66, MYRANK=19, HOSTNAME=compute-0-6.local
NNODES=66, MYRANK=22, HOSTNAME=compute-0-6.local
NNODES=66, MYRANK=47, HOSTNAME=compute-0-11.local
NNODES=66, MYRANK=62, HOSTNAME=compute-0-27.local
NNODES=66, MYRANK=61, HOSTNAME=compute-0-27.local
NNODES=66, MYRANK=51, HOSTNAME=compute-0-24.local
NNODES=66, MYRANK=42, HOSTNAME=compute-0-11.local
NNODES=66, MYRANK=41, HOSTNAME=compute-0-11.local
NNODES=66, MYRANK=57, HOSTNAME=compute-0-27.local
NNODES=66, MYRANK=17, HOSTNAME=compute-0-6.local
NNODES=66, MYRANK=38, HOSTNAME=compute-0-15.local
NNODES=66, MYRANK=37, HOSTNAME=compute-0-15.local
NNODES=66, MYRANK=39, HOSTNAME=compute-0-15.local
NNODES=66, MYRANK=40, HOSTNAME=compute-0-11.local
NNODES=66, MYRANK=45, HOSTNAME=compute-0-11.local
NNODES=66, MYRANK=46, HOSTNAME=compute-0-11.local
NNODES=66, MYRANK=43, HOSTNAME=compute-0-11.local
NNODES=66, MYRANK=44, HOSTNAME=compute-0-11.local
NNODES=66, MYRANK=49, HOSTNAME=compute-0-24.local
NNODES=66, MYRANK=50, HOSTNAME=compute-0-24.local
NNODES=66, MYRANK=48, HOSTNAME=compute-0-24.local
NNODES=66, MYRANK=53, HOSTNAME=compute-0-24.local
NNODES=66, MYRANK=54, HOSTNAME=compute-0-24.local
NNODES=66, MYRANK=52, HOSTNAME=compute-0-24.local
NNODES=66, MYRANK=27, HOSTNAME=compute-0-12.local
NNODES=66, MYRANK=60, HOSTNAME=compute-0-27.local
NNODES=66, MYRANK=59, HOSTNAME=compute-0-27.local
NNODES=66, MYRANK=16, HOSTNAME=compute-0-6.local
NNODES=66, MYRANK=34, HOSTNAME=compute-0-15.local
NNODES=66, MYRANK=33, HOSTNAME=compute-0-15.local
NNODES=66, MYRANK=36, HOSTNAME=compute-0-15.local
NNODES=66, MYRANK=35, HOSTNAME=compute-0-15.local
NNODES=66, MYRANK=56, HOSTNAME=compute-0-27.local
NNODES=66, MYRANK=15, HOSTNAME=compute-0-3.local
NNODES=66, MYRANK=31, HOSTNAME=compute-0-12.local
NNODES=66, MYRANK=63, HOSTNAME=compute-0-27.local
NNODES=66, MYRANK=64, HOSTNAME=compute-0-25.local
NNODES=66, MYRANK=55, HOSTNAME=compute-0-24.local
NNODES=66, MYRANK=32, HOSTNAME=compute-0-15.local
NNODES=66, MYRANK=65, HOSTNAME=compute-0-25.local
NODEID=0 argc=19
NODEID=1 argc=19
NODEID=2 argc=19
NODEID=3 argc=19
NODEID=5 argc=19
NODEID=4 argc=19
NODEID=6 argc=19
NODEID=13 argc=19
NODEID=9 argc=19
NODEID=12 argc=19
NODEID=11 argc=19
NODEID=7 argc=19
NODEID=16 argc=19
NODEID=10 argc=19
NODEID=15 argc=19
NODEID=8 argc=19
NODEID=14 argc=19
NODEID=20 argc=19
NODEID=19 argc=19
NODEID=28 argc=19
NODEID=25 argc=19
NODEID=26 argc=19
NODEID=18 argc=19
NODEID=17 argc=19
NODEID=22 argc=19
NODEID=21 argc=19
NODEID=24 argc=19
NODEID=23 argc=19
NODEID=30 argc=19
NODEID=29 argc=19
NODEID=34 argc=19
NODEID=33 argc=19
NODEID=27 argc=19
NODEID=57 argc=19
NODEID=58 argc=19
NODEID=51 argc=19
NODEID=52 argc=19
NODEID=41 argc=19
NODEID=42 argc=19
NODEID=39 argc=19
NODEID=40 argc=19
NODEID=37 argc=19
NODEID=38 argc=19
NODEID=36 argc=19
NODEID=35 argc=19
NODEID=61 argc=19
NODEID=62 argc=19
NODEID=49 argc=19
NODEID=48 argc=19
NODEID=47 argc=19
NODEID=56 argc=19
NODEID=45 argc=19
NODEID=46 argc=19
NODEID=44 argc=19
NODEID=43 argc=19
NODEID=54 argc=19
NODEID=53 argc=19
NODEID=55 argc=19
NODEID=59 argc=19
NODEID=60 argc=19
NODEID=31 argc=19
NODEID=64 argc=19
NODEID=63 argc=19
NODEID=50 argc=19
NODEID=32 argc=19
NODEID=65 argc=19
:-) G R O M A C S (-:
Groningen Machine for Chemical Simulation
:-) VERSION 4.0.7 (-:
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2008, The GROMACS development team,
check out http://www.gromacs.org for more information.
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
:-) /groupmisc/bharat/soft/GMX407_bh/INSTL/bin/mdrun_mpi (-:
 Option      Filename      Type          Description
------------------------------------------------------------
     -s      chk_.tpr      Input         Run input file: tpr tpb tpa
     -o      traj.trr      Output        Full precision trajectory: trr trj cpt
     -x      traj.xtc      Output, Opt!  Compressed trajectory (portable xdr format)
   -cpi      chkpt.cpt     Input, Opt!   Checkpoint file
   -cpo      chkpt.cpt     Output, Opt!  Checkpoint file
     -c      confout.gro   Output        Structure file: gro g96 pdb
     -e      ener.edr      Output        Energy file: edr ene
     -g      md.log        Output        Log file
  -dgdl      dgdl.xvg      Output, Opt.  xvgr/xmgr file
 -field      field.xvg     Output, Opt.  xvgr/xmgr file
 -table      table.xvg     Input, Opt.   xvgr/xmgr file
-tablep      tablep.xvg    Input, Opt.   xvgr/xmgr file
-tableb      table.xvg     Input, Opt.   xvgr/xmgr file
 -rerun      rerun.xtc     Input, Opt.   Trajectory: xtc trr trj gro g96 pdb cpt
   -tpi      tpi.xvg       Output, Opt.  xvgr/xmgr file
  -tpid      tpidist.xvg   Output, Opt.  xvgr/xmgr file
    -ei      sam.edi       Input, Opt.   ED sampling input
    -eo      sam.edo       Output, Opt.  ED sampling output
     -j      wham.gct      Input, Opt.   General coupling stuff
    -jo      bam.gct       Output, Opt.  General coupling stuff
 -ffout      gct.xvg       Output, Opt.  xvgr/xmgr file
-devout      deviatie.xvg  Output, Opt.  xvgr/xmgr file
 -runav      runaver.xvg   Output, Opt.  xvgr/xmgr file
    -px      pullx.xvg     Output, Opt.  xvgr/xmgr file
    -pf      pullf.xvg     Output, Opt.  xvgr/xmgr file
   -mtx      nm.mtx        Output, Opt.  Hessian matrix
    -dn      dipole.ndx    Output, Opt.  Index file

      Option   Type    Value        Description
------------------------------------------------------
      -[no]h   bool    no           Print help info and quit
       -nice   int     0            Set the nicelevel
     -deffnm   string               Set the default filename for all file options
   -[no]xvgr   bool    yes          Add specific codes (legends etc.) in the output
                                    xvg files for the xmgrace program
     -[no]pd   bool    no           Use particle decompostion
         -dd   vector  0 0 0        Domain decomposition grid, 0 is optimize
       -npme   int     -1           Number of separate nodes to be used for PME,
                                    -1 is guess
    -ddorder   enum    interleave   DD node order: interleave, pp_pme or cartesian
-[no]ddcheck   bool    yes          Check for all bonded interactions with DD
        -rdd   real    0            The maximum distance for bonded interactions
                                    with DD (nm), 0 is determine from initial
                                    coordinates
       -rcon   real    0            Maximum distance for P-LINCS (nm), 0 is estimate
        -dlb   enum    auto         Dynamic load balancing (with DD): auto, no or yes
        -dds   real    0.8          Minimum allowed dlb scaling of the DD cell size
    -[no]sum   bool    yes          Sum the energies at every step
      -[no]v   bool    yes          Be loud and noisy
-[no]compact   bool    yes          Write a compact log file
 -[no]seppot   bool    no           Write separate V and dVdl terms for each
                                    interaction type and node to the log file(s)
     -pforce   real    -1           Print all forces larger than this (kJ/mol nm)
 -[no]reprod   bool    no           Try to avoid optimizations that affect binary
                                    reproducibility
        -cpt   real    30           Checkpoint interval (minutes)
 -[no]append   bool    no           Append to previous output files when continuing
                                    from checkpoint
-[no]addpart   bool    yes          Add the simulation part number to all output
                                    files when continuing from checkpoint
       -maxh   real    -1           Terminate after 0.99 times this time (hours)
      -multi   int     33           Do multiple simulations in parallel
     -replex   int     1000         Attempt replica exchange every # steps
     -reseed   int     -1           Seed for replica exchange, -1 is generate a seed
   -[no]glas   bool    no           Do glass simulation with special long range
                                    corrections
 -[no]ionize   bool    no           Do a simulation including the effect of an X-Ray
                                    bombardment on your system
Getting Loaded...
Getting Loaded...
Reading file chk_0.tpr, VERSION 4.0.5 (single precision)
Getting Loaded...
Reading file chk_32.tpr, VERSION 4.0.5 (single precision)
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Reading file chk_1.tpr, VERSION 4.0.5 (single precision)
Getting Loaded...
Getting Loaded...
Getting Loaded...
Reading file chk_3.tpr, VERSION 4.0.5 (single precision)
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Reading file chk_13.tpr, VERSION 4.0.5 (single precision)
Getting Loaded...
Getting Loaded...
Reading file chk_15.tpr, VERSION 4.0.5 (single precision)
Reading file chk_12.tpr, VERSION 4.0.5 (single precision)
Reading file chk_25.tpr, VERSION 4.0.5 (single precision)
Reading file chk_30.tpr, VERSION 4.0.5 (single precision)
Reading file chk_26.tpr, VERSION 4.0.5 (single precision)
Reading file chk_24.tpr, VERSION 4.0.5 (single precision)
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Reading file chk_4.tpr, VERSION 4.0.5 (single precision)
Reading file chk_16.tpr, VERSION 4.0.5 (single precision)
Reading file chk_8.tpr, VERSION 4.0.5 (single precision)
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Getting Loaded...
Reading file chk_5.tpr, VERSION 4.0.5 (single precision)
Reading file chk_20.tpr, VERSION 4.0.5 (single precision)
Reading file chk_11.tpr, VERSION 4.0.5 (single precision)
Reading file chk_7.tpr, VERSION 4.0.5 (single precision)
Reading file chk_28.tpr, VERSION 4.0.5 (single precision)
Reading file chk_6.tpr, VERSION 4.0.5 (single precision)
Reading file chk_31.tpr, VERSION 4.0.5 (single precision)
Reading file chk_27.tpr, VERSION 4.0.5 (single precision)
Reading file chk_14.tpr, VERSION 4.0.5 (single precision)
Reading file chk_29.tpr, VERSION 4.0.5 (single precision)
Reading file chk_18.tpr, VERSION 4.0.5 (single precision)
Reading file chk_10.tpr, VERSION 4.0.5 (single precision)
Reading file chk_9.tpr, VERSION 4.0.5 (single precision)
Reading file chk_19.tpr, VERSION 4.0.5 (single precision)
Reading file chk_22.tpr, VERSION 4.0.5 (single precision)
Reading file chk_17.tpr, VERSION 4.0.5 (single precision)
Reading file chk_21.tpr, VERSION 4.0.5 (single precision)
Reading file chk_23.tpr, VERSION 4.0.5 (single precision)
Getting Loaded...
Reading file chk_2.tpr, VERSION 4.0.5 (single precision)
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Loaded with Money
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Loaded with Money
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Loaded with Money
Making 1D domain decomposition 2 x 1 x 1
Loaded with Money
Loaded with Money
Loaded with Money
Making 1D domain decomposition 2 x 1 x 1
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Loaded with Money
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Loaded with Money
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
Making 1D domain decomposition 2 x 1 x 1
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
starting mdrun 'Protein'
500000 steps, 1000.0 ps.
-------------------------------------------------------
Program mdrun_mpi, VERSION 4.0.7
Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179
Fatal error:
Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
nlist->jjnr=0x98500030
(called from file ../../../SRC/src/mdlib/ns.c, line 503)
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
: Cannot allocate memory
Error on node 64, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 64 out of 66
gcq#0: Thanx for Using GROMACS - Have a Nice Day
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 64
-------------------------------------------------------
Program mdrun_mpi, VERSION 4.0.7
Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179
Fatal error:
Not enough memory. Failed to realloc 790760 bytes for nlist->jjnr,
nlist->jjnr=0x9a300030
(called from file ../../../SRC/src/mdlib/ns.c, line 503)
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
: Cannot allocate memory
Error on node 55, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 55 out of 66
-------------------------------------------------------
Program mdrun_mpi, VERSION 4.0.7
Source code file: ../../../SRC/src/gmxlib/smalloc.c, line: 179
Fatal error:
Not enough memory. Failed to realloc 400768 bytes for nlist->jjnr,
nlist->jjnr=0x9a300030
(called from file ../../../SRC/src/mdlib/ns.c, line 503)
-------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
: Cannot allocate memory
Error on node 9, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 9 out of 66
gcq#0: Thanx for Using GROMACS - Have a Nice Day
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 9
gcq#0: Thanx for Using GROMACS - Have a Nice Day
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 55
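(One additional thing that may be worth checking, since the failure is a
plain "Cannot allocate memory" that only shows up once more than 64
processors are requested: whether the batch environment imposes a
per-process memory limit on some of the allocated nodes. A rough sketch,
assuming bash is the shell on the compute nodes:

  # print the host name and the address-space/data-segment limits seen in each rank's environment
  mpirun -np 66 bash -c 'echo "$(hostname) vmem=$(ulimit -v) data=$(ulimit -d)"'

A finite number instead of "unlimited" on some hosts could explain why the
realloc fails there.)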