Subject: Re: [gmx-users] Issue with domain decomposition between v4.5.5 and 4.6.1
stephanietm at gmail.com
Mon Apr 15 21:16:02 CEST 2013
Thank you for the reply, and I am glad to hear that this is normal output.
Unfortunately, my simulations crash almost immediately when I use v4.6,
and I assume it has something to do with the load balancing because
that is the last line in my md.log file.
I have run with "mdrun -debug 1" and found this error:
"mdrun_mpi:13106 terminated with signal 11 at PC=2abd88a03934"
I know this is rather vague, but do you have any suggestions on where I
should start tracking down this error? When I use particle decomposition, my
simulations run fine.
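[Editor's note: for reference, signal 11 is SIGSEGV, i.e. mdrun dereferenced an invalid memory address; the PC value alone won't say much without a debug build. The name-to-number mapping can be confirmed with plain Python (nothing GROMACS-specific):]

```python
import signal

# On Linux, signal number 11 is SIGSEGV (segmentation fault),
# meaning the process accessed memory it does not own.
print(signal.Signals(11).name)  # -> SIGSEGV
```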
Thanks in advance!
Date: Mon, 15 Apr 2013 06:08:13 -0400
From: Justin Lemkul <jalemkul at vt.edu>
Subject: Re: [gmx-users] Issue with domain decomposition between
v4.5.5 and 4.6.1
To: Discussion list for GROMACS users <gmx-users at gromacs.org>
Message-ID: <516BD18D.8000803 at vt.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
On 4/14/13 11:23 PM, Stephanie Teich-McGoldrick wrote:
> Dear all,
> I am running an NPT simulation of 33,534 TIP4P waters, and I am using
> domain decomposition as the parallelization scheme. Previously, I had been using
> Gromacs version 4.5.5 but have recently installed and switched to Gromacs
> version 4.6.1. Using Gromacs 4.5.5 I can successfully run my water box
> using domain decomposition over many different processor numbers. However
> the same simulation returns the following error when I try Gromacs 4.6.1
> "The initial number of communication pulses is: X 1 Y 1 Z 1
> The initial domain decomposition cell size is: X 2.48 nm Y 2.48 nm Z 1.46
> When dynamic load balancing gets turned on, these settings will change to:
> The maximum number of communication pulses is: X 1 Y 1 Z 1
> The minimum size for domain decomposition cells is 1.000 nm
> The requested allowed shrink of DD cells (option -dds) is: 0.80
> The allowed shrink of domain decomposition cells is: X 0.40 Y 0.40 Z 0.68"
> The above error occurred running over 16 nodes / 128 processors. The
> simulation runs with version 4.6.1 on 1, 8, and 16 processors but not on 32,
> 64, or 128.
> I have tried other systems (including NVT, Berendsen/PR barostats,
> anisotropic/isotropic) at the higher number of processors using both
> version 4.5.5 and 4.6.1 and get the same result - v4.5.5 runs fine while
> v4.6.1 returns the error type listed above.
> Is anyone else having a similar issue? Is there something I am not
> considering? Any help would be greatly appreciated! The details I used
> to compile each version are below. My log files indicate that I am indeed
> calling the correct executable at run time.
Based on what you've posted, I don't see any error.  All of the above is
normal output.
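[Editor's note: for readers wondering how those numbers relate, the initial cell sizes are just the box vectors divided by the DD grid counts, and the reported "allowed shrink" is the 1.000 nm minimum cell size divided by each initial size. A minimal sketch, assuming an 8x4x4 grid over the 128 ranks and box dimensions chosen to reproduce the log's values; neither the grid nor the box is stated in the thread:]

```python
# Hypothetical box (nm) and DD grid, chosen so the resulting cell sizes
# match the 2.48 / 2.48 / 1.46 nm reported in the quoted log.
box = (19.84, 9.92, 5.84)
grid = (8, 4, 4)            # 8 * 4 * 4 = 128 domains (an assumption)
min_cell = 1.0              # minimum DD cell size from the log (nm)

cells = [b / n for b, n in zip(box, grid)]
shrink = [min_cell / c for c in cells]

for axis, c, s in zip("XYZ", cells, shrink):
    print(f"{axis}: initial cell {c:.2f} nm, allowed shrink {s:.2f}")
```

[Adding ranks along an axis shrinks the cells; if a cell would fall below the minimum size, mdrun rejects the decomposition, which is one thing to check when a run only fails at higher processor counts.]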
Justin A. Lemkul, Ph.D.
Department of Biochemistry
jalemkul[at]vt.edu | (540) 231-9080