[gmx-users] Fwd: related to bug 1222

Szilárd Páll pall.szilard at gmail.com
Fri May 30 17:06:19 CEST 2014


Albert,

The list does not accept attachments. Please file a redmine issue instead
(here: http://redmine.gromacs.org/) and include your input files and log
outputs too.

Is the crash reliably reproducible? Does it always happen in the same
replica? Does it also happen with a non-REMD run? And have you observed
the same issue without GPUs?
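
A quick way to test the no-GPU case without rebuilding is to force the
CPU non-bonded kernels with mdrun's -nb option, leaving everything else
as in your normal launch line; a sketch, with the rank count, exchange
interval and file names as placeholders:

    aprun -n 32 mdrun_mpi -nb cpu -multi 32 -replex 1000 -deffnm remd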

Cheers,
--
Szilárd


On Fri, May 30, 2014 at 3:56 PM, albert ardevol
<albert.ardevol at gmail.com> wrote:
> Dear Mark,
>
>   Attached are the .mdp file that I am using (for replica 4), the job file
> that I use to launch the calculation on Daint
> (http://www.cscs.ch/computers/piz_daint/index.html), and the output.log.
>   Notice that the same .mdp files and the same mdrun command, but running on
> 4 nodes instead of 8, work without problems.
>
>
> Best regards,
> Albert.
>
>
> 2014-05-28 22:46 GMT+02:00 Mark Abraham <mark.j.abraham at gmail.com>:
>
>> Hi,
>>
>> The source of the issue with #1222 was never established, so we can't know
>> if your situation is related. What were your mdrun command lines and .mdp
>> settings?
>>
>> Mark
>>
>>
>> On Wed, May 28, 2014 at 6:33 PM, albert ardevol
>> <albert.ardevol at gmail.com>wrote:
>>
>> > ---------- Forwarded message ----------
>> > From: albert ardevol <albert.ardevol at gmail.com>
>> > Date: 2014-05-28 18:26 GMT+02:00
>> > Subject: related to bug 1222
>> > To: gromacs.org_gmx-users at maillist.sys.kth.se, pszilard at kth.se
>> >
>> >
>> > Dear users/developers,
>> >
>> >   I am running replica exchange MD (REMD) with GROMACS 4.6.5 on Piz Daint
>> > using GPUs and CPUs. I have two different systems.
>> >
>> >   System 1 is small (< 16000 atoms) with 32 replicas. I am running it
>> > using
>> > 4 nodes without any problem.
>> >
>> >   System 2 is big (< 49000 atoms), also with 32 replicas. I am running it
>> > using 8 nodes, but after some steps the simulation becomes unstable and
>> > the job crashes. Restarting from the previous checkpoint, the simulation
>> > continues for some steps until it becomes unstable again (at a different
>> > point) and the job crashes again. The number of steps the job runs before
>> > the simulation becomes unstable ranges from 5,000 to 1,170,000.
>> > The GROMACS output file gives me the following error:
>> >
>> > -------------------------------------------------------
>> > Program mdrun_mpi, VERSION 4.6.5
>> > Source code file:
>> > /apps/daint/sandbox/lucamar/src/gromacs-4.6.5/src/mdlib/pme.c, line: 851
>> >
>> > Fatal error:
>> > 2 particles communicated to PME node 2 are more than 2/3 times the cut-off
>> > out of the domain decomposition cell of their charge group in dimension x.
>> > This usually means that your system is not well equilibrated.
>> > For more information and tips for troubleshooting, please check the
>> > GROMACS website at http://www.gromacs.org/Documentation/Errors
>> > -------------------------------------------------------
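>> >
>> > (For reference, a checkpoint restart of a -multi run like this is
>> > typically done with mdrun's -cpi option; a sketch of such a relaunch,
>> > with the rank count, exchange interval and file names as placeholders:
>> >
>> >     aprun -n 64 mdrun_mpi -multi 32 -replex 1000 -deffnm remd -cpi remd.cpt
>> >
>> > mdrun appends the replica index to the file names when -multi is used,
>> > and output is appended to the existing files by default.)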
>> >
>> >  I found this report, http://redmine.gromacs.org/issues/1222, in which
>> > they seem to have a similar problem and they say that it depends on the
>> > number of GPUs used. So I launched the job again using only 4 nodes and it
>> > ran without problems for 2,259,000 steps. This bug was supposed to affect
>> > version 4.6.1 and to be fixed by version 4.6.5 (the one I am using).
>> >
>> >   Notice that I had previously equilibrated each of the replicas
>> > (separately, i.e. not using replica exchange) for 5,000,000 steps using
>> > 1
>> > node per run without any problem.
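>> >
>> >   (Each of those equilibrations was an ordinary single-simulation run;
>> > a sketch of what such a per-replica setup might look like, with file
>> > names and rank counts as placeholders:
>> >
>> >     grompp -f equil_4.mdp -c replica_4.gro -p topol.top -o equil_4.tpr
>> >     aprun -n 8 mdrun_mpi -deffnm equil_4
>> >
>> > repeated once per replica, each on its own node.)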
>> >
>> >   That makes me wonder whether the bug is really fixed or not.
>> >
>> >   Best regards,
>> >   Albert.

