[gmx-users] Fwd: related to bug 1222

Wed May 28 22:46:43 CEST 2014

Hi,

The source of the issue with #1222 was never established, so we can't know
if your situation is related. What were your mdrun command lines and .mdp
settings?

Mark

On Wed, May 28, 2014 at 6:33 PM, albert ardevol <albert.ardevol at gmail.com> wrote:

> ---------- Forwarded message ----------
> From: albert ardevol <albert.ardevol at gmail.com>
> Date: 2014-05-28 18:26 GMT+02:00
> Subject: related to bug 1222
> To: gromacs.org_gmx-users at maillist.sys.kth.se, pszilard at kth.se
>
>
> Dear users/developers,
>
>   I am running replica exchange MD (REMD) with GROMACS 4.6.5 on Piz Daint
> using GPUs and CPUs. I have two different systems.
>
>   System 1 is small (< 16000 atoms) with 32 replicas. I am running it using
> 4 nodes without any problem.
>
>   System 2 is big (< 49000 atoms) with 32 replicas too. I am running it
> using 8 nodes, but after some steps, the simulation becomes unstable and
> the jobs crash. Restarting from the previous checkpoint, the simulation
> continues for some steps until it becomes unstable again (at a different
> point) and the job crashes again. The number of steps that the job can run
> until the simulation becomes unstable ranges from 5,000 to 1,170,000 steps.
> The GROMACS output file gives me the following error:
>
> -------------------------------------------------------
> Program mdrun_mpi, VERSION 4.6.5
> Source code file:
> /apps/daint/sandbox/lucamar/src/gromacs-4.6.5/src/mdlib/pme.c, line: 851
>
> Fatal error:
> 2 particles communicated to PME node 2 are more than 2/3 times the cut-off
> out of the domain decomposition cell of their charge group in dimension x.
> This usually means that your system is not well equilibrated.
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
>  I found this report, http://redmine.gromacs.org/issues/1222, in which
> they seem to have a similar problem, and they say that it depends on the
> number of GPUs used. So I launched the job again using only 4 nodes, and it
> ran without problems for 2,259,000 steps. This bug was supposed to affect
> version 4.6.1 and to be fixed by version 4.6.5 (the one I am using).
>
>   Note that I had previously equilibrated each of the replicas
> (separately, i.e. not using replica exchange) for 5,000,000 steps using 1
> node per run without any problem.
>
>   That makes me wonder whether the bug is really fixed or not.
>
>   Best regards,
>   Albert.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-request at gromacs.org.
>