[gmx-developers] Replica exchange deadlock
Mark Abraham
Mark.Abraham at anu.edu.au
Wed Feb 4 01:51:12 CET 2009
Roland Schulz wrote:
> Hi,
>
> I think the function replica_exchange is using the wrong MPI
> communicator. I think it should be changed according to this diff:
>
> --- repl_ex.c 16 Jan 2009 13:05:34 -0000 1.22.2.1
> +++ repl_ex.c 3 Feb 2009 22:28:09 -0000
> @@ -567,7 +567,7 @@
> if (PAR(cr)) {
> #ifdef GMX_MPI
> MPI_Bcast(&bExchanged,sizeof(bool),MPI_BYTE,MASTERRANK(cr),
> - cr->mpi_comm_mysim);
> + cr->mpi_comm_mygroup);
> #endif
> }
>
> For me this removes a deadlock that I get with certain numbers of PME nodes.
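Yes, that looks correct. The old form would provoke an MPI deadlock
because MPI_Bcast is a collective call that every rank in the
communicator must enter, and this function is called from do_md, which
is only called by PP nodes. Any PME-only nodes in mpi_comm_mysim never
reach the broadcast. mpi_comm_mysim == mpi_comm_mygroup except under
certain DD conditions with separate PME nodes.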
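
The reasoning above can be sketched as a toy model in plain C. This is
not real MPI code and not taken from repl_ex.c: the four-rank layout,
the membership tables and the helper collective_completes are all
hypothetical, and the model assumes that mpi_comm_mygroup on a PP node
contains only the PP nodes. It only encodes the rule that a collective
finishes when every member of the communicator makes the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not real MPI: a collective such as MPI_Bcast can only
 * finish once every rank in the communicator has entered it. */
enum { NRANKS = 4 };

/* Hypothetical layout: ranks 0-2 are PP ranks, rank 3 is PME-only.
 * Only PP ranks run do_md, so only they reach replica_exchange. */
static const bool is_pp_rank[NRANKS] = { true, true, true, false };

/* Assumed communicator membership, as seen from a PP rank. */
static const bool in_mysim[NRANKS]   = { true, true, true, true  };
static const bool in_mygroup[NRANKS] = { true, true, true, false };

/* Returns true if every member of the communicator makes the call. */
static bool collective_completes(const bool *member, const bool *calls,
                                 size_t n)
{
    assert(member != NULL && calls != NULL);
    for (size_t i = 0; i < n; i++) {
        if (member[i] && !calls[i]) {
            /* Rank i never arrives, so the other members wait forever. */
            return false;
        }
    }
    return true;
}
```

With these tables, collective_completes(in_mysim, is_pp_rank, NRANKS)
is false (the old code, which hangs) and
collective_completes(in_mygroup, is_pp_rank, NRANKS) is true (the
patched code). Without separate PME nodes the two tables would be
identical, which matches the deadlock only showing up for some PME
node counts.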
Mark