[gmx-developers] Replica exchange deadlock

Mark Abraham Mark.Abraham at anu.edu.au
Wed Feb 4 01:51:12 CET 2009


Roland Schulz wrote:
> Hi,
> 
> I think the function replica_exchange is using the wrong MPI 
> communicator. I think it should be changed according to this diff:
> 
> --- repl_ex.c   16 Jan 2009 13:05:34 -0000      1.22.2.1
> +++ repl_ex.c   3 Feb 2009 22:28:09 -0000
> @@ -567,7 +567,7 @@
>    if (PAR(cr)) {
>  #ifdef GMX_MPI
>      MPI_Bcast(&bExchanged,sizeof(bool),MPI_BYTE,MASTERRANK(cr),
> -             cr->mpi_comm_mysim);
> +             cr->mpi_comm_mygroup);
>  #endif
>    }
> 
> For me this removes a deadlock I get with certain numbers of PME nodes.

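Yes, that looks correct. MPI_Bcast is a collective operation, so every 
rank in the communicator has to call it. replica_exchange is called from 
do_md, which is only called by PP nodes, so with mpi_comm_mysim the 
PME-only nodes never reach the call and the PP nodes wait for them 
forever. mpi_comm_mysim == mpi_comm_mygroup except under certain DD 
conditions with separate PME nodes, which is why the deadlock only shows 
up with certain numbers of PME nodes.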
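
To make the argument concrete, here is a rough toy model in plain C. It 
makes no MPI calls and is not GROMACS code; rank_state, fill_scenario and 
bcast_completes are names invented for this illustration. It assumes only 
that a collective can complete when every member of the communicator 
enters it, and that only PP ranks reach the broadcast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not MPI and not GROMACS code. */
typedef struct {
    bool in_comm;      /* rank is a member of the communicator used */
    bool calls_bcast;  /* rank's code path reaches the broadcast */
} rank_state;

/* Ranks [0, n_pp) are PP ranks, the remaining ranks are PME-only.
   use_group_comm == false models a whole-simulation communicator,
   use_group_comm == true models a communicator holding only PP ranks. */
void fill_scenario(rank_state *ranks, size_t n, size_t n_pp,
                   bool use_group_comm)
{
    assert(n_pp <= n);
    for (size_t i = 0; i < n; i++) {
        bool is_pp = (i < n_pp);
        ranks[i].in_comm = use_group_comm ? is_pp : true;
        ranks[i].calls_bcast = is_pp;  /* assumption: only PP ranks run do_md */
    }
}

/* The modelled broadcast completes only if every member calls it. */
bool bcast_completes(const rank_state *ranks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ranks[i].in_comm && !ranks[i].calls_bcast) {
            return false;  /* a member never arrives: the callers hang */
        }
    }
    return true;
}
```

With 3 PP ranks and 1 PME-only rank, the whole-simulation case fails the 
check and the PP-group case passes it. With no separate PME ranks the two 
cases have the same membership, so both pass.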

Mark


