[gmx-developers] Replica exchange deadlock
Berk Hess
hessb at mpip-mainz.mpg.de
Wed Feb 4 10:19:50 CET 2009
Mark Abraham wrote:
> Roland Schulz wrote:
>> Hi,
>>
>> I think the function replica_exchange is using the wrong MPI
>> communicator. I think it should be changed according to this diff:
>>
>> --- repl_ex.c 16 Jan 2009 13:05:34 -0000 1.22.2.1
>> +++ repl_ex.c 3 Feb 2009 22:28:09 -0000
>> @@ -567,7 +567,7 @@
>> if (PAR(cr)) {
>> #ifdef GMX_MPI
>> MPI_Bcast(&bExchanged,sizeof(bool),MPI_BYTE,MASTERRANK(cr),
>> - cr->mpi_comm_mysim);
>> + cr->mpi_comm_mygroup);
>> #endif
>> }
>>
>> For me this removes a deadlock I get with certain numbers of PME nodes.
>
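> Yes, that looks correct. MPI_Bcast is a collective call, so every rank
> in the communicator has to enter it. replica_exchange is called from
> do_md, which is only called by PP nodes, so with the old communicator
> the PME nodes never joined the broadcast and the PP nodes waited
> forever. mpi_comm_mysim == mpi_comm_mygroup except under certain DD
> conditions with separate PME nodes.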
>
> Mark
Indeed.
I fixed it for 4.0.4.
Berk
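
The reasoning in the thread can be sketched as a toy model in plain C.
It makes no MPI calls and is not GROMACS code: rank_role and
bcast_completes are hypothetical names used only for illustration. The
model assumes that the simulation-wide communicator contains every rank,
that the group communicator seen by a PP rank contains only PP ranks,
and that only PP ranks reach the broadcast because only they run do_md.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of the situation described in the thread. */
typedef enum { ROLE_PP, ROLE_PME } rank_role;

/*
 * A broadcast is collective: it can finish only if every rank that
 * belongs to the communicator actually calls it.
 *
 * pp_only_comm == false models a communicator spanning all ranks
 *                 (assumed to correspond to mpi_comm_mysim).
 * pp_only_comm == true  models a communicator with PP ranks only
 *                 (assumed to correspond to mpi_comm_mygroup on PP ranks).
 */
bool bcast_completes(const rank_role *roles, size_t nranks, bool pp_only_comm)
{
    for (size_t i = 0; i < nranks; i++) {
        bool in_comm = pp_only_comm ? (roles[i] == ROLE_PP) : true;
        bool calls   = (roles[i] == ROLE_PP); /* only PP ranks run do_md */

        if (in_comm && !calls) {
            return false; /* a member never arrives: the others hang */
        }
    }
    return true;
}
```

With three PP ranks and one separate PME rank, the model reports a hang
for the simulation-wide communicator and completion for the PP-only one.
With no separate PME ranks the two cases coincide, which matches the
remark that the communicators are normally identical.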