[gmx-developers] MPI_ERR_COMM on 4.5.5-patches

Berk Hess hess at kth.se
Tue Aug 28 22:34:06 CEST 2012


Hi,

This seems to be a bug in Gromacs.
As this is not in a Gromacs release yet, we could resolve this without a 
bug report.

Are you skilled enough to run this in a debugger and tell me which
MPI_Comm_size call in Gromacs is causing this?

Cheers,

Berk

On 08/28/2012 07:39 PM, Alexander Schlaich wrote:
> Dear Gromacs team,
>
> I just tried to install the release-4.5.5_patches branch with --enable-mpi on our cluster (OpenMPI-1.4.2), resulting in an error when calling mdrun with PME enabled:
>
> Reading file topol.tpr, VERSION 4.5.5-dev-20120810-2859895 (single precision)
> [sheldon:22663] *** An error occurred in MPI_comm_size
> [sheldon:22663] *** on communicator MPI_COMM_WORLD
> [sheldon:22663] *** MPI_ERR_COMM: invalid communicator
> [sheldon:22663] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>
> This seems to be related to a recent post on the list, but I could not find a solution there:
> http://lists.gromacs.org/pipermail/gmx-users/2012-July/073316.html
> However, the 4.5.5 release version works fine.
>
> Taking a closer look I found commit dcf8b67e2801f994dae56374382b9e330833de30, "changed PME MPI_Comm comparisions to MPI_COMM_NULL, fixes #931" (Berk Hess). Apparently here the communicators were changed such that the initialization fails on my system. Reverting this single commit on the head of the release-4.5.5 branch solved the issue for me.
>
> As I am no MPI expert, I would like to know whether my MPI implementation is misbehaving here, whether I made a configuration mistake, or whether I should file a bug report.
>
> Thanks for your help,
>
> Alex
>
>
>



