[gmx-users] Gromacs 2016.3 MPI Message truncated error

Farkas-Pall, Kristof kristof.farkas-pall.14 at ucl.ac.uk
Fri Sep 8 16:14:46 CEST 2017


I am trying to run a simulation on 1 node with 32 cores (Cray XE6).

After running for a number of steps, mdrun exits with the error shown below.

In my submission script I have:

#PBS -l nodes=1:ppn=32:xe

aprun -n 32 gmx_mpi mdrun -deffnm replica_0/lambda_0/minimize

All of the output below comes from mdrun:

GROMACS:      gmx mdrun, version 2016.3
Executable: xxx/gromacs/gromacs-2016.3/install-cpu/bin/gmx_mpi

Running on 1 node with total 32 cores, 32 logical cores
Hardware detected on host nidXXXXX (the node of MPI rank 0):
  CPU info:
    Vendor: AMD
    Brand:  AMD Opteron(TM) Processor 6276
    SIMD instructions most likely to fit this hardware: AVX_128_FMA
    SIMD instructions selected at GROMACS compile time: AVX_128_FMA

  Hardware topology: Basic

The number of OpenMP threads was set by environment variable OMP_NUM_THREADS to 1

Will use 20 particle-particle and 12 PME only ranks
This is a guess, check the performance at the end of the log file
Using 32 MPI processes
Using 1 OpenMP thread per MPI process

Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity

Rank 19 [Thu Sep  7 14:04:02 2017] [c23-1c0s7n3] Fatal error in MPI_Sendrecv: Message truncated, error stack:
MPI_Sendrecv(249).................: MPI_Sendrecv(sbuf=0x7fffffff3f90, scount=8, MPI_BYTE, dest=17, stag=0, rbuf=0x7fffffff4060, rcount=8, MPI_BYTE, src=7, rtag=0, comm=0x84000002, status=0x7fffffff3e70) failed
MPIDI_CH3U_Receive_data_found(144): Message from rank 7 and tag 0 truncated; 264 bytes received but buffer size is 8
_pmiu_daemon(SIGCHLD): [NID 14929] [c23-1c0s7n3] [Thu Sep  7 14:04:02 2017] PE RANK 19 exit signal Aborted
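
The key detail in the stack above is the size mismatch: rank 19 posted an 8-byte receive buffer, but the message from rank 7 was 264 bytes. For a longer log with several such lines, a small filter along these lines could pull out the numbers. This is only a sketch: `summarise_truncation` is a made-up helper name, not part of GROMACS or the MPI library, and it assumes the MPICH-style wording of the message shown above.

```shell
# Hypothetical helper (not a GROMACS or MPI tool): reads an MPI error log on
# stdin and prints one summary line per "Message ... truncated" entry.
# Assumes the MPICH-style wording seen in the log above.
summarise_truncation() {
  sed -n 's/.*Message from rank \([0-9]*\) and tag \([0-9]*\) truncated; \([0-9]*\) bytes received but buffer size is \([0-9]*\).*/src=\1 tag=\2 received=\3 buffer=\4/p'
}
```

It would be used as `summarise_truncation < logfile`, where `logfile` stands for wherever the job's stderr was written.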


Is this because GROMACS wasn't built correctly, or because I am setting the aprun options incorrectly?

Thanks for any help!

