[gmx-users] Gromacs 2016.3 MPI Message truncated error

Mark Abraham mark.j.abraham at gmail.com
Fri Sep 8 20:03:10 CEST 2017


Hi,

That error doesn't come from GROMACS, so I can only suspect some kind of
misconfiguration or transient error with your MPI system.
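
For what it's worth, the quoted error stack does say what went wrong at the
MPI level: the receive on rank 19 was posted with an 8-byte buffer, and the
message that arrived from rank 7 was 264 bytes. MPI reports "Message
truncated" when an incoming message is larger than the receive buffer. The
sketch below only pulls those numbers out of the log line so they are easier
to compare across failed runs. The regular expression and the function name
are my own assumptions for illustration, written against this one
MPICH-style message format; they are not part of GROMACS or of any MPI
library.

```python
import re

# Error text copied from the quoted log, with the mail line breaks joined.
ERROR_LINE = (
    "MPIDI_CH3U_Receive_data_found(144): Message from rank 7 and tag 0 "
    "truncated; 264 bytes received but buffer size is 8"
)

# Hypothetical pattern, written for this one message format only.
TRUNCATION_RE = re.compile(
    r"Message from rank (?P<src>\d+) and tag (?P<tag>\d+) truncated; "
    r"(?P<received>\d+) bytes received but buffer size is (?P<buffer>\d+)"
)


def parse_truncation(text):
    """Return the fields of a 'Message truncated' line as ints, or None."""
    match = TRUNCATION_RE.search(text)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


info = parse_truncation(ERROR_LINE)
if info is not None:
    # How many bytes did not fit into the posted receive buffer.
    excess = info["received"] - info["buffer"]
    print(
        f"message from rank {info['src']} (tag {info['tag']}): "
        f"{info['received']} bytes received, buffer holds {info['buffer']}, "
        f"{excess} bytes too many"
    )
```

If the same source rank and sizes show up on every failed run, that would
point away from a purely transient fault, but that is a guess on my part and
not something this log alone can confirm.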

Mark

On Fri, Sep 8, 2017 at 4:15 PM Farkas-Pall, Kristof <
kristof.farkas-pall.14 at ucl.ac.uk> wrote:

> Hello,
>
> I am trying to run a simulation on 1 node with 32 cores (Cray XE6).
>
> After running for a number of steps, the code exits with the error described
> below.
>
> In my submission script I have:
>
> ```
> #PBS -l nodes=1:ppn=32:xe
>
> aprun -n 32 gmx_mpi mdrun -deffnm replica_0/lambda_0/minimize
> ```
>
> All info is coming from mdrun:
>
> ```
> GROMACS:      gmx mdrun, version 2016.3
> Executable: xxx/gromacs/gromacs-2016.3/install-cpu/bin/gmx_mpi
>
> Running on 1 node with total 32 cores, 32 logical cores
> Hardware detected on host nidXXXXX (the node of MPI rank 0):
>   CPU info:
>     Vendor: AMD
>     Brand:  AMD Opteron(TM) Processor 6276
>     SIMD instructions most likely to fit this hardware: AVX_128_FMA
>     SIMD instructions selected at GROMACS compile time: AVX_128_FMA
>
>   Hardware topology: Basic
>
> The number of OpenMP threads was set by environment variable
> OMP_NUM_THREADS to 1
>
> Will use 20 particle-particle and 12 PME only ranks
> This is a guess, check the performance at the end of the log file
> Using 32 MPI processes
> Using 1 OpenMP thread per MPI process
>
>
> Non-default thread affinity set probably by the OpenMP library,
> disabling internal thread affinity
>
> Rank 19 [Thu Sep  7 14:04:02 2017] [c23-1c0s7n3] Fatal error in
> MPI_Sendrecv: Message truncated, error stack:
> MPI_Sendrecv(249).................: MPI_Sendrecv(sbuf=0x7fffffff3f90,
> scount=8, MPI_BYTE, dest=17, stag=0, rbuf=0x7fffffff4060, rcount=8,
> MPI_BYTE, src=7, rtag=0, comm=0x84000002, status=0x7fffffff3e70) failed
> MPIDI_CH3U_Receive_data_found(144): Message from rank 7 and tag 0
> truncated; 264 bytes received but buffer size is 8
> _pmiu_daemon(SIGCHLD): [NID 14929] [c23-1c0s7n3] [Thu Sep  7 14:04:02
> 2017] PE RANK 19 exit signal Aborted
>
> ```
>
> Is this because Gromacs wasn't built correctly? Or am I setting the aprun
> variables incorrectly?
>
> Thanks for any help!
>
> Kristof

