Siva Dasetty sdasett at g.clemson.edu
Fri Jun 1 04:11:53 CEST 2018


I have come across an error that causes GROMACS (2018/2018.1) to crash. The
message is:

"tMPI error: Receive buffer size too small for transmission (in valid comm)"

The error seems to only occur immediately following a LINCS or SETTLE
warning. The error is reproducible across different systems. A simple
example system is running an energy minimization on a box of 1000 rigid
TIP4P/Ice water molecules generated with gmx solvate. When SETTLE is used
as the constraint algorithm, there are several SETTLE warnings in the
early steps of the energy minimization, and GROMACS will crash with the
above error message. If I replace SETTLE with LINCS, GROMACS crashes with
the same error message following a LINCS warning. Other systems that have
produced this error are -OH terminated self-assembled monolayer surfaces
(h-bonds constrained by LINCS), and mica surfaces (h-bonds constrained by
LINCS). Naturally, reducing -ntmpi to 1 eliminates the error for all
systems.

The problem does appear to be hardware dependent. Specifically, the tested
cluster nodes contain K20/K40 GPUs with Intel Xeon E5-2680v3 processors
(20/24 cores). I used the GCC/5.4.0 and CUDA/8.0.44 compilers to build
GROMACS. An installation on my desktop machine with very similar options
does not produce the thread-MPI error.

Example of a procedure that causes the error:
# Node contains 24 cores and 2 K40 GPUs
gmx solvate -cs tip4p -o box.gro -box 3.2 3.2 3.2 -maxsol 1000
gmx grompp -f em.mdp -c box.gro -p tip4pice.top -o em
gmx mdrun -v -deffnm em -ntmpi 4 -ntomp 6 -pin on
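# For comparison, a sketch of the single-rank run that avoids the error
# (only -ntmpi is changed; leaving -ntomp to mdrun's default is an assumption)
gmx mdrun -v -deffnm em -ntmpi 1 -pin on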

Attached are the relevant topology (tip4pice.top), mdp (em.mdp), and log
(em.log) files.

Thanks in advance for any ideas as to what might be causing this problem,
Siva Dasetty


