[gmx-users] Problems with TI on GPUs

Tue Jul 12 19:02:34 CEST 2016

Dear all,

we are struggling to get a TI calculation running on our computer. The
specifications are listed below. As you can see, it is a two-socket machine
with two graphics cards. The plan is therefore to run two simulations in
parallel, but so far we can't get even a single one to run.

Running on 1 node with total 20 cores, 20 logical cores, 2 compatible GPUs
Hardware detected:
  CPU info:
    Vendor: GenuineIntel
    Brand:  Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz
    SIMD instructions most likely to fit this hardware: AVX2_256
    SIMD instructions selected at GROMACS compile time: AVX2_256
  GPU info:
    Number of GPUs detected: 2
    #0: NVIDIA GeForce GTX 1080, compute cap.: 6.1, ECC:  no, stat: compatible
    #1: NVIDIA GeForce GTX 1080, compute cap.: 6.1, ECC:  no, stat: compatible

The simulation system in question is a protein-ligand complex in
TIP3P water, with Amber ff99SB as the force field.

Now let's get into the messy details. We tried different combinations of
mdrun command-line arguments, for example:

gmx mdrun -s md.tpr -pin on -ntomp 2 -ntmpi 5 -gpu_id 00000 -deffnm md
(does not work)
gmx mdrun -s md.tpr -pin on -ntomp 5 -ntmpi 2 -gpu_id 00 -deffnm md
(does not work)
gmx mdrun -s md.tpr -pin on -ntomp 10 -ntmpi 1 -gpu_id 0 -deffnm md
(does not work)
gmx mdrun -s md.tpr -deffnm md
(does work, uses the complete compute node including both GPUs)
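
For reference, the two-simulations-in-parallel layout we are aiming for
would look roughly like the sketch below: one thread-MPI rank with 10 OpenMP
threads per job, each job bound to its own GPU and its own block of cores.
The run names md1/md2 and the assumption that cores 0-9 and 10-19 each sit
on one socket are only placeholders, not something we have verified. The
script only prints the commands instead of launching them.

```shell
#!/bin/sh
# Dry-run sketch: print one mdrun command per GPU instead of launching it.
# Assumptions (hypothetical, not our verified setup): run names md1/md2,
# 10 cores per job, cores 0-9 on socket 0 and cores 10-19 on socket 1.
CORES_PER_JOB=10

build_mdrun_cmd() {
    # $1 = job index (0 or 1); used as the GPU id and to compute the pin offset
    job=$1
    offset=$((job * CORES_PER_JOB))
    echo "gmx mdrun -deffnm md$((job + 1)) -ntmpi 1 -ntomp $CORES_PER_JOB -gpu_id $job -pin on -pinoffset $offset"
}

for job in 0 1; do
    build_mdrun_cmd "$job"
done
```

To actually start the jobs, each printed command would be run in the
background, followed by a wait.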

The error that GROMACS gives us is rather confusing (explanation
follows further down). Here is a short excerpt:

Step 191, time 0.382 (ps)  LINCS WARNING
relative constraint deviation after LINCS:
rms 0.000002, max 0.000010 (between atoms 3421 and 3424)
bonds that rotated more than 30 degrees:
 atom 1 atom 2  angle  previous, current, constraint length
   3702   3703   31.0    0.1090   0.1090      0.1090
Wrote pdb files with previous and current coordinates

These errors vary, but all of them refer to misplaced atoms or unusual
rotations. GROMACS states that this is because our system is "unstable".
However, we can exclude this explanation, because the starting configuration
of the simulations in question already ran for 20 ns in GROMACS on a CPU
cluster.
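
To compare how early the warnings show up between the different mdrun
invocations, the step and time of each warning can be pulled out of the
output with something like the following sketch. It assumes the warnings
are captured in a file (md.log is only an assumed name) and that the header
line has the same format as in the excerpt above.

```shell
#!/bin/sh
# Sketch: list step and time (ps) of every LINCS warning found in a file.
lincs_steps() {
    # Matches header lines such as: "Step 191, time 0.382 (ps)  LINCS WARNING"
    awk '/^Step [0-9]+, time .*LINCS WARNING/ {
        step = $2
        sub(/,$/, "", step)   # drop the trailing comma after the step number
        print step, $4
    }' "$1"
}

# Usage (file name is an assumption): lincs_steps md.log
```

For the excerpt shown above this prints the step number followed by the
time, one warning per line.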

We also tested different cmake commands. An example is shown here:

cmake .. -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DGMX_GPU=on \
    -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=ON \
    -DREGRESSIONTEST_DOWNLOAD=ON

As compilers, we tried GCC (v4.8.5) and Intel (v15.0.1).

I would really appreciate your help and thank you very much in advance.

Sincerely,
Yannic