Dear GMX users,

I have a problem running GROMACS jobs on a GPU node. The log file output
is attached below. Does anyone know whether CUDA and GROMACS were
compiled correctly, judging by this output?
If so, why can't the job run on the GPU? Thank you very much!

  gmx_mpi mdrun -deffnm em

GROMACS version:    2016.1
Precision:          single
Memory model:       64 bit
MPI library:        MPI
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support:        CUDA
SIMD instructions:  AVX_256
FFT library:        fftw-3.3.5-sse2-avx-avx_128_fma
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      disabled
Tracing support:    disabled
Built on:           Tue Jan 17 15:14:56 EST 2017
Built by:           sli259 at compute-gpu-0-2.local [CMAKE]
Build OS/arch:      Linux 2.6.32-642.11.1.el6.x86_64 x86_64
Build CPU vendor:   Intel
Build CPU brand:    Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Build CPU family:   6   Model: 63   Stepping: 2
Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler:         /home/sli259/program/gromacs/openmpi/bin/mpicc GNU 4.9.3
C compiler flags:    -mavx     -O3 -DNDEBUG -funroll-all-loops
C++ compiler:       /home/sli259/program/gromacs/openmpi/bin/mpic++ GNU
C++ compiler flags:  -mavx    -std=c++0x   -O3 -DNDEBUG -funroll-all-loops
CUDA compiler:      /home/sli259/program/gromacs/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2016 NVIDIA Corporation;Built on Sun_Sep__4_22:14:01_CDT_2016;Cuda compilation tools, release 8.0, V8.0.44
CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=
CUDA driver:        8.0
CUDA runtime:       0.0

NOTE: Error occurred during GPU detection:
      no CUDA-capable device is detected
      Can not use GPU acceleration, will fall back to CPU kernels.

Running on 1 node with total 12 cores, 24 logical cores, 0 compatible GPUs
Hardware detected on host compute-gpu-0-2.local (the node of MPI rank 0):

