[gmx-users] gromacs 5.1.2 mdrun can't detect GPU

Szilárd Páll pall.szilard at gmail.com
Thu Apr 21 16:12:59 CEST 2016


On Thu, Apr 21, 2016 at 3:54 AM, treinz <treinz at 163.com> wrote:
> Hi,
>
>
> Can you also explain why the function calls to cudaDriverGetVersion() and cudaRuntimeGetVersion() both return 0, as in

Not really, but that is the normal behavior on hosts where the runtime
is incompatible with the driver, or where no driver is installed at
all.
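
To illustrate why a failed query shows up as "0.0": CUDA reports its
driver and runtime versions as single integers encoded as
1000*major + 10*minor (7050 for 7.5), so a value that is still 0 after
a failed query prints as 0.0. Below is a minimal sketch in Python of
that decoding plus the "driver must be at least as new as the runtime"
check. The function names are hypothetical and this is not the code
mdrun itself uses.

```python
def decode_cuda_version(encoded):
    """Split CUDA's integer encoding (1000*major + 10*minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def format_cuda_version(encoded):
    """Render an encoded version the way it appears in a log, e.g. '7.5'."""
    major, minor = decode_cuda_version(encoded)
    return f"{major}.{minor}"


def driver_supports_runtime(driver_encoded, runtime_encoded):
    """Assumed rule: a usable driver is present (non-zero) and not older than the runtime."""
    return driver_encoded > 0 and driver_encoded >= runtime_encoded


print(format_cuda_version(7050))            # 7.5
print(format_cuda_version(0))               # 0.0
print(driver_supports_runtime(0, 7050))     # False
```

So "0.0" most likely means that no version could be obtained at all,
not that some ancient version was found.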

Cheers,
--
Szilárd

>>> CUDA driver:        0.0
>>> CUDA runtime: 0.0
>
>
> Thanks,
> Tim
> At 2016-04-21 08:49:17, "Szilárd Páll" <pall.szilard at gmail.com> wrote:
>>On Thu, Apr 21, 2016 at 12:22 AM, treinz <treinz at 163.com> wrote:
>>> Hi all,
>>>
>>>
>>> I recently built 5.1.2 with GPU support and the config options are:
>>>
>>>
>>> module load cuda/7.5.18
>>> cmake .. -DCMAKE_C_COMPILER=gcc-4.9 \
>>>          -DCMAKE_CXX_COMPILER=g++-4.9 \
>>>          -DGMX_MPI=OFF \
>>>          -DGMX_THREAD_MPI=ON \
>>>          -DGMX_GPU=ON \
>>>          -DCMAKE_PREFIX_PATH=$HOME/local \
>>>          -DCMAKE_INSTALL_PREFIX=$HOME/local/gromacs/grid_frontend \
>>>          -DGMX_BUILD_OWN_FFTW=ON \
>>>          -DGMX_DEFAULT_SUFFIX=OFF \
>>>          -DGMX_BINARY_SUFFIX=_gpu \
>>>          -DGMX_LIBS_SUFFIX=_gpu
>>>
>>>
>>> and the installation was successful. But when I tried running mdrun, it wasn't able to detect the GPU:
>>>
>>>
>>> Build OS/arch:      Linux 2.6.32-573.1.1.el6.x86_64 x86_64
>>> Build CPU vendor:   GenuineIntel
>>> Build CPU brand:    Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
>>> Build CPU family:   6   Model: 63   Stepping: 2
>>> Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
>>> C compiler:         /net/noble/vol1/home/dejunlin/local/bin/mpicc GNU 4.9.3
>>> C compiler flags:    -march=core-avx2    -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter  -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast  -Wno-array-bounds
>>> C++ compiler:       /net/noble/vol1/home/dejunlin/local/bin/mpicxx GNU 4.9.3
>>> C++ compiler flags:  -march=core-avx2    -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function  -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast  -Wno-array-bounds
>>> Boost version:      1.59.0 (external)
>>> CUDA compiler:      /net/gs/vol3/software/modules-sw/cuda/7.5.18/Linux/RHEL6/x86_64/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2015 NVIDIA Corporation;Built on Tue_Aug_11_14:27:32_CDT_2015;Cuda compilation tools, release 7.5, V7.5.17
>>> CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_52,code=compute_52;-use_fast_math;; ;-march=core-avx2;-Wextra;-Wno-missing-field-initializers;-Wpointer-arith;-Wall;-Wno-unused-function;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;-Wno-array-bounds;
>>> CUDA driver:        0.0
>>> CUDA runtime:       0.0
>>>
>>>
>>> NOTE: Error occurred during GPU detection:
>>>       CUDA driver version is insufficient for CUDA runtime version
>>>       Can not use GPU acceleration, will fall back to CPU kernels.
>>>
>>
>>^^^ This is the message you should be focusing on. For some reason,
>>the runtime mdrun was compiled with is incompatible with the driver.
>>This should not be the case if the driver is indeed 352.xx (which is
>>driver API v 7.5) as shown below and the runtime API v 7.5.
>>
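
One way to double-check which driver the job actually sees is to parse
the "Driver Version" field out of the `nvidia-smi -a` output from
inside the same batch job that runs mdrun. The sketch below does that
in Python on a sample of the log quoted in this thread. The helper
names are hypothetical, and the 352 threshold is only the series named
above as matching driver API v7.5, not an official minimum.

```python
import re


def parse_driver_version(nvsmi_text):
    """Return the 'Driver Version' field of `nvidia-smi -a` output as a tuple of ints, or None."""
    match = re.search(
        r"^\s*Driver Version\s*:\s*(\d+(?:\.\d+)*)\s*$", nvsmi_text, re.MULTILINE
    )
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_352_series_or_newer(version):
    """Assumption taken from this thread: the 352.xx series provides driver API v7.5."""
    return version is not None and version[0] >= 352


# Sample lines copied from the nvidia-smi log in this thread.
SAMPLE = """\
Timestamp                           : Wed Apr 20 15:05:52 2016
Driver Version                      : 352.39
Attached GPUs                       : 4
"""

version = parse_driver_version(SAMPLE)
print(version, is_352_series_or_newer(version))
```

On a real node the text could come from
`subprocess.run(["nvidia-smi", "-a"], capture_output=True, text=True).stdout`.
If that reports 352.xx while mdrun still prints 0.0 for the driver,
the mismatch is probably in what mdrun loads at run time and not in the
installed driver.
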
>>> The command lines for this run are:
>>>
>>>
>>> module load cuda/7.5.18
>>> source $HOME/local/gromacs/grid_frontend/bin/GMXRC
>>> env
>>> nvidia-smi -a
>>> gmx_gpu mdrun -ntmpi 24 -ntomp 1 -gpu_id 1 -deffnm
>>
>>FYI: That's going to be inefficient, probably the worst possible
>>option in fact. Try 1-6 ranks/GPU.
>>
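
To make the "1-6 ranks/GPU" suggestion concrete, here is a small Python
sketch that lists the rank/thread splits using every core with at most
6 thread-MPI ranks on one GPU. It assumes 24 cores (inferred from
-ntmpi 24 above) and the 5.x convention that -gpu_id takes one digit
per PP rank; both are assumptions, so check `gmx mdrun -h` on your
build. The -deffnm option is left out because its argument is not
shown above.

```python
def rank_layouts(total_cores, max_ranks_per_gpu=6):
    """List (ntmpi, ntomp) pairs that use every core with 1..max_ranks_per_gpu ranks."""
    return [
        (ntmpi, total_cores // ntmpi)
        for ntmpi in range(1, max_ranks_per_gpu + 1)
        if total_cores % ntmpi == 0
    ]


def mdrun_args(ntmpi, ntomp, gpu_id="1"):
    """Build an mdrun option list; the GPU id is repeated once per rank (assumed 5.x syntax)."""
    return [
        "gmx_gpu", "mdrun",
        "-ntmpi", str(ntmpi),
        "-ntomp", str(ntomp),
        "-gpu_id", gpu_id * ntmpi,
    ]


# Print one candidate command line per layout for a 24-core node.
for ntmpi, ntomp in rank_layouts(24):
    print(" ".join(mdrun_args(ntmpi, ntomp)))
```

Which of these layouts is fastest depends on the system and hardware,
so it is worth benchmarking a few of them.
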
>>>
>>> Looking at the stdout from nvidia-smi, the NVIDIA driver appears to be installed (there are 4 GPUs, but I'm only showing one of them):
>>>
>>>
>>> ==============NVSMI LOG==============
>>>
>>>
>>> Timestamp                           : Wed Apr 20 15:05:52 2016
>>> Driver Version                      : 352.39
>>>
>>>
>>> Attached GPUs                       : 4
>>> GPU 0000:02:00.0
>>>     Product Name                    : Tesla K40c
>>>     Product Brand                   : Tesla
>>>     Display Mode                    : Disabled
>>>     Display Active                  : Disabled
>>>     Persistence Mode                : Disabled
>>>     Accounting Mode                 : Disabled
>>>     Accounting Mode Buffer Size     : 1920
>>>     Driver Model
>>>         Current                     : N/A
>>>         Pending                     : N/A
>>>     Serial Number                   : 0321715040048
>>>     GPU UUID                        : GPU-647b8474-e09f-7f98-3ac5-4604e02f1c75
>>>     Minor Number                    : 0
>>>     VBIOS Version                   : 80.80.3E.00.02
>>>     MultiGPU Board                  : No
>>>     Board ID                        : 0x200
>>>     Inforom Version
>>>         Image Version               : 2081.0206.01.04
>>>         OEM Object                  : 1.1
>>>         ECC Object                  : 3.0
>>>         Power Management Object     : N/A
>>>     GPU Operation Mode
>>>
>>>
>>> One catch is that I didn't have the CUDA GPU Deployment Kit or the NVML library installed. Would that matter for detecting GPUs? Another catch is that I was running the job on an SGE cluster. I believe I requested the GPU resources correctly, because I could compile GROMACS with GPU support on the GPU nodes, but the resulting mdrun still couldn't detect any GPU even when I ran it on the same GPU node where it was compiled.
>>
>>No, that does not matter for detection. NVML only allows mdrun to check/adjust the application clocks. It's optional.
>>
>>Cheers,
>>--
>>Szilárd
>>
>>>
>>> Can anyone tell me if there's any flag I have to turn on during compilation or running mdrun in order to make it see the GPUs? Also, is NVML or the dev kit a must for this?
>>>
>>>
>>> Thanks,
>>> Tim
>>> --
>>> Gromacs Users mailing list
>>>
>>> * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>>>
>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>
>>> * For (un)subscribe requests visit
>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.