[gmx-users] gromacs 5.1.2 mdrun can't detect GPU
treinz
treinz at 163.com
Thu Apr 21 03:55:04 CEST 2016
Hi,
Can you also explain why the calls to cudaDriverGetVersion() and cudaRuntimeGetVersion() both report a version of 0, as in
>> CUDA driver: 0.0
>> CUDA runtime: 0.0
Thanks,
Tim
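For reference, a tiny standalone program built with the same CUDA toolkit shows what those two calls report outside of GROMACS. This is only a sketch, assuming a working nvcc from the cuda/7.5.18 module; the file name check_cuda_versions.cu is arbitrary. Per the CUDA documentation, cudaDriverGetVersion() reports 0 (while still returning cudaSuccess) when the process cannot see a usable driver, and cudaRuntimeGetVersion() reports the version of the libcudart the program is linked against; a healthy CUDA 7.5 setup prints 7050 for both.

/* check_cuda_versions.cu (hypothetical name)
 * Build:  nvcc check_cuda_versions.cu -o check_cuda_versions */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int driver = 0, runtime = 0;

    /* Reports 0 if no CUDA-capable driver is visible to this process. */
    cudaDriverGetVersion(&driver);
    /* Reports the version of the CUDA runtime library linked into this binary. */
    cudaRuntimeGetVersion(&runtime);

    printf("CUDA driver API version:  %d\n", driver);
    printf("CUDA runtime API version: %d\n", runtime);
    return 0;
}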
At 2016-04-21 08:49:17, "Szilárd Páll" <pall.szilard at gmail.com> wrote:
>On Thu, Apr 21, 2016 at 12:22 AM, treinz <treinz at 163.com> wrote:
>> Hi all,
>>
>>
>> I recently built 5.1.2 with GPU support and the config options are:
>>
>>
>> module load cuda/7.5.18
>> cmake .. -DCMAKE_C_COMPILER=gcc-4.9 \
>> -DCMAKE_CXX_COMPILER=g++-4.9 \
>> -DGMX_MPI=OFF \
>> -DGMX_THREAD_MPI=ON \
>> -DGMX_GPU=ON \
>> -DCMAKE_PREFIX_PATH=$HOME/local \
>> -DCMAKE_INSTALL_PREFIX=$HOME/local/gromacs/grid_frontend \
>> -DGMX_BUILD_OWN_FFTW=ON \
>> -DGMX_DEFAULT_SUFFIX=OFF \
>> -DGMX_BINARY_SUFFIX=_gpu \
>> -DGMX_LIBS_SUFFIX=_gpu
>>
>>
>> and the installation was successful. But when I tried running mdrun, it wasn't able to detect the GPU:
>>
>>
>> Build OS/arch: Linux 2.6.32-573.1.1.el6.x86_64 x86_64
>> Build CPU vendor: GenuineIntel
>> Build CPU brand: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
>> Build CPU family: 6 Model: 63 Stepping: 2
>> Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
>> C compiler: /net/noble/vol1/home/dejunlin/local/bin/mpicc GNU 4.9.3
>> C compiler flags: -march=core-avx2 -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value -Wunused-parameter -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
>> C++ compiler: /net/noble/vol1/home/dejunlin/local/bin/mpicxx GNU 4.9.3
>> C++ compiler flags: -march=core-avx2 -Wextra -Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
>> Boost version: 1.59.0 (external)
>> CUDA compiler: /net/gs/vol3/software/modules-sw/cuda/7.5.18/Linux/RHEL6/x86_64/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2015 NVIDIA Corporation;Built on Tue_Aug_11_14:27:32_CDT_2015;Cuda compilation tools, release 7.5, V7.5.17
>> CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_52,code=compute_52;-use_fast_math;; ;-march=core-avx2;-Wextra;-Wno-missing-field-initializers;-Wpointer-arith;-Wall;-Wno-unused-function;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;-Wno-array-bounds;
>> CUDA driver: 0.0
>> CUDA runtime: 0.0
>>
>>
>> NOTE: Error occurred during GPU detection:
>> CUDA driver version is insufficient for CUDA runtime version
>> Can not use GPU acceleration, will fall back to CPU kernels.
>>
>
>^^^ This is the message you should be focusing on. For some reason,
>the CUDA runtime mdrun was compiled against is incompatible with the
>installed driver. That should not happen if the driver really is
>352.xx (which supports driver API v7.5), as shown below, and the
>runtime is also v7.5.
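As a rough illustration of where that NOTE most likely comes from: detection presumably fails on a call like cudaGetDeviceCount(), and cudaErrorInsufficientDriver from that call maps to exactly the string "CUDA driver version is insufficient for CUDA runtime version". The sketch below is not GROMACS code, just a minimal reproduction with an arbitrary file name; if it fails the same way on the compute node, the problem is in the node's driver/runtime setup rather than in the GROMACS build.

/* probe_gpus.cu (hypothetical name)
 * Build:  nvcc probe_gpus.cu -o probe_gpus */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int ndev = 0;

    /* cudaErrorInsufficientDriver here means the libcuda.so / kernel driver
     * found at run time is older than the libcudart this binary was built
     * against (or effectively absent). */
    cudaError_t err = cudaGetDeviceCount(&ndev);
    if (err != cudaSuccess)
    {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA device(s)\n", ndev);
    return 0;
}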
>
>> The command lines for this run is:
>>
>>
>> module load cuda/7.5.18
>> source $HOME/local/gromacs/grid_frontend/bin/GMXRC
>> env
>> nvidia-smi -a
>> gmx_gpu mdrun -ntmpi 24 -ntomp 1 -gpu_id 1 -deffnm
>
>FYI: That's going to be inefficient, probably the worst possible
>option in fact. Try 1-6 ranks/GPU.
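As a concrete illustration of that suggestion (the best split is system-dependent, so treat it as a starting point rather than a recommendation from the thread): on this node, with 24 hardware threads and only GPU 1 requested, a launch in that range would look like "gmx_gpu mdrun -ntmpi 4 -ntomp 6 -gpu_id 1111 -deffnm ...", i.e. 4 thread-MPI ranks with 6 OpenMP threads each, all mapped to GPU 1; in 5.1 the -gpu_id string lists one GPU id per PP rank.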
>
>>
>> I looked at the stdout from nvidia-smi, and it looks like the NVIDIA driver is installed (there are 4 GPUs, but I'm only showing one of them):
>>
>>
>> ==============NVSMI LOG==============
>>
>>
>> Timestamp : Wed Apr 20 15:05:52 2016
>> Driver Version : 352.39
>>
>>
>> Attached GPUs : 4
>> GPU 0000:02:00.0
>> Product Name : Tesla K40c
>> Product Brand : Tesla
>> Display Mode : Disabled
>> Display Active : Disabled
>> Persistence Mode : Disabled
>> Accounting Mode : Disabled
>> Accounting Mode Buffer Size : 1920
>> Driver Model
>> Current : N/A
>> Pending : N/A
>> Serial Number : 0321715040048
>> GPU UUID : GPU-647b8474-e09f-7f98-3ac5-4604e02f1c75
>> Minor Number : 0
>> VBIOS Version : 80.80.3E.00.02
>> MultiGPU Board : No
>> Board ID : 0x200
>> Inforom Version
>> Image Version : 2081.0206.01.04
>> OEM Object : 1.1
>> ECC Object : 3.0
>> Power Management Object : N/A
>> GPU Operation Mode
>>
>>
>> One catch is that I didn't have the CUDA GPU Deployment Kit or the NVML library installed -- would that matter for detecting GPUs? Another catch is that I was running the job on an SGE cluster. I believe I requested the GPU resources correctly, because I could compile GROMACS with GPU support on the GPU nodes, yet the resulting mdrun still couldn't detect any GPU when I ran it on the same GPU node where it was compiled.
>
>No! NVML only lets mdrun check/adjust the GPU application clocks; it's optional and not needed for GPU detection.
>
>Cheers,
>--
>Szilárd
>
>>
>> Can anyone tell me if there's a flag I have to turn on at compile time, or when running mdrun, to make it see the GPUs? Also, is NVML or the deployment kit a must for this?
>>
>>
>> Thanks,
>> Tim