[gmx-users] gromacs on GPU

Szilárd Páll szilard.pall at cbr.su.se
Thu Jan 10 04:29:18 CET 2013


Hi James,

The build looks mostly fine, except that you are using an FFTW3 library
compiled with AVX, which is slower than an SSE-only build (even on
AVX-capable CPUs). You should have been warned about this at configure time.

Now, performance-wise everything looks fine except that with a 1.2 nm
cut-off your GPU is not able to keep up with the CPU and finish the
non-bonded work before the CPU is done with Bonded + PME. That's why you
see the "Wait GPU" taking 20% of the total time and that's also why you see
some cores idling (because for 20% of the run-time thread 0 on core 0
is blocked waiting for the GPU while the rest idle).
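
To put rough numbers on this, here is a minimal sketch (plain Python, not
GROMACS output). It recomputes the summary figures from the timings in the
quoted log and estimates an idealized upper bound on what removing the GPU
wait could gain. The 20% wait fraction is the approximate figure mentioned
above, and the assumption that all of the waiting could be eliminated with no
other cost is optimistic.

```python
# Figures copied from the summary at the end of the quoted log.
core_time_s = 2770.700
wall_time_s = 1051.927
simulated_ns = 0.1  # the 100 ps test run

# Assumption: "Wait GPU" accounts for roughly 20% of the wall time.
wait_gpu_fraction = 0.20


def ns_per_day(ns, wall_s):
    """Simulated nanoseconds per day of wall-clock time."""
    return ns * 86400.0 / wall_s


def hours_per_ns(ns, wall_s):
    """Wall-clock hours needed per simulated nanosecond."""
    return wall_s / 3600.0 / ns


cpu_usage_percent = 100.0 * core_time_s / wall_time_s
current = ns_per_day(simulated_ns, wall_time_s)

# Idealized bound: the whole wait disappears and nothing else gets slower.
upper_bound = ns_per_day(simulated_ns, wall_time_s * (1.0 - wait_gpu_fraction))

print(f"CPU usage:   {cpu_usage_percent:.1f} %")
print(f"Performance: {current:.3f} ns/day, "
      f"{hours_per_ns(simulated_ns, wall_time_s):.3f} hour/ns")
print(f"Upper bound without GPU wait: {upper_bound:.2f} ns/day")
```

The recomputed values should agree with the log summary to within rounding.
The upper bound is only a ceiling; a real rebalanced run would land somewhere
below it.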

As the suggestion at the end of the log file points out, you can consider
using a shorter cut-off, which will push more work back to the PME on the
CPU, but whether you can do this depends on your particular problem.
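
For a rough feel of that trade-off, here is a small sketch. It assumes the
common PME practice of scaling the Coulomb cut-off and the Fourier grid
spacing by the same factor so that the Ewald accuracy stays about the same.
The 1.0 nm target cut-off and the 0.12 nm starting grid spacing are
illustrative assumptions, not values taken from the run, and the helper name
rescale_pme is made up for this example.

```python
def rescale_pme(rc_old, spacing_old, rc_new):
    """Crude scaling estimate for changing the cut-off at fixed PME accuracy.

    Assumes pair-interaction work grows with the cut-off volume (rc**3) and
    that the number of PME grid points grows as 1 / spacing**3. The pair-list
    buffer and the FFT's extra log factor are ignored.
    """
    scale = rc_new / rc_old
    return {
        "fourier_spacing_nm": spacing_old * scale,
        "pair_work_ratio": scale ** 3,
        "grid_points_ratio": scale ** -3,
    }


# Hypothetical example: going from the 1.2 nm cut-off to 1.0 nm.
result = rescale_pme(rc_old=1.2, spacing_old=0.12, rc_new=1.0)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the GPU would do a bit under 60% of its current pair
work, while the CPU would handle roughly 1.7 times as many PME grid points.
Whether a shorter cut-off is acceptable at all still depends on the force
field and the system.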

One more alternative is to run two MPI processes on the same GPU
(mpirun -np 2 mdrun -gpu_id 00) and use the -nb gpu_cpu mode, which will
execute part of the non-bonded work on the CPU, but this might not help.

Cheers,

--
Szilárd


On Wed, Jan 9, 2013 at 8:27 PM, James Starlight <jmsstarlight at gmail.com> wrote:

> Dear Szilárd, thanks for help again!
>
> 2013/1/9 Szilárd Páll <szilard.pall at cbr.su.se>:
>
> >
> > There could be, but I/we can't tell without more information on what and
> > how you compiled and ran. The minimum we need is a log file.
> >
> I've compiled GROMACS 4.6-beta3 with a simple
>
>
> cmake CMakeLists.txt -DGMX_GPU=ON \
>     -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-5.0
> make
> sudo make install
>
> I have not added any special params to the grompp or mdrun.
>
> After that I ran a 100 ps test simulation of calmodulin in explicit
> water (60k atoms) and obtained the following output:
>
> Host: starlight  pid: 21028  nodeid: 0  nnodes:  1
> Gromacs version:    VERSION 4.6-beta3
> Precision:          single
> MPI library:        thread_mpi
> OpenMP support:     enabled
> GPU support:        enabled
> invsqrt routine:    gmx_software_invsqrt(x)
> CPU acceleration:   AVX_256
> FFT library:        fftw-3.3.2-sse2-avx
> Large file support: enabled
> RDTSCP usage:       enabled
> Built on:           Wed Jan  9 20:44:51 MSK 2013
> Built by:           own at starlight [CMAKE]
> Build OS/arch:      Linux 3.2.0-2-amd64 x86_64
> Build CPU vendor:   GenuineIntel
> Build CPU brand:    Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
> Build CPU family:   6   Model: 58   Stepping: 9
> Build CPU features: aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm
> mmx msr nonstop_tsc pcid pclmuldq pdcm popcnt pse rdrnd rdtscp sse2
> sse3 sse4.1 sse4.2 ssse3 tdt x2apic
> C compiler:         /usr/bin/gcc GNU gcc (Debian 4.6.3-11) 4.6.3
> C compiler flags:   -mavx  -Wextra -Wno-missing-field-initializers
> -Wno-sign-compare -Wall -Wno-unused -Wunused-value
> -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -O3
> -DNDEBUG
> C++ compiler:       /usr/bin/c++ GNU c++ (Debian 4.6.3-11) 4.6.3
> C++ compiler flags: -mavx  -Wextra -Wno-missing-field-initializers
> -Wno-sign-compare -Wall -Wno-unused -Wunused-value
> -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -O3
> -DNDEBUG
> CUDA compiler:      nvcc: NVIDIA (R) Cuda compiler driver;Copyright
> (c) 2005-2012 NVIDIA Corporation;Built on
> Fri_Sep_21_17:28:58_PDT_2012;Cuda compilation tools, release 5.0,
> V0.2.1221
> CUDA driver:        5.0
> CUDA runtime:       5.0
>
> ****************
>
>                Core t (s)   Wall t (s)        (%)
>        Time:     2770.700     1051.927      263.4
>                  (ns/day)    (hour/ns)
> Performance:        8.214        2.922
>
> full log can be found here http://www.sendspace.com/file/inum84
>
>
> Finally, when I checked CPU usage I noticed that only one core was fully
> loaded (100%) while cores 2-4 were loaded to only 60%, but nvidia-smi gave
> me strange results suggesting that the GPU is not used (I've only monitored
> the temperature of the video card and noticed an increase of up to 65
> degrees).
>
> +------------------------------------------------------+
> | NVIDIA-SMI 4.304.54   Driver Version: 304.54         |
> |-------------------------------+----------------------+----------------------+
> | GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
> | Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
> |===============================+======================+======================|
> |   0  GeForce GTX 670          | 0000:02:00.0     N/A |                  N/A |
> | 38%   63C  N/A     N/A /  N/A |   9%  174MB / 2047MB |     N/A      Default |
> +-------------------------------+----------------------+----------------------+
>
> +-----------------------------------------------------------------------------+
> | Compute processes:                                               GPU Memory |
> |  GPU       PID  Process name                                     Usage      |
> |=============================================================================|
> |    0            Not Supported                                               |
> +-----------------------------------------------------------------------------+
>
>
> Thanks for help again,
>
> James