[gmx-users] problem with gpu performance
Justin Lemkul
jalemkul at vt.edu
Fri Sep 4 16:38:36 CEST 2015
On 9/4/15 10:35 AM, Peter Kroon wrote:
> Hi Jagannath,
>
> I don't dare comment on these specifics. There are probably some
> (GROMACS-specific) benchmarks out there *somewhere*, quite possibly on this
> list. But maybe someone else on the list knows what you should get :)
>
Quoting Carsten from a few days ago:
http://dx.doi.org/10.1002/jcc.24030
http://dx.doi.org/10.1007/978-3-319-15976-8_1
http://pubman.mpdl.mpg.de/pubman/item/escidoc:2037317/component/escidoc:2037318/2037317.pdf?mode=download
-Justin
> Peter
>
> On 04/09/15 15:58, jagannath mondal wrote:
>> Hi Peter
>> Thanks for your response. I also realized that the GT 610 is not able to
>> keep up with the faster CPU (Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz). I
>> tried the CPU-GPU combination for the -nb option. It improves performance
>> slightly, but not by much. So, we are planning to replace the GPU cards.
>> At this point, we have two plans: either a single 4 GB GTX 970 or two
>> 2 GB GTX 960s. I was wondering whether you can comment on which option
>> will be better as far as performance is concerned.
>> Thanks for your input
>> jagannath
>>
>> On Fri, Sep 4, 2015 at 6:45 PM, Peter Kroon <p.c.kroon at rug.nl> wrote:
>>
>>> Hi Jagannath,
>>>
>>> AFAIK GT 610s are rather slow. What you could try is using both the CPU
>>> and the GPU for non-bonded interactions (-nb gpu_cpu).
>>>
>>> Peter
>>>
>>> On 04/09/15 15:01, jagannath mondal wrote:
>>>> Dear Gromacs Users
>>>>
>>>> I am trying to run the GPU version of GROMACS 5.0.6 on a workstation with
>>>> a hexacore processor that can run 12 hardware threads. The workstation
>>>> has 2 GeForce GT 610 GPUs. I am finding that the simulation using -nb gpu
>>>> is far slower than with -nb cpu (i.e., with the GPU turned off).
>>>>
>>>> I installed cuda-7.0 and used it to build the GPU version of GROMACS
>>>> 5.0.6 as follows.
>>>>
>>>> cmake ../ -DGMX_BUILD_OWN_FFTW=ON \
>>>>   -DCMAKE_INSTALL_PREFIX=/home/jmondal/UTIL/GROMACS_5.0.6_gpu/ \
>>>>   -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DGMX_GPU=ON \
>>>>   -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda/
>>>>
>>>>
>>>> However, the performance with gpu is very weird. If I do mdrun using
>>>> following command:
>>>> 1) gmx mdrun -s topol. -nb gpu -v &>log_run
>>>>
>>>> and then repeat the same run with GPU usage turned off
>>>>
>>>> 2) gmx mdrun -s topol -nb cpu -v >& log_run
>>>>
>>>> the performance with the GPUs drops about 3 times! Using both GPUs
>>>> along with the CPUs, the performance is 1.620 ns/day;
>>>> using only the CPUs, the performance is 4.6 ns/day. Using the GPUs is
>>>> frustratingly slowing down the run.
>>>>
>>>> When using the -nb gpu option, the GROMACS md.log correctly detects the
>>>> GPUs and CPU as follows:
>>>>
>>>> Using 2 MPI threads
>>>> Using 6 OpenMP threads per tMPI thread
>>>>
>>>> Detecting CPU SIMD instructions.
>>>> Present hardware specification:
>>>> Vendor: GenuineIntel
>>>> Brand: Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz
>>>> Family: 6 Model: 63 Stepping: 2
>>>> Features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf_lm mmx
>>>> msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2
>>>> sse3 sse4.1 sse4.2 ssse3 tdt x2apic
>>>> SIMD instructions most likely to fit this hardware: AVX2_256
>>>> SIMD instructions selected at GROMACS compile time: AVX2_256
>>>>
>>>>
>>>> 2 GPUs detected:
>>>> #0: NVIDIA GeForce GT 610, compute cap.: 2.1, ECC: no, stat: compatible
>>>> #1: NVIDIA GeForce GT 610, compute cap.: 2.1, ECC: no, stat: compatible
>>>> 2 GPUs auto-selected for this run.
>>>> Mapping of GPUs to the 2 PP ranks in this node: #0, #1
>>>>
>>>>
>>>> However, when I look at the performance summary at the end of the
>>>> simulation, 'Wait GPU nonlocal' takes an awfully long time.
>>>> I also tried a few other options (such as using only 1 GPU with
>>>> -gpu_id 0) and played with the -ntmpi and -ntomp options, but GPU
>>>> performance remains drastically poor (surprisingly, 3 times slower than
>>>> the CPU-only simulation).
>>>>
>>>> I am struggling to figure out whether it is a hardware issue, a
>>>> GPU-driver issue, or whether I am not using the optimal options.
>>>> Your suggestions will be useful in solving the issue.
>>>> Jagannath
>>>>
>>>>
>>>>      R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>>>
>>>> On 2 MPI ranks, each using 6 OpenMP threads
>>>>
>>>>  Computing:            Num     Num     Call   Wall time  Giga-Cycles
>>>>                      Ranks Threads    Count         (s)    total sum     %
>>>> -----------------------------------------------------------------------------
>>>>  Domain decomp.          2       6       63       0.270       11.322   0.2
>>>>  DD comm. load           2       6       13       0.000        0.002   0.0
>>>>  Neighbor search         2       6       63       0.311       13.062   0.2
>>>>  Launch GPU ops.         2       6     5002       0.205        8.614   0.2
>>>>  Comm. coord.            2       6     2438       0.239       10.016   0.2
>>>>  Force                   2       6     2501       1.358       57.011   1.0
>>>>  Wait + Comm. F          2       6     2501       0.404       16.954   0.3
>>>>  PME mesh                2       6     2501       9.734      408.587   7.3
>>>>  Wait GPU nonlocal       2       6     2501     117.798     4944.651  88.3
>>>>  Wait GPU local          2       6     2501       0.005        0.206   0.0
>>>>  NB X/F buffer ops.      2       6     9878       0.255       10.683   0.2
>>>>  Write traj.             2       6        4       0.180        7.558   0.1
>>>>  Update                  2       6     2501       0.807       33.886   0.6
>>>>  Constraints             2       6     2501       1.216       51.025   0.9
>>>>  Comm. energies          2       6      126       0.001        0.055   0.0
>>>>  Rest                                             0.609       25.573   0.5
>>>> -----------------------------------------------------------------------------
>>>>  Total                                          133.392     5599.205 100.0
>>>
>>>
>>>
>
>
>
>
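As a rough cross-check of the numbers quoted above, here is a minimal Python sketch (not part of the original messages; the variable names are illustrative). It takes the wall times from the quoted md.log accounting table and the two reported throughputs, then works out which step dominates the run and how large the GPU slowdown is. It only re-does arithmetic on the posted figures and says nothing about the underlying cause.

```python
# Wall times (s) copied from the "Wall time" column of the quoted md.log table.
wall_time_s = {
    "Domain decomp.": 0.270,
    "DD comm. load": 0.000,
    "Neighbor search": 0.311,
    "Launch GPU ops.": 0.205,
    "Comm. coord.": 0.239,
    "Force": 1.358,
    "Wait + Comm. F": 0.404,
    "PME mesh": 9.734,
    "Wait GPU nonlocal": 117.798,
    "Wait GPU local": 0.005,
    "NB X/F buffer ops.": 0.255,
    "Write traj.": 0.180,
    "Update": 0.807,
    "Constraints": 1.216,
    "Comm. energies": 0.001,
    "Rest": 0.609,
}

# Throughputs (ns/day) as reported in the original post.
ns_per_day_gpu = 1.620  # -nb gpu, both GT 610 cards
ns_per_day_cpu = 4.6    # -nb cpu

total_s = sum(wall_time_s.values())

# The step with the largest wall time and its share of the total.
dominant = max(wall_time_s, key=wall_time_s.get)
dominant_share = wall_time_s[dominant] / total_s

# How many times faster the CPU-only run is than the GPU run.
slowdown = ns_per_day_cpu / ns_per_day_gpu

print(f"total wall time: {total_s:.3f} s")
print(f"dominant step:   {dominant} ({dominant_share:.1%} of wall time)")
print(f"CPU-only run is {slowdown:.2f}x faster than the GPU run")
```

The per-step times add up to the table's reported total, and the waiting time on the GPU accounts for nearly all of it, which is consistent with the CPU ranks sitting idle while the GT 610 cards finish the non-bonded work.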
--
==================================================
Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow
Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 629
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201
jalemkul at outerbanks.umaryland.edu | (410) 706-7441
http://mackerell.umaryland.edu/~jalemkul
==================================================