[gmx-users] GPU performance

Benjamin Bobay bgbobay at ncsu.edu
Tue Apr 9 22:27:39 CEST 2013


Good afternoon -

I recently installed GROMACS 4.6 on CentOS 6.3, and the installation went
just fine.

I have a Tesla C2075 GPU.

I then downloaded the benchmark directories and ran the
GPU/dhfr-solv-PME.bench benchmark.

This is what I got:

Using 1 MPI thread
Using 4 OpenMP threads

1 GPU detected:
  #0: NVIDIA Tesla C2075, compute cap.: 2.0, ECC: yes, stat: compatible

1 GPU user-selected for this run: #0


Back Off! I just backed up ener.edr to ./#ener.edr.1#
starting mdrun 'Protein in water'
-1 steps, infinite ps.
step   40: timed with pme grid 64 64 64, coulomb cutoff 1.000: 4122.9 M-cycles
step   80: timed with pme grid 56 56 56, coulomb cutoff 1.143: 3685.9 M-cycles
step  120: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3110.8 M-cycles
step  160: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3365.1 M-cycles
step  200: timed with pme grid 40 40 40, coulomb cutoff 1.600: 3499.0 M-cycles
step  240: timed with pme grid 52 52 52, coulomb cutoff 1.231: 3982.2 M-cycles
step  280: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3129.2 M-cycles
step  320: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3425.4 M-cycles
step  360: timed with pme grid 42 42 42, coulomb cutoff 1.524: 2979.1 M-cycles
              optimal pme grid 42 42 42, coulomb cutoff 1.524
step 4300 performance: 1.8 ns/day
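
The tuning lines above can be cross-checked with a short script. This is only a sketch: the helper names parse_tuning and fastest are hypothetical (they are not part of GROMACS), and the regular expression assumes the "timed with pme grid" line format exactly as it appears in this output.

```python
import re

# Tuning lines copied from the mdrun output above.
LOG = """\
step   40: timed with pme grid 64 64 64, coulomb cutoff 1.000: 4122.9 M-cycles
step   80: timed with pme grid 56 56 56, coulomb cutoff 1.143: 3685.9 M-cycles
step  120: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3110.8 M-cycles
step  160: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3365.1 M-cycles
step  200: timed with pme grid 40 40 40, coulomb cutoff 1.600: 3499.0 M-cycles
step  240: timed with pme grid 52 52 52, coulomb cutoff 1.231: 3982.2 M-cycles
step  280: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3129.2 M-cycles
step  320: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3425.4 M-cycles
step  360: timed with pme grid 42 42 42, coulomb cutoff 1.524: 2979.1 M-cycles
"""

# Assumed line format; the timing sits on the same line as the cutoff.
PATTERN = re.compile(
    r"step\s+(\d+): timed with pme grid (\d+) (\d+) (\d+), "
    r"coulomb cutoff ([\d.]+): ([\d.]+)"
)


def parse_tuning(log):
    """Return (step, grid, cutoff, mcycles) for every tuning line (hypothetical helper)."""
    rows = []
    for m in PATTERN.finditer(log):
        step, nx, ny, nz = (int(m.group(i)) for i in range(1, 5))
        rows.append((step, (nx, ny, nz), float(m.group(5)), float(m.group(6))))
    return rows


def fastest(rows):
    """Pick the tuning line with the lowest M-cycles count (hypothetical helper)."""
    return min(rows, key=lambda r: r[3])


rows = parse_tuning(LOG)
best = fastest(rows)
print("fastest setting:", best)
```

The lowest timing is the 42 42 42 grid with a 1.524 cutoff, which matches the "optimal pme grid" line that mdrun reports.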

and from the nvidia-smi output:
Tue Apr  9 10:13:46 2013
+------------------------------------------------------+
| NVIDIA-SMI 4.304.37   Driver Version: 304.37         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2075              | 0000:03:00.0      On |                    0 |
| 30%   67C    P0    80W / 225W |   4%  200MB / 5375MB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     22568  mdrun                                                 59MB  |
+-----------------------------------------------------------------------------+


So I am only getting 1.8 ns/day! Is that right? That seems very low, and it
is no faster than the CPU-only test, where I get the same figure:

step 200 performance: 1.8 ns/day    vol 0.79  imb F 14%

From the md.log of the GPU test:
Detecting CPU-specific acceleration.
Present hardware specification:
Vendor: GenuineIntel
Brand:  Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz
Family:  6  Model: 45  Stepping:  7
Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc
pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
tdt x2apic
Acceleration most likely to fit this hardware: AVX_256
Acceleration selected at GROMACS compile time: AVX_256


1 GPU detected:
  #0: NVIDIA Tesla C2075, compute cap.: 2.0, ECC: yes, stat: compatible

1 GPU user-selected for this run: #0

Will do PME sum in reciprocal space.

Any thoughts as to why it is so slow?

Many thanks!
Ben

-- 
____________________________________________
Research Assistant Professor
North Carolina State University
Department of Molecular and Structural Biochemistry
128 Polk Hall
Raleigh, NC 27695
Phone: (919)-513-0698
Fax: (919)-515-2047
____________________________________________


