[gmx-users] GPU performance
Benjamin Bobay
bgbobay at ncsu.edu
Tue Apr 9 22:27:39 CEST 2013
Good afternoon -
I recently installed GROMACS 4.6 on CentOS 6.3 and the installation went
just fine.
I have a Tesla C2075 GPU.
I then downloaded the benchmark directories and ran the
GPU/dhfr-solv-PME.bench benchmark.
This is what I got:
Using 1 MPI thread
Using 4 OpenMP threads
1 GPU detected:
#0: NVIDIA Tesla C2075, compute cap.: 2.0, ECC: yes, stat: compatible
1 GPU user-selected for this run: #0
Back Off! I just backed up ener.edr to ./#ener.edr.1#
starting mdrun 'Protein in water'
-1 steps, infinite ps.
step 40: timed with pme grid 64 64 64, coulomb cutoff 1.000: 4122.9 M-cycles
step 80: timed with pme grid 56 56 56, coulomb cutoff 1.143: 3685.9 M-cycles
step 120: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3110.8 M-cycles
step 160: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3365.1 M-cycles
step 200: timed with pme grid 40 40 40, coulomb cutoff 1.600: 3499.0 M-cycles
step 240: timed with pme grid 52 52 52, coulomb cutoff 1.231: 3982.2 M-cycles
step 280: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3129.2 M-cycles
step 320: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3425.4 M-cycles
step 360: timed with pme grid 42 42 42, coulomb cutoff 1.524: 2979.1 M-cycles
optimal pme grid 42 42 42, coulomb cutoff 1.524
step 4300 performance: 1.8 ns/day
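As a side check, the tuning lines above can be tabulated with a short script. This is only a sketch: the `LOG` string is copied by hand from the output above, the names `PATTERN` and `parse_tuning` are made up for illustration, and picking the lowest M-cycles value is a simplification that may not match mdrun's own selection logic. It also prints grid size times cutoff, which stays close to 64 because the grid spacing and the Coulomb cutoff are scaled together.

```python
import re

# Tuning lines copied by hand from the mdrun output above, one entry per line.
LOG = """\
step 40: timed with pme grid 64 64 64, coulomb cutoff 1.000: 4122.9 M-cycles
step 80: timed with pme grid 56 56 56, coulomb cutoff 1.143: 3685.9 M-cycles
step 120: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3110.8 M-cycles
step 160: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3365.1 M-cycles
step 200: timed with pme grid 40 40 40, coulomb cutoff 1.600: 3499.0 M-cycles
step 240: timed with pme grid 52 52 52, coulomb cutoff 1.231: 3982.2 M-cycles
step 280: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3129.2 M-cycles
step 320: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3425.4 M-cycles
step 360: timed with pme grid 42 42 42, coulomb cutoff 1.524: 2979.1 M-cycles
"""

# Hypothetical pattern matching the "timed with pme grid" lines shown above.
PATTERN = re.compile(
    r"step\s+(\d+): timed with pme grid (\d+) (\d+) (\d+), "
    r"coulomb cutoff ([\d.]+): ([\d.]+) M-cycles"
)


def parse_tuning(log):
    """Return (step, grid, cutoff_nm, mcycles) tuples for every timing line."""
    rows = []
    for m in PATTERN.finditer(log):
        step, nx, ny, nz = (int(m.group(i)) for i in range(1, 5))
        rows.append((step, (nx, ny, nz), float(m.group(5)), float(m.group(6))))
    return rows


rows = parse_tuning(LOG)

# Simplification: take the setting with the lowest measured cost.
best = min(rows, key=lambda r: r[3])
print(f"fastest timed setting: grid {best[1]}, cutoff {best[2]} nm, "
      f"{best[3]} M-cycles")

# Columns: grid size, cutoff (nm), grid * cutoff, M-cycles.
for step, grid, cutoff, mcycles in rows:
    print(f"{grid[0]:>3}  {cutoff:.3f}  {grid[0] * cutoff:6.2f}  {mcycles:8.1f}")
```

The fastest timed setting is the 42 42 42 grid with a 1.524 nm cutoff, which agrees with the "optimal pme grid" line reported by mdrun.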
and from the nvidia-smi output:
Tue Apr 9 10:13:46 2013
+------------------------------------------------------+
| NVIDIA-SMI 4.304.37   Driver Version: 304.37         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2075              | 0000:03:00.0     On  |                    0 |
| 30%   67C    P0    80W / 225W |   4%  200MB / 5375MB |      4%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     22568  mdrun                                                  59MB |
+-----------------------------------------------------------------------------+
So I am only getting 1.8 ns/day! Is that right? That seems very low for a
GPU run, since the CPU test gives me the same figure:
step 200 performance: 1.8 ns/day vol 0.79 imb F 14%
From the md.log of the GPU test:
Detecting CPU-specific acceleration.
Present hardware specification:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz
Family: 6 Model: 45 Stepping: 7
Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
Acceleration most likely to fit this hardware: AVX_256
Acceleration selected at GROMACS compile time: AVX_256
1 GPU detected:
#0: NVIDIA Tesla C2075, compute cap.: 2.0, ECC: yes, stat: compatible
1 GPU user-selected for this run: #0
Will do PME sum in reciprocal space.
Any thoughts as to why it is so slow?
many thanks!
Ben
--
____________________________________________
Research Assistant Professor
North Carolina State University
Department of Molecular and Structural Biochemistry
128 Polk Hall
Raleigh, NC 27695
Phone: (919)-513-0698
Fax: (919)-515-2047
____________________________________________