[gmx-users] GPU performance question

Irem Altan irem.altan at duke.edu
Fri Oct 7 20:23:19 CEST 2016


Hi,

I had been running simulations on our local cluster, but now I'm doing some tests on SDSC's Comet. Even though I'm using considerably more resources, the speed-up is only two-fold. I was wondering whether I was making an error in setting up the simulations that results in lower performance.

My previous setup (1 node):

1 MPI thread, 6 OpenMP threads, 1 GPU
(NVIDIA Tesla K80, 6 cores, 6 logical cores (Intel Xeon E5-2690))

==> 52.705 ns/day

the command was:

gmx mdrun -v -deffnm npt
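
For reference, I think the explicit equivalent of what mdrun chose automatically here would be something like the following (the rank/thread/GPU mapping is my assumption based on what the log reported):

gmx mdrun -v -deffnm npt -ntmpi 1 -ntomp 6 -gpu_id 0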

————————————————

The current setup (1 node):

4 MPI processes, 6 OpenMP threads per MPI process, 4 GPUs
(NVIDIA Tesla K80, 24 cores, 24 logical cores (Intel Xeon E5-2680))

==> 100.577 ns/day

the command:

ibrun gmx_mpi mdrun -v -deffnm npt
(I found it in an example on Comet)
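
If it helps, I believe the fully explicit form of this run would be something like the following (assuming one PP rank per GPU; the -gpu_id string is my guess at the rank-to-GPU mapping):

ibrun gmx_mpi mdrun -v -deffnm npt -ntomp 6 -gpu_id 0123
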
___________

The .mdp file was identical in both cases (see the end of this e-mail for its contents). The speed-up is only two-fold, despite the fact that I quadrupled the resources. Is this normal? If not, how can I optimize my setup? (Note: the simulation in the second setup has ~20,000 fewer atoms.)
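
In case it's relevant, I was also planning to compare short timing runs along these lines (-resethway resets the timing counters halfway through, so the reported ns/day excludes the initial load-balancing phase; the step count is just my guess at something long enough to be representative):

ibrun gmx_mpi mdrun -v -deffnm npt -ntomp 6 -pin on -resethway -nsteps 20000

and then to check the load-balance notes at the end of npt.log.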

Best,
Irem

.mdp file:

title       = NPT Equilibration
define      = -DPOSRES          ; position restrain the protein
; Run parameters
integrator  = md                ; leap-frog integrator
nsteps      = 500000            ; 2 * 500000 = 1000 ps (1 ns)
dt          = 0.002             ; 2 fs
; Output control
nstxout     = 5000              ; save coordinates every 10 ps
nstvout     = 5000              ; save velocities every 10 ps
nstenergy   = 5000              ; save energies every 10 ps
nstlog      = 5000              ; update log file every 10 ps
; Bond parameters
continuation         = no        ; Initial simulation
constraint_algorithm = lincs     ; holonomic constraints
constraints          = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter           = 1         ; accuracy of LINCS
lincs_order          = 4         ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type     = grid              ; search neighboring grid cells
nstlist     = 20                ; 40 fs
rlist       = 1.0               ; short-range neighborlist cutoff (in nm)
rcoulomb    = 1.0               ; short-range electrostatic cutoff (in nm)
rvdw        = 1.0               ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype     = PME           ; Particle Mesh Ewald for long-range electrostatics
pme_order       = 4             ; cubic interpolation
fourierspacing  = 0.16          ; grid spacing for FFT
; Temperature coupling is on
tcoupl      = V-rescale             ; Weak coupling for initial equilibration
tc-grps     = Protein   Non-Protein ; two coupling groups - more accurate
tau_t       = 0.1 0.1         ; time constant, in ps
ref_t       = 277 277         ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl              = Berendsen     ; Pressure coupling on in NPT, also weak coupling
pcoupltype          = isotropic     ; uniform scaling of x-y-z box vectors
tau_p               = 2.0           ; time constant, in ps
ref_p               = 1.0           ; reference pressure (in bar)
compressibility     = 4.5e-5        ; isothermal compressibility, bar^-1
refcoord_scaling    = com
; Periodic boundary conditions
pbc     = xyz                   ; 3-D PBC
; Dispersion correction
DispCorr    = EnerPres          ; account for cut-off vdW scheme
; Velocity generation
gen_vel     = yes               ; Velocity generation is on
gen_temp    = 277               ; temperature for velocity generation
gen_seed    = -1                ; random seed
; COM motion removal
; These options remove COM motion of the system
nstcomm         = 10
comm-mode = Linear
comm-grps = System

