[gmx-users] GPU performance question
Irem Altan
irem.altan at duke.edu
Fri Oct 7 20:23:19 CEST 2016
Hi,
I had been running simulations using our local cluster, but now I’m doing some tests on SDSC’s Comet. Even though I am using considerably more resources, the speed-up is only two-fold. I was wondering whether I made an error while setting up the simulations that results in lower performance.
My previous setup (1 node):
1 MPI thread, 6 OpenMP threads, 1 GPU
(nVidia Tesla K80, 6 cores, 6 logical cores (intel xeon e5-2690))
==> 52.705 ns/day
the command was:
gmx mdrun -v -deffnm npt
___________
The current setup (1 node):
4 MPI processes, 6 OpenMP threads per MPI process, 4 GPUs
(nVidia Tesla K80, 24 cores, 24 logical cores (intel xeon e5-2680))
==> 100.577 ns/day
the command:
ibrun gmx_mpi mdrun -v -deffnm npt
(found it in an example in Comet)
___________
The .mdp file was identical (see end of e-mail for the contents). The speed-up is only two-fold, even though I quadrupled the resources. Is this normal? If not, how can I optimize my setup? (Note: the simulation in the second setup has ~20,000 less
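As a quick sanity check on the "two-fold" figure, the speed-up and parallel efficiency implied by the two ns/day values above can be worked out in a few lines of Python. This is only a rough sketch: the variable names are illustrative, and the factor of 4 assumes that resources scale with the core and GPU counts listed for the two setups. Since the two systems are not the same size, the comparison is approximate.

```python
# Throughput figures quoted in this message (ns/day).
old_ns_per_day = 52.705    # 1 MPI thread x 6 OpenMP threads, 1 GPU
new_ns_per_day = 100.577   # 4 MPI processes x 6 OpenMP threads, 4 GPUs

# Assumption: "resources" means cores and GPUs, both of which go up 4x.
resource_factor = 4

speedup = new_ns_per_day / old_ns_per_day
efficiency = speedup / resource_factor

print(f"speed-up: {speedup:.2f}x")
print(f"parallel efficiency: {efficiency:.0%}")
```

With these numbers the speed-up is about 1.9x, i.e. a parallel efficiency a little under 50%.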
Best,
Irem
.mdp file:
title = NPT Equilibration
define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 500000 ; 0.002 ps * 500000 = 1000 ps (1 ns)
dt = 0.002 ; 2 fs
; Output control
nstxout = 5000 ; save coordinates every 10 ps
nstvout = 5000 ; save velocities every 10 ps
nstenergy = 5000 ; save energies every 10 ps
nstlog = 5000 ; update log file every 10 ps
; Bond parameters
continuation = no ; Initial simulation
constraint_algorithm = lincs ; holonomic constraints
constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
; Neighborsearching
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 20 ; 40 fs
rlist = 1.0 ; short-range neighborlist cutoff (in nm)
rcoulomb = 1.0 ; short-range electrostatic cutoff (in nm)
rvdw = 1.0 ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
pme_order = 4 ; cubic interpolation
fourierspacing = 0.16 ; grid spacing for FFT
; Temperature coupling is on
tcoupl = V-rescale ; Weak coupling for initial equilibration
tc-grps = Protein Non-Protein ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 277 277 ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl = Berendsen ; Pressure coupling on in NPT, also weak coupling
pcoupltype = isotropic ; uniform scaling of x-y-z box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure (in bar)
compressibility = 4.5e-5 ; isothermal compressibility, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction
DispCorr = EnerPres ; account for cut-off vdW scheme
; Velocity generation
gen_vel = yes ; Velocity generation is on
gen_temp = 277 ; temperature for velocity generation
gen_seed = -1 ; random seed
; COM motion removal
; These options remove COM motion of the system
nstcomm = 10
comm-mode = Linear
comm-grps = System
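
The time intervals implied by the step counts in the .mdp file follow from multiplying each step count by dt. A minimal Python sketch of that arithmetic, using the values from the file above (the variable names are illustrative and are not GROMACS identifiers):

```python
# Values copied from the .mdp file above.
dt_ps = 0.002        # dt, in ps
nsteps = 500000      # nsteps
nst_output = 5000    # nstxout, nstvout, nstenergy and nstlog share this value
nstlist = 20         # nstlist

# Each interval is simply (number of steps) * (time step).
total_ps = nsteps * dt_ps
output_interval_ps = nst_output * dt_ps
nstlist_interval_fs = nstlist * dt_ps * 1000  # convert ps to fs

print(f"total simulated time: {total_ps:.0f} ps")
print(f"output interval: {output_interval_ps:.0f} ps")
print(f"neighbor list update interval: {nstlist_interval_fs:.0f} fs")
```

The same check is worth repeating whenever nsteps or the nst* values are changed, since the inline comments do not update themselves.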