[gmx-users] GPU / CPU load imbalance
Dwey
mpi566 at gmail.com
Wed Jun 26 00:33:55 CEST 2013
Hi gmx-users,
I used an 8-core AMD CPU with a GTX 680 GPU (1536 CUDA cores) to run
the Umbrella Sampling example provided by Justin.
I am happy that GPU acceleration significantly reduces the computation
time for this example (from 34 hours to 7 hours).
However, the following NOTE appeared on the screen:
++++++++++++++++++++++++++++++++++++++++++
The GPU has >20% more load than the CPU. This imbalance causes
performance loss, consider using a shorter cut-off and a finer PME grid
++++++++++++++++++++++++++++++++++++++++++
Given this load imbalance of more than 20%, could someone suggest how to
avoid the performance loss, either through a hardware (GPU/CPU) upgrade
or by modifying the mdp file (see below)?
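One way to read the NOTE's suggestion: the GROMACS mdp documentation states that the electrostatics accuracy stays nearly constant when the Coulomb cut-off and the PME grid spacing are scaled by the same factor. A minimal sketch of that scaling, starting from the rcoulomb and fourierspacing values in the mdp file below, might look like this. The function name and the 1.2 nm target cut-off are hypothetical illustrations, not a recommendation, and whether rvdw can be shortened as well depends on the force field.

```python
def scale_pme_settings(rcoulomb, fourierspacing, new_rcoulomb):
    """Scale the PME grid spacing by the same factor as the Coulomb cut-off.

    Hypothetical helper for illustration only; it is not part of GROMACS.
    All lengths are in nm.
    """
    if rcoulomb <= 0 or new_rcoulomb <= 0:
        raise ValueError("cut-offs must be positive")
    factor = new_rcoulomb / rcoulomb
    return new_rcoulomb, fourierspacing * factor, factor


# Values from the mdp file: rcoulomb = 1.4, fourierspacing = 0.12.
# The 1.2 nm target is an assumed example value.
new_rc, new_spacing, factor = scale_pme_settings(1.4, 0.12, 1.2)
print(f"scaling factor = {factor:.4f}")
print(f"rcoulomb       = {new_rc:.2f}")
print(f"fourierspacing = {new_spacing:.4f}")
```

A shorter cut-off reduces the work in the short-range non-bonded part, while the finer grid increases the work in the PME mesh part, which is the direction of the shift the NOTE asks for.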
In terms of hardware, does this NOTE suggest that I should use a
higher-capacity GPU, such as a GTX 780 (2304 CUDA cores), to balance the
load or increase the speed?
If so, would adding a second GTX 680 card to the same box help? Or would
that cause a GPU/CPU load imbalance again, with the two GPUs kept waiting
for the 8-core CPU?
Second, this message is also printed:
++++++++++++++++++++++++++++++++++++++++++
Force evaluation time GPU/CPU: 4.006 ms/2.578 ms = 1.554
For optimal performance this ratio should be close to 1
++++++++++++++++++++++++++++++++++++++++++
I do not understand how the GPU and CPU times of 4.006 ms and 2.578 ms,
respectively, are evaluated.
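As far as the arithmetic goes, the reported ratio appears to be simply the GPU time divided by the CPU time. A minimal sketch of that calculation, using the two numbers from the message above (the function name is hypothetical and not part of GROMACS; how the two times are measured is not shown here):

```python
def force_eval_ratio(gpu_ms, cpu_ms):
    """Return the GPU force evaluation time divided by the CPU time."""
    if cpu_ms <= 0:
        raise ValueError("cpu_ms must be positive")
    return gpu_ms / cpu_ms


gpu_ms = 4.006  # GPU time from the message, in ms
cpu_ms = 2.578  # CPU time from the message, in ms

ratio = force_eval_ratio(gpu_ms, cpu_ms)
extra_gpu_time_pct = (ratio - 1.0) * 100.0

print(f"GPU/CPU ratio: {ratio:.3f}")
print(f"GPU takes {extra_gpu_time_pct:.1f}% longer than the CPU")
```

A ratio of 1 would mean both sides finish at the same time; here the GPU takes well over 20% longer, which matches the threshold mentioned in the NOTE.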
It would be very helpful to know how to modify the mdp file below for a
better load balance between the GPU and CPU.
I would appreciate any advice or hints on improving this mdp file.
Thanks,
Dwey
########### courtesy of Justin #########
title = Umbrella pulling simulation
define = -DPOSRES_B
; Run parameters
integrator = md
dt = 0.002
tinit = 0
nsteps = 5000000 ; 10 ns
nstcomm = 10
; Output parameters
nstxout = 50000 ; every 100 ps
nstvout = 50000
nstfout = 5000
nstxtcout = 5000 ; every 10 ps
nstenergy = 5000
; Bond parameters
constraint_algorithm = lincs
constraints = all-bonds
continuation = yes
; Single-range cutoff scheme
nstlist = 5
ns_type = grid
rlist = 1.4
rcoulomb = 1.4
rvdw = 1.4
; PME electrostatics parameters
coulombtype = PME
fourierspacing = 0.12
fourier_nx = 0
fourier_ny = 0
fourier_nz = 0
pme_order = 4
ewald_rtol = 1e-5
optimize_fft = yes
; Nose-Hoover temperature coupling is on in two groups
Tcoupl = Nose-Hoover
tc_grps = Protein Non-Protein
tau_t = 0.5 0.5
ref_t = 310 310
; Pressure coupling is on
Pcoupl = Parrinello-Rahman
pcoupltype = isotropic
tau_p = 1.0
compressibility = 4.5e-5
ref_p = 1.0
refcoord_scaling = com
; Generate velocities is off
gen_vel = no
; Periodic boundary conditions are on in all directions
pbc = xyz
; Long-range dispersion correction
DispCorr = EnerPres
cutoff-scheme = Verlet
; Pull code
pull = umbrella
pull_geometry = distance
pull_dim = N N Y
pull_start = yes
pull_ngroups = 1
pull_group0 = Chain_B
pull_group1 = Chain_A
pull_init1 = 0
pull_rate1 = 0.0
pull_k1 = 1000 ; kJ mol^-1 nm^-2
pull_nstxout = 1000 ; every 2 ps
pull_nstfout = 1000 ; every 2 ps
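The time comments in the mdp file above follow from multiplying each step count by dt. A small sketch that reproduces them, assuming dt is in ps as in the file (the dictionary and helper name are illustrative only):

```python
DT_PS = 0.002  # dt from the mdp file, in ps

# Step counts copied from the mdp file above.
step_settings = {
    "nsteps": 5000000,
    "nstxout": 50000,
    "nstxtcout": 5000,
    "pull_nstxout": 1000,
    "pull_nstfout": 1000,
}


def steps_to_ps(n_steps, dt_ps=DT_PS):
    """Convert a number of MD steps to simulated time in ps."""
    return n_steps * dt_ps


intervals_ps = {name: steps_to_ps(n) for name, n in step_settings.items()}

for name, ps in intervals_ps.items():
    print(f"{name:>13} = {step_settings[name]:>8} steps = {ps:g} ps")
```

With these values, nsteps corresponds to 10000 ps (10 ns), nstxout to 100 ps, nstxtcout to 10 ps, and the pull output intervals to 2 ps, in agreement with the comments in the file.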