[gmx-users] Improving scaling - Gromacs 4.0 RC2

Justin A. Lemkul jalemkul at vt.edu
Wed Oct 1 23:18:24 CEST 2008


Hi,

I've been playing around with the latest release candidate of version 4.0, and I 
was hoping someone out there more knowledgeable than me might tell me how to 
improve a bit on the performance I'm seeing.  To clarify, the performance I'm 
seeing is a ton faster than 3.3.x, but I still seem to be getting bogged down 
with the PME/PP balance.  I'm using mostly the default options with the new mdrun:

mdrun_mpi -s test.tpr -np 64 -npme 32

The system contains about 150,000 atoms - a membrane protein surrounded by
several hundred lipids and solvent (water).  The protein parameters are GROMOS,
lipids are Berger, and water is SPC.  My .mdp file (adapted from a generic 3.3.x
file that I have always used for such simulations) is attached at the end of
this mail.  It seems that my system runs fastest on 64 CPUs.  Almost all tests
with 128 or 256 CPUs seem to run slower.  The nodes are dual-core 2.3 GHz
Xserve G5, connected by InfiniBand.

Here's a summary of some of the tests I've run:

-np   -npme   -ddorder     ns/day   % performance loss from imbalance
64    16      interleave   5.760    19.6
64    32      interleave   9.600    40.9
64    32      pp_pme       5.252     3.9
64    32      cartesian    5.383     4.7

All other mdrun command line options are defaults.
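
For anyone who wants to compare these runs on a per-core basis, here is a
minimal Python sketch (not part of GROMACS; the names RUNS and summarize are
made up for illustration).  It assumes that -np is the total number of
processes and that -npme of them do PME only, so the remainder do
particle-particle (PP) work.  The numbers are copied from the table above.

```python
# Benchmark rows copied from the table above:
# (np, npme, ddorder, ns_per_day, pct_loss_from_imbalance)
RUNS = [
    (64, 16, "interleave", 5.760, 19.6),
    (64, 32, "interleave", 9.600, 40.9),
    (64, 32, "pp_pme", 5.252, 3.9),
    (64, 32, "cartesian", 5.383, 4.7),
]


def summarize(run):
    """Derive the PP/PME split and per-core throughput for one run.

    Assumption: PP processes = total processes - PME-only processes.
    """
    np_total, npme, ddorder, ns_day, loss = run
    npp = np_total - npme
    return {
        "ddorder": ddorder,
        "pp_ranks": npp,
        "pme_ranks": npme,
        "pp_per_pme": npp / npme,
        "ns_day_per_core": ns_day / np_total,
        "pct_loss": loss,
    }


SUMMARY = [summarize(r) for r in RUNS]

for s in SUMMARY:
    print(
        f"{s['ddorder']:<11} PP={s['pp_ranks']:>3} PME={s['pme_ranks']:>3} "
        f"PP:PME={s['pp_per_pme']:.1f} "
        f"ns/day/core={s['ns_day_per_core']:.4f} loss={s['pct_loss']}%"
    )
```

The reported imbalance and the overall ns/day are separate columns on purpose:
in the table, the run with the largest reported loss is also the fastest, so
neither number alone tells the whole story.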

I get ~10.3 ns/day with -np 256 -npme 64, but since -np 64 -npme 32 gives
almost the same performance (9.6 ns/day), there seems to be no compelling
reason to tie up that many nodes.
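
To put a number on that, here is a rough Python sketch of the relative
parallel efficiency going from 64 to 256 processes.  The function name is
hypothetical, and the 10.3 ns/day figure is approximate, so the result is only
indicative.

```python
def relative_efficiency(base_cores, base_rate, cores, rate):
    """Speedup actually obtained divided by the ideal (linear) speedup."""
    speedup = rate / base_rate    # measured gain in ns/day
    ideal = cores / base_cores    # gain expected from perfect scaling
    return speedup / ideal


# 64 processes at 9.600 ns/day versus 256 processes at ~10.3 ns/day
eff = relative_efficiency(64, 9.600, 256, 10.3)
print(f"relative efficiency at 256 processes: {eff:.1%}")
```

With these inputs the efficiency comes out at roughly 27%, i.e. four times the
hardware for about a 7% gain in throughput.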

Any hints on how to speed things up any more?  Is it possible?  Not that I'm 
complaining...the same system under GMX 3.3.3 gives just under 1 ns/day :)  I'm 
really curious about the 40.9% performance loss I'm seeing with -np 64 -npme 32, 
even though it gives the best overall performance in terms of ns/day.

Thanks in advance for your attention, and any comments.

-Justin

=======test.mdp=========
title		= NPT simulation for a membrane protein
; Run parameters
integrator	= md
dt		= 0.002
nsteps		= 10000		; 20 ps
nstcomm		= 1
; Output parameters
nstxout		= 500
nstvout		= 500
nstfout		= 500
nstlog		= 500
nstenergy	= 500
; Bond parameters
constraint_algorithm 	= lincs
constraints		= all-bonds
continuation 	= no		; starting up
; Twin-range cutoff scheme, parameters for Gromos96
nstlist		= 5
ns_type		= grid
rlist		= 0.8
rcoulomb	= 0.8
rvdw		= 1.4
; PME electrostatics parameters
coulombtype	= PME
fourierspacing  = 0.24
pme_order	= 4
ewald_rtol	= 1e-5
optimize_fft	= yes
; V-rescale temperature coupling is on in three groups
Tcoupl	 	= V-rescale
tc_grps		= Protein POPC SOL_NA+_CL-
tau_t		= 0.1 0.1 0.1
ref_t		= 310 310 310
; Pressure coupling is on
Pcoupl		= Berendsen
pcoupltype	= semiisotropic
tau_p		= 2.0		
compressibility	= 4.5e-5 4.5e-5
ref_p		= 1.0 1.0
; Generate velocities is on
gen_vel		= yes		
gen_temp	= 310
gen_seed	= 173529
; Periodic boundary conditions are on in all directions
pbc		= xyz
; Long-range dispersion correction
DispCorr	= EnerPres

========end test.mdp==========
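
As a sanity check on the run length and output frequency, here is a small
Python sketch that reads a few of the settings above.  The parser is a
simplified illustration (parse_mdp is a made-up helper, not a GROMACS tool),
and the wall-time estimate assumes the best rate from my table (9.6 ns/day)
holds for the whole run.

```python
# A few lines copied from test.mdp above; everything else here is a sketch.
MDP_TEXT = """
dt          = 0.002
nsteps      = 10000     ; 20 ps
nstxout     = 500
nstvout     = 500
nstfout     = 500
nstlog      = 500
nstenergy   = 500
"""


def parse_mdp(text):
    """Return {key: value string}, dropping ';' comments and blank lines."""
    params = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


params = parse_mdp(MDP_TEXT)
nsteps = int(params["nsteps"])
run_ps = nsteps * float(params["dt"])    # simulated time in ps

# Number of output intervals per run for each nst* output setting
# (any write at step 0 is not counted here).
writes = {
    key: nsteps // int(value)
    for key, value in params.items()
    if key.startswith("nst") and key != "nsteps"
}

# Estimated wall time in seconds at 9.6 ns/day (assumed constant rate).
wall_s = (run_ps / 1000.0) / 9.600 * 86400.0

print(f"simulated time: {run_ps:.1f} ps")
print(f"output intervals: {writes}")
print(f"estimated wall time at 9.6 ns/day: {wall_s:.0f} s")
```

At that rate the whole 20 ps test takes only about three minutes of wall time,
so startup costs make up a larger share of it than they would in a production
run.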

-- 
========================================

Justin A. Lemkul
Graduate Research Assistant
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================


