[gmx-users] Improving scaling - Gromacs 4.0 RC2
Justin A. Lemkul
jalemkul at vt.edu
Thu Oct 2 13:41:21 CEST 2008
Berk Hess wrote:
> Hi,
>
> Looking at your 64 core results, it seems that your PP:PME load ratio is
> about 1:1.
> In most cases 3:1 is much better performance-wise.
> grompp probably also printed a note about this, and also how to fix it.
From the .mdp file I posted before, grompp gave the following:
Calculating fourier grid dimensions for X Y Z
Using a fourier grid of 56x60x50, spacing 0.238 0.230 0.240
Estimate for the relative computational load of the PME mesh part: 0.34
Should it have advised me about anything else? The PME load estimate seems
reasonable, given what I understand about the matter, and I suppose it does
indicate the PP:PME ratio I should be using.
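For what it's worth, here is how I am reading that estimate, as a quick Python
sketch (purely illustrative - the suggest_npme helper and the rounding to a
multiple of 4 are my own inventions, not anything grompp or mdrun actually
does): the fraction of nodes handed to PME should roughly match the PME load
that grompp reports.

# Purely illustrative sketch (not grompp/mdrun logic): turn the PME load
# estimate that grompp prints into a first guess for -npme, assuming the
# fraction of nodes doing PME should roughly match that load.
def suggest_npme(total_ranks, pme_load):
    raw = total_ranks * pme_load
    # Round to a multiple of 4 (my own choice, for a tidy decomposition)
    # and never hand more than half the nodes to PME.
    return min(total_ranks // 2, max(4, 4 * round(raw / 4)))

# grompp reported a PME load estimate of 0.34 for this system:
print(suggest_npme(64, 0.34))  # -> 20, i.e. 44 PP : 20 PME, roughly 2:1

By that arithmetic, -npme 32 on 64 cores hands PME far more nodes than the
0.34 estimate calls for.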
> I have also described this briefly in the parallelization section of the
> PDF manual.
>
> You should probably increase your cut-offs and pme grid spacing by the
> same factor
> (something around 1.2).
Which cut-offs, rlist/rcoulomb? I thought these were force field-dependent.
Please correct me if I'm wrong.
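To make sure I follow the arithmetic, here is the scaling as I understand it,
just as a sketch (the numbers come from my posted .mdp and the factor of 1.2
you suggested; whether rlist and rvdw should move with rcoulomb is exactly
what I am asking above): increasing the real-space cutoff and the Fourier
spacing by the same factor should keep the Ewald accuracy roughly constant
while moving work off the PME mesh and onto the PP nodes.

# Sketch of the trade-off only; values come from my posted .mdp and the
# factor 1.2 suggested in the reply.
factor = 1.2
rcoulomb = 0.8          # nm, current real-space Coulomb cutoff (rlist is the same)
fourierspacing = 0.24   # nm, current PME grid spacing

print("rcoulomb       ->", round(rcoulomb * factor, 3))        # 0.96 nm
print("fourierspacing ->", round(fourierspacing * factor, 3))  # 0.288 nm
# rlist would presumably be raised along with rcoulomb; whether rvdw
# (1.4 nm here) should change is the force-field question I am asking about.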
> Hopefully mdrun should choose the proper number of pme nodes for you
> when you do not use -npme.
I have never gotten mdrun to cooperate without explicitly setting -npme;
maybe it's just something I'm doing. For example, here is the output from two
runs I tried (using 64 cores):
1. With fourierspacing = 0.12 (only difference from the posted .mdp file)
-------------------------------------------------------
Program mdrun_4.0_rc2_mpi, VERSION 4.0_rc2
Source code file: domdec_setup.c, line: 132
Fatal error:
Could not find an appropriate number of separate PME nodes. i.e. >=
0.563840*#nodes (34) and <= #nodes/2 (32) and reasonable performance wise
(grid_x=112, grid_y=117).
Use the -npme option of mdrun or change the number of processors or the PME grid
dimensions, see the manual for details.
-------------------------------------------------------
2. With fourierspacing = 0.24 (the posted .mdp file)
-------------------------------------------------------
Program mdrun_4.0_rc2_mpi, VERSION 4.0_rc2
Source code file: domdec_setup.c, line: 132
Fatal error:
Could not find an appropriate number of separate PME nodes. i.e. >=
0.397050*#nodes (24) and <= #nodes/2 (32) and reasonable performance wise
(grid_x=56, grid_y=60).
Use the -npme option of mdrun or change the number of processors or the PME grid
dimensions, see the manual for details.
-------------------------------------------------------
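The way I read those two errors (a toy illustration only - feasible_npme
below is my own stand-in, not the actual check in domdec_setup.c): mdrun
wants an npme that is at least the printed lower bound, at most half the
nodes, and that splits the PME grid well. Something like this shows why run 1
cannot work at all on 64 cores, while run 2 has candidates in range but
evidently none that the real heuristics accept:

# Toy illustration of the bounds printed in the two errors above; the
# real check in domdec_setup.c also weighs how a candidate npme splits
# the PME grid across the domain decomposition.
def feasible_npme(lower, upper, grid_x, grid_y):
    # Crude stand-in for "reasonable performance wise": keep candidates
    # that divide at least one PME grid dimension evenly.
    return [n for n in range(lower, upper + 1)
            if grid_x % n == 0 or grid_y % n == 0]

# Run 1: fourierspacing = 0.12 -> bounds 34..32, grid 112x117
print(feasible_npme(34, 32, 112, 117))  # [] -- the bounds already contradict (34 > 32)
# Run 2: fourierspacing = 0.24 -> bounds 24..32, grid 56x60
print(feasible_npme(24, 32, 56, 60))    # [28, 30] -- so the real criterion is stricter

Either way, on 64 cores I always seem to end up setting -npme by hand.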
Thanks.
-Justin
>
> Berk
>
>
> ------------------------------------------------------------------------
> > Date: Wed, 1 Oct 2008 17:18:24 -0400
> > From: jalemkul at vt.edu
> > To: gmx-users at gromacs.org
> > Subject: [gmx-users] Improving scaling - Gromacs 4.0 RC2
> >
> >
> > Hi,
> >
> > I've been playing around with the latest release candidate of version 4.0,
> > and I was hoping someone out there more knowledgeable than me might tell me
> > how to improve a bit on the performance I'm seeing. To clarify, the
> > performance I'm seeing is a ton faster than 3.3.x, but I still seem to be
> > getting bogged down with the PME/PP balance. I'm using mostly the default
> > options with the new mdrun:
> >
> > mdrun_mpi -s test.tpr -np 64 -npme 32
> >
> > The system contains about 150,000 atoms - a membrane protein surrounded by
> > several hundred lipids and solvent (water). The protein parameters are
> > GROMOS, lipids are Berger, and water is SPC. My .mdp file (adapted from a
> > generic 3.3.x file that I always used for such simulations) is attached at
> > the end of this mail. It seems that my system runs fastest on 64 CPUs;
> > almost all tests with 128 or 256 seem to run slower. The nodes are
> > dual-core 2.3 GHz Xserve G5s, connected by InfiniBand.
> >
> > Here's a summary of some of the tests I've run:
> >
> > -np  -npme  -ddorder    ns/day   % performance loss from imbalance
> >  64    16   interleave   5.760   19.6
> >  64    32   interleave   9.600   40.9
> >  64    32   pp_pme       5.252    3.9
> >  64    32   cartesian    5.383    4.7
> >
> > All other mdrun command line options are defaults.
> >
> > I get ~10.3 ns/day with -np 256 -npme 64, but since -np 64 -npme 32 seems
> > to give almost the same performance, there seems to be no compelling reason
> > to tie up that many nodes.
> >
> > Any hints on how to speed things up any more? Is it possible? Not that I'm
> > complaining... the same system under GMX 3.3.3 gives just under 1 ns/day :)
> > I'm really curious about the 40.9% performance loss I'm seeing with -np 64
> > -npme 32, even though it gives the best overall performance in terms of
> > ns/day.
> >
> > Thanks in advance for your attention, and any comments.
> >
> > -Justin
> >
> > =======test.mdp=========
> > title = NPT simulation for a membrane protein
> > ; Run parameters
> > integrator = md
> > dt = 0.002
> > nsteps = 10000 ; 20 ps
> > nstcomm = 1
> > ; Output parameters
> > nstxout = 500
> > nstvout = 500
> > nstfout = 500
> > nstlog = 500
> > nstenergy = 500
> > ; Bond parameters
> > constraint_algorithm = lincs
> > constraints = all-bonds
> > continuation = no ; starting up
> > ; Twin-range cutoff scheme, parameters for Gromos96
> > nstlist = 5
> > ns_type = grid
> > rlist = 0.8
> > rcoulomb = 0.8
> > rvdw = 1.4
> > ; PME electrostatics parameters
> > coulombtype = PME
> > fourierspacing = 0.24
> > pme_order = 4
> > ewald_rtol = 1e-5
> > optimize_fft = yes
> > ; V-rescale temperature coupling is on in three groups
> > Tcoupl = V-rescale
> > tc_grps = Protein POPC SOL_NA+_CL-
> > tau_t = 0.1 0.1 0.1
> > ref_t = 310 310 310
> > ; Pressure coupling is on
> > Pcoupl = Berendsen
> > pcoupltype = semiisotropic
> > tau_p = 2.0
> > compressibility = 4.5e-5 4.5e-5
> > ref_p = 1.0 1.0
> > ; Generate velocities is on
> > gen_vel = yes
> > gen_temp = 310
> > gen_seed = 173529
> > ; Periodic boundary conditions are on in all directions
> > pbc = xyz
> > ; Long-range dispersion correction
> > DispCorr = EnerPres
> >
> > ========end test.mdp==========
> >
--
========================================
Justin A. Lemkul
Graduate Research Assistant
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
========================================