[gmx-users] Improving scaling - Gromacs 4.0 RC2

Berk Hess gmx3 at hotmail.com
Thu Oct 2 14:12:31 CEST 2008





> Date: Thu, 2 Oct 2008 07:41:21 -0400
> From: jalemkul at vt.edu
> To: gmx-users at gromacs.org
> Subject: Re: [gmx-users] Improving scaling - Gromacs 4.0 RC2
> 
> 
> 
> Berk Hess wrote:
> > Hi,
> > 
> > Looking at your 64-core results, it seems that your PP:PME load ratio is
> > about 1:1.
> > In most cases 3:1 gives much better performance.
> > grompp probably also printed a note about this and how to fix it.
> 
>  From the .mdp file I posted before, grompp gave the following:
> 
> Calculating fourier grid dimensions for X Y Z
> Using a fourier grid of 56x60x50, spacing 0.238 0.230 0.240
> Estimate for the relative computational load of the PME mesh part: 0.34
> 
> Should it have advised me about anything else?  It seems that the PME load is
> reasonable, given what I understand about the matter.  I suppose that does
> indicate the PP/PME ratio I should be using.

Indeed.
The only caveat is that this is an estimate, but in most cases it should be
pretty close to reality.
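
A quick way to compare the estimate with reality, assuming a short benchmark
run that writes md.log in the working directory (the exact wording of these
log lines can vary between versions):

  # the end of md.log reports the measured load imbalance and the average
  # PME mesh/force load, which can be compared against grompp's estimate
  grep -i "load imbalance" md.log
  grep -i "mesh/force load" md.log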

> 
> > I have also described this briefly in the parallelization section of the
> > pdf manual.
> > 
> > You should probably increase your cut-offs and PME grid spacing by the
> > same factor (something around 1.2).
> 
> Which cut-offs, rlist/rcoulomb?  I thought these were force field-dependent. 
> Please correct me if I'm wrong.

That depends.
For some force fields this might be critical.
But in general people use PME now, and with PME the real-space cut-off mainly
determines how the work is split between direct and reciprocal space (the
accuracy is set by ewald_rtol), so rcoulomb can be chosen quite freely.
Force fields are now often parametrized with a dispersion correction for LJ.
If you use dispersion correction, rvdw is also free to choose (although
probably not < 0.8 nm).
The optimal cut-off for high parallelization is often around 1.2 nm.
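
As a purely illustrative sketch of the "scale by ~1.2" idea applied to the
posted .mdp (example values only, not a recommendation; check that your force
field tolerates the change):

  ; scale the real-space cut-off and the PME grid spacing by the same factor,
  ; shifting work from the PME mesh to the PP nodes while the Ewald splitting
  ; accuracy (set by ewald_rtol) stays roughly constant
  rlist           = 0.96    ; was 0.8
  rcoulomb        = 0.96    ; was 0.8
  fourierspacing  = 0.288   ; was 0.24
  ; with DispCorr = EnerPres, rvdw could in principle also be brought toward 1.2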

> 
> > Hopefully mdrun should choose the proper number of pme nodes for you
> > when you do not use -npme.
> 
> I have never gotten mdrun to cooperate without specifically defining -npme; 
> maybe it's just me or something that I'm doing.  For example, output from two 
> runs I tried (using 64 cores):
> 
> 1. With fourierspacing = 0.12 (only difference from the posted .mdp file)
> 
> -------------------------------------------------------
> Program mdrun_4.0_rc2_mpi, VERSION 4.0_rc2
> Source code file: domdec_setup.c, line: 132
> 
> Fatal error:
> Could not find an appropriate number of separate PME nodes. i.e. >= 
> 0.563840*#nodes (34) and <= #nodes/2 (32) and reasonable performance wise 
> (grid_x=112, grid_y=117).
> Use the -npme option of mdrun or change the number of processors or the PME grid 
> dimensions, see the manual for details.
> -------------------------------------------------------
> 
> 
> 2. With fourierspacing = 0.24 (the posted .mdp file)
> 
> -------------------------------------------------------
> Program mdrun_4.0_rc2_mpi, VERSION 4.0_rc2
> Source code file: domdec_setup.c, line: 132
> 
> Fatal error:
> Could not find an appropriate number of separate PME nodes. i.e. >= 
> 0.397050*#nodes (24) and <= #nodes/2 (32) and reasonable performance wise 
> (grid_x=56, grid_y=60).
> Use the -npme option of mdrun or change the number of processors or the PME grid 
> dimensions, see the manual for details.
> -------------------------------------------------------

If your relative PME load is 0.34, which is a very reasonable value,
you want roughly 1/3 of the nodes doing PME. With 64 nodes this does not work
out to a convenient split; using 48 or 72 nodes would be a solution.
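
For example, following the command line used earlier in this thread (on many
clusters the total process count is set through the MPI launcher rather than -np):

  # relative PME load ~0.34  =>  roughly one node in three doing PME
  #   48 nodes: 32 PP + 16 PME
  #   72 nodes: 48 PP + 24 PME
  mdrun_mpi -s test.tpr -np 48 -npme 16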

Berk

> 
> 
> Thanks.
> 
> -Justin
> 
> > 
> > Berk
> > 
> > 
> > ------------------------------------------------------------------------
> >  > Date: Wed, 1 Oct 2008 17:18:24 -0400
> >  > From: jalemkul at vt.edu
> >  > To: gmx-users at gromacs.org
> >  > Subject: [gmx-users] Improving scaling - Gromacs 4.0 RC2
> >  >
> >  >
> >  > Hi,
> >  >
> >  > I've been playing around with the latest release candidate of version
> >  > 4.0, and I was hoping someone out there more knowledgeable than me might
> >  > tell me how to improve a bit on the performance I'm seeing. To clarify,
> >  > the performance I'm seeing is a ton faster than 3.3.x, but I still seem
> >  > to be getting bogged down with the PME/PP balance. I'm using mostly the
> >  > default options with the new mdrun:
> >  >
> >  > mdrun_mpi -s test.tpr -np 64 -npme 32
> >  >
> >  > The system contains about 150,000 atoms - a membrane protein surrounded
> >  > by several hundred lipids and solvent (water). The protein parameters
> >  > are GROMOS, lipids are Berger, and water is SPC. My .mdp file (adapted
> >  > from a generic 3.3.x file that I always used to use for such simulations)
> >  > is attached at the end of this mail. It seems that my system runs fastest
> >  > on 64 CPUs. Almost all tests with 128 or 256 seem to run slower. The nodes
> >  > are dual-core 2.3 GHz Xserve G5s, connected by InfiniBand.
> >  >
> >  > Here's a summary of some of the tests I've run:
> >  >
> >  > -np  -npme  -ddorder     ns/day   % performance loss from imbalance
> >  >  64   16    interleave    5.760   19.6
> >  >  64   32    interleave    9.600   40.9
> >  >  64   32    pp_pme        5.252    3.9
> >  >  64   32    cartesian     5.383    4.7
> >  >
> >  > All other mdrun command line options are defaults.
> >  >
> >  > I get ~10.3 ns/day with -np 256 -npme 64, but since -np 64 -npme 32 seems
> >  > to give almost that same performance there seems to be no compelling
> >  > reason to tie up that many nodes.
> >  >
> >  > Any hints on how to speed things up any more? Is it possible? Not that
> >  > I'm complaining...the same system under GMX 3.3.3 gives just under
> >  > 1 ns/day :) I'm really curious about the 40.9% performance loss I'm
> >  > seeing with -np 64 -npme 32, even though it gives the best overall
> >  > performance in terms of ns/day.
> >  >
> >  > Thanks in advance for your attention, and any comments.
> >  >
> >  > -Justin
> >  >
> >  > =======test.mdp=========
> >  > title = NPT simulation for a membrane protein
> >  > ; Run parameters
> >  > integrator = md
> >  > dt = 0.002
> >  > nsteps = 10000 ; 20 ps
> >  > nstcomm = 1
> >  > ; Output parameters
> >  > nstxout = 500
> >  > nstvout = 500
> >  > nstfout = 500
> >  > nstlog = 500
> >  > nstenergy = 500
> >  > ; Bond parameters
> >  > constraint_algorithm = lincs
> >  > constraints = all-bonds
> >  > continuation = no ; starting up
> >  > ; Twin-range cutoff scheme, parameters for Gromos96
> >  > nstlist = 5
> >  > ns_type = grid
> >  > rlist = 0.8
> >  > rcoulomb = 0.8
> >  > rvdw = 1.4
> >  > ; PME electrostatics parameters
> >  > coulombtype = PME
> >  > fourierspacing = 0.24
> >  > pme_order = 4
> >  > ewald_rtol = 1e-5
> >  > optimize_fft = yes
> >  > ; V-rescale temperature coupling is on in three groups
> >  > Tcoupl = V-rescale
> >  > tc_grps = Protein POPC SOL_NA+_CL-
> >  > tau_t = 0.1 0.1 0.1
> >  > ref_t = 310 310 310
> >  > ; Pressure coupling is on
> >  > Pcoupl = Berendsen
> >  > pcoupltype = semiisotropic
> >  > tau_p = 2.0
> >  > compressibility = 4.5e-5 4.5e-5
> >  > ref_p = 1.0 1.0
> >  > ; Generate velocities is on
> >  > gen_vel = yes
> >  > gen_temp = 310
> >  > gen_seed = 173529
> >  > ; Periodic boundary conditions are on in all directions
> >  > pbc = xyz
> >  > ; Long-range dispersion correction
> >  > DispCorr = EnerPres
> >  >
> >  > ========end test.mdp==========
> >  >
> 
> -- 
> ========================================
> 
> Justin A. Lemkul
> Graduate Research Assistant
> Department of Biochemistry
> Virginia Tech
> Blacksburg, VA
> jalemkul[at]vt.edu | (540) 231-9080
> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
> 
> ========================================
