[gmx-users] Improving scaling - Gromacs 4.0 RC2

Justin A. Lemkul jalemkul at vt.edu
Thu Oct 2 13:41:21 CEST 2008



Berk Hess wrote:
> Hi,
> 
> Looking at your 64 core results, it seems that your PP:PME load ratio is 
> about 1:1.
> In most cases 3:1 is much better performance wise.
> grompp probably also printed a note about this and also how to fix it.

From the .mdp file I posted before, grompp gave the following:

Calculating fourier grid dimensions for X Y Z
Using a fourier grid of 56x60x50, spacing 0.238 0.230 0.240
Estimate for the relative computational load of the PME mesh part: 0.34

Should it have advised me about anything else?  The PME load seems reasonable, 
given what I understand about the matter, and I suppose it does indicate the 
PP:PME ratio I should be using.
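
Just to check my own reading of that number (this is my back-of-the-envelope 
arithmetic, so please correct me if it is off):

    PME load estimate                  0.34
    balanced split on 64 cores     ~   0.34 * 64 = ~22 PME cores  (roughly 2:1 PP:PME)
    target for the 3:1 you suggest ~   PME load of 0.25, i.e. ~16 of 64 cores on PME

which I take to be the point of shifting work from the mesh part to the 
real-space part.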

> I have also described this briefly in the parallelization section of the 
> pdf manual.
> 
> You should probably increase your cut-offs and pme grid spacing by the 
> same factor
> (something around 1.2).

Which cut-offs, rlist/rcoulomb?  I thought these were force field-dependent. 
Please correct me if I'm wrong.
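
Just so I'm sure I follow, is something like this the kind of change you mean? 
This is only my guess at applying the ~1.2 factor to my current settings, not 
anything I have tested:

    rlist          = 0.96    ; was 0.8
    rcoulomb       = 0.96    ; was 0.8
    fourierspacing = 0.288   ; was 0.24

I assume rvdw stays at 1.4, since it doesn't enter the PME part?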

> Hopefully mdrun should choose the proper number of pme nodes for you
> when you do not use -npme.

I have never gotten mdrun to cooperate without explicitly setting -npme; maybe 
it's just me, or something I'm doing wrong.  For example, here is the output 
from two runs I tried (using 64 cores):

1. With fourierspacing = 0.12 (only difference from the posted .mdp file)

-------------------------------------------------------
Program mdrun_4.0_rc2_mpi, VERSION 4.0_rc2
Source code file: domdec_setup.c, line: 132

Fatal error:
Could not find an appropriate number of separate PME nodes. i.e. >= 
0.563840*#nodes (34) and <= #nodes/2 (32) and reasonable performance wise 
(grid_x=112, grid_y=117).
Use the -npme option of mdrun or change the number of processors or the PME grid 
dimensions, see the manual for details.
-------------------------------------------------------


2. With fourierspacing = 0.24 (the posted .mdp file)

-------------------------------------------------------
Program mdrun_4.0_rc2_mpi, VERSION 4.0_rc2
Source code file: domdec_setup.c, line: 132

Fatal error:
Could not find an appropriate number of separate PME nodes. i.e. >= 
0.397050*#nodes (24) and <= #nodes/2 (32) and reasonable performance wise 
(grid_x=56, grid_y=60).
Use the -npme option of mdrun or change the number of processors or the PME grid 
dimensions, see the manual for details.
-------------------------------------------------------
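
So for now I just pick the number of PME nodes by hand, e.g. the command line 
from my original mail:

mdrun_mpi -s test.tpr -np 64 -npme 32

Going by the message above, something like -npme 24 might be the more sensible 
manual choice for this grid, but I haven't tried that yet.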


Thanks.

-Justin

> 
> Berk
> 
> 
> ------------------------------------------------------------------------
>  > Date: Wed, 1 Oct 2008 17:18:24 -0400
>  > From: jalemkul at vt.edu
>  > To: gmx-users at gromacs.org
>  > Subject: [gmx-users] Improving scaling - Gromacs 4.0 RC2
>  >
>  >
>  > Hi,
>  >
>  > I've been playing around with the latest release candidate of version 4.0,
>  > and I was hoping someone out there more knowledgeable than me might tell me
>  > how to improve a bit on the performance I'm seeing. To clarify, the
>  > performance I'm seeing is a ton faster than 3.3.x, but I still seem to be
>  > getting bogged down with the PME/PP balance. I'm using mostly the default
>  > options with the new mdrun:
>  >
>  > mdrun_mpi -s test.tpr -np 64 -npme 32
>  >
>  > The system contains about 150,000 atoms - a membrane protein surrounded by
>  > several hundred lipids and solvent (water). The protein parameters are
>  > GROMOS, lipids are Berger, and water is SPC. My .mdp file (adapted from a
>  > generic 3.3.x file that I always used to use for such simulations) is
>  > attached at the end of this mail. It seems that my system runs fastest on
>  > 64 CPU's. Almost all tests with 128 or 256 seem to run slower. The nodes
>  > are dual-core 2.3 GHz Xserve G5, connected by Infiniband.
>  >
>  > Here's a summary of some of the tests I've run:
>  >
>  > -np   -npme   -ddorder     ns/day   % performance loss from imbalance
>  >  64    16     interleave    5.760    19.6
>  >  64    32     interleave    9.600    40.9
>  >  64    32     pp_pme        5.252     3.9
>  >  64    32     cartesian     5.383     4.7
>  >
>  > All other mdrun command line options are defaults.
>  >
>  > I get ~10.3 ns/day with -np 256 -npme 64, but since -np 64 -npme 32 seems
>  > to give almost that same performance there seems to be no compelling reason
>  > to tie up that many nodes.
>  >
>  > Any hints on how to speed things up any more? Is it possible? Not that I'm
>  > complaining...the same system under GMX 3.3.3 gives just under 1 ns/day :)
>  > I'm really curious about the 40.9% performance loss I'm seeing with
>  > -np 64 -npme 32, even though it gives the best overall performance in terms
>  > of ns/day.
>  >
>  > Thanks in advance for your attention, and any comments.
>  >
>  > -Justin
>  >
>  > =======test.mdp=========
>  > title = NPT simulation for a membrane protein
>  > ; Run parameters
>  > integrator = md
>  > dt = 0.002
>  > nsteps = 10000 ; 20 ps
>  > nstcomm = 1
>  > ; Output parameters
>  > nstxout = 500
>  > nstvout = 500
>  > nstfout = 500
>  > nstlog = 500
>  > nstenergy = 500
>  > ; Bond parameters
>  > constraint_algorithm = lincs
>  > constraints = all-bonds
>  > continuation = no ; starting up
>  > ; Twin-range cutoff scheme, parameters for Gromos96
>  > nstlist = 5
>  > ns_type = grid
>  > rlist = 0.8
>  > rcoulomb = 0.8
>  > rvdw = 1.4
>  > ; PME electrostatics parameters
>  > coulombtype = PME
>  > fourierspacing = 0.24
>  > pme_order = 4
>  > ewald_rtol = 1e-5
>  > optimize_fft = yes
>  > ; V-rescale temperature coupling is on in three groups
>  > Tcoupl = V-rescale
>  > tc_grps = Protein POPC SOL_NA+_CL-
>  > tau_t = 0.1 0.1 0.1
>  > ref_t = 310 310 310
>  > ; Pressure coupling is on
>  > Pcoupl = Berendsen
>  > pcoupltype = semiisotropic
>  > tau_p = 2.0
>  > compressibility = 4.5e-5 4.5e-5
>  > ref_p = 1.0 1.0
>  > ; Generate velocities is on
>  > gen_vel = yes
>  > gen_temp = 310
>  > gen_seed = 173529
>  > ; Periodic boundary conditions are on in all directions
>  > pbc = xyz
>  > ; Long-range dispersion correction
>  > DispCorr = EnerPres
>  >
>  > ========end test.mdp==========
>  >
>  > --
>  > ========================================
>  >
>  > Justin A. Lemkul
>  > Graduate Research Assistant
>  > Department of Biochemistry
>  > Virginia Tech
>  > Blacksburg, VA
>  > jalemkul[at]vt.edu | (540) 231-9080
>  > http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>  >
>  > ========================================

-- 
========================================

Justin A. Lemkul
Graduate Research Assistant
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================


