[gmx-users] help: load imbalance

Szilárd Páll szilard.pall at cbr.su.se
Wed Apr 10 21:54:46 CEST 2013


On Wed, Apr 10, 2013 at 4:50 PM, 申昊 <shenhao at mail.bnu.edu.cn> wrote:
> Hello,
>    I would like to ask some questions about load imbalance.
> 1> Here are the messages produced by grompp -f md.mdp -p topol.top -c npt.gro -o md.tpr
>
>    NOTE 1 [file md.mdp]:
>   The optimal PME mesh load for parallel simulations is below 0.5
>   and for highly parallel simulations between 0.25 and 0.33,
>   for higher performance, increase the cut-off and the PME grid spacing
>
> Therefore, I changed md.mdp as written below and ran grompp -f md.mdp -p topol.top -c npt.gro -o md.tpr again; this time no NOTE was printed. So if I change the cut-offs to 2.0 nm and increase the grid spacing to 0.30, are the calculated results reasonable?

You can shift work between short- and long-range electrostatics by
adjusting the *coulomb* cut-off freely, but *not* the VdW cut-off.

However, 2.0 nm sounds like a *very* long cut-off.
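
As a rough illustration of the advice above, here is a minimal Python sketch that scales the Coulomb cut-off and the PME grid spacing by one common factor and leaves the VdW cut-off alone. It assumes the usual rule of thumb that PME accuracy stays roughly constant when rcoulomb and fourierspacing are scaled together. The function name and the starting values (1.0 nm, 0.12 nm) are hypothetical, since the original md.mdp is not shown in this thread.

```python
def scale_pme_settings(rcoulomb, fourierspacing, factor):
    """Scale the Coulomb cut-off and the PME grid spacing by one factor.

    Hypothetical helper: shifts work from the PME mesh to the short-range
    part (factor > 1) or back (factor < 1). rvdw is deliberately not
    touched, because the VdW cut-off should not be changed freely.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return round(rcoulomb * factor, 4), round(fourierspacing * factor, 4)


# Hypothetical starting point: rcoulomb = 1.0 nm, fourierspacing = 0.12 nm
new_rcoulomb, new_spacing = scale_pme_settings(1.0, 0.12, 1.25)
print(f"rcoulomb        = {new_rcoulomb}")
print(f"fourierspacing  = {new_spacing}")
```

A modest factor is usually enough to make the grompp NOTE go away; whether a given factor is appropriate still depends on the force field and the system.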

>
> ; Neighborsearching
> ns_type         = grid          ; search neighboring grid cells
> nstlist         = 5             ; 10 fs
> rlist           = 2             ; short-range neighborlist cutoff (in nm)
> rcoulomb        = 2             ; short-range electrostatic cutoff (in nm)
> rvdw            = 2             ; short-range van der Waals cutoff (in nm)
> ; Electrostatics
> coulombtype     = PME           ; Particle Mesh Ewald for long-range electrostatics
> pme_order       = 4             ; cubic interpolation
> fourierspacing  = 0.3           ; grid spacing for FFT
>
> 2> And what about making no changes and just simulating with the original mdp? Are the results still reasonable? Here are the messages without any changes:
>
> DD  load balancing is limited by minimum cell size in dimension X
> DD  step 2933999  vol min/aver 0.189! load imb.: force 124.7%

You are simply pushing your simulation to the limit of how far it can
be parallelized. As you can see from the above output, to compensate
for the imbalance, the load balancing has shrunk the DD cells to the
point where the ratio of the smallest to the average DD cell volume is
0.189, meaning that the smallest DD cells are ~5.3x smaller than the
average cell - and the run is still *hugely* imbalanced. The "!"
indicates what the line before says: the DD load balancing is
limited and can't shrink cells further.
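
To make that reading of the log line concrete, here is a small Python sketch that pulls the numbers out of a DD status line and computes the shrink factor. The regular expression is fitted only to the single line quoted above, so treat the pattern, the function name and the dictionary keys as assumptions; other GROMACS versions may print extra fields.

```python
import re

# Pattern fitted to the one log line quoted in this thread (an assumption).
DD_LINE = re.compile(
    r"DD\s+step\s+(?P<step>\d+)\s+vol min/aver\s+(?P<ratio>[\d.]+)"
    r"(?P<limited>!?)\s+load imb\.:\s+force\s+(?P<force>[\d.]+)%"
)


def parse_dd_line(line):
    """Return the DD load-balancing figures from one log line, or None."""
    match = DD_LINE.search(line)
    if match is None:
        return None
    ratio = float(match.group("ratio"))
    return {
        "step": int(match.group("step")),
        "vol_min_over_aver": ratio,
        # How many times smaller the smallest cell is than the average one.
        "shrink_factor": 1.0 / ratio,
        # "!" means the balancing hit the minimum cell size limit.
        "limited": match.group("limited") == "!",
        "force_imbalance_pct": float(match.group("force")),
    }


info = parse_dd_line(
    "DD  step 2933999  vol min/aver 0.189! load imb.: force 124.7%"
)
print(f"smallest cell is ~{info['shrink_factor']:.1f}x smaller than average")
print(f"limited: {info['limited']}, force imbalance: {info['force_imbalance_pct']}%")
```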

Some aspects that might be limiting your simulation are:
i) running with just a few hundred atoms/core;
ii) running on multiple very different cluster nodes;
iii) using a very inhomogeneous system.

If you're using the group scheme (which I assume you are, otherwise
the automated PP-PME balancing would have kicked in), you should be
able to get better performance at high parallelization with the Verlet
scheme.

Cheers,
--
Szilard

>
>            Step           Time         Lambda
>         2934000     5868.00000        0.00000
>
>    Energies (kJ/mol)
>           Angle    Proper Dih. Ryckaert-Bell.          LJ-14     Coulomb-14
>     2.99315e+02    2.13778e+01    1.74659e+02    2.22024e+02    2.02466e+03
>         LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.      Potential
>    -1.68074e+02   -2.09809e-01   -1.80294e+03   -3.28155e+03   -2.51074e+03
>     Kinetic En.   Total Energy    Temperature Pres. DC (bar) Pressure (bar)
>     1.69264e+04    1.44156e+04    2.95552e+02   -1.33866e-04    1.51489e+00
>    Constr. rmsd
>     2.60082e-05