[gmx-users] Benchmarking gromacs over large number of cores

Peter C. Lai pcl at uab.edu
Thu Apr 28 18:29:30 CEST 2011


How many particles are in your system? If the number per domain is too low, there is not much you can do about the load imbalance, but mdrun reported only 3.3% of the run time lost to it, so that is not your main problem.

You can change the PP/PME ratio that mdrun uses by manually specifying the domain decomposition yourself.
So, for example, since you are losing about 33% to PP/PME imbalance, try to specify the dd count so that you end up with about 3x as many PME nodes as you had before.
Example: set -dd 10 10 10 to use 1000 PP nodes; the rest will be PME nodes. The grid does not have to be cubic, just try to minimize the condition number.
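To make the arithmetic behind the -dd suggestion concrete, here is a minimal Python sketch. It assumes, as described above, that every rank not covered by the -dd grid becomes a PME rank. The total of 1200 ranks is a made-up number for illustration, and the max/min aspect ratio is only a stand-in for the "condition number" mentioned above. The function names are hypothetical and not part of GROMACS.

```python
from itertools import combinations_with_replacement


def split_ranks(total_ranks, dd):
    """Return (pp_ranks, pme_ranks) for a -dd grid.

    Assumption: ranks not used by the nx*ny*nz PP grid do PME work.
    """
    nx, ny, nz = dd
    pp = nx * ny * nz
    if pp > total_ranks:
        raise ValueError("the -dd grid needs more ranks than are available")
    return pp, total_ranks - pp


def aspect_ratio(dd):
    """Largest grid dimension divided by the smallest (1.0 means cubic)."""
    return max(dd) / min(dd)


def candidate_grids(total_ranks, pme_fraction, max_dim=32, tol=0.02):
    """List -dd grids whose leftover PME fraction is within tol of the target.

    Grids are returned with nx >= ny >= nz, most cube-like first.
    """
    found = []
    for grid in combinations_with_replacement(range(1, max_dim + 1), 3):
        pp = grid[0] * grid[1] * grid[2]
        if pp >= total_ranks:
            continue  # leave at least one rank for PME
        pme = total_ranks - pp
        if abs(pme / total_ranks - pme_fraction) <= tol:
            dd = tuple(sorted(grid, reverse=True))
            found.append((aspect_ratio(dd), dd))
    return [dd for _, dd in sorted(found)]


# Hypothetical job size: 1200 ranks, with the -dd 10 10 10 grid from above.
print(split_ranks(1200, (10, 10, 10)))  # (1000, 200)

# Grids that leave roughly one quarter of 1200 ranks for PME.
for dd in candidate_grids(1200, 0.25)[:5]:
    print(dd, split_ranks(1200, dd))
```

The chosen grid would then be passed on the mdrun command line as -dd nx ny nz. Whether a given PME fraction actually removes the imbalance still has to be checked from the load report at the end of the run.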
-gcom effectively overrides nstcalenergy, as it tells each node how many steps to run before synchronizing. Usually mdrun will warn you about excessive wait times for synchronization, but we do not see that here with your system (you must be running some really high-end InfiniBand!).
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Bruno Monnet <bruno.monnet at hp.com> wrote:

Hi,

I'm not really a Gromacs user, but I'm currently benchmarking Gromacs 4.5.4 on a large cluster. It seems that my communication (PME) cost is really high, and Gromacs keeps asking for more PME nodes:

   Average load imbalance: 113.6 %
 Part of the total run time spent waiting due to load imbalance: 3.3 %
 Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 9 % Y 9 % Z 9 %
 Average PME mesh/force load: 3.288
 Part of the total run time spent waiting due to PP/PME imbalance: 32.6 %

NOTE: 32.6 % performance was lost because the PME nodes
      had more work to do than the PP nodes.
      You might want to increase the number of PME nodes
      or increase the cut-off and the grid spacing.


I can't modify the original dataset as I only have the TPR file. I switched from -dlb yes to -dlb auto, since it seems to have trouble with more than 6000 / 8000 cores.

I tried adding the "-gcom" parameter, and this speeds up the computation. This parameter is not really explained in the Gromacs documentation. Could you give me some advice on how I should use it?

Best regards,
Bruno Monnet
