[gmx-users] Attempting to scale gromacs mdrun_mpi
NG HUI WEN
HuiWen.Ng at nottingham.edu.my
Fri Aug 27 07:53:06 CEST 2010
Thanks a lot, Roland!
I'm using a Beowulf cluster which I believe uses an Ethernet interconnect (1 node = 1 CPU in my case). I found an article by Kutzner et al. (2006) that discusses the difficulty of speeding up parallel runs on such clusters. Does this mean that if I continue to use the Ethernet-connected cluster and do not move to the 4.5 beta version, I'm stuck with a low number of processors because of the poor scaling? Thanks again for your advice!
Huiwen
Hi,
you don't say what network you have or how many cores per node. If it is
Ethernet, it will be difficult to scale.
You might want to try the 4.5 beta version, because we have improved the
scaling in it. There is also a tool, g_tune_pme, which might help you.
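For reference, a g_tune_pme run might look something like the sketch below (topol.tpr is just a placeholder name, and you should check g_tune_pme -h for the exact options in your build; on some setups the MPIRUN and MDRUN environment variables need to point at the launcher and the MPI-enabled mdrun):
# tell g_tune_pme which launcher and mdrun binary to use (assumed environment variables)
export MPIRUN=$(which mpirun)
export MDRUN=$(which mdrun_mpi)
# benchmark different PP:PME splits for 10 ranks and report the fastest -npme setting
g_tune_pme -np 10 -s topol.tpr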
Roland
________________________________
From: NG HUI WEN
Sent: Mon 8/23/2010 11:15 PM
To: gmx-users at gromacs.org
Subject: Attempting to scale gromacs mdrun_mpi
Hi,
I have been playing with the "mdrun_mpi" command in GROMACS 4.0.7 to try out parallel processing. Unfortunately, the results I got did not show any significant improvement in simulation time as I increased the number of processors.
Below is the command I issued:
mpirun -np x mdrun_mpi -deffnm
where x is the number of processors being used.
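For example, with 10 processors (using a hypothetical md prefix for the run files, i.e. md.tpr as input), the command would be:
mpirun -np 10 mdrun_mpi -deffnm md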
From the machine output, it seemed that the work had indeed been distributed to multiple processors, e.g. with -np 10:
NNODES=10, MYRANK=5, HOSTNAME=beowulf
NODEID=4 argc=3
NNODES=10, MYRANK=1, HOSTNAME=beowulf
NNODES=10, MYRANK=2, HOSTNAME=beowulf
NODEID=1 argc=3
NODEID=9 argc=3
NODEID=5 argc=3
NNODES=10, MYRANK=3, HOSTNAME=beowulf
NNODES=10, MYRANK=7, HOSTNAME=beowulf
NNODES=10, MYRANK=8, HOSTNAME=beowulf
NODEID=8 argc=3
NODEID=2 argc=3
NODEID=6 argc=3
NODEID=3 argc=3
NODEID=7 argc=3
Making 2D domain decomposition 5 x 1 x 2
starting mdrun 'PROTEIN'
1000 steps, 2.0 ps.
The simulation system consists of 100581 atoms and the run is 2 ps long (1000 steps). The results obtained are as follows:
Number of CPUs   Simulation time
 1               13m28s
 2                6m31s
 3                7m33s
 4                6m47s
 5                7m48s
 6                6m55s
 7                7m36s
 8                6m58s
 9                7m15s
10                7m01s
15                7m27s
20                7m15s
30                7m42s
A significant improvement in simulation time was only observed going from -np 1 to 2. As almost all of the runs (except -np 1) complained about load imbalance and PP:PME imbalance (the latter especially at larger -np values), I tried to increase the number of PME nodes by adding the -npme flag with a larger value, but the results either showed no improvement or got worse (see the example command below).
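One such attempt, with 10 ranks (the md prefix and the -npme value here are just placeholders for what I actually used), looked like:
mpirun -np 10 mdrun_mpi -deffnm md -npme 4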
As I am new to GROMACS, there might be some things that I have missed or done incorrectly. I would really appreciate some input on this. Many thanks in advance!
HW