[gmx-users] Attempting to scale gromacs mdrun_mpi
NG HUI WEN
HuiWen.Ng at nottingham.edu.my
Fri Aug 27 07:53:06 CEST 2010
Thanks a lot, Roland!
I'm using a Beowulf cluster which I believe uses an Ethernet interconnect (1 node = 1 CPU in my case). I found an article by Kutzner et al. (2006) that discusses the difficulty of speeding up parallel runs on such clusters. Does this mean that if I continue to use the Ethernet-connected cluster and do not move to the 4.5 beta version, I'm stuck with a low number of processors because of the poor scaling? Thanks again for your advice!
Huiwen
Hi,
you don't say what network you have or how many cores per node. If it is
Ethernet, it will be difficult to scale.
You might want to try the 4.5 beta version, because we have improved the
scaling in it. There is also a tool, g_tune_pme, which might help you.
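For reference, a g_tune_pme run might look something like the sketch below (topol.tpr is just a placeholder name, and you should check g_tune_pme -h for the exact options in your build; on some setups the MPIRUN and MDRUN environment variables need to point at the launcher and the MPI-enabled mdrun):
# tell g_tune_pme which launcher and mdrun binary to use (assumed environment variables)
export MPIRUN=$(which mpirun)
export MDRUN=$(which mdrun_mpi)
# benchmark different PP:PME splits for 10 ranks and report the fastest -npme setting
g_tune_pme -np 10 -s topol.tpr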
Roland
________________________________
From: NG HUI WEN
Sent: Mon 8/23/2010 11:15 PM
To: gmx-users at gromacs.org
Subject: Attempting to scale gromacs mdrun_mpi
Hi,
I have been playing with the "mdrun_mpi" command in GROMACS 4.0.7 to try out parallel processing. Unfortunately, the results I got did not show any significant improvement in simulation time as I increased the number of processors.
Below is the command I issued:
mpirun -np x mdrun_mpi -deffnm
where x is the number of processors being used.
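For example, with 10 processors (using a hypothetical md prefix for the run files, i.e. md.tpr as input), the command would be:
mpirun -np 10 mdrun_mpi -deffnm md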
From the machine output, it seemed that the work had indeed been distributed to multiple processors, e.g. with -np 10:
NNODES=10, MYRANK=5, HOSTNAME=beowulf
NODEID=4 argc=3
NNODES=10, MYRANK=1, HOSTNAME=beowulf
NNODES=10, MYRANK=2, HOSTNAME=beowulf
NODEID=1 argc=3
NODEID=9 argc=3
NODEID=5 argc=3
NNODES=10, MYRANK=3, HOSTNAME=beowulf
NNODES=10, MYRANK=7, HOSTNAME=beowulf
NNODES=10, MYRANK=8, HOSTNAME=beowulf
NODEID=8 argc=3
NODEID=2 argc=3
NODEID=6 argc=3
NODEID=3 argc=3
NODEID=7 argc=3
Making 2D domain decomposition 5 x 1 x 2
starting mdrun 'PROTEIN'
1000 steps, 2.0 ps.
The simulation system consists of 100581 atoms and the run is 2 ps long (1000 steps). The results obtained are as follows:
Number of CPUs   Simulation time
 1               13m28s
 2                6m31s
 3                7m33s
 4                6m47s
 5                7m48s
 6                6m55s
 7                7m36s
 8                6m58s
 9                7m15s
10                7m01s
15                7m27s
20                7m15s
30                7m42s
A significant improvement in simulation time was only observed going from -np 1 to 2. As almost all of the runs (except -np 1) complained about load imbalance and PP:PME imbalance (the latter especially at larger -np values), I tried to increase the number of PME nodes by adding the -npme flag with a larger value, but the results either showed no improvement or got worse (see the example command below).
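One such attempt, with 10 ranks (the md prefix and the -npme value here are just placeholders for what I actually used), looked like:
mpirun -np 10 mdrun_mpi -deffnm md -npme 4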
As I am new to GROMACS, there might be some things that I have missed or done incorrectly. I would really appreciate some input on this. Many thanks in advance!
HW