[gmx-users] Parallel Gromacs Benchmarking with Opteron Dual-Core & Gigabit Ethernet

Kazem Jahanbakhsh jahanbakhsh at ee.sharif.edu
Thu Jul 26 22:41:32 CEST 2007


Erik Lindahl wrote:

>Built-in network cards are usually of lower quality, so there's
>probably only a single processor controlling both ports, and since
>the card probably only has a single driver requests might even be
>serialized.
>
My cluster nodes each have two on-board Intel i82541PI GbE LAN controllers.
To test the hypothesis that the built-in Ethernet cards perform poorly and
hurt the cluster speed-up, I set up the following comparison: I benchmarked
the DPPC system first by running GROMACS on a single node with a single
process, and then by running the same simulation across three nodes with
one process per node. The results are below.
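For reference, a sketch of how such a parallel run can be launched with
LAM/MPI and GROMACS 3.x (the input file names are just the defaults, and
mdrun_mpi is a placeholder for the MPI-enabled binary):
---------------
#!/bin/bash
# Sketch: one way to launch the 3-node run with LAM/MPI and GROMACS 3.x.
# "hostfile" lists the three nodes; mdrun_mpi is a placeholder binary name.
lamboot -v hostfile

grompp -np 3 -f grompp.mdp -c conf.gro -p topol.top -o topol.tpr
mpirun -np 3 mdrun_mpi -np 3 -s topol.tpr
---------------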

- Running a single mdrun process on a single node:

	M E G A - F L O P S   A C C O U N T I N G

               NODE (s)   Real (s)      (%)
       Time:   6752.650   6753.000    100.0
                       1h52:32
               (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:      6.573    626.776      0.128    187.574


- Running mdrun in parallel on three nodes (one process per node):

	M E G A - F L O P S   A C C O U N T I N G

               NODE (s)   Real (s)      (%)
       Time:   2547.000   2547.000    100.0
                       42:27
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     20.309      1.663      0.339     70.750


The speed-up factor obtained from these two runs is
Sp = 1.663 / 0.627 ≈ 2.65,
which is well below the ideal value of 3.0. As we all know, the speed-up
should stay close to ideal while the number of parallel nodes is small
(up to roughly 8 nodes, depending on the application). In other words, the
gap between the measured and the ideal factor looks quite abnormal to me,
and I haven't seen anyone else report such poor scalability.
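A trivial sketch of that arithmetic, using the GFlops figures from the two
logs above:
---------------
#!/bin/bash
# Sketch: speed-up and parallel efficiency from the two GFlops figures above.
serial=0.627      # 1-node performance (626.776 MFlops)
parallel=1.663    # 3-node performance
nodes=3

speedup=$(echo "scale=3; $parallel / $serial" | bc)
efficiency=$(echo "scale=3; $speedup / $nodes * 100" | bc)
echo "speed-up            = $speedup  (ideal: $nodes)"
echo "parallel efficiency = $efficiency %"
---------------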

All of this led me to conclude that something (network hardware or
software) is not working as it should in my cluster. I monitored the
statistics of my gigabit Ethernet switch, and it behaved perfectly
normally: the input/output queues were empty the whole time, there were no
collisions, no error packets, and the load on the switch fabric for the
active rx/tx ports stayed below 7-8%. Therefore I think my bottleneck is
the built-in LAN cards, and I want to replace them with PCI cards. I would
appreciate any advice or help in this regard; a quick point-to-point check
like the sketch below might also confirm it before buying new hardware.
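For example, a quick test between two nodes (assuming iperf is installed;
"node2" is just a placeholder hostname) would show whether a single
on-board port actually delivers close to gigabit throughput:
---------------
#!/bin/bash
# Run "iperf -s" on node2 first, then on node1:
ping -c 100 -q node2     # round-trip latency between the two nodes
iperf -c node2 -t 30     # sustained TCP throughput through one on-board port
---------------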

>If you have lots of available ports on your gigabit switches and both
>switches+cards support "port trunking" you could try to connect two
>cables to each node and get a virtual 2Gb bandwidth connection. For
>more advanced stuff you'll have to consult the lam-mpi mailing list,
>although I'd be (quite pleasantly) surprised if it's possible to
>tweak gigabit performance a lot!

As I mentioned above, the gigabit Ethernet switch fabric worked fine
during the simulation (at least for three nodes). For this reason I bonded
the two built-in gigabit Ethernet ports on every node with the following
boot-up shell script:
---------------
#!/bin/bash
modprobe bonding mode=6 miimon=100          # load the bonding module
ifconfig bond0 hw ether 00:11:22:33:44:55   # MAC address of the bond0 interface
ifconfig bond0 192.168.55.55 up             # bond0 IP address

ifenslave bond0 eth0   # put eth0 into slave mode under bond0
ifenslave bond0 eth1   # put eth1 into slave mode under bond0
---------------
mode=6:
Adaptive load balancing: includes balance-tlb plus receive load balancing
(rlb) for IPv4 traffic, and does not require any special switch support.
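Before rerunning the benchmark it is worth checking that the bond really
came up with both slaves active, for example:
---------------
#!/bin/bash
# Sketch: confirm the bonding driver picked up both slave interfaces.
cat /proc/net/bonding/bond0    # bonding mode, MII status and slave list
ifconfig bond0                 # RX/TX counters of the aggregated interface
ifconfig eth0; ifconfig eth1   # per-slave counters: both should carry traffic
---------------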
Anyway, running mdrun in parallel on three nodes over the bonded
interfaces gave the following results:

	M E G A - F L O P S   A C C O U N T I N G

               NODE (s)   Real (s)      (%)
       Time:   1195.000   1195.000    100.0
                       19:55
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     39.424      3.546      0.723     33.194

Without bonding the obtained performance was 3.333 GFlops, which means
that trunking the Ethernet ports on the nodes gives us about a 7%
improvement.

regards,
Kazem




