[gmx-users] No improvement in scaling on introducing flow control

Wed Oct 31 09:10:20 CET 2007

himanshu khandelia wrote:
> Hi Carsten,
> 
> The benchmarks were made with 1 NIC/node, and yet the scaling is bad.
> Does that mean that there is indeed network congestion? We will try
> using back-to-back connections soon.

Hi Himanshu,

In my opinion the most probable scenario is that the bandwidth of the
single gigabit connection is not sufficient for the four very fast CPUs
you have on each node. I would do an 8 CPU benchmark with a back-to-back
connection as here the chance for network congestion is minimized. If
the benchmarks stay as they were with the switch (they might be a bit
better because you do not have the switch's latency), I would try to
make use of both interfaces to double the bandwidth. This can easily be
done with OpenMPI.
You could also do a 16 CPU benchmark on 16 nodes so that the processes
do not need to share the network interface. If the scaling is better
than with 16 CPUs on 4 nodes, that indicates a bandwidth problem.
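
To make the bandwidth argument concrete, here is a minimal Python sketch of
the nominal share each process gets when the processes on a node split that
node's network interfaces evenly. The function name and the layouts are
illustrative assumptions, not measurements. It also assumes the full gigabit
line rate and ignores latency, protocol overhead and on-node contention, so
the numbers are an upper bound at best.

```python
# Nominal per-process bandwidth when processes on a node share its NICs.
# Idealized: even sharing, full line rate, no latency or protocol overhead.

GBIT_IN_MBYTE_PER_S = 125.0  # 1 Gbit/s = 1000 Mbit/s / 8 bits per byte


def per_process_share(procs_per_node, nics_per_node=1, nic_gbit=1.0):
    """Return the nominal MB/s available to each process on one node.

    Hypothetical helper for illustration only.
    """
    if procs_per_node < 1 or nics_per_node < 1:
        raise ValueError("need at least one process and one NIC per node")
    node_bandwidth = nics_per_node * nic_gbit * GBIT_IN_MBYTE_PER_S
    return node_bandwidth / procs_per_node


# Layouts discussed in this thread (16 processes in total in each case).
layouts = {
    "4 nodes x 4 procs, 1 NIC/node": per_process_share(4, 1),
    "4 nodes x 4 procs, 2 NICs/node": per_process_share(4, 2),
    "16 nodes x 1 proc, 1 NIC/node": per_process_share(1, 1),
}

for name, mbyte_per_s in layouts.items():
    print(f"{name}: {mbyte_per_s:.2f} MB/s per process")
```

With one process per node each process nominally has four times the
bandwidth of the 4-processes-per-node case, which is why comparing the two
16 CPU runs can separate a bandwidth limit from other causes.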

Carsten

> 
> -himanshu
> 
> 
> 
>> Maybe your problem is not even flow control, but the limited network
>> bandwidth, which is shared among 4 CPUs in your case. I have also done
>> benchmarks on Woodcrests (2.33 GHz) and was not able to scale an 80000
>> atom system beyond 1 node with Gbit Ethernet. Looking in more detail,
>> the time gained by the additional 4 CPUs of a second node was exactly
>> balanced by the extra communication. I used only 1 network interface
>> for that benchmark, leaving effectively only 1/4 of the bandwidth for
>> each CPU. Using two interfaces with OpenMPI did not double the network
>> performance on our cluster. In my tests, nodes with 2 CPUs sharing one
>> NIC were faster than nodes with 4 CPUs sharing two NICs. This could be
>> on-node contention, since both interfaces probably end up on the same
>> bus internally.
>>
>> Are the benchmarks made with 1 or 2 NICs/node? If they are for 1
>> NIC/node then there should be no network congestion for the case of 8
>> CPUs (= 2 nodes). You could try a back-to-back connection between two
>> nodes to be absolutely sure that the rest of the network (switch etc.)
>> does not play a role. I would try that and repeat the benchmark for 8
>> CPUs. See if you get a different value.