[gmx-users] gigabit scaling, DPPC benchmark
Jason de Joannis
jdejoan at emory.edu
Mon Jun 6 20:57:43 CEST 2005
> Date: Mon, 06 Jun 2005 19:12:24 +0200
> From: David <spoel at xray.bmc.uu.se>
> Subject: Re: [gmx-users] gigabit scaling, DPPC benchmark
> To: Discussion list for GROMACS users <gmx-users at gromacs.org>
> Message-ID: <1118077944.5498.2.camel at localhost.localdomain>
> Content-Type: text/plain
>
> On Mon, 2005-06-06 at 12:48 -0400, Jason de Joannis wrote:
> > Hi, I would like to improve the scaling on my cluster so it is more
> > like that published on the Gromacs website.
> >
> > Cluster Nodes: Dual Xeon 3.06 GHz (hyperthreaded)
> > Network: Gbit ethernet
> > System Installation:
> > o Linux 2.4.20-64GB-SMP
> > o LAM 6.5.9/MPI 2 C++/ROMIO
> > o fftw-2.1.5
> > o Gromacs 3.2.1
> >
> > For the DPPC benchmark with sort/shuffle:
> >
> > NP        1      2      4
> > ps/day    158    404    517
> > scaling   --     128%   81%
> >
> > There is a big drop from 2 to 4 as the Gigabit network comes into play.
> > Compare this to the Xeon 2800 benchmark on Gromacs.org where 4 processors
> > get 637 ps/day.
> >
> > Of course the problem becomes more pronounced when I use PME on this
> > benchmark:
> >
> > NP        1      2      4
> > ps/day    115    229    201
> > scaling   --     100%   44%
> >
> > I suspect my network may be running at 100 rather than 1000 Mbit, as
> > suggested by the published 87% scaling for Xeon2800/100Mbit.
> >
> > How can I troubleshoot this problem?
> I'm afraid this is a GROMACS problem. We are working on resolving this,
> but the problem is largely due to the Linux TCP layer, which is very
> inefficient with regard to the start-up time of each communication.
>
Then why does the Xeon2800/1000Mbit system on gromacs.org scale so well?
I didn't see any special comments about its network.
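
One GROMACS-independent way to check the wire would be a small MPI ping-pong
between two of the nodes, built against the same LAM/MPI installation: it
reports the point-to-point one-way time for small messages and the bandwidth
for large ones. Over TCP, a Gbit link should reach on the order of 100 MB/s
for the large messages; numbers closer to 10 MB/s would mean the link has
negotiated 100 Mbit. This is only a rough sketch; the message sizes and
repetition count are arbitrary choices:

/* Minimal MPI ping-pong sketch (not part of GROMACS or LAM).  Rank 0 and
 * rank 1 bounce messages of increasing size, and rank 0 reports the one-way
 * time and bandwidth.  Run with exactly two processes, placed on different
 * nodes, e.g.:  mpirun -np 2 ./pingpong
 */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    const int  sizes[] = { 1, 1024, 65536, 1048576 };  /* message sizes in bytes */
    const int  nsizes  = 4;
    const int  reps    = 100;
    int        rank, i, s;
    char      *buf;
    double     t0, t1;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(sizes[nsizes - 1]);

    for (s = 0; s < nsizes; s++) {
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, sizes[s], MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, sizes[s], MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
            } else if (rank == 1) {
                MPI_Recv(buf, sizes[s], MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
                MPI_Send(buf, sizes[s], MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();
        if (rank == 0) {
            double oneway = (t1 - t0) / (2.0 * reps);   /* seconds, one direction */
            printf("%8d bytes: %10.1f us  %8.2f MB/s\n",
                   sizes[s], 1e6 * oneway, sizes[s] / oneway / 1e6);
        }
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
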
> It basically means you cannot do much about it (except buy SMP machines
> with more than two processors).
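
For what it's worth, a simple cost model, t = start-up latency + size/bandwidth,
shows why a faster wire alone does not help much once the per-message start-up
cost dominates. The ~50 us TCP start-up time and the 2 kB message size in this
back-of-the-envelope sketch are assumed numbers for illustration, not
measurements from this cluster:

/* Illustration only: with a fixed per-message start-up cost, a tenfold
 * bandwidth increase (100 Mbit -> 1 Gbit) buys much less than a tenfold
 * reduction in the time per small message.
 */
#include <stdio.h>

int main(void)
{
    const double latency   = 50e-6;    /* assumed ~50 us TCP start-up per message */
    const double msg_bytes = 2048.0;   /* assumed small per-step message          */
    const double bw100     = 12.5e6;   /* 100 Mbit/s in bytes/s                   */
    const double bw1000    = 125e6;    /* 1000 Mbit/s in bytes/s                  */

    double t100  = latency + msg_bytes / bw100;
    double t1000 = latency + msg_bytes / bw1000;

    printf("100 Mbit: %6.1f us per message\n", 1e6 * t100);
    printf("  1 Gbit: %6.1f us per message (only %.1fx faster)\n",
           1e6 * t1000, t100 / t1000);
    return 0;
}
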
> >
> > /Jason
> >
> --
> David.
/Jason