[gmx-users] Hyperthreading throughput increase
lindahl at sbc.su.se
Mon Jun 5 09:47:06 CEST 2006
On Jun 2, 2006, at 8:20 PM, mernst at tricity.wsu.edu wrote:
> From my browsing of list archives I can only recall seeing advice
> that hyperthreading
> cannot offer more Gromacs performance. For all I know this remains
> true if you're trying
> to use MPI to accelerate single calculations on hyperthreaded
> processors. However, I
> have discovered that it may be possible to increase throughput by
> running two
> independent jobs on a recent hyperthreaded processor, and I don't
> recall seeing this
> mentioned on the list before.
> My typical job involves ~790 DNA atoms, a few Na+ ions, and
> 13000-14000 water molecules.
> I use Gromacs with the Amber force field ported by the Pande group.
> My simulation
> machines are 3.2 GHz Pentium machines with 2 MB of cache (Pentium
> 640, I think) and 1 GB of RAM.
> Typical performance for one of these systems running on an unloaded
> machine is 64.3
> hours/ns, 1.4 gflop/s. I accidentally started some pairs of
> simulations on some of these
> machines this week and discovered that the performance of each job
> was *not* cut in
> half. With two systems running simultaneously, each shows
> performance of about 98.8
> hours/ns, 908 mflop/s. Running two of these jobs on each machine
> thus appears to
> increase throughput by about 30%.
> If like me you run many independent calculations, throughput is
> more important than
> turnaround, and you have hyperthreaded machines but have not
> previously tried to take
> advantage of them, it may be worth testing. I suppose this issue
> may have been covered
> on the mailing list before, but all I ever remember seeing were
> advisements that
> hyperthreaded processors won't help performance, or even advice to
> disable hyperthreading in the BIOS. A brief web search indicates that some
> Folding@home
> participants have discovered comparable throughput advantages to
> running two client
> instances on hyperthreaded processors.
That's certainly good news. I remember trying this when
hyperthreading first appeared, but the early implementations didn't
make any throughput difference whatsoever.
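For reference, the roughly 30% figure in the quoted message can be reproduced from the reported hours/ns numbers. The sketch below is only a sanity check of that arithmetic; the function name `throughput_gain` is made up for illustration and is not part of Gromacs.

```python
def throughput_gain(single_hours_per_ns, paired_hours_per_ns, jobs=2):
    """Relative gain in total ns/hour when `jobs` simulations share one machine."""
    single_rate = 1.0 / single_hours_per_ns   # ns per hour, one job alone
    paired_rate = jobs / paired_hours_per_ns  # ns per hour, all jobs combined
    return paired_rate / single_rate - 1.0

# Figures reported in the quoted message: 64.3 hours/ns alone,
# 98.8 hours/ns for each of two simultaneous jobs.
gain = throughput_gain(64.3, 98.8)
print(f"{gain:.1%}")  # prints 30.2%
```

The flop rates give nearly the same answer: 2 x 908 Mflop/s against 1.4 Gflop/s is a gain of about 29.7%.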
However, it might still lead to significant problems with dual-CPU
systems, where each CPU has hyperthreading enabled. In _theory_ the
Linux scheduler should be able to tell logical from physical CPUs,
but the last time I tried it (which, again, was over a year ago) it
led to severe load-balancing problems.
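If the scheduler cannot be trusted to make that distinction, one possible workaround is to work out the mapping by hand and place jobs explicitly. The following is a minimal sketch, assuming a Linux system whose /proc/cpuinfo reports the "processor", "physical id" and "core id" fields (some kernels and architectures omit the latter two). The helper names and the SAMPLE text are hypothetical and only illustrate the grouping.

```python
import re
from collections import defaultdict

def group_logical_cpus(cpuinfo_text):
    """Map (physical id, core id) -> sorted list of logical CPU numbers."""
    groups = defaultdict(list)
    # /proc/cpuinfo separates logical CPUs with blank lines.
    for block in re.split(r"\n\s*\n", cpuinfo_text.strip()):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        cpu = int(fields["processor"])
        # Assumption: if the topology fields are missing, treat every
        # logical CPU as its own core rather than guessing.
        key = (fields.get("physical id", "cpu%d" % cpu),
               fields.get("core id", "0"))
        groups[key].append(cpu)
    return {key: sorted(cpus) for key, cpus in groups.items()}

def one_cpu_per_core(groups):
    """Pick the first logical CPU of every physical core."""
    return sorted(cpus[0] for cpus in groups.values())

# Made-up example: two physical CPUs, each with two hyperthreads.
SAMPLE = """processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0

processor : 2
physical id : 3
core id : 0

processor : 3
physical id : 3
core id : 0
"""

print(one_cpu_per_core(group_logical_cpus(SAMPLE)))  # prints [0, 2]
```

On a real machine the text would come from reading /proc/cpuinfo, and a job could then be pinned to a chosen logical CPU with `taskset -c` or, from Python 3.3 on Linux, `os.sched_setaffinity`. Whether pinning actually avoids the load-balancing problem would have to be tested on the system in question.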