[gmx-developers] Re: Gromacs 3.3.1 parallel benchmarking
David van der Spoel
spoel at xray.bmc.uu.se
Tue Aug 15 19:15:42 CEST 2006
Michael Haverty wrote:
> Thanks for the feedback all.
> We're using single-processor machines with 2 GB of
> memory, all isolated on the same switch. I've also
> tried a shared-memory 4-processor machine and we're
> still seeing sub-linear scaling. For the single-CPU
> runs we are getting about 240-250 ns/day
> with the executables that we build in-house, and a
> little under 200 ns/day using the downloaded binaries.
> At 8 processors we're getting only 800 ns/day.
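For reference, plugging the figures quoted above into the usual speedup and parallel-efficiency definitions makes the sub-linearity explicit (a quick sketch, not part of the original mail; the 245 figure is the midpoint of the reported 240-250 range):

```python
def speedup(rate_1cpu, rate_ncpu):
    """Speedup of an n-CPU run relative to a single-CPU run."""
    return rate_ncpu / rate_1cpu

def efficiency(rate_1cpu, rate_ncpu, n):
    """Parallel efficiency: speedup divided by processor count."""
    return speedup(rate_1cpu, rate_ncpu) / n

single = 245.0  # midpoint of the reported single-CPU rate (units as quoted)
eight = 800.0   # reported 8-processor rate

print(f"speedup:    {speedup(single, eight):.2f}x")
print(f"efficiency: {efficiency(single, eight, 8):.0%}")
```

So 8 processors deliver only about a 3.3x speedup, i.e. roughly 41% parallel efficiency, versus the near-linear scaling in the published benchmarks.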
> My invocation of grompp and mdrun has been very simple,
> just using the "-np number_of_processors" flags
grompp -sort -shuffle
(but don't use -sort for production)
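In full, the 3.3.x invocation would look roughly like this (a sketch only; the input file names and the process count are placeholders, and the MPI-enabled binary may be named differently on your install):

```shell
# Preprocess with particle reshuffling for better data locality across
# nodes; -sort is advised against for production runs (see above).
grompp -f grompp.mdp -c conf.gro -p topol.top -np 8 -shuffle -o topol.tpr

# In the 3.3.x series mdrun also needs the matching -np.
mpirun -np 8 mdrun_mpi -np 8 -s topol.tpr
```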
> except in the case of the shared memory machines where
> I used "-np number_of_processors -nt
> number_of_processors" for the mdrun flags. I've also
> tested it out with MPICH, LAM, and Intel-MPI builds,
> but can't get away from the sub-linear scaling. We're
> seeing communication between nodes of around 20
> megabits and latency of between 50 and 250 ns. We're
> running the simulations on rack systems. Originally
> they were targeted for batch serial workloads,
> but we've upgraded things such as the switch to
> gigabit so that we could get better scaling and
> learned to run within switch to get good scaling with
> DFT codes up to the 40-60 processor range. We're
> starting to think it may be operating system issues,
> so we're going to meet with computing support later
> today to explore that.
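Before blaming the OS, it is worth noting what the interconnect numbers imply. A simple alpha-beta (latency + bandwidth) cost model shows that for the small per-step messages typical of MD, latency rather than bandwidth dominates over gigabit Ethernet. The figures below are illustrative assumptions (gigabit Ethernet latency is typically tens to hundreds of microseconds), not measurements from this thread:

```python
def message_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """Alpha-beta model: time to send one message, t = alpha + n / beta."""
    return latency_s + nbytes / bandwidth_bytes_per_s

latency = 100e-6    # assumed ~100 us one-way latency
bandwidth = 125e6   # 1 Gbit/s = 125 MB/s

small = message_time(1_000, latency, bandwidth)       # ~1 kB coordinate chunk
large = message_time(1_000_000, latency, bandwidth)   # ~1 MB bulk transfer

print(f"1 kB message: {small * 1e6:.0f} us ({latency / small:.0%} latency)")
print(f"1 MB message: {large * 1e3:.1f} ms ({latency / large:.1%} latency)")
```

For kilobyte-scale messages the transfer is almost entirely latency, which is why MD codes that communicate every step scale so much worse on Ethernet than the DFT codes mentioned above, which exchange larger, less frequent messages.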
> --- Michael Haverty <mghav at yahoo.com> wrote:
>> I'm doing some benchmarking of gromacs 3.3.1 on
>>systems using Intel Xeon processors on Gigabit
>>ethernet, but have been unable to reproduce the
>>scaling reported for Gromacs 3.0.0 and am trying to
>>diagnose why. We're getting sublinear scaling on
>>distributed single-processor 3.4 GHz Intel Xeons
>>with gigabit connections. I'm compiling using the
>>9.X versions of the Intel compilers and have used a
>>wide variety of FFT and BLAS libraries with no
>>success in reproducing the linear scaling shown in
>>the online benchmarking results for the "large DPPC
>>membrane system".
>> Have any changes in the code been implemented since
>>3.0.0 that would likely change this scaling behavior,
>>and/or has anyone done similar parallel benchmarking
>>with 3.3.1? We'd like to start using this code for
>>systems of up to 100's of millions of atoms, but are
>>currently limited by this poor scaling.
>> Thanks for any input or suggestions you can provide.
David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596, 75124 Uppsala, Sweden
phone: 46 18 471 4205 fax: 46 18 511 755
spoel at xray.bmc.uu.se spoel at gromacs.org http://folding.bmc.uu.se