[gmx-users] Performance of Gromacs-4.6.1 on BlueGene/Q
Mark Abraham
mark.j.abraham at gmail.com
Tue Jun 4 16:32:07 CEST 2013
On Tue, Jun 4, 2013 at 4:20 PM, XAvier Periole <x.periole at rug.nl> wrote:
>
> BG CPUs are generally much slower (in clock speed) but scale better.
>
> You should try running on 64 CPUs on the Blue Gene too, for a fair comparison.
> The number of CPUs per node is also an important factor: the more CPUs
> per node, the more communication needs to be done. I observed a
> significant slowdown when going from 16-CPU to 32-CPU nodes (recent Intel)
> while using the same total number of CPUs.
>
Indeed. Moreover, there is not (yet) any instruction-level parallelism in
the GROMACS kernels used on BG/Q, unlike for the x86 family. So there is a
theoretical factor of four that is simply not being exploited. (And no, the
compiler is not good enough to do it automatically ;-))
Mark
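
The per-core gap discussed in this thread can be checked with a little arithmetic. The sketch below uses only the figures quoted from the original message further down (512 BG/Q cores at 2.706 hour/ns, 64 SGI cores at 2.8 hour/ns). The last step is an assumption made purely for illustration: it treats the theoretical factor of four as fully realized across the whole run, which is an upper bound rather than a prediction.

```python
# Figures quoted in the original message later in this thread.
bgq_cores, bgq_hours_per_ns = 512, 2.706
sgi_cores, sgi_hours_per_ns = 64, 2.8


def core_hours_per_ns(cores, hours_per_ns):
    """Total core-hours spent to simulate one nanosecond."""
    return cores * hours_per_ns


bgq_cost = core_hours_per_ns(bgq_cores, bgq_hours_per_ns)
sgi_cost = core_hours_per_ns(sgi_cores, sgi_hours_per_ns)

# How many times more core-hours BG/Q needs for the same simulated time.
per_core_gap = bgq_cost / sgi_cost

# ASSUMPTION (illustrative only): the unexploited factor of four applies
# to the entire run time. Real gains would be smaller.
assumed_kernel_speedup = 4.0
best_case_gap = per_core_gap / assumed_kernel_speedup

print(f"per-core gap today:        {per_core_gap:.2f}x")
print(f"best-case gap (assumed 4x): {best_case_gap:.2f}x")
```

Under these assumptions the gap comes out a little under 8x per core, and would still be roughly 2x even in the idealized case.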
> On Jun 4, 2013, at 4:02 PM, Jianguo Li <ljggmx at yahoo.com.sg> wrote:
>
> > Dear All,
> >
> >
> > Does anyone have GROMACS benchmarks on Blue Gene/Q?
> > I recently installed gromacs-461 on BG/Q using the following command:
> > cmake .. -DCMAKE_TOOLCHAIN_FILE=BlueGeneQ-static-XL-C \
> > -DGMX_BUILD_OWN_FFTW=ON \
> > -DBUILD_SHARED_LIBS=OFF \
> > -DGMX_XML=OFF \
> > -DCMAKE_INSTALL_PREFIX=/scratch/home/biilijg/package/gromacs-461
> > make
> > make install
> >
> > After that, I did a benchmark simulation using a box of pure water
> containing 140k atoms.
> > The command I used for the above test is:
> > srun --ntasks-per-node=32 --overcommit
> /scratch/home/biilijg/package/gromacs-461/bin/mdrun -s box_md1.tpr -c
> box_md1.gro -x box_md1.xtc -g md1.log >& job_md1
> >
> > And I got the following performance:
> > Num. cores    hour/ns
> >   128         9.860
> >   256         4.984
> >   512         2.706
> >  1024         1.544
> >  2048         0.978
> >  4092         0.677
> >
> > The scaling seems OK, but the performance is far from what I expected.
> In terms of core-to-core performance, the Blue Gene is 8 times slower than other
> clusters. For comparison, I also ran the same simulation using 64
> processors on an SGI cluster and got 2.8 hour/ns, which is roughly
> equivalent to using 512 cores on Blue Gene/Q.
> >
> > I am wondering whether the above benchmark results are reasonable. Or
> am I doing something wrong in compiling?
> > Any comments/suggestions are appreciated, thank you very much!
> >
> > Have a nice day!
> > Jianguo
> >
> > --
> > gmx-users mailing list gmx-users at gromacs.org
> > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > * Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-users-request at gromacs.org.
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
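
The "scaling seems OK" remark in the quoted benchmark can be made quantitative. The sketch below computes speedup and parallel efficiency relative to the smallest run (128 cores), using the reported numbers verbatim. The function name `scaling_table` and the choice of the 128-core run as baseline are illustrative assumptions, not part of the original post.

```python
# (cores, hours per ns) exactly as reported in the quoted benchmark.
# The last core count is kept as written in the post (4092).
benchmark = [
    (128, 9.860),
    (256, 4.984),
    (512, 2.706),
    (1024, 1.544),
    (2048, 0.978),
    (4092, 0.677),
]


def scaling_table(rows):
    """Return (cores, speedup, efficiency) relative to the first row."""
    base_cores, base_time = rows[0]
    table = []
    for cores, hours_per_ns in rows:
        speedup = base_time / hours_per_ns          # measured speedup
        ideal = cores / base_cores                  # perfect linear scaling
        table.append((cores, speedup, speedup / ideal))
    return table


table = scaling_table(benchmark)
for cores, speedup, efficiency in table:
    print(f"{cores:>5d} cores  speedup {speedup:6.2f}x  efficiency {efficiency:6.1%}")
```

Read this way, the efficiency stays close to ideal at 256 cores and falls off steadily at the larger core counts for this 140k-atom system.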