[gmx-developers] Benchmark Errors & 3.2.0 Performance Loss
Josh Hursey
joshh at cs.earlham.edu
Tue Mar 2 23:31:19 CET 2004
We have been testing GROMACS configurations using versions 3.1.4, 3.1.5
pre 1, and 3.2.0 beta 1 with the benchmark molecule collection, and ran
into a few failure cases as well as a performance loss between 3.1.5
and 3.2.0 that developers may find interesting.
We run our tests on two separate clusters; details for each cluster can
be found here:
Bazaar (Pentium III): http://cluster.earlham.edu/html/bazaar.html
Cairo (PPC G4): http://cluster.earlham.edu/html/cairo.html
General information about Earlham's Cluster Computing Lab is located here:
http://cluster.earlham.edu/html/
--------- Issue 1: Molecules that run with errors ------
The poly-ch2 molecule in the GROMACS 3.0 benchmark suite
(ftp://ftp.GROMACS.org/pub/benchmarks/gmxbench-3.0.tar.gz) will not run
in MPI mode (we are using LAM-MPI 7.0.2).
Here are the links to some running directories:
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-8on8-8/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-6on6-6/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-4on4-4-Run-1/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-2on2-2-Run-3/
The single-processor run (1on1-1, without MPI) works fine, however:
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-1on1-1-Run-2/
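For reference, a parallel run of one of these benchmarks on our setup
boils down to something like the sketch below. The actual invocation is
wrapped inside the [molecule].pl script mentioned further down; file
names follow the standard gmxbench layout, and the node count and
binary names here are assumptions for illustration only.

  # boot the LAM daemons on the nodes listed in the hostfile
  lamboot hostfile
  # preprocess for 4 nodes and launch the parallel run (GROMACS 3.x style)
  grompp -np 4 -f grompp.mdp -c conf.gro -p topol.top -o topol.tpr
  mpirun -np 4 mdrun -np 4 -s topol.tpr -v
  # shut the LAM universe down again
  lamhalt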
LZM-PME in the benchmark suite stalls in GROMACS 3.2.0 beta 1 when MPI
is configured to run across 4 nodes, and after a long stall the run
dies. To narrow down where the problem lies, I built GROMACS with a few
sets of configure options (below are the compile-time configurations
and some run directories):
Baseline:
---------
--prefix=/cluster/cairo/software/GROMACS-3.2.0-Baseline
--enable-mpi
--enable-mpi-environment
--enable-float
--disable-software-recip
--disable-software-sqrt
--disable-x86-asm
--disable-ppc-altivec
--disable-cpu-optimization
-- PME Baseline Gromacs 3.2.0 -> Works on all parallel structures
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Baseline-Optimal-3.2.0-2on2-2/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Baseline-Optimal-3.2.0-4on4-4/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Baseline-Optimal-3.2.0-6on6-6/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Baseline-Optimal-3.2.0-8on8-8/
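For clarity, the Baseline flags above go to configure roughly as
follows. This is a sketch only; the source directory name and the
make/install steps are assumptions, and our actual builds are driven by
scripts.

  # Baseline build: all hand-optimized code paths disabled (flags as above)
  cd gromacs-3.2.0_beta1
  ./configure --prefix=/cluster/cairo/software/GROMACS-3.2.0-Baseline \
              --enable-mpi --enable-mpi-environment --enable-float \
              --disable-software-recip --disable-software-sqrt \
              --disable-x86-asm --disable-ppc-altivec \
              --disable-cpu-optimization
  make && make install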
Optimal Configuration:
-----------------------
--prefix=/cluster/cairo/software/GROMACS-3.2.0_beta1-Optimal
--enable-mpi
--enable-mpi-environment
--enable-float
--disable-software-recip
--enable-software-sqrt
--disable-x86-asm
--enable-ppc-altivec
--disable-cpu-optimization
--- PME Optimal Gromacs 3.2.0 -> Works for parallel structures 2, 6, 8
(the numbers signify the count of uni-processor nodes used)
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-2on2-2/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-6on6-6/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-8on8-8/
- PME Optimal Gromacs 3.2.0 -> Fails on parallel structure 4on4-4 (4
uni-processor nodes)
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-4on4-4-Run-1/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-4on4-4-Run-2/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-4on4-4/
Optimal without Altivec:
-------------------------
--prefix=/cluster/cairo/software/GROMACS-3.2.0-SQRT
--enable-mpi
--enable-mpi-environment
--enable-float
--disable-software-recip
--enable-software-sqrt
--disable-x86-asm
--disable-ppc-altivec
--disable-cpu-optimization
-- Works for the 4-node configuration
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Testing-Confirm-3.2.0-ALTIVEC-4on4-4/
Optimal without Software Sqrt:
------------------------------
--prefix=/cluster/cairo/software/GROMACS-3.2.0-ALT
--enable-mpi
--enable-mpi-environment
--enable-float
--disable-software-recip
--disable-software-sqrt
--disable-x86-asm
--enable-ppc-altivec
--disable-cpu-optimization
-- Works for the 4-node configuration
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Testing-Confirm-3.2.0-SQRT-4on4-4/
Some clarification notes for the above errors:
* In the directories you will find all of the information you should
need; the [molecule].pl file is the script we used to run the test.
* All tests were intended to run 3 iterations.
* The molecule is initialized from a 'clean' configuration copied from
the original tarball of the GROMACS benchmark suite.
* On Bazaar, our x86 machines, this problem does not occur.
* Further, the problem (a stall followed by death) only occurs when
GROMACS is configured with both the Altivec and software square-root
flags enabled; configurations with only one of the two enabled (as
illustrated above) run without error.
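To make the flag interaction explicit: one way to reproduce the
isolation above is to build all four Altivec/software-sqrt combinations
and rerun the 4on4-4 LZM-PME case against each; only the build with
both enabled stalls on Cairo. A hypothetical sketch follows (the
install prefixes and the driving loop are illustrative, not our actual
build scripts):

  # Build the four altivec/sqrt combinations; only the enable/enable
  # build is expected to show the stall-then-die behaviour on Cairo.
  for alt in enable disable; do
    for sqrt in enable disable; do
      make distclean >/dev/null 2>&1   # start each build from a clean tree
      ./configure --prefix=$HOME/gmx-${alt}-altivec-${sqrt}-sqrt \
                  --enable-mpi --enable-mpi-environment --enable-float \
                  --disable-software-recip --${sqrt}-software-sqrt \
                  --disable-x86-asm --${alt}-ppc-altivec \
                  --disable-cpu-optimization
      make && make install
    done
  done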
--------- Issue 2: Performance Loss between 3.1.5 and 3.2.0 ------
In our testing we compared the performance of GROMACS 3.1.5 pre 1 and
GROMACS 3.2.0 beta 1 and noticed a significant performance loss between
the two versions, especially on Cairo (our PPC G4 environment). Here
are some scaling graphs illustrating the disparity:
Bazaar (x86 Pentium III):
* LZM-CUT:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/bazaarFAC/cut.png
* DPPC:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/bazaarFAC/dppc.png
* LZM-PME:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/bazaarFAC/pme.png
* Villin:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/bazaarFAC/villin.png
-*- Bazaar configure options for both 3.1.5 and 3.2.0:
--enable-mpi
--enable-mpi-environment=GROMACS_MPI
--enable-float
--disable-software-recip
--enable-software-sqrt
--enable-x86-asm
--disable-ppc-altivec
--disable-cpu-optimization
Cairo (PPC G4):
* LZM-CUT:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/cairoFAC/cut.png
* DPPC:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/cairoFAC/dppc.png
* Villin:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/cairoFAC/villin.png
-*- Cairo configure options for both 3.1.5 and 3.2.0:
--enable-mpi
--enable-mpi-environment=GROMACS_MPI
--enable-float
--disable-software-recip
--enable-software-sqrt
--disable-x86-asm
--enable-ppc-altivec
--disable-cpu-optimization
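Scaling numbers for GROMACS runs are typically taken from the
Time/Performance summary at the end of each md.log. A hedged sketch of
how such numbers can be collected for plotting (the log location and
the exact 3.x column layout are assumptions, so the whole line is kept
rather than a single field):

  # Collect the Performance summary line from every run directory's md.log.
  for d in raw-data/*/; do
    if [ -f "$d/md.log" ]; then
      echo "$d: $(grep '^Performance:' "$d/md.log")"
    fi
  done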
These are just a few observations that, I hope, will help in the
development of GROMACS. If there are any questions about our
environment, or if any of the information contained here needs
clarification, please contact me via email.
Cheers,
Josh Hursey
joshh at cs.earlham.edu
Earlham College Cluster Computing Lab
http://cluster.earlham.edu/html/