[gmx-developers] Benchmark Errors & 3.2.0 Performance Loss

Josh Hursey joshh at cs.earlham.edu
Tue Mar 2 23:31:19 CET 2004


We have been testing GROMACS configurations using versions 3.1.4, 3.1.5 
pre 1, and 3.2.0 beta 1 with the benchmark molecule collection. Along the 
way we reached a few interesting failure states and found some performance 
losses between 3.1.5 and 3.2.0 that developers may find interesting.

We run our tests on two separate clusters; details for each cluster 
can be found here:
Bazaar (Pentium III): http://cluster.earlham.edu/html/bazaar.html
Cairo (PPC G4): http://cluster.earlham.edu/html/cairo.html
General information about Earlham's Cluster Computing Lab is located here:
http://cluster.earlham.edu/html/


--------- Issue 1: Molecules that run with errors ------

The poly-ch2 molecule in the GROMACS 3.0 benchmark suite 
(ftp://ftp.GROMACS.org/pub/benchmarks/gmxbench-3.0.tar.gz) will not run 
in MPI mode (we are using LAM-MPI 7.0.2). Here are links to some of the 
run directories:
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-8on8-8/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-6on6-6/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-4on4-4-Run-1/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-2on2-2-Run-3/
but the single run (1on1-1 without MPI) works fine:
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/poly-Testing-JH-Debug-Molecule-1on1-1-Run-2/
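
For reference, here is a minimal sketch of the kind of parallel invocation 
involved. It is illustrative only: the real [molecule].pl drivers do more 
bookkeeping, and the grompp/mdrun flags and the LAM mpirun command are 
assumptions about a typical GROMACS 3.x setup rather than a copy of our 
scripts.

import subprocess
import sys

def run_parallel(workdir, nprocs):
    # Preprocess the benchmark input for nprocs nodes; GROMACS 3.x splits
    # the system across nodes at grompp time via -np.  File names are the
    # benchmark-suite defaults and may need adjusting.
    subprocess.run(["grompp", "-np", str(nprocs), "-f", "grompp.mdp",
                    "-c", "conf.gro", "-p", "topol.top", "-o", "topol.tpr"],
                   cwd=workdir, check=True)
    # Launch mdrun under LAM-MPI across nprocs uni-processor nodes; the
    # 1on1-1 case simply runs mdrun directly, without mpirun.
    subprocess.run(["mpirun", "-np", str(nprocs),
                    "mdrun", "-np", str(nprocs), "-s", "topol.tpr", "-v"],
                   cwd=workdir, check=True)

if __name__ == "__main__":
    run_parallel(sys.argv[1], int(sys.argv[2]))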

LZM-PME in the benchmark suite stalls in Gromacs 3.2.0 beta 1 when 
MPI is configured to run across 4 nodes, and after a long stall it 
dies. To narrow down where the problem lies, I built Gromacs with a few 
sets of configure options (the compile-time configurations and some run 
directories are listed below; a sketch of the build matrix follows the 
configuration listings):

Baseline:
---------
--prefix=/cluster/cairo/software/GROMACS-3.2.0-Baseline
--enable-mpi
--enable-mpi-environment
--enable-float
--disable-software-recip
--disable-software-sqrt
--disable-x86-asm
--disable-ppc-altivec
--disable-cpu-optimization

-- PME Baseline Gromacs 3.2.0 -> Works on all parallel structures
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Baseline-Optimal-3.2.0-2on2-2/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Baseline-Optimal-3.2.0-4on4-4/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Baseline-Optimal-3.2.0-6on6-6/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Baseline-Optimal-3.2.0-8on8-8/

Optimal Configuration:
-----------------------
--prefix=/cluster/cairo/software/GROMACS-3.2.0_beta1-Optimal
--enable-mpi
--enable-mpi-environment
--enable-float
--disable-software-recip
--enable-software-sqrt
--disable-x86-asm
--enable-ppc-altivec
--disable-cpu-optimization

-- PME Optimal Gromacs 3.2.0 -> Works for parallel structures 2, 6, 8 
(the number signifies how many uni-processor nodes were used)
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-2on2-2/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-6on6-6/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-8on8-8/
-- PME Optimal Gromacs 3.2.0 -> Fails on parallel structure 4on4-4 (4 
uni-processor nodes)
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-4on4-4-Run-1/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-4on4-4-Run-2/
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Gromacs-Confirm-Optimal-3.2.0-4on4-4/


Optimal without Altivec:
-------------------------
--prefix=/cluster/cairo/software/GROMACS-3.2.0-SQRT
--enable-mpi
--enable-mpi-environment
--enable-float
--disable-software-recip
--enable-software-sqrt
--disable-x86-asm
--disable-ppc-altivec
--disable-cpu-optimization

-- Works for the 4-node configuration
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Testing-Confirm-3.2.0-ALTIVEC-4on4-4/

Optimal without Software Sqrt:
------------------------------
--prefix=/cluster/cairo/software/GROMACS-3.2.0-ALT
--enable-mpi
--enable-mpi-environment
--enable-float
--disable-software-recip
--disable-software-sqrt
--disable-x86-asm
--enable-ppc-altivec
--disable-cpu-optimization

-- Works for the 4-node configuration
http://cluster.earlham.edu/project/b-and-t-gromacs/results/cairo/raw-data/pme-Testing-Confirm-3.2.0-SQRT-4on4-4/
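
To summarize the isolation, here is a sketch of the build matrix: the four 
builds above differ only in the software-sqrt and Altivec flags around a 
common set of options, and only the build with both enabled shows the 
4-node failure. The Python driver, source directory, and install prefixes 
below are illustrative, not our actual build script.

import subprocess

COMMON_FLAGS = [
    "--enable-mpi", "--enable-mpi-environment", "--enable-float",
    "--disable-software-recip", "--disable-x86-asm",
    "--disable-cpu-optimization",
]

# Only the software-sqrt and Altivec flags vary between the four builds.
BUILDS = {
    "Baseline": ["--disable-software-sqrt", "--disable-ppc-altivec"],
    "Optimal":  ["--enable-software-sqrt",  "--enable-ppc-altivec"],
    "SQRT":     ["--enable-software-sqrt",  "--disable-ppc-altivec"],  # no Altivec
    "ALT":      ["--disable-software-sqrt", "--enable-ppc-altivec"],   # no soft sqrt
}

def build(name, flags, srcdir="gromacs-3.2.0_beta1"):
    prefix = "/cluster/cairo/software/GROMACS-3.2.0-" + name  # illustrative
    subprocess.run(["./configure", "--prefix=" + prefix] + COMMON_FLAGS + flags,
                   cwd=srcdir, check=True)
    subprocess.run(["make"], cwd=srcdir, check=True)             # in practice, run
    subprocess.run(["make", "install"], cwd=srcdir, check=True)  # make distclean between builds

for name, flags in BUILDS.items():
    build(name, flags)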


Some clarification notes on the above errors:
* The directories contain all of the information you should need; the 
[molecule].pl file is the executable we used to run each test.
* All tests were intended to run 3 iterations.
* Each molecule is initialized from a 'clean' configuration copied from 
the original tarball of the GROMACS benchmark suite (see the setup sketch 
below).
* On Bazaar, our x86 machines, this problem does not occur.
* Furthermore, the problem only occurs when GROMACS is configured with 
both the Altivec and software square root flags enabled; with either flag 
enabled individually (as illustrated above) the runs complete without the 
stall-then-death error.
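
To make the 'clean configuration, 3 iterations' setup concrete, here is a 
small sketch of the preparation step. The directory naming and the plain 
tar extraction are illustrative; the actual per-run driving is done by the 
[molecule].pl scripts in the linked directories.

import os
import subprocess

CLEAN_TARBALL = "gmxbench-3.0.tar.gz"   # the benchmark suite tarball linked above

def prepare_runs(molecule, nprocs, iterations=3):
    rundirs = []
    for run in range(1, iterations + 1):
        # Naming is illustrative (cf. poly-...-4on4-4-Run-1 above).
        workdir = "%s-%don%d-%d-Run-%d" % (molecule, nprocs, nprocs, nprocs, run)
        os.makedirs(workdir)
        # Re-extract the untouched inputs so every iteration starts from the
        # same clean configuration shipped in the original tarball.
        subprocess.run(["tar", "xzf", CLEAN_TARBALL, "-C", workdir], check=True)
        rundirs.append(workdir)
    return rundirs   # each directory is then driven by the [molecule].pl script

prepare_runs("poly-ch2", 4)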


--------- Issue 2: Performance Loss between 3.1.5 and 3.2.0 ------
In our testing we compared the performance of GROMACS 3.1.5 pre 1 
and GROMACS 3.2.0 beta 1, and noticed a significant performance loss 
between the two versions, especially on Cairo (our PPC G4 environment). 
Here are some scaling graphs illustrating this disparity (a small 
reduction sketch follows the configuration listings below):

Bazaar (x86 Pentium III):
  * LZM-CUT:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/bazaarFAC/cut.png
  * DPPC:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/bazaarFAC/dppc.png
  * LZM-PME:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/bazaarFAC/pme.png
  * Villin
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/bazaarFAC/villin.png
-*- Bazaar configure options for both 3.1.5 and 3.2.0:
--enable-mpi
--enable-mpi-environment=GROMACS_MPI
--enable-float
--disable-software-recip
--enable-software-sqrt
--enable-x86-asm
--disable-ppc-altivec
--disable-cpu-optimization

Cairo (PPC G4):
  * LZM-CUT:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/cairoFAC/cut.png
  * DPPC:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/cairoFAC/dppc.png
  * Villin:
http://cluster.earlham.edu/home/joshh/doc/gromacs-perf/cairoFAC/villin.png
-*- Cairo configure options for both 3.1.5 and 3.2.0:
--enable-mpi
--enable-mpi-environment=GROMACS_MPI
--enable-float
--disable-software-recip
--enable-software-sqrt
--disable-x86-asm
--enable-ppc-altivec
--disable-cpu-optimization
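
To make the comparison concrete, here is a small sketch of the reduction 
behind scaling curves like those above: speedup at each node count relative 
to the single-node run for each version, plus the relative change between 
versions. The function names are made up and the example inputs are 
placeholders, not our measured numbers.

def speedup(perf):
    # perf maps node count -> performance (e.g. simulation ps/day); speedup
    # is taken relative to the single-node run.
    base = perf[1]
    return {n: p / base for n, p in sorted(perf.items())}

def relative_change(old, new):
    # Fractional performance change of the newer version at each node count
    # (negative values mean a loss relative to the older version).
    return {n: (new[n] - old[n]) / old[n] for n in sorted(old) if n in new}

if __name__ == "__main__":
    gmx_315 = {1: 1.0, 2: 1.9, 4: 3.6, 8: 6.4}   # placeholder values only
    gmx_320 = {1: 0.9, 2: 1.7, 4: 3.1, 8: 5.3}   # placeholder values only
    print("3.1.5 speedup:", speedup(gmx_315))
    print("3.2.0 speedup:", speedup(gmx_320))
    print("3.2.0 relative to 3.1.5:", relative_change(gmx_315, gmx_320))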


These are just a few observations that, I hope, will help in the 
development of GROMACS. If you have any questions about our environment 
or need clarification of any of the information contained here, please 
contact me via email.

Cheers,
Josh Hursey
joshh at cs.earlham.edu
Earlham College Cluster Computing Lab
http://cluster.earlham.edu/html/




