[gmx-developers] Alternate Parallelization scheme
nmoore at physics.umn.edu
Sat Jun 18 17:23:38 CEST 2005
I'm porting GROMACS to IBM's Blue Gene system. At present, the DPPC
benchmark runs fastest at about 64 processors. A deeper look into the
execution shows that of the ~1100 seconds of walltime the run takes, ~500
seconds are used for MPI communication. I'd like to see if this can be
reduced.
Accordingly, I'd like to implement an MPI collective routine rather than
the ring structure currently implemented. The most important parts to
parallelize seem to be the move_x and move_f functions (perhaps also
(1) In what function are the x and f arrays declared? I assume both hold
Cartesian triplets - is this defined in a struct, or are the arrays flat?
(2) Which particles does each node control? I assume that the grompp
options -sort and -shuffle destroy the easy mapping in which rank 0 gets
the first 10 atoms, rank 1 gets the next 10 atoms, and so on.
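
To make the "easy mapping" in (2) concrete, here is a minimal sketch in C of
an even, contiguous block split of atoms over ranks. It also shows the
per-rank counts and displacements that a collective such as MPI_Allgatherv
expects for a flat coordinate buffer. This is an illustration under stated
assumptions, not GROMACS code: block_start and block_layout are hypothetical
names, the buffer is assumed to be flat with three reals per atom, and
nothing here reflects how -sort or -shuffle actually reorder atoms.

```c
#include <assert.h>

#define DIM 3 /* Cartesian components per atom: x, y, z */

/* Hypothetical helper: index of the first atom owned by `rank` when
 * `natoms` atoms are split as evenly as possible over `nnodes` ranks.
 * The first (natoms % nnodes) ranks each get one extra atom.
 * Calling it with rank == nnodes returns natoms (one past the end). */
static int block_start(int natoms, int nnodes, int rank)
{
    int base, rem;

    assert(nnodes > 0 && rank >= 0 && rank <= nnodes);
    base = natoms / nnodes;
    rem  = natoms % nnodes;
    return rank * base + (rank < rem ? rank : rem);
}

/* Hypothetical helper: fill counts[] and displs[] (one entry per rank),
 * measured in reals rather than atoms. These are the values that would
 * be passed as the recvcounts and displs arguments of MPI_Allgatherv
 * if every rank contributed its own contiguous block of a flat array. */
static void block_layout(int natoms, int nnodes, int *counts, int *displs)
{
    int rank;

    for (rank = 0; rank < nnodes; rank++) {
        int start = block_start(natoms, nnodes, rank);
        int end   = block_start(natoms, nnodes, rank + 1);

        counts[rank] = (end - start) * DIM;
        displs[rank] = start * DIM;
    }
}
```

With 20 atoms on 2 ranks this gives the mapping described above: rank 0
owns atoms 0-9 and rank 1 owns atoms 10-19. If the real assignment is not
contiguous in atom index, a single gather with these displacements would
not be enough, which is why the answer to (2) matters.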