[gmx-users] scalability of Gromacs with MPI
Berk Hess
gmx3 at hotmail.com
Tue Jan 24 08:17:14 CET 2006
>From: Jan Thorbecke <janth at xs4all.nl>
>Reply-To: Discussion list for GROMACS users <gmx-users at gromacs.org>
>To: gmx-users at gromacs.org
>Subject: [gmx-users] scalability of Gromacs with MPI
>Date: Mon, 23 Jan 2006 16:00:59 +0100
>
>
>Dear Users,
>
>At this moment I'm working on a benchmark for Gromacs. The benchmark is
>set up to run from 32 to 128 CPUs. The scalability is fine up to 64 CPUs;
>beyond that the code does not scale any more (see table below). What
>prevents it from scaling are the (ring) communication parts move_x and
>move_f. Together those parts take about 20 s on 128 CPUs.
>
>CPUs | GROMACS 3.3 + FFTW3 |
>-----|---------------------|
>  32 | 142 s               |
>  64 |  88 s               |
> 128 |  70 s               |
>
>
>I have no background in Molecular Dynamics and just look at the code from
>a performance point of view. My questions are:
>
>- Has anybody scaled Gromacs up to more than 64 CPUs? My guess is that,
>inherent to the MD problem solved by Gromacs, there is a limit on the
>number of processors that can be used efficiently. At some point the
>communication of the forces to all other CPUs will dominate the
>wall-clock time.
>
No, this is not inherent to the MD problem, but inherent to particle
decomposition (as opposed to domain decomposition). With particle
decomposition each CPU has to communicate roughly the same total amount
of data per step no matter how many CPUs you use, so communication
eventually dominates; with domain decomposition each CPU only exchanges
data for the atoms near its boundaries.
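A rough back-of-envelope model (my own illustration, not taken from the
Gromacs source) makes the difference concrete. With N atoms, P CPUs, c
bytes communicated per atom and B the per-link bandwidth:

  ring pass per step:  t_comm ~ (P-1) * (N/P) * c / B  ->  N*c/B  for large P
  compute per step:    t_comp ~ N/P

t_comp keeps shrinking as P grows, but t_comm levels off at a constant,
so beyond some P the ring communication dominates the wall-clock time.
With domain decomposition each CPU only communicates its boundary
region, which scales like (N/P)^(2/3) and therefore keeps shrinking
with P.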
>- I tried to change the ring communication in move_x and move_f to
>collective communication, but that does not help the scalability. Has
>anybody tried other communication schemes?
>
We tried several, but the ring communication turned out to be the most
efficient, better for instance than dedicated MPI calls.
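
For illustration, here is a minimal sketch of how such a ring pass can
be written (my own simplified version, not the actual move_x code; the
function name, the flat coordinate array and the equal block sizes are
assumptions):

#include <mpi.h>

/* Ring all-gather of coordinates: x holds nprocs blocks of
 * 3*natoms_per_cpu floats; block r initially belongs to rank r.
 * Each step every rank forwards the block it received last to its
 * right neighbour, so after nprocs-1 steps all ranks hold all blocks. */
void ring_move_x(float *x, int natoms_per_cpu, int nprocs, int rank,
                 MPI_Comm comm)
{
    int left  = (rank - 1 + nprocs) % nprocs;
    int right = (rank + 1) % nprocs;
    int i;

    for (i = 1; i < nprocs; i++)
    {
        int send_block = (rank - i + 1 + nprocs) % nprocs; /* last received */
        int recv_block = (rank - i + nprocs) % nprocs;     /* arrives now   */

        MPI_Sendrecv(x + 3*natoms_per_cpu*send_block,
                     3*natoms_per_cpu, MPI_FLOAT, right, 0,
                     x + 3*natoms_per_cpu*recv_block,
                     3*natoms_per_cpu, MPI_FLOAT, left, 0,
                     comm, MPI_STATUS_IGNORE);
    }
}

Each MPI_Sendrecv only talks to direct neighbours, which is exactly why
a ring can beat collectives on networks where neighbour bandwidth is
cheap.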
>- Are there options to try with grompp to set up a different domain
>decomposition (for example blocks in x,y,z instead of lines in x) or
>other parallelisation strategies?
No, but we are working on domain decomposition.
There is one point where one can improve communication,
and that is in gmx_sumf and gmx_sumd:
if one replaces the ring in those calls by MPI_Allreduce,
one can get a small performance improvement on many CPUs.
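
Such a replacement could look roughly like this (a minimal sketch under
the assumption that a scratch buffer of the same size is available; the
real gmx_sumf has a different signature):

#include <mpi.h>
#include <string.h>

/* Sum the float array f element-wise over all ranks; every rank ends
 * up with the full sum. MPI-1 has no in-place reduction, so we reduce
 * into a scratch buffer and copy the result back. */
void sumf_allreduce(int nr, float f[], float fbuf[], MPI_Comm comm)
{
    MPI_Allreduce(f, fbuf, nr, MPI_FLOAT, MPI_SUM, comm);
    memcpy(f, fbuf, nr * sizeof(float));
}

A good MPI implementation turns MPI_Allreduce into a tree or
recursive-doubling reduction, which takes O(log P) steps instead of the
O(P) steps of a ring, hence the gain on many CPUs.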
Berk.