[gmx-users] Intel vs gcc compilers

Tue Jun 25 19:20:58 CEST 2013

On Tue, Jun 25, 2013 at 5:46 PM, Pedro Lacerda <kanvuanza+gmx at gmail.com> wrote:
> On Tue, Jun 25, 2013 at 8:53 AM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
>
>> You're using a real-MPI process per core, and you have six cores per
>> processor. The recommended procedure is to map cores to OpenMP
>> threads, and choose the number of MPI processes per processor (and
>> thus the number of OpenMP threads per MPI process) to maximize
>> performance. See
>>
>> http://www.gromacs.org/Documentation/Acceleration_and_parallelization#Multi-level_parallelization.3a_MPI.2fthread-MPI_.2b_OpenMP
>
>
> The page says:
>
>> at the moment, the multi-level parallelization will surpass the
>> (thread-)MPI-only parallelization only in case of highly parallel runs
>> and/or with a slow network.
>
>
> What does "highly parallel runs" mean? I'm sure it works for Djurre as he has 72
> nodes, but how many six-core nodes are considered highly parallel?

Not sure, perhaps Szilard can contribute some ballpark estimates. I'd
guess fewer than 200 atoms/core would generally count as highly
parallel (OP has around 300 atoms/core), but it would also depend on
the number of nodes, socket structure and network quality. The total
cost of MPI communication grows with the number of ranks N, the number
of PME-only nodes grows likewise, and the PME cost probably grows as
N log(N) because of its global communication... It does boil down to
"if you really care about maximum performance, test the available
possibilities." But remember to factor the human time spent testing
into whether it's worth your trouble ;-)
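
As a rough illustration of "the available possibilities", here is a
minimal Python sketch that lists every way to split the cores of one
processor into MPI ranks times OpenMP threads, and applies the
atoms/core rule of thumb from above. The function names are made up
for this example, the 200 atoms/core threshold is only my guess, and
the system size is hypothetical (chosen to give about 300 atoms/core
on 72 cores). It is not a substitute for actually benchmarking.

```python
def rank_thread_splits(cores):
    """Return every (mpi_ranks, omp_threads) pair whose product is `cores`."""
    return [(r, cores // r) for r in range(1, cores + 1) if cores % r == 0]


def atoms_per_core(n_atoms, n_cores):
    """Average number of atoms each core has to handle."""
    return n_atoms / n_cores


def is_highly_parallel(n_atoms, n_cores, threshold=200):
    """Rule of thumb only: below `threshold` atoms/core counts as highly parallel.

    The default of 200 is a guess, not a measured value.
    """
    return atoms_per_core(n_atoms, n_cores) < threshold


# Candidate layouts for one six-core processor.
for ranks, threads in rank_thread_splits(6):
    print(f"{ranks} MPI rank(s) x {threads} OpenMP thread(s)")

# Hypothetical system: 21600 atoms on 72 cores, i.e. 300 atoms/core.
print(atoms_per_core(21600, 72), is_highly_parallel(21600, 72))
```

Each layout printed for the six-core case is a candidate worth timing
on a short run; which one wins will depend on the hardware and network.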

Mark