[gmx-developers] Oversubscribing on 4.62 with MPI / OpenMP
Erik Marklund
erikm at xray.bmc.uu.se
Thu Apr 25 09:54:35 CEST 2013
Hi,
That makes a lot of sense. Thanks.
Best,
Erik
On 25 Apr 2013, at 09:52, "hess at kth.se" <hess at kth.se> wrote:
> Hi,
>
> It allows for further scaling when the domain decomposition limits the number of MPI ranks.
> It can be faster, especially on hundreds of cores.
> We need it with GPUs.
> OpenMP alone can be significantly faster than MPI alone.
>
> Cheers,
>
> Berk
>
>
> ----- Reply message -----
> From: "Erik Marklund" <erikm at xray.bmc.uu.se>
> To: "Discussion list for GROMACS development" <gmx-developers at gromacs.org>
> Subject: [gmx-developers] Oversubscribing on 4.62 with MPI / OpenMP
> Date: Thu, Apr 25, 2013 09:47
>
>
> Hi,
>
> Please remind me why we allow for mixed OpenMP+MPI even though it is always slower. It ought to be more complicated to maintain code that allows such mixing.
>
> Best,
> Erik
>
> On 25 Apr 2013, at 09:43, "hess at kth.se" <hess at kth.se> wrote:
>
>> Hi
>>
>> Yes, that is expected.
>> Combined MPI+OpenMP is always slower than either of the two, except close to the scaling limit.
>> Two OpenMP threads give the least overhead, especially with hyperthreading, although turning off hyperthreading is then probably faster.
>>
>> Cheers,
>>
>> Berk
>>
>>
>> ----- Reply message -----
>> From: "Jochen Hub" <jhub at gwdg.de>
>> To: "Discussion list for GROMACS development" <gmx-developers at gromacs.org>
>> Subject: [gmx-developers] Oversubscribing on 4.62 with MPI / OpenMP
>> Date: Thu, Apr 25, 2013 09:37
>>
>>
>>
>> Am 4/24/13 9:53 PM, schrieb Mark Abraham:
>> > I suspect -np 2 is not starting a process on each node like I suspect
>> > you think it should, because all the symptoms are consistent with that.
>> > Possibly the Host field in the .log file output is diagnostic here.
>> > Check how your MPI configuration works.
>>
>> I fixed the issue with the MPI call. I now make sure that only one MPI
>> process is started per node (mpiexec -n 2 -npernode=1 or -bynode). The
>> oversubscription warning no longer appears, so everything seems fine.
>>
>> However, the performance is quite poor with MPI/OpenMP. Example:
>>
>> (100k atoms, PME, Verlet, cutoffs at 1 nm, nstlist=10)
>>
>> 16 MPI processes: 6.8 ns/day
>> 2 MPI processes, 8 OpenMP threads per MPI process: 4.46 ns/day
>> 4 MPI / 4 OpenMP each does not improve things.
>>
>> I use icc 13, and I tried different MPI implementations (Mvapich 1.8,
>> openmpi 1.33).
>>
>> Is that expected?
>>
>> Many thanks,
>> Jochen
>>
>
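The throughput figures quoted above can be put side by side with a short calculation. This is only a sketch: it assumes the runs used 16 cores in total and that the 16-rank run used one thread per rank, neither of which is stated explicitly in the thread. The names `reported` and `layouts` are illustrative and do not come from GROMACS.

```python
# Throughput figures quoted in the thread (ns/day), keyed by
# (MPI ranks, OpenMP threads per rank). Only these two were reported;
# one thread per rank for the 16-rank run is an assumption.
reported = {
    (16, 1): 6.8,
    (2, 8): 4.46,
}


def layouts(total_cores):
    """Yield every (ranks, threads_per_rank) pair that uses exactly total_cores."""
    for ranks in range(1, total_cores + 1):
        if total_cores % ranks == 0:
            yield ranks, total_cores // ranks


# Pure MPI is the reference point for the comparison.
baseline = reported[(16, 1)]

# Assumed core count: 16 (not stated explicitly in the thread).
for ranks, threads in layouts(16):
    ns_day = reported.get((ranks, threads))
    if ns_day is None:
        print(f"{ranks:2d} ranks x {threads:2d} threads: not reported")
    else:
        print(f"{ranks:2d} ranks x {threads:2d} threads: {ns_day} ns/day "
              f"({ns_day / baseline:.0%} of pure MPI)")
```

With the reported numbers, the 2 x 8 hybrid run reaches roughly two thirds of the pure-MPI throughput, which matches the reply that combined MPI+OpenMP is expected to be slower away from the scaling limit. The 4 x 4 layout shows as "not reported" because the thread only says it did not improve things, without giving a figure.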