[gmx-developers] Oversubscribing on 4.62 with MPI / OpenMP

Erik Marklund erikm at xray.bmc.uu.se
Thu Apr 25 09:54:35 CEST 2013


Hi,

That makes a lot of sense. Thanks.

Best,
Erik

On 25 Apr 2013, at 09:52, "hess at kth.se" <hess at kth.se> wrote:

> Hi,
> 
> It allows further scaling when the domain decomposition limits the number of MPI ranks.
> It can be faster, especially on hundreds of cores.
> We need it with GPUs.
> OpenMP alone can be significantly faster than MPI alone.
> 
> Cheers,
> 
> Berk
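
For illustration, a hybrid launch of the kind Berk describes could look like the sketch below. The binary name mdrun_mpi, the node and core counts, and the -deffnm file name are assumptions for the example, not taken from this thread; -npernode is Open MPI's mpirun option:

    # 2 nodes with 8 cores each: one MPI rank per node, 8 OpenMP threads per rank
    export OMP_NUM_THREADS=8
    mpirun -np 2 -npernode 1 mdrun_mpi -ntomp 8 -deffnm topol

    # the same hardware driven with plain MPI, for comparison
    mpirun -np 16 mdrun_mpi -ntomp 1 -deffnm topol

Near the domain-decomposition limit (or with GPUs) the first form keeps the number of ranks small while OpenMP fills the cores within each rank's domain.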
> 
> 
> ----- Reply message -----
> From: "Erik Marklund" <erikm at xray.bmc.uu.se>
> To: "Discussion list for GROMACS development" <gmx-developers at gromacs.org>
> Subject: [gmx-developers] Oversubscribing on 4.62 with MPI / OpenMP
> Date: Thu, Apr 25, 2013 09:47
> 
> 
> Hi,
> 
> Please remind me why we allow mixed OpenMP+MPI even though it is always slower. It must be more complicated to maintain code that allows such mixing.
> 
> Best,
> Erik
> 
> On 25 Apr 2013, at 09:43, "hess at kth.se" <hess at kth.se> wrote:
> 
>> Hi
>> 
>> Yes, that is expected.
>> Combined MPI + OpenMP is always slower than either of the two alone, except close to the scaling limit.
>> Two OpenMP threads give the least overhead, especially with hyperthreading, although turning off hyperthreading is then probably faster.
>> 
>> Cheers,
>> 
>> Berk
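
As a sketch of the two-OpenMP-threads-per-rank layout Berk mentions, a single 16-core node could be run roughly like this (binary name and core count are illustrative):

    # 8 MPI ranks x 2 OpenMP threads each, with thread pinning enabled
    mpirun -np 8 mdrun_mpi -ntomp 2 -pin on

Pinning keeps each thread on a fixed core, which matters most when hyperthreading is enabled.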
>> 
>> 
>> ----- Reply message -----
>> From: "Jochen Hub" <jhub at gwdg.de>
>> To: "Discussion list for GROMACS development" <gmx-developers at gromacs.org>
>> Subject: [gmx-developers] Oversubscribing on 4.62 with MPI / OpenMP
>> Date: Thu, Apr 25, 2013 09:37
>> 
>> 
>> 
>> On 4/24/13 9:53 PM, Mark Abraham wrote:
>> > I suspect -np 2 is not starting a process on each node the way you
>> > think it should, because all the symptoms are consistent with that.
>> > Possibly the Host field in the .log file output is diagnostic here.
>> > Check how your MPI configuration works.
>> 
>> I fixed the issue with the MPI call. I now make sure that only one MPI
>> process is started per node (mpiexec -n 2 -npernode=1 or -bynode). The
>> oversubscription warning no longer appears, so everything seems fine.
>> 
>> However, the performance is quite poor with MPI/OpenMP. Example:
>> 
>> (100 kAtoms, PME, Verlet, cutoffs at 1 nm, nstlist=10)
>> 
>> 16 MPI processes: 6.8 ns/day
>> 2 MPI processes, 8 OpenMP threads per MPI process: 4.46 ns/day
>> 4 MPI / 4 OpenMP each does not improve things.
>> 
>> I use icc 13, and I tried different MPI implementations (MVAPICH 1.8,
>> Open MPI 1.33).
>> 
>> Is that expected?
>> 
>> Many thanks,
>> Jochen
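
For reference, the two launch patterns Jochen compares would look roughly like the following with Open MPI (binary and input names are illustrative, and the exact per-node options differ between MPI implementations):

    # 16 MPI ranks, pure MPI
    mpiexec -n 16 mdrun_mpi -ntomp 1 -deffnm bench

    # 2 MPI ranks, one per node, 8 OpenMP threads each
    mpiexec -n 2 -npernode 1 mdrun_mpi -ntomp 8 -deffnm bench

As Berk notes above, far from the scaling limit the hybrid setup is expected to be slower than pure MPI, so the numbers reported here are in line with that.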
>> 
> 
