[gmx-users] Can we set the number of pure PME nodes when using GPU&CPU?
Carsten Kutzner
ckutzne at gwdg.de
Mon Aug 11 13:07:01 CEST 2014
Hi,
you could start twice as many MPI processes per node as you have GPUs on a
node and use half of all processes for PME, e.g. on 4 nodes:
mpirun -np 8 mdrun -s in.tpr -npme 4
or start 4 processes per node:
mpirun -np 16 mdrun -s in.tpr -npme 4 -gpu_id 0011
or with more OpenMP threads for the PME processes:
mpirun -np 16 mdrun -ntomp 2 -ntomp_pme 6 -pin on -s in.tpr -npme 4 -gpu_id 0011
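As a reading aid (my annotation, not part of the commands above, and assuming
GROMACS 4.6/5.x semantics for -gpu_id):

# The -gpu_id string is interpreted per node, one digit per PP (non-PME) rank
# on that node, in rank order, so
#   -gpu_id 0011  ->  the first two PP ranks of a node use GPU 0,
#                     the next two PP ranks use GPU 1
# and its length has to match the number of PP ranks that end up on each node.
mpirun -np 16 mdrun -s in.tpr -npme 4 -gpu_id 0011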
With -ntomp and -ntomp_pme you can fine-tune the compute
power ratio between the PME and PP nodes. You need to try out different
combinations to find the optimum; the comments in the md.log file
give hints on what to change.
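For example, a short benchmarking loop along these lines can help (a sketch
only: the file names, the 16-rank layout, and the use of mdrun's
-nsteps/-resethway options are my assumptions, not part of the original
commands):

# Try two PP/PME OpenMP thread splits and compare ns/day from the log files.
for split in "2 6" "4 4"; do
    set -- $split        # $1 = OpenMP threads per PP rank, $2 = per PME rank
    mpirun -np 16 mdrun -s in.tpr -npme 4 -gpu_id 0011 -pin on \
           -ntomp $1 -ntomp_pme $2 \
           -nsteps 2000 -resethway -deffnm bench_ntomp${1}_pme${2}
done
grep -H "Performance:" bench_*.log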
This approach usually yields good performance if you use several nodes;
on a single node there are almost certainly better settings (most likely
more MPI processes with fewer OpenMP threads each).
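On one of the GPU nodes described in the quoted mail below (2x 8-core E5-2670,
2x K20M), a single-node starting point might look like this (again only a
sketch under those assumptions; thread-MPI is used, so no mpirun is needed):

# 4 thread-MPI ranks x 4 OpenMP threads use all 16 cores; ranks 1-2 drive
# GPU 0, ranks 3-4 drive GPU 1; no separate PME ranks (-npme 0).
mdrun -s in.tpr -ntmpi 4 -ntomp 4 -npme 0 -gpu_id 0011 -pin on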
Carsten
On 11 Aug 2014, at 11:45, Theodore Si <sjyzhxw at gmail.com> wrote:
> Hi Mark,
>
> This is the information about our cluster. Could you give us some advice regarding it so that we can make GMX run faster on our system?
>
> Each CPU node has 2 CPUs, and each GPU node has 2 CPUs and 2 NVIDIA K20M GPUs.
>
>
> Device Name / Device Type / Specifications / Number:
>
> CPU Node (Intel H2216JFFKR):
>   CPU: 2x Intel Xeon E5-2670 (8 cores, 2.6 GHz, 20 MB cache, 8.0 GT/s)
>   Mem: 64 GB (8x8 GB) ECC Registered DDR3-1600 Samsung; 332 nodes
> Fat Node (Intel H2216WPFKR):
>   CPU: 2x Intel Xeon E5-2670 (8 cores, 2.6 GHz, 20 MB cache, 8.0 GT/s)
>   Mem: 256 GB (16x16 GB) ECC Registered DDR3-1600 Samsung; 20 nodes
> GPU Node (Intel R2208GZ4GC):
>   CPU: 2x Intel Xeon E5-2670 (8 cores, 2.6 GHz, 20 MB cache, 8.0 GT/s)
>   Mem: 64 GB (8x8 GB) ECC Registered DDR3-1600 Samsung; 50 nodes
> MIC Node (Intel R2208GZ4GC):
>   CPU: 2x Intel Xeon E5-2670 (8 cores, 2.6 GHz, 20 MB cache, 8.0 GT/s)
>   Mem: 64 GB (8x8 GB) ECC Registered DDR3-1600 Samsung; 5 nodes
> Computing network switches:
>   Mellanox InfiniBand FDR core switch MSX6536-10R (648 ports), with Mellanox Unified Fabric Manager; 1
>   Mellanox SX1036 40 GbE switch (36x QSFP ports); 1
> Management network switches:
>   Extreme Summit X440-48t-10G layer-2 switch (48x 1 GbE ports), licensed ExtremeXOS; 9
>   Extreme Summit X650-24X layer-3 switch (24x 10 GbE ports), licensed ExtremeXOS; 1
> Parallel storage: DDN SFA12K storage system; 1
> GPU: NVIDIA Tesla K20M accelerator (Kepler); 70
> MIC: Intel Xeon Phi 5110P (Knights Corner); 10
> 40 GbE card: Mellanox MCX314A-BCBT (ConnectX-3 chip), 2x 40 GbE ports, with sufficient QSFP cables; 16
> SSD: Intel SSD 910, 400 GB, PCIe; 80
>
> On 8/10/2014 5:50 AM, Mark Abraham wrote:
>> That's not what I said.... "You can set..."
>>
>> -npme behaves the same whether or not GPUs are in use. Using separate ranks
>> for PME caters to trying to minimize the cost of the all-to-all
>> communication of the 3DFFT. That's still relevant when using GPUs, but if
>> separate PME ranks are used, any GPUs on nodes that only have PME ranks are
>> left idle. The most effective approach depends critically on the hardware
>> and simulation setup, and whether you pay money for your hardware.
>>
>> Mark
>>
>>
>> On Sat, Aug 9, 2014 at 2:56 AM, Theodore Si <sjyzhxw at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> You mean that, whether we use GPU acceleration or not, -npme is just a
>>> hint?
>>> Why can't we set it to an exact value?
>>>
>>>
>>> On 8/9/2014 5:14 AM, Mark Abraham wrote:
>>>
>>>> You can set the number of PME-only ranks with -npme. Whether it's useful
>>>> is
>>>> another matter :-) The CPU-based PME offload and the GPU-based PP offload
>>>> do not combine very well.
>>>>
>>>> Mark
>>>>
>>>>
>>>> On Fri, Aug 8, 2014 at 7:24 AM, Theodore Si <sjyzhxw at gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> Can we set the number manually with -npme when using GPU acceleration?
>
--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa