[gmx-users] GPU and MPI
Carsten Kutzner
ckutzne at gwdg.de
Fri Aug 29 17:16:00 CEST 2014
Hi Dawei,
On 29 Aug 2014, at 16:52, Da-Wei Li <lidawei at gmail.com> wrote:
> Dear Carsten
>
> Thanks for the clarification. Here are my benchmarks for a small protein
> system (18k atoms).
>
> (1) 1 node (12 cores/node, no GPU): 50 ns/day
> (2) 2 nodes (12 cores/node, no GPU): 80 ns/day
> (3) 1 node (12 cores/node, 2 K40 GPUs/node): 100 ns/day
> (4) 2 nodes (12 cores/node, 2 K40 GPUs/node): 40 ns/day
>
>
> I sent out this question because benchmark 4 above is very suspicious.
Indeed, if you get 80 ns/day without GPUs, then it should not be less
with GPUs. For how many time steps do you run each of the
benchmarks? Do you use the -resethway command-line switch to mdrun
to disregard the first half of the run (where initialization and load
balancing are done; you don't want to count that in a benchmark)?
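For example, a short benchmark run could look like this (a sketch only;
-nsteps, -resethway, and -noconfout are standard mdrun options, but check
which flags your GROMACS version provides):

mpiexec -npernode 2 -np 4 mdrun_mpi -ntomp 6 -nsteps 5000 -resethway -noconfout

This runs 5000 MD steps and resets the performance counters half-way through,
so only the second, load-balanced half of the run enters the ns/day figure.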
Carsten
> But I agree that the size of my system may play a role.
>
> best,
>
> dawei
>
>
> On Fri, Aug 29, 2014 at 10:36 AM, Carsten Kutzner <ckutzne at gwdg.de> wrote:
>
>> Hi Dawei,
>>
>> the mapping of GPUs to PP ranks is printed for the master node only,
>> but if that node reports two GPUs, then the PP ranks on the other nodes
>> will also use two GPUs each (or an error is reported).
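>> If you want to make the mapping explicit, you can pass the GPU ids per
>> node with mdrun's -gpu_id option; a sketch, assuming two PP ranks per
>> node and GPUs 0 and 1 present on each node:
>>
>> mpiexec -npernode 2 -np 4 mdrun_mpi -ntomp 6 -gpu_id 01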
>>
>> The scaling will also depend on your system size; if it is too small,
>> you might be better off using a single node.
>>
>> Carsten
>>
>>
>> On 29 Aug 2014, at 16:24, Da-Wei Li <lidawei at gmail.com> wrote:
>>
>>> Dear users,
>>>
>>> I recently tried to run GROMACS on two nodes, each of which has 12 cores
>>> and 2 GPUs. The nodes are connected with InfiniBand, and scaling is
>>> pretty good when no GPU is involved.
>>>
>>> My command is like this:
>>>
>>> mpiexec -npernode 2 -np 4 mdrun_mpi -ntomp 6
>>>
>>>
>>> However, it looks like GROMACS only detected 2 GPUs on node 0 and then
>>> skipped node 1. Part of the output looks like this:
>>>
>>>
>>> ************************
>>> Using 4 MPI processes
>>> Using 6 OpenMP threads per MPI process
>>>
>>> 2 GPUs detected on host n0316.ten:
>>>   #0: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat: compatible
>>>   #1: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat: compatible
>>>
>>> 2 GPUs user-selected for this run.
>>> Mapping of GPUs to the 2 PP ranks in this node: #0, #1
>>> ****************************
>>>
>>>
>>> The performance is only about 40% of the run where I use only 1 node
>>> (12 cores + 2 GPUs).
>>>
>>>
>>> Did I miss something?
>>>
>>>
>>> thanks.
>>>
>>>
>>> dawei
--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa