[gmx-users] GTX 960 vs Tesla K40

Mon Jun 18 23:23:58 CEST 2018

On Mon, Jun 18, 2018 at 2:22 AM, Alex <nedomacho at gmail.com> wrote:

> Thanks for the heads up. With the K40c instead of GTX 960 here's what I
> did and here are the results:
>
> 1. Enabled persistence mode and overclocked the card via nvidia-smi:
> http://acceleware.com/blog/gpu-boost-nvidias-tesla-k40-gpus

Note that persistence mode is only for convenience.
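
For reference, a minimal sketch of the nvidia-smi calls involved, written as a small Python helper that only builds the commands and runs nothing. The helper name, the GPU index 0 and the 3004,875 MHz clock pair are assumptions for illustration; check the values your card actually reports with the SUPPORTED_CLOCKS query before setting anything.

```python
import shlex


def k40_clock_commands(gpu_id=0, mem_mhz=3004, graphics_mhz=875):
    """Return nvidia-smi invocations as argument lists (nothing is executed).

    gpu_id and the clock pair are illustrative defaults, not verified values.
    """
    gpu = str(gpu_id)
    return [
        # Persistence mode keeps the driver loaded between jobs (a convenience).
        ["nvidia-smi", "-i", gpu, "-pm", "1"],
        # List the memory/graphics clock pairs the card reports as supported.
        ["nvidia-smi", "-i", gpu, "-q", "-d", "SUPPORTED_CLOCKS"],
        # Set application clocks as <memory,graphics> in MHz (usually needs root).
        ["nvidia-smi", "-i", gpu, "-ac", f"{mem_mhz},{graphics_mhz}"],
    ]


if __name__ == "__main__":
    for cmd in k40_clock_commands():
        print(shlex.join(cmd))
```

Printing the commands rather than executing them makes it easy to review them before running anything as root.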

> 2. Offloaded PME's FFT to the GPU (which wasn't the case with the GTX 960); this
> brought the "pme mesh / force" ratio to something like 1.07.
>

I still think you are running multiple ranks, which is unlikely to be ideal,
but without seeing a log file it's hard to tell.
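
If you want to summarize the balance yourself before sharing the log, here is a rough sketch that pulls the "pme mesh/force" values out of the DD lines in md.log. It assumes the line layout quoted in this thread ("DD  step N ... pme mesh/force X"); the function name and the sample values are made up for illustration.

```python
import re
from statistics import mean

# Matches lines such as: "DD  step 4999  ...  pme mesh/force 1.229"
DD_RATIO = re.compile(r"^DD\s+step\s+(\d+).*?pme mesh/force\s+(\d+(?:\.\d+)?)")


def pme_force_ratios(log_text):
    """Return (step, ratio) pairs for every DD line reporting pme mesh/force."""
    pairs = []
    for line in log_text.splitlines():
        match = DD_RATIO.match(line.strip())
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


# Made-up sample in the shape of the md.log lines discussed in this thread.
sample = """\
DD  step 4999  pme mesh/force 1.229
DD  step 9999  pme mesh/force 1.251
"""

ratios = [ratio for _, ratio in pme_force_ratios(sample)]
print(f"{len(ratios)} DD reports, mean pme mesh/force = {mean(ratios):.3f}")
```

To use it on a real run, pass the file contents instead of the sample, e.g. `pme_force_ratios(open("md.log").read())`.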

> The result is a solid increase in performance on a small-ish system (20K
> atoms): 90 ns/day instead of 65-70. I don't use this box for anything
> except prototyping, but still the swap + tweaks were pretty useful.
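
The kind of short comparison run I keep recommending can be sketched roughly as follows: two single-rank, OpenMP-only mdrun command lines that differ only in where the PME FFTs run, plus helpers to read the ns/day figure from md.log and compare two results. The file names, step count and function names are assumptions for illustration; the thread count of 4 simply matches the 4-core Xeon mentioned in this thread.

```python
import re
import shlex


def mdrun_benchmark_commands(tpr="topol.tpr", threads=4, nsteps=10000):
    """Build two short single-rank runs: PME FFTs on the CPU vs. on the GPU.

    tpr, threads and nsteps are illustrative defaults; adjust to your setup.
    """
    commands = {}
    for fft in ("cpu", "gpu"):
        commands[fft] = [
            "gmx", "mdrun",
            "-s", tpr,                          # input run file
            "-deffnm", f"bench_pmefft_{fft}",   # separate output names per run
            "-ntmpi", "1",                      # one rank, i.e. OpenMP-only
            "-ntomp", str(threads),
            "-nb", "gpu",
            "-pme", "gpu",
            "-pmefft", fft,                     # the only setting that differs
            "-nsteps", str(nsteps),
            "-resethway",                       # time only the second half
        ]
    return commands


PERF_LINE = re.compile(r"^\s*Performance:\s+(\d+(?:\.\d+)?)", re.MULTILINE)


def ns_per_day(log_text):
    """Return the ns/day value from the 'Performance:' line, or None."""
    match = PERF_LINE.search(log_text)
    return float(match.group(1)) if match else None


def speedup_percent(new, old):
    """Relative gain of `new` over `old`, in percent."""
    return 100.0 * (new / old - 1.0)


if __name__ == "__main__":
    for fft, cmd in mdrun_benchmark_commands().items():
        print(f"# PME FFT on {fft}")
        print(shlex.join(cmd))
    # Figures quoted above: 90 ns/day now vs. 65-70 ns/day before.
    print(f"gain: {speedup_percent(90, 70):.0f}% to {speedup_percent(90, 65):.0f}%")
```

With the numbers quoted above, 90 ns/day against 65-70 ns/day works out to a gain of roughly 29% to 38%.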

>
> Alex
>
>
>
> On 6/15/2018 1:20 PM, Szilárd Páll wrote:
>
>> Hi,
>>
>> Regarding the K40 vs GTX 960 question, the K40 will likely be a bit
>> faster (though it'll consume more power, if that matters). The
>> difference will be at most 20% in total performance, I think -- and
>> with small systems likely negligible (as a smaller card with higher
>> clocks is more efficient at small tasks than a large card with lower
>> clocks).
>>
>> Regarding the load balance note, you are correct: "pme mesh/force"
>> is the ratio of the time spent computing PME forces on a separate
>> task/rank to the time spent on the rest of the forces (nonbonded, bonded,
>> etc.). With GPU offload this is a bit more tricky, as the observed time
>> is the time spent waiting for the GPU results, but the take-away is
>> the same: when a run shows "pme mesh/force" far from 1, there is
>> imbalance affecting performance.
>>
>> However, note that with a single GPU I've yet to see a case where you
>> get better performance by running multiple ranks rather than simply
>> running OpenMP-only. Also note that what counts as a "weak GPU" can vary
>> case-by-case, so I recommend taking the 1-2 minutes to do a short run
>> and check whether, for a given hardware + simulation setup, it is better
>> to offload all of PME or keep the FFTs on the CPU.
>>
>> We'll do our best to automate more of these choices, but for now if
>> you care about performance it's useful to test before doing long runs.
>>
>> Cheers,
>> --
>> Szilárd
>>
>>
>> On Thu, Jun 14, 2018 at 2:09 AM, Alex <nedomacho at gmail.com> wrote:
>>
>>> Question: in the DD output (md.log) that looks like "DD  step xxxxxx  pme
>>> mesh/force 1.229," what is the ratio? Does it mean the pme calculations
>>> take longer by the shown factor than the nonbonded interactions?
>>> With the GTX 960, the ratio is consistently ~0.85; with the Tesla K40 it's
>>> ~1.25. My mdrun line contains -pmefft cpu (per Szilard's advice for weak
>>> GPUs, I believe). Would it then make sense to offload the FFT to the K40?
>>>
>>> Thank you,
>>>
>>> Alex
>>>
>>> On Wed, Jun 13, 2018 at 4:53 PM, Alex <nedomacho at gmail.com> wrote:
>>>
>>>> So, swap, then? Thank you!
>>>>
>>>>
>>>>
>>>> On Wed, Jun 13, 2018 at 4:49 PM, paul buscemi <pbuscemi at q.com> wrote:
>>>>
>>>>> flops trumps clock speed...
>>>>>
>>>>> On Jun 13, 2018, at 3:45 PM, Alex <nedomacho at gmail.com> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have an old "prototyping" box with a 4-core Xeon and an old GTX 960. We
>>>>>> have a Tesla K40 lying around and there's only one PCIe slot available in
>>>>>> this machine. Would it make sense to swap the cards, or is it already
>>>>>> bottlenecked by the CPU? I compared the specs and the 960 has a higher
>>>>>> clock speed, while the K40's FP performance is better. Should I swap the
>>>>>> GPUs?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Alex
>>>>>> --
>>>>>> Gromacs Users mailing list
>>>>>>
>>>>>> * Please search the archive at
>>>>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>>>>>>
>>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>>>>
>>>>>> * For (un)subscribe requests visit
>>>>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>>>>>> send a mail to gmx-users-request at gromacs.org.