[gmx-users] GTX 960 vs Tesla K40

Fri Jun 15 21:20:37 CEST 2018

Hi,

Regarding the K40 vs GTX 960 question, the K40 will likely be a bit
faster (though it'll consume more power, if that matters). The
difference will be at most 20% in total performance, I think, and
with small systems it is likely negligible (as a smaller card with
higher clocks is more efficient at small tasks than a large card with
lower clocks).

Regarding the load balance note, you are correct: "pme mesh/force" is
the ratio of the time spent computing PME forces on a separate
task/rank to the time spent on the rest of the forces (including
nonbonded, bonded, etc.). With GPU offload this is a bit more tricky,
as the observed time is the time spent waiting for the GPU results,
but the take-away is the same: when a run shows "pme mesh/force" far
from 1, there is imbalance affecting performance.
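
To look at the ratio over a whole run rather than a single line, the
values can be pulled out of md.log with a few lines of Python. This is
only a minimal sketch: it assumes the log lines look like the
"DD  step xxxxxx  pme mesh/force 1.229" example quoted in this thread,
and the function names and the 0.1 tolerance are made up for
illustration, not anything GROMACS defines.

```python
import re
from statistics import mean

# Matches the number after "pme mesh/force", as in the md.log line quoted
# in this thread: "DD  step xxxxxx  pme mesh/force 1.229".
# Assumption: the real log uses this wording and a plain decimal number.
PME_RATIO = re.compile(r"pme mesh/force\s+([0-9]*\.?[0-9]+)")


def pme_force_ratios(log_text):
    """Return every 'pme mesh/force' value found in the text, in order."""
    return [float(m.group(1)) for m in PME_RATIO.finditer(log_text)]


def summarize(ratios, tolerance=0.1):
    """Average the ratios and name the slower side.

    'tolerance' is an arbitrary illustrative threshold, not a GROMACS setting.
    """
    avg = mean(ratios)
    if avg > 1 + tolerance:
        verdict = "PME mesh side takes longer"
    elif avg < 1 - tolerance:
        verdict = "force side takes longer"
    else:
        verdict = "roughly balanced"
    return avg, verdict


# Made-up sample lines in the quoted format, for demonstration only.
sample = (
    "DD  step 1000  pme mesh/force 1.229\n"
    "DD  step 2000  pme mesh/force 1.251\n"
)
avg, verdict = summarize(pme_force_ratios(sample))
print(f"mean pme mesh/force {avg:.3f}: {verdict}")
```

For a real run, read the contents of md.log into a string and pass that
to pme_force_ratios() instead of the sample text.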

However, note that with a single GPU I've yet to see a case where you
get better performance by running multiple ranks rather than simply
running OpenMP-only. Also note that what counts as a "weak GPU" can
vary case-by-case, so I recommend taking the 1-2 minutes to do a short
run and check, for a given hardware + simulation setup, whether it is
better to offload all of PME or keep the FFTs on the CPU.
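
Such a comparison could be scripted roughly as below. Treat it as a
sketch under assumptions: the file name topol.tpr, the step count, the
thread count and the bench_fft_* output names are placeholders, and the
parser assumes md.log ends with a "Performance:" line whose first
number is ns/day. The mdrun options used (-ntmpi, -ntomp, -nb, -pme,
-pmefft, -nsteps, -resethway, -noconfout, -deffnm) are the ones I would
expect in a 2018 release; check "gmx mdrun -h" on your installation.

```python
import re


def benchmark_commands(tpr="topol.tpr", nsteps=10000, omp_threads=4):
    """Build one short-run mdrun command per PME FFT placement.

    All default values are placeholders; adjust them to your own setup.
    """
    common = [
        "gmx", "mdrun", "-s", tpr,
        "-ntmpi", "1", "-ntomp", str(omp_threads),  # single rank, OpenMP-only
        "-nb", "gpu", "-pme", "gpu",                # offload nonbonded and PME
        "-nsteps", str(nsteps),                     # short run
        "-resethway", "-noconfout",                 # time the second half only
    ]
    return {
        where: common + ["-pmefft", where, "-deffnm", f"bench_fft_{where}"]
        for where in ("cpu", "gpu")
    }


# Assumption: the log contains a line such as "Performance:  45.678  0.525",
# where the first number is ns/day.
PERF = re.compile(r"^Performance:\s+([0-9.]+)", re.MULTILINE)


def ns_per_day(log_text):
    """Return the ns/day figure from md.log text, or None if it is absent."""
    match = PERF.search(log_text)
    return float(match.group(1)) if match else None


# Print the two command lines; run them by hand or via subprocess.run().
for where, cmd in benchmark_commands().items():
    print(where, ":", " ".join(cmd))
```

After both runs finish, pass the contents of bench_fft_cpu.log and
bench_fft_gpu.log to ns_per_day() and keep whichever setting is faster.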

We'll do our best to automate more of these choices, but for now if
you care about performance it's useful to test before doing long runs.

Cheers,
--
Szilárd

On Thu, Jun 14, 2018 at 2:09 AM, Alex <nedomacho at gmail.com> wrote:
> Question: in the DD output (md.log) that looks like "DD  step xxxxxx  pme
> mesh/force 1.229," what is the ratio? Does it mean the pme calculations
> take longer by the shown factor than the nonbonded interactions?
> With GTX 960, the ratio is consistently ~0.85, with Tesla K40 it's ~1.25.
> My mdrun line contains  -pmefft cpu (per Szilard's advice for weak GPUs, I
> believe). Would it then make sense to offload the fft to the K40?
>
> Thank you,
>
> Alex
>
> On Wed, Jun 13, 2018 at 4:53 PM, Alex <nedomacho at gmail.com> wrote:
>
>> So, swap, then? Thank you!
>>
>>
>>
>> On Wed, Jun 13, 2018 at 4:49 PM, paul buscemi <pbuscemi at q.com> wrote:
>>
>>>  flops trumps clock speed…..
>>>
>>> > On Jun 13, 2018, at 3:45 PM, Alex <nedomacho at gmail.com> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I have an old "prototyping" box with a 4-core Xeon and an old GTX 960. We
>>> > have a Tesla K40 laying around and there's only one PCIE slot available in
>>> > this machine. Would it make sense to swap the cards, or is it already
>>> > bottlenecked by the CPU? I compared the specs and 960 has a higher clock
>>> > speed, while K40's FP performance is better. Should I swap the GPUs?
>>> >
>>> > Thanks,
>>> >
>>> > Alex
>>> > --
>>> > Gromacs Users mailing list
>>> >
>>> > * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>>> >
>>> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>> >
>>> > * For (un)subscribe requests visit
>>> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.
>>>
>>
>>
>>