[gmx-users] Specs for GPU box

Szilárd Páll pall.szilard at gmail.com
Fri Jan 27 16:14:48 CET 2017


On Thu, Jan 19, 2017 at 4:25 AM, Alex <nedomacho at gmail.com> wrote:
>
>> Hmmm, does that mean I should stop typing, press "discard mail" and move
>> on?
>
> Nothing sinister, please do not move on. A donation would imply an official
> endorsement by the US government, which would bring litigation against the
> two of us. : )

I'm not sure why that would be the case, but let's not get off-topic.

>>
>> I see your point, but note that complexity is often not only in the
>> interconnect. Node sharing (among jobs/users) will bring up many of
>> the management/scheduling issues that arise with (simpler) multi-node
>> setups. Secondly, depending on the planned job concurrency, you may
>> need to care little about the (hw/sw) hurdles of scaling across
>> machines.
>>
>> The important question is: what are the job distribution, size,
>> resource, and time-to-completion requirements? E.g., if you'll always
>> have a dozen small runs in the pipeline, you might as well optimize
>> for job throughput rather than latency (e.g. always run 1-2 jobs/GPU).
>
> This part is simple, possibly trivial. 3-4 users from the same building
> who are in the same group, and I'm fairly confident that it would stay
> that way in the long run.
> If we went with two E5-based machines (one with GPU, one without), one
> would be used exclusively for DFT and one for MD. In either case, we'd
> run one or two jobs at a time; that's about it.

Sounds like a reasonably simple setup (assuming that job lengths
are not too short).

>>
>> Something like 2x E5-269x + 2-4x TITAN X-P; what's best will depend on
>> the kind of simulations you want to run, so that question really is
>> best addressed first! Small simulation systems won't scale across GPUs
>> anyway, so you might want more of the less power-hungry GPUs. If you
>> want (and can) run across the entire node, I'd go for fewer of the
>> fastest GPUs (e.g. 2-3 TITAN X-P).
>>
>> Note that GROMACS uses offload-based acceleration, so hardware balance
>> is important! Have you checked out doi:10.1002/jcc.24030?
>
> Well, in addition to what I said about the simulations above, I still want
> to go for a quad-socket setup with 2-4 GPUs so that overall we have more
> cores than needed to achieve CPU-GPU balance and could run CPU-only analysis
> scripts and/or some light prototyping and such on the available cores
> without disturbing heavy runs.

However, let me emphasize: make sure you plan node sharing ahead of
time. GROMACS is very sensitive to core/processor affinity! In
particular, make sure concurrent tasks don't interfere with mdrun
(i.e. pin each task to a distinct set of cores). It's really easy to
ruin performance by hammering just a single core that happens to be
assigned to mdrun.
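
As a minimal sketch (the core counts, offsets, and GPU ids below are
just placeholders to be adjusted to your actual hardware and HT
layout), on a 20-core, 2-GPU node you could keep mdrun on the first
16 cores and run analysis only on the remaining ones:

    # mdrun: 2 thread-MPI ranks x 8 OpenMP threads, one GPU per rank,
    # pinned to the first 16 cores
    gmx mdrun -ntmpi 2 -ntomp 8 -pin on -pinoffset 0 -pinstride 1 -gpu_id 01

    # analysis/prototyping (analysis_script.py is just a stand-in)
    # restricted to the remaining cores 16-19
    taskset -c 16-19 python analysis_script.py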

Also note that a 4-8 core, ~3.5-4 GHz workstation can be a lot faster
at (CPU-bound) analysis than high core-count Xeon CPUs.

> Yes, we can and very likely will run
> simulations across the entire node, if that is needed to achieve decent
> balance.

I was not referring to balance. Scaling is not a free lunch! For
strong scaling it is in general better to have fewer but faster GPUs
(and cores). For throughput runs (>=2 simulations/GPU) you will most
often get more _aggregate_ perf/node with a larger number of more
power-efficient GPUs (and cores).
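
As a rough sketch of such a throughput setup (again, all core counts,
offsets, and GPU ids are placeholders), four independent runs on a
20-core, 2-GPU node, two per GPU and each pinned to its own block of
five cores, could be launched like this:

    (cd run1 && gmx mdrun -ntmpi 1 -ntomp 5 -pin on -pinoffset 0  -gpu_id 0) &
    (cd run2 && gmx mdrun -ntmpi 1 -ntomp 5 -pin on -pinoffset 5  -gpu_id 0) &
    (cd run3 && gmx mdrun -ntmpi 1 -ntomp 5 -pin on -pinoffset 10 -gpu_id 1) &
    (cd run4 && gmx mdrun -ntmpi 1 -ntomp 5 -pin on -pinoffset 15 -gpu_id 1) &

Whether two runs per GPU actually beat one big run per node depends on
your systems, so it is worth benchmarking both before settling on a
hardware balance.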

> As long as these GPUs fit into rack-mountable boxes, we'll be good
> to go.

Consumer cards in servers are certainly common; there are companies
selling such boxes. However, the TITAN X-P is a tricky one. As far as
I know, NVIDIA still restricts its sale through the usual resellers
and offers it only through their own website, paid by credit card
(sadly with a limit of 2 per person).

>>
>>
>> Both Pascal Teslas are slower than the consumer TITAN X-P for the
>> compute-bound GROMACS kernels (and in fact for many if not most SP
>> compute-bound codes). The TITAN X-P is slower on paper than the P40,
>> but it has a crazy sustained clock rate (~1850 MHz).
>>
> I see now, wouldn't have expected that at all, so thanks a bunch. The prices
> of these Titan X-P cards make me very happy.

I guess the silly restrictions plus the US government procurement
requirements might now have made you somewhat less happy.
However, FYI, the 1080 Ti is expected to be released soon, and I can
only hope the TITAN X-P restrictions won't apply to it.

--
Szilárd

> Alex
>
>>
>>>> Sounds good. We'll definitely be interested if you end up with
>>>> something really dense and packing a good punch.
>>>>
>>> I will be happy to share the results, once I have something to share.
>>> Especially if it's something good!
>>>
>>> Thanks,
>>>
>>> Alex
>>>
>>>
>>>>> Thank you!
>>>>>
>>>>> Alex
>>>>>
>>>>>
>>>>>
>>>>> On 1/6/2017 6:05 AM, Szilárd Páll wrote:
>>>>>>
>>>>>> Hi Alex,
>>>>>>
>>>>>> Benchmarks of quad-socket Intel machines are rare because AFAIK such
>>>>>> systems are mighty expensive and you will not get good bang for buck
>>>>>> with them, especially if you combine these pricey nodes/CPUs with the
>>>>>> old and slow K80s.
>>>>>>
>>>>>> The only reason to get E7 is if >=4 sockets or >1.5 TB memory per
>>>>>> node is a must. Furthermore, the only reason to buy K80s today (for
>>>>>> GROMACS) is if they are dirt-cheap (e.g. free :).
>>>>>>
>>>>>> You'll be much better off with:
>>>>>> - 1-2-socket Broadwell nodes
>>>>>> - P100 if you need a Tesla, otherwise GeForce 1070/1080
>>>>>>
>>>>>> However, more importantly, what kind of simulations do you want to
>>>>>> run? For 50K you might be able to get multiple nodes with optimal
>>>>>> price/perf.
>>>>>>
>>>>>> Cheers,
>>>>>> --
>>>>>> Szilárd
>>>>>>
>>>>>> On Tue, Dec 27, 2016 at 9:24 PM, Alex <nedomacho at gmail.com> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We've got some dedicated funding (~50K) for a computing box. GMX will
>>>>>>> be one of the applications used there (the other MD package would be
>>>>>>> LAMMPS, which has similar requirements). Other applications would be
>>>>>>> ab initio and DFT packages, so, aside from a ton of RAM and possibly
>>>>>>> a fast SSD for scratch, there aren't too many requirements.
>>>>>>>
>>>>>>> My question is about an "optimal" CPU-GPU combination. Initially, we
>>>>>>> wanted something like a quad-Xeon (something relatively senior in the
>>>>>>> E7 family, ~48-64 cores total) with two K80 cards, but I can't find
>>>>>>> anything like this in your benchmark documents.
>>>>>>>
>>>>>>> Can someone help spec this thing?
>>>>>>>
>>>>>>> Thanks a lot,
>>>>>>>
>>>>>>> Alex