[gmx-users] new i9 processor
b.reuter at uni-kassel.de
Tue Oct 17 12:04:13 CEST 2017
I would assume you are right with your argument that 6/8/10 cores per
GPU are better than, e.g., 5 cores. If I were you, I would still test
carefully whether you really get a big performance gain with a 16-core
i9 and two GPUs versus one. There are two reasons for this: 1. The
1080 Ti is significantly stronger than the older cards, so one could
assume that even a 16-core processor isn't fast enough to keep up with
two cards (a 12-core most likely won't). 2. There is some difference
between two Xeons, each with its own PCIe lanes feeding its own card,
and one i9 dealing with two cards. In principle it should be fine,
since the i9 has 44 PCIe lanes and two cards will use 2x16 lanes, but
one surely has to test this carefully. The motherboard might also have
some influence here.
Anyway, I would be happy if you could keep me up to date on how it worked out.
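If it helps with the testing: the one-vs-two-GPU comparison could be done with two short runs, roughly as sketched below. This is only a sketch assuming a GROMACS build with GPU support and a prepared run input file (topol.tpr is a placeholder name); the flags are standard mdrun options, but please adapt them to your version and system size.

```shell
# Benchmark sketch: one GPU vs. two GPUs on a 16-core machine.
# -resethway discards timings from the first half of the run so that
# startup and load balancing do not skew the numbers.

# One GPU, one thread-MPI rank using all 16 cores:
gmx mdrun -s topol.tpr -deffnm one_gpu -ntmpi 1 -ntomp 16 -gpu_id 0 -nsteps 20000 -resethway

# Two GPUs, two thread-MPI ranks with 8 cores each (2-way domain decomposition):
gmx mdrun -s topol.tpr -deffnm two_gpu -ntmpi 2 -ntomp 8 -gpu_id 01 -nsteps 20000 -resethway

# Compare the ns/day figures at the end of the logs:
grep Performance one_gpu.log two_gpu.log
```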
On 17/10/17 at 11:31, Harry Mark Greenblatt wrote:
> Dear Bernhard and Szilárd,
> Since your replies to me are related, I'll combine my replies into one letter.
> You are right for the dual-Xeon 16-core setup you (and we) use, but for the single-socket i9 setup our experience was that we gained only 5-10% in performance by adding a second GPU. This is what I was hinting at here…
> Bernhard, just so I understand the test you performed: you took the 10-core i9 machine and added a second 1080 Ti card? So now, instead of 10 cores paired with one GPU, the simulation ran using domain decomposition with two ranks, each rank running on 5 cores paired with its own GPU, correct? I would guess that those runs had a large performance imbalance, with a large amount of time spent waiting for the 5 cores to finish their part of the calculation. To take advantage of the second GPU, one would need another ~10 cores, as Szilárd mentioned in his email, which can't be done on a Core motherboard.
> I hope I did not say exactly that, because in general it is not true:
> both because we're accelerating (for now) a single task and because
> offload to multiple GPUs requires domain decomposition.
> The switch from no DD (only multi-threading) to DD has an initial
> cost, so going from 1 to 2 GPUs you will therefore often see fairly
> moderate scaling. From e.g. 2- to 4-way DD you can get near-linear
> scaling, but it will depend on the system size.
> Scaling to multiple GPUs, especially if by that you mean keeping the
> same number of cores while increasing the GPU count, will certainly
> not give linear scaling in performance (time-to-solution), as only
> part of the computation is reduced, if that scales at all.
> I guess my use of the word “multiple” was not ideal; I am not really thinking beyond two GPUs. I was discussing going from 6/8/10 cores paired with one GPU to 12/16/20 cores paired with two GPUs, using 2 MPI ranks, so that you still maintain the ratio of 6/8/10 cores per GPU. In that situation we have observed close to double the performance, and that situation was (I believe) what you commented on in a past email.
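For what it's worth, a launch along the lines you describe (2 thread-MPI ranks, each paired with its own GPU, keeping the cores-per-GPU ratio) might look like the sketch below; the input filename and the 10-cores-per-rank figure are placeholders, not a recommendation:

```shell
# Sketch of a 2-rank run on a 20-core machine with two GPUs, keeping
# 10 cores per GPU: two thread-MPI ranks (2-way domain decomposition),
# 10 OpenMP threads per rank, rank 0 on GPU 0 and rank 1 on GPU 1,
# with threads pinned to cores. topol.tpr is a placeholder name.
gmx mdrun -s topol.tpr -ntmpi 2 -ntomp 10 -gpu_id 01 -pin on
```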
> Threadripper is also an option; if anybody has successfully installed Linux and run GROMACS on that architecture, I would be happy to hear about it.
> So generally: you'll need a well-balanced ratio of CPU cores to
> GPUs, and increasing only the GPU count will help only if the run is
> GPU-bound (and even then, only by a limited amount).
> Harry M. Greenblatt
> Associate Staff Scientist
> Dept of Structural Biology harry.greenblatt at weizmann.ac.il
> Weizmann Institute of Science Phone: 972-8-934-6340
> 234 Herzl St. Facsimile: 972-8-934-3361
> Rehovot, 7610001
Dipl.-Phys. Bernhard Reuter
Theoretical Biophysics Subgroup
Theoretical Physics II
Institute of Physics
University of Kassel
34132 Kassel, Germany
Tel: +49 561 804 4482
Mobile: +40 152 27331699