[gmx-users] Cluster recommendations

Mon Feb 2 14:12:12 CET 2015

Hi David,

On 22 Jan 2015, at 18:01, David McGiven <davidmcgivenn at gmail.com> wrote:

> Hey Carsten,
> 
> Just another question. What do you think will be the performance difference
> between gromacs runs with a ~100k atom system, like the one I mentioned
> in my first email, on the following machines:
> 
> - 1 server with 4 AMD processors, 16 cores each (64 cores) with no GPU
> - 1 server with 4 AMD processors, 16 cores each (64 cores) with one GTX 980
> GPU
> - 1 server with 2 Intel processors, 10 cores each (20 cores) like the ones
> you mentioned, with one or two GTX 980 GPUs.
> 
> I'm not interested in exact performance numbers, I just need to understand
> the logic behind the CPU/GPU combinations in order to make an
> intelligent cluster purchase.
I never benchmarked 64-core AMD nodes with GPUs. With an 80 k atom test
system using a 2 fs time step I get
ns/d   CPU (cores used)             GPUs
24     AMD 6272 (64 cores)          none
16     AMD 6380 (32 cores)          none
36     AMD 6380 (32 cores)          1x GTX 980
40     AMD 6380 (32 cores)          2x GTX 980
27     Intel 2680v2 (20 cores)      none
52     Intel 2680v2 (20 cores)      1x GTX 980
62     Intel 2680v2 (20 cores)      2x GTX 980

So unless you can get the AMD nodes very cheaply, the 20-core Intel nodes
with 1 or 2 GPUs will probably give you both the best performance and the
best performance-to-price ratio.
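
To make the GPU effect in these benchmark figures explicit, here is a minimal Python sketch that divides each GPU result by the CPU-only result of the same node. Only the ns/d values come from this post; the dictionary layout and the function name gpu_speedup are illustrative assumptions. Prices are not given here, so the sketch does not attempt a performance-to-price figure.

```python
# Benchmark figures quoted in this post (ns/day, 80 k atom system, 2 fs step).
# Keys are (node description, number of GTX 980 GPUs).
ns_per_day = {
    ("32 AMD cores 6380", 0): 16,
    ("32 AMD cores 6380", 1): 36,
    ("32 AMD cores 6380", 2): 40,
    ("20 Intel cores 2680v2", 0): 27,
    ("20 Intel cores 2680v2", 1): 52,
    ("20 Intel cores 2680v2", 2): 62,
}


def gpu_speedup(node, gpus):
    """Throughput with `gpus` GPUs relative to the same node without GPUs."""
    return ns_per_day[(node, gpus)] / ns_per_day[(node, 0)]


for node in ("32 AMD cores 6380", "20 Intel cores 2680v2"):
    for gpus in (1, 2):
        print(f"{node} + {gpus}x GTX 980: {gpu_speedup(node, gpus):.2f}x")
```

On these numbers the first GPU roughly doubles the throughput of either node type, while the second GPU adds only 4 ns/d on the AMD node and 10 ns/d on the Intel node.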

Best,
  Carsten

> 
> Thanks again.
> 
> Best,
> D
> 
> 
> 2015-01-16 14:46 GMT+01:00 Carsten Kutzner <ckutzne at gwdg.de>:
> 
>> Hi David,
>> 
>> On 16 Jan 2015, at 12:28, David McGiven <davidmcgivenn at gmail.com> wrote:
>> 
>>> Hi Carsten,
>>> 
>>> Thanks for your answer.
>>> 
>>> 2015-01-16 11:11 GMT+01:00 Carsten Kutzner <ckutzne at gwdg.de>:
>>> 
>>>> Hi David,
>>>> 
>>>> we are just finishing an evaluation to find out which is the optimal
>>>> hardware for Gromacs setups. One of the input systems is an 80,000 atom
>>>> membrane channel system and thus nearly exactly what you want
>>>> to compute.
>>>> 
>>>> You will get the biggest benefit by adding one or two consumer-class
>>>> GPUs to your nodes (e.g. NVIDIA GTX 980). That will typically double
>>>> your performance-to-price ratio. This is true for Intel as well as for
>>>> AMD nodes; however, the best ratio in our tests was observed with
>>>> 10-core Intel CPUs (2670v2, 2680v2) in combination with a GTX 780Ti
>>>> or 980, ideally two of those CPUs with two GPUs on a node.
>>>> 
>>>> 
>>> Was there a difference between 2670v2 (2.5 GHz) and 2680v2 (2.8 GHz)?
>>> I'm wondering if those 0.3 GHz are significant. Or the 0.5 GHz compared
>>> to the 2690v2, for that matter. There’s a significant difference in
>>> price indeed.
>> Usually the percent improvement for Gromacs performance is not as much
>> as the percent improvement in clock speed, so the cheaper ones will
>> give you a higher performance-to-price ratio.
>> 
>>> 
>>> I'm also wondering if the performance would be better with 16-core Intel
>>> CPUs instead of 10-core ones, e.g. the E5-2698 v3.
>> Didn’t test those.
>> 
>>> 
>>> I would like to know which other tests you have done. What about AMD?
>> We tested AMD 6380 with 1-2 GTX 980 GPUs, which gives about the same
>> performance-to-price ratio as a 10 core Intel 2680v2 node with one GTX 980.
>> The Intel node gives you a higher per-node performance, though.
>> 
>>> 
>>>> Unless you want to buy expensive FDR14 Infiniband, scaling across two
>>>> or more of those nodes won’t be good (~0.65 parallel efficiency across
>>>> 2, ~0.45 across 4 nodes using QDR Infiniband), so I would advise
>>>> against it and go for more sampling on single nodes.
>>>> 
>>>> 
>>> Well, that puzzles me. Why is it that you get poor performance ? Are you
>>> talking about pure CPU jobs over infiniband, or are you talking about
>>> CPU+GPU jobs over infiniband ?
>> For a given network (e.g. QDR Infiniband), the scaling is better the lower
>> the performance of the individual nodes. So for CPU-only nodes you
>> will get a better scaling than for CPU+GPU nodes, which have a way higher
>> per-node performance.
>> 
>>> How come you won’t get good performance if a great percentage of
>> The performance is good, it is just that the parallel efficiency is
>> not optimal for an MD system <100,000 atoms, meaning that one run on two
>> nodes in parallel is less than twice as fast as a single-node run, i.e.
>> it delivers less than the aggregated performance of two individual runs.
>> Bigger systems will have a better parallel efficiency.
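
As a rough illustration of what these efficiencies mean for throughput, the following Python sketch treats parallel efficiency as the performance of one multi-node run divided by the aggregated performance of the same number of independent single-node runs. The efficiencies 0.65 and 0.45 are the QDR Infiniband figures quoted in this thread; the function names and the single-node performance of 1.0 (arbitrary units) are assumptions made for the example.

```python
def parallel_efficiency(perf_parallel, perf_single, n_nodes):
    """One run on n_nodes, relative to n_nodes independent single-node runs."""
    return perf_parallel / (n_nodes * perf_single)


def parallel_performance(perf_single, n_nodes, efficiency):
    """Performance of one run spread over n_nodes at a given efficiency."""
    return efficiency * n_nodes * perf_single


# Parallel efficiencies quoted in this thread for QDR Infiniband.
quoted = {2: 0.65, 4: 0.45}

single = 1.0  # single-node performance, arbitrary units (assumed value)
for n_nodes, eff in quoted.items():
    one_run = parallel_performance(single, n_nodes, eff)
    many_runs = n_nodes * single
    print(f"{n_nodes} nodes: one parallel run {one_run:.2f}, "
          f"{n_nodes} independent runs {many_runs:.2f} in total")
```

With these figures a single run on 2 nodes is about 1.3 times as fast as on one node, and on 4 nodes about 1.8 times, whereas independent runs on the same nodes produce 2 and 4 times the single-node trajectory per day.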
>> 
>>> supercomputer centers in the world use InfiniBand ? And I'm sure lots of
>>> users here in the list use gromacs over Infiniband.
>> I do, too :)
>> But you get more trajectory for your money if you can wait and run on
>> a single node.
>> 
>> Carsten
>> 
>>> 
>>> Thanks again.
>>> 
>>> Best Regards,
>>> D
>>> 
>>> 
>>>> Best,
>>>> Carsten
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 15 Jan 2015, at 17:35, David McGiven <davidmcgivenn at gmail.com> wrote:
>>>> 
>>>>> Dear Gromacs Users,
>>>>> 
>>>>> We’ve got some funding to build a new cluster. It’s going to be used
>>>>> mainly for gromacs simulations (80% of the time). We run molecular
>>>>> dynamics simulations of transmembrane proteins inside a POPC lipid
>>>>> bilayer. In a typical system we have ~100000 atoms, of which almost
>>>>> 1/3 correspond to water molecules. We employ the usual conditions,
>>>>> with PME for electrostatics and cutoffs for LJ interactions.
>>>>> 
>>>>> I would like to hear your advice on which kind of machines give the
>>>>> best bang for the buck for this kind of simulation. For instance:
>>>>> 
>>>>> - Intel or AMD? My understanding is that Intel is faster but more
>>>>> expensive, and AMD is slower but cheaper, so in the end you get almost
>>>>> the same performance per buck. Right?
>>>>> 
>>>>> - Many CPUs/cores per machine, or fewer? My understanding is that the
>>>>> more cores per machine, the lower the costs. One machine is always
>>>>> cheaper to buy and maintain than several. Plus, maybe you can save the
>>>>> cost of Infiniband if you use high core densities?
>>>>> 
>>>>> - Should we invest in an Infiniband network to run jobs across multiple
>>>>> nodes? Will the kind of simulations we run benefit from multiple nodes?
>>>>> 
>>>>> - Would we benefit from adding GPUs to the cluster? If so, which ones?
>>>>> 
>>>>> We now have a cluster with 48 and 64 AMD Opteron cores per machine (4
>>>>> processors per machine) and we run our gromacs simulations there. We
>>>>> don’t use MPI because our jobs are mostly run on a single node, i.e.
>>>>> with 48 or 64 cores per simulation on a single machine. So far, we’re
>>>>> quite satisfied with the performance we get.
>>>>> 
>>>>> Any advice will be greatly appreciated.
>>>>> 
>>>>> 
>>>>> Best Regards,
>>>>> D.
>>>>> --
>>>>> Gromacs Users mailing list
>>>>> 
>>>>> * Please search the archive at
>>>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>>>>> posting!
>>>>> 
>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>>>> 
>>>>> * For (un)subscribe requests visit
>>>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>>>>> send a mail to gmx-users-request at gromacs.org.
>>>> 
>>>> 
>>>> --
>>>> Dr. Carsten Kutzner
>>>> Max Planck Institute for Biophysical Chemistry
>>>> Theoretical and Computational Biophysics
>>>> Am Fassberg 11, 37077 Goettingen, Germany
>>>> Tel. +49-551-2012313, Fax: +49-551-2012302
>>>> http://www.mpibpc.mpg.de/grubmueller/kutzner
>>>> http://www.mpibpc.mpg.de/grubmueller/sppexa
>>>> 

--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa