[gmx-users] Worse GROMACS performance with better specs?

Jason Loo Siau Ee JasonSiauEe.Loo at taylors.edu.my
Fri Jan 12 03:20:51 CET 2018


Dear Carsten,

Looks like we're seeing the same thing here, but only when using gcc 4.5.3:

Original performance (gcc 5.3.1, AVX512, no hwloc support): 49 ns/day

With hwloc support:
gcc 4.5.3, AVX2_256 = 67 ns/day
gcc 4.5.3, AVX512 = can't compile
gcc 5.3.1, AVX2_256 = 36 ns/day
gcc 5.3.1, AVX512 = 60 ns/day
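
For reference, the variants above only differ in the compiler and SIMD level
passed at configure time. A sketch of how one of them was configured (the
compiler paths here are assumptions, not my exact setup):

  # gcc 5.3.1 + AVX2_256 + hwloc variant; other options as in my original build
  cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON \
           -DGMX_GPU=ON -DGMX_HWLOC=ON -DGMX_SIMD=AVX2_256 \
           -DCMAKE_C_COMPILER=/usr/bin/gcc-5 -DCMAKE_CXX_COMPILER=/usr/bin/g++-5 \
           -DCMAKE_INSTALL_PREFIX=/opt/gromacs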

At the very least it's now comparable to my previous workstation. I'll have to try GROMACS 2018 next.

Cheers for the info.

Jason

------------------------------

Message: 2
Date: Wed, 10 Jan 2018 08:39:22 +0000
From: "Kutzner, Carsten" <ckutzne at gwdg.de>
To: "<gmx-users at gromacs.org> GROMACS users" <gmx-users at gromacs.org>
Subject: Re: [gmx-users] Worse GROMACS performance with better specs?
Message-ID: <816DEA54-1E01-42AC-A78A-3680FDEF097E at gwdg.de>
Content-Type: text/plain; charset="us-ascii"

Dear Jason,

1.)
We have observed similar behavior comparing an Intel Xeon Silver 4114 against an
E5-2630v4 processor in a server with one GTX 1080Ti. Both CPUs have 10 cores and
run at 2.2 GHz. Using our standard benchmark systems (see https://arxiv.org/abs/1507.00898)
we got 74.4 ns/day on the E5 but only 65.7 ns/day on the Silver machine
for the MEM system (80k atoms), and 4.4 vs. 4.2 ns/day for the RIB system (2M atoms).

Although the 4114 supports AVX512 SIMD instructions, compiling with AVX2_256
yielded higher performance (the numbers above already reflect this, so all
benchmarks were run with the exact same mdrun executable).
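
If you want to test this on your own machine, the SIMD level can be forced at
configure time; a minimal sketch (all other options as in your usual build):

  # build AVX2_256 kernels even though the CPU also supports AVX512
  cmake .. -DGMX_SIMD=AVX2_256
  make -j $(nproc) && make install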

2.)
We recently bought a batch of workstations with 2x Gold 6146 CPUs and 2x GTX 1080Ti
GPUs and were very surprised to find that initial GROMACS 2016 performance differed
a lot between these identical machines (some were up to 40% slower).

This was the reported difference in the md.log files:
> - mdrun logs show Hardware Topology: Basic versus Hardware Topology: Only logical processor count


For some reason, the logical processor count was reported in a mixed-up way
on the slower machines, so mdrun was unable to correctly determine the
hardware topology. We solved this by building GROMACS with hwloc support;
after that, performance was equal across all 6146 nodes.
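
To reproduce this, configure with hwloc detection enabled (this assumes the
hwloc development package, e.g. libhwloc-dev or hwloc-devel, is installed)
and verify in the log which detection level mdrun used:

  # build with full hardware topology detection
  cmake .. -DGMX_HWLOC=ON
  # after a run, the log should say more than "Only logical processor count":
  grep -i "hardware topology" md.log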

Best,
  Carsten


> On 10. Jan 2018, at 07:45, Jason Loo Siau Ee <JasonSiauEe.Loo at taylors.edu.my> wrote:
> 
> Dear gmx users,
> 
> 
> 
> I recently purchased a second GPU workstation and tried compiling GROMACS on it, but despite the better (and more expensive) specs the performance is significantly worse on my test system. To test things I standardized on one version (2016.4). Some details below:
> 
> 
> 
> Workstation 1:
> 
> 2 x Intel Xeon E5-2680v4 (14 cores), 2 x GTX 1080
> 
> gmx mdrun -ntmpi 8 -ntomp 7 -gpu_id 00001111
> 
> Performance: 61 ns/day
> 
> 
> 
> Workstation 2:
> 
> 2 x Intel Xeon Gold 6126 (12 cores), 2 x GTX 1080Ti
> 
> gmx mdrun -ntmpi 8 -ntomp 6 -gpu_id 00001111
> 
> Performance: 49 ns/day
> 
> 
> 
> I'm guessing it's an issue during compilation but I can't figure it out. I wouldn't claim to have any knowledge about how GROMACS interacts with the hardware, so some observations below (not sure which are actually relevant):
> 
> 
> 
> - Compilation command for both: cmake .. -DGMX_BUILD_OWN_FFTW=ON  -DREGRESSIONTEST_DOWNLOAD=ON  -DGMX_GPU=ON  -DCMAKE_INSTALL_PREFIX=/opt/gromacs
> 
> 
> 
> - When compiling on Workstation 2 I originally got a CMake error "Cannot find AVX512F compiler flag". I updated gcc to 5.3.1 to solve this.
> 
> 
> 
> - Some regression tests fail for Workstation 2 after compilation: 4 - FFTUnitTests (SEGFAULT), 16 - CorrelationTest
> 
> 
> 
> - mdrun logs show Hardware Topology: Basic versus Hardware Topology: Only logical processor count
> 
> 
> 
> - Running CPU-only (export CUDA_VISIBLE_DEVICES="") I get 21 ns/day versus 23 ns/day, so the CPUs in Workstation 2 are definitely faster.
> 
> 
> 
> - After upgrading both to 2018-rc1 (using cmake3), I get regression test failures on Workstation 1 (9 - GPUUtilsUnitTest) and Workstation 2 (8 - FFTUnitTests, 9 - GPUUtilsUnitTest, 26 - CorrelationTest). Performance is 66.5 ns/day versus 51.95 ns/day. The GPU load looks similar to previous versions (~70% for Workstation 1 and 50-60% for Workstation 2). I actually got the best performance with Workstation 1 running 2016.1 (69 ns/day).
> 
> 
> 
> Any help on how I can optimize performance on Workstation 2 would be appreciated. If there are certain files that would be helpful let me know and I'll send a link.
> 
> 
> 
> Cheers,
> Jason
> 
> 
> 
> 
> 
> 
> 
> Jason Loo
> 
> PhD, MPharm, RPh
> 
> Lecturer
> 
> School of Pharmacy
> 
> Faculty of Health and Medical Sciences
> 
> Taylor's University
> 
> 
> 



--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa


