[gmx-users] Worse GROMACS performance with better specs?
Szilárd Páll
pall.szilard at gmail.com
Tue Feb 20 17:42:51 CET 2018
On Fri, Jan 12, 2018 at 2:35 AM, Jason Loo Siau Ee
<JasonSiauEe.Loo at taylors.edu.my> wrote:
> Dear Carsten,
>
> Looks like we're seeing the same thing here, but only when using gcc 4.5.3:
>
> Original performance (gcc 5.3.1, AVX512, no hwloc support): 49 ns/day
>
> With hwloc support:
> gcc 4.5.3, AVX2_256 = 67 ns/day
> gcc 4.5.3, AVX512 = can't compile
> gcc 5.3.1, AVX2_256 = 36 ns/day
> gcc 5.3.1, AVX512 = 60 ns/day
>
> At the very least it's now comparable to my previous workstation. Will have to try with GROMACS 2018 next.
Have you tried? Any feedback?
I'd also recommend using gcc >=6.
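For reference, a build against a newer toolchain can be configured along these lines; the gcc-6 paths and the install prefix below are just placeholders for wherever your compiler and installation live:

  cmake .. -DCMAKE_C_COMPILER=/usr/bin/gcc-6 \
           -DCMAKE_CXX_COMPILER=/usr/bin/g++-6 \
           -DGMX_GPU=ON -DGMX_BUILD_OWN_FFTW=ON \
           -DCMAKE_INSTALL_PREFIX=/opt/gromacs-2018

gmx --version then shows the compiler and SIMD instruction set the binary was actually built with, which is a quick way to check the configuration.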
>
> Cheers for the info.
>
> Jason
>
> ------------------------------
>
> Date: Wed, 10 Jan 2018 08:39:22 +0000
> From: "Kutzner, Carsten" <ckutzne at gwdg.de>
> To: "<gmx-users at gromacs.org> GROMACS users" <gmx-users at gromacs.org>
> Subject: Re: [gmx-users] Worse GROMACS performance with better specs?
>
> Dear Jason,
>
> 1.)
> we have observed a similar behavior comparing Intel Silver 4114 against E5-2630v4
> processors in a server with one GTX 1080Ti. Both CPUs have 10 cores and run at
> 2.2 GHz. Using our standard benchmark systems (see https://arxiv.org/abs/1507.00898)
> we were able to get 74.4 ns/day on the E5 but only 65.7 ns/day on the Silver machine
> for the MEM (80k atoms), and 4.4 vs. 4.2 ns/day for the RIB (2M atoms) system.
>
> Although the 4114 supports AVX512 SIMD instructions, compiling with AVX2_256
> yielded higher performance (this is already included in the above numbers, so
> all benchmarks were run with the exact same mdrun executable).
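>
> The SIMD level for such a comparison can be fixed explicitly at configure time via the GMX_SIMD CMake option, e.g. (the install prefix here is just an example):
>
>   cmake .. -DGMX_SIMD=AVX2_256 -DGMX_GPU=ON -DGMX_BUILD_OWN_FFTW=ON \
>            -DCMAKE_INSTALL_PREFIX=/opt/gromacs-avx2_256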
>
> 2.)
> We recently bought a bunch of workstations with 2x Gold 6146 CPUs and 2x GTX 1080Ti GPUs
> and were very surprised to find that initial GROMACS 2016 performance differed a
> lot between these identical machines (some were up to 40% slower).
>
> This was the reported difference in the md.log files:
>> - mdrun logs show Hardware Topology: Basic versus Hardware Topology: Only logical processor count
>
>
> For some reason, the logical processor count was reported in a mixed-up way
> on the slower machines, so that mdrun was unable to correctly determine the
> hardware topology. We solved this by building GROMACS with hwloc support;
> performance was then equal across all 6146 nodes.
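>
> A sketch of such a build, assuming the hwloc library and headers are installed on the system (e.g. from a libhwloc-dev / hwloc-devel package; the install prefix is just an example):
>
>   cmake .. -DGMX_HWLOC=ON -DGMX_GPU=ON -DGMX_BUILD_OWN_FFTW=ON \
>            -DCMAKE_INSTALL_PREFIX=/opt/gromacs-hwloc
>
> The topology that hwloc detects can also be inspected directly with its lstopo/hwloc-ls tools.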
>
> Best,
> Carsten
>
>
>> On 10. Jan 2018, at 07:45, Jason Loo Siau Ee <JasonSiauEe.Loo at taylors.edu.my> wrote:
>>
>> Dear gmx users,
>>
>>
>>
>> I recently purchased a second GPU workstation and tried compiling GROMACS on it, but despite the better (and more expensive) specs, performance is significantly worse on my test system. For testing I standardized on version 2016.4 on both machines. Some details below:
>>
>>
>>
>> Workstation 1:
>>
>> 2 x Intel Xeon E5-2680v4 (14 cores), 2 x GTX 1080
>>
>> gmx mdrun -ntmpi 8 -ntomp 7 -gpu_id 00001111
>>
>> Performance: 61 ns/day
>>
>>
>>
>> Workstation 2:
>>
>> 2 x Intel Xeon Gold 6126 (12 cores), 2 x GTX 1080Ti
>>
>> gmx mdrun -ntmpi 8 -ntomp 6 -gpu_id 00001111
>>
>> Performance: 49 ns/day
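>>
>> For reference, thread pinning can also be requested explicitly via mdrun's -pin option, e.g. with the same launch settings as above:
>>
>>   gmx mdrun -ntmpi 8 -ntomp 6 -gpu_id 00001111 -pin on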
>>
>>
>>
>> I'm guessing it's an issue during compilation but I can't figure it out. I wouldn't claim to have any knowledge about how GROMACS interacts with the hardware, so here are some observations (not sure which are actually relevant):
>>
>>
>>
>> - Compilation command for both: cmake .. -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=ON -DCMAKE_INSTALL_PREFIX=/opt/gromacs
>>
>>
>>
>> - When compiling on Workstation 2 I originally got a CMake error "Cannot find AVX512F compiler flag". I updated my gcc to 5.3.1 to solve this.
>>
>>
>>
>> - Some regression tests fail for Workstation 2 during compilation: 4 - FFTUnitTests (SEGFAULT), 16 - CorrelationTest
>>
>>
>>
>> - mdrun logs show Hardware Topology: Basic versus Hardware Topology: Only logical processor count
>>
>>
>>
>> - Running CPU-only (export CUDA_VISIBLE_DEVICES="") I get 21 ns/day versus 23 ns/day, so the CPUs in Workstation 2 are definitely faster.
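>>
>> An equivalent CPU-only run can also be requested from mdrun itself via the -nb option, letting mdrun choose the thread layout automatically:
>>
>>   gmx mdrun -nb cpu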
>>
>>
>>
>> - After upgrading both to 2018.rc1 (using cmake3), I get regression test failures for Workstation 1 (9 - GPUUtilsUnitTest) and Workstation 2 (8 - FFTUnitTests, 9 - GPUUtilsUnitTest, 26 - CorrelationTest). Performance is 66.5 ns/day versus 51.95 ns/day. The GPU load actually looks similar to previous versions (~70% for Workstation 1 and 50-60% for Workstation 2). I actually got the best performance with Workstation 1 running 2016.1 (69 ns/day).
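>>
>> The failing tests can be rerun individually through CTest from the build directory for more detail; the -R argument is a regex matching the test name shown in the failure output, e.g.:
>>
>>   ctest --output-on-failure -R GPUUtilsUnitTest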
>>
>>
>>
>> Any help on how I can optimize performance on Workstation 2 would be appreciated. If there are certain files that would be helpful let me know and I'll send a link.
>>
>>
>>
>> Cheers,
>> Jason
>>
>>
>>
>>
>>
>>
>>
>> Jason Loo
>> PhD, MPharm, RPh
>> Lecturer
>> School of Pharmacy
>> Faculty of Health and Medical Sciences
>> Taylor's University
>>
>
>
>
> --
> Dr. Carsten Kutzner
> Max Planck Institute for Biophysical Chemistry
> Theoretical and Computational Biophysics
> Am Fassberg 11, 37077 Goettingen, Germany
> Tel. +49-551-2012313, Fax: +49-551-2012302
> http://www.mpibpc.mpg.de/grubmueller/kutzner
> http://www.mpibpc.mpg.de/grubmueller/sppexa
>