[gmx-users] GROMACS performance issues on POWER9/V100 node

Szilárd Páll pall.szilard at gmail.com
Sat Apr 25 00:14:39 CEST 2020


Hi,

Affinity setting on the Talos II with Ubuntu 18.04 (kernel 5.0) works fine.
I get threads pinned where they should be (confirmed with hwloc) and
consistent results. I also get reasonable thread placement even without
pinning (i.e. the kernel scatters threads first as long as #threads <=
#hwthreads). I see only a minor penalty from not pinning -- not too
surprising given that I have a single NUMA node and the kernel is doing its
job.

Here are my quick test results, run on an 8-core Talos II POWER9 with a
GPU, using the adh_cubic input:

$ grep Perf *.log
test_1x1_rep1.log:Performance:       16.617
test_1x1_rep2.log:Performance:       16.479
test_1x1_rep3.log:Performance:       16.520
test_1x2_rep1.log:Performance:       32.034
test_1x2_rep2.log:Performance:       32.389
test_1x2_rep3.log:Performance:       32.340
test_1x4_rep1.log:Performance:       62.341
test_1x4_rep2.log:Performance:       62.569
test_1x4_rep3.log:Performance:       62.476
test_1x8_rep1.log:Performance:       97.049
test_1x8_rep2.log:Performance:       96.653
test_1x8_rep3.log:Performance:       96.889
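
To make the scaling in these numbers easier to read, here is a minimal
Python sketch that averages the three repeats per configuration and
computes parallel efficiency relative to the smallest run. It assumes that
"1xN" in the file names means 1 rank x N OpenMP threads and that the value
is the first column of the md.log Performance line (ns/day); neither is
stated explicitly above. The summarize() helper and its regular expression
are illustrative, not part of GROMACS.

```python
import re
from collections import defaultdict
from statistics import mean

# grep output copied verbatim from the results above.
GREP_OUTPUT = """\
test_1x1_rep1.log:Performance:       16.617
test_1x1_rep2.log:Performance:       16.479
test_1x1_rep3.log:Performance:       16.520
test_1x2_rep1.log:Performance:       32.034
test_1x2_rep2.log:Performance:       32.389
test_1x2_rep3.log:Performance:       32.340
test_1x4_rep1.log:Performance:       62.341
test_1x4_rep2.log:Performance:       62.569
test_1x4_rep3.log:Performance:       62.476
test_1x8_rep1.log:Performance:       97.049
test_1x8_rep2.log:Performance:       96.653
test_1x8_rep3.log:Performance:       96.889
"""

# Assumed file naming: test_<ranks>x<threads>_rep<k>.log
LINE_RE = re.compile(
    r"test_(\d+)x(\d+)_rep\d+\.log:Performance:\s+([\d.]+)"
)


def summarize(text):
    """Return {total_threads: (mean_performance, efficiency_vs_1_thread)}."""
    runs = defaultdict(list)
    for match in LINE_RE.finditer(text):
        ranks, threads = int(match.group(1)), int(match.group(2))
        runs[ranks * threads].append(float(match.group(3)))
    base = mean(runs[1])  # single-thread average is the reference
    return {
        n: (mean(values), mean(values) / (n * base))
        for n, values in sorted(runs.items())
    }


summary = summarize(GREP_OUTPUT)
for n, (avg, eff) in summary.items():
    print(f"{n} thread(s): mean {avg:7.3f}, efficiency {eff:5.1%}")
```

The spread between repeats is well under 1% in every configuration, which
is what "consistent results" refers to.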


This seems to point towards some issue with the OS or setup on the IBM
machines you have -- and the unit test error may be one of its symptoms
(as it suggests something is off with the hardware topology and a NUMA
node is missing from it). I'd still suggest checking whether a full node
allocation, with all threads, memory, etc. passed to the job, results in
successful affinity settings (i) in mdrun and (ii) in some other tool.
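
As one possible "other tool", here is a minimal Python sketch (Linux only)
that can be run inside the job allocation. It reports how many hardware
threads the job is allowed to use compared with what the OS sees, and tries
to pin the process to each allowed CPU in turn. The can_pin_to() helper is
hypothetical and only exercises the kernel affinity interface through the
standard os module; it does not reproduce how mdrun sets affinities.

```python
import os


def can_pin_to(cpu):
    """Pin the calling process to a single CPU and report whether it worked.

    The original affinity mask is restored before returning.
    """
    original = os.sched_getaffinity(0)
    try:
        os.sched_setaffinity(0, {cpu})
        return os.sched_getaffinity(0) == {cpu}
    except OSError:
        return False
    finally:
        os.sched_setaffinity(0, original)


# CPUs this process is allowed to run on (restricted by the job/cgroup).
allowed = sorted(os.sched_getaffinity(0))
# Logical CPUs the OS reports for the whole machine.
total = os.cpu_count()

print(f"allowed hardware threads: {len(allowed)} of {total}")
if total is not None and len(allowed) < total:
    print("note: the job was not given all hardware threads of the node")

results = {cpu: can_pin_to(cpu) for cpu in allowed}
failed = [cpu for cpu, ok in results.items() if not ok]
print("pinning failed on CPUs:", failed if failed else "none")
```

If pinning fails here as well, or the allowed set is much smaller than the
machine total, the problem is more likely in the job setup or OS than in
mdrun.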

Please update this thread if you have further findings.

Cheers,
--
Szilárd


On Fri, Apr 24, 2020 at 10:52 PM Szilárd Páll <pall.szilard at gmail.com>
wrote:

>>
>> The following lines are found in md.log for the POWER9/V100 run:
>>
>> Overriding thread affinity set outside gmx mdrun
>> Pinning threads with an auto-selected logical core stride of 128
>> NOTE: Thread affinity was not set.
>>
>> The full md.log is available here:
>> https://github.com/jdh4/running_gromacs/blob/master/03_benchmarks/md.log
>
>
> I glanced over that at first, will see if I can reproduce it, though I
> only have access to a Raptor Talos, not an IBM machine with Ubuntu.
>
> What OS are you using?
>
>

