[gmx-users] Gromacs on an IBM Power8

Szilárd Páll pall.szilard at gmail.com
Fri Sep 30 18:56:13 CEST 2016


Hi,

Unless your Power8 Linux machine/kernel is configured differently from
the ones I've used, thread placement with -ntomp and -pinstride is
quite straightforward. You can see the hardware-thread-to-"CPU"-index
mapping in the log file if you compile with hwloc, under the "Hardware
topology:" section. So, assuming you want to run on a single socket, to
place 1, 2, 4, or 8 threads per core use stride 8, 4, 2, or 1,
respectively, i.e.:
gmx mdrun -ntmpi 1 -ntomp 12 -pinstride 8   # 1 thread/core
gmx mdrun -ntmpi 1 -ntomp 24 -pinstride 4   # 2 threads/core
gmx mdrun -ntmpi 1 -ntomp 48 -pinstride 2   # 4 threads/core
[...]

I generally script this by computing the stride and thread count from
the desired number of threads per core and the total number of
hardware threads per socket/NUMA node.
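As a minimal sketch of that calculation (assuming a single SMT-8 socket with 12 cores, as in the examples above; the variable names and values are mine, purely illustrative):

```shell
#!/bin/sh
# Compute -ntomp and -pinstride for a target threads-per-core count
# on one SMT-8 socket.
SMT=8      # hardware threads per core
CORES=12   # cores on the socket
TPC=2      # desired threads per core (must divide SMT evenly)
NTOMP=$((CORES * TPC))
STRIDE=$((SMT / TPC))
echo "gmx mdrun -ntmpi 1 -ntomp ${NTOMP} -pinstride ${STRIDE}"
```

With TPC=2 this prints the 2-threads/core command from the list above.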

The above is pure OpenMP, but the same applies to combined
MPI/thread-MPI + OpenMP runs; strides and offsets are applied
correctly across ranks too.
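For example, a combined run at 2 threads/core with one thread-MPI rank per NUMA half could look like the following (a sketch assuming the 2x12-core, SMT-8 layout discussed in this thread; the rank and thread counts are illustrative, not a tuned recommendation):

```shell
# 2 thread-MPI ranks x 12 OpenMP threads each; stride 4 gives
# 2 threads/core, and the second rank's threads are placed after
# the first rank's block.
gmx mdrun -ntmpi 2 -ntomp 12 -pin on -pinstride 4
```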

A few more things to note:
- obviously, this way you won't be able to test non-power-of-2
threads/core; for that you'll have to tweak the code (code
contributions that implement things like this are always appreciated
;) or construct and pass OpenMP thread affinity masks yourself
(tedious, but doable; I can give hints if you're really interested);
- by default we assume a 32 threads/rank limit; to use more you'll
need to raise it at compile time via GMX_OPENMP_MAX_THREADS;
- note that the Power8 chips are 2-way NUMA, so there is no point in
running a rank across the two halves; place your ranks/threads
accordingly.
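For reference, raising the per-rank thread limit mentioned above is a configure-time setting, assuming a typical CMake-based GROMACS build (the value 64 here is just an example):

```shell
# Re-run CMake with a higher per-rank OpenMP thread cap, then rebuild.
cmake .. -DGMX_OPENMP_MAX_THREADS=64
make -j
```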

Cheers,
--
Szilárd


On Fri, Sep 30, 2016 at 4:11 PM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
> Hi,
>
> You should also check out the documentation, e.g.
> http://manual.gromacs.org/documentation/2016/user-guide/mdrun-performance.html.
> To implement Szilard's suggestion of fewer threads per core and fewer ranks
> per node, for your 2-socket 24-core node, you will want something like
>
> gmx mdrun -ntomp 6 -ntmpi 8 -pin on -pinoffset x -pinstride y
>
> but I don't know how to advise for x and y so that you end up with 2
> threads on each of the three cores dedicated to each rank, because we only
> just got access to a Power8 machine...
>
> Mark
>
> On Fri, Sep 30, 2016 at 3:55 PM Baker D.J. <D.J.Baker at soton.ac.uk> wrote:
>
>> Hi Szilard,
>>
>> Thank you for your detailed reply, and for the attached paper. I'm heading
>> in the right direction -- I have upgraded to gcc v5.1 and installed fftw
>> v3.3.5 with VSX support turned on. On the other hand I'm not that
>> experienced with gromacs, and I would appreciate some advice on using the
>> "gmx mdrun" command, please. In other words I would be keen to try out your
>> suggestions, however I'm floundering a bit re putting sensible mdrun
>> commands together. For the moment I have simply enabled SMT=8 on our 24
>> core machine and used the "obvious choice" of flags. That is...
>>
>> mdrun -ntomp 8 -ntmpi 24 ....
>>
>> So, for example, that gives me circa 29 ns/day using the 2 K40 cards,
>> however as you note, the above command is sub-optimal for both the cpu and
>> the gpu runs. If you could please give me some examples re your
>> recommendations then that would be appreciated.
>>
>> Best regards,
>> David
>>
>> -----Original Message-----
>> From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se [mailto:
>> gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On Behalf Of Szilárd
>> Páll
>> Sent: Wednesday, September 28, 2016 6:41 PM
>> To: Discussion list for GROMACS users <gmx-users at gromacs.org>
>> Cc: gromacs.org_gmx-users at maillist.sys.kth.se
>> Subject: Re: [gmx-users] Gromacs on an IBM Power8
>>
>> Hi,
>>
>> Brief notes/recommendations:
>> - Use the recently released fftw 3.3.5, it has VSX support (contributed by
>> us, that is by Erik Lindahl)
>> - Use at least gcc 5.x, it's significantly faster.
>> - 8 threads/core will never be optimal; 2-4 is best. Currently the best
>> way to do that is to use -pinstride.
>> - 1 rank/core is rarely optimal (with GPUs), IIRC from when I last ran on
>> the 2x12c IBM machines, 6-8 ranks were a good balance.
>>
>> Finally, when it comes to performance, Intel and AVX2 will be hard to beat
>> with the current Power8 chips, but with GPUs combined, we've shown such
>> machines match/beat Haswell, e.g. see
>> http://on-demand.gputechconf.com/gtc/2015/presentation/S5504-Szilard-Pall.pdf
>> (slide 34).
>>
>> Cheers,
>> --
>> Szilárd
>>
>>
>> On Wed, Sep 28, 2016 at 4:04 PM, Baker D.J. <D.J.Baker at soton.ac.uk> wrote:
>> > Hello,
>> >
>> > I would appreciate advice on running gromacs on a Power8 machine,
>> please. I recently downloaded and installed gromacs v2016 on one of our IBM
>> Power8 machines. I understand that this version of gromacs has support for
>> Power8 machines and it certainly performs a great deal better than earlier
>> versions. So that is progress. On the other hand I wonder if the Power8
>> could do better. I used one of the gromacs benchmarks (case A) supplied by
>> PRACE and ran the following command on the Power8:
>> >
>> > gmx mdrun -ntomp 8 -ntmpi 24 -s ion_channel.tpr -maxh 0.50 -resethway
>> > -noconfout -nsteps 10000 -g logfile -nb cpu
>> >
>> > In other words there are 24 cores and I turned on SMT=8. Using that
>> configuration and the above command the performance is about the same as on
>> a 16 core Sandybridge node...
>> >
>> > Performance in ns/day:
>> > SandyBridge (16 cores)  -- 11.23
>> > Power8 (24 cores)       -- 11.82
>> > Power8 (using 2 GPUs)   -- 28.34
>> >
>> > That's compiling with the GNU compiler v4.9.1. Should I be able to do
>> better on the Power8 with a different/later compiler and/or different
>> runtime settings?
>> >
>> > Best regards,
>> > David
>> >
>> >
>> >
>> >
>> > --
>> > Gromacs Users mailing list
>> >
>> > * Please search the archive at
>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>> posting!
>> >
>> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>> >
>> > * For (un)subscribe requests visit
>> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>> send a mail to gmx-users-request at gromacs.org.

