[gmx-users] Gromacs 4.6.7 with MPI and OpenMP
Szilárd Páll
pall.szilard at gmail.com
Fri May 8 20:25:11 CEST 2015
On Fri, May 8, 2015 at 4:45 PM, Malcolm Tobias <mtobias at wustl.edu> wrote:
>
> Szilárd,
>
> On Friday 08 May 2015 15:56:12 Szilárd Páll wrote:
>> What's being utilized vs what's being started are different things. If
>> you don't believe the mdrun output - which is quite likely not wrong
>> about the 2 ranks x 4 threads -, use your favorite tool to check the
>> number of ranks and threads started and their placement. That will
>> explain what's going on...
>
> Good point. If I use 'ps -L' I can see the OpenMP threads:
>
> [root at gpu21 ~]# ps -Lfu mtobias
> UID PID PPID LWP C NLWP STIME TTY TIME CMD
> mtobias 9830 9828 9830 0 1 09:28 ? 00:00:00 sshd: mtobias at pts/0
> mtobias 9831 9830 9831 0 1 09:28 pts/0 00:00:00 -bash
> mtobias 9989 9831 9989 0 2 09:33 pts/0 00:00:00 mpirun -np 2 mdrun_mp
> mtobias 9989 9831 9991 0 2 09:33 pts/0 00:00:00 mpirun -np 2 mdrun_mp
> mtobias 9990 9831 9990 0 1 09:33 pts/0 00:00:00 tee mdrun.out
> mtobias 9992 9989 9992 38 7 09:33 pts/0 00:00:02 mdrun_mpi -ntomp 4 -v
> mtobias 9992 9989 9994 0 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9992 9989 9998 0 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9992 9989 10000 0 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9992 9989 10001 16 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9992 9989 10002 16 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9992 9989 10003 16 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9993 9989 9993 73 7 09:33 pts/0 00:00:05 mdrun_mpi -ntomp 4 -v
> mtobias 9993 9989 9995 0 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9993 9989 9999 0 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9993 9989 10004 0 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9993 9989 10005 12 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9993 9989 10006 12 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
> mtobias 9993 9989 10007 12 7 09:33 pts/0 00:00:00 mdrun_mpi -ntomp 4 -v
>
> but top only shows 2 CPUs being utilized:
>
> top - 09:33:42 up 37 days, 19:48, 2 users, load average: 2.13, 1.05, 0.68
> Tasks: 517 total, 3 running, 514 sleeping, 0 stopped, 0 zombie
> Cpu0 : 98.7%us, 1.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1 : 98.7%us, 1.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu2 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu3 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu8 : 0.0%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
> Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu10 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu12 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 132053748k total, 23817664k used, 108236084k free, 268628k buffers
> Swap: 4095996k total, 1884k used, 4094112k free, 15600572k cached
>
>
>> Very likely that's exactly what's screwing things up. We try to be
>> nice and back off (mdrun should note that on the output) when
>> affinities are set externally assuming that they are set for a good
>> reason and to correct values. Sadly, that assumption often proves to
>> be wrong. Try running with "-pin on" or turn off the CPUSET-ing (or
>> double-check if it's right).
>
> I wouldn't expect the CPUSETs to be problematic; I've been using them with Gromacs for over a decade now ;-)
Thread affinity setting within mdrun has been employed since v4.6; we
set it on a per-thread basis, and not doing so can lead to pretty
severe performance degradation when using multi-threading. Depending
on the Linux kernel, OS jitter, and the type/speed/scale of the
simulation, even MPI-only runs will see a benefit from correct
affinity settings.
Hints:
- some useful mdrun command line arguments: "-pin on", "-pinoffset N", "-pinstride N"
- more details: http://www.gromacs.org/Documentation/Acceleration_and_parallelization
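
For example (just a sketch: the 16-core layout, no hyperthreading, and
the job names below are assumptions, not taken from your setup), two
independent 2-rank x 4-thread runs could share a node like this:

  mpirun -np 2 mdrun_mpi -ntomp 4 -pin on -pinoffset 0 -deffnm job_A
  mpirun -np 2 mdrun_mpi -ntomp 4 -pin on -pinoffset 8 -deffnm job_B

The first run would be pinned to cores 0-7 and the second to cores
8-15, so the two jobs never share cores. With hyperthreading the
offset (and -pinstride) has to account for the number of hardware
threads per core.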
> If I use '-pin on' it appears to be utilizing 8 CPU-cores as expected:
>
> [mtobias at gpu21 Gromacs_Test]$ mpirun -np 2 mdrun_mpi -ntomp 4 -pin on -v -deffnm PolyA_Heli_J_hi_equil
>
> top - 09:36:26 up 37 days, 19:50, 2 users, load average: 1.00, 1.14, 0.78
> Tasks: 516 total, 4 running, 512 sleeping, 0 stopped, 0 zombie
> Cpu0 : 78.9%us, 2.7%sy, 0.0%ni, 18.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1 : 63.7%us, 0.3%sy, 0.0%ni, 36.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu2 : 65.6%us, 0.3%sy, 0.0%ni, 33.8%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
> Cpu3 : 64.9%us, 0.3%sy, 0.0%ni, 34.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu4 : 80.7%us, 2.7%sy, 0.0%ni, 16.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu5 : 64.0%us, 0.3%sy, 0.0%ni, 35.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu6 : 62.0%us, 0.3%sy, 0.0%ni, 37.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu7 : 60.3%us, 0.3%sy, 0.0%ni, 39.1%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
>
>
> Weird. I wonder if anyone else has experience using pinning with CPUSETs?
What is your goal with using CPUSETs? Node sharing?
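
If it is node sharing, it is also worth double-checking what mask the
CPUSET (or the MPI launcher's own binding) actually hands to the mdrun
ranks. One quick check, with <PID> being the rank PIDs from your ps
listing (9992 and 9993 above):

  taskset -cp <PID>
  grep Cpus_allowed_list /proc/<PID>/status

If each rank's allowed mask covers only a single core, all of its
OpenMP threads inherit that mask and pile up on that one core, which
would match the top output above showing only Cpu0 and Cpu1 busy.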
--
Szilárd
> Malcolm
>
> --
> Malcolm Tobias
> 314.362.1594
>
>