[gmx-users] cpu/gpu utilization
Szilárd Páll
pall.szilard at gmail.com
Fri Mar 2 12:54:53 CET 2018
Once again, full log files please, not partial cut-and-paste.
Also, you misread something because your previous logs show:
-nb cpu -pme gpu:              56.4 ns/day
-nb cpu -pme gpu -pmefft cpu:  64.6 ns/day
-nb cpu -pme cpu:              67.5 ns/day
So both mixed-mode PME and PME on the CPU are faster, the latter slightly
faster than the former.
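That is, with everything else unchanged, those three runs correspond to
launch lines roughly like the following (adjust -deffnm and the rest to your
setup):

  gmx mdrun -deffnm md_0_1 -nb cpu -pme gpu
  gmx mdrun -deffnm md_0_1 -nb cpu -pme gpu -pmefft cpu
  gmx mdrun -deffnm md_0_1 -nb cpu -pme cpu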
This is about as much as you can do, I think. Your GPU is just too slow to
get more performance out of it and the runs are GPU-bound. You might be
able to get a bit more performance with some tweaks (compile mdrun with
AVX2_256, use a newer fftw, use a newer gcc), but expect marginal gains.
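For reference, a rebuild with those tweaks would look roughly like this; the
compiler names and install prefix below are only placeholders, so adjust them
to your machine:

  cmake .. -DGMX_GPU=ON -DGMX_SIMD=AVX2_256 -DGMX_BUILD_OWN_FFTW=ON \
           -DCMAKE_C_COMPILER=gcc-7 -DCMAKE_CXX_COMPILER=g++-7 \
           -DCMAKE_INSTALL_PREFIX=$HOME/gromacs-avx2
  make -j 16 && make install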
Cheers,
--
Szilárd
On Fri, Mar 2, 2018 at 11:00 AM, Mahmood Naderan <nt_mahmood at yahoo.com>
wrote:
> Command is "gmx mdrun -nobackup -pme cpu -nb gpu -deffnm md_0_1" and the
> log says
>
> R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> On 1 MPI rank, each using 16 OpenMP threads
>
>  Computing:             Num   Num      Call    Wall time    Giga-Cycles
>                         Ranks Threads  Count      (s)       total sum    %
> -----------------------------------------------------------------------------
>  Neighbor search           1    16       501       0.972        55.965   0.8
>  Launch GPU ops.           1    16     50001       2.141       123.301   1.7
>  Force                     1    16     50001       4.019       231.486   3.1
>  PME mesh                  1    16     50001      40.695      2344.171  31.8
>  Wait GPU NB local         1    16     50001      60.155      3465.079  47.0
>  NB X/F buffer ops.        1    16     99501       7.342       422.902   5.7
>  Write traj.               1    16        11       0.246        14.184   0.2
>  Update                    1    16     50001       3.480       200.461   2.7
>  Constraints               1    16     50001       5.831       335.878   4.6
>  Rest                                               3.159       181.963   2.5
> -----------------------------------------------------------------------------
>  Total                                            128.039      7375.390 100.0
> -----------------------------------------------------------------------------
>  Breakdown of PME mesh computation
> -----------------------------------------------------------------------------
>  PME spread                1    16     50001      17.086       984.209  13.3
>  PME gather                1    16     50001      12.534       722.007   9.8
>  PME 3D-FFT                1    16    100002       9.956       573.512   7.8
>  PME solve Elec            1    16     50001       0.779        44.859   0.6
> -----------------------------------------------------------------------------
>
>                Core t (s)   Wall t (s)        (%)
>        Time:     2048.617      128.039     1600.0
>                  (ns/day)    (hour/ns)
> Performance:       67.481        0.356
>
> With the command "", I see that the GPU is utilized at only about 10%, and
> the log file says:
>
> R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> On 1 MPI rank, each using 16 OpenMP threads
>
>  Computing:             Num   Num      Call    Wall time    Giga-Cycles
>                         Ranks Threads  Count      (s)       total sum    %
> -----------------------------------------------------------------------------
>  Neighbor search           1    16      1251       6.912       398.128   2.3
>  Force                     1    16     50001     210.689     12135.653  70.4
>  PME mesh                  1    16     50001      46.869      2699.656  15.7
>  NB X/F buffer ops.        1    16     98751      22.315      1285.360   7.5
>  Write traj.               1    16        11       0.216        12.447   0.1
>  Update                    1    16     50001       4.382       252.386   1.5
>  Constraints               1    16     50001       6.035       347.601   2.0
>  Rest                                               1.666        95.933   0.6
> -----------------------------------------------------------------------------
>  Total                                            299.083     17227.165 100.0
> -----------------------------------------------------------------------------
>  Breakdown of PME mesh computation
> -----------------------------------------------------------------------------
>  PME spread                1    16     50001      21.505      1238.693   7.2
>  PME gather                1    16     50001      12.089       696.333   4.0
>  PME 3D-FFT                1    16    100002      11.627       669.705   3.9
>  PME solve Elec            1    16     50001       0.965        55.598   0.3
> -----------------------------------------------------------------------------
>
>                Core t (s)   Wall t (s)        (%)
>        Time:     4785.326      299.083     1600.0
>                  (ns/day)    (hour/ns)
> Performance:       28.889        0.831
>
> Using the GPU is still better than using the CPU alone. However, I see that
> while the GPU is utilized, the CPU is also busy. So I was wondering whether
> the source code uses cudaDeviceSynchronize(), which would make the CPU spin
> in a busy-wait loop while the GPU works.
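>
> To show what I mean, here is a minimal, generic CUDA sketch (not GROMACS's
> actual code): with the default flags the host thread spins (polls) while it
> waits for the GPU, which shows up as a busy CPU core, whereas the
> blocking-sync flag lets the waiting thread sleep instead.
>
> // spinwait.cu -- illustration only; the kernel and sizes are made up
> #include <cuda_runtime.h>
> #include <cstdio>
>
> __global__ void busywork(float *x, int n)
> {
>     int i = blockIdx.x * blockDim.x + threadIdx.x;
>     if (i < n) {
>         for (int k = 0; k < 20000; ++k) {
>             x[i] = x[i] * 0.999f + 1.0f;
>         }
>     }
> }
>
> int main()
> {
>     // Comment this out to get the default behaviour, where the CPU
>     // spin-waits (one core at ~100%) inside cudaDeviceSynchronize().
>     cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
>
>     const int n = 1 << 20;
>     float *d_x = NULL;
>     cudaMalloc((void **)&d_x, n * sizeof(float));
>     cudaMemset(d_x, 0, n * sizeof(float));
>
>     busywork<<<(n + 255) / 256, 256>>>(d_x, n);
>
>     // With the blocking-sync flag the host thread yields here instead
>     // of polling the device until the kernel has finished.
>     cudaDeviceSynchronize();
>
>     cudaFree(d_x);
>     printf("done\n");
>     return 0;
> }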
>
> Regards,
> Mahmood
>
> On Friday, March 2, 2018, 11:37:11 AM GMT+3:30, Magnus Lundborg <
> magnus.lundborg at scilifelab.se> wrote:
>
> Have you tried the mdrun options:
>
> -pme cpu -nb gpu
> -pme cpu -nb cpu
>
> Cheers,
>
> Magnus
>
>