[gmx-users] cpu/gpu utilization

Szilárd Páll pall.szilard at gmail.com
Fri Mar 2 19:16:56 CET 2018


On Fri, Mar 2, 2018 at 1:57 PM, Mahmood Naderan <nt_mahmood at yahoo.com>
wrote:

> Sorry for the confusion. My fault...
> I saw my previous post and found that I missed something. In fact, I
> couldn't run "-pme gpu".
>
> So, once again, I ran all the commands and uploaded the log files
>
>
> gmx mdrun -nobackup -nb cpu -pme cpu -deffnm md_0_1
> https://pastebin.com/RNT4XJy8
>
>
> gmx mdrun -nobackup -nb cpu -pme gpu -deffnm md_0_1
> https://pastebin.com/7BQn8R7g
> This run shows an error on the screen which is not shown in the log file.
> So please also see https://pastebin.com/KHg6FkBz


That's expected; offloading only PME is not supported. The supported
offload modes are:
- nonbonded offload
- nonbonded + full PME offload
- nonbonded + PME mixed mode offload (FFTs run on the CPU)
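
As a rough illustration of the three modes above, here is a minimal Python
sketch that builds the corresponding mdrun command lines and rejects the
PME-only combination. The helper name mdrun_command and the mode labels are
hypothetical (they are not part of GROMACS); only the flags already used in
this thread (-nobackup, -nb, -pme, -pmefft, -deffnm) are assumed.

```python
def mdrun_command(nb, pme, pmefft=None, deffnm="md_0_1"):
    """Build a gmx mdrun command line for one offload mode.

    Hypothetical helper for illustration; it only assembles the flags
    that appear in this thread and does not run anything.
    """
    if nb not in ("cpu", "gpu") or pme not in ("cpu", "gpu"):
        raise ValueError("nb and pme must be 'cpu' or 'gpu'")
    if pmefft not in (None, "cpu", "gpu"):
        raise ValueError("pmefft must be 'cpu', 'gpu' or None")
    # Offloading only PME (nonbondeds on the CPU) is not supported.
    if pme == "gpu" and nb != "gpu":
        raise ValueError("PME-only offload is not supported; also use -nb gpu")
    # Assumption of this sketch: -pmefft is only passed when PME is on the GPU.
    if pmefft is not None and pme != "gpu":
        raise ValueError("pmefft is only used here together with pme='gpu'")

    cmd = ["gmx", "mdrun", "-nobackup", "-nb", nb, "-pme", pme]
    if pmefft is not None:
        cmd += ["-pmefft", pmefft]
    cmd += ["-deffnm", deffnm]
    return cmd


# The three supported offload modes (labels are illustrative only).
SUPPORTED_MODES = {
    "nonbonded offload": dict(nb="gpu", pme="cpu"),
    "nonbonded + full PME offload": dict(nb="gpu", pme="gpu"),
    "nonbonded + PME mixed mode": dict(nb="gpu", pme="gpu", pmefft="cpu"),
}

for label, kwargs in SUPPORTED_MODES.items():
    print(label + ":", " ".join(mdrun_command(**kwargs)))
```

Calling mdrun_command("cpu", "gpu") raises a ValueError in this sketch, which
mirrors the error you saw on the screen for "-nb cpu -pme gpu".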



>
>
>
> gmx mdrun -nobackup -nb gpu -pme cpu -deffnm md_0_1
> https://pastebin.com/YXYj23tB
>
>
>
> gmx mdrun -nobackup -nb gpu -pme gpu -deffnm md_0_1
> https://pastebin.com/P3X4mE5y
>
>
> offloadable
>
>
> From the results, it seems that running the pme on the cpu is better than
> gpu. The fastest command here is -nb gpu -pme cpu
>

Right, same as before, except that this time it looks ~5% slower
(likely the auto-tuner did not manage to switch to the ideal setting).


>
>
> Still I have the question that while GPU is utilized, the CPU is also
> busy. So, I was thinking that the source code uses cudaDeviceSynchronize()
> where the CPU enters a busy loop.
>

Yes, the CPU and GPU run concurrently and work on independent tasks. When
the CPU is done, it has to wait for the GPU before it can proceed with
constraints/integration.
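
The overlap-then-wait pattern can be sketched roughly as follows. This is a
toy Python model, not GROMACS or CUDA code: a worker thread stands in for the
GPU, the function names (run_step, gpu_task, cpu_task, integrate) are
hypothetical, and the wait here is a blocking call rather than the busy loop
you observed.

```python
from concurrent.futures import ThreadPoolExecutor


def run_step(gpu_task, cpu_task, integrate):
    """Model one MD step: offload, overlap, wait, then integrate.

    Toy model only: a worker thread plays the role of the GPU.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        # "Launch" the offloaded work; submit() returns immediately,
        # much like an asynchronous kernel launch.
        gpu_future = pool.submit(gpu_task)

        # Meanwhile the CPU works on its own independent tasks.
        cpu_result = cpu_task()

        # Synchronization point: the CPU cannot continue until the
        # offloaded results are available.
        gpu_result = gpu_future.result()

    # Constraints/integration need both sets of results.
    return integrate(cpu_result, gpu_result)


if __name__ == "__main__":
    # Placeholder workloads standing in for the force computations.
    total = run_step(
        gpu_task=lambda: sum(range(1000)),
        cpu_task=lambda: sum(range(10)),
        integrate=lambda cpu, gpu: cpu + gpu,
    )
    print("combined result:", total)
```

If the "GPU" task takes longer than the CPU task, the time spent in result()
is the CPU waiting on the GPU, which is what a GPU-bound run looks like.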

To get a better overview, please read some of the GROMACS papers
(http://www.gromacs.org/Gromacs_papers) or, for the short version, see
https://goo.gl/AGv6hy (around slides 12-15).

Cheers,
--
Szilárd


>
>
>
> Regards,
> Mahmood
>
>
>
>
>
>
> On Friday, March 2, 2018, 3:24:41 PM GMT+3:30, Szilárd Páll <
> pall.szilard at gmail.com> wrote:
>
>
>
>
>
> Once again, full log files, please, not partial cut-and-paste, please.
>
> Also, you misread something because your previous logs show:
> -nb cpu -pme gpu: 56.4 ns/day
> -nb cpu -pme gpu -pmefft cpu 64.6 ns/day
> -nb cpu -pme cpu 67.5 ns/day
>
> So both mixed mode PME and PME on CPU are faster, the latter slightly
> faster than the former.
>
> This is about as much as you can do, I think. Your GPU is just too slow to
> get more performance out of it and the runs are GPU-bound. You might be
> able to get a bit more performance with some tweaks (compile mdrun with
> AVX2_256, use a newer fftw, use a newer gcc), but expect marginal gains.
>
> Cheers,
>
> --
> Szilárd
>
>
>

