[gmx-users] cpu/gpu utilization

Fri Mar 2 13:57:53 CET 2018

Sorry for the confusion. My fault...
I reread my previous post and found that I had missed something. In fact, I could not run with "-pme gpu".

So, once again, I ran all the commands and uploaded the log files:

gmx mdrun -nobackup -nb cpu -pme cpu -deffnm md_0_1
https://pastebin.com/RNT4XJy8

gmx mdrun -nobackup -nb cpu -pme gpu -deffnm md_0_1
https://pastebin.com/7BQn8R7g
This run prints an error to the screen that does not appear in the log file, so please also see https://pastebin.com/KHg6FkBz

gmx mdrun -nobackup -nb gpu -pme cpu -deffnm md_0_1
https://pastebin.com/YXYj23tB

gmx mdrun -nobackup -nb gpu -pme gpu -deffnm md_0_1
https://pastebin.com/P3X4mE5y

From the results, it seems that running PME on the CPU is better than running it on the GPU. The fastest command here is -nb gpu -pme cpu.
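For comparing runs like these, a small helper can pull the ns/day figure out of each log. This is only a sketch: it assumes each log contains a line starting with "Performance:" whose second field is the ns/day value, and the log file names in the loop are hypothetical. Since every command above uses -deffnm md_0_1 with -nobackup, each run would need its own log name (for example through mdrun's -g option) for the logs to coexist.

```shell
# ns_per_day LOGFILE: print the ns/day value from an mdrun log.
# Assumes a line of the form "Performance:  <ns/day>  <hour/ns>".
ns_per_day() {
    awk '$1 == "Performance:" { print $2 }' "$1"
}

# Hypothetical log names, one per -nb/-pme combination; missing files are skipped.
for log in nb-cpu_pme-cpu.log nb-gpu_pme-cpu.log nb-gpu_pme-gpu.log; do
    if [ -f "$log" ]; then
        printf '%s\t%s ns/day\n' "$log" "$(ns_per_day "$log")"
    fi
done
```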

I still have one question: while the GPU is being utilized, the CPU is also busy. So I suspect that the source code calls cudaDeviceSynchronize(), with the CPU spinning in a busy loop while it waits.

Regards,
Mahmood

On Friday, March 2, 2018, 3:24:41 PM GMT+3:30, Szilárd Páll <pall.szilard at gmail.com> wrote: 

Once again: full log files, please, not partial cut-and-paste.

Also, you misread something because your previous logs show:
-nb cpu -pme gpu: 56.4 ns/day
-nb cpu -pme gpu -pmefft cpu: 64.6 ns/day
-nb cpu -pme cpu 67.5 ns/day

So both mixed-mode PME and PME on the CPU are faster than PME on the GPU, with PME on the CPU slightly faster than mixed mode.
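To put rough numbers on that comparison, the relative differences can be worked out from the three ns/day figures quoted above. The helper name speedup is made up for this sketch; it only does the arithmetic and does not touch GROMACS.

```shell
# speedup BASE NEW: print how much faster NEW is than BASE, in percent.
speedup() {
    LC_ALL=C awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new / base - 1) * 100 }'
}

speedup 56.4 64.6   # mixed-mode PME vs. PME on the GPU  -> 14.5
speedup 56.4 67.5   # PME on the CPU vs. PME on the GPU  -> 19.7
speedup 64.6 67.5   # PME on the CPU vs. mixed-mode PME  -> 4.5
```

So mixed mode is about 14.5% faster than PME on the GPU, and PME on the CPU adds roughly another 4.5% on top of that.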

This is about as much as you can do, I think. Your GPU is just too slow to get more performance out of it and the runs are GPU-bound. You might be able to get a bit more performance with some tweaks (compile mdrun with AVX2_256, use a newer fftw, use a newer gcc), but expect marginal gains.

Cheers,

--
Szilárd