[gmx-users] 2018 performance question
Szilárd Páll
pall.szilard at gmail.com
Tue Feb 20 18:45:59 CET 2018
Hi Michael,
What you observe is most likely due to v2018 offloading the PME work
to the GPU by default, which will often mean fewer CPU cores are
needed; runs become more GPU-bound, leaving the CPU without work for
part of the runtime. This should be easy to see by comparing the log
files.
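(If you want a direct comparison, one option -- just a quick sketch,
not something you necessarily need to do -- is to force PME back onto
the CPU in 2018, which should roughly reproduce the 2016-style
division of work:

gmx mdrun -pme cpu

and then compare the CPU load and the log-file timings.)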
Especially with older GPUs (roughly two or more generations old,
i.e. Kepler or earlier), running only part of the PME work on the GPU
can be useful. This can be done with the hybrid PME mode, which runs
the 3D-FFT / gather on the CPU:
gmx mdrun -pmefft cpu
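For instance, with the task assignment spelled out explicitly (the
"md" run name below is just a placeholder for your own tpr/output
prefix):

gmx mdrun -deffnm md -nb gpu -pme gpu -pmefft cpu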
That might give you better CPU/GPU load balance, and sometimes a
moderate performance improvement. Otherwise, there are a few things
you can do to make better overall use of the machine:
- use fewer cores without giving up much performance (e.g. leave 2
cores free for other tasks) -- that's useful if you have other work
you can do on the free cores;
- start multiple simulations to fill the "utilization gaps": e.g. run
2-3 concurrent runs with 2-3 cores each (see the example commands
below).
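As a rough sketch of the second option (the run1/run2 input names are
placeholders, and the core counts are something to tune for your
system):

gmx mdrun -s run1.tpr -deffnm run1 -ntmpi 1 -ntomp 3 -pin on -pinoffset 0 -pinstride 1 -gpu_id 0 &
gmx mdrun -s run2.tpr -deffnm run2 -ntmpi 1 -ntomp 3 -pin on -pinoffset 3 -pinstride 1 -gpu_id 0 &

The -pinoffset/-pinstride settings keep the two runs pinned to
different cores so they do not compete for the same ones.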
Cheers,
--
Szilárd
On Fri, Feb 16, 2018 at 12:41 PM, Michael Brunsteiner <mbx0009 at yahoo.com> wrote:
>
> hi
> just installed gmx-2018 on an x86_64 PC with a GeForce GTX 780 and the CUDA software directly from the NVIDIA web page (it didn't work using the Debian nvidia packages)
> output of lscpu is included below.
> i find that:
> 1) 2018 is slightly faster (~5%) than 2016.
> 2) Both 2016 and 2018 use the GPU, but 2018 seems to use less CPU.
> With 2016, using the "top" command I usually see that the CPU load is close to 1200% (I have 6 cores, each with two threads), while with 2018 this number is closer to around 400% (I guess this is because 2018 does PME on the GPU).
> My question is: can I possibly further improve the performance of 2018 by
> 1) somehow convincing gmx to use more CPU, or
> 2) running two instances of gmx on this one computer simultaneously?
> thanks in advance for any feedback!
> cheers, Michael
>
>
>
> prompt> lscpu
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Byte Order: Little Endian
> CPU(s): 12
> On-line CPU(s) list: 0-11
> Thread(s) per core: 2
> Core(s) per socket: 6
> Socket(s): 1
> NUMA node(s): 1
> Vendor ID: GenuineIntel
> CPU family: 6
> Model: 62
> Model name: Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
> Stepping: 4
> CPU MHz: 3399.898
> CPU max MHz: 3900.0000
> CPU min MHz: 1200.0000
> BogoMIPS: 6799.79
> Virtualization: VT-x
> L1d cache: 32K
> L1i cache: 32K
> L2 cache: 256K
> L3 cache: 12288K
> NUMA node0 CPU(s): 0-11
> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
>
>