[gmx-users] Suggestions for Gromacs Performance

Szilárd Páll pall.szilard at gmail.com
Tue Aug 12 20:20:48 CEST 2014


You do not show your exact hardware configuration and with different
CPUs you will surely get different performance. You do not show your
command line or launch configuration (#ranks, #threads, #separate PME
ranks) either, but based on the "-gpu_id 00000000" argument you have
there, I assume you are not using OpenMP multi-threading correctly.
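For reference, here is a rough sketch of how I read that argument. It assumes the 4.6/5.0 convention that each digit of the -gpu_id string maps one PP rank on the node to a GPU ID; the variable names are only illustrative.

```shell
#!/bin/sh
# Illustrative only: interpret a -gpu_id string, assuming one digit per PP rank.
gpu_id="00000000"

# Number of PP ranks implied by the string (one digit per rank).
ranks=${#gpu_id}

# Distinct GPU IDs referenced by the string.
gpus=$(printf '%s' "$gpu_id" | fold -w1 | sort -u | tr -d '\n')

echo "ranks=$ranks gpus=$gpus"
```

Under that assumption the string describes eight ranks all sharing GPU 0, which is why it suggests a many-rank launch rather than one rank with several OpenMP threads.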

So here's what I suggest:

- On single-CPU (as well as single-node 2xCPU with Intel) and
single-GPU runs it is (nearly) always best to use no
domain-decomposition, i.e. "-ntmpi 1" and optionally "-ntomp N" where
N is the number of threads you want to start. If you have
hyperthreading on, things are a bit more complicated, but first you
should sort things out without HT.

- With multi-GPU and multi-node runs you'll have to try and see what
thread/rank number works best; on Intel machines start with 1-2
threads/rank with CPU-only, 2-4 threads/rank with GPUs.

- If you have similar enough hardware you should be able to get close
to the performance shown; the numbers are reproducible. If you don't,
post your command line and log files (attachments are not allowed;
pastebin or similar is your friend).
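As a starting point for the single-GPU case above, a minimal sketch of such a launch line could look like this. It only builds and prints the command. The core count, GPU ID and file prefix are assumptions for illustration, and it presumes a thread-MPI (non-MPI) build; with 5.0 the binary is invoked as "gmx mdrun" instead of "mdrun".

```shell
#!/bin/sh
# Illustrative only: build (not run) an mdrun command line for a
# single-GPU run without domain decomposition.
ncores=8      # assumed core count, as in the "1CPU(8cores)" benchmark row
gpu=0         # assumed GPU ID
deffnm=topol  # hypothetical file name prefix for the run input/output

# One thread-MPI rank, N OpenMP threads, one GPU.
cmd="mdrun -ntmpi 1 -ntomp $ncores -gpu_id $gpu -deffnm $deffnm"

# Print the command for inspection; paste it into a shell or job script to launch.
echo "$cmd"
```

For multi-GPU or multi-node runs the same pattern applies, but the rank and thread counts have to be scanned by hand as described above.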

Cheers,
--
Szilárd

PS: Not sure if it's just me, but your email contains tons of newlines
which makes it barely readable.

On Wed, Aug 6, 2014 at 12:03 PM, Dinesh Mali <dmali085 at gmail.com> wrote:
> Dear gmx users,
>
> I have been performing some benchmarks taken from the following URL:
>
> http://www.gromacs.org/GPU_acceleration
>
> The expected performance values are taken from the graphs. This cluster
> has two nodes, of which one has two Nvidia K20 GPUs. I have followed the
> performance checklist given at the following URL:
>
> http://www.gromacs.org/Documentation/Performance_checklist
>
> I am using fftw-3.3.4 with sse2 SIMD instructions and using cuda-5.5.
>
> The results are as follows:
>
> Gromacs Version 4.6.5 GCC Version 4.4.6
>
> System  CPU/GPU used                          performance (ns/day)  expected (ns/day)
> RNASE   2 CPU (Cores 16, nopm 1)              63.815                80
> RNASE   1CPU(8cores)+1GPU (-gpu_id 00000000)  50.647                95
> ADH     4 CPU on 2 on each node (32 Cores)    15.115                15
>
>
> Gromacs Version 5.0.0 GCC Version 4.7.2
>
> System  CPU/GPU used                                performance (ns/day)  expected (ns/day)
> RNASE   2 CPU (Cores 16, nopm 1)                    66.698                80
> RNASE   1CPU(8cores)+1GPU (-np 8 -gpu_id 00000000)  47.763                95
> ADH     4 CPU on 2 on each node (32 Cores)          14.142                15
>
> Kindly provide inputs to improve the performance with GPUs. Also, if
> possible, please suggest a simulation with reported performance for
> performing benchmarks for both CPU & GPU.
>
> Regards,
> Dinesh Mali
> --
> Gromacs Users mailing list
>
> * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.

