[gmx-users] Question about scaling

Carsten Kutzner ckutzne at gwdg.de
Mon Nov 12 17:28:11 CET 2012

Hi Thomas,

On Nov 12, 2012, at 5:18 PM, Thomas Schlesier <schlesi at uni-mainz.de> wrote:

> Dear all,
> I did some scaling tests for a cluster and I'm a little bit clueless about the results.
> So first the setup:
> Cluster:
> Saxonid 6100, Opteron 6272 16C 2.100GHz, Infiniband QDR
> GROMACS version: 4.0.7 and 4.5.5
> Compiler: 	GCC 4.7.0
> MPI: Intel MPI
> FFT-library: ACML 5.1.0 fma4
> System:
> 895 spce water molecules
This is a somewhat small system, I would say.
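
To put a rough number on "small", here is a minimal sketch that counts atoms per core. It assumes three atoms per molecule (SPC/E is a three-site water model) and an even split of atoms across cores; the variable names are illustrative only.

```python
# Rough atoms-per-core estimate for the quoted system.
# Assumption: SPC/E is a three-site model, so each water contributes 3 atoms.
ATOMS_PER_SPCE_WATER = 3
n_waters = 895  # from the system description in the quoted message

n_atoms = n_waters * ATOMS_PER_SPCE_WATER

# Assumption: atoms are divided evenly among the cores.
atoms_per_core = {cores: n_atoms / cores for cores in (1, 2, 4)}

for cores, atoms in atoms_per_core.items():
    print(f"{cores} core(s): about {atoms:.0f} atoms per core")
```

With only a few hundred atoms left per core on 4 cores, there is little computation per step to offset any communication.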

> Simulation time: 750 ps (0.002 ps timestep)
> Cut-off: 1.0 nm
> but with long-range corrections (DispCorr = EnerPres; PME with standard settings, but in each case no extra CPU solely for PME)
> V-rescale thermostat and Parrinello-Rahman barostat
> I get the following timings (seconds), where each value is calculated as the time that would be needed on 1 CPU (so if a job on 2 CPUs took X s, the listed time is 2 * X s).
> These timings were taken from the *.log file, at the end of the
> 'real cycle and time accounting' - section.
> Timings:
> gmx-version   1 CPU   2 CPUs  4 CPUs
> 4.0.7         4223    3384    3540
> 4.5.5         3780    3255    2878
Do you mean CPUs or CPU cores? Are you using the IB network or are you running single-node?
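
For reference, the quoted numbers can be converted back into wall-clock time, speedup, and parallel efficiency. This is a minimal sketch that assumes the table values are wall time multiplied by core count, as described in the quoted message; the function and variable names are illustrative only.

```python
# Normalized timings from the quoted table: wall time multiplied by core count (s).
cpu_seconds = {
    "4.0.7": {1: 4223, 2: 3384, 4: 3540},
    "4.5.5": {1: 3780, 2: 3255, 4: 2878},
}

def scaling(times):
    """Return {cores: (wall_time, speedup, efficiency)} relative to the 1-core run."""
    t1 = times[1]
    result = {}
    for n, cpu_time in sorted(times.items()):
        wall = cpu_time / n        # undo the multiplication by core count
        speedup = t1 / wall        # how much faster than the 1-core run
        efficiency = speedup / n   # 1.0 would be ideal linear scaling
        result[n] = (wall, speedup, efficiency)
    return result

for version, times in cpu_seconds.items():
    for n, (wall, speedup, eff) in scaling(times).items():
        print(f"{version}  {n} core(s): wall {wall:7.1f} s, "
              f"speedup {speedup:4.2f}, efficiency {eff:4.2f}")
```

An efficiency above 1 means the multi-core run used less total CPU time than the single-core run, which is the unexpected part of these numbers.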

> I'm a little bit clueless about the results. I always thought that if I have a non-interacting system and double the number of CPUs, I
You do use PME, which means a global interaction of all charges.

> would get a simulation which takes only half the time (so the times as defined above would be equal). If the system does have interactions, I would lose some performance due to communication. Due to node imbalance there could be a further loss of performance.
> Keeping this in mind, I can only explain the timings for version 4.0.7 2cpu -> 4cpu (2cpu a little bit faster, since going to 4cpu leads to more communication -> loss of performance).
> All the other timings, especially that 1cpu takes longer in each case than the other cases, I do not understand.
> Probably the system is too small and/or the simulation time is too short for a scaling test. But I would assume that the amount of time needed to set up the simulation would be equal for all three cases of one GROMACS version.
> The only other explanation that comes to my mind would be that something went wrong during the installation of the programs…
You might want to take a closer look at the timings in the md.log output files; this will
give you a clue where the bottleneck is, and also tell you about the communication-computation


> Please, can somebody enlighten me?
> Greetings
> Thomas

Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
