[gmx-users] Parallelization performance

Mark Abraham mark.j.abraham at gmail.com
Sat Mar 16 12:07:56 CET 2013


On Sat, Mar 16, 2013 at 1:50 AM, Sonia Aguilera <
sm.aguilera37 at uniandes.edu.co> wrote:

> Hi!
>
> I have been running MD simulations on a 6-processor machine. I just got an
> account on a cluster. An NVT stabilization takes about 8 hours on my
> 6-processor machine, but it takes about 12 hours on the cluster using 16
> processors. It is my understanding that the idea of running in parallel is
> to be more efficient, right?
>

Yes, but your performance depends on the hardware and the setup. 16
abacuses are not faster than 6 computers :-) Secondly, even if the hardware
is comparable, if your 6-processor machine has 4 cores per processor, then
OpenMP might be delivering more performance. Or your MPI environment might
be configured wrongly and you're running 16 copies of the same simulation
on superior hardware. You should inspect the top of the log files to see
what GROMACS thinks your hardware is providing, and the bottom of the log
file to see in which aspects of the simulation the two systems are
delivering different performance.
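
For scale, the two wall times quoted above can be converted into the ns/day
figure that GROMACS reports at the bottom of the log file. This is only a
rough sketch: it assumes the 8 h and 12 h values are wall-clock times for the
whole run, it takes nsteps and dt from the mdp quoted further down, and the
function name ns_per_day is made up for illustration.

```python
# Rough throughput comparison from the figures quoted in this thread.
# The wall times are the poster's approximate values, not measurements.

def ns_per_day(nsteps, dt_ps, wall_hours):
    """Simulated nanoseconds per day of wall-clock time."""
    simulated_ns = nsteps * dt_ps / 1000.0  # ps -> ns
    return simulated_ns * 24.0 / wall_hours

NSTEPS = 150000   # nsteps from the mdp
DT_PS = 0.002     # dt from the mdp, in ps

workstation = ns_per_day(NSTEPS, DT_PS, 8.0)   # about 8 h on the 6-processor machine
cluster = ns_per_day(NSTEPS, DT_PS, 12.0)      # about 12 h with mpirun -np 16

print(f"workstation: {workstation:.2f} ns/day")
print(f"cluster:     {cluster:.2f} ns/day")
print(f"cluster / workstation: {cluster / workstation:.2f}")
```

If the ns/day values in the two log files differ from these estimates, the
quoted wall times probably include something other than the run itself
(queueing, start-up, file transfer).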


> This is the command for the run on the 6-processor machine:
> mdrun -v -s nvtOmpA.tpr -deffnm nvtOmpA
>
> This is the command for the run on 16 processors on the cluster:
> mpirun -np 16 mdrun_mpi -v -s nvtOmpA.tpr -deffnm nvtOmpA
>
> With the last command I imagine that my process is divided among 16
> processors that run in parallel, so the wall time should be less
> than on the 6-processor machine. My system is a protein in oil and water,
> and the simulations are for FE calculations. I think the run on the 16
> processors of the cluster is expected to be faster, but I'm getting the
> opposite. Am I doing something wrong?
>

Not as far as we know. But you need to inspect your .log files for all the
clues GROMACS provides.
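
One way to put the top and bottom of the two logs side by side is a small
helper like the one below. It is a generic sketch, not a GROMACS tool: the
name head_and_tail and the default of 40 lines are arbitrary choices, and it
knows nothing about the log format.

```python
from collections import deque
from itertools import islice

def head_and_tail(path, n=40):
    """Return (first n lines, last n lines) of a text file.

    Lines already returned in the head are not repeated in the tail,
    so a short file yields a short (possibly empty) tail.
    """
    with open(path, errors="replace") as fh:
        head = list(islice(fh, n))        # consume the first n lines
        tail = list(deque(fh, maxlen=n))  # keep only the last n of the rest
    return head, tail
```

With -deffnm nvtOmpA the log is written to nvtOmpA.log, so calling
head_and_tail("nvtOmpA.log") on the copy from each machine gives the two
sections to compare.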

> This is my mdp. I have used the same mdp for simulations on 4, 6 and 8
> processor machines, and every time it is faster and runs quite well. Any help
> would be appreciated!
>
> title                    = NVT equilibration
> ; Run control
> integrator               = sd       ; Langevin dynamics
>

There have been fixes for correctness and performance of the SD integrator
- you should certainly not be using GROMACS 4.6.

Mark


> tinit                    = 0
> dt                       = 0.002
> nsteps                   = 150000    ; 300 ps
> nstcomm                  = 100
> ; Output control
> nstxout                  = 500
> nstvout                  = 500
> nstfout                  = 0
> nstlog                   = 500
> nstenergy                = 500
> nstxtcout                = 0
> xtc-precision            = 1000
> ; Neighborsearching and short-range nonbonded interactions
> nstlist                  = 10
> ns_type                  = grid
> pbc                      = xyz
> rlist                    = 1.5
> ; Electrostatics
> coulombtype              = PME
> rcoulomb                 = 1.5
> ; van der Waals
> vdw-type                 = switch
> rvdw-switch              = 0.8
> rvdw                     = 0.9
> ; Apply long range dispersion corrections for Energy and Pressure
> DispCorr                  = EnerPres
> ; Spacing for the PME/PPPM FFT grid
> fourierspacing           = 0.12
> ; EWALD/PME/PPPM parameters
> pme_order                = 6
> ewald_rtol               = 1e-06
> epsilon_surface          = 0
> optimize_fft             = no
> ; Temperature coupling
> ; tcoupl is implicitly handled by the sd integrator
> tc_grps                  = system
> tau_t                    = 1.0
> ref_t                    = 300
> ; Pressure coupling is off for NVT
> Pcoupl                   = No
> tau_p                    = 0.5
> compressibility          = 4.5e-05
> ref_p                    = 1.0
> ; Free energy control stuff
> free_energy              = yes
> init_lambda              = 0.1
> delta_lambda             = 0
> foreign_lambda           = 0.05 0.2
> sc-alpha                 = 0
> sc-power                 = 0
> sc-sigma                 = 0
> couple-moltype           = Protein_chain_A ; name of moleculetype to decouple
> couple-lambda0           = vdw      ;
> couple-lambda1           = vdw-q       ;
> couple-intramol          = yes
> nstdhdl                  = 10
> ; Generate velocities to start
> gen_vel                  = yes
> gen_temp                 = 300
> gen_seed                 = -1
> ; options for bonds
> constraints              = h-bonds  ; we only have C-H bonds here
> ; Type of constraint algorithm
> constraint-algorithm     = lincs
> ; Not a continuation run, so constraints are applied to the starting configuration
> continuation             = no
> ; Highest order in the expansion of the constraint coupling matrix
> lincs-order              = 12
>
>
>
> Thanks in advance!
>
> Sonia Aguilera
> Graduate assistant
