[gmx-users] Parallelization performance

Sonia Aguilera sm.aguilera37 at uniandes.edu.co
Sat Mar 16 01:50:08 CET 2013


Hi!

I have been running MD simulations on a 6-processor machine, and I just got an
account on a cluster. An NVT equilibration takes about 8 hours on my
6-processor machine, but it takes about 12 hours on the cluster using 16
processors. It is my understanding that the idea of running in parallel is
to be more efficient, right?

This is the command for the run on the 6-processor machine:
mdrun -v -s nvtOmpA.tpr -deffnm nvtOmpA

This is the command for the run on 16 processors on the cluster:
mpirun -np 16 mdrun_mpi -v -s nvtOmpA.tpr -deffnm nvtOmpA
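
I have also read that mdrun_mpi has a -npme option to dedicate some of the
ranks to PME, but I am only guessing about what value would make sense here
(for example, 4 PME ranks out of 16):

mpirun -np 16 mdrun_mpi -npme 4 -v -s nvtOmpA.tpr -deffnm nvtOmpA

There also seems to be a g_tune_pme tool that tests different PME rank counts
automatically, something like:

g_tune_pme -np 16 -s nvtOmpA.tpr

Is that the right way to go about it?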

With the last command I imagine that my job is divided among 16 processors
working in parallel, so the wall time should be lower than on the 6-processor
machine. My system is a protein in oil and water, and the simulations are for
free energy calculations. I would expect the run on the 16 processors of the
cluster to be faster, but I am getting the opposite. Am I doing something
wrong?
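
In case it helps, this is how I look at the performance summary that mdrun
writes at the end of the log file (I am assuming the log is called
nvtOmpA.log because of -deffnm):

tail -n 40 nvtOmpA.log
grep "Average load imbalance" nvtOmpA.log

but I am not sure how to interpret those numbers for the cluster run.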

This is my mdp file. I have used the same mdp for simulations on 4-, 6- and
8-processor machines, and each time it gets faster and runs quite well. Any
help will be appreciated!

title                    = NVT equilibration
; Run control
integrator               = sd       ; Langevin dynamics
tinit                    = 0
dt                       = 0.002
nsteps                   = 150000    ; 300 ps
nstcomm                  = 100
; Output control
nstxout                  = 500
nstvout                  = 500
nstfout                  = 0
nstlog                   = 500
nstenergy                = 500
nstxtcout                = 0
xtc-precision            = 1000
; Neighborsearching and short-range nonbonded interactions
nstlist                  = 10
ns_type                  = grid
pbc                      = xyz
rlist                    = 1.5
; Electrostatics
coulombtype              = PME
rcoulomb                 = 1.5
; van der Waals
vdw-type                 = switch
rvdw-switch              = 0.8
rvdw                     = 0.9
; Apply long range dispersion corrections for Energy and Pressure
DispCorr                  = EnerPres
; Spacing for the PME/PPPM FFT grid
fourierspacing           = 0.12
; EWALD/PME/PPPM parameters
pme_order                = 6
ewald_rtol               = 1e-06
epsilon_surface          = 0
optimize_fft             = no
; Temperature coupling
; tcoupl is implicitly handled by the sd integrator
tc_grps                  = system
tau_t                    = 1.0
ref_t                    = 300
; Pressure coupling is off for NVT
Pcoupl                   = No
tau_p                    = 0.5
compressibility          = 4.5e-05
ref_p                    = 1.0
; Free energy control stuff
free_energy              = yes
init_lambda              = 0.1
delta_lambda             = 0
foreign_lambda           = 0.05 0.2
sc-alpha                 = 0
sc-power                 = 0
sc-sigma                 = 0
couple-moltype           = Protein_chain_A ; name of moleculetype to decouple
couple-lambda0           = vdw      ;
couple-lambda1           = vdw-q       ;
couple-intramol          = yes
nstdhdl                  = 10
; Generate velocities to start
gen_vel                  = yes
gen_temp                 = 300
gen_seed                 = -1
; options for bonds
constraints              = h-bonds  ; we only have C-H bonds here
; Type of constraint algorithm
constraint-algorithm     = lincs
; Do not constrain the starting configuration
continuation             = no
; Highest order in the expansion of the constraint coupling matrix
lincs-order              = 12



Thanks in advance!

Sonia Aguilera
Graduate assistant




