[gmx-users] gromacs 4.6 mpirun mdrun_mpi poor output performance

Mark Abraham mark.j.abraham at gmail.com
Fri Oct 21 09:42:22 CEST 2016


Hi,

The start and end of the log file are your best friends. Unfortunately we
can't see the latter, but I can see from the PME tuning that your job is
doing a lot of waiting for the long-range calculation. Thus, as you know,
-npme is your friend.
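
For example (a sketch only; check g_tune_pme -h in your build for the exact
options), g_tune_pme can scan a range of -npme values on your real .tpr and
report which one runs fastest. Run g_tune_pme itself serially; it uses the
MPIRUN and MDRUN environment variables to launch your parallel mdrun:

  export MPIRUN=mpirun
  export MDRUN=/app/gromacs462/bin/mdrun_mpi
  g_tune_pme -np 128 -s md.tpr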

In 4.6.x CPU-only runs, you won't get value from more than 2-3 OpenMP
threads per rank. You're using 16, which is definitely inefficient for our
implementation. You probably won't be able to use this number of cores in
"1 core per MPI rank" mode, but you should start with that on a smaller
number of nodes (and thus MPI ranks) and decide how many nodes it is
reasonably efficient to use. At the limit, try "2 cores per MPI rank" and
see how that goes.
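
As a concrete sketch (the node count and -npme value below are placeholders
you will still need to tune, not a recommendation), a pure-MPI run on 4 of
your 16-core nodes would look something like:

  #PBS -l select=4:ncpus=16:mpiprocs=16
  export OMP_NUM_THREADS=1
  mpirun -np 64 /app/gromacs462/bin/mdrun_mpi -ntomp 1 -npme 16 -deffnm md

Compare the ns/day and load-imbalance figures at the end of md.log across a
few node counts before committing to the full 100 ns run.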

And you should not start new work with 4.6.2; at the very least get the bug
fixes in 4.6.7, or, better, the performance improvements in more recent
versions of the code.

Mark

On Fri, Oct 21, 2016 at 5:54 AM shubhandra tripathi <
shub1991tripathi at gmail.com> wrote:

> Dear gmx users,
> I have a problem with poor output performance from a GROMACS 4.6 simulation
> run via mpirun. I have tried various options but have been unable to boost
> performance. The command below was also used with -npme values of 32, 48,
> and 64, with no improvement in performance.
>
> *"mpirun -np 128  /app/gromacs462/bin/mdrun_mpi -deffnm md " *
>
> The main issue is that the load imbalance is very high, resulting in poor
> output of 200-500 ps per day.
>
> Thanking you
>
> Sincerely,
>
>
> -----md.log ------
> https://drive.google.com/open?id=0B9Gtd3wzhkmPSmk2X1QyYmVlV0k
>
>
> ------PBS script-----
> #!/bin/bash
> #PBS -l walltime=48:00:00
> #PBS -N test1
> #PBS -q workq
> #PBS -l select=8:ncpus=16:mpiprocs=16
> # Go to the directory from which you submitted the job
>
> cd $PBS_O_WORKDIR
>
> source /usr/share/Modules/init/sh
> export MPI_DEBUG=all
> export MPI_IB_RAILS=2
> export MPI_DSM_DISTRIBUTE=1
> export MPI_VERBOSE=1
> export MPI_BUFS_THRESHOLD=1
> export MPI_BUFS_PER_PROC=1024
>
> module load gromacs-4.6.2
> module load intel-cluster-studio-2013
>
> mpirun -np 128  /app/gromacs462/bin/mdrun_mpi -deffnm md
>
> ----md.mdp----
>
> title       = Protein-ligand complex NVT equilibration
> ; Run parameters
> integrator  = md        ; leap-frog integrator
> nsteps      = 50000000    ; 2 * 50000000 = 100000 ps (100 ns)
> dt          = 0.002     ; 2 fs
> ; Output control
> nstxout     = 25000     ; save coordinates to .trr every 50 ps
> nstvout     = 25000     ; save velocities to .trr every 50 ps
> nstenergy   = 1000      ; save energies every 2 ps
> nstlog      = 1000      ; update log file every 2 ps
> nstxtcout   = 10000      ; write .xtc trajectory every 20 ps
> energygrps  = Protein GTP
> ; Bond parameters
> continuation    = yes           ; continuing from a previous run
> constraint_algorithm = lincs    ; holonomic constraints
> constraints     = all-bonds     ; all bonds (even heavy atom-H bonds) constrained
> lincs_iter      = 1             ; accuracy of LINCS
> lincs_order     = 6             ; also related to accuracy
> ; Neighborsearching
> cutoff-scheme = Verlet
> ns_type     = grid      ; search neighboring grid cells
> nstlist     = 10        ; 20 fs
> rcoulomb    = 1.0       ; short-range electrostatic cutoff (in nm)
> rvdw        = 1.0       ; short-range van der Waals cutoff (in nm)
> ; Electrostatics
> coulombtype     = PME       ; Particle Mesh Ewald for long-range electrostatics
> pme_order       = 4         ; cubic interpolation
> fourierspacing  = 0.16      ; grid spacing for FFT
> ; Temperature coupling
> tcoupl      = V-rescale                     ; modified Berendsen thermostat
> tc-grps     = Protein_GTP Water_and_ions    ; two coupling groups - more accurate
> tau_t       = 0.1   0.1                     ; time constant, in ps
> ref_t       = 300   300                     ; reference temperature, one for each group, in K
> ; Pressure coupling
> pcoupl      = Parrinello-Rahman             ; pressure coupling is on for NPT
> pcoupltype  = isotropic                     ; uniform scaling of box vectors
> tau_p       = 2.0                           ; time constant, in ps
> ref_p       = 1.0                           ; reference pressure, in bar
> compressibility = 4.5e-5                    ; isothermal compressibility of water, bar^-1
> ; Periodic boundary conditions
> pbc         = xyz       ; 3-D PBC
> ; Dispersion correction
> DispCorr    = EnerPres  ; account for cut-off vdW scheme
> ; Velocity generation
> gen_vel     = no        ; do not generate velocities; continuing a previous run
>
> Regards
> Shubhandra

