[gmx-users] Query regarding Domain decomposition

suhani nagpal suhani.nagpal at gmail.com
Fri Jun 27 11:08:08 CEST 2014


Hi,

Yes, I have done the trial runs. It runs on 10 nodes but not above that,
so in all 160 processors are being used for the simulation.

Maybe it's true that it's not wise to scale a ~33,000-atom system much
higher than this. As you mentioned earlier: you can't efficiently
parallelize an algorithm over arbitrary amounts of hardware; you need
100-1000 atoms per core, depending on hardware, simulation settings and
GROMACS version.
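
For reference, with 32,600 atoms, 160 cores already works out to about
204 atoms per core, near the lower end of that range; 320 cores would be
only about 102 atoms per core.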

Thanks


On Thu, Jun 26, 2014 at 8:16 PM, Mark Abraham <mark.j.abraham at gmail.com>
wrote:

> Like I suggested last time, find somewhere mdrun does work, e.g. 1 node.
> When you have a complex problem, simplify it and see what you learn ;-) If
> mdrun explodes on 1 node, try your .tpr on a local machine, to see whether
> it's the mdrun install or the MPI system that could be at fault, or maybe
> it's an unstable .tpr...
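>
> (As a sketch of such a minimal test, assuming a 4.6-style mdrun binary
> and the .tpr name from your output, with names adjusted to your setup:
>
>     mdrun -nt 1 -s 400K_SIM2.tpr -deffnm test_1node
>
> runs on a single core with built-in thread-MPI, which takes the batch
> system and the MPI library out of the picture.)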
>
> Mark
>
>
> On Thu, Jun 26, 2014 at 2:18 PM, suhani nagpal <suhani.nagpal at gmail.com>
> wrote:
>
> > Thanks for the insights.
> >
> > So, I am now using a .tpr file generated with a grompp whose version
> > matches the mdrun.
> >
> > The files are generated again, but after time step 0 and time 0 nothing
> > gets written, and the job reaches an error state and stops.
> >
> > In the error file:
> >
> > mpirun noticed that process rank 41 with PID 35866 on node cn0286. exited
> > on signal 11 (Segmentation fault).
> >
> >
> > I'm not able to scale over 160 processors. I have 32600 atoms in the
> > system.
> >
> > Kindly assist
> >
> > thanks
> >
> >
> > Suhani
> >
> > On Wed, Jun 25, 2014 at 5:53 PM, Mark Abraham <mark.j.abraham at gmail.com>
> > wrote:
> >
> > > On Jun 25, 2014 8:15 AM, "suhani nagpal" <suhani.nagpal at gmail.com>
> > > wrote:
> > > >
> > > > Greetings
> > > >
> > > > I have been trying to run a set of simulations using a high number
> > > > of processors.
> > > >
> > > > Using the tutorial
> > > > http://compchemmpi.wikispaces.com/file/view/Domaindecomposition_KKirchner_27Apr2012.pdf
> > > >
> > > > I have done calculations to evaluate the number of nodes that would
> > > > be optimal for the protein.
> > > >
> > > >
> > > > So all the files are generated, but an error occurs and the
> > > > trajectory files remain empty, with no error mentioned in the log
> > > > file.
> > >
> > > Hard to say. The problem could be anywhere, since we don't yet know
> > > when mdrun does work...
> > >
> > > > Number of nodes to be used: a multiple of 16
> > > >
> > > > Box in the x and y dimensions: 8 nm
> > > >
> > > >
> > > >
> > > > In the error file:
> > > >
> > > >
> > > > Reading file 400K_SIM2.tpr, VERSION 4.5.5 (single precision)
> > >
> > > Why use a slow, old version if you want parallel performance?
> > >
> > > > Note: file tpx version 73, software tpx version 83
> > >
> > > You should prefer to use a grompp whose version matches your mdrun.
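> > >
> > > (A regeneration with the matching tools could look roughly like the
> > > following; the input file names here are only placeholders, and just
> > > 400K_SIM2.tpr is taken from your output:
> > >
> > >     grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o 400K_SIM2.tpr
> > >
> > > i.e. rebuild the .tpr with the grompp from the same installation that
> > > provides the mdrun you run in parallel.)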
> > >
> > > > The number of OpenMP threads was set by environment variable
> > > > OMP_NUM_THREADS to 1
> > > > Using 320 MPI processes
> > > >
> > > > NOTE: The load imbalance in PME FFT and solve is 116%.
> > > >       For optimal PME load balancing
> > > >       PME grid_x (54) and grid_y (54) should be divisible by #PME_nodes_x (140)
> > > >       and PME grid_y (54) and grid_z (54) should be divisible by #PME_nodes_y (1)
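> > >
> > > (That note suggests the split between PP and PME ranks is poor for
> > > this grid. One way to explore it, sketched here with 4.5/4.6-era tool
> > > names that you should check against your install, is to set the number
> > > of separate PME ranks by hand, e.g.
> > >
> > >     mdrun -npme 40 -s 400K_SIM2.tpr -deffnm 400K_SIM2
> > >
> > > or to let g_tune_pme search for a good split at a fixed rank count:
> > >
> > >     g_tune_pme -np 160 -s 400K_SIM2.tpr
> > >
> > > The value 40 above is only an illustration, not a recommendation.)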
> > > >
> > > >
> > > >
> > > > The .mdp file, for reference:
> > > >
> > > > ; Bond parameters
> > > > continuation    = yes           ; Restarting after NPT
> > > > constraint_algorithm = lincs    ; holonomic constraints
> > > > constraints     = all-bonds     ; all bonds (even heavy atom-H bonds) constrained
> > > > lincs_iter      = 1             ; accuracy of LINCS
> > > > lincs_order     = 4             ; also related to accuracy
> > > > ; Neighborsearching
> > > > ns_type         = grid          ; search neighboring grid cells
> > > > nstlist         = 5             ; 10 fs
> > > > rlist           = 1.0           ; short-range neighborlist cutoff (in nm)
> > > > rcoulomb        = 1.0           ; short-range electrostatic cutoff (in nm)
> > > > rvdw            = 1.0           ; short-range van der Waals cutoff (in nm)
> > > > ; Electrostatics
> > > > coulombtype     = PME           ; Particle Mesh Ewald for long-range electrostatics
> > > > pme_order       = 4             ; cubic interpolation
> > > > fourierspacing  = 0.16          ; grid spacing for FFT
> > > > ; Temperature coupling is on
> > > > tcoupl          = nose-hoover   ; nose-hoover coupling
> > > > tc-grps         = Protein Non-Protein   ; two coupling groups - more accurate
> > > > tau_t           = 0.2   0.2     ; time constant, in ps
> > > > ref_t           = 400   400     ; reference temperature, one for each group, in K
> > > > ; Pressure coupling is off
> > > > pcoupl          = no            ;
> > > > ; Periodic boundary conditions
> > > > pbc             = xyz           ; 3-D PBC
> > > > ; Dispersion correction
> > > > DispCorr        = EnerPres      ; account for cut-off vdW scheme
> > > > ; Velocity generation
> > > > gen_vel         = yes           ; assign velocities from Maxwell distribution
> > > > gen_temp        = 400           ; temperature for Maxwell distribution
> > > > gen_seed        = -1            ; generate a random seed
> > > >
> > > >
> > > > Kindly help.
> > > >
> > > > I have to run simulations on 250 to 300 processors.
> > >
> > > Maybe. You can't efficiently parallelize an algorithm over arbitrary
> > > amounts of hardware. You need 100-1000 atoms per core, depending on
> > > hardware, simulation settings and GROMACS version.
> > >
> > > Mark
> > >

