[gmx-users] Gromacs 2018.3 Exceeding Memory Issue

Peiyin Lee peiyinlee329 at gmail.com
Tue Nov 27 16:43:48 CET 2018


Hi, Mark,

   Thank you for all the suggestions! Regarding the memory limit, it should be
around 118 GB: the cluster I am using has 20 cores per node and 6 GB of memory
per core. That's why I find it strange that my job exceeds such a large memory
limit. I have checked my .mdp file and my submission script and couldn't see
anything that would cause such large memory usage. Do you have suggestions on
other places to look? Thank you so much for your help.
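
As a quick sanity check on those numbers (this assumes slurm reports the
figures in its error message in kibibytes, which I believe is its usual unit):

    # slurm limit and overshoot, taken from the error message
    echo $(( 122880000 / 1024 / 1024 ))   # limit -> 117 GiB, i.e. the ~118 GB above
    echo $(( 123122052 - 122880000 ))     # overshoot -> 242052 KiB, roughly 236 MiB
    # advertised memory per node
    echo $(( 20 * 6 ))                    # 20 cores x 6 GB/core = 120 GB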

Regards,
Peiyin

On Tue, Nov 27, 2018 at 2:50 AM Mark Abraham <mark.j.abraham at gmail.com>
wrote:

> Hi,
>
> On Tue, Nov 27, 2018 at 4:31 AM Peiyin Lee <peiyinlee329 at gmail.com> wrote:
>
> > Hi, all GROMACS users,
> >
> > I am trying to run jobs with GROMACS version 2018.3 and constantly get a
> > memory-exceeded error. The system I run is an all-atom system with 21073
> > atoms, and the largest file estimated to be generated is around 5.8 GB.
> >
>
> Estimated sizes of disk files don't matter here.
>
>
> > My jobs constantly get killed after running for only around 15 minutes,
> > with an error message like this: "slurmstepd: error: Job 12381762 exceeded
> > memory limit (123122052 > 122880000), being killed". I have tried using a
> >
>
> 128 MB is pretty tiny these days - no compute node will have less than 1 GB of
> physical memory, so I suggest asking for at least that.
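>
> If you submit through sbatch, that usually just means an explicit memory
> request in the job script, e.g. something like this (the exact directives
> depend on how your site has configured slurm):
>
>   #SBATCH --ntasks=80
>   #SBATCH --mem-per-cpu=1G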
>
> GROMACS should never leak memory as the simulation progresses - if you think
> you are seeing that (e.g. with a slightly larger memory limit, slurm
> interrupts a bit later), then we would like to see a bug report at
> https://redmine.gromacs.org
>
>
> > larger memory specification (12 GB/core), but the queue wait would be too
> > long and I don't think my job really uses that much memory. I have attached
> > my .mdp file below:
> > "title = NVT Production Run for Trpzip4 in pure H2O
> >
> > define =        ; position restrain the protein
> >
> > ; Run parameters
> >
> > integrator = md ; leap-frog integrator
> >
> > nsteps = 50000000 ; 0.002 * 50000000 = 100000 ps (100 ns)
> >
> > dt     = 0.002 ; 2 fs
> >
> > ; Output control
> >
> > nstenergy = 10000 ; save energies every 20 ps
> >
> > nstlog = 10000 ; update log file every 20 ps
> >
> > nstxout-compressed = 10000      ; 20ps
> >
> > compressed-x-precision = 200   ; 0.005 nm
> >
> > compressed-x-grps       = System
> >
> > ; Bond parameters
> >
> > continuation = yes     ; Restarting after NVT
> >
> > constraint_algorithm = lincs ; holonomic constraints
> >
> > constraints = all-bonds         ; all bonds (even heavy atom-H bonds) constrained
> >
> > lincs_iter = 1             ; accuracy of LINCS
> >
> > lincs_order = 4             ; also related to accuracy
> >
> > ; Neighborsearching
> >
> > ns_type = grid ; search neighboring grid cells
> >
> > nstlist = 5     ; 10 fs
> >
> > rlist = 1.2 ; short-range neighborlist cutoff (in nm)
> >
> > rcoulomb = 1.2 ; short-range electrostatic cutoff (in nm)
> >
> > rvdw = 1.2 ; short-range van der Waals cutoff (in nm)
> >
> > ; Electrostatics
> >
> > coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
> >
> > pme_order = 4     ; cubic interpolation
> >
> > fourierspacing = 0.16 ; grid spacing for FFT
> >
> > ; Temperature coupling is on
> >
> > tcoupl = V-rescale         ; velocity-rescaling thermostat
> >
> > tc-grps = Protein SOL NA        ; three coupling groups
> >
>
> Off topic, but it is not good practice to couple ions separately. Did you
> perhaps follow some tutorial that we can ask the author to fix?
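>
> The usual choice is to couple the solvent and ions together, for instance (a
> sketch using the default GROMACS index groups, keeping your values):
>
>   tc-grps = Protein Non-Protein
>   tau_t   = 0.5     0.5
>   ref_t   = 400     400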
>
>
> > tau_t = 0.5 0.5 0.5        ; time constant, in ps
> >
> > ref_t = 400 400 400     ; reference temperature, one for each group, in K
> >
> > ; Pressure coupling is on
> >
> > pcoupl = No      ; no pressure coupling (NVT run)
> >
> > pcoupltype = isotropic     ; uniform scaling of box vectors
> >
> > tau_p = 5.0         ; time constant, in ps
> >
> > ref_p = 1.0         ; reference pressure, in bar
> >
> > compressibility = 4.5e-5 ; isothermal compressibility, bar^-1
> >
> > ; Periodic boundary conditions
> >
> > pbc     = xyz ; 3-D PBC
> >
> > ; Dispersion correction
> >
> > DispCorr = EnerPres ; account for cut-off vdW scheme
> >
> > ; Velocity generation
> > gen_vel = no ; Velocity generation is off"
> > and the command I used to run was "mpirun -np 80 gmx_mpi mdrun -npme 16
> > -noappend -s md.tpr -c md.gro -e md.edr -x md.xtc -cpi md.cpt -cpo md.cpt
> > -g md.log".
>
>
> Looks fine. I encourage everybody to use the default file names and to
> organize their projects into natural units, such as one directory per run.
> Renaming the files doesn't add value and just makes your life more
> complicated when you're doing restarts.
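>
> For example, with one directory per run and the default file names, the
> restart command collapses to something like this (a sketch; "run1" is just an
> illustrative directory name and it assumes the run used default names from
> the start):
>
>   cd run1
>   mpirun -np 80 gmx_mpi mdrun -npme 16 -noappend -cpi
>
> mdrun then picks its usual names for the trajectory, energy, log and
> checkpoint files.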
>
> Mark
>
>
> > This is my first time posting, so please excuse anything that's unclear.
> > I will try to clarify if needed. Any help is greatly appreciated!
> >
> > Regards,
> > Peiyin Lee

