[gmx-users] GROMACS stalls for NPT simulation when using -npme and -dd flags

Mark Abraham Mark.Abraham at anu.edu.au
Fri Mar 9 13:42:33 CET 2012


On 9/03/2012 9:43 PM, Stephen Cox wrote:
> Dear users,
>
> I'm trying to run an isotropic NPT simulation on a cubic cell 
> containing TIP4P/ice water and methane. I'm using the 
> Parrinello-Rahman barostat. I've been playing around with the 
> different decomposition flags of mdrun to get better performance and 
> scaling and have found that the standard -npme (half number of cores) 
> works pretty well. I've also tried using the -dd flags, and I appear 
> to get decent performance and scaling. However, after a few 
> nanoseconds (corresponding to about 3 hours run time), the program 
> just stalls, with no output and no error messages. I realise NPT may 
> cause problems for domain decomposition if the cell vectors vary 
> wildly, but this isn't happening in my system.
>
> Has anybody else experienced issues with domain decomposition and NPT 
> simulations? If so, are there any workarounds? For the moment, I've 
> had to resort to using -pd, which is giving relatively poor 
> performance and scaling, but at least it isn't dying!
>
> I'm using GROMACS 4.5.5 built with an Intel compiler (I followed the 
> instructions online, with static linking) and this job script:
>
> #!/bin/bash -f
> # ---------------------------
> #$ -V
> #$ -N test
> #$ -S /bin/bash
> #$ -cwd
> #$ -l vf=2G
> #$ -pe ib-ompi 32
> #$ -q infiniband.q
>
> mpirun mdrun_mpi -cpnum -cpt 60 -npme 16 -dd 4 2 2
>
> Below is my grompp.mdp.
>
> Thanks,
> Steve
>
> P.S. I think that there may be a memory leak that occurs for domain 
> decomposition with NPT. I seem to remember seeing this happen on my 
> desktop and my local cluster. I don't see this with NVT simulations. 
> This would be consistent with the lack of an error message: I've just 
> run a short test and the memory usage was climbing steadily.
>
> ; run control
> integrator = md
> dt         = 0.002
> nsteps     = -1
> comm_mode  = linear
> nstcomm    = 10
>
> ; energy minimization
> emtol  = 0.01
> emstep = 0.01
>
> ; output control
> nstxout       = 0
> nstvout       = 0
> nstfout       = 0
> nstlog        = 0
> nstcalcenergy = 2500
> nstenergy     = 2500
> nstxtcout     = 2500
>
> ; neighbour searching
> nstlist            = 1
> ns_type            = grid
> pbc                = xyz
> periodic_molecules = no
> rlist              = 0.90
>
> ; electrostatics
> coulombtype = pme
> rcoulomb    = 0.90
>
> ; vdw
> vdwtype  = cut-off
> rvdw     = 0.90
> dispcorr = ener
>
> ; ewald
> fourierspacing = 0.1
> pme_order      = 4
> ewald_geometry = 3d
> optimize_fft   = yes
>
> ; temperature coupling
> tcoupl          = nose-hoover
> nh-chain-length = 10
> tau_t           = 2.0
> ref_t           = 255.0
> tc_grps         = system
>
> ; pressure coupling
> pcoupl          = parrinello-rahman
> pcoupltype      = isotropic
> ref_p           = 400.0
> tau_p           = 2.0
> compressibility = 6.5e-5
>
> ; constraints
> constraint_algorithm = shake
> shake_tol            = 0.0001
> lincs_order          = 8
> lincs_iter           = 2
>
> ; velocity generation
> gen_vel  = yes
> gen_temp = 255.0
> gen_seed = -1

You're generating velocities and immediately using the Parrinello-Rahman 
barostat, which is unsuitable for equilibration. You're using a 2 fs 
integration step (dt = 0.002), which requires constraints = all-bonds, 
but I don't see that in your .mdp file. You may have better stability if 
you equilibrate with the Berendsen barostat and then switch to 
Parrinello-Rahman. I've seen no other reports of memory usage growing 
without bounds, but if you can still observe it happening after choosing 
a better integration regime, then it suggests a code problem that wants 
fixing.
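
As a rough sketch of that regime, assuming the rest of the .mdp file 
stays as posted (the tau_p value for the equilibration stage is an 
illustrative guess, not something tuned for this system), the 
equilibration run would differ in:

; equilibration stage (sketch, values are assumptions)
constraints = all-bonds
pcoupl      = berendsen
tau_p       = 1.0        ; assumed value
gen_vel     = yes

and the production run, continued from the equilibrated state, would 
then go back to:

; production stage (sketch)
constraints = all-bonds
pcoupl      = parrinello-rahman
tau_p       = 2.0
gen_vel     = no         ; keep the equilibrated velocities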

Mark


