[gmx-users] Nodes problem?

Albert mailmd2011 at gmail.com
Sat Jan 7 05:15:44 CET 2012


Hello:

   I am submitting a GROMACS job on a cluster, and the job ALWAYS 
terminates with the following messages:


vol 0.75  imb F  5% pme/F 0.52 step 4200, will finish Sat Jan  7 09:36:14 2012
vol 0.77  imb F  6% pme/F 0.52 step 4300, will finish Sat Jan  7 09:36:28 2012

step 4389: Water molecule starting at atom 42466 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.
Wrote pdb files with previous and current coordinates

step 4390: Water molecule starting at atom 42466 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.
Wrote pdb files with previous and current coordinates

step 4391: Water molecule starting at atom 41659 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.

step 4391: Water molecule starting at atom 42385 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.
Wrote pdb files with previous and current coordinates
Wrote pdb files with previous and current coordinates

step 4392: Water molecule starting at atom 32218 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.
Wrote pdb files with previous and current coordinates

step 4393: Water molecule starting at atom 41659 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.

step 4393: Water molecule starting at atom 32218 can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.
Wrote pdb files with previous and current coordinates
Wrote pdb files with previous and current coordinates

-------------------------------------------------------
Program mdrun_mpi_bg, VERSION 4.5.5
Source code file: ../../../src/mdlib/pme.c, line: 538

Fatal error:
3 particles communicated to PME node 4 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
This usually means that your system is not well equilibrated.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

"How Do You Like Your Vacation So Far ?" (Speed 2 - Cruise Control)

Error on node 19, will try to stop all the nodes
Halting parallel program mdrun_mpi_bg on CPU 19 out of 24

gcq#191: "How Do You Like Your Vacation So Far ?" (Speed 2 - Cruise Control)

Abort(-1) on node 19 (rank 19 in comm 1140850688): application called MPI_Abort(MPI_COMM_WORLD, -1) - process 19
<Jan 07 05:08:35.964275> BE_MPI (ERROR): The error message in the job record is as follows:
<Jan 07 05:08:35.964330> BE_MPI (ERROR):   "killed with signal 6"
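
For triage, the "can not be settled" warnings above can be tallied to see when the failures begin and which water molecules recur. The following is only a minimal Python sketch, not part of GROMACS: the function name `settle_failures` is made up for illustration, and the sample text is copied from the output above (the full log, which the job script writes to gromacs.out, could be read from disk instead).

```python
import re
from collections import Counter

# Pattern copied from the mdrun warnings quoted above.
SETTLE_RE = re.compile(
    r"step (\d+): Water molecule starting at atom (\d+) can not be settled"
)


def settle_failures(log_text):
    """Return (first failing step, Counter mapping atom index -> warnings)."""
    hits = [(int(step), int(atom)) for step, atom in SETTLE_RE.findall(log_text)]
    first_step = min((step for step, _ in hits), default=None)
    return first_step, Counter(atom for _, atom in hits)


# Sample lines taken from the log above; a real run would read gromacs.out.
sample = """\
step 4389: Water molecule starting at atom 42466 can not be settled.
step 4390: Water molecule starting at atom 42466 can not be settled.
step 4391: Water molecule starting at atom 41659 can not be settled.
step 4391: Water molecule starting at atom 42385 can not be settled.
step 4392: Water molecule starting at atom 32218 can not be settled.
step 4393: Water molecule starting at atom 41659 can not be settled.
step 4393: Water molecule starting at atom 32218 can not be settled.
"""

first_step, counts = settle_failures(sample)
print("first failing step:", first_step)
for atom, n in counts.most_common():
    print(f"atom {atom}: {n} warning(s)")
```

In this log the first warning is at step 4389 and four distinct water molecules are involved; the pdb files mentioned in the warnings can then be inspected around those atoms.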




-----------here is my script for submitting jobs----------------
# @ job_name = I213A
# @ class = kdm-large
# @ account_no = G07-13
# @ error = gromacs.out
# @ output = gromacs.out
# @ environment = COPY_ALL
# @ wall_clock_limit = 12:00:00
# @ notification = error
# @ notify_user = albert at icm.edu.pl
# @ job_type = bluegene
# @ bg_size = 6
# @ queue
mpirun -exe /opt/gromacs/4.5.5/bin/mdrun_mpi_bg -args "-nosum -dlb yes -v -s npt.tpr" -mode VN -np 24


-----------here is my npt.mdp file--------------------
title        = OPLS Lysozyme NPT equilibration
define        = -DPOSRES    ; position restrain the protein
; Run parameters
integrator    = md        ; leap-frog integrator
nsteps        = 200000    ; 0.001 * 200000 = 200 ps
dt        = 0.001        ; 1 fs
; Output control
nstxout        = 100        ; save coordinates every 0.1 ps
nstvout        = 100        ; save velocities every 0.1 ps
nstenergy    = 100        ; save energies every 0.1 ps
nstlog        = 100        ; update log file every 0.1 ps
; Bond parameters
continuation    = yes        ; Restarting after NVT
constraint_algorithm = lincs    ; holonomic constraints
constraints    = all-bonds    ; all bonds (even heavy atom-H bonds) constrained
lincs_iter    = 1        ; accuracy of LINCS
lincs_order    = 4        ; also related to accuracy
; Neighborsearching
ns_type        = grid        ; search neighboring grid cells
nstlist        = 5        ; 5 fs
rlist        = 1.0        ; short-range neighborlist cutoff (in nm)
rcoulomb    = 1.0        ; short-range electrostatic cutoff (in nm)
rvdw        = 1.0        ; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype    = PME        ; Particle Mesh Ewald for long-range electrostatics
pme_order    = 4        ; cubic interpolation
fourierspacing    = 0.16        ; grid spacing for FFT
; Temperature coupling is on
tcoupl        = V-rescale    ; modified Berendsen thermostat
tc-grps        = Protein Non-Protein    ; two coupling groups - more accurate
tau_t        = 0.1    0.1    ; time constant, in ps
ref_t        = 300     300    ; reference temperature, one for each group, in K
; Pressure coupling is on
pcoupl        = Parrinello-Rahman    ; Pressure coupling on in NPT
pcoupltype    = isotropic    ; uniform scaling of box vectors
tau_p        = 2.0        ; time constant, in ps
ref_p        = 1.0        ; reference pressure, in bar
compressibility = 4.5e-5    ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc        = xyz        ; 3-D PBC
; Dispersion correction
DispCorr    = EnerPres    ; account for cut-off vdW scheme
; Velocity generation
gen_vel        = no        ; Velocity generation is off
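
The time intervals implied by the step counts above follow from multiplying each count by dt. A small Python sketch of that arithmetic, using only values copied from npt.mdp (the dictionary and function names are illustrative, not GROMACS API):

```python
# dt from npt.mdp, in ps (0.001 ps = 1 fs).
DT_PS = 0.001

# Step counts copied from npt.mdp.
settings = {
    "nsteps": 200000,
    "nstxout": 100,
    "nstvout": 100,
    "nstenergy": 100,
    "nstlog": 100,
    "nstlist": 5,
}


def interval_ps(n_steps, dt_ps=DT_PS):
    """Simulated time, in ps, covered by n_steps integration steps."""
    return n_steps * dt_ps


intervals = {name: interval_ps(n) for name, n in settings.items()}

for name, t in intervals.items():
    print(f"{name:10s} {settings[name]:7d} steps -> {t:g} ps")
```

With dt = 0.001 ps, the run covers 200 ps, the output settings of 100 steps correspond to 0.1 ps, and nstlist = 5 corresponds to 0.005 ps (5 fs).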


I have also tried changing the number of nodes, but that doesn't help. 
Could you please give me some advice on how to fix this?

Thank you very much

best
Shuguang
