[gmx-users] something wrong with BlueGene/P
Matthew Zwier
mczwier at gmail.com
Fri Sep 28 14:53:02 CEST 2012
Hi Kai,
A marginally stable system will often propagate fine on one machine
and blow up on another. I've observed this even between Xeon and
Opteron systems, which is a fairly minor architectural difference.
Since your system runs in NVT but not in NPT, the high-pressure
conditions seem to be at fault. You may not be able to get away with
a 2 fs timestep here. Try 1 fs or even 0.5 fs (dt = 0.001 or 0.0005
in the .mdp file) and see what happens.
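
As a minimal sketch of that change, assuming you want to keep the same
200 ps of simulated time (100000 steps x 2 fs in your current file) and
still write output every 0.2 ps, the affected lines of the NPT .mdp
might look like the fragment below. The scaled nsteps and output
intervals are my assumption, not something your run requires; every
other parameter stays as in your file.

```
; Sketch only: the same NPT run with a 1 fs timestep.
; Assumption: keep 200 ps total time and 0.2 ps output intervals.
dt        = 0.001   ; 1 fs (was 0.002)
nsteps    = 200000  ; 200000 * 1 fs = 200 ps
nstxout   = 200     ; save coordinates every 0.2 ps
nstvout   = 200     ; save velocities every 0.2 ps
nstenergy = 200     ; save energies every 0.2 ps
nstlog    = 200     ; update log file every 0.2 ps
; For 0.5 fs: dt = 0.0005, nsteps = 400000, output intervals = 400.
```

If nstlist stays at 5, the neighbor list is then rebuilt every 5 fs
instead of every 10 fs, which costs a little more but errs on the safe
side for a system that is close to blowing up.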
Cheers,
MZ
On Wed, Sep 26, 2012 at 7:51 AM, Bao Kai <paeanball at gmail.com> wrote:
> Hi, all,
>
> I have run many simulations of CO2-water mixtures with Gromacs on my
> workstation, using 8 cores in parallel, and the results are pretty good.
>
> For bigger simulations, I turned to the BlueGene machine. The problem
> is that with exactly the same configuration files and the same number
> of MPI tasks (8 here), I always get the following errors.
>
> step 6870: Water molecule starting at atom 1375 can not be settled.
> Check for bad contacts and/or reduce the timestep if appropriate.
> Wrote pdb files with previous and current coordinates
>
> Step 6871, time 6.871 (ps)  LINCS WARNING
> relative constraint deviation after LINCS:
> rms 0.139809, max 0.559153 (between atoms 10 and 11)
> bonds that rotated more than 30 degrees:
>  atom 1 atom 2  angle  previous, current, constraint length
>      10     11   78.5    0.1210   0.1813      0.1163
>      10     12   90.0    0.1523   0.1152      0.1163
>
> step 6871: Water molecule starting at atom 4576 can not be settled.
> Check for bad contacts and/or reduce the timestep if appropriate.
>
> step 6871: Water molecule starting at atom 5794 can not be settled.
> Check for bad contacts and/or reduce the timestep if appropriate.
> Wrote pdb files with previous and current coordinates
> Wrote pdb files with previous and current coordinates
>
> -------------------------------------------------------
> Program mdrun_bgp_d, VERSION 4.5.5
> Source code file: pme.c, line: 538
>
> Fatal error:
> 3 particles communicated to PME node 1 are more than 2/3 times the
> cut-off out of the domain decomposition cell of their charge group
> in dimension y.
> This usually means that your system is not well equilibrated.
> For more information and tips for troubleshooting, please check the
> GROMACS website at http://www.gromacs.org/Documentation/Errors
>
> When I do the energy minimization or the NVT equilibration, Gromacs
> works well.
>
> The problem appears when I move on to the NPT equilibration. The
> pressure and temperature are set to 100 bar and 318 K, respectively.
> During the NPT equilibration, the temperature and pressure keep
> increasing until the program halts.
>
> DD step 6499 load imb.: force 6.9%
>
>            Step           Time         Lambda
>            6500        6.50000        0.00000
>
>    Energies (kJ/mol)
>           Angle        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     4.50025e+01    2.21540e+04   -7.02187e+02   -1.23033e+05   -1.27743e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -1.14310e+05    1.50736e+04   -9.92368e+04    3.74677e+02   -3.36430e+02
>  Pressure (bar)   Constr. rmsd
>     1.53825e+03    1.53151e-06
>
> DD step 6599 load imb.: force 5.1%
>
>            Step           Time         Lambda
>            6600        6.60000        0.00000
>
>    Energies (kJ/mol)
>           Angle        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     5.41207e+01    2.32123e+04   -7.01221e+02   -1.23852e+05   -1.27349e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -1.14022e+05    1.50532e+04   -9.89688e+04    3.82147e+02   -3.35505e+02
>  Pressure (bar)   Constr. rmsd
>     2.37940e+03    1.40409e-06
>
> DD step 6699 load imb.: force 4.8%
>
>            Step           Time         Lambda
>            6700        6.70000        0.00000
>
>    Energies (kJ/mol)
>           Angle        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     4.70219e+01    2.37867e+04   -6.99910e+02   -1.24021e+05   -1.26862e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -1.13574e+05    1.54884e+04   -9.80852e+04    4.03537e+02   -3.34252e+02
>  Pressure (bar)   Constr. rmsd
>     3.01172e+03    1.56366e-06
>
> DD step 6799 load imb.: force 6.7%
>
>            Step           Time         Lambda
>            6800        6.80000        0.00000
>
>    Energies (kJ/mol)
>           Angle        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
>     3.79031e+01    2.70730e+04   -6.97586e+02   -1.24088e+05   -1.24549e+04
>       Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
>    -1.10129e+05    3.49837e+04   -7.51454e+04    1.00845e+03   -3.32036e+02
>  Pressure (bar)   Constr. rmsd
>     9.09741e+03    3.89773e-06
>
> The input file for the NPT equilibration is as follows.
>
> define = -DPOSRES ; position restrain the protein
> ; Run parameters
> integrator = md ; leap-frog integrator
> ; integrator = md-vv ; velocity Verlet integrator (disabled)
> nsteps = 100000 ; 100000 * 2 fs = 200 ps
> dt = 0.002 ; 2 fs
> ; Output control
> nstxout = 100 ; save coordinates every 0.2 ps
> nstvout = 100 ; save velocities every 0.2 ps
> ; ; no nstxtcout
> nstenergy = 100 ; save energies every 0.2 ps
> nstlog = 100 ; update log file every 0.2 ps
> ; Bond parameters
> continuation = yes ; <--- Restarting after NVT
> constraint_algorithm = lincs ; holonomic constraints
> constraints = all-bonds ; all bonds (even heavy atom-H bonds) constrained
> lincs_iter = 1 ; accuracy of LINCS
> lincs_order = 4 ; also related to accuracy
> ; Neighborsearching
> ns_type = grid ; search neighboring grid cells
> nstlist = 5 ; 10 fs
> rlist = 0.9 ; short-range neighborlist cutoff (in nm)
> rcoulomb = 0.9 ; short-range electrostatic cutoff (in nm)
> ; rcoulomb = 1.5 ; short-range electrostatic cutoff (in nm)
> rvdw = 0.9 ; short-range van der Waals cutoff (in nm)
> ; Electrostatics
> coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
> pme_order = 4 ; cubic interpolation
> fourierspacing = 0.16 ; grid spacing for FFT
> ; Temperature coupling is on
> tcoupl = V-rescale ; modified Berendsen thermostat
> ; tcoupl = nose-hoover ; Nose-Hoover thermostat (disabled)
> ; tc-grps = CO2 SOL ; two coupling groups - more accurate
> tc-grps = System ; one coupling group for the whole system
> tau_t = 0.1 ; time constant, in ps
> ref_t = 318 ; reference temperature, one for each group, in K
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> ; turning on pressure coupling
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> ; <-- Pressure coupling is on
> pcoupl = Parrinello-Rahman ; Pressure coupling on in NPT
> pcoupltype = isotropic ; uniform scaling of box vectors
> tau_p = 2.0 ; time constant, in ps
> ref_p = 100.0 ; reference pressure, in bar
> compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
> ; Periodic boundary conditions
> pbc = xyz ; 3-D PBC
> ; Dispersion correction
> DispCorr = EnerPres ; account for cut-off vdW scheme
> ; Velocity generation
> gen_vel = no ; <-- Velocity generation is off
> ;
> ;
>
> I do not know what to do. This is about the worst thing that could
> happen. I cannot understand why it works on the workstation but not
> on the BlueGene machine.
>
> This is somewhat urgent now. If I cannot get past this problem,
> my project will be dead.
>
> Any suggestions will be much appreciated.
>
> Thank you very much.
>
> Best Regards,
> Kai