[gmx-users] The 20 subsystems are not compatible (REMD)

Pacho Ramos pachoramos at gmail.com
Tue Nov 26 14:53:35 CET 2013


Hello

I am having a lot of trouble getting a REMD simulation to finish. After running
for some time, some replicas are interrupted without writing a state file,
which leads to the following error on the next run:
The 20 subsystems are not compatible

I have run "gmxcheck -f" on all the state files and found that the time
differs for two of them:
- Most replicas have:
Last frame         -1 time 14786.000
- But replicas 16 and 17 have:
Last frame         -1 time 14772.900

I have also looked at the *prev* state files, but they differ as well:
- Most of them have:
Last frame         -1 time 14772.880
- But replicas 16 and 17 have:
Last frame         -1 time 14748.300

As you can see, the *prev* states of most replicas do not match the current
states of replicas 16 and 17 either (14772.880 vs. 14772.900).
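
To spot the out-of-step replicas without reading each gmxcheck report by hand,
the reported times can be grouped per replica. This is a minimal Python sketch,
assuming the gmxcheck output for each replica has already been captured as
text. The function names and the replica-indexed dictionary are hypothetical;
only the "Last frame ... time ..." line format is taken from the output above.

```python
import re
from collections import defaultdict

# Matches lines such as "Last frame         -1 time 14786.000"
LAST_FRAME = re.compile(r"Last frame\s+(-?\d+)\s+time\s+([0-9.]+)")

def last_frame_time(gmxcheck_output):
    """Return the time on the 'Last frame' line, or None if it is absent."""
    match = LAST_FRAME.search(gmxcheck_output)
    return float(match.group(2)) if match else None

def group_by_time(outputs):
    """Map each reported time to the sorted list of replicas reporting it."""
    groups = defaultdict(list)
    for replica, text in outputs.items():
        groups[last_frame_time(text)].append(replica)
    return {time: sorted(replicas) for time, replicas in groups.items()}

# Example data rebuilt from the values quoted above (hypothetical layout:
# replica index -> captured gmxcheck output).
outputs = {i: "Last frame         -1 time 14786.000" for i in range(20)}
outputs[16] = outputs[17] = "Last frame         -1 time 14772.900"

groups = group_by_time(outputs)
for time, replicas in sorted(groups.items()):
    print(time, replicas)
```

Any time that is reported by only a few replicas points at the state files
that are out of step with the rest.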

Looking at the log files, I also see two different endings:
- Most of them end with:
Replica exchange at step 7392900 time 14785.8
Repl 0 <-> 1  dE_term = -3.084e+00 (kT)
dplumed =  0.000e+00 dE_term = -3.084e+00 (kT)
Repl ex  0 x  1    2 x  3    4 x  5    6 x  7    8 x  9   10 x 11   12 x 13
  14 x 15   16 x 17   18   19
Repl pr   1.0       1.0       1.0       1.0       1.0       .43       1.0
    1.0       1.0       .14


Step 7392990: Run time exceeded 11.385 hours, will terminate the run
           Step           Time         Lambda
        7393000    14786.00000        0.00000

Writing checkpoint, step 7393000 at Tue Nov 26 13:01:52 2013

- The offending replicas (16 and 17) end with:
Step 7393000: Run time exceeded 11.385 hours, will terminate the run
   Energies (kJ/mol)
           Bond          Angle    Proper Dih.  Improper Dih.GB Polarization
    2.41638e+03    6.14715e+03    4.95017e+03    4.27469e+02   -1.12985e+04
  Nonpolar Sol.          LJ-14     Coulomb-14        LJ (SR)   Coulomb (SR)
    3.30195e+02    2.08134e+03    3.34282e+04   -3.57626e+03   -4.28115e+04
      Potential    Kinetic En.   Total Energy  Conserved En.    Temperature
   -7.90536e+03    9.75192e+03    1.84656e+03    2.77779e+05    4.65521e+02
 Pressure (bar)
    0.00000e+00
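
The difference between the two endings can be checked automatically: the logs
that finished cleanly contain a "Writing checkpoint" line after the
termination message, and the logs of replicas 16 and 17 do not. This is a
minimal Python sketch, assuming the log text has already been read into a
string. The function name and the tail_lines parameter are hypothetical, and a
periodic checkpoint written shortly before the end could give a false
positive. The second sample tail is abbreviated.

```python
def wrote_final_checkpoint(log_text, tail_lines=20):
    """Return True if a 'Writing checkpoint' line appears near the log's end."""
    tail = log_text.splitlines()[-tail_lines:]
    return any(line.startswith("Writing checkpoint") for line in tail)

# Tail of a log that ended cleanly (copied from the output above)
good_tail = """Step 7392990: Run time exceeded 11.385 hours, will terminate the run
           Step           Time         Lambda
        7393000    14786.00000        0.00000

Writing checkpoint, step 7393000 at Tue Nov 26 13:01:52 2013
"""

# Abbreviated tail of an interrupted log (replicas 16 and 17)
bad_tail = """Step 7393000: Run time exceeded 11.385 hours, will terminate the run
   Energies (kJ/mol)
 Pressure (bar)
    0.00000e+00
"""

print(wrote_final_checkpoint(good_tail))
print(wrote_final_checkpoint(bad_tail))
```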

-> It looks like they were interrupted before writing the state file, which
leads to all these problems. I don't know how to fix this situation or how to
prevent it from happening again (currently I request 12 hours of processor
time and run mdrun with -maxh 11.5; maybe I should give it more time to
finish and run it with -maxh 11, so that it has about 1 hour to exit cleanly :/)
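
The margin left for a clean exit can be estimated from the numbers above. The
logs report "Run time exceeded 11.385 hours" with -maxh 11.5, which is
consistent with mdrun starting to terminate at 0.99 times the -maxh value.
This is a minimal Python sketch under that assumption. The 0.99 factor is
inferred from the log, and the function name is hypothetical.

```python
def exit_margin_hours(allocated_hours, maxh, stop_fraction=0.99):
    """Hours left in the allocation once mdrun starts to terminate.

    stop_fraction is an assumption inferred from the log (11.385 / 11.5).
    """
    stop_at = maxh * stop_fraction
    return allocated_hours - stop_at

# Compare the current setting with the proposed one for a 12-hour allocation
for maxh in (11.5, 11.0):
    margin = exit_margin_hours(12.0, maxh)
    print(f"-maxh {maxh}: about {margin * 60:.0f} minutes left to exit")
```

With a 12-hour allocation this gives a little over half an hour for
-maxh 11.5 and a little over an hour for -maxh 11.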

Thanks a lot for your help

