[gmx-users] Problem with REMD

Thu Jul 26 10:57:00 CEST 2012

On 26/07/2012 5:32 PM, Antonia Mey wrote:
> Hi Gromacs users,
> I seem to face some problems with my REMD dynamics.
> I follow as much as possible the set up description of: http://jcp.aip.org/resource/1/jcpsa6/v135/i14/p145102_s1?isAuthorized=no
> Just to summarise:
> Alanine dipeptide in explicit water: minimization, nvt, and npt equilibration at each temperature of my 24 replicas between 300K and 500K exponentially spaced. The box volume is the same for each replica, when I do an nvt production run.
> I have my inputfiles named md*.mdp, where * goes from 0->24 i produce my tpr files successfully.
> I run a nvt REMD simulation for 20ns where I swap every 1ps. The resulting  exchange statistics seem reasonable to me. Here is an example from one of the log files:
> Replica exchange statistics
> Repl  19999 attempts, 10000 odd, 9999 even
> Repl  average probabilities:
> Repl     0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23
> Repl      .32  .32  .33  .32  .34  .35  .35  .37  .36  .38  .37  .39  .39  .40  .40  .42  .42  .43  .44  .44  .45  .45  .46
> Repl  number of exchanges:
> Repl     0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23
> Repl     3157 3173 3245 3244 3353 3404 3492 3684 3651 3752 3769 3854 3849 4074 4049 4102 4187 4250 4322 4349 4475 4519 4562
> Repl  average number of exchanges:
> Repl     0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23
> Repl      .32  .32  .32  .32  .34  .34  .35  .37  .37  .38  .38  .39  .38  .41  .40  .41  .42  .43  .43  .43  .45  .45  .46
>
>
> all good so far. Now to the analysis where the problem arises:
> Here is my post production script in order to extract constant temperature trajectories:

You're not the first to make this wrong assumption, but GROMACS actually 
exchanges coordinates among (sets of) cpus, which write to the same 
output file, i.e. the trajectories written by mdrun are already at 
constant temperature. (That's inefficient because you have to re-do the 
domain decomposition, and is likely an artefact of the pre-existing 
mdrun -multi functionality from which the REMD implementation derives. I 
have a private GROMACS version off 4.5.5 that scales better to large 
processor sets by exchanging only T (and some minor details) and only 
doing pairwise communication during exchange attempts - however if it 
sees the light of day, it would be in at least GROMACS 5.0.)

> rm -f total.log
>         for j in {0..23}
>         do
>                 cat md${j}.log >> total.log
>         done
>         echo "demuxing replica"
>         demux total.log
>         trjcat -f *.xtc -demux replica_index.xvg
>
>         echo "now getting the ramachandran diagrams"
>         for j in {0..23}
>         do
>                 g_rama -f ${j}_trajout.xtc -s ../../tprInput/md${j}.tpr -o rama${j}.xvg
>         done
> This also runs through without any problems...
> Now I construct a histogram based on my 6 metstable states in my Ramachandran digram, as defined in:
> http://jcp-bcp.aip.org/resource/1/jcpbcp/v5/i6/p06B613_s1?isAuthorized=no
>
> I literally just count how many point in my diagram belong to states 1->6. This should give me an estimate of my equilibrium distribution of each state at each temperature (if i do the counting for each Ramachandran diagram at each temperature). Now here comes the problem:
> The probability for each temperature is the same, also the transition rates between states at different temperatures changes only very slightly. I have taken averages over 10 * 20ns REMD trajectories and my state probabilities look much like this:
> http://www.nottingham.ac.uk/~ppxasjsm/StatDist.png
>
> This behaviour is much too flat if i compare these results to the two papers mentioned above.

I'd say your script above ensures this result if your generalized 
ensemble is equilibrated and converged (which it should be).

>   Also at 300K data points from a 40ns equilibrium run were included as well as at 500K the same, same states have the same colours. The values for 300K correspond to the findings of the two papers. This equilibrium run was started from the same mdp file as used for the REMD simulation, but this time just a single temperature was run.
> Thus I am not able to reproduce my equilibrium distribution from my REMD simulations. Below is my sample input file.
> Just as a sanity check I also tried to create the the distribution from each replica file (i.e not building the continuous temperature trajectory) and I get the same result...

Good thought, but I'd double check that :-)

> Something else I noticed is that if I do a g_energy -f md0.edr on my output files the average temperature I get from these is always that of the input trajectory (this seems strange too me)

Yes, by construction.

>
> ; Run parameters
> integrator      = sd            ; leap-frog integrator
> nsteps          = 10000000      ; 20 ns
> dt              = 0.002         ; 2 fs
> ; Output control
> nstxout         = 0             ; don't save in trr format
> nstvout         = 0             ;
> nstxtcout       = 50           ; save xtc every 0.1ps
> nstenergy       = 50           ; save energies every 0.1 ps
> nstlog          = 2500          ; update log file every 5 ps
> ; Bond parameters
> continuation    = yes           ; first dynamics run
> constraint_algorithm = lincs    ; holonomic constraints
> constraints     = h-bonds       ; all bonds (even heavy atom-H bonds) constrained
> lincs_iter      = 1             ; accuracy of LINCS
> lincs_order     = 4             ; also related to accuracy
> ; Neighborsearching
> ns_type         = grid          ; search neighboring grid cells
> nstlist         = 5             ; 10 fs
> rlist           = 1.0           ; short-range neighborlist cutoff (in nm)
> rcoulomb        = 1.0           ; short-range electrostatic cutoff (in nm)
> rvdw            = 1.0           ; short-range van der Waals cutoff (in nm)
> ; Electrostatics
> coulombtype     = PME           ; Particle Mesh Ewald for long-range electrostatics
> pme_order       = 4             ; cubic interpolation
> fourierspacing  = 0.16          ; grid spacing for FFT
> ; Temperature coupling is on
> tcoupl          = V-rescale     ; modified Berendsen thermostat
> tc-grps         = System        ; two coupling groups - more accurate
> tau_t           = 0.1           ; time constant, in ps
> ref_t           = 300           ; reference temperature, one for each group, in K
> ; Pressure coupling is off
> pcoupl          = no            ; no pressure coupling in NVT
> ; Periodic boundary conditions
> pbc             = xyz           ; 3-D PBC
> ; Dispersion correction
> DispCorr        = EnerPres      ; account for cut-off vdW scheme
> ; Velocity generation
> gen_vel         = no            ; continuation
> energygrps      = protein water ;
> xtc_grps        = Protein
> gen_temp        = 300
> gen_seed        = -1
> ld_seed         = -1
>
>
> (I also tried md as an integrator. The seed set on different processors always seems to be the same somehow..(that should only give me correlations in the data i guess)

Meh, chaotic equilibration will take care of that, but the less said 
about the RNG state preservation in GROMACS 4.x, the better.

>   I am not sure if I have to set gen_temp, but some tutorials suggested this, so I did)
>
> Just one last note. This is pretty much the first time I am using Gromacs and I am obviously overseeing something, which I cannot find. I have written my own parallel tempering code before, so I am very familiar with the concepts, just not with the Gromacs software.
>
> If the mystery could be solved this would be greatly appreciated, of if more information is required I am happy to provide it.

The most likely explanation is that you've somehow not managed to do 
your analysis on constant-T trajectories.

Mark