[gmx-users] System volume "jumps" on exact continuations

Thu Jun 1 15:24:41 CEST 2017

Hi,

Thanks for the report.

On Wed, May 31, 2017 at 4:56 PM Elizabeth Ploetz <ploetz at ksu.edu> wrote:

> Dear GMX-USERS,
>
> We are observing abrupt, discontinuous “jumps” in our simulation box
> volume for different chunks of trajectory when performing exact
> continuations of standard, explicit solvent, MD simulations in the NPT
> ensemble. Note that these volume jumps do not appear to occur after every
> single continuation (there seems to be some element of randomness);
> however, they occur at some point in almost all of our different
> simulations. There are no visually apparent problems with the trajectories
> upon viewing them in e.g., Pymol. We are not able to visualize anything
> unusual, such as “bubbles” for cases where the system volume jumps to
> larger values.
>
> Here is what we have tested:
>
>
>   *   Temperature/Pressure Coupling Methods: Nosé-Hoover/Parrinello-Rahman
> and Berendsen/Berendsen are both affected
>   *   Gromacs Versions: 4.6, 4.6.1, 5.0, and 2016 are all affected
>   *   Machines/Architectures: three different clusters (we don’t know of a
> machine where it does not occur) are all affected:
>      *   the lab’s cluster where we installed Gromacs ourselves
>      *   the university cluster where Gromacs was installed by the
> university IT staff
>      *   the Hansen cluster at Purdue, which we access only through a GUI
> at the DiaGrid portal
>   *   Systems: Pure water systems as well as solvated single solute
> systems where the solute is a single protein, LJ sphere, or
> buckminsterfullerene are all affected
>   *   Force fields: versions of AMBER, CHARMM, GROMOS, and OPLS force
> fields are all affected
>   *   Box shape: cubic boxes and rhombic dodecahedrons are both affected
>   *   Continuation method: Restarts of killed simulations (killed on
> purpose to increase the numbers of nodes as resources became available --
> we don't do this anymore, but this was how we first saw the problem) and
> exact continuations of completed simulations are both affected
>   *   Number of Nodes: We were almost convinced that it happened whenever
> we ran on >1 node (our usual situation), but not if we ran on 1 node only.
> We did some tests on one node on the lab’s cluster with and without MPI, to
> see whether or not it was MPI. System volume jumps were not (yet?) observed
> regardless of whether or not MPI was used. However, on the university
> cluster, we did tests of 24 processors using "-pe mpi-fill 24". Usually the
> 24 processors were spread across >1 node, but sometimes they completely
> filled one node. There was one instance where there was a system volume
> jump when the 24 processors changed from being distributed across >1 node
> to being on 1 node only. However, MPI was used in all of those simulations.
> So, we still have not proved that it is not MPI that is a problem.
> Unfortunately, this test result is still murky. Perhaps we should not have
> even mentioned it.
>   *   Cut-off Scheme: We have done significantly fewer simulations with
> the Verlet cut-off scheme; however, so far the system volume jumps have not
> occurred with Verlet. We are continuing these tests.
>

The simplest explanation is that your system has a phase transition that
happens eventually :-) We don't know what's in your system, so it's hard to
suggest what is going wrong.

> Here is how we do exact continuations:
> gmx_mpi convert-tpr -s previous.tpr -extend 20000 -o next.tpr
> mpiexec gmx_mpi mdrun -v -deffnm next -cpi previous.cpt -noappend >& myjob
>

Fine

> Here<
> https://docs.google.com/document/d/1NUi3H7agEFMeEq5qxTNXWopaFXqIjUbHpLUqifVRVys/edit?usp=sharing>
> is an example (CHARMM22*-based) MDP file (only used for the initial
> production run).
>

That is not a correct way to model with CHARMM (in particular those cutoffs
and cutoff treatments). This was tested extensively by CHARMM force field
developers, and their recommendations are at
http://www.gromacs.org/Documentation/Terminology/Force_Fields/CHARMM.
You're also generating velocities at the start of your production run,
which is not a great idea (but I don't think it is related to your
problem). Do an exact restart from the end of your equlibration, e.g. gmx
grompp ... -t equil_state.cpt -o production. Your LINCS settings are
unnecessarily strict - use the recommendations in the mdp documentation.

> We have performed the Gromacs regression tests (with one processor, >1
> processor but only 1 node, and with more processors than will fit on one
> node) on the two machines that we have command line access to (lab and
> university clusters). Some of the regression tests reset the number of
> processors to 8 and thus are running on only one node. But, for what it is
> worth, all tests passed.
>
> A linked Photobucket image<
> http://i1243.photobucket.com/albums/gg545/ploetz/box-volume_zps0foiwzs9.png>
> shows the same system ran for five 1-ns chunks with exact continuations on
> 2 machines: the university's or the lab's cluster. In Row 1, note that a
> particular system box volume is obtained on the university cluster with 1
> node and MPI. No system volume jumps are noted. However, the average volume
> is not the same when ran on the lab’s cluster with 3 nodes, and a system
> volume jump occurs at 2ns (2nd Row). The system volume does at least
> roughly match that of the university cluster when ran on the lab's cluster
> with 1 node, both without (3rd Row) or with (4th Row) MPI.
>

Note that the second simulation already had a different volume before it
started. That suggests it is already a different system (compare the tpr
files with gmx check -s1 -s2 to see any silly errors), or has already had a
phase transition during your equilibration.

> Some may view these as small volume jumps, but they correspond to the
> volume of many water molecules (1 water has a volume of ~0.03 nm3). Often
> the volume jumps are larger than those shown in the linked figure (e.g., ~7
> nm3).
>
> When a jump occurs, the system stays at that new volume; it is not a
> "blip" that then re-adjusts to the original volume. These volume jumps seem
> suggestive of changes to the simulation settings or FF parameters occurring
> during the continuations.
>

Or that e.g. you have a membrane that goes in or out of a gel-like state
under the conditions you are using. Or look at RMSD or radius of gyration
time series of your protein... it's hard to help when we don't know what
your system has in it.

> We would greatly appreciate any advice. Uncertainty of the system volume
> prevents us from trusting any subsequent analysis.
>

It looks intrinsic to what you are doing, but perhaps could be an artefact
of choosing arbitrary non-bonded settings for CHARMM.

Mark

Thanks,
>
> Elizabeth
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-request at gromacs.org.
>