[gmx-users] intermittent changes in energy drift following simulation restarts in v4.6.1

Mark Abraham mark.j.abraham at gmail.com
Mon Sep 9 16:45:30 CEST 2013


Sounds worrying :-( Thanks for the detailed report and
trouble-shooting! So far, I can't think of a reason for it.

A couple of suggestions:
* try again with 4.6.3 (at least while trouble-shooting) in case it's an already-fixed bug
* post a representative .mdp file
* is there anything out of the ordinary in the topology?
* if the problem is restart-related and shows up in the drift quickly,
then you can probably find a reproducible case with a job that does
lots of short-interval restarts and saves all the intermediate files
(a rough sketch of such a loop is below) - a (set of) inputs that
reproduces the problem is what we'd need in order to diagnose and/or
fix anything
* does it happen in a non-multi simulation? (or more particularly,
what are you doing with -multi?)
* check .log files for warnings, and that there are none being
suppressed at the grompp stage
* see if the group cut-off scheme in 4.6.x shows the same problem
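
For the short-interval-restart idea above, here is a rough, untested
sketch (the file name, the saved/ directory, the cycle count and the tiny
-maxh value are only placeholders) of a loop that churns through many
restart cycles while keeping a copy of the checkpoint and energy file
from each cycle, so that a cycle where the drift jumps can be replayed
later:

FILE=testcase
mkdir -p saved
# the very first cycle can be started without -append -cpi, as in your
# original script; every later pass restarts from the latest checkpoint
for i in $(seq 1 20); do
    aprun -n $NPROC -N $NTASK mdrun_mpi_d -deffnm $FILE -maxh 0.1 -append -cpi
    # keep this cycle's intermediate state and energy file for later comparison
    cp ${FILE}.cpt saved/${FILE}_cycle${i}.cpt
    cp ${FILE}.edr saved/${FILE}_cycle${i}.edr
done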

Mark


On Mon, Sep 9, 2013 at 4:08 PM, Richard Broadbent
<richard.broadbent09 at imperial.ac.uk> wrote:
> Dear All,
>
> I've been analysing a series of long (200 ns) NVE simulations (md
> integrator) on ~93,000-atom systems. I ran the simulations in groups of 3
> using the -multi option in GROMACS v4.6.1, double precision.
>
> Simulations were run with 1 OpenMP thread per MPI process.
>
> The simulations were restarted at regular intervals using the following
> submission script:
>
>
> FILE=4.6_P84_DIO_
>
> module load fftw xe-gromacs/4.6.1
>
> # Change to the directory that the job was submitted from
> cd $PBS_O_WORKDIR
>
> export NPROC=`qstat -f $PBS_JOBID | grep mppwidth | awk '{print $3}'`
> export NTASK=`qstat -f $PBS_JOBID | grep mppnppn  | awk '{print $3}'`
>
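> # -cpi restarts from the most recent checkpoint; -append continues writing
> # to the existing output files instead of starting new ones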
> aprun -n $NPROC -N $NTASK mdrun_mpi_d  -deffnm $FILE  -maxh 24 -multi 3
> -npme 64 -append -cpi
>
>
>
> ###
>
> The first simulation was run with the same script except the mdrun line was
>
> aprun -n $NPROC -N $NTASK mdrun_mpi_d  -deffnm $FILE  -maxh 24 -multi 3
> -npme 64
>
> ###
>
>
> The simulations generally ran and restarted without trouble; however, in
> several simulations the energy drift changed radically following a
> restart.
>
> In one case the simulation ran for 50 ns (including one restart) with a
> drift of -141.6 +/- 0.1 kJ mol^-1 ns^-1. It was then restarted and had a
> drift of +104 +/- 1 kJ mol^-1 ns^-1 for 15 ns, then was restarted again and
> continued with a drift of -138 +/- 0.1 kJ mol^-1 ns^-1 for a further 50 ns.
>
> The other two simulations running in parallel with this one through the
> -multi option did not show any change in drift.
>
> The drifts were calculated by a least-squares fit to the total-energy data
> extracted with
>
> echo "total" | g_energy_d -f ${FILE}${i}.edr -o total_${FILE}${i}.xvg -xvg
> none
>
>
> The simulation writes to the .edr file every 20 ps. The transition is
> masked by the expected oscillations in the energy due to the integrator on
> a ~2 ns scale, but the change in drift is clear when looking at a 4 ns
> range centred on the restart.
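>
> For example, to isolate a 4 ns window around a restart at, say, t = 50000 ps
> (the times here are only illustrative):
>
> echo "total" | g_energy_d -f ${FILE}${i}.edr -b 48000 -e 52000 \
>     -o total_window_${FILE}${i}.xvg -xvg none
>
> and the Tot-Drift that g_energy_d prints for such a window gives a quick
> cross-check on the least-squares fit.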
>
> The hardware used was of the same specification for all jobs: 27 Cray XE6
> nodes (9 nodes per simulation), with 32 MPI processes per node.
>
> The simulations use the Verlet cut-off scheme, and H-bond constraints are
> enforced using LINCS (order 6, 2 iterations).
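> In .mdp terms that is roughly (parameter names written from memory rather
> than pasted from the actual input):
>
> integrator           = md
> cutoff-scheme        = Verlet
> constraints          = h-bonds
> constraint-algorithm = lincs
> lincs-order          = 6
> lincs-iter           = 2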
>
>
> I can't think what would cause this change in the drift across a restart.
> However, I have seen it in simulations run on both an AMD system (Cray XE6,
> AVX-FMA) and an Intel system (SGI ICE, SSE4.1).
>
>
> I have some data generated with the same procedure under v4.5.5 and v4.5.7
> (with a different cut-off scheme), and the restarts in those runs have not
> caused any appreciable changes in the simulations.
>
> Unfortunately I didn't save the checkpoint files used for the restart (I
> will in the future). I'm going to try building a new input file from just
> before the restart using the .trr trajectory data.
>
>
> Does anyone have any ideas of what might have caused this?
>
> Has anyone seen similar effects?
>
> Thanks,
>
> Richard
> --


