[gmx-users] Continuous trajectory processing

Wed Apr 22 04:26:02 CEST 2009

Aaron Fafarman wrote:
> Hello,
> 
> Thanks for reading this. I'm doing many-nanosecond simulations (GMX
> 3.3.1) with a 2 fs time step and I would like to analyze every single
> frame (or at minimum every tenth frame) of the trajectory for the
> presence of a hydrogen bond to one particular atom in the simulation.
> The problem with doing the analysis after the trajectory is completed
> is that the trr or xtc file would contain many millions of structures
> and therefore be way too large to store on our cluster even
> temporarily.
> 
> One solution I can imagine but don't know how to implement is to have
> g_hbond continuously process the trajectory file, printing the h_bond
> analysis for each new coordinate set and then erasing (or never
> storing) the previous coordinates. Does anyone know of an
> implementation of this, or a better way to achieve a frame-by-frame
> h-bond analysis? Perhaps there is a simple unix-based file-system
> approach to this (maybe using tempnam)? Thanks in advance.

First, I'd question whether analysing every single frame can tell you 
anything statistically that sparser sampling would not. Successive MD 
or MC frames are correlated, so a trajectory of N steps gives far fewer 
than N independent samples from the ensemble.
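
To make the point about correlated frames concrete, here is a rough 
Python sketch (not something GROMACS provides) that estimates how many 
effectively independent samples a per-frame series contains, using the 
integrated autocorrelation time. The function names, and the idea of 
feeding it a 0/1 series marking whether the hydrogen bond is present in 
each frame, are assumptions for illustration only. The sum is truncated 
at the first non-positive autocorrelation, which is a common but crude 
choice.

```python
def autocorrelation(series, lag):
    """Normalised autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0.0:
        raise ValueError("series is constant; autocorrelation undefined")
    cov = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return cov / var


def effective_sample_size(series, max_lag=None):
    """Estimate N / g, where g = 1 + 2 * sum_t (1 - t/N) * rho(t).

    The sum stops at the first lag where rho(t) <= 0 (a simple cutoff,
    assumed here to keep noise in the tail from dominating).
    """
    n = len(series)
    if max_lag is None:
        max_lag = n // 2
    g = 1.0  # statistical inefficiency; 1.0 means uncorrelated frames
    for lag in range(1, max_lag + 1):
        rho = autocorrelation(series, lag)
        if rho <= 0.0:
            break
        g += 2.0 * rho * (1.0 - lag / n)
    return n / g
```

If the estimate comes out at a small fraction of the number of frames, 
saving and analysing every frame buys little over saving one frame per 
correlation time.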

Next, it would be straightforward to partition the calculation into 
manageable chunks, run the analysis as each chunk completes, and then 
use trjconv to cut down the data you keep from that chunk before 
continuing the calculation.
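
The chunked workflow could be driven by something like the Python 
sketch below. It only does the bookkeeping: run_chunk, analyze_chunk 
and reduce_chunk are hypothetical callables you would have to write 
yourself (for example, wrappers that invoke mdrun, g_hbond and trjconv 
with whatever options your GROMACS version needs). No actual GROMACS 
command lines are assumed here.

```python
def plan_chunks(total_ps, chunk_ps):
    """Split [0, total_ps] into consecutive (start, end) windows in ps."""
    total_ps = float(total_ps)
    chunk_ps = float(chunk_ps)
    if total_ps <= 0.0 or chunk_ps <= 0.0:
        raise ValueError("total_ps and chunk_ps must be positive")
    chunks = []
    start = 0.0
    while start < total_ps:
        end = min(start + chunk_ps, total_ps)  # last chunk may be shorter
        chunks.append((start, end))
        start = end
    return chunks


def process_in_chunks(total_ps, chunk_ps, run_chunk, analyze_chunk, reduce_chunk):
    """Run, analyse, then shrink each chunk before starting the next.

    All three callables are placeholders supplied by the caller:
      run_chunk(index, start_ps, end_ps) -> handle to the full trajectory
      analyze_chunk(handle)              -> analysis result for that chunk
      reduce_chunk(handle)               -> thin out or delete the full data
    """
    results = []
    for index, (start, end) in enumerate(plan_chunks(total_ps, chunk_ps)):
        trajectory = run_chunk(index, start, end)
        results.append(analyze_chunk(trajectory))
        reduce_chunk(trajectory)  # only one full-resolution chunk exists at a time
    return results
```

With this ordering, peak disk usage is set by the size of one chunk 
rather than by the whole trajectory.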

You should also choose your values of nstxout, nstvout, nstenergy, 
nstxtcout and xtc-groups carefully in advance, since they control how 
often, and for which atoms, output is written. See the wiki page about 
restarts.
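
For a feel of what a given output interval costs, a back-of-envelope 
estimate in Python might look like the following. The helper names are 
made up for this sketch. The 12 bytes per atom figure is just three 
single-precision coordinates with no header, velocities or forces, so 
treat it as a rough guide for uncompressed output; xtc files are 
compressed and will be smaller. It also assumes a frame is written at 
step 0 as well as at every multiple of the interval.

```python
def steps_for(sim_ns, dt_fs):
    """Number of MD steps needed to cover sim_ns at a time step of dt_fs."""
    return round(sim_ns * 1e6 / dt_fs)  # 1 ns = 1e6 fs


def frames_written(nsteps, nst_out):
    """Frames written when saving every nst_out steps (0 treated as 'none')."""
    if nst_out <= 0:
        return 0
    return nsteps // nst_out + 1  # +1 assumes the step-0 frame is saved too


def estimate_bytes(nsteps, nst_out, natoms, bytes_per_atom=12):
    """Rough coordinate-only size: frames * atoms * bytes per atom."""
    return frames_written(nsteps, nst_out) * natoms * bytes_per_atom
```

As an example, 10 ns at a 2 fs time step is 5,000,000 steps; saving 
every tenth step for a hypothetical 30,000 saved atoms gives 500,001 
frames and roughly 180 GB of raw coordinates by this estimate. 
Restricting xtc-groups to the atoms you actually need reduces natoms 
accordingly.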

Mark