[gmx-developers] multiple xtc output files (optionally in chunks)

Fri Oct 19 15:21:19 CEST 2007

David van der Spoel wrote:
> Berk Hess wrote:
>> Carsten Kutzner wrote:
>>
>>> Hi,
>>>
>>> I wonder if we could have some new options for the XTC output of
>>> trajectories (I would also volunteer to implement them).
>>>
>>> The problem that we face in our group is the following. Since we do not
>>> use a queueing system, the trajectories naturally get very large. This
>>> is a headache for our backup system which has to back up each open file
>>> as a whole, even if only a few bytes were added since the last backup.
>>>
>>> What would be a nice solution is to allow two (or better multiple) XTC
>>> output files with individual options (nstxtcout1, xtc-grps1,
>>> nstxtcout2,..., similar to the new mdp pull options) and an extra
>>> parameter nstxtcchop1 for when to close an XTC and start writing to a
>>> new one.
>>>
>>> You could then e.g. save your protein *without* water in one single,
>>> small XTC file (nstxtcchop1=0) which is kept on disk for further
>>> analysis, while the large trajectory with water is saved in several
>>> chunks (and e.g. archived).
>>>
>>> Please let me know what you think.
>>>
>>> Regards,
>>>  Carsten
>>>
>>>
>>>  
>>>
>> Multiple xtc files for different groups could be useful, but I have
>> never had a need for this, nor did I get any requests.

It would save reading/processing of parts (e.g. solvent) in which one might not
be interested but which one still wants to keep just in case. It would also allow
writing the coordinates of a subset (e.g. the protein) more frequently than the
rest.

>>
>> I think chopping up files is useless.
>> The much easier solution here is to chop up your whole simulation into
>> parts and use tpbconv to continue runs, just like when you would have a
>> queueing system (which you should install in Goettingen anyhow).
>>

I don't agree. Why use a queueing system if one can do without? It only creates
extra (CPU and I/O) overhead. It sounds a bit like using individual bottles of
water instead of opening the tap if you want to take a bath. But obviously if
we're the only ones that see it that way there's no need to change the code.

> I agree, but what we should implement, and what is also on the todo list
> on the wiki, is to be able to analyze N trajectories as one, e.g.
> g_hbond -f a.xtc b.xtc c.xtc and so on.
> 
> 

This is obviously a useful development in any case. But it's not a replacement
for multiple output files (in case one needs e.g. certain coordinates more
frequently than others).
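
To make the proposal concrete, here is a rough sketch in Python of how two
output streams with their own write interval and chop interval might map frames
to files. This is not GROMACS code: the field names nstxtcout and nstxtcchop
simply follow Carsten's suggested options, and the class, the function and the
".partNNNN" file naming are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class XtcOutput:
    """One hypothetical output stream (names follow the proposal, not GROMACS)."""
    prefix: str      # base name for the output file(s)
    group: str       # index group to write, e.g. "Protein" or "System"
    nstxtcout: int   # write a frame every nstxtcout steps
    nstxtcchop: int  # start a new file every nstxtcchop steps; 0 = never chop

    def filename(self, step: int) -> str:
        # A single file when chopping is disabled, otherwise numbered chunks.
        if self.nstxtcchop == 0:
            return f"{self.prefix}.xtc"
        return f"{self.prefix}.part{step // self.nstxtcchop:04d}.xtc"


def plan_writes(outputs, nsteps):
    """Return {filename: [steps]} showing which frames land in which file."""
    plan = {}
    for step in range(nsteps + 1):
        for out in outputs:
            if step % out.nstxtcout == 0:
                plan.setdefault(out.filename(step), []).append(step)
    return plan


# Protein only, frequent output, one small file kept on disk;
# full system, sparse output, chopped into chunks that can be archived.
outputs = [
    XtcOutput("protein", "Protein", nstxtcout=100, nstxtcchop=0),
    XtcOutput("system", "System", nstxtcout=500, nstxtcchop=1000),
]
plan = plan_writes(outputs, nsteps=2000)
for name, steps in plan.items():
    print(name, len(steps), "frames")
```

The point is only that each stream carries its own interval and chop setting, so
the frequent protein-only file stays in one piece while the large file is closed
at regular intervals and no longer changes between backups.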

Bert

______________________________________
Bert de Groot, PhD

Max Planck Institute for Biophysical Chemistry
Computational biomolecular dynamics group
Am Fassberg 11
37077 Goettingen, Germany

tel: +49-551-2012308, fax: +49-551-2012302
http://www.mpibpc.mpg.de/groups/de_groot