[gmx-users] Storage of large output files

Thu Aug 9 10:16:17 CEST 2007

Monika Sharma wrote:
> Dear All,
> We have started our venture into MD recently, for which we are using our 
> in-house resources. Now that MD runs are giving very large output files 
> like for trr files. The files keep piling up and using spaces on the 
> work machines. This is creating problems with the depletion of space 
> with every run. Can anyone please suggest an "economical and efficient" 
> way how to take backup of such a large files of the order of Gb or so, 
> so that we dont end up piling up our work machines with such files. And 
> the data need to be saved for future references..

First, consider whether you are producing more output than you need. 
Look at the options for output frequency of positions and velocities in 
.trr files, whether you should be using .xtc files, and whether you 
should only be outputting subsets of your data.

Normally you only want a full frame of positions and velocities in your 
.trr file at the frequency with which you might ever want to do an exact 
restart (and make sure your energy output frequency is an integer 
multiple of it, so you also have energies at those steps). This 
frequency is invariably much lower than the frequency at which you want 
output data for analysis. If you only need position data for your solute 
for your later analysis, then writing only that group to an .xtc file, 
at the lowest frequency your analysis can tolerate, will take a tiny 
fraction of the space of a .trr file of the whole system with positions 
and velocities at every step. Be aware that analysis types that require 
autocorrelation functions need data sampled at intervals much shorter 
than the characteristic times of the system.
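
To get a feel for the numbers, here is a rough back-of-the-envelope 
sketch in Python. It is not a GROMACS tool. The system size, run length, 
output intervals and the ~4 bytes per atom figure for compressed .xtc 
positions are all assumptions made up for illustration, and per-frame 
headers and box data are ignored.

```python
# Rough trajectory size estimate; ignores per-frame headers and box data.
BYTES_PER_REAL = 4  # ASSUMPTION: single-precision build


def trajectory_bytes(n_atoms, n_steps, interval, bytes_per_atom):
    """Approximate bytes written when a frame is saved every `interval` steps."""
    if interval <= 0:
        return 0  # treat a non-positive interval as "never written"
    n_frames = n_steps // interval
    return n_atoms * bytes_per_atom * n_frames


# Positions + velocities: two vectors of three reals per atom.
TRR_XV_BYTES = 2 * 3 * BYTES_PER_REAL
# ASSUMPTION: roughly 4 bytes per atom after .xtc compression; the real
# figure depends on the system and on the chosen precision.
XTC_X_BYTES = 4

N_STEPS = 1_000_000      # hypothetical run length
N_ATOMS_ALL = 100_000    # hypothetical whole system
N_ATOMS_SOLUTE = 5_000   # hypothetical solute group

# Whole system, positions and velocities, every step.
full_trr = trajectory_bytes(N_ATOMS_ALL, N_STEPS, 1, TRR_XV_BYTES)
# Whole system, positions and velocities, only at restart points.
restart_trr = trajectory_bytes(N_ATOMS_ALL, N_STEPS, 50_000, TRR_XV_BYTES)
# Solute positions only, compressed, every 500 steps.
solute_xtc = trajectory_bytes(N_ATOMS_SOLUTE, N_STEPS, 500, XTC_X_BYTES)

for label, size in [
    ("full .trr every step", full_trr),
    ("restart-only .trr", restart_trr),
    ("solute-only .xtc", solute_xtc),
]:
    print(f"{label:22s} {size / 1e9:10.3f} GB")
```

Plug in your own atom counts and intervals; the point is only that the 
restart-only .trr plus a solute-only .xtc come out orders of magnitude 
smaller than a whole-system .trr written every step.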

Mark