[gmx-developers] How to get the number of frames contained by an .xtc trajectory file??
Paolo Franz
paolo.franz at gmail.com
Tue Jun 5 17:12:06 CEST 2012
Thank you Tsjerk, this is indeed the solution I figured out, as mentioned in
a previous post. The only hitch is that it works only if the compiler and
C library support large files. In that case I can use #define _FILE_OFFSET_BITS 64
and fseeko instead of fseek. I tested it with a 200 GB file.
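
For reference, the estimate discussed in this thread can be sketched in a
few lines of Python (the language of Tsjerk's script). This is only a sketch
under stated assumptions: the name estimate_xtc_frames is made up for
illustration, the size field is taken to be the last four bytes of the
92-byte header (which is where that script reads it), and the file is
assumed to use the compressed-coordinate layout. Since frame sizes vary, the
result is an approximation, not an exact count.

```python
import os
import struct

XTC_HEADER_BYTES = 92   # header size quoted in the thread
SIZE_FIELD_OFFSET = 88  # assumption: size field is the last 4 header bytes


def estimate_xtc_frames(path):
    """Estimate the frame count from the first frame's size and the file size."""
    with open(path, "rb") as handle:
        handle.seek(SIZE_FIELD_OFFSET)
        raw = handle.read(4)
    if len(raw) != 4:
        raise ValueError("file too short to hold an XTC frame header")

    # The size field is a big-endian unsigned 32-bit integer.
    (payload_bytes,) = struct.unpack(">I", raw)
    frame_bytes = XTC_HEADER_BYTES + payload_bytes

    # Divide the file size by the size of the first frame and round.
    return round(os.path.getsize(path) / frame_bytes)
```

Calling estimate_xtc_frames("traj.xtc") (a hypothetical file name) returns
the estimated count. Python integers are not limited to the width of a long
int, so the large-file concern does not arise here; in C, fseeko with
_FILE_OFFSET_BITS 64 plays the same role.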
On 5 June 2012 09:45, Tsjerk Wassenaar <tsjerkw at gmail.com> wrote:
> Hi Paolo,
>
> The python code also gives a hint about the C solution... You still
> don't need to read in the first frame. Bytes 89-92 from the start (the
> last four bytes of the header) contain the size of the frame, excluding
> the 92 bytes used for the header.
> Mind that this is only an approximate size for a frame, as the size
> per frame in an xtc file is variable. But it'll probably be close. If
> you have the size of one frame, you need the size of the file, for
> which you can use the solution at
>
> http://stackoverflow.com/questions/8236/how-do-you-determine-the-size-of-a-file-in-c
> Dividing one by the other should give an indication of the number of
> frames. If you have a small C program for calculating the number of
> frames, please do post it. It might be interesting for others.
>
> Hope it helps,
>
> Tsjerk
>
> On Tue, Jun 5, 2012 at 12:40 AM, Oliver Stueker <ostueker at gmail.com> wrote:
> >
> > As far as I know there is no field at the beginning of the file that would
> > give a parser hints how many frames are in it.
> > (probably because that makes it easier/more performant to append to the file
> > while reducing the risk of corrupting it in case a write goes bad)
> >
> > On the other hand that makes it hard to implement random-access to frames in
> > XTC/TRR files.
> >
> > Interestingly there is just a discussion on the mailing list of MDAnalysis
> > (a python framework that can deal with XTC and other trajectories) on how
> > libxdr might be extended to generate a checksum-protected index for XTC
> > files, so that a given trajectory has to be read only once from beginning to
> > end.
> >
> https://groups.google.com/group/mdnalysis-discussion/browse_thread/thread/3cae3634c726f1ad
> >
> >
> > a different Oliver
> >
> >
> > On Mon, Jun 4, 2012 at 3:24 PM, Paolo Franz <paolo.franz at gmail.com> wrote:
> >>
> >> I am trying to avoid doing it by brute force, that is, reading all frames
> >> until the last is found. What I really need to do is to test
> >> if a frame exists in the trajectory. I tried with xtc_seek_frame, but that
> >> does not work. Of course, if I know how many frames there are, the test
> >> becomes trivial.
> >>
> >> That said, I definitely know what is in the trajectory and how many frames
> >> there are: I ran the md myself and I have the output file! What I want to do
> >> is to write code that figures out by itself what to expect and, if by any
> >> chance I forget what is inside, does not go into an infinite loop if I
> >> ask to analyse the wrong frame.
> >>
> >> Cheers
> >> Paolo
> >>
> >> On 4 June 2012 22:59, Justin A. Lemkul <jalemkul at vt.edu> wrote:
> >>>
> >>>
> >>> If all you need is the number of frames contained in an .xtc file, is
> >>> there some reason why running gmxcheck on the .xtc file is insufficient?
> >>>
> >>> -Justin
> >>>
> >>>
> >>> On 6/4/12 4:56 PM, Paolo Franz wrote:
> >>>>
> >>>> Hi Tsjerk,
> >>>> Thanks, but I don't really want to use a python script, I am doing this from
> >>>> some c/c++ code. I think I figured out a way to do it, but I haven't
> >>>> tested it yet:
> >>>>
> >>>> i) open the file
> >>>> ii) do a read_first_xtc
> >>>> iii) then get the file pointer position from ftellg, which should be the
> >>>> length of the frame in bytes;
> >>>> iv) place the file pointer at the end of the file with an fseek, then
> >>>> get the length with an ftellg
> >>>> v) Divide the total length by the length of a frame and obtain the
> >>>> number of written frames.
> >>>>
> >>>> I am only wondering what to do when the length in bytes of the file is
> >>>> too large for a long int!
> >>>>
> >>>> On 4 June 2012 16:11, Tsjerk Wassenaar <tsjerkw at gmail.com> wrote:
> >>>>
> >>>> Hey Paolo,
> >>>>
> >>>> I think I posted a script for extracting a last frame before, but if I
> >>>> can't even find it myself... Here it is:
> >>>>
> >>>> #!/usr/bin/env python3
> >>>>
> >>>> from struct import unpack
> >>>> import sys
> >>>>
> >>>> # Decode a 4-byte big-endian unsigned integer
> >>>> def i(x): return unpack(">I", x)[0]
> >>>>
> >>>> f = open(sys.argv[1], "rb")          # XTC files are binary
> >>>> tag = f.read(8)                      # Tag: magic number and number of atoms
> >>>> n = 92 + i(f.read(84)[-4:])          # Size of frame in bytes
> >>>>
> >>>> f.seek(0, 2)                         # Go to the end to get the file size
> >>>> f.seek(max(0, f.tell() - 5*n//4))    # This should contain a complete frame
> >>>> frame = f.read()                     # Read the remaining part in
> >>>> frame = frame[frame.index(tag):]     # Find the tag
> >>>>
> >>>> # Open the output file
> >>>> if len(sys.argv) > 2:
> >>>>     o = sys.argv[2]
> >>>> else:
> >>>>     o = sys.argv[1][:-4]+"-last.xtc"
> >>>> open(o,"wb").write(frame)
> >>>>
> >>>> ###
> >>>>
> >>>> Hope it helps. Cheers,
> >>>>
> >>>> Tsjerk
> >>>> On Mon, Jun 4, 2012 at 12:59 PM, Paolo Franz <paolo.franz at gmail.com> wrote:
> >>>> > Hello everybody!
> >>>> >
> >>>> > I am wondering how I can figure out the number of frames contained in an
> >>>> > .xtc file. Indeed, I need to read a particular frame of a trajectory and I
> >>>> > thought that the function
> >>>> > xtc_seek_frame(FILE * , int *, int *)
> >>>> > would return 0 if the frame was there and 1 when it was not. Instead, if I
> >>>> > call it with a frame outside the boundaries it seems to go into an infinite
> >>>> > loop. What am I doing wrong? Is there a way to read the last frame of an
> >>>> > .xtc file?
> >>>> >
> >>>> > Sincerely
> >>>> > Paolo
> >>>> >
> >
> >
> > --
> > gmx-developers mailing list
> > gmx-developers at gromacs.org
> > http://lists.gromacs.org/mailman/listinfo/gmx-developers
> > Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-developers-request at gromacs.org.
>
>
>
> --
> Tsjerk A. Wassenaar, Ph.D.
>
> post-doctoral researcher
> Molecular Dynamics Group
> * Groningen Institute for Biomolecular Research and Biotechnology
> * Zernike Institute for Advanced Materials
> University of Groningen
> The Netherlands