[gmx-developers] How to get the number of frames contained by an .xtc trajectory file??

Paolo Franz paolo.franz at gmail.com
Tue Jun 5 17:12:06 CEST 2012


Thank you Tsjerk, this is indeed the solution I figured out, as mentioned in
a previous post. The only hitch is that it works only if the compiler
supports large files. In that case I can use #define _FILE_OFFSET_BITS 64
and fseeko instead of fseek. I tested it with a 200 GB file.
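
A minimal sketch of that estimate, for anyone who wants to try it. It
assumes the layout described further down in this thread (a 92-byte frame
header whose last four bytes hold the big-endian length of the compressed
payload), a trajectory with more than 9 atoms (smaller systems are stored
uncompressed with a different layout), and a payload padded to a multiple of
4 bytes as XDR data normally is. The function name estimate_xtc_frames is
hypothetical and not part of GROMACS, and because frame sizes vary the
result is only an estimate.

```c
/* Must come before any #include so that off_t is 64 bits wide. */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not a GROMACS function): estimate the number of
 * frames in an .xtc file from the size of its first frame.
 * Returns -1 if the file cannot be opened or read.
 */
static int64_t estimate_xtc_frames(const char *path)
{
    unsigned char b[4];
    int64_t payload, frame_size;
    off_t file_size;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    /* Assumed layout: payload length at file offsets 88-91, big-endian. */
    if (fseeko(fp, 88, SEEK_SET) != 0 || fread(b, 1, 4, fp) != 4) {
        fclose(fp);
        return -1;
    }
    payload = ((int64_t)b[0] << 24) | ((int64_t)b[1] << 16) |
              ((int64_t)b[2] << 8) | (int64_t)b[3];

    /* 92-byte header plus the payload rounded up to a 4-byte boundary. */
    frame_size = 92 + ((payload + 3) / 4) * 4;

    /* File size via fseeko/ftello, which use the 64-bit off_t. */
    if (fseeko(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    file_size = ftello(fp);
    fclose(fp);
    if (file_size < 0)
        return -1;

    return (int64_t)file_size / frame_size;
}
```

Calling estimate_xtc_frames("traj.xtc") then gives a rough upper bound to
check a requested frame index against before seeking; a frame that passes
the check may still be missing if later frames are larger than the first.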


On 5 June 2012 09:45, Tsjerk Wassenaar <tsjerkw at gmail.com> wrote:

> Hi Paolo,
>
> The Python code also gives a hint about the C solution... You still
> don't need to read in the first frame. Bytes 81-84 after the 8-byte tag
> (file offsets 88-91) contain the size of the frame, excluding the 92
> bytes used for the header.
> Mind that this is only an approximate size for a frame, as the size
> per frame in an xtc file is variable. But it'll probably be close. If
> you have the size of one frame, you need the size of the file, for
> which you can use the solution at
>
> http://stackoverflow.com/questions/8236/how-do-you-determine-the-size-of-a-file-in-c
> Dividing one by the other should give an indication of the number of
> frames. If you have a small C program for calculating the number of
> frames, please do post it. It might be interesting for others.
>
> Hope it helps,
>
> Tsjerk
>
> On Tue, Jun 5, 2012 at 12:40 AM, Oliver Stueker <ostueker at gmail.com> wrote:
> >
> > As far as I know there is no field at the beginning of the file that would
> > give a parser hints how many frames are in it.
> > (probably because that makes it easier/more performant to append to the file
> > while reducing the risk of corrupting it in case a write goes bad)
> >
> > On the other hand that makes it hard to implement random-access to frames in
> > XTC/TRR files.
> >
> > Interestingly there is just a discussion on the mailing list of MDAnalysis
> > (a python framework that can deal with XTC and other trajectories) on how
> > libxdr might be extended to generate a checksum-protected index for XTC
> > files, so that a given trajectory has to be read only once from beginning to
> > end.
> >
> > https://groups.google.com/group/mdnalysis-discussion/browse_thread/thread/3cae3634c726f1ad
> >
> >
> > a different Oliver
> >
> >
> > On Mon, Jun 4, 2012 at 3:24 PM, Paolo Franz <paolo.franz at gmail.com> wrote:
> >>
> >> I am trying to avoid doing it by brute force, that is, reading all frames
> >> until the last is found. Originally, what I really need to do is to test
> >> if a frame exists in the trajectory. I tried with xtc_seek_frame, but that
> >> does not work. Of course, if I know how many frames there are, the test
> >> becomes trivial.
> >>
> >> That said, I definitely know what is in the trajectory and how many frames
> >> there are: I ran the MD myself and I have the output file! What I want to do
> >> is to write code that figures out by itself what to expect and, if by any
> >> chance I forget what is inside, does not go into an infinite loop if I
> >> ask it to analyse the wrong frame.
> >>
> >> Cheers
> >> Paolo
> >>
> >> On 4 June 2012 22:59, Justin A. Lemkul <jalemkul at vt.edu> wrote:
> >>>
> >>>
> >>> If all you need is the number of frames contained in an .xtc file, is
> >>> there some reason why running gmxcheck on the .xtc file is insufficient?
> >>>
> >>> -Justin
> >>>
> >>>
> >>> On 6/4/12 4:56 PM, Paolo Franz wrote:
> >>>>
> >>>> Hi Tsjerk,
> >>>> Thanks, but I don't really want to use a Python script; I am doing this from
> >>>> some C/C++ code. I think I figured out a way to do it, but I haven't
> >>>> tested it yet:
> >>>>
> >>>> i)    open the file
> >>>> ii)   do a read_first_xtc
> >>>> iii)  then get the file pointer position from ftellg, which should be the
> >>>>       length of the frame in bytes;
> >>>> iv)   place the file pointer at the end of the file with an fseek, then
> >>>>       get the length with an ftellg
> >>>> v)    divide the total length by the length of a frame and obtain the
> >>>>       number of written frames.
> >>>>
> >>>> I am only wondering what to do when the length in bytes of the file is
> >>>> too large for a long int!
> >>>>
> >>>> On 4 June 2012 16:11, Tsjerk Wassenaar <tsjerkw at gmail.com> wrote:
> >>>>
> >>>>    Hey Paolo,
> >>>>
> >>>>    I think I posted a script for extracting a last frame before, but if I
> >>>>    can't even find it myself... Here it is:
> >>>>
> >>>>    #!/usr/bin/env python
> >>>>
> >>>>    from struct import unpack
> >>>>    import sys
> >>>>
> >>>>    f = open(sys.argv[1], "rb")       # Binary mode: xtc is a binary format
> >>>>    tag = f.read(8)                   # Tag: magic number and number of atoms
> >>>>    # Size of frame in bytes: 92-byte header plus the big-endian
> >>>>    # payload length stored in the last 4 bytes of that header
> >>>>    n = 92 + unpack(">I", f.read(84)[-4:])[0]
> >>>>
> >>>>    f.seek(-5*n//4, 2)                # This should contain a complete frame
> >>>>    frame = f.read()                  # Read the remaining part in
> >>>>    frame = frame[frame.index(tag):]  # Find the tag
> >>>>
> >>>>    # Open the output file
> >>>>    if len(sys.argv) > 2:
> >>>>        o = sys.argv[2]
> >>>>    else:
> >>>>        o = sys.argv[1][:-4]+"-last.xtc"
> >>>>    open(o,"wb").write(frame)
> >>>>
> >>>>    ###
> >>>>
> >>>>    Hope it helps. Cheers,
> >>>>
> >>>>    Tsjerk
> >>>>    On Mon, Jun 4, 2012 at 12:59 PM, Paolo Franz <paolo.franz at gmail.com> wrote:
> >>>>     > Hello everybody!
> >>>>     >
> >>>>     > I am wondering how I can figure out the number of frames contained in an
> >>>>     > .xtc file. Indeed, I need to read a particular frame of a trajectory and I
> >>>>     > thought that the function
> >>>>     > xtc_seek_frame(FILE * , int *, int *)
> >>>>     > would return 0 if the frame was there and 1 when it was not. Instead, if I
> >>>>     > call it with a frame outside the boundaries it seems to go into an infinite
> >>>>     > loop. What am I doing wrong? Is there a way to read the last frame of an
> >>>>     > .xtc file?
> >>>>     >
> >>>>     > Sincerely
> >>>>     > Paolo
> >>>>     >
> >
> >
> > --
> > gmx-developers mailing list
> > gmx-developers at gromacs.org
> > http://lists.gromacs.org/mailman/listinfo/gmx-developers
> > Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-developers-request at gromacs.org.
>
>
>
> --
> Tsjerk A. Wassenaar, Ph.D.
>
> post-doctoral researcher
> Molecular Dynamics Group
> * Groningen Institute for Biomolecular Research and Biotechnology
> * Zernike Institute for Advanced Materials
> University of Groningen
> The Netherlands