[gmx-developers] How to get the number of frames contained by an .xtc trajectory file??

Tsjerk Wassenaar tsjerkw at gmail.com
Tue Jun 5 09:45:23 CEST 2012


Hi Paolo,

The Python code also gives a hint about the C solution... You still
don't need to read in the first frame. Bytes 89-92 from the start (the
last four bytes that the script reads before computing n) contain the
size of the compressed coordinate data as a big-endian integer; adding
the 92 bytes used for the header gives the size of the frame.
Mind that this is only an approximate size for a frame, as the size
per frame in an xtc file is variable. But it'll probably be close. If
you have the size of one frame, you need the size of the file, for
which you can use the solution at
http://stackoverflow.com/questions/8236/how-do-you-determine-the-size-of-a-file-in-c
Dividing the file size by the frame size should give an indication of
the number of frames. If you have a small C program for calculating
the number of frames, please do post it. It might be interesting for
others.
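
For what it's worth, here is a minimal Python sketch of that estimate (the
same logic should port directly to C). It assumes the layout the script
below relies on: a 92-byte header for compressed frames, whose last four
bytes (offsets 88-91) hold a big-endian byte count, with the data padded to
a multiple of four bytes as XDR does. The function names are made up for
illustration and are not part of GROMACS; frames with 9 atoms or fewer are
stored differently and are simply rejected here. Since the frame size
varies, count_frames hops from header to header to get an exact count
without decoding any coordinates.

```python
import os
import struct

HEADER_BYTES = 92   # assumed header length of a compressed XTC frame
SIZE_OFFSET = 88    # assumed offset of the big-endian byte count


def frame_bytes_at(f, start):
    """Return the on-disk size of the frame whose header starts at `start`."""
    f.seek(start)
    header = f.read(HEADER_BYTES)
    if len(header) < HEADER_BYTES:
        raise ValueError("truncated frame header at offset %d" % start)
    natoms = struct.unpack(">i", header[4:8])[0]
    if natoms <= 9:
        # Assumption: tiny systems are written uncompressed, with another layout.
        raise ValueError("small uncompressed frames are not handled here")
    nbytes = struct.unpack(">i", header[SIZE_OFFSET:SIZE_OFFSET + 4])[0]
    if nbytes <= 0:
        raise ValueError("implausible byte count at offset %d" % start)
    padded = (nbytes + 3) // 4 * 4   # XDR pads opaque data to 4 bytes
    return HEADER_BYTES + padded


def estimate_frames(path):
    """Rough count: file size divided by the size of the first frame."""
    with open(path, "rb") as f:
        first = frame_bytes_at(f, 0)
    return round(os.path.getsize(path) / first)


def count_frames(path):
    """Exact count: hop from header to header, skipping the coordinates."""
    total = os.path.getsize(path)
    count = 0
    pos = 0
    with open(path, "rb") as f:
        while pos + HEADER_BYTES <= total:
            size = frame_bytes_at(f, pos)
            if pos + size > total:
                break                # incomplete final frame, do not count it
            pos += size
            count += 1
    return count
```

Testing whether frame k exists (counting from zero) is then just
k < count_frames(path), so a request for a frame past the end can be
refused before any seeking is attempted.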

Hope it helps,

Tsjerk

On Tue, Jun 5, 2012 at 12:40 AM, Oliver Stueker <ostueker at gmail.com> wrote:
>
> As far as I know there is no field at the beginning of the file that would
> give a parser hints how many frames are in it.
> (probably because that makes it easier/more performant to append to the file
> while reducing the risk of corrupting it in case a write goes bad)
>
> On the other hand that makes it hard to implement random-access to frames in
> XTC/TRR files.
>
> Interestingly there is just a discussion on the mailing list of MDAnalysis
> (a python framework that can deal with XTC and other trajectories) on how
> libxdr might be extended to generate a checksum-protected index for XTC
> files, so that a given trajectory has to be read only once from beginning to
> end.
> https://groups.google.com/group/mdnalysis-discussion/browse_thread/thread/3cae3634c726f1ad
>
>
> a different Oliver
>
>
> On Mon, Jun 4, 2012 at 3:24 PM, Paolo Franz <paolo.franz at gmail.com> wrote:
>>
>> I am trying to avoid doing it by brute force, that is, reading all frames
>> until the last one is found. What I really need to do is to test
>> whether a frame exists in the trajectory. I tried with xtc_seek_frame, but
>> that does not work. Of course, if I know how many frames there are, the test
>> becomes trivial.
>>
>> That said, I definitely know what is in the trajectory and how many frames
>> there are: I ran the md myself and I have the output file! What I want to do
>> is to write code that figures out by itself what to expect and, if by any
>> chance I forget what is inside, does not go into an infinite loop if I
>> ask it to analyse the wrong frame.
>>
>> Cheers
>> Paolo
>>
>> On 4 June 2012 22:59, Justin A. Lemkul <jalemkul at vt.edu> wrote:
>>>
>>>
>>> If all you need is the number of frames contained in an .xtc file, is
>>> there some reason why running gmxcheck on the .xtc file is insufficient?
>>>
>>> -Justin
>>>
>>>
>>> On 6/4/12 4:56 PM, Paolo Franz wrote:
>>>>
>>>> Hi Tsjerk,
>>>> Thanks, but I don't really want to use a python script, I am doing this
>>>> from some c/c++ code. I think I figured out a way to do it, but I haven't
>>>> tested it yet:
>>>>
>>>> i)    open the file
>>>> ii)   do a read_first_xtc
>>>> iii)  then get the file pointer position from ftellg, which should be the
>>>>       length of the frame in bytes;
>>>> iv)   place the file pointer at the end of the file with an fseek, then
>>>>       get the length with an ftellg
>>>> v)    divide the total length by the length of a frame and obtain the
>>>>       number of written frames.
>>>>
>>>> I am only wondering what to do when the length in bytes of the file is
>>>> too large for a long int!
>>>>
>>>> On 4 June 2012 16:11, Tsjerk Wassenaar <tsjerkw at gmail.com> wrote:
>>>>
>>>>    Hey Paolo,
>>>>
>>>>    I think I posted a script for extracting a last frame before, but if I
>>>>    can't even find it myself... Here it is:
>>>>
>>>>    #!/usr/bin/env python
>>>>
>>>>    from struct import unpack
>>>>    import sys
>>>>
>>>>    f = open(sys.argv[1], "rb")       # XTC is a binary format
>>>>    tag = f.read(8)                   # Tag: magic number and number of atoms
>>>>    # Big-endian int at bytes 89-92: size of the compressed coordinates.
>>>>    # Adding the 92-byte header gives the approximate frame size in bytes.
>>>>    n = 92 + unpack(">i", f.read(84)[-4:])[0]
>>>>
>>>>    f.seek(-5*n//4, 2)                # This should contain a complete frame
>>>>    frame = f.read()                  # Read the remaining part in
>>>>    frame = frame[frame.index(tag):]  # Find the tag
>>>>    f.close()
>>>>
>>>>    # Open the output file
>>>>    if len(sys.argv) > 2:
>>>>        o = sys.argv[2]
>>>>    else:
>>>>        o = sys.argv[1][:-4]+"-last.xtc"
>>>>    open(o, "wb").write(frame)
>>>>
>>>>    ###
>>>>
>>>>    Hope it helps. Cheers,
>>>>
>>>>    Tsjerk
>>>>    On Mon, Jun 4, 2012 at 12:59 PM, Paolo Franz <paolo.franz at gmail.com> wrote:
>>>>     > Hello everybody!
>>>>     >
>>>>     > I am wondering how I can figure out the number of frames contained in
>>>>     > an .xtc file. I need to read a particular frame of a trajectory and I
>>>>     > thought that the function
>>>>     > xtc_seek_frame(FILE * , int *, int *)
>>>>     > would return 0 if the frame was there and 1 when it was not. Instead,
>>>>     > if I call it with a frame outside the boundaries it seems to go into an
>>>>     > infinite loop. What am I doing wrong? Is there a way to read the last
>>>>     > frame of an .xtc file?
>>>>     >
>>>>     > Sincerely
>>>>     > Paolo
>>>>     >
>
>
> --
> gmx-developers mailing list
> gmx-developers at gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-developers
> Please don't post (un)subscribe requests to the list. Use the
> www interface or send it to gmx-developers-request at gromacs.org.



-- 
Tsjerk A. Wassenaar, Ph.D.

post-doctoral researcher
Molecular Dynamics Group
* Groningen Institute for Biomolecular Research and Biotechnology
* Zernike Institute for Advanced Materials
University of Groningen
The Netherlands


