[gmx-developers] Gromacs parallel I/O?

Roland Schulz roland at utk.edu
Wed Jul 7 01:57:07 CEST 2010

On Tue, Jul 6, 2010 at 7:18 PM, Shirts, Michael (mrs5pt) <
mrs5pt at eservices.virginia.edu> wrote:

> > BTW: Regarding parallel read of XTC for analysis tools. I suggest we add an
> > XTC meta-file to solve the problem of parallel read for XTC. To be able to
> > read frames in parallel we need to know the starting positions of the
> > frames. Using the bisect search for XTC in parallel will probably give poor
> > performance on most parallel IO systems (a small random access IO pattern
> > is what parallel IO systems don't like at all). Using TRR instead for
> > parallel analysis is also not such a good idea because even with parallel
> > IO several analyses will be IO bound and thus we could benefit from the XTC
> > compression. Thus an XTC file with a meta-file containing the starting
> > positions should give the best performance. A separate meta-file instead of
> > adding the positions to the header has the advantage that we don't change
> > the current format and thus don't break compatibility with 3rd party
> > software. Having a separate meta-file has the disadvantage of the required
> > bookkeeping to make sure that the XTC file and the meta-file are up to date
> > with each other, but I think this shouldn't be too difficult to solve. And
> > if a meta-file is missing or not up to date it is possible to generate it
> > on the fly.
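
A minimal sketch of the meta-file bookkeeping proposed above, in Python for
brevity. Everything here is an assumption made for illustration and not
existing GROMACS code: the ".idx.json" sidecar name, the JSON layout, the
use of file size plus modification time as the staleness check, and the
scan_frames callback, which stands in for whatever sequential pass finds
the frame starts in an XTC file.

```python
import json
import os


def index_path(data_path):
    # Hypothetical naming convention: the sidecar sits next to the trajectory.
    return data_path + ".idx.json"


def write_index(data_path, offsets):
    """Store frame start offsets together with a fingerprint of the data file."""
    st = os.stat(data_path)
    meta = {"size": st.st_size, "mtime_ns": st.st_mtime_ns,
            "offsets": list(offsets)}
    with open(index_path(data_path), "w") as fh:
        json.dump(meta, fh)


def load_index(data_path):
    """Return the stored offsets, or None if the index is missing or stale."""
    try:
        with open(index_path(data_path)) as fh:
            meta = json.load(fh)
    except (OSError, ValueError):
        return None
    st = os.stat(data_path)
    if meta.get("size") != st.st_size or meta.get("mtime_ns") != st.st_mtime_ns:
        return None  # trajectory changed after the index was written
    return meta["offsets"]


def get_offsets(data_path, scan_frames):
    """Use the index if it is valid, otherwise regenerate it on the fly."""
    offsets = load_index(data_path)
    if offsets is None:
        offsets = scan_frames(data_path)  # one sequential pass over the file
        write_index(data_path, offsets)
    return offsets


def frames_for_rank(offsets, rank, nranks):
    """Contiguous block of frame offsets for one rank (one large read each)."""
    n = len(offsets)
    return offsets[rank * n // nranks:(rank + 1) * n // nranks]
```

Giving each rank a contiguous block of frames keeps the access pattern to a
few large reads instead of many small random ones, which is the point of
having the positions available up front.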
> I'm wondering if this is the sort of problem that eventually moving to
> something like netCDF might help solve.  Clearly, it would be a difficult
> move, and would require interconversion utilities for backward
> compatibility.

I looked into this. The compression of XTC is very good, and good
compression is important if you want a good IO rate (measured on the
uncompressed data). NetCDF3 doesn't support compression (there are only
unsupported extensions). HDF5/NetCDF4 support compression, but only
parallel read of compressed data, not parallel write of compressed data.
Also, the zlib compression would have a significantly lower compression
ratio than the XTC compression does.
Thus none of them would by itself do all we would like to do. Of course one
could do the XTC compression within a NetCDF/HDF5 container, but I don't see
how this would help anyone. Without the full required support for
compression, the only other advantage I could see in moving to NetCDF/HDF5
is that it is easier for others to program readers/writers (which is already
very easy since the library xdrfile has been released). And if we have our
custom compression within NetCDF/HDF5, then reading those files wouldn't be
any easier than reading/writing current XTC files.

Without compression we could just as well use TRR. Writing a parallel
reader/writer for that is dead simple (since the position of each frame is
known from the number of atoms).
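
To illustrate why fixed-size frames make this simple, here is a rough
Python sketch. It assumes every frame in the file has the same size and
stores the same set of vectors. The fixed_bytes argument (all per-frame
bytes that do not scale with the atom count, such as the header and box)
is a placeholder the caller has to supply; the sketch does not reproduce
the actual TRR layout, and all function names are hypothetical.

```python
def frame_bytes(natoms, fixed_bytes, nvectors=1, real_bytes=4):
    """Size of one frame: a fixed part plus nvectors arrays of natoms x 3 reals.

    nvectors counts which of positions/velocities/forces are stored (1 to 3).
    fixed_bytes is an assumed input, not derived from the real file format.
    """
    return fixed_bytes + nvectors * natoms * 3 * real_bytes


def rank_frame_range(nframes, rank, nranks):
    """Half-open range [first, last) of frame numbers handled by one rank."""
    return rank * nframes // nranks, (rank + 1) * nframes // nranks


def read_frames(path, fb, first, count):
    """Read `count` raw frames starting at frame `first` with a single seek."""
    with open(path, "rb") as fh:
        fh.seek(first * fb)  # frame i starts at byte i * fb
        data = fh.read(count * fb)
    return [data[k * fb:(k + 1) * fb] for k in range(len(data) // fb)]
```

Each rank would call rank_frame_range to get its share and then read it in
one contiguous request; no index or search is needed because the offset of
frame i is just i times the frame size.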


> Best,
> ~~~~~~~~~~~~
> Michael Shirts
> Assistant Professor
> Department of Chemical Engineering
> University of Virginia
> michael.shirts at virginia.edu
> (434)-243-1821

ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309