[gmx-developers] Parallel IO, was: Python interface for Gromacs

Roland Schulz roland at utk.edu
Thu Sep 9 15:34:02 CEST 2010


Hi,

Regarding parallel IO, most has already been said in the earlier thread on
this topic, so here are only a few comments:

Ryan has started (as an undergrad internship project in our group) to
implement the plans described in that earlier thread. Specifically, XTC
frames are no longer written every step but are cached and written only
every N steps or at every checkpoint. According to our performance analysis
this is much more important than parallel IO, because frequent small writes
prevent high IO bandwidth (below 10 MB per write and node, parallel IO
doesn't make sense). Caching also allows very simple parallel IO, because
each of the cached frames can be written in parallel. But it won't by itself
address the memory issue.
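
A minimal sketch of the caching idea, in Python for brevity. This is not the
actual GROMACS code: the class name, its methods and the flush interval are
hypothetical, and the "frames" are just opaque byte strings.

```python
import io


class BufferedFrameWriter:
    """Cache encoded frames and write them in one large call (hypothetical)."""

    def __init__(self, stream, flush_every):
        self.stream = stream            # any binary file-like object
        self.flush_every = flush_every  # N: number of frames cached per write
        self.pending = []               # encoded frames not yet written
        self.write_calls = 0            # how many real write() calls were issued

    def add_frame(self, frame_bytes):
        """Cache one frame; write only once N frames have accumulated."""
        self.pending.append(frame_bytes)
        if len(self.pending) >= self.flush_every:
            self.flush()

    def checkpoint(self):
        """At a checkpoint nothing may stay cached, so flush unconditionally."""
        self.flush()

    def flush(self):
        if not self.pending:
            return
        # One large write instead of many small ones.
        self.stream.write(b"".join(self.pending))
        self.pending.clear()
        self.write_calls += 1


# Usage: 10 frames of 4 bytes each, cached in groups of 4.
out = io.BytesIO()
writer = BufferedFrameWriter(out, flush_every=4)
for step in range(10):
    writer.add_frame(step.to_bytes(4, "big"))
writer.checkpoint()
```

With these numbers the ten frames reach the stream in three writes (after
frames 4 and 8, and at the checkpoint) instead of ten. The cached frames are
independent of each other, which is what makes writing them in parallel
simple.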

For analysis tools it will almost always make more sense to parallelize
over frames than over atoms. In that case the IO should also be
parallelized over frames.
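
As an illustration only, a sketch of splitting a trajectory over frames. The
function names and the per-frame "analysis" are hypothetical placeholders, and
the workers are simulated by a plain loop rather than real processes.

```python
def frame_block(n_frames, n_workers, rank):
    """Return the contiguous range of frame indices owned by worker `rank`."""
    base, extra = divmod(n_frames, n_workers)
    # The first `extra` workers each take one additional frame.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def analyze_frame(frame):
    """Placeholder per-frame analysis: mean of the frame's values."""
    return sum(frame) / len(frame)


# Ten toy "frames" of three values each, split over three workers.
frames = [[float(i), float(i + 1), float(i + 2)] for i in range(10)]
n_workers = 3

# Each worker reads and analyzes only its own block of frames.
partial = [
    [analyze_frame(frames[i]) for i in frame_block(len(frames), n_workers, rank)]
    for rank in range(n_workers)
]

# Concatenating the blocks in rank order restores the frame order.
combined = [value for block in partial for value in block]
```

Because each worker owns a contiguous block of frames, it only has to read
its own part of the trajectory, so the IO is split the same way as the
computation.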

Memory should not be a concern for almost all simulations on almost all
platforms except BlueGene. Thus it has lower priority for us and is not
addressed by our current approach (the first point above). As mentioned by
Berk, we have started to think about it. Doing the IO from every node
wouldn't result in good performance because of the large number of small
writes. Instead you would want to write from a few nodes, with the number of
I/O nodes depending on the simulation size (thus also scaling perfectly).
Having all nodes send to those few nodes can include sorting the atoms
(similar to what Mark wrote about the all-to-all, and possibly using a tree
as Berk mentioned), so implementing the sorting for parallel IO is no
problem. This can be combined with the first point to address the memory
issue.
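
A rough sketch of the "few I/O nodes" scheme, again in Python and with
everything simulated in one process. All names and the atoms-per-writer
threshold are assumptions made for illustration; a real implementation would
use MPI collectives instead of the loops below.

```python
import math


def n_io_ranks(n_atoms, n_ranks, atoms_per_io_rank):
    """Number of writer ranks: grows with system size, capped by rank count."""
    return max(1, min(n_ranks, math.ceil(n_atoms / atoms_per_io_rank)))


def io_rank_for_atom(global_index, n_atoms, n_io):
    """Writer that owns a global atom index (contiguous slabs of atoms)."""
    return global_index * n_io // n_atoms


def gather_sorted(local_atoms_per_rank, n_atoms, n_io):
    """Simulate the all-to-all: route each atom to its writer, then sort."""
    buckets = [[] for _ in range(n_io)]
    for local_atoms in local_atoms_per_rank:
        for index, coord in local_atoms:
            buckets[io_rank_for_atom(index, n_atoms, n_io)].append((index, coord))
    # Each writer sorts only its own slab by global atom index.
    return [sorted(bucket) for bucket in buckets]


# Eight atoms scattered over four compute ranks (as after domain decomposition).
local = [
    [(5, 0.5), (0, 0.0)],
    [(3, 0.3), (6, 0.6)],
    [(1, 0.1), (7, 0.7)],
    [(4, 0.4), (2, 0.2)],
]
n_io = n_io_ranks(n_atoms=8, n_ranks=4, atoms_per_io_rank=4)
slabs = gather_sorted(local, n_atoms=8, n_io=n_io)
```

Each writer ends up with a contiguous, sorted slab of atoms, so no single
node has to hold the whole frame, and the writers can write their slabs
independently.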

@David: Would you mind sending me a preprint of the paper you mentioned? I
would be very interested in it.

Roland


On Thu, Sep 9, 2010 at 7:54 AM, Shirts, Michael (mrs5pt) <
mrs5pt at eservices.virginia.edu> wrote:

> So, great discussion!  This brings one thought to mind:
>
> How can we improve the discussion to better convert the expertise and
> person-hours into better code -- especially going beyond the developers who
> are in Sweden and can communicate face-to-face easily?
>
> Gromacs is becoming such a widely used -- and widely modified -- tool.
> Are there changes in the web tools that we can make -- ways that these
> sorts of conversations can be documented on the wiki in a better form, or
> that we can use forums to make better use of all the ideas and effort, and
> not duplicate effort?
>
> Best,
> ~~~~~~~~~~~~
> Michael Shirts
> Assistant Professor
> Department of Chemical Engineering
> University of Virginia
> michael.shirts at virginia.edu
> (434)-243-1821
>
>
> > From: Sébastien Buchoux <sebastien.buchoux at u-picardie.fr>
> > Reply-To: Discussion list for GROMACS development <
> gmx-developers at gromacs.org>
> > Date: Thu, 9 Sep 2010 07:13:33 -0400
> > To: "gmx-developers at gromacs.org" <gmx-developers at gromacs.org>
> > Subject: Re: [gmx-developers] Python interface for Gromacs
> >
> > Hi,
> >
> > On 09/09/2010 08:38 AM, David van der Spoel wrote:
> >> People in the Lindahl group are working on parallelizing analysis
> >> tools because they are quickly becoming the bottleneck. We run
> >> simulations of large systems on hundreds of processors, and due to
> >> checkpointing this can be done largely unattended. Analysis can take a
> >> lot of effort, both hardware (CPU, Disk) and organisational. I think
> >> the prime advantage of a scripting language like Python could be
> >> organisational.
> >
> >  From my experience, Python is too slow to really make analysis tools
> > (or any heavy computational work) on its own. But it can use C/C++ libs
> > like a charm... hence a Python interface! :)
> >
> >> @portability: I have tried compiling numpy a few times on mac & linux
> >> without success. As long as numpy is not in the main python
> >> distribution it will not be useful...
> >
> > Numpy compilation can indeed be... troublesome, but I don't think it is
> > mandatory.
> >  From my experience, Numpy is great for pure Python programming so it
> > would be useful for a 100% Python analysis tool. But with the use of C
> > to do the hard work, the advantage of Numpy is pretty slim and I think
> > that new Python types defined within a C extension are more efficient
> > since they allow Python users to access C objects directly (given the
> > use of a decent interface).
> > This is especially true when dealing with already written C objects that
> > are very different from C numarrays (i.e. they would need to be
> > "translated" to Numpy arrays prior to using any of the Numpy routines).
> > IMHO, the #1 priority of Numpy is much more to ease the life of pure
> > Python programmers than to be useful to C extension programmers.
> >
> > Sébastien
> >
> >
> > --
> > Sébastien Buchoux, MC
> > UMR 6022 - Génie Enzymatique et Cellulaire / Enzyme and Cell Engineering
> > Université Picardie Jules Verne (UPJV)
> > 33, rue Saint Leu, 80000 Amiens, France
> > tel: +33(0)3 22 82 74 73 - email: sebastien.buchoux[at]u-picardie.fr
> >
> > Pourquoi pas de pièces jointes Word? / Why no Word attachments?
> > http://www.gnu.org/philosophy/no-word-attachments.html
> >
> >
> > --
> > gmx-developers mailing list
> > gmx-developers at gromacs.org
> > http://lists.gromacs.org/mailman/listinfo/gmx-developers
> > Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-developers-request at gromacs.org.



-- 
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309