[gmx-developers] parallelization

Justin MacCallum jlmaccal at ucalgary.ca
Fri Jan 11 04:16:35 CET 2002


Hi,

On Thu, 10 Jan 2002, Erik Lindahl wrote:
> 
> On Thu, 10 Jan 2002, K.A.Feenstra wrote:
> 
> > Justin MacCallum wrote:
> >  >
> > > I looked into some automatic documentation tools.  The most promising one
> > > is called doxygen (www.doxygen.org).  It extracts all functions, global
> > > variables, enums, and structs from the source code and makes nice
> > > cross-linked web pages.  It will also extract specially formatted comments
> > > from the source and incorporate them into the web page.  The comment
> > > format is relatively unobtrusive.  I would be willing to convert the
> > > existing comments to doxygen format, as well as host the documentation as
> > > long as that isn't a problem with any other developers, and as long as
> > > they would be willing to use the new format when updating or adding
> > > comments.
> > 
> > That sounds like a splendid idea. Could you give a short example of how
> > comments would have to be formatted? Just so we can get an idea of how
> > much work that would be. If it is too involved (I've seen some examples
> > of similar systems, which are in practice absolutely unworkable), it
> > would probably not be worthwhile, since nobody will make the effort...
> > 
> 
> Hi,
> 
> I've been thinking about how to document stuff for quite a while too.
> One of our main problems isn't the documentation, though, but the
> very dirty and cross-dependent architecture of the source, partly due
> to the Fortran heritage. My only concern is that we should stick to tools
> with a GPL license, or at least completely free ones.

I just checked, and doxygen is GPL'ed. I ran the program over the CVS tree
as-is. You can check out the results at
moose.bio.ucalgary.ca/~justin/devel/devel-doc/index.html. I also whipped
up a really quick page that shows how to format the comments for doxygen
to work. I don't think that it would take me long to reformat the existing
comments.  Probably only a couple of days at most. Then as things change
or are added, the new format can be used.
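To give an idea of the format, a comment for a (made-up) routine would look
something like this:

/*! \brief Compute the center of mass of a group of atoms.
 *
 * A longer description can follow the brief line; doxygen puts the
 * \brief text on the index pages and the rest on the detail page.
 *
 * \param natoms  Number of atoms in the group.
 * \param x       Coordinates of the atoms.
 * \param mass    Mass of each atom.
 * \param com     Center of mass, filled in on return.
 * \return        Total mass of the group.
 */
extern real calc_com(int natoms, rvec x[], real mass[], rvec com);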

> I think our best bet is to separate this into two tasks:
> 
> 1. Change the architecture of the code and make a nice & well-documented
>    layout for the future. This takes time, but I've started doing it.
>
> 2. Document the code we have now, with the aim of making it more usable
>    rather than rewriting things.

> > > I also plan to produce some how-to/tutorial pages for developers.  For the
> > > first one, I want to put some answers to questions I had when I was just
> > > starting out.  Things like how to add a new program to the makefile, how
> > > to compile for debugging, how to debug MPI, etc.  This could probably even
> > > go into the developer FAQ.  I'd also like to produce a document to help
> > > people get started writing analysis programs.  We need this for our
> > > group anyway.
> > 
> 
> Wonderful! I'd just love some help :-)
> 
> I can give you access to the normal gromacs website so you can implement
> it there - of course you can also keep a local copy, but I think it's a
> good idea to have all our web material in one main gromacs.org site, so we
> can move it just by tarring the entire tree or even mirror the entire
> site. I'll send you the access stuff.

That sounds good, although if you look at my homepage, you'll see that I'm
not much of an HTML jockey. I also don't know what the timeframe for any
of that is.  Hopefully in the next few months, but I have been known to
take longer than I say to do things :)

> I might as well share some of the thoughts David & I have been exchanging:
> 
> There are two serious problems with the present layout of the code. First,
> there is no separation between internal utility routines and more or less
> external interfaces that could be used e.g. in utility programs. A prime
> example is the input/output of trajectories; you actually only need 3-4
> functions, but in the include directory there are probably 50 declarations
> or so, most of which shouldn't be visible to the casual user.

I agree; the sheer number of include files, functions, and structures is a
little bit overwhelming.
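For the trajectory reading you mention, for example, I could imagine the
externally visible header being as small as something like this (the names
are just made up to illustrate the idea):

#include "typedefs.h"

typedef struct t_trajfile t_trajfile;   /* opaque handle, contents private */

/* Open a trajectory file of any supported format for reading */
extern t_trajfile *traj_open(char *fn);

/* Read the next frame; returns FALSE when there are no more frames */
extern bool traj_next_frame(t_trajfile *tf, real *t, int *natoms,
                            rvec x[], matrix box);

/* Close the file and free everything associated with it */
extern void traj_close(t_trajfile *tf);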

> The other problem is that "everything depends on everything" - there are
> no modules, and when you change one declaration it will probably have to
> be altered in 10-20 additional places. This makes it very hard to debug
> and test things, not to mention the difficulty of understanding the
> code.
> 
> I've started to design modules, beginning with core things like
> neighborlists, neighborsearching and force calculation. We will stick to
> C (rather than C++), but still use the type of module interface common
> in object-oriented languages. Each functionality (like e.g.
> neighborsearching or the pull code) should ideally have one source file
> (or a couple if they are VERY large) and one header file with the 
> EXTERNAL DECLARATIONS ONLY - very well documented. Each such module should
> only depend on a well defined small set of submodules.
> An important lesson in this context is that it often pays to duplicate
> code; rather than having a complicated general routine to calculate simple
> things like center-of-mass it might be better to have a simple routine
> internally in a module - that way you avoid unnecessary dependencies.

That also makes a lot of sense.
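The center-of-mass example is a good one; I imagine something this small
just living as a static routine inside whichever module needs it (only a
sketch, of course):

#include "typedefs.h"
#include "vec.h"

/* Internal helper - static, so it adds no dependency outside this file
 * and no entry in the module's external header.
 */
static void calc_com_simple(int n, atom_id index[], rvec x[], real mass[],
                            rvec com)
{
    int  i, d, ai;
    real mtot = 0;

    clear_rvec(com);
    for (i = 0; i < n; i++)
    {
        ai    = index[i];
        mtot += mass[ai];
        for (d = 0; d < DIM; d++)
        {
            com[d] += mass[ai]*x[ai][d];
        }
    }
    svmul(1.0/mtot, com, com);   /* divide by the total mass */
}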

> This does mean some work, but in the long run it will make things 10 times
> faster to test and much easier to understand. And the code will be usable
> as a general library. However, it also means that we should not spend too
> much effort on documenting all the current calls!
> As soon as the basic structure is a little more clear I'll create a CVS
> module with a "gromacs developer reference manual" where you can add
> things. 
> 
> Guides and howtos would still be extremely useful, since they
> take a lot of time to write and the actual code calls are a minor part.
> I would suggest having them online in HTML so people can read them while
> browsing the code in an adjacent window.
> 
> 
> PS: Justin - In case you're interested there are now Monte-Carlo versions
> (no force calculation = faster) of all inner loops in CVS, including the
> assembly (SSE & 3DNow) and Altivec ones. I'll also be adding SSE2 support
> (double precision) in the next few days, and then David & I will try to get 3.1
> out as a maintenance release.
> 
> Do you have any idea of a general MC integrator?

Yes, Peter and I actually talked quite a bit about implementing some MC
stuff. At the time we were thinking about creating a separate mcrun
program, but it might be feasible to make it an integrator in mdrun. In
all honesty, my knowledge of the "guts" of Gromacs is pretty primitive
(actually, that's what prompted this whole documentation thing in the
first place). My idea was to make it as modular as possible. It would work
something like this:

1. The user would define a list of algorithms to use. Each one would be
associated with a group of atoms and have a certain probability of being
selected. This would all be entered into a file (probably XML) along with
the options for each algorithm, and global options such as output
parameters, etc. Each algorithm would implement a different type of MC
step. There would be simple displacements, rotations, box volume changes,
and some fancier stuff like configurational bias and force bias moves.

2. The program randomly picks one of the algorithms according to its
assigned probability and executes it.

3. The algorithm will compute the trial move and decide if it should be
accepted or not. If it's accepted, update the system.

4. Loop back to step 2.

Each of the algorithms would ideally be a function with the same signature
so that the main algorithm will just be dealing with an array of
function pointers. The XML input part would also use some sort of callback
routine to read in its parameters from the XML stream/tree. Basically,
each algorithm would be implemented as a module. The only public interface
would consist of three functions: two callback functions (one to actually
do a step and the other for reading input), and a setup function that
would register the two callback functions. We could start with a simple
algorithm that just randomly displaces an atom, for which a simple LJ
fluid would serve as a test case. After that's working, more complicated
algorithms can be implemented.
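To make that concrete, the public part of a move module could look roughly
like this (all the names are invented, this is only a sketch):

#include "typedefs.h"

typedef struct t_mcmove t_mcmove;   /* per-algorithm parameters, opaque */

/* Callback that attempts one trial move on its group of atoms and
 * returns TRUE if the move was accepted (and the system updated).
 */
typedef bool (*mc_step_func)(t_mcmove *mv, rvec x[], matrix box, real beta);

/* Callback that reads the algorithm's own options from the XML tree */
typedef t_mcmove *(*mc_read_func)(void *xml_node);

typedef struct {
    char         *name;       /* e.g. "displace", "rotate", "volume"      */
    real          prob;       /* probability of this move being selected  */
    int           nat;        /* the group of atoms the move acts on      */
    atom_id      *index;
    t_mcmove     *params;     /* filled in by the read callback           */
    mc_step_func  do_step;
    mc_read_func  read_input;
} t_mc_algo;

/* The only routine a module exports: register its two callbacks */
extern void mc_register(char *name, mc_step_func do_step, mc_read_func read_input);

The main loop would then just pick an entry from the t_mc_algo array with
the right probability (r is a uniform random number in [0,1)):

for (i = 0, acc = 0; i < nalgo; i++) {
    acc += algo[i].prob;
    if (r < acc) {
        algo[i].do_step(algo[i].params, x, box, beta);
        break;
    }
}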

Of course this is full of holes. I don't know how this would be
parallelized, although each algorithm would likely handle this itself.
Also, I'm not sure how to trick the inner loops into just calculating the
energy of a defined set of particles, although I'm sure it's possible. We'd
also need a neighbour search routine that could update the lists
efficiently if just a small number of particles moved. Peter and I thought
that it would be cool to have some kind of grand canonical MC features,
but this would require either outputting PDB snapshots (not that bad) or
creating a new trajectory file format that supports a variable number of
atoms.

This is a long-term project, but definitely one that I'm interested in
working on.


If anyone's still reading this horribly long message, I have a couple of
questions/comments about parallelizing the pull code.

1. I think I have the umbrella sampling and AFM stuff figured out. For
these calculations, you need the COM of the pulled and reference groups,
from which you calculate the force on the pulled groups using Hooke's law.
Since umbrella sampling happens before update() is called, each node has
the current position of all atoms, so they should be able to calculate the
necessary COMs and forces. If each node applies the appropriate force to
any atoms that are being pulled and are home atoms for that node, the
force should end up being right and update() will update everything
correctly. For the pull output, you only need to know the COMs, which all
the nodes will have calculated, so node zero can do the output. So
basically, with a little book-keeping, no new communication calls need to
be added. Do you see anything wrong with this?
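Very schematically, I'm picturing something like this on every node (the
names are all made up):

/* Before update() every node still has all coordinates in x */
calc_com(npull, pull_ind, x, mass, com_pull);
calc_com(nref,  ref_ind,  x, mass, com_ref);

for (d = 0; d < DIM; d++)
{
    /* Hooke's law on the vector between the two centers of mass */
    f_pull[d] = -k_spring*(com_pull[d] - com_ref[d] - x0[d]);
}

/* Add the force only to pulled atoms that are home atoms on this node,
 * mass-weighted so that the group's COM feels the full force.
 */
for (i = 0; i < npull; i++)
{
    ai = pull_ind[i];
    if (ai >= start && ai < start+homenr)
    {
        for (d = 0; d < DIM; d++)
        {
            f[ai][d] += (mass[ai]/mtot_pull)*f_pull[d];
        }
    }
}

if (MASTER(cr))   /* all nodes know the COMs, node zero prints them */
{
    print_pull_output(fp, t, com_pull, com_ref, f_pull);
}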

2. The constraint force code is a lot more problematic. Because the
algorithm uses SHAKE, we need the positions of all of the atoms in the
pulled and reference groups.  The problem is that this occurs in update()
after the positions and velocities have been updated. So each node only
has the correct positions for its home atoms. I don't see any easy way
around this except to add a bunch of extra communication to send all of
the positions to one node, do SHAKE, and then communicate the updated
positions back. Any thoughts/ideas?

3. Currently, the constraint force code requires the use of double
precision. This is because of the way SHAKE works. A tiny displacement in
the reference group corresponds to a huge force because the reference
group is so heavy. Does anyone know of an algorithm that would do
something similar to SHAKE, but allow one of the groups to be kept fixed?
This would allow the constraint force stuff to work in single precision
for reasonably light pulled molecules.

Cheers,
Justin




