[gmx-developers] parallelization

Thu Jan 10 21:55:35 CET 2002

On Thu, 10 Jan 2002, K.A.Feenstra wrote:

> Justin MacCallum wrote:
> >
> > I looked into some automatic documentation tools.  The most promising one
> > is called doxygen (www.doxygen.org).  It extracts all functions, global
> > variables, enums, and structs from the source code and makes nice
> > cross-linked web pages.  It will also extract specially formatted comments
> > from the source and incorporate them into the web page.  The comment
> > format is relatively unobtrusive.  I would be willing to convert the
> > existing comments to doxygen format, as well as host the documentation as
> > long as that isn't a problem with any other developers, and as long as
> > they would be willing to use the new format when updating or adding
> > comments.
> 
> That sounds like a splendid idea. Could you give a short example of how
> comments would have to be formatted? Just so we can get an idea of how
> much work that would be. If it is too involved (I've seen some examples
> of similar systems, which are in practice absolutely unworkable), it
> would probably not be worthwhile, since nobody will make the effort...
> 

Hi,

I've been thinking about how to document stuff for quite a while too.
One of our main problems isn't the documentation, though, but the
very dirty and cross-dependent architecture of the source, partly due
to the Fortran heritage. My only concern is that we should stick to tools
with a GPL license, or at least completely free ones.

I think our best bet is to separate this into two tasks:

1. Change the architecture of the code and make a nice & well-documented
   layout for the future. This takes time, but I've started doing it.
2. Document the code we have now, with the aim of making it more useable
   rather than rewriting things.

> > I also plan to produce some how-to/tutorial pages for developers.  For the
> > first one, I want to put some answers to questions I had when I was just
> > starting out.  Things like how to add a new program to the makefile, how
> > to compile for debugging, how to debug MPI, etc.  This could probably even
> > go into the developer FAQ.  I'd also like to produce a document to help
> > people get started writing analysis programs.  We need this for our
> > group anyway.
> 

Wonderful! I'd just love some help :-)

I can give you access to the normal gromacs website so you can implement
it there - of course you can also keep a local copy, but I think it's a
good idea to have all our web material in one main gromacs.org site, so we
can move it just by tarring the entire tree or even mirror the entire
site. I'll send you the access stuff.

I might as well share some of the thoughts David & I have been
exchanging:

There are two serious problems with the present layout of the code. First,
there is no separation between internal utility routines and more or less
external interfaces that could be used e.g. in utility programs. A prime
example is the input/output of trajectories; you actually only need 3-4
functions, but in the include directory there are probably 50 declarations
or so, most of which shouldn't be visible to the casual user.

The other problem is that "everything depends on everything" - there are
no modules, and when you change one declaration it will probably have to
be altered in 10-20 additional places. This makes it very hard to debug
and test things, not to mention the difficulty of understanding the
code.

I've started to design modules, beginning with core things like
neighborlists, neighborsearching and force calculation. We will stick to
C (rather than C++), but still use the type of module interface common
in object-oriented languages. Each piece of functionality (e.g.
neighborsearching or the pull code) should ideally have one source file
(or a couple if they are VERY large) and one header file with the 
EXTERNAL DECLARATIONS ONLY - very well documented. Each such module should
only depend on a well defined small set of submodules.
An important lesson in this context is that it often pays to duplicate
code; rather than having a complicated general routine to calculate simple
things like center-of-mass it might be better to have a simple routine
internally in a module - that way you avoid unnecessary dependencies.
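
A minimal sketch of what such a module could look like, shown as one
listing with the header part and the source part marked by comments. The
module name "pullgrp", its functions, and the flat coordinate layout are
all hypothetical and chosen only for illustration; they are not existing
gromacs calls. The header part carries doxygen-style comments, the struct
layout stays private to the source file, and the center-of-mass helper is
a simple static routine local to the module:

```c
#include <assert.h>
#include <stdlib.h>

/* ---- pullgrp.h (hypothetical): EXTERNAL DECLARATIONS ONLY ---- */

/** Opaque handle; the struct layout is private to pullgrp.c. */
typedef struct pullgrp pullgrp_t;

/**
 * \brief Create a group of n atoms with the given masses.
 * \param n     number of atoms, must be > 0
 * \param mass  array of n masses (copied)
 * \return new group, or NULL if allocation fails
 */
pullgrp_t *pullgrp_create(int n, const double *mass);

/**
 * \brief Compute the center of mass of the group.
 * \param g    group created by pullgrp_create()
 * \param x    coordinates as a flat array of 3*n values (x0,y0,z0,x1,...)
 * \param com  output, 3 values
 */
void pullgrp_com(const pullgrp_t *g, const double *x, double com[3]);

/** \brief Free a group; passing NULL is allowed. */
void pullgrp_destroy(pullgrp_t *g);

/* ---- pullgrp.c (hypothetical): implementation ---- */

struct pullgrp {
    int     n;
    double *mass;
    double  total_mass;
};

/* Internal helper, static so it is invisible outside this file.
 * Deliberately a small local routine instead of a call into a
 * general-purpose library, to avoid an extra dependency. */
static void weighted_mean(int n, const double *w, double wsum,
                          const double *x, double out[3])
{
    out[0] = out[1] = out[2] = 0.0;
    for (int i = 0; i < n; i++) {
        for (int d = 0; d < 3; d++) {
            out[d] += w[i] * x[3 * i + d];
        }
    }
    for (int d = 0; d < 3; d++) {
        out[d] /= wsum;
    }
}

pullgrp_t *pullgrp_create(int n, const double *mass)
{
    assert(n > 0 && mass != NULL);
    pullgrp_t *g = malloc(sizeof *g);
    if (g == NULL) {
        return NULL;
    }
    g->mass = malloc((size_t)n * sizeof *g->mass);
    if (g->mass == NULL) {
        free(g);
        return NULL;
    }
    g->n = n;
    g->total_mass = 0.0;
    for (int i = 0; i < n; i++) {
        g->mass[i] = mass[i];
        g->total_mass += mass[i];
    }
    return g;
}

void pullgrp_com(const pullgrp_t *g, const double *x, double com[3])
{
    assert(g != NULL && x != NULL && g->total_mass > 0.0);
    weighted_mean(g->n, g->mass, g->total_mass, x, com);
}

void pullgrp_destroy(pullgrp_t *g)
{
    if (g != NULL) {
        free(g->mass);
        free(g);
    }
}
```

A utility program would include only the header part and never see the
struct fields or the static helper, so the internals can change without
touching any caller.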

This does mean some work, but in the long run it will make things 10 times
faster to test and much easier to understand. And the code will be useable
as a general library. However, it also means that we should not spend too
much effort documenting all the current calls!
As soon as the basic structure is a little more clear I'll create a CVS
module with a "gromacs developer reference manual" where you can add
things. 

Guides and howtos would still be extremely useful, since they
take a lot of time to write and the actual code calls are a minor part.
I would suggest having them online in HTML so people can read them while
browsing the code in an adjacent window.

PS: Justin - In case you're interested there are now Monte-Carlo versions
(no force calculation=faster) of all inner loops in CVS, including the
assembly (SSE & 3DNow) and altivec ones. I'll also be adding SSE2 support
(double precision) in the next few days, and then David & I will try to get 3.1
out as a maintenance release.

Do you have any ideas for a general MC integrator?

Cheers,

Erik