[gmx-users] GROMACS and XML
pm286 at cam.ac.uk
Wed Apr 17 09:42:59 CEST 2002
At 21:04 16/04/2002 +0200, David van der Spoel wrote:
>On Tue, 16 Apr 2002, Peter Murray-Rust wrote:
>I think the latest release will compile under Windows as well, IIRC with
>both Cygwin and MS tools.
Thanks (and to ErikL as well) - sounds as if I should install Cygwin.
>I am currently working on creating XML data
>formats using the libxml2 library. (src/gmxlib/xmlio.c is very primitive
>and not functional in the 3.1 release, in CVS it is beginning to get
>somewhat more complete). The main effort goes into creating DTDs right
After I mailed the list I noticed that there was an XML activity in
GROMACS. This is really great because XML can significantly increase the
(re)use of programs and data. CML (Chemical Markup Language) has been
developed for "small molecules" and is now starting to become widely used.
I have been extending the design to cover computational chemistry including
MM, MD and MO methods. We are still at an early stage and I am very
fortunate in that Herman Berendsen is spending 3 months in Cambridge. He
and I have just started thinking about how we can abstract the input and
output to such programs so that it is program-independent. This is a very
significant challenge, of course, and we have to balance the generality
(across all such computational experiments) against the complexity of the
DTD (or Schema - I have just converted CML to Schemas and it has
considerable advantages for validation).
Some general principles of XML design... XML is designed to be re-used so
it's worth looking at existing solutions for components of the information.
Thus MathML (semantic) should be able to manage the equations, HTML any
text and images, SVG the graphics and CML the (static) small molecule
information. There is no complete agreement on macromolecular structures in
XML but it may well be based on mmCIF (as this is the basis of the OMG
submission). There is a large amount of information in most scientific
disciplines which is scalar/array data with dataTypes, units and semantics
which can be added from dictionaries. I have developed a language to
support this (STMML) and this may be useful. In any case you are going to
have to build a dictionary of concepts, with dataTypes, units and
definitions :-) The more that this dictionary can be shared with other
codes, the more that generic tools can be built for input, validation,
analysis and display.
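
As a rough illustration of the dictionary idea, here is a minimal Python
sketch that checks scalar/array entries against a shared dictionary of
concepts. The element names (parameterList, scalar, array), the attributes
(dictRef, units) and the "md:" entries are all assumptions made up for this
example; they are not the actual STMML, CML or GROMACS vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared dictionary: concept id -> (data type, units, definition).
DICTIONARY = {
    "md:timestep": (float, "ps", "Integration time step"),
    "md:nsteps": (int, "none", "Number of integration steps"),
    "md:boxLengths": (float, "nm", "Edge lengths of a rectangular box"),
}

# Hypothetical input document; every value points at a dictionary entry.
DOC = """
<parameterList>
  <scalar dictRef="md:timestep" units="ps">0.002</scalar>
  <scalar dictRef="md:nsteps" units="none">5000</scalar>
  <array dictRef="md:boxLengths" units="nm">3.0 3.0 4.5</array>
</parameterList>
"""


def read_parameters(xml_text, dictionary):
    """Return {concept: value}, checking each entry against the dictionary."""
    values = {}
    for elem in ET.fromstring(xml_text):
        ref = elem.get("dictRef")
        if ref not in dictionary:
            raise KeyError(f"unknown concept: {ref}")
        data_type, units, _definition = dictionary[ref]
        if elem.get("units") != units:
            raise ValueError(
                f"{ref}: expected units {units!r}, got {elem.get('units')!r}"
            )
        # Convert whitespace-separated tokens using the dictionary's data type.
        converted = [data_type(token) for token in elem.text.split()]
        values[ref] = converted[0] if elem.tag == "scalar" else converted
    return values


params = read_parameters(DOC, DICTIONARY)
print(params)
```

Because the types and units live in the dictionary rather than in the
reading code, the same reader would work for any program that shares it.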
One general rule: for every XML element ("tag") you create, software has to
exist to process it! So always bear in mind how the XML is to be processed.
For CML we
have three approaches - XSLT (probably the easiest and most generally
applicable), CML-DOM and CML-SAX. I expect that in computational chemistry
the XSLT approach will initially be the most useful.
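
To make the DOM/SAX distinction concrete, here is a minimal Python sketch
using the generic standard-library parsers (xml.dom.minidom and xml.sax),
not CML-DOM or CML-SAX themselves. The small molecule document is a
simplified, CML-like fragment assumed for illustration only. The XSLT route
is not shown, since it needs a processor outside the standard library.

```python
import xml.sax
from xml.dom import minidom

# Simplified, CML-like fragment (an assumption for this example).
DOC = """<molecule id="water">
  <atom id="a1" elementType="O"/>
  <atom id="a2" elementType="H"/>
  <atom id="a3" elementType="H"/>
</molecule>"""


def count_elements_dom(xml_text):
    """DOM style: build the whole tree in memory, then query it."""
    doc = minidom.parseString(xml_text)
    counts = {}
    for atom in doc.getElementsByTagName("atom"):
        symbol = atom.getAttribute("elementType")
        counts[symbol] = counts.get(symbol, 0) + 1
    return counts


class AtomCounter(xml.sax.ContentHandler):
    """SAX style: react to each start tag as the parser streams past it."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        if name == "atom":
            symbol = attrs.get("elementType")
            self.counts[symbol] = self.counts.get(symbol, 0) + 1


def count_elements_sax(xml_text):
    handler = AtomCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


print(count_elements_dom(DOC))
print(count_elements_sax(DOC))
```

Both routes give the same counts; the DOM version is easier to write, while
the SAX version never holds the whole document in memory, which may matter
for large trajectory-style output.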
I certainly don't want to reinvent anything that has already been done - is
the GROMACS DTD reachable from the website or is it in a CVS repository?
>If you don't want to use configure you can of course make your code
>conditional (#ifdef) and compile with certain CPPFLAGS and LDFLAGS.
>Dr. David van der Spoel, Biomedical center, Dept. of Biochemistry
>Husargatan 3, Box 576, 75123 Uppsala, Sweden
>phone: 46 18 471 4205 fax: 46 18 511 755
>spoel at xray.bmc.uu.se spoel at gromacs.org http://zorn.bmc.uu.se/~spoel
Peter Murray-Rust, pm286 AT cam.ac.uk
Unilever Centre for Molecular Informatics, Chemistry Department
Lensfield Road, Cambridge, CB2 1EW, UK