[gmx-developers] libxml2

David van der Spoel spoel at xray.bmc.uu.se
Sat Nov 16 13:45:41 CET 2013


On 2013-11-14 06:18, Teemu Murtola wrote:
> Hi all,
>
> just a few quick comments on some technical details.
>
>           > On 2013-11-11 02:43, Erik Lindahl wrote:
>           >> I’m fine with having it as a hard dependency, provided we’ve had it
>           >> compile automatically during installs for a while without complaints (it
>           >> has been on by default for 4.6, right?).
>
>
> We've had it on by default if it is found for ages, but the build system
> has semi-silently dropped it from the build if it is not found, with no
> user-visible consequences, so I'm not sure whether that qualifies as the
> required testing.
>
>         On Nov 11, 2013 8:33 AM, "David van der Spoel"
>         <spoel at xray.bmc.uu.se> wrote:
>
>          > In addition, for many small files you don't need a DTD or schema
>         (and in fact there isn't one for these XML files), it's just that
>         the libxml2 library demands you put it into the file. If we're
>         talking rtp files then that's another matter where more structure
>         is needed.
>
>
> Where does this "demand from libxml2" come from? The unit testing stuff
> in master has been writing and reading XML files without DTDs or schemas
> for years now using libxml2, and no one has reported any issues. So I
> don't think there is any hard demand for a DTD in the XML file that it
> parses. Additionally, I think that referencing a non-existent DTD file
> serves no purpose whatsoever. I think that either you need to actually
> write that DTD, or remove the reference.
>
> On Wed, Nov 13, 2013 at 8:57 PM, David van der Spoel
> <spoel at xray.bmc.uu.se> wrote:
>
>         The thing is that for small files it doesn't matter, neither DTD
>         nor Schema is used if you don't need it. I still have a hard
>         time comprehending why we would like to mix e.g. simulation data
>         with all possible other stuff.
>
>
> I think others' point is that without DTD or Schema validation, you need
> to write a lot of validation stuff yourself, which is a lot of code, or
> just live with the fact that all kinds of malformed input can get
> accepted, which isn't much better than text files.
>
>         Just checked libxml++, but that introduces another dependency,
>         so that's out. I will draft a gromacs frontend in C++ for libxml2
>         with just a subset of the functionality. There is however one
>         issue: XML can be read in two fashions, using the DOM (Document
>         Object Model) and using SAX (Simple API for XML). Until now I
>         have used the DOM, which reads a whole document into memory, but
>         the memory usage can be prohibitive. SAX should therefore be the
>         preferred route. Any comments on that?
>
>
> I think that unless we need to read very big XML files, DOM is a lot
> more flexible. Parsing more complex data structures in SAX requires the
> handler that receives all the SAX callbacks to be a relatively complex
> state machine, as it needs to incrementally construct all the data structures.
> Code with the same level of functionality and modularity is probably a
> lot easier to write and understand if written using DOM. If you want to
> keep the ability to not load the whole document, using the third option
> in libxml2, the reader API, is probably a better idea.
>
> It would be nice if the frontend would also be able to abstract away the
> current usage of libxml2 in src/testutils/refdata.cpp. However, that is
> perhaps quite different from how most other Gromacs code will use it,
> so it may not be the highest priority. It is already quite well
> encapsulated in this single file. But this is something to think about
> in the design.
>
> Teemu
>
>
Some progress. How's this:

<?xml version="1.0" encoding="UTF-8"?>
<gromacs xmlns:gmx="http://www.gromacs.org/schemas">
   <sfactors type="Fourier" force_field="any" displaced_solvent="true" reference="">
     <sfactor residue="ALA" atom="MW" type="1">
       <a0 unit="e">
      10.0369
       </a0>
       <q0 unit="1/A">
            0
       </q0>
....
   </sfactors>
</gromacs>
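
To make the DOM versus SAX trade-off discussed above concrete, here is a
minimal sketch that reads a file of this shape both ways. Note the
assumptions: it uses Python's standard library (xml.dom.minidom and
xml.sax) as a stand-in, not the libxml2 C API and not the planned C++
frontend; the sample text closes the elided part of the file after a single
<sfactor> entry, which is a guess at the layout; and the names
read_with_dom, read_with_sax and SFactorHandler are hypothetical.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical, shortened version of the file above: the elided part is
# assumed to simply close after one <sfactor> entry.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<gromacs xmlns:gmx="http://www.gromacs.org/schemas">
  <sfactors type="Fourier" force_field="any" displaced_solvent="true" reference="">
    <sfactor residue="ALA" atom="MW" type="1">
      <a0 unit="e">
        10.0369
      </a0>
      <q0 unit="1/A">
        0
      </q0>
    </sfactor>
  </sfactors>
</gromacs>
"""


def read_with_dom(text):
    """DOM style: build the whole tree in memory, then walk it."""
    doc = xml.dom.minidom.parseString(text.encode("utf-8"))
    result = []
    for sf in doc.getElementsByTagName("sfactor"):
        entry = {"residue": sf.getAttribute("residue"),
                 "atom": sf.getAttribute("atom")}
        for child in sf.childNodes:
            if child.nodeType == child.ELEMENT_NODE:
                # The number is the element's text, padded with whitespace.
                value = "".join(n.data for n in child.childNodes
                                if n.nodeType == n.TEXT_NODE)
                entry[child.tagName] = float(value)
        result.append(entry)
    return result


class SFactorHandler(xml.sax.ContentHandler):
    """SAX style: callbacks arrive one at a time, so state is kept by hand."""

    def __init__(self):
        super().__init__()
        self.result = []
        self.current = None  # the <sfactor> entry being built, if any
        self.field = None    # name of the child element being read
        self.text = []       # text chunks collected for that child

    def startElement(self, name, attrs):
        if name == "sfactor":
            self.current = {"residue": attrs["residue"],
                            "atom": attrs["atom"]}
        elif self.current is not None:
            self.field = name
            self.text = []

    def characters(self, content):
        # Text may arrive in several pieces, so collect it until endElement.
        if self.field is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == "sfactor":
            self.result.append(self.current)
            self.current = None
        elif name == self.field:
            self.current[name] = float("".join(self.text))
            self.field = None


def read_with_sax(text):
    handler = SFactorHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.result
```

Both functions return the same list of dictionaries for the sample, but the
SAX version already needs three pieces of hand-maintained state for a
two-level structure, which is the state-machine cost Teemu describes. The
DOM version holds the full document in memory instead.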


-- 
David van der Spoel, Ph.D., Professor of Biology
Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone:	+46184714205.
spoel at xray.bmc.uu.se    http://folding.bmc.uu.se

