[gmx-developers] regex.h and boost in gromacs-master

Mirco Wahab mirco.wahab at chemie.tu-freiberg.de
Wed Mar 14 22:39:03 CET 2012


There was a short discussion on gerrit (gromacs-master) on how to
handle regular expressions in selections in future releases,
e.g. here:
https://gerrit.gromacs.org/#/c/551/7/src/gromacs/selection/tests/selectioncollection.cpp

I'm inclined to start a new thread for this ;-) The problem
here is, in my opinion, what would be the *best package*
to rely on with the least possible amount of surprises
in the future.

The (my) [-] candidates:

- PCRE (http://www.pcre.org/) would be just another
   dependency, so better not ...

- <regex> with Gcc (tr1, C++0x) won't work at all (not even
   in 4.6.3), see 
http://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#id476343

- <regex> with VS 2010 (tr1) will work with only the
   simplest expressions; anything moderately complicated
   will crash its engine on the initial regex compilation (*)

- <regex.h> is a POSIX header, not part of standard C or C++. It is
   a C library interface that provides (through regcomp()/regexec())
   basic matching and searching functionality (which might be ok).
   On Linux it is included in the glibc package (under /posix), and
   it might not be easily available for Win64 (if at all).
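
To give an idea of what the <regex.h> route would look like, here is a
minimal sketch of a match helper built on regcomp()/regexec(). The
helper name posix_regex_match is made up for illustration and is not
part of Gromacs. It assumes a POSIX system and uses extended (ERE)
syntax, where Perl-style shorthands like \d or \w are not part of the
standard.

```cpp
#include <regex.h>   // POSIX header; not shipped with MSVC
#include <string>

// Hypothetical helper (not Gromacs code): returns true if `text` contains
// a match for the POSIX extended regular expression `pattern`.
// Anchor the pattern with ^ and $ to require a full match.
bool posix_regex_match(const std::string& pattern, const std::string& text)
{
    regex_t re;
    // REG_NOSUB: we only need a yes/no answer, no submatch positions.
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0)
    {
        return false;  // pattern failed to compile
    }
    const int rc = regexec(&re, text.c_str(), 0, nullptr, 0);
    regfree(&re);      // always release the compiled pattern
    return rc == 0;
}
```

A call such as posix_regex_match("^A[0-9]+$", "A100") would then test
an atom name against a pattern. Compiling the pattern on every call is
wasteful; a real implementation would probably cache the regex_t.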

The (my) [+] candidates:

- Boost

Boost is /already/ included somehow in master (for smart_ptr,
scoped_ptr?), despite the [Allowed C++ Features] ruling:
     /Don't use Boost, except parts that all developers have
     agreed to be essential. These parts will be copied to
     the Gromacs source tree./

Boost is, imho, the only ubiquitous package that works
almost perfectly for complicated regexes in both unix and
windows environments. If we can agree on copying the
regex part into the 'minimal boost tree' of gromacs,
this problem would be solved.

There could be, for exotic environments with their own boost
already in place, some kind of '-with-external-boost' or
its CMake equivalent.

my 0,02€

Thanks & Regards

M.






(*) - e.g., this will match against the contents of a gromacs .gro
file but crashes the VS2010 <regex> engine (but not the Boost one):

    const char * MDATA::reg_gro =
    /*
    SOME_NAME
    1234
       1  ABC A100    1  44.455  32.113  39.983
    */
        "\\A(\\w+)[^\\n\\r]*[\\r\\n]+"
        "[ ]*(\\d+)[^\\n\\r]*[\\r\\n]+"
        "[ ]*\\d+"  "[ ]*[-_\\w]+"  "[ ]*[-_\\w]+"  "[ ]*\\d+"
        "[ ]*[\\d\\.]+"  "[ ]*[\\d\\.]+"  "[ ]*[\\d\\.]+"
     ;
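
For comparison, the same pattern can be tried with std::regex from a
C++11 standard library. This is only a sketch, and it uses a different
engine from the ones discussed in this mail: it assumes a toolchain
whose <regex> actually works (e.g. GCC 4.9 or later). The function name
parse_gro_header is hypothetical, and \A is replaced by ^ because the
default ECMAScript grammar has no \A.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not Gromacs code): checks that `text` starts like a
// .gro file (title line, atom count line, first atom line). On success it
// stores the first word of the title and the atom count, and returns true.
bool parse_gro_header(const std::string& text, std::string& title, int& natoms)
{
    static const std::regex re(
        "^(\\w+)[^\\n\\r]*[\\r\\n]+"       // title line, first word captured
        "[ ]*(\\d+)[^\\n\\r]*[\\r\\n]+"    // atom count line, number captured
        "[ ]*\\d+"  "[ ]*[-_\\w]+"  "[ ]*[-_\\w]+"  "[ ]*\\d+"   // first four fields
        "[ ]*[\\d\\.]+"  "[ ]*[\\d\\.]+"  "[ ]*[\\d\\.]+");      // x, y, z
    std::smatch m;
    if (!std::regex_search(text, m, re))
    {
        return false;
    }
    title  = m[1].str();
    natoms = std::stoi(m[2].str());
    return true;
}
```

Fed with the three sample lines from the comment above, this should
report the title SOME_NAME and an atom count of 1234.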



