[gmx-developers] regex.h and boost in gromacs-master

Mark Abraham Mark.Abraham at anu.edu.au
Thu Mar 15 02:56:17 CET 2012


On 15/03/2012 10:16 AM, Roland Schulz wrote:
> Hi,
>
>
> On Wed, Mar 14, 2012 at 5:39 PM, Mirco Wahab 
> <mirco.wahab at chemie.tu-freiberg.de> wrote:
>
>     There was a short discussion on gerrit (gromacs-master) on how to
>     consider regular expressions in selections in future releases,
>     eg. here:
>     https://gerrit.gromacs.org/#/c/551/7/src/gromacs/selection/tests/selectioncollection.cpp
>
>     I'm inclined to start a new thread for this ;-) The problem
>     here is, in my opinion, what would be the *best package*
>     to rely on with the least possible amount of surprises
>     in the future.
>
>     The (my) [-] candidates:
>
>     - PCRE (http://www.pcre.org/) would be just another
>       dependency, so better not ...
>
>     - <regex> with Gcc (tr1, C++0x) won't work at all (not even
>       in 4.6.3), see
>     http://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#id476343
>
>     - <regex> with VS 2010 (tr1) will work with only the
>       most simple expressions, anything moderately complicated
>       will crash its engine on the initial regex compilation (*)
>
>     - <regex.h> is something GNU/GCC specific. A C-library that provides
>       (through regcomp()/regexec()) a basic matching and searching
>       functionality (which might be ok). It is included in the glibc
>       package (under /posix) and might not be easily available for
>       Win64 (if at all).
>
>     The (my) [+] candidates:
>
>     - Boost
>
>
> I agree with your conclusion that boost would be the best option. But...
>
>
>     Boost is /already/ included somehow in master (for smart_ptr,
>     scoped_ptr?), despite the [Allowed C++ Features] ruling:
>         /Don't use Boost, except parts that all developers have
>         agreed to be essential. These parts will be copied to
>         the Gromacs source tree./
>
>
> This is a problem. regex is not a header-only library. Unlike the 
> libraries we currently include (exception and smart_ptr (shared_ptr, 
> scoped_ptr)), which are header-only, regex has to be compiled. 
> Someone would need to look into how best to compile it. There are at 
> least two ways to compile a boost subset included in Gromacs:
> - use the standard method of bjam and also ship and autocompile bjam. 
> Copying bjam is supported by bcp.
> - use cmake to compile. Either write the cmake files yourself or use 
> one of the existing cmake build scripts for boost: 
> http://gitorious.org/boost/cmake or 
> http://ryppl.github.com/gettingstarted.html
>
> Currently we use bcp to generate the subset of boost we include 
> (see src/external/boost/README). With the cmake/boost on gitorious I'm 
> not sure how to create such a subset. The ryppl based one is supposed 
> to support this but I'm not sure how to do it. 
> http://boost.2283326.n4.nabble.com/How-to-use-BCP-td3629743.html has a 
> bit more detail on the different options and problems. If you could 
> look into the issue of how to build boost-regex within gromacs that 
> would be great.

Boost has a header-only regex library, Xpressive 
(http://www.boost.org/doc/libs/1_49_0/doc/html/xpressive.html). It supports 
both dynamic regexes that are compiled at run time (good for things like 
selections) and static regexes written in a C++-language format (good for 
things like parsing input files). It does use some parts of Boost mpl, 
which I gather is less than desirable, but I'm not sure why.

>
> BTW: Being able to include linked boost libraries into the included 
> boost would help us with more than just regex. I think we could benefit 
> greatly from using Boost::MPI in non-performance critical parts of the 
> code (e.g. bcast_ir_mtop and global_stat) to improve performance, 
> scalability AND maintainability.

Hell yes.

Mark

>
> Of course we could also not include boost regex into the boost 
> subselection we include in the Gromacs code. Then the regex part would 
> require boost to be available.
>
> Roland
>
>
>     Boost is, imho, the only ubiquitous package that works
>     almost perfectly for complicated regexes in unix and
>     windows environments. If we can agree on copying
>     the regex part into the 'minimal boost tree' of gromacs,
>     this problem would be solved.
>
>     There could be, for exotic environments with their own boost
>     already in place, some kind of '-with-external-boost' or
>     its CMake equivalent.
>
>     my 0,02EUR
>
>     Thanks & Regards
>
>     M.
>
>
>
>
>
>
>     (*) - e.g., this will match against the contents of a gromacs .gro
>     file but crashes the VS2010 <regex> engine (but not the Boost one):
>
>        const char * MDATA::reg_gro =
>        /*
>        SOME_NAME
>        1234
>           1  ABC A100    1  44.455  32.113  39.983
>        */
>            "\\A(\\w+)[^\\n\\r]*[\\r\\n]+"
>            "[ ]*(\\d+)[^\\n\\r]*[\\r\\n]+"
>            "[ ]*\\d+"  "[ ]*[-_\\w]+"  "[ ]*[-_\\w]+"  "[ ]*\\d+"
>            "[ ]*[\\d\\.]+"  "[ ]*[\\d\\.]+"  "[ ]*[\\d\\.]+"
>         ;
>
>
> -- 
> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov <http://cmb.ornl.gov>
> 865-241-1537, ORNL PO BOX 2008 MS6309
>
>
