[gmx-developers] grompp .mdp processing
David van der Spoel
spoel at xray.bmc.uu.se
Mon Feb 23 10:45:12 CET 2009
hessb at mpip-mainz.mpg.de wrote:
> Hi,
>
> I would prefer to have the multiple mdp option switch
> in the mdp file, not as an option to grompp.
> In that way the mdp files are at least internally consistent.
Isn't that something else?
That requires looping over something, e.g. lambda values.
>
> Berk
>
>> On Feb 23, 2009, at 00:58 , Mark Abraham wrote:
>>
>>> David van der Spoel wrote:
>>>> Mark Abraham wrote:
>>>>> Sander Pronk wrote:
>>>>>> Hi everybody,
>>>>>>
>>>>>> I've made some changes to grompp (not committed yet) that:
>>>>>>
>>>>>> - will allow the use of cpp-style #include and #define in .mdp
>>>>>> files (useful for setting up multiple similar simulations, but
>>>>>> also for tutorials).
>>>> I assume you have used the existing cpp library?
>>>>> That looks good to me. I also generate my .mdp files with scripts,
>>>>> but this feature would enable me to avoid that.
>>>> I'm not sure, nowadays one needs to do series of simulations
>>>> anyway, so scripting is a necessary evil.
>>> If #define is useful in the .mdp file, then the value of
>>> preprocessor variables must be being used - so probably that needs
>>> #ifdef and #if and such also. Anyway, if variable interpretation is
>>> being done, an .mdp file line like
>>>
>>> gen_seed = SEED_FROM_COMMAND_LINE
>>>
>>> enables a master script to call grompp with (say)
>>>
>>> grompp -DSEED_FROM_COMMAND_LINE=23441
>>>
>>> That adds a lot of complexity to the interpretation, however. Have I
>>> misunderstood Sander's intent with the use of #define?
>> That's one of the things that are possible - it's useful for free
>> energy simulations where the same simulation has to be run at
>> different lambdas.
>>
>> The other nice thing is being able to #include, which enables the use
>> of system-specific 'standard includes' with good settings that remain
>> consistent from structure energy minimization to production run.
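The #define, -D and #include ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how grompp or its cpp library works: the function name preprocess_mdp is hypothetical, included files are passed as an in-memory dictionary instead of being read from disk, and #ifdef/#if are not handled.

```python
import re


def preprocess_mdp(text, defines=None, includes=None):
    """Expand #include and #define lines, then substitute macro names.

    defines  -- stands in for -DNAME=VALUE given on the command line
    includes -- maps an include name to its text (hypothetical stand-in
                for reading files, so the sketch needs no I/O)
    """
    macros = {name: str(value) for name, value in (defines or {}).items()}
    includes = includes or {}
    out = []

    def expand(line):
        # Replace whole identifiers only, so 'LAMBDA' never matches
        # inside a longer name such as 'init_lambda'.
        return re.sub(r"\b[A-Za-z_]\w*\b",
                      lambda m: macros.get(m.group(0), m.group(0)),
                      line)

    def process(chunk):
        for line in chunk.splitlines():
            words = line.split()
            if words and words[0] == "#include":
                process(includes[words[1].strip('"<>')])
            elif words and words[0] == "#define":
                macros[words[1]] = " ".join(words[2:])
            else:
                out.append(expand(line))

    process(text)
    return "\n".join(out)


mdp = ('#include "common.mdp"\n'
       '#define LAMBDA 0.1\n'
       'init_lambda = LAMBDA\n'
       'gen_seed = SEED_FROM_COMMAND_LINE\n')
print(preprocess_mdp(mdp,
                     defines={"SEED_FROM_COMMAND_LINE": 23441},
                     includes={"common.mdp": "tcoupl = berendsen\n"}))
```

A master script would then vary only the defines dictionary (seed, lambda) while the shared settings stay in the included text.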
>>
>>>
>>>>>> - allows multiple assignments of .mdp parameters, through
>>>>>> overrides so that the last assignment is the one that counts.
>>>>> Doing so always and silently would be asking for trouble, however
>>>>> if they're only enabled with -m, and come with a note to the user
>>>>> when they've occurred, that should be useful in a few corner cases.
>>>> And it would reverse the current policy, which is that the first
>>>> assignment wins, implying that one may get different results with the
>>>> same input.
>>>> Further, I feel a bit uncomfortable with extending the mdp files
>>>> further, because we should rather move away from the endless list
>>>> of options. I haven't thought this through, but I would prefer to
>>>> move to a slightly more complex format that, however, is more user
>>>> friendly. Thinking of a folding editor file (xml springs to mind).
>>>> It should still be possible to generate using a script, but there
>>>> are xml bindings for Perl and Python as well.
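The difference between the two duplicate-assignment policies discussed above (first assignment wins, as the thread describes the current behaviour, versus last assignment wins with a note to the user) can be made concrete with a small Python sketch. The function name parse_mdp and the last_wins switch are hypothetical; the switch merely plays the role of the proposed -m option.

```python
def parse_mdp(text, last_wins=False):
    """Parse 'key = value' lines and report repeated assignments.

    Returns (params, notes). With last_wins=False the first assignment
    is kept; with last_wins=True a later assignment overrides it.
    Either way a note is recorded so the override is never silent.
    """
    params, notes = {}, []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split(";", 1)[0].strip()  # ';' starts a comment in .mdp
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in params:
            kept = value if last_wins else params[key]
            notes.append(f"line {lineno}: '{key}' assigned again; "
                         f"keeping '{kept}'")
            if not last_wins:
                continue
        params[key] = value
    return params, notes


text = "nsteps = 100\nnsteps = 500 ; override\n"
print(parse_mdp(text))                  # first assignment kept
print(parse_mdp(text, last_wins=True))  # last assignment kept
```

The same input yields different parameters under the two policies, which is the reproducibility concern raised above; the notes list is one way to keep that visible.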
>>> Merely wrapping the "endless list of options" into an endless list
>>> of XML tags (say), gets you something like the following XML
>>>
>>> <mdp>
>>> ...
>>> <temperaturecoupling type="berendsen">
>>> <groups>
>>> <group tau_t="0.1" ref_t="298">Protein</group>
>>> <group tau_t="0.1" ref_t="298">Non-Protein</group>
>>> </groups>
>>> </temperaturecoupling>
>>> ...
>>> </mdp>
>>>
>>> This is arguably
>>>
>>> * more or less complex (links between different options like
>>> tc_groups and tau_t and ref_t are now explicit and can be tested for
>>> by validating against the DTD before we try to parse it; but there's
>>> a bunch of formatting constraints) and
>>> * more or less user friendly (the data is now structured and the
>>> format adds meaning to content; but there's all this visual cruft
>>> and users might feel constrained to need to learn an XML editor;
>>> increasingly the latter will become a requisite IT skill).
>>>
>>> Trying to wrap #ifdef-style XML conditional structures into the
>>> above would be a bit ugly... say
>>>
>>> ...
>>> <variable name="USING_RF"/>
>>> ...
>>> <if test="//variable[@name='USING_RF']">
>>> <group tau_t="0.01" ref_t="298">Protein</group>
>>> </if>
>>> ...
>>>
>>> where that test construct is an XPath expression.
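To show what evaluating such a conditional might involve, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The <mdp>, <variable> and <if> elements are the hypothetical format from this thread, and resolve_conditionals is an invented name. ElementTree supports only a subset of XPath and rejects absolute paths on an element, so the sketch assumes every test starts with '//' and anchors it at the root by prefixing '.'.

```python
import xml.etree.ElementTree as ET

MDP_XML = """
<mdp>
  <variable name="USING_RF"/>
  <temperaturecoupling type="berendsen">
    <groups>
      <if test="//variable[@name='USING_RF']">
        <group tau_t="0.01" ref_t="298">Protein</group>
      </if>
      <group tau_t="0.1" ref_t="298">Non-Protein</group>
    </groups>
  </temperaturecoupling>
</mdp>
"""


def resolve_conditionals(root):
    """Splice the children of each satisfied <if> into its parent.

    An <if> whose test matches nothing is dropped with its contents.
    """
    def visit(parent):
        resolved = []
        for child in list(parent):
            if child.tag == "if":
                # Assumption: the test begins with '//'; '.' makes it a
                # relative path that ElementTree accepts.
                if root.findall("." + child.get("test")):
                    visit(child)
                    resolved.extend(child)
            else:
                visit(child)
                resolved.append(child)
        parent[:] = resolved

    visit(root)
    return root


root = resolve_conditionals(ET.fromstring(MDP_XML))
for group in root.iter("group"):
    print(group.text, group.get("tau_t"), group.get("ref_t"))
```

A full XPath engine would be needed for anything beyond this simple existence test, which is part of the complexity being weighed here.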
>>>
>>> Such XML is readily generatable with scripts. For example, in Perl
>>> using XML::Writer you get lines like
>>>
>>> if ( $do_rf ) {
>>> $do_rf = 'USING_RF';
>>> $tau_t = 0.01;
>>> $ref_t = 298;
>>> ...
>>> $xml->startTag("if", "test" => "//variable[\@name='${do_rf}']");
>>> foreach my $group ("Protein", "Non-Protein") {
>>> $xml->dataElement("group", $group, "tau_t" => $tau_t, "ref_t" =>
>>> $ref_t);
>>> }
>>> $xml->endTag("if");
>>> }
>>>
>>> if you wanted to leave the XML-level conditionals in place, or more
>>> likely
>>>
>>> if ( $do_rf ) {
>>> $tau_t = 0.01;
>>> $ref_t = 298;
>>> ...
>>> foreach my $group ("Protein", "Non-Protein") {
>>> $xml->dataElement("group", $group, "tau_t" => $tau_t, "ref_t" =>
>>> $ref_t);
>>> }
>>> }
>>>
>>> It is also straightforward to provide scripts for the .mdp <-> <mdp>
>>> conversions in the change-over period. One would use Perl for both,
>>> though technically XSLT is probably the best tool for the <mdp> ->
>>> .mdp conversion.
>>> I'm far from convinced that the increase in usability outweighs the
>>> need for people to learn how to manage all this XML stuff, however.
>>>
>>
>> I think I agree; the most annoying aspect of the current .mdp files is
>> the lack of information hierarchy: it's hard to make out what's
>> important and what is not, and xml wouldn't be the ideal format for
>> what is essentially an options list. In many cases there is a sensible
>> default (like 'xyz' for pbc, or 'no' for free_energy), and deviations
>> from the default need to be stated only where a run actually requires
>> them (hence the need for multiple assignment - together with #include
>> it would allow for better management of default settings).
>>
>> Perhaps just adding some syntax to enforce related settings would make
>> the structure clearer:
>>
>> free_energy
>> {
>> on = true
>> init_lambda = 0.1
>> delta_lambda = 0
>> soft-core
>> {
>> power = 1
>> alpha = 0.5
>> sigma = 0.3
>> }
>> }
>>
>> or, if no free energy calculation is required:
>>
>> free_energy
>> {
>> on = false
>> }
>>
>> or no free energy section at all.
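A parser for the brace-delimited layout proposed above need not be complicated. The Python sketch below reads it into nested dictionaries. It assumes exactly the layout shown in the example (block name, '{' and '}' each on their own line, one 'key = value' per line); parse_blocks is a hypothetical name and the syntax itself is only a proposal in this thread.

```python
def parse_blocks(text):
    """Parse 'name { key = value ... }' blocks into nested dicts."""
    root = {}
    stack = [root]   # innermost open block is stack[-1]
    pending = None   # block name seen, waiting for its '{'
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "{":
            if pending is None:
                raise ValueError("'{' without a preceding block name")
            stack.append(stack[-1].setdefault(pending, {}))
            pending = None
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unmatched '}'")
            stack.pop()
        elif "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            stack[-1][key] = value
        else:
            pending = line  # a bare word names the next block
    if len(stack) != 1:
        raise ValueError("unclosed block")
    return root


example = """
free_energy
{
on = true
init_lambda = 0.1
delta_lambda = 0
soft-core
{
power = 1
alpha = 0.5
sigma = 0.3
}
}
"""
settings = parse_blocks(example)
print(settings["free_energy"]["soft-core"]["alpha"])
```

A missing free_energy section simply leaves the key absent, so defaults can be applied with settings.get("free_energy", {}).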
>>
>>
>>
>> BTW, personally I don't see a reason why the parameter file shouldn't
>> be Turing-complete. :-)
>>
>>
>>
>> _______________________________________________
>> gmx-developers mailing list
>> gmx-developers at gromacs.org
>> http://www.gromacs.org/mailman/listinfo/gmx-developers
>> Please don't post (un)subscribe requests to the list. Use the
>> www interface or send it to gmx-developers-request at gromacs.org.
>>
--
David van der Spoel, Ph.D., Professor of Biology
Molec. Biophys. group, Dept. of Cell & Molec. Biol., Uppsala University.
Box 596, 75124 Uppsala, Sweden. Phone: +46184714205. Fax: +4618511755.
spoel at xray.bmc.uu.se spoel at gromacs.org http://folding.bmc.uu.se