[gmx-developers] grompp .mdp processing

hessb at mpip-mainz.mpg.de hessb at mpip-mainz.mpg.de
Mon Feb 23 10:25:04 CET 2009


Hi,

I would prefer to have the switch that enables multiple assignments
in the mdp file itself, not as an option to grompp.
That way the mdp files are at least internally consistent.

Berk

>
> On Feb 23, 2009, at 00:58 , Mark Abraham wrote:
>
>> David van der Spoel wrote:
>>> Mark Abraham wrote:
>>>> Sander Pronk wrote:
>>>>> Hi everybody,
>>>>>
>>>>> I've made some changes to grompp (not committed yet) that:
>>>>>
>>>>> - will allow the use of cpp-style #include and #define in .mdp
>>>>> files (useful for setting up multiple similar simulations, but
>>>>> also for tutorials).
>>> I assume you have used the existing cpp library?
>>>>
>>>> That looks good to me. I also generate my .mdp files with scripts,
>>>> but this feature would enable me to avoid that.
>>> I'm not sure, nowadays one needs to do series of simulations
>>> anyway, so scripting is a necessary evil.
>>
>> If #define is useful in the .mdp file, then the value of
>> preprocessor variables must be being used - so probably that needs
>> #ifdef and #if and such also. Anyway, if variable interpretation is
>> being done, an .mdp file line like
>>
>> gen_seed = SEED_FROM_COMMAND_LINE
>>
>> enables a master script to call grompp with (say)
>>
>> grompp -DSEED_FROM_COMMAND_LINE=23441
>>
>> That adds a lot of complexity to the interpretation, however. Have I
>> misunderstood Sander's intent with the use of #define?
>
> That's one of the things that are possible - it's useful for free
> energy simulations where the same simulation has to be run at
> different lambdas.
>
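
A minimal sketch of the command-line define idea discussed above, written in Python rather than as grompp code. The function name substitute_defines and the LAMBDA macro are hypothetical; only the gen_seed line comes from the thread. It replaces whole-word macro names on the value side of each assignment and leaves comment lines (starting with ';') alone.

```python
import re


def substitute_defines(mdp_text, defines):
    """Replace whole-word macro names in parameter values with their definitions."""
    out = []
    for line in mdp_text.splitlines():
        # Only touch "key = value" lines that are not comments.
        if "=" in line and not line.lstrip().startswith(";"):
            key, value = line.split("=", 1)
            for name, replacement in defines.items():
                pattern = r"\b" + re.escape(name) + r"\b"
                value = re.sub(pattern, lambda m, r=replacement: str(r), value)
            line = key + "=" + value
        out.append(line)
    return "\n".join(out)


template = "gen_seed = SEED_FROM_COMMAND_LINE\ninit_lambda = LAMBDA"
print(substitute_defines(template, {"SEED_FROM_COMMAND_LINE": 23441, "LAMBDA": 0.1}))
```

A master script could call this once per lambda value to produce a series of otherwise identical parameter files.
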
> The other nice thing is being able to #include, which enables the use
> of system-specific 'standard includes' with good settings that remain
> consistent from structure energy minimization to production run.
>
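
The 'standard include' idea can be sketched as follows, again in Python and only as an illustration. The file names, the in-memory files dictionary, and the function expand_includes are assumptions made to keep the example self-contained; a real implementation would read from disk and would presumably reuse the existing cpp code.

```python
def expand_includes(name, files, _stack=()):
    """Return the lines of files[name] with #include directives expanded in place."""
    if name in _stack:
        raise ValueError("recursive #include of %s" % name)
    lines = []
    for line in files[name].splitlines():
        stripped = line.strip()
        if stripped.startswith("#include"):
            # Accept the form: #include "other.mdp"
            target = stripped[len("#include"):].strip().strip('"')
            lines.extend(expand_includes(target, files, _stack + (name,)))
        else:
            lines.append(line)
    return lines


# Hypothetical file contents: one shared include, one run-specific file.
files = {
    "standard.mdp": "pbc = xyz\ncoulombtype = PME",
    "em.mdp": '#include "standard.mdp"\nintegrator = steep',
}
print(expand_includes("em.mdp", files))
```
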
>>
>>
>>>>
>>>>> - allows multiple assignments of .mdp parameters, through
>>>>> overrides so that the last assignment is the one that counts.
>>>>
>>>> Doing so always and silently would be asking for trouble, however
>>>> if they're only enabled with -m, and come with a note to the user
>>>> when they've occurred, that should be useful in a few corner cases.
>>> And it would reverse the current policy, that is first option goes,
>>> implying that one may get different results with the same input.
>>> Further, I feel a bit uncomfortable with extending the mdp files
>>> further, because we should rather move away from the endless list
>>> of options. I haven't thought this through, but I would prefer to
>>> move to a slightly more complex format that, however, is more user
>>> friendly. Thinking of a folding editor file (xml springs to mind).
>>> It should still be possible to generate using a script, but there
>>> are xml bindings for Perl and Python as well.
>>
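
To make the two assignment policies from this exchange concrete, here is a small Python sketch. It is not grompp's parser; the function parse_mdp and its allow_override flag are hypothetical stand-ins for the proposed -m behaviour. By default the first assignment wins (the current policy described above); with the flag set, the last one wins, and in both cases a note is recorded so nothing happens silently.

```python
def parse_mdp(lines, allow_override=False):
    """Parse 'key = value' lines; return (params, notes about duplicates)."""
    params, notes = {}, []
    for line in lines:
        line = line.split(";", 1)[0].strip()  # drop ';' comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in params:
            if allow_override:
                notes.append("NOTE: %s overridden: %s -> %s" % (key, params[key], value))
                params[key] = value
            else:
                notes.append("NOTE: duplicate %s ignored, keeping %s" % (key, params[key]))
            continue
        params[key] = value
    return params, notes


lines = ["tcoupl = no", "tcoupl = berendsen"]
print(parse_mdp(lines))                       # first assignment wins
print(parse_mdp(lines, allow_override=True))  # last assignment wins
```
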
>> Merely wrapping the "endless list of options" into an endless list
>> of XML tags (say), gets you something like the following XML
>>
>> <mdp>
>>  ...
>>  <temperaturecoupling type="berendsen">
>>    <groups>
>>      <group tau_t="0.1" ref_t="298">Protein</group>
>>      <group tau_t="0.1" ref_t="298">Non-Protein</group>
>>    </groups>
>>  </temperaturecoupling>
>>  ...
>> </mdp>
>>
>> This is arguably
>>
>> * more or less complex (links between different options like
>> tc_grps and tau_t and ref_t are now explicit and can be tested for
>> by validating against the DTD before we try to parse it; but there's
>> a bunch of formatting constraints) and
>> * more or less user friendly (the data is now structured and the
>> format adds meaning to content; but there's all this visual cruft
>> and users might feel constrained to need to learn an XML editor;
>> increasingly the latter will become a requisite IT skill).
>>
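
As an illustration of checking such links before use, the sketch below parses the example document with Python's xml.etree.ElementTree and verifies that every group carries both attributes. Note that this is a hand-written check, not the DTD validation mentioned above (ElementTree does not validate against a DTD); the function name check_groups is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<mdp>
  <temperaturecoupling type="berendsen">
    <groups>
      <group tau_t="0.1" ref_t="298">Protein</group>
      <group tau_t="0.1" ref_t="298">Non-Protein</group>
    </groups>
  </temperaturecoupling>
</mdp>"""


def check_groups(root):
    """Return a list of problems: groups lacking tau_t or ref_t."""
    problems = []
    for group in root.iter("group"):
        for attr in ("tau_t", "ref_t"):
            if attr not in group.attrib:
                problems.append("%s: missing %s" % (group.text, attr))
    return problems


root = ET.fromstring(XML)
print(check_groups(root))
```
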
>> Trying to wrap #ifdef-style XML conditional structures into the
>> above would be a bit ugly... say
>>
>> ...
>> <variable name="USING_RF"/>
>> ...
>> <if test="//variable[@name='USING_RF']">
>>  <group tau_t="0.01" ref_t="298">Protein</group>
>> </if>
>> ...
>>
>> where that test construct is an XPath expression.
>>
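
A rough Python sketch of how such conditionals could be resolved, assuming the element and attribute names from the fragment above. The function resolve_conditionals is hypothetical. ElementTree supports only a subset of XPath and requires relative paths, so a test starting with "//" is rewritten to ".//" before evaluation.

```python
import xml.etree.ElementTree as ET


def resolve_conditionals(parent, root):
    """Replace each <if test="..."> by its children when the test matches, else drop it."""
    for child in list(parent):
        resolve_conditionals(child, root)  # handle nested <if> elements first
        if child.tag != "if":
            continue
        test = child.get("test")
        path = "." + test if test.startswith("//") else test
        index = list(parent).index(child)
        parent.remove(child)
        if root.find(path) is not None:
            for offset, kept in enumerate(list(child)):
                parent.insert(index + offset, kept)


doc = ET.fromstring("""<mdp>
  <variable name="USING_RF"/>
  <groups>
    <if test="//variable[@name='USING_RF']">
      <group tau_t="0.01" ref_t="298">Protein</group>
    </if>
  </groups>
</mdp>""")
resolve_conditionals(doc, doc)
print([g.text for g in doc.iter("group")])
```
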
>> Such XML is readily generatable with scripts. For example, in Perl
>> using XML::Writer you get lines like
>>
>> if ( $do_rf ) {
>>  $do_rf = 'USING_RF';
>>  $tau_t = 0.01;
>>  $ref_t = 298;
>> ...
>>  $xml->startTag("if", "test" => "//variable[\@name='${do_rf}']");
>>  foreach my $group ("Protein", "Non-Protein") {
>>    $xml->dataElement("group", $group,
>>                      "tau_t" => $tau_t, "ref_t" => $ref_t);
>>  }
>>  $xml->endTag("if");
>> }
>>
>> if you wanted to leave the XML-level conditionals in place, or more
>> likely
>>
>> if ( $do_rf ) {
>>  $tau_t = 0.01;
>>  $ref_t = 298;
>> ...
>>  foreach my $group ("Protein", "Non-Protein") {
>>    $xml->dataElement("group", $group,
>>                      "tau_t" => $tau_t, "ref_t" => $ref_t);
>>  }
>> }
>>
>> It is also straightforward to provide scripts for the .mdp <-> <mdp>
>> conversions in the change-over period. One would use Perl for both,
>> though technically XSLT is probably the best tool for the
>> <mdp> -> .mdp conversion.
>>
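
For the .mdp -> <mdp> direction, a conversion could look roughly like the Python sketch below. It covers only the temperature-coupling options and assumes the flat values are whitespace-separated lists of equal length; the function tcoupl_to_xml and the target element names (taken from the XML example earlier in the thread) are illustrative, not an agreed format.

```python
import xml.etree.ElementTree as ET


def tcoupl_to_xml(params):
    """Build a <temperaturecoupling> element from flat tcoupl/tc_grps/tau_t/ref_t values."""
    names = params["tc_grps"].split()
    taus = params["tau_t"].split()
    refs = params["ref_t"].split()
    if not (len(names) == len(taus) == len(refs)):
        raise ValueError("tc_grps, tau_t and ref_t must have the same number of entries")
    tc = ET.Element("temperaturecoupling", type=params["tcoupl"])
    groups = ET.SubElement(tc, "groups")
    for name, tau, ref in zip(names, taus, refs):
        ET.SubElement(groups, "group", tau_t=tau, ref_t=ref).text = name
    return tc


flat = {
    "tcoupl": "berendsen",
    "tc_grps": "Protein Non-Protein",
    "tau_t": "0.1 0.1",
    "ref_t": "298 298",
}
print(ET.tostring(tcoupl_to_xml(flat), encoding="unicode"))
```

The length check is the kind of link between options that the flat format leaves implicit.
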
>> I'm far from convinced that the increase in usability outweighs the
>> need for people to learn how to manage all this XML stuff, however.
>>
>
>
> I think I agree; the most annoying aspect of the current .mdp files is
> the lack of information hierarchy: it's hard to make out what's
> important and what is not, and xml wouldn't be the ideal format for
> what is essentially an options list. In many cases there is a sensible
> default (like 'xyz' for pbc, or 'no' for free_energy), and deviations
> from the default are needed only in specific cases (hence
> the need for multiple assignment - together with #include it would
> allow for better management of default settings).
>
> Perhaps just adding some syntax to enforce related settings would make
> the structure clearer:
>
> free_energy
> {
> 	on = true
> 	init_lambda = 0.1
> 	delta_lambda = 0
> 	soft-core
> 	{
> 		power = 1
> 		alpha = 0.5
> 		sigma = 0.3
> 	}
> }
>
> or, if no free energy calculation is required:
>
> free_energy
> {
> 	on = false
> }
>
> or no free energy section at all.
>
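
A block syntax like the one proposed above is straightforward to read into nested dictionaries. The Python sketch below assumes exactly the layout shown (section name, '{' and '}' each on their own line, 'key = value' assignments); the function parse_blocks is hypothetical and values are kept as strings.

```python
def parse_blocks(text):
    """Parse brace-delimited sections into nested dicts of string values."""
    root = {}
    stack = [root]
    pending = None  # section name waiting for its opening brace
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "{":
            if pending is None:
                raise ValueError("'{' without a section name")
            section = {}
            stack[-1][pending] = section
            stack.append(section)
            pending = None
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        elif "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            stack[-1][key] = value
        else:
            pending = line
    if len(stack) != 1:
        raise ValueError("missing '}'")
    return root


example = """free_energy
{
    on = true
    init_lambda = 0.1
    delta_lambda = 0
    soft-core
    {
        power = 1
        alpha = 0.5
        sigma = 0.3
    }
}"""
print(parse_blocks(example))
```
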
>
>
> BTW, personally I don't see a reason why the parameter file shouldn't
> be Turing-complete.  :-)
>
>
>
