[gmx-developers] Conformation representation and energy function

Mark Abraham mark.abraham at anu.edu.au
Tue Oct 5 15:42:43 CEST 2010

----- Original Message -----
From: Martin Kamp Jensen <martin.kamp.jensen at gmail.com>
Date: Tuesday, October 5, 2010 22:21
Subject: [gmx-developers] Conformation representation and energy function
To: Discussion list for GROMACS development <gmx-developers at gromacs.org>

> Hello,
> I would like to manipulate the conformation of a peptide inside of GROMACS. More specifically, I want to add a method on the same level as do_steep, do_nm, etc. (in mdlib/minimize.c) that repeatedly looks up the energy value and in between changes the conformation of a peptide by changing the dihedral angles. To be honest, I am overwhelmed by the data structures and where to begin. I have looked at http://www.gromacs.org/Developer_Zone/Programming_Guide/Data_Structures without any luck. > 
> I am a computer science student that is about to do my master's thesis. That plan is to, among other things, investigate the energy landscapes for (small) peptides and create barrier trees. I will need to use the energy function of GROMACS as a "black box", but of course I need to know how to integrate with it. I have no knowledge of biochemistry :) (My contacts do, of course...)

Because you are interested in small peptides, calling GROMACS to do EM or single energy evaluations from an external script will be fine. There is negligible overhead for spawning a new simulation process when that simulation system is so small that parallelization would not show a gain. You'll spend much less developer time, even if the result does not feel as elegant as one might like :-) It will be straightforward for you to measure how long 1000 EM runs take on a small peptide.
> My goals are as follows.> 
> 1) Find as many local minima as possible in the energy landscape of a (small) peptide. This could probably be implemented in an external program that calls GROMACS multiple times as the energy minimization features promise to find a local mimima close to the starting conformation. Another possibility would be to integrate this functionality in GROMACS and then maybe write the local minima to a trajectory file and read them in later for use in 2). (How to know that I found all the local minima/the search is converged will be one of my challenges.)

Calling GROMACS from an external program seems clearly best. In your case, the overhead from calling GROMACS will be larger than the actual calculation. Your time is much better spent devising search heuristics than coding in C :-)
> 2) Find a way to "connect" pairs of local minima, i.e., what is the lowest barrier connecting a pair of local minima. This is to be done using heuristics to change one conformation to another (change the dihedral angels) step by step. (Of course, the heuristics will be another one of my challenges.) However, to do this, I will need to call the energy function (evaluate_energy in mdlib/minimize.c) each time I change a conformation to guide the heuristics. I am aware that changing the conformation after GROMACS (mdrun) has parsed the binary run input file will probably remove the possibility of using MPI because data has been distributed in some specific way (maybe it is also a problem when using multiple threads on the same machine). I will just have to accept that for now and then maybe in the future look into the possibility of redistributing data after changing a conformation. However, right now I do not even know how to change the conformation/the dihedral angles...

I still don't think you want to get involved with GROMACS code. Generating a conformation external to GROMACS, writing that to a file, spawning GROMACS to get the energy and parsing the resulting .edr file is not actually terrible in your case.

Even if that is too inefficient for a small peptide, it will be possible to cast either the search for a heuristic, or the heuristic itself in such a way that you can generate a pseudo-trajectory of conformations for which you want an energy. Then mdrun -rerun is appropriate, per my first suggestion. This reduces the size of the overheads, though you do have to get somewhat involved with reading and writing GROMACS file formats. That could be as simple as linking to the libraries, however.

And if you still want to get your hands dirty inside GROMACS, I think you should model your approach on the functionality of mdrun -rerun. You want GROMACS to find the energy of a potentially infinite set of coordinates that appear magically, and that is what mdrun -rerun does. The magic source is a file in its case, but that is easy to rework for your context. Get out a debugger and step through the code doing a rerun of a small box of water :-) Once you've got a feel for what's going on, now you can pay attention to the data structures of importance (basically, "state->x").


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://maillist.sys.kth.se/pipermail/gromacs.org_gmx-developers/attachments/20101006/06f82481/attachment.html>

More information about the gromacs.org_gmx-developers mailing list