[gmx-developers] Re: [Fsatom] Re: CML and macromolecules

Mon Oct 20 02:27:40 CEST 2003

On Sun, 2003-10-19 at 22:43, Peter Murray-Rust wrote:
> At 21:16 19/10/2003 +0200, David wrote:
> >Hi fsatoms,
> 
> Great to hear from you David - I gather you were at the tutorial
Indeed.

> 
> >I'm ready for the first discussion! How about residue information in
> >CML? I ran babel on a pdb file and got only abbreviated atom names and
> >coordinates.
> 
> However I think there is still a strong "PDB-like" approach and I am happy 
> to extend CML to manage that aspect of macromolecules. I think it's *not* 
> useful for CML to try to model protein hierarchy 
> (primary/secondary/supersecondary/tertiary/quaternary, etc.). However it 
> could be useful to have a "flat-file" approach where the atoms had 
> PDB-like info on:
> - their PDB type (CA, SG, etc.)
> - their PDB number
> - their residue type
> - the chain number.
> 
> CML could carry this, and more, but would not support the explicit 
> hierarchy.
> 
> The result might look like:
> 
> <atom elementType="C" cmlx:residue="GLY13" cmlx:pdbNumber="23" 
> cmlx:chain="B" x3="1.23".../>
> 
> where cmlx: is an extension CML namespace.
> 
> This can be compacted to an array format like:
> 
> <atomArray elementType="C O N C C O N C C..."
> cmlx:residue="GLY13 GLY13 GLY13 GLY13 ALA14..."
> cmlx:pdbNumber="23 24 25 26..."/>
> 

This indeed looks a lot more useful. The crux is very simple: my (and
probably other people's) laziness... If I have to read CML *and* PDB, I
have made my life more miserable instead of better! Therefore, for
molecular modeling purposes, it would be crucial to have things like
residues, chain identifiers, crystal/unit cell information, B-factors,
and occupancies.
The remainder (mainly REMARK stuff) is most likely less interesting. If
CML provided the items listed above, we would not have to worry about
reading PDB files anymore, and we could even use the bond information
that babel provides.
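
To make the wish list concrete, here is a minimal sketch of how a writer
might pack per-atom PDB-like data into the array form proposed above. It
is only an illustration under assumptions: the namespace URI, the Atom
record and the attribute names cmlx:occupancy and cmlx:bFactor are
hypothetical (the thread only names cmlx:residue, cmlx:pdbNumber and
cmlx:chain, and shows cmlx:chain only on a single atom).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical namespace URI; the thread does not define one for "cmlx".
CMLX_NS = "http://example.org/cmlx"
ET.register_namespace("cmlx", CMLX_NS)


@dataclass
class Atom:
    """Hypothetical per-atom record holding the PDB-like fields discussed."""
    element: str
    residue: str       # residue type plus number, e.g. "GLY13"
    pdb_number: int
    chain: str
    occupancy: float
    b_factor: float


def to_atom_array(atoms):
    """Pack per-atom values into space-separated attributes of one element."""
    def packed(values):
        return " ".join(str(v) for v in values)

    array = ET.Element("atomArray")
    array.set("elementType", packed(a.element for a in atoms))
    array.set(f"{{{CMLX_NS}}}residue", packed(a.residue for a in atoms))
    array.set(f"{{{CMLX_NS}}}pdbNumber", packed(a.pdb_number for a in atoms))
    array.set(f"{{{CMLX_NS}}}chain", packed(a.chain for a in atoms))
    # Assumed attribute names, not taken from the thread:
    array.set(f"{{{CMLX_NS}}}occupancy", packed(a.occupancy for a in atoms))
    array.set(f"{{{CMLX_NS}}}bFactor", packed(a.b_factor for a in atoms))
    return array


atoms = [
    Atom("N", "GLY13", 23, "B", 1.0, 20.5),
    Atom("C", "GLY13", 24, "B", 1.0, 21.1),
]
array = to_atom_array(atoms)
print(ET.tostring(array, encoding="unicode"))
```

Every attribute carries one entry per atom in the same order, so a
reader can split each attribute on whitespace and zip the lists back
into atoms.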

> The array format can actually be more cost-effective in space than PDB.
> 
> What CML will not support is:
> 
> <protein>
>    <biologicalUnit>
>      <crystalUnit>
>         <chain id="A">
>            <residue>
>               <atom>
>               <atom>
>         <chain id="B">
> etc.
But this can easily be added in a different schema using namespaces.
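
As a rough illustration of why the hierarchy need not live in CML
itself: a consumer can rebuild chain -> residue -> atom grouping from
the flat per-atom attributes. The sketch below assumes the attributes
arrive as space-separated strings with one entry per atom; the function
name and the use of an array-valued chain attribute are my own
assumptions, not part of any CML specification.

```python
def build_hierarchy(element_types, residues, chains):
    """Group flat per-atom attributes into chain -> residue -> [elements]."""
    elements = element_types.split()
    residue_ids = residues.split()
    chain_ids = chains.split()
    if not (len(elements) == len(residue_ids) == len(chain_ids)):
        raise ValueError("per-atom attributes must have the same length")

    hierarchy = {}
    for element, residue, chain in zip(elements, residue_ids, chain_ids):
        # setdefault creates the chain and residue levels on first use.
        hierarchy.setdefault(chain, {}).setdefault(residue, []).append(element)
    return hierarchy


tree = build_hierarchy(
    "N C C O N C",
    "GLY13 GLY13 GLY13 GLY13 ALA14 ALA14",
    "B B B B B B",
)
print(tree)
```

A separate schema could serialize such a nested structure under its own
namespace, leaving the flat CML atoms untouched.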

-- 
David.
________________________________________________________________________
David van der Spoel, PhD, Assist. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596,  	75124 Uppsala, Sweden
phone:	46 18 471 4205		fax: 46 18 511 755
spoel at xray.bmc.uu.se	spoel at gromacs.org   http://xray.bmc.uu.se/~spoel
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++