[gmx-users] Help with non-standard residues and molecular structures

Wed Jan 4 18:33:27 CET 2012

Thanks-- that clarifies a lot.  I hadn't quite realized how much is 
assumed about the residue terminii.  It seems like I was trying to fit a 
square peg into a round hole.   I'm going to re-think this and maybe 
take a different approach just based on using HETATMs and not trying to 
define a separate residue.

The pdb file specification for HETATM still requires a "residue name".  
I've found that using "LIG" seems to work. Any reason I shouldn't just 
continue using that, and define my molecule using HETATMs ?

Thanks again for your help!
Bob Hamers

For something like what I'm trying to do, where I"m using a pdb file 
that does not represent a protein, is

On 1/4/2012 2:02 AM, Mark Abraham wrote:
> On 4/01/2012 4:57 PM, Robert Hamers wrote:
>> I'd appreciate any help  --
>>
>> I'm trying to model a small (~ 20-carbon ) molecule linked to a 
>> diamond surface.  I got the diamond surface with >1500 atoms working 
>> fine all the way through to the MD simulation and it looks great.  
>> But I'm getting stuck on the molecule, which is not a protein but a 
>> moderately short-chain molecule that has a triazole (N3C2) in the 
>> middle of it.  I thought in order to make this work I would need ot 
>> learn how to with non-standard residues and after reading through the 
>> manual endless times and searching on the web site and trying things 
>> I'm basically stuck.
>>
>> I thought it would easiest to deal with the triazole ring by creating 
>> it as a non-standard residue in aminoacids.n2t.  As a starting point 
>> I thought I would try to work slowly by modifying an existing 
>> residue, so I arbitrary decided to modify "alanine" in order to 
>> understand how to work toward the more complicated triazole ring.  
>> So, in atomtypebyname.atp I copied the entry for ALA and called it 
>> ZZZ, and added a new entry "ZZZ  Protein" into "residuetypes.dat".  
>> At that point I can read in a pdb file with atoms belonging to 
>> residue "ZZZ" and pdb2gmx works fine.  However,  if I try to change 
>> one of the carbon atoms
>> to a nitrogen, (say, chance CA to N or NA), I get errors (see below) 
>> that I'm having trouble interpreting.  I thought that perhaps it was 
>> a problem of having two atoms with the same definition, so I made one 
>> "N1" and one "N2" as below, and also tried other variations (e.g., 
>> NA1 and NA2)
>
> Atom naming needs to be unique within the residue, so that grompp can 
> later confirm that the order of the names in the topology and 
> coordinate files match.
>
>>
>> (This is my entry in aminoacids.n2t)
>
> aminoacids.rtp I assume you mean.
>
>> [ ZZZ ]
>>  [ atoms ]
>>     N1    opls_238   -0.500     1
>>      H    opls_241    0.300     1
>>     N2   opls_238      0.140     1
>>     HA    opls_140    0.060     1
>>     CB    opls_135   -0.180     2
>>    HB1    opls_140    0.060     2
>>    HB2    opls_140    0.060     2
>>    HB3    opls_140    0.060     2
>>      C    opls_235    0.500     3
>>      O    opls_236   -0.500     3
>>  [ bonds ]
>>      N1     H
>>      N1    N2
>>     N2    HA
>>     N2    CB
>>     N2     C
>>     CB   HB1
>>     CB   HB2
>>     CB   HB3
>>      C     O
>>     -C     N1
>>  [ impropers ]
>>     -C    N2     N1     H    improper_Z_N_X_Y
>>     N2    +N1     C     O    improper_O_C_X_Y
>>
>> I thought that this would lead to a structure that would connect "C" 
>> to the previous residue in my pdb file and the "N" to the next . 
>> However, when I do pdb2gmx, I get:
>
> *N2* to the next, but yeah...
>
>> **
>> Back Off! I just backed up topol.top to ./#topol.top.40#
>> Processing chain 1 (13 atoms, 1 residues)
>> There are 0 donors and 1 acceptors
>> There are 0 hydrogen bonds
>> Identified residue ZZZ1 as a starting terminus.
>> Identified residue ZZZ1 as a ending terminus.
>> 8 out of 8 lines of specbond.dat converted successfully
>> Start terminus ZZZ-1: NH3+
>> End terminus ZZZ-1: COO-
>
> So the default(?) terminus selection is trying to give you charged 
> termini (perhaps because the type is protein, but I'm not sure here). 
> It is always appropriate to supply the command line you used.
>
>>
>> -------------------------------------------------------
>> Program pdb2gmx, VERSION 4.5.3
>> Source code file: /build/buildd/gromacs-4.5.3/src/kernel/pdb2top.c, 
>> line: 1056
>>
>> Fatal error:
>> atom N not found in buiding block 1ZZZ while combining tdb and rtp
>
> aminoacids.n.tdb applies to protein residues and assumes that NH3+ can 
> be applied. Evidently that won't work for your case. You will need to 
> come up with some termini that make sense, or use more than one residue.
>
>> ************
>>
>> I'm using the oplsaa force field, but up to this point it was a 
>> pretty arbitrary decision.
>> I think my problem is understanding the mapping between atom names ( 
>> N1, HB1, etc) and the opls names, as I haven't yet found a good 
>> explanation for how this mapping is done and/or what flexibility one 
>> has in creating atom names for non-standard residues.  (So, am I 
>> allowed to create a N atom and call it N1, as long as I assign it to 
>> an existing opls_xxx number ?) .
>
> Yes. Atom and residue names exist only for matching pieces of force 
> fields together. The atom type determines the physics of the resulting 
> model. The two are technically orthogonal, but in practice there are 
> strong correlations.
>
> Mark

-- 
Robert J. Hamers
Wisconsin Distinguished Professor
Univ. of Wisconsin-Madison	
1101 University Avenue
Madison, WI 53706
Ph: 608-262-6371
  Web: http://hamers.chem.wisc.edu