[gmx-users] g_select syntax

Teemu Murtola teemu.murtola at gmail.com
Fri Aug 29 21:06:47 CEST 2014


> I am recently puzzled by the syntax and behaviour of g_select. I want to
> obtain the residue index list of LIPID whose center of mass is within 1.0
> nm of the surface of protein. In my case, each LIPID molecule consists of
> only one residue. I wrote the selection.dat as follows, and set -selrpos to
> atom and -seltype to res_com. Here I think "Protein" is the reference
> group, so -selrpos should be atom because I care about the distance to its
> surface. "LIPID" is the analysis group and I care about their individual
> center of mass. So -seltype should be res_com.
> selection.dat:
> resname LIPID and within 1.0 of group "Protein";
> g_select -sf selection.dat -f traj.trr -s traj.tpr -n system.ndx  -oi
> index.dat -seltype res_com -selrpos atom

Yes, this should give you what you expect. It selects all LIPID atoms that
are within 1.0 nm from the protein, and then groups them by residue for the
-oi output.

> However I tried another selection. This time instead of retrieving the
> residue index, I tried to retrieve the index of a key atom of the LIPID
> molecule.
> selection.dat:
> rdist = res_com within 1.0 of group "Protein";
> group_C15 = (resname LIPID) and (rdist) and (name C15);
> group_C15;
> g_select -sf selection.dat -f traj.trr -s traj.tpr -n system.ndx  -oi
> index.dat -seltype atom -selrpos atom

"res_com within" has a different meaning from using -seltype res_com: your
second selection selects all C15 atoms that belong to LIPID residues whose
whole-residue center of mass is within 1 nm of the protein (that last
condition is the "res_com within" expression).

-seltype res_com in the first example is equivalent to writing this, where
the res_com is in a very different location:
res_com of (resname LIPID and within 1.0 of group "Protein")

Hopefully this helps you understand where the difference between the
selections comes from.
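
To make the difference concrete, here is a minimal Python sketch on a made-up
toy system. It is not GROMACS code: the atom table, the single protein
position, and all function names are hypothetical, and it only mimics the two
criteria discussed above (any atom of the residue within the cutoff versus the
residue center of mass within the cutoff).

```python
import math
from collections import defaultdict

# Hypothetical toy data: (residue id, atom name, mass, position in nm).
LIPID_ATOMS = [
    (1, "C1", 12.0, (0.8, 0.0, 0.0)),
    (1, "C15", 12.0, (1.5, 0.0, 0.0)),
    (1, "C20", 12.0, (2.2, 0.0, 0.0)),
    (2, "C1", 12.0, (0.3, 0.0, 0.0)),
    (2, "C15", 12.0, (0.6, 0.0, 0.0)),
    (2, "C20", 12.0, (0.9, 0.0, 0.0)),
    (3, "C1", 12.0, (3.0, 0.0, 0.0)),
    (3, "C15", 12.0, (4.0, 0.0, 0.0)),
    (3, "C20", 12.0, (5.0, 0.0, 0.0)),
]
PROTEIN_POSITIONS = [(0.0, 0.0, 0.0)]  # assumed: a one-atom "protein"
CUTOFF = 1.0  # nm


def is_within(pos, refs, cutoff):
    """True if pos is within cutoff of at least one reference position."""
    return any(math.dist(pos, ref) <= cutoff for ref in refs)


def residues_by_atom_distance(atoms, refs, cutoff):
    """Residues with at least one atom within the cutoff (first selection)."""
    return sorted({resid for resid, _, _, pos in atoms
                   if is_within(pos, refs, cutoff)})


def residue_coms(atoms):
    """Mass-weighted center of each residue."""
    acc = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])  # mass, x, y, z sums
    for resid, _, mass, pos in atoms:
        entry = acc[resid]
        entry[0] += mass
        for i in range(3):
            entry[i + 1] += mass * pos[i]
    return {resid: tuple(s / e[0] for s in e[1:]) for resid, e in acc.items()}


def residues_by_com_distance(atoms, refs, cutoff):
    """Residues whose center of mass is within the cutoff ("res_com within")."""
    return sorted(resid for resid, com in residue_coms(atoms).items()
                  if is_within(com, refs, cutoff))


print(residues_by_atom_distance(LIPID_ATOMS, PROTEIN_POSITIONS, CUTOFF))  # [1, 2]
print(residues_by_com_distance(LIPID_ATOMS, PROTEIN_POSITIONS, CUTOFF))   # [2]
```

Residue 1 has one atom at 0.8 nm but its center of mass at 1.5 nm, so it
passes the atom-based test and fails the center-of-mass test. That is the kind
of residue that makes the first selection return more entries than the second.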

> I thought these two selections should give the same number of indices per
> frame, as the second selection merely retrieves the atom indices of the
> corresponding key atoms in the LIPID molecules selected by the first
> selection. However the first selection gives significantly more indices
> than the second selection does. I guess my understanding of g_select syntax
> might be flawed. Please point out my misunderstanding. Thank you very much.

If you want to select the key atoms that match those from your first
selection, you need to write something more complex:

name C15 and same residue as (resname LIPID and within 1.0 of group

The last selection should be self-explanatory.
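
For readers who prefer to see the logic spelled out, the idea behind a
"same residue as" expression can be sketched in Python on a made-up system.
This only illustrates the intended semantics; the atom table and the function
name are hypothetical and nothing here calls GROMACS.

```python
import math

# Hypothetical toy data: (atom index, residue id, atom name, position in nm).
ATOMS = [
    (1, 1, "C1", (0.8, 0.0, 0.0)),
    (2, 1, "C15", (1.5, 0.0, 0.0)),
    (3, 2, "C1", (3.0, 0.0, 0.0)),
    (4, 2, "C15", (3.5, 0.0, 0.0)),
]
PROTEIN_POSITIONS = [(0.0, 0.0, 0.0)]  # assumed: a one-atom "protein"


def key_atoms_in_nearby_residues(atoms, refs, cutoff, key_name):
    """Key atoms of every residue that has any atom within the cutoff."""
    # Step 1: residues containing at least one atom within the cutoff.
    near = {resid for _, resid, _, pos in atoms
            if any(math.dist(pos, ref) <= cutoff for ref in refs)}
    # Step 2: the key atom of each such residue, wherever that atom sits.
    return [idx for idx, resid, name, _ in atoms
            if name == key_name and resid in near]


print(key_atoms_in_nearby_residues(ATOMS, PROTEIN_POSITIONS, 1.0, "C15"))  # [2]
```

Atom 2 is itself 1.5 nm from the protein, but it is still returned because
another atom of its residue is within the cutoff. This gives one key atom per
residue found by the atom-based distance test.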

Hope this helps,

