Selection language
The chemfiles selection language allows selecting atoms in a Frame that match a
set of constraints. For example, the selection
atoms: name H and x > 15
selects all single atoms whose name is H and whose x coordinate is greater
than 15.
Chemfiles selections differ from the well-known VMD selections in that they are
multiple selections: a single match can contain more than one atom.
All selections start with a context, which indicates the number of atoms being
selected and the relation between these atoms. Existing contexts are atoms or
one, pairs or two, three, and four to select one, two, three or four
independent atoms; and bonds, angles and dihedrals for two, three or four
bonded atoms.
A selection is built using a context and a set of constraints separated by a
colon. For example, atoms: name == H will select all atoms whose name is H.
Similarly, angles: name(#2) == O and mass(#3) < 1.5 will select all sets of
three bonded atoms forming an angle such that the name of the second atom is O
and the mass of the third atom is less than 1.5.
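To make the role of the context concrete, here is a minimal Python sketch of how candidate matches could be enumerated and then filtered by the constraints. It is a toy model under stated assumptions, not chemfiles' implementation: the water molecule, the candidates helper, and the choice to list every ordering of a match are all hypothetical.

```python
from itertools import permutations

# Toy water molecule (hypothetical data): (name, mass) per atom, plus bonds.
atoms = [("O", 15.999), ("H", 1.008), ("H", 1.008)]
bonds = {(0, 1), (0, 2)}


def bonded(i, j):
    """True if atoms i and j share a bond, in either order."""
    return (i, j) in bonds or (j, i) in bonds


def candidates(context):
    """Enumerate the atom tuples a context would consider (toy model)."""
    indexes = range(len(atoms))
    if context == "atoms":
        return [(i,) for i in indexes]
    if context == "pairs":
        return list(permutations(indexes, 2))
    if context == "bonds":
        return [(i, j) for i, j in permutations(indexes, 2) if bonded(i, j)]
    if context == "angles":
        return [
            (i, j, k)
            for i, j, k in permutations(indexes, 3)
            if bonded(i, j) and bonded(j, k)
        ]
    raise ValueError(f"unknown context: {context}")


# angles: name(#2) == O and mass(#3) < 1.5
matches = [
    match
    for match in candidates("angles")
    if atoms[match[1]][0] == "O" and atoms[match[2]][1] < 1.5
]
```

In this toy model the H-O-H angle is found once per orientation; whether a real implementation reports both orientations or deduplicates them is not something this sketch tries to reproduce.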
These constraints are created using selectors. Selectors are small functions
that are evaluated for each atom, and return either true if the atom matches,
or false if it does not. There are three kinds of selectors:
- boolean selectors return either true or false for a given set of atoms;
- string selectors compare string values with one of == (equal) or != (not equal). One can either compare two atomic properties (name(#1) == type(#2)) or atomic properties to literal strings (name(#1) != He);
- numeric selectors compare two numeric values with one of ==, !=, < (less than), <= (less or equal), > (more than), or >= (more or equal).
Numeric values are produced by numeric selectors (x, mass, …) or literal
values (5.2, 22.21e-2). They can also be combined using mathematical
operations: the usual +, -, * and / operators are supported, as well as ^ for
exponentiation and % for modulo (remainder of Euclidean division). These
operations follow the usual priority rules: 1 + 2 * 3 is 7, not 9.
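As an illustration of how such an expression reads, the sketch below evaluates the selection sqrt(x^2 + y^2) < 5 by hand in Python. This selection is a hypothetical example built only from operators and functions listed in this document, and the coordinates are made up. Note that the selection language's ^ corresponds to Python's ** operator (in Python, ^ is bitwise XOR).

```python
import math

# Made-up (x, y, z) positions, for illustration only.
positions = [(3.0, 4.0, 0.0), (1.0, 1.0, 9.0), (6.0, 0.0, 2.0)]


def matches(position):
    x, y, _z = position
    # Selection-language x^2 is written x ** 2 in Python.
    return math.sqrt(x ** 2 + y ** 2) < 5


selected = [i for i, position in enumerate(positions) if matches(position)]

# Usual priority rules: multiplication binds tighter than addition.
priority_check = 1 + 2 * 3
```

Only the second atom is kept here: the first one sits exactly at a distance of 5 from the z axis, and the comparison is strict.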
When using a selection with more than one atom, selectors must refer to the
different atoms with the #1, #2, #3 or #4 variables: name(#3) will give the
name of the third atom, and so on.
Finally, constraints are combined with boolean operators. The and operator is
true if both sides of the expression are true; the or operator is true if
either side of the expression is true; and the not operator reverses true to
false and false to true. Two examples of complex selections using boolean
operators are:
name(#1) == H and not x(#1) < 5.0
(z(#2) < 45 and name(#4) == O) or name(#1) == C
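Read as ordinary predicates, these operators behave like their Python counterparts. The sketch below hand-translates the single-atom constraint name(#1) == H and not x(#1) < 5.0 into Python. The atom records are made up, and this only illustrates the logic; it is not how chemfiles evaluates selections.

```python
# Made-up atoms: (name, x coordinate).
atoms = [("H", 2.0), ("H", 7.5), ("O", 9.0), ("H", 5.0)]


def constraint(atom):
    name, x = atom
    # name(#1) == H and not x(#1) < 5.0
    return name == "H" and not x < 5.0


selected = [i for i, atom in enumerate(atoms) if constraint(atom)]
```

The last atom is selected because its x coordinate is exactly 5.0, so x < 5.0 is false and the not turns it into true.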
List of implemented selectors
Here is the list of currently implemented selectors. Additional ideas are welcome!
Boolean selectors
- all: always matches any atom and returns true;
- none: never matches an atom and returns false;
- is_bonded(i, j): checks if atoms i and j are bonded together. If i and j refer to the same atom, this returns false;
- is_angle(i, j, k): checks if atoms i, j and k are bonded together to form an angle, i.e. that i is bonded to j and j is bonded to k. If any of i, j or k refer to the same atom, this returns false;
- is_dihedral(i, j, k, m): checks if atoms i, j, k and m are bonded together to form a dihedral angle, i.e. that i is bonded to j, j is bonded to k, and k is bonded to m. If any of i, j, k or m refer to the same atom, this returns false;
- is_improper(i, j, k, m): checks if atoms i, j, k and m are bonded together to form an improper dihedral angle, i.e. that all of i, k, and m are bonded to j. If any of i, j, k or m refer to the same atom, this returns false;
- [<property>]: checks if atoms have a boolean property named ‘property’ set, and that this property is true. This returns false if the property is not set.
For boolean selectors taking arguments, i/j/k/m can either be one of the atoms
currently being matched (#1 / #2 / #3 / #4) or another selection (called a
sub-selection). In the latter case, all the atoms in the sub-selection are
checked to see if any of them verifies the selection. This makes
is_bonded(#1, name O) select all atoms bonded to an oxygen, and
is_angle(type C, #1, name O) select all atoms in the middle of a C-X-O angle.
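The "any atom of the sub-selection" rule can be sketched in a few lines of Python. This is a toy model, not chemfiles' code: the C-C-O-H chain, the helper names, and the simplification that each atom's type equals its name are all assumptions made for illustration.

```python
# Toy C-C-O-H chain (hypothetical data): atom names and a bond list.
# Assumption: in this toy system the atomic type equals the atomic name.
names = ["C", "C", "O", "H"]
bonds = {(0, 1), (1, 2), (2, 3)}


def bonded(i, j):
    """True if i and j are distinct atoms sharing a bond."""
    return i != j and ((i, j) in bonds or (j, i) in bonds)


def sub_selection(name):
    """Indexes of all atoms matched by the sub-selection 'name <name>'."""
    return [i for i, atom_name in enumerate(names) if atom_name == name]


def is_bonded_to_any(i, others):
    # is_bonded(#1, <sub-selection>): true if ANY atom of the sub-selection fits.
    return any(bonded(i, j) for j in others)


def is_middle_of_angle(j, firsts, lasts):
    # is_angle(<sub-selection>, #1, <sub-selection>): #1 is the central atom.
    return any(
        bonded(i, j) and bonded(j, k) and i != k
        for i in firsts
        for k in lasts
    )


indexes = range(len(names))
bonded_to_oxygen = [
    i for i in indexes if is_bonded_to_any(i, sub_selection("O"))
]
middle_of_c_x_o = [
    j
    for j in indexes
    if is_middle_of_angle(j, sub_selection("C"), sub_selection("O"))
]
```

Here the second carbon and the hydrogen are bonded to the oxygen, and only the second carbon sits in the middle of a C-X-O angle.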
String properties
- type: gives the atomic type;
- name: gives the atomic name. Some formats store both an atomic name (H3) and an atomic type (H), which is why there are two different selectors, to be used depending on the actual data;
- resname: gives the residue name. If an atom is not in a residue, this returns the empty string;
- [<property>]: gives the value of the string property named ‘property’ for the atom. This returns an empty string (“”) if the property is not set.
Numeric properties
Most of the numeric properties only apply to a single atom:
- index: gives the atomic index in the frame;
- mass: gives the atomic mass;
- x, y and z: give the atomic position in cartesian coordinates;
- vx, vy and vz: give the atomic velocity in cartesian coordinates;
- resid: gives the atomic residue index. If an atom is not in a residue, this returns -1;
- [<property>]: gives the value of the numeric property named ‘property’ for the atom. This returns 0 if the property is not set.
But some properties apply to multiple atoms, and as such are only usable when selecting multiple atoms:
- distance(i, j): gives the distance in Ångströms between atoms i and j, accounting for periodic boundary conditions;
- angle(i, j, k): gives the angle between atoms i, j and k in radians, accounting for periodic boundary conditions. The atoms do not need to be bonded together;
- dihedral(i, j, k, m): gives the dihedral angle between atoms i, j, k and m in radians, accounting for periodic boundary conditions. The atoms do not need to be bonded together;
- out_of_plane(i, j, k, m): gives the distance in Ångströms between the plane formed by the three atoms i, k, and m; and the atom j, accounting for periodic boundary conditions.
Note
The angle and dihedral selectors are different from the is_angle and
is_dihedral selectors. The former return a number that can then be used in
mathematical expressions, while the latter directly return true or false.
One can also use mathematical functions to transform a number into another
value. Currently supported functions are: deg2rad and rad2deg for converting
degrees to radians and radians to degrees, respectively; sin, cos and tan for
the trigonometric functions; asin and acos for the inverse trigonometric
functions; and sqrt. Adding new functions is easy: open an issue about the one
you need on the chemfiles repository.
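For instance, a numeric selector and a function can be chained, as in the hypothetical selection three: rad2deg(angle(#1, #2, #3)) > 80. The Python sketch below reproduces that computation by hand on made-up positions. It assumes the second atom is the apex of the angle and ignores periodic boundary conditions, so it is only an approximation of what the selectors described above compute.

```python
import math


def angle(a, b, c):
    """Angle a-b-c in radians, with b at the apex (no periodic boundaries)."""
    u = [p - q for p, q in zip(a, b)]
    v = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(u, v))
    norm = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(q * q for q in v))
    # Clamp to [-1, 1] to protect acos against rounding errors.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


# Made-up positions forming a right angle at the second atom.
first, second, third = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 2.0, 0.0)

theta = angle(first, second, third)
theta_degrees = math.degrees(theta)  # plays the role of rad2deg
is_match = theta_degrees > 80        # rad2deg(angle(#1, #2, #3)) > 80
```

Converting to degrees inside the selection keeps the threshold readable, since the angle selector itself returns radians.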
Elisions
This multiple selection language can be a bit verbose for simpler cases, so parts of the selection can sometimes be omitted. The following rules allow simpler selections:
- First, in the atoms context, the #1 variable is optional, and atoms: name(#1) == H is equivalent to atoms: name == H.
- Then, if no context is given, the atoms context is used. This makes atoms: name == H equivalent to name == H.
- Then, if no comparison operator is given, == is used by default. This means that we can write name H instead of name == H.
- Finally, multiple values are interpreted as multiple choices. A selection like name H O C is expanded into name H or name O or name C.
Using all these elision rules together, the selection
atoms: name(#1) == H or name(#1) == O
is equivalent to name H O. A more complex example is
bonds: name(#1) O C and index(#2) 23 55 69
which is equivalent to
bonds: (name(#1) == O or name(#1) == C) and (index(#2) == 23 or index(#2) == 55 or index(#2) == 69)
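The expansion of elided selections is mechanical, as the following Python sketch suggests. It only builds the fully explicit string from already-separated pieces; it does not parse selections, and the expand_choices and expand helpers are hypothetical names, not part of chemfiles.

```python
def expand_choices(selector, variable, values):
    """Expand 'selector(variable) v1 v2 ...' into explicit == comparisons."""
    terms = [f"{selector}({variable}) == {value}" for value in values]
    expanded = " or ".join(terms)
    # Parenthesize multi-value expansions so they combine safely with 'and'.
    return f"({expanded})" if len(terms) > 1 else expanded


def expand(context, constraints):
    """Join already-expanded constraints with 'and' under an explicit context."""
    return f"{context}: " + " and ".join(constraints)


# bonds: name(#1) O C and index(#2) 23 55 69
full = expand("bonds", [
    expand_choices("name", "#1", ["O", "C"]),
    expand_choices("index", "#2", [23, 55, 69]),
])

# name H, with the default context and the #1 variable made explicit
short = expand("atoms", [expand_choices("name", "#1", ["H"])])
```

The parentheses around each group matter: without them, mixing or and and in the expanded form could change which atoms are matched.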