Tutorials¶
This section present some hand-on tutorials to the chemfiles Python module. All the code here is under the CC-0 Universal Licence which means that you are free to do whatever you want with it (i.e. it is Public Domain code)
Read a single frame¶
In this tutorials we will read a frame from a trajectory, and print the indexes
of all the atom in the half-space x < 5.
We start by importing the classes we need from chemfiles, here only the
Trajectory class.
#!/usr/bin/env python
from chemfiles import Trajectory
Then we open a Trajectory and read the first frame:
with Trajectory("filename.xyz") as trajectory:
    frame = trajectory.read()
Iterating through the atoms in the frame, we store the indices of the atoms with
x < 5 in a list. len(frame.atoms) gives the number of atoms in the
frame, which is also the size of the frame.positions array. This array is a
numpy array which shape is (len(frame), 3).
less_than_five = []
for i in range(len(frame.atoms)):
    if frame.positions[i, 0] < 5:
        less_than_five.append(i)
We can then print our results
print("Atoms with x < 5: ")
for i in less_than_five:
    print(f"  - {i}")
Click here to see the whole program
#!/usr/bin/env python
from chemfiles import Trajectory
with Trajectory("filename.xyz") as trajectory:
    frame = trajectory.read()
less_than_five = []
for i in range(len(frame.atoms)):
    if frame.positions[i, 0] < 5:
        less_than_five.append(i)
print("Atoms with x < 5: ")
for i in less_than_five:
    print(f"  - {i}")
For more information about reading frame in a trajectory, see the following functions:
- Trajectory.nstepsis the number of frame in a Trajectory.
- Trajectory.read_step()to directlty read a given step.
- Trajectory.set_cell()and- Trajectory.set_topology()to specify an unit cell or a topology for all frames in a trajectory.
Generating a structure¶
Now that we know how to read frames from a trajectory, let’s try to create a new structure and write it to a file. As previsouly, we start by importing the clases we need, as well as numpy:
#!/usr/bin/env python
import numpy as np
from chemfiles import Topology, Frame, Atom, UnitCell, Trajectory
Everything starts in a Topology. This is the class that defines the
atoms and the connectivity in a system. Here, we add three Atom and
two bonds to create a water molecule.
topology = Topology()
topology.atoms.append(Atom("H"))
topology.atoms.append(Atom("O"))
topology.atoms.append(Atom("H"))
topology.add_bond(0, 1)
topology.add_bond(2, 1)
We can then create a Frame corresponding to this topology. We resize
the frame to ensure that the frame and the topology contains the same number of
atoms.
frame = Frame()
frame.resize(3)
frame.topology = topology
We can then set the atomic positions:
frame.positions[0, :] = np.array([1.0, 0.0, 0.0])
frame.positions[1, :] = np.array([0.0, 0.0, 0.0])
frame.positions[2, :] = np.array([0.0, 1.0, 0.0])
Another possibility is to directly add atoms to the frame. Here we define a
second molecule representing carbon dioxyde. Frame.add_atom() takes
two arguments: the atom, and the position of the atom as a 3-element list
frame.add_atom(Atom("O"), [5.0, 0.0, 0.0])
frame.add_atom(Atom("C"), [6.0, 0.0, 0.0])
frame.add_atom(Atom("O"), [7.0, 0.0, 0.0])
frame.add_bond(3, 4)
frame.add_bond(4, 5)
Finally, we can set the UnitCell associated with this frame.
frame.cell = UnitCell([10, 10, 10])
Now that our frame is constructed, it is time to write it to a file. For that,
we open a trajectory in write ('w') mode, and write to it.
with Trajectory("water-co2.pdb", "w") as trajectory:
    trajectory.write(frame)
Click here to see the whole program
#!/usr/bin/env python
import numpy as np
from chemfiles import Topology, Frame, Atom, UnitCell, Trajectory
topology = Topology()
topology.atoms.append(Atom("H"))
topology.atoms.append(Atom("O"))
topology.atoms.append(Atom("H"))
topology.add_bond(0, 1)
topology.add_bond(2, 1)
frame = Frame()
frame.resize(3)
frame.topology = topology
frame.positions[0, :] = np.array([1.0, 0.0, 0.0])
frame.positions[1, :] = np.array([0.0, 0.0, 0.0])
frame.positions[2, :] = np.array([0.0, 1.0, 0.0])
frame.add_atom(Atom("O"), [5.0, 0.0, 0.0])
frame.add_atom(Atom("C"), [6.0, 0.0, 0.0])
frame.add_atom(Atom("O"), [7.0, 0.0, 0.0])
frame.add_bond(3, 4)
frame.add_bond(4, 5)
frame.cell = UnitCell([10, 10, 10])
with Trajectory("water-co2.pdb", "w") as trajectory:
    trajectory.write(frame)
Using selections¶
Now that we know how to read and write frame from trajectories, how about we do a bit a filtering? In this tutorial, we will read all the frames from a file, and use selections to filter which atoms we will write back to another file. This example will also show how chemfiles can be used to convert from a file format to another one.
Here, we will need two of chemfiles classes: :py:class`Trajectory` and :py:class`Selection`.
#!/usr/bin/env python
from chemfiles import Trajectory, Selection
We start by opening the two trajectories we will need
trajectory = Trajectory("input.arc")
output = Trajectory("output.pdb", "w")
And we create a Selection object to filter the atoms we want to
remove.
selection = Selection("name Zn or name N")
Then we can iterate over all the frames in the trajectory, and use the selection
to get the list of atoms to remove. The result of Selection.evaluate()
is a list containing the atoms matching the selection.
for frame in trajectory:
    to_remove = selection.evaluate(frame)
In order to remove the atoms from the frame, we need to sort the to_remove
list in descending order: removing the atom at index i will shift the index of
all the atoms after i. So we start from the end and work toward the start of the
frame.
    for i in reversed(sorted(to_remove)):
        frame.remove(i)
We can then write the cleaned frame to the output file, and start the next iteration.
    output.write(frame)
Finally, we close the input and output files to ensure that all the written data is flushed to the disk.
trajectory.close()
output.close()
Click here to see the whole program
#!/usr/bin/env python
from chemfiles import Trajectory, Selection
trajectory = Trajectory("input.arc")
output = Trajectory("output.pdb", "w")
selection = Selection("name Zn or name N")
for frame in trajectory:
    to_remove = selection.evaluate(frame)
    for i in reversed(sorted(to_remove)):
        frame.remove(i)
    output.write(frame)
trajectory.close()
output.close()