Tutorials

This section present some hand-on tutorials to the chemfiles Julia package. All the code here is under the CC-0 Universal Licence which means that you are free to do whatever you want with it (i.e. it is Public Domain code)

Read a single frame

In this tutorials we will read a frame from a trajectory, and print the indexes of all the atom in the half-space x < 5.

We start by using the Chemfiles package to bring in scope the needed functions:

#!/usr/bin/env julia
using chemfiles

Then we open a Trajectory and read the first frame:

trajectory = Trajectory("filename.xyz")
frame = read(trajectory)

We can now create a list to store the indices of the atoms with x < 5, and get the positions of the atoms in the frame with the positions function

less_than_five = UInt64[]
pos = positions(frame)

Iterating through the atoms in the frame, we get the ones matching our condition. size(frame) gives the number of atoms in the frame, which is also the size of the positions array. This positions array shape is (3, size(frame)).

for i in 0:size(frame)
    if pos[1, i] < 5
        push!(less_than_five, i)
    end
end

And finally we can print our results

println("Atoms with x < 5: ")
for i in less_than_five
    println("  - $i")
end
Click here to see the whole program
#!/usr/bin/env julia
using chemfiles

trajectory = Trajectory("filename.xyz")
frame = read(trajectory)

less_than_five = UInt64[]
pos = positions(frame)

for i in 0:size(frame)
    if pos[1, i] < 5
        push!(less_than_five, i)
    end
end

println("Atoms with x < 5: ")
for i in less_than_five
    println("  - $i")
end

For more information about reading frame in a trajectory, see the following functions:

  • read_step to directly read a given step.
  • set_cell! and set_topology! to specify an unit cell or a topology for all frames in a trajectory.

Generating a structure

Now that we know how to read frames from a trajectory, let’s try to create a new structure and write it to a file. As previsouly, we start by using the Chemfiles package

#!/usr/bin/env julia
using Chemfiles

Everything starts in a Topology. This is the class that defines the atoms and the connectivity in a system. Here, we add three Atom and two bonds to create a water molecule.

topology = Topology()
add_atom!(topology, Atom("H"))
add_atom!(topology, Atom("O"))
add_atom!(topology, Atom("H"))

add_bond!(topology, 0, 1)
add_bond!(topology, 2, 1)

We can then create a Frame corresponding to this topology. We resize the frame to ensure that the frame and the topology contains the same number of atoms.

frame = Frame()
resize!(frame, 3)
set_topology!(frame, topology)

We can then set the atomic positions:

pos = frame.positions()
pos[1, :] = [1.0, 0.0, 0.0]
pos[2, :] = [0.0, 0.0, 0.0]
pos[3, :] = [0.0, 1.0, 0.0]

Another possibility is to directly add atoms to the frame. Here we define a second molecule representing carbon dioxyde. add_atom! takes three arguments: the frame, the atom, and the position of the atom.

add_atom!(frame, Atom("O"), [5.0, 0.0, 0.0])
add_atom!(frame, Atom("C"), [6.0, 0.0, 0.0])
add_atom!(frame, Atom("O"), [7.0, 0.0, 0.0])
add_bond!(frame, 3, 4)
add_bond!(frame, 4, 5)

Finally, we can set the UnitCell associated with this frame.

set_cell!(frame, UnitCell(10, 10, 10))

Now that our frame is constructed, it is time to write it to a file. For that, we open a trajectory in write ('w') mode, and write to it. You only need to close the file if you are on the REPL and need to use the written trajectory right awaty.

trajectory = Trajectory("water-co2.pdb", 'w')
write(trajectory, frame)
# When running on the REPL, remember to close the trajectory or else it won't
# end writing.
close(trajectory) 
Click here to see the whole program
#!/usr/bin/env julia
using Chemfiles

topology = Topology()
add_atom!(topology, Atom("H"))
add_atom!(topology, Atom("O"))
add_atom!(topology, Atom("H"))

add_bond!(topology, 0, 1)
add_bond!(topology, 2, 1)

frame = Frame()
resize!(frame, 3)
set_topology!(frame, topology)

pos = frame.positions()
pos[1, :] = [1.0, 0.0, 0.0]
pos[2, :] = [0.0, 0.0, 0.0]
pos[3, :] = [0.0, 1.0, 0.0]

add_atom!(frame, Atom("O"), [5.0, 0.0, 0.0])
add_atom!(frame, Atom("C"), [6.0, 0.0, 0.0])
add_atom!(frame, Atom("O"), [7.0, 0.0, 0.0])
add_bond!(frame, 3, 4)
add_bond!(frame, 4, 5)

set_cell!(frame, UnitCell(10, 10, 10))

trajectory = Trajectory("water-co2.pdb", 'w')
write(trajectory, frame)
# When running on the REPL, remember to close the trajectory or else it won't
# end writing.
close(trajectory) 

Using selections

Now that we know how to read and write frame from trajectories, how about we do a bit a filtering? In this tutorial, we will read all the frames from a file, and use selections to filter which atoms we will write back to another file. This example will also show how chemfiles can be used to convert from a file format to another one.

We start by opening the two trajectories we will need

trajectory = Trajectory("input.arc")
output = Trajectory("output.pdb", 'w')

And we create a Selection object to filter the atoms we want to remove.

selection = Selection("name Zn or name N")

Then we can iterate over all the frames in the trajectory, and use the selection to get the list of atoms to remove. The result of evaluate is a list containing the atoms matching the selection.

for frame in trajectory
    to_remove = evaluate(selection, frame)

In order to remove the atoms from the frame, we need to sort the to_remove list in descending order: removing the atom at index i will shift the index of all the atoms after i. So we start from the end and work toward the start of the frame.

    for i in reverse(sort(to_remove))
        remove!(frame, i)
    end

Finally, we can write the cleaned frame to the output file, and start the next iteration.

    write(output, frame)

Remember to close the file if you are on the REPL and need to use the written trajectory right away.

close(trajectory) 
Click here to see the whole program
#!/usr/bin/env julia
using Chemfiles

trajectory = Trajectory("input.arc")
output = Trajectory("output.pdb", 'w')

selection = Selection("name Zn or name N")

for frame in trajectory
    to_remove = evaluate(selection, frame)
    for i in reverse(sort(to_remove))
        remove!(frame, i)
    end
    write(output, frame)
end
# When running on the REPL, remember to close the trajectory or else it won't
# end writing.
close(trajectory)