C Tutorials¶
This section present some hand-on tutorials to the chemfiles C API. These
examples do exaclty the same thing as the C++ ones, using
the C API instead of the C++ API. As such, they are much more verbose. They also
do not correclty check for error conditions. When using the C API, you should
alway check that the return value of the functions is CHFL_SUCCESS
.
All the code here is under the CC-0 Universal Licence which means that you are free to do whatever you want with it (i.e. it is Public Domain code)
Read a single frame¶
In this tutorials we will read a frame from a trajectory, and print the indexes
of all the atom in the half-space x < 5
.
We start by including the headers we need: chemfiles header, `stdio.h
for
printing and stdlib.h
for malloc
. All the public functions of chemfiles
are accessible through the chemfiles.h
header, which is the only header
needed.
#include <chemfiles.h>
#include <stdio.h>
#include <stdlib.h>
Then we open a CHFL_TRAJECTORY
in read ('r'
) and read the first
frame. We need to allocate memory for the CHFL_FRAME
before calling
chfl_trajectory_read()
.
CHFL_TRAJECTORY* file = chfl_trajectory_open("filename.xyz", 'r');
CHFL_FRAME* frame = chfl_frame();
chfl_trajectory_read(file, frame);
We can now and get the positions of the atoms and the number of atoms in the
frame with the chfl_frame_positions()
function:
uint64_t natoms = 0;
chfl_vector3d* positions = NULL;
chfl_frame_positions(frame, &positions, &natoms);
Knowning the total number of atoms in the frame, we can allocate memory to store
the indices of the atoms with x < 5
:
size_t* less_than_five = malloc((size_t)natoms * sizeof(size_t));
Iterating through the atoms in the frame, we get the ones matching our
condition. We need to track the number of matched
atoms to know where to add
them in the less_than_five
array.
size_t matched = 0;
for (uint64_t i=0; i<natoms; i++) {
if (positions[i][0] < 5) {
less_than_five[matched] = (size_t)i;
matched++;
}
}
At the end we can print our results
printf("Atoms with x < 5:\n");
for (size_t i=0; i<matched; i++) {
printf(" - %lu", less_than_five[i]);
}
And free all allocated memory. We don’t need to free positions
, as it points
into memory allocated and controlled by the frame.
free(less_than_five);
chfl_free(frame);
chfl_trajectory_close(file);
Click here to see the whole program
#include <chemfiles.h>
#include <stdio.h>
#include <stdlib.h>
int main() {
CHFL_TRAJECTORY* file = chfl_trajectory_open("filename.xyz", 'r');
CHFL_FRAME* frame = chfl_frame();
chfl_trajectory_read(file, frame);
uint64_t natoms = 0;
chfl_vector3d* positions = NULL;
chfl_frame_positions(frame, &positions, &natoms);
size_t* less_than_five = malloc((size_t)natoms * sizeof(size_t));
size_t matched = 0;
for (uint64_t i=0; i<natoms; i++) {
if (positions[i][0] < 5) {
less_than_five[matched] = (size_t)i;
matched++;
}
}
printf("Atoms with x < 5:\n");
for (size_t i=0; i<matched; i++) {
printf(" - %lu", less_than_five[i]);
}
free(less_than_five);
chfl_free(frame);
chfl_trajectory_close(file);
return 0;
}
For more information about reading frame in a trajectory, see the following functions:
chfl_trajectory_nsteps()
to know when to stop readingchfl_trajectory_read_step()
to directlty read a given step.chfl_trajectory_set_cell()
andchfl_trajectory_set_topology()
to specify an unit cell or a topology for all frames in a trajectory.
Generating a structure¶
Now that we know how to read frames from a trajectory, let’s try to create a new structure and write it to a file. As previsouly, we start by including the chemfiles header:
#include <chemfiles.h>
We can then create a CHFL_FRAME
and the two CHFL_ATOM
we
will need.
CHFL_FRAME* frame = chfl_frame();
CHFL_ATOM* H = chfl_atom("H");
CHFL_ATOM* O = chfl_atom("O");
We can now add the atoms to the frame, with their respective positions. The
third argument to chfl_frame_add_atom()
is set to NULL
as we don’t
need the velocities.
chfl_frame_add_atom(frame, H, (chfl_vector3d){1, 0, 0}, NULL);
chfl_frame_add_atom(frame, O, (chfl_vector3d){0, 0, 0}, NULL);
chfl_frame_add_atom(frame, H, (chfl_vector3d){0, 1, 0}, NULL);
We can then add bonds between the atoms to fully define the topology
chfl_frame_add_bond(frame, 0, 1);
chfl_frame_add_bond(frame, 2, 1);
Finally, we can set the CHFL_CELL
associated with this frame. We
also free the cell memory right away, as it is no longer needed.
CHFL_CELL* cell = chfl_cell((chfl_vector3d){10, 10, 10});
chfl_frame_set_cell(frame, cell);
chfl_free(cell);
Now that our frame is constructed, it is time to write it to a file. For that,
we open a trajectory in write ('w'
) mode, and write to it.
CHFL_TRAJECTORY* trajectory = chfl_trajectory_open("water.pdb", 'w');
chfl_trajectory_write(trajectory, frame);
chfl_trajectory_close(trajectory);
And free all remaining memory with the right function.
chfl_free(frame);
chfl_free(H);
chfl_free(O);
Click here to see the whole program
#include <chemfiles.h>
#include <stdlib.h>
int main() {
CHFL_FRAME* frame = chfl_frame();
CHFL_ATOM* H = chfl_atom("H");
CHFL_ATOM* O = chfl_atom("O");
chfl_frame_add_atom(frame, H, (chfl_vector3d){1, 0, 0}, NULL);
chfl_frame_add_atom(frame, O, (chfl_vector3d){0, 0, 0}, NULL);
chfl_frame_add_atom(frame, H, (chfl_vector3d){0, 1, 0}, NULL);
chfl_frame_add_bond(frame, 0, 1);
chfl_frame_add_bond(frame, 2, 1);
CHFL_CELL* cell = chfl_cell((chfl_vector3d){10, 10, 10});
chfl_frame_set_cell(frame, cell);
chfl_free(cell);
CHFL_TRAJECTORY* trajectory = chfl_trajectory_open("water.pdb", 'w');
chfl_trajectory_write(trajectory, frame);
chfl_trajectory_close(trajectory);
chfl_free(frame);
chfl_free(H);
chfl_free(O);
return 0;
}
Using selections¶
Now that we know how to read and write frame from trajectories, how about we do a bit a filtering? In this tutorial, we will read all the frames from a file, and use selections to filter which atoms we will write back to another file. This example will also show how chemfiles can be used to convert from a file format to another one.
We start by opening the two trajectories we will need
CHFL_TRAJECTORY* input = chfl_trajectory_open("input.arc", 'r');
CHFL_TRAJECTORY* output = chfl_trajectory_open("output.pdb", 'w');
We create a CHFL_FRAME
and a CHFL_SELECTION
object to
filter the atoms we want to keep.
CHFL_FRAME* frame = chfl_frame();
CHFL_SELECTION* selection = chfl_selection("name Zn or name N");
Then we get the number of steps in the trajectory
uint64_t nsteps = 0;
chfl_trajectory_nsteps(input, &nsteps);
And iterate over the frames in the trajectory
for (size_t step=0; step<nsteps; step++) {
chfl_trajectory_read(input, frame);
From here, we need to use the selection to get the atoms we want to remove. This is a two steps process: first we evaluate the selection and get the number of matches
uint64_t n_matches = 0;
chfl_selection_evaluate(selection, frame, &n_matches);
Second we allocate some memory and get all the matches (represented as
chfl_match
):
chfl_match* matches = malloc((size_t)n_matches * sizeof(chfl_match));
chfl_selection_matches(selection, matches, n_matches);
We can get the index of atoms in a to_remove array
uint64_t* to_remove = malloc((size_t)n_matches * sizeof(uint64_t));
for (size_t i = 0; i<n_matches; i++) {
to_remove[i] = matches[i].atoms[0];
}
In order to remove the atoms from the frame, we need to sort to_remove
in
descending order: removing the atom at index i will shift the index of all the
atoms after i. So we need start from the end and work toward the start of the
frame. The compare_matches
function is used to do this sorting in reverse
order with the standard qsort function.
qsort(to_remove, (size_t)n_matches, sizeof(uint64_t), compare_matches);
for (size_t i = 0; i<n_matches; i++) {
chfl_frame_remove(frame, to_remove[i]);
}
Finally, we can write the cleaned frame to the output file, and free the memory we allocated:
chfl_trajectory_write(output, frame);
free(to_remove);
free(matches);
The compare_matches
function we used to sort the matches is defined as
follow:
chfl_trajectory_close(output);
return 0;
}
int compare_matches(const void* lhs, const void* rhs) {
size_t left = *(const size_t*)lhs;
size_t right = *(const size_t*)rhs;
Click here to see the whole program
#include <chemfiles.h>
#include <stdlib.h>
static int compare_matches(const void* lhs, const void* rhs);
int main() {
CHFL_TRAJECTORY* input = chfl_trajectory_open("input.arc", 'r');
CHFL_TRAJECTORY* output = chfl_trajectory_open("output.pdb", 'w');
CHFL_FRAME* frame = chfl_frame();
CHFL_SELECTION* selection = chfl_selection("name Zn or name N");
uint64_t nsteps = 0;
chfl_trajectory_nsteps(input, &nsteps);
for (size_t step=0; step<nsteps; step++) {
chfl_trajectory_read(input, frame);
uint64_t n_matches = 0;
chfl_selection_evaluate(selection, frame, &n_matches);
chfl_match* matches = malloc((size_t)n_matches * sizeof(chfl_match));
chfl_selection_matches(selection, matches, n_matches);
uint64_t* to_remove = malloc((size_t)n_matches * sizeof(uint64_t));
for (size_t i = 0; i<n_matches; i++) {
to_remove[i] = matches[i].atoms[0];
}
qsort(to_remove, (size_t)n_matches, sizeof(uint64_t), compare_matches);
for (size_t i = 0; i<n_matches; i++) {
chfl_frame_remove(frame, to_remove[i]);
}
chfl_trajectory_write(output, frame);
free(to_remove);
free(matches);
}
chfl_free(frame);
chfl_free(selection);
chfl_trajectory_close(input);
chfl_trajectory_close(output);
return 0;
}
int compare_matches(const void* lhs, const void* rhs) {
size_t left = *(const size_t*)lhs;
size_t right = *(const size_t*)rhs;
if (left > right) return -1;
if (right < left) return 1;
return 0;
}