C Tutorials

This section present some hand-on tutorials to the chemfiles C API. These examples do exaclty the same thing as the C++ ones, using the C API instead of the C++ API. As such, they are much more verbose. They also do not correclty check for error conditions. When using the C API, you should alway check that the return value of the functions is CHFL_SUCCESS.

All the code here is under the CC-0 Universal Licence which means that you are free to do whatever you want with it (i.e. it is Public Domain code)

Read a single frame

In this tutorials we will read a frame from a trajectory, and print the indexes of all the atom in the half-space x < 5.

We start by including the headers we need: chemfiles header, `stdio.h for printing and stdlib.h for malloc. All the public functions of chemfiles are accessible through the chemfiles.h header, which is the only header needed.

#include <chemfiles.h>
#include <stdio.h>
#include <stdlib.h>

Then we open a CHFL_TRAJECTORY in read ('r') and read the first frame. We need to allocate memory for the CHFL_FRAME before calling chfl_trajectory_read().

CHFL_TRAJECTORY* file = chfl_trajectory_open("filename.xyz", 'r');
CHFL_FRAME* frame = chfl_frame();
chfl_trajectory_read(file, frame);

We can now and get the positions of the atoms and the number of atoms in the frame with the chfl_frame_positions() function:

uint64_t natoms = 0;
chfl_vector3d* positions = NULL;
chfl_frame_positions(frame, &positions, &natoms);

Knowning the total number of atoms in the frame, we can allocate memory to store the indices of the atoms with x < 5:

size_t* less_than_five = malloc((size_t)natoms * sizeof(size_t));

Iterating through the atoms in the frame, we get the ones matching our condition. We need to track the number of matched atoms to know where to add them in the less_than_five array.

size_t matched = 0;
for (uint64_t i=0; i<natoms; i++) {
    if (positions[i][0] < 5) {
        less_than_five[matched] = (size_t)i;
        matched++;
    }
}

At the end we can print our results

printf("Atoms with x < 5:\n");
for (size_t i=0; i<matched; i++) {
    printf("  - %lu", less_than_five[i]);
}

And free all allocated memory. We don’t need to free positions, as it points into memory allocated and controlled by the frame.

free(less_than_five);
chfl_free(frame);
chfl_trajectory_close(file);
Click here to see the whole program
#include <chemfiles.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    CHFL_TRAJECTORY* file = chfl_trajectory_open("filename.xyz", 'r');
    CHFL_FRAME* frame = chfl_frame();
    chfl_trajectory_read(file, frame);

    uint64_t natoms = 0;
    chfl_vector3d* positions = NULL;
    chfl_frame_positions(frame, &positions, &natoms);

    size_t* less_than_five = malloc((size_t)natoms * sizeof(size_t));

    size_t matched = 0;
    for (uint64_t i=0; i<natoms; i++) {
        if (positions[i][0] < 5) {
            less_than_five[matched] = (size_t)i;
            matched++;
        }
    }

    printf("Atoms with x < 5:\n");
    for (size_t i=0; i<matched; i++) {
        printf("  - %lu", less_than_five[i]);
    }

    free(less_than_five);
    chfl_free(frame);
    chfl_trajectory_close(file);

    return 0;
}

For more information about reading frame in a trajectory, see the following functions:

Generating a structure

Now that we know how to read frames from a trajectory, let’s try to create a new structure and write it to a file. As previsouly, we start by including the chemfiles header:

#include <chemfiles.h>

We can then create a CHFL_FRAME and the two CHFL_ATOM we will need.

CHFL_FRAME* frame = chfl_frame();
CHFL_ATOM* H = chfl_atom("H");
CHFL_ATOM* O = chfl_atom("O");

We can now add the atoms to the frame, with their respective positions. The third argument to chfl_frame_add_atom() is set to NULL as we don’t need the velocities.

chfl_frame_add_atom(frame, H, (chfl_vector3d){1, 0, 0}, NULL);
chfl_frame_add_atom(frame, O, (chfl_vector3d){0, 0, 0}, NULL);
chfl_frame_add_atom(frame, H, (chfl_vector3d){0, 1, 0}, NULL);

We can then add bonds between the atoms to fully define the topology

chfl_frame_add_bond(frame, 0, 1);
chfl_frame_add_bond(frame, 2, 1);

Finally, we can set the CHFL_CELL associated with this frame. We also free the cell memory right away, as it is no longer needed.

CHFL_CELL* cell = chfl_cell((chfl_vector3d){10, 10, 10}, NULL);
chfl_frame_set_cell(frame, cell);
chfl_free(cell);

Now that our frame is constructed, it is time to write it to a file. For that, we open a trajectory in write ('w') mode, and write to it.

CHFL_TRAJECTORY* trajectory = chfl_trajectory_open("water.pdb", 'w');
chfl_trajectory_write(trajectory, frame);
chfl_trajectory_close(trajectory);

And free all remaining memory with the right function.

chfl_free(frame);
chfl_free(H);
chfl_free(O);
Click here to see the whole program
#include <chemfiles.h>
#include <stdlib.h>

int main(void) {
    CHFL_FRAME* frame = chfl_frame();
    CHFL_ATOM* H = chfl_atom("H");
    CHFL_ATOM* O = chfl_atom("O");

    chfl_frame_add_atom(frame, H, (chfl_vector3d){1, 0, 0}, NULL);
    chfl_frame_add_atom(frame, O, (chfl_vector3d){0, 0, 0}, NULL);
    chfl_frame_add_atom(frame, H, (chfl_vector3d){0, 1, 0}, NULL);

    chfl_frame_add_bond(frame, 0, 1);
    chfl_frame_add_bond(frame, 2, 1);

    CHFL_CELL* cell = chfl_cell((chfl_vector3d){10, 10, 10}, NULL);
    chfl_frame_set_cell(frame, cell);
    chfl_free(cell);

    CHFL_TRAJECTORY* trajectory = chfl_trajectory_open("water.pdb", 'w');
    chfl_trajectory_write(trajectory, frame);
    chfl_trajectory_close(trajectory);

    chfl_free(frame);
    chfl_free(H);
    chfl_free(O);

    return 0;
}

Using selections

Now that we know how to read and write frame from trajectories, how about we do a bit a filtering? In this tutorial, we will read all the frames from a file, and use selections to filter which atoms we will write back to another file. This example will also show how chemfiles can be used to convert from a file format to another one.

We start by opening the two trajectories we will need

CHFL_TRAJECTORY* input = chfl_trajectory_open("input.arc", 'r');
CHFL_TRAJECTORY* output = chfl_trajectory_open("output.pdb", 'w');

We create a CHFL_FRAME and a CHFL_SELECTION object to filter the atoms we want to keep.

CHFL_FRAME* frame = chfl_frame();
CHFL_SELECTION* selection = chfl_selection("name Zn or name N");

Then we get the number of steps in the trajectory

uint64_t nsteps = 0;
chfl_trajectory_nsteps(input, &nsteps);

And iterate over the frames in the trajectory

for (size_t step=0; step<nsteps; step++) {
    chfl_trajectory_read(input, frame);

From here, we need to use the selection to get the atoms we want to remove. This is a two steps process: first we evaluate the selection and get the number of matches

uint64_t n_matches = 0;
chfl_selection_evaluate(selection, frame, &n_matches);

Second we allocate some memory and get all the matches (represented as chfl_match):

chfl_match* matches = malloc((size_t)n_matches * sizeof(chfl_match));
chfl_selection_matches(selection, matches, n_matches);

We can get the index of atoms in a to_remove array

uint64_t* to_remove = malloc((size_t)n_matches * sizeof(uint64_t));
for (size_t i = 0; i<n_matches; i++) {
    to_remove[i] = matches[i].atoms[0];
}

In order to remove the atoms from the frame, we need to sort to_remove in descending order: removing the atom at index i will shift the index of all the atoms after i. So we need start from the end and work toward the start of the frame. The compare_matches function is used to do this sorting in reverse order with the standard qsort function.

qsort(to_remove, (size_t)n_matches, sizeof(uint64_t), compare_matches);
for (size_t i = 0; i<n_matches; i++) {
    chfl_frame_remove(frame, to_remove[i]);
}

Finally, we can write the cleaned frame to the output file, and free the memory we allocated:

chfl_trajectory_write(output, frame);

free(to_remove);
free(matches);

The compare_matches function we used to sort the matches is defined as follow:

    chfl_trajectory_close(output);

    return 0;
}

int compare_matches(const void* lhs, const void* rhs) {
    size_t left = *(const size_t*)lhs;
    size_t right = *(const size_t*)rhs;
Click here to see the whole program
#include <chemfiles.h>
#include <stdlib.h>

static int compare_matches(const void* lhs, const void* rhs);

int main(void) {
    CHFL_TRAJECTORY* input = chfl_trajectory_open("input.arc", 'r');
    CHFL_TRAJECTORY* output = chfl_trajectory_open("output.pdb", 'w');
    CHFL_FRAME* frame = chfl_frame();
    CHFL_SELECTION* selection = chfl_selection("name Zn or name N");

    uint64_t nsteps = 0;
    chfl_trajectory_nsteps(input, &nsteps);

    for (size_t step=0; step<nsteps; step++) {
        chfl_trajectory_read(input, frame);

        uint64_t n_matches = 0;
        chfl_selection_evaluate(selection, frame, &n_matches);
        chfl_match* matches = malloc((size_t)n_matches * sizeof(chfl_match));
        chfl_selection_matches(selection, matches, n_matches);

        uint64_t* to_remove = malloc((size_t)n_matches * sizeof(uint64_t));
        for (size_t i = 0; i<n_matches; i++) {
            to_remove[i] = matches[i].atoms[0];
        }

        qsort(to_remove, (size_t)n_matches, sizeof(uint64_t), compare_matches);
        for (size_t i = 0; i<n_matches; i++) {
            chfl_frame_remove(frame, to_remove[i]);
        }

        chfl_trajectory_write(output, frame);

        free(to_remove);
        free(matches);
    }

    chfl_free(frame);
    chfl_free(selection);
    chfl_trajectory_close(input);
    chfl_trajectory_close(output);

    return 0;
}

int compare_matches(const void* lhs, const void* rhs) {
    size_t left = *(const size_t*)lhs;
    size_t right = *(const size_t*)rhs;

    if (left > right) return -1;
    if (right < left) return 1;
    return 0;
}