Overview

The figure below shows how the basic classes of chemfiles are organized and how they interact. Chemfiles is organized around a handful of public classes: Trajectory, Frame, Topology, Residue, Atom, UnitCell and Selection, all of which are presented here.

_images/classes.svg

A trajectory is the main entry point of chemfiles. It reads one or many frames from a file on disk using a specific format. The file type and the format are automatically determined from the file extension.

A frame holds the data for one step of a simulation: the positions of all the atoms; optionally, the velocities of all the atoms; the topology; and the unit cell of the system.

The topology describes the organization of the particles in the system. It contains a list of the atoms in the system and information about which atoms are bonded together. A residue is a group of atoms bonded together, which may or may not correspond to a molecule. When working with bio-molecules, and specifically with proteins from the Protein Data Bank (PDB), the residues should correspond to the amino acids in the protein.

The Atom class contains basic information about the atoms in the system: the name (if it is available), mass, kind of atom and so on. Atoms are not limited to plain chemical elements.
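As a rough mental model of how these classes nest, the relationships described above can be sketched with plain Python dataclasses. Every name here (`SketchAtom`, `SketchResidue`, `SketchTopology`, `SketchFrame`) is hypothetical and chosen for illustration only; this is not the chemfiles API, and the unit cell is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vector3 = Tuple[float, float, float]


@dataclass
class SketchAtom:
    # Hypothetical stand-in for an atom: a name, a kind (type) and a mass.
    name: str
    type: str
    mass: float = 0.0


@dataclass
class SketchResidue:
    # A residue refers to its atoms by index into the topology.
    name: str
    atoms: List[int] = field(default_factory=list)


@dataclass
class SketchTopology:
    atoms: List[SketchAtom] = field(default_factory=list)
    bonds: List[Tuple[int, int]] = field(default_factory=list)
    residues: List[SketchResidue] = field(default_factory=list)


@dataclass
class SketchFrame:
    # Positions are required; velocities are optional, as in the text above.
    positions: List[Vector3]
    topology: SketchTopology
    velocities: Optional[List[Vector3]] = None


# A single water molecule: three atoms, two bonds, one residue.
water = SketchTopology(
    atoms=[
        SketchAtom("O", "O", 15.999),
        SketchAtom("H1", "H", 1.008),
        SketchAtom("H2", "H", 1.008),
    ],
    bonds=[(0, 1), (0, 2)],
    residues=[SketchResidue("HOH", [0, 1, 2])],
)
frame = SketchFrame(
    positions=[(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)],
    topology=water,
)
```

The sketch keeps one position per atom in the topology, and stores bonds and residue membership as atom indices rather than as references to atom objects.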

The UnitCell class describes the boundary conditions of the system: where the boundaries are, and what the periodicity of these boundaries is. A unit cell can be of three types: Infinite, Orthorhombic or Triclinic. Infinite cells do not have any boundaries. Orthorhombic cells are defined by three orthogonal vectors, and triclinic cells are defined by three vectors without any constraint.
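To make the difference between orthorhombic and triclinic cells concrete, here is a minimal sketch that builds three cell vectors from three lengths and three angles, using the common convention of placing the first vector along x and the second in the xy plane. The helper names `cell_vectors` and `is_orthorhombic` are hypothetical, and this convention is an assumption of the sketch, not necessarily how chemfiles stores cells internally.

```python
import math


def cell_vectors(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Build three cell vectors from lengths and angles given in degrees.

    alpha is the angle between the second and third vectors, beta between
    the first and third, gamma between the first and second.
    """
    cos_alpha = math.cos(math.radians(alpha))
    cos_beta = math.cos(math.radians(beta))
    cos_gamma = math.cos(math.radians(gamma))
    sin_gamma = math.sin(math.radians(gamma))

    va = (a, 0.0, 0.0)
    vb = (b * cos_gamma, b * sin_gamma, 0.0)
    cx = c * cos_beta
    cy = c * (cos_alpha - cos_beta * cos_gamma) / sin_gamma
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return va, vb, (cx, cy, cz)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def is_orthorhombic(vectors, tol=1e-9):
    """True when the three vectors are mutually orthogonal within tol."""
    va, vb, vc = vectors
    return all(abs(dot(u, v)) < tol for u, v in ((va, vb), (va, vc), (vb, vc)))


ortho = cell_vectors(10.0, 20.0, 30.0)                    # all angles at 90 degrees
tric = cell_vectors(10.0, 10.0, 10.0, 60.0, 60.0, 60.0)   # no orthogonality constraint
```

With all three angles at 90 degrees the vectors are mutually orthogonal, which is the orthorhombic case; any other combination of angles gives a triclinic cell.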

The Property class stores additional data or metadata associated with frames, residues or atoms. Properties can store string values, numeric values, Boolean values or vector values.

Chemfiles also provides a selection language, implemented in the Selection class. This selection language allows the users to select a group of atoms from a frame using a selection string such as "(x < 45 and name O) or name C".
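To show what the example selection string means, the following sketch hand-translates "(x < 45 and name O) or name C" into a plain Python predicate and applies it to a short list of atoms. `ToyAtom`, `matches` and `select` are hypothetical names for this illustration; the sketch does not parse selection strings and is not the Selection class.

```python
from dataclasses import dataclass


@dataclass
class ToyAtom:
    # Hypothetical atom record: a name and a position in Angstroms.
    name: str
    x: float
    y: float
    z: float


def matches(atom):
    """Hand-written equivalent of "(x < 45 and name O) or name C"."""
    return (atom.x < 45 and atom.name == "O") or atom.name == "C"


def select(atoms):
    """Return the indices of the atoms that satisfy the predicate."""
    return [index for index, atom in enumerate(atoms) if matches(atom)]


atoms = [
    ToyAtom("O", 10.0, 0.0, 0.0),  # oxygen with x < 45: selected
    ToyAtom("O", 50.0, 0.0, 0.0),  # oxygen with x >= 45: not selected
    ToyAtom("C", 80.0, 0.0, 0.0),  # carbon: selected whatever its position
    ToyAtom("H", 1.0, 0.0, 0.0),   # hydrogen: not selected
]
selected = select(atoms)  # [0, 2]
```

In this reading, the result of a selection is a list of atom indices in the frame rather than a copy of the atoms themselves.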

Units

Chemfiles uses the following set of internal units:

  • lengths (positions and cell lengths) are in Angstroms;

  • velocities are in Angstroms/picosecond;

  • angles are in degrees.

When reading from a file, chemfiles tries to convert the data stored in the file to these units. Some formats do not document the units of the values they store, in which case the data is read as-is and assumed to follow the units above.

When writing to a file, chemfiles tries to convert from these units to the units expected by the format. If the format does not have a way to specify units and does not define units in its specification, then chemfiles will write its internal data as-is.
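As an illustration of the kind of conversion described above, the sketch below converts data from a hypothetical format that stores lengths in nanometers, velocities in nanometers/picosecond and angles in radians into the internal units. The function names are assumptions made for this example; chemfiles performs such conversions internally and these helpers are not part of its API.

```python
import math

# 1 nanometer = 10 Angstroms
NM_TO_ANGSTROM = 10.0


def positions_to_angstrom(positions_nm):
    """Convert a list of (x, y, z) positions from nanometers to Angstroms."""
    return [tuple(value * NM_TO_ANGSTROM for value in position)
            for position in positions_nm]


def velocities_to_angstrom_per_ps(velocities_nm_ps):
    """Convert velocities from nm/ps to Angstroms/ps.

    The time unit is unchanged, so only the length factor applies.
    """
    return [tuple(value * NM_TO_ANGSTROM for value in velocity)
            for velocity in velocities_nm_ps]


def angle_to_degrees(angle_rad):
    """Convert an angle from radians to degrees."""
    return math.degrees(angle_rad)


positions = positions_to_angstrom([(0.1, 0.2, 0.3)])
velocities = velocities_to_angstrom_per_ps([(1.0, 0.0, -0.5)])
right_angle = angle_to_degrees(math.pi / 2)
```

Writing would apply the inverse factors. For a format that gives no unit information, the values would be passed through unchanged, as described above.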