AMBER Trajectory NetCDF Convention

Version 1.0, Revision B


John Mongan (jmongan@mccammon.ucsd.edu)
20th February 2006

Introduction

The file format described in this document was developed for storing data generated by molecular dynamics simulations. It was introduced in version 9 of the
AMBER suite of programs (http://amber.scripps.edu).
The primary design goals of this format are:
Efficient input and output
Compact, high-precision representation of data
Portability of data files across different machine architectures
Extensibility of the format (ability to add additional data without re-writing parsers)
Compatibility with existing tools and formats
The file format is based on NetCDF (Network Common Data Form), developed
by Unidata (http://www.unidata.ucar.edu/software/netcdf/). NetCDF is designed
for representation of arbitrary array-based data. Unidata provides libraries with
bindings in C, C++, Fortran (F77 and F90), Java, Python, Perl, Ruby and MATLAB
for reading and writing NetCDF files. The design goals above are largely met by
NetCDF and the libraries that implement it. It is expected that all I/O of the format
described here will occur through these libraries; this specification describes the
file format at a high level in terms of the API implemented by version 3.6 of these
libraries. In NetCDF terms, this document is a Convention, describing the names,
dimensions and attributes of the arrays that may be present in the file.

Program behavior

Programs creating trajectory files (creators) shall adhere strictly to the requirements of this document. Programs reading trajectory files (readers) shall be as
permissive as possible in applying the requirements of this document. Readers may
emit warnings if out-of-spec files are encountered; these warnings should include
information about the program that originally created the file (see Global attributes,
section 4). Readers shall not fail to read a file unless the required information cannot be located or interpreted. In particular, to ensure forward compatibility with
later extension of the format, readers shall not fail or emit warnings if elements not
described in this document are present in the file.

NetCDF file encoding

Trajectory files shall be encoded in the manner employed by NetCDF version 3.x.

Those using NetCDF versions 4 or later should take care to ensure that files are
read and written using this encoding, and not the HDF5 encoding.

Trajectory files shall use 64 bit offsets


This can be accomplished by setting a flag during file creation; refer to API docs
for details.
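
As an illustration only (this check is not part of the convention), a program could inspect the first four bytes of a file to see which encoding it uses before handing it to the NetCDF library. The sketch below assumes the file signatures used by the NetCDF library: `CDF` followed by a version byte (1 for the classic format, 2 for 64 bit offsets), and `\x89HDF` for HDF5-based files. The function names are hypothetical.

```python
def detect_netcdf_encoding(header: bytes) -> str:
    """Classify a file from its first four bytes (hypothetical helper)."""
    magic = header[:4]
    if magic == b"CDF\x01":
        return "classic"        # NetCDF 3.x encoding, 32 bit offsets
    if magic == b"CDF\x02":
        return "64bit-offset"   # NetCDF 3.x encoding, 64 bit offsets
    if magic == b"\x89HDF":
        return "hdf5"           # HDF5 encoding used by NetCDF 4
    return "unknown"


def encoding_conforms(header: bytes) -> bool:
    """True only for the 64 bit offset encoding required of creators."""
    return detect_netcdf_encoding(header) == "64bit-offset"
```

A caller would typically pass the result of `open(path, "rb").read(4)`. In keeping with the permissive-reader rule, a reader might reasonably warn about a classic-format file and still attempt to read it.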

Global attributes

Global attributes shall have type character string. Spelling and capitalization of attribute names shall be exactly as appears below. Creators shall include all attributes
marked required and may include attributes marked optional. Creators shall not
write an attribute string having a length greater than 80 characters. Readers may
warn about missing required attributes, but shall not fail, except in the case of a
missing or unexpected Conventions or ConventionVersion attribute.

Conventions (required)
Contents of this attribute are a comma or space delimited list of tokens representing all of the conventions to which the file conforms. Creators shall include the
string AMBER as one of the tokens in this list. In the usual case, where the file conforms only to this convention, the value of the attribute will simply be AMBER.
Readers may fail if this attribute is not present or none of the tokens in the list are
AMBER. Optionally, if the reader does not expect NetCDF files other than those
conforming to the AMBER convention, it may emit a warning and attempt to read
the file even when the Conventions attribute is missing.
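
A minimal sketch of how a reader might interpret the Conventions attribute, assuming the attribute value has already been read into a Python string (or is `None` when the attribute is absent). The function names are hypothetical; only the delimiter rule and the AMBER token come from the text above.

```python
import re


def convention_tokens(value: str) -> list:
    """Split a Conventions value on commas and/or whitespace."""
    return [tok for tok in re.split(r"[,\s]+", value.strip()) if tok]


def conforms_to_amber(value) -> bool:
    """True if AMBER is one of the tokens; a missing attribute gives False."""
    if value is None:
        return False
    return "AMBER" in convention_tokens(value)
```

The comparison is exact and case sensitive, so a token such as `AMBERX` or `amber` does not match. What the reader does when the function returns `False` (fail, or warn and continue) is left to the reader, as described above.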

ConventionVersion (required)
Contents are a string representation of the version number of this convention. Future revisions of this convention having the same version number may include definitions of additional variables, dimensions or attributes, but are guaranteed to have
no incompatible changes to variables, dimensions or attributes specified in previous
revisions. Creators shall set this attribute to 1.0. If this attribute is present and
has a value other than 1.0, readers may fail or may emit a warning and continue.
It is expected that the version of this convention will change rarely, if ever.

application (optional)
If the creator is part of a suite of programs or modules, this attribute shall be set to
the name of the suite.

program (required)
Creators shall set this attribute to the name of the creating program or module.

programVersion (required)
Creators shall set this attribute to the preferred textual formatting of the current
version number of the creating program or module.

title (optional)
Creators may use this attribute to represent a user-defined title for the data
represented in the file. Absence of a title may be indicated by omitting the attribute
or by including it with an empty string value.
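
The creator-side rules for global attributes can be summarized in a small self-check. This is a sketch under the assumption that the attributes are held in a plain Python dict before being written; the function name and the wording of the messages are hypothetical.

```python
REQUIRED = ("Conventions", "ConventionVersion", "program", "programVersion")
MAX_LEN = 80  # creators shall not write longer attribute strings


def check_global_attributes(attrs: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name in REQUIRED:
        if name not in attrs:
            problems.append(f"missing required attribute: {name}")
    for name, value in attrs.items():
        if not isinstance(value, str):
            problems.append(f"{name}: value is not a character string")
        elif len(value) > MAX_LEN:
            problems.append(f"{name}: longer than {MAX_LEN} characters")
    if attrs.get("ConventionVersion", "1.0") != "1.0":
        problems.append("ConventionVersion: creators shall write 1.0")
    return problems
```

This check is meant for creators. A reader applying the same checks would be expected to treat most of the results as warnings at most, since only problems with Conventions or ConventionVersion permit a reader to fail.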

5 Dimensions
frame (required, length unlimited)
Coordinates along the frame dimension will generally represent data taken from
different time steps, but may represent arbitrary conformation numbers when the
trajectory file does not represent a true trajectory but rather a collection of conformations (e.g. from clustering).

spatial (required, length 3)


This dimension represents the three spatial dimensions (X,Y,Z), in that order.

atom (required, length set as appropriate)


Coordinates along this dimension are the indices of particles for which data is
stored in the file. The length of this dimension may be different (generally smaller)
than the actual number of particles in the simulation if the user chooses to store
data for only a subset of particles.

cell_spatial (optional, length 3)


This dimension represents the three lengths (a,b,c) that define the size of the unit
cell.

cell_angular (optional, length 3)


This dimension represents the three angles (alpha,beta,gamma) that define the shape
of the unit cell.

label (optional, length set as appropriate)


This dimension is used for character strings in label variables where the label is
longer than a single character. The length of this dimension is equal to the length
of the longest label string.

Variables

Variables are described below as <type> <name>(<dimension> [,<dimension>..])


Note that the order of dimensions corresponds to the CDL and C APIs. When
using the Fortran APIs, the order of dimensions should be reversed.
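
To make the dimension-order note concrete, the sketch below shows where one value sits in a (frame, atom, spatial) array when the last dimension varies fastest, as in the CDL and C ordering, and how the dimension list is reversed for Fortran. The helper names are hypothetical and the code only illustrates the index arithmetic; it does not call the NetCDF library.

```python
def flat_index(frame: int, atom: int, spatial: int,
               n_atom: int, n_spatial: int = 3) -> int:
    """Element offset within a (frame, atom, spatial) array, C ordering."""
    return (frame * n_atom + atom) * n_spatial + spatial


def fortran_dimension_order(c_order):
    """Reverse a CDL/C dimension list to obtain the Fortran ordering."""
    return tuple(reversed(c_order))
```

For a file with 28 atoms, the x coordinate of the first atom in the second frame is at offset `flat_index(1, 0, 0, 28)`, that is, 84 values into the array. A Fortran program would declare the same variable with dimensions (spatial, atom, frame).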

6.1 Label variables

Label variables shall be written by creators whenever their corresponding dimension is present. These variables are for self-description purposes, so readers may
generally ignore them. Labels requiring more than one character per coordinate
shall use the label dimension. Individual coordinate labels that are shorter than
the length of the label dimension shall be space padded to the length of the label
dimension.

char spatial(spatial)
Creators shall write the string xyz to this variable, indicating the labels for coordinates along the spatial dimension.

char cell_spatial(cell_spatial)
Creators shall write the string abc to this variable, indicating the labels for the
three lengths defining the size of the unit cell.

char cell_angular(cell_angular, label)


Creators shall write the strings alpha, beta, gamma to this variable, naming
the angles defining the shape of the unit cell.
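
The padding rule for multi-character labels can be sketched as follows. The function name is hypothetical; the rule it applies (the label dimension is as long as the longest label, and shorter labels are space padded to that length) is the one stated in this section.

```python
def pad_labels(labels):
    """Return (label dimension length, labels space padded to that length)."""
    width = max(len(label) for label in labels)
    return width, [label.ljust(width) for label in labels]
```

For the cell_angular labels, `pad_labels(["alpha", "beta", "gamma"])` gives a label dimension of length 5 and the strings `"alpha"`, `"beta "` and `"gamma"`.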

6.2 Data variables

All data variables are optional. Some data variables have dependencies on other
data variables, as described below. Creators shall define a units attribute of type
character string for each variable as described below. Creators may define a scale_factor
attribute of type float for each variable. Creators shall ensure that the units of data
values, after being multiplied by the value of scale_factor (if it exists) are equal
to that described by the units attribute. If a scale_factor attribute exists for a variable, readers shall multiply data values by the value of the scale_factor attribute
before interpreting the data. This scaling burden is placed on the reader rather than
the creator, as writing data is expected to be a more time-sensitive operation than
reading it.
It is left as an implementation detail whether creators create a separate file for
each variable grouping (e.g. coordinates and velocities) or a single file containing
all variables. Some creators may allow the user to select the approach. Readers
should support reading both styles, that is, combining data from multiple files or
reading it all from a single file.
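
The scale_factor rule described above places one small obligation on readers. A minimal sketch, assuming the raw values and the variable's attributes have already been read into a Python sequence and dict (the function name is hypothetical):

```python
def apply_scale_factor(raw_values, attributes):
    """Multiply by scale_factor when the attribute exists, else pass through."""
    scale = attributes.get("scale_factor")
    if scale is None:
        return list(raw_values)
    return [value * scale for value in raw_values]
```

After this step the values are in the units named by the variable's units attribute.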

float time(frame) units = picosecond


When coordinates on the frame dimension have a temporal sequence (e.g. they
form a molecular dynamics trajectory), creators shall define this variable and
write a float for each frame coordinate representing the number of picoseconds
of simulated time elapsed since the start of the trajectory. When the file stores a
collection of conformations having no temporal sequence, creators shall omit this
variable.

float coordinates(frame, atom, spatial) units = angstrom


This variable shall contain the Cartesian coordinates of the specified particle for
the specified frame.

double cell_lengths(frame, cell_spatial) units = angstrom


When the coordinates variable is included and the data in the coordinates variable
come from a simulation with periodic boundaries, creators shall include this variable. This variable shall represent the lengths (a,b,c) of the unit cell for each frame.
The edge with length a lies along the x axis; the edge with length b lies in the x-y
plane. The origin (point of invariance under scaling) of the unit cell is defined as
(0,0,0). If the simulation has one or two dimensional periodicity, then the length(s)
corresponding to spatial dimensions in which there is no periodicity shall be set to
zero.
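
A reader can recover the periodicity of a frame from the zero-length rule above. A minimal sketch, assuming the three cell lengths for one frame are available as Python floats (the function name is hypothetical):

```python
def periodic_axes(cell_lengths):
    """Return a (bool, bool, bool) tuple: is the cell periodic along a, b, c?"""
    a, b, c = cell_lengths
    return tuple(length != 0.0 for length in (a, b, c))
```

For example, a slab simulation that is periodic only along a and b would store a zero for c, and the function reports the third entry as not periodic.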

double cell_angles(frame, cell_angular) units = degree


Creators shall include this variable if and only if they include the cell_lengths variable. This variable shall represent the angles (α, β, γ) defining the unit cell for each
frame. α defines the angle between the a-b and a-c planes, β defines the angle
between the a-b and b-c planes and γ defines the angle between the a-c and b-c
planes. Angles that are undefined due to less than three dimensional periodicity
shall be set to zero.

float velocities(frame, atom, spatial) units = angstrom/picosecond


When the velocities variable is present, it shall represent the Cartesian components
of the velocity for the specified particle and frame. It is recognized that due to the
nature of commonly used integrators in molecular dynamics, it may not be possible
for the creator to write a set of velocities corresponding to exactly the same point
in time as defined by the time variable and represented in the coordinates variable.
In such cases, the creator shall write a set of velocities from the nearest point in
time to that represented by the specified frame.
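
The nearest-point rule can be illustrated with a small selection function. This is a sketch only: it assumes the creator knows the simulated times (in picoseconds) at which velocities are available, for instance half steps of an integrator, and the function name is hypothetical.

```python
def nearest_velocity_time(frame_time, velocity_times):
    """Pick the available velocity time closest to the frame's time."""
    return min(velocity_times, key=lambda t: abs(t - frame_time))
```

When two available times are equally close, `min` returns the first one in the sequence; the convention does not say which to prefer in that case.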

Example

The following is an example of the CDL for a trajectory file conforming to the
preceding specification and containing most of the elements described in this document. This CDL was generated using ncdump -h <trajectory file>.
netcdf mdtrj {
dimensions:
    frame = UNLIMITED ; // (10 currently)
    spatial = 3 ;
    atom = 28 ;
    cell_spatial = 3 ;
    cell_angular = 3 ;
    label = 5 ;
variables:
    char spatial(spatial) ;
    char cell_spatial(cell_spatial) ;
    char cell_angular(cell_angular, label) ;
    float time(frame) ;
        time:units = "picosecond" ;
    float coordinates(frame, atom, spatial) ;
        coordinates:units = "angstrom" ;
    double cell_lengths(frame, cell_spatial) ;
        cell_lengths:units = "angstrom" ;
    double cell_angles(frame, cell_angular) ;
        cell_angles:units = "degree" ;
    float velocities(frame, atom, spatial) ;
        velocities:units = "angstrom/picosecond" ;
        velocities:scale_factor = 20.455f ;

// global attributes:
        :title = "netCDF output test" ;
        :application = "AMBER" ;
        :program = "sander" ;
        :programVersion = "9.0" ;
        :Conventions = "AMBER" ;
        :ConventionVersion = "1.0" ;
}

Extensions and modifications

Standards and formats are most useful when they are supported widely, and become less useful and more burdensome if they fragment into multiple dialects. If
you plan to support additional variables, dimensions or attributes beyond those described here in a publicly released creator or reader program, please contact the
author (jmongan@mccammon.ucsd.edu) for inclusion of these elements into a future revision of this document.

Revision history
Revision A, February 9, 2006: Initial document
Revision B, February 20, 2006: Better self-description for unit cells in periodic simulations; standards for indicating one and two dimensional periodicity.
