Desmond Manual
Desmond 3.1
User Manual
Schrödinger Press
Desmond User Manual Copyright © 2012 Schrödinger, LLC. All rights reserved.
While care has been taken in the preparation of this publication, Schrödinger
assumes no responsibility for errors or omissions, or for damages resulting from
the use of the information contained herein.
Schrödinger software includes software and libraries provided by third parties. For
details of the copyrights, and terms and conditions associated with such included
third party software, see the Legal Notices, or use your browser to open
$SCHRODINGER/docs/html/third_party_legal.html (Linux OS) or
%SCHRODINGER%\docs\html\third_party_legal.html (Windows OS).
This publication may refer to other third party software not included in or with
Schrödinger software ("such other third party software"), and provide links to third
party Web sites ("linked sites"). References to such other third party software or
linked sites do not constitute an endorsement by Schrödinger, LLC or its affiliates.
Use of such other third party software and linked sites may be subject to third
party license agreements and fees. Schrödinger, LLC and its affiliates have no
responsibility or liability, directly or indirectly, for such other third party software
and linked sites, or for damage resulting from the use thereof. Any warranties that
we make regarding Schrödinger products and services do not apply to such other
third party software or linked sites, or to the interaction between, or
interoperability of, Schrödinger products and services and such other third party
software.
Chapter 9: Utilities
9.1 solvate_pocket
9.1.1 Methodology
9.1.2 Command Syntax
9.1.3 Command File Syntax
In addition to the use of italics for names of documents, the font conventions that are used in
this document are summarized in the table below.
Font                  Example               Use
Sans serif            Project Table         Names of GUI features, such as panels, menus, menu items, buttons, and labels
Monospace             $SCHRODINGER/maestro  File names, directory names, commands, environment variables, command input and output
Italic                filename              Text that the user must replace with a value
Sans serif uppercase  CTRL+H                Keyboard keys
Links to other locations in the current document or to other PDF documents are colored like
this: Document Conventions.
In descriptions of command syntax, the following UNIX conventions are used: braces { }
enclose a choice of required items, square brackets [ ] enclose optional items, and the bar
symbol | separates items in a list from which one item must be chosen. Lines of command
syntax that wrap should be interpreted as a single command.
File name, path, and environment variable syntax is generally given with the UNIX conventions.
To obtain the Windows conventions, replace the forward slash / with the backslash \ in
path or directory names, and replace the $ at the beginning of an environment variable with a %
at each end. For example, $SCHRODINGER/maestro becomes %SCHRODINGER%\maestro.
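The conversion rule above can be sketched in a few lines of Python. This is only an illustration of the convention, not a utility shipped with the software; the function name unix_to_windows is hypothetical, and the sketch assumes that environment variable names consist of letters, digits, and underscores.

```python
import re


def unix_to_windows(text: str) -> str:
    """Rewrite UNIX-style path syntax in the Windows convention (illustrative only)."""
    # A leading $ on an environment variable becomes a % at each end of the name.
    converted = re.sub(r"\$([A-Za-z_][A-Za-z0-9_]*)", r"%\1%", text)
    # Forward slashes in path or directory names become backslashes.
    return converted.replace("/", "\\")


print(unix_to_windows("$SCHRODINGER/maestro"))
```

Applied to $SCHRODINGER/docs/html/third_party_legal.html, the same function yields the Windows form of the legal notices path given at the front of this manual.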
Keyboard references are given in the Windows convention by default, with Mac equivalents in
parentheses, for example CTRL+H (⌘H). Where Mac equivalents are not given, COMMAND
should be read in place of CTRL. The convention CTRL-H is not used.
In this document, to type text means to type the required text in the specified location, and to
enter text means to type the required text, then press the ENTER key.
References to literature sources are given in square brackets, like this: [10].
Chapter 1
Chapter 1: Introduction
A description of Desmond was published, along with performance data, as part of the
conference proceedings of the ACM/IEEE Conference on Supercomputing 2006 (SC06) [1]. While
developing Desmond, D. E. Shaw Research has introduced and extended a number of scientific
algorithms, including new parallelization strategies and numerical techniques, some of which
• The Protein Preparation Wizard, LigPrep (ligand structure) and Epik (ligand protonation
state) preparation tools can be used to ensure that the structures provided to Desmond are
chemically correct. Such careful system preparation often represents a crucial step prior
to initiating a molecular dynamics simulation.
• Prime can be used to create homology models for use in simulations and to repair protein
structures.
• Glide can be used to generate relevant poses within protein binding sites for use in
simulations. Desmond in turn can be used to thermally relax, refine, and sample
conformations related to the docked poses.
• Strike can be used to generate statistical models from the results of simulations.
• Desmond can be used to sample protein structures prior to performing docking
calculations with Glide.
• SiteMap can be used to identify potential binding sites from simulation results.
• WaterMap analyzes specially designed Desmond simulations to characterize the
thermodynamics of water in protein binding sites.
Although Desmond can run serially, for most purposes, you will want to make use of the
parallel execution capabilities. Desmond uses Open MPI for parallel execution. Before you can
run jobs, however, you must add entries to the hosts file for parallel execution with Open MPI,
in addition to any configuration that is needed for the hosts and the queueing system. See the
Installation Guide for instructions, especially Chapter 7 and Section 7.3.3.
In addition, much of the framework for running Desmond jobs has been written in Python to
facilitate adaptation to user-specific requirements, including the automation of larger and more
specific workflows.
• Easy to use with a focus on the real problem of interest rather than the details of the
calculation
• Support for absolute and relative solvation free energy calculations
• Support for relative binding free energy calculations
• Restarting and customization of FEP jobs via FEP panel
Most Desmond-related tools are available from the Desmond submenu of the Applications
menu in Maestro. The exception is the trajectory viewer, which is opened from the Project
Table using the output entry for a job. Desmond panels can also be opened from Tasks →
Molecular Dynamics.
If you have MacroModel, you can perform a quick check on the structure by performing a
Current Energy calculation (available from the MacroModel submenu of the Applications menu)
using the OPLS_2005 force field with the Solvent set to None. If that calculation succeeds, it is
almost certain that Desmond and its associated tools will be able to work with this structure as
well. If the structure is problematic, Maestro and MacroModel often provide useful diagnostics
for what might be wrong.
Examining the results, including viewing a trajectory, and analysis of results, is described in
Chapter 5.
FEP jobs are handled differently due to the complexity of the calculations and the fact that the
overall goal for an FEP job is to produce one number: the free energy change. FEP jobs are
supported for specific types of calculations, using automated procedures that differ from those
used for individual, general purpose Desmond simulations.
1. Import the structure file for the system of interest into Maestro.
2. Prepare the structure for simulation with the Protein Preparation Wizard. This step
involves removing ions and molecules (which are artifacts of crystallization), setting correct
bond orders, adding hydrogens, filling in missing side chains or whole residues as
necessary, reorienting various groups and varying residue protonation states to optimize
the hydrogen bonding network, and then checking the structure carefully.
3. If your system is a membrane protein, embed the protein in the membrane. This step and
the next two steps are performed in the System Builder panel.
4. Generate a solvated system for simulation.
5. Distribute positive or negative counter ions to neutralize the system, and introduce
additional ions to set the desired ionic strength (when necessary).
6. Relax the system either by minimization or by selecting the panel option to relax the
model system before simulation.
7. Set the simulation parameters in one of the general Desmond panels, for molecular
dynamics, simulated annealing, or replica exchange.
8. Run the simulation.
9. Analyze your results using the Trajectory Viewer and other analysis tools.
Linux:
To run any Schrödinger program on a Linux platform, or start a Schrödinger job on a remote
host from a Linux platform, you must first set the SCHRODINGER environment variable to the
installation directory for your Schrödinger software. To set this variable, enter the following
command at a shell prompt:
Once you have set the SCHRODINGER environment variable, you can run programs and utilities
with the following commands:
$SCHRODINGER/program &
$SCHRODINGER/utilities/utility &
You can start the Maestro interface with the following command:
$SCHRODINGER/maestro &
It is usually a good idea to change to the desired working directory before starting Maestro.
This directory then becomes Maestro’s working directory.
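Because much of the job framework is written in Python, the command forms above can also be assembled in a script. The following is a minimal sketch, assuming only the two path patterns shown above; the function name schrodinger_command and the installation directory /opt/schrodinger are hypothetical, and the sketch builds the command path without launching anything.

```python
import os
import posixpath


def schrodinger_command(name, utility=False, environ=None):
    """Build $SCHRODINGER/name or $SCHRODINGER/utilities/name (illustrative only)."""
    env = os.environ if environ is None else environ
    root = env.get("SCHRODINGER")
    if not root:
        # The variable must point to the installation directory before anything can run.
        raise RuntimeError("The SCHRODINGER environment variable is not set")
    if utility:
        return posixpath.join(root, "utilities", name)
    return posixpath.join(root, name)


# Hypothetical installation directory, used here only for demonstration.
demo_env = {"SCHRODINGER": "/opt/schrodinger"}
print(schrodinger_command("maestro", environ=demo_env))
```

The resulting string could be passed to subprocess.Popen to start the program in the background, which corresponds to the trailing & in the shell commands.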
Windows:
The primary way of running Schrödinger applications on a Windows platform is from a
graphical interface. To start the Maestro interface, double-click on the Maestro icon, on a Maestro
project, or on a structure file; or choose Start → All Programs → Schrodinger-2012 → Maestro.
You do not need to make any settings before starting Maestro or running programs. The default
working directory is the Schrodinger folder in your documents folder (Documents on Windows
7/Vista, My Documents on XP).
If you want to run applications from the command line, you can do so in one of the shells that
are provided with the installation and that have the Schrödinger environment set up:
Mac:
The primary way of running Schrödinger software on a Mac is from a graphical interface. To
start the Maestro interface, click its icon on the dock. If there is no Maestro icon on the dock,
you can put one there by dragging it from the SchrodingerSuite2012 folder in your Applications
folder. This folder contains icons for all the available interfaces. The default working directory
is the Schrodinger folder in your Documents folder ($HOME/Documents/Schrodinger).
Running software from the command line is similar to Linux—open a terminal window and
run the program. You can also start Maestro from the command line in the same way as on
Linux. The default working directory is then the directory from which you start Maestro. You
do not need to set the SCHRODINGER environment variable, as this is set in your default
environment on installation. If you need to set any other variables, use the command
Desmond Molecular Dynamics System, version 3.1, D. E. Shaw Research, New York, NY,
2012. Maestro-Desmond Interoperability Tools, version 3.1, Schrödinger, New York, NY,
2012.
[1] Kevin J. Bowers, Edmond Chow, Huafeng Xu, Ron O. Dror, Michael P. Eastwood, Brent A.
Gregersen, John L. Klepeis, Istvan Kolossvary, Mark A. Moraes, Federico D. Sacerdoti, John
K. Salmon, Yibing Shan, and David E. Shaw, “Scalable Algorithms for Molecular Dynamics
Simulations on Commodity Clusters,” Proceedings of the ACM/IEEE Conference on
Supercomputing (SC06), Tampa, Florida, November 11-17, 2006.
Chapter 2
Protein and ligand structures used in a Desmond simulation must be complete all-atom 3D
structures with a reasonable geometry. The preparation of protein and ligand structures for use
in a simulation can be done with the Protein Preparation Wizard and LigPrep. The Protein
Preparation Wizard corrects structural defects, adds hydrogen atoms, assigns bond orders, and
can selectively assign tautomerization and ionization states, and optimize the hydrogen
bonding network. For more information, see the Protein Preparation Guide. LigPrep performs
2D-to-3D conversion if necessary, adds hydrogen atoms, generates tautomers, ionization
states, ring conformations, and stereoisomers, as requested, and produces minimized 3D
structures. For more information, see the LigPrep User Manual.
Once you have prepared the protein and ligand structures, you can proceed to the remaining
tasks in building a model system that can include proteins, ligands, explicit solvent, a
membrane, and counter ions. The System Builder automates this process and significantly
reduces the effort required. You can set up and run a System Builder job from the System
Builder panel, or from the command line. See Section 6.3 on page 75 for information on
running the System Builder from the command line.
• None—Do not use a solvent. This option allows you to run a simulation on a pure liquid,
for example, or in vacuum (with a sufficiently large box).
• Predefined—Use one of the predefined solvent models, which you can select from the
option menu. The models include four water models, SPC, TIP3P, TIP4P, and TIP4PEW,
and three organic solvents, methanol, octanol, and dimethyl sulfoxide (DMSO).
• Custom—Import a custom solvent system from file. Enter the solvent system file name in
the text box, or click Browse and navigate to the solvent system file in the file selector
that is displayed.
The solvent is placed by replicating “boxes” of solvent molecules and deleting molecules
whose center of mass lies outside the periodic box boundary, and molecules that are inside or
have significant overlap with the solute or the membrane (if one is used).
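The two deletion rules described above can be illustrated with a small Python sketch. It is not the System Builder implementation: it assumes a cubic solvent template and a cubic simulation box, treats the membrane (if any) as part of the solute coordinates, and uses a hypothetical distance cutoff, min_dist, as a stand-in for "significant overlap". All function and parameter names are invented for the example.

```python
import itertools
import math


def center_of_mass(atoms):
    """atoms: list of (mass, (x, y, z)); returns the mass-weighted center."""
    total = sum(mass for mass, _ in atoms)
    return tuple(sum(mass * xyz[i] for mass, xyz in atoms) / total for i in range(3))


def place_solvent(template, template_edge, box_edge, solute_xyz, min_dist):
    """Tile a cubic solvent template over the cubic box [0, box_edge)^3.

    template: list of molecules, each a list of (mass, (x, y, z)).
    solute_xyz: solute (and membrane) atom positions.
    min_dist: hypothetical cutoff standing in for "significant overlap".
    """
    copies = math.ceil(box_edge / template_edge)
    kept = []
    for shift in itertools.product(range(copies), repeat=3):
        for molecule in template:
            moved = [
                (mass, tuple(c + s * template_edge for c, s in zip(xyz, shift)))
                for mass, xyz in molecule
            ]
            # Rule 1: delete molecules whose center of mass lies outside the box.
            if not all(0.0 <= c < box_edge for c in center_of_mass(moved)):
                continue
            # Rule 2: delete molecules that come too close to any solute atom.
            if any(math.dist(xyz, s) < min_dist for _, xyz in moved for s in solute_xyz):
                continue
            kept.append(moved)
    return kept
```

A production implementation would use a spatial grid or neighbor list instead of the all-pairs distance test, which is used here only to keep the sketch short.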
To set up the box, first choose the shape from the Box shape option menu. Three basic shapes
are provided: Cubic, Orthorhombic, and Triclinic. As special cases of the triclinic box shape,
three other shapes are supported: Truncated octahedron, Rhombic dodecahedron xy-square,
and Rhombic dodecahedron xy-hexagon.
When you have chosen the box shape, you can choose whether to specify the size of the box in
terms of a buffer distance or as an absolute size, by selecting one of the Box size calculation
method options:
• Buffer—The simulation box size is calculated by using the given buffer distance between
the solute structures and the simulation box boundary.
• Absolute size—Specify the lengths of the sides of the simulation box (and angles if
necessary).
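As a rough illustration of the Buffer method, the sketch below computes the side lengths of an orthorhombic box from solute coordinates. It assumes that the buffer distance is applied on both sides of the solute extent along each axis and that the solute is not reoriented; the function names are hypothetical and the actual System Builder calculation may differ.

```python
def orthorhombic_box_from_buffer(coords, buffer):
    """Side lengths (a, b, c) of a box enclosing coords plus a buffer on every side."""
    mins = [min(xyz[i] for xyz in coords) for i in range(3)]
    maxs = [max(xyz[i] for xyz in coords) for i in range(3)]
    # Assumption: the buffer separates the solute from the boundary in both directions.
    return tuple((hi - lo) + 2.0 * buffer for lo, hi in zip(mins, maxs))


def box_volume(sides):
    """Volume of an orthorhombic box (all angles 90 degrees)."""
    a, b, c = sides
    return a * b * c


# Two hypothetical solute atoms spanning 10 x 20 x 30, with a buffer of 10.
sides = orthorhombic_box_from_buffer([(0.0, 0.0, 0.0), (10.0, 20.0, 30.0)], 10.0)
print(sides, box_volume(sides))
```

With the Absolute size method, the side lengths would be supplied directly instead of being derived from the solute extent.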
Having chosen a method, you can specify the distances and angles in the Distances and Angles
text boxes. The text boxes that are available depend on the box shape. For all choices except a
truncated octahedron, the box can be displayed in the Workspace by selecting Show boundary
box.
If you want to calculate the volume of the box that encompasses the solutes, click Calculate.
The volume is displayed in the Box volume text box. To minimize the volume of the box, click
Minimize Volume. The solutes are reoriented so that the box volume is minimized.
There are four predefined membranes, DPPC, DMPC, POPC, and POPE, which you can
choose by selecting Predefined, and choosing the membrane from the option menu. The
temperature at which the membrane patch was preequilibrated is given in parentheses after the
membrane name. Because DPPC has a gel transition temperature around 313 K, the recommended
minimum simulation temperature for DPPC is 325 K.
If you want to position a custom membrane, select Custom, and enter the name of the Maestro
file containing the membrane model in the text box, or click Browse and navigate to the file.
If you have an existing membrane in a project entry that you want to use for the current model
system, you can load it by selecting the entry and clicking Load Membrane Position from
Selected Entry. The membrane from this entry is then used for the model system you are
building.
When you click Place Automatically, the membrane position is determined according to the
information available, as follows:
• If you have a protein from the OPM database (http://opm.phar.umich.edu), the membrane
is placed using the information provided with the protein.
• Otherwise the surface of the membrane is placed perpendicular to the longest axis of the
protein.
• If transmembrane atoms are defined, they are placed inside the membrane. Placement of
transmembrane atoms inside the membrane takes precedence over placement
perpendicular to the longest axis.
To define the transmembrane atoms, you can do one of the following:
• Click Select, and use the Atom Selection dialog box to select the desired atoms. For
more information on this dialog box, see Section 6.5 of the Maestro User Manual.
• Click Set to Helices. This button sets the ASL expression to res.sec helix,
which selects the residues whose secondary structure assignment is “helix”. The
protein must have a secondary structure assignment before you use this button. To
do the assignment, choose Tools → Assign Secondary Structure in the main
window.
If you have a protein that is prealigned, you can click Place on Prealigned Structure to place
the membrane. The membrane is positioned symmetrically about the coordinate origin so that
its surfaces are parallel to the xy plane (perpendicular to the z axis). This means that the protein
must be aligned accordingly.
When you have placed the membrane, a representation of the membrane is displayed in the
Workspace. The representation consists of two red slabs for the surfaces, with a yellow line
perpendicular to the slab planes. After the membrane has been placed, you can adjust its
orientation by selecting Adjust membrane position, and rotating the membrane. The actual
membrane molecules are placed when the system builder job is run. The molecules are placed
by replication of a membrane segment and deletion of molecules whose center of mass lies
outside the periodic box boundaries. Molecules that are inside the solute or have significant
overlap with it are removed to accommodate the solute.
If you click Place Automatically after adjusting the membrane, the membrane is returned to its
default position and orientation.
The membrane position and orientation can be stored in Project Table entries, by selecting the
entries in the Project Table, and clicking Save Membrane Position to Selected Entries. This
enables the membrane position and orientation to be loaded at a later time by selecting the
entry and clicking Load Membrane Position from Selected Entry.
It can be difficult to set up GPCR systems properly. The mold_gpcr_membrane.py script can
be used to swap your GPCR protein into a system that has already been constructed for a
related protein. Templates for a number of GPCR systems are available. See Section 9.4 on
page 100 for details.
When you have selected the property, click Select to select the atoms for which these charges
are to be used. There is no default. The selection is made in the Atom Selection dialog box,
which is described in detail in Section 6.5 of the Maestro User Manual. If the property you
chose has values only for some of the atoms (e.g. the ligand), you can select these atoms by
specifying the entire range of values. Atoms that do not have a value for the property will not
be selected.
If you select Neutralize, the minimum number of sodium or chloride ions required to balance
the system charge is placed randomly in the solvent.
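The count used by Neutralize follows directly from the net charge, because sodium and chloride ions each carry a single unit of charge. A minimal sketch, assuming an integer net charge and using a hypothetical function name:

```python
def neutralizing_ions(net_charge):
    """Return (ion, count) for the fewest Na+ or Cl- ions that balance net_charge."""
    if net_charge > 0:
        # A positively charged system is balanced by chloride ions.
        return ("Cl-", net_charge)
    if net_charge < 0:
        # A negatively charged system is balanced by sodium ions.
        return ("Na+", -net_charge)
    return (None, 0)


print(neutralizing_ions(-4))
```

The random placement of the ions in the solvent is not modeled here.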
If you select Add, you can choose the ion type from the option menu and enter the number of
ions to add (which need not neutralize the system). The option menu only displays ions that are
opposite in charge to that of the system. Ions are not placed in the excluded regions.
Instead of placing ions automatically, at random, you can locate and select suitable regions for
ions to be placed. Usually these regions are near residues that have the same charge as the
system charge and are not near the active site. You can define these regions in the Advanced Ion
Placement dialog box, which you open by clicking Advanced Ion Placement in the Ions tab.
To place the ions, you must identify suitable candidate residues. When you click Candidates,
the Candidates table is populated with a list of residues in regions that have not been excluded
and have the same charge as the overall charge of the system. These residues are colored red
and rendered in CPK. Ions are placed near the residues that you select in the table, replacing
the closest solvent molecule to the average position of the atoms in the residue. The number of
ions placed (initially 0), along with the number of ions remaining to be placed and the total
number of candidate residues are displayed above the table.
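The placement rule described above (replace the solvent molecule closest to the average position of the atoms in the residue) can be sketched as follows. The sketch assumes that each solvent molecule is represented by a single position, for example that of its central atom; this representation and the function names are assumptions made for the example.

```python
import math


def average_position(atom_xyz):
    """Unweighted average of a list of (x, y, z) atom positions."""
    count = len(atom_xyz)
    return tuple(sum(xyz[i] for xyz in atom_xyz) / count for i in range(3))


def solvent_to_replace(residue_xyz, solvent_positions):
    """Index of the solvent molecule closest to the residue's average position."""
    center = average_position(residue_xyz)
    return min(
        range(len(solvent_positions)),
        key=lambda i: math.dist(center, solvent_positions[i]),
    )


# Hypothetical coordinates: a two-atom residue and three solvent molecules.
residue = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
solvent = [(5.0, 0.0, 0.0), (1.0, 1.0, 0.0), (-3.0, 0.0, 0.0)]
print(solvent_to_replace(residue, solvent))
```

The ion would then take the position of the selected solvent molecule, which is removed from the system.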
You can add candidates to the table by clicking Select, and selecting the residues in the Atom
Selection dialog box. When you click OK in the dialog box, the residues are added to the table,
and can be selected along with the automatically located residues. To clear the table, click
Reset.
When the system builder job is run, ions that are placed using the Advanced Ion Placement
dialog box are placed first. Once these ions are placed, random placement is performed to
place any remaining ions that are needed to neutralize the system or complete the number of
ions selected for placement in the Add text box.
When the salt ions are placed, they are randomly distributed in the solvent, and replace solvent
molecules. They are not placed in the excluded region defined in the Exclude region section.
To set up and run the job, click Start. The Start dialog box opens, allowing you to name the job,
choose the host and set the user name (if necessary). System Builder jobs do not usually take
more than a few minutes, so you can run the job locally on Linux systems (but not on
Windows). You can also choose whether to incorporate the output CMS file back into the
Maestro project, by choosing Append new entries from the Incorporate option menu. This is
useful if you want to continue on to set up a simulation in Maestro. If you choose Do not
incorporate, the CMS file is placed in the current working directory, but is not added to the project.
If you want to run the job from the command line, click Write. The Write dialog box opens, in
which you can specify a name and then write the file. The name is used to construct the file
names, by adding the appropriate extension.
If you want to rebuild a model system from the command line, you should run one job from
Maestro, and then edit the multisim input file (.msj) for other model systems.
To add solvent:
1. Select Predefined for the Solvent model option, and choose a model from the option
menu.
2. Choose a box shape.
3. Choose a box size calculation method—Buffer for adding a buffer region to the solutes, or
Absolute size for specifying the actual box size.
To add ions:
1. In the Ions tab, choose an option for the addition of ions.
2. If you selected Add, enter the number of ions to add in the text box.
3. Choose an ion type from the option menu.
4. If the solvent is intended to be a salt solution, select Add salt.
5. Enter the desired salt concentration in the Salt concentration text box.
6. Choose positive and negative ion types from the Salt positive ion and Salt negative ion
option menus.
To add a membrane:
1. Click Add Membrane in the Solvation tab.
2. In the Membrane tab, select Predefined for the membrane model, and choose a membrane
type from the option menu.
3. Click Place Automatically.
4. Select Adjust membrane position and adjust the orientation of the membrane in the
Workspace.
5. Click OK.
Click Start to run the job or click Write to write the input file.
Chapter 3
The general Desmond panels enable you to set up and run the main tasks available with
Desmond: molecular dynamics, minimization, simulated annealing, and replica exchange jobs.
The panels are designed to make setting up these types of jobs as easy as possible, and provide
the most common simulation controls. The default values provided in the panels represent a
good balance between accuracy and performance, and are adequate for most jobs without
change. For more control over the simulation parameters, you can make settings in the
Advanced Options dialog box, which is described in Section 3.9 on page 31.
A much more automated approach is provided for FEP simulations of binding and solvation
free energies in four specialized panels, Ligand Functional Group Mutation by FEP, Ring Atom
Mutation by FEP, Protein Residue Mutation by FEP, and Total Free Energy by FEP, for which a
model system and the additional parameters are set up automatically. These panels, and the
FEP panel for restarting and customizing these jobs, are described in Chapter 4.
In addition to setting up simulations, you can use the general panels to restart a simulation
from a checkpoint file as generated by a previously interrupted simulation.
All jobs run from these panels require a model system to be built first, in the System Builder
panel—see Chapter 2 for details.
Desmond simulations can also be run from the command line—see Chapter 6.
At the bottom of the panel are the action buttons for the job:
• Start—Start the job. Opens the Start dialog box to set job parameters and submit the job
for execution. See Section 3.10 on page 39 for details. A general description of this
dialog box and its features is given in Section 2.2 of the Job Control Guide.
• Read—Read a configuration (.cfg) file, to set up the simulation. Opens a file selector in
which you can navigate to the desired file.
• Write—Write the input files for the job but do not start it. Opens a dialog box in which
you can provide the job name, which is used to name the files. The job can be run from
the command line, as described in Chapter 6.
• Reset—Reset the values in the panel to their defaults.
To run a job:
1. Choose the task from the Desmond submenu of the Applications menu.
2. Specify the model system, either by loading it from the Maestro Workspace or importing
it from a file.
3. Adjust the simulation parameters as necessary.
For parameters that are not available in the main panel, open the Advanced Options dialog
box.
4. Click Start.
5. Set the job parameters in the Start dialog box, and click Start to run the job.
There are two options on the option menu, and the tools in this section depend on which option
you choose.
• Load from Workspace—Load the model system from the Workspace. The Workspace
must contain a model system that has already been prepared with the System Builder
panel. When you choose this item, the Load button is displayed, which you click to load
the model system from the Workspace.
• Import from file—Import the model system from a file. You can choose to import a model
system file (.cms) or a checkpoint file (.cpt).
If you import a model system file, it must contain a model system that has already been
prepared with the System Builder panel. When the file is imported, a message about the
system is displayed below the option menu.
If you import a checkpoint file, most of the panel controls are unavailable. The purpose of
the checkpoint file is to restart an interrupted simulation, so the parameters of the
simulation cannot be altered. You can change the total simulation time, and then start the job.
3.3 Minimizations
Minimization jobs relax the system into a local energy minimum. The model system is
minimized using a hybrid method of the steepest descent and the limited-memory Broyden-Fletcher-
Goldfarb-Shanno (LBFGS) algorithms. This task is set up in the Minimization panel, which you
open by choosing Applications → Desmond → Minimization in the main window.
There are only two generally useful parameters for this task:
• Maximum iterations—Enter the maximum number of iterations in this text box, or use the
arrow buttons to change the maximum number of iterations in steps of 10.
This task is set up in the Molecular Dynamics panel, which you open by choosing Applications
→ Desmond → Molecular Dynamics in the main window.
The controls at the top of the Simulation section allow you to specify the simulation time in ns
and the recording interval in ps for the energy and for the trajectory.
For the recording intervals you can enter a value in ps in the text boxes. The values are rounded
to an integer multiple of the far time step size. This time step size is set in the Integration tab of
the Advanced Options dialog box, in the RESPA integrator section.
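The rounding of recording intervals can be illustrated with a short sketch. It assumes rounding to the nearest multiple and a minimum of one far time step; the text above does not state the rounding direction, so this is an assumption, as are the function name and the numerical values used.

```python
def round_to_far_timestep(interval_ps, far_step_ps):
    """Round a recording interval to an integer multiple of the far time step.

    Assumption: round to the nearest multiple, with at least one step.
    """
    multiples = max(1, round(interval_ps / far_step_ps))
    return multiples * far_step_ps


# Hypothetical values: a requested interval of 1.0 ps with a far time step of 0.3 ps.
print(round_to_far_timestep(1.0, 0.3))
```

A requested interval that is already a multiple of the far time step is returned unchanged, apart from floating-point rounding.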
The controls in the lower part of the Simulation section allow you to choose the ensemble class,
from NVE, NVT, NPT, NPAT, and NPγT, to set the temperature (except for NVE) and the
pressure (except for NVE and NVT), and set the surface tension (NPγT only). It also allows you to
relax the model system before performing the simulation, and choose the protocol for the
relaxation.
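The dependence of the available settings on the ensemble class can be summarized in a small lookup. This is a sketch of the rules stated above, not of the panel code; the function name and the setting labels are chosen for the example.

```python
ENSEMBLES = ("NVE", "NVT", "NPT", "NPAT", "NPγT")


def ensemble_settings(ensemble):
    """List the settings that apply to the given ensemble class."""
    if ensemble not in ENSEMBLES:
        raise ValueError(f"Unknown ensemble class: {ensemble}")
    settings = []
    if ensemble != "NVE":
        settings.append("temperature")
    if ensemble not in ("NVE", "NVT"):
        settings.append("pressure")
    if ensemble == "NPγT":
        settings.append("surface tension")
    return settings


for name in ENSEMBLES:
    print(name, ensemble_settings(name))
```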
When Relax model system before simulation is selected, a series of minimizations and short
molecular dynamics simulations are performed to relax the model system before performing
the simulation you set up. This option is selected by default, and a default protocol is used.
Usually, if the model system was just created from the System Builder panel, it needs to be
relaxed; if the model system has been relaxed before, it does not need to be relaxed again. As
an alternative you can run a minimization before the molecular dynamics calculation.
The stages in the default relaxation process for the NPT ensemble are:
6. Simulate in the NPT ensemble using a Berendsen thermostat and a Berendsen barostat
with:
• a simulation time of 24 ps
• a temperature of 300 K and a pressure of 1 atm
• a fast temperature relaxation constant
• a normal pressure relaxation constant
This protocol is used for the NPAT and NPγT ensembles as well. A similar protocol is used for
the NVT ensemble.
If you want to modify the protocol, you can copy these files and edit them. To make use of the
modified protocol, click Browse and navigate to the new protocol file, which has a .msj
extension. The file name is then listed in the Relaxation protocol text box.
When the simulation finishes, the output structure file (.cms) is written to disk and
incorporated into the Maestro project. In addition, a new trajectory directory is created, called
jobname_trj by default. Checkpoint files are written during the simulation, but are not written
during the relaxation process.
One of the predominant strategies used is to raise the temperature to a high value one or more
times before relaxing the system to the desired temperature. The goal is to permit the system to
relax out of an initial state that corresponds to a high energy potential minimum into a lower
state by crossing barriers in the free-energy landscape, which is achieved more effectively
during the periods of elevated temperatures. The default temperature program in the Simulated
Annealing panel falls into this category.
Another common use for simulated annealing is to perform an effective minimization with
some relaxation of the system by slowly decreasing the temperature down to very low
temperatures. This slow cooling should permit at least some shifts from higher energy minima to
lower minima in the energy landscape.
Simulated annealing calculations can be set up and run from the Simulated Annealing panel,
which you open by choosing Applications → Desmond → Simulated Annealing.
In the Simulation section, you can make settings for the simulated annealing job. The settings
for the simulation time, recording interval, ensemble class and model system relaxation are the
same as for a molecular dynamics simulation, and are described in Section 3.4 on page 22. The
main specific task for simulated annealing is to provide information on the stages by providing
a schedule of reference temperature changes.
The number of stages is set in the Number of stages text box. When a value has been entered,
the table below is adjusted to display text boxes for each stage. The stages are indexed from 0.
For each stage you can specify a starting time in the Time text box, and a starting temperature
in the Temperature text box. The temperature is linearly interpolated between adjacent time
points. The last stage runs until the specified total simulation time.
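The interpolation rule can be illustrated with a short sketch. The function name and the sample schedule are hypothetical, and holding the last temperature constant after the final time point is an assumption of this sketch, not documented behavior.

```python
from bisect import bisect_right


def reference_temperature(t, times, temps):
    """Linearly interpolate the reference temperature (K) at time t.

    times: stage starting times in ascending order.
    temps: starting temperature of each stage, same length as times.
    Assumption: the temperature is held at temps[-1] after times[-1].
    """
    if not times or len(times) != len(temps):
        raise ValueError("need one temperature per stage time")
    if t <= times[0]:
        return temps[0]
    if t >= times[-1]:
        return temps[-1]
    i = bisect_right(times, t) - 1  # index of the current stage, counted from 0
    frac = (t - times[i]) / (times[i + 1] - times[i])
    return temps[i] + frac * (temps[i + 1] - temps[i])


# Hypothetical four-stage schedule (time in ps, temperature in K).
stage_times = [0.0, 30.0, 100.0, 200.0]
stage_temps = [10.0, 100.0, 400.0, 300.0]
print(reference_temperature(65.0, stage_times, stage_temps))
```

Halfway between the stage starting at 30 ps (100 K) and the stage starting at 100 ps (400 K), the interpolated reference temperature is midway between the two stage temperatures.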
Desmond supports replica exchange simulations in which multiple copies of the system are
simulated at different temperatures, which usually range from the temperature of interest up to
700 K or more. Periodically during the simulation, attempts are made to exchange the coordinates
of copies that are at different temperatures. Each exchange is handled in a Monte Carlo-like
step: the systems to exchange are selected, and a Metropolis-like criterion is then used
to decide whether to accept the exchange [41]. The exchange acceptance ratio satisfies the
detailed balance or balance condition so that each replica remains in equilibrium after the
exchange. When many such exchanges are accepted over the course of an extended simulation,
multiple systems with very different histories can visit the temperature of interest. While
systems spend time at higher temperatures they explore conformational space significantly
more rapidly than if they remained at the target temperature. Thus the composite trajectory at
the temperature of interest may contain a more diverse collection of conformations than if
multiple simulations were performed at the target temperature.
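As an illustration of a Metropolis-like exchange test, the sketch below uses the standard textbook acceptance probability for temperature replica exchange at constant volume, min(1, exp[(β_i − β_j)(E_i − E_j)]). It is not taken from the Desmond source, and it omits the pressure-volume term that an NPT exchange would need. The function names and the energies are hypothetical.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exchange_probability(e_i, e_j, t_i, t_j):
    """Acceptance probability for swapping the configurations of replicas i and j.

    e_i, e_j: potential energies (kcal/mol) of the two configurations.
    t_i, t_j: temperatures (K) at which the two replicas are running.
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(e_i, e_j, t_i, t_j, rng=random):
    """Return True if the exchange is accepted for a uniform random draw."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)


# The colder replica holds the higher-energy configuration: always accepted.
print(exchange_probability(-100.0, -105.0, 300.0, 310.0))
# The colder replica holds the lower-energy configuration: accepted only sometimes.
print(exchange_probability(-105.0, -100.0, 300.0, 310.0))
```

Because the probability depends on the product of the inverse-temperature difference and the energy difference, closely spaced temperatures give higher acceptance rates than widely spaced ones.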
As with a regular molecular dynamics simulation, each replica may be run on multiple processors.
Since the simulation of each replica proceeds independently between exchange attempts,
the additional level of parallelization achieved by running multiple replicas is highly efficient.
Replica exchange simulations can be set up and run from the Replica Exchange panel, which
you open by choosing Applications → Desmond → Replica Exchange in the main window.
In the Simulation section, you can make settings for the replica exchange job. The settings for
the simulation time, recording interval, ensemble class and model system relaxation are the
same as for a molecular dynamics simulation, and are described in Section 3.4 on page 22. The
default ensemble for replica exchange is NPT. If your highest temperature is above 373 K you
might want to change the ensemble to NVT. Exchanges are done between nearest neighbors.
The main specific task is to set the temperature range and temperature profile.
The temperature range is set in the Temperature range text boxes. The defaults are 300 K for
the low temperature and 310 K for the high temperature. You should adjust the low and high
temperature values to suitable values. There are four options for the temperature profile:
Information on the temperatures is displayed in the replica table. You can edit the temperatures
by selecting manual for the temperature profile. Some guidance on selecting temperatures is
available in Ref. 43. Setting up the temperatures and the number of replicas for a meaningful
simulation can be difficult. For assistance with this task, contact help@schrodinger.com.
The replica exchange simulation produces one trajectory for each replica, labeled
jobname_replicanum_trj, where num is the index of the replica, starting from 0, and
corresponds to the replica index in the replica table. You can display a temperature versus time plot
of the replicas and the exchanges that were made—see Section 5.4 on page 63.
The parameters that control the accuracy of the simulation are the height and width of the
Gaussian potential and the interval at which the Gaussians are added. The accuracy of the
results is inversely proportional to the height for a given interval for addition of the Gaussian.
However, if smaller heights are used, the simulation takes longer. The accuracy of the results
increases as the time interval increases, but the simulation time also increases. The accuracy is
not very sensitive to the ratio of the height to the interval, but smaller values of this ratio
increase the accuracy. The width of the Gaussian should be roughly 1/4 to 1/3 of the average
fluctuations of the collective variable during a free MD run.
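The role of the height and width can be sketched for a single collective variable. The Gaussian form below, with the width used as the standard deviation, is the common textbook form and is an assumption about the exact functional form. Taking the sample standard deviation as the "average fluctuation" is likewise an assumption. All names and numbers are hypothetical.

```python
import math
import statistics


def bias_potential(s, centers, height, width):
    """Accumulated bias at CV value s from Gaussians deposited at `centers`."""
    return sum(
        height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
        for c in centers
    )


def suggested_width_range(cv_samples):
    """Return (1/4, 1/3) of the CV fluctuation observed in a free MD run.

    Assumption: the fluctuation is measured as the sample standard deviation.
    """
    sigma = statistics.stdev(cv_samples)
    return sigma / 4.0, sigma / 3.0


# Hypothetical CV values (in Å) recorded during an unbiased run.
free_md_distances = [3.9, 4.1, 4.0, 4.3, 3.7, 4.0]
low, high = suggested_width_range(free_md_distances)
print(f"suggested width between {low:.3f} and {high:.3f}")

# Bias felt at 4.0 Å after three Gaussians have been deposited nearby.
print(bias_potential(4.0, [3.95, 4.0, 4.05], height=0.03, width=0.05))
```

Each deposited Gaussian raises the bias near the visited value of the variable, which is why larger heights or shorter intervals fill the free-energy wells faster but less precisely.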
In the Simulation section, you can make settings for the metadynamics job. The settings for the
simulation time, recording interval, ensemble class and model system relaxation are the same
as for a molecular dynamics simulation, and are described in Section 3.4 on page 22. The
default ensemble for metadynamics is NPT. The main specific task for metadynamics is
defining the collective variables and the Gaussian.
The height of the Gaussian and the interval at which it is added are set in the Height and
Interval text boxes, under Metadynamics parameters.
To set up a collective variable, first select an option for the type of variable: Distance, Angle, or
Dihedral. (More flexible definitions can only be made from the command line.) Select Pick
atoms to pick the atoms in the Workspace. The atom numbers are listed in the AtomN text
boxes. You can set the width of the Gaussian in the Width text box. The default values are 0.05
Å for distances, 0.03 rad (1.8°) for angles, and 0.05 rad (3°) for dihedrals.
For distance variables, you can place a barrier (“wall”) at a given distance in the Wall at text
box. Introducing a wall prevents the system from moving too far in the direction defined by the
collective variable.
Once you have set up the variable, you can add it by clicking Add. The variable is added to the
table, and is used in the simulation. To remove a variable from the simulation, select it in the
table and click Remove. Setting up more than two collective variables can create sampling
problems.
More flexibility in the collective variables is available from the command line, via multisim.
You can use ASL expressions to define a group of atoms whose center of mass is used as the
variable, for example.
When a solute is placed in a pure membrane, some lipids are usually removed to make room
for the new molecule during the system building process. As a result, adjustment of the surface
area of the solute + membrane system is often needed. This can usually be done using a fairly
short semi-isotropic simulation of up to about 0.5 ns. When simulating beyond that time range
it is recommended to switch to either a constant surface area, constant normal pressure simula-
tion (NPAT) or a constant surface tension simulation (NPγT). If the latter is selected we
suggest using a surface tension of 2000 bar·Å for DPPC and 4000 bar·Å for POPE and POPC.
We recommend examining the results of all membrane simulations carefully.
This process may take hours to days since it equilibrates the system in stages for about 1.2 ns.
The file protein-membrane-out.cms should be reasonably well equilibrated and can be used
as input for the production simulation for your study.
To open the Advanced Options dialog box, click Advanced Options in the Desmond panel that
you have open. This panel has several tabs, which are described in the following subsections.
• Integration tab
• Ensemble tab
• Minimization tab
• Interaction tab
• Restraints tab
• Output tab
• Misc tab
The selection of tabs that is displayed depends on the task. For minimizations, only the
Minimization, Interaction, Restraints, and Misc tabs are displayed. For MD simulations all but the
Minimization tab are displayed.
The settings in this dialog box and the settings in the Desmond panels are not entirely indepen-
dent, and can affect each other. For example, changing the far time step can affect the values of
the recording periods in the panel, because the latter are automatically rounded by the far time
step. As another example, changing the temperature or pressure in the Desmond panel updates
the temperature or pressure parameters in the dialog box. Changes in the Desmond panel take
effect immediately and update parameters in the dialog box, whereas changes made in the
dialog box only take effect when you click OK or Apply.
If you want to clear changes that have not been committed with the Apply button, click Reset.
Any changes made since the last set of changes were committed are discarded, and the values
in the dialog box are reset to the last set committed.
Figure 3.6. The Integration tab of the Advanced Options dialog box.
Specify the time steps in fs for bonded, near, and far, by entering values in the text boxes, or
using the arrow buttons to change the value in increments of the bonded time step. Because the
bonded, near, and far time steps must maintain a certain ratio, when a new value is set for the
bonded time step, the other two time steps are automatically updated according to the current
ratio. Changing the near or far time steps adjusts this ratio.
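The ratio-preserving update can be pictured with a small sketch. The function name and the time step values are hypothetical, and the panel may apply additional rounding that is not modeled here.

```python
def rescale_time_steps(bonded, near, far, new_bonded):
    """Return (bonded, near, far) in fs after the bonded step is changed.

    The near and far steps keep their current ratio to the bonded step.
    """
    if bonded <= 0 or new_bonded <= 0:
        raise ValueError("time steps must be positive")
    near_ratio = near / bonded
    far_ratio = far / bonded
    return new_bonded, new_bonded * near_ratio, new_bonded * far_ratio


# Hypothetical current settings of 2.0, 2.0, and 6.0 fs (ratio 1:1:3).
print(rescale_time_steps(2.0, 2.0, 6.0, new_bonded=1.0))
```

Editing the near or far step directly corresponds to calling the function with a different `near` or `far` argument, which changes the ratio used for later updates.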
Selecting Set time step automatically based on constraint setting couples the RESPA time step
settings with those in the Constraint section. The time steps will be automatically set based on
the settings in the Constraint section, and are not available for editing.
Figure 3.7. The Ensemble tab of the Advanced Options dialog box.
The Thermostat method option menu offers four choices: Nose-Hoover chain, Berendsen,
Langevin, and None.
Although in most circumstances, only one thermostat group is needed, you can specify
multiple thermostat groups by entering the number of groups in the Number of groups text box
and supplying information on these groups in the Thermostat group settings table. The
maximum number of groups is 8. The selection of atoms that is in each group can be set up in
the Misc tab, by defining multiple groups named thermostat, with the group numbers corre-
sponding to the entries in the Values column. Any atoms not explicitly added to a group are
automatically assigned to group 0, the default group. This means that you do not need to define
a group if you only want to use one thermostat, and that you only need to define groups for the
extra thermostats, starting from thermostat 1.
The Thermostat group settings table provides text boxes for making settings for each thermo-
stat group. The settings that can be made are:
• Temperature (K)
• Relaxation time (ps)
The Barostat method option menu also offers four choices: Martyna-Tobias-Klein, Berendsen,
Langevin, and None. For each of these methods you can set the relaxation time (ps) in the
Relaxation time text box and choose a coupling style from the Coupling style option menu. The
coupling style choices are Isotropic, Semi-isotropic, Anisotropic, and Constant area. The pres-
sure used is atmospheric pressure.
• LBFGS vectors—Specify the number of history vectors used by the LBFGS minimizer
for the update of the Hessian. The maximum is 6.
• Minimum SD steps—Specify the minimum number of steepest descent steps to be used
before switching to the LBFGS minimizer.
In the Output section you can specify the name of the structure output file. You can use
$JOBNAME as a variable representing the job name that you will set when you start the job or
write out the input files.
Figure 3.8. The Minimization tab of the Advanced Options dialog box.
Figure 3.9. The Interaction tab of the Advanced Options dialog box.
To define the short-range region, choose a method from the Short range method option menu.
The controls below this menu depend on the method chosen, which can be one of the
following:
• Cutoff—Enter a value in the Cutoff radius text box. The default is 9.0 Å.
• Potential tapering—Specify the range in angstroms over which the potential is tapered off
in the two Tapering range text boxes.
There are two choices for handling the long-range Coulombic interactions:
• Smooth particle mesh Ewald—use the smooth particle mesh Ewald method. This method
requires a tolerance to be set in the Ewald tolerance text box. This tolerance affects the
accuracy of the long-range Coulombic interactions. The smaller the tolerance is, the more
accurate the computation of the long-range Coulombic interactions is, but the simulation
will be correspondingly slower.
• None—use the unmodified Coulomb interaction.
Figure 3.10. The Restraints tab of the Advanced Options dialog box.
To manage the restraints, you can use the buttons beside the table:
• Select—Opens the Atom Selection dialog box to specify the atoms for the selected
restraint. Only available if a single row is selected in the table.
• Add—Adds a row to the restraints table so that a new restraint can be defined.
• Delete—Deletes the selected restraints.
• Reset—Resets the table to its default state.
Figure 3.11. The Output tab of the Advanced Options dialog box.
• Energy sequence file—This file contains a sequence of various energies of the system.
• Trajectory directory—This directory is used by Desmond to periodically write out files
that record coordinates and velocities (optional) of all particles in the system at a particu-
lar point in the simulation. You can provide a title for the trajectory, and you can select
Record velocities if you want the velocities to be recorded along with the coordinates.
To ensure that associated solutes appear together in the trajectory rather than on opposite
sides of the simulation box, you can select Glue close solute molecules together. This
option only affects the way the trajectory is displayed.
• Checkpoint file—This file contains information that can be used to restart an interrupted
simulation. Checkpoint files permit bitwise continuation of simulations, so they can be
large, and should be saved infrequently if at all.
Figure 3.12. The Misc tab of the Advanced Options dialog box.
You can set the starting time and the interval at which velocities are randomized in the
Randomize velocities section. By default velocities are randomized at the beginning of a
calculation, and not randomized again. For some kinds of simulations (e.g. in the gas phase),
periodic randomization of velocities can improve the sampling.
In the Atom group section you can specify atom groups. Atom groups are used for various
special treatments of atoms in the simulation.
You can define multiple groups with the same name but a different index by setting the index in
the Value column. If you use multiple thermostats, for example, you can define the atoms in
each thermostat group by naming the group thermostat and setting the index to the thermo-
stat group number in the Ensemble tab. Group 0 is the default group, to which all unassigned
atoms automatically belong.
The atom groups are listed in the table. The Atoms column is filled in automatically when you
click Select and use the Atom Selection dialog box to specify the atoms. Otherwise you can
edit this column to specify the ASL expression for the atoms in the group. The number of
atoms defined by the ASL expression is shown in the No. of Atoms column.
• Select—Opens the Atom Selection dialog box to specify the atoms for the selected group.
Only available if a single row is selected in the table.
• Add—Adds an atom group. Clicking this button adds a row to the table.
• Delete—Deletes the selected groups. This operation can only be done on user-defined
groups.
• Reset—Resets the table to its default state.
When you click Start, the Start dialog box opens. The common features of this dialog box,
such as the Output section, the Job name and CPUs text boxes, and the Host option menu, are
described in Section 2.2 of the Job Control Guide. These features allow you to direct the job to
the appropriate host with the desired number of processors, and decide how to incorporate the
output into the Maestro project.
For Desmond jobs, the choice of the number of processors has some special requirements.
Desmond uses a Cartesian domain decomposition of the simulation for efficient processing.
This decomposition is described in the Desmond User’s Guide, provided by D. E. Shaw
Research. The details of the decomposition can be displayed in the Start dialog box by clicking
the Details button. You can control the decomposition by entering values in the x, y, and z text
boxes. Each value you enter must be a power of 2, 3, or 5, or a product of such powers. The
decomposition is done automatically if you enter a number in the CPUs text box. However, if
you enter a number that is not a suitable product of powers, the actual number of processors
used is the largest product of powers that is smaller than the number you entered. The number
of processors actually used is displayed in the details section. To hide the details, click the
Details button again.
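To predict how many processors a request will actually use, the rule above can be sketched as follows. The function names are hypothetical, and the sketch only models the total count, not how Desmond splits it among the x, y, and z directions.

```python
def is_suitable(n):
    """True if n is a product of powers of 2, 3, and 5 only."""
    if n < 1:
        return False
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1


def processors_used(requested):
    """Largest suitable processor count that does not exceed the request."""
    if requested < 1:
        raise ValueError("at least one processor must be requested")
    n = requested
    while not is_suitable(n):
        n -= 1
    return n


for cpus in (7, 14, 16, 22):
    print(cpus, "->", processors_used(cpus))
```

A request for 7 processors, for example, uses 6, so requesting a suitable number in the first place avoids leaving allocated processors idle.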
Before you can run Desmond jobs in parallel, you must configure the hosts file and any queues
that you want to use. Details are given in Chapter 7 of the Installation Guide.
When the job is submitted to a queueing system, the processors requested are all allocated to
the master job, and the subjobs are run directly on these processors, without being submitted to
the queueing system. (This is called “umbrella” mode—see Section A.1.2 on page 106.)
Chapter 4
Maestro provides panels for quickly setting up FEP simulations for certain common scenarios:
In these three scenarios, the relative binding free energy of the ligand can be calculated. For the
first two, the solvation free energy of the ligand relative to the original ligand can be calculated.
Total (absolute) solvation free energies can also be calculated for a ligand or a molecule, which
is done by annihilating the chosen molecule in the FEP simulation.
Free energy perturbation calculations are resource-intensive. While it is possible with the FEP
panels to set up multiple mutations in a single calculation, we recommend that only one muta-
tion be performed per calculation due to the resource requirements.
The panels for these calculation types have a common design. Each panel has a Define Pertur-
bation tab, in which you set up the systems to be simulated, and a Plan Calculation tab, in
which you choose the environment (complex, pure solvent, vacuum) for the ligand and choose
the FEP protocol, which includes the ensemble. Information on setting up the system is
contained in the next few sections, followed by a section describing the Plan Calculation tab.
• Start—Opens the Start dialog box, in which you can set the job parameters such as job
name, host, number of CPUs, and so on, and then start the job. The general layout of this
dialog box is described in Section 2.2 of the Job Control Guide. The Desmond-specific
features of the Start dialog box and the issues in choosing the number of CPUs are
described in Section 3.10 on page 39.
• Write—Writes the input files for the job but does not start the job. Opens a dialog box in
which you can specify a job name, which is used to name the files. This capability is use-
ful if you want to change the FEP protocol.
• Reset—Resets the entire panel to its initial state. This includes clearing the Workspace. If
you want to use the same structures, you will have to redisplay them before proceeding.
The FEP calculations for ligand mutation are set up in the Ligand Functional Group Mutation by
FEP panel, which you open by choosing one of the following:
Figure 4.1. The Ligand Functional Group Mutation by FEP panel, Define Perturbation tab.
3. In the Fragment library table, select the fragments that you want to use to replace the orig-
inal substitution group, by selecting their table rows.
Use shift-click and control-click to select multiple table rows. You can view the mutated
ligand by clicking in the View column.
Since each FEP mutation requires a large amount of CPU time and space, we recommend
performing only one mutation in a given calculation.
4. (Optional) Display the mutated ligand and make adjustments to the fragment.
The adjusted fragment is used as the starting point in the FEP simulations. It is possible
that the fragment may have clashes with the protein in its default orientation or be in an
inappropriate conformation. The barriers between local minima in the potential energy surface
can be large, and thus take a long time to cross during a simulation. Adjusting the
conformation can reduce clashes and help the simulation to sample the appropriate
conformations.
Once the fragment is displayed, you can reorient the fragment with the adjustment tools,
which are available from the Adjust toolbar button:
or with the local transformation tools, which are available from the Transform toolbar
button:
For more information on these tools, see Section 5.9 of the Maestro User Manual.
You can also add to the fragment using the Build toolbar or the Build panel—for example,
replacing a hydrogen with a methyl or a hydroxyl group. Since the quality of the results
goes down as you increase the size of the fragment, it is not advisable to add too much to
the fragment.
You can revert to the original fragment by clicking the button in the Reset column. You
should not make modifications to the ligand core or other molecules in the system.
5. In the Plan Calculation tab, select the calculation types:
• For binding free energy calculations, select In complex and In pure solvent.
• For solvation free energy calculations, select In pure solvent and In vacuum.
6. Choose the FEP protocol you want to use for each calculation type from the FEP protocol
option menu.
You should choose the same ensemble type for both the complex and pure solvent
calculations.
7. Click Start, set the job parameters in the Start dialog box, and click Start in the dialog
box to run the job.
Figure 4.2. The Ring Atom Mutation by FEP panel, Define Perturbation tab.
Use shift-click and control-click to select multiple table rows. You can view the mutated
ligand by clicking in the View column. If you want to add a functional group as well as
mutate the ring atom, you can use the build tools to add to the mutated structure.
4. In the Plan Calculation tab, select the calculation types:
• For binding free energy calculations, select In complex and In pure solvent.
• For solvation free energy calculations, select In pure solvent and In vacuum.
5. Choose the FEP protocol you want to use for each calculation type from the FEP protocol
option menu.
6. Click Start, set the job parameters in the Start dialog box, and click Start in the dialog
box to run the job.
The FEP calculations for protein residue mutation are set up in the Protein Residue Mutation by
FEP panel, which you open by choosing one of the following:
Figure 4.3. The Protein Residue Mutation by FEP panel, Define Perturbation tab.
Once the residue is displayed, you can reorient the side chain with the adjustment tools,
which are available from the Adjust toolbar button:
or with the local transformation tools, which are available from the Transform toolbar
button:
For more information on these tools, see Section 5.9 of the Maestro User Manual.
You can also modify the residue using the Build toolbar or the Build panel—for example
to replace sulfur with selenium, or to phosphorylate the residue. Since the quality of the
results goes down as you increase the size of the perturbation, it is not advisable to add
too much to the residue.
You can revert to the original residue orientation and structure by clicking the button in
the Reset column. You should not make modifications to the ligand core or other mole-
cules in the system.
10. In the Plan Calculation tab, choose the FEP protocol you want to use for each calculation
type from the FEP protocol option menu.
You should choose the same ensemble type for both the complex and pure solvent
calculations.
11. Click Start, set the job parameters in the Start dialog box, and click Start in the dialog
box to run the job.
For calculations that are not defined by a simple protein, ligand, and water model (such as
membrane proteins or calculations in salt), you can use existing model systems for input. To do
this you should prepare two model systems, one with and one without the ligand. You should
run the In complex calculation only, for both systems, to ensure that the model system is used
as is for both perturbations. The relative binding free energy can be calculated from the differ-
ence between the two free-energy differences.
Figure 4.4. The Total Free Energy by FEP panel, Define Perturbation tab.
The FEP simulations can be performed in three environments: in the protein-ligand complex
(with solvent), in pure solvent, and in vacuum. This permits the calculation of binding free
energies, by selecting In complex and In pure solvent, and of solvation free energies, by
selecting In pure solvent and In vacuum. Selecting just one of the environments does not on its
own produce a meaningful result.
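The way two environments combine into one result follows the usual thermodynamic cycle, sketched below for a mutation. The sign convention (complex minus solvent, solvent minus vacuum) is the conventional one and is stated here as an assumption. The function names and the free energy values are hypothetical.

```python
def relative_binding_free_energy(dg_complex, dg_solvent):
    """ddG of binding from the mutation free energies in complex and in solvent."""
    return dg_complex - dg_solvent


def relative_solvation_free_energy(dg_solvent, dg_vacuum):
    """ddG of solvation from the mutation free energies in solvent and in vacuum."""
    return dg_solvent - dg_vacuum


# Hypothetical mutation free energies in kcal/mol.
print(relative_binding_free_energy(dg_complex=-3.0, dg_solvent=-1.5))
print(relative_solvation_free_energy(dg_solvent=-1.5, dg_vacuum=-0.5))
```

A single environment supplies only one leg of the cycle, which is why its value alone cannot be interpreted as a binding or solvation free energy.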
The FEP protocol defines in detail how the system is solvated, relaxed, and simulated. Four main
protocols are provided in the FEP protocol option menu for the complex and pure solvent: two
ensembles, Desmond NPT and Desmond NVT, with two options for the relaxation part (stan-
dard, or quick relaxation). The protocol for the simulation is the same for both ensembles (NVT
and NPT). For calculations in vacuum, there is the choice of the standard or quick relaxation.
You can set the temperature for all the simulations in the Temperature text box.
In addition you can read in a protocol by choosing User defined from the FEP protocol option
menu. A file name text box and Browse button are displayed. You can enter the name of the
protocol file (which has a .msj extension) in the text box, or navigate to it in the file selector
that opens when you click Browse.
The default protocol is as follows. The original system is solvated by adding SPC water mole-
cules with a buffer distance of 5 Å for complexes and 10 Å for pure solvent. The vacuum simu-
lation uses a buffer distance of 100 Å. The system goes through a relaxation process that
includes two minimizations followed by 4 short molecular dynamics simulations. The produc-
tion simulation is run for 0.6 ns for each lambda window, and 12 windows are used for each
perturbation. You can change the buffer size and the production simulation time for each envi-
ronment in the Buffer size and Production simulation time text boxes. If you are using an
existing model system, the buffer size is ignored for the complex, but is used for the pure
solvent.
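Because FEP calculations are resource-intensive, it can help to estimate the total production time before starting a job. The sketch below multiplies out the default values quoted above (12 windows of 0.6 ns, expressed as 600 ps). The function name is hypothetical, and the relaxation stages are not included in the estimate.

```python
def production_time_ps(windows=12, ps_per_window=600, environments=1):
    """Total production simulation time in ps for one perturbation."""
    if windows < 1 or ps_per_window <= 0 or environments < 1:
        raise ValueError("all arguments must be positive")
    return windows * ps_per_window * environments


# One environment with the default protocol.
print(production_time_ps() / 1000, "ns")
# A binding free energy calculation uses two environments (complex and pure solvent).
print(production_time_ps(environments=2) / 1000, "ns")
```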
After the production simulations, the results are collected and analyzed using the Bennett
method. The final result (free energy) for each perturbation is recorded as an entry-level prop-
erty in the final Maestro output file.
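For readers unfamiliar with the Bennett method, the sketch below solves the standard Bennett acceptance ratio equation for one pair of adjacent lambda windows by bisection. It illustrates the general technique under stated assumptions and is not Desmond's analysis code. Energies are in units of kT, and the function names and sample data are hypothetical.

```python
import math


def _fermi(x):
    """Numerically safe 1 / (1 + exp(x))."""
    if x > 700.0:
        return 0.0
    if x < -700.0:
        return 1.0
    return 1.0 / (1.0 + math.exp(x))


def bar_free_energy(w_forward, w_reverse, tol=1e-10):
    """Free energy difference (kT) between windows A and B.

    w_forward: U_B - U_A evaluated on configurations sampled in window A.
    w_reverse: U_A - U_B evaluated on configurations sampled in window B.
    """
    n_f, n_r = len(w_forward), len(w_reverse)
    if n_f == 0 or n_r == 0:
        raise ValueError("both windows need at least one sample")
    m = math.log(n_f / n_r)

    def g(df):
        fwd = sum(_fermi(m + w - df) for w in w_forward)
        rev = sum(_fermi(-m + w + df) for w in w_reverse)
        return fwd - rev  # increases monotonically with df

    # Expand the bracket until it contains the root, then bisect.
    lo, hi = -1.0, 1.0
    while g(lo) > 0.0:
        lo *= 2.0
    while g(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical energy differences (kT) from two neighboring windows.
forward = [1.8, 2.1, 2.4, 1.9, 2.2]
reverse = [-2.3, -1.7, -2.0, -2.1, -1.9]
print(bar_free_energy(forward, reverse))
```

The estimate for a whole perturbation is the sum of the values obtained for each pair of neighboring windows.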
You can use the script calculate_ddg.py to take the difference between the values in two
different environments to provide a ΔΔG value. To run this script, use the following command:
where jobname1-out.mae and jobname2-out.mae are the Maestro output structure files
from the two FEP calculations. The output file output-file.mae is optional; the default name is
ddG.mae. This file is in Maestro format and contains the ΔG value from both input files. The
final ΔΔG value is recorded for each mutant as an entry property s_des_ddG_jobname1-
jobname2, which is displayed in the Project Table as ddG jobname1-jobname2. The atom coor-
dinates in the output file are copied from the first input structure file.
When charged molecules are deleted or created in absolute free energy calculations, finite size
effects can be significant, particularly for the pure solvent FEP calculations. A script,
calculate_correction.py, has been provided to provide a correction to the free energy
once the FEP calculation is complete. To run this script, use the following command:
This program reviews the trajectory and calculates the correction. The final correction is
printed to standard output. During the process, the program writes out several files:
jobname_correct1.cfg, jobname_correct1.log, and jobname_correct1.dat.
Depending on your system, it may also write out the analogous files:
jobname_correct2.cfg, jobname_correct2.log, and jobname_correct2.dat. You
need not be concerned with the content of these files.
For instance, for ligand functional group mutations in the presence of a protein these files
would normally be called jobname_solvent_12.tgz and jobname_complex_12.tgz.
These files can be extracted using the command:
The settings for the simulation time, recording interval, ensemble class, temperature, pressure,
surface tension, and model system relaxation are the same as for a molecular dynamics simula-
tion, and are described in Section 3.4 on page 22.
There are two options for the type of FEP that can be performed, that correspond to the types
of FEP that are set up with the specialized FEP panels:
• Mutation—Mutate one set of atoms into another. The atom set may be a molecule or a
fragment. Select this option for ligand functional group mutation or ring atom mutation.
• Total free energy—Select this option for total free energy simulations, in which a set of
atoms is annihilated by the perturbation.
The number of lambda windows you want to use can be entered in the Number of windows text
box. Each window is listed in the table below, with values for each of the six lambda parame-
ters, Van der Waals, Charge and Bonding, for the two systems, labeled A and B.
You can set the lambda values automatically, which is the default, or you can deselect Set
lambda values automatically and edit the values for each window and each type of lambda. The
lambda values must be within the range [0, 1]. The background color of the cell is changed to
pink if the value is out of range, to warn that it must be set within the range.
The windows can be selected or deselected for the simulation using the buttons in the Window
index row. For a new FEP job, all windows should normally be selected for simulation. If a
simulation for a particular lambda window needs to be re-run, select only that window.
For mutation FEP, you can also use Hamiltonian replica exchange to enhance the sampling, by
selecting Use lambda hopping to enhance sampling.
Chapter 5
If you have entries in the Workspace, a panel opens asking if you want to keep them in the
Workspace while the trajectory is played, or to remove them. Keeping the entries in the
Workspace allows you to view the trajectory against a fixed background. You can superimpose the
trajectory atoms on the background, and you can set up measurements, such as H-bonds,
between the background atoms and the trajectory atoms, which are updated during play.
The toolbar in the Trajectory panel contains a standard set of controls for playing through the
trajectory frames, which are listed below. The menu bar has one menu, Play, which contains
items that correspond to the toolbar buttons.
Go to start
Display the first frame.
Previous
Display the previous frame.
Play backward
Display the frames in sequence, moving toward the first.
Stop
Stop playing through the frames.
Play forward
Display the frames in sequence, moving toward the last.
Next
Display the next frame.
Go to end
Display the last frame.
Loop
Choose an option for repeating the display of the frames. Single direction displays frames in
a single direction, then repeats. Oscillate reverses direction each time the beginning or end of
the frame set is reached.
You can control the selection of frames and the speed of play in the Frame control section of
the panel.
• The Start and End text boxes define the frames at which play starts and ends. Frames are
numbered from 0.
• The Frame slider and frame text box can be used to select the frame to view. The current
frame number is displayed in the text box below the slider. The total number of frames is
also displayed in a noneditable text box.
• The Step text box sets the number of frames to step when playing through frames. This
value does not affect the Frame slider. The frames that are selected for play can be
exported as a selection of frames, using the output buttons.
• The Time text boxes display the time for the current frame and the total time for the tra-
jectory. You can enter a time in the text box to select a frame.
• The Speed slider sets the speed at which the frames are played.
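The interplay of the Start, End, and Step text boxes with the Loop options can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Maestro's implementation: the function names selected_frames and play_order are hypothetical, and treating the End frame as inclusive is an assumption.

```python
def selected_frames(start, end, step):
    """Frames visited in one forward pass; frames are numbered from 0.

    Assumption: the End frame is inclusive.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    return list(range(start, end + 1, step))


def play_order(frames, mode, passes=2):
    """Order in which frames are shown over a number of passes.

    mode "single": every pass runs in the same direction, then repeats.
    mode "oscillate": the direction reverses each time an end is reached.
    """
    order = []
    for i in range(passes):
        if mode == "oscillate" and i % 2 == 1:
            # Backward pass; skip the frame just shown at the turning point.
            order.extend(frames[-2::-1])
        elif mode == "oscillate" and i > 0:
            # Later forward pass; again skip the turning-point frame.
            order.extend(frames[1:])
        else:
            order.extend(frames)
    return order


frames = selected_frames(0, 10, 5)
print(play_order(frames, "single"))
print(play_order(frames, "oscillate"))
```

With Start 0, End 10, and Step 5, only frames 0, 5, and 10 are visited, which is also the set that the output buttons would export as the frame selection.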
By default, the clipping planes window is automatically hidden when you play through a
trajectory, as this speeds up play by about 50%. It is displayed again (if it was originally
displayed) when play stops. If you want to see the clipping planes window during play, dese-
lect Hide clipping planes during continuous play.
In the Display section you can control how frames are displayed in the Workspace and what
features are displayed:
• Use lower quality drawing to speed up play—Use a lower quality representation of objects
(tubes, spheres) in the Workspace to speed up play. This option has no effect on wire
frame representation.
• Update secondary structure—Update the secondary structure assignment for each frame.
• Show simulation box—Show the edges of the simulation box (in purple).
• Show axes—Show the coordinate axes in green.
• Replicate system—Enter the number of replicas of the system to display in each of the
three directions. This enables you to visualize the movement across the simulation box
boundaries. These text boxes are unavailable if there are no periodic boundary conditions.
• Trajectory smoothing—Smooth the trajectory by averaging the coordinates over the spec-
ified number of frames.
• Positioning—Select one of these options to control the positioning of each frame relative to
the Workspace during play.
• Do not adjust—Do not adjust the positioning of each frame.
• Pick atoms for positioning—Use these picking controls to select the atoms to superimpose
or to center. You should consider picking atoms that do not change their position much
during the simulation.
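The Trajectory smoothing option in the list above averages coordinates over a number of frames. The sketch below shows one way such an average could be computed. The manual does not say which scheme Maestro uses, so the centered window that is truncated at the trajectory ends is an assumption, and smooth_trajectory is a hypothetical name.

```python
def smooth_trajectory(frames, window):
    """Average each atom's coordinates over neighboring frames.

    frames: list of frames, each a list of (x, y, z) tuples, one per atom.
    Assumption: a centered window truncated at the trajectory ends. For an
    even window the centered span covers window + 1 frames.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(frames)):
        lo = max(0, i - half)
        hi = min(len(frames), i + half + 1)
        block = frames[lo:hi]
        frame = []
        for a in range(len(frames[i])):
            # Average x, y, and z separately over the frames in the window.
            avg = tuple(sum(f[a][k] for f in block) / len(block) for k in range(3))
            frame.append(avg)
        smoothed.append(frame)
    return smoothed


trajectory = [[(0, 0, 0)], [(3, 0, 0)], [(6, 0, 0)]]
print(smooth_trajectory(trajectory, 3))
```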
If you want to superimpose the trajectory on the Workspace structure, you must take care when
you select the atoms to superimpose. The ASL expression for the selection in the Workspace
structure is applied to each trajectory frame. If the ASL expression depends on the numbering
(atom number, molecule number, etc.) and the order of the objects in the trajectory frames is
not the same as in the Workspace structure, you may get unexpected results. You should use
ASL expressions that are not order-dependent. You can use the atom selection button to choose
from a variety of structural features whose ASL expressions are not order-dependent, like
Backbone or Ligands. If the ordering is a problem, you could instead align the Workspace
structure to a particular frame, and then play the trajectory by superimposing on that frame.
The atoms that are visible in each frame can be set either with the Workspace toolbar buttons,
or with the tools in the Atoms to display in each frame section.
To select the atoms that are visible when frames are displayed, use the toolbar buttons that
control the atom display:
The choice that you make with the toolbar buttons is recorded as an atom set, and is applied to
each frame. The same atoms are always displayed in each frame, no matter where they move in
the trajectory.
To display the atoms that come within a given distance of a particular set of atoms (such as a
ligand or a binding site), use the Atoms to display in each frame section. You can select the
atoms that are always visible with the Always display these atoms selection tools. You can then
choose to display entire residues that have any atoms within a specified distance of these
atoms. This expression is evaluated for each frame, which allows residues (such as water) to
move in and out of this distance. The particular set of atoms that is visible can therefore change
from frame to frame.
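The per-frame evaluation described above can be illustrated with a short Python sketch. It is not the ASL machinery that Maestro uses. The data layout (a list of residue identifiers with coordinates) and the function name residues_near are assumptions made for the example.

```python
import math


def residues_near(atoms, anchor_indices, cutoff):
    """Residues that have any atom within `cutoff` of an anchor atom.

    atoms: list of (residue_id, (x, y, z)) for a single frame.
    anchor_indices: indices of the atoms that are always displayed.
    """
    anchors = [atoms[i][1] for i in anchor_indices]
    near = set()
    for residue_id, xyz in atoms:
        if residue_id in near:
            continue  # the whole residue is already selected
        if any(math.dist(xyz, a) <= cutoff for a in anchors):
            near.add(residue_id)
    return near


# Two frames in which the waters swap places relative to the ligand atom.
frame_1 = [("LIG", (0.0, 0.0, 0.0)), ("W1", (2.0, 0.0, 0.0)), ("W2", (8.0, 0.0, 0.0))]
frame_2 = [("LIG", (0.0, 0.0, 0.0)), ("W1", (5.0, 0.0, 0.0)), ("W2", (2.5, 0.0, 0.0))]
for frame in (frame_1, frame_2):
    print(sorted(residues_near(frame, [0], 3.0)))
```

Because the distance test is repeated for every frame, the selected set follows the waters as they move, unlike the fixed atom set recorded by the toolbar buttons.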
The output buttons allow you to export the trajectory data in various forms. You can export
individual frames, all frames, or the selection of frames defined by using the Start, End, and
Step text boxes. The buttons have the following actions:
• Structure—Save structures from the trajectory to a file or create project entries from the
structures. Opens the Export Structure dialog box, in which you can specify where the
structures will go, which structures to export, and which atoms to export.
• Image—Create an image of the Workspace with the current frame displayed. Opens the
Save Image panel.
• Movie—Save a movie of the trajectory in MPEG format. Opens the Export Movie panel,
in which you can select the frames to be exported, the speed and the resolution.
If you want to view a trajectory while a simulation is running, you can do so by importing the
jobname-out.cms file for the simulation from the host on which the simulation is running.
This file contains information on the location of the trajectory, and should be in the temporary
directory for the job. There may also be a copy of the output CMS file in the working directory
on the local host, but the trajectory will not be present until the job finishes. Once you have
imported the file, click the T button for the imported entry. To update the trajectory later, you
must reimport the file.
To open the Simulation Quality Analysis panel, choose Applications → Desmond → Simulation
in the main window.
To load the desired simulation, click Browse, and navigate to the desired Energy Sequence file
for the simulation, which has a .ene suffix. If you want to perform the analysis on a running
simulation, you must use the file from the host on which the simulation is running.
The analysis performs averaging of the results over short time periods (“blocks”) to reduce the
noise and eliminate correlation between consecutive reports. To change the size of the block,
enter a value in the Block length for averaging text box.
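Block averaging itself is a simple operation, sketched below in Python. This is a generic illustration, not the code behind the panel. The block length is expressed here as a number of samples rather than a time, a trailing partial block is discarded, and both choices are assumptions.

```python
import math


def block_averages(values, block_length):
    """Average consecutive, non-overlapping blocks of `block_length` samples.

    Assumption: a trailing partial block is discarded.
    """
    if block_length < 1:
        raise ValueError("block_length must be at least 1")
    n_blocks = len(values) // block_length
    return [
        sum(values[i * block_length:(i + 1) * block_length]) / block_length
        for i in range(n_blocks)
    ]


def mean_and_stderr(block_means):
    """Mean of the block averages and its standard error."""
    n = len(block_means)
    mean = sum(block_means) / n
    if n < 2:
        return mean, float("nan")
    variance = sum((b - mean) ** 2 for b in block_means) / (n - 1)
    return mean, math.sqrt(variance / n)


energies = [1.0, 3.0, 5.0, 7.0, 9.0]  # made-up values
blocks = block_averages(energies, 2)
print(blocks, mean_and_stderr(blocks))
```

Longer blocks give fewer, less correlated averages, which is why the block length is worth adjusting before running the analysis.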
When you have the desired block length, click Analyze to perform the analysis. The analysis
can take a few minutes. When it finishes, the Simulation summary and Properties sections are
filled in with the results of the analysis. If you want to view a plot of the thermodynamic prop-
erties as a function of simulation time, click Plot. A panel is displayed with the results plotted.
To open the Simulation Event Analysis panel, choose Applications → Desmond → Simulation
Event Analysis. Detailed
information about using this panel is available in the online help, which you can open by
clicking the Help button in the panel. A summary is given below.
Five types of properties are available: energy, hydrogen bonds, RMS deviation of atoms rela-
tive to a trajectory frame or reference structure, RMS fluctuation of atoms over the entire
trajectory relative to a trajectory frame or reference structure, and measurements of distances,
angles, dihedrals, and radius of gyration.
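Two of these properties, the RMS deviation and the radius of gyration, have compact definitions that can be written out in Python. The sketch below is a plain illustration of the formulas, not the panel's implementation. It performs no superposition before computing the RMSD, and the function names are hypothetical.

```python
import math


def rmsd(coords, reference):
    """RMS deviation between matching atoms; no superposition is performed."""
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must have the same length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords, reference))
    return math.sqrt(total / len(coords))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration; equal masses if none are given."""
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    center = tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    )
    weighted = sum(m * math.dist(c, center) ** 2 for m, c in zip(masses, coords))
    return math.sqrt(weighted / total_mass)


frame = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(rmsd(frame, reference), radius_of_gyration(frame))
```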
The first task in the analysis is to select properties and the atoms or molecules for which they
are calculated. These are added to the list in the center of the panel, where they are represented
as “keywords”. Next, these properties must be evaluated, by selecting individual properties in
the list and clicking Analyze, or clicking Analyze All to analyze all the properties. You can read
or write a list of properties (keywords), and you can export the results of the analysis.
Once the properties have been evaluated, you can display the results in various kinds of plots:
time series, histogram, multi-variable plot, heat map, polar plot, RMSF plot, and also display
statistics on the properties.
To open the Replica Exchange Review panel, choose Applications → Desmond → Replica
Exchange Review or Tasks →
Molecular Dynamics → Replica Exchange Review. First, you must load a log file for the simu-
lation (not the multisim log file). Then you can choose whether to show all replicas or a
single chosen replica. When you have made a choice, click Display Plot. The plot is displayed
in a new window, with a standard set of plotting controls for configuring the plot and saving an
image. If you are displaying single replicas, you can add replicas to the plot by selecting a new
one, or you can plot a different replica by closing the plot window, selecting a new replica, and
clicking Display Plot again.
Chapter 6
Desmond jobs may be started from Maestro or from the command line. The mechanisms for
running jobs from the command line are described in this chapter. You might wish to run
Desmond from the command line for any of the following reasons:
Desmond jobs run under Schrödinger’s Job Control facility. This facility manages the execu-
tion and monitoring of jobs, and handles the input and output files and the incorporation of
results into a Maestro project. The Job Control Guide describes how to set up the information
needed for Job Control to run on the computers to which you have access. It includes informa-
tion on remote hosts, clusters, and batch queues.
As is the case for all Schrödinger software, the environment variable SCHRODINGER must be
set to the directory where the Schrödinger software, including Desmond, was installed. In addi-
tion, there are other environment variables that can be set to override default resource values.
See Appendix B of the Job Control Guide for more information.
By default, the Schrödinger job control facility uses ssh to communicate between remote
nodes. For more information, see Section 7.2 of the Installation Guide.
This chapter describes the command syntax for running single-stage and multistage Desmond
jobs, and for building the input composite model system. For details of the file formats and
technical background, see the Desmond User’s Guide, from D. E. Shaw Research.
The options for the desmond command are listed in Table 6.1. The standard Job Control
options are also supported—see Section 2.3 of the Job Control Guide. Additional diagnostic
options, which do not run the job but provide information, are also given in the section
mentioned. In particular, you should note the syntax of the -HOST option, which is used to
specify the list of hosts used for the job.
For information on storage of temporary files, interacting with running Desmond jobs, and so
on, see the Job Control Guide.
Option                 Description
Job Options
-DEBUG                 Print diagnostic output.
-LOGINTERVAL           Interval for copying the log file back to the submission host.
                       Default: 5s.
-INTERVAL              Interval for copying monitor files back to the submission host.
                       Default: 5s.
-p|-P|-PROCS|-NPROC    Number of processors to use. Default: 1.
-JOBNAME name          Specify the name for the job. The default is the input .cfg or
                       .cpt file name, minus the extension.
-jin filename          Files to be transferred to the execution host by Job Control.
-jout filename         Files to be copied back to the submission host by Job Control.
-h[elp]|-HELP          Print usage information.
-v                     Print version information and exit.
Program Options
-exec program          Run the specified Desmond program. Available programs are:
                         mdsim—Run a molecular dynamics simulation (the default)
                         minimize—Run a minimization
                         remd—Run a replica exchange molecular dynamics simulation
                         vrun—Analyze a trajectory
                       If a front-end config file is provided with -c (such as those
                       created by Maestro), the program to use is determined
                       automatically and this option is not needed.
-c config-file         Parameter file for simulation. Required with -in for a new
                       simulation.
-cfg key=val           Specify extra simulation parameters. Multiple instances of this
                       option can be supplied to specify multiple extra parameters.
-comm plugin           Use the specified communication plugin. Available plugins are
                       serial and mpi. Default: serial.
-dp                    Run double-precision version. Default: run single-precision
                       version.
-noopt                 Do not optimize parameters automatically.
-t temperature         Specify temperature in a replica exchange MD simulation. Include
                       an instance of this option for each temperature in the exchange.
-tpp n                 Specify the number of threads per processor.
Some examples of running desmond from the command line are shown below:
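As a hedged illustration of how the options in the table combine, the following Python sketch assembles a desmond argument list. It is not taken from the manual. The launcher location ($SCHRODINGER/desmond), the file names, the job name, and the host value are all assumptions, and the exact -HOST syntax should be checked in the Job Control Guide. The sketch only prints the command and does not run it.

```python
import os
import shlex


def desmond_command(config, structure, jobname, host="localhost", procs=1):
    """Assemble a desmond argument list from options listed in the table."""
    # Placeholder default, used only if SCHRODINGER is not set.
    schrodinger = os.environ.get("SCHRODINGER", "/opt/schrodinger")
    return [
        os.path.join(schrodinger, "desmond"),  # assumed launcher location
        "-JOBNAME", jobname,
        "-HOST", host,
        "-PROCS", str(procs),
        "-c", config,       # parameter file for the simulation
        "-in", structure,   # input structure, required with -c for a new run
    ]


# Hypothetical file and job names.
cmd = desmond_command("md.cfg", "system.cms", "md_test", procs=8)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems if a path contains spaces.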
Option              Description
-ADD_FILE filename  Additional input file to copy to the working directory of the
                    multisim job. By default, multisim identifies and copies most
                    files needed for the run.
-c cfg-file         File that contains the default Desmond config parameters. This
                    option is only useful for Desmond subjobs.
-cpu num-cpus       Number of processors for each Desmond subjob. This can be a string
                    indicating the processor topology, e.g., '2 2 2'. If this option is
                    not provided, a default value set by either the backend or the
                    protocol will be used. Processor counts must be a power of 2, 3,
                    or 5, or products of these powers.
-d stage-file       File (.tgz) containing information on the stages from the previous
                    job, when restarting a simulation. Use one instance for each file.
                    Requires a multisim checkpoint file.
-debug              Turn on multisim debug mode.
-DEBUG              Turn on both multisim and Job Control debug modes.
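The restriction on processor counts given for the -cpu option is easy to check before submitting a job. The following Python sketch tests whether a count factors into 2, 3, and 5 only. The function names are hypothetical, and applying the rule to the total of a topology string is an assumption, since the table does not say whether it applies per dimension or to the product.

```python
def is_valid_cpu_count(n):
    """True if n is a product of powers of 2, 3, and 5 only."""
    if n < 1:
        return False
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1


def topology_cpus(topology):
    """Total number of processors for a topology string such as '2 2 2'."""
    total = 1
    for part in topology.split():
        total *= int(part)
    return total


for count in (8, 12, 14, 30):
    print(count, is_valid_cpu_count(count))
print(topology_cpus("2 2 2"))
```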
The following command explicitly uses a different host for the master job and the subjobs:
minimize {
title = "Minimization with restraints on solute"
max_steps = 2000
steepest_descent_steps = 10
convergence = 50.0
restrain = { atom = solute force_constant = 50.0 }
}
minimize {
simulate {
title = "Berendsen NVT, T = 10 K, small timesteps, and restraints on solute heavy atoms"
annealing = off
time = 12
timestep = [0.001 0.001 0.003]
temperature = 10.0
restrain = { atom = solute_heavy_atom force_constant = 50.0 }
ensemble = {
class = NVT
method = Berendsen
thermostat.tau = 0.1
}
randomize_velocity.interval = 1.0
eneseq.interval = 0.3
}
simulate {
title = "Berendsen NPT, T = 10 K, and restraints on solute heavy atoms"
annealing = off
time = 12
temperature = 10.0
restrain = retain
ensemble = {
class = NPT
method = Berendsen
thermostat.tau = 0.1
barostat.tau = 50.0
}
randomize_velocity.interval = 1.0
eneseq.interval = 0.3
}
simulate {
title = "Berendsen NPT and restraints on solute heavy atoms"
effect_if = [["@*.*.annealing"] 'annealing = off temperature = "@*.*.temperature[0][0]"']
time = 12
restrain = retain
ensemble = {
class = NPT
method = Berendsen
thermostat.tau = 0.1
barostat.tau = 50.0
}
randomize_velocity.interval = 1.0
eneseq.interval = 0.3
}
simulate {
title = "Berendsen NPT and no restraints"
effect_if = [["@*.*.annealing"] 'annealing = off temperature = "@*.*.temperature[0][0]"']
time = 24
ensemble = {
class = NPT
method = Berendsen
thermostat.tau = 0.1
barostat.tau = 2.0
}
eneseq.interval = 0.3
}
simulate {
cfg_file = "example.cfg"
jobname = "$JOBNAME"
dir = "."
compress = ""
}
In the above example, the first stage, task, specifies the type of job that is run so that appro-
priate defaults can be set. In this case desmond:auto indicates that it is a Desmond job and
that the type of job should be detected automatically.
The second stage is a minimization of the system over a maximum of 2000 steps. Of the 2000
steps, the first 10 steps are steepest descent. The convergence criterion is set rather loosely
to 50.0 kcal mol⁻¹ Å⁻¹. The solute is restrained with a force constant of 50.0 kcal mol⁻¹ Å⁻¹. The
monitor file is not updated. The third stage is similar except that nothing is restrained.
The fourth stage sets up a 12 ps Berendsen NVT simulation. For this simulation, the tempera-
ture is set to 10.0 K, and the thermostat relaxation is set to 0.1 ps. Resampling is done every
1 ps. The solute atoms are restrained. Checkpointing and structure monitoring are turned off.
Center-of-mass motion is removed and the .ene file is updated more frequently, at intervals of
0.3 ps. Subsequent simulation stages follow with progressively more freedom until the
second-to-last stage, when conditions resemble the production run.
The last stage is the production dynamics simulation. The simulation parameters could have
been explicitly listed here as in the preceding stages. In this example, this stage simply refers to
a Desmond .cfg file. The last line gives an example of how this .msj file could be run. You
would need to change the options for the job.
Two other global tasks are performed by setting the command-line options:
-rezero Set the coordinate origin to the centroid of the solute coordinates.
-minimize_volume Minimize the volume of the simulation box.
The order of the keywords in a CSB file matters. Lines beginning with # are considered to be
comments and are ignored. An example CSB file is shown below:
{
read_solute_structure mysolute_setup-in.mae # solute file name
solvent_desmond_oplsaa_typer {
input_file_name spc.box.mae
run
}
positive_ion_desmond_oplsaa_typer {
input_file_name Na.mae
run
}
negative_ion_desmond_oplsaa_typer {
input_file_name Cl.mae
run
}
membranize POPE.mae.gz 10.000000 10.000000
create_boundary_conditions orthorhombic 0.000000 0.000000 10.000000
exclude_ion_from { 1 2 } 10.0
solvate
neutralize
write_maeff_file chorus_setup-out.cms
}
The various sections of the file are described in the following subsections.
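Because keyword order matters, it can be helpful to list the keywords of a CSB file in the order in which they appear. The Python sketch below does this for text in the format of the example above. It is a rough illustration rather than the parser used by system_builder. The function name csb_keywords is hypothetical, and stripping comments that follow a keyword on the same line is an assumption based on the example.

```python
def csb_keywords(text):
    """List the leading keyword of each non-comment line, in file order."""
    keywords = []
    for raw in text.splitlines():
        # Drop comments; assumption: '#' also starts a trailing comment.
        line = raw.split("#", 1)[0].strip()
        if not line or line in ("{", "}"):
            continue
        keywords.append(line.split()[0])
    return keywords


example = """{
read_solute_structure mysolute_setup-in.mae # solute file name
create_boundary_conditions orthorhombic 0.000000 0.000000 10.000000
solvate
neutralize
write_maeff_file chorus_setup-out.cms
}"""
for position, keyword in enumerate(csb_keywords(example), start=1):
    print(position, keyword)
```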
The keyword read_solute_structure reads the solute structure from the specified file. If
the file is given as a relative path, it is copied to the temporary directory for the job. If it is
given as an absolute path, the file must exist at that location on the execution host. The struc-
tures can be a solute structure, structures for different stages of FEP simulations, or a
completely solvated system. The force-field information is obtained by either reading from an
existing ffio_ff block in the input file or running the force field server.
Note that alternate coordinates are removed from the solvent structure when it is imported.
The next two sections determine the data to be used for counter ions to neutralize (or in some
cases charge) the system.
positive_ion_desmond_oplsaa_typer {
input_file_name Na.mae
run
}
negative_ion_desmond_oplsaa_typer {
input_file_name Cl.mae
run
}
These two sections describe the ion systems to neutralize or to add ions to the current struc-
tures. In each section one input_file_name keyword is provided to determine the file to be
used.
The next two sections (not used in the example file above) similarly determine the ion systems
to add salts to the system.
salt_positive_ion_desmond_oplsaa_typer {
input_file_name Na.mae
run
}
salt_negative_ion_desmond_oplsaa_typer {
input_file_name Cl.mae
run
}
The syntax for these sections is the same as for the positive and negative ion sections.
The keyword boundary_conditions is used to define a box with specified absolute dimen-
sions. The command using this keyword can be one of the following:
boundary_conditions cubic a
boundary_conditions orthorhombic a b c
boundary_conditions triclinic a b c alpha beta gamma
The box shape is the first parameter, and can be cubic, orthorhombic, or triclinic. It is
followed by the absolute size of the box defined by a or by a, b, and c. For a triclinic box
shape, the angles alpha, beta, and gamma must also be given.
create_boundary_conditions cubic a
create_boundary_conditions orthorhombic a b c
create_boundary_conditions triclinic a b c alpha beta gamma
The parameters are similar to those for the absolute box size, but the distances are the
minimum distance between any solute atom and the box boundary in the given direction. The
distance between two images of the solute structures is therefore twice the specified value. If a
membrane is used, the box shape must be orthorhombic and the a and b values must be set to
zero so that solvent molecules are not placed within the membrane layers.
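The relationship between a buffer distance and the resulting box size can be made concrete with a small Python sketch. It assumes an axis-aligned orthorhombic box that encloses the solute with the given buffer on each side. The function name buffered_box is hypothetical, and the sketch does not reproduce how system_builder orients or centers the solute.

```python
def buffered_box(coords, buffers):
    """Orthorhombic box lengths from the solute extent plus buffers.

    coords: list of (x, y, z) solute atom positions.
    buffers: (a, b, c) minimum distances between solute and box boundary.
    """
    lengths = []
    for k in range(3):
        values = [c[k] for c in coords]
        extent = max(values) - min(values)
        # One buffer on each side of the solute in this direction.
        lengths.append(extent + 2.0 * buffers[k])
    return tuple(lengths)


solute = [(0.0, 0.0, 0.0), (10.0, 20.0, 30.0)]  # made-up coordinates
print(buffered_box(solute, (10.0, 10.0, 10.0)))
```

In this example a 10 Å buffer leaves 20 Å between the solute and its nearest periodic image in each direction, in line with the factor of two noted above.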
You can minimize the size of the box by using the -minimize_volume option to the
system_builder command.
set_oplsaa_version 2005
By default, force-field parameters that are already present in the solute CTs (the ffio_ff
block) are replaced. The following setting can be used to retain existing parameters:
remove_solute_ffio no
Two commands using the keyword add_ion can be used to add specific numbers of positive
and negative ions.
add_ion positive 5
add_ion negative 5
The keyword add_ion adds positive or negative ions to the system. The number of ions to be
added is specified as the second argument of this keyword.
The ion_location keyword can be used to specify the proximity of ions with reference to
certain atoms.
In order to exclude ions near certain atoms, the exclude_ion_from keyword can be used:
add_salt conc
The keyword add_salt adds the positive and negative salt ions to the system defined above.
The argument specifies the salt concentration for the molecular system. Based on the volume
of the solvent and the concentration, the number of salt ions is calculated and rounded to an
integer value. The coordinates of salt ions are determined randomly.
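The ion count that follows from a concentration and a solvent volume is a one-line calculation, sketched below in Python. The units (mol/L for the concentration, cubic ångströms for the volume) and rounding to the nearest integer are assumptions, since the text states only that the result is rounded to an integer. The function name is hypothetical.

```python
AVOGADRO = 6.02214076e23            # particles per mole
LITERS_PER_CUBIC_ANGSTROM = 1.0e-27


def salt_ion_pairs(concentration_molar, solvent_volume_A3):
    """Number of salt ion pairs for a concentration and a solvent volume.

    Assumptions: concentration in mol/L, volume in cubic angstroms, and
    rounding to the nearest integer.
    """
    liters = solvent_volume_A3 * LITERS_PER_CUBIC_ANGSTROM
    return round(concentration_molar * liters * AVOGADRO)


# 0.15 M salt in 100,000 cubic angstroms of solvent.
print(salt_ion_pairs(0.15, 100000.0))
```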
neutralize
The net charge of the current solute structures is calculated and the number of counter ions
necessary to neutralize the system is added.
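The counting step behind neutralize can be sketched as follows. This is a minimal illustration that assumes integer charges and ions that all carry the same charge magnitude. The function name counter_ions and the returned labels are hypothetical.

```python
def counter_ions(net_charge, ion_charge=1):
    """Return (kind, count) of counter ions needed to neutralize a system.

    net_charge: integer net charge of the solute structures.
    ion_charge: charge magnitude of a single counter ion (assumed integer).
    """
    if net_charge == 0:
        return ("none", 0)
    # A negative system needs positive ions, and vice versa.
    kind = "positive" if net_charge < 0 else "negative"
    count, remainder = divmod(abs(net_charge), ion_charge)
    if remainder:
        raise ValueError("net charge is not a multiple of the ion charge")
    return (kind, count)


print(counter_ions(-3))
print(counter_ions(2))
```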
solvate
The keyword solvate is used to solvate the current solute structures using the solvent system
defined in the solvent_desmond_oplsaa_typer section. The solvent system is extended to
be consistent with the boundary condition defined by the boundary_conditions or
create_boundary_conditions section. Any solvent molecules that overlap the solute
structures are removed.
write_maeff_file filename
The keyword write_maeff_file writes the current composite structures and their force field
parameters into a file in CMS format. The first structure contains all of the molecules in the
system and is usually referred to as the “full system CT”. The structures (or CTs) that follow it
contain different components of the system. For instance there usually is a solute structure, a
solvent structure (containing all of the solvent molecules) and structures for different types of
ions. There may also be a structure that contains solvent molecules that are extracted from the
solute. The force field parameters are inserted as ffio_ff blocks within each CT block
(“structure”) except the first. The coordinate origin of the structures is by default the center of
mass of the solutes.
Chapter 7
VMD [7] is a powerful program for visualizing molecular dynamics simulations that is avail-
able from the Theoretical and Computational Biophysics Group at the University of Illinois at
Urbana-Champaign. A plugin for Desmond is provided as part of VMD that makes it possible
for VMD to read and write Maestro files and read Desmond trajectories. The output Maestro
files are suitable for use in building model systems that can then be used to run Desmond.
This chapter contains information on reading a CMS file and a Desmond trajectory into VMD,
and writing a Maestro file from VMD. For information on installing VMD, see Section 3.10.4
of the Installation Guide.
vmd
Two windows are opened, VMD main and VMD version OpenGL Display. The former can be
used to control VMD while the latter can be used to display the molecular systems and to view
trajectories. VMD has extensive documentation which is available from the Help menu in the
VMD main window.
4. In the Molecule File Browser panel, set the file type to Maestro File (no timesteps).
5. Click Load.
You have just created a Molecule listing in VMD, which should appear as a new line in the
VMD Main window corresponding to the CMS file that you just read in.
6. Click Load.
The trajectory should load into VMD and automatically start playing.
5. In the Filename text box, edit the directory, and append the name for the new file to the
directory.
This is a plain Maestro file, whose extension should be .mae.
6. Click OK.
Chapter 8
The Desmond installation includes two utilities, viparr and build_constraints, that can
be used to add or adjust the force field parameters and the accompanying constraints for chem-
ical systems prior to simulating them with Desmond. Both programs read and write Maestro
structure files.
Viparr is a template-based force field assignment utility that comes with a number of built-in
force fields including some developed for Amber and CHARMM (see below for more infor-
mation). User-defined force fields are also supported by viparr, if they are provided in viparr’s
file format. Viparr can be used to specify different force fields for various components of the
system, provided that the force fields are compatible. This flexibility makes it possible to do
some things that may be useful in force-field development including:
• Using one force field for one part of the chemical system and another force field for
another part (this allows you, for example, to easily switch between water models)
• Using one or more components from one force field (e.g., the dihedral parameters) and
the remaining components from another force field
• Overriding some of the parameters (e.g., some but not all of the angle parameters) with
those from another force field
Some classes of constraints are often used with, and in some cases required by, various force
field representations of molecules. The utility build_constraints adds these constraints to
a structure file. You should run this utility after running viparr, to ensure that the force field is
ready for use.
The basic options are given in Table 8.1. The force field data files are located in the directory
$SCHRODINGER/desmond-vversion/data/viparr. When you have run viparr, you must run
build_constraints, to ensure that the force field constraints are set properly.
Option Description
-h Print usage message, including the names of the built-in force fields
-n name Specify force field name or other annotation to put into the output file. If not spec-
ified, a default name is used.
-f ffname Specify a built-in force field. The available force fields are listed in Table 8.2. You
can repeat this option to specify multiple force fields, one per instance of the
option. Parameters of force fields listed earlier override parameters of force fields
listed later. When multiple force fields are specified, the order is important if some
parameters are intended to override others.
-c ctnum Specify the index of a single structure (“CT block”) in the input file for processing
by viparr. Structures are numbered starting at 1. You can provide multiple -c
options to specify more than one structure to process. The default is to process all
structures.
-d ffdir Specify a user-defined force-field directory. You can repeat this option to specify
multiple directories. As for -f, the order is important.
-m mergedir Path to user-defined force field directory that is to be merged with previously spec-
ified force field. Multiple directories can be specified with multiple instances of
this option.
-nocompress Do not use ffio block compression.
-p pdir Specify the plugin directory. The default is to use the directory defined by the
environment variable VIPARR_PDIR, which contains the standard plugins. All
necessary plugins, including those for the built-in force fields, must be in the
directory specified by -p.
-v Verbose output.
The available force fields are listed in Table 8.2, with references.
Force field                References
Amber
amber94                    8
amber96                    9
amber99                    10
amber99SB                  10,11
amber99SB-ILDN             12
amber03                    13
CHARMM
charmm22nocmap             14–16
charmm27                   14–21
charmm32                   14,15,20–22
charmm36_lipids            23
OPLS-AA
oplsaa_impact_2001a        24–29
oplsaa_impact_2005 b,c     24–31
oplsaa_ions_Jensen_2005    32
Water models
spc                        34
spce                       35
tip3p                      36
tip3p_charmm               37
tip4p                      38
tip4pew                    39
tip4p2005
tip5p                      40
• AHn, where n is a count of the number of hydrogen atoms (1, 2, 3, ...) in a group com-
posed of a heavy atom and the hydrogen atoms directly bonded to it
• HOH, oxygen bonded to two hydrogen atoms and no other atoms.
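The two group types listed above can be recognized from element symbols and bonds alone. The Python sketch below labels each heavy atom that carries hydrogens. It only illustrates the definitions and is not the logic of build_constraints. The function name and the data layout are assumptions.

```python
def constraint_groups(elements, bonds):
    """Label heavy atoms as 'HOH' or 'AHn' from their bonded hydrogens.

    elements: list of element symbols, one per atom.
    bonds: list of (i, j) atom index pairs.
    Returns {heavy_atom_index: label}.
    """
    neighbors = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    groups = {}
    for i, element in enumerate(elements):
        if element == "H":
            continue
        hydrogens = [j for j in neighbors[i] if elements[j] == "H"]
        if not hydrogens:
            continue
        # HOH: oxygen bonded to exactly two hydrogens and nothing else.
        if element == "O" and len(hydrogens) == 2 and len(neighbors[i]) == 2:
            groups[i] = "HOH"
        else:
            groups[i] = "AH%d" % len(hydrogens)
    return groups


# Methanol: C bonded to three H and to O; O bonded to one H.
elements = ["C", "H", "H", "H", "O", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]
print(constraint_groups(elements, bonds))
```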
Option Description
that specify the size and shape of the simulation box. The output from the system_builder
utility meets these conditions.
Residue Matching
Atomic numbers and the bonding pattern (graph isomorphism) are used to match residues to
templates. This methodology supports nonstandard atom or residue PDB names without
modification. Atom and residue names in a force field need not be edited. In particular, viparr
identifies the N- and C-terminus versions of the residues correctly, as well as protonated and
deprotonated versions of a residue, even if they are not explicitly mentioned as such in the
input file. Modification of atom and residue names for clarity is allowed.
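The idea of matching by atomic numbers and bonding pattern can be shown with a brute-force graph isomorphism test in Python. This is an illustration of the concept only and says nothing about the algorithm viparr uses. It tries every atom ordering, so it is practical only for very small graphs, and the function name matches_template is hypothetical.

```python
from itertools import permutations


def matches_template(res_elements, res_bonds, tpl_elements, tpl_bonds):
    """True if the residue graph is isomorphic to the template graph.

    Atoms are compared by element (or atomic number) and bonds by
    connectivity; atom and residue names play no role.
    """
    n = len(res_elements)
    if n != len(tpl_elements) or len(res_bonds) != len(tpl_bonds):
        return False
    if sorted(res_elements) != sorted(tpl_elements):
        return False
    tpl_edges = {frozenset(bond) for bond in tpl_bonds}
    for perm in permutations(range(n)):
        # perm[i] is the template atom assigned to residue atom i.
        if any(res_elements[i] != tpl_elements[perm[i]] for i in range(n)):
            continue
        mapped = {frozenset((perm[i], perm[j])) for i, j in res_bonds}
        if mapped == tpl_edges:
            return True
    return False


# Water with atoms in a different order from the template.
template = (["O", "H", "H"], [(0, 1), (0, 2)])
residue = (["H", "O", "H"], [(0, 1), (1, 2)])
print(matches_template(residue[0], residue[1], template[0], template[1]))
```

Because only elements and connectivity enter the comparison, renaming atoms or residues in the input file has no effect on the result.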
The atom ordering in the input file is retained in the output structure file. The residue
numbering, which also remains unaltered, can begin with any integer (including negative inte-
gers) and does not need to be contiguous (viparr constructs a contiguous set of indices that it
uses internally). Residues with different chain names can have the same residue number. To aid
in diagnosing problems with the input structure file, messages involving residues have the form
<"chain-name",residue-number> (residue-name)
Output Format
A compressed force field representation is written when all the residues in a CT are the same.
For a CT that only contains water molecules, this means that force field parameters are written
only for a single water molecule.
A version number, which is associated with a particular version of viparr along with the
versions of the built-in force fields and their associated plugins, is written into the output struc-
ture file (in the ffio_version field). You are responsible for versioning your custom force
fields, using Perforce, for instance.
If viparr reports that it cannot match a residue, please check the following:
• The template for the residue is really in the force field selected.
• Atomic numbers for the residue are correct in the input structure file.
• Bonds for the residue are correct in the input structure file.
In this scenario, one force field is used for one part of a chemical system (e.g., the protein) and
another force field for another part (e.g., the water molecules). In this case, each residue in
your chemical system matches a template in exactly one of the specified force fields (warning
messages are printed otherwise).
Examples:
In this case, residues in the chemical system match templates in more than one of the specified
force fields (warning messages are printed). All matching force fields are applied. For
example, one force field provides the angle parameters for the residues, while another force
field provides the dihedral parameters. This can work if the force field components are disjoint
and there is no conflict in what parameters are assigned to each component.
Similar to the scenario mentioned above, residues in the chemical system match templates in
more than one of the specified force fields (warning messages are printed when this happens)
and all matching force fields are applied. However, if two or more force fields provide parame-
ters for the same term (e.g., two force fields provide parameters for the angle between atoms 1,
2, and 3) the conflict is resolved by using the parameters from the first force field listed on the
command line that matches the residue.
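The precedence rule described above, where the first force field listed wins when two force fields provide the same term, can be sketched with a dictionary merge in Python. The data layout, the force field names ffA and ffB, and the function name merge_parameters are hypothetical. The sketch only illustrates the ordering rule, not viparr's internal data model.

```python
def merge_parameters(force_fields):
    """Combine per-term parameters; earlier force fields take precedence.

    force_fields: list of (name, {term: parameters}) in command-line order.
    Returns {term: (source_name, parameters)}.
    """
    merged = {}
    for name, terms in force_fields:
        for term, params in terms.items():
            # setdefault keeps the entry from the first force field listed.
            merged.setdefault(term, (name, params))
    return merged


angle = ("angle", 1, 2, 3)
dihedral = ("dihedral", 1, 2, 3, 4)
listed = [
    ("ffA", {angle: {"k": 50.0}}),
    ("ffB", {angle: {"k": 40.0}, dihedral: {"k": 1.0}}),
]
for term, (source, params) in merge_parameters(listed).items():
    print(term, source, params)
```

Swapping the order of the two entries in the list would make ffB supply the angle parameters instead, which is why the order of the -f options matters.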
In all cases, if a bond exists between two residues that are not matched by the same force field,
viparr exits with an error message. You should correct the problem so that this bond is recog-
nized by one of the selected force fields. The force fields must have consistent van der Waals
mixing rules: viparr exits with an error message if they do not.
• If any residue matches more than one template in a force field, viparr exits with an
error. No viparr force field should contain identical templates.
• If any residue name is matched to a force field template with a different name, a message
is printed. A maximum of 5 messages are printed per residue-template name pair.
• If there are any unmatched residues, viparr prints all unmatched residues and exits with
an error. A maximum of 5 messages are printed per unmatched residue name.
• If any residue is matched by more than one of the selected force fields, viparr prints a
warning message. You should be sure that you intended multiple force fields to match and
if so that the appropriate one was selected.
• a template file
• a force-field parameter file, generally for each component of the force field. Angles,
proper dihedrals, van der Waals, etc. are examples of force field components
A set of plugin programs has been provided for the built-in force fields. In most cases, user-
defined force fields can use these plugin programs. These plugin programs are located in
VIPARR_PDIR. VIPARR_PDIR is an environment variable that should be set to
$SCHRODINGER/desmond-vversion/lib/Linux-x86/viparr_plugins.
The other files, namely the templates file, parameter files, and rules file that specify a given
force field are placed in a force field directory. The force field directories for the built-in force
fields are located in the directory given by the environment variable, VIPARR_FFDIR, which
should be set to $SCHRODINGER/desmond-vversion/data/viparr.
The -d, -m and -p options, described in Table 8.1, are provided for working with user-defined
force fields.
When you import a structure into Maestro, make sure you correct any problems that Maestro
detects, especially those involving the atomic numbers of ions.
Chapter 9: Utilities
The Desmond distribution contains a number of utilities for performing a range of specific
tasks, apart from those described previously. These utilities are described in this chapter.
9.1 solvate_pocket
The solvate_pocket utility is a tool for solvating subregions of a system with water, in
particular buried regions in a protein or protein-ligand complex. To solvate such
regions in a thermodynamically consistent manner, solvate_pocket uses a grand canonical
Monte Carlo approach to sample both the water molecule positions and the number of water
molecules present, by attempting to introduce or remove water molecules in a Metropolis-like
manner.
9.1.1 Methodology
In grand canonical methods, a value of the chemical potential is set and the simulation samples
the number of water molecules in a manner consistent with the chemical potential and the
specified temperature. In principle, even the water in buried pockets is in equilibrium with bulk
water and so one should conduct the simulation using the excess chemical potential for bulk
water (for TIP4P this is about -6.95 kcal/mol).
The solvate_pocket utility samples the number and conformation of water molecules
within an orthorhombic region of the simulation cell using grand canonical Monte Carlo
(GCMC) in the μVT (constant chemical potential, volume and temperature) ensemble. As a
short-cut it does not use periodic boundary conditions other than to center the orthorhombic
cell in the simulation prior to simulating water molecules in the sampled region. As such, the
subregion sampled by solvate_pocket should be surrounded overall by bulk solution even if
the sampled region itself is not fully solvated. In its current form solvate_pocket expects a
CMS file as input and produces a CMS file containing the final conformation of the system
from the GCMC simulation.
Interactions are calculated using the real-space part of the Ewald sum only, an approximation
that still gives reasonable energetics. For instance, the chemical potential of bulk TIP4P water
using this approach is around -7.17 kcal/mol.
The solvate_pocket utility uses the concept of a “pass”, which is roughly one
translation/rotation move for each molecule being sampled. This is useful because one pass is
very roughly equivalent to an MD time step in terms of the amount of sampling for small
molecules like water. The passes can also include insertion and deletion moves. Both insertion
and deletion moves are required in order to equilibrate the number of water molecules in the
system.
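One way to picture a pass is as a list of attempted moves. The following Python sketch is illustrative only and is not the solvate_pocket implementation: the function name build_pass and the random interleaving of the moves are assumptions. The three counts play the role of the num_trans_rot, num_insert, and num_delete keywords of the command file.

```python
import random

def build_pass(num_trans_rot, num_insert, num_delete, rng):
    """Return the move types attempted in one hypothetical pass.

    The shuffling is an assumption made for illustration; the manual
    does not state the order in which moves are attempted.
    """
    moves = (["trans_rot"] * num_trans_rot
             + ["insert"] * num_insert
             + ["delete"] * num_delete)
    rng.shuffle(moves)
    return moves

# Counts taken from the example command file in this chapter.
moves = build_pass(20, 5, 5, random.Random(0))
print(len(moves), moves.count("insert"), moves.count("delete"))
```

With these counts, a pass attempts 30 moves in total, of which a third change the number of water molecules.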
Option Description
-ChargeA n For FEP calculations, specify the lambda value for the charges for the origi-
nal molecule.
-ChargeB n For FEP calculations, specify the lambda value for the charges for the
mutated molecule.
-h[elp]|-HELP Print the usage message.
-lct N Use the Nth structure in the input CMS file to define the region sampled by
solvate_pocket. N must be greater than 1 (the full system structure).
Multiple instances of this option can be used to specify more than one struc-
ture for the region definition. Used to override the region specified in the
command file. Not compatible with -lmae.
-lmae ligand-file Specify the name of the ligand file used to define the region sampled by
solvate_pocket. Used to override the region specified in the command
file. Not compatible with -lct.
-seed number Provide a seed (an integer) for the random number generator.
-vdwA n For FEP calculations, specify the lambda value for the van der Waals poten-
tials for the original molecule.
-vdwB n For FEP calculations, specify the lambda value for the van der Waals poten-
tials for the mutated molecule.
-NJOBS jobs Run solvate_pocket the specified number of times, with the same input
but different random seeds. The output CMS file is the file with the number of
waters that is closest to the average over all the subjobs.
Here, command-file is the name of the command file for solvate_pocket, which has the
extension .spd, and is described in Section 9.1.3; input-cms-file and output-cms-file are the
input and output CMS (composite model system) files and usually have a .cms extension. The
optional ligand-file is a file containing the ligand. If it is present, it is used to define the subre-
gion for solvation. By default, the maximum and minimum x, y, and z values in the command
file are used. If any of -ChargeA, -ChargeB, -vdwA, or -vdwB is used, all of them must be
used and it is assumed that this run is for a relative FEP calculation.
Sampling can be improved by running multiple instances of solvate_pocket with the same
input file but different random seeds. These jobs can be run simultaneously by specifying more
than one processor with the -HOST option. The number of instances can be specified with the
-NJOBS option; by default it is set to the number of processors requested. When all subjobs are
complete, the output CMS file with the number of water molecules that is closest to the
average number across the subjobs is returned as the output structure file for the job as a whole.
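The selection rule for the output structure can be sketched in a few lines of Python. This is a minimal illustration, not the code used by solvate_pocket: the function name pick_representative and the water counts are made up, and breaking ties in favor of the first subjob is an assumption.

```python
def pick_representative(water_counts):
    """Return the index of the subjob whose water count is closest to the mean.

    Ties are resolved in favor of the earliest subjob (an assumption).
    """
    mean = sum(water_counts) / len(water_counts)
    return min(range(len(water_counts)),
               key=lambda i: abs(water_counts[i] - mean))

counts = [14, 17, 15, 16, 13]  # hypothetical water counts from five subjobs
best = pick_representative(counts)
print(best, counts[best])
```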
A solvate_pocket run on a binding site that can contain around 15 water molecules (about
450 Å³) can take between 10 minutes and an hour. The cost of the simulation increases
approximately as the square of the number of water molecules sampled. It is unlikely that
solvate_pocket is needed to prepare systems for NPT simulations if there are no buried
pockets.
Keyword Description
chargeA For FEP calculations, the lambda value for the Coulombic potentials for
the original molecule. Must be used with chargeB, vdwA, vdwB.
chargeB For FEP calculations, the lambda value for the Coulombic potentials for
the mutated molecule. Must be used with chargeA, vdwA, vdwB.
chemical_potential The chemical potential of water in kcal/mol. Required.
Recommended: -7.17 kcal/mol (for TIP4P).
cut_off The cutoff distance for calculating electrostatic and Lennard-Jones inter-
actions, in angstroms. Required. Recommended: 9.0 Å.
distribution_window The window used for calculating the distribution of the number of water
molecules. If this is larger than the value specified by num_passes it is
reduced to that value. If early termination upon convergence is used and
the number of passes carried out is smaller than specified by
distribution_window, that number of passes is used instead.
init_num_passes The number of Monte Carlo passes to perform in the equilibration prior
to the production Monte Carlo run. Recommended: 10000.
lig_ct_nums List of structures in the input CMS file to define the region sampled. The
value must be greater than 1, which is the full system CT.
max_dctheta The maximum change used for the cosine of theta (Euler angle) for a
combined translation/rotation move, in radians. Required. Recom-
mended: 0.0654.
max_disp The maximum change used in each of the x, y, and z directions for a
combined translation/rotation move. Required. Recommended: 0.105 Å.
max_dpsi The maximum change used for phi and psi (Euler angles) in radians for
a combined translation/rotation move. Required. Recommended: 0.318.
name A descriptive name for the simulation.
num_delete The number of attempts to delete a single water molecule per pass.
Required. Recommended: half the number of water molecules expected.
num_insert The number of attempts to insert a single water molecule per pass.
Required. Recommended: half the number of water molecules expected.
num_passes The number of Monte Carlo passes to perform in the production Monte
Carlo run. Required. Recommended: at least 20000.
num_trans_rot The number of water molecule translations to attempt per pass.
Required. Recommended: approximately the number of water mole-
cules expected to reside in the region being sampled.
output_structure Controls the output structure produced. Valid values are:
final—The last configuration sampled. This is the default.
average—The last structure from the trajectory that has a number of
dynamic water molecules closest to the average.
most_frequent—The last structure from the trajectory that has a num-
ber of dynamic water molecules closest to the most visited number.
If average or most_frequent are given then both
trajectory_freq and distribution_window must be given.
pass_term_window The number of passes over which the slope of the standard deviation of
the number of water molecules is calculated. This keyword may be used
to terminate the calculations before the number of passes specified by
num_passes has been completed. The simulation portions will run for
at least twice this duration before early termination occurs.
Default: 100000.
renum_waters If a nonzero value is given, renumber the water residues, starting from 1.
Otherwise retain the input water residue numbering.
sample_xmin, sample_xmax, sample_ymin, sample_ymax, sample_zmin, sample_zmax
Limits of the orthorhombic box in which the water molecules will be
sampled. For WaterMap jobs this box should enclose the binding site.
This box should be completely enclosed by atoms after it is centered in
the input periodic cell.
short_dist The closest acceptable approach for any two atoms, in angstroms.
Required. Recommended: 1.0 Å.
temperature The temperature used in the Monte Carlo simulation. Required.
term_std_slope Threshold for the slope of the standard deviation of the number of water
molecules. The calculation is terminated if the slope falls below this
value. Default: 0.00001.
trajectory_freq If present, specifies the frequency at which structures are saved to a
Maestro pose viewer file (_pv.mae). The first structure contains the
fixed portion of the system. Structures containing only the water mole-
cules are written every n passes, where n is the value assigned to
trajectory_freq.
update_frequency The number of passes between updates in the log file. Default: 10.
vdwA For FEP calculations, the lambda value for the van der Waals potentials
for the original molecule. Must be used with vdwB, chargeA, chargeB.
vdwB For FEP calculations, the lambda value for the van der Waals potentials
for the mutated molecule. Must be used with vdwA, chargeA, chargeB.
temperature = 298.15
init_num_passes = 10000
num_passes = 100000
update_frequency = 10
pass_term_window = 10000
term_std_slope = 0.00001
num_trans_rot = 20
num_delete = 5
num_insert = 5
max_disp = 0.105
max_dpsi = 0.318
max_dctheta = 0.0654
sample_xmin = -8.0
sample_xmax = 8.0
sample_ymin = -8.0
sample_ymax = 8.0
sample_zmin = -8.0
sample_zmax = 8.0
cut_off = 9.0
short_dist = 1.0
chemical_potential = -7.17
This file instructs solvate_pocket to sample water within a cube, 16 Å on a side, centered at
0.0. The system is equilibrated for 10,000 passes before the production simulation starts. The
production simulation runs for up to 100,000 passes but terminates before that number is
reached if the slope of the standard deviation of the number of water molecules drops
below 0.00001 over a window of 10,000 passes.
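To check the sampling box before running a job, the limits can be read from the command file with a short script. The Python sketch below assumes that the file contains only simple "keyword = value" lines, as in the example above; the function names parse_spd and box_lengths are hypothetical, and the real file format may allow constructs that this sketch does not handle.

```python
def parse_spd(text):
    """Parse 'keyword = value' lines into a dict of strings (simplified)."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank lines or anything that is not an assignment
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings

def box_lengths(settings):
    """Return the x, y, and z edge lengths of the sampling box in angstroms."""
    return tuple(
        float(settings["sample_%smax" % axis]) - float(settings["sample_%smin" % axis])
        for axis in "xyz"
    )

example = """
sample_xmin= -8.0
sample_xmax= 8.0
sample_ymin= -8.0
sample_ymax= 8.0
sample_zmin= -8.0
sample_zmax= 8.0
cut_off= 9.0
"""

settings = parse_spd(example)
lengths = box_lengths(settings)
volume = lengths[0] * lengths[1] * lengths[2]
print(lengths, volume)
```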
9.2 manipulate_trj.py
This script can be used to generate a new trajectory from a list of input trajectories. The trajec-
tory includes the .cms and .idx files that are needed to display the trajectory in Maestro. The
command syntax is as follows:
• merge—Multiple input trajectories are merged into one new trajectory based upon the
chemical time. For example, if the trajectories A = [a0 , ... , an ] and B = [b0 , ... ,bn ] are
merged, all frames from trajectory A whose chemical time is larger than that for b0 are
discarded. Here, the trajectories are represented as a list of frames ai and bi. This is the
default mode, and is useful for merging trajectories that are continued in a new run.
• concat—Frames from the input trajectories are simply concatenated, and the time for
each frame is reset to account for the new ordering. You can specify the time between two
adjacent frames in ps with the --dt option.
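The two modes can be illustrated with a small Python sketch. It is not the implementation used by manipulate_trj.py: frames are represented here as (time, label) tuples, the function names are hypothetical, and the start time of 0.0 ps in concat is an assumption. The text does not say how a frame of A whose time equals that of b0 is treated; this sketch keeps it.

```python
def merge(traj_a, traj_b):
    """Drop frames of A whose time is larger than that of b0, then append B."""
    if not traj_b:
        return list(traj_a)
    t_b0 = traj_b[0][0]
    kept = [frame for frame in traj_a if frame[0] <= t_b0]
    return kept + list(traj_b)

def concat(trajectories, dt, t_start=0.0):
    """Join all frames in order and reassign times with a spacing of dt (ps)."""
    frames = [frame for traj in trajectories for frame in traj]
    return [(t_start + i * dt, label) for i, (_, label) in enumerate(frames)]

# Hypothetical trajectories: B restarts from a point partway through A.
a = [(0.0, "a0"), (10.0, "a1"), (20.0, "a2"), (30.0, "a3")]
b = [(15.0, "b0"), (25.0, "b1")]

merged = merge(a, b)
joined = concat([a, b], dt=5.0)
print(merged)
print(joined)
```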
You can select a subset of the frames present in each input trajectory with a syntax similar to
that used for Python lists. If a list is used, the entire trajectory specification must be quoted.
The examples below illustrate the syntax:
The frame index is sorted in ascending order for each trajectory before any subset is selected.
The extension used in the input CMS file is preserved in the output CMS file (e.g. -out.cms).
9.3 amber_prm2cms.py
The utility amber_prm2cms.py can be used to convert Amber input files (a prmtop file plus a
prmcrd file) into a Desmond .cms file. The syntax for this command is:
The Amber parameter and topology file can be generated by the PARM or LeaP programs in the
Amber package. As well, AnteChamber, which is available for free under a GNU general
9.4 mold_gpcr_membrane.py
The utility mold_gpcr_membrane.py can be used to embed a GPCR in a membrane.
With more GPCR X-ray structures being determined over the last few years and the inherent
flexibility of GPCR structures, there is great interest in simulating them, particularly within a
membrane. Properly constructing the system to simulate can be tedious, time consuming and
error prone. Proper alignment of the GPCR in the membrane can be difficult to attain. As well,
the relaxation of the membrane around the protein can take a very long time.
Most GPCR proteins are simulated based on either b2 or rhodopsin structures. As a result
membranes that are equilibrated around b2 or rhodopsin structures are likely to be able to
accommodate another GPCR structure with only relatively mild clashes. The utility
mold_gpcr_membrane.py can automatically align a GPCR model to a2a, b1ar, b2ar or
rhodopsin and use the pre-equilibrated membrane structure from the latter to reduce the
membrane relaxation time and the potential structural problems that can arise during
membrane equilibration.
Option Description
-r[eference] reference Specifies the GPCR template. Valid values are a2a, b1ar, b2ar, or rho
(rhodopsin).
-z[_dist] distance Buffer distance for solvation of the system. Default: 10 Å.
-h[elp] Show usage message and exit
-v[ersion] Show program version and exit
9.5 trajectory_extract_frame.py
The trajectory_extract_frame.py script can be used to extract selected frames from a
trajectory into a series of output structure files. To run this script, use the following command:
Option Description
9.6 desmond_restraints.py
This command-line script allows you to add distance, angle and dihedral angle restraints to a
CMS file, or remove them from it, for use in a Desmond simulation. The restraints can be
flat-bottomed (i.e. the restraint energy is 0 for some range around the equilibrium value) if
the width parameter is specified. The syntax of the command is:
Option Description
[ ct atoms k s r0 ]
Here, ct is the index of the structure (CT) in the Maestro input file, atoms is 2, 3 or 4 atom
indices in the CT (2 for a distance, 3 for an angle, 4 for a dihedral), k is the force constant in the
same units as normal bonded interactions, s is the half-width of the flat bottom (optional) and
r0 is the reference value of the coordinate (optional).
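The effect of the flat bottom can be sketched as follows. This Python function is an illustration under stated assumptions, not the Desmond energy function: the harmonic form with a prefactor of 1/2 outside the flat region is assumed, and the function name flat_bottom_energy is hypothetical. Only the roles of k, s, and r0 are taken from the description above.

```python
def flat_bottom_energy(x, k, x0, s=0.0):
    """Restraint energy that is zero within s of x0 and harmonic outside.

    The 0.5 * k * d**2 form is an assumption; check the force field
    documentation for the convention that applies to your restraints.
    """
    excess = abs(x - x0) - s
    if excess <= 0.0:
        return 0.0  # inside the flat bottom
    return 0.5 * k * excess ** 2

# Distance restraint with r0 = 5.0, half-width 0.5, force constant 10.0.
inside = flat_bottom_energy(5.25, k=10.0, x0=5.0, s=0.5)
outside = flat_bottom_energy(6.0, k=10.0, x0=5.0, s=0.5)
print(inside, outside)
```

When s is omitted, the function reduces to an ordinary harmonic restraint about x0.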
9.7 rebuild_cms.py
This utility creates a full model system from either the component CTs (solute, membrane, and
solvent) or a full system CT. To generate the full model system with all CTs from a set of
components, run the following command:
Appendix A
The multisim utility is useful for running tasks that consist of a sequence of steps, such as
relaxation of a system followed by a production simulation. Maestro runs multisim for jobs
that include multiple stages. This appendix provides detailed information on how to run
multisim and the format of multisim (.msj) files.
• master-job-host—the host on which to run the master job (the Python script that manages
the multisim job).
• subjob-host—the host on which to run the subjobs, which are often CPU-intensive.
• example—the job name and the stem of various file names, which are constructed in a
standard way from the job name.
• the cpu specification "2 2 2"—the specification of the CPUs used by the job. This spec-
ification can either be a triplet of numbers that defines the spatial decomposition of the
system, with the total number of processors being the product of these three numbers; or
a single integer, which is the total number of processors, and must have only the factors 2,
3 and 5.
• the value for -maxjob, which is used to specify how many subjobs can be queued at the
same time. The value 0 means “all subjobs” normally, but is the same as the value 1 for
umbrella mode.
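The rule for the CPU specification can be checked before a job is submitted. The Python sketch below is a hypothetical helper, not part of multisim; it accepts either a triplet or a single integer and returns the total number of processors.

```python
def parse_cpu_spec(spec):
    """Return the total processor count for a spec such as "2 2 2" or "8"."""
    parts = [int(p) for p in spec.split()]
    if len(parts) == 3:
        if min(parts) < 1:
            raise ValueError("each dimension must be at least 1")
        return parts[0] * parts[1] * parts[2]
    if len(parts) == 1:
        total = parts[0]
        if total < 1:
            raise ValueError("processor count must be at least 1")
        remainder = total
        for factor in (2, 3, 5):
            while remainder % factor == 0:
                remainder //= factor
        if remainder != 1:
            raise ValueError("processor count may only have the factors 2, 3 and 5")
        return total
    raise ValueError("expected one integer or a triplet of integers")

print(parse_cpu_spec("2 2 2"))
print(parse_cpu_spec("30"))
```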
Node locking is supported for FEP jobs. You can run these jobs in a serial queue if you use
-maxjob 1.
In most cases the multisim job can be restarted with a command similar to the following:
Appendix A: The multisim Utility
You can restart a job with a different .msj file. For instance, the command:
To restart a job from a completed stage, you can specify the stage number right after the check-
point file name. For instance, the command:
• You can restart the job with a different .cfg file by specifying the new .cfg file with the
-c option.
• You can restart the job with a different maximum number of simultaneously running sub-
jobs by using the -maxjob option.
• You cannot change the node-locking mode.
In most cases, multisim automatically detects and uses additional input files needed for a job
when it is restarted. If the job fails because an existing input file was not detected when
restarting a job you can specify that file by using the -ADD_FILE option.
Stages:
Stage 1 completed.
Stage 2 completed.
Stage 3 completed.
Stage 4 completed.
Stage 5 was skipped.
Stage 6 completed.
Stage 7 completed.
Stage 8 completed.
Note the multisim version: checkpoint files generated by older versions of multisim might
not be compatible with the current version. The probe option of the multisim command can
be used to detect whether the checkpoint file is supported.
If the job had failed in Stage 5 you would see something like:
...
Stage 5 partially completed. 1 subjobs failed, 11 subjobs done.
Jobname of failed subjobs:
biphenyl-to-benzene_5_lambda2
Stage 6 not run.
To resume the job as described earlier, this output tells you that you need to provide the .tgz
file from stage 4, and that you might want to provide the .tgz file from the partially completed
stage 5. In this case only one subjob failed in stage 5 so rerunning just that subjob might
require significantly less computer time than rerunning the entire stage.
Multisim processing deviates from the Ark standard in the following ways:
• Multisim stages with the same name remain separate, otherwise stages that appear more
than once in an .msj file (such as simulate or minimize) would be combined into one
stage.
The multisim input file consists of a sequence of stages, each of which specifies a particular
calculation to be run. A stage begins with a label identifying the type of stage followed by
braces enclosing parameters for that stage. An .msj file will in general look something like:
Keyword Description
compress File name pattern of the stage data file. If it is set to an empty string, then the
data of this stage is not packaged and compressed. Default:
$JOBNAME_$STAGENO-out.tgz.
dir Pattern for the names of the directories used by subjobs of this stage. Default:
[$JOBPREFIX/][$PREFIX/]$MASTERJOBNAME_$STAGENO
[_lambda$LAMBDA]
jlaunch_opt Options to add to the command to run Desmond. The options are listed in
Table 6.1 on page 68. Example: ["-dp"] turns on use of double precision.
jobname Jobname pattern for subjobs of this stage. Default:
$MASTERJOBNAME_$STAGENO[_lambda$LAMBDA]
prefix Value of the PREFIX macro, which by default is used to specify the sub-direc-
tory name of subjobs (see the dir keyword). Default is an empty string.
should_skip Skip this stage. Allowed values: true, false. Default: false.
should_sync Do not start this stage until all subjobs of the previous stage finish successfully.
Allowed values: true, false. If it is set to false, this stage is started as soon as
any subjob of the previous stage finishes successfully. For FEP jobs, setting this
keyword to false can result in earlier completion of the job. Default: true.
struct_output File name of the final output structure file. This keyword is only effective when
set in the last stage, and the setting can be overwritten by the -o option of
multisim.
title The title for the stage. Default: ?, which stands for none.
The restrain keyword supports the specific values listed in Table A.2, and also supports a
block syntax and a list syntax.
Value Description
Restraints can be specified using blocks, and ASL expressions can be used to define the atoms
that are restrained within restraint blocks. The syntax is:
Relative restraints can be included by setting atoms and refvalue appropriately. For distance
and NOE restraints, atoms should specify exactly 2 atoms; for angle restraints, atoms should
specify exactly 3 atoms; and for dihedral restraints, atoms should specify exactly 4 atoms. All
atoms in the relative restraint must be in the same CT. For NOE restraints, refvalue should be
a list with two elements, the upper distance then the lower distance; for all other types refvalue
is the desired distance or angle value, in angstroms or degrees.
Different groups of atoms in the system can be restrained with different force constants by
listing the restraint blocks for each set of atoms within a list using the syntax:
An atom group set in the system_builder stage is persistent, which means that it remains
defined in all subsequent stages, until the next system_builder stage.
The task stage has two keywords. The first is task, whose allowed values are listed in
Table A.3.
Value Description
The second is set_family, which can be used to set parameters for a family of stages. For
example:
task {
   task = "desmond:fep"
   set_family = {
      desmond = {
         checkpt.wall_interval = 7200.0
         checkpt.write_last_step = no
      }
   }
}
The family of stages here is desmond. This family includes minimize, simulate,
replica_exchange, and lambda_hopping stages. The settings in the desmond = {...}
block are effective for all subsequent stages belonging to the desmond family. If there is a
conflict between the settings in the set_family block and the explicit settings in a stage, the
latter take precedence. For example:
task {
   task = "desmond:fep"
   set_family = {
      desmond = {
         checkpt.wall_interval = 7200.0
         checkpt.write_last_step = no
      }
   }
}
simulate {
   checkpt.wall_interval = inf
}
The value of the checkpt.wall_interval parameter is set to inf, overriding the setting
checkpt.wall_interval = 7200.0 within the set_family block, because explicit
settings in the stage have higher precedence.
Valid stage family names include desmond and all stage names. If a stage name is used, the
family consists of all stages of that name. For example:
task {
   task = "desmond:fep"
   set_family = {
      simulate = {
         checkpt.wall_interval = 7200.0
         checkpt.write_last_step = no
      }
   }
}
Here all subsequent simulate stages will be affected, but minimize, replica_exchange,
and lambda_hopping stages will not.
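The precedence rules for set_family can be summarized with a small Python sketch. It is not the multisim implementation: the function name effective_settings is hypothetical, settings are represented as plain dictionaries, and the order in which the desmond family and a stage-name family are applied relative to each other is an assumption, because the text only states that explicit stage settings win.

```python
# Stages that belong to the desmond family, as listed in the text.
DESMOND_FAMILY = {"minimize", "simulate", "replica_exchange", "lambda_hopping"}

def effective_settings(stage_name, explicit, set_family):
    """Combine family settings with explicit stage settings.

    Later updates override earlier ones, so explicit settings always win.
    """
    merged = {}
    if stage_name in DESMOND_FAMILY:
        merged.update(set_family.get("desmond", {}))
    merged.update(set_family.get(stage_name, {}))  # family named after the stage
    merged.update(explicit)
    return merged

family = {"desmond": {"checkpt.wall_interval": "7200.0",
                      "checkpt.write_last_step": "no"}}
result = effective_settings("simulate", {"checkpt.wall_interval": "inf"}, family)
print(result)
```

In this sketch the simulate stage ends up with checkpt.wall_interval set to inf, while checkpt.write_last_step is inherited from the family block.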
Keyword Description
only_merge_ct Merge the specified CTs into a single CT, and ignore all other settings
for the stage. This is useful for building a full system CT from the sepa-
rate CTs. Valid value is a list of CT indices. Default: [].
preserve_box When building the geometry of the system, don't change the box size if
box information exists in the given solute CT. Valid values are true and
false. Default: false.
preserve_ffio When building the geometry of the system, do not delete the existing
ffio_ff blocks in the original CT. Valid values are true and false.
Default: true.
rebuild_cms Extract the component CTs from a full system CT. All other settings are
ignored. Valid values are true, false, or a block like:
{ membrane = ASL-expression | ""
solvent = ASL-expression | ""
}
If the value is not false, the component CTs are extracted from the full
system CT. The block value allows you to specify which atoms
belong to the membrane component CT and which belong to the solvent
component CT. Default: false.
solvent Solvent type. Allowed values are water, SPC, TIP3P, TIP4P, TIP4PEW,
methanol, octanol, DMSO. Values are case sensitive. water is a syn-
onym for SPC. Default: water.
Keyword Description
forcefield Specify the force field to use. Valid values are: OPLS_2005, CHARMM, AMBER,
amber03, amber99, amber94, amber96, amber99SB, amber99SB-ILDN,
charmm22nocmap, charmm36_lipids, charmm27, charmm32,
oplsaa_ions_Jensen_2006, oplsaa_impact_2001,
oplsaa_impact_2005.
These force fields are the ones available from viparr—see Chapter 8. Values are
case-sensitive. Default: OPLS_2005.
water Specify the water model to use. Valid values are: SPC, SPCE, TIP3P,
TIP3P_CHARMM, TIP4P, TIP4PEW, TIP4P2005, TIP5P, none. Default: SPC
humble Do not overwrite the existing ffio_ff block in the input. Valid values are true,
false. Default: false.
Table A.7. Keywords for the simulate, replica_exchange, and lambda_hopping stages.
Keyword Description
fep.model_file List of structure files, each for use with different lambda windows.
This keyword should only be used for FEP jobs.
jin_file List of auxiliary input files for the stage. Usually only needed when
custom plugins are used.
jout List of auxiliary output files for the stage. Usually only needed
when custom plugins are used.
The fep.model_file keyword makes it possible to use different input files for different
lambda windows. For instance:
task {
   task = "desmond:fep"
}
simulate {
   fep.model_file = ["file1.cms" "file2.cms"]
}
Here, the first two windows use file1.cms and file2.cms, respectively. The rest of the
windows use the default input file.
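The assignment of files to lambda windows can be written out explicitly. The following Python sketch only illustrates the rule described above; the function name input_for_windows, the file name default.cms, and the choice of 12 windows are hypothetical.

```python
def input_for_windows(n_windows, model_files, default_file):
    """Return the input file used by each lambda window, in window order."""
    return [model_files[i] if i < len(model_files) else default_file
            for i in range(n_windows)]

files = input_for_windows(12, ["file1.cms", "file2.cms"], "default.cms")
print(files)
```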
Metadynamics simulations can be run by including a meta block in the simulate stage. See
Section B.7 on page 141 for the syntax of this block. However, in the multisim file, you must
use an ASL expression for the center of mass of a set of atoms, not a list. For example,
atom = ['atom.num 2379 2380 2382 2384 2385' 'atom.num 631 642 651']
Keyword Description
spd_file The name of the solvate_pocket command file. If omitted, the default
settings are used.
spd_overwrite Block that provides settings in Ark syntax to overwrite specific commands
in the command file. All keywords other than spd_file and
ligand_file must be inside this block.
ligand_file The name of the ligand file used to define the region sampled. If the name
is set to an empty string or omitted, the name jobname-ligand.mae is
used. A ligand must be used to define the region.
Keyword Description
short_dist The closest acceptable approach for any two atoms. Default: 1.0 Å.
chemical_potential The chemical potential of water in kcal/mol. Default: -7.17 kcal/mol (for
TIP4P).
fep.lambda For FEP calculations, turn on the solvate_pocket calculation and spec-
ify the lambda schedule that is in use. This keyword should only be used
for an FEP job, and the schedule must be the same as for the FEP calcula-
tion itself (as specified in the config file). The value true specifies the
default:12 schedule, which is the default for FEP calculations. See
Appendix B for more information.
extern {
   command = "
import os
def main( current_stage, job ):
    os.system( 'ls' )
"
}
In the embedded Python code, you can import and use modules from your Schrödinger Python
installation. If you need extra modules, you can pass their file names to multisim by setting
the auxiliary_file parameter, and multisim transfers them to the scratch directory of the
master job at run time. For example:
extern {
   auxiliary_file = [mod1.py mod2.py]
   command = "
import os
import mod1
import mod2
def main( current_stage, job ):
    # does something with mod1 and mod2
    os.system( 'ls' )
"
}
The code given above is run once for each subjob in the previous stage. If you want to run it
only once, use command_once instead of command. For example:
extern {
   auxiliary_file = [mod1.py mod2.py]
   command_once = "
import os
import mod1
import mod2
def main( current_stage ):
    # does something with mod1 and mod2
    os.system( 'ls' )
"
}
For scripts that are stage-specific, your code must provide a main function that takes two argu-
ments for command or one for command_once. The first argument in both cases corresponds
to information for the current stage, while the second argument for command corresponds to
information from the previous stage.
For very simple scripts, your code for command or command_once does not need to provide a
main function. The following example removes a temporary file if it exists:
extern {
   command = "
import os
# Removes a temporary file.
if os.path.isfile( 'my_temporary_file' ):
    os.remove( 'my_temporary_file' )
"
}
Without the main function, multisim cannot pass the current stage and the current job objects
to the Python code, but these objects are presumed not to be needed for simple operations.
The extern stage is an advanced feature. Please do not hesitate to contact us for additional
information on its use. The keywords for this stage are listed in Table A.9.
Keyword Description
auxiliary_file List of files containing extra modules to be transferred to the runtime direc-
tory.
command Command to execute once for each subjob of the previous stage. The com-
mand specifies Python code that can span multiple lines.
command_once Command to execute once for the previous stage. The command specifies
Python code that can span multiple lines.
Keywords Description
meta.range For metadynamics simulations, specify the range for each collective
variable, in the format [[min1 max1] [min2 max2] ... ].
meta.nbin For metadynamics simulations, specify the number of bins for each col-
lective variable in the analysis.
plotter Specify the plotter to use to draw images for figures. Allowed values:
mplchart, gchart. mplchart generates a static .png file for each
figure. gchart generates a URL for each figure, and the URLs are saved in
the report file. When you open the report file in a browser, the figure
images are generated using Google chart APIs. The data for the figures
is sent to the Google web site to create the images.
Default: mplchart.
prob_profile Specify how to calculate the probability profiles for the time series given
by the time_series parameter. Example: prob_profile = [2 0
360 true], which means the bin size is 2, the data range is [0, 360),
and the range is periodic (true).
report Specify the name of the report file. Default: $JOBNAME_report.html.
SEA Block for specifying the configuration of the simulation event analysis.
This block can be the contents of an .st2 file, which can be written
from the Simulation Event Analysis panel.
time_series Specify the variables for which to analyze the time series. Example:
time_series = [[dihedral 1 2 3 4] [dihedral 5 6 7 8]],
which specifies two dihedral angles to analyze.
Keywords Description
The free energy can be plotted as different functions of the simulation time. The following
settings are added to control these new functionalities:
bennett = {
forward_time = {
name = "freeenergy_time"
begin = 100.0
end = inf
dt = 30.0
}
reversed_time = {
name = "freeenergy_rtime"
begin = 100.0
end = inf
dt = 30.0
}
sliding_time = {
name = "freeenergy_stime"
begin = 100.0
end = inf
dt = 30.0
window = 500.0
}
}
The forward_time, reversed_time, and sliding_time settings control calculations of
the free energy over different ranges of time.
The forward_time keyword provides the free energy as a function of elapsed simulation time.
Each free energy value is calculated from the trajectory fragment that starts at the time given
by the begin setting and ends at the time point in question. The dt keyword sets the interval
between consecutive estimates of the free energy (in this case, 30 ps).
The reversed_time keyword functions similarly to the forward_time keyword except that
the free energy is calculated from each time point to the end of the trajectory.
For the sliding_time keyword, the fragment has a fixed length, specified by the
window keyword (in this case, 500.0 ps). The beginning of the fragment slides along the
trajectory with the step size given by the dt keyword (30.0 ps in this example).
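The three time ranges can be pictured with a short Python sketch that lists the trajectory fragment used for each free energy estimate. This is only an illustration: the function names are hypothetical, the trajectory end time of 700 ps is an invented value standing in for end = inf, and the treatment of the end points is an assumption rather than a description of the multisim implementation.

```python
def time_points(begin, end, dt):
    """Times begin, begin + dt, ... that do not exceed `end` (all in ps)."""
    count = int((end - begin) / dt + 1e-9) + 1
    return [begin + k * dt for k in range(count)]

def fragments(mode, begin, end, dt, window=None):
    """Return the (start, stop) trajectory fragment for each estimate."""
    points = time_points(begin, end, dt)
    if mode == "forward_time":
        # Every fragment starts at `begin` and grows by dt.
        return [(begin, t) for t in points if t > begin]
    if mode == "reversed_time":
        # Every fragment runs from a time point to the end of the trajectory.
        return [(t, end) for t in points if t < end]
    if mode == "sliding_time":
        # Fixed-length fragments whose start slides along in steps of dt.
        return [(t, t + window) for t in points if t + window <= end]
    raise ValueError("unknown mode: " + mode)

trajectory_end = 700.0  # hypothetical trajectory length in ps
forward = fragments("forward_time", 100.0, trajectory_end, 30.0)
reverse = fragments("reversed_time", 100.0, trajectory_end, 30.0)
sliding = fragments("sliding_time", 100.0, trajectory_end, 30.0, window=500.0)
print(forward[0], reverse[0], sliding)
```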
trajectory = false
checkpt = off
maeff_output = off
eneseq.interval = 0.0
Setting eneseq.interval to zero means that the energy is computed for every frame of the
trajectory.
task {
task = "desmond:auto"
}
minimize {
# relaxation stage
}
simulate {
# relaxation stage
}
simulate {
# production stage
}
trim {
save = [-1]
}
The save keyword is the only trim-specific keyword and specifies the stages for which the output
files should be kept. Its value is a list of integers, which can be either positive or negative.
A positive integer is a stage index; a negative integer is relative to the trim stage, that is,
it counts stages backwards from the trim stage. In this example, -1 means the first stage before
this trim stage, which is the production stage (the 4th stage). The trim stage therefore
removes all .tgz files that were generated by the earlier stages except for the one generated by
the production stage.
The following setting for the above example saves the .tgz files generated by the 3rd and 4th
stages (the two stages before the trim stage):
save = [2 3 -1]
Note that the task counts as a stage.
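The way a negative save entry is counted back from the trim stage can be sketched in Python as follows. The function name is hypothetical, and the sketch only translates negative entries; positive entries are passed through unchanged because their interpretation is left to multisim.

```python
def resolve_saved_stages(save, trim_stage):
    """Translate negative `save` entries into stage numbers.

    `trim_stage` is the number of the trim stage itself, counting the task
    block as a stage. Positive entries are returned unchanged.
    """
    return sorted({s if s > 0 else trim_stage + s for s in save})

# Blocks of the example .msj file, in order.
stages = ["task", "minimize", "simulate", "simulate", "trim"]
trim_stage = len(stages)  # the trim stage is the 5th block
print(resolve_saved_stages([-1], trim_stage))  # [4], the production stage
```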
Appendix B: The Configuration File
The configuration file (hereafter referred to as a config file) describes the calculation
that the Desmond back end should perform. Starting with the 2010 release, there are two
styles of config files: a user-facing “front-end” style that is designed to be easy to work
with, and a “back-end” style, which provides a very detailed and specific description
of the calculation to the back end itself.
For most Desmond jobs, a CMS file and a front-end config file are specified on the command
line. The software then automatically translates the front-end config file into a back-end config
file, taking into account the nature of the system in the CMS file. The appropriate Desmond
back-end executable for the job (e.g. mdsim, minimizer) is determined automatically from the
contents of the front-end config file. This appendix focuses on the syntax of the front-end
config file. For information on the back-end config file, see the Desmond User’s Guide.
The Ark syntax from D. E. Shaw Research is used for both the back-end and front-end config
files. The definitive description of the Ark format is available in the Desmond User’s Guide.
This format is also used by multisim (.msj) files. Below is a brief recap of the Ark syntax,
followed by detailed documentation of the parameters in the front-end config file.
At the most basic level the syntax is the familiar key-value pair syntax of the form:
key = value
where key must be a non-empty string, and value must be one of the following three objects:
atom, list, or map. A value of the atom type is one of the following: integer, floating-point
number, string, boolean (true, false, yes, no, on, off are recognized as boolean values), or
none (? is recognized as none). A value of the list type has the form [ value1 value2 ... ], in
other words a list of one or more values enclosed in brackets [ ], where the elements value1,
value2, ... are values of any type and are separated by spaces. Map type values generally look
like:
{ key1 = value1
key2 = value2
...
}
where the value is wrapped with braces { }, and elements in the map are key-value pairs.
The form described above is the canonical form of the syntax. Here is an example of this syntax
with nested values:
a = {
b = {
c = [1 2 4]
}
d = string_value
}
Ark also supports the so-called “pathname” syntax, where nested keys are separated by a
period. The pathname equivalent of the canonical example given above is:
a.b.c = [1 2 4]
a.d = string_value
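The relationship between the canonical form and the pathname form can be illustrated with a small Python sketch that flattens a nested dictionary into pathname lines. The function names are hypothetical, and this is not an Ark parser or writer: it does not quote strings that contain spaces and does not handle maps nested inside lists.

```python
def format_value(value):
    """Render a Python value in Ark-like notation (lists are space separated)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if value is None:
        return "?"
    if isinstance(value, (list, tuple)):
        return "[" + " ".join(format_value(v) for v in value) + "]"
    return str(value)

def to_pathnames(mapping, prefix=""):
    """Flatten nested dicts into 'a.b.c = value' lines."""
    lines = []
    for key, value in mapping.items():
        path = prefix + key
        if isinstance(value, dict):
            lines.extend(to_pathnames(value, path + "."))
        else:
            lines.append(path + " = " + format_value(value))
    return lines

# The nested example from the text, written as a Python dictionary.
config = {"a": {"b": {"c": [1, 2, 4]}, "d": "string_value"}}
print("\n".join(to_pathnames(config)))
```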
B.2 Units
The units used in the configuration file are:
• time: ps
• distance: angstroms
• energy: kcal/mol
• pressure: bar
• surface tension: bar angstroms
• temperature: kelvin
Time can be specified as a positive real number or as the string inf, meaning infinity, or
“never”.
Appendix B: The Configuration File
Optional string fragments that contain macros can also be defined. An optional fragment is a
portion of a string enclosed by $[ and $]. Such a fragment, together with the $[ and $] char-
acters, is deleted if there are one or more $ characters within the fragment. If there is no $ char-
acter within the fragment, only the $[ and $] are deleted. This operation is performed after
macro expansion. To see its effect, consider the following example.
name = "$JOBNAME$[_replica$REPLICA$].dE"
If we have defined the values of $JOBNAME and $REPLICA to be myjob and 0, respectively,
then the actual value for name will be myjob_replica0.dE. If we do not give a value to
$REPLICA, macro expansion yields myjob$[_replica$REPLICA$].dE. Because there is a
$ within the fragment, this fragment is deleted, so that the final value for name is myjob.dE.
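The two-step rule (macro expansion first, then treatment of optional fragments) can be sketched in Python with regular expressions. The function name and the dictionary of macro values are assumptions for this example, and nested optional fragments are not handled; the sketch only reproduces the behavior described above.

```python
import re

MACRO = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")
FRAGMENT = re.compile(r"\$\[(.*?)\$\]")

def expand(template, macros):
    """Expand $NAME macros, then resolve $[ ... $] optional fragments."""
    # Step 1: substitute defined macros; undefined ones are left untouched.
    text = MACRO.sub(lambda m: str(macros.get(m.group(1), m.group(0))), template)
    # Step 2: drop fragments that still contain a '$'; otherwise keep the
    # fragment text and remove only the $[ and $] delimiters.
    return FRAGMENT.sub(lambda m: "" if "$" in m.group(1) else m.group(1), text)

template = "$JOBNAME$[_replica$REPLICA$].dE"
print(expand(template, {"JOBNAME": "myjob", "REPLICA": 0}))  # myjob_replica0.dE
print(expand(template, {"JOBNAME": "myjob"}))                # myjob.dE
```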
B.4.1 fep
Common usage:
fep = {
lambda = "default:12"
i_window = ?
output = {
name = "$JOBNAME$[_replica$REPLICA$].dE"
first = 0.0
interval = 1.2
}
}
B.4.1.1 fep.lambda
This parameter sets the lambda schedule for FEP calculations. You can set it to a string like
default:12, which means “use the default scheme with 12 windows”. To use 20 windows
with the default scheme, set the parameter to default:20. Available lambda schedule
schemes are: default, quickcharge, and superquickcharge. You can define your own
schedule. For instance, for an alchemical free energy calculation, the following syntax can be
used:
fep.lambda = {
bondedA = [1.0 0.9 ...]
bondedB = [0.0 0.1 ...]
chargeA = [1.0 0.75 ...]
chargeB = [0.0 0.25 ...]
vdwA = [1.0 1.0 ...]
vdwB = [0.0 0.0 ...]
}
Here, A refers to the original molecule while B refers to the mutated molecule, and the ellipsis
represents additional values that you would explicitly include. The number of values (lambda
stages) must be consistent in all 6 lists.
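Before launching a job with a custom schedule, you might check the consistency requirement with a few lines of Python. This is a sketch only: the function name is hypothetical, and the three-window schedule below is an invented toy example, not a recommended schedule.

```python
def count_lambda_windows(schedule):
    """Return the number of lambda windows, or raise if the lists disagree."""
    lengths = {name: len(values) for name, values in schedule.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError("inconsistent lambda lists: %s" % lengths)
    return next(iter(lengths.values()))

# Hypothetical three-window schedule, for illustration only.
schedule = {
    "bondedA": [1.0, 0.5, 0.0],
    "bondedB": [0.0, 0.5, 1.0],
    "chargeA": [1.0, 0.0, 0.0],
    "chargeB": [0.0, 0.0, 1.0],
    "vdwA": [1.0, 1.0, 0.0],
    "vdwB": [0.0, 1.0, 1.0],
}
print(count_lambda_windows(schedule))  # 3
```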
Similarly for a total free energy calculation the following syntax may be used:
fep.lambda = {
coulomb = [0.0 0.1 ...]
vdw = [1.0 1.0 ...]
}
B.4.1.2 fep.i_window
This parameter is automatically set by multisim to the index of the lambda window. You do
not normally set or adjust this parameter.
The default 12 step lambda schedule for relative free energy calculations is:
lambda = {
bondedA = [1.0 0.916666666667 0.833333333333 0.75 0.666666666667 0.583333333333
0.416666666667 0.333333333333 0.25 0.166666666667 0.0833333333333 0.0]
bondedB = [0.0 0.0833333333333 0.166666666667 0.25 0.333333333333 0.416666666667
0.583333333333 0.666666666667 0.75 0.833333333333 0.916666666667 1.0]
chargeA = [1.0 0.75 0.5 0.25 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0]
chargeB = [0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.25 0.5 0.75 1.0]
vdwA = [1.0 1.0 1.0 1.0 1.0 0.67 0.46 0.33 0.25 0.19 0.12 0.0]
vdwB = [0.0 0.12 0.19 0.25 0.33 0.46 0.67 1.0 1.0 1.0 1.0 1.0]
}
The default 12 step lambda schedule for absolute free energy calculations is:
lambda = {
coulomb = [0.0 0.118514957434 0.189778434985 0.24741405481 0.325045439067
0.456296209913 0.674789989504 1.0 1.0 1.0 1.0 1.0 ]
vdw = [1.0 1.0 1.0 1.0 1.0 0.674789989504 0.456296209913 0.325045439067
0.24741405481 0.189778434985 0.118514957434 0.0 ]
}
B.4.2 cutoff_radius
This parameter sets the cutoff radius for the non-bonded interactions. If the particle mesh
Ewald (PME) method is used for electrostatic interactions, this sets the cutoff radius for the
real space part of the electrostatic interaction calculations.
cutoff_radius = 9.0
B.4.3 taper
The taper parameter is used to specify how interactions are truncated at the cutoff. For
example,
taper = off
turns off any special treatment of interactions at the cutoff (i.e. they are just truncated). There
are several ways to turn on tapering. The simplest way is to set the taper parameter to on, and
the default, shift, tapering scheme will be used. In this scheme the tapering consists of
shifting the potential to 0 at the cutoff_radius. More elaborate tapering schemes, requiring
more detailed information, are supported:
taper = {
method = potential
width = 1.0
}
where the method parameter specifies the type of tapering and has one of the following values:
potential, c1switch, c2switch, or shift. The width parameter controls the range of
distances, between (cutoff_radius - width) and cutoff_radius, over which tapering is
applied. For more information on tapering methods, see the Desmond User’s Guide.
B.4.4 coulomb_method
This parameter sets the method for treating Coulombic interactions in the simulation. For
example,
coulomb_method = pme
requests that the particle mesh Ewald (PME) method be used.
The computational accuracy of the PME method can be controlled by the so-called Ewald
tolerance. The smaller the tolerance is, the more accurate but slower the computation is. The
default value of the tolerance is 1E-9, which is very accurate for most simulations. If you want
to use a different tolerance value, set the coulomb_method parameter explicitly in the
following form:
B.4.5 temperature
This parameter sets the temperature of the system. For instance:
temperature = 300.0
requests that a temperature of 300K be used. For multiple thermostats, you must explicitly set
the temperature for each of the thermostats using the following syntax:
For simulated annealing jobs, the value of the temperature parameter specifies the temperature
schedule of the annealing process in the following form:
B.4.6 annealing
The annealing parameter indicates whether this is a simulated annealing simulation, and takes
values of off (not a simulated annealing simulation) or on. If annealing is set to on then the
temperature parameter should be set to an appropriate temperature schedule.
B.4.7 pressure
The target pressure for simulations involving a barostat can be set as follows:
pressure = 1.01325
The default coupling scheme of the barostat to the system is isotropic. Use the following
format to use another coupling scheme:
B.4.8 surface_tension
This parameter sets the surface tension, e.g.
surface_tension = 4E3
It is ignored unless the ensemble parameter is set to NPgT (or the ensemble.method param-
eter is set to NPgT).
B.4.9 ensemble
This parameter is used to select the ensemble to use for the simulation. For instance,
ensemble = NPT
requests that the simulation be done in the constant pressure / constant temperature ensemble
with a constant number of particles (atoms).
In the simplest form, you can set the parameter to different ensemble classes, and the default
setup for that class is used. Valid values for ensemble class are the following: NPT, NVT, NVE,
NPgT, NPAT, NPT_Ber, NVT_Ber, NPT_L, NVT_L. In all of these ensembles the number of
particles is constant. NPgT represents the ensemble where the pressure, temperature, and
surface tension (x-y plane) are also held constant. In the NPAT ensemble, in addition to the
number of particles, the pressure, temperature, and area of the system (in the xy plane) are held
constant.
The ensembles represented by NPT, NVT, NPgT, and NPAT are controlled by a Nosé-Hoover-
based algorithm. Those ending with _Ber or _L use alternative algorithms, namely those based
upon the work of Berendsen or Langevin.
The ensemble can be described in more detail using the following form:
ensemble = {
class = NPT
method = MTK
thermostat.tau = 1.0
barostat.tau = 2.0
}
where the valid values for class are: NPT, NPgT, NPAT, NVT, and for method are: MTK, NH,
Berendsen, Langevin. The thermostat.tau and barostat.tau parameters set the relax-
ation times for the thermostat and barostat, respectively.
For the NVE ensemble, the ensemble parameter can be set as:
ensemble = NVE
or as:
ensemble.class = NVE
B.4.10 time
The time parameter sets the total simulation time, e.g.,
time = 1200.0
B.4.11 elapsed_time
The elapsed_time parameter controls the starting time for the simulation. Normally this is 0.
However, if the simulation is a continuation of an earlier one the current time can be set using
this parameter, e.g.
elapsed_time = 10.0
B.4.12 timestep
The timestep parameter specifies values for the bonded, near, and far time steps,
respectively, e.g.,
B.4.13 cpu
The cpu parameter specifies the total number of processors and, optionally, the domain-
decomposition of the system in the x, y, and z directions. For instance,
cpu = [1 2 4]
specifies that 1 × 2 × 4 = 8 processors should be used and that the system should be decomposed
into 1, 2, and 4 domains in the x, y, and z directions. Alternatively, you can set this parameter
to a single integer value, which specifies the total number of CPUs, and then the decomposition
is automatically done based on the shape of the simulation box for the system. For example:
cpu = 8
sets the total number of CPUs to 8. The values specified for the domain decomposition must be
powers of 2, 3, or 5, or products of these powers.
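The restriction on the domain decomposition values can be checked with a short Python sketch. The function names are hypothetical, and treating 1 as an allowed value follows from the cpu = [1 2 4] example above.

```python
def is_allowed_domain_count(n):
    """True if n is a product of powers of 2, 3, and 5 (1 counts as 2**0)."""
    if n < 1:
        return False
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1

def total_processors(cpu):
    """Total processor count for cpu = [nx ny nz] or for a single integer."""
    if isinstance(cpu, int):
        return cpu
    nx, ny, nz = cpu
    for n in (nx, ny, nz):
        if not is_allowed_domain_count(n):
            raise ValueError("%d is not a product of powers of 2, 3, and 5" % n)
    return nx * ny * nz

print(total_processors([1, 2, 4]))  # 8
print(is_allowed_domain_count(7))   # False
```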
B.4.14 glue
Glue tries to ensure that the closest images for closely associated molecules are recorded in the
output cms and trajectory files. This parameter specifies the molecule set. Only one value is
supported at present: solute. The parameter can be set with:
glue = solute
B.4.15 trajectory
The trajectory parameter is used to control how the trajectory is written.
trajectory = {
name = "$JOBNAME$[_replica$REPLICA$]_trj"
first = 0.0
interval = 4.8
periodicfix = true
write_velocity = true
frames_per_file = 25
}
The periodicfix parameter, if set to true, instructs Desmond to wrap molecules as a unit
within the periodic boundary conditions rather than to wrap individual atoms (potentially split-
ting molecules across the unit cell in the recorded trajectory). The first parameter is used to
indicate at what time to start recording the trajectory, while interval controls how often to
write trajectory frames. The frames_per_file parameter is used to control the number of
frames written to each file in the trajectory. Large numbers of files can cause very slow
copying and reading of the trajectory, and can even cause failure. You should set this parameter
so that the total number of files is small enough for the file system to handle efficiently (no
more than a few thousand). Having fewer files also reduces the I/O overhead when reading the files.
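To judge whether a frames_per_file value keeps the number of files manageable, you can estimate the file count from the simulation time and the recording interval. The Python sketch below assumes that one frame is written at the time given by first and one every interval ps thereafter, up to the total simulation time; this frame-counting convention and the function name are assumptions, not a statement of how Desmond counts frames.

```python
import math

def trajectory_file_count(time, first, interval, frames_per_file):
    """Estimate (frames, files) for a trajectory, with all times in ps."""
    # Assumption: one frame at `first` and one every `interval` ps up to `time`.
    # The small offset guards against floating-point rounding in the division.
    frames = int((time - first) / interval + 1e-9) + 1
    return frames, math.ceil(frames / frames_per_file)

# time = 1200.0 with the trajectory settings from the example above.
print(trajectory_file_count(1200.0, 0.0, 4.8, 25))  # (251, 11)
```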
B.4.16 eneseq
The eneseq parameter controls when to start recording the .ene file (first), how often to
update it (interval), and the precision of the energies (precision). The minimum precision
is 8 figures.
eneseq = {
name = "$JOBNAME$[_replica$REPLICA$].ene"
first = 0.0
interval = 1.2
precision = 9
}
B.4.17 checkpt
The checkpt parameter specifies the name of the checkpoint .cpt file, when to start
recording (first) and how often to overwrite it (interval).
checkpt = {
name = "$JOBNAME.cpt"
first = 0.0
interval = 240.0
write_last_step = yes
}
To have Desmond write a .cpt file periodically at a fixed wall-clock time interval, set
checkpt in this form:
checkpt = {
name = "$JOBNAME.cpt"
wall_interval = 3600.0
write_last_step = yes
}
where wall_interval sets the wall-clock time interval in seconds. You can turn off wall-clock
checkpointing by either setting the parameter to inf (wall_interval = inf) or omitting it.
If the wall_interval parameter is set to a finite value, the first and interval parameters
have no effect.
You can turn off recording the .cpt file altogether using:
checkpt = off
B.4.18 maeff_output
The maeff_output parameter specifies the name of the output .cms file, when to start
recording this file (first), how often to overwrite it (interval), and the number of signifi-
cant figures given in floating-point numbers (precision). It also specifies the name of the
.idx file (used for recording the trajectory name).
maeff_output = {
name = "$JOBNAME$[_replica$REPLICA$]-out.cms"
trjidx = "$JOBNAME$[_replica$REPLICA$]-out.idx"
first = 0.0
interval = 120.0
precision = 8
}
B.4.19 randomize_velocity
Sometimes it is useful to randomize the velocities of the atoms at the start of or during a simu-
lation. The randomize_velocity parameter specifies when to start randomizing velocities
(first), how often to do it (interval) and the random number seed used for generating the
velocities (seed). The temperature parameter sets the target temperature for the randomized
velocities.
randomize_velocity = {
first = 0.0
interval = inf
seed = 2007
temperature = '@*.temperature'
}
The value '@*.temperature' refers to the temperature parameter in the parent map of the
randomize_velocity parameter. The @ symbol is the referencing operator and * denotes the
parent map of the current map, so *.temperature means the temperature parameter in the
parent map. This syntax is general: any parameter can be set by reference in this way.
For example:
randomize_velocity.first = '@*.trajectory.first'
sets randomize_velocity.first to the value of trajectory.first. Circular references
(which are possible syntactically) are caught during the initial job-launch processing.
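The referencing rule, including the detection of circular references, can be sketched in Python on a nested dictionary. This is an illustration under stated assumptions: the function name is hypothetical, only references of the form '@*.' followed by a pathname are handled, and the sketch does not reproduce how the job-launch processing actually resolves references.

```python
def resolve(config, path, _seen=()):
    """Return the value stored at `path`, following '@*.' references.

    `path` is a tuple of keys, e.g. ("randomize_velocity", "temperature").
    """
    if path in _seen:
        raise ValueError("circular reference at " + ".".join(path))
    value = config
    for key in path:
        value = value[key]
    if isinstance(value, str) and value.startswith("@*."):
        # path[:-1] is the map that holds the key; path[:-2] is its parent map.
        target = path[:-2] + tuple(value[3:].split("."))
        return resolve(config, target, _seen + (path,))
    return value

config = {
    "temperature": 300.0,
    "trajectory": {"first": 0.0},
    "randomize_velocity": {
        "first": "@*.trajectory.first",
        "temperature": "@*.temperature",
    },
}
print(resolve(config, ("randomize_velocity", "temperature")))  # 300.0
print(resolve(config, ("randomize_velocity", "first")))        # 0.0
```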
B.4.20 simbox_output
The simbox parameter controls how Desmond records information about the simulation box.
This parameter specifies the name of the file, when to start recording the file (first) and how
often to update it (interval).
simbox = {
name = "$JOBNAME$[_replica$REPLICA$]_simbox.dat"
first = 0.0
interval = 1.2
}
B.4.21 energy_group
Desmond can record a break-down of the energy components during the simulation. The
energy_group parameter controls this functionality and specifies the name, time to start
recording (first) and frequency to record (interval) this information, e.g.,
energy_group = {
name = "$JOBNAME$[_replica$REPLICA$]_enegrp.dat"
first = 0.0
interval = 1.2
self_energy = false
corr_energy = true
}
Setting corr_energy to true allows the correction energy to be printed to the output file.
Setting self_energy to true includes the self-energy term of the Ewald summation in the
correction energy.
B.4.22 backend
The descriptions above handle most (if not all) of the parameters that you would typically use
with Desmond. However, if additional parameters available in the back-end config file are
needed then the backend parameter can be used to specify them in a front-end config file.
For example:
backend = {
force.nonbonded.r_lazy = 12.0
}
adds force.nonbonded.r_lazy = 12.0 to the back-end config file.
Settings within the backend maps have the highest precedence and thus are honored uncondi-
tionally. Incorrect settings within this map are not detected by the driver.
B.5.1 max_steps
The max_steps parameter specifies the maximum number of steps (or iterations) that the
Desmond minimizer runs, e.g.
max_steps = 2000
B.5.2 convergence
The convergence parameter sets the convergence criterion in kcal/(mol angstrom), e.g.,
convergence = 1.0
B.5.3 steepest_descent_steps
It is often good to use a few steepest descent minimization steps prior to using more elaborate
minimization approaches. The keyword steepest_descent_steps sets the number of
steepest descent steps to use, e.g.,
steepest_descent_steps = 10
After these steps, the LBFGS method is used to further minimize the system.
B.5.4 num_vector
This parameter sets the number of vectors to use for the LBFGS method, e.g.
num_vector = 3
B.6.1 replica
The replica parameter sets the simulation configuration for each replica. There are two
forms for setting this parameter. The first is for replica exchange calculations in which the
temperature is set in each replica for all atoms in the system. Here, the replica parameter is a
list, and each element of the list is a map value, which in turn specifies the simulation configu-
ration for the corresponding replica. This example
Sometimes, different CMS files are used for different replicas. The CMS files can be specified
in this form:
replica = {
generator = solute_tempering
atom = "asl:atom.num 1-128"
temperature = [300.0 400.0 500.0 600.0]
}
The generator keyword specifies that solute tempering is used to generate the replicas. The
atom keyword specifies the atoms involved, and the temperature list provides the tempera-
tures to be used.
B.7.1 meta
The meta block is used to set up a metadynamics simulation. The parameters that apply to all
collective variables (CVs) are the height of the repulsive Gaussian potential, in kcal/mol, the
time in ps at which they are first added, and the interval in ps at which they are subsequently
added. The collective variables are defined by the cv keyword.
meta = {
cv = cv-map
cv_name = $JOBNAME.cvseq
first = 0.0
height = 0.03
interval = 1.2
name = kernels.kerseq
}
Two output files are defined: cv_name sets the name of the metadynamics log file (default
$JOBNAME.cvseq), and name sets the name of the metadynamics output file, which contains
the Gaussian widths and height at each step, and is used to calculate the free energy. You can
specify the time of the last addition with the last keyword; the default is the simulation time.
B.7.2 cv
The cv parameter defines the collective variables in a metadynamics simulation. Each variable
is defined in a map that includes the type (dist, angle, dihedral, rmsd), the atoms that are
in the collective variable, and the RMS width of the repulsive Gaussian potential. For distance
variables, a wall can be placed at a specified distance, which prevents the system from moving
too far in the direction defined by the collective variable. For the analysis, the range of coordi-
nate values can be defined. This is useful if the interesting phenomena occur at the end of the
range, e.g. at 180° for a dihedral angle. An example with two CVs is as follows:
cv = [ { type = dist
atom = [1 3]
width = 0.4
wall = 10.0 }
{ type = angle
atom = [1 3 5 6]
width = 0.4
range = [0 360] } ]
To define a collective variable in terms of the center of mass of a set of atoms, you can provide
a list of atoms inside the atom list, e.g.
If you define the collective variable in the config file, you can only use a list of atoms to define
the center of mass. If you want to use ASL to define the center of mass, you can do so in a
meta block in the .msj file.
B.8.1 vrun_frameset
The vrun_frameset parameter is used to set the path of the MD simulation trajectory that
should be analyzed, e.g.,
vrun_frameset = /home/joeuser/Desmond_files/my_study_trj
There is no default value for this parameter.
Appendix C: Analyzing a Simulation from the Command Line
This appendix documents the command line usage of two related Python scripts and the syntax
of two file formats used for analysis of simulation quality. The two Python scripts are
simulation_block_data.py and simulation_block_test.py, and are located in the
directory $SCHRODINGER/mmshare-vversion/lib/Linux-x86/lib/python2.6/site-
packages/schrodinger/application/desmond/. These scripts can be run with the
$SCHRODINGER/run command.
C.1 simulation_block_data.py
This command determines simulation properties from the input .log file and block averages
from the input .ene file, and writes the results to the output .sba file. The syntax is as
follows:
Option Description
C.2 simulation_block_test.py
This command evaluates the results in an input .sba file using tests specified in an .sbt file.
The output file is a plain text file that prints the job details block from the .sba file and then
provides information on whether block averages are within the tests specified in the .sbt file.
The syntax is as follows:
The header block contains the information on the energy and log files and has the following
format:
Version: version
Energy_File: enefile
Log_File: logfile
The job details block contains the information from the log file about the simulation, such as
the status of the simulation, number of atoms, ensemble, and so on. An example block is
shown below:
Block: Job_Details
Status = Normal
Temperature = 300.0
Job_name = rin
Degrees_of_freedom = 103139
Molecules = 3
Duration = 1.2
Atoms = 50274
Ensemble = MTK_NPT
End_Block
Subsequent data blocks printed in the .sba file are block averages enclosed between Block
and End_Block lines. A sample data block is shown below:
Block: E
Time(ps) E(kcal/mol)
5.0 5.0
10.0 4.9
End_Block
The first row is a heading that indicates what quantities are listed below, and it is followed by
rows of values. This block indicates that the block average of E for data points up to the first
5 ps was 5.0 kcal/mol, and for the next 5 ps (5 ps to 10 ps) it was 4.9 kcal/mol.
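If you want to post-process block averages yourself, a data block of this layout is straightforward to read. The Python sketch below parses tabular blocks from text in the format shown above; the function name is hypothetical, and blocks that contain key = value lines (such as Job_Details) are skipped rather than interpreted.

```python
def parse_sba_blocks(text):
    """Return {block_name: (column_names, rows)} for tabular data blocks."""
    blocks = {}
    name, lines = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Block:"):
            name, lines = line.split(":", 1)[1].strip(), []
        elif line == "End_Block" and name is not None:
            # Skip empty blocks and key = value blocks such as Job_Details.
            if lines and "=" not in lines[0]:
                header = lines[0].split()
                rows = [tuple(float(x) for x in row.split()) for row in lines[1:]]
                blocks[name] = (header, rows)
            name = None
        elif name is not None and line:
            lines.append(line)
    return blocks

sample = """Block: E
Time(ps) E(kcal/mol)
5.0 5.0
10.0 4.9
End_Block
"""
blocks = parse_sba_blocks(sample)
print(blocks["E"])
```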
E {
sd = 5.0
slope = 4.3
average = -435320.2
average_tol = 4000.0
}
This block indicates that the block values for the property called E in an input .sba file should
have the following properties:
References
1. Bowers, K.J.; Chow, E.; Xu, H.; Dror, R. O.; Eastwood, M. P.; Gregerson, B. A.;
Klepeis, J. L.; Kolossvary, I.; Moraes, M. A.; Sacerdoti, F. D.; Salmon, J. K.; Shan, Y.;
Shaw, D. E. Scalable Algorithms for Molecular Dynamics Simulations on Commodity
Clusters, Proceedings of the ACM/IEEE Conference on Supercomputing(SC06),
Tampa, Florida, November 11-17, 2006, http://sc06.supercomputing.org/schedule/
event_detail.php?evid=9088.
2. Shaw, D.E. A fast, scalable method for the parallel evaluation of distance-limited pair
wise particle interactions. J. Comput. Chem. 2005, 26, 1318.
3. Bowers, K.J.; Dror, R.O.; Shaw, D.E. The midpoint method for parallelization of
particle simulations. J. Chem. Phys. 2006, 124, 184109.
4. Bowers, K.J.; Dror, R.O.; Shaw, D.E. Zonal methods for the parallel execution of range-
limited N-body simulations. J. Comput. Phys. 2007, 221, 303.
5. Lippert, R.A.; Bowers, K.J.; Dror, R. O.; Eastwood, M.P.; Gregersen, B. A.; Klepeis, J.
L.; Kolossvary, I.; Shaw, D. E. A common, avoidable source of error in molecular
dynamics integrators. J. Chem. Phys. 2007, 126, 046101.
6. Arkin, I.T.; et al. Mechanism of Na+/H+ Antiporting. Science, 2007, 317, 799.
7. Humphrey, W.; Dalke, A.; Schulten, K. VMD - Visual Molecular Dynamics, J. Molec.
Graphics, 1996, 14, 33.
8. Cornell, W.D.; Cieplak P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.;
Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollmann, P. A. J. Am. Chem. Soc. 1995,
117, 5179. Parameters converted from those at http://amber.scripps.edu/
amber9.ffparms.tar.gz
9. Kollman, P. A. Acc. Chem. Res. 1996, 29, 461. Parameters converted from those at http:/
/amber.scripps.edu/amber9.ffparms.tar.gz
10. Wang, J.; Cieplak, P.; Kollman, P. J. Comput. Chem. 2000, 21, 1049. Parameters
converted from those at http://amber.scripps.edu/amber9.ffparms.tar.gz
11. Hornak et al. Proteins: Structure, Function & Genetics, 2006, 3, 712.
12. Shaw, D. E.; Maragakis, P.; Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Eastwood, M.
P.; Bank, J. A.; Jumper, J. M.; Salmon, J. K.; Shan Y.; Wriggers, W. Science 2010, 330,
341.
13. Duan, Y.; Wu, C. Chowdhury, S; Lee, M. C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak,
P.; Luo, R.; Lee, T.; Caldwell, J.; Wang, J.; Kollman, P. J. Comput. Chem. 2003, 24,
1999. Parameters converted from those at http://amber.scripps.edu/
amber9.ffparms.tar.gz. Bugfix from http://amber.scripps.edu/bugfixes/9.0/bugix.5
applied to correct torsional assignments.
14. Parameters generated from http://mackerell.umaryland.edu/CHARMM_ff_params_files
/toppar/toppar_c35b2_c36a2.tar.gz.
15. Beglov, D.; Roux, B. J. Chem. Phys. 1994, 100, 9050 (ions).
16. MacKerell, A. D., Jr., et al. J. Phys. Chem. B. 1998, 102, 3586 (proteins).
17. MacKerell, Jr. A. D.; Feig M.; Brooks, A. D., III J. Comput. Chem. 2004, 25, 1400
(protein CMAP). Missing CMAP term applied to protonated HIS.
18. Foloppe N.; MacKerell, A. D., Jr. J. Comp. Chem. 2000, 21, 86 (nucleic acids).
19. MacKerell, A. D., Jr. Banavali, N. K. J. Comp. Chem. 2000, 21, 105 (nucleic acids).
20. Feller, S. E.; MacKerell, A. D., Jr. J. Phys. Chem. B. 2000, 104, 7510 (lipids).
21. Feller, S. E.; Gawrisch, K.; MacKerell, A. D., Jr. J. Am. Chem. Soc. 2002, 124, 318
(lipids).
22. Klauda, J. B.; Brooks, B. R.; MacKerell, A. D., Jr. J. Phys. Chem. B, 2005, 109, 5300
(alkanes/lipids).
23. Klauda, J. B.; Venable, R. M.; Freites, J. A.; O’Connor, J. W.; Tobias, D. J.; Mondragon-
Ramirez, C.; Vorobyov, I.; MacKerell, A. D., Jr.; Pastor, R. W. 2010, 114, 7830.
24. Jorgensen, W. L.; Maxwell, D. S.; Tirado-Rives, J., J. Am Chem. Soc. 1996, 118, 11225.
25. Damm, W.; Frontera, A.; Tirado-Rives, J.; Jorgensen, W. L. J. Comput. Chem. 1997, 18,
1955.
26. Jorgensen, W. L., et al. Theochem. 1998, 424, 145.
27. McDonald, N. A., Jorgensen, W. L. J. Phys. Chem. B. 1998, 102, 8049.
28. Rizzo, R. C.; Jorgensen, W. L. J. Am. Chem. Soc. 1999, 121, 4827.
29. Watkins, E. K.; Jorgensen, W. L., J. Phys. Chem. A. 2001, 205, 4118.
30. Kaminski, G. A.; Friesner, R. A.; Tirado-Rives, J.; Jorgensen, W. L. J. Phys. Chem. B
2001, 105, 6474.
31. OPLSAA/L reparameterization, version 1 torsions used for SER, version 1 for ASP,
version 3 (combined) for LEU, VAL, Jacobson, M.P., et al. J. Phys. Chem. B. 2002, 106,
11673. The charges for HISE used in this force field have not been published to our
knowledge.
32. Jensen, K. P.; Jorgensen, W. L. J. Chem. Theory Comput. 2006, 2, 1499.
33. Jacobson, M.P.; Kaminski, G. A.; Friesner, R. A.; Rapp, C. S. J. Phys. Chem. B. 2002,
106, 11673.
34. Berendsen, H. J. C. et al. in Intermolecular Forces, edited by B. Pullman (Reidel,
Dordrecht,1981), p. 331.
35. Berendsen, H. J. C.; Grigera, J. R.; Straatsma, T. P. J. Phys. Chem. 1987, 91, 6269.
36. Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J. Chem.
Phys. 1983, 79, 926. Parameters as tabulated in Mahoney, M. W.; Jorgensen, W. L. J.
Chem. Phys. 2000, 112, 8910.
37. Neria, E.; Fischer, S.; Karplus, M. J. Chem. Phys. 1996, 105, 1902.
38. Jorgensen, W. L.; Madura, J. D. Mol. Phys. 1985, 56, 1381. Parameters as tabulated in
Mahoney, M. W.; Jorgensen, W. L. J. Chem. Phys. 2000, 112, 8910.
39. Horn, H. W.; Swope, W. C.; Pitera, J. W.; Madura, J. D.; Dick, T. J.; Hura, G. L. J.
Chem. Phys. 2004, 120, 9665.
40. Mahoney, M. W.; Jorgensen, W. L. J. Chem. Phys. 2000, 112, 8910.
41. Earl D. J; Deem, M. W.; Phys. Chem. Chem. Phys., 2005, 7, 3910.
42. Guo Z; Mohanty U.; Noehre J.; Sawyer T. K.; Sherman W.; Krilov G. Probing the α-
helical structural stability of stapled p53 peptides: molecular dynamics simulations and
analysis. Chem. Biol. Drug Des. 2010, 75, 348.
43. Patriksson, A; van der Spoel, D., A temperature predictor for parallel tempering simula-
tions. Phys. Chem. Chem. Phys. 2008, 10, 2073.
44. Gervasio, F. L.; Laio, A.; Parrinello, M. Flexible docking in solution using metady-
namics. J. Am. Chem. Soc. 2005, 127, 2600.
45. Shivakumar, D.; Williams, J.; Wu, Y.; Damm, W.; Shelley, J.; Sherman, W. Prediction of
Absolute Solvation Free Energies using Molecular Dynamics Free Energy Perturbation
and the OPLS Force Field. J. Chem. Theory Comput. 2010, 6, 1509.
46. Liu, P.; Kim, B.; Friesner, R. A.; Berne, B. J., Replica exchange with solute tempering: a
method for sampling biological systems in explicit water. Proc. Natl. Acad. Sci. U.S.A.
2005, 102, 13749.
Getting Help
Information about Schrödinger software is available in two main places:
• The docs folder (directory) of your software installation, which contains HTML and
PDF documentation. Index pages are available in this folder.
• The Schrödinger web site, http://www.schrodinger.com/, particularly the Support Center,
http://www.schrodinger.com/supportcenter, and the Knowledge Base,
http://www.schrodinger.com/kb.
To get information:
• Pause the pointer over a GUI feature (button, menu item, menu, ...). In the main window,
information is displayed in the Auto-Help text box, which is located at the foot of the
main window, or in a tooltip. In other panels, information is displayed in a tooltip.
If the tooltip does not appear within a second, check that Show tooltips is selected under
General → Appearance in the Preferences panel, which you can open with CTRL+, (⌘,).
Not all features have tooltips.
• Click the Help button in a panel or press F1 for information about a panel or the tab that is
displayed in a panel. The help topic is displayed in your browser.
• Choose Help → Online Help or press CTRL+H (⌘H) to open the default help topic in your
browser.
• When help is displayed in your browser, use the navigation links or search the help in the
side bar.
• Choose Help → Manuals Index to open a PDF file that has links to all the PDF
documents. Click a link to open the document.
• Choose Help → Search Manuals to search the manuals. The search tab in Adobe Reader
opens, and you can search across all the PDF documents. You must have Adobe Reader
installed to use this feature.
E-mail: help@schrodinger.com
USPS: Schrödinger, 101 SW Main Street, Suite 1300, Portland, OR 97204
Phone: (503) 299-1150
Fax: (503) 299-4532
WWW: http://www.schrodinger.com
FTP: ftp://ftp.schrodinger.com
Generally, e-mail correspondence is best because you can send machine output, if necessary.
When sending e-mail messages, please include the following information:
4. Click Create.
An archive file is created in your working directory, and an information dialog box with
the name of the file opens. You can highlight and copy the name of the file.
5. Attach the file specified in the dialog box to your e-mail message.
6. Copy and paste any log messages from the window used to start Maestro (or the job) into
the e-mail message, or attach them as a file.
• Windows: Right-click in the window and choose Select All, then press ENTER to
copy the text.
• Mac: Start the Console application (Applications → Utilities), filter on the application
that you used to start the job (Maestro, BioLuminate, Elements), and copy the text.
If Maestro failed:
1. Open the Diagnostics panel.
• Windows: Start → All Programs → Schrodinger-2012 → Diagnostics
• Mac: Applications → Schrodinger2012 → Diagnostics
• Linux/command line: $SCHRODINGER/diagnostics
2. When the diagnostics have run, click Technical Support.
A dialog box opens, with instructions. You can highlight and copy the name of the file.
3. Attach the file specified in the dialog box to your e-mail message.
4. Attach the file maestro_error.txt to your e-mail message.
This file should be in the following location:
• Windows: %LOCALAPPDATA%\Schrodinger\appcrash
(Choose Start → Run and paste this location into the Open text box.)
• Mac: Documents/Schrodinger
• Linux: Maestro’s working directory specified in the dialog box (the location is
given in the terminal window).
5. On Windows, also attach the file maestro.EXE.dmp, which is in the same location as
maestro_error.txt.