Dizzy User Manual
CompBio Group, Institute for Systems Biology
Stephen Ramsey (sramsey at systemsbiology.org)
Dizzy Version: 1.11.4, 2006/09/28
About Dizzy
Dizzy is a chemical kinetics simulation software package implemented in Java. It provides a model definition environment and various simulation engines for evolving a dynamical model from specified initial data. It was implemented by Stephen Ramsey in the laboratory of Hamid Bolouri at ISB. A model consists of a system of interacting chemical species, and the reactions through which they interact. The software can then be used to simulate the reaction kinetics of the system of interacting species. Model definition files written in the Chemical Model Definition Language (CMDL) use the file extension ".cmdl".
The extension ".dizzy" is also recognized as indicating a CMDL file.
This document is the user manual for the Dizzy program. This manual applies to the following release version of the program:
release version: 1.11.4
release date: 2006/09/28
The overview document describing Dizzy can be found at the following URL:
http://magnet.systemsbiology.net/software/Dizzy/docs/Overview.html
The home page for this program is:
http://magnet.systemsbiology.net/software/Dizzy
The version history for this program can be found at the following URL:
http://magnet.systemsbiology.net/software/Dizzy/docs/VersionHistory.html
If you are reading this document through a print-out, you can find the online version of this document (which may be a more recent version) at the following URL:
http://magnet.systemsbiology.net/software/Dizzy/docs/UserManual.html
A PDF version of this manual is also available on-line at:
http://magnet.systemsbiology.net/software/Dizzy/docs/UserManual.pdf
The above hyperlinks for the User Manual are for the most recent version of the Dizzy system.
Publications
An article describing Dizzy has been published:
Ramsey S., Orrell D. and Bolouri H. Dizzy: stochastic simulation of large-scale genetic regulatory networks. J. Bioinf. Comp. Biol. 3(2) 415-436, 2005.
External Libraries
The Dizzy system relies upon a number of external open-source libraries. These libraries are bundled with the Dizzy program and are installed within the Dizzy directory when you install Dizzy on your system.
The following table documents the external library dependencies of the Dizzy system. The libraries are provided in a compiled format called a "JAR archive". Some of the libraries have software licenses that require making the source code available, namely, the GNU Lesser General Public License (LGPL). For each of those libraries, a hyperlink is provided to a compressed archive file containing the source code for the version of the library that is included with Dizzy. These hyperlinks are shown in the "Source Code" column below.
| Package name | JAR name | Home Page / Documentation | License | Version | Source Code |
|---|---|---|---|---|---|
| jfreechart | jfreechart.jar | http://www.jfree.org/jfreechart/ | LGPL | 0.9.6 | full |
| jcommon | jcommon.jar | http://www.jfree.org/jcommon/ | LGPL | 0.7.2 | full |
| SBW (core) | SBWCore.jar | http://sbw.kgi.edu | BSD | 2.5.0 | see SBW web site |
| Netx JNLP client | netx.jar | http://jnlp.sourceforge.net/netx | LGPL | 0.5 | full |
| JavaHelp | jh.jar | http://java.sun.com/products/javahelp | Sun Binary Code License | 1.1.3 | partial |
| JAMA | Jama.jar | http://math.nist.gov/javanumerics/jama | public domain | 1.0.1 | full |
| colt | colt.jar | http://hoscheck.home.cern.ch/hoscheck/colt | open source (see below) | 1.0.3 | full |
| odeToJava | odeToJava.jar | http://www.cs.dal.ca/~spiteri/students/mpatterson_bcs_thesis.ps (customized version; see note below) | public domain | alpha.2.p1 | full |
| SBMLReader | SBMLReader.jar | (customized and abridged version of the SBMLValidate library by Herbert Sauro and the SBW Project team) | LGPL | 1.0 | full |
| Cytoscape | cytoscape.jar | http://www.cytoscape.org | LGPL | 1.1.1 | see the Cytoscape Project Home Page |
Please note that the SBMLReader.jar library is a modified
version of the SBML-parsing code originally contained in the program
SBMLValidate.jar. The package name has been changed also.
This was done in order to minimize the potential for conflict in cases
where the target installation computer already has an installation of
SBMLValidate.jar from the
Systems Biology Workbench (SBW).
The SBWCore.jar library distribution contains three
external libraries: gnu-regexp, grace,
and java_cup. For information about these
libraries and to obtain the source code, please consult the various
README.txt files within the subdirectories of the
sbw-1.0.5/src/imported directory of the source archive
for the SBWCore library, obtained at the link given above.
The odeToJava library is copyright Raymond Spiteri
and Murray Patterson. It is provided with kind permission from
Raymond Spiteri (Dalhousie University, Halifax, NS, Canada).
The odeToJava library is not distributed in its original
form with Dizzy. It has been modified from the version that is available
from Netlib.
Please use the odeToJava.jar library that is bundled
with Dizzy, as it contains some features that are necessary in
order to function correctly with Dizzy. The Netlib version of
odeToJava is no longer compatible with Dizzy, without
some slight modifications to the source code.
The jfreechart and jcommon libraries
are used by Dizzy in order to generate graphical plots of
simulation results. Please note that the public API for these libraries
has changed in recent versions, in a non-backwards-compatible manner.
It is necessary to use the (older) versions of these libraries (referenced above),
that are provided with the Dizzy installation. If you
download the latest version of the jfreechart and
jcommon libraries from the
JFree.org web site, they
will not be compatible with Dizzy.
The Colt library is provided under the following license terms:
Copyright (c) 1999 CERN - European Organization for Nuclear Research. Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation. CERN makes no representations about the suitability of this software for any purpose. It is provided "as is" without expressed or implied warranty.
Dizzy depends on the Cytoscape program through the Java Network Launching Protocol (JNLP), which means that the Cytoscape program is not distributed with Dizzy. Instead, the Cytoscape program is loaded at run-time over the network, only when an application function is performed that depends on the Cytoscape program.
Acknowledgements
The Dizzy software program was implemented by Stephen Ramsey. Hamid Bolouri is the Principal Investigator for this research project. This research project was supported in part by grant #10830302 from the National Institute of Allergy and Infectious Disease (NIAID), a division of the National Institutes of Health (NIH). David Orrell provided helpful advice and was an early adopter of the Dizzy program. William Longabaugh provided frequent advice on Java programming. Mike Hucka and Andrew Finney provided much assistance with SBML and SBW. Paul Shannon and the Cytoscape team helped to make the Dizzy->Cytoscape bridge possible. Raymond Spiteri kindly permitted the inclusion of the "odeToJava" library, which was implemented by Murray Patterson and Raymond Spiteri.
Many other individuals have contributed to the project, as well. In particular it should be noted that Dizzy makes extensive use of external libraries. The Dizzy system would not have been possible without the hard work and contributions of the authors of these libraries.
This section describes how to get started with using the Dizzy system.
System Requirements
The Dizzy system is implemented in the Java programming language.
This means that an installation of the Java Runtime Environment
(JRE) is required in order to be able to use the Dizzy system.
A version of the Dizzy system installer program
("insDizzy.bin" on Unix/Linux,
or "insDizzy.exe" on Windows)
is available which has the Sun JRE version 1.4.1 pre-bundled with it.
This is the recommended approach for users who are not knowledgeable
in the use of the Java programming language and runtime environment.
You may also download the "thin" version of the installer
that does not have the JRE pre-bundled. In order to use
the "thin" installation of Dizzy, you must already have
a JRE installed on your computer. The JRE must be version
1.4 or newer, because the software uses Java 1.4 language features and
extensions. This software will not function correctly with a 1.3.X
version of the JRE; if you attempt to run it under a 1.3.X version
of the JRE, you will see an UnsupportedClassVersionError.
The specific hardware requirements for using the Dizzy system will vary depending on the complexity of the models being studied, and on the type of JRE and host operating system. A good rule of thumb is that at least 512 MB of RAM is recommended. If you are using your own JRE and it is not a Sun JRE, you will need to ensure that the appropriate command-line parameters are passed to the JRE to ensure that the built-in heap size limit is set to at least 512 MB. If you are using the Sun JRE, or the JRE that is pre-bundled with the Dizzy installer, this issue does not apply to you.
This software has been tested with the Sun Java Runtime Environment version 1.4.1 on the following platforms: Windows XP Professional on the Intel Pentium 4; Fedora Core 1 Linux on the Intel Pentium 4; Mac OSX version 10.2.6 on the PowerPC G4. It should function properly on most Windows and Linux distributions. For other operating systems, you may download the "Other Java-Enabled Platforms" version of the installer. A Mac OSX version of the installer is under development and will be released soon.
The Dizzy installer will install an executable for the Dizzy launcher program specifically designed for the operating system of the computer on which you are running the installer. This means that if you run the installer on a Windows computer, the Dizzy launcher that is installed will be a Windows executable. If there is a need to run Dizzy on multiple operating systems (e.g., in a dual-boot or heterogeneous network-file-system environment), Dizzy should be installed in a separate directory for each operating system. One exception applies: it is possible to install Dizzy on one operating system (e.g., Windows) and run it on a different operating system (e.g., Unix), if you run the command-line program and not the GUI.
Tutorial
Dizzy is launched by executing the "Dizzy" executable that was installed as a symbolic link by the installation program. The default location of this symbolic link depends on your operating system. If you are installing on a Windows computer, the symbolic link is created in a new Program Group "Dizzy", which will show up in the "Start" menu. If you are installing on a Linux computer, the symbolic link is created in your home directory, by default. Note that the installation program permits you to override the default location for the symbolic link, so the symbolic link may not be in the default location on your computer if you selected a different location during installation. Double-clicking on the "Dizzy" symbolic link should start the Dizzy program. You should see an application window appear that looks like the following picture:
To load a model definition file into Dizzy, select the "Open..." item from the "File" menu. This will open a dialog box, as shown here:
In the "Please select a file to open" dialog box, navigate to the directory in which you installed Dizzy. Then navigate into the "samples" subdirectory. The dialog box should look like this:
For starters, try selecting the "Michaelis.cmdl"
file, by double-clicking on that file name in the dialog box.
The Dizzy window should now look like this:
Note that the model description has appeared in the editor window.
In this window, you can edit a model description, after which you
may save your changes. You probably will not want to modify the
Michaelis.cmdl model definition file just yet.
Note that the file name appears after the "file:" label.
There is also a "parser:" label, whose function
will be described later.
Now, from the "Tools" menu, select "Simulate...",
which essentially processes the model definition and loads the relevant
information into the Dizzy simulation engine.
This should create a "Dizzy: simulator"
dialog box, that looks like this:
First, you will need to specify a "stop time" for the simulation.
This is a floating-point number that you must type into the text box next
to the "stop time:" label in the "Dizzy: simulator"
dialog box. Second, you will need to select one or more species whose
populations are to be returned as time-series data resultant from the simulation.
For the purposes of demonstration, select the "G3D_G80D"
species in the list box under the "view species" label in the dialog
box.
TIP: You can select two species that are not adjacent to one another in the list box of species, by holding down the "control" key, and (while holding down the key) clicking on a species name with the mouse.
Finally, you will need to specify the "output type" for the simulation. For demonstration purposes, click on the circular button next to the "plot" label on the dialog box. Go ahead and change the number of samples to 30 samples, by editing the "100" appearing in the text box next to "num samples". This controls the number of time points for which the result data will be graphed. At this point, the dialog box should look like this:
Now, let's run the simulation, by single-clicking on the "start" button in the "Dizzy: simulator" dialog box. After a moment, you should see a plot window appear that resembles the following image:
For longer-running simulations, you can use the "cancel", "pause", and "resume" buttons to control a running simulation. The "pause" and "resume" buttons suspend and continue a simulation, and the "cancel" button terminates a running simulation at any time. The "start" button is only used to initiate a simulation. Only one simulation may be running at a time in the Dizzy application.
A special note applies to the case of importing a model definition file in SBML format, using the GUI application. In this case, the GUI application will ask you to specify how species symbols appearing in reaction rate expressions are to be interpreted. The choices given are "concentration" and "molecules". It is recommended that you try using "concentration" first. If the GUI application complains that the initial population of a given chemical species is too large for it to handle, try reloading the model with the "molecules" choice instead. This will likely solve the problem.
Now that you have become acquainted with the simulation driver screen, the next step is to become acquainted with the CMDL model definition language, which permits rapid development of new models. To begin, let's define a simple model of a chemical system. This system will consist of the enzyme-substrate reaction:
      E
    S ---> P
where "E" is the enzyme, "S" is the substrate, and "P" is the product. It is well known that the above symbols are shorthand for the following three elementary chemical reactions:
    E + S ---> ES
    ES ---> E + S
    ES ---> E + P
where the "ES" species is the enzyme-substrate complex. We will now investigate the stochastic kinetics of this very simple model using the Dizzy simulator.
A few notes about editing models: You may use the editor window in Dizzy, which is the white text box below the "file:" and "parser:" labels, to enter the model description as described below. Once you have entered the model description into the editor window, you may save the model description to a file by selecting "save as" from the "File" menu. Alternatively, you may use your own text editor program (e.g., Notepad on Windows) to create the model definition file. In that case, you would use the "open" menu item under the "File" menu to load the model definition file into Dizzy.
We will assume that the model definition file you are creating will be called "Model.cmdl". This file will define the species and chemical reactions that make up the model, as well as the kinetic parameters that will be used to simulate the reaction kinetics. The ".cmdl" file extension is important, so please type the file name exactly as shown. This helps the Dizzy program to recognize the file as an official "Dizzy model definition file", and to select the proper interpreter to load the file. The file should start out completely empty. Let's begin by defining the first of the three elementary reactions that make up the above model. We will be defining this model in the CMDL language for entering a model definition file, which is syntactically very close to the "Jarnac" language. At the top of the file, please enter the following lines of text, exactly as shown here:
    E = 100.0;
    S = 100.0;
    ES = 0.0;
    r1, E + S -> ES, 1.0;
These lines are examples of statements. The first three statements define symbols "E", "S", and "ES", and assign them the values 100.0, 100.0, and 0.0, respectively. These symbols represent initial species populations for the chemical species appearing in the model. The fourth statement is called a reaction definition. It is important that the reaction definition statement occurs after the statements defining the species symbols that appear in the reaction. Since statements are processed sequentially by the Dizzy parser, if the "ES = 0.0" statement did not occur before the reaction definition statement, the parser would generate an error message because it would not recognize the "ES" symbol occurring in the reaction definition statement. At that point, processing of the model would cease because of the syntax error. You will notice that in the above example, each line ends with a semicolon. In the Dizzy language, semicolons divide the model definition file into a sequence of statements. Each statement ends with a semicolon. A statement can in principle extend over more than one line, as shown here:
    r1, E + S -> ES,
        1.0;
This definition is logically equivalent to the one-line reaction definition
above it.
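The sequential, define-before-use behavior described above can be illustrated with a short Python sketch. This is not the Dizzy parser: `check_definition_order` is a hypothetical helper that understands only the two statement forms introduced so far (a symbol assignment and a reaction definition), and unlike the real parser it collects every problem instead of stopping at the first one.

```python
def check_definition_order(model_text):
    """Return error messages for species that are used before being defined.

    Toy illustration only (not Dizzy code). Handles 'name = value;' and
    '[name,] reactants -> products, rate;' statements.
    """
    defined = set()
    errors = []
    # Semicolons terminate statements; line breaks inside a statement are ignored.
    statements = [s.strip() for s in model_text.split(";") if s.strip()]
    for statement in statements:
        if "->" in statement:
            # Commas divide a reaction definition into its elements.
            elements = [e.strip() for e in statement.split(",")]
            reaction = next(e for e in elements if "->" in e)
            reactants, products = reaction.split("->")
            for species in reactants.split("+") + products.split("+"):
                species = species.strip()
                if species and species not in defined:
                    errors.append(f"unknown symbol '{species}' in: {statement}")
        else:
            name, _value = statement.split("=")
            defined.add(name.strip())
    return errors


ordered = "E = 100.0;\nS = 100.0;\nES = 0.0;\nr1, E + S -> ES,\n1.0;"
misordered = "E = 100.0; S = 100.0; r1, E + S -> ES, 1.0; ES = 0.0;"
print(check_definition_order(ordered))
print(check_definition_order(misordered))
```

The first model produces no errors even though its reaction definition spans two lines; the second reports the "ES" symbol, because its definition appears after the reaction that uses it.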
The commas in the reaction definition statement divide the statement into elements. We will explain each element in turn. In a reaction definition statement, the first element is optional, and defines the "name" of the reaction. This is just a symbolic name for the reaction, that does not affect the chemical kinetics of the model. There are rules governing allowed symbol names that apply to reaction names. The reaction name specified above was "r1", which is not very descriptive. Perhaps a more descriptive name would have been "enzyme_substrate_combine", as shown here:
    enzyme_substrate_combine, E + S -> ES, 1.0;
Note the use of the underscore character ("_"), which
is necessary because spaces are not allowed in symbol names such as reaction names.
The second element of the reaction definition statement defines the
list of reactant species and product species for the chemical reaction.
In this case, the reactants are species "E" (the enzyme)
and species "S" (the substrate). The special symbol
"->" separates the list of reactants and products.
The sole product species is "ES", the enzyme-substrate
complex. The "+" operator is used to separate species
in either the reactant or product side of the reaction. In passing, we note
that this is a one-way reaction, meaning that it defines a process
that is not (by itself) reversible. To define a reversible reaction in Dizzy,
you would need to follow the above reaction definition statement with a second
reaction definition statement, in which the reactants and products are reversed,
for example:
    enzyme_substrate_separate, ES -> E + S, 0.1;
The third element is a reaction rate. This can be specified as a bare number, a mathematical expression, or as a bracketed mathematical expression. When you specify the reaction rate as a bare number or as an (unbracketed) mathematical expression, you are instructing the Dizzy simulator to compute the reaction rate using its built-in combinatoric method. This means that the reaction probability density (usually designated with the symbol "a") per unit time is computed as
the product of the number of distinct combinations of reactant species,
times the reaction rate parameter you specified in the reaction definition.
Let us illustrate this with an example. For the reaction
    r1, A + B -> C + D, 2.0;
if species A has a population of 10, and species B
has a population of 10, the reaction probability density per unit time will
be evaluated as the number of distinct combinations of reactant molecules
(in this case, that is 100) times the reaction rate parameter, 2.0. The
resulting reaction probability density per unit time will be 200. This
probability density can be used to compute the probability P that a given
chemical reaction will occur during an infinitesimal time interval dt:
    P = a dt
The probability that a given chemical reaction (whose probability density per unit time is designated with the symbol "a") will
occur during the infinitesimal time interval dt is just the product
of the infinitesimal time interval and the reaction probability density per
unit time.
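The arithmetic in this example can be reproduced with a few lines of Python. The sketch below is an illustration, not Dizzy code; the function name `propensity` and its arguments are hypothetical. Counting combinations with a binomial coefficient when a reaction consumes more than one molecule of the same species is an assumption taken from the usual formulation of the Gillespie method, since the text above only covers one molecule of each reactant.

```python
from math import comb


def propensity(rate, reactants, populations):
    """Reaction probability density per unit time, usually written "a".

    reactants   maps species name -> molecules consumed per reaction event
    populations maps species name -> current (integer) population
    """
    combinations = 1
    for species, count in reactants.items():
        # Number of distinct ways to pick `count` molecules of this species.
        combinations *= comb(populations[species], count)
    return rate * combinations


# r1, A + B -> C + D, 2.0;  with 10 molecules of A and 10 of B
a = propensity(2.0, {"A": 1, "B": 1}, {"A": 10, "B": 10})
print(a)  # 200.0

# Probability that r1 fires during a short interval dt (valid only while a * dt << 1)
dt = 1.0e-6
print(a * dt)
```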
An example of a reaction definition with a mathematical expression for the reaction rate is shown here:
    A = 100.0;
    B = 100.0;
    C = 0.0;
    D = 0.0;
    F = 10.0;
    G = 10.0;
    r1, A + B -> C + D, F * G;
In the above example, the parser will attempt to immediately evaluate the expression "F * G". This evaluation will yield the result "100.0". Therefore, the above is functionally equivalent to:
    A = 100.0;
    B = 100.0;
    C = 0.0;
    D = 0.0;
    r1, A + B -> C + D, 100.0;
In either case, the built-in combinatoric method for computing the reaction rate is used, with a reaction parameter of 100.0. For some cases, it may be desirable to specify a custom reaction rate, in which you specify a mathematical expression that is to be re-evaluated for each reaction event, to give the reaction rate. This might be useful for simulating cooperativity, or enzyme reactions, or inhibition. An example of a reaction definition with a custom reaction rate expression is shown here:
    A = 100.0;
    B = 100.0;
    C = 0.0;
    D = 0.0;
    k = 0.1;
    r1, A + B -> C + D, [k * (time + 1.0) * A * B];
The symbol "time" is a special reserved symbol indicating the elapsed time of the simulation.
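The difference between an unbracketed and a bracketed rate can be sketched in Python. This is only a model of the behavior described above, with hypothetical function names: the unbracketed expression is reduced to a number once, and the combinatoric method then multiplies it by the number of reactant combinations, whereas the bracketed expression is evaluated afresh from the current symbol values and the elapsed time.

```python
symbols = {"A": 100.0, "B": 100.0, "F": 10.0, "G": 10.0, "k": 0.1}

# Unbracketed "F * G": evaluated a single time, when the statement is parsed.
rate_parameter = symbols["F"] * symbols["G"]


def combinatoric_rate(state):
    """Built-in method: fixed rate parameter times distinct A-B combinations."""
    return rate_parameter * state["A"] * state["B"]


def custom_rate(state, time):
    """Bracketed "[k * (time + 1.0) * A * B]": re-evaluated at every event."""
    return state["k"] * (time + 1.0) * state["A"] * state["B"]


print(combinatoric_rate(symbols))

# The custom rate grows with elapsed time even if the populations stay fixed.
for elapsed in (0.0, 1.0, 9.0):
    print(elapsed, custom_rate(symbols, elapsed))
```

With the populations held at 100, the custom rate doubles between time 0 and time 1, while the combinatoric rate depends only on the populations.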
Getting back to our previous model-building exercise, we have:
    E = 100.0;
    S = 100.0;
    ES = 0.0;
    enzyme_substrate_combine, E + S -> ES, 1.0;
    enzyme_substrate_separate, ES -> E + S, 0.1;
We see that the forward reaction, in which the enzyme and substrate combine, was given a reaction rate parameter of 1.0, and the reverse of that reaction (the enzyme-substrate complex separating) was given a rate of 0.1.
Note that in defining a chemical reaction, the element specifying the reaction name is not required. If you do not specify a reaction name, a unique reaction name is automatically assigned to the reaction by Dizzy. The syntax for a reaction thus defined is:
    A + B -> C + D, 2.0;
It is recommended that you specify your own reaction names, because the names automatically assigned by Dizzy will be verbose and hard to understand.
Now, let's define the third reaction, which takes the enzyme-substrate complex to the enzyme plus product:
    make_product, ES -> E + P, 0.01;
We will also need to define the initial population of the "P" species, using the statement:
    P = 0.0;
(note that this statement must occur before the "make_product" reaction definition statement occurs in the model definition file). Putting the three reaction definition statements together, your model definition file should look like this:
    E = 100.0;
    S = 100.0;
    P = 0.0;
    ES = 0.0;
    enzyme_substrate_combine, E + S -> ES, 1.0;
    enzyme_substrate_separate, ES -> E + S, 0.1;
    make_product, ES -> E + P, 0.01;
The Dizzy system ignores whitespace that is not in a quoted string, so you may reformat your model definition file using spaces, so that it is more tabular:
    E  = 100.0;
    S  = 100.0;
    P  = 0.0;
    ES = 0.0;
    enzyme_substrate_combine,    E + S -> ES,     1.0;
    enzyme_substrate_separate,   ES    -> E + S,  0.1;
    make_product,                ES    -> E + P,  0.01;
Note that it is very important that the statements defining the initial species populations appear before the reaction definition statements. Otherwise, the Dizzy interpreter will not understand the species symbols appearing in the reaction definition.
The final model definition file should look like this:
    E  = 100;
    S  = 100;
    P  = 0;
    ES = 0;
    enzyme_substrate_combine,    E + S -> ES,     1.0;
    enzyme_substrate_separate,   ES    -> E + S,  0.1;
    make_product,                ES    -> E + P,  0.01;
Remember, whitespace is ignored by the Dizzy interpreter, so your spacing
does not need to look exactly like the example shown above.
Now, save this model definition file in your text editor, and then
open the model definition file in Dizzy, as shown above. Finally,
select the "Simulate..." menu item from the "Tools" menu,
and run a simulation. Select a stop time of 400.0, and specify an output type
of "plot", and a "num samples" of 40.
Select "P" as the species to view. Your simulation
controller dialog box should look like this:
Now, run the simulation. You should see the familiar Michaelis-Menten type reaction kinetics appear in a plot window:
Note that the curve does not follow perfect Michaelis-Menten kinetics. This is because we are running a stochastic simulation. The Gillespie algorithm introduces the noisy effects of low copy numbers of chemical species in the model. If we were to drastically increase the initial populations of the species in the model (say, by a factor of 1000), the curve would become less noisy:
Note that the larger the initial copy number of the species in the model, the more computational time will be required to simulate the model for a given (fixed) choice of "stop time". This means that in general, when running stochastic simulations you should start with small initial copy numbers for the species in your model, and determine the computational run-time, before attempting simulations with large initial species populations.
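To make the source of the noise and of the run-time cost concrete, here is a minimal Python sketch of Gillespie's direct method applied to the three-reaction model built above. It is an independent illustration, not Dizzy's implementation; the function name `gillespie` and the data layout are hypothetical, and the sketch assumes each reactant species appears at most once per reaction. Every reaction event costs one loop iteration, which is why larger populations (and therefore larger propensities) take longer to simulate up to the same stop time.

```python
import math
import random


def gillespie(populations, reactions, stop_time, seed=0):
    """Simulate one trajectory with Gillespie's direct method.

    reactions is a list of (reactants, products, rate) tuples, where
    reactants and products are lists of species names.
    Returns a list of (time, populations) snapshots, one per reaction event.
    """
    rng = random.Random(seed)
    state = dict(populations)
    time = 0.0
    trajectory = [(time, dict(state))]
    while True:
        # Propensity = rate parameter * number of distinct reactant combinations.
        propensities = [
            rate * math.prod(state[s] for s in reactants)
            for reactants, _products, rate in reactions
        ]
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can occur any more
        # The waiting time to the next event is exponentially distributed.
        time += rng.expovariate(total)
        if time > stop_time:
            break
        # Pick the reaction that fires, weighted by its propensity.
        index = rng.choices(range(len(reactions)), weights=propensities)[0]
        reactants, products, _rate = reactions[index]
        for s in reactants:
            state[s] -= 1
        for s in products:
            state[s] += 1
        trajectory.append((time, dict(state)))
    return trajectory


initial = {"E": 100, "S": 100, "P": 0, "ES": 0}
reactions = [
    (["E", "S"], ["ES"], 1.0),     # enzyme_substrate_combine
    (["ES"], ["E", "S"], 0.1),     # enzyme_substrate_separate
    (["ES"], ["E", "P"], 0.01),    # make_product
]
trajectory = gillespie(initial, reactions, stop_time=400.0)
final_time, final_state = trajectory[-1]
print(len(trajectory) - 1, "reaction events up to time", final_time)
print(final_state)
```

Changing the `seed` argument gives a different noisy trajectory for the same model, which is the behavior visible in the plot of "P".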
Sample Model Definition Files
When you install the Dizzy application, a subdirectory "samples" is created in the directory where Dizzy is installed. You will find examples of all three model definition languages in this "samples" subdirectory.
In addition, there is a repository of biomolecular models maintained by the CompBio Group, that will serve as a good source of sample model definition files.
Note to Windows users: please do not use Notepad to open the sample model definition files in the "samples" subdirectory. Please use a different editor, such as WordPad or Emacs, in order to ensure that the files appear properly formatted in the editor. You may wish to associate ".cmdl" and ".dizzy" files with the WordPad program, so that you can double-click on a ".cmdl" or ".dizzy" file and have it open (properly formatted) in WordPad.
This section describes preliminary concepts that are common throughout the various model definition languages for the Dizzy system.
Numeric Precision
All floating-point numbers in the Dizzy system are Java double-precision numbers. This means that they are floating-point numbers in a 64-bit representation adhering to the IEEE 754-1985 standard. Java double-precision floating-point numbers provide approximately 15 digits of precision. All internal calculations are performed using double-precision floating-point arithmetic, unless otherwise noted.
It should be noted that the above limitation of the number of significant digits of a double-precision floating-point number in Java, means that reaction rates differing by more than 15 orders of magnitude will cause incorrect results from the stochastic simulator. In practice, this limitation rarely presents a problem.
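The effect of limited precision on widely separated rates can be seen with ordinary double-precision arithmetic. The Python sketch below uses two hypothetical propensity values; Python floats are IEEE 754 doubles, like Java's `double`. Reading the limitation as a loss of the slow reaction's contribution when rates are summed is an interpretation offered here, not a statement about Dizzy's internals.

```python
import sys

# Decimal digits that a double can always represent faithfully.
print(sys.float_info.dig)  # 15

fast = 1.0e+17   # hypothetical propensity of a very fast reaction
slow = 1.0       # hypothetical propensity of a very slow reaction

# The slow reaction's contribution disappears from the sum entirely.
print(fast + slow == fast)  # True

# With a ratio of 1e15 the contribution still survives.
print(1.0e+15 + slow == 1.0e+15)  # False
```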
Another consequence of numeric precision is that a model containing
a dynamical species whose initial population is greater than
or equal to 9.0071993e+15 molecules will not be allowed
to be simulated using a stochastic simulator. In addition, if the
value of any dynamical species becomes greater than the aforementioned
threshold during the course of a simulation, an error message will occur,
and the simulation will be terminated. This is because
the number of molecules is stored using a double-precision floating-point
number, and for numbers greater than the aforementioned value,
there are not enough significant digits in the floating-point
representation to account for an increment or decrement of the
species population by one molecule.
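The threshold quoted above appears to correspond to 2^53 (9007199254740992, roughly 9.0071993e+15), the point beyond which a double can no longer represent every integer. A short Python check, using Python floats as a stand-in for Java doubles, shows that adding a single molecule has no effect at that population:

```python
limit = 2.0 ** 53
print(limit)  # 9007199254740992.0

# At the limit, incrementing by one molecule leaves the stored value unchanged.
print(limit + 1.0 == limit)  # True

# Just below the limit, single-molecule steps are still represented exactly.
print((limit - 1.0) + 1.0 == limit)  # True
print(limit - 1.0 == limit)  # False
```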
Case Sensitivity
All symbols, keywords and expressions in the Dizzy system are
case-sensitive. This means that if you define a symbol such as
"x" (lower-case), you cannot later
refer to it as
"X" (upper-case). Similarly, mixed-case
keywords that are built into the Dizzy system,
such as the keyword exportModelInstance,
must be entered exactly as shown; case variants such as
exportmodelinstance would not be recognized as
valid keywords.
Symbol Names
Symbol names are a core ingredient of the Dizzy system. Most elements of the Dizzy system (reactions, species, parameters, compartments, etc.) are named elements whose names must be valid symbol names. A symbol name must conform to the following rules:
32
- The symbols time and Navo are reserved, because they represent clock time and Avogadro's constant, respectively.
- A symbol name cannot contain the arithmetic operators +, -, *, /, %, ^, or the relations >, <, and =.
Further, a symbol name cannot contain the following
reserved characters:
!, @, #, $,
|, &, ;,
=, the comma ",",
and the period ".".
For the reader who is familiar with the C programming language, the above can be summarized as: a symbol name is legal if it would be a legal variable name in a C program.
Some examples of valid symbol names are shown here:
    Species_A
    Galactose
    DNA_plus_TFA
    P1
The underscore can be used as a convenient separator when defining a symbol name with multiple words.
The following shows some examples of illegal symbol names:
    ILLEGAL:
    A + B
    DNA plus TFA
    C-D
    1.0
    1e+6
    B!
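The C-identifier summary given above translates directly into a regular expression. The Python sketch below is a hypothetical checker, not part of Dizzy: it tests only the identifier shape and the two reserved symbols mentioned in this section, and it does not attempt to cover any other rule the Dizzy parser may enforce.

```python
import re

RESERVED = {"time", "Navo"}
# A C-style identifier: a letter or underscore, then letters, digits, underscores.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_symbol_name(name):
    """Return True if `name` looks like a legal symbol name (sketch only)."""
    return name not in RESERVED and IDENTIFIER.fullmatch(name) is not None


for candidate in ["Species_A", "Galactose", "DNA_plus_TFA", "P1",
                  "A + B", "DNA plus TFA", "C-D", "1.0", "1e+6", "B!", "time"]:
    print(repr(candidate), is_valid_symbol_name(candidate))
```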
Symbol names are stored in a namespace. There are two
types of namespaces, global and local. Normally,
all symbol names reside in the global namespace. This applies
to species names, reaction
names, compartment names, and
parameter names. This means that you cannot
define a species X and a reaction X; their
names would collide in the global namespace.
The local namespace applies only to a parameter that is defined
only for a specific reaction (or reactions). Each reaction has
a local namespace for its reaction-specific parameters. It is
permissible to define a parameter X in the global
namespace, and to also define a parameter X with a
different value, in the local namespace for one or more reactions.
In that case, the value associated with X for the
specific reaction supersedes the value associated with X
in the global namespace, for the purpose of evaluating the custom
reaction rate expression for the reaction. This can be summarized by
saying that a parameter defined locally supersedes a parameter defined
globally, of the same name. The local namespace concept applies only
to parameters. Note that defining parameters within the local
namespace is not possible in the Chemical
Model Definition Language.
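The lookup rule, in which a reaction-local parameter supersedes a global parameter of the same name, can be pictured as a two-level dictionary search. The following Python sketch uses hypothetical names and values and says nothing about how Dizzy stores its namespaces internally.

```python
from collections import ChainMap

# Global namespace: visible to every reaction.
global_parameters = {"X": 1.0, "k": 0.5}

# Local namespaces: one per reaction, holding reaction-specific parameters.
local_parameters = {
    "r1": {"X": 7.0},   # r1 overrides X
    "r2": {},           # r2 defines nothing locally
}


def lookup(reaction_name, symbol):
    """Resolve a parameter for a reaction: local value first, then global."""
    scope = ChainMap(local_parameters.get(reaction_name, {}), global_parameters)
    return scope[symbol]


print(lookup("r1", "X"))  # 7.0  (local value supersedes the global one)
print(lookup("r2", "X"))  # 1.0  (falls back to the global namespace)
print(lookup("r1", "k"))  # 0.5  (only defined globally)
```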
Mathematical Expressions
Various aspects of the Dizzy system permit the textual specification of mathematical expressions. This is a useful method of customizing reaction rate equations and other aspects of a chemical system model. A mathematical expression may involve symbols, numeric literals, arithmetic operators, and built-in mathematical functions.
Symbols are analogous to algebraic variables or symbols. Depending on the context, a symbol may represent the population or concentration of a chemical species, or it may represent a floating point parameter defined for a model or a chemical reaction. In the context of an expression, a symbol always has an associated numeric value. When a symbol appears in a mathematical expression, its associated numeric value is used in place of the symbol, for the purpose of evaluating the expression.
In the context of a mathematical expression, numeric literals
are simply numbers, either floating point or integer. Note that
within a mathematical expression one may use scientific notation
(e.g., 1.2e-7 or 1.2e+7) to specify
floating-point numeric literals. Alternatively,
one may use constructions such as 1.2*10^7
and 1.2*10^(-7) to represent floating-point
numeric literals (but in deferred-evaluation expressions,
the latter method is less efficient than scientific notation
using the "e" character shown above).
In the Dizzy system, mathematical expressions are described using a syntax similar to the C programming language. The basic operations permitted are:
- addition: adds two symbols, numbers, or sub-expressions, such as A+B, A+1.7, or 2+2.
- subtraction: computes the difference of two symbols, numbers, or sub-expressions, such as A-B, A-1.7, or 2-2.
- multiplication: multiplies two symbols, numbers, or sub-expressions, such as A*B, A*1.7, or 2*2.
- division: computes the quotient of two symbols, numbers, or sub-expressions, such as A/B, A/1.7, or 2/2. The first operand is the dividend, and the second operand is the divisor.
- modulo division: computes the remainder of the quotient of two symbols, numbers, or sub-expressions, such as A%B, A%1.7, or 2%2. The first operand is the dividend, and the second operand is the divisor.
- exponentiation: raises one symbol, number, or sub-expression to the power of another, such as A^B, A^1.7, or 2^2. The first operand is the value being exponentiated. The second operand is the exponent.
- parentheses: represent a sub-expression whose value is to be computed, such as the sub-expression (B+C) appearing in the expression A+(B+C).
- negation: computes the negative of a symbol, number, or sub-expression, such as -A, -1.0, or -(A+B).
In addition to the above operations, there are a number of built-in mathematical functions that may be used in mathematical expressions. Unless otherwise stated, the built-in functions described below are implemented by calling the corresponding function in the java.lang.Math class in the Java Runtime Environment. The built-in mathematical functions available for use in mathematical expressions are:
- exp: computes the value of the base of the natural logarithm, e, raised to the power of the (floating-point) argument.
- ln: computes the natural logarithm of the argument, which must be in the range (0, infinity).
- sin: computes the trigonometric sine of the argument. The argument is an angle, which must be specified in radians. Example: sin(A), sin(3.14159).
- cos: computes the trigonometric cosine of the argument. The argument is an angle, which must be specified in radians. Example: cos(A), cos(3.14159).
- tan: computes the trigonometric tangent of the argument. The argument is an angle, which must be specified in radians. Example: tan(A), tan(3.14159).
- asin: computes the trigonometric inverse sine of the argument. The argument is a dimensionless ratio that must be within the range [-1,1]. The value returned is an angle, in radians. Example: asin(A), asin(0.5).
- acos: computes the trigonometric inverse cosine of the argument. The argument is a dimensionless ratio that must be within the range [-1,1]. The value returned is an angle, in radians. Example: acos(A), acos(0.5).
- atan: computes the trigonometric inverse tangent of the argument. The argument is a dimensionless ratio. The value returned is an angle, in radians. Example: atan(A), atan(0.5).
- abs: computes the absolute value of the argument.
- floor: computes the greatest integer value that is less than or equal to the floating-point argument. Example: floor(A), floor(1.7).
- ceil: computes the smallest integer value that is greater than or equal to the floating-point argument. Example: ceil(A), ceil(1.7).
- sqrt: computes the value of the square root of the argument. The argument must be nonnegative.
- theta: returns 0.0 if the argument is negative, or 1.0 if the argument is nonnegative (i.e., zero or positive).
- min(X,Y): returns the smaller of expressions X and Y. This is a two-argument function.
- max(X,Y): returns the larger of expressions X and Y. This is a two-argument function.
New built-in mathematical functions may be added in forthcoming versions of the Dizzy system.
Please remember that all elements of the Dizzy system
are case-sensitive,
including the aforementioned
built-in mathematical functions. Therefore an expression
such as SIN(3.14) would be considered invalid,
because SIN would not be recognized
as referring to the sin trigonometric function
or to any other valid built-in function.
It is important to note that all expressions are evaluated
using double-precision floating-point arithmetic. For
functions that return an integer, such as the floor()
function appearing in the expression A * floor(B),
the integer result of floor(B) is converted to a
double-precision floating-point number, before the result
is used in evaluating the A * floor(B) expression.
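The two points above (case-sensitive function names, and integer-valued results being carried as double-precision numbers) can be illustrated with a small Python sketch. The lookup table and the helper name call_builtin are hypothetical and are not part of Dizzy; Python's math module stands in for java.lang.Math.

```python
import math

def theta(x):
    """Step function as described above: 0.0 for negative x, 1.0 otherwise."""
    return 0.0 if x < 0 else 1.0

# Hypothetical lookup table keyed by the exact, case-sensitive function name.
BUILTINS = {
    "exp": math.exp, "ln": math.log, "sin": math.sin, "cos": math.cos,
    "tan": math.tan, "asin": math.asin, "acos": math.acos, "atan": math.atan,
    "abs": abs, "floor": math.floor, "ceil": math.ceil, "sqrt": math.sqrt,
    "theta": theta, "min": min, "max": max,
}

def call_builtin(name, *args):
    """Look up a built-in by name and return its result as a float."""
    if name not in BUILTINS:
        raise ValueError("unknown function: " + name)
    # math.floor and math.ceil return Python integers; convert to float so
    # that every result behaves like a double-precision value.
    return float(BUILTINS[name](*args))

A, B = 2.5, 1.7
print(A * call_builtin("floor", B))
```

In this sketch, call_builtin("SIN", 3.14) raises an error because only the lowercase name is present in the table.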
The following are a few examples of valid mathematical expressions that have been used in Dizzy models:
    10*(1/(1+exp(-0.0025*(-2000+time))))
    alpha0 + (alpha + PY^n*alpha1)/(K^n + PY^n)
    k * (A/(N*V)) * (B/(N*V))
Note that the symbols time and N
are special symbols, defined above.
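As a rough check of how such expressions evaluate, the first two examples can be transcribed into Python. This is a transcription for illustration only: the function names signal and hill are hypothetical, the exponentiation operator ^ of the expression syntax becomes ** in Python, and the argument values used below are arbitrary.

```python
import math

def signal(time):
    """First example expression: a sigmoid in time, centered at time = 2000."""
    return 10 * (1 / (1 + math.exp(-0.0025 * (-2000 + time))))

def hill(PY, alpha0, alpha, alpha1, K, n):
    """Second example expression; '^' in the expression syntax is '**' here."""
    return alpha0 + (alpha + PY**n * alpha1) / (K**n + PY**n)

print(signal(2000))
print(hill(PY=0.0, alpha0=1.0, alpha=2.0, alpha1=3.0, K=2.0, n=2))
```

At time = 2000 the exponent is zero, so the first expression evaluates to half of its maximum value of 10.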
Certain of the functions and operators described above are not differentiable.
This means that algorithms or features of Dizzy that
rely on the Jacobian matrix of the model (the partial derivative
of the time rate of change of the ith species in the model,
with respect to the jth species), may not be used if you
specify a model that contains one of these non-differentiable
functions in an expression. An error will result if you
attempt to use a feature that relies on the Jacobian,
with a model containing a non-differentiable function.
The non-differentiable functions are:
theta(), ceil(), floor(),
abs(), and the modulo division operator %.
The features in Dizzy that rely on the Jacobian are the
Tau-Leap simulators and the
steady state fluctuations estimator
(the latter relies on the Jacobian only in the case of an ODE-based
simulator).
When specifying a mathematical expression, it is important to understand the distinction between immediate evaluation and deferred evaluation. An example of immediate evaluation is shown here:
    A = 1.0;
    B = A * 5.0;
The value for the symbol B is set to 5.0. The mathematical expression appearing in the definition of symbol B is immediately evaluated by the parser, so any symbols appearing in that expression (namely, A) must have been previously defined as symbols in the model. The special symbols "time" and "Navo" may not be used in immediate-evaluation expressions.
An example of deferred evaluation is shown here:
    A = 1.0;
    B = [A * 5.0];
The square brackets define the expression as a deferred-evaluation expression. This means that the parser stores the expression and associates it with the symbol B, rather than a value. The expression will be evaluated by the simulation engine only when a value for the symbol "B" is needed. The special symbols "time" and "Navo" may be used in deferred-evaluation expressions.
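The distinction can be mimicked in Python by storing either a number (immediate evaluation) or a function (deferred evaluation) in a symbol table. This is a conceptual sketch only; the symbol table layout and the helper value_of are assumptions made for illustration, and changing A afterwards stands in for any symbol whose value changes during a simulation.

```python
# Symbol table for a hypothetical model; names mirror the example above.
symbols = {}

symbols["A"] = 1.0
# Immediate evaluation: the expression is reduced to a number right away.
symbols["B_immediate"] = symbols["A"] * 5.0
# Deferred evaluation: the expression itself is stored and evaluated on demand.
symbols["B_deferred"] = lambda env: env["A"] * 5.0

def value_of(name, env):
    """Return a symbol's numeric value, evaluating deferred expressions."""
    entry = env[name]
    return entry(env) if callable(entry) else entry

symbols["A"] = 2.0  # change A after both definitions have been processed
print(value_of("B_immediate", symbols))  # still reflects the old value of A
print(value_of("B_deferred", symbols))   # reflects the current value of A
```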
Important note about time-dependent expressions:
Although it is technically possible to define a rate law or other
expression that has an explicit time dependence through the use of
the reserved symbol "time", this practice
is discouraged when using the stochastic simulators. This is because
the stochastic simulators are based on a mathematical theory of
reaction kinetics in which the time-invariance of the reaction parameters
is a priori assumed. The time reserved symbol
is intended solely for use with the ODE simulators. A very slowly varying
time dependence in some expression of a model may be compatible with
the stochastic simulators, to the extent that the expression is
effectively time-invariant on the time scale on which any reaction occurs.
Gillespie Stochastic Algorithm
The Gillespie stochastic algorithm is an algorithm for modeling the kinetics of a set of coupled chemical reactions, taking into account stochastic effects from low copy numbers of the chemical species. The algorithm is defined in the article:
D. T. Gillespie, "A General Method for Numerically Simulating the Stochastic Time Evolution of Coupled Chemical Species", J. Comp. Phys. 22, 403-434 (1976).
In Gillespie's approach, chemical reaction kinetics are modeled as a Markov process in which reactions occur at specific instants of time, as in a Poisson process. The intervals between successive reactions are exponentially distributed, with a mean reaction time interval that is recomputed after each chemical reaction occurs. For each chemical reaction interval, a specific chemical reaction occurs, randomly selected from the set of all possible reactions with a weight given by the individual reaction rates.
The Dizzy system provides a Java implementation of the Gillespie algorithm, for which more information is available in the Javadoc documentation. This implementation uses the "direct method" variant of Gillespie's algorithm.
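The structure of the direct method can be sketched in Python as follows. This is a minimal illustration written for this manual, not a transcription of Dizzy's Java implementation; the function name gillespie_direct, the representation of reactions as (rate function, state change) pairs, and the example model are all assumptions.

```python
import random

def gillespie_direct(x, reactions, t_end, rng):
    """Simulate until t_end. x maps species name -> molecule count;
    reactions is a list of (rate_function, state_change) pairs."""
    t = 0.0
    while True:
        rates = [rate(x) for rate, _ in reactions]
        total = sum(rates)
        if total <= 0.0:
            return t, x  # no reaction can occur any more
        # Waiting time to the next reaction is exponential with mean 1/total.
        dt = rng.expovariate(total)
        if t + dt > t_end:
            return t_end, x
        t += dt
        # Select which reaction occurs, weighted by the individual rates.
        pick = rng.random() * total
        chosen = len(reactions) - 1
        acc = 0.0
        for i, r in enumerate(rates):
            acc += r
            if pick < acc:
                chosen = i
                break
        for species, delta in reactions[chosen][1].items():
            x[species] += delta

# Hypothetical model: A -> B with reaction parameter 0.5 per molecule of A.
state = {"A": 100, "B": 0}
model = [(lambda s: 0.5 * s["A"], {"A": -1, "B": +1})]
t_final, state = gillespie_direct(state, model, 1000.0, random.Random(1))
print(t_final, state)
```

Note that all rates are recomputed after every reaction event, which is what makes the cost per event grow with the number of reactions in the model.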
Gillespie Tau-Leap Stochastic Algorithm
The Gillespie Tau-Leap algorithm is a method for obtaining approximate solutions for the stochastic kinetics of a coupled set of chemical reactions. A dimensionless relative tolerance "epsilon" controls the amount of error (as compared to the Gillespie Direct method) permitted in the solution, by scaling the maximum allowed "leap time", which is recomputed after each iteration of the algorithm. The leap time is the amount by which the time is stepped forward during the iteration. The number of times each reaction in the model occurred during the leap time is computed as the result of a Poisson stochastic process. Species populations are adjusted in accordance with the number of times each reaction occurred during the leap time interval. In the limit as the epsilon parameter approaches zero, the results of the Tau-Leap algorithm should agree with those of the Gillespie Direct algorithm. For complex models with a significant separation of time scales, this algorithm may potentially be much faster than the Gillespie Direct algorithm.
The Tau-Leap algorithm is described in:
D. T. Gillespie and L. R. Petzold, "Improved Leap-Size Selection for Accelerated Stochastic Simulation", J. Chem. Phys. 119, 8229-8234 (2003).
and in references therein.
Two implementations of the Tau-Leap algorithm are provided with Dizzy.
The first is called "tauleap-simple". It is intended
for use with models that are entirely composed of elementary reactions, that is,
reactions with rate laws that are simple mass-action kinetics. The
second is called "tauleap-complex". It is intended
for use with models that contain custom algebraic
rate expressions.
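The core of a single leap can be sketched in Python as shown below. This is a deliberately simplified illustration: the leap time tau is fixed by hand rather than derived from the epsilon tolerance, so the leap-size selection procedure of Gillespie and Petzold is not represented, and nothing here guards against a leap driving a population negative. The function names and the example model are hypothetical.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's method; adequate for the small means used in this sketch."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def tau_leap_step(x, reactions, tau, rng):
    """Advance the state by one leap of length tau (fixed, by assumption)."""
    # Rates are evaluated once, at the start of the leap.
    firings = [poisson(rate(x) * tau, rng) for rate, _ in reactions]
    for (_, change), n in zip(reactions, firings):
        for species, delta in change.items():
            x[species] += n * delta
    return firings

# Hypothetical model: A -> B with reaction parameter 0.1 per molecule of A.
state = {"A": 1000, "B": 0}
model = [(lambda s: 0.1 * s["A"], {"A": -1, "B": +1})]
rng = random.Random(7)
history = [tau_leap_step(state, model, 0.1, rng) for _ in range(20)]
print(state)
```

Each leap applies many reaction events at once, which is the source of both the speed-up and the approximation error relative to the direct method.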
Gibson-Bruck Stochastic Algorithm
The Gibson-Bruck stochastic algorithm is an algorithm for modeling the kinetics of a set of coupled chemical reactions. The algorithm is defined in the article:
M. A. Gibson and J. Bruck, "Efficient Exact Stochastic Simulation of Chemical Systems with Many Species and Many Channels", Caltech Parallel and Distributed Systems Group technical report number 026, (1999).
This implementation uses the "next reaction" variant of the Gibson and Bruck algorithm, for which more information is available. The computational cost of the Gibson-Bruck algorithm scales as O(log(M)), where M is the number of reactions, so it is preferred over the Gillespie algorithm for models with a large number of reactions and/or species. For models with a small number of reactions and species, the Gillespie algorithm is preferred, as it avoids the overhead of maintaining the complex data structures needed for the Gibson-Bruck algorithm.
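The idea behind the "next reaction" approach (each reaction keeps its own tentative firing time, and only reactions affected by an event are updated) can be sketched in Python. This sketch makes two simplifications that should be kept in mind: it finds the earliest reaction with a linear scan rather than the indexed priority queue that yields the O(log(M)) behavior, and it draws a fresh random time for each affected reaction instead of rescaling the old one as the published algorithm does. All names and the example model are hypothetical.

```python
import math
import random

def next_reaction(x, reactions, depends_on, t_end, rng):
    """reactions: list of (rate_function, state_change) pairs;
    depends_on[j]: indices of reactions whose rate changes when j fires."""
    def draw(j, now):
        a = reactions[j][0](x)
        return now + rng.expovariate(a) if a > 0.0 else math.inf

    t = 0.0
    times = [draw(j, t) for j in range(len(reactions))]  # absolute times
    fired = 0
    while True:
        j = min(range(len(times)), key=times.__getitem__)
        if times[j] > t_end:
            return fired
        t = times[j]
        for species, delta in reactions[j][1].items():
            x[species] += delta
        fired += 1
        # Only the fired reaction and its dependents get new tentative times.
        for i in set(depends_on[j]) | {j}:
            times[i] = draw(i, t)

# Hypothetical chain A -> B -> C.
state = {"A": 50, "B": 0, "C": 0}
model = [
    (lambda s: 1.0 * s["A"], {"A": -1, "B": +1}),
    (lambda s: 2.0 * s["B"], {"B": -1, "C": +1}),
]
dependencies = {0: [0, 1], 1: [1]}
count = next_reaction(state, model, dependencies, 1.0e6, random.Random(5))
print(count, state)
```

In a model with many reactions, most events affect only a handful of other reactions, which is why tracking dependencies pays off.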
Deterministic simulation using ODEs
The Dizzy system provides several simulators for approximately solving the deterministic dynamics of a model as a system of ordinary differential equations (ODEs). A differential equation, called a rate equation, is generated expressing the time rate of change of the concentration of each chemical species in the model. This coupled set of differential equations is solved using finite difference techniques. The simplest methods use a fixed time-step size. More sophisticated methods use a variable time-step size that is controlled by an adaptive method involving a formula for estimating the error. If the error gets too large, the time-step size is decreased until the error is acceptable. If the error becomes very small, the time-step size is increased (to improve speed) as much as possible without exceeding the allowed error. Each step involves computing the concentration of all species at the next time step, using a finite differencing scheme.
Several categories of finite differencing schemes exist. The explicit schemes compute the concentration at the next time-step using only derivatives at the previous time-step. The implicit schemes compute the concentration at the next time-step using only derivative values from the next time-step; these methods involve solving a (usually nonlinear) implicit equation for the concentration at the next time-step, for each iteration. A linearly implicit or implicit-explicit scheme is a compromise where the linear term is treated using an implicit method, and the nonlinear term is treated using an explicit method. This ensures that at most a linear system of equations needs to be solved for each iteration. For more information, please see the book:
Introduction to Numerical Analysis, Second Edition, by J. Stoer and R. Bulirsch. New York: Springer-Verlag, 1993.
The deterministic simulators are approximate for two reasons. First, they are solving a set of ordinary differential equations that are themselves an approximation to the underlying stochastic kinetics of the system. Second, they are using finite-difference methods that usually only give an approximate numerical solution to a system of differential equations. However, the deterministic simulators have the advantage of usually being much faster than the stochastic simulators, for most models. This means that they can be very beneficial in situations where rapid model solution is required, such as multi-parameter optimization of a model.
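The difference between explicit and implicit schemes can be seen on the simplest possible rate equation, dx/dt = -k*x (first-order decay). The Python sketch below uses fixed-step forward and backward Euler methods purely for illustration; it is not a description of the particular ODE solvers bundled with Dizzy, and the function names are hypothetical.

```python
import math

def explicit_euler(x0, k, h, steps):
    """Forward Euler for dx/dt = -k*x: uses the derivative at the current step."""
    x = x0
    for _ in range(steps):
        x = x + h * (-k * x)
    return x

def implicit_euler(x0, k, h, steps):
    """Backward Euler: solves x_next = x + h*(-k*x_next) for x_next each step."""
    x = x0
    for _ in range(steps):
        x = x / (1.0 + k * h)
    return x

x0, k, t_end = 100.0, 1.0, 1.0
exact = x0 * math.exp(-k * t_end)
for steps in (10, 100, 1000):
    h = t_end / steps
    print(steps,
          explicit_euler(x0, k, h, steps) - exact,
          implicit_euler(x0, k, h, steps) - exact)
```

For this linear equation the implicit update can be solved in closed form. With a step size that is too large for the rate constant (for example k = 30 and h = 0.1), the explicit result oscillates and grows while the implicit result still decays, which is the usual motivation for implicit schemes on stiff models.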
Parameters
A parameter is a name-value pair that may be referenced symbolically (i.e., by its name) in mathematical expressions. The value is always a numeric (floating-point) value. The parameter name must be a valid symbol name.
A parameter can be associated with a model, in which case it can be referenced in the custom rate expression for any chemical reaction associated with the model; in addition, it can be referenced in the species population expression for any boundary species within the model.
Compartments
A compartment is an abstraction for a named region of space that has a fixed volume. The contents of this volume are assumed to be well-stirred, so that chemical species do not have concentration gradients within this volume. Every species must be assigned to a compartment. The volume of the compartment can be used to compute the concentration of the species, from the number of molecules (population) of the species in the compartment.
By default, species defined in the Chemical
Model Definition Language are associated with a default
compartment "univ". This compartment
has unit volume.
A non-default compartment can be defined by a symbol definition as shown here:
    c1 = 1.0;
A species "S" can be associated with this compartment by the statement:
    S @ c1;
The special symbol "@" is used to associate a species with a compartment. Note that the species symbol "S" and the compartment symbol "c1" must have been previously defined, as shown here:
    c1 = 1.0;
    S = 100.0;
    S @ c1;
The above statements would tell the parser to define the two symbols "S" and "c1" with values 100 and 1, respectively, and that "S" is a species associated with the compartment "c1".
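The conversion from a species population to a molar concentration, mentioned at the start of this section, can be sketched in Python as follows. The sketch assumes that the compartment volume is expressed in liters, which this manual does not state; the function name and the example numbers are likewise hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molar_concentration(population, volume_liters):
    """Convert a molecule count in a well-stirred compartment to mol/L.
    The choice of liters as the volume unit is an assumption of this sketch."""
    return population / (AVOGADRO * volume_liters)

# Example: 602 molecules in a hypothetical compartment of 1e-15 liters.
print(molar_concentration(602, 1.0e-15))
```

The example corresponds to a concentration of roughly one micromolar.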
Species
A species is an abstraction representing a type of molecule or molecular state. A species has a name, which must be unique; in addition, a species must be assigned to one (and only one) compartment. A species must also be assigned a population value, which is a double-precision floating point number. There are two types of species in the Dizzy system, dynamical species and boundary species.
A dynamical species (called a "floating" species in SBML) is a species whose population is affected by reactions in which it participates. For example, if a reaction takes species X as a reactant, and does not produce species X as a product, then when this reaction occurs, the population of species X is decremented by one. The dynamical species is the most commonly used species type, and it is the default species type for species in the Dizzy system.
A boundary species is a species whose population is externally specified as a boundary condition for the simulation. The population of a boundary species is not affected by the occurrence of reactions in which the species participates. In this sense, a boundary species is not "dynamical". The population of a boundary species can be set to a constant, or a more complex time-dependent function. The details of how to define the population of a boundary species will be discussed further below.
Occasionally it is desirable to create a model in which
a given species can reside in more than one compartment.
This is accomplished in the Dizzy system by defining
two different species with similar (but still distinct)
names, and assigning each species to a different compartment.
For example, one might define two different species
named "SpeciesX_cytoplasm" and
"SpeciesX_nucleus", representing
the instances of chemical species "X" in the
cytoplasm and nucleus, respectively.
Please note that there is a restriction on the initial population that can be specified for dynamical species. This particular limitation only affects stochastic simulations.
Reactions
A reaction is a one-way process in which zero or more chemical species may interact, transforming into zero or more (possibly different) chemical species. The interacting species are the reactants, and the chemical species that are produced are called the products.
Here, "one-way" means that a single reaction defines a process that can only proceed from reactants to products. The "reverse" reaction is not implicitly defined. In order to model a chemical system with a "reversible" reaction, a second reaction must be defined in which the roles of reactants and products are swapped.
The mention of "zero species" above merits some
explanation. Consider the case of a chemical reaction with
zero reactants and a finite number of products. This represents
a process in which the products are spontaneously created,
somewhat like pair creation of an electron-positron pair
from the vacuum, in the presence of a strong electric field.
The case of zero products and a finite number of reactants
represents a process of annihilation of the reactant
molecules, such as in electron-positron pair annihilation.
Note that a reaction with zero reactants and zero
products is not permitted by the Dizzy system. The
cases of zero reactants or zero products are somewhat
degenerate, but are useful for defining a signal molecule
with an (ensemble-averaged) equilibrium population that is a time-dependent
function. For example, one can model a signal molecule
"S" whose equilibrium population
is a specified function of time by considering two separate
It is permissible for a chemical species to participate in a reaction as both a reactant and a product, as shown here:
    A + B -> A + C
In such a reaction, a single molecule of species
A is used in the reaction, but also
produced, so the net change in the population of
species A from this reaction is zero.
Note that the above reaction definition is not a good model of
catalysis. A simple model of catalysis in which species A
catalyzes the transformation of species B into
species C would involve three separate
reactions, as shown here:
    A + B -> AB
    AB -> A + B
    AB -> A + C
with appropriate conditions on the relative rates of the second and third reactions. Note that the species named
"AB" represents the
enzyme-substrate complex.
The above discussion assumes that species participating
in reactions are dynamical species.
As described above in the species
section, a species can also be defined as a "boundary"
species. In this case, the population of the species is
not dynamical but instead a boundary condition of the
system. As an illustration, suppose that species
X is declared as a boundary species.
Even if species X were to appear in a
reaction as a reactant, such as in the reaction
X + A -> B, the population of species
X would not be affected by the occurrence
of this reaction. This is mostly useful for defining
a species whose role in a system is as an externally
applied "signal" or "input".
Note that special notation is used to describe a
boundary species in the Chemical Model Definition
Language (CMDL), as described below.
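The effect of declaring a species as a boundary species can be sketched in Python. The helper fire below is hypothetical and only illustrates the bookkeeping for one occurrence of the reaction X + A -> B when X is a boundary species; it says nothing about how Dizzy represents reactions internally.

```python
def fire(reactants, products, populations, boundary):
    """Apply one occurrence of a reaction, leaving boundary species untouched."""
    for name in reactants:
        if name not in boundary:
            populations[name] -= 1
    for name in products:
        if name not in boundary:
            populations[name] += 1

# X is a boundary species; A and B are dynamical species.
pops = {"X": 10.0, "A": 5.0, "B": 0.0}
fire(["X", "A"], ["B"], pops, boundary={"X"})
print(pops)
```

After the reaction occurs, the populations of A and B have changed by one molecule each, while the population of X is unchanged.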
Reaction Rates
A reaction rate is defined as the probability density per unit time of a given chemical reaction occurring. In the Dizzy system, there are two methods of defining reaction rates, the built-in method and the custom expression method. The built-in method is the default method used, and it is preferred for reasons of computational performance (speed).
In the built-in method of defining a reaction rate, one specifies a numeric reaction parameter. The units of the reaction parameter depend on the reaction rate species mode attribute of the model with which the reaction is associated.
If the model's reaction rate species mode is molecules (the default), the reaction parameter represents the numeric reaction probability density per unit time, per distinct combination of reactant molecules. The reaction rate is then obtained by first computing the number of distinct combinations of reactant molecules (which depends on the populations of the various reactant species), and multiplying this number by the reaction parameter for the reaction. The result is the reaction rate.
If the model's reaction rate species mode is concentration, the reaction parameter represents the kinetic constant for the reaction, in units of inverse molar concentration to the power of the number of reactant species, per unit time. In this case, the concentration of each reactant species is computed, and the concentrations are multiplied together (with suitable exponentiation for a reactant species that has a stoichiometry greater than one). The result is then multiplied by the reaction parameter, to produce the reaction rate.
In the custom expression method of defining a reaction rate, one specifies a textual reaction rate expression. This expression is a mathematical expression involving symbols, arithmetic operators, and simple built-in mathematical functions. Symbols can be species names or parameters. A species name appearing in the expression represents either the number of molecules of the species, or the species concentration, depending on the reaction rate species mode of the model with which the reaction is associated. The custom expression method is less desirable than the built-in method, due to the computational overhead of evaluating the mathematical expression for each reaction event during the simulation of the model.
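The two built-in rate computations can be sketched in Python as shown below. The sketch assumes that "distinct combinations of reactant molecules" means the usual binomial count (for example, n*(n-1)/2 pairs for a reaction consuming two molecules of the same species); the function names and numeric values are hypothetical.

```python
import math

def rate_molecules_mode(c, reactants, populations):
    """Reaction parameter c times the number of distinct combinations of
    reactant molecules. reactants maps species name -> stoichiometry."""
    combos = 1
    for name, stoich in reactants.items():
        combos *= math.comb(int(populations[name]), stoich)
    return c * combos

def rate_concentration_mode(k, reactants, concentrations):
    """Kinetic constant k times the product of reactant concentrations,
    each raised to the power of its stoichiometry."""
    product = 1.0
    for name, stoich in reactants.items():
        product *= concentrations[name] ** stoich
    return k * product

# A + B -> ... with 10 molecules of A and 20 of B: 200 distinct pairs.
print(rate_molecules_mode(0.1, {"A": 1, "B": 1}, {"A": 10, "B": 20}))
# 2 A -> ... with 10 molecules of A: 45 distinct pairs.
print(rate_molecules_mode(2.0, {"A": 2}, {"A": 10}))
# 2 A -> ... in concentration mode.
print(rate_concentration_mode(3.0, {"A": 2}, {"A": 0.5}))
```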
Multistep Reactions
The Dizzy system allows for defining an N-step process as a single composite "reaction". This is an experimental feature that still needs further testing before it can be considered reliable. As a more reliable and better-tested alternative, consider using the delayed reaction construct.
A multistep reaction is assumed to consist of N irreversible elementary reactions that are chained together, as shown here:
    S0 -> S1 -> S2 -> S3 -> S4 -> ... -> SN
One should note that it is possible to define each of these reaction steps separately, as shown here:
    S0 -> S1, k;
    S1 -> S2, k;
    ...
The loop construct can make the above definition easier. However, if the following conditions are met:
    S0 -> S1, k, steps: N;
where k is the rate value for each
elementary reaction, and N is the
number of reaction steps in the composite multistep
reaction. If the value specified for N
is less than or equal to 15, the Dizzy simulator
will just insert the N separate reactions into
the model. If the value
N is greater than 15, the Dizzy simulator
will treat the cascade of reactions as a single
"multistep" reaction, using a history-dependent
mechanism for evaluating the probability density of producing
a molecule SN at any given time. This method
is described in the paper of Gibson
and Bruck.
Multistep reactions are useful for simulating processes such as transcription and translation, in which a long sequence of individual reaction steps transforms the system from an initial state ("polymerization complex") to a final state ("completed transcript").
Delayed Reactions
The Dizzy system allows for defining a reaction
process containing an intrinsic "delay".
This can be useful for phenomenologically modelling
complex processes for which the detailed dynamics are
not known, but for which the overall rate and the
total time for the process to occur are known. A
delayed reaction must have exactly one reactant and one product
species. The delayed reaction takes up the reactant and produces
the product molecule, at the specified rate. However,
the rate of production of the product species depends upon
the number of reactant molecules at a time in the past equal
to the "delay" time specified for the reaction. For
reactant S0 and product S1 and delay
s and rate k, the delayed reaction
is equivalent to the following differential equations:
    dS0/dt = -k * S0(t)
    dS1/dt = k * S0(t - s)
To define a delayed reaction equivalent to the above, the command language statement would be:
    S0 -> S1, k, delay: s;
where k is the rate value for the
reaction, and s is the delay.
The delay time is in the same units as the time scale
used for all kinetic parameters in the model, and must
be a nonnegative number. Specifying a delay time of zero
is equivalent to having no delay at all.
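The delayed rate equations given in this section can be integrated numerically with a simple fixed-step scheme that keeps a history of past values of S0. The Python sketch below is illustrative only and is not how Dizzy evaluates delayed reactions; it assumes that S0 was zero before the start of the simulation, so that no product appears until the delay has elapsed. The function name and parameter values are hypothetical.

```python
import math

def integrate_delayed(s0_init, k, delay, h, t_end):
    """Euler integration of dS0/dt = -k*S0(t), dS1/dt = k*S0(t - delay).
    Assumes S0 was zero before t = 0, so nothing is produced until t = delay."""
    steps = int(round(t_end / h))
    lag = int(round(delay / h))      # delay expressed in whole time steps
    history = [s0_init]              # history[n] holds S0 at time n*h
    s1 = 0.0
    for n in range(steps):
        s0_now = history[n]
        s0_past = history[n - lag] if n >= lag else 0.0
        s1 += h * k * s0_past
        history.append(s0_now + h * (-k * s0_now))
    return history[-1], s1

s0_end, s1_end = integrate_delayed(100.0, 1.0, 0.5, 0.001, 2.0)
print(s0_end, 100.0 * math.exp(-2.0))          # numerical vs. analytic S0
print(s1_end, 100.0 * (1.0 - math.exp(-1.5)))  # numerical vs. analytic S1
```

Note that the sum of S0 and S1 is not constant while molecules are "in transit" during the delay. With a delay of zero, the scheme reduces to the ordinary undelayed reaction.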
Models
In the Dizzy system, a model is a collection of one or more reactions, together with all of the chemical species involved in the reactions, and any parameters defined for the model or the reactions. In addition, a model contains all of the compartments with which the species are associated. A model also incorporates the initial species populations.
A model has an important attribute called the reaction rate species mode. This attribute controls how a given species contributes to a reaction rate. It has two possible values, molecules and concentration. Each will be defined in turn.
In the molecules reaction rate species mode, the contribution of any given species to a reaction rate is always computed using the number of molecules of the species. In the case of the default method of computing the reaction rate, this means that the reaction rate is computed as the product of the number of distinct combinations of reactant molecules, and the reaction parameter. The molecules reaction rate species mode is the default.
In the concentration reaction rate species mode, the contribution of a given species to a reaction rate is computed using the molar concentration of the species (number of moles of the species, divided by the volume of the compartment).
The Chemical Model Definition Language (CMDL) is a simplified
model definition language designed to minimize the amount of repetitive
typing required to define a model. The default file extension of model
definition files in the CMDL language is the ".cmdl"
suffix. The alternative extension ".dizzy" is also
understood to indicate a CMDL file.
Character Encoding
All CMDL files are required to be text files in UTF-8 encoding, including any comments that they contain. This is to ensure uniform behavior of the Dizzy parser on all platforms, regardless of the default character encoding used in the particular locale. In particular, Red Hat Linux distributions subsequent to version 8 employ UTF-8 encoding as the default character encoding; therefore, care must be used to avoid embedding non-UTF-8 characters within a CMDL file.
Symbol Values
A fundamental concept in the CMDL language is the symbol value. A symbol value is an association between a symbol name and a value. A value may be defined as a mathematical expression, in which case it is immediately evaluated and the resulting floating point value is associated with the symbol. Or, the value may be defined as a bracketed mathematical expression (enclosed in square brackets), in which case the expression itself is stored and associated with the symbol name. The former type of value (immediately evaluated expression) is akin to a numeric macro. The latter type of value (expression with deferred evaluation) is similar to a symbolic function definition in Mathematica.
As mentioned previously, symbol names must be unique. This means that you cannot use the same symbol name for two different purposes. For example, it is illegal to define both a species "S" and a compartment "S". All elements of the Dizzy system (reaction names, species names, compartment names, and parameter names) live in the same "namespace", and so each element must have a globally unique name.
In the CMDL, compartments, species, and parameters all start out as symbol value definitions like this:
    S = 1.0;
In this example, "S" is the symbol name, and the value is 1.0. The Dizzy parser determines that a given symbol name is a species, compartment, or parameter based on how the symbol is subsequently used. For example, if the symbol "S" appears as a reactant in a subsequent reaction definition,
    r1, A + S -> C + D, 1.0;
the symbol "S" is automatically promoted to be a species. It cannot be subsequently used as a compartment or other type of symbol. For the case of a compartment, suppose that a symbol "comp" is defined as shown here:
    comp = 2.0;
If this is later followed by a statement such as:
    S @ comp;
the parser will automatically promote the symbol "comp" to be a compartment (and "S" will be promoted to be a species, if this has not already happened). If at the end of processing all statements in the model definition file, there are symbols left that are neither species nor compartments, these symbols are added to the model as global parameters.
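The promotion rule can be sketched in Python for a very small subset of the language. The function classify_symbols below is hypothetical and is far simpler than the real parser: it only recognizes numeric definitions, reactions written with "->", and "@" compartment assignments, it ignores reaction names and rate values, and it performs no error checking (for example, it does not reject a species that is later used as a compartment).

```python
def classify_symbols(source):
    """Classify symbols in a tiny CMDL-like subset (numeric definitions,
    reactions, and '@' compartment assignments only)."""
    kinds = {}
    for raw in source.split(";"):
        stmt = raw.strip()
        if not stmt:
            continue
        if "->" in stmt:
            # Reaction: optional name, then "reactants -> products", then rate.
            parts = [p.strip() for p in stmt.split(",")]
            equation = next(p for p in parts if "->" in p)
            for side in equation.split("->"):
                for species in side.split("+"):
                    if species.strip():
                        kinds[species.strip()] = "species"
        elif "@" in stmt:
            species, compartment = (p.strip() for p in stmt.split("@"))
            kinds[species] = "species"
            kinds[compartment] = "compartment"
        elif "=" in stmt:
            # Every definition starts out as a candidate global parameter.
            name = stmt.split("=")[0].strip()
            kinds.setdefault(name, "parameter")
    return kinds

text = """
S = 1.0; A = 2.0; C = 0.0; D = 0.0;
comp = 2.0; k = 0.5;
r1, A + S -> C + D, 1.0;
S @ comp;
"""
print(classify_symbols(text))
```

Symbols that are still classified as "parameter" once every statement has been processed correspond to the global parameters described above.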
Statements
The CMDL language is centered around the concept
of a statement. Model definition files are broken into
statements by use of the reserved symbol ";",
the semicolon. Each statement must terminate with a semicolon, even
if there is only one statement in the file. The CMDL
model definition file is tokenized and parsed by the parser, and
turned into an ordered collection of statements that are executed
by the scripting engine. In this way, there is a logical decoupling
between parsing of model definition files and the execution of the
statements defining the model. Statements are processed in the order
in which they appear in a model definition file.
There are two types of statements, known as statement categories. The first and most important category of statements is known as model definition statements. This category of statements includes all statements that define model elements, such as species, reactions, parameters, etc.
The second category of statements is known as action performers. These statements instruct the scripting engine to perform a concrete action, such as conducting a simulation, exporting a model definition, printing the contents of a data structure, etc. This category of statements is supported only for use with the command-line interface to the Dizzy system. The graphical user interface for the Dizzy system allows only the model definition statement category, and ignores any statements in the "action performers" category. This is because in the graphical user interface, various graphical elements (menu items, dialog boxes, and other controls) are used to instruct the application to perform actions, rather than the scripting language.
File Inclusion
In the Dizzy system, a model definition file may include
another model definition file. This include mechanism is
supported by the command-language-based model definition
language (CMDL). Model definition file inclusion works much as it
does with the preprocessor in the C programming language.
The parser splices the text of the included file into the
including file, at exactly the point where the "include
directive" occurs. There is a built-in mechanism to
prevent cyclic inclusion of files. If file A
includes file B, and file B
includes file A, then the parser will simply
ignore the include directive inside file B,
since it will already have processed file A.
The include mechanism is useful for separating out "boilerplate" macro definitions that are shared from model to model. In addition, the include mechanism may be useful for extending a model.
The specific syntax for including a model definition file within another model definition file is shown here:
#include "myFile.cmdl";
where "myFile.cmdl" is the name of the file
that is to be "included". The contents of
"myFile.cmdl" are parsed at exactly the
point where the include statement is encountered in the file,
after all statements in the including file preceding
the include statement have been parsed.
Note that the double-quotes and the semicolon are required.
It is not allowed to embed a file inclusion statement
inside a loop construct. Normally,
file includes are placed at the top of a model definition file,
in order to load a separate file containing "boilerplate"
macro definitions and reaction definitions that are shared
between different models.
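The splice-and-skip behavior described in this section can be illustrated with a short Python sketch. It is not Dizzy's parser: the function name expand_includes, the in-memory "files" dictionary, and the regular expression are assumptions made for illustration. It shows only the two rules stated above, that included text is spliced in at the point of the directive, and that a file already being processed is not included again.

```python
import re

# Matches the directive form shown above: #include "myFile.cmdl";
INCLUDE_RE = re.compile(r'#include\s+"([^"]+)"\s*;')

def expand_includes(name, files, seen=None):
    """Return the text of files[name] with include directives spliced in.

    `files` is a hypothetical stand-in for the file system: a dict that
    maps file names to their contents.
    """
    if seen is None:
        seen = set()
    seen.add(name)

    def splice(match):
        target = match.group(1)
        if target in seen:
            # Cyclic (or repeated) inclusion: ignore the directive.
            return ""
        return expand_includes(target, files, seen)

    return INCLUDE_RE.sub(splice, files[name])

files = {
    "A.cmdl": '#include "B.cmdl";\nk = 1.0;\n',
    "B.cmdl": '#include "A.cmdl";\nN = 10.0;\n',
}
expanded = expand_includes("A.cmdl", files)
```

In this example, file B's directive to include file A is dropped, and the contents of B appear ahead of the statement "k = 1.0;" from A.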
Comments
The CMDL provides a mechanism for
embedding explanatory comments in a model definition file.
A comment is a fragment of text beginning with a double-slash
("//"). All text from the
double-slash to the end of the line is considered
by the parser to be a comment, and is ignored. The following
is an example of a comment:
// this is a comment
It is acceptable to include a comment on the same line as an
actual statement, as shown here:
<some statement>; // this is a comment
In the above example, the statement would be parsed, but everything after (and including) the double-slash would be ignored.
The Dizzy system also supports multi-line comments. The syntax for a multi-line comment is identical to a comment in the C programming language:
/* this is a comment
   this is also a comment */
The parser will ignore anything between the "/*"
and the "*/" symbols. This can be useful for
temporarily commenting out multiple lines of your model definition file,
as shown here:
/* commented out temporarily for testing purposes (sramsey 2003/05/09)
G4_dimer_binding_at_DNA4, G4D_free + DNA4 -> G4D_DNA4, kf;
G4_dimer_debinding_at_DNA4, G4D_DNA4 -> G4D_free + DNA4, kr;
*/
Using the multi-line comment syntax for this case is easier because you do not have to add a double-slash "//" to each
line that you are commenting out.
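As a rough illustration of the two comment forms, the following Python sketch removes both kinds of comment from a piece of model text. It is a simplification and not the Dizzy parser: the function name strip_comments is an assumption, and the sketch does not treat comment markers inside quoted strings specially.

```python
import re

def strip_comments(text):
    """Remove /* ... */ and // comments from model text (simplified)."""
    # Multi-line comments first, so a "//" inside them is removed with them.
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
    # Then single-line comments: from the double-slash to the end of line.
    return re.sub(r"//[^\n]*", "", text)

source = "k = 1.0; // rate\n/* off\nA = 1; */B = 2;"
cleaned = strip_comments(source)
```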
Exporter Plug-ins
The Dizzy system has a framework of plug-ins for exporting
a Dizzy model to different file formats.
Each exporter plug-in has an exporter alias defined. The
default exporter is the exporter for Systems Biology
Markup Language (SBML). The SBML exporter has the exporter
alias "markup-language" (quotes not included).
The full list of exporters is defined here:
markup-language - Exports a model instance to SBML.
human-readable - Exports a human-readable textual representation of the model.
orrell-column-format - Exports a model instance to a numeric-column format used at the Institute for Systems Biology for certain software tools. Only models that do not contain any expressions may be exported to this format.
command-language - Exports a model instance to the Chemical Model Definition Language (CMDL) format.
Additional exporter plug-ins may be defined in the future.
Viewer Plug-Ins
The Dizzy system has a framework for pluggable model viewer modules. The viewer plug-ins are only available in the graphical user interface to Dizzy. A viewer is used to display a model on the screen, for example as a graphical or textual representation. This is distinct from exporting the model to a different file format. Each model viewer plug-in has an alias, which is a short textual name that identifies it. The full list of model viewers is defined here:
cytoscape - Display the model in Cytoscape.
human-readable - Display the model in human-readable format.
Simulator Plug-Ins
The Dizzy system has a framework for pluggable simulation modules. Each simulator plug-in has a simulator "alias". The available simulator plug-ins are:
gillespie - An implementation of the Gillespie stochastic algorithm for modeling the reaction kinetics. This algorithm is described in the article
ODE - An implementation of an ODE-based deterministic simulator of the reaction kinetics. The implementation is based on a 5th-order Runge-Kutta integrator with an adaptive stepsize controller.
Each simulator above is identified by its "simulator alias". For example, the simulator alias for the Gillespie algorithm is "gillespie".
Default Model Elements
One of the distinguishing aspects of the CMDL is that several core elements are created by default:
- A default model, named "model". The model name can be modified with the "#model" statement.
- A default compartment, with the name "univ" and a volume of 1.0 liters.
- All species are automatically assigned to the default ("univ") compartment.
- All reactions are automatically associated with the default model.
The default "model" in the CMDL language has a reaction rate species mode of "molecules". In order to use a reaction rate species mode of "concentration", it is presently necessary to use the programmatic interface to Dizzy.
Reaction Statements
A CMDL model definition (usually, but not always, a single file) consists of a series of statements which are text strings separated by semicolons. Whitespace and newlines are ignored by the CMDL parser, except in a quoted string environment, where whitespace is interpreted literally and newlines are illegal.
The centerpiece of the chemical model definition language is the reaction statement. A reaction statement defines a one-way chemical reaction in which zero or more chemical species participate as reactants, and zero or more chemical species participate as products. The species appearing in the reaction definition must be previously defined symbols. Reaction statements have three elements separated by commas. The first element is the reaction name, and it is an optional element. The second element defines the reactants and products. The third element defines the reaction rate. The following example shows a reaction statement, along with the preceding species symbol definitions:
A = 100.0;
B = 100.0;
C = 0.0;
D = 0.0;
creation_of_c_d, A + B -> C + D, 1.0;
As explained above, each statement ends with the semicolon character ";".
Note the use of the comma character ","
to separate different elements of the statement.
These statements cause the following definitions to be made:
- species A, in the default compartment
- species B, in the default compartment
- species C, in the default compartment
- species D, in the default compartment
- reaction creation_of_c_d, in which species A and B participate as reactants, and in which species C and D are products
- the rate of the creation_of_c_d reaction is computed using the built-in method, with a reaction parameter of 1.0
Because the reaction name is optional, the same reaction may also be written without it:
A + B -> C + D, 1.0;
In this case, a default reaction name is assigned by the Dizzy parser.
In many cases, it is desirable to specify a custom reaction rate, rather than using the built-in method of computing the reaction rate. To specify a custom reaction rate, the reaction rate element should be defined as a string data type. This is accomplished by substituting a bracketed mathematical expression. In the example of the above reaction, one might write:
creation_of_c_d, A + B -> C + D, [2.0 * A * B];
The square brackets are required in order to tell the parser that the reaction rate is a custom reaction rate, for deferred (rather than immediate) evaluation. The text inside the square brackets is parsed as a mathematical expression that typically involves species name symbols, numeric literals, and arithmetic operators. In addition, a mathematical expression may involve built-in functions (e.g.,
ln() or sin()) and
special symbols such as "time".
In the above example, the reaction rate will be computed
as the product of the number 2.0 times the
product of the populations of species A and
B, in molecules.
It should be emphasized that the CMDL parser determines which type of reaction rate computation method to use, based on whether or not the reaction rate is specified using square brackets. As explained above, if the rate expression is not bounded by square brackets, the parser will use the built-in reaction rate computation method. If the rate expression is bounded by square brackets, the parser will assume that the reaction rate computation method is a custom rate.
Note that specifying a custom reaction rate using a mathematical expression (in square brackets) has a significant drawback, relative to using the built-in method of computing the reaction rate. The custom reaction rate is more expensive to compute, in terms of the number of CPU instructions per reaction event in the simulation.
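To make the distinction concrete, the following Python sketch compares the two rate computations for the reaction "A + B -> C + D". It rests on an assumption: the built-in method is taken to be the standard combinatoric propensity of the Gillespie formulation (the reaction parameter multiplied by the number of distinct reactant combinations), which this section does not spell out. The function name combinatoric_rate is hypothetical.

```python
from math import comb

def combinatoric_rate(parameter, reactants, populations):
    """Assumed built-in rate: parameter times the number of distinct
    combinations of reactant molecules (standard Gillespie propensity)."""
    rate = parameter
    for species, stoichiometry in reactants.items():
        rate *= comb(int(populations[species]), stoichiometry)
    return rate

populations = {"A": 100, "B": 100, "C": 0, "D": 0}

# creation_of_c_d, A + B -> C + D, 1.0;
builtin = combinatoric_rate(1.0, {"A": 1, "B": 1}, populations)

# creation_of_c_d, A + B -> C + D, [2.0 * A * B];
# A custom rate is an expression re-evaluated from the current populations.
custom = 2.0 * populations["A"] * populations["B"]
```

Under this assumption, the two forms differ only in how the number is obtained: the built-in form needs one parameter and the reactant counts, while the custom form requires evaluating an arbitrary expression each time the rate is needed.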
Symbol Values and Expressions
Often, a single value will be used frequently
throughout a model definition file. For example, a reaction
rate parameter of 1.0 may be used for many reactions
in a given model definition file. This can make it tedious
to modify the parameter once the model definition file has
been written. The symbol value
construct makes it easier to centralize definition of numeric
values in the model definition file. One can define a symbol
such as this:
k = 1.0;
This declares a symbol "k" and associates it with the value 1.0.
This symbol may be referenced in any subsequent mathematical expression in the model definition file. For example, we may define a reaction whose rate is "2.0 * k":
creation_of_c_d, A + B -> C + D, 2.0 * k;
The absence of square brackets around the "2.0 * k" expression instructs the parser to immediately evaluate this expression, and to use the resultant value as the rate for the reaction.
In addition, the "k" symbol may be referenced in a deferred-evaluation mathematical expression, in which case "k" is evaluated as a model parameter (this works because the "k" symbol will be added to the model as a parameter, if it is not used as a species or compartment anywhere in the model):
creation_of_c_d, A + B -> C + D, [2.0 * k * A * B];
In the above example, the rate of reaction "creation_of_c_d" is defined as the expression "2.0 * k * A * B", as a deferred evaluation expression. This means that each time the rate needs to be computed, this expression is used. This is sometimes referred to as a "custom rate expression", to distinguish it from the built-in (combinatoric) method of computing the reaction rate based on a floating-point reaction parameter.
A mathematical expression may also be embedded within a symbol name using a double-quote syntax, as shown here:
k1 = 1.0;
"my_reaction[2.0 * k1]", A + B -> C + D, 1.0;
In the above, the parser detects the square brackets within the quoted string, and attempts to immediately evaluate the expression within the square brackets. The result of the evaluation is converted to an integer, and interpolated into the quoted string. This example would therefore be equivalent to the reaction definition:
k1 = 1.0;
"my_reaction2", A + B -> C + D, 1.0;
The decimal point does not appear in "my_reaction2" because the numeric value within the string has been converted to an integer. This is occasionally useful in combination with the loop construct defined below. This technique can also be used in defining a symbol value, as shown here:
k1 = 1.0;
"k[ 2.0 * k1 ]" = 1.0;
The above example is equivalent to:
k1 = 1.0;
k2 = 1.0;
after translation by the parser.
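The interpolation of a bracketed expression into a quoted symbol name can be mimicked in a few lines of Python. This is a toy model, not the Dizzy parser: the function name interpolate is hypothetical, Python's eval stands in for the CMDL expression evaluator (so only Python-compatible arithmetic such as +, -, * and / behaves the same way), and the integer conversion is assumed to truncate.

```python
import re

def interpolate(name, symbols):
    """Replace each [expression] in a quoted name by its integer value."""
    def evaluate(match):
        expression = match.group(1).strip()
        # eval is used only as a stand-in evaluator for this illustration.
        value = eval(expression, {"__builtins__": {}}, dict(symbols))
        return str(int(value))  # assumed: conversion truncates
    return re.sub(r"\[([^\]]*)\]", evaluate, name)

symbols = {"k1": 1.0}
reaction_name = interpolate("my_reaction[2.0 * k1]", symbols)
parameter_name = interpolate("k[ 2.0 * k1 ]", symbols)
```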
The mathematical expression facility allows
for using the special symbol time
to include simulation time. As an illustration,
consider the following example reaction definition:
my_reaction, A + B -> C + D, [2.0 * A * B * time^0.5];
The above defines a reaction in which the reaction rate increases as the square root of the elapsed time.
In addition to the time symbol,
there is a special symbol Navo that
defines the Avogadro constant. This is occasionally
useful if you have a numeric macro that you wish
to specify in terms of moles. For example:
k = 3.6 * 10^(-45);
my_reaction, A + B -> C + D, [k * Navo * A * Navo * B];
In the above definition, the symbol k
is defined in terms of inverse moles squared.
So to regain the correct units in the reaction
rate expression, the factor Navo
is multiplied by each species symbol appearing
in the reaction rate expression.
Specifying Species Populations
In the CMDL, the initial populations of species are specified using a numeric element, as shown here:
A = 100;
B = 100;
my_reaction, A -> B, 1.0;
In the above example, species A and
B have their initial populations set
to 100 molecules. The appearance of the symbols
A and B in the reaction
definition statement causes the Dizzy parser to
understand that they are species.
Instead of using a numeric literal to define the initial species population, it is also possible to use a mathematical expression, as shown here:
N = 10.0;
A = N * 10.0;
B = N * 10.0;
my_reaction, A -> B, 1.0;
In this case, the symbol N is
defined as a numeric macro with value 10.0.
The initial populations of A
and B are defined as the values
of the expressions on the right-hand sides of
their definitions. In the above example, the initial populations
of species A and B
are set to 100.
The above discussion has assumed that species defined in chemical reactions are always dynamical species. In order to define a boundary species that participates in a chemical reaction, the dollar sign is used, as shown in the following example:
A = 100;
B = 100;
my_reaction, $A -> B, 2.0;
In the above example, species A is
defined as a boundary species with a population
of 100. Species B is
defined as a dynamical species with an initial
population of 100. Recall that the population
of a boundary species is unaffected by the
occurrence of a chemical reaction involving
that species. Therefore, defining the population
of a boundary species differs from that of a
dynamical species, in that the population definition
of a boundary species is not just for the initial
time, but for all times during the simulation.
Given that the above definition
specifies the built-in method of computing the
reaction rate, the above definition could be
simplified in the following way:
A = 100;
B = 100;
my_reaction, -> B, [2.0 * A];
In the above example, A has been
changed into a parameter, and eliminated as
a species in the reaction that produces species
B. The reaction rate has been increased
by a factor of the value of A.
As with dynamical species, the population definition for a boundary species may involve a mathematical expression that is evaluated immediately by the parser:
num = 10;
A = num * 10;
B = 100;
my_reaction, $A -> B, 2.0;
The right-hand side of the statement defining the population of
A is still a numeric element,
in the above example, which means that the population
value is determined by the parser, and stored as an
invariant floating-point value.
The population of a boundary species may also be defined as a late-evaluation mathematical expression. This means that the actual expression is stored by the parser, rather than the value of the expression evaluated initially. This is accomplished by using a string element rather than a numeric element, on the right-hand side of the definition of the boundary species population. One can define a boundary species with a non-integer population value, as shown here:
A = 0.5;
B = 100;
my_reaction, $A -> B, 2.0;
This use of a boundary species is not necessary in this case, because it is possible to just rescale the reaction parameter and eliminate the boundary species "A" as a
participant in the reaction. A more useful case is when
one wishes to define a boundary species whose value actually
varies over time, according to a function defined
by the user. In this case, the boundary species is still
not dynamical; the time dependence of its population value
is governed by a mathematical expression involving the
special symbol time, as in the following example:
A = [10 * time];
B = 100;
my_reaction, $A -> B, 2.0;
In this case, the boundary species A
has a population that is linearly increasing with time.
Specifying the boundary species population as a mathematical expression has a significant limitation, in that it makes the simulation of the model more computationally complex, and therefore, slower. Therefore, one should specify a boundary species population using a string representation of a mathematical expression only in cases where it is required to have a time-varying boundary species population that is externally controlled.
Note that it is illegal to attempt to define the initial population of a dynamical species as a string element containing a mathematical expression.
The definition of the boundary species population
may include the special symbol time.
This is useful in cases where it is desired to
model the effects of a signal molecule whose
population is externally controlled.
Loops
The CMDL language contains a looping construct that permits defining a family of model elements where a single counter is incremented for each member of the family. For example, one might wish to generate a family of species and chemical reactions, parameterized by an integer. The species might be
A1, A2, A3, A4
B1, B2, B3, B4
and the reactions might be of the form
An -> Bn
This can be accomplished with the "loop"
keyword, as shown here:
loop (i, 1, 4)
{
"reaction_[i]", "A[i]" -> "B[i]", 1.0;
}
In the above example, the symbol "i" is
the loop index, and it is incremented from 1 to 4 in steps
of one. For each iteration of the loop, all statements between
the curly braces are executed. More than one statement can occur
between the curly braces, although only one statement is shown here.
The [i] tokens represent evaluating a mathematical
expression, in which i appears as a symbol. The
i symbol is essentially a numeric macro whose value
is incremented for each iteration of the loop. It is important
to note that in the above example, the double quotes are required.
This is because the statement elements specifying the reaction
name and the reactant/product species are symbol names, and
a mathematical expression can be embedded in a symbol name
only by placing the square-bracket construct inside a quoted
string.
After the loop statement and the embedded mathematical expressions
are processed by the parser, the above example is equivalent to:
reaction_1, A1 -> B1, 1.0;
reaction_2, A2 -> B2, 1.0;
reaction_3, A3 -> B3, 1.0;
reaction_4, A4 -> B4, 1.0;
A less trivial example of using the looping construct would be to define a cascade of reactions, as shown here:
loop (i, 1, 4)
{
"reaction_[i]", "A[i]" -> "A[i+1]", 1.0;
}
This example defines a cascade of reactions that ultimately
convert species A1 into species A5.
After the loop statement and the embedded mathematical expressions
are processed by the parser, the above example is equivalent to:
reaction_1, A1 -> A2, 1.0;
reaction_2, A2 -> A3, 1.0;
reaction_3, A3 -> A4, 1.0;
reaction_4, A4 -> A5, 1.0;
With both of the above examples, a loop would usually also be used to define the initial species populations. For example, one might define (before the reaction definition loop):
A1 = 100;
loop (i, 2, 5)
{
"A[i]" = 0;
}
This sets the initial population of species A1
to 100, and the initial population of species A2
through A5 to zero.
In defining a loop, note that the start and stop values are numeric elements. This means that they can be simple numeric literals (as shown above), or mathematical expressions, as shown here:
k = 10;
loop (i, 1, k + 2)
{
"reaction_[i]", "A[i]" -> "A[i+1]", 1.0;
}
In the above example, the loop index i would iterate
over the range of integer values from 1 to 12, inclusive.
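The expansion performed by the loop construct can be sketched in Python as follows. This is an illustration only: expand_loop is a hypothetical helper, Python's eval stands in for the CMDL expression evaluator, and the sketch handles a single-statement loop body whose quoted names contain bracketed expressions in the loop index.

```python
import re

def expand_loop(index, start, stop, body):
    """Expand one loop body statement for index = start..stop (inclusive)."""
    statements = []
    for value in range(start, stop + 1):
        def substitute(match):
            expression = match.group(1).strip()
            # Stand-in evaluator; the loop index is the only known symbol.
            result = eval(expression, {"__builtins__": {}}, {index: value})
            return str(int(result))

        def unquote(match):
            # A quoted name becomes a plain symbol after interpolation.
            return re.sub(r"\[([^\]]*)\]", substitute, match.group(1))

        statements.append(re.sub(r'"([^"]*)"', unquote, body))
    return statements

cascade = expand_loop("i", 1, 4, '"reaction_[i]", "A[i]" -> "A[i+1]", 1.0;')
```

The resulting list contains the four cascade reactions given earlier in this section, from "reaction_1, A1 -> A2, 1.0;" through "reaction_4, A4 -> A5, 1.0;".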
Commands
In the CMDL, commands are preceded by the pound sign. The following commands are recognized by the CMDL parser:
The "#model" command sets the model name, as
shown here:
#model "mymodel";
If the model name is not explicitly set using this command, a default model name will be used. The Dizzy program uses the default model name "model". The rules for parsing the model name are the same as for parsing other symbol names.
The "#define" command is a template
definition. For more information, please refer to the
section on templates.
The "#ref" command is a template
reference. For more information, please refer to the
section on templates.
Templates
This section describes the template feature of the CMDL. The template construct is like a parameterized macro. One defines the template, and can later reference the template at various places in the model definition file. Each time the template is referenced, the (suitably modified) body of the template is inserted into the model definition by the CMDL parser. Templates are most useful when used with parameter lists, which allows for passing information "into" and "out of" the body of the template. By default, all symbols within the body of the template are not visible outside the body of the template. This is to avoid "naming collisions" with symbols defined outside the body of the template. Therefore, templates are really only useful to the extent they can interact with other model elements defined within your model definition file; this is accomplished using the template parameter list.
The "#define" command is used to define
a template. Each template has a name, that usually is indicative of
its function. An example template definition is shown here:
#define Gene (X, Y)
{
I = 3.0;
r1, X -> I, 1.0;
r2, I -> Y, 0.1;
}
In the above example, "Gene" is the template name. The
symbols "X" and "Y" are "dummy symbols"
that act like parameters for the template. You may refer to "X" and
"Y" inside the body of the template definition. The body of the template
definition is typed inside the curly braces. The symbols "X" and "Y"
as shown above, should not have been previously defined.
The body of the template definition must be completely self-contained;
it may not refer to any external symbols except "dummy symbols"
provided in the parenthetically delimited list at the beginning of the
template definition (e.g., "X" and "Y" in the above example). The symbol
"I" defined within the template is known as an internal symbol.
It is "scoped" by the command language parser so that its name does not
conflict with any symbols defined outside the body of the template definition.
The "#ref" command is used to reference a template.
The template must have been previously defined using the "#define"
command. In a template reference, you must provide a parenthetic list of
symbols for all of the dummy symbols in the template definition. These
symbols must have been previously defined, before your
"#ref" template reference occurs in the model definition file.
An example is shown here:
A = 100.0;
B = 0.0;
#ref Gene "GAL4" (A, B);
After the template is substituted into the model definition by the parser, the model definition would be as follows:
A = 100.0;
B = 0.0;
GAL4::I = 3.0;
GAL4::r1, A -> GAL4::I, 1.0;
GAL4::r2, GAL4::I -> B, 0.1;
The prepending of the "GAL4::" before each symbol name
is the "scoping" that ensures that symbols defined within the body of
the template definition do not have names that conflict with symbols defined
outside the template definition. Template references may exist within template
definitions (this is simply nesting a template expansion within another
template definition), but it is illegal to nest a template definition within the
body of another template definition.
You may pass a numeric value to a template, as shown here:
#define Gene (A, B, k)
{
r1, A -> B, k;
}
X = 100.0;
Y = 0.0;
#ref Gene "GAL4" (X, Y, 0.5);
In this example, the numeric value is substituted
anywhere the symbol "k" appears within the
body of the template.
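The substitution and scoping rules for templates can be approximated with the following Python sketch. It is a simplified model rather than the CMDL parser: expand_template is a hypothetical helper that treats every identifier in the body as either a dummy symbol (replaced by the supplied argument) or an internal symbol (prefixed with the instance name), and it makes no allowance for special symbols such as "time" or for function names.

```python
import re

def expand_template(body, params, args, instance):
    """Substitute arguments for dummy symbols and scope internal symbols."""
    binding = dict(zip(params, args))

    def rename(match):
        symbol = match.group(0)
        # Dummy symbols take the caller's argument; everything else is
        # internal to the template and gets the "instance::" prefix.
        return binding.get(symbol, f"{instance}::{symbol}")

    # Identifiers only; numeric literals such as 3.0 are left untouched.
    return re.sub(r"\b[A-Za-z_]\w*\b", rename, body)

gene_body = "I = 3.0; r1, X -> I, 1.0; r2, I -> Y, 0.1;"
gal4 = expand_template(gene_body, ["X", "Y"], ["A", "B"], "GAL4")

rate_body = "r1, A -> B, k;"
with_rate = expand_template(rate_body, ["A", "B", "k"], ["X", "Y", "0.5"], "GAL4")
```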
Example CMDL model definition file
The following example illustrates a complete CMDL model definition file:
// Simple model of transcription, in Escherichia coli
// Written by: Stephen Ramsey, October 10, 2004
// Institute for Systems Biology, Seattle, Washington, USA
//
#model "bacteria";
#define fracSatTwoStatesTwoSitesOR( kfp, krp, qp, f0, fracsat )
{
kp = kfp / krp;
kpf0 = [kp * f0];
kpf0_2 = [ kpf0*kpf0 ];
numerator = [ qp*kpf0_2 +
2.0 * kpf0 ];
fracsat = [ numerator / (numerator + 1.0) ];
}
cellVolume = 2.5 * 10^(-15); // Liters
halfLifeMRNA = 5.0; // minutes
halfLifeProtein = 60.0; // minutes
transcriptLength = 1000.0;
proteinLength = transcriptLength / 3.0; // peptides
transcriptionTranslocationRate = 2400.0; // nucleotides/minute
translationTranslocationRate = 96.0; // codons / minute
minimumInterRibosomeDistanceCodons = 26.6; // codons
ribosomeMoleculesPerCell = 20000.0;
mrnaMoleculesPerGene = 15.0;
rnapMoleculesPerCell = 2000.0;
proteinToMRNARatio = 600.0; // dimensionless
protMoleculesPerGene = proteinToMRNARatio * mrnaMoleculesPerGene;
// --------------------------------------
// species
// --------------------------------------
transfac = 61.0;
-> transfac, 61.0;
transfac ->, 1.0;
kfp = 6.25 * 10^(-4);
krp = 1.0;
qp = 7.5;
#ref fracSatTwoStatesTwoSitesOR "twoSitesOR" (kfp, krp, qp, transfac,
fracsat_twoSitesOR);
fracsat = [fracsat_twoSitesOR];
gene = 1.0;
rnap = rnapMoleculesPerCell;
mrna = 0.0;
protein = 0.0;
ribosome = ribosomeMoleculesPerCell;
startTranscript = 0.0;
mrna = 0.0;
finishTranscript = 0.0;
startProtein = 0.0;
protein = 0.0;
finishProtein = 0.0;
// --------------------------------------
// parameters
// --------------------------------------
log2 = 0.693147181;
kd_mrna = log2/halfLifeMRNA;
kd_prot = log2/halfLifeProtein;
rnapBindingKineticRate = kd_mrna * mrnaMoleculesPerGene /
rnapMoleculesPerCell;
protMoleculesPerGene = proteinToMRNARatio * mrnaMoleculesPerGene;
ribosomeBindingKineticRate = kd_prot * protMoleculesPerGene /
(ribosomeMoleculesPerCell * mrnaMoleculesPerGene);
// transcription time delay
transcriptionTimeDelay = transcriptLength / transcriptionTranslocationRate;
// start transcription rate
startTranscriptionRate = rnapBindingKineticRate; // per min, per molec
// transcription rate
transcriptionRate = 1.0 / transcriptionTimeDelay;
// translation time delay
translationTimeDelay = proteinLength / translationTranslocationRate; // minutes
// start translation rate
startTranslationRate = ribosomeBindingKineticRate; // per min, per molec
// translation rate
translationRate = 1.0 / translationTimeDelay; // codons/minute
// --------------------------------------
// reactions
// --------------------------------------
// start transcription
startTranscription, rnap + $gene -> startTranscript,
[fracsat * startTranscriptionRate * rnap * gene];
// transcription
transcription, startTranscript -> finishTranscript,
transcriptionTranslocationRate, delay: transcriptionTimeDelay;
// finish transcription
finishTranscription, finishTranscript -> rnap + mrna, transcriptionRate;
// degrade mrna
degradeTranscript, mrna -> , kd_mrna;
// start translation
startTranslation, ribosome + $mrna -> startProtein, startTranslationRate;
// translation
translation, startProtein -> finishProtein, translationRate,
delay: translationTimeDelay;
// finish translation
finishTranslation, finishProtein -> ribosome + protein, translationRate;
// degrade protein
degradeProtein, protein -> , kd_prot;
This model definition file contains a very simple model
of transcription in Escherichia coli. Please
note that this model is provided purely for pedagogical purposes, and
thus, no citations to the literature are included for the parameters
in the model.
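The derived parameters in the example model follow from simple arithmetic on the constants defined at the top of the file. The Python sketch below repeats that arithmetic outside of Dizzy, using snake_case variable names of its own choosing, as a way to check the magnitudes involved. It reproduces only the immediately evaluated definitions, not the simulation.

```python
LOG2 = 0.693147181                    # as defined in the model file

half_life_mrna = 5.0                  # minutes
half_life_protein = 60.0              # minutes
transcript_length = 1000.0
protein_length = transcript_length / 3.0
transcription_translocation_rate = 2400.0   # nucleotides/minute
translation_translocation_rate = 96.0       # codons/minute
ribosome_molecules_per_cell = 20000.0
mrna_molecules_per_gene = 15.0
rnap_molecules_per_cell = 2000.0
protein_to_mrna_ratio = 600.0

# First-order decay constants from the half-lives (per minute).
kd_mrna = LOG2 / half_life_mrna
kd_prot = LOG2 / half_life_protein

prot_molecules_per_gene = protein_to_mrna_ratio * mrna_molecules_per_gene
rnap_binding_rate = kd_mrna * mrna_molecules_per_gene / rnap_molecules_per_cell
ribosome_binding_rate = (kd_prot * prot_molecules_per_gene
                         / (ribosome_molecules_per_cell * mrna_molecules_per_gene))

# Time delays (minutes) and the corresponding rates (per minute).
transcription_time_delay = transcript_length / transcription_translocation_rate
translation_time_delay = protein_length / translation_translocation_rate
transcription_rate = 1.0 / transcription_time_delay
translation_rate = 1.0 / translation_time_delay
```

With these inputs, the transcription delay is a little under half a minute and the translation delay is about three and a half minutes.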
Symbol Names
In the CMDL, all symbol names defined (except for parameters that are defined in a reaction namespace) reside within the global namespace. This means that, for example, a compartment and a species cannot have the same name. Each symbol that is defined must have a unique name. Note that in the CMDL, a species symbol appearing in a reaction rate expression can mean one of two things, depending on the reaction rate species mode of the model with which the reaction is associated. For a reaction rate species mode of "molecules" (the default), the symbols A and B in the reaction rate
mathematical expression refer to the numeric populations
of species A and B, respectively.
For a reaction rate species mode of "concentration",
the symbols A and B in the reaction
rate mathematical expression refer to the molar concentrations
of species A and B, respectively.
Dizzy provides a collection of simulators for solving the dynamics of a model. Both stochastic and deterministic simulators are available. The stochastic simulators use a Monte Carlo-type process to approximately solve the stochastic dynamics of the model. The deterministic simulators use finite difference methods to solve the approximate dynamics of the model as a system of ordinary differential equations (ODEs). Each simulator available in Dizzy has a unique simulator alias, which is the short name by which the simulator is referred to throughout this documentation. The simulator is selected by choosing its alias from a list of aliases of all simulators available. In this section, we describe the simulators that are available in Dizzy.
Simulator: gibson-bruck
A stochastic simulator implemented using the Gibson-Bruck algorithm.
Simulator: gillespie
A stochastic simulator implemented using the Gillespie algorithm.
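For readers unfamiliar with this class of algorithm, the following Python sketch shows a textbook form of Gillespie's direct method applied to the reaction "A -> B" with a reaction parameter of 1.0. It is not Dizzy's implementation. The function name gillespie_direct and the representation of a reaction as a (rate function, population change) pair are assumptions made for this illustration.

```python
import math
import random

def gillespie_direct(populations, reactions, t_end, rng):
    """Textbook direct method: returns a list of (time, state) samples."""
    t = 0.0
    state = dict(populations)
    trajectory = [(t, dict(state))]
    while True:
        propensities = [rate(state) for rate, _ in reactions]
        total = sum(propensities)
        if total <= 0.0:
            break                      # no reaction can occur any more
        # Exponentially distributed waiting time to the next event.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        chosen = len(reactions) - 1
        for i, propensity in enumerate(propensities):
            cumulative += propensity
            if threshold < cumulative:
                chosen = i
                break
        for species, delta in reactions[chosen][1].items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory

# my_reaction, A -> B, 1.0;  (rate taken as 1.0 times the population of A)
reactions = [(lambda s: 1.0 * s["A"], {"A": -1, "B": +1})]
trajectory = gillespie_direct({"A": 100, "B": 0}, reactions, 10.0,
                              random.Random(1))
```

Each run with a different random seed gives a different trajectory. The total number of molecules of A and B is conserved at every step.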
Simulator: tauleap-complex
An approximate accelerated stochastic simulator implemented using the Gillespie Tau-Leap algorithm. This implementation is intended for complex models in which the Jacobian matrix is very computationally expensive to compute, and therefore the Jacobian matrix is evaluated first, and then the "mu" and "sigma" functions are computed using the pre-computed Jacobian matrix. Although this method is N-squared in the number of species, it is faster when the Jacobian is very complicated.
Simulator: tauleap-simple
An approximate accelerated stochastic simulator implemented using the Gillespie Tau-Leap algorithm. This implementation is intended for models in which the Jacobian matrix is easy to compute (basically the elements of the Jacobian are at most linear in the species concentration), and so no pre-evaluation of the Jacobian matrix is done. The "sigma" and "mu" functions are pre-computed symbolically. This avoids the N-squared dependence on the number of species. This method scales well when the number of species gets large, as compared to the "tauleap-complex" method.
Simulator: ODE-RK5-fixed
The Dizzy system provides a deterministic simulator
for reaction kinetics, in which the system is modeled as
a set of coupled ordinary differential equations (ODEs).
The simulator alias for this simulator is "ODE-RK5-fixed".
The differential equations are solved using a finite
difference method, specifically the 5th-order Runge-Kutta
algorithm with a fixed stepsize. The step size must be specified
by the user, as a fraction of the total time interval for the
simulation. The accuracy of this simulator's solution will
depend on the size of the time-step fraction that is specified.
Note about disabling error checking: You may delete the maximum absolute and/or relative error tolerances in the simulation launcher screen, in order to run the simulator with no absolute and/or relative error checking. This feature is currently only available for the fixed-stepsize Runge-Kutta integrator.
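The dependence of accuracy on the time-step fraction can be seen in a small Python experiment. The sketch below is not Dizzy's integrator: for brevity it uses the classical 4th-order Runge-Kutta scheme in place of the 5th-order scheme named above, and the function name integrate_fixed_step is hypothetical. It integrates the deterministic equations for "A -> B" with a rate constant of 1.0 and compares two step fractions against the exact solution.

```python
import math

def integrate_fixed_step(derivative, y0, t_end, step_fraction):
    """Integrate dy/dt = derivative(t, y) from 0 to t_end.

    Uses classical RK4 (a stand-in for the 5th-order scheme) with a
    step size given as a fraction of the total time interval.
    """
    steps = round(1.0 / step_fraction)
    h = t_end / steps
    t, y = 0.0, list(y0)
    for _ in range(steps):
        k1 = derivative(t, y)
        k2 = derivative(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = derivative(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = derivative(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def a_to_b(t, y):
    """dA/dt = -A, dB/dt = +A for the reaction A -> B with rate constant 1.0."""
    a, b = y
    return [-1.0 * a, 1.0 * a]

coarse = integrate_fixed_step(a_to_b, [100.0, 0.0], 1.0, 0.1)
fine = integrate_fixed_step(a_to_b, [100.0, 0.0], 1.0, 0.001)
exact_a = 100.0 * math.exp(-1.0)
```

The smaller step fraction gives a value of A at the end of the interval that is closer to the exact solution, at the cost of one hundred times as many steps.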
Simulator: ODE-RK5-adaptive
The Dizzy system provides a deterministic simulator
for reaction kinetics, in which the system is modeled as
a set of coupled ordinary differential equations (ODEs).
The simulator alias for this simulator is
"ODE-RK5-adaptive".
The differential equations are solved using a finite
difference method, specifically the 5th-order Runge-Kutta
algorithm with an adaptive stepsize controller. The step-size
controller is based on an error estimation formula that is accurate
to 4th order. The user must specify the tolerances for
relative and absolute errors, as well as the initial step size
(as a fraction of the total time interval of the simulation).
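The following Python sketch shows one way an adaptive step-size controller of this kind can work: a 5th-order step is paired with an embedded 4th-order result, and their difference serves as the error estimate. The manual does not name the coefficient set used by Dizzy; the Cash-Karp coefficients, the combined tolerance test, the safety factors, and all function names below are assumptions made for illustration.

```python
import math

# Cash-Karp coefficients (an assumed tableau; the manual does not name one).
C = (0.0, 1/5, 3/10, 3/5, 1.0, 7/8)
A = (
    (),
    (1/5,),
    (3/40, 9/40),
    (3/10, -9/10, 6/5),
    (-11/54, 5/2, -70/27, 35/27),
    (1631/55296, 175/512, 575/13824, 44275/110592, 253/4096),
)
B5 = (37/378, 0.0, 250/621, 125/594, 0.0, 512/1771)               # 5th order
B4 = (2825/27648, 0.0, 18575/48384, 13525/55296, 277/14336, 1/4)  # 4th order

def embedded_step(f, t, y, h):
    """Return the 5th-order result and the |5th - 4th order| error estimate."""
    k = []
    for c_i, a_i in zip(C, A):
        y_stage = y + h * sum(a * k_j for a, k_j in zip(a_i, k))
        k.append(f(t + c_i * h, y_stage))
    y5 = y + h * sum(b * k_j for b, k_j in zip(B5, k))
    y4 = y + h * sum(b * k_j for b, k_j in zip(B4, k))
    return y5, abs(y5 - y4)

def integrate_adaptive(f, y0, t_start, t_end, rel_tol, abs_tol, step_fraction):
    """Integrate a scalar ODE; the initial step is a fraction of the interval."""
    t, y = t_start, y0
    h = step_fraction * (t_end - t_start)
    n_accepted = 0
    while t < t_end:
        h = min(h, t_end - t)                 # do not step past the end time
        y_new, err = embedded_step(f, t, y, h)
        tol = abs_tol + rel_tol * abs(y_new)  # assumed combined tolerance test
        if err <= tol:                        # accept the step
            t, y = t + h, y_new
            n_accepted += 1
        # Rescale the step (assumed safety factor 0.9, change limited to 0.2x..5x).
        factor = 0.9 * (tol / err) ** 0.2 if err > 0.0 else 5.0
        h *= min(5.0, max(0.2, factor))
    return y, n_accepted

# Hypothetical model: first-order decay dA/dt = -0.5 * A, with A(0) = 10.
decay = lambda t, y: -0.5 * y
result, steps = integrate_adaptive(decay, 10.0, 0.0, 4.0, 1e-8, 1e-10, 0.01)
print(result, 10.0 * math.exp(-2.0), steps)
```

A rejected step is simply retried with a smaller step size, so tighter tolerances translate into more, smaller steps rather than a loss of accuracy.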
Simulator: ODEtoJava-dopr54-adaptive
A deterministic simulator implemented by Murray Patterson and Raymond Spiteri, as a part of the "odeToJava" package. This simulator is a 5/4 Dormand-Prince ODE solver with adaptive step-size control.
Simulator: ODEtoJava-imex443-stiff
A deterministic simulator implemented by Murray Patterson and Raymond Spiteri, as a part of the "odeToJava" package. This simulator is an implicit-explicit ODE solver with step doubling. This simulator works well for models with a high degree of stiffness. However, please see the issue with interpolation in the imex443 solver.
This section describes the Systems Biology Markup Language (SBML). SBML is an XML-based document specification for defining a model instance for a system of interacting chemical species. The specification for SBML can be found at the home page of the Systems Biology Workbench Development Group. Dizzy can read a model in SBML Level 1 format, Versions 1 and 2, and can export a model to SBML Level 1, Version 2.
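To make the Level and Version terminology concrete, the following Python sketch reads the "level" and "version" attributes from a small SBML document and lists its species. The document text is a hypothetical, hand-written Level 1, Version 2 fragment that has not been validated against the SBML schema, and the helper function is not part of Dizzy; it only mirrors the version limits stated above.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sbml.org/sbml/level1"  # SBML Level 1 XML namespace

# Hypothetical example model, written by hand for illustration only.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level1" level="1" version="2">
  <model name="conversion">
    <listOfCompartments>
      <compartment name="cell" volume="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species name="A" compartment="cell" initialAmount="100"/>
      <species name="B" compartment="cell" initialAmount="0"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction name="a_to_b" reversible="false">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
        <kineticLaw formula="k * A">
          <listOfParameters><parameter name="k" value="0.1"/></listOfParameters>
        </kineticLaw>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

def summarize(sbml_text):
    """Return (is Level 1 Version 1 or 2, {species name: initial amount})."""
    root = ET.fromstring(sbml_text.strip())
    level, version = int(root.get("level")), int(root.get("version"))
    readable = level == 1 and version in (1, 2)
    species = {s.get("name"): float(s.get("initialAmount"))
               for s in root.iter(f"{{{NS}}}species")}
    return readable, species

print(summarize(SBML_TEXT))
```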
The Dizzy system is capable of importing a model instance from an SBML document, and of exporting a model instance (defined through a different language) to an SBML document. Some features of the Dizzy system cannot be exported to an SBML document. In particular, a boundary species whose population is a mathematical expression defining a function of time cannot be exported to SBML. Similarly, certain SBML constructs, namely unit definitions, will not be imported into the Dizzy system. The Dizzy system can import SBML documents with either of two systems of units:
Dizzy has an interface to the Systems Biology Workbench (SBW)
system. This makes it possible to access the simulation engine for Dizzy using
the cross-language remote procedure invocation capabilities of SBW. Note that this
section of the Dizzy user manual assumes you are familiar with the architecture and
terminology of the Systems Biology Workbench system. For a good overview of these
concepts and terminology, please refer to the
Introduction to the Systems Biology Workbench
document, available at the SBW Project home page. Please note: The SBW interface
in Dizzy is compatible only with SBW versions 2.2.1 or newer, and not with the 1.X.X
versions. The interface
to SBW offered by Dizzy includes an SBW module
"org.systemsbiology.chem.sbw.gui" (this is the SBW Unique Module
Name, not a Java class name),
which provides a single SBW service, asim. The display name