On the Approximation Quality of Markov State Models
Key words. Markov state model, biomolecular dynamics, conformations, metastable sets, effective dynamics, transfer operator, spectral gap, lag time, diffusive dynamics
DOI. 10.1137/090764049
with “nice” sets Aj (e.g., with a Lipschitz boundary). We introduce the discrete
process (X̂k)k∈N on the finite state space Ê = {1, . . . , n} by setting X̂k = i if and
only if Xkτ ∈ Ai. Then (X̂k) describes the snapshot dynamics of the continuous
process (Xt) with lag time τ between the sets A1, . . . , An. This process (X̂k) is
generally not Markovian; i.e., its transition probabilities depend on the history of the
process and not only on its current state.
However, MSMs attempt to approximate this process via a discrete Markov process
(X̃k)k∈N on Ê = {1, . . . , n} defined by the transition matrix P with entries

(4) Pij = Pμ[Xτ ∈ Aj | X0 ∈ Ai].
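For orientation, here is a minimal sketch (ours, not taken from the article) of how a transition matrix of this form is commonly estimated from snapshot data; the label sequence and the helper function below are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def estimate_transition_matrix(labels, n):
    """Row-normalized count matrix from a discrete snapshot trajectory.

    labels : integer array with labels[k] = i if the snapshot X_{k*tau} lies in A_i
    n      : number of sets A_1, ..., A_n
    """
    counts = np.zeros((n, n))
    for i, j in zip(labels[:-1], labels[1:]):
        counts[i, j] += 1.0                       # one observed transition A_i -> A_j
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

# hypothetical usage with a short label sequence on n = 2 sets
labels = np.array([0, 0, 1, 1, 0, 1, 1, 1, 0])
P_hat = estimate_transition_matrix(labels, n=2)
```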
While the long-term dynamical behavior of the original process (Xkτ)k∈N is governed
by Tkτ = Tτ^k for k ∈ N, the long-term dynamics of the MSM process (X̃k)k∈N is
governed by P^k. Thus, for assessing the approximation quality of the MSM compared
to the original process, we have to study the error

(5) E(k) = dist(Tτ^k, P^k),
where dist denotes an appropriate metric measuring the difference between the op-
erators. We will see that under strong enough ergodicity conditions on the original
Markov chain (see Remark 2.1) we have E(k) ≤ 2ρ^k for some 0 < ρ < 1. However, we
are interested in how E(k) depends on the lag time τ and on the sets A1, . . . , An, so
that the error E can be kept below a user-defined threshold.
The remainder of the article is organized as follows. In section 2 we introduce the
setting and give the general definition of MSM transfer operators. Then, in section 3
we compare the densities of the random variables (X̂k ) to the densities of the MSM
process (X̃k ) and see how the approximation quality of these densities depends on
the choice of the state space discretization A1 , . . . , An and the lag time τ . Section 4
extends our findings to some algorithmic strategies for the construction of MSMs
that are discussed in the literature. Finally, the results are illustrated in numerical
examples in section 5.
2. The MSM transfer operator.
2.1. Setting. We consider the transfer operators Tt as operators on L²μ =
{v : E → R : ∫ v² dμ < ∞} with scalar product ⟨v, w⟩ = ∫ vw dμ. In the following,
‖·‖ will denote the associated norm, ‖v‖² = ⟨v, v⟩, on L²μ and the corresponding
operator norm ‖B‖ = max_{‖v‖=1} ‖Bv‖ of an operator B : L²μ → L²μ.
In L²μ, Tτ has the general form

(6) ∫_C Tτ v(y) μ(dy) = ∫_E P[Xτ ∈ C | X0 = x] v(x) μ(dx)   ∀ measurable C ⊂ E.
2.2. Assumptions on the original process. Now let us assume that T has
m real eigenvalues λ1 , . . . , λm ∈ R,
(7) λ0 = 1 > λ1 ≥ λ2 ≥ · · · ≥ λm ,
of which the spectral gap Δ > 0 will play an essential role later. We should empha-
size that the notion “spectral gap” is usually used differently. It usually designates
a situation in which an entire interval of the real axis does not contain any eigen-
values, whereas the intervals above and below show a significantly denser population
of eigenvalues. Despite the obvious difference in our case, we will adopt the name
spectral gap for Δ since it plays a role in finding upper bounds similar to that of the
usual spectral gaps.
We denote by Π the orthogonal projection onto the subspace U spanned by the
eigenvectors u0, . . . , um corresponding to these dominant eigenvalues, i.e.,

(10) Πv = Σ_{j=0}^{m} ⟨v, uj⟩ uj.
Furthermore, we assume that the subspace U and the remaining subspace do not mix
under the action of T :
(13) ΠT Π⊥ = Π⊥ T Π = 0,
and therefore the dynamics can be studied by considering the dynamics of both sub-
spaces separately:
(14) T^k = (TΠ)^k + (TΠ⊥)^k   ∀ k ≥ 0,

(15) Π0 v := ⟨v, u0⟩ u0 = ⟨v, 1⟩ 1.

According to the above we have the asymptotic convergence rate ‖T^k − Π0‖ = λ1^k
for all k ∈ N.
Remark 2.1. The assumptions (7), (8), (12), and (13) are definitely satisfied if
T is sufficiently ergodic and is self-adjoint (T is self-adjoint if the underlying original
Markov process (Xt) is reversible). But they may also be satisfied if, e.g., (Xt) is
sufficiently ergodic and has a dominant self-adjoint part, as is the case for second-order
Langevin dynamics with not too large friction [21] or for thermostatted Hamiltonian
molecular dynamics or stochastically perturbed Hamiltonian systems [3, 22]. Re-
versible or not, the property of being “sufficiently ergodic” seems to be central in any
case. We will now give sufficient conditions in technical terms for a reversible process.
These results and their generalizations to nonreversible cases can be found in [23, 3].
• A reversible and μ-irreducible process (Xt ) is sufficiently ergodic if one of the
following scenarios holds:
(i) (Xt ) is V -ergodic or geometrically ergodic; see [3].
(ii) The stochastic transition function p(t, x, ·) = pa(t, x, ·) + ps(t, x, ·) associated
with (Xt), where pa denotes the absolutely continuous part and ps the singular
part, satisfies the following two conditions: (a) pa ∈ L^r_{μ×μ} for some
2 < r < ∞, and (b) Sv(y) = ∫ v(x) pa(t, x, y) μ(dx) satisfies ‖S‖_{2,μ} > 0.
Let Q denote the orthogonal projection of L²μ onto the subspace Dn = span{χA1, . . . , χAn}
of step functions associated with the sets A1, . . . , An, i.e.,

(16) Qv = Σ_{i=1}^{n} (⟨v, χAi⟩ / μ(Ai)) χAi = Σ_{i=1}^{n} ⟨v, φi⟩ φi,

where φi = χAi / μ(Ai)^{1/2}.
That is, the orthogonal projection Q keeps the measure on the sets A1 , . . . , An , but on
each of the sets the ensemble will be redistributed according to the invariant measure
and the detailed information about the distribution inside of a set Ai is lost.
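As a purely illustrative sketch (not part of the original text), the projection (16) can be evaluated numerically by replacing the integrals with μ-weighted sums on a grid; the arrays below are assumed inputs.

```python
import numpy as np

def project_onto_stepfunctions(v, mu, labels):
    """Evaluate Qv from (16) on a finite grid.

    v      : values of the function v at the grid points
    mu     : invariant-measure weights at the grid points (need not be normalized)
    labels : labels[p] = i if grid point p belongs to the set A_i
    """
    v, mu, labels = np.asarray(v, float), np.asarray(mu, float), np.asarray(labels)
    w = mu / mu.sum()
    qv = np.empty_like(v)
    for i in np.unique(labels):
        idx = labels == i
        qv[idx] = np.sum(v[idx] * w[idx]) / np.sum(w[idx])   # <v, chi_Ai> / mu(A_i)
    return qv
```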
Since the sets A1 , . . . , An form a full partition of E, we have
(18) Q1 = 1,
which implies
(19) QΠ0 = Π0 Q = Π0 .
2.4. MSM transfer operator. Now consider the projection of our transfer
operator T onto Dn, i.e., the operator P = QTQ restricted to Dn, such that

(23) P ψi = Σ_{j=1}^{n} Pμ[Xτ ∈ Aj | X0 ∈ Ai] · ψj.
So the transition matrix of the MSM Markov chain defined in (4) is identical with the
matrix representation for the projected transfer operator P ; therefore P is the MSM
transfer operator. P inherits the ergodicity properties and the invariant measure of
T , as the following lemma shows (the proof can be found in section 6.1).
Lemma 2.2. For every k ∈ N we have
(24) ‖P^k − Π0‖ ≤ ‖(TQ)^k − Π0‖ ≤ λ1^k.
3. Approximation quality of coarse-grained transfer operators.
3.1. Approximation error E. The approximation quality of the MSM Markov
chain can be characterized by comparing the operators P^k and T^k for k ∈ N restricted
to Dn :
(25) E(k) = ‖QT^kQ − P^k‖ = ‖QT^kQ − Q(TQ)^k‖.
Lemma 2.2 immediately implies that this error decays exponentially,
(26) E(k) = ‖QT^kQ − P^k‖ ≤ ‖QT^kQ − Π0‖ + ‖P^k − Π0‖
           ≤ ‖Q(T^k − Π0)Q‖ + ‖P^k − Π0‖ ≤ 2λ1^k,
independent of the choice of the sets A1 , . . . , An . Since we want to understand how the
choice of the sets and other parameters like the lag time τ influence the approximation
quality, we have to analyze the prefactor in much more detail.
3.2. Main result: An upper bound on E. The following theorem contains
the main result of this article.
Theorem 3.1. Let T = Tτ be a transfer operator for lag time τ > 0 with
properties as described above, in particular (7), (8), (12), and (13). Let the disjoint
sets A1 , . . . , An form a full partition and define
(27) ‖Q⊥ uj‖ =: δj ≤ 1 ∀ j,   δ := max_{j=1,...,m} δj,
3.3. Interpretation and observations. The theorem shows that the overall
error can be made arbitrarily small by making the factor [Csets (δ, k) + Cspec (η, k)]
small. In order to understand the role of these two terms, consider for now the case k ≥ 2.
Fig. 1. The potential V with barrier height ΔV and the two sets A1 and A2.
Bounding the error. Consider that the desired quality of the MSM approximation
is defined by the user via some tolerable error bound tol at some timescale T . This
requirement is met by satisfying E(T ) ≤ tol. A rational procedure for guaranteeing
this can be outlined as follows:
1. Define the timescale of interest, T .
2. Define the number of eigenfunctions, m, that we are seeking to approximate
well.
3. Compute the spectral gap, Δ, which is ideally given by the (m + 1)st eigen-
value. This will be directly accessible only for simple systems. For more
complex systems, it may, however, be possible to bound Δ from below and
thus guarantee that the resulting estimate of E(k) remains an upper bound. In practical cases
involving statistical uncertainty (e.g., molecular dynamics), one may only be
able to estimate a probability distribution of the eigenvalues and thus, for
any given dataset, be able to estimate an almost certain lower bound for Δ.
4. Set the desired lag time τ depending on how much time resolution is desired
compared to the timescale of interest.
5. Solve

(36) tol = (mδ + η) [ m^{1/2} δ (T/τ) + F η/(1 − η) ],   with η = exp(−τΔ), F = 1 − exp(−TΔ),

for δ and adapt the choice of the sets A1, . . . , An and their number n to the
requirement δ = δ(τ) that results from (36); a numerical sketch of this step is
given after this list. We will illustrate this in an explicit example in section 5.
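The following is a minimal numerical sketch of step 5, assuming the reconstruction of (36) shown above; the function name and the numbers in the example call are ours (chosen roughly in the range of the double well example of section 5.1), not values from the article.

```python
import numpy as np
from scipy.optimize import brentq

def required_delta(tol, T, tau, m, Delta):
    """Solve (36) for the projection error delta at given tol, T, tau, m, Delta."""
    eta = np.exp(-tau * Delta)
    F = 1.0 - np.exp(-T * Delta)
    def residual(delta):
        return (m * delta + eta) * (np.sqrt(m) * delta * T / tau
                                    + F * eta / (1.0 - eta)) - tol
    # delta is bounded by 1, cf. (27); raises if even delta = 0 cannot meet tol
    return brentq(residual, 0.0, 1.0)

# illustrative numbers: tol = 0.1 at timescale T = 4 with lag time tau = 0.1
print(required_delta(tol=0.1, T=4.0, tau=0.1, m=1, Delta=16.0))
```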
In practical applications, δ itself can also only be estimated; an approach to do
this based on two discretizations of different fineness is outlined in section 4.3.
Metastability. The handbook [3] gives the following theorem, in which smallness
of the projection error δ is related to the metastability of a subdivision A0 , . . . , Am of
the state space.
Theorem 3.2. Let T be a self-adjoint transfer operator with lag time τ and
properties as described above, in particular (7), (8), (12), and (13). The metastability
of an arbitrary decomposition A0 , . . . , Am of the state space is bounded from below and
above by
(37) 1 + (1 − δ1²)λ1 + · · · + (1 − δm²)λm
and
(41) Qv(x) = ∫_{Ey} v(x, y) μx(dy),

where μx(C) = ∫_C μ(x, dy) / ∫_{Ey} μ(x, dy) is the marginal of the invariant measure
for fixed x. Associated averaged transfer operators are considered in the
context of so-called hybrid Monte Carlo methods; see [3, 2].
2. Consider the same projection Q as in (41), with Ex = span{u0 , . . . , um },
i.e., the projection onto the m-dimensional slow subspace of the dynamics.
For this case, the MSM error E(k) is 0 for all k, showing that a Markovian
formulation of the dynamics in the slow subspace is in principle possible.
In practical applications, however, it is often desired to obtain equations of
Such mollified or fuzzy MSMs have been considered, e.g., in [24] (see sec-
tion 4.1).
4. Algorithmic considerations.
4.1. Almost exact fuzzy MSM. Let us return to the last example of general-
ization in section 3.4. There, the MSM subspace spanned by the indicator function of
sets has been replaced by another finite-dimensional subspace, D = span{f1 , . . . , fn },
with ansatz functions that are no longer indicator functions. Now, assume that we
can design (almost everywhere) nonnegative ansatz functions by linear combination
of the eigenvectors u0, . . . , um, i.e.,

(43) fj = Σ_{k=0}^{m} ajk uk, with scalars ajk for j = 0, . . . , m.

Then

(44) E(k) = ‖QT^kQ − P^k‖ = 0   ∀ k = 1, 2, 3, . . . ,
showing that the MSM is exact. M. Weber et al. have developed such an MSM variant
and discussed its applicability and interpretation [24]; in particular they show how
to optimally compute the coefficients aij for nonnegative fuzzy membership functions
[25]. A warning seems appropriate: The exactness requires having the eigenvectors of
T exactly. This is something that cannot be assumed in practice: The eigenvectors
result from numerical computations and will be affected by statistical and numerical
errors. Thus, any practical implementation of this strategy will also have to consider
the actual approximation quality of the MSM depending on the δ induced by the
numerical approximation of the fi and on the lag time τ .
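The exactness statement can be checked on a finite-dimensional stand-in for T. The following sketch is our own construction (not the method of [24, 25]): it builds a small reversible Metropolis chain, projects onto the span of its three dominant eigenvectors with respect to the invariant measure, and verifies that the propagation error vanishes up to round-off.

```python
import numpy as np

# small reversible stand-in for T: a Metropolis chain in a hypothetical double-well profile
N = 50
x = np.linspace(-1.5, 1.5, N)
energy = (x**2 - 1.0)**2 / 0.2
Tmat = np.zeros((N, N))
for i in range(N):
    for j in (i - 1, i + 1):
        if 0 <= j < N:
            Tmat[i, j] = 0.5 * min(1.0, np.exp(energy[i] - energy[j]))
    Tmat[i, i] = 1.0 - Tmat[i].sum()
pi = np.exp(-energy)
pi /= pi.sum()                                     # invariant measure (detailed balance holds)

# dominant eigenvectors, computed via the symmetrized representation D T D^{-1}
D, Dinv = np.diag(np.sqrt(pi)), np.diag(1.0 / np.sqrt(pi))
w, V = np.linalg.eigh(D @ Tmat @ Dinv)
U = (Dinv @ V)[:, np.argsort(w)[::-1][:3]]         # u_0, u_1, u_2

# pi-orthogonal projection Q onto span{u_0, u_1, u_2} and the propagation error E(k)
W = np.diag(pi)
Q = U @ np.linalg.solve(U.T @ W @ U, U.T @ W)
k = 10
err = np.linalg.norm(np.linalg.matrix_power(Q @ Tmat @ Q, k)
                     - Q @ np.linalg.matrix_power(Tmat, k) @ Q)
print(err)                                         # ~1e-13: the projected model is exact
```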
4.2. MSM based on projections of the original dynamics. In practice,
MSMs are often constructed not by considering arbitrary sets Aj ⊂ E but only
sets which result from discretization of the subspace of a certain set of “essential”
coordinates θ : E → θ(E) ⊂ E. For example, in molecular systems, one usually
ignores the solvent coordinates [8, 26, 6] and may even further consider only a subset
of solute coordinates such as torsion angles [26, 6]. The projection of the original
process on the essential subspace, θ(Xt )t∈R , will then in general be far from being
Markovian. However, this does not concern our result or the construction of MSMs
in general. Let E(A) = {x ∈ E : θ(x) ∈ A} denote the cylinder set that belongs to
a subset A of the essential subspace θ(E). Thus, any subdivision A1 , . . . , An of θ(E)
will induce a subdivision E(A1 ), . . . , E(An ) of the full state space E. Thus, the above
results are fully valid if applied to these subdivisions. So the question of whether
an MSM based on a definition of states in a subspace θ has a good approximation
quality boils down to the question of whether the projection error δ can be kept below
a certain accuracy threshold based on cylinder sets. A necessary condition for this
possibility is that the eigenvectors u0 , . . . , um of the original transfer operator in full
state space E are almost constant along the fibers E(ϑ) = {x ∈ E : θ(x) = ϑ}.
In other words, the approximation quality of the MSM can be good if the variables
ignored by the projection onto θ are sufficiently “fast.” More precisely, whenever
the dynamics along these fibers is rapidly mixing on some timescale ε that is much
smaller than the mixing times orthogonal to the fibers (order 1 or larger), one can
show by multiscale analysis that (in the limit ε → 0) the eigenvectors of the full
transfer operator are constant up to a scale that vanishes with ε; cf. [27].
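For illustration only (synthetic data, nothing from the article), discretizing an essential coordinate θ and labeling full-space snapshots through it induces exactly the cylinder sets described above:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((10_000, 3))     # stand-in snapshots in the full state space E
theta = X[:, 0]                          # hypothetical essential coordinate theta(x)
edges = np.array([-1.0, 0.0, 1.0])       # subdivision A_1, ..., A_4 of theta(E)
labels = np.digitize(theta, edges)       # snapshot k belongs to the cylinder set E(A_labels[k])
```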
4.3. Comparing coarser and finer MSMs. In practical cases, the eigenvec-
tors and eigenvalues of T are not directly available. Because of that, many articles in
the MSM literature consider a fine subdivision of the state space and the associated
MSM first and construct the final, coarse MSM based on the eigenvectors and eigen-
values of the fine MSM [2, 28, 8, 5, 6]. In order to analyze this procedure based on
the above estimate of the MSM error, let us consider a case with two subdivisions of
very different fineness:
A1^(1), . . . , AN^(1): fine subdivision of E,
A1^(2), . . . , An^(2): coarse subdivision of E, n ≪ N;
consider the associated projections Q(1) and Q(2) of L²μ onto the corresponding fine
and coarse step-function spaces, together with the projection Q(12) from the fine onto
the coarse space. The projection errors δ(1), δ(2), δ(12) and the errors of the fine and
coarse MSMs then satisfy the estimate stated in Lemma 4.1, whose proof is given in
section 6.3.
4.4. Statistical and total error. When considering an MSM for biomolecular
systems the MSM transfer operator P and its matrix representation (Pij ) are normally
not known exactly. Instead, only statistical approximations P̃ij of its entries are
available. Thus, the total error of an MSM compared to the original dynamics is not
just E(k) = ‖QT^kQ − P^k‖ but also contains the statistical contribution resulting
from the estimation of the entries P̃ij.
5. Numerical examples.
5.1. Double well potential. The results and concepts above will first be illus-
trated on a one-dimensional diffusion in a double well potential. In contrast to the
diffusive dynamics considered in section 3.3, this example does not rely on vanishing
noise approximations but considers the process dXt = −∇V (Xt )dt + σdBt with some
σ > 0. The potential V and its unique invariant measure are shown in Figure 2.
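Here is a sketch of how such a trajectory can be generated and reduced to snapshot data; the concrete potential V(x) = (x² − 1)² and all parameter values are stand-ins, since the article does not specify them in this form.

```python
import numpy as np

def simulate(n_steps, dt=1e-3, sigma=0.8, x0=-1.0, seed=0):
    """Euler-Maruyama discretization of dX_t = -V'(X_t) dt + sigma dB_t
    for the stand-in double-well potential V(x) = (x^2 - 1)^2."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for k in range(1, n_steps):
        grad = 4.0 * x[k - 1] * (x[k - 1]**2 - 1.0)       # V'(x)
        x[k] = x[k - 1] - grad * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x

dt, tau = 1e-3, 0.1
traj = simulate(200_000, dt=dt)
snapshots = traj[::int(round(tau / dt))]      # X_0, X_tau, X_2tau, ...
labels = (snapshots > 0.0).astype(int)        # two sets: A1 = (-inf, 0], A2 = (0, inf)
```

The count-matrix sketch given after (4) can then be applied to `labels` to obtain an estimated MSM transition matrix.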
Fig. 2. Top panel: the potential V . Bottom panel: the associated invariant measure.
This process satisfies all necessary assumptions, and by resolving only the slowest
process (m = 1), the following spectral values are obtained:
The eigenvector u1 is shown in the middle panel of Figure 3. It is almost constant
on the two wells of the potential and changes sign close to where the saddle point is
located.
Projection error δ. Let us first choose the lag time τ = 0.1. Then λ1 = 0.9801
and r = 0.1947. Figure 3 shows the values of the projection error δ for n = 2 and
sets of the form A1 = (−∞; x] and A2 = (x; ∞) depending on the position of the
dividing surface, x. One can see that it is optimal for the boundary between the two
sets to lie close to the saddle point of the potential, where the second eigenvector is
strongly varying.
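A sketch of this scan over the dividing surface follows; the invariant measure and the eigenvector used below are crude synthetic stand-ins (not those of the actual process), so only the qualitative behavior of Figure 3 is reproduced.

```python
import numpy as np

def projection_error(u1, mu, grid, x_split):
    """delta = ||u1 - Q u1|| in L^2_mu for A1 = (-inf, x_split], A2 = (x_split, inf)."""
    w = mu / mu.sum()
    qu = np.empty_like(u1)
    for idx in (grid <= x_split, grid > x_split):
        qu[idx] = np.sum(u1[idx] * w[idx]) / np.sum(w[idx])   # average of u1 on the set
    return np.sqrt(np.sum((u1 - qu)**2 * w))

# synthetic stand-ins for mu and u1 on a grid
grid = np.linspace(-1.5, 1.5, 400)
mu = np.exp(-(grid**2 - 1.0)**2 / 0.32)
u1 = np.tanh(5.0 * grid)
u1 = u1 / np.sqrt(np.sum(u1**2 * mu / mu.sum()))              # normalize in L^2_mu
deltas = [projection_error(u1, mu, grid, xs) for xs in grid[:-1]]
print(grid[int(np.argmin(deltas))])   # optimal dividing point lies near the saddle at x = 0
```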
Fig. 3. Upper panel: potential V . Middle panel: eigenvector u1 . Lower panel: projection error
δ for different sets A1 = (−∞; x] and A2 = (x; ∞) plotted against x.
Fig. 4. Galerkin approximation Qu1 of the second eigenvector. Left panel: uniform grid with
n = 5 sets. Right panel: adaptive grid with n = 5 sets.
Fig. 5. Approximation error δ against the number of sets n for uniform and adaptive dis-
cretizations.
better when the lag time is increased. Finally, Figure 9 compares exact errors and
bounds for n = 3 sets with uniform and adaptive grids at lag time τ = 0.5, exhibiting
a dramatic advantage of the adaptive over the uniform discretization for longer lag
times.
Number of sets necessary to yield a given error and lag time. Let us briefly come
back to the question of how to build an MSM if the maximum acceptable approxima-
tion error tol is given. In the present case, Δ is known explicitly. Thus, as explained in
section 3.3, for a given lag time τ the value of δ that is required for E(T ) = tol = 0.1
can be computed (Figure 10, solid line). Next, we consider the adaptive discretizations
with n = 2, 3, 4, . . . and compute their δ-errors (boxes in Figure 10). This shows that
the required error tolerance of 0.1 can be obtained with different n-τ pairs, e.g., using
n = 2 with τ ≈ 0.3 or n = 5 with τ ≈ 0.15.
Fig. 6. Bound and exact error E for τ = 0.1 on the adaptive grid with n = 3 adaptive sets.
Fig. 7. Left panel: bound B(t) from Theorem 3.1 and the exact error E(t) for τ = 0.5 on the
adaptive grid with n = 3. Right panel: the quotient E(t)/B(t).
5.2. Double well potential with diffusive transition region. Let us now
consider a one-dimensional diffusion in a different potential with two wells that are
connected by an extended transition region with substructure: dXt = −∇V (Xt )dt +
σdBt with σ = 0.8. The potential V and its unique invariant measure are shown in
Figure 11. We observe that the transition region between the two main wells now con-
tains four smaller wells that will each have their own, less pronounced metastability.
When considering the semigroup of transfer operators associated with this dynamics
we find the dominant eigenvectors as shown in Figure 12. The eigenvectors are all al-
most constant on the two main wells but are nonconstant in the transition region. The
dominant eigenvalues take the following values (in the form of lag time-independent
Fig. 8. Exact error E for different lag times (τ = 0.1 and 0.5) on the adaptive grid with n = 3.
Fig. 9. Exact error and bound for uniform and adaptive grids; n = 3, τ = 0.5.
Fig. 10. Dependence of the requirement for δ on τ for prescribed error tol = 0.1. The boxes
indicate some values of δ that can be realized by choosing n adaptive boxes.
Fig. 11. Top panel: the potential V with extended transition region. Bottom panel: the associ-
ated invariant measure for σ = 0.8.
Fig. 12. Invariant measure and eigenvectors uj , j = 1, . . . , 4, for Brownian motion in the
potential V with the extended transition region from Figure 11 for σ = 0.8.
Fig. 13. Potential and eigenvectors uj , j = 1, 2, and their stepfunction approximation Quj for
n = 20 adaptive sets. The resulting projection error is δ = 0.052.
best results; that is, for given n the lag time can be chosen smallest with m = 2 in
comparison to m = 1 and m = 5 (and other values of m not shown in the last figure).
6. Proofs.
6.1. Proof of Lemma 2.2. Proof. Because of Π0 Q = QΠ0 = Π0 and
‖T − Π0‖ = λ1 we have for k = 1

(49) ‖TQ − Π0‖ = ‖(T − Π0)Q‖ ≤ λ1.

Furthermore, for every v ∈ L²μ,

(50) Π0 T v = ⟨T v, 1⟩ 1 = ⟨T Πv, 1⟩ 1 + ⟨T Π⊥ v, Π1⟩ 1
            = ⟨v, T Π1⟩ 1 + ⟨ΠT Π⊥ v, 1⟩ 1 = ⟨v, 1⟩ 1 = Π0 v,

i.e.,

(51) Π0 T = T Π0 = Π0.
Fig. 14. Decay of δ and of the maximal propagation error max_k ‖QT^kQ − P^k‖ with the number
n of sets in the optimal adaptive subdivision for m = 2.
Fig. 15. Decay of δ with the number n of sets in the optimal adaptive subdivision for m = 1, 2, 5.
Fig. 16. Comparison of the minimal lag time τ∗ that is required to achieve max_k E(k) ≤ 0.1
depending on the number n of sets in the optimal adaptive subdivision. Top panel: m = 1 compared
to m = 2. Bottom panel: m = 2 compared to m = 5.
The first term ‖QT^iQ⊥‖ describes the propagation of the projection error in i steps,
and the second term ‖Q⊥(TQ)^{k−i}‖ measures how large a projection error can be in
the (k − i)th iteration of applying the operator P. So the ith summand explains the
effect of the propagation of the error that is made in the (k − i)th iteration.
We will estimate the overall error by looking at both parts of the error separately.
Let us prepare for this with the following lemma.
Lemma 6.1. For the first part of the error we have the upper bound
(55) ‖QT^kQ⊥‖ ≤ √m λ1^k δ + r^k.
which leads to

(57) ‖(TΠ)^k Q⊥ v‖² = Σ_{j=1}^{m} λj^{2k} ⟨Q⊥ uj, v⟩² ≤ m λ1^{2k} δ²   (by (27))

and therefore

(58) ‖Q(TΠ)^k Q⊥‖ ≤ √m λ1^k δ.
(60) ‖QT^kQ − QP^kQ‖ ≤ Σ_{i=1}^{k−1} ‖QT^iQ⊥‖ ‖Q⊥(TQ)^{k−i}‖.
Moreover,
(62) ‖Q⊥ T Q‖ ≤ ‖Q⊥ T ΠQ‖ + ‖Q⊥ T Π⊥ Q‖ ≤ ‖Q⊥ T ΠQ‖ + r.
Now we have

(66) Σ_{i=1}^{k−1} (√m λ1^i δ + r^i) λ1^{k−i−1} = √m δ (k − 1) λ1^{k−1} + λ1^{k−1} Σ_{i=1}^{k−1} η^i

and

(67) Σ_{i=1}^{k−1} η^i = (1 − η^k)/(1 − η) − 1 = (η − η^k)/(1 − η) = (η/(1 − η)) (1 − η^{k−1}).
6.3. Proof of Lemma 4.1. Remember that Q(1) , Q(2) denote the projection
from L2μ to the fine and coarse stepfunction spaces, while Q(12) denotes the projection
from fine to coarse.
Proof. We first find easily that Q(2) = Q(12) Q(1) and Q(2),⊥ = Q(1),⊥ + Q(12),⊥ Q(1).
Setting δj(12) = ‖Q(12),⊥ Q(1) uj‖, this implies

(68) (δj(2))² = ‖Q(2),⊥ uj‖² = ⟨(Q(1),⊥ + Q(12),⊥ Q(1)) uj, (Q(1),⊥ + Q(12),⊥ Q(1)) uj⟩
             = ‖Q(1),⊥ uj‖² + ‖Q(12),⊥ Q(1) uj‖² + 2 ⟨Q(12),⊥ Q(1) uj, Q(1),⊥ uj⟩
             = (δj(1))² + (δj(12))²,
where the last identity follows from the fact that Q(1),⊥ Q(12) = 0 on Range(Q(1) ),
which implies Q(1),⊥ Q(12),⊥ = Q(1),⊥ . This identity implies the assertions concerning
the δ-estimates. With respect to the estimate on the error it suffices to observe that
useful and new algorithmic strategies will be required for efficient construction of high
quality MSMs.
REFERENCES
[1] P. Deuflhard, W. Huisinga, A. Fischer, and Ch. Schuette, Identification of almost in-
variant aggregates in reversible nearly uncoupled Markov chains, Linear Algebra Appl.,
315 (2000), pp. 39–59.
[2] Ch. Schuette, A. Fischer, W. Huisinga, and P. Deuflhard, A direct approach to conforma-
tional dynamics based on hybrid Monte Carlo, J. Comput. Phys., 151 (1999), pp. 146–168.
[3] Ch. Schuette and W. Huisinga, Biomolecular conformations can be identified as metastable
sets of molecular dynamics, in Handbook of Numerical Analysis, Elsevier, Amsterdam,
2003, pp. 699–744.
[4] A. Bovier, M. Eckhoff, V. Gayrard, and M. Klein, Metastability and low lying spectra in
reversible Markov chains, Comm. Math. Phys., 228 (2002), pp. 219–255.
[5] F. Noé and S. Fischer, Transition networks for modeling the kinetics of conformational
change in macromolecules, Curr. Opin. Struct. Biol., 18 (2008), pp. 154–162.
[6] F. Noé, I. Horenko, Ch. Schuette, and J. Smith, Hierarchical analysis of conformational
dynamics in biomolecules: Transition networks of metastable states, J. Chem. Phys., 126
(2007), 155102.
[7] F. Noé, C. Schuette, L. Reich, and T. Weikl, Constructing the equilibrium ensemble of
folding pathways from short off-equilibrium simulations, Proc. Natl. Acad. Sci. USA, 106
(2009), pp. 19011–19016.
[8] J. Chodera, N. Singhal, V. S. Pande, K. Dill, and W. Swope, Automatic discovery of
metastable states for the construction of Markov models of macromolecular conformational
dynamics, J. Chem. Phys., 126 (2007), 155101.
[9] N. V. Buchete and G. Hummer, Coarse master equations for peptide folding dynamics, J.
Phys. Chem. B, 112 (2008), pp. 6057–6069.
[10] A. C. Pan and B. Roux, Building Markov state models along pathways to determine free
energies and rates of transitions, J. Chem. Phys., 129 (2008), 064107.
[11] A. Voter, Introduction to the kinetic Monte Carlo method, in Radiation Effects in Solids,
Springer, NATO Publishing Unit, Dordrecht, The Netherlands, 2007, pp. 1–23.
[12] M. Freidlin and A. D. Wentzell, Random Perturbations of Dynamical Systems, Springer,
New York, 1998.
[13] W. E and E. Vanden Eijnden, Metastability, conformation dynamics, and transition pathways
in complex systems, in Multiscale Modelling and Simulation, Springer, Berlin, 2004, pp.
35–68.
[14] A. Bovier, M. Eckhoff, V. Gayrard, and M. Klein, Metastability in reversible diffusion
processes. I. Sharp asymptotics for capacities and exit times, J. Eur. Math. Soc. (JEMS),
6 (2004), pp. 399–424.
[15] A. Bovier, V. Gayrard, and M. Klein, Metastability in reversible diffusion processes. II. Pre-
cise asymptotics for small eigenvalues, J. Eur. Math. Soc. (JEMS), 7 (2005), pp. 69–99.
[16] R. S. Maier and D. L. Stein, Limiting exit location distributions in the stochastic exit problem,
SIAM J. Appl. Math., 57 (1997), pp. 752–790.
[17] I. Pavlyukevich, Stochastic Resonance, Ph.D. thesis, HU Berlin, Berlin, 2002.
[18] W. Huisinga, S. Meyn, and Ch. Schuette, Phase transitions and metastability for Markovian
and molecular systems, Ann. Appl. Probab., 14 (2004), pp. 419–458.
[19] E. B. Davies, Spectral properties of metastable Markov semigroups, J. Funct. Anal., 52 (1983),
pp. 315–329.
[20] Ch. Schuette, Conformational Dynamics: Modelling, Theory, Algorithm, and Applications
to Biomolecules, Habilitation thesis, Fachbereich Mathematik und Informatik, FU Berlin,
Berlin, 1998.
[21] F. Herau, M. Hitrik, and J. Sjoestrand, Tunnel effect for Kramers-Fokker-Planck type
operators: Return to equilibrium and applications, Int. Math. Res. Not. IMRN, no. 15
(2008).
[22] P. Deuflhard, M. Dellnitz, O. Junge, and Ch. Schuette, Computation of essential molec-
ular dynamics by subdivision techniques, in Computational Molecular Dynamics Chal-
lenges, Methods, Ideas, Lect. Notes Comput. Sci. Eng. 4, Springer, Berlin, 1998, pp. 98–
115.
[23] W. Huisinga, Metastability of Markovian Systems: A Transfer Operator Based Approach in
Application to Molecular Dynamics, Ph.D. thesis, Fachbereich Mathematik und Informatik,
FU Berlin, Berlin, 2001.
[25] P. Deuflhard and M. Weber, Robust Perron cluster analysis in conformation dynamics,
Linear Algebra Appl., 398 (2005), pp. 161–184.
[26] N.-V. Buchete and G. Hummer, Peptide folding kinetics from replica exchange molecular
dynamics, Phys. Rev. E (3), 77 (2008), 030902.
[27] C. Schütte, J. Walter, C. Hartmann, and W. Huisinga, An averaging principle for fast
degrees of freedom exhibiting long-term correlations, Multiscale Model. Simul., 2 (2004),
pp. 501–526.
[28] N. Singhal Hinrichs and V. S. Pande, Bayesian metrics for the comparison of Markovian
state models for molecular dynamics simulations, in Proceedings of the International Con-
ference on Research in Computational Molecular Biology, 2007, submitted.
[29] S. Roeblitz, Statistical Error Estimation and Grid-Free Hierarchical Refinement in Confor-
mation Dynamics, Ph.D. thesis, FU Berlin, Berlin, 2008.
[30] N. Singhal and V. S. Pande, Error analysis in Markovian state models for protein folding,
J. Chem. Phys., 123 (2005), 204909.
[31] F. Noé, Probability distributions of molecular observables computed from Markov models, J.
Chem. Phys., 128 (2008), 244103.
[32] F. Rao and A. Caflisch, The protein folding network, J. Mol. Biol., 342 (2004), pp. 299–306.
[33] S. V. Krivov and M. Karplus, Hidden complexity of free energy surfaces for peptide (protein)
folding, Proc. Natl. Acad. Sci. USA, 101 (2004), pp. 14766–14770.