Weidong Zhang Thesis
BVPs
Weidong Zhang
University of Toronto Computer Science Department
February 2, 2012
Abstract
Boundary value problems arise in many applications, and shooting methods
are one approach to approximating the solution of such problems. A shooting
method transforms a boundary value problem into a sequence of initial
value problems, and takes advantage of the speed and adaptivity of initial
value problem solvers. The implementation of continuous Runge-Kutta
methods with defect control for initial value problems gives efficient and
reliable solutions. In this paper, we design and implement a boundary value
solver that is based on a shooting method using a continuous Runge-Kutta
method to solve the associated initial value problems. Numerical tests on a
selection of problems show that this approach achieves better performance
than another widely used existing shooting method.
1 Introduction
Applications of boundary value problems (BVPs) arise in many different areas -
see [1], section 1.2. Consider a boundary value problem defined by the system of
ordinary differential equations (ODEs)
solvers for the solution of BVPs. This new BVP solver generally converges faster,
is more accurate and provides more robust solutions than the multiple shooting
solver MUSN.
The existence and uniqueness theory for BVPs is considerably more difficult
than it is for IVPs. Necessary and sufficient conditions for the existence and
uniqueness of a solution of (1) can be found in [1], section 3.1. But the
conditions under which a BVP may have multiple solutions are far from clear, and
such difficulties can arise in realistic problems in applications. So a given BVP
may have multiple solutions, and we will encounter this case in our test problems
later in this paper. There are other general purpose methods for solving BVPs.
One of the most popular methods, and the one that we will be using to benchmark
the performance of our shooting method, is a collocation method, COLNEW [6].
In section 2, we briefly review shooting methods. We introduce CRK IVP
solvers in section 3, which we then use to implement a particular shooting method.
We combine the shooting approach and a CRK method to create a prototype model
of a new BVP solver. Some implementation issues that arise from this combination
will be discussed in section 4. We report several test results in section 5. From the
results of tests, we have some conclusions and observations which are presented in
section 6.
2 Shooting Methods
Shooting methods transform a BVP to a sequence of IVPs by attempting to find
the right initial conditions which lead to an approximate solution of the IVP that
satisfies the boundary conditions. They take advantage of the speed and adaptivity
of IVP methods. But they also inherit the stability (or instability) of the associated
IVP, which may be unstable even if the BVP itself may be quite stable.
When applied to (1), shooting methods look for initial conditions y(a) = s, so
that the solution u(t) of the resulting IVP satisfies g(s, u(b)) = 0.
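As an illustration of this idea, the following minimal sketch applies simple shooting to a hypothetical linear test problem, y'' = -y with y(0) = 0 and y(π/2) = 1, whose exact solution is y = sin t (so the unknown initial slope is s = 1). The problem, the fixed-step classical Runge-Kutta integrator, and the secant iteration (used here in place of Newton's method) are all assumptions chosen to keep the example short; they are not the solver described in this paper.

```python
import math

def rk4(f, t0, y0, t1, steps=200):
    """Integrate y' = f(t, y) from t0 to t1 with classical RK4; return y(t1)."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def residual(s):
    """Boundary residual y(pi/2; s) - 1 for y'' = -y, y(0) = 0, y'(0) = s."""
    f = lambda t, y: [y[1], -y[0]]
    return rk4(f, 0.0, [0.0, s], math.pi / 2)[0] - 1.0

def secant(F, s0, s1, tol=1e-12, max_iter=30):
    """Secant iteration on the scalar shooting equation F(s) = 0."""
    F0, F1 = F(s0), F(s1)
    for _ in range(max_iter):
        if abs(F1) < tol or F1 == F0:
            break
        s0, s1, F0 = s1, s1 - F1 * (s1 - s0) / (F1 - F0), F1
        F1 = F(s1)
    return s1

# Each evaluation of residual() solves one IVP; the iteration adjusts s.
s_star = secant(residual, 0.0, 2.0)
```

Because this test problem is linear, the residual is linear in s and the secant iteration needs essentially one update; a nonlinear BVP would require several IVP solves.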
Figure 1: Representation of the iteration associated with simple shooting method
Figure 2: Visualization of one iteration of a multiple shooting method
where $h_j^i = t_{j+1}^i - t_j^i$, $1 \le j \le M_i$, and
$$k_r^i = f\Big(t_j^i + c_r h_j^i,\; y_j^i + h_j^i \sum_{q=1}^{r-1} a_{rq} k_q^i\Big).$$
where $\tau = (t - t_j^i)/h_j^i$, and $b_q(\tau)$ is a polynomial of degree at most $p + 1$,
$$b_q(\tau) = \sum_{r=0}^{p+1} \beta_{qr} \tau^r.$$
One can analyze the error in (11) by considering the local interpolant $u_j^i(t)$ to be
an approximation to the local solution $z_j^i(t)$ for $t \in [t_j^i, t_{j+1}^i]$.
This polynomial interpolant can be written as
where $d_q$ is a polynomial of degree $\le p$. The polynomial $u_j^i(t)$ satisfies
$u_j^i(t) = z_j^i(t) + O((h_j^i)^p)$ for $t \in (t_j^i, t_{j+1}^i)$. The polynomials
$\{u_j^i(t)\}$, $i = 1, 2, \ldots, N$, $j = 1, 2, \ldots, M_i$, then define a vector of
piecewise polynomials $u(t)$, which is continuous on $[a, b]$. A simple set of
constraints on $b_q(\tau)$ ensures that $u_j^i(t)$ interpolates $y_j^i$ and $y_{j+1}^i$,
so that the discrete Runge-Kutta method is embedded within the CRK.
This approach allows one to decompose the error in uij (t) into two compo-
nents: the error inherent in polynomial interpolation (the local interpolation error)
and the error that arises as a consequence of “inexact” values being interpolated
(the data error associated with the fact that we are interpolating approximate solu-
tion and derivative values).
The piecewise polynomial u(t) allows an alternative error control mechanism
for continuous Runge-Kutta methods, defect control, which is different from the
local error control discussed earlier. The defect δ(t) associated with u(t) is defined
for t ∈ [a, b] to be,
$$\delta(t) \equiv u'(t) - f(t, u(t)). \qquad (13)$$
That is, δ(t) is the amount by which the associated piecewise polynomial fails to
satisfy the differential equation. With an interpolation scheme defined by (11), one
can show that the corresponding defect satisfies
Formula   p   s    s̄
CRK5      5   6    9
CRK6      6   7    11
CRK8      8   13   21
Table 1: Cost per step of the RDC explicit CRK formulas we have considered
The results of implementing and testing some explicit CRKs with both RDC
and SDC on 25 standard non-stiff problems of the DETEST package can be found
in [10]. We have focused on RDC CRKs in our multiple shooting code, as the
cost per step for the RDC CRKs is roughly 75% of that for the SDC CRKs and there is
little difference in the errors of the interpolants. Note that continuous Runge-Kutta
methods can also be applied directly to a BVP in a different way (see [11] for
details).
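To make the defect in (13) concrete, the following minimal sketch samples δ(t) = u'(t) − f(t, u(t)) over a single step. It rests on several assumptions that are not part of this paper: a hypothetical scalar test problem y' = −y, one classical fourth-order Runge-Kutta step, and a cubic Hermite interpolant standing in for the CRK interpolant. It only illustrates how a defect is evaluated, not the RDC or SDC schemes themselves.

```python
def f(t, y):
    return -y  # hypothetical test problem y' = -y

def rk4_step(t, y, h):
    """One classical RK4 step (a stand-in for the discrete formula)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def hermite(t0, y0, y1, h, tau):
    """Return (u, u') of the cubic Hermite interpolant at t0 + tau*h."""
    f0, f1 = f(t0, y0), f(t0 + h, y1)
    u = ((2 * tau**3 - 3 * tau**2 + 1) * y0
         + (tau**3 - 2 * tau**2 + tau) * h * f0
         + (-2 * tau**3 + 3 * tau**2) * y1
         + (tau**3 - tau**2) * h * f1)
    du = ((6 * tau**2 - 6 * tau) * y0
          + (3 * tau**2 - 4 * tau + 1) * h * f0
          + (-6 * tau**2 + 6 * tau) * y1
          + (3 * tau**2 - 2 * tau) * h * f1) / h
    return u, du

def max_defect(t0, y0, h, samples=50):
    """Largest sampled |u'(t) - f(t, u(t))| on the step [t0, t0 + h]."""
    y1 = rk4_step(t0, y0, h)
    worst = 0.0
    for k in range(samples + 1):
        tau = k / samples
        u, du = hermite(t0, y0, y1, h, tau)
        worst = max(worst, abs(du - f(t0 + tau * h, u)))
    return worst

d_coarse = max_defect(0.0, 1.0, 0.1)
d_fine = max_defect(0.0, 1.0, 0.05)
```

A defect-controlled solver would compare such an estimate with the tolerance and reject the step if it is too large; here halving the step size simply reduces the sampled defect.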
4 Implementation
Both simple shooting and multiple shooting methods need to solve nonlinear equa-
tions, either (3) or (7). We use a modified damped Newton method to solve these
systems.
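For simple shooting, the Jacobian matrix of this nonlinear system is an $n \times n$
matrix defined as,
$$\frac{\partial g}{\partial s} = \frac{\partial g(s, y(b, s))}{\partial s} + \frac{\partial g(s, y(b, s))}{\partial y(b, s)}\, Y(b), \qquad (15)$$
where $Y(t) = \frac{\partial y(t, s)}{\partial s}$ is the $n \times n$ fundamental matrix which is the solution of the
matrix IVP,
$$Y' = \frac{\partial f(t, y(t, s))}{\partial y}\, Y, \quad Y(a) = I, \quad a \le t \le b.$$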
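The fundamental matrix Y(b) in (15) can be obtained by integrating the matrix IVP together with the original ODE. The sketch below does this for a hypothetical pendulum system y1' = y2, y2' = −sin(y1), with a fixed-step classical Runge-Kutta integrator; the test problem, the function names, and the integrator are assumptions made for illustration and are not taken from our solver.

```python
import math

def f(t, y):
    return [y[1], -math.sin(y[0])]              # hypothetical pendulum ODE

def dfdy(t, y):
    return [[0.0, 1.0], [-math.cos(y[0]), 0.0]]  # Jacobian of f with respect to y

def rhs(t, z):
    """Augmented system with state z = [y1, y2, Y11, Y12, Y21, Y22]."""
    y, Y = z[:2], [z[2:4], z[4:6]]
    J = dfdy(t, y)
    dY = [[sum(J[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return f(t, y) + dY[0] + dY[1]

def rk4(g, t0, z0, t1, steps=200):
    h = (t1 - t0) / steps
    t, z = t0, list(z0)
    for _ in range(steps):
        k1 = g(t, z)
        k2 = g(t + h / 2, [zi + h / 2 * k for zi, k in zip(z, k1)])
        k3 = g(t + h / 2, [zi + h / 2 * k for zi, k in zip(z, k2)])
        k4 = g(t + h, [zi + h * k for zi, k in zip(z, k3)])
        z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        t += h
    return z

def sensitivity(s, a=0.0, b=1.0):
    """Return y(b; s) and Y(b), starting from y(a) = s and Y(a) = I."""
    z = rk4(rhs, a, list(s) + [1.0, 0.0, 0.0, 1.0], b)
    return z[:2], [z[2:4], z[4:6]]

y_b, Y_b = sensitivity([0.5, 0.2])
# For boundary conditions y1(a) = alpha, y1(b) = beta, the Jacobian (15)
# is [[1, 0], [Y_b[0][0], Y_b[0][1]]].
jac = [[1.0, 0.0], [Y_b[0][0], Y_b[0][1]]]
```

The columns of Y(b) can be checked against finite differences of y(b; s) with respect to the components of s, which is a useful sanity test when the Jacobian of f is coded by hand.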
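For multiple shooting, the Jacobian matrix of the associated nonlinear system
is an $nN \times nN$ block bi-diagonal matrix,
$$\frac{\partial F}{\partial s} =
\begin{pmatrix}
-Y_1(x_2) & I & 0 & \cdots & 0 \\
0 & -Y_2(x_3) & I & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & -Y_{N-1}(x_N) & I \\
\frac{\partial g(s, y(b, s))}{\partial s} & 0 & \cdots & 0 & \frac{\partial g(s, y(b, s))}{\partial y(b, s)}\, Y_N(b)
\end{pmatrix} \qquad (16)$$
where $Y_i(t) = \frac{\partial y_i(t, s_i)}{\partial s_i}$ is an $n \times n$ fundamental solution associated with the $i$th
subinterval defined by the IVP,
$$Y_i' = \frac{\partial f(t, y(t, s_i))}{\partial y}\, Y_i, \quad Y_i(x_i) = I, \quad x_i \le t \le x_{i+1}, \quad 1 \le i \le N.$$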
The use of a damped Newton method in solving BVPs is discussed in [1], section
8.1, where it is shown that convergence of Newton's method can be improved.
A damped Newton method uses a parameter $\lambda$ to control the magnitude of the step to
be taken in the Newton direction,
$$s_{m+1} = s_m - \lambda \left(F'(s_m)\right)^{-1} F(s_m), \quad 0 < \lambda \le 1, \qquad (17)$$
where $F'(s_m) = \left.\frac{\partial F}{\partial s}\right|_{s = s_m}$. Newton's method corresponds to taking $\lambda = 1$.
Let $\Delta_m = -\left(F'(s_m)\right)^{-1} F(s_m)$ be the Newton step on the $(m + 1)$st iteration.
For any $s \in \mathbb{R}^{nN}$ define, on the $(m + 1)$st iteration, an objective function
$$\hat{g}_m(s) = \frac{1}{2} \left\| \left(F'(s_m)\right)^{-1} F(s) \right\|_2^2. \qquad (18)$$
Note that $\hat{g}_m(s_m) = \frac{1}{2}\|\Delta_m\|_2^2$, which requires very little computation, whereas for
$s \ne s_m$ an evaluation of $\hat{g}_m(s)$ requires the solution of a linear system plus the
computation of $F(s)$. See [1] for a discussion and justification of this technique for
determining an acceptable value for $\lambda$.
r = 1
λ^r = 1                     for m = 1
λ^r = λ_{m−1}               if λ_{m−1} < λ_{m−2}(1 − σ)
λ^r = min(1, 2λ_{m−1})      otherwise
do until λ^r < 0.01 or ĝ_m(s_m + λ^r Δ_m) ≤ (1 − 2σλ^r) ĝ_m(s_m)
    λ^{r+1} = max( τλ^r, (λ^r)² ĝ_m(s_m) / ((2λ^r − 1) ĝ_m(s_m) + ĝ_m(s_m + λ^r Δ_m)) )
    r = r + 1
end do
if λ^r < 0.01 then
    signal that no acceptable λ was found
else
    λ_m = λ^r
end
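The damping strategy above can be sketched for a scalar equation as follows. This is a minimal illustration under stated assumptions: the test function F(s) = arctan(s), for which undamped Newton diverges from s = 3, is hypothetical; the parameter values σ = 0.01 and τ = 0.1 are illustrative choices; and the rule for the starting value of λ is simplified to min(1, 2λ_{m−1}) rather than the three-case rule of the algorithm.

```python
import math

def damped_newton(F, dF, s, sigma=0.01, tau=0.1, lam_min=0.01,
                  tol=1e-12, max_iter=30):
    """Scalar damped Newton iteration using the objective function (18)."""
    lam_prev = None
    for _ in range(max_iter):
        Fs = F(s)
        if abs(Fs) < tol:
            return s
        d = dF(s)
        delta = -Fs / d                        # Newton step
        g0 = 0.5 * delta ** 2                  # objective at s_m
        # Simplified starting value for the damping factor (assumption).
        lam = 1.0 if lam_prev is None else min(1.0, 2.0 * lam_prev)
        while lam >= lam_min:
            g1 = 0.5 * (F(s + lam * delta) / d) ** 2   # objective at trial point
            if g1 <= (1.0 - 2.0 * sigma * lam) * g0:
                break                          # sufficient decrease: accept
            # Minimizer of the quadratic interpolant, safeguarded by tau*lam.
            lam = max(tau * lam,
                      lam ** 2 * g0 / ((2.0 * lam - 1.0) * g0 + g1))
        if lam < lam_min:
            raise RuntimeError("no acceptable damping factor found")
        s, lam_prev = s + lam * delta, lam
    return s

root = damped_newton(math.atan, lambda s: 1.0 / (1.0 + s * s), 3.0)
```

In the vector case the division by the derivative becomes a linear solve with F'(s_m), which is why each trial value of λ costs one linear system solve plus one evaluation of F.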
The objective function not only provides an indication of convergence, but can
also be used to improve the initial guess and restart the iteration. Starting from the
user-provided initial guess $s_0$, for any iterate $s_i$, $i > 0$, with $\hat{g}(s_i) < \hat{g}(s_0)$, we
assume that $s_i$ is closer to the solution than $s_0$. Newton's method is sensitive
to the choice of initial guess, and converges to the solution very rapidly once
the initial guess is close to a solution. By monitoring the value of $\hat{g}(s_i)$ on each
iteration, we can therefore replace the user-provided initial guess $s_0$ by a better
initial guess $s_i$ when a restart is indicated. The modified Newton iteration needs to
be restarted when certain events happen, such as when new mesh point(s) need to be
inserted or divergence of $\hat{g}(s_m)$ is detected. Then the $s_i$ corresponding to the
smallest value of $\hat{g}(s_i)$ obtained so far will be the initial guess for the restarted
iteration. We hope that the better initial guess improves the chance of convergence.
The block matrix $Y_i(x_{i+1})$, $1 \le i \le N$, in (16) can be interpreted as a local
sensitivity matrix (see [12] for details). We apply a QR-decomposition $Y_i(x_{i+1}) =
Q_i R_i$. Let $r_{ii}$, $1 \le i \le n$, denote the diagonal entries of $R_i$. We use the ratio
$$\frac{\max_i |r_{ii}|}{\min_i |r_{ii}|} \qquad (19)$$
as a crude condition number estimator of the associated IVP (see also [13] for a
similar scheme). This condition number estimator can indicate stiffness for $t \in
[x_i, x_{i+1}]$. If the condition number is large, then we insert an additional mesh point
in the middle of $[x_i, x_{i+1}]$. This mechanism can be used to determine the number
of mesh points automatically without requiring the user to provide an initial mesh.
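The estimator (19) and the midpoint insertion rule can be sketched as follows. This is an illustration under assumptions rather than our implementation: the QR factorization is done by modified Gram-Schmidt in plain Python to keep the example self-contained, and the threshold of 1e6 for a "large" condition number is a hypothetical value, since no specific value is given above.

```python
import math

def r_diagonal(Y):
    """Diagonal of R in Y = QR, computed by modified Gram-Schmidt."""
    n = len(Y)
    cols = [[Y[i][j] for i in range(n)] for j in range(n)]
    q, diag = [], []
    for v in cols:
        v = list(v)
        for u in q:
            proj = sum(a * b for a, b in zip(u, v))
            v = [b - proj * a for a, b in zip(u, v)]
        norm = math.sqrt(sum(b * b for b in v))
        diag.append(norm)
        if norm > 0.0:
            q.append([b / norm for b in v])
    return diag

def condition_estimate(Y):
    """Ratio max|r_ii| / min|r_ii| of the diagonal of R, as in (19)."""
    d = r_diagonal(Y)
    return max(d) / min(d) if min(d) > 0.0 else math.inf

def refine(mesh, estimates, threshold=1e6):
    """Insert a midpoint in every subinterval whose estimate exceeds threshold."""
    new_mesh = [mesh[0]]
    for left, right, est in zip(mesh, mesh[1:], estimates):
        if est > threshold:
            new_mesh.append(0.5 * (left + right))
        new_mesh.append(right)
    return new_mesh

est_mild = condition_estimate([[2.0, 0.0], [0.0, 0.5]])
est_bad = condition_estimate([[1.0, 1.0], [0.0, 1e-6]])
new_mesh = refine([0.0, 0.5, 1.0], [est_mild, 1e8])
```

A library QR routine based on Householder reflections may return negative diagonal entries, which is why (19) uses absolute values.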
We set the maximum number of iterations to 30. If after 30 iterations the result
still does not satisfy the given tolerance, then we double the number of mesh
points and try again. We also set the maximum number of mesh points to 1000.
If at any time the program tries to increase the number of mesh points beyond
1000, then we assume the given problem is too difficult for the multiple shooting
method, and the method exits and signals a failure.
5 Numerical Tests
We report on some numerical tests to illustrate the performance of our new BVP
solver (denoted as MUSCRK). We use, as test problems, BVPs that depend on a
single parameter. In most cases, as the parameter changes, the problems change
from non-stiff to stiff. In this way, we can measure both performance and the range
of stiffness where the solver can be effective.
We use RDC CRK78 as the CRK IVP solver [14]. For comparison purposes,
we also report the performance of two other BVP solvers: COLNEW and MUSN.
As noted earlier, COLNEW is a popular BVP solver based on collocation. It can
be applied to a wide range of problems from non-stiff to stiff. In our testing, we
set the order to 8, matching the order of CRK78. MUSN is a multiple shooting
BVP solver discussed earlier. The IVP solver of MUSN is based on a Fehlberg 45
Runge-Kutta method, which is a lower order method than MUSCRK or COLNEW.
We consider several test problems and report results for MUSCRK, COLNEW,
and MUSN. Each problem is solved with two different tolerances, $10^{-3}$ and
$10^{-6}$, and we compare the results based on the number of mesh points and the
number of iterations (reported as $(N, m)$, where $N$ is the number of mesh points
and $m$ is the number of iterations), execution time, maximum defect (if applicable),
and maximum error. If the reported pair appears as $(N, *)$, then there is no
convergence with $N$ mesh points. If $(N_1, m_1), (N_2, m_2), \ldots$ appear, this indicates
that although the method converges to a solution with $N_1$ mesh points after $m_1$
iterations, the associated error estimate does not satisfy the given tolerance, and
further refinement is needed.
Here, the number of mesh points $N$ has different meanings for BVP solvers
based on shooting (MUSCRK and MUSN) and for the collocation method (COLNEW).
For BVP solvers based on shooting, $N$ is intended to control the instability
of the IVPs (as discussed at the end of section 2), and is normally insensitive to the
value of $TOL$. For a BVP solver based on collocation, $N$ determines the underlying
discretization step, and must increase as $TOL$ decreases.
Execution time is measured on a Dell Studio 1558 laptop running Ubuntu 11.04.
Each of the three methods solves a given test problem three times, and we report
the average time. During the timing runs, the program does not compute any of the
other reported statistics.
MUSCRK uses a defect control mechanism on every step, so we can report the
maximum defect on the whole interval for each test problem. The other two
BVP solvers do not provide defect control, so we only report the maximum defect
for MUSCRK.
For each of our test problems, there is no known explicit analytic solution. We
solve these BVPs with a very severe tolerance (usually $10^{-10}$), and use the result
as the reference solution to the problem. Concerning the maximum error, MUSN
only produces the approximate solution at mesh points, so for MUSN the maximum
error is the maximum difference between the reference solution and the computed
solution at mesh points. Since both MUSCRK and COLNEW can produce approximate
solutions between mesh points, we divide the interval of each test problem
into 100 subintervals, evaluate the approximate solution at these points, and
use the maximum difference between the reference solution and the approximate
solution over these points as the maximum error for both MUSCRK and COLNEW.
We have chosen four test problems from the boundary value literature. Each
depends on a parameter and we have generated results for four parameter values
for each problem. If for a given parameter value, the BVP method cannot converge
to a solution, we leave the column blank in the table associated with the method.
y(0) = 0, y(1) = 1
and parameter values τ = 1, 7, 10, 16. The default initial guess is
y1 (t) = 0, y2 (t) = 0.
Refer to Figure 4 for a plot of the solutions and Tables 2 and 3 for a summary of
the results for the 3 methods. This problem is also known as Troesch’s equation.
Increasing τ increases the stiffness of this ODE (taken from [13]).
and parameter values ε = 1.0, 0.05, 0.005, 0.001, and initial guess
This problem is taken from [1]. Refer to Figure 5 for a plot of the solutions and
Tables 4 and 5 for a summary of the results for the 3 methods.
MUSCRK and COLNEW produce different solutions when ε = 0.001. As Figure 6
shows, MUSCRK produces a symmetric solution, while COLNEW produces
a non-symmetric solution. Are both actual solutions to the problem? We used a
reliable stiff IVP solver to verify both solutions. That is, we applied the IVP solver
to the differential equation with the initial condition associated with the value of
method result
τ ⇒ 1 7 10 16
Table 2: Results for plasma confinement problem with $TOL = 10^{-3}$ and 4 values
of τ
method result
τ ⇒ 1 7 10 16
Table 3: Results for plasma confinement problem with $TOL = 10^{-6}$ and 4 values
of τ
method result
ε ⇒ 1.0 0.05 0.005 0.001
Profile (5, 3)
MUSN
Time (sec) 0.006
Error $1.55 \times 10^{-9}$
Table 4: Results for swirling flow III problem with $TOL = 10^{-3}$ and 4 values of ε
∗ for severe $TOL = 10^{-10}$ COLNEW is not able to converge to a solution
method result
ε ⇒ 1.0 0.05 0.005 0.001
Profile (5, 3)
MUSN
Time (sec) 0.006
Error $1.50 \times 10^{-9}$
Table 5: Results for swirling flow III problem with $TOL = 10^{-6}$ and 4 values of ε
∗ for severe $TOL = 10^{-10}$ COLNEW is not able to converge to a solution
Figure 5: Solutions of swirling flow III problem for 4 values of ε
the converged BVP solution determined by the two BVP methods. In each case the
approximate solution generated for the respective initial value problem satisfies the
BVP boundary condition at the right endpoint. From the results of the IVP
approximations, we believe that both solutions are actual solutions to this problem.
Figure 6: Solutions of swirling flow III problem for ε = 0.001, f (t) versus t
We further test how different initial guesses can affect the solutions of BVP
solvers for this problem. We used the non-symmetric solution of COLNEW (ε =
0.001 and $TOL = 10^{-6}$) to determine an initial guess for MUSCRK (at 50 equally
distributed points). The result shows that MUSCRK can be forced
to converge to the non-symmetric solution. On the other hand, we used the
symmetric solution of MUSCRK at 50 equally distributed mesh points (ε = 0.001
and $TOL = 10^{-6}$) to determine an initial guess for COLNEW. COLNEW did not
converge to a solution. When we increased the number to 100 equally distributed
mesh points, COLNEW converged to a totally different result, as Figure 7 shows.
This result is different from either solution in Figure 6. We could not verify this
result using a reliable stiff IVP solver, and we conclude that it is not a solution of
the problem.
Figure 7: Result of COLNEW for the swirling flow III problem with ε = 0.001, f(t)
versus t, using the MUSCRK solution at 100 equally distributed mesh points as the
initial guess for COLNEW
and parameter values ε = 0.1, 0.05, 0.01, 0.005. The default initial guess is
(taken from [15]). Refer to Figure 8 for a plot of the solutions and Tables 6 and 7
for a summary of the results for the 3 methods.
From Figure 8 it is clear that there are two types of solutions determined by
the three methods. MUSCRK seems to produce one type (an oscillating solution
with a frequency that increases as ε is decreased). The second type of solution is
a “U-shaped” solution produced by COLNEW and MUSN with a boundary layer
at both endpoints that becomes sharper as ε is decreased. The question is: are both
of these actual solutions of the problem?
To investigate this question, we use the oscillating result (ε = 0.05 and $TOL =
10^{-6}$) to determine an initial guess supplied to COLNEW and MUSN. We use 20
equally distributed mesh points to capture the features of the oscillating result for
this initial guess, and plot the solutions obtained in Figure 9. The result of MUSN
seems to confirm that the oscillating result is an actual solution of the problem.
method result
ε ⇒ 0.1 0.05 0.01 0.005
Table 6: Results for nonlinear elastic beams problem with $TOL = 10^{-3}$ and 4
values of ε
Figure 8: Solutions of the nonlinear elastic beams problem for 4 values of ε (left
column solutions are computed by MUSCRK, middle column solutions are com-
puted by COLNEW, and right column solutions are computed by MUSN)
method result
ε ⇒ 0.1 0.05 0.01 0.005
Table 7: Results for nonlinear elastic beams problem with $TOL = 10^{-6}$ and 4
values of ε
Figure 9: The result of nonlinear elastic beams problem using the oscillating
solution (ε = 0.05 and $TOL = 10^{-6}$) as initial guess for COLNEW and MUSN,
M(t) versus t (panels: initial guess of 20 equally distributed mesh points, result of
COLNEW, result of MUSN)
Figure 10: The result of nonlinear elastic beams problem using the “U-shaped”
solution ($\tau = 10^{-5}$ and $TOL = 10^{-6}$) as initial guess for MUSCRK, M(t) versus t
(panels: initial guess of 16 unequally distributed mesh points, result of MUSCRK)
method result
τ ⇒ 0.01 1e−3 1e−4 1e−5
Table 8: Results for artificial boundary layer problem with $TOL = 10^{-3}$ and 4
values of τ
Figure 11: Solutions of artificial boundary layer problem for values of τ (left col-
umn solutions are computed by MUSCRK, middle column solutions are computed
by COLNEW, and right column solutions are computed by MUSN)
method result
τ ⇒ 0.01 1e−3 1e−4 1e−5
Table 9: Results for artificial boundary layer problem with $TOL = 10^{-6}$ and 4
values of τ
Figure 12: The result of artificial boundary layer problem using the “S-shaped”
solution ($\tau = 10^{-5}$ and $TOL = 10^{-6}$) as initial guess for MUSN, y(t) versus t
(panels: initial guess of 15 unequally distributed mesh points, result of MUSN)
Figure 13: The result of artificial boundary layer problem using the oscillating
solution ($\tau = 10^{-4}$ and $TOL = 10^{-6}$) as initial guess for MUSCRK and COLNEW,
y(t) versus t (panels: initial guess of 50 equally distributed mesh points, result of
MUSCRK, result of COLNEW)
MUSN are solutions of the BVP. In addition, we investigate the inaccurate
approximate solution in Figure 12 produced by MUSN (for all these IVP solutions, we
use $\tau = 10^{-5}$ and $TOL = 10^{-3}, 10^{-6}, 10^{-8}$) to check whether they are
solutions or not. From the results of the IVP approximations, we are able to confirm
that the “S-shaped” result is a solution, while all the other approximations produced
by MUSN are not. From the oscillating solution of MUSN, we select 5, 10, and
50 equally distributed mesh points as initial values for the IVP solver; none of the
IVP approximations reproduces the oscillating result.
The approximation in Figure 12 produced by MUSN is obtained using 15
unequally distributed mesh points from the solution of COLNEW as the initial guess.
Using an IVP solver confirms that it is not a solution. When we use 33 unequally
distributed mesh points of the solution of COLNEW as initial guesses, MUSN
produces a result similar to the one in Figure 12, and the IVP approximation confirms
that it is not a solution either. So we can conclude that both the oscillating results
and the inaccurate approximation in Figure 12 produced by MUSN are not solutions
of the artificial boundary layer problem.
MUSN do. MUSN cannot adjust the mesh points during the solution process.
MUSN can only generate a solution at mesh points if it converges,
or display an error message if it does not converge. So if dense output is required
with MUSN, the only choice is to increase the number of mesh points. COLNEW
can adjust mesh points during the process of solving BVPs. Generally for non-stiff
BVPs, MUSCRK requires fewer mesh points to converge to a solution than
do COLNEW and MUSN. For mildly stiff BVPs, MUSCRK requires fewer mesh
points to converge than MUSN.
Also, as the comparison tests above show, multiple solutions may exist for a
given BVP, and different initial guesses can determine which solution is approximated.
MUSCRK generally can be forced to converge to a different solution by changing
the initial guess, while it is sometimes difficult for COLNEW to do so. Users
assume that the approximations returned by a BVP solver are approximations to a
true solution. In our lengthy testing, we were able to find examples where this is
not the case for the two widely used BVP solvers, COLNEW and MUSN.
One of the disadvantages of our implementation of MUSCRK is that as the number
of mesh points N becomes large, there are N IVPs to be integrated, and continuous
Runge-Kutta methods require substantially more function evaluations per step
than a discrete Runge-Kutta method of the same order. We measured the time
spent on function evaluations for some of the test problems, and found that
function evaluations account for about 80% of the total computation time.
One of the features of a multiple shooting method is that the integration
of the N IVPs in (5) can be performed independently (mentioned in [1], section 4.3).
If we exploit this observation in MUSCRK, it should be possible to improve the
performance significantly.
Another disadvantage of MUSCRK compared with COLNEW is that as the
problem becomes stiffer, MUSCRK generally fails to converge to a solution earlier
than COLNEW. This is because we use explicit continuous Runge-Kutta methods
(such as CRK78 and CRK56). We could switch to an implicit Runge-Kutta method
when the problem becomes stiff.
7 Acknowledgments
I am grateful to my supervisor Prof. Wayne Enright for his efforts in conducting
my research project and preparing this thesis. I would like to thank M. Shakourifar
for his helpful discussions during the preparation of this paper. I would also like to
thank Dr. Tom Fairgrieve for his valuable comments and suggestions after carefully
reading this thesis.
References
[1] U.M. Ascher, R.M.M. Mattheij, and R.D. Russell. Numerical Solution of
Boundary Value Problems for Ordinary Differential Equations. Classics in
Applied Mathematics Series, Society for Industrial and Applied Mathematics,
Philadelphia, 1995.
[3] Herbert B. Keller. Numerical Solution of Two Point Boundary Value Prob-
lems. SIAM, 1976.
[4] R.M.M. Mattheij and G.W.M. Staarink. An efficient algorithm for solving
general linear two point bvp. Report 8220, Math. Inst. Catholic University,
Nijmegen, 1982.
[8] W.H. Enright. A new error-control for initial value solvers. Applied Mathe-
matics and Computation - AMC, 31(3):288–301, 1989.
[9] W.H. Enright and Li Yan, 2009. The Reliability/Cost Trade-off for a Class of
ODE solvers.
[10] W.H. Enright. Continuous numerical methods for ODEs with defect control.
Journal of Computational and Applied Mathematics, 125:159–170, 2000.
[11] W.H. Enright and P.H. Muir. New interpolants for asymptotically correct
defect control of BVODEs. Numerical Algorithms, 53(2):219–238, 2010.
[12] U.M. Ascher and L. P. Petzold. Computer methods for Ordinary Differential
Equations and Differential-Algebraic Equations. SIAM, 1998.