Graeme Fairweather and Ian Gladwell
Let π = {x_j}_{j=0}^{N+1} denote a uniform partition of the interval I such that x_j = jh, j = 0, 1, …, N+1, and (N+1)h = 1. On this partition, the solution u of (1.1)–(1.2) is approximated by the mesh function {u_j}_{j=0}^{N+1} defined by the finite difference equations

(1.4) L_h u_j ≡ −(u_{j+1} − 2u_j + u_{j−1})/h² + p_j (u_{j+1} − u_{j−1})/(2h) + q_j u_j = f_j, j = 1, …, N,

(1.5) u_0 = g_0, u_{N+1} = g_1,

where

p_j = p(x_j), q_j = q(x_j), f_j = f(x_j).
Equations (1.4) are obtained by replacing the derivatives in (1.1) by basic
centered difference quotients.
We now show that under certain conditions the difference problem (1.4)–(1.5) has a unique solution {u_j}_{j=0}^{N+1}, which is second order accurate; that is,

|u(x_j) − u_j| = O(h²), j = 1, …, N.
1.2 The Uniqueness of the Difference Approximation
From (1.4), we obtain
(1.6) h² L_h u_j = −(1 + (h/2) p_j) u_{j−1} + (2 + h² q_j) u_j − (1 − (h/2) p_j) u_{j+1} = h² f_j, j = 1, …, N.
The totality of difference equations (1.6), subject to (1.5), may be written
in the form
(1.7) Au = b,
where

u = (u_1, u_2, …, u_N)^T, b = h² (f_1, f_2, …, f_N)^T − (c_1 g_0, 0, …, 0, e_N g_1)^T,

and A is the N × N tridiagonal matrix

A = [ d_1 e_1 ; c_2 d_2 e_2 ; ⋱ ⋱ ⋱ ; c_{N−1} d_{N−1} e_{N−1} ; c_N d_N ],

and, for j = 1, …, N,

(1.8) c_j = −(1 + (h/2) p_j), d_j = 2 + h² q_j, e_j = −(1 − (h/2) p_j).
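For illustration, the following Python sketch assembles the coefficients (1.8) and solves the tridiagonal system (1.7) with a banded solver; the routine names and the use of numpy/scipy are assumptions of the sketch, not part of the text.

    # Sketch: assemble and solve the finite difference system (1.7)-(1.8).
    import numpy as np
    from scipy.linalg import solve_banded

    def fd_solve(p, q, f, g0, g1, N):
        """Solve -u'' + p u' + q u = f on [0,1], u(0) = g0, u(1) = g1."""
        h = 1.0 / (N + 1)
        x = h * np.arange(1, N + 1)          # interior mesh points x_1..x_N
        c = -(1.0 + 0.5 * h * p(x))          # subdiagonal coefficients c_j
        d = 2.0 + h**2 * q(x)                # diagonal coefficients d_j
        e = -(1.0 - 0.5 * h * p(x))          # superdiagonal coefficients e_j
        b = h**2 * f(x)
        b[0] -= c[0] * g0                    # boundary values moved to b
        b[-1] -= e[-1] * g1
        ab = np.zeros((3, N))                # banded storage for the solver
        ab[0, 1:] = e[:-1]
        ab[1, :] = d
        ab[2, :-1] = c[1:]
        u = solve_banded((1, 1), ab, b)
        return np.concatenate(([g0], u, [g1])), np.concatenate(([0.0], x, [1.0]))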
We prove that there is a unique {u_j}_{j=1}^{N} by showing that the tridiagonal matrix A is strictly diagonally dominant and hence nonsingular.
Theorem 1.1. If h < 2/p∗, then the matrix A is strictly diagonally dominant.
Proof — If h < 2/p∗ then

|c_j| = 1 + (h/2) p_j, |e_j| = 1 − (h/2) p_j,

and

|c_j| + |e_j| = 2 < d_j, j = 2, …, N − 1.

Also,

|e_1| < d_1, |c_N| < d_N,

which completes the proof.
1.3 Consistency, Stability and Convergence
To study the accuracy and the computability of the difference approximation {u_j}_{j=0}^{N+1}, we introduce the concepts of consistency, stability and convergence
of finite difference methods. The basic result proved in this section is that,
for a consistent method, stability implies convergence.
Definition 1.1. The difference problem (1.4)–(1.5) is consistent with (1.1)–(1.2) if, for all sufficiently smooth functions w,

max_{1≤j≤N} |τ_{j,π}[w]| → 0 as h → 0,

where τ_{j,π}[w] ≡ L_h w(x_j) − Lw(x_j). The quantities τ_{j,π}[w], j = 1, …, N, are called the local truncation (or local discretization) errors.
Definition 1.2. The difference problem (1.4)–(1.5) is locally pth–order accurate if, for sufficiently smooth data, there exists a positive constant C, independent of h, such that

max_{1≤j≤N} |τ_{j,π}[w]| ≤ C h^p.

Lemma 1.1. If w ∈ C⁴(I), then

τ_{j,π}[w] = −(h²/12) [w^{(4)}(ν_j) − 2 p(x_j) w^{(3)}(θ_j)],

where ν_j and θ_j lie in (x_{j−1}, x_{j+1}).

Proof — By definition, τ_{j,π}[w] = L_h w(x_j) − Lw(x_j), and it is easy to show, using Taylor's theorem to expand w(x_{j±1}) about x_j, that the stated form of the local truncation error follows.
The difference operator L_h is stable if, for sufficiently small h, there exists a positive constant K, independent of h, such that, for every mesh function {v_j}_{j=0}^{N+1},

max_{0≤j≤N+1} |v_j| ≤ K { max(|v_0|, |v_{N+1}|) + max_{1≤j≤N} |L_h v_j| }.

Theorem 1.2. If the functions p and q satisfy (1.3), then the difference operator L_h of (1.4) is stable for h < 2/p∗, with K = max{1, 1/q∗}.
Proof — Suppose

|v_{j∗}| = max_{0≤j≤N+1} |v_j|, 1 ≤ j∗ ≤ N.

From the j∗th equation of (1.6),

d_{j∗} |v_{j∗}| ≤ (|e_{j∗}| + |c_{j∗}|) |v_{j∗}| + h² max_{1≤j≤N} |L_h v_j|.

Since |c_{j∗}| + |e_{j∗}| = 2 and d_{j∗} = 2 + h² q_{j∗} ≥ 2 + h² q∗, this gives

h² q∗ |v_{j∗}| ≤ h² max_{1≤j≤N} |L_h v_j|,

or

|v_{j∗}| ≤ (1/q∗) max_{1≤j≤N} |L_h v_j|.

Thus, if max_{0≤j≤N+1} |v_j| occurs for 1 ≤ j ≤ N then

max_{0≤j≤N+1} |v_j| ≤ (1/q∗) max_{1≤j≤N} |L_h v_j|;

otherwise max_{0≤j≤N+1} |v_j| = max(|v_0|, |v_{N+1}|). In either case the stability bound holds with K = max{1, 1/q∗}.

In particular, stability implies uniqueness: if

L_h v_j = 0, j = 1, …, N, v_0 = v_{N+1} = 0,

then clearly v_j = 0, j = 0, …, N + 1.
Theorem 1.3. Suppose u ∈ C⁴(I) and h < 2/p∗. Then the difference solution {u_j}_{j=0}^{N+1} of (1.4)–(1.5) is convergent to the solution u of (1.1)–(1.2). Moreover,

max_{0≤j≤N+1} |u_j − u(x_j)| ≤ C h².
In practice the order of accuracy is often verified numerically. Suppose that, for h sufficiently small, the error e_h = max_{0≤j≤N+1} |u_j − u(x_j)| satisfies

e_h ≈ C h^p.

If we solve the difference problem with two different mesh lengths h_1 and h_2 such that h_2 < h_1 < h_0, then

e_{h_1} ≈ C h_1^p and e_{h_2} ≈ C h_2^p,

from which it follows that

ln e_{h_1} ≈ p ln h_1 + ln C and ln e_{h_2} ≈ p ln h_2 + ln C.

Therefore an estimate of p can be calculated from

(1.13) p ≈ ln(e_{h_1}/e_{h_2}) / ln(h_1/h_2).
In practice, one usually solves the difference problem for a sequence of values
of h, h0 > h1 > h2 > h3 > . . ., and calculates the ratio on the right hand
side of (1.13) for successive pairs of values of h. These ratios converge to
the value of p as h → 0.
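In code, the estimate (1.13) is one line; the following Python sketch (function and argument names are assumptions) returns the observed orders for a sequence of mesh sizes and errors:

    # Sketch: observed orders of accuracy from (1.13).
    import numpy as np

    def observed_orders(hs, errs):
        hs, errs = np.asarray(hs, float), np.asarray(errs, float)
        # ratio of logs for successive pairs (h_1, h_2), (h_2, h_3), ...
        return np.log(errs[:-1] / errs[1:]) / np.log(hs[:-1] / hs[1:])

For a second-order method with halved meshes these values should approach 2.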
As is shown in the next theorem, if (1.14) holds, then the stability of the difference operator L_h ensures that there exists a function e(x), independent of h, such that

(1.15) u(x̂) = u_j^h + h² e(x̂) + O(h⁴),

where x̂ = x_j is a point of the mesh of spacing h, and

u(x̂) = u_{2j}^{h/2} + (h²/4) e(x̂) + O(h⁴),

since x̂ = x_{2j} on the mesh of spacing h/2. On eliminating e(x̂), it follows that

u(x̂) = u_{2j}^{h/2} + (1/3)(u_{2j}^{h/2} − u_j^h) + O(h⁴).

Thus

u_j^{(1)} ≡ u_{2j}^{h/2} + (1/3)(u_{2j}^{h/2} − u_j^h)

is a fourth–order approximation to u(x_j), j = 0, …, N + 1.
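This extrapolation (Richardson extrapolation) is easily implemented; the sketch below reuses the illustrative solver fd_solve given earlier, so the same assumptions apply.

    # Sketch: fourth-order values at the coarse nodes from two
    # second-order solutions with mesh sizes h and h/2.
    def richardson(p, q, f, g0, g1, N):
        uh, _ = fd_solve(p, q, f, g0, g1, N)             # mesh size h
        uh2, xh2 = fd_solve(p, q, f, g0, g1, 2 * N + 1)  # mesh size h/2
        u2j = uh2[::2]                                   # values at coarse nodes
        return u2j + (u2j - uh) / 3.0, xh2[::2]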
In deferred corrections, the difference equations (1.4)–(1.5) are solved in the usual way to obtain {u_j}. Then a fourth–order difference approximation {û_j}_{j=0}^{N+1} to u(x) is computed by solving difference equations which are a perturbation of (1.4)–(1.5) expressed in terms of {u_j}. Suitable definitions of {û_j}_{j=0}^{N+1} are discussed once we derive (1.15).
Theorem 1.4. Suppose u ∈ C⁶(I), and h < 2/p∗. Then (1.14) and (1.15) hold with e(x) defined as the solution of the boundary value problem

(1.16) Le(x) = τ[u(x)], x ∈ I, e(0) = e(1) = 0,

where

(1.17) τ[u(x)] = −(1/12) [u^{(4)}(x) − 2 p(x) u^{(3)}(x)].
Proof — Since u ∈ C 6 (I), it is easy to show by extending the argument
used in Lemma 1.1 that (1.14) holds with τ [u(x)] defined by (1.17).
As in the proof of Theorem 1.3, the desired result, (1.15), then follows from the stability of L_h.
Deferred corrections is defined in the following way. Suppose τ̂_{j,π}[ · ] is a difference operator such that

τ̂_{j,π}[{u_j}] = h² τ[u(x_j)] + O(h⁴)

for the solution u of (1.1)–(1.2). Approximating u^{(4)}(x_j) and u^{(3)}(x_j) by second-order difference quotients, (1.22) and (1.23), and substituting these into (1.17), we obtain the desired form (1.21) of such an operator.
In deferred corrections, the linear algebraic systems defining the basic
difference approximation and the fourth-order approximation have the same
coefficient matrix, which simplifies the algebraic problem.
If the solution u of the boundary value problem (1.1)–(1.2) is sufficiently
smooth, then it can be shown that the local truncation error τj,π [u] has an
asymptotic expansion of the form
τ_{j,π}[u] = Σ_{ν=1}^{m} h^{2ν} τ_ν[u(x_j)] + O(h^{2m+2}).
Now suppose p ≡ 0 in (1.1), so that u″ = q(x)u − f(x), and let ∆_h denote the second central difference operator,

∆_h v(x_j) = (v(x_{j+1}) − 2v(x_j) + v(x_{j−1}))/h².

Then

(1.24) ∆_h u(x_j) = u″(x_j) + (h²/12) u^{(4)}(x_j) + O(h⁴),

and therefore

(1.25) u^{(4)}(x_j) = ∆_h[q(x_j)u(x_j) − f(x_j)] + O(h²).

Thus, from (1.24) and (1.25),

u″(x_j) = ∆_h u(x_j) − (h²/12) ∆_h[q(x_j)u(x_j) − f(x_j)] + O(h⁴).
We define {ũ_j}_{j=0}^{N+1} by

(1.26) L_h ũ_j ≡ −∆_h ũ_j + (1 + (h²/12) ∆_h)(q_j ũ_j) = (1 + (h²/12) ∆_h) f(x_j), j = 1, …, N,
ũ_0 = g_0, ũ_{N+1} = g_1,

which is commonly known as Numerov's method. Equations (1.26) are symmetric tridiagonal and may be written in the form

c̃_j ũ_{j−1} + d̃_j ũ_j + ẽ_j ũ_{j+1} = (h²/12)[f_{j+1} + 10 f_j + f_{j−1}], j = 1, …, N,

where

c̃_j = −(1 − (h²/12) q_{j−1}), d̃_j = 2 + (5h²/6) q_j, ẽ_j = −(1 − (h²/12) q_{j+1}).
It is easy to show that, for h sufficiently small, the coefficient matrix of this system is strictly diagonally dominant and hence {ũ_j}_{j=0}^{N+1} is unique. From a similar analysis, it follows that the difference operator L_h is stable. Also, since

L_h[ũ_j − u(x_j)] = (1 + (h²/12) ∆_h) f(x_j) − L_h u(x_j) = O(h⁴),

the stability of L_h implies that

|ũ_j − u(x_j)| = O(h⁴).
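A sketch of Numerov's method for the linear problem, using the tridiagonal coefficients above (Python; the problem data q, f, g0, g1 are assumed supplied by the caller):

    # Sketch: Numerov's method (1.26) for -u'' + q u = f, u(0)=g0, u(1)=g1.
    import numpy as np
    from scipy.linalg import solve_banded

    def numerov(q, f, g0, g1, N):
        h = 1.0 / (N + 1)
        x = h * np.arange(N + 2)                     # full mesh x_0..x_{N+1}
        qv, fv = q(x), f(x)
        ct = -(1.0 - h**2 / 12.0 * qv[0:N])          # c~_j (uses q_{j-1})
        dt = 2.0 + 5.0 * h**2 / 6.0 * qv[1:N+1]      # d~_j
        et = -(1.0 - h**2 / 12.0 * qv[2:N+2])        # e~_j (uses q_{j+1})
        b = h**2 / 12.0 * (fv[2:N+2] + 10.0 * fv[1:N+1] + fv[0:N])
        b[0] -= ct[0] * g0                           # boundary contributions
        b[-1] -= et[-1] * g1
        ab = np.zeros((3, N))
        ab[0, 1:], ab[1, :], ab[2, :-1] = et[:-1], dt, ct[1:]
        u = solve_banded((1, 1), ab, b)
        return np.concatenate(([g0], u, [g1])), x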
Consider now the nonlinear two-point boundary value problem

(1.27) −u″ + f(x, u) = 0, x ∈ I, u(0) = g_0, u(1) = g_1.

The basic second–order finite difference approximation to (1.27) takes the form

(1.28) L_h u_j ≡ −∆_h u_j + f(x_j, u_j) = 0, j = 1, …, N,
u_0 = g_0, u_{N+1} = g_1.
If

τ_{j,π}[u] ≡ L_h u(x_j) − Lu(x_j),

then clearly

τ_{j,π}[u] = −(h²/12) u^{(4)}(ξ_j), ξ_j ∈ (x_{j−1}, x_{j+1}),

if u ∈ C⁴(I). Stability of the nonlinear difference problem is defined in the following way.
Definition 1.6. A difference problem defined by the nonlinear difference operator L_h is stable if, for sufficiently small h, there exists a positive constant K, independent of h, such that, for all mesh functions {v_j}_{j=0}^{N+1} and {w_j}_{j=0}^{N+1},

|v_j − w_j| ≤ K { max(|v_0 − w_0|, |v_{N+1} − w_{N+1}|) + max_{1≤i≤N} |L_h v_i − L_h w_i| }.
Theorem 1.5. If f_u ≡ ∂f/∂u is continuous on I × (−∞, ∞) and such that

0 < q∗ ≤ f_u,

then the difference operator L_h of (1.28) is stable.

Proof — For any mesh functions {v_j} and {w_j},

L_h v_j − L_h w_j = −∆_h(v_j − w_j) + f_u(x_j, v̂_j)(v_j − w_j),

on using the mean value theorem, where v̂_j lies between v_j and w_j. Then

h²(L_h v_j − L_h w_j) = c_j(v_{j−1} − w_{j−1}) + d_j(v_j − w_j) + e_j(v_{j+1} − w_{j+1}),

where c_j = e_j = −1 and

d_j = 2 + h² f_u(x_j, v̂_j).

Clearly

|c_j| + |e_j| = 2 < |d_j|.

The remainder of the proof is similar to that of Theorem 1.2.
An immediate consequence of the stability of L_h is the following:

|u_j − u(x_j)| ≤ K (h²/12) max_{x∈I} |u^{(4)}(x)|, j = 0, …, N + 1,

provided u ∈ C⁴(I).
In matrix form, the difference equations (1.28) become

(1.29) J u + h² f(u) = g,

where

(1.30) J = tridiag(−1, 2, −1), u = (u_1, …, u_N)^T, f(u) = (f(x_1, u_1), …, f(x_N, u_N))^T, g = (g_0, 0, …, 0, g_1)^T.
This system is usually solved using Newton’s method (or a variant of it),
which is described in Section 1.8.
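Anticipating Section 1.8, a minimal Newton iteration for (1.29) looks as follows (Python sketch; f and its partial derivative fu are assumed supplied, and the simple step-size stopping test is an assumption of the sketch):

    # Sketch: Newton's method for J u + h^2 f(u) = g, J = tridiag(-1,2,-1).
    import numpy as np
    from scipy.linalg import solve_banded

    def newton_fd(f, fu, g0, g1, N, u0, tol=1e-10, maxit=20):
        h = 1.0 / (N + 1)
        x = h * np.arange(1, N + 1)
        g = np.zeros(N); g[0], g[-1] = g0, g1
        u = u0.copy()
        for _ in range(maxit):
            Ju = 2.0 * u                          # J u with zero end values
            Ju[1:] -= u[:-1]
            Ju[:-1] -= u[1:]
            phi = Ju + h**2 * f(x, u) - g         # residual of (1.29)
            ab = np.zeros((3, N))                 # Jacobian J + h^2 diag(f_u)
            ab[0, 1:] = -1.0
            ab[1, :] = 2.0 + h**2 * fu(x, u)
            ab[2, :-1] = -1.0
            du = solve_banded((1, 1), ab, -phi)
            u = u + du
            if np.max(np.abs(du)) <= tol:
                break
        return u, x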
1.7 Numerov's Method for Second Order Nonlinear Equations
Numerov’s method for the solution of (1.27) may be derived in the following
way. If u is the solution of (1.27) and f is sufficiently smooth then
u^{(4)}(x_j) = (d²/dx²) f(x, u)|_{x=x_j} = ∆_h f(x_j, u(x_j)) + O(h²).

Since

∆_h u(x_j) = u″(x_j) + (h²/12) u^{(4)}(x_j) + O(h⁴),

we have

∆_h u(x_j) = f(x_j, u(x_j)) + (h²/12) ∆_h f(x_j, u(x_j)) + O(h⁴).

Based on this equation, Numerov's method is defined by the nonlinear equations

(1.31) −∆_h u_j + (1 + (h²/12) ∆_h) f(x_j, u_j) = 0, j = 1, …, N,
u_0 = g_0, u_{N+1} = g_1,
which may be written in matrix form as

(1.32) J u + h² B f(u) = g,

where J is as in (1.30), B = (1/12) tridiag(1, 10, 1), and g contains the boundary contributions.
A fourth–order finite difference method was derived by Stepleman for the more general equation

−u″ + f(x, u, u′) = 0, x ∈ I,

subject to general linear boundary conditions. When applied to the boundary value problem (1.27), this method reduces to Numerov's method. Higher-order finite difference approximations to nonlinear equations have also been derived by Doedel (1979) using techniques different from those employed by Stepleman.
1.8 Newton's Method

Newton's method for a single nonlinear equation φ(u) = 0 consists in choosing u^{(0)}, an approximation to the actual solution, and solving the linearized equation

φ(u^{(0)}) + φ′(u^{(0)}) ∆u = 0.

The value u^{(1)} = u^{(0)} + ∆u is then accepted as a better approximation and the process is continued if necessary.
Consider now the N equations

(1.34) φ_i(u_1, u_2, …, u_N) = 0, i = 1, …, N,

which we write in vector form as

Φ(u) = 0.

Linearizing about an initial approximation u^{(0)} gives the system

(1.35) Σ_j (∂φ_i/∂u_j)(u^{(0)}) ∆u_j = −φ_i(u^{(0)}), i = 1, …, N.

If J(u) denotes the matrix with (i, j) element (∂φ_i/∂u_j)(u), then (1.35) can be written in the form

J(u^{(0)}) ∆u = −Φ(u^{(0)}),

and

u^{(1)} = u^{(0)} + ∆u
is taken as the new approximation. If the matrices J(u^{(ν)}), ν = 1, 2, …, are nonsingular, one hopes to determine a sequence of successively better approximations u^{(ν)} from the algorithm

J(u^{(ν)}) ∆u^{(ν)} = −Φ(u^{(ν)}), u^{(ν+1)} = u^{(ν)} + ∆u^{(ν)}, ν = 0, 1, 2, … .

This procedure is known as Newton's method for the solution of the system of nonlinear equations (1.34). It can be shown that, as in the scalar case, this procedure converges quadratically if u^{(0)} is chosen sufficiently close to u, the solution of (1.34).
Now consider the system of equations (1.32) arising in Numerov's method (1.31). In this case, the (i, j) element of the Jacobian J is given by

∂φ_i/∂u_j = 2 + (5/6) h² f_u(x_i, u_i), if j = i; −1 + (1/12) h² f_u(x_j, u_j), if |i − j| = 1; 0, if |i − j| > 1.

Thus

J(u) = J + h² B F(u),

where

F(u) = diag(f_u(x_i, u_i)).

In this case, Newton's method becomes

(1.36) [J + h² B F(u^{(ν)})] ∆u^{(ν)} = −[J u^{(ν)} + h² B f(u^{(ν)}) − g],
u^{(ν+1)} = u^{(ν)} + ∆u^{(ν)}, ν = 0, 1, 2, … .
For the system (1.28), Newton's method takes the same form with B replaced by the identity matrix. In practice, the iteration is terminated when a criterion such as

max_{1≤i≤N} |φ_i(u^{(ν+1)})| ≤ FTOL

is satisfied, where FTOL is a prescribed tolerance.
2 Algorithms for Solving Tridiagonal Linear Systems
2.1 General Tridiagonal Systems
The algorithm used to solve a general tridiagonal system of the form
(2.1) Au = b,
where A is the N × N tridiagonal matrix

(2.2) A = [ d_1 e_1 ; c_2 d_2 e_2 ; ⋱ ⋱ ⋱ ; c_{N−1} d_{N−1} e_{N−1} ; c_N d_N ],
is a version of Gaussian elimination with partial pivoting. In the decomposition step of the algorithm, the matrix A is factored in the form

(2.3) A = P L R,

where P is a permutation matrix, L is unit lower bidiagonal and, because of the row interchanges, R is upper triangular with at most two nonzero superdiagonals. The solution step then consists of solving

P L v = b, R u = v,

for v and then the desired solution u. Software, sgtsv, based on this algorithm is available in LAPACK [4].
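In scipy, for example, such systems can be handed to a LAPACK-based banded solver; a sketch (note that scipy's solve_banded calls the general banded routine gbsv, a close relative of gtsv):

    # Sketch: solve a general tridiagonal system with partial pivoting
    # via a LAPACK-backed banded solver.
    import numpy as np
    from scipy.linalg import solve_banded

    def tridiag_solve(c, d, e, b):
        """d: diagonal (length N); c, e: sub/superdiagonal (length N-1)."""
        N = len(d)
        ab = np.zeros((3, N))
        ab[0, 1:] = e
        ab[1, :] = d
        ab[2, :-1] = c
        return solve_banded((1, 1), ab, b)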
2.2 Diagonally Dominant Systems

For the tridiagonal matrices arising in Section 1, there are stable elimination algorithms in which no pivoting is required. In this section, we consider the case in which the elements of the coefficient matrix A satisfy
(2.4) (i) |d_1| > |e_1|, (ii) |d_i| ≥ |c_i| + |e_i|, i = 2, …, N − 1, (iii) |d_N| > |c_N|,

and

(2.5) c_i e_{i−1} ≠ 0, i = 2, …, N;
that is, A is irreducibly diagonally dominant and hence nonsingular. We
show that, without pivoting, such a matrix can be factored in a stable fashion
in the form
(2.6) A = L̂R̂,
where L̂ is lower bidiagonal and R̂ is unit upper bidiagonal:

(2.7) L̂ = [ α_1 ; c_2 α_2 ; ⋱ ⋱ ; c_N α_N ], R̂ = [ 1 γ_1 ; 1 γ_2 ; ⋱ ⋱ ; 1 γ_{N−1} ; 1 ],

where

(2.8) (i) α_1 = d_1, (ii) γ_i = e_i/α_i, α_{i+1} = d_{i+1} − c_{i+1} γ_i, i = 1, …, N − 1.
Since

det A = det L̂ · det R̂ = Π_{i=1}^{N} α_i,

and A is nonsingular, it follows that none of the α_i vanishes; hence the recursion in (2.8) is well–defined.
We now show that conditions (2.4) ensure that the quantities α_i and γ_i are bounded and that the bounds are independent of the order of the matrix A.

Lemma 2.1. If conditions (2.4) hold, then

|γ_i| < 1, i = 1, …, N − 1,

and

(2.9) |d_i| − |c_i| ≤ |α_i| ≤ |d_i| + |c_i|, i = 1, …, N (with c_1 = 0).

Proof — Since γ_1 = e_1/α_1 = e_1/d_1, (2.4(i)) gives |γ_1| < 1. If |γ_{i−1}| < 1, then, from (2.8(ii)),

γ_i = e_i/(d_i − c_i γ_{i−1})

and

|γ_i| ≤ |e_i|/(|d_i| − |c_i||γ_{i−1}|) < |e_i|/(|d_i| − |c_i|) ≤ 1,

using (2.4(ii)). Thus |γ_i| < 1 for i = 1, …, N − 1 by induction. From (2.8(ii)), the bound on the γ_i and (2.4(ii)), it is easy to show that (2.9) holds.
Using the factorization (2.6), the system (2.1) is solved by forward and back substitution:

(2.12)
v_1 = b_1/α_1
For i = 2 to N do:
  v_i = (b_i − c_i v_{i−1})/α_i
end
u_N = v_N
For i = 1 to N − 1 do:
  u_{N−i} = v_{N−i} − γ_{N−i} u_{N−i+1}
end
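The factorization (2.8) and the substitutions (2.12) translate directly into code; a Python sketch (the array conventions are assumptions of the sketch):

    # Sketch: no-pivoting factorization (2.8) and solution (2.12).
    import numpy as np

    def factor_nopivot(c, d, e):
        """d: diagonal; c[i-1]: subdiagonal entry in row i+1; e: superdiagonal.
        Returns alpha and gamma of (2.8)."""
        N = len(d)
        alpha = np.empty(N); gamma = np.empty(N - 1)
        alpha[0] = d[0]
        for i in range(N - 1):
            gamma[i] = e[i] / alpha[i]
            alpha[i + 1] = d[i + 1] - c[i] * gamma[i]
        return alpha, gamma

    def solve_nopivot(c, alpha, gamma, b):
        N = len(alpha)
        v = np.empty(N); u = np.empty(N)
        v[0] = b[0] / alpha[0]
        for i in range(1, N):                 # forward substitution
            v[i] = (b[i] - c[i - 1] * v[i - 1]) / alpha[i]
        u[-1] = v[-1]
        for i in range(N - 2, -1, -1):        # back substitution
            u[i] = v[i] - gamma[i] * u[i + 1]
        return u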
2.3 Positive Definite Systems
In section 1, we saw that if in equation (1.1) p = 0, then in addition to possessing properties (2.4)–(2.5), the matrix A is also symmetric (so that c_i = e_{i−1}, i = 2, …, N) and its diagonal elements, d_i, i = 1, …, N, are positive. In this case, the matrix A is positive definite.
It is well known that when A is positive definite a stable decomposition is obtained using Gaussian elimination without pivoting. The LDL^T decomposition of A can easily be obtained from the recursion (2.8). If L is the unit lower bidiagonal matrix with subdiagonal entries l_1, …, l_{N−1} and D = diag(δ_1, …, δ_N), then A = L D L^T, where

(2.13)
δ_1 = d_1
For i = 1 to N − 1 do:
  l_i = e_i/δ_i
  δ_{i+1} = d_{i+1} − l_i e_i
end
Since A is positive definite, D is also positive definite1 , and hence the diag-
onal elements of D are positive. Thus the recursions (2.13) are well defined.
Using the LDLT decomposition of A, the solution of the system (2.1) is
determined by first solving
Lv = b,
which gives
v1 = b1
vi+1 = bi+1 − li vi , i = 1, . . . , N − 1;
then
DLT u = v
¹ For u = L^T v, u^T D u = v^T L D L^T v = v^T A v > 0 for v ≠ 0, since L is nonsingular.
yields

u_N = v_N/δ_N

and, for i = 1, …, N − 1,

u_{N−i} = v_{N−i}/δ_{N−i} − l_{N−i} u_{N−i+1}
        = (v_{N−i} − l_{N−i} δ_{N−i} u_{N−i+1})/δ_{N−i}
        = (v_{N−i} − e_{N−i} u_{N−i+1})/δ_{N−i},

where in the last step we have used from (2.13) the fact that l_{N−i} δ_{N−i} = e_{N−i}. Thus,
(2.14)
v_1 = b_1
For i = 1 to N − 1 do:
  v_{i+1} = b_{i+1} − l_i v_i
end
u_N = v_N/δ_N
For i = 1 to N − 1 do:
  u_{N−i} = (v_{N−i} − e_{N−i} u_{N−i+1})/δ_{N−i}
end
The code sptsv in LAPACK is based on (2.13)–(2.14).
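A direct transcription of (2.13)–(2.14) (Python sketch, symmetric case with diagonal d and off-diagonal e):

    # Sketch: LDL^T factorization (2.13) and solution (2.14).
    import numpy as np

    def ldlt_factor(d, e):
        N = len(d)
        delta = np.empty(N); l = np.empty(N - 1)
        delta[0] = d[0]
        for i in range(N - 1):                # recursion (2.13)
            l[i] = e[i] / delta[i]
            delta[i + 1] = d[i + 1] - l[i] * e[i]
        return delta, l

    def ldlt_solve(delta, l, e, b):
        N = len(delta)
        v = np.empty(N); u = np.empty(N)
        v[0] = b[0]
        for i in range(N - 1):                # L v = b
            v[i + 1] = b[i + 1] - l[i] * v[i]
        u[-1] = v[-1] / delta[-1]
        for i in range(N - 2, -1, -1):        # D L^T u = v, using l_i delta_i = e_i
            u[i] = (v[i] - e[i] * u[i + 1]) / delta[i]
        return u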
3 The Finite Element Galerkin Method
3.1 Introduction
Consider the two–point boundary value problem

(3.1) Lu ≡ −u″ + p(x)u′ + q(x)u = f(x), x ∈ I, u(0) = u(1) = 0,

where the functions p, q and f are smooth on I, and (1.3) holds. Let H₀¹(I) denote the space of all piecewise continuously differentiable functions on I which vanish at 0 and 1. If v ∈ H₀¹(I), then

∫₀¹ [−u″v + pu′v + quv] dx = ∫₀¹ f v dx.

On integrating by parts, we obtain

∫₀¹ u′v′ dx − [u′v]₀¹ + ∫₀¹ pu′v dx + ∫₀¹ quv dx = ∫₀¹ f v dx,

and, since v(0) = v(1) = 0, the boundary term vanishes. Thus

(3.2) (u′, v′) + (pu′, v) + (qu, v) = (f, v), v ∈ H₀¹(I),

where

(ϕ, ψ) = ∫₀¹ ϕ(x)ψ(x) dx.

Equation (3.2) is called the weak form of the boundary value problem (3.1), and is written in the form

(3.3) a(u, v) = (f, v), v ∈ H₀¹(I),

where

a(φ, ψ) = (φ′, ψ′) + (pφ′, ψ) + (qφ, ψ), φ, ψ ∈ H₀¹(I).

Suppose that S_h is a finite dimensional subspace of H₀¹(I). The finite element Galerkin method consists in finding the element u_h ∈ S_h, the Galerkin approximation to the solution u of (3.1), satisfying

(3.4) a(u_h, v) = (f, v), v ∈ S_h.
Suppose {w_1, …, w_s} is a basis for S_h, and let

(3.5) u_h(x) = Σ_{j=1}^{s} α_j w_j(x).

Then (3.4) holds if and only if a(u_h, w_i) = (f, w_i), i = 1, …, s; that is,

(3.6) A α = f,

where the (i, j) element of the matrix A is a(w_j, w_i), and the vectors α and f are given by

α = (α_1, …, α_s)^T, f = ((f, w_1), …, (f, w_s))^T.

To define the subspaces S_h used in practice, let π denote a partition of I as before, and set

I_j = [x_{j−1}, x_j], j = 1, …, N + 1,
where C^k(I) denotes the space of functions which are k times continuously differentiable on I, 0 ≤ k < r, and v|_{I_j} denotes the restriction of the function v to the interval I_j. We denote by M_r^{k,0}(π) the space

M_r^{k,0}(π) = {v ∈ M_r^k(π) : v(0) = v(1) = 0}.

A convenient basis (3.9) for M_{2m−1}^{m−1}(π) consists of the piecewise Hermite functions S_{i,k}(x; m; π), defined by the interpolation conditions

D^l S_{i,k}(x_j; m; π) = δ_{ij} δ_{lk}, 0 ≤ l ≤ m − 1, 0 ≤ j ≤ N + 1,

where δ_{mn} denotes the Kronecker delta. It is easy to see that S_{i,k}(x; m; π) is zero outside [x_{i−1}, x_{i+1}]. A basis for M_{2m−1}^{m−1,0}(π) is obtained by omitting from (3.9) the functions S_{i,0}(x; m; π), i = 0, N + 1, which are nonzero at x = 0 and x = 1.
For m = 1, the basis functions (3.9) are the familiar piecewise linear "hat" functions. With ℓ_i(x) = (x − x_i)/h_{i+1}, the local coordinate on I_{i+1}, we have, for example,

S_{N+1,0}(x; 1; π) = ℓ_N(x) for x ∈ I_{N+1}, and 0 otherwise.

Thus there is one basis function associated with each node of the partition π. For convenience, we set

(3.10) w_i(x) = S_{i,0}(x; 1; π), i = 0, …, N + 1.

Note that if u_h(x) = Σ_{j=0}^{N+1} α_j w_j(x) then α_j = u_h(x_j), j = 0, …, N + 1, since w_j(x_i) = δ_{ij}. A basis for M_1^{0,0}(π) is obtained by omitting from (3.10) the functions w_0(x) and w_{N+1}(x).
For m = 2, the value functions S_{i,0}(x; 2; π) are piecewise cubics expressed in terms of shape functions g_1 and g_2; for example,

S_{N+1,0}(x; 2; π) = g_1(ℓ_N(x)) for x ∈ I_{N+1}, and 0 otherwise,

and the functions S_{i,1}(x; 2; π), i = 0, …, N + 1, are the piecewise cubic functions

S_{0,1}(x; 2; π) = −h_1 g_2(1 − ℓ_0(x)) for x ∈ I_1, and 0 otherwise,

S_{i,1}(x; 2; π) = h_i g_2(ℓ_{i−1}(x)) for x ∈ I_i, −h_{i+1} g_2(1 − ℓ_i(x)) for x ∈ I_{i+1}, and 0 otherwise, 1 ≤ i ≤ N,

S_{N+1,1}(x; 2; π) = h_{N+1} g_2(ℓ_N(x)) for x ∈ I_{N+1}, and 0 otherwise.
For notational convenience, we write

v_i(x) = S_{i,0}(x; 2; π), s_i(x) = S_{i,1}(x; 2; π), i = 0, …, N + 1.

The functions v_i and s_i are known as the value function and the slope function, respectively, associated with the point x_i ∈ π. Note that if

u_h(x) = Σ_{j=0}^{N+1} {α_j v_j(x) + β_j s_j(x)},

then

α_j = u_h(x_j), β_j = u_h′(x_j), j = 0, …, N + 1,

since D^l S_{i,k}(x_j; 2; π) = δ_{ij} δ_{lk}.

Consider first S_h = M_1^{0,0}(π) with the basis (3.10). The (i, j) element a(w_j, w_i) of the Galerkin matrix vanishes for |i − j| > 1, since w_i(x) = 0, x ∉ (x_{i−1}, x_{i+1}). Thus A is tridiagonal. Since the tridiagonal matrices

(3.12) A = ((w_i′, w_j′)), B = ((w_i, w_j)),
corresponding to p = 0, q = 1, occur quite often in practice, their elements
are given in Appendix A. If the partition π is uniform and
(3.13) J = h^{−2} tridiag(−1, 2, −1),

then it is easy to see from Appendix A that

(3.14) A = h J, B = h(I − (h²/6) J) = (h/6) tridiag(1, 4, 1).
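As an illustration, the Galerkin equations for −u″ + u = f with S_h = M_1^{0,0}(π) on a uniform partition use the matrix A + B; the following Python sketch approximates the load (f, w_i) by h f(x_i), which is an assumption of the sketch:

    # Sketch: piecewise linear Galerkin solution of -u'' + u = f,
    # u(0) = u(1) = 0, using the matrices of (3.14).
    import numpy as np
    from scipy.linalg import solve_banded

    def galerkin_p1(f, N):
        h = 1.0 / (N + 1)
        x = h * np.arange(1, N + 1)
        main = 2.0 / h + 4.0 * h / 6.0        # diagonal of A + B
        off = -1.0 / h + h / 6.0              # off-diagonal of A + B
        ab = np.zeros((3, N))
        ab[0, 1:] = off
        ab[1, :] = main
        ab[2, :-1] = off
        alpha = solve_banded((1, 1), ab, h * f(x))
        return alpha, x                       # alpha_j = u_h(x_j)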
Now consider the case in which S_h = M_3^1(π). It is convenient to consider the basis {w_i}_{i=0}^{2N+3}, where
Consider now the case of the inhomogeneous Dirichlet boundary conditions u(0) = g_0, u(1) = g_1. If we set

w = u − g,

where

g(x) = g_0 v_0(x) + g_1 v_{N+1}(x),

then

w(0) = w(1) = 0,

and, from (3.2), a weak problem for w is obtained. Equivalently, if the Galerkin approximation is expressed in the basis (3.15), then α_0 = g_0 and α_{N+1} = g_1.
With the basis functions ordered as in (3.15), the coefficient matrix of the
Galerkin equations in this case is the matrix A of (3.16) with the elements
of rows 1 and 2N + 3 replaced by
a1j = δ1j , j = 1, . . . , 2N + 4
and
a2N +3,j = δ2N +3,j , j = 1, . . . , 2N + 4,
respectively, and the right hand side vector has as its first and (2N + 3)th
components, g0 and g1 , respectively.
The case in which general linear boundary conditions are imposed is handled in a similar manner.
Let Y denote the difference of two Galerkin solutions, so that Y ∈ S_h. Using (3.18) in (3.19), we obtain

(3.21) a(Y, v) = 0, v ∈ S_h.

Let ψ denote the solution of the adjoint problem

L*ψ = Y(x), x ∈ I, ψ(0) = ψ(1) = 0.

From the hypotheses on p and q, it follows that ψ ∈ H²(I) ∩ H₀¹(I) and

(3.22) ‖ψ‖_{H²} ≤ C ‖Y‖_{L²}.

Then

‖Y‖²_{L²} = (Y, L*ψ) = a(Y, ψ) = a(Y, ψ − χ),
where χ ∈ S_h, and since

(3.23) a(φ, ψ) ≤ C ‖φ‖_{H¹} ‖ψ‖_{H¹},

for φ, ψ ∈ H¹(I), it follows that

‖Y‖²_{L²} ≤ C ‖Y‖_{H¹} ‖ψ − χ‖_{H¹}.
From Theorem 3.1, we can choose χ ∈ S_h such that

‖ψ − χ‖_{H¹} ≤ C h ‖ψ‖_{H²}.

Thus

‖Y‖²_{L²} ≤ C h ‖Y‖_{H¹} ‖ψ‖_{H²} ≤ C h ‖Y‖_{H¹} ‖Y‖_{L²},

where in the last inequality we have used (3.22), and we obtain

(3.24) ‖Y‖_{L²} ≤ C h ‖Y‖_{H¹}.

Since Y ∈ S_h,

a(Y, Y) = 0,

from which it follows that

‖Y′‖²_{L²} = −(pY′, Y) − (qY, Y) ≤ C {‖Y′‖_{L²} + ‖Y‖_{L²}} ‖Y‖_{L²}.

Using this inequality and (3.24), we obtain

‖Y‖²_{H¹} ≤ C h ‖Y‖²_{H¹}.

Thus, for h sufficiently small,

‖Y‖_{H¹} = 0,

and hence from Sobolev's inequality,

Y = 0.
For the self-adjoint boundary value problem

Lu ≡ −(p u′)′ + q(x) u = f(x), x ∈ I,
u(0) = u(1) = 0,

the uniqueness of the Galerkin approximation u_h ∈ S_h satisfying

(p u_h′, v′) + (q u_h, v) = (f, v), v ∈ S_h,
is much easier to prove. In this case, the coefficient matrix A of the Galerkin
equations is positive definite, from which it follows immediately that the
Galerkin approximation is unique. This result is proved in the following
theorem.
Theorem 3.2. The matrix A is positive definite.

Proof — For any β = (β_1, …, β_s)^T ≠ 0, let w = Σ_{i=1}^{s} β_i w_i. Then

β^T A β = Σ_{i,j=1}^{s} a_{ij} β_i β_j = Σ_{i,j=1}^{s} {(p β_i w_i′, β_j w_j′) + (q β_i w_i, β_j w_j)} = (p w′, w′) + (q w, w) > 0,

since p > 0, q ≥ 0 and w′ ≢ 0 for β ≠ 0.
Theorem 3.3. Suppose u ∈ H^{r+1}(I) ∩ H₀¹(I). Then, for h sufficiently small,

(3.25) ‖u − u_h‖_{L²} + h ‖u − u_h‖_{H¹} ≤ C h^{r+1} ‖u‖_{H^{r+1}}.

Proof — Let φ denote the solution of the adjoint problem

L*φ = e(x), x ∈ I, φ(0) = φ(1) = 0,
where e = u − u_h. Then, as in section 3.2 but with φ and e replacing ψ and Y, respectively, we have

(3.26) ‖e‖_{L²} ≤ C h ‖e‖_{H¹},

since

a(e, v) = 0, v ∈ S_h.

Since

a(e, e) = a(e, u − χ), χ ∈ S_h,

it follows that

(3.28) ‖e‖_{H¹} ≤ C h^r ‖u‖_{H^{r+1}}.

The use of this estimate in (3.26) completes the proof.
Let

π : 0 = x_0 < x_1 < ⋯ < x_{N+1} = 1

denote a partition of I and let Π(I) denote the collection of all such partitions π of I. As before, set

h_i = x_i − x_{i−1}, i = 1, …, N + 1,

and h = max_i h_i. A collection of partitions C ⊂ Π(I) is called quasi–uniform if there exists a constant σ ≥ 1 such that, for all π ∈ C,

max_{1≤j≤N+1} h h_j^{−1} ≤ σ.
Lemma 3.1. Let Pu be the L² projection of u into M_r^k(π), that is,

(Pu − u, v) = 0, v ∈ M_r^k(π),

where −1 ≤ k ≤ r − 1. Then, if u ∈ W^{r+1}_∞(I),

‖Pu − u‖_{L^∞} ≤ C h^{r+1} ‖u‖_{W^{r+1}_∞},

for a quasi-uniform collection of partitions.
Let W ∈ S_h be defined by

(W′, v′) = (u′, v′), v ∈ S_h.

Since

a(u − u_h, v) = 0, v ∈ S_h,

it follows that

((W − u_h)′, v′) + (u − u_h, qv − (pv)′) = 0, v ∈ S_h,

and hence that

‖W − u_h‖_{H¹} ≤ C ‖u − u_h‖_{L²},
and from Sobolev's inequality, we obtain

‖W − u_h‖_{L^∞} ≤ C ‖u − u_h‖_{L²}.

Then, from Theorem 3.3, it follows that

(3.30) ‖W − u_h‖_{L^∞} ≤ C h^{r+1} ‖u‖_{H^{r+1}}.

Since

(3.31) ‖u − u_h‖_{L^∞} ≤ ‖u − W‖_{L^∞} + ‖W − u_h‖_{L^∞},

we need to estimate ‖u − W‖_{L^∞} to complete the proof.
Note that since

((u − W)′, 1) = 0,

it follows that W′ is the L² projection of u′ into M_{r−1}^{k−1}(π), and hence, from Lemma 3.1, we obtain

(3.32) ‖(u − W)′‖_{L^∞} ≤ C h^r ‖u′‖_{W^r_∞} ≤ C h^r ‖u‖_{W^{r+1}_∞}.
For arbitrary g, let G be the solution of G″ = −g on I with G(0) = G(1) = 0; then, for χ ∈ S_h,

(u − W, g) = −(u − W, G″) = ((u − W)′, (G − χ)′).

On using Hölder's inequality, we obtain

(u − W, g) ≤ ‖(u − W)′‖_{L^∞} ‖(G − χ)′‖_{L¹}.

From Theorem 3.1, we can choose χ so that

‖(G − χ)′‖_{L¹} ≤ C h ‖G‖_{W₁²}.

Taking the supremum over g with ‖g‖_{L¹} ≤ 1 and using (3.32), we obtain

(3.34) ‖u − W‖_{L^∞} ≤ C h^{r+1} ‖u‖_{W^{r+1}_∞}.

The desired result now follows from (3.30), (3.31) and (3.34).
3.6.3 Superconvergence results
The error estimates of Theorems 3.3 and 3.4 are optimal and consequently
no better global rates of convergence are possible. However, there can be
identifiable points at which the approximate solution converges at rates that
exceed the optimal global rate. In the following theorem, we derive one such
superconvergence result.
Proof — Let G(x, ξ) denote the Green's function for (3.1); that is,

v(x̄) = a(v, G(x̄, ·)), x̄ ∈ I,

for sufficiently smooth v. This representation is valid for v ∈ H₀¹(I) and hence it can be applied to e = u − u_h. Thus, for χ ∈ S_h,

e(x_i) = a(e, G(x_i, ·)) = a(e, G(x_i, ·) − χ),

since

a(e, χ) = 0, χ ∈ S_h.

Thus,

(3.36) |e(x_i)| ≤ C ‖e‖_{H¹} ‖G(x_i, ·) − χ‖_{H¹}.

From the smoothness assumptions on p and q, it follows that G(x_i, ·) is smooth on [0, x_i] and [x_i, 1], and

(3.37) ‖G(x_i, ·)‖_{H^{r+1}([0,x_i])} + ‖G(x_i, ·)‖_{H^{r+1}([x_i,1])} ≤ C.
Since x_i is a mesh point, χ ∈ S_h can be chosen so that

(3.38) ‖G(x_i, ·) − χ‖_{H¹} ≤ C h^r,

for h sufficiently small, and hence combining (3.36)–(3.38) with (3.28), we obtain

|e(x_i)| ≤ C h^{2r} ‖u‖_{H^{r+1}},

as desired.
A method which involves very simple auxiliary computations using the Galerkin solution can be used to produce superconvergent approximations to the derivative [24]. First we consider approximations to u′(0) and u′(1). Motivated by the fact that

u′(0) = (f, 1 − x) − a(u, 1 − x),

we define the approximation Γ_0 = (f, 1 − x) − a(u_h, 1 − x) to u′(0), where u_h is the solution to (3.4). Also, with 1 − x replaced by x in the above, we find that

u′(1) = a(u, x) − (f, x),

and hence we define an approximation Γ_{N+1} to u′(1) by

Γ_{N+1} = a(u_h, x) − (f, x).

If analogous approximations Γ_j are defined with the inner products taken over I_j′ = (0, x_j), then (3.39) holds for j = 1, …, N. The approximation Γ_j is motivated by the fact that

(Lu, x)_{I_j′} = (f, x)_{I_j′},

which, after integration by parts, yields a representation of u′(x_j) similar to those above.
3.6.4 Quadrature Galerkin methods

In most cases, the integrals occurring in (3.6) cannot be evaluated exactly and one must resort to an approximation technique for their evaluation. In this section, the effect of the use of certain quadrature rules in the Galerkin method is discussed. To illustrate the concepts, we consider the problem

−(p u′)′ = f, x ∈ I, u(0) = u(1) = 0,

where

0 < p_0 ≤ p(x), x ∈ I,
and set

ρ_{ij} = x_{i−1} + h_i ρ_j, i = 1, …, N + 1, j = 1, …, ν,

where {ρ_j}_{j=1}^{ν} are the nodes of a quadrature rule on [0, 1]. Suppose the weights satisfy ω_j > 0, j = 1, …, ν, and denote by

⟨α, β⟩_i = h_i Σ_{j=1}^{ν} ω_j (αβ)(ρ_{ij})

the corresponding quadrature approximation to ∫_{I_i} αβ dx.
Lemma 3.2. If the quadrature rule is exact for polynomials of degree t ≥ 2r − 2, then

⟨pV′, V′⟩ ≥ p_0 ‖V′‖²_{L²}, V ∈ S_h.

Proof — Since p ≥ p_0 and the weights ω_j are positive,

⟨pV′, V′⟩ ≥ p_0 ⟨V′, V′⟩ = p_0 ‖V′‖²_{L²},

where the last equality holds because (V′)² is a piecewise polynomial of degree at most 2r − 2 ≤ t, for which the quadrature rule is exact on each I_i.
we obtain from (3.43) with v = w_i the nonlinear system

(3.44) Σ_{j=1}^{s} (w_i′, w_j′) α_j + (f(Σ_{ν=1}^{s} α_ν w_ν), w_i) = 0, i = 1, …, s.

If this system is solved by Newton's method and α^{(n)} denotes the nth iterate, then

(3.45) (A + B_n) α^{(n)} = −F_n + B_n α^{(n−1)},

where

A = ((w_i′, w_j′)), α^{(n)} = (α_1^{(n)}, …, α_s^{(n)})^T,

B_n = ((f_u(Σ_{ν=1}^{s} α_ν^{(n−1)} w_ν) w_i, w_j)),

F_n = ((f(u_h^{(n−1)}), w_i)).
4 The Orthogonal Spline Collocation Method
4.1 Introduction
Consider the linear second order two-point boundary value problem (4.1) subject to the boundary conditions (4.2), and let the approximate solution u_h be written in the form (4.3). Then the coefficients {u_j}_{j=1}^{s} in (4.3) are determined by requiring that u_h satisfy (4.1) at the points {ξ_j}_{j=1}^{s−2}, and the boundary conditions (4.2), where

(4.4) ξ_{(i−1)(r−1)+k} = x_{i−1} + h_i σ_k, i = 1, 2, …, N + 1, k = 1, 2, …, r − 1,

and {σ_k}_{k=1}^{r−1} are the zeros of the Legendre polynomial of degree r − 1 on [0, 1].
Since only four basis functions, the value and slope functions associated with x_{i−1} and x_i, are nonzero on I_i, the coefficient matrix of the collocation equations is almost block diagonal, of the form (4.7).

For r > 3, with the commonly used B-spline basis [8], the coefficient matrix has the form

(4.8) [ D_0 ; W_{11} W_{12} W_{13} ; W_{21} W_{22} W_{23} ; ⋱ ; W_{N+1,1} W_{N+1,2} W_{N+1,3} ; D_1 ],

where the rows D_0 and D_1 arise from the boundary conditions and the blocks W_{i1}, W_{i2}, W_{i3} contain the collocation equations associated with I_i.
To derive error estimates, we consider the linear two-point boundary value problem

(4.9) Lu ≡ −u″ + p(x)u′ + q(x)u = f(x), x ∈ I, u(0) = u(1) = 0.

Then, for v ∈ H²(I) ∩ H₀¹(I), there exists a constant C_0 such that

(4.10) ‖v‖_{H²(I)} ≤ C_0 ‖Lv‖_{L²(I)}.
We define the discrete inner product ⟨·, ·⟩ by

⟨φ, ψ⟩ = Σ_{i=1}^{N+1} ⟨φ, ψ⟩_i,

where

⟨φ, ψ⟩_i = h_i Σ_{k=1}^{r−1} ω_k φ(ξ_{(i−1)(r−1)+k}) ψ(ξ_{(i−1)(r−1)+k}), i = 1, …, N + 1,

and {ω_k}_{k=1}^{r−1} are the corresponding Gauss quadrature weights.
4.3 Optimal H² Error Estimates

We now derive an optimal H² error estimate.
Theorem 4.1. Suppose u ∈ H^{r+1}(I) ∩ H₀¹(I). Then there exists a constant C such that, for h sufficiently small,

(4.13) ‖u − u_h‖_{H²(I)} ≤ C h^{r−1} ‖u^{(r+1)}‖_{L²(I)}.

Proof — Let W denote an interpolant of u in S_h such that

(4.15) Σ_{k=0}^{2} h^k ‖(u − W)^{(k)}‖_{L²(I)} ≤ C h^{r+1} ‖u^{(r+1)}‖_{L²(I)}.

Hence

⟨L(u_h − W), L(u_h − W)⟩ = Σ_{i=1}^{N+1} h_i Σ_{j=1}^{r−1} ω_j [L(u_h − W)(ξ_{(i−1)(r−1)+j})]² ≤ C Σ_{i=1}^{N+1} Σ_{j=1}^{r−1} h_i^{2r−2} ω_j ‖u^{(r+1)}‖²_{L²(I_i)},

and, since the discrete inner product is norm-equivalent to ‖·‖_{L²(I)} on the piecewise polynomials involved, (4.10) yields

(4.16) ‖u_h − W‖_{H²(I)} ≤ C h^{r−1} ‖u^{(r+1)}‖_{L²(I)}.

The required result is now obtained from the triangle inequality, (4.15) and (4.16).
and, if u ∈ W^{2r}_∞(I), the quasi-interpolant û of u satisfies (4.17) and (4.18), where the remainder ε satisfies

|ε(ξ_j)| ≤ C h^{2r−2} ‖u‖_{W^{2r}_∞(I)}.

It follows that

(4.19) ‖u_h − û‖_{W¹_∞(I)} ≤ C h^{2r−2} ‖u‖_{W^{2r}_∞(I)};

that is, the quasi–interpolant û differs from the collocation solution u_h, along with first derivatives, uniformly by O(h^{2r−2}). For r ≥ 3, this represents a higher order in h than is possible for u − u_h. However, using the triangle inequality, (4.17) and (4.18), we see that a superconvergence phenomenon occurs at the nodes, namely

(4.20) |(u − u_h)(x_i)| + |(u − u_h)′(x_i)| ≤ C h^{2r−2} ‖u‖_{W^{2r}_∞(I)}, i = 0, …, N + 1.
4.5 L² and H¹ Error Estimates

From the inequality

∫_{I_i} g(x)² dx ≤ C { h_i (|g(x_{i−1})|² + |g(x_i)|²) + h_i⁴ ∫_{I_i} [g″(x)]² dx },

valid for any g ∈ C²(I_i), the H² estimate (4.13) and the superconvergence result (4.20), it follows that

‖u − u_h‖_{L²} ≤ C { h^{2r−1} ‖u‖_{W^{2r}_∞(I)} + h^{r+1} ‖u‖_{H^{r+1}(I)} }.

A modification of the method, in which f is replaced by a suitable approximation f̂ in the collocation equations

(L u_h)(ξ_j) = f̂(ξ_j), j = 1, …, s − 2,

yields

|(u − u_h)(x_i)| ≤ C h^{2r} ‖f‖_{H^{r−1}(I)}, i = 1, …, N.
Figure 1: Structure of a general ABD matrix
Figure 2: Special ABD structure arising in BVODE solvers
In this example, the boundary blocks TOP and BOT are 2 × 3 and 1 × 3, respectively. The overlap between successive blocks is thus 3.
Figure 4: Structure of the reduced matrix
A = P LB̃U Q,
where P, Q are permutation matrices recording the row and column inter-
changes, respectively, the unit lower and unit upper triangular matrices L,
U contain the multipliers used in the row and column eliminations, respec-
tively, and the matrix B̃ has the structure shown in Figure 4, where · denotes
a zeroed element. Since there is only one row or column interchange at each
step, the pivotal information can be stored in a single vector of the order of the matrix. In the solution phase, the solution x is computed in the three stages

P L z = b, B̃ w = z, U Q x = w.

Figure 5: The coefficient matrix of the reordered system

When the middle system is suitably reordered, the coefficient matrix of the reordered equations has the structure in Figure 5. Thus, the components w_1, w_2, w_5, w_6, w_9, w_{10} are determined by solving a lower triangular system and the remaining components by solving an upper triangular system.
multiplications in the decomposition phase can be deferred to the solution
phase, where only matrix-vector multiplications are required. After the first
sequence of column eliminations, involving a permutation matrix Q1 and
multiplier matrix U1 , say, the resulting matrix is
B_1 = A Q_1 U_1 = [ C_1 O ; M_1 A_1 ].

The next sequence of eliminations, that is, the first sequence of row eliminations, is applied only to A_1 to produce another reducible matrix

L_1 P_1 A_1 = [ R_1 N_1 ; O A_2 ],
and the next sequence of eliminations is applied to A2 , which has the struc-
ture of the original matrix with one W block removed. Since row operations
are not performed on M1 in the second elimination step, and column op-
erations are not performed on N1 in the third, etc., there are savings in
arithmetic operations [11]. The decomposition phase differs from that in al-
ternating row and column elimination in that if the rth elimination step is a
column (row) elimination, it leaves unaltered the first (r −1) rows (columns)
of the matrix. The matrix in the modified procedure has the same structure
and diagonal blocks as B̃, and the permutation and multiplier matrices are
identical.
In [11, 12], this modified alternate row and column elimination procedure
was developed and implemented in the package colrow for systems with
matrices of the form in Figure 2, and in the package arceco for ABD systems
in which the blocks are of varying dimensions, and the first and last blocks
protrude, as shown in Figure 1.
A comprehensive survey of the occurrence of, and solution techniques
for, ABD linear systems is given in [3].
A boundary value problem for a kth-order equation u^{(k)} = f(x, u, u′, …, u^{(k−1)}) can be reduced to a first order system by setting

u_1 = u, u_2 = u′, …, u_k = u^{(k−1)},

so that

u_i′ = u_{i+1}, i = 1, 2, …, k − 1,
u_k′ = f(x, u_1, …, u_k),
subject to rewritten boundary conditions. In many cases, it is desirable to
reduce the higher–order equations in this way and to use a solution approach
that is suitable for first order systems. However, it should be noted that
there are situations where such a reduction is not necessary and approaches
applicable to the original higher–order equation are more appropriate.
This system is usually solved by Newton's method (or a variant of it), which takes the form

(6.4) J(U_{ν−1}) ∆U_ν = −Φ(U_{ν−1}),
U_ν = U_{ν−1} + ∆U_ν, ν = 1, 2, …,

where U_0 is an initial approximation to U and the (block) matrix

J(U) ≡ ∂Φ(U)/∂U = (∂φ_i/∂U_j)
is the Jacobian of the nonlinear system (6.3). In this case, this matrix has the ABD structure

(6.5) [ A ; L_1 R_1 ; L_2 R_2 ; ⋱ ⋱ ; L_{N+1} R_{N+1} ; B ],

where

A = ∂g_0/∂U_0 is m_1 × m,

L_i = −[ I + (h_i/2) (∂f/∂U)(x_{i−1}, U_{i−1}) ] is m × m,

R_i = I − (h_i/2) (∂f/∂U)(x_i, U_i) is m × m,

and

B = ∂g_1/∂U_{N+1} is m_2 × m.
It can be shown that the approximation U determined from the trapezoidal rule satisfies

u(x_i) − U_i = O(h²),

where h = max_j h_j. Under sufficient smoothness conditions on u, higher–order approximations to u on the mesh π can be generated using the method of deferred corrections (cf. Section 1.5.1), which we shall describe briefly; full details are given by Lentini and Pereyra (1977).
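A sketch of the trapezoidal residual and of the Jacobian blocks L_i, R_i of (6.5) (Python; the vectorized right-hand side f and its Jacobian fU are assumptions of the sketch):

    # Sketch: trapezoidal rule residual phi_i and blocks of (6.5).
    import numpy as np

    def trapezoid_residual(f, x, U):
        """x: mesh (length N+2); U: array of shape (N+2, m)."""
        h = np.diff(x)[:, None]
        return U[1:] - U[:-1] - 0.5 * h * (f(x[:-1], U[:-1]) + f(x[1:], U[1:]))

    def trapezoid_blocks(fU, x, U, i):
        """L_i and R_i for the interval [x_{i-1}, x_i]; fU(x, u) is the
        m x m Jacobian of f with respect to u."""
        h = x[i] - x[i - 1]
        m = U.shape[1]
        L = -(np.eye(m) + 0.5 * h * fU(x[i - 1], U[i - 1]))
        R = np.eye(m) - 0.5 * h * fU(x[i], U[i])
        return L, R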
The local truncation error of the trapezoidal rule is defined as
where

T_ν(x_{i−1/2}) = − (ν / (2^{2ν−1} (2ν + 1)!)) (d^{2ν}/dx^{2ν}) f(x, u)|_{x = x_{i−1/2}}.
If S_k(U^{(k−1)}) is a finite difference approximation of order h^{2k+2} to the first k terms in the expansion (6.6) of the local truncation error, then the solution U^{(k)} of the system

Φ(U^{(k)}) = S_k(U^{(k−1)})

satisfies u(x_i) − U_i^{(k)} = O(h^{2k+2}).
The collocation points are

ξ_{(i−1)r+k} = x_{i−1} + h_i ρ_k, i = 1, …, N + 1, k = 1, …, r,

where {ρ_k}_{k=1}^{r} are the r zeros of the Legendre polynomial of degree r on the interval [0, 1] (cf. Section 4.1). The collocation equations then take the form

(6.9) φ_0(U) ≡ g_0(U(0)) = 0, φ_j(U) ≡ U′(ξ_j) − f(ξ_j, U(ξ_j)) = 0, j = 1, …, s − 1, φ_s(U) ≡ g_1(U(1)) = 0.
Again a variant of Newton’s method is usually used to solve the nonlinear
system (6.9). Certain properties of the B–splines ensure that the Jacobian
of (6.9) is almost block diagonal but having a more general structure than
that of the matrix (6.5). These properties are (de Boor, 1978):
• On each subinterval [x_{i−1}, x_i] only r + 1 B–splines are non–zero, namely B_{r(i−2)+1}, …, B_{r(i−1)+1}.
6.4 Multiple Shooting

This procedure involves finding an approximation U_i to u(x_i), i = 0, 1, …, N + 1, in the following way (cf. Stoer and Bulirsch, 1980). Let u_i(x; U_i) denote the solution of the initial value problem

u′ = f(x, u), x_i ≤ x ≤ x_{i+1}, u(x_i) = U_i.

The {U_i} are required to satisfy

(6.11) g_0(U_0) = 0,
(6.12) U_{i+1} − u_i(x_{i+1}; U_i) = 0, i = 0, …, N,
(6.13) g_1(U_{N+1}) = 0,
where equations (6.12) arise from the continuity constraints, and (6.11) and
(6.13) from the boundary conditions. When Newton’s method is used to
solve this system, the Jacobian is again almost block diagonal, and, in this
case, takes the form
[ A ; L_0 I ; L_1 I ; ⋱ ⋱ ; L_N I ; B ],
where

A = ∂g_0/∂U_0 is m_1 × m, L_i = −∂u_i(x_{i+1}; U_i)/∂U_i is m × m, B = ∂g_1/∂U_{N+1} is m_2 × m.
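The multiple-shooting equations (6.11)–(6.13) can be evaluated with any initial value solver; a Python sketch using scipy's solve_ivp (the functions f, g0, g1 and the tolerances are assumptions of the sketch):

    # Sketch: residual of the multiple-shooting system (6.11)-(6.13).
    import numpy as np
    from scipy.integrate import solve_ivp

    def shooting_residual(f, g0, g1, x, U):
        """U has shape (N+2, m): the unknowns U_0, ..., U_{N+1}."""
        res = [g0(U[0])]                                  # (6.11)
        for i in range(len(x) - 1):
            sol = solve_ivp(f, (x[i], x[i + 1]), U[i],
                            rtol=1e-10, atol=1e-12)
            res.append(U[i + 1] - sol.y[:, -1])           # (6.12)
        res.append(g1(U[-1]))                             # (6.13)
        return np.concatenate(res)

A root of this residual, found e.g. with scipy.optimize.fsolve on the flattened unknowns, yields the multiple-shooting solution.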
The finite difference code DVCPR is a version of the code PASVA3 (Pereyra, 1979). The code PASVA3 has a fairly long history and a series of predecessors has been in use for several years.
years. The package COLSYS (Ascher et al., 1981), which implements spline
collocation at Gauss points, and IMSL’s multiple shooting code, DTPTB ,
called BVPMS in the new IMSL Library, are of a more recent vintage. All
three codes are documented elsewhere, and in this section we shall only
briefly mention some of their similarities and differences, and other note-
worthy features of the codes. It should be noted that Bader and Ascher
(1987) have developed a new version of COLSYS called COLNEW which
differs from COLSYS principally in its use of certain monomial basis func-
tions instead of B-splines. This new code, which is reputed to be somewhat
more robust than its predecessor, shares many of its features, and, in the
remainder of this section, any comments referring to COLSYS apply equally
well to COLNEW .
The codes DVCPR and DTPTB require that the boundary value prob-
lem be formulated as a first–order system, and they can handle nonseparated
boundary conditions, i.e. boundary conditions for (1.1), for example, of the
form
g(u(0), u(1)) = 0,
where g is a vector function of order m. On the other hand, COLSYS can
handle a mixed–order system of multipoint boundary value problems with-
out first reducing it to a first–order system, but requires that the boundary
conditions be separated. This restriction is not a serious one as a boundary
value problem with nonseparated boundary conditions can be reformulated
so that only separated boundary conditions occur, as we shall see in Section
10.2. However this reformulation does increase the size of the problem.
The algebraic problems arising in the codes are similar but there is no
uniformity in the manner in which they are solved. Each code uses some
variant of Newton’s method for the solution of the nonlinear systems, and a
Gauss elimination–based algorithm to solve the almost block diagonal linear
systems. Since the codes were written, the package COLROW (Diaz et al.,
1983) has been developed specifically for the solution of such linear systems;
see Section 9.4. Its use in the existing codes has not been investigated, but
may improve their efficiency significantly.
Both DVCPR and COLSYS solve the boundary value problem on a se-
quence of meshes until user–specified error tolerances are satisfied. Detailed
descriptions and theoretical justification of the automatic mesh–selection
procedures used in DVCPR and COLSYS are presented by Lentini and
Pereyra (1974) and Ascher et al. (1979), respectively. It is worth noting
that DVCPR constructs meshes in such a way that each mesh generated
contains the initial mesh, which is not necessarily the case in COLSYS .
In DTPTB , the “shooting points” x1 , . . . , xN , are chosen by the user or,
for linear problems, by the code itself. Any adaptivity in the code appears
in the initial value solver, which in this case is IMSL’s Runge–Kutta code,
DVERK .
In DVCPR, one can specify only an absolute tolerance, ε say, which is imposed on all components of the solution. The code attempts to find an approximate solution U such that the absolute error in each of its components is bounded by ε. On the other hand, COLSYS allows the user to
specify a different tolerance for each component of the solution, and in fact
allows one to impose no tolerance at all on some or all of the components.
This code attempts to obtain an approximate solution U such that the error in the lth component is bounded by ε_l, for each component on which a tolerance ε_l is imposed. This criterion has
a definite advantage in the case of components with different order of mag-
nitude or when components differ in magnitude over the interval of defini-
tion. Moreover, it is often more convenient to provide an array of tolerances
than to rescale the problem. On successful termination, both DVCPR and
COLSYS return estimates of the errors in the components of the approximate solution.
In DTPTB , the same absolute tolerance is imposed on all components
of the solution. In addition, a boundary condition tolerance is specified,
and, on successful termination, the solution returned will also satisfy the
boundary conditions to within this tolerance.
Each code offers the possibility of providing an initial approximation to
the solution and an initial subdivision of the interval of definition (the shoot-
ing points in the case of DTPTB ). With COLSYS , one may also specify the
number of collocation points per subinterval. In many instances, parame-
ter continuation is used to generate initial approximations and subdivisions.
That is, when solving a parameterized family of problems in increasing or-
der of difficulty, the subdivision and initial approximation for a particular
problem are derived from the final mesh and the approximate solution cal-
culated in the previous problem. This is an exceedingly useful technique
which improves the robustness of the packages. It should be noted that
COLSYS requires a continuous initial approximation, which is sometimes
an inconvenience.
It has been the authors’ experience that DTPTB is the least robust of
the three codes. However W. H. Enright and T. F. Fairgrieve (private com-
munication) have made some modifications to this code which have greatly
improved its performance, making it competitive with both DVCPR and
COLSYS . In general, there is little to choose between DVCPR and COL-
SYS , and it is recommended that both be used to solve a given boundary
value problem. Driver programs for these codes are straightforward to write.
Moreover, there are definite advantages to using both codes. For example,
each provides its own insights when solving a problem, simple programming
errors can be detected more quickly by comparing results, and considerable
confidence can be attached to a solution obtained by two different methods.
With the wide availability of these software packages, it is rarely advisable
or even necessary for the practitioner to develop ad hoc computer programs
for solving boundary value problems in ordinary differential equations.
u0 = f (x, u), x ∈ I,
(7.1)
g(u(0), u(1)) = 0,
While COLSYS/COLNEW can handle mixed order systems directly,
it does not accept nonseparated boundary conditions. A simple way to
convert a BVP of the form (7.1) with nonseparated boundary conditions to
one with separated boundary conditions, where each condition involves only
one point, is the following. We introduce the constant function v such that

v(x) = u(1), x ∈ I,

and consider the augmented system

u′ = f(x, u), v′ = 0, x ∈ I,

subject to the separated boundary conditions

g(u(0), v(0)) = 0, u(1) = v(1).
The penalty in this approach is that the transformed problem is twice the
order of the original problem. On the other hand, as we have seen, the ap-
plication of standard techniques to boundary value problems with separated
boundary conditions involves the solution of almost block diagonal linear
systems. The algebraic problem is significantly more involved when these
techniques are applied directly to the boundary value problem (7.1).
With this modification, the codes DVCPR (a predecessor of BVPFD) and DTPTB (a predecessor of BVPMS) have been used successfully to solve problems of the form (7.2)–(7.3); see, for example, [15, 18].
7.4 Continuation
Many problems arising in engineering applications depend on one (or more)
parameters. Consider, for example, the boundary value problem
u0 = f (x, u; λ), x ∈ I,
(7.4)
g(u(0), u(1); λ) = 0,
where λ is a parameter. For the desired value of λ, λ1 say, the problem (7.4)
might be exceedingly difficult to solve without a very good initial approx-
imation to the solution u. If the solution of (7.4) is easily determined for
another value of λ, λ0 say, then a chain of continuation steps along the ho-
motopy path (λ, u( · ; λ)) may be initiated from (λ0 , u( · ; λ0 )). At each step
of the chain, standard software can be used to determine an approximation
to u( · ; λ) which is then used, at the next step, as the initial approximation
to u( · ; λ + ∆λ). This procedure can also be used when the solution of (7.4)
is required for a sequence of values of λ, and the problems are solved in
increasing order of difficulty.
As an example, consider the boundary value problem

(7.5) u″ + (s/x) u′ + λ e^u = 0, x ∈ I,
u′(0) = 0, u(1) = 0,
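A continuation loop for (7.5) is easily written around a general-purpose solver; the sketch below uses scipy's solve_bvp, and the small offset at x = 0 (to sidestep the singular term) and the λ sequence are assumptions of the sketch:

    # Sketch: parameter continuation in lambda for problem (7.5).
    import numpy as np
    from scipy.integrate import solve_bvp

    s = 1.0
    x = np.linspace(1e-6, 1.0, 101)
    y = np.zeros((2, x.size))                    # initial guess u = 0

    for lam in [0.1, 0.5, 1.0, 1.5, 1.7]:        # increasing difficulty
        def rhs(x, y, lam=lam):                  # y = (u, u')
            return np.vstack((y[1], -s * y[1] / x - lam * np.exp(y[0])))
        def bc(ya, yb):
            return np.array([ya[1], yb[0]])      # u'(0) = 0, u(1) = 0
        sol = solve_bvp(rhs, bc, x, y)
        x, y = sol.x, sol.y                      # warm start for next lambda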
λ [21] COLSYS
0.1 0.252(−1) 0.254822(−1)
0.2 0.519(−1) 0.519865(−1)
0.3 0.796(−1) 0.796091(−1)
0.4 0.109 0.108461
0.5 0.138 0.138673
0.6 0.169 0.170397
0.7 0.2035 0.203815
0.8 0.238 0.239148
1.0 0.65 0.316694
1.5 1.1 0.575364
1.7 2.0 0.731577
value problem (7.6) can then be replaced by the boundary value problem
S −0.5 0.0 1.0 25.0
(a) 1.3023 3.0000 6.2603 73.8652
(b) 1.3036 3.0018 6.2778 74.0414
where

F(ϕ, ψ) = 2ϕ′ + (1/2) η ϕ″ + (ϕ′)² − (1/2) ϕ″ (ϕ + ψ),
subject to appropriate boundary conditions. The values in the table above were obtained by

(a) solving the augmented boundary value problem using COLSYS and DVCPR;

(b) the method of [6].
Since COLSYS and DVCPR produce the same results to the number of
figures presented, it seems reasonable to assume that the results given by
(a) are the correct values.
To handle an integral constraint, we introduce w(x) = ∫₀ˣ G(u(s), s) ds and replace the constraint by the differential equation

w′(x) = G(u(x), x), w(0) = 0, w(1) = R.
A similar device applies to a constraint involving an unknown parameter k: we set

w(x) = ∫₀ˣ s F(u(s), s, k) ds.

Then to the original boundary value problem we add

w′ = x F(u, x, k), w(0) = 0, w(1) = 0,
k′ = 0.
This augmented boundary value problem was solved successfully by [7] using
COLSYS .
7.8 Interface Conditions
In problems involving layered media, for example, one has interface condi-
tions at known interior points. Such a problem is
(7.14) L u = 0, x ∈ [0, β],
L u − ϕ_1² u = 0, x ∈ [β, 1],
where

Lu = d²u/dx² + (1/(x + α)) du/dx,

and α and ϕ_1 are known constants, subject to

(7.15) u(0) = 1, (du/dx)(1) = 0, u(β⁻) = u(β⁺), (du/dx)(β⁻) = (du/dx)(β⁺).
To obtain a standard boundary value problem, we map the problem (7.14)–(7.15) to [0, 1], by setting z = x/β to map [0, β] to [0, 1], and then z = (1 − x)/(1 − β) maps [β, 1] to [1, 0]. With

L_0 u = d²u/dz² + (β/(βz + α)) du/dz

and

L_1 u = d²u/dz² + ((β − 1)/((β − 1)z + 1 + α)) du/dz,

we obtain

(7.16) L_0 u_1 = 0, L_1 u_2 − ϕ_1²(β − 1)² u_2 = 0,

and

(7.17) u_1(0) = 1, (du_2/dz)(0) = 0, u_1(1) = u_2(1), (1/β)(du_1/dz)(1) = (1/(β − 1))(du_2/dz)(1).
An additional complication arises when the location of the interface, x = β,
is unknown. Such a problem is discussed in [2], where in addition to (7.14),
(7.15), we have
(7.18) Lv − ϕ_2² u = 0, x ∈ [β, 1],
and the boundary conditions

(7.19) v(β) = 0, (dv/dx)(β) = 0, v(1) = 1,
where ϕ_2 is a given constant. Equation (7.18) is transformed to (7.20) in the same manner, and the boundary conditions (7.19) become

(7.21) v_1(0) = 1, v_1(1) = 0, (dv_1/dz)(1) = 0.
Since β is unknown, to the boundary value problem consisting of (7.16),
(7.17), (7.20) and (7.21), we add the trivial ordinary differential equation
dβ/dz = 0, z ∈ I.
8 Matrix Decomposition Algorithms for Poisson’s
Equation
8.1 Introduction
In this section, we consider the use of finite difference, finite element Galerkin
and orthogonal spline collocation methods on uniform partitions for the so-
lution of Poisson’s equation in the unit square subject to Dirichlet boundary
conditions:

(8.1) −∆u = f(x, y), (x, y) ∈ Ω, u = 0, (x, y) ∈ ∂Ω,

where ∆ denotes the Laplacian and Ω = (0, 1)². Each method gives rise to
a system of linear equations of the form
(8.2) (A ⊗ B + B ⊗ A)u = f ,
where A and B are square matrices of order M, say, and u and f are vectors of order M² given by

(8.3) u = (u_{1,1}, …, u_{1,M}, u_{2,1}, …, u_{M,M})^T,
(8.4) f = (f_{1,1}, …, f_{1,M}, f_{2,1}, …, f_{M,M})^T,
and ⊗ denotes the tensor product; see Appendix D for the definition of ⊗
and its properties. A matrix decomposition algorithm is a fast direct method
for solving systems of the form (8.2) which reduces the problem to one of
solving a set of independent one-dimensional problems. We first develop a
framework for matrix decomposition algorithms. To this end, suppose the
real nonsingular matrix E is given and assume that a real diagonal matrix
Λ and a real nonsingular matrix Z can be determined so that
(8.5) AZ = BZΛ
and
(8.6) Z T EBZ = I,
where I is the identity matrix of order M . Premultiplying (8.5) by Z T E
and using (8.6), we obtain
(8.7) Z T EAZ = Λ.
The system of equations (8.2) can then be written in the form

(Λ ⊗ B + I ⊗ A)(Z^{−1} ⊗ I) u = (Z^T E ⊗ I) f.

From the preceding, we obtain the following algorithm for solving (8.2):

1. Determine the matrices Λ and Z satisfying (8.5) and (8.6).
2. Compute g = (Z^T E ⊗ I)f.
3. Solve (Λ ⊗ B + I ⊗ A)v = g.
4. Compute u = (Z ⊗ I)v.

Step 3 consists of solving the M independent systems

(A + λ_j B)v_j = g_j, j = 1, …, M,

each of which is a discretization of a two-point boundary value problem of the form

(8.10) −v″ + λ v = g(x), x ∈ I, v(0) = v(1) = 0,

where λ is a positive constant. The elements of the matrix Z are sines and/or
cosines and as a consequence multiplication by Z or Z T can be done using
fast Fourier transforms (FFTs). Moreover, the matrix E is sparse, so that
multiplication by E can be done very efficiently. When all of these proper-
ties are exploited, the operation count for any of the matrix decomposition
algorithms discussed in this section is O(N 2 log N ), where (N + 1)h = 1 and
h is the mesh parameter.
8.2 Finite Difference Method
As a simple example, consider the basic five point difference approximation
for Poisson’s equation. To describe this method, suppose h = 1/(N + 1),
where N is a positive integer, and set xm = mh, yn = nh. Denote by um,n
an approximation to u(xm , yn ) defined by the usual second order difference
equations
(8.11) −(u_{m−1,n} − 2u_{m,n} + u_{m+1,n})/h² − (u_{m,n−1} − 2u_{m,n} + u_{m,n+1})/h² = f(x_m, y_n), m, n = 1, …, N,
where u0,n = uN +1,n = um,0 = um,N +1 = 0. If we set M = N and introduce
vectors u and f as in (8.3) and (8.4), respectively, with fm,n = f (xm , yn ),
then the finite difference equations (8.11) may be written in the form
(8.12) (A ⊗ I + I ⊗ A)u = f,

where A = h^{−2} tridiag(−1, 2, −1). In this case, B = I in (8.2), and, with E = I, the conditions (8.5)–(8.6) are satisfied by

(8.13) Λ = diag(λ_1, …, λ_N),

where

(8.14) λ_j = (4/h²) sin²(jπh/2), j = 1, …, N,

and Z is the symmetric orthogonal matrix given by

(8.15) Z = S,

where

(8.16) S = ((2/(N + 1))^{1/2} sin(mnπ/(N + 1)))_{m,n=1}^{N}.
1. Compute g = (S ⊗ I)f .
2. Solve (Λ ⊗ I + I ⊗ A)v = g.
3. Compute u = (S ⊗ I)v.
Note that Steps 1 and 3 can be carried out using FFTs at a cost of O(N 2 log N )
operations. Step 2 consists of N tridiagonal linear systems each of which
can be solved in O(N ) operations so that the total cost of the algorithm is
O(N 2 log N ) operations. Each tridiagonal system corresponds to the stan-
dard finite difference approximation to (8.10).
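The whole algorithm fits in a few lines once a fast sine transform is available; the Python sketch below uses scipy.fft.dst (type I), which equals multiplication by S of (8.16) up to the factor 1/√(2(N+1)):

    # Sketch: fast Poisson solver for the five-point scheme (8.11)-(8.12).
    import numpy as np
    from scipy.fft import dst
    from scipy.linalg import solve_banded

    def fast_poisson(f, N):
        h = 1.0 / (N + 1)
        x = h * np.arange(1, N + 1)
        F = f(x[:, None], x[None, :])            # f(x_m, y_n), shape (N, N)
        scale = 1.0 / np.sqrt(2.0 * (N + 1))
        lam = (4.0 / h**2) * np.sin(np.arange(1, N + 1) * np.pi * h / 2.0)**2
        G = scale * dst(F, type=1, axis=0)       # step 1: g = (S (x) I) f
        V = np.empty_like(G)
        ab = np.zeros((3, N))
        ab[0, 1:] = -1.0 / h**2                  # A = h^{-2} tridiag(-1,2,-1)
        ab[2, :-1] = -1.0 / h**2
        for j in range(N):                       # step 2: N tridiagonal solves
            ab[1, :] = 2.0 / h**2 + lam[j]       # A + lambda_j I
            V[j, :] = solve_banded((1, 1), ab, G[j, :])
        return scale * dst(V, type=1, axis=0)    # step 3: u = (S (x) I) v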
If v is a smooth function which vanishes on ∂Ω, then, on multiplying (8.1) by v and integrating over Ω, we obtain

(8.17) −∫_Ω ∆u(x, y) v(x, y) dx dy = ∫_Ω f(x, y) v(x, y) dx dy.
On applying Green's formula to (8.17) and using the fact that v = 0 on ∂Ω, we obtain

(8.18) ∫_Ω (∂u/∂x · ∂v/∂x + ∂u/∂y · ∂v/∂y) dx dy = ∫_Ω f(x, y) v(x, y) dx dy.

Set

(g_1, g_2) = ∫_Ω g_1 · g_2 dx dy,

for functions g_1 and g_2, with the dot denoting scalar multiplication in the case of vector functions. Then (8.18) can be written in the form

(8.19) (∇u, ∇v) = (f, v).

This is the weak form of (8.1) on which the finite element Galerkin method is based.
Now let {x_k}_{k=0}^{N+1} be a uniform partition of the interval [0, 1], so that x_k = kh, k = 0, …, N + 1, where the stepsize h = 1/(N + 1). Let {w_n}_{n=1}^{N} denote the standard basis for the space of piecewise linear functions defined on this partition which vanish at 0 and 1; see (3.10). Then the C⁰ piecewise bilinear Galerkin approximation

u_h(x, y) = Σ_{m=1}^{N} Σ_{n=1}^{N} u_{m,n} w_m(x) w_n(y)

to the solution u of (8.1) is obtained by requiring that
(8.20) (∇u_h, ∇v) = (f, v)

for all v in the space spanned by {w_m(x) w_n(y)}_{m,n=1}^{N}. The resulting linear system is of the form (8.2), with A and B the matrices of (3.14). If

(8.21) Λ = diag( λ_j [1 − (h²/6) λ_j]^{−1} )_{j=1}^{N}

and

(8.22) Z = S diag{ h[1 − (h²/6) λ_j] }^{−1/2}_{j=1}^{N},

where λ_j and S are given by (8.14) and (8.16), respectively, then (8.5) and (8.6) are satisfied with E = I. Thus, with Λ and Z defined by (8.21) and (8.22), respectively, we have the following matrix decomposition algorithm:
1. Compute g = (Z T ⊗ I)f .
2. Solve (Λ ⊗ B + I ⊗ A)v = g.
3. Compute u = (Z ⊗ I)v.
(8.23) φ_n = v_n, n = 1, …, N, φ_{N+n+1} = s_n, n = 0, …, N + 1.
This is the standard basis but the ordering of the basis functions is nonstandard. The Hermite bicubic orthogonal collocation approximation

u_h(x, y) = Σ_{m=1}^{M} Σ_{n=1}^{M} u_{m,n} φ_m(x) φ_n(y)

is determined by collocating (8.1) at the points (ξ_m, ξ_n), m, n = 1, …, M, where the ξ_m are the Gauss points of (4.4). The resulting equations take the form (8.2) with

(8.25) A = (a_{mn})_{m,n=1}^{M}, a_{mn} = −φ_n″(ξ_m), B = (b_{mn})_{m,n=1}^{M}, b_{mn} = φ_n(ξ_m).
Let

(8.26) Λ = diag(λ_1^−, …, λ_N^−, λ_0, λ_1^+, …, λ_N^+, λ_{N+1}),

where

λ_j^± = 12 h^{−2} (8 + η_j ± μ_j)/(7 − η_j), j = 1, …, N, λ_0 = 36 h^{−2}, λ_{N+1} = 12 h^{−2},

and

η_j = cos(jπ/(N + 1)), μ_j = √(43 + 40 η_j − 2 η_j²).

To describe the matrix Z, let Λ_α^±, Λ_β^± be diagonal matrices defined by

Λ_α^± = diag(α_1^±, …, α_N^±), Λ_β^− = diag(β_1^−, …, β_N^−), Λ_β^+ = diag(1, β_1^+, …, β_N^+, 1/√3),

where

α_j^± = (5 + 4η_j ∓ μ_j) ν_j^±, β_j^± = 18 sin(jπ/(N + 1)) ν_j^±,

and

ν_j^± = [27(1 + η_j)(8 + η_j ∓ μ_j)² + (1 − η_j)(11 + 7η_j ∓ 4μ_j)²]^{−1/2}.

Then

(8.27) Z = 3√3 [ S Λ_α^−  0  S Λ_α^+  0 ; C̃ Λ_β^−  C Λ_β^+ ],

where 0 is the N-dimensional zero column vector, S is given by (8.16), and

C = ((2/(N + 1))^{1/2} cos(mnπ/(N + 1)))_{m,n=0}^{N+1}, C̃ = ((2/(N + 1))^{1/2} cos(mnπ/(N + 1)))_{m=0, n=1}^{N+1, N}.
1. Compute g = (Z T B T ⊗ I)f .
2. Solve (Λ ⊗ B + I ⊗ A)v = g.
3. Compute u = (Z ⊗ I)v.
Since there are at most four nonzero elements in each row of the matrix B^T, the matrix-vector multiplications involving the matrix B^T in step 1 require a total of O(N²) arithmetic operations. From (8.27), it follows that FFT
routines can be used to perform multiplications by the matrix Z T in step
1 and by the matrix Z in step 3, the corresponding cost of each step being
O(N 2 log N ) operations. Step 2 involves the solution of M independent
almost block diagonal linear systems with coefficient matrices of the form
(4.7) arising from the orthogonal spline collocation approximation of (8.10),
which can be solved in a total of O(N 2 ) operations. Thus the total cost of
this algorithm is also O(N 2 log N ) operations.
References
1. S. Agmon, Lectures on Elliptic Boundary–Value Problems, Van Nos-
trand, Princeton, New Jersey, 1965.
2. A. Akyurtlu, J. F. Akyurtlu, C. E. Hamrin Jr. and G. Fairweather, Re-
formulation and the numerical solution of the equations for a catalytic,
porous wall, gas–liquid reactor, Computers Chem. Eng., 10(1986),
361–365.
3. P. Amodio, J. R. Cash, G. Roussos, R. W. Wright, G. Fairweather, I.
Gladwell, G. L. Kraut and M. Paprzycki, Almost block diagonal linear
systems: sequential and parallel solution techniques, and applications,
Numerical Linear Algebra with Applications, to appear.
4. E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz,
A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov and D.
Sorensen, LAPACK Users’ Guide, SIAM Publications, Philadelphia,
1995.
5. U. Ascher and R. D. Russell, Reformulation of boundary value prob-
lems into ‘standard’ form, SIAM Rev., 23(1981), 238–254.
6. A. Aziz and T. Y. Na, Squeezing flow of a viscous fluid between elliptic
plates, J. Comput. Appl. Math., 7(1981), 115–119.
7. D. Bhattacharyya, M. Jevtitch, J. T. Schrodt and G. Fairweather,
Prediction of membrane separation characteristics by pore distribution
measurements and surface force–pore flow model, Chem.Eng. Com-
mun., 42(1986), 111–128.
8. C. de Boor, A Practical Guide to Splines, Applied Math. Sciences 27,
Springer–Verlag, New York, 1978.
9. C. de Boor and R. Weiss, SOLVEBLOK: A package for solving almost
block diagonal linear systems, ACM Trans. Math. Software, 6(1980),
80–87.
10. C. de Boor and R. Weiss, Algorithm 546: SOLVEBLOK, ACM Trans.
Math. Software, 6(1980), 88–91.
11. J. C. Diaz, G. Fairweather and P. Keast, FORTRAN packages for solv-
ing certain almost block diagonal linear systems by modified alternate
row and column elimination, ACM Trans. Math. Software, 9(1983),
358–375.
12. J. C. Diaz, G. Fairweather and P. Keast, Algorithm 603 COLROW and
ARCECO: FORTRAN packages for solving certain almost block diag-
onal linear systems by modified alternate row and column elimination,
ACM Trans. Math. Software, 9(1983), 376–380.
20. T. Y. Na and I. Pop, Free convection flow past a vertical flat plate
embedded in a saturated porous medium, Int.J. Engrg. Sci., 21(1983),
517–526.
23. J. M. Varah, Alternate row and column elimination for solving certain
linear systems, SIAM J. Numer. Anal., 13(1976), 71–75.
24. M. F. Wheeler, A Galerkin procedure for estimating the flux for two–
point boundary value problems, SIAM J. Numer. Anal., 11(1974), 764–
768.
Appendix B. Galerkin Matrices for Piecewise Hermite Cubic
Functions. We consider the case in which the partition π of I is uniform
and let h = x_{i+1} − x_i, i = 0, …, N. The conditioning of the matrices is improved if the slope functions are normalized by dividing by h. Thus we define

s̃_i(x) = h^{−1} s_i(x), i = 0, …, N + 1.

If the Galerkin solution u_h is expressed in the form

u_h(x) = Σ_{i=0}^{N+1} {α_i^{(1)} v_i(x) + α_i^{(2)} s̃_i(x)},

then

α_i^{(1)} = u_h(x_i),

as before, but now

α_i^{(2)} = h u_h′(x_i), i = 0, …, N + 1.
and

B_{00} = (h/210) [ 78 11 ; 11 2 ],

B_{ii} = (h/105) [ 78 0 ; 0 2 ], i = 1, …, N,

B_{N+1,N+1} = (h/210) [ 78 −11 ; −11 2 ],

B_{i,i+1} = B_{i+1,i}^T = (h/420) [ 54 −13 ; 13 −3 ], i = 0, …, N.
If x = ξ_{2i+1},

x − x_i = (h/2)(1 + ρ_1).

Therefore,

v_{i+1}(ξ_{2i+1}) = −(1/4)(1 + ρ_1)³ + (3/4)(1 + ρ_1)² = (1/4)(1 + ρ_1)²(2 − ρ_1),

v_{i+1}′(ξ_{2i+1}) = −(3/(2h))(1 + ρ_1)² + (3/h)(1 + ρ_1) = (3/(2h))(1 − ρ_1²),

v_{i+1}″(ξ_{2i+1}) = −(6/h²)(1 + ρ_1) + 6/h² = −(6/h²) ρ_1,

with similar expressions for v_{i+1}(ξ_{2i+2}), v_{i+1}′(ξ_{2i+2}), v_{i+1}″(ξ_{2i+2}).
With s̃_i = h^{−1} s_i as in Appendix B, on [x_i, x_{i+1}],

s̃_i(x) = −{ ((x_{i+1} − x)/h)³ − ((x_{i+1} − x)/h)² },

s̃_i′(x) = −{ −(3/h³)(x_{i+1} − x)² + (2/h²)(x_{i+1} − x) },

s̃_i″(x) = −{ (6/h³)(x_{i+1} − x) − 2/h² }.

Then

s̃_i(ξ_{2i+1}) = −(1/8)(1 − ρ_1)³ + (1/4)(1 − ρ_1)² = (1/8)(1 − ρ_1)²(1 + ρ_1),

s̃_i′(ξ_{2i+1}) = (3/(4h))(1 − ρ_1)² − (1/h)(1 − ρ_1) = −(1/(4h))(1 − ρ_1)(1 + 3ρ_1),

s̃_i″(ξ_{2i+1}) = −(3/h²)(1 − ρ_1) + 2/h² = −(1/h²)(1 − 3ρ_1),

with similar expressions for s̃_i(ξ_{2i+2}), s̃_i′(ξ_{2i+2}), and s̃_i″(ξ_{2i+2}).
On [x_i, x_{i+1}],

s̃_{i+1}(x) = ((x − x_i)/h)³ − ((x − x_i)/h)²,

s̃_{i+1}′(x) = (3/h³)(x − x_i)² − (2/h²)(x − x_i),

s̃_{i+1}″(x) = (6/h³)(x − x_i) − 2/h².

Then, with x − x_i = (h/2)(1 + ρ_1), we have

s̃_{i+1}(ξ_{2i+1}) = (1/8)(1 + ρ_1)³ − (1/4)(1 + ρ_1)² = −(1/8)(1 + ρ_1)²(1 − ρ_1),

s̃_{i+1}′(ξ_{2i+1}) = (3/(4h))(1 + ρ_1)² − (1/h)(1 + ρ_1) = −(1/(4h))(1 + ρ_1)(1 − 3ρ_1),

s̃_{i+1}″(ξ_{2i+1}) = (1/h²){3(1 + ρ_1) − 2} = (1/h²)(1 + 3ρ_1),

with similar expressions for s̃_{i+1}(ξ_{2i+2}), s̃_{i+1}′(ξ_{2i+2}), s̃_{i+1}″(ξ_{2i+2}).