Applied and Computational Optimal Control
A Control Parametrization Approach
Kok Lay Teo · Bin Li · Changjun Yu · Volker Rehbock
Springer Optimization and Its Applications
Volume 171
Series Editors
Panos M. Pardalos, University of Florida
My T. Thai, University of Florida
Honorary Editor
Ding-Zhu Du, University of Texas at Dallas
Advisory Editors
Roman V. Belavkin, Middlesex University
John R. Birge, University of Chicago
Sergiy Butenko, Texas A&M University
Vipin Kumar, University of Minnesota
Anna Nagurney, University of Massachusetts Amherst
Jun Pei, Hefei University of Technology
Oleg Prokopyev, University of Pittsburgh
Steffen Rebennack, Karlsruhe Institute of Technology
Mauricio Resende, Amazon
Tamás Terlaky, Lehigh University
Van Vu, Yale University
Michael N. Vrahatis, University of Patras
Guoliang Xue, Arizona State University
Yinyu Ye, Stanford University
Aims and Scope
Optimization has continued to expand in all directions at an astonishing rate. New algorithmic and theoretical techniques are continually being developed, and the diffusion of the field into other disciplines is proceeding at a rapid pace, with a spotlight on machine learning, artificial intelligence, and quantum computing. Our knowledge of all aspects of the field has grown even more profound. At the same time, one of the most striking trends in optimization is the constantly increasing emphasis on the interdisciplinary nature of the field. Optimization has been a basic tool in areas not limited to applied mathematics, engineering, medicine, economics, computer science, operations research, and other sciences.
The series Springer Optimization and Its Applications (SOIA) aims to publish state-of-the-art expository works (monographs, contributed volumes, textbooks, handbooks) that focus on theory, methods, and applications of optimization. Topics covered include, but are not limited to, nonlinear optimization, combinatorial optimization, continuous optimization, stochastic optimization, Bayesian optimization, optimal control, discrete optimization, multi-objective optimization, and more. New to the series portfolio are works at the intersection of optimization and machine learning, artificial intelligence, and quantum computing.
Volumes from this series are indexed by Web of Science, zbMATH, Mathematical
Reviews, and SCOPUS.
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
methods and new applications have been obtained and published since 1991. For this reason, we have been motivated to write this new book. To ensure that the book is self-contained, essential fundamental results from the 1991 book are included in this new book. A revised version of basic results on unconstrained and constrained optimization problems, and on optimization problems subject to continuous inequality constraints, is also included.
A tremendous proliferation of results based on control parametrization has appeared in the literature since 1991. To keep the size of the book reasonable, we restricted ourselves to a discussion of those results obtained by the authors and their past and present collaborators and students. This choice ensures that the results presented in this book can be organized to form a unified computational approach for solving various real-world practical optimal control problems. These computational methods are supported by rigorous convergence analysis, and they are easily programmable and adaptable to existing efficient optimization software packages.
We do not claim that this family of computational methods is necessarily superior to others found in the literature. Direct (Runge-Kutta) discretization of optimal control problems and pseudospectral techniques are two examples of methods that have been intensively studied by many researchers. A brief review of these techniques is included in Section 1.3.5.
This book can serve as a reference for researchers and students working in the areas of optimal control theory and its applications, and for professionals using optimal control to solve their problems. It is noted that many scientists, engineers and practitioners may not be thoroughly familiar with optimal control theory. Thus, the optimal control software MISER, which was developed based on the control parametrization technique, can help them to apply optimal control theory as a tool to solve their problems. We wish to emphasize that the aim of this book is to furnish a rigorous and detailed exposition of the concepts of control parametrization and the time scaling transformation, which are used to develop new theory and new computational methods for solving various optimal control problems numerically and in a unified fashion. Based on the knowledge gained from this book, research scientists and engineers can develop new theory and new computational methods to solve other complex problems that are not covered in this book.
The background required to understand the computational methods presented in this book, and their application to solve practical problems, is advanced calculus. However, to analyse the convergence properties of these computational methods, some results in real and functional analysis are also required. For the convenience of the reader, these mathematical concepts and facts are stated without proofs in Appendix A.1. Engineers and applied scientists should be able to follow the proofs of the convergence theorems with the aid of the results presented in Appendix A.1. For global optimization, a filled function method is presented in Appendix A.2. Some basic concepts and results on probability theory are discussed in Appendix A.3.
1 Introduction
1.1 Optimal Control Problems
1.2 Illustrative Examples
1.3 Computational Algorithms
1.3.1 Dynamic Programming and Iterative Dynamic Programming
1.3.2 Leapfrog Algorithm and STC Algorithm
1.3.3 Control Parametrization
1.3.4 Collocation Methods
1.3.5 Full Parametrization
1.4 Optimal Control Software Packages
References
Chapter 1
Introduction
joining two points with prescribed tangents, where a specified bound is imposed on its curvature. An elegantly simple solution was obtained by Dubins in 1957: a selection of at most three arcs is concatenated, each of which is either a circular arc of maximum (prescribed) curvature or a straight line. The Markov-Dubins problem has been reformulated as an optimal control problem in various papers, and the Pontryagin maximum principle has been used to obtain the same results as those obtained by Dubins. In [114], under the same reformulation of the Markov-Dubins problem, the maximum principle is applied to derive Dubins' result again. The new insights are: abnormal control solutions do exist; these solutions can be characterized as a concatenation of at most two circular arcs; they are also solutions of the normal problem; and any feasible path of the types mentioned in Dubins' result satisfies the Pontryagin maximum principle. A numerical method for computing Markov-Dubins paths is also proposed there.
The dynamic programming principle developed by Bellman [18–20] has been used to determine optimal feedback controls for a range of practical optimal control problems. However, its application typically requires the solution of a highly nonlinear partial differential equation known as the Hamilton-Jacobi-Bellman (HJB) equation. Some numerical methods for solving this HJB equation for low dimensional problems are available in the literature (see, for example, [2, 6, 98, 99, 188, 208, 273, 274, 303–309]).
Practical problems, however, are usually too complex to be solved analytically using either the Pontryagin maximum principle or dynamic programming. Thus, many numerical solution techniques have been developed and implemented on computers. These numerical solution techniques, coupled with modern computing power, are able to solve a wide range of highly complex optimal control problems.
There are several survey articles and books in the literature on optimal control computation. See, for example, [48, 69, 148, 210, 215]. For optimal control computational methods based on Euler discretization, see, for example, [36, 49, 50, 85, 113, 115, 178, 267].
In this book, our attention is centred on optimal control problems involving systems of ordinary differential equations, with and without time-delayed arguments, and systems of difference equations. Note that these classes of problems have many practical applications across a wide range of disciplines, such as those mentioned above.
subject to
$$\frac{dk(t)}{dt} = b\,w(t) - c\,k(t), \qquad k(0) = k_0, \qquad k(T) = k_T,$$
and 0 ≤ w(t) ≤ w̄ for all 0 ≤ t < T . Although this is a rather trivial example
of an optimal control problem, it serves well to illustrate the basic class of
optimal control problems we want to consider: an objective functional, g(w),
is to be minimized subject to a dynamical system governing the behavior
of the state variable, k(t), and subject to constraints. We need to find a
control function w(t), t ∈ [0, T ), subject to given bounds, which will optimize
the objective functional. A somewhat more realistic version of the student
problem, which assumes some ambition in the student, may be stated as
follows:
$$\text{Maximize} \quad g(w) = a\,k(T) - \int_0^T \left[\alpha_1 w(t) + \alpha_2 (w(t))^2\right] dt$$
subject to
$$\frac{dk(t)}{dt} = \left(b_1 + b_2 k(t)\right) w(t) - c\,k(t), \qquad k(0) = k_0, \qquad k(T) \ge k_T,$$
and 0 ≤ w(t) ≤ w̄ for all 0 ≤ t < T . This version of the problem assumes
that the student has some interest in maximizing his/her examination mark,
is more averse to high rates of work and can gain knowledge more readily
when he/she already has a high level of knowledge. Both problems stated in
[Figure: schematic of the container transfer path through the points A, B, C, D, E and F, showing the required values of the swing x3, horizontal velocity x4 and vertical velocity x5 on each segment (for example, x3 = 0, x4 = x̄4 and x5 = 0 between C and D).]
generated by the trolley and hoist motors, denoted by T1 (t) and T2 (t), respectively. Finally, M denotes the total mass of the container and attached
equipment, m is the total mass of the crane trolley and the operator’s cab,
φ(t) is the load swing angle and g denotes acceleration due to gravity.
We define
$$v_1(t) = \frac{b_1 T_1(t)}{J_1 + m b_1^2}, \qquad v_2(t) = \frac{b_2 \left(T_2(t) + M b_2 g\right)}{J_2 + M b_2^2},$$
$$\delta_1 = \frac{M b_1^2}{J_1 + m b_1^2} \quad \text{and} \quad \delta_2 = \frac{M b_2^2}{J_2 + M b_2^2}.$$
Assuming that the load swing angle is small in magnitude, that the load
can be regarded as a point and that frictional torques can be neglected, the
dynamics of the container crane are given by Sakawa and Shindo [219]
$$\frac{dx_1(t)}{dt} = x_4(t) \tag{1.2.1}$$
$$\frac{dx_2(t)}{dt} = x_5(t) \tag{1.2.2}$$
$$\frac{dx_3(t)}{dt} = x_6(t) \tag{1.2.3}$$
$$\frac{dx_4(t)}{dt} = v_1(t) - \delta_1 x_3(t) v_2(t) + \delta_1 g x_3(t) \tag{1.2.4}$$
$$\frac{dx_5(t)}{dt} = -\delta_2 x_3(t) v_1(t) + v_2(t) \tag{1.2.5}$$
$$\frac{dx_6(t)}{dt} = -\frac{1}{x_2(t)}\left[v_1(t) - \delta_1 x_3(t) v_2(t) + (1 + \delta_1) g x_3(t) + 2 x_5(t) x_6(t)\right]. \tag{1.2.6}$$
Note that v1 and v2 are the effective control variables in these dynamics, and
they are subject to the following bounds:
previous section for which the optimal control can be determined analytically,
moving the container from B to C is a non-trivial optimal control task that
forms the basis of the problem we present here. Note that the container must
arrive at C with x3 = 0 (no swing), x4 = x̄4 (maximum allowed horizontal
velocity) and x5 = 0 (zero vertical velocity). Finally, the sections from C to
D and E to F again constitute trivial problems, while the section from D to
E has a similar complexity to that from B to C. Consider the section from B
to C over the time horizon [0, T ]. We impose the following initial conditions
and terminal state constraints on the problem:
where w1 and w2 are given weights. This objective functional can be regarded
as a measure of the total amount of swing experienced by the load.
Alternatively, since the speed of the operation is clearly an important issue
[219], one may want to minimize
$$\bar{g}(v) = \int_0^T 1\, dt = T \tag{1.2.14}$$
subject to (1.2.1) and (1.2.12) and subject to the additional swing constraint
$$\int_0^T \left[w_1 \left(x_3(t)\right)^2 + w_2 \left(x_6(t)\right)^2\right] dt \le S_{\max}, \tag{1.2.15}$$
g(u) = P (T ) (1.2.16)
subject to
$$\frac{dX(t)}{dt} = \mu(X(t), S(t), V(t))\, X(t) \tag{1.2.17}$$
$$\frac{dP(t)}{dt} = \pi(X(t), S(t), V(t))\, X(t) - k P(t) \tag{1.2.18}$$
$$\frac{dS(t)}{dt} = -\sigma(X(t), S(t), V(t))\, X(t) + s_F u(t) \tag{1.2.19}$$
$$\frac{dV(t)}{dt} = u(t) \tag{1.2.20}$$
$$X(0) = 10.5 \tag{1.2.21}$$
$$P(0) = 0 \tag{1.2.22}$$
$$S(0) = 0 \tag{1.2.23}$$
$$V(0) = 7 \tag{1.2.24}$$
$$V(T) = 10 \tag{1.2.25}$$
$$0 \le u(t) \le u_{\max}, \quad \forall t \in [0, T). \tag{1.2.26}$$
$$\frac{dx_1}{dt} = x_2$$
$$\frac{dx_2}{dt} = \varphi(x_2)\, u_1 + \zeta_2 u_2 + \rho(x_2),$$
where x1 is the distance along the track, x2 is the speed of the train, u1 is
the fuel setting and u2 models the deceleration applied to the train by the
brakes. The function
$$\varphi(x_2) = \begin{cases} \zeta_1 / x_2, & \text{if } x_2 \ge \zeta_3 + \zeta_4, \\[4pt] \zeta_1/\zeta_3 + \eta_1 \left(x_2 - (\zeta_3 - \zeta_4)\right)^2 + \eta_2 \left(x_2 - (\zeta_3 - \zeta_4)\right)^3, & \text{if } \zeta_3 - \zeta_4 \le x_2 < \zeta_3 + \zeta_4, \\[4pt] \zeta_1/\zeta_3, & \text{if } x_2 < \zeta_3 - \zeta_4, \end{cases}$$
where
$$\eta_1 = \zeta_1 \left[\left(\frac{1}{\zeta_3 + \zeta_4} - \frac{1}{\zeta_3}\right) \frac{3}{4 \zeta_4^2} + \frac{1}{2 \zeta_4 (\zeta_3 + \zeta_4)^2}\right]$$
and
$$\eta_2 = \zeta_1 \left[-\left(\frac{1}{\zeta_3 + \zeta_4} - \frac{1}{\zeta_3}\right) \frac{1}{4 \zeta_4^3} - \frac{1}{4 \zeta_4^2 (\zeta_3 + \zeta_4)^2}\right]$$
represents the tractive effort of the locomotive. A somewhat simpler form of $\varphi$ was used in [95], but the form used here models the actual data (see Figure 1 of [95]) more accurately. The function $\rho$ is the resistive acceleration due to friction, given by $\rho(x_2) = \zeta_5 + \zeta_6 x_2 + \zeta_7 x_2^2$. Here, $\zeta_i$, $i = 1, \ldots, 7$, are constants with given values $\zeta_1 = 1.5$, $\zeta_2 = 1$, $\zeta_3 = 1.4$, $\zeta_4 = 0.1$, $\zeta_5 = -0.015$, $\zeta_6 = -0.00003$ and $\zeta_7 = -0.000006$. Also, $x_1(0) = 0$, $x_2(0) = 0$, $x_1(1500) = 18000$ and $x_2(1500) = 0$, i.e., the train starts from the origin at rest and comes to rest again 18000 m away at $t_f = 1500$. The train is not allowed to move backwards, so we require $x_2(t) \ge 0$ for all $t \in [0, 1500]$.
The control $\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$ is restricted to the discrete set $U = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \end{bmatrix} \right\}$, so that the train is either powered by the engine, coasting or being slowed by the brakes. Note that power and brakes cannot be applied simultaneously.
The objective is to minimize the fuel used on the journey, i.e., to minimize
$$J_0(u) = \int_0^{1500} u_1\, dt.$$
More realistic versions of the problem include multiple fuel settings and speed
limit constraints, see [126].
Following the approach detailed in [94], we discretize the solid particle distribution into m distinct size intervals $[L_i, L_{i+1}]$, $i = 1, \ldots, m$, where size is a measure of the diameter of a particle. A large range of particle sizes can be captured if the $L_i$ are defined by $L_{i+1} = r L_i$, $i = 1, \ldots, m$, where $r = \sqrt[3]{2}$ and $L_1 = 3.7 \times 10^{-6}$ metres. Let $N_i$ denote the number of particles in the i-th size
interval, and let C be the concentration of the solution. The rate of change
of each Ni consists of three terms that reflect the effects of nucleation, crystal
growth and agglomeration. The nucleation of new particles is assumed to oc-
cur only for the first size interval. The dynamics of the number of crystals in
individual size intervals as well as the solute concentration are then described
as follows:
$$\frac{dN_1}{dt} = \underbrace{\frac{2G}{L_1(1+r)}\left[\left(1 - \frac{r^2}{r^2 - 1}\right) N_1 - \frac{r}{r^2 - 1} N_2\right]}_{\text{growth}} + \underbrace{B_u}_{\text{nucleation}} - \underbrace{\beta N_1 \sum_{j=1}^{m} N_j}_{\text{agglomeration}} \tag{1.2.30}$$
$$\frac{dN_i}{dt} = \underbrace{\frac{2G}{L_i(1+r)}\left[\frac{r}{r^2 - 1} N_{i-1} + N_i - \frac{r}{r^2 - 1} N_{i+1}\right]}_{\text{growth}} + \underbrace{\beta\left[N_{i-1} \sum_{j=1}^{i-2} 2^{j-i+1} N_j + \frac{1}{2}\left(N_{i-1}\right)^2 - N_i \sum_{j=1}^{i-1} 2^{j-i} N_j - N_i \sum_{j=i}^{m} N_j\right]}_{\text{agglomeration}},$$
$$i = 2, \ldots, m-1, \tag{1.2.31}$$
$$\frac{dN_m}{dt} = \underbrace{\frac{2G}{L_m(1+r)}\left[\frac{r}{r^2 - 1} N_{m-1} + N_m\right]}_{\text{growth}} + \underbrace{\beta\left[N_{m-1} \sum_{j=1}^{m-2} 2^{j-m+1} N_j + \frac{1}{2}\left(N_{m-1}\right)^2 - N_m \sum_{j=1}^{m-1} 2^{j-m} N_j - \left(N_m\right)^2\right]}_{\text{agglomeration}} \tag{1.2.32}$$
$$\frac{dC}{dt} = \frac{-3 k_v \rho_s}{\varepsilon}\, G \sum_{i=1}^{m} N_i S_i^2 - \frac{k_v \rho_s}{\varepsilon}\, S_1^3 B_u. \tag{1.2.33}$$
of the particle size here, $k_v = 0.5$ is a volume shape factor, $\varepsilon = 0.8$ and $\rho_s = 2420\ \mathrm{kg/m^3}$ is the density of the resulting solid.
Furthermore, letting T denote the temperature of the solution in degrees Kelvin, we can define the solubility as a function of temperature and caustic concentration [194], i.e.,
$$C^*_{\mathrm{Al_2O_3}} = C_{\mathrm{Na_2O}}\, e^{\,6.21 - \frac{2486.7}{T} + \frac{1.0875\, C_{\mathrm{Na_2O}}}{T}},$$
is the main driving force for the three processes of nucleation, growth and agglomeration. The growth is modeled by $G = k_g (\Delta C)^2$, where, assuming $C_{\mathrm{Na_2O}} = 100\ \mathrm{kg/m^3}$ as before, $k_g = 6.2135\, e^{-7600/T}$. The dependence of the
where
$$f_c(\Delta C) = \begin{cases} (\Delta C)^{0.8}, & \text{if } \Delta C > 1, \\ -1.2 (\Delta C)^3 + 2.2 (\Delta C)^2, & \text{if } 0 \le \Delta C \le 1. \end{cases}$$
Furthermore, it has been shown experimentally that nucleation decreases markedly at temperatures above 70 °C and does not occur beyond 80 °C. For temperatures below 70 °C, we take $k_n = 9.8 \times 10^{22}\, e^{-10407.265/T}$, while $k_n = 0$ for temperatures above 80 °C. In between, we use a smooth cubic interpolation for $k_n$, i.e.,
$$k_n(T) = \begin{cases} 9.8 \times 10^{22}\, e^{-10407.265/T}, & \text{if } T \le 343.2\ \mathrm{K}, \\[4pt] (0.002 c_1 + 0.01 c_2)(T - 353.2)^3 + (0.03 c_1 + 0.1 c_2)(T - 353.2)^2, & \text{if } 343.2\ \mathrm{K} < T \le 353.2\ \mathrm{K}, \\[4pt] 0, & \text{if } T > 353.2\ \mathrm{K}, \end{cases}$$
where $M_3 = \sum_{i=1}^{N} N_i(t_f) S_i^3$, $M_4 = \sum_{i=1}^{N} N_i(t_f) S_i^4$ and $M_5 = \sum_{i=1}^{N} N_i(t_f) S_i^5$.
This is equivalent to maximizing the ratio of the mean crystal size over the
variance in the crystal size. Other versions of the problem, where seed crys-
tals are added to the solution throughout the process, can also be readily
formulated [214].
in finding the necessary arc time lengths are discussed. The gradients with respect to the switching time variables are calculated in a manner that avoids the need for costate variables, and this can be a computational advantage in some problems. The derivation of these gradients is given in Section 7.4.1, where the limitations of this approach are also discussed. In [111], the time-optimal switching (TOS) algorithm is developed for solving a class of time optimal switching control problems involving nonlinear systems with a single control input. In this algorithm, the problem is formulated in the space of arc times, where the arc times are the durations of the arcs. A feasible switching control is found using the STC method [117] to move from an initial point to a target point with a given number of switchings. The cost is expressed as the summation of the arc times. Then, by using a constrained optimization technique, a minimum-time switching control solution is obtained. In [186], a numerical scheme is developed for constructing optimal bang-bang controls. Then, the second order sufficient conditions developed in [185] are used to check numerically whether the controls obtained are optimal.
The control parametrization method (see, for example, [36, 63, 69, 89, 143,
145, 148, 151, 153, 154, 160–162, 164, 166, 169–171, 181, 215, 229, 230,
238, 244, 249, 253, 255, 260, 284, 288, 294, 298, 300, 301, 311]) relies on
the discretization of the control variables using a finite set of parameters.
This is most commonly done by partitioning the time horizon of a given
problem into several subintervals such that each control can be approximated by a piecewise constant function that is consistent with the corresponding partition. The approximating piecewise constant function can be defined in terms of a finite set of parameters, known as control parameters. Upon such an approximation, an optimal control problem becomes a finite-dimensional optimal parameter selection problem.
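As an illustration of the idea (this is a minimal sketch, not the book's MISER implementation), the following code parametrizes a scalar control as a piecewise constant function on a uniform partition, integrates a simple linear dynamics of the type used in the student problem of Section 1.2, and hands the resulting finite-dimensional problem to a standard optimizer. All constants are assumed for demonstration.

```python
# A minimal sketch of control parametrization, assuming the simple
# knowledge-accumulation dynamics dk/dt = b*w(t) - c*k(t) of Section 1.2
# with illustrative constants.  The control w is approximated by a
# piecewise constant function with np_ segments, so the optimal control
# problem becomes a finite-dimensional nonlinear program.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

a, b, c, T, wbar, np_ = 1.0, 1.0, 0.1, 10.0, 1.0, 10   # assumed data
alpha1, alpha2, k0 = 0.1, 0.05, 0.0

def simulate(sigma):
    """Integrate the state for control parameters sigma (one per segment)."""
    w = lambda t: sigma[min(int(t / (T / np_)), np_ - 1)]
    return solve_ivp(lambda t, k: b * w(t) - c * k, [0, T], [k0], max_step=0.1)

def objective(sigma):
    # maximize a*k(T) - integral(alpha1*w + alpha2*w^2) => minimize negative;
    # the running cost integrates exactly since w is constant per segment
    kT = simulate(sigma).y[0, -1]
    running = np.sum((alpha1 * sigma + alpha2 * sigma**2) * (T / np_))
    return -(a * kT - running)

res = minimize(objective, x0=np.full(np_, 0.5 * wbar),
               bounds=[(0.0, wbar)] * np_, method="L-BFGS-B")
print("optimal control parameters:", np.round(res.x, 3))
```

Refining the partition (increasing np_) yields a sequence of approximate problems of the kind whose convergence properties are studied in later chapters.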
In the real world, optimal control problems are often subject to constraints on the state and/or control. These constraints can be point constraints and/or continuous inequality constraints. Point constraints are expressed as functions of the states at the end point or at some intermediate interior points of the time horizon, and they can be handled without much difficulty. Continuous inequality constraints, however, are expressed as functions of the states and/or controls over the entire time horizon and hence are much more difficult to handle. Through the control parametrization, a continuous inequality constrained optimal control problem is approximated as a continuous inequality constrained optimal parameter selection problem, which can be viewed as a semi-infinite programming (SIP) problem involving a dynamic system. A popular approach to deal with the continuous inequality constraints on the state and control is known as the constraint transcription (see, for example, [76, 103, 135, 136, 148, 245, 246, 249, 253, 259]). Details will be given in later chapters. Another effective method to handle continuous inequality constraints is the exact penalty function method (see, for example, [134, 300, 301]). It is also discussed in detail in a later chapter.
After the use of the constraint transcription method or the exact penalty
function method, the continuous inequality constrained optimal parameter
selection problem becomes an optimal parameter selection problem subject
to constraints in the form of the objective functional, and these constraints
are called canonical constraints. Each of these optimal parameter selection
problems with canonical constraints can be regarded as a mathematical pro-
gramming problem, and its solution is to be obtained by constrained opti-
mization techniques. The control parametrization technique has been used
in conjunction with the constraint transcription or the exact penalty func-
tion extensively in the literature (see, for example, [104, 134, 138, 145, 162,
164, 168, 171, 180, 181, 214, 215, 236, 244, 245, 249, 254, 294]). In [148], a
survey and recent developments of the technique are presented. The tech-
nique has been proven to be very efficient in solving a wide range of op-
timal control problems. In particular, several computational algorithms to
deal with a variety of different classes of problems together with a sound
theoretical convergence analysis are reported in the literature (see, for ex-
ample, [230, 237, 240, 245, 246, 248, 249, 253, 260, 279–281]). Under some
mild assumptions, convergence of the sequence of approximate optimal costs
obtained from a corresponding sequence of partition refinements of the time
horizon to the optimal cost of the original optimal control problem has been
demonstrated. Furthermore, the solution obtained for each approximate optimal control problem, which is regarded as a constrained optimization problem, is such that the KKT conditions are satisfied. However, there is no proof of the convergence of the approximate optimal controls to the true optimal control. Therefore, the approximate optimal control obtained may not be identical to the true optimal control, but the difference between the approximate optimal cost and the true optimal cost is insignificant. This is sufficient in real-world applications. In the next section, full discretization
schemes based on Runge-Kutta discretization of the optimal control problems
will be briefly discussed. These full discretization schemes can solve some op-
timal control problems such that the controls obtained can be verified to
satisfy the optimality conditions.
Finally, note that the standard control parametrization approach assumes a fixed partition for the piecewise constant (or polynomial) approximation of the control. In many practical problems, it is desirable to allow the knot points of the partition to be variable as well. For this purpose, the Control Parametrization Enhancing Transform (CPET) was introduced in the literature (see, for example, [125, 126, 138, 215, 256]). It is now called the time scaling transformation, to better reflect the actual meaning of the transformation, and it is widely used in the literature, such as in [106–108, 142, 144, 148, 150, 151, 162, 165–171, 311]. The time scaling transformation can be used to convert problems with variable knot points for the control into equivalent problems where the control is defined on a fixed partition once more. The transformed problem can then be readily solved by the standard control parametrization approach. Details of the transformation and many of its applications are described in later chapters.
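To sketch the idea in symbols (the notation here is illustrative; the precise construction appears in later chapters), a new time variable $s$ is introduced on a fixed partition, and the durations of the control subintervals become decision variables:

```latex
% Sketch of the time scaling transformation (illustrative notation).
% Map the fixed knots 0, 1, ..., p of the new time scale s onto variable
% knots 0 = t_0 < t_1 < ... < t_p = T via durations theta_k >= 0:
\frac{dt(s)}{ds} = \theta_k, \quad s \in [k-1, k), \qquad t(0) = 0,
\qquad t_k = \sum_{j=1}^{k} \theta_j .
% The dynamics \dot{x} = f(x, u) become, in the new time scale,
\frac{dx}{ds} = \theta_k \, f\bigl(x(s), u(s)\bigr), \quad s \in [k-1, k),
% so the control knots are fixed in s while the theta_k are optimized
% alongside the control parameters.
```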
In [63], an algorithm is developed for solving constrained optimal control problems. Through the control parametrization, a constrained optimal control problem is approximated by a SIP problem. The algorithm proposed seeks to locate a feasible point at which the KKT conditions are satisfied to a specified tolerance. Based on the right hand restriction method proposed in [195] for standard SIP, the proposed algorithm solves the path constrained optimal control problem iteratively, by approximating it through a restriction of the right hand side of the path constraint to a finite number of time points. Then, the approximate optimization problem with finitely many constraints is solved such that local optimality conditions are satisfied at each iteration. The established algorithm will find a feasible point in a finite number of iterations such that the first order KKT conditions are satisfied to a specified accuracy.
For a direct local collocation method, the state and control are approximated using a specified functional form. The time interval $[t_0, T]$ is partitioned into N subintervals $[t_{i-1}, t_i]$, $i = 1, \ldots, N$, where $t_N = T$. Since the state is required to be continuous across intervals, the following condition is imposed at each interior knot:
$$x\left(t_i^-\right) = x\left(t_i^+\right), \quad i = 1, \ldots, N-1,$$
where $x\left(t_i^-\right) = \lim_{t \uparrow t_i} x(t)$ and $x\left(t_i^+\right) = \lim_{t \downarrow t_i} x(t)$. Two types of discretization schemes are normally used in the development of algorithms for solving
optimal control problems: (i) Runge-Kutta methods and (ii) orthogonal collocation methods. Runge-Kutta discretization schemes are normally in implicit form, because implicit schemes have better stability properties than explicit ones. In [212], an algorithm is developed to solve optimal control problems based on an orthogonal collocation method, where Legendre-Gauss points, chosen as the discretization points, are used together with cubic splines over each subinterval. In [52], Lagrange polynomials are used instead of cubic splines. Note that the application of a direct local collocation method to an optimal control problem will give rise to a nonlinear programming problem of very high dimension, containing thousands to tens of thousands of variables and a similar number of constraints. However, the nonlinear programming problem will tend to be very sparse, with many of the entries of the constraint Jacobian being zero. Thus, it can be solved efficiently using nonlinear programming solvers.
Chapter 2
Unconstrained Optimization Techniques
2.1 Introduction
results from the references listed at the beginning of the paragraph and from
[60, 87, 204, 207, 220–222].
Unlike optimal control problems, mathematical programming problems
are static in nature. The general constrained mathematical programming
problem is to find an x ∈ Rn to minimize the objective function
$$f(x) \tag{2.1.1}$$
subject to
$$h_i(x) = 0, \quad i = 1, \ldots, m, \tag{2.1.2}$$
$$h_i(x) \le 0, \quad i = m+1, \ldots, m+r. \tag{2.1.3}$$
For completeness, we shall first present some important concepts and results
in unconstrained optimization techniques. Some basic theory and algorithms
for constrained optimization will be given in Chapter 3. The unconstrained
optimization problem is to choose an x = [x1 , . . . , xn ] ∈ Rn to minimize
an objective function f (x). It is a special case of the general problem in
Section 2.1, where the feasible region is the entire space Rn .
Definition 2.2.1 The point x∗ ∈ Rn is said to be a global minimum (mini-
mizer) if
f (x∗ ) ≤ f (x), for all x ∈ Rn .
Definition 2.2.2 The point x∗ ∈ Rn is said to be the strict global minimum
(minimizer) if
f (x∗ ) < f (x), for all x ∈ Rn \{x∗ }.
where
$$g(\xi) = \left(\nabla_\xi f(\xi)\right)^\top = \left(\nabla f(\xi)\right)^\top = \left[\frac{\partial f(\xi)}{\partial \xi_1}, \frac{\partial f(\xi)}{\partial \xi_2}, \ldots, \frac{\partial f(\xi)}{\partial \xi_n}\right]^\top,$$
$$\frac{d^2 \beta(0)}{d\alpha^2} = s^\top G(x_0)\, s \ge 0,$$
and the result follows.
Theorem 2.2.3 (Sufficient Condition for Local Minima) Let x0 be a solu-
tion to (2.2.3). If the Hessian, G(x0 ), of the function f evaluated at x0 is
positive definite, then x0 is a strict local minimum.
Proof. Since $G(x_0)$ is positive definite, $s^\top G(x_0) s > 0$ for all $s \ne 0$. For any unit vector $s \in \mathbb{R}^n$ and any sufficiently small $\alpha > 0$, by Taylor's Theorem, we have
$$f(x_0 + \alpha s) - f(x_0) = \frac{1}{2} \alpha^2 s^\top G(x_0)\, s + o(\alpha^2).$$
For small values of α > 0, the first term on the right of the last equation
dominates the second. Thus, it follows that f (x0 + αs) − f (x0 ) > 0 provided
α > 0 is sufficiently small. Since the direction of s is arbitrary, this shows
that x0 is a strict local minimum and the proof is complete.
Let $s^{(k)}$ be a given vector in $\mathbb{R}^n$. Consider the function $f(x)$ along the line $x(\alpha) = x^{(k)} + \alpha s^{(k)}$:
$$\frac{df(x(\alpha))}{d\alpha} = \frac{d}{d\alpha} f\left(x^{(k)} + \alpha s^{(k)}\right) = \nabla f\left(x^{(k)} + \alpha s^{(k)}\right) s^{(k)}.$$
At $\alpha = 0$, the slope is
$$\left.\frac{df(x(\alpha))}{d\alpha}\right|_{\alpha = 0} = \nabla f\left(x^{(k)}\right) s^{(k)} = \left(g^{(k)}\right)^\top s^{(k)}.$$
If
$$\left(g^{(k)}\right)^\top s^{(k)} < 0, \tag{2.3.3}$$
then $s^{(k)}$ is called a descent direction of the objective function at $x^{(k)}$. The objective function value is reduced along this direction for all sufficiently small $\alpha > 0$.
The following is a general structure for the descent algorithm that we will
consider:
Algorithm 2.3.1
Step 1. Choose x(0) and set k = 0.
Step 2. Determine a search direction s(k) .
Step 3. Check for convergence.
Step 4. Find $\alpha_k$ that minimizes $f\left(x^{(k)} + \alpha s^{(k)}\right)$ with respect to $\alpha$.
Step 5. Set x(k+1) = x(k) + αk s(k) , and set k := k + 1. Go to Step 2.
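A minimal sketch of this loop in code, using the steepest descent choice $s^{(k)} = -g^{(k)}$ introduced later in this chapter and a simple backtracking rule in place of the idealized exact line search of Step 4 (all tolerances are illustrative assumptions):

```python
import numpy as np

def descent(f, grad, x0, tol=1e-8, max_iter=1000):
    """Generic descent loop in the spirit of Algorithm 2.3.1:
    direction, convergence check, line search, update."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:        # Step 3: convergence test
            break
        s = -g                             # Step 2: steepest descent direction
        alpha = 1.0                        # Step 4: backtracking line search
        while f(x + alpha * s) > f(x) - 1e-4 * alpha * (g @ g):
            alpha *= 0.5
        x = x + alpha * s                  # Step 5: update iterate
    return x

# Usage on a convex quadratic with minimum at [1, 2]:
f = lambda x: (x[0] - 1)**2 + 5 * (x[1] - 2)**2
grad = lambda x: np.array([2 * (x[0] - 1), 10 * (x[1] - 2)])
print(descent(f, grad, [0.0, 0.0]))
```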
Remark 2.3.1
(i) Different descent methods arise from different ways of generating the
search direction s(k) .
(ii) Step 4 is a one-dimensional optimization problem and is referred to
as a line search. Here, the line search is idealized. In practice, an
exact line search is impossible.
Note that if $\alpha_k$ minimizes $f\left(x^{(k)} + \alpha s^{(k)}\right)$, then
$$\left.\frac{d f\left(x^{(k)} + \alpha s^{(k)}\right)}{d\alpha}\right|_{\alpha = \alpha_k} = 0. \tag{2.3.4}$$
Note that −g (k) is the direction in which the objective function value de-
creases most rapidly at x(k) . By choosing s(k) = −g (k) in Algorithm 2.3.1,
we obtain the steepest descent method. This method is the simplest one among
all gradient-based unconstrained optimization methods. It requires only the
objective function value and its gradient. Moreover, with the steepest de-
scent method, we have global convergence (i.e., the method converges from
any starting point) as we will now demonstrate.
Assume that there exists a point $x^* \in \mathbb{R}^n$ such that $\nabla f(x^*) = 0$. Furthermore, we suppose that $\nabla f(x) \ne 0$ if $x \ne x^*$. Then, the steepest descent method constructs a sequence $\left\{x^{(k)}\right\}$ with
$$f\left(x^{(k+1)}\right) = \min_{0 \le \alpha < \infty} f\left(x^{(k)} - \alpha g^{(k)}\right) \le f\left(x^{(k)}\right), \quad \text{for all } k \ge 0.$$
Since $f(x^*) \le f\left(x^{(k)}\right) \le f\left(x^{(0)}\right)$ for all $k \ge 0$, it is clear that $\left\{f\left(x^{(k)}\right)\right\}$ is a bounded monotone sequence. This implies that $\left\{f\left(x^{(k)}\right)\right\}$ is convergent for any initial point $x^{(0)}$. In other words, $\left\{f\left(x^{(k)}\right)\right\}$ is globally convergent to $f(x^*)$. However, it should be noted that the convergence of the steepest descent method can be very slow. The convergence rate of this method is given in the following theorem:
Theorem 2.4.1 Suppose that $f(x)$ defined on $\mathbb{R}^n$ has continuous second order partial derivatives and has a local minimum at $x^*$. If $\left\{x^{(k)}\right\}$ is a sequence generated by the steepest descent method that converges to $x^*$, then
$$f\left(x^{(k+1)}\right) - f(x^*) \le \left(\frac{r-1}{r+1}\right)^2 \left[f\left(x^{(k)}\right) - f(x^*)\right],$$
where $r$ is the condition number of the Hessian $G^* = \nabla^2 f(x^*)$. Note that the condition number is given by $r = A/a$, where $A$ and $a$ are, respectively, the largest and the smallest eigenvalues of $G^*$.
2.5 Newton's Method
Remark 2.5.1
(i) Newton’s method requires the information on f (k) , g (k) and G(k) ,
i.e., function values and first and second order partial derivatives.
(ii) The basic Newton’s method does not involve a line search. The
choice of δ (k) ensures that the minimum of the quadratic approxi-
mation is achieved.
(iii) Assuming G∗ is positive definite, Newton’s method has good local
convergence if the starting point is sufficiently close to x∗ .
(iv) Choosing δ (k) as the solution of (2.5.2) is only appropriate and
well-defined if the quadratic approximation has a minimum, i.e.,
G(k) is positive definite. This may not be the case if x(k) is remote
from x∗ , where x∗ is a local minimum.
Suppose that x∗ is a point such that g(x∗ ) = 0 and that G(x∗ ) is positive
definite. Then, it follows from Taylor’s theorem that
$$g^{(k)} = g\left(x^{(k)}\right) = g(x^*) + G\left(x^{(k)}\right)\left(x^{(k)} - x^*\right) + O\left(\left\|x^{(k)} - x^*\right\|^2\right) = G^{(k)}\left(x^{(k)} - x^*\right) + O\left(\left\|x^{(k)} - x^*\right\|^2\right), \tag{2.5.5}$$
From Step 3 of Algorithm 2.5.1, the left hand side of (2.5.6) may be replaced by $-\delta^{(k)}$. Since $\delta^{(k)} = x^{(k+1)} - x^{(k)}$, we have
$$-\left(x^{(k+1)} - x^{(k)}\right) = x^{(k)} - x^* + O\left(\left\|x^{(k)} - x^*\right\|^2\right).$$
By the definition of $O\left(\left\|x^{(k)} - x^*\right\|^2\right)$, there exists a constant $c_2$ such that
$$\left\|x^{(k+1)} - x^*\right\| \le c_2 \left\|x^{(k)} - x^*\right\|^2. \tag{2.5.7}$$
Therefore, if $x^{(k)}$ is close to $x^*$, then $\left\{x^{(k)}\right\}$ converges to $x^*$ at a rate of at least second order.
Assume that G(k) has eigenvalues λk1 < λk2 < . . . < λkn . Choose εk such that
εk + λk1 > 0.
Since the eigenvalues $\varepsilon_k + \lambda_i^k$ are then all positive, $\varepsilon_k I + G^{(k)}$ is positive definite. Thus, we can construct a modified Newton's method:
$$x^{(k+1)} = x^{(k)} - \alpha_k \left(\varepsilon_k I + G^{(k)}\right)^{-1} g^{(k)}, \quad k = 0, 1, \ldots \tag{2.6.1}$$
Remark 2.6.1 When $\varepsilon_k$ is sufficiently large (on the order of $10^4$), the term $\varepsilon_k I$ dominates $\varepsilon_k I + G^{(k)}$ and $\left(\varepsilon_k I + G^{(k)}\right)^{-1} \approx \left[\varepsilon_k I\right]^{-1} = \frac{1}{\varepsilon_k} I$. Thus, the search direction is
$$-\left(\varepsilon_k I + G^{(k)}\right)^{-1} g^{(k)} \approx -\frac{1}{\varepsilon_k} g^{(k)},$$
which is the steepest descent direction. When $\varepsilon_k$ is small, $\varepsilon_k I + G^{(k)} \approx G^{(k)}$ and the method is similar to Newton's method.
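A sketch of one iteration of (2.6.1) in code; the shift rule used here (increase $\varepsilon_k$ until the shifted Hessian admits a Cholesky factorization, i.e., is positive definite) is one simple choice, assumed for illustration:

```python
import numpy as np

def modified_newton_step(x, grad, hess, alpha=1.0, eps0=1e-3):
    """One iteration of x+ = x - alpha * (eps*I + G)^{-1} g, cf. (2.6.1).
    eps is grown until eps*I + G is positive definite, which is tested
    by attempting a Cholesky factorization."""
    g, G = grad(x), hess(x)
    eps = eps0
    while True:
        try:
            L = np.linalg.cholesky(eps * np.eye(len(x)) + G)
            break
        except np.linalg.LinAlgError:
            eps *= 10.0          # shift more strongly toward steepest descent
    # Solve (eps*I + G) d = g using the Cholesky factors
    d = np.linalg.solve(L.T, np.linalg.solve(L, g))
    return x - alpha * d
```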
The steepest descent method and, in fact, most gradient-based methods re-
quire a line search. The choice of α∗ that exactly minimizes f (x+αs) is called
the exact line search. The exact line search condition is given by (2.3.4). How-
ever, an exact line search condition is difficult to implement using a computer.
Thus, we must resort to an approximate line search.
If s is a descent direction at x, as described by (2.3.3), then f (x + αs) <
f (x) for all α > 0 sufficiently small. Thus, we could replace finding the exact
minimizer of f (x+αs) by finding any α such that f (x+αs) < f (x). However,
if α is chosen too small, we may not get to the minimum of f . We need at
least linear decrease in function value to guarantee convergence. If α is chosen
too large, then s may no longer be a descent direction.
An approximate minimizer $\bar{\alpha}$ of $f(x + \alpha s)$ must be chosen such that the following conditions are satisfied:
(1) Sufficient function decrease:
$$f(x + \bar{\alpha} s) \le f(x) + \rho\, \bar{\alpha}\, s^\top \left(\nabla f(x)\right)^\top. \tag{2.7.1}$$
h(α) = f (x + αs).
[Figure: graph of $h(\alpha) = f(x + \alpha s)$ together with the sufficient decrease line; the case $\rho = 0$ corresponds to the horizontal line $y = h(0)$.]
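In practice the sufficient decrease test (2.7.1) is often enforced by backtracking: start with a trial step and shrink it until the test holds. A minimal sketch (the curvature-type condition that usually accompanies (2.7.1) is omitted here for brevity):

```python
def armijo_backtracking(f, x, s, grad_x, rho=1e-4, beta=0.5, alpha0=1.0):
    """Return a step length satisfying the sufficient decrease
    condition (2.7.1): f(x + a*s) <= f(x) + rho * a * grad_x @ s.
    Assumes s is a descent direction, so grad_x @ s < 0."""
    alpha, fx, slope = alpha0, f(x), grad_x @ s
    while f(x + alpha * s) > fx + rho * alpha * slope:
        alpha *= beta
    return alpha
```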
For the steepest descent method with approximate line search, we have the
following convergence result, which can be found in any book on optimization.
See, for example, [172].
" #
Theorem 2.7.1 Let f ∈ C 2 and x(k) be a sequence of points generated
by the steepest descent method using an approximate line search. Then, one
of the following conditions is to be satisfied:
(i) g (k) = 0 for some k; or
(ii) g (k) → 0 as k → ∞; or
(iii) f (k) → −∞ (there is no finite minimum).
Theorem 2.8.1 If $G$ is positive definite and the non-zero vectors $d^{(i)}$, $i = 0, 1, 2, \ldots, k$, are G-conjugate, then they are linearly independent.
From the conjugacy, we have $\left(d^{(i)}\right)^\top G\, d^{(j)} = 0$ for all $i \ne j$. Thus, it follows from (2.8.3) that $\alpha_i \left(d^{(i)}\right)^\top G\, d^{(i)} = 0$. Since $G$ is positive definite, we have $\left(d^{(i)}\right)^\top G\, d^{(i)} > 0$, and hence $\alpha_i = 0$. This is true for all $i = 0, 1, \ldots, k$. Thus, $d^{(i)}$, $i = 0, 1, \ldots, k$, are linearly independent.
Theorem 2.8.2 (Principal Theorem of Conjugate Direction Method) Con-
sider a quadratic function given by (2.8.1). For an arbitrary x(0) ∈ Rn , let
x(i) , i = 1, 2, . . . , k, with k ≤ n − 1 be generated by a conjugate direction
method, where the search directions s(0) , . . . , s(n−1) are G-conjugate, and
where, for each $i \le n - 1$, $x^{(i+1)}$ is chosen as
$$x^{(i+1)} = x^{(i)} + \alpha_i^* s^{(i)}, \tag{2.8.4}$$
where
$$\alpha_i^* = \arg\min_{\alpha \ge 0} f\left(x^{(i)} + \alpha s^{(i)}\right). \tag{2.8.5}$$
$$\left(g^{(i+1)}\right)^\top s^{(j)} = \left(g^{(j+1)}\right)^\top s^{(j)} + \sum_{k=j+1}^{i} \left(g^{(k+1)} - g^{(k)}\right)^\top s^{(j)} = \left(g^{(j+1)}\right)^\top s^{(j)} + \sum_{k=j+1}^{i} \alpha_k \left(s^{(k)}\right)^\top G\, s^{(j)} = 0, \tag{2.8.9}$$
where the last equality is based on the exact line search condition (2.3.4) and the G-conjugacy of $\left\{s^{(0)}, \ldots, s^{(i)}\right\}$. Now, consider the case when $j = i$. Again, by the exact line search condition (2.3.4), we have
$$\left(g^{(i+1)}\right)^\top s^{(i)} = 0. \tag{2.8.10}$$
Since (2.8.9) and (2.8.10) hold for all $i \le n - 1$, and since Theorem 2.8.1 ensures that $\left\{s^{(0)}, s^{(1)}, \ldots, s^{(n-1)}\right\}$ is linearly independent, it follows from those two equations that
$$g^{(n)} = 0.$$
Hence $x^{(n)}$ is the minimizer of $f(x)$ given by (2.8.1), and consequently the search must terminate in at most $n$ exact line searches.
We now move on to prove the second part of the theorem. We need to
show that for each i ≤ n, x(i) is the minimizer of f (x) on the set Ui defined
by (2.8.6). We shall use an induction argument. The statement is true for
i = 1, since the exact line search (2.8.4)–(2.8.5) is clearly over U1 in this case.
Suppose the statement is true for some j < n, i.e., x(j) is optimal on Uj .
Then we want to show that
Comparing with (2.8.11), we have $\alpha = \alpha_j^*$, which then shows that $x^{(j+1)} = x^{(j)} + \alpha_j^* s^{(j)}$ minimizes $f$ on $U_{j+1}$, as required. This proves the inductive step and the proof is complete.
where $\beta_{k-1}$ is chosen such that $s^{(k)}$ is G-conjugate to $s^{(k-1)}$. More precisely, from (2.8.17) and (2.8.18), we have
$$\left(s^{(k)}\right)^\top G\, s^{(k-1)} = -\left(g^{(k)}\right)^\top G\, s^{(k-1)} + \beta_{k-1} \left(s^{(k-1)}\right)^\top G\, s^{(k-1)}. \tag{2.8.19}$$
Choosing
$$\beta_{k-1} = \frac{\left(g^{(k)}\right)^\top G\, s^{(k-1)}}{\left(s^{(k-1)}\right)^\top G\, s^{(k-1)}}, \tag{2.8.20}$$
it is clear from (2.8.19) that
$$\left(s^{(k)}\right)^\top G\, s^{(k-1)} = 0. \tag{2.8.21}$$
$$\left(g^{(k+1)}\right)^\top g^{(k)} = \left(g^{(k)}\right)^\top g^{(k)} - \frac{\left(g^{(k)}\right)^\top g^{(k)}}{\left(g^{(k)}\right)^\top G\, s^{(k)}} \left(g^{(k)}\right)^\top G\, s^{(k)} = 0. \tag{2.8.30}$$
Therefore, by (2.8.34) and (2.8.30), we conclude that $g^{(0)}, g^{(1)}, \ldots, g^{(k)}$ are mutually orthogonal.
It remains to show that $s^{(i)}$, $i = 0, 1, \ldots, k$, are mutually G-conjugate. We recall that $s^{(k)}$ is G-conjugate to $s^{(k-1)}$. By the induction hypothesis, $s^{(k-1)}$ is G-conjugate to all $s^{(i)}$, $0 \le i \le k - 2$. Thus, for $0 \le i \le k - 2$, it is clear from (2.8.17) and (2.8.18) that
$$\left(s^{(k)}\right)^\top G\, s^{(i)} = \left(-g^{(k)} + \beta_{k-1} s^{(k-1)}\right)^\top G\, s^{(i)} = -\left(g^{(k)}\right)^\top G\, s^{(i)} + \beta_{k-1} \left(s^{(k-1)}\right)^\top G\, s^{(i)} = -\left(g^{(k)}\right)^\top G\, s^{(i)}. \tag{2.8.35}$$
Moreover,
$$G\, s^{(i)} = \frac{g^{(i+1)} - g^{(i)}}{\alpha_i}. \tag{2.8.36}$$
From (2.8.24), we see that αi is never 0 unless g (i) = 0. If g (i) = 0, then x(i)
is the minimizer of the function f (x) and the method terminates.
Substituting (2.8.36) into (2.8.35), and then noting that $g^{(k)}$ is orthogonal to all $g^{(j)}$, $j = 0, 1, \ldots, k-1$ (and hence orthogonal to $g^{(i)}$ and $g^{(i+1)}$ for $i = 0, 1, \ldots, k-2$), we obtain
$$\left(s^{(k)}\right)^\top G\, s^{(i)} = -\left(g^{(k)}\right)^\top G\, s^{(i)} = -\frac{\left(g^{(k)}\right)^\top \left(g^{(i+1)} - g^{(i)}\right)}{\alpha_i} = 0 \tag{2.8.37}$$
for 0 ≤ i ≤ k − 2. This concludes the induction step and shows that s(0) ,
s(1) , . . . , s(k) are mutually G-conjugate. This completes the proof.
Step 7. Set
$$x^{(i+1)} = x^{(i)} + \alpha_i s^{(i)}, \tag{2.8.39}$$
where
$$\alpha_i = -\frac{\left(s^{(i)}\right)^\top g^{(i)}}{\left(s^{(i)}\right)^\top G^{(i)} s^{(i)}}.$$
For general functions, the two methods behave differently. The PRP formula given by (2.8.42) will yield $\beta_k \approx 0$ when $g^{(k+1)} \approx g^{(k)}$, and hence $s^{(k+1)} \approx -g^{(k+1)}$. This means that the algorithm has a tendency to restart automatically. Thus, it can overcome the deficiency of moving forward slowly. It is generally accepted that the PRP formula is more robust and efficient than the FR formula. Unfortunately, Theorem 2.8.5 is not valid for the PRP method (see [204]). However, we have the following two theorems. For their proofs, see [232].
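A sketch of the PRP recursion with the restart behaviour just described. Here $\beta_k$ is the standard Polak-Ribiere-Polyak formula $\beta_k = \left(g^{(k+1)}\right)^\top \left(g^{(k+1)} - g^{(k)}\right) / \left(g^{(k)}\right)^\top g^{(k)}$ (clipped at zero, a common practical variant), and a simple backtracking line search stands in for the exact one assumed by the theory:

```python
import numpy as np

def prp_cg(f, grad, x0, tol=1e-8, max_iter=500):
    """Nonlinear conjugate gradient with the Polak-Ribiere-Polyak beta.
    When g(k+1) is close to g(k), beta is near 0 and s resets toward the
    steepest descent direction, the automatic-restart effect noted above."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    s = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        if g @ s >= 0:                     # safeguard: restart if not descent
            s = -g
        alpha = 1.0                        # backtracking line search
        while f(x + alpha * s) > f(x) + 1e-4 * alpha * (g @ s):
            alpha *= 0.5
        x_new = x + alpha * s
        g_new = grad(x_new)
        beta = max(g_new @ (g_new - g) / (g @ g), 0.0)   # PRP, clipped at 0
        s = -g_new + beta * s
        x, g = x_new, g_new
    return x
```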
Theorem 2.8.5 Let the objective function $f: \mathbb{R}^n \to \mathbb{R}$ be three times continuously differentiable. Suppose that there exist constants $K_1 > K_2 > 0$ such that
$$K_2 \|y\|^2 \le y^\top \nabla^2 f(x)\, y \le K_1 \|y\|^2, \quad \text{for all } y \in \mathbb{R}^n,\ x \in L\left(x^{(0)}\right), \tag{2.8.43}$$
where $L\left(x^{(0)}\right)$ is the bounded level set defined by (2.8.40). Let the sequence $\left\{x^{(k)}\right\}$ be generated by either the Polak-Ribiere-Polyak conjugate gradient or the Fletcher-Reeves conjugate gradient restart method with the exact line search. Then, there exists a constant $c > 0$ such that
$$\limsup_{k_{(r)} \to \infty} \frac{\left\|x^{(k_{(r)} + n)} - x^*\right\|}{\left\|x^{(k_{(r)})} - x^*\right\|^2} \le c < \infty. \tag{2.8.44}$$

Suppose that there exists a constant $K > 0$ such that
$$K \|y\|^2 \le y^\top \nabla^2 f(x)\, y, \quad \forall y \in \mathbb{R}^n, \tag{2.8.45}$$
for all $x \in L\left(x^{(0)}\right)$. Then, the sequence $\left\{x^{(k)}\right\}$ generated by the PRP method with the exact line search converges to the unique minimizer $x^*$ of $f$.
$$s^{(0)} = -g^{(0)}.$$
This means that we take the steepest descent direction at the first iteration.
Remark 2.9.1 Newton's method has fast local convergence properties. However, it requires information on second derivatives, and $G^{(k)}$ may be indefinite. The convergence properties of quasi-Newton methods are much better than those of the steepest descent method, but not as good as those of Newton's method. They require only information on first order partial derivatives. Also, for each $k = 0, 1, \ldots$, the matrix $H^{(k)}$ is always positive definite, and hence the corresponding $s^{(k)}$ is a descent direction. Note that some quasi-Newton methods do not ensure that $H^{(k)}$ is positive definite. Those that do are also called variable metric methods.
The key idea in quasi-Newton methods is to approximate the inverse (G(k) )−1
of the Hessian G(k) by H (k+1) at step k. This approximate matrix should
be chosen to be positive definite so that the search direction generated is a
descent direction. Let
δ (k) = x(k+1) − x(k) (2.9.1)
and
γ (k) = g (k+1) − g (k) . (2.9.2)
Taylor's series expansion gives
$$g^{(k+1)} = g^{(k)} + G^{(k)}\left(x^{(k+1)} - x^{(k)}\right) + o\left(\left\|\delta^{(k)}\right\|\right), \tag{2.9.3}$$
or, equivalently,
$$\gamma^{(k)} = G^{(k)} \delta^{(k)} + o\left(\left\|\delta^{(k)}\right\|\right), \tag{2.9.4}$$
where
$$\lim_{\|\xi\| \to 0} \frac{o(\xi)}{\|\xi\|} = 0.$$
Note that $\delta^{(k)}$, and hence $\gamma^{(k)}$, are calculated after the line search, while $H^{(k)}$ is used to calculate the direction of the search. Hence, it cannot usually be expected that
$$H^{(k)} \gamma^{(k)} = \delta^{(k)}.$$
Thus, we choose $H^{(k+1)}$ such that
$$H^{(k+1)} \gamma^{(k)} = \delta^{(k)}. \tag{2.9.6}$$
Define
$$H^{(k+1)} = H^{(k)} + a\, u u^\top + b\, v v^\top. \tag{2.9.7}$$
Assume that the quasi-Newton condition (2.9.6) is satisfied. Then, by multiplying both sides of (2.9.7) by $\gamma^{(k)}$, we obtain
$$\delta^{(k)} = H^{(k)} \gamma^{(k)} + a\, u\left(u^\top \gamma^{(k)}\right) + b\, v\left(v^\top \gamma^{(k)}\right). \tag{2.9.8}$$
Let $u = \delta^{(k)}$ and $v = H^{(k)} \gamma^{(k)}$. Then, it follows from (2.9.9) that
$$a = \frac{1}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}} \quad \text{and} \quad b = -\frac{1}{\left(\gamma^{(k)}\right)^\top H^{(k)} \gamma^{(k)}}. \tag{2.9.10}$$
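Substituting these choices of $u$, $v$, $a$ and $b$ back into (2.9.7) gives the update in its usual explicit form (stated here for convenience, as it follows directly from (2.9.7) and (2.9.10)):
$$H^{(k+1)} = H^{(k)} + \frac{\delta^{(k)} \left(\delta^{(k)}\right)^\top}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}} - \frac{H^{(k)} \gamma^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)}}{\left(\gamma^{(k)}\right)^\top H^{(k)} \gamma^{(k)}},$$
which is the DFP update formula.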
Let $a = \left(H^{(k)}\right)^{1/2} x$ and $b = \left(H^{(k)}\right)^{1/2} \gamma^{(k)}$, where $H^{1/2}$ is defined such that $H^{1/2} H^{1/2} = H$. Then, we have
$$x^\top H^{(k+1)} x = a^\top a - \frac{\left(a^\top b\right)^2}{b^\top b} + \frac{\left(x^\top \delta^{(k)}\right)^2}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}} = \frac{\left(a^\top a\right)\left(b^\top b\right) - \left(a^\top b\right)^2}{b^\top b} + \frac{\left(x^\top \delta^{(k)}\right)^2}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}}. \tag{2.9.14}$$
Also, since $x^{(k+1)}$ is the minimum point of $f(x)$ along the direction $\delta^{(k)}$, it follows from (2.3.4) that
$$\left(\delta^{(k)}\right)^\top g^{(k+1)} = 0. \tag{2.9.15}$$
Thus,
$$\left(\delta^{(k)}\right)^\top \gamma^{(k)} = \left(\delta^{(k)}\right)^\top g^{(k+1)} - \left(\delta^{(k)}\right)^\top g^{(k)} = -\left(\delta^{(k)}\right)^\top g^{(k)}. \tag{2.9.16}$$
Now, by the definition of $\delta^{(k)}$ given by (2.9.1) and noting that $x^{(k+1)} = x^{(k)} - \alpha_k H^{(k)} g^{(k)}$, we obtain
$$\left(\delta^{(k)}\right)^\top \gamma^{(k)} = -\left(-\alpha_k H^{(k)} g^{(k)}\right)^\top g^{(k)} = \alpha_k \left(g^{(k)}\right)^\top H^{(k)} g^{(k)} > 0. \tag{2.9.17}$$
Substituting (2.9.17) into the denominator of the second term on the right hand side of (2.9.14), we obtain
$$x^\top H^{(k+1)} x = \frac{\left(a^\top a\right)\left(b^\top b\right) - \left(a^\top b\right)^2}{b^\top b} + \frac{\left(x^\top \delta^{(k)}\right)^2}{\alpha_k \left(g^{(k)}\right)^\top H^{(k)} g^{(k)}}. \tag{2.9.18}$$
By the Cauchy-Schwarz inequality, the first term is non-negative, and hence
$$x^\top H^{(k+1)} x > 0.$$
(ii) Inexact line search. Conditions on the slope can easily be used to ensure $\left(\delta^{(k)}\right)^\top \gamma^{(k)} > 0$. Recall that
$$\left|\left(s^{(k)}\right)^\top g^{(k+1)}\right| < -\sigma \left(s^{(k)}\right)^\top g^{(k)}, \quad 0 < \sigma < 1.$$
Therefore,
$$\left(\delta^{(k)}\right)^\top \gamma^{(k)} = \alpha^{(k)} \left(s^{(k)}\right)^\top \left(g^{(k+1)} - g^{(k)}\right) = \alpha^{(k)} \left(g^{(k+1)} - g^{(k)}\right)^\top s^{(k)} > \alpha^{(k)} \left(\sigma g^{(k)} - g^{(k)}\right)^\top s^{(k)} = -(1 - \sigma)\, \alpha^{(k)} \left(g^{(k)}\right)^\top s^{(k)} > 0.$$
Theorem 2.9.2 Suppose that the DFP method is used to minimize the quadratic objective function
$$f(x) = \frac{1}{2} x^\top G x - c^\top x, \tag{2.9.21}$$
where $G \in \mathbb{R}^{n \times n}$ is positive definite. Then, for $i = 0, 1, \ldots, m$, it holds that
Proof. We shall show the validity of (2.9.22) and (2.9.23) by induction. For
i = 0, it is clear from the quasi-Newton condition that
Thus, (2.9.22) with i = 0 is valid. Now we suppose that (2.9.22) and (2.9.23)
are true for i. Then, it is required to show that they are also valid for i + 1.
Note that
g (i+1) = 0. (2.9.26)
Then, by the exact line search condition (2.3.4) and the induction hypothesis, it follows from (2.9.5) that, for $j \le i$,
$$\left(g^{(i+1)}\right)^\top \delta^{(j)} = \left(g^{(j+1)}\right)^\top \delta^{(j)} + \sum_{k=j+1}^{i} \left(g^{(k+1)} - g^{(k)}\right)^\top \delta^{(j)} = \left(g^{(j+1)}\right)^\top \delta^{(j)} + \sum_{k=j+1}^{i} \left(\gamma^{(k)}\right)^\top \delta^{(j)} = 0 + \sum_{k=j+1}^{i} \left(\delta^{(k)}\right)^\top G\, \delta^{(j)} = 0. \tag{2.9.27}$$
The next theorem contains the results showing the convergence of the DFP
method for a general objective function f : Rn → R. Its proof can be found
in [232].
Theorem 2.9.3 Consider the objective function $f: \mathbb{R}^n \to \mathbb{R}$. Suppose that the following conditions are satisfied:
(a) $f$ is twice continuously differentiable on an open convex set $D \subset \mathbb{R}^n$.
(b) There is a strict local minimizer $x^* \in D$ such that $\nabla^2 f(x^*)$ is symmetric and positive definite.
(c) There is a neighbourhood $N_\varepsilon(x^*)$ of $x^*$ such that
$$\left\|\nabla^2 f(\tilde{x}) - \nabla^2 f(x)\right\| \le K \left\|\tilde{x} - x\right\|, \quad \forall x, \tilde{x} \in N_\varepsilon(x^*),$$
We shall present another matrix update formula due to Broyden [32], Fletcher
[60], Goldfarb [77] and Shanno [227]. This update formula is known as the
BFGS formula. The update matrix H (k+1) is
$$H^{(k+1)} = H^{(k)} - \frac{\delta^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)} + H^{(k)} \gamma^{(k)} \left(\delta^{(k)}\right)^\top}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}} + \left[1 + \frac{\left(\gamma^{(k)}\right)^\top H^{(k)} \gamma^{(k)}}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}}\right] \frac{\delta^{(k)} \left(\delta^{(k)}\right)^\top}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}}. \tag{2.9.36}$$
The BFGS method has all the properties that the DFP method has. Furthermore, the BFGS method tends to do better than the DFP method for low accuracy line searches. In addition, if the conditions
$$f\left(x^{(k)}\right) - f\left(x^{(k+1)}\right) \ge -\rho \left(g^{(k)}\right)^\top \delta^{(k)} \tag{2.9.40}$$
and
$$\left(g^{(k+1)}\right)^\top s^{(k)} \ge \sigma \left(g^{(k)}\right)^\top s^{(k)}, \quad 0 < \rho \le \sigma, \tag{2.9.41}$$
hold in an inexact line search, then the BFGS method is globally convergent.
To derive the BFGS formula (2.9.36), we need some preliminary results.
The first of these is the well-known result on the inversion of block matrices
[109].
Lemma 2.9.1 Let $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{m \times m}$, $C \in \mathbb{R}^{m \times n}$, $D \in \mathbb{R}^{n \times m}$, and let $\Delta = B - C A^{-1} D$. Suppose that $A^{-1}$ and $\Delta^{-1}$ exist. Then
$$\begin{bmatrix} A & D \\ C & B \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} + E \Delta^{-1} F & -E \Delta^{-1} \\ -\Delta^{-1} F & \Delta^{-1} \end{bmatrix},$$
where $E = A^{-1} D$ and $F = C A^{-1}$.
The next result is from [275] but the origin of this result can be traced
back to [15].
$$\begin{bmatrix} 1 & v^\top A^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} 1 & -v^\top \\ u & A \end{bmatrix} = \begin{bmatrix} 1 + v^\top A^{-1} u & 0^\top \\ u & A \end{bmatrix} = W_1 \tag{2.9.43}$$
and
$$\begin{bmatrix} 1 & 0^\top \\ -u & I \end{bmatrix} \begin{bmatrix} 1 & -v^\top \\ u & A \end{bmatrix} = \begin{bmatrix} 1 & -v^\top \\ 0 & A + u v^\top \end{bmatrix} = W_2. \tag{2.9.44}$$
By Lemma 2.9.1, we see that the inverses of $W_1$ and $W_2$ are, respectively, given by
$$W_1^{-1} = \begin{bmatrix} \frac{1}{1 + v^\top A^{-1} u} & 0^\top \\[4pt] \frac{-A^{-1} u}{1 + v^\top A^{-1} u} & A^{-1} \end{bmatrix} \tag{2.9.45}$$
and
$$W_2^{-1} = \begin{bmatrix} 1 & v^\top \left(A + u v^\top\right)^{-1} \\ 0 & \left(A + u v^\top\right)^{-1} \end{bmatrix}. \tag{2.9.46}$$
From (2.9.44) and (2.9.43), we obtain
$$W_2^{-1} = \begin{bmatrix} 1 & -v^\top \\ u & A \end{bmatrix}^{-1} \begin{bmatrix} 1 & 0^\top \\ u & I \end{bmatrix} \tag{2.9.47}$$
and
$$W_1^{-1} = \begin{bmatrix} 1 & -v^\top \\ u & A \end{bmatrix}^{-1} \begin{bmatrix} 1 & v^\top A^{-1} \\ 0 & I \end{bmatrix}^{-1}, \tag{2.9.48}$$
respectively. Multiplying (2.9.48) on the right by $\begin{bmatrix} 1 & v^\top A^{-1} \\ 0 & I \end{bmatrix}$ yields
$$\begin{bmatrix} 1 & -v^\top \\ u & A \end{bmatrix}^{-1} = W_1^{-1} \begin{bmatrix} 1 & v^\top A^{-1} \\ 0 & I \end{bmatrix}. \tag{2.9.49}$$
and
$$\omega = \left(\delta^{(k)}\right)^\top \gamma^{(k)} + \left(\gamma^{(k)}\right)^\top H^{(k)} \gamma^{(k)}. \tag{2.9.54}$$
Now, by (2.9.55), (2.9.52), (2.9.57) and (2.9.58), it follows that
$$\begin{aligned} H_{BFGS}^{(k+1)} &= H^{(k)} - \frac{H^{(k)} \gamma^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)}}{\omega} \\ &\quad + \frac{\omega^2\, \delta^{(k)} \left(\delta^{(k)}\right)^\top + H^{(k)} \gamma^{(k)} \left(\gamma^{(k)}\right)^\top \delta^{(k)} \left(\delta^{(k)}\right)^\top \gamma^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)}}{\omega \left[\left(\delta^{(k)}\right)^\top \gamma^{(k)}\right]^2} \\ &\quad - \frac{\omega\, H^{(k)} \gamma^{(k)} \left(\gamma^{(k)}\right)^\top \delta^{(k)} \left(\delta^{(k)}\right)^\top + \omega\, \delta^{(k)} \left(\delta^{(k)}\right)^\top \gamma^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)}}{\omega \left[\left(\delta^{(k)}\right)^\top \gamma^{(k)}\right]^2} \\ &= H^{(k)} - \frac{H^{(k)} \gamma^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)}}{\omega} + \omega \frac{\delta^{(k)} \left(\delta^{(k)}\right)^\top}{\left[\left(\delta^{(k)}\right)^\top \gamma^{(k)}\right]^2} \\ &\quad + \frac{H^{(k)} \gamma^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)}}{\omega} - \frac{H^{(k)} \gamma^{(k)} \left(\delta^{(k)}\right)^\top + \delta^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)}}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}} \\ &= H^{(k)} + \frac{\left[\left(\delta^{(k)}\right)^\top \gamma^{(k)} + \left(\gamma^{(k)}\right)^\top H^{(k)} \gamma^{(k)}\right] \delta^{(k)} \left(\delta^{(k)}\right)^\top}{\left[\left(\delta^{(k)}\right)^\top \gamma^{(k)}\right]^2} - \frac{\delta^{(k)} \left(\gamma^{(k)}\right)^\top H^{(k)} + H^{(k)} \gamma^{(k)} \left(\delta^{(k)}\right)^\top}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}} \\ &= \left[I - \frac{\delta^{(k)} \left(\gamma^{(k)}\right)^\top}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}}\right] H^{(k)} \left[I - \frac{\gamma^{(k)} \left(\delta^{(k)}\right)^\top}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}}\right] + \frac{\delta^{(k)} \left(\delta^{(k)}\right)^\top}{\left(\delta^{(k)}\right)^\top \gamma^{(k)}}. \end{aligned} \tag{2.9.59}$$
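The product form (2.9.59) is convenient to implement directly. A sketch (with a standard curvature safeguard, assumed here, that skips the update when $\left(\delta^{(k)}\right)^\top \gamma^{(k)}$ is not sufficiently positive):

```python
import numpy as np

def bfgs_update(H, delta, gamma, curv_tol=1e-10):
    """BFGS update of the inverse-Hessian approximation in the product
    form (2.9.59):
        H+ = (I - d g^T / d.g) H (I - g d^T / d.g) + d d^T / d.g,
    with d = delta and g = gamma.  The update is skipped if the
    curvature condition delta^T gamma > 0 fails to hold with margin."""
    dg = delta @ gamma
    if dg <= curv_tol:
        return H                      # safeguard: keep previous approximation
    I = np.eye(len(delta))
    V = I - np.outer(delta, gamma) / dg
    return V @ H @ V.T + np.outer(delta, delta) / dg

# Usage inside a quasi-Newton loop:
#   s = -H @ g; a line search gives alpha; delta = alpha*s; gamma = g_new - g
#   H = bfgs_update(H, delta, gamma)
```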
Chapter 3
Constrained Mathematical Programming
3.1 Introduction
$$f(x) \tag{3.1.1}$$
subject to
$$h_i(x) = 0, \quad i = 1, \ldots, m, \tag{3.1.2}$$
$$h_i(x) \le 0, \quad i = m+1, \ldots, m+r. \tag{3.1.3}$$
Definition 3.1.3 Let x∗ be a given point in Rn . Then the active set, J (x∗ ),
of inequality constraints at x∗ is the set of indices corresponding to all those
inequality constraints that are active, i.e.,
Remark 3.1.1 Note that the condition stated in Definition 3.1.4 is known
as constraint qualification.
$$L(x, \lambda) = f(x) + \sum_{j=1}^{m} \lambda_j h_j(x) + \sum_{j=m+1}^{m+r} \lambda_j h_j(x), \tag{3.1.6}$$
Nε (x∗ ) = {x ∈ Ω : |x − x∗ | < ε} .
If f (x) > f (x∗ ) for all x ∈ Nε (x∗ ) such that x = x∗ , then x∗ is said to be
a strict local minimum point of f over the feasible region Ω.
hj (x∗ ) = 0, j = 1, 2, . . . , m, (3.1.8a)
hj (x∗ ) ≤ 0, j = m + 1, m + 2, . . . , m + r, (3.1.8b)
$$y^\top H^* y > 0, \quad \forall y \in M^*,\ y \ne 0, \tag{3.1.10}$$
and
$$J_+(x^*) = \left\{j \in J(x^*) : \lambda_j^* > 0\right\}. \tag{3.1.12}$$
quadratic programming problem with only linear equality constraints. The solution of quadratic programming problems with both linear equality and linear inequality constraints via the active set strategy is outlined in Section 3.3. In Section 3.4, we summarize a constrained quasi-Newton method for solving a general linearly constrained optimization problem. In Section 3.5, we summarize the essential steps required in the sequential quadratic programming algorithm, making use of the materials outlined in Sections 3.2–3.4.
$$\nabla_x f(x) = \frac{\partial f(x)}{\partial x} = x^\top G + c^\top = 0^\top. \tag{3.2.3}$$
If the Hessian $\nabla_{xx} f = G$ is positive definite, then, by solving (3.2.3), we obtain the unique minimum solution:
$$x^* = -G^{-1} c. \tag{3.2.4}$$
Partition $x$ as
$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},$$
so that the constraint gives
$$x_1 = A_1^{-1} \left(b - A_2 x_2\right). \tag{3.2.6}$$
Partition $G$ and $c$ accordingly as
$$G = \begin{bmatrix} G_{11} & G_{12} \\ G_{12}^\top & G_{22} \end{bmatrix} \quad \text{and} \quad c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}, \tag{3.2.7}$$
AE = Im (3.2.12)
60 3 Constrained Mathematical Programming
and
AF = 0. (3.2.13)
Note that $E$ and $F$ are not necessarily unique. In practice, $E$ and $F$ may be obtained by first selecting any matrix $Q \in \mathbb{R}^{n \times (n-m)}$ such that $\left[A^\top \,|\, Q\right]$ is non-singular. Defining
$$[E \,|\, F] = \left(\left[A^\top \,|\, Q\right]^\top\right)^{-1}, \tag{3.2.14}$$
it is then easy to verify that $E$ and $F$ satisfy (3.2.11), (3.2.12) and (3.2.13).
Equation (3.2.13) implies that the columns of $F$ are basis vectors for the null space of $A$, defined by $\{x \in \mathbb{R}^n : Ax = 0\}$. The general solution of (3.2.2) can then be written as
$$x = E b + F y, \tag{3.2.15}$$
where $y \in \mathbb{R}^{n-m}$ is arbitrary. Substitution of (3.2.15) into (3.2.1) yields
$$f(x) = \hat{f}(y) = \frac{1}{2} y^\top F^\top G F y + (c + G E b)^\top F y + \left(c + \frac{1}{2} G E b\right)^\top E b. \tag{3.2.16}$$
Clearly, if $F^\top G F$ is positive definite, then the unique minimizer $y^*$ is given by
$$y^* = -\left(F^\top G F\right)^{-1} F^\top (c + G E b). \tag{3.2.17}$$
The matrix $F^\top G F$ is referred to as the reduced Hessian matrix, while the vector $F^\top (c + G E b)$ is referred to as the reduced gradient. $x^*$ can then be easily obtained from (3.2.15).
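A sketch of this reduced-space computation in code; here the basis matrices are obtained from a QR factorization of $A^\top$ rather than via (3.2.14), a common practical choice assumed for illustration:

```python
import numpy as np

def eqp_nullspace(G, c, A, b):
    """Minimize 0.5 x^T G x + c^T x subject to A x = b by the reduced
    Hessian (null-space) method.  Q1 spans range(A^T), Q2 spans the null
    space of A; a particular solution x_p plays the role of E b."""
    m = A.shape[0]
    Q, R = np.linalg.qr(A.T, mode="complete")   # A^T = Q [R1; 0]
    Q1, Q2 = Q[:, :m], Q[:, m:]                 # Q2 columns: null space of A
    x_p = Q1 @ np.linalg.solve(R[:m, :].T, b)   # particular solution, A x_p = b
    # Reduced problem: minimize 0.5 y^T (Q2^T G Q2) y + (c + G x_p)^T Q2 y
    H_red = Q2.T @ G @ Q2                       # reduced Hessian
    g_red = Q2.T @ (c + G @ x_p)                # reduced gradient
    y = np.linalg.solve(H_red, -g_red)
    return x_p + Q2 @ y

# Quick check: minimize 0.5*(x1^2 + x2^2) s.t. x1 + x2 = 1  ->  [0.5, 0.5]
G = np.eye(2); c = np.zeros(2)
print(eqp_nullspace(G, c, np.array([[1.0, 1.0]]), np.array([1.0])))
```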
The third method employs the idea of Lagrange multipliers. The La-
grangian function for the constrained problem (3.2.1) and (3.2.2) is
$$\begin{bmatrix} G & A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} x^* \\ \lambda^* \end{bmatrix} = \begin{bmatrix} -c \\ b \end{bmatrix}. \tag{3.2.21}$$
If the coefficient matrix $\begin{bmatrix} G & A^\top \\ A & 0 \end{bmatrix}$ is non-singular, its inverse may be written in the partitioned form $\begin{bmatrix} H & P^\top \\ P & S \end{bmatrix}$, where
$$H = G^{-1} - G^{-1} A^\top \left(A G^{-1} A^\top\right)^{-1} A G^{-1}, \tag{3.2.23}$$
$$P = \left(A G^{-1} A^\top\right)^{-1} A G^{-1}, \tag{3.2.24}$$
$$S = -\left(A G^{-1} A^\top\right)^{-1}. \tag{3.2.25}$$
Hence,
$$x^* = -H c + P^\top b, \tag{3.2.26}$$
$$\lambda^* = -P c + S b, \tag{3.2.27}$$
which is a much more elegant form than that obtained by the direct elimina-
tion method in (3.2.6) and (3.2.9). The corresponding Lagrange multipliers,
λ∗ , may be obtained similarly.
Finally, note that x∗ and λ∗ can also be generated by first finding any
solution x that satisfies the constraints, i.e.,
Ax = b. (3.2.29)
Then
x∗ = x − Hp (3.2.30)
and
λ∗ = −P p, (3.2.31)
where
$$p = \left(\nabla_x f(x)\right)^\top = G x + c, \tag{3.2.32}$$
because
$$\begin{aligned} x - H p &= x - H(G x + c) = x - H G x - H c \\ &= x - \left[G^{-1} - G^{-1} A^\top \left(A G^{-1} A^\top\right)^{-1} A G^{-1}\right] G x - H c \quad (\text{from } (3.2.23)) \\ &= x - x + G^{-1} A^\top \left(A G^{-1} A^\top\right)^{-1} A x - H c \\ &= G^{-1} A^\top \left(A G^{-1} A^\top\right)^{-1} b - H c \quad (\text{from } (3.2.29)) \\ &= -H c + P^\top b \quad (\text{from } (3.2.24)) \\ &= x^* \quad (\text{from } (3.2.26)) \end{aligned}$$
and
$$\begin{aligned} -P p &= -P(G x + c) = -P G x - P c \\ &= -\left(A G^{-1} A^\top\right)^{-1} A G^{-1} G x - P c \quad (\text{from } (3.2.24)) \\ &= -\left(A G^{-1} A^\top\right)^{-1} b - P c \quad (\text{from } (3.2.29)) \\ &= -P c + S b \quad (\text{from } (3.2.25)) \\ &= \lambda^* \quad (\text{from } (3.2.27)). \end{aligned}$$
Problem (QPE) only involves linear equality constraints. In this section, let
us consider the following general quadratic programming problem that also
involves linear inequality constraints.
$$\text{minimize} \quad f(x) = \frac{1}{2} x^\top G x + c^\top x \tag{3.3.1a}$$
subject to
$$h_i(x) = a_i^\top x - b_i = 0, \quad i \in E, \tag{3.3.1b}$$
$$h_i(x) = a_i^\top x - b_i \le 0, \quad i \in I. \tag{3.3.1c}$$
hi (x∗ ) ≤ 0, i ∈ I, (3.3.4)
λ∗i hi (x∗ ) = 0, i ∈ I, (3.3.5)
λ∗i ≥ 0, i ∈ I. (3.3.6)
by solving
$$-\left(A^{(k)}\right)^\top \lambda^{(k)} = G x^{(k)} + c, \tag{3.3.12}$$
where $A^{(k)}$ is the active constraint matrix defined by (3.3.8). Select $j$ such that
$$\lambda_j^{(k)} = \min_{i \in A^{(k)} \cap I} \lambda_i^{(k)}, \tag{3.3.13}$$
and go to Step 2.
$$\bar{\alpha}^{(k)} = \min_{\substack{i \in I \setminus A^{(k)} \\ a_i^\top d^{(k)} > 0}} \frac{b_i - a_i^\top x^{(k)}}{a_i^\top d^{(k)}}, \tag{3.3.15}$$
and set
$$x^{(k+1)} = x^{(k)} + \alpha^{(k)} d^{(k)}. \tag{3.3.16}$$
If $\alpha^{(k)} < 1$, set
$$A^{(k+1)} = A^{(k)} \cup \{p\}, \tag{3.3.17}$$
where $p \in I \setminus A^{(k)}$ is the index that achieves the minimum in (3.3.15). Otherwise, if $\alpha^{(k)} = 1$, set $A^{(k+1)} = A^{(k)}$.
the Lagrange multipliers associated with the active set are non-negative, the
Karush-Kuhn-Tucker necessary condition is satisfied and a local optimum is
reached. Otherwise, if some or all of the multipliers are strictly negative, the
constraint corresponding to the most negative multiplier is removed from the
active set. This procedure is carried out in Step 3 of the algorithm. Steps 2
and 4 describe the details of solving the equality only constrained problem
associated with the active set. The iterate x(k) may or may not be an optimal
solution to this problem. We may shift the origin of the coordinate system to
x(k) and check if any non-zero local perturbation d solves the corresponding
shifted problem (3.3.9). If the optimal solution is d = 0, then proceed to Step
3 to check for the Lagrange multipliers. If d = d(k) = 0, then we can reduce
the cost function by updating x(k) to x(k) + d(k) . This step may, however,
cause some of the constraints to be violated. To prevent constraint violation,
we change the update to x(k) + α(k) d(k) , where α(k) is chosen such that the
first non-active constraint in I \A(k) becomes active. This is done only for
constraints that increase in the $d^{(k)}$ direction, i.e., for $a_i^\top d^{(k)} > 0$. The new
active constraint is then included in the active set. The whole procedure is
then repeated by returning to Step 2 after updating x(k+1) = x(k) + α(k) d(k)
and k := k + 1 in Step 5.
Note that in the computation of Lagrange multipliers associated with the
active set in (3.3.12), the set of linear algebraic equations need not be solved
independently at each step since only one constraint is removed or added
each time. A pivoting strategy similar to that used in the simplex algorithm
for linear programming can be used to significantly reduce the amount of
computational effort required.
Example 3.3.1 Find an $x \in \mathbb{R}^2$ such that the objective function
$$f(x) = x_1^2 + x_2^2 - 4 x_1 - 5 x_2$$
is minimized subject to
h1 (x) = 2x1 + x2 − 2 ≤ 0,
h2 (x) = −x1 ≤ 0,
h3 (x) = −x2 ≤ 0.
Suppose we start at the feasible point $x^{(0)} = 0$. The relevant gradients are
$$g(x) = \left(\nabla_x f\right)^\top = \begin{bmatrix} 2 x_1 - 4 \\ 2 x_2 - 5 \end{bmatrix}, \quad \nabla_x h_1 = [2, 1], \quad \nabla_x h_2 = [-1, 0], \quad \nabla_x h_3 = [0, -1],$$
$$A^{(0)} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}.$$
The solution is $\lambda_2^{(0)} = -4$ and $\lambda_3^{(0)} = -5$, of which the most negative one is $\lambda_3^{(0)}$. Hence the third constraint is dropped. Now, we set $A^{(0)} = \{2\}$ and go to Step 2.
Problem (3.3.9) is to minimize $\frac{1}{2} d^\top G d + d^\top g^{(0)}$ subject to $\nabla_x h_2\left(x^{(0)}\right) d = 0$. It has the solution $d = [0, 5/2]^\top$. We now move to Step 4 to determine the step length along this direction of search. We have
$$\alpha^{(0)} = \min\left\{1,\ -h_1\left(x^{(0)}\right) \big/ \nabla_x h_1\left(x^{(0)}\right) d \right\},$$
as the third constraint does not satisfy the criterion $\nabla_x h_3\left(x^{(0)}\right) d > 0$. This gives $\alpha^{(0)} = 4/5$, so
$$x^{(1)} = x^{(0)} + \alpha^{(0)} d^{(0)} = \begin{bmatrix} 0 \\ 2 \end{bmatrix},$$
A(1) = {1, 2} .
Moving through Step 5, then Steps 2 and 3 again (two linearly independent constraints are active), the new Lagrange multipliers are found from
$$-\begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \lambda_1^{(1)} \\ \lambda_2^{(1)} \end{bmatrix} = \begin{bmatrix} -4 \\ -1 \end{bmatrix},$$
to give $\lambda_1^{(1)} = 1$ and $\lambda_2^{(1)} = -2$. The second constraint is dropped, giving $A^{(1)} = \{1\}$.
We now return to Step 2, where Problem (3.3.9) is solved again to give $d^{(1)} = [1/5, -2/5]^\top$. Step 4 gives a step length of $\alpha^{(1)} = 1$, which, in turn, gives $x^{(2)} = [1/5, 8/5]^\top$. Moving back to Step 2 with $A^{(2)} = \{1\}$, we find that $d^{(2)} = 0$, and then Step 3 gives $\lambda_1^{(2)} = 9/5$, which is greater than zero, so $x^{(2)}$ is the optimal point. The corresponding optimal Lagrange multiplier is $\lambda^* = [9/5, 0, 0]^\top$.
3.4 Constrained Quasi-Newton Method 67
f (x) (3.4.1a)
is to be minimized subject to
hi (x) =ai x − bi = 0, i ∈ E, (3.4.1b)
hi (x) =ai x − bi ≤ 0, i ∈ I, (3.4.1c)
and !
H (k) = ∇xx f x(k) .
1 (k)
d B d+d g (k) (3.4.2a)
2
is minimized subject to
!
a
i d + hi x
(k)
= 0, i ∈ E, (3.4.2b)
!
a
i d + hi x
(k)
≤ 0, i ∈ I, (3.4.2c)
where
δ (k) = x(k+1) − x(k)
and
γ (k) = g (k+1) − g (k) .
This formula ensures that if B (0) is symmetric positive definite, then so are all
successive updates of the approximate Hessian matrices B (k) provided that
!
δ (k) γ (k) > 0.
Note that if x(k) is feasible, then hi x(k) = 0, i ∈ A(k) ⊇ E, where A(k)
is as defined in Definition 3.3.1.
For each k, Problem (QP)k is solved by the active set method described
in Section 3.3. The algorithm for solving Problem (3.4.1) can now be stated
as follows.
Algorithm 3.4.1
Step 1. Choose a point x(0) ∈ Ξ. (This can be achieved by the first phase of
the simplex algorithm for linear programming.) Approximate H (0) by
a symmetric positive definite matrix B (0) . Choose an ε > 0 and set
k = 0.
Remark 3.4.2 The new point x(k) + α(k) d(k) must be feasible. This will be
the case provided that
−hi x(k)
α ≤ ᾱ = min
(k) (k)
.
i∈I\A(k) ai d
(k)
a
i d
(k)
>0
" #
The condition α(k) ≤ min 1, ᾱ(k) will ensure feasibility of x(k) +α(k) d(k) . As
x(0) is chosen to be a feasible point, Step 3 ensures that the algorithm gener-
ates a sequence of feasible points. The sufficient slope improvement condition
$ %
ensures that δ (k) γ (κ) > 0. However, the constraint α(k) ≤ min{1, ᾱ(k) }
can destroy this property.
optimization problems. The theory was initiated by Wilson in [278] and was
further developed by Han in [86] and [87], Powell in [207] and Schittkowski in
[221] and [222]. In this section, we shall discuss some of the essential concepts
of the algorithm without going into detail. The main references of this section
are [222], [232] and [275]. For readers interested in details, see [232].
Consider the equality constrained optimization problem
subject to
hi (x) = 0, i = 1, 2, . . . , m. (3.5.2)
The Lagrangian function is
and
∇λ L(x, λ) = (h(x)) = 0 . (3.5.5)
(3.5.4) and (3.5.5) may be rewritten as the following system of nonlinear
equations:
⎡ ⎤
m
(∇x L(x, λ)) (∇ f (x)) + λ (∇ h (x)) ⎦= 0 .
=⎣
x i x i
W (x, λ) = i=1
h(x) 0
h(x)
(3.5.6)
We shall use Newton’s method to find the solution of the nonlinear sys-
tem (3.5.6). For a given iterate x(k) ∈ Rn and the corresponding Lagrange
multiplier λ(k) ∈ Rm , the next iterate (x(k+1) , λ(k+1) ) is obtained by solving
the following system of linear equations:
. (k) (k) (k) /
!
∇ L x , λ ∇ h x x(k+1) − x(k)
W x(k) , λ(k) + xx (k) x = 0.
∇x h x 0 λ(k+1) − λ(k)
(3.5.7)
From (3.5.6)–(3.5.7), it can be shown that
! ! !!
∇xx L x(k) , λ(k) x(k+1) − x(k) + ∇x h x(k) λ(k+1)
!!
= − ∇x f x(k) (3.5.8)
3.5 Sequential Quadratic Programming Algorithm 71
and ! ! !
∇x h x(k) x(k+1) − x(k) = −h x(k) . (3.5.9)
Solving the system of linear equations (3.5.10) and (3.5.11) gives (δx(k) ,
λ(k+1) ). Then, we obtain the next iterate as
subject to
! !
hi x(k) + ∇x hi x(k) δx(k) = 0, i = 1, 2, . . . , m. (3.5.14)
subject to
hi (x) = 0, i = 1, 2, . . . , m, (3.5.17)
hi (x) ≤ 0, i = m + 1, m + 2, . . . , m + r. (3.5.18)
m+r
L(x, λ) = f (x) + λi hi (x). (3.5.19)
i=1
Let x(k) be the current iterate, and let λ(k) ∈ Rm+r be the corresponding
Lagrange multiplier vector. To find the next iterate, we first construct the
following quadratic programming subproblem:
! ! 1 !
minimize f x(k) +∇x f x(k) δx(k) + (δx(k) ) ∇xx L x(k) , λ(k) δx(k)
2
(3.5.20)
subject to
! !
hi x(k) + ∇x hi x(k) δx(k) = 0, i = 1, 2, . . . , m, (3.5.21)
and
! !
hi x(k) + ∇x hi x(k) δx(k) ≤ 0, i = m + 1, 2, . . . , m + r. (3.5.22)
such a way that if B (k) is positive definite, then B (k+1) will also be positive
definite. To generate a new iterate, we need conditions for the step length
selection. For this, we will introduce a merit function. The SQP method for
solving the nonlinear optimization problem with equality and inequality con-
straints may now be stated as follows.
Algorithm 3.5.1
Step 1. Let k = 0. Choose a starting point x(0) ∈ Rn and choose a positive
definite matrix B (0) .
Step 2. Solve the following subproblem to obtain δx(k) , λ(k+1) :
! ! 1 !
minimize f x(k) + ∇x f x(k) δx(k) + δx(k) B (k) δx(k)
2
(3.5.24)
subject to
! !
hi x(k) + ∇x hi x(k) δx(k) = 0, i = 1, 2, . . . , m (3.5.25)
and
! !
hi x(k) +∇x hi x(k) δx(k) ≤ 0, i = m+1, 2, . . . , m+r. (3.5.26)
Step 3. If δx(k) = 0, then the algorithm terminates and x(k) is the KKT point
for problem (3.5.16)–(3.5.18). Otherwise, set x(k+1) = x(k) +αk δx(k) ,
where αk is determined by some step length rules, see below.
Step 4. Update B (k) to B (k+1) such that B (k+1) is positive definite. Return
to Step 2.
The next task is to derive the update formula for the matrix B (k) . It is
natural to consider using the BFGS update formula, i.e.,
γ (k) γ (k) B (k) δ (k) δ (k) B (k)
B (k+1) = B (k) + − , (3.5.27)
γ (k) δ (k) (δ (k) ) B (k) δ (k)
and
|∇x f (x + ᾱδ)δ| ≤ −βδ (∇x f (x)) , (3.5.29)
74 3 Constrained Mathematical Programming
where ρ and β are constants satisfying 0 < ρ < β < 1. However, this is
not always true for the constrained case. Therefore, the BFGS update should
not be applied directly. It needs to be modified as suggested in [207]. More
specifically, we introduce η (k) to replace γ (k) , where η (k) is given by
⎧ (k) (k)
⎨ γ (k) , γ δ ≥ 0.2 δ (k) B (k) δ (k) ,
η (k) =
⎩
θk γ (k) + (1 − θk )B (k) δ (k) , otherwise,
(3.5.30)
where
0.8 δ (k) B (k) δ (k)
θk = . (3.5.31)
δ (k) B (k) δ (k) − (γ (k) ) δ (k)
Clearly,
! !
η (k) δ (k) ≥ 0.2 δ (k) B (k) δ (k) (3.5.32)
and hence !
δ (k) η (k) > 0. (3.5.33)
Let η (k) be chosen as defined by (3.5.30). Then, the modified BFGS formula
is given ([207]) by
η (k) η (k) B (k) δ (k) δ (k) B (k)
B (k+1)
=B (k)
+ − . (3.5.34)
η (k) δ (k) δ (k) B (k) δ (k)
and
! !
hi x(k) + ∇x hi x(k) δx(k) ≤ 0, i = m + 1, 2, . . . , m + r. (3.5.37)
! ! (k)
m+r !!
∇x f x(k) δx(k) + δx(k) B (k) + λi ∇x hi x(k) = 0, (3.5.38)
i=1
(k)
& ! ! '
λi hi x(k) + ∇x hi x(k) δx(k) = 0, i = m+1, 2, . . . , m+r, (3.5.39)
and
(k)
λi ≥ 0, i = m + 1, 2, . . . , m + r. (3.5.40)
(k+1) (k+1) (k+1)
Then, the new estimates x ,λ and B may be determined by
where
δ (k) = x(k+1) − x(k) (3.5.44)
and
& !' & !'
γ (k) = ∇x L x(k+1) , λ(k+1) − ∇x L x(k) , λ(k) . (3.5.45)
m
m+r
P (x, σ) = f (x) + σi |hi (x)| + σi max{0, −hi (x)}, (3.5.46)
i=1 i=m+1
(k)
where λ is the optimal Lagrange multiplier of the subproblem
(3.5.35)–
(k) (k)
(3.5.37), i.e., it satisfies (3.5.38)–(3.5.40). Clearly, λi ≤ σi .
To determine the convergence and convergence rate of the SQP method,
we need the following two lemmas from [275].
76 3 Constrained Mathematical Programming
where
I(x) = {i ∈ I : hi (x) = Φ(x)}. (3.5.49)
Lemma 3.5.2 Consider Problem (3.5.16)-(3.5.18), where f (x) and hi (x),
i = 1, 2, . . . , m + r, are continuously differentiable
! functions. Suppose that
(k) (k) (k)
B is positive definite and that δx , λ is a KKT point of the
(k) (k)
subproblem (3.5.35)-(3.5.37) for which δx(k) = 0 and λi ≤ σi , i =
1, 2, . . . , m + r. Then
!
P x(k) , σ (k) ; δx(k) < 0. (3.5.50)
With the help of Lemmas 3.5.1 and 3.5.2, the step size αk can be chosen
(see [87]) such that
! !
P x(k) + αk δx(k) , σ (k) < max P x(k) + αδx(k) , σ (k) + εk , (3.5.51)
1≤α≤β
Now, suppose that there exists a k0 such that Assumption 3.5.4 is satisfied
for k > k0 . Then, there exists a λ ∈ R|A(x )| such that
(k) (k)
! ! (k)
∇x f x(k) + B (k) δx(k) = A x(k) λ (3.5.59)
and ! !
A x(k) δx(k) = −-
h x(k) , (3.5.60)
for all k > k0 , where -
h(x) is the vector whose elements are hi (x), i ∈ A x(k) .
Theorem 3.5.2 Suppose that Assumptions 3.5.1–3.5.4 are satisfied. Then,
δx(k) is a superlinearly convergent step, i.e.,
(k)
x + δx(k) − x∗
lim = 0, (3.5.61)
k→∞ x(k) − x∗
if and only if
Pk Bk − W (x∗ , λ∗ )δx(k)
lim = 0, (3.5.62)
k→∞ δx(k)
where Pk is a projection from Rn onto the null space of A x(k) , i.e.,
! !! !−1 !!
Pk = I − A x(k) A x(k) A(x(k) ) A x(k) . (3.5.63)
78 3 Constrained Mathematical Programming
4.1 Introduction
hj (x) ≤ 0, j = 1, 2, . . . , m, (4.2.1a)
and
a ≤ x ≤ b, (4.2.1b)
where x = [x1 , x2 , . . . , xn ] , a = [a1 , a2 , . . . , an ] and b = [b1 , b2 , . . . , bn ] .
Here, ai , i = 1, 2, . . . , n, and bi , i = 1, 2, . . . , n, are given constants, and we
assume that for each j = 1, 2, . . . , m, hj is continuously differentiable with
respect to x.
Given Problem (4.2.1), we can construct a corresponding unconstrained
optimization problem as follows. Let ε > 0 and consider
⎧ ⎫
⎨ m ⎬
minimize Jε (x) = Φε (hj (x)) (4.2.2a)
⎩ ⎭
j=1
subject to
a ≤ x ≤ b, (4.2.2b)
where Φε (·) is defined by
⎧
⎪
⎪ h, if h ≥ ε,
⎪
⎪
⎪
⎪
⎨ 2
(h + ε)
Φε (h) =
⎪ , if − ε ≤ h ≤ ε, (4.2.3)
⎪
⎪ 4ε
⎪
⎪
⎪
⎩ 0, if h ≤ −ε,
4.2 Constraint Transcription Technique 81
Φε
−ε ε h
Proof. Suppose x0 is not feasible. Then there exists some j such that
hj x0 > 0.
then there exists a positive integer k0 such that x(k) is a feasible solution of
Problem (4.2.1) for all k > k0 .
Proof. Since limk→∞ Jε x(k) = 0, there exists a positive integer N such
that !
Jε x(k) ≤ ε/4
for all k > N . Thus, by Theorem 4.2.2, x(N +1) is a feasible solution of
Problem (4.2.1).
In view of Theorem 4.2.3, we see that Algorithm 4.2.1 will find a feasible
solution of Problem (4.2.1) in a finite number of iterations if (4.2.6) holds.
In fact, if the optimization algorithm simply produces a point x∗ such that
the sufficient condition (4.2.5) is satisfied, then x∗ is a feasible solution of
the inequality constraints (4.2.1), so (4.2.6) is not strictly necessary for finite
convergence.
4.2 Constraint Transcription Technique 83
Remark 4.2.1 Note that Algorithm 4.2.1 can produce a non-feasible solu-
tion, which corresponds to a local minimum point for the corresponding un-
constrained optimization problem. In such a case, the solution should obvi-
ously be rejected. The algorithm may be restarted from another initial point
in an attempt to reach a global minimum.
Remark 4.2.2 ε is required to be positive, but, in practice, it does not need
to be very small. Note that if we choose ε = 0, then Φε becomes nonsmooth.
In such a case, the necessary and sufficient condition for the solvability of
Problem (4.2.1) is Jε = 0. This, however, may require an infinite number of
iterations and is thus not recommended. In general, an appropriate choice of
ε will enhance the convergence characteristic of the algorithm. If the feasible
space is sufficiently large, ε can be larger (and hence speed up convergence).
Otherwise, ε should be made small.
Here, we shall extend the technique presented in Section 4.2.1 to find a feasible
solution of the following continuous inequality constraints:
and
a ≤ x ≤ b, (4.2.7b)
where 0 < T < ∞.
For convenience, let this continuous inequality constrained problem be
referred to as Problem (4.2.7).
For any ε > 0, we consider the following optimization problem:
⎧ ⎫
⎨ T m ⎬
min Jε (x) = Φε (hj (x, t))dt (4.2.8a)
⎩ 0 ⎭
j=1
subject to
a ≤ x ≤ b, (4.2.8b)
where Φε (·) is defined by (4.2.3) and, for each j = 1, 2, . . . , m, hj is assumed
to satisfy the following conditions.
3T
Assumption 4.2.1 0 Φε (hj (x, t))dt exists for all x.
Assumption 4.2.2 hj (x, t) is continuous in t ∈ [0, T ], and ∂hj (x, t)/∂t is
piecewise continuous in t ∈ [0, T ] for each x.
Assumption 4.2.3 hj (x, t) is continuously differentiable with respect to x
for all t ∈ [0, T ], except, possibly, at a finite number of points in [0, T ].
84 4 Optimization Problems Subject to Continuous Inequality Constraints
where
df (t)
M = max (4.2.11b)
t∈[0,T ] dt
and
f- = max f (t). (4.2.11c)
t∈[0,T ]
Proof. Let t0 ∈ [0, T ] be such that f (t0 ) = f-. There are three cases to be
considered:
f-
t0 + ≤ T, (4.2.12a)
M
f-
t0 − ≥ 0, (4.2.12b)
M
f- f-
t0 − < 0 and t0 + > T. (4.2.12c)
M M
Case (4.2.12a): Define
t t
df (s)
f (t) − h(t) = f (t0 ) + ds − f (t0 ) + M ds
t0 ds t0
4.2 Constraint Transcription Technique 85
t
df (s)
= + M ds ≥ 0.
t0 ds
Hence,
!2
T t0 +(f/M ) t0 +(f/M ) -
f
1
f (t) dt ≥ f (t) dt ≥ h(t) dt = . (4.2.14)
0 t0 t0 2 M
Hence,
!2
T t0 -
f t0
1
f (t) dt ≥ f (t) dt ≥ h(t) dt = . (4.2.16)
0 t0 −(f/M ) t0 −(f/M ) 2 M
f (t0 ) − M (t − t0 ), t ≥ t0 ,
h(t) = (4.2.17)
f (t0 ) + M (t − t0 ), t < t0 .
Then,
⎧ t ⎫
⎪
⎪ df (s) ⎪
⎪
⎪ + M ds, t ≥ t0 ⎪
⎪
⎪
⎨ t0 ds ⎬
f (t) − h(t) = ≥ 0, ∀t ∈ [0, T ],
⎪
⎪ ⎪
⎪
⎪
⎪
t0
df (s) ⎪
⎩ M− ds, t < t0 ⎪
⎭
t ds
and hence,
T T
f (t) dt ≥ h(t) dt
0 0
86 4 Optimization Problems Subject to Continuous Inequality Constraints
T t0
= h(t) dt + h(t) dt
t0 0
M M t0
= (T − t0 ) f- − (T − t0 ) + t0 f- −
2 2
M M
= T f- − (T − t0 )2 − (t0 )2 . (4.2.18)
2 2
Note that
f-
t0 + >T ⇒ M (T − t0 ) < f-,
M
f-
t0 − < 0 ⇒ M t0 < f-.
M
We have
M M 1& '
T f- − (T − t0 )2 − (t0 )2 ≥ T f- − (T − t0 ) f- + t0 f-
2 2 2
1 1
= T f- − T f- = T f-.
2 2
Thus, it follows from (4.2.18) that
T
1 -
f (t) dt ≥ T f. (4.2.19)
0 2
Clearly,
∂Φ h x0 , t ∂Φ h x0 , t ∂ h x0 , t
ε j ε j j
≤ . (4.2.24)
∂t ∂hj ∂t
Since
∂Φ h x0 , t
ε j
≤ 1, ∀ t ∈ [0, T ],
∂hj
it is clear that
M̄ ≤ M. (4.2.25)
Thus, by Lemma 4.2.1, we have
T
Φ∗ε Φ∗ε
Φε (hj (x, t)) dt ≥ min ,T , (4.2.26)
0 2 M̄
where
Φ∗ε = max Φε (hj (x, t)). (4.2.27)
t∈[0,T ]
Φ∗ε Φ∗ε ε * ε + ε * ε +
min ,T > min , T ≥ min ,T . (4.2.29)
2 M̄ 8 4M̄ 8 4M
Note that the gradient of the cost function (4.2.8a) with respect to each
x ∈ [a, b] is given by
T
m
∂Φε (hj (x, t))
∇Jε (x) = dt, (4.2.30)
0 j=1
∂x
88 4 Optimization Problems Subject to Continuous Inequality Constraints
where
⎧
⎪ ∂hj (x, t)
⎪
⎪ , if hj (x, t) ≥ ε,
⎪
⎪ ∂x
⎪
⎪
∂Φε (hj (x, t)) ⎨ h (x, t) + ε ∂h (x, t)
= j j
, if −ε ≤ hj (x, t) ≤ ε,
∂x ⎪
⎪
⎪
⎪ 2ε ∂x
⎪
⎪
⎪
⎩
0, if −ε ≤ hj (x, t).
(4.2.31)
Thus, Problem (4.2.8) can be viewed as a standard unconstrained opti-
mization problem and hence is solvable by any efficient unconstrained opti-
mization technique such as the quasi-Newton method (see Chapter 2). We
may summarize these findings in the following algorithm.
Algorithm 4.2.2 At the k-th iteration of the unconstrained optimization
algorithm with iterate x(k) , we insert the following steps:
Step 1. If Jε (x(k) ) > mεT
4 , go to Step 3; otherwise, go to Step 2.
Step 2. Check if the constraints (4.2.7a) are satisfied. If so, go to Step 4;
otherwise, go to Step 3.
Step 3. Continue the next iteration of the unconstrained optimization method
to obtain x(k+1) . Set k := k + 1, and go to Step 1.
Step 4. Stop. x(k) is a feasible solution.
The following theorem shows that Algorithm 4.2.2 terminates in a finite
number of iterations.
" #
Theorem 4.2.6 Let x(k) be a sequence of admissible points generated by
Algorithm 4.2.2. If !
lim Jε x(k) = 0, (4.2.32)
k→∞
then there exists a positive integer k0 such that x(k) is a feasible solution of
Problem (4.2.7) for all k > k0 .
Proof. Since !
lim Jε x(k) = 0,
k→∞
As in the previous case, it is clear from Theorem 4.2.6 that Algorithm 4.2.2
will find a feasible solution of Problem (4.2.7) in a finite number of iterations
if (4.2.32) holds. Again, it is possible that a solution can still be obtained in a
4.3 Continuous Inequality Constraint Transcription Approach 89
finite number of steps even when (4.2.32) is not satisfied. Also, the algorithm
may converge to a non-global minimum, in which case one may repeat it with
a different starting point in the hope of reaching the global optimum.
f (x) (4.3.1a)
hj (x) ≤ 0, j = 1, 2, . . . , p (4.3.1b)
Gj (x) = 0. (4.3.2)
and
Gj (x) = 0, j = 1, 2, . . . , m. (4.3.3c)
This problem is referred to as Problem (4.3.3). Clearly, Problem (4.3.1) is
equivalent to Problem (4.3.3).
Note that for each j = 1, 2, . . . , m, Gj (x) is nonsmooth in x. Hence, stan-
dard optimization routines can have difficulty with these equality constraints.
Let
Θ = {x ∈ Rn : hj (x) ≤ 0, j = 1, 2, . . . , p} , (4.3.4)
and let F be the feasible region of Problem (4.3.1) defined by
F = {x ∈ Θ : φj (x, t) ≤ 0, ∀t ∈ [0, T ], j = 1, 2, . . . , m}
= {x ∈ Θ : Gj (x) = 0, j = 1, 2, . . . , m} . (4.3.5)
◦ ◦
Furthermore, let Θ (respectively, F) be the interior of the set Θ (respectively,
F) in the sense that
◦
Θ = {x ∈ Rn : hj (x) < 0, j = 1, 2, . . . , p} (4.3.6)
and
◦ ◦
F = x ∈ Θ : φj (x, t) < 0, ∀ t ∈ [0, T ], j = 1, 2, . . . , m . (4.3.7)
f (x) (4.3.10a)
subject to
x∈Θ (4.3.10b)
and
Gj,ε (x) ≤ τ, j = 1, 2, . . . , m. (4.3.10c)
This problem is referred to as Problem (4.3.10). Let Fε be the feasible region
of Problem (4.3.9) defined by
◦
Proof. By Assumption 4.3.5, there exists an x̄ ∈ F such that
◦
xα = αx̄ + (1 − α) x∗ = x∗ + α (x̄−x∗ ) ∈ F, ∀α ∈ (0, 1]. (4.3.12)
4.3 Continuous Inequality Constraint Transcription Approach 93
Theorem 4.3.2 Let x∗ and x∗ε be as in Theorem 4.3.1. Then, the sequence
{x∗ε } has an accumulation point. Furthermore, any accumulation point of the
sequence {x∗ε } is an optimal parameter vector of Problem (4.3.1).
We shall show that x̂ ∈ F. Suppose not. Then, there exist a j and a non-zero
interval I ⊂ [0, T ] such that
ε T ε
kj,ε = min , . (4.3.17)
16 2 2mj
but
Gj (x) > 0. (4.3.23)
Since φj is a continuously differentiable function in [0, T ], (4.3.23) implies
that there exists a t̄ ∈ [0, T ] such that
4
Φj 6
2 6 T
0 -
y t̄ t
−2
−4 − 2
? θ
- θ
−6 z
−8 Ii -
−10
y
−12 tan(θ) = mi = z
−14 z = mi y ≥ 2
mi
−16
−1 0 1 2 3 4 5 6
See Figure 4.3.1. Using (4.3.16), it is clear from (4.3.25) that the length |Ij |
of the interval Ij must satisfy
T ε
|Ij | ≥ min , . (4.3.26)
2 2mj
Algorithm 4.3.1 (Note that the integral in (4.3.8) is replaced by any suit-
able quadrature with positive weights in any computational scheme.)
Remark 4.3.2 Note that the number of quadrature points may need to be
increased as ε → 0 and that for large values of ε, a small number of quadrature
points are sufficient.
Remark 4.3.3 From Theorem 4.3.2, we see that the halving process of τ in
Step 2 of the algorithm only needs to be carried out a finite number of times.
Thus, the algorithm produces a sequence of suboptimal parameter vectors to
Problem (4.3.1), where each of them is in the feasible region of (4.3.1).
Remark 4.3.4 From the proof of Theorem 4.3.3, we see that ε and τ are
closely related. At the solution of a particular problem, if a constraint is active
over a large fraction of [0, T ], then it appears that τ = O (ε). On the other
hand, if the constraint is active at only one point in [0, T ], then τ = O ε2 .
" #
Theorem 4.3.4 Let x∗ε,τ be a sequence of the suboptimal parameter vec-
tors produced by the above algorithm. Then,
f x∗ε,τ → f (x∗ ) ,
" #
and any accumulation point of x∗ε,τ is a solution of Problem (4.3.1).
Proof. Clearly,
f (x∗ε ) ≤ f x∗ε,τ ≤ f (x∗ ) .
Since f (x∗ε ) → f (x∗ ), it follows that f x∗ε,τ → f (x∗ ).
To prove
" the#second part of the theorem, we note from (4.3.A6) that the
sequence x∗ε,τ is in a compact set. Thus, the existence of an accumula-
tion point is assured. On this basis, the conclusion follows easily from the
continuity of the function f .
Example 4.3.1 We choose an example that was used in [81] and [241]. Min-
imize
x2 (122 + 17x1 + 6x3 − 5x2 + x1 x3 ) + 180x3 − 36x1 + 1224
f (x) =
x2 (408 + 56x1 − 50x2 + 60x3 + 10x1 x3 − 2x21 )
(4.3.27a)
subject to
φ (x, ω) ≤ 0, ∀ ω ∈ Ω, (4.3.27b)
where
2
φ (x, ω) = J (T (x, ω)) − 3.33 [R(T (x, ω))] + 1.0,
$ %
and where Ω = 10−6 , 30 , i2 = −1, T (x,ω) = 1 + H (x, iω) G [iω],
H (x, x) = x1 + x2 /s + x3 s,
4.3 Continuous Inequality Constraint Transcription Approach 97
1
G (s) = ,
(s + 3) (s2 + 2s + 2)
and R(·) and J (·) denote the real and imaginary parts of their arguments,
respectively. Finally, there are simple bounds on the variables:
We apply Algorithm 4.3.1 to the problem and report the results in Table 4.3.1.
In the table, we report a “failure”parameter F . F = 0 indicates normal termi-
nation, and a minimum has been found. F = 2 indicates that the convergence
criteria are not satisfied, but no downhill search could be determined. F = 3
means the maximum number of iterations, 50, has been exceeded. The num-
ber of function evaluations is given by nf with a ‘∗’ indicating when the
maximum number of iterations was reached. ε, τ and f and x1 , x2 and x3
are self-explanatory. ω̄ gives the approximate value of ω where φ attains its
maximum over Ω at the computed x value. The value of g (x) reported gives
an idea of feasibility/infeasibility over 2001 equally spaced points in Ω, which
is a much finer partition than that used for the quadrature formula. A ‘∗’
in these columns indicates that the solution found is well within the feasible
region. Finally, λ is the Lagrange multiplier for the constraint. Note, in par-
ticular, the good failure record of F = 0 for all iterations, leaving no doubt
that a solution has been found.
ε τ ω̄ g f x1 x2 x3 nf F λ
−1 −2 −2
10 10 4.095 −2.9×10 0.2223351 28.23895 22.40785 19.51560 25 0 0.0
−2 −3 −3
10 10 5.655 1.2×10 0.1745032 17.03023 45.46166 34.69278 48 0 −0.440
−4 −3
5×10 5.655 −1.4×10 0.1747744 16.97336 45.46475 34.59242 5 0 −0.706
−3 −5 −6
10 5×10 5.655 5.3×10 0.1746270 17.00433 45.46298 34.64676 11 0 −0.975
−4 −6 −4
10 5×10 5.670 1.8×10 0.1746131 16.75351 45.45822 34.80058 19 0 −0.991
−5 −7 −5
10 5×10 5.670 7.9×10 0.1746124 16.75367 45.45822 34.80085 8 0 −0.991
−6 −8 −5
10 5×10 5.670 2.7×10 0.1746123 16.75368 45.45823 34.80088 6 0 −0.990
−7 −9 −6
10 5×10 5.670 8.8×10 0.1746123 16.75368 45.45823 34.80088 6 0 −0.990
τ nq ω̄ g f x1 x2 x3 nf F λ
−1 −2 −2
10 10 8 3.720 −8.5×10 0.2417111 26.28261 20.49562 16.42453 13 0 0.0
−2 −3
10 10 16 ∗ ∗ 0.1788802 15.94624 31.52512 34.15840 30 0 0.0
−3 −4 −4
10 10 32 5.610 −5.3×10 0.1747647 17.80857 44.56686 34.13024 34 0 −0.325
−4 −5 −4
10 10 64 5.670 1.6×10 0.1746221 16.77723 44.63652 34.76034 28 0 −0.490
−5 −6 −5
10 10 128 5.670 7.5×10 0.1746210 16.77749 44.63660 34.76076 8 0 −0.693
−7 −5
5×10 128 5.670 7.4×10 0.1746214 16.77740 44.63670 34.76061 6 0 −0.980
−6 −8 −5
10 5×10 256 5.655 2.4×10 0.1746128 16.82795 45.48089 34.75696 18 0 −1.394
−8 −5
2.5×10 256 5.655 2.4×10 0.1746129 16.82788 45.48090 34.75699 5 0 −1.969
−7 −9 −6
10 2.5×10 512 5.655 6.3×10 0.1746168 16.96054 45.48194 34.67752 29 0 −1.901
We assume once more that Assumptions 4.3.1, 4.3.2 and 4.3.4 are satisfied.
The continuous inequality constraints (4.3.1c) are equivalent to
Gj (x) = 0, j = 1, 2, . . . , m. (4.3.29)
◦
Let Θ, Θ̊, F and F be as defined in (4.3.4), (4.3.6), (4.3.5) and (4.3.7),
respectively. We further assume that Assumptions 4.3.3, 4.3.5, and 4.3.6 are
satisfied.
Note that, for each j = 1, 2, . . . , m, Gj (x) is, in general, nonsmooth
in x. Consequently, standard optimization routines would have difficulties
with this type of equality constraints. The smoothing technique is to replace
max{φj (x, t), 0} with φj,ε (x, t), where
⎧
⎨ 0, if φj (x, t) < −ε,
φj,ε (x, t) = (φj (x, t) + ε)2 /4ε, if − ε ≤ φj (x, t) ≤ ε, (4.3.30)
⎩
φj (x, t), if φj (x, t) > ε.
Fε = {x ∈ Θ : Gj,ε (x) = 0, j = 1, 2, . . . , m}
= {x ∈ Θ : φj (x, t) ≤ −ε, ∀ t ∈ [0, T ], j = 1, 2, . . . , m}. (4.3.32)
∗
m
∗
m
f xε,γ + γ Gj,ε xε,γ ≤ f (x) + γ Gj,ε (x), (4.3.34)
j=1 j=1
Gj,ε (xε ) = 0
m
γ Gj,ε x∗ε,γ ≤ f (xε ) − f (x̄). (4.3.36)
j=1
m
z
Gj,ε x∗ε,γ ≤ .
j=1
γ
By Theorem 4.3.3, we recall that for any ε > 0, there exists a τ (ε) such that
for all 0 < τ < τ (ε), if Gj,ε (x) < τ , then
m x ∈ F. Thus, by choosing γ(ε) ≥
z/τ (ε), it follows that for all γ > γ(ε), j=1 Gj,ε x∗ε,γ < τ . Consequently,
Gj,ε x∗ε,γ < τ , j = 1, 2, . . . , m, and hence x∗ε,γ ∈ F. This completes the
proof.
for all α ∈ (0, 1]. Now, for any δ1 > 0, there exists an α1 ∈ (0, 1] such that
m
m
f x∗ε,γ + γ Gj,ε x∗ε,γ ≤ f (xα2 ) + γ Gj,ε (xα2 ) = f (xα2 ).
j=1 j=1
Combining (4.3.38) with (4.3.37) and remembering that x∗ε,γ is feasible for
Problem (4.3.1), we obtain
f (x∗ ) ≤ f x∗ε,γ ≤ f (x∗ ) + δ1 .
Based on the results presented in Theorems 4.3.5 and 4.3.6, we now propose
the following algorithm for solving Problem (4.3.1). Essentially, the idea of
algorithm is to reduce ε from an initial value to a suitably small value in a
number of stages. The value of γ is adjusted at each stage to ensure that the
solution obtained remains feasible. We only use a simple updating scheme
here.
Algorithm 4.3.2 (Note that the integral in (4.3.31) is replaced by any suit-
able quadrature with positive weights.)
Step 1. Choose ε = 10−1 , γ = 1 and a starting point x ∈ Θ.
Step 4. Set ε = ε/10. If ε > 10−7 , go to Step 2, using x∗ε,γ as the next
starting point. Else, we have a successful exit.
Remark 4.3.6 Using Theorem 4.3.5, we see that the doubling of γ in Step 3
needs only to be carried out a finite number of times. Consequently, the algo-
rithm produces a sequence of suboptimal parameter vectors to Problem (4.3.1),
and each of these is in the feasible region of Problem (4.3.1). Convergence of
the cost is assured by Theorem 4.3.6. Since Θ is a compact set, we know that
the above sequence has an accumulation point. Furthermore, each accumula-
tion point is a solution of Problem (4.3.1).
Remark 4.3.7 Note also that for each ε > 0 and γ > 0, Problem (4.3.33)
is constructed using the concept of penalty functions. However, due to the
special structure of the penalty function used, the penalty weighting factor
γ does not need to go to infinity. In [259], the method was thus referred to
as an exact penalty function method. However, if the solution obtained for
Problem (4.3.33) is a local solution, it is not known whether or not it is a
local solution of the original problem (4.3.1). Thus, to avoid confusion, the
phrase exact penalty function method will be used to refer to the method
introduced in Section 4.4.
102 4 Optimization Problems Subject to Continuous Inequality Constraints
Example 4.3.2 We solve Example 4.3.1 once more with the proposed
penalty method. The integral in (4.3.31) is calculated via the trapezoidal
rule with 256 quadrature points. The optimization routine used is based on
the quasi-Newton method (see Chapter 2). The required accuracy asked of
the optimization routine is max{10−8 , 10−3 ε}.
The results are given in Table 4.3.3. The number of function evaluations
made at each iteration of the algorithm is given in the column nf . The
columns marked ε, γ, f , x1 , x2 and x3 are self-explanatory. Finally we check
the maximum value of g(x) for 2001 equally spaced points of the interval Ω.
This is given in the column marked g, while ω̄ indicates the value of ω at
which this maximum occurred.
It should be noted that, at each iteration, the optimization routine re-
turned with a ‘failure’ parameter of ‘0’. This indicates that the optimum has
been found for each iteration. The algorithm only ensures that φ(x, ω) ≤ 0
for the quadrature points, and this is reflected in the value of g as ε be-
comes small. However, noting that the interval between quadrature points is
of the order of 10−1 , it is quite reasonable to expect a constraint violation
between quadrature points of the order of 10−4 . An encouraging point to note
about the algorithm is that it tends to keep the solution of the approximate
problems more to the inside of the feasible region.
ε γ ω̄ g f x1 x2 x3 nf
−1 −2
10 1.0 5.48 −6.7×10 0.1822723 16.18185 48.86607 32.07144 21
−2 −2
10 1.0 5.48 −6.7×10 0.1822723 16.18185 48.86607 32.07144 1
−3 −4
10 1.0 5.63 6.2×10 0.1747277 17.57267 48.63336 34.44601 11
−3 −5
10 2.0 5.63 −7.3×10 0.1748006 17.55490 48.63305 34.42106 4
−4 −6
10 2.0 5.63 −8.7×10 0.1747937 17.55597 48.63278 34.42370 3
−5 −4
10 2.0 5.69 3.7×10 0.1746830 16.53537 48.11178 35.01657 17
−6 −4
10 2.0 5.69 3.7×10 0.1746830 16.53538 48.11178 35.01658 4
−7 −4
10 2.0 5.69 3.7×10 0.1746830 16.53541 48.11173 35.01657 6
This section basically comes from [300, 301]. Consider the following optimiza-
tion problem with continuous inequality constraints:
subject to
hj (x, t) ≤ 0, ∀t ∈ [0, T ], j = 1, 2, . . . , m. (4.4.2)
4.4 Exact Penalty Function Method 103
subject to
(x, ε) ∈ S0 , (4.4.4b)
where S0 is simply Sε with ε = 0.
We assume that the following conditions are satisfied.
Assumption 4.4.1 There exists a global minimizer of Problem (P ), imply-
ing that f (x) is bounded from below on S0 .
Assumption 4.4.2 The number of distinct local minimum values of the ob-
jective function of Problem (P ) is finite.
Assumption 4.4.3 Let A denote the set of all local minimizers of Problem
(P ). If x∗ ∈ A, then A(x∗ ) = {x ∈ A : f (x) = f (x∗ )} is a compact set.
α and γ are positive real numbers, β > 2 and σ > 0 is a penalty parameter.
We now introduce a surrogate optimization problem as follows:
subject to
(x, ε) ∈ Rn × [0, +∞). (4.4.7b)
Let this problem be referred to as Problem (Pσ ). Intuitively, during the
process of minimizing fσ (x, ε), if σ is increased, εβ should be reduced, mean-
ing that ε should be reduced as β is fixed. Thus ε−α will be increased, and
hence the constraint violation will also be reduced. This means that the value
2
of [max {0, hj (x, t) − εγ Wj }] must go down, eventually leading to the satis-
faction of the continuous inequality constraints, i.e.,
hj (x, t) ≤ 0, ∀ t ∈ [0, T ], j = 1, 2, . . . , m.
m T
∂fσ (x, ε) ∂f (x) ∂hj (x, t)
= + 2ε−α max{0, hj (x, t) − εγ Wj } dt
∂x ∂x j=1 0
∂x
(4.4.8)
and
m T
∂fσ (x, ε)
= − αε−α−1
2
[max {0, hj (x, t) − εγ Wj }] dt
∂ε j=1 0
m T
− 2γεγ−α−1 max {0, hj (x, t) − εγ Wj } Wj dt + σβεβ−1
j=1 0
⎧
⎨ m
T
−α−1 2
=ε −α [max {0, hj (x, t) − εγ Wj }] dt
⎩ 0
j=1
4.4 Exact Penalty Function Method 105
⎫
m
T ⎬
+2γ max {0, hj (x, t) − εγ Wj } (−εγ Wj )dt + σβεβ−1 .
0 ⎭
j=1
(4.4.9)
For every positive integer k, let x(k),∗ , ε(k),∗ be a local minimizer of
Problem (Pσk ). To obtain the main result, we need the following lemma.
Lemma 4.4.1 Let x(k),∗ , ε(k),∗ be a local minimizer of Problem (Pσk ).
(k),∗ (k),∗
Suppose that fσk x ,ε is finite and that ε(k),∗ > 0. Then,
!
x(k),∗ , ε(k),∗ ∈/ Sε ,
Suppose that ε(k),∗ → ε∗ = 0. Then, by (4.4.13), we observe that its first term
tends to a finite value, while the last term tends to infinity as σk → +∞,
when k → +∞. This is impossible for the validity of (4.4.13). Thus, ε∗ = 0.
Now, by (4.4.12), we obtain
!α ∂f x(k),∗
(k),∗
ε
∂x
m T * ! !γ + ∂h x(k),∗ , t
j
+2 max 0, hj x (k),∗
,t − ε (k),∗
Wj dt
j=1 0
∂x
= 0. (4.4.14)
4.4 Exact Penalty Function Method 107
Thus,
4
!α ∂f x(k),∗
(k),∗
lim ε
k→+∞ ∂x
5
m
T * ! !γ + ∂h x(k),∗ , t
j
+2 max 0, hj x (k),∗
,t − ε (k),∗
Wj dt
j=1 0
∂x
m T
∂hj (x∗ , t)
=2 max {0, hj (x∗ , t)} dt
j=1 0 ∂x
= 0. (4.4.15)
Proof. The conclusion follows readily from the definition of Δ(x, ε) and the
continuity of hj (x, t).
For the exact penalty function constructed in (4.4.5), we have the following
results.
δ !
Theorem 4.4.2 Assume that hj x(k),∗ , ω = o ε(k),∗ , δ > 0, j =
1, 2, . . . , m. Suppose that γ > α, δ > α, −α − 1 + 2δ > 0 and 2γ − α − 1 > 0.
Then, as ε(k),∗ → ε∗ = 0 and x(k),∗ → x∗ ∈ S0 , it holds that
!
fσk x(k),∗ , ε(k),∗ → f (x∗ ) (4.4.16)
and !
∇(x,ε) fσk x(k),∗ , ε(k),∗ → (∇f (x∗ ), 0). (4.4.17)
m
5
T & * ! !γ +'2 !β
max 0, hj x (k),∗
,t − ε (k),∗
Wj dt + σk ε (k),∗
j=1 0
m 3 &
* γ +'2
T
0
max 0, hj x(k),∗ , t − ε(k),∗ Wj dt
j=1
= f (x∗ ) + lim .
ε(k),∗ →ε∗ =0 (ε(k),∗ )α
x(k),∗ →x∗ ∈S0
(4.4.18)
For the second term of the right hand side of (4.4.18), it is clear from
Lemma 4.4.1 that
m 3 & * (k),∗ (k),∗ γ +'2
T
0
max 0, h j x , t − ε W j dt
j=1
lim α
ε(k),∗ →ε∗ =0 ε(k),∗
x(k),∗ →x∗ ∈S0
T !− α2 ! !γ− α2 2
= lim ε(k),∗ hj x(k),∗ , t − ε(k),∗ Wj dt,
ε(k),∗ →ε∗ =0 0
j∈J
x(k),∗ →x∗ ∈S0
(4.4.19)
where
* ! !γ +
J = j ∈ [1, 2, . . . , m] : hj x(k),∗ , t − ε(k),∗ Wj ≥ 0 .
Since γ > α, hj x(k),∗ , t = o (ε(k),∗ )δ and δ > α, we have
T !− α2 ! !γ− α2 2
lim ε(k),∗ hj x(k),∗ , t − ε(k),∗ Wj dt = 0.
ε(k),∗ →ε∗ =0 0
j∈J
x(k),∗ →x∗ ∈S0
(4.4.20)
Combining (4.4.18)–(4.4.20) gives
!
lim fσk x(k),∗ , ε(k),∗ = f (x∗ ). (4.4.21)
ε(k),∗ →ε∗ =0
x(k),∗ →x∗ ∈S0
Similarly, we have
!
lim ∇(x,ε) fσk x(k),∗ , ε(k),∗
ε(k),∗ →ε∗ =0
x(k),∗ →x∗ ∈S0
& ! !'
= lim ∇x fσk x(k),∗ , ε(k),∗ , ∇ε fσk x(k),∗ , ε(k),∗ , (4.4.22)
ε(k),∗ →ε∗ =0
x(k),∗ →x∗ ∈S0
where
4.4 Exact Penalty Function Method 109
!
lim ∇x fσk x(k),∗ , ε(k),∗
ε(k),∗ →ε∗ =0
x(k),∗ →x∗ ∈S0
!−α
∂f x(k),∗
= lim +2 ε(k),∗ ·
ε(k),∗ →ε∗ =0 ∂x
x(k),∗ →x∗ ∈S0
m T&
* ! !γ +' ∂h x(k),∗ , t
j
max 0, hj x (k),∗
,t − ε (k),∗
Wj dt
j=1 0 ∂x
T !−α !
= ∇x f (x∗ ) + lim 2 ε(k),∗ hj x(k),∗ , t
ε(k),∗→ε∗=0
j∈J 0
x(k),∗→x∗∈S0
!γ−α
∂hj x(k),∗, t
− ε (k),∗
Wj dt
∂x
= ∇x f (x∗ ), (4.4.23)
while
!
lim ∇ε fσk x(k),∗ , ε(k),∗
ε(k),∗→ε∗=0
x(k),∗→x∗∈S0
4
!−α−1
= lim ε(k),∗ ·
ε(k),∗→ε∗=0
x(k),∗→x∗∈S0
m T& * ! !γ +'2
−α max 0, hj x(k),∗ , t − ε(k),∗ Wj dt
j=1 0
m T * ! !γ + !γ !
+2γ max 0, hj x(k),∗ , t − ε(k),∗ Wj − ε(k),∗ Wj dt
j=1 0
5
!β−1
(k),∗
+σk β ε
4
T ! !− α+1
2
= lim −α hj x(k),∗ , t ε(k),∗
ε(k),∗→ε∗=0
j∈J 0
x(k),∗→x∗∈S0
!γ− α+1
2
2 T& ! !γ '
− ε(k),∗ Wj dt +2γ hj x(k),∗ , t − ε(k),∗ Wj ·
j∈J 0
5
!γ ! !−α−1
− ε (k),∗
Wj ε (k),∗
dt
= 0. (4.4.24)
Proof. Let us assume that the conclusion is false. Then, there exists a sub-
sequence of { x(k),∗ , ε(k),∗ }, which is denoted by the original sequence, such
that for any k0 > 0, there exists a k > k0 satisfying ε(k ),∗ = 0. By Theo-
rem 4.4.1, we have
m T
5
* ! !γ + !γ !
+2γ max 0, hj x (k),∗
,t − ε (k),∗
Wj −ε (k),∗
Wj dt
j=1 0
+σk β
= 0. (4.4.25)
This is equivalent to
4 m
!−α−β T & * ! !γ +'2
ε (k),∗
−α max 0, hj x(k),∗ , t − ε(k),∗ Wj dt
j=1 0
m T&
* ! !γ + !γ !
+ 2γ max 0, hj x(k),∗ , t − ε(k),∗ Wj −ε(k),∗ Wj
j=1 0
* ! !γ + !
+ max 0, hj x(k),∗ , t − ε(k),∗ Wj hj x(k),∗ , t
5
* ! !γ + !'
− max 0, hj x (k),∗
,t − ε (k),∗
Wj h j x (k),∗
,t dt +σk β
= 0. (4.4.26)
Note that
* ! !γ + !γ !
max 0, hj x(k),∗ , t − ε(k),∗ Wj −ε(k),∗ Wj
* ! !γ + !
+ max 0, hj x(k),∗ , t − ε(k),∗ Wj hj x(k),∗ , t
* ! !γ +& ! !γ '
= max 0, hj x(k),∗ , t − ε(k),∗ Wj hj x(k),∗ , t − ε(k),∗ Wj
4.4 Exact Penalty Function Method 111
& * ! !γ +'2
= max 0, hj x(k),∗ , t − ε(k),∗ Wj . (4.4.27)
+σk β
m
!−α−β T * ! !γ +
= 2γ ε (k),∗
max 0, hj x(k),∗ , t − ε(k),∗ Wj ·
j=1 0
!
hj x(k),∗ , t dt. (4.4.28)
Define
m
!−α−β T * ! !γ +
y k = ε(k),∗ max 0, hj x(k),∗ , t − ε(k),∗ Wj dt
j=1 0
(4.4.29)
and
z k = y k / y k . (4.4.30)
Clearly,
lim z k = |z ∗ | = 1, (4.4.31)
k→+∞
∂f (x(k),∗ ) −α m T * ! !γ +
2 ε(k),∗
∂x
+ max 0, hj x(k),∗, t − ε(k),∗ Wj ·
|y k | |y |
k
j=1 0
(k),∗
∂hj x , t
dt
∂x
= 0. (4.4.32)
[0, T ] is a compact set. Thus, it can be shown that there are constants K̂ and
K, independent of k, such that
∂f x(k),∗
≤ K̂ (4.4.33)
∂x
and
112 4 Optimization Problems Subject to Continuous Inequality Constraints
∂h x(k),∗ , t
j
≤ K, ∀t ∈ [0, T ], j = 1, 2, . . . , m. (4.4.34)
∂x
Note that
1
β
|y k | ε(k),∗
1
= −α−β
m 3 " γ # β
ε(k),∗ T
max 0, hj x(k),∗ , t − ε(k),∗ Wj dt ε(k),∗
0
j=1
1
=
m 3 " # . (4.4.35)
T
max 0, h x (k),∗ , t − ε(k),∗ γ W dt ε (k),∗ −α
0 j j
j=1
Recalling the
! assumption stated in Theorem 4.4.2, we have hj x(k),∗ , t =
δ
o ε(k),∗ , γ > α and δ > α. Thus,
m T * ! !γ + !−α
lim max 0, hj x(k),∗ , t − ε(k),∗ Wj dt ε(k),∗
k→+∞
j=1 0
m T !δ !γ !−α
= lim max 0, o ε(k),∗ − ε(k),∗ Wj dt ε(k),∗
k→+∞
j=1 0
T !δ !−α !γ−δ
m
= lim max 0, o ε(k),∗ ε(k),∗ − ε(k),∗ Wj dt
k→+∞
j=1 0
⎧ ! ⎫
T ⎨ o ε(k),∗
δ
!δ−α !γ−δ ⎬
m
= lim max 0, δ ε (k),∗
− ε (k),∗
Wj dt
k→+∞ 0 ⎩ ε(k),∗ ⎭
j=1
= 0. (4.4.36)
Therefore,
1
β → +∞, k → +∞. (4.4.37)
|y k | ε(k),∗
From (4.4.33) and (4.4.37), it is clear that
∂f (x(k),∗ )
β → +∞, k → +∞.
∂x
(4.4.38)
|y k | ε(k),∗
= 2Kz k , (4.4.39)
where z k is defined by (4.4.30). Clearly, z k = 1. Thus, it follows that 2Kz k
is bounded uniformly with respect to k. This together with (4.4.38) is a
contradiction to (4.4.32). Thus, the proof is complete.
We may now conclude that, under some mild assumptions and the con-
straint qualification condition, when the parameter σ is sufficiently large, a
local minimizer of Problem (Pσ ) is a local minimizer of Problem (P ).
Remark 4.4.1 In Step 3, if ε(k),∗ > ε, we obtain from Lemma 4.4.1 that
x(k),∗ cannot be a feasible point, meaning that the penalty parameter σ may
not be large enough. Thus we need to increase σ. If σk > 108 , but ε(k),∗ > ε∗
still, then we should adjust the value of α, β and γ such that conditions
assumed in the Theorem 4.4.2 are satisfied and go to Step 2.
Remark 4.4.3 Although we have proven that a local minimizer of the exact
penalty function optimization problem (Pσk ) will converge to a local minimizer
of the original problem (P ), in actual computation we need to set a lower
bound ε = 10−9 for ε(k),∗ so as to avoid division by zero.
4.4 Exact Penalty Function Method 115
Example 4.4.1 The following example is taken from [81], and it was also
used for testing the numerical algorithms in [259, 261, 296] and Section 4.3.
In this problem, the objective function
Simpson’s Rule with [T1 , T2 ] = [10−6 , 30] being divided into 3000 equal subin-
tervals is used to evaluate the integral. The value obtained is very accurate.
Also, these discretized points define a dense subset T- of [T1 , T2 ]. We check the
feasibility of the continuous inequality constraint by evaluating the values of
the function h over T-. Results obtained are given in Table 4.4.1.
As we can see that, as the penalty parameter, σ, is increased, the min-
imizer approaches to the boundary of the feasible region. When σ is suffi-
ciently large, we obtain a feasible point. It has the same objective function
value as that obtained in Section 4.3. However, for the minimizer obtained
in Section 4.3, there are some minor violations of the continuous inequality
constraints (4.4.41).
116 4 Optimization Problems Subject to Continuous Inequality Constraints
σ t̄ h̄ f∗ x∗
1
x∗
2
x∗
3
ε
−4
10 5.66 1.1012×10 0.17469205 16.961442 45.496567 34.677990 0.001976
−5
102 5.65 1.3205×10 0.17469506 16.959419 45.496640 34.674199 0.000261
−6
103 5.66 1.31695×10 0.17469547 16.967833 45.495363 34.668668 0.000054
−7
104 5.66 5.839365×10 0.17469569 16.987820 45.498793 34.657147 0.000019
−8
105 5.66 5.070583×10 0.17469635 16.981243 45.497607 34.660896 0.000011
−7
106 5.66 −1.82251×10 0.17469633 16.980628 45.497520 34.661238 0.000003
Simpson’s Rule with the interval [0, π] being divided into 1000 equal subin-
tervals is used to evaluate the integral. These discretized points also form a
dense subset T- of the interval [0, π]. The feasibility check is carried out over T-.
We use Algorithm 4.4.1 with the initial point taken as [x01 , x02 ] = [0.5, 0.5] .
The solution of this problem is [x∗1 , x∗2 ] = [0, 2] with the objective function
value f ∗ = 1. The results of the algorithm are presented in Table 4.4.2.
σ t̄ h̄ f∗ x∗
1
x∗
2
ε
−7 −7
−3
10 1.41 3.9583799×10 1.000002326 4×10
1.9999992 1.62×10
−8 −7
−4
102 1.51 −2.309265×10 1.000000582 1.769×10
1.9999998 3.0310×10
−8 −7 −5
103 1.51 −1.629325×10 1.00000047 1.438×10 1.9999998 9.6×10
Again, Simpson’s Rule with the interval [0, π] being partitioned into 1000
equal subintervals is used to evaluate the corresponding constraint violation
in the exact penalty function. These discretized points also define a dense
subset T- of the interval [0, π], which is to be used for checking the feasibility of
the continuous inequality constraint. Now, by using Algorithm 4.4.1 with the
initial point taken as [x01 , x02 ] = [0.5, 0.5], the results obtained are reported
in Table 4.4.3.
σ t̄ h̄ f∗ x∗
1
x∗
2
ε
By comparing our results with those obtained in [81, 103, 259, 261], it
is observed that the objective values are almost the same. However, for our
minimizer, it is a feasible point, while those obtained in [81, 103, 259, 261]
are not.
4.5 Exercises
N
Fα (x) = fi,α (x) (4.5.1)
i=1
Clearly,
Nδ
Fα,δ (x) − Fα (x) ≤ . (4.5.6)
2
(iv) Let xδ,∗ and x∗ minimize (4.5.1) and (4.5.5), respectively. Show
that
Nδ
0 ≤ Fα,δ (xδ ) − Fα (x∗ ) ≤ . (4.5.7)
2
(v) Let
x∗ = arg min Fα (x) (4.5.8)
and
xδ,∗ = arg min Fα,δ (x). (4.5.9)
If there exists a unique minimizer of Fα (x), show that
4.2. Consider the following optimization problem (see [242]). The cost func-
tion b
J(x) = |F (x, t)| dt (4.5.11a)
a
hj (x) = 0, j = 1, . . . , Ne (4.5.11b)
gj (x) ≤ 0, j = Ne , . . . , N (4.5.11c)
and
αi ≤ xi ≤ βi , i = 1, . . . , n, (4.5.11d)
α = [α1 , . . . , αn ] , β = [β1 , . . . , βn ]
Show that
(i) For each t ∈ [a, b], Fε (x, t) is continuously differentiable with re-
spect to x.
(ii) Fε (x, t) ≥ |F (x, t)| for each (x, t) ∈ Rn × [a, b].
(iii) For each (x, t) ∈ Rn × [a, b], |Fε (x,t) − |F (x, t)|| ≤ 4ε .
(iv) For each t ∈ [a, b], x minimizes |F (x, t)| if and only if it mini-
mizes Fε (x, t). With |F (x, t)| approximated by Fε (x, t), the cost
function (4.5.11a) becomes
b
Jε (x) = Fε (x, t)dt. (4.5.13)
a
subject to
hi (x) = 0, i = 1, 2, . . . , m (4.5.14b)
hi (x) ≤ 0, i = m + 1, 2, . . . , N, (4.5.14c)
5.1 Introduction
where x ∈ Rn and u ∈ Rr are the state and control vectors, respectively, and
x(0) ∈ Rn is a given initial state. Furthermore, f (i, ·, ·) : Rn × Rr is a given
function. For each i = 0, 1, . . . , N − 1,
u(i) ∈ Ui , (5.2.3)
subject to
Proof. Suppose that the conclusion is false. Then, there exists a control
5.2 Dynamic Programming Approach 123
{- - (j + 1), . . . , u
u(j), u - (N − 1)}
x(j) = x∗ (j), x
{- -(j + 1), . . . , x
-(N )}
such that
N −1
N −1
Φ0 (-
x(N )) + L0 (i, x - (i)) < Φ0 (x∗ (N )) +
-(i), u L0 (i, x∗ (i), u∗ (i)).
i=j i=j
(5.2.9)
Now construct the control {6 6 (k + 1), . . . , u
u(k), u 6 (N − 1)} as follows:
u∗ (i), i = k, . . . , j − 1,
6 (i) =
u (5.2.10)
- (i), i = j, . . . , N.
u
x∗ (i), i = k, . . . , j,
6(i) =
x (5.2.11)
-(i), i = j + 1, . . . , N.
x
N −1
Φ0 (6
x(N )) + 6(i), u
L0 (i, x 6 (i))
i=k
j−1
N −1
=Φ0 (-
x(N )) + L0 (i, x∗ (i), u∗ (i)) + -(i), u
L0 (i, x - (i))
i=k i=j
N −1
<Φ0 (x∗ (N )) + L0 (i, x∗ (i), u∗ (i)) (5.2.12)
i=k
Equation (5.2.12) shows that {u∗ (k), . . . , u∗ (N − 1)} is not an optimal con-
trol for Problem (Pk,ξ ) which is a contradiction to the hypothesis of the
theorem. Hence the proof is complete.
To continue, we assume that an optimal solution to (Pk,ξ ) exists for each
k, 0 ≤ k ≤ N − 1, and for each ξ ∈ Rn . Let V (k, ξ) be the minimum value of
the cost functional (5.2.6) for Problem (Pk,ξ ). It is called the value function.
Theorem 5.2.2 The value function V satisfies the following backward re-
cursive relation. For any k, 0 ≤ k ≤ N − 1,
Remark 5.2.1 During the process of finding the minimum in the recursive
equation (5.2.13)–(5.2.14), we obtain the optimal control v in feedback form.
This optimal feedback control can be used for all initial conditions. However,
Dynamic Programming requires us to compute and store the values of V and v
for all k and x. Therefore, unless we can find a closed-form analytic solution
to (5.2.13)–(5.2.14), the Dynamic Programming formulation will inevitably
lead to an enormous amount of storage and computation. This phenomena
is known as the curse of dimensionality. It seriously hinders the applicability
of Dynamic Programming to problems where we cannot obtain a closed-form
analytic solution to (5.2.13)-(5.2.14).
For illustration, let us look at a simple example, where all the functions
involved are scalar.
Example 5.2.1 Minimize
N −1 * +
2 2
(x(N ))2 + (x(i)) + (u(i)) (5.2.15)
i=1
subject to
Solution Let V (k, ξ) be the value function for the corresponding Problem
(Pk,ξ ) starting from the arbitrary state ξ at step k. By Theorem 5.2.2, we
have
and
V (N, ξ) = ξ 2 , (5.2.19)
where V (N, ξ) denotes the value function for the minimization problem start-
ing from the arbitrary state ξ at step k = N .
For k = N − 1, we start from the arbitrary state ξ at step N − 1. If we
denote u(N − 1) = u,
" #
V (N − 1, ξ) = min ξ 2 + u2 + V (N, ξ + u)
u∈R
" #
= min ξ 2 + u2 + (ξ + u)2 . (5.2.20)
u∈R
3
u∗ (N − 2) = − ξ. (5.2.23)
5
The corresponding value function is
2 2
3 3 3 34 2 6 2 8
V (N − 2, ξ) = ξ 2 + − ξ + ξ− ξ = ξ + ξ = ξ 2 . (5.2.24)
5 2 5 25 25 5
V (k, ξ) = ck ξ 2 , (5.2.25)
for some ck > 0. Let us show that this guess is correct. We begin by assuming
the form of (5.2.25). Then, by (5.2.18), we have
This implies that the optimal control for step k starting at state ξ is
ck+1
u∗ (k) = − ξ. (5.2.28)
1 + ck+1
V (k, ξ) = ck ξ 2 , (5.2.29)
with
1 + 2ck+1
ck = , and cN = 1. (5.2.30)
1 + ck+1
5.2 Dynamic Programming Approach 127
c3 = 1,
1 + 2c3 1 + 2(1) 3
c2 = = = ,
1 + c3 1+1 2
1 + 2c2 1 + 2( 32 ) 8
c1 = = 3 = ,
1 + c2 1+ 2 5
1 + 2c1 1 + 2( 85 ) 21
c0 = = = .
1 + c1 1 + 85 13
Now, starting with the given initial condition x(0) = x(0) , we obtain
8
c1 8
u∗ (0) = − x∗ (0) = − 5 8 x(0) = − x(0) .
1 + c1 1+ 5 13
Then
8 (0) 5 (0)
x∗ (1) = x∗ (0) + u∗ (0) = x(0) − x = x ,
13 13
and
3
c2
u∗ (1) = − x∗ (1) = − 2 3 x∗ (1)
1 + c2 1+ 3
3 ∗ 3 5 3
= − x (1) = − × x(0) = − x(0) .
5 5 13 13
Next,
5 (0) 3 2 (0)
x∗ (2) = x∗ (1) + u∗ (1) = x − x(0) = x ,
13 13 13
128 5 Discrete Time Optimal Control Problems
and
c3 1 1 1 2 1
u∗ (2) = − x∗ (2) = − x∗ (2) = − x∗ (2) = − × x(0) = − x(0) .
1 + c3 1+1 2 2 13 13
Finally,
2 (0) 1 1 (0)
x∗ (3) = x∗ (2) + u∗ (2) = x − x(0) = x .
13 13 13
Using (5.2.29), the optimal cost can be determined in terms of the initial
condition, i.e.,
!2 21 (0) !2
V (0, x(0) ) = c0 x(0) = x .
13
Remark 5.2.2 The following general discrete time linear quadratic regulator
(LQR) problem can be solved in a similar way.
−1
N
" #
minimize (x(k)) Qx(k) + (u(k)) Ru(k)
i=1
subject to x(k + 1) = Ax(k) + Bu(k), k = 0, 1, . . . , M
x(0) = x(0) ,
u(i) ∈ Ui , (5.2.35)
N −1
min L(x(i), u(i)) (5.2.36)
{u(0),...,u(N −1)}∈U0
i=0
N −1
min L(x(i), u(i)) (5.2.37)
{u(k),...,u(N −1)}∈Uk
i=k
subject to
with
x(k) = ξ (5.2.39)
and
x(N ) = x̂. (5.2.40)
We denote this as Problem (P-k,ξ ). The value function for Problem (P-k,ξ ) is the
minimum value of the cost functional (5.2.37) and denoted again by V (k, ξ).
The following two results are the equivalent of Theorems 5.2.1 and 5.2.2. The
proofs are almost identical and hence omitted.
Theorem 5.2.3 Suppose that {u∗ (k), . . . , u∗ (N − 1)} is an optimal control
for Problem (P-k,ξ ), and that {x∗ (k) = ξ, x∗ (k + 1), . . . , x∗ (N ) = x -} is the
corresponding optimal trajectory. Then, for any j, k ≤ j ≤ N − 1,
{u∗ (j), . . . , u∗ (N − 1)} is an optimal control for Problem (P-j,x∗ (j) ).
Theorem 5.2.4 The value function V satisfies the following backward re-
cursive relation. For k, 0 ≤ k ≤ N − 2,
subject to
-.
x(N ) = f (ξ, u) = x (5.2.43)
Let us consider a simple example.
130 5 Discrete Time Optimal Control Problems
−1
N
" #
(x(i))2 + (u(i))2
i=0
subject to
subject to
x(N ) = f (ξ, u(N − 1)) = ξ + u(N − 1) = 0. (5.2.46)
This is the basic dynamic programming backward recursive relation. For il-
lustration, let us first consider the case N = 3. For i = N − 1 = 2, we
have " #
V (2, ξ) = min ξ 2 + u2
u∈R
subject to
x(3) = ξ + u = 0.
This can only be satisfied with u∗ (2) = −ξ which, in turn, leads to
V (2, ξ) = 2ξ 2 .
V (k, ξ) = ck ξ 2 ,
13 2
V (0, ξ) = c0 ξ 2 = ξ ,
8
132 5 Discrete Time Optimal Control Problems
(0) 2
or, remembering that x(0) = x(0) , V 0, x(0) = 13
8 x . The corresponding
optimal controls and states can then be calculated as follows:
c1 5 (0) ! 3
u∗ (0) = − x(0) = − x ⇒ x∗ (1) = x(0) + u∗ (0) = x(0) ,
1 + c1 8 8
c 2 1 1
u∗ (1) = − x∗ (1) = − x∗ (1) ⇒ x∗ (2) = x∗ (1) + u∗ (1) = x∗ (1) = x(0) ,
2
1 + c2 3 3 8
u∗ (2) = −x∗ (2) ⇒ x∗ (3) = x∗ (2) + u∗ (2) = 0,
where the latter equations result from the first step of our analysis. Note
that, once again, it is possible to express the optimal control in a feedback
form.
N
xtj = 1, t = 1, . . . , T. (5.2.48)
j=1
Moreover, it is assumed that short selling of the risky assets is not allowed
at any time. Hence, we have
5.2 Dynamic Programming Approach 133
xtj ≥ 0, t = 1, . . . , T, j = 1, . . . , N. (5.2.49)
Let Rtj denote the rate of return of asset Sj for period t. Define Rt =
[Rt1 , . . . , RtN ] . Here, Rtj is assumed to follow a normal distribution with
mean rtj and standard deviation σtj . We further assume that vectors Rt , t =
1, . . . , T , are statistically independent, and the mean E(Rt ) = rt = [rt1 , . . . ,
rtN ] is calculated by averaging the returns over a fixed window of time τ .
Let
1
t−1
rtj = Rji , t = 1, . . . , T, j = 1, . . . , N. (5.2.50)
τ i=t−τ
We assume that in any time period, there are no two distinct assets in the
portfolio that have the same level of expected return as well as standard
deviation, i.e., for any 1 ≤ t ≤ T , there exist no i and j such that i = j,
but rti = rtj , and σti = σtj .
Let Vt denote the total wealth of the investor at the end of period t. Clearly,
we have
Vt = Vt−1 1 + Rt xt , t = 1, . . . , T, (5.2.51)
with V0 = M0 .
First recall the definition of probabilistic risk measure, which was intro-
duced in [233] for the single-period probabilistic risk measure.
where θ is a constant to adjust the risk level, and ε denotes the average risk
of the entire portfolio, which is calibrated by the function below.
1
N
ε= σj . (5.2.53)
N j=1
The whole idea of this risk measure in (5.2.52) is to locate the single asset
with greatest deviation in the portfolio. With this ‘biggest risk’ mitigated,
the risk of the whole portfolio can be substantially reduced as well. For multi-
period portfolio optimization, the single ‘biggest risk’ should be selected over
all risky assets and over the entire planning horizon. Thus, we define
where x = [x
1 , . . . , xT ] , θ is the same as defined in (5.2.52) to be a constant
adjusting the risk level, and ε denotes the average risk of the entire portfolio
over T periods, which is calibrated by
1
T N
ε= σtj . (5.2.55)
T × N t=1 j=1
134 5 Discrete Time Optimal Control Problems
Assume that the investor is rational and risk-averse, who wants to maximize
the terminal wealth as well as minimize the risk in the investment. Thus, the
portfolio selection problem can be formulated as a bi-criteria optimization
problem stated as follows:
!
max min min f (xtj ), E(VT ) , (5.2.56a)
1≤t≤T 1≤j≤N
s.t. Vt = Vt−1 1 + Rt xt , t = 1, . . . , T, (5.2.56b)
N
xtj = 1, t = 1, . . . , T, (5.2.56c)
j=1
xtj ≥ 0, t = 1, . . . , T, j = 1, . . . , N, (5.2.56d)
min min f (x∗tj ) ≤ min min f (x̃tj ), and E(VT∗ |x∗ ) ≤ E(VT∗ |x̃),
1≤t≤T 1≤j≤N 1≤t≤T 1≤j≤N
xtj ≥ 0, t = 1, . . . , T, j = 1, . . . , N, (5.2.57e)
Recall that in Section 5.2.1.1, the return, Rtj , of asset Sj in period t, follows
the normal distribution with mean rtj and standard deviation σtj . It is clear
that Rtj − rtj is a linear mixture of normal distributions with mean 0 and
standard deviation σtj .
Let qtj = Rtj − rtj . Then it follows that
* θε +
f (xtj ) = Pr{ |Rtj xtj − rtj xtj | ≤ θε } = Pr |Rtj − rtj | ≤
xtj
xθε 4 5
2
tj 1 qtj
=2 √ exp − 2 dqtj . (5.2.58)
0 2πσtj 2σtj
N
xtj = 1, t = k, . . . , T, (5.2.61d)
j=1
0 ≤ xT j ≤ UT j , j = 1, . . . , N. (5.2.65c)
The optimal solution to this linear programming problem has been obtained
in [233]. The result is quoted in the following as a lemma:
5.2 Dynamic Programming Approach 137
n−1
n
UT j < 1, and UT j ≥ 1,
j=1 j=1
and
⎧
⎪
⎪ UT j ,n−1 j=1,. . . ,n-1,
⎪
⎨
x∗T j = 1− UT j , j=n, (5.2.66)
⎪
⎪
⎪
⎩ j=1
0, j >n,
Since x∗T is solved in (5.2.65) and known, the above problem is in the
same form as (5.2.65). Thus, it can be solved in a similar manner as that
for (5.2.66). Details are given below as a lemma.
Lemma 5.2.2 Let the assets be sort in such an order that r(T −1)1 ≥
r(T −1)2 ≥ · · · ≥ r(T −1)N . Then, there exists an integer n ≤ N such that
138 5 Discrete Time Optimal Control Problems
n−1
n
U(T −1)j < 1, and U(T −1)j ≥ 1,
j=1 j=1
and
⎧
⎪
⎪ U(T −1)j , j=1,. . . ,n-1,
⎪
⎨
n−1
x∗(T −1)j = 1− U(T −1)j , j=n, (5.2.69)
⎪
⎪
⎪
⎩ j=1
0, j >n,
n−1
n
Ukj < 1, and Ukj ≥ 1,
j=1 j=1
and
⎧
⎪
⎪ Ukj ,n−1 j=1,. . . ,n-1,
⎪
⎨
x∗kj = 1− Ukj , j=n, (5.2.71)
⎪
⎪
⎪
⎩ j=1
0, j >n,
where Ukj is defined as in (5.2.58) and (5.2.59). Thus, with x∗k = [x∗k1 , . . . ,
x∗kN ] , {x∗1 , . . . , x∗T } is an optimal solution to problem (5.2.60).
5.2 Dynamic Programming Approach 139
We use daily return data of stocks on the ASX100 dated from 01/01/2007
to 30/11/2011. The portfolio takes effect from 01/01/2009 and is open for
trading for 3 years. 3 years is chosen since professionally managed portfolios
(e.g., Aberdeen Asset Management, JBWere) usually list the average holding
period as 3–5 years. At the beginning of each month, the funds in the portfolio
are allocated based on the updated asset return data. We use 2 years historical
data to decide the portfolio allocation each month. For the first portfolio
allocation on 01/01/2009, the return data from 01/01/2007 to 31/12/2008 is
used to evaluate the expected return and standard deviation of each stock.
In the month following, the return data from 01/02/2007 to 31/01/2009 is
used to evaluate the updated expected return and standard deviation, and
this goes on.
The formulation of the corresponding portfolio optimization problem is as
defined in Section 5.2.1.1. Assume the investor has an initial wealth of M0 =
1,000,000 dollars. There are N = 100 stocks to choose from for a portfolio
of investment holding period of T = 36 months. The average risk of the
portfolio, ε, over T periods is calculated as in (5.2.55). Table 5.2.1 shows
the portfolio returns for various combination of θ (risk adjusting parameter),
and y (lower bound of the probabilistic constraint). By changing the value of θ
and/or y, the investor is able to alter the portfolio composition to cater for
different risk tolerance levels. The lower the value of θ, the more diversified
the portfolio can be, while the lower the value of y, the less stringent the
probabilistic constraint.
From Table 5.2.1, it can be seen that the expected portfolio return in-
creases when θ increases. Similarly, the expected portfolio return increases
when y decreases. This makes sense since when θ increases, the portfolio
selection consists of a much smaller number of selected ‘better-performing’
stocks. When y is lower, the risk is higher and hence the return is generally
expected to be higher.
The historical price index value of ASX100, composed of 100 large-cap and
mid-cap stocks, was 3067.90 and 3329.40 at the end of December 2008 and at
the end of December 2011, respectively. This translates to a return of 8.52%
for a portfolio which comprises the entire stock selection of ASX100.
From Table 5.2.1, it can be seen that when θ > 0.5, the expected returns
following our portfolio selection criteria outperforms the market index return.
The multi-period model outperforms the passive single-period model with a
period of 3 years. Table 5.2.2 shows the expected returns using the single-
period model with a period of 3 years.
When θ = 0.1 and y = 0.95, solving the problem with Theorem 5.2.5
suggests a total wealth in portfolio of 1,043,223.89 dollars at the end of the 3
year investment, which is a return of 4.32%. Comparing this with the result of
the single-period investment strategy, the multi-period solution outperforms
140 5 Discrete Time Optimal Control Problems
it by more than one fold (the single-period portfolio has a total return of
about 2%).
dition (5.3.1b) is satisfied. This discrete time function is called the solution
of the system (5.3.1) corresponding to the control u ∈ U .
We now consider the following class of discrete time optimal control prob-
lems in canonical formulation:
M −1
g0 (u) = Φ0 (x(M | u)) + L0 (k, x(k | u), u(k)) (5.3.3)
k=0
gi (u) = 0, i = 1, 2, . . . , Ne (5.3.4a)
gi (u) ≤ 0, i = Ne + 1, . . . , N , (5.3.4b)
where
M −1
gi (u) = Φi (x(M | u)) + Li (k, x(k | u), u(k)). (5.3.5)
k=0
In this section, our aim is to derive gradient formulae for the cost as well as
the constraint functionals. Define
$ %
u = (u(0)) , (u(1)) , . . . , (u(M − 1)) . (5.3.6)
Let the control vector u be perturbed by εû, where ε > 0 is a small real
number and û is an arbitrary but fixed perturbation of u given by
$ %
û = (û(0)) , (û(1)) , . . . , (û(M − 1)) . (5.3.7)
142 5 Discrete Time Optimal Control Problems
Then, we have
uε = u + εû = [(u(0, ε)) , (u(1, ε)) , . . . , (u(M − 1, ε)) ] , (5.3.8)
where
u(k, ε) = u(k) + εû(k), k = 0, 1, . . . , M − 1. (5.3.9)
Consequently, the state of the system will be perturbed, and so are the cost
and constraint functionals.
Define
x(k, ε) = x(k | uε ), k = 1, 2, . . . , M. (5.3.10)
Then,
x(k + 1, ε) = f (k, x(k, ε), u(k, ε)). (5.3.11)
The variation of the state for k = 0, 1, . . . , M − 1 is
dx(k + 1, ε)
x(k + 1) =
dε ε=0
∂f (k, x(k), u(k)) ∂f (k, x(k), u(k))
= x(k) + û(k) (5.3.12a)
∂x(k) ∂u(k)
with
x(0) = 0. (5.3.12b)
For the i−th functional (i = 0 denotes the cost functional), we have
∂gi (u) gi (uε ) − gi (u) dgi (uε )
û = lim ≡
∂u ε→0 ε dε ε=0
M −1
∂Φi (x(M )) ∂Li (k, x(k), u(k))
= x(M ) + x(k)
∂x(M ) ∂x(k)
k=0
∂Li (k, x(k), u(k))
+ û(k) . (5.3.13)
∂u(k)
Hi (k, x(k), u(k), λi (k + 1)) = Li (k, x(k), u(k)) + (λi (k + 1)) f (k, x(k), u(k)),
(5.3.14)
By (5.3.12), we have
∂gi (u)
û
∂u
∂Hi (0, x(0), u(0), λi (1)) ∂Hi (M −1, x(M −1), u(M −1), λi (M ))
= ,..., û.
∂u(0) ∂u(M −1)
∂gi (u)
∂u
∂Hi (0, x(0), u(0), λi (1)) ∂Hi (M −1, x(M −1), u(M −1), λi (M ))
= ,..., .
∂u(0) ∂u(M −1)
(5.3.18)
is given by (5.3.18).
144 5 Discrete Time Optimal Control Problems
M −1
Li (k, x(k | u), u(k)), i = 0, 1, . . . , N ,
k=0
M −1
gi (u) = Φi (x(M | u)) + Li (k, x(k | u), u(k)), i = 0, 1, . . . , N .
k=0
By using Theorem 5.3.1, we can calculate the gradients of the cost func-
tional and the canonical constraint functionals as stated in the following
algorithm:
5.4 Problems with Terminal and All-Time-Step Inequality Constraints 145
Consider the system (5.3.1), and let U be the class of admissible controls
defined in Section 5.3. Two sets of nonlinear terminal state constraints are
specified as follows:
where

$$\mathcal{M} = \{0, 1, \ldots, M-1\},$$

and h_i, i = 1, \ldots, N_2, are given real-valued functions defined on \mathcal{M} \times \mathbb{R}^n \times \mathbb{R}^r.
If u ∈ U satisfies the constraints (5.4.1) and (5.4.2), then it is called a
feasible control. Let F be the class of all feasible controls.
Consider the following problem, to be referred to as Problem (Q): Given
the system (5.3.1), find a control u ∈ F such that the cost functional
M −1
g0 (u) = Φ0 (x(M | u)) + L0 (k, x(k | u), u(k)) (5.4.3)
k=0
is minimized over F.
The following conditions are assumed throughout this section:
Assumption 5.4.1 For each k = 0, 1, \ldots, M-1, f(k, \cdot, \cdot) satisfies Assumption 5.3.1.
Define

$$\Theta = \left\{u \in U : \Psi_i(x(M \mid u)) = 0,\ i = 1, \ldots, N_0;\ \Psi_i(x(M \mid u)) \le 0,\ i = N_0 + 1, \ldots, N_1\right\} \qquad (5.4.4)$$

and
Note that the terminal constraints (5.4.1) are already in canonical form
(5.3.5). Although the all-time-step inequality constraints (5.4.2) are not in
canonical form, they can be approximated by a sequence of canonical con-
straints via the constraint transcription introduced in Section 4.3. Details are
given in the next section.
Consider Problem (Q) of Section 5.4. Note that for each i = 1, \ldots, N_2, the corresponding all-time-step inequality constraint in (5.4.2) is equivalent to

$$g_i(u) = \sum_{k=0}^{M-1} \max\{h_i(k, x(k \mid u), u(k)), 0\} = 0. \qquad (5.4.6)$$

For convenience, let Problem (Q) with (5.4.2) replaced by (5.4.6) again be denoted by Problem (Q). Recall the set Θ defined by (5.4.4). Then, it is clear that the set F of feasible controls can also be written as

$$F = \{u \in \Theta : g_i(u) = 0,\ i = 1, \ldots, N_2\}. \qquad (5.4.7)$$
Each g_i in (5.4.6) is nonsmooth. Following the constraint transcription of Section 4.3, the term max{·, 0} is replaced by a smooth approximation L_{i,ε}, and the approximate functional is defined by

$$g_{i,\varepsilon}(u) = \sum_{k=0}^{M-1} L_{i,\varepsilon}(k, x(k \mid u), u(k)). \qquad (5.4.9)$$
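The smoothing function L_{i,ε} itself is introduced in Section 4.3 and is not repeated here; a sketch using one common piecewise-quadratic choice (our assumption, patterned on the constraint transcription idea) is:

```python
import numpy as np

def smooth_max0(h, eps):
    # Piecewise-quadratic approximation of max{h, 0}; it agrees with
    # max{h, 0} for |h| > eps and deviates by at most eps/4 in between.
    h = np.asarray(h, dtype=float)
    return np.where(h < -eps, 0.0,
                    np.where(h > eps, h, (h + eps) ** 2 / (4.0 * eps)))

def g_i_eps(h_values, eps):
    # g_{i,eps}(u): the smoothed constraint violation summed over k.
    return float(np.sum(smooth_max0(h_values, eps)))
```

The maximal deviation of ε/4 between the smoothed and the exact term is what motivates the tolerance −ε/4 + g_{i,ε}(u) ≤ 0 appearing in the approximate constraints.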
and hence,

$$g_{i,\varepsilon}(u^\varepsilon) > \frac{M\varepsilon}{4} > \frac{\varepsilon}{4}.$$

That is,

$$-\frac{\varepsilon}{4} + g_{i,\varepsilon}(u^\varepsilon) > 0.$$

This is a contradiction to the constraints specified in (5.4.10), and thus the proof is complete.
For each ε > 0, Problem (Qε ) can be regarded as a nonlinear mathe-
matical programming problem. Since the constraints appearing in (5.4.11)
and (5.4.12) are in canonical form, their gradient formulae as well as that of
g0 (u) can be readily computed as explained in Section 5.3.2. Hence, Problem
(Qε ) can be solved by any efficient optimization technique, such as the SQP
technique presented in Chapter 3. In view of Lemma 5.4.1, we see that any
feasible control vector of Problem (Qε ) is in F, and hence is a suboptimal
control vector for Problem (Q). Thus, by adjusting ε > 0 in such a way that
ε → 0, we obtain a sequence of approximate problems (Qε ), each being solved
as a nonlinear mathematical programming problem. We shall now investigate
certain convergence properties of this approximation scheme.
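As a sketch of a single such solve (assuming `cost`, `cost_grad` and the smoothed constraint functionals are available, e.g., built from the routines sketched earlier), SciPy's SLSQP, an SQP-type method in the spirit of Chapter 3, can be used:

```python
import numpy as np
from scipy.optimize import minimize

def solve_Q_eps(u0, cost, cost_grad, g_eps_list, eps, bounds=None):
    # Each transcribed constraint -eps/4 + g_{i,eps}(u) <= 0 is passed
    # to SciPy in its ">= 0" convention as eps/4 - g_{i,eps}(u) >= 0.
    cons = [{"type": "ineq", "fun": (lambda u, g=g: eps / 4.0 - g(u))}
            for g in g_eps_list]
    res = minimize(cost, u0, jac=cost_grad, bounds=bounds,
                   constraints=cons, method="SLSQP")
    return res.x
```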
The aim of this section is to provide a convergence analysis for the approxi-
mation scheme proposed in the last subsection.
Theorem 5.4.1 Let u* and u*_ε be optimal controls of Problems (Q) and (Q_ε), respectively. Then, there exists a subsequence of {u*_ε}, which is again denoted by the original sequence, and a control ū ∈ F such that

$$\lim_{\varepsilon\to 0} h_i(k, x(k \mid u_\varepsilon^*), u_\varepsilon^*(k)) = h_i(k, x(k \mid \bar u), \bar u(k)), \quad i = 1, \ldots, N_2. \qquad (5.4.18)$$
By Lemma 5.4.1, u∗ε ∈ F for all ε > 0. Thus, it follows from (5.4.17)
and (5.4.18) that ū ∈ F.
Next, by Assumptions 5.4.1 and 5.4.2, we deduce from (5.4.15) and (5.4.16)
that
$$\lim_{\varepsilon\to 0} g_0(u_\varepsilon^*) = g_0(\bar u). \qquad (5.4.19)$$
For any δ1 > 0, Assumption 5.4.6 asserts that there exists a û ∈ F⁰ such that

$$|u^*(k) - \hat u(k)| \le \delta_1, \quad \forall\, k = 0, 1, \ldots, M-1. \qquad (5.4.20)$$
By Assumption 5.4.1 and induction, we can show that, for any ρ1 > 0, there exists a δ1 > 0 such that, for all k = 0, 1, \ldots, M, the corresponding state trajectories satisfy |x(k | u*) − x(k | û)| ≤ ρ1. Since û ∈ F⁰, we have

$$\hat u \in F_\varepsilon.$$
In this section, three numerical examples are solved to illustrate the proposed
computational method.
Example 5.4.1 This problem concerns the vertical ascent of a rocket. The original formulation (continuous time version) and numerical solution of the problem are taken from [54]. A discrete time version of the control process is obtained by using the Euler scheme to discretize the system equations, where the time step is taken as 1 s as in [54].
where x1(k) is the mass of the rocket; x2(k) is the altitude (km) above the earth's surface; x3(k) is the rocket velocity (km/s); u(k) is the mass flow rate; V = 2 is the constant gas nozzle velocity; g = 0.01 km/s² is the acceleration due to gravity (assumed constant); and Q(x2(k), x3(k)) is the aerodynamic drag defined by the formula:
$$0 \le u(k) \le 0.04, \quad k = 0, 1, \ldots, M-1.$$
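A minimal sketch of this Euler discretization is given below; the thrust term and the drag model are placeholders patterned on the state description above (mass x1, altitude x2, velocity x3, mass flow rate u) and should be replaced by the precise model of [54]:

```python
import numpy as np

V, G, DT = 2.0, 0.01, 1.0     # nozzle velocity, gravity, 1 s Euler step

def drag(x2, x3):
    return 0.0                # placeholder for the drag Q(x2, x3)

def euler_step(x, u):
    x1, x2, x3 = x
    dx = np.array([-u,                                  # mass flow
                   x3,                                  # altitude rate
                   (V * u - drag(x2, x3)) / x1 - G])    # assumed velocity law
    return x + DT * dx
```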
The control objective is to maximize the rocket’s peak altitude by suitable
choice of the mass flow rate. In other words, we want to minimize
g0 = −k1 x2 (M ) (5.4.28)
M      g0             Ψ1                x1(M)    x2(M)
100    −0.36454188    −0.167 × 10⁻¹⁴    0.2      36.45
Example 5.4.2 The original problem is taken from [189] and the same prob-
lem was also considered in [54]. The control process (discretized by the Euler
scheme using the time step h = 0.02) is described by the difference equations:
$$g_0 = (x_1(M))^2 + (x_2(M))^2 + \sum_{k=0}^{M-1}\left[(x_1(k))^2 + (x_2(k))^2 + 0.005\,(u(k))^2\right]$$
M     g0           −ε/4 + g_{i,ε}
50    9.1415507    0.429 × 10⁻⁹
$$g_0 = \sum_{k=1}^{M-1}\left[(x(k))^2 + (u(k))^2\right]$$

subject to the control bounds

$$-1 \le u(k) \le 1,$$
where

$$x = [x_1, \ldots, x_n]^\top \in \mathbb{R}^n, \quad u = [u_1, \ldots, u_r]^\top \in \mathbb{R}^r,$$

are, respectively, the state and control vectors, while f = [f_1, \ldots, f_n]^\top \in \mathbb{R}^n is a given function and h is an integer time-delay with 0 < h < M. Here, we consider the case where there is only one time delay. The extension to the case involving many time-delays is straightforward but is more involved in terms of notation.
The initial functions for the state and control are

$$x(k) = \phi(k), \quad u(k) = \gamma(k), \quad k = -h, -h+1, \ldots, -1, \qquad x(0) = x^0,$$

where

$$\phi(k) = [\phi_1(k), \ldots, \phi_n(k)]^\top, \quad \gamma(k) = [\gamma_1(k), \ldots, \gamma_r(k)]^\top,$$

are given functions from \{-h, -h+1, \ldots, -1\} into \mathbb{R}^n and \mathbb{R}^r, respectively, and x^0 is a given vector in \mathbb{R}^n.
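A sketch of how such a time-delayed system is rolled out in practice (the function names are ours):

```python
import numpy as np

def simulate_delayed(f, x0, phi, gamma, u_seq, h, M):
    """Roll out x(k+1) = f(k, x(k), x(k-h), u(k), u(k-h)) with initial
    state function phi and control function gamma on k = -h, ..., -1."""
    x = {k: np.asarray(phi(k), dtype=float) for k in range(-h, 0)}
    x[0] = np.asarray(x0, dtype=float)
    u = {k: np.asarray(gamma(k), dtype=float) for k in range(-h, 0)}
    u.update({k: np.asarray(u_seq[k], dtype=float) for k in range(M)})
    for k in range(M):
        x[k + 1] = f(k, x[k], x[k - h], u[k], u[k - h])
    return x
```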
5.5.1 Approximation
To begin, we first note that the all-time-step inequality constraints are equivalent to the following equality constraints:

$$g_i(u) = \sum_{k=0}^{M-1} \max\{h_i(k, x(k), u(k)), 0\} = 0, \quad i = 1, \ldots, N_2. \qquad (5.5.5)$$
where ε > 0 is an adjustable constant with small value. Then, the all-time-step inequality constraints (5.5.3) are approximated by the inequality constraints in canonical form defined by

$$-\frac{\varepsilon}{4} + g_{i,\varepsilon}(u) \le 0, \quad i = 1, \ldots, N_2, \qquad (5.5.8)$$

where

$$g_{i,\varepsilon}(u) = \sum_{k=0}^{M-1} L_{i,\varepsilon}(k, x(k), u(k)), \quad i = 1, \ldots, N_2.$$

Define

$$F_\varepsilon = \left\{u(k) \in U,\ k = 0, \ldots, M-1 : -\frac{\varepsilon}{4} + g_{i,\varepsilon}(u) \le 0,\ i = 1, \ldots, N_2\right\}. \qquad (5.5.9)$$
Now, we can define a sequence of approximate problems (Q_ε), where ε > 0, below.

Problem (Q_ε) Problem (Q) with (5.5.5) replaced by

$$G_{i,\varepsilon}(u) = -\frac{\varepsilon}{4} + g_{i,\varepsilon}(u) \le 0, \quad i = 1, \ldots, N_2. \qquad (5.5.10)$$
In Problem (Q_ε), our aim is to find a control u in F_ε such that the cost functional (5.5.4) is minimized over F_ε. For each ε > 0, Problem (Q_ε) is a special case of a general class of discrete time optimal control problems with time-delay and subject to canonical constraints defined below.

Problem (P) Given system (5.5.1a)–(5.5.1c), find an admissible control u ∈ U such that the cost functional

$$g_0(u) = \Phi_0(x(M)) + \sum_{k=0}^{M-1} L_0(k, x(k), x(k-h), u(k), u(k-h)) \qquad (5.5.11)$$

is minimized subject to the canonical constraints

$$g_i(u) = 0, \quad i = 1, 2, \ldots, N_e, \qquad (5.5.12a)$$
$$g_i(u) \le 0, \quad i = N_e + 1, \ldots, N, \qquad (5.5.12b)$$

where

$$g_i(u) = \Phi_i(x(M)) + \sum_{k=0}^{M-1} L_i(k, x(k), x(k-h), u(k), u(k-h)). \qquad (5.5.13)$$
Remark 5.5.1 Under Assumption 5.5.5, it can be shown that for any u in F and δ > 0, there exists a ū ∈ F⁰ such that |u(k) − ū(k)| ≤ δ for all k = 0, 1, \ldots, M − 1.

In what follows, we shall present an algorithm for solving Problem (Q) as a sequence of Problems (Q_ε).
Algorithm 5.5.1
Step 1. Set ε = ε₀.

Remark 5.5.2 ε₀ is usually set to 1.0 × 10⁻²; the algorithm is terminated with a 'successful exit' when ε < 10⁻⁷.
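A sketch of the resulting outer loop (the reduction factor 0.1 is our assumption; the text fixes only ε₀ = 1.0 × 10⁻² and the exit threshold 10⁻⁷):

```python
def solve_Q(u0, solve_Q_eps, eps0=1.0e-2, eps_min=1.0e-7, shrink=0.1):
    u, eps = u0, eps0
    while eps >= eps_min:
        u = solve_Q_eps(u, eps)   # one NLP solve of Problem (Q_eps)
        eps *= shrink             # tighten the approximation
    return u                      # 'successful exit' once eps < eps_min
```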
Theorem 5.5.1 Let u* be an optimal control of Problem (Q) and let u*_ε be an optimal control of Problem (Q_ε). Then,

Theorem 5.5.2 Let u*_ε and u* be optimal controls of Problems (Q_ε) and (Q), respectively. Then, there exists a subsequence of {u*_ε}, which is again denoted by the original sequence, and a control ū ∈ F such that, for each k = 0, 1, \ldots, M − 1,

$$\lim_{\varepsilon\to 0} |u_\varepsilon^*(k) - \bar u(k)| = 0. \qquad (5.5.17)$$
By induction, we can show, by using Assumption 5.5.1 and (5.5.18), that, for each k = 0, 1, \ldots, M,

$$\lim_{\varepsilon\to 0} h_i(k, x(k \mid u_\varepsilon^*), u_\varepsilon^*(k)) = h_i(k, x(k \mid \bar u), \bar u(k)), \quad i = 1, \ldots, N_2. \qquad (5.5.20)$$

By Lemma 5.5.1, u*_ε ∈ F for all ε > 0. Thus, it follows from (5.5.20) that ū ∈ F.
Next, by Assumption 5.5.1, we deduce from (5.5.18) and (5.5.19) that
For any δ1 > 0, it follows from Remark 5.5.1 that there exists a û ∈ F⁰ such that, for each k = 0, 1, \ldots, M − 1,
By Assumption 5.5.1 and induction, we can show that, for any ρ1 > 0, there
exists a δ1 > 0 such that for each k = 0, 1, . . . , M,
û ∈ Fε ,
5.5.2 Gradients

Define the unit step sequence

$$e(k) = \begin{cases} 1, & k \ge 0, \\ 0, & k < 0, \end{cases} \qquad (5.5.30)$$

and let the costate be determined by

$$\lambda^i(k) = \frac{\partial H_i(k)}{\partial x(k)}, \quad k = M-1, M-2, \ldots, 0. \qquad (5.5.32)$$
We set

$$z(k) = 0, \quad \forall\, k = M-h+1, M-h+2, \ldots, M, \qquad (5.5.34)$$

and

$$w(k) = 0, \quad \forall\, k = M-h, M-h+1, \ldots, M. \qquad (5.5.35)$$
Then, the gradient formulae for the cost functional (i = 0) and the constraint functionals (i = 1, \ldots, N) are given in the following theorem.

Theorem 5.5.3 Let g_i(u), i = 0, 1, \ldots, N, be defined by (5.5.11) (the cost functional for i = 0) and (5.5.13) (the constraint functionals for i = 1, \ldots, N). Then, for each i = 0, 1, \ldots, N, the gradient of the functional g_i(u) is given by

where

$$H_i(k) = H_i\big(k, x(k), y(k), z(k), u(k), v(k), w(k), \lambda^i(k+1), \bar\lambda^i(k)\big), \quad k = 0, 1, \ldots, M-1.$$
Proof. Define

$$u = \left[(u(0))^\top, (u(1))^\top, \ldots, (u(M-1))^\top\right]^\top. \qquad (5.5.37)$$

Let the control u be perturbed by εû, where ε > 0 is a small real number and û is an arbitrary but fixed perturbation of u given by

$$\hat u = \left[(\hat u(0))^\top, (\hat u(1))^\top, \ldots, (\hat u(M-1))^\top\right]^\top. \qquad (5.5.38)$$

Then, we have

$$u^\varepsilon = u + \varepsilon\hat u = \left[(u(0,\varepsilon))^\top, (u(1,\varepsilon))^\top, \ldots, (u(M-1,\varepsilon))^\top\right]^\top, \qquad (5.5.39)$$

where

$$u(k,\varepsilon) = u(k) + \varepsilon\hat u(k), \quad k = 0, 1, \ldots, M-1. \qquad (5.5.40)$$
Then,

$$x(k+1,\varepsilon) = f(k, x(k,\varepsilon), y(k,\varepsilon), u(k,\varepsilon), v(k,\varepsilon)), \qquad (5.5.42)$$

where the variations satisfy

$$\Delta x(k) = 0, \quad k \le 0, \qquad (5.5.43b)$$
$$\hat u(k) = 0, \quad k < 0. \qquad (5.5.43c)$$

From (5.5.43b) and (5.5.43c), we obtain

$$\Delta y(k) = 0, \quad k = 0, 1, \ldots, h, \qquad (5.5.44a)$$

and

$$\hat v(k) = 0, \quad k = 0, 1, \ldots, h-1. \qquad (5.5.44b)$$
Then, arguing as in Section 5.3,

$$\frac{\partial g_i(u)}{\partial u}\hat u = \frac{\partial \Phi_i(x(M))}{\partial x(M)}\,\Delta x(M) + \sum_{k=0}^{M-1}\left[\frac{\partial \bar L_i}{\partial x(k)}\,\Delta x(k) + \frac{\partial \bar L_i}{\partial y(k)}\,\Delta y(k) + \frac{\partial \bar L_i}{\partial u(k)}\,\hat u(k) + \frac{\partial \bar L_i}{\partial v(k)}\,\hat v(k)\right]. \qquad (5.5.46)$$

Moreover,

$$\sum_{k=0}^{M-1}\left[\frac{\partial \bar L_i}{\partial y(k)}\,\Delta y(k) + \frac{\partial \bar L_i}{\partial v(k)}\,\hat v(k)\right] = \sum_{k=0}^{M-1} e(M-k-h)\left[\frac{\partial \hat L_i}{\partial x(k)}\,\Delta x(k) + \frac{\partial \hat L_i}{\partial u(k)}\,\hat u(k)\right]. \qquad (5.5.47)$$
Substituting (5.5.47) into (5.5.46), and then using (5.5.32) and (5.5.45a)–(5.5.45e), we obtain

$$\begin{aligned}
\frac{\partial g_i(u)}{\partial u}\hat u = {}& \frac{\partial \Phi_i(x(M))}{\partial x(M)}\,\Delta x(M) + \sum_{k=0}^{M-1}\Bigg[\frac{\partial \bar H_i}{\partial x(k)}\,\Delta x(k) + \frac{\partial \bar H_i}{\partial u(k)}\,\hat u(k)\\
& - (\lambda^i(k+1))^\top\frac{\partial \bar f}{\partial x(k)}\,\Delta x(k) - (\bar\lambda^i(k))^\top\frac{\partial \hat f}{\partial x(k)}\,\Delta x(k)\,e(M-k-h)\\
& - (\lambda^i(k+1))^\top\frac{\partial \bar f}{\partial u(k)}\,\hat u(k) - (\bar\lambda^i(k))^\top\frac{\partial \hat f}{\partial u(k)}\,\hat u(k)\,e(M-k-h)\Bigg]. \qquad (5.5.48)
\end{aligned}$$
Since Δy(k) and v̂(k) vanish for the initial indices, we also have

$$\sum_{k=h}^{M-1} (\lambda^i(k+1))^\top\left[\frac{\partial \bar f}{\partial y(k)}\,\Delta y(k) + \frac{\partial \bar f}{\partial v(k)}\,\hat v(k)\right] = \sum_{k=0}^{M-1} (\lambda^i(k+1))^\top\left[\frac{\partial \bar f}{\partial y(k)}\,\Delta y(k) + \frac{\partial \bar f}{\partial v(k)}\,\hat v(k)\right]. \qquad (5.5.50)$$
where

$$A = \begin{bmatrix} 0.95 & 0 & 0 & 0 & 0\\ 0 & 0.9 & 0 & 0 & 0\\ 0 & 0 & 0.75 & 0 & 0\\ 0 & 0 & 0 & 0.75 & 0\\ 0 & 0 & 0 & 0 & 0.85 \end{bmatrix}, \quad B_0 = \begin{bmatrix} 0 & -1 & -1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -1 & -1 & -1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$

$$B_1 = \begin{bmatrix} 0.95 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0.87 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0.75 & 0 & 0 & 0.7\\ 0 & 0 & 0.8 & 0 & 0 & 0.8 & 0.7 & 0\\ 0 & 0 & 0 & 0.85 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad x^0 = \begin{bmatrix} 3500\\ 800\\ 400\\ 400\\ 200 \end{bmatrix}.$$
The cost functional is

$$G = \frac{1}{2}(x(T))^\top Q\,x(T) + \frac{1}{2}\sum_{t=0}^{T-1}\left[(x(t))^\top Q\,x(t) + (u(t))^\top R\,u(t)\right], \qquad (5.5.56)$$
where

$$Q = \operatorname{diag}(100,\ 2,\ 3,\ 1.5,\ 2.5), \qquad R = \operatorname{diag}(100,\ 5,\ 5,\ 2.5,\ 3,\ 4,\ 2,\ 2).$$
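For reference, the problem data transcribe directly into code; the dynamics x(k+1) = A x(k) + B₀ u(k) + B₁ u(k−h) suggested by the two input matrices are our reading of the model, and the cost function below implements (5.5.56):

```python
import numpy as np

A  = np.diag([0.95, 0.9, 0.75, 0.75, 0.85])
B0 = np.array([[0, -1, -1,  0,  0,  0,  0,  0],
               [0,  0,  0, -1, -1, -1,  0,  0],
               [0,  0,  0,  0,  0,  0, -1,  0],
               [0,  0,  0,  0,  0,  0,  0, -1],
               [0,  0,  0,  0,  0,  0,  0,  0]], dtype=float)
B1 = np.array([[0.95, 0,    0,   0,    0,    0,   0,   0  ],
               [0,    0.87, 0,   0,    0,    0,   0,   0  ],
               [0,    0,    0,   0,    0.75, 0,   0,   0.7],
               [0,    0,    0.8, 0,    0,    0.8, 0.7, 0  ],
               [0,    0,    0,   0.85, 0,    0,   0,   0  ]])
x0 = np.array([3500., 800., 400., 400., 200.])
Q  = np.diag([100., 2., 3., 1.5, 2.5])
R  = np.diag([100., 5., 5., 2.5, 3., 4., 2., 2.])

def cost(xs, us, T):
    # Quadratic cost (5.5.56): terminal term plus running sums.
    G = 0.5 * xs[T] @ Q @ xs[T]
    G += 0.5 * sum(xs[t] @ Q @ xs[t] + us[t] @ R @ us[t] for t in range(T))
    return G
```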
[Fig. 5.5.1: The logistics network. Node 0 supplies Nodes 1, 2 and 4 through eight supply routes; each node carries stock bounds (e.g., Node 1: 250 ≤ x ≤ 700; Node 2: 250 ≤ x ≤ 300; Node 4: 200 ≤ x ≤ 700) and each route carries capacity bounds (e.g., Route 3: 0 ≤ u ≤ 2000; Route 5: 0 ≤ u ≤ 3000; Route 6: 0 ≤ u ≤ 3500; Route 7: 0 ≤ u ≤ 4000; Route 8: 0 ≤ u ≤ 4500).]
[Fig. 5.5.2: Stock dispatched along supply routes 1–3 (controls u1, u2, u3) versus time.]
[Fig. 5.5.3: Stock dispatched along further supply routes versus time (legend lost; presumably controls u4, u5, u6).]
and the all-time-step constraints are satisfied at each time point. However, the all-time-step constraints in [12] are not always satisfied at each time point. From Figures 5.5.2, 5.5.3 and 5.5.4, we see that u1(k) = 0 for k = 0, 1, \ldots, 4, indicating that no stock is dispatched along supply route 1 to Node 1. This is because u1(k) could only contribute extra stock to Node 1 through supply route 1 from Node 0, and the initial stock in Node 1 is already large, twice as large as the stocks in the other nodes.
[Fig. 5.5.4: Stock dispatched along supply routes 7 and 8 (controls u7, u8) versus time.]
[Fig. 5.5.5: Stock of logistic resources at each location (states x1, x2, x3) versus time.]
Thus, it is clear that stock should be moved out of Node 1 to other nodes quickly through supply routes 2 and 3 so as to decrease the cost of holding the stock in Node 1. Also from Figures 5.5.2, 5.5.3 and 5.5.4, we see that u2(k) and u3(k) are very large at k = 0, meaning that a large amount of stock is dispatched from Node 1 to the other nodes at the start of the planning horizon.
[Fig. 5.5.6: Stock of logistic resources at each location (states x4, x5) versus time.]
[Fig. 5.5.7: Stock of logistic resources at location 1 versus time.]
5.6 Exercises
5.6.8 A process is assumed to have the dynamic equation x(i+1) = a u(i), where the integer i counts the years and the growth rate a (a > 0) is a constant. The satisfaction from money spent in any year is described by a function H(i) = H(x(i) − u(i)), and the objective is to maximize the total satisfaction

$$I = \sum_{i=1}^{N} H(i)$$

over N years. Define an optimal return function and write the dynamic programming functional recurrence equation for the process. Take H = (x − u)^{1/2} and find the optimal spending policy
(i) over 2 years,
(ii) over N years.
5.6.9 Same as Exercise 5.6.8, but with H taken as
H = log(x − u).
5.6.10 Consider the cost functional

$$I = 50(x(2))^2 + \sum_{k=0}^{1}(u(k))^2$$

for a two-step process starting from x(0) = 1 with the finishing state x(2) free.
(i) Use dynamic programming to find the optimal controls u(i), which are to be real numbers satisfying |u(i)| ≤ 2.
(ii) Solve the problem in part (i) when the initial state is given by x(0) = 10.
5.6.11
(a) A farm grows wheat as a yearly crop and has available unlimited acreage and free labour. Of the total grain x(i) tonnes available at the start of each year i, u(i) tonnes (0 ≤ u(i) ≤ x(i)) are planted and the remainder x(i) − u(i) is sold during the year. The planted wheat produces a new crop, x(i+1) = a u(i) (a = constant > 1). It is assumed that A tonnes of grain (i.e., x(1) = A) are provided to start the venture, which is to run for 4 years. The desire is to maximize

$$I = \sum_{i=1}^{4}\{x(i) - u(i)\},$$

the total amount of grain sold over the period. Use dynamic programming to find the optimal planting and selling policy.
(b) Use dynamic programming to find the optimal planting and selling policy for the problem in part (a) with the following modifications:
(i) The project is to finish with A tonnes left at the end of the fourth year.
(ii) As in (b)(i), but the amount of land available for cultivation is limited so that no more than aA tonnes can be sown in any year, i.e., 0 ≤ u(i) ≤ aA.
5.6.13 Consider the problem of Exercise 5.6.12. Let u(k) ∈ U for all k = 0, 1, \ldots, M − 1, where U is a compact and convex subset of \mathbb{R}^r. Show that the first order necessary condition for optimality is the same as that given in Exercise 5.6.12, except with (5.6.3) replaced by

$$\sum_{k=0}^{M-1}\frac{\partial H(k, x(k), u(k), \lambda(k+1))}{\partial u(k)}\,(u(k) - u^*(k)) \ge 0.$$
5.6.14 Consider the discrete time process governed by the following system of difference equations:

where x(k) ∈ \mathbb{R}^n, u(k) ∈ \mathbb{R}^r, A(k) ∈ \mathbb{R}^{n\times n}, and B(k) ∈ \mathbb{R}^{n\times r}. The control sequence u(k), k = 0, 1, \ldots, M−1, is to be determined such that the quadratic cost functional

$$g_0(u) = \frac{1}{2}(x(M))^\top Q(M)\,x(M) + \sum_{k=0}^{M-1}\left[\frac{1}{2}(x(k))^\top Q(k)\,x(k) + \frac{1}{2}(u(k))^\top R(k)\,u(k)\right]$$

is minimized.
where λ(k) is the costate vector governed by the following system of difference equations:

(b) Assume that the costate and state vectors are related through the relationship

$$\lambda(k) = S(k)\,x(k),$$

where S(k) = (S(k))^\top is symmetric. Show that S(k) is governed by the discrete matrix Riccati equation:

$$S(k) = Q(k) + (A(k))^\top\left[(S(k+1))^{-1} + B(k)(R(k))^{-1}(B(k))^\top\right]^{-1} A(k), \qquad S(M) = Q(M).$$
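A sketch of the backward sweep implied by this recursion, written with time-invariant matrices for brevity (S(k+1) is assumed invertible at every step):

```python
import numpy as np

def discrete_riccati(A, B, Q, R, QM, M):
    S = QM.copy()                                  # S(M) = Q(M)
    BRB = B @ np.linalg.solve(R, B.T)              # B R^{-1} B^T
    for _ in range(M):
        inner = np.linalg.inv(np.linalg.inv(S) + BRB)
        S = Q + A.T @ inner @ A                    # S(k) from S(k+1)
    return S                                       # S(0)
```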
5.6.15 A combined discrete time optimal control and optimal parameter se-
lection problem is defined in the canonical form as follows:
where
$$g_0(u, \zeta) = \Phi_0(x(M \mid u, \zeta), \zeta) + \sum_{k=0}^{M-1} L_0(k, x(k \mid u, \zeta), u(k), \zeta),$$

subject to

$$g_i(u, \zeta) = 0, \quad i = 1, \ldots, N_e,$$
$$g_i(u, \zeta) \le 0, \quad i = N_e + 1, \ldots, N,$$

where

$$g_i(u, \zeta) = \Phi_i(x(M \mid u, \zeta), \zeta) + \sum_{k=0}^{M-1} L_i(k, x(k \mid u, \zeta), u(k), \zeta),$$
$$g_0(u) = \sum_{k=0}^{M}\left[(x(k))^2 + (u(k))^2\right]$$
Chapter 6
Elements of Optimal Control Theory

6.1 Introduction
There are already many excellent books devoted solely to the detailed exposi-
tion of optimal control theory. We refer the interested reader to [3, 4, 8, 11, 18–
21, 29, 33, 40, 59, 64, 74, 83, 90, 121, 130, 198, 201, 206, 226, 276], just to name
a few. Texts dealing with the optimal control of partial differential equations
include [5, 37, 149, 250]. The aim of this chapter is to give a brief account
of some fundamental optimal control theory results for systems described by
ordinary differential equations.
In the next section, we present the basic formulation of an unconstrained
optimal control problem and derive the first order necessary condition known
as the Euler-Lagrange equations. In Section 6.3, we consider a class of linear
quadratic optimal control problems. This class is important because the prob-
lem can be solved analytically and the optimal control so obtained is in closed
loop form. In Section 6.4, the well-known Pontryagin minimum principle is
briefly discussed. The Maximum Principle is then used to introduce singular
control and time optimal control in Sections 6.5 and 6.6, respectively. In Sec-
tion 6.7, a version of the optimality conditions for optimal control problems
subject to constraints is presented and an illustrative example is solved using
these conditions. To conclude this chapter, Bellman’s dynamic programming
principle is included in Section 6.8. For results on the existence of optimal
controls, we refer the interested reader to [40].
The main references of this chapter are the lecture notes prepared and
used by the authors, [88] and Chapter 4 of [253].
6.2 First Order Necessary Condition: Euler-Lagrange Equations

We shall begin with the simplest optimal control formulation from which we derive the first order necessary conditions for optimality, also known as the Euler-Lagrange equations. More complex classes of optimal control problems will be discussed in later sections. Consider a dynamical system described by the following system of differential equations:

$$\frac{dx(t)}{dt} = f(t, x(t), u(t)), \quad t \in [0, T], \qquad (6.2.1a)$$

with initial condition:

$$x(0) = x^0, \qquad (6.2.1b)$$

where

$$x(t) = [x_1(t), x_2(t), \ldots, x_n(t)]^\top, \quad u(t) = [u_1(t), u_2(t), \ldots, u_m(t)]^\top$$
Note that the appended cost functional ḡ0 is identical to the original g0 if the dynamical constraint is satisfied. The time-dependent Lagrange multiplier λ(t) is referred to as the costate vector. It is also known as the adjoint vector. Substituting (6.2.4) into (6.2.3) and integrating the last term by parts, we have

As the initial condition x(0) is fixed, δx(0) vanishes and (6.2.6) reduces to

$$\delta\bar g_0 = \int_0^T \frac{\partial H(t, x(t), u(t), \lambda(t))}{\partial u}\,\delta u(t)\,dt. \qquad (6.2.8)$$

For a local minimum, δḡ0 is required to vanish for any arbitrary δu. Therefore, it is necessary that

$$\frac{\partial H(t, x(t), u(t), \lambda(t))}{\partial u} = 0 \qquad (6.2.9)$$
for all t ∈ [0, T ], except possibly on a finite set (i.e., on a set consisting of a
finite number of points). Note that this condition holds only if u is uncon-
strained. In the case of control bounds, the Pontryagin Maximum Principle
to be discussed later is required.
Equations (6.2.1), (6.2.7) and (6.2.9) are the well-known Euler-Lagrange
equations. Note that (6.2.7) is a set of ordinary differential equations in λ with
boundary conditions specified at the terminal time T . We shall summarize
these results in a theorem.
Theorem 6.2.1 Let u*(t) be a local optimal control for the cost functional (6.2.2), and let x*(t) and λ*(t) be, respectively, the corresponding optimal state and costate. Then, it is necessary that

$$\frac{dx^*(t)}{dt} = \frac{\partial H(t, x^*(t), u^*(t), \lambda^*(t))}{\partial \lambda} = f(t, x^*(t), u^*(t)), \qquad (6.2.10a)$$
$$x^*(0) = x^0, \qquad (6.2.10b)$$
$$\frac{d\lambda^*(t)}{dt} = -\frac{\partial H(t, x^*(t), u^*(t), \lambda^*(t))}{\partial x}, \qquad (6.2.10c)$$
$$\lambda^*(T) = \frac{\partial \Phi_0(x^*(T))}{\partial x}, \qquad (6.2.10d)$$

and

$$\frac{\partial H(t, x^*(t), u^*(t), \lambda^*(t))}{\partial u} = 0 \qquad (6.2.10e)$$

for all t ∈ [0, T], except possibly on a finite set, where a finite set denotes a set that contains only a finite number of points.
As an example, consider the problem of minimizing

$$g_0(u) = \int_0^1\left[(x(t))^2 + (u(t))^2\right]dt$$

subject to

$$\frac{dx(t)}{dt} = u(t), \qquad (6.2.12a)$$
$$x(0) = 1. \qquad (6.2.12b)$$

The Hamiltonian is

$$H = x^2 + u^2 + \lambda u. \qquad (6.2.13)$$

The costate satisfies

$$\frac{d\lambda(t)}{dt} = -2x(t), \qquad (6.2.14a)$$
$$\lambda(1) = 0, \qquad (6.2.14b)$$

and the stationarity condition is

$$\frac{\partial H}{\partial u} = 2u(t) + \lambda(t) = 0. \qquad (6.2.15)$$

Substituting (6.2.15) into (6.2.12a) gives

$$\frac{dx(t)}{dt} = -\frac{\lambda(t)}{2}. \qquad (6.2.16)$$

Differentiating (6.2.16) with respect to t and then using (6.2.14a), we have

$$\frac{d^2x(t)}{dt^2} - x(t) = 0. \qquad (6.2.17)$$

Clearly, the general solution of (6.2.17) is x(t) = C_1 e^t + C_2 e^{-t}, where the constants C_1 and C_2 are determined from the boundary conditions x(0) = 1 and λ(1) = 0 (equivalently, dx(1)/dt = 0).
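A quick numerical check of this example (our own verification script): the two boundary conditions give a 2 × 2 linear system for C1 and C2, using u = dx/dt = −λ/2 so that λ(1) = 0 is equivalent to dx(1)/dt = 0:

```python
import numpy as np

M = np.array([[1.0, 1.0],               # x(0)  = C1 + C2 = 1
              [np.e, -np.exp(-1.0)]])   # x'(1) = C1*e - C2/e = 0
b = np.array([1.0, 0.0])
C1, C2 = np.linalg.solve(M, b)
t = np.linspace(0.0, 1.0, 5)
x = C1 * np.exp(t) + C2 * np.exp(-t)    # optimal state
u = C1 * np.exp(t) - C2 * np.exp(-t)    # optimal control u = dx/dt
print(C1, C2, x, u)
```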
6.3 The Linear Quadratic Theory

Consider the problem of minimizing the quadratic cost functional

$$g_0(u) = \frac{1}{2}(x(T))^\top S^f x(T) + \frac{1}{2}\int_0^T\left[(x(t))^\top Q(t)\,x(t) + (u(t))^\top R(t)\,u(t)\right]dt$$

subject to

$$\frac{dx(t)}{dt} = A(t)x(t) + B(t)u(t), \qquad (6.3.2a)$$
$$x(0) = x^0. \qquad (6.3.2b)$$

Applying the Euler-Lagrange conditions of Theorem 6.2.1 yields

$$\frac{dx(t)}{dt} = A(t)x(t) + B(t)u(t), \qquad (6.3.4a)$$
$$x(0) = x^0, \qquad (6.3.4b)$$
$$\frac{d\lambda(t)}{dt} = -Q(t)x(t) - (A(t))^\top\lambda(t), \qquad (6.3.4c)$$
$$\lambda(T) = S^f x(T), \qquad (6.3.4d)$$
$$R(t)u(t) + (B(t))^\top\lambda(t) = 0, \qquad (6.3.4e)$$

where we have suppressed the superscript * on the optimal control, state and costate for clarity. Since R(t) is positive definite, the last equation (6.3.4e) immediately yields

$$u(t) = -(R(t))^{-1}(B(t))^\top\lambda(t). \qquad (6.3.5)$$
(6.3.4a) and (6.3.4c) with u(t) given by (6.3.5) can be written in the form of the following linear homogeneous system of differential equations in x and λ:

Since the boundary conditions for x and λ are prescribed at two different end points, (6.3.6) presents a linear homogeneous TPBVP. Unlike the general nonlinear TPBVP, this one can be solved analytically in two ways: the transition matrix method or the backward sweep method [33].

The transition matrix method assumes the existence of two transition matrices X(t) and Λ(t) ∈ \mathbb{R}^{n\times n} such that

$$x(t) = X(t)x(T) \qquad (6.3.7)$$

and

$$\lambda(t) = \Lambda(t)x(T). \qquad (6.3.8)$$
It is easy to see that the transition matrices must also satisfy (6.3.6), i.e., with the boundary conditions

$$X(T) = I \qquad (6.3.10a)$$

and

$$\Lambda(T) = S^f. \qquad (6.3.10b)$$

(6.3.9) with the boundary conditions (6.3.10) gives rise to a final value problem, as the system can be integrated backwards in time, starting at the terminal time T. If X(t) is non-singular for all t ∈ [0, T], then, by (6.3.5), (6.3.7) and (6.3.8), it follows that

$$u(t) = -(R(t))^{-1}(B(t))^\top\Lambda(t)(X(t))^{-1}x(t). \qquad (6.3.11)$$

(6.3.11) relates the optimal control at time t to the state at time t and hence it is called a closed loop (or feedback) control law. It can be written as

$$u(t) = -G(t)x(t), \quad\text{where}\quad G(t) = (R(t))^{-1}(B(t))^\top\Lambda(t)(X(t))^{-1}. \qquad (6.3.12)$$

The feedback gain matrix G(t) can be evaluated once the system (6.3.9) is solved with the boundary conditions (6.3.10).

Note that the control law may also be expressed in terms of the initial state, i.e.,

$$u(t) = -\hat G(t)x(0), \qquad (6.3.13)$$

where

$$\hat G(t) = (R(t))^{-1}(B(t))^\top\Lambda(t)(X(0))^{-1}. \qquad (6.3.14)$$
The proof is left as an exercise.
The transition matrix method is conceptually easy, though difficulty often
arises during actual computation of the inverse of X(t). The reader is referred
to [33] for further elaboration of the numerical difficulties involved.
The backward sweep method is more popular by virtue of its computa-
tional efficiency. It assumes a linear relationship between the state and the
costate of the form:
$$\lambda(t) = S(t)x(t). \qquad (6.3.15)$$

Direct differentiation of (6.3.15), using (6.3.4c) and (6.3.4a), yields

$$\frac{d\lambda(t)}{dt} = -Q(t)x(t) - (A(t))^\top\lambda(t) = \frac{dS(t)}{dt}x(t) + S(t)\frac{dx(t)}{dt} = \frac{dS(t)}{dt}x(t) + S(t)A(t)x(t) + S(t)B(t)u(t). \qquad (6.3.16)$$
Substituting (6.3.5) and (6.3.15) into (6.3.16) and rearranging yields

$$\left[\frac{dS(t)}{dt} + S(t)A(t) + (A(t))^\top S(t) + Q(t) - S(t)B(t)(R(t))^{-1}(B(t))^\top S(t)\right]x(t) = 0. \qquad (6.3.18)$$
Since (6.3.18) must hold for arbitrary x, it is necessary that, for all t ∈ [0, T],

$$\frac{dS(t)}{dt} = -S(t)A(t) - (A(t))^\top S(t) - Q(t) + S(t)B(t)(R(t))^{-1}(B(t))^\top S(t). \qquad (6.3.19a)$$

The boundary condition is obtained from (6.3.15) and (6.3.4d):

$$S(T) = S^f. \qquad (6.3.19b)$$
Once the solution S(t) of (6.3.19) has been determined, a closed loop control law may again be established from (6.3.15) and (6.3.5):

$$u(t) = -G(t)x(t), \qquad (6.3.20a)$$

where

$$G(t) = (R(t))^{-1}(B(t))^\top S(t). \qquad (6.3.20b)$$

Once S(t) is determined, the initial condition for the costate is given by (6.3.15):

$$\lambda(0) = S(0)x^0. \qquad (6.3.21)$$

Hence, the optimal state and costate may be obtained by direct integration of (6.3.6) forward in time starting at t = 0. By comparing (6.3.20) with (6.3.11), it is obvious that

$$S(t) = \Lambda(t)(X(t))^{-1}. \qquad (6.3.22)$$
This implies that the matrix Riccati equation may alternatively be solved by the procedure given in the transition matrix method, i.e., by solving the 2n² linear differential equations in (6.3.9), computing the inverse of X(t) and then computing S(t) = Λ(t)(X(t))⁻¹. In principle, it may be easier to solve the linear differential equations. However, there are efficient numerical algorithms [44] for solving the nonlinear Riccati equation directly. The need to solve about four times as many differential equations and to invert an n × n matrix at each time point t for the transition matrix method is often not warranted.
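As an illustration of the backward sweep, the following sketch integrates (6.3.19) from t = T down to t = 0 and recovers the feedback gains; time-invariant A, B, Q, R are used purely for brevity:

```python
import numpy as np
from scipy.integrate import solve_ivp

def riccati_gains(A, B, Q, R, Sf, T, num=101):
    n = A.shape[0]
    def rhs(t, s):
        S = s.reshape(n, n)
        # dS/dt = -SA - A^T S - Q + S B R^{-1} B^T S, cf. (6.3.19a)
        dS = -S @ A - A.T @ S - Q + S @ B @ np.linalg.solve(R, B.T) @ S
        return dS.ravel()
    ts = np.linspace(T, 0.0, num)                 # integrate backwards
    sol = solve_ivp(rhs, (T, 0.0), Sf.ravel(), t_eval=ts, rtol=1e-8)
    Ss = sol.y.T.reshape(-1, n, n)
    # G(t) = R^{-1} B^T S(t), cf. (6.3.20b)
    return ts, [np.linalg.solve(R, B.T) @ S for S in Ss]
```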
As an extension to the linear quadratic regulator problem, the tracking
problem seeks to track a desired reference trajectory r(t) with a linear com-
bination of the states over the interval [0, T ]. Again, subject to the previous
linear system (6.3.2), we have the following cost functional:
$$\min_u\ g(u) = \frac{1}{2}[y(T) - r(T)]^\top S^f\,[y(T) - r(T)] + \frac{1}{2}\int_0^T\left\{[y(t) - r(t)]^\top Q(t)\,[y(t) - r(t)] + (u(t))^\top R(t)\,u(t)\right\}dt, \qquad (6.3.23)$$

where

$$y(t) = C(t)x(t) \qquad (6.3.24)$$

and

$$S^f \ge 0, \quad Q(t) \ge 0, \quad R(t) > 0 \qquad (6.3.25)$$

(i.e., S^f is symmetric and positive semi-definite, Q(t) is symmetric and positive semi-definite for each t ∈ [0, T], and R(t) is symmetric and positive definite for each t ∈ [0, T]).
If we assume a linear relationship for the state and costate, i.e.,
6.4 Pontryagin Maximum Principle

There are already many books written solely on the Pontryagin Maximum Principle and its applications. In this section, we merely point out some fundamental results and briefly investigate some applications.

The Euler-Lagrange equations for the unconstrained optimal control problem of Section 6.2 require that the Hamiltonian function be stationary with respect to the control, i.e., ∂H/∂u = 0 at optimality.
Consider the case when the control is constrained to lie in a subset U of
Rr , where U , known as the control restraint set, is generally a compact subset
of Rr . In this situation, the optimality conditions obtained in Section 6.2 do
not make sense if the optimal control happens to lie on the boundary of U for
any positive subinterval of the planning horizon [0, T ]. To cater for this and
more general situations, some fundamental results due to Pontryagin and his
co-workers [206] will be stated without proof in the next two theorems.
Let U be a compact subset of Rr . Any piecewise continuous function from
[0, T ] into U is said to be an admissible control. Let U be the class of all such
admissible controls.
Now we consider the problem where the cost functional (6.2.2) is to be
minimized over U subject to the dynamical system (6.2.1). We refer to this
as Problem (P 1).
Theorem 6.4.1 Consider Problem (P 1). If u∗ ∈ U is an optimal control,
and x∗ (t) and λ∗ (t) are the corresponding optimal state and costate, then it
is necessary that
$$\frac{dx^*(t)}{dt} = \frac{\partial H(t, x^*(t), u^*(t), \lambda^*(t))}{\partial \lambda} = f(t, x^*(t), u^*(t)), \qquad (6.4.1a)$$
$$x^*(0) = x^0, \qquad (6.4.1b)$$
$$\frac{d\lambda^*(t)}{dt} = -\frac{\partial H(t, x^*(t), u^*(t), \lambda^*(t))}{\partial x}, \qquad (6.4.1c)$$
$$\lambda^*(T) = \frac{\partial \Phi_0(x^*(T))}{\partial x}, \qquad (6.4.1d)$$

and

$$\min_{v\in U} H(t, x^*(t), v, \lambda^*(t)) = H(t, x^*(t), u^*(t), \lambda^*(t)). \qquad (6.4.1e)$$
Remark 6.4.1 Note that the condition (6.4.1e) in the above theorem may also be written as

$$H(t, x^*(t), u^*(t), \lambda^*(t)) \le H(t, x^*(t), v, \lambda^*(t)) \qquad (6.4.2)$$

for all v ∈ U, and for all t ∈ [0, T], except possibly on a finite subset of [0, T]. Note furthermore that the necessary condition (6.4.1e) (and hence (6.4.2)) reduces to the stationary condition (6.2.10e) if the Hamiltonian function H is continuously differentiable and if U = \mathbb{R}^r.
Suppose now that the terminal state is required to satisfy

$$x(T) = x^f, \qquad (6.4.3)$$

where x^f is a given vector in \mathbb{R}^n. Our second problem may now be stated as:
Subject to the dynamical system (6.2.1) together with the terminal condi-
tion (6.4.3), find a control u ∈ U such that the cost functional (6.2.2) is
minimized over U .
For convenience, let this second optimal control problem be referred to as
Problem (P 2).
Theorem 6.4.2 Consider Problem (P2). If u* ∈ U is an optimal control, and x*(t) and λ*(t) are the corresponding optimal state and costate, then it is necessary that

$$\frac{dx^*(t)}{dt} = \frac{\partial H(t, x^*(t), u^*(t), \lambda^*(t))}{\partial \lambda} = f(t, x^*(t), u^*(t)), \qquad (6.4.4a)$$
$$x^*(0) = x^0, \qquad (6.4.4b)$$
$$x^*(T) = x^f, \qquad (6.4.4c)$$
$$\frac{d\lambda^*(t)}{dt} = -\frac{\partial H(t, x^*(t), u^*(t), \lambda^*(t))}{\partial x}, \qquad (6.4.4d)$$

and

$$H(t, x^*(t), u^*(t), \lambda^*(t)) \le H(t, x^*(t), v, \lambda^*(t)) \qquad (6.4.4e)$$

for all v ∈ U, and for all t ∈ [0, T], except possibly on a finite subset of [0, T].
$$\frac{dx(t)}{dt} = -b\,x(t) + u(t), \quad t \in [0, T], \qquad (6.4.6a)$$
$$x(0) = x^0, \qquad (6.4.6b)$$
The optimal maintenance policy thus seeks to maximize the net discounted
payoff (i.e., productivity benefit minus maintenance cost) plus the salvage
value, assuming that the productivity is proportional to the machine quality,
i.e.,
$$\max_u\ g_0(u) = e^{-rT} S\,x(T) + \int_0^T e^{-rt}\left[p\,x(t) - u(t)\right]dt, \qquad (6.4.8)$$
where r, S, p, and T are, respectively, the interest rate, salvage value per unit
terminal quality, the productivity per unit quality and the sale date of the
machine.
Before applying the Pontryagin Maximum Principle presented in Theo-
rem 6.4.1 to this problem, we first note that
We now write down the corresponding Hamiltonian function for the optimal
control problem with (6.4.9) as the objective functional:
$$\frac{d\lambda^*(t)}{dt} = -\frac{\partial H}{\partial x} = b\,\lambda^*(t) + p\,e^{-rt}, \qquad (6.4.11a)$$

with the boundary condition

$$\lambda^*(T) = \frac{\partial\left[-e^{-rT} S\,x(T)\right]}{\partial x(T)} = -S\,e^{-rT}. \qquad (6.4.11b)$$
$$u^*(t) = \begin{cases} \bar u, & 0 \le t \le t^*, \\ 0, & t^* < t \le T, \end{cases} \qquad (6.4.14)$$

where t* is the time when λ*(t) + e^{−rt} switches from being negative to being positive. Since both λ* and e^{−rt} are monotonic, there can be only one such switching point. By solving for the zero of λ*(t) + e^{−rt}, we obtain

$$t^* = T + \frac{1}{b+r}\ln\left(\frac{\frac{p}{r+b} - 1}{\frac{p}{r+b} - S}\right). \qquad (6.4.15)$$
subject to

$$\frac{dx(t)}{dt} = b\,u(t) - c\,x(t), \quad x(0) = k_0, \quad x(T) = k_T,$$
and 0 ≤ u(t) ≤ w̄ for all 0 ≤ t < T . Here, u(t) denotes the rate of work done
at time t, x(t) is the knowledge level at time t and T is the total time in
weeks. Furthermore, c > 0, b > 0, w̄, k0 , and kT are constants as described in
Example 1.2.1. We assume that k0 < kT (i.e., the student’s initial knowledge
level is insufficient to pass the examination) and also that w̄ is sufficiently
large so that the final knowledge level can be reached (otherwise the problem
would be infeasible). The Hamiltonian is given by
$$\frac{d\lambda^*(t)}{dt} = -\frac{\partial H}{\partial x} = c\,\lambda^*(t).$$
This yields λ∗ (t) = Kect for some constant K, which is a strictly monotone
function (note that K = 0 would lead to u∗ (t) = 0 for all t, which would lead
to a loss of knowledge and the student unable to reach the final knowledge
level). Since Pontryagin’s Maximum Principle requires the minimization of
H with respect to u, we must have
Now, K > 0 would result in λ∗ (t) > 0 for all t, which, in turn, leads to
1 + λ∗ (t)b > 0 for all t. As this forces u∗ (t) = 0 for all t, the student could
again not reach the required final knowledge level. Hence we must have K < 0,
which means that λ∗ (t) < 0 for all t and it is monotonically decreasing. If
1 + λ∗ (0)b < 0, it follows that 1 + λ∗ (t)b < 0 for all t, which means that
u∗ (t) = w̄ for all t. The more likely scenario is that 1 + λ∗ (0)b > 0 and
1 + λ∗ (t)b then decreases, becoming negative after a time t∗ . This means the
optimal control is of bang-bang type with
$$u^*(t) = \begin{cases} 0, & 0 \le t < t^*, \\ \bar w, & t^* \le t \le T. \end{cases}$$
We can now derive the complete solution. For 0 ≤ t < t*, dx(t)/dt = −c x(t). Together with the initial condition, this results in

$$x^*(t) = k_0\,e^{-ct}, \quad 0 \le t < t^*.$$

For t* ≤ t ≤ T, we have dx(t)/dt = b\bar w − c x(t). Together with the terminal state constraint, this results in

$$x^*(t) = \frac{b\bar w}{c} + \left(k_T - \frac{b\bar w}{c}\right)e^{c(T-t)}, \quad t^* \le t \le T.$$
By equating the two forms of the optimal state trajectory at t*, we can derive

$$t^* = \frac{1}{c}\ln\left(\frac{c\,k_0}{b\bar w} - \frac{c\,k_T}{b\bar w}\,e^{cT} + e^{cT}\right).$$
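A numerical sanity check of this formula (the parameter values below are illustrative assumptions; the text leaves them symbolic): the two trajectory pieces must agree at t*:

```python
import numpy as np

b, c, w_bar, T = 1.0, 0.1, 2.0, 10.0
k0, kT = 1.0, 5.0
t_star = np.log(c * k0 / (b * w_bar)
                - (c * kT / (b * w_bar)) * np.exp(c * T)
                + np.exp(c * T)) / c
x_left  = k0 * np.exp(-c * t_star)                       # coasting piece
x_right = b * w_bar / c + (kT - b * w_bar / c) * np.exp(c * (T - t_star))
print(t_star, x_left, x_right)    # x_left == x_right at the switch
```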
To conclude this section, we wish to note that there exist several versions of
the proof for the Pontryagin Maximum Principle. What appears in Pontrya-
gin’s original book [206] is somewhat complex. For the proof of a simplified
version, we refer the reader to [3].
6.5 Singular Control

$$-S_1 \le \frac{dy(\ell)}{d\ell} \le S_2, \quad 0 \le \ell \le L, \qquad (6.5.1)$$

where S1 and S2 are given positive constants.
[Fig. 6.5.1: A hypothetical case of highway construction, showing the proposed link between points A and B over existing highways. (a) Plan view. (b) Section a−a, with y(ℓ) the height of the new link above the existing terrain.]
Our aim is to accomplish this job with minimum cost, where the total cost
is the sum of the costs of cutting and filling of earth. If the costs of cutting
and filling are the same, then the minimum cost is, in fact, equivalent to the
minimum filling and cutting of earth.
We are now in a position to write down the mathematical model of this
optimal control problem. To simplify the presentation, we shall assume that
subject to

$$\frac{dy(\ell)}{d\ell} = u(\ell), \quad 0 \le \ell \le 2, \qquad (6.5.3b)$$
$$y(0) = 0, \qquad (6.5.3c)$$
$$y(2) = 1, \qquad (6.5.3d)$$

and

$$-1 \le u(\ell) \le 1, \quad 0 \le \ell \le 2. \qquad (6.5.3e)$$
According to Theorem 6.4.2, the corresponding Hamiltonian function is

$$H = y^2 + u\lambda \qquad (6.5.4)$$

and the costate equation is

$$\frac{d\lambda(\ell)}{d\ell} = -\frac{\partial H}{\partial y} = -2y(\ell). \qquad (6.5.5)$$

Minimizing H with respect to u, the optimal control takes the form

$$u^*(\ell) = \begin{cases} 1, & \text{if } \lambda(\ell) < 0, \\ -1, & \text{if } \lambda(\ell) > 0, \\ \text{undetermined}, & \text{if } \lambda(\ell) = 0. \end{cases} \qquad (6.5.6)$$
Let us assume that we can ignore the third case of (6.5.6) for the time being. Two possibilities remain, namely u*(ℓ) = 1 for λ(ℓ) < 0 and u*(ℓ) = −1 for λ(ℓ) > 0.

Let us explore each of these cases. We integrate the differential equations (6.5.3b) and (6.5.5) in turn and we let K1, K2, K3 and K4 be the constants of integration that arise in this process, obtaining, by (6.5.5), the corresponding expressions for y and λ in each case.

Let us plot the relationship between y and λ for these cases by first eliminating the variable ℓ. Consider Case (i). Then, from (6.5.7), we have ℓ = y − K1. Now consider Case (ii). Then, by a similar argument, it follows from (6.5.9) that ℓ = K3 − y.
Hence, for λ > 0, a corresponding family of parabolic arcs is obtained, and similarly for λ < 0. These are the desired relationships between y and λ, and they are plotted on the phase diagram of Figure 6.5.2.

[Fig. 6.5.2: Phase diagram of λ versus y. In the top half-plane (λ > 0) the arcs correspond to u* = −1; in the bottom half-plane (λ < 0) they correspond to u* = +1. The vertical lines y = 0 and y = 1 mark the boundary conditions.]

Not all of the curves in the phase diagram are feasible, since they do not all meet the boundary conditions, i.e., (6.5.3c), y(0) = 0, and (6.5.3d), y(2) = 1. In terms of the phase diagram, a feasible solution must start on the λ-axis (y(0) = 0 when ℓ = 0), and it must end on the vertical line y = 1 when ℓ = 2.
Inspecting the phase diagram carefully, we see that points on the λ-axis with λ > 0 are of no use. We can never reach the desired final value of y = 1 from such initial points. Thus, we must start from or below the λ-axis. This, in turn, implies that we must use the control u*(ℓ) = 1 if it is to be an optimal control. Thus, (6.5.11) applies throughout. But, from (6.5.3c) and (6.5.7), we have y(0) = 0 + K1 = 0. Thus, y(ℓ) = ℓ and hence y(2) = 2. This clearly does not satisfy the final condition (6.5.3d) (i.e., y(2) = 1).
Where does this leave us? If we re-examine (6.5.6), there are, in fact, three possibilities. The second is not feasible in any case, and the exclusive use of the first has been shown to also not give a feasible answer. We are therefore required to consider the third possibility of (6.5.6), λ = 0 with u* undetermined. This situation leads to a singular control, because the Maximum Principle (6.4.4e) fails to directly determine a unique value of the optimal control.

Let us investigate this singular case further. Suppose that λ = 0 not merely at a single point in [0, 2], but in some finite open interval, say (ℓ1, ℓ2). Then, the derivative dλ(ℓ)/dℓ must also vanish and, by (6.5.5), we obtain y(ℓ) = 0 in that same interval. But then the derivative dy(ℓ)/dℓ = 0 as well and, by (6.5.3b), we get u*(ℓ) = 0. The unique singular control solution is therefore given by

$$y(\ell) = u^*(\ell) = \lambda(\ell) = 0. \qquad (6.5.13)$$
On the phase diagram presented in Figure 6.5.2, this solution corresponds to the origin, i.e., a single point in the plane rather than a curve. There are other points in the plane with y = 0, namely all points on the λ-axis. But none of these others can be maintained so that the differential equations (6.5.3b) and (6.5.5) are satisfied over some finite interval, since (6.5.5) implies that λ(ℓ) cannot remain constant, in particular equal to zero, unless 2y vanishes.
The desired final optimal solution now emerges in two parts. From ℓ = 0 until some distance ℓ̂, the new link should follow the existing terrain. We start to fill the existing terrain from ℓ̂ onwards in such a way that the slope at each point is always at its maximum allowable limit. In this way, the desired height at ℓ = 2, y(2) = 1, will be reached exactly. The remaining question is to determine the value of ℓ̂. First, we recall that

$$y(2) = 1, \qquad (6.5.14)$$
$$y(\ell) = 0, \quad 0 \le \ell \le \hat\ell, \qquad (6.5.15)$$

and

$$y(\ell) = \ell + K_1, \quad \hat\ell \le \ell \le 2. \qquad (6.5.16)$$

Now, from (6.5.14) and (6.5.16), we obtain

$$K_1 = -1, \qquad (6.5.17)$$

so that

$$y(\ell) = \ell - 1, \quad \hat\ell \le \ell \le 2. \qquad (6.5.18)$$

By continuity of y at ℓ̂, we have ℓ̂ = 1, and hence

$$u^*(\ell) = \begin{cases} 0, & 0 \le \ell \le 1, \\ 1, & 1 \le \ell \le 2, \end{cases} \qquad (6.5.19)$$

$$y^*(\ell) = \begin{cases} 0, & 0 \le \ell \le 1, \\ \ell - 1, & 1 \le \ell \le 2. \end{cases} \qquad (6.5.20)$$
This clearly agrees with our intuition, although we can quite easily imagine how uncomfortable it would be to drive on such a highway—the slope of the highway is at its maximum allowable limit from the point ℓ = 1 onwards. This is due to the fact that we did not take driving comfort into account in the problem formulation.
Note that the singular control problem considered in this section is only
a very simple one. For more information regarding singular control, we refer
the interested reader to [40, 69, 72, 73]. Note also that computational algo-
rithms to be introduced in subsequent chapters work equally well regardless
of whether the optimal solution to be obtained is a singular one or otherwise.
6.6 Time Optimal Control

For time optimal control problems, the terminal state is required to satisfy

$$x(T) = x^f, \qquad (6.6.1)$$

and, in addition to the conditions of Theorem 6.4.2, an optimal solution must satisfy

$$\min_{v\in U} H(t, x^*(t), v, \lambda^*(t)) = H(t, x^*(t), u^*(t), \lambda^*(t)) \qquad (6.6.4e)$$

and

$$H(T^*, x^*(T^*), u^*(T^*), \lambda^*(T^*)) = 0. \qquad (6.6.4f)$$
The proof of this theorem is rather involved. Since the emphasis of this
text is on computational methods for optimal control, we omit the proof and
refer the interested reader to [3]. We illustrate the application of the theorem
with the following example.
Consider the double integrator system

$$\frac{d^2x(t)}{dt^2} = u(t)$$

with the following boundary conditions:

$$x(0) = x_1^0, \quad \frac{dx(0)}{dt} = x_2^0, \qquad x(T) = 0, \quad \frac{dx(T)}{dt} = 0.$$

Our aim is to find a control u with |u(t)| ≤ 1 for all t such that the terminal time T is minimized,
subject to

$$\frac{dx_1(t)}{dt} = x_2(t), \quad \frac{dx_2(t)}{dt} = u(t),$$

with the initial conditions

$$x_1(0) = x_1^0, \quad x_2(0) = x_2^0,$$

and the terminal conditions

$$x_1(T) = 0, \quad x_2(T) = 0.$$

The costate equations are

$$\frac{d\lambda_1(t)}{dt} = -\frac{\partial H}{\partial x_1} = 0, \quad \frac{d\lambda_2(t)}{dt} = -\frac{\partial H}{\partial x_2} = -\lambda_1(t).$$
Minimizing the Hamiltonian function with respect to u, we obtain

$$u^*(t) = \begin{cases} -1, & \text{if } \lambda_2(t) > 0, \\ 1, & \text{if } \lambda_2(t) < 0, \\ \text{undetermined}, & \text{if } \lambda_2(t) = 0. \end{cases}$$

Integrating the costate equations gives

$$\lambda_1(t) = C, \quad \lambda_2(t) = -Ct + D,$$

where C and D are arbitrary constants. From the linearity of λ2(t), it follows that λ2(t) either changes sign at most once or is identically zero over the time horizon. However, we can rule out the latter case, since this would require C = D = 0, leading to λ1(t) = 0 for all t ≥ 0, and hence
[Figure: The (x1, x2) phase plane for the double integrator, showing the families of parabolic trajectories for u = +1 and u = −1, the switching curve through the origin O, and a typical optimal trajectory P → A → O.]

Since λ2 changes sign at most once, giving rise to the two sections of this optimal trajectory, the time optimal control is clearly given in feedback form: the control switches between u* = +1 and u* = −1 as the trajectory meets the switching curve x1 + x2|x2|/2 = 0 (the standard closed form for this classical example).
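A sketch simulating this law (the closed-form switching function below is the standard one for this classical example, written out by us rather than quoted from the text):

```python
import numpy as np

def bang_bang(x1, x2):
    # s = 0 on the switching curve x1 + x2*|x2|/2 = 0.
    s = x1 + 0.5 * x2 * abs(x2)
    if s > 0:
        return -1.0
    if s < 0:
        return 1.0
    return -np.sign(x2) if x2 else 0.0   # ride the curve into the origin

# Crude Euler roll-out from (x1, x2) = (1, 1); the state approaches the
# origin, with small chattering near the curve due to the discretization.
x1, x2, dt = 1.0, 1.0, 1e-3
for _ in range(20000):
    u = bang_bang(x1, x2)
    x1, x2 = x1 + dt * x2, x2 + dt * u
print(x1, x2)
```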
$$\operatorname{rank}\begin{bmatrix} \partial a/\partial x & \operatorname{diag}(a) \\ \partial b/\partial x & 0 \end{bmatrix} = r + s, \qquad (6.7.5)$$

where diag(g) denotes the c × c diagonal matrix with g1, \ldots, gc along the main diagonal. The constraint qualification (6.7.6) means that the gradients with respect to u of all active components of (6.7.4) must be linearly independent.

$$\frac{\partial h_i^j}{\partial u} = 0 \quad\text{for all } 0 \le j \le p_i - 1, \qquad \frac{\partial h_i^{p_i}}{\partial u} \ne 0. \qquad (6.7.8)$$

Then we say that the state constraint h_i ≤ 0 is of order p_i. Note that many practical problems have state constraints of order 1 only.
Consider the state constraint hi ≤ 0 for a particular i ∈ [1, . . . , q]. We say
that a subinterval (t1 , t2 ) ⊂ [0, T ] with t1 < t2 is an interior interval of the
trajectory x(·) if hi (t, x(t)) < 0 for all t ∈ (t1 , t2 ). An interval [τ1 , τ2 ] ⊂ [0, T ]
with τ1 < τ2 is called a boundary interval if hi (t, x(t)) = 0 for all t ∈ [τ1 , τ2 ].
An instant ten ∈ [0, T ] is called an entry time if there is an interior interval
that ends at ten and a corresponding boundary interval that starts at ten .
Similarly, an instant tex ∈ [0, T ] is called an exit time if a boundary interval
finishes and an interior interval commences at tex . If hi (τ ∗ , x(τ ∗ )) = 0 and if
hi (t, x(t)) < 0 just before and just after τ ∗ , then τ ∗ is referred to as a contact
time. We collectively refer to entry, exit and contact times as junction times.
Finally, let us assume for notational convenience that boundary intervals do
not intersect. If this is not the case, a more elaborate statement of the next
assumption would be required.
Assumption 6.7.3 On any boundary interval [τ1, τ2],

$$\operatorname{rank}\begin{bmatrix} \dfrac{\partial h_1^{p_1}}{\partial u} \\ \vdots \\ \dfrac{\partial h_q^{p_q}}{\partial u} \end{bmatrix} = q, \qquad (6.7.9)$$

and
and
where

$$\lambda_0 \ge 0 \qquad (6.7.12)$$

is a constant, λ(·) ∈ \mathbb{R}^n is the costate vector, and μ(·) ∈ \mathbb{R}^c and ν(·) ∈ \mathbb{R}^q are Lagrange multiplier functions. At any time t ∈ [0, T], we also define the feasible control region
Theorem 6.7.1 Consider Problem (SP ). Suppose that u∗ (·), where u∗ (t) ∈
Ω(t), t ∈ [0, T ), is an optimal control that is right continuous with left hand
limits and also assume that the constraint qualification (6.7.6) holds for every
pair {u, t}, t ∈ [0, T ] with u ∈ Ω(t). Denote the corresponding optimal state
as x∗ (·) and assume that it has finitely many junction times. Then there exist
a constant λ0 ≥ 0, a piecewise continuous costate trajectory λ∗ (·) whose con-
tinuous segments are absolutely continuous, piecewise continuous multiplier
functions μ∗ (·) and ν ∗ (·), a vector η(τi ) ∈ Rq for each discontinuity point τi
of λ∗ (·), and α ∈ Rr , β ∈ Rs , γ ∈ Rq such that
for every t ∈ [0, T ] and such that the following conditions hold almost every-
where.
$$\frac{dx^*(t)}{dt} = \frac{\partial H(t, x^*(t), u^*(t), \lambda_0, \lambda^*(t))}{\partial \lambda} = f(t, x^*(t), u^*(t)), \qquad (6.7.14a)$$
$$x^*(0) = x^0, \qquad (6.7.14b)$$
$$\frac{d\lambda^*(t)}{dt} = -\frac{\partial L(t, x^*(t), u^*(t), \lambda_0, \lambda^*(t), \mu^*(t), \nu^*(t))}{\partial x}, \qquad (6.7.14c)$$
$$\lambda^*(T^-) = \lambda_0\frac{\partial \Phi_0(x^*(T))}{\partial x} + \alpha^\top\frac{\partial a(T, x^*(T))}{\partial x} + \beta^\top\frac{\partial b(T, x^*(T))}{\partial x} + \gamma^\top\frac{\partial h(T, x^*(T))}{\partial x}, \qquad (6.7.14d)$$
$$\alpha \ge 0, \quad \gamma \ge 0, \quad \alpha^\top a(T, x^*(T)) = \gamma^\top h(T, x^*(T)) = 0. \qquad (6.7.14e)$$
Furthermore, for any time τ in a boundary interval and for any contact time τ, the costate λ*(·) may have a discontinuity given by the following jump conditions:

$$\lambda^*(\tau^-) = \lambda^*(\tau^+) + (\eta(\tau))^\top\frac{\partial h(\tau, x^*(\tau))}{\partial x}, \qquad (6.7.14k)$$
$$H(\tau^-, x^*(\tau^-), u^*(\tau^-), \lambda_0, \lambda^*(\tau^-)) = H(\tau^+, x^*(\tau^+), u^*(\tau^+), \lambda_0, \lambda^*(\tau^+)) - (\eta(\tau))^\top\frac{\partial h(\tau, x^*(\tau))}{\partial t}, \qquad (6.7.14l)$$
$$\eta(\tau) \ge 0, \quad (\eta(\tau))^\top h(\tau, x^*(\tau)) = 0, \qquad (6.7.14m)$$
where τ⁻ and τ⁺ denote the left hand side and right hand side limits, respectively. Finally, suppose that τ is a junction time corresponding to a first order constraint h_i(t, x) ≤ 0 for some i ∈ \{1, \ldots, q\}. Recalling the definition of h_i^1 in (6.7.7), if

$$h_i^1(\tau^-, x^*(\tau^-), u^*(\tau^-)) < 0 \qquad (6.7.15)$$

or

$$h_i^1(\tau^+, x^*(\tau^+), u^*(\tau^+)) > 0, \qquad (6.7.16)$$

i.e., if the entry or exit of the state into or out of a boundary interval is non-tangential, then the costate λ is continuous at τ.
Remark 6.7.1 The condition
for every t can help us to distinguish a normal case (λ0 = 1) from an abnormal
case (λ0 = 0).
Remark 6.7.2 Conditions (6.7.14i) and (6.7.14l) are equivalent to the re-
quirement that H(t, x∗ (t), u∗ (t), λ0 , λ∗ (t)) is constant in the autonomous
case, i.e., when f , L , g, and h do not depend on t explicitly.
Remark 6.7.3 In most practical cases, λ* and H will only jump at junction times. However, a discontinuity may also occur in the interior of a boundary interval.
Remark 6.7.4 Parts of Theorem 6.7.1 have been proven by a range of dif-
ferent authors, see [88] and the references cited therein. In particular, a com-
plete proof that establishes the existence of various multipliers may be found
in [184]. A more general version of Theorem 6.7.1, where g also depends on
x, is stated as an informal theorem in [88]. However, no formal proof has as
yet been established for this case.
As an illustration, consider the problem of minimizing

$$g_0(u) = \int_0^3 x(t)\,dt$$

subject to

$$\frac{dx(t)}{dt} = u(t), \qquad (6.7.18)$$
$$x(0) = 1, \qquad (6.7.19)$$
$$-1 \le u(t) \le 1, \quad\text{for all } t \in [0, 3], \qquad (6.7.20)$$
$$x(t) \ge 0, \quad\text{for all } t \in [0, 3], \qquad (6.7.21)$$
$$x(3) = 1. \qquad (6.7.22)$$
The Hamiltonian and the Lagrangian are

$$H = x + \lambda u,$$
$$L = x + \lambda u + \mu_1(-1 - u) + \mu_2(u - 1) - \nu x,$$

where

$$\frac{d\lambda}{dt} = -\frac{\partial L}{\partial x} = -1 + \nu, \qquad \lambda(3) = \beta - \gamma,$$
$$\mu_1 \ge 0, \quad \mu_1(-1 - u) = 0, \quad\text{for all } t \in [0, 3],$$
$$\mu_2 \ge 0, \quad \mu_2(u - 1) = 0, \quad\text{for all } t \in [0, 3],$$
$$\nu \ge 0, \quad \nu(-x) = 0, \quad\text{for all } t \in [0, 3],$$
and γ ≥ 0 along with γh(x(3)) = 0. Note that the last condition gives
γx(3) = γ = 0. Furthermore,
$$\frac{\partial L}{\partial u} = \lambda - \mu_1 + \mu_2 = 0. \qquad (6.7.23)$$
While it is possible to derive the optimal solution from the above condi-
tions, it is obvious from the problem statement that
$$x^*(t) = \begin{cases} 1 - t, & 0 \le t < 1, \\ 0, & 1 \le t < 2, \\ t - 2, & 2 \le t \le 3, \end{cases} \qquad u^*(t) = \begin{cases} -1, & 0 \le t < 1, \\ 0, & 1 \le t < 2, \\ 1, & 2 \le t \le 3. \end{cases}$$
We can then derive the corresponding costate and multipliers by satisfying
the requirements of Theorem 6.7.1.
For t ∈ [1, 2], u∗ (t) = 0 requires that μ∗1 (t) = μ∗2 (t) = 0. (6.7.23) then
yields λ∗ (t) = 0. It follows that dλ/dt = −1 + ν ∗ (t) = 0, which results in
ν ∗ (t) = 1.
For t ∈ [0, 1), x∗ (t) > 0 implies that ν ∗ (t) = 0 and hence dλ∗ /dt = −1 so
that λ∗ (t) = A − t, for some constant A. Since the state enters the boundary
interval non-tangentially, Theorem 6.7.1 requires λ∗ to be continuous at t =
1. λ∗ (1− ) = A − 1 = λ∗ (1) = 0 yields A = 1 so that λ∗ (t) = 1 − t on
[0, 1). u∗ (t) = −1 on [0, 1) means that μ∗2 (t) = 0 and, by virtue of (6.7.23),
μ∗1 (t) = λ∗ (t) = 1 − t on the same interval.
For t ∈ (2, 3], x∗ (t) > 0 again implies that ν ∗ (t) = 0 and hence dλ∗ /dt =
−1 so that λ∗ (t) = B − t, for some constant B. Since the state exits the
boundary interval non-tangentially, continuity of λ∗ is required at t = 2, so
λ∗ (2+ ) = B − 2 = λ∗ (2) = 0, i.e., B = 2 and λ∗ (t) = 2 − t in this interval.
u∗ (t) = 1 means that μ∗1 (t) = 0, so, using (6.7.23), we have μ∗2 (t) = −λ∗ (t) =
t − 2.
Clearly, the costate and multipliers derived above satisfy all the require-
ments of Theorem 6.7.1.
6.8 The Bellman Dynamic Programming

[Figure: Points A, B, C, D along an optimal path, together with an alternative path from B to D passing through E.]
Suppose we wish to travel from the point A to the point D by the short-
est possible path and suppose this optimal path is given by ABCD, where
the intermediate points B and C represent some intermediate stages of the
journey. Now let us suppose that we start from point B and wish to travel to
point D by the shortest possible path. Bellman’s principle of optimality then
asserts that the optimal path will be given by BCD. Though this answer
appears trivial, we shall prove it nevertheless. Suppose there exists another
optimal path from B to D, indicated by the broken line BED. Then, since
distance is additive, this implies that ABED is a shorter path than ABCD.
This contradicts the original assumption and the proof is complete.
In the context of optimal control, we are dealing with a dynamical process
that evolves with time. Once the initial state and initial time are fixed, a
unique optimal control to minimize the cost functional can be determined.
Hence the optimal control is a function of the initial state and time. Each
optimal state is associated with an optimal path called an extremal. If the
process starts at a different initial state and time, a different optimal extremal
will result. For all possible initial points (ξ, t) in the state–time space, a
family of extremals may be computed, which is referred to as the field of
extremals in classical calculus of variation terminology. Each extremal yields
a corresponding cost functional that can be regarded as a function of the
initial state and time, i.e.,

$$V(\xi, t) = \min_{u\in U} g_0(\xi, t \mid u), \qquad (6.8.1)$$

where

$$g_0(\xi, t \mid u) = \Phi_0(x(T \mid u)) + \int_t^T L_0(\tau, x(\tau \mid u), u(\tau))\,d\tau, \qquad (6.8.2)$$
$$\frac{dx(\tau)}{d\tau} = f(\tau, x(\tau), u(\tau)), \qquad (6.8.3a)$$
$$x(t) = \xi. \qquad (6.8.3b)$$
Here, the function V is called the value function, and g0 (ξ, t | u) denotes the
cost functional corresponding to the control u over the interval [t, T ] starting
from x(t) = ξ. Moreover, the control restraint set U is, in general, taken as
a compact subset of Rr .
For convenience, let Problem (P (ξ, t)) denote the optimal control problem
in which the cost functional (6.8.2) is to be minimized over U subject to the
dynamical system (6.8.3).
We may now apply Bellman’s principle of optimality to derive a partial
differential equation for the value function V (ξ, t). Suppose t is the current
time and t + Δt is a future time infinitesimally close to t. Then, by virtue of
the system (6.8.3), it follows that corresponding to each u ∈ U , the state at
t + Δt is

$$x(t + \Delta t \mid u) = \xi + \Delta\xi, \qquad (6.8.4)$$

where

$$\Delta\xi = f(t, \xi, u(t))\,\Delta t + O(\Delta t^2). \qquad (6.8.5)$$
Now, let Problem (P(ξ + Δξ, t + Δt)) be Problem (P(ξ, t)) with the initial condition (6.8.3b) and the cost functional (6.8.2) replaced by (6.8.4) and

$$g_0(\xi + \Delta\xi, t + \Delta t \mid u) = \Phi_0(x(T \mid u)) + \int_{t+\Delta t}^T L_0(\tau, x(\tau \mid u), u(\tau))\,d\tau, \qquad (6.8.6)$$

respectively. Clearly, the cost functional (6.8.2) can be expressed as

$$g_0(\xi, t \mid u) = \Phi_0(x(T \mid u)) + \int_{t+\Delta t}^T L_0(\tau, x(\tau \mid u), u(\tau))\,d\tau + \int_t^{t+\Delta t} L_0(\tau, x(\tau \mid u), u(\tau))\,d\tau. \qquad (6.8.8)$$
$$\tilde u(\tau) = \begin{cases} u^*(\tau), & \text{for } \tau \in (t + \Delta t, T], \\ u(\tau), & \text{for } \tau \in [t, t + \Delta t), \end{cases} \qquad (6.8.9)$$

where u* denotes an optimal control for Problem (P(ξ + Δξ, t + Δt)). Obviously,
$$g_0(\xi, t \mid \tilde u) \ge V(\xi, t), \qquad (6.8.12)$$

which, in turn, becomes an equality if an optimal control is used in the interval [t, t + Δt]. Thus,

$$V(\xi, t) = \min\left\{V(\xi + \Delta\xi, t + \Delta t) + \int_t^{t+\Delta t} L_0(\tau, x(\tau \mid u), u(\tau))\,d\tau : u \text{ measurable on } [t, t+\Delta t] \text{ with } u(\tau) \in U\right\}. \qquad (6.8.13)$$
Using a Taylor series expansion of the value function V(ξ + Δξ, t + Δt), and then invoking (6.8.5), we have

$$V(\xi + \Delta\xi, t + \Delta t) = V(\xi, t) + \frac{\partial V(\xi, t)}{\partial \xi}\,\Delta\xi + \frac{\partial V(\xi, t)}{\partial t}\,\Delta t + O(\Delta t^2) = V(\xi, t) + \frac{\partial V(\xi, t)}{\partial \xi}\,f(t, \xi, u(t))\,\Delta t + \frac{\partial V(\xi, t)}{\partial t}\,\Delta t + O(\Delta t^2). \qquad (6.8.14)$$

Next, since Δt is infinitesimally small, it is clear from using the initial condition (6.8.3b) that

$$\int_t^{t+\Delta t} L_0(\tau, x(\tau \mid u), u(\tau))\,d\tau = L_0(t, \xi, u(t))\,\Delta t + O(\Delta t^2). \qquad (6.8.15)$$
Substituting (6.8.14) and (6.8.15) into (6.8.13), and then noting that V(ξ, t) is independent of u(t), we obtain

$$-\frac{\partial V(\xi, t)}{\partial t}\,\Delta t = \min\left\{\frac{\partial V(\xi, t)}{\partial \xi}\,f(t, \xi, u(t))\,\Delta t + L_0(t, \xi, u(t))\,\Delta t + O(\Delta t^2) : u(t) \in U\right\}. \qquad (6.8.16)$$

Dividing both sides by Δt and letting Δt → 0 yields the Hamilton-Jacobi-Bellman (HJB) equation

$$-\frac{\partial V(\xi, t)}{\partial t} = \min\left\{\frac{\partial V(\xi, t)}{\partial \xi}\,f(t, \xi, v) + L_0(t, \xi, v) : v \in U\right\}. \qquad (6.8.17)$$
As an illustration, consider again the linear quadratic regulator problem, now subject to

$$\frac{dx(t)}{dt} = A(t)x(t) + B(t)u(t). \qquad (6.8.23)$$

Let V(ξ, t) be the corresponding value function if the process is assumed to start at t < T from some state ξ. The associated HJB equation is

$$-\frac{\partial V(\xi, t)}{\partial t} = \min\left\{\frac{\partial V(\xi, t)}{\partial \xi}\,(A(t)\xi + B(t)v) + \frac{1}{2}\xi^\top Q(t)\xi + \frac{1}{2}v^\top R(t)v : v \in \mathbb{R}^r\right\} \qquad (6.8.24a)$$
Assuming the quadratic form V(ξ, t) = ½ ξ^⊤ S(t) ξ with S(t) symmetric (cf. (6.8.25)), this becomes

$$-\frac{1}{2}\xi^\top\frac{dS(t)}{dt}\,\xi = \min\left\{\xi^\top S(t)(A(t)\xi + B(t)v) + \frac{1}{2}\xi^\top Q(t)\xi + \frac{1}{2}v^\top R(t)v : v \in \mathbb{R}^r\right\}. \qquad (6.8.26)$$
Let u*(t) denote the v that solves the minimization problem. Clearly, u*(t) is given by

$$u^*(t) = -(R(t))^{-1}(B(t))^\top S(t)\,\xi, \qquad (6.8.27)$$

which simply confirms the optimal state feedback control law of (6.3.20).
Substituting the minimizing control u*(t) of (6.8.27) into (6.8.26), we have

$$-\frac{1}{2}\xi^\top\dot S(t)\,\xi = \xi^\top S(t)\left(A(t)\xi - B(t)(R(t))^{-1}(B(t))^\top S(t)\xi\right) + \frac{1}{2}\xi^\top Q(t)\xi + \frac{1}{2}\xi^\top S(t)B(t)(R(t))^{-1}(B(t))^\top S(t)\xi. \qquad (6.8.28)$$

This can be simplified to

$$\xi^\top\left[\frac{dS(t)}{dt} + 2S(t)A(t) - S(t)B(t)(R(t))^{-1}(B(t))^\top S(t) + Q(t)\right]\xi = 0. \qquad (6.8.29)$$
Since S(t) is symmetric, it follows that

$$\xi^\top(2S(t)A(t))\xi = \xi^\top\left[S(t)A(t) + (A(t))^\top S(t)\right]\xi + \xi^\top\left[S(t)A(t) - (A(t))^\top S(t)\right]\xi. \qquad (6.8.30)$$
Also, since S(t)A(t) − (A(t))^⊤S(t) is antisymmetric, the second term on the right hand side of (6.8.30) vanishes and (6.8.29) therefore reduces to

$$\xi^\top\left[\frac{dS(t)}{dt} + S(t)A(t) + (A(t))^\top S(t) - S(t)B(t)(R(t))^{-1}(B(t))^\top S(t) + Q(t)\right]\xi = 0. \qquad (6.8.31)$$

Since this must hold for arbitrary ξ,

$$\frac{dS(t)}{dt} = -S(t)A(t) - (A(t))^\top S(t) + S(t)B(t)(R(t))^{-1}(B(t))^\top S(t) - Q(t), \qquad (6.8.32)$$

which is exactly the same matrix Riccati differential equation derived previously in equation (6.3.19).
The boundary condition follows from (6.8.25) and (6.8.24b):
S(T ) = S f . (6.8.33)
After solving the matrix Riccati equation (6.8.32) with the boundary condi-
tion (6.8.33), the feedback control law is obtained readily from (6.8.27).
To close this chapter, we wish to emphasize that the class of linear
quadratic optimal control problems is one of the very few cases that can
be solved analytically via the dynamic programming approach.
6.9 Exercises
$$\frac{dx(t)}{dt} = u(t)$$

with the initial condition

$$x(0) = 1.$$
(ii) Show that the optimal control can be expressed as a function of the adjoint variable λ by the relation

$$u^*(t) = -\frac{\lambda(t)}{2}.$$

(iii) Write down the corresponding two-point boundary-value problem.
Find the control minimizing

$$g_0(u) = T$$

subject to

$$\frac{dx(t)}{dt} = -x(t) + u(t), \quad t \in [0, T), \qquad x(0) = 1, \quad x(T) = 0,$$

and

$$|u(t)| \le 1, \quad\text{for all } t \in [0, T].$$
6.9.7 Use the Pontryagin Maximum Principle to solve the problem of minimizing

$$g_0(u) = \int_0^1 (x(t))^2\,dt$$

subject to

$$\frac{dx(t)}{dt} = -x(t) + u(t), \quad t \in [0, 1), \qquad x(0) = 1, \quad x(1) = 0,$$

and

$$|u(t)| \le 1, \quad\text{for all } t \in [0, 1].$$
subject to

$$\frac{dx(t)}{dt} = u(t), \qquad x(0) = a, \quad x(1) = 0.$$
subject to

$$\frac{dx(t)}{dt} = u(t), \quad 0 < t \le 1, \qquad x(0) = 1, \quad -1 \le u(t) \le 1.$$
$$x(0) = a, \quad x(T) = 0,$$

find a real-valued control u, defined on [0, T], such that the cost functional

$$g_0 = \int_0^T (u(t))^2\,dt$$

is minimized.
$$\frac{dx(t)}{dt} = u(t)$$

and we wish to minimize the cost functional

$$g_0 = \frac{1}{2}\int_0^T\left[(x(t))^2 + (u(t))^2\right]dt.$$

Write down the Riccati equation for optimal control and hence find the optimal control law and the optimal trajectory.
6.9.12 Consider the necessary condition (6.4.1e). If the Hamiltonian func-
tion H is continuously differentiable and U = Rr , show that the stationary
condition (6.2.10e) must be satisfied.
subject to the system equation (6.9.1) and the constraint (6.9.3), along with
the initial condition
x(0) = x0 , (6.9.5)
where T is a fixed time.
(a) Write down the problem in the standard form as a minimization problem.
(e) Show that the optimal advertising policy involves not advertising near the
final time t = T .
(g) Under the same assumption, give an explicit solution for x(t) in the re-
gion 0 < t < τ .
$$\frac{dx(t)}{dt} = u(t), \qquad x(0) = 0,$$

where the control satisfies −1 ≤ u(t) ≤ 1 for all t ∈ [0, 2], such that the given cost functional is minimized.
(ii) Solve the following optimal control problem:

$$\min J = \int_0^3 e^{-\rho t}\,u(t)\,dt$$

subject to

$$\frac{dx_1(t)}{dt} = u_1(t), \quad x_1(0) = 4,$$
$$\frac{dx_2(t)}{dt} = u_1(t) - u_2(t), \quad x_2(0) = 4,$$
$$0 \le u_i(t) \le 1, \quad \forall t \in [0, 3], \quad i = 1, 2,$$
$$x_i(t) \ge 0, \quad \forall t \in [0, 3], \quad i = 1, 2.$$
subject to

$$\frac{dx_1(t)}{dt} = u_1(t), \quad x_1(0) = 4,$$
$$\frac{dx_2(t)}{dt} = u_1(t) - u_2(t), \quad x_2(0) = 4,$$
$$0 \le u_i(t) \le 1, \quad \forall t \in [0, 3], \quad i = 1, 2,$$
$$x_i(t) \ge 0, \quad \forall t \in [0, 3], \quad i = 1, 2.$$
(ii) Solve the following optimal control problem:

$$\min J = \int_0^5 u(t)\,dt$$

subject to

$$\frac{dx(t)}{dt} = u(t) - x(t), \quad x(0) = 1,$$
$$0 \le u(t) \le 1, \quad \forall t \in [0, 5],$$
$$x(t) \ge 0.7 - 0.2t, \quad \forall t \in [0, 5].$$
6.9.17 For the problem:

$$\text{minimize } g_0(v) = \int_0^1\left[(x(t))^2 + (v(t))^2\right]dt$$

subject to

$$\frac{dx(t)}{dt} = v(t), \quad x(0) = 1,$$

and no constraint on v(t), show that the dynamic programming equation yields the optimal control policy

$$v(t) = -\frac{e^{2(1-t)} - 1}{e^{2(1-t)} + 1}\,x(t).$$
6.9.18 A system follows the equation:

$$\frac{dx(t)}{dt} = u(t).$$

The process of ultimate interest starts at time t = 0 and finishes at t = T with the cost functional:

$$g_0 = (x(T))^2 + \int_0^T (u(t))^2\,dt,$$

where T is fixed.
(i) Use Bellman's dynamic programming to find the optimal control u*.
(ii) Substitute for the optimal control in the dynamical equation to find the optimal trajectory x*(t) when x(0) = x0. What is the value of x*(T)?
subject to

$$\frac{dx(t)}{dt} = -x(t) + u(t), \quad t \in [0, T),$$

where T is fixed.
(i) Use Bellman’s dynamic programming to find the optimal control
u∗ (t).
(ii) Substitute the optimal control into the dynamical equation to find the
optimal trajectory x∗ (t) when x(0) = x0 . What is the value of g0 (u∗ )?
Chapter 7
Gradient Formulae for Optimal Parameter Selection Problems
7.1 Introduction
The main theme of this chapter is to derive gradient formulae for the cost
and constraint functionals of several types of optimal parameter selection
problems with respect to various types of parameters. An optimal parameter
selection problem can be regarded as a special type of optimal control prob-
lems in which the controls are restricted to be constant functions of time.
Many optimization problems involving dynamical systems can be formulated
as optimal parameter selection problems. Examples include parameter identi-
fication problems and controller parameter design problems. In fact, optimal
control problems with their control being parametrized are essentially reduced
to optimal parameter selection problems. Thus, when developing numerical
techniques for solving complex optimal control problems, it is essential to
ensure the solvability of the resulting optimal parameter selection problems.
In Section 7.2, we consider a class of optimal parameter selection problems
where the system dynamics are described by ordinary differential equations
without time-delay arguments. Gradient formulae for the cost functional as
well as for the constraint functionals are then derived. With these gradient
formulae, the optimal parameter selection problems can be readily solved us-
ing gradient-based mathematical programming algorithms, such as sequential
quadratic programming (see Section 3.5).
In Sections 7.3–7.4, we consider a class of optimal control problems in
which the control takes a special structure. To be more precise, for each
k = 1, . . . , r, the k−th component of the control function is a piecewise con-
stant function over the planning horizon [0, T ] with jumps at tk1 , . . . , tkMk .
In Section 7.3, only the heights of the piecewise constant control func-
tions are regarded as decision parameters and gradient formulae of the cost
and constraint functionals with respect to these heights are derived on the
basis of the formulae in Section 7.2. In Section 7.4, the switching times of
the piecewise constant control functions are regarded as decision parameters.
Gradient formulae of the cost functional as well as the constraint function-
als with respect to these switching times are then derived. However, these
gradient formulae may not exist when two or more switching times collapse
into one during the optimization process [148, 151, 168]. Thus, optimization
techniques based on these gradient formulae are often ineffective. To over-
come this difficulty, the concept of a time scaling transformation, which was
originally called the control parametrization enhancing transform (CPET),
is introduced. More specifically, by applying the time scaling transformation,
the switching times are mapped into fixed knots. The equivalent transformed
problem is a standard optimal parameter selection problem of the form con-
sidered in Section 7.2. Later in Section 7.4, we consider a class of combined
optimal parameter selection and optimal control problems in which the con-
trol takes the same structure as that of Section 7.3 with variable heights and
variable switching times. After applying the time scaling transformation, gra-
dient formulae of the cost and constraint functionals in the transformed prob-
lem with respect to these heights, switching times and system parameters are
summarized. Furthermore, we consider the applications of these formulae to
the special cases of discrete valued optimal control problems as well as the
optimal control of switched systems in Section 7.4.
In Section 7.5, we extend the results of Section 7.2 to the case involving
time-delayed arguments. In Section 7.6, we consider a class of optimal pa-
rameter selection problems with multiple characteristic time points in the
cost and constraint functionals. The main references for this chapter are
[89, 125, 142, 148, 168, 215] and Chapter 5 of [253].
dx(t)/dt = f(t, x(t), ζ), (7.2.1a)
where x = [x1, . . . , xn]^⊤ ∈ R^n and ζ = [ζ1, . . . , ζs]^⊤ ∈ R^s are, respectively, the state and system parameter vectors, and f = [f1, . . . , fn]^⊤ : [0, T] × R^n × R^s → R^n. The initial condition for the system of differential equations (7.2.1a) is
x(0) = x^0(ζ), (7.2.1b)
where x^0 = [x^0_1, . . . , x^0_n]^⊤ ∈ R^n is a given vector-valued function of the system parameter vector ζ.
Define
Remark 7.2.1 From the theory of differential equations, we note that the
system (7.2.1) admits a unique solution, x(· | ζ), corresponding to each ζ ∈
Z.
Remark 7.2.2 The constraints given by (7.2.4a) and (7.2.4b) are said to be
in canonical form.
ψ^k(0) = ∂x^0(ζ)/∂ζk. (7.2.6)
∂x(t | ζ)/∂ζk = ψ^k(t), t ∈ [0, T]. (7.2.7)
Using (7.2.7), the gradients of the cost and constraint functionals can then
be calculated on the basis of the chain rule of differentiation as follows. For
each i = 0, 1, . . . , N ,
The gradient derivation via the costate method is carried out as follows.
For each i = 0, 1, . . . , N, let the corresponding Hamiltonian Hi be defined by
Hi(t, x, ζ, λ) = Li(t, x, ζ) + λ^⊤ f(t, x, ζ), (7.2.13)
where ε > 0 is an arbitrarily small real number. For brevity, let x(·) and
x(·; ε) denote, respectively, the solution of the system (7.2.1) corresponding
to ζ and ζ(ε). Clearly, from (7.2.1), we have
x(t) = x^0(ζ) + ∫_0^t f(s, x(s), ζ) ds (7.2.17)
and
x(t; ε) = x^0(ζ(ε)) + ∫_0^t f(s, x(s; ε), ζ(ε)) ds. (7.2.18)
Thus,
Δx(t) = dx(t; ε)/dε |_{ε=0}
= (∂x^0(ζ)/∂ζ) ρ + ∫_0^t [ (∂f(s, x(s), ζ)/∂x) Δx(s) + (∂f(s, x(s), ζ)/∂ζ) ρ ] ds. (7.2.19)
Clearly,
ΔΦi(x(τi), ζ) = (∂Φi(x(τi), ζ)/∂x) Δx(τi) + (∂Φi(x(τi), ζ)/∂ζ) ρ, (7.2.23)
Δf(t, x(t), ζ) = d(Δx(t))/dt, (7.2.24)
and
ΔHi(t, x(t), ζ, λ^i(t)) = (∂Hi(t, x(t), ζ, λ^i(t))/∂x) Δx(t) + (∂Hi(t, x(t), ζ, λ^i(t))/∂ζ) ρ. (7.2.25)
Since ρ is arbitrary, (7.2.15) follows readily from (7.2.28) and the proof is
complete.
Remark 7.2.4 The choice between gradient formulae (7.2.12) and (7.2.15)
is problem dependent. If the number of system parameters, s, is large, the
use of (7.2.12) requires the solution of a large number of auxiliary systems
compared to the use of (7.2.15), which requires the solution of just one costate
system. On the other hand, the costate system associated with (7.2.15) needs
to be solved backwards along the time horizon and requires the solution of the
state dynamics to be stored beforehand, whereas (7.2.12) can be solved forward
in time alongside the state dynamics. Nevertheless, the use of (7.2.15) is
generally more efficient in a computational sense and hence is described in
more detail in the next subsection.
Remark 7.2.5 In the above algorithm, the solution x(· | ζ) of the sys-
tem (7.2.1) corresponding to each ζ ∈ Z is computed by Algorithm 7.2.1.
dx(t)/dt = f(t, x(t), u(t)), (7.3.1a)
where
x = [x1, . . . , xn]^⊤ ∈ R^n, u = [u1, . . . , ur]^⊤ ∈ R^r
are, respectively, the state and control vectors. The initial condition for the
differential equation (7.3.1a) is
x(0) = x0 , (7.3.1b)
other words, it takes a constant value until the next switching time is reached,
at which point the value changes instantaneously to another constant, which
is held until the next switching time. Mathematically, uk may be expressed
as:
uk(t) = Σ_{j=0}^{M} h^j_k χ_{[tj, tj+1)}(t), (7.3.2)
where
χI(t) = 1, if t ∈ I,
        0, otherwise. (7.3.3)
0 ≤ t1 ≤ · · · ≤ tM ≤ T . (7.3.5)
Let
h = [(h^0)^⊤, (h^1)^⊤, . . . , (h^M)^⊤]^⊤ ∈ R^{(M+1)r}, (7.3.6)
where
h^j = [h^j_1, . . . , h^j_r]^⊤ ∈ R^r, j = 0, 1, . . . , M. (7.3.7)
Let Λ be the set of all those decision parameter vectors h which satisfy (7.3.4).
For convenience, for any h ∈ Λ, the corresponding control is written as
u(· | h). Let U denote the set of all such controls. Clearly, each control in U
is determined uniquely by a decision parameter vector h in Λ and vice versa.
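As a concrete illustration of the representation (7.3.2)–(7.3.3), the following small Python helper (all names and data are illustrative) evaluates a piecewise constant control u(· | h) from its switching times and heights.

```python
import numpy as np

# Evaluate one component of a piecewise constant control, as in (7.3.2):
# u_k(t) = sum_j h_k^j * chi_[t_j, t_{j+1})(t).
def piecewise_constant_control(t, switch_times, heights):
    """switch_times: interior switching times [t_1, ..., t_M] (t_0 = 0 implied);
    heights: values [h^0, ..., h^M] held on the successive subintervals."""
    # searchsorted with side='right' gives j such that t lies in [t_j, t_{j+1})
    j = np.searchsorted(switch_times, t, side='right')
    return np.asarray(heights)[j]

# Example: three switches on [0, 10], four heights
tau = [2.0, 5.0, 7.5]
h = [1.0, -0.5, 0.0, 2.0]
ts = np.linspace(0.0, 10.0, 5)
print(piecewise_constant_control(ts, tau, h))   # [ 1.  -0.5  0.   2.   2. ]
```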
Let x(· | h) denote the solution of the system (7.3.1) corresponding to
u(· | h) ∈ U (and hence corresponding to h ∈ Λ). We may now state the
optimal control problem as follows.
Problem (P 2): Given the system (7.3.1), find a decision parameter vector
h ∈ Λ such that the cost functional
g0(h) = Φ0(x(T | h)) + ∫_0^T L0(t, x(t | h), u(t | h)) dt (7.3.8)
(7.3.11)
dψ^j_k(t)/dt = (∂f(t, x(t | h), u(t | h))/∂x) ψ^j_k(t) + ∂f(t, x(t | h), u(t | h))/∂h^j_k, t ∈ [0, T], (7.3.12)
along with the initial condition
with
ψ^j_k(t) = 0, t ∈ [0, tj), (7.3.15)
where
δkl = 1, if k = l,
      0, otherwise.
In other words, the integration to determine ψkj (·) only needs to commence
at t = tj . Let ψkj (t | h) denote the solution of (7.3.14)–(7.3.15) corresponding
to h ∈ Λ. We may then state the following theorem.
Theorem 7.3.1 For each i ∈ {0, 1, . . . , N}, the gradient of the functional gi with respect to h^j_k, k = 1, . . . , r, j = 0, . . . , M, is given by
where the subsequent equalities follow due to (7.3.11) and τ0 and Mi are the
same as defined in that equation.
The costate approach to determine the gradients also follows from the cor-
responding results in Section 7.2. Replacing ζ with h in Equations (7.2.13)–
(7.2.15), we obtain the following results. For each i = 0, 1, . . . , N , let the
Hamiltonian Hi be defined by
Hi(t, x(t | h), u(t | h), λ^i(t)) = Li(t, x(t | h), u(t | h)) + (λ^i(t))^⊤ f(t, x(t | h), u(t | h)), (7.3.17)
Hi(t, x(t | h), u(t | h), λ^i(t | h)) = Hi(t, x(t | h), h^j, λ^i(t | h)).
where, once again, the latter equality in (7.3.19) follows from (7.3.11) and τ0
and Mi are the same as defined in that equation.
Remark 7.3.2 We have assumed here that each component of the control u
has the same set of switching times. This is purely for notational convenience.
In practice, there is no reason for this restriction.
Consider once more the dynamics described by (7.3.1) and assume that the
control takes the form (7.3.2). In contrast to Section 7.3, we now assume
that the control heights are given and that the switching times of the control
are decision parameters. In other words, we consider hjk , j = 0, 1, . . . , Mk ,
k = 1, . . . , r, to be given and regard the switching times tj , j = 1, . . . , M , as
decision parameters.
ϑ = [t1 , . . . , tM ] is called the switching vector. Let Ξ be the set which
consists of all those vectors ϑ ∈ RM such that the constraints (7.3.5) are
satisfied. Furthermore, let U denote the set of all the corresponding control
functions. For convenience, for any ϑ ∈ Ξ, the corresponding control in U is
written as u(· | ϑ). Clearly, each control in U is determined uniquely by a
switching vector ϑ in Ξ and vice versa.
Let x(· | ϑ) denote the solution of the system (7.3.1) corresponding to the
control u(· | ϑ) ∈ U (and hence to the switching vector ϑ ∈ Ξ). We may
then state the canonical optimal control problem as follows.
Problem (P 3): Given the system (7.3.1), find a switching vector ϑ ∈ Ξ such
that the cost functional
g0(ϑ) = Φ0(x(T | ϑ)) + ∫_0^T L0(t, x(t | ϑ), u(t | ϑ)) dt, (7.4.1)
for each l = 0, 1, . . . , M .
For each j = 1, . . . , M , consider the following variational system.
There are two cases to consider. If t < tj , then x(t | ϑ) clearly does not
depend on tj and thus
∂x(t | ϑ)/∂tj = 0 for all t < tj. (7.4.8)
∂x(t | ϑ)/∂tj = f(tj, x(tj | ϑ), h^{j−1}) − f(tj, x(tj | ϑ), h^j) + ∫_{tj}^{t} (∂f(s, x(s | ϑ), u(s | ϑ))/∂x) (∂x(s | ϑ)/∂tj) ds. (7.4.10)
lim_{t→tj^+} ∂x(t | ϑ)/∂tj = ∂x(tj^+ | ϑ)/∂tj = f(tj, x(tj | ϑ), h^{j−1}) − f(tj, x(tj | ϑ), h^j). (7.4.11)
Differentiating both sides of (7.4.10) with respect to t, we have
d/dt (∂x(t | ϑ)/∂tj) = (∂f(t, x(t | ϑ), u(t | ϑ))/∂x) (∂x(t | ϑ)/∂tj), t > tj. (7.4.12)
Remark 7.4.2 Using (7.4.8), it is easy to see that the limit of the state
variation as t approaches tj from below is
lim_{t→tj^−} ∂x(t | ϑ)/∂tj = ∂x(tj^− | ϑ)/∂tj = 0.
Comparing this with (7.4.11), we can conclude that the state variation with
respect to tj does not exist at t = tj but has a jump condition there instead.
As we show below, however, this does not prevent us from calculating the
required gradients of the cost and constraint functionals.
Note that the technical caveats in Theorem 7.4.1 (t ≠ tj and tj−1 < tj < tj+1) are not needed for the state variation with respect to the control heights.
Thus, optimizing the switching times is much more difficult than optimizing
the control heights. This is one reason for the popularity of the time scaling
transformation to be introduced in the next subsection which allows one to
circumvent the difficulties caused by variable switching times.
Using Theorem 7.4.1, we can derive the partial derivatives of the cost and
constraint functionals with respect to the switching times (assuming that
all switching times are distinct). First, recall that the cost and constraint
functionals are defined by
gi(ϑ) = Φi(x(T | ϑ)) + ∫_0^T Li(t, x(t | ϑ), u(t | ϑ)) dt, i = 0, 1, . . . , N.
Applying the Leibniz rule to interchange the order of differentiation and in-
tegration is valid here because the partial derivative of x(· | ϑ) with respect
to tj exists at all points in the interior of the interval [tl , tl+1 ]. Recall that
the Leibniz rule does not require differentiability at the end points. Com-
bining (7.4.13) and (7.4.14) with Theorem 7.4.1 yields the following gradient
formulae.
Theorem 7.4.2 For each i = 0, 1, . . . , N and j = 1, . . . , M ,
These gradient formulae are based on the variational system. The gradient
formulae derived based on the costate system are given below.
For each i = 0, 1, . . . , N , and for each switching vector ϑ ∈ Ξ, we consider
the following system of differential equations.
dλ^i(t)/dt = −∂Hi(t, x(t | ϑ), u(t | ϑ), λ^i(t))/∂x, t ∈ [0, T) (7.4.16a)
with
λ^i(T) = ∂Φi(x(T | ϑ))/∂x, (7.4.16b)
where
Hi(t, x, u, λ) = Li(t, x, u) + λ^⊤ f(t, x, u). (7.4.17)
The system (7.4.16) is called the costate system for the cost functional if i = 0 and for the i-th constraint functional if i ≠ 0. Let λ^i(· | ϑ) denote the
solution of the costate system (7.4.16) corresponding to ϑ ∈ Ξ. The gradient
of each functional gi , i = 0, . . . , N , with respect to the switching time tj ,
j = 1, . . . , M , is given in the following theorem.
Theorem 7.4.3 Consider j ∈ {1, . . . , M} and assume that tj−1 < tj < tj+1. For each i = 0, 1, . . . , N, the gradient of the functional gi with respect to tj is given by
∂gi(ϑ)/∂tj = Hi(tj, x(tj | ϑ), h^{j−1}, λ^i(tj | ϑ)) − Hi(tj, x(tj | ϑ), h^j, λ^i(tj | ϑ)). (7.4.18)
Proof. Although this result has been proven in [253], a somewhat more direct proof is given here. Let v : [0, T] → R^n be any absolutely continuous function. Then gi may be written as
gi(ϑ) = Φi(x(T | ϑ)) + Σ_{l=0}^{M} ∫_{tl}^{tl+1} Li(t, x(t | ϑ), h^l) dt
= Φi(x(T | ϑ)) + Σ_{l=0}^{M} ∫_{tl}^{tl+1} Hi(t, x(t | ϑ), h^l, v(t)) dt − Σ_{l=0}^{M} ∫_{tl}^{tl+1} (v(t))^⊤ (dx(t | ϑ)/dt) dt.
Applying integration by parts to the last term, we obtain
gi(ϑ) = Φi(x(T | ϑ)) + Σ_{l=0}^{M} ∫_{tl}^{tl+1} Hi(t, x(t | ϑ), h^l, v(t)) dt
− Σ_{l=0}^{M} [ (v(tl+1))^⊤ x(tl+1 | ϑ) − (v(tl))^⊤ x(tl | ϑ) − ∫_{tl}^{tl+1} (dv(t)/dt)^⊤ x(t | ϑ) dt ]
= Φi(x(T | ϑ)) − (v(tM+1))^⊤ x(tM+1 | ϑ) + (v(t0))^⊤ x(t0 | ϑ)
+ Σ_{l=0}^{M} ∫_{tl}^{tl+1} [ Hi(t, x(t | ϑ), h^l, v(t)) + (dv(t)/dt)^⊤ x(t | ϑ) ] dt
= Φi(x(T | ϑ)) − (v(T))^⊤ x(T | ϑ) + (v(0))^⊤ x(0 | ϑ)
+ Σ_{l=0}^{j−2} ∫_{tl}^{tl+1} [ Hi(t, x(t | ϑ), h^l, v(t)) + (dv(t)/dt)^⊤ x(t | ϑ) ] dt
+ ∫_{tj−1}^{tj} [ Hi(t, x(t | ϑ), h^{j−1}, v(t)) + (dv(t)/dt)^⊤ x(t | ϑ) ] dt
+ ∫_{tj}^{tj+1} [ Hi(t, x(t | ϑ), h^j, v(t)) + (dv(t)/dt)^⊤ x(t | ϑ) ] dt
+ Σ_{l=j+1}^{M} ∫_{tl}^{tl+1} [ Hi(t, x(t | ϑ), h^l, v(t)) + (dv(t)/dt)^⊤ x(t | ϑ) ] dt.
Using the Leibniz rule, this equation can be differentiated with respect to tj to give
∂gi(ϑ)/∂tj = Hi(tj, x(tj | ϑ), h^{j−1}, v(tj)) − Hi(tj, x(tj | ϑ), h^j, v(tj))
+ Σ_{l=0}^{M} ∫_{tl}^{tl+1} [ ∂Hi(t, x(t | ϑ), h^l, v(t))/∂x + (dv(t)/dt)^⊤ ] (∂x(t | ϑ)/∂tj) dt.
Since v is an arbitrary absolutely continuous function, we may choose v(t) = λ^i(t | ϑ), in which case the integrand above vanishes by virtue of the costate equation (7.4.16a). Hence
∂gi(ϑ)/∂tj = Hi(tj, x(tj | ϑ), h^{j−1}, v(tj)) − Hi(tj, x(tj | ϑ), h^j, v(tj)),
as required.
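The formula (7.4.18) just proven is easy to verify numerically. The sketch below uses a hypothetical scalar example of our own choosing, for which both the cost and the costate are available in closed form.

```python
# Numerical check of the gradient formula (7.4.18) on an illustrative scalar
# example: dx/dt = u(t), x(0) = 0, with u = h0 on [0, t1) and h1 on [t1, T],
# and g0 = int_0^T x(t)^2 dt, so that H(t, x, h, lam) = x^2 + lam * h.
T, t1, h0, h1 = 1.0, 0.4, 1.0, -0.5

def g0(t1):
    # closed-form cost for the piecewise-linear trajectory
    x1, xT = h0 * t1, h0 * t1 + h1 * (T - t1)
    return h0 ** 2 * t1 ** 3 / 3 + (xT ** 3 - x1 ** 3) / (3 * h1)

# costate: dlam/dt = -2x, lam(T) = 0, hence lam(t1) = int_t1^T 2 x(s) ds
x1, xT = h0 * t1, h0 * t1 + h1 * (T - t1)
lam_t1 = (xT ** 2 - x1 ** 2) / h1
# (7.4.18): dg0/dt1 = H(t1, x1, h0, lam) - H(t1, x1, h1, lam) = lam*(h0 - h1)
print(lam_t1 * (h0 - h1))                         # 0.45
print((g0(t1 + 1e-7) - g0(t1 - 1e-7)) / 2e-7)     # finite-difference check
```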
Remark 7.4.3 Equations (7.4.15) and (7.4.18) both give the partial derivatives of the canonical functionals gi, i = 0, 1, . . . , N, with respect to the
switching times. Since the state trajectory and the solutions of the varia-
tional and costate systems depend continuously on ϑ, the derivative formu-
lae in (7.4.15) and (7.4.18) also depend continuously on ϑ. In principle,
either (7.4.15) or (7.4.18) can be used in conjunction with a gradient-based
optimization method to optimize the switching times. However, there are sev-
eral difficulties with this approach:
(i) The variational systems for the switching times contain a jump
condition.
(ii) The partial derivatives of the canonical functionals with respect to
tj only exist when the switching times are distinct.
(iii) It is cumbersome to integrate the state and variational or costate
systems numerically when the switching times are variable, espe-
cially when two or more switching times are close together.
For the reasons given in Remark 7.4.3, it is less popular to use the gra-
dient formulae given by (7.4.15) and (7.4.18) in practice. Instead, a time
scaling transformation is typically used to transform a problem with variable
switching times into an equivalent problem with fixed switching times. This
is discussed in the next section.
γj = tj+1 − tj , j = 0, 1, . . . , M, (7.4.19)
γj ≥ 0, j = 0, 1, . . . , M, (7.4.20)
γ0 + γ1 + · · · + γM = T. (7.4.21)
The basic idea of the time scaling transformation is to replace the original
time horizon [0, T ] containing the variable switching times tj , j = 1, . . . , M ,
with a new time horizon [0, M + 1] with fixed switching times at 1, 2, . . . , M .
We use s to denote ‘time’ in the new time horizon. The relationship between
t ∈ [0, T ] and s ∈ [0, M + 1] can be defined by the differential equation
dt(s)/ds = v(s) (7.4.22a)
with the initial condition
t(0) = 0 (7.4.22b)
and the terminal condition
t(M + 1) = T, (7.4.22c)
where the scalar valued function v(s) is called the time scaling control. It is
defined by
v(s) = Σ_{j=0}^{M} γj χ_{[j, j+1)}(s), (7.4.23)
0 ≤ γi ≤ T, i = 0, 1, . . . , M, (7.4.24)
where the upper bound follows from (7.4.21). For easy reference, the time
scaling control v(s) is written as v(s | γ). Let Γ be the set containing all those
vectors γ = [γ0 , γ1 , . . . , γM ] ∈ RM +1 satisfying (7.4.24). Clearly, v(s | γ)
is uniquely determined by γ ∈ Γ and vice versa. By virtue of (7.4.24), the
solution of (7.4.22) is monotonically non-decreasing. For each s ∈ [0, M + 1],
we have
t(s) = ∫_0^s v(τ | γ) dτ
     = γ0 s, if s ∈ [0, 1],
     = Σ_{j=0}^{l−1} γj + γl (s − l), if s ∈ [l, l + 1]. (7.4.25)
t(j) = tj , j = 0, 1, . . . , M + 1,
so that the desired mapping between the variable switching times in [0, T ] and
the fixed switching times in [0, M + 1] has been achieved and that (7.4.22c)
is equivalent to (7.4.21). Furthermore, the piecewise constant control
u(t) = Σ_{j=0}^{M} h^j χ_{[tj, tj+1)}(t)
becomes
ũ(s) = u(t(s)) = Σ_{j=0}^{M} h^j χ_{[j, j+1)}(s) (7.4.26)
Using (7.4.22a) and the chain rule, we then obtain the transformed dynamics
where, for each γ ∈ Γ , λ̃iv (s | γ) and λ̃i (s | γ) denote the solution of the
costate system
d/ds [λ̃^i_v(s), λ̃^i(s)]^⊤ =
[ −∂H̃i(t(s), x̃(s | γ), v(s | γ), ũ(s), λ̃^i_v(s), λ̃^i(s))/∂t,
  −(∂H̃i(t(s), x̃(s | γ), v(s | γ), ũ(s), λ̃^i_v(s), λ̃^i(s))/∂x̃)^⊤ ]^⊤, s ∈ [0, M + 1),
(7.4.34)
The gradients of the cost and constraint functionals in Problem (TP3) may
now be given in the following Theorem.
Theorem 7.4.4 The gradient of the functional g̃i with respect to γj , j ∈
{0, 1, . . . , M }, is
∂g̃i(γ)/∂γj = ∫_j^{j+1} ∂H̃i(t(s | γ), x̃(s | γ), γj, h^j, λ̃^i_v(s | γ), λ̃^i(s | γ))/∂γj ds. (7.4.36)
Remark 7.4.4 Note that for computational purposes, (7.4.36) may also be
written as
∂g̃i(γ)/∂γj = ∫_j^{j+1} ∂H̃i(t(s | γ), x̃(s | γ), v(s | γ), ũ(s), λ̃^i_v(s | γ), λ̃^i(s | γ))/∂v ds. (7.4.37)
Remark 7.4.10 Note that the transformed problem has only fixed time
points where the state differential equations are discontinuous. All locations
of the discontinuities of the state differential equations are thus known and
fixed during the optimization process. Even when two or more of the switch-
ing times in the original time scale coalesce, the number of these locations
remains unchanged in the transformed problem.
dx(t)/dt = f(t, x(t), u(t), ζ), (7.4.41a)
where x = [x1, . . . , xn]^⊤ ∈ R^n, u = [u1, . . . , ur]^⊤ ∈ R^r and ζ = [ζ1, . . . , ζs]^⊤ ∈ R^s are, respectively, the state, control and system parameter vectors, and f = [f1, . . . , fn]^⊤ : [0, T] × R^n × R^r × R^s → R^n. The initial condition for
the system of differential equations (7.4.41a) is
ũ(s | h) = u(t(s) | h, ϑ) = Σ_{j=0}^{M} h^j χ_{[j, j+1)}(s) (7.4.44)
for s ∈ [0, M + 1). Similarly, we define x̃(s) = x(t(s)), for s ∈ [0, M + 1].
Using (7.4.22a), the transformed dynamics are
dx̃(s)/ds = v(s | γ) f(t(s), x̃(s), ũ(s | h), ζ) (7.4.45)
with the initial condition
x̃(0) = x0 (ζ). (7.4.46)
For each γ ∈ Γ , let t(· | γ) denote the corresponding solution of (7.4.22a)–
(7.4.22b). In order to simplify notation somewhat, let the triple (h, γ, ζ) be
denoted by θ. Then for each θ = (h, γ, ζ) ∈ Λ × Γ × Z, let x̃(· | θ) denote
the corresponding solution of (7.4.45)–(7.4.46). The transformed version of
Problem (P 4) is given as follows.
Problem (T P 4): Given the combined system (7.4.22a) and (7.4.45) with
the initial conditions (7.4.22b) and (7.4.46), find a combined vector θ =
(h, γ, ζ) ∈ Λ × Γ × Z such that the cost functional
g̃0(θ) = Φ0(x̃(M + 1 | θ), ζ) + ∫_0^{M+1} L0(t(s | γ), x̃(s | θ), ũ(s | h), ζ) v(s | γ) ds (7.4.47)
is minimized over Λ × Γ × Z subject to equality constraints
g̃i(θ) = Φi(x̃(M + 1 | θ), ζ) + ∫_0^{M+1} Li(t(s | γ), x̃(s | θ), ũ(s | h), ζ) v(s | γ) ds = 0, i = 1, . . . , Ne, (7.4.48)
Note that all the controls in the transformed problem have fixed switching
times and variable heights, just like those in Section 7.3. The gradients of
the cost and constraint functionals in Problem (T P 4) with respect to each
of h, γ and ζ are summarized in the following theorem, which follows from
the corresponding results in Sections 7.2 and 7.3.
Consider Problem (T P 4). For each i = 0, 1, . . . , N, define the Hamiltonian
H̃i(t(s | γ), x̃(s | θ), v(s | γ), ũ(s | h), ζ, λ̃^i_v(s | γ), λ̃^i(s | θ))
= Li(t(s | γ), x̃(s | θ), ũ(s | h), ζ) + λ̃^i_v(s | γ) v(s | γ) + (λ̃^i(s | θ))^⊤ v(s | γ) f(t(s | γ), x̃(s | θ), ũ(s | h), ζ). (7.4.51)
Theorem 7.4.5 The gradients of the functional g̃i with respect to h^j and γj, j = 0, 1, . . . , M, as well as ζ are given by
∂g̃i(θ)/∂h^j = ∫_j^{j+1} ∂H̃i(t(s | γ), x̃(s | θ), γj, h^j, ζ, λ̃^i_v(s | γ), λ̃^i(s | θ))/∂h^j ds, (7.4.54)
∂g̃i(θ)/∂γj = ∫_j^{j+1} ∂H̃i(t(s | γ), x̃(s | θ), γj, h^j, ζ, λ̃^i_v(s | γ), λ̃^i(s | θ))/∂γj ds, (7.4.55)
and
∂g̃i(θ)/∂ζ = ∂Φi(x̃(M + 1 | θ), ζ)/∂ζ + (λ̃^i(0 | θ))^⊤ ∂x^0(ζ)/∂ζ
+ ∫_0^{M+1} ∂H̃i(t(s | γ), x̃(s | θ), v(s | γ), ũ(s | h), ζ, λ̃^i_v(s | γ), λ̃^i(s | θ))/∂ζ ds, (7.4.56)
respectively.
In many practical optimal control problems, the values of the control com-
ponents may only be chosen from a discrete set rather than from an interval
defined by an upper and a lower bound, as we assumed in Section 7.3. The
main references for this section are [122, 126, 127, 297].
Consider once more the dynamics (7.3.1) and assume that each component
of the control takes the piecewise constant form (7.3.2). Let h and hj , j =
0, 1, . . . , M , be as defined by (7.3.6) and (7.3.7), respectively. Instead of the
individual control heights being bounded above and below by (7.3.4), though,
we now assume that
" #
hj ∈ h̄1 , h̄2 , . . . , h̄q , j = 0, 1, . . . , M, (7.4.59)
Before going on to discuss the solution strategies for Problem (P 5), let us
consider another common class of optimal control problems and show that
it is equivalent to Problem (P 5). Suppose that instead of a single dynamical
system, there is a finite set of distinct dynamical systems each of which can
be invoked on any subinterval of the time horizon [0, T ]. The state of the
system is then determined as follows. Starting with the given initial condition at t = 0, the first dynamical system, active on the first subinterval [0, t1] of [0, T], is integrated up to t1. x(t1) then becomes the initial state for the
next subinterval [t1 , t2 ]. Starting with x(t1 ) the second dynamical system
active on [t1 , t2 ] is then integrated forward in time until t2 to get x(t2 ).
The process continues in the same manner until we reach the terminal time.
Mathematically, we can describe the overall dynamics as
dx(t)/dt = f^{v_i}(t, x(t)), t ∈ (ti, ti+1], i = 0, 1, . . . , M, (7.4.62)
x(0) = x0 , (7.4.63)
be the set of feasible switching sequences. Suppose that the switching times
ti , i = 1, . . . , M , are also decision variables and let ϑ and Ξ be defined
as for Problem (P 5). For each (ϑ, v) ∈ Ξ × V, let x(· | ϑ, v) denote the
corresponding solution of (7.4.62)–(7.4.63). Then we can define the following
canonical optimal switching control problem.
Problem (P 6): Given the system (7.4.62)–(7.4.63), find a combined vector
(ϑ, v) ∈ Ξ × V such that the cost functional
g0(ϑ, v) = Φ0(x(T | ϑ, v)) + ∫_0^T L0(t, x(t | ϑ, v)) dt, (7.4.65)
where the sequence h̄1 , h̄2 , . . . , h̄q is repeated M̄ + 1 times. For M = (M̄ +
1) × q, this leads to a problem in the form of Problem (P 3) where we just
need to determine the switching times ti , i = 1, . . . , (M̄ + 1) × q − 1. Note
that any possible order of the h̄j , j = 1, . . . , q with respect to M̄ switches
can be parametrized in this way if any of the successive switching times
are allowed to coalesce. In other words, all possible combinations of control
height sequences of Problem (P 5) with up to M̄ switches are contained in the
vector h̄. Optimization of the resulting Problem (P 3) (via the time scaling
transformation leading to the equivalent Problem (T P 3)) will then lead to
many of the switching times coalescing and leave us with an optimal switching
sequence [122, 126]; a small construction sketch is given after the list below. While this is an effective heuristic scheme, there are several issues with this approach.
(i) It is possible to get more than the assumed M̄ switches. For many
practical problems, this is not a serious issue, as the number of nec-
essary switches is generally not known in the first place. If a limited
number of switches does need to be strictly adhered to, additional
constraints can be imposed along with the time scaling transforma-
tion for this purpose [297].
(ii) The introduction of many potentially unnecessary switchings creates
a large mathematical programming problem. Numerical experience
indicates that a lot of locally optimal solutions exist for this problem
and it is easy to get trapped in these. This is particularly well illus-
trated by a complex problem requiring the calculation of an optimal
path by a submarine through a sensor field [38].
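The repeated-sequence construction referred to above is straightforward to set up. The sketch below (all names and values are illustrative) builds the fixed height vector h̄; optimizing the corresponding subinterval durations, and allowing some of them to collapse to zero, then realizes a particular switching sequence.

```python
import numpy as np

# Sketch of the enumeration device described above: repeating the admissible
# levels (h_bar_1, ..., h_bar_q) M_bar + 1 times fixes the height sequence, so
# that only the subinterval durations remain as decision variables (the
# resulting problem is of the form of Problem (P3) / (TP3)).
def repeated_height_sequence(levels, n_switches):
    """Return the fixed height vector h_bar covering up to n_switches switches."""
    return np.tile(levels, n_switches + 1)    # q*(M_bar + 1) heights in total

levels = [0.0, 1.0, -1.0]                     # q = 3 admissible control values
h_bar = repeated_height_sequence(levels, 2)   # allow up to M_bar = 2 switches
print(h_bar)
# After optimizing the durations, a zero duration for a subinterval removes
# the corresponding level from the realized switching sequence.
```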
If the problem under consideration also includes continuous valued con-
trol functions and the number of switches for the discrete valued controls
is small compared to the size of the partition required for the continuous
valued controls, a modified time scaling transformation proposed in [67] can
be used. This involves a coarse partition of the time horizon for the discrete
valued control components. Within each interval of this partition, a much
finer partition is set up for the continuous valued controls.
A special class of optimal control problems which often occurs in practice
is one where the Hamiltonian function turns out to be linear in the control
variables and where the controls have constant upper and lower bounds. In
this case, application of the Maximum Principle leads to an optimal con-
trol which is made up of a combination of bang or singular arcs only, as
demonstrated for several basic examples in Chapter 6. If there are only bang
arcs, the problem can be considered in the form of Problem (P 5). If the for-
mulae for the singular control during singular intervals are known and do
not depend on the costate, the problem can be considered in the form of
Problem (P 6) (see [269] for an example). If the singular control can only be
expressed in terms of the costate, time scaling can still be used by formu-
lating an auxiliary problem where additional dynamics are included for the
costate determination [228].
Another approach to determine optimal switching sequences is a graph-
based semi analytical method proposed in [118]. For a class of optimal control
problems with a single state and multiple controls, an equivalence between
the search for the optimal solution to the problem and the search for the
shortest path in a specially constructed graph is established. The graph is
based on analytical properties derived from applying the Maximum Principle.
Thus the problem of finding optimal sequence for the control policies, which
dx(t)/dt = f(t, x(t), x(t − h), ζ), (7.5.1a)
where
x = [x1 , . . . , xn ] ∈ Rn , ζ = [ζ1 , . . . , ζs ] ∈ Rs
are, respectively, the state and system parameter vectors, f : [0, T ] × R2n ×
Rs → Rn , f = [f1 , . . . , fn ] ∈ Rn , and h is the time delay satisfying 0 < h <
T.
For the sake of simplicity, we have confined our analysis to the case of a
single time-delay. Nevertheless, all the results can be extended in a straight-
forward manner to the case of multiple time delays. The initial function for
the state vector is
x(t) = φ(t), t ∈ [−h, 0); x(0) = x^0, (7.5.1b)
where
φ(t) = [φ1 (t), . . . , φn (t)]
is a given piecewise continuous function mapping from [−h, 0) to Rn , and x0
is a given vector in Rn .
Let Z be a compact and convex subset of Rs . For each ζ ∈ Z, let x(· | ζ)
be the corresponding vector-valued function which is absolutely continuous
on (0, T ] and satisfies the differential equation (7.5.1a) almost everywhere
on (0, T ] and the initial condition (7.5.1b) everywhere on [−h, 0]. This func-
tion is called the solution of the system (7.5.1) corresponding to the system
parameter vector ζ ∈ Z.
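It may be helpful to see how a solution of such a delayed system is computed in practice. The following minimal fixed-step scheme (our own illustration, assuming for simplicity that the delay h is an integer multiple of the step size) uses the initial function φ while t − h < 0 and stored grid values afterwards.

```python
import numpy as np

# Minimal fixed-step integrator for a scalar time-delay system of the form
# (7.5.1), assuming the delay h is an integer multiple of the step size so
# that x(t - h) is available from the stored grid values.
def solve_delay_system(f, phi, x0, zeta, T, h, n_steps):
    dt = T / n_steps
    m = int(round(h / dt))                  # delay measured in steps
    t = np.linspace(0.0, T, n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        # delayed state: initial function phi on [-h, 0), grid value after
        y = phi(t[k] - h) if k < m else x[k - m]
        x[k + 1] = x[k] + dt * f(t[k], x[k], y, zeta)
    return t, x

# Illustrative scalar example: dx/dt = -zeta * x(t - h), with phi(t) = 1
f = lambda t, x, y, z: -z * y
t, x = solve_delay_system(f, phi=lambda t: 1.0, x0=1.0, zeta=0.5,
                          T=5.0, h=1.0, n_steps=500)
print(x[-1])
```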
We may now state an optimal parameter selection problem for the time-
delay system as follows:
Problem (P 7): Given the system (7.5.1), find a system parameter vector
ζ ∈ Z such that the cost functional
g0(ζ) = Φ0(x(T | ζ)) + ∫_0^T L0(t, x(t | ζ), x(t − h | ζ), ζ) dt (7.5.2)
where
Proof. Let ζ ∈ Z be any system parameter vector and let ρ be any pertur-
bation about ζ. Define
ζ(ε) = ζ + ερ. (7.5.8)
For brevity, let x(·) and x(·; ε) denote the solutions of the system (7.5.1)
corresponding to ζ and ζ(ε), respectively. Let y(·), z(·), y(·; ε), z(·; ε) be as
defined according to (7.5.4d)–(7.5.4e). Clearly, from (7.5.1), we have
x(t) = x^0 + ∫_0^t f(s, x(s), y(s), ζ) ds (7.5.9)
and
x(t; ε) = x^0 + ∫_0^t f(s, x(s; ε), y(s; ε), ζ(ε)) ds. (7.5.10)
Thus,
Δx(t) = dx(t; ε)/dε |_{ε=0}
= ∫_0^t [ (∂f(s, x(s), y(s), ζ)/∂x) Δx(s) + (∂f(s, x(s), y(s), ζ)/∂y) Δy(s) + (∂f(s, x(s), y(s), ζ)/∂ζ) ρ ] ds. (7.5.11)
Clearly,
Define
and
Ĥi = Ĥi(t, z(t), x(t), ζ, λ̂^i(t)), (7.5.14f)
where λi (t) is the solution of the costate system (7.5.4) corresponding to the
system parameter vector ζ, and λ̂i (t) is defined by (7.5.4f). From (7.5.13),
we have
Δgi(ζ) = dgi(ζ(ε))/dε |_{ε=0} = (∂gi(ζ)/∂ζ) ρ
= (∂Φi(x(T))/∂x) Δx(T) + ∫_0^T [ (∂L̄i/∂x) Δx(t) + (∂L̄i/∂y) Δy(t) + (∂L̄i/∂ζ) ρ ] dt. (7.5.15)
Δgi(ζ)
= (∂Φi(x(T))/∂x) Δx(T) + ∫_0^T [ (∂H̄i/∂x) Δx(t) + (∂Ĥi/∂x) Δx(t) − (λ^i(t))^⊤ (∂f̄/∂x) Δx(t)
− (λ̂^i(t))^⊤ (∂f̂/∂x) Δx(t) e(T − t − h) + (∂H̄i/∂ζ) ρ − (λ^i(t))^⊤ (∂f̄/∂ζ) ρ ] dt. (7.5.17)
dx(t)/dt = f(t, x(t), ζ), (7.6.1a)
where x = [x1 , . . . , xn ] ∈ Rn is the state vector, ζ = [ζ1 , . . . , ζs ] ∈ Rs is a
vector of system parameters, f = [f1 , . . . , fn ] ∈ Rn , and f : [0, T ] × Rn ×
Rs → Rn . The initial condition for (7.6.1a) is
x(0) = x0 , (7.6.1b)
where x0 ∈ Rn is given.
For each ζ ∈ Rs , let x(· | ζ) be the corresponding solution of the sys-
tem (7.6.1). We may now state the optimal parameter selection problem as
follows.
Problem (P 8): Given the system (7.6.1), find a system parameter vector
ζ ∈ Rs such that the cost functional
g0(ζ) = Φ0(x(τ1 | ζ), . . . , x(τM | ζ)) + ∫_0^T L0(t, x(t | ζ), ζ) dt (7.6.2)
Here, Φm : Rn × · · · × Rn → R, m = 0, 1, . . . , N , Lm : [0, T ] × Rn × Rs → R,
m = 0, 1, . . . , N , are given real-valued functions and the time points τi , 0 <
τi < T , i = 1, . . . , M , are referred to as the characteristic times. For standard
optimal parameter selection problems such as those considered in previous
sections, each canonical constraint (as well as the cost which corresponds to
m = 0) depends only on one such time point. Here, however, there may be
many such time points. For convenience, define τ0 = 0 and τM +1 = T.
We assume throughout that the following conditions are satisfied.
Hm(t, x(t | ζ), ζ, λ^m(t)) = Lm(t, x(t | ζ), ζ) + (λ^m(t))^⊤ f(t, x(t | ζ), ζ). (7.6.5)
For each m = 0, 1, . . . , N , the corresponding system (7.6.4) is called the
costate system for gm . Let λm (· | ζ) be the solution of the costate sys-
tem (7.6.4) corresponding to ζ ∈ Rs .
Remark 7.6.1 For each ζ ∈ Rs , the solution λm (· | ζ) is calculated as
follows.
Step 1. Solve the system (7.6.1), yielding x(· | ζ) on [0, T ].
Step 2. Solve the costate differential equations (7.6.4a) with the terminal condition (7.6.4c) backward from t = T to t = τM^+, yielding λ^m(· | ζ) on the subinterval (τM, T] and λ^m(τM^+ | ζ).
Step 3. Note that x(τk | ζ), k = 1, . . . , M, are known from Step 1 and that Φm is a given continuously differentiable function of x(τk | ζ), k = 1, . . . , M. Calculate λ^m(τM^− | ζ) by using the jump condition (7.6.4b) with k = M.
Step 4. Solve the costate differential equations (7.6.4a) backward from t = τM^− to t = τ_{M−1}^+ with the condition at t = τM^− being taken as λ^m(τM^− | ζ). This yields λ^m(· | ζ) on the subinterval (τM−1, τM] and λ^m(τ_{M−1}^+ | ζ).
Continuing backward in this manner, the solution of the costate system (7.6.4a) with the jump conditions (7.6.4b) and the terminal condition (7.6.4c) is thus obtained by combining λ^m(· | ζ) in [τk, τk+1], k = 0, 1, . . . , M.
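The backward sweep just described can be sketched in a few lines. The example below (dynamics, jump sizes and characteristic times are all illustrative) integrates a scalar costate backwards from t = T and applies a jump whenever a characteristic time is crossed.

```python
import numpy as np

# Sketch of the sweep in Remark 7.6.1: backward integration of a scalar
# costate with jump conditions at characteristic times (illustrative data).
def solve_costate_with_jumps(dlam_dt, lam_T, jumps, taus, T, n_steps):
    """Integrate dlam/dt backwards from t = T, applying
    lam(tau_k^-) = lam(tau_k^+) + jump_k whenever tau_k is crossed."""
    dt = T / n_steps
    t_grid = np.linspace(0.0, T, n_steps + 1)
    lam = np.empty(n_steps + 1)
    lam[-1] = lam_T
    pending = sorted(zip(taus, jumps), reverse=True)   # largest tau first
    for k in range(n_steps, 0, -1):
        lam[k - 1] = lam[k] - dt * dlam_dt(t_grid[k], lam[k])
        while pending and t_grid[k - 1] <= pending[0][0] < t_grid[k]:
            lam[k - 1] += pending.pop(0)[1]            # apply jump condition
    return t_grid, lam

# Example: dlam/dt = -lam, lam(T) = 0, jumps of size 1 at tau = 0.3 and 0.7
t, lam = solve_costate_with_jumps(lambda t, l: -l, 0.0,
                                  jumps=[1.0, 1.0], taus=[0.3, 0.7],
                                  T=1.0, n_steps=1000)
print(lam[0])
```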
where j = 1, . . . , s.
Note that
∂/∂ζj (dx(t)/dt) = d/dt (∂x(t)/∂ζj). (7.6.10)
The gradient of gm can be calculated as:
∂gm(ζ)/∂ζj = Σ_{l=1}^{M} (∂Φm(x(τ1), . . . , x(τM))/∂x(τl)) (∂x(τl)/∂ζj)
+ Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} [ (∂Hm/∂x)(∂x/∂ζj) + ∂Hm/∂ζj + (∂Hm/∂λ^m)(∂λ^m/∂ζj) − (∂(λ^m)^⊤/∂ζj) f(t, x, ζ) − (λ^m)^⊤ d/dt(∂x(t)/∂ζj) ] dt. (7.6.11)
Applying integration by parts to the last term of the right hand side
of (7.6.11) yields
∫_{τ_{k−1}}^{τ_k} (λ^m)^⊤ d/dt(∂x/∂ζj) dt = [ (λ^m)^⊤ (∂x/∂ζj) ]_{t=τ_{k−1}^+}^{t=τ_k^−} − ∫_{τ_{k−1}}^{τ_k} (dλ^m/dt)^⊤ (∂x/∂ζj) dt. (7.6.13)
From (7.6.12) and (7.6.13), it follows from (7.6.11) that
∂gm(ζ)/∂ζj = Σ_{l=1}^{M} (∂Φm(x(τ1), . . . , x(τM))/∂x(τl)) (∂x(τl)/∂ζj) − Σ_{k=1}^{M+1} [ (λ^m)^⊤ (∂x/∂ζj) ]_{t=τ_{k−1}^+}^{t=τ_k^−}
+ Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} [ (∂Hm/∂x + (dλ^m/dt)^⊤) (∂x/∂ζj) + ∂Hm/∂ζj ] dt. (7.6.14)
∂gm(ζ)/∂ζj
= Σ_{k=1}^{M} [ ∂Φm(x(τ1), . . . , x(τM))/∂x(τk) − (λ^m(τ_k^−))^⊤ + (λ^m(τ_k^+))^⊤ ] (∂x(τk)/∂ζj)
− (λ^m(T))^⊤ (∂x(T)/∂ζj) + Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} [ (∂Hm/∂x + (dλ^m/dt)^⊤) (∂x/∂ζj) + ∂Hm/∂ζj ] dt. (7.6.16)
By virtue of the definition of the costate system corresponding to gm given
in (7.6.4a) with the jump conditions (7.6.4b) and terminal condition (7.6.4c),
we obtain
∂gm(ζ)/∂ζj = Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} (∂Hm/∂ζj) dt. (7.6.17)
This completes the proof.
Note that the costates are discontinuous at the characteristic time points.
The sizes of the jumps are determined by the interior-point conditions given
by (7.6.4b).
7.7 Exercises
7.7.1 Consider Problem (P 1). Write down the corresponding definition for
a system parameter ζ ∗ to be a regular point of the constraints (7.2.4) in the
sense of Definition 3.1.3.
7.7.2 Consider Problem (P 1). Write down the corresponding first order nec-
essary conditions in the sense of Theorem 3.1.1.
7.7.3 Consider the dynamical system
dx(t)/dt = f(t, x(t), ζ)
x(0) = x^0(ζ),
where ζ is a system parameter yet to be determined, and both β(ζ) and x^0(ζ) are given scalar functions of the system parameter ζ. The problem is to find a system parameter ζ ∈ R such that the following cost function:
g(ζ) = ∫_0^{β(ζ)} L(t, x(t), ζ) dt
is minimized.
(a) Reduce the problem to the one of fixed terminal time by rescaling the time
with respect to β(ζ), i.e., by setting t = β(ζ)τ . With reference to the fixed
terminal time problem, show that the gradient of the corresponding cost
functional, again denoted by g(ζ), is given by
∂g/∂ζ = λ(0) ∂x^0(ζ)/∂ζ + ∫_0^1 (∂H/∂ζ) dτ,
where
H = β(ζ) [L̃(τ, x(τ), ζ) + λ(τ) f̃(τ, x(τ), ζ)],
dλ(τ)/dτ = −∂H/∂x, λ(1) = 0,
dx(τ)/dτ = β(ζ) f̃(τ, x(τ), ζ), x(0) = x^0(ζ),
while
L̃(τ, x(τ), ζ) = L(β(ζ)τ, x(τ), ζ)
and
f̃(τ, x(τ), ζ) = f(β(ζ)τ, x(τ), ζ).
(b) It is also possible to derive the gradient formula directly. By going through
the steps given in the proof of Theorem 7.2.2, show that the gradient of
the cost functional g(ζ) is given by
∂g/∂ζ = L(β(ζ), x(β(ζ)), ζ) ∂β(ζ)/∂ζ + λ(0) ∂x^0(ζ)/∂ζ + ∫_0^{β(ζ)} (∂H/∂ζ) dt,
dλ(t)/dt = −∂H/∂x, λ(β(ζ)) = 0,
dx(t)/dt = f(t, x(t), ζ), x(0) = x^0(ζ),
where
H = L(t, x, ζ) + λ f(t, x, ζ).
(c) Are the results given in part (a) and part (b) equivalent?
7.7.4 (Optimal Design of Suspended Cable [251].) Consider a cable with its
own weight and a distributed load along its span. After appropriate statical
analysis and normalization, the total (non-dimensional) weight of the cable
is given by
Φ = ∫_0^1 (1/β) √(1 + S^2) √(1 + (dy(x)/dx)^2) dx,
where
d^2y(x)/dx^2 = α √((1 + (dy(x)/dx)^2)(1 + S^2)) + β,
y(0) = 0, dy(0)/dx = 0, dy(1)/dx = S, α = a given constant which relates the specific weight and the maximum permissible stress, β = an adjustable parameter representing the ratio of total loading to horizontal tension in the cable, and S = the maximum slope of the cable, which is also adjustable. The optimal design problem is to determine β and S such that Φ is minimized.
(a) Formulate the problem as an optimal parameter selection problem by setting y(x) = y1(x) and dy1(x)/dx = y2(x).
(b) Show that the problem is equivalent to
min (S − β)/(αβ)
subject to
dy2(x)/dx = α √((1 + (y2(x))^2)(1 + S^2)) + β,
y2(0) = 0, and y2(1) = S ⇒ g1(β, S) = y2(1) − S = 0.
(c) Write down the necessary conditions for optimality for the reduced prob-
lem in (b).
(d) Determine the gradient formulae of the cost functional and the equal-
ity constraint function g1 (β, S) of the problem in (b) with respect to the
decision variables β and S.
7.7.5 (Computation of Eigenvalues for Sturm-Liouville Boundary-Value
Problems.) Consider the well-known Sturm-Liouville problem
d/dx (p(x) dy(x)/dx) + q(x) y(x) + λ ω(x) y(x) = 0 (7.7.1a)
α1 y(0) + α2 dy(0)/dx = 0 (7.7.1b)
β1 y(1) + β2 dy(1)/dx = 0. (7.7.1c)
The functions p(x), q(x) and ω(x) are assumed to be continuous. In addition,
p(x) does not vanish in (0, 1). The problem is solvable only for a countable
number of distinct values of λ known as the eigenvalues. For each λ, the
corresponding solution y(x) is known as an eigenfunction of the problem.
Traditionally, the eigenvalues and eigenfunctions are obtained by using the
Rayleigh-Ritz method. Later approaches use finite difference and finite ele-
ment methods. This exercise illustrates how the problem can be solved, rather
easily, by posing it as an optimal parameter selection problem.
g1(λ) = β1 y(1) + β2 dy(1)/dx = 0. (7.7.2)
In principle, one can find the eigenvalues of the problem by solving for the zeros of (7.7.2). Alternatively, one can formulate the solution of (7.7.2) as an optimal parameter selection problem. The following functions may be useful:
Φ(g) = g^2
or
Φε(g) = 0, if g < −ε,
        (1/(2ε)) (g + ε)^2, if −ε ≤ g ≤ ε,
        g, if g > ε,
where ε is a small positive constant.
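One possible realization of this suggestion (our own sketch, for the special case p = ω = 1, q = 0, α1 = β1 = 1 and α2 = β2 = 0, i.e. y(0) = y(1) = 0) treats λ as the system parameter, integrates (7.7.1a) as an initial value problem, and minimizes Φ(g1(λ)) = g1(λ)^2 over a bracket containing the first eigenvalue π^2.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize_scalar

# Treat lambda as a system parameter: integrate the ODE as an IVP whose
# initial condition satisfies the left boundary condition, then drive the
# boundary defect g1(lambda) to zero by minimizing g1(lambda)^2.
def g1(lam):
    def rhs(x, z):                       # z = [y, y']; here p(x) = 1
        return [z[1], -lam * z[0]]       # (p y')' + lam * w * y = 0
    sol = solve_ivp(rhs, [0.0, 1.0], [0.0, 1.0], rtol=1e-10, atol=1e-12)
    y1, dy1 = sol.y[0, -1], sol.y[1, -1]
    return 1.0 * y1 + 0.0 * dy1          # beta1*y(1) + beta2*y'(1)

res = minimize_scalar(lambda lam: g1(lam) ** 2, bounds=(5.0, 15.0),
                      method='bounded')
print(res.x, np.pi ** 2)                 # first eigenvalue, approx pi^2
```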
7.7.6 Consider the optimal control problem involving the dynamical sys-
tem (7.2.1) and the cost functional (7.2.3). Assume that the control u has
the following special structure:
u = h(z, x),
u = −Kx,
dx(t)/dt = (A − BK) x(t)
with the initial condition
x(0) = x0 .
(a) Use Theorem 7.2.2 to show that the necessary condition for optimality is
∫_0^T (∂H/∂K) dt = 0, (7.7.3)
where
H = (1/2) x^⊤ [Q + K^⊤ R K] x + λ^⊤ (A − BK) x
and
dλ(t)/dt = −[Q + K^⊤ R K] x(t) − [A − BK]^⊤ λ(t)
with
λ(T) = 0.
(b) Prove that
∂/∂X [Tr(C X D X)] = (D X C)^⊤ + (C X D)^⊤
and
∂/∂X [Tr(E X)] = E^⊤,
where C, D, E and X are matrices of appropriate dimensions.
(c) Making use of the result given in (b), show that (7.7.3) is equivalent to:
∫_0^T R [K x(t) − R^{−1} B^⊤ λ(t)] (x(t))^⊤ dt = 0.
(d) Let λ(t) = S(t) x(t). Show that
dS(t)/dt + S(t)(A − BK) + (A − BK)^⊤ S(t) + Q + K^⊤ R K = 0
and
∫_0^T R [K − R^{−1} B^⊤ S(t)] x(t) (x(t))^⊤ dt = 0.
for all ζ ∈ Z.
7.7.9 Consider the time-lag optimal control problem described in Section 7.5,
but without the canonical equality and inequality constraints (i.e., with-
out (7.5.3a) and (7.5.3b)). The control u is assumed to take the structure
given by (7.3.2). Derive the gradient formulae for the cost functional with
respect to the variable switching times.
7.7.16 Explain in detail the equivalence of Problem (P5) and Problem (P6).
7.7.17 Consider Problem (P7) with two time-delays. State and show the
validity of the corresponding version of Theorem 7.5.1.
7.7.18 Can the time scaling transform be applied to Problem (P8)? Why?
Chapter 8
Control Parametrization for Canonical
Optimal Control Problems
8.1 Introduction
The methods reported in [75, 244, 245, 248, 249] and [253], as well as many
papers cited in the references list, are developed based on the control pa-
rameterization technique for solving various classes of optimal control prob-
lems. Basically, the method partitions the time interval [0, T ] into several
subintervals and the control variables are approximated by piecewise con-
stant or piecewise linear functions with pre-fixed switching times. Through
this process, the optimal control problem is approximated by a sequence
of optimal parameter selection problems. Each of these optimal parameter
selection problems can be viewed as a mathematical programming problem
and is hence solvable by existing optimization techniques. The software pack-
age MISER3.3 (both FORTRAN and ‘Visual Fortran’ or ‘Matlab version of
MISER’) [104] was developed by implementing the control parametrization
method. The Visual MISER [295] is now available. Many practical problems
have been solved using this approach. See relevant references in the refer-
ence list. Intuitively, the optimal parameter selection problem with a finer
partition will yield a more accurate solution to the original optimal con-
trol problem. Convergence results are first obtained in [260] for a class of
optimal control problems involving linear time-lag systems subject to lin-
ear control constraints. Subsequently, a number of control parametrization
type algorithms with associated proof of convergence have been developed in
[240, 244–246, 248, 253, 279, 280], and the relevant references cited therein.
Many of these results are included in [253]. Although convergence analysis
may or may not be of serious consequence for implementation purposes, it
nevertheless provides important insight concerning the performance of an al-
gorithm. Thus, it has become a widely accepted requirement for any new
algorithmic development.
dx(t)/dt = f(t, x(t), u(t)), (8.2.1a)
where x = [x1 , . . . , xn ] ∈ Rn , u = [u1 , . . . , ur ] ∈ Rr , are, respectively, the
state and control vectors, f = [f1 , . . . , fn ] ∈ Rn and f : [0, T ] × Rn × Rr →
Rn . The initial condition for the differential equation (8.2.1a) is
x(0) = x0 , (8.2.1b)
U = U1 ∩ U2 . (8.2.3)
where σ^{p,k} ∈ U and χI denotes the indicator function of I, defined by
χI(t) = 1, t ∈ I,
        0, elsewhere. (8.3.1b)
and
αi ≤ σip,k ≤ βi , i = 1, . . . , r, k = 1, . . . , np , (8.3.3b)
respectively. Let Ξ p be the set of all those σ p vectors that satisfy the con-
straints (8.3.3). Clearly, for each control up ∈ U p , there exists a unique control
parameter vector σ p ∈ Ξ p such that the relation (8.3.1) is satisfied and vice
versa.
With up ∈ U p , system (8.2.1) takes the form:
dx(t)/dt = f̃(t, x(t), σ^p), t ∈ [0, T] (8.3.4a)
x(0) = x^0, (8.3.4b)
where
f̃(t, x(t), σ^p) = f(t, x(t), Σ_{k=1}^{np} σ^{p,k} χ_{I_k^p}(t)). (8.3.4c)
Let x(·|σ p ) be the solution of the system (8.3.4) corresponding to the con-
trol parameter vector σ p ∈ Ξ p in the sense that it satisfies the differential
equation (8.3.4a) a.e. on (0, T ] and the initial condition (8.3.4b).
and
Gi(σ^p) = Φi(x(τi | σ^p)) + ∫_0^{τi} L̃i(t, x(t | σ^p), σ^p) dt ≤ 0, i = Ne + 1, . . . , N,
Let Ω p be the subset of Ξ p such that the constraints (8.3.5) are satis-
fied. Furthermore, let F p be the subset of U p , which consists of all those
corresponding piecewise constant controls of the form (8.3.1a). We may now
specify the approximate problem (P1 (p)) as follows.
Problem (P1 (p)): Find a control parameter vector σ p ∈ Ξ p such that the
cost function
G0(σ^p) = Φ0(x(T | σ^p)) + ∫_0^T L̃0(t, x(t | σ^p), σ^p) dt (8.3.6)
In this section, we present four lemmas that will be used to support the
convergence results in the next section.
u^p(t) = Σ_{k=1}^{np} σ^{p,k} χ_{I_k^p}(t), (8.4.1)
u^p → u (8.4.2a)
t1 ∈ I^{p+1}_{k(p+1)} ⊂ I^p_{k(p)}, ∀p,
and
|I^p_{k(p)}| → 0 as p → ∞.
Then,
{t1} = ∩_{p=1}^{∞} Ī^p_{k(p)},
Since almost all points of u(t) are regular points, we conclude that u^p → u a.e. in [0, T]. To prove (8.4.2b), we note that u is a bounded measurable function in [0, T]. Thus, it is clear from the construction of u^p that {u^p}_{p=1}^{∞} are uniformly bounded. Thus, the result follows from Theorem A.1.10.
Remark 8.4.1 Note that the second part of Lemma 8.4.1 remains valid if we take {u^p}_{p=1}^{∞} to be any bounded sequence of functions in L∞([0, T], R^r) and u is as defined in Lemma 8.4.1. The details are left to the reader as exercises.
Lemma 8.4.2 Let {u^p}_{p=1}^{∞} be a bounded sequence of functions in L∞([0, T], R^r). Then, the sequence {x(· | u^p)}_{p=1}^{∞} of the corresponding solutions of the system (8.2.1) is also bounded in L∞([0, T], R^n).
Proof. From (8.2.1a), we have
x(t | u^p) = x^0 + ∫_0^t f(s, x(s | u^p), u^p(s)) ds, (8.4.3)
|x(t | u^p)| ≤ N0 exp(KT),
Proof. Let the bound of the sequence {∥u^p∥∞}_{p=1}^{∞} be denoted by N0. It follows from Lemma 8.4.2 that there exists a constant N1 > 0 such that
∥x(· | u^p)∥∞ ≤ N1,
From (8.2.2), the partial derivatives of f (t, x, u) with respect to each com-
ponent of x and u are piecewise continuous in [0, T ] for each (x, u) ∈ B × V
and continuous in B × V for each t ∈ [0, T ], where B = {y ∈ Rn : |y| ≤ N1 }
and V = {z ∈ Rr : |z| ≤ N0 }. Thus, there exists a constant N2 > 0 such
that
|x(t | u^p) − x(t | u)| ≤ N2 ∫_0^t { |x(s | u^p) − x(s | u)| + |u^p(s) − u(s)| } ds.
Thus, both the conclusions of the lemma follow easily from Remark 8.4.1.
Lemma 8.4.4 Let {u^p}_{p=1}^{∞} denote the bounded sequence of functions in L∞([0, T], R^r) that converges to a function u a.e. in [0, T]. Then
The conclusion of this lemma then follows from Lemma 8.4.3, Remark 8.4.1,
Lemma 8.4.2, Assumptions 8.2.2 and 8.2.3 and Theorem A.1.10.
to the true optimal control in the weak∗ topology of L∞ ([0, T ], Rr ) if the dy-
namical system is linear and the cost functional is convex. For further detail,
see Chapter 8 of [253]. To state the required additional assumption, we need
the following preliminary definition.
Definition 8.5.1 A control parameter vector σ p ∈ Ξ p is said to be ε-
tolerated feasible if it satisfies the following ε-tolerated constraints:
− ε ≤ Gi (σ p ) ≤ ε, i = 1, . . . , Ne , (8.5.1a)
Gi (σ p ) ≤ ε, i = Ne + 1, . . . , N, (8.5.1b)
where Gi is defined by (8.3.5).
Let Ω p,ε be the subset of Ξ p such that the ε-tolerated constraints (8.5.1)
are satisfied; and furthermore, let F p,ε be the subset of U p,ε , which consists
of all those corresponding piecewise constant controls of the form (8.3.1a).
Clearly, Ω p ⊂ Ω p,ε (and hence F p ⊂ F p,ε ) for any ε > 0. We now consider
the ε-tolerated version of the approximate Problem (P1 (p)) as follows.
Problem (P1,ε (p)): Find a control parameter vector σ p ∈ Ω p,ε such that the
cost functional (8.3.6) is minimized over Ω p,ε .
Since Ω p ⊂ Ω p,ε for any ε > 0, it follows that
G0 (σ p,ε,∗ ) ≤ G0 (σ p,∗ )
for any ε > 0, where σ p,ε,∗ and σ p,∗ are optimal control parameter vectors of
Problems (P1,ε (p)) and (P1 (p)), respectively. Furthermore, let up,ε,∗ and up,∗
be the corresponding piecewise constant controls in the form of (8.3.1a) with
σ p replaced by σ p,ε,∗ and σ p,∗ , respectively. They are referred to as optimal
piecewise constant controls of Problems (P1,ε (p)) and (P1 (p)), respectively.
We can now specify the additional required assumption mentioned earlier.
Assumption 8.5.1 There exists an integer p0 such that
Note that Assumption 8.5.1 is not really restrictive from the practical
viewpoint. Indeed, a real practical problem is most likely solved numerically.
The problem formulation would clearly be in doubt if this assumption was
not satisfied. We are now in a position to present the convergence results in
the next two theorems.
Theorem 8.5.1 Let up,∗ be an optimal piecewise constant control of the
approximate Problem (P1 (p)). Suppose that the original Problem (P1 ) has an
optimal control u∗ . Then
Proof. Let up,ε,∗ be an optimal piecewise constant control of Problem (P1,ε (p)).
Then, it is clear from Assumption 8.5.1 that for any δ > 0, there exists an
ε0 > 0 such that
g0 (up,ε,∗ ) > g0 (up,∗ ) − δ (8.5.2)
for any ε, 0 < ε < ε0 , uniformly with respect to p > p0 . Let u∗,p be the
control defined from u∗ by (8.4.1). Then, for any ε, 0 < ε < ε0 , it follows
from Lemmas 8.4.1 and 8.4.3 and Assumptions 8.2.2 and 8.2.3 that there
exists an integer p1 > p0 such that
for all p ≥ p1 . On the other hand, by virtue of Lemmas 8.4.1 and 8.4.4, we
have
lim_{p→∞} g0(u^{∗,p}) = g0(u^∗). (8.5.6)
Theorem 8.5.2 Let u∗ be an optimal control of Problem (P1 ), and let up,∗
be an optimal piecewise constant control of the approximate Problem (P1 (p)).
Suppose that
up,∗ → ū,
a.e. on [0, T ]. Then, ū is also an optimal control of Problem (P1 ).
Proof. Since up,∗ → ū a.e. in [0, T ], it follows from Lemma 8.4.4 that
Next, it is easy to verify from Remark 8.4.1, Lemma 8.4.3 and Assump-
tions 8.2.2 and 8.2.3 that ū is also a feasible control of Problem (P1). On the
other hand, it follows from Theorem 8.5.1 that
Hence, the conclusion of the theorem follows easily from (8.5.8) and (8.5.9).
Gi(σ^p) = 0, i = 1, . . . , Ne, (8.6.1b)
Gi(σ^p) ≤ 0, i = Ne + 1, . . . , N, (8.6.1c)
ℓi(σ^{p,k}) = E^i σ^{p,k} − b_i ≤ 0, i = 1, . . . , q, k = 1, . . . , np, (8.6.1d)
αi ≤ σ^{p,k}_i ≤ βi, i = 1, . . . , r, k = 1, . . . , np, (8.6.1e)
In view of Section 7.3, we see that the derivations of the gradient formu-
lae for the cost functional and the canonical constraint functionals are the
same. For each i = 0, 1, . . . , N , the gradient of the corresponding Gi may be
computed using the following algorithm.
Algorithm 8.6.3 Given σ p ∈ Λp , proceed as follows.
Remark 8.6.2 During actual computation, very often the control parametriza-
tion is carried out on a uniform partition of the interval [0, T ], i.e.,
u^p_j(t) = Σ_{k=1}^{np} σ^{p,k}_j χ_k(t), j = 1, . . . , r, (8.6.6)
Note that a uniform partition for the parametrization is not a strict re-
quirement but rather a computational convenience. At times when it is known
a priori that the control changes rapidly over certain intervals and changes
slowly over others, it will be more effective to use a nonuniform partition.
Remark 8.6.3 Note that the gradient formulae for the cost and constraint
functionals can also be obtained using the variational approach as detailed in
Section 7.3.
Remark 8.6.4 Clearly, when np increases, the computational time required
to solve the corresponding approximate problem will increase with some power
of np . To overcome this difficulty, we propose to solve any given problem as
follows.
Let q be a small positive integer. To begin, we solve the approximate Prob-
lem (P (1)) with n1 = q. Let σ 1,∗ be the optimal solution so obtained, and
let u1,∗ be the corresponding control. Then, it is clear that u1,∗ is a subop-
timal control for the original Problem (P ). Next, we choose n2 = 2q, and
let σ02 denote the parameter vector that describes u1,∗ over a new partition
with n2 = 2q intervals. Clearly, σ02 is a feasible parameter vector for Prob-
lem (P (2)). We then solve Problem (P (2)) using σ02 as the initial guess. The
process continues until the cost reduction becomes negligible. Computational
experience indicates that the reduction in the cost value appears to be insignif-
icant for np > 20 in many problems. Also, by virtue of the construction of
the subsequent initial guess, the increase in CPU time is seldom drastic.
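A toy version of this refinement strategy is sketched below; the dynamics, tracking cost and quadrature are illustrative choices of ours. Each pass doubles the number of control intervals and reuses the previous optimal control, repeated on the finer partition, as the initial guess.

```python
import numpy as np
from scipy.optimize import minimize

# Toy version of the refinement strategy in Remark 8.6.4 (all data are
# illustrative): minimize g0(u) = int_0^1 (x(t) - sin(3t))^2 dt subject to
# dx/dt = u, x(0) = 0, with u piecewise constant on n_p equal subintervals.
def cost(sigma):
    n = len(sigma)
    dt, x, val = 1.0 / n, 0.0, 0.0
    for k in range(n):
        t0 = k * dt
        for s in np.linspace(t0, t0 + dt, 11):    # crude quadrature
            val += (x + sigma[k] * (s - t0) - np.sin(3 * s)) ** 2 * dt / 11
        x += sigma[k] * dt                        # state at the next knot
    return val

sigma = np.zeros(2)                               # start with n_p = 2
for level in range(4):                            # n_p = 2, 4, 8, 16
    sigma = minimize(cost, sigma).x               # solve the approximate problem
    print(len(sigma), cost(sigma))                # cost decreases with refinement
    sigma = np.repeat(sigma, 2)                   # previous optimum as new guess
```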
Remark 8.6.5 Note that the constraint transcription introduced in [241] is
used in the transformation of the continuous constraints in Remark 8.2.1(iv).
However, this constraint transcription has a serious disadvantage because the
canonical equality state constraint so obtained does not satisfy any constraint
qualification (see Remark 3.1.1). In particular, let us consider the constraint
specified in Remark 8.2.1(iv). Then,
∂G(σ^p)/∂σ^p = 0
if the parameter vector σ p is such that
where the function h is defined in Remark 8.2.1(iv). In this situation, the lin-
ear approximation of the constraint G is equal to zero for all search directions.
This, in turn, implies that the search direction, which is obtained from the
To illustrate the simple and yet efficient solution procedure outlined in the
previous sections, we now present the numerical results of applying this pro-
cedure to several examples. Note that the procedure has been implemented
in the software package MISER 3.3 (see [104]) that is used to generate the
results presented here and also those in later sections and chapters of the
text.
Example 8.7.1 (Bitumen Pyrolysis) This problem has been widely stud-
ied in the literature, more recently in the context of determining global op-
timal solutions of nonlinear optimal control problems (see [53] and the ref-
erences cited therein). The task is to find an optimal temperature profile in
a plug flow reactor with the following reactions involving components Ai ,
i = 1, . . . , 4, with reactions rates kj , j = 1, . . . , 5.
A1 →^{k1} A2, A2 →^{k2} A3, A1 + A2 →^{k3} A2 + A2,
A1 + A2 →^{k4} A3 + A2, A1 + A2 →^{k5} A4 + A2.
While a more comprehensive model of the problem [176] involves all of the
components, the version commonly presented in the literature (and which
we adopt here) includes components A1 and A2 only [53]. The aim is to
determine an optimal temperature profile to maximize the final amount of
A2 subject to bound constraints on the temperature. In standard form, this
problem can be stated as follows. Minimize
g0 (u) = −x2 (T )
subject to
dx1(t)/dt = −k1 x1(t) − (k3 + k4 + k5) x1(t) x2(t),
dx2(t)/dt = k1 x1(t) − k2 x2(t) + k3 x1(t) x2(t),
x1(0) = 1,
x2(0) = 0,
where
ki = ai exp(−(bi/R)/u), i = 1, . . . , 5,
and the values of ai , bi /R, i = 1, . . . , 5, are given in Table 8.7.1. We also have
the control constraint
Table 8.7.1
i | ln ai | bi/R
1 | 8.86  | 10,215.4
2 | 24.25 | 18,820.5
3 | 23.67 | 17,008.9
4 | 18.75 | 14,190.8
5 | 20.70 | 15,599.8
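For readers who wish to experiment, the following sketch simulates the dynamics of this example for a constant temperature profile and evaluates the cost g0(u) = −x2(T). The horizon T = 10 and the guess u ≡ 720 are illustrative assumptions on our part.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Simulation sketch for Example 8.7.1 with a constant temperature profile.
# The rate data follow Table 8.7.1; the horizon T = 10 and the constant
# guess u = 720 are illustrative assumptions, not the optimal solution.
ln_a = np.array([8.86, 24.25, 23.67, 18.75, 20.70])
b_R = np.array([10215.4, 18820.5, 17008.9, 14190.8, 15599.8])

def rates(u):
    return np.exp(ln_a - b_R / u)        # k_i = a_i * exp(-(b_i/R)/u)

def rhs(t, x, u_func):
    k = rates(u_func(t))
    dx1 = -k[0] * x[0] - (k[2] + k[3] + k[4]) * x[0] * x[1]
    dx2 = k[0] * x[0] - k[1] * x[1] + k[2] * x[0] * x[1]
    return [dx1, dx2]

u_const = lambda t: 720.0                # a feasible temperature guess
sol = solve_ivp(rhs, [0.0, 10.0], [1.0, 0.0], args=(u_const,),
                rtol=1e-8, atol=1e-10)
print(-sol.y[1, -1])                     # cost g0(u) = -x2(T)
```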
[Figure 8.7.1: optimal state trajectories x1, x2 (left) and the piecewise constant optimal control u (right) versus t for Example 8.7.1.]
we choose here, other initial guesses for the control can lead to one of the
local optimal solutions. To illustrate the effect of the size of the partition
for the piecewise constant control, we also solved the problem assuming a
uniform piecewise constant partition with intervals for the control. A slightly
improved objective function value of −0.353717047 is obtained with this re-
fined partition, although the shape of the optimal control function is now
more well-defined (compare Figures 8.7.1 and 8.7.2).
[Figure 8.7.2: optimal state trajectories x1, x2 (left) and the refined piecewise constant optimal control u (right) versus t for Example 8.7.1.]
subject to
dx(t)/dt = 0.5 u(t) − 0.1 x(t), x(0) = 0,
with the control constraint
[Figure: optimal state x (left) and piecewise constant optimal control u (right) versus t.]
for the optimal control does not happen to coincide with one of the fixed
knot points in the chosen partition. While an increasingly finer partition can
yield improved solutions, a much better way to solve bang-bang and related
optimal control problems numerically is discussed in Section 8.9.
dx(t)/dt = f(t, x(t), ζ, u(t)), (8.8.1a)
where x = [x1, . . . , xn]^⊤ ∈ R^n, ζ = [ζ1, . . . , ζs]^⊤ ∈ R^s, u = [u1, . . . , ur]^⊤ ∈ R^r are, respectively, the state, system parameter and control vectors. We have f = [f1, . . . , fn]^⊤ ∈ R^n with f : R × R^n × R^s × R^r → R^n. The
initial condition for the differential equation (8.8.1a) is
where Φi : Rn × Rs → R, i = 0, 1, . . . , N, and Li : R × Rn × Rs × Rr → R,
i = 0, 1, . . . , N , are given real-valued functions. As before, τi ≤ T is referred
to as the characteristic time for the i-th constraint.
We assume throughout this section that the corresponding versions of
Assumptions (8.2.1)–(8.5.1) are satisfied. We now apply the concept of control
dx(t)/dt = f̃(t, x(t), ζ, σ^p), (8.8.5a)
x(0) = x^0(ζ), (8.8.5b)
where
f̃(t, x(t), ζ, σ^p) = f(t, x(t), ζ, Σ_{k=1}^{np} σ^{p,k} χ_{I_k^p}(t)). (8.8.5c)
and
Gi(ζ, σ^p) = Φi(x(T | ζ, σ^p), ζ) + ∫_0^T L̃i(t, x(t | ζ, σ^p), ζ, σ^p) dt ≤ 0, i = Ne + 1, . . . , N, (8.8.6b)
respectively, where
L̃i(t, x, ζ, σ^p) = Li(t, x, ζ, Σ_{k=1}^{np} σ^{p,k} χ_{I_k^p}(t)). (8.8.6c)
Let Dp be the set that consists of all those combined vectors (ζ, σ p ) in Z ×Ξ p
that satisfy the constraints (8.8.6a) and (8.8.6b). Furthermore, let B p be the
corresponding subset of Z × U p . We are now in a position to specify an
approximate version of Problem (Q) as follows.
Problem (Q(p)): Subject to system (8.8.5), find a combined vector (ζ, σ p ) ∈
Dp such that the cost function
G₀(ζ, σᵖ) = Φ₀(x(T |ζ, σᵖ), ζ) + ∫₀ᵀ L̃₀(t, x(t|ζ, σᵖ), ζ, σᵖ) dt   (8.8.7)
is minimized over Dp .
Problem (Q(p)) can also be stated in the following form.
Minimize
G0 (ζ, σ p ) (8.8.8a)
subject to
Gi (ζ, σ p ) = 0, i = 1, . . . , Ne , (8.8.8b)
Gi (ζ, σ p ) ≤ 0, i = Ne + 1, . . . , N, (8.8.8c)
ℓi(σ^{p,k}) = Eⁱ σ^{p,k} − bi ≤ 0,   i = 1, . . . , q,   k = 1, . . . , np,   (8.8.8d)
αi ≤ σip,k ≤ βi , i = 1, . . . , r, k = 1, . . . , np , (8.8.8e)
ai ≤ ζi ≤ bi , i = 1, . . . , s, (8.8.8f)
where G0 is defined by (8.8.7), Gi , i = 1, . . . , N , are defined by (8.8.6a)
and (8.8.6b), while (8.8.8d)–(8.8.8f) are the constraints that specify the set
Z × Ξᵖ. The constraints (8.8.8e)–(8.8.8f) are known as the boundedness
constraints in nonlinear programming. Let Λᵖ be the set that
consists of all those (ζ, σ p ) ∈ Rs × Rrnp such that the boundedness con-
straints (8.8.8e)–(8.8.8f) are satisfied.
This nonlinear mathematical programming problem in the control parame-
ter vectors can be solved by using any nonlinear optimization technique, such
as the sequential quadratic programming (SQP) approach (see Section 3.5).
In solving the nonlinear optimization problem (8.8.8) via SQP, we choose an
initial control parameter vector (ζ, σ p )(0) ∈ Λp to initialize the SQP process.
Then, for each (ζ, σ p )(i) ∈ Λp , the values of the cost function (8.8.8a) and the
constraints (8.8.8b)–(8.8.8d) as well as their respective gradients are required
by SQP to generate the next iterate (ζ, σ p )(i+1) . Consequently, it gives rise
to a sequence of combined vectors. The optimal combined vector obtained
by the SQP process is then regarded as an approximate optimal combined
vector of Problem (Q(p)).
For each (ζ, σᵖ) ∈ Λᵖ, the values of the cost function G₀(ζ, σᵖ) and the
constraint functions Gi(ζ, σᵖ), i = 1, . . . , N, and ℓi(σ^{p,k}), i = 1, . . . , q,
k = 1, . . . , np, can be calculated in a manner similar to the corresponding
components of Algorithm 8.6.1, Algorithm 8.6.2 and Remark 8.6.1. The
gradients of ℓi(σ^{p,k}), i = 1, . . . , q, k = 1, . . . , np, are given in Remark 8.6.1.
A procedure similar to that described in Algorithm 8.6.3 is given below for
computing the gradient of Gi(ζ, σᵖ) for each i = 0, 1, . . . , N.
Algorithm 8.8.1 Let (ζ, σᵖ) ∈ Λᵖ be given.
Step 1. Solve the costate system (8.6.3) with σᵖ replaced by (ζ, σᵖ) backward
in time from t = τi to t = 0 (again τ₀ = T by convention). Let the
corresponding solution be denoted by λ̂i(·|ζ, σᵖ).
Step 2. The gradient is computed from
∂Gi(ζ, σᵖ)/∂σ = ∫₀^{τi} ∂Ĥi(t, x(t|ζ, σᵖ), ζ, σᵖ, λ̂i(t|ζ, σᵖ))/∂σ dt   (8.8.9a)
and
∂Gi(ζ, σᵖ)/∂ζ = (λ̂i(0|ζ, σᵖ))⊤ ∂x⁰(ζ)/∂ζ
+ ∫₀^{τi} ∂Ĥi(t, x(t|ζ, σᵖ), ζ, σᵖ, λ̂i(t|ζ, σᵖ))/∂ζ dt.
(8.8.9b)
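To make the costate recipe of Algorithm 8.8.1 concrete, the following is a minimal Python sketch (assuming NumPy and SciPy) for a deliberately simple toy instance, the scalar system dx/dt = −x + σ with cost G(σ) = x(T)². It is not the book's MISER implementation; it only illustrates the backward costate solve and the gradient integral of the form (8.8.9a), and checks the result against a finite difference.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy instance: G(sigma) = x(T)^2 for dx/dt = -x + sigma, x(0) = 1.
# Hamiltonian H = lambda * (-x + sigma), so dG/dsigma = int_0^T lambda dt,
# with costate dlambda/dt = -dH/dx = lambda and lambda(T) = 2 x(T).
T, sigma, x0 = 2.0, 0.5, 1.0

xs = solve_ivp(lambda t, x: [-x[0] + sigma], (0, T), [x0], dense_output=True)
xT = xs.y[0, -1]

# Costate solved backward in time from t = T to t = 0.
lam = solve_ivp(lambda t, l: [l[0]], (T, 0), [2.0 * xT], dense_output=True)

# Gradient as the integral of dH/dsigma = lambda over [0, T].
tt = np.linspace(0, T, 2001)
grad = np.trapz(lam.sol(tt)[0], tt)

# Finite-difference check.
def G(s):
    y = solve_ivp(lambda t, x: [-x[0] + s], (0, T), [x0]).y[0, -1]
    return y * y

fd = (G(sigma + 1e-6) - G(sigma - 1e-6)) / 2e-6
print(grad, fd)  # the two values should agree closely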
Theorem 8.8.2 Let (ζ^{p,∗}, ū^{p,∗}) be as defined in Theorem 8.8.1. Then,
under the assumptions of that theorem,
lim_{p→∞} |ζ^{p,∗} − ζ^∗| = 0.
In this subsection, our aim is to show that many different classes of optimal
control problems can be transformed into special cases of Problem (Q). The
following is a list of some of these transformations, but it is by no means
exhaustive. Readers are advised to exercise their ingenuity and initiative in
applying these transformations and devising new ones.
Note that the references for Sections 8.8.1 and 8.8.2 are from Section 6.8.1
and Section 6.8.2 of [253], respectively.
(i) Free terminal time problems (including minimum-time problems).
min_{u(·),T} g₀(u, T),
where
g₀(u, T) = Φ₀(x(T), T) + ∫₀ᵀ L₀(t, x(t), u(t)) dt
subject to the differential equation
dx(t)/dt = f(t, x(t), u(t)),   t ∈ (0, T],
with the initial condition
x(0) = x⁰
and terminal condition
Applying the time scaling transform t = τT, τ ∈ [0, 1], with x̂(τ) = x(τT)
and û(τ) = u(τT), the problem becomes
min_{T,û(·)} g₀(û, T),
where
g₀(û, T) = Φ₀(x̂(1), T) + T ∫₀¹ L₀(τT, x̂(τ), û(τ)) dτ,
subject to
dx̂(τ)/dτ = T f(τT, x̂(τ), û(τ)),
the initial condition
x̂(0) = x⁰,
and the terminal condition
Note that the transformed problem takes the form of Problem (Q) if we
treat T as a system parameter.
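The following Python sketch (assuming NumPy and SciPy) illustrates this device on a hypothetical scalar problem: minimize x(T)² + 0.1 T for dx/dt = u with a constant control. After the substitution t = τT, the horizon is fixed to [0, 1] and T is optimized as an ordinary parameter alongside the control value; the particular cost and dynamics are illustrative, not from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def cost(z):
    u, T = z
    # Scaled dynamics dxhat/dtau = T * f on the fixed horizon [0, 1].
    sol = solve_ivp(lambda tau, x: [T * u], (0.0, 1.0), [1.0])
    xT = sol.y[0, -1]
    return xT**2 + 0.1 * T

res = minimize(cost, x0=[0.0, 1.0], bounds=[(-1, 1), (0.1, 5.0)])
print(res.x)  # optimized control value and terminal time
```

The same pattern extends directly to piecewise constant controls: the scaled dynamics are integrated subinterval by subinterval, with T appearing as one more decision variable.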
(ii) Minimax optimal control problems.
The state dynamical equations are as in (8.2.1) but the cost functional
to be minimized takes the form
g₀(u) = max_{0≤t≤T} C(t, x(t), u(t)) + Φ̃₀(x(T)) + ∫₀ᵀ L₀(t, x(t), ζ, u(t)) dt,
where
Φ₀(x(T), S) = Φ̃₀(x(T)) + S.
The resulting problem, due to the additional continuous state constraint,
is not exactly in the form of Problem (Q). However, the techniques to be
introduced in Chapter 9 can readily be used to solve it.
(iii) Problems with periodic boundary conditions.
The cost functional and the state dynamical equations are as described
by (8.2.4) and (8.2.1), but the initial and final state values are related
by
h(x(0), x(T )) = 0. (8.8.10)
In this case, we can introduce a system parameter vector ζ ∈ Rn and
put
x(0) = ζ.
Then the constraint (8.8.10) is equivalent to
h(ζ, x(T )) = 0,
du(t)/dt = v(t)
with the initial condition u(0) = ζu, where v(t) is piecewise constant; or,
for piecewise quadratic controls,
du(t)/dt = v(t),
dv(t)/dt = w(t),
together with the initial conditions
u(0) = ζu,   v(0) = ζv,
where w(t) is piecewise constant, u(t) and v(t) are effectively new state vari-
ables and ζu and ζv are additional system parameter vectors to be optimized.
Clearly, we can extend this process further to generate piecewise polynomial
optimal controls with any desired degree of smoothness.
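The following is a minimal sketch (in Python, assuming NumPy and SciPy) of this integrator-augmentation device: a piecewise constant w(t) is fed through du/dt = v, dv/dt = w, so the generated u(t) is piecewise quadratic and C¹. The knot locations, w-values and the parameters ζu, ζv are illustrative placeholders; in a full problem the plant states would simply be appended to the augmented state vector.

```python
import numpy as np
from scipy.integrate import solve_ivp

knots = np.array([0.0, 1.0, 2.0, 3.0])
w_vals = np.array([1.0, -2.0, 1.5])   # piecewise constant w on each interval
zeta_u, zeta_v = 0.0, 0.5             # extra system parameters u(0), v(0)

def w(t):
    k = min(np.searchsorted(knots, t, side="right") - 1, len(w_vals) - 1)
    return w_vals[k]

# Augmented "control generator" states (u, v).
sol = solve_ivp(lambda t, y: [y[1], w(t)], (0.0, 3.0), [zeta_u, zeta_v],
                max_step=0.01, dense_output=True)
print("u(3) =", sol.y[0, -1], " v(3) =", sol.y[1, -1])
```

Stacking further integrators in the same way yields piecewise polynomial controls of any desired degree of smoothness, exactly as described above.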
the initial position to the final equilibrium position. The robot is controlled
by the thrust of the two jets, i.e., u1 and u2 , and these two control variables
are subject to boundedness constraints. This problem was studied earlier in
[218] and [270]; in [270], it was formulated and solved as an Lp-minimization
problem. In [13], this problem is formulated as an L2-minimization problem,
which is described formally below.
Given the dynamical system
dx1(t)/dt = x4(t),
dx2(t)/dt = x5(t),
dx3(t)/dt = x6(t),
dx4(t)/dt = (u1(t) + u2(t)) cos x3(t),
dx5(t)/dt = (u1(t) + u2(t)) sin x3(t),
dx6(t)/dt = α (u1(t) − u2(t)),
with initial and terminal conditions given, respectively, by
and
is minimized, where
" #
U = u = (u1 , u2 ) : |u1 (t)| ≤ 0.8, |u2 (t)| ≤ 0.4, ∀t ∈ [0, T ] .
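A minimal Python sketch of the free flying robot dynamics follows (assuming NumPy and SciPy). The value of α and the constant control pair used here are illustrative placeholders, and the zero initial state stands in for the stated initial conditions, which are not reproduced in this excerpt; the actual problem optimizes piecewise constant (u1, u2) subject to the bounds in U above.

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha = 0.2  # placeholder value; the text does not fix alpha here

def ffr_rhs(t, x, u1, u2):
    x1, x2, x3, x4, x5, x6 = x
    return [x4,
            x5,
            x6,
            (u1 + u2) * np.cos(x3),
            (u1 + u2) * np.sin(x3),
            alpha * (u1 - u2)]

x0 = np.zeros(6)  # placeholder for the stated initial state
sol = solve_ivp(ffr_rhs, (0.0, 12.0), x0, args=(0.5, -0.2))
print(sol.y[:, -1])
```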
Then, we consider the cases of np = 20, 30 and 40, where np is the number
of partition points. For each of these cases, the problem is solved using
the MISER software [104]. The optimization method used within the MISER
software is sequential quadratic programming (SQP). The optimal costs
obtained with different numbers np of partition points are given in Table 8.8.1.
From the results in Table 8.8.1, we see the convergence of the
suboptimal costs obtained using the control parametrization method.

Table 8.8.1: Approximate optimal costs for Example 8.8.1 with different np

np   Optimal cost
20   6.28983430
30   6.20898660
40   6.18595308
Note that this problem was solved in [13], where it was first discretized using
the Euler discretization scheme. Then, both the inexact restoration (IR) method
[113, 115, 182] and the Ipopt optimization software [277] were used to solve this
discretized problem. The optimal cost obtained with N = 1500 is 6.154193,
where N denotes the number of partition points used in the Euler discretization
scheme. Comparing this cost value with that obtained for the case of
np = 40 using the control parametrization method, we see that the two costs
are quite close. The approximate optimal controls and approximate optimal
state trajectories for the case of np = 40 obtained using the control
parametrization method are shown in Figure 8.8.1. Their trends are similar
to those obtained in [13].
Example 8.8.2 Consider a tubular chemical reactor of length L and plug
flow capacity v. We wish to carry out the following set of parallel reactions
A →(k1) B,
A →(k2) C,
where both reactions are irreversible. Assuming that the reactions are of first
order and the velocity constants are given by
ki = Ai exp(−Ei /RT ), i = 1, 2,
0 ≤ T (z) ≤ T̄ , 0 ≤ z ≤ L.
[Fig. 8.8.1: approximate optimal controls u1, u2 and approximate optimal state trajectories x1–x6 for Example 8.8.1 with np = 40]
dx1(y)/dy = −(u(y) + β(u(y))^p) x1(y),
dx2(y)/dy = u(y) x1(y),
and
0 ≤ u(y) ≤ ū,   0 ≤ y < 1.
Fig. 8.8.2: Computed optimal solution for Example 8.8.2 with 100
subintervals
We can deal with this aspect of the problem by putting x1 (1) = ζ1 and
x2 (1) = ζ2 where ζ1 and ζ2 are system parameters, both of which are re-
stricted to [0, 1]. Hence, we have two equality constraints as follows:
g1 = x1 (1) − ζ1 = 0,
g2 = x2 (1) − ζ2 = 0.
subject to
dx1(t)/dt = x2(t),   x1(0) = −5,
dx2(t)/dt = −x1(t) + (1.4 − 0.14(x2(t))²) x2(t) + u(t),   x2(0) = −5.
Instead of an open-loop optimal control, the aim is to construct state-
dependent suboptimal feedback control of the form
u = ζ1 x1 + ζ2 x2 + ζ3 x21 + ζ4 x22 .
Note that the quadratic terms are included in the hope of capturing the
nonlinearities in the state differential equations. Once the form of u has
been substituted into the statement of the problem, we end up with a pure
optimal parameter selection problem involving 4 system parameters. We
choose lower and upper bounds of −10 and 10 for each of these 4 system
parameters. The problem is solved using the MISER software [104], yielding
a suboptimal feedback control of the above form.
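A minimal Python sketch of the closed-loop simulation for a candidate parameter vector follows (assuming NumPy and SciPy). In the optimal parameter selection problem these four numbers are the only decision variables; the values used below are arbitrary illustrative guesses, not the MISER solution.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Candidate feedback parameters (z1, z2, z3, z4); each is bounded in [-10, 10]
# in the actual problem.  These values are illustrative only.
zeta = np.array([-1.0, -0.5, 0.1, 0.05])

def rhs(t, x):
    x1, x2 = x
    u = zeta[0]*x1 + zeta[1]*x2 + zeta[2]*x1**2 + zeta[3]*x2**2
    return [x2, -x1 + (1.4 - 0.14 * x2**2) * x2 + u]

sol = solve_ivp(rhs, (0.0, 15.0), [-5.0, -5.0])
print(sol.y[:, -1])
```

Wrapping this simulation in an optimizer over zeta turns it into the pure optimal parameter selection problem described above.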
dx(t)/dt = f(t, x(t), u(t)),   (8.9.1a)
where x = [x1, . . . , xn]⊤ ∈ Rⁿ and u = [u1, . . . , ur]⊤ ∈ Rʳ are, respectively,
the state and control vectors, and f = [f1, . . . , fn]⊤ ∈ Rⁿ. The initial condition
for the system of differential equations (8.9.1a) is:
x(0) = x⁰,   (8.9.1b)
For each integer p ≥ 1, let the planning horizon [0, T] be partitioned into np
subintervals with np + 1 partition points denoted by
τ₀ᵖ, τ₁ᵖ, . . . , τ_{np}ᵖ,
where τ₀ᵖ = 0, τ_{np}ᵖ = T and τkᵖ, k = 1, . . . , np, are decision variables that are
subject to the following conditions:
τ_{k−1}ᵖ ≤ τkᵖ,   k = 1, 2, . . . , np.   (8.9.5)
Let
Sᵖ = {τ₀ᵖ, τ₁ᵖ, . . . , τ_{np}ᵖ}.   (8.9.6)
For all p > 1, the partition points in Sᵖ are chosen such that
Sᵖ ⊂ Sᵖ⁺¹.   (8.9.7)
with
σ^{p,k} ∈ U,   k = 1, . . . , np,   (8.9.9b)
and the set U defined by (8.2.3). Let
σᵖ = [(σ^{p,1})⊤, . . . , (σ^{p,np})⊤]⊤.   (8.9.10)
Definition 8.9.3 Any piecewise constant control of the form (8.9.8) satisfying
the conditions (8.9.9b) and (8.9.12) is called an admissible piecewise
constant control. Let Ũᵖ be the set of all such admissible piecewise constant
controls.
dx(t)/dt = f̃(t, x(t), σᵖ, τᵖ),   (8.9.13a)
x(0) = x⁰,   (8.9.13b)
where
f̃(t, x(t), σᵖ, τᵖ) = f(t, x(t), ∑_{k=1}^{np} σ^{p,k} χ_{[τ_{k−1}ᵖ, τkᵖ)}(t)).   (8.9.13c)
where, for i = 1, . . . , N,
L̃i(t, x(t|σᵖ, τᵖ), σᵖ, τᵖ) = Li(t, x(t), ∑_{k=1}^{np} σ^{p,k} χ_{[τ_{k−1}ᵖ, τkᵖ)}(t)).   (8.9.15)
dt(s)/ds = vᵖ(s)   (8.9.18a)
with the initial condition
t(0) = 0.   (8.9.18b)
Definition 8.9.6 A scalar function vᵖ(s) ≥ 0 for all s ∈ [0, 1] is called a
time scaling control if it is a non-negative piecewise constant function with
possible discontinuities at the fixed knots ξ₀ᵖ, ξ₁ᵖ, . . . , ξ_{np}ᵖ, that is,
vᵖ(s) = ∑_{k=1}^{np} θkᵖ χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s),   (8.9.19)
where θkᵖ ≥ 0, k = 1, . . . , np, are decision variables, and χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(·) is the
indicator function on the interval [ξ_{k−1}ᵖ, ξkᵖ) defined by (8.3.1b). Clearly,
vᵖ(s) depends on the choice of θᵖ = [θ₁ᵖ, . . . , θ_{np}ᵖ]⊤.
Definition 8.9.7 Let Θᵖ be the set containing all those θᵖ = [θ₁ᵖ, . . . , θ_{np}ᵖ]⊤
with θiᵖ ≥ 0, i = 1, . . . , np. Furthermore, let Vᵖ be the set containing all the
corresponding time scaling controls obtained from elements of Θᵖ via (8.9.19).
Define
ωᵖ(s) = uᵖ(t(s)),   x̂(s) = [(x(s))⊤, t(s)]⊤,   (8.9.22)
where, by abusing the notation, we write x(s) = x(t(s)). Then
ωᵖ(s) = ∑_{k=1}^{np} σ^{p,k} χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s),   s ∈ [0, 1].   (8.9.23)
G̃i(σᵖ, θᵖ) = Φi(x(1 | σᵖ, θᵖ)) + ∫₀¹ L̃i(x̂(s | σᵖ, θᵖ), σᵖ, θᵖ) ds = 0,
i = 1, . . . , Ne,   (8.9.24a)
G̃i(σᵖ, θᵖ) = Φi(x(1 | σᵖ, θᵖ)) + ∫₀¹ L̃i(x̂(s | σᵖ, θᵖ), σᵖ, θᵖ) ds ≤ 0,
i = Ne + 1, . . . , N,   (8.9.24b)
∑_{k=1}^{np} θkᵖ (ξkᵖ − ξ_{k−1}ᵖ) = T,   (8.9.24c)
where
L̃i(x̂(s), σᵖ, θᵖ) = vᵖ(s) Li(t(s), x(s), ∑_{k=1}^{np} σkᵖ χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s)).   (8.9.25)
Definition 8.9.10 Let Aᵖ be the subset of Ũᵖ × Vᵖ uniquely defined by
elements from Ω̃ᵖ:
ωᵖ(s) = ∑_{k=1}^{np} σ^{p,k} χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s),   s ∈ [0, 1],   (8.9.26a)
vᵖ(s) = ∑_{i=1}^{np} θiᵖ χ_{[ξ_{i−1}ᵖ, ξiᵖ)}(s),   s ∈ [0, 1],   (8.9.26b)
with (σᵖ, θᵖ) ∈ Ω̃ᵖ.
The equivalent transformed optimal parameter selection problem may now
be stated as follows.
Problem (P3(p)). Subject to the system of differential equations
dx̂(s)/ds = f̂(x̂(s), σᵖ, θᵖ),   (8.9.27a)
where
f̂(x̂(s), σᵖ, θᵖ) = vᵖ(s) f(t(s), x(s), ∑_{k=1}^{np} σkᵖ χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s)),   (8.9.27b)
find a (σᵖ, θᵖ) ∈ Ω̃ᵖ such that the cost function is minimized over Ω̃ᵖ.
Remark 8.9.2 Note that in the transformed problem (P3 (p)), only the knots
contribute to the discontinuities of the state differential equation. Thus, all
locations of the discontinuities of the state differential equation are known
and fixed during the optimization process. These locations will not change
from one iteration to the next during the optimization process. Even when
two or more of the original switching times coalesce, the number of these
locations remains unchanged in the transformed problem. Furthermore, the
gradient formulae of the cost function and constraint functions with respect
to the original switching times in the new transformed problem are provided
by the usual gradient formulae for the classical optimal parameter selection
problem as given in Section 7.2.
The basic idea behind the control parametrization time scaling transform
is to include the switching times as parameters to be optimized while, at the
same time, avoiding the numerical difficulties mentioned in Remark 8.9.1.
The time scaling control captures the discontinuities of the optimal control
if the number of knots in the partition of the new time horizon is greater
than or equal to the number of discontinuities of the optimal control. Since
the time scaling control parameters θkp , k = 1, . . . , np , are allowed to vary,
the control parametrization time scaling transform technique gives rise to a
larger search space and hence produces a better or at least equal approximate
optimal cost. Clearly, if the optimal control is a piecewise constant function
with discontinuities at t̄1, . . . , t̄M, then, by solving the transformed problems
with the number of knots greater than or equal to M, and by using (8.9.23),
we obtain the exact optimal control. For the general case, the convergence
results are much harder to establish.
Definition 8.9.12 Let Ω̃^{p,ε} be the subset of Ξᵖ × Θᵖ on which the ε-tolerated
constraints are satisfied. Then
G̃₀(σ^{p,ε,∗}, θ^{p,ε,∗}) ≤ G̃₀(σ^{p,∗}, θ^{p,∗})   (8.9.30)
for any ε > 0, where (σ^{p,ε,∗}, θ^{p,ε,∗}) and (σ^{p,∗}, θ^{p,∗}) are optimal vectors of
Problem (P3,ε(p)) and Problem (P3(p)), respectively.
Let the following additional condition be satisfied.
uᵖ(t) = ∑_{k=1}^{np} σ^{p,k} χ_{[τ_{k−1}ᵖ, τkᵖ)}(t),   (8.9.32)
where σᵖ = [(σ^{p,1})⊤, . . . , (σ^{p,np})⊤]⊤ is the same as for ωᵖ defined by
(8.9.23), and τᵖ = [τ₁ᵖ, . . . , τ_{np}ᵖ]⊤ is determined uniquely by θᵖ via
evaluating (8.9.18) at s = k/np, k = 1, . . . , np.
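The map from a time scaling control θᵖ back to the switching times is simple: t(s) is piecewise linear with slope θkᵖ on each uniform knot interval, so τk is a cumulative sum. The following is a minimal Python sketch (assuming NumPy) with illustrative θ values.

```python
import numpy as np

def switching_times(theta):
    """Recover tau_1, ..., tau_np from theta via t(k/np), as in (8.9.18)-(8.9.19):
    t(s) is piecewise linear with slope theta_k on [xi_{k-1}, xi_k) = [(k-1)/np, k/np)."""
    np_ = len(theta)
    widths = np.full(np_, 1.0 / np_)               # xi_k - xi_{k-1}
    return np.cumsum(np.asarray(theta) * widths)    # tau_k = t(k/np)

theta = [2.0, 0.5, 3.5, 2.0]   # illustrative; sum(theta)/np must equal T
print(switching_times(theta))   # here tau = [0.5, 0.625, 1.5, 2.0], so T = 2.0
```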
The functions ω^{p,∗} and v^{p,∗} are piecewise constant functions with possible
discontinuities at s = k/np, k = 1, 2, . . . , np − 1. We choose np such that
Define
S₁ = {k/n₁ : k = 1, 2, . . . , n₁ − 1}.   (8.9.34)
Then, we choose an integer n₂ such that n₂ > n₁, and let
S₂ = {k/n₂ : k = 1, 2, . . . , n₂ − 1}.   (8.9.35)
This process is continued in such a way that the following condition is satis-
fied.
Assumption 8.9.5 S p+1 ⊃ S p and limp→∞ S p is dense in [0, 1].
The procedure for solving Problem (P2 ) may be stated as follows. For each
p ≥ 1, we use the control parametrization time scaling transform technique
to obtain Problem (P3 (p)). In what follows, we present a computational pro-
cedure to solve Problem (P3 (p)), giving an approximate optimal solution of
Problem (P2 ).
Algorithm 8.9.1
Step 1. Solve Problem (P3(p)) as a standard optimal parameter selection
problem by using a computational procedure similar to that described
in Section 8.6. Let the optimal parameter vector obtained be denoted
by (σ^{p,∗}, θ^{p,∗}). Then, by Remark 8.9.3, we obtain the corresponding
piecewise constant control (ω^{p,∗}, ν^{p,∗}).
Step 2. If np ≥ M, where M is a pre-specified positive constant, go to Step 3.
Otherwise, go to Step 1 with np increased to n_{p+1}.
Step 3. Stop. Construct τ^{p,∗} = [τ₁^{p,∗}, . . . , τ_{np}^{p,∗}]⊤ from θ^{p,∗}. Then, obtain
u^{p,∗}(t) = ∑_{k=1}^{np} σ^{p,k,∗} χ_{[τ_{k−1}^{p,∗}, τk^{p,∗})}(t),   (8.9.36)
where σ^{p,∗} = [(σ^{p,1,∗})⊤, . . . , (σ^{p,np,∗})⊤]⊤. The piecewise constant
control u^{p,∗} obtained is an approximate optimal solution of
Problem (P).
We are now in a position to present the convergence results in the next
two theorems.
Theorem 8.9.1 Let (σ^{p,∗}, θ^{p,∗}) be an optimal parameter vector of
Problem (P3(p)), and let (u^{p,∗}, τ^{p,∗}) be the corresponding piecewise constant
optimal control and switching vector of Problem (P2(p)) such that
u^{p,∗}(t) = ∑_{k=1}^{np} σ^{p,k,∗} χ_{[τ_{k−1}^{p,∗}, τk^{p,∗})}(t),   (8.9.37)
where σ^{p,∗} = [(σ^{p,1,∗})⊤, . . . , (σ^{p,np,∗})⊤]⊤ and τ^{p,∗} = [τ₁^{p,∗}, . . . , τ_{np}^{p,∗}]⊤.
Suppose that Problem (P) has an optimal control u^∗. Then
Proof. Let (ω^{p,∗}, ν^{p,∗}) be determined uniquely by (σ^{p,∗}, θ^{p,∗}) via (8.9.23)
and (8.9.19). More precisely, solve (8.9.18) to obtain t^∗(s), s ∈ [0, 1]. Then,
by evaluating t^∗(s) at s = k/np, k = 0, 1, . . . , np, we obtain τ^{p,∗} =
[τ₁^{p,∗}, . . . , τ_{np−1}^{p,∗}]⊤. Now, by (8.9.22), u^{p,∗} can be written as
u^{p,∗}(t) = ∑_{k=1}^{np} σ^{p,k,∗} χ_{[τ_{k−1}^{p,∗}, τk^{p,∗})}(t),   (8.9.39)
where σ^{p,∗} = [(σ^{p,1,∗})⊤, . . . , (σ^{p,np,∗})⊤]⊤, τ₀^{p,∗} = 0, τ_{np}^{p,∗} = T and
τ^{p,∗} = [τ₁^{p,∗}, . . . , τ_{np−1}^{p,∗}]⊤.
Consider Problem (P3(p)) but with the switching vector τᵖ = [τ₁ᵖ, . . . ,
τ_{np−1}ᵖ]⊤ taken as fixed. Let this problem be referred to as Problem (P4(p)).
Clearly, Problem (P4(p)) can be considered as the type obtained in Section 8.3
through the application of the control parametrization technique. Let ū^{p,∗}
be an optimal control of Problem (P4(p)).
Note that the switching vector in Problem (P4(p)) is taken as fixed, while
Problem (P3(p)) treats the switching vector as part of the decision vector to
be optimized over. Thus, it is clear that
for all p ≥ 1. Now, by Theorem 8.5.1, it follows that for any δ > 0, there
exists a positive integer p̄ such that
for all p ≥ p̄. Combining (8.9.40) and (8.9.41), we obtain that, for all p ≥ p̄,
Taking the limit as p → ∞ and noting that δ > 0 is arbitrary, the conclusion
of the theorem follows readily.
Theorem 8.9.2 Let (σ p,∗ , θ p,∗ ) and (up,∗ , τ p,∗ ) be as defined in Theo-
rem 8.9.1. Suppose that
Next, it is easy to verify from Remark 8.4.1, Lemma 8.4.3, Assumptions 8.9.2
and 8.9.3 that û is also a feasible control of Problem (P2 ). On the other hand,
it follows from Theorem 8.9.1 that
Hence, the conclusion of the theorem follows easily from (8.9.44) and (8.9.45).
8.10 Examples
Example 8.10.1 Consider the free flying robot (FFR) problem as described
in Example 8.8.1. We shall solve the problem again for the cases
of np = 20, 30 and 40, where np is the number of partition points. For each
of these cases, the problem is solved using the MISER software [104]. The
optimal costs obtained with different numbers np of partition points are given
in Table 8.10.1.
Table 8.10.1: Approximate optimal costs for Example 8.10.1 with different
np and solved using time scaling

np   Optimal cost
20   6.22527918
30   6.19719836
40   6.18444730
From the results listed in Table 8.10.1, we see the convergence of the ap-
proximate optimal costs. The solutions obtained with the time scaling trans-
form are better than those obtained without time scaling transform (see Ta-
ble 8.8.1). Figure 8.10.1 shows the plots of the approximate optimal controls
and approximate optimal state trajectories obtained for the case of np = 40.
Their trends are similar to those obtained in [13] for N = 1000, where N de-
notes the number of partition points used in the Euler discretization scheme.
Furthermore, the cost of 6.18444730 obtained for the case of np = 40 is close
to the cost of 6.154193 obtained in [13] for N = 1000.
[Fig. 8.10.1: approximate optimal controls u1, u2 and approximate optimal state trajectories x1–x6 for Example 8.10.1 with np = 40 and solved using time scaling]
Fig. 8.10.2: Optimal state trajectories and control for Example 8.10.2 with
np = 20 and solved using time scaling
[Figure: computed state trajectories x1, x2 and control u over t ∈ [0, 15]]
8.11 Exercises
∂G(σᵖ)/∂σᵖ = 0
if σᵖ is such that
max_{0≤t≤T} h(t, x(t|σᵖ), σᵖ) = 0.
8.11.5 Consider the optimal control problem, where the cost functional
(8.2.4) is to be minimized subject to the dynamical system (8.2.1). Let U be
the class of admissible controls that consists of all C 1 functions from [0, T ] to
U , where U is as defined by (8.2.2b) with αi = −1, and βi = 1, i = 1, . . . , r.
Pose this problem in a form solvable by the usual control parametrization
technique. Derive the gradient formula for the cost functional with respect to
the control parameters and the resulting system parameters.
and
‖γ‖∞ = ess sup_{0≤t≤T} |γ(t)|.
Show that
‖γ‖p ↑ ‖γ‖∞, as p → ∞.
8.11.7 Show that the result of Lemma 8.4.1 remains valid if {up }∞ p=1 is a
bounded sequence of controls in L∞ ([0, T ], Rr ) that converges to u a.e. in
[0, T ], as p → ∞.
8.11.9 Can the control functions chosen in Definition 8.9.1 be just measur-
able, rather than Borel measurable functions?
8.11.10 Derive the gradient formula given by (8.6.5) and show that it is
equivalent to (8.6.8).
9.1 Introduction
In the real world, optimal control problems are often subject to constraints on
the state and/or control. These constraints can be point constraints and/or
continuous inequality constraints. Point constraints are expressed as functions
of the states at the terminal point or at some intermediate points of the
time horizon. Such point constraints can be handled without much difficulty.
Continuous inequality constraints, however, are expressed as functions of the
states and/or controls over the entire time horizon and hence are much more
difficult to handle. This chapter is devoted to devising computational methods
for solving optimal control problems subject to point and continuous
constraints. It is divided into two main sections. In Section 9.2, our focus is
on optimal control problems subject to continuous state and/or control
inequality constraints. Through the application of the control parametrization
technique, an approximate optimal parameter selection problem with
continuous state inequality constraints is obtained. Then, the constraint
transcription method introduced in Section 4.3 is applied to construct a
smooth approximate inequality canonical constraint for each continuous state
inequality constraint. In Section 9.3, the exact penalty function method
introduced in Section 4.4 is utilized to develop an effective computational
method for solving the class of optimal control problems considered there.
The main references for this chapter are
[103, 104, 133, 134, 143, 145, 146, 168, 171, 249, 253, 254, 259].
dx(t)/dt = f(t, x(t), u(t)),   (9.2.1a)
where x = [x1, . . . , xn]⊤ ∈ Rⁿ and u = [u1, . . . , ur]⊤ ∈ Rʳ are, respectively,
the state and control vectors, and f = [f1, . . . , fn]⊤ ∈ Rⁿ.
The initial condition for the system of differential equations (9.2.1a) is
x(0) = x⁰,   (9.2.1b)
Let the number np of the partition points be chosen such that n_{p+1} > np.
The control is now approximated in the form of a piecewise constant function:
uᵖ(t) = ∑_{k=1}^{np} σ^{p,k} χ_{[τ_{k−1}ᵖ, τkᵖ)}(t),   (9.2.7)
0 = ξ0p < ξ1p < ξ2p < · · · < ξnp p −1 < ξnp p = 1. (9.2.11b)
dt(s)/ds = vᵖ(s)   (9.2.12a)
with the initial condition:
t(0) = 0.   (9.2.12b)
In view of Definition 8.9.6, we note that vᵖ(s) is a time scaling control
defined by (8.9.19), that is,
vᵖ(s) = ∑_{k=1}^{np} θkᵖ χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s),   (9.2.13)
where θkᵖ ≥ 0, k = 1, . . . , np, are decision variables. Define θᵖ = [θ₁ᵖ, . . . , θ_{np}ᵖ]⊤
∈ R^{np} with θkᵖ ≥ 0, k = 1, . . . , np. Let Θᵖ be the set consisting of all such θᵖ,
and let Vᵖ be the set consisting of all the piecewise constant functions of the
form (9.2.13) with θᵖ ∈ Θᵖ. Clearly, each θᵖ ∈ Θᵖ defines uniquely a vᵖ ∈ Vᵖ,
and vice versa. By (9.2.13), it is clear from (9.2.12) that, for k = 1, . . . , np,
tᵖ(s) = ∫₀ˢ vᵖ(τ) dτ = ∑_{j=1}^{k−1} θjᵖ (ξjᵖ − ξ_{j−1}ᵖ) + θkᵖ (s − ξ_{k−1}ᵖ),
for s ∈ [ξ_{k−1}ᵖ, ξkᵖ].   (9.2.14)
In particular,
tᵖ(1) = ∫₀¹ vᵖ(τ) dτ = ∑_{k=1}^{np} θkᵖ (ξkᵖ − ξ_{k−1}ᵖ) = T   (9.2.15)
and
tᵖ(k/np) = ∫₀^{k/np} vᵖ(τ) dτ = ∑_{j=1}^{k} θjᵖ (ξjᵖ − ξ_{j−1}ᵖ) = τk,   k = 1, . . . , np.   (9.2.16)
Define
ωᵖ(s) = uᵖ(t(s)),   (9.2.17)
where uᵖ is in the form of (9.2.7). Clearly, ωᵖ is a piecewise constant control
which can be written as
ωᵖ(s) = ∑_{k=1}^{np} σ^{p,k} χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s),   s ∈ [0, 1].
The transformed system is
dx̂(s)/ds = f̂(s, x̂(s), σᵖ, θᵖ),   (9.2.19)
x̂(0) = [(x̃(0))⊤, 0]⊤ = [(x⁰)⊤, 0]⊤,   (9.2.20)
where
x̂(s) = [(x̃(s))⊤, t(s)]⊤,   (9.2.21a)
x̃(s) = x(t(s)),   (9.2.21b)
f̂(s, x̂(s), σᵖ, θᵖ) = vᵖ(s) f(t(s), x̃(s), ∑_{k=1}^{np} σkᵖ χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s)),   (9.2.21c)
Φi(x̃(1 | σᵖ, θᵖ)) ≤ 0,   i = 1, . . . , NT,   (9.2.22)
ĥi(s, x̂(s | σᵖ, θᵖ), σ^{p,k}) ≤ 0,   s ∈ Īk,   i = 1, . . . , NS,   k = 1, . . . , np,   (9.2.23)
and
Υ̃(θᵖ) = ∑_{k=1}^{np} θkᵖ (ξkᵖ − ξ_{k−1}ᵖ) − T = 0,   (9.2.24)
where
ĥi(s, x̃(s), σ^{p,k}) = hi(t(s), x̃(s), ∑_{k=1}^{np} σ^{p,k} χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s)),   s ∈ Īk,   (9.2.25)
and, for each k = 1, . . . , np, Īk denotes the closure of Ik with Ik = [ξ_{k−1}ᵖ, ξkᵖ).
Let Dᵖ be the corresponding subset of Ũᵖ × Vᵖ uniquely defined by
elements from Bᵖ. Clearly, for each (ωᵖ, vᵖ) ∈ Dᵖ, there exists a unique
(σᵖ, θᵖ) ∈ Bᵖ such that
ωᵖ(s) = ∑_{k=1}^{np} σ^{p,k} χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s),   s ∈ [0, 1],   (9.2.26a)
vᵖ(s) = ∑_{k=1}^{np} θkᵖ χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s),   s ∈ [0, 1].   (9.2.26b)
The transformed cost integrand is
L̃₀(s, x̂(s), σᵖ, θᵖ) = vᵖ(s) L₀(x̃(s), ∑_{k=1}^{np} σkᵖ χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s)).   (9.2.28)
This function is obtained by smoothing out the sharp corner of the function
max{ĥi(s, x̃(s | σᵖ, θᵖ), σ^{p,k}), 0}, which is of the form shown in Figure 4.2.1.
For each i = 1, . . . , NS, and k = 1, . . . , np, define
G̃i,ε(σᵖ, θᵖ) = ∫₀¹ L̃i,ε(s, x̃(s | σᵖ, θᵖ), ∑_{k=1}^{np} σ^{p,k} χ_{[ξ_{k−1}ᵖ, ξkᵖ)}(s)) ds.   (9.2.33)
G̃i,ε(σᵖ, θᵖ) = 0,   i = 1, . . . , NS.   (9.2.34)
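A minimal Python sketch of the constraint transcription smoothing follows. The specific quadratic form below is the smoothing commonly used with this method, stated here as an assumption since (9.2.32) itself is not reproduced in this excerpt: it replaces max{h, 0} by a C¹ function that vanishes for h ≤ −ε and agrees with h for h ≥ ε.

```python
import numpy as np

def L_eps(h, eps):
    """Smooth surrogate for max{h, 0}: 0 for h < -eps, (h+eps)^2/(4 eps) on
    [-eps, eps], and h for h > eps (continuously differentiable at both joins)."""
    h = np.asarray(h, dtype=float)
    out = np.where(h > eps, h, (h + eps)**2 / (4.0 * eps))
    return np.where(h < -eps, 0.0, out)

h = np.linspace(-0.1, 0.1, 5)
print(L_eps(h, eps=1e-2))
```

Since L_eps is non-negative, requiring its integral along the trajectory to vanish, as in (9.2.34), enforces the continuous constraint up to the tolerance controlled by ε.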
Let Bεᵖ be the feasible region of Problem (Pε(p)) containing all those
(σᵖ, θᵖ) ∈ Ξᵖ × Θᵖ such that the constraints (9.2.22), (9.2.34) and (9.2.24)
are satisfied.
Note that the equality constraints (9.2.34) fail to satisfy the usual constraint
qualification. To overcome this difficulty, we consider the second approximate
problem as follows:
Problem (Pε,γ(p)): Problem (P(p)) with (9.2.23) replaced by
G̃i,ε,γ(σᵖ) = −γ + G̃i,ε(σᵖ) ≤ 0,   i = 1, . . . , NS.   (9.2.35)
Note that the constraints (9.2.35) are already in canonical form, i.e., in the
form of (9.2.29), where the corresponding functions in (9.2.35) are equal to
the constant −γ in the present case.
We assume that the following condition is satisfied.
Assumption 9.2.5 For any combined vector (σ p , θ p ) in B p , there exists a
combined vector (σ p , θ p ) ∈ B̊ p such that
To relate the solutions of Problems (P (p)), (Pε (p)) and (Pε,γ (p)) as ε → 0,
we have the following lemma.
Lemma 9.2.1 For any ε > 0, there exists a γ(ε) > 0 such that for all γ,
0 < γ < γ(ε), if (σ_{ε,γ}ᵖ, θ_{ε,γ}ᵖ) ∈ Ξᵖ × Θᵖ satisfies the constraints of Problem
(Pε,γ(p)), i.e.,
Φi(x̃(1 | σ_{ε,γ}ᵖ, θ_{ε,γ}ᵖ)) ≤ 0,   i = 1, . . . , NT,   (9.2.36a)
G_{i,k,ε,γ}(σ_{ε,γ}ᵖ, θ_{ε,γ}ᵖ) ≤ 0,   i = 1, . . . , NS;   k = 1, . . . , np,   (9.2.36b)
Υ̃(θ_{ε,γ}ᵖ) = ∑_{k=1}^{np} θkᵖ (ξkᵖ − ξ_{k−1}ᵖ) − T = 0,   (9.2.36c)
κ_{i,ε} = (ε/16) min{T, ε/(2mi)}.   (9.2.38)
and
Biᵖ = {(σᵖ, θᵖ) ∈ Aᵖ : G̃i(σᵖ, θᵖ) = 0},   (9.2.41)
−γ + G̃i,ε(σᵖ, θᵖ) ≤ 0,   (9.2.42)
G̃i(σᵖ, θᵖ) > 0,   (9.2.43)
hi(t(s̃), x(s̃ | σᵖ, θᵖ)) > 0,   (9.2.44)
|Ii| ≥ min{T, z} = min{T, y/mi} ≥ min{T, ε/(2mi)}.   (9.2.47)
From the definition of G̃i,ε(σᵖ) and the fact that L̃i,ε(t(s), x(s | σᵖ)) is
non-negative, it follows from (9.2.42) that
0 ≥ −γ + G̃i,ε(σᵖ, θᵖ) = −γ + ∫₀ᵀ L̃i,ε(t(s), x(s | σᵖ, θᵖ)) ds
≥ −γ + ∫_{Ii} L̃i,ε(t(s), x(s | σᵖ, θᵖ)) ds
≥ −γ + (min_{s∈Ii} L̃i,ε(t(s), x(s | σᵖ, θᵖ))) |Ii|.   (9.2.48)
Remark 9.2.2 From Lemma 9.2.1, it is clear that the halving process of γ
in Step 4 of Algorithm 9.2.1 needs only to be carried out a finite number of
times. Let γ̃(ε) be the parameter corresponding to each ε > 0 obtained in the
halving process of γ in Step 4 of the algorithm. Clearly, (σ_{ε,γ̃(ε)}^{p,∗}, θ_{ε,γ̃(ε)}^{p,∗})
determines the piecewise constant control
u_{ε,γ̃(ε)}^{p,∗}(t) = ∑_{k=1}^{np} σ_{ε,γ̃(ε)}^{p,∗,k} χ_{[τ_{ε,γ̃(ε),k−1}^{p,∗}, τ_{ε,γ̃(ε),k}^{p,∗})}(t).   (9.2.50)
Here, we solve (9.2.12a) and (9.2.12b) with θᵖ taken as θ_{ε,γ̃(ε)}^{p,∗}, giving
t_{ε,γ̃(ε)}^{p,∗}(s) defined by (9.2.14). Then, by evaluating t_{ε,γ̃(ε)}^{p,∗}(s) at s = k/np,
k = 0, 1, . . . , np, we obtain τ_{ε,γ̃(ε),k}^{p,∗}, k = 0, 1, . . . , np.
Remark 9.2.4 By examining the proof of Lemma 9.2.1, we see that ε and
γ are closely related to each other. At the solution of a particular problem,
if a constraint is active over a large fraction of [0, 1], then we should choose
γ = O(ε). On the other hand, if the constraint is active only over a very
small fraction of [0, 1], then γ = O(ε²).
Problem (Pε,γ(p)) can be viewed as the following nonlinear optimization
problem, which is again referred to as Problem (Pε,γ(p)).
Minimize:
G₀(σᵖ, θᵖ)   (9.2.51)
subject to
Φi(x̃(1 | σᵖ, θᵖ)) ≤ 0,   i = 1, . . . , NT,   (9.2.52)
G_{i,k,ε,γ}(σᵖ, θᵖ) ≤ 0,   i = 1, . . . , NS;   k = 1, . . . , np,   (9.2.53)
Υ̃(θᵖ) = ∑_{k=1}^{np} θkᵖ (ξkᵖ − ξ_{k−1}ᵖ) − T = 0,   (9.2.54)
ℓi(σ^{p,k}) = Eⁱ σ^{p,k} − bi ≤ 0,   i = 1, . . . , q;   k = 1, . . . , np,   (9.2.55)
αi ≤ σi^{p,k} ≤ βi,   i = 1, . . . , r;   k = 1, . . . , np,   (9.2.56)
θkᵖ ≥ 0,   k = 1, . . . , np.   (9.2.57)
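As a rough illustration of how such a finite-dimensional problem is handed to an SQP-type solver, the following Python sketch uses scipy's SLSQP as a stand-in for the SQP routine inside MISER. The tiny objective and constraints are placeholders for G₀, the G_{i,k,ε,γ} and the time scaling consistency constraint (9.2.54) with T = 1; they are not the book's problem.

```python
import numpy as np
from scipy.optimize import minimize

np_, T = 4, 1.0
widths = np.full(np_, 1.0 / np_)   # xi_k - xi_{k-1} on a uniform knot set

def G0(z):                          # z = (sigma_1..sigma_np, theta_1..theta_np)
    sigma, theta = z[:np_], z[np_:]
    return np.sum(sigma**2) + 0.1 * np.sum((theta - 1.0)**2)

# Equality constraint (9.2.54): sum_k theta_k (xi_k - xi_{k-1}) = T.
cons = [{"type": "eq", "fun": lambda z: np.dot(z[np_:], widths) - T}]
bounds = [(-1.0, 1.0)] * np_ + [(0.0, None)] * np_   # (9.2.56)-(9.2.57)

z0 = np.concatenate([np.zeros(np_), np.ones(np_)])
res = minimize(G0, z0, method="SLSQP", bounds=bounds, constraints=cons)
print(res.x)
```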
Algorithm 9.2.2
For each given (σ p , θ p ) ∈ Λ, compute the solution x̂(· | σ p , θ p ) of sys-
tem (9.2.19)–(9.2.20) by solving the differential equations (9.2.19) forward
in time from s = 0 to s = 1 with the initial condition (9.2.20).
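A minimal Python sketch of this forward solve follows (assuming NumPy and SciPy). The plant f is a placeholder scalar system; σ and θ are illustrative piecewise constant control and time scaling parameters on a uniform knot set.

```python
import numpy as np
from scipy.integrate import solve_ivp

np_ = 4
sigma = np.array([0.3, -0.1, 0.5, 0.0])
theta = np.array([1.2, 0.8, 1.1, 0.9])

def piece(s):
    """Index k of the knot interval [(k-1)/np, k/np) containing s."""
    return min(int(s * np_), np_ - 1)

def rhs(s, xhat):                      # xhat = (x, t), as in (9.2.21a)
    k = piece(s)
    x, t = xhat
    f = -x + sigma[k]                  # placeholder for f(t, x, sigma^{p,k})
    return [theta[k] * f, theta[k]]    # dxhat/ds = v^p(s) * (f, 1)

sol = solve_ivp(rhs, (0.0, 1.0), [1.0, 0.0], max_step=1e-3)
print("x(1) =", sol.y[0, -1], " t(1) =", sol.y[1, -1])  # t(1) = sum(theta)/np
```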
The values of the terminal constraint functionals
Φi(x̃(1 | σᵖ, θᵖ)),   i = 1, . . . , NT,   (9.2.63)
and of the inequality state constraint functionals
G_{i,k,ε,γ}(σᵖ, θᵖ) = −γ + ∫_{ξ_{k−1}ᵖ}^{ξkᵖ} L̃i,ε(s, x̂(s | σᵖ, θᵖ), σ^{p,k}, θᵖ) ds,
i = 1, . . . , NS;   k = 1, . . . , np,   (9.2.64)
are calculated.
We now present an algorithm for calculating the gradients of these
functionals for each given (σᵖ, θᵖ) ∈ Λ. For this, we note from Section 7.3
that the derivations of the gradient formulae for the cost functional and the
constraint functionals are similar. These gradients can be computed using
the following algorithm:
Algorithm 9.2.4
For a given (σᵖ, θᵖ) ∈ Λ,
Step 1. Solve the costate systems corresponding to the cost functional (9.2.62),
the terminal constraint functionals (9.2.63) and the inequality state
constraint functionals (9.2.64), respectively, as follows:
(i) For the cost functional (9.2.62):
Solve the following costate system of differential equations:
dλ̂⁰(s)/ds = −(∂H̃₀(s, x̂(s | σᵖ, θᵖ), σᵖ, θᵖ, λ̂⁰(s))/∂x̂)⊤   (9.2.65)
(iii) For the (i, k)-th inequality state constraint functional given in (9.2.64):
Let λ̂^{i,ε}(· | σᵖ, θᵖ) be the solution of the costate system (9.2.71) and (9.2.72).
Step 2. The gradients of the cost functional (9.2.62), the terminal inequality
constraint functionals (9.2.63) and the inequality constraint functionals
(9.2.64) are computed, respectively, as follows:
(i) For the cost functional (9.2.62):
∂G₀(σᵖ, θᵖ)/∂σᵖ = ∫₀¹ ∂H̃₀(s, x̂(s | σᵖ, θᵖ), σᵖ, θᵖ, λ̂⁰(s | σᵖ, θᵖ))/∂σᵖ ds,   (9.2.74a)
∂G₀(σᵖ, θᵖ)/∂θᵖ = ∫₀¹ ∂H̃₀(s, x̂(s | σᵖ, θᵖ), σᵖ, θᵖ, λ̂⁰(s | σᵖ, θᵖ))/∂θᵖ ds.   (9.2.74b)
(ii) For the i-th terminal constraint functional given in (9.2.63):
∂Gi(σᵖ, θᵖ)/∂σᵖ = ∫₀¹ ∂H̃i(s, x̂(s | σᵖ, θᵖ), σᵖ, θᵖ, λ̂ⁱ(s | σᵖ, θᵖ))/∂σᵖ ds,   (9.2.75)
∂Gi(σᵖ, θᵖ)/∂θᵖ = ∫₀¹ ∂H̃i(s, x̂(s | σᵖ, θᵖ), σᵖ, θᵖ, λ̂ⁱ(s | σᵖ, θᵖ))/∂θᵖ ds.   (9.2.76)
(iii) For the (i, k)-th inequality constraint functional given in (9.2.64):
∂G_{i,ε}(σᵖ, θᵖ)/∂σᵖ = ∫₀¹ ∂H̃_{i,ε}(s, x̂(s | σᵖ, θᵖ), σᵖ, θᵖ, λ̂^{i,ε}(s | σᵖ, θᵖ))/∂σᵖ ds,   (9.2.77)
∂G_{i,ε}(σᵖ, θᵖ)/∂θᵖ = ∫₀¹ ∂H̃_{i,ε}(s, x̂(s | σᵖ, θᵖ), σᵖ, θᵖ, λ̂^{i,ε}(s | σᵖ, θᵖ))/∂θᵖ ds.   (9.2.78)
The corresponding approximate optimal control is then given by
u^{p,∗}(t) = ∑_{k=1}^{np} σ^{p,∗,k} χ_{[τ_{k−1}^{p,∗}, τk^{p,∗})}(t).   (9.2.79)
Theorem 9.2.2 Let up,∗ be as defined in Remark 9.2.3. Suppose that the
original Problem (P ) has an optimal control u∗ . Then,
subject to
dx1(t)/dt = x2(t),   (9.2.84a)
dx2(t)/dt = −x2(t) + u(t),   (9.2.84b)
with initial conditions
x1(0) = 0,   x2(0) = −1,   (9.2.84c)
and the continuous state inequality constraint
according to (9.2.32). Then, we consider the cases of np = 10, 20, 30 and 40,
where np is the number of partition points of the parametrized control as a
piecewise constant function. For each of these cases, the problem is solved
using the MISER software [104], first without using the time scaling transform
and then using the time scaling transform. The optimization method used
within the MISER software is sequential quadratic programming (SQP). Note
that, for each case, the constraint transcription method is used to ensure that
the continuous inequality constraint (9.2.85) is satisfied, where Lε is
constructed from h according to (9.2.32). For the case of np = 40, the
numerical results obtained from the use of the constraint transcription method
are summarized in Table 9.2.1.
Table 9.2.1: Numerical results for Example 9.2.1 with np = 40 and solved
using the constraint transcription method but without using time scaling

ε      γ             g₀(u)    ∫₀¹ Lε dt        ∫₀¹ max{h, 0} dt   Reason for termination
10⁻²   0.25 × 10⁻²   0.1709   −0.196 × 10⁻²   −0.15 × 10⁻²       Normal
10⁻²   0.79 × 10⁻³   0.1732   −0.744 × 10⁻³   −0.84 × 10⁻⁴       Normal
10⁻²   0.25 × 10⁻³   0.1751   −0.218 × 10⁻⁴   0                  Normal
10⁻³   0.25 × 10⁻⁴   0.1727   −0.205 × 10⁻⁵   −0.13 × 10⁻⁶       Normal
10⁻⁴   0.25 × 10⁻⁵   0.1727   −0.335 × 10⁻⁶   −0.10 × 10⁻⁶       Zero derivative
The cost values obtained without using the time scaling transform and
those obtained using the time scaling transform for the cases of np = 10, 20, 30
and 40 are summarized in Table 9.2.2 and Table 9.2.3, respectively.
Table 9.2.2: Approximate optimal costs for Example 9.2.1 with different
np and solved without using time scaling

np   g₀                   Terminated successfully
10   1.81473135 × 10⁻¹    Yes
20   1.73092969 × 10⁻¹    Yes
30   1.71593913 × 10⁻¹    Yes
40   1.70814337 × 10⁻¹    Yes
For the case of np = 40, Figures 9.2.1 and 9.2.2 show, respectively, the
approximate optimal control and the approximate optimal state trajectories.
For this example, the optimal solution has been obtained in [189]. Comparing
the optimal solution obtained in [189] with the solution obtained using our
method with np = 40, we see that their trends are similar. In particular,
their cost values are basically the same. This confirms the convergence of the
approximate optimal costs of our method.
Table 9.2.3: Approximate optimal costs for Example 9.2.1 with different
np and solved using time scaling

np   g₀                   Terminated successfully
10   1.78540476 × 10⁻¹    Yes
20   1.7275432 × 10⁻¹     Yes
30   1.71188006 × 10⁻¹    Yes
40   1.70634166 × 10⁻¹    Yes
Fig. 9.2.1: Optimal control for Example 9.2.1 with np = 40 and solved
using time scaling
subject to
dx1(t)/dt = 9 x4(t),   (9.2.88a)
Fig. 9.2.2: Optimal state trajectories for Example 9.2.1 with np = 40 and
solved using time scaling
dx2(t)/dt = 9 x5(t),   (9.2.88b)
dx3(t)/dt = 9 x6(t),   (9.2.88c)
dx4(t)/dt = 9 (u1(t) + 17.2656 x3(t)),   (9.2.88d)
dx5(t)/dt = 9 u2(t),   (9.2.88e)
dx6(t)/dt = −(9/x2(t)) [u1(t) + 27.0756 x3(t) + 2 x5(t) x6(t)],   (9.2.88f)
where
|u1(t)| ≤ 2.83374   (9.2.90)
and
−0.80865 ≤ u2(t) ≤ 0.71265,   ∀t ∈ [0, 1],   (9.2.91)
with continuous state inequality constraints. Here,
Lε(t, x(t)) = ∑_{i=1}^{4} L_{i,ε}(t, x(t)),
where, for each i = 1, . . . , 4, L_{i,ε}(t, x(t)) is constructed from hi(t, x(t))
according to (9.2.32), and hi(t, x(t)), i = 1, . . . , 4, are defined by
(9.2.94)–(9.2.97), respectively.
For the case of np = 40, the numerical results obtained from the use of
the constraint transcription method are summarized in Table 9.2.4.
Table 9.2.4: Numerical results for Example 9.2.2 with np = 40 and solved
using the constraint transcription method

ε      γ             g₀(u)          ∫₀¹ Lε dt        ∫₀¹ max{h, 0} dt   Reason for termination
10⁻²   0.25 × 10⁻²   0.55 × 10⁻²    −0.25 × 10⁻²    −0.799 × 10⁻⁵      Normal
10⁻³   0.79 × 10⁻³   0.53 × 10⁻²    −0.25 × 10⁻³    −0.795 × 10⁻¹³     Normal
10⁻⁴   0.25 × 10⁻⁴   0.53 × 10⁻²    −0.25 × 10⁻⁴    −0.365 × 10⁻⁸      Normal
The optimal cost values obtained without using time scaling and those
obtained using time scaling for the cases of np = 20, 30 and 40 are summarized
in the following tables. The reason for the termination of the optimization
software is normal for each of these cases, showing that the solution obtained
for each of the cases is such that the KKT conditions are satisfied, and so
are the continuous inequality constraints.
Table 9.2.5: Approximate optimal costs for Example 9.2.2 with different
np and solved using time scaling and without using time scaling
Figures 9.2.3 and 9.2.4 show, respectively, the approximate optimal con-
trols and approximate optimal state trajectories obtained using time scaling
transform.
Fig. 9.2.3: Optimal controls for Example 9.2.2 with np = 40 and solved
using time scaling
From Table 9.2.5, we see that, for each approximate problem, the optimization
software is unable to reduce the cost value further due to the smallness
of the cost value. We thus multiply the cost functional by a weighting factor
of 10³ to give g̃₀ = 10³ × g₀ and redo the calculation. The results
obtained are listed in Table 9.2.6, from which we can see the convergence
of the approximate optimal costs.

Table 9.2.6: Approximate optimal costs for Example 9.2.2 with different
np and solved with a weighting factor of 1000, using time scaling and
without using time scaling

Figures 9.2.5 and 9.2.6 show the approximate optimal state trajectories
and approximate optimal controls for the case of np = 40.
Based on the Euler discretization scheme, an algorithm using the inexact
restoration method is developed in [13]. The modeling language AMPL [F2] is
then used to implement the algorithm for constrained optimal control problems,
where the optimization software Ipopt [277] is used. This example is solved
by using the algorithm developed in [13], where the optimal control obtained
is shown to satisfy the optimality conditions. The optimal cost obtained in
[13] is 0.005139 with N = 1000, where N denotes the number of grid points
used in the Euler discretization. From Table 9.2.6, we see that the difference
between the optimal cost obtained in [13] and that obtained by our method
for the case of np = 40 is insignificant. In view of Figures 9.2.5 and 9.2.6, we
see that
Fig. 9.2.4: Optimal state trajectories for Example 9.2.2 with np = 40 and
solved with a weighting factor of 1000 but without time scaling
Fig. 9.2.5: Optimal control for Example 9.2.2 with np = 40 and solved
with a weighting factor of 1000 and using time scaling
In this section, the exact penalty function approach detailed in Section 4.4
will be used to develop a computational method for solving a class of optimal
control problems to be described below. The main references for this section
are [134, 147, 300, 301].
Consider the system of differential equations given by (9.2.1a) with initial
condition (9.2.1b) and terminal equality constraint given by
x(T ) = xf , (9.3.1)
Define
Fig. 9.2.6: Optimal state trajectories for Example 9.2.2 with np = 40 and
solved with a weighting factor of 1000 and using time scaling
is minimized.
We assume that the following conditions are satisfied.
Assumption 9.3.2 Φ0 is continuously differentiable with respect to x.
uᵖ(t | σ, τ) = ∑_{j=1}^{p} σʲ χ_{[τ_{j−1}, τ_j)}(t),   (9.3.5)
where τ_{j−1} ≤ τ_j, j = 1, . . . , p, with τ₀ = 0 and τ_p = T, σʲ = [σ₁ʲ, . . . , σrʲ]⊤ ∈
Rʳ, j = 1, . . . , p, σ = [(σ¹)⊤, . . . , (σᵖ)⊤]⊤ ∈ R^{pr}, and χ_I is the indicator
function of I defined by (7.3.3).
As uᵖ ∈ U, σʲ ∈ U for j = 1, . . . , p. Let Ξ be the set of all those
σ = [(σ¹)⊤, . . . , (σᵖ)⊤]⊤ ∈ R^{pr} such that σʲ ∈ U for j = 1, . . . , p.
The switching times τ_j, 1 ≤ j ≤ p − 1, are also regarded as decision
variables. The time scaling transform is employed to map these switching
times to the fixed time points k/p, k = 1, . . . , p − 1, on a new time
horizon [0, 1]. This is easily achieved through the following differential equation:
dt(s)/ds = υᵖ(s),   s ∈ [0, 1],   (9.3.6a)
with initial condition
t(0) = 0,   (9.3.6b)
where
υᵖ(s) = ∑_{j=1}^{p} θ_j χ_{[(j−1)/p, j/p)}(s).   (9.3.7)
For s ∈ [(k−1)/p, k/p),
t(s) = ∑_{j=1}^{k−1} θ_j/p + (θ_k/p)(ps − k + 1).   (9.3.8)
In particular,
τ_k = ∑_{j=1}^{k} θ_j/p   (9.3.9)
and
t(1) = ∑_{j=1}^{p} θ_j/p = T.   (9.3.10)
p
The approximate control given by (9.3.5) in the new time horizon [0, 1]
becomes
p
p p
ũ (s) = u (t(s)) = σ j χ[ j−1 , j ) (s), (9.3.11)
p p
j=1
dy(s)
= θk f t(s), y(s), σ k , s ∈ Jk , k = 1, . . . , p (9.3.12a)
ds
dt(s)
= υ p (s) (9.3.12b)
ds
y(0) = x0 and t(0) = 0, (9.3.12c)
where
y(s) = [y1 (s), . . . , yn (s)] ,
and $ %
x0 = x01 , . . . , x0n
.
The terminal conditions (9.3.1) and (9.3.10) become
dỹ(s)/ds = f̃(s, ỹ(s), σ, θ),   s ∈ [0, 1],   (9.3.15a)
ỹ(0) = ỹ⁰,   (9.3.15b)
ỹ(1) = ỹᶠ,   (9.3.15c)
where
ỹ(s) = [ỹ₁(s), . . . , ỹn(s), ỹ_{n+1}(s)]⊤   (9.3.16)
with
ỹ⁰ = [ỹ₁⁰, . . . , ỹn⁰, ỹ_{n+1}⁰]⊤
with ỹi⁰ = xi⁰, i = 1, . . . , n, ỹ_{n+1}⁰ = 0;
and
ỹᶠ = [ỹ₁ᶠ, . . . , ỹnᶠ, ỹ_{n+1}ᶠ]⊤   (9.3.19)
with ỹiᶠ = xiᶠ, i = 1, . . . , n, ỹ_{n+1}ᶠ = T.
To proceed further, let ỹ(· | σ, θ) denote the solution of system (9.3.15a)–
(9.3.15b) corresponding to (σ, θ) ∈ Ξ × Θ. Similarly, applying the time scaling
transform to the continuous inequality constraints (9.3.3) and the cost
functional (9.3.4) yields
hi(ỹ(s | σ, θ), σᵏ) ≤ 0,   ∀s ∈ J_k,   k = 1, . . . , p;   i = 1, . . . , N,   (9.3.20)
and
g̃₀(σ, θ) = Φ₀(y(1 | σ, θ)) + ∫₀¹ L̄₀(s, ỹ(s | σ, θ), σ, θ) ds,   (9.3.21)
respectively, where
L̄₀(s, ỹ(s | σ, θ), σ, θ) = υᵖ(s) L₀(t(s), y(s), ũᵖ(s)).   (9.3.22)
‖∂Υ(s, ỹ(s | σ, θ), σ, θ)/∂σ‖ ≤ K₂,   s ∈ J_k,   k = 1, . . . , p;   (σ, θ) ∈ Ξ × Θ,
‖∂Υ(s, ỹ(s | σ, θ), σ, θ)/∂θ‖ ≤ K₂,   s ∈ J_k,   k = 1, . . . , p;   (σ, θ) ∈ Ξ × Θ,
‖∂Υ(s, ỹ(s | σ, θ), σ, θ)/∂ỹ‖ ≤ K₂,   s ∈ J_k,   k = 1, . . . , p;   (σ, θ) ∈ Ξ × Θ,
Similarly, we define
Ωε = {(σ, θ, ε) ∈ Fε : ỹ(1 | σ, θ) − ỹᶠ = 0}   (9.3.25)
and
Ω₀ = {(σ, θ) ∈ F₀ : ỹ(1 | σ, θ) − ỹᶠ = 0}.   (9.3.26)
Clearly, Problem (P̂(p)) is equivalent to the following problem, which is
denoted as Problem (P̃(p)).
Problem (P̃(p)) Given system (9.3.15a) and (9.3.15b), find a (σ, θ) ∈ Ω₀
such that the cost functional (9.3.21) is minimized.
Then, by applying the exact penalty function introduced in Section 4.4,
we obtain a new cost functional defined below:
g̃₀^δ(σ, θ, ε) =
  g̃₀(σ, θ),   if ε = 0 and hi(ỹ(s), σᵏ) ≤ 0 (s ∈ J_k, k = 1, . . . , p),
  g̃₀(σ, θ) + ε^{−α} (Δ(σ, θ, ε) + Δ₁) + δ ε^β,   if ε > 0,
  +∞,   otherwise,   (9.3.27)
where
Δ(σ, θ, ε) = ∑_{i=1}^{N} ∑_{k=1}^{p} ∫_{J_k} [max{0, hi(ỹ(s | σ, θ), σᵏ) − ε^γ Wi}]² ds,   (9.3.28)
α and γ are positive real numbers, β > 2, and δ > 0 is a penalty parameter,
while Δ₁, which is referred to as the equality constraint violation, is defined
by
Δ₁ = ‖ỹ(1 | σ, θ) − ỹᶠ‖² = ∑_{i=1}^{n+1} (ỹi(1 | σ, θ) − ỹiᶠ)².   (9.3.29)
As the penalty parameter δ increases, ε must decrease so that the continuous
inequality constraints (9.3.20) and the equality constraints (9.3.15c) are
satisfied.
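A minimal Python sketch of evaluating the penalized cost (9.3.27)–(9.3.29) follows (assuming NumPy). The constraint samples, the terminal residual and the parameter values used here are illustrative placeholders, not from the text; h_samples[i, j] approximates hi along s, and y_err stands for ỹ(1) − ỹᶠ.

```python
import numpy as np

def penalized_cost(g0, h_samples, y_err, eps, delta,
                   alpha=1.0, beta=3.0, gamma=2.0, W=1.0, ds=1e-2):
    """Exact penalty cost: g0 if feasible and eps = 0; the penalized form
    g0 + eps^{-alpha}(Delta + Delta1) + delta*eps^beta for eps > 0."""
    if eps == 0.0:
        return g0 if np.all(h_samples <= 0.0) else np.inf
    viol = np.maximum(0.0, h_samples - eps**gamma * W)   # (9.3.28) integrand
    Delta = np.sum(viol**2) * ds                          # crude quadrature
    Delta1 = np.sum(np.asarray(y_err)**2)                 # (9.3.29)
    return g0 + eps**(-alpha) * (Delta + Delta1) + delta * eps**beta

print(penalized_cost(1.0, np.array([[0.1, -0.2, 0.05]]), [0.01, 0.0],
                     eps=0.1, delta=1e3))
```

In the method itself, ε is a decision variable optimized jointly with (σ, θ), and increasing δ drives ε, and with it the constraint violations, to zero.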
Before deriving the gradient of the cost functional of Problem (P̃δ(p)), we
rewrite the cost functional in the canonical form below:
g̃₀^δ(σ, θ, ε) = Φ₀(y(1 | σ, θ)) + ∫₀¹ L̄₀(s, ỹ(s | σ, θ), σ, θ) ds
+ ε^{−α} [∑_{i=1}^{N} ∫₀¹ [max{0, h̄i(s, ỹ(s | σ, θ), σ) − ε^γ Wi}]² ds
+ ∑_{i=1}^{n+1} (ỹi(1 | σ, θ) − ỹiᶠ)²] + δ ε^β
= Φ₀(y(1 | σ, θ)) + ε^{−α} ∑_{i=1}^{n+1} (ỹi(1 | σ, θ) − ỹiᶠ)² + δ ε^β
+ ∫₀¹ L̄₀(s, ỹ(s | σ, θ), σ, θ) ds
+ ε^{−α} ∑_{i=1}^{N} ∫₀¹ [max{0, h̄i(s, ỹ(s | σ, θ), σ) − ε^γ Wi}]² ds,   (9.3.30)
where
L̃₀(s, ỹ(s | σ, θ), σ, θ, ε) = L̄₀(s, ỹ(s | σ, θ), σ, θ)
+ ε^{−α} ∑_{i=1}^{N} [max{0, h̄i(s, ỹ(s | σ, θ), σ) − ε^γ Wi}]².   (9.3.33)
Now, the cost functional of Problem (P̃δ(p)) is in canonical form. As derived
in the proof of Theorem 7.2.2, the gradient formulas of the cost functional
(9.3.34) are given by the following theorem.
Theorem 9.3.1 The gradients of the cost functional g̃₀^δ(σ, θ, ε) with respect
to σ, θ and ε are
∂g̃₀^δ(σ, θ, ε)/∂σ = ∫₀¹ ∂H₀(s, ỹ(s | σ, θ), σ, θ, ε, λ⁰(s | σ, θ, ε))/∂σ ds,   (9.3.35)
∂g̃₀^δ(σ, θ, ε)/∂θ = ∫₀¹ ∂H₀(s, ỹ(s | σ, θ), σ, θ, ε, λ⁰(s | σ, θ, ε))/∂θ ds,   (9.3.36)
∂g̃₀^δ(σ, θ, ε)/∂ε = −α ε^{−α−1} [∑_{i=1}^{N} ∫₀¹ [max{0, h̄i(s, ỹ(s | σ, θ), σ) − ε^γ Wi}]² ds
+ ∑_{i=1}^{n+1} (ỹi(1 | σ, θ) − ỹiᶠ)²]
− 2γ ε^{γ−α−1} ∑_{i=1}^{N} ∫₀¹ max{0, h̄i(s, ỹ(s | σ, θ), σ) − ε^γ Wi} Wi ds + δ β ε^{β−1}
= ε^{−α−1} [−α ∑_{i=1}^{N} ∫₀¹ [max{0, h̄i(s, ỹ(s | σ, θ), σ) − ε^γ Wi}]² ds
+ 2γ ∑_{i=1}^{N} ∫₀¹ max{0, h̄i(s, ỹ(s | σ, θ), σ) − ε^γ Wi} (−ε^γ Wi) ds
− α ∑_{i=1}^{n+1} (ỹi(1 | σ, θ) − ỹiᶠ)²] + δ β ε^{β−1},   (9.3.37)
respectively, where H₀(s, ỹ(s | σ, θ), σ, θ, ε, λ⁰(s | σ, θ, ε)) is the Hamiltonian
function for the cost functional (9.3.34) given by
H₀(s, ỹ(s | σ, θ), σ, θ, ε, λ⁰(s | σ, θ, ε))
= L̃₀(s, ỹ(s | σ, θ), σ, θ, ε) + (λ⁰(s | σ, θ, ε))⊤ f̃(s, ỹ(s | σ, θ), σ, θ).   (9.3.38)
Remark 9.3.4 By Assumptions 9.3.1, 9.3.2 and 9.3.3 and Remarks 9.3.1 and
9.3.2, it follows from arguments similar to those given for the proof of
Lemma 8.4.2 that there exists a compact set Z ⊂ Rⁿ such that λ⁰(s | σ, θ, ε) ∈
Z for all s ∈ [0, 1], (σ, θ) ∈ Ξ × Θ and ε ≥ 0.
In this section, we shall show that, under some mild assumptions, if the
parameter δ_k is sufficiently large (δ_k → +∞ as k → +∞) and
(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) is a local minimizer of Problem (P̃_{δk}(p)), then
ε^{(k),∗} → ε^∗ = 0 and (σ^{(k),∗}, θ^{(k),∗}) → (σ^∗, θ^∗), with (σ^∗, θ^∗) being a
local minimizer of Problem (P̃(p)).
For every positive integer k, let (σ^{(k),∗}, θ^{(k),∗}) be a local minimizer of
Problem (P̃_{δk}(p)). To obtain our main result, we need the following lemma.
Lemma 9.3.1 Let (σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) be a local minimizer of Problem
(P̃_{δk}(p)). Suppose that g̃₀^{δk}(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) is finite and that ε^{(k),∗} > 0.
Then
(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) ∉ Ω_{ε^{(k),∗}},
Proof. Since (σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) is a local minimizer of Problem (P̃_{δk}(p))
and ε^{(k),∗} > 0, we have
∂g̃₀^{δk}(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗})/∂ε = 0.   (9.3.40)
On the contrary, we assume that the conclusion of the lemma is false. Then,
we have
hi(ỹ(s | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗}) ≤ (ε^{(k),∗})^γ Wi,
∀s ∈ J_j,   j = 1, . . . , p;   i = 1, . . . , N,   (9.3.41)
and
ỹ(1 | σ^{(k),∗}, θ^{(k),∗}) − ỹᶠ = 0.   (9.3.42)
Theorem 9.3.2 If (σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) → (σ^∗, θ^∗, ε^∗) as k → +∞, and
the constraint qualification is satisfied for the continuous inequality constraints
(9.3.20) at (σ, θ) = (σ^∗, θ^∗), then ε^∗ = 0 and (σ^∗, θ^∗) ∈ Ω₀.
Proof. From Lemma 9.3.1, it follows that (σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) ∉ Ω_{ε^{(k),∗}}.
Furthermore, in terms of (9.3.27), we have
∂g̃₀^δ(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗})/∂σ
= ∫₀¹ ∂L̄₀(s, ỹ(s | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗}, θ^{(k),∗})/∂σ ds
+ 2(ε^{(k),∗})^{−α} ∑_{i=1}^{N} ∫₀¹ max{0, h̄i(s, ỹ(s | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗}) − (ε^{(k),∗})^γ Wi}
· ∂h̄i(s, ỹ(s | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗})/∂σ ds
+ ∫₀¹ (λ⁰(s | σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}))⊤ ∂f̃(s, ỹ(s | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗}, θ^{(k),∗})/∂σ ds
= 0   (9.3.44)
and
∂g̃₀^δ(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗})/∂ε
= (ε^{(k),∗})^{−α−1} [−α ∑_{i=1}^{N} ∫₀¹ [max{0, h̄i^{(k),∗}(s) − (ε^{(k),∗})^γ Wi}]² ds
+ 2γ ∑_{i=1}^{N} ∫₀¹ max{0, h̄i^{(k),∗}(s) − (ε^{(k),∗})^γ Wi} (−(ε^{(k),∗})^γ Wi) ds
− α ∑_{i=1}^{n+1} (ỹi(1 | σ^{(k),∗}, θ^{(k),∗}) − ỹiᶠ)²] + δ_k β (ε^{(k),∗})^{β−1}
= 0.   (9.3.45)
Then, by Remarks 9.3.2 and 9.3.3, it follows from Theorem A.1.10 that the
first and third terms appearing on the right-hand side of (9.3.46) converge
to finite values. On the other hand, the second term tends to infinity, which
is impossible. Thus,
∑_{i=1}^{N} ∫₀¹ max{0, h̄i(s, ỹ(s | σ^∗, θ^∗), σ^∗)} ∂h̄i(s, ỹ(s | σ^∗, θ^∗), σ^∗)/∂σ ds = 0.   (9.3.47)
Since the constraint qualification is satisfied for the continuous inequality
constraints (9.3.20) at (σ, θ) = (σ^∗, θ^∗), it follows that, for each i = 1, . . . , N,
max{0, h̄i(s, ỹ(s | σ^∗, θ^∗), σ^∗)} = 0
for each s ∈ [0, 1]. This, in turn, implies that, for each i = 1, . . . , N,
for each s ∈ [0, 1]. Next, from (9.3.45) and (9.3.48), it is easy to see that,
when k → +∞, for each i = 1, . . . , n + 1,
then
‖ỹ(· | σ^{(k),∗}, θ^{(k),∗}) − ỹ(· | σ^∗, θ^∗)‖∞ = o((ε^{(k),∗})^ξ − (ε^∗)^ξ),   (9.3.50)
where
‖ỹ(· | σ, θ)‖∞ = ess sup_{s∈[0,1]} |ỹ(s | σ, θ)|.   (9.3.51)
for any s ∈ [0, 1]. By Assumption 10.1.1, there exists a constant N₂ > 0 such
that
‖ỹ(s | σ^{(k),∗}, θ^{(k),∗}) − ỹ(s | σ^∗, θ^∗)‖
≤ N₂ ∫₀ˢ [‖ỹ(τ | σ^{(k),∗}, θ^{(k),∗}) − ỹ(τ | σ^∗, θ^∗)‖
+ ‖(σ^{(k),∗}, θ^{(k),∗}) − (σ^∗, θ^∗)‖] dτ   (9.3.54)
for any s ∈ [0, 1]. Since this is valid for all s ∈ [0, 1], it completes the proof.
Assumption 9.3.6
φi(ỹ(1 | σ^{(k),∗}, θ^{(k),∗})) = o((ε^{(k),∗})^ξ),   ξ > 0,   i = 1, . . . , n + 1.   (9.3.58)
Theorem 9.3.3 Suppose that γ > α, ξ > α, −α − 1 + 2ξ > 0 and
2γ − α − 1 > 0. Then, as ε^{(k),∗} → ε^∗ = 0 and
(σ^{(k),∗}, θ^{(k),∗}) → (σ^∗, θ^∗) ∈ Ω₀, it holds that
g̃₀^{δk}(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) → g̃₀⁰(σ^∗, θ^∗, 0) = g̃₀(σ^∗, θ^∗)   (9.3.59)
and
∇_{(σ,θ,ε)} g̃₀^{δk}(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}) → ∇_{(σ,θ,ε)} g̃₀⁰(σ^∗, θ^∗, 0).   (9.3.60)
Proof. For notational brevity, the following abbreviations will be used
throughout the proof of this theorem and that of Theorem 9.3.5:
h̄i^{(k),∗}(·) = h̄i(·, ỹ(· | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗}),   (9.3.61)
L̄₀^{(k),∗}(·) = L̄₀(·, ỹ(· | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗}, θ^{(k),∗}),   (9.3.62)
L̃₀,ε^{(k),∗}(·) = L̃₀(·, ỹ(· | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗}, θ^{(k),∗}, ε),   (9.3.63)
ỹ^{(k),∗}(·) = ỹ(· | σ^{(k),∗}, θ^{(k),∗}),   (9.3.64)
ỹ^∗(·) = ỹ(· | σ^∗, θ^∗),   (9.3.65)
ỹi^{(k),∗}(·) = ỹi(· | σ^{(k),∗}, θ^{(k),∗}),   (9.3.66)
f̃^{(k),∗}(·) = f̃(·, ỹ(· | σ^{(k),∗}, θ^{(k),∗}), σ^{(k),∗}, θ^{(k),∗}),   (9.3.67)
λ̄^{0,(k),∗}(·) = λ̄(· | σ^{(k),∗}, θ^{(k),∗}),   (9.3.68)
λ^{0,(k),∗,ε}(·) = λ⁰(· | σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}).   (9.3.69)
Now, based on the conditions of the theorem, we can show that, for ε = 0,
lim g̃₀^{δk}(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗})
= lim [g̃₀(σ^{(k),∗}, θ^{(k),∗})
+ (ε^{(k),∗})^{−α} ∑_{i=1}^{N} ∫₀¹ [max{0, h̄i^{(k),∗}(s) − (ε^{(k),∗})^γ Wi}]² ds
+ (ε^{(k),∗})^{−α} ∑_{i=1}^{n+1} (ỹi^{(k),∗}(1) − ỹiᶠ)² + δ_k (ε^{(k),∗})^β],   (9.3.70)
where the limits are taken as ε^{(k),∗} → ε^∗ = 0 and
(σ^{(k),∗}, θ^{(k),∗}) → (σ^∗, θ^∗) ∈ Ω₀.
For the second and third terms of (9.3.73), it is clear from Lemma 9.3.1 that
lim (ε^{(k),∗})^{−α} ∑_{i=1}^{N} ∫₀¹ [max{0, h̄i^{(k),∗}(s) − (ε^{(k),∗})^γ Wi}]² ds
= lim ∑_{i=1}^{N} ∫₀¹ [(ε^{(k),∗})^{−α/2} h̄i^{(k),∗}(s) − (ε^{(k),∗})^{γ−α/2} Wi]² ds.
Since ξ > α and γ > α, it follows from Assumption 9.3.5 that, for any
s ∈ [0, 1],
lim (ε^{(k),∗})^{−α/2} h̄i^{(k),∗}(s) = 0.   (9.3.74)
Thus, we obtain
lim ∑_{i=1}^{N} ∫₀¹ [(ε^{(k),∗})^{−α/2} h̄i^{(k),∗}(s) − (ε^{(k),∗})^{γ−α/2} Wi]² ds
= ∑_{i=1}^{N} ∫₀¹ lim [(ε^{(k),∗})^{−α/2} h̄i^{(k),∗}(s) − (ε^{(k),∗})^{γ−α/2} Wi]² ds = 0,   (9.3.75)
where all limits are taken as ε^{(k),∗} → ε^∗ = 0 and
(σ^{(k),∗}, θ^{(k),∗}) → (σ^∗, θ^∗) ∈ Ω₀.
The costate system corresponding to g̃₀ is
dλ̄⁰(s)/ds = −[∂L̄₀(s, ỹ(s | σ, θ), σ, θ)/∂ỹ + (λ̄⁰(s))⊤ ∂f̃(s, ỹ(s | σ, θ), σ)/∂ỹ]⊤.   (9.3.81)
The solution of (9.3.81) can be written as:
λ⁰(s | σ, θ, ε) = S(s, 1) λ⁰(1 | σ, θ, ε)
+ ∫₁ˢ S(s, ω) (−∂L̃₀(ω, ỹ(ω | σ, θ), σ, θ, ε)/∂ỹ)⊤ dω.   (9.3.83)
= 0.   (9.3.85)
On the other hand, by (9.3.33), ξ > α and γ > α, it follows from
Assumption 9.3.6 that, for each s ∈ [0, 1],
lim ∫₁⁰ [−∂L̄₀^{(k),∗}(ω)/∂ỹ + ∂L̃₀,ε^{(k),∗}(ω)/∂ỹ] dω
= lim ∫₁⁰ ∑_{i=1}^{N} 2(ε^{(k),∗})^{−α} max{0, h̄i^{(k),∗}(ω) − (ε^{(k),∗})^γ Wi}
· ∂h̄i^{(k),∗}(ω)/∂ỹ dω
= 0.   (9.3.86)
We then substitute (9.3.85) and (9.3.86) into (9.3.84) to give, for each
s ∈ [0, 1],
lim |λ̄^{0,(k),∗}(s) − λ^{0,(k),∗,ε}(s)| = 0,   (9.3.87)
where the limits are taken as ε^{(k),∗} → ε^∗ = 0 and
(σ^{(k),∗}, θ^{(k),∗}) → (σ^∗, θ^∗) ∈ Ω₀.
Then we have
lim ∇_σ g̃₀^{δk}(σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗})
= lim [∫₀¹ ∂L̄₀^{(k),∗}(s)/∂σ ds
+ 2(ε^{(k),∗})^{−α} ∑_{i=1}^{N} ∫₀¹ max{0, h̄i^{(k),∗}(s) − (ε^{(k),∗})^γ Wi} ∂h̄i^{(k),∗}(s)/∂σ ds
+ ∫₀¹ (λ^{0,(k),∗,ε}(s))⊤ ∂f̃^{(k),∗}(s)/∂σ ds].
Note that ∂L̄₀/∂σ, λ⁰ and ∂f̃/∂σ are all bounded. Thus, it follows
from (9.3.87) and Theorem A.1.10 that
lim [∫₀¹ ∂L̄₀^{(k),∗}(s)/∂σ ds + ∫₀¹ (λ⁰(s | σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}))⊤ ∂f̃^{(k),∗}(s)/∂σ ds]
= ∫₀¹ lim ∂L̄₀^{(k),∗}(s)/∂σ ds
+ ∫₀¹ lim (λ⁰(s | σ^{(k),∗}, θ^{(k),∗}, ε^{(k),∗}))⊤ ∂f̃^{(k),∗}(s)/∂σ ds
= ∫₀¹ ∂L̄₀^∗(s)/∂σ ds + ∫₀¹ (λ̄⁰(s | σ^∗, θ^∗))⊤ ∂f̃^∗(s)/∂σ ds
= ∇_σ g̃₀(σ^∗, θ^∗),   (9.3.89)
where all limits are taken as ε^{(k),∗} → ε^∗ = 0 and
(σ^{(k),∗}, θ^{(k),∗}) → (σ^∗, θ^∗) ∈ Ω₀.
Similarly, (ε^{(k),∗})^{−α} g_i and ∂g_i/∂σ are all bounded, and ξ > α, γ > α.
It follows from Assumption 9.3.5 that
lim 2 ∑_{i=1}^{N} ∫₀¹ [(ε^{(k),∗})^{−α} h̄i^{(k),∗}(s) − (ε^{(k),∗})^{γ−α} Wi] ∂h̄i^{(k),∗}(s)/∂σ ds
= 2 ∑_{i=1}^{N} ∫₀¹ lim [(ε^{(k),∗})^{−α} h̄i^{(k),∗}(s) − (ε^{(k),∗})^{γ−α} Wi] ∂h̄i^{(k),∗}(s)/∂σ ds
= 0,   (9.3.90)
where the limits are taken as ε^{(k),∗} → ε^∗ = 0 and
(σ^{(k),∗}, θ^{(k),∗}) → (σ^∗, θ^∗) ∈ Ω₀.
Similarly, for the derivative with respect to ε, we have
lim (ε^{(k),∗})^{−α−1} [−α ∑_{i=1}^{N} ∫₀¹ [max{0, h̄i^{(k),∗}(s) − (ε^{(k),∗})^γ Wi}]² ds
+ 2γ ∑_{i=1}^{N} ∫₀¹ max{0, h̄i^{(k),∗}(s) − (ε^{(k),∗})^γ Wi} (−(ε^{(k),∗})^γ Wi) ds
+ ∑_{i=1}^{n+1} (φi(ỹi^{(k),∗}(1)))² + δ_k β (ε^{(k),∗})^{β−1}]
= lim [−α ∑_{i=1}^{N} ∫₀¹ [(ε^{(k),∗})^{−(α+1)/2} h̄i^{(k),∗}(s) − (ε^{(k),∗})^{γ−(α+1)/2} Wi]² ds
+ 2γ ∑_{i=1}^{N} ∫₀¹ (h̄i^{(k),∗}(s) − (ε^{(k),∗})^γ Wi)(−(ε^{(k),∗})^γ Wi)(ε^{(k),∗})^{−α−1} ds
+ ∑_{i=1}^{n+1} (φi(ỹi^{(k),∗}(1)))² (ε^{(k),∗})^{−α−1}]   (9.3.93)
= 0,   (9.3.95)
where the limits are taken as ε^{(k),∗} → ε^∗ = 0 and
(σ^{(k),∗}, θ^{(k),∗}) → (σ^∗, θ^∗) ∈ Ω₀; the conditions −α − 1 + 2ξ > 0 and
2γ − α − 1 > 0 ensure that each term vanishes.
* +
for any x ∈ Nδk σ (k),∗ , θ (k),∗ . Now, we construct a sequence σ̂ (k),∗ , θ̂ (k),∗
satisfying
! ! ξ k
(k),∗ (k),∗
σ̂ , θ̂ − σ (k),∗ , θ (k),∗ ≤
k
. Clearly,
! !
g60δk σ̂ (k),∗ , θ̂ (k),∗ , ε(k),∗ ≥ g60δk σ (k),∗ , θ (k),∗ , ε(k),∗ (9.3.97)
≤ 0 + 0 + δ̄. (9.3.98)
360 9 Optimal Control Problems with State and Control Constraints
Proof. On the contrary,"we assume that the# conclusion is false. Then, there
exists a subsequence of σ (k),∗ , θ (k),∗ , ε(k),∗ , which is denoted by the orig-
inal sequence, such that for any k0 > 0, there exists a k > k0 satisfying
ε(k ),∗ = 0. By Theorem 9.3.2, we have
!
ε(k),∗ → ε∗ = 0, σ (k),∗ , θ (k),∗ → (σ ∗ , θ ∗ ) ∈ Ω0 , as k → +∞
. Since ε(k),∗ = 0 for all k, it follows from dividing (9.3.45) by (ε(k),∗ )β−1 that
4 N
!−α−β 1 & * !γ +'2
(k),∗
ε (k),∗
−α max 0, h̄i (s) − ε(k),∗ Wi ds
i=1 0
N 1
* !γ + !γ !
(k),∗
+ 2γ max 0, h̄i (s) − ε(k),∗ Wi −ε(k),∗ Wi ds
i=1 0
5
n+1 !2
(k),∗
−α ỹi (1) − ỹif + δk β = 0. (9.3.101)
i=1
This is equivalent to
4 N
!−α−β 1 & * !γ +'2
(k),∗
ε (k),∗
−α max 0, h̄i (s) − ε(k),∗ Wi ds
i=1 0
N 1
* !γ + !γ !
(k),∗
+ 2γ max 0, h̄i (s) − ε(k),∗ Wi −ε(k),∗ Wi
i=1 0
9.3 Exact Penalty Function Approach 361
* !γ +
(k),∗ (k),∗
+ max 0, h̄i (s) − ε(k),∗ Wi h̄i (s)
* !γ +
(k),∗ (k),∗
− max 0, h̄i (s) − ε(k),∗ Wi h̄i (s) ds
5
(k),∗
n+1 !2
f
−α ỹi (1) − ỹi + δk β = 0 (9.3.102)
i=1
Let k → +∞ in (9.3.102), and note that −α−β +2ξ > 0 and −α−β +2ξ > 0.
Then, it follows that the left hand side of (9.3.103) yields
4 N 1&
!−α−β * !γ +'2
(k),∗
ε (k),∗
(2γ − α) max 0, h̄i (s) − ε(k),∗ Wi ds
i=1 0
5
n+1 !2
(k),∗
−α ỹi (1) − ỹif + δk β → ∞. (9.3.104)
i=1
However, under the same conditions and −α − β + 2γ > 0, the right hand
side of (9.3.103) gives
N
!−α−β 1 * !γ +
(k),∗ (k),∗
2γ ε (k),∗
max 0, h̄i (s) − ε(k),∗ Wi h̄i (s)ds → 0.
i=1 0
(9.3.105)
Remark 9.3.7 Although we have proved that a local minimizer of the exact
penalty function optimization Problem (P̃δk (p)) will converge to a local min-
imizer of the original Problem (P̃ (p)), we need, in actual computation, set a
lower bound ε∗ = 10−9 for ε(k),∗ so as to avoid the situation of being divided
by ε(k),∗ = 0, leading to infinity.
9.3.4 Examples
The result obtained is shown below. The optimal cost function value is
g0∗ = 5.75921513 × 10−3 , where δ = 1.0 × 105 and ε = 1.00057 × 10−7 . All the
continuous inequality constraints are satisfied for all t ∈ [0, 1]. Comparing
with the results obtained for Example 9.2.2, our minimum value of the cost
functional is slightly larger (which is 5.3 × 10−3 for Example 9.2.2). However,
the continuous inequality constraints (9.2.94)–(9.2.97) in Example 9.2.2 are
not satisfied at all t ∈ [0, 1]. The continuous inequality constraints are shown
in Figure 9.3.4, and the optimal state trajectories and the optimal control
are shown in Figures 9.3.5 and 9.3.6, respectively.
364 9 Optimal Control Problems with State and Control Constraints
2.5
h 1.5
0.5
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t
0
-0.05
-0.2
-0.1
x2
-0.4
-0.15
x1
-0.6
-0.2 -0.8
-0.25 -1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t
Fig. 9.3.2: Optimal state trajectories x1 (t) and x2 (t) for Example 9.3.1
10
4
u
-2
-4
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t
-2 0.5
-2.5 0
-3 -0.5
-3.5 -1
h2
h1
-4 -1.5
-4.5 -2
-5 -2.5
-5.5 -3
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t
0 -2
-2.5
-0.2
-3
-0.4
h4 -3.5
h3
-0.6
-4
-0.8
-4.5
-1 -5
-1.2 -5.5
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t
Example 9.3.3 The following problem is taken from [68]: Find a control
u : [0, 4.5] → R that minimizes the cost functional
4.5 " #
(u(t))2 + (x1 (t))2 dt (9.3.109)
0
dx1 (t)
= x2 (t) (9.3.110)
dt
dx2 (t)
= −x1 (t) + x2 (t) 1.4 − 0.14(x2 (t))2 (t) + 4u(t) (9.3.111)
dt
with the initial conditions x1 (0) = −5 and x2 (0) = −5, and the continuous
inequality constraint
1
h = −u(t) − x1 (t) ≥ 0, t ∈ [0, 4.5]. (9.3.112)
6
366 9 Optimal Control Problems with State and Control Constraints
12 22
21
10
20
8
19
1
2
6 18
x
x
17
4
16
2
15
0 14
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t
0.01 3
2.5
0
-0.01
1.5
x3
x4
-0.02
-0.03
0.5
-0.04
0
-0.05 -0.5
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t
0.2 0.06
0.05
0
0.04
0.03
-0.2
0.02
x5
-0.4 0.01
x
-0.6
-0.01
-0.02
-0.8
-0.03
-1 -0.04
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t
3 0.8
2 0.6
1 0.4
u2
u1
0 0.2
-1 0
-2 -0.2
-3 -0.4
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t
1.8
1.6
1.4
1.2
1
h
0.8
0.6
0.4
0.2
0
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
t
1 5
0 4
3
-1
2
-2 1
x1
-3 0
x2
-4 -1
-2
-5
-3
-6 -4
-7 -5
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
t
0.5
0
u
-0.5
-1
-1.5
-2
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
t
9.4 Exercises
9.4.1 Can the control functions chosen in Definition 9.2.1 be just measur-
able, rather than Borel measurable functions?
9.4.2 Derive the gradient formulae given by (9.3.32) and (9.3.33).
9.4.3 Consider the problem (P ) subject to additional terminal equality con-
straints:
qi (x(T | u)) = 0, i = 1, . . . , NE , (9.4.1)
where qi , i = 1, . . . , NE , are continuously differentiable functions. Let this
optimal control problem be referred to as the problem (R). Use the control
parametrization time scaling transform technique to derive a computational
method for solving Problem (R). State all the essential assumptions and then
prove all the relevant convergence results.
9.4.4 Provide detailed derivations of the gradient formulae appeared in Al-
gorithm 9.2.4.
9.4.5 Prove Theorem 9.2.1.
9.4.6 Prove Theorem 9.2.2.
9.4.7 Prove Theorem 9.2.3.
9.4.8 Show the validity of Remark 9.3.2.
9.4.9 Construct the equality constraint violation as defined by (9.3.29) fort
the interior equality constraint (8.9.14a).
9.4 Exercises 369
10.1.1 Introduction
Consider the following time-delay system, defined on the fixed time interval
(−∞, T ]:
dx
= f (x(t), x̄(t), u(t), ū(t)), t ∈ [0, T ], (10.1.1)
dt
x(t) = φ(t), t ≤ 0, (10.1.2)
u(t) = ϕ(t), t < 0, (10.1.3)
where x(t) = [x1 (t), x2 (t), . . . , xn (t)] ∈ Rn is the state vector; u(t) =
[u1 (t), u2 (t), . . . , ur (t)] ∈ Rr is the control vector; x̄(t) = [x1 (t − h1 ), x2 (t −
h2 ), . . . , xn (t − hn )] and ū(t) = [u1 (t − hn+1 ), u2 (t − hn+2 ), . . . , ur (t −
hn+r )] , in which hq > 0, q = 1, . . . , n + r, are given time-delays; f :
Rn × Rn × Rr × Rr → Rn and φ(t) = [φ1 (t), . . . , φn (t)] are given con-
tinuously differentiable functions; and ϕ(t) = [ϕ1 (t), . . . , ϕr (t)] is a given
function. A Borel measurable function u(t) : (−∞, T ] → Rr is said to be an
admissible control if u(t) ∈ U for almost all t ∈ [0, T ] and u(t) = ϕ(t) for
all t < 0, where U is a compact and convex subset of Rr . Let U denote the
class of all such admissible controls. For simplicity, we use u to denote u(t)
for the rest of the section. Let x(·) denote the solution of (10.1.1)–(10.1.3)
corresponding to each u ∈ U . It is an absolutely continuous function that
satisfies the dynamic (10.1.1) almost everywhere on [0, T ], and the initial
condition (10.1.2) everywhere on (−∞, 0]. Our optimization problem is de-
fined formally as follows: Given the dynamic system (10.1.1)–(10.1.3), choose
an admissible control u ∈ U to minimize the following cost functional:
T
Φ0 (x(T )) + L0 (x(t), x̄(t), u)dt, (10.1.4)
0
where Φk : Rn → R, k = 0, 1, . . . , Ne + Nm , and Lk : Rn × Rn × Rr → R, k =
0, 1, . . . , Ne + Nm , are given real-valued functions. We denote this problem
as Problem (P1 ).
We assume that the following conditions are satisfied throughout this sec-
tion.
10.1 Time-Lag Optimal Control 373
0 = t0 ≤ t1 ≤ t2 ≤ . . . ≤ tp−1 ≤ tp = T. (10.1.7)
Let Ξ denote the set of all vectors σ = [t1 , . . . , tp ] such that (10.1.7) is
satisfied. Then the control u is approximated as follows:
p
u≈ δ (i) χ[ti−1 ,ti ) (t), t ∈ [0, T ], (10.1.8)
i=1
& '
(i) (i)
where δ (i) = δ1 , . . . , δr is the value of the control on the ith subinterval
and χ[ti−1 ,ti ) (t) is the indicator function defined by
1, if t ∈ [ti−1 , ti ),
χ[ti−1 ,ti ) (t) =
0, otherwise.
$ %
Let Δ denote the set of all such vectors δ = (δ (1) ) , . . . , (δ (p) ) . Substi-
tuting (10.1.8) into (10.1.1), the time-delay system defined on the subinterval
[ti−1 , ti ) becomes
dx !
= f x(t), x̄(t), θ (i) , θ̄(t) , (10.1.9)
dt
$ %
where θ̄(t) = θ̄1 (t), . . . , θ̄r (t) . For θ̄m (t), m = 1, . . . , r, there are two cases:
(1) if t−hn+m < 0, then θ̄m (t) = ϕm (t−hn+m ); and (2) if t−hn+m ≥ 0, then
there exist q (q ≤ p) distinct partition points (let these points be denoted as
til , l = 1, . . . , q,) such that
Clearly, $ %
[0, T ] = [0, ti1 ) ∪ [ti1 , ti2 ) . . . ∪ tiq−1 , tiq , (10.1.10)
and for any k, l ∈ {1, 2, . . . , q}, k = l,
$ $
tik−1 , tik ∩ til−1 , til = ∅. (10.1.11)
= 0, k = 1, . . . , Ne , (10.1.13)
p ti
!
gk (σ, δ) = Φk (x(T | σ, δ)) + Lk x(t | σ, δ), x̄(t | σ, δ), δ (i) dt
i=1 ti−1
≥ 0, k = Ne + 1, . . . , Ne + Nm , (10.1.14)
For time-delayed optimal control problems with variable switching times, the
conventional time scaling transformation fails to work. This is because the
conventional time scaling transformation will map variable switching times
10.1 Time-Lag Optimal Control 375
into fixed switching times in a new time horizon. However, the time-delays
will become variable delays. As a consequence, the transformed problem will
be even harder to solve. In this section, we shall develop a novel time-scaling
transformation to transform Problem (P (p)) into an equivalent problem in
which the switching times are fixed.
For any σ = [t1 , . . . , tp ] ∈ Ξ, define a vector θ = [θ1 , . . . , θp ] ∈ Rp
where θi = ti − ti−1 , i = 1, . . . , p. Clearly, θi ≥ 0 is the duration between two
consecutive switching times for the control vector δ (i) and θ1 + · · · + θp = T.
Let Θ denote the set of all such vectors θ ∈ Rp , where vectors in Θ are called
admissible duration vectors.
Now, we introduce a new time variable s in a new time horizon (−∞, p).
For each admissible duration vector θ ∈ Θ and a time instant s in the new
time horizon, define the corresponding time-scaling function as follows:
s
μ(s | θ) = θi + θs+1 {s − max( s!, 0)}, s ∈ (−∞, p], (10.1.15)
i=1
t
tp
αp
μ(s | θ)
t3
α3
t2
t1 α2
0 α1 s
1 2 3 p
tan αi = θi , i = 1, . . . , p
It is clear from Figure 10.1.1 that the time scaling function (10.1.15) is a
continuous piecewise linear non-decreasing function, which maps s ∈ [i − 1, i)
in the new time horizon to [ti−1 , ti ) in the original time horizon. The switching
times for the control vector are fixed integer points 1, 2, . . . , p − 1, in the new
376 10 Time-Lag Optimal Control Problems
For simplicity, let η(s), ζ(s) and μ(s) denote η(s | θ), ζ(s | θ) and μ(s | θ),
respectively, for the rest of the section.
t μ(s)
μ(s )
h
μ(s ) − h
0
s
ζ(s ) s
μ(s )
h
μ(s ) − h
0
s
ζ(s ) s
Fig. 10.1.2: Two cases for finding the delay in the new time horizon
Note that, for the duration vector θ, it is possible that θi = 0 for some
i ∈ {1, . . . , p}. This means that for some time s in the new time horizon,
10.1 Time-Lag Optimal Control 377
μ(η(s)) = μ(s) − h.
By the definition of ζ(s), it follows that the delay time in the new time horizon
is unique. The process for finding the delay time in the new time horizon is
illustrated in Figure 10.1.2.
To proceed, we need the following lemma.
Lemma 10.1.1 For any given t ∈ [0, T ) and θ ∈ Θ, there exists a unique
m ∈ {1, . . . , p} such that θm > 0 and t ∈ [μ(m − 1), μ(m)).
Now, we are in the position to give an explicit formula for ζ(s). This is
presented as a theorem below:
Theorem 10.1.1 Let θ ∈ Θ. Then, for each s ∈ (−∞, p], if μ(s) − h < 0,
then
ζ(s) = μ(s) − h.
Otherwise, let κ(s | θ) denote the unique integer such that θκ(s|θ)+1 > 0 and
⎡ ⎞
κ(s|θ)
κ(s|θ)+1
μ(s) − h ∈ ⎣ θi , θi ⎠ . (10.1.17)
i=0 i=0
Proof. For simplicity, we omit the argument θ in κ(s | θ). Suppose first
that μ(s) − h < 0. Then μ(ζ(s)) < 0. Thus, μ(ζ(s)) = ζ(s). Combining this
equation with equation (10.1.15) gives
Suppose now that μ(s) − h ∈ [0, T ), it follows from (10.1.17) that κ(s) ≥ 0,
and ζ(s) ∈ [κ(s), κ(s) + 1). That is, κ(s) = ζ(s)!, and hence it follows
from (10.1.15) and (10.1.18) that
378 10 Time-Lag Optimal Control Problems
s
κ(s)
μ(ζ(s)) = θl + θκ(s)+1 (ζ(s) − κ(s)) = θl + θs+1 (s − s!) − h.
l=1 l=1
(10.1.19)
In the new time horizon, we consider the following new time-delay system
with fixed switching time, defined on the subinterval [i − 1, i), i = 1, . . . , p:
dy !
= θi f y(s), ȳ(s), δ (i) , δ̄(s) , (10.1.20)
ds
y(s) = φ(s), s ≤ 0, (10.1.21)
where θ ∈ Θ, δ ∈ Δ,
f : Rn × Rn × Rr × Rr → Rn , φ : R → Rn and ϕ : R → Rr are as
defined above. It is easy to see that ζ(s) < s. Hence, ȳ(s) is a delay term
in (10.1.20). Note that for the case of s̄q < 0, q = 1, . . . , n, yq (s̄q ) = φq (s̄q ).
By Assumptions 10.1.1 and 10.1.2, system (10.1.20) have a unique solution
for each admissible pair (θ, δ). Let y(· | θ, δ) denote the solution of (10.1.20).
Now, for each admissible duration vector θ ∈ Θ, denote
= 0, k = 1, . . . , Ne , (10.1.23)
p
i !
g̃k (θ, δ) = Φk (y(p | θ, δ)) + θ i Lk y(s | θ, δ), ȳ(s | θ, δ), δ (i) ds
i=1 i−1
≥ 0, k = Ne + 1, . . . , Ne + Nm , (10.1.24)
To solve Problem (Q1 (p)) using the gradient-based nonlinear optimization al-
gorithms, we require the gradients of the cost and constraint functionals with
respect to each of their variables. We first rewrite g̃k (θ, δ), k = 0, . . . , Ne +Nm ,
in the following forms:
p
∂μ(s)
g̃k (θ, δ) = Φk (y(p | θ, δ)) + L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) ds,
0 ∂s
(10.1.26)
where
380 10 Time-Lag Optimal Control Problems
p !
L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) = Lk y(s | θ, δ), ȳ(s | θ, δ), δ (i) χ[i−1,i) (s).
i=1
∂y(s | θ, δ)
= Λ̄(s | θ, δ), s ∈ [0, p]. (10.1.28)
∂θ
Here, Λ̄(· | θ, δ) is the solution of the following auxiliary dynamic on each
[i − 1, i):
Proof. Let δ and s ∈ {1, · · · , p} be arbitrary but fixed, and let er be the rth
unit vector in Rp . Then,
∂y(s) y s|δ, θ ξ − y(s|θ, δ)
= lim ,
∂θr ξ→0 ξ
where θ ξ = θ + ξer .
Now, we will prove the theorem in the following steps:
Step 1: Preliminaries
For each real number ξ ∈ R, let y ξ denote the function y ·|δ, θ ξ . Then,
it follows from dynamic system that, for each ξ ∈ R,
s
ξ ξ
y (s) = y (0) + F ξ (t)dt, s ∈ [0, p],
0
Define
s
ξ
Γ (s) = y s|δ, θ ξ
− y(s|δ, θ) = F ξ − F 0 dt. (10.1.31)
0
F ξ (s) − F 0 (s)
1
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ ξ
= Γ (s)dη
∂y
0
1
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ ξ
+ ȳ − ȳ dη
∂ ȳ
0
p
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ
+ ξdη (10.1.32)
0 ∂θr
382 10 Time-Lag Optimal Control Problems
and
ȳ ξ − ȳ 0 = ȳ sξ |δ, θ ξ − ȳ(s|δ, θ)
= ȳ sξ |δ, θ ξ − ȳ s|δ, θ ξ + ȳ s|δ, θ ξ − ȳ(s|δ, θ)
= ȳ sξ |δ, θ ξ − ȳ s|δ, θ ξ + Γ ξ (s̄),
Let ξ ∈ [−a, a] be arbitrary. When s̄ < 0, taking the norm of both sides
of (10.1.31) and applying the definition of C3 gives
4 5
ξ s 1 ∂f ξ ξ ξ
Γ (s) = η ∂f η ∂f η
Γ ξ
(t) + ξ + φ ξ
− φ 0
dηdt
n 0 0 ∂y ∂θr ∂ ȳ η η
n
where
μ(α1 |θ) = h.
When s̄ ≥ 0,
ξ s ∂fηξ ξ
1 ∂fηξ ∂fηξ ξ
Γ (s) = Γ (t) + ξ+ ȳ s |δ, θ ξ − ȳ s|δ, θ ξ
n ∂y ∂θ r ∂ ȳ
0 0
+ Γ ξ (s̄) dηdt
n
s 1 ∂f ξ s 1 ∂f ξ
η ξ η
≤ Γ (t)dηdt + ξdηdt
0 0 ∂y 0 0 ∂θr
n
n
s 1 ∂f ξ
η ξ
+ Γ (s̄)dηdt
0 0 ∂ ȳ
n
s 1 ∂f ξ
η
+ ȳ sξ |δ, θ ξ − ȳ s|δ, θ ξ dηdt
0 0 ∂ ȳ
n
s 1 ∂f ξ
η ξ
≤(C3 + C32 )exp(C3 p)|ξ| + Γ (t)dηdt
α1 0 ∂y
n
384 10 Time-Lag Optimal Control Problems
s 1 ∂fηξ s 1 ∂f ξ
η ξ
+ ξdηdt + Γ (s̄)dηdt
α1 0 ∂θr α1 0 ∂ ȳ
n
n
s 1 ∂fηξ ξ
+ ȳ s |δ, θ ξ − ȳ s|δ, θ ξ dηdt
α1 0 ∂ ȳ
n
1,ξ ∂f (y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ)
1
λ (t) =
0 ∂y
∂f (y, ȳ, θ, δ)
− Γ ξ (t)dη
∂y
1
2,ξ ∂f (y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ)
λ (t) =
0 ∂ ȳ
∂f (y, ȳ, θ, δ)
− ȳ ξ − ȳ dη
∂ ȳ
10.1 Time-Lag Optimal Control 385
1
∂f (y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ)
λ3,ξ (t) =
0 ∂θr
∂f (y, ȳ, θ, δ)
− ξdη.
∂θr
y + ηΓ ξ (t) → y, as ξ → 0 (10.1.35)
ȳ + η ȳ ξ − ȳ → ȳ, as ξ → 0 (10.1.36)
θ + ηξer → θ, as ξ → 0 (10.1.37)
uniformly with respect to t ∈ [0, p], η ∈ [0, 1] and l ∈ [0, 1],where s̄l,ξ is the
corresponding delayed time of the control θ + lξer . These results together
with (10.1.33) imply that |ξ|−1 λ1,ξ → 0, |ξ|−1 λ2,ξ → 0, |ξ|−1 λ3,ξ → 0 uni-
formly on [0, p] as ξ → 0. Thus,
386 10 Time-Lag Optimal Control Problems
lim ρ(ξ) = 0.
ξ→0
Let ξ ∈ [−a, 0)∪(0, a] be arbitrary but fixed. Then, it follows from (10.1.31)
that
s s
ξ
$ 1,ξ 2,ξ 3,ξ
% ∂f (y, ȳ, θ, δ) ξ
Γ (s) = λ (t) + λ (t) + λ (t) dt + Γ (t)dt
0 0 ∂y
s
∂f (y, ȳ, θ, δ) ∂ ȳ(sl,ξ |δ, θ) ∂sl,ξ
+ Γ ξ (s̄) + ξ dt
0 ∂ ȳ ∂s̄ ∂θr
s
∂f (y, ȳ, θ, δ)
+ ξdt. (10.1.38)
0 ∂θr
Let
s ∂ ȳ(sl,ξ |δ, lξer ) ∂sl,ξ ∂ ȳ(s|δ, θ) ∂s̄
ρ̄(ξ) = ρ(ξ) + C3 −
0 ∂s̄ ∂θr ∂s̄ ∂θr n
t μ(s)
θs +1 = 0
0
s
ζ(s ) s
Scenario C1
t μ(s)
0
s
ζ(s ) s
Scenario C2
Note that Theorem 10.1.2 is valid only under the condition that the deriva-
tive of ζ(·) with respect to θk exists. However, for the case of s ∈ S (ζ(s) ∈
{0, 1, . . . , p − 1}), there are two scenarios to consider:
(i) θs+1 = 0.
(ii) θs+1 > 0.
If scenarios (i) holds, clearly, the derivative of ζ(s) with respect to θi does
not exist for all s ∈ [j − 1, j ) for some j . However, this will not affect
solving the auxiliary system (10.1.29), since ∂ζ(·)/∂θi is always associated
with ∂f j (y(s | θ, δ), ȳ(s | θ, δ), θ, δ)/∂ ȳ, which takes the value of 0, when
θs+1 = 0.
As for scenario (ii), the derivative of μ(ζ(s)) with respect to s does not exist
for only a finite number of time instant (less than or equal to the number of
elements of S ), which implies that, the value of ∂ζ(s)/∂θi does not exist only
at these time instants. For this case, the auxiliary dynamic system (10.1.29)
is still numerically solvable.
Remark 10.1.1 Figure 10.1.3 depicts the situations for the two scenarios.
Note that, in Scenario (C1), we have θζ(s )−1 = 0, while in Scenario (C2), we
have θζ(s )−1 > 0. By the definition of ζ(·) in (10.1.16), it is clear that θζ(s ) is
greater than 0 regardless of the value of θζ(s )−1 . Hence, the auxiliary dynamic
system (10.1.29) is numerically solvable either in the case of θζ(s )−1 > 0 in
Scenario (C1) or θζ(s )−1 = 0 in Scenario (C2).
The gradient of g̃k (θ, δ), k = 0, 1, . . . , Ne + Nm , with respect to the dura-
tion vector θ is given as a theorem stated below.
Theorem 10.1.3 The gradient of g̃k (θ, δ) for each k = 0, 1, . . . , Ne + Nm
with respect to θ is given by
Proof. The proof follows from applying the chain rule to (10.1.26) and The-
orem 10.1.2.
Finally, the gradients of the state and g̃k (θ, δ) with respect to δ are given
below.
10.1 Time-Lag Optimal Control 389
∂y(s | θ, δ)
= Ῡ (s | θ, δ), s ∈ [0, p], (10.1.42)
∂δ
where Ῡ (· | θ, δ) is the solution of the following auxiliary dynamic system on
each interval [i − 1, i):
FORTRAN to solve Problem (Q1 (p)). In the next section, we will demon-
strate the effectiveness of this approach with two numerical examples.
where
x̄ = (x1 (t − 1), x2 (t − 0.5))
0 1
A1 (t) = ,
−4π 2 (a + c cos 2πt) 0
0 0 0
A2 (t) = , B(t) = ,
−4π 2 b cos 2πt 0 1
the parameters of the problem are in Table 10.1.1,
a b c tf Q R S
0.2 0.5 0.2 1.5 I2×2 I2×2 104 I2×2
− 3 ≤ ui ≤ 4, t ∈ [0, tf ], i = 1, 2.
the proposed new method can always achieve a better cost when compared
with the conventional control parametrization method for which the partition
points are evenly distributed over the time horizon.
Note that the results obtained by applying the new method have similar
cost values when compared with those obtained by applying the hybrid time-
scaling transformation reported in [298]. However, the implementation of the
new method is much simpler. In particular, it does not require the use of
numerical interpolation to calculate the delay state values in the new time
horizon. Consequently, the computational time requirement is much less.
Figure 10.1.4 shows optimal controls obtained by using the two different
methods. Figures 10.1.5 and 10.1.6 depict, respectively, the two optimal state
trajectories for the case of q = 10.
Table 10.1.2: Optimal costs for Example 10.1.1 using the two different
methods
(b) Conventional control
(a) New method parametrization method
Number of subintervals g0 (uq,∗ ) Number of subintervals g0 (uq,∗ )
q = 10 4.4355 q = 10 7.2878
q=7 4.4610 q=7 7.4755
q=5 4.7630 q=5 8.1382
2
control
-1
Traditional Control Parameterization
-2 Hybrid Time-Scaling Approach
-3
-4
0 0.5 1 1.5
t
Fig. 10.1.4: Optimal controls obtained for Example 10.1.1 using the two
different methods
392 10 Time-Lag Optimal Control Problems
2
x1
x2
1
-1
state -2
-3
-4
-5
0 0.5 1 1.5
t
Fig. 10.1.5: Optimal state trajectory obtained for Example 10.1.1 using
the new method
0.5
-0.5
state
-1
-1.5
x1
-2 x2
-2.5
0 0.5 1 1.5
t
Fig. 10.1.6: Optimal state trajectory obtained for Example 10.1.1 using
the conventional control parametrization method
where
10.1 Time-Lag Optimal Control 393
⎡ ⎤ ⎡ ⎤
1 2 0 0 1 0 0 0
⎢2 1 0 0⎥ ⎢ 0⎥
⎢
S=⎣ ⎥ , R = ⎢0 1 0 ⎥,
0 0 1 2⎦ ⎣0 0 1 0⎦
0 0 1 1 0 0 0 1
⎡ ⎤
1 0 0 0
⎢0 2 0 0 ⎥
Q=⎢
⎣0 0 1 0
⎥.
⎦
0 0 0 2
subject to the time-delay dynamic system
dx1 (t)
= − 2(x1 (t))2 + x1 (t)x2 (t − 0.2) + 2x2 (t)
dt
− u1 (t)u2 (t − 0.5),
dx2 (t)
= − x1 (t − 0.1) + 2x3 (t) + u2 (t),
dt
dx3 (t)
= − (x3 (t))3 − x1 (t)x2 (t) − x2 (t − 0.2)u2 (t)
dt
+ u1 (t − 0.4) + 2u3 (t),
dx4 (t)
= − (x4 (t))2 + x2 (t)x3 (t) − 2x3 (t − 0.3) + 2u4 (t),
dt
the initial conditions
and 10.1.8, and the corresponding state trajectories are displayed in Fig-
ure 10.1.9.
0.3
u1
0.2 u2
0.1
0
control
-0.1
-0.2
-0.3
-0.4
-0.5
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
t
Fig. 10.1.7: Optimal controls u1 and u2 obtained for Example 10.1.2 using
the new method
u3
0.2 u4
0
control
-0.2
-0.4
-0.6
-0.8
-1
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
t
Fig. 10.1.8: Optimal controls u3 and u4 obtained for Example 10.1.2 using
the new method
10.2 Time-Lag Optimal Control with State-Dependent Switched System 395
1.4
x
1
1.2 x
2
x3
1
x4
0.8
state
0.6
0.4
0.2
-0.2
-0.4
-0.6
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
t
Fig. 10.1.9: Optimal state trajectories obtained for Example 10.1.2 using
the new method
10.2.1 Introduction
This section is from [152]. Consider the following switched time-delay system,
which consists of N sub-systems operating in succession, over the time horizon
[0, T ]:
dx(t)
= f i (x(t), x(t − γ1 ), . . . , x(t − γr )),
dt
t ∈ (τi−1 , τi ), i = 1, . . . , N, (10.2.1a)
x(t) = φ(t, ζ), t ≤ 0, (10.2.1b)
Assumption 10.2.1
i 0
f (y , . . . , y r ) ≤ K1 1 + |y 0 | + · · · + |y r | ,
(y 0 , . . . , y r ) ∈ R(r+1)×n , i = 1, . . . , N, (10.2.2)
where K1 > 0 is a real constant and | · | stands for the Euclidean norm.
For i = 1, . . . , N , the system switches from sub-system i − 1 to sub-system
i at the time t = τi defined by
Z = {ζ ∈ Rs : ak ≤ ζk ≤ bk , k = 1, . . . , s} , (10.2.4)
10.2 Time-Lag Optimal Control with State-Dependent Switched System 397
where
4
inf{ t > ρi−1 : hi (ξ i (t)) = 0 }, if i ≤ N − 1,
ρi = (10.2.6)
∞, if i = N ,
where ξ 0 (t) = φ(t, ζ) and ρ0 = 0. Given ρi−1 and ξ i−1 (·) for some i ∈ ZN ,
with
ZN = {1, . . . , N } .
The existence of a unique solution to (10.2.5) can be established as follows. Di-
vide [ρi−1 , ∞) into consecutive subintervals of length min{γ1 , . . . , γr }. Then,
consider the system on each of the subintervals consecutively. This gives rise
to a set of consecutive non-delay systems on each of these subintervals in
sequential order. Since the functions φ and f i , i = 1, . . . , N , are continuous
and differentiable and Assumption 10.2.1 is satisfied, the well-known exis-
tence and uniqueness results for non-delay systems [3] can be applied to each
of these consecutive systems one by one in a sequential order. More specif-
ically, starting from ξ 0 (t) = φ(t, ζ) and ρ0 = 0, we can show by induction
that ξ i (·) and ρi are well-defined for each i ∈ ZN . Then, we see that ξ N (·)
satisfies (10.2.5) with ρi = τi , i = 0, . . . , N . Since each ξ i (·) is unique, it is
clear that ξ N (·) is the only solution of (10.2.5). This completes the proof.
398 10 Time-Lag Optimal Control Problems
10.2.3 Preliminaries
where ∂g0 (ζ)/∂ζ is an s-dimensional row vector with its kth element being
the partial derivative ∂g0 (ζ)/∂ζk of g0 with respect to the kth element of the
system parameter vector ζ.
10.2 Time-Lag Optimal Control with State-Dependent Switched System 399
In this assumption, it is assumed that the scalar product of ∂hi /∂x (which
is orthogonal to the switching surface hi = 0) and f i (which is tangent to the
state trajectory) is non-zero at the ith switching time. This means that the
state trajectory does not approach to the switching surfaces at a tangential
direction. In the literature, there are similar assumptions being made. See,
for example, [33, 190].
We are now ready to present formulae for both the state variation matrix
and the partial derivatives of the switching times with respect to the system
parameter vector. Throughout, we use the notation ∂ x̃j to denote the par-
tial differentiation with respect to x(t − γj ), with ∂ x̃0 denoting the partial
differentiation with respect to x(t) (that is, γ0 = 0).
To continue, let ζ ∈ Z and k ∈ {1, . . . , s} be arbitrary. Consider the
perturbed system parameter vector ζ + εek , where ε ∈ [ak − ζk , bk − ζk ],
such that ζ + εek ∈ Z. Let ξ i,ε (·), i ∈ ZN denote the trajectories obtained by
solving (10.2.5) recursively corresponding to the perturbed
system parameter
vector ζ + εek , starting from ξ 0,ε (t) = φ t, ζ + εek for t ≤ 0 and ρ0 = 0.
Following arguments similar to those used in the proof of Theorem 10.2.1, we
can show that ρi defined by (10.2.6) for ζ + εek is equal to τiε = τi ζ + εek ,
and ξ i,ε (t) = x t | ζ + εek for all t ≤ τiε .
Consider a function ψ of ε. The following notations are used. (1) If there
k
exists a real number M > 0 and a positive integer k such |ψ(ε)| ≤ M |ε|
kthat
for all ε of sufficiently small magnitude, then ψ(ε) = O ε ; (2) ψ(ε) = θ(ε) if
ψ(ε) → 0 as ε → 0; and (3) ψ(ε) = O(1) means that ψ is uniformly bounded
with respect to ε.
Let
γ̄ = max{γ1 , . . . , γr }
and let
μi,ε (t) = ξ i,ε (t) − ξ i,0 (t), i = 0, . . . , N. (10.2.12)
400 10 Time-Lag Optimal Control Problems
i,ε
max μ (t) = O(ε) for every Tmax > 0, (10.2.15)
t∈[−γ̄,Tmax ]
E
% i−1
lim ε−1 μi,ε (t) = Λk (t− ), t ∈ −∞, τi0 \ {τk0 }, (10.2.16a)
ε→0
k=0
τ ε − τi0
lim i
ε→0 ε
⎧
⎪ 0 if i = 0.
⎪
⎨ ∂hi (ξi,0 (τi0 ))
= −
0−
∂x Λk (τi )÷
⎪
⎪ ∂hi (ξi,0 (τi0 )) i i,0 0
⎩ f ξ τi , ξ i,0 τi0 − γ1 , . . . , ξ i,0 τi0 − γr , if i ≥ 1.
∂x
(10.2.16c)
10.2 Time-Lag Optimal Control with State-Dependent Switched System 401
fˆi,ε (t, η)
⎧ i i,0
⎨ f ξ (t − γ0 ) + ημi,ε (t − γ0 ),. . . ,
ξ (t − γrk)+ ημ (t − γr ) , if i ≥ 1,
i,0 i,ε
= (10.2.18)
⎩
dφ t, ζ + εηe /dt, if i = 0,
and, for i ≥ 1, let ∂ fˆi,ε (s, η)/∂ x̃j denote the respective partial derivatives.
From (10.2.5) for i = q + 1, we have
4 3t
q+1,ε ξ q,ε (τqε ) + τqε
fˆq+1,ε (ω, 1)dω, if t ∈ [τqε , Tmax ],
ξ (t) = (10.2.19)
ξ q,ε (t), if t ∈ [−γ̄, τqε ],
402 10 Time-Lag Optimal Control Problems
where τqε is as defined in the proof of Theorem 10.2.1 Thus, for t ∈ [−γ̄, τqε ],
q+1,ε
ξ (t) = |ξ q,ε (t)| ≤ max |ξ q,ε (ω)| , (10.2.20)
ω∈[−γ̄,Tmax ]
t
ˆq+1,ε
ξ q+1,ε
(t) ≤ ξ q,ε (τqε ) + f (ω, 1) dω
τqε
r
t
≤ max |ξ q,ε
(ω)| + K1 Tmax + K1 ξ q+1,ε (ω − γj ) dω
ω∈[−γ̄,Tmax ] τqε
j=0
t
≤ max |ξ q,ε (ω)| + K1 Tmax + (r + 1)K1 ξ q+1,ε (ω) dω.
ω∈[−γ̄,Tmax ] −γ̄
(10.2.21)
(i) t < min τqε , τq0 ; (ii) τq0 ≤ t < τqε ;
(iii) τqε ≤ t < τq0 ; (iv) t ≥ max τqε , τq0 .
Using (10.2.5) and the fundamental theorem of calculus, we can derive the
following formula for μq+1,ε (t) = ξ q+1,ε (t) − ξ q+1,0 (t) for each of the four
cases:
t* +
μ q+1,ε q,ε
(t) = μ (t) + αqt fˆq+1,ε (ω, 1) − fˆq,ε (ω, 1) dω
τqε
10.2 Time-Lag Optimal Control with State-Dependent Switched System 403
t * +
+ βqt fˆq,0 (ω, 0) − fˆq+1,0 (ω, 0) dω, (10.2.25)
τq0
where αqt and βqt are binary parameters indicating whether or not t ≥ τqε
and t ≥ τq0 , respectively.
Consider cases (i)–(iii) (i.e., at most one of t ≥ τqε and t ≥ τq0 holds). Then,
it follows from (10.2.25) that
q+1,ε q
μ (t) ≤ |μq,ε (t)| + fmax q+1
+ fmax max τqε , τq0 − min τqε , τq0
q
q+1 ε
= |μq,ε (t)| + fmax + fmax τq − τq0 , (10.2.26)
μq+1,ε (t)
t * +
= μq,ε (t) + fˆq+1,ε (ω, 1) − fˆq,ε (ω, 1) dω
τqε
t * +
+ fˆq,0 (ω, 0) − fˆq+1,0 (ω, 0) dω
τq0
4 5
t t
=ξ q,ε
(τqε ) −ξ q,0
(τq0 ) + fˆq+1,ε (ω, 1)ds − ˆ
f q+1,0
(ω, 0) dω
τqε τq0
τqε * +
= μq,ε (τq0 ) + fˆq,ε (ω, 1) − fˆq+1,ε (ω, 1) dω
τq0
t * +
+ fˆq+1,ε (ω, 1) − fˆq+1,0 (ω, 0) dω,
τq0
ε
provided that τq−1 < τq0 when q ≥ 1. Thus, by the mean value theorem, we
obtain
τqε * +
μ q+1,ε q,ε 0
(t) = μ (τq ) + fˆq,ε (ω, 1) − fˆq+1,ε (ω, 1) dω
τq0
r
t 1
∂ fˆq+1,ε (ω, η) q+1,ε
+ μ (ω − γj )dηdω. (10.2.27)
j=0 τq0 0 ∂ x̃j
where ∂fmaxq+1
denotes an upper bound for the norm of ∂ fˆq+1,ε (s, η)/∂ x̃j
(again, the existence of such an upper bound is ensured by the uniform
boundedness of ξ q+1,ε (·) and the continuous differentiability of f q+1 ). Re-
call that (10.2.28) is established under the condition:
ε
τq−1 < τq0 when q ≥ 1. (10.2.29)
ε
Hence, when ε is of sufficiently small magnitude, τq−1 < τq0 , showing the
validity of (10.2.29). Consequently, (10.2.28) is valid.
Now, we combine (10.2.26) and (10.2.28) and shift the time variable in the
integral. Then, it can be shown that, for all t ∈ [−γ̄, Tmax ],
q+1,ε q
μ (t) ≤ max |μq,ε (ω)| + fmax + fmax τq − τq0
q+1 ε
ω∈[−γ̄,Tmax ]
t
q+1 q+1,ε
+ (r + 1)∂fmax μ (ω) dω. (10.2.31)
−γ̄
Therefore, since μq,ε (ω) = O(ε) and τqε − τq0 = O(ε) from (10.2.15) and
(10.2.24), respectively, we have
q+1,ε t
μ (t) ≤ O(ε) + q+1 q+1,ε
(r + 1)∂fmax μ (ω) dω, t ∈ [−γ̄, Tmax ].
−γ̄
(10.2.32)
Finally, by Theorem A.1.19 (Gronwall-Bellman Lemma), we obtain
q+1,ε
μ (t) ≤ O(ε) exp (r + 1)(Tmax + γ̄)∂fmax q+1
= O(ε), t ∈ [−γ̄, Tmax ].
(10.2.33)
Thus, (10.2.15) for i = q + 1 follows readily. This completes the proof
of (10.2.15).
The proofs of (10.2.16a)–(10.2.16c) will depend on some auxiliary re-
sults to be established below. First, from the inductive hypothesis, we recall
that (10.2.16a)–(10.2.16c) are valid for each i = 1, . . . , q, where q ≤ N − 1.
We have already shown in the proofs of (10.2.14) and (10.2.15) that for
any Tmax > 0, ξ q+1,ε (·) is uniformly bounded on [−γ̄, Tmax ] with respect to ε,
and ξ q+1,ε (·) → ξ q+1,0 (·) uniformly on [−γ̄, Tmax ] as ε → 0. Thus, since f q+1
is a continuously differentiable function, the following limit holds uniformly
with respect to t ∈ [0, Tmax ] and η ∈ [0, 1].
10.2 Time-Lag Optimal Control with State-Dependent Switched System 405
≤ (r + 1)∂fmax
q
O(ε) = O(ε), (10.2.35)
where ∂fmax q
is an upper bound for the norm of ∂ fˆq,ε (s, η)/∂ x̃j , j =
0, 1, . . . , r. Furthermore, for t , t ∈ [0, Tmax ] and q ≥ 1,
r t ˆq,0 dξ q,0 (t − γ )
ˆq,0 ˆq,0 ∂ f (t, 0) j
f (t , 0) − f (t , 0) ≤ × dt
t ∂ x̃ j dt
j=0
≤ (r + 1)∂fmax
q i
max fmax |t − t |
i=0,...,q
= O(1) |t − t | , (10.2.36)
i
where, for each i = 0, . . . , q, the existence of the upper bound fmax for the
ˆi,ε
norm of f (s, η) is ensured by the uniform boundedness of ξ (·) i,ε
and the
continuity of the functions f i and φ. Choose Tmax > max τqε , τq0 . Then, it
follows from (10.2.24), (10.2.35) and (10.2.36) that
max(τqε ,τq0 )
ˆq,ε
f (t, 1) − fˆq,0 τq0 , 0 dt
min(τqε ,τq0 )
max(τqε ,τq0 ) *
ˆq,ε +
≤ f (t, 1) − fˆq,0 (t, 0) + fˆq,0 (t, 0) − fˆq,0 τq0 , 0 dt
min(τqε ,τq0 )
" #
≤ O(ε) + O(1) τqε − τq0 τqε − τq0 = O(ε2 ). (10.2.37)
Finally, we have
max(τqε ,τq0 ) −1 q+1,ε τq0 −1 q+1,ε
ε μ (t) − Λk (t) dt ≤ ε μ (t) − Λk (t) dt
−γ̄ −γ̄
max(τqε ,τq0 ) −1 q+1,ε
+ ε μ (t) − Λk (t) dt. (10.2.39)
τq0
Now, choose Tmax > max(τqε , τq0 ). Then, from (10.2.15), we have
Remark 10.2.3 Note that (10.2.34), (10.2.38), (10.2.39) and (10.2.41) will
be used in the proof of (10.2.16a) for i = q + 1.
Proof of (10.2.16a)
Consider the case of t < τq0 . Then, by (10.2.16b) for i = q (which is valid
under induction hypothesis), we have t < τqε for all ε of sufficiently small
magnitude and hence ξ q+1,ε (t) = ξ q,ε (t). If t = τl0 , l = 0, . . . , q − 1, then it is
clear from (10.2.16a) for i = q that
Using (10.2.44) and (10.2.45), the expression for μq+1,ε(t) can be simplified.
More specifically, for all times t satisfying max τqε , τq0 ≤ t ≤ Tmax , we can
deduce that
* +
μq+1,ε (t) = μq,ε τq0 + τqε − τq0 fˆq,0 τq0 , 0 − fˆq+1,0 τq0 , 0
r t
∂ fˆq+1,0 (ω, 0) q+1,ε
+ j
μ (ω − γj )dω + O(ε2 ) + θ(ε)O(ε).
j=0 τ 0
q
∂ x̃
(10.2.46)
Next, by virtue of (10.2.16a) and (10.2.16c) for i = q and the jump condi-
tion (10.2.13d) in the variational system (10.2.13), we have
4 5
−1 q,ε 0 0−
τqε − τq0 ∂τq (ζ)
q,ε
lim λ = lim ε μ τ q − Λ k τq + − ·
ε→0 ε→0 ε ∂ζk
* +
fˆq,0 τq0 , 0 − fˆq+1,0 τq0 , 0 = 0. (10.2.51)
which holds
for 0all %ε of sufficiently small magnitude. Finally,
for any
0 fixed time%
point t ∈ τq0 , τq+1 , we can choose Tmax > t so that t ∈ τqε , min τq+1 , Tmax
when the magnitude of ε is sufficiently small. Thus, by (10.2.53), we have
10.2 Time-Lag Optimal Control with State-Dependent Switched System 409
and let ∂ -
hεq+1 (η)/∂x denote the respective partial derivative. By Taylor’s
theorem, there exists a constant ηε ∈ (0, 1) such that
0=-
hεq+1 (1) − -
h0q+1 (0)
∂-
hεq+1 (ηε ) " q+1,ε ε #
= ξ (τq+1 ) − ξ q+1,0 (τq+1
0
) . (10.2.63)
∂x
Now, since τqε → τq0 < τq+1 0
as ε → 0, we have τqε < τq+1
0
when ε is of
sufficiently small magnitude. Thus,
ε 0
ξ q+1,ε τq+1 − ξ q+1,0 τq+1
ε 0 0
= ξ q+1,ε τq+1 − ξ q+1,ε τq+1 + μq+1,ε τq+1
τq+1
ε
0
= fˆq+1,ε (ω, 1)dω + μq+1,ε τq+1
0
τq+1
1 ε 0
= ε
τq+1 − 0
τq+1 fˆq+1,ε ητq+1 + (1 − η)τq+1
0
, 1 dη + μq+1,ε τq+1 .
0
(10.2.64)
∂-
hεq+1 (ηε ) q+1,ε 0
ε
τq+1 − τq+1
0
=− μ (τq+1 )
4 ∂x 5
∂- hεq+1 (ηε ) 1 q+1,ε ε
÷ fˆ ητq+1 + (1 − η)τq+1 , 1 dη .
0
∂x 0
(10.2.67)
We are now in a position to derive formulas for the state variation matrix
and the partial derivatives of the switching times with respect to each of the
components of the system parameter vector.
Theorem 10.2.2 For each ζ ∈ Z and for each k = 1, . . . , s, it holds that
∂x(t | ζ)
= Λk (t), t ∈ (τi−1 , τi ), i ∈ ZN , (10.2.70)
∂ζk
and
∂τi (ζ) ∂hi (x(τi ))
=− Λk (τi− )
∂ζk ∂x
∂hi (x(τi )) i
÷ f (x(τi ), x(τi − γ1 ), . . . , x(τi − γm )) ,
∂x
i ∈ ZN −1 and τi < ∞, (10.2.71)
where Λk (·) satisfies the variational system described by the differential equa-
tions (10.2.13a) with initial condition (10.2.13b)–(10.2.13c) and jump con-
dition (10.2.13d).
Proof. Note that (10.2.16a) and (10.2.16b) are valid. Thus, given any t ∈
(τi−1 , τi ), i ∈ ZN , we have t < τiε for all ε of sufficiently small magnitude,
and hence x(t | ζ + εek ) = ξ i,ε (t). This implies that for t < τiε and for all ε
of sufficiently small magnitude,
∂hi (x(τi )) i
÷ f (x(τi ), x(τi − γ1 ), . . . , x(τi − γr )) . (10.2.73)
∂x
or
∂τi (ζ)/∂ζk = 0. (10.2.74b)
Then
∂x(τi | ζ)
= Λk (τi+ ) = Λk (τi− ). (10.2.75)
∂ζk
(ii) Suppose that
and
∂τi (ζ)/∂ζk > 0. (10.2.76b)
Then,
∂ ± x(τi | ζ) x τi | ζ + εek − x(τi | ζ)
= lim = Λk (τi∓ ).
∂ζk ε→0± ε
(10.2.77)
(iii) Suppose that
and
∂τi (ζ)/∂ζk < 0. (10.2.78b)
Then,
∂ ± x(τi | ζ) x τi | ζ + εek − x(τi | ζ)
= lim = Λk (τi± ).
∂ζk ε→0± ε
(10.2.79)
Proof. Consider i ∈ ZN −1 with τi < ∞ and let k ∈ {1, . . . , s}. Then, from
the auxiliary system (10.2.5), we have
x τi0 | ζ + εek − x τi0 | ζ
τi0 * +
0 0
=ξ i,ε
τi − ξ i,0
τi + fˆi+1,ε (ω, 1) − fˆi,ε (ω, 1) dω
min(τiε ,τi0 )
τi0 * +
= μi,ε τi0 + fˆi+1,ε (ω, 1) − fˆi,ε (ω, 1) dω, (10.2.80)
min(τiε ,τi0 )
Hence,
x τi0 | ζ + εek − x τi0 | ζ
ε
−1 i,ε
τi0 − min τiε , τi0 * i+1,0 0 +
=ε μ 0
τi + · fˆ τi , 0 − fˆi,0 τi0 , 0 + O(ε).
ε
(10.2.82)
and
x τi0 | ζ + εek − x τi0 | ζ
lim− = Λk τi0− . (10.2.87)
ε→0 ε
Finally, suppose that either ∂τi /∂ζk = 0 or fˆi+1,0 τi0 , 0 = fˆi,0 τi0 , 0 is
satisfied. Then,
Λk τi0+ = Λk τi0−
and
∂x τi0 | ζ x τi0 | ζ + εek − x τi0 | ζ
= lim
∂ζk ε→0 ε
0− 0+
= Λ k τi = Λ k τi . (10.2.88)
Remark 10.2.4 In the first scenario of Theorem 10.2.3, the state variation
exists at t = τi . In the last two scenarios (the more likely scenarios), the
state variation does not exist at t = τi due to the facts that the left and
right partial derivatives of the state with respect to the kth component of the
system parameter are different, indicating that Λk (·) is discontinuous at the
ith switching time.
where O(1) in (10.2.89) means that the left hand side is uniformly bounded
with respect to σ, and θ(σ) in (10.2.90) means that the left hand side
converges to zero as σ → 0. In (10.2.90) and (10.2.91), the convergence
σ → 0 can be along any path to the origin, but in the original (10.2.15)
and (10.2.16b), the convergence is restricted to be along one of the coordi-
nate axes. Choose Tmax > T . Then, by virtue of (10.2.90) for i = N , we
have
This shows that the cost functional g0 is continuous. Note that the proof
of (10.2.90) must be carried out simultaneously with (10.2.89) and (10.2.91)
via induction as for the proof of Lemma 10.2.1.
We are now in a position to present the right and left partial derivatives
of the cost functional g0 with respect to the system parameter vector ζ as
given in the following theorem:
Theorem 10.2.4 For each ζ ∈ Z, it holds that
x T | ζ + εek → x(T | ζ) as ε → 0± , (10.2.94)
and, for each k = 1, . . . , s,
∂ ± g0 (ζ)
∂ζk
g0 ζ + εek − g0 (ζ)
= lim
ε→0± ε
∂Φ(x(T | ζ)) x T | ζ + εek − x(T | ζ)
= lim
∂x ε→0± ε
10.2 Time-Lag Optimal Control with State-Dependent Switched System 417
⎧
⎪Λk (T | ζ), if τi (ζ) = T , i = 1, . . . , N − 1,
∂Φ(x(T | ζ)) ⎨
= Λk (T ∓ | ζ), if τi (ζ) = T and ∂τi (ζ)/∂ζk ≥ 0,
∂x ⎪
⎩
Λk (T ± | ζ), if τi (ζ) = T and ∂τi (ζ)/∂ζk ≤ 0,
(10.2.95)
Proof. By Theorems 10.2.2 and 10.2.3, we can derive the left and right partial
derivatives of the cost functional g0 . First, by Taylor’s theorem, there exists,
for each ε = 0 and each k = 1, . . . , s, a constant ηε,k ∈ (0, 1) such that
∂Φ (1 − ηε,k )x(T | ζ) + ηε,k x T | ζ + εek
g0 (ζ + εe ) − g0 (ζ) =
k
·
" ∂x #
x T | ζ + εek − x(T | ζ) . (10.2.96)
Collectively, by Theorems 10.2.2 and 10.2.3, the existence of the right and
left partial derivatives of the system state with respect to each component
of the system parameter is assured under Assumptions 10.2.2 and 10.2.3
Thus, (10.2.94) and (10.2.95) are valid. The proof is complete.
Note from (10.2.95) that the left and right partial derivatives of g0 exist at
all ζ ∈ Z under Assumptions 10.2.2 and 10.2.3. In practice, these assumptions
can be easily checked for a given ζ ∈ Z by numerically solving the switched
time-delay system (10.2.1). Note that if T coincides with a switching time
satisfying one of the last two scenarios in Theorem 10.2.3, then the left and
right partial derivatives of g0 with respect to ζk may differ, since in this case
Λk (T − ) = Λk (T + ).
Since g0 has well-defined left and right partial derivatives (under Assump-
tions 10.2.2 and 10.2.3), it is continuous under Assumptions 10.2.2 and 10.2.3.
If these assumptions hold at every point in the compact set Z, then Prob-
lem (P2 ) is guaranteed to admit an optimal solution. This result is summa-
rized below.
Theorem 10.2.5 Problem (P2 ) admits an optimal solution.
If none of the switching times coincide with the terminal time, or if the
conditions for the first scenario in Theorem 10.2.3 are satisfied at the terminal
time, then the left and right partial derivatives of g0 derived above become
the full partial derivatives as shown in (10.2.9). We now present the following
line search optimization algorithm for solving Problem (P2 ).
Algorithm 10.2.1
In most cases, the partial derivatives of g0 will exist and Steps 5–7 can
be implemented using well-known methods in nonlinear optimization (see
Chapter 2). If any of the full partial derivatives of g0 does not exist (i.e., one
of the last two scenarios in Theorem 10.2.3 occurred at the terminal time),
then the signs of the left and right partial derivatives can be used to identify
an appropriate descent direction along one of the coordinate axes.
The model is based on the work in [151, 197]. Let x(t) = [x1 (t), x2 (t), x3 (t),
x4 (t)] , where t is time (hours). Here, x1 (t) is the biomass concentration
(g L−1 ), x2 (t) is the glycerol concentration (mmol L−1 ), x3 (t) is the 1,3-PD
concentration (mmol L−1 ) and x4 (t) is the fluid volume (L). The process
dynamics due to natural fermentation are described by
⎡ dx (t) ⎤ ⎡ ⎤
μ(x2 (t), x3 (t))x1 (t − γ1 )
1
dt
⎢ dx2 (t) ⎥ ⎢
⎢ dt ⎥ ⎢ −q2 (x2 (t), x3 (t))x1 (t − γ1 ) ⎥ ⎥ = f ferm (x(t), x1 (t − γ1 )),
⎢ dx3 (t) ⎥ = ⎣
⎣ dt ⎦ q3 (x2 (t), x3 (t))x1 (t − γ1 ) ⎦
dx4 (t) 0
dt
(10.2.97)
where γ1 = 0.1568 is the time-delay; μ(·, ·) is the cell growth rate; q2 (·, ·) is
the substrate consumption rate; and q3 (·, ·) is the 1,3-PD formation rate. The
process dynamics due to the input feed are
⎡ dx (t) ⎤ ⎡ ⎤
−x1 (t)
1
dt
⎢ dx2 (t) ⎥ u(t) ⎢ ⎥
⎢ dt ⎥
⎢ dx3 (t) ⎥ = ⎢ rcs0 − x2 (t) ⎥ := f feed (x(t), u(t)), (10.2.98)
⎣ dt ⎦ x4 (t) ⎣ −x3 (t) ⎦
dx4 (t) x4 (t)
dt
where u(t) is the input feeding rate (L h−1 ); r = 0.5714 is the proportion of
glycerol in the input feed and cs0 = 10762 mmol L−1 is the concentration of
glycerol in the input feed. The functions μ(·, ·), q2 (·, ·) and q3 (·, ·) in (10.2.97)
are given by
3
Δ1 x2 (t) x2 (t) x3 (t)
μ(x2 (t), x3 (t)) = 1− ∗ 1− ∗ , (10.2.99)
x2 (t) + k1 x2 x3
Δ2 x2 (t)
q2 (x2 (t), x3 (t)) = m1 + Y1 μ(x2 (t), x3 (t)) + , (10.2.100)
x2 (t) + k2
Δ3 x2 (t)
q3 (x2 (t), x3 (t)) = −m2 + Y2 μ(x2 (t), x3 (t)) + , (10.2.101)
x2 (t) + k3
where x∗2 = 2039 mmol L−1 and x∗3 = 1036 mmol L−1 are, respectively, the
critical concentrations of glycerol and 1,3-PD, and the values of the other
parameters are given in Table 10.2.1.
Δ1 k1 m1 Y1 Δ2 k2 m2 Y2 Δ3 k3
0.8037 0.4856 0.2977 144.9120 7.8367 9.4632 12.2577 80.8439 20.2757 38.75
420 10 Time-Lag Optimal Control Problems
Let Nfeed be an upper bound for the number of feeding modes. Since
the process starts and finishes in batch mode, the total number of potential
modes is N = 2Nfeed + 1 (Nfeed feeding modes and Nfeed + 1 batch modes).
During batch mode, there is no input feed and the process is only governed
by (10.2.97). On the other hand, the process is governed by both (10.2.97)
and (10.2.98) during feeding mode. Thus,
During the growth phase of the biomass, glycerol is being consumed. Since
no new glycerol is added during the batch mode, the glycerol concentration
will reduce and eventually it will become too low, and hence a switch into
feeding mode is necessary. The corresponding switching condition is
Note that the system parameters in this example appear explicitly in the
dynamics and switching conditions. Thus, to apply Theorem 10.2.2, we re-
place the system parameters with auxiliary state variables x4+k (t), k =
1, . . . , Nfeed + 2, where
dx4+k (t)
= 0, t > 0, (10.2.107)
dt
and
Let δki denote the Kronecker delta function and let ∂x and ∂ x̃1 denote
the partial differentiation with respect to x(t) and x1 (t − γ1 ), respectively.
Then, the variational system corresponding to ζk is
dΛk (t)
dt
⎧ ferm
⎪ ∂f
ferm
⎪
⎪ Λk (t) + ∂f∂ x̃1 Λk1 (t − γ1 ), batch mode,
⎨ ∂x
This condition is clearly satisfied with reasons given below: For batch mode,
the right hand side of (10.2.112) is always non-zero because in practice both
q2 and x1 are non-zero. For feeding mode, the right hand side of (10.2.112)
is also non-zero because, during feeding mode, the glycerol loss from natural
fermentation (first term) is dominated by the glycerol addition from the input
feed (second term).
Since x4 is non-decreasing and, for biologically meaningful trajectories,
μ(·, ·) is bounded, the linear growth assumption is also clearly valid.
The initial function φ for the dynamics (10.2.102) was obtained by applying
cubic spline interpolation to the experimental data reported in [197]. As in
[151], the terminal time for the fermentation process is taken as T = 24.16
hours. The upper bound for the number of feeding modes is chosen as Nfeed =
48. Our goal is to maximize the concentration of 1,3-PD at the terminal
time. Thus, the dynamic optimization problem is: Choose the parameters
ζk , k = 1, . . . , Nfeed + 2, such that the cost functional −x3 (T ) is minimized
subject to the boundedness constraints (10.2.103) and (10.2.104).
This dynamic optimization problem is solved using a FORTRAN program
that implements the gradient-based optimization procedure in Section 10.2.4.
In this program, NLPQLP [223] is used to perform the optimization itera-
tions (optimality check and line search), and LSODAR [92] is used to solve
the differential equations. Our gradient-based optimization strategy generates
critical points satisfying local optimality conditions. However, the solution
obtained is not necessarily a global optimal solution. Thus, it is necessary
to repeat the optimization process from different starting points so that a
better estimate of the global solution is obtained. We performed 100 test
runs, where each run starts from a different randomly selected initial point.
The average optimal cost over all runs is: −977.12854, and the best result,
which is obtained on run 73, is: −986.16815. For this control strategy, there
are 8 switches (5 batch modes and 4 feeding modes). The control parameters
and the respective mode durations are listed in Table 10.2.2. The optimal
state trajectories are shown in Figure 10.2.1 Due to the dilution effect from
the new input feed, the concentrations of biomass and 1,3-PD decrease during
the feeding modes. The control strategy listed in Table 10.2.2 is essentially
a state feedback strategy. It produces more 1,3-PD (an increase of 5.789%)
when compared with the time-dependent switching strategy reported in [151].
Furthermore, it requires far fewer switches. For the method reported in [151],
it requires over 1000 switches.
10.3 Min-Max Optimal Control 423
1000
6
900
800 5
700
1,3-PD [mmolL ]
4
-1
Biomass [gL -1 ]
600
500 3
400
2
300
200
1
100
0 0
0 5 10 15 20 25 0 5 10 15 20 25
Fermentation time [h] Fermentation time [h]
5.6
600
550
500
450 5.4
Glycerol [mmolL-1 ]
Volume [L]
400
350
300
5.2
250
200
150
100 5.0
0 5 10 15 20 25 0 5 10 15 20 25
Fermentation time [h] Fermentation time [h]
Remark 10.2.5 ζ1 , . . . , ζ4 are the optimal feeding rates and ζ49 and ζ50 are
the optimal switching concentrations. The optimal values of ζ5 , . . . , ζ48 are
irrelevant because they represent the feeding rates after the terminal time.
dx (t)
= A (t) x (t) + B (t) u (t) + C (t) w (t) , t ∈ [0, T ] , (10.3.1)
dt
x (0) = x0 ,
where x (t) ∈ Rn is the state vector, u (t) ∈ Rm is the control vector, w (t) ∈
Rr is the disturbance, and A (t) = [ai,j (t)], B (t) = [bi,j (t)], C (t) = [ci,j (t)]
are matrices with appropriate dimensions, and x0 is a given initial condition.
The cause of the disturbance w(t) in (10.3.1) can be due to the changes
in external environment or errors in measurement. As in [23, 30], we assume
that the disturbance w(t) ∈ Wρ , where Wρ is a L2 -norm bounded set defined
by * +
2
Wρ = w ∈ L2 ([0, T ] , Rr ) : w ≤ ρ2 , (10.3.2)
2 3T
where w = 0 (w(t)) w (t) dt and ρ > 0 is a given positive constant.
Furthermore, the control u is restricted to be chosen from Uδ which is a
L2 -norm bounded set defined by
* +
2
Uδ = u ∈ L2 ([0, T ] , Rm ) : u ≤ δ 2 , (10.3.3)
2 3T
where u = 0 (u(t)) u (t) dt and δ > 0 is a given positive constant.
The cost functional is considered to be a quadratic function given below:
T * +
J (u, w) = (x (t)) Q (t) x (t) + (u (t)) R (t) u (t) dt, (10.3.4)
0
where Q (t) = [qi,j (t)] and R (t) = [ri,j (t)] , t ∈ [0, T ] are matrices with
appropriate dimensions. It is assumed that the following terminal state con-
straint is satisfied:
x(T ) − x∗ ≤ γ, ∀ w ∈ Wρ , (10.3.5)
where x∗ is the desired terminal state and γ > 0 is a given constant. Any ele-
ment u ∈ Uδ is called a feasible control if the terminal state constraint (10.3.5)
is satisfied.
We may now state our optimal control problem formally as follows.
Problem (P3 ). Given the dynamical system (10.3.1) and the terminal
state constraint (10.3.5), find a control u ∈ Uδ such that the worst-case
performance J(u, w) is minimized over Uδ , i.e., finding a control u ∈ Uδ such
that it solves the following min-max optimal control problem:
M1 v 1 M2 v 2
λ 1 − 2
v α1 v α2
is positive semi-definite.
AB
M=
CD
426 10 Time-Lag Optimal Control Problems
dx(t)
= Ax(t) + Bu(t)
dt
t " #
Wc (t) = exp{Aτ }BB exp A τ dτ
0
t " #
= exp{A(t − τ )}BB exp A (t − τ ) dτ
0
has rank n.
4. The n × (n + r) matrix
[A − λI B]
has full row rank at every eigenvalue λ of A.
10.3 Min-Max Optimal Control 427
Now suppose that all the eigenvalues of A have negative real parts (A is
stable), and that the unique solution of the Lyapunov equation
AWc + Wc (A) = −B (B)
dx(t)
= A(t)x(t) + B(t)u(t)
dt
y(t) = C(t)x(t)
where A, B and C are, respectively, n × n, n × r, p × n matrices. Then the
system (A(t), B(t)) is controllable at time t0 if and only if there exists a finite
time t1 > t0 such that the n × n matrix, also known as the controllability
Gramian, defined by
t1
Wc (t0 , t1 ) = Φ(t1 , τ )B(τ ) (B(τ )) (Φ(t1 , τ )) dτ
t0
Let Φ(t, τ ) be the transition matrix of (10.3.1). For each u and w, define
T
T0 (u) = Φ (T, 0) x +0
Φ (T, τ ) B (τ ) u (τ ) dτ,
0
t
1/2 1/2
T1 (x) = (Q (t)) Φ (t, 0) x +
0
(Q (t)) Φ (t, τ ) B (τ ) u (τ ) dτ,
0
428 10 Time-Lag Optimal Control Problems
T
F0 (w) = Φ (T, τ ) C (τ ) w (τ ) dτ,
0
t
1/2
F1 (w) = (Q (t)) Φ (t, τ ) C (τ ) w (τ ) dτ.
0
When no confusion can arise, the same notation ·, · is used as the inner
product in L2 as well as in Rn . The cost functional (10.3.4) and the terminal
state constraint (10.3.5) can be rewritten as
F 1 1
G
J(u, w) = T1 (u) + F1 (w), T1 (u) + F1 (w) + (R) 2 u, (R) 2 u , (10.3.7)
and
Φ (t, τ ) , if τ ≤ t,
Φ (t, τ ) =
0n×n , else.
1/2
On the other hand, since {un } ⊂ Uδ , and (Q (t)) Φ (t, τ ) B (τ ) is con-
tinuous with respect to (t, τ ) ∈ [0, T ] × [0, t], we can easily show that there
exists a constant K1 such that
t & '
(Q (t)) Φ (t, τ ) B (τ ) u (τ ) dτ ≤ K1
1/2 n
i
0
10.3 Min-Max Optimal Control 429
3t& 1/2
'
for each t ∈ [0, T ], where 0 (Q (t)) Φ (t, τ ) B (τ ) un (τ ) dτ denotes the
3t 1/2
i
i-th element of 0 (Q (t)) Φ (t, τ ) B (τ ) un (τ ) dτ . Now, by Theorem A.1.10
(Lebesgue Dominated Convergence Theorem), it follows that T1 (un ) →
T1 (u) .
Clearly, w (u) may be not unique. However, they share the same cost function
value maxw∈Wρ J(u, w). For Problem (P ), we have the following theorem.
Thus,
T T
(u (t)) R (t) u (t) dt ≤ lim (un (t)) R (t) un (t) dt.
0 n→∞ 0
Furthermore, we can show that maxw∈Wρ T1 (u) + F1 (w), T1 (u) + F1 (w) is
3T
convex in u. Since (10.3.18) is convex in u and 0 (u (t)) R (t) u (t) dt is
strictly convex in u, it follows that Problem (P ) is strictly convex. Thus, the
conclusion of the theorem holds.
and let λmax (S) be the largest eigenvalue of S. Suppose that S is invertible.
If Problem (P ) has a feasible control, then
−1
n
n
(F0 (w)) (S) F0 (w) = g i , w2 ≤ g i 2 w2
i=1 i=1
T
−1/2 −1/2
= Tr (S) Φ (T, τ ) C (τ ) (C(τ )) (Φ(T, τ )) (S) dt w2 ≤ ρ2 .
0
and
T T
−1
G(t)w(t)dt = G (t)G(t) (S) hdt = h.
0 0
By Lemma 10.3.1, (10.3.3) holds if and only if there exists a ς ≥ 0 such that
432 10 Time-Lag Optimal Control Problems
γ 2 − (Vu ) Vu − ςρ2 − (Vu )
−1
−Vu ς (S) − I
γ 2 − ςρ2 0 (Vu )
= −1 − Vu I " 0 (10.3.20)
0 ς (S) I
subject to
1/2
I (PN ) θ
1/2 " 0, (10.3.22)
(θ) (PN ) t2
10.3 Min-Max Optimal Control 433
t1 − ς1 ρ2 − (θ) QN − rN
" 0, (10.3.23)
−QN θ − (rN ) ς 1 I − RN
⎛ ⎞
I V x0 − x ∗ + V N θ I
⎝ (V x0 − x∗ + VN θ) γ 2 − ρ2 ς2 0 ⎠ " 0, (10.3.24)
−1
I 0 ς2 (S)
where the explicit expressions of PN , QN , RN , qN , rN , V , VN and μ0 are
given as below:
T t t
PN = (ΦB,Γ (t, τ )) dτ Q (t) ΦB,Γ (t, τ )dτ dt
0 0 0
T
+ (ΓN (t)) R (t) ΓN (t) dt,
0
T t t
QN = (ΦB,Γ (t, τ )) dτ Q (t) ΦC,Ψ (t, τ )dτ dt,
0 0 0
T t t
RN = (ΦC,Ψ (t, τ )) dτ Q (t) ΦFC,Ψ (t, τ )dτ dt,
0 0 0
T
1/2 1/2
VN = (P ) Φ (T, t) B (t) ΓN (t) dt, V = (P ) Φ (T, 0) ,
0
T t
qN = (ΓN (τ )) B (τ ) (Φ (t, τ )) dτ Q (t) [F (t, 0) x0 ] dt,
0 0
T t
rN = (ΨN (τ )) C (τ ) (Φ (t, τ )) dτ Q (t) [Φ (t, 0) x0 ] dt,
0 0
T
μ0 = (x0 ) (Φ (t, 0)) Q (t) Φ (t, 0) dt x0 ,
0
and
J(u∗N , ω(u∗N )) = max J(u∗N , ω),
ω∈Wρ ∩VN
and
J(u∗N , ωN
∗
) = J(u∗N , ω(u∗N )), (10.3.30)
respectively, Then,
(i) lim J(u∗N , ωN
∗
) = J(u∗ , ω(u∗ )); and
N →∞
(ii) u∗N & u∗ as N → ∞.
J(u∗N , ωN
∗
)= min J(u, ωN ∗
)
u∈UN ∩Uρ
∗
≤J uN,∗ , ωN → J(u∗ , ω)
- ≤ J(u∗ , ω ∗ ). (10.3.31)
J(u∗ , ω ∗ ) = min J(u, ω ∗ ) ≤ J(-
u, ω ∗ ) ≤ lim J u∗N , ω N,∗
u∈Uρ N →∞
10.3 Min-Max Optimal Control 435
Therefore,
lim J(u∗N , ωN
∗
) = J(u∗ , ω(u∗ )) (10.3.33)
N →∞
u∗Nk & u
6 = u∗ and ωN
∗
k
6.
&ω
Let ωu be one of the maximizers ω(6 u). Then, by virtue of the uniqueness
of the solution of Problem (P3 ), it is clear that
= lim J(u∗Nk , ωN
∗
k
) ∗ ∗
= J(u , ω(u ),
k→∞
Theorem 10.3.6 shows that the min-max optimal control problem (P3 )
is approximated by a sequence of finite dimensional convex optimization
problems (PN 3 ). Then, an intuitive scheme to solve Problem (P3 ) can be
N
stated as:
N +1,∗ For tolerance ε > 0, we solve Problem (P3 ) until
a given
|J u −J u N,∗
| ≤ ε.
Problem (P3 ) with ρ = 0 is a standard optimal control problem without
disturbance. Let it be referred to as Problem (P3 ). Similarly, we can solve
Problem (P3 ) through solving a sequence of approximate optimal control
N
problems, denoted by Problems (P3 ), by restricting the feasible control u
in UN ∩ Uδ . We have the following results.
N
Corollary 10.3.1 Problem(P ) is equivalent to the following SDP problem
min t + 2 (q N ) θ + μ0 (10.3.35)
θ∈ΞN ,t≥0
subject to
1/2
I (PN ) θ
1/2 " 0, (10.3.36)
(θ) (PN ) t
436 10 Time-Lag Optimal Control Problems
I V x0 + V N θ − x ∗
∗ " 0. (10.3.37)
(V x + VN θ − x )
0
γ
Remark 10.1. During the computation, both u and w are approximated by
truncated orthonormal bases. Suppose that u∗N = ΓN (t)θ ∗ . Then I0 uN,∗ =
V x0 + VN θ ∗ , where θ ∗ is the optimal solution of Problem (PN ∗
3 ). Since θ sat-
N,∗
isfies the linear matrix inequality (10.3.24), u satisfies the linear matrix
inequality (10.3.16). Thus, by Theorem 10.3.4, the terminal inequality con-
straint (10.3.5) holds for all w ∈ Wρ . Thus, uN,∗ is a feasible solution. This
feature is not shared by the control parametrization method given in pre-
vious chapters. More specifically, if we directly approximate Uδ , Wρ by UN
and WN , then we can also transform the approximated problem as a SDP
which is different from that defined by (10.3.21)–(10.3.24). Let the solution
obtained by this method be ūN,∗ . Then, the terminal inequality constraint
(10.3.5) is only satisfied for those w ∈ WN ⊂ Wρ , not for all w ∈ Wρ .
Thus, the approximate solution ūN,∗ may be infeasible. For our proposed
approach, the approximations of u and w only affect the computation of the
cost function value (10.3.4). The feasibility of the terminal constraint (10.3.5)
is maintained for all w ∈ Wρ .
di
V = R a i + La + Ce ω
dt
dω
Cm i = Jr + μω + m, (10.3.38)
dt
where V is the voltage applied to the rotor circuit, i is the current, ω is
the rotation speed, m is the resistant torque reduced to the motor shaft, Ra
and La are the resistance and the inductance of the circuit, Jr is the inertia
moment, Ce and Cm are the constants of the motor, μ is the coefficient
of viscous friction. Let x(t) = [x1 (t), x2 (t)]T = [ω(t), i(t)] , u(t) = V (t),
w(t) = m(t). Then, (10.3.38) can be rewritten as
dx(t)
= Ax(t) + Bu(t) + Cw(t),
dt
where . /
− Jμr CJrm 0 − J1r
A= ,B= 1 ,C= .
−LCe
a
−RLa
a
La 0
10.3 Min-Max Optimal Control 437
Suppose that the initial condition is x(0) = [0, 0] . We wish to find an optimal
control to drive the system to a neighbourhood around the desired state x∗
with reference to all disturbances w ∈ Wρ such that the energy consumption
is minimized. In this case, Q = 0 and R = 1 in (10.3.4). Suppose that the
nominal parameters of the DC motor are given as: μ = 0.01, Jr = 0.028,
Cm = 0.58, La = 0.16, Ce = 0.58 and Ra = 3. Let T = 1, δ = 5, ρ = 0.01,
∞ ∞
γ = 0.2, x∗ = [3, 1] . The two orthonormal bases {γi }i=1 and {ψi }i=1 are
taken as the normalized shifted Legendre polynomial, i.e.,
√
γi (t) = ψi (t) = 2i + 1Pi (2t − 1) , i = 0, 1, 2, . . . ,
where Pi (t) is the i-th order Legendre polynomial. During the simulation,
SeDuMi [231] and YALMIP [163] are used to solve the SDP problem defined
by (10.3.21)–(10.3.24) and the SDP problem defined by (10.3.35)–(10.3.37).
Note that system (10.3.38) is time invariant. By direct computation, it fol-
lows that the matrix [C, CA] is of full rank. Thus, S in (10.3.14) is invertible.
Using Simpson's Rule to compute it, we obtain
\[
S = \begin{bmatrix} 176.8542 & -27.7387 \\ -27.7387 & 5.3628 \end{bmatrix}
\]
and λ_max(S) = 181.2293, which indicates that γ should be far larger than ρ.
We set the tolerance ε as 10⁻⁸ and start solving Problem (P3^N) with N = 5.
For N = 10, we have |J(u^{N+1,∗}) − J(u^{N,∗})| ≤ 10⁻⁸, so we stop the
computation. Meanwhile, we have |J̄(u^{N+1,∗}) − J̄(u^{N,∗})| ≤ 10⁻⁸, where
J̄(u^{N,∗}) is the optimal cost function value of Problem (P̄3^N). The cost
function values obtained are given in Table 10.3.1, from which we see that the
convergence is very fast both for the case with disturbance and for the case
without disturbance. Figure 10.3.1 depicts the nominal state trajectories
[x₁(t), x₂(t)]ᵀ = [ω(t), i(t)]ᵀ, and Figure 10.3.2 shows the optimal control
u^{11,∗} under the worst-case disturbance. That the terminal constraint (10.3.5)
holds for all w ∈ Wρ is ensured by Theorem 10.3.4.
Table 10.3.1: The optimal cost of Problem (P̄3^N) and the optimal cost of Problem (P3^N)
Fig. 10.3.1: The nominal state trajectories [x₁∗(t), x₂∗(t)]ᵀ of Problem (P^{11})
Fig. 10.3.2: The optimal control u^{11,∗} of Problem (P^{11}) under the worst-case disturbance
10.4 Exercises
10.2. Show the equivalence of Problem (P(p)) and Problem (Q(p)) (see Sec-
tion 10.2.4).
10.8. Show that F1 defined in Section 10.3.2 is a bounded linear operator from
L2 ([0, T ], Rr ) to L2 ([0, T ], Rn ).
10.9. In the proof of Theorem 10.3.1, show that there exists a constant K >
0, such that 0 ≤ J(u, w) < K, ∀(u, w) ∈ Uδ × Wρ .
is convex in u.
11.1 Introduction
the terminal equality constraint function are derived, and a reliable computa-
tion algorithm is given. The method proposed is used to solve a ship steering
control problem.
\[
\frac{dx(t)}{dt} = f(t, x(t), u(t)), \quad t \in (0, T], \quad x(0) = x^0, \tag{11.2.1}
\]
where x(t) ∈ Rn and u(t) ∈ Rr are, respectively, the state and control
vectors; f : [0, T ] × Rn × Rr → Rn ; T , 0 < T < ∞, is the fixed terminal time
and x0 ∈ Rn is a given vector.
The control and state vectors are subject to the following continuous
inequality constraints:
h_i(t, x(t), u(t)) ≤ 0, for all t ∈ [0, T], i = 1, . . . , N.
Let h ≜ [h₁, . . . , h_N]ᵀ.
Let H and L be, respectively, the Hamiltonian function and the augmented
Hamiltonian function defined by
\[
\frac{dx^*(t)}{dt} = f(t, x^*(t), u^*(t)), \quad x^*(0) = x^0, \tag{11.2.7}
\]
\[
\frac{d\lambda^*(t)}{dt} = -\frac{\partial L(t, x^*(t), u^*(t), \lambda^*(t), \rho^*(t))}{\partial x}, \tag{11.2.8}
\]
\[
[\lambda^*(T)]^\top = \frac{\partial \Phi_0(x^*(T))}{\partial x}, \tag{11.2.9}
\]
\[
0 = \frac{\partial L(t, x^*(t), u^*(t), \lambda^*(t), \rho^*(t))}{\partial u}
  = \frac{\partial H(t, x^*(t), u^*(t), \lambda^*(t))}{\partial u}
  + [\rho^*(t)]^\top \frac{\partial h(t, x^*(t), u^*(t))}{\partial u}, \tag{11.2.10}
\]
\[
0 \ge h_i(t, x^*(t), u^*(t)), \quad \rho_i^*(t) \ge 0, \quad i = 1, \ldots, N, \tag{11.2.11}
\]
\[
0 = [\rho^*(t)]^\top h(t, x^*(t), u^*(t)). \tag{11.2.12}
\]
In what follows, (x∗ , u∗ ) is also called the nominal solution, and a super-
script ‘∗’ indicates that the corresponding function is evaluated along the
nominal trajectory (x∗ , u∗ ).
Along this nominal solution, (t∗k,1 , t∗k,2 ) ⊂ [0, T ], k ∈ P , is called an interior
interval for the kth constraint if
and
t∗k,1 , t∗k,2 and t∗k,3 are called junction points. Let Tk∗ denote the set of junction
points t∗k,j ∈ [0, T ] for hk (t, x∗ (t), u∗ (t)) ≤ 0. We assume that (x∗ , u∗ ) has
the following regular structure.
Assumption 11.2.3 The set T∗ ≜ ∪_{k∈P} T_k∗ = {t₁∗, . . . , t_M∗} of all junction
points is finite and T_k∗ ∩ T_j∗ = ∅ for k ≠ j, where ∅ denotes the empty set.
Furthermore, there are no isolated touch points with the boundary for the
nominal solution.
and
\[
\lim_{t \to t^{*+}_{k,j+1}} \frac{dh_k(t, x(t), u(t))}{dt} = 0.
\]
For convenience, let u(t^{*-}_{k,j}) and u(t^{*+}_{k,j+1}) denote, respectively, the limits
of u∗(t) from the left at t∗_{k,j} and from the right at t∗_{k,j+1}.
Let ĥ(t, x∗(t), u∗(t)) and ρ̂∗(t) denote, respectively, the vectors composed of
h_k(t, x∗(t), u∗(t)) and ρ_k∗(t) for k ∈ A(t, x∗(t), u∗(t)). Correspondingly,
let q(t) > 0 be the number of constraints in A(t, x∗(t), u∗(t)). We have
the following assumptions.
Assumption 11.2.6 ∂ ĥ(t, x∗ (t), u∗ (t))/∂u is of full row rank when
Assumption 11.2.7 For all γ(t) ∈ ker(∂ĥ(t, x∗(t), u∗(t))/∂u) \ {0}, it holds
that [γ(t)]ᵀ (∂²L∗(t, x∗(t), u∗(t))/∂u²) γ(t) > 0, where ker(·) denotes the null
space of a matrix.
\[
\frac{dx(t)}{dt} = f(t, x(t), u(t)), \quad x(0) = x^0 + \varepsilon\delta, \tag{11.2.13}
\]
\[
\frac{d\lambda(t)}{dt} = -\frac{\partial L(t, x(t), u(t), \lambda(t), \rho(t))}{\partial x}, \quad
[\lambda(T)]^\top = \frac{\partial \Phi_0(x(T))}{\partial x}, \tag{11.2.14}
\]
\[
0 = \frac{\partial L(t, x(t), u(t), \lambda(t), \rho(t))}{\partial u}
  = \frac{\partial H(t, x(t), u(t), \lambda(t))}{\partial u}
  + [\rho(t)]^\top \frac{\partial h(t, x(t), u(t))}{\partial u}, \tag{11.2.15}
\]
\[
0 \ge h_i(t, x(t), u(t)), \quad \rho_i(t) \ge 0, \quad i = 1, \ldots, N, \tag{11.2.16}
\]
\[
0 = [\rho(t)]^\top h(t, x(t), u(t)). \tag{11.2.17}
\]
\[
\frac{\partial \hat h^*}{\partial x} + \frac{\partial \hat h^*}{\partial u}\,\frac{\partial u^*}{\partial x} = 0. \tag{11.2.19}
\]
From Assumption 11.2.8, there exists a small ε2 > 0 such that, for each
ε ∈ [0, ε2 ] and the perturbation δx = εδ, u∗ (x∗ ) is continuously differentiable
with respect to x∗ . Let ε0 = min{ε1 , ε2 }. It follows from the continuity of ĥ
at x∗ and u∗ (x∗ ) that, for ε ∈ [0, ε0 ],
\[
\left.\frac{d\hat h\,(t,\, x^*(t) + \varepsilon\delta(t),\, u^*(x^*(t) + \varepsilon\delta(t)))}{d\varepsilon}\right|_{\varepsilon=0} = 0.
\]
\[
[\rho^*]^\top \frac{\partial h^*}{\partial x} = -[\rho^*]^\top \frac{\partial h^*}{\partial u}\,\frac{\partial u^*}{\partial x}. \tag{11.2.21}
\]
Thus, the conclusion follows from (11.2.10).
\[
0 = \left[\frac{\partial^2 H^*}{\partial u^2} + \sum_{k \in A^*} \rho_k^* \frac{\partial^2 h_k^*}{\partial u^2}\right] \frac{\partial u^*}{\partial x}
 + \left[\frac{\partial \hat h^*}{\partial u}\right]^\top \frac{\partial \hat\rho^*}{\partial x}
 + \frac{\partial^2 H^*}{\partial x \partial u} + \sum_{k \in A^*} \rho_k^* \frac{\partial^2 h_k^*}{\partial x \partial u}
 + \left[\frac{\partial f^*}{\partial u}\right]^\top \frac{\partial \lambda^*}{\partial x}. \tag{11.2.22}
\]
\[
-\frac{\partial V}{\partial t} = H(t, x(t), u(t), \lambda(t)) = L_0(t, x(t), u(t)) + [\lambda(t)]^\top f(t, x(t), u(t)), \tag{11.2.26}
\]
where [λ]ᵀ = ∂V/∂x and
\[
V(t, x(t)) \triangleq \Phi_0(x(T)) + \int_t^T L_0(\tau, x(\tau), u(\tau))\, d\tau.
\]
Let
\[
Q^*(t) \triangleq \frac{\partial \lambda^*(t)}{\partial x} = \frac{\partial^2 V^*(t, x(t))}{\partial x^2}. \tag{11.2.27}
\]
Theorem 11.2.2 Let x = x∗ + εδ, u = u∗(x∗ + εδ) and λ = λ∗(x∗ + εδ)
with ε ∈ [0, ε₀]. If Assumptions 11.2.1–11.2.5 and 11.2.8 are satisfied, then
Q∗ satisfies the matrix differential equation
\[
\frac{dQ^*(t)}{dt} = -\frac{\partial^2 H^*}{\partial x^2}
 - \left[\frac{\partial u^*}{\partial x}\right]^\top \frac{\partial^2 H^*}{\partial u^2}\,\frac{\partial u^*}{\partial x}
 - Q^*(t)\,\frac{\partial f^*}{\partial x} - \left[\frac{\partial f^*}{\partial x}\right]^\top Q^*(t)
 - Q^*(t)\,\frac{\partial f^*}{\partial u}\,\frac{\partial u^*}{\partial x}
 - \left[\frac{\partial f^*}{\partial u}\,\frac{\partial u^*}{\partial x}\right]^\top Q^*(t)
 - \frac{\partial^2 H^*}{\partial u \partial x}\,\frac{\partial u^*}{\partial x}
 - \left[\frac{\partial u^*}{\partial x}\right]^\top \frac{\partial^2 H^*}{\partial x \partial u} \tag{11.2.28}
\]
with
\[
Q^*(T) = \left.\frac{\partial^2 \Phi_0^*}{\partial x^2}\right|_{t=T}. \tag{11.2.29}
\]
λ = λ∗ + εQ∗δ + o(ε²).
\[
V(x) = V^* + \varepsilon\,\frac{\partial V^*}{\partial x}\,\delta + \frac{\varepsilon^2}{2}\,\delta^\top \frac{\partial^2 V^*}{\partial x^2}\,\delta + o(\varepsilon^3)
     = V^* + \varepsilon\, [\lambda^*]^\top \delta + \frac{\varepsilon^2}{2}\,\delta^\top Q^* \delta + o(\varepsilon^3). \tag{11.2.30}
\]
Hence,
\[
H + \frac{\partial V}{\partial t} = L_0 + \frac{dV}{dt}
 = L_0 + \frac{dV^*}{dt} + \varepsilon \left[\frac{d\lambda^*(t)}{dt}\right]^\top \delta + \varepsilon\, [\lambda^*(t)]^\top \frac{d\delta}{dt}
 + \frac{\varepsilon^2}{2}\,\delta^\top \frac{dQ^*(t)}{dt}\,\delta + \varepsilon^2\, \delta^\top Q^* \frac{d\delta}{dt} + o(\varepsilon^3). \tag{11.2.31}
\]
Thus, from
\[
L_0^* + \frac{dV^*}{dt} = H^* + \frac{\partial V^*}{\partial t} = 0
\]
and
\[
\frac{d\lambda^*(t)}{dt} = -\frac{\partial L(t, x^*(t), u^*(t), \lambda^*(t), \rho^*(t))}{\partial x},
\]
it follows that
\[
\begin{aligned}
H + \frac{\partial V}{\partial t}
&= L_0 - L_0^* - \varepsilon\,\frac{\partial L^*}{\partial x}\,\delta + [\lambda^*]^\top (f - f^*) + \varepsilon\,\delta^\top Q^*(f - f^*)
 + \frac{\varepsilon^2}{2}\,\delta^\top \frac{dQ^*(t)}{dt}\,\delta + o(\varepsilon^3) && (11.2.32)\text{--}(11.2.33)\\
&= L_0 - L_0^* - \varepsilon\,\frac{\partial L^*}{\partial x}\,\delta + \lambda^\top (f - f^*)
 + \frac{\varepsilon^2}{2}\,\delta^\top \frac{dQ^*(t)}{dt}\,\delta + o(\varepsilon^3) && (11.2.34)\text{--}(11.2.35)\\
&= L_0 - L_0^* - \varepsilon\left[\frac{\partial L_0^*}{\partial x} + [\lambda^*]^\top \frac{\partial f^*}{\partial x} + [\rho^*]^\top \frac{\partial h^*}{\partial x}\right]\delta + \lambda^\top (f - f^*)
 + \frac{\varepsilon^2}{2}\,\delta^\top \frac{dQ^*(t)}{dt}\,\delta + o(\varepsilon^3). && (11.2.36)\text{--}(11.2.37)
\end{aligned}
\]
Then, by expanding λ to first order in ε, and L₀ and f to second order in ε, we obtain
\[
\begin{aligned}
H + \frac{\partial V}{\partial t}
&= \varepsilon\left[\frac{\partial L_0^*}{\partial x} + \frac{\partial L_0^*}{\partial u}\,\frac{\partial u^*}{\partial x}\right]\delta
 + \frac{\varepsilon^2}{2}\,\delta^\top\left[\frac{\partial^2 L_0^*}{\partial x^2}
   + 2\,\frac{\partial^2 L_0^*}{\partial u \partial x}\,\frac{\partial u^*}{\partial x}
   + \left[\frac{\partial u^*}{\partial x}\right]^\top \frac{\partial^2 L_0^*}{\partial u^2}\,\frac{\partial u^*}{\partial x}\right]\delta\\
&\quad - \varepsilon\left[\frac{\partial L_0^*}{\partial x} + [\lambda^*]^\top \frac{\partial f^*}{\partial x} + [\rho^*]^\top \frac{\partial h^*}{\partial x}\right]\delta\\
&\quad + \sum_{k=1}^{n}\left(\lambda_k^* + \varepsilon\,\delta^\top \frac{\partial \lambda_k^*}{\partial x}\right)
   \left(\varepsilon\,\frac{\partial f_k^*}{\partial x}\,\delta + \varepsilon\,\frac{\partial f_k^*}{\partial u}\,\frac{\partial u^*}{\partial x}\,\delta
   + \frac{\varepsilon^2}{2}\,\delta^\top\left[\frac{\partial^2 f_k^*}{\partial x^2}
   + 2\,\frac{\partial^2 f_k^*}{\partial u \partial x}\,\frac{\partial u^*}{\partial x}
   + \left[\frac{\partial u^*}{\partial x}\right]^\top \frac{\partial^2 f_k^*}{\partial u^2}\,\frac{\partial u^*}{\partial x}\right]\delta\right)
   + \frac{\varepsilon^2}{2}\,\delta^\top \frac{dQ^*(t)}{dt}\,\delta + o(\varepsilon^3)\\
&= \varepsilon\,\frac{\partial H^*}{\partial u}\,\frac{\partial u^*}{\partial x}\,\delta
 - \varepsilon\,[\rho^*]^\top \frac{\partial h^*}{\partial x}\,\delta
 + \frac{\varepsilon^2}{2}\,\delta^\top\left[\frac{\partial^2 H^*}{\partial x^2}
 + \frac{\partial^2 H^*}{\partial u \partial x}\,\frac{\partial u^*}{\partial x}
 + \left[\frac{\partial u^*}{\partial x}\right]^\top \frac{\partial^2 H^*}{\partial x \partial u}
 + \left[\frac{\partial u^*}{\partial x}\right]^\top \frac{\partial^2 H^*}{\partial u^2}\,\frac{\partial u^*}{\partial x}
 + \left[\frac{\partial f^*}{\partial x}\right]^\top Q^* + Q^*\,\frac{\partial f^*}{\partial x}
 + Q^*\,\frac{\partial f^*}{\partial u}\,\frac{\partial u^*}{\partial x}
 + \left[\frac{\partial f^*}{\partial u}\,\frac{\partial u^*}{\partial x}\right]^\top Q^*
 + \frac{dQ^*(t)}{dt}\right]\delta + o(\varepsilon^3). \tag{11.2.38}
\end{aligned}
\]
From (11.2.20), the first two terms on the right-hand side of (11.2.38)
vanish. Then, (11.2.28) holds because H + ∂V/∂t = 0 and δ ∈ B(n, 1) is
arbitrary. Now, by expanding λ and ∂Φ₀/∂x to first order in ε and
using (11.2.9), (11.2.14) and (11.2.27), Equation (11.2.29) is obtained.
To continue, let
\[
A^* \triangleq \frac{\partial^2 H^*}{\partial u^2} + \sum_{k \in A^*} \rho_k^* \frac{\partial^2 h_k^*}{\partial u^2}, \qquad
[B^*]^\top \triangleq \frac{\partial \hat h^*}{\partial u}, \tag{11.2.39}
\]
\[
E^* \triangleq -\frac{\partial^2 H^*}{\partial x \partial u} - \sum_{k \in A^*} \rho_k^* \frac{\partial^2 h_k^*}{\partial x \partial u} - \left[\frac{\partial f^*}{\partial u}\right]^\top Q^*, \tag{11.2.40}
\]
\[
\begin{bmatrix} A^* & B^* \\ [B^*]^\top & 0_{q \times q} \end{bmatrix}
\begin{bmatrix} \partial u^*/\partial x \\ \partial \hat\rho^*/\partial x \end{bmatrix}
= \begin{bmatrix} E^* \\ F^* \end{bmatrix}. \tag{11.2.42}
\]
From Assumptions 11.2.6–11.2.7, the leftmost block matrix in (11.2.42) is
non-singular. Thus, (11.2.41) holds.
\[
\frac{\partial u^*}{\partial x} = [A^*]^{-1} E^*
 = -\left[\frac{\partial^2 H^*}{\partial u^2}\right]^{-1}\left(\frac{\partial^2 H^*}{\partial x \partial u} + \left[\frac{\partial f^*}{\partial u}\right]^\top Q^*\right). \tag{11.2.43}
\]
Hence, the feedback control
\[
u(x) \approx u^* + \frac{\partial u^*}{\partial x}\,(x - x^*) \tag{11.2.44}
\]
can be computed readily.
Remark 11.2.3 Since the magnitude of the admissible perturbation ε₀ in
Lemma 11.2.1 is hard to determine, or the ε₀ determined is too small, the
structure of the solution may change after perturbations. Specifically, it is
possible that a small boundary interval or a small interior interval along the
nominal trajectory disappears after perturbations. In the first situation, the
proposed method tries to keep the perturbed trajectory on the boundary, while
in the second the perturbed trajectory may be infeasible over a small interval.
As a remedy, we can modify the control law and project any infeasible point
onto the boundary of the constraints. Suppose that some constraints are
violated at time t, i.e., h_k(t, x, u) > 0 for k ∈ K ⊆ P. Then, the control
law (11.2.44) should be modified as in (11.2.45). In this way, the perturbed
solution is always feasible, although some optimality may be lost.
The following algorithm gives the procedure to compute the feedback con-
trol (11.2.44) and its modification (11.2.45).
Algorithm 11.2.1
Step 1. Solve Problem (P₁) to obtain u∗(t) and x∗(t) for t ∈ [0, T]. The
expression ρ∗(t) = ρ(t, x∗, u∗, λ∗) can be obtained from (11.2.10)–(11.2.12).
Substituting the obtained ρ∗ into (11.2.8), λ∗(t) can be computed by
integrating (11.2.8) backwards from t = T to t = 0 with the terminal
condition (11.2.9). Then, ρ∗(t) can be computed, and A∗(t) is also obtained.
Step 2. Compute Q∗(t), t ∈ [0, T], by integrating (11.2.28) backwards in time
with the terminal condition (11.2.29), where ∂u∗/∂x is given by (11.2.41)
or (11.2.43). Then, ∂u∗/∂x is obtained from (11.2.41) or (11.2.43)
for each t ∈ [0, T].
Step 3. For each neighbouring trajectory x(t), the control u(x) is given
by (11.2.44). If any constraints are violated, u(x) shall be modified
as in (11.2.45).
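The following is a schematic sketch, under stated assumptions, of Steps 2–3 for an unconstrained (interior) arc: the backward sweep of (11.2.28) followed by the gain of (11.2.43) and the law (11.2.44). Every name here (Hxx, Hxu, Huu, fx, fu, x_star, u_star) is an illustrative callable supplied by the user, not part of the text.

```python
# Sketch of the backward Riccati sweep (11.2.28) and the interior-arc gain (11.2.43).
import numpy as np
from scipy.integrate import solve_ivp

def gain(t, Q, Huu, Hxu, fu):
    # du*/dx = -Huu^{-1} (Hxu + fu^T Q), cf. (11.2.43)
    return -np.linalg.solve(Huu(t), Hxu(t) + fu(t).T @ Q)

def riccati_rhs(t, q_flat, n, Hxx, Hxu, Huu, fx, fu):
    Q = q_flat.reshape(n, n)
    ux = gain(t, Q, Huu, Hxu, fu)
    dQ = (-Hxx(t) - ux.T @ Huu(t) @ ux                 # cf. (11.2.28)
          - Q @ fx(t) - fx(t).T @ Q
          - Q @ fu(t) @ ux - (fu(t) @ ux).T @ Q
          - Hxu(t).T @ ux - ux.T @ Hxu(t))
    return dQ.reshape(-1)

def neighbouring_feedback(T, QT, n, derivs):
    sol = solve_ivp(riccati_rhs, (T, 0.0), QT.reshape(-1),
                    args=(n, *derivs), dense_output=True)   # Step 2
    Hxx, Hxu, Huu, fx, fu = derivs
    def u_of_x(t, x, x_star, u_star):                       # Step 3, law (11.2.44)
        Q = sol.sol(t).reshape(n, n)
        return u_star(t) + gain(t, Q, Huu, Hxu, fu) @ (x - x_star(t))
    return u_of_x
```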
Consider the following problem, which is generalized from the Rayleigh
problem with a mixed state–control constraint [24]:
\[
u(t) + \frac{x_1(t) + t}{6} \le 0, \quad t \in [0, 4.5].
\]
The Lagrangian L for this problem is
\[
L = u^2 + x_1^2 + \lambda_1 x_2\,(1 + t/45) + \lambda_2\left[-x_1 + x_2\,(1.4 - p x_2^2) + 4u\right] + \rho\left(u + (x_1 + t)/6\right).
\]
From the stationarity condition (11.2.10), ∂L/∂u = 0 gives
ρ = −2u − 4λ₂.
\[
\frac{d\lambda_1(t)}{dt} = -\partial L/\partial x_1 = \lambda_2(t) - 2x_1(t) - \rho/6
 = \tfrac{5}{3}\lambda_2(t) - 2x_1(t) + u(t)/3, \quad \lambda_1(T) = 0,
\]
\[
\frac{d\lambda_2(t)}{dt} = -\partial L/\partial x_2
 = 3p\,\lambda_2(t)\,(x_2(t))^2 - 1.4\,\lambda_2(t) - \lambda_1(t)\,(1 + t/45), \quad \lambda_2(T) = 0.
\]
\[
\frac{dQ^*(t)}{dt} = F(Q^*), \quad Q^*(4.5) = 0_{2\times 2}.
\]
Here, along the boundary intervals of the constraint,
\[
F(Q^*) = \begin{bmatrix} -\tfrac{37}{18} & 0 \\ 0 & 0.84\,\lambda_2^* x_2^* \end{bmatrix}
 - Q^* \begin{bmatrix} 0 & \tfrac{t+45}{45} \\ -\tfrac{5}{3} & 1.4 - 0.42\,(x_2^*)^2 \end{bmatrix}
 - \begin{bmatrix} 0 & \tfrac{t+45}{45} \\ -\tfrac{5}{3} & 1.4 - 0.42\,(x_2^*)^2 \end{bmatrix}^\top Q^*,
\]
while along the interior intervals,
\[
F(Q^*) = \begin{bmatrix} -2 & 0 \\ 0 & 0.84\,\lambda_2^* x_2^* \end{bmatrix}
 - Q^* \begin{bmatrix} 0 & \tfrac{t+45}{45} \\ -1 & 1.4 - 0.42\,(x_2^*)^2 \end{bmatrix}
 - \begin{bmatrix} 0 & \tfrac{t+45}{45} \\ -1 & 1.4 - 0.42\,(x_2^*)^2 \end{bmatrix}^\top Q^*
 + Q^* \begin{bmatrix} 0 & 0 \\ 0 & 8 \end{bmatrix} Q^*.
\]
Fig. 11.2.1: The trajectories of the state x, the control u and the constraint under the nominal, feedback and optimal open-loop controls for the perturbed problem (t in seconds)
Figure 11.2.1 compares the nominal control, the feedback control u(x)
of (11.2.44) and (11.2.45), and the optimal open-loop control for the
perturbed problem. The trajectories of the state, control and constraint are
depicted by the black solid lines, the red solid lines and the blue dashed
lines, respectively, in Figure 11.2.1. It is seen that the
errors between the system trajectories under the feedback control and those
under the optimal open-loop control are relatively small, and the nominal
control is infeasible for t ∈ [3.18, 4.5]. Since the perturbation is not small, the
feedback control law (11.2.44) is infeasible in a small interval t ∈ [1.29, 1.34],
where the modified control law (11.2.45) is used instead. Under this modifi-
cation, the feasibility of the feedback control is regained.
\[
\frac{dx(t)}{dt} = f(x(t), y(t), u(t)), \quad t \in (0, T], \tag{11.3.1}
\]
\[
\frac{dy(t)}{dt} = p(x(t)), \tag{11.3.2}
\]
\[
x(0) = x^0, \tag{11.3.3}
\]
\[
y(0) = y^0, \tag{11.3.4}
\]
|p(x)| ≤ C2 (1 + |x|).
Remark 11.3.1 Suppose that the output equations are algebraic equations
given below rather than the output system (11.3.2) with the initial condi-
tion (11.3.4).
\[
\hat x = [x_1, \ldots, x_s]^\top, \tag{11.3.6}
\]
\[
\frac{d\hat x(t)}{dt} = q(x(t)), \tag{11.3.7}
\]
where q = [q1 , . . . , qs ]T is a continuously differentiable function. Then, it is
easy to see that
\[
u(t) = \sum_{j=1}^{N_1} k_{1,j}\,(y(t) - r(t))\,\chi_{I_{1,j}}(t)
 + \sum_{j=1}^{N_2} k_{2,j} \int_0^t (y(s) - r(s))\,\chi_{I_{2,j}}(s)\, ds
 + \sum_{j=1}^{N_3} k_{3,j}\,\frac{dy(t)}{dt}\,\chi_{I_{3,j}}(t), \tag{11.3.10}
\]
while
\[
0 = t_{i,0} < t_{i,1} < t_{i,2} < \cdots < t_{i,N_i} < t_{i,N_i+1} = T, \quad i = 1, 2, 3, \tag{11.3.12}
\]
are the switching times for the proportional, integral and derivative control
actions, respectively, and χ_I denotes the indicator function of I given by
\[
\chi_I(t) = \begin{cases} 1, & t \in I, \\ 0, & \text{otherwise}. \end{cases} \tag{11.3.13}
\]
We now specify the region within which the output trajectory is allowed to
move. This region is defined in terms of the following continuous inequality
constraints, which arise due to practical requirements, such as constraints
on the rise time and for avoiding overshoot. They may also arise due to
engineering specifications on the PID controller.
The optimal control problem may now be stated below. Given sys-
tem (11.3.1)–(11.3.4), design a PID controller in the form defined by (11.3.10)
such that the output y(t) of the corresponding closed loop system will
move within the specified region defined by the continuous inequality con-
straints (11.3.14) and, at the same time, it will track the given reference input
such that the terminal condition (11.3.15) is satisfied. Let this problem be
referred to as Problem (P2 ).
First, we formulate the cost functional
\[
J(k) = \int_0^T \left\{\alpha_1\,(y(t) - r(t))^2 + \alpha_2 \left[\frac{dy(t)}{dt}\right]^2 + \alpha_3\,[u(t)]^2\right\} dt. \tag{11.3.16}
\]
For the integral term of the PID controller given by (11.3.10), we define
\[
z_j(t) = \int_0^t [y(s) - r(s)]\,\chi_{I_{2,j}}(s)\, ds, \quad j = 1, \ldots, N_2. \tag{11.3.17}
\]
Then,
\[
\frac{dz_j(t)}{dt} = (y(t) - r(t))\,\chi_{I_{2,j}}(t), \tag{11.3.18}
\]
\[
z_j(0) = 0. \tag{11.3.19}
\]
Let z(t) = [z₁(t), . . . , z_{N₂}(t)]ᵀ and q(t) = [q₁(t), . . . , q_{N₂}(t)]ᵀ, where
\[
\frac{dz(t)}{dt} = q(t), \tag{11.3.21}
\]
\[
z(0) = 0. \tag{11.3.22}
\]
\[
u(t) = \sum_{j=1}^{N_1} k_{1,j}\,(y(t) - r(t))\,\chi_{I_{1,j}}(t)
 + \sum_{j=1}^{N_2} k_{2,j}\, z_j(t) + \sum_{j=1}^{N_3} k_{3,j}\, p(x(t))\,\chi_{I_{3,j}}(t). \tag{11.3.26}
\]
Here,
k = [k_{1,1}, . . . , k_{1,N₁}, k_{2,1}, . . . , k_{2,N₂}, k_{3,1}, . . . , k_{3,N₃}]ᵀ
is the vector containing the gains for the proportional, integral and derivative
terms of the PID controller.
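A direct transcription of the control law (11.3.26) is given below as a minimal sketch; all gains and switching grids are placeholder inputs, and the integral states z[j] are assumed to be propagated separately via (11.3.18)–(11.3.19).

```python
# Sketch of the switching PID law (11.3.26); px stands for p(x(t)).
def chi(t, a, b):
    """Indicator of the interval [a, b), cf. (11.3.13)."""
    return 1.0 if a <= t < b else 0.0

def pid_control(t, y, r, z, px, k1, k2, k3, grid1, grid3):
    u = sum(k1[j] * (y - r) * chi(t, grid1[j], grid1[j + 1]) for j in range(len(k1)))
    u += sum(k2[j] * z[j] for j in range(len(k2)))          # integral states (11.3.17)
    u += sum(k3[j] * px * chi(t, grid3[j], grid3[j + 1]) for j in range(len(k3)))
    return u
```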
The specified region remains the same as given by (11.3.14). The cost
functional (11.3.16) becomes
\[
J(k) = \int_0^T \Bigl\{\alpha_1\,(y(t) - r(t))^2 + \alpha_2\,[p(x(t))]^2
 + \alpha_3 \Bigl[\sum_{j=1}^{N_1} k_{1,j}\,(y(t) - r(t))\,\chi_{I_{1,j}}(t)
 + \sum_{j=1}^{N_2} k_{2,j}\, z_j(t) + \sum_{j=1}^{N_3} k_{3,j}\, p(x(t))\,\chi_{I_{3,j}}(t)\Bigr]^2\Bigr\}\, dt. \tag{11.3.28}
\]
The problem may now be re-stated as: Given system (11.3.23) with initial
condition (11.3.24), find a PID control parameter vector k such that the
cost function (11.3.28) is minimized subject to the continuous inequality con-
straint (11.3.14) and the terminal equality constraint (11.3.15). Let this prob-
lem be referred to as Problem (Q2 ). Clearly, Problem (Q2 ) is an optimal
parameter selection problem.
where
\[
\bar L_{i,\varepsilon}(t, x(t), y(t), z(t), k) = L_{i,\varepsilon}(t, x(t), y(t), u(t)) \tag{11.3.32}
\]
and
\[
l(t, x, y, z, k) = \alpha_1 |y - r|^2 + \alpha_2 |p(x)|^2 + \alpha_3 |u(t)|^2. \tag{11.3.33}
\]
On the basis of Theorems 11.3.1 and 11.3.2, Problem (P₂) can be solved
through solving a sequence of optimal parameter selection problems (Q_{2,ε,γ}),
each subject to only the terminal equality constraint (11.3.15). Each of these
optimal parameter selection problems can be solved as a nonlinear optimization
problem by using a gradient-based optimization method, such as a sequential
quadratic programming scheme. See Chapter 3 for details. Thus, the optimal
control software MISER is applicable. Further details are given in the next
section.
Theorem 11.3.4 The gradient formula for the terminal constraint function
Ω(y(T | k)) with respect to k is given by
\[
\frac{\partial \Omega(y(T \mid k))}{\partial k} = \int_0^T \frac{\partial \tilde H_{\varepsilon,\gamma}(t, x(t), y(t), z(t), k, \tilde\lambda_{\varepsilon,\gamma}(t))}{\partial k}\, dt, \tag{11.3.39}
\]
\[
\tilde\lambda(T) = \frac{d\Omega(y(T))}{dy}. \tag{11.3.41b}
\]
3. Check whether all the continuous inequality constraints (11.3.14) are
satisfied. If they are satisfied, go to Step 4. Otherwise, increase γ to 10γ
and go to Step 2 with k∗_{ε,γ} as the initial guess for the new optimization
process.
4. If ε is small enough, say, less than or equal to a given small number, we
have a successful exit. Else, decrease ε to ε/10 and go to Step 2, using k∗_{ε,γ}
as the initial guess for the new optimization process.
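The ε–γ adjustment above can be written as a simple outer loop around any NLP solver; the sketch below is illustrative, with `solve_nlp` and `constraints_ok` as assumed user-supplied callables (e.g., an SQP routine and a check of the constraints (11.3.14)).

```python
# Schematic sketch of the epsilon-gamma loop; solve_nlp and constraints_ok are assumed.
def constrained_pid_design(solve_nlp, constraints_ok, k0,
                           eps=1.0, gamma=1.0, eps_min=1e-4):
    k = k0
    while True:
        k = solve_nlp(k, eps, gamma)      # solve Problem (Q2_{eps,gamma})
        if not constraints_ok(k):         # some g_i violated?
            gamma *= 10.0                 # strengthen the penalty and re-solve
            continue
        if eps <= eps_min:                # successful exit
            return k
        eps /= 10.0                       # sharpen the smoothing and re-solve
```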
\[
\frac{d^3 y(t)}{dt^3} + b_1 \frac{d^2 y(t)}{dt^2} + b_2\left[a_1\left(\frac{dy(t)}{dt}\right)^3 + a_2 \frac{dy(t)}{dt}\right]
 = b_3 \frac{d\bar\delta(t)}{dt} + b_2\,\bar\delta(t) + w, \tag{11.3.42}
\]
where
\[
w = b_3 \frac{d(d)}{dt} + b_2\, d,
\]
\[
\frac{d\delta(t)}{dt} = b_4\,\bar e(t), \tag{11.3.43}
\]
with
\[
\bar e = \begin{cases} e, & \text{if } |e| \le e_{\max}, \\ e_{\max}\,\mathrm{sign}(e), & \text{if } |e| > e_{\max}, \end{cases} \tag{11.3.44}
\]
where e = u − δ, and
\[
\bar\delta = \begin{cases} \delta, & \text{if } |\delta| \le \delta_{\max}, \\ \delta_{\max}\,\mathrm{sign}(\delta), & \text{if } |\delta| > \delta_{\max}. \end{cases} \tag{11.3.45}
\]
The variable w accounts for sea disturbances acting on the ship, with d a
constant disturbance; u is the control, which is chosen in the form of the PID
controller defined by (11.3.10); δ is the rudder angle; e is the error as defined;
and ē and δ̄ are the actual inputs to the actuator and the ship dynamics,
respectively, because of the saturation properties defined in (11.3.44)
and (11.3.45). The ship model is used in its full generality without resorting
to simplification or linearization. This work develops further some previous
studies of optimal ship steering strategies with time optimal control [257],
phase advanced control [31], parameter self-tuning [141], adaptive control
[7] and constrained optimal model following [264].
A ship steering problem has two phases: course changing and
course keeping. During the course changing phase, it is required to manoeuvre
the ship such that it moves quickly towards the desired course set by
the command without violating the constraints arising from performance
specifications and physical limitations on the controller. During the course
keeping phase, the ship is required to move along the desired course. In
this application, the PID controller of the form defined by (11.3.10) with
N₁ = N₂ = N₃ = 6 is used. More specifically,
\[
u(t) = \sum_{i=1}^{6} k_{1,i}\,(y(t) - r(t))\,\chi_{[t_{i-1},t_i)}(t)
 + \sum_{i=1}^{6} k_{2,i} \int_0^t (y(s) - r(s))\,\chi_{[t_{i-1},t_i)}(s)\, ds
 + \sum_{i=1}^{6} k_{3,i}\,\frac{dy(t)}{dt}\,\chi_{[t_{i-1},t_i)}(t), \tag{11.3.46}
\]
\[
\frac{dx_1(t)}{dt} = x_2(t), \tag{11.3.49}
\]
\[
\frac{dx_2(t)}{dt} = x_3(t), \tag{11.3.50}
\]
\[
\frac{dx_3(t)}{dt} = -b_1 x_3(t) - b_2\left[a_1\,(x_2(t))^3 + a_2\, x_2(t)\right] + b_3 b_4\,\bar e + b_2\,(x_4(t) + d), \tag{11.3.51}
\]
\[
\frac{dx_4(t)}{dt} = b_4\,\bar e, \tag{11.3.52}
\]
\[
\frac{dx_{5,j}(t)}{dt} = (x_1(t) - r(t))\,\chi_{[t_{j-1},t_j)}(t), \quad j = 1, \ldots, 6, \tag{11.3.53}
\]
with the initial condition
where
x = [x₁, x₂, . . . , x₁₀]ᵀ,
e = u(t) − x₄(t), (11.3.55)
with
\[
u(t) = \sum_{i=1}^{6} k_{1,i}\,(x_1(t) - r(t))\,\chi_{[t_{i-1},t_i)}(t)
 + \sum_{i=1}^{6} k_{2,i}\, x_{5,i}(t) + \sum_{i=1}^{6} k_{3,i}\, x_2(t)\,\chi_{[t_{i-1},t_i)}(t). \tag{11.3.56}
\]
The ship parameters are:
a₁ = −30.0, a₂ = −5.6, b₁ = 0.1372, b₂ = −0.0002014, b₃ = −0.003737, b₄ = 0.5.
i.e., the heading angle should not go beyond 1% of the desired reference input
r(t). This constraint can be written as
We also impose a constraint on the rise time of the heading angle such that
the heading angle is constrained to reach at least 70% of the desired reference
input in 30 seconds and 95% in 60 seconds, i.e.,
where
\[
h(t) = \begin{cases}
0, & t \in [0, 6), \\
5.1 \times 10^{-4}\, t - 3.1 \times 10^{-3}, & t \in [6, 30), \\
1.5 \times 10^{-4}\, t + 7.9 \times 10^{-3}, & t \in [30, 60), \\
2.2 \times 10^{-6}\, t + 16.4 \times 10^{-3}, & t \in [60, 300].
\end{cases} \tag{11.3.60}
\]
and
g4 (t) = x4 (t) − π/6 ≤ 0, t ∈ [0, 300s]. (11.3.63)
Similarly, to cater for another saturation property, we have
and
g6 (t) = x4 (t) − u(t) − π/30 ≤ 0, t ∈ [0, 300s], (11.3.66)
where u(t) is given by (11.3.56).
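For reference, the rise-time envelope (11.3.60) and the saturation constraints (11.3.63) and (11.3.66) translate directly into plain Python functions, as in the short sketch below.

```python
# The envelope (11.3.60) and constraints (11.3.63), (11.3.66) as Python functions.
import numpy as np

def h(t):
    if t < 6:
        return 0.0
    if t < 30:
        return 5.1e-4 * t - 3.1e-3
    if t < 60:
        return 1.5e-4 * t + 7.9e-3
    return 2.2e-6 * t + 16.4e-3

def g4(x4):
    return x4 - np.pi / 6            # (11.3.63): rudder-angle bound

def g6(x4, u):
    return x4 - u - np.pi / 30       # (11.3.66): actuator-error bound
```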
The terminal equality constraint is
Fig. 11.3.2: The heading angle y(t) of the ship, together with the reference input r(t), the overshoot bound 1.01r(t) and the rise-time envelope h(t), over t ∈ [0, 300] s
Fig. 11.3.3–Fig. 11.3.5: The trajectories of the constraint functions g₃(t), g₄(t), g₅(t) and g₆(t) over t ∈ [0, 300] s
The optimal PID control problem may now be stated formally as follows.
Given system (11.3.49)–(11.3.54), find a PID control parameter vector
k = [(k¹)ᵀ, . . . , (k⁶)ᵀ]ᵀ with kⁱ = [k₁ⁱ, k₂ⁱ, k₃ⁱ]ᵀ, i = 1, 2, . . . , 6, such that the
cost functional
\[
J = \int_0^{300} \left\{\alpha_1\,(x_1(t) - r(t))^2 + \alpha_2\, x_2^2(t) + \alpha_3\, u^2(t)\right\} dt \tag{11.3.68}
\]
Fig. 11.3.6: The heading angle of the ship with a larger disturbance
Fig. 11.3.7: The heading angle of the ship with a disturbance coming from
the initial heading direction
\[
u(t) = \sum_{i=1}^{6} k_{1,i}\,(x_1(t) - r(t))\,\chi_{[t_{i-1},t_i)}(t)
 + \sum_{i=1}^{6} k_{2,i}\, x_{5,i}(t) + \sum_{i=1}^{6} k_{3,i}\, x_2(t)\,\chi_{[t_{i-1},t_i)}(t). \tag{11.3.69}
\]
The results obtained are shown in Figures 11.3.2, 11.3.3, 11.3.4 and 11.3.5,
from which we see that all the constraints are satisfied. The heading angle
tracks the desired reference input with no steady-state error after some small
oscillation due to the constant disturbance. The overshoot of the heading
angle above the reference input is less than 1%, and hence the constraint
g₁(t) ≤ 0, t ∈ [0, 300], is satisfied. To test the robustness of this PID
controller, we run the model with the optimal PID controller under the
following environments: (1) the disturbance is much larger, more specifically,
d = 0.6 × π/180; and (2) the disturbance comes from the initial heading
direction, more specifically, d = −0.3 × π/180. The results are shown in
Figures 11.3.6 and 11.3.7. In both cases, we see that the heading angle tracks
the desired reference input with no steady-state error after some small
oscillations.
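Such robustness experiments can be reproduced schematically as follows. This is only a sketch under stated assumptions: a single-interval PID law in place of the six-interval law (11.3.56), placeholder gains (not the optimal values computed in the text), and a constant reference; the dynamics and saturations follow (11.3.44), (11.3.45) and (11.3.49)–(11.3.52).

```python
# Closed-loop sketch of the ship model with actuator and rudder saturations.
import numpy as np
from scipy.integrate import solve_ivp

a1, a2, b1, b2, b3, b4 = -30.0, -5.6, 0.1372, -2.014e-4, -3.737e-3, 0.5
d = 0.3 * np.pi / 180.0
e_max, delta_max = np.pi / 30.0, np.pi / 6.0
r = lambda t: 0.0175                              # illustrative constant reference
sat = lambda v, vmax: float(np.clip(v, -vmax, vmax))

def rhs(t, x, k1, k2, k3):
    x1, x2, x3, x4, z = x
    u = k1 * (x1 - r(t)) + k2 * z + k3 * x2       # single-interval PID, cf. (11.3.56)
    e_bar = sat(u - x4, e_max)                    # actuator saturation (11.3.44)
    x4_bar = sat(x4, delta_max)                   # rudder saturation (11.3.45)
    dx3 = (-b1 * x3 - b2 * (a1 * x2**3 + a2 * x2)
           + b3 * b4 * e_bar + b2 * (x4_bar + d))
    return [x2, x3, dx3, b4 * e_bar, x1 - r(t)]

sol = solve_ivp(rhs, (0, 300), [0, 0, 0, 0, 0],
                args=(-5.0, -0.05, -50.0), max_step=0.5)  # placeholder gains
print(sol.y[0, -1])                               # terminal heading angle
```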
11.4 Exercises
12.1 Introduction
Assumption 12.2.2 The Wiener process w(t) and the initial random vector
ξ 0 are statistically independent.
Ω = {δ ∈ Rs : ηj (δ) ≥ 0, j = 1, . . . , M }, (12.2.3)
\[
\cdots + \int_0^T \left\{[q(t, \delta, u(t))]^\top \xi(t) + \vartheta(t, \delta, u(t)) + [\xi(t)]^\top Q(t, \delta, u(t))\,\xi(t)\right\} dt \tag{12.2.5}
\]
is minimized over D, where S(δ) ∈ Rn×n and Q(t, δ, u) ∈ Rn×n are sym-
metric and positive semi-definite matrices continuously differentiable with
respect to their respective arguments, while q(t, δ, u) and p(δ) (respectively,
ϑ(t, δ, u) and υ(δ)) are n vector-valued functions (respectively, real-valued
functions) that are also continuously differentiable with respect to their ar-
guments. For convenience, let this combined optimal parameter selection and
optimal control problem be referred to as Problem (SP1 ).
In this section, we wish to show that the combined optimal parameter selec-
tion and optimal control problem (SP1 ) is equivalent to a deterministic one.
To begin, we note that the solution of the system (12.2.1) corresponding to
each (δ, u) can be written as
\[
\xi(t \mid \delta, u) = \Phi(t, 0 \mid \delta, u)\,\xi^0 + \int_0^t \Phi(t, s \mid \delta, u)\, b(s, \delta, u(s))\, ds
 + \int_0^t \Phi(t, s \mid \delta, u)\, D(s, \delta, u(s))\, dw(s), \tag{12.2.6}
\]
where, for each (δ, u), Φ(t, s | δ, u) ∈ Rn×n is the principal solution matrix
of the homogeneous system
\[
\frac{\partial \Phi(t, \tau)}{\partial t} = A(t, \delta, u(t))\,\Phi(t, \tau), \quad t > \tau, \tag{12.2.7a}
\]
\[
\Phi(\tau, \tau) = I, \tag{12.2.7b}
\]
where I is the identity matrix.
is Gaussian, and
ξ₁(t) = Φ(t, 0 | δ, u)ξ⁰
is also Gaussian if ξ⁰ is. Thus, for each (δ, u), the process {ξ(t) : t ≥ 0},
given by (12.2.6), is a Gaussian Markov process with mean
Differentiating (12.2.8) with respect to t and then using (12.2.7), we note that
for each (δ, u) ∈ D, μ(t | δ, u) is the corresponding solution of the following
system of differential equations.
\[
\frac{d\mu(t)}{dt} = A(t, \delta, u(t))\,\mu(t) + b(t, \delta, u(t)), \quad t > 0, \tag{12.2.10a}
\]
\[
\mu(0) = \mu^0. \tag{12.2.10b}
\]
\[
\frac{d\Psi(t)}{dt} = A(t, \delta, u(t))\,\Psi(t) + \Psi(t)\,[A(t, \delta, u(t))]^\top + D(t, \delta, u(t))\,\Theta(t)\,[D(t, \delta, u(t))]^\top \tag{12.2.11a}
\]
with the initial condition
Ψ (0) = M 0 . (12.2.11b)
Note that Ψ (t | δ, u) is symmetric. Thus, there are only n(n + 1)/2 dis-
tinct differential equations in (12.2.11). The corresponding conditional joint
probability density function for ξ(t) is given by
\[
f(x, t \mid \delta, u) = (2\pi)^{-n/2}\left[\det \Psi(t \mid \delta, u)\right]^{-1/2}
 \exp\left\{-\tfrac{1}{2}\,[x - \mu(t \mid \delta, u)]^\top [\Psi(t \mid \delta, u)]^{-1}\,[x - \mu(t \mid \delta, u)]\right\}. \tag{12.2.12}
\]
Let us now turn our attention to the cost functional (12.2.5). First, we
note that
" #
E [ξ(t)] Q(t, δ, u(t))ξ(t)
" #
= E Tr([ξ(t)] Q(t, δ, u(t))ξ(t))
" #
= E Tr(Q(t, δ, u(t))ξ(t)[ξ(t)] )
" #
= Tr Q(t, δ, u(t))Ψ (t | δ, u) + μ(t | δ, u)[μ(t | δ, u))] , (12.2.13)
By carrying out the required integration, it is easy to show that this constraint
is equivalent to (12.2.15). This completes the proof.
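The deterministic reduction above rests on propagating the mean (12.2.10) and covariance (12.2.11) forward in time. The following minimal sketch does exactly that for a linear system with time-invariant placeholder data (A, b, D, Theta are illustrative, not from the text).

```python
# Sketch: joint integration of the mean ODE (12.2.10) and covariance ODE (12.2.11).
import numpy as np
from scipy.integrate import solve_ivp

n = 2
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
b = np.array([0.1, 0.0])
D = np.array([[0.0], [1.0]])
Theta = np.array([[0.2]])

def moments_rhs(t, s):
    mu, Psi = s[:n], s[n:].reshape(n, n)
    dmu = A @ mu + b                               # (12.2.10a)
    dPsi = A @ Psi + Psi @ A.T + D @ Theta @ D.T   # (12.2.11a)
    return np.concatenate([dmu, dPsi.reshape(-1)])

mu0, M0 = np.zeros(n), 0.01 * np.eye(n)
sol = solve_ivp(moments_rhs, (0.0, 5.0), np.concatenate([mu0, M0.reshape(-1)]))
print(sol.y[:n, -1])                               # mean at t = 5
```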
\[
\frac{dx(t)}{dt} = f(t, x(t), \delta, u(t)), \tag{12.2.18a}
\]
\[
x(0) = x^0, \tag{12.2.18b}
\]
where w(t) is a Wiener process with zero mean and covariance given by
\[
E\{w(t)\,w(t')\} = \int_0^{\min(t, t')} \theta(s)\, ds, \quad \theta(s) \ge 0 \ \text{for all } s. \tag{12.2.21}
\]
Here, we assume that the variance of the Wiener process is stationary, i.e.,
θ(s) ≡ θ for all s. In the probabilistic constraint (12.2.23), α > 0 and ε > 0.
This constraint implies that we want to be at least 100ε% confident that
the quality state of the machine is in the interval [x₀ − α, x₀ + α] throughout
its life span, where α > 0 denotes the allowed deviation of the quality state
of the machine from its initial state. In practice, we would obviously like to
make ε as close as possible to unity, while keeping α as small as possible.
This may, however, incur excessive control effort. The stochastic optimal
maintenance problem may now be stated as follows.
Given the dynamical system (12.2.20), find an admissible control u ∈ U
such that the expected return
\[
g_0(u) = E\left\{\exp(-rT)\, S\,\xi(T) + \int_0^T \exp(-rt)\,[\theta\,\xi(t) - u(t)]\, dt\right\} \tag{12.2.24}
\]
\[
\frac{d\mu(t)}{dt} = -b\,\mu(t) + u(t), \tag{12.2.25a}
\]
\[
\mu(0) = x_0, \tag{12.2.25b}
\]
and
\[
\frac{d\Psi(t)}{dt} = -2b\,\Psi(t) + \theta, \tag{12.2.26a}
\]
\[
\Psi(0) = 0. \tag{12.2.26b}
\]
Note that the variance Ψ (t) does not depend on the control u. Thus, the
differential equation (12.2.26) needs only to be solved once.
where
\[
g_1(t, \delta, u) = \mathrm{erf}\!\left(\frac{x_0 + \delta - \mu(t)}{\left(2\pi\Psi(t)\right)^{1/2}}\right)
 - \mathrm{erf}\!\left(\frac{x_0 - \delta - \mu(t)}{\left(2\pi\Psi(t)\right)^{1/2}}\right). \tag{12.2.29}
\]
\[
u^*(t) = \begin{cases} \bar u = 0.1, & 0 \le t \le t^*, \\ 0, & t^* < t \le T. \end{cases} \tag{12.2.30}
\]
The main reference for this section is [239]. Consider a system governed by
the following stochastic differential equation over a finite time interval (0, T ].
provided such a matrix C(t) exists. In fact, if B(t) has rank r and n > r,
then C(t) is just the right inverse of B(t). Substituting (12.3.3) and (12.3.2)
into (12.3.1), we obtain
\[
dx(t) = \left[A(t) + B(t)\hat K H(t)\right] x(t)\, dt + B(t) K y(t)\, dt
 + B(t) \hat K \Gamma^0(t)\left(dN^0(t) - \lambda^0(t)\, dt\right) + \Gamma(t)\left(dN(t) - \lambda(t)\, dt\right). \tag{12.3.5}
\]
Define
\[
\xi(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}.
\]
Then, the system dynamics (12.3.5) together with the observation dynam-
ics (12.3.2) can be jointly written as
and for each j = 1, . . . , r, Kj (respectively, K̂j ) is the jth row of the matrix
K (respectively, K̂). For convenience, let the components of the vector κ be
denoted as
κj , j = 1, . . . , 2rk.
Note that M̃ is a vector of zero-mean martingales.
The initial condition for the system dynamics may be deterministic or
Gaussian, i.e.,
x(0) = x0 , (12.3.7)
where x0 ∈ Rn is either a deterministic or a Gaussian vector. In the case
when x0 is a Gaussian vector, let x̄0 and P 0 be its mean and covariance,
respectively. Furthermore, it is assumed that x0 is statistically independent
of N and N 0 .
The initial condition for the observation dynamics is usually assumed to
be
y(0) = 0, (12.3.8)
that is, no information is available at t = 0. Thus, in the notation of (12.3.6),
we have
\[
\xi(0) = \begin{bmatrix} x^0 \\ 0 \end{bmatrix} = \xi^0. \tag{12.3.9}
\]
Consider the following homogeneous system.
\[
\frac{\partial \tilde\Phi(t, \tau)}{\partial t} = \tilde A(t, \kappa)\,\tilde\Phi(t, \tau), \quad 0 \le \tau \le t < \infty, \tag{12.3.10a}
\]
\[
\tilde\Phi(t, t) = I, \quad \text{for any } t \in [0, \infty), \tag{12.3.10b}
\]
where I denotes the identity matrix. For each κ, let Φ̃(t, τ | κ) be the cor-
responding solution of (12.3.10). Then, it is clear that for each κ, the cor-
responding solution of the system (12.3.6) with the initial condition (12.3.9)
can be written as
\[
\xi(t \mid \kappa) = \tilde\Phi(t, 0 \mid \kappa)\,\xi^0 + \int_0^t \tilde\Phi(t, \tau \mid \kappa)\,\tilde\Gamma(\tau, \kappa)\, d\tilde M(\tau). \tag{12.3.11}
\]
\[
\frac{d\mu(t)}{dt} = \tilde A(t, \kappa)\,\mu(t) \tag{12.3.13a}
\]
with the initial condition
\[
\mu(0) = \bar\xi^0. \tag{12.3.13b}
\]
For the covariance matrix of the process ξ, we have the following result.
Theorem 12.3.2 For each κ, let ξ(· | κ) be the solution of the coupled
system (12.3.6) with the initial condition (12.3.9). Then, the corresponding
covariance matrix Ψ (· | κ) is determined by the following matrix differential
equation.
where
\[
\tilde\Lambda(t) = \begin{bmatrix} \Lambda(t) & 0 \\ 0 & \Lambda^0(t) \end{bmatrix}
\]
with Λ(t) = diag(λ₁(t), . . . , λ_m(t)) and Λ⁰(t) = diag(λ₁⁰(t), . . . , λ_q⁰(t)).
Furthermore, P⁰ ∈ R^{n×n} is obviously zero in the case when x⁰ is a
deterministic vector.
we obtain
\[
\varphi^\top \Psi(t \mid \kappa)\,\varphi = \varphi^\top \tilde\Phi(t, 0 \mid \kappa)\,\Psi^0 \left[\tilde\Phi(t, 0 \mid \kappa)\right]^\top \varphi
 + \int_0^t \varphi^\top \tilde\Phi(t, \tau \mid \kappa)\,\tilde\Gamma(\tau, \kappa)\,\tilde\Lambda(\tau)\left[\tilde\Gamma(\tau, \kappa)\right]^\top \left[\tilde\Phi(t, \tau \mid \kappa)\right]^\top \varphi\, d\tau. \tag{12.3.18}
\]
Ψ (0 | κ) = Ψ 0 . (12.3.20)
\[
\frac{dz(t)}{dt} = f(t, z(t), \kappa) \tag{12.3.21a}
\]
with the initial condition
\[
z(0) = z^0, \tag{12.3.21b}
\]
where f (respectively, z) is determined by (12.3.13a) together with (12.3.14a)
(respectively, (12.3.13b) together with (12.3.14b)).
In this section, our aim is to formulate two classes of stochastic optimal feed-
back control problems based on the dynamical system (12.3.1), the observa-
tion dynamics (12.3.2) and the proposed control dynamics given by (12.3.3)
(which is driven by the measurement process y).
To begin, let us assume that the vector κ is to be chosen from the set K
defined by
\[
K = \left\{\kappa = [\kappa_1, \ldots, \kappa_{2rk}]^\top \in \mathbb{R}^{2rk} : \tilde\beta \le \kappa \le \bar\beta\right\}
 = \left\{\kappa = [\kappa_1, \ldots, \kappa_{2rk}]^\top \in \mathbb{R}^{2rk} : \tilde\beta_i \le \kappa_i \le \bar\beta_i,\ i = 1, \ldots, 2rk\right\}, \tag{12.3.22}
\]
\[
\Psi(t \mid \kappa) = \begin{bmatrix} \Psi_{11}(t \mid \kappa) & \Psi_{12}(t \mid \kappa) \\ \Psi_{21}(t \mid \kappa) & \Psi_{22}(t \mid \kappa) \end{bmatrix}, \tag{12.3.23}
\]
and
\[
\nu^\top \Psi_{22}(t \mid \kappa)\,\nu = E\left\{\left[\nu^\top \left(y(t \mid \kappa) - \bar y(t \mid \kappa)\right)\right]^2\right\}, \quad \nu \in \mathbb{R}^k,
\]
while Ψ₁₂(t | κ) and Ψ₂₁(t | κ) are cross covariances.
With these preparations, the first problem may be stated formally as fol-
lows.
Subject to the dynamic system (12.3.1), the initial condition (12.3.7), the
observation channel (12.3.2) with the initial condition (12.3.8) and the control
system given by (12.3.3), find a constant vector κ ∈ K such that the cost
functional
\[
g_0(\kappa) = E\left\{\int_0^T \mathrm{Tr}\left[(x(t \mid \kappa) - \bar x(t \mid \kappa))\,(x(t \mid \kappa) - \bar x(t \mid \kappa))^\top\right] dt\right\} \tag{12.3.24}
\]
is minimized over K.
For convenience, let this (stochastic optimal feedback) control problem be
referred to as Problem (SP2a ). Note that Problem (SP2a ) aims to find an
optimal vector (and hence feedback matrix) κ ∈ K such that the resulting
system (12.3.6) with the initial condition (12.3.9) is least noisy.
In our second problem, our aim is to find a constant vector (and hence
constant feedback matrix) κ ∈ K such that the mean behaviour of the cor-
responding dynamical system is closest to a given deterministic trajectory,
while the uncertainty of the corresponding dynamical system is within a given
acceptable limit. Let the given deterministic trajectory be denoted by x̂(t).
Then, the corresponding problem, which is identified as Problem (SP2b ), may
be stated formally as follows.
Given the system (12.3.1) with the initial condition (12.3.7), the obser-
vation channel (12.3.2) with the initial condition (12.3.8) and the proposed
control dynamics of the form (12.3.3), find a constant vector κ ∈ K such that
the cost functional
\[
g_0(\kappa) = \int_0^T \left\|\bar x(t \mid \kappa) - \hat x(t)\right\|^2 dt \tag{12.3.25}
\]
The stochastic optimal feedback control problems as stated above are difficult
to solve. However, by virtue of the structure of the dynamical system, the
observation channel and the form of the control law, we can show that these
problems are, in fact, equivalent to certain deterministic optimal parameter
selection problems.
\[
M = \begin{bmatrix} I_{n \times n} & 0 \\ 0 & 0 \end{bmatrix}.
\]
We now turn our attention to Problem (SP2b ) and define the following
deterministic optimal parameter selection problem, to be denoted as Problem
(DP2b ).
Given the system (12.3.21), find a constant feedback vector κ ∈ K such
that the cost functional (12.3.25) is minimized subject to κ ∈ K and the
constraint
\[
\int_0^T \mathrm{Tr}\left\{M\,\Psi(t \mid \kappa)\right\} dt \le \varepsilon, \tag{12.3.28}
\]
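In computational terms, Problem (DP2b) reduces to a standard constrained NLP over κ. The sketch below shows this reduction schematically; `moment_rhs`, `tracking_cost` and `uncertainty` are assumed user-supplied callables (the moment dynamics (12.3.21), the integrand of (12.3.25) and the left-hand side of (12.3.28), respectively).

```python
# Schematic reduction of Problem (DP2b) to an NLP over the feedback vector kappa.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def objective(kappa, z0, T, moment_rhs, tracking_cost):
    sol = solve_ivp(moment_rhs, (0.0, T), z0, args=(kappa,), dense_output=True)
    ts = np.linspace(0.0, T, 201)
    return np.trapz([tracking_cost(t, sol.sol(t)) for t in ts], ts)  # (12.3.25)

def solve_dp2b(kappa0, bounds, eps, z0, T, moment_rhs, tracking_cost, uncertainty):
    cons = {"type": "ineq",
            "fun": lambda kappa: eps - uncertainty(kappa, z0, T)}    # (12.3.28)
    return minimize(objective, kappa0, args=(z0, T, moment_rhs, tracking_cost),
                    bounds=bounds, constraints=cons)
```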
Remark 12.3.2 Note that our formulation also holds for the case of time-
varying control matrices K = K(t), K̂ = K̂(t), t > 0. In this case, Problems
(DP2a ) and (DP2b ) corresponding to Problems (SP2a ) and (SP2b ), as de-
scribed above, are to be considered as deterministic optimal control problems
with controls K(t) and K̂(t) rather than as deterministic optimal parameter
selection problems with constant matrices K and K̂.
12.3.3 An Example
where u(t) denotes the continuous maintenance rate; w(t) denotes the stan-
dard Brownian motion with mean 0 and covariance given by
and k1 , k2 and k3 are the given constants. These constants represent, re-
spectively, the natural degradation rate of the machine, the propensity for
random fluctuations in the condition of the machine and the extent to which
the production is being influenced by the state of the machine. It is assumed
that the continuous maintenance rate is subject to the following boundedness
constraints:
x(0) = x0 + δ0 , (12.3.32)
y(0) = 0 (12.3.33)
τN +1 ≥ tmin . (12.3.35)
We consider the situation where the time required for each overhaul is
negligible when compared with the length of the time horizon. Thus, the
state of the machine improves instantaneously at each overhaul time. On the
other hand, the output level stays the same. This phenomenon is modeled by
the following jump conditions:
\[
x(\tau_i^+) = k_5\, x(\tau_i^-) + \delta_i, \quad i = 1, \ldots, N, \tag{12.3.36}
\]
\[
y(\tau_i^+) = y(\tau_i^-), \quad i = 1, \ldots, N, \tag{12.3.37}
\]
where xmin is the minimum acceptable level of the state of the machine and
p1 is a given probability level. Second, the accumulated output level over
the entire time horizon is required to be greater than or equal to a specified
\[
\frac{\partial \Phi(t, s)}{\partial t} = \begin{bmatrix} u(t) - k_1 & 0 \\ k_3 & 0 \end{bmatrix} \Phi(t, s), \quad t > s, \tag{12.3.45}
\]
\[
\Phi(s, s) = I, \tag{12.3.46}
\]
where
\[
\Phi(t, s) = \begin{bmatrix} \phi_{11}(t, s) & \phi_{12}(t, s) \\ \phi_{21}(t, s) & \phi_{22}(t, s) \end{bmatrix}. \tag{12.3.47}
\]
Then, it is known that for each i = 1, . . . , N +1, the solution of the stochas-
tic impulsive system (12.3.29) and (12.3.30) on [τi−1 , τi ) can be expressed as
follows:
\[
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \Phi(t, \tau_{i-1}) \begin{bmatrix} x(\tau_{i-1}^+) \\ y(\tau_{i-1}^+) \end{bmatrix}
 + \int_{\tau_{i-1}}^t \Phi(t, s) \begin{bmatrix} k_2 \\ 0 \end{bmatrix} dw(s). \tag{12.3.48}
\]
\[
\mu_y(t) = \phi_{21}(t, \tau_{i-1})\,\mu_x(\tau_{i-1}^+) + \phi_{22}(t, \tau_{i-1})\,\mu_y(\tau_{i-1}^+). \tag{12.3.52}
\]
By differentiating (12.3.51) and (12.3.52) with respect to t, we obtain
\[
\frac{d\mu_x(t)}{dt} = (u(t) - k_1)\left[\phi_{11}(t, \tau_{i-1})\,\mu_x(\tau_{i-1}^+) + \phi_{12}(t, \tau_{i-1})\,\mu_y(\tau_{i-1}^+)\right]
 = (u(t) - k_1)\,\mu_x(t) \tag{12.3.53}
\]
and
\[
\frac{d\mu_y(t)}{dt} = k_3\left[\phi_{11}(t, \tau_{i-1})\,\mu_x(\tau_{i-1}^+) + \phi_{12}(t, \tau_{i-1})\,\mu_y(\tau_{i-1}^+)\right]
 = k_3\,\mu_x(t). \tag{12.3.54}
\]
and
\[
\sigma_{yy}(t) = \phi_{21}^2(t, \tau_{i-1})\,\sigma_{xx}(\tau_{i-1}^+) + \phi_{22}^2(t, \tau_{i-1})\,\sigma_{yy}(\tau_{i-1}^+)
 + 2\phi_{21}(t, \tau_{i-1})\,\phi_{22}(t, \tau_{i-1})\,\sigma_{xy}(\tau_{i-1}^+) + k_2^2 \int_{\tau_{i-1}}^t \phi_{21}^2(t, s)\, ds. \tag{12.3.56}
\]
Differentiating with respect to t,
\[
\frac{d\sigma_{xx}(t)}{dt} = 2(u(t) - k_1)\left[\phi_{11}^2(t, \tau_{i-1})\,\sigma_{xx}(\tau_{i-1}^+) + \phi_{12}^2(t, \tau_{i-1})\,\sigma_{yy}(\tau_{i-1}^+)\right]
 + 4(u(t) - k_1)\,\phi_{11}(t, \tau_{i-1})\,\phi_{12}(t, \tau_{i-1})\,\sigma_{xy}(\tau_{i-1}^+)
 + 2(u(t) - k_1)\, k_2^2 \int_{\tau_{i-1}}^t \phi_{11}^2(t, s)\, ds + k_2^2,
\]
which simplifies to dσ_{xx}(t)/dt = 2(u(t) − k₁)σ_{xx}(t) + k₂².
Since the state equations (12.3.29) and (12.3.30) and the jump
conditions (12.3.36) and (12.3.37) are linear, x(t) and y(t) are mixtures of
Gaussian random variables.
\[
\frac{dt(s)}{ds} = \tilde v(s) = \sum_{i=1}^{N+1} v_i\,\chi_{[i-1,i)}(s), \tag{12.3.69}
\]
\[
t(0) = 0. \tag{12.3.70}
\]
This equation shows the relationship between the variable jump points
t = τᵢ, i = 1, . . . , N + 1, and the fixed jump points s = i, i = 1, . . . , N + 1.
Define ũ(s) = u(t(s)). Recall that the continuous maintenance rate is
constant between consecutive overhauls. Thus, the admissible controls are
restricted to piecewise constant functions that assume constant values between
consecutive jump times. As a result, ũ(s) can be expressed as
\[
\tilde u(s) = \sum_{i=1}^{N+1} h_i\,\chi_{[i-1,i)}(s), \tag{12.3.73}
\]
\[
0 \le h_i \le a k_1, \quad i = 1, \ldots, N+1. \tag{12.3.74}
\]
Let
\[
\tilde\mu_x(s) = \mu_x(t(s)), \quad \tilde\mu_y(s) = \mu_y(t(s)). \tag{12.3.76}
\]
Then, the dynamics (12.3.53), (12.3.54) and (12.3.58)–(12.3.60) are
transformed into
\[
\frac{d\tilde\mu_x(s)}{ds} = \tilde v(s)\,(\tilde u(s) - k_1)\,\tilde\mu_x(s), \tag{12.3.78}
\]
\[
\frac{d\tilde\mu_y(s)}{ds} = k_3\,\tilde v(s)\,\tilde\mu_x(s), \tag{12.3.79}
\]
\[
\frac{d\tilde\sigma_{xx}(s)}{ds} = \tilde v(s)\left[2(\tilde u(s) - k_1)\,\tilde\sigma_{xx}(s) + k_2^2\right], \tag{12.3.80}
\]
\[
\frac{d\tilde\sigma_{yy}(s)}{ds} = 2 k_3\,\tilde v(s)\,\tilde\sigma_{xy}(s), \tag{12.3.81}
\]
\[
\frac{d\tilde\sigma_{xy}(s)}{ds} = \frac{d\tilde\sigma_{yx}(s)}{ds}
 = \tilde v(s)\left[(\tilde u(s) - k_1)\,\tilde\sigma_{xy}(s) + k_3\,\tilde\sigma_{xx}(s)\right]. \tag{12.3.82}
\]
Furthermore, the initial conditions are
\[
\tilde\mu_x(0) = x^*, \quad \tilde\mu_y(0) = 0, \tag{12.3.83}
\]
\[
\tilde\sigma_{xx}(0) = k_4, \quad \tilde\sigma_{yy}(0) = 0, \quad \tilde\sigma_{xy}(0) = \tilde\sigma_{yx}(0) = 0. \tag{12.3.84}
\]
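The time-scaling transform (12.3.69)–(12.3.70) has a very simple computational realization: the piecewise-constant speeds vᵢ are the interval lengths τᵢ − τᵢ₋₁, and the map s ↦ t(s) is piecewise linear. The sketch below illustrates this under those assumptions.

```python
# Sketch of the time-scaling map (12.3.69)-(12.3.70): fixed grid s -> variable times t.
import numpy as np

def time_map(s, v):
    """t(s) with dt/ds = v_i on [i-1, i); v holds the interval lengths tau_i - tau_{i-1}."""
    i = min(int(np.floor(s)), len(v) - 1)
    return float(np.sum(v[:i]) + v[i] * (s - i))

v = np.array([15.0, 15.0, 15.0])            # three intervals of length 15
print(time_map(1.0, v), time_map(2.5, v))    # 15.0, 37.5
```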
\[
\int_{x_{\min}}^{\infty} \left(2\pi\tilde\sigma_{yy}(N+1)\right)^{-1/2}
 \exp\left\{-\frac{\left(\eta - \tilde\mu_y(N+1)\right)^2}{2\tilde\sigma_{yy}(N+1)}\right\} d\eta \ge p_2. \tag{12.3.89}
\]
The explicit forms for the functions in the cost functional are given as
follows:
\[
L_1(x(t)) = 2.5\, x^2(t) - 20\, x(t) + 40, \qquad L_2(u(t)) = \frac{40}{k_1}\, u(t),
\]
\[
\Psi_1(x(\tau_i^-)) = 1000 - 500\, x(\tau_i^-), \qquad \Psi_2(x(\tau_{N+1})) = \frac{1}{5}\, x(\tau_{N+1}) \times 1000.
\]
Note that N = 20 is the number of overhaul times, and $10, 000 is the
original capital cost of the machine.
i    1   2   3   4   5   6   7    8    9    10   11   12   13   14   15   16   17   18   19   20   21
τi   15  30  45  60  75  90  105  120  135  150  165  180  195  210  225  240  255  270  285  300  400
Fig. 12.3.1: The optimal trajectories of the state variables and simulation
with 500 sample paths. (a) μx(t). (b) μy(t). (c) σxx(t). (d) σxy(t).
(e) σyy(t). (f) Simulation of x(t) with 500 sample paths. (g) Simulation
of y(t) with 500 sample paths
12.4 Exercises
12.4.1 Consider the combined optimal parameter selection and optimal control
problem (SP₁) with the control taking the form as given below:
u = Kx,
where K is an n × n matrix to be determined such that the cost func-
tional (12.2.5) is minimized. Obtain the corresponding deterministic optimal
parameter selection problem.
12.4.2 Consider the combined optimal parameter selection and optimal control problem (SP₁).
However, the probabilistic state constraints (12.2.4) are replaced by the fol-
lowing constraints
α_i ≤ E{ξ_i(t) − ξ̄_i(t)} ≤ β_i, for all t ∈ [0, T], i = 1, . . . , N,
where α_i, β_i, i = 1, . . . , N, are given real constants and ξ̄_i(t), i = 1, . . . , N,
are the specified desired state trajectories. Obtain the corresponding determin-
istic optimal parameter and optimal control problem.
12.4.3 Consider the example of Section 12.2.2. Let the probabilistic con-
straint (12.2.23) be replaced by the following constraint
α ≤ E{ξ(t) − ξ̄(t)} ≤ β, for all t ∈ [0, T],
where α and β are given real constants and ξ̄(t) is the specified desired
state trajectory. Obtain the corresponding deterministic optimal control prob-
lem.
12.4.4 Consider Problem (SP2b ) but with the constraint (12.2.26) being re-
placed by appropriate probabilistic constraints of the form given by (12.2.4).
Derive the corresponding deterministic optimal control problem.
12.4.5 Consider the example of Section 12.3.3. Let the probabilistic con-
straints (12.3.38) and (12.3.39) be replaced by the constraints of the form
given by (12.3.25). Derive the corresponding deterministic optimal control
problem.
12.4.6 Consider the example of Section 12.3.3. Suppose that the control is
a piecewise constant function between every pair of overhaul times, where
the heights and switching times of the piecewise constant control are decision
variables. Derive the corresponding deterministic optimal control problem.
Appendix A.1
Elements of Mathematical Analysis
A.1.1 Introduction
In this section, some results in measure theory and functional analysis are
presented without proofs. The main references are [3, 4, 40, 51, 90, 91, 198,
206, 216, 250, 253].
A.1.2 Sequences
Similarly, the limit inferior of the sequence xn , denoted by lim inf n→∞ xn
or limn→∞ xn , is defined by
Note that a sequence {xn } can only have one limit superior (respectively,
limit inferior). A sequence {xn } in R ∪ {±∞} has the limit A, denoted by
lim xn = A, if and only if
n→∞
lim xn = lim xn = A.
n→∞ n→∞
whenever
\[
x_i \in A, \quad \lambda_i \ge 0, \quad \text{and} \quad \sum_{i=1}^{n} \lambda_i = 1.
\]
Note that the intersection of any number of convex sets is a convex set.
However, the union of two convex sets is, in general, not a convex set.
Let x1 , . . . , xm be m vectors in a vector space X. A linear combination of
these m vectors is defined by
α 1 x 1 + · · · + αm x m ,
αi = 0, for all i = 1, . . . , m.
The set endowed with the topology T is called a topological space and is
written as (X, T ). Members of T are called open sets. A set B ⊂ X is said
to be closed if its complement X \ B is open.
Let T1 and T2 be two topologies on X. T1 is said to be stronger than T2
(or T2 weaker than T1 ) if T1 ⊃ T2 .
Let (X, T ) be a topological space, and let A be a non-empty subset of
X. The family TA = {A ∩ B : B ∈ T } is a topology on A and is called the
relative topology on A induced by the topology T on X.
Let (X, T) be a topological space, A ⊂ X, and C ≡ {Gᵢ} a subfamily of
T such that A ⊂ ∪Gᵢ. Then, C is called an open covering of A. If every open
covering of X has a finite subfamily {G₁, . . . , G_n} ⊂ C such that X = ∪ᵢ₌₁ⁿ Gᵢ,
then the topological space (X, T) is called a compact space. A subset A of
a topological space is said to be compact if it is compact as a subset of X.
Equivalently, A is called compact if every open covering of A contains a finite
subfamily that covers A.
A family of closed sets is said to possess the finite intersection property if
the intersection of any finite number of sets in the family is non-empty.
A topological space is compact if and only if any family of closed sets with
the finite intersection property has non-empty intersection.
A point x ∈ X is said to be an interior point of a set A ⊂ X if there
exists an open set G in X such that x ∈ G ⊂ A. The interior Å of A is
the set which consists of all the interior points of the set A. A neighbourhood
of a point x ∈ X is a set V ⊂ X such that x is an interior point of V . A
point x ∈ X is said to be an accumulation point of a set A ⊂ X if every
neighbourhood of x ∈ X contains points of A other than x. If A ⊂ X is a
closed set, then it contains all its accumulation points. The union of a set B
and its accumulation points is called the closure of B and is written as B.
A set A ⊂ X is said to be dense in a set E ⊂ X if Ā ⊃ E. A set A is said
to be nowhere dense if the interior of its closure is empty. If X contains a
countable subset that is dense in X, then X is called separable. The boundary
∂A of a set A is the set of all points that are accumulation points of both A
and X \ A. Thus, ∂A = Ā ∩ (X \ A)̄.
A family B of subsets of X is a base for a topology T on X if B is a
subfamily of T and, for each x ∈ X and each neighbourhood U of x, there
is a member V of B such that x ∈ V ⊂ U . A family F of subsets of X is a
subbase for a topology T on X if the family of finite intersections of members
of F is a base for T .
A sequence {xn } ⊂ X is said to converge to a point x ∈ X, denoted
by xn → x, if each neighbourhood of x contains all but a finite number of
elements of the sequence.
(A ∪ B)̄ = Ā ∪ B̄
and
(A ∩ B)̄ ⊂ Ā ∩ B̄.
If Ā = A, then the set A is said to be closed. The closure B̄ of any set B is a
closed set. The whole set X and the empty set ∅ are closed sets. The union of
any two closed sets is a closed set. Although the intersection of any collection
(countable or uncountable) of closed sets is closed, the union of a countable
collection of closed sets need not be closed. Similarly, the intersection of a
countable collection of open sets need not be open.
Let A be a subset of X. The complement à of A is defined by
à = {x ∈ X : x ∉ A}.
We can define different metrics on the same vector space X. Let ρ and
σ be two different functions from X × X into R such that the properties
(M1)–(M3) are satisfied. Then, (X, ρ) and (X, σ) are two different metric
spaces.
In a metric space (X, ρ), a sequence {xn }∞n=1 ⊂ X is called a Cauchy
sequence if
ρ(xn+p , xn ) → 0 as n → ∞
for any integer p ≥ 1. A metric space (X, ρ) is said to be complete if every
Cauchy sequence has a limiting point in X.
Let (X, ρ) be a metric space. For any two distinct points x1 , x2 in X, there
exist two real numbers δ1 > 0 and δ2 > 0 such that
This implies that the metric space (X, ρ) is a Hausdorff space. Therefore, a
convergent sequence can have only one limiting point.
Let (X, ρ) be a metric space, and let A ⊂ X. If, for any sequence {x_n}_{n=1}^∞
in A, there exist a subsequence {x_{n(ℓ)}}_{ℓ=1}^∞ and a point x ∈ X such that
ρ(x_{n(ℓ)}, x) → 0 as ℓ → ∞,
Let (X, ρ) and (Y, σ) be two metric spaces, and let f be a function from X
into Y . The function f is said to be continuous at x0 ∈ X if, for every ε > 0
there exists a δ ≡ δ(ε, x0 ) > 0, such that
where |y| denotes the absolute value of y. The concept of uniform continuity
for this special case is to be understood similarly.
Let (X, ρ), (Y, σ), and (Z, η) be three metric spaces. Let f be a continuous
function from X into Y , and g a continuous function from Y into Z. Then,
the composite function g ◦ f : X → Z is also continuous, where
B = {x ∈ I : f (x) = α}
f (A) = { f (x) : x ∈ A }
is compact.
Let X = (X, ρ) be a metric space. A real-valued function f defined on X
is said to be lower semicontinuous at x0 ∈ X if for every real number α such
that f (x0 ) > α, there is a neighbourhood V of x0 such that f (x) > α for all
x ∈ V . Upper semicontinuity is defined by reversing the inequalities. We say
that f is lower semicontinuous if it is lower semicontinuous at every x ∈ X.
Let f be an upper (respectively, lower) semicontinuous real-valued function
on a compact space X. Then, f is bounded from above (respectively, below)
and assumes its maximum (respectively, minimum) in X.
Let X be a vector space and let ∥·∥ be a function from X into [0, ∞) such
that the following properties are satisfied:
where f ≡ [f₁, . . . , f_n]ᵀ and |f(t)| = (∑ᵢ₌₁ⁿ (fᵢ(t))²)^{1/2}.
A set A ⊂ C(I, R ) is said to be equicontinuous if, for any ε > 0, there
n
\[
\sum_{k=1}^{m} |t_k - t'_k| < \delta.
\]
\[
f(z) - f(y) \ge \frac{\partial f(y)}{\partial x}\,(z - y), \tag{A.1.3}
\]
where
\[
\frac{\partial f(y)}{\partial x} \equiv \left.\left[\partial f(x)/\partial x_1, \ldots, \partial f(x)/\partial x_n\right]\right|_{x=y} \tag{A.1.4}
\]
is called the gradient (vector) of f at x = y.
For strict convexity of the function f, we only need to replace the inequalities
in the conditions (A.1.2) and (A.1.3) by strict inequalities.
If f : Rⁿ → R is twice continuously differentiable, the Hessian matrix of
the function f at x⁰ is the real n × n matrix with ij-th element defined by
\[
H(x^0)_{ij} = \left.\frac{\partial^2 f(x)}{\partial x_i \partial x_j}\right|_{x=x^0}. \tag{A.1.5}
\]
Let X ≡ (X, ·X ) and Y ≡ (Y, ·Y ) be normed spaces and let f : X → Y
be a linear mapping. If there exists a constant M such that
x∗ , x ≡ x∗ (x) at x ∈ X.
For a fixed x ∈ X, it is clear that the bilinear form also defines a continuous
linear functional on X ∗ , and we can write it as:
x∗ (x) ≡ Jx (x∗ ).
Let X be a set, and let A and B be two subsets of X. The symmetric difference
of these two sets is defined by
A Δ B = (A \ B) ∪ (B \ A),
where
A \ B = {x ∈ X : x ∈ A and x ∉ B} (A.1.8)
and B \ A is defined similarly.
A class D of subsets of X is called a ring if it is closed under finite unions
and set differences. If D is closed with respect to complements and also con-
tains X, it is called an algebra. If an algebra is closed under countable unions,
it is called a σ-algebra.
Let X be a set, D a σ-algebra, and μ̄ a function from D to R⁺ ∪ {+∞},
where R⁺ ≡ {x ∈ R : x ≥ 0}, such that the following two conditions are
satisfied:
1. μ̄(∅) = 0;
2. If S₁, S₂, . . . ∈ D is a sequence of disjoint sets, then
\[
\bar\mu\left(\bigcup_{i=1}^{\infty} S_i\right) = \sum_{i=1}^{\infty} \bar\mu(S_i). \tag{A.1.9}
\]
Define
\[
\mu^*(E) = \inf \sum_{i=1}^{\infty} \bar\mu(A_i), \tag{A.1.10}
\]
where the infimum is taken with respect to all possible sequences {Aᵢ} from
D such that E ⊂ ∪ᵢ₌₁^∞ Aᵢ.
A set E is called μ∗-measurable if, for every A ⊂ X,
μ∗(A) = μ∗(A ∩ E) + μ∗(A ∩ (X \ E)).
\[
f(x) = \sum_{i=1}^{n} a_i\,\chi_{A_i}(x), \tag{A.1.12}
\]
where
\[
\chi_{A_i}(x) = \begin{cases} 1, & \text{if } x \in A_i, \\ 0, & \text{otherwise}. \end{cases} \tag{A.1.13}
\]
where the supremum is taken with respect to all non-negative simple functions
ψ with 0 ≤ ψ(x) ≤ f (x) for all x ∈ I. We say f is integrable if
∫_I f(t) dt < ∞.
fn (t) → f (t)
for each t ∈ I \A, where μ denotes the Lebesgue measure. In this situation,
the function f is automatically a measurable function from I to R.
Theorem A.1.10 (The Lebesgue Dominated Convergence Theorem).
Let {f_n}_{n=1}^∞ be a sequence of measurable functions on I. If f_n → f a.e. and
there exists an integrable function g on I such that |f_n(t)| ≤ g(t) for almost
every t ∈ I, then
∫_I f(t) dt = lim_{n→∞} ∫_I f_n(t) dt.
Theorem A.1.11 (Luzin’s Theorem). Let I ⊂ R be such that μ(I) < ∞,
and let f be a measurable function from I to Rn . Then, for any ε > 0,
there exists a closed set Iε ⊂ I such that μ(I \Iε ) < ε and the function f is
continuous on Iε .
where f ≡ [f₁, . . . , f_n]ᵀ and |f(t)| = (∑ᵢ₌₁ⁿ (fᵢ(t))²)^{1/2}. If we do not
distinguish between equivalent functions in L_p(I, Rⁿ), for 1 ≤ p < ∞, then
it is a Banach space. A sequence {f^{(n)}} ⊂ L_p(I, Rⁿ) is said to converge
to a function f ∈ L_p(I, Rⁿ) if ∥f^{(n)} − f∥_p → 0 as n → ∞.
and
(b) (Minkowski's inequality) for f and g ∈ L_p(I, Rⁿ),
\[
\left[\int_I |f(t) + g(t)|^p\, dt\right]^{1/p} \le \left[\int_I |f(t)|^p\, dt\right]^{1/p} + \left[\int_I |g(t)|^p\, dt\right]^{1/p}. \tag{A.1.24}
\]
f −1 (V ) = {t ∈ I : f (t) ∈ V },
for every g ∈ L_q(I, Rⁿ), where 1/p + 1/q = 1 and the superscript ᵀ denotes
the transpose. The function f̂ is called the weak limit of the sequence {f^{(n)}}.
A set F ⊂ Lp (I, Rn ), for 1 ≤ p < ∞, is said to be weakly closed if the
limit of every weakly convergent sequence {f (n) } ⊂ F is in F .
A set F ⊂ Lp (I, Rn ), for 1 ≤ p < ∞, is said to be conditionally weakly
sequentially compact if every sequence {f (n) } ⊂ F contains a subsequence
that converges weakly to a function fˆ ∈ Lp (I, Rn ). The set F is said to be
weakly sequentially compact if it is conditionally weakly sequentially compact
and weakly closed.
Let F be a set in L∞(I, Rⁿ). A sequence {f^{(n)}} ⊂ F is said to converge
to a function f̂ ∈ L∞(I, Rⁿ) in the weak∗ topology of L∞(I, Rⁿ) (written
as f^{(n)} →^{w∗} f̂) if
\[
\lim_{n\to\infty} \int_I [f^{(n)}(t)]^\top g(t)\, dt = \int_I [\hat f(t)]^\top g(t)\, dt
\]
for every g ∈ L₁(I, Rⁿ). The function f̂ is called the weak∗ limit of the
sequence {f^{(n)}}.
A set F ⊂ L∞ (I, Rn ) is said to be weak ∗ closed if the limit of every
weak∗ convergent sequence {f (n) } ⊂ F is in F .
A set F ⊂ L∞(I, Rⁿ) is said to be conditionally sequentially compact
in the weak∗ topology of L∞(I, Rⁿ) if every sequence {f^{(n)}} ⊂ F contains
a subsequence that converges to a function f̂ ∈ L∞(I, Rⁿ) in the weak∗
topology. The set F is called sequentially compact in the weak∗ topology of
L∞(I, Rⁿ) if it is conditionally sequentially compact in the weak∗ topology
of L∞(I, Rⁿ) and weak∗ closed.
U = {u ∈ L₂([0, T], Rʳ) : ∥u∥ ≤ ρ}
is weakly sequentially compact, where ∥·∥ denotes the norm in L₂([0, T], Rʳ)
and ρ is a given positive constant.
Theorem A.1.17. Let I be an interval in R such that μ(I) < ∞, and let
f ∈ L_p(I, Rⁿ) for all p ∈ [1, ∞). If there exists a constant K such that
∥f∥_p ≤ K for all such p, then f ∈ L∞(I, Rⁿ) and ∥f∥_∞ ≤ K.
where α is a continuous and bounded function on [0, ∞) such that α(t) ≥ 0 for
all t ∈ [0, ∞). If K ∈ L₁^{loc} is such that K(t) ≥ 0 a.e. on [0, ∞), and f(t) ≥ 0
for all t ∈ [0, ∞), then
\[
f(t) \le \alpha(t) + \int_0^t \exp\left(\int_s^t K(\tau)\, d\tau\right) K(s)\,\alpha(s)\, ds
\]
whenever
x ∈ Iδ ≡ {x ∈ I : |x − x̄| < δ} .
for all x ∈ Iδ (x̄), where δ = min {δ1 , δ3 }. This completes the proof.
for all k > N , where F ε (x̂) is the closed ε-neighbourhood of F (x̂) defined
by
F ε (x̂) = {v ∈ U : ρ (v, F (x̂)) ≤ ε} .
Since u (xk ) ∈ F (xk ) for all k ≥ 1, it is clear from (A.1.40) that u (xk ) ∈
F ε (x̂) for all k > N . Thus, by the facts that u(xk ) → û as k → ∞, and
F ε (x̂) is closed, we have
û ∈ F ε (x̂) .
Since this relation is true for all ε > 0, and F (x̂) is closed, it follows that
û ∈ F (x̂). Thus, the proof is complete.
A = {x ∈ I : u (x) ≤ α} (A.1.47)
In particular,
|u (xk )| ≤ K (A.1.52)
for all positive integers k. Thus, there exists a subsequence of the sequence
{xk }, again denoted by the original sequence, such that
û ≤ u (x̂) − ε. (A.1.54)
r̂ ≤ r (x̂) . (A.1.56)
We note that
Therefore,
r (x̂) = g (x̂, û) . (A.1.61)
However, by (A.1.54), we see that u(x̂) is not the smallest value of u such
that
r (x̂) = g (x̂, u) .
This contradicts the definition of u (x). Thus, the set A defined by (A.1.47)
must be closed, and hence the function u is lower semicontinuous on I. The
proof is complete.
By a partition of the interval [a, b], we mean a finite set of points tᵢ ∈ [a, b],
i = 0, 1, . . . , m, such that a = t₀ < t₁ < · · · < t_m = b. A function f is said
to be of bounded variation on [a, b] if there exists a constant K such that, for
every partition of [a, b],
\[
\sum_{i=1}^{m} |f(t_i) - f(t_{i-1})| \le K.
\]
The total variation of f, denoted by ⋁ₐᵇ f(t), is defined by
\[
\bigvee_a^b f(t) = \sup \sum_{i=1}^{m} |f(t_i) - f(t_{i-1})|,
\]
where the supremum is taken with respect to all partitions of [a, b]. The total
variation of a constant function is zero, and the total variation of a monotonic
function is the absolute value of the difference between the function values
at the end points a and b.
The space BV[a, b] is defined as the space of all functions of bounded
variation on [a, b], equipped with the norm
\[
\|f\| = |f(a)| + \bigvee_a^b f(t).
\]
If f is monotone, then f ∈ BV[a, b] and ⋁ₐᵇ f(t) = |f(b) − f(a)|.
If f ∈ BV[a, b], then the jump of f at t is defined as
\[
\begin{cases} |f(t) - f(t-0)| + |f(t+0) - f(t)|, & \text{if } a < t < b, \\ |f(a+0) - f(a)|, & \text{if } t = a, \\ |f(b) - f(b-0)|, & \text{if } t = b. \end{cases}
\]
\[
\bigvee_a^b f(t) = \sum_{i=1}^{n} \bigvee_a^b f_i(t).
\]
Let BV([a, b], Rⁿ) be the space of all functions f : [a, b] → Rⁿ that are
of bounded variation on [a, b].
|f (τ ) − f (t)| < ε
where x ∈ X, while μ and ρ are parameters, which are such that ρ > 0 and
0 ≤ μ < ρ/M 2 . To proceed further, we need some definitions.
Property A.2.3. Let x^{1,∗} be an isolated minimizer of f(x) in X, and let X₁∗
be the basin as defined in Definition A.2.1. If f(x) has a basin X₂∗ at x^{2,∗}
that is lower than X₁∗ at x^{1,∗}, then there is a point x′ ∈ X₂∗ that minimizes
p(x, x^{1,∗}, ρ, μ) on the line through x and x^{1,∗}, for every x in some
neighbourhood of x^{2,∗}.
\[
p(x, x^*, \rho, \mu) - p(x^*, x^*, \rho, \mu) = \mu\left[f(x) - f(x^*)\right] - \rho\, |x - x^*|^2 < 0. \tag{A.2.5}
\]
and
\[
f(x^*) \le f(x^1) \le f(x^2). \tag{A.2.7}
\]
If ρ > 0 and 0 ≤ μ < min{ρ/M², ρ/(M M₁)}, where
\[
M_1 \ge \max_{0 \le \alpha \le 1} \frac{\left\|\partial f(x^1 + \alpha(x^2 - x^1))/\partial x\right\|\, |x^2 - x^1|}{|x^2 - x^{1,*}|^2 - |x^1 - x^{1,*}|^2}, \tag{A.2.8}
\]
then
\[
p(x^2, x^*, \rho, \mu) < p(x^1, x^*, \rho, \mu) < 0 = p(x^*, x^*, \rho, \mu). \tag{A.2.9}
\]
Combining (A.2.10) and (A.2.11) and then using the mean value theorem, we
obtain
\[
\begin{aligned}
p(x^2, x^*, \rho, \mu) - p(x^1, x^*, \rho, \mu)
&\le \left(|x^2 - x^*|^2 - |x^1 - x^*|^2\right)\left[-\rho + \mu M^2\, \frac{f(x^2) - f(x^1)}{|x^2 - x^*|^2 - |x^1 - x^*|^2}\right]\\
&\le \left(|x^2 - x^*|^2 - |x^1 - x^*|^2\right)\left[-\rho + \mu M\, \left\|\nabla f(x^1 + \alpha(x^2 - x^1))\right\| \frac{|x^2 - x^1|}{|x^2 - x^*|^2 - |x^1 - x^*|^2}\right],
\end{aligned} \tag{A.2.12}
\]
Our next task is to show the validity of Property A.2.2. For this, we need
the following lemma.
Then, there exists a sufficiently small ε₁ > 0 such that, whenever d₁ is chosen
satisfying 0 < |d₁| ≤ ε₁, it holds that
\[
|x^1 - d_1 - x^*| < |x^1 - x^*| < |x^1 + d_1 - x^*|, \tag{A.2.15}
\]
\[
f(x^1 + d_1) \ge f(x^*), \tag{A.2.16}
\]
and
\[
p(x^1 + d_1, x^*, \rho, \mu) < p(x^1, x^*, \rho, \mu) < p(x^1 - d_1, x^*, \rho, \mu) < 0 = p(x^*, x^*, \rho, \mu). \tag{A.2.17}
\]
Choose
\[
d_1 = \frac{\varepsilon_1}{2}\, \frac{x^1 - x^*}{|x^1 - x^*|}. \tag{A.2.18}
\]
Then,
\[
0 < |d_1| = \frac{\varepsilon_1}{2} \le \varepsilon_1. \tag{A.2.19}
\]
Clearly, if ε1 > 0 is sufficiently small, we have
\[
|x^1 + d_1 - x^*| = \left(1 + \frac{\varepsilon_1}{2|x^1 - x^*|}\right)|x^1 - x^*| > |x^1 - x^*|, \tag{A.2.20a}
\]
\[
|x^1 - d_1 - x^*| = \left(1 - \frac{\varepsilon_1}{2|x^1 - x^*|}\right)|x^1 - x^*| < |x^1 - x^*|. \tag{A.2.20b}
\]
Since f(x¹) > f(x∗) and 0 < |d₁| ≤ ε₁, it follows that
\[
f(x^1 \pm d_1) \ge f(x^*), \tag{A.2.21}
\]
if ε₁ > 0 is chosen sufficiently small. Now, choose ρ and μ such that ρ > 0
and 0 ≤ μ < min{ρ/M², ρ/(M M₁)}. Then, by using arguments similar to those
given for Theorem A.2.2, we can show that
\[
p(x^1 + d_1, x^*, \rho, \mu) < p(x^1, x^*, \rho, \mu) < p(x^1 - d_1, x^*, \rho, \mu) < 0 = p(x^*, x^*, \rho, \mu). \tag{A.2.22}
\]
Proof. It suffices to show that, for any x with f(x) > f(x∗),
\[
\frac{\partial p(x, x^*, \rho, \mu)}{\partial x} \ne 0. \tag{A.2.24}
\]
where β > 0 is sufficiently small. Then, taking the inner product of d and
∂p(x, x∗, ρ, μ)/∂x, we obtain
\[
d^\top \frac{\partial p(x, x^*, \rho, \mu)}{\partial x}
= -2\rho\, |x - x^*|\, \left\|\frac{\partial f(x)}{\partial x}\right\|
 + 2\rho\beta\, (x - x^*)^\top \frac{\partial f(x)}{\partial x}
 + 2\mu\left[f(x) - f(x^*)\right] \left[\frac{\partial f(x)}{\partial x}\right]^\top \frac{x - x^*}{|x - x^*|}
 - 2\mu\beta\left[f(x) - f(x^*)\right] \left\|\frac{\partial f(x)}{\partial x}\right\|. \tag{A.2.28}
\]
If (x − x∗)ᵀ ∂f(x)/∂x ≤ 0, then dᵀ ∂p(x, x∗, ρ, μ)/∂x < 0. Otherwise,
choose μ ≥ 0 to be sufficiently small. Since β > 0 can also be chosen to be
sufficiently small, it follows that dᵀ ∂p(x, x∗, ρ, μ)/∂x < 0. Thus,
∂p(x, x∗, ρ, μ)/∂x ≠ 0. This completes the proof.
The next theorem shows that the filled function p satisfies Property A.2.3.
and
\[
D_1 = \max_{x \in N(x^{2,*}, \delta)} |x - x^{1,*}|^2. \tag{A.2.31}
\]
Furthermore, if there exists no basin lower than B₁∗ between B₁∗ and B₂∗, where
B₁∗ and B₂∗ are the basins of f(x) at x^{1,∗} and x^{2,∗}, respectively, then there
exists an x′ ∈ B₂∗ such that
\[
f(x') \le f(x^{1,*}). \tag{A.2.32}
\]
Thus, by the continuity of $f(x)$, there are three points $x^{0,-}$, $x^0$ and $x^{0,+}$ on the line segment connecting $x^{1,*}$ and $x^2$ such that
$$f(x^0) = f(x^{1,*}) \tag{A.2.38}$$
and
$$f(x^B) > f(x^{0,-}) \ge f(x^0) \ge f(x^{0,+}) > f(x^2), \tag{A.2.39}$$
where
$$x^{0,-} = x^0 - \eta\,(x^0 - x^{1,*}) \tag{A.2.40a}$$
and
$$x^{0,+} = x^0 + \eta\,(x^0 - x^{1,*}), \tag{A.2.40b}$$
where $\eta > 0$ is sufficiently small. Since
$$p(x^0, x^{1,*}, \rho, \mu) = -\rho\,|x^0 - x^{1,*}|^2 < 0 = p(x^{1,*}, x^{1,*}, \rho, \mu). \tag{A.2.41}$$
Algorithm A.2.1
Step 1. Initialize $\rho$, $\mu$, $\hat{\mu}$ (where $\hat{\mu} < 1$) and $\mu_l$ (where $\mu_l$ is sufficiently small).
Step 2. Starting from $x^{1,*}$, construct a search direction according to (A.2.49) and select a search step. Minimize the filled function $p(x, x^{1,*}, \rho, \mu)$ along this search direction with the selected search step, obtaining a point $x$. Go to Step 3.
Step 3. If $f(x) < f(x^{1,*})$, stop. Otherwise, go to Step 5.
Step 4. Continue the search as described in Step 2 and find a new point $x$. Go to Step 3.
Step 5. If $x$ is on the boundary of $X$ and $\mu \ge \mu_l$, set $\mu = \hat{\mu}\mu$ and go to Step 1. Otherwise, go to Step 4.
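To make the flow of Algorithm A.2.1 concrete, the following minimal sketch in Python assumes the quadratic filled function suggested by (A.2.5) and (A.2.41), namely $p(x, x^*, \rho, \mu) = \mu[f(x) - f(x^*)]^2 - \rho|x - x^*|^2$. The search direction (A.2.49) is not reproduced in this appendix, so an off-the-shelf local minimizer (scipy.optimize.minimize) started from a random perturbation of $x^{1,*}$ is used in its place, and the objective f below is purely illustrative:

import numpy as np
from scipy.optimize import minimize

def f(x):
    # Illustrative multimodal objective (not from the text).
    return x[0]**2 + 10.0 * np.sin(x[0])**2 + 0.1 * x[0]

def p(x, x_star, f_star, rho, mu):
    # Filled function implied by (A.2.5): mu*[f(x)-f(x*)]^2 - rho*|x-x*|^2.
    d = x - x_star
    return mu * (f(x) - f_star)**2 - rho * np.dot(d, d)

def filled_function_method(x0, rho=1.0, mu=0.01, mu_hat=0.5, mu_l=1e-8):
    x_star = minimize(f, x0).x                 # initial local minimizer x^{1,*}
    while mu >= mu_l:
        f_star = f(x_star)
        # Step 2: minimize the filled function, started near x^{1,*}
        # (a random start replaces the direction (A.2.49) used in the text).
        y0 = x_star + np.random.uniform(-1.0, 1.0, x_star.shape)
        y = minimize(lambda z: p(z, x_star, f_star, rho, mu), y0).x
        if f(y) < f_star:                      # Step 3: a lower basin is found
            x_star = minimize(f, y).x          # restart the local search there
        else:
            mu *= mu_hat                       # Step 5 (simplified): shrink mu
    return x_star

x_best = filled_function_method(np.array([4.0]))
print(x_best, f(x_best))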
Let S denote the sample space, which is the set of all possible outcomes of an experiment. If S is finite or countably infinite, it is called a discrete sample space. On the other hand, if S is a continuous set, then it is called a continuous sample space. An element of S is called a sample point (i.e., a single outcome). A collection of possible outcomes is referred to as an event; in set theory, it is a subset of S. A ⊂ B means that if event A occurs, then event B must occur. Probability assigns a weight between 0 and 1 to each outcome of an experiment, representing the likelihood or chance of that outcome occurring. These weights are determined by long-run experimentation, assumption or some other methods.
The probability of an event A in S is the sum of the weights of all sample
points in A and is denoted by P (A). It satisfies the following properties. (1)
0 ≤ P (A) ≤ 1; (2) P (S) = 1; and (3) P (∅) = 0, where ∅ denotes the empty
event. Two events A and B are said to be mutually exclusive if A ∩ B = ∅.
If $\{A_1, \ldots, A_n\}$ is a set of mutually exclusive events, then
$$P(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_{i=1}^{n} P(A_i).$$
If an experiment is such that each outcome has the same probability, then
the outcomes are said to be equally likely.
Consider the case for which an experiment can result in any one of $N$ different equally likely outcomes, and let the event $A$ consist of exactly $n$ of these outcomes. Then, the probability of event $A$ is
$$P(A) = \frac{n}{N}.$$
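For example, in a single roll of a fair die, the event A = {an even number shows} consists of $n = 3$ of the $N = 6$ equally likely outcomes, so $P(A) = 3/6 = 1/2$.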
The conditional probability of $A$ given $B$ is defined as
$$P(A \mid B) = \frac{P(AB)}{P(B)}. \tag{A.3.1}$$
Suppose, for illustration, that a fair coin is tossed twice, so that $S = \{HH, HT, TH, TT\}$. Define A = {at least one head} and B = {first toss resulted in a tail}. Then, $B = \{TH, TT\}$, $AB = \{TH\}$, and, by (A.3.1),
$$P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{1/4}{1/2} = \frac{1}{2}.$$
P (X = x) = f (x).
Convention: Capital letters for random variables and lower case for values
of the random variable.
A random variable $X$ is discrete if its range forms a discrete (countable) set of real numbers. On the other hand, a random variable $X$ is continuous if its range forms a continuous set of real numbers. For a continuous random variable $X$, the probability of any single specified outcome occurring is 0.
The cumulative distribution $F(x)$ of a discrete random variable $X$ with probability function $f(x)$ is given by
$$F(x) = P(X \le x) = \sum_{t \le x} f(t) = \sum_{t \le x} P(X = t).$$
Clearly,
$$P(X = a) = P(a \le X \le a) = \int_{a}^{a} f(x)\,dx = 0.$$
Properties: (1) $f(x) \ge 0$ for all $x \in \mathbb{R}$; (2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$; and (3) $P(a < X < b) = \int_{a}^{b} f(x)\,dx$.
In general, since P (X = x) = 0, it holds that P (X < x) = P (X ≤ x).
The cumulative distribution $F(x)$ of a continuous random variable is
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt \;\Longrightarrow\; f(x) = \frac{dF(x)}{dx}. \tag{A.3.5}$$
1. If $X$ and $Y$ are discrete random variables, then they have a joint probability function
$$f(x, y) = P(X = x,\ Y = y)$$
with properties: (1) $f(x, y) \ge 0$ for all $x, y$; (2) $\sum_x \sum_y f(x, y) = 1$; and (3) $P\{(X, Y) \in A\} = \sum_{(x, y) \in A} f(x, y)$ for any region $A$ in the $xy$ plane.
2. If $X$ and $Y$ are continuous random variables, then they have a joint probability density function $f(x, y)$ with properties: (1) $f(x, y) \ge 0$ for all $x, y$; (2) $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$; and (3) $P\{(X, Y) \in A\} = \iint_A f(x, y)\,dx\,dy$ for any region $A$ in the $xy$ plane.
Joint Cumulative Distribution: The joint cumulative distribution is
$$F(x, y) = P(X \le x,\ Y \le y) = \int_{-\infty}^{y}\int_{-\infty}^{x} f(t, s)\,dt\,ds.$$
$$f(y \mid x) \equiv P(Y = y \mid X = x) = \frac{f(x, y)}{G(x)},$$
where $G(x)$ is the marginal distribution of $X$. $X$ and $Y$ are independent if and only if
$$f(x, y) = G(x)H(y).$$
$$
\begin{aligned}
E[g(X, Y) \pm h(X, Y)] &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \{g(x, y) \pm h(x, y)\}\, f(x, y)\,dx\,dy\\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y) f(x, y)\,dx\,dy \pm \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x, y) f(x, y)\,dx\,dy\\
&= E[g(X, Y)] \pm E[h(X, Y)].
\end{aligned}
$$
$$E(XY) = E(X)E(Y).$$
Remark A.3.1. The condition (6) is necessary but not sufficient. That is,
E(XY ) = E(X)E(Y ) does not necessarily imply that X and Y are indepen-
dent.
Clearly,
$$E(X^0) = E(1) = 1.$$
The first moment is called the mean of $X$ and is denoted by $\mu$:
$$\hat{\mu}_1 = E(X) = \text{mean of } X \equiv \mu.$$
The $k$th moment about the mean of the random variable $X$ is defined as
$$\mu_k \equiv E\big[(X - \mu)^k\big] = \begin{cases} \displaystyle\sum_x (x - \mu)^k f(x), & \text{if } X \text{ is discrete},\\[6pt] \displaystyle\int_{-\infty}^{\infty} (x - \mu)^k f(x)\,dx, & \text{if } X \text{ is continuous}. \end{cases}$$
Clearly,
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2, \quad\text{i.e.,}\quad \sigma^2 = \hat{\mu}_2 - \mu^2.$$
Bivariate or Joint Moments: Let $X$ and $Y$ be two random variables with a joint probability function $f(x, y)$. Then
$$E(XY) = \begin{cases} \displaystyle\sum_x \sum_y xy\, f(x, y), & X, Y \text{ discrete},\\[6pt] \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, f(x, y)\,dx\,dy, & X, Y \text{ continuous}. \end{cases}$$
The "joint" moment of $X$ and $Y$ about their respective means is the covariance of $X$ and $Y$ and is denoted by
$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)].$$
Let $X$ be the number of successes in $m$ independent trials, each succeeding with probability $p$; then $X$ is a binomial random variable with parameters $m$ and $p$. That is to say,
$$X \sim B(m, p).$$
$$\sum_{x=0}^{m} P(X = x) = \sum_{x=0}^{m} \binom{m}{x} p^x (1 - p)^{m-x}.$$
However,
$$(a + b)^m = \sum_{x=0}^{m} \binom{m}{x} a^x b^{m-x}.$$
Thus,
$$\sum_{x=0}^{m} P(X = x) = (p + 1 - p)^m = 1.$$
Hence,
$$P(X = x) = \binom{m}{x} p^x (1 - p)^{m-x}$$
is its probability function. The mean of $X$ is
$$E(X) = \sum_{x=0}^{m} x \binom{m}{x} p^x (1 - p)^{m-x}.$$
But
$$\binom{m}{x} = \frac{m!}{x!\,(m - x)!}.$$
Thus,
$$E(X) = \sum_{x=1}^{m} \frac{m\,(m - 1)!}{(x - 1)!\,(m - x)!}\, p\, p^{x-1} (1 - p)^{m-x} = mp\,[p + (1 - p)]^{m-1} = mp.$$
The variance of $X$ is
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2,$$
where
$$E(X^2) = \sum_{x=0}^{m} x^2 \binom{m}{x} p^x (1 - p)^{m-x}.$$
Note that
$$E(X^2) = E[X(X - 1)] + E(X),$$
and a computation similar to that for $E(X)$ gives $E[X(X - 1)] = m(m - 1)p^2$. Therefore,
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = m(m - 1)p^2 + mp - m^2p^2 = mp(1 - p).$$
Then,
$$
\begin{aligned}
P(X = x + 1) &= \binom{m}{x + 1} p^{x+1} (1 - p)^{m-x-1}\\
&= \frac{m!}{(m - x - 1)!\,(x + 1)!}\, p^{x+1} (1 - p)^{m-x-1}\\
&= \frac{m!}{(m - x)!\,x!}\, p^{x} (1 - p)^{m-x}\,\frac{(m - x)\,p}{(x + 1)(1 - p)}\\
&= \frac{m - x}{x + 1}\,\frac{p}{1 - p}\, P(X = x).
\end{aligned}
$$
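As a numerical illustration (not part of the original text), the recursion above generates the entire probability function of $B(m, p)$ from $P(X = 0) = (1 - p)^m$. The sketch below, with illustrative values m = 10 and p = 0.3, also checks that the probabilities sum to one and that $E(X) = mp$ and $\mathrm{Var}(X) = mp(1 - p)$:

m, p = 10, 0.3                          # illustrative parameters
probs = [(1.0 - p)**m]                  # P(X = 0) = (1-p)^m
for x in range(m):
    # P(X = x+1) = [(m-x)/(x+1)] * [p/(1-p)] * P(X = x)
    probs.append(probs[-1] * (m - x) / (x + 1) * p / (1.0 - p))

mean = sum(x * q for x, q in enumerate(probs))
var = sum(x * x * q for x, q in enumerate(probs)) - mean**2
print(sum(probs))                       # approximately 1
print(mean, m * p)                      # both approximately 3.0
print(var, m * p * (1.0 - p))           # both approximately 2.1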
The variance of $X$ is
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = E[X(X - 1)] + E(X) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.$$
Example A.3.3. Experiment shows that the number of mechanical failures per
quarter for a certain component used in a loading plant is Poisson distributed.
The mean number of failures per quarter is 1.5. Stocks of the component are
built up to a fixed number at the beginning of a quarter and not replenished
until the beginning of the next quarter. Calculate the least number of spares
of this component which should be carried at the beginning of the quarter to
ensure that the probability of a demand exceeding this number during the
quarter will not exceed 0.10. If stocks are to be replenished only once in 6
months, how many ought now to be carried at the beginning of the period
to give the same protection?
Solution.
(a) Let $X \sim P(1.5)$ be the number of mechanical failures per quarter, and let $t$ be the smallest number of spares. Then, $P(X > t) \le 0.1$ and $P(X \le t) > 0.9$. Since
$$P(X \le t) = \sum_{x=0}^{t} e^{-1.5}\,\frac{(1.5)^x}{x!},$$
we have $P(X \le 2) = 0.8088$ and $P(X \le 3) = 0.9344$ (hence $P(X > 3) = 0.0656$). This implies that 3 spares per quarter are required.
(b) For 6 months, $\lambda = 3$ and $X \sim P(3)$. Find $t$ such that $P(X > t) \le 0.1$. Since $P(X \le 4) = 0.8153$ and $P(X \le 5) = 0.9161$, 5 spares should now be carried at the beginning of the period.
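The computation in Example A.3.3 is easily mechanized. A minimal sketch in Python, assuming only the Poisson probability function $P(X = x) = e^{-\lambda}\lambda^x/x!$, which returns the least $t$ with $P(X > t) \le 0.10$:

from math import exp, factorial

def least_spares(lam, tail=0.10):
    # Accumulate the Poisson cumulative distribution until P(X > t) <= tail.
    t, cdf = 0, exp(-lam)
    while 1.0 - cdf > tail:
        t += 1
        cdf += exp(-lam) * lam**t / factorial(t)
    return t

print(least_spares(1.5))   # 3 spares for a quarter
print(least_spares(3.0))   # 5 spares for six months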
1. $f(\mu + x) = f(\mu - x)$;
2. $\dfrac{df(x)}{dx} = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{x - \mu}{\sigma}\right)^2\right] \left(-\dfrac{x - \mu}{\sigma^2}\right)$. Then, $\dfrac{df(x)}{dx} = 0 \Rightarrow x = \mu$;
3. $\dfrac{d^2 f(x)}{dx^2} = -\dfrac{1}{\sigma^3\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} + \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{1}{2}\left(\dfrac{x - \mu}{\sigma}\right)^2\right] \left(\dfrac{x - \mu}{\sigma^2}\right)^2$. Then,
$$\left.\frac{d^2 f(x)}{dx^2}\right|_{x = \mu} = -\frac{1}{\sigma^3\sqrt{2\pi}} \le 0.$$
This implies that x = μ is the point at which the function f (x) attains its
maximum, f (x) is symmetric about μ, and the points of inflexion are at
x = μ + σ and x = μ − σ.
Let $X$ be a random variable with its value denoted by $x$. Suppose that $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, written as $X \sim N(\mu, \sigma^2)$. If $\mu = 0$ and $\sigma^2 = 1$, then the corresponding random variable $U$ is called a standard normal random variable, written as $U \sim N(0, 1)$. Let the value of $U$ be denoted by $u$. Then, its probability density function $\varphi(u)$ is
$$\varphi(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} u^2\right),$$
and its cumulative distribution is
$$\Phi(u) = \int_{-\infty}^{u} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} v^2\right) dv.$$
If $X \sim N(\mu, \sigma^2)$, then
$$P(X < x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{t - \mu}{\sigma}\right)^2\right] dt.$$
Also,
$$\Phi(u) = P(U < u) = \int_{-\infty}^{u} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} v^2\right) dv$$
and
$$P(U > u) = 1 - P(U < u) = 1 - \Phi(u).$$
Note that
$$\Phi(2) = P(U < 2) = 0.9772; \quad \Phi(0) = \tfrac{1}{2}; \quad \Phi(\infty) = 1; \quad \Phi(-\infty) = 0.$$
Set
$$\nu = \frac{t - \mu}{\sigma} \;\Rightarrow\; t = \sigma\nu + \mu \ \text{ and } \ d\nu = \frac{1}{\sigma}\, dt.$$
Then, $t = x_1 \Rightarrow \nu = \dfrac{x_1 - \mu}{\sigma} \equiv u_1$ and $t = x_2 \Rightarrow \nu = \dfrac{x_2 - \mu}{\sigma} \equiv u_2$, so that
$$P(x_1 < X < x_2) = \int_{u_1}^{u_2} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\nu^2}\, d\nu = \Phi(u_2) - \Phi(u_1) = \Phi\left(\frac{x_2 - \mu}{\sigma}\right) - \Phi\left(\frac{x_1 - \mu}{\sigma}\right).$$
Thus, we conclude that if $X \sim N(\mu, \sigma^2)$, then $U = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$. This is called standardization.
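Standardization reduces any normal probability to values of $\Phi$, which can be evaluated through the error function. A minimal sketch in Python, assuming the standard identity $\Phi(u) = [1 + \mathrm{erf}(u/\sqrt{2})]/2$:

from math import erf, sqrt

def Phi(u):
    # Standard normal cumulative distribution via the error function.
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

def prob_between(x1, x2, mu, sigma):
    # P(x1 < X < x2) = Phi((x2-mu)/sigma) - Phi((x1-mu)/sigma)
    return Phi((x2 - mu) / sigma) - Phi((x1 - mu) / sigma)

print(Phi(2.0))                           # 0.9772...
print(prob_between(-1.0, 1.0, 0.0, 1.0))  # 0.6827..., i.e., P(|U| < 1)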
Theorem A.3.2. Suppose that $X \sim N(\mu, \sigma^2)$. Then,
$$aX + b \sim N(a\mu + b,\ a^2\sigma^2).$$
Theorem A.3.3. For each $i = 1, \ldots, n$, let $X_i \sim N(\mu_i, \sigma_i^2)$. If $X_i$, $i = 1, \ldots, n$, are independent, then
$$Y \equiv \sum_{i=1}^{n} a_i X_i \sim N\left(\sum_{i=1}^{n} a_i \mu_i,\ \sum_{i=1}^{n} a_i^2 \sigma_i^2\right).$$
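For example, if $X_1 \sim N(1, 4)$ and $X_2 \sim N(2, 9)$ are independent, then $X_1 + X_2 \sim N(3, 13)$.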
$$U \equiv \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim (\text{approximately})\ N(0, 1), \tag{*}$$
where $\bar{X} = \sum_{i=1}^{n} X_i / n$. In other words, for a large sample size, the sample mean is approximately normally distributed, no matter what the distribution of the $X_i$, $i = 1, 2, \ldots$, is, as long as they are independent and identically distributed.
On the other hand, if $\sigma$ is not known, then we shall replace $\sigma$ by the sample standard deviation, provided that $n \ge 30$.
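The statement (*) can be checked empirically. The following small simulation in Python assumes, purely for illustration, that the $X_i$ are uniform on $(0, 1)$, so that $\mu = 1/2$ and $\sigma = 1/\sqrt{12}$, and estimates $P(-1 < U < 1)$, which should be close to $\Phi(1) - \Phi(-1) \approx 0.6827$:

import random
from math import sqrt

n, trials = 50, 20000
mu, sigma = 0.5, sqrt(1.0 / 12.0)        # mean and std of U(0, 1)
hits = 0
for _ in range(trials):
    xbar = sum(random.random() for _ in range(n)) / n
    u = (xbar - mu) / (sigma / sqrt(n))  # the standardized mean in (*)
    hits += (-1.0 < u < 1.0)
print(hits / trials)                     # approximately 0.6827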
Let $(\Omega, \mathcal{F}, P)$ denote a complete probability space, where $\Omega$ represents the sample space, $\mathcal{F}$ the $\sigma$-algebra (Borel algebra) of the subsets of the set $\Omega$, and $P$ the probability measure on the algebra $\mathcal{F}$. Let $\mathcal{F}_t$, $t \ge 0$, be an increasing family of complete sub-$\sigma$-algebras of the $\sigma$-algebra $\mathcal{F}$. For any random variable (or, equivalently, $\mathcal{F}$ measurable function) $X$, let
$$E\{X\} = \int_{\Omega} X(\omega)\, P(d\omega)$$
denote its expectation.
The conditional expectation with respect to a sub-$\sigma$-algebra $\mathcal{G} \subset \mathcal{F}$ satisfies the following standard properties.
1. For $\alpha_1, \alpha_2 \in \mathbb{R}$,
$$E\{\alpha_1 X_1 + \alpha_2 X_2 \mid \mathcal{G}\} = \alpha_1 E\{X_1 \mid \mathcal{G}\} + \alpha_2 E\{X_2 \mid \mathcal{G}\}.$$
2. For $\mathcal{G}_1 \subset \mathcal{G}_2$,
$$E\{E\{X \mid \mathcal{G}_1\} \mid \mathcal{G}_2\} = E\{E\{X \mid \mathcal{G}_2\} \mid \mathcal{G}_1\} = E\{X \mid \mathcal{G}_1\}.$$
3. If $Z$ is a bounded $\mathcal{G}$ measurable random variable with $\mathcal{G} \subset \mathcal{F}$, then
$$E\{ZX \mid \mathcal{G}\} = Z\, E\{X \mid \mathcal{G}\}.$$
4. If $X$ is independent of $\mathcal{G}_1$, then
$$E\{X \mid \mathcal{G}_1\} = E\{X\}.$$
5. For any $\mathcal{F}$ measurable and integrable random variable $Z$, the process
$$Z_t = E\{Z \mid \mathcal{F}_t\}, \quad t \ge 0,$$
is an $\mathcal{F}_t$ martingale in the sense that for any $s \le t < \infty$,
$$E\{Z_t \mid \mathcal{F}_s\} = Z_s.$$
Theorem A.3.5. Suppose that the random process $X$ is defined by the Ito process
$$dX(t) = \alpha\, dt + \beta\, dW(t),$$
and let $Y(t) = F(X(t), t)$, where $F$ is twice continuously differentiable. Then,
$$dY(t) = \left(\frac{\partial F}{\partial x}\,\alpha + \frac{\partial F}{\partial t} + \frac{1}{2}\,\frac{\partial^2 F}{\partial x^2}\,\beta^2\right) dt + \frac{\partial F}{\partial x}\,\beta\, dW.$$
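For illustration, taking $F(x, t) = x^2$ in the formula above gives $\partial F/\partial x = 2x$, $\partial F/\partial t = 0$ and $\partial^2 F/\partial x^2 = 2$, so that
$$dY(t) = \big(2X(t)\,\alpha + \beta^2\big)\, dt + 2X(t)\,\beta\, dW(t);$$
in particular, with $\alpha = 0$ and $\beta = 1$ (so that $X = W$), $d\big(W^2(t)\big) = dt + 2W(t)\, dW(t)$.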