
Springer Optimization and Its Applications 171

Kok Lay Teo


Bin Li
Changjun Yu
Volker Rehbock

Applied and
Computational
Optimal
Control
A Control Parametrization Approach
Springer Optimization and Its Applications

Volume 171

Series Editors
Panos M. Pardalos, University of Florida
My T. Thai, University of Florida

Honorary Editor
Ding-Zhu Du, University of Texas at Dallas

Advisory Editors
Roman V. Belavkin, Middlesex University
John R. Birge, University of Chicago
Sergiy Butenko, Texas A&M University
Vipin Kumar, University of Minnesota
Anna Nagurney, University of Massachusetts Amherst
Jun Pei, Hefei University of Technology
Oleg Prokopyev, University of Pittsburgh
Steffen Rebennack, Karlsruhe Institute of Technology
Mauricio Resende, Amazon
Tamás Terlaky, Lehigh University
Van Vu, Yale University
Michael N. Vrahatis, University of Patras
Guoliang Xue, Arizona State University
Yinyu Ye, Stanford University
Aims and Scope
Optimization has continued to expand in all directions at an astonishing rate. New
algorithmic and theoretical techniques are continually being developed, and the
diffusion of optimization into other disciplines is proceeding at a rapid pace, with a
spotlight on machine learning, artificial intelligence, and quantum computing. Our
knowledge of all aspects of the field has grown even more profound. At the same
time, one of the most striking trends in optimization is the constantly increasing
emphasis on the interdisciplinary nature of the field. Optimization has become a
basic tool in areas including, but not limited to, applied mathematics, engineering,
medicine, economics, computer science, operations research, and other sciences.

The series Springer Optimization and Its Applications (SOIA) aims to publish
state-of-the-art expository works (monographs, contributed volumes, textbooks,
handbooks) that focus on theory, methods, and applications of optimization. Top-
ics covered include, but are not limited to, nonlinear optimization, combinatorial
optimization, continuous optimization, stochastic optimization, Bayesian optimiza-
tion, optimal control, discrete optimization, multi-objective optimization, and more.
New to the series portfolio are works at the intersection of optimization and
machine learning, artificial intelligence, and quantum computing.

Volumes from this series are indexed by Web of Science, zbMATH, Mathematical
Reviews, and SCOPUS.

More information about this series at http://www.springer.com/series/7393


Kok Lay Teo • Bin Li • Changjun Yu • Volker Rehbock

Applied and Computational Optimal Control
A Control Parametrization Approach

Kok Lay Teo
School of Mathematical Sciences, Sunway University
Selangor Darul Ehsan, Malaysia

Bin Li
College of Electrical Engineering, Sichuan University
Chengdu, China

Changjun Yu
College of Sciences, Shanghai University
Shanghai, China

Volker Rehbock
School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University
Perth, WA, Australia

ISSN 1931-6828    ISSN 1931-6836 (electronic)
Springer Optimization and Its Applications
ISBN 978-3-030-69912-3    ISBN 978-3-030-69913-0 (eBook)
https://doi.org/10.1007/978-3-030-69913-0

Mathematics Subject Classification: 49M25, 34H05, 93C10

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2021
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

For an optimal control problem, one seeks to optimize a performance measure
subject to a set of dynamic, and possibly algebraic, constraints. The
dynamic constraints may be expressed by a set of differential equations (or-
dinary or partial) or a set of difference equations. These equations may be
deterministic or stochastic in nature. Based on the theoretical foundation laid
by many great mathematicians of our time, optimal control has developed
into a well-established research area. It has attracted the interest of many
top researchers and practitioners working in several, often unrelated, disci-
plines, such as economics, management science, environmental management,
forestry, agriculture, defence, core engineering (civil, chemical, electrical and
mechanical), biology and social sciences. A large volume of papers, textbooks
and research monographs dealing with both theoretical and practical aspects of
optimal control is available in the literature.
For many real-life optimal control problems, the underlying dynamical systems
are often large scale and highly complex. These optimal control problems are
also subject to rigid algebraic constraints arising naturally from practical
limitations and engineering specifications. Thus, it is often impossible to
obtain their solutions analytically, and we must instead depend on computational
methods for solving these real-world problems. For this reason, many successful
computational methods have been developed for different classes of optimal
control problems with various types of constraints.
The computational methods based on control parametrization form a specific
family among these methods. Similar to the book titled "A Unified Computational
Approach to Optimal Control Problems" by Teo, Goh and Wong, the focus of this
book is on this family of computational methods. The book by Teo, Goh and Wong
was published in 1991 but has been out of print since 1996, and it contains only
those results obtained prior to 1991. Many new theoretical results, new
computational methods and new applications have been obtained and published
since 1991.
For this reason, we have been motivated to write this new book. To ensure
that the book is self-contained, essential fundamental results from the 1991
book are included in this new book. Revised versions of basic results on
unconstrained and constrained optimization problems, and on optimization
problems subject to continuous inequality constraints, are also included.
A tremendous proliferation of results based on control parametrization has
appeared in the literature since 1991. To keep the size of the book reasonable,
we have restricted ourselves to a discussion of those results obtained by the
authors and their past and present collaborators and students. This choice
ensures that the results presented in this book can be organized to form a
unified computational approach to solving various real-world practical optimal
control problems. These computational methods are supported by rigorous
convergence analysis, are easily programmable, and are adaptable to existing
efficient optimization software packages.
We do not claim that this family of computational methods is necessarily
superior to others found in the literature. Direct (Runge-Kutta) discretization
of optimal control problems and pseudospectral techniques are two examples
of methods that have been intensively studied by many researchers. A brief
review of these techniques is included in Section 1.3.5.
This book can serve as a reference for researchers and students working in
the areas of optimal control theory and its applications, and for professionals
using optimal control to solve their problems. It is noted that many scientists,
engineers and practitioners may not be thoroughly familiar with optimal con-
trol theory. Thus, the optimal control software MISER, which was developed
based on the control parametrization technique, can help them to apply op-
timal control theory as a tool to solve their problems. We wish to emphasize
that the aim of this book is to furnish a rigorous and detailed exposition of
the concept of control parametrization and the time scaling transformation
to develop new theory and new computational methods for solving various
optimal control problems numerically and in a unified fashion. Based on the
knowledge gained from this book, research scientists or engineers can develop
new theory and new computational methods to solve other complex problems
that are not covered in this book.
The background required to understand the computational methods pre-
sented in this book, and their application to solve practical problems, is
advanced calculus. However, to analyse the convergence properties of these
computational methods, some results in real and functional analysis are also
required. For the convenience of the reader, these mathematical concepts and
facts are stated without proofs in Appendix A.1. Engineers and applied sci-
entists should be able to follow the proofs of the convergence theorems with
the aid of the results presented in Appendix A.1. For global optimization,
a filled function method is presented in Appendix A.2. Some basic concepts
and results on probability theory are discussed in Appendix A.3.

Chapter 1 introduces the reader to some essential concepts of optimal control
theory. It also contains examples drawn from many fields of engineering and
science. It ends with a brief survey of the existing computational
techniques for solving optimal control problems. In Chapter 2, some funda-
mental results for unconstrained optimization problems are discussed. Chap-
ter 3 contains basic results on constrained optimization problems. Chapter 4
considers optimization problems subject to continuous inequality constraints.
Three computational methods are developed, two are based on the constraint
transcription technique used in conjunction with a local smoothing method
and the third one is developed based on the exact penalty function method.
These results are important because after control parametrization, an opti-
mal control problem is reduced to an optimal parameter selection problem,
which can be viewed as an optimization problem.
Chapter 5 presents some fundamental results on discrete time optimal control
problems, covering the discrete time minimum principle, the dynamic programming
technique and computational methods for solving such problems. Chapter 6
contains some essential results in optimal control theory, intended for those
readers who are not familiar with the subject. Chapter 7 is devoted to the
derivations of gradient
formulae for various kinds of optimal parameter selection problems, includ-
ing optimal parameter selection problems with the heights and the switching
times of the piecewise constant control being taken as decision variables;
optimal parameter selection problems with discrete valued control; optimal
parameter selection problems of switched systems; time-delay optimal pa-
rameter selection problem; and optimal control problems with multiple char-
acteristic time points. With these gradient formulae, the respective optimal
parameter selection problems can be solved as mathematical programming
problems.
Chapter 8 considers optimal control problems in canonical form. The con-
cept of control parametrization is introduced and applied to these canon-
ically constrained optimal control problems. Gradient-based computational
methods are derived. They are supported by a rigorous convergence analy-
sis. A time scaling transform is also introduced to supplement the control
parametrization technique for solving these canonically constrained optimal
control problems. Chapter 9 considers a class of optimal control problems
subject to continuous inequality constraints as well as terminal inequality
constraints on the state and/or control variables. The constraint transcription
method and the exact penalty function method are used to derive respective
computational methods for solving these optimal control problems subject to
continuous inequality constraints. Chapter 10 aims to develop computational
methods for solving three classes of optimal control problems—time-lag op-
timal control problems, state-dependent switched time-delay optimal control
problems and min-max optimal control problems.
In Chapter 11, we introduce two approaches to constructing suboptimal feedback
controls for constrained optimal control problems. The first approach is known
as the neighbouring extremals approach, while the second approach is to
construct an optimal PID control for a class of optimal control problems
subject to continuous inequality constraints and a terminal equality
constraint.
In Chapter 12, we consider two classes of stochastic dynamic optimization
problems. The first one is a combined optimal parameter selection and opti-
mal control problem in which the dynamical system is governed by a linear Itô
stochastic differential equation involving a Wiener process. Both the control
and system parameter vectors may, however, appear nonlinearly in the system
dynamics. The cost functional is taken as an expected value of a quadratic
function of the state vector, where the weighting matrices are time invariant
but are allowed to be nonlinear in both the control and system parameter.
Furthermore, certain realistic features such as probabilistic constraints on the
state vector may also be included. Another problem considered in Chapter 12
is a partially observed linear stochastic control problem described by three
sets of stochastic differential equations: one for the system to be controlled,
one for the observer (measurement) channel and one for the control channel
driven by the observed process. The noise processes perturbing the system
and observer dynamics are vector-valued Poisson processes. For both of these
stochastic dynamic optimization problems, we show that they are equivalent
to two respective deterministic dynamic optimization problems. These equiv-
alent deterministic dynamic optimization problems are further transformed
into special cases of the form considered in Chapter 9.
It is our pleasure to express gratitude to many of our colleagues and col-
laborators, and to those PhD students, Postdoctoral Fellows and Visiting Re-
search Fellows of the first author listed as follows: Changzhi Wu, Zhiguo Feng,
Joseph Lee, Kar Hung Wong, Ryan Loxton, Qun Lin, Rui Li, Chongyang Liu,
Zhaohua Gong and Canghua Jiang. They have made great contributions to
the book, and many of the results presented in this book are from various joint
papers co-authored with them. Further details are indicated in the respective
chapters.
We wish to thank Yanqing Liu and Zhaohua Gong for recalculating the
examples in Chapters 8 and 9. Zhaohua Gong has redrawn the figures for
the examples being solved in Chapters 8–11. Also, we wish to thank Xiaoyi
Guan, Gaoqi Liu, Yanqing Liu, Shuxuan Su, Di Wu, Lei Yuan and Xi Zhu
for their help with LaTeX.
Our thanks also go to Professor Panos M. Pardalos, the Editor of the
Book Series, Springer Optimization and Its Applications, for his encourage-
ment, leading to the publication of the book in his book series. We thank the
reviewers for their constructive and detailed comments and suggestions.

We also wish to express our appreciation to the staff of Springer, especially
to Elizabeth Loew, for their expert collaboration. Last but not least, our most
sincere thanks go to our families for their support, patience and understand-
ing. They have done much to improve the work, but any shortcomings are
totally ours.

Kok Lay Teo
School of Mathematical Sciences, Sunway University
Selangor Darul Ehsan, Malaysia

Bin Li
College of Electrical Engineering, Sichuan University
Chengdu, China

Changjun Yu
College of Sciences, Shanghai University
Shanghai, China

Volker Rehbock
School of Electrical Engineering, Computing and Mathematical Sciences, Curtin University
Perth, WA, Australia

August 20, 2020
Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Optimal Control Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Computational Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.1 Dynamic Programming and Iterative Dynamic
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3.2 Leapfrog Algorithm and STC Algorithm . . . . . . . . 13
1.3.3 Control Parametrization . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3.4 Collocation Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.5 Full Parametrization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4 Optimal Control Software Packages . . . . . . . . . . . . . . . . . . . . . . 18

2 Unconstrained Optimization Techniques . . . . . . . . . . . . . . . . . 21


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3 Gradient Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4 Steepest Descent Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 Newton’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.6 Modifications to Newton’s Method . . . . . . . . . . . . . . . . . . . . . . 28
2.7 Line Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.8 Conjugate Gradient Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8.1 Convergence of the Conjugate Gradient Methods . . . . 40
2.9 Quasi-Newton Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.9.1 Approximation of the Inverse G−1 . . . . . . . . . . . . . . . . 42
2.9.2 Rank Two Correction . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.9.3 BFGS Update Formula . . . . . . . . . . . . . . . . . . . . . . . . . . 48


3 Constrained Mathematical Programming . . . . . . . . . . . . . . . . 55


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.2 Quadratic Programming with Linear Equality Constraints . . 58
3.3 Quadratic Programming via Active Set Strategy . . . . . . . . . 62
3.4 Constrained Quasi-Newton Method . . . . . . . . . . . . . . . . . . . . . . 67
3.5 Sequential Quadratic Programming Algorithm . . . . . . . . . . . . 69

4 Optimization Problems Subject to Continuous


Inequality Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.2 Constraint Transcription Technique . . . . . . . . . . . . . . . . . . . . . . 79
4.2.1 Inequality Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.2.2 Continuous Inequality Constraints . . . . . . . . . . . . . . . . 83
4.3 Continuous Inequality Constraint Transcription Approach . . 89
4.3.1 The First Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.3.2 The Second Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.4 Exact Penalty Function Method . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.4.1 Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.4.2 Algorithm and Numerical Results . . . . . . . . . . . . . . . . . 113
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

5 Discrete Time Optimal Control Problems . . . . . . . . . . . . . . . . 121


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.2 Dynamic Programming Approach . . . . . . . . . . . . . . . . . . . . . . . 121
5.2.1 Application to Portfolio Optimization . . . . . . . . . . . . . 132
5.3 Discrete Time Optimal Control Problems with Canonical
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.3.1 Gradient Formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.3.2 A Unified Computational Approach . . . . . . . . . . . . . . . 144
5.4 Problems with Terminal and All-Time-Step Inequality
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.4.1 Constraint Approximation . . . . . . . . . . . . . . . . . . . . . . . 146
5.4.2 Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.4.3 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.5 Discrete Time Time-Delayed Optimal Control Problem . . . . . 152
5.5.1 Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.5.2 Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.5.3 A Tactical Logistic Decision Analysis Problem . . . . . . 163
5.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

6 Elements of Optimal Control Theory . . . . . . . . . . . . . . . . . . . . 173


6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.2 First Order Necessary Condition: Euler-Lagrange
Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.3 The Linear Quadratic Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 178

6.4 Pontryagin Maximum Principle . . . . . . . . . . . . . . . . . . . . . . . . . 182


6.5 Singular Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
6.6 Time Optimal Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
6.7 Continuous State Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
6.8 The Bellman Dynamic Programming . . . . . . . . . . . . . . . . . . . . . 203
6.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

7 Gradient Formulae for Optimal Parameter Selection


Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
7.2 Optimal Parameter Selection Problems . . . . . . . . . . . . . . . . . . . 218
7.2.1 Gradient Formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
7.2.2 A Unified Computational Approach . . . . . . . . . . . . . . . 224
7.3 Control Parametrization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
7.4 Switching Times as Decision Parameters . . . . . . . . . . . . . . . . . 230
7.4.1 Gradient Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
7.4.2 Time Scaling Transformation . . . . . . . . . . . . . . . . . . . . . 236
7.4.3 Combined Piecewise Constant Control and Variable
System Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
7.4.4 Discrete Valued Optimal Control Problems
and Optimal Control of Switched Systems . . . . . . . . . . 246
7.5 Time-Lag System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
7.5.1 Gradient Formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
7.6 Multiple Characteristic Time Points . . . . . . . . . . . . . . . . . . . . . 256
7.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260

8 Control Parametrization for Canonical Optimal Control


Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
8.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
8.3 Control Parametrization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
8.4 Four Preliminary Lemmas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
8.5 Some Convergence Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
8.6 A Unified Computational Approach . . . . . . . . . . . . . . . . . . . . . . 278
8.7 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
8.8 Combined Optimal Control and Optimal Parameter
Selection Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
8.8.1 Model Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
8.8.2 Smoothness of Optimal Control . . . . . . . . . . . . . . . . . . . 291
8.8.3 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
8.9 Control Parametrization Time Scaling Transform . . . . . . . . . . 297
8.9.1 Control Parametrization Time Scaling Transform . . . 299
8.9.2 Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
8.10 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
8.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312

9 Optimal Control Problems with State and Control


Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
9.2 Optimal Control with Continuous State Inequality
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
9.2.1 Time Scaling Transform . . . . . . . . . . . . . . . . . . . . . . . . . 317
9.2.2 Constraint Approximation . . . . . . . . . . . . . . . . . . . . . . . 320
9.2.3 A Computational Algorithm . . . . . . . . . . . . . . . . . . . . . . 324
9.2.4 Solving Problem (Pε,γ (p)) . . . . . . . . . . . . . . . . . . . . . . . . 325
9.2.5 Some Convergence Results . . . . . . . . . . . . . . . . . . . . . . . 330
9.2.6 Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
9.3 Exact Penalty Function Approach . . . . . . . . . . . . . . . . . . . . . . . 338
9.3.1 Control Parametrization and Time Scaling
Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
9.3.2 Some Convergence Results . . . . . . . . . . . . . . . . . . . . . . . 347
9.3.3 Computational Algorithm . . . . . . . . . . . . . . . . . . . . . . . 362
9.3.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
9.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368

10 Time-Lag Optimal Control Problems . . . . . . . . . . . . . . . . . . . . 371


10.1 Time-Lag Optimal Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
10.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
10.1.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
10.1.3 Control Parametrization . . . . . . . . . . . . . . . . . . . . . . . . . 373
10.1.4 The Time-Scaling Transformation . . . . . . . . . . . . . . . . . 374
10.1.5 Gradient Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
10.1.6 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
10.2 Time-Lag Optimal Control with State-Dependent Switched
System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
10.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
10.2.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
10.2.3 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
10.2.4 Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
10.2.5 Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
10.3 Min-Max Optimal Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
10.3.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
10.3.2 Some Preliminary Results . . . . . . . . . . . . . . . . . . . . . . . . 425
10.3.3 Problem Approximation . . . . . . . . . . . . . . . . . . . . . . . . . 427
10.3.4 Illustrative Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
10.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438

11 Feedback Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441


11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
11.2 Neighbouring Extremals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
11.2.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442

11.2.2 Construction of Suboptimal Feedback Control Law . . 446


11.2.3 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
11.3 PID Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
11.3.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
11.3.2 Constraint Approximation . . . . . . . . . . . . . . . . . . . . . . . 458
11.3.3 Computational Method . . . . . . . . . . . . . . . . . . . . . . . . . . 460
11.3.4 Application to a Ship Steering Control Problem . . . . . 462
11.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469

12 On Some Special Classes of Stochastic Optimal Control


Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
12.2 A Combined Optimal Parameter and Optimal Control
Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
12.2.1 Deterministic Transformation . . . . . . . . . . . . . . . . . . . . . 473
12.2.2 A Numerical Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
12.3 Optimal Feedback Control for Linear Systems Subject to
Poisson Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
12.3.1 Two Stochastic Optimal Feedback Control
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
12.3.2 Deterministic Model Transformation . . . . . . . . . . . . . . 485
12.3.3 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
12.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499

A.1 Elements of Mathematical Analysis . . . . . . . . . . . . . . . . . . . . . . 501


A.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
A.1.2 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
A.1.3 Linear Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
A.1.4 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
A.1.5 Continuous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
A.1.6 Normed Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508
A.1.7 Linear Functionals and Dual Spaces . . . . . . . . . . . . . . . . . . . 510
A.1.8 Elements in Measure Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 513
A.1.9 The Lp Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
A.1.10 Multivalued Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
A.1.11 Bounded Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524

A.2 Global Optimization via Filled Function Approach . . . . . . . 527

A.3 Elements of Probability Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 537

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
Chapter 1
Introduction

1.1 Optimal Control Problems

Broadly speaking, an optimal control problem seeks to optimize a performance
measure subject to a set of dynamic constraints. The dynamic constraints may
constitute a set of differential equations (ordinary or partial) or a set of
difference equations. These equations may be deterministic or stochastic in
nature. Furthermore, optimal control problems are often subject to constraints
on the state and/or control. These constraints arise due to engineering
regulations or design specifications, such as a specified product quality or a
safety requirement [55]. Optimal control problems subject to constraints on the
state and/or control are referred to as constrained optimal control problems.
Optimal control has applications in almost every area of science and
engineering, including aquaculture operation [28], cancer chemotherapy (see,
for example, [180, 181]), switched power converters (see, for example,
[93, 167]), spacecraft control (see, for example, [82, 100, 101, 111, 137, 255,
311, 312]), ship steering [132], underwater vehicles [45], process control
(see, for example, [25, 41, 46, 139, 156–160, 173–176, 271, 272, 291–293,
302]), core engineering (see, for example, [35, 42, 79, 80, 102, 105, 123]),
optimal control of automobiles or trains (see, for example, [95–97, 128, 310]),
crystallization processes (see, for example, [194, 214]) and management
sciences (see, for example, [70, 94, 133, 135, 136, 226, 233–236]).
The famous Pontryagin maximum principle (see, for example, [3, 4, 29,
40, 64, 69, 206]) is a set of first order necessary conditions of optimality
for a constrained optimal control problem. It provides the means to solve
many practical problems in various disciplines. In particular, researchers in
economics started to model many of their problems in an optimal control
context and were able to solve them using the maximum principle (see, for
example, [74, 110, 226]). The Markov-Dubins path is the shortest planar curve

joining two points with prescribed tangents, where a specified bound is im-
posed on its curvature. An elegantly simple solution was obtained by Dubins
in 1957—a selection of at most three arcs are concatenated, each of which
is either a circular arc of maximum (prescribed) curvature or a straight line.
The Markov-Dubins problem is reformulated as an optimal control problem
in various papers, and Pontryagin maximum principle is used to obtain the
same results as those obtained by Dubins. In [114], under the same reformu-
lation of the Markov-Dubins problems, the maximum principle is applied to
derive Dubins result again. The new insights are: abnormal control solutions
do exist; these solutions can be characterized as a concatenation of at most
two circular arcs; they are also solutions of the normal problem; and any
feasible path of the types mentioned in Dubins result satisfies the Pontrya-
gin maximum principle. A numerical method for computing Markov-Dubins
path is proposed. Dynamic Programming Principle developed by Bellman
[18–20] has been used to determine optimal feedback controls for a range
of practical optimal control problems. However, its application typically re-
quires the solution of a highly nonlinear partial differential equation known
as the Hamilton-Jacobi-Bellman (HJB) equation. Some numerical methods
for solving this HJB equation for low dimensional problems are available in
the literature (see, for example, [2, 6, 98, 99, 188, 208, 273, 274, 303–309]).
Practical problems, however, are usually too complex to be solved analytically
using either the Pontryagin maximum principle or dynamic programming. Thus,
many numerical solution techniques have been developed and implemented on
computers. These numerical solution techniques coupled with
modern computing power are able to solve a wide range of highly complex
optimal control problems.
There are several survey articles and books in the literature on optimal
control computation. See, for example, [48, 69, 148, 210, 215]. For optimal
control computational methods based on Euler discretization, see, for exam-
ple, [36, 49, 50, 85, 113, 115, 178, 267].
In this book, our attention is centred on optimal control problems involving
systems of ordinary differential equations, with and without time-delayed
arguments, and systems of difference equations. Note that these classes of
problems have many practical applications across a wide range of disciplines
such as those mentioned above.

1.2 Illustrative Examples

Example 1.2.1 (The Student Problem) Various versions of this problem have
appeared in the literature over the past few decades (see, for example, [43,
120, 202, 209]). Consider a lazy and forgetful student who wishes to pass an
examination with a minimum expenditure of effort. Assume that the student's
knowledge level at any time t during the semester is a reflection of his/her
performance if an examination were given at that time. Also, suppose that the
rate of knowledge intake is a linear function of the work rate and that the
student is constantly forgetting a proportion of what he/she already knows.
Let w(t) denote the rate of work done at time t, let k(t) denote the knowledge
level at time t and let T be the total time of the semester in weeks.
Furthermore, let c > 0 be the forgetfulness factor, let b > 0 be a constant
that determines how efficiently the student's work is converted to knowledge,
let w̄ be an upper bound on the work rate, let k0 be the initial knowledge
level and let kT be the desired final knowledge level (assumed to be the
minimum level required to pass the examination). Then the problem may be
stated as follows:
Minimize

$$g(w) = \int_0^T w(t)\,dt$$

subject to

$$\frac{dk(t)}{dt} = b\,w(t) - c\,k(t), \qquad k(0) = k_0, \qquad k(T) = k_T,$$

and $0 \le w(t) \le \bar{w}$ for all $0 \le t < T$. Although this is a rather trivial example
of an optimal control problem, it serves well to illustrate the basic class of
optimal control problems we want to consider: an objective functional, g(w),
is to be minimized subject to a dynamical system governing the behavior
of the state variable, k(t), and subject to constraints. We need to find a
control function w(t), t ∈ [0, T ), subject to given bounds, which will optimize
the objective functional. A somewhat more realistic version of the student
problem, which assumes some ambition in the student, may be stated as
follows:
Maximize

$$g(w) = a\,k(T) - \int_0^T \left[\alpha_1 w(t) + \alpha_2 (w(t))^2\right] dt$$

subject to

$$\frac{dk(t)}{dt} = \big(b_1 + b_2 k(t)\big)w(t) - c\,k(t), \qquad k(0) = k_0, \qquad k(T) \ge k_T,$$

and $0 \le w(t) \le \bar{w}$ for all $0 \le t < T$. This version of the problem assumes
that the student has some interest in maximizing his/her examination mark,
is more averse to high rates of work and can gain knowledge more readily
when he/she already has a high level of knowledge. Both problems stated in
Example 1.2.1 can be solved readily by using the Pontryagin maximum principle
or dynamic programming.
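To give a concrete feel for the direct computational approach pursued throughout this book, the following is a minimal sketch, in Python, of attacking the first student problem by control parametrization: the control w(t) is approximated by a piecewise constant function over an equidistant partition, the knowledge dynamics are integrated by forward Euler, and the resulting finite-dimensional problem is handed to a standard NLP solver. All numerical values (b, c, k0, kT, T, w̄) are illustrative assumptions, and the code is not the MISER implementation referred to in the Preface.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data (assumed, not taken from the text)
b, c = 0.5, 0.1               # conversion efficiency and forgetfulness factor
k0, kT, T = 1.0, 10.0, 12.0   # initial/desired knowledge, semester length (weeks)
w_bar = 5.0                   # upper bound on the work rate
N = 24                        # number of control subintervals
h = T / N                     # subinterval length (one Euler step per subinterval)

def simulate(w):
    # Integrate dk/dt = b*w - c*k with piecewise constant w, forward Euler.
    k = k0
    for wi in w:
        k += h * (b * wi - c * k)
    return k

def objective(w):
    # g(w) = integral of w(t) dt for piecewise constant w.
    return h * np.sum(w)

cons = [{"type": "eq", "fun": lambda w: simulate(w) - kT}]   # k(T) = kT
bounds = [(0.0, w_bar)] * N                                  # 0 <= w(t) <= w_bar

res = minimize(objective, x0=np.full(N, 0.5 * w_bar),
               bounds=bounds, constraints=cons, method="SLSQP")
print("minimal total work:", res.fun)
```

Refining the partition (increasing N) recovers the continuous-time solution in the limit; Chapters 7 and 8 make this convergence precise.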

[Fig. 1.2.1: Movement of the container. The container is lifted from the truck at A up to B, moved horizontally from B to C and from C to D, then lowered from D through E to F on the ship; the distances d1, d2, d3, the lengths l1–l5 and the boundary conditions on x3, x4 and x5 at each waypoint are indicated in the figure.]

Example 1.2.2 (Optimal Control of Container Cranes) This problem originally
appeared in [219]. Consider the task of loading a container from a truck
waiting on a wharf onto a ship tied up to the wharf; see Figure 1.2.1.
The mechanics of the crane are schematically shown in Figure 1.2.2. The
crane is driven by two motors. The trolley motor (left) controls the hori-
zontal position of the crane trolley, while the hoist motor (top) effectively
controls the rope length of the crane. J1 and J2 denote the total moment of
inertia of the trolley and hoist motors, respectively, including their associated
components (reduction gears, brake, drum etc.). Similarly, b1 and b2 denote
the drum radii of the trolley and hoist motors, respectively. Furthermore, we
let θ1 (t) and θ2 (t) denote the angles of rotation of the respective motors in
radians at time t. The underlying control variables are the driving torques
generated by the trolley and hoist motors, denoted by T1(t) and T2(t),
respectively. Finally, M denotes the total mass of the container and attached
equipment, m is the total mass of the crane trolley and the operator's cab,
φ(t) is the load swing angle and g denotes the acceleration due to gravity.

[Fig. 1.2.2: Schematic of the container crane]
We define

$$\begin{aligned}
x_1(t) &= b_1\theta_1(t) \quad \text{(horizontal position of trolley)},\\
x_2(t) &= b_2\theta_2(t) \quad \text{(rope length)},\\
x_3(t) &= \phi(t) \quad \text{(swing angle of the load)},\\
x_4(t) &= \frac{dx_1(t)}{dt}, \qquad x_5(t) = \frac{dx_2(t)}{dt}, \qquad x_6(t) = \frac{dx_3(t)}{dt},\\
v_1(t) &= \frac{b_1 T_1(t)}{J_1 + m b_1^2}, \qquad v_2(t) = \frac{b_2\big(T_2(t) + M b_2 g\big)}{J_2 + M b_2^2},\\
\delta_1 &= \frac{M b_1^2}{J_1 + m b_1^2} \quad\text{and}\quad \delta_2 = \frac{M b_2^2}{J_2 + M b_2^2}.
\end{aligned}$$

Assuming that the load swing angle is small in magnitude, that the load
can be regarded as a point and that frictional torques can be neglected, the
dynamics of the container crane are given by Sakawa and Shindo [219]

$$\begin{aligned}
\frac{dx_1(t)}{dt} &= x_4(t) &&(1.2.1)\\
\frac{dx_2(t)}{dt} &= x_5(t) &&(1.2.2)\\
\frac{dx_3(t)}{dt} &= x_6(t) &&(1.2.3)\\
\frac{dx_4(t)}{dt} &= v_1(t) - \delta_1 x_3(t)v_2(t) + \delta_1 g\,x_3(t) &&(1.2.4)\\
\frac{dx_5(t)}{dt} &= -\delta_2 x_3(t)v_1(t) + v_2(t) &&(1.2.5)\\
\frac{dx_6(t)}{dt} &= -\frac{1}{x_2(t)}\big[v_1(t) - \delta_1 x_3(t)v_2(t) + (1+\delta_1)g\,x_3(t) + 2x_5(t)x_6(t)\big]. &&(1.2.6)
\end{aligned}$$

Note that v1 and v2 are the effective control variables in these dynamics, and
they are subject to the following bounds:

$$|v_1(t)| \le \bar{v}_1, \quad \forall t, \qquad (1.2.7)$$
$$\underline{v}_2 \le v_2(t) \le \bar{v}_2, \quad \forall t, \qquad (1.2.8)$$

where $\bar{v}_1$, $\underline{v}_2$ and $\bar{v}_2$ are defined in terms of the maximum torques of the trolley
drive and hoist motors, respectively (see [219]). Due to safety requirements,
the following bounds on the state variables are imposed:

$$|x_4(t)| \le \bar{x}_4, \quad \forall t, \qquad (1.2.9)$$
$$|x_5(t)| \le \bar{x}_5, \quad \forall t. \qquad (1.2.10)$$


The movement of the container is generally divided into 5 distinct sections.
The first section (from A to B in Figure 1.2.1) constitutes just a simple
vertical lift off the truck. Note that x4 and x5 are equal to zero at the start
of this section. At the end, the container is moving at the maximum allowed
vertical velocity, so x4 = 0 and x5 = −x̄5 . These conditions carry over as
the starting values for the next section of the path, from B to C. Unlike the
previous section for which the optimal control can be determined analytically,
moving the container from B to C is a non-trivial optimal control task that
forms the basis of the problem we present here. Note that the container must
arrive at C with x3 = 0 (no swing), x4 = x̄4 (maximum allowed horizontal
velocity) and x5 = 0 (zero vertical velocity). Finally, the sections from C to
D and E to F again constitute trivial problems, while the section from D to
E has a similar complexity to that from B to C. Consider the section from B
to C over the time horizon [0, T ]. We impose the following initial conditions
and terminal state constraints on the problem:

$$\begin{aligned}
&x_1(0) = 0, \quad x_2(0) = l_2, \quad x_3(0) = 0,\\
&x_4(0) = 0, \quad x_5(0) = -\bar{x}_5, \quad x_6(0) = 0, &&(1.2.11)\\
&x_1(T) = d_1, \quad x_2(T) = l_3, \quad x_3(T) = 0,\\
&x_4(T) = \bar{x}_4, \quad x_5(T) = 0, \quad x_6(T) = 0, &&(1.2.12)
\end{aligned}$$

with the various constants illustrated in Figure 1.2.1.


The objective functional to be minimized subject to the dynamics and
constraints given in (1.2.1)–(1.2.12) is

$$g(v) = \frac{1}{2}\int_0^T \left[w_1 (x_3(t))^2 + w_2 (x_6(t))^2\right] dt, \qquad (1.2.13)$$

where $w_1$ and $w_2$ are given weights. This objective functional can be regarded
as a measure of the total amount of swing experienced by the load.
Alternatively, since the speed of the operation is clearly an important issue
[219], one may want to minimize

$$\bar{g}(v) = \int_0^T 1\, dt = T \qquad (1.2.14)$$

subject to (1.2.1)–(1.2.12) and subject to the additional swing constraint

$$\int_0^T \left[w_1 (x_3(t))^2 + w_2 (x_6(t))^2\right] dt \le S_{\max}, \qquad (1.2.15)$$

where $S_{\max}$ is a given parameter.
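As a first computational step, one typically needs to simulate the dynamics (1.2.1)–(1.2.6) and evaluate the swing measure (1.2.13) for a candidate control before any optimization is attempted. The sketch below does exactly this; the values of δ1, δ2, the initial state and the constant test controls are placeholders, not the data of [219].

```python
import numpy as np
from scipy.integrate import solve_ivp, trapezoid

# Placeholder model constants (delta_1, delta_2, gravity); see [219] for real data.
d1, d2, g = 0.2, 0.3, 9.81
w1, w2 = 1.0, 1.0   # weights in the swing objective (1.2.13)

def crane_rhs(t, x, v1, v2):
    # States: x1 trolley position, x2 rope length, x3 swing angle,
    # x4, x5, x6 their derivatives; controls v1, v2 given as functions of t.
    x1, x2, x3, x4, x5, x6 = x
    u1, u2 = v1(t), v2(t)
    return [x4, x5, x6,
            u1 - d1 * x3 * u2 + d1 * g * x3,               # (1.2.4)
            -d2 * x3 * u1 + u2,                            # (1.2.5)
            -(u1 - d1 * x3 * u2 + (1 + d1) * g * x3
              + 2 * x5 * x6) / x2]                         # (1.2.6)

T = 10.0
x0 = [0.0, 10.0, 0.0, 0.0, -0.5, 0.0]   # start of the B-to-C section (illustrative)
sol = solve_ivp(crane_rhs, (0.0, T), x0, dense_output=True,
                args=(lambda t: 0.1, lambda t: 0.05))      # constant test controls

# Evaluate the swing objective (1.2.13) on a fine grid by the trapezoidal rule.
tt = np.linspace(0.0, T, 1001)
x3, x6 = sol.sol(tt)[2], sol.sol(tt)[5]
swing = 0.5 * trapezoid(w1 * x3**2 + w2 * x6**2, tt)
print("swing measure g(v) =", swing)
```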

Example 1.2.3 (Optimal Production of Penicillin) The production of penicillin
takes place in a fed-batch fermentation process where the feed rate of the
substrate is the main control variable [200]. The aim is to find the substrate
feed rate that will optimize the final amount of penicillin. The problem may
be stated as follows. Maximize

$$g(u) = P(T) \qquad (1.2.16)$$

subject to

$$\begin{aligned}
\frac{dX(t)}{dt} &= \mu(X(t),S(t),V(t))\,X(t) &&(1.2.17)\\
\frac{dP(t)}{dt} &= \pi(X(t),S(t),V(t))\,X(t) - kP(t) &&(1.2.18)\\
\frac{dS(t)}{dt} &= -\sigma(X(t),S(t),V(t))\,X(t) + s_F u(t) &&(1.2.19)\\
\frac{dV(t)}{dt} &= u(t) &&(1.2.20)\\
X(0) &= 10.5 &&(1.2.21)\\
P(0) &= 0 &&(1.2.22)\\
S(0) &= 0 &&(1.2.23)\\
V(0) &= 7 &&(1.2.24)\\
V(T) &= 10 &&(1.2.25)\\
0 \le u(t) &\le u_{\max}, \quad \forall t \in [0,T). &&(1.2.26)
\end{aligned}$$

Here, μ, π and σ are the growth rate functions given by

$$\begin{aligned}
\mu(X,S,V) &= \frac{\mu_{\max} S}{\mu_1 X + S} &&(1.2.27)\\
\pi(X,S,V) &= \frac{\pi_{\max} S V}{\pi_1 V^2 + S(V + \pi_2 S)} &&(1.2.28)\\
\sigma(X,S,V) &= \frac{\mu}{\sigma_1} + \frac{\pi}{\sigma_2} + \frac{\sigma_3 S}{\pi_1 V + S}, &&(1.2.29)
\end{aligned}$$

where μmax, πmax, μ1, π1, σ1, σ2 and σ3 are given constants. Here, X represents
the biomass in the reactor, P is the amount of product, S is the amount of
substrate and V is the total volume of fluid in the reactor. This is a
challenging problem even for numerical algorithms, due to the rigid behavior
of the specific growth rate functions (see [200] and the references cited
therein).
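The rigid behavior of the growth rate functions is easy to observe in simulation. The following sketch encodes the dynamics (1.2.17)–(1.2.20) together with (1.2.27)–(1.2.29) and integrates them for a constant feed rate; all rate constants are illustrative stand-ins, since the text does not reproduce the values from [200].

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constants (stand-ins; the true values are in [200]).
mu_max, pi_max = 0.11, 0.0055
mu1, pi1, pi2 = 0.006, 0.0001, 0.1
s1, s2, s3 = 0.47, 1.2, 0.029
k, sF = 0.01, 500.0

def rates(X, S, V):
    mu = mu_max * S / (mu1 * X + S)                          # (1.2.27)
    pi = pi_max * S * V / (pi1 * V**2 + S * (V + pi2 * S))   # (1.2.28)
    sigma = mu / s1 + pi / s2 + s3 * S / (pi1 * V + S)       # (1.2.29)
    return mu, pi, sigma

def fedbatch_rhs(t, y, u):
    X, P, S, V = y
    mu, pi, sigma = rates(X, S, V)
    return [mu * X,                    # biomass      (1.2.17)
            pi * X - k * P,            # product      (1.2.18)
            -sigma * X + sF * u(t),    # substrate    (1.2.19)
            u(t)]                      # volume       (1.2.20)

# Constant feed rate chosen so that V(T) = 10 starting from V(0) = 7.
T = 120.0
u = lambda t: 3.0 / T
sol = solve_ivp(fedbatch_rhs, (0.0, T), [10.5, 0.0, 0.0, 7.0],
                args=(u,), rtol=1e-8, atol=1e-10)
print("final product P(T) =", sol.y[1, -1])
```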

Example 1.2.4 (Optimal Driving Strategy for a Train) The following system of
differential equations is a model of the dynamics of a train on a level track
similar to that discussed in [95]:

$$\begin{aligned}
\frac{dx_1}{dt} &= x_2\\
\frac{dx_2}{dt} &= \varphi(x_2)\,u_1 + \zeta_2 u_2 + \rho(x_2),
\end{aligned}$$
where x1 is the distance along the track, x2 is the speed of the train, u1 is
the fuel setting and u2 models the deceleration applied to the train by the
brakes. The function
$$\varphi(x_2) = \begin{cases}
\zeta_1/x_2, & \text{if } x_2 \ge \zeta_3 + \zeta_4,\\
\zeta_1/\zeta_3 + \eta_1\big(x_2 - (\zeta_3 - \zeta_4)\big)^2 + \eta_2\big(x_2 - (\zeta_3 - \zeta_4)\big)^3, & \text{if } \zeta_3 - \zeta_4 \le x_2 < \zeta_3 + \zeta_4,\\
\zeta_1/\zeta_3, & \text{if } x_2 < \zeta_3 - \zeta_4,
\end{cases}$$

where

$$\eta_1 = \zeta_1\left[\left(\frac{1}{\zeta_3+\zeta_4} - \frac{1}{\zeta_3}\right)\frac{3}{4\zeta_4^2} + \frac{1}{2\zeta_4(\zeta_3+\zeta_4)^2}\right]$$

and

$$\eta_2 = \zeta_1\left[-\left(\frac{1}{\zeta_3+\zeta_4} - \frac{1}{\zeta_3}\right)\frac{1}{4\zeta_4^3} - \frac{1}{4\zeta_4^2(\zeta_3+\zeta_4)^2}\right]$$
represent the tractive effort of the locomotive. A somewhat simpler form of φ
was used in [95], but the form used here models the actual data (see Figure 1
of [95]) more accurately. The function ρ is the resistive acceleration due to
friction, given by $\rho(x_2) = \zeta_5 + \zeta_6 x_2 + \zeta_7 x_2^2$. Here,
ζi, i = 1, ..., 7, are constants with given values ζ1 = 1.5, ζ2 = 1, ζ3 = 1.4,
ζ4 = 0.1, ζ5 = −0.015, ζ6 = −0.00003 and ζ7 = −0.000006. Also, x1(0) = 0,
x2(0) = 0, x1(1500) = 18000 and x2(1500) = 0, i.e., the train starts from the
origin at rest and comes to rest again 18000 m away at tf = 1500. The train is
not allowed to move backwards, so we require x2(t) ≥ 0 for all t ∈ [0, 1500].
The control is restricted to the discrete set

$$\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \in U = \left\{\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \end{bmatrix}\right\},$$

so that the train is either powered by the engine, coasting or being slowed by
the brakes. Note that power and brakes cannot be applied simultaneously. The
objective is to minimize the fuel used on the journey, i.e., to minimize

$$J_0(u) = \int_0^{1500} u_1\, dt.$$

More realistic versions of the problem include multiple fuel settings and speed
limit constraints; see [126].
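As a quick check of the model, the sketch below implements the tractive effort φ (with the cubic blend coefficients η1 and η2 as given above), the resistive term ρ, and a crude forward-Euler simulation under a simple power/coast/brake schedule drawn from the discrete set U. The switching times are arbitrary illustrative choices, not an optimal driving strategy.

```python
# Model constants from the text.
z = [None, 1.5, 1.0, 1.4, 0.1, -0.015, -0.00003, -0.000006]  # z[1..7] = zeta_1..zeta_7

# Cubic blend coefficients eta_1, eta_2 (as reconstructed above); they make
# phi continuously differentiable at both joining speeds.
a = 1.0 / (z[3] + z[4]) - 1.0 / z[3]
eta1 = z[1] * (a * 3.0 / (4.0 * z[4]**2) + 1.0 / (2.0 * z[4] * (z[3] + z[4])**2))
eta2 = z[1] * (-a / (4.0 * z[4]**3) - 1.0 / (4.0 * z[4]**2 * (z[3] + z[4])**2))

def phi(v):
    # Tractive effort: hyperbolic above zeta3+zeta4, constant below zeta3-zeta4,
    # cubic blend in between.
    if v >= z[3] + z[4]:
        return z[1] / v
    if v < z[3] - z[4]:
        return z[1] / z[3]
    s = v - (z[3] - z[4])
    return z[1] / z[3] + eta1 * s**2 + eta2 * s**3

def rho(v):
    return z[5] + z[6] * v + z[7] * v**2   # resistive acceleration

def control(t):
    # Illustrative power / coast / brake schedule (not optimal).
    if t < 300.0:  return (1.0, 0.0)    # power
    if t < 1300.0: return (0.0, 0.0)    # coast
    return (0.0, -1.0)                  # brake

h, x1, x2, fuel = 0.01, 0.0, 0.0, 0.0
for n in range(int(1500.0 / h)):
    u1, u2 = control(n * h)
    x1 += h * x2
    x2 = max(x2 + h * (phi(x2) * u1 + z[2] * u2 + rho(x2)), 0.0)  # enforce x2 >= 0
    fuel += h * u1
print(f"distance = {x1:.0f}, final speed = {x2:.3f}, fuel J0 = {fuel:.1f}")
```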

Example 1.2.5 (Optimal Control of Crystallization) Crystallization from
solution is a purification and separation technique of great economic
importance to the chemical industry. The quality of the crystallization
product and the efficiency of downstream product recovery processes are
primarily determined by the crystal size distribution (CSD). Here, we deal
with the precipitation of aluminium trihydroxide (Al(OH)3) from supersaturated
sodium aluminate solutions. As part of the Bayer process, it represents an
important step in the production of aluminium. In industrial practice, the
precipitation of aluminium trihydroxide is carried out in a continuous manner
using a cascade of crystallizers. For the purpose of a theoretical analysis,
however, the complete process can be approximated by a single batch cooling
crystallizer.

Following the approach detailed in [94], we discretize the solid particle
distribution into m distinct size intervals $[L_i, L_{i+1}]$, i = 1, ..., m,
where size is a measure of the diameter of a particle. A large range of
particle sizes can be captured if the $L_i$ are defined by $L_{i+1} = rL_i$,
i = 1, ..., m, where $r = \sqrt[3]{2}$ and $L_1 = 3.7 \times 10^{-6}$ metres.
Let $N_i$ denote the number of particles in the i-th size interval, and let C
be the concentration of the solution. The rate of change of each $N_i$
consists of three terms that reflect the effects of nucleation, crystal growth
and agglomeration. The nucleation of new particles is assumed to occur only
for the first size interval. The dynamics of the number of crystals in
individual size intervals as well as the solute concentration are then
described as follows:
$$\frac{dN_1}{dt} = \underbrace{\frac{2G}{L_1(1+r)}\left[\left(1 - \frac{r^2}{r^2-1}\right)N_1 - \frac{r}{r^2-1}N_2\right]}_{\text{growth}} + \underbrace{B_u}_{\text{nucleation}} - \underbrace{\beta N_1 \sum_{j=1}^{m} N_j}_{\text{agglomeration}} \qquad (1.2.30)$$

$$\begin{aligned}
\frac{dN_i}{dt} ={}& \underbrace{\frac{2G}{L_i(1+r)}\left[\frac{r}{r^2-1}N_{i-1} + N_i - \frac{r}{r^2-1}N_{i+1}\right]}_{\text{growth}}\\
&+ \underbrace{\beta\left[N_{i-1}\sum_{j=1}^{i-2} 2^{j-i+1}N_j + \frac{1}{2}(N_{i-1})^2 - N_i\sum_{j=1}^{i-1} 2^{j-i}N_j - N_i\sum_{j=i}^{m} N_j\right]}_{\text{agglomeration}},\\
&\quad i = 2, \ldots, m-1, \qquad (1.2.31)
\end{aligned}$$

$$\begin{aligned}
\frac{dN_m}{dt} ={}& \underbrace{\frac{2G}{L_m(1+r)}\left[\frac{r}{r^2-1}N_{m-1} + N_m\right]}_{\text{growth}}\\
&+ \underbrace{\beta\left[N_{m-1}\sum_{j=1}^{m-2} 2^{j-m+1}N_j + \frac{1}{2}(N_{m-1})^2 - N_m\sum_{j=1}^{m-1} 2^{j-m}N_j - (N_m)^2\right]}_{\text{agglomeration}}, \qquad (1.2.32)
\end{aligned}$$

$$\frac{dC}{dt} = \frac{-3k_v\rho_s}{\varepsilon}\, G \sum_{i=1}^{m} N_i S_i^2 - \frac{\rho_s}{\varepsilon}\, k_v S_1^3 B_u. \qquad (1.2.33)$$

We consider m = 25 size intervals. Here, $S_i$, i = 1, ..., m, denotes the
average particle size in the interval $[L_i, L_{i+1}]$; equation (1.2.33)
models the rate of change of the concentration of the solution, assuming that
the change of volume is negligible; G is a measure of growth; $B_u$ denotes
the rate of nucleation in the first size interval; and β is known as the
agglomeration kernel (a measure of the frequency of collisions between
particles), assumed here to be independent of the particle size. Furthermore,
$k_v = 0.5$ is a volume shape factor, $\varepsilon = 0.8$ and
$\rho_s = 2420\ \mathrm{kg/m^3}$ is the density of the resulting solid.
Furthermore, letting T denote the temperature of the solution in degrees
Kelvin, we can define the solubility as a function of temperature and caustic
concentration [194], i.e.,

$$C^{*}_{\mathrm{Al_2O_3}} = C_{\mathrm{Na_2O}}\, e^{\,6.21 - \frac{2486.7}{T} + \frac{1.0875\,C_{\mathrm{Na_2O}}}{T}},$$

where $C_{\mathrm{Na_2O}} = 100\ \mathrm{kg/m^3}$. The supersaturation, defined
as $\Delta C = C - C^{*}$, is the main driving force for the three processes
of nucleation, growth and agglomeration. The growth is modeled by
$G = k_g(\Delta C)^2$, where, assuming $C_{\mathrm{Na_2O}} = 100\ \mathrm{kg/m^3}$
as before, $k_g = 6.2135\, e^{-7600/T}$. The dependence of the agglomeration
kernel on supersaturation is modeled by $\beta = k_a(\Delta C)^4$, where
$k_a = 6.8972 \times 10^{-21}\, T - 2.29 \times 10^{-18}$. Finally, the
dependence of nucleation on $\Delta C$ and T is suitably modeled by

$$B_u = k_n(\Delta C)^{0.8}\left(k_s\sum_{i=1}^{m} N_i S_i^2\right)^{1.7},$$
where $k_s = \pi$ is a surface shape factor and $k_n$ is an empirical
coefficient depending on temperature. It is required that all functions
involved in the dynamics are continuously differentiable. The
$(\Delta C)^{0.8}$ term appearing in the equations above does not satisfy this
assumption as $\Delta C \to 0$. Hence, we replace this term by a smooth cubic
approximation for small values of $\Delta C$, i.e.,

$$B_u = k_n f_c(\Delta C)\left(k_s\sum_{i=1}^{m} N_i S_i^2\right)^{1.7},$$

where

$$f_c(\Delta C) = \begin{cases}
(\Delta C)^{0.8}, & \text{if } \Delta C > 1,\\
-1.2(\Delta C)^3 + 2.2(\Delta C)^2, & \text{if } 0 \le \Delta C < 1.
\end{cases}$$
Furthermore, it has been shown experimentally that nucleation decreases
markedly at temperatures above 70 °C and does not occur beyond 80 °C. For
temperatures below 70 °C, we take $k_n = 9.8 \times 10^{22}\, e^{-10407.265/T}$,
while $k_n = 0$ for temperatures above 80 °C. In between, we use a smooth
cubic interpolation for $k_n$, i.e.,

$$k_n(T) = \begin{cases}
9.8 \times 10^{22}\, e^{-10407.265/T}, & \text{if } T \le 343.2\ \mathrm{K},\\
(0.002c_1 + 0.01c_2)(T - 353.2)^3 + (0.03c_1 + 0.1c_2)(T - 353.2)^2, & \text{if } 343.2 < T \le 353.2\ \mathrm{K},\\
0, & \text{if } T > 353.2\ \mathrm{K},
\end{cases}$$
where $c_1 = 52673.694$ and $c_2 = 4654.0948$, and T is measured in degrees
Kelvin. These dynamics are active over $[0, t_f]$, where $t_f$ may be fixed
or variable.
A narrow size distribution, i.e., one with a small variance, is usually
desired, along with a large final mean crystal size. Thus, the aim is to
maximize

$$J(T) = -\ln\left(\frac{M_5}{M_4^4} - \frac{1}{M_3 M_4^2}\right),$$

where $M_3 = \sum_{i=1}^{m} N_i(t_f)S_i^3$, $M_4 = \sum_{i=1}^{m} N_i(t_f)S_i^4$
and $M_5 = \sum_{i=1}^{m} N_i(t_f)S_i^5$. This is equivalent to maximizing the
ratio of the mean crystal size over the variance in the crystal size. Other
versions of the problem, where seed crystals are added to the solution
throughout the process, can also be readily formulated [214].
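The two smoothing devices used above, the cubic replacement $f_c$ for $(\Delta C)^{0.8}$ and the cubic interpolation for $k_n$, are straightforward to implement and to test for C¹ continuity; a small sketch follows, using only the constants quoted in the text.

```python
import numpy as np

c1, c2 = 52673.694, 4654.0948

def f_c(dC):
    # Smooth cubic replacement for dC**0.8 near dC = 0; value and slope match
    # at dC = 1: -1.2 + 2.2 = 1 and -3.6 + 4.4 = 0.8.
    return dC**0.8 if dC > 1.0 else -1.2 * dC**3 + 2.2 * dC**2

def k_n(T):
    # Nucleation coefficient for temperature T in kelvin. By construction, the
    # cubic branch vanishes together with its slope at T = 353.2 K, and the
    # chosen coefficients give it the value c1 with slope c2 at T = 343.2 K,
    # so the interpolation joins the neighbouring branches smoothly.
    if T <= 343.2:
        return 9.8e22 * np.exp(-10407.265 / T)
    if T > 353.2:
        return 0.0
    return ((0.002 * c1 + 0.01 * c2) * (T - 353.2)**3
            + (0.03 * c1 + 0.1 * c2) * (T - 353.2)**2)

for dC in (0.0, 0.5, 1.0, 2.0):
    print(f"f_c({dC}) = {f_c(dC):.4f}")
for T in (340.0, 345.0, 350.0, 353.2, 360.0):
    print(f"k_n({T} K) = {k_n(T):.4e}")
```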

1.3 Computational Algorithms

Application of the well-known Pontryagin maximum principle can yield solutions
to many optimal control problems by analytic means. This is particularly true
in the field of economics, where many interesting problems have
ularly true in the field of economics, where many interesting problems have
been solved this way, often yielding important practical principles (see, for
example, [21, 110, 226]). Many other theoretical results regarding various
classes of optimal control problems may be found in the literature (see, for
example, [3–5, 9, 33, 40, 69, 83, 88–90, 121, 130, 149, 250, 253, 276]). However,
most practical problems arising in engineering and science have a high degree
of complexity, and their solutions may not be obtainable by analytic means.
Hence, many computational algorithms have been developed to calculate
numerical solutions of optimal control problems. Numerical optimal control
computation methods can be roughly divided into two categories: indirect
methods and direct methods. In an indirect method, the maximum principle is
used to determine the form of the optimal control in terms of the state and
costate variables. This gives rise to a multiple-point boundary-value problem,
and the solution is obtained by solving this boundary-value problem
numerically. In a direct method, the optimal control problem is approximated
as a nonlinear programming problem (NLP) through the discretization of
the control and/or state over the time interval. The NLP is then solved us-
ing optimization techniques. Several survey articles and books on optimal
control computation can be found in the open literature (see, for example,
[48, 69, 148, 210, 215]). For optimal control computational methods based on
Euler discretization, see, for example, [36, 49, 50, 85, 113, 115, 178, 258].
In what follows, we provide an overview of several numerical solution methods.

1.3.1 Dynamic Programming and Iterative Dynamic Programming

The Iterative Dynamic Programming (IDP) technique is a computational variation
of the dynamic programming principle. The technique was initially
developed in [173] and then refined in [175] to improve the computational
efficiency. The method uses a grid structure for discretizing both the state
variables and the control variables. The grid of the state defines accessible
points in the state trajectory, and the grid of the controls defines admis-
sible control values. The method typically starts with a coarse grid over a
large region of the state space. Successive refinements of the grid are then
implemented around the optimal trajectory until a satisfactory control pol-
icy is obtained. Initial development employed piecewise constant control, and
this was later extended to piecewise linear continuous control policies [174].
Constraints are handled by a penalty function approach. The IDP technique
has been successfully applied to a wide range of optimal control problems
particularly in the field of chemical engineering [175]. According to [175], ad-
vantages of the IDP technique include its robustness, and, as a gradient-free
method, its ability to steer away from local optima. It is also noted in [175]
that the method involves numerous algorithmic parameters that can be dif-
ficult to tune for a novice user. These include the region contraction factor,
the number of allowable values for controls, the number of grid points, the
initial region size and the restoration factor.

1.3.2 Leapfrog Algorithm and STC Algorithm

The algorithm developed in [112] is known as the leapfrog algorithm. In this


algorithm, an initially feasible trajectory is given and subdivided over the
time horizon. In each subinterval a piecewise-optimal trajectory is obtained.
The junctions of these sub-trajectories are then updated through a scheme of
midpoint maps. Under some broad assumptions, the sequence of trajectories
is shown to converge to a trajectory that satisfies the Maximum Principle.
In [117], the switching time computation (STC) method is incorporated in a
time-optimal bang-bang control (TOBC) algorithm [111] for solving a class of optimal control problems governed by nonlinear dynamical systems with a single input. In this method, a concatenation of constant-input arcs is used to move
from an initial point to the target, and an optimization procedure is utilized to
find the necessary time lengths of the arcs. The difficulties of the STC method

in finding the necessary arc time lengths are discussed. The gradients with
respect to the switching time variables are calculated in a manner that avoids
the need for costate variables, and this can be a computational advantage
in some problems. Derivation of these gradients is given in Section 7.4.1,
where the limitations of this approach are also discussed. In [111], the time-
optimal switching (TOS) algorithm is developed for solving a class of time
optimal switching control problems involving nonlinear systems with a single
control input. In this algorithm, the problem is formulated in the arc times
space, where arc times are the durations of the arcs. A feasible switching
control is found using the STC method [117] to move from an initial point
to a target point with a given number of switchings. The cost is expressed as
the summation of the arc times. Then, by using a constrained optimization
technique, a minimum-time switching control solution is obtained. In [186], a
numerical scheme is developed for constructing optimal bang-bang controls.
Then, the second order sufficient conditions developed in [185] are used to
check numerically whether the controls obtained are optimal.

1.3.3 Control Parametrization

The control parametrization method (see, for example, [36, 63, 69, 89, 143,
145, 148, 151, 153, 154, 160–162, 164, 166, 169–171, 181, 215, 229, 230,
238, 244, 249, 253, 255, 260, 284, 288, 294, 298, 300, 301, 311]) relies on
the discretization of the control variables using a finite set of parameters.
This is most commonly done by partitioning the time horizon of a given
problem into several subintervals such that each control can be approxi-
mated by a piecewise constant function that is consistent with the corre-
sponding partition. The approximating piecewise constant function can be
defined in terms of a finite set of parameters, known as control parame-
ters. Upon such an approximation, an optimal control problem becomes a fi-
nite dimensional optimal parameter selection problem. In the real world, optimal
control problems are often subject to constraints on the state and/or con-
trol. These constraints can be point constraints and/or continuous inequality
constraints. The point constraints are expressed as functions of the states
at the end point or some intermediate interior points of the time horizon.
These point constraints can be handled without much difficulty. However, the continuous inequality constraints are expressed as functions of the states and/or controls over the entire time horizon and hence are
very difficult to handle. Through the control parametrization, a continuous
inequality constrained optimal control problem is approximated as a con-
tinuous inequality constrained optimal parameter selection problem, which
can be viewed as a semi-infinite programming (SIP) problem involving a dynamical system. A popular approach to deal with the continuous inequality
constraints on the state and control is known as the constraint transcrip-

tion (see, for example, [76, 103, 135, 136, 148, 245, 246, 249, 253, 259]).
Details will be given in later chapters. Another effective method to handle
continuous inequality constraints is the exact penalty function method (see, for example, [134, 300, 301]). It is also discussed in detail in a later chapter.
After the use of the constraint transcription method or the exact penalty
function method, the continuous inequality constrained optimal parameter
selection problem becomes an optimal parameter selection problem subject
to constraints in the form of the objective functional, and these constraints
are called canonical constraints. Each of these optimal parameter selection
problems with canonical constraints can be regarded as a mathematical pro-
gramming problem, and its solution is to be obtained by constrained opti-
mization techniques. The control parametrization technique has been used
in conjunction with the constraint transcription or the exact penalty func-
tion extensively in the literature (see, for example, [104, 134, 138, 145, 162,
164, 168, 171, 180, 181, 214, 215, 236, 244, 245, 249, 254, 294]). In [148], a
survey and recent developments of the technique are presented. The tech-
nique has been proven to be very efficient in solving a wide range of op-
timal control problems. In particular, several computational algorithms to
deal with a variety of different classes of problems together with a sound
theoretical convergence analysis are reported in the literature (see, for ex-
ample, [230, 237, 240, 245, 246, 248, 249, 253, 260, 279–281]). Under some
mild assumptions, convergence of the sequence of approximate optimal costs
obtained from a corresponding sequence of partition refinements of the time
horizon to the optimal cost of the original optimal control problem has been
demonstrated. Furthermore, the solution obtained for each approximate op-
timal control problem, which is regarded as a constrained optimization prob-
lem, is such that the KKT conditions are satisfied. However, there is no
proof of the convergence of the approximate optimal controls to true optimal
control. Therefore, the approximate optimal control obtained is likely to be
not identically the same as the true optimal control, but the difference in
the approximate optimal cost and the true optimal cost is insignificant. This
is sufficient in real-world applications. In the next section, full discretization
schemes based on Runge-Kutta discretization of the optimal control problems
will be briefly discussed. These full discretization schemes can solve some op-
timal control problems such that the controls obtained can be verified to
satisfy the optimality conditions.
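As a concrete illustration of the basic idea, the following is a minimal sketch of control parametrization with a piecewise constant control, written in Python for a toy problem (the problem data, partition and Euler integrator are illustrative choices, not the implementation used in this book or in any of the packages cited above): minimize x(1)² + ∫₀¹ u(t)² dt subject to ẋ = u, x(0) = 1.

```python
# A minimal sketch of control parametrization (assumed toy problem:
# minimize x(1)^2 + int_0^1 u^2 dt, x' = u, x(0) = 1).
import numpy as np
from scipy.optimize import minimize

N = 10                # number of control subintervals
M = 20                # Euler integration steps per subinterval
h = 1.0 / (N * M)     # integration step size

def cost(sigma):
    """Cost as a function of the control parameters sigma: sigma[k] is
    the (constant) value of u on the k-th subinterval."""
    x, J = 1.0, 0.0
    for k in range(N):
        u = sigma[k]
        for _ in range(M):
            J += h * u * u    # accumulate the running cost
            x += h * u        # forward Euler step for x' = u
    return J + x * x          # add the terminal cost x(1)^2

res = minimize(cost, np.zeros(N), method="BFGS")
print(res.x)  # the approximate optimal piecewise constant control
```

Here each control parameter is the value of the control on one subinterval, so the problem becomes a finite dimensional optimal parameter selection problem; refining the partition enlarges the parameter vector, which is the setting in which the convergence results cited above are stated.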
Finally, note that the standard control parametrization approach assumes
a fixed partition for the piecewise constant (or polynomial) approximation of
the control. In many practical problems, it is desirable to allow the knot points
of the partition to be variable as well. For this, the Control Parametrization
Enhancing Transform (CPET) is introduced in the literature (see, for exam-
ple, [125, 126, 138, 215, 256]). It is now called the time scaling transformation
to better reflect the actual meaning of the transformation. It is now widely
used in the literature, such as [106–108, 142, 144, 148, 150, 151, 162, 165–
171, 311]. The time scaling transformation can be used to convert problems

with variable knot points for the control into equivalent problems where the
control is defined on a fixed partition once more. The transformed problem
can then be readily solved by the standard control parametrization approach.
Details of the transformation and many of its applications are described in
later chapters.
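In generic form (the symbols θ_k, σ^k and the number of subintervals p are introduced here only for illustration; the precise construction appears in later chapters), the transformation treats the subinterval durations θ_k as additional decision variables and sets

$$ \frac{dt(s)}{ds} = \theta_k, \quad s \in [k-1, k),\ k = 1, \ldots, p, \qquad t(0) = 0, $$

so that, on the fixed partition in the new time variable s, the dynamics ẋ(t) = f(x(t), u(t)) become dx̃(s)/ds = θ_k f(x̃(s), σ^k) for s ∈ [k−1, k), where σ^k is the control value on the k-th subinterval.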
In [63], an algorithm is developed for solving constrained optimal control
problems. Through the control parametrization, a constrained optimal con-
trol problem is approximated by a SIP problem. The proposed algorithm seeks to locate a feasible point at which the KKT conditions are satisfied to a specified tolerance. Based on the right hand restriction method proposed in [195] for standard SIP, the algorithm solves the path constrained optimal control problem iteratively, approximating it by restricting the right hand side of the path constraint at a finite number of time points. Then, the approximate optimization problem with finitely many constraints is solved such that local optimality conditions are satisfied at each iteration. The algorithm finds a feasible point in a finite number of iterations such that the first order KKT conditions are satisfied to a specified accuracy.

1.3.4 Collocation Methods

For a direct local collocation method, the state and control are approximated
using a specified functional form. The time interval [t0 , T ] is partitioned into
N subintervals [t_{i−1}, t_i], i = 1, . . . , N, where t_N = T. Since the state is required to be continuous across intervals, the following condition is imposed at each interior knot:

$$ x(t_i^-) = x(t_i^+), \qquad i = 1, \ldots, N-1, $$

where $x(t_i^-) = \lim_{t \uparrow t_i} x(t)$ and $x(t_i^+) = \lim_{t \downarrow t_i} x(t)$. Two types of discretiza-
tion schemes are normally used in the development of algorithms for solving
optimal control problems: (i) Runge-Kutta methods and (ii) orthogonal collo-
cation methods. Runge-Kutta discretization schemes are normally in implicit
form. This is because they have better stability properties than those of ex-
plicit methods. In [212], an algorithm is developed to solve optimal control problems based on an orthogonal collocation method, where Legendre-Gauss points are chosen as the collocation points and cubic splines are used over each subinterval. In [52], Lagrange polynomials are used instead of cubic splines. Note that the application of direct local collocation to an optimal control problem gives rise to a nonlinear programming problem of very high dimension, containing thousands to tens of thousands of variables and a similar number of constraints. However, the nonlinear programming problem tends to be very sparse, with many entries of the constraint Jacobian being zero. Thus, it can be solved efficiently using nonlinear programming solvers.

A pseudospectral method is a global orthogonal collocation method. It approximates the state using a global polynomial, and the collocation is carried out at appropriately chosen discretized points. Typically, the basis functions used are Chebyshev or Lagrange polynomials. For local collocation, the degree of the polynomial is fixed while the number of meshes is varied. On the other hand, for a pseudospectral method, the number of meshes is fixed while the degree of the polynomial is varied. Pseudospectral methods were originally developed to solve problems in computational fluid dynamics [39].
Pseudospectral methods have been used to develop numerical algorithms for solving optimal control problems, where appropriate discretized points (with the basis functions being Lagrange polynomials) are to be chosen. For example, the Legendre-Gauss-Lobatto points or Chebyshev-Gauss-Lobatto points [52] are chosen as the discretized points for the Gauss-Lobatto pseudospectral method, the Legendre-Gauss points [212] are used as discretized points in the Gauss pseudospectral method, and for the Radau pseudospectral method [101], Legendre-Gauss-Radau points are used as discretized points.

1.3.5 Full Parametrization

Full parametrization (discretization) is a popular approach for solving optimal control problems, where an optimal control problem is discretized as
a finite dimensional optimization problem by using Euler, midpoint, trape-
zoid or, in general, Runge-Kutta discretization schemes. See, for example,
[13, 36, 49, 50, 85, 113, 115, 178, 196]. Among these discretization schemes,
the Euler discretization scheme is the simplest and most popular one, for which the optimality conditions can be expressed easily. In [36], two discretization schemes—full discretization and control discretization—are developed to approximate optimal control problems as nonlinear constrained optimization problems. Then, an SQP method is utilized to find a (local) optimal control. In addition, the SQP method is used again to check numerically whether the obtained control satisfies the second order sufficient conditions and to carry out the post-optimal calculation of the adjoint variables. The
Inexact Restoration (IR) method is an iterative finite dimensional optimiza-
tion method developed in [26, 179, 182], which is an extension of the gradient
restoration methods proposed in [191–193]. Each IR iteration consists of two phases—the feasibility phase and the optimality phase—which are solved as two separate subproblems. It has been shown that if feasibility is
improved, while the magnitude of the update in control variables is kept small,
then the magnitude of the update in state variables is also small. A local con-
vergence analysis and an associated algorithm for the IR method are given
in [26]. This method has been applied to the discretization of optimal control
problems in [13, 113, 115]. In particular, it is applied to the Euler discretization of the state and control of a constrained optimal control problem in [13], where the

convergence of the Inexact Restoration method on the discretized (finite dimensional optimization) problem to an approximate solution, and the convergence of the approximate solution to a solution of the original continuous time optimal control problem, are established. A practical algorithm is developed for the IR method in [85]. Using the modeling language AMPL [182], an adapted version of the algorithm is coded for constrained optimal control problems, with the optimization software Ipopt [186] used as the solver. The convergence of the solution of the Euler discretized problem to a continuous time solution of the original constrained optimal control problem is also established
in [116], where the time derivative of the pure state constraints is adjoined to
the Hamiltonian function [185]. This approach is known as the indirect adjoin-
ing approach. The adjoint (or costate) variables so obtained differ from those
obtained by using the direct adjoining approach. Four discretization methods for ordinary differential equations and differential algebraic equations are proposed in [69]: one-step methods, backward differentiation formulae, linearized implicit Runge-Kutta methods and automatic step-size selection. For the discretization of optimal control problems,
the approaches being proposed are full discretization, reduced discretization
and control discretization. The convergence results of Euler discretization
and Runge-Kutta discretization are obtained in [69], where real-time control,
model predictive control and mixed-integer optimal control are also covered.
In [267], a class of terminal optimal control problems involving linear sys-
tems is considered. For Runge-Kutta direct discretizations of these terminal
optimal control problems, error estimates are obtained. If certain sufficient
conditions for structural stability are satisfied, the estimate is of first order;
otherwise, the estimate is of fractional order.
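To contrast with the control parametrization sketch in Section 1.3.3, the following is a minimal sketch of full Euler discretization on the same toy problem (again with illustrative names and data): both the states and the controls become decision variables, and each Euler step enters the NLP as an equality ("defect") constraint.

```python
# A minimal sketch of full (Euler) discretization: states x_0..x_N and
# controls u_0..u_{N-1} are all decision variables.
import numpy as np
from scipy.optimize import minimize

N, h = 50, 1.0 / 50

def unpack(z):
    return z[:N + 1], z[N + 1:]          # states, then controls

def cost(z):
    x, u = unpack(z)
    return h * np.sum(u ** 2) + x[-1] ** 2   # running + terminal cost

def defects(z):
    x, u = unpack(z)
    return x[1:] - x[:-1] - h * u        # Euler steps: x_{k+1} = x_k + h u_k

cons = [{"type": "eq", "fun": defects},
        {"type": "eq", "fun": lambda z: z[0] - 1.0}]   # initial condition x_0 = 1
res = minimize(cost, np.zeros(2 * N + 1), method="SLSQP", constraints=cons)
print(res.fun)
```

As noted above for collocation, the resulting NLP is large but very sparse: each defect constraint couples only neighbouring variables, which is what makes such formulations tractable for sparsity-exploiting solvers.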

1.4 Optimal Control Software Packages

Several general-purpose software packages are available in the literature for solving constrained optimal control problems.
Recursive Integration Optimal Trajectory Solver (RIOTS) [224] is a col-
lection of programs for solving optimal control problems, designed as a MAT-
LAB [183] toolbox. The method underlying the program is the representation
of controls by finite dimensional B-splines to discretize the optimal control
problems. In this sense, it is an example of the control parametrization ap-
proach. The integration of the system dynamics is carried out using fixed
step-size Runge-Kutta integration. The use of Runge-Kutta method to nu-
merically integrate the system dynamics leads to approximations of the op-
timal control problem. It is shown in [225] that there exists a class of higher
order explicit Runge-Kutta methods that provide consistent approximations
to the original problem, where consistency is defined according to the theory in-
troduced in [203]. Consequently, it is guaranteed that stationary points of the

approximating problems converge to stationary points of the original prob-


lem and that global solutions (or strict local solutions) of the approximating
problems converge to global (or local) solutions of the original problems as
the step-size of the Runge-Kutta method is decreased. Hence, an optimal control problem is solved by solving a sequence of approximating discrete
time optimal control problems. The software can solve a large class of finite
time optimal control problems involving path and terminal time constraints,
control bounds, variable initial conditions and problems with integral as well
as endpoint cost functionals. It also has a special feature for dealing with sin-
gular optimal control problems. It is mentioned in [225] that RIOTS comes
with some limitations on the type of problems it can effectively solve. Among
those, it has difficulty in solving problems with inequality state constraints
that require a very high level of discretization. Another disadvantage is as-
sociated with the consistent approximations, which require that the approx-
imating problems be defined on finite dimensional subspaces of the control
space to which Runge-Kutta methods can be extended. The selection of the
control subspaces affects both the accuracy of numerical integration and the
accuracy of the approximate solutions to the original problem.
A general-purpose MATLAB software program called GPOPS-II [211]
is now available for solving multiple-phase optimal control problems using
variable-order Gaussian quadrature collocation methods. In this software, a
Legendre-Gauss-Radau quadrature orthogonal collocation method is used to
approximate the continuous time optimal control problem by a large sparse
nonlinear programming (NLP) problem. Then, an adaptive mesh refinement
scheme is utilized to determine the number of mesh intervals and the degree
of the approximating polynomial within each mesh interval such that a spec-
ified accuracy is achieved. The optimization solver supplied with the software is an NLP solver, which can be either a quasi-Newton or a Newton solver. The derivatives
of the functions involved in the optimal control problem, which are required
by the NLP solver, are calculated using sparse finite differencing.
The optimal control software package MISER3.3 [104] is an implementa-
tion of the control parametrization technique. It has considerable flexibility
in the range of features that can be handled. A large variety of constraints
is catered for by allowing a general canonical constraint formulation as well
as several special types of constraints. In particular, the algorithm of [103] is
applied to handle continuous time inequality constraints on the state, and it
has been incorporated in the software. Note that an approach developed in
[259] and the exact penalty function approach developed in [299, 301, 305]
can also be used as alternative means of handling continuous time inequal-
ity constraints. MISER3.3 has been successfully used to solve a significantly
large number of practical optimal control problems. See, for example, those
mentioned above in this chapter and the relevant references cited in these
papers. MISER3.3 is written in the FORTRAN programming language as
well as in the MATLAB environment. Recently, a new version known as Vi-
sual MISER [294] was developed with the Visual FORTRAN compiler within

the Microsoft Visual Studio environment. It provides an easy-to-use interface


while retaining the computational efficiency of the MISER3.3 software.
NUDOCCCS (NUmerical Discretization method for Optimal Control prob-
lems with Constraints in Controls and States) [34] is, like MISER3.3, a
FORTRAN-based package aimed at solving a quite general class of optimal
control problems. The underlying ODE system is integrated by an implicit
Runge-Kutta scheme, and NUDOCCCS solves the resulting nonlinear prob-
lem with a sequential quadratic programming method. The package has been
successfully used to solve a wide range of practical optimal control problems.
See [35] for an example.
There are other software packages for solving optimal control problems.
See, for example, [1].
Chapter 2
Unconstrained Optimization Techniques

2.1 Introduction

The numerical algorithms to be developed for optimal control computation in


this book are based on the control parametrization technique in conjunction
with a novel time scaling transform. Essentially, an optimal control prob-
lem with its control functions being approximated by an appropriate linear
combination of spline functions is reduced to an optimal parameter selec-
tion problem. Thus, the determination of the optimal control function is reduced
to the selection of optimal parameters representing the control. Although
the constraint of the dynamical system still exists, the problem may, after
the parametrization, be viewed as an implicit mathematical programming
problem. The solution to the optimal control problem may thus be obtained
through solving a sequence of resulting mathematical programming prob-
lems, although the computational procedure is much more involved. Thus,
a basic understanding of the fundamental concepts, theory and methods of
mathematical programming is required.
To begin, we point out that the notation used in this chapter is applicable
only to this chapter and Chapters 3 and 4. For example, the n-vector x used
in this chapter, Chapter 3 and Chapter 4 should not be confused with the
state vectors in other chapters. Also, the vector norm is denoted by ‖·‖ in
these three chapters, but by |·| in later chapters. In this book, the gradient
of a function is assumed to be a column vector.
There are already many excellent books on nonlinear optimization. For
example, see [10, 22, 61, 172, 187, 199, 232, 275]. In this chapter, we sum-
marize some essential concepts and results in nonlinear unconstrained opti-
mization. It is based on the lecture notes on optimization prepared and used
by the authors. These lecture notes have also been used by their colleagues.
In addition to these lecture notes, this chapter also includes some important


results from the references listed at the beginning of the paragraph and from
[60, 87, 204, 207, 220–222].
Unlike optimal control problems, mathematical programming problems
are static in nature. The general constrained mathematical programming
problem is to find an x ∈ Rn to minimize the objective function

f (x) (2.1.1)

subject to the constraints

hi (x) = 0, i = 1, . . . , m, (2.1.2)
hi (x) ≤ 0, i = m + 1, . . . , m + r, (2.1.3)

where f and hi , i = 1, . . . , m + r, are continuously differentiable functions.


Let Ω be the set which consists of all x ∈ Rn such that (2.1.2) and (2.1.3)
are satisfied. This set is called the feasible set.

2.2 Basic Concepts

For completeness, we shall first present some important concepts and results
in unconstrained optimization techniques. Some basic theory and algorithms
for constrained optimization will be given in Chapter 3. The unconstrained
optimization problem is to choose an x = [x1, . . . , xn]^⊤ ∈ Rn to minimize
an objective function f (x). It is a special case of the general problem in
Section 2.1, where the feasible region is the entire space Rn .
Definition 2.2.1 The point x∗ ∈ Rn is said to be a global minimum (mini-
mizer) if
f (x∗ ) ≤ f (x), for all x ∈ Rn .
Definition 2.2.2 The point x∗ ∈ Rn is said to be the strict global minimum
(minimizer) if
f (x∗ ) < f (x), for all x ∈ Rn \{x∗ }.

Definition 2.2.3 The point x∗ ∈ Rn is said to be a local minimum (mini-


mizer) if there exists an ε > 0 such that

f (x∗ ) ≤ f (x), for all x ∈ Nε (x∗ ),

where Nε(x∗) = {x ∈ Rn : ‖x − x∗‖ < ε} is an ε-neighbourhood of x∗.

Definition 2.2.4 The point x∗ ∈ Rn is said to be a strict local minimum


(minimizer) if there exists an ε > 0 such that

f (x∗ ) < f (x), for all x ∈ Nε (x∗ )\{x∗ }.



Let f(x) be a continuously differentiable function defined on an open set X ⊂ Rn. Then, by Taylor's Theorem, for any two points x and y = x + s in X, there exists a θ, 0 ≤ θ ≤ 1, such that

$$ f(y) = f(x) + \big(g(\theta x + (1-\theta)y)\big)^{\top} s, \tag{2.2.1} $$

where

$$ g(\xi) = \nabla_{\xi} f(\xi) = \nabla f(\xi) = \left[\frac{\partial f(\xi)}{\partial \xi_1}, \frac{\partial f(\xi)}{\partial \xi_2}, \ldots, \frac{\partial f(\xi)}{\partial \xi_n}\right]^{\top}, $$

ξ = [ξ1, ξ2, . . . , ξn]^⊤, s = [s1, s2, . . . , sn]^⊤ and the superscript “⊤” denotes the transpose.
The formula (2.2.1) can be generalized to functions which are twice continuously differentiable as follows. Let f(x) be a twice continuously differentiable function defined on an open set X ⊂ Rn. Then, for any two points x and y = x + s in X, there exists a θ, 0 ≤ θ ≤ 1, such that

$$ f(y) = f(x) + (g(x))^{\top} s + \frac{1}{2} s^{\top} G(\theta x + (1-\theta)y)\, s, \tag{2.2.2} $$

where G(ξ) denotes the Hessian of the function f defined by

$$ G = \nabla_{xx} f(x) = \nabla_x^2 f(x) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\[1ex] \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\[1ex] \vdots & \vdots & \ddots & \vdots \\[1ex] \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}. $$

If f ∈ C², then its Hessian is symmetric. A symmetric matrix A is said to be positive definite if x^⊤Ax > 0 for all x ≠ 0. Similarly, A is said to be positive semi-definite if x^⊤Ax ≥ 0 for all x ∈ Rn.
The point x0 ∈ Rn is said to be a stationary point if

$$ g(x_0) = \nabla f(x_0) = 0. \tag{2.2.3} $$

Note that (2.2.3) is equivalent to

$$ \frac{\partial f(x_0)}{\partial x_j} = \left.\frac{\partial f(x)}{\partial x_j}\right|_{x=x_0} = 0, \qquad j = 1, 2, \ldots, n. \tag{2.2.4} $$

Theorem 2.2.1 (Necessary Condition for Local Minima) If x0 is a local


minimum, then ∇f (x0 ) = 0.
Proof. Suppose that ∇f(x0) ≠ 0. Choose s = −∇f(x0). Then for any sufficiently small α > 0, it follows from Taylor's theorem that

$$ f(x_0 + \alpha s) = f(x_0) + \alpha \big(\nabla f(x_0)\big)^{\top} s + o(\alpha) = f(x_0) - \alpha \big\|\nabla f(x_0)\big\|^2 + o(\alpha) < f(x_0). $$

This is a contradiction to the fact that x0 is a local minimum.


Theorem 2.2.2 (Necessary Condition for Local Minima) Let x0 be a so-
lution to (2.2.3). If x0 is a local minimum, then the Hessian G(x0 ) of the
function f evaluated at x = x0 is positive semi-definite.
Proof. For α > 0, define x(α) = x0 + αs and β(α) = f(x(α)), where s ∈ Rn is arbitrary. Clearly,

$$ \left.\frac{d\beta(\alpha)}{d\alpha}\right|_{\alpha=0} = \big(\left.\nabla f(x(\alpha))\right|_{\alpha=0}\big)^{\top} s = 0. $$

By Taylor's theorem, we have

$$ \beta(\alpha) - \beta(0) = \frac{1}{2}\frac{d^2\beta(0)}{d\alpha^2}\,\alpha^2 + o(\alpha^2), \quad \text{where } \lim_{\alpha \to 0}\frac{o(\alpha^2)}{\alpha^2} = 0. $$

If d²β(0)/dα² < 0, then the right hand side of the above equation is negative when α > 0 is sufficiently small. This contradicts the fact that x0 is a local minimum. Thus,

$$ \frac{d^2\beta(0)}{d\alpha^2} = s^{\top} G(x_0)\, s \ge 0 $$

and the result follows.
Theorem 2.2.3 (Sufficient Condition for Local Minima) Let x0 be a solu-
tion to (2.2.3). If the Hessian, G(x0 ), of the function f evaluated at x0 is
positive definite, then x0 is a strict local minimum.
Proof. Since G(x0) is positive definite, s^⊤G(x0)s > 0 for all s ≠ 0. For any unit vector s ∈ Rn and any sufficiently small α > 0, by Taylor's Theorem, we have

$$ f(x_0 + \alpha s) - f(x_0) = \frac{1}{2}\alpha^2 s^{\top} G(x_0)\, s + o(\alpha^2). $$

For small values of α > 0, the first term on the right of the last equation dominates the second. Thus, it follows that f(x0 + αs) − f(x0) > 0 provided α > 0 is sufficiently small. Since the direction of s is arbitrary, this shows that x0 is a strict local minimum and the proof is complete.

2.3 Gradient Methods

Consider the unconstrained optimization problem introduced in the previous


section. An algorithm which generates a sequence of points {x(k)} such that

$$ f\big(x^{(k+1)}\big) < f\big(x^{(k)}\big) \tag{2.3.1} $$

for all k ≥ 0, is referred to as a descent algorithm (i.e., the objective function


value is reduced at each iteration).
Define

$$ g^{(k)} = \nabla f\big(x^{(k)}\big) \quad \text{and} \quad G^{(k)} = \nabla^2 f\big(x^{(k)}\big). \tag{2.3.2} $$

Let s(k) be a given vector in Rn . Consider the function f (x) along the line

x(α) = x(k) + αs(k) ,

for α ≥ 0. Clearly, f(x(α)) may be regarded as a function of α alone. The slope of the function is

$$ \frac{df(x(\alpha))}{d\alpha} = \frac{d}{d\alpha} f\big(x^{(k)} + \alpha s^{(k)}\big) = \big(\nabla f\big(x^{(k)} + \alpha s^{(k)}\big)\big)^{\top} s^{(k)}. $$

At α = 0, the slope is

$$ \left.\frac{df(x(\alpha))}{d\alpha}\right|_{\alpha=0} = \big(\nabla f\big(x^{(k)}\big)\big)^{\top} s^{(k)} = \big(g^{(k)}\big)^{\top} s^{(k)}. $$

If a direction s(k) is such that

$$ \big(g^{(k)}\big)^{\top} s^{(k)} < 0, \tag{2.3.3} $$

then s(k) is called a descent direction of the objective function at x(k) . The
objective function value is reduced along this direction for all sufficiently
small α > 0.
The following is a general structure for the descent algorithm that we will
consider:
Algorithm 2.3.1
Step 1. Choose x(0) and set k = 0.
Step 2. Determine a search direction s(k) .
Step 3. Check for convergence.
Step 4. Find αk that minimizes f(x(k) + αs(k)) with respect to α.
Step 5. Set x(k+1) = x(k) + αk s(k), and set k := k + 1. Go to Step 2.

Remark 2.3.1
(i) Different descent methods arise from different ways of generating the
search direction s(k) .
(ii) Step 4 is a one-dimensional optimization problem and is referred to
as a line search. Here, the line search is idealized. In practice, an
exact line search is impossible.
 
Note that if αk minimizes f(x(k) + αs(k)), then

$$ \left.\frac{df\big(x^{(k)} + \alpha s^{(k)}\big)}{d\alpha}\right|_{\alpha=\alpha_k} = 0. $$

Thus, a necessary condition for an exact line search is

$$ \left.\frac{df\big(x^{(k)} + \alpha s^{(k)}\big)}{d\alpha}\right|_{\alpha=\alpha_k} = \big(\nabla f\big(x^{(k)} + \alpha_k s^{(k)}\big)\big)^{\top} s^{(k)} = \big(g^{(k+1)}\big)^{\top} s^{(k)} = 0. \tag{2.3.4} $$

2.4 Steepest Descent Method

Note that −g(k) is the direction in which the objective function value decreases most rapidly at x(k). By choosing s(k) = −g(k) in Algorithm 2.3.1, we obtain the steepest descent method. This method is the simplest one among all gradient-based unconstrained optimization methods. It requires only the objective function value and its gradient. Moreover, with the steepest descent method, we have global convergence (i.e., the method converges from any starting point) as we will now demonstrate.
Assume that there exists a point x∗ ∈ Rn such that ∇f(x∗) = 0. Furthermore, we suppose that ∇f(x) ≠ 0 if x ≠ x∗. Then, the steepest descent method constructs a sequence {x(k)} with

$$ f\big(x^{(k+1)}\big) = \min_{0 \le \alpha < \infty} f\big(x^{(k)} - \alpha g^{(k)}\big) \le f\big(x^{(k)}\big), \quad \text{for all } k \ge 0. $$

Since f(x∗) ≤ f(x(k)) ≤ f(x(0)) for all k ≥ 0, it is clear that {f(x(k))} is a bounded monotone sequence. This implies that {f(x(k))} is convergent for any initial point x(0). In other words, {f(x(k))} is globally convergent to f(x∗). However, it should be noted that the convergence of the steepest descent method can be very slow. The convergence rate of this method is given in the following theorem:
Theorem 2.4.1 Suppose that f(x) defined on Rn has continuous second order partial derivatives, and has a local minimum at x∗. If {x(k)} is a sequence generated by the steepest descent method that converges to x∗, then

$$ f\big(x^{(k+1)}\big) - f(x^*) \le \left(\frac{r-1}{r+1}\right)^2 \big(f\big(x^{(k)}\big) - f(x^*)\big), $$

where r is the condition number of the Hessian G∗ = ∇²f(x∗). Note that the condition number is given by r = A/a, where A and a are, respectively, the largest and the smallest eigenvalues of G∗.
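The following is a minimal sketch of the steepest descent method, i.e., Algorithm 2.3.1 with s(k) = −g(k), using a simple backtracking line search in place of the idealized exact line search (the test function and tolerances are illustrative). On an ill-conditioned quadratic (large r), the slow convergence predicted by Theorem 2.4.1 is readily observed.

```python
# A minimal sketch of steepest descent with backtracking (Armijo) search.
import numpy as np

def steepest_descent(f, grad, x, tol=1e-6, rho=0.01, max_iter=5000):
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        s, alpha = -g, 1.0
        # backtrack until the sufficient decrease condition (2.7.1) holds
        while f(x + alpha * s) > f(x) + rho * alpha * (s @ g):
            alpha *= 0.5
        x = x + alpha * s
    return x

# ill-conditioned quadratic (r = 100): slow zig-zagging convergence
G = np.diag([1.0, 100.0])
f = lambda x: 0.5 * x @ G @ x
grad = lambda x: G @ x
print(steepest_descent(f, grad, np.array([1.0, 1.0])))
```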

2.5 Newton’s Method

Newton’s method is based on the quadratic approximation of the objective function obtained by truncating the Taylor series expansion of f(x) about x(k). That is, for δ ∈ Rn, the objective function f(x(k) + δ) is approximated by the following quadratic function:

$$ q^{(k)}(\delta) = f^{(k)} + \big(g^{(k)}\big)^{\top} \delta + \frac{1}{2}\delta^{\top} G^{(k)} \delta, \tag{2.5.1} $$

where f(k) = f(x(k)). The next iterate x(k+1) is chosen as the minimizer of this quadratic approximation. That is, we choose x(k+1) = x(k) + δ(k), where δ(k) is the solution of

$$ \nabla q^{(k)}(\delta) = 0. \tag{2.5.2} $$

If G(k) is positive definite, then

$$ \delta^{(k)} = -\big(G^{(k)}\big)^{-1} g^{(k)}. \tag{2.5.3} $$

Remark 2.5.1
(i) Newton’s method requires the information on f (k) , g (k) and G(k) ,
i.e., function values and first and second order partial derivatives.
(ii) The basic Newton’s method does not involve a line search. The
choice of δ (k) ensures that the minimum of the quadratic approxi-
mation is achieved.
(iii) Assuming G∗ is positive definite, Newton’s method has good local
convergence if the starting point is sufficiently close to x∗ .
(iv) Choosing δ (k) as the solution of (2.5.2) is only appropriate and
well-defined if the quadratic approximation has a minimum, i.e.,
G(k) is positive definite. This may not be the case if x(k) is remote
from x∗ , where x∗ is a local minimum.

Algorithm 2.5.1 (Newton’s Method)


Step 1. Choose x(0) and set k = 0.
Step 2. If g (k) = 0, stop.
Step 3. Solve G(k) δ = −g (k) for δ = δ (k) .
Step 4. Set x(k+1) = x(k) + δ (k) .
Step 5. Set k := k + 1. Go to Step 2.
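A minimal sketch of Algorithm 2.5.1 follows (the test function and stopping tolerance are illustrative); Step 3 is carried out with a linear solver rather than by forming the inverse of the Hessian explicitly.

```python
# A minimal sketch of Newton's method (Algorithm 2.5.1).
import numpy as np

def newton(grad, hess, x, tol=1e-10, max_iter=100):
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:              # Step 2
            break
        delta = np.linalg.solve(hess(x), -g)      # Step 3: G(k) delta = -g(k)
        x = x + delta                             # Step 4
    return x

# f(x) = x1^2 + x1^4 + x2^2: minimum at the origin, Hessian positive definite
grad = lambda x: np.array([2 * x[0] + 4 * x[0] ** 3, 2 * x[1]])
hess = lambda x: np.array([[2 + 12 * x[0] ** 2, 0.0], [0.0, 2.0]])
print(newton(grad, hess, np.array([2.0, 3.0])))
```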

We will now examine the convergence rate of Newton's method. From Algorithm 2.5.1, we have, at the k-th iterate,

$$ x^{(k+1)} = x^{(k)} - \big(G^{(k)}\big)^{-1} g^{(k)}. \tag{2.5.4} $$

Suppose that x∗ is a point such that g(x∗) = 0 and that G(x∗) is positive definite. Then, it follows from Taylor's theorem that
$$ g^{(k)} = g\big(x^{(k)}\big) = g(x^*) + G\big(x^{(k)}\big)\big(x^{(k)} - x^*\big) + O\big(\|x^{(k)} - x^*\|^2\big) = G^{(k)}\big(x^{(k)} - x^*\big) + O\big(\|x^{(k)} - x^*\|^2\big), \tag{2.5.5} $$

where the last term is understood to be vector valued and

$$ \lim_{\|\xi\| \to 0} \frac{\|O(\xi)\|}{\|\xi\|} = c, $$

for some constant c > 0. Let k = k0. If x(k0) is sufficiently close to x∗, then we may assume that x(k) is in a neighbourhood of x∗ for all k ≥ k0. Since f is twice continuously differentiable and G(x∗) is positive definite, we can find a constant c1 such that ‖(G(x(k)))⁻¹‖ ≤ c1 for all k ≥ k0. Multiplying both sides of (2.5.5) by (G(k))⁻¹ yields

$$ \big(G^{(k)}\big)^{-1} g^{(k)} = x^{(k)} - x^* + O\big(\|x^{(k)} - x^*\|^2\big). \tag{2.5.6} $$

From Step 3 of Algorithm 2.5.1, the left hand side of (2.5.6) may be replaced by −δ(k). Since δ(k) = x(k+1) − x(k), we have

$$ -\big(x^{(k+1)} - x^{(k)}\big) = x^{(k)} - x^* + O\big(\|x^{(k)} - x^*\|^2\big). $$

Simplifying, we thus find that

$$ -\big(x^{(k+1)} - x^*\big) = O\big(\|x^{(k)} - x^*\|^2\big). $$

By the definition of O(‖x(k) − x∗‖²), there exists a constant c2 such that

$$ \big\|x^{(k+1)} - x^*\big\| \le c_2 \big\|x^{(k)} - x^*\big\|^2. \tag{2.5.7} $$

Therefore, if x(k) is close to x∗, {x(k)} converges to x∗ at a rate of at least second order.

2.6 Modifications to Newton’s Method

Assume that G(k) has eigenvalues λk1 < λk2 < . . . < λkn . Choose εk such that

εk + λk1 > 0.

Obviously, if λk1 > 0, we choose εk = 0. Consider the matrix εk I + G(k). Clearly, it has eigenvalues

$$ 0 < \varepsilon_k + \lambda_{k1} < \varepsilon_k + \lambda_{k2} < \cdots < \varepsilon_k + \lambda_{kn}. $$

Since they are all positive, εk I + G(k) is positive definite. Thus, we can construct a modified Newton's method:

$$ x^{(k+1)} = x^{(k)} - \alpha_k \big(\varepsilon_k I + G^{(k)}\big)^{-1} g^{(k)}, \qquad k = 0, 1, \ldots \tag{2.6.1} $$

Remark 2.6.1 When εk is sufficiently large (on the order of 10⁴), the term εk I dominates εk I + G(k) and (εk I + G(k))⁻¹ ≈ [εk I]⁻¹ = (1/εk) I. Thus, the search direction is

$$ -\big(\varepsilon_k I + G^{(k)}\big)^{-1} g^{(k)} \approx -\frac{1}{\varepsilon_k}\, g^{(k)}, $$

which is the steepest descent direction. When εk is small, εk I + G(k) ≈ G(k) and the method is similar to Newton's method.

Let us describe this modified Newton’s method, which is called Marquardt’s


method, as follows.
Algorithm 2.6.1
Step 1. Choose x(0), ε0 > 0 (of the order of 10⁴), c1 (where 0 < c1 < 1), c2 (where c2 > 1) and ε > 0 (of the order of 10⁻²). Set k = 0.
Step 2. Compute g(k).
Step 3. If ‖g(k)‖ ≤ ε, stop. x(k) is taken as a local minimum. Otherwise, go to Step 4.
Step 4. Find αk∗ such that f(x(k) − αk[εk I + G(k)]⁻¹ g(k)) is minimized with respect to αk. Compute x(k+1) according to

$$ x^{(k+1)} = x^{(k)} - \alpha_k^* \big(\varepsilon_k I + G^{(k)}\big)^{-1} g^{(k)}. \tag{2.6.2} $$

Step 5. Compare the values of f(x(k+1)) and f(x(k)). If f(x(k+1)) < f(x(k)), go to Step 6. If f(x(k+1)) ≥ f(x(k)), go to Step 7.
Step 6. Set εk+1 = c1 εk, k := k + 1, and go to Step 2 followed by Step 3.
Step 7. Set εk+1 = c2 εk, k := k + 1, and go to Step 2 followed by Step 4.

Remark 2.6.2 In Marquardt’s method, the value of εk is taken to be large


at the beginning and then is reduced to zero gradually as the iterative process
progresses. Thus, as the value of εk decreases from a large value to zero, the
characteristics of the method change from those of a steepest descent method
to those of Newton’s method. Hence, the method takes advantage of both the
steepest descent method (global convergence) and Newton's method (fast
convergence when near a local minimum).
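A minimal sketch of Algorithm 2.6.1 follows; for simplicity, the exact minimization over αk∗ in Step 4 is replaced here by a full step αk∗ = 1, and the constants and test function are illustrative.

```python
# A minimal sketch of Marquardt's method (Algorithm 2.6.1), with the
# simplification alpha_k = 1 in Step 4.
import numpy as np

def marquardt(f, grad, hess, x, eps_k=1e4, c1=0.25, c2=2.0, tol=1e-2,
              max_iter=1000):
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:                     # Steps 2-3
            break
        step = np.linalg.solve(eps_k * np.eye(len(x)) + hess(x), -g)
        if f(x + step) < f(x):                           # Steps 5-6
            x, eps_k = x + step, c1 * eps_k              # toward Newton
        else:                                            # Step 7
            eps_k = c2 * eps_k                           # toward steepest descent
    return x

# test function whose Hessian is indefinite away from the minima
f = lambda x: x[0] ** 4 - 2 * x[0] ** 2 + x[1] ** 2
grad = lambda x: np.array([4 * x[0] ** 3 - 4 * x[0], 2 * x[1]])
hess = lambda x: np.array([[12 * x[0] ** 2 - 4, 0.0], [0.0, 2.0]])
print(marquardt(f, grad, hess, np.array([0.1, 1.0])))   # -> near [±1, 0]
```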

2.7 Line Search

The steepest descent method and, in fact, most gradient-based methods re-
quire a line search. The choice of α∗ that exactly minimizes f (x+αs) is called
the exact line search. The exact line search condition is given by (2.3.4). How-
ever, an exact line search condition is difficult to implement using a computer.
Thus, we must resort to an approximate line search.
If s is a descent direction at x, as described by (2.3.3), then f (x + αs) <
f (x) for all α > 0 sufficiently small. Thus, we could replace finding the exact
minimizer of f (x+αs) by finding any α such that f (x+αs) < f (x). However,
if α is chosen too small, we may not get to the minimum of f . We need at
least linear decrease in function value to guarantee convergence. If α is chosen
too large, then s may no longer be a descent direction.
An approximate minimizer ᾱ of f (x + αs) must be chosen such that the
following conditions are satisfied:
(1) Sufficient function decrease:

$$ f(x + \bar{\alpha} s) \le f(x) + \rho \bar{\alpha}\, s^{\top} \nabla f(x). \tag{2.7.1} $$

(2) Sufficient slope improvement:

$$ \big|\big(\nabla f(x + \bar{\alpha} s)\big)^{\top} s\big| \le -\delta\, s^{\top} \nabla f(x). \tag{2.7.2} $$

Remark 2.7.1 ρ and δ are constants satisfying 0 < ρ < δ < 1. If δ = 0,


then ᾱ = α∗ .
For illustration, let us look at

$$ h(\alpha) = f(x + \alpha s). $$

If we choose ρ = ρ̄, then condition (2.7.1) requires ᾱ ≤ a. Furthermore, condition (2.7.2) requires

$$ \left|\frac{dh(\bar{\alpha})}{d\alpha}\right| \le -\delta\, \frac{dh(0)}{d\alpha} \;\Rightarrow\; \bar{\alpha} \in [b, c]. $$

An acceptable approximate minimizer is any point ᾱ satisfying both of these conditions, i.e., any point in [b, c] in the diagram of Figure 2.7.1.
Remark 2.7.2
(i) To ensure the existence of a point satisfying both these conditions,
we need ρ ≤ δ.
(ii) As δ → 0, line search becomes more accurate. Typical values of δ
are chosen as follows: δ = 0.9—a weak line search, i.e., not very accurate line search; and δ = 0.1—a strong line search, i.e., fairly accurate line search.
(iii) ρ is typically taken to be quite small, e.g., 0.01, where 0 < ρ < δ < 1.

Fig. 2.7.1: The choices of α and ρ (a plot of y = h(α) together with the reference lines y = h(0) (ρ = 0), y = h(0) + αρ̄h′(0) with 0 < ρ̄ < 1, and y = h(0) + αh′(0) (ρ = 1); the points b, α∗, c and a are marked on the α-axis)
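The following is a minimal sketch of an approximate line search that returns a step length satisfying both (2.7.1) and (2.7.2); the simple bracketing-and-bisection strategy used here is an illustrative stand-in for more sophisticated interpolation-based procedures.

```python
# A minimal sketch of an approximate line search enforcing (2.7.1)-(2.7.2).
import numpy as np

def wolfe_search(f, grad, x, s, rho=0.01, delta=0.9, max_iter=50):
    lo, hi, alpha = 0.0, np.inf, 1.0
    f0, g0 = f(x), s @ grad(x)          # g0 < 0 when s is a descent direction
    for _ in range(max_iter):
        if f(x + alpha * s) > f0 + rho * alpha * g0:
            hi = alpha                   # step too long: (2.7.1) fails
        elif abs(s @ grad(x + alpha * s)) > -delta * g0:
            lo = alpha                   # step too short: (2.7.2) fails
        else:
            return alpha                 # both (2.7.1) and (2.7.2) hold
        alpha = 0.5 * (lo + hi) if np.isfinite(hi) else 2.0 * alpha
    return alpha
```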

For the steepest descent method with approximate line search, we have the
following convergence result, which can be found in any book on optimization.
See, for example, [172].
" #
Theorem 2.7.1 Let f ∈ C 2 and x(k) be a sequence of points generated
by the steepest descent method using an approximate line search. Then, one
of the following conditions is to be satisfied:
(i) g (k) = 0 for some k; or
(ii) g (k) → 0 as k → ∞; or
(iii) f (k) → −∞ (there is no finite minimum).

2.8 Conjugate Gradient Methods

Conjugate gradient methods were originally developed for the minimization of a quadratic objective function, and then extended to general unconstrained minimization problems. Let us consider the following quadratic objective function:

$$ f(x) = \frac{1}{2}\, x^{\top} G x + c^{\top} x \tag{2.8.1} $$

over Rn, where G is a symmetric positive definite matrix.
Definition 2.8.1 Given a symmetric matrix G, two vectors d(1) and d(2) are said to be G-conjugate if ⟨d(1), Gd(2)⟩ = (d(1))^⊤ G d(2) = 0, where ⟨·, ·⟩ denotes the inner product.

Theorem 2.8.1 If G is positive definite and the set of non-zero vectors d(i) ,
i = 0, 1, 2, . . . , k, are G-conjugate, then they are linearly independent.

Proof. Suppose they satisfy

$$ \alpha_0 d^{(0)} + \alpha_1 d^{(1)} + \cdots + \alpha_k d^{(k)} = 0 \tag{2.8.2} $$

for a set of constants αi, i = 0, 1, 2, . . . , k. Taking the inner product of (2.8.2) with Gd(i), we have

$$ \alpha_0 \big(d^{(0)}\big)^{\top} G d^{(i)} + \alpha_1 \big(d^{(1)}\big)^{\top} G d^{(i)} + \cdots + \alpha_k \big(d^{(k)}\big)^{\top} G d^{(i)} = 0. \tag{2.8.3} $$

From the conjugacy, we have (d(i))^⊤ G d(j) = 0 for all i ≠ j. Thus, it follows from (2.8.3) that αi (d(i))^⊤ G d(i) = 0. Since G is positive definite, we have (d(i))^⊤ G d(i) > 0, and hence αi = 0. This is true for all i = 0, 1, . . . , k. Thus, d(i), i = 0, 1, . . . , k, are linearly independent.
Theorem 2.8.2 (Principal Theorem of Conjugate Direction Method) Consider a quadratic function given by (2.8.1). For an arbitrary x(0) ∈ Rn, let x(i), i = 1, 2, . . . , k, with k ≤ n − 1 be generated by a conjugate direction method, where the search directions s(0), . . . , s(n−1) are G-conjugate, and where, for each i ≤ n − 1, x(i+1) is chosen as

$$ x^{(i+1)} = x^{(i)} + \alpha_i^* s^{(i)}, \tag{2.8.4} $$

where

$$ \alpha_i^* = \operatorname*{argmin}_{\alpha \ge 0} f\big(x^{(i)} + \alpha s^{(i)}\big). \tag{2.8.5} $$

Then, the method terminates in at most n exact line searches. Furthermore, for each i ≤ n, x(i) is the minimizer of the function f(x) given by (2.8.1) on the affine subspace

$$ U_i = x^{(0)} + \operatorname{span}\big\{s^{(0)}, s^{(1)}, \ldots, s^{(i-1)}\big\}, \tag{2.8.6} $$

where span{s(0), s(1), . . . , s(i−1)} denotes the linear subspace spanned by s(0), s(1), . . ., s(i−1).
Proof. For all i ≤ n − 1, recall the notation g(i) = ∇f(x(i)) from (2.3.2). We have

$$ g^{(i+1)} - g^{(i)} = \big(Gx^{(i+1)} + c\big) - \big(Gx^{(i)} + c\big) = G\big(x^{(i+1)} - x^{(i)}\big) = G\big(x^{(i)} + \alpha_i^* s^{(i)} - x^{(i)}\big) = \alpha_i^*\, G s^{(i)}. \tag{2.8.7} $$

For j < i, note that we may write

$$ g^{(i+1)} = g^{(j+1)} + \big(g^{(j+2)} - g^{(j+1)}\big) + \cdots + \big(g^{(i+1)} - g^{(i)}\big) = g^{(j+1)} + \sum_{k=j+1}^{i} \big(g^{(k+1)} - g^{(k)}\big). \tag{2.8.8} $$

Using (2.8.8) and (2.8.7), it follows that

$$ \big(g^{(i+1)}\big)^{\top} s^{(j)} = \big(g^{(j+1)}\big)^{\top} s^{(j)} + \sum_{k=j+1}^{i} \big(g^{(k+1)} - g^{(k)}\big)^{\top} s^{(j)} = \big(g^{(j+1)}\big)^{\top} s^{(j)} + \sum_{k=j+1}^{i} \alpha_k^* \big(s^{(k)}\big)^{\top} G s^{(j)} = 0, \tag{2.8.9} $$

where the last equality is based on the exact line search condition (2.3.4) and the G-conjugacy of {s(0), . . . , s(i)}. Now, consider the case when j = i. Again, by the exact line search condition (2.3.4), we have

$$ \big(g^{(i+1)}\big)^{\top} s^{(i)} = 0. \tag{2.8.10} $$

Since (2.8.9) and (2.8.10) hold for all i ≤ n − 1 and since Theorem 2.8.1 ensures that {s(0), s(1), . . . , s(n−1)} is linearly independent, it follows from those two equations that

$$ g^{(n)} = 0. $$

Hence x(n) is the minimizer of f(x) given by (2.8.1) and consequently the search must terminate in at most n exact line searches.
We now move on to prove the second part of the theorem. We need to
show that for each i ≤ n, x(i) is the minimizer of f (x) on the set Ui defined
by (2.8.6). We shall use an induction argument. The statement is true for
i = 1, since the exact line search (2.8.4)–(2.8.5) is clearly over U1 in this case.
Suppose the statement is true for some j < n, i.e., x(j) is optimal on Uj .
Then we want to show that

$$ x^{(j+1)} = x^{(j)} + \alpha_j^* s^{(j)} $$

is optimal on Uj+1 = x(0) + span{s(0), s(1), . . . , s(j)}, where, by the exact line search condition (2.3.4), (g(j+1))^⊤ s(j) = 0, i.e.,

$$ \big(G\big(x^{(j)} + \alpha_j^* s^{(j)}\big) + c\big)^{\top} s^{(j)} = 0, $$

which, in turn, requires

$$ \alpha_j^* = -\frac{\big(Gx^{(j)} + c\big)^{\top} s^{(j)}}{\big(s^{(j)}\big)^{\top} G s^{(j)}}. \tag{2.8.11} $$

To do so, consider the task of finding x = y + αs(j) ∈ Uj+1 to minimize f(x), where y ∈ Uj and α ≥ 0 need to be determined. We have

$$ f(x) = f\big(y + \alpha s^{(j)}\big) = \frac{1}{2}\big(y + \alpha s^{(j)}\big)^{\top} G \big(y + \alpha s^{(j)}\big) + c^{\top}\big(y + \alpha s^{(j)}\big) = \frac{1}{2}\, y^{\top} G y + \alpha \big(s^{(j)}\big)^{\top} G y + \frac{\alpha^2}{2}\big(s^{(j)}\big)^{\top} G s^{(j)} + c^{\top} y + \alpha\, c^{\top} s^{(j)} = f(y) + \alpha \big(s^{(j)}\big)^{\top}(Gy + c) + \frac{\alpha^2}{2}\big(s^{(j)}\big)^{\top} G s^{(j)}. \tag{2.8.12} $$

At this point, note that y ∈ Uj and x(j) ∈ Uj implies that

$$ y - x^{(j)} \in \operatorname{span}\big\{s^{(0)}, \ldots, s^{(j-1)}\big\}. $$

The conjugacy of {s(0), . . . , s(n−1)} then implies that

$$ \big(s^{(j)}\big)^{\top} G \big(y - x^{(j)}\big) = 0, $$

which can be rearranged to

$$ \big(s^{(j)}\big)^{\top} G y = \big(s^{(j)}\big)^{\top} G x^{(j)}. \tag{2.8.13} $$

Substituting (2.8.13) into (2.8.12), we have

$$ f(x) = f(y) + \alpha \big(s^{(j)}\big)^{\top}\big(G x^{(j)} + c\big) + \frac{\alpha^2}{2}\big(s^{(j)}\big)^{\top} G s^{(j)}. \tag{2.8.14} $$

Note that the right hand side of (2.8.14) is decoupled in that the first term depends on y only and the last two terms depend on α only. By our earlier assumption, since x(j) yields the minimum of f on Uj, we must choose y = x(j) to minimize the first term. Minimizing the remaining two terms with respect to α requires

$$ \frac{d}{d\alpha}\left[\alpha \big(s^{(j)}\big)^{\top}\big(G x^{(j)} + c\big) + \frac{\alpha^2}{2}\big(s^{(j)}\big)^{\top} G s^{(j)}\right] = \big(s^{(j)}\big)^{\top}\big(G x^{(j)} + c\big) + \alpha \big(s^{(j)}\big)^{\top} G s^{(j)} \tag{2.8.15} $$

to equal zero, which, in turn, yields

$$ \alpha = -\frac{\big(s^{(j)}\big)^{\top}\big(G x^{(j)} + c\big)}{\big(s^{(j)}\big)^{\top} G s^{(j)}}. $$

Comparing with (2.8.11), we have α = αj∗, which then shows that x(j+1) = x(j) + αj∗ s(j) minimizes f on Uj+1, as required. This proves the inductive step and the proof is complete.

This theorem is simple but important. All conjugate direction methods rely on this theorem. It shows that conjugacy plus exact line search implies quadratic termination. Also, the optimal step length in the i-th exact line search depends directly on the corresponding conjugate search direction via the formula

$$ \alpha_i^* = -\frac{\big(Gx^{(i)} + c\big)^{\top} s^{(i)}}{\big(s^{(i)}\big)^{\top} G s^{(i)}}. \tag{2.8.16} $$
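A numerical illustration of Theorem 2.8.2 (an illustrative check with randomly generated problem data, not a proof): on a positive definite quadratic, conjugate gradient directions with the exact steps (2.8.16) reach the minimizer in n exact line searches.

```python
# An illustrative numerical check of quadratic termination.
import numpy as np

n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
G = A @ A.T + n * np.eye(n)              # symmetric positive definite
c = rng.standard_normal(n)
x = np.zeros(n)
g = G @ x + c
s = -g
for _ in range(n):                       # exactly n exact line searches
    alpha = -(s @ g) / (s @ G @ s)       # exact step, cf. (2.8.16)
    x = x + alpha * s
    g_new = G @ x + c
    beta = (g_new @ g_new) / (g @ g)     # FR formula (quadratic case)
    s, g = -g_new + beta * s, g_new
print(np.allclose(G @ x + c, 0.0))       # True: g(n) = 0
```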
We now describe a version of the conjugate gradient method based on the
one by Fletcher and Reeves [62]. The presentation given below follows that
in [10]. At a point x(k) ∈ Rn , instead of minimizing the function f (x) in the
direction of −g (k) , we will search in the direction of s(k) generated by

$$ s^{(k)} = -g^{(k)} + \beta_{k-1} s^{(k-1)}, \quad k \ge 1, \tag{2.8.17} $$

$$ s^{(0)} = -g^{(0)}, \tag{2.8.18} $$

where βk−1 is chosen such that s(k) is G-conjugate to s(k−1). More precisely, from (2.8.17) and (2.8.18), we have

$$ \big(s^{(k)}\big)^{\top} G s^{(k-1)} = -\big(g^{(k)}\big)^{\top} G s^{(k-1)} + \beta_{k-1} \big(s^{(k-1)}\big)^{\top} G s^{(k-1)}. \tag{2.8.19} $$

Choosing

$$ \beta_{k-1} = \frac{\big(g^{(k)}\big)^{\top} G s^{(k-1)}}{\big(s^{(k-1)}\big)^{\top} G s^{(k-1)}}, \tag{2.8.20} $$

it is clear from (2.8.19) that

$$ \big(s^{(k)}\big)^{\top} G s^{(k-1)} = 0. \tag{2.8.21} $$

This shows that s(k) is G-conjugate to s(k−1).


Now, let x(k+1) be generated from x(k) through minimizing the function f(x(k) + αs(k)) with respect to α, i.e.,

$$ \min_{\alpha} f\big(x^{(k)} + \alpha s^{(k)}\big) = f\big(x^{(k+1)}\big). $$

From (2.8.1), we have

$$ f\big(x^{(k)} + \alpha s^{(k)}\big) = f\big(x^{(k)}\big) + \alpha \big(g^{(k)}\big)^{\top} s^{(k)} + \frac{\alpha^2}{2}\big(s^{(k)}\big)^{\top} G s^{(k)}. \tag{2.8.22} $$

Minimizing this function with respect to α yields

$$ x^{(k+1)} = x^{(k)} + \alpha_k s^{(k)}, \tag{2.8.23} $$

where

$$ \alpha_k = -\frac{\big(g^{(k)}\big)^{\top} s^{(k)}}{\big(s^{(k)}\big)^{\top} G s^{(k)}}. \tag{2.8.24} $$

Since g(k) = Gx(k) + c and using (2.8.23), we obtain

$$ g^{(k+1)} = Gx^{(k+1)} + c = G\big(x^{(k)} + \alpha_k s^{(k)}\big) + c = g^{(k)} + \alpha_k\, G s^{(k)}. \tag{2.8.25} $$

By (2.8.17), it follows from the exact line search condition (g(k))^⊤ s(k−1) = 0 that

$$ \big(g^{(k)}\big)^{\top} s^{(k)} = -\big(g^{(k)}\big)^{\top} g^{(k)} + \beta_{k-1} \big(g^{(k)}\big)^{\top} s^{(k-1)} = -\big(g^{(k)}\big)^{\top} g^{(k)}. \tag{2.8.26} $$

Since s(k−1) is G-conjugate to s(k), it is clear from (2.8.17) and (2.8.18) that

$$ \big(s^{(k)}\big)^{\top} G s^{(k)} = -\big(g^{(k)}\big)^{\top} G s^{(k)} + \beta_{k-1} \big(s^{(k-1)}\big)^{\top} G s^{(k)} = -\big(g^{(k)}\big)^{\top} G s^{(k)}. \tag{2.8.27} $$

Substituting (2.8.26) and (2.8.27) into (2.8.24) yields

$$ \alpha_k = -\frac{\big(g^{(k)}\big)^{\top} g^{(k)}}{\big(g^{(k)}\big)^{\top} G s^{(k)}}. \tag{2.8.28} $$

From (2.8.25), we have

$$ \big(g^{(k+1)}\big)^{\top} g^{(k)} = \big(g^{(k)}\big)^{\top} g^{(k)} + \alpha_k \big(g^{(k)}\big)^{\top} G s^{(k)}. \tag{2.8.29} $$

Substituting (2.8.28) into (2.8.29) yields

$$ \big(g^{(k+1)}\big)^{\top} g^{(k)} = \big(g^{(k)}\big)^{\top} g^{(k)} - \frac{\big(g^{(k)}\big)^{\top} g^{(k)}}{\big(g^{(k)}\big)^{\top} G s^{(k)}}\, \big(g^{(k)}\big)^{\top} G s^{(k)} = 0. \tag{2.8.30} $$

This shows that g(k+1) is orthogonal to g(k).



Remark 2.8.1 Let x(0) be an initial point in Rn. Then, x(1) is calculated according to (2.8.23) with k = 0, where s(0) = −g(0) and α0 is obtained by minimizing f(x(0) + αs(0)) with respect to α; α0 is given by (2.8.24) with k = 0. Let x(1) = x(0) − α0 g(0). The gradient vector g(1) = ∂f(x)/∂x|_{x=x(1)} of the function f(x) is calculated at x(1). Since α0 is the minimum point of the function f(x(0) − αg(0)) with respect to α, we have

$$ 0 = \left.\frac{df\big(x^{(0)} - \alpha g^{(0)}\big)}{d\alpha}\right|_{\alpha=\alpha_0} = -\left(\left.\frac{\partial f(x)}{\partial x}\right|_{x=x^{(0)}-\alpha_0 g^{(0)}}\right)^{\top} g^{(0)} = -\big(g^{(1)}\big)^{\top} g^{(0)}. \tag{2.8.31} $$

This shows that g(1) is orthogonal to g(0) and hence to s(0).

Theorem 2.8.3 The sequences {s(k) } and {g (k) } generated according to


(2.8.17), (2.8.18) and (2.8.25) are mutually G-conjugate and mutually or-
thogonal, respectively.

Proof. We shall use mathematical induction. First, by (2.8.21) with k = 1, we note that s(0) and s(1) are G-conjugate. Next, by (2.8.31), g(0) and g(1) are orthogonal. Thus, the conclusion of the theorem for k = 1 is valid. Now, we suppose that s(0), s(1), . . . , s(k−1) are mutually G-conjugate and g(0), g(1), . . . , g(k−1) are mutually orthogonal for some k ≥ 2. By (2.8.30), g(k) is orthogonal to g(k−1) and, by (2.8.21), s(k) is G-conjugate to s(k−1). By the induction hypotheses, g(k−1) is orthogonal to all g(i), 0 ≤ i ≤ k − 2, and s(k−1) is G-conjugate to s(i), 0 ≤ i ≤ k − 2. Then, we have, for 0 ≤ i ≤ k − 2,

$$ \big(g^{(k)}\big)^{\top} g^{(i)} = \big(g^{(k-1)} + \alpha_{k-1}\, G s^{(k-1)}\big)^{\top} g^{(i)} = \big(g^{(k-1)}\big)^{\top} g^{(i)} + \alpha_{k-1} \big(G s^{(k-1)}\big)^{\top} g^{(i)} = \alpha_{k-1} \big(G s^{(k-1)}\big)^{\top} g^{(i)}. \tag{2.8.32} $$

From (2.8.17) and (2.8.18), we can write g(i) as

$$ g^{(i)} = -s^{(i)} + \beta_{i-1} s^{(i-1)}, \tag{2.8.33} $$

where β−1 = 0, as g(0) = −s(0), while for i ≥ 0, βi are determined by (2.8.20). Thus, from (2.8.32) and (2.8.33), it follows from the induction hypothesis that

$$ \big(g^{(k)}\big)^{\top} g^{(i)} = \alpha_{k-1} \big(G s^{(k-1)}\big)^{\top} \big(-s^{(i)} + \beta_{i-1} s^{(i-1)}\big) = 0 \quad \text{for } i = 0, 1, \ldots, k-2. \tag{2.8.34} $$
Therefore, by (2.8.34) and (2.8.30), we conclude that g(0), g(1), . . . , g(k) are mutually orthogonal.
It remains to show that s(i), i = 0, 1, . . . , k, are mutually G-conjugate. We recall that s(k) is G-conjugate to s(k−1). By the induction hypothesis, s(k−1) is G-conjugate to all s(i), 0 ≤ i ≤ k − 2. Thus, for 0 ≤ i ≤ k − 2, it is clear from (2.8.17) and (2.8.18) that

$$ \big(s^{(k)}\big)^{\top} G s^{(i)} = \big(-g^{(k)} + \beta_{k-1} s^{(k-1)}\big)^{\top} G s^{(i)} = -\big(g^{(k)}\big)^{\top} G s^{(i)} + \beta_{k-1} \big(s^{(k-1)}\big)^{\top} G s^{(i)} = -\big(g^{(k)}\big)^{\top} G s^{(i)}. \tag{2.8.35} $$

From (2.8.25), we can write Gs(i) as

$$ G s^{(i)} = \frac{g^{(i+1)} - g^{(i)}}{\alpha_i}. \tag{2.8.36} $$

From (2.8.24), we see that αi is never 0 unless g(i) = 0. If g(i) = 0, then x(i) is the minimizer of the function f(x) and the method terminates.
Substituting (2.8.36) into (2.8.35), and then noting that g(k) is orthogonal to all g(j), j = 0, 1, . . . , k − 1 (and hence orthogonal to g(i) and g(i+1) for i = 0, 1, . . . , k − 2), we obtain

$$ \big(s^{(k)}\big)^{\top} G s^{(i)} = -\big(g^{(k)}\big)^{\top} G s^{(i)} = -\frac{\big(g^{(k)}\big)^{\top}\big(g^{(i+1)} - g^{(i)}\big)}{\alpha_i} = 0 \tag{2.8.37} $$

for 0 ≤ i ≤ k − 2. This concludes the induction step and shows that s(0), s(1), . . . , s(k) are mutually G-conjugate. This completes the proof.

Remark 2.8.2 Combining (2.8.20) and (2.8.36), we obtain

$$ \beta_{k-1} = \frac{\big(g^{(k)}\big)^{\top}\big(g^{(k)} - g^{(k-1)}\big)}{\big(s^{(k-1)}\big)^{\top}\big(g^{(k)} - g^{(k-1)}\big)} = -\frac{\big(g^{(k)}\big)^{\top} g^{(k)}}{\big(s^{(k-1)}\big)^{\top} g^{(k-1)}}. \tag{2.8.38} $$

The following conjugate gradient method is applicable to general minimization problems, not just those with a quadratic objective function:
Algorithm 2.8.1
Step 1. Choose a starting point x(0).
Step 2. Set s(0) = −g(0).
Step 3. Define x(1) = x(0) + α0 s(0), where

$$ \alpha_0 = -\frac{\big(g^{(0)}\big)^{\top} s^{(0)}}{\big(s^{(0)}\big)^{\top} G^{(0)} s^{(0)}} $$

and G(k) denotes the Hessian of the function f evaluated at x = x(k).
Step 4. Compute

$$ g^{(1)} = \left.\frac{\partial f(x)}{\partial x}\right|_{x=x^{(1)}}. $$

Step 5. Set s(1) = −g(1) + β0 s(0), where

$$ \beta_0 = \frac{\big(g^{(1)}\big)^{\top} G^{(0)} s^{(0)}}{\big(s^{(0)}\big)^{\top} G^{(0)} s^{(0)}}. $$

By Remark 2.8.2, β0 can be expressed as

$$ \beta_0 = \frac{\big(g^{(1)}\big)^{\top}\big(g^{(1)} - g^{(0)}\big)}{\big(s^{(0)}\big)^{\top}\big(g^{(1)} - g^{(0)}\big)} = -\frac{\big(g^{(1)}\big)^{\top} g^{(1)}}{\big(s^{(0)}\big)^{\top} g^{(0)}} = \frac{\big(g^{(1)}\big)^{\top} g^{(1)}}{\big(g^{(0)}\big)^{\top} g^{(0)}}. $$

This expression avoids the need to calculate Gs(0).
Step 6. General step. After reaching x(i), we compute g(i) and set

$$ s^{(i)} = -g^{(i)} + \beta_{i-1} s^{(i-1)}, $$

where, by the orthogonality of {g(i)},

$$ \beta_{i-1} = \frac{\big(g^{(i)}\big)^{\top}\big(g^{(i)} - g^{(i-1)}\big)}{\big(s^{(i-1)}\big)^{\top}\big(g^{(i)} - g^{(i-1)}\big)} = -\frac{\big(g^{(i)}\big)^{\top} g^{(i)}}{\big(s^{(i-1)}\big)^{\top} g^{(i-1)}} = \frac{\big(g^{(i)}\big)^{\top} g^{(i)}}{\big(g^{(i-1)}\big)^{\top} g^{(i-1)}}. $$

Step 7. Set

$$ x^{(i+1)} = x^{(i)} + \alpha_i s^{(i)}, \tag{2.8.39} $$

where

$$ \alpha_i = -\frac{\big(s^{(i)}\big)^{\top} g^{(i)}}{\big(s^{(i)}\big)^{\top} G^{(i)} s^{(i)}}. $$

Suppose that the Hessian matrices G(i) , i = 0, 1, 2, . . ., are not known.


In this case, αi can be found by a one-dimensional search instead of using
the formula in Step 7. The quadratic interpolation method could be used to
perform this one-dimensional search.
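For general (non-quadratic) functions, a minimal sketch of Algorithm 2.8.1 in the Hessian-free form just suggested is given below: the Fletcher-Reeves formula from Step 6 is used, and an illustrative backtracking line search replaces the exact one (with a restart safeguard, since without exact line searches the generated direction need not be a descent direction).

```python
# A minimal sketch of the Fletcher-Reeves conjugate gradient method.
import numpy as np

def fletcher_reeves(f, grad, x, tol=1e-8, max_iter=1000):
    g = grad(x)
    s = -g                                    # Step 2: s(0) = -g(0)
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        if s @ g >= 0:                        # safeguard: restart with -g
            s = -g
        alpha = 1.0                           # backtracking line search
        while f(x + alpha * s) > f(x) + 1e-4 * alpha * (s @ g):
            alpha *= 0.5
        x = x + alpha * s
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)      # FR formula (Step 6)
        s = -g_new + beta * s
        g = g_new
    return x
```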

2.8.1 Convergence of the Conjugate Gradient Methods

Let us present the global convergence result of the Fletcher-Reeves method in the following theorem [232]:
Theorem 2.8.4 Let f : Rn → R be a twice continuously differentiable func-
tion defined on a bounded level set

L(x(0) ) = {x ∈ Rn : f (x) ≤ f (x(0) )}, (2.8.40)

where x(0) is a point in Rn . Suppose that the sequence {x(k) } is generated


starting from x(0) by the Fletcher-Reeves (FR) conjugate gradient method
with inexact line search such that the conditions (2.7.1) and (2.7.2) are sat-
isfied. Then

$$ \liminf_{k \to \infty} \big\|g^{(k)}\big\| = 0, \tag{2.8.41} $$

where ‖·‖ denotes the usual Euclidean norm.

Other conjugate gradient methods arise from different choices of βk . For


example, in the Polak-Ribiere-Polyak (PRP) method [204], βk is chosen according to

$$ \beta_k = \frac{\big(g^{(k+1)}\big)^{\top}\big(g^{(k+1)} - g^{(k)}\big)}{\big(g^{(k)}\big)^{\top} g^{(k)}}. \tag{2.8.42} $$

In the quadratic case, the expression (2.8.42) for βk is identical to that given by (2.8.38). This is because

$$ \big(g^{(k+1)}\big)^{\top} g^{(k)} = 0. $$

For general functions, the two methods behave differently. The PRP for-
mula given by (2.8.42) will yield βk ≈ 0 when g (k+1) ≈ g (k) . Hence, s(k+1)
≈ −g(k+1). This means that the algorithm has a tendency to restart automatically. Thus, it can overcome the deficiency of moving forward slowly. It is generally accepted that the PRP formula is more robust and efficient than the FR formula. Unfortunately, Theorem 2.8.4 is not valid for the PRP method (see [204]). However, we have the following two theorems. For their
proofs, see [232].
Theorem 2.8.5 Let the objective function f : Rn → R be three times continuously differentiable. Suppose that there exist constants K1 > K2 > 0 such that

$$ K_2 \|y\|^2 \le y^{\top} \nabla^2 f(x)\, y \le K_1 \|y\|^2, \quad \text{for all } y \in R^n,\ x \in L\big(x^{(0)}\big), \tag{2.8.43} $$

where L(x(0)) is the bounded level set defined by (2.8.40). Let the sequence {x(k)} be generated by either the Polak-Ribiere-Polyak Conjugate Gradient or the Fletcher-Reeves Conjugate Gradient restart method with the exact line search. Then, there exists a constant c > 0 such that

$$ \limsup_{k(r) \to \infty} \frac{\big\|x^{(k(r)+n)} - x^*\big\|}{\big\|x^{(k(r))} - x^*\big\|^2} \le c < \infty, \tag{2.8.44} $$

where k(r) means that the methods restart after r iterations.


 
Theorem 2.8.6 Let f(x) be twice continuously differentiable and let L(x(0)) be the level set defined by (2.8.40). Suppose that L(x(0)) is bounded and that there is a constant K > 0 such that

$$ K \|y\|^2 \le y^{\top} \nabla^2 f(x)\, y, \quad \forall y \in R^n \tag{2.8.45} $$

for all x ∈ L(x(0)). Then, the sequence {x(k)} generated by the PRP method with the exact line search converges to a unique minimizer x∗ of f.

2.9 Quasi-Newton Methods


 
Recall that G(k) = ∇²f(x(k)) may not always be positive definite when x(k) is far from a local minimum. Thus, Newton's method may not converge in such a situation. Quasi-Newton methods are based on the idea of approximating (G(k))⁻¹ at each iteration by a symmetric positive definite matrix H(k+1), which is updated at every iteration. The positive definiteness of the matrix H(k+1) ensures that the search direction generated by this method is a descent direction.
Algorithm 2.9.1
Step 1. Given x(0) , H (0) . Set k = 0.
    
Step 2. Evaluate f(k) = f(x(k)), g(k) = ∇f(x(k)), and H(k).
Step 3. Set s(k) = −H(k) g(k).
Step 4. Check for convergence. If ‖s(k)‖ < ε, stop.
Step 5. Set x(k+1) = x(k) + α(k) s(k), where α(k) is chosen by a line search.
Step 6. Update H(k) to H(k+1).
Step 7. Set k := k + 1, go to Step 2.
Usually, H (0) = I. This implies that

s(0) = −g (0) .

This means that we have the steepest descent direction at the first iteration.
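A minimal sketch of Algorithm 2.9.1 follows; here `update` stands for any correction formula satisfying the quasi-Newton condition (2.9.6) of the next subsection (e.g., the DFP formula), and the backtracking line search used for Step 5 is an illustrative choice.

```python
# A minimal sketch of the quasi-Newton framework (Algorithm 2.9.1).
import numpy as np

def quasi_newton(f, grad, x, update, tol=1e-8, max_iter=500):
    H, g = np.eye(len(x)), grad(x)        # Step 1: H(0) = I
    for _ in range(max_iter):
        s = -H @ g                        # Step 3
        if np.linalg.norm(s) < tol:       # Step 4
            break
        alpha = 1.0                       # Step 5: backtracking line search
        while f(x + alpha * s) > f(x) + 1e-4 * alpha * (s @ g):
            alpha *= 0.5
        x_new = x + alpha * s
        g_new = grad(x_new)
        H = update(H, x_new - x, g_new - g)   # Step 6
        x, g = x_new, g_new
    return x
```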
Remark 2.9.1 Newton's method has fast local convergence properties. However, it requires information on second derivatives and G(k) may be indefinite. The convergence properties of quasi-Newton methods are much better than those of the steepest descent method, but not as good as those of Newton's method, and they require only information on first order partial derivatives. Also, for each k = 0, 1, . . ., the matrix H(k) is always positive definite, and hence the corresponding s(k) is a descent direction. Note that some quasi-Newton methods do not ensure that H(k) is positive definite. Those that do are also called variable metric methods.

2.9.1 Approximation of the Inverse G^{-1}

The key idea in quasi-Newton methods is to approximate the inverse (G^{(k)})^{-1} of the Hessian G^{(k)} by H^{(k+1)} at step k. This approximate matrix should be chosen to be positive definite so that the search direction generated is a descent direction. Let

\[ \delta^{(k)} = x^{(k+1)} - x^{(k)} \tag{2.9.1} \]

and

\[ \gamma^{(k)} = g^{(k+1)} - g^{(k)}. \tag{2.9.2} \]

Taylor's series expansion gives

\[ g^{(k+1)} = g^{(k)} + G^{(k)} \left( x^{(k+1)} - x^{(k)} \right) + o\!\left( \left\| \delta^{(k)} \right\| \right), \tag{2.9.3} \]

or, equivalently,

\[ \gamma^{(k)} = G^{(k)} \delta^{(k)} + o\!\left( \left\| \delta^{(k)} \right\| \right), \tag{2.9.4} \]

where o(ξ) is to be understood as a column vector such that

\[ \lim_{\|\xi\| \to 0} \frac{\| o(\xi) \|}{\|\xi\|} = 0. \]

This expansion is exact if f(x) is a quadratic function. From (2.9.3), or equivalently (2.9.4), we see that G^{(k)} depends on δ^{(k)}. For the case when G^{(k)} = G is constant for all k = 0, 1, 2, . . . , n − 1 (i.e., f(x) is a quadratic function), we have

\[ \gamma^{(k)} = G \delta^{(k)}, \quad \forall k = 0, 1, \ldots, n-1. \]

We now return to (2.9.4) with higher order terms neglected, i.e.,

\[ \gamma^{(k)} = G^{(k)} \delta^{(k)}, \quad \forall k = 0, 1, \ldots, n-1. \tag{2.9.5} \]

Note that δ^{(k)} and hence γ^{(k)} are calculated after the line search, while H^{(k)} is used to calculate the direction of the search. Hence, it is usually expected that

\[ H^{(k)} \gamma^{(k)} = \delta^{(k)}. \]

Thus, we choose H^{(k+1)} such that

\[ H^{(k+1)} \gamma^{(k)} = \delta^{(k)}. \tag{2.9.6} \]

Equation (2.9.6) is known as the quasi-Newton condition. This is the condition that the update formula should satisfy.

2.9.2 Rank Two Correction

Define

\[ H^{(k+1)} = H^{(k)} + a u u^T + b v v^T. \tag{2.9.7} \]

Assume that the quasi-Newton condition (2.9.6) is satisfied. Then, by multiplying both sides of (2.9.7) by γ^{(k)}, we obtain

\[ \delta^{(k)} = H^{(k)} \gamma^{(k)} + a u u^T \gamma^{(k)} + b v v^T \gamma^{(k)}. \tag{2.9.8} \]

We choose a and b such that

\[ a\, u^T \gamma^{(k)} = 1 \quad \text{and} \quad b\, v^T \gamma^{(k)} = -1. \tag{2.9.9} \]

Let u = δ^{(k)} and v = H^{(k)} γ^{(k)}. Then, it follows from (2.9.9) that

\[ a = \frac{1}{(\delta^{(k)})^T \gamma^{(k)}} \quad \text{and} \quad b = -\frac{1}{(\gamma^{(k)})^T H^{(k)} \gamma^{(k)}}. \tag{2.9.10} \]

Substituting (2.9.10) into (2.9.7) yields

\[ H^{(k+1)} = H^{(k)} + \frac{\delta^{(k)} (\delta^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}} - \frac{H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{(\gamma^{(k)})^T H^{(k)} \gamma^{(k)}}. \tag{2.9.11} \]

This is the Davidon-Fletcher-Powell (DFP) formula.
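As a small illustration, the DFP correction (2.9.11) translates directly into a few lines of code. The sketch assumes that δ^{(k)} and γ^{(k)} have already been formed from (2.9.1)–(2.9.2) and that (δ^{(k)})^T γ^{(k)} > 0 (see Remark 2.9.3 below).

```python
import numpy as np

def dfp_update(H, delta, gamma):
    # DFP formula (2.9.11): rank-two correction enforcing the
    # quasi-Newton condition H^(k+1) gamma^(k) = delta^(k).
    Hg = H @ gamma
    return (H + np.outer(delta, delta) / (delta @ gamma)
              - np.outer(Hg, Hg) / (gamma @ Hg))
```

This function can be passed as the `update` argument of the quasi-Newton driver sketched after Algorithm 2.9.1.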


Remark 2.9.2 Clearly, for any a ∈ R^n and b ∈ R^n, it holds that

\[ (a x - b)^T (a x - b) \ge 0, \quad \text{for all } x \in (-\infty, \infty). \]

This means that

\[ a^T a\, x^2 - 2 a^T b\, x + b^T b \ge 0, \quad \text{for all } x \in (-\infty, \infty). \tag{2.9.12} \]

The roots of the quadratic equation a^T a x² − 2 a^T b x + b^T b = 0 are given by

\[ x = \frac{2 a^T b \pm \sqrt{(2 a^T b)^2 - 4 (a^T a)(b^T b)}}{2 a^T a}. \]

Clearly, (2.9.12) is valid if and only if

\[ (a^T b)^2 \le (a^T a)(b^T b). \]

Furthermore, (2.9.12) holds as an equality only if a is parallel to b.

Theorem 2.9.1 If H^{(k)} is symmetric positive definite, then H^{(k+1)} is also symmetric positive definite. (This ensures that the search direction at each iteration is downhill, and hence the function value is reduced along the search direction using a line search.)

Proof. For any x ∈ R^n, we have

\[
\begin{aligned}
x^T H^{(k+1)} x &= x^T H^{(k)} x + x^T \frac{\delta^{(k)} (\delta^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}} x - x^T \frac{H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{(\gamma^{(k)})^T H^{(k)} \gamma^{(k)}} x \\
&= x^T H^{(k)} x + \frac{\left( x^T \delta^{(k)} \right)^2}{(\delta^{(k)})^T \gamma^{(k)}} - \frac{\left( x^T H^{(k)} \gamma^{(k)} \right)^2}{(\gamma^{(k)})^T H^{(k)} \gamma^{(k)}}.
\end{aligned} \tag{2.9.13}
\]

Let a = (H^{(k)})^{1/2} x and b = (H^{(k)})^{1/2} γ^{(k)}, where H^{1/2} is defined such that H^{1/2} H^{1/2} = H. Then, we have

\[
\begin{aligned}
x^T H^{(k+1)} x &= a^T a - \frac{(a^T b)^2}{b^T b} + \frac{\left( x^T \delta^{(k)} \right)^2}{(\delta^{(k)})^T \gamma^{(k)}} \\
&= \frac{(a^T a)(b^T b) - (a^T b)^2}{b^T b} + \frac{\left( x^T \delta^{(k)} \right)^2}{(\delta^{(k)})^T \gamma^{(k)}}.
\end{aligned} \tag{2.9.14}
\]

Also, since x^{(k+1)} is the minimum point of f(x) along the direction δ^{(k)}, it follows from (2.3.4) that

\[ (\delta^{(k)})^T g^{(k+1)} = 0. \tag{2.9.15} \]

Thus,

\[ (\delta^{(k)})^T \gamma^{(k)} = (\delta^{(k)})^T g^{(k+1)} - (\delta^{(k)})^T g^{(k)} = -(\delta^{(k)})^T g^{(k)}. \tag{2.9.16} \]

Now, by the definition of δ^{(k)} given by (2.9.1) and noting that x^{(k+1)} = x^{(k)} − α_k H^{(k)} g^{(k)}, we obtain

\[ (\delta^{(k)})^T \gamma^{(k)} = -\left( -\alpha_k H^{(k)} g^{(k)} \right)^T g^{(k)} = \alpha_k (g^{(k)})^T H^{(k)} g^{(k)} > 0. \tag{2.9.17} \]

Substituting (2.9.17) into the denominator of the second term on the right hand side of (2.9.14), we obtain

\[ x^T H^{(k+1)} x = \frac{(a^T a)(b^T b) - (a^T b)^2}{b^T b} + \frac{\left( x^T \delta^{(k)} \right)^2}{\alpha_k (g^{(k)})^T H^{(k)} g^{(k)}}. \tag{2.9.18} \]

From Remark 2.9.2, we have

\[ (a^T a)(b^T b) - (a^T b)^2 \ge 0 \tag{2.9.19} \]

and (2.9.19) holds as an equality only if a is parallel to b. In this case, x is parallel to γ^{(k)}, i.e., there exists a constant λ ≠ 0 such that

\[ x = \lambda \gamma^{(k)} = \lambda G^{(k)} \delta^{(k)}. \]

This implies that

\[ x^T \delta^{(k)} = \lambda (\delta^{(k)})^T G^{(k)} \delta^{(k)} \ne 0, \tag{2.9.20} \]

so the (squared) second term of (2.9.18) is strictly positive in this case. Combining (2.9.18), (2.9.19) and (2.9.20), we have

\[ x^T H^{(k+1)} x > 0. \]

This completes the proof.


 
Remark 2.9.3 Is (δ^{(k)})^T γ^{(k)} > 0? The answer is positive, as explained below.

(i) Exact line search. From the exact line search condition (2.3.4) (i.e., (δ^{(k)})^T g^{(k+1)} = 0), δ^{(k)} = −α H^{(k)} g^{(k)}, H^{(k)} is positive definite (and hence (g^{(k)})^T H^{(k)} g^{(k)} > 0), and α > 0, it follows that

\[ (\delta^{(k)})^T \gamma^{(k)} = (\delta^{(k)})^T g^{(k+1)} - (\delta^{(k)})^T g^{(k)} = \alpha (g^{(k)})^T H^{(k)} g^{(k)} > 0. \]

(ii) Inexact line search. Conditions on the slope can easily be used to ensure (δ^{(k)})^T γ^{(k)} > 0. Recall that

\[ \left| (s^{(k)})^T g^{(k+1)} \right| < -\sigma (s^{(k)})^T g^{(k)}, \quad 0 < \sigma < 1. \]

Since H^{(k)} is positive definite, it follows from Step 3 of Algorithm 2.9.1 that (s^{(k)})^T g^{(k)} = −(g^{(k)})^T H^{(k)} g^{(k)} < 0. Thus,

\[ (g^{(k+1)})^T s^{(k)} > \sigma (g^{(k)})^T s^{(k)}, \quad 0 < \sigma < 1. \]

Therefore, since δ^{(k)} = α^{(k)} s^{(k)},

\[
\begin{aligned}
(\delta^{(k)})^T \gamma^{(k)} &= \alpha^{(k)} (s^{(k)})^T \left( g^{(k+1)} - g^{(k)} \right) \\
&= \alpha^{(k)} \left( g^{(k+1)} - g^{(k)} \right)^T s^{(k)} \\
&> \alpha^{(k)} \left( \sigma g^{(k)} - g^{(k)} \right)^T s^{(k)} \\
&= -(1-\sigma)\, \alpha^{(k)} (g^{(k)})^T s^{(k)} \\
&> 0.
\end{aligned}
\]

Theorem 2.9.2 Suppose the DFP method with exact line search is applied to minimize the quadratic objective function

\[ f(x) = \frac{1}{2} x^T G x - c^T x, \tag{2.9.21} \]

where G ∈ R^{n×n} is positive definite. Then, for i = 0, 1, . . . , m, it holds that

\[ H^{(i+1)} \gamma^{(j)} = \delta^{(j)}, \quad j = 0, 1, \ldots, i, \tag{2.9.22} \]

\[ (\delta^{(i)})^T G \delta^{(j)} = 0, \quad j = 0, 1, \ldots, i-1. \tag{2.9.23} \]

Furthermore, the method terminates at the step m + 1 ≤ n. If m = n − 1, then

\[ H^{(n)} = G^{-1}. \tag{2.9.24} \]

Proof. We shall show the validity of (2.9.22) and (2.9.23) by induction. For i = 0, it is clear from the quasi-Newton condition that

\[ H^{(1)} \gamma^{(0)} = \delta^{(0)}. \tag{2.9.25} \]

Thus, (2.9.22) with i = 0 is valid. Now we suppose that (2.9.22) and (2.9.23) are true for i. Then, it is required to show that they are also valid for i + 1. Note that, since the method has not yet terminated,

\[ g^{(i+1)} \ne 0. \tag{2.9.26} \]

Then, by the exact line search condition (2.3.4) and the induction hypothesis, it follows from (2.9.5) that, for j ≤ i,

\[
\begin{aligned}
(g^{(i+1)})^T \delta^{(j)} &= (g^{(j+1)})^T \delta^{(j)} + \sum_{k=j+1}^{i} \left( g^{(k+1)} - g^{(k)} \right)^T \delta^{(j)} \\
&= (g^{(j+1)})^T \delta^{(j)} + \sum_{k=j+1}^{i} (\gamma^{(k)})^T \delta^{(j)} \\
&= 0 + \sum_{k=j+1}^{i} (\delta^{(k)})^T G \delta^{(j)} \\
&= 0.
\end{aligned} \tag{2.9.27}
\]

Note that

\[ \delta^{(i+1)} = x^{(i+2)} - x^{(i+1)} \tag{2.9.28} \]

and

\[ x^{(i+2)} = x^{(i+1)} - \alpha_{i+1} H^{(i+1)} g^{(i+1)}. \tag{2.9.29} \]

Thus,

\[ \delta^{(i+1)} = -\alpha_{i+1} H^{(i+1)} g^{(i+1)}. \tag{2.9.30} \]

Since (2.9.23) with i is valid by the induction hypothesis, it is clear from (2.9.30) and (2.9.27) that

\[
(\delta^{(i+1)})^T G \delta^{(j)} = -\alpha_{i+1} (g^{(i+1)})^T H^{(i+1)} \gamma^{(j)}
= -\alpha_{i+1} (g^{(i+1)})^T \delta^{(j)}
= 0. \tag{2.9.31}
\]

Thus, (2.9.23) with i + 1 is valid. To show the validity of (2.9.22) with i + 1, we need to show that

\[ H^{(i+2)} \gamma^{(j)} = \delta^{(j)}, \quad j = 0, 1, \ldots, i+1. \]

From the quasi-Newton condition (2.9.6), we have

\[ H^{(i+2)} \gamma^{(i+1)} = \delta^{(i+1)}. \tag{2.9.32} \]

For j ≤ i, it follows from (2.9.5) and (2.9.31) that

\[ (\delta^{(i+1)})^T \gamma^{(j)} = (\delta^{(i+1)})^T G \delta^{(j)} = 0. \tag{2.9.33} \]

Now, by the induction hypothesis, (2.9.5) and (2.9.31), we have

\[ (\gamma^{(i+1)})^T H^{(i+1)} \gamma^{(j)} = (\gamma^{(i+1)})^T \delta^{(j)} = (\delta^{(i+1)})^T G \delta^{(j)} = 0. \tag{2.9.34} \]

Thus, multiplying both sides of the DFP formula by γ^{(j)}, we obtain

\[
\begin{aligned}
H^{(i+2)} \gamma^{(j)} &= H^{(i+1)} \gamma^{(j)} + \frac{\delta^{(i+1)} (\delta^{(i+1)})^T \gamma^{(j)}}{(\delta^{(i+1)})^T \gamma^{(i+1)}} - \frac{H^{(i+1)} \gamma^{(i+1)} (\gamma^{(i+1)})^T H^{(i+1)} \gamma^{(j)}}{(\gamma^{(i+1)})^T H^{(i+1)} \gamma^{(i+1)}} \\
&= H^{(i+1)} \gamma^{(j)} + 0 - 0 = \delta^{(j)},
\end{aligned} \tag{2.9.35}
\]

for j = 0, 1, . . . , i + 1. Thus, (2.9.22) is valid.

Now, by (2.9.23), we note that δ^{(i)}, i = 0, 1, . . . , m, are G-conjugate. This means that the directions generated by the DFP method are mutually G-conjugate. Thus, by Theorem 2.8.2, the minimum of the function (2.9.21) is found in at most n iterations. This implies that there exists an m ≤ n − 1 such that the DFP method terminates after m + 1 iterations. For m = n − 1, we have

\[ H^{(n)} \gamma^{(j)} = \delta^{(j)}, \quad j = 0, 1, \ldots, n-1, \]

and hence

\[ H^{(n)} G \delta^{(j)} = \delta^{(j)}, \quad j = 0, 1, \ldots, n-1. \]

From Theorem 2.8.1, we see that δ^{(i)}, i = 0, 1, . . . , m, are linearly independent. Thus,

\[ H^{(n)} = G^{-1}. \]

This completes the proof.
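The quadratic-termination property (2.9.22)–(2.9.24) is easy to observe numerically. The following sketch runs n DFP steps on a randomly generated positive definite G, using the exact line search α_k = −(g^{(k)})^T s^{(k)} / ((s^{(k)})^T G s^{(k)}) (the exact minimizer along a line for a quadratic); the seed and problem size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
G = M @ M.T + n * np.eye(n)          # random symmetric positive definite Hessian
c = rng.standard_normal(n)
grad = lambda x: G @ x - c           # gradient of f(x) = x^T G x / 2 - c^T x

x, H = np.zeros(n), np.eye(n)
for _ in range(n):
    g = grad(x)
    if np.linalg.norm(g) < 1e-12:
        break
    s = -H @ g
    alpha = -(g @ s) / (s @ G @ s)   # exact line search for a quadratic
    x_new = x + alpha * s
    delta, gamma = x_new - x, grad(x_new) - g
    Hg = H @ gamma                   # DFP update (2.9.11)
    H += np.outer(delta, delta) / (delta @ gamma) - np.outer(Hg, Hg) / (gamma @ Hg)
    x = x_new

print(np.allclose(x, np.linalg.solve(G, c)))  # minimizer reached in <= n steps
print(np.allclose(H, np.linalg.inv(G)))       # H^(n) = G^{-1}, cf. (2.9.24)
```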

The next theorem contains the results showing the convergence of the DFP method for a general objective function f : R^n → R. Its proof can be found in [232].

Theorem 2.9.3 Consider the objective function f : R^n → R. Suppose that the following conditions are satisfied:

(a) f is twice continuously differentiable on an open convex set D ⊂ R^n.
(b) There is a strict local minimizer x* ∈ D such that ∇²f(x*) is symmetric and positive definite.
(c) There is a neighbourhood N_ε(x*) of x* such that

\[ \left\| \nabla^2 f(\tilde{x}) - \nabla^2 f(x) \right\| \le K \| \tilde{x} - x \|, \quad \forall x, \tilde{x} \in N_\varepsilon(x^*), \]

where K is a positive constant.

Furthermore, suppose that the following condition is satisfied:

\[ K \left\| \left( \nabla^2 f(x^*) \right)^{-1} \right\| \varphi\!\left( x^{(k)}, x^{(k+1)} \right) \le \frac{1}{3} \]

in N_ε(x*), where

\[ \varphi\!\left( x^{(k)}, x^{(k+1)} \right) = \max\left\{ \left\| x^{(k)} - x^* \right\|, \left\| x^{(k+1)} - x^* \right\| \right\}. \]

Then, the DFP method converges superlinearly.

2.9.3 BFGS Update Formula

We shall present another matrix update formula due to Broyden [32], Fletcher [60], Goldfarb [77] and Shanno [227]. This update formula is known as the BFGS formula. The update matrix H^{(k+1)} is

\[
H^{(k+1)} = H^{(k)} - \frac{\delta^{(k)} (\gamma^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}} H^{(k)} - H^{(k)} \frac{\gamma^{(k)} (\delta^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}}
+ \frac{(\gamma^{(k)})^T H^{(k)} \gamma^{(k)}}{\left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2}\, \delta^{(k)} (\delta^{(k)})^T
+ \frac{\delta^{(k)} (\delta^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}}. \tag{2.9.36}
\]

This is obtained by approximating G^{(k)} by B^{(k+1)}. Then, H^{(k+1)} is chosen such that H^{(k+1)} = (B^{(k+1)})^{-1}. To begin the derivation, we note that the quasi-Newton formula (2.9.6) is changed to

\[ \gamma^{(k)} = B^{(k+1)} \delta^{(k)}. \tag{2.9.37} \]

We update B^{(k)} by a rank two correction as in DFP. Thus,

\[ B^{(k+1)} = B^{(k)} + \frac{\gamma^{(k)} (\gamma^{(k)})^T}{(\gamma^{(k)})^T \delta^{(k)}} - \frac{B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)}}. \tag{2.9.38} \]

This is the DFP formula with B replacing H and δ swapped with γ. The BFGS formula comes about by choosing H^{(k+1)} such that

\[ B^{(k+1)} H^{(k+1)} = I. \tag{2.9.39} \]

The BFGS method has all the properties that the DFP method has. Furthermore, BFGS tends to do better than DFP for low accuracy line searches. In addition, if the conditions

\[ f(x^{(k)}) - f(x^{(k+1)}) \ge -\rho\, (g^{(k)})^T \delta^{(k)} \tag{2.9.40} \]

and

\[ (g^{(k+1)})^T s^{(k)} \ge \sigma (g^{(k)})^T s^{(k)}, \quad \text{for } 0 < \rho \le \sigma, \tag{2.9.41} \]

hold in an inexact line search, then the BFGS method is globally convergent.
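In code, the inverse update (2.9.36) is most conveniently written in the equivalent product form derived below in (2.9.59); as before, the sketch assumes (δ^{(k)})^T γ^{(k)} > 0.

```python
import numpy as np

def bfgs_update(H, delta, gamma):
    # Inverse BFGS formula (2.9.36), in the product form of (2.9.59):
    # H+ = (I - d g^T / d^T g) H (I - g d^T / d^T g) + d d^T / d^T g.
    dg = delta @ gamma
    V = np.eye(delta.size) - np.outer(delta, gamma) / dg
    return V @ H @ V.T + np.outer(delta, delta) / dg
```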
To derive the BFGS formula (2.9.36), we need some preliminary results.
The first of these is the well-known result on the inversion of block matrices
[109].
Lemma 2.9.1 Let A ∈ R^{n×n}, B ∈ R^{m×m}, C ∈ R^{m×n}, D ∈ R^{n×m} and Δ = B − C A^{-1} D. Suppose that A^{-1} and Δ^{-1} exist. Then

\[
\begin{bmatrix} A & D \\ C & B \end{bmatrix}^{-1}
= \begin{bmatrix} A^{-1} + E \Delta^{-1} F & -E \Delta^{-1} \\ -\Delta^{-1} F & \Delta^{-1} \end{bmatrix},
\]

where E = A^{-1} D and F = C A^{-1}.

The next result is from [275] but the origin of this result can be traced
back to [15].

Lemma 2.9.2 Let A ∈ R^{n×n} be non-singular and let u, v ∈ R^n be such that 1 + v^T A^{-1} u ≠ 0. Then, A + u v^T is non-singular. Furthermore,

\[ \left( A + u v^T \right)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}. \tag{2.9.42} \]

Proof. Let

\[
\begin{bmatrix} 1 & v^T A^{-1} \\ 0 & I \end{bmatrix}
\begin{bmatrix} 1 & -v^T \\ u & A \end{bmatrix}
= \begin{bmatrix} 1 + v^T A^{-1} u & 0 \\ u & A \end{bmatrix} = W_1 \tag{2.9.43}
\]

and

\[
\begin{bmatrix} 1 & 0 \\ -u & I \end{bmatrix}
\begin{bmatrix} 1 & -v^T \\ u & A \end{bmatrix}
= \begin{bmatrix} 1 & -v^T \\ 0 & A + u v^T \end{bmatrix} = W_2. \tag{2.9.44}
\]

By Lemma 2.9.1, we see that the inverses of W_1 and W_2 are, respectively, given by

\[
W_1^{-1} = \begin{bmatrix} \dfrac{1}{1 + v^T A^{-1} u} & 0 \\ \dfrac{-A^{-1} u}{1 + v^T A^{-1} u} & A^{-1} \end{bmatrix} \tag{2.9.45}
\]

and

\[
W_2^{-1} = \begin{bmatrix} 1 & v^T (A + u v^T)^{-1} \\ 0 & (A + u v^T)^{-1} \end{bmatrix}. \tag{2.9.46}
\]

From (2.9.44) and (2.9.43), we obtain

\[
W_2^{-1} = \begin{bmatrix} 1 & -v^T \\ u & A \end{bmatrix}^{-1} \begin{bmatrix} 1 & 0 \\ u & I \end{bmatrix} \tag{2.9.47}
\]

and

\[
W_1^{-1} = \begin{bmatrix} 1 & -v^T \\ u & A \end{bmatrix}^{-1} \begin{bmatrix} 1 & v^T A^{-1} \\ 0 & I \end{bmatrix}^{-1}, \tag{2.9.48}
\]

respectively. Multiplying (2.9.48) on the right by the matrix \(\begin{bmatrix} 1 & v^T A^{-1} \\ 0 & I \end{bmatrix}\) yields

\[
\begin{bmatrix} 1 & -v^T \\ u & A \end{bmatrix}^{-1}
= W_1^{-1} \begin{bmatrix} 1 & v^T A^{-1} \\ 0 & I \end{bmatrix}. \tag{2.9.49}
\]

Substituting (2.9.49) into (2.9.47), and then using (2.9.45), we obtain

\[
W_2^{-1} = W_1^{-1} \begin{bmatrix} 1 & v^T A^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} 1 & 0 \\ u & I \end{bmatrix}
= \begin{bmatrix} 1 & \dfrac{v^T A^{-1}}{1 + v^T A^{-1} u} \\ 0 & A^{-1} - \dfrac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u} \end{bmatrix}. \tag{2.9.50}
\]

Equating (2.9.46) and (2.9.50) gives

\[ \left( A + u v^T \right)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}. \]

This completes the proof.
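Formula (2.9.42) is the well-known Sherman-Morrison identity, and it is easily checked numerically; the random data below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)   # comfortably non-singular
u, v = rng.standard_normal(4), rng.standard_normal(4)

Ainv = np.linalg.inv(A)
denom = 1.0 + v @ Ainv @ u                           # assumed non-zero
lhs = np.linalg.inv(A + np.outer(u, v))
rhs = Ainv - np.outer(Ainv @ u, v @ Ainv) / denom    # right side of (2.9.42)
print(np.allclose(lhs, rhs))                         # True
```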
We now return to derive the inverse, H^{(k+1)}_{BFGS}, of B^{(k+1)}_{BFGS}. First, define

\[ T = B^{(k)} + \frac{\gamma^{(k)} (\gamma^{(k)})^T}{(\gamma^{(k)})^T \delta^{(k)}}. \tag{2.9.51} \]

Then, by Lemma 2.9.2, we obtain

\[
T^{-1} = H^{(k)} - \frac{H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{(\delta^{(k)})^T \gamma^{(k)} + (\gamma^{(k)})^T H^{(k)} \gamma^{(k)}}
= H^{(k)} - \frac{H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{\omega}, \tag{2.9.52}
\]

where

\[ H^{(k)} = \left( B^{(k)} \right)^{-1} \tag{2.9.53} \]

and

\[ \omega = (\delta^{(k)})^T \gamma^{(k)} + (\gamma^{(k)})^T H^{(k)} \gamma^{(k)}. \tag{2.9.54} \]

Thus, by (2.9.38), it is clear from Lemma 2.9.2 that

\[
H^{(k+1)}_{BFGS} = \left( B^{(k+1)}_{BFGS} \right)^{-1}
= \left[ T - \frac{B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)}} \right]^{-1}
= T^{-1} + \frac{T^{-1} B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)} T^{-1}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)} - (\delta^{(k)})^T B^{(k)} T^{-1} B^{(k)} \delta^{(k)}}. \tag{2.9.55}
\]

From (2.9.55), we have

\[
\begin{aligned}
&(\delta^{(k)})^T B^{(k)} \delta^{(k)} - (\delta^{(k)})^T B^{(k)} T^{-1} B^{(k)} \delta^{(k)} \\
&\quad = (\delta^{(k)})^T B^{(k)} \delta^{(k)} - (\delta^{(k)})^T B^{(k)} \left[ H^{(k)} - \frac{H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{\omega} \right] B^{(k)} \delta^{(k)} \\
&\quad = \frac{(\delta^{(k)})^T \gamma^{(k)} (\gamma^{(k)})^T \delta^{(k)}}{\omega}
= \frac{\left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2}{\omega}.
\end{aligned} \tag{2.9.56}
\]

Now, by (2.9.55) and (2.9.56), it follows from simple algebraic manipulations that

\[
\begin{aligned}
&\frac{T^{-1} B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)} T^{-1}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)} - (\delta^{(k)})^T B^{(k)} T^{-1} B^{(k)} \delta^{(k)}}
= \omega\, \frac{T^{-1} B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)} T^{-1}}{\left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2} \\
&\quad = \frac{\left( \omega I - H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T \right) \delta^{(k)} (\delta^{(k)})^T \left( \omega I - \gamma^{(k)} (\gamma^{(k)})^T H^{(k)} \right)}{\omega \left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2} \\
&\quad = \frac{\omega^2 \delta^{(k)} (\delta^{(k)})^T + H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T \delta^{(k)} (\delta^{(k)})^T \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{\omega \left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2} \\
&\qquad - \frac{\omega H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T \delta^{(k)} (\delta^{(k)})^T + \omega \delta^{(k)} (\delta^{(k)})^T \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{\omega \left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2} \\
&\quad = \omega\, \frac{\delta^{(k)} (\delta^{(k)})^T}{\left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2} + \frac{H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{\omega}
- \frac{H^{(k)} \gamma^{(k)} (\delta^{(k)})^T + \delta^{(k)} (\gamma^{(k)})^T H^{(k)}}{(\delta^{(k)})^T \gamma^{(k)}}.
\end{aligned} \tag{2.9.57}
\]

Using (2.9.54), we obtain

\[
\omega\, \frac{\delta^{(k)} (\delta^{(k)})^T}{\left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2}
= \frac{\left[ (\delta^{(k)})^T \gamma^{(k)} + (\gamma^{(k)})^T H^{(k)} \gamma^{(k)} \right] \delta^{(k)} (\delta^{(k)})^T}{\left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2}. \tag{2.9.58}
\]

Now, by (2.9.55), (2.9.52), (2.9.57) and (2.9.58), it follows that

\[
\begin{aligned}
H^{(k+1)}_{BFGS} &= H^{(k)} - \frac{H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{\omega}
+ \omega\, \frac{\delta^{(k)} (\delta^{(k)})^T}{\left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2}
+ \frac{H^{(k)} \gamma^{(k)} (\gamma^{(k)})^T H^{(k)}}{\omega} \\
&\qquad - \frac{H^{(k)} \gamma^{(k)} (\delta^{(k)})^T + \delta^{(k)} (\gamma^{(k)})^T H^{(k)}}{(\delta^{(k)})^T \gamma^{(k)}} \\
&= H^{(k)} + \frac{\left[ (\delta^{(k)})^T \gamma^{(k)} + (\gamma^{(k)})^T H^{(k)} \gamma^{(k)} \right] \delta^{(k)} (\delta^{(k)})^T}{\left[ (\delta^{(k)})^T \gamma^{(k)} \right]^2}
- \frac{\delta^{(k)} (\gamma^{(k)})^T H^{(k)} + H^{(k)} \gamma^{(k)} (\delta^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}} \\
&= \left( I - \frac{\delta^{(k)} (\gamma^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}} \right) H^{(k)} \left( I - \frac{\gamma^{(k)} (\delta^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}} \right)
+ \frac{\delta^{(k)} (\delta^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}}.
\end{aligned} \tag{2.9.59}
\]

This completes the derivation of the BFGS formula.


The next theorem contains the results on the convergence of the BFGS
method. Its proof can be found in [232].
Theorem 2.9.4 Consider the objective function f : R^n → R. Suppose that the following conditions are satisfied:

(a) f is twice continuously differentiable on an open convex set D ⊂ R^n.
(b) There is a strong local minimizer x* ∈ D such that ∇²f(x*) is symmetric and positive definite.
(c) There is a neighbourhood N_ε(x*) of x* such that

\[ \left\| \nabla^2 f(\tilde{x}) - \nabla^2 f(x) \right\| \le K_1 \| \tilde{x} - x \|, \quad \forall x, \tilde{x} \in N_\varepsilon(x^*), \]

where K_1 is a positive constant.

Furthermore, for a positive constant K_2, suppose that the following condition is satisfied:

\[ K_2 \left\| \left( \nabla^2 f(x^*) \right)^{-1} \right\| \varphi\!\left( x^{(k)}, x^{(k+1)} \right) \le \frac{1}{3} \]

in N_ε(x*), where

\[ \varphi\!\left( x^{(k)}, x^{(k+1)} \right) = \max\left\{ \left\| x^{(k)} - x^* \right\|, \left\| x^{(k+1)} - x^* \right\| \right\}. \]

Then, the sequence {x^{(k)}} generated by the BFGS method converges to x* superlinearly.
In closing the chapter, we make some comments on the conjugate gradient methods and quasi-Newton methods in the following remark.

Remark 2.9.4 If n is large, we may have problems storing the approximation to the inverse of the Hessian in quasi-Newton methods. Thus, conjugate gradient methods are preferred. On the other hand, if n is not too large, then quasi-Newton methods tend to perform better.
Chapter 3
Constrained Mathematical Programming

3.1 Introduction

The general constrained mathematical programming problem is to find an


x ∈ Rn to minimize the objective function

f (x) (3.1.1)

subject to the constraints

hi (x) = 0, i = 1, . . . , m, (3.1.2)
hi (x) ≤ 0, i = m + 1, . . . , m + r, (3.1.3)

where f and hi , i = 1, . . . , m + r, are continuously differentiable functions of


the n-vector variable x.
As mentioned in Chapter 2, there are already many excellent books on
nonlinear optimization. For example, see [22], [61], [71], [172], [232] and [275].
In this chapter, we summarize some essential concepts and results in nonlinear
constrained optimization methods. As for Chapter 2, this chapter is based on
the lecture notes on optimization prepared and used by the authors. These
lecture notes have also been used by their colleagues. In addition to these
lecture notes, this chapter includes also some important results from those
references mentioned above and in Section 2.1 of Chapter 2.
Definition 3.1.1 A point x is said to be feasible if it satisfies the con-
straints (3.1.2) and (3.1.3). The set of all feasible points is called the feasible
region (or feasible set) of the constraints. Let Ω denote the feasible region (or
feasible set) throughout this chapter.

Definition 3.1.2 The j-th inequality constraint, j ∈ {m + 1, m + 2, . . . , m +


r}, is said to be active at x∗ if hj (x∗ ) = 0.


Definition 3.1.3 Let x∗ be a given point in Rn . Then the active set, J (x∗ ),
of inequality constraints at x∗ is the set of indices corresponding to all those
inequality constraints that are active, i.e.,

J (x∗ ) = {j ∈ {m + 1, m + 2, . . . , m + r} : hj (x∗ ) = 0} . (3.1.4)

Definition 3.1.4 The point x∗ is said to be a regular point of the con-


straints (3.1.2)-(3.1.3) if x∗ satisfies all of the constraints and if the gradients
of the equality and active inequality constraints

{∇x hj (x∗ ) , j ∈ {1, . . . , m} ∪ J (x∗ )} (3.1.5)

are linearly independent, where



\[ \nabla_x h_j(x^*) = \left. \frac{\partial h_j(x)}{\partial x} \right|_{x = x^*}, \quad j = 1, 2, \ldots, m+r. \]

Remark 3.1.1 Note that the condition stated in Definition 3.1.4 is known
as constraint qualification.

Definition 3.1.5 The Lagrangian of the constrained optimization problem


(3.1.1)–(3.1.3) is defined by


\[ L(x, \lambda) = f(x) + \sum_{j=1}^{m} \lambda_j h_j(x) + \sum_{j=m+1}^{m+r} \lambda_j h_j(x), \tag{3.1.6} \]

where λ = [λ_1, λ_2, . . . , λ_{m+r}]^T is the vector of Lagrange multipliers.

Definition 3.1.6 A point x* ∈ Ω is said to be a relative minimum point or a local minimum point of f over the feasible region Ω if there exists an ε > 0 such that f(x) ≥ f(x*) for all x ∈ N_ε(x*), where

\[ N_\varepsilon(x^*) = \{ x \in \Omega : \| x - x^* \| < \varepsilon \}. \]

If f(x) > f(x*) for all x ∈ N_ε(x*) such that x ≠ x*, then x* is said to be a strict local minimum point of f over the feasible region Ω.

We now state the well-known Karush-Kuhn-Tucker theorem without proof.


Theorem 3.1.1 (First Order Necessary Optimality Condition) Let x* be a local minimum point of Problem (3.1.1)–(3.1.3). If it is also a regular point of the constraints (3.1.2)–(3.1.3), then there exist λ*_j, j = 1, 2, . . . , m + r, not all equal to zero, such that

\[ \nabla_x L(x^*, \lambda^*) = \left. \frac{\partial L(x, \lambda^*)}{\partial x} \right|_{x = x^*} = 0^T, \tag{3.1.7} \]

\[ h_j(x^*) = 0, \quad j = 1, 2, \ldots, m, \tag{3.1.8a} \]
\[ h_j(x^*) \le 0, \quad j = m+1, m+2, \ldots, m+r, \tag{3.1.8b} \]
\[ \lambda_j^* h_j(x^*) = 0, \quad \lambda_j^* \ge 0, \quad j = m+1, m+2, \ldots, m+r. \tag{3.1.9} \]

In view of condition (3.1.9), we note that if the j-th inequality constraint


is inactive, then λ∗j = 0; and conversely, if λ∗j > 0, then the j-th inequality
constraint must be active.
In what follows, we assume that f and hj , j = 1, 2, . . . , m + r, are twice
continuously differentiable.
Theorem 3.1.2 (Second Order Sufficient Optimality Condition) The point x* is a local minimum of Problem (3.1.1)–(3.1.3) if the conditions (3.1.7)–(3.1.9) are satisfied and, in addition, the Hessian, H* = ∇_{xx} L(x*, λ*), of the Lagrangian satisfies

\[ y^T H^* y > 0, \quad \forall y \in M^*, \ y \ne 0, \tag{3.1.10} \]

where ∇_{xx} = (∇_x)^T ∇_x,

\[ M^* = \{ y \in \mathbb{R}^n : \nabla_x h_j(x^*)\, y = 0, \ j \in \{1, \ldots, m\} \cup J_+(x^*) \} \tag{3.1.11} \]

and

\[ J_+(x^*) = \{ j \in J(x^*) : \lambda_j^* > 0 \}. \tag{3.1.12} \]

If the Lagrange multipliers corresponding to all active inequality con-


straints are strictly positive (i.e., J+ (x∗ ) = J (x∗ )), then M ∗ is a subspace,
which is the tangent plane of the active constraints (including active inequal-
ity constraints and equality constraints). In this situation, we note, by virtue
of condition (3.1.10), that the Hessian H ∗ of the Lagrangian must be positive
definite on this subspace.
To solve large-scale constrained optimization problems, efficient numeri-
cal algorithms are required. There are many such algorithms, and each of
them is efficient in its respective area of application. We refer the reader
to several classics, such as [22], [54], [61], [71], [172], [187], [232] and [275]
for detailed treatments of computational aspects of constrained optimization
problems. We shall, however, elaborate somewhat on the method of sequential
quadratic programming (see [61], [221], [222], [232] and [275]) in this chap-
ter. This method has been recognized as one of the most efficient methods
for solving small and medium size constrained optimization problems (see
[222] and [232]). As the name suggests, the method of sequential quadratic
programming computes the optimal solution as a sequence of quadratic pro-
gramming problems. Each quadratic programming problem is solved, yielding
the optimum search direction based on analytical gradient information for the
objective and constraint functions. The quadratic programming problem to
be solved at each step involves both linear equality and linear inequality con-
straints. In the next section, we discuss some basic techniques for solving a
quadratic programming problem with only linear equality constraints. The so-
lution of quadratic programming problems with both linear equality and lin-
ear inequality constraints via the active set strategy is outlined in Section 3.3.
In Section 3.4, we summarize a constrained quasi-Newton method for solving
a general linearly constrained optimization problem. In Section 3.5, we sum-
marize the essential steps required in the sequential quadratic programming
algorithm, making use of the materials outlined in Sections 3.2–3.4.

3.2 Quadratic Programming with Linear Equality Constraints

In this section, we consider a general class of quadratic optimization problems with linear equality constraints:

\[ \text{minimize } f(x) = \frac{1}{2} x^T G x + c^T x \tag{3.2.1} \]

subject to

\[ h(x) = A x - b = 0, \tag{3.2.2} \]

where x ∈ R^n is the decision vector, c ∈ R^n, G = G^T ∈ R^{n×n}, A ∈ R^{m×n}, b ∈ R^m, m < n and rank(A) = m. This problem is referred to as Problem (QPE).

Note that Problem (QPE) without the linear constraint (3.2.2) is an unconstrained quadratic programming problem. It can be solved readily if the Hessian of f (which is equal to G) is positive definite. The first order necessary condition for optimality is that the gradient vanishes, i.e.,

\[ \nabla_x f(x) = \frac{\partial f(x)}{\partial x} = x^T G + c^T = 0^T. \tag{3.2.3} \]

If the Hessian ∇_{xx} f = G is positive definite, then, by solving (3.2.3), we obtain the unique minimum solution:

\[ x^* = -G^{-1} c. \tag{3.2.4} \]

There are many ways of solving Problem (QPE). We consider three of


these: (1) direct elimination, (2) generalized elimination and (3) the method
of Lagrange multipliers.
The direct elimination method seeks to remove the linear constraints by
solving for some m independent variables in terms of the remaining n − m
variables. This subsequently reduces the problem to an unconstrained one
that can be solved as shown in Chapter 2. Since A is of full rank, we can
rearrange the order of the variables such that the first m columns of A are
linearly independent, i.e.,
\[ A = [A_1 \,|\, A_2], \tag{3.2.5} \]

where A_1 ∈ R^{m×m} is non-singular. Let

\[ x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \]

where x_1 ∈ R^m and x_2 ∈ R^{n−m}. Then, by virtue of (3.2.2), we have

\[ x_1 = A_1^{-1} (b - A_2 x_2). \tag{3.2.6} \]

Now we can partition the matrix G and the vector c accordingly:

\[ G = \begin{bmatrix} G_{11} & G_{12} \\ G_{12}^T & G_{22} \end{bmatrix} \quad \text{and} \quad c = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}, \tag{3.2.7} \]

where G_{11} ∈ R^{m×m}, G_{12} ∈ R^{m×(n−m)}, G_{22} ∈ R^{(n−m)×(n−m)}, c_1 ∈ R^m and c_2 ∈ R^{n−m}. Substitution of (3.2.6) and (3.2.7) into (3.2.1) yields

\[
\begin{aligned}
f(x) = \bar{f}(x_2)
&= \frac{1}{2} x_2^T \left[ G_{22} + A_2^T (A_1^{-1})^T G_{11} A_1^{-1} A_2 - A_2^T (A_1^{-1})^T G_{12} - G_{12}^T A_1^{-1} A_2 \right] x_2 \\
&\quad + \left[ b^T (A_1^{-1})^T G_{12} - b^T (A_1^{-1})^T G_{11} A_1^{-1} A_2 + c_2^T - c_1^T A_1^{-1} A_2 \right] x_2 \\
&\quad + c_1^T A_1^{-1} b + \frac{1}{2} b^T (A_1^{-1})^T G_{11} A_1^{-1} b.
\end{aligned} \tag{3.2.8}
\]

The first order necessary condition for stationarity of \bar{f}(x_2) shows that

\[
x_2^* = -\tilde{G}^{-1} \left[ G_{12}^T A_1^{-1} b - A_2^T (A_1^{-1})^T G_{11} A_1^{-1} b + c_2 - A_2^T (A_1^{-1})^T c_1 \right], \tag{3.2.9}
\]

where

\[
\tilde{G} = G_{22} + A_2^T (A_1^{-1})^T G_{11} A_1^{-1} A_2 - A_2^T (A_1^{-1})^T G_{12} - G_{12}^T A_1^{-1} A_2. \tag{3.2.10}
\]

By the second order sufficient condition, x_2^* will be a minimizing (both locally and globally) solution of \bar{f} if \tilde{G} is positive definite. However, this assumption is not necessarily satisfied in general. The remaining part of the solution is constructed from (3.2.6) by substitution of (3.2.9).
The direct elimination method is easy to understand, but it requires a
significant amount of computation. The method of generalized elimination
seeks to reduce this computational effort somewhat. Let E and F be real
matrices of dimension n × m and n × (n − m), respectively, such that

\[ [E \,|\, F] \ \text{is non-singular}, \tag{3.2.11} \]

\[ A E = I_m \tag{3.2.12} \]

and

\[ A F = 0. \tag{3.2.13} \]

Note that E and F are not necessarily unique. In practice, E and F may be obtained by first selecting any matrix Q ∈ R^{n×(n−m)} such that [A^T | Q] is non-singular. Defining

\[ [E \,|\, F] = \left( [A^T \,|\, Q]^T \right)^{-1}, \tag{3.2.14} \]

it is then easy to verify that E and F satisfy (3.2.11), (3.2.12) and (3.2.13). Equation (3.2.13) implies that the columns of F are basis vectors for the null space of A, defined by {x ∈ R^n : Ax = 0}. The general solution of (3.2.2) can then be written as

\[ x = E b + F y, \tag{3.2.15} \]

where y ∈ R^{n−m} is arbitrary. Substitution of (3.2.15) into (3.2.1) yields

\[ f(x) = \hat{f}(y) = \frac{1}{2} y^T F^T G F y + (c + G E b)^T F y + \left( c + \frac{1}{2} G E b \right)^T E b. \tag{3.2.16} \]

Clearly, if F^T G F is positive definite, then the unique minimizer y* is given by

\[ y^* = -\left( F^T G F \right)^{-1} F^T (c + G E b). \tag{3.2.17} \]

The matrix F^T G F is referred to as the reduced Hessian matrix, while the vector F^T (c + G E b) is referred to as the reduced gradient. x* can then be easily obtained from (3.2.15).
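As a concrete illustration of generalized elimination, the sketch below builds one particular pair (E, F) from a complete QR factorization of A^T (one convenient choice; E and F are not unique) and then solves the reduced system (3.2.17). The function name and the use of numpy are implementation assumptions for the illustration.

```python
import numpy as np

def qpe_nullspace(G, c, A, b):
    m, n = A.shape
    Q, R = np.linalg.qr(A.T, mode="complete")   # A^T = Q [R1; 0]
    E = Q[:, :m] @ np.linalg.inv(R[:m, :]).T    # then A E = I_m, cf. (3.2.12)
    F = Q[:, m:]                                # and A F = 0, cf. (3.2.13)
    y = -np.linalg.solve(F.T @ G @ F,           # reduced Hessian F^T G F
                         F.T @ (c + G @ E @ b)) # reduced gradient, (3.2.17)
    return E @ b + F @ y                        # x* via (3.2.15)
```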
The third method employs the idea of Lagrange multipliers. The Lagrangian function for the constrained problem (3.2.1) and (3.2.2) is

\[ L(x, \lambda) = f(x) + \lambda^T (A x - b). \tag{3.2.18} \]

Let x* and λ* denote the optimal values of x and λ, respectively. Then the necessary conditions for optimality require that the gradients of L with respect to both x and λ vanish at x = x* and λ = λ*, i.e.,

\[ \nabla_x L(x^*, \lambda^*) = (x^*)^T G + c^T + (\lambda^*)^T A = 0^T, \tag{3.2.19} \]
\[ \nabla_\lambda L(x^*, \lambda^*) = (A x^* - b)^T = 0^T. \tag{3.2.20} \]

Equations (3.2.19) and (3.2.20) constitute a set of linear simultaneous equations that can be expressed as

\[ \begin{bmatrix} G & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} x^* \\ \lambda^* \end{bmatrix} = \begin{bmatrix} -c \\ b \end{bmatrix}. \tag{3.2.21} \]

If G is positive definite and A has full rank, then it follows that

\[ \begin{bmatrix} G & A^T \\ A & 0 \end{bmatrix} \]

is non-singular and its inverse is given by

\[ \begin{bmatrix} G & A^T \\ A & 0 \end{bmatrix}^{-1} = \begin{bmatrix} H & P^T \\ P & S \end{bmatrix}, \tag{3.2.22} \]

where

\[ H = G^{-1} - G^{-1} A^T \left( A G^{-1} A^T \right)^{-1} A G^{-1}, \tag{3.2.23} \]
\[ P = \left( A G^{-1} A^T \right)^{-1} A G^{-1}, \tag{3.2.24} \]
\[ S = -\left( A G^{-1} A^T \right)^{-1}. \tag{3.2.25} \]

The solution of Problem (QPE) may then be expressed as

\[ x^* = -H c + P^T b, \tag{3.2.26} \]
\[ \lambda^* = -P c + S b. \tag{3.2.27} \]

If we substitute (3.2.23) and (3.2.24) into (3.2.26), we have

\[ x^* = -G^{-1} c + G^{-1} A^T \left( A G^{-1} A^T \right)^{-1} \left[ A G^{-1} c + b \right], \tag{3.2.28} \]

which is a much more elegant form than that obtained by the direct elimination method in (3.2.6) and (3.2.9). The corresponding Lagrange multipliers, λ*, may be obtained similarly.
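Numerically, one would rarely form the inverse (3.2.22); instead, the symmetric linear system (3.2.21) is assembled and solved directly, as in the following sketch.

```python
import numpy as np

def qpe_kkt(G, c, A, b):
    # Solve Problem (QPE) via the KKT system (3.2.21):
    # [G A^T; A 0] [x; lam] = [-c; b].
    n, m = G.shape[0], A.shape[0]
    K = np.block([[G, A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-c, b]))
    return sol[:n], sol[n:]      # x* and lambda*
```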
Finally, note that x* and λ* can also be generated by first finding any solution x̄ that satisfies the constraints, i.e.,

\[ A \bar{x} = b. \tag{3.2.29} \]

Then

\[ x^* = \bar{x} - H p \tag{3.2.30} \]

and

\[ \lambda^* = -P p, \tag{3.2.31} \]

where

\[ p = (\nabla_x f(\bar{x}))^T = G \bar{x} + c, \tag{3.2.32} \]

because

\[
\begin{aligned}
\bar{x} - H p &= \bar{x} - H (G \bar{x} + c) \\
&= \bar{x} - H G \bar{x} - H c \\
&= \bar{x} - \left[ G^{-1} - G^{-1} A^T (A G^{-1} A^T)^{-1} A G^{-1} \right] G \bar{x} - H c \quad \text{(from (3.2.23))} \\
&= \bar{x} - \bar{x} + G^{-1} A^T (A G^{-1} A^T)^{-1} A \bar{x} - H c \\
&= G^{-1} A^T (A G^{-1} A^T)^{-1} b - H c \quad \text{(from (3.2.29))} \\
&= -H c + P^T b \quad \text{(from (3.2.24))} \\
&= x^* \quad \text{(from (3.2.26))}
\end{aligned}
\]

and

\[
\begin{aligned}
-P p &= -P (G \bar{x} + c) \\
&= -P G \bar{x} - P c \\
&= -(A G^{-1} A^T)^{-1} A G^{-1} G \bar{x} - P c \quad \text{(from (3.2.24))} \\
&= -(A G^{-1} A^T)^{-1} b - P c \quad \text{(from (3.2.29))} \\
&= -P c + S b \quad \text{(from (3.2.25))} \\
&= \lambda^* \quad \text{(from (3.2.27)).}
\end{aligned}
\]

3.3 Quadratic Programming via Active Set Strategy

Problem (QPE) only involves linear equality constraints. In this section, let
us consider the following general quadratic programming problem that also
involves linear inequality constraints.
\[ \text{minimize } f(x) = \frac{1}{2} x^T G x + c^T x \tag{3.3.1a} \]

subject to

\[ h_i(x) = a_i^T x - b_i = 0, \quad i \in E, \tag{3.3.1b} \]
\[ h_i(x) = a_i^T x - b_i \le 0, \quad i \in I. \tag{3.3.1c} \]

This problem is referred to as Problem (QP), where the constraints have been partitioned into two sets: E = {1, 2, . . . , m} is the set of indices corresponding to the equality constraints and I = {m + 1, m + 2, . . . , m + r} is the set of indices corresponding to the inequality constraints. Since the objective function is quadratic, the Hessian G is constant. The constraints are linear, so their gradients, ∇_x h_i(x) = a_i^T, i = 1, 2, . . . , m + r, are also constant. If G is positive definite, the problem is convex. In this case, the first order Karush-Kuhn-Tucker conditions (3.1.7)–(3.1.9) are both necessary and sufficient. Thus, there exists a vector, λ*, of Lagrange multipliers, not all equal to zero, such that

\[ G x^* + c = -\sum_{i \in E \cup I} \lambda_i^* a_i, \tag{3.3.2} \]
\[ h_i(x^*) = 0, \quad i \in E, \tag{3.3.3} \]
\[ h_i(x^*) \le 0, \quad i \in I, \tag{3.3.4} \]
\[ \lambda_i^* h_i(x^*) = 0, \quad i \in I, \tag{3.3.5} \]
\[ \lambda_i^* \ge 0, \quad i \in I. \tag{3.3.6} \]

Due to the existence of inequality constraints, Problem (QP) cannot be solved by any elimination method. Instead, it is solved as a sequence of Problems (QPE) by means of the active set strategy. To begin, we require some definitions. A point x is said to be a feasible solution of Problem (QP) if it satisfies the constraints (3.3.1b) and (3.3.1c). Since all of these constraints are linear, it is easy to compute a feasible solution using the first phase of the simplex algorithm for linear programming. The active set strategy is an iterative process. In the following definitions, the superscript (k) refers to a quantity at the k-th iteration of this process.
The active set strategy is an iterative process. In the following definitions,
the superscript (k) refers to a quantity at the k-th iteration of this process.
Definition 3.3.1 The active set, 𝒜^{(k)}, is defined by

\[ \mathcal{A}^{(k)} = \left\{ j \in E \cup I : h_j(x^{(k)}) = a_j^T x^{(k)} - b_j = 0 \right\}, \tag{3.3.7} \]

where x^{(k)} is the k-th iterate solution.

Definition 3.3.2 The active constraint matrix, A^{(k)}, is defined by

\[ A^{(k)} = \text{matrix formed by the columns } \left( \nabla_x h_i(x^{(k)}) \right)^T,\ i \in \mathcal{A}^{(k)}, = \left[ a_i,\ i \in \mathcal{A}^{(k)} \right]. \tag{3.3.8} \]

Remark 3.3.1 In view of (3.3.1a)–(3.3.1c), we note that

\[ f(x^{(k)} + d) = f(x^{(k)}) + d^T \left( G x^{(k)} + c \right) + \frac{1}{2} d^T G d \]

and

\[ h_i(x^{(k)} + d) = a_i^T \left( x^{(k)} + d \right) - b_i = a_i^T d + h_i(x^{(k)}). \]
The active set strategy can be summarized in the following algorithm.
Algorithm 3.3.1

Step 1. Set k = 0. Select an initial feasible solution x^{(0)} and identify the corresponding active set 𝒜^{(0)}.

Step 2. Compute the search direction d^{(k)} by solving the problem

\[ \text{minimize } \frac{1}{2} d^T G d + d^T \left( G x^{(k)} + c \right) \tag{3.3.9a} \]
\[ \text{subject to } a_i^T d = 0, \quad i \in \mathcal{A}^{(k)}. \tag{3.3.9b} \]

If d = 0 solves problem (3.3.9), go to Step 3. Otherwise, go to Step 4.

Step 3. Since d = 0 solves problem (3.3.9), x^{(k)} solves the problem

\[ \text{minimize } \frac{1}{2} x^T G x + c^T x \tag{3.3.10a} \]
\[ \text{subject to } a_i^T x = b_i, \quad i \in \mathcal{A}^{(k)}. \tag{3.3.10b} \]

Compute the corresponding Lagrange multiplier vector (see (3.3.2)):

\[ \lambda^{(k)} = \left[ \lambda_i^{(k)},\ i \in \mathcal{A}^{(k)} \right]^T \tag{3.3.11} \]

by solving

\[ -A^{(k)} \lambda^{(k)} = G x^{(k)} + c, \tag{3.3.12} \]

where A^{(k)} is the active constraint matrix defined by (3.3.8). Select j such that

\[ \lambda_j^{(k)} = \min_{i \in \mathcal{A}^{(k)} \cap I} \lambda_i^{(k)}. \tag{3.3.13} \]

If λ_j^{(k)} ≥ 0, terminate with x* = x^{(k)}. Otherwise, set

\[ \mathcal{A}^{(k)} := \mathcal{A}^{(k)} \setminus \{j\} \tag{3.3.14} \]

and go to Step 2.

Step 4. Let d^{(k)} = d ≠ 0 be the solution to problem (3.3.9). Then, compute α^{(k)} according to α^{(k)} = min{1, ᾱ^{(k)}}, where

\[ \bar{\alpha}^{(k)} = \min_{\substack{i \in I \setminus \mathcal{A}^{(k)} \\ a_i^T d^{(k)} > 0}} \frac{b_i - a_i^T x^{(k)}}{a_i^T d^{(k)}}, \tag{3.3.15} \]

and set

\[ x^{(k+1)} = x^{(k)} + \alpha^{(k)} d^{(k)}. \tag{3.3.16} \]

If α^{(k)} < 1, set

\[ \mathcal{A}^{(k+1)} = \mathcal{A}^{(k)} \cup \{p\}, \tag{3.3.17} \]

where p ∈ I \ 𝒜^{(k)} is the index that achieves the minimum in (3.3.15). Otherwise, if α^{(k)} = 1, set 𝒜^{(k+1)} = 𝒜^{(k)}.

Step 5. Set k := k + 1 and return to Step 2.

It should be noted that if G is not positive definite, the stationary points


may not be local minima. Special techniques are required for solving this
class of indefinite quadratic programming problems.
The rationale underlying the steps in the active set strategy is as follows. At each iteration, the method seeks to solve a problem with equality constraints only, corresponding to the active set. At the k-th iteration, if all of the Lagrange multipliers associated with the active set are non-negative, the Karush-Kuhn-Tucker necessary condition is satisfied and a local optimum is reached. Otherwise, if some or all of the multipliers are strictly negative, the constraint corresponding to the most negative multiplier is removed from the active set. This procedure is carried out in Step 3 of the algorithm. Steps 2 and 4 describe the details of solving the equality-only constrained problem associated with the active set. The iterate x^{(k)} may or may not be an optimal solution to this problem. We may shift the origin of the coordinate system to x^{(k)} and check whether any non-zero local perturbation d solves the corresponding shifted problem (3.3.9). If the optimal solution is d = 0, then we proceed to Step 3 to check the Lagrange multipliers. If d = d^{(k)} ≠ 0, then we can reduce the cost function by updating x^{(k)} to x^{(k)} + d^{(k)}. This step may, however, cause some of the constraints to be violated. To prevent constraint violation, we change the update to x^{(k)} + α^{(k)} d^{(k)}, where α^{(k)} is chosen such that the first non-active constraint in I \ 𝒜^{(k)} becomes active. This is done only for constraints that increase in the d^{(k)} direction, i.e., for a_i^T d^{(k)} > 0. The new active constraint is then included in the active set. The whole procedure is then repeated by returning to Step 2 after updating x^{(k+1)} = x^{(k)} + α^{(k)} d^{(k)} and k := k + 1 in Step 5.
Example 3.3.1 Find an x ∈ R² such that the objective function

\[ f(x) = x_1^2 + x_2^2 - 4 x_1 - 5 x_2 + 2 \]

is minimized subject to

\[ h_1(x) = 2 x_1 + x_2 - 2 \le 0, \]
\[ h_2(x) = -x_1 \le 0, \]
\[ h_3(x) = -x_2 \le 0. \]

Suppose we start at the feasible point x^{(0)} = 0. The relevant gradients are

\[ (\nabla_x f)^T = g(x) = \begin{bmatrix} 2 x_1 - 4 \\ 2 x_2 - 5 \end{bmatrix}, \quad \nabla_x h_1 = [2, 1], \quad \nabla_x h_2 = [-1, 0], \quad \nabla_x h_3 = [0, -1], \]

and the Hessian is

\[ G = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}. \]

The active set is 𝒜^{(0)} = {2, 3}, so that

\[ A^{(0)} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}. \]

Since two linearly independent constraints are active, d = 0 is the solution to problem (3.3.9) of Step 2. Thus we move to Step 3. The Lagrange multipliers are found from the equation −A^{(0)} λ^{(0)} = g^{(0)}, i.e.,

\[ -\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \lambda_2^{(0)} \\ \lambda_3^{(0)} \end{bmatrix} = \begin{bmatrix} -4 \\ -5 \end{bmatrix}. \]

The solution is λ_2^{(0)} = −4 and λ_3^{(0)} = −5, of which the most negative one is λ_3^{(0)}. Hence the third constraint is dropped. Now, we set 𝒜^{(0)} = {2} and go to Step 2.

Problem (3.3.9) is to minimize ½ d^T G d + d^T g^{(0)} subject to ∇_x h_2(x^{(0)}) d = 0. It has the solution d = [0, 5/2]^T. We now move to Step 4 to determine the step length along this direction of search. We have

\[ \alpha^{(0)} = \min\left\{ 1,\ -h_1(x^{(0)}) \big/ \left( \nabla_x h_1(x^{(0)})\, d \right) \right\}, \]

as the third constraint does not satisfy the criterion ∇_x h_3(x^{(0)}) d > 0. This gives α^{(0)} = 4/5, so

\[ x^{(1)} = x^{(0)} + \alpha^{(0)} d^{(0)} = \begin{bmatrix} 0 \\ 2 \end{bmatrix}, \quad \mathcal{A}^{(1)} = \{1, 2\}. \]

Moving through Step 5, then Steps 2 and 3 again (two linearly independent constraints are active), the new Lagrange multipliers are found from

\[ -\begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \lambda_1^{(1)} \\ \lambda_2^{(1)} \end{bmatrix} = \begin{bmatrix} -4 \\ -1 \end{bmatrix}, \]

to give λ_1^{(1)} = 1 and λ_2^{(1)} = −2. The second constraint is dropped, giving 𝒜^{(1)} = {1}.

We now return to Step 2, where Problem (3.3.9) is solved again to give d^{(1)} = [1/5, −2/5]^T. Step 4 gives a step length of α^{(1)} = 1, which, in turn, gives x^{(2)} = [1/5, 8/5]^T. Moving back to Step 2 with 𝒜^{(2)} = {1}, we find that d = 0 and then Step 3 gives λ_1^{(2)} = 9/5, which is greater than zero, so x^{(2)} is the optimal point. The corresponding optimal Lagrange multiplier is

\[ \lambda^* = [9/5, 0, 0]^T. \]
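The optimum reported in Example 3.3.1 can be verified against the KKT conditions (3.3.2)–(3.3.6) in a few lines; the check below is an illustration only.

```python
import numpy as np

G = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-4.0, -5.0])
a1 = np.array([2.0, 1.0])                 # gradient of the active constraint h1
x_opt, lam1 = np.array([0.2, 1.6]), 1.8   # x* = [1/5, 8/5]^T, lambda1* = 9/5

print(np.allclose(G @ x_opt + c + lam1 * a1, 0.0))  # stationarity (3.3.2): True
print(np.isclose(a1 @ x_opt - 2.0, 0.0))            # h1 active at x*: True
print(lam1 >= 0.0)                                  # sign condition (3.3.6): True
```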

3.4 Constrained Quasi-Newton Method

In this section, we briefly describe a constrained quasi-Newton method for


solving a general linearly constrained optimization problem. This method was
initially developed by Fletcher [61].
Consider the linearly constrained optimization problem, where the objec-
tive function

\[ f(x) \tag{3.4.1a} \]

is to be minimized subject to

\[ h_i(x) = a_i^T x - b_i = 0, \quad i \in E, \tag{3.4.1b} \]
\[ h_i(x) = a_i^T x - b_i \le 0, \quad i \in I, \tag{3.4.1c} \]

where x ∈ R^n is the decision vector, f is a general nonlinear function on R^n, E = {1, 2, . . . , m}, I = {m + 1, m + 2, . . . , m + r}, a_i, i ∈ E ∪ I, are n-vectors and b_i, i ∈ E ∪ I, are real numbers. Since all the constraints are linear, ∇_x h_i = a_i^T, i ∈ E ∪ I, are constant vectors.

A point x is said to be a feasible point of Problem (3.4.1) if it satisfies the constraints (3.4.1b)–(3.4.1c). Let Ξ be the set of all such feasible points. Using the first phase of the simplex method for linear programming, we can compute an initial point x^{(0)} in Ξ. A new iterate x^{(k+1)} = x^{(k)} + d can be computed from the current iterate x^{(k)} by solving a quadratic programming subproblem. A quadratic model q_k(d) is obtained by using the Taylor series expansion of the objective function about the point x^{(k)}, truncated after the second order term, i.e.,

\[ f(x^{(k)} + d) \approx q_k(d) = f(x^{(k)}) + d^T g^{(k)} + \frac{1}{2} d^T H^{(k)} d, \]

where

\[ g^{(k)} = \left( \nabla_x f(x^{(k)}) \right)^T \quad \text{and} \quad H^{(k)} = \nabla_{xx} f(x^{(k)}). \]

Maintaining feasibility of the new iterate requires

\[ h_i(x^{(k)} + d) = a_i^T (x^{(k)} + d) - b_i = a_i^T d + h_i(x^{(k)}) = 0, \quad i \in E, \]
\[ h_i(x^{(k)} + d) = a_i^T (x^{(k)} + d) - b_i = a_i^T d + h_i(x^{(k)}) \le 0, \quad i \in I. \]

Thus, x^{(k+1)} = x^{(k)} + d is generated from x^{(k)} by solving the following quadratic programming subproblem, to be denoted as Problem (QP)_k: Find a d ∈ R^n such that the objective function

\[ \frac{1}{2} d^T B^{(k)} d + d^T g^{(k)} \tag{3.4.2a} \]

is minimized subject to

\[ a_i^T d + h_i(x^{(k)}) = 0, \quad i \in E, \tag{3.4.2b} \]
\[ a_i^T d + h_i(x^{(k)}) \le 0, \quad i \in I, \tag{3.4.2c} \]

where B^{(k)} is a positive definite symmetric matrix that approximates the Hessian H^{(k)} of the objective function at the point x^{(k)}. It is constructed according to the Broyden-Fletcher-Goldfarb-Shanno (BFGS) rank 2 updating formula (see Section 2.9.3):

\[ B^{(k+1)} = B^{(k)} + \frac{\gamma^{(k)} (\gamma^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}} - \frac{B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)}}, \tag{3.4.3} \]

where

\[ \delta^{(k)} = x^{(k+1)} - x^{(k)} \quad \text{and} \quad \gamma^{(k)} = g^{(k+1)} - g^{(k)}. \]

This formula ensures that if B^{(0)} is symmetric positive definite, then so are all successive updates of the approximate Hessian matrices B^{(k)}, provided that

\[ (\delta^{(k)})^T \gamma^{(k)} > 0. \]

Note that if x^{(k)} is feasible, then h_i(x^{(k)}) = 0, i ∈ 𝒜^{(k)} ⊇ E, where 𝒜^{(k)} is as defined in Definition 3.3.1.
For each k, Problem (QP)k is solved by the active set method described
in Section 3.3. The algorithm for solving Problem (3.4.1) can now be stated
as follows.
Algorithm 3.4.1

Step 1. Choose a point x^{(0)} ∈ Ξ. (This can be achieved by the first phase of the simplex algorithm for linear programming.) Approximate H^{(0)} by a symmetric positive definite matrix B^{(0)}. Choose an ε > 0 and set k = 0.

Step 2. Solve the quadratic programming subproblem, Problem (QP)_k.

Step 3. Consider the problem of minimizing f(x^{(k)} + α d^{(k)}) with respect to α, where d^{(k)} is the solution of Problem (QP)_k. Choose an approximate minimizer α^{(k)} ≤ min{1, ᾱ^{(k)}} for this problem (see Remark 3.4.1 below), where ᾱ^{(k)} is defined by (3.3.15).

Step 4. If ‖d^{(k)}‖ < ε, set x* = x^{(k)} and stop. Otherwise, set x^{(k+1)} = x^{(k)} + α^{(k)} d^{(k)}.

Step 5. Update B^{(k)} according to the BFGS formula (3.4.3).

Step 6. Set k := k + 1 and go to Step 2.

Remark 3.4.1 In Step 3 of Algorithm 3.4.1, we need to find an approximate minimizer α^{(k)} ≤ ᾱ^{(k)} of f(x^{(k)} + α d^{(k)}). It must be chosen such that the following two conditions are satisfied:

(i) There is a sufficient function decrease (known as the Goldstein condition, see [78]):

\[ f(x^{(k)} + \alpha^{(k)} d^{(k)}) \le f(x^{(k)}) + \rho\, \alpha^{(k)} (d^{(k)})^T g(x^{(k)}). \]

(ii) There is a sufficient slope improvement:

\[ \left| \left( g(x^{(k)} + \alpha^{(k)} d^{(k)}) \right)^T d^{(k)} \right| \le -\eta\, (d^{(k)})^T g(x^{(k)}). \]

Here, ρ and η are constants satisfying 0 < ρ < η < 1. If η = 0, then α^{(k)} is a stationary point of f(x^{(k)} + α d^{(k)}) with respect to α. Typical values of η are η = 0.9 (weak, i.e., not very accurate line search) and η = 0.1 (strong, i.e., fairly accurate line search). The parameter ρ is typically taken to be quite small, for example, ρ = 0.01.

Remark 3.4.2 The new point x^{(k)} + α^{(k)} d^{(k)} must be feasible. This will be the case provided that

\[ \alpha^{(k)} \le \bar{\alpha}^{(k)} = \min_{\substack{i \in I \setminus \mathcal{A}^{(k)} \\ a_i^T d^{(k)} > 0}} \frac{-h_i(x^{(k)})}{a_i^T d^{(k)}}. \]

The condition α^{(k)} ≤ min{1, ᾱ^{(k)}} will ensure feasibility of x^{(k)} + α^{(k)} d^{(k)}. As x^{(0)} is chosen to be a feasible point, Step 3 ensures that the algorithm generates a sequence of feasible points. The sufficient slope improvement condition ensures that (δ^{(k)})^T γ^{(k)} > 0. However, the constraint α^{(k)} ≤ min{1, ᾱ^{(k)}} can destroy this property.

3.5 Sequential Quadratic Programming Algorithm

The sequential quadratic programming algorithm is recognized as one of the


most efficient algorithms for small and medium size nonlinearly constrained

optimization problems. The theory was initiated by Wilson in [278] and was
further developed by Han in [86] and [87], Powell in [207] and Schittkowski in
[221] and [222]. In this section, we shall discuss some of the essential concepts
of the algorithm without going into detail. The main references of this section
are [222], [232] and [275]. For readers interested in details, see [232].
Consider the equality constrained optimization problem

\[ \text{minimize } f(x) \tag{3.5.1} \]

subject to

\[ h_i(x) = 0, \quad i = 1, 2, \ldots, m. \tag{3.5.2} \]

The Lagrangian function is

\[ L(x, \lambda) = f(x) + \lambda^T h(x), \tag{3.5.3} \]

where h = [h_1, . . . , h_m]^T and λ = [λ_1, . . . , λ_m]^T ∈ R^m is a Lagrange multiplier vector. A point x = [x_1, . . . , x_n]^T ∈ R^n is a KKT point of (3.5.1)–(3.5.2) if and only if there exists a non-zero λ ∈ R^m such that

\[ \nabla_x L(x, \lambda) = \nabla_x f(x) + \sum_{i=1}^{m} \lambda_i \nabla_x h_i(x) = 0^T \tag{3.5.4} \]

and

\[ \nabla_\lambda L(x, \lambda) = (h(x))^T = 0^T. \tag{3.5.5} \]

(3.5.4) and (3.5.5) may be rewritten as the following system of nonlinear equations:

\[
W(x, \lambda) = \begin{bmatrix} (\nabla_x L(x, \lambda))^T \\ h(x) \end{bmatrix}
= \begin{bmatrix} (\nabla_x f(x))^T + \displaystyle\sum_{i=1}^{m} \lambda_i (\nabla_x h_i(x))^T \\ h(x) \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{3.5.6}
\]

We shall use Newton's method to find the solution of the nonlinear system (3.5.6). For a given iterate x^{(k)} ∈ R^n and the corresponding Lagrange multiplier λ^{(k)} ∈ R^m, the next iterate (x^{(k+1)}, λ^{(k+1)}) is obtained by solving the following system of linear equations:

\[
W(x^{(k)}, \lambda^{(k)}) +
\begin{bmatrix} \nabla_{xx} L(x^{(k)}, \lambda^{(k)}) & (\nabla_x h(x^{(k)}))^T \\ \nabla_x h(x^{(k)}) & 0 \end{bmatrix}
\begin{bmatrix} x^{(k+1)} - x^{(k)} \\ \lambda^{(k+1)} - \lambda^{(k)} \end{bmatrix} = 0. \tag{3.5.7}
\]

From (3.5.6)–(3.5.7), it can be shown that

\[
\nabla_{xx} L(x^{(k)}, \lambda^{(k)}) \left( x^{(k+1)} - x^{(k)} \right) + \left( \nabla_x h(x^{(k)}) \right)^T \lambda^{(k+1)} = -\left( \nabla_x f(x^{(k)}) \right)^T \tag{3.5.8}
\]

and

\[ \nabla_x h(x^{(k)}) \left( x^{(k+1)} - x^{(k)} \right) = -h(x^{(k)}). \tag{3.5.9} \]

Let δx^{(k)} = x^{(k+1)} − x^{(k)}. Then, (3.5.8) and (3.5.9) become

\[
\nabla_{xx} L(x^{(k)}, \lambda^{(k)})\, \delta x^{(k)} + \left( \nabla_x h(x^{(k)}) \right)^T \lambda^{(k+1)} = -\left( \nabla_x f(x^{(k)}) \right)^T \tag{3.5.10}
\]

and

\[ \nabla_x h(x^{(k)})\, \delta x^{(k)} = -h(x^{(k)}). \tag{3.5.11} \]

Solving the system of linear equations (3.5.10) and (3.5.11) gives (δx^{(k)}, λ^{(k+1)}). Then, we obtain the next iterate as

\[ x^{(k+1)} = x^{(k)} + \delta x^{(k)}. \tag{3.5.12} \]

This method is known as the Lagrange-Newton method for solving equality constrained optimization problems. If the constraints satisfy the constraint qualification conditions at the KKT point, then the Lagrange-Newton method is locally convergent and the convergence rate is of second order.
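A minimal sketch of the Lagrange-Newton iteration (3.5.7)–(3.5.12) follows, under the assumption that callables for the problem data are available (the names are illustrative): grad_f returns (∇_x f)^T, hess_L the Hessian ∇_{xx} L, h the constraint vector, and jac_h the m × n Jacobian ∇_x h.

```python
import numpy as np

def lagrange_newton(grad_f, hess_L, h, jac_h, x0, lam0, tol=1e-10, max_iter=50):
    x, lam = np.asarray(x0, float), np.asarray(lam0, float)
    for _ in range(max_iter):
        J = jac_h(x)
        res = np.concatenate([grad_f(x) + J.T @ lam, h(x)])   # W(x, lam), (3.5.6)
        if np.linalg.norm(res) < tol:
            break
        n, m = x.size, lam.size
        K = np.block([[hess_L(x, lam), J.T],                  # Newton system (3.5.7)
                      [J, np.zeros((m, m))]])
        step = np.linalg.solve(K, -res)
        x, lam = x + step[:n], lam + step[n:]                 # update (3.5.12)
    return x, lam
```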
The increment δx^{(k)} resulting in the new iterate x^{(k+1)} given by (3.5.12) may also be considered as the solution of the following quadratic equality constrained optimization problem:

\[
\text{minimize } f(x^{(k)}) + \nabla_x f(x^{(k)})\, \delta x^{(k)} + \frac{1}{2} (\delta x^{(k)})^T \nabla_{xx} L(x^{(k)}, \lambda^{(k)})\, \delta x^{(k)} \tag{3.5.13}
\]

subject to

\[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} = 0, \quad i = 1, 2, \ldots, m. \tag{3.5.14} \]

To be more specific, we consider the Lagrangian for the problem (3.5.13)–(3.5.14), given by

\[
\begin{aligned}
L(x^{(k)}, \lambda^{(k)}, \delta x^{(k)}, \lambda)
&= f(x^{(k)}) + \nabla_x f(x^{(k)})\, \delta x^{(k)} + \frac{1}{2} (\delta x^{(k)})^T \nabla_{xx} L(x^{(k)}, \lambda^{(k)})\, \delta x^{(k)} \\
&\quad + \sum_{i=1}^{m} \lambda_i \left[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} \right],
\end{aligned} \tag{3.5.15}
\]

where λ = [λ_1, . . . , λ_m]^T. Then, by taking the gradients of the Lagrangian function L(x^{(k)}, λ^{(k)}, δx^{(k)}, λ) with respect to δx^{(k)} and λ, and noting the equivalence of λ and λ^{(k+1)}, we obtain (3.5.10)–(3.5.11).
We now move on to consider the following optimization problem with both equality and inequality constraints:

\[ \text{minimize } f(x) \tag{3.5.16} \]

subject to

\[ h_i(x) = 0, \quad i = 1, 2, \ldots, m, \tag{3.5.17} \]
\[ h_i(x) \le 0, \quad i = m+1, m+2, \ldots, m+r. \tag{3.5.18} \]

The function f is assumed to be twice continuously differentiable, and h_i, i = 1, 2, . . . , m + r, are also assumed to be twice continuously differentiable. We introduce the Lagrangian function L for the constrained optimization problem (3.5.16)–(3.5.18) defined by

\[ L(x, \lambda) = f(x) + \sum_{i=1}^{m+r} \lambda_i h_i(x). \tag{3.5.19} \]

Let x^{(k)} be the current iterate, and let λ^{(k)} ∈ R^{m+r} be the corresponding Lagrange multiplier vector. To find the next iterate, we first construct the following quadratic programming subproblem:

\[
\text{minimize } f(x^{(k)}) + \nabla_x f(x^{(k)})\, \delta x^{(k)} + \frac{1}{2} (\delta x^{(k)})^T \nabla_{xx} L(x^{(k)}, \lambda^{(k)})\, \delta x^{(k)} \tag{3.5.20}
\]

subject to

\[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} = 0, \quad i = 1, 2, \ldots, m, \tag{3.5.21} \]

and

\[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} \le 0, \quad i = m+1, m+2, \ldots, m+r. \tag{3.5.22} \]

The next iterate is then given by

\[ \left( x^{(k+1)}, \lambda^{(k+1)} \right) = \left( x^{(k)} + \delta x^{(k)}, \lambda \right), \tag{3.5.23} \]

where δx^{(k)} is the solution of the subproblem (3.5.20)–(3.5.22) and λ is the optimal Lagrange multiplier vector for the same subproblem. Suppose that the optimal solution to the subproblem (3.5.20)–(3.5.22) is such that δx^{(k)} = 0. Then, x^{(k)} must be a KKT point for the problem (3.5.16)–(3.5.18). For the subproblem (3.5.20)–(3.5.22), it is necessary to calculate the Hessian of the Lagrangian function L defined by (3.5.19). If ∇_{xx} L(x^{(k)}, λ^{(k)}) fails to be positive definite, the solution method will fail. Thus, we will use an approximate matrix B^{(k)} to replace ∇_{xx} L(x^{(k)}, λ^{(k)}), where B^{(k)} is to be updated in such a way that if B^{(k)} is positive definite, then B^{(k+1)} will also be positive definite. To generate a new iterate, we need conditions for the step length selection. For this, we will introduce a merit function. The SQP method for solving the nonlinear optimization problem with equality and inequality constraints may now be stated as follows.
Algorithm 3.5.1

Step 1. Let k = 0. Choose a starting point x^{(0)} ∈ R^n and choose a positive definite matrix B^{(0)}.

Step 2. Solve the following subproblem to obtain (δx^{(k)}, λ^{(k+1)}):

\[ \text{minimize } f(x^{(k)}) + \nabla_x f(x^{(k)})\, \delta x^{(k)} + \frac{1}{2} (\delta x^{(k)})^T B^{(k)} \delta x^{(k)} \tag{3.5.24} \]

subject to

\[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} = 0, \quad i = 1, 2, \ldots, m, \tag{3.5.25} \]

and

\[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} \le 0, \quad i = m+1, m+2, \ldots, m+r. \tag{3.5.26} \]

Step 3. If δx^{(k)} = 0, then the algorithm terminates and x^{(k)} is the KKT point for problem (3.5.16)–(3.5.18). Otherwise, set x^{(k+1)} = x^{(k)} + α_k δx^{(k)}, where α_k is determined by some step length rules; see below.

Step 4. Update B^{(k)} to B^{(k+1)} such that B^{(k+1)} is positive definite. Return to Step 2.
The next task is to derive the update formula for the matrix B^{(k)}. It is natural to consider using the BFGS update formula, i.e.,

\[ B^{(k+1)} = B^{(k)} + \frac{\gamma^{(k)} (\gamma^{(k)})^T}{(\gamma^{(k)})^T \delta^{(k)}} - \frac{B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)}}, \tag{3.5.27} \]

where δ^{(k)} = x^{(k+1)} − x^{(k)} and

\[ \gamma^{(k)} = \left( \nabla_x L(x^{(k+1)}, \lambda^{(k+1)}) \right)^T - \left( \nabla_x L(x^{(k)}, \lambda^{(k)}) \right)^T. \]

However, if this formula is used, then the new matrix B^{(k+1)} is positive definite only if the condition (γ^{(k)})^T δ^{(k)} > 0 is satisfied. This condition is always satisfied for the unconstrained case if the step length is chosen such that the following conditions are satisfied:

\[ f(x + \bar{\alpha} \delta) \le f(x) + \rho \bar{\alpha}\, \delta^T (\nabla_x f(x))^T \tag{3.5.28} \]

and

\[ \left| \nabla_x f(x + \bar{\alpha} \delta)\, \delta \right| \le -\beta\, \delta^T (\nabla_x f(x))^T, \tag{3.5.29} \]

where ρ and β are constants satisfying 0 < ρ < β < 1. However, this is not always true for the constrained case. Therefore, the BFGS update should not be applied directly. It needs to be modified as suggested in [207]. More specifically, we introduce η^{(k)} to replace γ^{(k)}, where η^{(k)} is given by

\[
\eta^{(k)} = \begin{cases}
\gamma^{(k)}, & \text{if } (\gamma^{(k)})^T \delta^{(k)} \ge 0.2\, (\delta^{(k)})^T B^{(k)} \delta^{(k)}, \\
\theta_k \gamma^{(k)} + (1 - \theta_k) B^{(k)} \delta^{(k)}, & \text{otherwise},
\end{cases} \tag{3.5.30}
\]

where

\[ \theta_k = \frac{0.8\, (\delta^{(k)})^T B^{(k)} \delta^{(k)}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)} - (\gamma^{(k)})^T \delta^{(k)}}. \tag{3.5.31} \]

Clearly,

\[ (\eta^{(k)})^T \delta^{(k)} \ge 0.2\, (\delta^{(k)})^T B^{(k)} \delta^{(k)} \tag{3.5.32} \]

and hence

\[ (\delta^{(k)})^T \eta^{(k)} > 0. \tag{3.5.33} \]

Let η^{(k)} be chosen as defined by (3.5.30). Then, the modified BFGS formula is given (see [207]) by

\[ B^{(k+1)} = B^{(k)} + \frac{\eta^{(k)} (\eta^{(k)})^T}{(\eta^{(k)})^T \delta^{(k)}} - \frac{B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)}}. \tag{3.5.34} \]

Now, by (3.5.33), it follows that if B^{(k)} is positive definite, then B^{(k+1)} is also positive definite.
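This modification (often called Powell's damping) is mechanical to implement; the sketch below assumes δ^{(k)} and γ^{(k)} have already been computed from the Lagrangian gradients.

```python
import numpy as np

def damped_bfgs_update(B, delta, gamma):
    # Modified BFGS update (3.5.30)-(3.5.34): gamma is replaced by eta so
    # that delta^T eta > 0, which keeps B^(k+1) positive definite.
    Bd = B @ delta
    dBd = delta @ Bd
    if gamma @ delta >= 0.2 * dBd:
        eta = gamma                                   # ordinary BFGS step
    else:
        theta = 0.8 * dBd / (dBd - gamma @ delta)     # (3.5.31)
        eta = theta * gamma + (1.0 - theta) * Bd
    return (B + np.outer(eta, eta) / (eta @ delta)
              - np.outer(Bd, Bd) / dBd)               # (3.5.34)
```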
We obtain the following subproblem:

\[ \text{minimize } f(x^{(k)}) + \nabla_x f(x^{(k)})\, \delta x^{(k)} + \frac{1}{2} (\delta x^{(k)})^T B^{(k)} \delta x^{(k)} \tag{3.5.35} \]

subject to

\[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} = 0, \quad i = 1, 2, \ldots, m, \tag{3.5.36} \]

and

\[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} \le 0, \quad i = m+1, m+2, \ldots, m+r. \tag{3.5.37} \]

This is a quadratic programming problem and can hence be solved by the active set method described in Section 3.3. Let δx^{(k)} be the solution of this quadratic programming problem, and let λ̄^{(k)} = [λ̄_1^{(k)}, . . . , λ̄_{m+r}^{(k)}]^T be the corresponding optimal Lagrange multiplier vector for this problem, i.e., it satisfies

\[ \nabla_x f(x^{(k)}) + (\delta x^{(k)})^T B^{(k)} + \sum_{i=1}^{m+r} \bar{\lambda}_i^{(k)} \nabla_x h_i(x^{(k)}) = 0^T, \tag{3.5.38} \]

\[ \bar{\lambda}_i^{(k)} \left[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x^{(k)} \right] = 0, \quad i = m+1, m+2, \ldots, m+r, \tag{3.5.39} \]

and

\[ \bar{\lambda}_i^{(k)} \ge 0, \quad i = m+1, m+2, \ldots, m+r. \tag{3.5.40} \]

Then, the new estimates x^{(k+1)}, λ^{(k+1)} and B^{(k+1)} may be determined by

\[ x^{(k+1)} = x^{(k)} + \alpha_k \delta x^{(k)}, \tag{3.5.41} \]
\[ \lambda^{(k+1)} = \lambda^{(k)} + \alpha_k \left( \bar{\lambda}^{(k)} - \lambda^{(k)} \right), \tag{3.5.42} \]
\[ B^{(k+1)} = B^{(k)} + \frac{\gamma^{(k)} (\gamma^{(k)})^T}{(\delta^{(k)})^T \gamma^{(k)}} - \frac{B^{(k)} \delta^{(k)} (\delta^{(k)})^T B^{(k)}}{(\delta^{(k)})^T B^{(k)} \delta^{(k)}}, \tag{3.5.43} \]

where

\[ \delta^{(k)} = x^{(k+1)} - x^{(k)} \tag{3.5.44} \]

and

\[ \gamma^{(k)} = \left( \nabla_x L(x^{(k+1)}, \lambda^{(k+1)}) \right)^T - \left( \nabla_x L(x^{(k)}, \lambda^{(k)}) \right)^T. \tag{3.5.45} \]

We introduce a merit function to determine the step length in such a way that the value of the cost function is reduced while feasibility is maintained. The merit function given below is suggested in [87]; it is an L1-penalty function:

\[ P(x, \sigma) = f(x) + \sum_{i=1}^{m} \sigma_i |h_i(x)| + \sum_{i=m+1}^{m+r} \sigma_i \max\{0, h_i(x)\}, \tag{3.5.46} \]

where σ = [σ_1, . . . , σ_{m+r}]^T. The parameters σ_i, i = 1, 2, . . . , m + r, should be chosen (see [207]) such that P(x, σ) is locally decreasing along the direction δx. More specifically, for each i = 1, 2, . . . , m + r, define

\[
\sigma_i^{(k)} = \begin{cases}
\left| \bar{\lambda}_i^{(0)} \right|, & \text{if } k = 0, \\[4pt]
\max\left\{ \left| \bar{\lambda}_i^{(k)} \right|,\ \dfrac{1}{2} \left( \sigma_i^{(k-1)} + \left| \bar{\lambda}_i^{(k)} \right| \right) \right\}, & \text{if } k \ge 1,
\end{cases} \tag{3.5.47}
\]

where λ̄^{(k)} is the optimal Lagrange multiplier vector of the subproblem (3.5.35)–(3.5.37), i.e., it satisfies (3.5.38)–(3.5.40). Clearly, |λ̄_i^{(k)}| ≤ σ_i^{(k)}.
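For completeness, a direct transcription of the merit function (3.5.46); the array layout (equality constraints first, then inequalities h_i ≤ 0) is an assumption of this sketch.

```python
import numpy as np

def merit_L1(fval, h_vals, sigma, m):
    # L1 merit function (3.5.46). h_vals holds h_1,...,h_{m+r}; the first m
    # entries are equality constraints, the remainder inequalities h_i <= 0.
    h_vals, sigma = np.asarray(h_vals), np.asarray(sigma)
    return (fval + sigma[:m] @ np.abs(h_vals[:m])
                 + sigma[m:] @ np.maximum(0.0, h_vals[m:]))
```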
To determine the convergence and convergence rate of the SQP method,
we need the following two lemmas from [275].
Lemma 3.5.1 Suppose that h_i(x), i ∈ I, are given continuously differentiable functions, where I = {1, . . . , r}. Let Φ(x) = max_{i∈I} {h_i(x)}. Then, the directional derivative, Φ′(x; δx), of the function Φ(x) along any direction δx exists and

\[ \Phi'(x; \delta x) = \max_{i \in I(x)} \{ \nabla_x h_i(x)\, \delta x \}, \tag{3.5.48} \]

where

\[ I(x) = \{ i \in I : h_i(x) = \Phi(x) \}. \tag{3.5.49} \]

Lemma 3.5.2 Consider Problem (3.5.16)–(3.5.18), where f(x) and h_i(x), i = 1, 2, . . . , m + r, are continuously differentiable functions. Suppose that B^{(k)} is positive definite and that (δx^{(k)}, λ̄^{(k)}) is a KKT point of the subproblem (3.5.35)–(3.5.37) for which δx^{(k)} ≠ 0 and |λ̄_i^{(k)}| ≤ σ_i^{(k)}, i = 1, 2, . . . , m + r. Then

\[ P'(x^{(k)}, \sigma^{(k)}; \delta x^{(k)}) < 0. \tag{3.5.50} \]

With the help of Lemmas 3.5.1 and 3.5.2, the step size α_k can be chosen (see [87]) such that

\[ P(x^{(k)} + \alpha_k \delta x^{(k)}, \sigma^{(k)}) \le \min_{0 \le \alpha \le \beta} P(x^{(k)} + \alpha \delta x^{(k)}, \sigma^{(k)}) + \varepsilon_k, \tag{3.5.51} \]

where β is a given positive number and

\[ \sum_{k=1}^{\infty} \varepsilon_k < \infty. \tag{3.5.52} \]

The following theorem is established in [87].

Theorem 3.5.1 Let f(x) and h_i(x), i = 1, 2, . . . , m + r, be continuously differentiable functions. Suppose that there exist constants K_1 > 0 and K_2 > 0 such that

\[ K_1 \|x\|^2 \le x^T B^{(k)} x \le K_2 \|x\|^2, \quad \forall k \ge 1, \tag{3.5.53} \]

and for all x ∈ R^n. Furthermore, it is assumed that there exists a vector σ > 0 satisfying |λ̄_i^{(k)}| ≤ σ_i, i = 1, 2, . . . , m + r, for all k ≥ 1. Let the step size be chosen such that conditions (3.5.51)–(3.5.52) are satisfied. Then the sequence of points {x^{(k)}} generated by Algorithm 3.5.1 is such that it either terminates at a KKT point or its accumulation points are KKT points.

To proceed further, let the following conditions be satisfied.

Assumption 3.5.1 Let f(x) and h_i(x), i = 1, 2, . . . , m + r, be continuously differentiable functions.

Assumption 3.5.2 The sequence {x^{(k)}} converges to x*.

Assumption 3.5.3 x* is a KKT point and the gradients

\[ \nabla_x h_i(x^*), \quad i \in \mathcal{A}(x^*), \tag{3.5.54} \]

are linearly independent, where 𝒜(x*) = {i ∈ E ∪ I : h_i(x*) = 0}, E = {1, 2, . . . , m} and I = {m + 1, m + 2, . . . , m + r}. Let A(x*) be the n × |𝒜(x*)| matrix whose columns are the vectors given by (3.5.54). For any non-zero vector δx, if it satisfies

\[ (A(x^*))^T \delta x = 0, \tag{3.5.55} \]

then it holds that

\[ (\delta x)^T W(x^*, \lambda^*)\, \delta x \ne 0, \tag{3.5.56} \]

where W(x, λ) is defined by (3.5.6) and λ* is the Lagrange multiplier at x*.

Assumption 3.5.4 If k is sufficiently large, then δx^{(k)} is a solution of

\[ \min_{\delta x \in \mathbb{R}^n} \nabla_x f(x^{(k)})\, \delta x + \frac{1}{2} (\delta x)^T B^{(k)} \delta x \tag{3.5.57} \]

subject to

\[ h_i(x^{(k)}) + \nabla_x h_i(x^{(k)})\, \delta x = 0, \quad i \in \mathcal{A}(x^{(k)}). \tag{3.5.58} \]

Now, suppose that there exists a k_0 such that Assumption 3.5.4 is satisfied for k > k_0. Then, there exists a λ̄^{(k)} ∈ R^{|𝒜(x^{(k)})|} such that

\[ \left( \nabla_x f(x^{(k)}) \right)^T + B^{(k)} \delta x^{(k)} = A(x^{(k)})\, \bar{\lambda}^{(k)} \tag{3.5.59} \]

and

\[ \left( A(x^{(k)}) \right)^T \delta x^{(k)} = -\tilde{h}(x^{(k)}), \tag{3.5.60} \]

for all k > k_0, where \tilde{h}(x) is the vector whose elements are h_i(x), i ∈ 𝒜(x^{(k)}).
Theorem 3.5.2 Suppose that Assumptions 3.5.1–3.5.4 are satisfied. Then, δx^{(k)} is a superlinearly convergent step, i.e.,

\[ \lim_{k \to \infty} \frac{\left\| x^{(k)} + \delta x^{(k)} - x^* \right\|}{\left\| x^{(k)} - x^* \right\|} = 0, \tag{3.5.61} \]

if and only if

\[ \lim_{k \to \infty} \frac{\left\| P_k \left( B_k - W(x^*, \lambda^*) \right) \delta x^{(k)} \right\|}{\left\| \delta x^{(k)} \right\|} = 0, \tag{3.5.62} \]

where P_k is the projection from R^n onto the null space of (A(x^{(k)}))^T, i.e.,

\[ P_k = I - A(x^{(k)}) \left[ (A(x^{(k)}))^T A(x^{(k)}) \right]^{-1} (A(x^{(k)}))^T. \tag{3.5.63} \]

The proof of this result may be found in [275].


As a concluding remark for this chapter, we note that any constrained
optimal control problem can be reduced to a mathematical programming
problem by using the control parametrization technique. Throughout this
book, the sequential quadratic programming algorithm will be the basis on
which all these mathematical programming problems are solved. The main
reason is that the sequential quadratic programming algorithm is recognized
as one of the most efficient algorithms for small and medium size nonlinearly
constrained optimization problems.
However, we wish to note that there are other methods in the literature
for solving general nonlinearly constrained mathematical programming prob-
lems. For example, see [61] and [71].
Chapter 4
Optimization Problems Subject to
Continuous Inequality Constraints

4.1 Introduction

In this chapter, we shall present two computational approaches to solve a


general class of optimization problems subject to continuous inequality con-
straints. The first approach is known as the constraint transcription method,
while the other approach is referred to as an exact penalty function method.
To begin, we first consider the problem of finding a feasible solution to a sys-
tem of nonlinear inequality constraints. Using the constraint transcription
method with a local smoothing technique, the problem is approximated by an
unconstrained optimization problem. This approach is extended to find a fea-
sible solution of a system of continuous inequality constraints. We then move
on to consider the optimization problems subject to continuous inequality
constraints. The constraint transcription method is used in conjunction with
a local smoothing method to develop two computational methods to solve
this general optimization problem with continuous inequality constraints. Fi-
nally, we introduce the second approach (i.e., the exact penalty func-
tion approach) to solving the same class of continuous inequality constrained
optimization problems.
The main references for this chapter are [76, 103, 259, 300, 301].

4.2 Constraint Transcription Technique

In this section, we consider the following system of nonlinear inequality con-


straints:
hj (x) ≤ 0, j = 1, 2, . . . , m,


where x = [x1 , . . . , xn ]⊤ ∈ Rn and, for each j = 1, 2, . . . , m, hj : Rn → R is a
continuously differentiable function.
The technique to be presented is referred to as the constraint transcription
technique. Using this technique, we can find a feasible solution of a system
of nonlinear inequality constraints in a finite number of iterations by solving
an associated unconstrained optimization problem. The same technique is
extended to find a feasible solution to a system of continuous inequality con-
straints. The resulting algorithms are presented in Sections 4.2.1 and 4.2.2,
respectively. In Sections 4.3 and 4.4, we show that by using the constraint
transcription technique in conjunction with the local smoothing technique,
an optimization problem subject to continuous inequality constraints can be
solved as a conventional constrained optimization problem or as an uncon-
strained optimization problem.

4.2.1 Inequality Constraints

Consider the following problem: Find an x ∈ Rn such that

hj (x) ≤ 0, j = 1, 2, . . . , m, (4.2.1a)

and
a ≤ x ≤ b,   (4.2.1b)

where x = [x1 , x2 , . . . , xn ]⊤, a = [a1 , a2 , . . . , an ]⊤ and b = [b1 , b2 , . . . , bn ]⊤.
Here, ai , i = 1, 2, . . . , n, and bi , i = 1, 2, . . . , n, are given constants, and we
assume that for each j = 1, 2, . . . , m, hj is continuously differentiable with
respect to x.
Given Problem (4.2.1), we can construct a corresponding unconstrained
optimization problem as follows. Let ε > 0 and consider
    minimize  Jε (x) = Σ_{j=1}^m Φε (hj (x))   (4.2.2a)

subject to
a ≤ x ≤ b, (4.2.2b)
where Φε (·) is defined by

    Φε (h) = h                    if h ≥ ε,
           = (h + ε)²/(4ε)        if −ε ≤ h ≤ ε,   (4.2.3)
           = 0                    if h ≤ −ε,

[Fig. 4.2.1: Smoothing constraint transcription — graph of Φε (h) versus h,
showing the quadratic smoothing over the interval [−ε, ε].]

see Figure 4.2.1.


Note that Φε (·) is differentiable. This unconstrained optimization problem
is referred to as Problem (4.2.2).
Note that the boundedness constraints (4.2.2b) in Problem (4.2.2) can
also be incorporated in the cost function using the constraint transcription
defined by (4.2.3). We choose to leave them as such because most optimization
algorithms are able to handle these boundedness constraints efficiently.
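As an illustration, the smoothing function Φε of (4.2.3) and its derivative
take only a few lines of code. The following Python/NumPy sketch is ours;
the function and variable names are for illustration only.

    import numpy as np

    def phi_eps(h, eps):
        """The smoothing function Phi_eps of (4.2.3), applied elementwise."""
        h = np.asarray(h, dtype=float)
        return np.where(h >= eps, h,
                        np.where(h <= -eps, 0.0, (h + eps)**2 / (4.0 * eps)))

    def phi_eps_grad(h, eps):
        """Derivative of Phi_eps with respect to h; continuous at h = +/-eps."""
        h = np.asarray(h, dtype=float)
        return np.where(h >= eps, 1.0,
                        np.where(h <= -eps, 0.0, (h + eps) / (2.0 * eps)))

    print(phi_eps(np.array([-0.2, 0.0, 0.2]), 0.1))  # [0.    0.025 0.2  ]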
Theorem 4.2.1 (Necessary Condition) Let x0 ∈ Rn be a solution of
Problem (4.2.1). Then,

    Jε (x0) ≤ mε/4.   (4.2.4)

Proof. In view of the constraint transcription defined in (4.2.3), it is clear
from (4.2.1a) that

    Φε (hj (x0)) ≤ ε/4, j = 1, 2, . . . , m.

Thus, the conclusion follows readily from (4.2.2a).

Theorem 4.2.2 (Sufficient Condition) Let x0 , a ≤ x0 ≤ b, be such that

    Jε (x0) ≤ ε/4.   (4.2.5)

Then, x0 is a feasible solution of Problem (4.2.1).

Proof. Suppose x0 is not feasible. Then there exists some j such that

    hj (x0) > 0.

This, in turn, implies that

    Φε (hj (x0)) > ε/4,

and hence

    Jε (x0) > ε/4.

This is a contradiction, and hence the proof is complete.

Based on these two simple theorems, we can devise a computational algo-


rithm for finding a feasible solution of Problem (4.2.1) in a finite number of
iterations. There are many algorithms available in the literature for solving
unconstrained optimization problems (such as the quasi-Newton method, see
Chapter 2), and most of these can readily incorporate the boundedness con-
straints (4.2.1b). Suppose we employ such an algorithm to minimize (4.2.2a)
subject to (4.2.2b), and also suppose that it is capable of finding a global
optimal solution. Then, at the k-th step of that algorithm with iterate x(k) ,
we insert the following procedure:
Algorithm 4.2.1
Step 1. If Jε (x(k)) > mε/4, go to Step 3. Otherwise, go to Step 2.
Step 2. If the constraints (4.2.1a) are satisfied, go to Step 4. Otherwise, go
to Step 3.
Step 3. Obtain the next iterate x(k+1) . Set k := k + 1, and go to Step 1.
Step 4. Stop. x(k) is a feasible solution.
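A minimal sketch of how the checks of Algorithm 4.2.1 can be wrapped around
an iteration loop is given below; it reuses phi_eps from the sketch above, and
next_iterate is a hypothetical callback standing in for one step of the chosen
unconstrained optimization method.

    def feasibility_search(x0, h_list, eps, next_iterate, max_iter=1000):
        """Sketch of Algorithm 4.2.1: minimize J_eps and stop once feasible."""
        m = len(h_list)
        x = x0
        for _ in range(max_iter):
            J = sum(phi_eps(h(x), eps) for h in h_list)
            # Steps 1-2: cheap bound test first, then the exact feasibility check.
            if J <= m * eps / 4.0 and all(h(x) <= 0.0 for h in h_list):
                return x                  # Step 4: feasible point found
            x = next_iterate(x)           # Step 3: continue minimizing J_eps
        return None                       # search budget exhausted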
We can show that Algorithm 4.2.1 terminates in a finite number of itera-
tions under a mild assumption.
Theorem 4.2.3 Let {x(k)} be a sequence of admissible points generated by
Algorithm 4.2.1. If

    lim_{k→∞} Jε (x(k)) = 0,   (4.2.6)

then there exists a positive integer k0 such that x(k) is a feasible solution of
Problem (4.2.1) for all k > k0 .

Proof. Since lim_{k→∞} Jε (x(k)) = 0, there exists a positive integer N such
that

    Jε (x(k)) ≤ ε/4

for all k > N . Thus, by Theorem 4.2.2, x(k) is a feasible solution of
Problem (4.2.1) for every k > N , so the conclusion holds with k0 = N .

In view of Theorem 4.2.3, we see that Algorithm 4.2.1 will find a feasible
solution of Problem (4.2.1) in a finite number of iterations if (4.2.6) holds.
In fact, if the optimization algorithm simply produces a point x∗ such that
the sufficient condition (4.2.5) is satisfied, then x∗ is a feasible solution of
the inequality constraints (4.2.1), so (4.2.6) is not strictly necessary for finite
convergence.

Remark 4.2.1 Note that Algorithm 4.2.1 can produce a non-feasible solu-
tion, which corresponds to a local minimum point for the corresponding un-
constrained optimization problem. In such a case, the solution should obvi-
ously be rejected. The algorithm may be restarted from another initial point
in an attempt to reach a global minimum.
Remark 4.2.2 ε is required to be positive, but, in practice, it does not need
to be very small. Note that if we choose ε = 0, then Φε becomes nonsmooth.
In such a case, the necessary and sufficient condition for the solvability of
Problem (4.2.1) is Jε = 0. This, however, may require an infinite number of
iterations and is thus not recommended. In general, an appropriate choice of
ε will enhance the convergence characteristic of the algorithm. If the feasible
space is sufficiently large, ε can be larger (and hence speed up convergence).
Otherwise, ε should be made small.

4.2.2 Continuous Inequality Constraints

Here, we shall extend the technique presented in Section 4.2.1 to find a feasible
solution of the following continuous inequality constraints:

hj (x, t) ≤ 0, ∀ t ∈ [0, T ], j = 1, 2, . . . , m, (4.2.7a)

and
a ≤ x ≤ b, (4.2.7b)
where 0 < T < ∞.
For convenience, let this continuous inequality constrained problem be
referred to as Problem (4.2.7).
For any ε > 0, we consider the following optimization problem:
    min Jε (x) = ∫₀ᵀ Σ_{j=1}^m Φε (hj (x, t)) dt   (4.2.8a)

subject to
a ≤ x ≤ b, (4.2.8b)
where Φε (·) is defined by (4.2.3) and, for each j = 1, 2, . . . , m, hj is assumed
to satisfy the following conditions.
Assumption 4.2.1 ∫₀ᵀ Φε (hj (x, t)) dt exists for all x.
Assumption 4.2.2 hj (x, t) is continuous in t ∈ [0, T ], and ∂hj (x, t)/∂t is
piecewise continuous in t ∈ [0, T ] for each x.
Assumption 4.2.3 hj (x, t) is continuously differentiable with respect to x
for all t ∈ [0, T ], except, possibly, at a finite number of points in [0, T ].

Theorem 4.2.4 (Necessary Condition) Let x0 ∈ Rn be a solution of
Problem (4.2.7). Then,

    Jε (x0) ≤ mεT /4.   (4.2.9)

Proof. In view of the constraint transcription defined in (4.2.3), it is clear
from (4.2.7a) that

    Φε (hj (x0 , t)) ≤ ε/4, ∀ t ∈ [0, T ], j = 1, 2, . . . , m.   (4.2.10)

Thus, the conclusion follows readily from (4.2.8a).

Our next task is to derive a sufficient condition for Problem (4.2.7). We


first require the following lemma.
Lemma 4.2.1 Let f be a non-negative valued function defined on [0, T ]. If
f is continuous on [0, T ] and df /dt is piecewise continuous on [0, T ], then

    ∫₀ᵀ f (t) dt ≥ (f̃/2) min { f̃/M , T },   (4.2.11a)

where

    M = max_{t∈[0,T ]} |df (t)/dt|   (4.2.11b)

and

    f̃ = max_{t∈[0,T ]} f (t).   (4.2.11c)

Proof. Let t0 ∈ [0, T ] be such that f (t0 ) = f̃. There are three cases to be
considered:

    t0 + f̃/M ≤ T,   (4.2.12a)
    t0 − f̃/M ≥ 0,   (4.2.12b)
    t0 − f̃/M < 0 and t0 + f̃/M > T.   (4.2.12c)

Case (4.2.12a): Define

    h(t) = f (t0 ) − M (t − t0 ).   (4.2.13)

Then, for t ∈ [t0 , t0 + f̃/M ],

    f (t) − h(t) = f (t0 ) + ∫_{t0}^{t} (df (s)/ds) ds − f (t0 ) + M (t − t0 )
                 = ∫_{t0}^{t} (df (s)/ds + M ) ds ≥ 0.

Hence,

    ∫₀ᵀ f (t) dt ≥ ∫_{t0}^{t0+f̃/M} f (t) dt ≥ ∫_{t0}^{t0+f̃/M} h(t) dt = (1/2) f̃ ²/M.   (4.2.14)

Case (4.2.12b): Define

    h(t) = f (t0 ) + M (t − t0 ).   (4.2.15)

Then, for t ∈ [t0 − f̃/M , t0 ],

    f (t) − h(t) = f (t0 ) + ∫_{t0}^{t} (df (s)/ds) ds − f (t0 ) − M (t − t0 )
                 = ∫_{t}^{t0} (M − df (s)/ds) ds ≥ 0.

Hence,

    ∫₀ᵀ f (t) dt ≥ ∫_{t0−f̃/M}^{t0} f (t) dt ≥ ∫_{t0−f̃/M}^{t0} h(t) dt = (1/2) f̃ ²/M.   (4.2.16)

Case (4.2.12c): Define

    h(t) = f (t0 ) − M (t − t0 ) for t ≥ t0 , and h(t) = f (t0 ) + M (t − t0 ) for t < t0 .
                                                   (4.2.17)

Then,

    f (t) − h(t) = ∫_{t0}^{t} (df (s)/ds + M ) ds ≥ 0, t ≥ t0 ,
    f (t) − h(t) = ∫_{t}^{t0} (M − df (s)/ds) ds ≥ 0, t < t0 ,

so f (t) ≥ h(t) for all t ∈ [0, T ], and hence

    ∫₀ᵀ f (t) dt ≥ ∫₀ᵀ h(t) dt = ∫_{t0}ᵀ h(t) dt + ∫₀^{t0} h(t) dt
                 = (T − t0 ) [ f̃ − (M/2)(T − t0 ) ] + t0 [ f̃ − (M/2) t0 ]
                 = T f̃ − (M/2)(T − t0 )² − (M/2)(t0 )².   (4.2.18)

Note that

    t0 + f̃/M > T ⟹ M (T − t0 ) < f̃, and t0 − f̃/M < 0 ⟹ M t0 < f̃.

We have

    T f̃ − (M/2)(T − t0 )² − (M/2)(t0 )² ≥ T f̃ − (1/2) [ (T − t0 ) f̃ + t0 f̃ ]
                                        = T f̃ − (T/2) f̃ = (T/2) f̃.

Thus, it follows from (4.2.18) that

    ∫₀ᵀ f (t) dt ≥ (T/2) f̃.   (4.2.19)

Combining (4.2.14), (4.2.16) and (4.2.19), we obtain

    ∫₀ᵀ f (t) dt ≥ min { f̃ ²/(2M ), (T/2) f̃ } = (f̃/2) min { f̃/M , T }.

Thus, the proof is complete.


With the help of Lemma 4.2.1, we are in a position to present the sufficient
condition for Problem (4.2.7).
Theorem 4.2.5 Let x0 be an admissible vector such that

    Jε (x0) ≤ (ε/8) min { ε/(4M ), T },   (4.2.20)

where

    M = max { |∂hj (x0 , t)/∂t| : t ∈ [0, T ], j = 1, 2, . . . , m }.   (4.2.21)

Then x0 is a solution of Problem (4.2.7).



Proof. Suppose x0 is not feasible. Then, by virtue of Assumption 4.2.2, there
exist some j ∈ {1, 2, . . . , m} and an open set θj ⊆ [0, T ] with positive measure
such that

    hj (x0 , t) > 0, ∀ t ∈ θj .   (4.2.22)

By the second part of Assumption 4.2.2, there exists a positive constant M
satisfying (4.2.21). Now, we define

    M̄ = max { |∂Φε (hj (x0 , t))/∂t| : t ∈ [0, T ], j = 1, 2, . . . , m }.   (4.2.23)

Clearly,

    |∂Φε (hj (x0 , t))/∂t| ≤ |∂Φε (hj (x0 , t))/∂hj | |∂hj (x0 , t)/∂t|.   (4.2.24)

Since

    |∂Φε (hj (x0 , t))/∂hj | ≤ 1, ∀ t ∈ [0, T ],

it is clear that

    M̄ ≤ M.   (4.2.25)

Thus, by Lemma 4.2.1, we have

    ∫₀ᵀ Φε (hj (x0 , t)) dt ≥ (Φ∗ε/2) min { Φ∗ε/M̄ , T },   (4.2.26)

where

    Φ∗ε = max_{t∈[0,T ]} Φε (hj (x0 , t)).   (4.2.27)

By (4.2.22) and (4.2.3), we have

    Φ∗ε > ε/4.   (4.2.28)

Thus, it follows from (4.2.28) and (4.2.25) that

    (Φ∗ε/2) min { Φ∗ε/M̄ , T } > (ε/8) min { ε/(4M̄ ), T } ≥ (ε/8) min { ε/(4M ), T }.
                                                   (4.2.29)

This is, however, a contradiction to (4.2.20). Thus, the proof is complete.

Note that the gradient of the cost function (4.2.8a) with respect to each
x ∈ [a, b] is given by

    ∇Jε (x) = ∫₀ᵀ Σ_{j=1}^m ∂Φε (hj (x, t))/∂x dt,   (4.2.30)

where

    ∂Φε (hj (x, t))/∂x = ∂hj (x, t)/∂x                          if hj (x, t) ≥ ε,
                       = [(hj (x, t) + ε)/(2ε)] ∂hj (x, t)/∂x   if −ε ≤ hj (x, t) ≤ ε,
                       = 0                                       if hj (x, t) ≤ −ε.
                                                                 (4.2.31)
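In a computational scheme, the integrals in (4.2.8a) and (4.2.30) are replaced
by a quadrature rule. The sketch below (ours) approximates Jε by the
trapezoidal rule, reusing phi_eps from the sketch in Section 4.2.1 and
assuming each φj (x, ·) is supplied as a callable vectorized in t.

    import numpy as np

    def J_eps_continuous(x, h_list, eps, T, n_quad=256):
        """Approximate J_eps(x) of (4.2.8a) by the trapezoidal rule.

        Each h_j(x, t) in h_list is assumed to be a callable vectorized in t;
        phi_eps is the smoothing function from the earlier sketch.
        """
        t = np.linspace(0.0, T, n_quad + 1)
        integrand = sum(phi_eps(h(x, t), eps) for h in h_list)
        return np.trapz(integrand, t)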
Thus, Problem (4.2.8) can be viewed as a standard unconstrained opti-
mization problem and hence is solvable by any efficient unconstrained opti-
mization technique such as the quasi-Newton method (see Chapter 2). We
may summarize these findings in the following algorithm.
Algorithm 4.2.2 At the k-th iteration of the unconstrained optimization
algorithm with iterate x(k) , we insert the following steps:
Step 1. If Jε (x(k)) > mεT /4, go to Step 3; otherwise, go to Step 2.
Step 2. Check if the constraints (4.2.7a) are satisfied. If so, go to Step 4;
otherwise, go to Step 3.
Step 3. Continue the next iteration of the unconstrained optimization method
to obtain x(k+1) . Set k := k + 1, and go to Step 1.
Step 4. Stop. x(k) is a feasible solution.
The following theorem shows that Algorithm 4.2.2 terminates in a finite
number of iterations.

Theorem 4.2.6 Let {x(k)} be a sequence of admissible points generated by
Algorithm 4.2.2. If

    lim_{k→∞} Jε (x(k)) = 0,   (4.2.32)

then there exists a positive integer k0 such that x(k) is a feasible solution of
Problem (4.2.7) for all k > k0 .

Proof. Since

    lim_{k→∞} Jε (x(k)) = 0,

there exists a positive integer N such that

    Jε (x(k)) ≤ (ε/8) min { ε/(4M ), T }   (4.2.33)

for all k > N . Thus, by Theorem 4.2.5, x(k) is a feasible solution of
Problem (4.2.7) for every k > N . This completes the proof.

As in the previous case, it is clear from Theorem 4.2.6 that Algorithm 4.2.2
will find a feasible solution of Problem (4.2.7) in a finite number of iterations
if (4.2.32) holds. Again, it is possible that a solution can still be obtained in a
finite number of steps even when (4.2.32) is not satisfied. Also, the algorithm
may converge to a non-global minimum, in which case one may repeat it with
a different starting point in the hope of reaching the global optimum.

4.3 Continuous Inequality Constraint Transcription


Approach

We consider a class of optimization problems subject to continuous inequality


constraints, where the cost function

f (x) (4.3.1a)

is minimized subject to inequality constraints

hj (x) ≤ 0, j = 1, 2, . . . , p (4.3.1b)

and continuous inequality constraints

φj (x, t) ≤ 0, ∀ t ∈ [0, T ], j = 1, 2, . . . , m, (4.3.1c)

where x ∈ Rn is the parameter vector to be found and T is such that 0 <


T < ∞. We assume throughout this section that the following conditions are
satisfied.
Assumption 4.3.1 hj , j = 1, . . . , p, are continuously differentiable in Rn .
Assumption 4.3.2 φj : Rn × R → R, j = 1, . . . , m, satisfy Assump-
tions 4.2.2 and 4.2.3.
For convenience, let this constrained optimization problem be referred to
as Problem (4.3.1).
We shall present two computational methods for solving Problem (4.3.1).
For the first method, the constraint transcription introduced in Section 4.2
is used to transform the continuous inequality constraints into equality con-
straints. However, these equality constraints are nonsmooth. Thus a local
smoothing technique, also introduced in Section 4.2, is used to approximate
these equality constraint functions by smooth inequality constraints. Then
we construct a sequence of approximate optimization problems subject to
conventional inequality constraints. Each of these approximate problems can
be viewed as a conventional optimization problem and hence can be solved
by existing optimization techniques, such as those reported in Chapter 3.
With this ground work, we can construct an effective algorithm for solving
Problem (4.3.1). The proposed algorithm will depend on two parameters, one
controlling the smoothing and the other controlling the position of solutions
(with respect to feasibility–infeasibility) of a sequence of conventional opti-
mization problems. It will be shown that the algorithm produces a sequence
of suboptimal parameter vectors approaching an optimal parameter vector


from inside the feasible region of Problem (4.3.1). For the second method, we
first apply the same constraint transcription and smoothing technique that
were suggested in Section 4.2 to transform the continuous inequality con-
straints. Then, a special penalty function will be used to incorporate these
transformed constraints into the cost function, thus forming a new augmented
cost function. We show that this is different from usual penalty function meth-
ods in that the penalty weighting factor does not need to approach infinity
to yield a feasible solution. As for the first method, the smoothing technique
gives rise to a sequence of approximate problems, each of which can be re-
garded as a conventional unconstrained optimization problem. Thus, each
of the approximate problems can be solved by existing optimization tech-
niques such as the quasi-Newton method. The algorithm for solving (4.3.1)
then involves constructing and solving a number of approximate problems.
It depends on two parameters, one controlling the smoothing and the other
controlling the position of solutions with respect to the feasible region. It is
further shown that, as for the first method, the proposed algorithm produces
a sequence of suboptimal parameter vectors which approaches an optimal
parameter vector of (4.3.1) from inside the feasible region of (4.3.1). The
advantage of the second method is that it effectively turns Problem (4.3.1)
into an unconstrained problem that is easier to solve. Thus it can handle
problems of much larger dimensions.

4.3.1 The First Method

The method presented in this subsection first appeared in [103].


For each j = 1, 2, . . . , m, define
    Gj (x) = ∫₀ᵀ max {φj (x, t), 0} dt.

Since φj is continuously differentiable in x and t, max {φj (x, t) , 0} is a con-


tinuous function of t for each x ∈ Rn . Thus, for each j = 1, 2, . . . , m, the
corresponding continuous constraint (4.3.1c) is equivalent to

Gj (x) = 0. (4.3.2)

For convenience, let us rewrite Problem (4.3.1) with (4.3.1c) replaced by


(4.3.2) as follows:
Minimize the cost function
f (x) (4.3.3a)
subject to
hj (x) ≤ 0, j = 1, 2, . . . , p (4.3.3b)

and
Gj (x) = 0, j = 1, 2, . . . , m. (4.3.3c)
This problem is referred to as Problem (4.3.3). Clearly, Problem (4.3.1) is
equivalent to Problem (4.3.3).
Note that for each j = 1, 2, . . . , m, Gj (x) is nonsmooth in x. Hence, stan-
dard optimization routines can have difficulty with these equality constraints.
Let
Θ = {x ∈ Rn : hj (x) ≤ 0, j = 1, 2, . . . , p} , (4.3.4)
and let F be the feasible region of Problem (4.3.1) defined by

F = {x ∈ Θ : φj (x, t) ≤ 0, ∀t ∈ [0, T ], j = 1, 2, . . . , m}
= {x ∈ Θ : Gj (x) = 0, j = 1, 2, . . . , m} . (4.3.5)
Furthermore, let Θ̊ (respectively, F̊) denote the interior of the set Θ
(respectively, F) in the sense that

    Θ̊ = {x ∈ Rn : hj (x) < 0, j = 1, 2, . . . , p}   (4.3.6)

and

    F̊ = {x ∈ Θ̊ : φj (x, t) < 0, ∀ t ∈ [0, T ], j = 1, 2, . . . , m}.   (4.3.7)

We assume that the following condition is satisfied.



Assumption 4.3.3 F̊ ≠ ∅.
The smoothing technique is to replace max {φj (x, t) , 0} by Φε (φj (x, t)),
where Φε (·) is defined by (4.2.3). Note that for each j = 1, 2, . . . , m,
Φε (φj (x, t)) is continuously differentiable in x. However, its second derivative
is discontinuous at those points where φj (x, t) = ±ε. For each j = 1, 2, . . . , m,
define

    Gj,ε (x) = ∫₀ᵀ Φε (φj (x, t)) dt.   (4.3.8)
We assume that the following condition is satisfied.
Assumption 4.3.4 ∫₀ᵀ Φε (φj (x, t)) dt exists for all x.
Clearly, for each j = 1, . . . , m, Gj,ε is continuously differentiable in x. We
now define two related approximate problems. The first approximate problem
is as follows: Minimize
f (x) (4.3.9a)
subject to
x∈Θ (4.3.9b)
and

Gj,ε (x) = 0, j = 1, 2, . . . , m. (4.3.9c)


This problem is referred to as Problem (4.3.9). Note that the equality con-
straints specified in (4.3.9c) do not satisfy the regular point condition, which
is also known as the constraint qualification (see Chapter 3). Thus, it is not
advisable to solve it numerically as such. For this reason, we consider our
second approximate problem as follows: Minimize

f (x) (4.3.10a)

subject to
x∈Θ (4.3.10b)
and
Gj,ε (x) ≤ τ, j = 1, 2, . . . , m. (4.3.10c)
This problem is referred to as Problem (4.3.10). Let Fε be the feasible region
of Problem (4.3.9) defined by

Fε = {x ∈ Θ : Gj,ε (x) = 0, j = 1, 2, . . . , m} . (4.3.11)

Then, for each ε > 0, Fε ⊂ F. We assume that the following condition is


satisfied.
Assumption 4.3.5 Let x∗ be an optimal parameter vector of Problem (4.3.1).
Then, there exists a parameter vector x̄ ∈ F̊ such that

    αx̄ + (1 − α) x∗ ∈ F̊

for all α ∈ (0, 1].


For the subsequent convergence result, we need the following assumption.
Assumption 4.3.6 The set Θ defined by (4.3.4) is compact.
Remark 4.3.1 As an alternative to Assumption 4.3.6, we can impose the
requirement that f (x) → ∞ as ∥x∥ → ∞.
We now relate the solutions of Problem (4.3.1) and Problem (4.3.9) as
ε → 0 in the following theorem.
Theorem 4.3.1 Let x∗ be an optimal solution to Problem (4.3.1), and let
x∗ε be an optimal solution to Problem (4.3.9). Then,

    lim_{ε→0} f (x∗ε ) = f (x∗ ).


Proof. By Assumption 4.3.5, there exists an x̄ ∈ F̊ such that

    xα = αx̄ + (1 − α) x∗ = x∗ + α (x̄ − x∗ ) ∈ F̊, ∀ α ∈ (0, 1].   (4.3.12)

For any δ1 > 0, there exists an α1 ∈ (0, 1] such that

f (x∗ ) ≤ f (xα ) ≤ f (x∗ ) + δ1, ∀α ∈ (0, α1 ) . (4.3.13)



Choose α2 = α1 /2. Then it is clear that xα2 ∈ F̊. Thus, there exists a δ2 > 0
such that

φj (xα2 , t) < −δ2 , ∀ t ∈ [0, T ] and j = 1, 2, . . . , m.

If we choose ε = δ2 , then xα2 ∈ Fε . Thus it follows that

f (x∗ε ) ≤ f (xα2 ) . (4.3.14)

From (4.3.13) and (4.3.14), we have

f (x∗ ) ≤ f (x∗ε ) ≤ f (x∗ ) + δ1 .

Letting ε → 0 and noting that δ1 > 0 is arbitrary, the conclusion of the


theorem follows.

Theorem 4.3.2 Let x∗ and x∗ε be as in Theorem 4.3.1. Then, the sequence
{x∗ε } has an accumulation point. Furthermore, any accumulation point of the
sequence {x∗ε } is an optimal parameter vector of Problem (4.3.1).

Proof. By Assumption 4.3.6, we note that {x∗ε } is in a compact set. Thus,


there exists a subsequence, denoted again by the original sequence, such that

    lim_{ε→0} ∥x∗ε − x̂∥ = 0.   (4.3.15)

We shall show that x̂ ∈ F. Suppose not. Then, there exist a j and a non-zero
interval I ⊂ [0, T ] such that

max {φj (x̂, t) , 0} = φj (x̂, t) > 0, ∀ t ∈ I.

However, x∗ε ∈ Fε ⊂ F so that φj (x∗ε , t) ≤ −ε. Since, for each t, φj (·, t) is


continuous in Θ, it follows from (4.3.15) that

|φj (x∗ε , t) − φj (x̂, t)| → 0, as ε → 0.

Furthermore, |φj (x∗ε , t) − φj (x̂, t)| is uniformly bounded in Θ ×[0, T ]. Hence,


by the Lebesgue Dominated Convergence Theorem (Theorem A.1.10), we
have

    lim_{ε→0} ∫_I |φj (x∗ε , t) − φj (x̂, t)| dt = 0.

Therefore,

    0 < ∫_I φj (x̂, t) dt ≤ lim_{ε→0} ∫_I |φj (x∗ε , t) − φj (x̂, t)| dt = 0.

This contradiction shows that x̂ is feasible.


Recall that f is continuous on Rn . Thus, by (4.3.15), we have

    lim_{ε→0} f (x∗ε ) = f (x̂).

Combining this result and Theorem 4.3.1, the conclusion follows.


Theorem 4.3.3 For any ε > 0, there exists a τ (ε) > 0 such that for all
τ , 0 < τ < τ (ε), any feasible solution of Problem (4.3.10) is also a feasible
point of Problem (4.3.1).
Proof. We first recall that for each j = 1, 2, . . . , m, φj is continuously dif-
ferentiable in Θ × [0, T ]. Since Θ × [0, T ] is compact, there exists a positive
constant mj such that

    |∂φj (x, t)/∂t| ≤ mj , ∀ (x, t) ∈ Θ × [0, T ].   (4.3.16)

Next, for any ε > 0, define

    kj,ε = (ε/16) min { T /2, ε/(2mj ) }.   (4.3.17)

It suffices to show that


Fj,ε,τ ⊂ Fj (4.3.18)
for any τ such that
0 < τ < kj,ε , (4.3.19)
where
Fj,ε,τ = {x ∈ Θ : Gj,ε (x) ≤ τ } (4.3.20)
and
Fj = {x ∈ Θ : Gj (x) = 0} . (4.3.21)
Assume the contrary. Then there exists an x ∈ Θ such that

Gj,ε (x) ≤ τ < kj,ε , (4.3.22)

but
Gj (x) > 0. (4.3.23)
Since φj is a continuously differentiable function in [0, T ], (4.3.23) implies
that there exists a t̄ ∈ [0, T ] such that

φj (x, t̄) > 0. (4.3.24)

Again by continuity, there exists an interval Ij ⊂ [0, T ] containing t̄ such that

    φj (x, t) > −ε/2, ∀ t ∈ Ij .   (4.3.25)
[Fig. 4.3.1: Geometrical interpretation of Theorem 4.3.3 — sketch of φj versus
t showing the interval Ij around t̄ on which φj > −ε/2; the slope bound
tan(θ) = mj limits how fast φj can decrease away from t̄.]

See Figure 4.3.1. Using (4.3.16), it is clear from (4.3.25) that the length |Ij |
of the interval Ij must satisfy

    |Ij | ≥ min { T /2, ε/(2mj ) }.   (4.3.26)

From the definition of Gj,ε (x), note that

    Gj,ε (x) = ∫₀ᵀ Φε (φj (x, t)) dt ≥ ∫_{Ij} Φε (φj (x, t)) dt
             ≥ min_{t∈Ij} Φε (φj (x, t)) |Ij |
             ≥ min_{t∈Ij} { (φj (x, t) + ε)²/(4ε) } |Ij |
             ≥ (ε/16) min { T /2, ε/(2mj ) } = kj,ε .

This is a contradiction to (4.3.22). Thus, the proof is complete.

Algorithm 4.3.1 (Note that the integral in (4.3.8) is replaced by any suit-
able quadrature with positive weights in any computational scheme.)

Step 1. Choose ε = 10⁻¹, τ = εT /4 (say).
Step 2. Solve Problem (4.3.10) to an accuracy of max {10⁻⁸, 10⁻³ε} to give
        x∗ε,τ .
Step 3. Check the feasibility of φj (x∗ε,τ , t) ≤ 0 for all j = 1, 2, . . . , m at the
        quadrature points. If x∗ε,τ is feasible, go to Step 4. Otherwise, set
        τ = τ /2. If τ < 10⁻¹⁰, stop, abnormal exit. Else, go to Step 2.
Step 4. Set ε = ε/10, and τ = τ /10. If ε > 10⁻⁷, go to Step 2. Otherwise,
        stop, successful exit.
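The control flow of Algorithm 4.3.1 reduces to a short driver loop. In the
following sketch (ours), solve_subproblem and is_feasible are hypothetical
callbacks: the first solves Problem (4.3.10) for the given ε and τ from a warm
start, and the second checks φj (x, t) ≤ 0 at the quadrature points.

    def algorithm_4_3_1(solve_subproblem, is_feasible, T):
        """Driver loop for Algorithm 4.3.1 (eps/tau reduction schedule)."""
        eps = 1e-1
        tau = eps * T / 4.0                       # Step 1
        x = None
        for _ in range(7):                        # eps = 1e-1, 1e-2, ..., 1e-7
            x = solve_subproblem(eps, tau, x)     # Step 2
            while not is_feasible(x):             # Step 3: halve tau until feasible
                tau /= 2.0
                if tau < 1e-10:
                    raise RuntimeError('abnormal exit: tau below 1e-10')
                x = solve_subproblem(eps, tau, x)
            eps /= 10.0                           # Step 4: reduce eps and tau
            tau /= 10.0
        return x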

Remark 4.3.2 Note that the number of quadrature points may need to be
increased as ε → 0 and that for large values of ε, a small number of quadrature
points are sufficient.

Remark 4.3.3 From Theorem 4.3.3, we see that the halving process of τ in
Step 3 of the algorithm only needs to be carried out a finite number of times.
Thus, the algorithm produces a sequence of suboptimal parameter vectors to
Problem (4.3.1), where each of them is in the feasible region of (4.3.1).

Remark 4.3.4 From the proof of Theorem 4.3.3, we see that ε and τ are
closely related. At the solution of a particular problem, if a constraint is active
over a large fraction of [0, T ], then it appears that τ = O(ε). On the other
hand, if the constraint is active at only one point in [0, T ], then τ = O(ε²).

Theorem 4.3.4 Let {x∗ε,τ} be a sequence of the suboptimal parameter vec-
tors produced by the above algorithm. Then,

    f (x∗ε,τ ) → f (x∗ ),

and any accumulation point of {x∗ε,τ} is a solution of Problem (4.3.1).

Proof. Clearly,

    f (x∗ ) ≤ f (x∗ε,τ ) ≤ f (x∗ε ).

Since f (x∗ε ) → f (x∗ ), it follows that f (x∗ε,τ ) → f (x∗ ).
To prove the second part of the theorem, we note from Assumption 4.3.6
that the sequence {x∗ε,τ} is in a compact set. Thus, the existence of an
accumulation point is assured. On this basis, the conclusion follows easily
from the continuity of the function f .

Example 4.3.1 We choose an example that was used in [81] and [241]. Min-
imize

    f (x) = [x2 (122 + 17x1 + 6x3 − 5x2 + x1 x3 ) + 180x3 − 36x1 + 1224]
            / [x2 (408 + 56x1 − 50x2 + 60x3 + 10x1 x3 − 2x1²)]   (4.3.27a)

subject to

    φ (x, ω) ≤ 0, ∀ ω ∈ Ω,   (4.3.27b)

where

    φ (x, ω) = J(T (x, ω)) − 3.33 [R(T (x, ω))]² + 1.0,

and where Ω = [10⁻⁶, 30], i² = −1, T (x, ω) = 1 + H (x, iω) G(iω),

    H (x, s) = x1 + x2 /s + x3 s,

    G (s) = 1 / [(s + 3)(s² + 2s + 2)],

and R(·) and J(·) denote the real and imaginary parts of their arguments,
respectively. Finally, there are simple bounds on the variables:

    0 ≤ x1 ≤ 100, 0.1 ≤ x2 ≤ 100, 0 ≤ x3 ≤ 100.   (4.3.27c)
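The constraint function of this example is straightforward to evaluate with
complex arithmetic. The following Python sketch is ours (the names are for
illustration only); the 2001-point grid matches the feasibility check used in
the tables of this example, and the test point is taken from the last row of
Table 4.3.1.

    import numpy as np

    def phi_constraint(x, omega):
        """phi(x, omega) of Example 4.3.1: Im(T) - 3.33*Re(T)**2 + 1.0,
        where T(x, omega) = 1 + H(x, i*omega) * G(i*omega)."""
        s = 1j * np.asarray(omega, dtype=float)
        H = x[0] + x[1] / s + x[2] * s
        G = 1.0 / ((s + 3.0) * (s**2 + 2.0 * s + 2.0))
        T = 1.0 + H * G
        return T.imag - 3.33 * T.real**2 + 1.0

    # Feasibility check over 2001 equally spaced points on Omega = [1e-6, 30].
    omega = np.linspace(1e-6, 30.0, 2001)
    x = np.array([16.75368, 45.45823, 34.80088])  # last row of Table 4.3.1
    print(phi_constraint(x, omega).max())         # small, cf. the g column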

We apply Algorithm 4.3.1 to the problem and report the results in Table 4.3.1.
In the table, we report a “failure” parameter F . F = 0 indicates normal termi-
nation, and a minimum has been found. F = 2 indicates that the convergence
criteria are not satisfied, but no downhill search could be determined. F = 3
means the maximum number of iterations, 50, has been exceeded. The num-
ber of function evaluations is given by nf with a ‘∗’ indicating when the
maximum number of iterations was reached. ε, τ and f and x1 , x2 and x3
are self-explanatory. ω̄ gives the approximate value of ω where φ attains its
maximum over Ω at the computed x value. The value of g (x) reported gives
an idea of feasibility/infeasibility over 2001 equally spaced points in Ω, which
is a much finer partition than that used for the quadrature formula. A ‘∗’
in these columns indicates that the solution found is well within the feasible
region. Finally, λ is the Lagrange multiplier for the constraint. Note, in par-
ticular, the good failure record of F = 0 for all iterations, leaving no doubt
that a solution has been found.

Table 4.3.1: Problem (4.3.27)

ε      τ        ω̄      g          f          x1        x2        x3        nf  F  λ
10⁻¹   10⁻²     4.095  −2.9×10⁻²  0.2223351  28.23895  22.40785  19.51560  25  0   0.0
10⁻²   10⁻³     5.655   1.2×10⁻³  0.1745032  17.03023  45.46166  34.69278  48  0  −0.440
10⁻²   5×10⁻⁴   5.655  −1.4×10⁻³  0.1747744  16.97336  45.46475  34.59242   5  0  −0.706
10⁻³   5×10⁻⁵   5.655   5.3×10⁻⁶  0.1746270  17.00433  45.46298  34.64676  11  0  −0.975
10⁻⁴   5×10⁻⁶   5.670   1.8×10⁻⁴  0.1746131  16.75351  45.45822  34.80058  19  0  −0.991
10⁻⁵   5×10⁻⁷   5.670   7.9×10⁻⁵  0.1746124  16.75367  45.45822  34.80085   8  0  −0.991
10⁻⁶   5×10⁻⁸   5.670   2.7×10⁻⁵  0.1746123  16.75368  45.45823  34.80088   6  0  −0.990
10⁻⁷   5×10⁻⁹   5.670   8.8×10⁻⁶  0.1746123  16.75368  45.45823  34.80088   6  0  −0.990

Table 4.3.2 shows the effect of increasing the quadrature accuracy as ε


is decreased for our algorithm. The number of quadrature points at each
iteration is indicated by nq. While fewer function evaluations are required for
the earlier iterations, as expected, more are required for the later iterations.

Table 4.3.2: Problem (4.3.27) with variable quadrature points

ε      τ         nq   ω̄      g          f          x1        x2        x3        nf  F  λ
10⁻¹   10⁻²        8  3.720  −8.5×10⁻²  0.2417111  26.28261  20.49562  16.42453  13  0   0.0
10⁻²   10⁻³       16  ∗      ∗          0.1788802  15.94624  31.52512  34.15840  30  0   0.0
10⁻³   10⁻⁴       32  5.610  −5.3×10⁻⁴  0.1747647  17.80857  44.56686  34.13024  34  0  −0.325
10⁻⁴   10⁻⁵       64  5.670   1.6×10⁻⁴  0.1746221  16.77723  44.63652  34.76034  28  0  −0.490
10⁻⁵   10⁻⁶      128  5.670   7.5×10⁻⁵  0.1746210  16.77749  44.63660  34.76076   8  0  −0.693
10⁻⁵   5×10⁻⁷    128  5.670   7.4×10⁻⁵  0.1746214  16.77740  44.63670  34.76061   6  0  −0.980
10⁻⁶   5×10⁻⁸    256  5.655   2.4×10⁻⁵  0.1746128  16.82795  45.48089  34.75696  18  0  −1.394
10⁻⁶   2.5×10⁻⁸  256  5.655   2.4×10⁻⁵  0.1746129  16.82788  45.48090  34.75699   5  0  −1.969
10⁻⁷   2.5×10⁻⁹  512  5.655   6.3×10⁻⁶  0.1746168  16.96054  45.48194  34.67752  29  0  −1.901

4.3.2 The Second Method

In this section, we shall introduce a second computational procedure for solv-


ing Problem (4.3.1). It was originally proposed in [259]. We first apply the
same constraint transcription introduced in Section 4.2 to the continuous
inequality constraints. For each j = 1, 2, . . . , m, define
 T
Gj (x) = max{φj (x, t), 0}dt. (4.3.28)
0

We assume once more that Assumptions 4.3.1, 4.3.2 and 4.3.4 are satisfied.
The continuous inequality constraints (4.3.1c) are equivalent to

Gj (x) = 0, j = 1, 2, . . . , m. (4.3.29)

Let Θ, Θ̊, F and F̊ be as defined in (4.3.4), (4.3.6), (4.3.5) and (4.3.7),
respectively. We further assume that Assumptions 4.3.3, 4.3.5, and 4.3.6 are
satisfied.
Note that, for each j = 1, 2, . . . , m, Gj (x) is, in general, nonsmooth
in x. Consequently, standard optimization routines would have difficulties
with this type of equality constraints. The smoothing technique is to replace
max{φj (x, t), 0} with φj,ε (x, t), where

    φj,ε (x, t) = 0                       if φj (x, t) < −ε,
                = (φj (x, t) + ε)²/(4ε)   if −ε ≤ φj (x, t) ≤ ε,   (4.3.30)
                = φj (x, t)               if φj (x, t) > ε.

For each j = 1, 2, . . . , m, define


    Gj,ε (x) = ∫₀ᵀ φj,ε (x, t) dt.   (4.3.31)

Then, since for each j = 1, 2, . . . , m, φj,ε (x, t) is continuously differentiable


in x, so is Gj,ε (x). Let

Fε = {x ∈ Θ : Gj,ε (x) = 0, j = 1, 2, . . . , m}
= {x ∈ Θ : φj (x, t) ≤ −ε, ∀ t ∈ [0, T ], j = 1, 2, . . . , m}. (4.3.32)

Then, clearly, Fε ⊂ F for each ε > 0.


We now define an approximate problem, where the smoothed continuous
constraints are treated as penalty functions.
For γ > 0, determine an x ∈ Θ that minimizes the cost function

    f (x) + γ Σ_{j=1}^m Gj,ε (x).   (4.3.33)

This problem is referred to as Problem (4.3.33). The following result en-


sures the feasibility of a solution to Problem (4.3.33) with respect to Prob-
lem (4.3.1).
Theorem 4.3.5 There exists a γ(ε) > 0 such that for all γ > γ(ε), any
solution to Problem (4.3.33) is also a feasible point of Problem (4.3.1).

Proof. Let x∗ε,γ denote an optimal solution to Problem (4.3.33). Then,

    f (x∗ε,γ ) + γ Σ_{j=1}^m Gj,ε (x∗ε,γ ) ≤ f (x) + γ Σ_{j=1}^m Gj,ε (x),   (4.3.34)

for all x ∈ Θ. Let xε ∈ Fε be fixed. Then, by the definition of Gj,ε ,

Gj,ε (xε ) = 0

for j = 1, 2, . . . , m. Now, since Θ is compact and f is continuous, there exists
an x̄ ∈ Θ such that f (x̄) ≤ f (x) for all x ∈ Θ. Clearly, f (x̄) ≤ f (x∗ε,γ ).
Adding the penalty term to each side, using the definition of xε and (4.3.34),
we have

    f (x̄) + γ Σ_{j=1}^m Gj,ε (x∗ε,γ ) ≤ f (x∗ε,γ ) + γ Σ_{j=1}^m Gj,ε (x∗ε,γ ) ≤ f (xε ).
                                                   (4.3.35)

Rearranging (4.3.35) gives

    γ Σ_{j=1}^m Gj,ε (x∗ε,γ ) ≤ f (xε ) − f (x̄).   (4.3.36)

Letting z = f (xε ) − f (x̄) in (4.3.36), we get

    Σ_{j=1}^m Gj,ε (x∗ε,γ ) ≤ z/γ.

By Theorem 4.3.3, we recall that for any ε > 0, there exists a τ (ε) such that
for all 0 < τ < τ (ε), if Gj,ε (x) < τ , j = 1, 2, . . . , m, then x ∈ F. Thus, by
choosing γ(ε) ≥ z/τ (ε), it follows that for all γ > γ(ε), Σ_{j=1}^m Gj,ε (x∗ε,γ ) < τ .
Consequently, Gj,ε (x∗ε,γ ) < τ , j = 1, 2, . . . , m, and hence x∗ε,γ ∈ F. This
completes the proof.

Theorem 4.3.6 Let x∗ be an optimal solution to Problem (4.3.1), and let


x∗ε,γ be an optimal solution to Problem (4.3.33), in which γ is chosen appro-
priately to ensure that x∗ε,γ ∈ F. Then,

    lim_{ε→0} f (x∗ε,γ ) = f (x∗ ).

Proof. By Assumption 4.3.5, there exists an x̄ ∈ F̊ such that

    xα = αx̄ + (1 − α)x∗ = x∗ + α(x̄ − x∗ ) ∈ F̊

for all α ∈ (0, 1]. Now, for any δ1 > 0, there exists an α1 ∈ (0, 1] such that

    f (x∗ ) ≤ f (xα ) ≤ f (x∗ ) + δ1 ,   (4.3.37)

for all α ∈ (0, α1 ). Choose α2 = α1 /2. Then it is clear that xα2 ∈ F̊. Thus,
there exists a δ2 > 0 such that

    max_{t∈[0,T ]} φj (xα2 , t) < −δ2 , for j = 1, 2, . . . , m.

If we choose ε = δ2 , then xα2 satisfies Gj,ε (xα2 ) = 0, j = 1, 2, . . . , m. Using
this and the definition of x∗ε,γ , we have

    f (x∗ε,γ ) + γ Σ_{j=1}^m Gj,ε (x∗ε,γ ) ≤ f (xα2 ) + γ Σ_{j=1}^m Gj,ε (xα2 ) = f (xα2 ).

Noting that the penalty term is non-negative, we get

    f (x∗ε,γ ) ≤ f (xα2 ).   (4.3.38)

Combining (4.3.38) with (4.3.37) and remembering that x∗ε,γ is feasible for
Problem (4.3.1), we obtain

    f (x∗ ) ≤ f (x∗ε,γ ) ≤ f (x∗ ) + δ1 .

Letting ε → 0 and noting that δ1 > 0 is arbitrary, the result follows.



4.3.2.1 Solution Method

Based on the results presented in Theorems 4.3.5 and 4.3.6, we now propose
the following algorithm for solving Problem (4.3.1). Essentially, the idea of
the algorithm is to reduce ε from an initial value to a suitably small value in a
number of stages. The value of γ is adjusted at each stage to ensure that the
solution obtained remains feasible. We only use a simple updating scheme
here.
Algorithm 4.3.2 (Note that the integral in (4.3.31) is replaced by any suit-
able quadrature with positive weights.)
Step 1. Choose ε = 10⁻¹, γ = 1 and a starting point x ∈ Θ.
Step 2. Solve Problem (4.3.33) to an accuracy of max{10⁻⁸, 10⁻³ε} to give
        x∗ε,γ .
Step 3. Check the feasibility of φj (x∗ε,γ , t) ≤ 0 for all j = 1, 2, . . . , m at the
        quadrature points. If x∗ε,γ is feasible, go to Step 4. Else, set γ = 2γ.
        If γ > 10⁵, we have an abnormal exit. Else, go to Step 2, using x∗ε,γ
        as the next starting point.
Step 4. Set ε = ε/10. If ε > 10⁻⁷, go to Step 2, using x∗ε,γ as the next
        starting point. Else, we have a successful exit.
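As with Algorithm 4.3.1, the logic of Algorithm 4.3.2 can be summarized in a
small driver loop. In this sketch (ours), solve_penalty_problem and
is_feasible are hypothetical callbacks for minimizing (4.3.33) from a warm
start and for the quadrature-point feasibility check, respectively.

    def algorithm_4_3_2(solve_penalty_problem, is_feasible, x0):
        """Driver loop for Algorithm 4.3.2 (eps reduction, gamma doubling)."""
        eps, gamma, x = 1e-1, 1.0, x0                    # Step 1
        for _ in range(7):                               # eps = 1e-1 down to 1e-7
            x = solve_penalty_problem(eps, gamma, x)     # Step 2
            while not is_feasible(x):                    # Step 3: double gamma
                gamma *= 2.0
                if gamma > 1e5:
                    raise RuntimeError('abnormal exit: gamma exceeds 1e5')
                x = solve_penalty_problem(eps, gamma, x)
            eps /= 10.0                                  # Step 4
        return x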

Remark 4.3.5 Note that it is advisable to use a fixed quadrature scheme.


Also, a small number of quadrature points are sufficient for a large ε and the
number of the quadrature points may need to be increased as ε → 0.

Remark 4.3.6 Using Theorem 4.3.5, we see that the doubling of γ in Step 3
needs only to be carried out a finite number of times. Consequently, the algo-
rithm produces a sequence of suboptimal parameter vectors to Problem (4.3.1),
and each of these is in the feasible region of Problem (4.3.1). Convergence of
the cost is assured by Theorem 4.3.6. Since Θ is a compact set, we know that
the above sequence has an accumulation point. Furthermore, each accumula-
tion point is a solution of Problem (4.3.1).

Remark 4.3.7 Note also that for each ε > 0 and γ > 0, Problem (4.3.33)
is constructed using the concept of penalty functions. However, due to the
special structure of the penalty function used, the penalty weighting factor
γ does not need to go to infinity. In [259], the method was thus referred to
as an exact penalty function method. However, if the solution obtained for
Problem (4.3.33) is a local solution, it is not known whether or not it is a
local solution of the original problem (4.3.1). Thus, to avoid confusion, the
phrase exact penalty function method will be used to refer to the method
introduced in Section 4.4.
Example 4.3.2 We solve Example 4.3.1 once more with the proposed
penalty method. The integral in (4.3.31) is calculated via the trapezoidal
rule with 256 quadrature points. The optimization routine used is based on
the quasi-Newton method (see Chapter 2). The required accuracy asked of
the optimization routine is max{10⁻⁸, 10⁻³ε}.
The results are given in Table 4.3.3. The number of function evaluations
made at each iteration of the algorithm is given in the column nf . The
columns marked ε, γ, f , x1 , x2 and x3 are self-explanatory. Finally we check
the maximum value of g(x) for 2001 equally spaced points of the interval Ω.
This is given in the column marked g, while ω̄ indicates the value of ω at
which this maximum occurred.
It should be noted that, at each iteration, the optimization routine re-
turned with a ‘failure’ parameter of ‘0’. This indicates that the optimum has
been found for each iteration. The algorithm only ensures that φ(x, ω) ≤ 0
for the quadrature points, and this is reflected in the value of g as ε be-
comes small. However, noting that the interval between quadrature points is
of the order of 10−1 , it is quite reasonable to expect a constraint violation
between quadrature points of the order of 10−4 . An encouraging point to note
about the algorithm is that it tends to keep the solution of the approximate
problems more to the inside of the feasible region.

Table 4.3.3: Iterations for Algorithm 4.3.2

ε      γ    ω̄     g          f          x1        x2        x3        nf
10⁻¹   1.0  5.48  −6.7×10⁻²  0.1822723  16.18185  48.86607  32.07144  21
10⁻²   1.0  5.48  −6.7×10⁻²  0.1822723  16.18185  48.86607  32.07144   1
10⁻³   1.0  5.63   6.2×10⁻⁴  0.1747277  17.57267  48.63336  34.44601  11
10⁻³   2.0  5.63  −7.3×10⁻⁵  0.1748006  17.55490  48.63305  34.42106   4
10⁻⁴   2.0  5.63  −8.7×10⁻⁶  0.1747937  17.55597  48.63278  34.42370   3
10⁻⁵   2.0  5.69   3.7×10⁻⁴  0.1746830  16.53537  48.11178  35.01657  17
10⁻⁶   2.0  5.69   3.7×10⁻⁴  0.1746830  16.53538  48.11178  35.01658   4
10⁻⁷   2.0  5.69   3.7×10⁻⁴  0.1746830  16.53541  48.11173  35.01657   6

4.4 Exact Penalty Function Method

This section basically comes from [300, 301]. Consider the following optimiza-
tion problem with continuous inequality constraints:

minimize f (x) (4.4.1)

subject to
hj (x, t) ≤ 0, ∀t ∈ [0, T ], j = 1, 2, . . . , m. (4.4.2)

Let this problem be referred to as Problem (P ). An exact penalty function


method is introduced for this continuous inequality constrained optimiza-
tion problem. The summation of the integrals of the exact penalty functions
is appended to the cost function forming a new cost function. It is shown
that a minimizer of the cost function can be obtained without requiring the
penalty parameter to go to +∞. Furthermore, any local minimizer of the un-
constrained optimization problem when the penalty parameter is sufficiently
large is a local minimizer of the original problem. This property is not shared
by the approaches reported in Section 4.3. Clearly, this is a major advance-
ment.
For Problem (P ), it is clear that for each x ∈ Rn , max {hj (x, t), 0} is a
continuous function of t, since hj is continuously differentiable. Define

    Sε = {(x, ε) ∈ Rn × R+ : hj (x, t) ≤ ε^γ Wj , ∀ t ∈ [0, T ], j = 1, 2, . . . , m},
                                                   (4.4.3)
where R+ = {α ∈ R : α ≥ 0}, Wj ∈ (0, 1), j = 1, 2, . . . , m, are fixed
constants and γ is a positive real number. Clearly, Problem (P ) is equivalent
to the following problem, which is denoted as Problem (P̃):

minimize f (x) (4.4.4a)

subject to
(x, ε) ∈ S0 , (4.4.4b)
where S0 is simply Sε with ε = 0.
We assume that the following conditions are satisfied.
Assumption 4.4.1 There exists a global minimizer of Problem (P ), imply-
ing that f (x) is bounded from below on S0 .

Assumption 4.4.2 The number of distinct local minimum values of the ob-
jective function of Problem (P ) is finite.

Assumption 4.4.3 Let A denote the set of all local minimizers of Problem
(P ). If x∗ ∈ A, then A(x∗ ) = {x ∈ A : f (x) = f (x∗ )} is a compact set.

We introduce a new exact penalty function fσ (x, ε) as follows:



    fσ (x, ε) = f (x)                            if ε = 0 and hj (x, t) ≤ 0,
                                                    ∀ t ∈ [0, T ], j = 1, 2, . . . , m,
              = f (x) + ε^(−α) Δ(x, ε) + σε^β    if ε > 0,
              = +∞                               otherwise,
                                                 (4.4.5)

where Δ(x, ε), which is referred to as the constraint violation, is defined by

    Δ(x, ε) = Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x, t) − ε^γ Wj }]² dt,   (4.4.6)

α and γ are positive real numbers, β > 2 and σ > 0 is a penalty parameter.
We now introduce a surrogate optimization problem as follows:

min fσ (x, ε) (4.4.7a)

subject to
(x, ε) ∈ Rn × [0, +∞). (4.4.7b)
Let this problem be referred to as Problem (Pσ ). Intuitively, during the
process of minimizing fσ (x, ε), if σ is increased, ε^β should be reduced, mean-
ing that ε should be reduced as β is fixed. Thus ε^(−α) will be increased, and
hence the constraint violation will also be reduced. This means that the value
of [max {0, hj (x, t) − ε^γ Wj }]² must go down, eventually leading to the satis-
faction of the continuous inequality constraints, i.e.,

hj (x, t) ≤ 0, ∀ t ∈ [0, T ], j = 1, 2, . . . , m.

Let {σk }k∈N be a sequence such that σk → ∞ as k → ∞. We will prove in
the next section that, under some mild assumptions, if the parameter σk is
sufficiently large and (x(k),∗ , ε(k),∗ ) is a local minimizer of Problem (Pσk ),
then ε(k),∗ → ε∗ = 0 and x(k),∗ → x∗ with x∗ being a local minimizer of
Problem (P ). The importance of this result is quite obvious.
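To fix ideas, the ε > 0 branch of fσ in (4.4.5), with the violation integral
(4.4.6) approximated by the trapezoidal rule, may be coded as follows. This
is our sketch: the ε = 0 and +∞ branches are omitted, and the callables hj
and the parameter names are assumptions for illustration.

    import numpy as np

    def f_sigma(x, eps, f, h_list, W, sigma, alpha, beta, gamma, T, n_quad=512):
        """Evaluate the eps > 0 branch of the exact penalty function (4.4.5).

        The constraint violation Delta(x, eps) of (4.4.6) is approximated by
        the trapezoidal rule; each h_j(x, t) in h_list is assumed vectorized
        in t, and W holds the fixed weights W_j in (0, 1).
        """
        t = np.linspace(0.0, T, n_quad + 1)
        delta = sum(np.trapz(np.maximum(0.0, h(x, t) - eps**gamma * Wj)**2, t)
                    for h, Wj in zip(h_list, W))
        return f(x) + eps**(-alpha) * delta + sigma * eps**beta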

4.4.1 Convergence Analysis

Taking the gradients of fσ (x, ε) with respect to x and ε gives

    ∂fσ (x, ε)/∂x = ∂f (x)/∂x
        + 2ε^(−α) Σ_{j=1}^m ∫₀ᵀ max{0, hj (x, t) − ε^γ Wj } ∂hj (x, t)/∂x dt   (4.4.8)

and

    ∂fσ (x, ε)/∂ε = −αε^(−α−1) Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x, t) − ε^γ Wj }]² dt
        − 2γε^(γ−α−1) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x, t) − ε^γ Wj } Wj dt + σβε^(β−1)
    = ε^(−α−1) { −α Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x, t) − ε^γ Wj }]² dt
        + 2γ Σ_{j=1}^m ∫₀ᵀ max {0, hj (x, t) − ε^γ Wj } (−ε^γ Wj ) dt } + σβε^(β−1).
                                                   (4.4.9)
For every positive integer k, let (x(k),∗ , ε(k),∗ ) be a local minimizer of
Problem (Pσk ). To obtain the main result, we need the following lemma.

Lemma 4.4.1 Let (x(k),∗ , ε(k),∗ ) be a local minimizer of Problem (Pσk ).
Suppose that fσk (x(k),∗ , ε(k),∗ ) is finite and that ε(k),∗ > 0. Then,

    (x(k),∗ , ε(k),∗ ) ∉ Sε ,

where Sε is defined by (4.4.3).

Proof. Since (x(k),∗ , ε(k),∗ ) is a local minimizer of Problem (Pσk ) and ε(k),∗ >
0, we have

    ∂fσk (x(k),∗ , ε(k),∗ )/∂ε = 0.   (4.4.10)

Let us assume that the conclusion of the lemma is false. Then, we have

    hj (x(k),∗ , t) ≤ (ε(k),∗ )^γ Wj , ∀ t ∈ [0, T ], j = 1, 2, . . . , m.   (4.4.11)

Thus, by (4.4.9)–(4.4.11), we obtain

    0 = ∂fσk (x(k),∗ , ε(k),∗ )/∂ε = σk β (ε(k),∗ )^(β−1) > 0.

This is a contradiction, hence completing the proof.
To continue, we introduce the following definition.

Definition 4.4.1 The constraint qualification is said to be satisfied for the
continuous inequality constraints (4.4.2) at x = x̄ if the following implication
is valid. Suppose that

    ∫₀ᵀ Σ_{j=1}^m ϕj (t) ∂hj (x̄, t)/∂x dt = 0.

Then, ϕj (t) = 0, ∀ t ∈ [0, T ], j = 1, 2, . . . , m.


Let the conditions of Lemma 4.4.1 be satisfied. Then, we have the following
theorem.
 
Theorem 4.4.1 Suppose that (x(k),∗ , ε(k),∗ ) is a local minimizer of Problem
(Pσk ) such that fσk (x(k),∗ , ε(k),∗ ) is finite and ε(k),∗ > 0. If (x(k),∗ , ε(k),∗ ) →
(x∗ , ε∗ ) as k → +∞, and the constraint qualification is satisfied for the con-
tinuous inequality constraints (4.4.2) at x = x∗ , then ε∗ = 0 and x∗ ∈ S0 .
Proof. From Lemma 4.4.1, it follows that (x(k),∗ , ε(k),∗ ) ∉ Sε(k),∗ . Thus,
by (4.4.8), we have

    ∂fσk (x(k),∗ , ε(k),∗ )/∂x
      = ∂f (x(k),∗ )/∂x + 2 (ε(k),∗ )^(−α) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t)
        − (ε(k),∗ )^γ Wj } ∂hj (x(k),∗ , t)/∂x dt
      = 0.   (4.4.12)

Similarly, by (4.4.9), we have

    ∂fσk (x(k),∗ , ε(k),∗ )/∂ε
      = −α (ε(k),∗ )^(−α−1) Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj }]² dt
        − 2γ (ε(k),∗ )^(γ−α−1) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } Wj dt
        + σk β (ε(k),∗ )^(β−1)
      = (ε(k),∗ )^(−α−1) { −α Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj }]² dt
        + 2γ Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } (−(ε(k),∗ )^γ Wj ) dt }
        + σk β (ε(k),∗ )^(β−1)
      = 0.   (4.4.13)

Suppose that ε(k),∗ → ε∗ ≠ 0. Then, by (4.4.13), we observe that its first
term tends to a finite value, while the last term tends to infinity as σk → +∞,
when k → +∞. This is impossible for the validity of (4.4.13). Thus, ε∗ = 0.
Now, by (4.4.12), we obtain

    (ε(k),∗ )^α ∂f (x(k),∗ )/∂x
      + 2 Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } ∂hj (x(k),∗ , t)/∂x dt
      = 0.   (4.4.14)

Thus,

    lim_{k→+∞} { (ε(k),∗ )^α ∂f (x(k),∗ )/∂x
      + 2 Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } ∂hj (x(k),∗ , t)/∂x dt }
    = 2 Σ_{j=1}^m ∫₀ᵀ max {0, hj (x∗ , t)} ∂hj (x∗ , t)/∂x dt
    = 0.   (4.4.15)

Since the constraint qualification is satisfied for the continuous inequality
constraints (4.4.2) at x = x∗ , it follows that, for each j = 1, 2, . . . , m,

    max {0, hj (x∗ , t)} = 0

for each t ∈ [0, T ]. This, in turn, implies that, for each j = 1, 2, . . . , m,
hj (x∗ , t) ≤ 0, ∀ t ∈ [0, T ]. The proof is complete.

Corollary 4.4.1 If x(k),∗ → x∗ ∈ S0 and ε(k),∗ → ε∗ = 0, then

Δ(x(k),∗ , ε(k),∗ ) → Δ(x∗ , ε∗ ) = 0.

Proof. The conclusion follows readily from the definition of Δ(x, ε) and the
continuity of hj (x, t).

For the exact penalty function constructed in (4.4.5), we have the following
results.

Theorem 4.4.2 Assume that hj (x(k),∗ , t) = o((ε(k),∗ )^δ ), δ > 0, j =
1, 2, . . . , m. Suppose that γ > α, δ > α, −α − 1 + 2δ > 0 and 2γ − α − 1 > 0.
Then, as ε(k),∗ → ε∗ = 0 and x(k),∗ → x∗ ∈ S0 , it holds that

    fσk (x(k),∗ , ε(k),∗ ) → f (x∗ )   (4.4.16)

and

    ∇(x,ε) fσk (x(k),∗ , ε(k),∗ ) → (∇f (x∗ ), 0).   (4.4.17)

Proof. By the conditions of the theorem, it follows that, for ε(k),∗ > 0,

    lim fσk (x(k),∗ , ε(k),∗ )
      = lim { f (x(k),∗ ) + (ε(k),∗ )^(−α) Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x(k),∗ , t)
            − (ε(k),∗ )^γ Wj }]² dt + σk (ε(k),∗ )^β }
      = f (x∗ ) + lim (ε(k),∗ )^(−α) Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x(k),∗ , t)
            − (ε(k),∗ )^γ Wj }]² dt,   (4.4.18)

where here and below all limits are taken as ε(k),∗ → ε∗ = 0 and x(k),∗ →
x∗ ∈ S0 , and the term σk (ε(k),∗ )^β vanishes in the limit by virtue of (4.4.13).
For the second term on the right hand side of (4.4.18), it is clear from
Lemma 4.4.1 that

    lim (ε(k),∗ )^(−α) Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj }]² dt
      = lim Σ_{j∈J′} ∫₀ᵀ [ (ε(k),∗ )^(−α/2) hj (x(k),∗ , t) − (ε(k),∗ )^(γ−α/2) Wj ]² dt,
                                                   (4.4.19)

where

    J′ = { j ∈ {1, 2, . . . , m} : hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj ≥ 0 }.

Since γ > α, hj (x(k),∗ , t) = o((ε(k),∗ )^δ ) and δ > α, we have

    lim Σ_{j∈J′} ∫₀ᵀ [ (ε(k),∗ )^(−α/2) hj (x(k),∗ , t) − (ε(k),∗ )^(γ−α/2) Wj ]² dt = 0.
                                                   (4.4.20)

Combining (4.4.18)–(4.4.20) gives

    lim fσk (x(k),∗ , ε(k),∗ ) = f (x∗ ).   (4.4.21)

Similarly, we have

    lim ∇(x,ε) fσk (x(k),∗ , ε(k),∗ )
      = lim ( ∇x fσk (x(k),∗ , ε(k),∗ ), ∇ε fσk (x(k),∗ , ε(k),∗ ) ),   (4.4.22)

where

    lim ∇x fσk (x(k),∗ , ε(k),∗ )
      = lim { ∂f (x(k),∗ )/∂x + 2 (ε(k),∗ )^(−α) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t)
            − (ε(k),∗ )^γ Wj } ∂hj (x(k),∗ , t)/∂x dt }
      = ∇x f (x∗ ) + lim 2 Σ_{j∈J′} ∫₀ᵀ [ (ε(k),∗ )^(−α) hj (x(k),∗ , t)
            − (ε(k),∗ )^(γ−α) Wj ] ∂hj (x(k),∗ , t)/∂x dt
      = ∇x f (x∗ ),   (4.4.23)

since δ > α and γ > α, while

    lim ∇ε fσk (x(k),∗ , ε(k),∗ )
      = lim [ −α Σ_{j∈J′} ∫₀ᵀ [ (ε(k),∗ )^(−(α+1)/2) hj (x(k),∗ , t)
            − (ε(k),∗ )^(γ−(α+1)/2) Wj ]² dt
            + 2γ Σ_{j∈J′} ∫₀ᵀ [ hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj ] (−(ε(k),∗ )^γ Wj )
              (ε(k),∗ )^(−α−1) dt
            + σk β (ε(k),∗ )^(β−1) ]
      = 0,   (4.4.24)

where the first two groups of terms vanish because −α − 1 + 2δ > 0 and
2γ − α − 1 > 0, and σk β (ε(k),∗ )^(β−1) → 0 then follows from (4.4.13). Thus,
the proof is complete.



The exactness of the penalty function is shown in the following theorem.


Theorem 4.4.3 There exists a k0 > 0 such that, for any k ≥ k0 , every local
minimizer (x(k),∗ , ε(k),∗ ) of Problem (Pσk ) with finite fσk (x(k),∗ , ε(k),∗ ) has
the form (x∗ , 0), where x∗ is a local minimizer of Problem (P ).

Proof. Let us assume that the conclusion is false. Then, there exists a sub-
sequence of {(x(k),∗ , ε(k),∗ )}, which is denoted by the original sequence, such
that for any k0 > 0, there exists a k′ > k0 satisfying ε(k′),∗ ≠ 0. By Theo-
rem 4.4.1, we have

    ε(k),∗ → ε∗ = 0, x(k),∗ → x∗ ∈ S0 , as k → +∞.

Since ε(k),∗ ≠ 0 for all k, it follows from dividing (4.4.13) by (ε(k),∗ )^(β−1) that

    (ε(k),∗ )^(−α−β) { −α Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj }]² dt
      + 2γ Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } (−(ε(k),∗ )^γ Wj ) dt }
      + σk β = 0.   (4.4.25)

This is equivalent to

    (ε(k),∗ )^(−α−β) { −α Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj }]² dt
      + 2γ Σ_{j=1}^m ∫₀ᵀ [ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } (−(ε(k),∗ )^γ Wj )
      + max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } hj (x(k),∗ , t)
      − max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } hj (x(k),∗ , t) ] dt } + σk β = 0.
                                                   (4.4.26)

Note that

    max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } (−(ε(k),∗ )^γ Wj )
      + max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } hj (x(k),∗ , t)
    = max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } [ hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj ]
    = [max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj }]².   (4.4.27)

Combining (4.4.26) and (4.4.27) yields

    (ε(k),∗ )^(−α−β) { (2γ − α) Σ_{j=1}^m ∫₀ᵀ [max {0, hj (x(k),∗ , t)
        − (ε(k),∗ )^γ Wj }]² dt } + σk β
    = 2γ (ε(k),∗ )^(−α−β) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t)
        − (ε(k),∗ )^γ Wj } hj (x(k),∗ , t) dt.   (4.4.28)

Define

    y_k = (ε(k),∗ )^(−α−β) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } dt
                                                   (4.4.29)

and

    z_k = y_k / |y_k |.   (4.4.30)

Clearly,

    lim_{k→+∞} |z_k | = |z∗ | = 1,   (4.4.31)

where | · | denotes the modulus. The usual Euclidean norm is denoted as ∥ · ∥.
Dividing (4.4.14) by |y_k | and (ε(k),∗ )^α yields

    (1/|y_k |) ∂f (x(k),∗ )/∂x + (2 (ε(k),∗ )^(−α)/|y_k |) Σ_{j=1}^m ∫₀ᵀ
        max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } ∂hj (x(k),∗ , t)/∂x dt = 0.   (4.4.32)

Note that x(k),∗ → x∗ as k → +∞ and that ∂f (x)/∂x and, for each
j = 1, 2, . . . , m, hj and ∂hj (· , t)/∂x are continuous in Rn for each t ∈ [0, T ].
Clearly, [0, T ] is a compact set. Thus, it can be shown that there are constants
K̂ and K, independent of k, such that

    ∥∂f (x(k),∗ )/∂x∥ ≤ K̂   (4.4.33)

and

    ∥∂hj (x(k),∗ , t)/∂x∥ ≤ K, ∀ t ∈ [0, T ], j = 1, 2, . . . , m.   (4.4.34)

Note that

    1/( |y_k | (ε(k),∗ )^β )
    = 1/( (ε(k),∗ )^(−α−β) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t)
          − (ε(k),∗ )^γ Wj } dt (ε(k),∗ )^β )
    = 1/( Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } dt (ε(k),∗ )^(−α) ).
                                                   (4.4.35)

Recalling the assumption stated in Theorem 4.4.2, we have hj (x(k),∗ , t) =
o((ε(k),∗ )^δ ), γ > α and δ > α. Thus,

    lim_{k→+∞} | Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t) − (ε(k),∗ )^γ Wj } dt (ε(k),∗ )^(−α) |
    = lim_{k→+∞} | Σ_{j=1}^m ∫₀ᵀ max {0, [o((ε(k),∗ )^δ )/(ε(k),∗ )^δ ] (ε(k),∗ )^(δ−α)
        − (ε(k),∗ )^(γ−α) Wj } dt |
    = 0.   (4.4.36)

Therefore,

    1/( |y_k | (ε(k),∗ )^β ) → +∞, as k → +∞.   (4.4.37)

From (4.4.33) and (4.4.37), it is clear that

    ∥∂f (x(k),∗ )/∂x∥ / ( |y_k | (ε(k),∗ )^β ) → +∞, as k → +∞.   (4.4.38)

On the other hand, by (4.4.29) and (4.4.30), we have

    ∥ (2 (ε(k),∗ )^(−α−β)/|y_k |) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t)
        − (ε(k),∗ )^γ Wj } ∂hj (x(k),∗ , t)/∂x dt ∥
    ≤ (2 (ε(k),∗ )^(−α−β)/|y_k |) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t)
        − (ε(k),∗ )^γ Wj } ∥∂hj (x(k),∗ , t)/∂x∥ dt
    ≤ (2 (ε(k),∗ )^(−α−β)/|y_k |) Σ_{j=1}^m ∫₀ᵀ max {0, hj (x(k),∗ , t)
        − (ε(k),∗ )^γ Wj } K dt
    = 2Kz_k ,   (4.4.39)

where z_k is defined by (4.4.30). Clearly, |z_k | = 1. Thus, 2Kz_k is bounded
uniformly with respect to k. This together with (4.4.38) is a contradiction
to (4.4.32). Thus, the proof is complete.

We may now conclude that, under some mild assumptions and the con-
straint qualification condition, when the parameter σ is sufficiently large, a
local minimizer of Problem (Pσ ) is a local minimizer of Problem (P ).

4.4.2 Algorithm and Numerical Results

Unconstrained optimization techniques such as quasi-Newton methods or


conjugate gradient methods can be used to solve Problem (Pσ ). Here, the one
implemented is the optimization tool fmincon within the MATLAB environ-
ment. The integral appearing in fσ (x, ε) is approximated by using Simpson’s
Rule. The global error of Simpson’s Rule is of the order of h4 , where h is the
discretization step size. Thus, a reasonable accuracy can easily be achieved
for the integration if the discretization step size is sufficiently small. In the
following, we define various terms used in the algorithm.

σ  is the penalty parameter, which is to be increased in every iteration.
t̄  is the point at which max_{1≤j≤m} hj (x(k),∗ , t̄) = max_{1≤j≤m} max_{t∈[0,T ]} hj (x(k),∗ , t).
h̄  is the value of max_{1≤j≤m} max_{t∈[0,T ]} hj (x(k),∗ , t).
f  is the objective function value.
ε  is a new variable that is introduced in the construction of the exact
   penalty function.
ε̲  is a lower bound of ε(k),∗ , which is introduced to avoid ε(k),∗ → 0.
With the new exact penalty function, we can construct an efficient algorithm,
which is given below.
Algorithm 4.4.1
Step 1. Set the iteration index k = 0. Set σ(1) = 10, ε(1) = 0.1, ε̲ = 10⁻⁹
        and β > 2, and choose an initial point (x0 , ε0 ). The values of γ and
        α are chosen depending on the specific structure of Problem (P ).
Step 2. Solve Problem (Pσk ), and let (x(k),∗ , ε(k),∗ ) be the minimizer ob-
        tained.
Step 3. If ε(k),∗ > ε̲ and σ(k) < 10⁸, set σ(k+1) = 10 × σ(k) and k := k + 1.
        Go to Step 2 with (x(k),∗ , ε(k),∗ ) as the initial point in the next
        optimization process. Else set ε(k),∗ = ε̲, and then go to Step 4.
Step 4. Check the feasibility of x(k),∗ , i.e., whether or not

            max_{1≤j≤m} max_{t∈[0,T ]} hj (x(k),∗ , t) ≤ 0.

        If x(k),∗ is feasible, then it is a local minimizer of Problem (P ). Exit.
        Else go to Step 5.
Step 5. Adjust the parameters α, β and γ such that the conditions of Theo-
        rem 4.4.2 are satisfied. Set σ(k+1) = 10σ(k) , ε(k+1) = 0.1ε(k) and
        k := k + 1. Go to Step 2.

Remark 4.4.1 In Step 3, if ε(k),∗ > ε̲, we conclude from Lemma 4.4.1 that
x(k),∗ cannot be a feasible point, meaning that the penalty parameter σ may
not be large enough. Thus, we need to increase σ. If σ(k) > 10⁸ but ε(k),∗ > ε̲
still, then we should adjust the values of α, β and γ such that the conditions
assumed in Theorem 4.4.2 are satisfied, and go to Step 2.

Remark 4.4.2 Clearly, we cannot check the feasibility of hj (x, t) ≤ 0,


j = 1, 2, . . . , m, for every t ∈ [0, T ]. In practice, we choose a set, T-, which
contains a dense enough set of points in [0, T ]. We then check the feasibility
of hj (x, t) ≤ 0 over T- for each j = 1, 2, . . . , m.

Remark 4.4.3 Although we have proven that a local minimizer of the exact
penalty function optimization problem (Pσk ) will converge to a local minimizer
of the original problem (P ), in actual computation we need to set a lower
bound ε = 10−9 for ε(k),∗ so as to avoid division by zero.
4.4 Exact Penalty Function Method 115

Example 4.4.1 The following example is taken from [81], and it was also
used for testing the numerical algorithms in [259, 261, 296] and Section 4.3.
In this problem, the objective function

x2 (122 + 17x1 + 6x3 − 5x2 + x1 x3 ) + 180x3 − 36x1 + 1224


f (x) =
x2 (408 + 56x1 − 50x2 + 60x3 + 10x1 x3 − 2x21 )
(4.4.40)
is to be minimized subject to

h(x, t) ≤ 0 , ∀ t ∈ [T1 , T2 ], (4.4.41)

0 ≤ x1 , x3 ≤ 100, 0.1 ≤ x2 ≤ 100, (4.4.42)


where [T1 , T2 ] = [10−6 , 30],

h(x, t) = ϕ(x, t) − 3.33[ϕ(x, t)]2 + 1.0,



ϕ(x, t) = 1 + H(x, it)G(it), (i = −1)
H(x, s) = x1 + x2 /s + x3 s,
and
1
G(s) = .
(s + 3)(s2+ 2s + 2)
Here, ϕ(x, t) and ϕ(x, t) are, respectively, the imaginary and real parts
$ %
of ϕ(x, t). The initial point is x01 , x02 , x03 = [50, 50, 50] . Actually, we can
start from any point within the boundedness constraints (4.4.42).
For the continuous inequality constraint (4.4.41), the corresponding exact
penalty function fσ (x, ε) is defined by (4.4.5) with the constraint violation
Δ(x, ε) given by
 T2 $ " #%2
Δ(x, ε) = max 0, ϕ(x, t) − 3.33[ϕ(x, t)]2 + 1.0 − εγ Wj dt.
T1

Simpson’s Rule with [T1 , T2 ] = [10−6 , 30] being divided into 3000 equal subin-
tervals is used to evaluate the integral. The value obtained is very accurate.
Also, these discretized points define a dense subset T- of [T1 , T2 ]. We check the
feasibility of the continuous inequality constraint by evaluating the values of
the function h over T-. Results obtained are given in Table 4.4.1.
As we can see that, as the penalty parameter, σ, is increased, the min-
imizer approaches to the boundary of the feasible region. When σ is suffi-
ciently large, we obtain a feasible point. It has the same objective function
value as that obtained in Section 4.3. However, for the minimizer obtained
in Section 4.3, there are some minor violations of the continuous inequality
constraints (4.4.41).
116 4 Optimization Problems Subject to Continuous Inequality Constraints

Table 4.4.1: Results for Example 4.4.1

σ t̄ h̄ f∗ x∗
1
x∗
2
x∗
3
ε

−4
10 5.66 1.1012×10 0.17469205 16.961442 45.496567 34.677990 0.001976
−5
102 5.65 1.3205×10 0.17469506 16.959419 45.496640 34.674199 0.000261
−6
103 5.66 1.31695×10 0.17469547 16.967833 45.495363 34.668668 0.000054
−7
104 5.66 5.839365×10 0.17469569 16.987820 45.498793 34.657147 0.000019
−8
105 5.66 5.070583×10 0.17469635 16.981243 45.497607 34.660896 0.000011
−7
106 5.66 −1.82251×10 0.17469633 16.980628 45.497520 34.661238 0.000003

Example 4.4.2 Consider the problem:


2
min x21 + (x2 − 3)

t
subject to x2 − 2 + x1 sin ≤ 0, ∀t ∈ [0, π],
x2 − ω
− 1 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 2,

where ω is a parameter that controls the frequency of the constraint. As


in [261], ω is chosen as 2.032. In this case, the corresponding exact penalty
function fσ (x, ε) is defined by (4.4.5) with the constraint violation given by
 π  2
t
Δ(x, ε) = max 0, x2 − 2 + x1 sin − ε γ Wj dt.
0 x2 − ω

Simpson’s Rule with the interval [0, π] being divided into 1000 equal subin-
tervals is used to evaluate the integral. These discretized points also form a
dense subset T- of the interval [0, π]. The feasibility check is carried out over T-.
We use Algorithm 4.4.1 with the initial point taken as [x01 , x02 ] = [0.5, 0.5] .
The solution of this problem is [x∗1 , x∗2 ] = [0, 2] with the objective function
value f ∗ = 1. The results of the algorithm are presented in Table 4.4.2.

Table 4.4.2: Results for Example 4.4.2

σ t̄ h̄ f∗ x∗
1
x∗
2
ε

−7 −7
−3
10 1.41 3.9583799×10 1.000002326 4×10
1.9999992 1.62×10
−8 −7
−4
102 1.51 −2.309265×10 1.000000582 1.769×10
1.9999998 3.0310×10
−8 −7 −5
103 1.51 −1.629325×10 1.00000047 1.438×10 1.9999998 9.6×10

It is observed that for sufficiently large σ, the minimizer obtained is such


that the continuous inequality constraints are satisfied for all t ∈ [0, π].
4.5 Exercises 117

Example 4.4.3 Consider the problem:

min (x1 + x2 − 2)2 + (x1 − x2 )2 + 30[min{0, x1 − x2 }]2


subject to x1 cos t + x2 sin t − 1 ≤ 0, ∀ t ∈ [0, π].

Again, Simpson’s Rule with the interval [0, π] being partitioned into 1000
equal subintervals is used to evaluate the corresponding constraint violation
in the exact penalty function. These discretized points also define a dense
subset T- of the interval [0, π], which is to be used for checking the feasibility of
the continuous inequality constraint. Now, by using Algorithm 4.4.1 with the
initial point taken as [x01 , x02 ] = [0.5, 0.5], the results obtained are reported
in Table 4.4.3.

Table 4.4.3: Results for Example 4.4.3

σ t̄ h̄ f∗ x∗
1
x∗
2
ε

10 0.784 0.1312984779 0.2778915304 0.7999488587 0.7999487935 0.2512652637


102 0.786 0.0155757133 0.3352776626 0.7181206321 0.7181203152 0.0274357241
103 0.79 0.0016566706 0.3423203957 0.7082782234 0.7082782249 0.0002872258
104 0.79 −0.0000022403 0.343138432 0.7071227384 0.7072248258 0.0000389857

By comparing our results with those obtained in [81, 103, 259, 261], it
is observed that the objective values are almost the same. However, for our
minimizer, it is a feasible point, while those obtained in [81, 103, 259, 261]
are not.

4.5 Exercises

4.1. Consider an optimization problem for which the cost function


N
Fα (x) = fi,α (x) (4.5.1)
i=1

is to be minimized with respect to x ∈ Rn , where



⎨ (1 − α)|fi (x)| fi (x) ≤ 0,
fi,α (x) = (4.5.2)

αfi (x) fi (x) > 0.

fi , i = 1, . . . , N , are real-valued continuously differentiable functions in x ∈


Rn , and α ∈ (0, 1) is given. For each i = 1, . . . , N , fi,α is called a ‘lop-
sided’ function. Each of these functions is non-differentiable at those x where
118 4 Optimization Problems Subject to Continuous Inequality Constraints

fi (x) = 0. The following smoothing method was introduced in [105].




⎪ αfi (x) if αfi (x) > δ



⎪  


⎨ δ 2 + (αfi (x))2 /2δ if 0 ≤ αfi (x) ≤ δ
fi,α,δ (x) =   (4.5.3)



⎪ δ 2
+ ((1 − α)f i (x)) 2
/2δ if − δ ≤ (1 − α)f i (x) ≤ δ





(α − 1)fi (x) if (α − 1)fi (x) < −δ,

where δ > 0 is a smoothing parameter.


(i) Show that for each i = 1, . . . , N , fi,α,δ is continuously differentiable
in x ∈ Rn .
(ii) Show that for each i = 1, . . . , N , fi,α,δ and fi,α have their minima
at the same value of x.
(iii) Show that
0 ≤ fi,α,δ (x) − fi,α (x) ≤ δ/2. (4.5.4)
Replacing fi,α (x) with fi,α,δ (x), the cost function (4.5.1) is then
approximated by
N
Fα,δ (x) = fi,α,δ (x). (4.5.5)
i=1

Clearly,

Fα,δ (x) − Fα (x) ≤ . (4.5.6)
2
(iv) Let xδ,∗ and x∗ minimize (4.5.1) and (4.5.5), respectively. Show
that

0 ≤ Fα,δ (xδ ) − Fα (x∗ ) ≤ . (4.5.7)
2
(v) Let
x∗ = arg min Fα (x) (4.5.8)
and
xδ,∗ = arg min Fα,δ (x). (4.5.9)
If there exists a unique minimizer of Fα (x), show that

lim xδ,∗ = x∗ . (4.5.10)


δ→0

4.2. Consider the following optimization problem (see [242]). The cost func-
tion  b
J(x) = |F (x, t)| dt (4.5.11a)
a

is to be minimized with respect to x = [xi , . . . , xn ] ∈ Rn and subject to the


following constraints:
4.5 Exercises 119

hj (x) = 0, j = 1, . . . , Ne (4.5.11b)
gj (x) ≤ 0, j = Ne , . . . , N (4.5.11c)
and
αi ≤ xi ≤ βi , i = 1, . . . , n, (4.5.11d)

where a and b are real constants,

α = [α1 , . . . , αn ] , β = [β1 , . . . , βn ]

and αi , i = 1, . . . , n, and βi , i = 1, . . . , n, are real constants. Let this problem


be referred to as Problem (P ). Clearly, Problem (P ) is a nonlinearly con-
strained nonsmooth optimization problem, where the nonsmooth function
appears in the cost function (4.5.11a). Note that the nonsmoothness is due
to |F (x, t)|, which, for each t ∈ [a, b], is non-negative. Recall the smoothing
approximation introduced in [242] as follows:

⎨ |F (x, t)| ,
⎪ if |F (x, t)| ≥ 2ε
Fε (x, t) = & ' (4.5.12)

⎩ (F (x, t))2 + ε2 /ε, if |F (x, t)| < ε .
4 2

Show that
(i) For each t ∈ [a, b], Fε (x, t) is continuously differentiable with re-
spect to x.
(ii) Fε (x, t) ≥ |F (x, t)| for each (x, t) ∈ Rn × [a, b].
(iii) For each (x, t) ∈ Rn × [a, b], |Fε (x,t) − |F (x, t)|| ≤ 4ε .
(iv) For each t ∈ [a, b], x minimizes |F (x, t)| if and only if it mini-
mizes Fε (x, t). With |F (x, t)| approximated by Fε (x, t), the cost
function (4.5.11a) becomes
 b
Jε (x) = Fε (x, t)dt. (4.5.13)
a

This approximate problem, referred to as Problem (Pε ), is to mini-


mize the cost function (4.5.13) subject to the constraints (4.5.11b)–
(4.5.11d).
For each ε > 0, let x∗ε be an optimal solution to Problem (Pε ).
Furthermore, let x∗ be the optimal solution of Problem (P ). Show
that
ε(b − a)
0 ≤ Jε (x∗ε ) − J (x∗ ) ≤ .
4
(v) Assume that J(x) → ∞ as |x| → ∞, where |·| denotes the Eu-
clidean norm. Show that there exists an accumulation point of the
sequence {x∗ε } when ε → 0. Furthermore, show that any such ac-
cumulation point is an optimal solution of Problem (P ).
120 4 Optimization Problems Subject to Continuous Inequality Constraints

4.3. Consider the following optimization problem:

min f (x) (4.5.14a)

subject to

hi (x) = 0, i = 1, 2, . . . , m (4.5.14b)
hi (x) ≤ 0, i = m + 1, 2, . . . , N, (4.5.14c)

where f and hi , i = 1, 2, . . . , N,, are continuously differentiable functions.


(i) Define the constraint violation of the form of (4.4.6) and then the
exact penalty function of the form of (4.4.5).
(ii) Write down the surrogate optimization problem of the form
of (4.4.7).
(iii) Develop the exact penalty function algorithm as of Algorithm 4.4.1.
(iv) Show that a local minimum of the surrogate optimization problem
is a local minimum of the problem (4.5.12)
Chapter 5
Discrete Time Optimal Control Problems

5.1 Introduction

Discrete time optimal control problems arise naturally in many multi-stage


control and inventory problems where time enters discretely in a natural fash-
ion. In this chapter, we first use the dynamic programming approach to study
a class of discrete time optimal control problems. We then move on to consider
a general class of constrained discrete time optimal control problems in canon-
ical form. An efficient algorithm supported by a rigorous convergence analysis
will be developed for solving this constrained discrete time optimal control
problem. We also consider a class of discrete time optimal control problems
subject to terminal and all-time-step inequality constraints involving state
and control variables. It is then shown that this optimal control problem can
be solved as a constrained discrete time optimal control problem in canon-
ical form via the constraint transcription introduced in Chapter 4. Three
examples are computed using the algorithm developed in this chapter so as
to demonstrate the efficiency and effectiveness of the algorithm. The main
references of the chapter are [58], Chapter 11 of [253] and [258, 265, 266, 282].

5.2 Dynamic Programming Approach

Consider the system of difference equations

x(i + 1) = f (i, x(i), u(i)), i = 0, 1, . . . , N − 1, (5.2.1)


x(0) = x(0) , (5.2.2)

© The Author(s), under exclusive license to 121


Springer Nature Switzerland AG 2021
K. L. Teo et al., Applied and Computational Optimal Control, Springer
Optimization and Its Applications 171,
https://doi.org/10.1007/978-3-030-69913-0 5
122 5 Discrete Time Optimal Control Problems

where x ∈ Rn and u ∈ Rr are the state and control vectors, respectively, and
x(0) ∈ Rn is a given initial state. Furthermore, f (i, ·, ·) : Rn × Rr is a given
function. For each i = 0, 1, . . . , N − 1,

u(i) ∈ Ui , (5.2.3)

where Ui is a given compact and convex subset of Rr . For notational conve-


nience, the set of feasible control sequences starting at time k ∈ {0, 1, . . . , N −
1} is defined by

Uk = {{u(k), u(k + 1), . . . , u(N − 1)} : u(i) ∈ Ui , i = k, k + 1, . . . , N − 1} .


(5.2.4)
We then consider the following discrete time optimal control problem:
4 −1
5

N
min Φ0 (x(N )) + L0 (i, x(i), u(i)) (5.2.5)
{u(0),...,u(N −1)}∈U0
i=0

subject to (5.2.1)–(5.2.2), where L0 (i, ·, ·) : Rn × Rr → R and Φ0 : Rr → R


are given functions. Let this problem be referred to Problem (P ).
Dynamic programming involves a sequence of problems embedded in the
original Problem (P ). Each of these embedded problems shares the dynam-
ics, cost functional, and control constraints of the original but has a dif-
ferent initial state and a different starting time. More specifically, for each
ξ = [ξ1 , . . . , ξn ] ∈ Rn , and k ∈ {0, . . . , N − 1}, we consider the following
problem:
4 −1
5
N
min Φ0 (x(N )) + L0 (i, x(i), u(i)) (5.2.6)
{u(k),...,u(N −1)}∈Uk
i=k

subject to

x(i + 1) = f (i, x(i), u(i)), i = k, k + 1, . . . , N − 1, (5.2.7)


x(k) = ξ. (5.2.8)

Let problem (5.2.6)–(5.2.8) be referred to as Problem (Pk,ξ ). As k and the


initial state ξ can vary, we are dealing with a family of optimal control prob-
lems.
The result presented in the following theorem is known as the Principle of
Optimality:
Theorem 5.2.1 Suppose that {u∗ (k), . . . , u∗ (N − 1)} is an optimal control
for Problem (Pk,ξ ) and that {x∗ (k) = ξ, x∗ (k + 1), . . . , x∗ (N )} is the cor-
responding optimal trajectory. Then, for any j such that k ≤ j ≤ N − 1,
{u∗ (j), u∗ (j + 1) . . . , u∗ (N − 1)} is an optimal control for Problem (Pj,x∗ (j) ).

Proof. Suppose that the conclusion is false. Then, there exists a control
5.2 Dynamic Programming Approach 123

{- - (j + 1), . . . , u
u(j), u - (N − 1)}

with the corresponding trajectory

x(j) = x∗ (j), x
{- -(j + 1), . . . , x
-(N )}

such that

N −1 
N −1
Φ0 (-
x(N )) + L0 (i, x - (i)) < Φ0 (x∗ (N )) +
-(i), u L0 (i, x∗ (i), u∗ (i)).
i=j i=j
(5.2.9)
Now construct the control {6 6 (k + 1), . . . , u
u(k), u 6 (N − 1)} as follows:

u∗ (i), i = k, . . . , j − 1,
6 (i) =
u (5.2.10)
- (i), i = j, . . . , N.
u

The corresponding trajectory, which starts from state ξ at time k, is


{6 6(k + 1), . . . , x
x(k), x 6(N )}, where

x∗ (i), i = k, . . . , j,
6(i) =
x (5.2.11)
-(i), i = j + 1, . . . , N.
x

From (5.2.9)–(5.2.11), it follows that


N −1
Φ0 (6
x(N )) + 6(i), u
L0 (i, x 6 (i))
i=k


j−1 
N −1
=Φ0 (-
x(N )) + L0 (i, x∗ (i), u∗ (i)) + -(i), u
L0 (i, x - (i))
i=k i=j


N −1
<Φ0 (x∗ (N )) + L0 (i, x∗ (i), u∗ (i)) (5.2.12)
i=k

Equation (5.2.12) shows that {u∗ (k), . . . , u∗ (N − 1)} is not an optimal con-
trol for Problem (Pk,ξ ) which is a contradiction to the hypothesis of the
theorem. Hence the proof is complete.
To continue, we assume that an optimal solution to (Pk,ξ ) exists for each
k, 0 ≤ k ≤ N − 1, and for each ξ ∈ Rn . Let V (k, ξ) be the minimum value of
the cost functional (5.2.6) for Problem (Pk,ξ ). It is called the value function.
Theorem 5.2.2 The value function V satisfies the following backward re-
cursive relation. For any k, 0 ≤ k ≤ N − 1,

V (k, ξ) = min {L0 (k, ξ, u) + V (k + 1, f (k, ξ, u))} , (5.2.13)


u∈Uk

with the final condition


124 5 Discrete Time Optimal Control Problems

V (N, ξ) = Φ0 (ξ). (5.2.14)

Proof. Equation (5.2.14) is obvious. Let ξ ∈ Rn . Then, by Theorem 5.2.1,


4 −1
5

N
V (k, ξ) = min Φ0 (x(N )) + L0 (i, x(i), u(i))
{u(k),...,u(N −1)}∈Uk
i=k
4
= min L0 (k, ξ, u(k))
u(k)∈Uk
. −1
/5

N
+ min Φ0 (x(N )) + L0 (i, x(i), u(i))
{u(k+1),...,u(N −1)}∈Uk+1
i=k+1

= min {L0 (k, ξ, u(k)) + V (k + 1, x(k + 1))}


u(k)∈Uk

= min {L0 (k, ξ, u(k)) + V (k + 1, f (k, ξ, u(k)))} ,


u(k)∈Uk

which proves (5.2.13).

Remark 5.2.1 During the process of finding the minimum in the recursive
equation (5.2.13)–(5.2.14), we obtain the optimal control v in feedback form.
This optimal feedback control can be used for all initial conditions. However,
Dynamic Programming requires us to compute and store the values of V and v
for all k and x. Therefore, unless we can find a closed-form analytic solution
to (5.2.13)–(5.2.14), the Dynamic Programming formulation will inevitably
lead to an enormous amount of storage and computation. This phenomena
is known as the curse of dimensionality. It seriously hinders the applicability
of Dynamic Programming to problems where we cannot obtain a closed-form
analytic solution to (5.2.13)-(5.2.14).

For illustration, let us look at a simple example, where all the functions
involved are scalar.
Example 5.2.1 Minimize


N −1 * +
2 2
(x(N ))2 + (x(i)) + (u(i)) (5.2.15)
i=1

subject to

x(i + 1) = x(i) + u(i), i = 0, 1, . . . , N − 1, (5.2.16)


x(0) = x(0) . (5.2.17)

This problem is clearly in the form of Problem (P ) with Φ0 (x) = x2 ,


L0 (x, u) = x2 + u2 , f (x, u) = x + u, and u(i) ∈ R, i = 0, 1, . . . , N − 1.
5.2 Dynamic Programming Approach 125

Solution Let V (k, ξ) be the value function for the corresponding Problem
(Pk,ξ ) starting from the arbitrary state ξ at step k. By Theorem 5.2.2, we
have

V (k, ξ) = min {L0 (ξ, u) + V (k + 1, f (ξ, u))}


u∈R
" #
= min ξ 2 + u2 + V (k + 1, ξ + u) (5.2.18)
u∈R

and
V (N, ξ) = ξ 2 , (5.2.19)
where V (N, ξ) denotes the value function for the minimization problem start-
ing from the arbitrary state ξ at step k = N .
For k = N − 1, we start from the arbitrary state ξ at step N − 1. If we
denote u(N − 1) = u,
" #
V (N − 1, ξ) = min ξ 2 + u2 + V (N, ξ + u)
u∈R
" #
= min ξ 2 + u2 + (ξ + u)2 . (5.2.20)
u∈R

Minimizing the expression in the brackets of (5.2.20) with respect to u gives


" #
∂ ξ 2 + u2 + (ξ + u)2 1
= 0 ⇒ 2u + 2(ξ + u) = 0 ⇒ u = − ξ.
∂u 2
This implies that the optimal control for step N − 1 starting at state ξ is
1
u∗ (N − 1) = − ξ. (5.2.21)
2
The corresponding value function is
2 2
1 1 5 2 1 2 3
V (N − 1, ξ) = ξ 2 + − ξ + ξ− ξ = ξ + ξ = ξ2. (5.2.22)
2 2 4 4 2

Continuing this process, at the next step, we have


" #
V (N − 2, ξ) = min ξ 2 + u2 + V (N − 1, ξ + u)
u∈R
3
= min ξ 2 + u2 + (ξ + u)2 .
u∈R 2

Minimizing the right hand side with respect to u gives


" #
∂ ξ 2 + u2 + 32 (ξ + u)2 3
= 0 ⇒ 2u + 3(ξ + u) = 0 ⇒ u = − ξ.
∂u 5
This implies that the optimal control for step N − 2 starting at state ξ is
126 5 Discrete Time Optimal Control Problems

3
u∗ (N − 2) = − ξ. (5.2.23)
5
The corresponding value function is
2 2
3 3 3 34 2 6 2 8
V (N − 2, ξ) = ξ 2 + − ξ + ξ− ξ = ξ + ξ = ξ 2 . (5.2.24)
5 2 5 25 25 5

Based on the above results, it is reasonable to guess that

V (k, ξ) = ck ξ 2 , (5.2.25)

for some ck > 0. Let us show that this guess is correct. We begin by assuming
the form of (5.2.25). Then, by (5.2.18), we have

V (k, ξ) = min{ξ 2 + u2 + V (k + 1, ξ + u)}


u∈R

= min{ξ 2 + u2 + ck+1 (ξ + u)2 }. (5.2.26)


u∈R

Minimizing with respect to u gives


∂ 2 ck+1
{ξ + u2 + ck+1 (ξ + u)2 } = 0 ⇒ 2u + 2ck+1 (ξ + u) = 0 ⇒ u = − ξ.
∂u 1 + ck+1
(5.2.27)

This implies that the optimal control for step k starting at state ξ is
ck+1
u∗ (k) = − ξ. (5.2.28)
1 + ck+1

Substituting (5.2.28) into (5.2.26) gives


2 2
ck+1 ck+1
2
V (k, ξ) =ξ + ξ + ck+1 x −
2
ξ
1 + ck+1 1 + ck+1
(ck+1 )2 ck + 1
= 1+ + ξ2
(1 + ck+1 )2 (1 + ck+1 )2

ck+1 (ck+1 + 1) 2 ck+1
= 1+ ξ = 1+ ξ2
(1 + ck+1 )2 1 + ck+1
1 + 2ck+1 2
= ξ ,
1 + ck+1

which is of the assumed form of (5.2.25). Thus, we conclude that

V (k, ξ) = ck ξ 2 , (5.2.29)

with
1 + 2ck+1
ck = , and cN = 1. (5.2.30)
1 + ck+1
5.2 Dynamic Programming Approach 127

(5.2.30) allows us to determine the coefficients of the value function recur-


sively, i.e.,
3 8
cN −1 = , cN −2 = , . . . .
2 5
Therefore, we can easily find the solution for the problem for any number of
steps. The optimal control law resulting from the above dynamic program-
ming solution is
ck+1
u∗ (k) = − x∗ (k). (5.2.31)
1 + ck+1
Note that the control law is expressed in terms of the current state of the
system, i.e., the control is expressed in closed-loop or feedback form. This
is a great advantage of dynamic programming over Pontryagin type theory:
it makes the dynamic programming solution more robust with respect to
modeling errors and noise.
Notice how we use the dynamic programming solution in practice. We
compute ck , k = 0, 1, . . . , N , from (5.2.30), starting with cN = 1. We then
run the system using the corresponding controls in the reverse order.
Consider the case N = 3. From (5.2.30),

c3 = 1,
1 + 2c3 1 + 2(1) 3
c2 = = = ,
1 + c3 1+1 2
1 + 2c2 1 + 2( 32 ) 8
c1 = = 3 = ,
1 + c2 1+ 2 5
1 + 2c1 1 + 2( 85 ) 21
c0 = = = .
1 + c1 1 + 85 13

Now, starting with the given initial condition x(0) = x(0) , we obtain
8
c1 8
u∗ (0) = − x∗ (0) = − 5 8 x(0) = − x(0) .
1 + c1 1+ 5 13

Then
8 (0) 5 (0)
x∗ (1) = x∗ (0) + u∗ (0) = x(0) − x = x ,
13 13
and
3
c2
u∗ (1) = − x∗ (1) = − 2 3 x∗ (1)
1 + c2 1+ 3
3 ∗ 3 5 3
= − x (1) = − × x(0) = − x(0) .
5 5 13 13
Next,
5 (0) 3 2 (0)
x∗ (2) = x∗ (1) + u∗ (1) = x − x(0) = x ,
13 13 13
128 5 Discrete Time Optimal Control Problems

and
c3 1 1 1 2 1
u∗ (2) = − x∗ (2) = − x∗ (2) = − x∗ (2) = − × x(0) = − x(0) .
1 + c3 1+1 2 2 13 13
Finally,
2 (0) 1 1 (0)
x∗ (3) = x∗ (2) + u∗ (2) = x − x(0) = x .
13 13 13
Using (5.2.29), the optimal cost can be determined in terms of the initial
condition, i.e.,
!2 21 (0) !2
V (0, x(0) ) = c0 x(0) = x .
13
Remark 5.2.2 The following general discrete time linear quadratic regulator
(LQR) problem can be solved in a similar way.

 −1
N
" #
minimize (x(k)) Qx(k) + (u(k)) Ru(k)
i=1
subject to x(k + 1) = Ax(k) + Bu(k), k = 0, 1, . . . , M
x(0) = x(0) ,

where x(·) ∈ Rn , u(·) ∈ Rr and A, B, Q, R are constant matrices with


appropriate dimensions. Furthermore, we need to assume that Q ≥ 0 (i.e., Q
is positive semi-definite) and R > 0 (i.e., R is positive definite).

We now consider a modified version of Problem (P ) which includes a


terminal state constraint. The dynamics are given by

x(i + 1) = f (x(i), u(i)), i = 0, 1, . . . , N − 1, (5.2.32)

with the initial condition


x(0) = x(0) (5.2.33)
and the terminal state constraint

x(N ) = x̂. (5.2.34)

Here, x = [x1 , . . . , xn ] ∈ Rn and u = [u1 , . . . , ur ] ∈ Rr are, respectively,


the state and control vectors, f : Rn × Rn → Rn is a given function, and
x(0) = [x1 , . . . , xn ] ∈ Rn and x̂ = [x̂1 , . . . , x̂n ] ∈ Rn are given vectors.
(0) (0)

Once again, we assume that for each i = 0, 1, . . . , N − 1,

u(i) ∈ Ui , (5.2.35)

where Ui is a given compact and convex subset of Rr , and Uk , which is


defined by 5.2.4, is the set of feasible control sequences starting at time k ∈
0, 1, ..., N − 1. The problem is then described by
5.2 Dynamic Programming Approach 129


N −1
min L(x(i), u(i)) (5.2.36)
{u(0),...,u(N −1)}∈U0
i=0

subject to (5.2.32)–(5.2.34), where L0 : Rn × Rr → R is a given function. We


denote this problem as Problem (P-). The main difference to Problem (P ) is
that the system must finish at x̂ when i = N .
As we did for Problem (P ), we consider a family of problems which are
embedded in Problem (P- ). These are identical to Problem (P-) except for the
initial time and state. For each ξ = [ξ, . . . , ξ] ∈ Rn , and k ∈ {0, . . . , N − 1},
we consider the following problem:


N −1
min L(x(i), u(i)) (5.2.37)
{u(k),...,u(N −1)}∈Uk
i=k

subject to

x(i + 1) = f (x(i), u(i)), i = 0, 1, . . . , N − 1, (5.2.38)

with
x(k) = ξ (5.2.39)
and
x(N ) = x̂. (5.2.40)
We denote this as Problem (P-k,ξ ). The value function for Problem (P-k,ξ ) is the
minimum value of the cost functional (5.2.37) and denoted again by V (k, ξ).
The following two results are the equivalent of Theorems 5.2.1 and 5.2.2. The
proofs are almost identical and hence omitted.
Theorem 5.2.3 Suppose that {u∗ (k), . . . , u∗ (N − 1)} is an optimal control
for Problem (P-k,ξ ), and that {x∗ (k) = ξ, x∗ (k + 1), . . . , x∗ (N ) = x -} is the
corresponding optimal trajectory. Then, for any j, k ≤ j ≤ N − 1,
{u∗ (j), . . . , u∗ (N − 1)} is an optimal control for Problem (P-j,x∗ (j) ).
Theorem 5.2.4 The value function V satisfies the following backward re-
cursive relation. For k, 0 ≤ k ≤ N − 2,

V (k, ξ) = min {L0 (k, ξ, u) + V (k + 1, f (k, ξ, u))} , (5.2.41)


u∈Uk

with the terminal condition

V (N − 1, ξ) = min {L0 (ξ, u)} (5.2.42)


u∈UN −1

subject to
-.
x(N ) = f (ξ, u) = x (5.2.43)
Let us consider a simple example.
130 5 Discrete Time Optimal Control Problems

Example 5.2.2 Minimize

 −1
N
" #
(x(i))2 + (u(i))2
i=0

subject to

x(i + 1) = x(i) + u(i), i = 0, 1, . . . , N − 1,


x(0) = x(0) ,
x(N ) = 0.
!
Solution In the notation of Problem P- , L0 (x, u) = x2 + u2 , f (x, u) =
x + u, x̂ = 0, and u(i) ∈ R, i = 0, 1, . . . , N − 1. For k = 0, 1, . . . , N , (5.2.41)–
(5.2.43) become
" #
V (k, ξ) = min ξ 2 + u2 + V (k + 1, ξ + u) , k = 0, 1, . . . , N − 2. (5.2.44)
u∈R
" #
V (N − 1, ξ) = min ξ 2 + u2 (5.2.45)
u∈R

subject to
x(N ) = f (ξ, u(N − 1)) = ξ + u(N − 1) = 0. (5.2.46)
This is the basic dynamic programming backward recursive relation. For il-
lustration, let us first consider the case N = 3. For i = N − 1 = 2, we
have " #
V (2, ξ) = min ξ 2 + u2
u∈R

subject to
x(3) = ξ + u = 0.
This can only be satisfied with u∗ (2) = −ξ which, in turn, leads to

V (2, ξ) = 2ξ 2 .

For i = 1, (5.2.44) gives


" # " #
V (1, ξ) = min ξ 2 + u2 + V (2, ξ + u) = min ξ 2 + u2 + 2(ξ + u)2 .
u∈R u∈R

Aiming to minimize the right hand side, we have


" #
∂ ξ 2 + u2 + 2(ξ + u)2 2
= 0 ⇒ 2u + 4(ξ + u) = 0 ⇒ u∗ (1) = − ξ
∂u 3
which then leads to
5 2
V (1, ξ) = ξ .
3
5.2 Dynamic Programming Approach 131

For i = 0, (5.2.44) gives


" # 5
V (0, ξ) = min ξ 2 + u2 + V (1, ξ + u) = min ξ 2 + u2 + (ξ + u)2 .
u∈R u∈R 3

Again minimizing the right hand side, we have


" #
∂ ξ 2 + u2 + 2(ξ + u)2 10 5
= 0 ⇒ 2u + (ξ + u) = 0 ⇒ u∗ (0) = − ξ.
∂u 3 8
with
5 2
V (0, ξ) = ξ .
3
To obtain the solution for the case with arbitrary N , we conjecture that

V (k, ξ) = ck ξ 2 ,

for some ck > 0. To prove this conjecture, we substitute V (k + 1, ξ) = ck+1 ξ 2


into the right hand side of (5.2.44). Thus, for k = 0, 1, . . . , N − 2, we have
" #
V (k, ξ) = min ξ 2 + u2 + ck+1 (ξ + u)2 .
u∈R

Minimizing the contents of the brackets, we have


" #
∂ ξ 2 +u2 + ck+1 (ξ +u)2 ck+1
= 0 ⇒ 2u+2ck+1 (ξ+u) = 0 ⇒ u∗ (k) = − ξ,
∂u 1+ck+1

which, in turn, yields


1 + 2ck+1 2
V (k, ξ) = ξ .
1 + ck+1
Thus,
1 + 2ck+1
ck = . (5.2.47)
1 + ck+1
For k = N − 1, it follows from (5.2.45) and (5.2.46) that
" #
cN −1 ξ 2 = min ξ 2 + u2
u∈R

subject to x(N ) = ξ + u = 0, i.e., u = −ξ. Substituting into the above


equation yields cN −1 ξ 2 = 2ξ 2 which gives cN −1 = 2. By (5.2.47), we have
cN −2 = 53 and cN −3 = 138 .
Therefore, for N = 3, c0 = 13 5
8 , c1 = 3 , c2 = 2, and the final answer is

13 2
V (0, ξ) = c0 ξ 2 = ξ ,
8
132 5 Discrete Time Optimal Control Problems
   (0) 2
or, remembering that x(0) = x(0) , V 0, x(0) = 13
8 x . The corresponding
optimal controls and states can then be calculated as follows:
c1 5 (0) ! 3
u∗ (0) = − x(0) = − x ⇒ x∗ (1) = x(0) + u∗ (0) = x(0) ,
1 + c1 8 8
c 2 1 1
u∗ (1) = − x∗ (1) = − x∗ (1) ⇒ x∗ (2) = x∗ (1) + u∗ (1) = x∗ (1) = x(0) ,
2
1 + c2 3 3 8
u∗ (2) = −x∗ (2) ⇒ x∗ (3) = x∗ (2) + u∗ (2) = 0,

where the latter equations result from the first step of our analysis. Note
that, once again, it is possible to express the optimal control in a feedback
form.

5.2.1 Application to Portfolio Optimization

In this section, we present an application of the dynamic programming


method to a multi-period portfolio optimization problem subject to prob-
abilistic risks. The main reference for this section is [235].

5.2.1.1 Problem Formulation

We consider a multi-period portfolio optimization problem, where an investor


is going to invest in N possible risky assets Sj , j = 1, . . . , N, with a positive
initial wealth of M0 . The investment will be made at the beginning of the
first period of a T -period portfolio planning horizon. Then, the wealth will
be reallocated to these N risky assets at the beginning of the following T − 1
consecutive time periods. The investor will claim the final wealth at the end
of the T th period.
Let xtj be the fraction of wealth at the end of period t − 1 invested in
asset Sj at the beginning of period t. Denote xt = [xt1 , . . . , xtN ] . Here we
assume that the whole investment process is a self-financing process. Thus,
the investor will not increase the investment nor put aside funds in any pe-
riod in the portfolio planning horizon. In other words, the total funds in the
portfolio at the end of period t − 1 will be allocated to those risky assets at
the beginning of period t. Thus,


N
xtj = 1, t = 1, . . . , T. (5.2.48)
j=1

Moreover, it is assumed that short selling of the risky assets is not allowed
at any time. Hence, we have
5.2 Dynamic Programming Approach 133

xtj ≥ 0, t = 1, . . . , T, j = 1, . . . , N. (5.2.49)

Let Rtj denote the rate of return of asset Sj for period t. Define Rt =
[Rt1 , . . . , RtN ] . Here, Rtj is assumed to follow a normal distribution with
mean rtj and standard deviation σtj . We further assume that vectors Rt , t =
1, . . . , T , are statistically independent, and the mean E(Rt ) = rt = [rt1 , . . . ,
rtN ] is calculated by averaging the returns over a fixed window of time τ .
Let
1 
t−1
rtj = Rji , t = 1, . . . , T, j = 1, . . . , N. (5.2.50)
τ i=t−τ

We assume that in any time period, there are no two distinct assets in the
portfolio that have the same level of expected return as well as standard
deviation, i.e., for any 1 ≤ t ≤ T , there exist no i and j such that i = j,
but rti = rtj , and σti = σtj .
Let Vt denote the total wealth of the investor at the end of period t. Clearly,
we have  
Vt = Vt−1 1 + Rt xt , t = 1, . . . , T, (5.2.51)
with V0 = M0 .
First recall the definition of probabilistic risk measure, which was intro-
duced in [233] for the single-period probabilistic risk measure.

wp (x) = min Pr{ |Rj xj − rj xj | ≤ θε }, (5.2.52)


1≤j≤N

where θ is a constant to adjust the risk level, and ε denotes the average risk
of the entire portfolio, which is calibrated by the function below.

1 
N
ε= σj . (5.2.53)
N j=1

The whole idea of this risk measure in (5.2.52) is to locate the single asset
with greatest deviation in the portfolio. With this ‘biggest risk’ mitigated,
the risk of the whole portfolio can be substantially reduced as well. For multi-
period portfolio optimization, the single ‘biggest risk’ should be selected over
all risky assets and over the entire planning horizon. Thus, we define

wp (x) = min min Pr{ |Rtj xtj − rtj xtj | ≤ θε }, (5.2.54)


1≤t≤T 1≤j≤N

where x = [x  
1 , . . . , xT ] , θ is the same as defined in (5.2.52) to be a constant
adjusting the risk level, and ε denotes the average risk of the entire portfolio
over T periods, which is calibrated by

1 
T N
ε= σtj . (5.2.55)
T × N t=1 j=1
134 5 Discrete Time Optimal Control Problems

Assume that the investor is rational and risk-averse, who wants to maximize
the terminal wealth as well as minimize the risk in the investment. Thus, the
portfolio selection problem can be formulated as a bi-criteria optimization
problem stated as follows:
!
max min min f (xtj ), E(VT ) , (5.2.56a)
1≤t≤T 1≤j≤N
 
s.t. Vt = Vt−1 1 + Rt xt , t = 1, . . . , T, (5.2.56b)

N
xtj = 1, t = 1, . . . , T, (5.2.56c)
j=1

xtj ≥ 0, t = 1, . . . , T, j = 1, . . . , N, (5.2.56d)

where f (xtj ) = Pr{ |Rtj xtj − rtj xtj | ≤ θε }.


To reach the optimality for the above bi-criteria optimization problem, we
recall the following definition which is based on Theorem 3.1 in [268].
Definition 5.2.1 A solution x∗ = {x∗1 , . . . , x∗T } satisfying (5.2.56c) and
(5.2.56d) is said to be a Pareto-minimal solution for the bi-criteria optimiza-
tion problem (5.2.56) if there does not exist a solution x̃ = {x̃1 , . . . , x̃N }
satisfying (5.2.56c) and (5.2.56d) such that

min min f (x∗tj ) ≤ min min f (x̃tj ), and E(VT∗ |x∗ ) ≤ E(VT∗ |x̃),
1≤t≤T 1≤j≤N 1≤t≤T 1≤j≤N

for which at least one of the inequalities holds strictly.


We can transform problem (5.2.56) into an equivalent bi-criteria optimization
problem by adding another decision variable y, and T × N constraints.
!
max y, E(VT ) , (5.2.57a)
 
s.t. Vt = Vt−1 1 + Rt xt , t = 1, . . . , T, (5.2.57b)
y ≤ f (xtj ), t = 1, . . . , T, j = 1, . . . , N, (5.2.57c)

N
xtj = 1, t = 1, . . . , T, (5.2.57d)
j=1

xtj ≥ 0, t = 1, . . . , T, j = 1, . . . , N, (5.2.57e)

where y ≤ f (xtj ) is the (N × (t − 1) + j)th probabilistic constraint. The


optimization process trying to maximize y will eventually push the value of y
to be equal to min min f (xtj ). Thus, the optimization problem (5.2.57)
1≤t≤T 1≤j≤N
is equivalent to the optimization problem (5.2.56).
5.2 Dynamic Programming Approach 135

5.2.1.2 Analytical Solution

Recall that in Section 5.2.1.1, the return, Rtj , of asset Sj in period t, follows
the normal distribution with mean rtj and standard deviation σtj . It is clear
that Rtj − rtj is a linear mixture of normal distributions with mean 0 and
standard deviation σtj .
Let qtj = Rtj − rtj . Then it follows that
* θε +
f (xtj ) = Pr{ |Rtj xtj − rtj xtj | ≤ θε } = Pr |Rtj − rtj | ≤
xtj
 xθε 4 5
2
tj 1 qtj
=2 √ exp − 2 dqtj . (5.2.58)
0 2πσtj 2σtj

By the property of cumulative distribution function (5.2.58), f (xtj ) is clearly


a monotonically strictly decreasing function with respect to xtj .
As mentioned in the optimization problem (5.2.57), in the process of
maximizing the objective function, the value of y is to be pushed to reach
min min f (xtj ). As a result, if we choose y to be an arbitrary but fixed
1≤t≤T 1≤j≤N
real number between 0 and 1 (because y is a value of probability), using the
monotonic property of f (xtj ), we can find an upper bound for xtj . Let this
upper bound be denoted as Utj , which is given by

Utj = min{ 1, x̂tj }, (5.2.59)

where x̂tj = f −1 (y).


Consequently, for a fixed value of y, the optimization problem (5.2.57) is
equivalent to the following discrete time optimal control problem:

max E(VT ), (5.2.60a)


 
s.t. Vt = Vt−1 1 + Rt xt , t = 1, . . . , T, (5.2.60b)

N
xtj = 1, t = 1, . . . , T, (5.2.60c)
j=1

0 ≤ xtj ≤ Utj , t = 1, . . . , T, j = 1, . . . , N. (5.2.60d)

We use dynamic programming to solve problem (5.2.60).


For time periods k = 1, . . . , T , we define a series of optimal control prob-
lems with the same dynamics, cost functional and constraints but different
initial states and initial times. These problems are stated as follows:

max E(VT ), (5.2.61a)


 
s.t. Vt = Vt−1 1 + Rt xt , t = k, . . . , T, (5.2.61b)
Vk−1 = ξ, (5.2.61c)
136 5 Discrete Time Optimal Control Problems


N
xtj = 1, t = k, . . . , T, (5.2.61d)
j=1

0 ≤ xtj ≤ Utj , t = k, . . . , T, j = 1, . . . , N, (5.2.61e)

where ξ is a variable. At the time k − 1, the number of steps to go is T − k + 1.


This is a family of optimal control problems determined by the initial
time k − 1 and initial state ξ. We use Problem (Pk−1,ξ ) to denote each dif-
ferent problem. Suppose that {x∗k , . . . , x∗T } is an optimal control for Prob-

lem (Pk−1,ξ ), and that {Vk−1 = ξ, Vk∗ , . . . , VT∗ } is the corresponding optimal

trajectory. Then it follows from Theorem 5.2.4 that, for any k ≤ k ≤ T ,
{x∗k , . . . , x∗T } is an optimal control for Problem (Pk −1,V ∗ ). To continue,
k −1
we utilize an auxiliary value function denoted by F (k − 1, ξ) to be the max-
imum value of the objective function for Problem (Pk−1,ξ ). The following
value function F (k − 1, ξ) is defined as
"    #
F (k−1, ξ) = max F k, ξ 1 + Rk xk : xk ∈ Xk , 1 ≤ k ≤ T, (5.2.62)

with terminal condition


F (T, ξ) = E(VT ), (5.2.63)
⎧ ⎫
⎨ 
N ⎬
where Xk = xk : xkj = 1 and 0 ≤ xkj ≤ Ukj , j = 1, . . . , N .
⎩ ⎭
j=1
Now, we shall solve the discrete time optimal control problem (5.2.60)
backwards using dynamic programming method.
For k = T , we start from the state VT −1 = ξ at time T − 1. Then,
from (5.2.62) and (5.2.63), the value function becomes
"    #
F (T − 1, ξ) = max F T, ξ 1 + RT xT : xT ∈ X T
"    #
= max E ξ 1 + RT xT : xT ∈ X T
"   #
= max ξ 1 + rT xT : xT ∈ XT . (5.2.64)

Consequently, Problem (PT −1,ξ ) becomes a linear programming problem


stated as below.
 
max ξ 1 + rT xT , (5.2.65a)

N
s.t. xT j = 1, (5.2.65b)
j=1

0 ≤ xT j ≤ UT j , j = 1, . . . , N. (5.2.65c)

The optimal solution to this linear programming problem has been obtained
in [233]. The result is quoted in the following as a lemma:
5.2 Dynamic Programming Approach 137

Lemma 5.2.1 Let the assets be sort in such an order that rT 1 ≥ rT 2 ≥ · · · ≥


rT N . Then, there exists an integer n ≤ N such that


n−1 
n
UT j < 1, and UT j ≥ 1,
j=1 j=1

and


⎪ UT j ,n−1 j=1,. . . ,n-1,

⎨ 
x∗T j = 1− UT j , j=n, (5.2.66)



⎩ j=1
0, j >n,

is an optimal solution to problem (5.2.65).


Thus, it follows from (5.2.64) and (5.2.66) that the corresponding value func-
tion is given by  
F (T − 1, ξ) = ξ 1 + rT x∗T .
Note that x∗T in (5.2.66) is independent from ξ as the solution is a function
of UT j , which, from (5.2.58) and (5.2.59), depends only on the mean and
standard deviation of RT j .
Continuing with this process, we obtain the value function at k = T − 1.
"    #
F (T − 2, ξ) = max F T − 1, ξ 1 + RT−1 xT −1 : xT −1 ∈ XT −1
"     #
= max E ξ 1 + RT−1 xT −1 1 + rT x∗T : xT −1 ∈ XT −1
"    #
= max ξ 1 + rT−1 xT −1 1 + rT x∗T : xT −1 ∈ XT −1 .
(5.2.67)

Again, it is easy to show that this maximizing problem is equivalent to the


following linear programming problem:
  
max ξ 1 + rT−1 xT −1 1 + rT x∗T , (5.2.68a)

N
s.t. x(T −1)j = 1, (5.2.68b)
j=1

0 ≤ x(T −1)j ≤ U(T −1)j , j = 1, . . . , N. (5.2.68c)

Since x∗T is solved in (5.2.65) and known, the above problem is in the
same form as (5.2.65). Thus, it can be solved in a similar manner as that
for (5.2.66). Details are given below as a lemma.
Lemma 5.2.2 Let the assets be sort in such an order that r(T −1)1 ≥
r(T −1)2 ≥ · · · ≥ r(T −1)N . Then, there exists an integer n ≤ N such that
138 5 Discrete Time Optimal Control Problems


n−1 
n
U(T −1)j < 1, and U(T −1)j ≥ 1,
j=1 j=1

and


⎪ U(T −1)j , j=1,. . . ,n-1,

⎨ 
n−1
x∗(T −1)j = 1− U(T −1)j , j=n, (5.2.69)



⎩ j=1
0, j >n,

is an optimal solution to problem (5.2.68).


From Lemma 5.2.2, the corresponding value function is
  
F (T − 2, ξ) = ξ 1 + rT−1 x∗T −1 1 + rT x∗T .

Similarly, the optimal solution x∗T −1 has no dependency on the state ξ. It


depends only on fixed values of U(T −1)j , j = 1, . . . , N .
From Lemmas 5.2.1 and 5.2.2, it is reasonable to postulate that
 
   
F (k − 1, ξ) = ξ 1 + rk−1 x∗k−1 1 + rk x∗k · · · 7 1 + rT x∗T ,
k = T, . . . , 1. (5.2.70)

Moreover, the optimal solution for every period k, k = 1, . . . , T , can be


written in a unified form as given below.
Theorem 5.2.5 For any k = 1, . . . , T , let the assets be sort in such an order
that rk1 ≥ rk2 ≥ · · · ≥ rkN . Then, there exists an integer n ≤ N such that


n−1 
n
Ukj < 1, and Ukj ≥ 1,
j=1 j=1

and


⎪ Ukj ,n−1 j=1,. . . ,n-1,

⎨ 
x∗kj = 1− Ukj , j=n, (5.2.71)



⎩ j=1
0, j >n,

where Ukj is defined as in (5.2.58) and (5.2.59). Thus, with x∗k = [x∗k1 , . . . ,
x∗kN ] , {x∗1 , . . . , x∗T } is an optimal solution to problem (5.2.60).
5.2 Dynamic Programming Approach 139

5.2.1.3 Numerical Simulations

We use daily return data of stocks on the ASX100 dated from 01/01/2007
to 30/11/2011. The portfolio takes effect from 01/01/2009 and is open for
trading for 3 years. 3 years is chosen since professionally managed portfolios
(e.g., Aberdeen Asset Management, JBWere) usually list the average holding
period as 3–5 years. At the beginning of each month, the funds in the portfolio
are allocated based on the updated asset return data. We use 2 years historical
data to decide the portfolio allocation each month. For the first portfolio
allocation on 01/01/2009, the return data from 01/01/2007 to 31/12/2008 is
used to evaluate the expected return and standard deviation of each stock.
In the month following, the return data from 01/02/2007 to 31/01/2009 is
used to evaluate the updated expected return and standard deviation, and
this goes on.
The formulation of the corresponding portfolio optimization problem is as
defined in Section 5.2.1.1. Assume the investor has an initial wealth of M0 =
1,000,000 dollars. There are N = 100 stocks to choose from for a portfolio
of investment holding period of T = 36 months. The average risk of the
portfolio, ε, over T periods is calculated as in (5.2.55). Table 5.2.1 shows
the portfolio returns for various combination of θ (risk adjusting parameter),
and y (lower bound of the probabilistic constraint). By changing the value of θ
and/or y, the investor is able to alter the portfolio composition to cater for
different risk tolerance levels. The lower the value of θ, the more diversified
the portfolio can be, while the lower the value of y, the less stringent the
probabilistic constraint.
From Table 5.2.1, it can be seen that the expected portfolio return in-
creases when θ increases. Similarly, the expected portfolio return increases
when y decreases. This makes sense since when θ increases, the portfolio
selection consists of a much smaller number of selected ‘better-performing’
stocks. When y is lower, the risk is higher and hence the return is generally
expected to be higher.
The historical price index value of ASX100, composed of 100 large-cap and
mid-cap stocks, was 3067.90 and 3329.40 at the end of December 2008 and at
the end of December 2011, respectively. This translates to a return of 8.52%
for a portfolio which comprises the entire stock selection of ASX100.
From Table 5.2.1, it can be seen that when θ > 0.5, the expected returns
following our portfolio selection criteria outperforms the market index return.
The multi-period model outperforms the passive single-period model with a
period of 3 years. Table 5.2.2 shows the expected returns using the single-
period model with a period of 3 years.
When θ = 0.1 and y = 0.95, solving the problem with Theorem 5.2.5
suggests a total wealth in portfolio of 1,043,223.89 dollars at the end of the 3
year investment, which is a return of 4.32%. Comparing this with the result of
the single-period investment strategy, the multi-period solution outperforms
140 5 Discrete Time Optimal Control Problems

it by more than one fold (the single-period portfolio has a total return of
about 2%).

Table 5.2.1: Multi-period—expected portfolio returns for selected θ and y

@ θ 0.01 0.1 0.2 0.5 1


y @
0.95 0.59% 4.32% 6.35% 10.78% 15.38%
0.90 0.71% 4.77% 7.01% 11.80% 16.85%
0.80 0.96% 5.48% 8.09% 13.40% 19.25%

Table 5.2.2: Single-period—expected portfolio returns for selected θ and y

@ θ 0.01 0.1 0.2 0.5 1


y @
0.95 −2.01% 2.00% 3.31% 3.30% 4.72%
0.90 −1.89% 2.23% 3.66% 3.83% 5.14%
0.80 −1.69% 2.65% 3.53% 4.38% 5.89%

5.3 Discrete Time Optimal Control Problems with


Canonical Constraints

Consider a process described by the following system of difference equations:

x(k + 1) = f (k, x(k), u(k)), k = 0, 1, . . . , M − 1, (5.3.1a)


where
x = [x1 , . . . , xn ] ∈ Rn , u = [u1 , . . . ur ] ∈ Rr ,
are, respectively, state and control vectors; and f = [f1 , . . . , fn ] ∈ Rn is
a given functional. The initial condition for the system of difference equa-
tions (5.3.1a) is
x(0) = x0 , (5.3.1b)
$ % 
where x0 = x01 , . . . , x0n ∈ Rn is a given vector.
Define

U = {ν = [v1 , . . . vr ] ∈ Rr : αi ≤ vi ≤ βi , i = 1, . . . , r}, (5.3.2)

where αi , i = 1, . . . , r and βi , i = 1, . . . , r, are given real numbers. Note that


U is a compact and convex subset of Rr . Let u denote a control sequence
{u(k) : k = 0, 1, . . . , M − 1} in U. Then, u is called an admissible control.
Let U denote the class of all such admissible controls.
For each u ∈ U , let x(k | u), k = 0, 1, . . . , M, be a sequence in Rn
such that the system of difference equations (5.3.1a) with the initial con-
5.3 Discrete Time Optimal Control Problems with Canonical Constraints 141

dition (5.3.1b) is satisfied. This discrete time function is called the solution
of the system (5.3.1) corresponding to the control u ∈ U .
We now consider the following class of discrete time optimal control prob-
lems in canonical formulation:

M −1
g0 (u) = Φ0 (x(M | u)) + L0 (k, x(k | u), u(k)) (5.3.3)
k=0

is minimized subject to u ∈ U and the following constraints (in canonical


form):

gi (u) = 0, i = 1, 2, . . . , Ne (5.3.4a)
gi (u) ≤ 0, i = Ne + 1, . . . , N , (5.3.4b)

where

M −1
gi (u) = Φi (x(M | u)) + Li (k, x(k | u), u(k)). (5.3.5)
k=0

Let this discrete time optimal control problem be referred to as Problem


(P ). The following conditions are assumed throughout this section:
Assumption 5.3.1 For each k = 0, 1, . . . , M − 1, f (k, ·, ·) is continuously
differentiable on Rn × Rr .
Assumption 5.3.2 For each i = 0, 1, . . . , N , Φi is continuously differen-
tiable on Rn .
Assumption 5.3.3 For each i = 0, 1, . . . , N , and for each k = 0, 1, . . . , M −
1, Li (k, ·, ·) is continuously differentiable on Rn × Rr .
Note that the canonical constraint functionals have the same form as the
cost functional. This feature is to allow gradient formulae of the cost and
constraint functionals to be computed in a unified way.

5.3.1 Gradient Formulae

In this section, our aim is to derive gradient formulae for the cost as well as
the constraint functionals. Define
$ %
u = (u(0)) , (u(1)) , . . . , (u(M − 1)) . (5.3.6)

Let the control vector u be perturbed by εû, where ε > 0 is a small real
number and û is an arbitrary but fixed perturbation of u given by
$ %
û = (û(0)) , (û(1)) , . . . , (û(M − 1)) . (5.3.7)
142 5 Discrete Time Optimal Control Problems

Then, we have

uε = u + εû = [(u(0, ε)) , (u(1, ε)) , . . . , (u(M − 1, ε)) ] , (5.3.8)

where
u(k, ε) = u(k) + εû(k), k = 0, 1, . . . , M − 1. (5.3.9)
Consequently, the state of the system will be perturbed, and so are the cost
and constraint functionals.
Define
x(k, ε) = x(k | uε ), k = 1, 2, . . . , M. (5.3.10)
Then,
x(k + 1, ε) = f (k, x(k, ε), u(k, ε)). (5.3.11)
The variation of the state for k = 0, 1, . . . , M − 1 is

dx(k + 1, ε) 
x(k + 1) = 
dε ε=0
∂f (k, x(k), u(k)) ∂f (k, x(k), u(k))
= x(k) + û(k) (5.3.12a)
∂x(k) ∂u(k)

with
x(0) = 0. (5.3.12b)
For the i−th functional (i = 0 denotes the cost functional), we have

∂gi (u) gi (uε ) − gi (u) dgi (uε ) 
û = lim ≡
∂u ε→0 ε dε ε=0

M −1
∂Φi (x(M )) ∂Li (k, x(k), u(k))
= x(M ) + x(k)
∂x(M ) ∂x(k)
k=0
∂Li (k, x(k), u(k))
+ û(k) . (5.3.13)
∂u(k)

For each i = 0, 1, . . . , N , define the Hamiltonian

Hi (k, x(k), u(k), λi (k + 1)) = Li (k, x(k), u(k)) + (λi (k + 1)) f (k, x(k), u(k)),
(5.3.14)

where λi (k) ∈ Rn , k = M, M −1, . . . , 1, is referred to as the costate sequence


for the i−th canonical constraint. Then, it follows from (5.3.13) that

∂gi (u) ∂Φi (x(M ))


û = x(M )
∂u ∂x(M )

M −1
∂Hi (k, x(k), u(k), λi (k + 1))
+ Δx(k)
∂x(k)
k=0
5.3 Discrete Time Optimal Control Problems with Canonical Constraints 143

∂f (k, x(k), u(k))


− (λi (k + 1)) x(k)
∂x(k)
∂Hi (k, x(k), u(k), λi (k + 1))
+ û(k)
∂u(k)
∂f (k, x(k), u(k))
− (λi (k + 1)) û(k) . (5.3.15)
∂u(k)

By (5.3.12), we have

∂f (k, x(k), u(k)) ∂f (k, x(k), u(k))


Δx(k + 1) = x(k) + û(k). (5.3.16)
∂x(k) ∂u(k)

Let the costate λi (k) be determined by the following system of difference


equations:

∂Hi (k, x(k), u(k), λi (k + 1))


(λi (k)) = , k = M − 1, M − 2, . . . , 1,
∂x(k)
(5.3.17a)
and
∂Φi (x(M ))
(λi (M )) = . (5.3.17b)
∂x(M )
Hence, from (5.3.15)–(5.3.17), it follows from (5.3.12) that

∂gi (u)

∂u
∂Hi (0, x(0), u(0), λi (1)) ∂Hi (M −1, x(M −1), u(M −1), λi (M ))
= ,..., û.
∂u(0) ∂u(M −1)

Since û is arbitrary, we have the following gradient formula:

∂gi (u)
∂u
∂Hi (0, x(0), u(0), λi (1)) ∂Hi (M −1, x(M −1), u(M −1), λi (M ))
= ,..., .
∂u(0) ∂u(M −1)
(5.3.18)

We summarize this result in the following theorem.


Theorem 5.3.1 Consider Problem (P ). For each i = 0, 1, . . . , N , the gradi-
ent of gi (u) with respect to u, where
$ %
u = (u(0)) , (u(1)) , . . . , (u(M − 1)) ,

is given by (5.3.18).
144 5 Discrete Time Optimal Control Problems

5.3.2 A Unified Computational Approach

Problem (P ) is essentially nonlinear mathematical programming problem


where the variables are the control parameters in the vector u. It can be
solved by using any suitable numerical optimization technique, such as se-
quential quadratic programming (see Section 3.5). In order to apply any
such method, for each u ∈ U , we need to be able to compute the correspond-
ing values of the cost functional g0 (u) and the constraint functionals gi (u),
i = 1, . . . , N , where gi (u) is defined by (5.3.3) if i = 0, and by (5.3.5) if
i = 1, . . . , N , as well as their respective gradients.
To calculate the values of cost functional and the constraint functionals
corresponding to each u ∈ U , the first task is to calculate the solution of the
system (5.3.1) corresponding to each u ∈ U . This is presented as an algorithm
for future reference.
Algorithm 5.3.1 For each given u ∈ U , compute the solution x(k | u),
k = 0, 1, . . . , M of system (5.3.1) by solving the difference equations (5.3.1a)
forward in time from k = 0 to k = M with initial condition (5.3.1b).

With the information obtained in Algorithm 5.3.1, the values of gi cor-


responding to each u ∈ U can be easily calculated by the following simple
algorithm:
Algorithm 5.3.2 For a given u ∈ U ,
Step 1. Use Algorithm 5.3.1 to solve for x(k | u), k = 0, 1, . . . , M . Thus,
x(k | u) is known for each k = 0, 1, . . . , M . This implies that
(a) Φi (x(M | u)), i = 0, 1, . . . , N , are known and
(b) Li (k, x(k | u), u(k)), i = 0, 1, . . . , N , are known for each k =
0, 1, . . . , M − 1. Hence, the summations


M −1
Li (k, x(k | u), u(k)), i = 0, 1, . . . , N ,
k=0

can be readily determined.


Step 2. Calculate the values of the cost functional (with i = 0) and the con-
straint functionals (with i = 1, . . . , N ) as follows:


M −1
gi (u) = Φi (x(M | u)) + Li (k, x(k | u), u(k)), i = 0, 1, . . . , N .
k=0

By using Theorem 5.3.1, we can calculate the gradients of the cost func-
tional and the canonical constraint functionals as stated in the following
algorithm:
5.4 Problems with Terminal and All-Time-Step Inequality Constraints 145

Algorithm 5.3.3 For each i = 0, 1, . . . , N , and for a given u ∈ U ,


Step 1. Use Algorithm 5.3.1 to solve system (5.3.1) to obtain x(k | u), k =
0, 1, . . . , M .
Step 2. Solve the system of the costate difference equations (5.3.17) backward
in time from k = M, M − 1, . . . , 1. Let λi (k | u) be the solution
obtained.
Step 3. Compute the gradient of gi according to (5.3.18).

5.4 Problems with Terminal and All-Time-Step


Inequality Constraints

Consider the system (5.3.1), and let U be the class of admissible controls
defined in Section 5.3. Two sets of nonlinear terminal state constraints are
specified as follows:

Ψi (x(M | u)) = 0, i = 1, . . . , N0 (5.4.1a)


Ψi (x(M | u)) ≤ 0, i = N0 + 1, . . . , N1 , (5.4.1b)

where Ψi , i = 1, . . . , N1 , are given real-valued functions defined in Rn . Fur-


thermore, we also consider the following set of all-time-step inequality con-
straints on the state and control variables.

hi (k, x(k | u), u(k)) ≤ 0, ∀ k ∈ M, i = 1, . . . , N2 , (5.4.2)

where
M = {0, 1, . . . , M − 1},
and hi , i = 1, . . . , N2 , are given real-valued functions defined in M×Rn ×Rr .
If u ∈ U satisfies the constraints (5.4.1) and (5.4.2), then it is called a
feasible control. Let F be the class of all feasible controls.
Consider the following problem, to be referred to as Problem (Q): Given
the system (5.3.1), find a control u ∈ F such that the cost functional


M −1
g0 (u) = Φ0 (x(M | u)) + L0 (k, x(k | u), u(k)) (5.4.3)
k=0

is minimized over F.
The following conditions are assumed throughout this section:
Assumption 5.4.1 For each k = 0, 1, . . . , M − 1, f (k, ·, ·) satisfies Assump-
tion (5.3.1).

Assumption 5.4.2 For each i = 1, . . . , N1 , Ψi is continuously differentiable


on Rn .
146 5 Discrete Time Optimal Control Problems

Assumption 5.4.3 For each i = 1, . . . , N2 , and for each k = 0, 1, . . . , M −


1, hi (k, ·, ·) is continuously differentiable on Rn × Rr .

Assumption 5.4.4 Φ0 satisfies Assumption 5.3.2.

Assumption 5.4.5 For each k = 0, 1, . . . , M −1, L0 (k, ·, ·) satisfies Assump-


tion 5.3.3.

Define
"
Θ = u ∈ U : Ψi (x(M | u)) = 0, i = 1, . . . , N0 ;
#
Ψi (x(M | u)) ≤ 0, i = Ne + 1, . . . , N1 (5.4.4)

and

F = {u ∈ Θ : hi (k, x(k | u), u(k)) ≤ 0, ∀k ∈ M, i = 1, . . . N2 } . (5.4.5)

Note that the terminal constraints (5.4.1) are already in canonical form
(5.3.5). Although the all-time-step inequality constraints (5.4.2) are not in
canonical form, they can be approximated by a sequence of canonical con-
straints via the constraint transcription introduced in Section 4.3. Details are
given in the next section.

5.4.1 Constraint Approximation

Consider Problem (Q) of Section 5.4. Note that for each i = 1, . . . , N2 , the
corresponding all-time-step inequality constraint in (5.4.2) is equivalent to


M −1
gi (u) = max{hi (k, x(k | u), u(k)), 0} = 0. (5.4.6)
k=0

For convenience, let Problem (Q) with (5.4.2) replaced by (5.4.6) again be
denoted by Problem (Q). Recall the set Θ defined by (5.4.4). Then, it is clear
that the set F of feasible controls can also be written as

F = {u ∈ Θ : gi (u) = 0, i = 1, . . . , N2 }. (5.4.7)

As in Section 4.2, we approximate the nonsmooth function

max{hi (k, x(k | u), u(k)), 0}

by

Li,ε (k, x(k | u), u(k))


5.4 Problems with Terminal and All-Time-Step Inequality Constraints 147


⎪ h (k, x(k | u), u(k)), if hi (k, x(k | u), u(k)) ≥ 0,
⎨ i 2
(h i (k, x(k | u), u(k))+)
=
⎪ , if −  < hi (k, x(k | u), u(k)) < , (5.4.8)

⎩ 0, 4
if hi (k, x(k | u), u(k)) ≤ −.

Then, for each i = 1, . . . , N2 , let


M −1
gi,ε (u) = Li,ε (k, x(k | u), u(k)). (5.4.9)
k=0

We may now define the following approximate problem:


Problem (Qε ): Problem (Q) with (5.4.6) replaced by
ε
− + gi,ε (u) ≤ 0, i = 1, . . . , N2 . (5.4.10)
4
Note that (5.4.10) is slightly more restrictive in the sense that it is a
sufficient, but not a necessary condition, for (5.4.6) to hold.
Let
"
D = u ∈ U : Ψi (x(M | u)) = 0, i = 1, . . . , Ne ;
#
Ψi (x(M | u)) ≤ 0, i = Ne + 1, . . . , N1 , (5.4.11)

Fε = {u ∈ D : −(ε/4) + gi,ε (u) ≤ 0, i = 1, . . . , N2 }, (5.4.12)


and

F 0 = {u ∈ D : hi (k, x(k | u), u(k)) < 0, k ∈ M, i = 1, . . . , N2 }. (5.4.13)

Clearly, Problem (Qε ) is also equivalent to: Find a control u ∈ Fε such


that the cost functional (5.4.3) is minimized over Fε .
We assume that the following condition is satisfied.
Assumption 5.4.6 For any u in F and any δ > 0, there exists a ū ∈ F 0
such that
max |u(k) − ū(k)| ≤ δ.
0≤k≤M −1

Lemma 5.4.1 If uε is a feasible control vector of Problem (Qε ), then it is


also a feasible control vector of Problem (Q).

Proof. Suppose uε is not a feasible control vector of Problem (Q). Then,


there exist some i = 1, . . . , N2 , and k ∈ M such that

hi (k, x(k | uε ), uε (k)) > 0.

This, in turn, implies that

Li,ε (k, x(k | uε ), uε (k)) > ε/4


148 5 Discrete Time Optimal Control Problems

and hence,
M ε ε
gi,ε (uε ) > > .
4 4
That is,
ε
− + gi,ε (uε ) > 0.
4
This is a contradiction to the constraints specified in (5.4.10), and thus the
proof is complete.
For each ε > 0, Problem (Qε ) can be regarded as a nonlinear mathe-
matical programming problem. Since the constraints appearing in (5.4.11)
and (5.4.12) are in canonical form, their gradient formulae as well as that of
g0 (u) can be readily computed as explained in Section 5.3.2. Hence, Problem
(Qε ) can be solved by any efficient optimization technique, such as the SQP
technique presented in Chapter 3. In view of Lemma 5.4.1, we see that any
feasible control vector of Problem (Qε ) is in F, and hence is a suboptimal
control vector for Problem (Q). Thus, by adjusting ε > 0 in such a way that
ε → 0, we obtain a sequence of approximate problems (Qε ), each being solved
as a nonlinear mathematical programming problem. We shall now investigate
certain convergence properties of this approximation scheme.

5.4.2 Convergence Analysis

The aim of this section is to provide a convergence analysis for the approxi-
mation scheme proposed in the last subsection.
Theorem 5.4.1 Let u∗ and u∗ε be optimal controls of Problems (Q) and
(Qε ), respectively. Then, there exists a subsequence of {u∗ε }, which is again
denoted by the original sequence, and a control ū ∈ F such that

lim |u∗ε (k) − ū(k)| = 0, k = 0, 1, .., M − 1. (5.4.14)


ε→0

Furthermore, ū is an optimal control of Problem (Q).


Proof. Since U is a compact subset of Rr , and since {u∗ε } as a sequence in ε
is in U , it is clear that there exists a subsequence, which is again denoted by
the original sequence, and a control parameter vector ū ∈ U such that

lim |u∗ε (k) − ū(k)| = 0, k = 0, 1, . . . , M − 1. (5.4.15)


ε→0

By induction, it follows from Assumption 5.4.1 that, for each k = 0, 1, . . . , M,

lim_{ε→0} |x(k | u∗ε ) − x(k | ū)| = 0.    (5.4.16)

Thus, by Assumptions 5.4.2 and 5.4.3, we have


lim_{ε→0} Ψi (x(M | u∗ε )) = Ψi (x(M | ū)), i = 1, . . . , N1 ,    (5.4.17)

and, for each k ∈ M,

lim_{ε→0} hi (k, x(k | u∗ε ), u∗ε (k)) = hi (k, x(k | ū), ū(k)), i = 1, . . . , N2 .    (5.4.18)
By Lemma 5.4.1, u∗ε ∈ F for all ε > 0. Thus, it follows from (5.4.17)
and (5.4.18) that ū ∈ F.
Next, by Assumptions 5.4.1 and 5.4.2, we deduce from (5.4.15) and (5.4.16)
that
lim_{ε→0} g0 (u∗ε ) = g0 (ū).    (5.4.19)

For any δ1 > 0, Assumption 5.4.6 asserts that there exists a û ∈ F 0 such
that
|u∗ (k) − û(k)| ≤ δ1 , ∀ k = 0, 1, . . . , M − 1. (5.4.20)
By Assumption 5.4.1 and induction, we can show that, for any ρ1 > 0, there
exists a δ1 > 0 such that for all k = 0, 1, . . . , M

|x(k | u∗ ) − x(k | û)| ≤ ρ1 , (5.4.21)

whenever (5.4.20) is satisfied.


Using (5.4.20), (5.4.21) and Assumptions 5.4.1 and 5.4.2, it follows that,
for any ρ2 > 0, there exists a û ∈ F 0 such that

g0 (u∗ ) ≤ g0 (û) ≤ g0 (u∗ ) + ρ2 .    (5.4.22)

Since û ∈ F 0 , we have

hi (k, x(k | û), û(k)) < 0, k ∈ M, i = 1, . . . , N2 ,

and hence there exists a δ2 > 0 such that

hi (k, x(k | û), û(k)) ≤ −δ2 , k ∈ M, i = 1, . . . , N2 . (5.4.23)

Thus, in view of (5.4.12), we see that

û ∈ Fε

for all ε, 0 < ε ≤ δ2 . Therefore,

g0 (u∗ε ) ≤ g0 (û).    (5.4.24)

Using (5.4.22) and (5.4.24), and noting that u∗ε ∈ F, we obtain

g0 (u∗ ) ≤ g0 (u∗ε ) ≤ g0 (u∗ ) + ρ2 . (5.4.25)

Since ρ2 > 0 is arbitrary, it follows that


lim_{ε→0} g0 (u∗ε ) = g0 (u∗ ).    (5.4.26)

Combining (5.4.19) and (5.4.26), we conclude that ū is an optimal control


of Problem (Q), and the proof is complete.

5.4.3 Illustrative Examples

In this section, three numerical examples are solved to illustrate the proposed
computational method.
Example 5.4.1 This problem concerns the vertical ascent of a rocket. The
original formulation (continuous time version) and numerical solution to the
problem are taken from [54]. A discrete time version of the control process is
obtained by applying the Euler scheme to the system equations, with the time step taken as 1 s as in [54].

x1 (k + 1) = x1 (k) − u(k)    (5.4.27a)
x2 (k + 1) = x2 (k) + x3 (k)    (5.4.27b)
x3 (k + 1) = x3 (k) + [V u(k) − Q(x2 (k), x3 (k))]/x1 (k) − g,    (5.4.27c)

where x1 (k) is the mass of the rocket; x2 (k) is the altitude (km) above the earth's surface; x3 (k) is the rocket velocity (km/s); u(k) is the mass flow rate; V = 2 is the constant gas nozzle velocity; g = 0.01 km/s² is the acceleration due to gravity (assumed constant); and Q(x2 (k), x3 (k)) is the aerodynamic drag defined by the formula:

Q(x2 (k), x3 (k)) = 0.05 exp(0.01x2 (k))(x3 (k))².

The initial state of the rocket is x1 (0) = 1, x2 (0) = 0, x3 (0) = 0; at the


terminal time M = 100, the final value of the mass is constrained to be 20%
of the initial mass, i.e., x1 (M ) = 0.2. The bounds on the control are

0 ≤ u(k) ≤ 0.04

for k = 0, 1, . . . , M − 1.
The control objective is to maximize the rocket’s peak altitude by suitable
choice of the mass flow rate. In other words, we want to minimize

g0 = −k1 x2 (M ) (5.4.28)

subject to the terminal state constraint

Ψ1 = k2 (x1 (M ) − 0.2) = 0, (5.4.29)



where k1 = 0.01 and k2 = 10 are the appropriate weighting factors.


The problem is solved by the discrete time optimal control software,
DMISER [58]. Table 5.4.1 summarizes the computed results for this exam-
ple. The original continuous time problem is solved by MISER [104], and the
solution is found to be consistent with the present result.

Table 5.4.1: Numerical results for Example 5.4.1

M      g0             Ψ1                x1 (M )   x2 (M )
100    −0.36454188    −0.167 × 10⁻¹⁴    0.2       36.45
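The Euler-discretized dynamics (5.4.27) can be propagated directly. The following Python sketch is a minimal illustration; the constant control used here is a placeholder (chosen so that the terminal mass constraint (5.4.29) happens to be met), not the optimal control computed by DMISER.

import numpy as np

V, g = 2.0, 0.01            # gas nozzle velocity and gravity
k1, k2 = 0.01, 10.0         # weighting factors in (5.4.28)-(5.4.29)
M = 100

def propagate(u):
    # forward recursion of the Euler-discretized dynamics (5.4.27)
    x = np.zeros((M + 1, 3))
    x[0] = [1.0, 0.0, 0.0]  # x1: mass, x2: altitude, x3: velocity
    for k in range(M):
        m, alt, vel = x[k]
        drag = 0.05 * np.exp(0.01 * alt) * vel ** 2   # Q(x2, x3)
        x[k + 1] = [m - u[k], alt + vel, vel + (V * u[k] - drag) / m - g]
    return x

u = np.full(M, 0.008)       # placeholder control; 100 x 0.008 leaves x1(M) = 0.2
x = propagate(u)
g0 = -k1 * x[M, 1]          # cost (5.4.28)
psi1 = k2 * (x[M, 0] - 0.2) # terminal constraint (5.4.29), zero here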

Example 5.4.2 The original problem is taken from [189] and the same prob-
lem was also considered in [54]. The control process (discretized by the Euler
scheme using the time step h = 0.02) is described by the difference equations:

x1 (k + 1) = x1 (k) + 0.02x2 (k) (5.4.30a)


x2 (k + 1) = 0.98x2 (k) + 0.02u(k), (5.4.30b)

where k = 0, 1, . . . , M − 1, with initial state

x1 (0) = 0, x2 (0) = −1. (5.4.30c)

The problem is to minimize

g0 = (x1 (M ))² + (x2 (M ))² + Σ_{k=0}^{M−1} [(x1 (k))² + (x2 (k))² + 0.005(u(k))²]

subject to the all-time-step constraints

h(k, x(k)) = −8(0.02k − 0.5)² + x2 (k) + 0.5 ≤ 0    (5.4.31)

for k = 1, . . . , M . For computational purpose, we let M = 50.


By using the constraint transcription method described in Section 4.2,
we obtain a sequence of approximate optimal control problems (Qε ). For
each ε > 0, it is a problem of minimizing a function of 50 variables with
one constraint. Each of these problems is solvable by DMISER. Table 5.4.2
summarizes the computed results.
Note that all the constraints in (5.4.31) are satisfied.

Table 5.4.2: Numerical results for Example 5.4.2

M     g0           −ε/4 + gi,ε
50    9.1415507    0.429 × 10⁻⁹

Example 5.4.3 Consider the following first order system:

x(k + 1) = 0.5x(k) + u(k)
x(0) = 1,

where k = 0, 1, . . . , M − 1, with M = 50. The cost functional

g0 = Σ_{k=0}^{M−1} [(x(k))² + (u(k))²]

is to be minimized with the control bounded by

−1 ≤ u(k) ≤ 1,

for k = 0, 1, . . . , M − 1. The computed result is g0 = 1.1327822.
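Since the control bounds turn out to be inactive at the optimum, the reported value can be cross-checked by an elementary dynamic programming (Riccati-type) recursion. A minimal Python sketch, assuming the value function has the form V_k(x) = p_k x²:

# Bellman recursion for Example 5.4.3 with V_k(x) = p_k x^2 and p_M = 0:
#   p_k = 1 + 0.25 p_{k+1} / (1 + p_{k+1}),
# with minimizer u(k) = -0.5 p_{k+1} x(k) / (1 + p_{k+1}), which stays well
# inside the bounds -1 <= u(k) <= 1, so the bounds are inactive.
M = 50
p = 0.0
for _ in range(M):
    p = 1.0 + 0.25 * p / (1.0 + p)
print(p * 1.0 ** 2)   # ~1.1327822 = g0, since x(0) = 1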

5.5 Discrete Time Time-Delayed Optimal Control


Problem

The main reference of this section is [131]. Consider a process described by


the following system of difference equations with time-delay:

x(k + 1) = f (k, x(k), x(k − h), u(k), u(k − h)), k = 0, 1, . . . , M − 1, (5.5.1a)

where
x = [x1 , . . . , xn ]ᵀ ∈ Rn , u = [u1 , . . . , ur ]ᵀ ∈ Rr
are, respectively, the state and control vectors, f = [f1 , . . . , fn ]ᵀ ∈ Rn is a given function, and the time-delay h is a given integer satisfying 0 < h < M . Here, we consider the case where there is only one time delay. The extension to the case involving many time-delays is straightforward but is more involved in terms of notation.
The initial functions for the state and control functions are

x(k) = φ(k), k = −h, −h + 1, . . . , −1, x(0) = x0 , (5.5.1b)


u(k) = γ(k), k = −h, −h + 1, . . . , −1, (5.5.1c)

where
φ(k) = [φ1 (k), . . . , φn (k)]ᵀ , γ(k) = [γ1 (k), . . . , γr (k)]ᵀ ,
are given functions defined for k = −h, −h + 1, . . . , −1 with values in Rn and Rr , respectively, and x0 is a given vector in Rn . Define

U = {v = [v1 , . . . , vr ]ᵀ ∈ Rr : αi ≤ vi ≤ βi , i = 1, . . . , r},    (5.5.2)

where αi , i = 1, . . . , r, and βi , i = 1, . . . , r, are given real numbers. Note that


U is a compact and convex subset of Rr .
Consider the all-time-step inequality constraints on the state and control
variables given below:

hi (k, x(k), u(k)) ≤ 0, k = 0, 1, . . . , M − 1; i = 1, . . . , N2 ,    (5.5.3)

where hi , i = 1, . . . , N2 , are given real-valued functions.


A control sequence u = {u(0), . . . , u(M − 1)} is said to be an admissible
control if u(k) ∈ U, k = 0, . . . , M − 1, where U is defined by (5.5.2). Let
U be the class of all such admissible controls. If a u ∈ U is such that the
all-time-step inequality constraints (5.5.3) are satisfied, then it is called a
feasible control. Let F be the class of all such feasible controls.
We now state our problem formally as follows:
Problem (Q) Given system (5.5.1a), (5.5.1b), (5.5.1c), find a control u ∈ F
such that the cost functional

g0 (u) = Φ0 (x(M )) + Σ_{k=0}^{M−1} L0 (k, x(k), x(k − h), u(k), u(k − h))    (5.5.4)

is minimized over F, where Φ0 and L0 are given real-valued functions.

5.5.1 Approximation

To begin, we first note that the all-time-step inequality constraints are equiv-
alent to the following equality constraints:


gi (u) = Σ_{k=0}^{M−1} max{hi (k, x(k), u(k)), 0} = 0, i = 1, . . . , N2 .    (5.5.5)

Thus, the set F of feasible controls can be written as

F = {u(k) ∈ U, k = 0, . . . , M − 1 : gi (u) = 0, i = 1, . . . , N2 },    (5.5.6)

where U is defined by (5.5.2). However, the functions appearing in (5.5.5)


are nonsmooth. Thus, for each i = 1, . . . , N2 , we shall approximate the nonsmooth function max{hi (k, x(k), u(k)), 0} by a smooth function Li,ε (k, x(k), u(k)) given by

Li,ε = { 0,                if hi < −ε,
       { (hi + ε)²/(4ε),   if −ε ≤ hi ≤ ε,    (5.5.7)
       { hi ,              if hi > ε,

where ε > 0 is an adjustable constant with small value. Then, the all-time-
step inequality constraints (5.5.3) are approximated by the inequality con-
straints in canonical form defined by
−ε/4 + gi,ε (u) ≤ 0, i = 1, . . . , N2 ,    (5.5.8)
where

gi,ε (u) = Σ_{k=0}^{M−1} Li,ε (k, x(k), u(k)).

Define

Fε = {u(k) ∈ U, k = 0, . . . , M − 1 : −ε/4 + gi,ε (u) ≤ 0, i = 1, . . . , N2 }.    (5.5.9)
Now, we can define a sequence of approximate problems (Qε ), where ε > 0,
below.
Problem (Qε ) Problem (Q) with (5.5.5) replaced by

−ε/4 + gi,ε (u) ≤ 0, i = 1, . . . , N2 .    (5.5.10)
In Problem (Qε ), our aim is to find a control u in Fε such that the cost
functional (5.5.4) is minimized over Fε . For each ε > 0, Problem (Qε ) is a
special case of a general class of discrete time optimal control problems with
time-delay and subject to canonical constraints defined below.
Problem (P) Given system (5.5.1a)–(5.5.1c), find an admissible control u ∈
U such that the cost functional

g0 (u) = Φ0 (x(M )) + Σ_{k=0}^{M−1} L0 (k, x(k), x(k − h), u(k), u(k − h))    (5.5.11)

is minimized over U subject to the following constraints in canonical form:

gi (u) = 0, i = 1, 2, . . . , Ne , (5.5.12a)
gi (u) ≤ 0, i = Ne + 1, . . . , N, (5.5.12b)

where

gi (u) = Φi (x(M )) + Σ_{k=0}^{M−1} Li (k, x(k), x(k − h), u(k), u(k − h)).    (5.5.13)

We shall develop an efficient computational method for solving Problem


(P) in the next section. In the rest of this section, our aim is to establish
the required convergence properties of Problems (Qε ) to Problem (Q). We
assume that the following conditions are satisfied. These conditions are now
quite standard in optimal control algorithms.
Assumption 5.5.1 For each k = 0, 1, . . . , M −1, f (k, ·, ·, ·, ·) is continuously
differentiable on Rn × Rn × Rr × Rr .
Assumption 5.5.2 For each i = 1, . . . , N2 , and for each k = 0, 1, . . . , M −
1, hi (k, ·, ·) is continuously differentiable on Rn × Rr .

Assumption 5.5.3 Φ0 is continuously differentiable on Rn .

Assumption 5.5.4 For each k = 0, 1, . . . , M − 1, L0 (k, ·, ·, ·, ·) is continu-


ously differentiable on Rn × Rn × Rr × Rr .

Assumption 5.5.5 For any control u in F, there exists a control ū ∈ F 0


such that αū + (1 − α)u ∈ F 0 for all α ∈ (0, 1].

Remark 5.5.1 Under Assumption 5.5.5, it can be shown that for any u in
F and δ > 0, there exists a ū ∈ F 0 such that

max_{0≤k≤M−1} |u(k) − ū(k)| ≤ δ,

where F 0 is the interior of F, meaning that if u ∈ F 0 , then

hi (k, x(k), u(k)) < 0, i = 1, . . . , N2 ,

for all k = 0, 1, . . . , M − 1.
In what follows, we shall present an algorithm for solving Problem (Q) as
a sequence of Problems (Qε ).
Algorithm 5.5.1

Step 1. Set ε = ε0 .

Step 2. Solve Problem (Qε ) as a nonlinear programming problem, obtain-


ing an optimal solution.

Step 3. Set ε = ε/10, and go to Step 2.

Remark 5.5.2 ε0 is usually set to 1.0 × 10⁻² , and the algorithm is terminated with a 'successful exit' when ε < 10⁻⁷ .
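A minimal Python sketch of Algorithm 5.5.1 is given below; scipy.optimize.minimize with SLSQP is used here purely as a stand-in for the SQP solver of Chapter 3, and the cost and smoothed constraint functionals are assumed to be supplied by the user.

import numpy as np
from scipy.optimize import minimize

def solve_Q_eps(g0, g_eps_list, u0, bounds, eps):
    # one instance of Problem (Q_eps); SLSQP uses the convention fun(u) >= 0,
    # so -eps/4 + g_{i,eps}(u) <= 0 becomes eps/4 - g_{i,eps}(u) >= 0
    cons = [{'type': 'ineq', 'fun': (lambda u, g=g: eps / 4.0 - g(u, eps))}
            for g in g_eps_list]
    return minimize(g0, u0, method='SLSQP', bounds=bounds,
                    constraints=cons).x

def algorithm_5_5_1(g0, g_eps_list, u0, bounds, eps0=1.0e-2, eps_min=1.0e-7):
    u, eps = np.asarray(u0, dtype=float), eps0
    while eps >= eps_min:                                # Steps 2 and 3
        u = solve_Q_eps(g0, g_eps_list, u, bounds, eps)  # warm start
        eps /= 10.0
    return u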

The convergence properties of Problems (Qε ) to Problem (Q) are given in the following two theorems.

Theorem 5.5.1 Let u∗ be an optimal control of Problem (Q) and let u∗ε be
an optimal control of Problem (Qε ). Then,

lim_{ε→0} g0 (u∗ε ) = g0 (u∗ ).

Proof. By Assumption 5.5.5, there exists a ū ∈ F 0 such that

uα ≡ αū + (1 − α)u∗ ∈ F 0 , ∀α ∈ (0, 1].

Thus, for any δ1 > 0, there exists an α1 ∈ (0, 1] such that

g0 (u∗ ) ≤ g0 (uα ) ≤ g0 (u∗ ) + δ1 , ∀α ∈ (0, α1 ].    (5.5.14)

Choose α2 = α1 /2. Then, it is clear that uα2 ∈ F 0 . Thus, there exists a δ2 > 0 such that

hi (k, x(k | uα2 ), uα2 (k)) ≤ −δ2 , i = 1, . . . , N2 ,

for all k, 0 ≤ k ≤ M − 1. Then, for any ε with 0 < ε ≤ δ2 , it follows from the definition of Li,ε given by (5.5.7) that Li,ε (k, x(k | uα2 ), uα2 (k)) = 0 for all k. Thus, (5.5.10) is satisfied and hence uα2 ∈ Fε . Let u∗ε be an optimal control of Problem (Qε ). Clearly, u∗ε ∈ Fε and

g0 (u∗ε ) ≤ g0 (uα2 ).    (5.5.15)

However,

g0 (u∗ ) ≤ g0 (u∗ε ).    (5.5.16)

Thus, it follows from (5.5.14), (5.5.15) and (5.5.16) that

g0 (u∗ ) ≤ g0 (u∗ε ) ≤ g0 (uα2 ) ≤ g0 (u∗ ) + δ1 .

Letting ε → 0 and noting that δ1 > 0 is arbitrary, the conclusion of the theorem follows readily. This completes the proof.

Theorem 5.5.2 Let u∗ε and u∗ be optimal controls of Problems (Qε ) and
(Q), respectively. Then, there exists a subsequence of {u∗ε }, which is again
denoted by the original sequence, and a control ū ∈ F such that, for each
k = 0, 1, . . . , M − 1,

lim_{ε→0} |u∗ε (k) − ū(k)| = 0.    (5.5.17)

Furthermore, ū is an optimal control of Problem (Q).

Proof. Since U is a compact subset of Rr , and {u∗ε }, as a sequence in ε, is


such that u∗ε (k) ∈ U, for k = 0, 1, . . . , M − 1, it is clear that there exists a
subsequence, which is again denoted by the original sequence, and a control
parameter vector ū ∈ U such that, for each k = 0, 1, . . . , M − 1,

lim_{ε→0} |u∗ε (k) − ū(k)| = 0.    (5.5.18)

By induction, we can show, by using Assumption 5.5.1 and (5.5.18), that, for
each k = 0, 1, . . . , M,

lim_{ε→0} |x(k | u∗ε ) − x(k | ū)| = 0.    (5.5.19)

Thus, by Assumption 5.5.2, we have, for each k = 0, 1, . . . , M − 1,

lim_{ε→0} hi (k, x(k | u∗ε ), u∗ε (k)) = hi (k, x(k | ū), ū(k)), i = 1, . . . , N2 .    (5.5.20)

By an argument identical to that of Lemma 5.4.1, u∗ε ∈ F for all ε > 0. Thus, it follows from (5.5.20) that ū ∈ F.
Next, by Assumption 5.5.1, we deduce from (5.5.18) and (5.5.19) that

lim_{ε→0} g0 (u∗ε ) = g0 (ū).    (5.5.21)

For any δ1 > 0, it follows from Remark 5.5.1 that there exists a û ∈ F 0 such
that, for each k = 0, 1, . . . , M − 1,

|u∗ (k) − û(k)| ≤ δ1 . (5.5.22)

By Assumption 5.5.1 and induction, we can show that, for any ρ1 > 0, there
exists a δ1 > 0 such that for each k = 0, 1, . . . , M,

|x(k |u∗ ) − x(k |û )| ≤ ρ1 , (5.5.23)

whenever (5.5.22) is satisfied. Using (5.5.22), (5.5.23) and Assumption 5.5.4,


it follows that, for any ρ2 > 0, there exists a û ∈ F 0 such that

g0 (u∗ ) ≤ g0 (û) ≤ g0 (u∗ ) + ρ2 . (5.5.24)

Since û ∈ F 0 , we have, for each k = 0, 1, . . . , M,

hi (k, x(k |û ), û(k)) < 0, i = 1, . . . , N2 ,

and hence there exists a δ > 0 such that, for each k = 0, 1, . . . , M,

hi (k, x(k |û ), û(k)) ≤ −δ, i = 1, . . . , N2 . (5.5.25)

Thus, in view of (5.5.9), we see that

û ∈ Fε ,

for all ε, 0 < ε ≤ δ. Therefore,

g0 (u∗ε ) ≤ g0 (û). (5.5.26)

Using (5.5.24) and (5.5.26), and noting that u∗ε ∈ F, we obtain



g0 (u∗ ) ≤ g0 (u∗ε ) ≤ g0 (u∗ ) + ρ2 . (5.5.27)

Since ρ2 > 0 is arbitrary, it follows that

lim_{ε→0} g0 (u∗ε ) = g0 (u∗ ).    (5.5.28)

Combining (5.5.21) and (5.5.28), we conclude that ū is an optimal control of


Problem (Q). This completes the proof.

5.5.2 Gradients

To calculate the gradients of the cost and constraint functionals, we will


derive the required gradient formulas corresponding to each control sequence
u = {u(0), . . . , u(M − 1)} as follows:
For each i = 0, 1, . . . , N , let
 
Hi k, x(k), y(k), z(k), u(k), v(k), w(k), λi (k + 1), λ̄i (k)

be the corresponding Hamiltonian sequence defined by


 
Hi k, x(k), y(k), z(k), u(k), v(k), w(k), λi (k + 1), λ̄i (k)
=Li (k, x(k), y(k), u(k), v(k))
+ Li (k + h, z(k), x(k), w(k), u(k))e(M − k − h)
 
+ λi (k + 1) f (k, x(k), y(k), u(k), v(k))
 
+ λ̄i (k) f (k + h, z(k), x(k), w(k), u(k))e(M − k − h), (5.5.29)

where e(·) denotes the unit step function defined by

e(k) = { 1, k ≥ 0,
       { 0, k < 0,    (5.5.30)

and

y(k) = x(k − h), (5.5.31a)


z(k) = x(k + h), (5.5.31b)
v(k) = u(k − h), (5.5.31c)
w(k) = u(k + h), (5.5.31d)
λ̄i (k) = λi (k + h + 1).    (5.5.31e)

For each control u, λi is the solution of the following costate system:


(λi (k))ᵀ = ∂Hi (k)/∂x(k), k = M − 1, M − 2, . . . , 0,    (5.5.32)

with boundary conditions

(λi (M ))ᵀ = ∂Φi (x(M ))/∂x(M ),    (5.5.33a)
λi (k) = 0, k > M.    (5.5.33b)

We set

z(k) = 0, ∀k = M − h + 1, M − h + 2, . . . , M, (5.5.34)

and
w(k) = 0, ∀k = M − h, M − h + 1, . . . , M. (5.5.35)
Then, the gradient formulas for the cost functionals (for i = 0) and constraint
functionals (for i = 1, . . . , N ) are given in the following theorem.
Theorem 5.5.3 Let gi (u), i = 0, 1, . . . , N , be defined by (5.5.11) (the
cost functional for i = 0) and (5.5.13) (the constraint functionals for
i = 1, . . . , N ). Then, for each i = 0, 1, . . . , N , the gradient of the function
gi (u) is given by

∂gi (u)/∂u = [∂Hi (0)/∂u(0), ∂Hi (1)/∂u(1), . . . , ∂Hi (M − 1)/∂u(M − 1)],    (5.5.36)

where

Hi (k) = Hi (k, x(k), y(k), z(k), u(k), v(k), w(k), λi (k + 1), λ̄i (k)), k = 0, 1, . . . , M − 1.

Proof. Define

u = [(u(0))ᵀ , (u(1))ᵀ , . . . , (u(M − 1))ᵀ ]ᵀ .    (5.5.37)

Let the control u be perturbed by εû, where ε > 0 is a small real number and û is an arbitrary but fixed perturbation of u given by

û = [(û(0))ᵀ , (û(1))ᵀ , . . . , (û(M − 1))ᵀ ]ᵀ .    (5.5.38)

Then, we have

uε = u + εû = [(u(0, ε))ᵀ , (u(1, ε))ᵀ , . . . , (u(M − 1, ε))ᵀ ]ᵀ ,    (5.5.39)

where

u(k, ε) = u(k) + εû(k), k = 0, 1, . . . , M − 1.    (5.5.40)
Let the perturbed solution be denoted by

x(k, ε) = x(k | uε ), k = 1, 2, . . . , M.    (5.5.41)

Then,

x(k + 1, ε) = f (k, x(k, ε), y(k, ε), u(k, ε), v(k, ε)).    (5.5.42)

The variation of the state for k = 0, 1, . . . , M − 1 is

Δx(k + 1) = dx(k + 1, ε)/dε |ε=0
= (∂f (k, x(k), y(k), u(k), v(k))/∂x(k)) Δx(k)
+ (∂f (k, x(k), y(k), u(k), v(k))/∂y(k)) Δy(k)
+ (∂f (k, x(k), y(k), u(k), v(k))/∂u(k)) û(k)
+ (∂f (k, x(k), y(k), u(k), v(k))/∂v(k)) Δv(k),    (5.5.43a)

where

Δx(k) = 0, k ≤ 0,    (5.5.43b)
Δu(k) = 0, k < 0.    (5.5.43c)

From (5.5.43b) and (5.5.43c), we obtain

Δy(k) = 0, k = 0, 1, . . . , h,    (5.5.44a)

and

Δv(k) = 0, k = 0, 1, . . . , h − 1.    (5.5.44b)

Define

L̄i = Li (k, x(k), y(k), u(k), v(k)),    (5.5.45a)
L̂i = Li (k + h, z(k), x(k), w(k), u(k)),    (5.5.45b)
f̄ = f (k, x(k), y(k), u(k), v(k)),    (5.5.45c)
f̂ = f (k + h, z(k), x(k), w(k), u(k)),    (5.5.45d)
H̄i = Hi (k).    (5.5.45e)

By the chain rule and (5.5.45a), it follows that

(∂gi (u)/∂u) û = lim_{ε→0} [gi (uε ) − gi (u)]/ε ≡ dgi (uε )/dε |ε=0
= (∂Φi (x(M ))/∂x(M )) Δx(M )
+ Σ_{k=0}^{M−1} [ (∂ L̄i /∂x(k)) Δx(k) + (∂ L̄i /∂y(k)) Δy(k)
+ (∂ L̄i /∂u(k)) û(k) + (∂ L̄i /∂v(k)) Δv(k) ].    (5.5.46)

From (5.5.31a), (5.5.31c) and (5.5.45b), we have

Σ_{k=0}^{M−1} [ (∂ L̄i /∂y(k)) Δy(k) + (∂ L̄i /∂v(k)) Δv(k) ]
= Σ_{k=0}^{M−1} e(M − k − h) [ (∂ L̂i /∂x(k)) Δx(k) + (∂ L̂i /∂u(k)) û(k) ].    (5.5.47)

Substituting (5.5.47) into (5.5.46), and then using (5.5.32) and (5.5.45a)–(5.5.45e), we obtain

(∂gi (u)/∂u) û = (∂Φi (x(M ))/∂x(M )) Δx(M )
+ Σ_{k=0}^{M−1} { (∂ H̄i /∂x(k)) Δx(k) + (∂ H̄i /∂u(k)) û(k)
− (λi (k + 1))ᵀ (∂ f̄ /∂x(k)) Δx(k)
− (λ̄i (k))ᵀ (∂ f̂ /∂x(k)) Δx(k) e(M − k − h)
− (λi (k + 1))ᵀ (∂ f̄ /∂u(k)) û(k)
− (λ̄i (k))ᵀ (∂ f̂ /∂u(k)) û(k) e(M − k − h) }.    (5.5.48)

Using (5.5.33b) and the definition of e(·), it follows that

Σ_{k=0}^{M−1} (λ̄i (k))ᵀ [ (∂ f̂ /∂x(k)) Δx(k) + (∂ f̂ /∂u(k)) û(k) ] e(M − k − h)
= Σ_{k=0}^{M−h−1} (λ̄i (k))ᵀ [ (∂ f̂ /∂x(k)) Δx(k) + (∂ f̂ /∂u(k)) û(k) ]
= Σ_{k=h}^{M−1} (λi (k + 1))ᵀ [ (∂ f̄ /∂y(k)) Δy(k) + (∂ f̄ /∂v(k)) Δv(k) ].    (5.5.49)

As Δy(k) = 0 for 0 ≤ k ≤ h, and Δv(k) = 0 for 0 ≤ k ≤ h − 1, we have

Σ_{k=h}^{M−1} (λi (k + 1))ᵀ [ (∂ f̄ /∂y(k)) Δy(k) + (∂ f̄ /∂v(k)) Δv(k) ]
= Σ_{k=0}^{M−1} (λi (k + 1))ᵀ [ (∂ f̄ /∂y(k)) Δy(k) + (∂ f̄ /∂v(k)) Δv(k) ].    (5.5.50)

Combining (5.5.49) and (5.5.50), we obtain

Σ_{k=0}^{M−1} (λ̄i (k))ᵀ [ (∂ f̂ /∂x(k)) Δx(k) + (∂ f̂ /∂u(k)) û(k) ] e(M − k − h)
= Σ_{k=0}^{M−1} (λi (k + 1))ᵀ [ (∂ f̄ /∂y(k)) Δy(k) + (∂ f̄ /∂v(k)) Δv(k) ].    (5.5.51)

From (5.5.43a) and (5.5.51), it follows from (5.5.48) that

(∂gi (u)/∂u) û = (∂Φi (x(M ))/∂x(M )) Δx(M )
+ Σ_{k=0}^{M−1} { (∂ H̄i /∂x(k)) Δx(k) + (∂ H̄i /∂u(k)) û(k)
− (λi (k + 1))ᵀ Δx(k + 1) }.    (5.5.52)

Thus, by (5.5.32), (5.5.45c) and (5.5.52), we obtain

(∂gi (u)/∂u) û = (∂Φi (x(M ))/∂x(M )) Δx(M )
+ Σ_{k=0}^{M−2} [ (∂ H̄i (k)/∂x(k)) Δx(k) − (∂ H̄i (k + 1)/∂x(k + 1)) Δx(k + 1) ]
+ (∂ H̄i (M − 1)/∂x(M − 1)) Δx(M − 1)
− (λi (M ))ᵀ Δx(M ) + Σ_{k=0}^{M−1} (∂ H̄i /∂u(k)) û(k).    (5.5.53)

Therefore, by substituting (5.5.33a) and (5.5.43b) into (5.5.53), it follows that

(∂gi (u)/∂u) û = [∂Hi (0)/∂u(0), ∂Hi (1)/∂u(1), . . . , ∂Hi (M − 1)/∂u(M − 1)] û.

Since û is arbitrary, we obtain

∂gi (u)/∂u = [∂Hi (0)/∂u(0), ∂Hi (1)/∂u(1), . . . , ∂Hi (M − 1)/∂u(M − 1)].

This completes the proof.
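The costate recursion (5.5.32)–(5.5.33) and the gradient formula (5.5.36) are straightforward to implement. The Python sketch below does so for a hypothetical scalar test problem (all model functions are illustrative choices, not taken from the text) and checks the result against central finite differences; note that the factor e(M − k − h), together with λi (k) = 0 for k > M and the conventions (5.5.34)–(5.5.35), reduces to the indicator k + h ≤ M − 1.

import numpy as np

M, h = 20, 3
# hypothetical scalar model: dynamics f, running cost L, terminal cost Phi_0
f  = lambda k, x, y, u, v: 0.5 * x + 0.2 * np.sin(y) + u + 0.1 * v
fx, fy = (lambda *a: 0.5), (lambda k, x, y, u, v: 0.2 * np.cos(y))
fu, fv = (lambda *a: 1.0), (lambda *a: 0.1)
L  = lambda k, x, y, u, v: x ** 2 + 0.5 * y ** 2 + u ** 2
Lx, Ly = (lambda k, x, y, u, v: 2.0 * x), (lambda k, x, y, u, v: y)
Lu, Lv = (lambda k, x, y, u, v: 2.0 * u), (lambda *a: 0.0)
phi, dphi = (lambda xM: xM ** 2), (lambda xM: 2.0 * xM)

def states(u):
    # x[k + h] holds x(k); phi(k) = 1 and gamma(k) = 0 play (5.5.1b)-(5.5.1c)
    x = np.ones(M + h + 1)
    uu = np.concatenate((np.zeros(h), u))      # uu[k + h] holds u(k)
    for k in range(M):
        x[k + h + 1] = f(k, x[k + h], x[k], uu[k + h], uu[k])
    return x, uu

def cost(u):
    x, uu = states(u)
    return phi(x[M + h]) + sum(L(k, x[k + h], x[k], uu[k + h], uu[k])
                               for k in range(M))

def gradient(u):
    # costate recursion (5.5.32)-(5.5.33) and gradient formula (5.5.36)
    x, uu = states(u)
    lam = np.zeros(M + 1)
    lam[M] = dphi(x[M + h])
    grad = np.zeros(M)
    for k in range(M - 1, -1, -1):
        a = (k, x[k + h], x[k], uu[k + h], uu[k])
        lam[k] = Lx(*a) + lam[k + 1] * fx(*a)
        grad[k] = Lu(*a) + lam[k + 1] * fu(*a)
        if k + h <= M - 1:   # e(M - k - h) with conventions (5.5.34)-(5.5.35)
            b = (k + h, x[k + 2 * h], x[k + h], uu[k + 2 * h], uu[k + h])
            lam[k] += Ly(*b) + lam[k + h + 1] * fy(*b)
            grad[k] += Lv(*b) + lam[k + h + 1] * fv(*b)
    return grad

u0 = np.linspace(-1.0, 1.0, M)
fd = np.array([(cost(u0 + 1e-6 * e) - cost(u0 - 1e-6 * e)) / 2e-6
               for e in np.eye(M)])
print(np.max(np.abs(gradient(u0) - fd)))   # ~1e-8: the two gradients agree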



5.5.3 A Tactical Logistic Decision Analysis Problem

In this section, we consider a tactical logistic decision analysis problem studied in [12]. It is a problem of decision making for the distribution of resources within a network of support, where the network seeks to mimic how logistic support might be delivered in a military area of operations. The optimal control model of the tactical logistic decision analysis problem formulated by Baker and Shi [12] is as follows:

x(t + 1) = Ax(t) + B0 u(t) + B1 u(t − 1), (5.5.54a)


x(0) = x0 , u(−1) = 0, (5.5.54b)

xmin ≤ x(t) ≤ xmax , (5.5.55a)


umin ≤ u(t) ≤ umax , (5.5.55b)

where

A = [ 0.95  0     0     0     0
      0     0.9   0     0     0
      0     0     0.75  0     0
      0     0     0     0.75  0
      0     0     0     0     0.85 ],

B0 = [ 0  −1  −1   0   0   0   0   0
       0   0   0  −1  −1  −1   0   0
       0   0   0   0   0   0  −1   0
       0   0   0   0   0   0   0  −1
       0   0   0   0   0   0   0   0 ],

B1 = [ 0.95  0     0     0     0     0     0     0
       0     0.87  0     0     0     0     0     0
       0     0     0     0     0.75  0     0     0.7
       0     0     0.8   0     0     0.8   0.7   0
       0     0     0     0.85  0     0     0     0 ],

x0 = [3500, 800, 400, 400, 200]ᵀ .
The cost functional is

G = ½(x(T ))ᵀ Qx(T ) + Σ_{t=0}^{T−1} ½[(x(t))ᵀ Qx(t) + (u(t))ᵀ Ru(t)],    (5.5.56)

where

Q = diag(100, 2, 3, 1.5, 2.5), R = diag(10, 5, 5, 2.5, 3, 4, 2, 2).

The logistic network for the example is as shown in Figure 5.5.1.

[Figure: nodes 0–5 linked by supply routes 1–8; each route has a lower flow bound of 0 and an upper flow bound (2000 for routes 1–3, 2500 for route 4, 3000 for route 5, 3500 for route 6, 4000 for route 7 and 4500 for route 8), and each node carries lower and upper stock bounds ranging from 200 to 3500.]

Fig. 5.5.1: The logistic network

The constraints (5.5.55a) and (5.5.55b) can be rewritten as

gi (u) = xi,min − xi (k) ≤ 0, k = 0, 1, . . . , M − 1, i = 1, 2, . . . , 5,    (5.5.57)
gi (u) = xi−5 (k) − xi−5,max ≤ 0, k = 0, 1, . . . , M − 1, i = 6, 7, . . . , 10.    (5.5.58)
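With the data as reconstructed above, the model (5.5.54) is easily set up and simulated; a minimal Python sketch (the matrix entries simply mirror the reconstruction, so they inherit its assumptions):

import numpy as np

A = np.diag([0.95, 0.9, 0.75, 0.75, 0.85])
B0 = np.zeros((5, 8)); B1 = np.zeros((5, 8))
B0[0, [1, 2]] = -1.0        # routes 2 and 3 leave Node 1
B0[1, [3, 4, 5]] = -1.0     # routes 4, 5 and 6 leave Node 2
B0[2, 6] = B0[3, 7] = -1.0  # route 7 leaves Node 3, route 8 leaves Node 4
B1[0, 0], B1[1, 1] = 0.95, 0.87      # delayed, attenuated arrivals
B1[2, 4], B1[2, 7] = 0.75, 0.7
B1[3, 2], B1[3, 5], B1[3, 6] = 0.8, 0.8, 0.7
B1[4, 3] = 0.85
x0 = np.array([3500.0, 800.0, 400.0, 400.0, 200.0])

def simulate(u):
    # state recursion (5.5.54) with u(-1) = 0; u has shape (T, 8)
    x = np.zeros((u.shape[0] + 1, 5)); x[0] = x0
    u_prev = np.zeros(8)
    for t in range(u.shape[0]):
        x[t + 1] = A @ x[t] + B0 @ u[t] + B1 @ u_prev
        u_prev = u[t]
    return x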

The optimal cost functional value obtained by using Algorithm 5.5.1 is


1.68 × 107 , which is much less than that obtained in [12], which is 3.5 × 107 .
The optimal control and the corresponding optimal state obtained using our
method are depicted in Figures 5.5.2, 5.5.3, 5.5.4, 5.5.5, 5.5.6 and 5.5.7. By
careful examination of these figures, we see that the constraints on the control

[Figure: stock dispatched along supply routes 1–3 versus time.]

Fig. 5.5.2: Stock dispatched supply 1–3


[Figure: stock dispatched along supply routes 4–6 versus time.]

Fig. 5.5.3: Stock dispatched supply 4–6

and the all-time-step constraints are satisfied at each time point. However,
the all-time-step constraints in [12] are not always satisfied at each time point.
From Figures 5.5.2, 5.5.3 and 5.5.4, we see that u1 (k) = 0 for k = 0, 1, . . . , 4,
indicating no stock being dispatched along the supply route 1 to Node 1. This
is because u1 (k) could only contribute extra stock to Node 1 through the
supply route 1 from Node 0, and the initial stock in Node 1 is large, twice as

[Figure: stock dispatched along supply routes 7–8 versus time.]

Fig. 5.5.4: Stock dispatched supply 7–8

[Figure: stock of logistic resources at locations 1–3 versus time.]

Fig. 5.5.5: Stock at location 1–3

large as those in the other nodes. Thus, it is clear that stock should be moved
out of Node 1 to other nodes quickly through the supply routes 2 and 3 so as
to decrease the cost of holding the stock in Node 1. Also from Figure 5.5.2,
5.5.3 and 5.5.4, we see that u2 (k) and u3 (k) are very large at k = 0, meaning
that a large amount of stock is dispatched from Node 1 to the other nodes

[Figure: stock of logistic resources at locations 4–5 versus time.]

Fig. 5.5.6: Stock at location 4–5

[Figure: stock of logistic resources at location 1 versus time.]

Fig. 5.5.7: Stock at location 1

of the network at k = 0. From the structure of the network, it is clear that


there is only one supply route to Node 5 with no supply route coming out of
it. This means that Node 5 is a pure receiver of stock from other nodes. In
view of the limits imposed on the maximum stock in various nodes, we see
from Figures 5.5.5 and 5.5.6 that the amount of stock that is moved along

the supply route 4 to Node 5 is low for k = 0, 1, . . . , 4. The structure of the


network depicted in Figure 5.5.1 clearly reveals that there are three supply routes (i.e., supply routes 3, 6 and 7) to Node 4 with only one supply route (i.e., supply route 8) coming out of it. For Node 3, there are 2 supply routes (i.e., supply routes 2 and 4) in and only 1 (i.e., supply route 7) out. By virtue of these observations, the amounts of stock along the supply routes for which the stocks are moved out should be large. This is confirmed in Figures 5.5.2, 5.5.3 and 5.5.4: u7 (k) and u8 (k), which denote, respectively, the amounts of stock being moved out from Node 3 along the supply route 7 and from Node 4 along the supply route 8, are large for k = 1, . . . , 4. Their values are quite
low at k = 0. This is due to the appearance of time-delay along the supply
routes, which indicates that nodes cannot receive stock instantaneously. The
stocks arrive with delay. For Node 2 , there are 3 supply routes (i.e., supply
routes 4, 5 and 6) out but only 1 supply route (i.e., supply route 2) in. As
shown in Figures 5.5.2, 5.5.3 and 5.5.4, we see that the amounts of the stock
being moved out along the supply routes 4, 5 and 6 are relatively small.

5.6 Exercises

5.6.1 Give a proof of Theorem 5.2.3.


5.6.2 Give a proof of Theorem 5.2.4.
5.6.3 Give a proof of Lemma 5.2.1.
5.6.4 Give a proof of Lemma 5.2.2.
5.6.5 Give a proof of Theorem 5.2.5.
5.6.6 Give a proof of Theorem 5.5.3.
5.6.7 A woman owns an initial sum of money a, and knows that she will
live exactly N years. At the beginning of year i she owns the sum $x(i). She
selects a fraction v(i), 0 ≤ v(i) ≤ 1, to invest, and consumes the remainder,
c(i) = (1 − v(i))x(i). She derives a benefit B(c(i)) from this, where B is a
given monotonically increasing smooth function. The invested sum accrues
interest at the rate r, so that x(i + 1) = (1 + r)x(i)v(i). The woman wishes to
maximize the total benefit, i.e., the sum of all the B(c(i)) over the remaining
N years of her life, and has no particular desire to have any money left to
bequeath after death. Formulate the problem so that it can be attacked by dynamic programming, and write down the dynamic programming equation
for the process. Do not solve the dynamic programming equation.
5.6.8 The problem of investing money over a period of N years can be rep-
resented as a multistage decision process where each year $x is subdivided
into $u, where x ≥ u ≥ 0, which are invested, and $(x − u) which are spent.

The process is assumed to have the dynamic equation x(i + 1) = au(i), where the integer i counts the years and the growth rate a (a > 0) is a constant. The satisfaction from money spent in any year is described by a function H(i) = H(x(i) − u(i)) and the objective is to maximize the total satisfaction

I = Σ_{i=1}^{N} H(i)

over N years. Define an optimal return function and write the dynamic programming functional recurrence equation for the process. Take H = (x − u)^{1/2} and find the optimal spending policy
(i) over 2 years
(ii) over N years.
5.6.9 Same as Exercise 5.6.8, but with H taken as

H = log(x − u).

5.6.10 For the discrete time process

x(k + 1) = 2x(k) + u(k),

it is desired to minimize the cost functional

I = 50(x(2))² + Σ_{k=0}^{1} (u(k))²

for a two-step process starting from x(0) = 1 with the finishing state x(2)
free.
(i) Use dynamic programming to find the optimal controls u(i), which
are to be real numbers satisfying |u(i)| ≤ 2.
(ii) Solve the problem in part (i) when the initial state is given by x(0) =
10.
5.6.11
(a) A farm grows wheat as a yearly crop and has available unlimited acreage and free labour. Of the total grain x(i) tonnes available at the start of each year i, u(i) tonnes (0 ≤ u(i) ≤ x(i)) is planted and the remainder x(i) − u(i) is sold during the year. The planted wheat produces a new crop, x(i + 1) = au(i) (a = constant > 1). It is assumed that A tonnes of grain (i.e., x(1) = A) is provided to start the venture, which is to run for 4 years. The desire is to maximize

I = Σ_{i=1}^{4} {x(i) − u(i)},

the total amount of grain sold over the period. Use dynamic programming to find the optimal planting and selling policy.

(b) Use dynamic programming to find the optimal planting and selling policy for the problem in part (a) with the following modifications:
(i) The project is to finish with A tonnes left at the end of the fourth year.
(ii) As in (b)(i), but the amount of land available for cultivation is limited so that no more than aA tonnes can be sown in any year, i.e., 0 ≤ u(i) ≤ aA.

5.6.12 Consider the unconstrained nonlinear discrete time optimal control


problem with dynamics governed by

x(k + 1) = f (k, x(k), u(k)), k = 0, 1, . . . , M − 1 (5.6.1a)


x(0) = x0 , (5.6.1b)

where x(k) ∈ Rn and u(k) ∈ Rr . The sequence u(k), k = 0, 1, . . . , M − 1, is to be determined such that the cost functional

g0 (u) = Φ0 (x(M )) + Σ_{k=0}^{M−1} L0 (k, x(k), u(k))    (5.6.2)

is minimized. Let u∗ be an optimal control, and let x∗ be the corresponding


solution of the system (5.6.1). Use the technique of Section 4.1 to show that
∂H(k, x∗ (k), u∗ (k), λ∗ (k + 1))/∂u(k) = 0, k = 0, 1, . . . , M − 1,    (5.6.3)

where

H(k, x∗ (k), u∗ (k), λ∗ (k + 1)) = L0 (k, x∗ (k), u∗ (k)) + (λ∗ (k + 1))ᵀ f (k, x∗ (k), u∗ (k)),    (5.6.4)
and

(λ∗ (k))ᵀ = ∂H(k, x∗ (k), u∗ (k), λ∗ (k + 1))/∂x(k),    (5.6.5a)
(λ∗ (M ))ᵀ = ∂Φ0 (x∗ (M ))/∂x(M ).    (5.6.5b)

5.6.13 Consider the problem of Exercise 5.6.12. Let u(k) ∈ U for all k =
0, 1, . . . , M − 1, where U is a compact and convex subset of Rr . Show that
the first order necessary condition for optimality is the same as that given in
Exercise 5.6.12, except with (5.6.3) replaced by


Σ_{k=0}^{M−1} (∂H(k, x∗ (k), u∗ (k), λ∗ (k + 1))/∂u(k)) (u(k) − u∗ (k)) ≥ 0

for all u(k) ∈ U, k = 0, 1, . . . , M − 1.



5.6.14 Consider the discrete time process governed by the following system
of difference equations:

x(k + 1) = A(k)x(k) + B(k)u(k), k = 0, 1, . . . , M − 1,

where x(k) ∈ Rn , u(k) ∈ Rr , A(k) ∈ Rn×n , and B(k) ∈ Rn×r . The control
sequence u(k), k = 0, 1, . . . , M −1, is to be determined such that the quadratic
cost functional

g0 (u) = ½(x(M ))ᵀ Q(M )x(M ) + Σ_{k=0}^{M−1} [½(x(k))ᵀ Q(k)x(k) + ½(u(k))ᵀ R(k)u(k)]

is minimized, where Q(k) = (Q(k))ᵀ ∈ Rn×n is positive semi-definite, and R(k) = (R(k))ᵀ ∈ Rr×r is positive definite.

(a) Show that the optimal control is

u(k) = −(R(k))⁻¹ (B(k))ᵀ λ(k + 1),

where λ(k) is the costate vector governed by the following system of difference equations:

λ(k) = (A(k))ᵀ λ(k + 1) + Q(k)x(k)
λ(M ) = Q(M )x(M ).

(b) By assuming that the costate and state vectors are related through the
relationship:
λ(k) = S(k)x(k),
where S(k) = (S(k))ᵀ . Show that the symmetric matrix S(k) is governed by the discrete matrix Riccati equation:

S(k) = Q(k) + (A(k))ᵀ [(S(k + 1))⁻¹ + B(k)(R(k))⁻¹ (B(k))ᵀ ]⁻¹ A(k)
S(M ) = Q(M ).

(c) Show that the system of optimal state equations is given by

x(k + 1) = [I + B(k)(R(k))⁻¹ (B(k))ᵀ S(k + 1)]⁻¹ A(k)x(k).

5.6.15 A combined discrete time optimal control and optimal parameter se-
lection problem is defined in the canonical form as follows:

minimize g0 (u, ζ),

where


g0 (u, ζ) = Φ0 (x(M | u, ζ), ζ) + Σ_{k=0}^{M−1} L0 (k, x(k | u, ζ), u(k), ζ),

subject to

gi (u, ζ) = 0, i = 1, . . . , Ne ,
gi (u, ζ) ≥ 0, i = Ne + 1, . . . , N,

where

gi (u, ζ) = Φi (x(M | u, ζ), ζ) + Σ_{k=0}^{M−1} Li (k, x(k | u, ζ), u(k), ζ),

and ζ ∈ Rs is the system parameter vector to be optimized together with u.


Derive the corresponding gradient formulae for the cost functional and the
constraint functionals with respect to ζ and u.

5.6.16 Consider the first order system

x(k + 1) = (1/2)x(k) + u(k)
x(0) = 1,

where k = 0, 1, . . . , M − 1. The optimal control problem is to find u(k), k = 0, 1, . . . , M − 1, such that

g0 (u) = Σ_{k=0}^{M−1} [(x(k))² + (u(k))²]

is minimized. Solve the problem for M = 4, M = 6 and M = 10. Comment on your results.

5.6.17 Problem (Qε ) constructed in Section 5.4.1 can be solved as a non-


linear optimization problem. Give details of the computational procedure.
Chapter 6
Elements of Optimal Control Theory

6.1 Introduction

There are already many excellent books devoted solely to the detailed exposi-
tion of optimal control theory. We refer the interested reader to [3, 4, 8, 11, 18–
21, 29, 33, 40, 59, 64, 74, 83, 90, 121, 130, 198, 201, 206, 226, 276], just to name
a few. Texts dealing with the optimal control of partial differential equations
include [5, 37, 149, 250]. The aim of this chapter is to give a brief account
of some fundamental optimal control theory results for systems described by
ordinary differential equations.
In the next section, we present the basic formulation of an unconstrained
optimal control problem and derive the first order necessary condition known
as the Euler-Lagrange equations. In Section 6.3, we consider a class of linear
quadratic optimal control problems. This class is important because the prob-
lem can be solved analytically and the optimal control so obtained is in closed
loop form. In Section 6.4, the well-known Pontryagin minimum principle is
briefly discussed. The Maximum Principle is then used to introduce singular
control and time optimal control in Sections 6.5 and 6.6, respectively. In Sec-
tion 6.7, a version of the optimality conditions for optimal control problems
subject to constraints is presented and an illustrative example is solved using
these conditions. To conclude this chapter, Bellman’s dynamic programming
principle is included in Section 6.8. For results on the existence of optimal
controls, we refer the interested reader to [40].
The main references of this chapter are the lecture notes prepared and
used by the authors, [88] and Chapter 4 of [253].


6.2 First Order Necessary Condition: Euler-Lagrange


Equations

We shall begin with the simplest optimal control formulation from which we
derive the first order necessary conditions for optimality, also known as the
Euler-Lagrange equations. More complex classes of optimal control problems
will be discussed in later sections. Consider a dynamical system described by
the following system of differential equations:

dx(t)
= f (t, x(t), u(t)), t ∈ [0, T ], (6.2.1a)
dt
with initial condition:
x(0) = x0 , (6.2.1b)

where

x(t) = [x1 (t), x2 (t), . . . , xn (t)]ᵀ , u(t) = [u1 (t), u2 (t), . . . , um (t)]ᵀ

are, respectively, referred to as the state and control. f = [f1 , . . . , fn ]ᵀ is


assumed to be continuously differentiable with respect to all its arguments
and x0 is a given vector in Rn . In many applications, the independent variable
t denotes time, but there are some, such as problems in shape optimization,
where t has a different meaning. We have assumed that the given process
starts at t = 0 and ends at the fixed terminal time T > 0. A process that
starts from t0 = 0 may be readily transformed to satisfy this assumption by
a suitable shifting of the time scale.
For now, any piecewise continuous function u from [0, T ] into Rm may
be taken as an admissible control. Let U be the class of all such admissible
controls.
We now consider an optimal control problem where a control u ∈ U is to
be chosen such that the cost functional
g0 (u) = Φ0 (x(T )) + ∫_0^T L0 (t, x(t), u(t)) dt    (6.2.2)

is minimized, where Φ0 and L0 are continuously differentiable with respect


to their respective arguments. Note that the cost functional can be regarded
as depending on u only, as x is implicitly determined by u from (6.2.1).
The dynamical constraint (6.2.1) can be appended to the cost functional by
introducing the appropriate Lagrange multiplier λ ∈ Rn as follows.
ḡ0 (u) = Φ0 (x(T )) + ∫_0^T { L0 (t, x(t), u(t)) + (λ(t))ᵀ [f (t, x(t), u(t)) − dx(t)/dt] } dt.    (6.2.3)
For convenience, we define the Hamiltonian function as

H(t, x, u, λ) = L0 (t, x, u) + λᵀ f (t, x, u).    (6.2.4)

Note that the appended cost functional ḡ0 is identical to the original g0 if the
dynamical constraint is satisfied. The time-dependent Lagrange multiplier
is referred to as the costate vector. It is also known as the adjoint vector.
Substituting (6.2.4) into (6.2.3) and integrating the last term by parts, we
have

ḡ0 (u) =Φ0 (x(T )) − (λ(T )) x(T ) + (λ(0)) x(0)


 T
d(λ(t))
+ H(t, x(t), u(t), λ(t)) + x(t) dt. (6.2.5)
0 dt

For a small variation δu in u, the corresponding first order variations in x


and ḡ0 are δx and δḡ0 , respectively, where δḡ0 is obtained by the chain rule:

δḡ0 = lim_{ε→0} [ḡ0 (u + εδu) − ḡ0 (u)]/ε
= [∂Φ0 (x(T ))/∂x − (λ(T ))ᵀ ] δx(T ) + (λ(0))ᵀ δx(0)
+ ∫_0^T { [∂H(t, x(t), u(t), λ(t))/∂x + (dλ(t)/dt)ᵀ ] δx(t)
+ (∂H(t, x(t), u(t), λ(t))/∂u) δu(t) } dt.    (6.2.6)

Since λ(t) is arbitrary so far, we may choose

dλ(t)/dt = −(∂H(t, x(t), u(t), λ(t))/∂x)ᵀ    (6.2.7a)

with boundary condition

λ(T ) = (∂Φ0 (x(T ))/∂x)ᵀ .    (6.2.7b)

As the initial condition x(0) is fixed, δx(0) vanishes and (6.2.6) reduces to

δḡ0 = ∫_0^T (∂H(t, x(t), u(t), λ(t))/∂u) δu(t) dt.    (6.2.8)

For a local minimum, δḡ0 is required to vanish for any arbitrary δu. Therefore,
it is necessary that

∂H(t, x(t), u(t), λ(t))/∂u = 0    (6.2.9)

for all t ∈ [0, T ], except possibly on a finite set (i.e., on a set consisting of a
finite number of points). Note that this condition holds only if u is uncon-
strained. In the case of control bounds, the Pontryagin Maximum Principle
to be discussed later is required.
Equations (6.2.1), (6.2.7) and (6.2.9) are the well-known Euler-Lagrange
equations. Note that (6.2.7) is a set of ordinary differential equations in λ with
boundary conditions specified at the terminal time T . We shall summarize
these results in a theorem.
Theorem 6.2.1 Let u∗ (t) be a local optimal control for the cost functional (6.2.2), and let x∗ (t) and λ∗ (t) be, respectively, the corresponding optimal state and costate. Then, it is necessary that

dx∗ (t)/dt = (∂H(t, x∗ (t), u∗ (t), λ∗ (t))/∂λ)ᵀ = f (t, x∗ (t), u∗ (t)),    (6.2.10a)
x∗ (0) = x0 ,    (6.2.10b)
dλ∗ (t)/dt = −(∂H(t, x∗ (t), u∗ (t), λ∗ (t))/∂x)ᵀ ,    (6.2.10c)
λ∗ (T ) = (∂Φ0 (x∗ (T ))/∂x)ᵀ ,    (6.2.10d)

and, for all t ∈ [0, T ], except possibly on a finite subset of [0, T ],

∂H(t, x∗ (t), u∗ (t), λ∗ (t))/∂u = 0,    (6.2.10e)

where a finite set denotes a set that contains only a finite number of points.

Note that (6.2.10a)–(6.2.10d) constitute 2n differential equations with n


boundary conditions for x∗ specified at t = 0 and n boundary conditions
for λ∗ specified at t = T . This is referred to as a two-point boundary-value
problem (TPBVP). In principle, the dependence on u∗ can be removed by
solving for u∗ as a function of x∗ and λ∗ from the m algebraic equations
in (6.2.10e) via the implicit function theorem, provided that the Hessian Huu = ∂²H/∂u² is non-singular at the optimal point. In practice, how-
ever, an analytic solution of (6.2.10e) is often not possible. Even if it could
be determined, the resulting TPBVP is likely to be difficult to solve.
Example 6.2.1 Consider the following simple problem.
 1
" #
min g0 (u) = (x(t))2 + (u(t))2 dt (6.2.11)
0

subject to

dx(t)/dt = u(t)    (6.2.12a)
x(0) = 1.    (6.2.12b)

The corresponding Hamiltonian function is

H = x2 + u2 + λu. (6.2.13)

By Theorem 6.2.1, we have

dλ(t)/dt = −2x(t)    (6.2.14a)
λ(1) = 0    (6.2.14b)

∂H/∂u = 2u(t) + λ(t) = 0.    (6.2.15)
Substituting (6.2.15) into (6.2.12a) gives

dx(t)/dt = −λ(t)/2.    (6.2.16)
Differentiating (6.2.16) with respect to t and then using (6.2.14a), we have

d²x(t)/dt² − x(t) = 0.    (6.2.17)
Clearly, the solution of (6.2.17) is given by

x(t) = Aet + Be−t , (6.2.18)

where A and B are constants to be determined by the boundary conditions.


Some elementary algebra then leads to the exact solution of the resulting
TPBVP:
u∗ (t) = −[2e/(1 + e²)] sinh(1 − t)    (6.2.19)
x∗ (t) = [2e/(1 + e²)] cosh(1 − t),    (6.2.20)

and

g0∗ = (e⁴ − 1)/(1 + e²)² .    (6.2.21)
While the above example could be solved analytically, it is far less complex
than those one can expect to encounter in practice. In fact, most practical
optimal control problems, especially those with hard constraints, cannot be
solved analytically.
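The closed-form solution (6.2.19)–(6.2.21) can be verified numerically by a simple quadrature; a minimal Python sketch (note that g0∗ = (e⁴ − 1)/(1 + e²)² = tanh 1):

import numpy as np

e = np.e
t = np.linspace(0.0, 1.0, 100001)
x = 2 * e / (1 + e ** 2) * np.cosh(1 - t)     # (6.2.20)
u = -2 * e / (1 + e ** 2) * np.sinh(1 - t)    # (6.2.19)
assert abs(x[0] - 1.0) < 1e-12                # initial condition (6.2.12b)
assert abs(u[-1]) < 1e-12                     # u = -lambda/2 and lambda(1) = 0
g0 = np.trapz(x ** 2 + u ** 2, t)             # cost (6.2.11)
print(g0, np.tanh(1.0))                       # both ~0.761594, i.e. (6.2.21)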

6.3 The Linear Quadratic Theory

An important family of unconstrained optimal control problems is known as


linear quadratic problems. There are basically two types: regulator problems
and tracking problems. Both are intimately related. We shall first consider
the general regulator problem, where the system dynamics are governed by
linear differential equations, while the performance index is quadratic in the
state and control:
min g0 (u) = ½(x(T ))ᵀ S f x(T ) + ½ ∫_0^T [(x(t))ᵀ Q(t)x(t) + (u(t))ᵀ R(t)u(t)] dt    (6.3.1)

subject to

dx(t)
= A(t)x(t) + B(t)u(t) (6.3.2a)
dt
x(0) = x0 , (6.3.2b)

where x ∈ Rn , u ∈ Rr , S f , Q, A ∈ Rn×n , R ∈ Rr×r and B ∈ Rn×r . Here,


T is fixed, S f and Q(t) are symmetric positive semi-definite matrices, while
R(t) is symmetric positive definite for all t ∈ [0, T ]. The regulator problem
seeks to regulate the state to be as close as possible to zero while minimizing
the control effort.
The corresponding Hamiltonian function is
H = ½[xᵀ Qx + uᵀ Ru] + λᵀ (Ax + Bu).    (6.3.3)
Applying the first order necessary conditions (6.2.10), we have

dx(t)
= A(t)x(t) + B(t)u(t) (6.3.4a)
dt
x(0) = x0 (6.3.4b)
dλ(t)
= −Q(t)x(t) − (A(t)) λ(t) (6.3.4c)
dt
λ(T ) = S f x(T ) (6.3.4d)

R(t)u(t) + (B(t)) λ(t) = 0, (6.3.4e)

where we have suppressed the superscript ∗ on the optimal control, state and
costate for clarity. Since R(t) is positive definite, the last equation (6.3.4e)
immediately yields

u(t) = −(R(t))⁻¹ (B(t))ᵀ λ(t).    (6.3.5)

(6.3.4a) and (6.3.4c) with u(t) given by (6.3.5) can be written in the form of the following linear homogeneous system of differential equations in x and λ:

d/dt [x(t); λ(t)] = [A(t), −B(t)(R(t))⁻¹ (B(t))ᵀ ; −Q(t), −(A(t))ᵀ ] [x(t); λ(t)].    (6.3.6)

Since the boundary conditions for x and λ are prescribed at two different
end points, (6.3.6) presents a linear homogeneous TPBVP. Unlike the general
nonlinear TPBVP problem, this one can be solved analytically in two ways:
the transition matrix method or the backward sweep method [33].
The transition matrix method assumes the existence of two transition
matrices X(t) and Λ(t) ∈ Rn×n such that

x(t) = X(t)x(T ) (6.3.7)

and
λ(t) = Λ(t)x(T ). (6.3.8)
It is easy to see that the transition matrices must also satisfy (6.3.6), i.e.,

d/dt [X(t); Λ(t)] = [A(t), −B(t)(R(t))⁻¹ (B(t))ᵀ ; −Q(t), −(A(t))ᵀ ] [X(t); Λ(t)]    (6.3.9)

along with the boundary conditions

X(T ) = I (6.3.10a)

and
Λ(T ) = S f . (6.3.10b)
(6.3.9) with the boundary conditions (6.3.10) gives rise to a final value problem, as the system can be integrated backwards in time, starting at the terminal time T . If X(t) is non-singular for all t ∈ [0, T ], then, by (6.3.5), (6.3.7) and (6.3.8), it follows that

u(t) = −(R(t))⁻¹ (B(t))ᵀ Λ(t)(X(t))⁻¹ x(t).    (6.3.11)

(6.3.11) relates the optimal control at time t to the state at time t and hence
it is called a closed loop (or feedback ) control law. It can be written as

u(t) = −G(t)x(t) (6.3.12a)

with the feedback gain matrix given by

G(t) = (R(t))⁻¹ (B(t))ᵀ Λ(t)(X(t))⁻¹ .    (6.3.12b)



The feedback gain matrix can be evaluated once the system (6.3.9) is solved
with the boundary conditions (6.3.10).
Note that the control law may also be expressed in terms of the initial
state, i.e.,
u(t) = −Ĝ(t)x(0), (6.3.13)
where
Ĝ(t) = (R(t))⁻¹ (B(t))ᵀ Λ(t)(X(0))⁻¹ .    (6.3.14)
The proof is left as an exercise.
The transition matrix method is conceptually easy, though difficulty often
arises during actual computation of the inverse of X(t). The reader is referred
to [33] for further elaboration of the numerical difficulties involved.
The backward sweep method is more popular by virtue of its computa-
tional efficiency. It assumes a linear relationship between the state and the
costate of the form:
λ(t) = S(t)x(t). (6.3.15)
Direct differentiation of (6.3.15) yields

dλ(t)/dt = −Q(t)x(t) − (A(t))ᵀ λ(t)
= (dS(t)/dt)x(t) + S(t)(dx(t)/dt)
= (dS(t)/dt)x(t) + S(t)A(t)x(t) + S(t)B(t)u(t).    (6.3.16)
dt
Substituting (6.3.5) and (6.3.15) into (6.3.16) yields

[dS(t)/dt + S(t)A(t) + (A(t))ᵀ S(t) + Q(t)    (6.3.17)
− S(t)B(t)(R(t))⁻¹ (B(t))ᵀ S(t)] x(t) = 0.    (6.3.18)

Since (6.3.18) must hold for arbitrary x, it is necessary that, for all t ∈ [0, T ],

dS(t)/dt = −S(t)A(t) − (A(t))ᵀ S(t) − Q(t) + S(t)B(t)(R(t))⁻¹ (B(t))ᵀ S(t).    (6.3.19a)
The boundary condition is obtained from (6.3.15) and (6.3.4d):

S(T ) = S f . (6.3.19b)

(6.3.19) is a nonlinear nonhomogeneous differential equation known as the matrix Riccati equation. Since the boundary condition is specified only at the end point, it can be solved directly by integrating backwards from T to 0. The reader may like to check that S(t) is also symmetric, and hence there are really only n(n + 1)/2 differential equations to be solved. Once S(t) is

determined, a closed loop control law may again be established from (6.3.15)
and (6.3.5):
u(t) = −G(t)x(t), (6.3.20a)
where
G(t) = (R(t))⁻¹ (B(t))ᵀ S(t).    (6.3.20b)
Once S(t) is determined, the initial condition for the costate is given by
(6.3.15):
λ(0) = S(0)x0 . (6.3.21)
Hence, the optimal state and costate may be obtained by direct integra-
tion of (6.3.6) forward in time starting at t = 0. By comparing (6.3.20)
with (6.3.11), it is obvious that

S(t) = Λ(t)(X(t))−1 . (6.3.22)

This implies that the matrix Riccati equation may be solved alternatively by
the procedure given in the transition matrix method, i.e., by solving the 2n2
linear differential equations in (6.3.9), computing the inverse of X and then
computing S(t) = Λ(t)(X(t))−1 . In principle, it may be easier to solve the
linear differential equation. However, there are efficient numerical algorithms
[44] for solving the nonlinear Riccati equation directly. The need to solve
about four times as many differential equations and to invert an n × n matrix
at each time point t for the transition matrix method is often not warranted.
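To illustrate the backward sweep numerically, the sketch below integrates the matrix Riccati equation (6.3.19) backward in time with a second order Runge–Kutta scheme and recovers the feedback gain (6.3.20); the system data are arbitrary placeholders, not taken from the text.

import numpy as np

# arbitrary placeholder data for a two-state, one-control regulator
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q, R, Sf = np.eye(2), np.array([[1.0]]), np.eye(2)
T, N = 5.0, 5000
dt = T / N

def rhs(S):
    # right-hand side of the matrix Riccati equation (6.3.19a)
    return -S @ A - A.T @ S - Q + S @ B @ np.linalg.solve(R, B.T) @ S

S = Sf.copy()                       # S(T) = S^f, cf. (6.3.19b)
gains = []
for _ in range(N):                  # integrate backward from T to 0 (RK2)
    k1 = rhs(S)
    S = S - dt * rhs(S - 0.5 * dt * k1)
    gains.append(np.linalg.solve(R, B.T @ S))   # G(t) = R^{-1} B' S(t)
gains.reverse()                     # gains[0] now corresponds to t = 0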
As an extension to the linear quadratic regulator problem, the tracking
problem seeks to track a desired reference trajectory r(t) with a linear com-
bination of the states over the interval [0, T ]. Again, subject to the previous
linear system (6.3.2), we have the following cost functional:

min_u g(u) = ½[y(T ) − r(T )]ᵀ S f [y(T ) − r(T )]
+ ½ ∫_0^T { [y(t) − r(t)]ᵀ Q(t)[y(t) − r(t)] + (u(t))ᵀ R(t)u(t) } dt,    (6.3.23)

where
y(t) = C(t)x(t)    (6.3.24)
and
S f ≥ 0, Q(t) ≥ 0, R(t) > 0    (6.3.25)
(i.e., S f is symmetric and positive semi-definite, Q(t) is symmetric and pos-
itive semi-definite for each t ∈ [0, T ], and R(t) is symmetric and positive
definite for each t ∈ [0, T ].)
If we assume a linear relationship for the state and costate, i.e.,

λ(t) = S(t)x(t) + μ(t) (6.3.26)



and following an argument similar to the previous analysis, it can be shown that the optimal closed loop control law is given by

u(t) = −G(t)x(t) − (R(t))⁻¹ (B(t))ᵀ μ(t)    (6.3.27a)
G(t) = (R(t))⁻¹ (B(t))ᵀ S(t)    (6.3.27b)
dS(t)/dt = −(A(t))ᵀ S(t) − S(t)A(t) + S(t)B(t)(R(t))⁻¹ (B(t))ᵀ S(t)
− (C(t))ᵀ Q(t)C(t)    (6.3.27c)
S(T ) = (C(T ))ᵀ S f C(T )    (6.3.27d)
dμ(t)/dt = [−(A(t))ᵀ + (G(t))ᵀ (B(t))ᵀ ]μ(t) + (C(t))ᵀ Q(t)r(t)    (6.3.27e)
μ(T ) = −(C(T ))ᵀ S f r(T ).    (6.3.27f)

6.4 Pontryagin Maximum Principle

There are already many books written solely on the Pontryagin minimum
principle and its applications. In this section, we merely point out some fun-
damental results and briefly investigate some applications.
The Euler-Lagrange equations for the unconstrained optimal control prob-
lem of Section 6.2 require that the Hamiltonian function must be stationary
∂H
with respect to the control, i.e., = 0 at optimality.
∂u
Consider the case when the control is constrained to lie in a subset U of
Rr , where U , known as the control restraint set, is generally a compact subset
of Rr . In this situation, the optimality conditions obtained in Section 6.2 do
not make sense if the optimal control happens to lie on the boundary of U for
any positive subinterval of the planning horizon [0, T ]. To cater for this and
more general situations, some fundamental results due to Pontryagin and his
co-workers [206] will be stated without proof in the next two theorems.
Let U be a compact subset of Rr . Any piecewise continuous function from
[0, T ] into U is said to be an admissible control. Let U be the class of all such
admissible controls.
Now we consider the problem where the cost functional (6.2.2) is to be
minimized over U subject to the dynamical system (6.2.1). We refer to this
as Problem (P 1).
Theorem 6.4.1 Consider Problem (P 1). If u∗ ∈ U is an optimal control,
and x∗ (t) and λ∗ (t) are the corresponding optimal state and costate, then it
is necessary that

dx∗ (t)/dt = (∂H(t, x∗ (t), u∗ (t), λ∗ (t))/∂λ)ᵀ = f (t, x∗ (t), u∗ (t)),    (6.4.1a)
x∗ (0) = x0 ,    (6.4.1b)
dλ∗ (t)/dt = −(∂H(t, x∗ (t), u∗ (t), λ∗ (t))/∂x)ᵀ ,    (6.4.1c)
λ∗ (T ) = (∂Φ0 (x∗ (T ))/∂x)ᵀ    (6.4.1d)

and

min_{v∈U} H(t, x∗ (t), v, λ∗ (t)) = H(t, x∗ (t), u∗ (t), λ∗ (t))    (6.4.1e)

for all t ∈ [0, T ], except possibly on a finite subset of [0, T ].

Remark 6.4.1 Note that the condition (6.4.1e) in the above theorem may
also be written as

H(t, x∗ (t), u∗ (t), λ∗ (t)) ≤ H(t, x∗ (t), v, λ∗ (t)) (6.4.2)

for all v ∈ U , and for all t ∈ [0, T ], except possibly on a finite subset of [0, T ].
Note furthermore that the necessary condition (6.4.1e) (and hence (6.4.2))
reduces to the stationary condition (6.2.10e) if the Hamiltonian function H
is continuously differentiable and if U = Rr .

To motivate the second theorem, we add an additional terminal constraint


to the dynamical system (6.2.1) as follows:

x(T ) = xf , (6.4.3)

where xf is a given vector in Rn . Our second problem may now be stated as:
Subject to the dynamical system (6.2.1) together with the terminal condi-
tion (6.4.3), find a control u ∈ U such that the cost functional (6.2.2) is
minimized over U .
For convenience, let this second optimal control problem be referred to as
Problem (P 2).
Theorem 6.4.2 Consider Problem (P 2). If u∗ ∈ U is an optimal control, and x∗ (t) and λ∗ (t) are the corresponding optimal state and costate, then it is necessary that

dx∗ (t)/dt = (∂H(t, x∗ (t), u∗ (t), λ∗ (t))/∂λ)ᵀ = f (t, x∗ (t), u∗ (t)),    (6.4.4a)
x∗ (0) = x0 ,    (6.4.4b)
x∗ (T ) = xf ,    (6.4.4c)
dλ∗ (t)/dt = −(∂H(t, x∗ (t), u∗ (t), λ∗ (t))/∂x)ᵀ    (6.4.4d)

and

min_{v∈U} H(t, x∗ (t), v, λ∗ (t)) = H(t, x∗ (t), u∗ (t), λ∗ (t))    (6.4.4e)

for all t ∈ [0, T ], except possibly on a finite subset of [0, T ].

Remark 6.4.2 As noted in Remark 6.4.1, the condition (6.4.4e) in Theo-


rem 6.4.2 is equivalent to

H(t, x∗ (t), u∗ (t), λ∗ (t)) ≤ H(t, x∗ (t), v, λ∗ (t)) (6.4.5)

for all v ∈ U , and for all t ∈ [0, T ], except possibly on a finite subset of [0, T ].

To illustrate the applicability of the Pontryagin Maximum Principle, we


consider some examples.
Example 6.4.1 The original problem that motivates this example is due to
Thompson [262]. Suppose the quality state of a machine can be measured by
x(t). The deterioration of the state is governed by the first order differential
equation:

dx(t)/dt = −bx(t) + u(t), t ∈ [0, T ]    (6.4.6a)
x(0) = x0 ,    (6.4.6b)

where b is the natural rate of deterioration and u(t) is the non-dimensional


maintenance effort that serves to retard the deterioration and is subjected to
the maintenance budget constraint

0 ≤ u(t) ≤ ū, for all t ∈ [0, T ]. (6.4.7)

The optimal maintenance policy thus seeks to maximize the net discounted
payoff (i.e., productivity benefit minus maintenance cost) plus the salvage
value, assuming that the productivity is proportional to the machine quality,
i.e.,
max g0 (u) = e−rT S x(T ) + ∫_0^T e−rt [px(t) − u(t)] dt,    (6.4.8)

where r, S, p, and T are, respectively, the interest rate, salvage value per unit
terminal quality, the productivity per unit quality and the sale date of the
machine.
Before applying the Pontryagin Maximum Principle presented in Theo-
rem 6.4.1 to this problem, we first note that

max {g0 (u)} = −min {−g0 (u)} .

Thus, it is clear that the objective functional (6.4.8) is equivalent to


min −g0 (u) = −e−rT S x(T ) − ∫_0^T e−rt [px(t) − u(t)] dt.    (6.4.9)

We now write down the corresponding Hamiltonian function for the optimal
control problem with (6.4.9) as the objective functional:

H(t, x, u, λ) = e−rt (u − px) + λ(−bx + u). (6.4.10)

The corresponding costate equation is

dλ∗ (t)/dt = −∂H/∂x = bλ∗ (t) + pe−rt ,    (6.4.11a)
with the boundary condition

λ∗ (T ) = ∂[−e−rT S x(T )]/∂x(T ) = −Se−rT .    (6.4.11b)

The Pontryagin Maximum Principle asserts that

H(t, x∗ (t), u∗ (t), λ∗ (t)) = min_{0≤v≤ū} { v(λ∗ (t) + e−rt ) − px∗ (t)e−rt − bλ∗ (t)x∗ (t) }.    (6.4.12)

The solution of the costate equation (6.4.11) is



λ∗ (t) = [−S + p/(r + b)] e−(b+r)T +bt − [p/(r + b)] e−rt .    (6.4.13)

Assume that p/(r + b) > 1 > S. Then, λ∗ (t) grows from a negative value in a
monotonically increasing manner. In (6.4.12), the minimization with respect
to v involves only the first term in the curly brackets of (6.4.12). If λ∗ (t)+e−rt
is negative, v should be as large as possible, and if λ∗ (t) + e−rt is positive, v
should be as small as possible. The resulting optimal maintenance policy is
therefore of the bang-bang type, i.e.,

u∗ (t) = { ū, 0 ≤ t ≤ t∗ ,
         { 0, t∗ < t ≤ T ,    (6.4.14)

where t∗ is the time when λ∗ (t) + e−rt switches from being negative to being
positive. Since both λ∗ and e−rt are monotonic, there can only be one such
switching point. By solving for the zero of λ∗ (t) + e−rt , we obtain

t∗ = T + [1/(b + r)] ln[(p/(r + b) − 1)/(p/(r + b) − S)].    (6.4.15)

Example 6.4.2 Recall the student problem discussed as Example 1.2.1 in


Chapter 1. In the notation of this chapter, we can state the problem in the
following standard form:
Minimize g(u) = ∫_0^T u(t) dt

subject to

dx(t)/dt = bu(t) − cx(t),
x(0) = k0 ,
x(T ) = kT ,

and 0 ≤ u(t) ≤ w̄ for all 0 ≤ t < T . Here, u(t) denotes the rate of work done
at time t, x(t) is the knowledge level at time t and T is the total time in
weeks. Furthermore, c > 0, b > 0, w̄, k0 , and kT are constants as described in
Example 1.2.1. We assume that k0 < kT (i.e., the student’s initial knowledge
level is insufficient to pass the examination) and also that w̄ is sufficiently
large so that the final knowledge level can be reached (otherwise the problem
would be infeasible). The Hamiltonian is given by

H = u + λ(bu − cx) = u(1 + λb) − cλx.

The optimal costate must satisfy

dλ∗ (t)/dt = −∂H/∂x = cλ∗ (t).
This yields λ∗ (t) = Kect for some constant K, which is a strictly monotone
function (note that K = 0 would lead to u∗ (t) = 0 for all t, which would lead
to a loss of knowledge and the student unable to reach the final knowledge
level). Since Pontryagin’s Maximum Principle requires the minimization of
H with respect to u, we must have

u*(t) = { w̄, if 1 + λ*(t) b < 0;  0, if 1 + λ*(t) b > 0 }.

Now, K > 0 would result in λ∗ (t) > 0 for all t, which, in turn, leads to
1 + λ∗ (t)b > 0 for all t. As this forces u∗ (t) = 0 for all t, the student could
again not reach the required final knowledge level. Hence we must have K < 0,
which means that λ∗ (t) < 0 for all t and it is monotonically decreasing. If
1 + λ∗ (0)b < 0, it follows that 1 + λ∗ (t)b < 0 for all t, which means that
u∗ (t) = w̄ for all t. The more likely scenario is that 1 + λ∗ (0)b > 0 and
1 + λ∗ (t)b then decreases, becoming negative after a time t∗ . This means the
optimal control is of bang-bang type with

u*(t) = { 0, if 0 ≤ t < t*;  w̄, if t* ≤ t ≤ T }.

We can now derive the complete solution. For 0 ≤ t < t*, dx(t)/dt = −c x(t).
Together with the initial condition, this results in

x*(t) = k0 e^{−ct},  0 ≤ t < t*.

For t* ≤ t ≤ T, we have dx(t)/dt = b w̄ − c x(t). Together with the terminal
state constraint, this results in

x*(t) = b w̄/c + (kT − b w̄/c) e^{c(T − t)},  t* ≤ t ≤ T.

By equating the two forms of the optimal state trajectory at t*, we can derive

t* = (1/c) ln[ c k0/(b w̄) − (c kT/(b w̄)) e^{cT} + e^{cT} ].
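The formula for t* is easy to check numerically. In the sketch below (Python, with assumed illustrative values of c, b, w̄, k0, kT and T; these are not from Example 1.2.1), the two pieces of the optimal trajectory are evaluated at the computed t* and should coincide:

```python
import numpy as np

# Assumed illustrative data: deterioration rate c, efficiency b, maximum
# work rate w_bar, initial/final knowledge k0, kT, and horizon T.
c, b, w_bar, k0, kT, T = 0.5, 1.0, 2.0, 1.0, 3.0, 4.0

A = b * w_bar / c                     # steady-state level under full effort
t_star = np.log(c * k0 / (b * w_bar)
                - (c * kT / (b * w_bar)) * np.exp(c * T)
                + np.exp(c * T)) / c

# The two pieces of the optimal trajectory must agree at t*.
x_left = k0 * np.exp(-c * t_star)                    # coasting piece
x_right = A + (kT - A) * np.exp(c * (T - t_star))    # full-effort piece
print(f"t* = {t_star:.4f}, x_left = {x_left:.4f}, x_right = {x_right:.4f}")
```

For the data above the two values agree, so the switching time formula is consistent with the boundary conditions.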
To conclude this section, we wish to note that there exist several versions of
the proof for the Pontryagin Maximum Principle. What appears in Pontrya-
gin’s original book [206] is somewhat complex. For the proof of a simplified
version, we refer the reader to [3].

6.5 Singular Control

In this section, we wish to consider a situation where the Pontryagin minimum


principle fails to determine a unique value of the optimal control. This leads
to what is known as a singular control. Let us illustrate this situation by
means of the following simple example. For more details, see [40, 69, 72, 73]
and the references cited therein.
Assume that there are two existing highways. We wish to construct a new
highway to link point A on the first highway and point B on the second
highway as shown in Figure 6.5.1. Let L be the horizontal distance between
point A and point B, and let ye(ℓ), 0 < ℓ < L, be the existing terrain on
which a new highway is to be built. Let the profile of this highway be denoted
by y(ℓ), 0 < ℓ < L. The vertical height from A to ye(0) is denoted by ξ, and
the vertical height from point B to ye (L) is denoted by η. Clearly, the new
highway will be of no practical value if it is too steep for cars to drive on.
Thus, we must make sure that the slope of the new highway be within the
allowable limits, as given by

−S1 ≤ dy(ℓ)/dℓ ≤ S2,  0 ≤ ℓ ≤ L,    (6.5.1)

where S1 and S2 are given positive constants.

Fig. 6.5.1: A hypothetical case of highway construction. (a) Plan view. (b) Section a–a

Our aim is to accomplish this job with minimum cost, where the total cost
is the sum of the costs of cutting and filling of earth. If the costs of cutting
and filling are the same, then the minimum cost is, in fact, equivalent to the
minimum filling and cutting of earth.
We are now in a position to write down the mathematical model of this
optimal control problem. To simplify the presentation, we shall assume that

L = 2,  ξ = 0,  η = 1,  S1 = S2 = 1,  ye(ℓ) = 0 for 0 ≤ ℓ ≤ 2.    (6.5.2)

With these specifications, the corresponding version of the model is

min g0(u) = ∫_0^2 (y(ℓ))^2 dℓ    (6.5.3a)

subject to

dy(ℓ)/dℓ = u(ℓ),  0 ≤ ℓ ≤ 2,    (6.5.3b)
y(0) = 0,    (6.5.3c)
y(2) = 1,    (6.5.3d)

and
−1 ≤ u(ℓ) ≤ 1,  0 ≤ ℓ ≤ 2.    (6.5.3e)
According to Theorem 6.4.2, the corresponding Hamiltonian function is

H = y^2 + u λ    (6.5.4)

and the costate system is

dλ(ℓ)/dℓ = −∂H/∂y = −2 y(ℓ).    (6.5.5)
Minimizing H with respect to u, the optimal control takes the form

u*(ℓ) = { 1, if λ(ℓ) < 0;  −1, if λ(ℓ) > 0;  undetermined, if λ(ℓ) = 0 }.    (6.5.6)

Let us assume that we can ignore the third case of (6.5.6) for the time being.
Two possibilities remain, namely u*(ℓ) = 1 for λ(ℓ) < 0 and u*(ℓ) = −1 for
λ(ℓ) > 0.
Let us explore each of these cases. We integrate the differential equa-
tions (6.5.3b) and (6.5.5) in turn and we let K1 , K2 , K3 and K4 be the
constants of integration that arise in this process.

(i) λ(ℓ) < 0:

In this case, u*(ℓ) = 1 and dy(ℓ)/dℓ = 1. Hence,

y(ℓ) = ℓ + K1.    (6.5.7)

By (6.5.5), we obtain

λ(ℓ) = −ℓ^2 − 2 K1 ℓ + K2 < 0.    (6.5.8)

(ii) λ(ℓ) > 0:

In this case, u*(ℓ) = −1 and dy(ℓ)/dℓ = −1. Hence,

y(ℓ) = −ℓ + K3.    (6.5.9)

By (6.5.5), we have

λ(ℓ) = ℓ^2 − 2 K3 ℓ + K4 > 0.    (6.5.10)

Let us plot the relationship between y and λ for these cases by first elim-
inating the variable ℓ. Consider Case (i). Then, from (6.5.7), we have

ℓ = y − K1.

Hence, for λ < 0,

λ = −(y − K1)^2 − 2 K1 (y − K1) + K2 = −y^2 + (K2 + (K1)^2) < 0.    (6.5.11)

Now consider Case (ii). Then, by a similar argument, it follows from (6.5.9)
that

ℓ = K3 − y.

Hence, for λ > 0,

λ = (K3 − y)^2 − 2 K3 (K3 − y) + K4 = y^2 + (K4 − (K3)^2) > 0.    (6.5.12)

These are the desired relationships between y and λ. They are plotted on the
phase diagram of Figure 6.5.2.

Fig. 6.5.2: Phase orbits in the y–λ plane

In the top half-plane of the figure, λ is positive, u*(ℓ) = −1, and (6.5.12)
applies. In the bottom half-plane, λ is negative, u*(ℓ) = 1, and (6.5.11)
applies. The curves are parabolas, facing upwards for (6.5.12) and downwards
for (6.5.11). Any upwards-facing parabola that reaches the y-axis is
interrupted at this point and replaced by a suitable segment of a
downwards-facing parabola below the axis. Directions of travel along the
phase orbits are indicated with arrows. At any point (y(ℓ), λ(ℓ)) in the phase
plane, the travel direction is obtained from (6.5.5). If y(ℓ) is positive, λ(ℓ)
decreases with ℓ. Otherwise, λ(ℓ) increases with ℓ.
The phase orbits shown on the figure are the only possible candidates for
optimal solutions. However, not all these orbits lead to feasible solutions,

since they do not all meet the boundary conditions, i.e., (6.5.3c), y(0) = 0,
and (6.5.3d), y(2) = 1. In terms of the phase diagram, a feasible solution
must start on the λ-axis (y(0) = 0 when  = 0), and it must end on the
vertical line y = 1 when  = 2.
Inspecting the phase diagram carefully, we see that points on the λ-axis
with λ > 0 are of no use. We can never reach the desired final value of y = 1
from such initial points. Thus, we must start on the λ-axis at or below the
origin. This, in turn, implies that we must use the control u*(ℓ) = 1 if it is
to be an optimal control. Thus, (6.5.11) applies throughout. But, from (6.5.3c)
and (6.5.7), we have
y(0) = 0 + K1 = 0.
Thus, y(ℓ) = ℓ and hence y(2) = 2. This clearly does not satisfy the final
condition (6.5.3d) (i.e., y(2) = 1).
Where does this leave us? If we re-examine (6.5.6), there are, in fact,
three possibilities. The second is not feasible in any case, and the exclu-
sive use of the first has been shown to also not give a feasible answer. We
are therefore required to consider the third possibility, λ = 0 and u∗ un-
determined, (6.5.6). This situation leads to a singular control, because the
Maximum Principle (6.4.4e) fails to directly determine a unique value of the
optimal control.
Let us investigate this singular case further. Suppose that λ = 0 not merely
at a single point in [0, 2], but in some finite open interval, say (ℓ1, ℓ2). Then,
the derivative dλ(ℓ)/dℓ must also vanish and, by (6.5.5), we obtain y(ℓ) = 0
in that same interval. But then the derivative dy(ℓ)/dℓ = 0 as well and,
by (6.5.3b), we get u*(ℓ) = 0. The unique singular control solution is therefore
given by

y(ℓ) = u*(ℓ) = λ(ℓ) = 0.    (6.5.13)
On the phase diagram presented in Figure 6.5.2, this solution corresponds
to the origin, i.e., a single point in the plane rather than a curve. There are
other points in the plane with y = 0, namely all points on the λ-axis. But none
of these others can be maintained so that the differential equations (6.5.3b)
and (6.5.5) are satisfied over some finite interval, since (6.5.5) implies that
λ(ℓ) cannot remain constant, in particular equal to zero, unless 2y(ℓ) vanishes.
The desired final optimal solution now emerges in two parts. From ℓ = 0
until some distance ℓ̂, the new link should follow the existing terrain. We
start to fill the existing terrain from ℓ̂ onwards in such a way that the slope
at each point ℓ is always at its maximum allowable limit. In this way, the
desired height at ℓ = 2, y(2) = 1, will be reached exactly. The remaining
question is to determine the value of ℓ̂. First, we recall that

y(2) = 1,    (6.5.14)

y(ℓ) = 0,  0 ≤ ℓ ≤ ℓ̂,    (6.5.15)
and

y(ℓ) = ℓ + K1,  ℓ̂ ≤ ℓ ≤ 2.    (6.5.16)
Now, from (6.5.14) and (6.5.16), we obtain

K1 = −1. (6.5.17)

Thus, by (6.5.16), we have

y(ℓ) = ℓ − 1,  ℓ̂ ≤ ℓ ≤ 2.    (6.5.18)

Since y(ℓ̂) = 0 by (6.5.15), it follows from (6.5.18) that 0 = ℓ̂ − 1 and therefore
ℓ̂ = 1. Finally, we can conclude that if the new highway is to be built with
minimum cost, its slope must be

u*(ℓ) = { 0, 0 ≤ ℓ ≤ 1;  1, 1 < ℓ ≤ 2 }.    (6.5.19)

Hence, the corresponding profile of the new highway is

y*(ℓ) = { 0, 0 ≤ ℓ ≤ 1;  ℓ − 1, 1 < ℓ ≤ 2 }.    (6.5.20)

This clearly agrees with our intuition, although we can quite easily imagine
how uncomfortable it would be to drive on such a highway, since the slope of
the highway is at its maximum allowable limit from the point ℓ = 1 onwards.
This is due to the fact that we did not take driving comfort into account in
the problem formulation.
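It is also instructive to see the singular arc emerge from a direct numerical treatment. The following sketch discretizes (6.5.3) with a piecewise constant control on N subintervals and hands the resulting finite-dimensional problem to a standard NLP solver; this is exactly the spirit of the control parametrization approach developed in later chapters. The discretization (Euler, N = 40) is an arbitrary choice and scipy is assumed to be available:

```python
import numpy as np
from scipy.optimize import minimize

# Direct discretization of (6.5.3): piecewise constant u on N subintervals.
N = 40
h = 2.0 / N

def profile(u):
    # y(l) from dy/dl = u, y(0) = 0 (Euler on the uniform grid).
    return np.concatenate(([0.0], np.cumsum(u) * h))

cost = lambda u: float(np.sum(profile(u)[:-1] ** 2) * h)      # ~ int_0^2 y^2 dl
cons = {"type": "eq", "fun": lambda u: profile(u)[-1] - 1.0}  # y(2) = 1

res = minimize(cost, x0=0.5 * np.ones(N), bounds=[(-1.0, 1.0)] * N,
               constraints=[cons])
u_opt = res.x   # expect u ~ 0 on [0, 1] (singular arc) and u ~ 1 on [1, 2]
```

The returned control should be close to (6.5.19): approximately zero on [0, 1], which is the singular arc, and approximately one on [1, 2].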
Note that the singular control problem considered in this section is only
a very simple one. For more information regarding singular control, we refer
the interested reader to [40, 69, 72, 73]. Note also that computational algo-
rithms to be introduced in subsequent chapters work equally well regardless
of whether the optimal solution to be obtained is a singular one or otherwise.

6.6 Time Optimal Control

This section is devoted to a class of time optimal control problems. To begin,


we assume that any piecewise continuous function from [0, ∞) into U may be
taken as an admissible control, where U is the control restraint set defined
in Section 6.4. Let U be the class of all such admissible controls.
We may now state the class of time optimal control problems formally as:
Subject to the system (6.2.1) together with the final condition:

x(T ) = xf , (6.6.1)

find a control u ∈ U such that T is minimized, where xf is a given vector in


Rn .
For convenience, let this time optimal control problem be referred to as
Problem (T P ). Note that the cost functional of Problem (T P ) can be written
as

min ∫_0^T dt.    (6.6.2)
With this cost functional, the corresponding Hamiltonian function is given
by
H(t, x, u, λ) = 1 + λ⊤ f(t, x, u).    (6.6.3)
The corresponding version of the Pontryagin Maximum Principle is now
stated without proof in the following theorem.
Theorem 6.6.1 Consider Problem (T P ). Let u∗ ∈ U be an optimal control,
let x∗ be the corresponding optimal state, and let T ∗ be the minimum time
such that the final condition (6.6.1) is satisfied. Then, there exists a function

λ∗ = [λ∗1 , . . . , λ∗n ] : [0, T ∗ ] → Rn ,

which is not identically zero such that



dx*(t)/dt = ∂H(t, x*(t), u*(t), λ*(t))/∂λ = f(t, x*(t), u*(t)),    (6.6.4a)
x*(0) = x^0,    (6.6.4b)
x*(T*) = x^f,    (6.6.4c)
dλ*(t)/dt = −∂H(t, x*(t), u*(t), λ*(t))/∂x,    (6.6.4d)

and

min_{v∈U} H(t, x*(t), v, λ*(t)) = H(t, x*(t), u*(t), λ*(t))    (6.6.4e)

for all t ∈ [0, T ∗ ], except possibly on a finite subset of [0, T ∗ ]. Also,

H(T ∗ , x∗ (T ∗ ), u∗ (T ∗ ), λ∗ (T ∗ )) = 0. (6.6.4f)

Furthermore, if the system (6.2.1a) is autonomous (i.e., f does not depend


on t explicitly), then

H(t, x∗ (t), u∗ (t), λ∗ (t)) = 0, for all t ∈ [0, T ∗ ]. (6.6.4g)

The proof of this theorem is rather involved. Since the emphasis of this
text is on computational methods for optimal control, we omit the proof and
refer the interested reader to [3]. We illustrate the application of the theorem
with the following example.

Example 6.6.1 Consider the motion of a particle of unit mass governed by


the following differential equation:

d^2 x(t)/dt^2 = u(t)
with the following boundary conditions

x(0) = x_1^0,  dx(0)/dt = x_2^0,
x(T) = 0,  dx(T)/dt = 0.
Our aim is to find a control u with

|u(t)| ≤ 1, for all t ≥ 0

such that T is minimized.


Define x1 = x and x2 = ẋ. Then, the problem may be re-stated as
min ∫_0^T dt

subject to

dx1(t)/dt = x2(t),
dx2(t)/dt = u(t)
with the initial conditions

x1(0) = x_1^0,  x2(0) = x_2^0,

the terminal state constraints

x1 (T ) = 0, x2 (T ) = 0,

and the control constraints

| u(t)| ≤ 1, for all t ≥ 0.

The Hamiltonian function is H(x, u, λ) = 1 + x2 λ1 + uλ2 and the costate


system is

dλ1(t)/dt = −∂H/∂x1 = 0,
dλ2(t)/dt = −∂H/∂x2 = −λ1(t).
Minimizing the Hamiltonian function with respect to u, we obtain

u*(t) = { −1, if λ2(t) > 0;  1, if λ2(t) < 0;  undetermined, if λ2(t) = 0 }.

The costate system can be easily solved to yield

λ1 (t) = C
λ2 (t) = −Ct + D,

where C and D are arbitrary constants. From the linearity of λ2 (t), it follows
that λ2 (t) either changes sign at most once or is identically zero over the time
horizon. However, we can rule out the latter case, since this would require
C = D = 0, leading to λ1 (t) = 0, for all t ≥ 0, and hence

H(x(t), u∗ (t), λ(t)) = 1,

which contradicts the condition H(x(t), u∗ (t), λ(t)) = 0. We conclude there-


fore that λ2 (t) can change sign at most once. This, in turn, implies that u∗ (t)
changes sign at most once. In other words, u∗ (t) is either +1 or −1 with
a possible switch at some time t̂. This type of control is called a bang-bang
control.
If u*(t) = +1, dx2(t)/dt = 1 ⇒ x2(t) = t + A, where A is an arbitrary constant.
Clearly,
(1/2)(x2)^2 = (1/2)t^2 + A t + (1/2)A^2
and
dx1(t)/dt = x2(t) = t + A ⇒ x1(t) = (1/2)t^2 + A t + B,
where B is also arbitrary. Eliminating t from the last two equations, we obtain
x1 = (1/2)(x2)^2 + constant.
This describes a family of parabolas shown as solid curves in Figure 6.6.1.
Possible movements along these phase trajectories are indicated by arrows
and can be deduced from dx2(t)/dt = 1.

Fig. 6.6.1: Bang-bang optimal control

If u*(t) = −1, dx2(t)/dt = −1 ⇒ x2(t) = −t + A′, where A′ is once more an
arbitrary constant. Clearly,
(1/2)(x2)^2 = (1/2)t^2 − A′ t + (1/2)(A′)^2
and
dx1(t)/dt = x2(t) = −t + A′ ⇒ x1(t) = −(1/2)t^2 + A′ t + B′,

where B′ is also arbitrary. Eliminating t from these two equations, we obtain

x1 = −(1/2)(x2)^2 + constant.
This is another family of parabolas indicated by the broken lines in Fig-
ure 6.6.1. Again, possible movements along these phase trajectories follow
from dx2(t)/dt = −1 and are indicated by arrows.
Any potential optimal trajectory must lie on one of the parabolas shown
in Figure 6.6.1. It is clear that there are only two curves that will take us
to the origin. The combination of these two curves is the curve marked by
POP′, and it plays a special role in the solution of the problem. Consider
two distinct starting points, A and B, as indicated.
Case 1: Suppose we are starting at A. In this situation, an optimal trajectory
must follow the solid parabola up until it reaches the curve POP′ and it
then follows this curve to the origin. Noting the control values that give
rise to the two sections of this optimal trajectory, the time optimal control
is clearly given by

u*(t) = { 1, for t ∈ [0, t̂);  −1, for t ∈ [t̂, T*] },

where t̂ is the time at which POP′ is reached, and T* is the time at which
the trajectory reaches the origin.
Case 2: Suppose we are starting at B. In this situation, an optimal trajec-
tory must follow the broken parabola down until it reaches POP′ and it
then follows this curve to the origin. In this case, the time optimal control
is clearly given by

u*(t) = { −1, for t ∈ [0, t̂);  1, for t ∈ [t̂, T*] },

where t̂ and T ∗ have the same meaning as in Case 1.


POP′ is known as the switching curve, since the optimal control changes
sign once the corresponding optimal trajectory reaches this curve. Note that
it does not matter where we start initially. For any starting point in the phase
plane, we must travel directly to the switching curve and then stay on it to
the origin.
Remark 6.6.1 Note that the control law we end up with is in closed loop
form, since we are able to specify u∗ (t) entirely in terms of the current state
of the system. The law can be stated simply as follows: take the path that
leads onto the switching curve, and then ride the switching curve home.
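The closed-loop law is simple enough to state in code. From the two parabolas through the origin derived above, the switching curve POP′ has the explicit form x1 = −(1/2) x2 |x2|. The sketch below (Python; the starting point, step size and tolerances are assumed for illustration) applies the feedback with a small Euler step:

```python
import numpy as np

def u_feedback(x1, x2):
    # Time optimal feedback for the double integrator: the switching curve
    # POP' is x1 = -0.5 * x2 * |x2| (the two parabolas through the origin).
    s = x1 + 0.5 * x2 * abs(x2)
    if abs(s) < 1e-9:
        return -np.sign(x2)     # on the curve: ride it home
    return -np.sign(s)          # off the curve: drive towards it

# Simulate from an assumed starting point with a small Euler step.
x1, x2, dt, t = 3.0, 0.0, 1e-3, 0.0
while x1 * x1 + x2 * x2 > 1e-4 and t < 10.0:
    u = u_feedback(x1, x2)
    x1, x2, t = x1 + x2 * dt, x2 + u * dt, t + dt
print(f"reached the origin (to tolerance) at t = {t:.3f}")
```

Near the switching curve the discrete-time simulation chatters between u = +1 and u = −1, which is the numerical counterpart of sliding along POP′ to the origin.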

6.7 Continuous State Constraints

In this section, we extend the Maximum Principle to a more general class


of problems also involving continuous inequality constraints on a function of
the state. The main reference for this section is [88]. Consider the constraints

h(t, x(t)) ≤ 0, for all t ∈ [0, T ], (6.7.1)

a(T, x(T )) ≤ 0, (6.7.2)


and
b(T, x(T )) = 0, (6.7.3)
where h = [h1, . . . , hq]⊤, a = [a1, . . . , ar]⊤, and b = [b1, . . . , bs]⊤ are as-
sumed to be continuously differentiable with respect to all their arguments.
In addition, higher order differentiability of h may be required as detailed
later.

Instead of the class of admissible controls defined in Section 6.4, we assume


here that the control is governed by the more general constraint

g(t, u(t)) ≤ 0, for all t ∈ [0, T ], (6.7.4)

where g = [g1 , . . . , gc ] is continuously differentiable in all its arguments. We


then consider the problem of finding a measurable control u(·) that minimizes
the cost functional (6.2.2) subject to the dynamical system (6.2.1) and subject
to the constraints (6.7.1)–(6.7.4). We refer to this state constrained optimal
control problem as Problem (SP ).
Throughout this section, we make the following assumptions.
Assumption 6.7.1 For all possible values of T and x(T ), the terminal state
constraints (6.7.2)–(6.7.3) are such that the following condition is satisfied.

rank [ ∂a/∂x  diag(a) ; ∂b/∂x  0 ] = r + s,    (6.7.5)

where diag(a) denotes the r × r diagonal matrix with a1 , . . . , ar along the


main diagonal. (6.7.5) is known as the constraint qualification for the con-
straints (6.7.2)–(6.7.3). It ensures that the gradients of the equality and active
inequality constraints with respect to x are linearly independent.

Assumption 6.7.2 For all possible u(t) and t,

rank [∂g/∂u diag(g)] = c, (6.7.6)

where diag(g) denotes the c×c diagonal matrix with g1 , . . . , gc along the main
diagonal. The constraint qualification (6.7.6) means that the gradients with
respect to u of all active components of (6.7.4) must be linearly independent.

Before we can state a similar constraint qualification assumption for (6.7.1),


we need to introduce some additional concepts. For each hi , i = 1, . . . , q, de-
fine

h_i^0(t, x, u) = h_i(t, x),
h_i^1(t, x, u) = dh_i^0/dt = (∂h_i/∂x) f(t, x, u) + ∂h_i/∂t,
h_i^2(t, x, u) = dh_i^1/dt = (∂h_i^1/∂x) f(t, x, u) + ∂h_i^1/∂t,
...
h_i^p(t, x, u) = dh_i^{p−1}/dt = (∂h_i^{p−1}/∂x) f(t, x, u) + ∂h_i^{p−1}/∂t    (6.7.7)
for any integer p > 0. For each i = 1, . . . , q, let pi be such that

∂h_i^j/∂u = 0 for all 0 ≤ j ≤ p_i − 1  and  ∂h_i^{p_i}/∂u ≠ 0.    (6.7.8)
Then we say that the state constraint hi ≤ 0 is of order pi . Note that many
practical problems have state constraints of order 1 only.
Consider the state constraint hi ≤ 0 for a particular i ∈ {1, . . . , q}. We say
that a subinterval (t1 , t2 ) ⊂ [0, T ] with t1 < t2 is an interior interval of the
trajectory x(·) if hi (t, x(t)) < 0 for all t ∈ (t1 , t2 ). An interval [τ1 , τ2 ] ⊂ [0, T ]
with τ1 < τ2 is called a boundary interval if hi (t, x(t)) = 0 for all t ∈ [τ1 , τ2 ].
An instant t_en ∈ [0, T] is called an entry time if there is an interior interval
that ends at t_en and a corresponding boundary interval that starts at t_en.
Similarly, an instant t_ex ∈ [0, T] is called an exit time if a boundary interval
finishes and an interior interval commences at t_ex. If hi(τ*, x(τ*)) = 0 and if
hi(t, x(t)) < 0 just before and just after τ*, then τ* is referred to as a contact
time. We collectively refer to entry, exit and contact times as junction times.
Finally, let us assume for notational convenience that boundary intervals do
not intersect. If this is not the case, a more elaborate statement of the next
assumption would be required.
Assumption 6.7.3 On any boundary interval [τ1, τ2],

rank [ ∂h_1^{p_1}/∂u ; · · · ; ∂h_{q′}^{p_{q′}}/∂u ] = q′,    (6.7.9)

where we assume that the constraints have been ordered so that

hi(t, x(t)) = 0, i = 1, . . . , q′ ≤ q  and  hi(t, x(t)) < 0, i = q′ + 1, . . . , q,

for t ∈ [τ1, τ2] and p_i is the order of the constraint hi ≤ 0.


Since we have replaced the class of admissible controls used in previous sections
with (6.7.4) here, further assumptions may be required to guarantee the exis-
tence of an optimal solution of Problem (SP ). We refer the interested reader
to [88] for a more detailed analysis of this issue.
To state the first order optimality conditions for Problem (SP ), we follow
the so-called direct adjoining approach where the state constraints (6.7.1) and
control constraints (6.7.4) are directly adjoined to the Hamiltonian to form
the Lagrangian. Thus, we define

H(t, x, u, λ0, λ) = λ0 L0(t, x, u) + λ⊤ f(t, x, u)    (6.7.10)

and

L(t, x, u, λ0, λ, μ, ν) = H(t, x, u, λ0, λ) + μ⊤ g(t, u) + ν⊤ h(t, x),    (6.7.11)



where
λ0 ≥ 0 (6.7.12)
is a constant, λ(·) ∈ Rn is the costate vector, and μ(·) ∈ Rc and ν(·) ∈ Rq
are Lagrange multiplier functions. At any time t ∈ [0, T ], we also define the
feasible control region

Ω(t) = {u ∈ Rm | g(t, u) ≤ 0} . (6.7.13)

Theorem 6.7.1 Consider Problem (SP ). Suppose that u∗ (·), where u∗ (t) ∈
Ω(t), t ∈ [0, T ), is an optimal control that is right continuous with left hand
limits and also assume that the constraint qualification (6.7.6) holds for every
pair {u, t}, t ∈ [0, T ] with u ∈ Ω(t). Denote the corresponding optimal state
as x∗ (·) and assume that it has finitely many junction times. Then there exist
a constant λ0 ≥ 0, a piecewise continuous costate trajectory λ∗ (·) whose con-
tinuous segments are absolutely continuous, piecewise continuous multiplier
functions μ∗ (·) and ν ∗ (·), a vector η(τi ) ∈ Rq for each discontinuity point τi
of λ∗ (·), and α ∈ Rr , β ∈ Rs , γ ∈ Rq such that

(λ0, λ(t), μ(t), ν(t), α, β, γ, η(τ1), η(τ2), . . .) ≠ 0

for every t ∈ [0, T ] and such that the following conditions hold almost every-
where.

dx*(t)/dt = ∂H(t, x*(t), u*(t), λ0, λ*(t))/∂λ = f(t, x*(t), u*(t)),    (6.7.14a)
x*(0) = x^0,    (6.7.14b)
dλ*(t)/dt = −[∂L(t, x*(t), u*(t), λ0, λ*(t), μ*(t), ν*(t))/∂x]⊤,    (6.7.14c)
λ*(T⁻) = λ0 [∂Φ0(x*(T))/∂x]⊤ + [∂a(T, x*(T))/∂x]⊤ α
       + [∂b(T, x*(T))/∂x]⊤ β + [∂h(T, x*(T))/∂x]⊤ γ,    (6.7.14d)
α ≥ 0,  γ ≥ 0,  α⊤ a(T, x*(T)) = γ⊤ h(T, x*(T)) = 0,    (6.7.14e)

where T − denotes the limit from the left,

∂L(t, x*(t), u*(t), λ0, λ*(t), μ*(t), ν*(t))/∂u
    = ∂H(t, x*(t), u*(t), λ0, λ*(t))/∂u + (μ*(t))⊤ ∂g(t, u*(t))/∂u = 0,    (6.7.14f)

μ*(t) ≥ 0,  (μ*(t))⊤ g(t, u*(t)) = 0,  g(t, u*(t)) ≤ 0,    (6.7.14g)
ν*(t) ≥ 0,  (ν*(t))⊤ h(t, x*(t)) = 0,  h(t, x*(t)) ≤ 0,    (6.7.14h)

dH(t, x*(t), u*(t), λ0, λ*(t))/dt
    = dL(t, x*(t), u*(t), λ0, λ*(t), μ*(t), ν*(t))/dt
    = ∂L(t, x*(t), u*(t), λ0, λ*(t), μ*(t), ν*(t))/∂t,    (6.7.14i)
and

min_{v∈Ω(t)} H(t, x*(t), v, λ0, λ*(t)) = H(t, x*(t), u*(t), λ0, λ*(t)).    (6.7.14j)

Furthermore, for any time τ in a boundary interval and for any contact time
τ, the costate λ*(·) may have a discontinuity given by the following jump
conditions:

λ*(τ⁻) = λ*(τ⁺) + [∂h(τ, x*(τ))/∂x]⊤ η(τ),    (6.7.14k)
H(τ⁻, x*(τ⁻), u*(τ⁻), λ0, λ*(τ⁻))
    = H(τ⁺, x*(τ⁺), u*(τ⁺), λ0, λ*(τ⁺)) − (η(τ))⊤ ∂h(τ, x*(τ))/∂t,    (6.7.14l)
η(τ) ≥ 0,  (η(τ))⊤ h(τ, x*(τ)) = 0,    (6.7.14m)

where τ⁻ and τ⁺ denote the left hand side and right hand side limits, re-
spectively. Finally, suppose that τ is a junction time corresponding to a first
order constraint hi (t, x) ≤ 0 for some i ∈ {1, . . . , q}. Recalling the definition
of h_i^1 in (6.7.7), if

h_i^1(τ⁻, x*(τ⁻), u*(τ⁻)) > 0,    (6.7.15)

or

h_i^1(τ⁺, x*(τ⁺), u*(τ⁺)) < 0,    (6.7.16)
i.e., if the entry or exit of the state into or out of a boundary interval is
non-tangential, then the costate λ is continuous at τ .
Remark 6.7.1 The condition

(λ0, λ(t), μ(t), ν(t), α, β, γ, η(τ1), η(τ2), . . .) ≠ 0

for every t can help us to distinguish a normal case (λ0 = 1) from an abnormal
case (λ0 = 0).
Remark 6.7.2 Conditions (6.7.14i) and (6.7.14l) are equivalent to the re-
quirement that H(t, x*(t), u*(t), λ0, λ*(t)) is constant in the autonomous
case, i.e., when f, L0, g, and h do not depend on t explicitly.

Remark 6.7.3 In most practical cases, λ* and H will only jump at junction
times. However, a discontinuity may also occur in the interior of a boundary
interval.

Remark 6.7.4 Parts of Theorem 6.7.1 have been proven by a range of dif-
ferent authors, see [88] and the references cited therein. In particular, a com-
plete proof that establishes the existence of various multipliers may be found
in [184]. A more general version of Theorem 6.7.1, where g also depends on
x, is stated as an informal theorem in [88]. However, no formal proof has as
yet been established for this case.

Example 6.7.1 Consider the problem of minimizing

g0(u) = ∫_0^3 x(t) dt    (6.7.17)

subject to

dx(t)/dt = u(t),    (6.7.18)
x(0) = 1,    (6.7.19)
−1 ≤ u(t) ≤ 1, for all t ∈ [0, 3],    (6.7.20)
x(t) ≥ 0, for all t ∈ [0, 3],    (6.7.21)

and the terminal state constraint

x(3) = 1. (6.7.22)

In the notation of Problem (SP ), we have L0 = x, Φ0 = 0, f = u, h = −x ≤ 0,


g1 = −u − 1 ≤ 0, g2 = u − 1 ≤ 0 and b = x − 1 = 0. Applying Theorem 6.7.1,
we have

H =x + λu,
L =x + λu + μ1 (−1 − u) + μ2 (u − 1) − νx,

where
dλ/dt = −∂L/∂x = −1 + ν,
λ(3) = β − γ,
μ1 ≥ 0, μ1 (−1 − u) = 0, for all t ∈ [0, 3],
μ2 ≥ 0, μ2 (u − 1) = 0, for all t ∈ [0, 3],
ν ≥ 0, ν(−x) = 0, for all t ∈ [0, 3],

and γ ≥ 0 along with γh(x(3)) = 0. Note that the last condition gives
γx(3) = γ = 0. Furthermore,

∂L/∂u = λ − μ1 + μ2 = 0.    (6.7.23)
While it is possible to derive the optimal solution from the above condi-
tions, it is obvious from the problem statement that

x*(t) = { 1 − t, 0 ≤ t < 1;  0, 1 ≤ t < 2;  t − 2, 2 ≤ t ≤ 3 },

and

u*(t) = { −1, 0 ≤ t < 1;  0, 1 ≤ t < 2;  1, 2 ≤ t < 3 }.
We can then derive the corresponding costate and multipliers by satisfying
the requirements of Theorem 6.7.1.
For t ∈ [1, 2], u∗ (t) = 0 requires that μ∗1 (t) = μ∗2 (t) = 0. (6.7.23) then
yields λ∗ (t) = 0. It follows that dλ/dt = −1 + ν ∗ (t) = 0, which results in
ν ∗ (t) = 1.
For t ∈ [0, 1), x∗ (t) > 0 implies that ν ∗ (t) = 0 and hence dλ∗ /dt = −1 so
that λ∗ (t) = A − t, for some constant A. Since the state enters the boundary
interval non-tangentially, Theorem 6.7.1 requires λ∗ to be continuous at t =
1. λ∗ (1− ) = A − 1 = λ∗ (1) = 0 yields A = 1 so that λ∗ (t) = 1 − t on
[0, 1). u∗ (t) = −1 on [0, 1) means that μ∗2 (t) = 0 and, by virtue of (6.7.23),
μ∗1 (t) = λ∗ (t) = 1 − t on the same interval.
For t ∈ (2, 3], x∗ (t) > 0 again implies that ν ∗ (t) = 0 and hence dλ∗ /dt =
−1 so that λ∗ (t) = B − t, for some constant B. Since the state exits the
boundary interval non-tangentially, continuity of λ∗ is required at t = 2, so
λ∗ (2+ ) = B − 2 = λ∗ (2) = 0, i.e., B = 2 and λ∗ (t) = 2 − t in this interval.
u∗ (t) = 1 means that μ∗1 (t) = 0, so, using (6.7.23), we have μ∗2 (t) = −λ∗ (t) =
t − 2.
Clearly, the costate and multipliers derived above satisfy all the require-
ments of Theorem 6.7.1.
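Since Example 6.7.1 is linear in both the state and the control, a discretized version of it can also be checked independently with a linear program. A minimal sketch follows (Python with scipy assumed; the Euler discretization and step count are arbitrary choices, and the state is eliminated in favour of prefix sums of the control):

```python
import numpy as np
from scipy.optimize import linprog

# Discretized Example 6.7.1 on N Euler steps over [0, 3]:
# x_k = 1 + h * sum_{j<k} u_j, minimize h * sum_k x_k.
N = 60
h = 3.0 / N

c = h * h * (N - 1 - np.arange(N))   # cost coefficients on u (constant dropped)
P = np.tril(np.ones((N, N)), -1)     # prefix-sum matrix: (P u)_k = sum_{j<k} u_j
A_ub = -h * P                        # x_k >= 0  <=>  -h * sum_{j<k} u_j <= 1
b_ub = np.ones(N)
A_eq = np.ones((1, N))               # x(3) = 1  <=>  sum_j u_j = 0
b_eq = np.zeros(1)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(-1.0, 1.0)] * N)
u = res.x   # approximately -1 on [0, 1), 0 on [1, 2), +1 on [2, 3]
```

The LP solution should reproduce the analytical control: descend at the maximum rate, rest on the state constraint boundary, and climb back at the last possible moment.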

6.8 The Bellman Dynamic Programming

In contrast to the Pontryagin Maximum Principle, Bellman’s principle of op-


timality for dynamic programming is surprisingly clear and intuitive and the
proof is almost trivial. Nevertheless, it has made a tremendous contribution
in a wide range of applications ranging from decision science to various en-
gineering disciplines. Its application in optimal control theory has led to the
derivation of the famous Hamilton-Jacobi-Bellman (HJB) partial differential
equation that can be used for constructing nonlinear optimal feedback control
laws.

We introduce Bellman’s principle of optimality via a simple diagram as
shown in Figure 6.8.1.

Fig. 6.8.1: Proof of Bellman’s principle of optimality

Suppose we wish to travel from the point A to the point D by the short-
est possible path and suppose this optimal path is given by ABCD, where
the intermediate points B and C represent some intermediate stages of the
journey. Now let us suppose that we start from point B and wish to travel to
point D by the shortest possible path. Bellman’s principle of optimality then
asserts that the optimal path will be given by BCD. Though this answer
appears trivial, we shall prove it nevertheless. Suppose there exists another
optimal path from B to D, indicated by the broken line BED. Then, since
distance is additive, this implies that ABED is a shorter path than ABCD.
This contradicts the original assumption and the proof is complete.
In the context of optimal control, we are dealing with a dynamical process
that evolves with time. Once the initial state and initial time are fixed, a
unique optimal control to minimize the cost functional can be determined.
Hence the optimal control is a function of the initial state and time. Each
optimal state is associated with an optimal path called an extremal. If the
process starts at a different initial state and time, a different optimal extremal
will result. For all possible initial points (ξ, t) in the state–time space, a
family of extremals may be computed, which is referred to as the field of
extremals in classical calculus of variations terminology. Each extremal yields
a corresponding cost functional that can be regarded as a function of the
initial state and time, i.e.,

V (ξ, t) = min {g0 (ξ, t | u) : u ∈ U }, (6.8.1)

where

g0(ξ, t | u) = Φ0(x(T | u)) + ∫_t^T L0(τ, x(τ | u), u(τ)) dτ,    (6.8.2)

U = {u: u is measurable on [0, T ] with u(t) ∈ U, for all t ∈ [0, T ]},


and, for each u ∈ U , x(· | u) is determined from (ξ, t) by

dx(τ)/dτ = f(τ, x(τ), u(τ)),    (6.8.3a)
x(t) = ξ.    (6.8.3b)

Here, the function V is called the value function, and g0 (ξ, t | u) denotes the
cost functional corresponding to the control u over the interval [t, T ] starting
from x(t) = ξ. Moreover, the control restraint set U is, in general, taken as
a compact subset of Rr .
For convenience, let Problem (P (ξ, t)) denote the optimal control problem
in which the cost functional (6.8.2) is to be minimized over U subject to the
dynamical system (6.8.3).
We may now apply Bellman’s principle of optimality to derive a partial
differential equation for the value function V (ξ, t). Suppose t is the current
time and t + Δt is a future time infinitesimally close to t. Then, by virtue of
the system (6.8.3), it follows that corresponding to each u ∈ U , the state at
t + Δt is
x(t + Δt | u) = ξ + Δξ, (6.8.4)
where

Δξ = f(t, ξ, u(t)) Δt + O(Δt^2).    (6.8.5)
Now, let Problem (P (ξ + Δξ, t + Δt)) be Problem (P (ξ, t)) with the initial
condition (6.8.3b) and the cost functional (6.8.2) replaced by (6.8.4) and
g0(ξ + Δξ, t + Δt | u) = Φ0(x(T | u)) + ∫_{t+Δt}^T L0(τ, x(τ | u), u(τ)) dτ,    (6.8.6)

respectively.
Clearly, the cost functional (6.8.2) can be expressed as

g0(ξ, t | u) = Φ0(x(T | u)) + ∫_{t+Δt}^T L0(τ, x(τ | u), u(τ)) dτ    (6.8.7)
             + ∫_t^{t+Δt} L0(τ, x(τ | u), u(τ)) dτ.    (6.8.8)

If we employ a control ũ ∈ U given by

ũ(τ) = { u*(τ), for τ ∈ (t + Δt, T];  u(τ), for τ ∈ [t, t + Δt) },    (6.8.9)

where u* ∈ U is an optimal control for Problem (P(ξ + Δξ, t + Δt)), we may
rewrite (6.8.8) as

g0(ξ, t | ũ) = V(ξ + Δξ, t + Δt) + ∫_t^{t+Δt} L0(τ, x(τ | u), u(τ)) dτ,    (6.8.10)

where

V (ξ + Δξ, t + Δt) = min{g0 (ξ + Δξ, t + Δt | u) : u ∈ U }. (6.8.11)

Obviously,
g0 (ξ, t | ũ) ≥ V (ξ, t), (6.8.12)
which, in turn, becomes an equality if an optimal control is used in the interval
[t, t + Δt]. Thus,

V(ξ, t) = min { V(ξ + Δξ, t + Δt) + ∫_t^{t+Δt} L0(τ, x(τ | u), u(τ)) dτ
                : u measurable on [t, t + Δt] with u(τ) ∈ U }.    (6.8.13)

Using a Taylor series expansion of the value function V(ξ + Δξ, t + Δt), and
then invoking (6.8.5), we have

V(ξ + Δξ, t + Δt)
    = V(ξ, t) + (∂V(ξ, t)/∂ξ) Δξ + (∂V(ξ, t)/∂t) Δt + O(Δt^2)
    = V(ξ, t) + (∂V(ξ, t)/∂ξ) f(t, ξ, u(t)) Δt + (∂V(ξ, t)/∂t) Δt + O(Δt^2).    (6.8.14)
Next, since Δt is infinitesimally small, it is clear from using the initial con-
dition (6.8.3b) that
∫_t^{t+Δt} L0(τ, x(τ | u), u(τ)) dτ = L0(t, ξ, u(t)) Δt + O(Δt^2).    (6.8.15)

Substituting (6.8.14) and (6.8.15) into (6.8.13), and then noting that V (ξ, t)
is independent of u(t), we obtain

−(∂V(ξ, t)/∂t) Δt
    = min { (∂V(ξ, t)/∂ξ) f(t, ξ, u(t)) Δt + L0(t, ξ, u(t)) Δt + O(Δt^2) : u(t) ∈ U }.
    (6.8.16)

Thus, dividing (6.8.16) by Δt, and then letting Δt → 0, it follows that

−∂V(ξ, t)/∂t = min { (∂V(ξ, t)/∂ξ) f(t, ξ, v) + L0(t, ξ, v) : v ∈ U }.    (6.8.17)

(6.8.17) is the well-known Hamilton-Jacobi-Bellman (HJB) equation that


is a partial differential equation in ξ and t. It is to be solved subject to the
boundary condition:
V (ξ, T ) = Φ0 (ξ). (6.8.18)
The validity of (6.8.18) is obvious from the definition of the value function.
There are many numerical methods available in the literature for solving
HJB partial differential equations. See, for example, [98, 99, 130, 188, 208,
273, 274, 305–307, 309]. However, except for linear quadratic optimal control
problems, these HJB equations are rather difficult to solve when the dimen-
sion of ξ is larger than 2. On the other hand, if it can be solved, the optimal
control obtained from the minimization process of (6.8.17) is dependent on
the state. It thus furnishes a natural nonlinear optimal feedback control law.
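To make the discussion concrete, the following sketch solves a scalar instance of (6.8.17)–(6.8.18) by a simple upwind finite-difference scheme, marching backwards in time. The problem instance (minimize ∫_t^1 x(s)^2 ds with dx/dt = u, |u| ≤ 1, Φ0 = 0) and all grid parameters are assumed for illustration only and are not taken from the text:

```python
import numpy as np

# Upwind finite-difference sketch of the HJB equation (6.8.17)-(6.8.18) for
# an assumed scalar problem: minimize int_t^1 x(s)^2 ds, dx/dt = u, |u| <= 1.
nx, nt = 201, 2000
xs = np.linspace(-2.0, 2.0, nx)
dx, dt = xs[1] - xs[0], 1.0 / nt
V = np.zeros(nx)                        # V(xi, T) = Phi_0(xi) = 0 by (6.8.18)

for _ in range(nt):                     # march backwards from t = T to t = 0
    Vf = (np.roll(V, -1) - V) / dx      # forward difference (used when u > 0)
    Vb = (V - np.roll(V, 1)) / dx       # backward difference (used when u < 0)
    # Evaluate u * V_x + L_0 for the candidate controls u in {-1, 0, +1}:
    cand = np.stack([-Vb + xs**2, xs**2, Vf + xs**2])
    V = V + dt * cand.min(axis=0)
    V[0], V[-1] = V[1], V[-2]           # crude one-sided boundary treatment

# The minimizing u at each grid point furnishes the feedback law noted above.
u_star = np.array([-1.0, 0.0, 1.0])[cand.argmin(axis=0)]
```

The minimization over the candidate controls at each grid point is exactly the minimization in (6.8.17), so the array u_star is a tabulated state feedback law.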
Note that if the Hamiltonian function is defined as

H(t, x, u, λ) = L0(t, x, u) + λ⊤ f(t, x, u),    (6.8.19)

then (6.8.17) may be rewritten as

−∂V(ξ, t)/∂t = min { H(t, ξ, v, ∂V(ξ, t)/∂ξ) : v ∈ U }.    (6.8.20)

In fact, it can even be shown that the optimal costate is

λ*(t) = ∂V(ξ, t)/∂ξ.    (6.8.21)

We shall conclude this section by re-deriving the optimal feedback con-


trol law of a linear quadratic regulator via the HJB equation (compare Sec-
tion 6.3). Recall that the linear quadratic regulator problem’s formulation
is
min g0(u) = (1/2)(x(T))⊤ S^f x(T)
            + (1/2) ∫_0^T [ (x(t))⊤ Q(t) x(t) + (u(t))⊤ R(t) u(t) ] dt    (6.8.22)

subject to
dx(t)/dt = A(t) x(t) + B(t) u(t).    (6.8.23)
Let V (ξ, t) be the corresponding value function if the process is assumed to
start at t < T from some state ξ. The associated HJB equation is

−∂V(ξ, t)/∂t
    = min { (∂V(ξ, t)/∂ξ)(A(t)ξ + B(t)v) + (1/2) ξ⊤ Q(t) ξ + (1/2) v⊤ R(t) v : v ∈ Rr }
    (6.8.24a)

with boundary condition

V(ξ, T) = (1/2) ξ⊤ S^f ξ.    (6.8.24b)
Given the form of (6.8.24), we may speculate that V (ξ, t) takes the form:
V(ξ, t) = (1/2) ξ⊤ S(t) ξ,    (6.8.25)
where S(t) is a symmetric matrix. Substitution of (6.8.25) into (6.8.24a)
yields

−(1/2) ξ⊤ (dS(t)/dt) ξ
    = min { ξ⊤ S(t)(A(t)ξ + B(t)v) + (1/2) ξ⊤ Q(t) ξ + (1/2) v⊤ R(t) v : v ∈ Rr }.
    (6.8.26)

Let u∗ (t) denote the v that solves the minimization problem. Clearly, u∗ (t)
is given by
u*(t) = −(R(t))^{−1} (B(t))⊤ S(t) ξ,    (6.8.27)
which simply confirms the optimal state feedback control law of (6.3.20).
Substituting the minimizing control u∗ (t) of (6.8.27) into (6.8.26), we have
−(1/2) ξ⊤ Ṡ(t) ξ = ξ⊤ S(t)(A(t)ξ − B(t)(R(t))^{−1}(B(t))⊤ S(t)ξ) + (1/2) ξ⊤ Q(t) ξ
                  + (1/2) ξ⊤ S(t) B(t)(R(t))^{−1}(B(t))⊤ S(t) ξ.    (6.8.28)
This can be simplified to

ξ⊤ [ dS(t)/dt + 2 S(t)A(t) − S(t)B(t)(R(t))^{−1}(B(t))⊤ S(t) + Q(t) ] ξ = 0.    (6.8.29)
Since S(t) is symmetric, it follows that

ξ⊤ (2 S(t)A(t)) ξ
    = ξ⊤ [ S(t)A(t) + (A(t))⊤ S(t) ] ξ + ξ⊤ [ S(t)A(t) − (A(t))⊤ S(t) ] ξ.    (6.8.30)

Also, since S(t)A(t) − (A(t))⊤ S(t) is antisymmetric, the second term on the
right hand side of (6.8.30) vanishes and (6.8.29) therefore reduces to

ξ⊤ [ dS(t)/dt + S(t)A(t) + (A(t))⊤ S(t)
     − S(t)B(t)(R(t))^{−1}(B(t))⊤ S(t) + Q(t) ] ξ = 0.    (6.8.31)

Since (6.8.31) must hold true for an arbitrary ξ, we conclude that

dS(t)/dt = −S(t)A(t) − (A(t))⊤ S(t)
           + S(t)B(t)(R(t))^{−1}(B(t))⊤ S(t) − Q(t),    (6.8.32)

which is exactly the same matrix Riccati differential equation derived previ-
ously in equation (6.3.19).
The boundary condition follows from (6.8.25) and (6.8.24b):

S(T) = S^f.    (6.8.33)

After solving the matrix Riccati equation (6.8.32) with the boundary condi-
tion (6.8.33), the feedback control law is obtained readily from (6.8.27).
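In practice, the matrix Riccati equation is integrated numerically backwards from (6.8.33). A minimal sketch (Python with scipy assumed; A, B, Q, R, S^f and T below are assumed illustrative, time-invariant data, namely a double integrator) is:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch of the LQR synthesis (6.8.27)/(6.8.32): integrate the matrix
# Riccati equation backwards from S(T) = S^f. Assumed illustrative data:
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R, Sf, T = np.eye(2), np.array([[1.0]]), np.zeros((2, 2)), 5.0

def riccati_rhs(t, s_flat):
    S = s_flat.reshape(2, 2)
    dS = -S @ A - A.T @ S + S @ B @ np.linalg.inv(R) @ B.T @ S - Q  # (6.8.32)
    return dS.ravel()

# Integrate backwards in time by the substitution tau = T - t.
sol = solve_ivp(lambda tau, s: -riccati_rhs(T - tau, s),
                [0.0, T], Sf.ravel(), dense_output=True)

def u_feedback(t, x):
    S = sol.sol(T - t).reshape(2, 2)
    return -np.linalg.inv(R) @ B.T @ S @ x       # feedback law (6.8.27)
```

The time reversal tau = T − t lets a standard forward integrator handle the backward boundary condition; the stored solution then yields S(t), and hence the feedback gain, at any time in [0, T].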
To close this chapter, we wish to emphasize that the class of linear
quadratic optimal control problems is one of the very few cases that can
be solved analytically via the dynamic programming approach.

6.9 Exercises

6.9.1 Show the validity of (6.2.9) from (6.2.8).

6.9.2 Show that Equation (6.3.9) is valid.

6.9.3 Show that Equation (6.3.11) is valid.

6.9.4 Show that Equation (6.3.27) is valid.


6.9.5 A continuous time controlled process is described by

dx(t)/dt = u(t)
with the initial condition
x(0) = 1,

and the terminal condition


x(1) = free.
The objective is to find a real-valued control u defined on [0, 1], which mini-
mizes the cost functional
g0(u) = ∫_0^1 [ (x(t))^2 + (u(t))^2 ] dt.

(i) Write down the Hamiltonian for the problem.

(ii) Show that the optimal control can be expressed as a function of the
adjoint variable λ by the relation

u*(t) = −λ(t)/2.
(iii) Write down the corresponding two-point boundary-value problem.

(iv) Show that the solution for λ is

λ(t) = 2( −e^{−(1−t)} + e^{(1−t)} ) / ( e + e^{−1} ).
(v) Determine the optimal control and the optimal trajectory.

6.9.6 Use an appropriate version of the Pontryagin Maximum Principle to


solve the problem of minimizing

g0 (u) = T

subject to

dx(t)/dt = −x(t) + u(t),  t ∈ [0, T),
x(0) = 1,
x(T) = 0,

and
|u(t)| ≤ 1, for all t ∈ [0, T ].

6.9.7 Use the Pontryagin Maximum Principle to solve the problem of min-
imizing

g0(u) = ∫_0^1 (x(t))^2 dt
subject to

dx(t)/dt = −x(t) + u(t),  t ∈ [0, 1),
x(0) = 1,
x(1) = 0,

and
|u(t)| ≤ 1, for all t ∈ [0, 1].

6.9.8 Consider the following problem.


 1
minimize g0 (u) = (u(t))2 dt
0

subject to

dx(t)/dt = u(t)
x(0) = a
x(1) = 0.

Use Pontryagin Maximum Principle to show that

x∗ (t) = a(1 − t).

6.9.9 Use Pontryagin Maximum Principle to solve the following problem.


minimize g0(u) = ∫_0^1 (x(t))^2 dt

subject to

dx(t)/dt = u(t),  0 < t ≤ 1,
x(0) = 1,
−1 ≤ u(t) ≤ 1.

6.9.10 Use Pontryagin Maximum Principle to solve the following problem.


Given the system
dx(t)/dt = x(t) + u(t)
with the boundary conditions

x(0) = a, x(T ) = 0,

find a real-valued control u, defined in [0, T ], such that the cost functional
g0 = ∫_0^T (u(t))^2 dt

is minimized. Here, T is fixed. What is the optimal trajectory?

6.9.11 A continuous time controlled process has differential equation

dx(t)/dt = u(t)

and we wish to minimize the cost functional

g0 = (1/2) ∫_0^T [ (x(t))^2 + (u(t))^2 ] dt.

Write down the Riccati equation for optimal control and hence find the opti-
mal control law and the optimal trajectory.
6.9.12 Consider the necessary condition (6.4.1e). If the Hamiltonian func-
tion H is continuously differentiable and U = Rr , show that the stationary
condition (6.2.10e) must be satisfied.

6.9.13 Consider the highway construction problem described by (6.5.3).


If the comfort of driving is taken into consideration, then the cost func-
tional (6.5.3a) should be replaced by

g0(u) = ∫_0^2 (u(ℓ))^2 dℓ.

Solve the modified problem by using the Pontryagin Maximum Principle.


6.9.14 Consider a firm that produces a single product and sells it in a mar-
ket that can absorb no more than M dollars of the product per unit time. It is
assumed that if the firm does no advertising, its rate of sales at any point in
time will decrease at a rate proportional to the rate of sales at that time. If the
firm advertises, the rate of change of the rate of sales will have an additional
term that increases proportional to the rate of advertising, but this increase
affects only the share of the market that is not already purchasing the prod-
uct. Define S(t) and A(t) to be the rate of sales and the rate of advertising
at time t, respectively. Let x(t) = S(t)/M, let γ > 0 be a constant, and define
v(t) = A(t)/M. Then, 1 − x(t) represents the share of the market affected by
advertising. Under these assumptions and with a suitable scale for the time
variable, we have
dx(t)/dt = −x(t) + γ v(t)[1 − x(t)].    (6.9.1)
It is assumed that there is an upper bound on the advertising rate, i.e.,

0 ≤ A(t) ≤ Ā. (6.9.2)



This implies that

0 ≤ v(t) ≤ a,    (6.9.3)

where a = Ā/M. The problem is to find an admissible v(t) that will maximize
the profit functional
J(v) = ∫_0^T { x(t) − v(t) } dt    (6.9.4)

subject to the system equation (6.9.1) and the constraint (6.9.3), along with
the initial condition
x(0) = x0 , (6.9.5)
where T is a fixed time.
(a) Write down the problem in the standard form as a minimization problem.

(b) Write down the Hamiltonian for this problem.

(c) Show that the optimal control is of bang-bang type, with

v*(t) = { a, if γ(x(t) − 1)λ(t) > 1;  0, if γ(x(t) − 1)λ(t) < 1;
          undetermined, if γ(x(t) − 1)λ(t) = 1 }.

(d) Write down the corresponding two-point boundary-value problem.

(e) Show that the optimal advertising policy involves not advertising near the
final time t = T .

(f) Assume that the optimal control function has the form

v*(t) = { a, for 0 < t < τ;  0, for τ < t < T },

where τ is a constant yet to be determined. Under this assumption, give


an explicit solution for λ(t) in the region τ < t < T .

(g) Under the same assumption, give an explicit solution for x(t) in the re-
gion 0 < t < τ .

(h) Hence, or otherwise, show that τ is a solution of the equation

γ (1 − e^{−(T−τ)}) [ 1 − (x0 − γa/(1 + γa)) e^{−(1+γa)τ} − γa/(1 + γa) ] = 1.

6.9.15 (i) Given the following dynamical system

dx(t)/dt = u(t)
x(0) = 0.

Let the state x be such that the following constraint is satisfied

x(t) ≥ 0, ∀t ∈ [0, 2].

Find a control u that satisfies the constraints:

−1 ≤ u(t) ≤ 1, ∀t ∈ [0, 2]

such that the cost function


J = ∫_0^2 x(t) dt

is minimized.
(ii) Solve the following optimal control problem

min J = ∫_0^3 e^{−ρt} u(t) dt

subject to
dx1(t)/dt = u1(t),  x1(0) = 4
dx2(t)/dt = u1(t) − u2(t),  x2(0) = 4
0 ≤ ui(t) ≤ 1, ∀t ∈ [0, 3], i = 1, 2
xi(t) ≥ 0, ∀t ∈ [0, 3], i = 1, 2.

6.9.16 (i) Solve the following optimal control problem


min J = ∫_0^{10} { (10 − t) u1(t) + t u2(t) } dt

subject to
dx1(t)/dt = u1(t),  x1(0) = 4
dx2(t)/dt = u1(t) − u2(t),  x2(0) = 4
0 ≤ ui(t) ≤ 1, ∀t ∈ [0, 10], i = 1, 2
xi(t) ≥ 0, ∀t ∈ [0, 10], i = 1, 2.
(ii) Solve the following optimal control problem
min J = ∫_0^5 u(t) dt

subject to

dx(t)/dt = u(t) − x(t)
x(0) = 1
0 ≤ u(t) ≤ 1, ∀t ∈ [0, 5]
x(t) ≥ 0.7 − 0.2t, ∀t ∈ [0, 5].
6.9.17 For the problem:
minimize g0(v) = ∫_0^1 [ (x(t))^2 + (v(t))^2 ] dt

subject to

dx(t)/dt = v(t),
x(0) = 1,
and no constraint on v(t), show that the dynamic programming equation yields
the optimal control policy

v(t) = −[ (e^{2(1−t)} − 1) / (e^{2(1−t)} + 1) ] x(t).
6.9.18 A system follows the equation:

dx(t)/dt = u(t).
The process of ultimate interest starts at time t = 0 and finishes at t = T
with the cost functional:
g0 = (x(T))^2 + ∫_0^T (u(t))^2 dt,

where T is fixed.
(i) Use Bellman’s dynamic programming to find the optimal control u∗ .

(ii) Substitute for the optimal control in the dynamical equation to find
the optimal trajectory x∗ (t) when x(0) = x0 . What is the value of
x∗ (T )?

6.9.19 Consider the following optimal control problem:


minimize g0(u) = (x(T))^2 + ∫_0^T [ (x(t))^2 + (u(t))^2 ] dt

subject to
dx(t)/dt = −x(t) + u(t),  t ∈ [0, T),
where T is fixed.
(i) Use Bellman’s dynamic programming to find the optimal control
u∗ (t).
(ii) Substitute the optimal control into the dynamical equation to find the
optimal trajectory x∗ (t) when x(0) = x0 . What is the value of g0 (u∗ )?
Chapter 7
Gradient Formulae for Optimal
Parameter Selection Problems

7.1 Introduction

The main theme of this chapter is to derive gradient formulae for the cost
and constraint functionals of several types of optimal parameter selection
problems with respect to various types of parameters. An optimal parameter
selection problem can be regarded as a special type of optimal control prob-
lem in which the controls are restricted to be constant functions of time.
Many optimization problems involving dynamical systems can be formulated
as optimal parameter selection problems. Examples include parameter identi-
fication problems and controller parameter design problems. In fact, optimal
control problems with their control being parametrized are essentially reduced
to optimal parameter selection problems. Thus, when developing numerical
techniques for solving complex optimal control problems, it is essential to
ensure the solvability of the resulting optimal parameter selection problems.
In Section 7.2, we consider a class of optimal parameter selection problems
where the system dynamics are described by ordinary differential equations
without time-delay arguments. Gradient formulae for the cost functional as
well as for the constraint functionals are then derived. With these gradient
formulae, the optimal parameter selection problems can be readily solved us-
ing gradient-based mathematical programming algorithms, such as sequential
quadratic programming (see Section 3.5).
In Sections 7.3–7.4, we consider a class of optimal control problems in
which the control takes a special structure. To be more precise, for each
k = 1, . . . , r, the k−th component of the control function is a piecewise con-
stant function over the planning horizon [0, T ] with jumps at tk1 , . . . , tkMk .
In Section 7.3, only the heights of the piecewise constant control func-
tions are regarded as decision parameters and gradient formulae of the cost
and constraint functionals with respect to these heights are derived on the


basis of the formulae in Section 7.2. In Section 7.4, the switching times of
the piecewise constant control functions are regarded as decision parameters.
Gradient formulae of the cost functional as well as the constraint function-
als with respect to these switching times are then derived. However, these
gradient formulae may not exist when two or more switching times collapse
into one during the optimization process [148, 151, 168]. Thus, optimization
techniques based on these gradient formulae are often ineffective. To over-
come this difficulty, the concept of a time scaling transformation, which was
originally called the control parametrization enhancing transform (CPET),
is introduced. More specifically, by applying the time scaling transformation,
the switching times are mapped into fixed knots. The equivalent transformed
problem is a standard optimal parameter selection problem of the form con-
sidered in Section 7.2. Later in Section 7.4, we consider a class of combined
optimal parameter selection and optimal control problems in which the con-
trol takes the same structure as that of Section 7.3 with variable heights and
variable switching times. After applying the time scaling transformation, gra-
dient formulae of the cost and constraint functionals in the transformed prob-
lem with respect to these heights, switching times and system parameters are
summarized. Furthermore, we consider the applications of these formulae to
the special cases of discrete valued optimal control problems as well as the
optimal control of switched systems in Section 7.4.
In Section 7.5, we extend the results of Section 7.2 to the case involving
time-delayed arguments. In Section 7.6, we consider a class of optimal pa-
rameter selection problems with multiple characteristic time points in the
cost and constraint functionals. The main references for this chapter are
[89, 125, 142, 148, 168, 215] and Chapter 5 of [253].

7.2 Optimal Parameter Selection Problems

Consider a process described by the following system of differential equations


on the fixed time interval (0, T ].

dx(t)/dt = f(t, x(t), ζ),    (7.2.1a)

where x = [x1, . . . , xn]⊤ ∈ Rn and ζ = [ζ1, . . . , ζs]⊤ ∈ Rs are, respec-
tively, the state and system parameter vectors, and f = [f1, . . . , fn]⊤ :
[0, T] × Rn × Rs → Rn. The initial condition for the system of differential
equations (7.2.1a) is

x(0) = x^0(ζ),    (7.2.1b)

where x^0 = [x_1^0, . . . , x_n^0]⊤ ∈ Rn is a given vector valued function of the
system parameter vector ζ.

Define

Z = {ζ = [ζ1, . . . , ζs]⊤ ∈ Rs : ai ≤ ζi ≤ bi, i = 1, . . . , s},    (7.2.2)

where ai and bi , i = 1, . . . , s, are real numbers. Clearly, Z is a compact and


convex subset of Rs . For each ζ ∈ Rs , let x(· | ζ) be the corresponding
solution of the system (7.2.1). We may now define an optimal parameter
selection problem as follows.
Problem (P 1): Given the system (7.2.1), find a system parameter vector ζ
∈ Z such that the cost functional
g0(ζ) = Φ0(x(T | ζ), ζ) + ∫_0^T L0(t, x(t | ζ), ζ) dt    (7.2.3)

is minimized subject to the equality constraints (in canonical form)

gi(ζ) = Φi(x(τi | ζ), ζ) + ∫_0^{τi} Li(t, x(t | ζ), ζ) dt = 0,  i = 1, . . . , Ne,    (7.2.4a)

and subject to the inequality constraints (in canonical form)

gi(ζ) = Φi(x(τi | ζ), ζ) + ∫_0^{τi} Li(t, x(t | ζ), ζ) dt ≤ 0,  i = Ne + 1, . . . , N.    (7.2.4b)

Here, Φi : Rn × Rs → R and Li : [0, T ] × Rn × Rs → R, i = 0, 1, . . . , N,


are given real-valued functions and, for each i = 0, 1, . . . , N, τi is referred
to as the characteristic time for the i−th constraint with 0 < τi ≤ T . For
subsequent notational convenience, we also define τ0 = T .
We assume throughout that the following conditions are satisfied.
Assumption 7.2.1 For each i = 0, . . . , N, and for each compact subset V
of Rs , there exists a positive constant K such that, for all (t, x, ζ) ∈ [0, T ] ×
Rn × V ,
|f (t, x, ζ)| ≤ K(1 + |x|)
and
|Li (t, x, ζ)| ≤ K(1 + |x|).
Assumption 7.2.2 f and Li , i = 0, 1, . . . , N, together with their partial
derivatives with respect to each of the components of x and ζ are piecewise
continuous on [0, T ] for each (x, ζ) ∈ Rn × Rs , and continuous on Rn × Rs
for each t ∈ [0, T ].

Assumption 7.2.3 Φi , i = 0, 1, . . . , N, are continuously differentiable with


respect to x and ζ. Furthermore, x0 is continuously differentiable with respect
to ζ.

Remark 7.2.1 From the theory of differential equations, we note that the
system (7.2.1) admits a unique solution, x(· | ζ), corresponding to each ζ ∈
Z.

Remark 7.2.2 The constraints given by (7.2.4a) and (7.2.4b) are said to be
in canonical form.

Remark 7.2.3 For each i = 1, . . . , N, the presence of a characteristic time


τi for the i−th constraint is to cater for the possibility of an interior point
constraint.

7.2.1 Gradient Formulae

To solve Problem (P 1) as a mathematical programming problem, we require


the gradient of the functional gi for each i = 0, 1, . . . , N . Since the func-
tionals depend implicitly on the system parameter vector via the dynamical
system, the derivation of the gradients requires some care. The two common
approaches are known as the variational (or sensitivity) method (see, for ex-
ample, [111, 142, 144, 151, 158, 159, 169, 268]) and the costate method (see,
for example, [75, 89, 148, 164, 168, 169, 180, 181, 238, 245, 246, 253]). We
illustrate both methods below.
In the variational method, we first define an auxiliary dynamic system for
each component of ζ. For k = 1, . . . , s, consider

dψ^k(t)/dt = [∂f(t, x(t | ζ), ζ)/∂x] ψ^k(t) + ∂f(t, x(t | ζ), ζ)/∂ζk,  t ∈ [0, T],    (7.2.5)
along with the initial condition

ψ^k(0) = ∂x^0(ζ)/∂ζk.    (7.2.6)

For each ζ ∈ Z, let ψ^k(· | ζ) denote the corresponding solution of (7.2.5)–
(7.2.6).
Theorem 7.2.1 For each ζ ∈ Z,

∂x(t | ζ)/∂ζk = ψ^k(t),  t ∈ [0, T].    (7.2.7)

Proof. For any t ∈ [0, T ], the solution of (7.2.1) corresponding to ζ ∈ Z may


be written as
x(t | ζ) = x^0(ζ) + ∫_0^t f(s, x(s | ζ), ζ) ds.    (7.2.8)

Differentiating both sides of (7.2.8) with respect to ζk, we obtain

∂x(t | ζ)/∂ζk = ∂x^0(ζ)/∂ζk
    + ∫_0^t [ (∂f(s, x(s | ζ), ζ)/∂x)(∂x(s | ζ)/∂ζk) + ∂f(s, x(s | ζ), ζ)/∂ζk ] ds.    (7.2.9)
Substituting t = 0 into (7.2.9) gives

∂x(0 | ζ)/∂ζk = ∂x^0(ζ)/∂ζk,    (7.2.10)

while differentiating both sides of (7.2.9) with respect to t yields

(d/dt)[∂x(t | ζ)/∂ζk] = (∂f(t, x(t | ζ), ζ)/∂x)(∂x(t | ζ)/∂ζk)
                        + ∂f(t, x(t | ζ), ζ)/∂ζk.    (7.2.11)

Noting the equivalence of (7.2.11) and (7.2.5) as well as that of (7.2.10)
and (7.2.6), the result follows.

Using (7.2.7), the gradients of the cost and constraint functionals can then
be calculated on the basis of the chain rule of differentiation as follows. For
each i = 0, 1, . . . , N ,

∂gi(ζ)/∂ζk = ∂Φi(x(τi | ζ), ζ)/∂ζk + [∂Φi(x(τi | ζ), ζ)/∂x][∂x(τi | ζ)/∂ζk]
    + ∫_0^{τi} { [∂Li(t, x(t | ζ), ζ)/∂x][∂x(t | ζ)/∂ζk] + ∂Li(t, x(t | ζ), ζ)/∂ζk } dt
  = ∂Φi(x(τi | ζ), ζ)/∂ζk + [∂Φi(x(τi | ζ), ζ)/∂x] ψ^k(τi | ζ)
    + ∫_0^{τi} { [∂Li(t, x(t | ζ), ζ)/∂x] ψ^k(t | ζ) + ∂Li(t, x(t | ζ), ζ)/∂ζk } dt.    (7.2.12)
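The following sketch implements the variational method for a deliberately simple instance of Problem (P1); the toy data (dx/dt = −ζ1 x, x(0) = ζ2, g0(ζ) = ∫_0^1 x(t)^2 dt, so Φ0 = 0 and L0 = x^2) are assumed for illustration only. The sensitivity systems (7.2.5)–(7.2.6) and the gradient integrands of (7.2.12) are integrated together with the state:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy instance (assumed for illustration): dx/dt = -zeta1*x, x(0) = zeta2,
# g0(zeta) = int_0^1 x(t)^2 dt, so Phi_0 = 0 and L_0 = x^2.
def augmented(t, z, zeta):
    x, psi1, psi2, q1, q2 = z
    fx = -zeta[0]                  # df/dx
    return [-zeta[0] * x,          # state equation (7.2.1a)
            fx * psi1 - x,         # (7.2.5) for k = 1: df/dzeta1 = -x
            fx * psi2,             # (7.2.5) for k = 2: df/dzeta2 = 0
            2.0 * x * psi1,        # gradient integrand of (7.2.12), k = 1
            2.0 * x * psi2]        # gradient integrand of (7.2.12), k = 2

def grad_g0(zeta):
    # psi^1(0) = dx^0/dzeta1 = 0 and psi^2(0) = dx^0/dzeta2 = 1 by (7.2.6).
    sol = solve_ivp(augmented, [0.0, 1.0], [zeta[1], 0.0, 1.0, 0.0, 0.0],
                    args=(zeta,), rtol=1e-10, atol=1e-12)
    return sol.y[3, -1], sol.y[4, -1]

print(grad_g0([1.0, 1.0]))   # compare against a finite-difference check
```

Appending the gradient integrands as extra states avoids a separate quadrature pass; a finite-difference check of the returned gradient is a useful sanity test in practice.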

The gradient derivation via the costate method is carried out as follows.
For each i = 0, 1, . . . , N, let the corresponding Hamiltonian Hi be defined by
 
Hi(t, x, ζ, λ) = Li(t, x, ζ) + λ⊤ f(t, x, ζ),    (7.2.13)

where, for each ζ ∈ Rs, λ^i(t) is governed by the system

dλ^i(t)/dt = −[∂Hi(t, x(t | ζ), ζ, λ^i(t))/∂x]⊤,  t ∈ [0, τi),    (7.2.14a)

with the condition

λ^i(τi) = [∂Φi(x(τi | ζ), ζ)/∂x]⊤.    (7.2.14b)

Here, x(· | ζ) denotes the solution of the system (7.2.1) corresponding to ζ ∈


Rs , the right hand side of (7.2.14a) denotes the transpose of the gradient of Hi
with respect to x evaluated at x(t | ζ), and the right hand side of (7.2.14b)
is to be understood similarly. The system (7.2.14) is known as the costate
system. Let λi (· | ζ) be the solution of this costate system corresponding to
ζ ∈ Rs .
Theorem 7.2.2 Consider Problem (P 1). For each i = 0, 1, . . . , N , the gra-
dient of the functional gi is given by

∂gi(ζ)/∂ζ = ∂Φi(x(τi | ζ), ζ)/∂ζ + (λ^i(0 | ζ))⊤ ∂x^0(ζ)/∂ζ
    + ∫_0^{τi} ∂Hi(t, x(t | ζ), ζ, λ^i(t | ζ))/∂ζ dt.    (7.2.15)

Proof. Let ζ ∈ Rs be given and let ρ ∈ Rs be arbitrary but fixed. Define

ζ(ε) = ζ + ερ, (7.2.16)

where ε > 0 is an arbitrarily small real number. For brevity, let x(·) and
x(·; ε) denote, respectively, the solution of the system (7.2.1) corresponding
to ζ and ζ(ε). Clearly, from (7.2.1), we have
x(t) = x^0(ζ) + ∫_0^t f(s, x(s), ζ) ds    (7.2.17)

and

x(t; ε) = x^0(ζ(ε)) + ∫_0^t f(s, x(s; ε), ζ(ε)) ds.    (7.2.18)
Thus,

Δx(t) = dx(t; ε)/dε |_{ε=0}
      = (∂x^0(ζ)/∂ζ) ρ + ∫_0^t [ (∂f(s, x(s), ζ)/∂x) Δx(s) + (∂f(s, x(s), ζ)/∂ζ) ρ ] ds.    (7.2.19)

Clearly,

d(Δx(t))/dt = (∂f(t, x(t), ζ)/∂x) Δx(t) + (∂f(t, x(t), ζ)/∂ζ) ρ,    (7.2.20a)
Δx(0) = (∂x^0(ζ)/∂ζ) ρ.    (7.2.20b)

Now gi(ζ(ε)) can be expressed as

gi(ζ(ε)) = Φi(x(τi; ε), ζ(ε)) + ∫_0^{τi} { Hi(t, x(t; ε), ζ(ε), λ^i(t))
           − (λ^i(t))⊤ f(t, x(t; ε), ζ(ε)) } dt,    (7.2.21)

where λi is as yet arbitrary. Thus



Δgi(ζ) = dgi(ζ(ε))/dε |_{ε=0}
       = (∂gi(ζ)/∂ζ) ρ
       = ΔΦi(x(τi), ζ) + ∫_0^{τi} { ΔHi(t, x(t), ζ, λ^i(t))
         − (λ^i(t))⊤ Δf(t, x(t), ζ) } dt,    (7.2.22)

where

ΔΦi(x(τi), ζ) = (∂Φi(x(τi), ζ)/∂x) Δx(τi) + (∂Φi(x(τi), ζ)/∂ζ) ρ,    (7.2.23)
Δf(t, x(t), ζ) = d(Δx(t))/dt,    (7.2.24)
and

ΔHi(t, x(t), ζ, λ^i(t))
    = (∂Hi(t, x(t), ζ, λ^i(t))/∂x) Δx(t) + (∂Hi(t, x(t), ζ, λ^i(t))/∂ζ) ρ.    (7.2.25)

Choose λ^i to be the solution of the costate system (7.2.14) corresponding to
ζ. Then, by substituting (7.2.14a) into (7.2.25), we obtain

ΔHi(t, x(t), ζ, λ^i(t))
    = −[d(λ^i(t))/dt]⊤ Δx(t) + (∂Hi(t, x(t), ζ, λ^i(t))/∂ζ) ρ.    (7.2.26)

Hence, (7.2.22) yields

∂gi (ζ) ∂Φi (x(τi ), ζ) ∂Φi (x(τi ), ζ)


ρ= x(τi ) + ρ
∂ζ ∂x ∂ζ
 τi 4 ' ∂H t, x(t), ζ, λi (t)
5
d & i   i
+ − λ (t) x(t) + ρ dt
0 dt ∂ζ
224 7 Gradient Formulae for Optimal Parameter Selection Problems

∂Φi (x(τi ), ζ) ∂Φi (x(τi ), ζ)  


= x(τi ) + ρ − λi (τi ) x(τi )
∂x ∂ζ
 τi 4   5
 i  ∂Hi t, x(t), ζ, λi (t)
+ λ (0) x(0) + ρ dt. (7.2.27)
0 ∂ζ

Substituting (7.2.14b) and (7.2.20b) into (7.2.27), we have

∂gi (ζ) ∂Φi (x(τi ), ζ) ∂x0 (ζ)


ρ= ρ + (λi (0)) ρ
∂ζ ∂ζ ∂ζ
 τi 4   5
∂Hi t, x(t), ζ, λi (t)
+ ρ dt. (7.2.28)
0 ∂ζ

Since ρ is arbitrary, (7.2.15) follows readily from (7.2.28) and the proof is
complete.

Remark 7.2.4 The choice between gradient formulae (7.2.12) and (7.2.15)
is problem dependent. If the number of system parameters, s, is large, the
use of (7.2.12) requires the solution of a large number of auxiliary systems
compared to the use of (7.2.15), which requires the solution of just one costate
system. On the other hand, the costate system associated with (7.2.15) needs
to be solved backwards along the time horizon and requires the solution of the
state dynamics to be stored beforehand, whereas (7.2.12) can be solved forward
in time alongside the state dynamics. Nevertheless, the use of (7.2.15) is
generally more efficient in a computational sense and hence is described in
more detail in the next subsection.

7.2.2 A Unified Computational Approach

The optimal parameter selection problem, Problem (P1), is essentially a nonlinear mathematical programming problem which is typically solved by a gradient-based numerical algorithm. At each iteration of such an algorithm, it is necessary to compute the value of the cost functional and the values of all the constraint functionals, as well as their respective gradients. We shall do this in a unified manner. To be more precise, the cost functional and all the constraint functionals are treated in the same way insofar as the computation of their values and their respective gradients is concerned. To compute these values and gradients, the first task is to calculate the solution of the system (7.2.1) corresponding to each ζ ∈ Z. This is presented as an algorithm for future reference.
Algorithm 7.2.1 For each given ζ ∈ Z, compute the solution x(· | ζ) of the
system (7.2.1) by solving the differential equations (7.2.1a) forward in time
from t = 0 to t = T with the initial condition (7.2.1b).

With the information obtained in Algorithm 7.2.1, the values of g_i(ζ),


i = 0, 1, . . . , N , corresponding to each ζ ∈ Z can be easily calculated by the
following simple algorithm.
Algorithm 7.2.2 For each given ζ ∈ Z, compute the corresponding value of
gi (ζ) from (7.2.3) (respectively, (7.2.4)) if i = 0 (respectively, i = 1, . . . , N ).

In view of Theorem 7.2.2, we see that the derivations of the gradient


formulae for the cost functional and the canonical constraint functionals are
the same. For each i = 0, 1, . . . , N , the gradient of the corresponding gi (ζ)
may be computed using the following algorithm.
Algorithm 7.2.3 For a given ζ ∈ Z.
Step 1 Solve the costate differential equation (7.2.14) backward in time from
t = τi to t = 0, where x(· | ζ) is the solution of the system (7.2.1)
corresponding to ζ ∈ Z. Let λi (· | ζ) be the solution of the costate
system (7.2.14).

Step 2 The gradient of gi (ζ) with respect to ζ ∈ Z is computed from (7.2.15).

Remark 7.2.5 In the above algorithm, the solution x(· | ζ) of the sys-
tem (7.2.1) corresponding to each ζ ∈ Z is computed by Algorithm 7.2.1.
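The unified procedure of Algorithms 7.2.1–7.2.3 can be sketched in a few lines of Python for the same scalar example used above (again, this is our illustration under the stated assumptions, not code from the original text): the state is solved forward, the costate (7.2.14) is solved backward, and the gradient is assembled from (7.2.15).

```python
# Illustrative sketch of Algorithms 7.2.1-7.2.3 for
#     dx/dt = -zeta_1 * x + zeta_2,  x(0) = 1,
#     g(zeta) = x(T)^2 + int_0^T x(t)^2 dt,
# so H = x^2 + lambda * (-zeta_1 * x + zeta_2)  by (7.2.13).
import numpy as np
from scipy.integrate import solve_ivp, quad

T, zeta = 2.0, np.array([0.8, 0.5])

# Algorithm 7.2.1: solve the state system forward in time.
state = solve_ivp(lambda t, x: [-zeta[0]*x[0] + zeta[1]], (0.0, T), [1.0],
                  dense_output=True, rtol=1e-10, atol=1e-12)
x_of = lambda t: state.sol(t)[0]

# Algorithm 7.2.2: the value of g(zeta) by quadrature.
g = x_of(T)**2 + quad(lambda t: x_of(t)**2, 0.0, T)[0]

# Algorithm 7.2.3, Step 1: costate (7.2.14) backward from t = T,
# dlambda/dt = -dH/dx = -(2 x - zeta_1 * lambda),  lambda(T) = 2 x(T).
co = solve_ivp(lambda t, lam: [-(2.0*x_of(t) - zeta[0]*lam[0])],
               (T, 0.0), [2.0*x_of(T)], dense_output=True,
               rtol=1e-10, atol=1e-12)
lam_of = lambda t: co.sol(t)[0]

# Algorithm 7.2.3, Step 2: gradient (7.2.15). Here x0 does not depend on
# zeta, so only the integral of dH/dzeta = lambda * df/dzeta remains.
grad = np.array([quad(lambda t: lam_of(t)*(-x_of(t)), 0.0, T)[0],
                 quad(lambda t: lam_of(t)*1.0, 0.0, T)[0]])
print(g, grad)   # matches the variational result computed earlier
```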

7.3 Control Parametrization

The most basic form of control parametrization assumes a piecewise constant


form for each control function. In this section, we describe a class of optimal
control problems with piecewise constant controls and derive the gradients
of the cost and constraint functionals with respect to the heights of the con-
trols. Consider the process described by the following system of nonlinear
differential equations on the fixed time interval (0, T ]:

$$\frac{dx(t)}{dt} = f(t, x(t), u(t)), \tag{7.3.1a}$$

where x = [x_1, …, x_n]^⊤ ∈ R^n and u = [u_1, …, u_r]^⊤ ∈ R^r are, respectively, the state and control vectors. The initial condition for the differential equation (7.3.1a) is

$$x(0) = x^0, \tag{7.3.1b}$$

where x^0 ∈ R^n is a given vector.


For each k = 1, …, r, the k-th component, u_k, of the control u is a piecewise constant function over the interval [0, T] with possible jumps at t_1, …, t_M. In other words, it takes a constant value until the next switching time is reached, at which point the value changes instantaneously to another constant, which is held until the next switching time. Mathematically, u_k may be expressed as

$$u_k(t) = \sum_{j=0}^{M} h_k^j\, \chi_{[t_j, t_{j+1})}(t), \tag{7.3.2}$$

where

$$\chi_I(t) = \begin{cases} 1, & \text{if } t \in I, \\ 0, & \text{otherwise}, \end{cases} \tag{7.3.3}$$

and h_k^j, j = 0, 1, …, M, are decision parameters satisfying

$$a_k \le h_k^j \le b_k, \quad k = 1, \ldots, r, \; j = 0, 1, \ldots, M, \tag{7.3.4}$$

where a_k and b_k, k = 1, …, r, are given constants. Furthermore, t_0 = 0, t_{M+1} = T, and the given switching times t_j, j = 1, …, M, satisfy

$$0 \le t_1 \le \cdots \le t_M \le T. \tag{7.3.5}$$

Let

$$h = \left[(h^0)^\top, (h^1)^\top, \ldots, (h^M)^\top\right]^\top \in \mathbb{R}^{(M+1)r}, \tag{7.3.6}$$

where

$$h^j = \left[h_1^j, \ldots, h_r^j\right]^\top \in \mathbb{R}^r, \quad j = 0, 1, \ldots, M. \tag{7.3.7}$$

Let Λ be the set of all those decision parameter vectors h which satisfy (7.3.4).
For convenience, for any h ∈ Λ, the corresponding control is written as
u(· | h). Let U denote the set of all such controls. Clearly, each control in U
is determined uniquely by a decision parameter vector h in Λ and vice versa.
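For implementation, evaluating a control of the form (7.3.2) amounts to locating the interval [t_j, t_{j+1}) containing t. A minimal Python helper (ours, purely illustrative; all names are our own) is:

```python
# Evaluate a piecewise constant control of the form (7.3.2).
import numpy as np

def control(t, switch_times, heights):
    """u(t | h) for t in [0, T). switch_times = [t_1, ..., t_M] (sorted);
    heights[j] = h^j, j = 0, ..., M (each a vector in R^r)."""
    # index j such that t lies in [t_j, t_{j+1}), with t_0 = 0, t_{M+1} = T
    j = np.searchsorted(switch_times, t, side='right')
    return heights[j]

# Example: r = 1, M = 2, T = 1, switches at 0.3 and 0.7.
heights = np.array([[1.0], [-0.5], [2.0]])
print(control(0.1, [0.3, 0.7], heights),   # h^0
      control(0.5, [0.3, 0.7], heights),   # h^1
      control(0.9, [0.3, 0.7], heights))   # h^2
```

Using side='right' in the search makes the control right-continuous, matching the half-open intervals [t_j, t_{j+1}) in (7.3.2).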
Let x(· | h) denote the solution of the system (7.3.1) corresponding to
u(· | h) ∈ U (and hence corresponding to h ∈ Λ). We may now state the
optimal control problem as follows.
Problem (P2): Given the system (7.3.1), find a decision parameter vector h ∈ Λ such that the cost functional

$$g_0(h) = \Phi_0(x(T \mid h)) + \int_0^T L_0(t, x(t \mid h), u(t \mid h))\, dt \tag{7.3.8}$$

is minimized subject to the equality constraints (in canonical form)

$$g_i(h) = \Phi_i(x(\tau_i \mid h)) + \int_0^{\tau_i} L_i(t, x(t \mid h), u(t \mid h))\, dt = 0, \quad i = 1, \ldots, N_e, \tag{7.3.9a}$$

and the inequality constraints (in canonical form)

$$g_i(h) = \Phi_i(x(\tau_i \mid h)) + \int_0^{\tau_i} L_i(t, x(t \mid h), u(t \mid h))\, dt \le 0, \quad i = N_e + 1, \ldots, N, \tag{7.3.9b}$$

where Φ_i : R^n → R and L_i : [0, T] × R^n × R^r → R, i = 0, 1, …, N, are given real-valued functions.
We assume that the corresponding versions of Assumptions 7.2.1, 7.2.3
and 7.4.1 are satisfied throughout this section.
Problem (P2) is, in essence, a special case of Problem (P1) where the system parameter vector is now the vector h of control heights. In view of (7.3.2), we may write the dynamics (7.3.1a) in the following more detailed form:

$$\frac{dx(t)}{dt} = f\big(t, x(t), h^j\big), \quad t \in [t_j, t_{j+1}), \; j = 0, \ldots, M. \tag{7.3.10}$$

Similarly, for each i = 0, 1, …, N, the corresponding functional defined by (7.3.8)–(7.3.9) may be written as

$$g_i(h) = \Phi_i(x(\tau_i \mid h)) + \sum_{j=0}^{M_i - 1} \int_{t_j}^{t_{j+1}} L_i\big(t, x(t \mid h), h^j\big)\, dt + \int_{t_{M_i}}^{\tau_i} L_i\big(t, x(t \mid h), h^{M_i}\big)\, dt, \tag{7.3.11}$$

where τ_0 = T and M_i ∈ {0, 1, …, M} is such that τ_i ∈ [t_{M_i}, t_{M_i+1}).


To derive the gradients of each functional with respect to the control heights, we can once again follow either the variational or the costate approach, as we did in Section 7.2. All the formulae we derived in Section 7.2 remain valid if we replace ζ with h. However, due to the special forms of (7.3.10) and (7.3.11), we can state the gradient formulae with respect to h in a manner that allows for a more efficient computational implementation.

For each j = 0, …, M and each k = 1, …, r, consider the variational system

$$\frac{d\psi_k^j(t)}{dt} = \frac{\partial f(t, x(t \mid h), u(t \mid h))}{\partial x}\,\psi_k^j(t) + \frac{\partial f(t, x(t \mid h), u(t \mid h))}{\partial h_k^j}, \quad t \in [0, T], \tag{7.3.12}$$

along with the initial condition

$$\psi_k^j(0) = 0. \tag{7.3.13}$$

In view of (7.3.10), this variational system may also be written as

$$\frac{d\psi_k^j(t)}{dt} = \frac{\partial f\big(t, x(t \mid h), h^l\big)}{\partial x}\,\psi_k^j(t) + \frac{\partial f\big(t, x(t \mid h), h^l\big)}{\partial h_k^j} = \frac{\partial f\big(t, x(t \mid h), h^l\big)}{\partial x}\,\psi_k^j(t) + \frac{\partial f(t, x(t \mid h), u(t \mid h))}{\partial u_k}\,\delta_{jl}, \quad t \in [t_l, t_{l+1}), \; l = j, \ldots, M, \tag{7.3.14}$$

with

$$\psi_k^j(t) = 0, \quad t \in [0, t_j), \tag{7.3.15}$$

where

$$\delta_{jl} = \begin{cases} 1, & \text{if } j = l, \\ 0, & \text{otherwise}. \end{cases}$$

In other words, the integration to determine ψ_k^j(·) only needs to commence at t = t_j. Let ψ_k^j(t | h) denote the solution of (7.3.14)–(7.3.15) corresponding to h ∈ Λ. We may then state the following theorem.
Theorem 7.3.1 For each i ∈ {0, 1, …, N}, the gradient of the functional g_i with respect to h_k^j, k = 1, …, r, j = 0, …, M, is given by

$$\begin{aligned}
\frac{\partial g_i(h)}{\partial h_k^j} &= \frac{\partial \Phi_i(x(\tau_i \mid h))}{\partial x}\,\psi_k^j(\tau_i) + \int_0^{\tau_i} \left\{ \frac{\partial L_i(t, x(t \mid h), u(t \mid h))}{\partial x}\,\psi_k^j(t) + \frac{\partial L_i(t, x(t \mid h), u(t \mid h))}{\partial h_k^j} \right\} dt \\
&= \frac{\partial \Phi_i(x(\tau_i \mid h))}{\partial x}\,\psi_k^j(\tau_i) + \sum_{l=0}^{M_i-1}\int_{t_l}^{t_{l+1}} \left\{ \frac{\partial L_i\big(t, x(t \mid h), h^l\big)}{\partial x}\,\psi_k^j(t) + \frac{\partial L_i\big(t, x(t \mid h), h^l\big)}{\partial h_k^j} \right\} dt \\
&\quad + \int_{t_{M_i}}^{\tau_i} \left\{ \frac{\partial L_i\big(t, x(t \mid h), h^{M_i}\big)}{\partial x}\,\psi_k^j(t) + \frac{\partial L_i\big(t, x(t \mid h), h^{M_i}\big)}{\partial h_k^j} \right\} dt \\
&= \frac{\partial \Phi_i(x(\tau_i \mid h))}{\partial x}\,\psi_k^j(\tau_i) + \sum_{l=0}^{M_i-1}\int_{t_l}^{t_{l+1}} \frac{\partial L_i\big(t, x(t \mid h), h^l\big)}{\partial x}\,\psi_k^j(t)\, dt + \int_{t_{M_i}}^{\tau_i} \frac{\partial L_i\big(t, x(t \mid h), h^{M_i}\big)}{\partial x}\,\psi_k^j(t)\, dt \\
&\quad + \begin{cases} \displaystyle\int_{t_j}^{t_{j+1}} \frac{\partial L_i\big(t, x(t \mid h), h^j\big)}{\partial h_k^j}\, dt, & \text{if } [t_j, t_{j+1}] \subset [0, t_{M_i}], \\[2mm] \displaystyle\int_{t_{M_i}}^{\tau_i} \frac{\partial L_i\big(t, x(t \mid h), h^j\big)}{\partial h_k^j}\, dt, & \text{if } [t_j, t_{j+1}] = [t_{M_i}, t_{M_i+1}], \\[2mm] 0, & \text{if } [t_j, t_{j+1}] \subset [\tau_i, T], \end{cases}
\end{aligned} \tag{7.3.16}$$

where the subsequent equalities follow due to (7.3.11), and τ_0 and M_i are the same as defined in that equation.

The costate approach to determine the gradients also follows from the corresponding results in Section 7.2. Replacing ζ with h in Equations (7.2.13)–(7.2.15), we obtain the following results. For each i = 0, 1, …, N, let the Hamiltonian H_i be defined by

$$H_i\big(t, x(t \mid h), u(t \mid h), \lambda^i(t)\big) = L_i(t, x(t \mid h), u(t \mid h)) + \big(\lambda^i(t)\big)^\top f(t, x(t \mid h), u(t \mid h)), \tag{7.3.17}$$

where, for each h ∈ Λ, λ^i(·) is governed by the system

$$\frac{d\lambda^i(t)}{dt} = -\left[\frac{\partial H_i\big(t, x(t \mid h), u(t \mid h), \lambda^i(t)\big)}{\partial x}\right]^\top, \quad t \in [0, \tau_i), \tag{7.3.18a}$$

with the condition

$$\lambda^i(\tau_i) = \left[\frac{\partial \Phi_i(x(\tau_i \mid h))}{\partial x}\right]^\top. \tag{7.3.18b}$$

Let λ^i(· | h) denote the solution of this costate system corresponding to h ∈ Λ. Note that, by virtue of (7.3.2), for t ∈ [t_j, t_{j+1}),

$$H_i\big(t, x(t \mid h), u(t \mid h), \lambda^i(t \mid h)\big) = H_i\big(t, x(t \mid h), h^j, \lambda^i(t \mid h)\big).$$

We may then state the gradient formula in the following theorem.


Theorem 7.3.2 For each i = 0, 1, …, N and j = 0, 1, …, M, the gradient of the functional g_i with respect to h^j is given by

$$\frac{\partial g_i(h)}{\partial h^j} = \int_0^{\tau_i} \frac{\partial H_i\big(t, x(t \mid h), u(t \mid h), \lambda^i(t \mid h)\big)}{\partial h^j}\, dt = \begin{cases} \displaystyle\int_{t_j}^{t_{j+1}} \frac{\partial H_i\big(t, x(t \mid h), h^j, \lambda^i(t \mid h)\big)}{\partial h^j}\, dt, & \text{if } [t_j, t_{j+1}] \subset [0, t_{M_i}], \\[2mm] \displaystyle\int_{t_{M_i}}^{\tau_i} \frac{\partial H_i\big(t, x(t \mid h), h^j, \lambda^i(t \mid h)\big)}{\partial h^j}\, dt, & \text{if } [t_j, t_{j+1}] = [t_{M_i}, t_{M_i+1}], \\[2mm] 0, & \text{if } [t_j, t_{j+1}] \subset [\tau_i, T], \end{cases} \tag{7.3.19}$$

where, once again, the latter equality in (7.3.19) follows from (7.3.11), and τ_0 and M_i are the same as defined in that equation.

For implementation purposes, it is also convenient to express these gradients in the following component-wise form:

$$\frac{\partial g_i(h)}{\partial h_k^j} = \begin{cases} \displaystyle\int_{t_j}^{t_{j+1}} \frac{\partial H_i\big(t, x(t \mid h), u(t \mid h), \lambda^i(t \mid h)\big)}{\partial u_k}\, dt, & \text{if } [t_j, t_{j+1}] \subset [0, t_{M_i}], \\[2mm] \displaystyle\int_{t_{M_i}}^{\tau_i} \frac{\partial H_i\big(t, x(t \mid h), u(t \mid h), \lambda^i(t \mid h)\big)}{\partial u_k}\, dt, & \text{if } [t_j, t_{j+1}] = [t_{M_i}, t_{M_i+1}], \\[2mm] 0, & \text{if } [t_j, t_{j+1}] \subset [\tau_i, T]. \end{cases} \tag{7.3.20}$$
Remark 7.3.1 As M is typically much larger than N , the gradients based on
the variational approach require a significantly larger amount of computation
and are hence not used as much in practice.

Remark 7.3.2 We have assumed here that each component of the control u
has the same set of switching times. This is purely for notational convenience.
In practice, there is no reason for this restriction.
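As a concrete illustration of (7.3.20), consider the simple system dx/dt = u(t), x(0) = 0, with g_0(h) = x(T)² (so L_0 = 0 and τ_0 = T) and fixed switching times. The sketch below is our own addition, chosen so that everything can be checked by hand: here H_0 = λu, the costate is constant (dλ/dt = −∂H_0/∂x = 0) with λ(t) = 2x(T), so (7.3.20) gives ∂g_0/∂h^j = 2x(T)(t_{j+1} − t_j).

```python
# Illustrative check of the costate height-gradient formula (7.3.20).
import numpy as np

t_knots = np.array([0.0, 0.3, 0.7, 1.0])       # t_0, t_1, t_2, t_3 = T
h = np.array([1.0, -0.5, 2.0])                 # h^0, h^1, h^2
durations = np.diff(t_knots)

xT = np.dot(h, durations)                      # exact x(T) for dx/dt = u
lam = 2.0 * xT                                 # constant costate value
grad = lam * durations                         # (7.3.20), one entry per h^j
print(xT, grad)

# Finite-difference verification of dg0/dh_j:
eps = 1e-7
for j in range(len(h)):
    hp = h.copy(); hp[j] += eps
    fd = (np.dot(hp, durations)**2 - xT**2) / eps
    assert abs(fd - grad[j]) < 1e-5
```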

7.4 Switching Times as Decision Parameters

Consider once more the dynamics described by (7.3.1) and assume that the control takes the form (7.3.2). In contrast to Section 7.3, we now assume that the control heights are given and that the switching times of the control are decision parameters. In other words, we consider h_k^j, j = 0, 1, …, M, k = 1, …, r, to be given and regard the switching times t_j, j = 1, …, M, as decision parameters. The vector ϑ = [t_1, …, t_M]^⊤ is called the switching vector. Let Ξ be the set which consists of all those vectors ϑ ∈ R^M such that the constraints (7.3.5) are satisfied. Furthermore, let U denote the set of all the corresponding control functions. For convenience, for any ϑ ∈ Ξ, the corresponding control in U is written as u(· | ϑ). Clearly, each control in U is determined uniquely by a switching vector ϑ in Ξ and vice versa.
Let x(· | ϑ) denote the solution of the system (7.3.1) corresponding to the
control u(· | ϑ) ∈ U (and hence to the switching vector ϑ ∈ Ξ). We may
then state the canonical optimal control problem as follows.
Problem (P3): Given the system (7.3.1), find a switching vector ϑ ∈ Ξ such that the cost functional

$$g_0(\vartheta) = \Phi_0(x(T \mid \vartheta)) + \int_0^T L_0(t, x(t \mid \vartheta), u(t \mid \vartheta))\, dt \tag{7.4.1}$$

is minimized subject to the equality constraints

$$g_i(\vartheta) = \Phi_i(x(T \mid \vartheta)) + \int_0^T L_i(t, x(t \mid \vartheta), u(t \mid \vartheta))\, dt = 0, \quad i = 1, \ldots, N_e, \tag{7.4.2a}$$

and the inequality constraints

$$g_i(\vartheta) = \Phi_i(x(T \mid \vartheta)) + \int_0^T L_i(t, x(t \mid \vartheta), u(t \mid \vartheta))\, dt \le 0, \quad i = N_e + 1, \ldots, N, \tag{7.4.2b}$$

where Φ_i : R^n → R and L_i : [0, T] × R^n × R^r → R, i = 0, 1, …, N, are given real-valued functions.
Let Ω be the subset of Ξ which consists of all those switching vectors
ϑ such that the constraints (7.4.2) are satisfied. Furthermore, let F be the
corresponding subset of U .
Remark 7.4.1 Comparing the constraints (7.2.4) with the constraints (7.4.2),
we observe that all the characteristic times in the constraints (7.4.2) are taken
as T . This choice is for the sake of simplicity. In fact, the results to follow
are also valid for the general case, albeit with more cumbersome notation.

We assume that the corresponding versions of Assumptions 7.2.1 and 7.2.3


are satisfied. However, Assumption 7.2.2 needs to be replaced by a stronger
version given below.
Assumption 7.4.1 f and Li , i = 0, 1, . . . , N , together with their partial
derivatives with respect to each of the components of x and u are continuous
on [0, T ] × Rn × Rr .

7.4.1 Gradient Computation

Note that the appearance of the switching times in Problem (P 3) is of a


different nature than the appearance of ζ in Problem (P 1), so we cannot
simply adapt the gradient formulae derived in Section 7.2. Once again, we
derive the new gradient formulae using both the variational and the costate
approach. Throughout this section, recall that, by virtue of (7.3.2), we have

u(t | ϑ) = hl , t ∈ [tl , tl+1 ),

for each l = 0, 1, . . . , M .
For each j = 1, …, M, consider the following variational system:

$$\frac{d\phi^j(t)}{dt} = \frac{\partial f(t, x(t \mid \vartheta), u(t \mid \vartheta))}{\partial x}\,\phi^j(t), \quad t \in (t_j, T), \tag{7.4.3}$$

with jump condition

$$\phi^j\big(t_j^+\big) = f\big(t_j, x(t_j \mid \vartheta), h^{j-1}\big) - f\big(t_j, x(t_j \mid \vartheta), h^j\big) \tag{7.4.4}$$

and initial condition

$$\phi^j(t) = 0, \quad t \in [0, t_j). \tag{7.4.5}$$

In (7.4.4), we set t_j^+ = T if t_j = T. Let φ^j(· | ϑ) denote the unique right-continuous solution of (7.4.3)–(7.4.5).


Theorem 7.4.1 Consider j ∈ {1, …, M} and assume that t_{j−1} < t_j < t_{j+1}. Then, for all time points t ≠ t_j,

$$\frac{\partial x(t \mid \vartheta)}{\partial t_j} = \lim_{\varepsilon \to 0} \frac{x(t \mid \vartheta + \varepsilon e_j) - x(t \mid \vartheta)}{\varepsilon} = \phi^j(t \mid \vartheta), \tag{7.4.6}$$

where e_j denotes the j-th unit basis vector in R^M.

Proof. For any t ∈ [0, T], we have

$$x(t \mid \vartheta) = x^0 + \int_0^t f(s, x(s \mid \vartheta), u(s \mid \vartheta))\, ds. \tag{7.4.7}$$

There are two cases to consider. If t < t_j, then x(t | ϑ) clearly does not depend on t_j and thus

$$\frac{\partial x(t \mid \vartheta)}{\partial t_j} = 0 \quad \text{for all } t < t_j. \tag{7.4.8}$$

Assume now that t > t_j. Then we may rewrite (7.4.7) as follows:

$$x(t \mid \vartheta) = x^0 + \int_0^{t_{j-1}} f(s, x(s \mid \vartheta), u(s \mid \vartheta))\, ds + \int_{t_{j-1}}^{t_j} f\big(s, x(s \mid \vartheta), h^{j-1}\big)\, ds + \begin{cases} \displaystyle\int_{t_j}^{t} f\big(s, x(s \mid \vartheta), h^j\big)\, ds, & \text{if } t \le t_{j+1}, \\[2mm] \displaystyle\int_{t_j}^{t_{j+1}} f\big(s, x(s \mid \vartheta), h^j\big)\, ds + \int_{t_{j+1}}^{t} f(s, x(s \mid \vartheta), u(s \mid \vartheta))\, ds, & \text{otherwise}. \end{cases} \tag{7.4.9}$$

Differentiating both sides of (7.4.9) with respect to t_j, we have

$$\frac{\partial x(t \mid \vartheta)}{\partial t_j} = f\big(t_j, x(t_j \mid \vartheta), h^{j-1}\big) - f\big(t_j, x(t_j \mid \vartheta), h^j\big) + \int_{t_j}^{t} \frac{\partial f(s, x(s \mid \vartheta), u(s \mid \vartheta))}{\partial x}\,\frac{\partial x(s \mid \vartheta)}{\partial t_j}\, ds. \tag{7.4.10}$$

Taking the limit as t approaches t_j from above on both sides of (7.4.10), we obtain

$$\lim_{t \to t_j^+} \frac{\partial x(t \mid \vartheta)}{\partial t_j} = \frac{\partial x\big(t_j^+ \mid \vartheta\big)}{\partial t_j} = f\big(t_j, x(t_j \mid \vartheta), h^{j-1}\big) - f\big(t_j, x(t_j \mid \vartheta), h^j\big). \tag{7.4.11}$$

Differentiating both sides of (7.4.10) with respect to t, we have

$$\frac{d}{dt}\left[\frac{\partial x(t \mid \vartheta)}{\partial t_j}\right] = \frac{\partial f(t, x(t \mid \vartheta), u(t \mid \vartheta))}{\partial x}\,\frac{\partial x(t \mid \vartheta)}{\partial t_j}, \quad t > t_j. \tag{7.4.12}$$

Noting the equivalence of (7.4.3)–(7.4.5) and (7.4.8)–(7.4.12), the conclusion follows. ∎

Remark 7.4.2 Using (7.4.8), it is easy to see that the limit of the state variation as t approaches t_j from below is

$$\lim_{t \to t_j^-} \frac{\partial x(t \mid \vartheta)}{\partial t_j} = \frac{\partial x\big(t_j^- \mid \vartheta\big)}{\partial t_j} = 0.$$

Comparing this with (7.4.11), we can conclude that the state variation with respect to t_j does not exist at t = t_j but has a jump condition there instead. As we show below, however, this does not prevent us from calculating the required gradients of the cost and constraint functionals.

Note that the technical caveats in Theorem 7.4.1 (t ≠ t_j and t_{j−1} < t_j < t_{j+1}) are not needed for the state variation with respect to the control heights. Thus, optimizing the switching times is much more difficult than optimizing the control heights. This is one reason for the popularity of the time scaling transformation to be introduced in the next subsection, which allows one to circumvent the difficulties caused by variable switching times.
Using Theorem 7.4.1, we can derive the partial derivatives of the cost and
constraint functionals with respect to the switching times (assuming that
all switching times are distinct). First, recall that the cost and constraint
functionals are defined by
$$g_i(\vartheta) = \Phi_i(x(T \mid \vartheta)) + \int_0^T L_i(t, x(t \mid \vartheta), u(t \mid \vartheta))\, dt, \quad i = 0, 1, \ldots, N.$$

Using the chain rule of differentiation, we obtain

$$\frac{\partial}{\partial t_j}\left\{\Phi_i(x(T \mid \vartheta))\right\} = \frac{\partial \Phi_i(x(T \mid \vartheta))}{\partial x}\,\frac{\partial x(T \mid \vartheta)}{\partial t_j}. \tag{7.4.13}$$

Furthermore, using the Leibniz rule for differentiating integrals, we obtain

$$\begin{aligned}
\frac{\partial}{\partial t_j}\left\{\int_0^T L_i(t, x(t \mid \vartheta), u(t \mid \vartheta))\, dt\right\} &= \frac{\partial}{\partial t_j}\left\{\sum_{l=0}^{M}\int_{t_l}^{t_{l+1}} L_i\big(t, x(t \mid \vartheta), h^l\big)\, dt\right\} \\
&= L_i\big(t_j, x(t_j \mid \vartheta), h^{j-1}\big) - L_i\big(t_j, x(t_j \mid \vartheta), h^j\big) \\
&\quad + \int_0^T \frac{\partial L_i(t, x(t \mid \vartheta), u(t \mid \vartheta))}{\partial x}\,\frac{\partial x(t \mid \vartheta)}{\partial t_j}\, dt.
\end{aligned} \tag{7.4.14}$$

Applying the Leibniz rule to interchange the order of differentiation and integration is valid here because the partial derivative of x(· | ϑ) with respect to t_j exists at all points in the interior of the interval [t_l, t_{l+1}]. Recall that the Leibniz rule does not require differentiability at the end points. Combining (7.4.13) and (7.4.14) with Theorem 7.4.1 yields the following gradient formulae.
Theorem 7.4.2 For each i = 0, 1, …, N and j = 1, …, M,

$$\frac{\partial g_i(\vartheta)}{\partial t_j} = \frac{\partial \Phi_i(x(T \mid \vartheta))}{\partial x}\,\phi^j(T \mid \vartheta) + L_i\big(t_j, x(t_j \mid \vartheta), h^{j-1}\big) - L_i\big(t_j, x(t_j \mid \vartheta), h^j\big) + \int_0^T \frac{\partial L_i(t, x(t \mid \vartheta), u(t \mid \vartheta))}{\partial x}\,\phi^j(t \mid \vartheta)\, dt. \tag{7.4.15}$$

These gradient formulae are based on the variational system. The gradient formulae derived on the basis of the costate system are given below.

For each i = 0, 1, …, N and for each switching vector ϑ ∈ Ξ, we consider the following system of differential equations:

$$\frac{d\lambda^i(t)}{dt} = -\left[\frac{\partial H_i\big(t, x(t \mid \vartheta), u(t \mid \vartheta), \lambda^i(t)\big)}{\partial x}\right]^\top, \quad t \in [0, T), \tag{7.4.16a}$$

with

$$\lambda^i(T) = \left[\frac{\partial \Phi_i(x(T \mid \vartheta))}{\partial x}\right]^\top, \tag{7.4.16b}$$

where

$$H_i(t, x, u, \lambda) = L_i(t, x, u) + \lambda^\top f(t, x, u). \tag{7.4.17}$$

The system (7.4.16) is called the costate system for the cost functional if i = 0 and for the i-th constraint functional if i ≠ 0. Let λ^i(· | ϑ) denote the solution of the costate system (7.4.16) corresponding to ϑ ∈ Ξ. The gradient of each functional g_i, i = 0, …, N, with respect to the switching time t_j, j = 1, …, M, is given in the following theorem.
Theorem 7.4.3 Consider j ∈ {1, …, M} and assume that t_{j−1} < t_j < t_{j+1}. For each i = 0, 1, …, N, the gradient of the functional g_i with respect to t_j is given by

$$\frac{\partial g_i(\vartheta)}{\partial t_j} = H_i\big(t_j, x(t_j \mid \vartheta), h^{j-1}, \lambda^i(t_j \mid \vartheta)\big) - H_i\big(t_j, x(t_j \mid \vartheta), h^j, \lambda^i(t_j \mid \vartheta)\big). \tag{7.4.18}$$
Proof. Although this result has been proven in [253], a somewhat more direct proof is given here. Let v : [0, T] → R^n be any absolutely continuous function. Then g_i may be written as

$$\begin{aligned}
g_i(\vartheta) &= \Phi_i(x(T \mid \vartheta)) + \sum_{l=0}^{M}\int_{t_l}^{t_{l+1}} L_i\big(t, x(t \mid \vartheta), h^l\big)\, dt \\
&= \Phi_i(x(T \mid \vartheta)) + \sum_{l=0}^{M}\int_{t_l}^{t_{l+1}} H_i\big(t, x(t \mid \vartheta), h^l, v(t)\big)\, dt - \sum_{l=0}^{M}\int_{t_l}^{t_{l+1}} (v(t))^\top \frac{dx(t \mid \vartheta)}{dt}\, dt.
\end{aligned}$$

Using integration by parts on the last term, we have

$$\begin{aligned}
g_i(\vartheta) &= \Phi_i(x(T \mid \vartheta)) + \sum_{l=0}^{M}\int_{t_l}^{t_{l+1}} H_i\big(t, x(t \mid \vartheta), h^l, v(t)\big)\, dt \\
&\quad - \sum_{l=0}^{M}\left[(v(t_{l+1}))^\top x(t_{l+1} \mid \vartheta) - (v(t_l))^\top x(t_l \mid \vartheta) - \int_{t_l}^{t_{l+1}} \left(\frac{dv(t)}{dt}\right)^{\!\top} x(t \mid \vartheta)\, dt\right] \\
&= \Phi_i(x(T \mid \vartheta)) - (v(t_{M+1}))^\top x(t_{M+1} \mid \vartheta) + (v(t_0))^\top x(t_0 \mid \vartheta) \\
&\quad + \sum_{l=0}^{M}\int_{t_l}^{t_{l+1}} \left[H_i\big(t, x(t \mid \vartheta), h^l, v(t)\big) + \left(\frac{dv(t)}{dt}\right)^{\!\top} x(t \mid \vartheta)\right] dt \\
&= \Phi_i(x(T \mid \vartheta)) - (v(T))^\top x(T \mid \vartheta) + (v(0))^\top x(0 \mid \vartheta) \\
&\quad + \sum_{l=0}^{j-2}\int_{t_l}^{t_{l+1}} \left[H_i\big(t, x(t \mid \vartheta), h^l, v(t)\big) + \left(\frac{dv(t)}{dt}\right)^{\!\top} x(t \mid \vartheta)\right] dt \\
&\quad + \int_{t_{j-1}}^{t_j} \left[H_i\big(t, x(t \mid \vartheta), h^{j-1}, v(t)\big) + \left(\frac{dv(t)}{dt}\right)^{\!\top} x(t \mid \vartheta)\right] dt \\
&\quad + \int_{t_j}^{t_{j+1}} \left[H_i\big(t, x(t \mid \vartheta), h^j, v(t)\big) + \left(\frac{dv(t)}{dt}\right)^{\!\top} x(t \mid \vartheta)\right] dt \\
&\quad + \sum_{l=j+1}^{M}\int_{t_l}^{t_{l+1}} \left[H_i\big(t, x(t \mid \vartheta), h^l, v(t)\big) + \left(\frac{dv(t)}{dt}\right)^{\!\top} x(t \mid \vartheta)\right] dt.
\end{aligned}$$

Using the Leibniz rule, this equation can be differentiated with respect to t_j to give

$$\begin{aligned}
\frac{\partial g_i(\vartheta)}{\partial t_j} &= \frac{\partial \Phi_i(x(T \mid \vartheta))}{\partial x}\,\frac{\partial x(T \mid \vartheta)}{\partial t_j} - (v(T))^\top \frac{\partial x(T \mid \vartheta)}{\partial t_j} \\
&\quad + H_i\big(t_j, x(t_j \mid \vartheta), h^{j-1}, v(t_j)\big) - H_i\big(t_j, x(t_j \mid \vartheta), h^j, v(t_j)\big) \\
&\quad + \sum_{l=0}^{M}\int_{t_l}^{t_{l+1}} \left[\frac{\partial H_i\big(t, x(t \mid \vartheta), h^l, v(t)\big)}{\partial x}\,\frac{\partial x(t \mid \vartheta)}{\partial t_j} + \left(\frac{dv(t)}{dt}\right)^{\!\top} \frac{\partial x(t \mid \vartheta)}{\partial t_j}\right] dt.
\end{aligned}$$

Setting v(·) = λ^i(· | ϑ) and then applying (7.4.16), we obtain

$$\frac{\partial g_i(\vartheta)}{\partial t_j} = H_i\big(t_j, x(t_j \mid \vartheta), h^{j-1}, v(t_j)\big) - H_i\big(t_j, x(t_j \mid \vartheta), h^j, v(t_j)\big),$$

as required. ∎
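The Hamiltonian-jump formula (7.4.18) is easy to test numerically. The following sketch (ours, not from the text) uses the same toy system as earlier, dx/dt = u(t), x(0) = 0, g_0 = x(T)², where H_0 = λu and λ(t) = 2x(T) is constant, so (7.4.18) predicts ∂g_0/∂t_j = 2x(T)(h^{j−1} − h^j); this is compared against finite differences in the switching times.

```python
# Illustrative numerical check of Theorem 7.4.3 for dx/dt = u(t).
import numpy as np

h = np.array([1.0, -0.5, 2.0])                 # h^0, h^1, h^2

def g0(switch):                                # switch = [t_1, t_2], T = 1
    knots = np.concatenate(([0.0], switch, [1.0]))
    return np.dot(h, np.diff(knots))**2        # g_0 = x(T)^2

switch = np.array([0.3, 0.7])
xT = np.sqrt(g0(switch))                       # x(T) > 0 in this example
lam = 2.0 * xT                                 # constant costate value
predicted = lam * (h[:-1] - h[1:])             # (7.4.18) for j = 1, 2

eps = 1e-7                                     # finite-difference comparison
fd = np.array([(g0(switch + eps*e) - g0(switch)) / eps for e in np.eye(2)])
print(predicted, fd)                           # the two should agree closely
```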

Remark 7.4.3 Equations (7.4.15) and (7.4.18) both give the partial derivatives of the canonical functionals g_i, i = 0, 1, …, N, with respect to the switching times. Since the state trajectory and the solutions of the variational and costate systems depend continuously on ϑ, the derivative formulae in (7.4.15) and (7.4.18) also depend continuously on ϑ. In principle, either (7.4.15) or (7.4.18) can be used in conjunction with a gradient-based optimization method to optimize the switching times. However, there are several difficulties with this approach:
(i) The variational systems for the switching times contain a jump
condition.
(ii) The partial derivatives of the canonical functionals with respect to
tj only exist when the switching times are distinct.
(iii) It is cumbersome to integrate the state and variational or costate
systems numerically when the switching times are variable, espe-
cially when two or more switching times are close together.

For the reasons given in Remark 7.4.3, the gradient formulae (7.4.15) and (7.4.18) are not often used in practice. Instead, a time scaling transformation is typically used to transform a problem with variable switching times into an equivalent problem with fixed switching times. This is discussed in the next subsection.

7.4.2 Time Scaling Transformation

The purpose of this section is to demonstrate that an optimal control problem


with piecewise constant controls and variable switching times can be read-
ily transformed into an equivalent optimal control problem with piecewise
constant controls and fixed switching times. Since gradients for the trans-
formed problem with fixed switching times can be readily determined using
the formulae in Section 7.3, we can thus avoid the difficulties mentioned in
Remark 7.4.3. In early publications (see, for example, [125, 126]), the transformation described in this section was referred to as the Control Parametrization Enhancing Technique (CPET), while later works simply refer to it as a time scaling transformation.
Consider Problem (P 3) once more where the switching vector ϑ =
[t1 , . . . , tM ] is the decision variable. For notational convenience, the equiva-
lent problem with fixed switching times to be introduced below is more easily
expressed in terms of the durations between individual switching times. These
durations are given by

γj = tj+1 − tj , j = 0, 1, . . . , M, (7.4.19)

where we recall that t_0 = 0 and t_{M+1} = T. Let γ = [γ_0, γ_1, …, γ_M]^⊤. Note that the constraints (7.3.5) satisfied by ϑ are equivalent to the constraints

γj ≥ 0, j = 0, 1, . . . , M, (7.4.20)

on γ. Furthermore, γ must also satisfy the additional constraint

γ0 + γ1 + · · · + γM = T. (7.4.21)

The basic idea of the time scaling transformation is to replace the original
time horizon [0, T ] containing the variable switching times tj , j = 1, . . . , M ,
with a new time horizon [0, M + 1] with fixed switching times at 1, 2, . . . , M .
We use s to denote ‘time’ in the new time horizon. The relationship between
t ∈ [0, T ] and s ∈ [0, M + 1] can be defined by the differential equation

dt(s)
= v(s) (7.4.22a)
ds
with the initial condition
t(0) = 0 (7.4.22b)
and the terminal condition

t(M + 1) = T, (7.4.22c)

where the scalar valued function v(s) is called the time scaling control. It is
defined by
M
v(s) = γj χ[j,j+1) (s), (7.4.23)
j=0

where χI (·) is the indicator function defined by (7.3.3) and γj , j = 0, 1, . . . , M ,


are the durations defined above. Note that

0 ≤ γ_j ≤ T, j = 0, 1, …, M, (7.4.24)

where the upper bound follows from (7.4.21). For easy reference, the time scaling control v(s) is written as v(s | γ). Let Γ be the set containing all those vectors γ = [γ_0, γ_1, …, γ_M]^⊤ ∈ R^{M+1} satisfying (7.4.24). Clearly, v(s | γ) is uniquely determined by γ ∈ Γ and vice versa. By virtue of (7.4.24), the solution of (7.4.22) is monotonically non-decreasing. For each s ∈ [0, M+1], we have

$$t(s) = \int_0^s v(\tau \mid \gamma)\, d\tau = \begin{cases} \gamma_0\, s, & \text{if } s \in [0, 1], \\[1mm] \displaystyle\sum_{j=0}^{l-1} \gamma_j + \gamma_l\,(s - l), & \text{if } s \in [l, l+1], \; l = 1, \ldots, M. \end{cases} \tag{7.4.25}$$

Note in particular that (7.4.25) results in

$$t(j) = t_j, \quad j = 0, 1, \ldots, M+1,$$

so that the desired mapping between the variable switching times in [0, T] and the fixed switching times in [0, M+1] has been achieved, and that (7.4.22c) is equivalent to (7.4.21). Furthermore, the piecewise constant control

$$u(t) = \sum_{j=0}^{M} h^j\, \chi_{[t_j, t_{j+1})}(t)$$

for t ∈ [0, T) is equivalent to

$$\tilde{u}(s) = u(t(s)) = \sum_{j=0}^{M} h^j\, \chi_{[j, j+1)}(s) \tag{7.4.26}$$

for s ∈ [0, M+1). We also adopt the notation

$$\tilde{x}(s) = x(t(s)). \tag{7.4.27}$$

Using (7.4.22a) and the chain rule, we then obtain the transformed dynamics

$$\frac{d\tilde{x}(s)}{ds} = \frac{dx(t(s))}{dt}\,\frac{dt(s)}{ds} = v(s \mid \gamma)\, f(t(s), \tilde{x}(s), \tilde{u}(s)) \tag{7.4.28}$$

with the initial condition

$$\tilde{x}(0) = x^0. \tag{7.4.29}$$

For each γ ∈ Γ, let t(· | γ) denote the solution of (7.4.22a)–(7.4.22b) and let x̃(· | γ) denote the solution of (7.4.28)–(7.4.29).

Finally, note that we may write (7.4.22a) in differential form as dt = v(s) ds. Hence, for any function L(t) defined on [0, T], we have

$$\int_0^T L(t)\, dt = \int_0^{M+1} L(t(s))\, v(s)\, ds.$$

The equivalent transformed optimal control problem may then be stated as follows.

Problem (TP3): Given the combined system (7.4.22a) and (7.4.28) with the initial conditions (7.4.22b) and (7.4.29), find a vector γ ∈ Γ such that the cost functional

$$\tilde{g}_0(\gamma) = \Phi_0(\tilde{x}(M+1 \mid \gamma)) + \int_0^{M+1} L_0\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), \tilde{u}(s)\big)\, v(s \mid \gamma)\, ds \tag{7.4.30}$$

is minimized subject to the equality constraints

$$\tilde{g}_i(\gamma) = \Phi_i(\tilde{x}(M+1 \mid \gamma)) + \int_0^{M+1} L_i\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), \tilde{u}(s)\big)\, v(s \mid \gamma)\, ds = 0, \quad i = 1, \ldots, N_e, \tag{7.4.31a}$$

the inequality constraints

$$\tilde{g}_i(\gamma) = \Phi_i(\tilde{x}(M+1 \mid \gamma)) + \int_0^{M+1} L_i\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), \tilde{u}(s)\big)\, v(s \mid \gamma)\, ds \le 0, \quad i = N_e + 1, \ldots, N, \tag{7.4.31b}$$

and the additional equality constraint

$$\tilde{g}_{N+1}(\gamma) = t(M+1 \mid \gamma) - T = 0. \tag{7.4.32}$$

Before continuing, let us compare and contrast the equivalent Prob-


lems (P 3) and (T P 3). Problem (T P 3) has an additional state variable, t(·), an
additional associated differential equation, (7.4.22a), and an additional equal-
ity constraint, (7.4.32). Note that the latter is of the same canonical form as
the other constraints, with ΦN +1 = t(M + 1 | γ) − T and LN +1 = 0. Fur-
thermore, Problem (T P 3) has an additional scalar control function v(· | γ)
sharing the fixed switching times j = 1, . . . , M with ũ(·) and with variable
heights. While the transformation of Problem (P 3) to Problem (T P 3) has
added some complexity, Problem (T P 3) is clearly of the same form as Prob-
lem (P 2) in Section 7.3 in that the switching times of the controls are fixed
and the only variables are control heights (the components of γ defining
v(· | γ)). Hence, we can apply the gradient formulae of Section 7.3 to deter-
mine the gradients of the cost and constraint functionals in Problem (T P 3).
For reference, these are detailed as follows.
Consider Problem (TP3). For each i = 0, 1, …, N, define the Hamiltonian

$$\begin{aligned}
&\tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), v(s \mid \gamma), \tilde{u}(s), \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \gamma)\big) \\
&\quad = L_i\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), \tilde{u}(s)\big)\, v(s \mid \gamma) + \tilde{\lambda}_v^i(s \mid \gamma)\, v(s \mid \gamma) \\
&\qquad + \big(\tilde{\lambda}^i(s \mid \gamma)\big)^\top v(s \mid \gamma)\, f\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), \tilde{u}(s)\big),
\end{aligned} \tag{7.4.33}$$

where the running cost L_i is multiplied by v(s | γ) because the transformed functionals (7.4.30)–(7.4.31) integrate L_i(·) v(· | γ) over [0, M+1]. Here, for each γ ∈ Γ, λ̃_v^i(s | γ) and λ̃^i(s | γ) denote the solution of the costate system

$$\frac{d}{ds}\begin{bmatrix} \tilde{\lambda}_v^i(s) \\[1mm] \tilde{\lambda}^i(s) \end{bmatrix} = \begin{bmatrix} -\dfrac{\partial \tilde{H}_i\big(t(s), \tilde{x}(s \mid \gamma), v(s \mid \gamma), \tilde{u}(s), \tilde{\lambda}_v^i(s), \tilde{\lambda}^i(s)\big)}{\partial t} \\[4mm] -\left[\dfrac{\partial \tilde{H}_i\big(t(s), \tilde{x}(s \mid \gamma), v(s \mid \gamma), \tilde{u}(s), \tilde{\lambda}_v^i(s), \tilde{\lambda}^i(s)\big)}{\partial \tilde{x}}\right]^\top \end{bmatrix}, \quad s \in [0, M+1), \tag{7.4.34}$$

with the terminal condition

$$\begin{bmatrix} \tilde{\lambda}_v^i(M+1) \\[1mm] \tilde{\lambda}^i(M+1) \end{bmatrix} = \begin{bmatrix} 0 \\[1mm] \left[\dfrac{\partial \Phi_i(\tilde{x}(M+1 \mid \gamma))}{\partial \tilde{x}}\right]^\top \end{bmatrix}. \tag{7.4.35}$$

The gradients of the cost and constraint functionals in Problem (TP3) may now be given in the following theorem.
Theorem 7.4.4 The gradient of the functional g̃_i with respect to γ_j, j ∈ {0, 1, …, M}, is

$$\frac{\partial \tilde{g}_i(\gamma)}{\partial \gamma_j} = \int_j^{j+1} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), \gamma_j, h^j, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \gamma)\big)}{\partial \gamma_j}\, ds. \tag{7.4.36}$$

Remark 7.4.4 Note that for computational purposes, (7.4.36) may also be written as

$$\frac{\partial \tilde{g}_i(\gamma)}{\partial \gamma_j} = \int_j^{j+1} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), v(s \mid \gamma), \tilde{u}(s), \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \gamma)\big)}{\partial v}\, ds. \tag{7.4.37}$$

Remark 7.4.5 While we did not include the gradient of g̃_{N+1}(γ) in Theorem 7.4.4, note that, due to the canonical nature of this functional, it is quite easy to determine this gradient using formulae similar to (7.4.33)–(7.4.36). However, since g̃_{N+1}(γ) = γ_0 + γ_1 + ⋯ + γ_M − T, it is even easier to see that

$$\frac{\partial \tilde{g}_{N+1}(\gamma)}{\partial \gamma_j} = 1, \quad j = 0, 1, \ldots, M. \tag{7.4.38}$$
Remark 7.4.6 Finally, note that since g_i(ϑ) = g̃_i(γ) for each i ∈ {0, 1, …, N} and γ_j = t_{j+1} − t_j, j = 0, 1, …, M, it is possible to generate the gradients of g_i(ϑ) with respect to each t_j from the formulae above. Since γ_j = t_{j+1} − t_j gives ∂γ_j/∂t_j = −1 and γ_{j−1} = t_j − t_{j−1} gives ∂γ_{j−1}/∂t_j = 1, we have, for each j ∈ {1, …, M},

$$\begin{aligned}
\frac{\partial g_i(\vartheta)}{\partial t_j} &= \frac{\partial \tilde{g}_i(\gamma)}{\partial \gamma_{j-1}}\,\frac{\partial \gamma_{j-1}}{\partial t_j} + \frac{\partial \tilde{g}_i(\gamma)}{\partial \gamma_j}\,\frac{\partial \gamma_j}{\partial t_j} = \frac{\partial \tilde{g}_i(\gamma)}{\partial \gamma_{j-1}} - \frac{\partial \tilde{g}_i(\gamma)}{\partial \gamma_j} \\
&= \int_{j-1}^{j} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), \gamma_{j-1}, h^{j-1}, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \gamma)\big)}{\partial \gamma_{j-1}}\, ds \\
&\quad - \int_{j}^{j+1} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \gamma), \gamma_j, h^j, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \gamma)\big)}{\partial \gamma_j}\, ds.
\end{aligned} \tag{7.4.39}$$
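Computationally, Remark 7.4.6 reduces to differencing adjacent components of the duration gradient, as the following one-line illustration (ours) shows:

```python
# Convert a duration gradient into switching-time gradients via (7.4.39):
# dg/dt_j = dg/dgamma_{j-1} - dg/dgamma_j, j = 1, ..., M.
import numpy as np

grad_gamma = np.array([0.9, -0.2, 0.4])     # d g / d gamma_j, j = 0, 1, 2
grad_t = grad_gamma[:-1] - grad_gamma[1:]   # d g / d t_j,     j = 1, 2
print(grad_t)                               # [ 1.1 -0.6]
```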

Remark 7.4.7 Note that the terminal constraint (7.4.32) is equivalent to

$$\tilde{g}_{N+1}(\gamma) = \int_0^{M+1} v(s)\, ds - T = 0, \tag{7.4.40}$$

which is also in canonical form with Φ_{N+1} = −T and L_{N+1} = v(s).

Remark 7.4.8 Suppose the functions f and Li , i = 0, 1, . . . , N , in Problem


(P 3) do not depend explicitly on time. Then, there is no need to include
the differential equation (7.4.22a) and its initial condition (7.4.22b) in the
corresponding equivalent transformed problem (T P 3) if we also replace the
terminal constraint (7.4.32) by (7.4.40). In this case, the solution to Problem
(T P 3) can be used to construct the solution of Problem (P 3) by using (7.4.25)
in order to obtain t as a function of s. Indeed, (7.4.25) can be used instead
of (7.4.22a)–(7.4.22b) to define the time scaling transformation in these cases
[170, 171].

Remark 7.4.9 The components of γ can also be regarded as system pa-


rameters instead of arising from the construction of the time scaling control
function v(·) [215].

Remark 7.4.10 Note that the transformed problem has only fixed time
points where the state differential equations are discontinuous. All locations
of the discontinuities of the state differential equations are thus known and
fixed during the optimization process. Even when two or more of the switch-
ing times in the original time scale coalesce, the number of these locations
remains unchanged in the transformed problem.

Remark 7.4.11 Although the allocation of s = 1, 2, …, M as fixed switching times in the transformed time horizon [0, M+1] is easy to follow and widely used, large values of M may result in numerical scaling issues for the transformed problem when [0, M+1] becomes too long. Instead, the time scaling transformation can also be formulated so that [0, T] is transformed to [0, 1] with fixed switching times located at ξ_j, j = 1, …, M. The choice of the ξ_j ∈ [0, 1] is arbitrary as long as ξ_1 > 0, ξ_j < ξ_{j+1}, j = 1, …, M − 1, and ξ_M < 1. A natural choice typically used in practice is ξ_j = j/(M+1), j = 1, …, M [215].

In the next few subsections, we illustrate several important classes of op-


timal control problems where the time scaling transformation can be used.

7.4.3 Combined Piecewise Constant Control and


Variable System Parameters

In this section, we consider a general class of optimal control problems with


piecewise constant controls, variable switching times for the control and vari-
able system parameters. This class encapsulates a large range of practical
optimal control problems. To deal with the variable control switching times,
we then apply the time scaling transformation and state the gradient formu-
lae with respect to each set of decision variables.
Consider a process described by the following system of differential equations on the fixed time interval (0, T]:

$$\frac{dx(t)}{dt} = f(t, x(t), u(t), \zeta), \tag{7.4.41a}$$

where x = [x_1, …, x_n]^⊤ ∈ R^n, u = [u_1, …, u_r]^⊤ ∈ R^r and ζ = [ζ_1, …, ζ_s]^⊤ ∈ R^s are, respectively, the state, control and system parameter vectors, and f = [f_1, …, f_n]^⊤ : [0, T] × R^n × R^r × R^s → R^n. The initial condition for the system of differential equations (7.4.41a) is

$$x(0) = x^0(\zeta), \tag{7.4.41b}$$

where x^0 = [x_1^0, …, x_n^0]^⊤ ∈ R^n is a given vector-valued function of the system parameter vector ζ.
As in Section 7.3, we assume that each component of u(·) is a piecewise
constant function of the form (7.3.2) with possible jumps at the variable
switching times t1 , . . . , tM . Let h and Λ be as defined in Section 7.3 and let ϑ
and Ξ be as defined in Section 7.4. Then for each h ∈ Λ and ϑ ∈ Ξ, we denote
the corresponding control as u(· | h, ϑ). Furthermore, assume that ζ ∈ Z,
where Z is as defined in Section 7.2. For each (h, ϑ, ζ) ∈ R^{(M+1)r} × R^M × R^s, let x(· | h, ϑ, ζ) be the corresponding solution of the system (7.4.41). We may now define an optimal parameter selection problem as follows.
Problem (P4): Given the system (7.4.41), find h ∈ Λ, ϑ ∈ Ξ and ζ ∈ Z such that the cost functional

$$g_0(h, \vartheta, \zeta) = \Phi_0(x(T \mid h, \vartheta, \zeta), \zeta) + \int_0^T L_0(t, x(t \mid h, \vartheta, \zeta), u(t \mid h, \vartheta), \zeta)\, dt \tag{7.4.42}$$

is minimized subject to the equality constraints

$$g_i(h, \vartheta, \zeta) = \Phi_i(x(T \mid h, \vartheta, \zeta), \zeta) + \int_0^T L_i(t, x(t \mid h, \vartheta, \zeta), u(t \mid h, \vartheta), \zeta)\, dt = 0, \quad i = 1, \ldots, N_e, \tag{7.4.43a}$$

and subject to the inequality constraints

$$g_i(h, \vartheta, \zeta) = \Phi_i(x(T \mid h, \vartheta, \zeta), \zeta) + \int_0^T L_i(t, x(t \mid h, \vartheta, \zeta), u(t \mid h, \vartheta), \zeta)\, dt \le 0, \quad i = N_e + 1, \ldots, N. \tag{7.4.43b}$$

Here, Φi : Rn ×Rs → R and Li : [0, T ]×Rn ×Rr ×Rs → R, i = 0, 1, . . . , N,


are given real-valued functions. We assume throughout this section that the
corresponding versions of Assumptions 7.2.1–7.4.1 are satisfied.
Consider the time scaling transformation (7.4.22a)–(7.4.22b) and let γ, Γ
and v(· | γ) be as defined in Section 7.4.2. Let


$$\tilde{u}(s \mid h) = u(t(s) \mid h, \vartheta) = \sum_{j=0}^{M} h^j\, \chi_{[j, j+1)}(s) \tag{7.4.44}$$

for s ∈ [0, M+1). Similarly, we define x̃(s) = x(t(s)) for s ∈ [0, M+1]. Using (7.4.22a), the transformed dynamics are

$$\frac{d\tilde{x}(s)}{ds} = v(s \mid \gamma)\, f\big(t(s), \tilde{x}(s), \tilde{u}(s \mid h), \zeta\big) \tag{7.4.45}$$

with the initial condition

$$\tilde{x}(0) = x^0(\zeta). \tag{7.4.46}$$
For each γ ∈ Γ , let t(· | γ) denote the corresponding solution of (7.4.22a)–
(7.4.22b). In order to simplify notation somewhat, let the triple (h, γ, ζ) be
denoted by θ. Then for each θ = (h, γ, ζ) ∈ Λ × Γ × Z, let x̃(· | θ) denote
the corresponding solution of (7.4.45)–(7.4.46). The transformed version of
Problem (P 4) is given as follows.
Problem (TP4): Given the combined system (7.4.22a) and (7.4.45) with the initial conditions (7.4.22b) and (7.4.46), find a combined vector θ = (h, γ, ζ) ∈ Λ × Γ × Z such that the cost functional

$$\tilde{g}_0(\theta) = \Phi_0(\tilde{x}(M+1 \mid \theta), \zeta) + \int_0^{M+1} L_0\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), \tilde{u}(s \mid h), \zeta\big)\, v(s \mid \gamma)\, ds \tag{7.4.47}$$

is minimized over Λ × Γ × Z subject to the equality constraints

$$\tilde{g}_i(\theta) = \Phi_i(\tilde{x}(M+1 \mid \theta), \zeta) + \int_0^{M+1} L_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), \tilde{u}(s \mid h), \zeta\big)\, v(s \mid \gamma)\, ds = 0, \quad i = 1, \ldots, N_e, \tag{7.4.48}$$

the inequality constraints

$$\tilde{g}_i(\theta) = \Phi_i(\tilde{x}(M+1 \mid \theta), \zeta) + \int_0^{M+1} L_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), \tilde{u}(s \mid h), \zeta\big)\, v(s \mid \gamma)\, ds \le 0, \quad i = N_e + 1, \ldots, N, \tag{7.4.49}$$

and the additional equality constraint

$$\tilde{g}_{N+1}(\gamma) = t(M+1 \mid \gamma) - T = 0. \tag{7.4.50}$$

Note that all the controls in the transformed problem have fixed switching
times and variable heights, just like those in Section 7.3. The gradients of
the cost and constraint functionals in Problem (T P 4) with respect to each
of h, γ and ζ are summarized in the following theorem, which follows from
the corresponding results in Sections 7.2 and 7.3.
Consider Problem (TP4). For each i = 0, 1, …, N, define the Hamiltonian

$$\begin{aligned}
&\tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), v(s \mid \gamma), \tilde{u}(s \mid h), \zeta, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \theta)\big) \\
&\quad = L_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), \tilde{u}(s \mid h), \zeta\big)\, v(s \mid \gamma) + \tilde{\lambda}_v^i(s \mid \gamma)\, v(s \mid \gamma) \\
&\qquad + \big(\tilde{\lambda}^i(s \mid \theta)\big)^\top v(s \mid \gamma)\, f\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), \tilde{u}(s \mid h), \zeta\big),
\end{aligned} \tag{7.4.51}$$

where, for each θ = (h, γ, ζ) ∈ Λ × Γ × Z, λ̃_v^i(s | γ) and λ̃^i(s | θ) denote the solution of the costate system

$$\frac{d}{ds}\begin{bmatrix} \tilde{\lambda}_v^i(s) \\[1mm] \tilde{\lambda}^i(s) \end{bmatrix} = \begin{bmatrix} -\dfrac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), v(s \mid \gamma), \tilde{u}(s \mid h), \zeta, \tilde{\lambda}_v^i(s), \tilde{\lambda}^i(s)\big)}{\partial t} \\[4mm] -\left[\dfrac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), v(s \mid \gamma), \tilde{u}(s \mid h), \zeta, \tilde{\lambda}_v^i(s), \tilde{\lambda}^i(s)\big)}{\partial \tilde{x}}\right]^\top \end{bmatrix}, \quad s \in [0, M+1), \tag{7.4.52}$$

with the terminal condition

$$\begin{bmatrix} \tilde{\lambda}_v^i(M+1) \\[1mm] \tilde{\lambda}^i(M+1) \end{bmatrix} = \begin{bmatrix} 0 \\[1mm] \left[\dfrac{\partial \Phi_i(\tilde{x}(M+1 \mid \theta), \zeta)}{\partial \tilde{x}}\right]^\top \end{bmatrix}. \tag{7.4.53}$$

Theorem 7.4.5 The gradients of the functional g̃_i with respect to h^j and γ_j, j = 0, 1, …, M, as well as ζ, are given by

$$\frac{\partial \tilde{g}_i(\theta)}{\partial h^j} = \int_j^{j+1} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), \gamma_j, h^j, \zeta, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \theta)\big)}{\partial h^j}\, ds, \tag{7.4.54}$$

$$\frac{\partial \tilde{g}_i(\theta)}{\partial \gamma_j} = \int_j^{j+1} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), \gamma_j, h^j, \zeta, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \theta)\big)}{\partial \gamma_j}\, ds, \tag{7.4.55}$$

and

$$\frac{\partial \tilde{g}_i(\theta)}{\partial \zeta} = \frac{\partial \Phi_i(\tilde{x}(M+1 \mid \theta), \zeta)}{\partial \zeta} + \big(\tilde{\lambda}^i(0 \mid \theta)\big)^\top \frac{\partial x^0(\zeta)}{\partial \zeta} + \int_0^{M+1} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), v(s \mid \gamma), \tilde{u}(s \mid h), \zeta, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \theta)\big)}{\partial \zeta}\, ds, \tag{7.4.56}$$

respectively.

Remark 7.4.12 Again, for computational implementation, (7.4.54) and (7.4.55) may be written as

$$\frac{\partial \tilde{g}_i(\theta)}{\partial h^j} = \int_j^{j+1} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), v(s \mid \gamma), \tilde{u}(s \mid h), \zeta, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \theta)\big)}{\partial \tilde{u}}\, ds \tag{7.4.57}$$

and

$$\frac{\partial \tilde{g}_i(\theta)}{\partial \gamma_j} = \int_j^{j+1} \frac{\partial \tilde{H}_i\big(t(s \mid \gamma), \tilde{x}(s \mid \theta), v(s \mid \gamma), \tilde{u}(s \mid h), \zeta, \tilde{\lambda}_v^i(s \mid \gamma), \tilde{\lambda}^i(s \mid \theta)\big)}{\partial v}\, ds, \tag{7.4.58}$$

respectively. Also, the gradients of g̃_{N+1}(γ) with respect to h and ζ are clearly equal to zero, while those with respect to the components of γ are given once again by (7.4.38).

Remark 7.4.13 Problem (P 4) did not allow for individual characteristic


times for the constraint functionals like Problems (P 1) and (P 2) did. This is
because fixed time points like the τi ∈ [0, T ], i = 1, . . . , N , in Problems (P 1)
and (P 2) are actually transformed to variable time points in [0, M +1] with the
time scaling transformation (7.4.22a)–(7.4.22b). While it is possible to allow
for individual characteristic times by incorporating them with the switching
times of the controls and introducing some additional constraints, the notation
for this procedure becomes very cumbersome. We choose not to do it here.

7.4.4 Discrete Valued Optimal Control Problems


and Optimal Control of Switched Systems

In many practical optimal control problems, the values of the control com-
ponents may only be chosen from a discrete set rather than from an interval
defined by an upper and a lower bound, as we assumed in Section 7.3. The
main references for this section are [122, 126, 127, 297].
Consider once more the dynamics (7.3.1) and assume that each component
of the control takes the piecewise constant form (7.3.2). Let h and hj , j =
0, 1, . . . , M , be as defined by (7.3.6) and (7.3.7), respectively. Instead of the
individual control heights being bounded above and below by (7.3.4), though,
we now assume that

$$h^j \in \left\{\bar{h}^1, \bar{h}^2, \ldots, \bar{h}^q\right\}, \quad j = 0, 1, \ldots, M, \tag{7.4.59}$$

where each h̄l , l = 1, . . . , q, is a given fixed vector in Rr . Let Λ̄ be the set of


all those decision parameter vectors h which satisfy (7.4.59). We also assume
that the switching times of the controls are decision variables, so let ϑ and Ξ
be defined as for Problem (P 3). Furthermore, let Ū denote the set of all the
corresponding control functions. For convenience, for any (ϑ, h) ∈ Ξ × Λ̄, the
corresponding control in Ū is written as u(· | ϑ, h). Let x(· | ϑ, h) denote the
solution of the system (7.3.1) corresponding to the control u(· | ϑ, h) ∈ Ū
(and hence to (ϑ, h) ∈ Ξ × Λ̄). We may then state the canonical discrete
valued optimal control problem as follows.
Problem (P5): Given the system (7.3.1), find a combined vector (ϑ, h) ∈ Ξ × Λ̄ such that the cost functional

$$g_0(\vartheta, h) = \Phi_0(x(T \mid \vartheta, h)) + \int_0^T L_0(t, x(t \mid \vartheta, h), u(t \mid \vartheta, h))\, dt \tag{7.4.60}$$

is minimized subject to the equality constraints

$$g_i(\vartheta, h) = \Phi_i(x(T \mid \vartheta, h)) + \int_0^T L_i(t, x(t \mid \vartheta, h), u(t \mid \vartheta, h))\, dt = 0, \quad i = 1, \ldots, N_e, \tag{7.4.61a}$$

and the inequality constraints

$$g_i(\vartheta, h) = \Phi_i(x(T \mid \vartheta, h)) + \int_0^T L_i(t, x(t \mid \vartheta, h), u(t \mid \vartheta, h))\, dt \le 0, \quad i = N_e + 1, \ldots, N, \tag{7.4.61b}$$

where Φ_i : R^n → R and L_i : [0, T] × R^n × R^r → R, i = 0, 1, …, N, are given real-valued functions.

Before going on to discuss the solution strategies for Problem (P5), let us consider another common class of optimal control problems and show that it is equivalent to Problem (P5). Suppose that instead of a single dynamical system, there is a finite set of distinct dynamical systems, each of which can be invoked on any subinterval of the time horizon [0, T]. The state of the system is then determined as follows. Starting with the given initial condition at t = 0, the dynamical system active on the first subinterval [0, t_1] of [0, T] is integrated up to t_1. x(t_1) then becomes the initial state for the next subinterval [t_1, t_2]. Starting from x(t_1), the dynamical system active on [t_1, t_2] is then integrated forward in time until t_2 to get x(t_2). The process continues in the same manner until we reach the terminal time. Mathematically, we can describe the overall dynamics as

$$\frac{dx(t)}{dt} = f^{v_i}(t, x(t)), \quad t \in (t_i, t_{i+1}], \; i = 0, 1, \ldots, M, \tag{7.4.62}$$

$$x(0) = x^0, \tag{7.4.63}$$

where v_i ∈ {1, …, q} and {f^1, f^2, …, f^q} is a given set of candidate dynamical systems. The switching sequence v = [v_0, v_1, …, v_M]^⊤ ∈ R^{M+1} is a decision variable which determines the sequence in which the dynamical systems are to be invoked. Let

$$\mathcal{V} = \left\{[v_0, v_1, \ldots, v_M]^\top \in \mathbb{R}^{M+1} \;\middle|\; v_i \in \{1, \ldots, q\}, \; i = 0, 1, \ldots, M\right\} \tag{7.4.64}$$

be the set of feasible switching sequences. Suppose that the switching times t_i, i = 1, …, M, are also decision variables and let ϑ and Ξ be defined as for Problem (P5). For each (ϑ, v) ∈ Ξ × V, let x(· | ϑ, v) denote the corresponding solution of (7.4.62)–(7.4.63). Then we can define the following canonical optimal switching control problem.
Problem (P6): Given the system (7.4.62)–(7.4.63), find a combined vector (ϑ, v) ∈ Ξ × V such that the cost functional

$$g_0(\vartheta, v) = \Phi_0(x(T \mid \vartheta, v)) + \int_0^T L_0(t, x(t \mid \vartheta, v))\, dt \tag{7.4.65}$$

is minimized subject to the equality constraints

$$g_i(\vartheta, v) = \Phi_i(x(T \mid \vartheta, v)) + \int_0^T L_i(t, x(t \mid \vartheta, v))\, dt = 0, \quad i = 1, \ldots, N_e, \tag{7.4.66a}$$

and the inequality constraints

$$g_i(\vartheta, v) = \Phi_i(x(T \mid \vartheta, v)) + \int_0^T L_i(t, x(t \mid \vartheta, v))\, dt \le 0, \quad i = N_e + 1, \ldots, N, \tag{7.4.66b}$$

where Φ_i : R^n → R and L_i : [0, T] × R^n → R, i = 0, 1, …, N, are given real-valued functions.
Problem (P6) is effectively a special case of Problem (P5). This is easily seen if we adopt the notation f^i(t, x(t)) = f̃(t, x(t), i), i = 1, …, q, i.e., the index i denoting the choice of system at time t can also be thought of as a scalar discrete valued control function. Similarly, Problem (P5) may be seen as a special case of Problem (P6) if we adopt the notation f(t, x(t), h̄^j) = f̄^j(t, x(t)), j = 1, …, q, and if we choose to neglect the dependence of the objective and constraint functional integrands on the control.
Indeed, since both problems clearly involve continuous (switching times)
and discrete (control values or system indices) decision variables, they belong
to the class of mixed discrete optimization problems. Most of the existing
solution methods, which we detail further below, treat the continuous and
discrete variables separately. Typically, a bilevel formulation is constructed
where the switching times are the variables of the lower level problem and the
discrete variables appear at the upper level. Thus, the lower level problems
generally take the form of Problem (P 3), where a piecewise constant control
or system sequence is fixed and only the switching times are variable. The
time scaling transformation leading to Problem (T P 3) can then be invoked
to effectively solve the lower level problem for the optimal switching times.
More generally, discrete valued optimal control and optimal switching con-
trol problems also involve additional continuous valued control functions and
system parameters. In these cases the lower level problems most often take
the form of Problem (P 4) and the same time scaling transformation lead-
ing to Problem (T P 4) can then be invoked to solve these. The upper level
problems typically involve only discrete variables and a range of discrete
optimization algorithms can be employed to solve these. Examples include
branch-and-bound [66], simulated annealing [127], and filled function meth-
ods (see Appendix A3 and [283]).
A simple and effective solution strategy which avoids the need for a bilevel formulation and discrete variables works as follows. Let M̄ be the expected number of switchings in an optimal solution of Problem (P5). Consider the fixed control height vector

$$\bar{h} = \left[(\bar{h}^1)^\top, (\bar{h}^2)^\top, \ldots, (\bar{h}^q)^\top, (\bar{h}^1)^\top, (\bar{h}^2)^\top, \ldots, (\bar{h}^q)^\top, \ldots, (\bar{h}^1)^\top, (\bar{h}^2)^\top, \ldots, (\bar{h}^q)^\top\right]^\top \in \mathbb{R}^{(\bar{M}+1)qr},$$

where the sequence h̄^1, h̄^2, …, h̄^q is repeated M̄ + 1 times. For M + 1 = (M̄ + 1) × q, this leads to a problem in the form of Problem (P3) where we just need to determine the switching times t_i, i = 1, …, (M̄ + 1) × q − 1 (a small construction sketch is given after the list below). Note that any possible order of the h̄^j, j = 1, …, q, with respect to M̄ switches can be parametrized in this way if any of the successive switching times are allowed to coalesce. In other words, all possible combinations of control height sequences of Problem (P5) with up to M̄ switches are contained in the vector h̄. Optimization of the resulting Problem (P3) (via the time scaling transformation leading to the equivalent Problem (TP3)) will then lead to many of the switching times coalescing and leave us with an optimal switching sequence [122, 126]. While this is an effective heuristic scheme, there are several issues with this approach:
(i) It is possible to get more than the assumed M̄ switches. For many
practical problems, this is not a serious issue, as the number of nec-
essary switches is generally not known in the first place. If a limited
number of switches does need to be strictly adhered to, additional
constraints can be imposed along with the time scaling transforma-
tion for this purpose [297].
(ii) The introduction of many potentially unnecessary switchings creates
a large mathematical programming problem. Numerical experience
indicates that a lot of locally optimal solutions exist for this problem
and it is easy to get trapped in these. This is particularly well illus-
trated by a complex problem requiring the calculation of an optimal
path by a submarine through a sensor field [38].
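The construction of h̄ referred to above is easily mechanized. The sketch below is our own illustration (all names are ours): it builds the repeated height vector for a scalar control with two candidate levels; only the switching times (equivalently, the durations in the time-scaled problem) then remain as decision variables.

```python
# Build the repeated height vector h_bar for the coalescing heuristic.
import numpy as np

levels = np.array([[0.0], [1.0]])        # q = 2 candidate levels in R^r, r = 1
M_bar = 3                                # expected number of switches
h_bar = np.tile(levels, (M_bar + 1, 1))  # (M_bar + 1) * q height vectors
print(h_bar.ravel())                     # [0. 1. 0. 1. 0. 1. 0. 1.]
# Durations that the optimizer drives to zero delete the corresponding
# level from the realized switching sequence.
```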
If the problem under consideration also includes continuous valued con-
trol functions and the number of switches for the discrete valued controls
is small compared to the size of the partition required for the continuous
valued controls, a modified time scaling transformation proposed in [67] can
be used. This involves a coarse partition of the time horizon for the discrete
valued control components. Within each interval of this partition, a much
finer partition is set up for the continuous valued controls.
A special class of optimal control problems which often occurs in practice
is one where the Hamiltonian function turns out to be linear in the control
variables and where the controls have constant upper and lower bounds. In
this case, application of the Maximum Principle leads to an optimal con-
trol which is made up of a combination of bang or singular arcs only, as
demonstrated for several basic examples in Chapter 6. If there are only bang
arcs, the problem can be considered in the form of Problem (P 5). If the for-
mulae for the singular control during singular intervals are known and do
not depend on the costate, the problem can be considered in the form of
Problem (P 6) (see [269] for an example). If the singular control can only be
expressed in terms of the costate, time scaling can still be used by formu-
lating an auxiliary problem where additional dynamics are included for the
costate determination [228].
Another approach to determine optimal switching sequences is a graph-
based semi analytical method proposed in [118]. For a class of optimal control
problems with a single state and multiple controls, an equivalence between
the search for the optimal solution to the problem and the search for the
shortest path in a specially constructed graph is established. The graph is
based on analytical properties derived from applying the Maximum Principle.
Thus the problem of finding the optimal sequence for the control policies, which is part of finding the optimal control, is reduced to the problem of finding


the shortest path in the corresponding graph. While the method is capable of
finding exact global optimal solutions, it is not readily applicable to complex
optimal control problems which are difficult to treat analytically or to those
where part of the problem formulation depends on experimental data rather
than on well defined functions. Nevertheless, the method has been shown to be applicable to a broad class of machine scheduling problems in [119]. Note
that a related approach can be used for many practical problems which allow
some analytical analysis. An instance of this is given in [95].
The approaches described above are mostly limited to problems where
the switches in the dynamics depend on time only. There is another class of
problems where dynamic switches are driven by changes in the state of the
system. Typically, a change in dynamics is triggered when the state trajectory
moves from one particular region in the state space to another. This will be
elaborated in Section 10.2. See also [27, 128, 153].
There are many practical examples of discrete valued optimal control problems. In addition to some of the applications mentioned above, these include optimizing driving strategies for trains powered by diesel electric locomotives [96, 126, 310], battery re-charge scheduling for conventional submarines [213], operation of hybrid power systems [217], optimal operation of vehicles [66, 128] and sensor scheduling [56, 124].

7.5 Time-Lag System

In this section, we consider a class of optimal control problems with time-delay. The main reference for this section is Chapter 12 of [253].

Consider a process described by the following system of differential equations defined on the fixed time interval (0, T]:

$$\frac{dx(t)}{dt} = f(t, x(t), x(t-h), \zeta), \tag{7.5.1a}$$

where x = [x_1, …, x_n]^⊤ ∈ R^n and ζ = [ζ_1, …, ζ_s]^⊤ ∈ R^s are, respectively, the state and system parameter vectors, f = [f_1, …, f_n]^⊤ : [0, T] × R^{2n} × R^s → R^n, and h is the time delay satisfying 0 < h < T.

For the sake of simplicity, we have confined our analysis to the case of a single time-delay. Nevertheless, all the results can be extended in a straightforward manner to the case of multiple time delays. The initial function for the state vector is

$$x(t) = \phi(t), \; t \in [-h, 0); \qquad x(0) = x^0, \tag{7.5.1b}$$

where φ(t) = [φ_1(t), …, φ_n(t)]^⊤ is a given piecewise continuous function mapping [−h, 0) into R^n, and x^0 is a given vector in R^n.
Let Z be a compact and convex subset of Rs . For each ζ ∈ Z, let x(· | ζ)
be the corresponding vector-valued function which is absolutely continuous
on (0, T ] and satisfies the differential equation (7.5.1a) almost everywhere
on (0, T ] and the initial condition (7.5.1b) everywhere on [−h, 0]. This func-
tion is called the solution of the system (7.5.1) corresponding to the system
parameter vector ζ ∈ Z.
We may now state an optimal parameter selection problem for the time-
delay system as follows:
Problem (P7): Given the system (7.5.1), find a system parameter vector ζ ∈ Z such that the cost functional

$$g_0(\zeta) = \Phi_0(x(T \mid \zeta)) + \int_0^T L_0(t, x(t \mid \zeta), x(t-h \mid \zeta), \zeta)\, dt \tag{7.5.2}$$

is minimized over Z subject to the equality constraints (in canonical form)

$$g_i(\zeta) = \Phi_i(x(T \mid \zeta)) + \int_0^T L_i(t, x(t \mid \zeta), x(t-h \mid \zeta), \zeta)\, dt = 0, \quad i = 1, \ldots, N_e, \tag{7.5.3a}$$

and the inequality constraints (in canonical form)

$$g_i(\zeta) = \Phi_i(x(T \mid \zeta)) + \int_0^T L_i(t, x(t \mid \zeta), x(t-h \mid \zeta), \zeta)\, dt \le 0, \quad i = N_e + 1, \ldots, N, \tag{7.5.3b}$$

where Φ_i : R^n → R and L_i : [0, T] × R^{2n} × R^s → R, i = 0, 1, …, N, are given real-valued functions.
We assume that the following conditions are satisfied:
Assumption 7.5.1 For each compact subset V of Rs , there exists a positive
constant K such that

|f (t, x, y, ζ)| ≤ K(1 + |x| + |y|),

and, for each i = 0, . . . , N ,

|Li (t, x, y, ζ)| ≤ K(1 + |x| + |y|)

for all (t, x, y, ζ) ∈ [0, T ] × R2n × V .



Assumption 7.5.2 f (t, x, y, ζ) and Li (t, x, y, ζ), i = 0, 1, . . . , N , are piece-


wise continuous on [0, T ] for each (x, y, ζ) ∈ R2n × Rs . Furthermore,
f (t, x, y, ζ) and Li (t, x, y, ζ), i = 0, 1, . . . , N , are continuously differentiable
with respect to each of the components of x, y, and ζ for each fixed t ∈ [0, T ].
Assumption 7.5.3 Φi , i = 0, 1, . . . , N, are continuously differentiable with
respect to x and ζ.
Assumption 7.5.4 φ is piecewise continuous on [−h, 0).
Remark 7.5.1 For each ζ ∈ R^s (and hence for each ζ ∈ Z), there exists a unique absolutely continuous vector-valued function x(· | ζ) which satisfies the system (7.5.1a)–(7.5.1b). This is shown as follows. Let k be the integer part of T/h. Then, we subdivide the interval [0, T] into k subintervals if T/h = k and into k + 1 subintervals if T/h > k. Without loss of generality, we shall only consider the latter case. Clearly, these k + 1 subintervals may be written as

$$[(l-1)h, \, lh], \quad l = 1, \ldots, k, \qquad [kh, \, T].$$

Finding the unique solution of the system (7.5.1a)–(7.5.1b) on [0, T] is the same as finding the unique solution of the system (7.5.1a)–(7.5.1b) on each of these subintervals successively with appropriate boundary conditions. On each subinterval, the delayed argument x(t − h) is already known from the previous subinterval (or from φ), so the existence of a unique solution follows from repeated applications of well-known results on ordinary differential equations.
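The successive-subinterval argument in Remark 7.5.1 is precisely the classical method of steps, and it doubles as a practical integration scheme. The sketch below is our own illustration (a scalar example with a constant initial function; all names are ours): it integrates (7.5.1) one delay-length at a time, so that the delayed state is always available from the previous step.

```python
# Method-of-steps integration of dx/dt = -zeta * x(t - h),
# with x(t) = phi(t) = 1 on [-h, 0) and x(0) = 1.
import numpy as np
from scipy.integrate import solve_ivp

h, T, zeta = 0.5, 2.0, 1.2
phi = lambda t: 1.0                      # initial function on [-h, 0)

segments = []                            # dense solutions, one per step
def x_delayed(t):                        # evaluates x(t - h)
    s = t - h
    if s < 0.0:
        return phi(s)
    for (a, b, sol) in segments:
        if a <= s <= b:
            return sol(s)[0]
    raise ValueError("delayed time outside computed range")

t0, x0 = 0.0, 1.0
while t0 < T:
    t1 = min(t0 + h, T)                  # integrate one delay-length
    step = solve_ivp(lambda t, x: [-zeta * x_delayed(t)], (t0, t1), [x0],
                     dense_output=True, rtol=1e-10, atol=1e-12)
    segments.append((t0, t1, step.sol))
    t0, x0 = t1, step.y[0, -1]

print(x0)   # x(T); on [0, h] the exact solution is 1 - zeta*t, so x(h) = 0.4
```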

7.5.1 Gradient Formulae

For each i = 0, 1, …, N and for each system parameter vector ζ, consider the system

$$\frac{d\lambda^i(t)}{dt} = -\left[\frac{\partial H_i\big(t, x(t \mid \zeta), y(t), \zeta, \lambda^i(t)\big)}{\partial x}\right]^\top - \left[\frac{\partial \hat{H}_i\big(t, z(t), x(t \mid \zeta), \zeta, \hat{\lambda}^i(t)\big)}{\partial x}\right]^\top, \quad t \in [0, T], \tag{7.5.4a}$$

with the boundary conditions

$$\big(\lambda^i(T)\big)^\top = \frac{\partial \Phi_i(x(T \mid \zeta))}{\partial x}, \tag{7.5.4b}$$

$$\lambda^i(t) = 0, \quad t > T, \tag{7.5.4c}$$

where

$$y(t) = x(t - h \mid \zeta), \tag{7.5.4d}$$
$$z(t) = x(t + h \mid \zeta), \tag{7.5.4e}$$
$$\hat{\lambda}^i(t) = \lambda^i(t + h), \tag{7.5.4f}$$

$$H_i(t, x, y, \zeta, \lambda) = L_i(t, x, y, \zeta) + \lambda^\top f(t, x, y, \zeta), \tag{7.5.5a}$$

$$\hat{H}_i(t, z, x, \zeta, \hat{\lambda}) = L_i(t+h, z, x, \zeta)\, e(T - t - h) + \hat{\lambda}^\top f(t+h, z, x, \zeta)\, e(T - t - h), \tag{7.5.5b}$$

and e(·) is the unit step function. To continue, we set

$$z(t) = 0, \quad \text{for all } t \in [T - h, T]. \tag{7.5.6}$$

For each i = 0, 1, …, N, the system (7.5.4) is again known as the corresponding costate system. Furthermore, let λ^i(· | ζ) denote the solution of the costate system corresponding to ζ ∈ Z. It is solved backward in time from t = T to t = 0, using a similar idea to that described in Remark 7.5.1.
Theorem 7.5.1 Consider Problem (P7). For each i = 0, 1, …, N, the gradient of the functional g_i is given by

∂g_i(ζ)/∂ζ = ∫_0^T ∂H_i(t, x(t | ζ), x(t − h | ζ), ζ, λ^i(t | ζ))/∂ζ dt.    (7.5.7)
Proof. Let ζ ∈ Z be any system parameter vector and let ρ be any perturbation about ζ. Define

ζ(ε) = ζ + ερ.    (7.5.8)

For brevity, let x(·) and x(·; ε) denote the solutions of the system (7.5.1) corresponding to ζ and ζ(ε), respectively. Let y(·), z(·), y(·; ε), z(·; ε) be defined according to (7.5.4d)–(7.5.4e). Clearly, from (7.5.1), we have

x(t) = x^0 + ∫_0^t f(s, x(s), y(s), ζ) ds    (7.5.9)

and

x(t; ε) = x^0 + ∫_0^t f(s, x(s; ε), y(s; ε), ζ(ε)) ds.    (7.5.10)

Thus, writing Δx(t) for the variation of the state (with Δy(t) = Δx(t − h)),

Δx(t) = dx(t; ε)/dε |_{ε=0}
  = ∫_0^t [ (∂f(s, x(s), y(s), ζ)/∂x) Δx(s) + (∂f(s, x(s), y(s), ζ)/∂y) Δy(s) + (∂f(s, x(s), y(s), ζ)/∂ζ) ρ ] ds.    (7.5.11)

Clearly,

d(Δx(t))/dt = (∂f(t, x(t), y(t), ζ)/∂x) Δx(t) + (∂f(t, x(t), y(t), ζ)/∂y) Δy(t) + (∂f(t, x(t), y(t), ζ)/∂ζ) ρ,    (7.5.12a)
Δx(t) = 0, for t ≤ 0.    (7.5.12b)
Now, by (7.5.3), we have

g_i(ζ(ε)) = Φ_i(x(T; ε)) + ∫_0^T L_i(t, x(t; ε), y(t; ε), ζ(ε)) dt.    (7.5.13)

Define

L̄_i = L_i(t, x(t), y(t), ζ),    (7.5.14a)
L̂_i = L_i(t + h, z(t), x(t), ζ),    (7.5.14b)
f̄ = f(t, x(t), y(t), ζ),    (7.5.14c)
f̂ = f(t + h, z(t), x(t), ζ),    (7.5.14d)
H̄_i = H_i(t, x(t), y(t), ζ, λ^i(t)),    (7.5.14e)
Ĥ_i = Ĥ_i(t, z(t), x(t), ζ, λ̂^i(t)),    (7.5.14f)

where λ^i(t) is the solution of the costate system (7.5.4) corresponding to the system parameter vector ζ, and λ̂^i(t) is defined by (7.5.4f). From (7.5.13), we have

Δg_i(ζ) = dg_i(ζ(ε))/dε |_{ε=0} = (∂g_i(ζ)/∂ζ) ρ
  = (∂Φ_i(x(T))/∂x) Δx(T) + ∫_0^T [ (∂L̄_i/∂x) Δx(t) + (∂L̄_i/∂y) Δy(t) + (∂L̄_i/∂ζ) ρ ] dt.    (7.5.15)
In view of (7.5.12b), we have

∫_0^T (∂L̄_i/∂y) Δy(t) dt = ∫_0^T (∂L̂_i/∂x) e(T − t − h) Δx(t) dt.    (7.5.16)

Combining (7.5.15), (7.5.16) and (7.5.5), we have

Δg_i(ζ) = (∂Φ_i(x(T))/∂x) Δx(T)
  + ∫_0^T [ (∂H̄_i/∂x) Δx(t) + (∂Ĥ_i/∂x) Δx(t) − (λ^i(t))ᵀ (∂f̄/∂x) Δx(t)
  − (λ̂^i(t))ᵀ (∂f̂/∂x) e(T − t − h) Δx(t) + (∂H̄_i/∂ζ) ρ − (λ^i(t))ᵀ (∂f̄/∂ζ) ρ ] dt.    (7.5.17)

In view of (7.5.6), (7.5.4c) and (7.5.12b), we have

∫_0^T (λ̂^i(t))ᵀ (∂f̂/∂x) e(T − t − h) Δx(t) dt = ∫_0^T (λ^i(t))ᵀ (∂f̄/∂y) Δy(t) dt.    (7.5.18)
Thus, from (7.5.17), (7.5.18) and (7.5.12a), we get

Δg_i(ζ) = (∂Φ_i(x(T))/∂x) Δx(T) + ∫_0^T [ (∂H̄_i/∂x) Δx(t) + (∂Ĥ_i/∂x) Δx(t) + (∂H̄_i/∂ζ) ρ − (λ^i(t))ᵀ d(Δx(t))/dt ] dt.    (7.5.19)

Using (7.5.4a), (7.5.4b), (7.5.12b) and integration by parts, it follows that

∫_0^T [ ∂H̄_i/∂x + ∂Ĥ_i/∂x ] Δx(t) dt = − ∫_0^T (dλ^i(t)/dt)ᵀ Δx(t) dt
  = − (∂Φ_i(x(T))/∂x) Δx(T) + ∫_0^T (λ^i(t))ᵀ (d(Δx(t))/dt) dt.    (7.5.20)
From (7.5.19) and (7.5.20), we have

Δg_i(ζ) = (∂g_i(ζ)/∂ζ) ρ = [ ∫_0^T (∂H̄_i/∂ζ) dt ] ρ.    (7.5.21)

Since ρ is arbitrary, the conclusion of the theorem follows readily.
Remark 7.5.2 The procedure for calculating the values of the cost functional and the constraint functionals is similar to that described in Algorithm 7.2.2. Their gradients can be computed by an algorithm similar to Algorithm 7.2.3, using the formulae presented in Theorem 7.5.1. Thus, we see that the time-lag optimal parameter selection problem can also be viewed, and hence solved, as a standard mathematical programming problem. However, we point out that the time-lag system and its costate system are solved successively over a finite number of subintervals as specified in Remark 7.5.1. This method is known as the method of steps in the literature [14, 84].
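In a numerical realization, the values g_i(ζ) and an implementation of the gradient formula (7.5.7) must be computed on a grid. The following hedged sketch evaluates the cost (7.5.2) on the Euler grid produced by the solve_delay_state sketch after Remark 7.5.1, and provides a central finite-difference gradient that can be used to cross-check any implementation of (7.5.7); Phi0 and L0 are illustrative user-supplied callables, not notation fixed by the text.

```python
# A hedged sketch: evaluate g_0(zeta) of (7.5.2) on the method-of-steps grid
# and cross-check a costate-based gradient against central finite differences.
import numpy as np

def cost_g0(zeta, f, phi, x0, Phi0, L0, h, T, steps_per_lag=200):
    ts, xs = solve_delay_state(f, phi, x0, zeta, h, T, steps_per_lag)
    m = steps_per_lag                 # grid offset corresponding to the lag h
    ys = np.array([xs[i - m] if i >= m else phi(ts[i] - h)
                   for i in range(len(ts))])
    vals = np.array([L0(t, x, y, zeta) for t, x, y in zip(ts, xs, ys)])
    integral = 0.5 * np.sum((vals[1:] + vals[:-1]) * np.diff(ts))  # trapezoid
    return Phi0(xs[-1]) + integral

def fd_gradient(g, zeta, eps=1e-6):
    # Central differences; agreement with (7.5.7) to O(eps^2) is a useful test.
    zeta = np.asarray(zeta, dtype=float)
    return np.array([(g(zeta + eps * e) - g(zeta - eps * e)) / (2 * eps)
                     for e in np.eye(zeta.size)])
```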
7.6 Multiple Characteristic Time Points

In this section, we consider a class of optimal parameter selection problems with multiple characteristic time points in the cost and constraint functionals. The main references for this section are [180, 181]. Consider a process described by the following system of differential equations defined on the fixed time interval (0, T]:
dx(t)/dt = f(t, x(t), ζ),    (7.6.1a)

where x = [x_1, …, x_n]ᵀ ∈ R^n is the state vector, ζ = [ζ_1, …, ζ_s]ᵀ ∈ R^s is a vector of system parameters, f = [f_1, …, f_n]ᵀ ∈ R^n, and f : [0, T] × R^n × R^s → R^n. The initial condition for (7.6.1a) is

x(0) = x^0,    (7.6.1b)

where x^0 ∈ R^n is given.
For each ζ ∈ Rs , let x(· | ζ) be the corresponding solution of the sys-
tem (7.6.1). We may now state the optimal parameter selection problem as
follows.
Problem (P 8): Given the system (7.6.1), find a system parameter vector
ζ ∈ Rs such that the cost functional
g_0(ζ) = Φ_0(x(τ_1 | ζ), …, x(τ_M | ζ)) + ∫_0^T L_0(t, x(t | ζ), ζ) dt    (7.6.2)

is minimized subject to the canonical inequality constraints

g_m(ζ) = Φ_m(x(τ_1 | ζ), …, x(τ_M | ζ)) + ∫_0^T L_m(t, x(t | ζ), ζ) dt ≤ 0,  m = 1, …, N.    (7.6.3)

Here, Φm : Rn × · · · × Rn → R, m = 0, 1, . . . , N , Lm : [0, T ] × Rn × Rs → R,
m = 0, 1, . . . , N , are given real-valued functions and the time points τi , 0 <
τi < T , i = 1, . . . , M , are referred to as the characteristic times. For standard
optimal parameter selection problems such as those considered in previous
sections, each canonical constraint (as well as the cost which corresponds to
m = 0) depends only on one such time point. Here, however, there may be
many such time points. For convenience, define τ0 = 0 and τM +1 = T.
We assume throughout that the following conditions are satisfied.
Assumption 7.6.1 For any compact subset V ⊂ R^s, there exists a positive constant K such that

|f(t, x, ζ)| ≤ K(1 + |x|)

for all (t, x, ζ) ∈ [0, T] × R^n × V.
Assumption 7.6.2 f and Lm , m = 0, 1, . . . , N , together with their partial
derivatives with respect to each of the components of x and ζ are piecewise
continuous on [0, T ] for each (x, ζ) ∈ Rn × Rs and continuous on Rn × Rs
for each t ∈ [0, T ].
Assumption 7.6.3 Φm , m = 0, 1, . . . , N , are continuously differentiable on
Rn × · · · × Rn .
To derive the gradient formulae for the cost and the constraint functionals
given by (7.6.2) and (7.6.3), respectively, we consider the following system.
dλ^m(t)/dt = −[∂H_m(t, x(t | ζ), ζ, λ^m(t))/∂x]ᵀ,    (7.6.4a)

where t ∈ (τ_{k−1}, τ_k) for k = 1, …, M + 1, with the jump conditions

λ^m(τ_k^+) − λ^m(τ_k^−) = −[∂Φ_m/∂x(τ_k)]ᵀ, for k = 1, …, M,    (7.6.4b)

and the terminal condition

λ^m(T) = 0,    (7.6.4c)

where H_m is the Hamiltonian function defined by

H_m(t, x(t | ζ), ζ, λ^m(t)) = L_m(t, x(t | ζ), ζ) + (λ^m(t))ᵀ f(t, x(t | ζ), ζ).    (7.6.5)

For each m = 0, 1, …, N, the corresponding system (7.6.4) is called the costate system for g_m. Let λ^m(· | ζ) be the solution of the costate system (7.6.4) corresponding to ζ ∈ R^s.
Remark 7.6.1 For each ζ ∈ R^s, the solution λ^m(· | ζ) is calculated as follows.

Step 1. Solve the system (7.6.1), yielding x(· | ζ) on [0, T].
Step 2. Solve the costate differential equations (7.6.4a) with the terminal condition (7.6.4c) backward from t = T to t = τ_M^+, yielding λ^m(· | ζ) on the subinterval (τ_M, T] and λ^m(τ_M^+ | ζ).
Step 3. Note that x(τ_k | ζ) is known from Step 1 and that Φ_m is a given continuously differentiable function of x(τ_k | ζ), k = 1, …, M. Calculate λ^m(τ_M^− | ζ) by using the jump condition (7.6.4b) with k = M.
Step 4. Solve the costate differential equations (7.6.4a) backward from t = τ_M^− to t = τ_{M−1}^+ with the condition at t = τ_M^− being taken as λ^m(τ_M^− | ζ). This yields λ^m(· | ζ) on the subinterval (τ_{M−1}, τ_M] and λ^m(τ_{M−1}^+ | ζ).
Step 5. Use the jump condition (7.6.4b) with k = M − 1 to obtain λ^m(τ_{M−1}^− | ζ).
Step 6. Solve the costate system (7.6.4a) backward from t = τ_{M−1}^− to t = τ_{M−2}^+ with the condition at t = τ_{M−1}^− being taken as λ^m(τ_{M−1}^− | ζ). This yields λ^m(· | ζ) on the subinterval (τ_{M−2}, τ_{M−1}] and λ^m(τ_{M−2}^+ | ζ).
Step 7. The process is continued until λ^m(· | ζ) is obtained on the subinterval [0, τ_1], which includes λ^m(0 | ζ) at t = 0.

The solution of the costate system (7.6.4a) with the jump conditions (7.6.4b) and the terminal condition (7.6.4c) is thus obtained by combining λ^m(· | ζ) on [τ_k, τ_{k+1}], k = 0, 1, …, M. A code sketch of this backward sweep is given below.
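The following is a minimal sketch of Steps 2–7 (illustrative names, not MISER code): Hx(t, x, ζ, λ) returns the transposed partial (∂H_m/∂x)ᵀ, dPhi_dx(k, ·) returns (∂Φ_m/∂x(τ_k))ᵀ, and the uniform grid is assumed to contain every characteristic time τ_k exactly.

```python
# Backward Euler sweep for the costate (7.6.4a) with jumps (7.6.4b) and
# terminal condition (7.6.4c). ts, xs come from a forward solve of (7.6.1).
import numpy as np

def costate_with_jumps(ts, xs, zeta, taus, Hx, dPhi_dx):
    dt = ts[1] - ts[0]
    lam = np.zeros_like(xs)               # lam(T) = 0 is condition (7.6.4c)
    jump_at = {int(round(tau / dt)): k for k, tau in enumerate(taus, start=1)}
    xs_taus = [xs[i] for i in sorted(jump_at)]
    for i in range(len(ts) - 2, -1, -1):
        # Euler step of d(lam)/dt = -(dH/dx)^T, taken backward in time
        lam[i] = lam[i + 1] + dt * Hx(ts[i + 1], xs[i + 1], zeta, lam[i + 1])
        if i in jump_at:
            # crossing tau_k from the right:
            # lam(tau_k-) = lam(tau_k+) + (dPhi/dx(tau_k))^T, by (7.6.4b)
            lam[i] = lam[i] + dPhi_dx(jump_at[i], xs_taus)
    return lam
```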
Theorem 7.6.1 For each m = 0, 1, …, N, the gradient of g_m is given by

∂g_m(ζ)/∂ζ_j = ∫_0^T ∂H_m(t, x(t | ζ), ζ, λ^m(t | ζ))/∂ζ_j dt,    (7.6.6)

where j = 1, …, s.
Proof. The functions H_m and λ^m may have discontinuities at the characteristic time points τ_i, i = 1, …, M. Note that we can write

g_m(ζ) = Φ_m(x(τ_1 | ζ), …, x(τ_M | ζ)) + Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} [ H_m(t, x(t), ζ, λ^m(t)) − (λ^m(t))ᵀ f(t, x(t), ζ) ] dt.    (7.6.7)
Solving (7.6.1) gives

x(t) = x(t | ζ) = x(0) + ∫_0^t f(s, x(s | ζ), ζ) ds.    (7.6.8)

Taking the partial derivative of x(t | ζ) with respect to the j-th component ζ_j of the vector ζ, j = 1, …, s, yields

∂x(t)/∂ζ_j = ∫_0^t [ (∂f(s, x(s), ζ)/∂x) (∂x(s)/∂ζ_j) + ∂f(s, x(s), ζ)/∂ζ_j ] ds.    (7.6.9)

Note that

∂/∂ζ_j (dx(t)/dt) = d/dt (∂x(t)/∂ζ_j).    (7.6.10)
The gradient of g_m can be calculated as

∂g_m(ζ)/∂ζ_j = Σ_{l=1}^{M} (∂Φ_m(x(τ_1), …, x(τ_M))/∂x(τ_l)) (∂x(τ_l)/∂ζ_j)
  + Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} [ (∂H_m/∂x)(∂x/∂ζ_j) + ∂H_m/∂ζ_j + (∂H_m/∂λ^m)(∂λ^m/∂ζ_j)
  − (∂(λ^m)ᵀ/∂ζ_j) f(t, x, ζ) − (λ^m)ᵀ d/dt(∂x(t)/∂ζ_j) ] dt.    (7.6.11)
From (7.6.5), we obtain

(∂H_m/∂λ^m)(∂λ^m/∂ζ_j) = (∂(λ^m)ᵀ/∂ζ_j) f(t, x, ζ).    (7.6.12)
Applying integration by parts to the last term on the right-hand side of (7.6.11) yields

∫_{τ_{k−1}}^{τ_k} (λ^m)ᵀ d/dt(∂x/∂ζ_j) dt = [ (λ^m)ᵀ (∂x/∂ζ_j) ]_{t=τ_{k−1}^+}^{t=τ_k^−} − ∫_{τ_{k−1}}^{τ_k} (dλ^m/dt)ᵀ (∂x/∂ζ_j) dt.    (7.6.13)
From (7.6.12) and (7.6.13), it follows from (7.6.11) that

∂g_m(ζ)/∂ζ_j = Σ_{l=1}^{M} (∂Φ_m(x(τ_1), …, x(τ_M))/∂x(τ_l)) (∂x(τ_l)/∂ζ_j) − Σ_{k=1}^{M+1} [ (λ^m)ᵀ (∂x/∂ζ_j) ]_{t=τ_{k−1}^+}^{t=τ_k^−}
  + Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} [ ( ∂H_m/∂x + (dλ^m/dt)ᵀ ) (∂x/∂ζ_j) + ∂H_m/∂ζ_j ] dt.    (7.6.14)
Since the state and its gradient with respect to ζ_j are continuous in t on [0, T], we have x(τ_k^−) = x(τ_k^+) and ∂x(τ_k^−)/∂ζ_j = ∂x(τ_k^+)/∂ζ_j for k = 1, …, M. Thus, we obtain

Σ_{k=1}^{M+1} [ (λ^m)ᵀ (∂x/∂ζ_j) ]_{t=τ_{k−1}^+}^{t=τ_k^−}
  = Σ_{k=1}^{M} [ λ^m(τ_k^−) − λ^m(τ_k^+) ]ᵀ (∂x(τ_k)/∂ζ_j) − (λ^m(τ_0^+))ᵀ (∂x(τ_0)/∂ζ_j) + (λ^m(τ_{M+1}^−))ᵀ (∂x(τ_{M+1})/∂ζ_j)
  = Σ_{k=1}^{M} [ λ^m(τ_k^−) − λ^m(τ_k^+) ]ᵀ (∂x(τ_k)/∂ζ_j) + (λ^m(T))ᵀ (∂x(T)/∂ζ_j).    (7.6.15)

Since x(τ_0^+) = x(0) = x^0, which is a fixed vector in R^n, (7.6.14) becomes
∂g_m(ζ)/∂ζ_j = Σ_{k=1}^{M} [ ∂Φ_m(x(τ_1), …, x(τ_M))/∂x(τ_k) − (λ^m(τ_k^−))ᵀ + (λ^m(τ_k^+))ᵀ ] (∂x(τ_k)/∂ζ_j)
  − (λ^m(T))ᵀ (∂x(T)/∂ζ_j) + Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} [ ( ∂H_m/∂x + (dλ^m/dt)ᵀ ) (∂x/∂ζ_j) + ∂H_m/∂ζ_j ] dt.    (7.6.16)

By virtue of the definition of the costate system corresponding to g_m given in (7.6.4a), with the jump conditions (7.6.4b) and the terminal condition (7.6.4c), we obtain

∂g_m(ζ)/∂ζ_j = Σ_{k=1}^{M+1} ∫_{τ_{k−1}}^{τ_k} ∂H_m/∂ζ_j dt.    (7.6.17)

This completes the proof.

Note that the costates are discontinuous at the characteristic time points.
The sizes of the jumps are determined by the interior-point conditions given
by (7.6.4b).
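As a concrete illustration (ours, not from the text), suppose M = 1, L_m ≡ 0 and Φ_m(x(τ_1)) = |x(τ_1)|². Then ∂Φ_m/∂x(τ_1) = 2(x(τ_1 | ζ))ᵀ, and the jump condition (7.6.4b) gives

λ^m(τ_1^−) = λ^m(τ_1^+) + 2x(τ_1 | ζ),

so the backward sweep of Remark 7.6.1 simply adds 2x(τ_1 | ζ) to the costate as it crosses τ_1, while λ^m evolves according to (7.6.4a) elsewhere.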

7.7 Exercises

7.7.1 Consider Problem (P 1). Write down the corresponding definition for
a system parameter ζ ∗ to be a regular point of the constraints (7.2.4) in the
sense of Definition 3.1.3.

7.7.2 Consider Problem (P 1). Write down the corresponding first order nec-
essary conditions in the sense of Theorem 3.1.1.
7.7.3 Consider a process governed by the following scalar differential equation defined on a free terminal time interval (0, β(ζ)]:

dx(t)/dt = f(t, x(t), ζ),
x(0) = x^0(ζ),

where ζ is a system parameter yet to be determined, and both β(ζ) and x^0(ζ) are given scalar functions of the system parameter ζ. The problem is to find a system parameter ζ ∈ R such that the cost functional

g(ζ) = ∫_0^{β(ζ)} L(t, x(t), ζ) dt

is minimized.
(a) Reduce the problem to one with fixed terminal time by rescaling the time with respect to β(ζ), i.e., by setting t = β(ζ)τ. With reference to the fixed terminal time problem, show that the gradient of the corresponding cost functional, again denoted by g(ζ), is given by

∂g/∂ζ = λ(0) ∂x^0(ζ)/∂ζ + ∫_0^1 ∂H/∂ζ dτ,

where

H = β(ζ) [ L̃(τ, x(τ), ζ) + λ(τ) f̃(τ, x(τ), ζ) ],
dλ(τ)/dτ = −∂H/∂x, λ(1) = 0,
dx(τ)/dτ = β(ζ) f̃(τ, x(τ), ζ), x(0) = x^0(ζ),

while

L̃(τ, x(τ), ζ) = L(β(ζ)τ, x(τ), ζ) and f̃(τ, x(τ), ζ) = f(β(ζ)τ, x(τ), ζ).
(b) It is also possible to derive the gradient formula directly. By going through the steps given in the proof of Theorem 7.2.2, show that the gradient of the cost functional g(ζ) is given by

∂g/∂ζ = L(β(ζ), x(β(ζ)), ζ) ∂β(ζ)/∂ζ + λ(0) ∂x^0(ζ)/∂ζ + ∫_0^{β(ζ)} ∂H/∂ζ dt,
dλ(t)/dt = −∂H/∂x, λ(β(ζ)) = 0,
dx(t)/dt = f(t, x(t), ζ), x(0) = x^0(ζ),

where

H = L(t, x, ζ) + λ f(t, x, ζ).
(c) Are the results given in part (a) and part (b) equivalent?
7.7.4 (Optimal Design of a Suspended Cable [251].) Consider a cable subject to its own weight and a distributed load along its span. After appropriate statical analysis and normalization, the total (non-dimensional) weight of the cable is given by

Φ = ∫_0^1 (1/β) √( (1 + S²)(1 + (dy(x)/dx)²) ) dx,

where
d²y(x)/dx² = α √( (1 + (dy(x)/dx)²)(1 + S²) ) + β,

y(0) = 0, dy(0)/dx = 0, dy(1)/dx = S, α is a given constant which relates the specific weight and the maximum permissible stress, β is an adjustable parameter representing the ratio of total loading to horizontal tension in the cable, and S is the maximum slope of the cable, which is also adjustable. The optimal design problem is to determine β and S such that Φ is minimized.
(a) Formulate the problem as an optimal parameter selection problem by setting y(x) = y_1(x) and dy_1(x)/dx = y_2(x).
(b) Show that the problem is equivalent to

min (S − β)/(αβ)

subject to

dy_2(x)/dx = α √( (1 + (y_2(x))²)(1 + S²) ) + β,
y_2(0) = 0, and y_2(1) = S ⇒ g_1(β, S) = y_2(1) − S = 0.
(c) Write down the necessary conditions for optimality for the reduced prob-
lem in (b).
(d) Determine the gradient formulae of the cost functional and the equal-
ity constraint function g1 (β, S) of the problem in (b) with respect to the
decision variables β and S.
7.7.5 (Computation of Eigenvalues for Sturm–Liouville Boundary-Value Problems.) Consider the well-known Sturm–Liouville problem

d/dx [ p(x) dy(x)/dx ] + q(x) y(x) + λ ω(x) y(x) = 0    (7.7.1a)

subject to the boundary conditions

α_1 y(0) + α_2 dy(0)/dx = 0,    (7.7.1b)
β_1 y(1) + β_2 dy(1)/dx = 0.    (7.7.1c)
The functions p(x), q(x) and ω(x) are assumed to be continuous. In addition,
p(x) does not vanish in (0, 1). The problem is solvable only for a countable
number of distinct values of λ known as the eigenvalues. For each λ, the
corresponding solution y(x) is known as an eigenfunction of the problem.
Traditionally, the eigenvalues and eigenfunctions are obtained by using the
Rayleigh-Ritz method. Later approaches use finite difference and finite ele-
ment methods. This exercise illustrates how the problem can be solved, rather
easily, by posing it as an optimal parameter selection problem.
(a) Express (7.7.1a) as a system of first order differential equations in y and dy/dx.
(b) Since an eigenfunction for a given eigenvalue is non-unique (it is only unique up to a multiplicative constant), one can normalize it by fixing either y(0) or dy(0)/dx. With this in mind, write down the initial conditions for the differential equations obtained in (a).
(c) The end condition (7.7.1c) can be satisfied if and only if λ is an eigenvalue. Thus, (7.7.1c) can be viewed as a function of λ, i.e.,

g_1(λ) = β_1 y(1) + β_2 dy(1)/dx = 0.    (7.7.2)
In principle, one can find the eigenvalues of the problem by solving for the
zeros of (7.7.2). Alternatively, one can formulate the solution of (7.7.2) as an
optimal parameter selection problem. The following functions may be useful.
Φ(g) = g²

or

Φ_ε(g) = 0, if g < −ε;  (g + ε)²/(4ε), if −ε ≤ g ≤ ε;  g, if g > ε,

where ε is a small positive constant.
7.7.6 Consider the optimal control problem involving the dynamical system (7.2.1) and the cost functional (7.2.3). Assume that the control u has the following special structure:

u = h(z, x),

where z is some control law parameter to be determined. Formulate the corresponding optimal feedback control problem as an optimal parameter selection problem.
7.7.7 Consider the linear quadratic optimal regulator problem as described in Section 6.3 with S_f = 0 and with A, B, Q and R not depending on time t. Assume that the feedback control law takes the form

u = −Kx,

where K is an r × n dimensional constant gain matrix. The problem is to find a constant gain matrix K such that the cost functional

g(K) = ∫_0^T (1/2) (x(t))ᵀ [Q + Kᵀ R K] x(t) dt

is minimized subject to the dynamical system

dx(t)/dt = (A − BK) x(t)

with the initial condition

x(0) = x^0.
(a) Use Theorem 7.2.2 to show that the necessary condition for optimality is

∫_0^T ∂H/∂K dt = 0,    (7.7.3)

where

H = (1/2) xᵀ [Q + Kᵀ R K] x + λᵀ (A − BK) x

and

dλ(t)/dt = −[Q + Kᵀ R K] x(t) − [Aᵀ − Kᵀ Bᵀ] λ(t)

with

λ(T) = 0.

(b) Prove that

∂/∂X [Tr(C Xᵀ D X)] = D X C + Dᵀ X Cᵀ

and

∂/∂X [Tr(E X)] = Eᵀ,

where C, D, E and X are matrices of appropriate dimensions.

(c) Making use of the result given in (b), show that (7.7.3) is equivalent to

∫_0^T R [K x(t) − R⁻¹ Bᵀ λ(t)] (x(t))ᵀ dt = 0.

(d) Assuming that λ(t) = S(t) x(t), show that

dS(t)/dt + S(t)(A − BK) + [Aᵀ − Kᵀ Bᵀ] S(t) + Q + Kᵀ R K = 0

and

∫_0^T R [K − R⁻¹ Bᵀ S(t)] x(t) (x(t))ᵀ dt = 0.
7.7.8 Consider Problem (P1) without the equality constraints (7.2.4a) and without the inequality constraints (7.2.4b). Let this be referred to as Problem (P). If ζ* ∈ Z is an optimal parameter vector for Problem (P), use Theorem 7.2.2 to show that

(∂g_0(ζ*)/∂ζ) (ζ − ζ*) ≥ 0

for all ζ ∈ Z.
Develop a two-phase method to solve the discrete valued optimal control problem formulated in Section 7.4.4, where the upper level is the simulated annealing algorithm (to determine the control sequence) and the lower level is an optimal parameter selection problem.
7.7.9 Consider the time-lag optimal control problem described in Section 7.5,
but without the canonical equality and inequality constraints (i.e., with-
out (7.5.3a) and (7.5.3b)). The control u is assumed to take the structure
given by (7.3.2). Derive the gradient formulae for the cost functional with
respect to the variable switching times.
7.7.10 Consider the optimal control problem involving multiple characteristic time points, where the system dynamics are given by (7.3.1), the control is assumed to take the form of (7.3.2) and the cost functional is of the form (7.6.2). No canonical equality and inequality constraints are involved. Derive the gradient formula for the cost functional with respect to the variable switching times.
7.7.11 Give a detailed proof of Theorem 7.3.1.

7.7.12 Give a detailed proof of Theorem 7.3.2.

7.7.13 Give a detailed proof of Theorem 7.4.2.

7.7.14 Give a detailed proof of Theorem 7.4.5.

7.7.15 Show the validity of Equations (7.4.57) and (7.4.58).

7.7.16 Explain in detail the equivalence of Problem (P5) and Problem (P6).

7.7.17 Consider Problem (P7) with two time-delays. State and show the validity of the corresponding version of Theorem 7.5.1.

7.7.18 Can the time scaling transform be applied to Problem (P8)? Why?
Chapter 8
Control Parametrization for Canonical Optimal Control Problems

8.1 Introduction
The methods reported in [75, 244, 245, 248, 249] and [253], as well as in many papers cited in the reference list, are developed based on the control parametrization technique for solving various classes of optimal control prob-
subintervals and the control variables are approximated by piecewise con-
stant or piecewise linear functions with pre-fixed switching times. Through
this process, the optimal control problem is approximated by a sequence
of optimal parameter selection problems. Each of these optimal parameter
selection problems can be viewed as a mathematical programming problem
and is hence solvable by existing optimization techniques. The software pack-
age MISER3.3 (both FORTRAN and ‘Visual Fortran’ or ‘Matlab version of
MISER’) [104] was developed by implementing the control parametrization
method. The Visual MISER [295] is now available. Many practical problems
have been solved using this approach. See relevant references in the refer-
ence list. Intuitively, the optimal parameter selection problem with a finer
partition will yield a more accurate solution to the original optimal con-
trol problem. Convergence results are first obtained in [260] for a class of
optimal control problems involving linear time-lag systems subject to lin-
ear control constraints. Subsequently, a number of control parametrization
type algorithms with associated proof of convergence have been developed in
[240, 244–246, 248, 253, 279, 280], and the relevant references cited therein.
Many of these results are included in [253]. Although convergence analysis
may or may not be of serious consequence for implementation purposes, it
nevertheless provides important insight concerning the performance of an al-
gorithm. Thus, it has become a widely accepted requirement for any new
algorithmic development.
In practice, the accuracy of the optimal control obtained by the control


parametrization method is not high, as it is impossible to know the pre-
cise switching times a priori. To obtain higher accuracy the switching times
should also be regarded as decision variables. This can be accomplished by a
time scaling transform, which was called the control parametrization enhanc-
ing transform (CPET) in [125, 126] and [215] but has been renamed as the
time scaling transformation. Under this time scaling transform, the variable
switching times of the control are mapped onto a set of fixed knots in a new
time scale. The transformed problems then have the same structure as those
obtained by the classical control parametrization technique. Thus, they can
be solved by using software packages such as MISER3.3 (both FORTRAN
and ‘Visual Fortran’ or ‘Matlab version of MISER’) [104] or Visual MISER
[294]. The details will be covered in Section 8.9.
The main references for this chapter are [75, 89, 125, 148, 168, 215, 244,
245, 254] and Chapter 6 of [253].

8.2 Problem Statement

Consider the process described by the following system of nonlinear differen-


tial equations on the fixed time interval (0, T ]:

dx(t)/dt = f(t, x(t), u(t)),    (8.2.1a)

where x = [x_1, …, x_n]ᵀ ∈ R^n and u = [u_1, …, u_r]ᵀ ∈ R^r are, respectively, the state and control vectors, f = [f_1, …, f_n]ᵀ ∈ R^n and f : [0, T] × R^n × R^r → R^n. The initial condition for the differential equation (8.2.1a) is

x(0) = x^0,    (8.2.1b)

where x^0 is a given vector in R^n.
Define

U_1 = {v = [v_1, …, v_r]ᵀ ∈ R^r : (E^i)ᵀ v ≤ b_i, i = 1, …, q},    (8.2.2a)

where E^i, i = 1, …, q, are r-vectors, and b_i, i = 1, …, q, are real numbers; and

U_2 = {v = [v_1, …, v_r]ᵀ ∈ R^r : α_i ≤ v_i ≤ β_i, i = 1, …, r},    (8.2.2b)

where α_i, i = 1, …, r, and β_i, i = 1, …, r, are real numbers. Let

U = U_1 ∩ U_2.    (8.2.3)
Clearly, U is a compact and convex subset of R^r. A bounded measurable function u = [u_1, …, u_r]ᵀ from [0, T] into R^r is said to be an admissible control if u(t) ∈ U for almost all t ∈ [0, T]. Let U be the class of all such admissible controls.

For each u ∈ U, let x(· | u) be the corresponding vector-valued function that is absolutely continuous on (0, T] and satisfies the differential equation (8.2.1a) almost everywhere on (0, T] and the initial condition (8.2.1b). This function is called the solution of the system (8.2.1) corresponding to u ∈ U. We may now state the canonical optimal control problem as follows.
Problem (P1): Given system (8.2.1), find a control u ∈ U such that the cost functional

g_0(u) = Φ_0(x(T | u)) + ∫_0^T L_0(t, x(t | u), u(t)) dt    (8.2.4)

is minimized over U, subject to the equality constraints

g_i(u) = Φ_i(x(τ_i | u)) + ∫_0^{τ_i} L_i(t, x(t | u), u(t)) dt = 0, i = 1, …, N_e,    (8.2.5a)

and subject to the inequality constraints

g_i(u) = Φ_i(x(τ_i | u)) + ∫_0^{τ_i} L_i(t, x(t | u), u(t)) dt ≤ 0,    (8.2.5b)
i = N_e + 1, …, N,    (8.2.5c)

where Φ_i : R^n → R, i = 0, 1, …, N, and L_i : [0, T] × R^n × R^r → R, i = 0, 1, …, N, are given real-valued functions; and 0 < τ_i ≤ T is referred to as the characteristic time for the i-th constraint, i = 1, …, N, with τ_0 = T by convention.
Remark 8.2.1
(i) Both the equality constraints (8.2.5a) and the inequality constraints
(8.2.5c) are said to be in their canonical form.
(ii) Equations (8.2.5a) and (8.2.5c) reduce to terminal equality con-
straints and inequality constraints, respectively, if τi = T and
Li = 0.
(iii) Similarly, the corresponding versions of (8.2.5a) and (8.2.5c) with
0 < τi < T and Li = 0 are, respectively, equality interior point
constraints and inequality interior point constraints.
(iv) The continuous inequality constraint h(t, x(t), u(t)) ≤ 0, t ∈ [0, T], is equivalent to (8.2.5a) with τ_i = T, Φ_i(x(T)) = 0 and

L_i(t, x(t), u(t)) = [max{h(t, x(t), u(t)), 0}]².

This constraint transcription was first introduced in [241]. For more details on this constraint transcription, see Remark 8.6.5.
We assume throughout that the following conditions are satisfied.

Assumption 8.2.1 For any compact subset V ⊂ R^r, there exists a positive constant K such that

|f(t, x, u)| ≤ K(1 + |x|)

for all (t, x, u) ∈ [0, T] × R^n × V.
Assumption 8.2.2 f and Li , i = 0, 1, . . . , N , together with their partial
derivatives with respect to each of the components of x and u are piecewise
continuous on [0, T ] for each (x, u) ∈ Rn × Rr and continuous on Rn × Rr
for each t ∈ [0, T ].
Assumption 8.2.3 Φi , i = 0, 1, . . . , N , are continuously differentiable with
respect to x.
Remark 8.2.2 From the theory of differential equations, we recall that the
system (8.2.1) admits a unique solution, x(·|u), corresponding to each u ∈
L∞ ([0, T ], Rr ), and hence for each u ∈ U .

8.3 Control Parametrization

In this section, we construct a sequence of approximate problems such that their solutions are progressively better approximations of an optimal solution to Problem (P1). This is done through the discretization of the control space. The classical control parametrization technique approximates each control by a piecewise constant control as follows.
Consider a monotonically non-decreasing sequence {S p }∞ p=1 of finite sub-
sets of [0, T ]. For each p, let np + 1 points of S p be denoted by tp0 , tp1 , . . . , tpnp .
These points are chosen such that tp0 = 0, tpnp = T , and tpk−1 < tpk ,
k = 1, 2, . . . , np . Associated with each S p there is a partition I p of [0, T ),
defined by
I p = {Ikp : k = 1, . . . , np },
where Ikp = [tpk−1 , tpk ). We choose S p such that the following two properties
are satisfied.
Assumption 8.3.1 S p+1 is a refinement of S p .
Assumption 8.3.2 lim_{p→∞} S^p is dense in [0, T], i.e.,

lim_{p→∞} max_{k=1,…,n_p} |I_k^p| = 0,

where |I_k^p| = t_k^p − t_{k−1}^p is the length of the k-th interval.


Example 8.3.1 For each positive integer p, let the interval [0, T] be partitioned into 2^p equal subintervals, and let Δ denote the length of each of these. Then, the partition I^p of [0, T] associated with the corresponding S^p is defined by

I^p = {[(k − 1)Δ, kΔ) : k = 1, …, 2^p}.

Note that this form of I^p is most commonly used in practice.
Let U^p consist of all those elements from U that are piecewise constant and consistent with the partition I^p. It is clear that each u^p ∈ U^p can be written as

u^p(t) = Σ_{k=1}^{n_p} σ^{p,k} χ_{I_k^p}(t),    (8.3.1a)

where σ^{p,k} ∈ U and χ_I denotes the indicator function of I, defined by

χ_I(t) = 1 if t ∈ I, and χ_I(t) = 0 elsewhere.    (8.3.1b)

Let

σ^p = [(σ^{p,1})ᵀ, …, (σ^{p,n_p})ᵀ]ᵀ,    (8.3.2a)

where

σ^{p,k} = [σ_1^{p,k}, …, σ_r^{p,k}]ᵀ.    (8.3.2b)

When restricted to U^p, the control constraints defined in (8.2.2a) and (8.2.2b) become

(E^i)ᵀ σ^{p,k} ≤ b_i, i = 1, …, q, k = 1, …, n_p,    (8.3.3a)

and

α_i ≤ σ_i^{p,k} ≤ β_i, i = 1, …, r, k = 1, …, n_p,    (8.3.3b)

respectively. Let Ξ^p be the set of all those σ^p vectors that satisfy the constraints (8.3.3). Clearly, for each control u^p ∈ U^p, there exists a unique control parameter vector σ^p ∈ Ξ^p such that the relation (8.3.1) is satisfied, and vice versa; a small code sketch of this correspondence is given below.
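The following sketch (illustrative names, not MISER code) realizes the correspondence: given a parameter block σ^p and the knot set S^p, it returns the piecewise constant control u^p of (8.3.1a).

```python
# A small sketch of the correspondence (8.3.1): sigma has shape (n_p, r) and
# knots = [t_0, t_1, ..., t_{n_p}] with t_0 = 0 and t_{n_p} = T.
import numpy as np

def piecewise_constant_control(sigma, knots):
    sigma, knots = np.asarray(sigma, dtype=float), np.asarray(knots, dtype=float)
    def u(t):
        # index k with t in [t_{k-1}, t_k); the clip keeps t = T in the last piece
        k = np.clip(np.searchsorted(knots, t, side='right') - 1,
                    0, sigma.shape[0] - 1)
        return sigma[k]
    return u

# Example: one control component on [0, 2], u = 1 on [0, 1), u = -1 on [1, 2].
u = piecewise_constant_control([[1.0], [-1.0]], [0.0, 1.0, 2.0])
```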
With u^p ∈ U^p, system (8.2.1) takes the form

dx(t)/dt = f̃(t, x(t), σ^p), t ∈ [0, T],    (8.3.4a)
x(0) = x^0,    (8.3.4b)

where

f̃(t, x(t), σ^p) = f(t, x(t), Σ_{k=1}^{n_p} σ^{p,k} χ_{I_k^p}(t)).    (8.3.4c)

Let x(· | σ^p) be the solution of the system (8.3.4) corresponding to the control parameter vector σ^p ∈ Ξ^p, in the sense that it satisfies the differential equation (8.3.4a) a.e. on (0, T] and the initial condition (8.3.4b).
By restricting u to U^p, the constraints (8.2.5a) and (8.2.5c) are reduced to

G_i(σ^p) = Φ_i(x(τ_i | σ^p)) + ∫_0^{τ_i} L̃_i(t, x(t | σ^p), σ^p) dt = 0,  i = 1, …, N_e,    (8.3.5a)

and

G_i(σ^p) = Φ_i(x(τ_i | σ^p)) + ∫_0^{τ_i} L̃_i(t, x(t | σ^p), σ^p) dt ≤ 0,  i = N_e + 1, …, N,    (8.3.5b)

respectively. Here, L̃_i, i = 1, …, N, are obtained from L_i, i = 1, …, N, in the same manner as f̃ is obtained from f according to (8.3.4c).

Let Ω^p be the subset of Ξ^p such that the constraints (8.3.5) are satisfied. Furthermore, let F^p be the subset of U^p which consists of all those corresponding piecewise constant controls of the form (8.3.1a). We may now specify the approximate Problem (P1(p)) as follows.

Problem (P1(p)): Find a control parameter vector σ^p ∈ Ξ^p such that the cost functional

G_0(σ^p) = Φ_0(x(T | σ^p)) + ∫_0^T L̃_0(t, x(t | σ^p), σ^p) dt    (8.3.6)

is minimized over Ω^p, where L̃_0 is obtained from L_0 in the same way as f̃ is obtained from f according to (8.3.4c).

Note that for each p, the approximate Problem (P1(p)) is an optimal parameter selection problem. It is effectively a finite dimensional optimization problem, i.e., a mathematical programming problem. This is the main theme behind the control parametrization technique: to approximate an optimal control problem by a sequence of appropriate mathematical programming problems. The discussion of the computational aspects of this approach will be given in Section 8.6.

Remark 8.3.1 In this section, we have assumed that the partition points t_j^p, j = 0, …, n_p, are the same for each component of the control. This assumption is merely for the sake of brevity and is not a rigid requirement.

8.4 Four Preliminary Lemmas

In this section, we present four lemmas that will be used to support the
convergence results in the next section.
Lemma 8.4.1 For each u ∈ U and each p, let

u^p(t) = Σ_{k=1}^{n_p} σ^{p,k} χ_{I_k^p}(t),    (8.4.1)

where χ_{I_k^p} is the indicator function defined by (8.3.1b),

σ^{p,k} = (1/|I_k^p|) ∫_{I_k^p} u(s) ds,

and |I_k^p| = t_k^p − t_{k−1}^p. Then, as p → ∞,

u^p → u    (8.4.2a)

almost everywhere in [0, T] and

lim_{p→∞} ∫_0^T |u^p(t) − u(t)| dt = 0.    (8.4.2b)
Proof. Let t_1 be a regular point of u. Then clearly, there exists a sequence of intervals {I_{k(p)}^p}_{p=1}^∞ such that

t_1 ∈ I_{k(p+1)}^{p+1} ⊂ I_{k(p)}^p, for all p,

and

|I_{k(p)}^p| → 0 as p → ∞.

Then,

{t_1} = ∩_{p=1}^∞ Ī_{k(p)}^p,

where ‘¯’ denotes closure, and so, by Theorem A.1.14,

u(t_1) = lim_{p→∞} (1/|Ī_{k(p)}^p|) ∫_{Ī_{k(p)}^p} u(s) ds = lim_{p→∞} (1/|I_{k(p)}^p|) ∫_{I_{k(p)}^p} u(s) ds = lim_{p→∞} u^p(t_1).

Since almost all points of [0, T] are regular points of u, we conclude that u^p → u a.e. in [0, T]. To prove (8.4.2b), we note that u is a bounded measurable function on [0, T]. Thus, it is clear from the construction of u^p that {u^p}_{p=1}^∞ is uniformly bounded. The result then follows from Theorem A.1.10.
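Lemma 8.4.1 can also be illustrated numerically. The sketch below (our own illustrative example, not part of the proof) averages a bang-bang-like control over the dyadic partition of Example 8.3.1 and prints the L¹ error, which shrinks as p grows.

```python
# A numerical illustration of Lemma 8.4.1: interval averages of a bounded
# measurable control converge to it in L^1 as the partition is refined.
import numpy as np

u = lambda t: np.sign(np.sin(2 * np.pi * t))       # bounded, measurable on [0, 1]
tt = np.linspace(0.0, 1.0, 2**15, endpoint=False)  # fine quadrature grid
vals = u(tt)
for p in range(1, 8):
    n = 2**p                                       # n_p = 2^p equal subintervals
    bins = np.minimum((tt * n).astype(int), n - 1) # interval index of each t
    sigma = np.array([vals[bins == k].mean() for k in range(n)])  # sigma^{p,k}
    l1_error = np.mean(np.abs(sigma[bins] - vals)) # approximates the L^1 norm
    print(p, l1_error)                             # error shrinks as p grows
```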
Remark 8.4.1 Note that the second part of Lemma 8.4.1 remains valid if we take {u^p}_{p=1}^∞ to be any bounded sequence of functions in L_∞([0, T], R^r) that converges to u a.e. in [0, T] as p → ∞.
In the next three lemmas, {u^p}_{p=1}^∞ is assumed to be a bounded sequence of functions in L_∞([0, T], R^r). In particular, these results are valid if {u^p}_{p=1}^∞ and u are as defined in Lemma 8.4.1. The details are left to the reader as exercises.
Lemma 8.4.2 Let {u^p}_{p=1}^∞ be a bounded sequence of functions in L_∞([0, T], R^r). Then, the sequence {x(· | u^p)}_{p=1}^∞ of the corresponding solutions of the system (8.2.1) is also bounded in L_∞([0, T], R^n).

Proof. From (8.2.1a), we have

x(t | u^p) = x^0 + ∫_0^t f(s, x(s | u^p), u^p(s)) ds,    (8.4.3)

for all t ∈ [0, T]. Applying Assumption 8.2.1 to (8.4.3), we get

|x(t | u^p)| ≤ N_0 + K ∫_0^t |x(s | u^p)| ds,    (8.4.4)

where N_0 = |x^0| + KT. Applying the Gronwall–Bellman lemma (see Theorem A.1.19), we obtain

|x(t | u^p)| ≤ N_0 exp(KT)

for all t ∈ [0, T]. Thus, the proof is complete.
Lemma 8.4.3 Let {u^p}_{p=1}^∞ be a bounded sequence of functions in L_∞([0, T], R^r) that converges to a function u a.e. in [0, T]. Then,

lim_{p→∞} ‖x(· | u^p) − x(· | u)‖_∞ = 0,

and, for each t ∈ [0, T],

lim_{p→∞} |x(t | u^p) − x(t | u)| = 0.

Proof. Let the bound of the sequence {‖u^p‖_∞}_{p=1}^∞ be denoted by N_0. It follows from Lemma 8.4.2 that there exists a constant N_1 > 0 such that

‖x(· | u^p)‖_∞ ≤ N_1

for all integers p ≥ 1. From (8.2.1a), we get

|x(t | u^p) − x(t | u)| ≤ ∫_0^t |f(s, x(s | u^p), u^p(s)) − f(s, x(s | u), u(s))| ds.

By Assumption 8.2.2, the partial derivatives of f(t, x, u) with respect to each component of x and u are piecewise continuous on [0, T] for each (x, u) ∈ B × V and continuous on B × V for each t ∈ [0, T], where B = {y ∈ R^n : |y| ≤ N_1} and V = {z ∈ R^r : |z| ≤ N_0}. Thus, there exists a constant N_2 > 0 such that

|x(t | u^p) − x(t | u)| ≤ N_2 ∫_0^t { |x(s | u^p) − x(s | u)| + |u^p(s) − u(s)| } ds.

Applying Theorem A.1.19 once more, we obtain

|x(t | u^p) − x(t | u)| ≤ N_2 [ ∫_0^t |u^p(s) − u(s)| ds ] exp(N_2 T).

Thus, both conclusions of the lemma follow easily from Remark 8.4.1.
Lemma 8.4.4 Let {u^p}_{p=1}^∞ be a bounded sequence of functions in L_∞([0, T], R^r) that converges to a function u a.e. in [0, T]. Then

lim_{p→∞} g_0(u^p) = g_0(u).

Proof. From (8.2.4), we have

|g_0(u^p) − g_0(u)| ≤ |Φ_0(x(T | u^p)) − Φ_0(x(T | u))| + ∫_0^T |L_0(t, x(t | u^p), u^p(t)) − L_0(t, x(t | u), u(t))| dt.

The conclusion of this lemma then follows from Lemma 8.4.3, Remark 8.4.1, Lemma 8.4.2, Assumptions 8.2.2 and 8.2.3, and Theorem A.1.10.

8.5 Some Convergence Results

In this section, we present some convergence properties of the sequence of approximate optimal controls. To be more precise, for each p = 1, 2, …, let σ^{p,*} be an optimal control parameter vector of Problem (P1(p)), which is a finite dimensional optimization problem. Furthermore, let {u^{p,*}} be the corresponding sequence of piecewise constant controls. Thus, for each p = 1, 2, …, u^{p,*} is referred to as the optimal piecewise constant control of Problem (P1(p)). In view of Assumption 8.3.1 given in Section 8.3, we see that each of these controls is suboptimal for the original Problem (P1) and

g_0(u^{p+1,*}) ≤ g_0(u^{p,*})

for all p = 1, 2, …. Now, two obvious questions to ask are as follows:

(i) Does g_0(u^{p,*}) converge to the true optimal cost?
(ii) Does u^{p,*} converge to the true optimal control in some sense?

We can provide partial answers to the first question if an additional assumption is satisfied. For the second question, it can be shown that u^{p,*} converges
to the true optimal control in the weak∗ topology of L∞ ([0, T ], Rr ) if the dy-
namical system is linear and the cost functional is convex. For further detail,
see Chapter 8 of [253]. To state the required additional assumption, we need
the following preliminary definition.
Definition 8.5.1 A control parameter vector σ^p ∈ Ξ^p is said to be ε-tolerated feasible if it satisfies the following ε-tolerated constraints:

−ε ≤ G_i(σ^p) ≤ ε, i = 1, …, N_e,    (8.5.1a)
G_i(σ^p) ≤ ε, i = N_e + 1, …, N,    (8.5.1b)

where G_i is defined by (8.3.5).
Let Ω^{p,ε} be the subset of Ξ^p such that the ε-tolerated constraints (8.5.1) are satisfied; and furthermore, let F^{p,ε} be the subset of U^p which consists of all those corresponding piecewise constant controls of the form (8.3.1a). Clearly, Ω^p ⊂ Ω^{p,ε} (and hence F^p ⊂ F^{p,ε}) for any ε > 0. We now consider the ε-tolerated version of the approximate Problem (P1(p)) as follows.

Problem (P1,ε(p)): Find a control parameter vector σ^p ∈ Ω^{p,ε} such that the cost functional (8.3.6) is minimized over Ω^{p,ε}.
Since Ω p ⊂ Ω p,ε for any ε > 0, it follows that

G0 (σ p,ε,∗ ) ≤ G0 (σ p,∗ )

for any ε > 0, where σ p,ε,∗ and σ p,∗ are optimal control parameter vectors of
Problems (P1,ε (p)) and (P1 (p)), respectively. Furthermore, let up,ε,∗ and up,∗
be the corresponding piecewise constant controls in the form of (8.3.1a) with
σ p replaced by σ p,ε,∗ and σ p,∗ , respectively. They are referred to as optimal
piecewise constant controls of Problems (P1,ε (p)) and (P1 (p)), respectively.
We can now specify the additional required assumption mentioned earlier.
Assumption 8.5.1 There exists an integer p_0 such that

lim_{ε→0} G_0(σ^{p,ε,*}) = G_0(σ^{p,*})

uniformly with respect to p ≥ p_0.

Note that Assumption 8.5.1 is not really restrictive from the practical
viewpoint. Indeed, a real practical problem is most likely solved numerically.
The problem formulation would clearly be in doubt if this assumption was
not satisfied. We are now in a position to present the convergence results in
the next two theorems.
Theorem 8.5.1 Let up,∗ be an optimal piecewise constant control of the
approximate Problem (P1 (p)). Suppose that the original Problem (P1 ) has an
optimal control u∗ . Then
lim_{p→∞} g_0(u^{p,*}) = g_0(u^*).

Proof. Let u^{p,ε,*} be an optimal piecewise constant control of Problem (P1,ε(p)). Then, it is clear from Assumption 8.5.1 that for any δ > 0, there exists an ε_0 > 0 such that

g_0(u^{p,ε,*}) > g_0(u^{p,*}) − δ    (8.5.2)

for any ε, 0 < ε < ε_0, uniformly with respect to p > p_0. Let u^{*,p} be the control defined from u^* by (8.4.1). Then, for any ε, 0 < ε < ε_0, it follows from Lemmas 8.4.1 and 8.4.3 and Assumptions 8.2.2 and 8.2.3 that there exists an integer p_1 > p_0 such that

u^{*,p} ∈ F^{p,ε}    (8.5.3)

for all p ≥ p_1. Thus,

g_0(u^{p,ε,*}) ≤ g_0(u^{*,p})    (8.5.4)

for all p ≥ p_1. Combining (8.5.2) and (8.5.4), we get

g_0(u^{*,p}) > g_0(u^{p,*}) − δ    (8.5.5)

for all p ≥ p_1. On the other hand, by virtue of Lemmas 8.4.1 and 8.4.4, we have

lim_{p→∞} g_0(u^{*,p}) = g_0(u^*).    (8.5.6)

Hence, it follows from (8.5.5) and (8.5.6) that

δ + g_0(u^*) ≥ lim_{p→∞} g_0(u^{p,*}).    (8.5.7)

Since δ > 0 is arbitrary and u^* is an optimal control, we conclude that

lim_{p→∞} g_0(u^{p,*}) = g_0(u^*).

This completes the proof.
Theorem 8.5.2 Let u^* be an optimal control of Problem (P1), and let u^{p,*} be an optimal piecewise constant control of the approximate Problem (P1(p)). Suppose that

u^{p,*} → ū a.e. on [0, T].

Then, ū is also an optimal control of Problem (P1).

Proof. Since u^{p,*} → ū a.e. in [0, T], it follows from Lemma 8.4.4 that

lim_{p→∞} g_0(u^{p,*}) = g_0(ū).    (8.5.8)

Next, it is easy to verify from Remark 8.4.1, Lemma 8.4.3 and Assumptions 8.2.2 and 8.2.3 that ū is also a feasible control of Problem (P1). On the other hand, it follows from Theorem 8.5.1 that

lim_{p→∞} g_0(u^{p,*}) = g_0(u^*).    (8.5.9)

Hence, the conclusion of the theorem follows easily from (8.5.8) and (8.5.9).

8.6 A Unified Computational Approach

After the application of the control parametrization technique, the constrained optimal control Problem (P1) is approximated by a sequence of optimal parameter selection Problems (P1(p)). For each positive integer p, the optimal parameter selection Problem (P1(p)) can be viewed as the following nonlinear optimization problem.

Minimize

G_0(σ^p)    (8.6.1a)

subject to

G_i(σ^p) = 0, i = 1, …, N_e,    (8.6.1b)
G_i(σ^p) ≤ 0, i = N_e + 1, …, N,    (8.6.1c)
ℓ_i(σ^{p,k}) = (E^i)ᵀ σ^{p,k} − b_i ≤ 0, i = 1, …, q, k = 1, …, n_p,    (8.6.1d)
α_i ≤ σ_i^{p,k} ≤ β_i, i = 1, …, r, k = 1, …, n_p,    (8.6.1e)

where G_0 is defined by (8.3.6), and G_i, i = 1, …, N, are defined by (8.3.5). The constraints (8.6.1d)–(8.6.1e) specify the set Ξ^p. The constraints in (8.6.1e) are known as boundedness constraints in nonlinear mathematical programming. Let Λ^p be the set that consists of all those σ^p ∈ R^{rn_p} such that the boundedness constraints (8.6.1e) are satisfied.
This nonlinear mathematical programming problem in terms of the control parameters can be solved by using any suitable nonlinear optimization technique, such as sequential quadratic programming (SQP) (see Section 3.5). Like most nonlinear optimization techniques, SQP requires an initial control parameter vector (σ^p)^(0) ∈ Λ^p to start up the iterative search for an optimal solution. For each iterate (σ^p)^(i) ∈ Λ^p generated during the optimization, the values of the cost function (8.6.1a) and the constraints (8.6.1b)–(8.6.1c) as well as their respective gradients are required in order to generate the next iterate (σ^p)^(i+1). Consequently, like many others, the technique gives rise to a sequence of control parameter vectors. The optimal control parameter vector obtained by the SQP routine is then taken as an optimal control parameter vector of Problem (P1(p)). In what follows, we shall explain how to calculate, for each σ^p ∈ Λ^p, the values of the cost function G_0(σ^p) and the constraint functions G_i(σ^p), i = 1, …, N, and ℓ_i(σ^{p,k}), i = 1, …, q, k = 1, …, n_p, as well as their respective gradients.
Remark 8.6.1 For each σ^p ∈ Λ^p, the corresponding values of the constraint functions ℓ_i(σ^{p,k}), i = 1, …, q, k = 1, …, n_p, are straightforward to calculate via (8.6.1d). Their gradients are given by

∂ℓ_i(σ^{p,k})/∂σ^{p,k} = (E^i)ᵀ, i = 1, …, q, k = 1, …, n_p.    (8.6.2)
To calculate the values of the cost functional (8.6.1a) and the constraint
functionals (8.6.1b)–(8.6.1c) corresponding to each σ p ∈ Λp , the first task is
to calculate the solution of the system (8.3.4) corresponding to each σ p ∈ Λp .
This is presented as an algorithm for future reference.
Algorithm 8.6.1 For each given σ p ∈ Λp , compute the solution x(·|σ p ) of
the system (8.3.4) by solving the differential equations (8.3.4a) forward in
time from t = 0 to t = T with the initial condition (8.3.4b).
With the information obtained in Algorithm 8.6.1, the values of G_i corresponding to each σ^p ∈ Λ^p can be easily calculated by the following simple algorithm.

Algorithm 8.6.2
Step 1. Use Algorithm 8.6.1 to solve for x(· | σ^p). Thus, x(t | σ^p) is known for each t ∈ [0, T]. This implies that
  (a) Φ_i(x(τ_i | σ^p)), i = 0, 1, …, N, are known; and
  (b) L̃_i(t, x(t | σ^p), σ^p), i = 0, 1, …, N, are known for each t ∈ [0, τ_i]. Hence, their integrals

  ∫_0^{τ_i} L̃_i(t, x(t | σ^p), σ^p) dt, i = 0, 1, …, N,

  can be obtained readily using Simpson's rule.
Step 2. Calculate the values of the cost functional (i = 0) and the constraint functionals, i = 1, …, N, according to

G_i(σ^p) = Φ_i(x(τ_i | σ^p)) + ∫_0^{τ_i} L̃_i(t, x(t | σ^p), σ^p) dt, i = 0, 1, …, N.
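Algorithms 8.6.1 and 8.6.2 can be sketched as follows (illustrative, not MISER code). scipy's solve_ivp plays the role of the forward ODE solver, piecewise_constant_control is the sketch from Section 8.3, and a trapezoidal rule stands in for Simpson's rule; f, Phis, Ls, taus, knots and x0 are user-supplied problem data.

```python
# A sketch of Algorithms 8.6.1-8.6.2: forward state solve for a given sigma,
# then accumulation of the canonical functionals G_i, i = 0, 1, ..., N.
import numpy as np
from scipy.integrate import solve_ivp

def G_values(sigma, knots, f, Phis, Ls, taus, x0, n_quad=2001):
    u = piecewise_constant_control(sigma, knots)
    T = knots[-1]
    sol = solve_ivp(lambda t, x: f(t, x, u(t)), (0.0, T), x0,
                    dense_output=True, max_step=float(np.diff(knots).min()),
                    rtol=1e-8, atol=1e-10)          # max_step respects switches
    G = []
    for Phi, L, tau in zip(Phis, Ls, taus):         # one functional per i
        ts = np.linspace(0.0, tau, n_quad)
        xs = sol.sol(ts)                            # state on [0, tau_i]
        vals = np.array([L(t, xs[:, j], u(t)) for j, t in enumerate(ts)])
        integral = 0.5 * np.sum((vals[1:] + vals[:-1]) * np.diff(ts))
        G.append(Phi(xs[:, -1]) + integral)
    return G
```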
In view of Section 7.3, we see that the derivations of the gradient formulae for the cost functional and the canonical constraint functionals are the same. For each i = 0, 1, …, N, the gradient of the corresponding G_i may be computed using the following algorithm.

Algorithm 8.6.3 Given σ^p ∈ Λ^p, proceed as follows.

Step 1. Solve the costate differential equation

dλ^i(t)/dt = −[∂H̃_i(t, x(t | σ^p), σ^p, λ^i(t))/∂x]ᵀ    (8.6.3a)

with the boundary condition

λ^i(τ_i) = [∂Φ_i(x(τ_i | σ^p))/∂x]ᵀ    (8.6.3b)

backward in time from t = τ_i to t = 0, where x(· | σ^p) is the solution of the system (8.3.4) corresponding to σ^p ∈ Ξ^p, and H̃_i is the corresponding Hamiltonian function for the cost functional if i = 0 and for the i-th constraint functional if i = 1, …, N, defined by

H̃_i(t, x, σ^p, λ) = L̃_i(t, x, σ^p) + λᵀ f̃(t, x, σ^p).    (8.6.4)

Let λ^i(· | σ^p) be the solution of the costate system (8.6.3).

Step 2. The gradient of G_i is computed as

∂G_i(σ^p)/∂σ^p = ∫_0^{τ_i} ∂H̃_i(t, x(t | σ^p), σ^p, λ^i(t | σ^p))/∂σ^p dt.    (8.6.5)
Remark 8.6.2 During actual computation, the control parametrization is very often carried out on a uniform partition of the interval [0, T], i.e.,

u_j^p(t) = Σ_{k=1}^{n_p} σ_j^{p,k} χ_k(t), j = 1, …, r,    (8.6.6)

where χ_k is the indicator function given by

χ_k(t) = 1 if (k − 1)Δ < t < kΔ, and χ_k(t) = 0 otherwise,    (8.6.7)

u^p(t) = [u_1^p, …, u_r^p]ᵀ, σ^{p,k} = [σ_1^{p,k}, …, σ_r^{p,k}]ᵀ, n_p is the number of equal subintervals and Δ = T/n_p is the uniform interval length. In this case, each component of the gradient formula (8.6.5) can be written in the more specific form

∂G_i(σ^p)/∂σ_j^{p,k} = ∫_{(k−1)Δ}^{kΔ} ∂H_i/∂u_j^p dt, if k ≤ l_i;
∂G_i(σ^p)/∂σ_j^{p,k} = ∫_{l_iΔ}^{τ_i} ∂H_i/∂u_j^p dt, if k = l_i + 1;
∂G_i(σ^p)/∂σ_j^{p,k} = 0, if k > l_i + 1,    (8.6.8)

i = 0, 1, …, N, j = 1, …, r, k = 1, …, n_p, where

H_i(t, x, u, λ) = L_i(t, x, u) + λᵀ f(t, x, u),    (8.6.9a)
l_i = ⌊τ_i/Δ⌋,    (8.6.9b)

and ⌊τ_i/Δ⌋ denotes the largest integer smaller than or equal to τ_i/Δ.

Note that a uniform partition for the parametrization is not a strict re-
quirement but rather a computational convenience. At times when it is known
a priori that the control changes rapidly over certain intervals and changes
slowly over others, it will be more effective to use a nonuniform partition.
Remark 8.6.3 Note that the gradient formulae for the cost and constraint
functionals can also be obtained using the variational approach as detailed in
Section 7.3.
Remark 8.6.4 Clearly, when n_p increases, the computational time required to solve the corresponding approximate problem will increase with some power of n_p. To overcome this difficulty, we propose to solve any given problem as follows.

Let q be a small positive integer. To begin, we solve the approximate Problem (P1(1)) with n_1 = q. Let σ^{1,*} be the optimal solution so obtained, and let u^{1,*} be the corresponding control. Then, it is clear that u^{1,*} is a suboptimal control for the original Problem (P1). Next, we choose n_2 = 2q, and let σ_0^2 denote the parameter vector that describes u^{1,*} over the new partition with n_2 = 2q intervals. Clearly, σ_0^2 is a feasible parameter vector for Problem (P1(2)). We then solve Problem (P1(2)) using σ_0^2 as the initial guess. The process continues until the cost reduction becomes negligible. Computational experience indicates that the reduction in the cost value appears to be insignificant for n_p > 20 in many problems. Also, by virtue of the construction of each subsequent initial guess, the increase in CPU time is seldom drastic. A sketch of this refinement loop is given below.
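A possible realization of this strategy is sketched below (illustrative; solve_P stands for any routine, e.g., SQP via scipy.optimize.minimize, that returns an optimal σ^p and cost for a given partition size).

```python
# A sketch of Remark 8.6.4: solve on a coarse partition, then repeatedly
# double n_p, warm-starting each solve from the previous optimal control.
import numpy as np

def refine_and_solve(solve_P, q=5, r=1, rounds=4, tol=1e-6):
    sigma = np.zeros((q, r))                # initial guess for n_1 = q pieces
    prev_cost = np.inf
    for _ in range(rounds):
        sigma, cost = solve_P(sigma)        # solve Problem (P1(p)) from guess
        if prev_cost - cost < tol:          # stop when reduction is negligible
            break
        prev_cost = cost
        sigma = np.repeat(sigma, 2, axis=0) # same control on doubled partition
    return sigma, cost
```

Note that np.repeat duplicates each row, which represents the same piecewise constant control on the refined partition, so the new initial guess is automatically feasible in the sense of Remark 8.6.4.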
Remark 8.6.5 Note that the constraint transcription introduced in [241] is used in the transformation of the continuous inequality constraints in Remark 8.2.1(iv). However, this constraint transcription has a serious disadvantage, because the canonical equality state constraint so obtained does not satisfy any constraint qualification (see Remark 3.1.1). In particular, let us consider the constraint specified in Remark 8.2.1(iv). Then,

∂G(σ^p)/∂σ^p = 0

if the parameter vector σ^p is such that

max_{0≤t≤T} h(t, x(t | σ^p), σ^p) = 0,

where the function h is defined in Remark 8.2.1(iv). In this situation, the linear approximation of the constraint G is equal to zero for all search directions. This, in turn, implies that the search direction, which is obtained from the corresponding quadratic programming subproblem, may be one along which the corresponding constraint will always be violated. Two better transformation methods have been presented in Chapter 4: one is based on the constraint transcription method and the other is based on the exact penalty function method that will be introduced for the continuous inequality constraints in Section 9.3.

8.7 Illustrative Examples

To illustrate the simple and yet efficient solution procedure outlined in the
previous sections, we now present the numerical results of applying this pro-
cedure to several examples. Note that the procedure has been implemented
in the software package MISER 3.3 (see [104]) that is used to generate the
results presented here and also those in later sections and chapters of the
text.
Example 8.7.1 (Bitumen Pyrolysis) This problem has been widely stud-
ied in the literature, more recently in the context of determining global op-
timal solutions of nonlinear optimal control problems (see [53] and the ref-
erences cited therein). The task is to find an optimal temperature profile in
a plug flow reactor with the following reactions involving components Ai ,
i = 1, …, 4, and reaction rates k_j, j = 1, …, 5:

A_1 →(k_1) A_2,   A_2 →(k_2) A_3,   A_1 + A_2 →(k_3) A_2 + A_2,
A_1 + A_2 →(k_4) A_3 + A_2,   A_1 + A_2 →(k_5) A_4 + A_2.
While a more comprehensive model of the problem [176] involves all of the
components, the version commonly presented in the literature (and which
we adopt here) includes components A1 and A2 only [53]. The aim is to
determine an optimal temperature profile to maximize the final amount of
A2 subject to bound constraints on the temperature. In standard form, this
problem can be stated as follows. Minimize

g0 (u) = −x2 (T )

subject to

dx1 (t)
= −k1 x1 (t) − (k3 + k4 + k5 )x1 (t)x2 (t),
dt
dx2 (t)
= k1 x1 (t) − k2 x2 (t) + k3 x1 (t)x2 (t),
dt
x1 (0) = 1,
x2 (0) = 0,
where

k_i = a_i exp( −(b_i/R)/u ), i = 1, …, 5,
and the values of ai , bi /R, i = 1, . . . , 5, are given in Table 8.7.1. We also have
the control constraint

698.15 ≤ u(t) ≤ 748.15, t ∈ [0, T ].

We solve the problem with T = 10 and assume that u is piecewise constant


over a uniform partition with 10 subintervals.
Table 8.7.1: Data for the bitumen pyrolysis example

  i   ln a_i    b_i/R
  1    8.86    10215.4
  2   24.25    18820.5
  3   23.67    17008.9
  4   18.75    14190.8
  5   20.70    15599.8
[Fig. 8.7.1: Computed optimal solution for Example 8.7.1 with 10 subintervals. Left panel: states x_1 and x_2 versus t; right panel: control u versus t.]
The optimal solution obtained yields an objective function value of −0.353434883 when a constant initial guess for the control was chosen halfway between the upper and lower bounds. As noted in [53], the problem has several local minima. While the global solution is obtained with the initial guess we choose here, other initial guesses for the control can lead to one of the local optimal solutions. To illustrate the effect of the size of the partition for the piecewise constant control, we also solved the problem assuming a uniform piecewise constant partition with 30 intervals for the control. A slightly improved objective function value of −0.353717047 is obtained with this refined partition, and the shape of the optimal control function is now more well-defined (compare Figures 8.7.1 and 8.7.2).
[Fig. 8.7.2: Computed optimal solution for Example 8.7.1 with 30 subintervals. Left panel: states x_1 and x_2 versus t; right panel: control u versus t.]
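For concreteness, here is a hedged end-to-end sketch of Example 8.7.1: a piecewise constant control on 10 equal subintervals, scipy's solve_ivp for the state, and SLSQP (with finite-difference gradients) on the control parameters. This is not the MISER implementation; the optimizer and tolerances are illustrative choices.

```python
# A sketch of Example 8.7.1 via control parametrization.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

ln_a = np.array([8.86, 24.25, 23.67, 18.75, 20.70])
b_over_R = np.array([10215.4, 18820.5, 17008.9, 14190.8, 15599.8])
T, n_p = 10.0, 10

def rates(u):                                  # k_i = a_i exp(-(b_i/R)/u)
    return np.exp(ln_a - b_over_R / u)

def dynamics(t, x, sigma):
    u = sigma[min(int(t / (T / n_p)), n_p - 1)]  # piecewise constant control
    k = rates(u)
    return [-k[0]*x[0] - (k[2] + k[3] + k[4])*x[0]*x[1],
             k[0]*x[0] - k[1]*x[1] + k[2]*x[0]*x[1]]

def objective(sigma):                          # g_0(u) = -x_2(T)
    sol = solve_ivp(dynamics, (0.0, T), [1.0, 0.0], args=(sigma,),
                    max_step=T / n_p, rtol=1e-8, atol=1e-10)
    return -sol.y[1, -1]

sigma0 = np.full(n_p, 0.5 * (698.15 + 748.15)) # halfway between the bounds
res = minimize(objective, sigma0, method='SLSQP',
               bounds=[(698.15, 748.15)] * n_p)
print(res.fun)   # should be close to the value -0.3534 reported above
```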

Example 8.7.2 (Student Problem) Consider the Student Problem pre-


sented in Chapter 1. Choosing b = 0.5, c = 0.1, k0 = 15, T = 15 and w̄ = 15,
the problem may be written in standard form as follows. Minimize

g_0(u) = ∫_0^{15} u(t) dt

subject to

dx(t)/dt = 0.5 u(t) − 0.1 x(t), x(0) = 0,

with the control constraint

0 ≤ u(t) ≤ 15, t ∈ [0, 15].

We assume that u is piecewise constant over a uniform partition with 20


subintervals.
The analytic optimal solution for this problem can be shown to involve a pure bang-bang control. The approximate solution calculated here (see Figure 8.7.3b) is therefore clearly suboptimal. This is due to the fixed partition we have chosen for the control function, because the exact switching time for the optimal control does not happen to coincide with one of the fixed knot points in the chosen partition. While an increasingly finer partition can yield improved solutions, a much better way to solve bang-bang and related optimal control problems numerically is discussed in Section 8.9.

[Fig. 8.7.3: Computed optimal solution for Example 8.7.2. Left panel: state x versus t; right panel: control u versus t.]

8.8 Combined Optimal Control and Optimal Parameter Selection Problems

In the previous sections, a unified and efficient computational scheme was developed for solving a general class of fixed terminal time optimal control prob-
lems in canonical form. However, there are many practical problems that
do not belong to this general class of optimal control problems. Examples
include:

(i) optimal parameter selection problems;
(ii) free terminal time optimal control problems, including minimum-time problems;
(iii) minimax optimal control problems (problems with Chebyshev performance index);
(iv) boundary-value control problems, including problems with periodic and interrelated boundary conditions; and
(v) combined optimal control and optimal parameter selection problems.
The aim of this section is to extend the results presented in the previous
sections to a more general class of optimization problems that covers all the
important problems mentioned above as special cases. Since the extension is
rather straightforward, the following exposition is relatively brief. To begin,
let us consider a process described by the following system of differential equations on the fixed time interval (0, T]:

dx(t)/dt = f(t, x(t), ζ, u(t)),    (8.8.1a)
where x = [x_1, …, x_n]ᵀ ∈ R^n, ζ = [ζ_1, …, ζ_s]ᵀ ∈ R^s and u = [u_1, …, u_r]ᵀ ∈ R^r are, respectively, the state, system parameter and control vectors. We have f = [f_1, …, f_n]ᵀ ∈ R^n with f : R × R^n × R^s × R^r → R^n. The initial condition for the differential equation (8.8.1a) is

x(0) = x^0(ζ),    (8.8.1b)

where x^0 = [x_1^0, …, x_n^0]ᵀ is a given function of the system parameter vector ζ.
Define

Z = {ζ = [ζ_1, …, ζ_s]ᵀ ∈ R^s : a_i ≤ ζ_i ≤ b_i, i = 1, …, s},    (8.8.2)

where a_i and b_i, i = 1, …, s, are given real numbers. Clearly, Z is a compact and convex subset of R^s. Let U be as defined in Section 8.2.
For each (ζ, u) ∈ R^s × L_∞([0, T], R^r), let x(· | ζ, u) be the corresponding solution of the system (8.8.1). We may then state the combined optimal control and optimal parameter selection problem as follows.

Problem (Q): Given system (8.8.1), find a combined element (ζ, u) ∈ Z × U such that the cost functional

g_0(ζ, u) = Φ_0(x(T | ζ, u), ζ) + ∫_0^T L_0(t, x(t | ζ, u), ζ, u(t)) dt    (8.8.3)

is minimized over Z × U, subject to the equality constraints

g_i(ζ, u) = Φ_i(x(T | ζ, u), ζ) + ∫_0^{τ_i} L_i(t, x(t | ζ, u), ζ, u(t)) dt = 0,  i = 1, …, N_e,    (8.8.4a)

and subject to the inequality constraints

g_i(ζ, u) = Φ_i(x(T | ζ, u), ζ) + ∫_0^{τ_i} L_i(t, x(t | ζ, u), ζ, u(t)) dt ≤ 0,  i = N_e + 1, …, N,    (8.8.4b)

where Φ_i : R^n × R^s → R, i = 0, 1, …, N, and L_i : R × R^n × R^s × R^r → R, i = 0, 1, …, N, are given real-valued functions. As before, τ_i ≤ T is referred to as the characteristic time for the i-th constraint.
We assume throughout this section that the corresponding versions of Assumptions 8.2.1–8.5.1 are satisfied. We now apply the concept of control parametrization to Problem (Q). Thus, system (8.8.1) takes the form

dx(t)/dt = f̃(t, x(t), ζ, σ^p),    (8.8.5a)
x(0) = x^0(ζ),    (8.8.5b)

where

f̃(t, x(t), ζ, σ^p) = f(t, x(t), ζ, Σ_{k=1}^{n_p} σ^{p,k} χ_{I_k^p}(t)),    (8.8.5c)

χ_{I_k^p}(t) is defined by (8.3.1b) and I_k^p, k = 1, …, n_p, are as defined in Section 8.3. Let x(· | ζ, σ^p) be the solution of the system (8.8.5) corresponding to the combined vector (ζ, σ^p) ∈ Z × Ξ^p, where Ξ^p is defined in Section 8.3.
The constraints (8.8.4a) and (8.8.4b) are reduced to
 T
Gi (ζ, σ ) = Φi (x(T |ζ, σ ), ζ) +
p p
L̃i (t, x(t|ζ, σ p ), ζ, σ p ) dt = 0,
0
i = 1, . . . , Ne , (8.8.6a)

and
 T
Gi (ζ, σ p ) = Φi (x(T |ζ, σ p ), ζ) + L̃i (t, x(t|ζ, σ p ), ζ, σ p ) dt ≤ 0,
0
i = Ne + 1, . . . , N, (8.8.6b)

respectively, where
 

np
L̃i (t, x, ζ, σ ) = Li
p
t, x, ζ, σ p,k
χ (t) .
Ikp (8.8.6c)
k=1

Let Dp be the set that consists of all those combined vectors (ζ, σ p ) in Z ×Ξ p
that satisfy the constraints (8.8.6a) and (8.8.6b). Furthermore, let B p be the
corresponding subset of Z × U p . We are now in a position to specify an
approximate version of Problem (Q) as follows.
Problem (Q(p)): Subject to system (8.8.5), find a combined vector (ζ, σ p ) ∈
Dp such that the cost function
G0(ζ, σ^p) = Φ0(x(T|ζ, σ^p), ζ) + ∫₀ᵀ L̃0(t, x(t|ζ, σ^p), ζ, σ^p) dt (8.8.7)

is minimized over Dp .
Problem (Q(p)) can also be stated in the following form.
Minimize
G0 (ζ, σ p ) (8.8.8a)

subject to

Gi (ζ, σ p ) = 0, i = 1, . . . , Ne , (8.8.8b)
Gi (ζ, σ p ) ≤ 0, i = Ne + 1, . . . , N, (8.8.8c)
ℓi(σ^{p,k}) = Ei σ^{p,k} − bi ≤ 0, i = 1, . . . , q, k = 1, . . . , np, (8.8.8d)

αi ≤ σip,k ≤ βi , i = 1, . . . , r, k = 1, . . . , np , (8.8.8e)
ai ≤ ζi ≤ bi , i = 1, . . . , s, (8.8.8f)
where G0 is defined by (8.8.7), Gi , i = 1, . . . , N , are defined by (8.8.6a)
and (8.8.6b), while (8.8.8d)–(8.8.8f) are the constraints that specify the set
Z × Ξ^p. The constraints (8.8.8e)–(8.8.8f) are known as boundedness constraints in nonlinear programming. Let Λ^p be the set that
consists of all those (ζ, σ p ) ∈ Rs × Rrnp such that the boundedness con-
straints (8.8.8e)–(8.8.8f) are satisfied.
This nonlinear mathematical programming problem in the control parame-
ter vectors can be solved by using any nonlinear optimization technique, such
as the sequential quadratic programming (SQP) approach (see Section 3.5).
In solving the nonlinear optimization problem (8.8.8) via SQP, we choose an
initial control parameter vector (ζ, σ p )(0) ∈ Λp to initialize the SQP process.
Then, for each (ζ, σ p )(i) ∈ Λp , the values of the cost function (8.8.8a) and the
constraints (8.8.8b)–(8.8.8d) as well as their respective gradients are required
by SQP to generate the next iterate (ζ, σ p )(i+1) . Consequently, it gives rise
to a sequence of combined vectors. The optimal combined vector obtained
by the SQP process is then regarded as an approximate optimal combined
vector of Problem (Q(p)).
For each (ζ, σ^p) ∈ Λ^p, the values of the cost function G0(ζ, σ^p) and the constraint functions Gi(ζ, σ^p), i = 1, . . . , N, and ℓi(σ^{p,k}), i = 1, . . . , q, k = 1, . . . , np, can be calculated in a manner similar to the corresponding components of Algorithms 8.6.1 and 8.6.2 and Remark 8.6.1. The gradients of ℓi(σ^{p,k}), i = 1, . . . , q, k = 1, . . . , np, are given in Remark 8.6.1. A procedure similar to that described in Algorithm 8.6.3 is given below for computing the gradient of Gi(ζ, σ^p) for each i = 0, 1, . . . , N.
Algorithm 8.8.1 Let (ζ, σ p ) ∈ Λp be given.

Step 1. Solve the costate system (8.6.3) with σ^p replaced by (ζ, σ^p) backward in time from t = τi to t = 0 (again τ0 = T by convention). Let the corresponding solution be denoted by λ̃i(·|ζ, σ^p).
Step 2. The gradient is computed from

∂Gi(ζ, σ^p)/∂σ = ∫₀^{τi} ∂H̃i(t, x(t|ζ, σ^p), ζ, σ^p, λ̃i(t|ζ, σ^p))/∂σ dt (8.8.9a)

and

∂Gi(ζ, σ^p)/∂ζ = (λ̃i(0|ζ, σ^p))⊤ ∂x0(ζ)/∂ζ + ∫₀^{τi} ∂H̃i(t, x(t|ζ, σ^p), ζ, σ^p, λ̃i(t|ζ, σ^p))/∂ζ dt, (8.8.9b)

where H̃i is defined by an equation similar to (8.6.4).
Remark 8.8.1 In the above algorithm, the solution x(·|ζ, σ p ) of the sys-
tem (8.8.5) corresponding to each (ζ, σ p ) ∈ Λp can be computed by an algo-
rithm similar to Algorithm 8.6.1.
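When implementing the gradient formulae (8.8.9), a finite-difference cross-check is often useful. The following is a hedged sketch on an assumed scalar toy instance (the dynamics, the cost and the helper evaluate_G are illustrative assumptions, not taken from the text): central differences approximate ∂G/∂ζ and ∂G/∂σ, and these values should agree with those returned by an implementation of (8.8.9a)–(8.8.9b).

    # Finite-difference cross-check for the gradients of G(zeta, sigma).
    # Toy instance (assumed): x' = zeta*x + u, x(0) = zeta, G = x(T)^2,
    # with u piecewise constant on a uniform partition.
    import numpy as np
    from scipy.integrate import solve_ivp

    def evaluate_G(zeta, sigma, T=1.0):
        n = len(sigma)
        u = lambda t: sigma[min(int(t / T * n), n - 1)]
        rhs = lambda t, x: zeta * x + u(t)
        xT = solve_ivp(rhs, (0.0, T), [zeta], rtol=1e-10).y[0, -1]
        return xT ** 2

    def fd_gradient(zeta, sigma, h=1e-6):
        dG_dzeta = (evaluate_G(zeta + h, sigma) - evaluate_G(zeta - h, sigma)) / (2 * h)
        dG_dsigma = np.array([
            (evaluate_G(zeta, sigma + h * e) - evaluate_G(zeta, sigma - h * e)) / (2 * h)
            for e in np.eye(len(sigma))])
        return dG_dzeta, dG_dsigma

    print(fd_gradient(0.5, np.array([1.0, -1.0])))

Note the ζ-gradient picks up both the explicit dependence through the dynamics and the implicit dependence through x(0) = x0(ζ), exactly the two terms appearing in (8.8.9b).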
The convergence properties of the proposed method will be presented in
the next two theorems. Their proofs are similar to those given for Theo-
rems 8.5.1 and 8.5.2, respectively.
Theorem 8.8.1 Let (ζ p,∗ , σ p,∗ ) be an optimal combined vector of Problem
(Q(p)), and let (ζ p,∗ , ūp,∗ ) be the corresponding element in Z × U p . Sup-
pose that the original problem (Q) has an optimal combined element (ζ ∗ , u∗ ).
Then,
lim_{p→∞} g0(ζ^{p,∗}, ū^{p,∗}) = g0(ζ∗, u∗).

Theorem 8.8.2 Let (ζ^{p,∗}, ū^{p,∗}) be as defined in Theorem 8.8.1. Suppose that

lim_{p→∞} |ū^{p,∗}(t) − u∗(t)| = 0 a.e. on [0, T]

and

lim_{p→∞} |ζ^{p,∗} − ζ∗| = 0.

Then, (ζ ∗ , u∗ ) is an optimal combined system parameter vector and control.

8.8.1 Model Transformation

In this subsection, our aim is to show that many different classes of optimal
control problems can be transformed into special cases of Problem (Q). The
following is a list of some of these transformations, but it is by no means
exhaustive. Readers are advised to exercise their ingenuity and initiative in
applying these transformations and devising new ones.
Note that Sections 8.8.1 and 8.8.2 are based on Sections 6.8.1 and 6.8.2 of [253], respectively.
(i) Free terminal time problems (including minimum-time problems).

min_{u(·),T} g0(u, T),

where

g0(u, T) = Φ0(x(T), T) + ∫₀ᵀ L0(t, x(t), u(t)) dt,
subject to the differential equation

dx(t)/dt = f(t, x(t), u(t)), t ∈ (0, T],
with the initial condition
x(0) = x0
and terminal condition

g(x(T )) = 0 (or ≤ 0).

This problem is not in the form of Problem (P) or Problem (Q), as the terminal time T is not fixed, but variable. The simple time scale transformation
t = T τ
then converts the problem into

min_{T,û(·)} g0(û, T),

where

g0(û, T) = Φ0(x̂(1), T) + T ∫₀¹ L0(τT, x̂(τ), û(τ)) dτ,

subject to the differential equation

dx̂(τ)/dτ = T f(τT, x̂(τ), û(τ)),

the initial condition
x̂(0) = x0 ,
and the terminal condition

g(x̂(1)) = 0 (or ≤ 0).

Note that the transformed problem takes the form of Problem (Q) if we
treat T as a system parameter.
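To make this concrete, the following minimal Python sketch treats T as an ordinary decision parameter alongside piecewise constant control values, integrating the scaled dynamics over the fixed horizon τ ∈ [0, 1]. The double-integrator plant, the terminal penalty weight and the bounds are illustrative assumptions, not data from the text.

    # Sketch of the time scale transformation t = T*tau with T a parameter.
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    def f(t, x, u):
        return np.array([x[1], u])      # assumed toy plant: x1' = x2, x2' = u

    def scaled_rhs(tau, x, u_of_tau, T):
        # dx/dtau = T * f(tau*T, x, u(tau)) on the fixed horizon [0, 1]
        return T * f(tau * T, x, u_of_tau(tau))

    def cost(z, n=10):
        # decision vector z = [T, sigma_1, ..., sigma_n]
        T, sigma = z[0], z[1:]
        u = lambda tau: sigma[min(int(tau * n), n - 1)]
        sol = solve_ivp(scaled_rhs, (0.0, 1.0), [0.0, 0.0], args=(u, T), rtol=1e-8)
        xT = sol.y[:, -1]
        # minimum-time flavour: T plus a penalty enforcing x(T) = (1, 0)
        return T + 100.0 * ((xT[0] - 1.0) ** 2 + xT[1] ** 2)

    z0 = np.concatenate(([1.0], np.zeros(10)))
    res = minimize(cost, z0, bounds=[(0.1, 10.0)] + [(-1.0, 1.0)] * 10)
    print(res.x[0])                      # the optimised terminal time T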
(ii) Minimax optimal control problems.
The state dynamical equations are as in (8.2.1) but the cost functional
to be minimized takes the form
g0(u) = max_{0≤t≤T} C(t, x(t), u(t)) + Φ̂0(x(T)) + ∫₀ᵀ L0(t, x(t), ζ, u(t)) dt

(often referred to as a Chebyshev performance index). If we introduce the additional parameter

S = max_{0≤t≤T} C(t, x(t), u(t)),

then the cost functional is equivalent to

ĝ0(u, S) = Φ0(x(T), S) + ∫₀ᵀ L0(t, x(t), ζ, u(t)) dt

subject to the continuous state constraint

C(t, x(t), u(t)) − S ≤ 0, ∀t ∈ [0, T ],

where
Φ0(x(T), S) = Φ̂0(x(T)) + S.
The resulting problem, due to the additional continuous state constraint, is not exactly in the form of Problem (Q). However, the techniques to be introduced in Chapter 9 can be readily used to solve it.
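As a hedged numerical illustration of the reformulation, the sketch below appends the bound S to the decision vector, takes S itself as the objective, and enforces C − S ≤ 0 on a finite time grid. The dynamics and the choice C(t, x, u) = x1² are assumptions made for the example only.

    # Chebyshev-index reformulation: minimise the bound S subject to C - S <= 0.
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize, NonlinearConstraint

    n, T = 8, 1.0
    grid = np.linspace(0.0, T, 50)

    def x1_on_grid(sigma):
        u = lambda t: sigma[min(int(t / T * n), n - 1)]
        rhs = lambda t, x: np.array([x[1], -x[0] + u(t)])   # assumed dynamics
        return solve_ivp(rhs, (0.0, T), [1.0, 0.0], t_eval=grid, rtol=1e-8).y[0]

    def C_minus_S(z):
        S, sigma = z[0], z[1:]
        return x1_on_grid(sigma) ** 2 - S                    # C - S at grid points

    cons = NonlinearConstraint(C_minus_S, -np.inf, 0.0)
    res = minimize(lambda z: z[0],                           # minimise the bound S
                   np.concatenate(([1.0], np.zeros(n))),
                   constraints=[cons], method='trust-constr')
    print(res.x[0])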
(iii) Problems with periodic boundary conditions.
The cost functional and the state dynamical equations are as described
by (8.2.4) and (8.2.1), but the initial and final state values are related
by
h(x(0), x(T )) = 0. (8.8.10)
In this case, we can introduce a system parameter vector ζ ∈ Rn and
put
x(0) = ζ.
Then the constraint (8.8.10) is equivalent to

h(ζ, x(T )) = 0,

and, once again, we have a special case of Problem (Q).
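A minimal sketch of this device, assuming scipy and a toy oscillator: the unknown initial state ζ is the decision vector, and periodicity h(ζ, x(T)) = x(T) − ζ = 0 enters as an ordinary equality constraint. The dynamics and the cost are illustrative assumptions.

    # Periodic boundary conditions via the system parameter zeta = x(0).
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    def x_at_T(zeta, T=2 * np.pi):
        rhs = lambda t, x: np.array([x[1], -x[0]])
        return solve_ivp(rhs, (0.0, T), zeta, rtol=1e-9).y[:, -1]

    periodicity = {'type': 'eq', 'fun': lambda z: x_at_T(z) - z}
    res = minimize(lambda z: (z[0] - 1.0) ** 2 + z[1] ** 2,  # pick a preferred orbit
                   x0=np.array([0.5, 0.5]),
                   constraints=[periodicity], method='SLSQP')
    print(res.x)   # an initial state generating a periodic trajectory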

8.8.2 Smoothness of Optimal Control

The control parametrization method typically uses a piecewise constant approximation, so the resulting optimal control is discontinuous. While this is reasonable for many applications, and actually desirable for some, others may require a continuous control or even a control with a certain degree of smoothness. The parametrization can instead be carried out with piecewise linear functions, in which case the resulting optimal control is continuous. Alternatively, as detailed in Chapter 9 of [253], optimal controls with any degree of smoothness can be readily found with the control parametrization approach and some additional computational effort.
For example, if a piecewise linear continuous control is desired for Problem
(Q), one only needs to introduce an additional set of differential equations,

du(t)/dt = v(t)
with the initial conditions

u(0) = ζu = [ζs+1 , . . . , ζs+r ] ,

where ζ_{s+1}, . . . , ζ_{s+r} are additional system parameters to be optimized. In the context of Problem (Q), u is now effectively a state function rather than a control function, and v is the new piecewise constant control function. Note that the resulting u is a piecewise linear and continuous function.
Similarly, if we wish to approximate the optimal control by a C 1 (i.e.,
smooth) function, we introduce two additional sets of differential equations,

du(t)
= v(t),
dt
dv(t)
= w(t),
dt
together with the initial conditions

u(0) = ζu ,

v(0) = ζv ,
where w(t) is piecewise constant, u(t) and v(t) are effectively new state vari-
ables and ζu and ζv are additional system parameter vectors to be optimized.
Clearly, we can extend this process further to generate piecewise polynomial
optimal controls with any desired degree of smoothness.
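The following sketch, under an assumed toy plant f, shows the first augmentation at work: u is integrated as an extra state driven by the piecewise constant slope control v, so the control actually applied to the plant is continuous and piecewise linear. The slope values and the initial control value ζ_u would be the decision variables in the resulting Problem (Q).

    # State augmentation for a piecewise linear continuous control.
    import numpy as np
    from scipy.integrate import solve_ivp

    def f(t, x, u):
        return np.array([x[1], -x[0] + u])          # assumed plant

    def augmented_rhs(t, y, v_of_t):
        x, u = y[:2], y[2]
        return np.concatenate((f(t, x, u), [v_of_t(t)]))   # du/dt = v(t)

    n, T = 10, 5.0
    v_params = np.linspace(-1.0, 1.0, n)            # slopes to be optimised
    v = lambda t: v_params[min(int(t / T * n), n - 1)]
    zeta_u = 0.3                                    # u(0): also a decision variable
    sol = solve_ivp(augmented_rhs, (0.0, T), [1.0, 0.0, zeta_u], args=(v,), rtol=1e-8)
    print(sol.y[2, -1])   # u(T); the trajectory of y[2] is continuous, piecewise linear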

8.8.3 Illustrative Examples

The efficiency and versatility of the proposed computational schemes will be illustrated by several numerical examples.
Example 8.8.1 In this example, we consider the planar motion of a free
flying robot (FFR). Let x1 and x2 denote the coordinates of the FFR, x4 and
x5 denote the corresponding velocities, x3 denotes the direction of the thrust,
x6 denotes the angular velocity and u1 and u2 denote the thrusts of the two
jets. Furthermore we assume that the robot moves at a constant height from

the initial position to the final equilibrium position. The robot is controlled
by the thrust of the two jets, i.e., u1 and u2 , and these two control variables
are subject to boundedness constraints. This problem was studied earlier in [218] and [270]; in [270], it was formulated and solved as an Lp-minimization problem. In [13], it is formulated as an L2-minimization problem, described formally below.
Given the dynamical system

dx1(t)/dt = x4(t),
dx2(t)/dt = x5(t),
dx3(t)/dt = x6(t),
dx4(t)/dt = (u1(t) + u2(t)) cos x3(t),
dx5(t)/dt = (u1(t) + u2(t)) sin x3(t),
dx6(t)/dt = α(u1(t) − u2(t)),
with initial and terminal conditions given, respectively, by

x(0) = (x1(0), x2(0), x3(0), x4(0), x5(0), x6(0))⊤ = (−10, −10, π/2, 0, 0, 0)⊤

and

x(T) = (x1(T), x2(T), x3(T), x4(T), x5(T), x6(T))⊤ = (0, 0, 0, 0, 0, 0)⊤,

find a control u = (u1 , u2 ) ∈ U such that the following cost functional


g0(u) = ∫₀ᵀ [(u1(t))² + (u2(t))²] dt

is minimized, where

U = {u = (u1, u2)⊤ : |u1(t)| ≤ 0.8, |u2(t)| ≤ 0.4, ∀t ∈ [0, T]}.

Then, we consider the cases of np = 20, 30 and 40, where np is the number of partition points. For each of these cases, the problem is solved using the MISER software [104]. The optimization method used within MISER is sequential quadratic programming (SQP). The optimal costs obtained with different numbers np of partition points are given in Table 8.8.1. From these results, we see the convergence of the suboptimal costs obtained using the control parametrization method.

Table 8.8.1: Optimal costs for Example 8.8.1 with different np

np    Optimal cost
20    6.28983430
30    6.20898660
40    6.18595308

Note that this problem was solved in [13], where it was first discretized using the Euler discretization scheme. Then, both the inexact restoration (IR) method [113, 115, 182] and the Ipopt optimization software [277] were used to solve the discretized problem. The optimal cost obtained with N = 1500 is 6.154193, where N denotes the number of partition points used in the Euler discretization scheme. Comparing this cost value with that obtained for the case of np = 40 using the control parametrization method, we see that the costs are close. The approximate optimal controls and approximate optimal state trajectories for the case of np = 40 obtained using the control parametrization method are shown in Figure 8.8.1. Their trends are similar to those obtained in [13].
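Since MISER itself is not reproduced here, the following is a hedged re-implementation of the same control parametrization scheme using scipy's SLSQP. The value α = 0.2 and the horizon T = 12 are assumptions suggested by the plotted time axis, not values stated in the text, and the knot count is kept small for illustration.

    # Control parametrization sketch for the FFR problem (assumed alpha, T).
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    alpha, T, n = 0.2, 12.0, 20

    def rhs(t, x, u1, u2):
        s = u1(t) + u2(t)
        return np.array([x[3], x[4], x[5],
                         s * np.cos(x[2]), s * np.sin(x[2]),
                         alpha * (u1(t) - u2(t))])

    def simulate(sigma):
        s1, s2 = sigma[:n], sigma[n:]
        u1 = lambda t: s1[min(int(t / T * n), n - 1)]
        u2 = lambda t: s2[min(int(t / T * n), n - 1)]
        x0 = np.array([-10.0, -10.0, np.pi / 2, 0.0, 0.0, 0.0])
        return solve_ivp(rhs, (0.0, T), x0, args=(u1, u2), rtol=1e-8)

    def cost(sigma):
        # L2 cost of piecewise constants on a uniform partition
        return (T / n) * float(np.sum(sigma ** 2))

    terminal = {'type': 'eq', 'fun': lambda sigma: simulate(sigma).y[:, -1]}
    bounds = [(-0.8, 0.8)] * n + [(-0.4, 0.4)] * n
    res = minimize(cost, np.zeros(2 * n), bounds=bounds,
                   constraints=[terminal], method='SLSQP')
    print(res.fun)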
Example 8.8.2 Consider a tubular chemical reactor of length L and plug
flow capacity v. We wish to carry out the following set of parallel reactions
A → B with rate constant k1, and A → C with rate constant k2,

where both reactions are irreversible. Assuming that the reactions are of first
order and the velocity constants are given by

ki = Ai exp(−Ei /RT ), i = 1, 2,

where Ai, Ei, i = 1, 2, and R are fixed constants, the material balance equations along the tube 0 ≤ z ≤ L are

v dA/dz = −(k1 + k2)A, A(0) = Af,
v dB/dz = k1 A, B(0) = 0,
where Af is also a given constant. Due to physical limitations, we need to
specify practical bounds on the temperature:

0 ≤ T (z) ≤ T̄ , 0 ≤ z ≤ L.
Fig. 8.8.1: Optimal control (u1, u2) and optimal state trajectories (x1–x6) versus t for Example 8.8.1 with np = 40

To non-dimensionalize the problem, define x1 = A/Af, x2 = B/Af, u = k1L/v, y = z/L, p = E2/E1, β = LA2/[v((L/v)A1)^p], ū = k̄1L/v and k̄1 = A1 exp(−E1/RT̄). It can be shown that the transformed problem is to minimize
g0 = −x2(1)
subject to the transformed state equations

dx1(y)/dy = −(u(y) + β(u(y))^p) x1(y),
dx2(y)/dy = u(y) x1(y),
and
0 ≤ u(y) ≤ ū, 0 ≤ y < 1.

Fig. 8.8.2: Computed optimal solution (states x1, x2 and control u) for Example 8.8.2 with 100 subintervals

In practice, it is common to recycle a fraction γ of the product stream and mix it with fresh feed. This means the initial and final values of x1 and x2 are related by

x1(0) = (1 − γ) + γ x1(1),
x2(0) = γ x2(1).

We can deal with this aspect of the problem by putting x1 (1) = ζ1 and
x2 (1) = ζ2 where ζ1 and ζ2 are system parameters, both of which are re-
stricted to [0, 1]. Hence, we have two equality constraints as follows:

g1 = x1 (1) − ζ1 = 0,
g2 = x2 (1) − ζ2 = 0.
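A small sketch of how the recycle conditions enter the optimization, using the numerical values β = 0.5, p = 2, ū = 6 and γ = 0.1 quoted below; the knot count n = 20 and the scipy-based SLSQP solver are assumptions made here for brevity (the text uses MISER with np = 100).

    # Reactor with recycle: zeta1, zeta2 fix x(0); x(1) = (zeta1, zeta2) is enforced.
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    beta, p_exp, ubar, gamma, n = 0.5, 2.0, 6.0, 0.1, 20

    def x_at_1(z):
        zeta, sigma = z[:2], z[2:]
        u = lambda y: sigma[min(int(y * n), n - 1)]
        rhs = lambda y, x: np.array([-(u(y) + beta * u(y) ** p_exp) * x[0],
                                     u(y) * x[0]])
        x0 = [(1 - gamma) + gamma * zeta[0], gamma * zeta[1]]
        return solve_ivp(rhs, (0.0, 1.0), x0, rtol=1e-9).y[:, -1]

    cost = lambda z: -x_at_1(z)[1]                        # g0 = -x2(1)
    recycle = {'type': 'eq', 'fun': lambda z: x_at_1(z) - z[:2]}
    bounds = [(0.0, 1.0)] * 2 + [(0.0, ubar)] * n
    res = minimize(cost, np.concatenate(([0.5, 0.5], np.full(n, 1.0))),
                   bounds=bounds, constraints=[recycle], method='SLSQP')
    print(-res.fun, res.x[:2])                            # optimal x2(1) and (zeta1, zeta2)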

We solve this problem numerically for β = 0.5, p = 2, ū = 6 and γ = 0.1. We assume that the control is piecewise constant and choose the partition for u to contain np = 100 equally spaced intervals. The solution yields an optimal objective functional value of −0.57581 with ζ1 = 0.03066 and ζ2 = 0.57581. The corresponding optimal control and optimal state trajectories are shown in Figure 8.8.2.

Example 8.8.3 The design of optimal feedback controllers was considered for a general class of nonlinear systems in [252]. While the actual feedback control problem is defined over an infinite time horizon, an approximate problem is constructed over a finite time horizon as follows. Minimize

g0 = (1/2) ∫₀¹⁵ [(x1(t))² + 0.1(x2(t))² + 0.2(u(t))²] dt

subject to

dx1(t)/dt = x2(t), x1(0) = −5,
dx2(t)/dt = −x1(t) + (1.4 − 0.14(x2(t))²) x2(t) + u(t), x2(0) = −5.
Instead of an open-loop optimal control, the aim is to construct a state-dependent suboptimal feedback control of the form

u = ζ1 x1 + ζ2 x2 + ζ3 x1² + ζ4 x2².

Note that the quadratic terms are included in the hope of addressing the nonlinearities in the state differential equations. Once the form of u has been substituted into the statement of the problem, we end up with a pure optimal parameter selection problem involving 4 system parameters. We choose lower and upper bounds of −10 and 10 for each of these 4 system parameters. The problem is solved using the MISER software [104]. The suboptimal feedback control obtained is

u = −2.9862 x1 − 2.3228 x2 − 0.2808 (x1)² + 0.0926 (x2)².
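As a hedged illustration of the resulting parameter selection problem, the sketch below substitutes the feedback form into the dynamics, accumulates the integrand in an extra state, and optimizes the four parameters with scipy; it is not the MISER computation reported above, so the optimizer and its output are assumptions.

    # Optimal parameter selection for the quadratic feedback law.
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    def rhs(t, y, z):
        x1, x2 = y[0], y[1]
        u = z[0] * x1 + z[1] * x2 + z[2] * x1 ** 2 + z[3] * x2 ** 2
        running = 0.5 * (x1 ** 2 + 0.1 * x2 ** 2 + 0.2 * u ** 2)
        return np.array([x2, -x1 + (1.4 - 0.14 * x2 ** 2) * x2 + u, running])

    def cost(z):
        # the third state accumulates the integrand, so y3(15) is the cost
        sol = solve_ivp(rhs, (0.0, 15.0), [-5.0, -5.0, 0.0], args=(z,), rtol=1e-8)
        return sol.y[2, -1]

    res = minimize(cost, np.zeros(4), bounds=[(-10.0, 10.0)] * 4)
    print(res.x)   # compare with the coefficients reported above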

8.9 Control Parametrization Time Scaling Transform

In spite of the flexibility and efficiency of the control parametrization approach, there are several numerical difficulties associated with it. Consider the case in which the optimal control that we are seeking is a piecewise continuous function. Then, it has a finite number of discontinuity points. These discontinuity points are referred to as switching times. Clearly, the number, as well as the locations, of these switching times are not known in advance. The accuracy of the classical control parametrization method thus depends greatly on the chosen knot distribution. The ideal knot distribution would be to have a knot placed exactly at the location of each switching time. However, before the optimal control problem is solved, we have no idea what the optimal control looks like, and hence no insight into how the switching times are distributed. A usual practice is therefore to choose a set of dense and evenly distributed knots in the hope that there will be a knot placed near each switching time. Hence, the number of parameters in
the approximate optimal parameter selection problem would be very large if
the approximate optimal control obtained is to be accurate. However, as the
number of parameters increases, the optimization process quickly becomes
much more expensive in terms of the computational time required. Intu-
itively, the control parametrization method would be more effective if the
switching times could also be treated as decision variables to be optimized
just like the control parameters. This could largely reduce the overall num-
ber of parameters used. However, there are a number of numerical difficulties
associated with such a strategy as pointed out in the following remark.
Remark 8.9.1 The gradient formulae of the cost functional and constraint
functionals with respect to these switching times may not exist when two or
more switching times coalesce (see Section 7.4 for details). Another difficulty
is that the differential equations governing the dynamics of the problem would
now be only piecewise continuous with the points of discontinuity (that is, the
switching times) varying from one iteration to the next in the optimization
process. Furthermore, the number of the decision variables would change when
two or more switching times coalesce. Thus, the task of integrating the differ-
ential equations accurately can be very involved. For these reasons, the use of the gradient formulae presented in [89] and Theorem 7.4.3 is not as popular as those based on the time scaling transformation to be presented in this section.

In this section, the time scaling transform introduced in Section 7.4.2 is used to enhance the classical control parametrization technique. This time scaling transform was earlier called the control parametrization enhancing transform (CPET). See, for example, [125, 126] and [215].
Consider the process described by the following system of nonlinear dif-
ferential equations on the fixed time interval (0, T ]:

dx(t)/dt = f(t, x(t), u(t)), (8.9.1a)
where x = [x1 , . . . , xn ] ∈ Rn , u = [u1 , . . . , ur ] ∈ Rr are, respectively, the
state and control vectors; and f = [f1 , . . . , fn ] ∈ Rn . The initial condition
for the system of differential equations (8.9.1a) is:

x(0) = x0 , (8.9.1b)
where x0 is a given vector in Rn. Let U1, U2 and U be as defined by (8.2.2a), (8.2.2b) and (8.2.3), respectively.
Definition 8.9.1 A Borel measurable function u = [u1 , . . . , ur ] mapping
from [0, T ] into Rr is said to be an admissible control if u(t) ∈ U for almost
all t ∈ [0, T ]. Let U be the class of all such admissible controls.

For each u ∈ U , let x(· | u) be the corresponding vector-valued function


that is absolutely continuous on (0, T ] and satisfies the differential equa-
tions (8.9.1a) almost everywhere on (0, T ] and the initial condition (8.9.1b).
This function is called the solution of system (8.9.1) corresponding to u ∈ U .
We may now state the canonical optimal control problem as follows:
Problem (P2 ). Given system (8.9.1), find a control u ∈ U such that the cost
functional
 T
g0 (u) = Φ0 (x(T | u)) + L0 (t, x(t | u), u(t))dt (8.9.2)
0

is minimized over U subject to the equality constraints


 T
gi (u) = Φi (x(T | u))+ Li (t, x(t | u), u(t))dt = 0, i = 1, . . . , Ne , (8.9.3)
0

and the inequality constraints


 T
gi (u) = Φi (x(T | u))+ Li (t, x(t | u), u(t))dt ≤ 0, i = Ne + 1, . . . , N, (8.9.4)
0

where Φi , i = 0, 1, . . . , N , and Li , i = 0, 1, . . . , N , are given real-valued


functions.
For the given functions f , Φi , i = 0, 1, . . . , N , and Li , i = 0, 1, . . . , N , we
assume that the following conditions are satisfied.
Assumption 8.9.1 The relevant conditions of Assumption 8.2.1.

Assumption 8.9.2 The relevant conditions of Assumption 8.2.2.

Assumption 8.9.3 The relevant conditions of Assumption 8.2.3.

8.9.1 Control Parametrization Time Scaling Transform

For each integer p ≥ 1, let the planning horizon [0, T ] be partitioned into np
subintervals with np + 1 partition points denoted by

τ0p , τ1p , . . . , τnpp ,



where τ^p_0 = 0, τ^p_{np} = T and τ^p_k, k = 1, . . . , np, are decision variables that are subject to the following conditions:

τ^p_{k−1} ≤ τ^p_k, k = 1, 2, . . . , np. (8.9.5)

Let
S^p = {τ^p_0, τ^p_1, . . . , τ^p_{np}}. (8.9.6)

For all p > 1, the partition points in S p are chosen such that

S p ⊂ S p+1 . (8.9.7)

The control is now approximated in the form of a piecewise constant function as

u^p(t) = Σ_{k=1}^{np} σ^{p,k} χ_{[τ^p_{k−1}, τ^p_k)}(t), (8.9.8)
where χ_{[τ^p_{k−1}, τ^p_k)}(·) denotes the indicator function of the interval [τ^p_{k−1}, τ^p_k) defined by (8.3.1b), and

σ^{p,k} = [σ^{p,k}_1, . . . , σ^{p,k}_r]⊤ (8.9.9a)

with

σ^{p,k} ∈ U, k = 1, . . . , np, (8.9.9b)
and the set U defined by (8.2.3). Let
σ^p = [(σ^{p,1})⊤, . . . , (σ^{p,np})⊤]⊤ (8.9.10)

and

τ^p = [τ^p_1, τ^p_2, . . . , τ^p_{np−1}]⊤, (8.9.11)

where τip , i = 1, 2, . . . , np − 1, are decision variables satisfying

0 = τ0p ≤ τ1p ≤ τ2p ≤ · · · ≤ τnpp = T. (8.9.12)
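A small helper, assuming only numpy, realising (8.9.8): evaluating the piecewise constant control at a time t amounts to locating the interval [τ_{k−1}, τ_k) containing t; note that coalesced knots (empty intervals) are handled naturally.

    # Evaluate u^p(t) = sum_k sigma[k] * chi_[tau_{k-1}, tau_k)(t).
    import numpy as np

    def piecewise_constant(t, sigma, tau):
        # tau is the full knot vector 0 = tau_0 <= ... <= tau_np = T
        k = np.searchsorted(tau, t, side='right') - 1
        k = np.clip(k, 0, len(sigma) - 1)   # map t = T into the last interval
        return sigma[k]

    sigma = np.array([1.0, -0.5, 2.0])
    tau = np.array([0.0, 0.4, 0.4, 1.0])    # coalesced knots are allowed
    print(piecewise_constant(np.array([0.1, 0.4, 1.0]), sigma, tau))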

Definition 8.9.2 Let Γ^p be the set of all switching vectors τ^p defined by (8.9.11) such that (8.9.12) is satisfied.

Definition 8.9.3 Any piecewise constant control of the form (8.9.8) satisfying conditions (8.9.9b) and (8.9.12) is called an admissible piecewise constant control. Let Ū^p be the set of all such admissible piecewise constant controls.

Definition 8.9.4 Let Ξ^p be the set of all control parameter vectors σ^p = [(σ^{p,1})⊤, . . . , (σ^{p,np})⊤]⊤, where σ^{p,k}, k = 1, . . . , np, are as defined by (8.9.9).
Clearly, for each control u^p ∈ Ū^p, there exists a unique combined control parameter vector and switching vector (σ^p, τ^p) ∈ Ξ^p × Γ^p such that the relation (8.9.8) is satisfied. Conversely, there also exists a unique control u^p ∈ Ū^p corresponding to each combined control parameter vector and switching vector (σ^p, τ^p) ∈ Ξ^p × Γ^p.
Restricting the controls to Ū^p, system (8.9.1) takes the form

dx(t)/dt = f̃(t, x(t), σ^p, τ^p), (8.9.13a)
x(0) = x0, (8.9.13b)

where

f̃(t, x(t), σ^p, τ^p) = f(t, x(t), Σ_{k=1}^{np} σ^{p,k} χ_{[τ^p_{k−1}, τ^p_k)}(t)). (8.9.13c)

Let x(· | σ^p, τ^p) be the solution of system (8.9.13) corresponding to the combined control parameter vector and switching vector (σ^p, τ^p) ∈ Ξ^p × Γ^p.
Similarly, by restricting u ∈ Ū^p, the constraints (8.9.3) and (8.9.4) are reduced, respectively, to

G̃i(σ^p, τ^p) = Φi(x(T | σ^p, τ^p)) + ∫₀ᵀ L̃i(t, x(t | σ^p, τ^p), σ^p, τ^p) dt = 0, i = 1, . . . , Ne, (8.9.14a)

and

G̃i(σ^p, τ^p) = Φi(x(T | σ^p, τ^p)) + ∫₀ᵀ L̃i(t, x(t | σ^p, τ^p), σ^p, τ^p) dt ≤ 0, i = Ne + 1, . . . , N, (8.9.14b)

where, for i = 1, . . . , N,

L̃i(t, x(t | σ^p, τ^p), σ^p, τ^p) = Li(t, x(t), Σ_{k=1}^{np} σ^{p,k} χ_{[τ^p_{k−1}, τ^p_k)}(t)). (8.9.15)

Definition 8.9.5 Let Ω̄^p be the subset of Ξ^p × Γ^p such that the constraints (8.9.14) are satisfied. Furthermore, let F̄^p be the corresponding subset of Ū^p uniquely defined by elements from Ω̄^p via (8.9.8).

We may now specify an approximate problem corresponding to Problem (P2).
Problem (P2(p)). Subject to the dynamical system (8.9.13), find a combined control parameter vector and switching vector (σ^p, τ^p) ∈ Ω̄^p such that the cost functional

G̃0(σ^p, τ^p) = Φ0(x(T | σ^p, τ^p)) + ∫₀ᵀ L̃0(t, x(t | σ^p, τ^p), σ^p, τ^p) dt (8.9.16)

is minimized over Ω̄^p, where L̃0 is defined by (8.9.15) for i = 0.
Note that for each p, Problem (P2 (p)) is an optimal parameter selection
problem. The gradient formulae of the cost functional and the constraint func-
tionals with respect to the switching vector can be obtained readily as those
given in either (7.4.15) or (7.4.18). However, there are deficiencies associated
with the algorithms developed based on these gradient formulae as mentioned
in Remarks 7.4.3 and 8.9.1. Thus, there is a need for a more effective approach
for dealing with the problem (P2 (p)). This is the main motivation behind the
control parametrization time scaling transform introduced below.
Consider the new time scale s ∈ [0, 1]. We wish to construct a transformation from t ∈ [0, T] to s ∈ [0, 1] that maps the variable knots τ^p_1, τ^p_2, . . . , τ^p_{np−1} to the fixed knots

ξ^p_k = k/np, k = 1, . . . , np − 1. (8.9.17a)

Clearly, these fixed knots satisfy

0 = ξ^p_0 < ξ^p_1 < ξ^p_2 < · · · < ξ^p_{np−1} < ξ^p_{np} = 1. (8.9.17b)

The required transformation from t ∈ [0, T] to s ∈ [0, 1] can be defined by the following differential equation:

dt(s)/ds = v^p(s) (8.9.18a)

with the initial condition

t(0) = 0. (8.9.18b)

Definition 8.9.6 A scalar function v^p(s) ≥ 0 for all s ∈ [0, 1] is called a time scaling control if it is a piecewise constant non-negative function with possible discontinuities at the fixed knots ξ^p_0, ξ^p_1, . . . , ξ^p_{np}, that is,

v^p(s) = Σ_{k=1}^{np} θ^p_k χ_{[ξ^p_{k−1}, ξ^p_k)}(s), (8.9.19)

where θ^p_k ≥ 0, k = 1, . . . , np, are decision variables, and χ_{[ξ^p_{k−1}, ξ^p_k)}(·) is the indicator function on the interval [ξ^p_{k−1}, ξ^p_k) defined by (8.3.1b). Clearly, v^p(s) depends on the choice of θ^p = [θ^p_1, . . . , θ^p_{np}]⊤.
Definition 8.9.7 Let Θ^p be the set containing all those θ^p = [θ^p_1, . . . , θ^p_{np}]⊤ with θ^p_i ≥ 0, i = 1, . . . , np. Furthermore, let V^p be the set containing all the corresponding time scaling controls obtained from elements of Θ^p via (8.9.19).

Clearly, each θ^p ∈ Θ^p uniquely defines a v^p ∈ V^p, and vice versa. By (8.9.19), it is clear from (8.9.18) that

t^p(s) = ∫₀ˢ v^p(τ) dτ = Σ_{j=1}^{k−1} θ^p_j (ξ^p_j − ξ^p_{j−1}) + θ^p_k (s − ξ^p_{k−1}), s ∈ [ξ^p_{k−1}, ξ^p_k]. (8.9.20)

In particular,

t^p(1) = ∫₀¹ v^p(τ) dτ = Σ_{k=1}^{np} θ^p_k (ξ^p_k − ξ^p_{k−1}). (8.9.21)

Define

ω^p(s) = u^p(t(s)), x̂(s) = [(x(s))⊤, t(s)]⊤, (8.9.22)

where, by abusing the notation, we write

x(s) = x(t(s)).

Clearly, the piecewise constant control ω^p can be written as

ω^p(s) = Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s), s ∈ [0, 1], (8.9.23)

where ξ^p_k, k = 0, 1, . . . , np, are fixed knots chosen according to (8.9.17), χ_{[ξ^p_{k−1}, ξ^p_k)}(·) denotes the indicator function of the interval [ξ^p_{k−1}, ξ^p_k) defined by (8.3.1b), and σ^p, as defined for u^p in (8.9.8), is given by (8.9.10).

Definition 8.9.8 Let Û^p be the set containing all piecewise constant controls given by (8.9.23) with σ^p ∈ Ξ^p.

Clearly, each (σ^p, θ^p) ∈ Ξ^p × Θ^p defines uniquely a (ω^p, v^p) ∈ Û^p × V^p, and vice versa.
Definition 8.9.9 Let Ω̂^p be the set which consists of all those elements from Ξ^p × Θ^p such that the following constraints are satisfied:

Ĝi(σ^p, θ^p) = Φi(x(1 | σ^p, θ^p)) + ∫₀¹ L̂i(x̂(s | σ^p, θ^p), σ^p, θ^p) ds = 0, i = 1, . . . , Ne, (8.9.24a)

Ĝi(σ^p, θ^p) = Φi(x(1 | σ^p, θ^p)) + ∫₀¹ L̂i(x̂(s | σ^p, θ^p), σ^p, θ^p) ds ≤ 0, i = Ne + 1, . . . , N, (8.9.24b)

Σ_{k=1}^{np} θ^p_k (ξ^p_k − ξ^p_{k−1}) = T, (8.9.24c)

where

L̂i(x̂(s), σ^p, θ^p) = v^p(s) Li(t(s), x(s), Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s)). (8.9.25)

Definition 8.9.10 Let A^p be the subset of Û^p × V^p uniquely defined by elements from Ω̂^p.

Clearly, each (ω^p, v^p) ∈ A^p can be written as

ω^p(s) = Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s), s ∈ [0, 1], (8.9.26a)
v^p(s) = Σ_{i=1}^{np} θ^p_i χ_{[ξ^p_{i−1}, ξ^p_i)}(s), s ∈ [0, 1], (8.9.26b)

with (σ^p, θ^p) ∈ Ω̂^p.
The equivalent transformed optimal parameter selection problem may now be stated as follows.
Problem (P3(p)). Subject to the system of differential equations

dx̂(s)/ds = f̂(x̂(s), σ^p, θ^p), (8.9.27a)

where

f̂(x̂(s), σ^p, θ^p) = v^p(s) f(t(s), x(s), Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s)) (8.9.27b)

with initial condition

x̂(0) = [(x0)⊤, 0]⊤, (8.9.27c)

find a combined control parameter vector and switching vector (σ^p, θ^p) ∈ Ω̂^p such that the cost functional

Ĝ0(σ^p, θ^p) = Φ0(x̂(1 | σ^p, θ^p)) + ∫₀¹ L̂0(x̂(s | σ^p, θ^p), σ^p, θ^p) ds (8.9.28)

is minimized over Ω̂^p.
Remark 8.9.2 Note that in the transformed problem (P3 (p)), only the knots
contribute to the discontinuities of the state differential equation. Thus, all
locations of the discontinuities of the state differential equation are known
and fixed during the optimization process. These locations will not change
from one iteration to the next during the optimization process. Even when
two or more of the original switching times coalesce, the number of these
locations remains unchanged in the transformed problem. Furthermore, the
gradient formulae of the cost function and constraint functions with respect
to the original switching times in the new transformed problem are provided
by the usual gradient formulae for the classical optimal parameter selection
problem as given in Section 7.2.
The basic idea behind the control parametrization time scaling transform is to include the switching times as parameters to be optimized while, at the same time, avoiding the numerical difficulties mentioned in Remark 8.9.1. The time scaling control captures the discontinuities of the optimal control if the number of knots in the partition of the new time horizon is greater than or equal to the number of discontinuities of the optimal control. Since the time scaling control parameters θ^p_k, k = 1, . . . , np, are allowed to vary, the control parametrization time scaling transform technique gives rise to a larger search space and hence produces a better, or at least equal, approximate optimal cost. Clearly, if the optimal control is a piecewise constant function with discontinuities at t̄1, . . . , t̄M, then, by solving the transformed problems with the number of knots greater than or equal to M, and by using (8.9.23), we obtain the exact optimal control. For the general case, the convergence results are much harder to establish.
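The following sketch, assuming only numpy, implements the map (8.9.18)–(8.9.20): t(s) is piecewise linear in s with the fixed knots k/np, and the original switching times are recovered as τ_k = t(k/np) (cf. (8.9.33) below). A zero duration θ_k merges two switching times without changing the number of decision variables, which is exactly why coalescing switching times causes no difficulty here.

    # Time scaling map: t(s) from the durations theta_k on fixed knots k/np.
    import numpy as np

    def t_of_s(s, theta):
        np_ = len(theta)
        xi = np.linspace(0.0, 1.0, np_ + 1)                    # fixed knots k/np
        cum = np.concatenate(([0.0], np.cumsum(theta) / np_))  # t at the knots
        k = np.clip(np.searchsorted(xi, s, side='right') - 1, 0, np_ - 1)
        return cum[k] + theta[k] * (s - xi[k])

    theta = np.array([2.0, 0.0, 4.0, 6.0])                 # theta_2 = 0 merges two knots
    print(t_of_s(np.array([0.25, 0.5, 0.75, 1.0]), theta)) # the switching times tau_k
    print(t_of_s(1.0, theta) == np.sum(theta) / len(theta))# t(1) = sum theta_k/np = T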

8.9.2 Convergence Analysis

We now provide convergence results similar to those presented in Section 8.8. As with Definition 8.5.1, we need the following definitions.
Definition 8.9.11 A combined vector (σ^p, θ^p) ∈ Ξ^p × Θ^p is said to be ε-tolerated feasible if it satisfies the following ε-tolerated constraints:

−ε ≤ Ĝi(σ^p, θ^p) ≤ ε, i = 1, . . . , Ne, (8.9.29a)
Ĝi(σ^p, θ^p) ≤ ε, i = Ne + 1, . . . , N. (8.9.29b)

Definition 8.9.12 Let Ω̃^{p,ε} be the subset of Ξ^p × Θ^p such that the ε-tolerated constraints (8.9.29) are satisfied. Furthermore, let A^{p,ε} be the corresponding subset of Û^p × V^p defined uniquely by elements from Ω̃^{p,ε}. Clearly, Ω̂^p ⊂ Ω̃^{p,ε} (and hence A^p ⊂ A^{p,ε}) for any ε > 0.

We now consider the following ε-tolerated version of the approximate problem.
Problem (P3,ε(p)). Find a combined vector (σ^p, θ^p) ∈ Ω̃^{p,ε} such that the cost functional (8.9.28) is minimized over Ω̃^{p,ε}.

Since Ω̂^p ⊂ Ω̃^{p,ε} for any ε > 0, it follows that

Ĝ0(σ^{p,ε,∗}, θ^{p,ε,∗}) ≤ Ĝ0(σ^{p,∗}, θ^{p,∗}) (8.9.30)

for any ε > 0, where (σ^{p,ε,∗}, θ^{p,ε,∗}) and (σ^{p,∗}, θ^{p,∗}) are optimal vectors of Problem (P3,ε(p)) and Problem (P3(p)), respectively.
Let the following additional condition be satisfied.

Assumption 8.9.4 There exists an integer p0 such that

lim_{ε→0} Ĝ0(σ^{p,ε,∗}, θ^{p,ε,∗}) = Ĝ0(σ^{p,∗}, θ^{p,∗}) (8.9.31)

uniformly with respect to p ≥ p0.


Remark 8.9.3 Note that each (σ^p, θ^p) ∈ Ω̂^p uniquely defines a (ω^p, v^p) ∈ A^p via (8.9.19) and (8.9.23), and vice versa. We further note that for each (ω^p, v^p) ∈ A^p there exists a unique (u^p, τ^p), defined on the original time horizon [0, T], such that

u^p(t) = Σ_{k=1}^{np} σ^{p,k} χ_{[τ^p_{k−1}, τ^p_k)}(t), (8.9.32)

where σ^p = [(σ^{p,1})⊤, . . . , (σ^{p,np})⊤]⊤ is the same as for ω^p defined by (8.9.23), and τ^p = [τ^p_1, . . . , τ^p_{np}]⊤ is determined uniquely by θ^p via evaluating (8.9.18) at s = k/np, k = 1, . . . , np.
The functions ω^{p,∗} and v^{p,∗} are piecewise constant functions with possible discontinuities at s = k/np, k = 1, 2, . . . , np − 1. We choose np such that np+1 > np for p ≥ 1.
In practice, we first choose an integer n1 corresponding to p = 1. Then, by (8.9.18) with v(s) = v^{1,∗}(s), we obtain the corresponding t as a function of s. Thus, by (8.9.23), (8.9.22), (8.9.18) and (8.9.17a), we obtain ω^{1,∗}(s) and hence u^{1,∗}(t), where the switching times τ^1_k, k = 1, . . . , n1, are given by

τ^1_k = t(k/n1) = ∫₀^{k/n1} v^{1,∗}(s) ds, k = 1, 2, . . . , n1. (8.9.33)

Define
S^1 = {k/n1 : k = 1, 2, . . . , n1 − 1}. (8.9.34)
Then, we choose an integer n2 such that n2 > n1, and let
S^2 = {k/n2 : k = 1, 2, . . . , n2 − 1}. (8.9.35)
This process is continued in such a way that the following condition is satisfied.
Assumption 8.9.5 S p+1 ⊃ S p and limp→∞ S p is dense in [0, 1].
The procedure for solving Problem (P2 ) may be stated as follows. For each
p ≥ 1, we use the control parametrization time scaling transform technique
to obtain Problem (P3 (p)). In what follows, we present a computational pro-
cedure to solve Problem (P3 (p)), giving an approximate optimal solution of
Problem (P2 ).
Algorithm 8.9.1
Step 1. Solve Problem (P3(p)) as a standard optimal parameter selection problem by using a computational procedure similar to that described in Section 8.6. Let the optimal control vector obtained be denoted by (σ^{p,∗}, θ^{p,∗}). Then, by Remark 8.9.3, we obtain the corresponding piecewise constant control (ω^{p,∗}, v^{p,∗}).
Step 2. If np ≥ M, where M is a pre-specified positive constant, go to Step 3. Otherwise go to Step 1 with np increased to np+1.
Step 3. Stop. Construct τ^{p,∗} = [τ^{p,∗}_1, . . . , τ^{p,∗}_{np}]⊤ from θ^{p,∗}. Then, obtain

u^{p,∗}(t) = Σ_{k=1}^{np} σ^{p,k,∗} χ_{[τ^{p,∗}_{k−1}, τ^{p,∗}_k)}(t), (8.9.36)

where σ^{p,∗} = [(σ^{p,1,∗})⊤, . . . , (σ^{p,np,∗})⊤]⊤. The piecewise constant control u^{p,∗} obtained is an approximate optimal solution of Problem (P2).
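Schematically, the outer loop of Algorithm 8.9.1 looks as follows; solve_P3 is a placeholder for the inner optimal parameter selection solve (e.g. an SQP routine) and is an assumption, not a library call.

    # Outer refinement loop of Algorithm 8.9.1 (schematic).
    import numpy as np

    def algorithm_8_9_1(solve_P3, np_0=4, M=32):
        np_k = np_0
        while True:
            sigma, theta = solve_P3(np_k)   # Step 1: solve Problem (P3(p))
            if np_k >= M:                   # Step 2: stop once the partition is fine enough
                break
            np_k *= 2                       # any refinement with n_{p+1} > n_p will do
        # Step 3: recover the switching times tau_k = sum_{j<=k} theta_j / np
        tau = np.cumsum(theta) / np_k
        return sigma, tau

    # dummy inner solve (assumed): uniform speed theta_k = T gives the identity time scale
    dummy = lambda n: (np.zeros(n), np.full(n, 12.0))
    print(algorithm_8_9_1(dummy)[1][-1])    # the last switching time equals T = 12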
We are now in a position to present the convergence results in the next
two theorems.
Theorem 8.9.1 Let (σ^{p,∗}, θ^{p,∗}) be an optimal parameter vector of Problem (P3(p)), and let (u^{p,∗}, τ^{p,∗}) be the corresponding piecewise constant optimal control and switching vector of Problem (P2(p)) such that

u^{p,∗}(t) = Σ_{k=1}^{np} σ^{p,k,∗} χ_{[τ^{p,∗}_{k−1}, τ^{p,∗}_k)}(t), (8.9.37)

where σ^{p,∗} = [(σ^{p,1,∗})⊤, . . . , (σ^{p,np,∗})⊤]⊤ and τ^{p,∗} = [τ^{p,∗}_1, . . . , τ^{p,∗}_{np}]⊤. Suppose that Problem (P2) has an optimal control u∗. Then

lim_{p→∞} g0(u^{p,∗}) = g0(u∗). (8.9.38)

Proof. Let (ω^{p,∗}, v^{p,∗}) be determined uniquely by (σ^{p,∗}, θ^{p,∗}) via (8.9.23) and (8.9.19). More precisely, solve (8.9.18) to obtain t∗(s), s ∈ [0, 1]. Then, by evaluating t∗(s) at s = k/np, k = 0, 1, . . . , np, we obtain τ^{p,∗} = [τ^{p,∗}_1, . . . , τ^{p,∗}_{np−1}]⊤. Now, by (8.9.22), u^{p,∗} can be written as

u^{p,∗}(t) = Σ_{k=1}^{np} σ^{p,k,∗} χ_{[τ^{p,∗}_{k−1}, τ^{p,∗}_k)}(t), (8.9.39)

where σ^{p,∗} = [(σ^{p,1,∗})⊤, . . . , (σ^{p,np,∗})⊤]⊤, τ^{p,∗}_0 = 0, τ^{p,∗}_{np} = T and τ^{p,∗} = [τ^{p,∗}_1, . . . , τ^{p,∗}_{np−1}]⊤.
Consider Problem (P3(p)) but with the switching vector τ^p = [τ^p_1, . . . , τ^p_{np−1}]⊤ taken as fixed. Let this problem be referred to as Problem (P4(p)). Clearly, Problem (P4(p)) can be considered as the type obtained in Section 8.3 through the application of the control parametrization technique. Let ū^{p,∗} be an optimal control of Problem (P4(p)).
Note that the switching vector in Problem (P4(p)) is taken as fixed, while Problem (P3(p)) treats the switching vector as part of the decision vector to be optimized over. Thus, it is clear that

g0 (u∗ ) ≤ g0 (up,∗ ) ≤ g0 (ūp,∗ ) (8.9.40)

for all p ≥ 1. Now, by Theorem 8.5.1, it follows that for any δ > 0, there exists a positive integer p̄ such that

g0(u∗) ≤ g0(ū^{p,∗}) < g0(u∗) + δ (8.9.41)

for all p ≥ p̄. Combining (8.9.40) and (8.9.41), we obtain that, for all p ≥ p̄,

g0(u∗) ≤ g0(u^{p,∗}) ≤ g0(u∗) + δ. (8.9.42)

Taking the limit as p → ∞ and noting that δ > 0 is arbitrary, the conclusion of the theorem follows readily.

Theorem 8.9.2 Let (σ^{p,∗}, θ^{p,∗}) and (u^{p,∗}, τ^{p,∗}) be as defined in Theorem 8.9.1. Suppose that

lim_{p→∞} u^{p,∗}(t) = û(t) almost everywhere in [0, T]. (8.9.43)

Then, û is also an optimal control of Problem (P2).

Proof. Since u^{p,∗} → û almost everywhere in [0, T], it follows from Lemma 8.4.4 that

lim_{p→∞} g0(u^{p,∗}) = g0(û). (8.9.44)

Next, it is easy to verify from Remark 8.4.1, Lemma 8.4.3 and Assumptions 8.9.2 and 8.9.3 that û is also a feasible control of Problem (P2). On the other hand, it follows from Theorem 8.9.1 that

lim_{p→∞} g0(u^{p,∗}) = g0(u∗). (8.9.45)

Hence, the conclusion of the theorem follows easily from (8.9.44) and (8.9.45).

8.10 Examples

Example 8.10.1 Consider the free flying robot (FFR) problem as described in Example 8.8.1. We shall solve the problem again, now with the time scaling transform, for the cases of np = 20, 30 and 40, where np is the number of partition points. For each of these cases, the problem is solved using the MISER software [104]. The optimal costs obtained with different numbers np of partition points are given in Table 8.10.1.

Table 8.10.1: Approximate optimal costs for Example 8.10.1 with different np, solved using time scaling

np    Optimal cost with time scaling
20    6.22527918
30    6.19719836
40    6.18444730

From the results listed in Table 8.10.1, we see the convergence of the approximate optimal costs. The solutions obtained with the time scaling transform are better than those obtained without it (see Table 8.8.1). Figure 8.10.1 shows the plots of the approximate optimal controls and approximate optimal state trajectories obtained for the case of np = 40. Their trends are similar to those obtained in [13] for N = 1000, where N denotes the number of partition points used in the Euler discretization scheme. Furthermore, the cost of 6.18444730 obtained for the case of np = 40 is close to the cost of 6.154193 obtained in [13] for N = 1000.
Fig. 8.10.1: Optimal control (u1, u2) and optimal state trajectories (x1–x6) versus t for Example 8.10.1 with np = 40, solved using time scaling

Example 8.10.2 Consider the tubular chemical reactor problem described in Example 8.8.2. We solve the problem again using the control parametrization method, this time with the time scaling transformation. The number of partition points of the control function is np = 20. The problem is solved using the MISER software [104] with the time scaling transform applied. The successful termination of the optimization software indicates that the solution obtained satisfies the KKT conditions. The optimal cost obtained is −5.75817152 × 10⁻¹. An optimal cost of −5.75805641 × 10⁻¹ was obtained in Example 8.8.2 without the use of the time scaling transform, for which np = 100. Figure 8.10.2 shows the plots of the approximate optimal control and the optimal state trajectories.

Fig. 8.10.2: Optimal state trajectories (x1, x2) and control u for Example 8.10.2 with np = 20, solved using time scaling

Example 8.10.3 In Example 8.8.3, the control took the form of a feedback control. In this example, we consider the same optimal control problem, but the control is now an open-loop control subject to the following boundedness constraints:

u̲ ≤ u(t) ≤ ū, ∀t ∈ [0, 15],

where u̲ = −0.5 and ū = 0.5. The control is parametrized as a piecewise constant function with the time horizon [0, 15] subdivided into np subintervals. We solve the problem for the cases of np = 10 and 20. The respective costs obtained are 80.2451689 and 79.7608063. Figure 8.10.3 shows the plots of the approximate optimal control and the approximate optimal state trajectories.

Fig. 8.10.3: Optimal state trajectories (x1, x2) and optimal control u for Example 8.10.3 with np = 20, solved using time scaling

8.11 Exercises

8.11.1 With reference to Remark 8.6.5, we consider the constraint specified in Remark 8.2.1(iv). Show that
∂G(σ^p)/∂σ^p = 0
if σ^p is such that
max_{0≤t≤T} h(t, x(t|σ^p), σ^p) = 0.

8.11.2 Show that
∫₀ᵀ ∂H̃i/∂σ^{p,k}_j dt = ∫_{I^p_k} ∂Hi/∂u^p_j dt,
where H̃i and Hi are defined by (8.6.4) and (8.6.9a), respectively.

8.11.3 Prove Theorem 8.8.1.

8.11.4 Prove Theorem 8.8.2.

8.11.5 Consider the optimal control problem, where the cost functional
(8.2.4) is to be minimized subject to the dynamical system (8.2.1). Let U be
the class of admissible controls that consists of all C 1 functions from [0, T ] to
U , where U is as defined by (8.2.2b) with αi = −1, and βi = 1, i = 1, . . . , r.
Pose this problem in a form solvable by the usual control parametrization
technique. Derive the gradient formula for the cost functional with respect to
the control parameters and the resulting system parameters.

8.11.6 Consider a minimax optimal control problem, where the following cost functional
g0(u) = max_{0≤t≤T} C(t, x(t), u(t))
is to be minimized subject to the dynamical system (8.2.1). The class of admissible controls is as defined in Section 8.2. Following the idea of [134], we define
‖γ‖_p = (∫₀ᵀ |γ(t)|^p dt)^{1/p}
and
‖γ‖_∞ = ess sup_{0≤t≤T} |γ(t)|.
Then, it is known (cf. [FMT1]) that
‖γ‖_p ↑ ‖γ‖_∞ as p → ∞.

(i) Use this fact to construct a sequence of optimal control problems so


that each of them is solvable by the control parametrization technique.
(ii) Show that the optimal costs of these approximate problems converge
monotonically from below to the true optimal cost of the original prob-
lem.

8.11.7 Show that the result of Lemma 8.4.1 remains valid if {up }∞ p=1 is a
bounded sequence of controls in L∞ ([0, T ], Rr ) that converges to u a.e. in
[0, T ], as p → ∞.

8.11.8 Show that Lemmas 8.4.2–8.4.4 still hold if {up }∞


p=1 and u are as
defined in Lemma 8.4.1.

8.11.9 Can the control functions chosen in Definition 8.9.1 be just measur-
able, rather than Borel measurable functions?

8.11.10 Derive the gradient formula given by (8.6.5) and show that it is equivalent to (8.6.8).

8.11.11 Show that û in the proof of Theorem 8.9.2 is a feasible control of Problem (P2).
8.11.12 Consider the problem (P ) subject to additional terminal equality
constraints
qi (x(T | u)) = 0, i = 1, . . . , NE , (8.11.1)
where qi , i = 1, . . . , NE , are continuously differentiable functions. Let this
optimal control problem be referred to as the Problem (R). Use the control
parametrization time scaling transform technique to derive a computational
method for solving Problem (R). State all the essential assumptions and then
prove all the relevant convergence results.

8.11.13 Derive the gradient formulae for Gi , i = 0, 1, . . . , N , in Algo-


rithm 8.8.1 by using the variational approach (Hint: Please refer to Sec-
tions 7.2 and 7.3).
Chapter 9
Optimal Control Problems with State and Control Constraints

9.1 Introduction

In the real world, optimal control problems are often subject to constraints on the state and/or control. These constraints can be point constraints and/or continuous inequality constraints. Point constraints are expressed as functions of the states at the terminal point or at some intermediate interior points of the time horizon, and they can be handled without much difficulty. Continuous inequality constraints, however, are expressed as functions of the states and/or controls over the entire time horizon, and hence are much more difficult to handle. This chapter is devoted to devising computational methods for solving optimal control problems subject to point and continuous constraints. It is divided into two main sections. In Section 9.2, our focus is on optimal control problems subject to continuous state and/or control inequality constraints. Through the application of the control parametrization technique, an approximate optimal parameter selection problem with continuous state inequality constraints is obtained. Then, the constraint transcription method introduced in Section 4.3 is applied to construct a smooth approximate inequality canonical constraint for each continuous state inequality constraint. In Section 9.3, the exact penalty function method introduced in Section 4.4 is utilized to develop an effective computational method for solving the same class of optimal control problems. The main references for this chapter are [103, 104, 133, 134, 143, 145, 146, 168, 171, 249, 253, 254, 259].

9.2 Optimal Control with Continuous State Inequality Constraints

Consider the process described by the following system of nonlinear differential equations on the fixed time interval (0, T]:

dx(t)/dt = f(t, x(t), u(t)), (9.2.1a)
where x = [x1 , . . . , xn ] ∈ Rn , u = [u1 , . . . , ur ] ∈ Rr are, respectively, the
state and control vectors; and f = [f1 , . . . , fn ] ∈ Rn .
The initial condition for the system of differential equations (9.2.1a) is

x(0) = x0 , (9.2.1b)

where x0 is a given vector in Rn .


Let U1 , U2 and U be as defined in (8.2.2a) and (8.2.2b) in Section 8.2.
Definition 9.2.1 A piecewise continuous function u = [u1 , . . . , ur ] from
[0, T ] into Rr is said to be an admissible control if u(t) ∈ U for almost all
t ∈ [0, T ] and it is continuous from the right.

Let U be the class of all such admissible controls.


For each u ∈ U , let x(· | u) be the corresponding vector-valued function
which is absolutely continuous on (0, T ] and satisfies the differential equa-
tions (9.2.1a) almost everywhere on (0, T ] and the initial condition (9.2.1b).
This function is called the solution of system (9.2.1) corresponding to u ∈ U .
Inequality terminal state constraints and inequality continuous state and
control constraints are imposed, respectively, as follows.

Φi (x(T | u)) ≤ 0, i = 1, . . . , NT , (9.2.2)

where Φi , i = 1, . . . , NT , are real-valued functions defined on Rn , and

hi (t, x(t | u), u(t)) ≤ 0, a.e. in [0, T ], i = 1, . . . , NS , (9.2.3)

where hi , i = 1, . . . , NS , are real-valued functions defined on [0, T ] × Rn × Rr .


Let F be the set that consists of all those elements from U such that
the constraints (9.2.2) and (9.2.3) are satisfied. Elements from F are called
feasible controls and F is called the class of feasible controls.
We may now state the following optimal control problem.
Problem (P) Given the system (9.2.1), find a control u ∈ F such that the
cost functional
g0(u) = Φ0(x(T | u)) + ∫₀ᵀ L0(t, x(t | u), u(t)) dt, (9.2.4)

is minimized over F, where Φ0 and L0 are given real-valued functions.


We assume throughout that the following conditions are satisfied.
Assumption 9.2.1 f and L0 satisfy the relevant conditions appearing in
Assumptions 8.2.1 and 8.2.2.

Assumption 9.2.2 For each i = 0, 1, . . . , NT , Φi : Rn → R is continuously


differentiable.

Assumption 9.2.3 For each i = 1, . . . , NS, hi : [0, T] × Rn × Rr → R is continuously differentiable.

9.2.1 Time Scaling Transform

We shall apply the time scaling transform introduced in Section 7.4.2 to Problem (P). For each p ≥ 1, let the planning horizon [0, T] be partitioned into np subintervals with np + 1 partition points denoted by

τ0p , τ1p , . . . , τnpp , (9.2.5)

where τ^p_0 = 0 and τ^p_{np} = T, while τ^p_i, i = 1, . . . , np − 1, are decision variables satisfying the following conditions:

τ^p_{k−1} ≤ τ^p_k, k = 1, 2, . . . , np. (9.2.6)

Let the number np of the partition points be chosen such that np+1 > np .
The control is now approximated in the form of a piecewise constant function:

u^p(t) = Σ_{k=1}^{np} σ^{p,k} χ_{[τ^p_{k−1}, τ^p_k)}(t), (9.2.7)

where σ^{p,k} = [σ^{p,k}_1, . . . , σ^{p,k}_r]⊤ with σ^{p,k} ∈ U, k = 1, . . . , np, and the set U is as defined in Section 8.2. Let
σ^p = [(σ^{p,1})⊤, . . . , (σ^{p,np})⊤]⊤ (9.2.8)

and

τ^p = [τ^p_1, τ^p_2, . . . , τ^p_{np−1}]⊤, (9.2.9)

where τ^p_0 = 0, τ^p_{np} = T, while τ^p_i, i = 1, . . . , np − 1, are decision variables satisfying

0 = τ^p_0 ≤ τ^p_1 ≤ τ^p_2 ≤ · · · ≤ τ^p_{np} = T. (9.2.10)
Let Ξ^p be the set consisting of all such σ^p, and let Γ^p be the set consisting of all such τ^p.
To apply the time scaling transform, we construct a transformation from t ∈ [0, T] to a new time scale s ∈ [0, 1]. This transformation maps the knots

τ^p_0, τ^p_1, τ^p_2, . . . , τ^p_{np−1}, τ^p_{np}

into the fixed knots

ξ^p_k = k/np, k = 0, 1, . . . , np. (9.2.11a)

Clearly, these fixed knots are such that

0 = ξ^p_0 < ξ^p_1 < ξ^p_2 < · · · < ξ^p_{np−1} < ξ^p_{np} = 1. (9.2.11b)

The required transformation from t ∈ [0, T] to s ∈ [0, 1] can be defined by the following differential equation:

dt(s)/ds = v^p(s) (9.2.12a)

with the initial condition

t(0) = 0. (9.2.12b)
In view of Definition 8.9.6, v^p(s) is a time scaling control as defined by (8.9.19), that is,

v^p(s) = Σ_{k=1}^{np} θ^p_k χ_{[ξ^p_{k−1}, ξ^p_k)}(s), (9.2.13)

where θ^p_k ≥ 0, k = 1, . . . , np, are decision variables. Define θ^p = [θ^p_1, . . . , θ^p_{np}]⊤ ∈ R^{np} with θ^p_k ≥ 0, k = 1, . . . , np. Let Θ^p be the set consisting of all such θ^p, and let V^p be the set consisting of all the piecewise constant functions of the form (9.2.13) with θ^p ∈ Θ^p. Clearly, each θ^p ∈ Θ^p defines uniquely a v^p ∈ V^p, and vice versa. By (9.2.13), it is clear from (9.2.12) that, for k = 1, . . . , np,

t^p(s) = ∫₀ˢ v^p(τ) dτ = Σ_{j=1}^{k−1} θ^p_j (ξ^p_j − ξ^p_{j−1}) + θ^p_k (s − ξ^p_{k−1}) for s ∈ [ξ^p_{k−1}, ξ^p_k]. (9.2.14)
In particular,

t^p(1) = ∫₀¹ v^p(τ) dτ = Σ_{k=1}^{np} θ^p_k (ξ^p_k − ξ^p_{k−1}) = T (9.2.15)

and

t^p(k/np) = ∫₀^{k/np} v^p(τ) dτ = Σ_{j=1}^{k} θ^p_j (ξ^p_j − ξ^p_{j−1}) = τ_k, k = 1, . . . , np. (9.2.16)

Define

ω^p(s) = u^p(t(s)), (9.2.17)

where u^p is of the form (9.2.7). Clearly, ω^p is a piecewise constant control which can be written as

ω^p(s) = Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s), s ∈ [0, 1], (9.2.18)

where ξ^p_k, k = 0, 1, . . . , np, are fixed knots chosen as specified by (9.2.11), χ_{[ξ^p_{k−1}, ξ^p_k)}(·) denotes the indicator function of the interval [ξ^p_{k−1}, ξ^p_k) defined by (7.3.3), and σ^p ∈ Ξ^p.
Let Û^p be the set containing all those piecewise constant controls given by (9.2.18) with σ^p ∈ Ξ^p. Clearly, each (σ^p, θ^p) ∈ Ξ^p × Θ^p defines uniquely a (ω^p, v^p) ∈ Û^p × V^p, and vice versa.
With the transformation from t ∈ [0, T] to s ∈ [0, 1] via the introduction of (9.2.12), system (9.2.1) becomes

dx̂(s)/ds = f̂(s, x̂(s), σ^p, θ^p), (9.2.19)
x̂(0) = [(x0)⊤, 0]⊤, (9.2.20)

where

x̂(s) = [(x̃(s))⊤, t(s)]⊤, (9.2.21a)
x̃(s) = x(t(s)), (9.2.21b)
f̂(s, x̂(s), σ^p, θ^p) = v^p(s) f(t(s), x̃(s), Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s)), (9.2.21c)

and v^p is given by (9.2.13).


Let B^p be the set which consists of all those elements from Ξ^p × Θ^p such that the following constraints are satisfied:

Φi(x̃(1 | σ^p, θ^p)) ≤ 0, i = 1, . . . , NT, (9.2.22)
ĥi(s, x̂(s | σ^p, θ^p), σ^{p,k}) ≤ 0, s ∈ Īk, i = 1, . . . , NS, k = 1, . . . , np, (9.2.23)

and

Υ̂(θ^p) = Σ_{k=1}^{np} θ^p_k (ξ^p_k − ξ^p_{k−1}) − T = 0, (9.2.24)

where

ĥi(s, x̂(s), σ^{p,k}) = hi(t(s), x̃(s), Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s)), s ∈ Īk, (9.2.25)

and, for each k = 1, . . . , np, Īk denotes the closure of Ik with Ik = [ξ^p_{k−1}, ξ^p_k).
Let D^p be the corresponding subset of Û^p × V^p uniquely defined by elements from B^p. Clearly, for each (ω^p, v^p) ∈ D^p, there exists a unique (σ^p, θ^p) ∈ B^p such that

ω^p(s) = Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s), s ∈ [0, 1], (9.2.26a)
v^p(s) = Σ_{k=1}^{np} θ^p_k χ_{[ξ^p_{k−1}, ξ^p_k)}(s), s ∈ [0, 1]. (9.2.26b)

The equivalent transformed optimal parameter selection problem may now be stated as follows.
Problem (P(p)) Subject to system (9.2.19)–(9.2.20), find a combined control parameter vector and switching vector (σ^p, θ^p) ∈ B^p such that the cost functional

G0(σ^p, θ^p) = Φ0(x̃(1 | σ^p, θ^p)) + ∫₀¹ L̂0(s, x̂(s | σ^p, θ^p), σ^p, θ^p) ds (9.2.27)

is minimized over B^p, where

L̂0(s, x̂(s), σ^p, θ^p) = v^p(s) L0(t(s), x̃(s), Σ_{k=1}^{np} σ^{p,k} χ_{[ξ^p_{k−1}, ξ^p_k)}(s)). (9.2.28)

9.2.2 Constraint Approximation

For each i = 1, . . . , NS and k = 1, . . . , np, the corresponding continuous state inequality constraint in (9.2.23) is equivalent to

Gi,k(σ^p, θ^p) = ∫_{ξ^p_{k−1}}^{ξ^p_k} max{ĥi(s, x̂(s | σ^p, θ^p), σ^{p,k}), 0} ds = 0. (9.2.29)

However, the equality constraint (9.2.29) is non-differentiable at points (σ^p, θ^p) ∈ Ξ^p × Θ^p such that ĥi = 0. Nevertheless, since for each i = 1, . . . , NS and k = 1, . . . , np the continuous inequality constraint in (9.2.23) is equivalent to the equality constraint in (9.2.29), Problem (P(p)) with (9.2.23) replaced by (9.2.29) is again referred to as Problem (P(p)).
Let B̊^p be the interior of the set B^p in the sense that it consists of all those (σ^p, θ^p) ∈ B^p such that

Φi(x̃(1 | σ^p, θ^p)) < 0, i = 1, . . . , NT, (9.2.30)
max_{s∈Īk} ĥi(s, x̂(s | σ^p, θ^p), σ^{p,k}) < 0, i = 1, . . . , NS, k = 1, . . . , np. (9.2.31)

To continue, we assume that the following condition is satisfied.

Assumption 9.2.4 B̊^p ≠ ∅.

To*handle the non-differentiable equality constraints (9.2.28), we replace


  +  
-(s | σ , θ ), σ
max ĥi s, x p p p,k
, 0 by L-i,ε s, x
-(s | σ p , θ p ), σ p,k , where
 
L-i,ε s, x
-(s | σ p , θ p ), σ p,k
⎧  

⎪ 0, -(s | σ p , θ p ), σ p,k < −ε
if ĥi s, x




x(s|σ p ,θ p ),σ p,k )+ε)2  
= (ĥi (s, , if − ε ≤ ĥi s, x -(s | σ p , θ p ), σ p,k ≤ ε

⎪ 4ε



⎩    
-(s | σ p , θ p ), σ p,k , if ĥi s, x
ĥi s, x -(s | σ p , θ p ), σ p,k > ε.
(9.2.32)

This function
*  is obtained by smoothing out the sharp corner of the func-
 +
tion max ĥi s, x-(s | σ p , θ p ), σ p,k , 0 which is in the form as shown in Fig-
ure 4.2.1. For each i = 1, . . . , NS , and k = 1, . . . , np , define
 1
np
 
p p
Gi,ε (σ , θ ) = L-i,ε s, x
-(s | σ p , θ p ), σ p,k χ[ξp p
) (s)ds. (9.2.33)
k−1 ,ξk
0 k=1

We now define two related approximate problems, which will be referred


to as Problem (Pε (p)) and Problem (Pε,γ (p)), respectively. The first approx-
imate problem is:
Problem (Pε (p)): The Problem (P (p)) with the continuous inequality con-
straints (9.2.23) replaced by

Gi,ε (σ p , θ p ) = 0, i = 1, . . . , NS . (9.2.34)

Let Bεp be the feasible region of Problem (Pε (p)) containing all those
(σ , θ p ) ∈ Ξ p × Θp such that the constraints (9.2.22), (9.2.34) and (9.2.21)
p

are satisfied. Clearly, for each ε > 0, Bεp ⊂ B p .


322 9 Optimal Control Problems with State and Control Constraints

Note that equality constraints (9.2.34) fail to satisfy the usual constraint
qualification. To overcome this difficulty, we consider the second approximate
problem as follows:
Problem (Pε,γ (p)): The Problem (P (p)) with (9.2.23) replaced by

- i,ε,γ (σ p ) = −γ + G
G - i,ε (σ p ) ≤ 0, i = 1, . . . , NS . (9.2.35)

Note that constraints (9.2.35) are already in canonical form, i.e., in the
form of (9.2.29), where the functions Ĝi, in (9.2.35) are equal to the constant
−γ in the present case.
We assume that the following condition is satisfied.
Assumption 9.2.5 For any combined vector (σ p , θ p ) in B p , there exists a
combined vector (σ p , θ p ) ∈ B̊ p such that

α(σ p , θ p ) + (1 − α)(σ p , θ p ) ∈ B̊ p for all α ∈ (0, 1].

To relate the solutions of Problems (P (p)), (Pε (p)) and (Pε,γ (p)) as ε → 0,
we have the following lemma.
Lemma 9.2.1 For any ε > 0, there exists a γ(ε) > 0 such that for all γ,
p
0 < γ < γ(ε), if (σε,γ p
, θε,γ ) ∈ Ξ p × Θp satisfies the constraints of Problem
(Pε,γ (p)), i.e.,

x(1 | σε,γ
Φi (6 p p
, θε,γ )) ≤ 0, i = 1, . . . , NT , (9.2.36a)
p
Gi,k,ε,γ (σε,γ p
, θε,γ ) ≤ 0, i = 1, . . . , NS ; k = 1, . . . , np , (9.2.36b)
 p p
np
Υ-(θ p ) = p
θk (ξk − ξk−1 ) − T = 0, (9.2.36c)
k=1

then it also satisfies the constraints of Problem (P (p)).


Proof. For each i = 1, . . . , NS and any (σ p , θ p ) ∈ Ξ p × Θp , we have

dhi (t(s), x(s | σ p , θ p ))


ds
∂hi (t(s), x(s | σ p , θ p )) dt(s)
=
∂t ds
 n
∂hi (t(s), x(s | σ , θ p )) ˜
p
+ fj (t(s), x(s | σ p , θ p ), σ p , θ p ). (10.5.33)
j=1
∂x j

Note that t(s) depends on θ p ∈ Θp . Now, by Assumption 9.2.3 and Lemma


8.4.2, there exists a positive constant mi such that, for all (σ p , θ p ) ∈ Ap ,
 
 dhi (t(s), x(s | σ p , θ p )) 
  ≤ mi , ∀s ∈ [0, 1]. (9.2.37)
 ds 
9.2 Optimal Control with Continuous State Inequality Constraints 323

Next, for any ε > 0, define

ε ε
ki,ε = min T, . (9.2.38)
16 2mi

It suffices to show that, for each i = 1, . . . , Ns ,


p
Bi,ε,γ ⊂ Bip (9.2.39)

for any γ such that 0 < γ < ki,ε , where


* +
p
Bi,ε,γ - i,ε (σ p , θ p ) ≤ 0
= (σ p , θ p ) ∈ Ap : −γ + G (9.2.40)

and * +
- i (σ p , θ p )) = 0 ,
Bip = (σ p , θ p ) ∈ Ap : G (9.2.41)

where G - i (σ p , θ p ) is defined by (9.2.29). Assume the contrary. Then, there


exists an i ∈ {1, . . . , NS } and a (σ p , θ p ) ∈ Ap such that

- i,ε (σ p , θ p ) ≤ 0
−γ+G (9.2.42)

for any γ such that 0 < γ < ki,ε but

- i (σ p , θ p ) > 0.
G (9.2.43)

Since hi (t(s), x(s | σ p , θ p )) is a continuous function of s in [0, 1], (9.2.43)


implies that there exists a s6 ∈ [0, 1] such that

hi (t(6 s | σ p , θ p )) > 0.
s), x(6 (9.2.44)

Again by continuity, there exists an interval Ii ⊂ [0, 1] containing s6 such that

hi (t(6 s | σ p )) > −ε/2,


s), x(6 ∀t ∈ Ii . (9.2.45)

For geometrical interpretation, please refer to Figure 4.3.1. Using (9.2.45),


and Figure 4.3.1, we see that
y
mi = tan θ = . (9.2.46)
z
Thus, the length |Ii | of the interval Ii must satisfy

y ε
|Ii | ≥ min{T, z} = min T, ≥ min T, . (9.2.47)
mi 2mi

From the definition of G - i,ε (σ p ) and the fact that L-i,ε (t(s), x(s | σ p )) is non-
negative, it follows from (9.2.42) that
324 9 Optimal Control Problems with State and Control Constraints
 T
- i,ε (σ p , θ p ) = −γ +
0 ≥ −γ + G L-i,ε (t(s), x(s | σ p , θ p ))ds
 0

≥ −γ + -
Li,ε (t(s), x(s | σ , θ p ))ds ≥ −γ
p
Ii
* +
+ min L-i,ε (t(s), x(s | σ p , θ p )) |Ii | . (9.2.48)
s∈Ii

Now, by virtue of (9.2.45), we have

minL-i,ε (t(s), x(s | σ p , θ p )) > ε/16. (9.2.49)


s∈Ii

Combining (9.2.47), (9.2.48), and (9.2.49), we obtain

- i,ε (σ p , θ p ) > −γ + ε min T, ε


0 ≥ −γ + G = −γ + ki,ε .
16 2mi

This is a contradiction, because γ < ki, . Thus, the proof is complete.

Remark 9.2.1 An alternative proof of Lemma 8.4.1 without the geometrical


interpretation of Figure 4.3.1 is left as an exercise (use Lemma 4.2.1)

9.2.3 A Computational Algorithm

To solve the constrained optimal control Problem (P (p)), we construct a


sequence of approximate problems (Pε,γ (p)) in ε and γ. For a given positive
p,∗ p,∗
integer p > 0, and for each ε > 0 and γ > 0, let σε,γ , θε,γ be the solution
of the problem (Pε,γ (p)). The following algorithm can be used to generate a
sequence of combined vectors in the feasible region of the Problem (P (p)).
Algorithm 9.2.1
Set ε > 0, γ > 0. ( In particular,  we may choose ε = 10−1 and γ = T ε/16).
p,∗ p,∗
Step 1. Solve (Pε,γ (p)) to give σ ,γ , θε,γ .
   p,∗ 
Step 2. Check the feasibility of ĥi s, x̂ s | σε,γ p,∗ p,∗
, θε,γ , θε,γ ≤ 0 for all s ∈ Ik
and i = 1, . . . , NS ; k =  1, . . . , n p .
p,∗ p,∗
Step 3. If σε,γ , θε,γ is feasible, go to Step 5. Otherwise, go to Step 4.
Step 4. Set γ = γ/2 and go to Step 1.
Step 5. Set ε = ε/10, γ = γ/10 and go to Step 1.

Remark 9.2.2 From Lemma 9.2.1, it is clear that the halving process of γ
in Step 4 of Algorithm 9.2.1 needs only to be carried out a finite number of
times. Let γ̃(ε) be the parameter corresponding to each ε > 0 obtained in the
!
p,∗ p,∗
halving process of γ in Step 4 of the algorithm. Clearly, σε,γ̃(ε) , θε,γ̃(ε)
9.2 Optimal Control with Continuous State Inequality Constraints 325

satisfies*the constraints+ of Problem (P (p)). The algorithm produces a se-


p,∗ p,∗
quence σε,γ̃(ε) , θε,γ̃(ε) in ε > 0, where each of these combined vectors
* +
p,∗ p,∗
σε,γ̃(ε) , θε,γ̃(ε) is in the feasible region of Problem (P (p)). Thus, it is a
sequence of approximate optimal combined vectors of Problem (P (p)).
* +
p,∗ p,∗
Remark 9.2.3 Let σε,γ̃(ε) , θε,γ̃(ε) be the sequence of approximate optimal
combined vectors of Problem (P (p)) obtained by Algorithm 9.2.1
* as explained+
p,∗ p,∗
in Remark 9.2.2. This gives rise to a corresponding sequence ωε,γ̃(ε) , νε,γ̃(ε)
p,∗ p,∗
of approximate piecewise constant controls, where ωε,γ̃(ε) and νε,γ̃(ε) are, re-
p p p,∗
spectively, given by (9.2.26a) and (9.2.26b) with σ and θ taken as σε,γ̃(ε)
p,∗
and θε,γ̃(ε) , respectively. Then, by (9.2.7) and (9.2.17), we obtain a corre-
* +
sponding sequence up,∗ ε,γ̃(ε) of approximate controls to Problem (P (p)), i.e.,


np
p,∗,k   (t).
up,∗
ε,γ̃(ε) (t) = σε,γ̃(ε) χ τ p,∗ p,∗
,τε,γ̃(ε),k
(9.2.50)
k=1 ε,γ̃(ε),k−1

p,∗
Here, we solve (9.2.12a) and (9.2.12b) with θ p taken as θε,γ̃(ε) , giving
p,∗ p,∗
tε,γ̃(ε) (s) defined by (9.2.14). Then, by evaluating tε,γ̃(ε) (s) at s = k/np ,
p,∗
k = 0, 1, . . . , np , we obtain τε,γ̃(ε),k , k = 0, 1, . . . , np .

Remark 9.2.4 By examining the proof of Lemma 9.2.1, we see that ε and
γ are closely related to each other. At the solution of a particular problem,
if a constraint is active over a large fraction of [0, 1], then we should choose
γ = O(ε). On the other hand, if the  constraint is active only over a very
small fraction of [0, 1], then γ = O ε2 .

Remark 9.2.5 In the actual implementation, Algorithm 9.2.1 is terminated


when either of the following two conditions is satisfied.
(i) In Step 4: If γ < 10−10 , then the algorithm is terminated as abnormal
exit.
(ii) In Step 5: If ε < 10−7 , then the algorithm is terminated as successful
exit.
Remark 9.2.6 In Step 1 of Algorithm 9.2.1, we are required to solve Prob-
lem (Pε,γ (p)). It can be solved as a nonlinear optimization problem. The de-
tails are given in the next section.

9.2.4 Solving Problem (Pε,γ (p))

Problem (Pε,γ (p)) can be viewed as the following nonlinear optimization prob-
lem, which is again referred to as Problem (Pε,γ (p)).
326 9 Optimal Control Problems with State and Control Constraints

Minimize:
G0 (σ p , θ p ) (9.2.51)
subject to

x(1 | σ p , θ p )) ≤ 0,
Φi (6 i = 1, . . . , NT , (9.2.52)
Gi,k,ε,γ (σ p , θ p ) ≤ 0, i = 1, . . . , NS ; k = 1, . . . , np , (9.2.53)

np
Υ-(θ p ) = θkp (ξkp − ξk−1
p
)−T =0 (9.2.54)
k=1
 
i (σ p ) = E i σ p,k − bi ≤ 0, i = 1, . . . , q; k = 1, . . . , np (9.2.55)
αi ≤ σip,k
≤ βi , i = 1, . . . , r; k = 1, . . . , np (9.2.56)
p
θk ≥ 0, k = 1, . . . , np . (9.2.57)

The constraints (9.2.55)–(9.2.57) are to specify the sets Ξ p and Θp , re-


spectively. The constraints (9.2.56)–(9.2.57) are known as the boundedness
constraints in nonlinear optimization programming. Let Λ be the set which
consists of all those (σ p , θ p ) ∈ Rrnp × Rnp such that the constraints (9.2.56)–
(9.2.57) are satisfied.
This nonlinear mathematical programming problem in the combined vec-
tors can be solved by using any nonlinear optimization technique, such as
the sequential quadratic programming (SQP) approximation routine with
the active set strategy (see Section 3.5).
To use the SQP approximation routine with the active set strategy to
solve the nonlinear optimization
 problem (9.2.51)–(9.2.57), we choose an
initial combined vector (σ p )(0) , (θ p )(0) ∈ Λ to start up the SQP ap-
proximation
 p (i) p (i) routine.
 The SQP approximation routine will use, for each
(σ ) , (θ ) ∈ Λ, the values of the cost function (9.2.51) and the con-
straint functions (9.2.52)–(9.2.55)
 as well as their respective gradients to gen-
erate the next iterate (σ p )(i+1) , (θ p )(i+1) . Consequently, it gives rise to a
sequence of combined vectors. The combined vector obtained by the SQP
approximation routine is regarded as an optimal combined vector of Problem
(P (p)).
In what follows, we shall explain how the values of the cost functional
(9.2.51) and the constraint functionals (9.2.52)–(9.2.55), as well as their re-
spective gradients are calculated corresponding to each (σ p , θ p ) ∈ Λ.
Remark 9.2.7 For each (σ p , θ p ) ∈ Λ, the corresponding values of the con-
straint functions i (σ p,k ), i = 1, . . . , np ; k = 1, . . . , np and Υ- (θ p ) are
straightforward to calculate via (9.2.54)–(9.2.55). Their gradients are
 
∂i σ p,k  
p,k
= E i , i = 1, . . . , q; k = 1, . . . , np (9.2.58)
∂σ
∂ Υ-(θ p )
=(ξkp − ξk−1
p
), k = 1, . . . , np . (9.2.59)
∂θkp
9.2 Optimal Control with Continuous State Inequality Constraints 327

To calculate, for each (σ p , θ p ) ∈ Λ, the values of the cost functional (9.2.51)


and the constraint functionals (9.2.52) and (9.2.53), the first task is to use
the following algorithm to calculate the solution of system (9.2.19)–(9.2.20)
corresponding to each (σ p , θ p ) ∈ Λ.

Algorithm 9.2.2
For each given (σ p , θ p ) ∈ Λ, compute the solution x̂(· | σ p , θ p ) of sys-
tem (9.2.19)–(9.2.20) by solving the differential equations (9.2.19) forward
in time from s = 0 to s = 1 with the initial condition (9.2.20).

With the solution x̂(· | σ p , θ p ) of the system (9.2.19)–(9.2.20) correspond-


ing to the (σ p , θ p ) ∈ Λ obtained, the values of the cost functional G0 (σ p , θ p )
and the constraint functionals Φi , i = 1, . . . , NT , and Gi,ε,γ , i = 1, . . . , NS ;
k = 1, . . . , np , can be easily calculated using the following simple algorithm.
Algorithm 9.2.3
For a given (σ p , θ p ) ∈ Λ,
Step 1. Use Algorithm 9.2.2 to solve for x̂(· | σ p , θ p ). Thus, x̂(s | σ p , θ p ) is
known for each s ∈ [0, 1]. This implies that:
(a). Φ0 (6x(1 | σ p , θ p )), and Φi (6 x(1 | σ p , θ p )), i = 1, . . . , NT , are known; and
(b). L0 (s, x̂(s | σ , θ ), σ , θ ), and L-i,ε (s, x̂(s | σ p , θ p ), σ p , θ p,k ), i =
- p p p p

1, . . . , NS ; k = 1, . . . , np , are known for each s ∈ [0, 1]. Hence, their


integrals:
 1
L-0 (s, x̂(s | σ p , θ p ), σ p , θ p )ds (9.2.60)
0
 p
ξk−1  
L-i,ε s, x̂(s | σ p , θ p ), σ p , θ p,k ds, i = 1, . . . , NS ; k = 1, . . . , np ,
p
ξk−1
(9.2.61)

can be obtained readily.

Step 2. The value of the cost functional:


 1
x(1 | σ p , θ p )) +
G0 (σ p , θ p ) = Φ0 (6 L-0 (s, x̂(s | σ p , θ p ), σ p , θ p )ds (9.2.62)
0

and the values of the constraint functionals:

x(1 | σ p , θ p )),
Φi (6 i = 1, . . . , NT , (9.2.63)

and
 p
ξk
Gi,k,ε,γ (σ , θ ) = −γ +
p p
L-i,ε (s, x̂(s | σ p , θ p ), σ p , θ p,k )ds,
p
ξk−1
328 9 Optimal Control Problems with State and Control Constraints

i = 1, . . . , NS ; k = 1, . . . , np (9.2.64)

are calculated.
Let us now move to present an algorithm for calculating the gradients of
these functionals for each given (σ p , θ p ) ∈ Λ. For this, we note from Sec-
tion 7.3 that the derivations of the gradient formulae for the cost functional
and the constraint functionals are similar. These gradients can be computed
using the following algorithm:
Algorithm 9.2.4
For a given (σ p , θ p ) ∈ Λ,
Step 1. Solve the costate systems corresponding to the cost functional (9.2.62),
the terminal constraint functionals (9.2.63), and the inequality state con-
straint functionals (9.2.64), respectively, as follows:
(i). For the cost functional (9.2.62):
Solve the following costate system of differential equations:
⎛ ! ⎞
- 0 s, x̂(s | σ p , θ p ), σ p , θ p , λ̂0 (s)
∂H
dλ̂0 (s)
= −⎝ ⎠ (9.2.65)
ds ∂ x̂

with the boundary condition:



0 ∂Φ0 (x(1 | σ p , θ p ))
λ̂ (1) = (9.2.66)
∂x

backward in time from s = 1 to s = 0, where x̂(· | σ p , θ p ) is the solution of


system (9.2.19)–(9.2.20) corresponding to (σ p , θ p ) ∈ Λ; and the Hamiltonian
- 0 is defined by
function H
- 0 (s, x̂, σ p , θ p , λ) = L-0 (s, x̂, σ p , θ p ) + λ f-(s, x̂, σ p , θ p ).
H (9.2.67)

Let λ̂0 (· | σ p , θ p ) be the solution of the costate system (9.2.65)–(9.2.66).


(ii) For the i-th terminal constraint functional given in (9.2.63):
Solve the following costate system of differential equations:
⎛ ! ⎞
- i s, x̂(s | σ p , θ p ), σ p , θ p , λ̂i (s)
∂H
dλ̂i (s)
= −⎝ ⎠ (9.2.68)
ds ∂ x̂

with the boundary condition:



i ∂Φi (x(1 | σ p , θ p ))
λ̂ (1) = (9.2.69)
∂x
9.2 Optimal Control with Continuous State Inequality Constraints 329

backward in time from s = 1 to s = 0, where x̂(· | σ p , θ p ) is the solution of


system (9.2.19) and (9.2.20) corresponding to (σ p , θ p ) ∈ Λ; and H - i , the cor-
responding Hamiltonian function for the i-th terminal constraint functional,
is defined by
H- i (s, x̂, σ p , θ p , λ) = λ f-(s, x̂, σ p , θ p ). (9.2.70)
Let λ̂i (· | σ p , θ p ) be the solution of the costate system (9.2.68) and (9.2.69) .

(iii) For the (i, k)-th inequality state constraint functional given in
(9.2.64):

Solve the following costate system of differential equations:


⎛ ! ⎞
- i,k,ε,γ s, x̂(s | σ p , θ p ), σ p , θ p , λ̂i,k,ε,γ (s)
∂H
dλ̂i,,k,ε,γ (s)
= −⎝ ⎠
ds ∂ x̂
(9.2.71)
with the boundary condition:

λ̂i,k,ε,γ (1) = 0 (9.2.72)

backward in time from s = 1 to s = 0, where x̂(· | σ p , θ p ) is the solution of the


system (9.2.19)–(9.2.20) corresponding to (σ p , θ p ) ∈ Λ; and the Hamiltonian
- i,k,ε,γ is defined by
function H

- i,k,ε,γ (s, x̂, σ p , θ p , λ) = L-i,k,ε,γ (s, x̂, σ p , θ p ) + λ f-(s, x̂, σ p , θ p ). (9.2.73)


H

Let λ̂i,ε (· | σ p , θ p ) be the solution of the costate system (9.2.71) and (9.2.72).
Step 2. The gradients of the cost functional (9.2.62), the terminal inequal-
ity constraint functionals (9.2.63), and the inequality constraint function-
als (9.2.64) are computed, respectively, as follows:
(i) For the cost functional (9.2.62):
!
 -0
1 ∂H s, x̂(s | σ p , θ p ), σ p , θ p , λ̂0 (s | σ p , θ p )
∂G0 (σ p , θ p )
= ds
∂σ p 0 ∂σ p
(9.2.74a)
!
p p  - 0 s, x̂(s | σ , θ ), σ , θ , λ̂ (s | σ , θ p )
1 ∂H p p p p 0 p
∂G0 (σ , θ )
= ds.
∂θ p 0 ∂θ p
(9.2.74b)
(ii) For the i-th terminal constraint functional given in (9.2.63):
!
 1 ∂H - i s, x̂(s | σ p , θ p ), σ p , θ p , λ̂i (s | σ p , θ p )
∂Gi (σ p , θ p )
= ds
∂σ p 0 ∂σ p
(9.2.75)
330 9 Optimal Control Problems with State and Control Constraints
!
 1 - i s, x̂(s | σ p , θ p ), σ p , θ p , λ̂i (s | σ p , θ p )
∂H
∂Gi (σ p , θ p )
= ds.
∂θ p 0 ∂θ p
(9.2.76)
(iii) For the (i, k)-th inequality constraint functional given in (9.2.64):
 
p p   i,ε
1 ∂H s, x̂(s | σ p , θ p ), σ p , θ p , λ̂i,ε (s | σ p , θ p )
∂Gi,ε (σ , θ )
= ds
∂σ p 0 ∂σ p
 (9.2.77)

p p  
1 ∂ Hi,ε s, x̂(s | σ , θ ), σ , θ , λ̂ (s | σ , θ )
p p p p i,ε p p
∂Gi,ε (σ , θ )
= ds.
∂θ p 0 ∂θ p
(9.2.78)

9.2.5 Some Convergence Results

In this section, we shall investigate some convergence properties of the se-


quences of approximate optimal controls obtained in Sections 9.2.3 and 9.2.4.
* +
p,∗ p,∗
Theorem 9.2.1 Let σε,γ̃(ε) , θε,γ̃(ε) be a sequence in ε of the approximate
optimal combined vectors produced by Algorithm 9.2.1, where γ̃(ε) is deter-
mined as described in Remark 9.2.1 so as to ensure that the constraints of
Problem (P (p)) are satisfied. Then,
!
p,∗ p,∗
G0 σε,γ̃(ε) , θε,γ̃(ε) → G0 (σ p,∗ , θ p,∗ ),

as ε → 0, where (σ p,∗ , θ p,∗ ) is an optimal combined


* vector +
of Problem (P (p)).
p,∗ p,∗
Furthermore, any accumulation point of σε,γ̃(ε) , θε,γ̃(ε) is a solution of
Problem (P (p)).
Proof. The proof is similar to that given for Theorem 4.3.1.
Remark 9.2.8 Let (σ p,∗ , θ p,∗ ) be an optimal combined vector of the approxi-
mate Problem (P (p)). Then, (σ p,∗ , θ p,∗ ) defines uniquely a {ω p,∗ , ν p,∗ } ∈ B p
via (9.2.12a) and (9.2.18), and vice versa. Furthermore, corresponding to
{ω p,∗ , ν p,∗ } ∈ B p , there exists a unique (up,∗ , τ p,∗ ) defined on the original
time horizon [0, T ] such that


np
up,∗ (t) = σ p,∗,k χ[τk−1
p,∗
,τkp,∗ ) (t). (9.2.79)
k=1

Here, τ p,∗ = [τ1p,∗ , . . . , τnp,∗


p −1
] , τ0p,∗ = 0 and τnp,∗
p
= T , are determined
p,∗ p,∗ p,∗
uniquely by θ in the same way as uε,γ̃(ε) is obtained from θε,γ̃(ε) as de-
scribed in Remark 9.2.3 . Note that up,∗ is an approximate optimal control
of Problem (P ).
9.2 Optimal Control with Continuous State Inequality Constraints 331

Theorem 9.2.2 Let up,∗ be as defined in Remark 9.2.3. Suppose that the
original Problem (P ) has an optimal control u∗ . Then,

lim g0 (up,∗ ) = g0 (u∗ ). (9.2.80)


p→∞

Proof. The proof is similar to that given for Theorem 8.9.1.

Theorem 9.2.3 Let u∗ be an optimal control of Problem (P ), and let up,∗


be as defined in Remark 9.2.3 Suppose that

up,∗ → ū, (9.2.81)

a.e. in [0, 1]. Then, ū is an optimal control of Problem (P ).

Proof. The proof is similar to that given for Theorem 8.9.2.

9.2.6 Illustrative Examples

Example 9.2.1 A non-trivial test problem is taken from Example 6.7.2 of


[253]:
min g0 , (9.2.82)
where
 1 * +
2 2 2
g0 = (x1 (t)) + (x2 (t)) + 0.005 (u(t)) dt (9.2.83)
0

subject to

dx1 (t)
= x2 (t) (9.2.84a)
dt
dx2 (t)
= −x2 (t) + u(t) (9.2.84b)
dt
with initial conditions
x1 (0) = 0, x2 (0) = −1 (9.2.84c)
and the continuous state inequality constraint

h = −8(t − 0.5)2 + 0.5 + x2 (t) ≤ 0, ∀t ∈ [0, 1] (9.2.85)

together with the control constraints

− 20 ≤ u(t) ≤ 20, ∀t ∈ [0, 1]. (9.2.86)

The constraint transcription of Section 9.2.2 is used to handle the con-


tinuous state inequality constraint (9.2.85). Let Lε be constructed from h
332 9 Optimal Control Problems with State and Control Constraints

according to (9.2.32). Then, we consider the cases of np = 10, 20, 30 and 40,
where np is the number of partitions points of the parametrized control as a
piecewise constant function. For each of these cases, the problem is solved us-
ing the MISER software[104], first without using time scaling transform and
then using time scaling transform. The optimization method used within the
MISER software is the sequential quadratic programming (SQP). Note that
for each case, the constraint transcription method is used to ensure that the
continuous inequality constraint (9.2.85) is satisfied, where Lε is constructed
from h according to (9.2.32). Take the case of N = 40, the numerical results
obtained from the use of the constraint transcription method are summarized
in Table 9.2.1.

Table 9.2.1: Numerical results for example 9.2.1 with np = 40 and solved
using the constraint transcription method but without using time scaling
1 1
ε γ g0 (u) 0
Lε dt 0
max{h, 0}dt Reason for termination
10−2 0.25 × 10−2 0.1709 −0.196 × 10−2 −0.15 × 10−2 Normal
10−2 0.79 × 10−3 0.1732 −0.744 × 10−3 −0.84 × 10−4 Normal
10−2 0.25 × 10−3 0.1751 −0.218 × 10−4 0 Normal
10−3 0.25 × 10−4 0.1727 −0.205 × 10−5 −0.13 × 10−6 Normal
10−4 0.25 × 10−5 0.1727 −0.335 × 10−6 −0.10 × 10−6 Zero derivative

The cost values obtained without using time scaling transform and us-
ing time scaling transformation for the cases of np = 10, 20, 30 and 40 are
summarized in Table 9.2.2 and Table 9.2.3, respectively.

Table 9.2.2: Approximate optimal costs for Example 9.2.1 with different
np and solved without using time scaling

np g0 Terminated successfully
10 1.81473135 × 10−1 Yes
20 1.73092969 × 10−1 Yes
30 1.71593913 × 10−1 Yes
40 1.70814337 × 10−1 Yes

For the case of np = 40, Figures 9.2.1 and 9.2.2 show, respectively, the
approximate optimal control and approximate optimal state trajectories. For
this example, the optimal solution has been obtained in [189]. Compared
the optimal solution obtained in [189] with the solution obtained using our
method with np = 40, we see that their trends are similar. In particular,
their cost values are basically the same. This confirms the convergence of the
approximate optimal costs of our method.
9.2 Optimal Control with Continuous State Inequality Constraints 333

Table 9.2.3: Approximate optimal costs for different np and solved using
time scaling

np g0 Terminated successfully
10 1.78540476 × 10−1 Yes
20 1.7275432 × 10−1 Yes
30 1.71188006 × 10−1 Yes
40 1.70634166 × 10−1 Yes

14

12

10

6
u

−2

−4
0 0.2 0.4 0.6 0.8 1
t

Fig. 9.2.1: Optimal control for Example 9.2.1 with np = 40 and solved
using time scaling

Example 9.2.2 A realistic and complex problem of transferring containers


from a ship to a cargo truck at the port of Kobe. It is taken from Exam-
ple 6.7.3 of [253]. The containers crane is driven by a hoist motor and a
trolley drive motor. For safety reason, the objective is to minimize the swing
during and at the end of the transfer. See Example 1.2.2 of Chapter 1 for
details. Here the problem is summarized after appropriate normalization as
follows:  1
$ %
min g0 = 4.5 (x3 (t))2 + (x6 (t))2 dt (9.2.87)
0

subject to

dx1 (t)
= 9x4 (t) (9.2.88a)
dt
334 9 Optimal Control Problems with State and Control Constraints

0 0.4

0.2
-0.05
0

-0.1 −0.2
X1

X2
−0.4
-0.15
−0.6
-0.2
−0.8

-0.25 −1
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

Fig. 9.2.2: Optimal state trajectories for Example 9.2.1 with np = 40 and
solved using time scaling

dx2 (t)
= 9x5 (t) (9.2.88b)
dt
dx3 (t)
= 9x6 (t) (9.2.88c)
dt
dx4 (t)
= 9(u1 (t) + 17.2656x3 (t)) (9.2.88d)
dt
dx5 (t)
= 9u2 (t) (9.2.88e)
dt
dx6 (t) 9
=− [u1 (t) + 27.0756x3 (t) + 2x5 (t)x6 (t)], (9.2.88f)
dt x2 (t)

where

x(0) = [0, 22, 0, 0, −1, 0] (9.2.89a)


x(1) = [10, 14, 0, 2.5, 0, 0] (9.2.89b)

and
|u1 (t)| ≤ 2.83374 (9.2.90)
− 0.80865 ≤ u2 (t) ≤ 0.71265, ∀t ∈ [0, 1]. (9.2.91)
with continuous state inequality constraints

|x4 (t)| ≤ 2.5, ∀t ∈ [0, 1], (9.2.92)

|x5 (t)| ≤ 1.0, ∀t ∈ [0, 1]. (9.2.93)


The bounds of the states can be formulated as the continuous inequality
constraints as follows:

h1 =x4 (t) − 2.5 ≤ 0 (9.2.94)


h2 = − x4 (t) − 2.5 ≤ 0 (9.2.95)
9.2 Optimal Control with Continuous State Inequality Constraints 335

h3 =x5 (t) − 1.0 ≤ 0 (9.2.96)


h4 = − x5 (t) − 1.0 ≤ 0 (9.2.97)


4
Lε (t, x(t)) = Li,ε (t, x(t)),
i=1

where for each i = 1, . . . , 4, Li,ε (t, x(t)) is constructed from hi (t, x(t)) accord-
ing to (9.2.32), and hi (t, x(t)), i = 1, . . . , 4, are defined by (9.2.94)–(9.2.97),
respectively.
Take the case of np = 40, the numerical results obtained from the use of
the constraint transcription method are summarized in Table 9.2.4.

Table 9.2.4: Numerical results for Example 9.2.2 using constraint


transformation method but without using time scaling

1 1
ε γ g0 (u) 0
Lε dt 0
max{h, 0}dt Reason for termination
−2 −2 −2 −2
10 0.25 × 10 0.55 × 10 −0.25 × 10 −0.799 × 10−5 Normal
−3 −3 −2 −3
10 0.79 × 10 0.53 × 10 −0.25 × 10 −0.795 × 10−13 Normal
−4 −4 −2 −4 −8
10 0.25 × 10 0.53 × 10 −0.25 × 10 −0.365 × 10 Normal

The optimal cost values obtained without using time scaling and those
obtained using time scaling for the cases of np = 20, 30 and 40 are summarized
in the following tables. The reason for the termination of the optimization
software is normal for each of these cases, showing that the solution obtained
for each of the cases is such that the KKT conditions are satisfied, and so
are the continuous inequality constraints.

Table 9.2.5: Approximate optimal costs for Example 9.2.2 with different
np and solved using time scaling and without using time scaling

np g0 without time scaling g0 with time scaling


20 5.6549084 × 10−3 5.50843808 × 10−3
−3
30 5.54097657 × 10 5.18506216 × 10−3
40 5.30287028 × 10−3 5.17354239 × 10−3

Figures 9.2.3 and 9.2.4 show, respectively, the approximate optimal con-
trols and approximate optimal state trajectories obtained using time scaling
transform.
336 9 Optimal Control Problems with State and Control Constraints

3 1

2.5 0.8
2
0.6
1.5
u1 u2 0.4
1
0.2
0.5
0
0

−0.5 −0.2

−1 −0.4
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

Fig. 9.2.3: Optimal controls for Example 9.2.2 with np = 40 and solved
using time scaling

From Table 9.2.5, we see that for each approximate problem, the optimiza-
tion software is unable to reduce the cost value further due to the smallness
of the cost value. We thus multiply the cost functional with a weighting fac-
tor of 103 to give g60 = 103 × g0 . Then, we redo the calculation. The results
obtained are listed in Table 9.2.6. From which, we can see the convergence
of the approximate optimal controls.

Table 9.2.6: Approximate optimal costs for Example 9.2.2 with different
np and solved with a weighting factor of 1000 and using time scaling and
without using time scaling

np g0 without time scaling g0 with time scaling


20 5.25532980 × 10−3 5.15123152 × 10−3
30 5.17674340 × 10−3 5.15078395 × 10−3
40 5.15527749 × 10−3 5.15059955 × 10−3

Figures 9.2.5 and 9.2.6 show the approximate optimal state trajectories,
and approximate optimal controls for the case of np = 40.
Based on Euler discretization scheme, an algorithm is developed using it-
erative restoration method in [13]. The modeling language AMPL [F2] is then
used to implement the algorithm for constrained optimal control problems,
where the optimization software Ipopt [277] is used. This example is solved
by using the algorithm developed in [13], where the optimal control obtained
is shown to satisfy the optimality conditions. The optimal cost obtained in
[13] is 0.005139 with N = 1000, where N denotes number of grid points used
in Euler discretization. From Table 9.2.6, we see that the difference between
the optimal cost obtained in [13] and that obtained by our method for the
case of np = 40 is insignificant. In view of Figures 9.2.5 and 9.2.6, we see that
9.2 Optimal Control with Continuous State Inequality Constraints 337
10 22

9
21
8
20
7
19
6
x x
1 2
5 18

4
17
3
16
2
15
1

0 14
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

0 2.5

−0.005

2
−0.01

−0.015
1.5
x −0.02 x
3 4

−0.025
1
−0.03

−0.035
0.5

−0.04

−0.045 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

0 0.05

−0.2 0.04

0.03
−0.4

0.02
−0.6
x5 x6
0.01
−0.8
0

−1
−0.01

−1.2 −0.02

−1.4 −0.03
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

Fig. 9.2.4: Optimal state trajectories for Example 9.2.2 with np = 40 and
solved with a weighting factor of 1000, solved with a weighting factor but
without time scaling

the trends of the approximate optimal state trajectories and approximate


optimal controls are similar but not identical to those obtained in [13].
In many optimal control problems, a relatively large change in the control
parameters near the optimal solution often produces only a small change in
the value of the cost functional g0 . In other words, the exact value of the
optimal control does not matter much in some problems. This phenomenon,
while may be valuable to the system designer, produces ill-conditioning in
the computation of optimal control.
338 9 Optimal Control Problems with State and Control Constraints

3 0.8

2.5 0.6

0.4
2
0.2
1.5
u
u1 2
0
1
−0.2

0.5
−0.4

0 −0.6

−0.5 −0.8
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

Fig. 9.2.5: Optimal control for Example 9.2.2 with np = 40 and solved
with a weighting factor of 1000 and using time scaling

9.3 Exact Penalty Function Approach

In this section, the exact penalty function approach detailed in Section 4.4
will be used to develop a computational method for solving a class of optimal
control problems to be described below. The main references for this section
are [134, 147, 300, 301].
Consider the system of differential equations given by (9.2.1a) with initial
condition (9.2.1b) and terminal equality constraint given by

x(T ) = xf , (9.3.1)

where T is the terminal time, x = [x1 , . . . , xn ] ∈ Rn and u = [u1 , . . . ur ] ∈


Rr are, respectively, state and control vectors, and f = [f1 , . . . , fn ] ∈ Rn is
a given functional.
Assumption 9.3.1 The function f satisfies the relevant conditions appear-
ing in Assumptions 8.2.1 and 8.2.2.

Define

U = {ν = [v1 , . . . vr ] ∈ Rr : αi ≤ vi ≤ βi , i = 1, . . . , r}, (9.3.2)

where αi , i = 1, . . . , r, and βi , i = 1, . . . , r, are given real numbers. A piece-


wise continuous function u is said to be an admissible control if u(t) ∈ U for
all t ∈ [0, T ]. Let U be the class of all such admissible controls. Furthermore,
let x(· | u) denote the solution of system (9.2.1a)–(9.2.1b) corresponding to
u ∈ U.
Consider the following continuous state inequality constraints (9.2.3).

hi (t, x(t | u), u(t)) ≤ 0, ∀t ∈ [0, T ], i = 1, . . . , N. (9.3.3)


9.3 Exact Penalty Function Approach 339
10 22

9
21
8
20
7
19
6
x x
1 2
5 18

4
17
3
16
2
15
1

0 14
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

0 2.5

−0.005

2
−0.01

−0.015
1.5
x −0.02 x
3 4

−0.025
1
−0.03

−0.035
0.5

−0.04

−0.045 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

0 0.05

0.04
−0.2

0.03
−0.4

0.02
x5 x6
−0.6
0.01

−0.8
0

−1
−0.01

−0.02
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
t t

Fig. 9.2.6: Optimal state trajectories for Example 9.2.2 with np = 40 and
solved with a weighting factor of 1000 and using time scaling

We now state the corresponding optimal control problem formally as fol-


lows:
Problem (P̂ ) Given the dynamical system (9.2.1a) and (9.2.1b) subject
to the terminal constraint (9.3.1) and the continuous inequality constraints
(9.3.3), find a control u ∈ U such that the cost function
340 9 Optimal Control Problems with State and Control Constraints
 T
g0 (u) = Φ0 (x(T | u)) + L0 (t, x(t | u), u(t))dt (9.3.4)
0

is minimized.
We assume that the following conditions are satisfied.
Assumption 9.3.2 Φ0 is continuously differentiable with respect to x.

Assumption 9.3.3 hi , i = 1, . . . , N and L0 are continuously differentiable


with respect to all their arguments.

Remark 9.3.1 By Assumption 9.3.1 and the definition of U , it follows from


an argument similar to that given for the proof of Lemma 8.4.2 that for all
u ∈ U , x(t|u) ∈ X holds, and for all t ∈ [0, T ] , X ⊂ Rn is a compact subset.

9.3.1 Control Parametrization and Time Scaling


Transformation

To solve Problem (P̂ ), we shall apply the control parametrization scheme


together with a time scaling transform. The time horizon [0, T ] is partitioned
with a sequence τ = {τ0 , . . . , τp } of time points τi , i = 1, . . . , p − 1. Then, the
control is approximated by a piecewise constant function as follows.


p
up (t | σ, τ ) = σ j χ[τj−1 ,τj ) (t), (9.3.5)
j=1

& '
where τj−1 ≤ τj , j = 1, . . . , p, with τ0 = 0 and τp = T , σ j = σ1j , . . . , σrj ∈
&  '
 
Rr , j = 1, . . . , p, σ = σ 1 , . . . , (σ p ) ∈ Rpr and χI is the indicator
function of I defined by (7.3.3).
As up ∈ U , σ j ∈ U for j = 1, . . . , p . Let Ξ be the set of all those
&  '
 
σ = σ 1 , . . . , (σ p ) ∈ Rpr such that σ j ∈ U for j = 1, . . . , p .
The switching times τj , 1 ≤ j ≤ p − 1, are also regarded as decision
variables. The time scaling transform is employed to map these switching
times into a set of fixed time points kp , k = 1, . . . , p − 1, on a new time
horizon [0, 1]. This is easily achieved by the following differential equation

dt(s)
= υ p (s), s ∈ [0, 1], (9.3.6a)
ds
with initial condition
t(0) = 0, (9.3.6b)
where
9.3 Exact Penalty Function Approach 341


p
p
υ (s) = θj χ[ j−1 , j ) (s). (9.3.7)
p p
j=1

Here, θj ≥ 0, j = 1, . . . p. Let θ = [θ1 , . . . , θp ] ∈ Rp and let Θ be the set


containing all such θ.
Taking integration
& of !(9.3.6a) with initial condition (9.3.6b), it is easy to
see that, for s ∈ p , p , k = 1, . . . , p,
k−1 k


k−1
θj θk
t(s) = + (ps − k + 1), (9.3.8)
j=1
p p

where k = 1, . . . , p. Clearly, for k = 1, . . . , p − 1,


k
θj
τk = (9.3.9)
j=1
p

and

p
θj
t(1) = = T. (9.3.10)
j=1
p

The approximate control given by (9.3.5) in the new time horizon [0, 1]
becomes
p
p p
ũ (s) = u (t(s)) = σ j χ[ j−1 , j ) (s), (9.3.11)
p p
j=1

which has fixed switching times at s = p1 , . . . , p−1


p . Now, by using the time
scaling transform (9.3.6a) and (9.3.6b), the dynamic system (9.2.1a) and
(9.2.1b) is transformed into

dy(s)  
= θk f t(s), y(s), σ k , s ∈ Jk , k = 1, . . . , p (9.3.12a)
ds
dt(s)
= υ p (s) (9.3.12b)
ds
y(0) = x0 and t(0) = 0, (9.3.12c)

where

y(s) = [y1 (s), . . . , yn (s)] ,
and $ %
x0 = x01 , . . . , x0n
.
The terminal conditions (9.3.1) and (9.3.10) become

y(1) = xf and t(1) = T, (9.3.13a)


342 9 Optimal Control Problems with State and Control Constraints

respectively, where y(s) = x(t(s)) and


⎧& !

⎪ k−1 k
,

⎪ p p , if k = 1,



⎨ !
Jk = , p , if k ∈ {2, . . . , p − 1},
k−1 k
(9.3.14)


p



⎪ '

⎩ k−1 , k , if k = p.
p p

We then rewrite system (9.3.12a)–(9.3.12c) as follows.

dỹ(s) ˜
=f (s, ỹ(s), σ, θ), s ∈ [0, 1] (9.3.15a)
ds
ỹ(0) =ỹ 0 (9.3.15b)

with the terminal conditions

ỹ(1) = ỹ f , (9.3.15c)

where

y1 (s), . . . , y6n (s), y6n+1 (s)]
ỹ(s) = [6 (9.3.16)
with

y6i (s) = yi (s), i = 1, . . . , n, y6n+1 (s) = t(s);


⎡ p ⎤
 k
θk f (t(s), y(s), σ )χJk (s) ⎦
f˜(s, ỹ(s), σ, θ) = ⎣ k=1 ; (9.3.17)
υ p (s)
$ %
ỹ 0 = y610 , . . . , y6n0 , y6n+1
0
(9.3.18)

with
y6i0 = x0i , i = 1, . . . , n, y6n+1
0
= 0;
and & '
ỹ f = y61f , . . . , y6nf , y6n+1
f
(9.3.19)

with
y6if = xfi , i = 1, . . . , n, y6n+1
f
= T.
To proceed further, let ỹ(· | σ, θ) denote the solution of system (9.3.15a)–
(9.3.15b) corresponding to (σ, θ) ∈ Ξ × Θ. Similarly, applying the time scal-
ing transform to the continuous inequality constraints (9.3.3) and the cost
functional (9.3.4) yields
 
hi ỹ(s | σ, θ), σ k ≤ 0, ∀s ∈ Jk , k = 1, . . . , p; i = 1, . . . , N (9.3.20)
9.3 Exact Penalty Function Approach 343

and
 1
g60 (σ, θ) = Φ0 (y(1 | σ, θ)) + L̄0 (s, ỹ(s | σ, θ), σ, θ)ds, (9.3.21)
0

respectively, where

L̄0 (s, ỹ(s | σ, θ), σ, θ) = υ p (s)L0 (t(s), y(s), ũp (s)). (9.3.22)

Remark 9.3.2 Assumption 9.3.1, Assumption 9.3.2 and Assumption 9.3.3,


it follows from Remark 9.3.1 that there exits constants K1 > 0 and K2 > 0
such that

|Υ (s, ỹ(s | σ, θ), σ, θ)| ≤ K1 , s ∈ Jk , k = 1, . . . , p; (σ, θ) ∈ Ξ × Θ

 
 ∂Υ (s, ỹ(s | σ, θ), σ, θ) 
  ≤ K2 , s ∈ Jk , k = 1, . . . , p; (σ, θ) ∈ Ξ × Θ
 ∂σ 

 
 ∂Υ (s, ỹ(s | σ, θ), σ, θ) 
  ≤ K2 , s ∈ Jk , k = 1, . . . , p; (σ, θ) ∈ Ξ × Θ
 ∂θ 

 
 ∂Υ (s, ỹ(s | σ, θ), σ, θ) 
  ≤ K2 , s ∈ Jk , k = 1, . . . , p; (σ, θ) ∈ Ξ × Θ,
 ∂ y6 

where Υ is used to denote f6i , i = 1, . . . , n; hi (t(s), ỹ(s | σ, θ), σ k ), i =


1, . . . , N ; k = 1, . . . , p; and L0 .
The approximate problem to Problem (P̂ ) may now be stated formally as
follows.
Problem (P̂ (p)) Given system (9.3.15a) and (9.3.15b), find a (σ, θ) ∈ Ξ ×
Θ such that the cost functional (9.3.21) is minimized subject to (9.3.15c)
and (9.3.20).
Thus, Problem (P̂ (p)) becomes an optimization problem subject to both
the equality constraints (9.3.15c) and the continuous inequality constraints
(9.3.20). To solve this problem, an exact penalty function method introduced
in Section 4.4 is used.
First, we define
 
Fε = {(σ, θ, ε) ∈ Ξ × Θ × R+ : hi ỹ(s | σ, θ), σ k ≤ εγ Wi ,
∀ s ∈ Jk , k = 1, . . . , p; i = 1, . . . , N }, (9.3.23)

where R+ = {α ∈ R : α ≥ 0}, Wi ∈ (0, 1), i = 1, . . . , N , are fixed constants


and γ is a positive real number. In particularly, when ε = 0,
344 9 Optimal Control Problems with State and Control Constraints
 
F0 = {(σ, θ) ∈ Ξ × Θ : hi ỹ(s | σ, θ), σ k ≤ 0,
∀ s ∈ Jk , k = 1, . . . , p; i = 1, . . . , N }. (9.3.24)

Similarly, we define
" #
Ωε = (σ, θ, ε) ∈ Fε : ỹ(1 | σ, θ) − ỹ f = 0 (9.3.25)

and " #
Ω0 = (σ, θ) ∈ F0 : ỹ(1 | σ, θ) − ỹ f = 0 . (9.3.26)
Clearly, Problem (P̂ (p)) is equivalent to the following problem, which is
denoted as Problem (P̃ (p)).
Problem (P̃ (p)) Given system (9.3.15a) and (9.3.15b), find a (σ, θ) ∈ Ω0
such that the cost functional (9.3.21) is minimized.
Then, by applying the exact penalty function introduced in Section 4.4,
we obtain a new cost functional defined below.

g60δ (σ, θ, ε)


⎪ g60 (σ, θ), if ε = 0, hi (ỹ(s), σ k ) ≤ 0

(s ∈ Jk , k = 1, . . . , p),
=

⎪ g60 (σ, θ) + ε−α (Δ(σ, θ, ε) + Δ1 ) + δεβ , if ε > 0,

+∞, otherwise,
(9.3.27)

where Δ(σ, θ, ε), which is referred to as the continuous inequality constraint


violation, is defined by

 p 
N 
$ "   #%2
Δ(σ, θ, ε) = max 0, hi ỹ(s | σ, θ), σ k − εγ Wi ds,
i=1 k=1 Jk
(9.3.28)
α and γ are positive real numbers, β > 2, and δ > 0 is a penalty parameter,
while Δ1 , which is referred to as the equality constraints violation, is defined
by

  
n+1 !2

Δ1 = ỹ(1 | σ, θ) − ỹ f 2
= ỹi (1 | σ, θ) − ỹif , (9.3.29)
i=1

and | · | denotes the usual Euclidian norm.


Remark 9.3.3 Note that other types of equality constraints, such as interior
point equality constraints (8.3.5) can be dealt with similarly by introducing
appropriate equality constraint violation as defined by (9.3.29).

We now introduce a surrogate optimal control problem, which is referred


to as Problem (P̃δ (p)), as follows.
9.3 Exact Penalty Function Approach 345

Problem (P̃δ (p)) Given system (9.3.15a)–(9.3.15b), find a (σ, θ, ε) ∈ Ξ ×


Θ × [0, +∞) such that the cost functional (9.3.27) is minimized.
Intuitively, during the process of minimizing g60δ (σ, θ, ε), if δ is increased,
β
ε should be reduced, meaning that ε should be reduced as β is fixed. Thus
ε−α will be increased, and hence the constraint violation will be reduced.
This means that the values of
 p 
N   !2
$ "   #%2 n+1
max 0, hi ỹ(s|σ, θ), σ k − γ Wi and ỹi (1|σ, θ) − ỹif
i=1 k=1 Jk i=1

must go down such that the continuous inequality constraints (9.3.20) and
the equality constraints (9.3.15c) are satisfied.
Before deriving the gradient of the cost functional of Problem (P̃δ (p)), we
will rewrite the cost functional in the canonical form below.
 1
g60 (σ, θ, ε) = Φ0 (y(1 | σ, θ)) +
δ
L̄0 (s, ỹ(s | σ, θ), σ, θ)ds
0
4 N 
 1$ " #%2
+ ε−α max 0, h̄i (s, ỹ(s | σ, θ), σ) − εγ Wi ds
i=1 0
5

n+1 !2
+ ỹi (1 | σ, θ) − ỹif + δεβ
i=1
4 5

n+1 !2
−α
= Φ0 (y(1 | σ, θ)) + ε ỹi (1 | σ, θ) − ỹif + δε β

i=1
 1
+ L̄0 (s, ỹ(s | σ, θ), σ, θ)ds
0
N 

−α
1 $ " #%2
+ε max 0, h̄i (s, ỹ(s | σ, θ), σ) − εγ Wi ds,
i=1 0
(9.3.30)

where

h̄i (s, ỹ(s | σ, θ), σ) = hi (ỹ(s | σ, θ), ũp (s)), i = 1, . . . , N (9.3.31)

and ũp (s) is defined by (9.3.11).


Let

n+1 !2
Φ̃0 (ỹ(1 | σ, θ), ε) = Φ0 (y(1 | σ, θ)) + ε−α ỹi (1 | σ, θ) − ỹif + δεβ
i=1
(9.3.32)
and

L̃0 (s, ỹ(s | σ, θ), σ, θ, ε) = L̄0 (s, ỹ(s | σ, θ), σ, θ)


346 9 Optimal Control Problems with State and Control Constraints

N 
 1 $ " #%2
+ ε−α max 0, h̄i (s, ỹ(s | σ, θ), σ) − εγ Wi . (9.3.33)
i=1 0

We then substitute (9.3.32) and (9.3.33) into (9.3.30) to give


 1
g60δ (σ, θ, ) = Φ̃0 (ỹ(1 | σ, θ), ε) + L̃0 (s, ỹ(s | σ, θ), σ, θ, ε)ds. (9.3.34)
0

Now, the cost functional of Problem (P̃δ (p)) is in canonical form. As de-
rived for the proof of Theorem 7.2.2, the gradient formulas of the cost func-
tional (9.3.34) are given by the following theorem.
Theorem 9.3.1 The gradients of the cost functional g60δ (σ, θ, ε) with respect
to σ, θ, and ε are
 1  
∂6
g0δ (σ, θ, ε) ∂H0 s, ỹ(s | σ, θ), σ, θ, ε, λ0 (s | σ, θ, ε)
= ds (9.3.35)
∂σ ∂σ
0
 1  
∂6
g0δ (σ, θ, ε) ∂H0 s, ỹ(s | σ, θ), σ, θ, ε, λ0 (s | σ, θ, ε)
= ds (9.3.36)
∂θ 0 ∂θ
4 N 
∂6
g0δ (σ, θ, ε)  1$ " #%2
−α−1
= −αε max 0, h̄i (s, ỹ(s | σ, θ), σ) − εγ Wi ds
∂ε i=1 0
5

n+1 !2
+ ỹi (1 | σ, θ) − ỹif
i=1
N 
 1 " #
− 2γεγ−α−1 max 0, h̄i (s, ỹ(s | σ, θ), σ)−εγ Wi Wi ds
i=1 0
β−1
+ δβε
4 N 
 1 $ " #%2
=ε−α−1 −α max 0, h̄i (s, ỹ(s | σ, θ), σ) − εγ Wi ds
i=1 0
N  1
 " #
+ 2γ max 0, h̄i (s, ỹ(s | σ, θ), σ) − εγ Wi (−εγ Wi )ds
i=1 0
5

n+1 !2
−α ỹi (1 | σ, θ) − ỹif + δβεβ−1 , (9.3.37)
i=1
 
respectively, where H0 s, ỹ(s | σ, θ), σ, θ, ε, λ0 (s | σ, θ, ε) is the Hamilto-
nian function for the cost functional (9.3.34) given by
 
H0 s, ỹ(s | σ, θ), σ, θ, ε, λ0 (s | σ, θ, ε)
$ %
= L̃0 (s, ỹ(s | σ, θ), σ, θ, ε) + λ0 (s | σ, θ, ε) f˜(s, ỹ(s | σ, θ), σ, θ)
(9.3.38)
9.3 Exact Penalty Function Approach 347

and λ0 (·|σ, θ, ) is the solution of the following system of costate differential


equations
.   /
dλ0 (s) ∂H0 s, ỹ(s | σ, θ), σ, θ, ε, λ0 (s)
=− (9.3.39a)
ds ∂ ỹ

with the boundary condition


. /
0 ∂ Φ̃0 (ỹ(1 | σ, θ), ε)
λ (1) = . (9.3.39b)
∂ ỹ

Remark 9.3.4 By Assumptions 9.3.1, 9.3.2, and 9.3.3, Remarks 9.3.1 and
9.3.2, it follows from arguments similar to those given for the proof of
Lemma 8.4.2 that there exits a compact set Z ⊂ Rn such that λ0 (s | σ, θ, ε) ∈
Z for all s ∈ [0, 1], (σ, θ) ∈ Ξ × Θ and ε ≥ 0.

9.3.2 Some Convergence Results

In this section, we shall show that, under some mild assumptions,  if the pa-
rameter δk is sufficient large (δk → +∞ as k → +∞) and σ (k),∗ , θ (k),∗ , ε(k),∗
is (k),∗
a local minimizer
 of Problem (P̃δ (p)), then ε(k),∗ → ε∗ = 0, and
σ ,θ (k),∗
→ (σ , θ ∗ ) with (σ ∗ , θ ∗ ) being a local minimizer of Problem

(P̃ (p)).  
For every positive integer k, let σ (k),∗ , θ (k),∗ be a local minimizer of
Problem (P̃δ (p)). To obtain our main result, we need
 
Lemma 9.3.1 Let σ (k),∗ , θ (k),∗ , ε(k),∗ be a local minimizer of Problem
 
(P̃δ (p)). Suppose that g60δk σ (k),∗ , θ (k),∗ , ε(k),∗ is finite and that ε(k),∗ > 0.
Then !
σ (k),∗ , θ (k),∗ , ε(k),∗ ∈/ Ωεk ,

where Ωεk is as defined by (9.3.25).

Proof. Since (σ (k),∗ , θ (k),∗ , ε(k),∗ ) is a local minimizer of Problem (P̃δ (p)) and
ε(k),∗ > 0, we have
δ
 
∂6g0k σ (k),∗ , θ (k),∗ , ε(k),∗
= 0. (9.3.40)
∂ε
On the contrary, we assume that the conclusion of the lemma is false. Then,
we have
! !γ
hi ỹ(s | σ (k),∗ , θ (k),∗ ), σ (k),∗ ≤ ε(k),∗ Wi ,
348 9 Optimal Control Problems with State and Control Constraints

∀ s ∈ Jj , j = 1, . . . , p; i = 1, . . . , N, (9.3.41)

and !
ỹ 1 | σ (k),∗ , θ (k),∗ − ỹ f = 0. (9.3.42)

Thus, by (9.3.41), (9.3.43), (9.3.27), and (9.3.40), we obtain


δ
 
∂6g0k σ (k),∗ , θ (k),∗ , ε(k),∗
0= = βδk εβ−1 > 0
∂ε
This is a contradiction, and hence completing the proof.

Before we introduce the definition of the constraint qualification, we first


define

φi (ỹ(1 | σ, θ)) = ỹi (1 | σ, θ) − ỹif , i = 1, . . . , n + 1. (9.3.43)

Assumption 9.3.4 The constraint qualification in the sense of Definition 4.1


is satisfied for the continuous inequality constraints (9.3.20) at (σ, θ) = (σ̄, θ̄)
 
Theorem 9.3.2 !Suppose that σ (k),∗ , θ (k),∗ , ε(k),∗ is a local minimizer of
 
Problem P̃δk (p) such that g60k σ (k),∗ , θ (k),∗ , ε(k),∗ is finite and ε(k),∗ >
δ

 
0. If σ (k),∗ , θ (k),∗ , ε(k),∗ → (σ ∗ , θ ∗ , ε∗ ) as k → +∞, and the constraint
qualification is satisfied for the continuous inequality constraints (9.3.20) at
(σ, θ) = (σ ∗ , θ ∗ ), then ε∗ = 0 and (σ ∗ , θ ∗ ) ∈ Ω0 .

Proof. From Lemma 9.2.1, it follows that (σ (k),∗ , θ (k),∗ , ε(k),∗ ) ∈ / Ωε(k),∗ . Fur-
thermore, in terms of (9.3.27), we have
 
g0δ σ (k),∗ , θ (k),∗ , ε(k),∗
∂6
∂σ     
 1
∂H0 s, ỹ s | σ (k),∗ , θ (k),∗ , σ, θ, ε, λ0 s | σ (k),∗ , θ (k),∗ , ε(k),∗
= ds
0 ∂σ
 1    
∂ L̄0 s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗ , θ (k),∗
= ds + 2(ε(k),∗ )−α ·
0 ∂σ
 N  1 * ! ! !γ +
max 0, h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗ − ε(k),∗ Wi ·
i=1 0
     1 !
∂ h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗
ds + λ0 s | σ (k),∗ , θ (k),∗ , ε(k),∗ ·
∂σ 0
   (k),∗ (k),∗ 
˜
∂ f s, ỹ s | σ (k),∗
,θ (k),∗
,σ ,θ
ds
∂σ
=0 (9.3.44)
9.3 Exact Penalty Function Approach 349
 
g0δ σ (k),∗ , θ (k),∗ , ε(k),∗
∂6
∂ε
4 N 
 1 & * ! !
(k),∗ −α−1
= (ε ) −α max 0, h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗
i=1 0

+'2 N  1
 * ! !
− (ε)γ Wi ds + 2γ max 0, h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗
i=1 0
5
!γ + 
n+1 ! !2
− ε (k),∗
Wi (−(ε ) Wi )ds − α
(k),∗ γ
ỹi 1 | σ (k),∗
,θ (k),∗
− ỹif
i=1
!β−1
+ δk β ε(k),∗
= 0. (9.3.45)

Suppose that ε(k),∗ → ε∗ = 0. Then, by (9.3.45), it can be shown by using


Remarks 9.3.1 and 9.3.2 and Theorem A.1.10 (Lebesgue dominated conver-
gence theorem) that its first term tends to a finite value, while the last term
tends to infinity as δk → +∞, when k → +∞. This is impossible for the
validity of (9.3.45). Thus, ε∗ = 0.
Now, by (9.3.44), we obtain
 1     !−α
∂ L̄0 s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗ , θ (k),∗
ds + 2 ε(k),∗ ·
0 ∂σ
N  1
 * ! ! !γ +
max 0, h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗· − ε(k),∗ Wi ·
i=1 0
     1 !
∂ h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗
ds + λ0 s | σ (k),∗ , θ (k),∗ , ε(k),∗ ·
∂σ 0
   
∂ f˜ ỹ s | σ (k),∗
, θ (k),∗ , σ (k),∗ , θ (k),∗
ds = 0
∂σ
Thus,
4    
1
∂ L̄0 s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗ , θ (k),∗ !−α
lim ds + 2 ε(k),∗ ·
k→+∞ 0 ∂σ
N 
 1 * ! ! !γ +
max 0, h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗ − ε(k),∗ Wi ·
i=1 0
     1 !
∂ h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗
ds + λ0 s | σ (k),∗ , θ (k),∗ , ε(k),∗ ·
∂σ
 (k),∗ (k),∗  5
0
 
˜
∂ f s, ỹ s | σ (k),∗
,θ (k),∗
,σ ,θ
ds = 0. (9.3.46)
∂σ
350 9 Optimal Control Problems with State and Control Constraints

Then, by Remarks 9.3.2 and 9.3.3, it follows from Theorem A.1.10 that the
first and third terms appeared on the right hand side of (9.3.46) converge to
finite values. On the other hand, the second term tends to infinity, which is
impossible. Thus,
N 
 1 " # ∂ h̄i (s, ỹ(s | σ ∗ , θ ∗ ), σ ∗ )
max 0, h̄i (s, ỹ(s | σ ∗ , θ ∗ ), σ ∗ ) ds = 0.
i=1 0
∂σ
(9.3.47)
Since the constraint qualification is satisfied for the continuous inequality con-
straints (9.3.20) at (σ, θ) = (σ ∗ , θ ∗ ), it follows that, for each i = 1, . . . , N,
" #
max 0, h̄i (s, ỹ(s | σ ∗ , θ ∗ ), σ ∗ ) = 0

for each s ∈ [0, 1]. This, in turn, implies that, for each i = 1, . . . , N

h̄i (s, ỹ(s | σ ∗ , θ ∗ ), σ ∗ ) ≤ 0 (9.3.48)

for each s ∈ [0, 1]. Next, from (9.3.45) and (9.3.48), it is easy to see that when
k → +∞, for each i = 1, . . . , n + 1,

ỹi (1 | σ ∗ , θ ∗ ) − ỹif = 0. (9.3.49)

The proof is complete.


 
Corollary 9.3.1 Suppose that σ (k),∗ , θ (k),∗ → (σ ∗ , θ ∗ ) ∈ Ω0 and that
 
ε(k),∗ → ε∗ = 0. Then, Δ σ (k),∗ , θ (k),∗ , ε(k),∗ → Δ(σ ∗ , θ ∗ , ε∗ ) = 0, and
Δ1 → 0.
Proof. The conclusion follows readily from the definitions of Δ(σ, θ, ε) and
Δ1 , and the continuity of hi and f˜.
In what follows, we shall turn our attention to the exact penalty func-
tion constructed in (9.3.27). We shall see that, under some mild conditions,
g60δ (σ, θ, ε) is continuously differentiable with continuous limit. For this, we
need the following lemmas.
Lemma 9.3.2 Assume that
 !  !ξ 
 (k),∗ (k),∗ 
 σ ,θ − (σ ∗ , θ∗ ) = o ε (k),∗
− (ε )∗ ξ
,

then
 !  !ξ 
 
ỹ s | σ (k),∗ , θ (k),∗ − ỹ(s | σ ∗ , θ ∗ ) =o ε(k),∗ − (ε∗ )ξ , (9.3.50)

where
ỹ(· | σ, θ)∞ = ess sup |ỹ(· | σ, θ)|. (9.3.51)
s∈[0,1]
9.3 Exact Penalty Function Approach 351

Proof. Note that


!  s !
ỹ s | σ (k),∗
,θ (k),∗
= ỹ(0) + f˜ τ, ỹ(τ ), σ (k),∗ , θ (k),∗ dτ (9.3.52)
0

for any s ∈ [0, 1]. Thus,


 ! 
 
ỹ s | σ (k),∗ , θ (k),∗ − ỹ (s | σ ∗ , θ ∗ )
 s ! 
 
= ˜
f τ, ỹ(τ ), σ (k),∗
,θ (k),∗
− f (τ, ỹ(τ ), σ , θ ) dτ 
˜ ∗ ∗
(9.3.53)
0

for any s ∈ [0, 1]. By Assumption 10.1.1, there exists a constant N2 > 0 such
that
 ! 
 
ỹ s | σ (k),∗ , θ (k),∗ − ỹ(s | σ ∗ , θ ∗ )
 s * ! 
 
≤N2 ỹ τ | σ (k),∗ , θ (k),∗ − ỹ(τ | σ ∗ , θ ∗ )
 0 ! +
 
+  σ (k),∗ , θ (k),∗ − (σ ∗ , θ ∗ ) dτ (9.3.54)

for any s ∈ [0, 1]. By applying Gronwall–Bellman’s lemma, we have


 ! 
 
ỹ s | σ (k),∗ , θ (k),∗ − ỹ(s | σ ∗ , θ ∗ )
 1 ! 
 (k),∗ (k),∗ 
≤ N2  σ ,θ − (σ ∗ , θ ∗ ) dτ exp(N2 )
 0 ! 
 
= N2  σ (k),∗ , θ (k),∗ − (σ ∗ , θ ∗ ) exp(N2 ) (9.3.55)

for any s ∈ [0, 1]. Let N3 = N2 exp(N2 ), (9.3.55) becomes


 !   ! 
   
ỹ s | σ (k),∗ , θ (k),∗ − ỹ(s | σ ∗ , θ ∗ ) ≤ N3  σ (k),∗ , θ (k),∗ − (σ ∗ , θ ∗ )
!ξ 
∗ ξ
= N3 o ε (k),∗
− (ε ) (9.3.56)

for any s ∈ [0, 1]. Since this is valid for all s ∈ [0, 1], it completes the proof.

We assume that the following conditions are satisfied.


Assumption 9.3.5
! ! !ξ 
h̄i s, ỹ s | σ (k),∗ , θ (k),∗ , σ (k),∗ = o ε(k),∗ ,

ξ > 0, s ∈ [0, 1], i = 1, . . . , N. (9.3.57)


352 9 Optimal Control Problems with State and Control Constraints

Assumption 9.3.6
!! !ξ  
φi ỹ 1 | σ (k),∗
,θ (k),∗
=o ε (k),∗
, ξ  > 0, i = 1, . . . , n + 1.
(9.3.58)
Theorem 9.3.3 Suppose that γ > α, ξ > α, ξ  > α, −α − 1 + 2ξ >
0, −α − 1 + 2ξ  > 0, 2γ − α − 1 > 0. Then, as ε(k),∗ → ε∗ = 0 and
 (k),∗
σ , θ (k),∗ → (σ ∗ , θ ∗ ) ∈ Ω0 , it holds that
!
g60k σ (k),∗ , θ (k),∗ , ε(k),∗ → g600 (σ ∗ , θ ∗ , 0) = g60 (σ ∗ , θ ∗ )
δ
(9.3.59)

!
∇(σ,θ,ε) g60k σ (k),∗ , θ (k),∗ , ε(k),∗ → ∇(σ,θ,ε) g600 (σ ∗ , θ ∗ , 0)
δ

= (∇(σ,θ) g60 (σ ∗ , θ ∗ ), 0). (9.3.60)

Proof. For notational brevity, the following abbreviations will be used through
the proof of this theorem and that of Theorem 9.3.5
! !
(k),∗
h̄i (·) = h̄i ·, ỹ ·|σ (k),∗ , θ(k),∗ , σ (k),∗ (9.3.61)
! !
(k),∗
L̄0 (·) = L̄0 ·, ỹ ·|σ (k),∗ , θ(k),∗ , σ (k),∗ , θ(k),∗ (9.3.62)
! !
(k),∗
L̃0,ε (·) = L̃0 ·, ỹ ·|σ (k),∗ , θ(k),∗ , σ (k),∗ , θ(k),∗ , ε (9.3.63)
!
ỹ (k),∗ (·) = ỹ ·|σ (k),∗ , θ(k),∗ (9.3.64)
ỹ ∗ (·) = ỹ (·|σ ∗ , θ∗ ) (9.3.65)
!
(k),∗
ỹi (·) = ỹi ·|σ (k),∗ , θ(k),∗ (9.3.66)
! !
f˜(k),∗ (·) = f˜ ·, ỹ ·|σ (k),∗ , θ(k),∗ , σ (k),∗ , θ(k),∗ (9.3.67)
!
λ̄0,(k),∗ (·) = λ̄ ·|σ (k),∗ , θ(k),∗ (9.3.68)
!
λ0,(k),∗,ε (·) = λ0 ·|σ (k),∗ , θ(k),∗ , ε(k),∗ . (9.3.69)

Now, based on the conditions of the theorem, we can show that, for ε = 0,
!
g60k σ (k),∗ , θ (k),∗ , ε(k),∗
δ
lim
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
* !
= lim g60 σ (k),∗ , θ (k),∗
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
!−α  N  1& * +'2
(k),∗
+ ε (k),∗
max 0, h̄i (s) − (ε)γ Wi ds
i=1 0
9.3 Exact Penalty Function Approach 353

!−α n+1
 !2 !β +
(k),∗
+ ε(k),∗ ỹi (1) − ỹif + δk ε(k),∗ . (9.3.70)
i=1

It is easy to see that when (σ (k),∗ , θ (k),∗ ) → (σ ∗ , θ ∗ ),

ỹ (k),∗ (s) → ỹ ∗ (s) (9.3.71)

for each s ∈ [0, 1]. By (9.3.71) and (9.3.70), we obtain

lim g60 (σ (k),∗ , θ (k),∗ ) = g60 (σ ∗ , θ ∗ ) (9.3.72)


ε(k),∗ →ε∗ =0
( σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
) 0

Substituting (9.3.72) into (9.3.70), we have


!
lim g60δk σ (k),∗ , θ (k),∗ , ε(k),∗
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
= g̃0 (σ ∗ , θ∗ )
31& *
(k),∗
+'2
0
max 0, h̄i (s) − (ε(k),∗ )γ Wi ds
+ lim
ε(k),∗ →ε∗ =0 (ε(k),∗ )α
( σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω0
)
n+1 !2
(k),∗
i=1 ỹi (1) − ỹif
+ lim . (9.3.73)
ε(k),∗ →ε∗ =0 (ε(k),∗ )α
( σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
) 0

For the second term and the third term of (9.3.73), it is clear from Lemma 9.2.1
that
N  
  γ
 2
1 (k),∗
0
max 0,h̄i (s)−(ε(k),∗ ) Wi ds
i=1
lim (ε(k),∗ )α
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
N 3 &
 − α2  γ−α2 '2
1 (k),∗
= lim 0
ε(k),∗ h̄i (s) − ε(k),∗ Wi ds .
ε(k),∗ →ε∗ =0 i=1
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0

Since ξ > α, γ > α, it follows from Assumption 9.3.5, for any s ∈ [0, 1],
 
 (k),∗ !− α2 (k),∗ 
lim  h̄i (s) = 0.
 ε
ε(k),∗ →ε∗ =0
(9.3.74)
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0

Thus, we obtain
354 9 Optimal Control Problems with State and Control Constraints

N 
 1 !− α2 !γ− α2 2
(k),∗
lim ε(k),∗ h̄i (s) − ε(k),∗ Wi ds
ε(k),∗ →ε∗ =0 0
σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
i=1
( ) 0

N  1
 !− α2 !γ− α2 2
(k),∗
= lim ε(k),∗ h̄i (s) − ε(k),∗ Wi ds
0 ε(k),∗ →ε∗ =0
i=1 σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
( ) 0

=0. (9.3.75)

Similarly, for the third term of (9.3.73), we have


n+1 !2
(k),∗
i=1 ỹi (1) − ỹif
lim =0
ε(k),∗ →ε∗ =0 (ε(k),∗ )α
( σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
) 0

. Combining (9.3.73), (9.3.74) and (9.3.75) gives


!
g60k σ (k),∗ , θ (k),∗ , ε(k),∗ = g60k (σ ∗ , θ ∗ , 0) = g60 (σ ∗ , θ ∗ ).
δ δ
lim
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
(9.3.76)
For the second part of the theorem, we need gradient formulas of g60 (σ, θ),
which can be derived in the same way as that for Theorem 7.2.2. These
gradient formulas are given as follows.
 1  
∂6g0 (σ, θ) ∂ H̄0 s, ỹ(s | σ, θ), σ, θ, λ̄0 (s | σ, θ)
= ds (9.3.77)
∂σ 0 ∂σ
 1  
∂6
g0 (σ, θ) ∂ H̄0 s, ỹ(s | σ, θ), σ, θ, λ̄0 (s | σ, θ)
= ds, (9.3.78)
∂θ 0 ∂θ
where H̄0 (s, ỹ(s | σ, θ), σ, θ, λ̄0 (s | σ, θ)) is the Hamiltonian function defined
by
 
H̄0 s, ỹ(s | σ, θ), σ, θ, λ̄0 (s | σ, θ)
$ %
= L̄0 (s, ỹ(s | σ, θ), σ, θ) + λ̄0 (s | σ, θ) f˜(s, ỹ(s | σ, θ), σ, θ),
(9.3.79)

and λ̄0 (· | σ, θ) is the solution of the following system of costate differential


equations corresponding to (σ, θ) ∈ Ξ × Θ.
.   /
dλ̄0 (s) ∂ H̄0 s, ỹ(s | σ, θ), σ, θ, λ̄0 (s)
=− (9.3.80a)
dt ∂ ỹ

with the boundary condition


9.3 Exact Penalty Function Approach 355

∂Φ0 (y(1 | σ, θ))
λ̄0 (1) = . (9.3.80b)
∂y

By (9.3.79), we can rewrite (9.3.80a) as:


. /
dλ̄0 (s) ∂ L̄0 (s, ỹ(s | σ, θ), σ, θ) ∂ f˜(s, ỹ(s | σ, θ), σ)
=− − λ̄0 (s)
dt ∂ ỹ ∂ ỹ
(9.3.81)
(9.3.81) can be written as:

λ̄0 (s | σ, θ) =S(s, 1)λ̄0 (1 | σ, θ)


 s 
∂ L̄0 (ω, ỹ(ω | σ, θ), σ, θ)
+ S(s, ω) − dω, (9.3.82)
1 ∂ ỹ

where S(s, s ) is the fundamental matrix of the differential equation (9.3.81).


Similarly, (9.3.80a), (9.3.80b) and (9.3.81) with (σ, θ) replaced by (σ, θ, ), we
have

λ0 (s | σ, θ, ε) =S(s, 1)λ0 (1 | σ, θ, ε)
 s  
∂ L̃0 (ω, ỹ(ω | σ, θ), σ, θ, ε)
+ S(s, ω) − dω. (9.3.83)
1 ∂ ỹ

Thus, by (9.3.82) and (9.3.83), we obtain for each s ∈ [0, 1],


 
 0,(k),∗ 
lim λ̄ (s) − λ0,(k),∗ (s)
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
 & '

= lim S(s, 1) λ̄0,(k),∗ (1) − λ0,(k),∗ (1)
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
 s 4 5
∂ L̄0 (ω) ∂ L̃0,ε (ω)
(k),∗ (k),∗ 

+ S(s, ω) − + dω 
1 ∂ ỹ ∂ ỹ
 
 
≤ lim |S(s, 1)| λ̄0,(k),∗ (1) − λ0,(k),∗ (1)
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
 0  0  (k),∗


 ∂ L̄0 (ω) ∂ L̃0,ε (ω) 
(k),∗
+ |S(s, ω)|dω − +  dω. (9.3.84)
1 1  ∂ ỹ ∂ ỹ 

By (9.3.39b) and (9.3.80b), Assumption 9.3.5 and ξ  > α, we have


 
 0,(k),∗ 
lim λ̄ (1) − λ0,(k),∗ (1)
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
356 9 Optimal Control Problems with State and Control Constraints
 
 ! !2 
 (k),∗ −α ∂  (k),∗
n+1
f 
= lim  ε ỹi (1) − ỹi 
ε(k),∗ →ε∗ =0  ∂y 
σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
i=1
( ) 0

= 0. (9.3.85)

On the other hand, by (9.3.33), ξ > α and γ > α, it follows from Assump-
tion 9.3.6 that, for each s ∈ [0, 1],
 0 . (k),∗
/

 ∂ L̄0 (ω) ∂ L̃0,ε (ω) 
(k),∗
lim  − +  dω
ε(k),∗ →ε∗ =0 1  ∂ ỹ ∂ ỹ 
∗ ∗
(σ (k),∗ ,θ (k),∗
)→(σ ,θ )∈Ω0
 0 N &
 * +'

= lim 2ε (k),∗  max 0, h̄(k),∗ (ω) − ε(k),∗ Wi ·
ε(k),∗ →ε∗ =0
 i
1 i=1
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0

∂ h̄i (ω) 
(k),∗

∂ ỹ dω
= 0. (9.3.86)

We then substitute (9.3.85), (9.3.86) into (9.3.84) to give, for each s ∈ [0, 1],
 
 0,(k),∗ 
lim λ̄ (s) − λ0,(k),∗ (s) = 0. (9.3.87)
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0

Then we have
!
lim ∇σ g60δk σ (k),∗ , θ (k),∗ , ε(k),∗
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
 (k),∗ !−α
1
∂ L̄0 (s)
= lim ds + 2 ε(k),∗ ·
ε(k),∗ →ε∗ =0 0 ∂σ
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
N  1 * !γ + ∂ h̄(k),∗ (s)
(k),∗
max 0, h̄i (s) − ε(k),∗ Wi i
ds
i=1 0 ∂σ
 1
∂ f˜(k),∗ (s)
+ λ0,(k),∗ (s) ds
0 ∂σ
 1 (k),∗  1 !
∂ L̄0 (s)
= lim ds + λ0 s | σ (k),∗ , θ (k),∗ ·
ε(k),∗ →ε∗ =0 0 ∂σ 0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
N  1
∂ f˜(k),∗ (s)  " (k),∗ !−α (k),∗
ds + lim ε h̄i (s)
∂σ ε(k),∗ →ε∗ =0
i=1 0
∗ ∗
(σ (k),∗ ,θ (k),∗
)→(σ ,θ )∈Ω0
9.3 Exact Penalty Function Approach 357

!γ−α # ∂ h̄i(k),∗ (s)


− ε(k),∗ Wi · ds . (9.3.88)
∂σ

Note that ∂L¯ 0 /∂σ, λ0 and ∂ f˜/∂σ are all bounded. Thus, it follows
from (9.3.87) and Theorem A.1.10 that
 (k),∗
1
∂ L̄0 (s)
lim ds
ε(k),∗ →ε∗ =0 0 ∂σ
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
 1 ! ∂ f˜(k),∗ (s)
+ λ0 s | σ (k),∗ , θ (k),∗ , ε(k),∗ · ds
0 ∂σ
 1 (k),∗
∂ L̄0 (s)
= lim ds
0 ε(k),∗ →ε∗ =0 ∂σ
(σ (k),∗ ,θ (k),∗
)→(σ∗ ,θ∗ )∈Ω0
 1 ! ∂ f˜(k),∗ (s)
+ lim λ0 s | σ (k),∗ , θ (k),∗ , ε(k),∗ ds
0 ε(k),∗ →ε∗ =0 ∂σ
∗ ∗
(σ (k),∗ ,θ (k),∗
)→(σ ,θ )∈Ω0
 1 (k),∗  1
∂ L̄0 (s) ∂ f˜(k),∗ (s)
= ds + λ¯0 (s | σ ∗ , θ ∗ ) ds
0 ∂σ 0 ∂σ
=∇σ g60 (σ ∗ , θ ∗ ). (9.3.89)

Similarly, (ε(k),∗ )−α gi , ∂gi /∂σ are all bounded, and ξ > α, γ > α. It follows
from Assumption 9.3.5 that
N 
 1 !−α !γ−α
(k),∗
lim 2 ε(k),∗ h̄i (s) − ε(k),∗ Wi ·
ε(k),∗ →ε∗ =0 0
σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
i=1
( ) 0

(k),∗
∂ h̄i (s)
ds
∂σ
N 
 1 !−α !γ−α
(k),∗
=2 lim ε(k),∗ h̄i (s) − ε(k),∗ Wi ·
0 ε(k),∗ →ε∗ =0
i=1 σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
( ) 0

(k),∗
∂ h̄i (s)
ds
∂σ
=0. (9.3.90)

We substitute (9.3.89) and (9.3.90) into (9.3.88) to give


!
lim ∇σ g60δk σ (k),∗ , θ (k),∗ , ε(k),∗ = ∇σ g60 (σ ∗ , θ ∗ ).
(k),∗ →∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
(9.3.91)
358 9 Optimal Control Problems with State and Control Constraints

Similarly, we can show that


!
lim ∇θ g60δk σ (k),∗ , θ (k),∗ , ε(k),∗ = ∇θ g60 (σ ∗ , θ ∗ ).
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
(9.3.92)
On the other hand, we note that
!
lim ∇ε g60δk σ (k),∗ , θ (k),∗ , ε(k),∗
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
& !−α−1 &
= lim ε(k),∗ − α·
ε(k),∗ →ε∗ =0
( σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
) 0

N  1
 & * !γ +'2
(k),∗
max 0, h̄i (s) − ε(k),∗ Wi ds
i=1 0
N 
 1 * !γ + !γ !
(k),∗
+ 2γ max 0, h̄i (s) − ε(k),∗ Wi −ε(k),∗ Wi ds
i=1 0


n+1 !!2 ' !β−1 '
(k),∗
+ φi ỹi (1) + σk β ε(k),∗
i=1
N 
 1 & !− α+1
(k),∗ 2
= lim −α h̄i (s) ε(k),∗
ε(k),∗ →ε∗ =0 0
i=1
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
!γ− α+1 '2 N 
 1 & !γ '
2 (k),∗
− ε(k),∗ Wi ds + 2γ h̄i (s) − ε(k),∗ Wi ·
i=1 0
5
& !γ ' !−α−1 
n+1 !!2
(1)(ε(k),∗ )− 2
(k),∗ α+1
−ε(k),∗ Wi ε(k),∗ ds + φi ỹi
i=1
(9.3.93)

As all the terms in (9.3.93) are bounded and −α − 1 + 2ξ > 0, −α − 1 + 2ξ  >


0, 2γ − α − 1 > 0, we have
!
lim ∇ε g0δk σ (k),∗ , θ (k),∗ , ε(k),∗
ε(k),∗ →ε∗ =0
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
4 N  1
 !− α+1
(k),∗ 2
= −α lim h̄i (s) ε(k),∗ (9.3.94)
0 ε(k),∗ →ε∗ =0
i=1 σ (k),∗ ,θ (k),∗ →(σ ∗ ,θ ∗ )∈Ω
( ) 0

!γ− α+1
2
2
− ε(k),∗ Wi ds
9.3 Exact Penalty Function Approach 359

N 
 1 !γ !
(k),∗
+ 2γ lim h̄i (s) − ε(k),∗ Wi ·
0 ε(k),∗ →ε∗ =0
i=1
(σ(k),∗ ,θ(k),∗ )→(σ∗ ,θ∗ )∈Ω0
!γ ! !−α−1 
n+1 !− α+1 2 5
(k),∗ 2
−ε(k),∗
Wi ε (k),∗
ds + φi ỹi (1) ε (k),∗

i=1
=0. (9.3.95)

Thus, the proof is complete.


 
Theorem 9.3.4 Let ε(k),∗ → ε∗ = 0 and σ (k),∗ , θ (k),∗ → (σ ∗ , θ ∗ ) ∈ Ω0 be
 
such that g60δk σ (k),∗ , θ(k),∗ , (k),∗ is finite. Then, (σ ∗ , θ ∗ ) is a local minimizer
of Problem (P̃ (p)).

Proof. On the contrary, assume that (σ ∗ , θ ∗ ) is not a local minimizer


! of Prob-
lem (P̃ (p)). Then, there must exist a feasible point σ̂ , θ̂ ∈ Nδ (σ ∗ , θ ∗ ) of
∗ ∗

Problem (P̃ (p)) such that


!
g60 σ̂ ∗ , θ̂ ∗ < g60 (σ ∗ , θ ∗ ) (9.3.96)
!
where Nδ (σ ∗ , θ ∗ ) is a δ neighbourhood of σ̂ ∗ , θ̂ ∗ in Ω0 for some δ̄ > 0.
 
Since σ ∗ , θ∗ , ε(k),∗ is a local minimizer of Problem (P̃δ (p)), there exists a
sequence {ξ k }, such that
! !
g60δk σ, θ, ε(k),∗ ≥ g60δk σ (k),∗ , θ (k),∗ , ε(k),∗

  * +
for any x ∈ Nδk σ (k),∗ , θ (k),∗ . Now, we construct a sequence σ̂ (k),∗ , θ̂ (k),∗
satisfying
 ! ! ξ k
 (k),∗ (k),∗ 
 σ̂ , θ̂ − σ (k),∗ , θ (k),∗  ≤
k
. Clearly,
! !
g60δk σ̂ (k),∗ , θ̂ (k),∗ , ε(k),∗ ≥ g60δk σ (k),∗ , θ (k),∗ , ε(k),∗ (9.3.97)

Letting k → +∞, we have


 ! !
 
lim  σ̂ (k),∗ , θ̂ (k),∗ − σ̂ ∗ , θ̂∗ 
k→+∞
 ! !
 
≤ lim  σ̂ (k),∗ , θ̂ (k),∗ − σ (k),∗ , θ (k),∗ 
k→+∞
 !   !
   
+ lim  σ (k),∗ , θ (k),∗ − (σ ∗ , θ ∗ ) + (σ ∗ , θ ∗ ) − σ̂ ∗ , θ̂ ∗ 
k→+∞

≤ 0 + 0 + δ̄. (9.3.98)
360 9 Optimal Control Problems with State and Control Constraints

However, δ̄ > 0 is arbitrary. Thus,


! !
lim σ̂ (k),∗ , θ̂ (k),∗ = σ̂ ∗ , θ̂ ∗ . (9.3.99)
k→+∞

Letting k → +∞ in (9.3.97), it follows from the continuity of g̃0 and (9.3.99)


that
!
lim g60δk σ̂ (k),∗ , θ̂ (k),∗ , ε(k),∗
k→+∞
! ! !
= g60δk σ̂ ∗ , θ̂ ∗ , 0 = g60 σ̂ ∗ , θ̂ ∗ ≥ lim g60δk σ (k),∗ , θ (k),∗ , ε(k),∗
k→+∞
∗ ∗ ∗ ∗
= g60δk (σ , θ , 0) = g60 (σ , θ ) . (9.3.100)

This is a contradiction to (9.3.96), and hence it completes the proof.

Theorem 9.3.5 Let −α − β + 2ξ > 0, −α − β + 2ξ  > 0 and −α − β + 2γ >


0, then there exists a k0 > 0, such that ε(k),∗ = 0, σ (k),∗ , θ (k),∗ is local
minimizer of Problem (P̃ (p)), for k ≥ k0 .

Proof. On the contrary,"we assume that the# conclusion is false. Then, there
exists a subsequence of σ (k),∗ , θ (k),∗ , ε(k),∗ , which is denoted by the orig-
inal sequence, such that for any k0 > 0, there exists a k  > k0 satisfying

ε(k ),∗ = 0. By Theorem 9.3.2, we have
!
ε(k),∗ → ε∗ = 0, σ (k),∗ , θ (k),∗ → (σ ∗ , θ ∗ ) ∈ Ω0 , as k → +∞

. Since ε(k),∗ = 0 for all k, it follows from dividing (9.3.45) by (ε(k),∗ )β−1 that

4 N 
!−α−β  1 & * !γ +'2
(k),∗
ε (k),∗
−α max 0, h̄i (s) − ε(k),∗ Wi ds
i=1 0
N  1
 * !γ + !γ !
(k),∗
+ 2γ max 0, h̄i (s) − ε(k),∗ Wi −ε(k),∗ Wi ds
i=1 0
5

n+1 !2
(k),∗
−α ỹi (1) − ỹif + δk β = 0. (9.3.101)
i=1

This is equivalent to
4 N 
!−α−β  1 & * !γ +'2
(k),∗
ε (k),∗
−α max 0, h̄i (s) − ε(k),∗ Wi ds
i=1 0
N  1
 * !γ + !γ !
(k),∗
+ 2γ max 0, h̄i (s) − ε(k),∗ Wi −ε(k),∗ Wi
i=1 0
9.3 Exact Penalty Function Approach 361
* !γ +
(k),∗ (k),∗
+ max 0, h̄i (s) − ε(k),∗ Wi h̄i (s)
* !γ +
(k),∗ (k),∗
− max 0, h̄i (s) − ε(k),∗ Wi h̄i (s) ds
5
 (k),∗
n+1 !2
f
−α ỹi (1) − ỹi + δk β = 0 (9.3.102)
i=1

Rearranging (9.3.102) yields


4 N 
!−α−β  1 & * !γ +'2
(k),∗
ε (k),∗
(2γ − α) max 0, h̄i (s) − ε(k),∗ Wi ds
i=1 0
5

n+1 !2
(k),∗
−α ỹi (1) − ỹif + δk β
i=1
N 
!−α−β  1 * !γ +
(k),∗ (k),∗
= 2γ ε (k),∗
max 0, h̄i (s) − ε(k),∗ Wi × h̄i (s)ds.
i=1 0
(9.3.103)

Let k → +∞ in (9.3.102), and note that −α−β +2ξ > 0 and −α−β +2ξ  > 0.
Then, it follows that the left hand side of (9.3.103) yields
4 N  1&
!−α−β  * !γ +'2
(k),∗
ε (k),∗
(2γ − α) max 0, h̄i (s) − ε(k),∗ Wi ds
i=1 0
5

n+1 !2
(k),∗
−α ỹi (1) − ỹif + δk β → ∞. (9.3.104)
i=1

However, under the same conditions and −α − β + 2γ > 0, the right hand
side of (9.3.103) gives
N 
!−α−β  1 * !γ +
(k),∗ (k),∗
2γ ε (k),∗
max 0, h̄i (s) − ε(k),∗ Wi h̄i (s)ds → 0.
i=1 0
(9.3.105)

This is a contradiction. Thus, the proof is complete.

Theorem 9.3.6 Let up,∗ be an optimal control of the approximate Problem


(P̂ (p)). Suppose that u∗ is an optimal control of the Problem (P̃ ). Then,

lim g0 (up,∗ ) = g0 (u∗ ). (9.3.106)


p→+∞
362 9 Optimal Control Problems with State and Control Constraints

Proof. The proof is similar to that given for Theorem 9.2.2.


Theorem 9.3.7 Let up,∗ be an optimal control of the approximate Problem
(P̃ (p)), and u∗ be an optimal control of the Problem (P̂ ). Suppose that

lim up,∗ = ū, a.e. on [0, T ]. (9.3.107)


p→+∞

Then, ū is an optimal control of the Problem (P̂ )

lim g0 (up,∗ ) = g0 (u∗ ). (9.3.108)


p→+∞

Proof. The proof is similar to that given for Theorem 9.2.3.

9.3.3 Computational Algorithm

In this section, we are in the position to present the computational algorithm


for solving Problem (P̃ (p)) as follows.
Algorithm 9.3.1
Step 1 set δ (1) = 10, ε(1) = 0.1, ε∗ = 10−9 , β > 2, choose an initial point
(σ 0 , θ 0 , ε0 ), the iteration index k = 0. The values of γ and α are chosen
depending on the specific structure of Problem (P̃ (p)) concerned.

Step 2 Solve Problem (P̃δk (p)), and let σ (k),∗ , θ (k),∗ , ε(k),∗ be the minimizer
obtained.
Step 3 If ε(k),∗ > ε∗ , δ (k) < 108 ,  
set δ (k+1) = 10 × σ (k) , k = k + 1. Go to Step 2 with σ (k),∗ , θ (k),∗ , ε(k),∗ as
the new initial point in the new optimization process
Else set ε(k),∗ = ε∗ , then go to Step  4 
Step 4 Check the feasibility of σ (k),∗ , θ (k),∗ (i.e., check whether or not
 
max max hi y(s), σ (k),∗ ≤ 0).
1≤i≤N s∈[0,1]
 
If σ (k),∗ , θ (k),∗ is feasible, then it is a local minimizer of Problem (P̃ (p)).
Exit.
Else go to Step 5
Step 5: Adjust the parameters α, β and γ such that the conditions of
Lemma 9.3.1 are satisfied. Set δ (k+1) = 10δ (k) , ε(k+1) = 0.1ε(k) , k := k + 1.
Go to Step 2.

Remark 9.3.5 In Step 3, if ε(k),∗
 (k),∗  > ε , it follows from Theorem 9.3.2 and
(k),∗
Theorem 9.3.3 that σ ,θ cannot be a feasible point, meaning that
the penalty parameter δ may not be chosen large enough. Thus we need to
increase δ. If δk > 108 , but still ε(k),∗ > ε∗ , then we should adjust the value
of α, β and γ, such that the conditions of Theorem 9.3.3 are satisfied. Then,
go to Step 2.
9.3 Exact Penalty Function Approach 363

Remark 9.3.6 Clearly, we cannot check the feasibility of hi (y(s), σ) ≤ 0,


i = 1, . . . , N , for every s ∈ [0, 1]. In practice, we choose a set which contains
a dense enough of points in [0, 1]. Check the feasibility of hi (y(s), σ) ≤ 0 over
this set for each i = 1, . . . , N .

Remark 9.3.7 Although we have proved that a local minimizer of the exact
penalty function optimization Problem (P̃δk (p)) will converge to a local min-
imizer of the original Problem (P̃ (p)), we need, in actual computation, set a
lower bound ε∗ = 10−9 for ε(k),∗ so as to avoid the situation of being divided
by ε(k),∗ = 0, leading to infinity.

9.3.4 Examples

Example 9.3.1 In this example, we revisit Example 9.2.2 in Section 9.2.6 by


using the exact penalty function method. In this problem, we set p = 20, γ =
3 and W1 = 0.3. The result is shown below. The optimal objective function
value g0∗ = 1.75101803 × 10−1 , where δ = 1.0 × 106 and ε = 1.89531e × 10−5 .

As shown in Figure 9.3.1, the continuous inequality constraints (9.2.85) is


satisfied for all t∈ [0, 1]. Although the minimum value of the cost functional
is almost the same (which is 1.727 × 10−1 for Example 9.2.1) comparing with
the results obtained for Example 9.2.1, (9.2.85) in Example 9.2.1 is violated
at some t ∈ [0, 1]. The optimal state strategies and optimal control are plotted
in Figures 9.3.2 and 9.3.3, respectively.
Example 9.3.2 In this example, we revisit Example 9.2.2 in Section 9.2.6
by using the exact penalty function method. In this problem, we set p = 20,
γ = 3 and W1 = W2 = W3 = W4 = 0.3.

The result obtained is shown below. The optimal cost function value is
g0∗ = 5.75921513 × 10−3 , where δ = 1.0 × 105 and ε = 1.00057 × 10−7 . All the
continuous inequality constraints are satisfied for all t ∈ [0, 1]. Comparing
with the results obtained for Example 9.2.2, our minimum value of the cost
functional is slightly larger (which is 5.3 × 10−3 for Example 9.2.2). However,
the continuous inequality constraints (9.2.94)–(9.2.97) in Example 9.2.2 are
not satisfied at all t ∈ [0, 1]. The continuous inequality constraints are shown
in Figure 9.3.4, and the optimal state trajectories and the optimal control
are shown in Figures 9.3.5 and 9.3.6, respectively.
364 9 Optimal Control Problems with State and Control Constraints

2.5

h 1.5

0.5

0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t

Fig. 9.3.1: h(t) for Example 9.3.1


0 0.2

0
-0.05
-0.2
-0.1
x2

-0.4
-0.15
x1

-0.6

-0.2 -0.8

-0.25 -1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t

Fig. 9.3.2: Optimal state trajectories x1 (t) and x2 (t) for Example 9.3.1
10

4
u

-2

-4
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t

Fig. 9.3.3: Optimal control u(t) for Example 9.3.1


9.3 Exact Penalty Function Approach 365

-2 0.5

-2.5 0

-3 -0.5

-3.5 -1

h2
h1

-4 -1.5

-4.5 -2

-5 -2.5

-5.5 -3
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t

0 -2

-2.5
-0.2
-3
-0.4
h4 -3.5
h3

-0.6
-4
-0.8
-4.5

-1 -5

-1.2 -5.5
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t

Fig. 9.3.4: Continuous inequality constraints for Example 9.3.2

Example 9.3.3 The following problem is taken from [68]: Find a control
u : [0, 4.5] → R that minimizes the cost functional
 4.5 " #
(u(t))2 + (x1 (t))2 dt (9.3.109)
0

subject to the following dynamical equations

dx1 (t)
= x2 (t) (9.3.110)
dt
dx2 (t)  
= −x1 (t) + x2 (t) 1.4 − 0.14(x2 (t))2 (t) + 4u(t) (9.3.111)
dt
with the initial conditions x1 (0) = −5 and x2 (0) = −5, and the continuous
inequality constraint
1
h = −u(t) − x1 (t) ≥ 0, t ∈ [0, 4.5]. (9.3.112)
6
366 9 Optimal Control Problems with State and Control Constraints
12 22

21
10

20

8
19
1

2
6 18
x

x
17
4

16

2
15

0 14
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t

0.01 3

2.5
0

-0.01

1.5
x3

x4
-0.02

-0.03

0.5

-0.04
0

-0.05 -0.5
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t

0.2 0.06

0.05

0
0.04

0.03
-0.2

0.02
x5

-0.4 0.01
x

-0.6
-0.01

-0.02
-0.8

-0.03

-1 -0.04
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t

Fig. 9.3.5: Optimal state trajectories for Example 9.3.2

3 0.8

2 0.6

1 0.4
u2
u1

0 0.2

-1 0

-2 -0.2

-3 -0.4
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t t

Fig. 9.3.6: Optimal controls for Example 9.3.2


9.3 Exact Penalty Function Approach 367

In this problem, we set p = 10, γ = 3 and W1 = 0.3. The result is shown


below. The optimal objective function value obtained is g0∗ = 4.58048380e ×
101 , where δ = 1.0 × 104 and ε = 9.99998 × 10−5 . The continuous inequality
constraint (9.3.112) is satisfied for all t ∈ [0, 4.5]. The continuous inequality
constraints are shown in Figure 9.3.7, and the optimal states and the optimal
control are shown in Figures 9.3.8 and 9.3.9, respectively.

1.8
1.6
1.4
1.2
1
h

0.8
0.6
0.4
0.2
0
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
t

Fig. 9.3.7: Continuous inequality constraint for Example 9.3.3.

1 5

0 4
3
-1
2
-2 1
x1

-3 0
x2

-4 -1
-2
-5
-3
-6 -4
-7 -5
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
t

Fig. 9.3.8: Optimal state trajectories for Example 9.3.3.


368 9 Optimal Control Problems with State and Control Constraints

0.5

0
u

-0.5

-1

-1.5

-2
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
t

Fig. 9.3.9: Optimal control for Example 9.3.3.

9.4 Exercises

9.4.1 Can the control functions chosen in Definition 9.2.1 be just measur-
able, rather than Borel measurable functions?
9.4.2 Derive the gradient formulae given by (9.3.32) and (9.3.33).
9.4.3 Consider the problem (P ) subject to additional terminal equality con-
straints:
qi (x(T | u)) = 0, i = 1, . . . , NE , (9.4.1)
where qi , i = 1, . . . , NE , are continuously differentiable functions. Let this
optimal control problem be referred to as the problem (R). Use the control
parametrization time scaling transform technique to derive a computational
method for solving Problem (R). State all the essential assumptions and then
prove all the relevant convergence results.
9.4.4 Provide detailed derivations of the gradient formulae appeared in Al-
gorithm 9.2.4.
9.4.5 Prove Theorem 9.2.1.
9.4.6 Prove Theorem 9.2.2.
9.4.7 Prove Theorem 9.2.3.
9.4.8 Show the validity of Remark 9.3.2.
9.4.9 Construct the equality constraint violation as defined by (9.3.29) fort
the interior equality constraint (8.9.14a).
9.4 Exercises 369

9.4.10 Provide detailed derivations of the gradient formulae presented in


Theorem 9.3.1.

9.4.11 Show the validity of the statement made in Remark 9.3.4.

9.4.12 Provide detailed derivations of the gradient formulae which is given


by (9.3.35), (9.3.36) and (9.3.37).

9.4.13 Give detailed proof of Corollary 9.3.1.

9.4.14 Prove Theorem 9.3.6.

9.4.15 Prove Theorem 9.3.7.


Chapter 10
Time-Lag Optimal Control Problems

In this chapter, we consider three types of optimal control problems: (1)


Time-lag optimal control problems. The main reference is [289, 290, 298].
(2) Optimal control problems with state-dependent switched time-delayed
systems. The main reference is [153]. (3) Min-max optimal control of lin-
ear continuous dynamical systems with uncertainty and quadratic terminal
constraints problems. The main reference is [287].

10.1 Time-Lag Optimal Control

10.1.1 Introduction

A time-delay system is a dynamic system, which evolves depending not only


on the current state and/or control variables but also on the state and/or
control variables at some past time instants. Such systems arise in plethora
of real-world applications in science and engineering, including epidemiolog-
ical modeling [16], vehicle suspension design [129] and spacecraft attitude
control [47]. In this section, we consider a general class of time-delayed op-
timal control problems where the cost functional is to be minimized subject
to canonical constraints. The control parametrization method together with
a hybrid time-scaling transformation strategy is used to devise a computa-
tional algorithm for solving this general class of time-delayed optimal control
problems. For illustration, several numerical examples are solved using the
proposed algorithm. The main references of this section are [289], [290] and
[298].

© The Author(s), under exclusive license to 371


Springer Nature Switzerland AG 2021
K. L. Teo et al., Applied and Computational Optimal Control, Springer
Optimization and Its Applications 171,
https://doi.org/10.1007/978-3-030-69913-0 10
372 10 Time-Lag Optimal Control Problems

10.1.2 Problem Formulation

Consider the following time-delay system, defined on the fixed time interval
(−∞, T ]:

dx
= f (x(t), x̄(t), u(t), ū(t)), t ∈ [0, T ], (10.1.1)
dt
x(t) = φ(t), t ≤ 0, (10.1.2)
u(t) = ϕ(t), t < 0, (10.1.3)

where x(t) = [x1 (t), x2 (t), . . . , xn (t)] ∈ Rn is the state vector; u(t) =
[u1 (t), u2 (t), . . . , ur (t)] ∈ Rr is the control vector; x̄(t) = [x1 (t − h1 ), x2 (t −
h2 ), . . . , xn (t − hn )] and ū(t) = [u1 (t − hn+1 ), u2 (t − hn+2 ), . . . , ur (t −
hn+r )] , in which hq > 0, q = 1, . . . , n + r, are given time-delays; f :
Rn × Rn × Rr × Rr → Rn and φ(t) = [φ1 (t), . . . , φn (t)] are given con-
tinuously differentiable functions; and ϕ(t) = [ϕ1 (t), . . . , ϕr (t)] is a given
function. A Borel measurable function u(t) : (−∞, T ] → Rr is said to be an
admissible control if u(t) ∈ U for almost all t ∈ [0, T ] and u(t) = ϕ(t) for
all t < 0, where U is a compact and convex subset of Rr . Let U denote the
class of all such admissible controls. For simplicity, we use u to denote u(t)
for the rest of the section. Let x(·) denote the solution of (10.1.1)–(10.1.3)
corresponding to each u ∈ U . It is an absolutely continuous function that
satisfies the dynamic (10.1.1) almost everywhere on [0, T ], and the initial
condition (10.1.2) everywhere on (−∞, 0]. Our optimization problem is de-
fined formally as follows: Given the dynamic system (10.1.1)–(10.1.3), choose
an admissible control u ∈ U to minimize the following cost functional:
 T
Φ0 (x(T )) + L0 (x(t), x̄(t), u)dt, (10.1.4)
0

subject to the canonical constraints


 T
Φk (x(T )) + Lk (x(t), x̄(t), u)dt = 0,
0
k = 1, . . . , Ne , (10.1.5)
 T
Φk (x(T )) + Lk (x(t), x̄(t), u)dt ≥ 0,
0
k = Ne + 1, . . . , Ne + Nm , (10.1.6)

where Φk : Rn → R, k = 0, 1, . . . , Ne + Nm , and Lk : Rn × Rn × Rr → R, k =
0, 1, . . . , Ne + Nm , are given real-valued functions. We denote this problem
as Problem (P1 ).
We assume that the following conditions are satisfied throughout this sec-
tion.
10.1 Time-Lag Optimal Control 373

Assumption 10.1.1 Lk : Rn × Rn × Rr → R, k = 0, 1, . . . , Ne + Nm and


Φk : Rn → R, k = 0, 1, . . . , Ne + Nm are continuously differentiable with
respect to each of their arguments.
Assumption 10.1.2 f is twice continuously differentiable.

Assumption 10.1.3 There exists a real number L1 > 0 such that


|f (α, e, ν, ω)| ≤ L1 (1 + |α| + |e|), (α, e, ν, ω) ∈ Rn × Rn × U × U .

Assumptions 10.1.1–10.1.3 ensure the existence and uniqueness of the so-


lution of the dynamic system (10.1.1)–(10.1.3).

10.1.3 Control Parametrization

In this section, we shall propose a numerical method to solve Problem (P1 ).


We subdivide the planning horizon [0, T ] into p ≥ 1 subintervals [ti , ti+1 ), i =
0, 1, . . . , p−1, where ti , i = 0, 1, . . . , p, are the partition points that satisfying

0 = t0 ≤ t1 ≤ t2 ≤ . . . ≤ tp−1 ≤ tp = T. (10.1.7)

Let Ξ denote the set of all vectors σ = [t1 , . . . , tp ] such that (10.1.7) is
satisfied. Then the control u is approximated as follows:


p
u≈ δ (i) χ[ti−1 ,ti ) (t), t ∈ [0, T ], (10.1.8)
i=1

& '
(i) (i)
where δ (i) = δ1 , . . . , δr is the value of the control on the ith subinterval
and χ[ti−1 ,ti ) (t) is the indicator function defined by

1, if t ∈ [ti−1 , ti ),
χ[ti−1 ,ti ) (t) =
0, otherwise.
$ %
Let Δ denote the set of all such vectors δ = (δ (1) ) , . . . , (δ (p) ) . Substi-
tuting (10.1.8) into (10.1.1), the time-delay system defined on the subinterval
[ti−1 , ti ) becomes
dx !
= f x(t), x̄(t), θ (i) , θ̄(t) , (10.1.9)
dt
$ %
where θ̄(t) = θ̄1 (t), . . . , θ̄r (t) . For θ̄m (t), m = 1, . . . , r, there are two cases:
(1) if t−hn+m < 0, then θ̄m (t) = ϕm (t−hn+m ); and (2) if t−hn+m ≥ 0, then
there exist q (q ≤ p) distinct partition points (let these points be denoted as
til , l = 1, . . . , q,) such that

0 < ti1 < ti2 < · · · < tiq = T.


374 10 Time-Lag Optimal Control Problems

Clearly, $ %
[0, T ] = [0, ti1 ) ∪ [ti1 , ti2 ) . . . ∪ tiq−1 , tiq , (10.1.10)
and for any k, l ∈ {1, 2, . . . , q}, k = l,
$  $ 
tik−1 , tik ∩ til−1 , til = ∅. (10.1.11)

Therefore, we can find a unique j ∈ {1, . . . , p} such that t − hn+m ∈ [tj−1 , tj )


(j)
and δ̄m (t) = δm .
Now, δ̄m (t), m = 1, . . . , r, can be expressed as
⎧ (j)
⎨ δm , if t ∈ [tj−1 + hn+m , tj + hn+m )
δ̄m (t) = for some j ∈ {1, . . . , p},

ϕm (t − hn+m ), if t < hn+m .

Let x(· | σ, δ) denote the solution of (10.1.9) corresponding to (σ, δ) ∈


Ξ × Δ. The original time-delay problem (P1 ) is now approximated as: Given
system (10.1.9), choose a (σ, δ) ∈ Ξ × Δ such that the cost functional

g0 (σ, δ) = Φ0 (x(T | σ, δ))


 p  ti !
+ L0 x(t | σ, δ), x̄(t | σ, δ), δ (i) dt, (10.1.12)
i=1 ti−1

is minimized subject to the following constraints:


p 
 ti !
gk (σ, δ) = Φk (x(T | σ, δ)) + Lk x(t | σ, δ), x̄(t | σ, δ), δ (i) dt
i=1 ti−1

= 0, k = 1, . . . , Ne , (10.1.13)
p  ti
 !
gk (σ, δ) = Φk (x(T | σ, δ)) + Lk x(t | σ, δ), x̄(t | σ, δ), δ (i) dt
i=1 ti−1

≥ 0, k = Ne + 1, . . . , Ne + Nm , (10.1.14)

where x̄(t | σ, δ) = [x1 (t − h1 | σ, δ), . . . , xn (t − hn | σ, δ)] . A pair


(σ, δ) ∈ Ξ × Δ is said to be a feasible pair if it satisfies constraints (10.1.13)–
(10.1.14). Let F consist of all such feasible pairs. To proceed further, let this
approximate problem be referred to as Problem (P1 (p)).

10.1.4 The Time-Scaling Transformation

For time-delayed optimal control problems with variable switching times, the
conventional time scaling transformation fails to work. This is because the
conventional time scaling transformation will map variable switching times
10.1 Time-Lag Optimal Control 375

into fixed switching times in a new time horizon. However, the time-delays
will become variable delays. As a consequence, the transformed problem will
be even harder to solve. In this section, we shall develop a novel time-scaling
transformation to transform Problem (P (p)) into an equivalent problem in
which the switching times are fixed.
For any σ = [t1 , . . . , tp ] ∈ Ξ, define a vector θ = [θ1 , . . . , θp ] ∈ Rp
where θi = ti − ti−1 , i = 1, . . . , p. Clearly, θi ≥ 0 is the duration between two
consecutive switching times for the control vector δ (i) and θ1 + · · · + θp = T.
Let Θ denote the set of all such vectors θ ∈ Rp , where vectors in Θ are called
admissible duration vectors.
Now, we introduce a new time variable s in a new time horizon (−∞, p).
For each admissible duration vector θ ∈ Θ and a time instant s in the new
time horizon, define the corresponding time-scaling function as follows:
s

μ(s | θ) = θi + θs+1 {s − max( s!, 0)}, s ∈ (−∞, p], (10.1.15)
i=1

where ·! denotes the floor function. θp+1 is arbitrary and θi = 1 for i ≤ 0.


The form of the time-scaling function is depicted in Figure 10.1.1 below:

t
tp
αp
μ(s | θ)

t3

α3
t2
t1 α2
0 α1 s
1 2 3 p

tan αi = θi , i = 1, . . . , p

Fig. 10.1.1: Time-scaling function

It is clear from Figure 10.1.1 that the time scaling function (10.1.15) is a
continuous piecewise linear non-decreasing function, which maps s ∈ [i − 1, i)
in the new time horizon to [ti−1 , ti ) in the original time horizon. The switching
times for the control vector are fixed integer points 1, 2, . . . , p − 1, in the new
376 10 Time-Lag Optimal Control Problems

time horizon. Moreover, μ(· | θ) is strictly increasing on [i − 1, i) if and only


if θi > 0.
By the nature of the time-scaling function, we see that a fixed time-delay
h will become a variable in the new time horizon. It is clearly important to
find the explicit formula for the variable time-delay in the new time horizon.
For each θ ∈ Θ, delay h ∈ {h1 , . . . , hn+r }, and any s ∈ (−∞, p], define the
variable delay in the new time horizon as follows:

ζ(s | θ) = sup{η(s |θ) ∈ (−∞, p] : μ(η(s | θ) | θ) = μ(s | θ) − h}. (10.1.16)

For simplicity, let η(s), ζ(s) and μ(s) denote η(s | θ), ζ(s | θ) and μ(s | θ),
respectively, for the rest of the section.

t μ(s)

μ(s )

h
μ(s ) − h

0
s
ζ(s ) s

Case 1: Only one point in the new time scale

satisfies μ(η) = μ(s ) − h


t μ(s)

μ(s )

h
μ(s ) − h

0
s
ζ(s ) s

Case 2: Infinite many points in the new time horizon

satisfy μ(η) = μ(s ) − h

Fig. 10.1.2: Two cases for finding the delay in the new time horizon

Note that, for the duration vector θ, it is possible that θi = 0 for some
i ∈ {1, . . . , p}. This means that for some time s in the new time horizon,
10.1 Time-Lag Optimal Control 377

there may exist infinite many η(s) satisfying

μ(η(s)) = μ(s) − h.

By the definition of ζ(s), it follows that the delay time in the new time horizon
is unique. The process for finding the delay time in the new time horizon is
illustrated in Figure 10.1.2.
To proceed, we need the following lemma.
Lemma 10.1.1 For any given t ∈ [0, T ) and θ ∈ Θ, there exists a unique
m ∈ {1, . . . , p} such that θm > 0 and t ∈ [μ(m − 1), μ(m)).

Proof. By the definition of the time-scaling function, we have ti = μ(i). For


any t ∈ [0, T ], it is clear from (10.1.10) $ and (10.1.11)
 that there exists a
unique j ∈ {1, 2, . . . , q}, such that t ∈ tij−1 , tij . This indicates that there
exists a unique j ∈ {1, . . . , q} ⊂ {1, . . . , p} such that θij = tij − tij−1 > 0 and
t ∈ [μ(ij − 1), μ(ij )).

Now, we are in the position to give an explicit formula for ζ(s). This is
presented as a theorem below:
Theorem 10.1.1 Let θ ∈ Θ. Then, for each s ∈ (−∞, p], if μ(s) − h < 0,
then
ζ(s) = μ(s) − h.
Otherwise, let κ(s | θ) denote the unique integer such that θκ(s|θ)+1 > 0 and
⎡ ⎞

κ(s|θ)

κ(s|θ)+1
μ(s) − h ∈ ⎣ θi , θi ⎠ . (10.1.17)
i=0 i=0

Then, the following equation holds:


s

−1 −1 −1
ζ(s) =κ(s | θ) + θκ(s|θ)+1 θl + θκ(s|θ)+1 θs+1 (s − s!) − hθκ(s|θ)+1 .
l=κ(s|θ)+1

Proof. For simplicity, we omit the argument θ in κ(s | θ). Suppose first
that μ(s) − h < 0. Then μ(ζ(s)) < 0. Thus, μ(ζ(s)) = ζ(s). Combining this
equation with equation (10.1.15) gives

ζ(s) = μ(ζ(s)) = μ(s) − h. (10.1.18)

Suppose now that μ(s) − h ∈ [0, T ), it follows from (10.1.17) that κ(s) ≥ 0,
and ζ(s) ∈ [κ(s), κ(s) + 1). That is, κ(s) = ζ(s)!, and hence it follows
from (10.1.15) and (10.1.18) that
378 10 Time-Lag Optimal Control Problems

s

κ(s)

μ(ζ(s)) = θl + θκ(s)+1 (ζ(s) − κ(s)) = θl + θs+1 (s − s!) − h.
l=1 l=1
(10.1.19)

Since ζ(s) ≤ s, we have 0 ≤ κ(s) ≤ s!. Thus, (10.1.19) can be rearranged to


give
s

−1 −1 −1
ζ(s) =κ(s) + θκ(s)+1 θl + θκ(s)+1 θs+1 (s − s!) − hθκ(s)+1 .
l=κ(s)+1

The proof is complete.

In the new time horizon, we consider the following new time-delay system
with fixed switching time, defined on the subinterval [i − 1, i), i = 1, . . . , p:

dy !
= θi f y(s), ȳ(s), δ (i) , δ̄(s) , (10.1.20)
ds
y(s) = φ(s), s ≤ 0, (10.1.21)

where θ ∈ Θ, δ ∈ Δ,

y(s) := [y1 (s), y2 (s), . . . , yn (s)] ∈ Rn


ȳ(s) := [y1 (s̄1 ), y2 (s̄2 ), . . . , yn (s̄n )] ∈ Rn ,
$ %
δ̄(s) = δ̄1 (s̄n+1 ), . . . , δ̄r (s̄n+r ) ,

where s̄q (s | θ) := ζ(s) when h = hq , q = 1, . . . , n + r and s̄q (s | θ) is denoted


as s̄q for simplicity. For each m = 1, . . . , r,
⎧ (j)
⎨ δm , if s̄n+m ∈ [j − 1, j)
δ̄m (s̄n+m ) = for some j ∈ {1, . . . , p},

ϕm (s̄n+m ), if s̄n+m < 0.

f : Rn × Rn × Rr × Rr → Rn , φ : R → Rn and ϕ : R → Rr are as
defined above. It is easy to see that ζ(s) < s. Hence, ȳ(s) is a delay term
in (10.1.20). Note that for the case of s̄q < 0, q = 1, . . . , n, yq (s̄q ) = φq (s̄q ).
By Assumptions 10.1.1 and 10.1.2, system (10.1.20) have a unique solution
for each admissible pair (θ, δ). Let y(· | θ, δ) denote the solution of (10.1.20).
Now, for each admissible duration vector θ ∈ Θ, denote

υ(θ) = [μ(1), . . . , μ(p − 1)] .

Since μ(·) is non-decreasing, μ(i − 1) ≤ μ(i) for each i = 1, . . . , p. Thus, υ(θ)


is an admissible switching time vector for Problem (P (p)), which means that
the state trajectory x(· | υ(θ), δ) is well-defined. Note that x(· | υ(θ), δ) is a
function of the original time variable t ∈ (−∞, T ]. For each (θ, δ) ∈ Θ × Δ,
10.1 Time-Lag Optimal Control 379

let x(· | υ(θ), δ) be the corresponding state trajectory of system (10.1.9). We


can show that

x(t | υ(θ), δ) |t=μ(s) = y(s | θ, δ), s ∈ (−∞, p]. (10.1.22)

We now consider the following constraints:


p 
 i !
g̃k (θ, δ) = Φk (y(p | θ, δ)) + θi Lk y(s | θ, δ), ȳ(s | θ, δ), δ (i) ds
i=1 i−1

= 0, k = 1, . . . , Ne , (10.1.23)
p 
 i !
g̃k (θ, δ) = Φk (y(p | θ, δ)) + θ i Lk y(s | θ, δ), ȳ(s | θ, δ), δ (i) ds
i=1 i−1

≥ 0, k = Ne + 1, . . . , Ne + Nm , (10.1.24)

where the functions Φk : Rn → R, k = 1, . . . , Ne + Nm and Lk : Rn × Rn ×


Rr → R, k = 1, . . . , Ne + Nm are as defined in Section 10.1.2. Let F̃ denote
the set of all pairs (θ, δ) ∈ Θ × Δ which satisfy (10.1.23)–(10.1.24).
The new problem may now be stated formally as follows:
Given system (10.1.20)–(10.1.21), choose a feasible pair (θ, δ) ∈ F̃ such that
the following cost functional:
p 
 i !
g̃0 (θ, δ) = Φ0 (y(p | θ, δ)) + θi L0 y(s | θ, δ), ȳ(s | θ, δ), δ (i) ds
i=1 i−1
(10.1.25)

is minimized over F̃, where Φ0 : Rn → R and L0 : Rn × Rn × Rr → R are as


defined in Section 10.1.2. Let this problem be denoted as Problem (Q1 (p)).
It is easy to prove the equivalence of Problem (P1 (p)) and Problem (Q1 (p)).

10.1.5 Gradient Computation

To solve Problem (Q1 (p)) using the gradient-based nonlinear optimization al-
gorithms, we require the gradients of the cost and constraint functionals with
respect to each of their variables. We first rewrite g̃k (θ, δ), k = 0, . . . , Ne +Nm ,
in the following forms:

 p
∂μ(s)
g̃k (θ, δ) = Φk (y(p | θ, δ)) + L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) ds,
0 ∂s
(10.1.26)

where
380 10 Time-Lag Optimal Control Problems


p !
L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) = Lk y(s | θ, δ), ȳ(s | θ, δ), δ (i) χ[i−1,i) (s).
i=1

Then we consider the derivative of ζ(·) with respect to θi , i = 1, . . . , p.


Let S  denote the set of points s such that ζ(s) ∈ {0, 1, . . . , p − 1}. For all
s∈/ S  , according to μ(ζ(s)) = μ(s) − h, h ∈ {h1 , . . . , hn+r }, we obtain

∂ζ(s) −1 ∂μ(s) ∂μ(ζ(s))


= θκ(s)+1 − , i = 1, . . . , p, (10.1.27)
∂θi ∂θi ∂θi

where κ(s) is defined in Theorem 10.1.1.


Next, the gradient of the state with respect to the duration vector θ is
given as a theorem stated below.
Theorem 10.1.2 For each pair (θ, δ) ∈ Θ × Δ,

∂y(s | θ, δ)
= Λ̄(s | θ, δ), s ∈ [0, p]. (10.1.28)
∂θ
Here, Λ̄(· | θ, δ) is the solution of the following auxiliary dynamic on each
[i − 1, i):

dΛ̄ ∂f i (y(s | θ, δ), ȳ(s | θ, δ), θ, δ)


= Λ̄(s)
ds ∂y
∂f i (y(s | θ, δ), ȳ(s | θ, δ), θ, δ) ∂ ȳ(s | θ, δ)
+
∂ ȳ ∂θ
∂f (y(s | θ, δ), ȳ(s | θ, δ), θ, δ)
i
+ (10.1.29)
∂θ
with
Λ̄(s) = 0, s ≤ 0, (10.1.30)
where ⎡ ⎤
Λ̄1 (s̄1 | θ, δ) + ∂y1 (s̄∂s1 |θ,δ) ∂∂θ
s̄1
∂ ȳ(s | θ, δ) ⎢ .. ⎥
=⎣ . ⎦,
∂θ
∂yn (s̄n |θ,δ) ∂ s̄n
Λ̄n (s̄n | θ, δ) + ∂s ∂θ

and for s ∈ [i − 1, i),

df i (y(s | θ, s), ỹ(s | θ, s), θ, s) !


= θi f y(s | θ, δ), ȳ(s | θ, δ), δ (i) , δ̄(s) .
ds
10.1 Time-Lag Optimal Control 381

Proof. Let δ and s ∈ {1, · · · , p} be arbitrary but fixed, and let er be the rth
unit vector in Rp . Then,
 
∂y(s) y s|δ, θ ξ − y(s|θ, δ)
= lim ,
∂θr ξ→0 ξ

where θ ξ = θ + ξer .
Now, we will prove the theorem in the following steps:

Step 1: Preliminaries
 
For each real number ξ ∈ R, let y ξ denote the function y ·|δ, θ ξ . Then,
it follows from dynamic system that, for each ξ ∈ R,
 s
ξ ξ
y (s) = y (0) + F ξ (t)dt, s ∈ [0, p],
0

where F ξ is defined as follows:


ξ       
F ξ (s) = θs+1 f y s|δ, θ ξ , ȳ sξ |δ, θ ξ , δ̄ sξ ,
  $      %
while ȳ sξ |δ, θ ξ = y1 s̄1 s|θξ , . . . , yn s̄n s|θξ ,
 ξ & ! !'
δ̄ s = δ̄1 s̄ξn+1 , . . . , δ̄r s̄ξn+r , and for each m = 1, . . . , r,

! ⎪ j
⎨ δm , if s̄ξn+m ∈ [j − 1, j),
δ̄m s̄ξn+m = for some j ∈ {1, . . . , p},

⎩   
ϕm s̄n+m s|θξ , if s̄ξn+m < 0.

Define

  s  
ξ
Γ (s) = y s|δ, θ ξ
− y(s|δ, θ) = F ξ − F 0 dt. (10.1.31)
0

Applying the mean value theorem, we have, for s ∈ [0, p],

F ξ (s) − F 0 (s)
 1    
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ ξ
= Γ (s)dη
∂y
0
 1    
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ  ξ 
+ ȳ − ȳ dη
∂ ȳ
0
 p    
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ
+ ξdη (10.1.32)
0 ∂θr
382 10 Time-Lag Optimal Control Problems

and
 
ȳ ξ − ȳ 0 = ȳ sξ |δ, θ ξ − ȳ(s|δ, θ)
     
= ȳ sξ |δ, θ ξ − ȳ s|δ, θ ξ + ȳ s|δ, θ ξ − ȳ(s|δ, θ)
   
= ȳ sξ |δ, θ ξ − ȳ s|δ, θ ξ + Γ ξ (s̄),

where s̄ is the corresponding delayed time point in the new


" time horizon. #
From Assumption 10.1.3, it follows that the state set y ξ (s) : ξ ∈ [−a, a]
is equibounded on [0, p], where a > 0 is a fixed small real number. Hence,
there exists a real number C1 > 0 such that for each ξ ∈ [−a, a],

y ξ (s) ∈ Nn (C1 ), s ∈ [0, p],

where Nn (C1 ) denotes the closed ball in Rn of radius C1 centered at the


origin. Thus, for each ξ ∈ [−a, a],

y(s) + ηΓ ξ (s) ∈ Nn (C1 ), s ∈ [0, p], η ∈ [0, 1].

Furthermore, it is easy to see that for each ξ ∈ [−a, a],

θ + ηξer ∈ Np (C2 ), η ∈ [0, 1],

where C2 = |θ|p + a, Np (C2 ) denotes the closed ball in Rp of radius C2


centered at the origin. Recall from Assumptions 10.1.1 and 10.1.2 that ∂f /∂y
and ∂f /∂θr are continuous. Hence, it follows from the compactness of [0, T ],
V, Nn (C1 ) and Np (C2 ) and the definitions of z(s|θ) and φ that there exists
a real number C3 > 0 such that, for each ξ ∈ [−a, a],
 
 ∂f ξ 
 η
  ≤ C3 , s ∈ [0, p], η ∈ [0, 1],
 ∂y 
n×n
 
 ∂f ξ 
 η
  ≤ C3 , s ∈ [0, p], η ∈ [0, 1],
 ∂ ȳ 
n×n
 
 ∂f ξ 
 η
  ≤ C3 , s ∈ [0, p], η ∈ [0, 1],
 ∂θr 
n
 
 ∂φξ 
 η
  ≤ C3 , s ∈ [0, p], η ∈ [0, 1],
 ∂t 
n
   
where fη denotes f y + ηΓ ξ (t), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ , and φξη de-
ξ

notes φ(μ(t|θ + ηξer ) − h), and | · | denotes the Euclidian norm.


10.1 Time-Lag Optimal Control 383

Step 2: The Function Γ ξ (s) Is of Order ξ

Let ξ ∈ [−a, a] be arbitrary. When s̄ < 0, taking the norm of both sides
of (10.1.31) and applying the definition of C3 gives
  4 5 
 ξ   s 1 ∂f ξ ξ ξ   
Γ (s) =   η ∂f η ∂f η 
Γ ξ
(t) + ξ + φ ξ
− φ 0
dηdt 
n  0 0 ∂y ∂θr ∂ ȳ η η

n

where

φξη − φ0η = φ(μ(t|θ + ηξer ) − h) − φ(μ(t|θ) − h)


∂φ(μ(t|θ + ηξer ) − h) ∂μ(t|θ + ηξer )
=ξ , η ∈ [0, 1].
∂t ∂θr
Thus, we have

 ξ  s  
Γ (s) ≤ C3 |ξ| + C32 T |ξ| + C3 Γ ξ (t)n dt, s̄ < 0.
n
0

Applying Theorem A.1.19 (Gronwall-Bellman Lemma) gives


 ξ 
Γ (s) ≤ (C3 + C32 T )exp(C3 p)|ξ|, s ∈ [0, α1 ],
n

where α1 is a time point such that

μ(α1 |θ) = h.

When s̄ ≥ 0,
 
 ξ   s ∂fηξ ξ
1 ∂fηξ ∂fηξ   ξ   
Γ (s) = Γ (t) + ξ+ ȳ s |δ, θ ξ − ȳ s|δ, θ ξ
n  ∂y ∂θ r ∂ ȳ
0 0

 
+ Γ ξ (s̄) dηdt
  n
   
 s 1 ∂f ξ   s 1 ∂f ξ 
 η ξ   η 
≤ Γ (t)dηdt +  ξdηdt
 0 0 ∂y   0 0 ∂θr 
  n
 n
 s 1 ∂f ξ 
 η ξ 
+ Γ (s̄)dηdt
 0 0 ∂ ȳ 
  n

 s 1 ∂f ξ      
 η 
+ ȳ sξ |δ, θ ξ − ȳ s|δ, θ ξ dηdt
 0 0 ∂ ȳ 
  n
 s 1 ∂f ξ 
 η ξ 
≤(C3 + C32 )exp(C3 p)|ξ| +  Γ (t)dηdt
 α1 0 ∂y 
n
384 10 Time-Lag Optimal Control Problems
     
 s 1 ∂fηξ   s 1 ∂f ξ 
   η ξ 
+ ξdηdt +  Γ (s̄)dηdt
 α1 0 ∂θr   α1 0 ∂ ȳ 
  n
 n
 s 1 ∂fηξ   ξ    
 
+ ȳ s |δ, θ ξ − ȳ s|δ, θ ξ dηdt
 α1 0 ∂ ȳ 
n

Since s̄ ≥ 0, it follows from the definitions of s̄ that


    s  s
 s 1 ∂f ξ   ξ   
    C3 Γ ξ (t)n dt
η ξ
 Γ (s̄)dηdt ≤ C3 Γ (s̄) n dt ≤
 α1 0 ∂ ȳ  α1 0
n

and by the mean value theorem


  
 s 1 ∂f ξ      
 η 
 ȳ s |δ, θ − ȳ s|δ, θ
ξ ξ ξ
dηdt
 α1 0 ∂ ȳ 
n
 s  1 
 ∂y(s̄(t|θ + lξer )|δ, θ + lξer ) ∂s̄ 
≤ C3  ξ  dldt ≤ C33 p|ξ|,
 ∂s̄ ∂θr n
α1 0

where l ∈ [0, 1]. Again, by applying Theorem A.1.19 (Gronwall-Bellman


Lemma), we have
 s
 ξ   
Γ (s) ≤ (C3 + C32 T ) exp(C3 p)|ξ| + C3 |ξ| + C33 p|ξ| + 2C3 Γ ξ (t)n dt
n
0
≤ (C3 + C32 T exp(C3 p|ξ|) + (C3 + 3
C3 p) exp(2C3 p)|ξ|. (10.1.33)

Since ξ ∈ [−a, a] is arbitrary, the function Γ ξ (s) is of order ξ.

Step 3: The Definition of ρ and Its Properties

For each ξ ∈ [−a, a] ∈ R, define the corresponding functions λ1,ξ : [0, p] →


R , λ2,ξ : [0, p] → Rn , λ3,ξ : [0, p] → Rn as follows:
n

  
1,ξ ∂f (y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ)
1
λ (t) =
0 ∂y
∂f (y, ȳ, θ, δ)
− Γ ξ (t)dη
∂y
 1  
2,ξ ∂f (y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ)
λ (t) =
0 ∂ ȳ
∂f (y, ȳ, θ, δ)  
− ȳ ξ − ȳ dη
∂ ȳ
10.1 Time-Lag Optimal Control 385
  
1
∂f (y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ)
λ3,ξ (t) =
0 ∂θr
∂f (y, ȳ, θ, δ)
− ξdη.
∂θr

In addition, let the function ρ : [−a, 0) ∪ (0, a] → R be defined as follows:


 p
" 1,ξ      #
ρ(ξ) = |ξ|−1 λ (t) + λ2,ξ (t) + λ3,ξ (t) dt. (10.1.34)
n n n
0

Since the function Γ ξ (s) is of order ξ, it follows that

y + ηΓ ξ (t) → y, as ξ → 0 (10.1.35)

 
ȳ + η ȳ ξ − ȳ → ȳ, as ξ → 0 (10.1.36)

uniformly with respect to t ∈ [0, p] and η ∈ [0, 1]. Meanwhile, it is obvious


that

θ + ηξer → θ, as ξ → 0 (10.1.37)

uniformly with respect to η ∈ [0, 1]. Since the convergences in (10.1.35)


and (10.1.36) take place inside the ball Nn (C1 ), the convergence in (10.1.37)
takes place inside the ball Nn (C2 ), ∂f /∂y, ∂f /∂yz and ∂f /∂θr are uni-
formly continuous on the compact set [0, p] × Nn (C1 ) × Nn (C1 ) × V × Nn (C2 ),
   
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ ∂f (y, ȳ, θ, δ)
→ , as ξ → 0,
∂y ∂y
   
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ ∂f (y, ȳ, θ, δ)
→ , as ξ → 0,
∂ ȳ ∂ ȳ
   
∂f y + ηΓ ξ (s), ȳ + η ȳ ξ − ȳ , θ + ηξer , δ ∂f (y, ȳ, θ, δ)
→ , as ξ → 0,
∂θr ∂θr
 l,ξ 
∂ ȳ s |δ, lξer ∂ ȳ(s|δ, θ)
→ , as ξ → 0,
∂s̄ ∂s̄
∂s̄l,ξ ∂s̄
→ , as ξ → 0,
∂θr ∂θr

uniformly with respect to t ∈ [0, p], η ∈ [0, 1] and l ∈ [0, 1],where s̄l,ξ is the
corresponding delayed time of the control θ + lξer . These results together
with (10.1.33) imply that |ξ|−1 λ1,ξ → 0, |ξ|−1 λ2,ξ → 0, |ξ|−1 λ3,ξ → 0 uni-
formly on [0, p] as ξ → 0. Thus,
386 10 Time-Lag Optimal Control Problems

lim ρ(ξ) = 0.
ξ→0

Step 4: The Final Step

Let ξ ∈ [−a, 0)∪(0, a] be arbitrary but fixed. Then, it follows from (10.1.31)
that
 s  s
ξ
$ 1,ξ 2,ξ 3,ξ
% ∂f (y, ȳ, θ, δ) ξ
Γ (s) = λ (t) + λ (t) + λ (t) dt + Γ (t)dt
0 0 ∂y
 s
∂f (y, ȳ, θ, δ) ∂ ȳ(sl,ξ |δ, θ) ∂sl,ξ
+ Γ ξ (s̄) + ξ dt
0 ∂ ȳ ∂s̄ ∂θr
 s
∂f (y, ȳ, θ, δ)
+ ξdt. (10.1.38)
0 ∂θr

Furthermore, integrating the auxiliary system gives


 s
∂f (y(t|θ, δ, ȳ(t|θ, δ), θ, δ)) r
Λ̄r (s) = Λ̄ (t)dt
0 ∂y
 s
∂f (y(s|θ, δ, ȳ(s|θ, δ), θ, δ)) ∂ ȳ(s|δ, θ) ∂s̄
+ Λ̄(s̄|δ, θ) + dt
0 ∂ ȳ ∂s̄ ∂θr
 s
∂f (y(t|θ, δ, ȳ(t|θ, δ), θ, δ))
+ dt. (10.1.39)
0 ∂θr

Multiplying (10.1.38) by ξ −1 , and subtracting it from (10.1.39) yields

ξ −1 Γ ξ (s) − Λ̄r (s)


 s
$ 1,ξ %
= ξ −1 λ (t) + λ2,ξ (t) + λ3,ξ (t) dt
 s0
∂f (y, ȳ, θ, δ) −1 ξ
+ (ξ Γ (t) − Λ̄r (t))dt
0 ∂y
 s
∂f (y, ȳ, θ, δ) $ −1 ξ %
+ (ξ Γ (s̄) − Λ̄r (s̄)) dt
0 ∂ ȳ
 s
∂f (y, ȳ, θ, δ) ∂ ȳ(sl,ξ |δ, lξer ) ∂sl,ξ ∂ ȳ(s|δ, θ) ∂s̄
+ − dt.
0 ∂ ȳ ∂s̄ ∂θr ∂s̄ ∂θr

Let
  
s  ∂ ȳ(sl,ξ |δ, lξer ) ∂sl,ξ ∂ ȳ(s|δ, θ) ∂s̄ 
ρ̄(ξ) = ρ(ξ) + C3  −
0 ∂s̄ ∂θr ∂s̄ ∂θr n

Then, it is easy to see that ρ̄(ξ) → 0, as ξ → 0. Therefore,


 −1 ξ 
ξ Γ (s) − Λ̄r (s)
n
10.1 Time-Lag Optimal Control 387
 s
 
≤ ρ(ξ) + C3 ξ −1 Γ ξ (t) − Λ̄r (t)n dt
 s 0
 
+ C3 ξ −1 Γ ξ (s̄) − Λ̄r (s̄)n
0
 s
∂ ȳ(sl,ξ |δ, lξer ) ∂sl,ξ ∂ ȳ(s|δ, θ) ∂s̄
+ C3 − dt
0 ∂s̄ ∂θr ∂s̄ ∂θr
 s n
 −1 ξ 
≤ ρ̄(ξ) + 2C3 ξ Γ (t) − Λ̄r (t)n dt.
0

By Theorem A.1.19 (Gronwall-Bellman Lemma), we obtain


 −1 ξ 
ξ Γ (s) − Λ̄r (s) ≤ ρ̄(ξ)exp(2C3 p), s ∈ [0, p]. (10.1.40)
n

Noting that ξ ∈ [−a, 0) ∪ (0, a] is arbitrary, we can take the limit as ξ → 0


in (10.1.40). Since lim ρ(ξ) = 0, it follows that
ξ→0

lim ξ −1 Γ ξ (s) = Λ̄r (s|δ, θ), s ∈ [0, p].


ξ→0

t μ(s)

θs +1 = 0

0
s
ζ(s ) s

Scenario C1
t μ(s)

θs +1 > 0

0
s
ζ(s ) s

Scenario C2

Fig. 10.1.3: A demonstration for two scenarios of θs+1


388 10 Time-Lag Optimal Control Problems

Note that Theorem 10.1.2 is valid only under the condition that the deriva-
tive of ζ(·) with respect to θk exists. However, for the case of s ∈ S  (ζ(s) ∈
{0, 1, . . . , p − 1}), there are two scenarios to consider:
(i) θs+1 = 0.
(ii) θs+1 > 0.
If scenarios (i) holds, clearly, the derivative of ζ(s) with respect to θi does
not exist for all s ∈ [j  − 1, j  ) for some j  . However, this will not affect
solving the auxiliary system (10.1.29), since ∂ζ(·)/∂θi is always associated

with ∂f j (y(s | θ, δ), ȳ(s | θ, δ), θ, δ)/∂ ȳ, which takes the value of 0, when
θs+1 = 0.
As for scenario (ii), the derivative of μ(ζ(s)) with respect to s does not exist
for only a finite number of time instant (less than or equal to the number of
elements of S  ), which implies that, the value of ∂ζ(s)/∂θi does not exist only
at these time instants. For this case, the auxiliary dynamic system (10.1.29)
is still numerically solvable.
Remark 10.1.1 Figure 10.1.3 depicts the situations for the two scenarios.
Note that, in Scenario (C1), we have θζ(s )−1 = 0, while in Scenario (C2), we
have θζ(s )−1 > 0. By the definition of ζ(·) in (10.1.16), it is clear that θζ(s ) is
greater than 0 regardless of the value of θζ(s )−1 . Hence, the auxiliary dynamic
system (10.1.29) is numerically solvable either in the case of θζ(s )−1 > 0 in
Scenario (C1) or θζ(s )−1 = 0 in Scenario (C2).
The gradient of g̃k (θ, δ), k = 0, 1, . . . , Ne + Nm , with respect to the dura-
tion vector θ is given as a theorem stated below.
Theorem 10.1.3 The gradient of g̃k (θ, δ) for each k = 0, 1, . . . , Ne + Nm
with respect to θ is given by

∂g̃k (θ, δ) ∂Φk (y(p | θ, δ)) ∂y(p | θ, δ)


= ·
∂θ ∂y ∂θ
 p 4.
∂ L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) ∂y(s | θ, δ)
+ ·
0 ∂y ∂θ
/
∂ L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) ∂ ȳ(s | θ, δ) ∂μ(s)
+ ·
∂ ȳ ∂θ ∂s
5
∂ ∂μ(s)
+ L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) · ds. (10.1.41)
∂θ ∂s

Proof. The proof follows from applying the chain rule to (10.1.26) and The-
orem 10.1.2.
Finally, the gradients of the state and g̃k (θ, δ) with respect to δ are given
below.
10.1 Time-Lag Optimal Control 389

Theorem 10.1.4 For each pair (θ, δ) ∈ Θ × Δ,

∂y(s | θ, δ)
= Ῡ (s | θ, δ), s ∈ [0, p], (10.1.42)
∂δ
where Ῡ (· | θ, δ) is the solution of the following auxiliary dynamic system on
each interval [i − 1, i):

dῩ (s) ∂f i (y(s | θ, δ), ȳ(s | θ, δ), θ, δ)


= Ῡ (s)
ds ∂y
∂f i (y(s | θ, δ), ȳ(s | θ, δ), θ, δ) ∂ ȳ(s | θ, δ)
+
∂ ȳ ∂δ
∂f (y(s | θ, δ), ȳ(s | θ, δ), θ, δ)
i
+ (10.1.43)
∂δ
with the initial condition
Ῡ (s) = 0, s ≤ 0, (10.1.44)
where ⎡ ⎤
Ῡ1 (s̄1 | θ, δ)
∂ ȳ(s | θ, δ) ⎢ .. ⎥
=⎣ . ⎦.
∂δ
Ῡn (s̄n | θ, δ)
Proof. The proof is similar to the proof of Theorem 10.1.2, and hence is
omitted.
Theorem 10.1.5 For each k = 0, 1, . . . , Ne + Nm , the gradient of g̃k (θ, δ)
with respect to δ is given by

∂g̃k (θ, δ) ∂Φk (y(p | θ, δ)) ∂y(p | θ, δ)


= ·
∂δ ∂y ∂δ
 p *&
∂ L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) ∂y(s | θ, δ)
+ ·
0 ∂y ∂δ
∂ L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) ∂ ȳ(s | θ, δ)
+ ·
∂ ȳ ∂δ
∂ L̂k (y(s | θ, δ), ȳ(s | θ, δ), δ) dμ(s) +
'
+ ds. (10.1.45)
∂δ ds
Proof. The proof follows from applying the chain rule to (10.1.26) and The-
orem 10.1.2.
Note that Problem (Q1 (p)) is an optimal parameter selection problem.
Theorems 10.1.3 and 10.1.5 give the gradients of the cost and constraint
functionals in Problem (Q1 (p)) with respect to θ and δ, respectively. On this
basis, we can use existing nonlinear optimization software based on gradient
descent techniques—for example, FMINCON in MATLAB or NLPQLP in
390 10 Time-Lag Optimal Control Problems

FORTRAN to solve Problem (Q1 (p)). In the next section, we will demon-
strate the effectiveness of this approach with two numerical examples.

10.1.6 Numerical Examples

Example 10.1.1 (Optimal Control with State Delay)


Consider the following multiple time-delay optimal control problem give ref-
erence:

1  1 tf " #
min g0 (u) = (x (tf )) Sx (tf ) + x(t) Qx(t) + (u(t)) Ru(t) dt,
2 2 0

subject to the time-delay dynamic system


dx
= A1 (t)x(t) + A2 (t)x̄(t) + B(t)u(t), (10.1.46)
dt
x(t) = [1, 0] , t ≤ 0, (10.1.47)

where
x̄ = (x1 (t − 1), x2 (t − 0.5))
0 1
A1 (t) = ,
−4π 2 (a + c cos 2πt) 0
0 0 0
A2 (t) = , B(t) = ,
−4π 2 b cos 2πt 0 1
the parameters of the problem are in Table 10.1.1,

Table 10.1.1: Parameters in Example 10.1.1

a b c tf Q R S
0.2 0.5 0.2 1.5 I2×2 I2×2 104 I2×2

and the control constraints are

− 3 ≤ ui ≤ 4, t ∈ [0, tf ], i = 1, 2.

By choosing different partition number, i.e., q = 5, 7, 10, we obtain the corre-


sponding optimal costs of g0 (u∗ ) by applying the new method and the con-
ventional control parametrization method for each value of q. The detailed
numerical results are listed in Table 10.1.2. It is clear from Table 10.1.2 that
the optimal cost decreases when the partition number increases. Furthermore,
since the variable switching times provide a larger flexibility for optimization,
10.1 Time-Lag Optimal Control 391

the proposed new method can always achieve a better cost when compared
with the conventional control parametrization method for which the partition
points are evenly distributed over the time horizon.
Note that the results obtained by applying the new method have similar
cost values when compared with those obtained by applying the hybrid time-
scaling transformation reported in [298]. However, the implementation of the
new method is much simpler. In particular, it does not require the use of
numerical interpolation to calculate the delay state values in the new time
horizon. Consequently, the computational time requirement is much less.
Figure 10.1.4 shows optimal controls obtained by using the two different
methods. Figures 10.1.5 and 10.1.6 depict, respectively, the two optimal state
trajectories for the case of q = 10.

Table 10.1.2: Optimal costs for Example 10.1.1 using the two different
methods
(b) Conventional control
(a) New method parametrization method
Number of subintervals g0 (uq,∗ ) Number of subintervals g0 (uq,∗ )
q = 10 4.4355 q = 10 7.2878
q=7 4.4610 q=7 7.4755
q=5 4.7630 q=5 8.1382

2
control

-1
Traditional Control Parameterization
-2 Hybrid Time-Scaling Approach

-3

-4
0 0.5 1 1.5
t

Fig. 10.1.4: Optimal controls obtained for Example 10.1.1 using the two
different methods
392 10 Time-Lag Optimal Control Problems

2
x1
x2
1

-1

state -2

-3

-4

-5
0 0.5 1 1.5
t

Fig. 10.1.5: Optimal state trajectory obtained for Example 10.1.1 using
the new method

0.5

-0.5
state

-1

-1.5
x1
-2 x2

-2.5
0 0.5 1 1.5
t

Fig. 10.1.6: Optimal state trajectory obtained for Example 10.1.1 using
the conventional control parametrization method

Example 10.1.2 (Optimal Control with Multiple Time-Delay)


Consider the following time-delay optimal control problem give reference,
which includes different time-delays in every state and control variables:

1 1 2$ %
min g0 = (x(2)) Sx(2) + (x(t)) Qx(t) + (u(t)) Ru(t) dt,
2 2 0

where
10.1 Time-Lag Optimal Control 393
⎡ ⎤ ⎡ ⎤
1 2 0 0 1 0 0 0
⎢2 1 0 0⎥ ⎢ 0⎥

S=⎣ ⎥ , R = ⎢0 1 0 ⎥,
0 0 1 2⎦ ⎣0 0 1 0⎦
0 0 1 1 0 0 0 1
⎡ ⎤
1 0 0 0
⎢0 2 0 0 ⎥
Q=⎢
⎣0 0 1 0
⎥.

0 0 0 2
subject to the time-delay dynamic system

dx1 (t)
= − 2(x1 (t))2 + x1 (t)x2 (t − 0.2) + 2x2 (t)
dt
− u1 (t)u2 (t − 0.5),
dx2 (t)
= − x1 (t − 0.1) + 2x3 (t) + u2 (t),
dt
dx3 (t)
= − (x3 (t))3 − x1 (t)x2 (t) − x2 (t − 0.2)u2 (t)
dt
+ u1 (t − 0.4) + 2u3 (t),
dx4 (t)
= − (x4 (t))2 + x2 (t)x3 (t) − 2x3 (t − 0.3) + 2u4 (t),
dt
the initial conditions

x1 (t − 0.1) = 1, t ≤ 0.1; x2 (t − 0.2) = 1, t ≤ 0.2;


x3 (t − 0.3) = 1, t ≤ 0.3; x4 (t − 0.4) = 1, t ≤ 0.4;
u1 (t − 0.5) = 1, t < 0.5; u2 (t − 0.6) = 1, t < 0.6;
u3 (t − 0.7) = 1, t < 0.7; u4 (t − 0.8) = 1, t < 0.8,

the terminal inequality constraints

g1 (u) = 4 − (x1 (2))2 − (x2 (2))2 − (x3 (2))2 − (x4 (2))2 ≥ 0,


g1 (u) = (x1 (2))2 + (x2 (2))2 + (x3 (2))2 + (x4 (2))2 − 0.002 ≥ 0,

and the control constraints

− 0.9 ≤ ui (t) ≤ 1, t ∈ [0, 2], i = 1, . . . , 4.

By choosing q = 10, we obtain a cost of g0 (u∗ ) = 2.1172. However, the con-


ventional control parametrization technique fails to solve this problem. The
results obtained by applying the new method are again very similar to that
obtained by applying the method reported in [298]. However, the implemen-
tation of the new method is much simpler and the required computational
time is less. The obtained optimal controls are as shown in Figures 10.1.7
394 10 Time-Lag Optimal Control Problems

and 10.1.8, and the corresponding state trajectories are displayed in Fig-
ure 10.1.9.

0.3
u1
0.2 u2

0.1

0
control

-0.1

-0.2

-0.3

-0.4

-0.5
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
t

Fig. 10.1.7: Optimal controls u1 and u2 obtained for Example 10.1.2 using
the new method

u3
0.2 u4

0
control

-0.2

-0.4

-0.6

-0.8

-1
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
t

Fig. 10.1.8: Optimal controls u3 and u4 obtained for Example 10.1.2 using
the new method
10.2 Time-Lag Optimal Control with State-Dependent Switched System 395

1.4
x
1
1.2 x
2
x3
1
x4
0.8

state
0.6

0.4

0.2

-0.2

-0.4

-0.6
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
t

Fig. 10.1.9: Optimal state trajectories obtained for Example 10.1.2 using
the new method

10.2 Time-Lag Optimal Control with State-Dependent


Switched System

10.2.1 Introduction

In this section, we consider a switched system which consists of a number of


sub-systems (or modes) and a switching law. It operates by switching among
different sub-systems and the order of the switching sequence and the times
at which the changes of the sub-systems take place are determined by the
switching law. Switched systems are optimized by changing the switching
sequence, switching times, and some input parameters in the dynamics of
the sub-systems. The switching times and input parameters are normally
continuous valued and hence they can be determined using gradient-based
optimization techniques [142, 143, 148, 155]. On the other hand, optimizing
the switching sequence is a discrete optimization problem, and it is a much
harder optimization problem. Our focus is on the class of switching systems
for which the switching sequence is pre-fixed (for a survey paper, see [148]).
Furthermore, time-delays are assumed to appear in the state of each of the
sub-systems and the switching mechanism is activated automatically when
the state of the system satisfies the switching law. The system parameters
appearing in the initial state functions are the decision variables. The main
reference for this section is [152].
396 10 Time-Lag Optimal Control Problems

10.2.2 Problem Statement

This section is from [152]. Consider the following switched time-delay system,
which consists of N sub-systems operating in succession, over the time horizon
[0, T ]:

dx(t)
= f i (x(t), x(t − γ1 ), . . . , x(t − γr )),
dt
t ∈ (τi−1 , τi ), i = 1, . . . , N, (10.2.1a)
x(t) = φ(t, ζ), t ≤ 0, (10.2.1b)

where x(t) ∈ Rn is the state; γj , j = 1, . . . , r, are given time-delays; ζ =


[ζ1 , . . . , ζs ] ∈ Rs is a system parameter vector; τi , i = 1, . . . , N − 1, are
switching times appeared in increasing order, with τ0 = 0 and τN = ∞;
and f i : R(m+1)×n → Rn , i = 1, . . . , N, and φ : R × Rr → Rn are given
continuously differentiable functions.
The following linear growth condition is assumed throughout this section.

Assumption 10.2.1
 i 0   
f (y , . . . , y r ) ≤ K1 1 + |y 0 | + · · · + |y r | ,
(y 0 , . . . , y r ) ∈ R(r+1)×n , i = 1, . . . , N, (10.2.2)

where K1 > 0 is a real constant and | · | stands for the Euclidean norm.
For i = 1, . . . , N , the system switches from sub-system i − 1 to sub-system
i at the time t = τi defined by

τi = inf {t > τi−1 : hi (x(t)) = 0} , (10.2.3)

where hi : Rn → R, i = 1, . . . , N, are given continuously differentiable func-


tions such that hi (x(t)) = 0 for all t > τi−1 with τ0 = 0 and τN = ∞. (10.2.3)
is referred to as the switching law.
The evolution of the switched system (10.2.1) with its switching times
determined by the switching law (10.2.3) is as follows:
Given a system parameter vector ζ, the system starts from the initial
state x(0 | ζ) = φ(0, ζ) at t = 0 with x(t | ζ) = φ(t, ζ) for t ≤ 0 and evolves
smoothly according to (10.2.1a) with i = 1 until t = τ1 . Then, the system
switches to sub-system i = 2 and evolves according to (10.2.1a) with i = 2
until t = τ2 . This process continues until t = T .
System (10.2.1) is influenced by the choice of the system parameter vector
ζ ∈ Z, where Z is defined by

Z = {ζ ∈ Rs : ak ≤ ζk ≤ bk , k = 1, . . . , s} , (10.2.4)
10.2 Time-Lag Optimal Control with State-Dependent Switched System 397

where ak , k = 1, . . . , s and bk , k = 1, . . . , s, are given constants such that


ak < bk . Any vector ζ ∈ Z is called a feasible system parameter vector, and
Z is called the set of feasible system parameter vectors.
Remark 10.2.1 For (10.2.1), we note that the system parameter vector ap-
pears only in the initial function φ (see (10.2.1b)). However, this is not a
strict restriction but for the sake of brevity of presentation. The results can
be easily extended to the case for which the system parameter vector appears
also in the right hand side of (10.2.1a) and/or the switching law (10.2.3).
Definition 10.2.1 For each feasible parameter ζ ∈ Z, x(· | ζ) is said to be a
solution of system (10.2.1) with switching times determined by (10.2.3), if it
satisfies the dynamics defined by (10.2.1a) almost everywhere on [0, ∞), and
the initial condition (10.2.1b) everywhere on (−∞, 0], where the switching
times τi , i = 1, . . . , m, are determined by the switching law (10.2.3).
Theorem 10.2.1 For each feasible system parameter vector ζ ∈ Z, sys-
tem (10.2.1) with switching times determined by (10.2.3) has a unique solu-
tion.

Proof. For a given ζ ∈ Z, define a set of auxiliary systems recursively as


follows:
dξ i (t)  
= f i ξ i (t), ξ i (t − γ1 ), . . . , ξ i (t − γr ) , t > ρi−1 , (10.2.5a)
dt
ξ i (t) = ξ i−1 (t), t ≤ ρi−1 , (10.2.5b)

where
4
inf{ t > ρi−1 : hi (ξ i (t)) = 0 }, if i ≤ N − 1,
ρi = (10.2.6)
∞, if i = N ,

where ξ 0 (t) = φ(t, ζ) and ρ0 = 0. Given ρi−1 and ξ i−1 (·) for some i ∈ ZN ,
with
ZN = {1, . . . , N } .
The existence of a unique solution to (10.2.5) can be established as follows. Di-
vide [ρi−1 , ∞) into consecutive subintervals of length min{γ1 , . . . , γr }. Then,
consider the system on each of the subintervals consecutively. This gives rise
to a set of consecutive non-delay systems on each of these subintervals in
sequential order. Since the functions φ and f i , i = 1, . . . , N , are continuous
and differentiable and Assumption 10.2.1 is satisfied, the well-known exis-
tence and uniqueness results for non-delay systems [3] can be applied to each
of these consecutive systems one by one in a sequential order. More specif-
ically, starting from ξ 0 (t) = φ(t, ζ) and ρ0 = 0, we can show by induction
that ξ i (·) and ρi are well-defined for each i ∈ ZN . Then, we see that ξ N (·)
satisfies (10.2.5) with ρi = τi , i = 0, . . . , N . Since each ξ i (·) is unique, it is
clear that ξ N (·) is the only solution of (10.2.5). This completes the proof.
398 10 Time-Lag Optimal Control Problems

Now, consider the cost functional g0 defined by

g0 (ζ) = Φ(x(T | ζ)), (10.2.7)

where T > 0 is a given terminal time, and Φ : Rn → R is a given continuously


differentiable function.
We may now formally state the optimal control problem under considera-
tion in this section as follows.
Problem (P2 ). Given the switched system (10.2.1) equipped with the switch-
ing law defined by (10.2.3), find a feasible system parameter vector ζ ∈ Z
such that the cost functional (10.2.7) is minimized.
There are two special features for Problem (P2 ): (1) the switchings among
the sub-systems of the switched system (10.2.1) are influenced by multiple
state-delays; and (2) the switchings among the sub-systems of the switched
system are governed by a state-dependent switching law. We shall develop a
gradient-based computational method to solve Problem (P2 ).

10.2.3 Preliminaries

Some preliminary results are needed for the development of a gradient-based


computational algorithm for solving Problem (P2 ). To begin, let ek denote
the kth unit vector in Rr . Furthermore, let ∂x(t | ζ)/∂ζ denote the n × s
state variation matrix with its kth column defined by
 
∂x(t | ζ) x t | ζ + εek − x(t | ζ)
= lim . (10.2.8)
∂ζk ε→0 ε
Here, we assume that the limit on the right hand side is well-defined. For
systems governed by ordinary differential equations, this assumption will al-
ways be satisfied. However, this assumption is not necessarily valid for the
switched system (10.2.1). We shall show that even for the case where all
the functions involved in the switched system (10.2.1) are smooth, the state
variation matrix for the system does not exist in some situations due to the
presence of time-delays and state-dependent switching law.
Now, suppose that the state variation matrix does exist at t = T . Then,
we can take partial differentiation of the cost functional (10.2.7). In this way,
it follows from the use of the chain rule that
∂g0 (ζ) ∂Φ(x(T | ζ)) ∂x(T | ζ)
= , (10.2.9)
∂ζ ∂x ∂ζ

where ∂g0 (ζ)/∂ζ is an s-dimensional row vector with its kth element being
the partial derivative ∂g0 (ζ)/∂ζk of g0 with respect to the kth element of the
system parameter vector ζ.
10.2 Time-Lag Optimal Control with State-Dependent Switched System 399

To continue, the following assumption is assumed through this section.


Assumption 10.2.2 For any ζ ∈ Z,

hi (x(τi−1 | ζ)) = 0, i ∈ ZN −1 , where τi−1 < ∞. (10.2.10)

This condition is to ensure that there is no switching of any two sub-


systems to occur at the same time. Therefore, the switching law defined
by (10.2.3) is well-defined and the switching times are distinct. More specifi-
cally, τi−1 < τi for each integer i ∈ ZN −1 with τi−1 < ∞.
The following assumption is also assumed throughout.
Assumption 10.2.3 For a given ζ ∈ Z,

∂hi (x(τi | ζ)) i


f (x(τi | ζ), x(τi − γ1 | ζ), . . . , x(τi − γr | ζ)) = 0,
∂x
i ∈ ZN −1 , where τi < ∞. (10.2.11)

In this assumption, it is assumed that the scalar product of ∂hi /∂x (which
is orthogonal to the switching surface hi = 0) and f i (which is tangent to the
state trajectory) is non-zero at the ith switching time. This means that the
state trajectory does not approach to the switching surfaces at a tangential
direction. In the literature, there are similar assumptions being made. See,
for example, [33, 190].
We are now ready to present formulae for both the state variation matrix
and the partial derivatives of the switching times with respect to the system
parameter vector. Throughout, we use the notation ∂ x̃j to denote the par-
tial differentiation with respect to x(t − γj ), with ∂ x̃0 denoting the partial
differentiation with respect to x(t) (that is, γ0 = 0).
To continue, let ζ ∈ Z and k ∈ {1, . . . , s} be arbitrary. Consider the
perturbed system parameter vector ζ + εek , where ε ∈ [ak − ζk , bk − ζk ],
such that ζ + εek ∈ Z. Let ξ i,ε (·), i ∈ ZN denote the trajectories obtained by
solving (10.2.5) recursively corresponding to  the perturbed
 system parameter
vector ζ + εek , starting from ξ 0,ε (t) = φ t, ζ + εek for t ≤ 0 and ρ0 = 0.
Following arguments similar to those used in the proof of Theorem 10.2.1, we 
can show that ρi defined by (10.2.6) for ζ + εek is equal to τiε = τi ζ + εek ,
and ξ i,ε (t) = x t | ζ + εek for all t ≤ τiε .
Consider a function ψ of ε. The following notations are used. (1) If there
k
exists a real number M > 0 and a positive integer k such  |ψ(ε)| ≤ M |ε|
 kthat
for all ε of sufficiently small magnitude, then ψ(ε) = O ε ; (2) ψ(ε) = θ(ε) if
ψ(ε) → 0 as ε → 0; and (3) ψ(ε) = O(1) means that ψ is uniformly bounded
with respect to ε.
Let
γ̄ = max{γ1 , . . . , γr }
and let
μi,ε (t) = ξ i,ε (t) − ξ i,0 (t), i = 0, . . . , N. (10.2.12)
400 10 Time-Lag Optimal Control Problems

Now, consider the following variational system:

dΛk (t)  ∂f i (x(t), x(t − γ1 ), . . . , x(t − γm ))


m
= Λk (t − γj ),
dt j=0
∂ x̃j
t ∈ (τi−1 , τi ), i ∈ ZN , (10.2.13a)

with initial conditions


∂φ(t, ζ)
Λk (t) = , t ≤ 0, (10.2.13b)
∂ζk
∂φ(0, ζ)
Λk (0+ ) = , (10.2.13c)
∂ζk
and intermediate jump conditions
  ∂τi (ζ) " i
Λk τi+ = Λk (τi− ) + f (x(τi ), x(τi − γ1 ), . . . , x(τi − γr ))
∂ζk
#
− f i+1 (x(τi ), x(τi − γ1 ), . . . , x(τi − γr )) , i ∈ ZN −1 and τi < ∞.
(10.2.13d)

For each i ∈ ZN , let Λk be the solution to the variational system governed


by the differential equations (10.2.13a) with initial condition (10.2.13b)–
(10.2.13c) and jump condition (10.2.13d). We need the following two lemmas:
Lemma 10.2.1 For each i ∈ ZN , it holds that
 
max ξ i,ε (t) = O(1) for every Tmax > 0, (10.2.14)
t∈[−γ̄,Tmax ]

 i,ε 
max μ (t) = O(ε) for every Tmax > 0, (10.2.15)
t∈[−γ̄,Tmax ]

 E
% i−1
lim ε−1 μi,ε (t) = Λk (t− ), t ∈ −∞, τi0 \ {τk0 }, (10.2.16a)
ε→0
k=0

lim τiε = τi0 , (10.2.16b)


ε→0

and, when τi0 is finite,

τ ε − τi0
lim i
ε→0 ε

⎪ 0 if i = 0.

⎨ ∂hi (ξi,0 (τi0 ))
= −
0−
∂x Λk (τi )÷

⎪ ∂hi (ξi,0 (τi0 )) i  i,0  0     
⎩ f ξ τi , ξ i,0 τi0 − γ1 , . . . , ξ i,0 τi0 − γr , if i ≥ 1.
∂x

(10.2.16c)
10.2 Time-Lag Optimal Control with State-Dependent Switched System 401

Before we give the proof of Lemma 10.2.1, some important observations


are noted in the following remark.
Remark 10.2.2 From (10.2.14), we see that the solution of (10.2.1) cor-
responding to ζ + εek is uniformly bounded with respect to ε. By (10.2.15)
and (10.2.16a), the solution is continuous and differentiable at ζ with re-
spect to the kth component of the system parameter vector ζ. By (10.2.16b)
and (10.2.16c), the switching times are continuous and differentiable at ζ
with respect to the kth component of the system parameter vector ζ.
Proof of Lemma 10.2.1 The proof is by induction. To start with, con-
sider (10.2.14)–(10.2.15), (10.2.16a)–(10.2.16c) for i = 0. Since τ0ε = 0
and ξ 0,ε (t) = φ(t, ζ + εek ), it is clear that (10.2.14)–(10.2.15), (10.2.16a)–
(10.2.16c) for i = 0 are valid. For (10.2.15), we note that φ is continuously
differentiable on [−γ̄, Tmax ] × Z. Thus,
     
max μ0,ε (t) = max φ t, ζ + εek − φ(t, ζ)
t∈[−γ̄,Tmax ] t∈[−γ̄,Tmax ]
 
 1  ∂φ t, ζ + εηek  
 
≤ |ε| max   dη = O(ε).
t∈[−γ̄,Tmax ] 0  ∂ζk 
(10.2.17)

Now, given the inductive hypothesis (i.e., (10.2.14)–(10.2.15), (10.2.16a)–


(10.2.16c) are valid for each i = 1, . . . , q, where q ≤ N − 1), we shall show
that (10.2.14)–(10.2.15), (10.2.16a)–(10.2.16c) for i = q + 1 are also valid.
Consider the case of τq0 = ∞. Since ξ q+1,ε (t) = ξ q,ε (t) when |ε| is small, it
is clear that (10.2.14)–(10.2.15), (10.2.16a)–(10.2.16b) for i = q + 1 are also
satisfied. Here, (10.2.16c) is irrelevant.
We now consider the case of τr0 < ∞. Since (10.2.16c) is irrelevant, it
suffices to prove the validity of (10.2.14)–(10.2.15), (10.2.16a)–(10.2.16b) for
i = q + 1. This is done one by one as detailed below.
Proof of (10.2.14)
For each i = 0, . . . , q + 1, define

fˆi,ε (t, η)
⎧ i  i,0
⎨ f ξ (t − γ0 ) + ημi,ε (t − γ0 ),. . . ,
ξ (t − γrk)+ ημ (t − γr ) , if i ≥ 1,
i,0 i,ε
= (10.2.18)

dφ t, ζ + εηe /dt, if i = 0,

and, for i ≥ 1, let ∂ fˆi,ε (s, η)/∂ x̃j denote the respective partial derivatives.
From (10.2.5) for i = q + 1, we have

4 3t
q+1,ε ξ q,ε (τqε ) + τqε
fˆq+1,ε (ω, 1)dω, if t ∈ [τqε , Tmax ],
ξ (t) = (10.2.19)
ξ q,ε (t), if t ∈ [−γ̄, τqε ],
402 10 Time-Lag Optimal Control Problems

where τqε is as defined in the proof of Theorem 10.2.1 Thus, for t ∈ [−γ̄, τqε ],
 q+1,ε 
ξ (t) = |ξ q,ε (t)| ≤ max |ξ q,ε (ω)| , (10.2.20)
ω∈[−γ̄,Tmax ]

and for t ∈ [τqε , Tmax ], it is clear from Assumption 10.2.1 that

  
  t
 ˆq+1,ε 
ξ q+1,ε
(t) ≤ ξ q,ε (τqε ) + f (ω, 1) dω
τqε
r 
 t  
≤ max |ξ q,ε
(ω)| + K1 Tmax + K1 ξ q+1,ε (ω − γj ) dω
ω∈[−γ̄,Tmax ] τqε
j=0
 t  
≤ max |ξ q,ε (ω)| + K1 Tmax + (r + 1)K1 ξ q+1,ε (ω) dω.
ω∈[−γ̄,Tmax ] −γ̄
(10.2.21)

Combining (10.2.20) and (10.2.21), it follows from (10.2.14) for i = q that



 q+1,ε  t  
ξ (t) ≤ O(1)+ (r+1)K1 ξ q+1,ε (ω) dω, t ∈ [−γ̄, Tmax ]. (10.2.22)
−γ̄

Finally, by Theorem A.1.19 (Gronwall-Bellman Lemma), we obtain


 q+1,ε 
ξ (t) ≤ O(1) exp(K1 (r + 1)(Tmax + γ̄)) = O(1), t ∈ [−γ̄, Tmax ].
(10.2.23)
Therefore, (10.2.14) for i = q + 1 is established.
Proof of (10.2.15)
From (10.2.16c) for i = q (valid by the induction hypothesis), it follows
that
 
 ε  τε − τ0 τqε − τq0 τqε − τq0 

τq − τq  = |ε| ·  q q
0
− lim + lim  = O(ε). (10.2.24)
 ε ε→0 ε ε→0 ε 

There are four cases to be considered for t ∈ [−γ̄, Tmax ]:

 
(i) t < min τqε , τq0 ; (ii) τq0 ≤ t < τqε ;
 
(iii) τqε ≤ t < τq0 ; (iv) t ≥ max τqε , τq0 .

Using (10.2.5) and the fundamental theorem of calculus, we can derive the
following formula for μq+1,ε (t) = ξ q+1,ε (t) − ξ q+1,0 (t) for each of the four
cases:
 t* +
μ q+1,ε q,ε
(t) = μ (t) + αqt fˆq+1,ε (ω, 1) − fˆq,ε (ω, 1) dω
τqε
10.2 Time-Lag Optimal Control with State-Dependent Switched System 403
 t * +
+ βqt fˆq,0 (ω, 0) − fˆq+1,0 (ω, 0) dω, (10.2.25)
τq0

where αqt and βqt are binary parameters indicating whether or not t ≥ τqε
and t ≥ τq0 , respectively.
Consider cases (i)–(iii) (i.e., at most one of t ≥ τqε and t ≥ τq0 holds). Then,
it follows from (10.2.25) that
 q+1,ε   q     
μ (t) ≤ |μq,ε (t)| + fmax q+1
+ fmax max τqε , τq0 − min τqε , τq0
 q 
q+1  ε

= |μq,ε (t)| + fmax + fmax τq − τq0  , (10.2.26)

where fmax q q+1


and fmax are upper bounds for the norms of fˆq,ε (s, η) and
fˆq+1,ε
(s, η), respectively. The existence of these upper bounds is ensured by
the inductive hypothesis and (10.2.14), the uniform boundedness of ξ q,ε (·)
and ξ q+1,ε (·) on [−γ̄, Tmax ] with respect to ε, and the continuity of the func-
tions f q and f q+1 .
Next, consider case (iv) (i.e., t ≥ τqε and t ≥ τq0 ). Then, (10.2.25) becomes

μq+1,ε (t)
 t * +
= μq,ε (t) + fˆq+1,ε (ω, 1) − fˆq,ε (ω, 1) dω
τqε
 t * +
+ fˆq,0 (ω, 0) − fˆq+1,0 (ω, 0) dω
τq0
 4  5
t t
=ξ q,ε
(τqε ) −ξ q,0
(τq0 ) + fˆq+1,ε (ω, 1)ds − ˆ
f q+1,0
(ω, 0) dω
τqε τq0
 τqε * +
= μq,ε (τq0 ) + fˆq,ε (ω, 1) − fˆq+1,ε (ω, 1) dω
τq0
 t * +
+ fˆq+1,ε (ω, 1) − fˆq+1,0 (ω, 0) dω,
τq0

ε
provided that τq−1 < τq0 when q ≥ 1. Thus, by the mean value theorem, we
obtain
 τqε * +
μ q+1,ε q,ε 0
(t) = μ (τq ) + fˆq,ε (ω, 1) − fˆq+1,ε (ω, 1) dω
τq0
r 
 
t 1
∂ fˆq+1,ε (ω, η) q+1,ε
+ μ (ω − γj )dηdω. (10.2.27)
j=0 τq0 0 ∂ x̃j

Taking the norm of both sides, we have


404 10 Time-Lag Optimal Control Problems
 q+1,ε   q,ε 0   q  
μ (t) ≤ μ (τq ) + fmax + fmax
q+1  ε
τq − τq0 
r  t

q+1  q+1,ε

+ ∂fmax μ (ω − γj ) dω, (10.2.28)
j=0 τq0

where ∂fmaxq+1
denotes an upper bound for the norm of ∂ fˆq+1,ε (s, η)/∂ x̃j
(again, the existence of such an upper bound is ensured by the uniform
boundedness of ξ q+1,ε (·) and the continuous differentiability of f q+1 ). Re-
call that (10.2.28) is established under the condition:
ε
τq−1 < τq0 when q ≥ 1. (10.2.29)

By the inductive hypothesis and Assumption 10.2.2, we have


ε 0
lim τq−1 = τq−1 = τq−1 (ζ) < τq (ζ) = τq0 . (10.2.30)
ε→0

ε
Hence, when ε is of sufficiently small magnitude, τq−1 < τq0 , showing the
validity of (10.2.29). Consequently, (10.2.28) is valid.
Now, we combine (10.2.26) and (10.2.28) and shift the time variable in the
integral. Then, it can be shown that, for all t ∈ [−γ̄, Tmax ],
 q+1,ε   q  
μ (t) ≤ max |μq,ε (ω)| + fmax + fmax τq − τq0 
q+1  ε
ω∈[−γ̄,Tmax ]
 t 
q+1  q+1,ε

+ (r + 1)∂fmax μ (ω) dω. (10.2.31)
−γ̄

Therefore, since μq,ε (ω) = O(ε) and τqε − τq0 = O(ε) from (10.2.15) and
(10.2.24), respectively, we have

 q+1,ε  t  
μ (t) ≤ O(ε) + q+1  q+1,ε
(r + 1)∂fmax μ (ω) dω, t ∈ [−γ̄, Tmax ].
−γ̄
(10.2.32)
Finally, by Theorem A.1.19 (Gronwall-Bellman Lemma), we obtain
 q+1,ε   
μ (t) ≤ O(ε) exp (r + 1)(Tmax + γ̄)∂fmax q+1
= O(ε), t ∈ [−γ̄, Tmax ].
(10.2.33)
Thus, (10.2.15) for i = q + 1 follows readily. This completes the proof
of (10.2.15).
The proofs of (10.2.16a)–(10.2.16c) will depend on some auxiliary re-
sults to be established below. First, from the inductive hypothesis, we recall
that (10.2.16a)–(10.2.16c) are valid for each i = 1, . . . , q, where q ≤ N − 1.
We have already shown in the proofs of (10.2.14) and (10.2.15) that for
any Tmax > 0, ξ q+1,ε (·) is uniformly bounded on [−γ̄, Tmax ] with respect to ε,
and ξ q+1,ε (·) → ξ q+1,0 (·) uniformly on [−γ̄, Tmax ] as ε → 0. Thus, since f q+1
is a continuously differentiable function, the following limit holds uniformly
with respect to t ∈ [0, Tmax ] and η ∈ [0, 1].
10.2 Time-Lag Optimal Control with State-Dependent Switched System 405

∂ fˆq+1,ε (t, η) ∂ fˆq+1,0 (t, 0)


lim j
= , j = 0, . . . , r.
ε→0 ∂x̃ ∂ x̃j
This implies that, for any Tmax > 0,
 1  ˆq+1,ε 
 ∂f (t, η) ∂ fˆq+1,0 (t, 0) 
max  −  dη = θ(ε), j = 0, . . . , r.
t∈[0,Tmax ] 0  ∂ x̃j ∂ x̃j 
(10.2.34)
Now, for t ∈ [0, Tmax ] and q ≥ 1, it follows from the mean value theorem that
 
   r  1  ˆq,ε 
 ˆq,ε   ∂ f (t, η) 
f (t, 1) − fˆ (t, 0) ≤
q,0
  × |μq,ε (t − γj )| dη
0  ∂ x̃ j 
j=0

≤ (r + 1)∂fmax
q
O(ε) = O(ε), (10.2.35)

where ∂fmax q
is an upper bound for the norm of ∂ fˆq,ε (s, η)/∂ x̃j , j =
0, 1, . . . , r. Furthermore, for t , t ∈ [0, Tmax ] and q ≥ 1,
  
   r  t  ˆq,0   dξ q,0 (t − γ ) 
 ˆq,0  ˆq,0    ∂ f (t, 0)   j 
f (t , 0) − f (t , 0) ≤  ×  dt
t   ∂ x̃ j  dt
j=0

≤ (r + 1)∂fmax
q i
max fmax |t − t |
i=0,...,q
 
= O(1) |t − t | , (10.2.36)
i
where, for each i = 0, . . . , q, the existence of the upper bound fmax for the
ˆi,ε
norm of f (s, η) is ensured by the uniform boundedness of ξ (·) i,ε
and the

continuity of the functions f i and φ. Choose Tmax > max τqε , τq0 . Then, it
follows from (10.2.24), (10.2.35) and (10.2.36) that
 
max(τqε ,τq0 )
 ˆq,ε  
f (t, 1) − fˆq,0 τq0 , 0  dt
min(τqε ,τq0 )
  
max(τqε ,τq0 ) *
 ˆq,ε    +
≤ f (t, 1) − fˆq,0 (t, 0) + fˆq,0 (t, 0) − fˆq,0 τq0 , 0  dt
min(τqε ,τq0 )
"  #  
≤ O(ε) + O(1) τqε − τq0  τqε − τq0  = O(ε2 ). (10.2.37)

Similarly, we can show that


 
max(τqε ,τq0 )
 ˆq+1,ε  
f (t, 1) − fˆq+1,0 τq0 , 0  dt = O(ε2 ). (10.2.38)
min(τqε ,τq0 )
406 10 Time-Lag Optimal Control Problems

Finally, we have
 
max(τqε ,τq0 )  −1 q+1,ε  τq0  −1 q+1,ε 
ε μ (t) − Λk (t) dt ≤ ε μ (t) − Λk (t) dt
−γ̄ −γ̄
 max(τqε ,τq0 )  −1 q+1,ε 
+ ε μ (t) − Λk (t) dt. (10.2.39)
τq0

Now, choose Tmax > max(τqε , τq0 ). Then, from (10.2.15), we have

ε−1 μq+1,ε (·) = O(1) uniformly on [−γ̄, max(τqε , τq0 )].

Furthermore, by (10.2.15) and (10.2.16a) for i = q, it follows that, for almost


every t ∈ [−γ̄, τq0 ),

ε−1 μq+1,ε (t) = ε−1 μq,ε (t) → Λk (t) as ε → 0. (10.2.40)

Hence, by Theorem A.1.10 (Lebesgue dominated convergence theorem), the


first integral on the right hand side of (10.2.39) converges to zero as ε → 0.
For the second integral, it follows from (10.2.24) that
 max(τqε ,τq0 )  −1 q+1,ε 
ε μ (t) − Λk (t) dt
−γ̄
 
≤ θ(ε) + O(1) + max |Λk (t)| · τqε − τq0 
t∈[−γ̄,Tmax ]

= θ(ε) + O(ε) = θ(ε). (10.2.41)

Remark 10.2.3 Note that (10.2.34), (10.2.38), (10.2.39) and (10.2.41) will
be used in the proof of (10.2.16a) for i = q + 1.

Proof of (10.2.16a)
Consider the case of t < τq0 . Then, by (10.2.16b) for i = q (which is valid
under induction hypothesis), we have t < τqε for all ε of sufficiently small
magnitude and hence ξ q+1,ε (t) = ξ q,ε (t). If t = τl0 , l = 0, . . . , q − 1, then it is
clear from (10.2.16a) for i = q that

ξ q+1,ε (t) − ξ q+1,0 (t)


lim
ε→0 ε
ξ q,ε (t) − ξ q,0 (t)
= lim = lim ε−1 μq,ε (t) = Λqk (t− ). (10.2.42)
ε→0 ε ε→0

Therefore, (10.2.16a) for i = q + 1 is established for the case of t < τq0 .


We now consider the case of t > τq0 . From (10.2.27), we have, for t ≥
max(τqε , τq0 ),
10.2 Time-Lag Optimal Control with State-Dependent Switched System 407
 τqε * +
μ q+1,ε
(t) = μ q,ε
(τq0 ) + fˆq,ε (ω, 1) − fˆq+1,ε (ω, 1) dω
τq0
r 
 
t 1
∂ fˆq+1,ε (ω, η) q+1,ε
+ μ (ω − γj )dηdω, (10.2.43)
j=0 τq0 0 ∂ x̃j

provided that τq0 > τq−1ε


when q ≥ 1, which is satisfied when ε is of sufficiently
small magnitude (see the proof of (10.2.15)). Now, choose an arbitrary Tmax >
max(τqε , τq0 ). Then, for t ≤ Tmax , it follows from (10.2.34) and (10.2.15) that
 
t 1
∂ fˆq+1,ε (ω, η) q+1,ε
μ (ω − γj )dηdω
τq0 0 ∂ x̃j
 t
∂ fˆq+1,0 (ω, 0) q+1,ε
= μ (ω − γj )dω + θ(ε)O(ε). (10.2.44)
τq0 ∂ x̃j

Now, by (10.2.37) and (10.2.38), we can show that


 τqε * +
fˆq,ε (ω, 1) − fˆq+1,ε (ω, 1) dω
τq0
 *    +
= τqε − τq0 fˆq,0 τq0 , 0 − fˆq+1,0 τq0 , 0 + O(ε2 ). (10.2.45)

Using (10.2.44) and (10.2.45), the expression for μq+1,ε(t) can be simplified.
More specifically, for all times t satisfying max τqε , τq0 ≤ t ≤ Tmax , we can
deduce that
   *    +
μq+1,ε (t) = μq,ε τq0 + τqε − τq0 fˆq,0 τq0 , 0 − fˆq+1,0 τq0 , 0
r  t
∂ fˆq+1,0 (ω, 0) q+1,ε
+ j
μ (ω − γj )dω + O(ε2 ) + θ(ε)O(ε).
j=0 τ 0
q
∂ x̃
(10.2.46)

Now, the solution of the variational system (10.2.13) on (τq0 , τq+1


0
] can be
expressed as

  r t
∂ fˆq+1,0 (ω, 0)
Λk (t− ) = Λk τq0+ + Λk (ω − γj )dω,
j=0 τq0 ∂ x̃j
 %
t ∈ τq0 , τq+1
0
. (10.2.47)

Multiplying (10.2.46) by ε−1 and then subtracting (10.2.47), we obtain

ε−1 μq+1,ε (t) − Λk (t− )


     *    +
= ε−1 μq,ε τq0 − Λk τq0+ + ε−1 τqε − τq0 fˆq,0 τq0 , 0 − fˆq+1,0 τq0 , 0
408 10 Time-Lag Optimal Control Problems
r 
 t
∂ fˆq+1,0 (ω, 0) " −1 q+1,ε #
+ ε μ (ω − γj ) − Λk (ω − γj ) dω + θ(ε),
j=0 τq0 ∂ x̃j
    0 %
t ∈ max τqε , τq0 , min τq+1 , Tmax , (10.2.48)

which holds when ε is of sufficiently small magnitude. Hence, by taking the


norm of both sides and changing the variable of integration in the last integral,
we deduce that
 −1 q+1,ε 
ε μ (t) − Λk (t− )
 t

q+1  −1 q+1,ε

≤ λq,ε + (r + 1)∂fmax ε μ (ω) − Λk (ω) dω + θ(ε),
−γ̄
    0 %
t ∈ max τqε , τq0 , min τq+1 , Tmax , (10.2.49)
q+1
where ∂fmax is as defined in the proof of (10.2.15) and
    

λq,ε =ε−1 μq,ε τq0 − Λk τq0+
 *    + 
+ ε−1 τqε − τq0 fˆq,0 τq0 , 0 − fˆq+1,0 τq0 , 0 . (10.2.50)

Next, by virtue of (10.2.16a) and (10.2.16c) for i = q and the jump condi-
tion (10.2.13d) in the variational system (10.2.13), we have
4 5

 −1  q,ε  0   0− 

τqε − τq0 ∂τq (ζ)
q,ε
lim λ = lim ε μ τ q − Λ k τq + − ·
ε→0 ε→0 ε ∂ζk
*    + 
fˆq,0 τq0 , 0 − fˆq+1,0 τq0 , 0  = 0. (10.2.51)

Then, from (10.2.41) and (10.2.51), it is clear that


 −1 q+1,ε 
ε μ (t) − Λk (t− )
 t

q+1  −1 q+1,ε

≤ θ(ε) + (m + 1)∂fmax ε μ (ω) − Λk (ω) dω,
max(τqε ,τq0 )
    0 %
t ∈ max τqε , τq0 , min τq+1 , Tmax . (10.2.52)

Thus, by Theorem A.1.19 (Gronwall-Bellman Lemma), we obtain


 −1 q+1,ε   
ε μ (t) − Λk (t− ) ≤ θ(ε) exp (r + 1)∂fmax
q+1
Tmax ,
    0 %
t ∈ max τqε , τq0 , min τq+1 , Tmax , (10.2.53)

which holds
 for 0all %ε of sufficiently small magnitude. Finally,
 for any
 0 fixed time%
point t ∈ τq0 , τq+1 , we can choose Tmax > t so that t ∈ τqε , min τq+1 , Tmax
when the magnitude of ε is sufficiently small. Thus, by (10.2.53), we have
10.2 Time-Lag Optimal Control with State-Dependent Switched System 409

ε−1 μq+1,ε (t) → Λk (t− ).

This completes the proof of (10.2.16a) for i = q + 1.


Proof of (10.2.16b)
Consider the case of q = N −1. Then, τq+1 ε 0
= τq+1 = ∞. Clearly, (10.2.16b)
for i = q + 1 holds.
It remains to consider the case of q < N −1. By virtue of Assumption 10.2.2
0
and the definition of τq+1 , there exists a δ > 0 such that τq0 −δ < τq0 < τq+1
0
−δ
and   $ %
hq+1 ξ q+1,0 (t) = 0, t ∈ τq0 − δ, τq+1 0
−δ , (10.2.54)
where τq+1 0
− δ = ∞ if τq+1 0
= ∞. For any Tmax > τq0 , we recall that
ξ q+1,ε
(t) → ξ q+1,0
(t) uniformly on [−γ̄, Tmax ] as ε → 0 (see the proof
of (10.2.15)). Thus, it follows from (10.2.16b) that
  $  0 %
hq+1 ξ q+1,ε (t) = 0, t ∈ τqε , min τq+1 − δ, Tmax , (10.2.55)
0
when ε is of sufficiently small magnitude. If τq+1 = ∞, then (10.2.55) becomes
  $ %
hq+1 ξ q+1,ε (t) = 0, t ∈ τqε , Tmax , (10.2.56)
ε
which implies τq+1 ≥ Tmax . Since Tmax was chosen arbitrarily, we can take
Tmax → ∞ and hence τq+1 ε
→ ∞ as ε → 0. Thus, for the case of τq+1
0
= ∞,
the validity of (10.2.16b) for i = q + 1 is established.
We now consider the case of τq+10
< ∞. Clearly,
  0 
hq+1 ξ q+1,0 τq+1 = 0. (10.2.57)

Next, by Assumption 10.2.3, we have


  0 
∂hq+1 ξ q+1,0 τq+1  0 
fˆq+1,0 τq+1 ,0
∂x  
∂hq+1 x τq+10

= ·
  ∂x     0 
f q+1 x τq+1
0
| ζ , x τq+1
0
− γ1 | ζ , . . . , x τq+1 − γr | ζ
= 0. (10.2.58)

Thus, by continuity, δ > 0 in (10.2.56) may be chosen such that


 
d "  q+1,0 # ∂hq+1 ξ q+1,0 (t) dξ q+1,0 (t)
hq+1 ξ (t) =
dt ∂x  dt
∂hq+1 ξ q+1,0 (t) ˆq+1,0
= f (t, 0) = 0,
∂x  0 
t ∈ τq+1 − δ, τq+1
0
+δ . (10.2.59)
410 10 Time-Lag Optimal Control Problems
 
This shows that hq+1 ξ q+1,0 (·)
 is either strictly increasing or strictly de-
0
creasing on τq+1 − δ, τq+1
0
+ δ . Therefore, in view of (10.2.58) and (10.2.59),
 
hq+1 ξ q+1,0 (·) has different sign at τq+1
0
− δ and τq+1
0
+ δ. This implies that
   
hq+1 ξ q+1,0 (τq+1
0
− δ) · hq+1 ξ q+1,0 (τq+1
0
+ δ) < 0. (10.2.60)

Choose Tmax > τq+1 0


+ δ. Note that ξ q+1,ε (t) → ξ q+1,0 (t) uniformly on
[−γ̄, Tmax ] as ε → 0. Thus,
   
hq+1 ξ q+1,ε (τq+1
0
− δ) · hq+1 ξ q+1,ε (τq+1
0
+ δ) < 0, (10.2.61)
 
whenε is sufficiently
 small. This implies that hq+1 ξ q+1,ε (·) , for example,
hq+1 ξ q+1,0 (·) , has different sign at τq+1
0
−δ and τq+1
0
+δ when ε is sufficiently
small. Combining (10.2.55) and (10.2.61), it gives
0
τq+1 − δ < τq+1
ε 0
< τq+1 + δ.
Thus, the conclusion of (10.2.16b) follows readily by taking δ → 0.
Proof of (10.2.16c)
Since (10.2.16c) is not applicable when i = N , it suffices to consider the
case of q < N − 1. To begin, define
  ε   0 
-
hεq+1 (η) = hq+1 ηξ q+1,ε τq+1 + (1 − η)ξ q+1,0 τq+1 (10.2.62)

and let ∂ -
hεq+1 (η)/∂x denote the respective partial derivative. By Taylor’s
theorem, there exists a constant ηε ∈ (0, 1) such that

0=-
hεq+1 (1) − -
h0q+1 (0)
∂-
hεq+1 (ηε ) " q+1,ε ε #
= ξ (τq+1 ) − ξ q+1,0 (τq+1
0
) . (10.2.63)
∂x
Now, since τqε → τq0 < τq+1 0
as ε → 0, we have τqε < τq+1
0
when ε is of
sufficiently small magnitude. Thus,
 ε   0 
ξ q+1,ε τq+1 − ξ q+1,0 τq+1
 ε   0   0 
= ξ q+1,ε τq+1 − ξ q+1,ε τq+1 + μq+1,ε τq+1
 τq+1
ε
 0 
= fˆq+1,ε (ω, 1)dω + μq+1,ε τq+1
0
τq+1

  1  ε   0 
= ε
τq+1 − 0
τq+1 fˆq+1,ε ητq+1 + (1 − η)τq+1
0
, 1 dη + μq+1,ε τq+1 .
0
(10.2.64)

Substituting (10.2.64) into (10.2.63) and rearranging, we obtain


10.2 Time-Lag Optimal Control with State-Dependent Switched System 411

  ∂-
hεq+1 (ηε ) 1 q+1,ε  ε 
ε
τq+1 − 0
τq+1 fˆ ητq+1 + (1 − η)τq+1
0
, 1 dη
∂x 0
∂-
hεq+1 (ηε ) q+1,ε  0 
=− μ τq+1 , (10.2.65)
∂x
which holds for all ε of sufficiently small magnitude.  ε 
ε
Clearly, since τq+1 → τq+1
0
as ε → 0, we can choose Tmax > max τq+1 0
, τq+1 .
Thus, by Assumption 10.2.3 and (10.2.15), we obtain

∂-
hεq+1 (ηε ) 1 q+1,ε  ε 
lim fˆ ητq+1 + (1 − η)τq+1
0
, 1 dη
ε→0 ∂x 0
-
∂ h0q+1 (0) q+1,0  0 
= fˆ τq+1 , 0 = 0. (10.2.66)
∂x
Therefore, (10.2.65) can be arranged such that

∂-
hεq+1 (ηε ) q+1,ε 0
ε
τq+1 − τq+1
0
=− μ (τq+1 )
4 ∂x  5
∂- hεq+1 (ηε ) 1 q+1,ε  ε 
÷ fˆ ητq+1 + (1 − η)τq+1 , 1 dη .
0
∂x 0
(10.2.67)

Now, dividing both sides of (10.2.67) by ε, it gives


ε
τq+1 − τq+1
0
∂-
hεq+1 (ηε ) −1 q+1,ε  0 
=− ε μ τq+1
ε 4 ∂x 5

∂-hεq+1 (ηε ) 1 q+1,ε  ε 
÷ ˆ
f ητq+1 + (1 − η)τq+1 , 1 dη .
0
∂x 0
(10.2.68)

Thus, by (10.2.16a) and (10.2.16b), we have


4 5
ε
τq+1 − τq+1
0
∂-
h0q+1 (0)  0−  ∂-h0q+1 (0) q+1,0  0 
lim =− Λk τq+1 ÷ fˆ τq+1 , 0 .
ε→0 ε ∂x ∂x
(10.2.69)
Therefore, the validity of (10.2.16c) for i = q + 1 is established. Now, the
proof of Lemma 10.2.1 is complete.
412 10 Time-Lag Optimal Control Problems

10.2.4 Main Results

We are now in a position to derive formulas for the state variation matrix
and the partial derivatives of the switching times with respect to each of the
components of the system parameter vector.
Theorem 10.2.2 For each ζ ∈ Z and for each k = 1, . . . , s, it holds that

∂x(t | ζ)
= Λk (t), t ∈ (τi−1 , τi ), i ∈ ZN , (10.2.70)
∂ζk
and
∂τi (ζ) ∂hi (x(τi ))
=− Λk (τi− )
∂ζk ∂x
∂hi (x(τi )) i
÷ f (x(τi ), x(τi − γ1 ), . . . , x(τi − γm )) ,
∂x
i ∈ ZN −1 and τi < ∞, (10.2.71)

where Λk (·) satisfies the variational system described by the differential equa-
tions (10.2.13a) with initial condition (10.2.13b)–(10.2.13c) and jump con-
dition (10.2.13d).

Proof. Note that (10.2.16a) and (10.2.16b) are valid. Thus, given any t ∈
(τi−1 , τi ), i ∈ ZN , we have t < τiε for all ε of sufficiently small magnitude,
and hence x(t | ζ + εek ) = ξ i,ε (t). This implies that for t < τiε and for all ε
of sufficiently small magnitude,

∂x (t | ζ) x(t | ζ + εek ) − x(t | ζ)


= lim
∂ζk ε→0 ε
= lim ε−1 μi,ε (t) = Λk (t− ) = Λk (t). (10.2.72)
ε→0

Therefore, the validity of (10.2.70) is established. Now, for each i ∈ ZN −1


with τi being finite, we
 recall that (10.2.16c) is satisfied.
Since ξ i,0 τi0 − γj = x(τi − γj | ζ), j = 0, . . . , r, it follows that

∂τi (ζ) τ ε − τi0


= lim i
∂ζk ε→0 ε
 i,0  0 
∂hi ξ τi  
=− Λk τi0−
4 ∂x 5
  
∂hi ξ i,0 τi0  i,0  0  i,0  0   0 
÷ i
f ξ τi , ξ τi − γ 1 , . . . , ξ i,0
τi − γ r
∂x
∂hi (x(τi ))  − 
=− Λ k τi
∂x
10.2 Time-Lag Optimal Control with State-Dependent Switched System 413

∂hi (x(τi )) i
÷ f (x(τi ), x(τi − γ1 ), . . . , x(τi − γr )) . (10.2.73)
∂x

Thus, (10.2.71) is established.


Therefore, the conclusions of the theorem follow readily from (10.2.16a)–
(10.2.16c). This completes the proof.
The switching times t = τi , i ∈ ZN −1 are deliberately excluded from
equation (10.2.70) in Theorem 10.2.2. This is because the state variation
matrix only exists at the switching times in rare circumstances, as the next
result shows.
$ %
Theorem 10.2.3 Let ζ ∈ Z and let Λk (·) = Λ1k , . . . , ΛN k denote the solu-
tion of the variational system governed by the differential equations (10.2.13a)
with initial condition (10.2.13b)–(10.2.13c) and jump condition (10.2.13d)
corresponding to ζ. Then, for each k = 1, . . . , s and each i ∈ ZN −1 with
τi < ∞, one of the following scenarios holds:
(i) Suppose that

f i (x(τi ), x(τi − γ1 ), . . . , x(τi − γr ))


= f i+1 (x(τi ), x(τi − γ1 ), . . . , x(τi − γr )) (10.2.74a)

or
∂τi (ζ)/∂ζk = 0. (10.2.74b)
Then
∂x(τi | ζ)
= Λk (τi+ ) = Λk (τi− ). (10.2.75)
∂ζk
(ii) Suppose that

f i (x(τi ), x(τi − γ1 ), . . . , x(τi − γm ))


= f i+1 (x(τi ), x(τi − γ1 ), . . . , x(τi − γm )) (10.2.76a)

and
∂τi (ζ)/∂ζk > 0. (10.2.76b)
Then,
 
∂ ± x(τi | ζ) x τi | ζ + εek − x(τi | ζ)
= lim = Λk (τi∓ ).
∂ζk ε→0± ε
(10.2.77)
(iii) Suppose that

f i (x(τi ), x(τi − γ1 ), . . . , x(τi − γr ))


= f i+1 (x(τi ), x(τi − γ1 ), . . . , x(τi − γr )) (10.2.78a)
414 10 Time-Lag Optimal Control Problems

and
∂τi (ζ)/∂ζk < 0. (10.2.78b)
Then,
 
∂ ± x(τi | ζ) x τi | ζ + εek − x(τi | ζ)
= lim = Λk (τi± ).
∂ζk ε→0± ε
(10.2.79)

Proof. Consider i ∈ ZN −1 with τi < ∞ and let k ∈ {1, . . . , s}. Then, from
the auxiliary system (10.2.5), we have
   
x τi0 | ζ + εek − x τi0 | ζ
 τi0 * +
 0  0
=ξ i,ε
τi − ξ i,0
τi + fˆi+1,ε (ω, 1) − fˆi,ε (ω, 1) dω
min(τiε ,τi0 )
 τi0 * +
 
= μi,ε τi0 + fˆi+1,ε (ω, 1) − fˆi,ε (ω, 1) dω, (10.2.80)
min(τiε ,τi0 )

when |ε| is sufficiently small such that τi0 < τi+1


ε
.
Now, by (10.2.37) and (10.2.38), which were established in the proof of
(10.2.15), we obtain
   
x τi0 | ζ + εek − x τi0 | ζ
  "  # *    +
= μi,ε τi0 + τi0 − min τiε , τi0 · fˆi+1,0 τi0 , 0 − fˆi,0 τi0 , 0 + O(ε2 ).
(10.2.81)

Hence,
   
x τi0 | ζ + εek − x τi0 | ζ
ε  
−1 i,ε
  τi0 − min τiε , τi0 * i+1,0  0   +
=ε μ 0
τi + · fˆ τi , 0 − fˆi,0 τi0 , 0 + O(ε).
ε
(10.2.82)

If ∂τi /∂ζk > 0, then τiε → τi0± as ε → 0± . Thus,


 
τ 0 − min τiε , τi0 τ 0 − min(τiε , τi0 ) ∂τi
lim i = 0, lim i =− . (10.2.83)
ε→0+ ε ε→0− ε ∂ζk
   
Therefore, since ε−1 μi,ε τi0 → Λk τi0− by (10.2.16a), we have
   
x τi0 | ζ + εek − x τi0 | ζ  
lim+ = Λk τi0− (10.2.84)
ε→0 ε
and
10.2 Time-Lag Optimal Control with State-Dependent Switched System 415
   
x τi0 | ζ + εek − x τi0 | ζ
lim−
ε→0 ε
 0−  ∂τi (ζ) * i,0  0   +  
= Λ k τi + · fˆ τi , 0 − fˆi+1,0 τi0 , 0 = Λk τi0+ .
∂ζk
(10.2.85)

Similarly, if ∂τi /∂ζk < 0, then



x τi0 | ζ + εek − x (τi0 | ζ)
lim
ε→0+ ε
  ∂τ (ζ)    
0− i
= Λk τ i + · fˆi,0 τi0 , 0 − fˆi+1,0 τi0 , 0 = Λk τi0+ (10.2.86)
∂ζk

and    
x τi0 | ζ + εek − x τi0 | ζ  
lim− = Λk τi0− . (10.2.87)
ε→0 ε
   
Finally, suppose that either ∂τi /∂ζk = 0 or fˆi+1,0 τi0 , 0 = fˆi,0 τi0 , 0 is
satisfied. Then,
   
Λk τi0+ = Λk τi0−
and
     
∂x τi0 | ζ x τi0 | ζ + εek − x τi0 | ζ
= lim
∂ζk ε→0 ε
 0−   0+ 
= Λ k τi = Λ k τi . (10.2.88)

The proof is complete.

Remark 10.2.4 In the first scenario of Theorem 10.2.3, the state variation
exists at t = τi . In the last two scenarios (the more likely scenarios), the
state variation does not exist at t = τi due to the facts that the left and
right partial derivatives of the state with respect to the kth component of the
system parameter are different, indicating that Λk (·) is discontinuous at the
ith switching time.

Let ζ ∈ Z be an arbitrary parameter vector. Then, under Assump-


tions 10.2.2 and 10.2.3, it follows from (10.2.15) and (10.2.16b) that ξ i (·),
i ∈ ZN , and τi (·), i ∈ ZN −1 , are continuous with respect to system parame-
ter vector at ζ along each coordinate axis of the space Rs . However, this does
not necessarily imply the continuity (in the space Rs ) at ζ. For continuity, it
is to be shown as follows. Consider a perturbed parameter vector ζ + σ ∈ Z
and let ξ i,σ (·), τiσ (·), and μi,σ (·) denote the analogues of ξ i,ε (·), τiε (·),
and μi,ε (·) with σ replacing εek . Then, the proofs for (10.2.14), (10.2.15)
and (10.2.16c) can be easily modified to prove that the following versions
of (10.2.14), (10.2.15) and (10.2.16b) for each i are valid.
416 10 Time-Lag Optimal Control Problems
 i,σ 
max ξ (t) = O(1) for every Tmax > 0, (10.2.89)
t∈[−γ̄,Tmax ]
 
max μi,σ (t) = θ(σ) for every Tmax > 0, (10.2.90)
t∈[−γ̄,Tmax ]

lim τiσ = τi0 , (10.2.91)


σ→0

where O(1) in (10.2.89) means that the left hand side is uniformly bounded
with respect to σ, and θ(σ) in (10.2.90) means that the left hand side
converges to zero as σ → 0. In (10.2.90) and (10.2.91), the convergence
σ → 0 can be along any path to the origin, but in the original (10.2.15)
and (10.2.16b), the convergence is restricted to be along one of the coordi-
nate axes. Choose Tmax > T . Then, by virtue of (10.2.90) for i = N , we
have

|x(T | ζ + σ) − x(T | ζ)|


 
= ξ N,σ (T ) − ξ N,0 (T )
   N,σ 
= μN,σ (T ) ≤ max μ (t) = θ(σ). (10.2.92)
t∈[−γ̄,Tmax ]

This shows that x(T | ζ + σ) → x(T | ζ) as σ → 0. On this basis, since Φ is


continuous, we have

g0 (ζ + σ) = Φ(x(T | ζ + σ)) → Φ(x(T | ζ)) = g0 (ζ) as σ → 0. (10.2.93)

This shows that the cost functional g0 is continuous. Note that the proof
of (10.2.90) must be carried out simultaneously with (10.2.89) and (10.2.91)
via induction as for the proof of Lemma 10.2.1.
We are now in a position to present the right and left partial derivatives
of the cost functional g0 with respect to the system parameter vector ζ as
given in the following theorem:
Theorem 10.2.4 For each ζ ∈ Z, it holds that
 
x T | ζ + εek → x(T | ζ) as ε → 0± , (10.2.94)
and, for each k = 1, . . . , s,

∂ ± g0 (ζ)
∂ζk
 
g0 ζ + εek − g0 (ζ)
= lim
ε→0± ε
 
∂Φ(x(T | ζ)) x T | ζ + εek − x(T | ζ)
= lim
∂x ε→0± ε
10.2 Time-Lag Optimal Control with State-Dependent Switched System 417

⎪Λk (T | ζ), if τi (ζ) = T , i = 1, . . . , N − 1,
∂Φ(x(T | ζ)) ⎨
= Λk (T ∓ | ζ), if τi (ζ) = T and ∂τi (ζ)/∂ζk ≥ 0,
∂x ⎪

Λk (T ± | ζ), if τi (ζ) = T and ∂τi (ζ)/∂ζk ≤ 0,
(10.2.95)

where Λk (· | ζ) is the solution of the variational system (10.2.13) correspond-


ing to ζ.

Proof. By Theorems 10.2.2 and 10.2.3, we can derive the left and right partial
derivatives of the cost functional g0 . First, by Taylor’s theorem, there exists,
for each ε = 0 and each k = 1, . . . , s, a constant ηε,k ∈ (0, 1) such that
  
∂Φ (1 − ηε,k )x(T | ζ) + ηε,k x T | ζ + εek
g0 (ζ + εe ) − g0 (ζ) =
k
·
"   ∂x #
x T | ζ + εek − x(T | ζ) . (10.2.96)

Collectively, by Theorems 10.2.2 and 10.2.3, the existence of the right and
left partial derivatives of the system state with respect to each component
of the system parameter is assured under Assumptions 10.2.2 and 10.2.3
Thus, (10.2.94) and (10.2.95) are valid. The proof is complete.

Note from (10.2.95) that the left and right partial derivatives of g0 exist at
all ζ ∈ Z under Assumptions 10.2.2 and 10.2.3. In practice, these assumptions
can be easily checked for a given ζ ∈ Z by numerically solving the switched
time-delay system (10.2.1). Note that if T coincides with a switching time
satisfying one of the last two scenarios in Theorem 10.2.3, then the left and
right partial derivatives of g0 with respect to ζk may differ, since in this case
Λk (T − ) = Λk (T + ).
Since g0 has well-defined left and right partial derivatives (under Assump-
tions 10.2.2 and 10.2.3), it is continuous under Assumptions 10.2.2 and 10.2.3.
If these assumptions hold at every point in the compact set Z, then Prob-
lem (P2 ) is guaranteed to admit an optimal solution. This result is summa-
rized below.
Theorem 10.2.5 Problem (P2 ) admits an optimal solution.

The left and right partial derivatives of g0 , as defined in (10.2.95), can


be used to identify search directions at a given system parameter vector ζ
during the optimization process. Indeed, if ∂ + g0 (ζ)/∂ζk < 0, then ek is a
descent direction of g0 at ζ, and if ∂ − g0 (ζ)/∂ζk > 0, then −ek is a descent
direction of g0 at ζ. Performing a line search along a descent direction will
yield an improved point with lower cost.
418 10 Time-Lag Optimal Control Problems

If none of the switching times coincide with the terminal time, or if the
conditions for the first scenario in Theorem 10.2.3 are satisfied at the terminal
time, then the left and right partial derivatives of g0 derived above become
the full partial derivatives as shown in (10.2.9). We now present the following
line search optimization algorithm for solving Problem (P2 ).
Algorithm 10.2.1

1. Choose an initial point ζ ∈ Z.


2. Form an expanded switched time-delay system by combining the state sys-
tem (10.2.1) with the variational system (10.2.13) for each k = 1, . . . , s.
3. Solve the expanded system sub-system by sub-system, checking Assump-
tions 10.2.2 and 10.2.3 at the start and end of each sub-system. If these
assumptions are violated at any stage, then stop with error.
4. Use x(· | ζ) and Λk (· | ζ), k = 1, . . . , s, to determine the left and right
partial derivatives of g0 according to (10.2.95).
5. Use ∂ ± g0 (ζ)/∂ζk , k = 1, . . . , s, to check local optimality conditions at ζ.
If the local optimality conditions hold, then stop; otherwise, continue to
Step 6.
6. Use ∂ ± g0 (ζ)/∂ζk , k = 1, . . . , s, to define a search direction.
7. Perform a line search along the direction from Step 6 to determine a new
point ζ  ∈ Z.
8. Set ζ  → ζ and return to Step 2.

In most cases, the partial derivatives of g0 will exist and Steps 5–7 can
be implemented using well-known methods in nonlinear optimization (see
Chapter 2). If any of the full partial derivatives of g0 does not exist (i.e., one
of the last two scenarios in Theorem 10.2.3 occurred at the terminal time),
then the signs of the left and right partial derivatives can be used to identify
an appropriate descent direction along one of the coordinate axes.

10.2.5 Numerical Example

We consider a fed-batch fermentation process for converting glycerol to 1,3-


propanediol (1,3-PD). This process switches between two modes: batch mode
(during which there is no input feed) and feeding mode (during which glycerol
and alkali are added continuously to the fermentor). The switching of mode
occurs when the concentration of glycerol reaches certain lower and upper
thresholds. Moreover, since nutrient metabolization does not immediately
lead to the production of new biomass, the fermentation process involves a
time-delay.
10.2 Time-Lag Optimal Control with State-Dependent Switched System 419

The model is based on the work in [151, 197]. Let x(t) = [x1 (t), x2 (t), x3 (t),
x4 (t)] , where t is time (hours). Here, x1 (t) is the biomass concentration
(g L−1 ), x2 (t) is the glycerol concentration (mmol L−1 ), x3 (t) is the 1,3-PD
concentration (mmol L−1 ) and x4 (t) is the fluid volume (L). The process
dynamics due to natural fermentation are described by
⎡ dx (t) ⎤ ⎡ ⎤
μ(x2 (t), x3 (t))x1 (t − γ1 )
1
dt
⎢ dx2 (t) ⎥ ⎢
⎢ dt ⎥ ⎢ −q2 (x2 (t), x3 (t))x1 (t − γ1 ) ⎥ ⎥ = f ferm (x(t), x1 (t − γ1 )),
⎢ dx3 (t) ⎥ = ⎣
⎣ dt ⎦ q3 (x2 (t), x3 (t))x1 (t − γ1 ) ⎦
dx4 (t) 0
dt
(10.2.97)
where γ1 = 0.1568 is the time-delay; μ(·, ·) is the cell growth rate; q2 (·, ·) is
the substrate consumption rate; and q3 (·, ·) is the 1,3-PD formation rate. The
process dynamics due to the input feed are
⎡ dx (t) ⎤ ⎡ ⎤
−x1 (t)
1
dt
⎢ dx2 (t) ⎥ u(t) ⎢ ⎥
⎢ dt ⎥
⎢ dx3 (t) ⎥ = ⎢ rcs0 − x2 (t) ⎥ := f feed (x(t), u(t)), (10.2.98)
⎣ dt ⎦ x4 (t) ⎣ −x3 (t) ⎦
dx4 (t) x4 (t)
dt

where u(t) is the input feeding rate (L h−1 ); r = 0.5714 is the proportion of
glycerol in the input feed and cs0 = 10762 mmol L−1 is the concentration of
glycerol in the input feed. The functions μ(·, ·), q2 (·, ·) and q3 (·, ·) in (10.2.97)
are given by
 3
Δ1 x2 (t) x2 (t) x3 (t)
μ(x2 (t), x3 (t)) = 1− ∗ 1− ∗ , (10.2.99)
x2 (t) + k1 x2 x3
Δ2 x2 (t)
q2 (x2 (t), x3 (t)) = m1 + Y1 μ(x2 (t), x3 (t)) + , (10.2.100)
x2 (t) + k2
Δ3 x2 (t)
q3 (x2 (t), x3 (t)) = −m2 + Y2 μ(x2 (t), x3 (t)) + , (10.2.101)
x2 (t) + k3

where x∗2 = 2039 mmol L−1 and x∗3 = 1036 mmol L−1 are, respectively, the
critical concentrations of glycerol and 1,3-PD, and the values of the other
parameters are given in Table 10.2.1.

Table 10.2.1: The other parameters in system

Δ1 k1 m1 Y1 Δ2 k2 m2 Y2 Δ3 k3
0.8037 0.4856 0.2977 144.9120 7.8367 9.4632 12.2577 80.8439 20.2757 38.75
420 10 Time-Lag Optimal Control Problems

Let Nfeed be an upper bound for the number of feeding modes. Since
the process starts and finishes in batch mode, the total number of potential
modes is N = 2Nfeed + 1 (Nfeed feeding modes and Nfeed + 1 batch modes).
During batch mode, there is no input feed and the process is only governed
by (10.2.97). On the other hand, the process is governed by both (10.2.97)
and (10.2.98) during feeding mode. Thus,

dx(t) f ferm (x(t), x1 (t − γ1 )), for batch mode,


=
dt f ferm (x(t), x1 (t − γ1 )) + f feed (x(t), ζi ), for ith feeding mode,
(10.2.102)
where ζi is the feeding rate during the ith feeding mode subject to the fol-
lowing boundedness constraints:

0043 ≤ ζi ≤ 1.9266. (10.2.103)

During the growth phase of the biomass, glycerol is being consumed. Since
no new glycerol is added during the batch mode, the glycerol concentration
will reduce and eventually it will become too low, and hence a switch into
feeding mode is necessary. The corresponding switching condition is

x2 (t) − ζNfeed +1 = 0, (10.2.104)

where ζNfeed +1 is the lower switching concentration. This parameter is a de-


cision parameter which is to be optimized. On the other hand, when the
glycerol concentration becomes too high during feeding mode, cell growth is
inhibited. Thus, the process must switch back into batch mode. The corre-
sponding switching condition is

x2 (t) − ζNfeed +2 = 0, (10.2.105)

where ζNfeed +2 is the upper switching concentration. This is another param-


eter to be optimized.
The bounded constraints on ζNfeed +1 and ζNfeed +2 are

50 ≤ ζNfeed +1 ≤ 260, 300 ≤ ζNfeed +2 ≤ 600. (10.2.106)

Note that the system parameters in this example appear explicitly in the
dynamics and switching conditions. Thus, to apply Theorem 10.2.2, we re-
place the system parameters with auxiliary state variables x4+k (t), k =
1, . . . , Nfeed + 2, where

dx4+k (t)
= 0, t > 0, (10.2.107)
dt
and

x4+k (t) = ζk , t ≤ 0. (10.2.108)


10.2 Time-Lag Optimal Control with State-Dependent Switched System 421

Let δki denote the Kronecker delta function and let ∂x and ∂ x̃1 denote
the partial differentiation with respect to x(t) and x1 (t − γ1 ), respectively.
Then, the variational system corresponding to ζk is

dΛk (t)
dt
⎧ ferm
⎪ ∂f
ferm

⎪ Λk (t) + ∂f∂ x̃1 Λk1 (t − γ1 ), batch mode,
⎨ ∂x

= ∂f ferm ∂f ferm (10.2.109a)



⎪ ∂x Λk (t) + ∂ x̃1 Λk1 (t − γ1 )

⎩ + ∂f feed Λ (t) + δ ∂f feed ,
∂x k ki ∂u ith, feeding mode,

with jump conditions


 
Λk τi+
4
Λk (τi− ) − ∂ζ
∂τi feed
k
f (x(τi ), ζi ), if mode i = batch
= − ∂τi feed
(10.2.109b)
Λk (τi ) + ∂ζk f (x(τi ), ζi ), if mode i = feeding.

Furthermore, for a switch from batch mode to feeding mode,


∂τi
∂ζk
4
(1 − Λk2 (τi− )) ÷ f2ferm (x(τi ), x1 (τi − γ1 )), if k = Nfeed + 1,
= (10.2.110)
−Λk2 (τi− ) ÷ f2ferm (x(τi ), x1 (τi − γ1 )), otherwise.

Similarly, for a switch from feeding mode to batch mode,


∂τi
∂ζk

⎪ (1 − Λk2 (τi− ))

⎨ ÷{f2ferm (x(τi ), x1 (τi − γ1 )) + f2feed (x(τi ), ζi )}, if k = Nfeed + 2,
= (10.2.111)
⎪ (τi− )
⎩ −Λk2

÷{f2ferm (x(τi ), x1 (τi − γ1 )) + f2feed (x(τi ), ζi )}, otherwise.

From the boundedness constraints specified in (10.2.106), we note that


ζNfeed +1 < ζNfeed +2 . Thus, Assumption 10.2.2 is satisfied at all feasible points.
For Assumption 10.2.3, we require

⎨ −q2 (x2 (τi ), x3 (τi ))x1 (τi − γ1 ), if mode i = batch,
0 = −q2 (x2 (τi ), x3 (τi ))x1 (τi − γ1 ) (10.2.112)
⎩ ζi (rcs0 −x2 (τi ))
+ x4 (τi ) , if mode i = feeding.
422 10 Time-Lag Optimal Control Problems

This condition is clearly satisfied with reasons given below: For batch mode,
the right hand side of (10.2.112) is always non-zero because in practice both
q2 and x1 are non-zero. For feeding mode, the right hand side of (10.2.112)
is also non-zero because, during feeding mode, the glycerol loss from natural
fermentation (first term) is dominated by the glycerol addition from the input
feed (second term).
Since x4 is non-decreasing and, for biologically meaningful trajectories,
μ(·, ·) is bounded, the linear growth assumption is also clearly valid.
The initial function φ for the dynamics (10.2.102) was obtained by applying
cubic spline interpolation to the experimental data reported in [197]. As in
[151], the terminal time for the fermentation process is taken as T = 24.16
hours. The upper bound for the number of feeding modes is chosen as Nfeed =
48. Our goal is to maximize the concentration of 1,3-PD at the terminal
time. Thus, the dynamic optimization problem is: Choose the parameters
ζk , k = 1, . . . , Nfeed + 2, such that the cost functional −x3 (T ) is minimized
subject to the boundedness constraints (10.2.103) and (10.2.104).
This dynamic optimization problem is solved using a FORTRAN program
that implements the gradient-based optimization procedure in Section 10.2.4.
In this program, NLPQLP [223] is used to perform the optimization itera-
tions (optimality check and line search), and LSODAR [92] is used to solve
the differential equations. Our gradient-based optimization strategy generates
critical points satisfying local optimality conditions. However, the solution
obtained is not necessarily a global optimal solution. Thus, it is necessary
to repeat the optimization process from different starting points so that a
better estimate of the global solution is obtained. We performed 100 test
runs, where each run starts from a different randomly selected initial point.
The average optimal cost over all runs is: −977.12854, and the best result,
which is obtained on run 73, is: −986.16815. For this control strategy, there
are 8 switches (5 batch modes and 4 feeding modes). The control parameters
and the respective mode durations are listed in Table 10.2.2. The optimal
state trajectories are shown in Figure 10.2.1 Due to the dilution effect from
the new input feed, the concentrations of biomass and 1,3-PD decrease during
the feeding modes. The control strategy listed in Table 10.2.2 is essentially
a state feedback strategy. It produces more 1,3-PD (an increase of 5.789%)
when compared with the time-dependent switching strategy reported in [151].
Furthermore, it requires far fewer switches. For the method reported in [151],
it requires over 1000 switches.
10.3 Min-Max Optimal Control 423
1000
6
900

800 5

700
1,3-PD [mmolL ]

4
-1

Biomass [gL -1 ]
600

500 3

400
2
300

200
1
100

0 0
0 5 10 15 20 25 0 5 10 15 20 25
Fermentation time [h] Fermentation time [h]

5.6
600

550

500

450 5.4
Glycerol [mmolL-1 ]

Volume [L]
400

350

300
5.2
250

200

150

100 5.0
0 5 10 15 20 25 0 5 10 15 20 25
Fermentation time [h] Fermentation time [h]

Fig. 10.2.1: Optimal state trajectory obtained for numerical example

Table 10.2.2: Optimal control parameters

Parameter ζ1 ζ2 ζ3 ζ4 ζ49 ζ50


Optimal values 1.62662 1.36951 1.45283 1.64830 245.76512 581.35390

Remark 10.2.5 ζ1 , . . . , ζ4 are the optimal feeding rates and ζ49 and ζ50 are
the optimal switching concentrations. The optimal values of ζ5 , . . . , ζ48 are
irrelevant because they represent the feeding rates after the terminal time.

10.3 Min-Max Optimal Control

In this section, a class of min-max optimal control of linear continuous dy-


namical systems with uncertainty and quadratic terminal constraints is con-
sidered. It is shown that this min-max optimal control problem is transformed
into a form, which can be solved via solving a sequence of semi-definite pro-
gramming problems. This section is basically from [287].
424 10 Time-Lag Optimal Control Problems

10.3.1 Problem Statement

Consider the continuous linear uncertain dynamical system defined on the


time horizon [0, T ] given below:

dx (t)
= A (t) x (t) + B (t) u (t) + C (t) w (t) , t ∈ [0, T ] , (10.3.1)
dt
x (0) = x0 ,

where x (t) ∈ Rn is the state vector, u (t) ∈ Rm is the control vector, w (t) ∈
Rr is the disturbance, and A (t) = [ai,j (t)], B (t) = [bi,j (t)], C (t) = [ci,j (t)]
are matrices with appropriate dimensions, and x0 is a given initial condition.
The cause of the disturbance w(t) in (10.3.1) can be due to the changes
in external environment or errors in measurement. As in [23, 30], we assume
that the disturbance w(t) ∈ Wρ , where Wρ is a L2 -norm bounded set defined
by * +
2
Wρ = w ∈ L2 ([0, T ] , Rr ) : w ≤ ρ2 , (10.3.2)
2 3T 
where w = 0 (w(t)) w (t) dt and ρ > 0 is a given positive constant.
Furthermore, the control u is restricted to be chosen from Uδ which is a
L2 -norm bounded set defined by
* +
2
Uδ = u ∈ L2 ([0, T ] , Rm ) : u ≤ δ 2 , (10.3.3)

2 3T 
where u = 0 (u(t)) u (t) dt and δ > 0 is a given positive constant.
The cost functional is considered to be a quadratic function given below:
 T * +
 
J (u, w) = (x (t)) Q (t) x (t) + (u (t)) R (t) u (t) dt, (10.3.4)
0

where Q (t) = [qi,j (t)] and R (t) = [ri,j (t)] , t ∈ [0, T ] are matrices with
appropriate dimensions. It is assumed that the following terminal state con-
straint is satisfied:
x(T ) − x∗  ≤ γ, ∀ w ∈ Wρ , (10.3.5)
where x∗ is the desired terminal state and γ > 0 is a given constant. Any ele-
ment u ∈ Uδ is called a feasible control if the terminal state constraint (10.3.5)
is satisfied.
We may now state our optimal control problem formally as follows.
Problem (P3 ). Given the dynamical system (10.3.1) and the terminal
state constraint (10.3.5), find a control u ∈ Uδ such that the worst-case
performance J(u, w) is minimized over Uδ , i.e., finding a control u ∈ Uδ such
that it solves the following min-max optimal control problem:

min max J(u, w). (10.3.6)


u∈Uδ w∈Wρ
10.3 Min-Max Optimal Control 425

Clearly, without the presence of the disturbance w in (10.3.1), Problem


(P3 ) is easy to solve by existing optimal control methods, such as the control
parametrization method used in conjunction with the time scaling trans-
formation presented in earlier chapters. However, in the presence of distur-
bances, Problem (P3 ) becomes much more complicated. In this section, we
shall develop a computational scheme for solving this min-max optimal con-
trol problem. To continue, assume that the matrices A (t) , B (t) , C (t) , Q (t)
and R (t) are all continuous on [0, T ], i.e., each of their elements is a con-
tinuous function in [0, T ]. Furthermore, let Q(t) and R(t) be, respectively,
positive semi-definite and positive definite for each t ∈ [0, T ].

10.3.2 Some Preliminary Results

To continue, we need some preliminary results, particularly, the following two


lemmas, known as the S-Lemma and the Schur complement of a block ma-
trix. The S-Lemma was developed independently in several different contexts
[205, 263] and it has applications in control theory, linear algebra and math-
ematical optimization. It gives conditions under which a particular quadratic
inequality is a consequence of another quadratic inequality. The statement of
the lemma is given below.
Lemma 10.3.1 Let M1 and M2 be symmetric matrices, v 1 and v 2 be vectors
and α1 and α2 be real numbers. Suppose that there is a vector x0 such that
the following strict inequality:
 0 
x M1 x0 + 2(v 1 ) x0 + α1 < 0
holds. Then, the following implication:

(x) M1 x + 2(v 1 ) x + α1 ≤ 0 ⇒ (x) M2 x + 2(v 2 ) x + α2 ≤ 0

holds if and only if there exists a non-negative number λ such that

M1 v 1 M2 v 2
λ  1  −  2 
v α1 v α2

is positive semi-definite.

The following lemma is called the Schur complement of a block matrix.


Lemma 10.3.2 Suppose that A, B, C, D are, respectively, n × n, n × m,
m × p and m × m matrices, and that D is invertible. Let

AB
M=
CD
426 10 Time-Lag Optimal Control Problems

such that M is a (n + m) × (p + m) matrix. Then, the Schur complement of


the block D of the matrix M is the n × n matrix defined by
−1
A − B (D) C.
If A is invertible, then the Schur complement of the block A of the matrix
M is the m × m matrix defined by
−1
D − C (A) B.
Furthermore, suppose that either A or D is singular. Then, by replacing
−1 −1
a generalized inverse for the inverses of D − C (A) B and A − B (D) C
gives rise to the generalized Schur complement.
To close this subsection, we briefly introduce some basic concepts and
results related to the controllability of linear systems. Consider the following
linear time invariant system:

dx(t)
= Ax(t) + Bu(t)
dt

y(t) = Cx(t) + Du(t),


where A, B, C and D are, respectively, n × n, n × r, p × n and p × r matrices.
We say that the linear time invariant system is controllable if and only if
the pair (A, B) is controllable, namely the n × nr controllability matrix
$ %
C = B AB A2 B · · · An−1 B
has rank n. The following statements are equivalent:
1. The pair (A, B) is controllable.
2. The n × n matrix

 t " #
Wc (t) = exp{Aτ }BB  exp A τ dτ
0
 t " #
= exp{A(t − τ )}BB  exp A (t − τ ) dτ
0

is non-singular for any t > 0.


3. The n × nr controllability matrix
$ %
C = B AB A2 B · · · An−1 B

has rank n.
4. The n × (n + r) matrix
[A − λI B]
has full row rank at every eigenvalue λ of A.
10.3 Min-Max Optimal Control 427

Now suppose that all the eigenvalues of A have negative real parts (A is
stable), and that the unique solution of the Lyapunov equation
 
AWc + Wc (A) = −B (B)

is positive definite. Then, the system is controllable. The solution is called


the Controllability Gramian and can be expressed as
 t
" #
Wc = exp{Aτ }BB  exp A τ dτ.
0

We now consider the following linear time varying system:

dx(t)
= A(t)x(t) + B(t)u(t)
dt

y(t) = C(t)x(t)
where A, B and C are, respectively, n × n, n × r, p × n matrices. Then the
system (A(t), B(t)) is controllable at time t0 if and only if there exists a finite
time t1 > t0 such that the n × n matrix, also known as the controllability
Gramian, defined by
 t1
 
Wc (t0 , t1 ) = Φ(t1 , τ )B(τ ) (B(τ )) (Φ(t1 , τ )) dτ
t0

is non-singular, where Φ(t, τ ) is the state transition matrix of the following


system:
dx(t)
= A(t)x(t).
dt
Note that for the Controllability Gramian Wc (t0 , t1 ), it holds that

Wc (t0 , t1 ) = Wc (t, t1 ) + Φ(t1 , t)Wc (t0 , t) (Φ(t1 , t)) .

10.3.3 Problem Approximation

Let Φ(t, τ ) be the transition matrix of (10.3.1). For each u and w, define
 T
T0 (u) = Φ (T, 0) x +0
Φ (T, τ ) B (τ ) u (τ ) dτ,
0
 t
1/2 1/2
T1 (x) = (Q (t)) Φ (t, 0) x +
0
(Q (t)) Φ (t, τ ) B (τ ) u (τ ) dτ,
0
428 10 Time-Lag Optimal Control Problems
 T
F0 (w) = Φ (T, τ ) C (τ ) w (τ ) dτ,
0
 t
1/2
F1 (w) = (Q (t)) Φ (t, τ ) C (τ ) w (τ ) dτ.
0

When no confusion can arise, the same notation ·, · is used as the inner
product in L2 as well as in Rn . The cost functional (10.3.4) and the terminal
state constraint (10.3.5) can be rewritten as
F 1 1
G
J(u, w) = T1 (u) + F1 (w), T1 (u) + F1 (w) + (R) 2 u, (R) 2 u , (10.3.7)

and

T0 (u) + F0 (w) − x∗ , T0 (u) + F0 (w) − x∗  ≤ γ 2 , ∀w ∈ Wρ . (10.3.8)

We have the following theorem.


Theorem 10.3.1 T1 is a linear bounded operator from L2 ([0, T ] , Rm ) to
L2 ([0, T ] , Rn ). Suppose that {un } ⊂ Uδ and un & u. Then, T1 (un ) →
T1 (u), where & and → stand for convergence in the weak topology and strong
topology in L2 space, respectively.

Proof. It is easy to show that T1 is a bounded linear operator from L2 ([0, T ],


Rm ) to L2 ([0, T ] , Rn ). Now, suppose that {un } ⊂ Uδ and un & u. Define

Φ (t, τ ) , if τ ≤ t,
Φ (t, τ ) =
0n×n , else.

Clearly, Φ (t, ·) is a continuous function except at τ = t. Then, for each given


t ∈ [0, T ] , we have

lim (T1 (un ) − T1 (u)) (t)


n→∞
 t
1/2
= (Q (t)) Φ (t, τ ) B (τ ) (un (τ ) − u (τ )) dτ
0
 T
1/2
= (Q (t)) Φ (t, τ ) B (τ ) (un (τ ) − u (τ )) dτ = 0.
0

1/2
On the other hand, since {un } ⊂ Uδ , and (Q (t)) Φ (t, τ ) B (τ ) is con-
tinuous with respect to (t, τ ) ∈ [0, T ] × [0, t], we can easily show that there
exists a constant K1 such that
 t & ' 
 
 (Q (t)) Φ (t, τ ) B (τ ) u (τ ) dτ  ≤ K1
1/2 n
 i
0
10.3 Min-Max Optimal Control 429
3t& 1/2
'
for each t ∈ [0, T ], where 0 (Q (t)) Φ (t, τ ) B (τ ) un (τ ) dτ denotes the
3t 1/2
i
i-th element of 0 (Q (t)) Φ (t, τ ) B (τ ) un (τ ) dτ . Now, by Theorem A.1.10
(Lebesgue Dominated Convergence Theorem), it follows that T1 (un ) →
T1 (u) .

Theorem 10.3.2 F1 is a linear bounded operator from L2 ([0, T ] , Rr ) to


L2 ([0, T ] , Rn ). Suppose that {wn } ⊂ Wρ and wn & w. Then, F1 (wn ) →
F1 (w).

Proof. The proof is similar to that given for Theorem 10.3.1.


Note that since both T1 and F1 are bounded operators, it follows readily
that there exists a constant K2 > 0, such that 0 ≤ J(u, w) < K2 , ∀(u, w) ∈
Uδ × Wρ . For a given u ∈ Uδ , let {wn } ⊂ Wρ ⊂ L2 ([0, T ] , Rr ) be a maximiz-
ing sequence, meaning that J (u, wn ) → sup J (u, w). Since L2 ([0, T ] , Rr )
space is reflexive, Wρ , which is a ball with radius ρ, is weakly sequentially
compact (see Remark A.1.4), there exists a subsequence of the sequence
{wn }, which is denoted by the original sequence, such that wn & w(u) ∈ Wρ .
Thus, by Theorem 10.3.1, it follows that F1 (wn ) → F1 (w(u)), and hence
J (u, wn ) → J(u, w(u)). Clearly, J(u, w(u)) = sup J (u, w) . More pre-
cisely,
J(u, w(u)) = max J(u, w).
w∈Wρ

Clearly, w (u) may be not unique. However, they share the same cost function
value maxw∈Wρ J(u, w). For Problem (P ), we have the following theorem.

Theorem 10.3.3 Problem (P ) has a unique solution u∗ ∈ Uδ such that u∗


satisfies (10.3.5) and

J (u∗ , w(u∗ )) = min max J(u, w). (10.3.9)


u∈Uδ w∈Wρ

Proof. Suppose that un & u. Since R (t) is positive definite, then


3T
0
u (t) R (t) u (t) dt is strictly convex. Hence,
 T  T
 
(un (t)) R (t) un (t) dt ≥ (u (t)) R (t) u (t) dt
0 0
 T

+2 (u (t)) R (t) (un (t) − u (t)) dt. (10.3.10)
0

Thus,
 T  T
 
(u (t)) R (t) u (t) dt ≤ lim (un (t)) R (t) un (t) dt.
0 n→∞ 0

Note that, for any un ,


430 10 Time-Lag Optimal Control Problems

T1 (un ) + F1 (w (un )), T1 (un ) + F1 (w (un )) (10.3.11)


= max T1 (un ) + F1 (w), T1 (un ) + F1 (w)
w∈Wρ

≥ T1 (un ) + F1 (w (u)), T1 (un ) + F1 (w (u)). (10.3.12)

By Theorem 10.3.1, T1 (un ) → T1 (u) when un & u. Taking limit inferior on


both sides of (10.3.12), it gives

lim T1 (un ) + F1 (w (un )), T1 (un ) + F1 (w (un )) (10.3.13)


n→∞
≥ T1 (u) + F1 (w (u)), T1 (u) + F1 (w (u)).

By (10.3.10) and (10.3.13), it shows that J (u, w (u)) is weakly sequentially


lower semicontinuous, i.e.,

J (u, w (u)) ≤ lim J (un , w (un )) as un & u.


n→∞

Furthermore, we can show that maxw∈Wρ T1 (u) + F1 (w), T1 (u) + F1 (w) is
3T 
convex in u. Since (10.3.18) is convex in u and 0 (u (t)) R (t) u (t) dt is
strictly convex in u, it follows that Problem (P ) is strictly convex. Thus, the
conclusion of the theorem holds.

Theorem 10.3.4 Define


 T
 
S= Φ(T, τ )C(τ ) (C(τ )) (Φ(T, τ )) dτ (10.3.14)
0

and let λmax (S) be the largest eigenvalue of S. Suppose that S is invertible.
If Problem (P ) has a feasible control, then

λmax (S)ρ2 ≤ γ 2 . (10.3.15)

Furthermore, the terminal state constraint (10.3.8) is equivalent to the fol-


lowing constraint:
⎛ ⎞
I T0 (u) − x∗ I
⎝ (T0 (u) − x∗ ) γ 2 − ρ2 ς 0 ⎠ " 0, (10.3.16)
−1
I 0 ς (S)

whereA " 0 means that the matrix A is constrained to be positive semi-


definite.

Proof. We first show that


F0 (Wρ ) = H, (10.3.17)
10.3 Min-Max Optimal Control 431
 −1
where F0 (Wρ ) = {F0 (w) : w ∈ Wρ } and H = {h ∈ Rn : (h) (S) h ≤
ρ2 }.
−1/2
For notational simplicity, let G(τ ) = Φ (T, τ ) C (τ ) and (S) G(τ ) =
& '
 
(g 1 (τ )) , . . . , (g n (τ )) . Clearly, g i ∈ L2 ([0, T ] , Rm ), i = 1, . . . , n. Then,
for any w ∈ Wρ ,

 −1

n 
n
(F0 (w)) (S) F0 (w) = g i , w2 ≤ g i 2 w2
i=1 i=1
 
T
−1/2   −1/2
= Tr (S) Φ (T, τ ) C (τ ) (C(τ )) (Φ(T, τ )) (S) dt w2 ≤ ρ2 .
0

Thus, F0 (Wρ ) ⊂ H. On the other hand, for any h ∈ H, define w(t) =


−1
G(t) (S) h. Then,
 T  T
  −1  −1
(w(t)) w(t)dt = (h) (S) (G(t)) G(t) (S) hdt
0 0
 −1
= (h) (S) h ≤ ρ2 ,

and  
T T
−1
G(t)w(t)dt = G (t)G(t) (S) hdt = h.
0 0

Thus, H ⊂ F0 (Wρ ). Therefore, F0 (Wρ ) = H. In light of (10.3.17), the


constraint (10.3.5) is equivalent to the constraint

T0 (u) + h − x∗ , T0 (u) + h − x∗  ≤ γ 2 , ∀ h ∈ H. (10.3.18)

For the simplicity of symbol, let Vu = T0 (u) − u∗ . Inequality constraint


(10.3.18) can be written as follows:
  
(h) h + 2t (Vu ) h + t2 (Vu ) Vu
* +
 −1
≤ γ 2 t2 , ∀ (h, t) ∈ (h, t) : (h) (S) h ≤ ρ2 t2 .

The above inequality can be rewritten as


  
 
t γ 2 − (Vu ) Vu − (Vu ) t
≥ 0,
h −Vu −I h
 2
 
t ρ 0 t
∀(t, h) : −1 ≥ 0. (10.3.19)
h 0 − (S) h

By Lemma 10.3.1, (10.3.3) holds if and only if there exists a ς ≥ 0 such that
432 10 Time-Lag Optimal Control Problems
  
γ 2 − (Vu ) Vu − ςρ2 − (Vu )
−1
−Vu ς (S) − I
 
γ 2 − ςρ2 0 (Vu )
  
= −1 − Vu I " 0 (10.3.20)
0 ς (S) I

which can be equivalently rewritten as (10.3.16) by Lemma 10.3.2. The in-


equality (10.3.3) implies that ςS −1 " I and γ 2 − ςρ2 ≥ 0. Thus, λmax (S)ρ2 ≤
γ 2 . This completes the proof.

The matrix S is the Controllability Gramian of the pair (A(·), C(·)).


Thus, S is invertible if and only if the pair (A(·), C(·)) is controllable. If
the system (10.3.1) is time-invariant, then S is invertible if and only if
(C, CA, . . . , CAn−1 ) is a full rank matrix. In what follows, we assume that S
is invertible.
By Theorem 10.3.4, (10.3.8) and (10.3.16) are equivalent. Problem (P3 )
is equivalent to the problem defined by (10.3.6) and (10.3.16). Clearly,
the problem defined by (10.3.6) and (10.3.16) is a convex infinite dimen-
sional optimization problem. Although the maximization with respect to
w ∈ Wρ is required to be carried out only in J(u, w) without involving
constraint (10.3.16), the problem defined by (10.3.6) and (10.3.16) is still
much too complicated to be solved analytically. It is inevitable to resort to
numerical methods.
∞ ∞
Suppose that {γ i }i=1 and {ψ i }i=1 are orthonormal bases (OB) of L2 ([0, T ] ,
R ) and L2 ([0, T ] , R ) , respectively. Now we approximate u and w by
m r

the truncated OB as u (t) = ΓN (t) θ and w (t) = ΨN (t) ϑ, where N


is the truncated number, ΓN (t) = [γ 1 (t) , γ 2 (t) , . . . , γ N (t)], ΨN (t) =
T
[ ψ 1 (t) , ψ 2 (t) , . . . , ψ N (t)], θ = [θ1 , θ2 , . . . , θN ] ∈ RN and ϑ =
T
[ϑ1 , ϑ2 , . . . , ϑN ] ∈ RN . Denote ΞN = { θ ∈ RN :  θ ≤ δ}, UN =
{ΓN (t) θ : θ ∈ ΞN }, ΠN = { ϑ ∈ RN :  ϑ ≤ ρ} and WN = {ΨN (t) ϑ :
θ ∈ ΠN }. Then, the parametrized finite dimensional optimization problem
can be stated as: Find a control u ∈ Uδ ∩ UN such that the cost function
max w∈Wρ ∩WN J( u, w) is minimized subject to the constraint (10.3.16). Let
this problem be referred to as Problem (PN 3 ). Following a similar proof as
that given for Theorem 10.3.4, we have the following theorem.
Theorem 10.3.5 Problem (PN
3 ) is equivalent to the following semi-definite
programming problem:

min t1 + t2 + 2 (qN ) θ + μ0 (10.3.21)
θ∈ΞN ,t1 ,t2 ,ς1 ≥0,ς2 ≥0

subject to  
1/2
I (PN ) θ
 1/2 " 0, (10.3.22)
(θ) (PN ) t2
10.3 Min-Max Optimal Control 433
 

t1 − ς1 ρ2 − (θ) QN − rN
 " 0, (10.3.23)
−QN θ − (rN ) ς 1 I − RN
⎛ ⎞
I V x0 − x ∗ + V N θ I
⎝ (V x0 − x∗ + VN θ) γ 2 − ρ2 ς2 0 ⎠ " 0, (10.3.24)
−1
I 0 ς2 (S)
where the explicit expressions of PN , QN , RN , qN , rN , V , VN and μ0 are
given as below:
 T  t  t

PN = (ΦB,Γ (t, τ )) dτ Q (t) ΦB,Γ (t, τ )dτ dt
0 0 0
 T

+ (ΓN (t)) R (t) ΓN (t) dt,
0

 T  t  t

QN = (ΦB,Γ (t, τ )) dτ Q (t) ΦC,Ψ (t, τ )dτ dt,
0 0 0
 T  t  t

RN = (ΦC,Ψ (t, τ )) dτ Q (t) ΦFC,Ψ (t, τ )dτ dt,
0 0 0
 T
1/2 1/2
VN = (P ) Φ (T, t) B (t) ΓN (t) dt, V = (P ) Φ (T, 0) ,
0

 T  t
 
qN = (ΓN (τ )) B  (τ ) (Φ (t, τ )) dτ Q (t) [F (t, 0) x0 ] dt,
0 0
 T  t
 
rN = (ΨN (τ )) C  (τ ) (Φ (t, τ )) dτ Q (t) [Φ (t, 0) x0 ] dt,
0 0
 
T
 
μ0 = (x0 ) (Φ (t, 0)) Q (t) Φ (t, 0) dt x0 ,
0

ΦB,Γ (t, τ ) = Φ (t, τ ) B (τ ) ΓN (τ )


ΦC,Ψ (t, τ ) = Φ (t, τ ) C (τ ) ΨN (τ ) .
Proof. Clearly,
    
J(uN , wN ) = (θ) PN θ+2 (θ) QN ϑ+(ϑ) RN ϑ+2 (qN ) θ+2 (rN ) ϑ+μ0 .

Then, minθ∈ΞN maxϑ∈ΠN J(uN , wN ) can be equivalently rewritten as



min t1 + t2 + 2 (qN ) θ + μ0 (10.3.25)

subject to (θ) PN θ ≤ t2 (10.3.26)
& '
 
(ϑ) RN ϑ + 2 (θ) QN + rN ϑ ≤ t1 , ∀ϑ ∈ ΠN . (10.3.27)
434 10 Time-Lag Optimal Control Problems

Using a similar argument as in the proof of Theorem 10.3.4, it follows


that (10.3.27) is equivalent to (10.3.23). Thus, the conclusion of the theo-
rem follows readily.

In view of Theorem 10.3.5 the solution of Problem (PN


3 ) can be obtained
through solving a SDP problem defined by (10.3.21)–(10.3.24).
For the solution procedure of SDP problems, the readers are referred to
[163, 263]. The next theorem shows the relation between Problem (P3 ) and
Problem (PN 3 ).

Theorem 10.3.6 Let u∗ ∈ Uδ and θN ∗


∈ ΞN be the optimal solution of
Problem (P3 ) and the optimal solution of Problem (PN 3 ), respectively. Let
u∗N (t) = ΓN (t)θN

. Suppose that ω(u∗ ) and ω(u∗N ) are such that

J(u∗ , ω(u∗ )) = max J(u∗ , ω), (10.3.28)


ω∈Wρ

and
J(u∗N , ω(u∗N )) = max J(u∗N , ω),
ω∈Wρ ∩VN

respectively. Let ω ∗ ∈ Wρ and ωN



∈ Wρ ∩ VN be such that

J(u∗ , ω ∗ ) = J(u∗ , ω(u∗ )), (10.3.29)

and
J(u∗N , ωN

) = J(u∗N , ω(u∗N )), (10.3.30)
respectively, Then,
(i) lim J(u∗N , ωN

) = J(u∗ , ω(u∗ )); and
N →∞
(ii) u∗N & u∗ as N → ∞.

Proof. Note that u∗ is the optimal solution of problem (P3 ) and θN ∗


∈ ΞN is
N ∗ ∗
the optimal solution of Problem (P3 ). Let uN (t) = ΓN (t)θN . Suppose that
for u∗ ∈ Uδ , and let ω(u∗ ) be such that it satisfies (10.3.28). Note that ω(u∗ )
may not be unique, but gives rise to the same value of maxω∈Wρ J(u∗ , ω).
Let ω ∗ be one of these maximizers. Similarly, let ωN∗
be one of the maximizers
ω(uN ). Without loss of generality, we suppose that u∗N & u
∗ - and ωN∗
&ω -.
N,∗ N,∗ ∗
Let u and ω denote, respectively, the projection of u onto UN and
ω ∗ onto VN . Then, uN,∗ → u∗ and ω N,∗ → ω ∗ . Thus,

J(u∗N , ωN

)= min J(u, ωN ∗
)
u∈UN ∩Uρ
 ∗

≤J uN,∗ , ωN → J(u∗ , ω)
- ≤ J(u∗ , ω ∗ ). (10.3.31)

On the other hand,

 
J(u∗ , ω ∗ ) = min J(u, ω ∗ ) ≤ J(-
u, ω ∗ ) ≤ lim J u∗N , ω N,∗
u∈Uρ N →∞
10.3 Min-Max Optimal Control 435

≤ lim max J(u∗N , ω) = lim J(u∗N , ωN



). (10.3.32)
N →∞ω∈VN ∩Wρ N →∞

Therefore,
lim J(u∗N , ωN

) = J(u∗ , ω(u∗ )) (10.3.33)
N →∞

We shall show that u∗N & u∗ by contradiction. Suppose that it is false.


Then, there exists a subsequence {u∗Nk } of {u∗N } and a subsequence {ωN

k
}

of {ω(uN )} such that

u∗Nk & u
6 = u∗ and ωN

k
6.

Let ωu be one of the maximizers ω(6 u). Then, by virtue of the uniqueness
of the solution of Problem (P3 ), it is clear that

J(u∗ , ω ∗ ) < J(6


u, ωu ). (10.3.34)

Let ω Nk be the projection of ωu onto VNk . Then, ω Nk → ωu . Since


J(u, ω(u)) is weakly sequentially lower semicontinuous, it follows from
(10.3.33) that
 
u, ωu ) ≤ lim J u∗Nk , ω Nk ≤ lim
J(6 max J(u∗Nk , ω)
k→∞ k→∞ω∈VNk ∩Wρ

= lim J(u∗Nk , ωN

k
) ∗ ∗
= J(u , ω(u ),
k→∞

which is a contradiction to (10.3.34). Thus, u∗N & u∗ .

Theorem 10.3.6 shows that the min-max optimal control problem (P3 )
is approximated by a sequence of finite dimensional convex optimization
problems (PN 3 ). Then, an intuitive scheme to solve Problem (P3 ) can be
N
stated as:
 N +1,∗ For  tolerance ε > 0, we solve Problem (P3 ) until
a given
|J u −J u N,∗
| ≤ ε.
Problem (P3 ) with ρ = 0 is a standard optimal control problem without
disturbance. Let it be referred to as Problem (P3 ). Similarly, we can solve
Problem (P3 ) through solving a sequence of approximate optimal control
N
problems, denoted by Problems (P3 ), by restricting the feasible control u
in UN ∩ Uδ . We have the following results.
N
Corollary 10.3.1 Problem(P ) is equivalent to the following SDP problem

min t + 2 (q N ) θ + μ0 (10.3.35)
θ∈ΞN ,t≥0

subject to  
1/2
I (PN ) θ
 1/2 " 0, (10.3.36)
(θ) (PN ) t
436 10 Time-Lag Optimal Control Problems

I V x0 + V N θ − x ∗
∗  " 0. (10.3.37)
(V x + VN θ − x )
0
γ

Remark 10.1. During the computation, both u and w are approximated  by
truncated orthonormal bases. Suppose that u∗N = ΓN (t)θ ∗ . Then I0 uN,∗ =
V x0 + VN θ ∗ , where θ ∗ is the optimal solution of Problem (PN ∗
3 ). Since θ sat-
N,∗
isfies the linear matrix inequality (10.3.24), u satisfies the linear matrix
inequality (10.3.16). Thus, by Theorem 10.3.4, the terminal inequality con-
straint (10.3.5) holds for all w ∈ Wρ . Thus, uN,∗ is a feasible solution. This
feature is not shared by the control parametrization method given in pre-
vious chapters. More specifically, if we directly approximate Uδ , Wρ by UN
and WN , then we can also transform the approximated problem as a SDP
which is different from that defined by (10.3.21)–(10.3.24). Let the solution
obtained by this method be ūN,∗ . Then, the terminal inequality constraint
(10.3.5) is only satisfied for those w ∈ WN ⊂ Wρ , not for all w ∈ Wρ .
Thus, the approximate solution ūN,∗ may be infeasible. For our proposed
approach, the approximations of u and w only affect the computation of the
cost function value (10.3.4). The feasibility of the terminal constraint (10.3.5)
is maintained for all w ∈ Wρ .

10.3.4 Illustrative Example

Consider a worst-case DC motor control. The mathematical model of a DC


motor is expressed as two linear differential equations [65] as

di
V = R a i + La + Ce ω
dt

Cm i = Jr + μω + m, (10.3.38)
dt
where V is the voltage applied to the rotor circuit, i is the current, ω is
the rotation speed, m is the resistant torque reduced to the motor shaft, Ra
and La are the resistance and the inductance of the circuit, Jr is the inertia
moment, Ce and Cm are the constants of the motor, μ is the coefficient
of viscous friction. Let x(t) = [x1 (t), x2 (t)]T = [ω(t), i(t)] , u(t) = V (t),
w(t) = m(t). Then, (10.3.38) can be rewritten as

dx(t)
= Ax(t) + Bu(t) + Cw(t),
dt
where . /
− Jμr CJrm 0 − J1r
A= ,B= 1 ,C= .
−LCe
a
−RLa
a
La 0
10.3 Min-Max Optimal Control 437

Suppose that the initial condition is x(0) = [0, 0] . We wish to find an optimal
control to drive the system to a neighbourhood around the desired state x∗
with reference to all disturbances w ∈ Wρ such that the energy consumption
is minimized. In this case, Q = 0 and R = 1 in (10.3.4). Suppose that the
nominal parameters of the DC motor are given as: μ = 0.01, Jr = 0.028,
Cm = 0.58, La = 0.16, Ce = 0.58 and Ra = 3. Let T = 1, δ = 5, ρ = 0.01,
∞ ∞
γ = 0.2, x∗ = [3, 1] . The two orthonormal bases {γi }i=1 and {ψi }i=1 are
taken as the normalized shifted Legendre polynomial, i.e.,

γi (t) = ψi (t) = 2i + 1Pi (2t − 1) , i = 0, 1, 2, . . . ,

where Pi (t) is the i-th order Legendre polynomial. During the simulation,
SeDuMi [231] and YALMIP [163] are used to solve the SDP problem defined
by (10.3.21)–(10.3.24) and the SDP problem defined by (10.3.35)–(10.3.37).
Note that system (10.3.38) is time invariant. By direct computation, it fol-
lows that the matrix [C, CA] is of full rank. Thus, S in (10.3.14) is invertible.
Using Simpson’s Rule to compute it, we obtain

176.8542 −27.7387
S=
−27.7387 5.3628

and λmax (S) = 181.2293, which indicates that γ should be far larger than ρ.
We set the tolerance
 ε as 10−8 andN =5 to start solving Problem (PN ).

For N = 10, we have J u   − J uN,∗ ≤ 10
N +1,∗ −8
. So we stop the compu-
 
tation. Meanwhile, we have J¯ uN +1,∗ − J¯ uN,∗  ≤ 10−8 , where J¯ uN,∗
N
is the optimal cost function value of Problem (P ). The cost function values
obtained are given in Table 10.3.1, from which we see that the convergence
for the case with disturbance and that without disturbance are very fast.
Figure 10.3.1 depicts the nominal state [x1 (t), x2 (t)]T = [w(t), i(t)] and
Figure 10.3.2 shows that the optimal control u∗11 under worst case perfor-
mance. The terminal constraint (10.3.5) holds for any w ∈ Wρ is ensured by
Theorem 10.3.4.

Table 10.3.1: The optimal cost of Problem (P̄3N ) and the optimal cost of
Problem (PN 3 )

Optimal cost of Problem (P̄3N ) Optimal cost of Problem (PN


3 )
N =5 2.157374508 2.283417844
N =6 2.157349390 2.283381916
N =7 2.157334886 2.283363764
N =8 2.157331918 2.283360165
N =9 2.157331543 2.283359716
N = 10 2.157331514 2.283359680
N = 11 2.157331514 2.283359680
438 10 Time-Lag Optimal Control Problems

2.5
x1*(t)
x*2(t)
2

1.5
x

0.5

0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t

Fig. 10.3.1: The nominal state trajectories [x∗1 (t), x∗2 (t)]T of Problem (P 11 )

4.5

3.5

3
u

2.5

1.5

0.5

0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t

Fig. 10.3.2: The optimal control u∗ of Problem (P 11 )

10.4 Exercises

10.1. Show the validity of Equation (10.1.22).

10.2. Show the equivalence of Problem (P(p)) and Problem (Q(p)) (see Sec-
tion 10.2.4).

10.3. Provide a proof of Theorem 10.2.2.

10.4. Provide a proof of Theorem 10.2.3.

10.5. Provide a proof of Theorem 10.2.4.

10.6. Show the validity of Equation (10.2.31).


10.4 Exercises 439

10.7. Show the validity of Equation (10.2.28).

10.8. Show that F1 defined in Section 10.3.2 is bounded linear operator from
L2 ([0, T ], Rr ) to L2 ([0, T ], Rn ).

10.9. In the proof of Theorem 10.3.1, show that there exists a constant K >
0, such that 0 ≤ J(u, w) < K, ∀(u, w) ∈ Uδ × Wρ .

10.10. In the proof of Theorem 10.3.1, show that

max T1 (u) + F1 (w), T1 (u) + F1 (w)


w∈Wρ

is convex in u.

10.11. Show the equivalence of Equations (10.3.17) and (10.3.18).

10.12. Show the validity of (10.3.3) by using S-Lemma.

10.13. Use Schur complement to show that (10.3.3) can be written as


(10.3.16).

10.14. Give the proof of Theorem 10.3.4.

10.15. Give the proof of Corollary 10.3.1.


Chapter 11
Feedback Control

11.1 Introduction

In this chapter, we introduce two approaches to constructing suboptimal feed-


back controls for constrained optimal control problems. The first approach is
known as the neighbouring extremals approach. The main references for this
approach are [33, 107]. In this approach, we will present a solution method for
constructing a first-order approximation of the optimal feedback control law
for a class of optimal control problems governed by nonlinear continuous-time
systems subject to continuous inequality constraints on the control and state.
The control law constructed is in a state feedback form, and it is effective
to small state perturbations caused by changes on initial conditions and/or
modeling uncertainty. It has many potential applications, such as spacecraft
guidance and control [140]. For illustration, a generalized Rayleigh problem
with a mixed state and control constraint [24] is solved using the proposed
method.
The second approach is to construct an optimal PID control for a class
of optimal control problems subject to continuous inequality constraints and
terminal equality constraint. The main reference for this approach is [132]. By
applying the constraint transcription method and a local smoothing technique
to these continuous inequality constraint functions, we construct the corre-
sponding smooth approximate functions. We use the concept of the penalty
function to append these smooth approximate functions to the cost function,
forming a new cost function. Then, the constrained optimal PID control prob-
lem is approximated by a sequence of optimal parameter selection problems
subject to only terminal equality constraint. Each of these optimal parameter
selection problems can be viewed and hence solved as a nonlinear optimiza-
tion problem. The gradient formulas of the new appended cost function and

© The Author(s), under exclusive license to 441


Springer Nature Switzerland AG 2021
K. L. Teo et al., Applied and Computational Optimal Control, Springer
Optimization and Its Applications 171,
https://doi.org/10.1007/978-3-030-69913-0 11
442 11 Feedback Control

the terminal equality constraint function are derived, and a reliable computa-
tion algorithm is given. The method proposed is used to solve a ship steering
control problem.

11.2 Neighbouring Extremals

11.2.1 Problem Formulation

Consider a dynamic system governed by the following differential equations


on the time horizon (0, T ]:

dx(t)
= f (t, x(t), u(t)), t ∈ (0, T ], x(0) = x0 , (11.2.1)
dt
where x(t) ∈ Rn and u(t) ∈ Rr are, respectively, the state and control
vectors; f : [0, T ] × Rn × Rr → Rn ; T , 0 < T < ∞, is the fixed terminal time
and x0 ∈ Rn is a given vector.
For the control and state vectors, they are subject to the following contin-
uous inequality constraints:

hk (t, x(t), u(t)) ≤ 0, t ∈ [0, T ], k = 1, . . . , N (11.2.2)

where hk : [0, T ] × Rn × Rr → R, k = 1, . . . , N . Let

h  [h1 , . . . , hN ] .

Furthermore, let A(t, x, u) ⊆ P  {1, . . . , N } denote the index set of the


active constraints in (11.2.2) at the point (t, x, u), that is,

A(t, x, u)  {k ∈ P : hk (t, x(t), u(t)) = 0}. (11.2.3)

A measurable function u : [0, T ] → U ⊂ Rr Satisfying (11.2.2) almost every-


where is called a feasible control. Let F denote the class of all such feasible
controls.
Now, consider the following optimal control problem.
Problem (P1 ) Subject to the dynamical system (11.2.1), find a feasible
control u ∈ F such that the cost functional
 T
g0 (u)  Φ0 (x(T )) + L0 (t, x(t), u(t))dt (11.2.4)
0

is minimized over F, where Φ0 : Rn → R and L0 : [0, T ] × Rn × Rr → R.


The following assumptions are assumed throughout the section.
11.2 Neighbouring Extremals 443

Assumption 11.2.1 f (t, x, u), hi (t, x, u), i = 1, . . . , N , L0 (t, x, u) and


Φ0 (x) are twice continuously differentiable with respect to each of their re-
spective arguments.

Assumption 11.2.2 There exists a unique optimal solution (x∗ , u∗ ).

Let H and L be, respectively, the Hamiltonian function and the augmented
Hamiltonian function defined by

H(t, x, u, λ)  L0 (t, x, u) + λ f (t, x, u), (11.2.5)



L(t, x, u, λ, ρ)  H(t, x, u, λ) + ρ h(t, x, u), (11.2.6)

where λ(t) ∈ Rn is the costate and ρ(t) ∈ Rp is the Lagrangian multiplier


associated with constraints (11.2.2), where

λ  [λ1 , . . . , λn ] and ρ  [ρ1 , . . . , ρN ] .


Under Assumption 11.2.2, there exist multipliers λ∗ (t) ∈ Rn and ρ∗ (t) ∈ RN
such that the following necessary conditions are satisfied [33]:

dx∗ (t)
= f (t, x∗ (t), u∗ (t)), x∗ (0) = x0 , (11.2.7)
dt

dλ∗ (t) ∂L(t, x∗ (t), u∗ (t), λ∗ (t), ρ∗ (t))
=− , (11.2.8)
dt ∂x
 ∂Φ0 (x∗ (T ))
[λ∗ (T )] = , (11.2.9)
∂x
∗ ∗ ∗ ∗
∂L(t, x (t), u (t), λ (t), ρ (t))
0=
∂u
∂H(t, x∗ (t), u∗ (t), λ∗ (t)) ∗ ∗
 ∂h(t, x (t), u (t))
= + [ρ∗ (t)] ,
∂u ∂u
(11.2.10)
0 ≥ hi (t, x∗ (t), u∗ (t)); ρ∗i (t) ≥ 0, i =, . . . , N (11.2.11)
∗  ∗ ∗
0 = [ρ (t)] h(t, x (t), u (t)). (11.2.12)

In what follows, (x∗ , u∗ ) is also called the nominal solution, and a super-
script ‘∗’ indicates that the corresponding function is evaluated along the
nominal trajectory (x∗ , u∗ ).
Along this nominal solution, (t∗k,1 , t∗k,2 ) ⊂ [0, T ], k ∈ P , is called an interior
interval for the kth constraint if

hk (t, x∗ (t), u∗ (t)) < 0 f or all t ∈ (t∗k,1 , t∗k,2 )

and
444 11 Feedback Control

hk (t∗k,1 , x∗ (t∗k,1 ), u∗ (t∗k,1 )) = hk (t∗k,2 , x∗ (t∗k,2 ), u∗ (t∗k,2 )) = 0.

[t∗k,2 , t∗k,3 ] ⊂ [0, T ] is called a boundary interval if it is the maximal interval


on which

hk (t, x∗ (t), u∗ (t)) = 0 f or all t ∈ [t∗k,2 , t∗k,3 ]

t∗k,1 , t∗k,2 and t∗k,3 are called junction points. Let Tk∗ denote the set of junction
points t∗k,j ∈ [0, T ] for hk (t, x∗ (t), u∗ (t)) ≤ 0. We assume that (x∗ , u∗ ) has
the following regular structure.
H
Assumption 11.2.3 The I set T ∗  k∈P Tk∗ = {t∗1 , . . . , t∗M } of all junction
points is finite and Tk∗ Tj∗ = ∅ for k = j, where ∅ denotes an empty set.
Furthermore, there are no isolated touch points with the boundary for the
nominal solution.

In addition to this regular structure, we assume that the following strict


complementarity condition and non-tangential junction condition are satis-
fied for (x∗ , u∗ ).
∗ ∗
Assumption 11.2.4 & Let ρk , 'k ∈ P , denote the kth component of ρ 
[ρ∗1 , . . . , ρ∗N ] , and t∗k,j , t∗k,j+1 ⊂ [0, T ] be any boundary interval for the
!
constraint hk (t, x∗ , u∗ ) ≤ 0. Then, ρ∗k (t) > 0 for all t ∈ t∗k,j , t∗k,j+1 .
& '
Assumption 11.2.5 Let t∗k,j , t∗k,j+1 ⊂ [0, T ], k ∈ P , be any boundary in-
terval for the constraint hk (t, x∗ (t), u∗ (t)) ≤ 0. Then,

dhk (t, x(t), u(t)) 
dt  ∗− = 0
t→t k,j

and 
dhk (t, x(t), u(t)) 
dt  ∗+ = 0.
t→tk,j+1
! !
For convenience, let u t∗−k,j and u t ∗+
k,j+1 denote, respectively, the limits
of u (t) from the left at tk,j and right at t∗k,j+1 .
∗ ∗

Let ĥ(t, x∗ (t), u∗ (t)) and ρ̂∗ (t) denote, respectively, vectors composed of
hk (t, x∗ (t), u∗ (t)) and ρ∗k (t), where k ∈ A(t, x∗ (t), u∗ (t)). Correspondingly,
let q(t) > 0 be the number of the constraints in A(t, x∗ (t), u∗ (t)). We have
the following assumptions.
Assumption 11.2.6 ∂ ĥ(t, x∗ (t), u∗ (t))/∂u is of full row rank when

ĥ(t, x∗ (t), u∗ (t)) = ∅.


11.2 Neighbouring Extremals 445

Assumption 11.2.7 For all γ(t) ∈ ker(∂ ĥ(t, x∗ (t), u∗ (t))/∂u)\ {0}, it holds

that [γ(t)] (∂ 2 L∗ (t, x∗ (t), u∗ (t)/∂u2 )γ(t) > 0. where ker(·) denotes the null
space of a matrix, and

∂ 2 L∗ (t, x∗ (t), u∗ (t))/∂u2 = (∂/∂u) [∂L∗ (t, x∗ (t), u∗ (t))/∂u] .

Now, treat u∗ , λ∗ and ρ∗ as functions of the nominal state x∗ . Let δx(t) ∈ Rn


be a perturbation of the nominal state x∗ (t) such that δx(t) = εδ(t) for some
ε ∈ R+ and δ(t) ∈ B(n, 1), where B(n, s)  {y ∈ Rn : |y| ≤ s} and |·| denotes
the usual Euclidean norm in Rn .
We have the last assumption.
Assumption 11.2.8 u∗ (x∗ ), λ∗ (x∗ ) and ρ∗ (x∗ ) are continuously differen-
tiable with respect to x∗ in a small neighbourhood of x∗ .

Remark 11.2.1 It is proved in [177] for optimal control problems depending


on parameter ξ that if Assumptions 11.2.1–11.2.7, the controllability condi-
tion and the coercivity condition [177] are satisfied, then there exists a neigh-
bourhood G of the nominal parameter ξ ∗ such that a local solution (x, u) and
the associated Lagrangian multipliers λ and ρ exist for each ξ ∈ G. All these
functions x, u, λ and ρ are (Fréchet) differentiable with respect to ξ ∈ G
satisfying x(ξ ∗ ) = x∗ , u(ξ ∗ ) = u∗ , λ(ξ ∗ ) = λ∗ and ρ(ξ ∗ ) = ρ∗ .

Consider neighbouring points x = x∗ + εδ, u = u∗ (x∗ + εδ), λ = λ∗ (x∗ +


εδ) and ρ = ρ∗ (x∗ + εδ). For these neighbouring points to remain optimal,
the following conditions are necessary [33]:

dx(t)
= f (t, x(t), u(t)), x(0) = x0 + εδ, (11.2.13)
dt

dλ(t) ∂L(t, x(t), u(t), λ(t), ρ(t))  ∂Φ0 (x(T ))
=− , [λ(T )] = ,
dt ∂x ∂x
(11.2.14)
∂L(t, x(t), u(t), λ(t), ρ(t))
0=
∂u
∂H(t, x(t), u(t), λ(t))  ∂h(t, x(t), u(t))
= + [ρ(t)] , (11.2.15)
∂u ∂u
0 ≥ hi (t, x(t), u(t)), ρi (t) ≥ 0, i = 1, . . . , N (11.2.16)

0 = [ρ(t)] h(t, x(t), u(t)). (11.2.17)

Now, the objective of this section can be stated formally as follows.


Problem (P1 F ) Given the optimal pair (x∗ , u∗ ) of Problem (P1 ), con-
struct a feedback control law expressed in the form of
∂u∗
u(x) ≈ u∗ + (x − x∗ ). (11.2.18)
∂x
446 11 Feedback Control

11.2.2 Construction of Suboptimal Feedback


Control Law

Lemma 11.2.1 Let x = x∗ + εδ and u = u∗ (x∗ + εδ). If Assump-


tions 11.2.1, 11.2.3–11.2.5 and 11.2.8 are satisfied, then there exists an ε0 > 0
such that for each ε ∈ [0, ε0 ],

-∗
∂h - ∗ ∂u∗
∂h
+ = 0. (11.2.19)
∂x ∂u ∂x

Proof. From Assumptions 11.2.3–11.2.5, there exists a small ε1 > 0


such that for each ε ∈ [0, ε1 ] the structure of the perturbed solution
(x∗ + εδ, u∗ (x∗ + εδ)) is the same as that of (x∗ , u∗ ) [177]. Specifically,
if the junction points of (x∗ , u∗ ) are such that 0 < t∗1 < t∗2 < · · · < t∗M < T ,
then the solution (x∗ + εδ, u∗ (x∗ + εδ)) also has M junction points satisfying
0 < t1 < t2 < · · · < tM < T with ti perturbed from t∗i , i = 1, . . . , M . Let
t∗0 = t0 = 0 and t∗M +1 = tM +1 = T . Then, if A∗ (t)  A∗ (t, x∗ (t), u∗ (t)) =
Ai ⊆ P for all t ∈ [t∗i , t∗i+1 ], i = 0, . . . , M , it follows that A(t) = Ai for all
t ∈ [ti , ti+1 ]. Suppose that

ĥ(t, x∗ + εδ, u∗ (x∗ + εδ)) = 0, ∀t ∈ [ti , ti+1 ].

From Assumption 11.2.8, there exists a small ε2 > 0 such that, for each
ε ∈ [0, ε2 ] and the perturbation δx = εδ, u∗ (x∗ ) is continuously differentiable
with respect to x∗ . Let ε0 = min{ε1 , ε2 }. It follows from the continuity of ĥ
at x∗ and u∗ (x∗ ) that, for ε ∈ [0, ε0 ],

dĥ (t, x∗ (t) + εδ(t), u∗ (x∗ (t) + εδ(t))) 
 = 0.
dε 
ε=0

Since δh ∈ B(n, 1) is arbitrary, (11.2.19) follows.

Lemma 11.2.2 Let x = x∗ + εδ and u = u∗ (x∗ + εδ) with ε ∈ [0, ε0 ]. If


Assumptions 11.2.1–11.2.5 and 11.2.8 are satisfied, then
∂H ∗ ∂u∗  ∂h

= [ρ∗ ] (11.2.20)
∂u ∂x ∂u

Proof. From the complementarity conditions (11.2.11) and (11.2.12), it fol-


lows that ρ∗k = 0 for k ∈ P \A∗ . Then, from (11.2.19), we obtain

 ∂h∗ ∗
 ∂h ∂u

[ρ∗ ] = − [ρ∗ ] . (11.2.21)
∂x ∂u ∂x
Thus, the conclusion follows from (11.2.10).
11.2 Neighbouring Extremals 447

Theorem 11.2.1 Let x = x∗ + εδ, u = u∗ (x∗ + εδ), λ = λ∗ (x∗ + εδ) and


ρ = ρ∗ (x∗ + εδ) with ε ∈ [0, ε0 ]. If Assumptions 11.2.1–11.2.2 and 11.2.8
are satisfied, then

  . /
∂2H ∗  ∂ 2 h∗ ∂u∗ -∗
∂h ∂ ρ-∗
0= + ρ∗k k
+
∂u2 ∗
∂u2 ∂x ∂u ∂x
k∈A

∂2H ∗  ∂ 2 h∗k ∂f ∗

∂λ∗
+ + ρ∗k + . (11.2.22)
∂x∂u ∗
∂x∂u ∂u ∂x
k∈A

Proof. For (x, u, λ, ρ) to remain optimal, Equation (11.2.15) holds with


ε ∈ [0, ε0 ]. Thus,

d ∂L 

0= 
dε ∂u 
ε=0   
 N
d ∂H  dρk ∂hk d ∂hk 
=  + + ρk 
dε ∂u  dε ∂u dε ∂u 
ε=0 k=1
4 ε=0
2 ∗ 2 ∗ ∗ ∗  ∗
∂ H ∂ H ∂u ∂f ∂λ
= + +
∂x∂u ∂u2 ∂x ∂u ∂x
. /5
N
∂h∗k

∂ρ∗k 2 ∗ 2 ∗ ∗
∗ ∂ hk ∗ ∂ hk ∂u
+ + ρk + ρk δ. (11.2.23)
∂u ∂x ∂x∂u ∂u2 ∂x
k=1

From Assumption 11.2.8, ρ is continuously differentiable with respect to


x∗ for ε ∈ [0, ε0 ]. Then ∂ρ∗k /∂x = 0 for k ∈ P \A∗ . Thus, (11.2.23) holds for
δ ∈ B(n, 1), which is arbitrary. Thus the validity of (11.2.22) follows readily.
Let V ∗ (t, x∗ (t)) be the optimal return function corresponding to (x∗ , u∗ ),
which is defined by
 T
V ∗ (t, x∗ (t))  Φ0 (x∗ (T )) + L0 (t, x∗ (τ ), u∗ (τ ))dτ. (11.2.24)
t

By Assumption 11.2.2, it is known from [33] that V ∗ satisfies the Hamilton-


Jacobi-Bellman equation:
∂V ∗
− = H(t, x∗ (t), u∗ (t), λ∗ (t))
∂t

= L0 (t, x∗ (t), u∗ (t)) + [λ∗ (t)] f (t, x∗ (t), u∗ (t)) (11.2.25)

with [λ∗ ] = ∂V ∗ /∂x.
448 11 Feedback Control

For the neighbouring points x = x∗ + εδ, u = u∗ (x∗ + εδ) and λ =


λ (x∗ + εδ) to remain optimal, the following equation must also be satisfied:

∂V 
− = H(t, x(t), u(t), λ(t)) = L0 (t, x(t), u(t)) + [λ(t)] f (t, x(t), u(t)),
∂t
(11.2.26)

where [λ] = ∂V /∂x and
 T
V (t, x(t))  Φ0 (x(T )) + L0 (t, x(τ ), u(τ ))dτ.
t

Let
∂λ∗ (t) ∂ 2 V ∗ (t, x(t))
Q∗ (t)  = . (11.2.27)
∂x ∂x2
Theorem 11.2.2 Let x = x∗ + εδ, u = u∗ (x∗ + εδ) and λ = λ∗ (x∗ + εδ)
with ε ∈ [0, ε0 ]. If Assumptions 11.2.1–11.2.5 and 11.2.8 are satisfied, then
Q∗ satisfies the matrix differential equation
 
dQ∗ (t) ∂2H ∗ ∂u∗ ∂ 2 H ∗ ∂u∗ ∂f ∗
=− 2
− 2
− Q∗ (t)
dt ∂x ∂x ∂u ∂x ∂u

∂f ∗ ∂f ∗ ∂u∗ ∂f ∗ ∂u∗
− Q∗ (t) − Q∗ (t) − Q∗ (t)
∂x ∂u ∂x ∂u ∂x

∂ 2 H ∗ ∂u∗ ∂u∗ ∂2H ∗
− − (11.2.28)
∂u∂x ∂x ∂x ∂x∂u

with 
∂ 2 Φ∗0 
Q∗ (T ) = . (11.2.29)
∂x2 t=T

Proof. Expanding λ into the first order in ε, it follows that

λ = λ∗ + εQ∗ δ + o(ε2 ).

Then, it can be derived by expanding V to the second order in ε that

∂V ∗ ε2 ∂2V ∗
V (x) = V ∗ + ε δ + δ δ + o(ε3 )
∂x 2 ∂x2
 ε2
= V ∗ + ε [λ∗ ] δ + δ  Q∗ δ + o(ε3 ). (11.2.30)
2
Hence,

∂V dV dV ∗ dλ∗ (t)  dδ
H+ = L0 + = L0 + +ε δ + ε [λ∗ (t)]
∂t dt dt dt dt
ε2  dQ∗ (t) dδ
+ δ δ + ε2 δ  Q ∗ + o(ε3 ). (11.2.31)
2 dt dt
11.2 Neighbouring Extremals 449

Thus, from
dV ∗ ∂V ∗
L∗0 + = H∗ + = 0,
dt ∂t
and

dλ∗ (t) ∂L(t, x∗ (t), u∗ (t), λ∗ (t), ρ∗ (t))
=− ,
dt ∂x
it follows that
∂V ∂L∗ 
H+ = L0 − L∗0 − ε δ + [λ∗ ] (f − f ∗ ) + εδ  Q∗ (f − f ∗ ) (11.2.32)
∂t ∂x   
ε 2 ∗
dQ (t)  
+ δ δ + o ε3 (11.2.33)
2 dt

∂L
= L0 − L∗0 − ε δ + λ (f − f ∗ ) (11.2.34)
∂x
ε2 dQ∗ (t)  
+ δ δ + o ε3 (11.2.35)
2 dt
∂L∗0  ∂f

 ∂h

= L0 − L∗0 − ε + [λ∗ ] + [ρ∗ ] δ + λ (f − f ∗ )
∂x ∂x ∂x
(11.2.36)
ε 2 ∗
dQ (t)  
+ δ δ + o ε3 (11.2.37)
2 dt
Then, by expanding λ to first order in ε, and L0 and f to second order
in ε, it gives

∂V
H+
∂t 
∂L∗0 ∂L∗0 ∂u∗
= ε + δ
∂x ∂u ∂x
. /

ε2 ∂ 2 L∗0 ∂ 2 L∗0 ∂u∗ ∂u∗ ∂ 2 L∗0 ∂u∗
+ δ 2
+2 + δ
2 ∂x ∂u∂x ∂x ∂x ∂u2 ∂x
∂L∗0  ∂f

 ∂h

−ε + [λ∗ ] + [ρ∗ ] δ
∂x ∂x ∂x
n  
∗ ∂λ∗k ∂fk∗ ∂fk∗ ∂u∗ ε2 ∂ 2 fk∗
+ λk + ε δ ε + δ + δ
∂x ∂x ∂u ∂x 2 ∂x2
k=1
5
 2 ∗
∂ 2 fk∗ ∂u∗ ∂u∗ ∂ fk ∂u∗ ε2 dQ∗ (t)
+2 + 2
δ + δ δ + o(ε3 )
∂u∂x ∂x ∂x ∂u ∂x 2 dt
∂H ∗ ∂u∗  ∂h

ε2 ∂2H ∗ ∂ 2 H ∗ ∂u∗
=ε δ − ε [ρ∗ ] δ + δ +
∂u ∂x ∂x 2 ∂x2 ∂u∂x ∂x
450 11 Feedback Control
  
∂u∗ ∂2H ∗ ∂u∗
∂ 2 H ∗ ∂u∗ ∗ ∂f

∂f ∗
+ + + Q + Q∗
∂x ∂x∂u ∂x ∂u2 ∂x ∂x ∂x
/
∗ ∗ 
∗ ∂f ∂u ∂f ∗ ∂u∗ ∗ dQ∗ (t)
+Q + Q + δ + o(ε3 ). (11.2.38)
∂u ∂x ∂u ∂x dt

From (11.2.20), the first two terms in the right hand side of (11.2.38)
vanish. Then, (11.2.28) holds because H + ∂V /∂t = 0 and δ ∈ B(n, 1) is
arbitrary. Now, by expanding λ and ∂Φ0 /∂x to the first order in ε and
using (11.2.9), (11.2.14) and (11.2.27), Equation (11.2.29) is obtained.

To continue, let

∂2H ∗  2 ∗
∂ ĥ∗
∗ ∂ hk 
A∗  + ρk , [B ∗ ]  , (11.2.39)
∂u2 ∗
∂u2 ∂u
k∈A

∂2H ∗  ∂ 2 h∗k ∂f ∗

E∗  − − ρ∗k − Q∗ , (11.2.40)
∂x∂u ∗
∂x∂u ∂u
k∈A

and F ∗  −∂ ĥ∗ /∂x.


We have the main theorem.
Theorem 11.2.3 Suppose that the solution (x∗ , u∗ , λ∗ , ρ∗ ) satisfies the As-
sumptions 11.2.1–11.2.8. Then,
−1
∂u∗ A∗ B ∗ E∗
= [Ir or×q ]  . (11.2.41)
∂x [B ∗ ] 0q×q F∗

Proof. By (11.2.19), (11.2.22) and (11.2.27), we obtain

A∗ B ∗ ∂u∗ /∂x E∗
∗ = . (11.2.42)
[B ∗ ] 0q×q ∂ ρ̂ /∂x F∗
From Assumptions 11.2.6–11.2.7, the leftmost block matrix in (11.2.42) is
non-singular. Thus, (11.2.41) holds.

Remark 11.2.2 B ∗ in (11.2.41) will be an empty matrix when A∗ = ∅. In


that case, (11.2.41) reduces from Assumption 11.2.7 to

∂u∗ −1
= [A∗ ] E ∗ (11.2.43)
∂x . /
−1 
∂2H ∗ ∂2H ∗ ∂f ∗
=− + Q∗ .
∂u2 ∂x∂u ∂u

Now, either (11.2.41) or (11.2.43) is substituted into (11.2.28). Then,


a differential Riccati equation for Q∗ is derived with the terminal condi-
tion (11.2.29). Once Q∗ is obtained, ∂u∗ /∂x and hence
11.2 Neighbouring Extremals 451

∂u∗
u(x) ≈ u∗ + (x − x∗ ) (11.2.44)
∂x
can be computed readily.
Remark 11.2.3 Since the magnitude of the admissible perturbation ε0 in
Lemma 11.2.1 is hard to be determined, or the determined ε0 is too small, the
solution’s structure may change after perturbations. Specifically, it is possible
that a small boundary interval or a small interior interval along the nominal
trajectory disappears after perturbations. In the first situation, the method
proposed tries to keep the perturbed trajectory on the boundary, while in the
second the perturbed trajectory may be infeasible in a small interval. As a
remedy, we can modify the control law and project any infeasible point onto
the boundary of the constraints. Suppose there are some constraints infeasible
at time t, which satisfy that hk (t, x, u) > 0 for k ∈ K ⊆ P . Then, the control
law (11.2.44) should be modified as

u(x) = u(x) = {v ∈ U : hk (t, x, v) = 0, k ∈ K ∪ A}. (11.2.45)

In this way, the perturbed solution is always feasible although some optimality
may be lost.
The following algorithm gives the procedure to compute the feedback con-
trol (11.2.44) and its modification (11.2.45).
Algorithm 11.2.1
Step 1. Solve Problem (P1 ) to obtain u∗ (t) and x∗ (t) for t ∈ [0, T ]. The
expression of
ρ∗ (t) = ρ(t, x∗ , u∗ , λ∗ )
can be solved from (11.2.10) to (11.2.12). Substituting the obtained ρ∗
in (11.2.8), λ∗ (t) can be computed by integrating (11.2.8) backwards
from t = T to t = 0 with terminal condition (11.2.9). Then, ρ∗ (t)
can be computed, and A∗ (t) is also obtained.
Step 2. Compute Q∗ (t), t ∈ [0, T ], by integrating (11.2.28) backwards in time
with terminal condition (11.2.29), where ∂u∗ /∂x is given by (11.2.41)
or (11.2.43). Then, ∂u∗ /∂x is obtained from (11.2.41) or (11.2.43)
for each t ∈ [0, T ].
Step 3. For each neighbouring trajectory x(t), the control u(x) is given
by (11.2.44). If any constraints are violated, u(x) shall be modified
as (11.2.45).

11.2.3 Numerical Examples

Consider the following problem, which is generalized from the Rayleigh prob-
lem with a mixed state–control constraint [24].
452 11 Feedback Control

For a given system



dx1 (t) t
= x2 (t) 1 + , x1 (0) = −5,
dt 45
dx2 (t)  
= −x1 (t) + x2 (t) 1.4 − p(x2 (t))2 + 4u(t), x2 (0) = −5,
dt
with p = 0.14, find a control u that minimizes
 4.5
 
g0 (u) = (u(t))2 + (x1 (t))2 dt
0

subject to a continuous inequality constraint

x1 (t) + t
u(t) + ≤ 0, t ∈ [0, 4.5].
6
The Lagrangian L for this problem is
$   %
L = u2 + x21 + λ1 x2 (1 + t/45) + λ2 −x1 + x2 1.4 − px22 + 4u
+ ρ (u + (x1 + t)/6) .

From ∂L/∂u = 0, ρ is solved as

ρ = −2u − 4λ2 .

Then, the dynamics of the costate λ is governed by the differential equations

dλ1 (t)
= −∂L/∂x1 = λ2 (t) − 2x1 (t) − ρ/6
dt
= (5/3)λ2 (t) − 2x1 (t) + u(t)/3, λ1 (T ) = 0
dλ2 (t)
= −∂L/∂x2
dt
= 3pλ2 (t)(x2 (t))2 − 1.4λ2 (t) − λ1 (t)(1 + t/45), λ2 (T ) = 0.

The nominal optimal pair (x∗ , u∗ ) of this problem can be computed


by MISER software [104]. Then, for this nominal trajectory, it follows
from (11.2.41) and (11.2.43) that
⎧$ 1 %
∂u∗ ⎨ − 6 0 , if u∗ + (x∗1 + t)/6 = 0,
= $ % (11.2.46)
∂x ⎩
0 −2 Q∗ , if u∗ + (x∗1 + t)/6 < 0,

where Q∗ is the solution of the following differential equation:

dQ∗ (t)
= F (Q∗ ), Q∗ (4.5) = 02×2 .
dt
11.2 Neighbouring Extremals 453

Here,

− 37 0 0 − 53
F (Q∗ ) = 18 − Q∗
0 0.84λ∗2 x∗2 t+45
45 1.4 − 0.42x∗2
2
t+45
0
− Q∗ 45
− 3 1.4 − 0.42x∗2
5
2

if u∗ + (x∗1 + t)/6 = 0, while

−2 0 0 −1
F (Q∗ ) = − Q∗
0 0.84λ∗2 x∗2 t+45
45 1.4 − 0.42x∗2
2
t+45
0 00
− Q∗ 45 + Q∗ Q∗
−1 1.4 − 0.42x∗2
2 08

if u∗ + (x∗1 + t)/6 < 0.

-2
1

0
x

-4

nominal
-6 feedback
optimal
-5
0 1 2 3 4 0 1 2 3 4

1.5

1 0
0.5
constraint

0 -0.5
u

-0.5
-1
-1

-1.5
-1.5
-2
0 1 2 3 4 0 1 2 3 4
t (seconds) t (seconds)

Fig. 11.2.1: Trajectories under different controls

Consider the case where x1 (0) is perturbed to −6 and p is perturbed to


0.14[1 + 0.2 cos(40πt/9)]. Figure 11.2.1 presents the trajectories of the per-
turbed system under three different controls, the nominal control u∗ , the
454 11 Feedback Control

feedback control u(x) of (11.2.44) and (11.2.45) and the optimal open-loop
control for the perturbed problem. The respective trajectories of the state,
control and constraint are, respectively, depicted by the black solid lines, the
red solid lines and the blue dashed lines in Figure 11.2.1. It is seen that the
errors between the system trajectories under the feedback control and those
under the optimal open-loop control are relatively small, and the nominal
control is infeasible for t ∈ [3.18, 4.5]. Since the perturbation is not small, the
feedback control law (11.2.44) is infeasible in a small interval t ∈ [1.29, 1.34],
where the modified control law (11.2.45) is used instead. Under this modifi-
cation, the feasibility of the feedback control is regained.

11.3 PID Control

11.3.1 Problem Statement

This section is from [132]. Consider the following dynamical system:

dx(t)
= f (x(t), y(t), u(t)), t ∈ (0, T ] (11.3.1)
dt
dy(t)
= p(x(t)) (11.3.2)
dt
x(0) = x0 (11.3.3)
0
y(0) = y (11.3.4)

where T is the terminal time, and x = [x1 , . . . , xn ] ∈ Rn , u = [u1 , . . . , ur ]


∈ Rr , y ∈ R are, respectively, state, control and output, while f =
[f1 , . . . , fn ] ∈ Rn and p ∈ R are, respectively, given continuously differ-
entiable functions. x0 ∈ Rn and y 0 ∈ R are a given constant vector and a
given scalar, respectively.
We assume that the following conditions are satisfied.
Assumption 11.3.1 There exists a constant C1 such that

|f (x, y, u)| ≤ C1 (1 + |x| + |y| + |u|)

for all (x, y, u) ∈ Rn × R × Rr , where | · | denotes the usual Euclidean norm.


Assumption 11.3.2 There exists a constant C2 such that

|p(x)| ≤ C2 (1 + |x|).

Remark 11.3.1 Suppose that the output equations are algebraic equations
given below rather than the output system (11.3.2) with the initial condi-
tion (11.3.4).
11.3 PID Control 455

y(t) = h(x̂(t)), (11.3.5)


where, without loss of generality,

x̂ = [x1 , . . . , xs ] (11.3.6)

with s < n. Furthermore, we assume that

dx̂(t)
= q(x(t)), (11.3.7)
dt
where q = [q1 , . . . , qs ]T is a continuously differentiable function. Then, it is
easy to see that

dy(t)  ∂h(x̂(t)) dxi (t)  ∂h(x̂(t))


s s
= = qi (x(t)) (11.3.8)
dt i=1
∂xi dt i=1
∂xi

with initial condition


y(0) = h(x̂(0)). (11.3.9)
Thus, the formulation of the output expressed in terms of differential equa-
tions given by (11.3.2) with initial condition (11.3.4) is rather general. Cer-
tainly it covers the ship steering problem to be considered later in this section
as a special case.
The control u is assumed to take the form of a PID controller given below:


N1
u(t) = k1,j (y(t) − r(t))χI1,j (t)
j=1


N2  t 
N3
dy(t)
+ k2,j (y(s) − r(s))χI2,j (s)ds + k3,j χI3,j (t),
j=1 0 j=1
dt
(11.3.10)

where r(t) denotes a given reference input, which is a piecewise continuous


function defined on [0, T ],

Ii,j = [ti,j−1 , ti,j ), i = 1, 2, 3; j = 1, . . . , Ni , (11.3.11)

while

0 = ti,0 < ti,1 < ti,2 < · · · < ti,Ni < ti,Ni+1 = T, i = 1, 2, 3, (11.3.12)

are the switching times for the proportional, integral and derivative control
actions, respectively, and χI denotes the indicator function of I given by

1, t ∈ I,
χI (t) = (11.3.13)
0, otherwise.
456 11 Feedback Control

Here, {ki,1 , . . . , ki,Ni }, i = 1, 2, 3, are respective gains for the proportional,


integral and derivative terms of the PID controller.
The form of the PID controller is a generalized version of the conventional
PID controller, particularly, the form of the integral control. For the conven-
tional integral control, it performs the integral action over the whole period of
the time horizon. Because of the accumulation effect, a large value of the gain
for the integral control will cause huge overshoot. On the other hand, if the
gain for the integral control is chosen to be very small, while the overshoot
can become small, the steady state error will take a long time to reduce in
the presence of constant disturbances. The generalized integral control is in
the form for which it is re-set at appropriately chosen fixed switching time
points so as to give a well-regulated control operation.
Remark 11.3.2 Here, we assume that y and r are real-valued functions. It
is straightforward to extend the results to the case where y and r are vector-
valued functions at the expense of notational complexity.

We now specify the region within which the output trajectory is allowed to
move. This region is defined in terms of the following continuous inequality
constraints, which arise due to practical requirements, such as constraints
on the rise time and for avoiding overshoot. They may also arise due to
engineering specification on the PID controller.

gi (t, x(t), y(t), u(t)) ≤ 0, t ∈ [0, T ], i = 1, . . . , M. (11.3.14)

For each i = 1, . . . , M, the function gi is continuously differentiable with


respect to x, y and u while continuous with respect to t.
To ensure a satisfactory tracking of r(t) by y(t), the following terminal
state constraint is imposed:

Ω(y(T )) = y(T ) − r(T ) = 0. (11.3.15)

The optimal control problem may now be stated below. Given sys-
tem (11.3.1)–(11.3.4), design a PID controller in the form defined by (11.3.10)
such that the output y(t) of the corresponding closed loop system will
move within the specified region defined by the continuous inequality con-
straints (11.3.14) and, at the same time, it will track the given reference input
such that the terminal condition (11.3.15) is satisfied. Let this problem be
referred to as Problem (P2 ).
First, we formulate a cost functional below:
 T 2
dy(t)
J(k) = α1 (y(t) − r(t))2 + α2 + α3 [u(t)]2 dt, (11.3.16)
0 dt

where αi , i = 1, 2, 3, are the weighting factors.


11.3 PID Control 457

For the integral term of the PID controller given by (11.3.10), we define
 t
zj (t) = [y(s) − r(s)]χI2,j (s)ds, j = 1, . . . , N2 . (11.3.17)
0

Clearly, for each j = 1, . . . , N2 , (11.3.17) is equivalent to

dzj (t)
= (y(t) − r(t))χI2,j (t), (11.3.18)
dt
zj (0) = 0. (11.3.19)

Let z(t) = [z1 (t), . . . , zN2 (t)] and q(t) = [q1 (t), . . . , qN2 (t)] , where

qj (t) = (y(t) − r(t))χI2,j (t), j = 1, . . . , N2 . (11.3.20)

Then, system (11.3.18)–(11.3.19) become

dz(t)
= q(t), (11.3.21)
dt
z(0) = 0. (11.3.22)

Now, it follows from (11.3.21)–(11.3.22) that system (11.3.1)–(11.3.4) with


u(t) chosen as a PID controller given by (11.3.10) can be written as

⎪ dx(t)

⎪ = f (t, x(t), y(t), z(t), k)

⎨ dt
dy(t)
= p(x(t)) (11.3.23)
⎪ dt



⎩ dz(t) = q(t)
dt
with initial conditions ⎧
⎨ x(0) = x0
y(0) = y 0 (11.3.24)

z(0) = 0
where
f (t, x(t), y(t), z(t), k) = f (x(t), y(t), u(t)), (11.3.25)
while the PID controller u(t) given by (11.3.10) becomes


N1
u(t) = k1,j (y(t) − r(t))χI1,j (t)
j=1


N2 
N3
+ k2,j zj (t) + k3,j p(x(t))χI3,j (t). (11.3.26)
j=1 j=1

Here,
458 11 Feedback Control

k = [k1,1 , . . . , k1,N1 , k2,1 , . . . , k2,N2 , k3,1 , . . . , k3,N3 ] (11.3.27)

is the vector containing the gains for the proportional, integral and derivative
terms of the PID controller.
The specified region remains the same as given by (11.3.14). The cost
functional (11.3.16) becomes
 T
J(k) = α1 (y(t) − r(t))2 + α2 [p(x(t))]2
0
.

N1
+α3 k1,j (y(t) − r(t))χI1,j (t)
j=1


N2 
N3 2
+ k2,j zj (t) + k3,j p(x(t))χI3,j (t) dt. (11.3.28)
j=1 j=1

The problem may now be re-stated as: Given system (11.3.23) with initial
condition (11.3.24), find a PID control parameter vector k such that the
cost function (11.3.28) is minimized subject to the continuous inequality con-
straint (11.3.14) and the terminal equality constraint (11.3.15). Let this prob-
lem be referred to as Problem (Q2 ). Clearly, Problem (Q2 ) is an optimal
parameter selection problem.

11.3.2 Constraint Approximation

The continuous inequality constraints (11.3.14) are handled by constraint


transcription technique presented in Section 4.3. This leads to the following
equivalent equality constraints:
 T
max{gi (t, x(t), y(t), u(t)) , 0}dt = 0, i = 1, . . . , M, (11.3.29)
0

where u(t) is given by (11.3.26). However, the integrands appeared under


the integration in (11.3.29) are nonsmooth. Thus, for each i = 1, . . . , M , we
shall approximate the nonsmooth function max{gi (x(t), y(t), u(t)) , 0} by a
smooth function Li,ε (t, x(t), y(t), u(t)) given by

Li,ε (t, x(t), y(t), u(t))



⎨ 0, if gi (t, x(t), y(t), u(t)) < −ε
= (gi (t, x(t), y(t), u(t)) + ε)2 /4ε, if − ε ≤ gi (t, x(t), y(t), u(t)) ≤ ε

gi (t, x(t), y(t), u(t)), if gi (t, x(t), y(t), u(t)) > ε,
(11.3.30)
11.3 PID Control 459

where u is given by (11.3.26) and ε > 0 is an adjustable constant with small


value. Then, for each i = 1, . . . , M , we define
 T
gi,ε (k) = L̄i,ε (t, x(t), y(t), z(t), k)dt, (11.3.31)
0

where

L̄i,ε (t, x(t), y(t), z(t), k) = Li,ε (t, x(t), y(t), u(t)) (11.3.32)

and u(t) is given by (11.3.26).


We now use the concept of the penalty function to append the functions
gi,ε given by (11.3.31) to the cost functional (11.3.28), forming a new cost
functional given below:
 T
J ε,γ (k) = l(t, x(t), y(t), z(t), k)dt
0
M 
 T
+γ L̄i,ε (t, x(t), y(t), z(t), k)dt,
i=1 0

where
 2  2  2
l(t, x, y, z, k) = α1 y − r + α2 p(x) + α3 u(t) , (11.3.33)

and u(t) is given by (11.3.26) and γ > 0 is a penalty parameter.


We may now state the approximate problem for each ε > 0 and γ > 0 as
follows. Given system (11.3.23) with initial condition (11.3.24) and terminal
condition (11.3.15), find a PID control parameter vector k such that the cost
functional  T
J ε,γ (k) = L̂ε,γ (t, x, y, z, k)dt (11.3.34)
0
is minimized, where

L̂ε,γ (t, x, y, z, k) = l(t, x(t), y(t), z(t), k)



M
+γ L̄i,ε (t, x(t), y(t), z(t), k). (11.3.35)
i=1

This problem is referred to as Problem (Q2 ε,γ ). The relationships between


Problem (Q2 ε,γ ) and Problem (Q2 ) are given in the following theorems. Their
proofs are similar to those given for Theorems 2.1 and 2.2 in [259], respec-
tively.
Theorem 11.3.1 For any ε > 0, there exists a γ(ε) > 0 such that for all

γ, 0 < γ < γ(ε), if kε,γ is an optimal solution of Problem (Q2 ε,γ ), then it
satisfies the continuous inequality constraint (11.3.14) of Problem (Q2 ).
460 11 Feedback Control

Theorem 11.3.2 Let k∗ and kε,γ(ε) ∗


be, respectively, optimal solutions of

Problem (Q2 ) and Problem (Q2 ε,γ ), where γ(ε) is chosen such that kε,γ(ε) sat-
isfies the continuous inequality constraint (11.3.14) of Problem (Q2 ). Then,
!

lim J kε,γ(ε) = J(k∗ ), (11.3.36)
ε→0

where J is defined by (11.3.28).

On the basis of Theorems 11.3.1 and 11.3.2, Problem (P2 ) can be solved
through solving a sequence of optimal parameter selection problems (Q2 ε,γ )
subject to only terminal equality condition (11.3.15). Each of these optimal
parameter selection problems can be solved as a nonlinear optimization prob-
lem by using a gradient-based optimization method, such as the sequential
quadratic programming approximation scheme. See Chapter 3 for details.
Thus, the optimal control software, MISER, is applicable. Further details are
given in the next section.

11.3.3 Computational Method

In this section, we will propose a reliable computational method for solving


Problem (Q2 ) via solving a sequence of Problems (Q2 ε,γ ), where for each ε >
0 and γ > 0, Problem (Q2 ε,γ ) is solved as a nonlinear optimization problem.
For doing this, it is required to provide, for each k, the value of the cost
functional J ε,γ (k), as well as its gradient ∂J ε,γ (k)/∂k. Furthermore, we also
need the value of the terminal constraint function Ω(y(T |k)) and its gradient
∂Ω(y(T |k))/∂k. It is obvious that the values of the cost functional J ε,γ (k)
and the terminal constraint function Ω(y(T |k)) can be readily obtained after
system (11.3.23) with initial condition (11.3.24) corresponding to k is solved.
For the gradient formulas of the cost functional J ε,γ (k) and the terminal
constraint function Ω(y(T |u)) corresponding to each k, we have the following
two theorems. Their proofs are similar to those given for Theorem 5.2.1 in
[253].
Theorem 11.3.3 The gradient formula for the cost function J ε,γ (k) with
respect to k is given by
 T
∂J ε,γ (k) ∂Hε,γ (t, x(t), y(t), z(t), k, λε,γ (t))
= dt. (11.3.37)
∂k 0 ∂k

Here, Hε,γ (t, x, y, z, k, λ) is the Hamiltonian function given by

Hε,γ (t, x, y, z, k, λ) = L̂ε,γ (t, x, y, z, k) + λ ˆ


ε,γ f (t, x, y, z, k),

where L̂ε,γ is as defined by (11.3.35)


11.3 PID Control 461
$ %
fˆ = (f ) , p, q  ,

and λε,γ is the solution of following system of costate differential equations:

dλ(t) ∂Hε,γ (t, x(t), y(t), z(t), k, λ(t))


=− ,
dt ∂x
∂Hε,γ (t, x(t), y(t), z(t), k, λ(t))
,
∂y

∂Hε,γ (t, x(t), y(t), z(t), k, λ(t))
(11.3.38a)
∂z

with the boundary condition


λ(T ) = 0. (11.3.38b)

Theorem 11.3.4 The gradient formula for the terminal constraint function
Ω(y(T |k)) with respect to k is given by

∂Ω(y(T |k) T
∂ H̃ε,γ (t, x(t), y(t), z(t), k, λ̃ε,γ (t))
= dt, (11.3.39)
∂k 0 ∂k

where H̃ε,γ (t, x, y, z, k, λ) is the Hamiltonian function given by

H̃ε,γ (t, x, y, z, k, λ) = λ̃ ˆ


ε,γ f (t, x, y, z, k). (11.3.40)

Here, λ̃ε,γ is the solution of following system of costate differential equations:

dλ̃(t) ∂ H̃ε,γ (t, x(t), y(t), z(t), k, λ̃(t))


=− ,
dt ∂x
∂ H̃ε,γ (t, x(t), y(t), z(t), k, λ̃(t))
,
∂y

∂ H̃ε,γ (t, x(t), y(t), z(t), k, λ̃(t))
(11.3.41a)
∂z

with the boundary condition

dΩ(y(T ))
λ̃(T ) = . (11.3.41b)
dy

For each ε > 0, γ > 0, Problem (Q2 ε,γ ) is to be solved as a nonlinear


optimization problem using the gradient formulas given in Theorems 11.3.3
and 11.3.4. Details are reported in the following as an algorithm.
Algorithm 11.3.1
1. Choose ε > 0, γ > 0 and k.

2. Solve Problem (Q2 ε,γ ) as a nonlinear optimization problem, yielding kε,γ .
462 11 Feedback Control

3. Check whether all the continuous inequality constraint (11.3.14) are satis-
fied or not. If they are satisfied, go to Step 4. Otherwise, increase γ to 10γ

and go to Step 2 with kε,γ as the initial guess for the new optimization
process.
4. If ε is small enough, say, less than or equal to a given small number, we

have a successful exit. Else, decrease ε to ε/10 and go to Step 2, using kε,γ
as the initial guess for the new optimization process.

11.3.4 Application to a Ship Steering Control Problem

In this section, we apply the proposed method to a ship steering control


problem. Our aim is to design a PID controller such that the heading angle,
y(t), of the ship will follow the change course set by the reference input signal
r(t). The control system is as shown in Figure 11.3.1. The ship motion can be
described by the following differential equations defined on [0, T ] (see [17]).
In this application, T = 300s.

Fig. 11.3.1: The overall control system of the ship

 3 
d3 y(t) d2 y(t) dy(t) dy(t) dδ  (t)
+ b 1 + b2 a1 + a2 = b3 + b2 δ  (t) + w,
dt3 dt2 dt dt dt
(11.3.42)
where
d(d)
w = b3 + b2 d,
dt
dδ(t)
= b4 e (t), (11.3.43)
dt
with
e, if |e| ≤ emax
e = (11.3.44)
emax sign(e), if |e| ≥ emax
11.3 PID Control 463

where e = u − δ,

δ, if |δ| ≤ δmax
δ = (11.3.45)
δmax sign(δ), if |δ| ≥ δmax

The variable w is to account for sea disturbances acting on the ship with
d a constant disturbance, u is the control that is chosen in the form of a
PID controller defined by (11.3.10), δ is the rudder angle, e is the error
as defined, e and δ  are the real inputs to the actuator and ship dynamics,
respectively, because of the saturation properties that are defined as (11.3.44)
and (11.3.45). The ship model is in its full generality without resorting to
simplification and linearization. This work develops further some previous
studies of optimal ship steering strategies with time optimal control [257],
phase advanced control [31], parameter self-turning [141], adaptive control
[7] and constrained optimal model following [264].
For a ship steering problem, it has two phases: course changing and
course keeping. During the course changing phase, it is required to manoeu-
vre the ship such that it moves quickly towards the desired course set by
the command without violating the constraints arising from performance
specifications and physical limitations on the controller. During the course
keeping phase, the ship is required to move along the desired course. In
this application, the PID controller of the form defined by (11.3.10) with
N1 = N2 = N3 = 6 is used. More specifically,


6
u(t) = k1,i (y(t) − r(t))χ[ti−1 ,ti ) (t)
i=1

6  t
+ k2,i (y(s) − r(s)) χ[ti−1 ,ti ) (s)ds
i=1 0


6
dy(t)
+ k3,i χ[ti−1 ,ti ) (t), (11.3.46)
i=1
dt

where χI denotes the indicator function of I defined by (11.3.13), while ti ,


i = 1, . . . , 5, are fixed switching time points to be specified later.
Set
dy(t) d2 y(t)
x1 (t) = y(t), x2 (t) = , x3 (t) = , x4 (t) = δ(t) (11.3.47)
dt dt2
and
 t
x5,j (t) = (y(s) − r(s)) χ[tj−1 ,tj ) (s)ds, j = 1, . . . , 6. (11.3.48)
0

Then, the dynamics of the ship can be expressed as


464 11 Feedback Control

dx1 (t)
= x2 (t) (11.3.49)
dt
dx2 (t)
= x3 (t) (11.3.50)
dt
dx3 (t)  
= −b1 x3 (t) − b2 a1 (x2 (t))3 + a2 x2 (t)) + b3 b4 e + b2 (x4 (t) + d
dt
(11.3.51)
dx4 (t)
= b4 e (11.3.52)
dt
dx5,j (t)
= x1 (t) − r(t)χ[tj−1 ,tj ) (t), j = 1, . . . , 6 (11.3.53)
dt
with the initial condition

x(0) = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] , (11.3.54)

where

x = [x1 , x2 , . . . , x10 ]T
e = u(t) − x4 (t) (11.3.55)

with

6
u(t) = k1,i (x1 (t) − r(t)) χ[ti−1 ,ti ) (t)
i=1

6 
6
+ k2,i x5,j (t) + k3,i x2 (t)χ[ti−1 ,ti ) (t). (11.3.56)
i=1 i=1

The values of the coefficients appeared in the equations are given in


Table 11.3.1. The reference input signal r(t) used in our example is r(t) =

Table 11.3.1: Coefficients for the ship model

a1 a2 b1 b2 b3 b4
−30.0 −5.6 0.1372 −0.0002014 −0.003737 0.5

π/180, for t ∈ [0, 300s].


This ship steering problem is a special case of (11.3.1)–(11.3.4), where the
output system is
dx1 (t)
= x2 (t)
dt
with initial condition
x1 (0) = 0.
11.3 PID Control 465

In practice, a large overshoot is undesirable. In this problem, the following


constraint is imposed on the upper bound of the heading angle x1 (t).

x1 (t) − 1.01r(t) ≤ 0, t ∈ [0, 300s], (11.3.57)

i.e., the heading angle should not go beyond 1% of the desired reference input
r(t). This constraint can be written as

g1 (t) = x1 (t) − 101%r(t) ≤ 0, t ∈ [0, 300s]. (11.3.58)

We also impose constraint on the rise time of the heading angle such that
the heading angle is constrained to reach at least 70% of the desired reference
input in 30 seconds and 95% in 60 seconds, i.e.,

g2 (t) = h(t) − x1 (t) ≤ 0, t ∈ [0, 300s], (11.3.59)

where


⎪ 0, t ∈ [0, 6)

5.1 × 10−4 t − 3.1 × 10−3 , t ∈ [6, 30)
h(t) = (11.3.60)
⎪ 1.5 × 10−4 t + 7.9 × 10−3 ,
⎪ t ∈ [30, 60)

2.2 × 10−6 t + 16.4 × 10−3 , t ∈ [60, 300].

To cater for the saturation property of the actuator, it is equivalent to


impose upper and lower bounds on x4 (t), i.e.,

− π/6 ≤ x4 (t) ≤ π/6, t ∈ [0, 300s], (11.3.61)

which are continuous inequality constraints. They can be rewritten as

g3 (t) = −x4 (t) − π/6 ≤ 0, t ∈ [0, 300s] (11.3.62)

and
g4 (t) = x4 (t) − π/6 ≤ 0, t ∈ [0, 300s]. (11.3.63)
Similarly, to cater for another saturation property, we have

− π/30 ≤ x4 (t) − u(t) ≤ π/30, t ∈ [0, 300s]. (11.3.64)

They are again continuous inequality constraints, which can be rewritten as

g5 (t) = −x4 (t) + u(t) − π/30 ≤ 0, t ∈ [0, 300s] (11.3.65)

and
g6 (t) = x4 (t) − u(t) − π/30 ≤ 0, t ∈ [0, 300s], (11.3.66)
where u(t) is given by (11.3.56).
The terminal equality constraint is
466 11 Feedback Control

Ω(x1 (300)) = x1 (300) − r(300)


= x1 (300) − π/180 = 0. (11.3.67)

0.018

0.016
r(t)
0.014 1.01r(t)
h(t)
y(t)
Heading Angle y (rad)

0.012

0.01

0.008

0.006

0.004

0.002

0
0 30 60 90 120 150 180 210 240 270 300
t (s)

Fig. 11.3.2: The heading angle of the ship

-0.48 -0.42

-0.5 -0.44

-0.52 -0.46

-0.54 -0.48
3

g4
g

-0.56 -0.5

-0.58 -0.52

-0.6 -0.54

-0.62 -0.56
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t (300s) t (300s)

Fig. 11.3.3: The constraints for the saturation of the actuator

0 0

-0.05 -0.05

-0.1 -0.1
6
5

g
g

-0.15 -0.15

-0.2 -0.2

-0.25 -0.25
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Fig. 11.3.4: The constraints for the saturation of the control


11.3 PID Control 467

0.04

0.02

Rudder Angle (rad)


0

-0.02

-0.04

-0.06

-0.08

-0.1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
t (300s)

Fig. 11.3.5: The rudder angle of the ship

The optimal PID control problem may now be stated formally as follows.
Given system (11.3.49)–(11.3.54), find a PID control parameter vector
k = [(k 1 ) , . . . , (k 6 ) ] with k i = [k1i , k2i , k3i ] , i = 1, 2, . . . , 6, such that the
cost functional
 300
J= {α1 (x1 (t) − r(t))2 + α2 x22 (t) + α3 u2 (t)}dt (11.3.68)
0

is minimized subject to the continuous inequality constraints (11.3.62),


(11.3.63), (11.3.65), (11.3.66) and (11.3.58) and the terminal condition
(11.3.67), where

0.018

0.016
r(t)
y(t)
0.014

0.012
Heading Angle y (rad)

0.01

0.008

0.006

0.004

0.002

0
0 30 60 90 120 150 180 210 240 270 300
t (s)

Fig. 11.3.6: The heading angle of the ship with a larger disturbance
468 11 Feedback Control
0.02

0.018

0.016

r(t)
0.014
y(t)

Heading Angle y (rad)


0.012

0.01

0.008

0.006

0.004

0.002

0
0 30 60 90 120 150 180 210 240 270 300
t (s)

Fig. 11.3.7: The heading angle of the ship with a disturbance coming from
the initial heading direction


6 
6
u(t) = k1,i (x1 (t) − r(t))χ[ti−1 ,ti ) (t) + k2,i x5,j (t)
i=1 i=1

6
+ k3,i x2 (t)χ[ti−1 ,ti ) (t). (11.3.69)
i=1

Here, t0 = 0, t6 = 300 and ti , i = 1, 2, . . . , 5, are the switching time points


that are chosen to be at the time points where the constraint function g2 is
non-differentiable. They are t1 = 6, t2 = 18, t3 = 30, t4 = 45 and t5 = 60.
Let this problem be referred to as Problem (S), and it is solvable by the
computational method developed in Section 11.3.3.
We then construct Problem (P2ε,γ ) according to the procedure as specified
in Section 11.3.2, where the appended new cost functional is given by
 300 *
J ε,γ (k) = α1 (x1 (t) − r(t))2 + α2 x22 (t) + α3 u2 (t)
0
+
+ γ(g1,ε + g2,ε + g3,ε + g4,ε + g5,ε + g6,ε ) dt, (11.3.70)

where gi,ε , i = 1, . . . , 6, are obtained from gi , i = 1, . . . , 6, respectively,


according to (11.3.31). In this problem, we set α1 = 10, α2 = 400andα3 =
0.05. It is to be minimized subject to terminal equality constraint (11.3.67).
In real world, disturbances always exist and there are many kinds of distur-
bances. We consider the case, where the ship is encountered with a constant
disturbance d. Assume that d = 0.3π/180.
Problem (P2ε,γ ) is solved by using Algorithm 3.3.1, where the final ε and
γ are ε = 0.01 and γ = 10. The optimal parameters for the PID controller u
obtained are
11.4 Exercises 469

k1,∗ = [5.78685, 7.27203, 2.62351, 6.98467, 9.76934, 7.39799]


k2,∗ = [1.03217, 0.00000, 0.00303, 0.00000, 0.24398, 0.84387]
k3,∗ = [99.81791, 100.52777, 100.67912, 100.13533, 99.75020, 99.70358] .

The results obtained are shown in Figures 11.3.2, 11.3.3, 11.3.4, and 11.3.5.
From the results obtained, we see that all the constraints are satisfied. The
heading angle tracks the desired reference input with no steady state error
after some small oscillation due to the constant disturbance. The overshooting
of the heading angle above the reference input is less than 1%, and hence
the constraint g1 (t) ≤ 0, t ∈ [0, 300], is satisfied. To test the robustness
of this PID controller, we run the model with the optimal PID controller
under the following environments: (1) The disturbance is much larger, more
specifically, d = 0.6 × π/180; and (2) the disturbance is coming from the
initial heading direction, more specifically, d = −0.3 × π/180. The results are
shown in Figures 11.3.6 and 11.3.7. In both cases, we see that the heading
angles track the desired reference input with no steady state error after some
small oscillations.

11.4 Exercises

11.1. Provide a proof of Theorem 11.3.1.

11.2. Provide a proof of Theorem 11.3.2.

11.3. Provide a proof of Theorem 11.3.3.

11.4. Provide a proof of Theorem 11.3.4.


Chapter 12
On Some Special Classes of Stochastic
Optimal Control Problems

12.1 Introduction

In this chapter, we consider two classes of stochastic optimal control prob-


lems. More specifically, in Section 12.2 we consider a class of combined optimal parameter selection and optimal control problems in which the dynamical system is governed by a linear Itô stochastic differential equation involving a Wiener process. Both the control and system parameter vectors may, however, appear nonlinearly in the system dynamics. The cost functional is taken
as an expected value of a quadratic function of the state vector, where the
weighting matrices are time invariant but are allowed to be nonlinear in both
the control and system parameters. Furthermore, certain realistic features
such as probabilistic constraints on the state vector may also be included.
In Section 12.3, we consider a class of partially observed linear stochastic
control problems described by three sets of stochastic differential equations:
one for the system to be controlled, one for the observer (measurement)
channel and one for the control channel driven by the observed process. The
noise processes perturbing the system and observer dynamics are vector-
valued Poisson processes.
For each of these stochastic optimal control problems, we show that it is equivalent to a deterministic optimal control problem. These equivalent deter-
ministic optimal control problems are further transformed into special cases
of the form considered in Section 8.8. The main references of this chapter are
[239] and [243].


12.2 A Combined Optimal Parameter and Optimal


Control Problem

Consider a system described by the following system of linear Itô stochastic


differential equations defined on the fixed time interval (0, T ]:

dξ(t) = A(t, δ, u(t))ξ(t)dt + b(t, δ, u(t))dt + D(t, δ, u(t))dw(t) (12.2.1a)

with the initial condition


ξ(0) = ξ0 , (12.2.1b)
where $\xi(t) = [\xi_1(t), \ldots, \xi_n(t)]^\top \in \mathbb{R}^n$ is the state vector, $\delta = [\delta_1, \ldots, \delta_s]^\top \in \mathbb{R}^s$ is the system parameter vector, $u = [u_1, \ldots, u_r]^\top \in \mathbb{R}^r$ is the control vector, $\xi^0 = [\xi_1^0, \ldots, \xi_n^0]^\top \in \mathbb{R}^n$ is the initial state vector that is Gaussian distributed with mean $\mu^0$ and covariance matrix $M^0$, and $w(t) = [w_1(t), \ldots, w_m(t)]^\top \in \mathbb{R}^m$ is a Wiener process with zero mean and covariance matrix
$$
E\big\{ w(t)\,[w(\tau)]^\top \big\} = \int_0^{\min\{t,\tau\}} \Theta(s)\, ds, \qquad (12.2.2)
$$

where Θ ∈ Rm×m is a symmetric positive definite matrix function, and E


denotes the mathematical expectation.
We assume throughout this chapter that the following conditions are sat-
isfied.
Assumption 12.2.1 A(t, δ, u) ∈ Rn×n , b(t, δ, u) ∈ Rn and D(t, δ, u) ∈
Rn×m are continuously differentiable with respect to all their arguments.

Assumption 12.2.2 The Wiener process w(t) and the initial random vector
ξ 0 are statistically independent.

To continue, let U be the class of admissible controls as defined in Sec-


tion 8.2. Furthermore, let

Ω = {δ ∈ Rs : ηj (δ) ≥ 0, j = 1, . . . , M }, (12.2.3)

where ηj , j = 1, . . . , M , are continuously differentiable functions of the pa-


rameter δ.
The next step is to introduce an important class of constraints on the
state of the dynamical system (12.2.1). These constraints arise naturally in
the situation when we wish to confine the state to be within a given acceptable
region with a certain degree of confidence for all t ∈ [0, T ]. These probabilistic
state constraints may be stated as
$$
\mathrm{Prob}\big\{ \alpha_i \le (c^i)^\top \xi(t) \le \beta_i, \ \text{for all } t \in [0,T] \big\} > \Delta_i, \quad i = 1, \ldots, N, \qquad (12.2.4)
$$

where ci , i = 1, . . . , N , are n-vectors and αi , βi and Δi , i = 1, . . . , N , are


given real constants.
An element (δ, u) ∈ Ω × U is said to be a feasible combined parameter and
control if it satisfies the probabilistic state constraints specified in (12.2.4).
Let D be the class of all such feasible combined parameters and controls.
Hence D is called the class of feasible combined parameters and controls. We
are now in a position to specify our problem formally as follows.
Subject to the dynamical system (12.2.1), find a feasible combined param-
eter and control (δ, u) ∈ D such that the cost functional
$$
g_0(\delta, u) = E\Big\{ [\xi(T)]^\top S(\delta)\,\xi(T) + [p(\delta)]^\top \xi(T) + \upsilon(\delta)
+ \int_0^T \big\{ [q(t,\delta,u(t))]^\top \xi(t) + \vartheta(t,\delta,u(t)) + [\xi(t)]^\top Q(t,\delta,u(t))\,\xi(t) \big\}\, dt \Big\} \qquad (12.2.5)
$$
is minimized over D, where S(δ) ∈ Rn×n and Q(t, δ, u) ∈ Rn×n are sym-
metric and positive semi-definite matrices continuously differentiable with
respect to their respective arguments, while q(t, δ, u) and p(δ) (respectively,
ϑ(t, δ, u) and υ(δ)) are n vector-valued functions (respectively, real-valued
functions) that are also continuously differentiable with respect to their ar-
guments. For convenience, let this combined optimal parameter selection and
optimal control problem be referred to as Problem (SP1 ).

12.2.1 Deterministic Transformation

In this section, we wish to show that the combined optimal parameter selec-
tion and optimal control problem (SP1 ) is equivalent to a deterministic one.
To begin, we note that the solution of the system (12.2.1) corresponding to
each (δ, u) can be written as
$$
\xi(t \mid \delta, u) = \Phi(t, 0 \mid \delta, u)\,\xi^0 + \int_0^t \Phi(t, s \mid \delta, u)\, b(s, \delta, u(s))\, ds + \int_0^t \Phi(t, s \mid \delta, u)\, D(s, \delta, u(s))\, dw(s), \qquad (12.2.6)
$$

where, for each (δ, u), Φ(t, s | δ, u) ∈ Rn×n is the principal solution matrix
of the homogeneous system
$$
\frac{\partial \Phi(t, \tau)}{\partial t} = A(t, \delta, u(t))\, \Phi(t, \tau), \quad t > \tau, \qquad (12.2.7a)
$$
$$
\Phi(\tau, \tau) = I, \qquad (12.2.7b)
$$
where I is the identity matrix.

Since a linear transformation of a Gaussian process is also Gaussian, the process $\{\xi_2(t) : t > 0\}$ given by
$$
\xi_2(t) = \int_0^t \Phi(t, s \mid \delta, u)\, D(s, \delta, u(s))\, dw(s), \quad t > 0,
$$
is Gaussian, and
$$
\xi_1(t) = \Phi(t, 0 \mid \delta, u)\, \xi^0
$$
is also Gaussian if $\xi^0$ is. Thus, for each $(\delta, u)$, the process $\{\xi(t) : t \ge 0\}$, given by (12.2.6), is a Gaussian Markov process with mean
$$
\mu(t \mid \delta, u) = E\{\xi(t \mid \delta, u)\} = \Phi(t, 0 \mid \delta, u)\, E\{\xi^0\} + \int_0^t \Phi(t, s \mid \delta, u)\, b(s, \delta, u(s))\, ds
= \Phi(t, 0 \mid \delta, u)\, \mu^0 + \int_0^t \Phi(t, s \mid \delta, u)\, b(s, \delta, u(s))\, ds \qquad (12.2.8)
$$
and covariance matrix
$$
\Psi(t \mid \delta, u) = \Phi(t, 0 \mid \delta, u)\, M^0\, [\Phi(t, 0 \mid \delta, u)]^\top + \int_0^t \Phi(t, \tau \mid \delta, u)\, D(\tau, \delta, u(\tau))\, \Theta(\tau)\, [D(\tau, \delta, u(\tau))]^\top\, [\Phi(t, \tau \mid \delta, u)]^\top\, d\tau. \qquad (12.2.9)
$$

Differentiating (12.2.8) with respect to $t$ and then using (12.2.7), we note that for each $(\delta, u) \in D$, $\mu(t \mid \delta, u)$ is the corresponding solution of the following system of differential equations:
$$
\frac{d\mu(t)}{dt} = A(t, \delta, u(t))\, \mu(t) + b(t, \delta, u(t)), \quad t > 0, \qquad (12.2.10a)
$$
$$
\mu(0) = \mu^0. \qquad (12.2.10b)
$$
Differentiating (12.2.9) with respect to $t$ and then using (12.2.7), it follows that for each $(\delta, u) \in D$, $\Psi(t \mid \delta, u)$ is the corresponding solution of the following matrix differential equation:
$$
\frac{d\Psi(t)}{dt} = A(t, \delta, u(t))\, \Psi(t) + \Psi(t)\, [A(t, \delta, u(t))]^\top + D(t, \delta, u(t))\, \Theta(t)\, [D(t, \delta, u(t))]^\top \qquad (12.2.11a)
$$
with the initial condition
$$
\Psi(0) = M^0. \qquad (12.2.11b)
$$
Note that Ψ (t | δ, u) is symmetric. Thus, there are only n(n + 1)/2 dis-
tinct differential equations in (12.2.11). The corresponding conditional joint
probability density function for ξ(t) is given by

$$
f(x, t \mid \delta, u) = (2\pi)^{-n/2}\, \big[\det \Psi(t \mid \delta, u)\big]^{-1/2} \exp\Big\{ -\tfrac{1}{2}\, [x - \mu(t \mid \delta, u)]^\top\, [\Psi(t \mid \delta, u)]^{-1}\, [x - \mu(t \mid \delta, u)] \Big\}. \qquad (12.2.12)
$$
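Numerically, the moment equations (12.2.10)–(12.2.11) form an ordinary initial value problem that any standard solver can integrate. The Python sketch below propagates $\mu(t)$ and $\Psi(t)$ jointly with scipy for an illustrative time-invariant system; the matrices A, b, D and Θ used here are placeholders, not data from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

n = 2
A = np.array([[0.0, 1.0], [-2.0, -0.5]])   # illustrative A(t, delta, u(t))
b = np.array([0.0, 1.0])                   # illustrative b(t, delta, u(t))
D = np.eye(n)                              # illustrative D(t, delta, u(t))
Theta = 0.1 * np.eye(n)                    # Wiener intensity matrix

def moment_rhs(t, z):
    # z packs the mean (first n entries) and the full covariance (next n*n);
    # symmetry of Psi could be exploited, but is ignored here for clarity.
    mu = z[:n]
    Psi = z[n:].reshape(n, n)
    dmu = A @ mu + b                               # (12.2.10a)
    dPsi = A @ Psi + Psi @ A.T + D @ Theta @ D.T   # (12.2.11a)
    return np.concatenate([dmu, dPsi.ravel()])

mu0 = np.zeros(n)
M0 = 0.01 * np.eye(n)
z0 = np.concatenate([mu0, M0.ravel()])
sol = solve_ivp(moment_rhs, (0.0, 5.0), z0, rtol=1e-8, atol=1e-10)
mu_T, Psi_T = sol.y[:n, -1], sol.y[n:, -1].reshape(n, n)
```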

Let us now turn our attention to the cost functional (12.2.5). First, we
note that
$$
E\big\{ [\xi(t)]^\top Q(t,\delta,u(t))\, \xi(t) \big\}
= E\big\{ \mathrm{Tr}\big( [\xi(t)]^\top Q(t,\delta,u(t))\, \xi(t) \big) \big\}
= E\big\{ \mathrm{Tr}\big( Q(t,\delta,u(t))\, \xi(t)\, [\xi(t)]^\top \big) \big\}
= \mathrm{Tr}\Big\{ Q(t,\delta,u(t)) \big[ \Psi(t \mid \delta,u) + \mu(t \mid \delta,u)\, [\mu(t \mid \delta,u)]^\top \big] \Big\}, \qquad (12.2.13)
$$

where Tr(·) denotes the trace of a matrix.


The first term of the cost functional (12.2.5) can be handled in a similar
manner. The transformation of the second term into its equivalent determin-
istic form is obvious. The third term is already in deterministic form. Thus,
we have the following lemma.
Lemma 12.2.1 The cost functional (12.2.5) is equivalent to
$$
g_0(\delta, u) = \mathrm{Tr}\Big\{ S(\delta) \big[ \Psi(T \mid \delta,u) + \mu(T \mid \delta,u)\, [\mu(T \mid \delta,u)]^\top \big] \Big\} + [p(\delta)]^\top \mu(T \mid \delta,u) + \upsilon(\delta)
+ \int_0^T \big\{ [q(t,\delta,u(t))]^\top \mu(t \mid \delta,u) + \vartheta(t,\delta,u(t)) \big\}\, dt
+ \int_0^T \mathrm{Tr}\Big\{ Q(t,\delta,u(t)) \big[ \Psi(t \mid \delta,u) + \mu(t \mid \delta,u)\, [\mu(t \mid \delta,u)]^\top \big] \Big\}\, dt, \qquad (12.2.14)
$$

where μ(T | δ, u) and Ψ (t | δ, u) are deterministic and determined, respec-


tively, by (12.2.10) and (12.2.11).
For the probabilistic state constraints specified in (12.2.4), we have the
following lemma.
Lemma 12.2.2 For each i = 1, . . . , N , the corresponding probabilistic con-
straint specified in (12.2.4) is equivalent to
$$
\mathrm{erf}\left\{ \frac{\beta_i - (c^i)^\top \mu(t \mid \delta,u)}{\sqrt{2\pi\, (c^i)^\top \Psi(t \mid \delta,u)\, c^i}} \right\} - \mathrm{erf}\left\{ \frac{\alpha_i - (c^i)^\top \mu(t \mid \delta,u)}{\sqrt{2\pi\, (c^i)^\top \Psi(t \mid \delta,u)\, c^i}} \right\} \ge \Delta_i, \qquad (12.2.15)
$$
for all t ∈ [0, T ].
Proof. Since $\xi(t)$ is Gaussian with mean $\mu(t \mid \delta, u)$ and covariance $\Psi(t \mid \delta, u)$, it is clear that for each $i = 1, \ldots, N$, the scalar process $(c^i)^\top \xi(t)$ is also Gaussian with mean $(c^i)^\top \mu(t \mid \delta, u)$ and covariance $(c^i)^\top \Psi(t \mid \delta, u)\, c^i$. Thus, the corresponding constraint specified in (12.2.4) can be rewritten as
$$
\int_{\alpha_i}^{\beta_i} \frac{1}{\sqrt{2\pi\, (c^i)^\top \Psi(t \mid \delta,u)\, c^i}} \exp\left\{ -\frac{1}{2}\, \frac{\big( y - (c^i)^\top \mu(t \mid \delta,u) \big)^2}{(c^i)^\top \Psi(t \mid \delta,u)\, c^i} \right\} dy \ge \Delta_i.
$$

By carrying out the required integration, it is easy to show that this constraint
is equivalent to (12.2.15). This completes the proof.

Remark 12.2.1 Define


$$
g_i(t, \delta, u) = \mathrm{erf}\left\{ \frac{\beta_i - (c^i)^\top \mu(t \mid \delta,u)}{\sqrt{2\pi\, (c^i)^\top \Psi(t \mid \delta,u)\, c^i}} \right\} - \mathrm{erf}\left\{ \frac{\alpha_i - (c^i)^\top \mu(t \mid \delta,u)}{\sqrt{2\pi\, (c^i)^\top \Psi(t \mid \delta,u)\, c^i}} \right\}. \qquad (12.2.16)
$$
Then, (12.2.15) can be written as

− Δi + gi (t, δ, u) ≥ 0, for all t ∈ [0, T ], i = 1, . . . , N. (12.2.17)

The constraints (12.2.17) are continuous inequality constraints. These con-


tinuous constraints can be approximated by a sequence of inequality con-
straints in canonical form by using the constraint transcription technique
presented in Section 4.2.
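For illustration, one common way to carry out such a transcription is to replace the requirement $g_i(t, \delta, u) - \Delta_i \ge 0$ for all $t$ by the integral condition $\int_0^T \min\{g_i - \Delta_i, 0\}\, dt = 0$ and then smooth the integrand. The sketch below implements a piecewise-quadratic smoothing of $\min\{g, 0\}$; the exact form and constants used in Section 4.2 may differ, so this is an assumption made for illustration only.

```python
import numpy as np

def smoothed_min(g, eps):
    """C^1 approximation of min(g, 0): equals g for g <= -eps, 0 for
    g >= eps, and a quadratic blend -(g - eps)**2 / (4 eps) in between."""
    g = np.asarray(g, dtype=float)
    out = np.where(g >= eps, 0.0, g)
    mid = (g > -eps) & (g < eps)
    return np.where(mid, -((g - eps) ** 2) / (4.0 * eps), out)

def transcribed_constraint(g_vals, t_grid, eps):
    """Approximate G = int_0^T min(g(t), 0) dt on a time grid;
    feasibility corresponds to G >= -tolerance."""
    return np.trapz(smoothed_min(g_vals, eps), t_grid)
```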
Let x(t) be the vector formed from μ(t) and the independent components
of the matrix Ψ (t), and let f be the corresponding vector obtained from
the right hand side of (12.2.10) and (12.2.11). Furthermore, let D again de-
note the class of all feasible combined parameters and controls in the sense
that each of its elements is in Ω × U and satisfies the constraints specified
in (12.2.17). We can summarize the above analysis in the following theorem.
Theorem 12.2.1 Problem (SP1 ) is equivalent to the following deterministic
combined optimal parameter selection and optimal control problem, denoted
as Problem (DP1 ).
Subject to the dynamical system

$$
\frac{dx(t)}{dt} = f(t, x(t), \delta, u(t)), \qquad (12.2.18a)
$$
$$
x(0) = x^0, \qquad (12.2.18b)
$$
where $x^0$ is formed by $\mu^0$ and the components appearing in the upper triangular part of the covariance matrix $M^0$, find a combined parameter and control $(\delta, u) \in D$ such that the cost functional
$$
g_0(\delta, u) = \Phi_0(x(T \mid \delta, u), \delta) + \int_0^T L_0(t, x(t \mid \delta, u), \delta, u(t))\, dt \qquad (12.2.19)
$$

is minimized over D, where Φ0 and L0 are obtained from the corresponding


terms of (12.2.14) in an obvious manner.
Remark 12.2.2 Note that the deterministic transformation has increased the dimension of the state from $n$ to $\frac{1}{2}(n^2 + 3n)$.
Problem (DP1 ) can be solved by using the approach presented in Sec-
tion 9.2. More specifically, the control parametrization method is first ap-
plied to approximate the control function in Problem (DP1 ) by a piecewise
constant function with its heights and switching times taken as decision vari-
ables. Then, the time scaling transformation (see Section 9.2.1) is applied to
map the varying switching times into fixed switching times in a new time
horizon with an additional control variable, called the time scaling control.
Let the transformed problem be referred to as Problem (DP1 (p)). For the
continuous inequality constraints (12.2.17), the constraint transcription tech-
nique (see Section 9.2.2) is applied to approximate it by a sequence of smooth
inequality constraints. Consequently, Problem (DP1 (p)) is approximated by a
sequence of optimal parameter selection problems (DP1, γ (p)), each of which
can be solved by any gradient-based optimization methods as detailed in Sec-
tions 9.2.3 and 9.2.4. The convergence analysis of the approximation problems
(DP1, γ (p)) to the original problem (DP1 ) can be carried out as in Section 9.2.
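In outline, these parametrization steps reduce Problem (DP1(p)) to a finite-dimensional problem in the control heights and the subinterval durations introduced by the time scaling control. A minimal sketch of the two ingredients is given below; the system right-hand side f and the optimizer itself are problem specific and omitted.

```python
import numpy as np

def control_from_parameters(s, heights):
    """Piecewise-constant control on the scaled horizon [0, p]: the
    subinterval [i-1, i) carries the constant height heights[i-1]."""
    i = min(int(np.floor(s)), len(heights) - 1)
    return heights[i]

def switching_times(durations):
    """Time scaling: nonnegative durations v_i map to switching times in t."""
    return np.concatenate([[0.0], np.cumsum(durations)])

# Example with 4 subintervals on the scaled horizon [0, 4]:
heights = np.array([0.1, 0.0, 0.05, 0.0])    # control heights (decision variables)
durations = np.array([1.2, 0.8, 1.5, 1.5])   # durations (decision variables)
print(switching_times(durations))            # [0.  1.2 2.  3.5 5. ]
```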
Remark 12.2.3 Note that if we consider Problem (SP1) with the assumption that $\xi^0$ is a deterministic vector rather than a Gaussian distributed random vector, similar results are also valid. In this situation, the initial conditions for (12.2.10) and (12.2.11) are, respectively, replaced by $\xi^0$ and $0$.

12.2.2 A Numerical Example

Consider the optimal machine maintenance problem described in Exam-


ple 6.4.1. It is a simple deterministic optimal control problem that can be
solved easily by the Pontryagin Maximum Principle as shown in Section 6.4.
However, from a practical view point, there is usually a multitude of other
factors that contribute in a less significant manner to the deterioration of the
machine’s quality state. Since each of these factors may be assumed indepen-
dent and random, the aggregated effect can be modeled as a random noise
term superimposed on the deterministic part of the deterioration dynamic.
The main reference of this section is [234].
The machine state ξ(t) is thus governed by a stochastic differential equa-
tion:

dξ(t) = −bξ(t)dt + u(t)dt + dw(t) (12.2.20a)


ξ(0) = x0 , (12.2.20b)

where w(t) is a Wiener process with zero mean and variance given by
$$
E\{w(t)\, w(t')\} = \int_0^{\min(t, t')} \theta(s)\, ds, \qquad \theta(s) \ge 0 \ \text{for all } s. \qquad (12.2.21)
$$

Here, we assume that the variance of the Wiener process is stationary, i.e.,

θ(s) = θ, for all s ≥ 0. (12.2.22)


Furthermore, for satisfactory performance, we need to ensure that the ma-
chine maintains a relatively good quality state with a certain degree of confi-
dence over the whole of its life span. This particular requirement is achieved
by imposing the following probabilistic constraint over the whole time horizon
[0, T ].

Prob {ξ(t) ∈ [x0 − α, x0 + α]} ≥ ε, for all t ∈ [0, T ], (12.2.23)

where $\alpha > 0$ and $\varepsilon > 0$. This constraint implies that we want to be at least $100\varepsilon\%$ confident that the quality state of the machine is in the interval $[x^0 - \alpha, x^0 + \alpha]$ throughout its life span, where $\alpha > 0$ denotes the allowed deviation of the quality state of the machine from its initial state. In practice, we would obviously like to make $\varepsilon$ as close as possible to unity, while keeping $\alpha$ as small as possible. This may, however, incur excessive control effort. The stochastic optimal maintenance problem may now be stated as follows.
Given the dynamical system (12.2.20), find an admissible control u ∈ U
such that the expected return
$$
g_0(u) = E\left\{ \exp(-rT)\, S\, \xi(T) + \int_0^T \exp(-rt)\, [\theta \xi(t) - u(t)]\, dt \right\} \qquad (12.2.24)
$$

is maximized subject to the probabilistic constraint (12.2.23), where U con-


sists of all those controls that satisfy the constraints (6.4.8), while S, r, θ and
T are, respectively, the salvage value per unit terminal quality, the interest
rate, the productivity per unit quality and the sale date of the machine.
Let μ(t) and Ψ (t) be the mean and variance of the state ξ(t) determined,
respectively, by the following differential equations:

$$
\frac{d\mu(t)}{dt} = -b\mu(t) + u(t), \qquad (12.2.25a)
$$
$$
\mu(0) = x^0, \qquad (12.2.25b)
$$
and
$$
\frac{d\Psi(t)}{dt} = -2b\Psi(t) + \theta, \qquad (12.2.26a)
$$
$$
\Psi(0) = 0. \qquad (12.2.26b)
$$

Note that the variance Ψ (t) does not depend on the control u. Thus, the
differential equation (12.2.26) needs only to be solved once.

Now, by virtue of the theoretical results reported in Section 12.2.1, it fol-


lows readily that this stochastic optimal problem is equivalent to the following
deterministic optimal control problem.
Given the dynamical system (12.2.25)–(12.2.26), find a control $u \in U$ such that the cost functional
$$
g_0(u) = S\mu(T)\exp\{-rT\} + \int_0^T \exp\{-rt\}\,[\theta\mu(t) - u(t)]\,dt \qquad (12.2.27)
$$
is maximized subject to the following continuous constraint:

$$
-2\varepsilon + g_1(t, \delta, u) \ge 0, \quad t \in [0, T], \qquad (12.2.28)
$$
where
$$
g_1(t, \delta, u) = \mathrm{erf}\left\{ \frac{x^0 + \delta - \mu(t)}{\sqrt{2\pi\Psi(t)}} \right\} - \mathrm{erf}\left\{ \frac{x^0 - \delta - \mu(t)}{\sqrt{2\pi\Psi(t)}} \right\}. \qquad (12.2.29)
$$
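Note that, since (12.2.26) does not involve the control, it admits the closed-form solution $\Psi(t) = \theta(1 - e^{-2bt})/(2b)$, so the constraint function $g_1$ in (12.2.29) can be evaluated directly. The following sketch simply transcribes (12.2.29) using scipy's erf, with the model constants used later in this section; the mean value passed in would come from integrating (12.2.25).

```python
import numpy as np
from scipy.special import erf

b, theta = 0.2, 0.04
x0, delta = 1.0, 0.5

def Psi(t):
    """Closed-form variance from (12.2.26): dPsi/dt = -2 b Psi + theta."""
    return theta * (1.0 - np.exp(-2.0 * b * t)) / (2.0 * b)

def g1(t, mu_t):
    """Constraint function (12.2.29) at time t > 0, for mean value mu_t."""
    s = np.sqrt(2.0 * np.pi * Psi(t))
    return erf((x0 + delta - mu_t) / s) - erf((x0 - delta - mu_t) / s)

# Feasibility check of (12.2.28) at a sample time, for eps = 0.5:
print(g1(1.0, 0.95) - 2.0 * 0.5 >= 0.0)
```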

To study the behavior of this optimal machine maintenance problem, the


above deterministic optimal control problem with the following parameter
values is solved by using the control parametrization technique, where the
planning horizon $[0, T] = [0, 5]$ is partitioned into $n_p = 20$ subintervals with $n_p + 1$ switching time points in the partition. The results and the computa-
tional effectiveness can be improved if the time scaling technique is used. This
task is left as an exercise for the reader. Furthermore, the following values
are used for the various model constants.
b = 0.2, s = 0.25, ū = 0.1, p = 0.6, x0 = 1.0, r = 5%, δ = 0.5 and θ = 0.04.
Several cases with different values of $\varepsilon$ were computed, and it was found that the optimal solutions are always of the bang-bang type, i.e.,
$$
u^*(t) = \begin{cases} \bar{u} = 0.1, & 0 \le t \le t^*, \\ 0, & t^* < t \le T. \end{cases} \qquad (12.2.30)
$$

The optimal switching time, however, is substantially larger than that of the deterministic unconstrained case computed by (6.4.15). This is clearly due to the stringent state constraint specified in (12.2.28). The optimal switching times obtained for $\varepsilon = 0.5$, $0.6$ and $0.7$ are, respectively, $t^* = 3.350$, $3.697$ and $4.680$, with the corresponding return values of $g_0^* = 1.922$, $1.918$ and $1.891$. The unconstrained deterministic problem with the same set of parameter values has an optimal switching time of $3.284$. The optimal solution is consistent with the intuitive notion that as the quality state requirement becomes more stringent, more control effort is required, thus resulting in a later switching time. It is expected that if $\varepsilon$ becomes larger, full maintenance effort will be required, i.e., $u^*(t) = \bar{u} = 0.1$ for all $t \in [0, T]$. However, if $\varepsilon$ becomes too large, there may not be any feasible solution, as the maximal control may still be insufficient to meet the quality requirement.

12.3 Optimal Feedback Control for Linear Systems


Subject to Poisson Processes

The main reference for this section is [239]. Consider a system governed by
the following stochastic differential equation over a finite time interval (0, T ].

$$
dx(t) = A(t)\, x(t)\, dt + B(t)\, du(t) + \Gamma(t)\, dN(t), \qquad (12.3.1)
$$
where $x(t) \in \mathbb{R}^n$, $A(t) \in \mathbb{R}^{n \times n}$, $B(t) \in \mathbb{R}^{n \times r}$, $u(t) \in \mathbb{R}^r$ is a control function that is of bounded variation (and hence $du(t)$ is a measure), $\Gamma(t) \in \mathbb{R}^{n \times m}$, and $N(t) \in \mathbb{R}^m$ is an $m$-dimensional Poisson process with mean intensity $\lambda(t)$. We assume that the matrix-valued functions $A$, $B$ and $\Gamma$ are continuous on $[0, T]$.
Along with (12.3.1), suppose we have an observation system described by
 
$$
dy(t) = H(t)\, x(t)\, dt + \Gamma^0(t)\, \big[ dN^0(t) - \lambda^0(t)\, dt \big], \qquad (12.3.2)
$$

where y(t) ∈ Rk , H(t) ∈ Rk×n , Γ 0 (t) ∈ Rk×q and N 0 (t) ∈ Rq is a q-


dimensional Poisson process with mean intensity λ0 (t).
It is assumed that all the components of the Poisson processes $\{N(t), N^0(t)\}$ are statistically mutually independent. Furthermore, we assume that all the components of their mean intensities, $\lambda(t)$ and $\lambda^0(t)$, are non-negative and bounded measurable functions.
Suppose that the control function u is such that the corresponding measure
du(t) is of the form

du(t) = Ky(t)dt + K̂dy(t) − C(t)Γ (t)λ(t)dt, (12.3.3)

where K, K̂ ∈ Rr×k , are constant matrices yet to be determined and

B(t)C(t)Γ (t)λ(t) = Γ (t)λ(t), (12.3.4)

provided such a matrix C(t) exists. In fact, if B(t) has rank r and n > r,
then C(t) is just the right inverse of B(t). Substituting (12.3.3) and (12.3.2)
into (12.3.1), we obtain
$$
dx(t) = \big[ A(t) + B(t)\hat{K}H(t) \big]\, x(t)\, dt + B(t)K\, y(t)\, dt + B(t)\hat{K}\Gamma^0(t)\, \big[ dN^0(t) - \lambda^0(t)\, dt \big] + \Gamma(t)\, \big[ dN(t) - \lambda(t)\, dt \big]. \qquad (12.3.5)
$$

Define
$$
\xi(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}.
$$
Then, the system dynamics (12.3.5) together with the observation dynam-
ics (12.3.2) can be jointly written as

dξ(t) = Ã(t, κ)ξ(t)dt + Γ̃ (t, κ)dM̃ (t), (12.3.6)

where the vector $\kappa \in \mathbb{R}^{2rk}$ is defined by
$$
\kappa = \big[ K_1, \ldots, K_r, \hat{K}_1, \ldots, \hat{K}_r \big]^\top,
$$
$$
\tilde{A}(t, \kappa) = \begin{bmatrix} A(t) + B(t)\hat{K}H(t) & B(t)K \\ H(t) & 0 \end{bmatrix}, \qquad
\tilde{\Gamma}(t, \kappa) = \begin{bmatrix} \Gamma(t) & B(t)\hat{K}\Gamma^0(t) \\ 0 & \Gamma^0(t) \end{bmatrix},
$$
$$
d\tilde{M}(t) = \begin{bmatrix} dN(t) - \lambda(t)\, dt \\ dN^0(t) - \lambda^0(t)\, dt \end{bmatrix},
$$

and for each j = 1, . . . , r, Kj (respectively, K̂j ) is the jth row of the matrix
K (respectively, K̂). For convenience, let the components of the vector κ be
denoted as
κj , j = 1, . . . , 2rk.
Note that M̃ is a vector of zero-mean martingales.
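When the gain matrices K and K̂ are decision variables, it is convenient to assemble Ã(t, κ) and Γ̃(t, κ) programmatically from the plant and observation data. A sketch of this assembly (numpy assumed; dimensions as in the text):

```python
import numpy as np

def build_blocks(A, B, H, Gamma, Gamma0, K, Khat):
    """Assemble the block matrices A~ and Gamma~ of (12.3.6); A is n x n,
    B is n x r, H is k x n, Gamma is n x m, Gamma0 is k x q, and the
    gains K and Khat are both r x k."""
    k = H.shape[0]          # dimension of the observation y
    m = Gamma.shape[1]      # dimension of the Poisson process N
    A_tilde = np.block([[A + B @ Khat @ H, B @ K],
                        [H, np.zeros((k, k))]])
    Gamma_tilde = np.block([[Gamma, B @ Khat @ Gamma0],
                            [np.zeros((k, m)), Gamma0]])
    return A_tilde, Gamma_tilde
```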
The initial condition for the system dynamics may be deterministic or
Gaussian, i.e.,
x(0) = x0 , (12.3.7)
where x0 ∈ Rn is either a deterministic or a Gaussian vector. In the case
when x0 is a Gaussian vector, let x̄0 and P 0 be its mean and covariance,
respectively. Furthermore, it is assumed that x0 is statistically independent
of N and N 0 .
The initial condition for the observation dynamics is usually assumed to
be
y(0) = 0, (12.3.8)
that is, no information is available at $t = 0$. Thus, in the notation of (12.3.6), we have
$$
\xi(0) = \begin{bmatrix} x^0 \\ 0 \end{bmatrix} = \xi^0. \qquad (12.3.9)
$$
Consider the following homogeneous system.

$$
\frac{\partial \tilde{\Phi}(t, \tau)}{\partial t} = \tilde{A}(t, \kappa)\, \tilde{\Phi}(t, \tau), \quad 0 \le \tau \le t < \infty, \qquad (12.3.10a)
$$
$$
\tilde{\Phi}(t, t) = I, \quad \text{for any } t \in [0, \infty), \qquad (12.3.10b)
$$

where I denotes the identity matrix. For each κ, let Φ̃(t, τ | κ) be the cor-
responding solution of (12.3.10). Then, it is clear that for each κ, the cor-
responding solution of the system (12.3.6) with the initial condition (12.3.9)
can be written as
$$
\xi(t \mid \kappa) = \tilde{\Phi}(t, 0 \mid \kappa)\,\xi^0 + \int_0^t \tilde{\Phi}(t, \tau \mid \kappa)\,\tilde{\Gamma}(\tau, \kappa)\, d\tilde{M}(\tau). \qquad (12.3.11)
$$
Since $\xi^0$ and $\tilde{M}$ are independent, it follows from taking the expectation of (12.3.11) that
$$
\bar{\xi}(t \mid \kappa) = \tilde{\Phi}(t, 0 \mid \kappa)\,\bar{\xi}^0, \qquad (12.3.12)
$$
where
$$
\bar{\xi}^0 = \begin{bmatrix} \bar{x}^0 \\ 0 \end{bmatrix}.
$$
Define
μ(t | κ) = ξ̄(t | κ).
By differentiating (12.3.12) and using (12.3.10), the following theorem can
be readily obtained.
Theorem 12.3.1 For each κ, the mean behaviour of the corresponding so-
lution of the coupled system (12.3.6) with the initial condition (12.3.9) is
determined by the following system of deterministic differential equations

$$
\frac{d\mu(t)}{dt} = \tilde{A}(t, \kappa)\,\mu(t) \qquad (12.3.13a)
$$
with the initial condition
$$
\mu(0) = \bar{\xi}^0. \qquad (12.3.13b)
$$

For the covariance matrix of the process ξ, we have the following result.
Theorem 12.3.2 For each κ, let ξ(· | κ) be the solution of the coupled
system (12.3.6) with the initial condition (12.3.9). Then, the corresponding
covariance matrix Ψ (· | κ) is determined by the following matrix differential
equation.

$$
\frac{d\Psi(t)}{dt} = \tilde{A}(t, \kappa)\,\Psi(t) + \Psi(t)\,[\tilde{A}(t, \kappa)]^\top + \tilde{\Gamma}(t, \kappa)\,\tilde{\Lambda}(t)\,[\tilde{\Gamma}(t, \kappa)]^\top \qquad (12.3.14a)
$$
with the initial condition
$$
\Psi(0) = \Psi^0 = \begin{bmatrix} P^0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad (12.3.14b)
$$
where
$$
\tilde{\Lambda}(t) = \begin{bmatrix} \Lambda(t) & 0 \\ 0 & \Lambda^0(t) \end{bmatrix}
$$
with $\Lambda(t) = \mathrm{diag}(\lambda_1(t), \ldots, \lambda_m(t))$ and $\Lambda^0(t) = \mathrm{diag}(\lambda_1^0(t), \ldots, \lambda_q^0(t))$.
Furthermore, P 0 ∈ Rn×n is obviously zero in the case when x0 is a deter-
ministic vector.

Proof. From (12.3.11) and (12.3.12), it follows that
$$
\xi(t \mid \kappa) - \bar{\xi}(t \mid \kappa) = \tilde{\Phi}(t, 0 \mid \kappa)\,(\xi^0 - \bar{\xi}^0) + \int_0^t \tilde{\Phi}(t, \tau \mid \kappa)\,\tilde{\Gamma}(\tau, \kappa)\, d\tilde{M}(\tau), \qquad (12.3.15)
$$
where the second term on the right hand side, which is a stochastic integral with respect to the martingale $\tilde{M}$, is itself a martingale. Now, for any $\varphi \in \mathbb{R}^{n+k}$, define
$$
\varphi^\top \Psi(t \mid \kappa)\,\varphi = E\Big\{ \big[ \varphi^\top \big( \xi(t \mid \kappa) - \bar{\xi}(t \mid \kappa) \big) \big]^2 \Big\}, \qquad (12.3.16)
$$
where $\Psi(t \mid \kappa)$ is an $(n+k) \times (n+k)$ matrix yet to be determined. From (12.3.15), it follows that
$$
\varphi^\top \big( \xi(t \mid \kappa) - \bar{\xi}(t \mid \kappa) \big) = \varphi^\top \tilde{\Phi}(t, 0 \mid \kappa)\,(\xi^0 - \bar{\xi}^0) + \int_0^t \varphi^\top \tilde{\Phi}(t, \tau \mid \kappa)\,\tilde{\Gamma}(\tau, \kappa)\, d\tilde{M}(\tau). \qquad (12.3.17)
$$
Taking the expectation of the square of both sides and using the quadratic variation of the martingale $\tilde{M}$ given by
$$
E\left\{ \left[ \int_0^t \eta^\top d\tilde{M}(\tau) \right]^2 \right\} = \int_0^t \eta^\top \tilde{\Lambda}\,\eta\, d\tau, \quad \eta \in \mathbb{R}^{m+q},
$$
we obtain
$$
\varphi^\top \Psi(t \mid \kappa)\,\varphi = \varphi^\top \tilde{\Phi}(t, 0 \mid \kappa)\,\Psi^0\,[\tilde{\Phi}(t, 0 \mid \kappa)]^\top \varphi + \int_0^t \varphi^\top \tilde{\Phi}(t, \tau \mid \kappa)\,\tilde{\Gamma}(\tau, \kappa)\,\tilde{\Lambda}(\tau)\,[\tilde{\Gamma}(\tau, \kappa)]^\top [\tilde{\Phi}(t, \tau \mid \kappa)]^\top \varphi\, d\tau. \qquad (12.3.18)
$$
Since (12.3.18) is valid for arbitrary $\varphi \in \mathbb{R}^{n+k}$, it follows that
$$
\Psi(t \mid \kappa) = \tilde{\Phi}(t, 0 \mid \kappa)\,\Psi^0\,[\tilde{\Phi}(t, 0 \mid \kappa)]^\top + \int_0^t \tilde{\Phi}(t, \tau \mid \kappa)\,\tilde{\Gamma}(\tau, \kappa)\,\tilde{\Lambda}(\tau)\,[\tilde{\Gamma}(\tau, \kappa)]^\top [\tilde{\Phi}(t, \tau \mid \kappa)]^\top\, d\tau. \qquad (12.3.19)
$$

With t = 0 in the above expression, it follows from (12.3.10b) that

Ψ (0 | κ) = Ψ 0 . (12.3.20)

Now, by differentiating (12.3.19) and then using (12.3.10), we obtain (12.3.14a).


Thus, the proof is complete.

Remark 12.3.1 From (12.3.14), we observe readily that $\Psi(t \mid \kappa)$ is symmetric. Thus, we only need to solve a system of $\big[(n+k)^2 + (n+k)\big]/2$ distinct differential equations for the determination of $\Psi(t \mid \kappa)$.

Let z(t | κ) be a vector consisting of μ(t | κ) and the independent com-


ponents of Ψ (t | κ). Then, for each κ, z(t | κ) is determined by the following
system of differential equations.

$$
\frac{dz(t)}{dt} = f(t, z(t), \kappa) \qquad (12.3.21a)
$$
with the initial condition
$$
z(0) = z^0, \qquad (12.3.21b)
$$
where f (respectively, z) is determined by (12.3.13a) together with (12.3.14a)
(respectively, (12.3.13b) together with (12.3.14b)).
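In an implementation, z is typically formed by stacking μ with the upper triangle of Ψ, exploiting the symmetry noted in Remark 12.3.1. The following small sketch shows one way to pack and unpack this state vector:

```python
import numpy as np

def pack(mu, Psi):
    """Form z from the mean and the upper triangle of the covariance."""
    iu = np.triu_indices(Psi.shape[0])
    return np.concatenate([mu, Psi[iu]])

def unpack(z, dim):
    """Recover (mu, Psi) from z; Psi is rebuilt by symmetry."""
    mu = z[:dim]
    Psi = np.zeros((dim, dim))
    iu = np.triu_indices(dim)
    Psi[iu] = z[dim:]
    Psi = Psi + Psi.T - np.diag(np.diag(Psi))
    return mu, Psi
```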

12.3.1 Two Stochastic Optimal Feedback Control


Problems

In this section, our aim is to formulate two classes of stochastic optimal feed-
back control problems based on the dynamical system (12.3.1), the observa-
tion dynamics (12.3.2) and the proposed control dynamics given by (12.3.3)
(which is driven by the measurement process y).
To begin, let us assume that the vector κ is to be chosen from the set K
defined by
$$
K = \big\{ \kappa = [\kappa_1, \ldots, \kappa_{2rk}]^\top \in \mathbb{R}^{2rk} : \tilde{\beta} \le \kappa \le \bar{\beta} \big\}
= \big\{ \kappa = [\kappa_1, \ldots, \kappa_{2rk}]^\top \in \mathbb{R}^{2rk} : \tilde{\beta}_i \le \kappa_i \le \bar{\beta}_i, \ i = 1, \ldots, 2rk \big\}, \qquad (12.3.22)
$$

where β̃ and β̄ are given vectors in R2rk . For each κ ∈ K, let Ψ (t | κ) be


partitioned as follows.

$$
\Psi(t \mid \kappa) = \begin{bmatrix} \Psi_{11}(t \mid \kappa) & \Psi_{12}(t \mid \kappa) \\ \Psi_{21}(t \mid \kappa) & \Psi_{22}(t \mid \kappa) \end{bmatrix}, \qquad (12.3.23)
$$
where $\Psi_{11}(t \mid \kappa) \in \mathbb{R}^{n \times n}$, $\Psi_{12}(t \mid \kappa) \in \mathbb{R}^{n \times k}$, $\Psi_{21}(t \mid \kappa) \in \mathbb{R}^{k \times n}$ and $\Psi_{22}(t \mid \kappa) \in \mathbb{R}^{k \times k}$. Note that $\Psi_{11}(t \mid \kappa)$ and $\Psi_{22}(t \mid \kappa)$ are, respectively, the covariances of the processes $x(t \mid \kappa)$ and $y(t \mid \kappa)$, i.e.,
$$
\eta^\top \Psi_{11}(t \mid \kappa)\,\eta = E\big\{ [\eta^\top (x(t \mid \kappa) - \bar{x}(t \mid \kappa))]^2 \big\}, \quad \eta \in \mathbb{R}^n,
$$
and
$$
\nu^\top \Psi_{22}(t \mid \kappa)\,\nu = E\big\{ [\nu^\top (y(t \mid \kappa) - \bar{y}(t \mid \kappa))]^2 \big\}, \quad \nu \in \mathbb{R}^k,
$$
while Ψ12 (t | κ) and Ψ21 (t | κ) are cross covariances.
With these preparations, the first problem may be stated formally as fol-
lows.

Subject to the dynamic system (12.3.1), the initial condition (12.3.7), the
observation channel (12.3.2) with the initial condition (12.3.8) and the control
system given by (12.3.3), find a constant vector κ ∈ K such that the cost
functional
$$
g_0(\kappa) = E\left\{ \int_0^T \mathrm{Tr}\big\{ (x(t \mid \kappa) - \bar{x}(t \mid \kappa))\,[x(t \mid \kappa) - \bar{x}(t \mid \kappa)]^\top \big\}\, dt \right\} \qquad (12.3.24)
$$
is minimized over K.
For convenience, let this (stochastic optimal feedback) control problem be
referred to as Problem (SP2a ). Note that Problem (SP2a ) aims to find an
optimal vector (and hence feedback matrix) κ ∈ K such that the resulting
system (12.3.6) with the initial condition (12.3.9) is least noisy.
In our second problem, our aim is to find a constant vector (and hence
constant feedback matrix) κ ∈ K such that the mean behaviour of the cor-
responding dynamical system is closest to a given deterministic trajectory,
while the uncertainty of the corresponding dynamical system is within a given
acceptable limit. Let the given deterministic trajectory be denoted by x̂(t).
Then, the corresponding problem, which is identified as Problem (SP2b ), may
be stated formally as follows.
Given the system (12.3.1) with the initial condition (12.3.7), the obser-
vation channel (12.3.2) with the initial condition (12.3.8) and the proposed
control dynamics of the form (12.3.3), find a constant vector κ ∈ K such that
the cost functional
$$
g_0(\kappa) = \int_0^T \| \bar{x}(t \mid \kappa) - \hat{x}(t) \|^2\, dt \qquad (12.3.25)
$$

is minimized subject to κ ∈ K and the constraint


$$
E\left\{ \int_0^T \mathrm{Tr}\big\{ (x(t \mid \kappa) - \bar{x}(t \mid \kappa))\,(x(t \mid \kappa) - \bar{x}(t \mid \kappa))^\top \big\}\, dt \right\} \le \varepsilon, \qquad (12.3.26)
$$

where ε is a positive constant corresponding to some acceptable level of un-


certainty.

12.3.2 Deterministic Model Transformation

The stochastic optimal feedback control problems as stated above are difficult
to solve. However, by virtue of the structure of the dynamical system, the
observation channel and the form of the control law, we can show that these
problems are, in fact, equivalent to certain deterministic optimal parameter
selection problems.

We first define the following deterministic optimal parameter selection


problem, denoted as Problem (DP2a ).
Subject to the system (12.3.21), find a constant vector κ ∈ K such that
the cost functional
$$
g_0(\kappa) = \int_0^T \mathrm{Tr}\{\Psi_{11}(t \mid \kappa)\}\, dt = \int_0^T \mathrm{Tr}\{M\,\Psi(t \mid \kappa)\}\, dt \qquad (12.3.27)
$$
is minimized over $K$, where $M \in \mathbb{R}^{(n+k)\times(n+k)}$ is given by
$$
M = \begin{bmatrix} I_{n \times n} & 0 \\ 0 & 0 \end{bmatrix}
$$
and $I_{n \times n}$ is the identity matrix in $\mathbb{R}^{n \times n}$.


Theorem 12.3.3 Problem (SP2a ) is equivalent to Problem (DP2a ).

Proof. Let $\Psi_{11}(t \mid \kappa)$ be as defined for the matrix $\Psi(t \mid \kappa)$ given by (12.3.23). Then, it is clear that
$$
E\left\{ \int_0^T \mathrm{Tr}\big\{ (x(t \mid \kappa) - \bar{x}(t \mid \kappa))\,(x(t \mid \kappa) - \bar{x}(t \mid \kappa))^\top \big\}\, dt \right\}
= \int_0^T \mathrm{Tr}\Big\{ E\big\{ (x(t \mid \kappa) - \bar{x}(t \mid \kappa))\,(x(t \mid \kappa) - \bar{x}(t \mid \kappa))^\top \big\} \Big\}\, dt
= \int_0^T \mathrm{Tr}\{\Psi_{11}(t \mid \kappa)\}\, dt
= \int_0^T \mathrm{Tr}\{M\,\Psi(t \mid \kappa)\}\, dt.
$$

The proof is complete.

We now turn our attention to Problem (SP2b ) and define the following
deterministic optimal parameter selection problem, to be denoted as Problem
(DP2b ).
Given the system (12.3.21), find a constant feedback vector κ ∈ K such
that the cost functional (12.3.25) is minimized subject to κ ∈ K and the
constraint
$$
\int_0^T \mathrm{Tr}\{M\,\Psi(t \mid \kappa)\}\, dt \le \varepsilon, \qquad (12.3.28)
$$

where ε > 0 is an appropriate positive number and M is the (n + k) × (n + k)


matrix introduced in (12.3.27).
Theorem 12.3.4 The stochastic problem (SP2b ) is equivalent to the deter-
ministic optimal parameter selection problem (DP2b ).

Proof. We only need to show that the constraint (12.3.26) is equivalent


to (12.3.28). The proof of this equivalence is similar to that given for Theo-
rem 12.3.3.

Remark 12.3.2 Note that our formulation also holds for the case of time-
varying control matrices K = K(t), K̂ = K̂(t), t > 0. In this case, Problems
(DP2a ) and (DP2b ) corresponding to Problems (SP2a ) and (SP2b ), as de-
scribed above, are to be considered as deterministic optimal control problems
with controls K(t) and K̂(t) rather than as deterministic optimal parameter
selection problems with constant matrices K and K̂.

12.3.3 An Example

This example is taken from [234]. Consider a machine maintenance problem,


where there are two types of maintenance action. The first type is continu-
ous (minor) maintenance. It works to slow down natural degradation of the
machine. The second type is overhaul (major) maintenance. It is carried out
at certain discrete time points so as to significantly improve the condition of
the machine. For this machine maintenance problem, its condition is mod-
eled as an impulsive stochastic differential equation over the time horizon.
The objective is to choose the continuous maintenance rate and the overhaul
maintenance times such that the total cost of operating and maintaining the
machine is minimized subject to constraints on the state and output of the
machine satisfying minimum acceptable levels with high probability.
Let x(t) denote the state of the machine at time t, and let y(t) denote the
total output produced by the machine up to time t. The state and output of
the machine are, respectively, governed by the following stochastic differential
equations:

dx(t) = (u(t) − k1 )x(t)dt + k2 dw(t) (12.3.29)


dy(t) = k3 x(t)dt, (12.3.30)

where u(t) denotes the continuous maintenance rate; w(t) denotes the stan-
dard Brownian motion with mean 0 and covariance given by

Cov {w(t1 ), w(t2 )} = min {t1 , t2 } ,

and k1 , k2 and k3 are the given constants. These constants represent, re-
spectively, the natural degradation rate of the machine, the propensity for
random fluctuations in the condition of the machine and the extent to which
the production is being influenced by the state of the machine. It is assumed
that the continuous maintenance rate is subject to the following boundedness
constraints:

0 ≤ u(t) ≤ ak1 , t ≥ 0, (12.3.31)


where a ∈ (0, 1) is a given constant. The initial state of the machine and the
initial production level are, respectively, given by

x(0) = x0 + δ0 , (12.3.32)
y(0) = 0 (12.3.33)

where δ0 is a normal random variable with mean 0 and variance k4 . The


machine is regarded as operating in an almost perfect condition when $x(t) \approx x^*$.
For each $i = 1, \ldots, N+1$, let $\tau_i$ denote the time of the $i$-th overhaul, where $N$ is the number of times the machine is overhauled and $\tau_{N+1}$ is referred to as the final time (i.e., the time at which the machine is replaced). To
ensure that overhauls do not happen too frequently, the following constraints
are imposed.
τi − τi−1 ≥ ρ, i = 1, . . . , N + 1, (12.3.34)
where ρ > 0 denotes the minimum duration between any two consecutive
overhauls. To ensure that the time, τN +1 , for the machine being replaced is
greater than or equal to tmin , the following condition is imposed.

τN +1 ≥ tmin . (12.3.35)

We consider the situation where the time required for each overhaul is
negligible when compared with the length of the time horizon. Thus, the
state of the machine improves instantaneously at each overhaul time. On the
other hand, the output level stays the same. This phenomenon is modeled by
the following jump conditions:
 
$$
x(\tau_i^+) = k_5\, x(\tau_i^-) + \delta_i, \quad i = 1, \ldots, N, \qquad (12.3.36)
$$
$$
y(\tau_i^+) = y(\tau_i^-), \quad i = 1, \ldots, N, \qquad (12.3.37)
$$

where k5 is a positive constant and δi is a normal random variable with mean


0 and variance k6 . We assume throughout that the Brownian motion w(t) and
the random variables δi , i = 0, . . . , N , are mutually statistically independent.
There are two operational requirements that are required to be satisfied.
First, the state of the machine is required to stay above a minimum
acceptable level with a high probability. Thus, we impose the following prob-
abilistic state constraint:

Pr {x(t) ≥ xmin } ≥ p1 , t ∈ [0, τN +1 ], (12.3.38)

where xmin is the minimum acceptable level of the state of the machine and
p1 is a given probability level. Second, the accumulated output level over
the entire time horizon is required to be greater than or equal to a specified

minimum level with a high probability. This requirement can be modeled as


given below:
Pr {y(τN +1 ) ≥ ymin } ≥ p2 , (12.3.39)
where ymin denotes the minimum output level and p2 is a given probability
level. Note that constraint (12.3.39) is only imposed at the final time, while
constraint (12.3.38) is imposed at each time over the time horizon.
Let τ = [τ1 , . . . , τN +1 ] denote the vector of overhaul times. Furthermore,
let $\Upsilon$ be the set defined by
$$
\Upsilon = \big\{ \tau \in \mathbb{R}^{N+1} : \tau_i - \tau_{i-1} \ge \rho, \ i = 1, \ldots, N+1; \ \tau_{N+1} \ge t_{\min} \big\}. \qquad (12.3.40)
$$

A vector τ ∈ Υ is called an admissible overhaul time vector. For sim-


plicity, we assume that the continuous maintenance rate is constant between
consecutive overhauls (note that it is easy to extend the approach to the case
where the continuous maintenance rate takes several different constant levels
between overhauls). For a given τ ∈ Υ , let u : [0, ∞) → R be a piecewise con-
stant function that takes a constant value on each of the intervals [τi−1 , τi ),
i = 1, . . . , N + 1. If such a u satisfies (12.3.31), then it is called an admissible
control.
Let U (τ ) be the class of all admissible controls corresponding to τ ∈ Υ .
Then, any element (τ , u) ∈ Υ × U (τ ) is called an admissible pair. Any ad-
missible pair satisfying constraints (12.3.38) and (12.3.39) is called a feasible
pair.
Our goal is to choose a feasible pair such that the following cost functional is minimized:
$$
g_0(\tau, u) = E\Bigg\{ \int_0^{\tau_{N+1}} \Big[ \underbrace{L_1(x(t))}_{\text{operating cost}} + \underbrace{L_2(u(t))}_{\text{continuous maintenance cost}} \Big]\, dt \Bigg\} + E\Bigg\{ \sum_{i=1}^{N} \underbrace{\Psi_1(x(\tau_i^-))}_{\text{overhaul cost}} \Bigg\} - E\Big\{ \underbrace{\Psi_2(x(\tau_{N+1}))}_{\text{salvage cost}} \Big\}, \qquad (12.3.41)
$$

where E denotes the mathematical expectation. This cost functional consists


of four components: (1) the operating cost, (2) the continuous maintenance
cost, (3) the overhaul cost and (4) the salvage value.
We assume that the following conditions are satisfied.
Assumption 12.3.1 L1 : R → R and Ψ2 : R → R are quadratic.
Assumption 12.3.2 $L_2 : \mathbb{R} \to \mathbb{R}$ is continuously differentiable with respect to its argument.
Assumption 12.3.3 Ψ1 : R → R is linear.
The problem may now be stated formally as the following stochastic opti-
mal control problem.

Problem (EP0). Given the system of stochastic differential equations (12.3.29) and (12.3.30) with the initial conditions (12.3.32) and (12.3.33) and the jump conditions (12.3.36) and (12.3.37), find an admissible pair $(\tau, u) \in \Upsilon \times U(\tau)$ such that the cost functional $g_0(\tau, u)$ defined by (12.3.41) is minimized subject to the constraints (12.3.38) and (12.3.39).
Problem (EP0 ) is a stochastic impulsive optimal control problem with
probabilistic state constraints. It is solved using the approach proposed in
this section. First, it is transformed into a new deterministic optimal control
problem. Define
$$
\mu_x(t) = E[x(t)], \quad \mu_y(t) = E[y(t)], \qquad (12.3.42)
$$
$$
\sigma_{xx}(t) = \mathrm{Var}[x(t)], \quad \sigma_{yy}(t) = \mathrm{Var}[y(t)], \qquad (12.3.43)
$$
$$
\sigma_{xy}(t) = \sigma_{yx}(t) = \mathrm{Cov}\{x(t), y(t)\}. \qquad (12.3.44)
$$


Let $\Phi : \mathbb{R} \times \mathbb{R} \to \mathbb{R}^{2 \times 2}$ denote the principal solution matrix of the following homogeneous system:
$$
\frac{\partial \Phi(t, s)}{\partial t} = \begin{bmatrix} u(t) - k_1 & 0 \\ k_3 & 0 \end{bmatrix} \Phi(t, s), \quad t > s, \qquad (12.3.45)
$$
$$
\Phi(s, s) = I, \qquad (12.3.46)
$$
where
$$
\Phi(t, s) = \begin{bmatrix} \phi_{11}(t, s) & \phi_{12}(t, s) \\ \phi_{21}(t, s) & \phi_{22}(t, s) \end{bmatrix}. \qquad (12.3.47)
$$
Then, it is known that for each $i = 1, \ldots, N+1$, the solution of the stochastic impulsive system (12.3.29) and (12.3.30) on $[\tau_{i-1}, \tau_i)$ can be expressed as follows:
$$
\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \Phi(t, \tau_{i-1}) \begin{bmatrix} x(\tau_{i-1}^+) \\ y(\tau_{i-1}^+) \end{bmatrix} + \int_{\tau_{i-1}}^t \Phi(t, s) \begin{bmatrix} k_2 \\ 0 \end{bmatrix} dw(s). \qquad (12.3.48)
$$
This can be written as
$$
x(t) = \phi_{11}(t, \tau_{i-1})\, x(\tau_{i-1}^+) + \phi_{12}(t, \tau_{i-1})\, y(\tau_{i-1}^+) + k_2 \int_{\tau_{i-1}}^t \phi_{11}(t, s)\, dw(s) \qquad (12.3.49)
$$
and
$$
y(t) = \phi_{21}(t, \tau_{i-1})\, x(\tau_{i-1}^+) + \phi_{22}(t, \tau_{i-1})\, y(\tau_{i-1}^+) + k_2 \int_{\tau_{i-1}}^t \phi_{21}(t, s)\, dw(s). \qquad (12.3.50)
$$
Taking the expectation of $x(t)$ and $y(t)$ gives
$$
\mu_x(t) = \phi_{11}(t, \tau_{i-1})\, \mu_x(\tau_{i-1}^+) + \phi_{12}(t, \tau_{i-1})\, \mu_y(\tau_{i-1}^+), \qquad (12.3.51)
$$
$$
\mu_y(t) = \phi_{21}(t, \tau_{i-1})\, \mu_x(\tau_{i-1}^+) + \phi_{22}(t, \tau_{i-1})\, \mu_y(\tau_{i-1}^+). \qquad (12.3.52)
$$
By differentiating (12.3.51) and (12.3.52) with respect to $t$, we obtain
$$
\frac{d\mu_x(t)}{dt} = (u(t) - k_1) \big[ \phi_{11}(t, \tau_{i-1})\, \mu_x(\tau_{i-1}^+) + \phi_{12}(t, \tau_{i-1})\, \mu_y(\tau_{i-1}^+) \big] = (u(t) - k_1)\, \mu_x(t) \qquad (12.3.53)
$$
and
$$
\frac{d\mu_y(t)}{dt} = k_3 \big[ \phi_{11}(t, \tau_{i-1})\, \mu_x(\tau_{i-1}^+) + \phi_{12}(t, \tau_{i-1})\, \mu_y(\tau_{i-1}^+) \big] = k_3\, \mu_x(t). \qquad (12.3.54)
$$

Now, their variances can be calculated as given below:
$$
\sigma_{xx}(t) = \phi_{11}^2(t, \tau_{i-1})\, \sigma_{xx}(\tau_{i-1}^+) + \phi_{12}^2(t, \tau_{i-1})\, \sigma_{yy}(\tau_{i-1}^+) + 2\phi_{11}(t, \tau_{i-1})\phi_{12}(t, \tau_{i-1})\, \sigma_{xy}(\tau_{i-1}^+) + k_2^2 \int_{\tau_{i-1}}^t \phi_{11}^2(t, s)\, ds \qquad (12.3.55)
$$
and
$$
\sigma_{yy}(t) = \phi_{21}^2(t, \tau_{i-1})\, \sigma_{xx}(\tau_{i-1}^+) + \phi_{22}^2(t, \tau_{i-1})\, \sigma_{yy}(\tau_{i-1}^+) + 2\phi_{21}(t, \tau_{i-1})\phi_{22}(t, \tau_{i-1})\, \sigma_{xy}(\tau_{i-1}^+) + k_2^2 \int_{\tau_{i-1}}^t \phi_{21}^2(t, s)\, ds. \qquad (12.3.56)
$$
Furthermore, the covariance is
$$
\sigma_{xy}(t) = \sigma_{yx}(t) = \phi_{11}(t, \tau_{i-1})\phi_{21}(t, \tau_{i-1})\, \sigma_{xx}(\tau_{i-1}^+) + \big[ \phi_{11}(t, \tau_{i-1})\phi_{22}(t, \tau_{i-1}) + \phi_{12}(t, \tau_{i-1})\phi_{21}(t, \tau_{i-1}) \big]\, \sigma_{xy}(\tau_{i-1}^+) + \phi_{12}(t, \tau_{i-1})\phi_{22}(t, \tau_{i-1})\, \sigma_{yy}(\tau_{i-1}^+) + k_2^2 \int_{\tau_{i-1}}^t \phi_{11}(t, s)\phi_{21}(t, s)\, ds. \qquad (12.3.57)
$$

Differentiating (12.3.55)–(12.3.57) with respect to time, using (12.3.45) and then substituting (12.3.55)–(12.3.57) back into the resulting expressions, we obtain
$$
\frac{d\sigma_{xx}(t)}{dt} = 2(u(t) - k_1)\,\sigma_{xx}(t) + k_2^2, \qquad (12.3.58)
$$
$$
\frac{d\sigma_{yy}(t)}{dt} = 2k_3\,\sigma_{xy}(t), \qquad (12.3.59)
$$
$$
\frac{d\sigma_{xy}(t)}{dt} = \frac{d\sigma_{yx}(t)}{dt} = (u(t) - k_1)\,\sigma_{xy}(t) + k_3\,\sigma_{xx}(t). \qquad (12.3.60)
$$

The mean, variance and covariance of the initial conditions (12.3.32) and (12.3.33) are
$$
\mu_x(0) = x^*, \quad \mu_y(0) = 0, \qquad (12.3.61)
$$
$$
\sigma_{xx}(0) = k_4, \quad \sigma_{yy}(0) = 0, \quad \sigma_{xy}(0) = \sigma_{yx}(0) = 0. \qquad (12.3.62)
$$
At the overhaul times $t = \tau_i$, $i = 1, \ldots, N$, the mean, variance and covariance of the state jump conditions (12.3.36) and (12.3.37) are
$$
\mu_x(\tau_i^+) = k_5\, \mu_x(\tau_i^-), \quad \mu_y(\tau_i^+) = \mu_y(\tau_i^-), \qquad (12.3.63)
$$
$$
\sigma_{xx}(\tau_i^+) = k_5^2\, \sigma_{xx}(\tau_i^-) + k_6, \quad \sigma_{yy}(\tau_i^+) = \sigma_{yy}(\tau_i^-), \qquad (12.3.64)
$$
$$
\sigma_{xy}(\tau_i^+) = \sigma_{yx}(\tau_i^+) = k_5\, \sigma_{xy}(\tau_i^-). \qquad (12.3.65)
$$
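The moment system (12.3.53)–(12.3.60) together with the jump maps (12.3.63)–(12.3.65) can be simulated piecewise: integrate between consecutive overhauls and apply the jump at each overhaul time. The following sketch does this with scipy; the overhaul times and maintenance rates used here are arbitrary illustrative values, not the optimal ones reported later.

```python
import numpy as np
from scipy.integrate import solve_ivp

k1, k2, k3, k4, k5, k6 = 1.35e-2, 1e-3, 2.5, 1e-4, 1.18, 1e-4

def rhs(t, z, u):
    mux, muy, sxx, syy, sxy = z
    return [(u - k1) * mux,                  # (12.3.53)
            k3 * mux,                        # (12.3.54)
            2 * (u - k1) * sxx + k2 ** 2,    # (12.3.58)
            2 * k3 * sxy,                    # (12.3.59)
            (u - k1) * sxy + k3 * sxx]       # (12.3.60)

def simulate(taus, u_levels, z0):
    """Integrate between overhauls; apply the jumps (12.3.63)-(12.3.65)
    at each listed overhaul time."""
    z, t0, path = np.array(z0, float), 0.0, []
    for tau, u in zip(taus, u_levels):
        sol = solve_ivp(rhs, (t0, tau), z, args=(u,), rtol=1e-8)
        path.append(sol)
        z = sol.y[:, -1]
        z = np.array([k5 * z[0], z[1],
                      k5 ** 2 * z[2] + k6, z[3], k5 * z[4]])
        t0 = tau
    return path

z0 = [1.0, 0.0, k4, 0.0, 0.0]                # (12.3.61)-(12.3.62)
path = simulate([15.0, 30.0, 45.0], [1.35e-3] * 3, z0)
```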

Since the state equations (12.3.29) and (12.3.30) and the jump conditions (12.3.36) and (12.3.37) are linear, $x(t)$ and $y(t)$ are mixtures of normally distributed random variables. Thus, the probabilistic constraints (12.3.38) and (12.3.39) can be written as follows:
$$
\int_{x_{\min}}^{\infty} \frac{1}{(2\pi \sigma_{xx}(t))^{1/2}} \exp\left\{ \frac{-(\eta - \mu_x(t))^2}{2\sigma_{xx}(t)} \right\} d\eta \ge p_1, \quad t \in [0, \tau_{N+1}], \qquad (12.3.66)
$$
$$
\int_{y_{\min}}^{\infty} \frac{1}{(2\pi \sigma_{yy}(\tau_{N+1}))^{1/2}} \exp\left\{ \frac{-(\eta - \mu_y(\tau_{N+1}))^2}{2\sigma_{yy}(\tau_{N+1})} \right\} d\eta \ge p_2. \qquad (12.3.67)
$$
Constraint (12.3.66) is a continuous inequality constraint in terms of the new state variables $\mu_x$ and $\sigma_{xx}$, and constraint (12.3.67) is a terminal state constraint involving the new state variables $\mu_y$ and $\sigma_{yy}$.
As the functions $L_1(\cdot)$ and $\Psi_2(\cdot)$ appearing in the cost functional (12.3.41) are quadratic, we can express $E[L_1(x(t))]$ and $E[\Psi_2(x(\tau_{N+1}))]$ in terms of the new state variables by replacing $E[x(t)]$ and $E[(x(t))^2]$ with $\mu_x(t)$ and $\sigma_{xx}(t) + \mu_x^2(t)$, respectively. We denote the resulting functions by $\tilde{L}_1(\mu_x(t), \sigma_{xx}(t))$ and $\tilde{\Psi}_2(\mu_x(\tau_{N+1}), \sigma_{xx}(\tau_{N+1}))$, respectively. Similarly, since $\Psi_1$ is linear, we have
$$
E[\Psi_1(x(\tau_i^-))] = \Psi_1(\mu_x(\tau_i^-)).
$$
Thus, the cost functional (12.3.41) can be written as
$$
g_0(\tau, u) = \int_0^{\tau_{N+1}} \big[ \tilde{L}_1(\mu_x(t), \sigma_{xx}(t)) + \tilde{L}_2(u(t)) \big]\, dt + \sum_{i=1}^{N} \Psi_1(\mu_x(\tau_i^-)) - \tilde{\Psi}_2(\mu_x(\tau_{N+1}), \sigma_{xx}(\tau_{N+1})). \qquad (12.3.68)
$$
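The passage from $L_1$ to $\tilde{L}_1$ is mechanical for any quadratic integrand, since only $E[x]$ and $E[x^2]$ need to be replaced. A small helper illustrating the construction (the numerical coefficients are those of the example studied later in this section):

```python
def quadratic_tilde(c2, c1, c0):
    """Given L(x) = c2*x**2 + c1*x + c0, return L~(mu, sigma) = E[L(x)]
    for x with mean mu and variance sigma, using E[x**2] = sigma + mu**2."""
    def L_tilde(mu, sigma):
        return c2 * (sigma + mu ** 2) + c1 * mu + c0
    return L_tilde

# Example: with L1(x) = 2.5 x^2 - 20 x + 40 from the numerical study below,
L1_tilde = quadratic_tilde(2.5, -20.0, 40.0)
```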

We are now able to state the transformed problem as follows.


Problem (P E1 ). Given the dynamic system (12.3.53) and (12.3.54) and
(12.3.58)–(12.3.60) with the initial conditions (12.3.61) and (12.3.62) and
the jump conditions (12.3.63)–(12.3.65), find an admissible pair (τ , u) ∈
Υ × U (τ ) such that the cost function (12.3.68) is minimized subject to con-
straints (12.3.66) and (12.3.67).
In Problem (P E1 ), the state of the machine experiences N instantaneous
jumps during the time horizon. The times at which these jumps occur are
actually decision variables to be optimized. Now, by applying the time scaling
transformation (see Section 7.4.2), the time scale t ∈ [0, τN +1 ] is mapped
into the new time scale s ∈ [0, N + 1] such that the variable jump points are
mapped into fixed jump points. This mapping is realized by the following
differential equation:

$$
\frac{dt(s)}{ds} = \tilde{v}(s) = \sum_{i=1}^{N+1} v_i\, \chi_{[i-1,i)}(s), \qquad (12.3.69)
$$
$$
t(0) = 0, \qquad (12.3.70)
$$

where $v_i = \tau_i - \tau_{i-1} \ge \rho$ for each $i = 1, \ldots, N+1$, $v_1 + \cdots + v_{N+1} \ge t_{\min}$, and $\chi_{[i-1,i)}(s)$ is the indicator function of $[i-1, i)$.
Note that $v_i$ denotes the time duration between the $(i-1)$th and $i$th jump times. We collect the duration parameters into a vector $v = [v_1, \ldots, v_{N+1}]^\top \in \mathbb{R}^{N+1}$. Define
$$
V = \big\{ v \in \mathbb{R}^{N+1} : v_i \ge \rho, \ i = 1, \ldots, N+1; \ v_1 + \cdots + v_{N+1} \ge t_{\min} \big\}. \qquad (12.3.71)
$$
A vector $v \in V$ is called an admissible duration vector. From (12.3.69) and (12.3.70), we have, for each $i = 1, \ldots, N+1$,
$$
t(i) = t(0) + \int_0^i \tilde{v}(s)\, ds = t(0) + v_1 + \cdots + v_i = \tau_i. \qquad (12.3.72)
$$

This equation shows the relationship between the variable jump points
t = τi , i = 1, . . . , N + 1, and the fixed jump points s = i, i = 1, . . . , N + 1.
Define $\tilde{u}(s) = u(t(s))$. Recall that the continuous maintenance rate is constant between consecutive overhauls. Thus, the admissible controls are restricted to piecewise-constant functions that assume constant values between consecutive jump times. As a result, $\tilde{u}(s)$ can be expressed as
$$
\tilde{u}(s) = \sum_{i=1}^{N+1} h_i\, \chi_{[i-1,i)}(s), \qquad (12.3.73)
$$

where $h_i$, $i = 1, \ldots, N+1$, are control heights to be optimized. In view of (12.3.31), these control heights must satisfy the following constraints:
$$
0 \le h_i \le a k_1, \quad i = 1, \ldots, N+1. \qquad (12.3.74)
$$
Let $h = [h_1, \ldots, h_{N+1}]^\top \in \mathbb{R}^{N+1}$. Furthermore, define
$$
H = \big\{ h \in \mathbb{R}^{N+1} : 0 \le h_i \le a k_1, \ i = 1, \ldots, N+1 \big\}. \qquad (12.3.75)
$$

A vector h ∈ H is called an admissible control parameter vector. Further-


more, a pair (h, v) ∈ H × V is called an admissible pair.
We assume throughout that the control switches coincide with the overhaul
times. Note that it is straightforward to consider the case in which the control
can switch value between consecutive jump times, as well as at the jump times
themselves. However, the notation will be more involved. Let
$$
\tilde{\mu}_x(s) = \mu_x(t(s)), \quad \tilde{\mu}_y(s) = \mu_y(t(s)), \qquad (12.3.76)
$$
$$
\tilde{\sigma}_{xx}(s) = \sigma_{xx}(t(s)), \quad \tilde{\sigma}_{yy}(s) = \sigma_{yy}(t(s)), \quad \tilde{\sigma}_{xy}(s) = \tilde{\sigma}_{yx}(s) = \sigma_{xy}(t(s)). \qquad (12.3.77)
$$

Then, the dynamics (12.3.53), (12.3.54) and (12.3.58)–(12.3.60) are transformed into
$$
\frac{d\tilde{\mu}_x(s)}{ds} = \tilde{v}(s)\,(\tilde{u}(s) - k_1)\,\tilde{\mu}_x(s), \qquad (12.3.78)
$$
$$
\frac{d\tilde{\mu}_y(s)}{ds} = k_3\,\tilde{v}(s)\,\tilde{\mu}_x(s), \qquad (12.3.79)
$$
$$
\frac{d\tilde{\sigma}_{xx}(s)}{ds} = \tilde{v}(s)\,\big[ 2(\tilde{u}(s) - k_1)\,\tilde{\sigma}_{xx}(s) + k_2^2 \big], \qquad (12.3.80)
$$
$$
\frac{d\tilde{\sigma}_{yy}(s)}{ds} = 2k_3\,\tilde{v}(s)\,\tilde{\sigma}_{xy}(s), \qquad (12.3.81)
$$
$$
\frac{d\tilde{\sigma}_{xy}(s)}{ds} = \frac{d\tilde{\sigma}_{yx}(s)}{ds} = \tilde{v}(s)\,\big[ (\tilde{u}(s) - k_1)\,\tilde{\sigma}_{xy}(s) + k_3\,\tilde{\sigma}_{xx}(s) \big]. \qquad (12.3.82)
$$
Furthermore, the initial conditions are
$$
\tilde{\mu}_x(0) = x^*, \quad \tilde{\mu}_y(0) = 0, \qquad (12.3.83)
$$
$$
\tilde{\sigma}_{xx}(0) = k_4, \quad \tilde{\sigma}_{yy}(0) = 0, \quad \tilde{\sigma}_{xy}(0) = \tilde{\sigma}_{yx}(0) = 0. \qquad (12.3.84)
$$
At the overhaul times $s = i$, $i = 1, \ldots, N$ (corresponding to $t = \tau_i$), the new state jump conditions are
$$
\tilde{\mu}_x(i^+) = k_5\,\tilde{\mu}_x(i^-), \quad \tilde{\mu}_y(i^+) = \tilde{\mu}_y(i^-), \qquad (12.3.85)
$$
$$
\tilde{\sigma}_{xx}(i^+) = k_5^2\,\tilde{\sigma}_{xx}(i^-) + k_6, \quad \tilde{\sigma}_{yy}(i^+) = \tilde{\sigma}_{yy}(i^-), \qquad (12.3.86)
$$
$$
\tilde{\sigma}_{xy}(i^+) = \tilde{\sigma}_{yx}(i^+) = k_5\,\tilde{\sigma}_{xy}(i^-). \qquad (12.3.87)
$$

The probabilistic state constraints become
$$
\int_{x_{\min}}^{\infty} \frac{1}{(2\pi \tilde{\sigma}_{xx}(s))^{1/2}} \exp\left\{ \frac{-(\eta - \tilde{\mu}_x(s))^2}{2\tilde{\sigma}_{xx}(s)} \right\} d\eta \ge p_1, \quad s \in [0, N+1], \qquad (12.3.88)
$$
$$
\int_{y_{\min}}^{\infty} \frac{1}{(2\pi \tilde{\sigma}_{yy}(N+1))^{1/2}} \exp\left\{ \frac{-(\eta - \tilde{\mu}_y(N+1))^2}{2\tilde{\sigma}_{yy}(N+1)} \right\} d\eta \ge p_2. \qquad (12.3.89)
$$

An admissible pair (h, v) ∈ H × V is said to be feasible if it satisfies the


constraints (12.3.88) and (12.3.89).
After applying the time scaling transformation, Problem (P E1 ) becomes
Problem (P E2 ) defined below.
Problem (P E2 ). Given the dynamic system (12.3.78)–(12.3.82) with the
initial conditions (12.3.83)–(12.3.84) and the jump conditions (12.3.85)–
(12.3.87), find a pair (h, v) ∈ H × V such that the cost functional
$$
\tilde{g}_0(h, v) = \int_0^{N+1} \tilde{v}(s)\,\big[ \tilde{L}_1(\tilde{\mu}_x(s), \tilde{\sigma}_{xx}(s)) + \tilde{L}_2(\tilde{u}(s)) \big]\, ds + \sum_{i=1}^{N} \Psi_1(\tilde{\mu}_x(i^-)) - \tilde{\Psi}_2(\tilde{\mu}_x(N+1), \tilde{\sigma}_{xx}(N+1)) \qquad (12.3.90)
$$

is minimized subject to constraints (12.3.88) and (12.3.89).

Problem (PE2) is an impulsive optimal parameter selection problem with state constraints. We can apply the constraint transcription technique introduced in Section 4.2 to the continuous inequality constraint (12.3.88) to obtain an approximate inequality constraint in canonical form. Furthermore, the gradient formulas for the cost functional and the approximate canonical inequality constraint functionals can be obtained readily from Theorem 7.2.2. Thus, the resulting approximate optimal control problem can be solved as a nonlinear programming problem; see Section 7.2.2.
For illustration, we consider the stochastic machine maintenance problem
for a brand-new machine costing $10, 000. The manager in charge of the
machine plans to replace the machine after 20 overhauls (major maintenance).
Meanwhile, the workers in the factory will perform continuous maintenance
on the machine (minor maintenance) to ensure that it is kept in good working
order. The model parameters are given by

$$k_1 = 1.35 \times 10^{-2}, \quad k_2 = 10^{-3}, \quad k_3 = 2.5, \quad k_4 = 10^{-4}, \quad k_5 = 1.18,$$
$$k_6 = 10^{-4}, \quad a = 0.1, \quad x^* = 1.0, \quad p_1 = 0.8, \quad p_2 = 0.8,$$
$$x_{\min} = 0.1, \quad y_{\min} = 500, \quad \rho = 15.0, \quad t_{\min} = 400.$$

The explicit forms for the functions in the cost functional are given as follows:
$$
L_1(x(t)) = 2.5x^2(t) - 20x(t) + 40, \qquad L_2(u(t)) = \frac{40}{k_1}\, u(t),
$$
$$
\Psi_1(x(\tau_i^-)) = 1000 - 500\, x(\tau_i^-), \qquad \Psi_2(x(\tau_{N+1})) = \frac{1}{5}\, x(\tau_{N+1}) \times 1000.
$$
Note that N = 20 is the number of overhaul times, and $10, 000 is the
original capital cost of the machine.
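To make the overall computational procedure concrete, the following sketch assembles the nonlinear program in the decision vector (h, v) with scipy, using a crude Euler discretization of the scaled dynamics (12.3.78)–(12.3.82) and of the cost (12.3.90). It is a simplified illustration only: the probabilistic constraints (12.3.88)–(12.3.89) and the constraint transcription step are omitted, the equations for μ̃y and σ̃yy are dropped because they do not enter this cost, and the discretization is far coarser than that used to produce the results reported below.

```python
import numpy as np
from scipy.optimize import minimize

N = 20
rho, tmin, a = 15.0, 400.0, 0.1
k1, k2, k3, k4, k5, k6 = 1.35e-2, 1e-3, 2.5, 1e-4, 1.18, 1e-4

def cost(hv, steps=50):
    """Euler integration of the mux/sxx/sxy dynamics in (12.3.78)-(12.3.82)
    with the jumps (12.3.85)-(12.3.87), accumulating the cost (12.3.90)."""
    h, v = hv[:N + 1], hv[N + 1:]
    mux, sxx, sxy = 1.0, k4, 0.0
    total, ds = 0.0, 1.0 / steps
    for i in range(N + 1):
        for _ in range(steps):
            L1t = 2.5 * (sxx + mux ** 2) - 20.0 * mux + 40.0  # L1~ (quadratic)
            L2t = (40.0 / k1) * h[i]                          # L2~
            total += v[i] * (L1t + L2t) * ds
            dmux = v[i] * (h[i] - k1) * mux
            dsxx = v[i] * (2.0 * (h[i] - k1) * sxx + k2 ** 2)
            dsxy = v[i] * ((h[i] - k1) * sxy + k3 * sxx)
            mux, sxx, sxy = mux + ds * dmux, sxx + ds * dsxx, sxy + ds * dsxy
        if i < N:                          # overhaul jump at s = i + 1
            total += 1000.0 - 500.0 * mux  # Psi_1 evaluated at the mean
            mux, sxx, sxy = k5 * mux, k5 ** 2 * sxx + k6, k5 * sxy
    return total - 200.0 * mux             # minus the salvage term Psi_2~

bounds = [(0.0, a * k1)] * (N + 1) + [(rho, None)] * (N + 1)
cons = [{'type': 'ineq', 'fun': lambda hv: np.sum(hv[N + 1:]) - tmin}]
hv0 = np.concatenate([np.full(N + 1, a * k1), np.full(N + 1, tmin / (N + 1))])
res = minimize(cost, hv0, method='SLSQP', bounds=bounds, constraints=cons)
```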

Table 12.3.1: Optimal jump times for the example

i τi i τi i τi i τi i τi i τi i τi
1 15 4 60 7 105 10 150 13 195 16 240 19 285
2 30 5 75 8 120 11 165 14 210 17 255 20 300
3 45 6 90 9 135 12 180 15 225 18 270 21 400

The optimal value of the cost functional obtained is $\tilde{g}_0 = 11{,}602.7281$. The optimal jump times (the overhaul times) and the optimal terminal time (replacement time) are given in Table 12.3.1, while the optimal continuous maintenance rates (minor maintenance) are shown in Table 12.3.2. Note that we are assuming the continuous maintenance rate takes a constant value between consecutive jump points. The optimal trajectories of the state variables $\mu_x(t)$, $\mu_y(t)$, $\sigma_{xx}(t)$, $\sigma_{yy}(t)$, $\sigma_{xy}(t)$ are shown in Figure 12.3.1a–e, respectively.
Figure 12.3.1a shows the mean of the machine state over the duration of
400 time periods. The mean starts off at 1 and gradually decreases with
time. However, with each overhaul, the mean of the state of the machine is
restored to a higher value, close to where it was at the previous overhaul.
Figure 12.3.1b shows the mean of the accumulated output, which gradually
increases over time. Figure 12.3.1c shows the variance of the state of the ma-
chine. The variance changes with each overhaul performed and then gradually
decreases after the last overhaul. The variance of the output is shown in Fig-
ure 12.3.1e, while Figure 12.3.1d shows the covariance of the output with the
state of the machine.
To examine the performance of our optimal maintenance policy over a
range of scenarios, 500 sample paths are simulated for the machine state (see
Figure 12.3.1f) and machine output (see Figure 12.3.1g), respectively. The
paths simulated are similar in shape to the paths of mean values as shown
earlier.

Table 12.3.2: Optimal continuous maintenance rate for the example

Interval  u(t)              Interval  u(t)               Interval  u(t)
1         1.35 × 10^{-3}    8         4.6386 × 10^{-31}   15        6.9333 × 10^{-33}
2         1.35 × 10^{-3}    9         4.2867 × 10^{-31}   16        0
3         1.35 × 10^{-3}    10        0                   17        0
4         1.35 × 10^{-3}    11        4.0135 × 10^{-31}   18        0
5         1.35 × 10^{-3}    12        0                   19        0
6         1.35 × 10^{-3}    13        4.1610 × 10^{-31}   20        0
7         1.35 × 10^{-3}    14        7.7037 × 10^{-31}   21        0
Fig. 12.3.1: The optimal trajectories of the state variables and simulation
with 500 sample paths. (a) μx (t). (b) μy (t). (c) σxx (t). (d) σxy (t).
(e) σyy (t). (f ) Simulation of x(t) with 500 sample paths. (g) Simulation
of y(t) with 500 sample paths

12.4 Exercises

12.4.1 Consider the optimal parameter and optimal control problem (SP)
with the control taking the form as given below:

$$u = Kx,$$
where $K$ is an $r \times n$ matrix to be determined such that the cost functional (12.2.5) is minimized. Obtain the corresponding deterministic optimal parameter selection problem.

12.4.2 Consider the optimal parameter and optimal control problem (SP). However, the probabilistic state constraints (12.2.4) are replaced by the following constraints:
$$
\alpha_i \le E\big\{ \xi_i(t) - \hat{\xi}_i(t) \big\} \le \beta_i, \quad \text{for all } t \in [0, T], \ i = 1, \ldots, N,
$$
where $\alpha_i$, $\beta_i$, $i = 1, \ldots, N$, are the given real constants and $\hat{\xi}_i(t)$, $i = 1, \ldots, N$, are the specified desired state trajectories. Obtain the corresponding deterministic optimal parameter and optimal control problem.

12.4.3 Consider the example of Section 12.2.2. Let the probabilistic constraint (12.2.23) be replaced by the following constraint:
$$
\alpha \le E\big\{ \xi(t) - \hat{\xi}(t) \big\} \le \beta, \quad \text{for all } t \in [0, T],
$$
where $\alpha$ and $\beta$ are the given real constants and $\hat{\xi}(t)$ is the specified desired state trajectory. Obtain the corresponding deterministic optimal control problem.

12.4.4 Consider Problem (SP2b) but with the constraint (12.3.26) being replaced by appropriate probabilistic constraints of the form given by (12.2.4). Derive the corresponding deterministic optimal control problem.

12.4.5 Consider the example of Section 12.3.3. Let the probabilistic con-
straints (12.3.38) and (12.3.39) be replaced by the constraints of the form
given by (12.3.25). Derive the corresponding deterministic optimal control
problem.

12.4.6 Consider the example of Section 12.3.3. Suppose that the control is
a piecewise constant function between every pair of overhaul times, where
the heights and switching times of the piecewise constant control are decision
variables. Derive the corresponding deterministic optimal control problem.
Appendix A.1
Elements of Mathematical Analysis

A.1.1 Introduction

In this Section, some results in measure theory and functional analysis are
presented without proofs. The main references are [3, 4, 40, 51, 90, 91, 198,
206, 216, 250, 253].

A.1.2 Sequences

Let X be a given set. It is called a finite set if it is either empty or a finite


sequence. Similarly, the set X is called a countable set if it is either empty or
a sequence.
Let $\{x_n\}$ be a sequence. The limit superior of the sequence $\{x_n\}$, denoted by $\limsup_{n \to \infty} x_n$ or $\overline{\lim}_{n \to \infty}\, x_n$, is defined by
$$
\overline{\lim_{n \to \infty}}\, x_n = \inf_n \sup_{k \ge n} x_k.
$$
Similarly, the limit inferior of the sequence $\{x_n\}$, denoted by $\liminf_{n \to \infty} x_n$ or $\underline{\lim}_{n \to \infty}\, x_n$, is defined by
$$
\underline{\lim_{n \to \infty}}\, x_n = \sup_n \inf_{k \ge n} x_k.
$$

Let $\{x_n\}$ be a sequence. Then, $K \in \mathbb{R} \cup \{+\infty\}$ (respectively, $K \in \mathbb{R} \cup \{-\infty\}$) is the limit superior (respectively, limit inferior) of the sequence $\{x_n\}$ if and only if the following two conditions are satisfied:


1. There exists a subsequence $\{x_{n(\ell)}\}$ of the sequence $\{x_n\}$ such that $\lim_{\ell \to \infty} x_{n(\ell)} = K$.
2. If $\{x_{n(\ell)}\}$ is any subsequence of the sequence $\{x_n\}$ such that $\lim_{\ell \to \infty} x_{n(\ell)} = A$, then $A \le K$ (respectively, $A \ge K$).

Note that a sequence $\{x_n\}$ can only have one limit superior (respectively, limit inferior). A sequence $\{x_n\}$ in $\mathbb{R} \cup \{\pm\infty\}$ has the limit $A$, denoted by $\lim_{n \to \infty} x_n = A$, if and only if
$$
\overline{\lim_{n \to \infty}}\, x_n = \underline{\lim_{n \to \infty}}\, x_n = A.
$$
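For example, the sequence $x_n = (-1)^n(1 + 1/n)$ has $\overline{\lim}_{n \to \infty}\, x_n = 1$ and $\underline{\lim}_{n \to \infty}\, x_n = -1$, so it has no limit, whereas $x_n = 1/n$ has limit superior and limit inferior both equal to $0$, and hence $\lim_{n \to \infty} x_n = 0$.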

A.1.3 Linear Vector Spaces

Let X be a set, which is a non-empty collection of elements. We define two


operations, called addition and scalar multiplication, on X. The addition
operation satisfies:
1. x + y ∈ X for all x, y ∈ X;
2. x + (y + z) = (x + y) + z for all x, y, z ∈ X;
3. there exists a unique element, 0, in X such that 0 + x = x + 0 = x for
all x ∈ X;
4. for every x ∈ X there exists a unique element, denoted by −x, such that
x + (−x) = 0;
5. x + y = y + x for all x, y ∈ X.
The scalar multiplication operation satisfies:
6. αx ∈ X for any α ∈ R, and x ∈ X;
7. α(x + y) = αx + αy for any α ∈ R, and x, y ∈ X;
8. (α + β)x = αx + βx for any α, β ∈ R, and x ∈ X;
9. α(βx) = (αβ)x for any α, β ∈ R, and x ∈ X;
10. there exists a unique element, 1, in X such that 1x = x for any x ∈ X.
The space X together with the two operations defined above is called a
(linear ) vector space.
Let $A$ be a non-empty subset of $X$. $A$ is called a convex set if
$$
\lambda x + (1 - \lambda)y \in A, \quad \text{for all } x, y \in A \text{ and } \lambda \in [0, 1].
$$
It is easy to show that the set $A$ is convex if and only if
$$
\sum_{i=1}^{n} \lambda_i x_i \in A
$$

whenever
$$
x_i \in A, \quad \lambda_i \ge 0, \quad \text{and} \quad \sum_{i=1}^{n} \lambda_i = 1.
$$

Note that the intersection of any number of convex sets is a convex set.
However, the union of two convex sets is, in general, not a convex set.
Let x1 , . . . , xm be m vectors in a vector space X. A linear combination of
these m vectors is defined by

α 1 x 1 + · · · + αm x m ,

where αi , i = 1, . . . , m, are real numbers.


Let x1 , . . . , xm be m vectors in a vector space X. The set of these vectors
in X is called linearly dependent if there exist real numbers α1 , . . . , αm , not
all zero, such that
α1 x1 + · · · + αm xm = 0. (A.1.1)
If this collection of vectors is not linearly dependent, then it is called linearly
independent. In this case, the only solution to (A.1.1) is

αi = 0, for all i = 1, . . . , m.

Let M be a non-empty subset of a vector space X. If αx + βy ∈ M for


all real numbers α, β and for all x, y ∈ M, then M is called a subspace of X.
The whole space X is itself a subspace of X. If M is a subspace of X such that M ≠ X, then M is called a proper subspace of X.
Let S be a subset of a vector space X. Then, the set [S] is said to be the
subspace generated by S if it consists of all those vectors in X that can be
expressed as linear combinations of vectors in S. Let x1 , . . . , xm be m vectors
in X. Then,

    α_1 x_1 + · · · + α_m x_m

is called a convex combination of x_1, . . . , x_m if α_i ≥ 0 for i = 1, . . . , m, and Σ_{i=1}^{m} α_i = 1.
Let S ≡ {x1 , . . . , xm } be a set of linearly independent vectors in a vector
space X. If X is generated by S, then S is called a basis for the vector space
X. In this case, the vector space X is said to be of (finite) dimension m. If a vector space X is not finite dimensional, it is called infinite dimensional.
In a finite dimensional vector space, any two bases must contain the same
number of linearly independent vectors.

A.1.4 Metric Spaces

Let X be a non-empty set. A topology T on X is a family of subsets of X


such that

1. X ∈ T , the empty set ∅ ∈ T ;


2. the union of any number of members of T is in T ; and
3. the intersection of a finite number of members of T is in T .

The set endowed with the topology T is called a topological space and is
written as (X, T ). Members of T are called open sets. A set B ⊂ X is said
to be closed if its complement X \ B is open.
Let T1 and T2 be two topologies on X. T1 is said to be stronger than T2
(or T2 weaker than T1 ) if T1 ⊃ T2 .
Let (X, T ) be a topological space, and let A be a non-empty subset of
X. The family TA = {A ∩ B : B ∈ T } is a topology on A and is called the
relative topology on A induced by the topology T on X.
Let (X, T) be a topological space, A ⊂ X, and C ≡ {G_i} a subfamily of T such that A ⊂ ∪G_i. Then, C is called an open covering of A. If every open covering of X has a finite subfamily {G_1, . . . , G_n} ⊂ C such that X = ∪_{i=1}^{n} G_i, then the topological space (X, T) is called a compact space. A subset A of a topological space is said to be compact if it is compact as a subspace of X (with the relative topology). Equivalently, A is called compact if every open covering of A contains a finite subfamily that covers A.
A family of closed sets is said to possess the finite intersection property if
the intersection of any finite number of sets in the family is non-empty.
A topological space is compact if and only if any family of closed sets with
the finite intersection property has non-empty intersection.
A point x ∈ X is said to be an interior point of a set A ⊂ X if there
exists an open set G in X such that x ∈ G ⊂ A. The interior Å of A is
the set which consists of all the interior points of the set A. A neighbourhood
of a point x ∈ X is a set V ⊂ X such that x is an interior point of V . A
point x ∈ X is said to be an accumulation point of a set A ⊂ X if every
neighbourhood of x ∈ X contains points of A other than x. If A ⊂ X is a
closed set, then it contains all its accumulation points. The union of a set B
and its accumulation points is called the closure of B and is written as B.
A set A ⊂ X is said to be dense in a set E ⊂ X if Ā ⊃ E. A set A is said to be nowhere dense if the interior of its closure is empty. If X contains a countable subset that is dense in X, then it is called separable. The boundary ∂A of a set A is the set of all accumulation points of both A and X \ A. Thus, ∂A = Ā ∩ B̄, where B = X \ A.
A family B of subsets of X is a base for a topology T on X if B is a
subfamily of T and, for each x ∈ X and each neighbourhood U of x, there
is a member V of B such that x ∈ V ⊂ U . A family F of subsets of X is a
subbase for a topology T on X if the family of finite intersections of members
of F is a base for T .
A sequence {xn } ⊂ X is said to converge to a point x ∈ X, denoted
by xn → x, if each neighbourhood of x contains all but a finite number of
elements of the sequence.

A topological space X is called Hausdorff if it satisfies the separation axiom: if any x, y ∈ X are such that x ≠ y, then x and y have disjoint neighbourhoods.
A compact subset of a Hausdorff (topological) space is closed. A closed
subset of a compact set is compact.
We now turn our attention to a special class of topological spaces, known as metric spaces. Let X be a non-empty set and let ρ be a function from X × X into R such that the following axioms are satisfied:
(M1) ρ(x, y) ≥ 0 for all x, y ∈ X;
(M2) ρ(x, y) = 0 if and only if x = y;
(M3) ρ(x, y) = ρ(y, x) (symmetry) for all x, y ∈ X;
(M4) ρ(x, y) ≤ ρ(x, z) + ρ(z, y) (triangle inequality) for all x, y, z ∈ X.
A set X equipped with such a metric ρ is called a metric space. It is written as (X, ρ).
A set Θ ⊂ X is said to be open if for every x ∈ Θ there exists a δ > 0
such that
{y ∈ X : ρ(x, y) < δ} ⊂ Θ,
where {y ∈ X : ρ(x, y) < δ} is an open ball of radius δ with center x, or a
δ-neighbourhood of x. The whole set X and the empty set ∅ are both open
sets. A metric space (X, ρ) is a topological space with its open sets generated
by the metric ρ. Let D be a subset of X, and let x be a point in X. If, for
any δ > 0, there exists a point y ∈ D such that ρ(x, y) < δ, then x is said to
be a point in the closure of the set D. Let D̄ denote the closure of the set D.
Clearly, D ⊂ D̄. If A ⊂ B, then Ā ⊂ B̄. Furthermore,

A ∪ B = Ā ∪ B̄

and
A ∩ B ⊂ Ā ∩ B̄.
If Ā = A, then the set A is said to be closed. The closure B̄ of any set B is a closed set. The whole set X and the empty set ∅ are closed sets. The union of any two closed sets is a closed set. Although the intersection of any collection (countable or uncountable) of closed sets is closed, the union of a countable collection of closed sets need not be closed. Similarly, the intersection of a countable collection of open sets need not be open.
Let A be a subset of X. The complement Ã of A is defined by

    Ã = {x ∈ X : x ∉ A}.

The complement of an open set is a closed set. Similarly, the complement of


a closed set is an open set.
A metric space (X, ρ) is called separable if there exists a subset D of X
that contains only a countable number of points in X such that D̄ = X.

We can define different metrics on the same set X. Let ρ and σ be two different functions from X × X into R such that the properties (M1)–(M4) are satisfied. Then, (X, ρ) and (X, σ) are two different metric spaces.
In a metric space (X, ρ), a sequence {x_n}_{n=1}^{∞} ⊂ X is called a Cauchy sequence if

    ρ(x_{n+p}, x_n) → 0 as n → ∞,

uniformly with respect to the integer p ≥ 1. A metric space (X, ρ) is said to be complete if every Cauchy sequence has a limiting point in X.
Let (X, ρ) be a metric space. For any two distinct points x1 , x2 in X, there
exist two real numbers δ1 > 0 and δ2 > 0 such that

{y ∈ X : ρ(x1 , y) < δ1 } ∩ {y ∈ X : ρ(x2 , y) < δ2 } = ∅.

This implies that the metric space (X, ρ) is a Hausdorff space. Therefore, a
convergent sequence can have only one limiting point.
Let (X, ρ) be a metric space, and let A ⊂ X. If, for any sequence {x_n}_{n=1}^{∞} in A, there exist a subsequence {x_{n(ℓ)}}_{ℓ=1}^{∞} and a point x ∈ X such that

    ρ(x_{n(ℓ)}, x) → 0 as ℓ → ∞,

then A is said to be conditionally sequentially compact. The set A is said to


be sequentially compact if it is conditionally sequentially compact and the
limiting point x remains in the set A.
Let (X, ρ) be a metric space. If for each ε > 0 there exists a finite collection
{Gi,ε }ni=1 of open balls of radius ε such that X = ∪ni=1 Gi,ε , then the metric
space is called totally bounded.
The open ball of radius ε > 0 with center at the point x0 ∈ X is the set

Kε (x0 ) = {x ∈ X : ρ(x, x0 ) < ε}.

The family of sets:

B = {K1/n (x) : n = 1, 2, . . . and x ∈ X}

forms a basis for X, i.e., for each x ∈ X and each neighbourhood U of x,


there is a member V of B such that x ∈ V ⊂ U .
A set A in Rn is compact if and only if it is closed and bounded. Thus, a
closed subset of a compact set A in Rn is compact.
A metric space is compact if and only if it is complete and totally bounded.
If a metric space is compact, then it is separable.
Let (X, ρ) be a metric space and let S be a subset of X. We can restrict
the metric ρ to S. Equipped with this induced metric ρ, S becomes a metric
space, and is written as (S, ρ). In this case, we call (S, ρ) a subspace of (X, ρ).

Let (S, ρ) be a subspace of the metric space (X, ρ). If E ⊂ S, then Ē ∩ S is the closure of E relative to S, where Ē denotes the closure of E in X. A set
A ⊂ S is closed relative to S if and only if there exists a closed set F in X
such that A = S ∩ F. A set A ⊂ S is open relative to S if and only if there
exists an open set G in X such that A = S ∩ G.
Every subspace of a separable metric space is separable. Let (X, ρ) be a
metric space. If a subset A of X is complete, then it is closed. On the other
hand, a closed subset of a complete metric space is itself complete.

A.1.5 Continuous Functions

Let (X, ρ) and (Y, σ) be two metric spaces, and let f be a function from X
into Y . The function f is said to be continuous at x0 ∈ X if, for every ε > 0
there exists a δ ≡ δ(ε, x0 ) > 0, such that

σ(f (x), f (x0 )) < ε whenever ρ(x, x0 ) < δ.

It is called uniformly continuous if δ does not depend on x0 . The function f


is said to be continuous if it is continuous at every point in X. Let C(X, Y )
be the set of all such continuous functions. If the metric space X is compact,
then f is uniformly continuous. Furthermore, if Y = R, then f attains both
its maximum and minimum.
Let f be a function from an interval I ⊂ R to R. Then, the function f is
said to be continuous at x0 ∈ I if, for every ε > 0, there exists a δ ≡ δ(ε, x0 ),
with δ > 0, such that

|f (x) − f (x0 )| < ε whenever x ∈ I and |x − x0 | < δ,

where |y| denotes the absolute value of y. The concept of uniform continuity
for this special case is to be understood similarly.
Let (X, ρ), (Y, σ), and (Z, η) be three metric spaces. Let f be a continuous
function from X into Y , and g a continuous function from Y into Z. Then,
the composite function g ◦ f : X → Z is also continuous, where

g ◦ f (x) ≡ g(f (x)).

Let f be a continuous function from an interval I ⊂ R to R. Then, for any real number α, the set

    B = {x ∈ I : f(x) = α}

is a closed subset of I. A continuous function is uniformly continuous on


any compact subset A of I. It also attains both its maximum and minimum
on A. The set f (A) defined by

f (A) = { f (x) : x ∈ A }

is compact.
Let X = (X, ρ) be a metric space. A real-valued function f defined on X
is said to be lower semicontinuous at x0 ∈ X if for every real number α such
that f (x0 ) > α, there is a neighbourhood V of x0 such that f (x) > α for all
x ∈ V . Upper semicontinuity is defined by reversing the inequalities. We say
that f is lower semicontinuous if it is lower semicontinuous at every x ∈ X.
Let f be an upper (respectively, lower) semicontinuous real-valued function
on a compact space X. Then, f is bounded from above (respectively, below)
and assumes its maximum (respectively, minimum) in X.

Theorem A.1.1 (Dini Theorem). Let {f n } be a sequence of upper semi-


continuous real-valued functions on a compact space X, and suppose that for
each x ∈ X the sequence {f n (x)} decreases monotonically to zero. Then,
{f n } converges to zero uniformly.

A.1.6 Normed Spaces

Let X be a vector space and let ‖·‖ be a function from X into [0, ∞) such that the following properties are satisfied:

(N1) ‖y‖ ≥ 0 for all y ∈ X, and ‖y‖ = 0 if and only if y = 0.
(N2) ‖x + y‖ ≤ ‖x‖ + ‖y‖ for each x, y ∈ X (triangle inequality).
(N3) ‖αy‖ = |α| ‖y‖ for all real numbers α and all y ∈ X.

The function ‖·‖ is called the norm, and X = (X, ‖·‖) is called a normed (linear vector) space.
Let X = (X, ‖·‖) be a normed space, and let ρ be the metric induced by the norm ‖·‖ as follows:

    ρ(x, y) ≡ ‖x − y‖.

Then (X, ρ) is a metric space. A Banach space is a complete normed space with respect to the metric induced by its norm.
Let X be a vector space, and let ⟨·, ·⟩ be a function from X × X into R such that the following conditions are satisfied:

(I1) ⟨x, x⟩ ≥ 0 for all x ∈ X and ⟨x, x⟩ = 0 if and only if x = 0;
(I2) ⟨x, y⟩ = ⟨y, x⟩ for all x, y ∈ X;
(I3) ⟨λx + βy, z⟩ = λ⟨x, z⟩ + β⟨y, z⟩ for any real numbers λ, β and for all x, y, z ∈ X.

Such a function ⟨·, ·⟩ is called the inner product. If we let ‖x‖ = (⟨x, x⟩)^{1/2}, then (X, ‖·‖) becomes a normed space. A Hilbert space is a complete normed space, where the norm is induced by the inner product.

Let x = [x_1, . . . , x_n]^⊤ be a vector in R^n, where the superscript ⊤ denotes the transpose. The usual Euclidean norm |x| is defined by:

    |x| = (Σ_{i=1}^{n} (x_i)²)^{1/2}.

In particular, if x ∈ R, then |x| is simply the absolute value of x.


Let C(I, Rn ) be the space of all continuous functions from I ≡ [a, b] ⊂ R
to Rn . The space C(I, Rn ) is a vector space, and becomes a Banach space
when it is equipped with the sup norm defined by

f C(I,Rn ) ≡ sup |f (t)| ,


t∈I

n !1/2
2
where f ≡ [f1 , . . . , fn ]τ , and |f (t)| = i=1 (fi (t)) .
A set A ⊂ C(I, R ) is said to be equicontinuous if, for any ε > 0, there
n

exists a δ = δ(ε) > 0 such that for all f ∈ A

|f (t ) − f (t)| < ε

whenever t , t ∈ I are such that |t − t| < δ.

Theorem A.1.2 (Arzelà-Ascoli ). Let I = [a, b] ⊂ R. A set A ⊂ C(I, Rn )


is conditionally sequentially compact if and only if A is bounded as well as
equicontinuous.

Let I = [a, b] ⊂ R and f ≡ [f_1, . . . , f_n]^⊤ ∈ C(I, R^n). The function f is said to be absolutely continuous on I if for any given ε > 0 there exists a δ > 0 such that

    Σ_{k=1}^{m} |f(t″_k) − f(t′_k)| < ε

for every finite collection {(t′_k, t″_k)} of non-overlapping intervals satisfying

    Σ_{k=1}^{m} |t″_k − t′_k| < δ.

Let AC(I, R^n) be the class of all such absolutely continuous functions.


Let C^m(I, R^n) be the space of all m-times continuously differentiable functions from I ≡ [a, b] ⊂ R to R^n. It is a vector space and becomes a Banach space when it is equipped with the norm defined by

    ‖f‖_{C^m(I,R^n)} = max_{0≤i≤m} ‖f^{(i)}‖_{C(I,R^n)},

where f^{(i)} denotes the i-th derivative of the function f.



A function f from Rn to R is called convex if

f (λy + (1 − λ)z) ≤ λf (y) + (1 − λ)f (z) (A.1.2)

for all y, z ∈ R^n, and for all λ ∈ [0, 1]. If a function f : R^n → R is convex and continuously differentiable, then

    f(z) − f(y) ≥ (∂f(y)/∂x)(z − y),    (A.1.3)

where

    ∂f(y)/∂x ≡ [∂f(x)/∂x_1, . . . , ∂f(x)/∂x_n]|_{x=y}    (A.1.4)

is called the gradient (vector) of f at x = y.
For strict convexity of the function f , we only need to replace the inequality
in the conditions (A.1.2) and (A.1.3) by strict inequality.
If f : R^n → R is twice continuously differentiable, the Hessian matrix of the function f at x⁰ is a real n × n matrix with its ij-th element defined by

    H(x⁰)_{ij} = ∂²f(x)/∂x_i∂x_j |_{x=x⁰}.    (A.1.5)
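The convexity inequality (A.1.2) and the gradient inequality (A.1.3) are easy to probe numerically. The following sketch (our illustration, not from the book) checks both for the convex function f(x) = (x_1)² + · · · + (x_n)², whose gradient 2x is hand-coded:

import random

# Numerical check of (A.1.2) and (A.1.3) for f(x) = sum of squares.

def f(x):
    return sum(xi * xi for xi in x)

def grad_f(x):
    return [2.0 * xi for xi in x]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

n = 3
for _ in range(1000):
    y = [random.uniform(-5, 5) for _ in range(n)]
    z = [random.uniform(-5, 5) for _ in range(n)]
    lam = random.random()
    mix = [lam * yi + (1 - lam) * zi for yi, zi in zip(y, z)]
    # (A.1.2): f(lam*y + (1-lam)*z) <= lam*f(y) + (1-lam)*f(z)
    assert f(mix) <= lam * f(y) + (1 - lam) * f(z) + 1e-9
    # (A.1.3): f(z) - f(y) >= grad_f(y) . (z - y)
    assert f(z) - f(y) >= dot(grad_f(y), [zi - yi for zi, yi in zip(z, y)]) - 1e-9

For this quadratic f, the Hessian (A.1.5) is the constant matrix 2I, which is positive definite, consistent with strict convexity.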

A.1.7 Linear Functionals and Dual Spaces

Let X ≡ (X, ‖·‖_X) and Y ≡ (Y, ‖·‖_Y) be normed spaces and let f : X → Y be a linear mapping. If there exists a constant M such that

    ‖f(x)‖_Y ≤ M‖x‖_X for all x ∈ X,    (A.1.6)

then f is said to be bounded, and

    ‖f‖ ≡ sup{‖f(x)‖_Y : ‖x‖_X ≤ 1}

is called the norm, or the uniform norm, of f. The function f is bounded if and only if it is continuous.
Let L(X, Y ) be the set of all continuous linear mappings from X to Y .
Then, L(X, Y ) is a normed space with respect to the uniform norm. If Y is a
Banach space, then L(X, Y ) is also a Banach space. In particular, let X be a
normed space and let Y = R. Then the set of all continuous linear mappings
from X to Y becomes the set of all continuous linear functionals on X. This
space L(X, R), which is denoted by X∗, is called the dual space of X. The set X′ of all linear functionals (not necessarily continuous) on X is called the algebraic dual of X. Clearly X∗ ⊂ X′. The dual of X∗, also known as the second dual of X, is denoted by X∗∗.
For each x∗ ∈ X∗, we can define a continuous linear functional with values

    ⟨x∗, x⟩ ≡ x∗(x) at x ∈ X.

For a fixed x ∈ X, it is clear that the bilinear form also defines a continuous linear functional on X∗, and we can write it as:

    x∗(x) ≡ J_x(x∗).

The correspondence x → J_x from X to X∗∗ is called the canonical mapping. Define

    X₀∗∗ ≡ {x∗∗ ∈ X∗∗ : x∗∗ = J_x, x ∈ X}.

The canonical mapping x → J_x from X onto X₀∗∗ is one-to-one and norm preserving (i.e., ‖x‖ = ‖J_x‖). Hence, we may regard X as a subset of X∗∗.
If, under the canonical mapping, X = X ∗∗ , then X is called reflexive. If X
is reflexive, then so is X ∗ .
Let X be a Banach space and let X ∗ be its dual. The norm topology of
X is called the strong topology. Apart from this topology, elements of X ∗
can also be used to generate another topology for X which is called the weak
topology. A base for the weak topology consists of all neighbourhoods of the
form
N (x0 , F ∗ , ε) ≡ {y ∈ X : |x∗ (y − x0 )| < ε, x∗ ∈ F ∗ },
where x0 ∈ X, F ∗ is any finite subset of X ∗ and ε > 0.
Similarly, we can introduce two topologies on the dual space X ∗ : (1) the
norm (strong) topology; and (2) the weak topology (i.e., the topology induced
by X ∗∗ on X ∗ ). In addition, since X ⊂ X ∗∗ under the canonical mapping, we
can introduce another topology which is induced by X on X ∗ . This topology
is called the weak∗ topology, and it is weaker than the weak topology. A base
for the weak∗ topology consists of all neighbourhoods of the form

N (x∗0 , F, ε) ≡ {x∗ ∈ X ∗ : |x∗ (x) − x∗0 (x)| < ε, x ∈ F },

where x∗₀ ∈ X∗, F is a finite subset of X, and ε > 0. If X is reflexive, then


the weak topology for X ∗ and the weak∗ topology for X ∗ are equivalent.
Definition A.1.1. A sequence {x_n} in a normed space X is said to converge to x̄ in the weak topology (denoted by x_n →w x̄) if

    lim_{n→∞} |x∗(x_n − x̄)| = 0, for every x∗ ∈ X∗.

Note that every weakly convergent sequence is bounded.


Definition A.1.2. A subset F of a normed space X is said to be weakly closed in X if, for every sequence {x_n} ⊂ F converging to some x̄ in the weak topology, we have x̄ ∈ F.
Since a strongly convergent sequence is weakly convergent, a weakly closed
set is strongly closed. However, the converse is not necessarily true. But we
have the following theorem.

Theorem A.1.3 (Mazur). A convex subset of a normed space X is weakly


closed if and only if it is strongly closed.
Theorem A.1.4 (Banach-Saks-Mazur). Let X be a normed space and let
{xn } be a sequence in X converging weakly to x̄. Then there exists a sequence
of finite convex combinations of {xn } that converges strongly to x̄.
Definition A.1.3. A subset F of a normed space X is said to be condition-
ally weakly sequentially compact in X if every sequence {xn } in F contains
a subsequence that converges weakly to a point x̄ in X.
Definition A.1.4. A subset F of a normed space X is said to be weakly
sequentially compact in X if it is conditionally weakly sequentially compact
in X and weakly closed (that is, the limit does not leave F ).
Theorem A.1.5. A subset F of a normed space X is weakly sequentially
compact if and only if it is norm (strongly) bounded and weakly closed.
Theorem A.1.6 (Eberlein–Šmulian). A subset of a Banach space X is weakly compact if and only if it is weakly sequentially compact.
Definition A.1.5. Let X∗ be the dual space of a Banach space X. A sequence {x_n} in X∗ is said to converge to x̄ ∈ X∗ in the weak∗ topology (denoted by x_n →w∗ x̄) if

    lim_{n→∞} |x_n(x) − x̄(x)| = 0, for every x ∈ X.

Definition A.1.6. A subset F∗ of X∗ is said to be weak∗ closed in X∗ if, for every sequence {x_n} ⊂ F∗ converging to some x̄ in the weak∗ topology, we have x̄ ∈ F∗.
Definition A.1.7. A subset F ∗ of X ∗ is said to be conditionally weak∗ se-
quentially compact in X ∗ , if every sequence {xn } ⊂ F ∗ contains a subse-
quence that converges to a point x̄ ∈ X ∗ in the weak∗ topology.
Definition A.1.8. A subset F ∗ of X ∗ is said to be weak∗ sequentially com-
pact in X ∗ , if it is conditionally weak∗ sequentially compact and weak∗ closed.
Remark A.1.1. In a reflexive Banach space X, the weak topology for X ∗ and
the weak∗ topology for X ∗ are equivalent.
Theorem A.1.7 (Alaoglu). A subset of X ∗ is weak∗ sequentially compact
(i.e., sequentially compact in the weak∗ topology) if and only if it is norm
(strongly) bounded and weak∗ closed.
Remark A.1.2. As a direct consequence of Theorem A.1.7, we note that any closed ball

    K_r = {x∗ ∈ X∗ : ‖x∗‖_{X∗} ≤ r}

is weak∗ sequentially compact in X∗.

A.1.8 Elements in Measure Theory

Let X be a set, and let A and B be two subsets of X. The symmetric difference
of these two sets is defined by

AΔB = {A \ B} ∪ {B \ A}, (A.1.7)

where

    A \ B = {x ∈ X : x ∈ A and x ∉ B}    (A.1.8)

and B \ A is defined similarly.
A class D of subsets of X is called a ring if it is closed under finite unions
and set differences. If D is closed with respect to complements and also con-
tains X, it is called an algebra. If an algebra is closed under countable unions,
it is called a σ-algebra.
Let X be a set, D a σ-algebra, and μ̄ a function from D to R⁺ ∪ {+∞}, where R⁺ ≡ {x ∈ R : x ≥ 0}, such that the following two conditions are satisfied:

1. μ̄(∅) = 0;
2. If S₁, S₂, . . . ∈ D is a sequence of disjoint sets, then

    μ̄(∪_{i=1}^{∞} S_i) = Σ_{i=1}^{∞} μ̄(S_i).    (A.1.9)

Define

    μ∗(E) = inf Σ_{i=1}^{∞} μ̄(A_i),    (A.1.10)

where the infimum is taken with respect to all possible sequences {A_i} from D such that E ⊂ ∪_{i=1}^{∞} A_i.
A set E is called μ∗ measurable if, for every A ⊂ X,

μ∗ (A) = μ∗ (A ∩ E) + μ∗ (A ∩ (X \ E)). (A.1.11)

The set function μ∗ is called an outer (Lebesgue) measure. It satisfies:

1. μ∗ is increasing (i.e., if A ⊂ B, μ∗(A) ≤ μ∗(B));
2. μ∗(∅) = 0;
3. μ∗ is countably subadditive (i.e., μ∗(∪_{i=1}^{∞} A_i) ≤ Σ_{i=1}^{∞} μ∗(A_i));
4. μ∗ extends μ̄ (i.e., if A ∈ D, μ∗(A) = μ̄(A)).
The μ∗ measurable sets form a σ-algebra F containing D, and μ∗ restricted
to F is a (Lebesgue) measure μ. Also, (X, F) is a (Lebesgue) measurable space,
and (X, F, μ) is a (Lebesgue) measure space. The measure space (X, F, μ) is
complete (i.e., any subset in X with zero μ∗ measure, and hence zero μ
measure, is in F). The measure μ is said to be finite if μ(X) < ∞.

A function f : X → R ∪ {±∞} is said to be (Lebesgue) measurable if for


any real number α the set {x ∈ X : f (x) < α} is measurable. The function
f is called a simple function if there is a finite, disjoint class {A₁, . . . , A_n} of measurable sets and a finite set {a₁, . . . , a_n} of real numbers such that

    f(x) = Σ_{i=1}^{n} a_i χ_{A_i}(x),    (A.1.12)

where χ_{A_i} is the characteristic (indicator) function of A_i defined by

    χ_{A_i}(x) = 1 if x ∈ A_i, and χ_{A_i}(x) = 0 otherwise.    (A.1.13)

Every non-negative measurable function is the limit of a monotonically


increasing sequence of non-negative simple functions.
Consider a measure space (X, S, μ). Let B be the smallest σ-algebra gen-
erated by all open sets in X. It is also the smallest σ-algebra that contains
all closed sets in X. Elements of B are called Borel sets. Let μ̂ denote the
measure μ restricted to B. It is called a Borel measure. All open sets and
closed sets are Borel sets. A function f : X → R ∪ {±∞} is said to be Borel
measurable if for any real number α the set

{x ∈ X : f (x) < α} (A.1.14)

is a Borel set. If f is measurable and B is a Borel set,

    f⁻¹(B) ≡ {x ∈ X : f(x) ∈ B}    (A.1.15)

is a measurable set. Every Borel measurable function is measurable. If f is


Borel measurable, and B is a Borel set, then f −1 (B) is a Borel set. Every
lower (respectively, upper) semicontinuous function is Borel measurable, and
hence measurable. There exist measurable functions that are not Borel mea-
surable. In fact, suppose f (t) = g(t) almost everywhere in [a, b] (in the sense
of Borel), and f is Borel measurable. It is not necessarily true that g is also
Borel measurable, since the Borel measure space (X, B, μ̂) is not complete.
That is, not all sets with zero μ̂ measure are in B.
Let (X, ρ), (Y, σ), and (Z, η) be three metric spaces. Let f be a measurable function from X into Y, and g a Borel measurable function from Y into Z. Then, the composite function g ◦ f : X → Z is measurable, where

    g ◦ f(x) ≡ g(f(x)).

Let I be a measurable set and ψ a non-negative simple function

    ψ(x) = Σ_{i=1}^{n} c_i χ_{E_i}(x),    (A.1.16)

where c_i, i = 1, . . . , n, are non-negative real numbers, and the measurable subsets {E_i}_{i=1}^{n} are disjoint and satisfy ∪_{i=1}^{n} E_i = I. Also, let χ_{E_i} be the indicator function of E_i. We define the (Lebesgue) integral of ψ over I by

    ∫_I ψ(x) dx = Σ_{i=1}^{n} c_i μ(E_i),    (A.1.17)

where μ(E_i) denotes the Lebesgue measure of E_i.


Let f be a non-negative measurable function from I to R ∪ {+∞}. Then, the (Lebesgue) integral of the function f is defined by

    ∫_I f(x) dx = sup ∫_I ψ(x) dx,    (A.1.18)

where the supremum is taken with respect to all non-negative simple functions ψ with 0 ≤ ψ(x) ≤ f(x) for all x ∈ I. We say f is integrable if

    ∫_I f(t) dt < ∞.

For any measurable function f : I → R ∪ {±∞}, we can write f = f⁺ − f⁻, where f⁺ ≡ max{f, 0} and f⁻ ≡ max{−f, 0}. The function f is said to be integrable if

    ∫_I f⁺(t) dt < ∞ and ∫_I f⁻(t) dt < ∞.

The integral of f is

    ∫_I f(t) dt = ∫_I f⁺(t) dt − ∫_I f⁻(t) dt.

Theorem A.1.8 (Fatou's Lemma). If {f_n}_{n=1}^{∞} is a sequence of non-negative measurable functions on I, then

    ∫_I lim inf_{n→∞} f_n(t) dt ≤ lim inf_{n→∞} ∫_I f_n(t) dt.

Theorem A.1.9 (The Monotone Convergence Theorem). If {f_n}_{n=1}^{∞} is an increasing sequence of non-negative measurable functions on I such that f_n → f pointwise in I, then f is measurable and

    ∫_I f(t) dt = lim_{n→∞} ∫_I f_n(t) dt.

A property is said to hold almost everywhere (a.e.) if it holds everywhere


except on a set of measure zero in the sense of Lebesgue. Two (Lebesgue)
measurable functions f and g from I ⊂ R to R are said to be equivalent if
f (t) = g(t) a.e. on I.

A sequence {fn }∞ n=1 of measurable functions from I ⊂ R to R is said to


converge almost everywhere (a.e.) to a function f (written as fn → f a.e.
on I) if there exists a set A ⊂ I such that μ(A) = 0 and

fn (t) → f (t)

for each t ∈ I \A, where μ denotes the Lebesgue measure. In this situation,
the function f is automatically a measurable function from I to R.
Theorem A.1.10 (The Lebesgue Dominated Convergence Theorem). Let {f_n}_{n=1}^{∞} be a sequence of measurable functions on I. If f_n → f a.e. and there exists an integrable function g on I such that |f_n(t)| ≤ g(t) for almost every t ∈ I, then

    ∫_I f(t) dt = lim_{n→∞} ∫_I f_n(t) dt.
Theorem A.1.11 (Luzin’s Theorem). Let I ⊂ R be such that μ(I) < ∞,
and let f be a measurable function from I to Rn . Then, for any ε > 0,
there exists a closed set Iε ⊂ I such that μ(I \Iε ) < ε and the function f is
continuous on Iε .

A.1.9 The Lp Spaces

Let I = (a, b) ⊂ R, and 1 ≤ p < ∞. Two functions are said to be equivalent if they are equal almost everywhere. Let L_p(I, R^n) be the class of all measurable functions from I to R^n such that

    ∫_I |f(t)|^p dt < ∞,

where f ≡ [f_1, . . . , f_n]^⊤ and |f(t)| = (Σ_{i=1}^{n} (f_i(t))²)^{1/2}. If we do not distinguish between equivalent functions in L_p(I, R^n), for 1 ≤ p < ∞, then it is a Banach space with respect to the norm

    ‖f‖_p = (∫_I |f(t)|^p dt)^{1/p}.    (A.1.19)

Theorem A.1.12. If f ∈ L₁(I, R^n) and g is defined by

    g(t) = g(a) + ∫_a^t f(τ) dτ, for t ∈ I,    (A.1.20)

then g ∈ AC(I, R^n) and dg(t)/dt = f(t) a.e. on I.


A measurable function f from I to R^n is said to be essentially bounded if there exists a positive number K < ∞ such that the set

    S = {t ∈ I : |f(t)| > K}    (A.1.21)

has (Lebesgue) measure zero.
Let L_∞(I, R^n) denote the space of all such essentially bounded measurable functions. The smallest number K for which (A.1.21) is valid is called the essential supremum of |f(t)| over t ∈ I and is written as:

    ‖f‖_∞ = ess sup{|f(t)| : t ∈ I}.    (A.1.22)

L_∞(I, R^n) is a Banach space with respect to the norm ‖·‖_∞, where we identify functions that are equivalent.
For a given 1 ≤ p ≤ ∞, a sequence {f^n}_{n=1}^{∞} in L_p(I, R^n) is said to converge to a function f ∈ L_p(I, R^n) if ‖f^n − f‖_p → 0 as n → ∞.
Let q be a number such that 1/p + 1/q = 1. Clearly, if p = 1, then q = ∞, while if p = ∞, then q = 1.

Theorem A.1.13. Let 1 ≤ p ≤ ∞ and q = p/(p − 1). Then,

(a) (Hölder's inequality) for f ∈ L_p(I, R^n) and g ∈ L_q(I, R^n),

    |∫_I (f(t))^⊤ g(t) dt| ≤ (∫_I |f(t)|^p dt)^{1/p} (∫_I |g(t)|^q dt)^{1/q};    (A.1.23)

and
(b) (Minkowski's inequality) for f and g ∈ L_p(I, R^n),

    (∫_I |f(t) + g(t)|^p dt)^{1/p} ≤ (∫_I |f(t)|^p dt)^{1/p} + (∫_I |g(t)|^p dt)^{1/p}.    (A.1.24)
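For a concrete feel for (A.1.23) and (A.1.24), the sketch below (ours, not the book's) verifies both inequalities for scalar functions on I = (0, 1), with the integrals approximated by midpoint Riemann sums on a uniform grid:

import math

# Riemann-sum check of Hoelder's and Minkowski's inequalities on (0, 1)
# for f(t) = t and g(t) = cos(t), with p = 3 and q = p/(p-1) = 1.5.

N = 100000
h = 1.0 / N
ts = [(k + 0.5) * h for k in range(N)]
f = [t for t in ts]
g = [math.cos(t) for t in ts]

p = 3.0
q = p / (p - 1.0)

def lp_norm(u, r):
    return (sum(abs(v) ** r for v in u) * h) ** (1.0 / r)

lhs_holder = abs(sum(a * b for a, b in zip(f, g)) * h)
print(lhs_holder <= lp_norm(f, p) * lp_norm(g, q))   # True, (A.1.23)
print(lp_norm([a + b for a, b in zip(f, g)], p)
      <= lp_norm(f, p) + lp_norm(g, p))              # True, (A.1.24)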

Remark A.1.3. It is well-known that if I is a finite interval,

(a) ‖f‖_∞ = lim_{p→∞} (∫_I |f(t)|^p dt)^{1/p}; and
(b) L₁(I, R^n) ⊃ L₂(I, R^n) ⊃ · · · ⊃ L_∞(I, R^n).

Let f : I ≡ (a, b) → U ⊂ R^n be a measurable function. If θ ∈ I is such that the condition

    lim_{μ(J)→0} μ(f⁻¹(V) ∩ J)/μ(J) = 1    (A.1.25)

is satisfied for every neighbourhood V ⊂ U of f(θ), where

    f⁻¹(V) = {t ∈ I : f(t) ∈ V},

J denotes an arbitrary interval that contains θ, and μ denotes the Lebesgue measure, then θ is called a regular point for the function f.
Let θ ∈ I be a continuity point of f . Then, J can be made sufficiently
small such that J ⊂ f −1 (V ), and hence condition (A.1.25) is satisfied. Thus,

θ is a regular point for the function f . If f is piecewise continuous, then all


but a finite number of points in I are regular points. In fact, almost all points
in I are regular points for a general measurable function.
The following well-known result is an important tool in deriving pointwise
necessary conditions for optimality in optimal control theory. It is also an
important theorem in the convergence analysis of computational algorithms.

Theorem A.1.14. Let I be an interval of R, f ∈ L₁(I, R^n), t ∈ I a regular point of f, and {I_k} a decreasing sequence of subintervals of I such that t ∈ I_k for all k and lim_{k→∞} μ(I_k) = 0. Then,

    lim_{k→∞} (1/μ(I_k)) ∫_{I_k} f(τ) dτ = f(t).

Let F be a set in L_p(I, R^n), for 1 ≤ p < ∞. A sequence {f^{(n)}} ⊂ F is said to converge weakly to a function f̂ ∈ L_p(I, R^n) (written as f^{(n)} →w f̂) if

    lim_{n→∞} ∫_I (f^{(n)}(t))^⊤ g(t) dt = ∫_I (f̂(t))^⊤ g(t) dt

for every g ∈ L_q(I, R^n), where 1/p + 1/q = 1 and the superscript ⊤ denotes the transpose. The function f̂ is called the weak limit of the sequence {f^{(n)}}.
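A standard example of a weakly, but not strongly, convergent sequence is f^{(n)}(t) = sin(nt) on I = (0, 2π): by the Riemann–Lebesgue lemma, ∫_I sin(nt) g(t) dt → 0 for every fixed g, so f^{(n)} →w 0, yet ‖f^{(n)}‖₂ stays bounded away from zero. The sketch below (our illustration) shows this numerically for one test function g:

import math

# f_n(t) = sin(n t) on (0, 2*pi): the pairing with a fixed g tends to 0
# (weak convergence to 0) while the L2 norm of f_n does not.

N = 200000
T = 2.0 * math.pi
h = T / N
ts = [(k + 0.5) * h for k in range(N)]
g = [math.exp(-t) for t in ts]  # a fixed test function

for n in [1, 4, 16, 64, 256]:
    pairing = sum(math.sin(n * t) * gv for t, gv in zip(ts, g)) * h
    norm2 = math.sqrt(sum(math.sin(n * t) ** 2 for t in ts) * h)
    print(n, round(pairing, 6), round(norm2, 4))

The pairings shrink toward 0, while the L₂ norms stay near √π ≈ 1.77.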
A set F ⊂ Lp (I, Rn ), for 1 ≤ p < ∞, is said to be weakly closed if the
limit of every weakly convergent sequence {f (n) } ⊂ F is in F .
A set F ⊂ Lp (I, Rn ), for 1 ≤ p < ∞, is said to be conditionally weakly
sequentially compact if every sequence {f (n) } ⊂ F contains a subsequence
that converges weakly to a function fˆ ∈ Lp (I, Rn ). The set F is said to be
weakly sequentially compact if it is conditionally weakly sequentially compact
and weakly closed.
Let F be a set in L_∞(I, R^n). A sequence {f^{(n)}} ⊂ F is said to converge to a function f̂ ∈ L_∞(I, R^n) in the weak∗ topology of L_∞(I, R^n) (written as f^{(n)} →w∗ f̂) if

    lim_{n→∞} ∫_I (f^{(n)}(t))^⊤ g(t) dt = ∫_I (f̂(t))^⊤ g(t) dt

for every g ∈ L₁(I, R^n). The function f̂ is called the weak∗ limit of the sequence {f^{(n)}}.
A set F ⊂ L∞ (I, Rn ) is said to be weak ∗ closed if the limit of every
weak∗ convergent sequence {f (n) } ⊂ F is in F .
A set F ⊂ L_∞(I, R^n) is said to be conditionally sequentially compact in the weak∗ topology of L_∞(I, R^n) if every sequence {f^{(n)}} ⊂ F contains a subsequence that converges to a function f̂ ∈ L_∞(I, R^n) in the weak∗ topology. The set F is called sequentially compact in the weak∗ topology of L_∞(I, R^n) if it is conditionally sequentially compact in the weak∗ topology of L_∞(I, R^n) and weak∗ closed.

The following well-known result is extremely important in proving exis-


tence of optimal controls, and in analysing the convergence properties of
computational algorithms for linear optimal control problems.

Theorem A.1.15. Let U be a compact and convex subset of Rn . Then, the


set
U ≡ {u ∈ L∞ (I, Rn ) : u(t) ∈ U , a.e. on I}
is sequentially compact in the weak∗ topology of L∞ (I, Rn ).

Definition A.1.9. Let L be a real-valued functional defined on L₂([0, T], R^r). Then, L is said to be weakly sequentially lower semicontinuous if, for any sequence {u_n} ⊂ L₂([0, T], R^r) such that u_n → u in the weak topology of L₂([0, T], R^r), where u ∈ L₂([0, T], R^r), we have

    L(u) ≤ lim inf_{n→∞} L(u_n).

Remark A.1.4. The set U defined by

    U = {u ∈ L₂([0, T], R^r) : ‖u‖ ≤ ρ}

is weakly sequentially compact, where ‖·‖ denotes the L₂-norm in L₂([0, T], R^r) and ρ is a given positive constant.

Theorem A.1.16. Let I be an open bounded subset of R, U a compact subset


of Rr , and f a function from I × U to Rn . If f (t, ·) is continuous on U for
almost all t ∈ I, and f (·, u) is measurable on I for every u ∈ U , then for
any ε > 0, there exists a closed set Iε ⊂ I such that μ(I \ Iε ) < ε and f
is a continuous function on Iε × U.

Theorem A.1.17. Let I be an interval in R such that μ(I) < ∞, and let f ∈ L_p(I, R^n) for all p ∈ [1, ∞). If there exists a constant K such that ‖f‖_p ≤ K for all such p, then f ∈ L_∞(I, R^n) and ‖f‖_∞ ≤ K.

Theorem A.1.18. Let I be an open bounded subset of R, Ω a compact and convex subset of R^n, f a continuous function defined on I × Ω such that f(t, ·) is convex on Ω for each t ∈ I, and {y^k} a sequence of measurable functions defined on I with values in Ω. If y^k →w∗ y⁰ in L_∞(I, R^n), then

    ∫_I f(t, y⁰(t)) dt ≤ lim inf_{k→∞} ∫_I f(t, y^k(t)) dt.

A measurable function f from [0, ∞) to R is said to belong to L₁^{loc} if

    ∫_0^t |f(τ)| dτ < ∞, for any t < ∞.

Theorem A.1.19 (Gronwall-Bellman Lemma). Suppose that

    f(t) ≤ α(t) + ∫_0^t K(τ) f(τ) dτ,

where α is a continuous and bounded function on [0, ∞) such that α(t) ≥ 0 for all t ∈ [0, ∞). If K ∈ L₁^{loc} is such that K(t) ≥ 0 a.e. on [0, ∞), and f(t) ≥ 0 for all t ∈ [0, ∞), then

    f(t) ≤ α(t) + ∫_0^t exp(∫_s^t K(τ) dτ) K(s) α(s) ds

for t ∈ [0, ∞).

A.1.10 Multivalued Functions

For results on existence of optimal controls, we are required to make use


of the concept of multivalued functions, also known as set-valued functions.
We refer the reader to [3, 4, 40] for details. In this section, we shall prove a
selection theorem involving lower and upper semicontinuous functions.
Let K be the set of all non-empty compact subsets of R. For x ∈ R, A ∈ K,
the distance ρ(x, A) of x from A is defined by

ρ(x, A) = inf { |x − a| : a ∈ A}. (A.1.26)

We now define, for A, B ∈ K,

    2ρ_H(A, B) = sup{ρ(a, B) : a ∈ A} + sup{ρ(b, A) : b ∈ B}.    (A.1.27)

Here, ρH is called the Hausdorff metric on K. Let I be an open bounded


set in R. Let F be a multivalued function defined on I such that F (x) ∈ K.
We assume that the set-valued function F is continuous with respect to the
Hausdorff metric. That is, if x̄ ∈ I is arbitrary but fixed, then for any
ε > 0 there exists a δ > 0 such that

ρH (F (x) , F (x̄)) < ε

whenever
x ∈ Iδ ≡ {x ∈ I : |x − x̄| < δ} .
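For finite sets, the quantity in (A.1.27) can be evaluated directly. The following sketch (ours, purely for illustration) computes ρ_H for two finite subsets of R, using the point-to-set distance (A.1.26):

# Hausdorff metric (A.1.27) between finite, non-empty subsets of R:
# 2*rho_H(A, B) = sup_{a in A} rho(a, B) + sup_{b in B} rho(b, A).

def rho(x, A):
    return min(abs(x - a) for a in A)  # (A.1.26) for a finite set

def rho_H(A, B):
    return 0.5 * (max(rho(a, B) for a in A) + max(rho(b, A) for b in B))

A = [0.0, 1.0, 2.0]
B = [0.5, 1.5]
print(rho_H(A, B))  # 0.5 * (0.5 + 0.5) = 0.5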

Lemma A.1.1. Let I and U be compact subsets in R and let g be a con-


tinuous function from I × U into R. Furthermore, let F (·) be a continuous
multivalued function defined on I with respect to the Hausdorff metric such
that F(x) is a non-empty compact subset of U for each x ∈ I. Define

    r(x) = inf{g(x, u) : u ∈ F(x)}.    (A.1.28)

Then, r is an upper semicontinuous function on I.


Proof. Since F(x̄) is compact, there exists a ū ∈ F(x̄) such that

    r(x̄) = g(x̄, ū) = inf_{u∈F(x̄)} g(x̄, u).    (A.1.29)

By the continuity of the function g on I × U , we see that g (·, ū) is contin-


uous at x̄. Thus, for any ε > 0 there exists a δ1 > 0 such that

|g (x̄, ū) − g (x, ū)| < ε (A.1.30)

for all x ∈ Iδ1 (x̄), where

Iδ1 (x̄) = {x ∈ I : |x − x̄| < δ1 } . (A.1.31)

In particular, it is clear that

r (x̄) = g (x̄, ū) > g (x, ū) − ε (A.1.32)

for all x ∈ Iδ1 (x̄).


Now, for any x ∈ I_{δ₁}(x̄), there are two cases to be considered:
(i) ū ∈ F(x); and
(ii) ū ∉ F(x).
For case (i), it follows that

    g(x, ū) ≥ inf_{u∈F(x)} g(x, u).    (A.1.33)

For case (ii), there exists a u0 ∈ F (x) such that

|ū − u0 | = ρ (ū, F (x)) ≤ ρH (F (x̄) , F (x)) . (A.1.34)

By the continuity of the set-valued function F (·) on I with respect to the


Hausdorff metric, it follows that for any δ2 > 0 there exists a δ3 > 0 such
that
ρH (F (x̄) , F (x)) < δ2 (A.1.35)
whenever |x̄ − x| < δ3 . Thus, by (A.1.34),

|ū − u0 | < δ2 whenever |x̄ − x| < δ3 . (A.1.36)

Since the function g (·, ·) is continuous on I × U , where I × U is compact,


it is clear that it is uniformly continuous on I × U . Thus, for any ε > 0 as
defined in (A.1.30), we can choose a δ2 > 0 and hence δ3 > 0 such that

|g (x, ū) − g (x, u0 )| < ε (A.1.37)



whenever |ū − u0 | < δ2 . This, in turn, implies that

g (x, ū) > g (x, u0 ) − ε (A.1.38)

whenever |ū − u0 | < δ2 . Thus,

    g(x, ū) > g(x, u₀) − ε ≥ inf_{u∈F(x)} g(x, u) − ε    (A.1.39)

whenever |x̄ − x| < δ₃. Combining (A.1.32), (A.1.33) and (A.1.39) yields

    r(x̄) ≥ inf_{u∈F(x)} g(x, u) − 2ε = r(x) − 2ε

for all x ∈ I_δ(x̄), where δ = min{δ₁, δ₃}. This completes the proof.

Lemma A.1.2. Let I, U , g, and F (·) be as defined in Lemma A.1.1. Let


{xk } ⊂ I be such that xk → x̂, and u (xk ) → û, both as k → ∞. If u (xk ) ∈
F (xk ) for all k ≥ 1, then û ∈ F (x̂).

Proof. Since xk → x̂ as k → ∞, and F (·) is continuous on I with respect


to the Hausdorff metric, it follows that for any ε > 0 there exists an integer
N > 0 such that
ρH (F (xk ) , F (x̂)) < ε
for all k > N . This, in turn, implies that

F (xk ) ⊂ F ε (x̂) (A.1.40)

for all k > N , where F ε (x̂) is the closed ε-neighbourhood of F (x̂) defined
by
F ε (x̂) = {v ∈ U : ρ (v, F (x̂)) ≤ ε} .
Since u (xk ) ∈ F (xk ) for all k ≥ 1, it is clear from (A.1.40) that u (xk ) ∈
F ε (x̂) for all k > N . Thus, by the facts that u(xk ) → û as k → ∞, and
F ε (x̂) is closed, we have
û ∈ F ε (x̂) .
Since this relation is true for all ε > 0, and F (x̂) is closed, it follows that
û ∈ F (x̂). Thus, the proof is complete.

Theorem A.1.20. Let I, U, g, and F (·) be as defined in Lemma A.1.1.


Define
g (x, F (x)) = {g (x, u) : u ∈ F (x)} , (A.1.41)
and
r (x) = inf {g (x, u) : u ∈ F (x)} , (A.1.42)
for all x ∈ I. Then, there exists a lower semicontinuous function u (x) with
values in F (x) such that

r (x) = g (x, u (x)) (A.1.43)


for all x ∈ I.

Proof. By Lemma A.1.1, r is an upper semicontinuous function on I. Fur-


thermore,
r (x) ∈ g (x, F (x)) (A.1.44)
for all x ∈ I. We choose the smallest u ∈ F (x) for which

r (x) = g (x, u) ∈ g (x, F (x)) . (A.1.45)

Since g (x, ·) is continuous in U and F (x) is compact, the set

{u ∈ F (x) : g (x, u) = r (x)} (A.1.46)

is compact. Thus, the choice of such a function u is possible.


For the function u so chosen to be lower semicontinuous, it suffices to show
that for any real number α, the set

A = {x ∈ I : u (x) ≤ α} (A.1.47)

is closed. Suppose this is false. Then, there exists a sequence {xk } in A ⊂ I


such that
r (xk ) = g (xk , u (xk )) ∈ g (xk , F (xk )) , (A.1.48)
xk → x̂ ∈ I, (A.1.49)
u (x̂) > α. (A.1.50)
By the definition of the set A given in (A.1.47), there exists an ε > 0 such
that
u (xk ) ≤ u (x̂) − ε. (A.1.51)
Since u(x) ∈ F(x) ⊂ U for all x ∈ I, and U is compact, there exists a constant K > 0 such that

    |u(x)| ≤ K for all x ∈ I.

In particular,
|u (xk )| ≤ K (A.1.52)
for all positive integers k. Thus, there exists a subsequence of the sequence
{xk }, again denoted by the original sequence, such that

u (xk ) → û. (A.1.53)

Since u (xk ) ∈ F (xk ) , F is continuous on I with respect to the Hausdorff


metric, and F (x̂) is closed, it follows from Lemma A.1.2 that û ∈ F (x̂).
Thus, by (A.1.51) and (A.1.53), we have

û ≤ u (x̂) − ε. (A.1.54)

Since r is an upper semicontinuous function on the closed interval I, it is


bounded. In particular, there exists a constant K1 > 0 such that |r (xk )| ≤ K1
for all positive integers k. Thus, there exists a further subsequence, which is
again denoted by the original sequence, such that

r (xk ) → r̂, and (A.1.55)

r̂ ≤ r (x̂) . (A.1.56)
We note that

r (xk ) = g (xk , u (xk )) ∈ g (xk , F (xk )) = {g (xk , u) : u ∈ F (xk )} , (A.1.57)

F is continuous on I, g is continuous on I × U which is compact, F (x̂) is


closed, and hence g (x̂, F (x̂)) is closed. Thus, by (A.1.55), (A.1.49), (A.1.53),
and (A.1.57), we have

r̂ = g (x̂, û) ∈ g (x̂, F (x̂)) . (A.1.58)

However, in view of the definition of r (x̂), it is clear that

r (x̂) ≤ r̂. (A.1.59)

Combining (A.1.56) and (A.1.59), we have

r (x̂) = r̂. (A.1.60)

Therefore,
r (x̂) = g (x̂, û) . (A.1.61)
However, by (A.1.54), we see that u(x̂) is not the smallest value of u such
that
r (x̂) = g (x̂, u) .
This contradicts the definition of u (x). Thus, the set A defined by (A.1.47)
must be closed, and hence the function u is lower semicontinuous on I. The
proof is complete.

A.1.11 Bounded Variation

By a partition of the interval [a, b], we mean a finite set of points ti ∈ [a, b],
i = 0, 1, . . . , m, such that

a = t0 < t1 < t2 < · · · < tm = b.



A function f defined on [a, b] is said to be of bounded variation if there is a constant K so that for any partition of [a, b]

    Σ_{i=1}^{m} |f(t_i) − f(t_{i−1})| ≤ K.

The total variation of f, denoted by ∨_a^b f(t), is defined by

    ∨_a^b f(t) = sup Σ_{i=1}^{m} |f(t_i) − f(t_{i−1})|,

where the supremum is taken with respect to all partitions of [a, b]. The total variation of a constant function is zero and the total variation of a monotonic function is the absolute value of the difference between the function values at the end points a and b.
The space BV[a, b] is defined as the space of all functions of bounded variation on [a, b] together with the norm defined by

    ‖f‖ = |f(a)| + ∨_a^b f(t).
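The supremum in the definition of ∨_a^b f(t) can be approached through refining partitions. The following sketch (ours, for illustration) estimates the total variation of f(t) = sin(t) on [0, 2π] over uniform partitions; the exact value is 4, since each monotone piece of sin contributes the absolute difference of its endpoint values:

import math

# Estimate the total variation of sin on [0, 2*pi] over uniform
# partitions; the partition sums increase toward the exact value 4.

def variation_on_partition(f, a, b, m):
    ts = [a + (b - a) * i / m for i in range(m + 1)]
    return sum(abs(f(ts[i]) - f(ts[i - 1])) for i in range(1, m + 1))

a, b = 0.0, 2.0 * math.pi
for m in [2, 8, 32, 128, 512]:
    print(m, round(variation_on_partition(math.sin, a, b, m), 6))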

Suppose f ∈ BV [a, b]. Then, f is differentiable a.e. on [a, b]. If f : [a, b] →


R is absolutely continuous, then it is of bounded variation.

Theorem A.1.21. If f ∈ BV[a, b], then f is absolutely continuous if and only if

    ∫_a^b |df(t)/dt| dt = ∨_a^b f(t).

If f is monotone, then f ∈ BV[a, b] and ∨_a^b f(t) = |f(b) − f(a)|.
If f ∈ BV[a, b], then the jump of f at t is defined as:

    |f(t) − f(t − 0)| + |f(t + 0) − f(t)|, if a < t < b;
    |f(a + 0) − f(a)|, if t = a;
    |f(b) − f(b − 0)|, if t = b.

We now consider a function f ≡ [f_1, . . . , f_n]^⊤ : [a, b] → R^n, where [a, b] is a finite closed interval in R. The total variation of f is defined as:

    ∨_a^b f(t) = Σ_{i=1}^{n} ∨_a^b f_i(t).

Let BV ([a, b], Rn ) be the space of all functions f : [a, b] → Rn which are
of bounded variation on [a, b].

Theorem A.1.22. If f ∈ BV([a, b], R^n), then f(t + 0) ≡ lim_{s↓t} f(s), the limit from the right at t, exists if a ≤ t < b; and f(t − 0) ≡ lim_{s↑t} f(s), the limit from the left at t, exists if a < t ≤ b.

In order that f shall approach a limit in R^n as s approaches t from the right (respectively, from the left), the following condition is necessary and sufficient: To each ε > 0 there corresponds a δ > 0 such that

    |f(τ) − f(s)| < ε

whenever t < s < τ < t + δ (respectively, t − δ < τ < s < t).

Theorem A.1.23. If f ∈ BV ([a, b], Rn ), the set of points of discontinuity of


f is countable.

Let E be a family of functions in BV([a, b], R^n). It is said to be equibounded with equibounded total variation if there exist constants K₁ > 0, K₂ > 0 such that

    |f(t)| ≤ K₁ for all t ∈ [a, b] and ∨_a^b f(t) ≤ K₂, for all f ∈ E.

Theorem A.1.24 (Helly). Let E be a family of functions in BV([a, b], R^n) which is equibounded with equibounded total variation. Then, any sequence {f^{(n)}} of elements in E contains a subsequence {f^{(n(k))}} which converges pointwise everywhere on [a, b] toward a function f^{(0)} ∈ BV([a, b], R^n) with

    ∨_a^b f^{(0)}(t) ≤ lim inf_{k→∞} ∨_a^b f^{(n(k))}(t).
Appendix A.2
Global Optimization via Filled Function
Approach

Consider the following optimization problem defined by

    min_{x∈X} f(x),    (A.2.1)

where X ⊂ Rn is a closed bounded domain containing all global minimizers


of f (x) in its interior. It is assumed that f (x) has only a finite number of
local minimizers. Let this optimization problem be referred to as Problem
(B).
We note that the solution obtained from solving Problem (B) using a
gradient-based method is unlikely to be a global minimizer. We will introduce
a filled function [57, 285, 286] and then use it to obtain a global minimizer
for Problem (B) by incorporating the filled function with a gradient-based
method.
We suppose that x∗ is a local minimizer of Problem (B). Since f(x) is differentiable with respect to x and the set X is bounded and closed, there exists a constant M > 0 such that for all x¹ and x² ∈ X, the following condition is satisfied:

    |f(x¹) − f(x²)| ≤ M|x¹ − x²|,    (A.2.2)

where | · | denotes the standard Euclidean norm in Rn . To escape from the


local minimizer x∗ , we construct the function

    p(x, x∗, ρ, μ) = f(x∗) − min{f(x∗), f(x)} − ρ|x − x∗|² + μ[max{0, f(x) − f(x∗)}]²,    (A.2.3)

where x ∈ X, while μ and ρ are parameters chosen such that ρ > 0 and 0 ≤ μ < ρ/M².
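A direct transcription of (A.2.3) (a minimal sketch, assuming the objective f and the current local minimizer are supplied by the user) is:

# The filled function (A.2.3). Here f is the objective, x_star the
# current local minimizer, and rho > 0, 0 <= mu < rho / M**2 are the
# parameters; x and x_star are lists of coordinates.

def p(x, x_star, rho, mu, f):
    dist2 = sum((xi - si) ** 2 for xi, si in zip(x, x_star))
    fs, fx = f(x_star), f(x)
    return fs - min(fs, fx) - rho * dist2 + mu * max(0.0, fx - fs) ** 2

To proceed further, we need some definitions.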


Definition A.2.1. The basin of f (x) at an isolated minimizer x∗ is a con-


nected domain, denoted as X ∗ , which contains x∗ and within X ∗ the steepest
descent trajectory of f (x) converges to x∗ from any initial point.

Definition A.2.2. The basin of f (x) at an isolated minimizer x∗1 is said to


be lower than another basin of f (x) at an isolated minimizer x∗2 if and only
if f (x∗1 ) < f (x∗2 ).

Definition A.2.3. The hill of f (x) at an isolated minimizer x∗ is the basin


of −f (x) at its isolated minimizer x∗ .

In the following, we will show that p(x, x∗ , ρ, μ) satisfies the following


properties.

Property A.2.1. x∗ is a maximizer of p(x, x∗ , ρ, μ) and the whole basin X ∗


of f (x) at x∗ becomes part of a hill of the function p(x, x∗ , ρ, μ).

Property A.2.2. p(x, x∗ , ρ, μ) has no minimizers or saddle points in any basin


of f (x) higher than X ∗ .

Property A.2.3. Let x^{1,∗} be an isolated minimizer of f(x) in X, and let X₁∗ be the basin as defined in Definition A.2.1. If f(x) has a basin X₂∗ at x^{2,∗} that is lower than X₁∗ at x^{1,∗}, then there is a point x′ ∈ X₂∗ that minimizes p(x, x^{1,∗}, ρ, μ) on the line through x and x^{1,∗}, for every x in some neighbourhood of x^{2,∗}.

Property A.2.1 will be established in the next two theorems.


Theorem A.2.1. Assume that x∗ is a local minimizer of f(x). If ρ > 0 and 0 ≤ μ < ρ/M², then x∗ is a strict local maximizer of p(x, x∗, ρ, μ).

Proof. Since x∗ is a local minimizer of f(x), there exists a neighbourhood N(x∗, δ) with radius δ > 0 and center x∗ such that for any x ∈ N(x∗, δ), we have

    f(x) ≥ f(x∗).    (A.2.4)

Now, for any x ∈ N(x∗, δ) with x ≠ x∗, we obtain

    p(x, x∗, ρ, μ) − p(x∗, x∗, ρ, μ) = μ[f(x) − f(x∗)]² − ρ|x − x∗|² ≤ (μM² − ρ)|x − x∗|² < 0,    (A.2.5)

where the first inequality follows from the Lipschitz condition (A.2.2) and the second from μ < ρ/M². Thus, x∗ is a strict local maximizer of p(x, x∗, ρ, μ).

Theorem A.2.2. Assume that x∗ is a local minimizer of f(x). Suppose that x¹ and x² are two points such that

    |x¹ − x∗| < |x² − x∗|    (A.2.6)

and

    f(x∗) ≤ f(x¹) ≤ f(x²).    (A.2.7)

If ρ > 0 and 0 ≤ μ < min{ρ/M², ρ/(M M₁)}, where

    M₁ ≥ max_{0≤α≤1} |∂f(x¹ + α(x² − x¹))/∂x| · |x² − x¹| / (|x² − x∗| − |x¹ − x∗|),    (A.2.8)

then

    p(x², x∗, ρ, μ) < p(x¹, x∗, ρ, μ) < 0 = p(x∗, x∗, ρ, μ).    (A.2.9)

Proof. Note that

p(x², x∗, ρ, μ) − p(x¹, x∗, ρ, μ)
= μ{[f(x²) − f(x∗)]² − [f(x¹) − f(x∗)]²} − ρ(|x² − x∗|² − |x¹ − x∗|²)
= (|x² − x∗|² − |x¹ − x∗|²) · [−ρ + μ ([f(x²) − f(x∗)]² − [f(x¹) − f(x∗)]²) / (|x² − x∗|² − |x¹ − x∗|²)]
= (|x² − x∗|² − |x¹ − x∗|²) · [−ρ + μ (f(x²) + f(x¹) − 2f(x∗))(f(x²) − f(x¹)) / ((|x² − x∗| + |x¹ − x∗|)(|x² − x∗| − |x¹ − x∗|))].    (A.2.10)

From (A.2.2), we obtain

f(x²) + f(x¹) − 2f(x∗) = [f(x²) − f(x∗)] + [f(x¹) − f(x∗)]
≤ M|x² − x∗| + M|x¹ − x∗| = M(|x² − x∗| + |x¹ − x∗|).    (A.2.11)

Combining (A.2.10), (A.2.11) and then using the mean value theorem, we obtain

p(x², x∗, ρ, μ) − p(x¹, x∗, ρ, μ)
≤ (|x² − x∗|² − |x¹ − x∗|²) · [−ρ + μM (f(x²) − f(x¹)) / (|x² − x∗| − |x¹ − x∗|)]
≤ (|x² − x∗|² − |x¹ − x∗|²) · [−ρ + μM |∇f(x¹ + α(x² − x¹))| · |x² − x¹| / (|x² − x∗| − |x¹ − x∗|)],    (A.2.12)

where α is some value in (0, 1), and

    ∇f(x¹ + α(x² − x¹)) = ∂f(x¹ + α(x² − x¹))/∂x.

Thus, by (A.2.8) and the assumption that 0 ≤ μ < ρ/(M M₁), we obtain

p(x², x∗, ρ, μ) − p(x¹, x∗, ρ, μ) ≤ (|x² − x∗|² − |x¹ − x∗|²)(−ρ + μM M₁) < 0.    (A.2.13)

Therefore, we conclude that

    p(x², x∗, ρ, μ) < p(x¹, x∗, ρ, μ) < 0 = p(x∗, x∗, ρ, μ).

This completes the proof.

Our next task is to show the validity of Property A.2.2. For this, we need
the following lemma.

Lemma A.2.1. Assume that x∗ is a local minimizer of f(x). Suppose that x¹ is a point such that f(x¹) > f(x∗). Suppose that ρ > 0 and

    0 ≤ μ < min{ρ/M², ρ/(M M₁)}.    (A.2.14)

Then, there exists a sufficiently small ε₁ > 0, such that whenever d¹ is chosen satisfying 0 < |d¹| ≤ ε₁, it holds that

    |x¹ − d¹ − x∗| < |x¹ − x∗| < |x¹ + d¹ − x∗|,    (A.2.15)

    f(x¹ ± d¹) ≥ f(x∗)    (A.2.16)

and

    p(x¹ + d¹, x∗, ρ, μ) < p(x¹, x∗, ρ, μ) < p(x¹ − d¹, x∗, ρ, μ) < 0 = p(x∗, x∗, ρ, μ).    (A.2.17)

Proof. For a given ε₁ > 0, let

    d¹ = (ε₁/2) (x¹ − x∗)/|x¹ − x∗|.    (A.2.18)

Then,

    0 < |d¹| = ε₁/2 ≤ ε₁.    (A.2.19)

Clearly, if ε₁ > 0 is sufficiently small, we have

    |x¹ + d¹ − x∗| = (1 + ε₁/(2|x¹ − x∗|)) |x¹ − x∗| > |x¹ − x∗|,    (A.2.20a)
    |x¹ − d¹ − x∗| = (1 − ε₁/(2|x¹ − x∗|)) |x¹ − x∗| < |x¹ − x∗|.    (A.2.20b)
 
Since f(x¹) > f(x∗) and 0 < |d¹| ≤ ε₁, it follows that

    f(x¹ ± d¹) ≥ f(x∗),    (A.2.21)

if ε₁ > 0 is chosen sufficiently small. Now, choose ρ and μ such that ρ > 0 and 0 ≤ μ < min{ρ/M², ρ/(M M₁)}. Then, by using arguments similar to those given for Theorem A.2.2, we can show that

    p(x¹ + d¹, x∗, ρ, μ) < p(x¹, x∗, ρ, μ) < p(x¹ − d¹, x∗, ρ, μ) < 0 = p(x∗, x∗, ρ, μ).    (A.2.22)

This completes the proof.

This lemma shows that any local minimizer of p(x, x∗ , ρ, μ) must be in


the set
S = {x : f (x) ≤ f (x∗ )} . (A.2.23)
The next theorem shows that the function satisfies Property A.2.2.

Theorem A.2.3. Assume that x∗ is a local minimizer of f(x). If ρ > 0 and 0 ≤ μ < min{ρ/M², ρ/(M M₁)}, then any local minimizer or saddle point of p(x, x∗, ρ, μ) must belong to the set S.

Proof. It suffices to show that for any x, if f(x) > f(x∗), then

    ∂p(x, x∗, ρ, μ)/∂x ≠ 0.    (A.2.24)

From (A.2.3), we have

    ∂p(x, x∗, ρ, μ)/∂x = −2ρ(x − x∗) + 2μ[f(x) − f(x∗)] ∂f(x)/∂x.    (A.2.25)

If ∂f(x)/∂x = 0, we have

    ∂p(x, x∗, ρ, μ)/∂x = −2ρ(x − x∗) ≠ 0,    (A.2.26)

since f(x) > f(x∗) implies x ≠ x∗. Now suppose that ∂f(x)/∂x ≠ 0. Define
    d = (x − x∗)/|x − x∗| − β (∂f(x)/∂x)/|∂f(x)/∂x|,    (A.2.27)

where β > 0 is sufficiently small. Then, by taking the inner product of d and ∂p(x, x∗, ρ, μ)/∂x, we obtain

d^⊤ ∂p(x, x∗, ρ, μ)/∂x
= −2ρ|x − x∗| + 2ρβ (x − x∗)^⊤ (∂f(x)/∂x) / |∂f(x)/∂x|
+ 2μ[f(x) − f(x∗)] (∂f(x)/∂x)^⊤ (x − x∗)/|x − x∗|
− 2μβ[f(x) − f(x∗)] |∂f(x)/∂x|.    (A.2.28)

If (x − x∗)^⊤ (∂f(x)/∂x) ≤ 0, then d^⊤ ∂p(x, x∗, ρ, μ)/∂x < 0. Otherwise, choose μ ≥ 0 to be sufficiently small. Since β > 0 can also be chosen to be sufficiently small, it follows that d^⊤ ∂p(x, x∗, ρ, μ)/∂x < 0. Thus, ∂p(x, x∗, ρ, μ)/∂x ≠ 0. This completes the proof.
The next theorem shows that the filled function p satisfies Property A.2.3.

Theorem A.2.4. Assume that x^{1,∗} is a local minimizer of f(x). If x^{2,∗} is another minimizer of f(x) and satisfies

    f(x^{2,∗}) < f(x^{1,∗}),    (A.2.29)

then there exists a neighbourhood N(x^{2,∗}, δ) of x^{2,∗} such that p(x, x^{1,∗}, ρ, μ) has a minimizer x′ which is on the line segment connecting x^{1,∗} and x² for every x² ∈ N(x^{2,∗}, δ) when 0 ≤ μ < ρ/M² and 0 < ρ < ε₁/D₁, where

    0 < ε₁ < f(x^{1,∗}) − f(x²)    (A.2.30)

and

    D₁ = max_{x∈N(x^{2,∗},δ)} |x − x^{1,∗}|².    (A.2.31)

Furthermore, if there exists no basin lower than B₁∗ between B₁∗ and B₂∗, where B₁∗ and B₂∗ are the basins of f(x) at x^{1,∗} and x^{2,∗}, respectively, then there exists an x′ ∈ B₂∗ such that

    f(x′) ≤ f(x^{1,∗}).    (A.2.32)

Proof. By Theorem A.2.1, there is a neighbourhood N(x^{1,∗}, δ₁) of x^{1,∗} with δ₁ > 0 such that for all x¹ ∈ N(x^{1,∗}, δ₁),

    p(x¹, x^{1,∗}, ρ, μ) < 0 = p(x^{1,∗}, x^{1,∗}, ρ, μ).    (A.2.33)

Furthermore, there is a neighbourhood N(x^{2,∗}, δ₂) of x^{2,∗} with δ₂ > 0 such that for all x² ∈ N(x^{2,∗}, δ₂),

    0 < ε₁ < f(x^{1,∗}) − f(x²).    (A.2.34)

Thus, by (A.2.3), it follows from (A.2.34) and (A.2.31) that

    p(x², x^{1,∗}, ρ, μ) = f(x^{1,∗}) − f(x²) − ρ|x² − x^{1,∗}|² > ε₁ − ρD₁.    (A.2.35)

If ρ is chosen such that ρ < ε₁/D₁, then

    p(x², x^{1,∗}, ρ, μ) > 0.    (A.2.36)

Thus, by the continuity of the filled function, there exists a minimizer x′ which is on the line segment connecting x^{1,∗} and x² for every x² ∈ N(x^{2,∗}, δ₂).
Now, we consider the case when there exists no basin lower than B₁∗ between B₁∗ and B₂∗. Let x^B be the boundary point of B₂∗ on the line segment connecting x^{1,∗} and an x² ∈ N(x^{2,∗}, δ₂). Since there exists no basin lower than B₁∗ between B₁∗ and B₂∗, it is clear that

    f(x^B) − f(x^{1,∗}) > 0.    (A.2.37)

Thus, by the continuity of f(x), there are three points x^{0,−}, x⁰ and x^{0,+} on the line segment connecting x^{1,∗} and x² such that

    f(x⁰) = f(x^{1,∗})    (A.2.38)

and

    f(x^B) > f(x^{0,−}) ≥ f(x⁰) ≥ f(x^{0,+}) > f(x²),    (A.2.39)

where

    x^{0,−} = x⁰ − η(x⁰ − x^{1,∗})    (A.2.40a)

and

    x^{0,+} = x⁰ + η(x⁰ − x^{1,∗}),    (A.2.40b)

with η > 0 sufficiently small. Note that

    p(x⁰, x^{1,∗}, ρ, μ) = −ρ|x⁰ − x^{1,∗}|² < 0 = p(x^{1,∗}, x^{1,∗}, ρ, μ).    (A.2.41)

We note from (A.2.40b) that

|x^{0,+} − x^{1,∗}| = |x⁰ + η(x⁰ − x^{1,∗}) − x^{1,∗}|
= |(x⁰ − x^{1,∗}) + η(x⁰ − x^{1,∗})|
= (1 + η)|x⁰ − x^{1,∗}| > |x⁰ − x^{1,∗}|.    (A.2.42)

By (A.2.38) and (A.2.39), we recall that

    f(x^{0,+}) ≥ f(x⁰) = f(x^{1,∗}).    (A.2.43)

Thus, it follows from Theorem A.2.2 that

    p(x^{0,+}, x^{1,∗}, ρ, μ) < p(x⁰, x^{1,∗}, ρ, μ) < 0 = p(x^{1,∗}, x^{1,∗}, ρ, μ).    (A.2.44)

Next, we note from (A.2.40a) that

|x^{0,−} − x^{1,∗}| = |x⁰ − η(x⁰ − x^{1,∗}) − x^{1,∗}|
= |(x⁰ − x^{1,∗}) − η(x⁰ − x^{1,∗})|
= (1 − η)|x⁰ − x^{1,∗}| < |x⁰ − x^{1,∗}|.    (A.2.45)

From (A.2.39), we can write down that

    f(x^{0,−}) ≥ f(x⁰) = f(x^{1,∗}).    (A.2.46)

Thus, by (A.2.45) and (A.2.46), it follows from Theorem A.2.2 that

    0 > p(x^{0,−}, x^{1,∗}, ρ, μ) > p(x⁰, x^{1,∗}, ρ, μ).    (A.2.47)

From (A.2.44) and (A.2.47), it is clear that x⁰ − x^{1,∗} is a descent direction of p(x, x^{1,∗}, ρ, μ) at x⁰. Therefore, there exists an x′ ∈ B₂∗ such that f(x′) ≤ f(x^{1,∗}).

Now, by virtue of Theorems A.2.1–A.2.4, we see that p(x, x∗, ρ, μ) satisfies Properties A.2.1–A.2.3. Thus, it is a filled function. By Theorem A.2.3, any local minimizer x̄ of p(x, x∗, ρ, μ) in X satisfies

    f(x̄) ≤ f(x∗).    (A.2.48)

Therefore, we can escape from the current local minimizer x∗ of f(x) by searching for a local minimizer of p(x, x∗, ρ, μ).
To implement the computation, we need the gradient ∂p(x, x∗, ρ, μ)/∂x with respect to x. Note that this gradient is needed only when f(x) > f(x∗). We give the following theorem.

Theorem A.2.5. Suppose that f(x) > f(x∗). Then,

    ∂p(x, x∗, ρ, μ)/∂x = −2ρ(x − x∗) + 2μ[f(x) − f(x∗)] ∂f(x)/∂x.    (A.2.49)

Proof. The result is obvious.

Now, we give the following algorithm to search for an x such that

    f(x) < f(x∗).

Algorithm A.2.1
Step 1. Initialize ρ, μ, μ̂ (where μ̂ < 1) and μ_l (where μ_l is sufficiently small).
Step 2. Start from x^{1,∗}, construct a search direction according to (A.2.49) and select a search step. We minimize the filled function p(x, x^{1,∗}, ρ, μ) along this search direction with the selected search step. Then, we find a point x. Go to Step 3.
Step 3. If f(x) < f(x^{1,∗}), stop. Otherwise, go to Step 5.
Step 4. Continue the search as described in Step 2. Find a new point x. Go to Step 3.
Step 5. If x is on the boundary of X and μ ≥ μ_l, set μ = μ̂μ and go to Step 1. Otherwise, go to Step 4.

Algorithm A.2.1 can be modified by incorporating the filled function (A.2.3) so that we can search for a global minimizer.
Algorithm A.2.2
Step 1. Choose an x^0 ∈ X and obtain a local minimizer of f by a gradient-based optimization algorithm. Denote it by x^*.
Step 2. Use Algorithm A.2.1 to find another initial point x such that f(x) < f(x^*). If no such point can be found, go to Step 4.
Step 3. Using x as an initial point, obtain another local minimizer x^* by the gradient-based optimization algorithm. Go to Step 2.
Step 4. x^* is a global minimizer of Problem (B).
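As an illustration of how Algorithms A.2.1 and A.2.2 fit together, the following Python sketch implements the escape-and-restart loop under several simplifying assumptions (all names are ours): scipy.optimize.minimize serves as the gradient-based local solver, a one-branch filled function consistent with the gradient (A.2.49) stands in for (A.2.3), random perturbations of the current minimizer replace the systematic search directions of Step 2, and the feasible region is a box so that the filled function is bounded below.

import numpy as np
from scipy.optimize import minimize

def filled_function_search(f, x0, bounds, rho=1e-3, mu=1.0, n_escapes=10, seed=0):
    """Escape-and-restart loop in the spirit of Algorithm A.2.2 (a sketch)."""
    rng = np.random.default_rng(seed)
    x_star = minimize(f, x0, bounds=bounds).x        # Step 1: local minimizer of f
    f_star = f(x_star)
    for _ in range(n_escapes):
        def p(x):
            # One-branch filled function: the negative quadratic pushes the
            # search away from x_star, while the mu-term penalises points
            # with f(x) > f(x_star); its gradient matches (A.2.49) there.
            return (-rho * np.sum((x - x_star) ** 2)
                    + mu * max(f(x) - f_star, 0.0) ** 2)
        x_init = x_star + 0.1 * rng.standard_normal(x_star.shape)
        x_new = minimize(p, x_init, bounds=bounds).x   # Step 2: minimize p
        x_cand = minimize(f, x_new, bounds=bounds).x   # Step 3: restart f locally
        if f(x_cand) < f_star:                         # lower basin reached
            x_star, f_star = x_cand, f(x_cand)
        else:
            mu *= 0.1                # cf. Step 5 of Algorithm A.2.1: shrink mu
    return x_star, f_star            # Step 4: candidate global minimizer

Whether an individual escape succeeds depends on ρ, μ and the perturbations, which is precisely why Algorithm A.2.1 adjusts μ adaptively rather than fixing it.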
Appendix A.3
Elements of Probability Theory
Let S denote the sample space, which is the set of all possible outcomes of an experiment. If S is finite or countably infinite, it is called a discrete sample space. On the other hand, if S is a continuous set, then it is called a continuous sample space. An element of S is called a sample point (i.e., a single outcome). A collection of possible outcomes is referred to as an event; in set theory, it is a subset of S. A ⊂ B means that if event A occurs, then event B must occur. Probability assigns a weight between 0 and 1 to each outcome of an experiment; this weight represents the likelihood or chance of that outcome occurring. These weights are determined by long-run experimentation, assumptions or some other methods.
The probability of an event A in S is the sum of the weights of all sample points in A and is denoted by P(A). It satisfies the following properties: (1) 0 ≤ P(A) ≤ 1; (2) P(S) = 1; and (3) P(∅) = 0, where ∅ denotes the empty event. Two events A and B are said to be mutually exclusive if A ∩ B = ∅. If {A_1, . . . , A_n} is a set of mutually exclusive events, then

P(A_1 ∪ A_2 ∪ · · · ∪ A_n) = Σ_{i=1}^{n} P(A_i).
If an experiment is such that each outcome has the same probability, then the outcomes are said to be equally likely.
Consider the case for which an experiment can result in any one of N different equally likely outcomes, and let the event A consist of exactly n of these outcomes. Then, the probability of event A is

P(A) = n/N.
The following probability rules are easy to prove.
1. Let Ā denote the complement of A (i.e., those outcomes not in A). Then, P(Ā) = 1 − P(A).
2. P(AB) + P(AB̄) = P(A).
3. P(A ∪ B) = P(A) + P(B) − P(AB).
4. P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(AB) − P(AC) − P(BC) + P(ABC).
Here, AB denotes the intersection of A and B. This abbreviation is used throughout this Appendix.
Conditional Probability: The conditional probability of event A given that event B has occurred is defined by

P(A|B) = P(AB)/P(B),  (A.3.1)

where A and B are events in the same sample space S and P(B) > 0.
Example A.3.1. A fair coin is tossed 3 times. What is the probability of getting at least 1 head, h, given that the first toss was a tail, t?

Solution. The sample space S is

S = {hhh, hht, hth, thh, tth, tht, htt, ttt}.

Define A = {at least one head} and B = {first toss resulted in a tail}. Then,

A = {hhh, hht, hth, thh, tth, tht, htt},
B = {thh, tht, tth, ttt} ⇒ AB = {thh, tht, tth}
⇒ P(A) = 7/8, P(B) = 4/8, and P(AB) = 3/8
⇒ P(A|B) = P(AB)/P(B) = (3/8)/(4/8) = 3/4.
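Since the sample space here is small, the conditional probability can be verified by direct enumeration. The following Python snippet (ours, purely illustrative) reproduces the value 3/4:

from itertools import product

S = ["".join(seq) for seq in product("ht", repeat=3)]  # 8 equally likely outcomes
B = [s for s in S if s[0] == "t"]                      # first toss a tail
AB = [s for s in B if "h" in s]                        # at least one head, within B
print(len(AB) / len(B))                                # P(A|B) = 3/4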
Random Variables: A function X whose value is a real number determined by each element in the sample space is called a random variable (r.v.). The probability distribution of X, written as P(X = x), is called the probability function of X. Define

P(X = x) = f(x).

Convention: Capital letters denote random variables, and lower-case letters denote values of the random variable.
A random variable X is discrete if its range forms a discrete (countable) set of real numbers. On the other hand, a random variable X is continuous if its range forms a continuous set of real numbers. For a continuous random variable X, the probability of any specified outcome occurring is 0.
The cumulative distribution F(x) of a discrete random variable X with probability function f(x) is given by

F(x) = P(X ≤ x) = Σ_{t≤x} f(t) = Σ_{t≤x} P(X = t).
Independence: Two events A and B are said to be independent if and only if P(A|B) = P(A) (or, equivalently, P(B|A) = P(B)). If A and B are independent events, then P(AB) = P(A)P(B).
From (A.3.1), we have

P(AB) = P(A|B)P(B) = P(B|A)P(A).  (A.3.2)
It can be generalized to any number of events. For example, consider A_1, A_2, . . . , A_n. Then,

P(A_1 A_2 . . . A_n) = P(A_1)P(A_2|A_1)P(A_3|A_1A_2) . . . P(A_n|A_1 . . . A_{n−1}).
Theorem A.3.1 (Bayes Theorem). Suppose that {B_1, B_2, . . . , B_n} is a partition of the sample space S, where P(B_i) ≠ 0 for i = 1, . . . , n; ∪_{i=1}^{n} B_i = S; and B_iB_j = ∅ for i ≠ j (i.e., B_i, i = 1, . . . , n, are mutually exclusive). Let A be any event in S such that P(A) ≠ 0. Then, for any k = 1, . . . , n,

P(B_k|A) = P(B_kA) / Σ_{i=1}^{n} P(B_iA) = P(B_k)P(A|B_k) / Σ_{i=1}^{n} P(B_i)P(A|B_i).  (A.3.3)
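As a quick numerical illustration of (A.3.3), the following sketch computes the posteriors P(B_k|A) from the priors P(B_i) and the likelihoods P(A|B_i); the three-component numbers are hypothetical.

def bayes_posteriors(priors, likelihoods):
    # Posterior P(B_k | A) via (A.3.3) for a finite partition {B_1, ..., B_n}.
    joint = [pb * pa for pb, pa in zip(priors, likelihoods)]
    total = sum(joint)          # P(A), by the law of total probability
    return [j / total for j in joint]

# Hypothetical example: three machines produce 50%, 30% and 20% of output
# with defect rates 2%, 5% and 10%; given a defective item, the posterior
# source probabilities are approximately [0.222, 0.333, 0.444].
print(bayes_posteriors([0.5, 0.3, 0.2], [0.02, 0.05, 0.10]))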
Continuous random variable: A continuous random variable X has a real-valued function associated with it, called the probability density function (p.d.f.) f(x). The p.d.f. f(x) is defined through

P(a < X < b) = ∫_a^b f(x) dx.  (A.3.4)

Clearly,

P(X = a) = P(a ≤ X ≤ a) = ∫_a^a f(x) dx = 0.

Properties: (1) f(x) ≥ 0 for all x ∈ R; (2) ∫_{−∞}^{∞} f(x) dx = 1; and (3) P(a < X < b) = ∫_a^b f(x) dx.
In general, since P(X = x) = 0, it holds that P(X < x) = P(X ≤ x). The cumulative distribution F(x) of a continuous random variable is

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt ⇒ f(x) = dF(x)/dx.  (A.3.5)
Joint Random Variables: Suppose that there are two random variables X and Y on the sample space S. Then, each point in S has a value for X and a value for Y. X and Y are said to be jointly distributed.
1. If X and Y are both discrete random variables, then X and Y have a joint probability function

f(x, y) = P(X = x, Y = y)

with properties: (1) f(x, y) ≥ 0 for all x, y; (2) Σ_x Σ_y f(x, y) = 1; and (3) P{(X, Y) ∈ A} = Σ_A f(x, y) for any region A in the xy-plane.
2. If X and Y are continuous random variables, then they have a joint probability density function f(x, y) with properties: (1) f(x, y) ≥ 0 for all x, y; (2) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1; and (3) P{(X, Y) ∈ A} = ∬_A f(x, y) dx dy for any region A in the xy-plane.
Joint Cumulative Distribution: The joint cumulative distribution is

F(x, y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^{y} ∫_{−∞}^{x} f(t, s) dt ds.
Marginal Distributions: The marginal distribution of X is defined as

G(x) = Σ_{all y} P(X = x, Y = y), if X and Y are discrete;
G(x) = ∫_{−∞}^{∞} f(x, y) dy, if X and Y are continuous,

where G(x) is also called the marginal probability distribution function of X. Similarly, the marginal distribution of Y is defined as

H(y) = Σ_{all x} P(X = x, Y = y), if X and Y are discrete;
H(y) = ∫_{−∞}^{∞} f(x, y) dx, if X and Y are continuous,

where H(y) is called the marginal probability distribution function of Y.
Conditional distributions: The distribution of X given Y = y is called the conditional distribution of X given Y = y and is defined as

f(x|y) ≡ P(X = x|Y = y) = f(x, y)/H(y) = (joint distribution of X and Y)/(marginal distribution of Y).

The conditional distribution of Y given X = x is

f(y|x) ≡ P(Y = y|X = x) = f(x, y)/G(x).
Independence: The random variables X and Y, which are jointly distributed, are said to be independent if and only if

f(x, y) = G(x)H(y)

(i.e., the joint distribution is the product of the marginal distributions of the random variables X and Y). Clearly, if X and Y are independent, then

f(x|y) = G(x) and f(y|x) = H(y).
Expectation: The expectation of the random variable X is defined by

E(X) = Σ_x x f(x), if X is discrete;
E(X) = ∫_{−∞}^{∞} x f(x) dx, if X is continuous.

Generalization: Let g(X) be a function of the random variable X. Then, the expectation of g(X) is

E[g(X)] = Σ_x g(x) f(x), if X is discrete;
E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx, if X is continuous.
Joint random variables: Let X and Y be jointly distributed with probability function f(x, y). Then, the expectation of g(X, Y) is defined as

E[g(X, Y)] = Σ_x Σ_y g(x, y) f(x, y), if X and Y are discrete;
E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy, if X and Y are continuous.
Let g(X, Y) = X. Then, E[g(X, Y)] = E(X), meaning that

E(X) = Σ_x Σ_y x f(x, y), if X and Y are discrete;
E(X) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f(x, y) dx dy, if X and Y are continuous.

In the case when X and Y are continuous,

E(X) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f(x, y) dx dy = ∫_{−∞}^{∞} x ( ∫_{−∞}^{∞} f(x, y) dy ) dx = ∫_{−∞}^{∞} x G(x) dx,

where G(x) ≡ ∫_{−∞}^{∞} f(x, y) dy is the marginal probability distribution function of X. Also,

E(Y) = ∫_{−∞}^{∞} y {marginal probability distribution function of Y} dy.

Similar conclusions are valid for discrete random variables.
Rules of Expectations:
1. Let b be a constant. Then

E(b) = ∫_{−∞}^{∞} b f(x) dx = b ∫_{−∞}^{∞} f(x) dx = b.

2. Let a and b be two constants. Then

E(aX + b) = ∫_{−∞}^{∞} (ax + b) f(x) dx = a ∫_{−∞}^{∞} x f(x) dx + b ∫_{−∞}^{∞} f(x) dx = aE(X) + b.

3. Let S(X) and T(X) be functions of X. Then

E(S(X) ± T(X)) = ∫_{−∞}^{∞} [S(x) ± T(x)] f(x) dx = ∫_{−∞}^{∞} S(x) f(x) dx ± ∫_{−∞}^{∞} T(x) f(x) dx = E(S(X)) ± E(T(X)).
4. Let X and Y be jointly distributed, and let g and h be functions of the random variables X and Y. Then

E[g(X, Y) ± h(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} {g(x, y) ± h(x, y)} f(x, y) dx dy
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy ± ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x, y) f(x, y) dx dy
= E[g(X, Y)] ± E[h(X, Y)].

As a consequence, setting g(X, Y) = X and h(X, Y) = Y gives

E(X ± Y) = E(X) ± E(Y).
5. Let X_i, i = 1, . . . , n, be n random variables. Then

E( Σ_{i=1}^{n} X_i ) = Σ_{i=1}^{n} E(X_i).
6. If X and Y are independent random variables, then

E(XY) = E(X)E(Y).

Remark A.3.1. The condition in (6) is necessary but not sufficient for independence. That is, E(XY) = E(X)E(Y) does not necessarily imply that X and Y are independent.
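A standard counterexample makes this concrete: take X uniform on {−1, 0, 1} and Y = X^2. Then E(XY) = E(X^3) = 0 = E(X)E(Y), yet Y is a function of X, so X and Y are clearly not independent. A small numerical check (ours):

import numpy as np

xs = np.array([-1.0, 0.0, 1.0])
probs = np.full(3, 1 / 3)              # X uniform on {-1, 0, 1}; Y = X**2
E_X = np.sum(xs * probs)               # 0
E_Y = np.sum(xs**2 * probs)            # 2/3
E_XY = np.sum(xs * xs**2 * probs)      # E(X**3) = 0
assert np.isclose(E_XY, E_X * E_Y)     # E(XY) = E(X)E(Y), yet X, Y dependent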
Moment: The expectation of the kth power (k a positive integer) of a random variable X is called the kth moment of the random variable X and is denoted by μ̄_k. That is, for any k = 0, 1, 2, . . .,

μ̄_k = E(X^k) = Σ_x x^k f(x), if X is discrete;
μ̄_k = E(X^k) = ∫_{−∞}^{∞} x^k f(x) dx, if X is continuous.

Clearly,

E(X^0) = E(1) = 1.

The first moment is called the mean of X and is denoted by μ:

μ̄_1 = E(X) = mean of X ≡ μ.
The kth moment about the mean of the random variable X is defined as

μ_k ≡ E[(X − μ)^k] = Σ_x (x − μ)^k f(x), if X is discrete;
μ_k ≡ E[(X − μ)^k] = ∫_{−∞}^{∞} (x − μ)^k f(x) dx, if X is continuous.
Variance: The second moment about the mean of a random variable X is called the variance of X. It is denoted by σ^2. More specifically,

σ^2 = E[(X − μ)^2] = Σ_x (x − μ)^2 f(x), if X is discrete;
σ^2 = E[(X − μ)^2] = ∫_{−∞}^{∞} (x − μ)^2 f(x) dx, if X is continuous.

Clearly,

Var(X) = E(X^2) − [E(X)]^2, i.e., σ^2 = μ̄_2 − μ^2.
Bivariate or Joint Moments: Let X and Y be two random variables with a joint probability function f(x, y). Then

E(XY) = Σ_x Σ_y xy f(x, y), if X and Y are discrete;
E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f(x, y) dx dy, if X and Y are continuous.

The joint moment of X and Y about their respective means is the covariance of X and Y and is denoted by

Cov(X, Y) ≡ C(X, Y) ≡ σ_XY.

That is to say,

σ_XY = E[(X − μ_X)(Y − μ_Y)] = E(XY) − μ_X μ_Y,

where E(X) ≡ μ_X and E(Y) ≡ μ_Y.
If X and Y are independent, then E(XY) = E(X)E(Y), and in this situation it is easy to see that σ_XY = 0. However, σ_XY = 0 does not necessarily mean that X and Y are independent.
Rules of Variance: Let h(X) be a function of the random variable X. Then the variance of h(X) is

Var[h(X)] = E[(h(X) − E(h(X)))^2] = E[(h(X))^2] − [E(h(X))]^2.

Consider h(X) = aX + b, where a and b are constants. Then,

Var(aX + b) = E[((aX + b) − E(aX + b))^2] = a^2 Var(X).

For a constant b, Var(b) = 0.
For the bivariate case, we have:
1. If a, b are constants, then

Var(aX + bY) = E[(aX + bY − E(aX + bY))^2] = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).

2. If X and Y are independent, then

Var(aX + bY) = a^2 Var(X) + b^2 Var(Y).

3. Cov(aX + b, cY + d) = E[(aX + b − E(aX + b))(cY + d − E(cY + d))] = ac Cov(X, Y).
Binomial Distributions: A binomial experiment is one with the following properties: (1) the experiment consists of m independent trials; (2) each trial results in one of 2 possible outcomes, called success and failure; and (3) the probability of success does not change from trial to trial and is denoted by p.
Define the random variable X as the number of successes in m trials of a binomial experiment. This random variable X is called a binomial random variable. There are 2^m possible sequences in all. Consider the case of x successes in m trials. There are

C(m, x) = m!/(x!(m − x)!)

ways this can occur (C(m, x) denotes the binomial coefficient), each with probability

P(x successes and (m − x) failures) = p^x (1 − p)^{m−x}

⇒ P(X = x) = C(m, x) p^x (1 − p)^{m−x}, x = 0, 1, . . . , m.

This is the binomial distribution and is written as

X ∼ B(m, p).
Note that P(X = x) ≥ 0 for all x and

Σ_{x=0}^{m} P(X = x) = Σ_{x=0}^{m} C(m, x) p^x (1 − p)^{m−x}.

However, by the binomial theorem,

(a + b)^m = Σ_{x=0}^{m} C(m, x) a^x b^{m−x}.

Thus,

Σ_{x=0}^{m} P(X = x) = (p + 1 − p)^m = 1.

Hence,

P(X = x) = C(m, x) p^x (1 − p)^{m−x}

is its probability function. The mean of X is
E(X) = Σ_{x=0}^{m} x C(m, x) p^x (1 − p)^{m−x}.

Since C(m, x) = m!/(x!(m − x)!), we have

E(X) = Σ_{x=1}^{m} [m(m − 1)!/((x − 1)!(m − x)!)] p · p^{x−1} (1 − p)^{m−x} = mp[p + (1 − p)]^{m−1} = mp.
The variance of X is

Var(X) = E(X^2) − [E(X)]^2,

where

E(X^2) = Σ_{x=0}^{m} x^2 C(m, x) p^x (1 − p)^{m−x}.

Since X^2 = X(X − 1) + X, it is clear that

E(X^2) = E[X(X − 1)] + E(X).

On the other hand,

E[X(X − 1)] = Σ_{x=0}^{m} x(x − 1) [m!/((m − x)!x!)] p^x (1 − p)^{m−x} = m(m − 1)p^2 [p + (1 − p)]^{m−2} = m(m − 1)p^2.

Therefore,

Var(X) = E(X^2) − [E(X)]^2 = m(m − 1)p^2 + mp − (mp)^2 = mp(1 − p).
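These two formulas are easily checked by simulation; a minimal sketch using NumPy's binomial sampler with illustrative values of m and p:

import numpy as np

rng = np.random.default_rng(0)
m, p = 20, 0.3
x = rng.binomial(m, p, size=200_000)   # 200,000 draws of X ~ B(m, p)
print(x.mean(), m * p)                 # both close to 6.0
print(x.var(), m * p * (1 - p))        # both close to 4.2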
Example A.3.2. Let X be a binomial variate for which

P(X = x) = C(m, x) p^x (1 − p)^{m−x}, x = 0, 1, . . . , m.

Then,

P(X = x + 1) = C(m, x + 1) p^{x+1} (1 − p)^{m−x−1}
= [m!/((m − x − 1)!(x + 1)!)] p^{x+1} (1 − p)^{m−x−1}
= [m!/((m − x)!x!)] p^x (1 − p)^{m−x} · (m − x)p/((x + 1)(1 − p))
= [(m − x)/(x + 1)] · [p/(1 − p)] · P(X = x).
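This recursion gives a numerically convenient way to tabulate the whole binomial distribution without evaluating factorials. A short sketch (the function name is ours), assuming 0 < p < 1:

def binomial_pmf(m, p):
    # Tabulate P(X = 0), ..., P(X = m) via the recursion of Example A.3.2.
    probs = [(1.0 - p) ** m]           # P(X = 0)
    for x in range(m):                 # P(X = x + 1) from P(X = x)
        probs.append(probs[-1] * (m - x) / (x + 1) * p / (1.0 - p))
    return probs

assert abs(sum(binomial_pmf(20, 0.3)) - 1.0) < 1e-12   # probabilities sum to 1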
Poisson Distribution: A Poisson experiment is one with the following properties: (1) the number of successes occurring in a single time interval (or region) is independent of those occurring in any other disjoint time interval (or region); (2) the probability of a single success occurring in a very short time interval (or region) is proportional to the length of the time interval (or size of the region); and (3) the probability of more than one success occurring in a very short time interval (or region) is negligible.
The number X of successes in a Poisson experiment is called a Poisson random variable, which is written as X ∼ P(λ). The probability function of X is

P(X = x) = e^{−λ} λ^x / x!, x = 0, 1, 2, . . . ,

where λ denotes the mean number of successes in a specified interval. The mean of X is

E(X) = Σ_{x=0}^{∞} x e^{−λ} λ^x / x! = e^{−λ} Σ_{x=1}^{∞} λ^x/(x − 1)! = λ e^{−λ} Σ_{x=1}^{∞} λ^{x−1}/(x − 1)! = λ e^{−λ} Σ_{x=0}^{∞} λ^x/x! = λ e^{−λ} e^{λ} = λ.

The variance of X is

Var(X) = E(X^2) − [E(X)]^2 = E[X(X − 1)] + E(X) − [E(X)]^2 = λ^2 + λ − λ^2 = λ.
The sum of two independent Poisson random variables is also a Poisson random variable. That is, if

X_1 ∼ P(λ_1), X_2 ∼ P(λ_2), with X_1 and X_2 independent,

then X_1 + X_2 ∼ P(λ_1 + λ_2).
Let X be a binomial random variable with m trials, and let p be the probability of success. Let m → ∞ and p → 0. If mp = λ remains constant and finite, then

B(m, p) → P(λ).
Example A.3.3. Experiment shows that the number of mechanical failures per quarter for a certain component used in a loading plant is Poisson distributed. The mean number of failures per quarter is 1.5. Stocks of the component are built up to a fixed number at the beginning of a quarter and not replenished until the beginning of the next quarter. Calculate the least number of spares of this component which should be carried at the beginning of the quarter to ensure that the probability of a demand exceeding this number during the quarter will not exceed 0.10. If stocks are to be replenished only once in 6 months, how many ought now to be carried at the beginning of the period to give the same protection?

Solution.
(a) Let X ∼ P(1.5) be the number of mechanical failures per quarter, and let t be the smallest number of spares. Then, we require P(X > t) ≤ 0.1, i.e., P(X ≤ t) ≥ 0.9. Since P(X ≤ t) = Σ_{x=0}^{t} e^{−1.5} 1.5^x/x!, we have P(X ≤ 2) = 0.8088 and P(X ≤ 3) = 0.9344 (hence P(X > 3) = 0.0656). This implies 3 spares per quarter.
(b) For 6 months, λ = 3, so X ∼ P(3). Find t such that P(X > t) ≤ 0.1. Since

P(X > 5) = 0.0839, we require 5 spares per 6 months.
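The search for the smallest t with P(X > t) ≤ 0.1 is mechanical; a sketch using scipy.stats.poisson (the helper name is ours):

from scipy.stats import poisson

def min_spares(lam, alpha=0.10):
    # Smallest t with P(X > t) <= alpha for X ~ P(lam), as in Example A.3.3.
    t = 0
    while poisson.sf(t, lam) > alpha:   # sf(t, lam) = P(X > t)
        t += 1
    return t

print(min_spares(1.5))   # 3 spares per quarter
print(min_spares(3.0))   # 5 spares per six months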
Normal Distribution: Let X be a continuous random variable with its value denoted by x. The random variable X is said to have a normal distribution if its probability density function is

f(x) = (1/(σ√(2π))) exp{−(1/2)((x − μ)/σ)^2}, −∞ < x < ∞,

where μ ≡ E(X) is the mean of X such that −∞ < μ < ∞, σ^2 ≡ Var(X) is the variance of X, and σ is the standard deviation of X such that σ > 0. Here, μ and σ are the parameters of the distribution.
Characteristics:

1. f(μ + x) = f(μ − x), i.e., f is symmetric about μ.
2. df(x)/dx = (1/(σ√(2π))) exp{−(1/2)((x − μ)/σ)^2} · (−(x − μ)/σ^2). Then, df(x)/dx = 0 ⇒ x = μ.
3. d^2f(x)/dx^2 = −(1/(σ^3√(2π))) e^{−(1/2)((x−μ)/σ)^2} + (1/(σ√(2π))) exp{−(1/2)((x − μ)/σ)^2} ((x − μ)/σ^2)^2. Then,

d^2f(x)/dx^2 |_{x=μ} = −1/(σ^3√(2π)) < 0.

This implies that x = μ is the point at which the function f(x) attains its maximum, f(x) is symmetric about μ, and the points of inflexion are at x = μ + σ and x = μ − σ.
Let X be a random variable with its value denoted by x. If X is normally distributed with mean μ and variance σ^2, we write X ∼ N(μ, σ^2). If μ = 0 and σ^2 = 1, then the corresponding random variable U is called a standard normal random variable, written as U ∼ N(0, 1). Let the value of U be denoted by u. Then, its probability density function φ(u) is

φ(u) = (1/√(2π)) exp{−(1/2)u^2},

and its cumulative distribution is

Φ(u) = ∫_{−∞}^{u} (1/√(2π)) exp{−(1/2)v^2} dv.
If X ∼ N(μ, σ^2), then

P(X < x) = ∫_{−∞}^{x} (1/(σ√(2π))) exp{−(1/2)((t − μ)/σ)^2} dt.
Return again to the case when U ∼ N(0, 1). Then,

Φ(u) = P(U < u) = ∫_{−∞}^{u} (1/√(2π)) exp{−(1/2)v^2} dv,

and, for example, Φ(2) = P(U < 2) = 0.9772, Φ(0) = 1/2, Φ(∞) = 1, Φ(−∞) = 0. Since

Φ(−u) = P(U < −u) = P(U > u) = 1 − P(U < u),

it follows that

Φ(−u) = 1 − Φ(u), and P(U > u) = 1 − P(U < u) = 1 − Φ(u).

Thus,

P(u_1 < U < u_2) = P(U < u_2) − P(U < u_1) = Φ(u_2) − Φ(u_1).

Note that

P(|U| < u) = P(−u < U < u) = Φ(u) − (1 − Φ(u)) = 2Φ(u) − 1.
Probabilities for X ∼ N(μ, σ^2): We can transform X ∼ N(μ, σ^2) to U ∼ N(0, 1). To begin with, we recall

P(x_1 < X < x_2) = ∫_{x_1}^{x_2} (1/(σ√(2π))) exp{−(1/2)((t − μ)/σ)^2} dt.

Set

ν = (t − μ)/σ ⇒ t = σν + μ and dν = (1/σ) dt;
t = x_1 ⇒ ν = (x_1 − μ)/σ ≡ u_1; t = x_2 ⇒ ν = (x_2 − μ)/σ ≡ u_2

⇒ P(x_1 < X < x_2) = ∫_{u_1}^{u_2} (1/√(2π)) e^{−(1/2)ν^2} dν = Φ(u_2) − Φ(u_1) = Φ((x_2 − μ)/σ) − Φ((x_1 − μ)/σ).

Thus, we conclude that if X ∼ N(μ, σ^2), then U = (X − μ)/σ ∼ N(0, 1). This is called standardization.
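In code, standardization is a single change of variables; a sketch with scipy.stats.norm and illustrative values of μ, σ, x_1 and x_2:

from scipy.stats import norm

mu, sigma = 10.0, 2.0                        # illustrative parameters
x1, x2 = 9.0, 13.0
u1, u2 = (x1 - mu) / sigma, (x2 - mu) / sigma
print(norm.cdf(u2) - norm.cdf(u1))           # P(x1 < X < x2) = Phi(u2) - Phi(u1)
print(norm.cdf(x2, mu, sigma) - norm.cdf(x1, mu, sigma))   # same value directly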
Theorem A.3.2. Suppose that X ∼ N(μ, σ^2). Then,

aX + b ∼ N(aμ + b, a^2σ^2).

Theorem A.3.3. For each i = 1, . . . , n, let X_i ∼ N(μ_i, σ_i^2). If X_i, i = 1, . . . , n, are independent, then

Y ≡ Σ_{i=1}^{n} a_iX_i ∼ N( Σ_{i=1}^{n} a_iμ_i, Σ_{i=1}^{n} a_i^2σ_i^2 ).
Lognormal Random Variables: A random variable Z is lognormal if the random variable ln Z is normal. Equivalently, if X is normal, then Z = exp{X} is lognormal. This means that the density function for Z has the form

p(z) = (1/(√(2π) σ z)) exp{−(1/(2σ^2))(ln z − υ)^2}, z > 0.

We have

E(Z) = exp{υ + σ^2/2}; E(ln Z) = υ;
Var(Z) = exp{2υ + σ^2}[exp{σ^2} − 1]; Var(ln Z) = σ^2.

It follows from the summation result for jointly normal random variables that products and powers of jointly lognormal variables are again lognormal. For example, if U and V are lognormal, then Z = U^α V^β is also lognormal.
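These moment formulas are easy to validate by sampling, since Z = exp{X} with X ∼ N(υ, σ^2); a minimal check with illustrative parameter values:

import numpy as np

rng = np.random.default_rng(0)
v, s = 0.5, 0.4
z = np.exp(rng.normal(v, s, size=1_000_000))   # Z = exp(X), X ~ N(v, s**2)
print(z.mean(), np.exp(v + s**2 / 2))          # sample mean vs E(Z)
print(z.var(), np.exp(2*v + s**2) * (np.exp(s**2) - 1))   # sample vs Var(Z)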
Normal approximation to a binomial distribution: Suppose that X ∼ B(m, p). Then,

P(X = x) = C(m, x) p^x (1 − p)^{m−x}, x = 0, 1, . . . , m; 0 < p < 1;
E(X) = mp; Var(X) = mp(1 − p).

For large m, X is approximately ∼ N(mp, mp(1 − p)). The approximation is very accurate if m is large and p is close to 1/2, and it is fairly good even if m is not large, provided p is not very close to 0 or 1.
Theorem A.3.4 (Central Limit Theorem). If X_1, . . . , X_n are independently identically distributed random variables with mean μ and variance σ^2, then, for large n,

U ≡ (X̄ − μ)/(σ/√n) ∼ (approximately) N(0, 1),  (*)

where X̄ = Σ_{i=1}^{n} X_i/n. In other words, for a large sample size, the sample mean is approximately normally distributed no matter what the distribution of the X_i, i = 1, 2, . . ., is, as long as they are independently identically distributed.
In conclusion, if a random sample of size n is selected from a population whose variance σ^2 is known, then the confidence interval for μ (i.e., the mean of the population) is

( x̄ − u_{α/2} σ/√n, x̄ + u_{α/2} σ/√n ).

On the other hand, if σ is not known, then we replace σ by the sample standard deviation s, provided that n ≥ 30.
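A direct implementation of this interval (the helper name is ours; u_{α/2} is obtained from the standard normal quantile function):

import numpy as np
from scipy.stats import norm

def mean_confidence_interval(sample, alpha=0.05, sigma=None):
    # (1 - alpha) confidence interval for the population mean. Uses sigma
    # when the population variance is known; otherwise falls back on the
    # sample standard deviation, which is reasonable for n >= 30.
    x = np.asarray(sample, dtype=float)
    n = x.size
    s = sigma if sigma is not None else x.std(ddof=1)
    half_width = norm.ppf(1 - alpha / 2) * s / np.sqrt(n)
    return x.mean() - half_width, x.mean() + half_width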
Let (Ω, F, P) denote a complete probability space, where Ω represents the sample space, F the σ-algebra (Borel algebra) of the subsets of the set Ω, and P the probability measure on the algebra F. Let F_t, t ≥ 0, be an increasing family of complete sub-σ-algebras of the σ-algebra F. For any random variable (or, equivalently, F-measurable function) X, let

E{X} = ∫_Ω X(ω) dP(ω)

denote the expected value, provided it exists. It does exist if X ∈ L_1(Ω, P).
For a sub-σ-algebra G ⊂ F, the conditional expectation of X, relative to G, is denoted by

E{X | G} = Y,

where the random variable Y is G-measurable.
Let G and G_1 ⊂ G_2 ⊂ F be any three (complete) sub-σ-algebras of the σ-algebra F. The conditional expectation is a linear operator in the following sense:

1. For α_1, α_2 ∈ R,

E{α_1X_1 + α_2X_2 | G} = α_1E{X_1 | G} + α_2E{X_2 | G}.

2. E{E{X | G_1} | G_2} = E{E{X | G_2} | G_1} = E{X | G_1}.
3. If Z is a bounded G-measurable random variable with G ⊂ F, then

E{ZX | G} = ZE{X | G},

which is a G-measurable random variable.
4. If X is a random variable independent of the σ-algebra G ⊂ F, then

E{X | G} = E{X}.

5. For any F-measurable and integrable random variable Z, the process

Z_t = E{Z | F_t}, t ≥ 0,

is an F_t-martingale in the sense that, for any s ≤ t < ∞,

E{Z_t | F_s} = Z_s.
We say that a process W(t) is a Wiener process (or, alternatively, a Brownian motion) if it satisfies the following properties: (1) for any s < t, the quantity W(t) − W(s) is a normal random variable with mean zero and variance t − s; (2) for any 0 ≤ t_1 ≤ t_2 ≤ t_3 ≤ t_4, the random variables W(t_2) − W(t_1) and W(t_4) − W(t_3) are uncorrelated; and (3) W(t_0) = 0 with probability 1.
Theorem A.3.5. Suppose that the random process X is defined by the Ito process

dX(t) = α(X(t), t)dt + β(X(t), t)dW(t),

where W is a standard Wiener process. Suppose also that the process Y(t) is defined by

Y(t) = F(X(t), t).

Then, Y(t) satisfies the Ito equation

dY(t) = ( (∂F/∂x)α + ∂F/∂t + (1/2)(∂^2F/∂x^2)β^2 ) dt + (∂F/∂x)β dW.
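Sample paths of the Ito process in Theorem A.3.5 can be approximated by the Euler–Maruyama scheme, which steps dX(t) with a normal increment of variance dt. The following sketch (ours) also illustrates the theorem with F(x, t) = ln x applied to a geometric Brownian motion:

import numpy as np

def euler_maruyama(alpha, beta, x0, T, n, seed=0):
    # Approximate dX = alpha(X, t) dt + beta(X, t) dW on [0, T] with n steps.
    rng = np.random.default_rng(seed)
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dW = rng.normal(0.0, np.sqrt(dt))    # Wiener increment ~ N(0, dt)
        t = k * dt
        x[k + 1] = x[k] + alpha(x[k], t) * dt + beta(x[k], t) * dW
    return x

# dX = 0.1 X dt + 0.2 X dW; by Theorem A.3.5 with F(x, t) = ln x, the process
# Y = ln X satisfies dY = (0.1 - 0.5 * 0.2**2) dt + 0.2 dW.
path = euler_maruyama(lambda x, t: 0.1 * x, lambda x, t: 0.2 * x, 1.0, 1.0, 1000)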
References

1. Åkesson, J., Arzen, K., Gäfert, M., Bergdahl, T., Tummescheit, H.:
Modelling and optimization with Optimica and JModelica.org – lan-
guages and tools for solving large-scale dynamic optimization problems.
Comput. Chem. Eng. 34(11), 1737–1749 (2010)
2. Abu-Khalaf, M., Lewis, F.L.: Nearly optimal control laws for nonlin-
ear systems with saturating actuators using a neural network HJB ap-
proach. Automatica 41(5), 779–791 (2005)
3. Ahmed, N.U.: Elements of Finite-Dimensional Systems and Control
Theory. Longman Scientific and Technical, Essex (1988)
4. Ahmed, N.U.: Dynamic Systems and Control with Applications. World
Scientific, Singapore (2006)
5. Ahmed, N.U., Teo, K.L.: Optimal Control of Distributed Parameter
Systems. Elsevier Science, New York (1981)
6. Al-Tamimi, A., Lewis, F., Abu-Khalaf, M.: Discrete-time nonlinear
HJB solution using approximate dynamic programming: convergence
proof. IEEE Trans. Syst. Man Cybern. B Cybern. 38, 943–949 (2008)
7. Amerongen, J.V.: Adaptive steering of ships-a model reference ap-
proach. Automatica 20(1), 3–14 (1984)
8. Anderson, B.D.O., Moore, J.B.: Linear Optimal Control. Prentice-Hall,
Englewood Cliffs (1971)
9. Anderson, B.D.O., Moore, J.B.: Optimal Control: Linear Quadratic
Methods. Dover, New York (2007)
10. Aoki, M.: Introduction to Optimization Techniques: Fundamentals and
Applications of Nonlinear Programming. Macmillan, New York (1971)
11. Athans, M., Falb, P.L.: Optimal Control. McGraw-Hill, New York
(1966)
12. Baker, S., Shi, P.: Formulation of a tactical logistics decision analysis
problem using an optimal control approach. ANZIAM J. 44(E), 1737–
1749 (2002)

© The Author(s), under exclusive license to 555


Springer Nature Switzerland AG 2021
K. L. Teo et al., Applied and Computational Optimal Control, Springer
Optimization and Its Applications 171,
https://doi.org/10.1007/978-3-030-69913-0
556 References

13. Banihashemi, N., Kaya, C.Y.: Inexact restoration for Euler discretiza-
tion of box-constrained optimal control problems. J. Optim. Theory
Appl. 156, 726–760 (2003)
14. Banks, H.T., Burns, J.A.: Hereditary control problem: Numerical meth-
ods based on averaging approximations. SIAM J. Control. Optim. 16,
169–208 (1978)
15. Bartlett, M.: An inverse matrix adjustment arising in discriminant anal-
ysis. Ann. Math. Stat. 22(1), 107–111 (1951)
16. Bashier, E.B.M., Patidar, K.C.: Optimal control of an epidemiological
model with multiple time delays. Appl. Math. Comput. 292, 47–56
(2017)
17. Bech, M., Smitt., L.W.: Analogue Simulation of Ship Manoeuvres. Hy-
dro and Aerodynamics Lab. Report No. Hy-14, Denmark (1969)
18. Bellman, R.: Introduction to the Mathematical Theory of Control Pro-
cesses, Vol. 1. Academic, New York (1967)
19. Bellman, R.: Introduction to the Mathematical Theory of Control Pro-
cesses, Vol. 2. Academic, New York (1971)
20. Bellman, R., Dreyfus, R.: Dynamic Programming and Modern Control
Theory. Academic, Orlando (1977)
21. Bensoussan, A., Hurst, E., Naslund, B.: Management Application of
Modern Control Theory. North Holland, Amsterdam (1974)
22. Bertsekas, D.: Constrained Optimization and Lagrange Multiplier
Methods. Academic, New York (1982)
23. Bertsimas, D., Brown, D.: Constrained stochastic LQC: a tractable
approach. IEEE Trans. Autom. Control 52, 1826–1841 (2007)
24. Betts, J.: Practical Methods for Optimal Control and Estimation Using
Nonlinear Programming. SIAM Press, Philadelphia (2010)
25. Biegler, L.: An overview of simultaneous strategies for dynamic op-
timization. Chem. Eng. Process. Process Intensif. 46(11), 1043–1053
(2007)
26. Birgin, E.G., Martinez, J.M.: Local convergence of an Inexact-
Restoration method and numerical experiments. J. Optim. Theory
Appl. 127(2), 229–247 (2005)
27. Blanchard, E., Loxton, L., Rehbock, V.: Dynamic optimization of dual-
mode hybrid systems with state-dependent switching conditions. Op-
tim. Methods Softw. 33(2), 297–310 (2018)
28. Blanchard, E., Loxton, R., Rehbock, V.: A computational algorithm for
a class of non-smooth optimal control problems arising in aquaculture
operations. Appl. Math. Comput. 219, 8738–8746 (2013)
29. Boltyanskii, V.: Mathematical Methods of Optimal Control. Holt, Rine-
hart and Winston, New York (1971)
30. Boyd, S., Vandenberghe, L.: Convex Optimization (2013). http://www.
stanford.edu/∼boyd/cvxbook/
References 557

31. Brooke., D.: The design of a new automatic pilot for the commercial
ship. In: First IFAC/IFIP Symposium on Ship Operation Automation,
Oslo (1973)
32. Broyden, C.: The convergence of a class of double-rank minimization
algorithms. J. Inst. Math. Appl. 6, 76–90 (1970)
33. Bryson, A., Ho, Y.: Applied Optimal Control. Hemisphere Publishing,
Washington DC (1975)
34. Büskens, C.: Optimierungsmethoden und sensitivitätsanalyse für opti-
male steuerprozesse mit steuer und zustands beschränkungen. Ph.D.
thesis, Institut für Numerische und Inentelle Mathematik, Universität
Münster (1998)
35. Büskens, C., Maurer, H.: Nonlinear programming methods for real-time
control of an industrial robot. J. Optim. Theory Appl. 107(3), 505–527
(2000)
36. Buskens, C., Maurer, H.: SQP-methods for solving optimal control
problems with control and state constraints: adjoint variables, sensi-
tivity analysis and real-time control. J. Comput. Appl. Math. 120,
85–108 (2000)
37. Butovskiy, A.: Distributed Control Systems. American Elsevier, New
York (1969)
38. Caccetta, L., Loosen, I., Rehbock, V.: Computational aspects of the
optimal transit path problem. J. Ind. Manage. Optim. 4, 95–105 (2008)
39. Canuto, C., Hussaini, M., Quarteroni, A., Zang, T.: Spectral Methods
in Fluid Dynamics. Springer, New York (1988)
40. Cesari, L.: Optimization: Theory and Applications. Springer, New York
(1983)
41. Chai, Q., Yang, C., Teo, K.L., Gui, W.: Time-delay optimal control
of an industrial-scale evaporation process sodium aluminate solution.
Control Eng. Pract. 20, 618–628 (2012)
42. Chen, T., Xu, C., Lin, Q., Loxton, R., Teo, K.L.: Water hammer mit-
igation via PDE constrained optimization. Control Eng. Pract. 45,
54–63 (2015)
43. Cheng, T.C.E., Teo, K.L.: Further extensions of a student related op-
timal control problem. Int. J. Math. Model. 9, 499–506 (1987)
44. Choi, C., Laub, A.: Efficient matrix-valued algorithms for solving stiff
Riccati differential equations. IEEE Trans. Autom. Control 35(7), 770–
776 (1990)
45. Chyba, M., Haberkorn, T., Smith, R.N., Choi, S.K.: Design and im-
plementation of time efficient trajectories for autonomous underwater
vehicles. Ocean Eng. 35, 63–76 (2008)
46. Cuthrell, J.E., Biegler, L.: Simultaneous optimization and solution
methods for batch reactor control profiles. Comput. Chem. Eng. 13,
49–62 (1989)
558 References

47. Denis-Vidal, L., Jauberthie, C., Joly-Blanchard, G.: Identifiability of a


nonlinear delayed-differential aerospace model. IEEE Trans. Autom.
Control 51(1), 154–158 (2006)
48. Dontchev, A.L.: In: Balakrishnan, A.V., Thoma, M. (eds.) Perturba-
tions, Approximations and Sensitivity Analysis of Optimal Control Sys-
tems. Lecture Notes in Control and Information Sciences. Springer,
Berlin (1983)
49. Dontchev, A.L., Hager, W.W.: The Euler approximation in state con-
strained optimal control problems. Math. Comput. 70, 173–203 (2000)
50. Dontchev, A.L., Hager, W.W., Malanowski, K.: Error bound for Eu-
ler approximation of a state and control constrained optimal control
problem. Numer. Funct. Anal. Optim. 21(6), 653–682 (2000)
51. Dunford, N., Schwartz, J.T.: Linear Operators, Part 1 and Part 2. Wi-
ley, New York (1958)
52. Elnagar, G., Kazemi, M., Razzaghi, M.: The Pseudospectral Legendre
method for discretizing optimal control problems. IEEE Trans. Autom.
Control 40(10), 1793–1796 (1995)
53. Esposito, W., Floudas, C.: Deterministic global optimization in nonlin-
ear optimal control problems. J. Glob. Optim. 17(1–4), 97–126 (2000)
54. Evtushenko, Y.: Numerical Optimization Techniques. Springer, New
York (1985)
55. Feehery, W., Barton, P.: Dynamic optimization with state variable path
constraints. Comput. Chem. Eng. 22(9), 1241–1256 (1998)
56. Feng, Z.G., Teo, K.L., Rehbock, V.: Hybrid method for a general opti-
mal sensor scheduling problem in discrete time. Automatica 44, 1295–
1303 (2008)
57. Feng, Z.G., Teo, K.L., Rehbock, V.: A discrete filled function method
for the optimal control of switched systems in discrete time. Optimal
Control Appl. Methods 30(6), 585–593 (2009)
58. Fisher, M.E., Jennings, L.: Discrete-time optimal control problems with
general constraints. ACM Trans. Math. Softw. 18(4), 401–413 (1992)
59. Fleming, W., Rishel, R.: Deterministic and Stochastic Optimal Control.
Springer, Berlin (1975)
60. Fletcher, R.: A new approach to variable metric algorithms. Comput.
J. 13(3), 317–322 (1970)
61. Fletcher, R.: Practical Methods of Optimization, 2nd edn. Wiley-
Interscience, New York (1987)
62. Fletcher, R., Reeves, C.: Function minimization by conjugate gradients.
Comput. J. 7, 149–154 (1964)
63. Fu, J., Chachuat, B., Mitsos, A.: Local optimization of dynamic pro-
grams with guaranteed satisfaction of path constraints. Automatica
62, 184–192 (2015)
64. Gamkrelidze, R.: Principles of Optimal Control Theory. Plenum Press,
New York (1978)
References 559

65. Gao, Y., Kostyukova, O., Chong, K.T.: Worst-case optimal control for
an electrical drive system with time-delay. Asian J. Control 11(4),
386–395 (2009)
66. Gerdts, M.: Solving mixed-integer optimal control problems by branch
and bound: a case study from automobile test-driving with gear shift.
Optimal Control Appl. Methods 26(1), 1–18 (2005)
67. Gerdts, M.: A variable time transformation method for mixed-integer
optimal control problems. Optimal Control Appl. Methods 27, 169–182
(2006)
68. Gerdts, M.: Global convergence of a nonsmooth Newton method for
control-state constrained optimal control problems. SIAM J. Control
Optim. 19(1), 326–350 (2008)
69. Gerdts, M.: Optimal control of ODEs and DAEs. De Gruyter, Berlin
(2012)
70. Giang, D., Lenbury, Y., Seidman, T.: Delay effect in models of popu-
lation growth. J. Math. Anal. Appl. 305, 631–643 (2005)
71. Gill, P., Murray, W., Wright, M.: Practical Optimization. Academic,
London (1981)
72. Goh, B.: Necessary conditions for singular extremals involving multiple
control variables. SIAM J. Control 4(4), 716–731 (1966)
73. Goh, B.: The second variation for the singular Bolza problem. SIAM
J. Control 4(2), 309–325 (1966)
74. Goh, B.: Management and Analysis of Biological Populations. Elsevier,
Amsterdam (1980)
75. Goh, C.J., Teo, K.L.: Control parametrization: a unified approach to
optimal control problems with general constraints. Automatica 24(1),
3–18 (1988)
76. Goh, C.J., Teo, K.L.: Alternative algorithms for solving nonlinear func-
tion and functional inequalities. Appl. Math. Comput. 41(2), 159–177
(1991)
77. Goldfarb, D.: A family of variable-metric methods derived by varia-
tional means. Math. Comput. 24, 23–26 (1970)
78. Goldstein, A.: On steepest descent. SIAM J. Control 3, 147–151 (1965)
79. Gong, Z.H., Loxton, R., Yu, C.J., Teo, K.L.: Dynamic optimization for
robust path planning of horizontal oil wells. Appl. Math. Comput. 274,
711–725 (2016)
80. Gong, Z.H., Teo, K.L., Liu, C.Y., Feng, E.: Horizontal well’s path plan-
ning: an optimal switching control approach. Appl. Math. Model. 39,
4022–4032 (2015)
81. Gonzaga, C., Polak, E., Trahan, R.: An improved algorithm for opti-
mization problems with functional inequality constraints. IEEE Trans.
Autom. Control 25(1), 211–246 (1980)
82. Graham, K., Rao, A.: Minimum-time trajectory optimization of low-
thrust earth-orbit transfers with eclipsing. J. Spacecr. Rocket. 53(2),
289–303 (2016)
560 References

83. Gruver, W., Sachs, E.: Algorithmic Methods in Optimal Control. Re-
search Notes in Mathematics, vol. 47. Pitman, London (1981)
84. Guinn, T.: Reduction of delayed optimal control problems to nonde-
layed problems. J. Optim. Theory Appl. 18(3), 371–377 (1976)
85. Hager, W.W.: Runge-Kutta methods in optimal control and the trans-
formed adjoint system. Numer. Math. 87, 247–282 (2000)
86. Han, S.: Superlinearly convergent variable metric algorithms for gen-
eral nonlinear programming problems. Math. Program. 11(1), 263–282
(1976)
87. Han, S.: A globally convergent method for nonlinear programming. J.
Optim. Theory Appl. 22(3), 297–309 (1977)
88. Hartl, R.F., Sethi, S.P., Vickson, R.G.: A survey of the maximum prin-
ciples for optimal control problems with state constraints. SIAM Rev.
37(2), 181–218 (1995)
89. Hausdorff, L.: Gradient Optimization and Nonlinear Control. Wiley,
New York (1976)
90. Hermes, H., LaSalle, J.P.: Functional Analysis and Time optimal Con-
trol. Academic, New York (1969)
91. Hewitt, E., Stromberg, K.: Real and Abstract Analysis. Springer, New
York (1965)
92. Hindmarsh, A.: Large ordinary differential systems and software. IEEE
Control Mag. 2, 24–30 (1982)
93. Ho, C.Y.F., Ling, B.W.K., Liu, Y.Q., Tam, P.K.S., Teo, K.L.: Opti-
mal PWM control of switched-capacitor DC–DC power converters via
model transformation and enhancing control techniques. IEEE Trans.
Circuits Syst. I 55, 1382–1391 (2008)
94. Hounslow, M.J., Ryall, R.L., Marshall, V.R.: A discretized population
balance for nucleation, growth, and aggregation. AIChE J. 34(11),
1821–1832 (1988)
95. Howlett, P.: Optimal strategies for the control of a train. Automatica
32(4), 519–532 (1996)
96. Howlett, P.: The optimal control of a train. Ann. Oper. Res. 98(1-4),
65–87 (2000)
97. Howlett, P.G., Pudney, P.J., Vu, X.: Local energy minimization in op-
timal train control. Automatica 45, 2692–2698 (2009)
98. Huang, C., Wang, S., Teo, K.L.: Solving Hamilton-Jacobi-Bellman
equations by a modified method of characteristics. Nonlinear Anal.
40(1–8), 279–293 (2000)
99. Huang, C., Wang, S., Teo, K.L.: On application of an alternating direc-
tion method to Hamilton-Jacobi-Bellman equations. J. Comput. Appl.
Math. 166(1), 153–166 (2004)
100. Hull, D., Speyer, J., Tseng, C.: Maximum-information guidance for
homing missiles. J. Guid. Control. Dyn. 8(4), 494–497 (1985)
References 561

101. Huntington, G., Rao, A.: Optimal reconfiguration of spacecraft forma-


tions using the Gauss pseudospectral method. J. Guid. Control. Dyn.
31(3), 689–698 (2008)
102. Hussein, I., Bloch, A.: Optimal control of underactuated nonholonomic
mechanical systems. IEEE Trans. Autom. Control 53(3), 668–682
(2008)
103. Jennings, L.S., Teo, K.L.: A computational algorithm for functional
inequality constrained optimization problems. Automatica 26(2), 371–
375 (1990)
104. Jennings, L.S., Fisher, M.E., Teo, K.L., Goh, C.J.: MISER3 optimal
control software: theory and user manual-both FORTRAN and MAT-
LAB versions (2004).
105. Jennings, L.S., Wong, K., Teo, K.L.: Optimal control computation to
account for eccentric movement. J. Aust. Math. Soc. B 38(2), 182–193
(1996)
106. Jiang, C., Lin, Q., Yu, C., Teo, K.L., Duan, G.R.: An exact penalty
method for free terminal time optimal control problem with continuous
inequality constraints. J. Optim. Theory Appl. 154, 30–53 (2012)
107. Jiang, C., Teo, K.L., Duan, G.: A suboptimal feedback control for non-
linear time-varying systems with continuous inequality constraints. Au-
tomatica 48, 660–665 (2012)
108. Jiang, C., Teo, K.L., Loxton, R., Duan, G.R.: A neighboring extremal
solution for an optimal switched impulsive control problem. J. Ind.
Manage. Optim. 8, 591–609 (2012)
109. Kailath, T.: Linear Systems. Prentice-Hall Information and System
Science Series. Prentice-Hall, Englewood Cliffs (1980)
110. Kamien, M., Schwartz, N.: Dynamic Optimization: The Calculus of
Variations and Optimal Control in Economics and Management. North
Holland, Amsterdam (1991)
111. Kaya, C.Y., Noakes, J.L.: Computational method for time-optimal
switching control. J. Optim. Theory Appl. 117(1), 69–92 (2003)
112. Kaya, C.Y., Noakes, J.L.: Leapfrog for optimal control. SIAM J. Numer.
Anal. 46(6), 2795–2817 (2008)
113. Kaya, C.Y.: Inexact restoration for Runge-Kutta discretization of opti-
mal control problems. SIAM J. Numer. Anal. 48(4), 1492–1517 (2010)
114. Kaya, C.Y.: Markov–Dubins path via optimal control theory. Comput.
Optim. Appl. 68, 719–747 (2017)
115. Kaya, C.Y., Martinez, J.M.: Euler discretization for inexact restoration
and optimal control. J. Optim. Theory Appl. 134, 191–206 (2007)
116. Kaya, C.Y., Maurer, H.: A numerical method for nonconvex multi-
objective optimal control problems. Comput. Optim. Appl. 57, 685–702
(2014)
117. Kaya, C.Y Noakes, J.L.: Computations and time-optimal controls. Op-
timal Control Appl. Methods 17, 171–185 (1996)
562 References

118. Khmelnitsky, E.: A combinatorial, graph-based solution method for a


class of continuous time optimal control problems. Math. Oper. Res.
27(2), 312–325 (2002)
119. Kogan, K., Khmelnitsky, E.: Scheduling: Control-Based Theory and
Polynomial-Time Algorithms. Kluwer Academic, Dordrecht (2000)
120. Lee, C., Leitmann, G.: On a student-related optimal control problem.
J. Optim. Theory Appl. 65(1), 129–138 (1990)
121. Lee, E., Markus, L.: Foundations of Optimal Control Theory. Wiley,
New York (1967)
122. Lee, H., Ali, M., Wong, K.: Global optimization for a class of optimal
discrete-valued control problems. Dyn. Contin. Discrete Impuls. Syst.
B 11(6), 735–756 (2004)
123. Lee, H.W.J., Teo, K.L., Jennings, L.S.: On optimal control of multi-link
vertical planar robot arms systems moving under the effect of gravity.
J. Aust. Math. Soc. B 39(2), 195–213 (1997)
124. Lee, H.W.J., Teo, K.L., Lim, A.E.B.: Sensor scheduling in continuous
time. Automatica 37(12), 2017–2023 (2001)
125. Lee, H.W.J., Teo, K.L., Rehbock, V., Jennings, L.S.: Control parame-
terization enhancing technique for time optimal control problems. Dyn.
Syst. Appl. 6, 243–262 (1997)
126. Lee, H.W.J., Teo, K.L., Rehbock, V., Jennings, L.S.: Control parame-
terization enhancing technique for optimal discrete-valued control prob-
lems. Automatica 35(8), 1401–1407 (1999)
127. Lee, W., Rehbock, V., Caccetta, L., Teo, K.L.: Numerical solution of
optimal control problems with discrete-valued system parameters. J.
Glob. Optim. 23(3-4), 233–244 (2002)
128. Lee, W., Wang, S., Teo, K.L.: Optimal recharge and driving strategies
for a battery-powered electric vehicle. Math. Probl. Eng. 5(1), 1–32
(1999)
129. Lei, J.: Optimal vibration control of nonlinear systems with multiple
time-delays: an application to vehicle suspension. Integr. Ferroelectr.
170, 10–32 (2016)
130. Lewis, F.: Optimal Control. Wiley, New York (1986)
131. Li, B., Teo, K.L., Duan, G.R.: Optimal control computation for discrete
time time-delayed optimal control problem with all-time-step inequality
constraints. Int. J. Innov. Comput. Inf. Control 6(7), 3157–3175 (2010)
132. Li, B., Teo, K.L., Lim, C.C., Duan, G.R.: An optimal PID controller
design for nonlinear constrained optimal control problems. Discrete
Contin. Dyn. Syst. B 16, 1101–1117 (2011)
133. Li, B., Teo, K.L., Zhao, G.H., Duan, G.: An efficient computational
approach to a class of minmax optimal control problems with applica-
tions. ANZIAM J. 51(2), 162–177 (2009)
134. Li, B., Yu, C., Teo, K.L., Duan, G.R.: An exact penalty function
method for continuous inequality constrained optimal control problem.
J. Optim. Theory Appl. 151(2), 260–291 (2011)
References 563

135. Li, B., Zhu, Y.G., Sun, Y.F., Aw, G., Teo, K.L.: Deterministic con-
version of uncertain manpower planning optimization problem. IEEE
Trans. Fuzzy Syst. 26(5), 2748–2757 (2018)
136. Li, B., Zhu, Y.G., Sun, Y.F., Aw, G., Teo, K.L.: Multi-period port-
folio selection problem under uncertain environment with bankruptcy
constraint. Appl. Math. Model. 56, 539–550 (2018)
137. Li, C., Teo, K.L., Li, B., Ma, G.: A constrained optimal PID-like con-
troller design for spacecraft attitude stabilization. Acta Astrnaut. 74,
131–140 (2011)
138. Li, R., Teo, K.L., Wong, K.H., Duan, G.R.: Control parameterization
enhancing transform for optimal control of switched systems. Math.
Comput. Model. 43(11-12), 1393–1403 (2006)
139. Li, Y.G., Gui, W.H., Teo, K.L., Zhu, H.Q., Chai, Q.Q.: Optimal control
for zinc solution purification based on interacting CSTR models. J.
Process Control 22, 1878–1889 (2012)
140. Liang, J.: Optimal magnetic attitude control of small spacecraft. Ph.D.
thesis, Utah State University (2005)
141. Lim, C., Forsythe., W.: Autopilot for ship control. IEEE Proc. 130(6),
281–294 (1983)
142. Lin, Q., Loxton, R., Teo, K.L.: Optimal control of nonlinear switched
systems: computational methods and applications. J. Oper. Res. Soc.
China 1, 275–311 (2013)
143. Lin, Q., Loxton, R., Teo, K.L., Wu, Y.H.: A new computational method
for a class of free terminal time optimal control problems. Pac. J.
Optim. 7(1), 63–81 (2011)
144. Lin, Q., Loxton, R., Teo, K.L., Wu, Y.H.: Optimal control computation
for nonlinear systems with state-dependent stopping criteria. Automat-
ica 48, 2116–2129 (2012)
145. Lin, Q., Loxton, R., Teo, K.L., Wu, Y.H.: Optimal feedback control for
dynamic systems with state constraints: an exact penalty approach.
Optim. Lett. 8(4), 1535–1551 (2014)
146. Lin, Q., Loxton, R., Teo, K.L., Wu, Y.H.: Optimal control problems
with stopping constraints. J. Glob. Optim. 63(4), 835–861 (2015)
147. Lin, Q., Loxton, R., Teo, K.L., Wu, Y.H., Yu, C.J.: A new exact penalty
method for semi-infinite programming problems. J. Comput. Appl.
Math. 261(1), 271–286 (2014)
148. Lin, Q., Loxton, R.C., Teo, K.L.: The control parameterization method
for nonlinear optimal control: a survey. J. Ind. Manage. Optim. 10(1),
275–309 (2014)
149. Lions, J.: Optimal Control of Systems Governed by Partial Differential
Equations. Springer, New York (1971)
150. Liu, C., Gong, Z.: Optimal control of Switched Systems Arising in Fer-
mentation Processes. Springer, Berlin (2014)
564 References

151. Liu, C., Loxton, R., Teo, K.L.: Switching time and parameter optimiza-
tion in nonlinear switched systems with multiple time delays. J. Optim.
Theory Appl. 163, 957–988 (2014)
152. Liu, C., Loxton, R., Lin, Q., Teo, K.L. : Dynamic optimization for
switched time-delay systems with state-dependent switched conditions.
SIAM J. Control Optim. 56, 3499–3523 (2018)
153. Liu, C., Loxton, R., Lin, Q., Teo, K.L.: Dynamic optimization for
switched time-delay systems with state-dependent switching conditions.
SIAM J. Control Optim. 56(5), 3499–3523 (2018)
154. Liu, C.M., Feng, Z.G., Teo, K.L.: On a class of stochastic impulsive op-
timal parameter selection problems. Int. J. Innov. Comput. Inf. Control
5(4), 1043–1054 (2009)
155. Liu, C.Y., Gong, Z., Feng, E., Yin, H.: Optimal switching control of a
fed-batch fermentation process. J. Glob. Optim. 52, 265–280 (2012)
156. Liu, C.Y., Gong, Z., Shen, B., Feng, E.: Modelling and optimal control
for a fed-batch fermentation process. Appl. Math. Model. 37, 695–706
(2013)
157. Liu, C.Y., Gong, Z.H., Lee, H.W.J., Teo, K.L.: Robust bi-objective
optimal control of 1,3-propanediol microbial batch production process.
J. Process Control 78, 170–182 (2019)
158. Liu, C.Y., Gong, Z.H., Teo, K.L., Feng, E.: Multi-objective optimization
of nonlinear switched time-delay systems in fed-batch process. Appl.
Math. Model. 40, 10,533–10,548 (2016)
159. Liu, C.Y., Gong, Z.H., Teo, K.L., Loxton, R., Feng, E.: Bi-objective
dynamic optimization of a nonlinear time-delay system in microbial
batch process. Optim. Lett. 12, 1249–1264 (2018)
160. Liu, C.Y., Gong, Z.H., Teo, K.L., Sun, J., Caccetta, L.: Robust multi-
objective optimal switching control arising in 1,3-propanediol microbial
fed-batch process. Nonlinear Anal. Hybrid Syst. 25, 1–20 (2017)
161. Liu, Y., Teo, K.L., Agarwal, R.P.: A general approach to nonlinear mul-
tiple control problems with perturbation consideration. Math. Comput.
Model. 26, 49–58 (1997)
162. Liu, Y., Teo, K.L., Jennings, L.S., Wang, S.: On a class of optimal
control problems with state jumps. J. Optim. Theory Appl. 98(1),
65–82 (1998)
163. Löberg, J.: YALMIP : A toolbox for modeling and optimization in
Matlab. In: Proc. Int. Symp. CACSD, Taipei, pp. 284–289 (2004)
164. Loxton, R., Lin, Q., Teo, K.L.: Minimizing control variation in nonlinear
optimal control. Automatica 49, 2652–2664 (2013)
165. Loxton, R., Teo, K.L., Rehbock, V.: An optimization approach to state-
delay identification. IEEE Trans. Autom. Control 55, 2113–2119 (2010)
166. Loxton, R., Teo, K.L., Rehbock, V.: Robust suboptimal control of non-
linear systems. Appl. Math. Comput. 217(14), 6566–6576 (2011)
References 565

167. Loxton, R., Teo, K.L., Rehbock, V., Ling, W.K.: Optimal switching
instants for a switched-capacitor DA/DC power converter. Automatica
45, 973–980 (2009)
168. Loxton, R.C., Lin, Q., Teo, K.L., Rehbock, V.: Control parameteri-
zation for optimal control problems with continuous inequality con-
straints: new convergence results. Numer. Algebra Control Optim. 2(3),
571–599 (2012)
169. Loxton, R.C., Teo, K.L., Rehbock, V.: Optimal control problems with
multiple characteristic time points in the objective and constraints.
Automatica 44(11), 2923–2929 (2008)
170. Loxton, R.C., Teo, K.L., Rehbock, V.: Computational method for a
class of switched system optimal control problems. IEEE Trans. Autom.
Control 54(10), 2455–2460 (2009)
171. Loxton, R.C., Teo, K.L., Rehbock, V., Yiu, K.F.C.: Optimal control
problems with a continuous inequality constraint on the state and the
control. Automatica 45(10), 2250–2257 (2009)
172. Luenberger, D.G., Ye, Y.: Linear and Nonlinear Programming, 3rd edn.
Springer, New York (2008)
173. Luus, R.: Optimal control by dynamic programming using systematic
reduction in grid size. Int. J. Control 51(5), 995–1013 (1990)
174. Luus, R.: Piecewise linear continuous optimal control by iterative dy-
namic programming. Ind. Eng. Chem. Res. 32(5), 859–865 (1993)
175. Luus, R.: Iterative Dynamic Programming. Chapman & Hall/CRC,
Boca Raton (2000)
176. Luus, R., Okongwu, O.: Towards practical optimal control of batch
reactors. Chem. Eng. J. 75(1), 1–9 (1999)
177. Malanowski, K., Maurer, H.: Sensitivity analysis for state constrained
optimal control problems. Discrete Contin. Dyn. Syst. 4(2), 241–272
(1998)
178. Malanowski, K., Buskens, C., Maurer, H.: Convergence of approxima-
tions to nonlinear optimal control problems. In: Fiacco, A.V. (ed.)
Mathematical Programming with Data Perturbations V. Lecture Notes
in Pure and Applied Mathematics, vol. 195, pp. 253–284. Springer, New
York (1997)
179. Martez, J.M.: Inexact restoration method with Lagrangian tangent de-
crease and new merit function for nonlinear. J. Optim. Theory Appl.
111, 39–58 (2001)
180. Martin, R.B.: Optimal control drug scheduling of cancer chemotherapy.
Automatica 28, 1113–1123 (1992)
181. Martin, R., Teo, K.L.: Optimal Control of Drug Administration in Can-
cer Chemotherapy. World Scientific, Singapore (1994)
182. Martinez, J.M., Pilotta, E.A.: Inexact restoration algorithm for con-
strained optimization. J. Optim. Theory Appl. 104(1), 135–163 (2000)
183. The Mathworks, Inc., Natick, Massachusetts: MATLAB version
8.5.0.197613 (R2015a) (2015)
566 References

184. Maurer, H.: On the minimum principle for optimal control problems
with state constraints. Tech. Rep. 41, Schriftenreihe des Rechenzen-
trums der Universität Münster (1979)
185. Maurer, H., Osmolovskii, N.P.: Second order sufficient conditions for
time-optimal bang–bang control problems. SIAM J. Control Optim.
42, 2239–2263 (2004)
186. Maurer, H., Buskens, C., Kim, J.-H.R., Kaya, C.Y.: Optimization meth-
ods for the verification of second order sufficient conditions for bang–
bang controls. Optimal Control Appl. Methods 26, 129–156 (2005)
187. McCormick, G.: Nonlinear Programming: Theory, Algorithms and Ap-
plications. Wiley, New York (1983)
188. McEneaney, W.: A curse-of-dimensionality-free numerical method for
solution of certain HJB PDEs. SIAM J. Control Optim. 46(4), 1239–
1276 (2007)
189. Mehra, R., Davis, R.: A generalized gradient method for optimal control
problems with inequality constraints and singular arcs. IEEE Trans.
Autom. Control AC-17(1), 69–78 (1972)
190. Mehta, T., Egerstedt, M.: Multi-modal control using adaptive motion
description languages. Automatica 44, 1912–1917 (2008)
191. Miele, A., Wang, T.: Dual-properties of sequential gradient-restoration
algorithms for optimal control problems. In: Conti, R., De Giorgi,
E., Giannessi, F. (eds.) Optimization and Related Fields, pp. 331–357.
Springer, New York (1986)
192. Miele, A., Pritchard, R.E., Damoulakis, J.N.: Sequential gradient-
restoration algorithm for optimal control problems. J. Optim. Theory
Appl. 5, 235–282 (1970)
193. Miele, A., Wang, T., Basapur, V.K.: Primal and dual formulations of
sequential gradient-restoration algorithms for trajectory optimization.
Acta Astronaut. 13, 491–505 (1986)
194. Misra, C., White, E.: Kinetics of crystallization of aluminium trihy-
droxide from seeded caustic aluminate solutions. Chem. Eng. Prog.
Symp. Ser. 67(110), 53–65 (1971)
195. Mitsos, A.: Global optimization of semi-infinite programs via restriction
of the right-hand side. Optimization 60(10–11), 1291–1308 (2011)
196. Mordukhovich, B.S.: Variational Analysis and Generalized Differentia-
tion: Applications, vol. II. Springer, Berlin (2006)
197. Mu, Y., Zhang, D., Teng, H., Wang, W., Xiu, Z.: Microbial production
of 1,3-propanediol by Klebsiella pneumoniae using crude glycerol from
biodiesel preparation. Biotechnol. Lett. 28, 1755–1759 (2008)
198. Neustadt, L.: Optimization: A Theory of Necessary Conditions. Prince-
ton University Press, New York (1976)
199. Nocedal, J., Wright, S.: Numerical Optimization, 2nd edn. Springer,
Berlin (2006)
References 567

200. Oberle, H.J., Sothmann, B.: Numerical computation of optimal feed


rates for a fed-batch fermentation model. J. Optim. Theory Appl.
100(1), 1–13 (1999)
201. Ořuztöreli, M.: Time-Lag Control Systems. Academic, New York (1966)
202. Parlar, M.: Some extensions of a student related optimal control prob-
lem. IMA Bull. 20, 180–181 (1984)
203. Polak, E.: On the use of consistent approximations in the solution of
semi-infinite optimization and optimal control problems. Math. Pro-
gram. 62(1), 385–414 (1993)
204. Polak, E., Ribiere, G.: Note sur la convergence de méthodes de direc-
tions conjuguées. ESAIM Math. Model. Numer. Anal. 3(R1), 35–43
(1969)
205. Polik, I., Terlaky., T.: A survey of the S-Lemma. SIAM Rev. 49(3),
371–418 (2007)
206. Pontryagin, L., Boltyanskii, V., Gamkrelidze, R., Mishchenko, E.: The
mathematical Theory of Optimal Processes, vol. 4. Gordon and Breach
Science Publishers, Montreux (1986)
207. Powell, M.: A fast algorithm for nonlinearly constrained optimization
calculations. In: Watson, G. (ed.) Numerical Analysis. Lecture Notes
in Mathematics, vol. 630, pp. 144–157. Springer, Berlin (1978)
208. Powell, W.: Approximate Dynamic Programming: Solving the Curses
of Dimensionality. Wiley, New York (2007)
209. Raggett, G., Hempson, P., Jukes, K.: A student-related optimal control
problem. Bull. Inst. Math. Appl. 17, 133–136 (1981)
210. Rao, A.: Trajectory optimization: a survey. In: Waschl, H., Kol-
manovsky, I., Steinbuch, M., del Re, L. (eds.) Optimization and Control
in Automotive Systems. Lecture Notes in Control and Information Sci-
ences. Springer, Cham (2014)
211. Rao, A., Benson, D., Darby, C., Patterson, M., Francolin, C., Sanders,
I., Huntington, G.: Algorithm 902: GPOPS, a matlab software for
solving multiple-phase optimal control problems using the gauss pseu-
dospectral method. ACM Trans. Math. Softw. 37(2), 22:1–22:39 (2010)
212. Reddien, G.: Collocation at Gauss points as a discretization in optimal
control. SIAM J. Control Optim. 17, 298–306 (1979)
213. Rehbock, V., Caccetta, L.: Two defence applications involving discrete
valued optimal control. ANZIAM J. 44, 33–54 (2002)
214. Rehbock, V., Livk, I.: Optimal control of a batch crystallization process.
J. Ind. Manage. Optim. 3(3), 585–596 (2007)
215. Rehbock, V., Teo, K.L., Jennings, L.S., Lee, H.: A survey of the control
parameterization and control parameterization enhancing methods for
constrained optimal control problems. In: Eberhard, A., Hill, R.,
Ralph, D., Glover, B. (eds.) Progress in Optimization: Contributions
from Australasia, pp. 247–275. Kluwer Academic, Dordrecht (1999)
216. Royden, H.L.: Real Analysis, 2nd edn. MacMillan, New York (1968)
568 References

217. Ruby, T., Rehbock, V., Lawrance, W.B.: Optimal control of hybrid
power systems. Dyn. Contin. Discrete Impuls. Syst. 10, 429–439 (2003)
218. Sakawa, A.: Trajectory planning of a free-flying robot by using the
optimal control. Optimal Control Appl. Methods 20, 235–248 (1999)
219. Sakawa, Y., Shindo, Y.: Optimal control of container cranes. Automat-
ica 18(3), 257–266 (1982)
220. Schittkowski, K.: The nonlinear programming method of Wilson, Han,
and Powell with an augmented Lagrangian type line search function,
Part 1: convergence analysis. Numer. Math. 38(1), 83–114 (1982)
221. Schittkowski, K.: On the convergence of a sequential quadratic pro-
gramming method with an augmented Lagrangian line search function.
Optimization 14(2), 197–216 (1983)
222. Schittkowski, K.: NLPQL: a Fortran subroutine solving constrained
nonlinear programming problems. Ann. Oper. Res. 5(2), 485–500
(1986)
223. Schittkowski, K.: NLPQLP: a Fortran implementation of a sequential
quadratic programming algorithm with distributed and non-monotone
line search - User’s guide, version 2.24. University of Bayreuth,
Bayreuth (2007)
224. Schwartz, A.: Homepage of RIOTS. http://www.schwartz-home.com/
riots/ (1997)
225. Schwartz, A.: Theory and implementation of numerical methods based
on Runge-Kutta integration for solving optimal control problems.
Ph.D. thesis, Electrical Engineering and Computer Sciences, Univer-
sity of California at Berkeley (1998)
226. Sethi, S., Thompson, G.: Optimal Control Theory: Applications to
Management Science, 2nd edn. Kluwer Academic, Dordrecht (2000)
227. Shanno, D.: Conditioning of quasi-Newton methods for function mini-
mization. Math. Comput. 24(111), 647–656 (1970)
228. Siburian, A., Rehbock, V.: Numerical procedure for solving a class of
singular optimal control problems. Optim. Methods Softw. 19(3–4),
413–426 (2004)
229. Sirisena, H.: Computation of optimal controls using a piecewise poly-
nomial parameterization. IEEE Trans. Autom. Control 18(4), 409–411
(1973)
230. Sirisena, H., Chou, F.: Convergence of the control parameterization
Ritz method for nonlinear optimal control problems. J. Optim. Theory
Appl. 29(3), 369–382 (1979)
231. Sturm, J.F.: Using SeDuMi 1.02, a MATLAB toolbox for optimization
over symmetric cones. Optim. Methods Softw. 12, 625–633 (1999)
232. Sun, W., Yuan, Y.: Optimization Theory and Methods - Nonlinear
Programming. Springer, New York (2006)
233. Sun, Y., Aw, E., Teo, K.L., Zhou, G.: Portfolio optimization using a
new probabilistic risk measure. J. Ind. Manage. Optim. 11(4), 1275–
1283 (2015)
234. Sun, Y., Aw, G., Loxton, R., Teo, K.L.: An optimal machine main-
tenance problem with probabilistic state constraints. Inf. Sci. 281,
386–398 (2014)
235. Sun, Y., Aw, G., Teo, K.L., Wang, X.: Multi-period portfolio optimiza-
tion under probabilistic risk measure. Financ. Res. Lett. 18, 60–66
(2016)
236. Sun, Y.F., Aw, G., Loxton, R., Teo, K.L.: Chance constrained optimiza-
tion for pension fund portfolios in the presence of default risk. Eur. J.
Oper. Res. 256, 205–214 (2017)
237. Teo, K.L.: Control parametrization enhancing transform to optimal
control problems. Nonlinear Anal. Theory, Methods Appl. 63, e2223–
e2236 (2005)
238. Teo, K.L., Womersley, R.S.: A control parameterization algorithm for
optimal control problems involving linear systems and linear terminal
inequality constraints. Numer. Funct. Anal. Optim. 6, 291–313 (1983)
239. Teo, K.L., Ahmed, N.U., Fisher, M.F.: Optimal feedback control for
linear stochastic systems driven by counting processes. Eng. Optim.
15(1), 1–16 (1989)
240. Teo, K.L., Clements, D.: A control parametrization algorithm for con-
vex optimal control problems with linear constraints. Numer. Funct.
Anal. Optim. 8(5–6), 515–540 (1985)
241. Teo, K.L., Goh, C.J.: A simple computational procedure for optimiza-
tion problems with functional inequality constraints. IEEE Trans. Au-
tom. Control 32(10), 940–941 (1987)
242. Teo, K.L., Goh, C.J.: On constrained optimization problems with non-
smooth cost functionals. Appl. Math. Optim. 18(1), 181–190 (1988)
243. Teo, K.L., Goh, C.J.: A unified computational method for several
stochastic optimal control problems. Int. Ser. Numer. Math. 86(2),
467–476 (1988)
244. Teo, K.L., Goh, C.J.: A computational method for combined optimal
parameter selection and optimal control problems with general con-
straints. J. Aust. Math. Soc. B 30(3), 350–364 (1989)
245. Teo, K.L., Jennings, L.S.: Nonlinear optimal control problems with con-
tinuous state inequality constraints. J. Optim. Theory Appl. 63(1),
1–22 (1989)
246. Teo, K.L., Jennings, L.S.: Optimal control with a cost on changing
control. J. Optim. Theory Appl. 68(2), 335–357 (1991)
247. Teo, K.L., Lim, C.C.: Computational algorithm for functional inequal-
ity constrained optimization problems. J. Optim. Theory Appl. 56(1),
145–156 (1998)
248. Teo, K.L., Wong, K.H.: A computational method for time-lag control
problems with control and terminal inequality constraints. Optimal
Control Appl. Methods 8(4), 377–395 (1987)
249. Teo, K.L., Wong, K.H.: Nonlinearly constrained optimal control prob-
lems. J. Aust. Math. Soc. B 33(4), 517–530 (1992)
250. Teo, K.L., Wu, Z.S.: Computational Methods for Optimizing Dis-
tributed Systems. Academic, Orlando (1984)
251. Teo, K.L., Ang, B., Wang, M.: Least weight cables: optimal parameter
selection approach. Eng. Optim. 9(4), 249–264 (1986)
252. Teo, K.L., Fischer, M.E., Moore, J.B.: A suboptimal feedback stabiliz-
ing controller for a class of nonlinear regulator problems. Appl. Math.
Comput. 59(1), 1–17 (1993)
253. Teo, K.L., Goh, C.J., Wong, K.H.: A Unified Computational Approach
to Optimal Control Problems. Longman Scientific and Technical, Essex
(1991)
254. Teo, K.L., Jennings, L.S., Lee, H.W.J., Rehbock, V.: The control pa-
rameterization enhancing transform for constrained optimal control
problems. J. Aust. Math. Soc. B Appl. Math. 40, 314–335 (1999)
255. Teo, K.L., Jepps, G., Moore, E.J., Hayes, S.: A computational method
for free time optimal control problems, with application to maximizing
the range of an aircraft-like projectile. J. Aust. Math. Soc. B 28(3),
393–413 (1987)
256. Teo, K.L., Lee, W.R., Jennings, L.S., Wang, S., Liu, Y.: Numerical
solution of an optimal control problem with variable time points in the
objective function. ANZIAM J. 43(4), 463–478 (2002)
257. Teo, K.L., Lim, C.C.: Time optimal control computation with applica-
tion to ship steering. J. Optim. Theory Appl. 56, 145–156 (1988)
258. Teo, K.L., Liu, Y., Goh, C.J.: Nonlinearly constrained discrete-time
optimal-control problems. Appl. Math. Comput. 38(3), 227–248 (1990)
259. Teo, K.L., Rehbock, V., Jennings, L.S.: A new computational algorithm
for functional inequality constrained optimization problems. Automat-
ica 29(3), 789–792 (1993)
260. Teo, K.L., Wong, K.H., Clements, D.J.: Optimal control computation
for linear time-lag systems with linear terminal constraints. J. Optim.
Theory Appl. 44(3), 509–526 (1984)
261. Teo, K.L., Yang, X.Q., Jennings, L.S.: Computational discretization
algorithms for functional inequality constrained optimization. Ann.
Oper. Res. 98(1), 215–234 (2000)
262. Thompson, G.: Optimal maintenance policy and sale date of a machine.
Manag. Sci. 14(9), 543–550 (1968)
263. Uhlig, F.: A recurring theorem about pairs of quadratic forms and
extensions: a survey. Linear Algebra Appl. 25, 219–237 (1979)
264. Rehbock, V., Lim, C.C., Teo, K.L.: A stable constrained optimal model
following controller for discrete-time nonlinear systems affine in control.
Control Theory Adv. Technol. 10(4), 793–814 (1994)
265. Varaiya, P.: Notes on Optimization. Van Nostrand Reinhold Notes on
System Sciences. Van Nostrand Reinhold, New York (1972)
266. Varaiya, P.: Lecture Notes on Optimization (2013). https://people.
eecs.berkeley.edu/~varaiya-optimization.pdf
267. Veliov, V.M.: Error analysis of discrete approximations to bang-bang
optimal control problems: the linear case. Control Cybern. 34(3), 967–
982 (2005)
268. Vincent, T.L., Grantham, W.J.: Optimality in Parametric Systems.
Wiley, New York (1981)
269. Vossen, G., Rehbock, V., Siburian, A.: Numerical solution methods for
singular control with multiple state dependent forms. Optim. Methods
Softw. 22(4), 551–559 (2007)
270. Vossen, G.A., Maurer, H.: On L1-minimization in optimal control and
applications to robots. Optimal Control Appl. Methods 27, 301–321
(2006)
271. Wang, L.Y., Gui, W.H., Teo, K.L., Loxton, R., Yang, C.H.: Optimal
control problems arising in the zinc sulphate electrolyte purification
process. J. Glob. Optim. 54, 307–323 (2012)
272. Wang, L.Y., Gui, W.H., Teo, K.L., Loxton, R.C., Yang, C.H.: Time de-
layed optimal control problems with multiple characteristic time points:
computation and industrial applications. J. Ind. Manage. Optim. 5(4),
705–718 (2009)
273. Wang, S., Gao, F., Teo, K.L.: An upwind finite-difference method for
the approximation of viscosity solutions to Hamilton-Jacobi-Bellman
equations. IMA J. Math. Control Inf. 17(2), 167–178 (2000)
274. Wang, S., Jennings, L.S., Teo, K.L.: Numerical solution of Hamilton-
Jacobi-Bellman equations by an upwind finite volume method. J. Glob.
Optim. 27(2–3), 177–192 (2003)
275. Wang, Y., Xiu, N.: Theory and Algorithms for Nonlinear Programming
(in Chinese). Shanxi Publisher of Science and Technology, Shanxi,
China (2004)
276. Warga, J.: Optimal Control of Differential and Functional Equations.
Academic, New York (1972)
277. Wächter, A., Biegler, L.T.: On the implementation of an interior-
point filter line-search algorithm for large-scale nonlinear programming.
Math. Program. 106, 25–57 (2006)
278. Wilson, R.: A simplicial algorithm for concave programming. Ph.D.
thesis, Harvard University, Cambridge (1963)
279. Wong, K.H.: Convergence analysis of a computational method for time-
lag optimal control problems. Int. J. Syst. Sci. 19(8), 1437–1450 (1988)
280. Wong, K.H., Clements, D.J., Teo, K.L.: Optimal control computation
for nonlinear time-lag systems. J. Optim. Theory Appl. 47(1), 91–107
(1985)
281. Wong, K.H., Jennings, L.S., Benyah, F.: The control parametrization
enhancing transform for constrained time-delayed optimal control prob-
lems. ANZIAM J. 43, E154–E185 (2002)
282. Wong, K.H., Jennings, L.S., Teo, K.L.: A class of nonsmooth discrete-
time constrained optimal control problems with application to hy-
drothermal power systems. Cybernet. Syst. 24, 339–352 (2007)
283. Woon, S.F., Rehbock, V., Loxton, R.C.: Towards global solutions of op-
timal discrete-valued control problems. Comput. Optim. Appl. 33(5),
576–594 (2012)
284. Wu, C.Z., Teo, K.L.: Global impulsive optimal control computation. J.
Ind. Manage. Optim. 2(4), 435–450 (2006)
285. Wu, C.Z., Teo, K.L., Rehbock, V.: A filled function method for opti-
mal discrete-valued control problems. J. Glob. Optim. 44(2), 213–225
(2009)
286. Wu, Z.Y., Zhang, L.S., Teo, K.L., Bai, F.S.: A new filled function
method for global optimization. J. Optim. Theory Appl. 125, 181–
203 (2005)
287. Wu, C.Z., Teo, K.L., Wu, S.Y.: Min–max optimal control of linear sys-
tems with uncertainty and terminal state constraints. Automatica 49,
1809–1815 (2013)
288. Wu, C.Z., Teo, K.L., Li, R., Zhao, Y.: Optimal control of switched
systems with time delay. Appl. Math. Lett. 19(10), 1062–1067 (2006)
289. Wu, D., Bai, Y.Q., Xie, F.S.: Time-scaling transformation for optimal
control problem with time-varying delay. Discrete Contin. Dyn. Syst.
S (2019). https://doi.org/10.3934/dcdss.2020098
290. Wu, D., Bai, Y.Q., Yu, C.Y.: A new computational approach for optimal
control problems with multiple time-delay. Automatica 101, 388–395
(2019)
291. Xiao, L., Liu, X.: An effective pseudospectral optimization approach
with sparse variable time nodes for maximum production of chemical
engineering problems. Can. J. Chem. Eng. 95, 1313–1322 (2017)
292. Xiu, Z., Song, B., Sun, L., Zeng, A.: Theoretical analysis of effects of
metabolic overflow and time delay on the performance and dynamic
behavior of a two-stage fermentation process. Biochem. Eng. J. 11,
101–109 (2002)
293. Xiu, Z., Zeng, A., An, L.: Mathematical modeling of kinetics and re-
search on multiplicity of glycerol bioconversion to 1,3-propanediol. J.
Dalian Univ. Technol. 40, 428–433 (2000)
294. Yang, F., Teo, K.L., Loxton, R., Rehbock, V., Li, B., Yu, C.J., Jennings,
L.: VISUAL MISER: an efficient user-friendly visual program for solving op-
timal control problems. J. Ind. Manage. Optim. 12(2), 781–810 (2016)
296. Yang, X.Q., Teo, K.L.: Nonlinear Lagrangian functions and applications
to semi-infinite programs. Ann. Oper. Res. 103(1), 235–250 (2001)
297. Yu, C.J., Li, B., Loxton, R., Teo, K.L.: Optimal discrete-valued control
computation. J. Glob. Optim. 56(2), 503–518 (2013)
298. Yu, C.J., Lin, Q., Loxton, R., Teo, K.L., Wang, G.Q.: A hybrid time-
scaling transformation for time-delay optimal control problems. J. Op-
tim. Theory Appl. 169, 876–901 (2016)
299. Yu, C.J., Teo, K.L., Bai, Y.Q.: An exact penalty function method for
nonlinear mixed discrete programming problems. Optim. Lett. 7, 23–38
(2013)
300. Yu, C.J., Teo, K.L., Zhang, L.S., Bai, Y.Q.: A new exact penalty func-
tion method for continuous inequality constrained optimization prob-
lems. J. Ind. Manage. Optim. 6(4), 895–910 (2010)
301. Yu, C.J., Teo, K.L., Zhang, L.S., Bai, Y.Q.: On a refinement of the
convergence analysis for the new exact penalty function method for
continuous inequality constrained optimization problem. J. Ind. Man-
age. Optim. 8(2), 485–491 (2012)
302. Yuan, J.L., Zhang, Y.D., Ye, J.X., Xie, J., Teo, K.L., Zhu, X., Feng,
E.M., Yin, H.C., Xiu, Z.L.: Robust parameter identification using paral-
lel global optimization for a batch nonlinear enzyme-catalytic time-
delayed process presenting metabolic discontinuities. Appl. Math.
Model. 46, 554–571 (2017)
303. Zhang, K., Teo, K.L.: A penalty-based method for reconstructing
smooth local volatility surface from American options. J. Ind. Manage.
Optim. 11(2), 631–644 (2015)
304. Zhang, K., Teo, K.L., Swartz, M.: A robust numerical scheme for pricing
American options under regime switching based on penalty method.
Comput. Econ. 43, 463–483 (2014)
305. Zhang, K., Wang, S., Yang, X.Q., Teo, K.L.: A power penalty approach
to numerical solutions of two-asset American options. Numer. Math.
Theory Methods Appl. 2(2), 202–223 (2009)
306. Zhang, K., Wang, S., Yang, X.Q., Teo, K.L.: Numerical performance of
penalty method for American option pricing. Optim. Methods Softw.
25(5), 737–752 (2010)
307. Zhang, K., Yang, X.Q., Teo, K.L.: Augmented Lagrangian method ap-
plied to American option pricing. Automatica 42, 1407–1416 (2006)
308. Zhang, K., Yang, X.Q., Teo, K.L.: A power penalty approach to Amer-
ican option pricing with jump diffusion processes. J. Ind. Manage.
Optim. 4, 783–799 (2008)
309. Zhang, K., Yang, X.Q., Teo, K.L.: Convergence analysis of a monotonic
penalty method for American option pricing. J. Math. Anal. Appl. 348,
915–926 (2008)
310. Zhong, W.F., Lin, Q., Loxton, R., Teo, K.L.: Optimal train control via
switched system dynamic optimization. Optim. Methods Softw. (2019).
https://doi.org/10.1080/10556788.2019.1604704
311. Zhou, J.Y., Teo, K.L., Zhou, D., Zhao, G.H.: Optimal guidance for
lunar module soft landing. Nonlinear Dyn. Syst. Theory 10(2), 189–
201 (2010)
312. Zhou, J.Y., Teo, K.L., Zhou, D., Zhao, G.H.: Nonlinear optimal feed-
back control for lunar module soft landing. J. Glob. Optim. 52(2),
211–227 (2012)
