
Tröltzsch OptimalControlPDE AMS Providence (2010)

This book on optimal control of partial differential equations focuses on minimizing cost functions for systems described by elliptic and parabolic equations. It covers essential topics such as existence of optimal solutions, necessary optimality conditions, and numerical techniques, making it suitable for advanced undergraduates and beginning graduate students. The text also addresses nonlinear control problems and includes a survey on the Karush-Kuhn-Tucker theory, providing a comprehensive introduction to the mathematical theory and its applications.


Optimal Control of Partial Differential Equations
Theory, Methods and Applications

Fredi Tröltzsch

Optimal control theory is concerned with finding control functions
that minimize cost functions for systems described by differential
equations. The methods have found widespread applications in
aeronautics, mechanical engineering, the life sciences, and many
other disciplines.
This book focuses on optimal control problems where the state
equation is an elliptic or parabolic partial differential equation.
Included are topics such as the existence of optimal solutions,
necessary optimality conditions and adjoint equations, second-order
sufficient conditions, and main principles of selected numerical
techniques. It also contains a survey on the Karush-Kuhn-Tucker
theory of nonlinear programming in Banach spaces.
The exposition begins with control problems with linear equations,
quadratic cost functions and control constraints. To make the book
self-contained, basic facts on weak solutions of elliptic and
parabolic equations are introduced. Principles of functional analysis
are introduced and explained as they are needed. Many simple examples
illustrate the theory and its hidden difficulties. This start to the
book makes it fairly self-contained and suitable for advanced
undergraduates or beginning graduate students.
Advanced control problems for nonlinear partial differential
equations are also discussed. As prerequisites, results on
boundedness and continuity of solutions to semilinear elliptic and
parabolic equations are addressed. These topics are not yet readily
available in books on PDEs, making the exposition also interesting
for researchers.
Alongside the main theme of the analysis of problems of optimal
control, Tröltzsch also discusses numerical techniques. The
exposition is confined to brief introductions into the basic ideas in
order to give the reader an impression of how the theory can be
realized numerically. After reading this book, the reader will be
familiar with the main principles of the numerical analysis of
PDE-constrained optimization.

Graduate Studies in Mathematics
Volume 112
ISBN 978-0-8218-4904-0

American Mathematical Society


Optimal Control of Partial
Differential Equations
Theory, Methods and
Applications

Fredi Tröltzsch

Translated by Jürgen Sprekels

Graduate Studies
in Mathematics
Volume 112

American Mathematical Society


Providence, Rhode Island
EDITORIAL COMMITTEE
David Cox (Chair)
Steven G. Krantz
Rafe Mazzeo
Martin Scharlemann

Originally published in German by Friedr. Vieweg & Sohn Verlag, 65189 Wiesbaden,
Germany, under the title: "Fredi Tröltzsch: Optimale Steuerung partieller
Differentialgleichungen." 1. Auflage (1st edition). © Friedr. Vieweg & Sohn
Verlag/GWV Fachverlage GmbH, Wiesbaden, 2005

Translated by Jürgen Sprekels

2000 Mathematics Subject Classification. Primary 49-01, 49K20, 35J65, 35K60, 90C48,
35B37.

To my wife Silvia
For additional information and updates on this book, visit
www.ams.org/bookpages/gsm-112

Library of Congress Cataloging-in-Publication Data

Tröltzsch, Fredi, 1951-
[Optimale Steuerung partieller Differentialgleichungen. English]
Optimal control of partial differential equations : theory, methods and applications / Fredi
Tröltzsch.
p. cm. - (Graduate studies in mathematics ; v. 112)
Includes bibliographical references and index.
ISBN 978-0-8218-4904-0 (alk. paper)
1. Control theory. 2. Differential equations, Partial. 3. Mathematical optimization. I. Title.

QA402.3.T71913 2010
515'.642-dc22
2009037756

Copying and reprinting. Individual readers of this publication, and nonprofit libraries
acting for them, are permitted to make fair use of the material, such as to copy a chapter for use
in teaching or research. Permission is granted to quote brief passages from this publication in
reviews, provided the customary acknowledgment of the source is given.
Republication, systematic copying, or multiple reproduction of any material in this publication
is permitted only under license from the American Mathematical Society. Requests for such
permission should be addressed to the Acquisitions Department, American Mathematical Society,
201 Charles Street, Providence, Rhode Island 02904-2294 USA. Requests can also be made by
e-mail to reprint-permission@ams.org.
© 2010 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights
except those granted to the United States Government.
Printed in the United States of America.
The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/
10 9 8 7 6 5 4 3 2 1    15 14 13 12 11 10
Contents

Preface to the English edition

Preface to the German edition

Chapter 1. Introduction and examples
§1.1. What is optimal control?
§1.2. Examples of convex problems
§1.3. Examples of nonconvex problems
§1.4. Basic concepts for the finite-dimensional case

Chapter 2. Linear-quadratic elliptic control problems
§2.1. Normed spaces
§2.2. Sobolev spaces
§2.3. Weak solutions to elliptic equations
§2.4. Linear mappings
§2.5. Existence of optimal controls
§2.6. Differentiability in Banach spaces
§2.7. Adjoint operators
§2.8. First-order necessary optimality conditions
§2.9. Construction of test examples
§2.10. The formal Lagrange method
§2.11. Further examples *
§2.12. Numerical methods
§2.13. The adjoint state as a Lagrange multiplier *
§2.14. Higher regularity for elliptic problems
§2.15. Regularity of optimal controls
§2.16. Exercises

Chapter 3. Linear-quadratic parabolic control problems
§3.1. Introduction
§3.2. Fourier's method in the spatially one-dimensional case
§3.3. Weak solutions in $W_2^{1,0}(Q)$
§3.4. Weak solutions in $W(0, T)$
§3.5. Parabolic optimal control problems
§3.6. Necessary optimality conditions
§3.7. Numerical methods
§3.8. Derivation of Fourier expansions
§3.9. Linear continuous functionals as right-hand sides
§3.10. Exercises

Chapter 4. Optimal control of semilinear elliptic equations
§4.1. Preliminary remarks
§4.2. A semilinear elliptic model problem
§4.3. Nemytskii operators
§4.4. Existence of optimal controls
§4.5. The control-to-state operator
§4.6. Necessary optimality conditions
§4.7. Application of the formal Lagrange method
§4.8. Pontryagin's maximum principle
§4.9. Second-order derivatives
§4.10. Second-order optimality conditions
§4.11. Numerical methods
§4.12. Exercises

Chapter 5. Optimal control of semilinear parabolic equations
§5.1. The semilinear parabolic model problem
§5.2. Basic assumptions for the chapter
§5.3. Existence of optimal controls
§5.4. The control-to-state operator
§5.5. Necessary optimality conditions
§5.6. Pontryagin's maximum principle
§5.7. Second-order optimality conditions
§5.8. Test examples
§5.9. Numerical methods
§5.10. Further parabolic problems *
§5.11. Exercises

Chapter 6. Optimization problems in Banach spaces
§6.1. The Karush-Kuhn-Tucker conditions
§6.2. Control problems with state constraints
§6.3. Exercises

Chapter 7. Supplementary results on partial differential equations
§7.1. Embedding results
§7.2. Elliptic equations
§7.3. Parabolic problems

Bibliography

Index

Preface to the English edition

In addition to correcting some misprints and inaccuracies in the German
edition, some parts of this book were revised and expanded. The sections
dealing with gradient methods were shortened in order to make space for
primal-dual active set strategies; the exposition of the latter now leads to the
systems of linear equations to be solved. Following the suggestions of several
readers, a derivation of the associated Green's functions is provided, using
Fourier's method. Moreover, some references are discussed in greater detail,
and some recent references on the numerical analysis of state-constrained
problems have been added.
The sections marked with an asterisk may be skipped; their contents
are not needed to understand the subsequent sections. Within the text, the
reader will find formulas in framed boxes. Such formulas contain either
results of special importance or the partial differential equations being studied
in that section.
I am indebted to all readers who have pointed out misprints and
supplied me with suggestions for improvements, in particular Roland Herzog,
Markus Müller, Hans Josef Pesch, Lothar v. Wolfersdorf, and Arnd Rösch.
Thanks are also due to Uwe Prüfert for his assistance with the LaTeX
typesetting. In the revision of the results on partial differential equations, I was
supported by Eduardo Casas and Jens Griepentrog; I am very grateful for
their cooperation. Special thanks are due to Jürgen Sprekels for his careful
and competent translation of this textbook into English. His suggestions
have left their mark in many places. Finally, I have to thank Mrs. Jutta
Lohse for her careful proofreading of the English translation.

Berlin, July 2009

F. Tröltzsch

Preface to the German edition

The mathematical optimization of processes governed by partial differential
equations has seen considerable progress in the past decade. Ever faster
computational facilities and newly developed numerical techniques have opened
the door to important practical applications in fields such as fluid flow,
microelectronics, crystal growth, vascular surgery, and cardiac medicine, to
name just a few. As a consequence, the communities of numerical analysts
and optimizers have taken a growing interest in applying their methods
to optimal control problems involving partial differential equations; at the
same time, the demand from students for this expertise has increased, and
there is a growing need for textbooks that provide an introduction to the
fundamental concepts of the corresponding mathematical theory.
There are a number of monographs devoted to various aspects of the
optimal control of partial differential equations. In particular, the
comprehensive text by J. L. Lions [Lio71] covers much of the theory of linear equations
and convex cost functionals. However, the interest in the class notes of my
lectures held at the technical universities in Chemnitz and Berlin revealed
a clear demand for an introductory textbook that also includes aspects of
nonlinear optimization in function spaces.
The present book is intended to meet this demand. We focus on basic
concepts and notions such as:

• Existence theory for linear and semilinear partial differential equations
• Existence of optimal controls
• Necessary optimality conditions and adjoint equations
• Second-order sufficient optimality conditions
• Foundation of numerical methods

In this connection, we will always impose constraints on the control
functions, and sometimes also on the state of the system under study. In order
to keep the exposition to a reasonable length, we will not address further
important subjects such as controllability, Riccati equations, discretization,
error estimates, and Hamilton-Jacobi-Bellman theory.
The first part of the textbook deals with convex problems involving
quadratic cost functionals and linear elliptic or parabolic equations. While
these results are rather standard and have been treated comprehensively
in [Lio71], they are well suited to facilitating the transition to problems
involving semilinear equations. In order to make the theory more accessible
to readers having only minor knowledge of these fields, some basic notions
from functional analysis and the theory of linear elliptic and parabolic partial
differential equations will also be provided.
The focus of the exposition is on nonconvex problems involving
semilinear equations. Their treatment requires new techniques from analysis,
optimization, and numerical analysis, which to a large extent can presently
be found only in original papers. In particular, fundamental results due to
E. Casas and J.-P. Raymond concerning the boundedness and continuity of
solutions to semilinear equations will be needed.
This textbook is mainly devoted to the analysis of the problems,
although numerical techniques will also be addressed. Numerical methods
could easily fill another book. Our exposition is confined to brief
introductions to the basic ideas, in order to give the reader an impression of how the
theory can be realized numerically. Much attention will be paid to revealing
hidden mathematical difficulties that, as experience shows, are likely to be
overlooked.
The material covered in this textbook will not fit within a one-term
course, so the lecturer will have to select certain parts. One possible
strategy is to confine oneself to elliptic theory (linear-quadratic and nonlinear),
while neglecting the chapters on parabolic equations. This would amount
to concentrating on Sections 1.2-1.4, 2.3-2.10, and 2.12 for linear-quadratic
theory, and on Sections 4.1-4.6 and 4.8-4.10 for nonlinear theory. The
chapters devoted to elliptic problems do not require results from parabolic theory
as a prerequisite.
Alternatively, one could select the linear-quadratic elliptic theory and
add Sections 3.3-3.7 on linear-quadratic parabolic theory. Further topics
can also be covered, provided that the students have a sufficient working
knowledge of functional analysis and partial differential equations.
The sections marked with an asterisk may be skipped; their contents
are not needed to understand the subsequent sections. Within the text, the
reader will find formulas in framed boxes. Such formulas contain either
results of special importance or the partial differential equations being studied
in that section.
During the process of writing this book, I received much support from
many colleagues. M. Hinze, P. Maaß, and L. v. Wolfersdorf read various
chapters, in parts jointly with their students. W. Alt helped me with
the typographical aspects of the exposition, and the first impetus to
writing this textbook came from T. Grund, who put my class notes into a
first LaTeX version. My colleagues C. Meyer, U. Prüfert, T. Slawig, and
D. Wachsmuth in Berlin, and my students I. Neitzel and I. Yousept,
proofread the final version. I am indebted to all of them. I also thank Mrs.
U. Schmickler-Hirzebruch and Mrs. P. Rußkamp of Vieweg-Verlag for their
very constructive cooperation during the preparation and implementation
of this book project.

Berlin, April 2005

F. Tröltzsch

Chapter 1

Introduction and
examples

1.1. What is optimal control?

The mathematical theory of optimal control has in the past few decades
rapidly developed into an important and separate field of applied mathe-
matics. One area of application of this theory lies in aviation and space
technology: aspects of optimization come into play whenever the motion of
an aircraft or a space vessel (which can be modeled by ordinary differen-
tial equations) has to follow a trajectory that is "optimal" in a sense to be
specified.
Let us explain this by a simple example: a vehicle that at time t = 0 is
at the space point A moves along a straight line and stops at time T > 0 at
another point B on that line. Suppose that the vehicle can be accelerated
along the line in either direction by a variable force whose maximal strength
is the same in both directions. For example, this situation might represent
a jet engine that can be switched between forward and backward thrust.
What is the minimal time T > 0 needed for the travel, provided that the
available thrust u(t) at time t is subject to the constraint $-1 \le u(t) \le 1$?
Here, u(t) = +1 (respectively, u(t) = -1) corresponds to maximal forward
(respectively, backward) acceleration.
To model this situation, let y(t) denote the position of the vehicle at time
t, m the mass of the vehicle (which is assumed to remain constant during the
process), and $y_0, y_T \in \mathbb{R}$ the points corresponding to the positions A and B.
The mathematical problem then reads as follows:

Minimize $T > 0$, subject to the constraints
\[
\begin{aligned}
m\,y''(t) &= u(t), \\
y(0) &= y_0, \quad y'(0) = 0, \\
y(T) &= y_T, \quad y'(T) = 0, \\
|u(t)| &\le 1 \quad \forall\, t \in [0, T].
\end{aligned}
\]
The above problem, which is referred to as the rocket car in the textbook
by Macki and Strauss [MS82], exhibits all the essential features of an optimal
control problem:
• a cost functional to be minimized (here, the time $T > 0$ needed for
the travel),
• an initial value problem for a differential equation (here, $m\,y'' = u$,
$y(0) = y_0$, $y'(0) = 0$) describing the motion, in order to determine
the state y,
• a control function u, and
• various constraints (here, $y(T) = y_T$, $y'(T) = 0$, $|u| \le 1$) that have
to be obeyed.
The control u may be freely chosen within the given constraints (e.g., for
the rocket car, by stepping on the gas or the brake pedal), while the state is
uniquely determined by the differential equation and the initial conditions.
We have to choose u in such a way that the cost function is minimized.
Such controls are called optimal. In the case of the rocket car, intuition
immediately tells us what the optimal choice should be. For this reason, this
example is often used to test theoretical results.
The optimal control of ordinary differential equations is of interest not
only for aviation and space technology. In fact, it is also important in fields
such as robotics, movement sequences in sports, and the control of chemical
processes and power plants, to name just a few of the various applications.
In many cases, however, the processes to be optimized can no longer be
adequately modeled by ordinary differential equations; instead, partial
differential equations have to be employed for their description. For instance, heat
conduction, diffusion, electromagnetic waves, fluid flows, freezing processes,
and many other physical phenomena can be modeled by partial differential
equations.
In these fields, there are numerous interesting problems in which a given
cost functional has to be minimized subject to a differential equation and
certain constraints being satisfied. The difference from the above problem
"merely" consists of the fact that a partial differential equation has to be
dealt with in place of an ordinary one. In this textbook, we will discuss,
through examples in the form of mathematically simplified case studies, the
optimal control of heating processes, two-phase problems, and fluid flows.
There are many types of partial differential equations. Here, we focus
on linear and semilinear elliptic and parabolic partial differential equations,
since a satisfactory regularity theory is available for the solutions to such
equations. This is not the case for hyperbolic equations. Also, the treatment
of quasilinear partial differential equations is considerably more difficult, and
the theory of their optimal control is still an open field in many respects.
We begin our study with problems involving linear equations and
quadratic cost functionals. To this end, we introduce simple model problems in
the next section. In the following chapters, they will repeatedly serve as
illustrations of theoretical results. This analysis will be facilitated by the fact
that the Hilbert space setting suffices as a functional analytic framework in
the case of linear-quadratic theory. The later chapters deal with semilinear
equations. Here, the examples under study will be less academic. Owing
to the presence of nonlinearities, the mathematical analysis will have to be
more delicate.

1.2. Examples of convex problems

1.2.1. Optimal stationary heating.

Optimal boundary heating. Let us consider a body that is to be heated
or cooled and which occupies the spatial domain $\Omega \subset \mathbb{R}^3$. We apply to its
boundary $\Gamma$ a heat source u (the control), which is constant in time but
depends on the location x on the boundary, that is, u = u(x). Our aim
is to choose the control in such a way that the corresponding temperature
distribution y = y(x) in $\Omega$ (the state) is the best possible approximation to a
desired stationary temperature distribution $y_\Omega = y_\Omega(x)$ in $\Omega$. We can model
this in the following way:
\[
\min J(y, u) := \frac{1}{2} \int_\Omega |y(x) - y_\Omega(x)|^2 \, dx + \frac{\lambda}{2} \int_\Gamma |u(x)|^2 \, ds(x),
\]
subject to the state equation
\[
\begin{aligned}
-\Delta y &= 0 && \text{in } \Omega \\
\frac{\partial y}{\partial \nu} &= \alpha\,(u - y) && \text{on } \Gamma
\end{aligned}
\]
and the pointwise control constraints
\[
u_a(x) \le u(x) \le u_b(x) \quad \text{on } \Gamma.
\]
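To make the structure of the boundary heating problem concrete, the following is a minimal sketch of a one-dimensional analogue. It is not taken from the book: it assumes $\Omega = (0, 1)$ and $\Gamma = \{0, 1\}$, so the control consists of two numbers $u_0, u_1$ and the boundary integral becomes a sum over the two end points. For $-y'' = 0$ with the Robin conditions above, the state is affine in x and can be written in closed form. The reduced problem is then solved by a projected gradient iteration, chosen here only for brevity. The function name, the parameter defaults, and the midpoint quadrature are assumptions of this sketch.

```python
import numpy as np

def solve_boundary_heating_1d(y_target, lam=0.1, alpha=1.0,
                              ua=0.0, ub=2.0, n=200, iters=500):
    """Projected gradient for a 1D analogue of the boundary heating problem.

    Hypothetical helper: minimizes
        1/2 * int_0^1 (y - y_target)^2 dx + lam/2 * (u0^2 + u1^2)
    subject to -y'' = 0 on (0, 1), Robin conditions with coefficient alpha
    at x = 0 and x = 1, and the box constraints ua <= u0, u1 <= ub.
    """
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h              # cell midpoints in (0, 1)
    # Closed-form state for controls (u0, u1):
    #   y(x) = (1 - w(x)) * u0 + w(x) * u1,  w(x) = (1 + alpha*x) / (2 + alpha)
    w = (1.0 + alpha * x) / (2.0 + alpha)
    S = np.column_stack([1.0 - w, w])         # discrete control-to-state map
    y_om = y_target(x)
    # Lipschitz constant of the gradient of the reduced cost functional
    L = h * np.linalg.norm(S, 2) ** 2 + lam
    u = np.clip(np.zeros(2), ua, ub)
    for _ in range(iters):
        grad = h * S.T @ (S @ u - y_om) + lam * u
        u = np.clip(u - grad / L, ua, ub)     # gradient step, then projection
    return u, S @ u

if __name__ == "__main__":
    u, y = solve_boundary_heating_1d(lambda x: np.ones_like(x))
    print("optimal boundary controls:", u)
```

For the constant target $y_\Omega \equiv 1$ and inactive bounds, symmetry gives $u_0 = u_1 = 1/(1 + 2\lambda)$, which offers a quick check of the iteration. Lowering `ub` below that value makes the upper bound active at both end points.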
Such pointwise bounds for the control are quite natural, since the
available capacities for heating or cooling are usually restricted. The constant
$\lambda > 0$ can be viewed as a measure of the energy costs needed to implement
the control u. From the mathematical viewpoint, this term also serves as
a regularization parameter; it has the effect that possible optimal controls
show improved regularity properties.
Throughout this textbook, we will denote the element of surface
area by ds and the outward unit normal to $\Gamma$ at $x \in \Gamma$ by $\nu(x)$. The
function $\alpha$ represents the heat transmission coefficient from $\Omega$ to the
surrounding medium. The functional J to be minimized is called the cost
functional. The factor 1/2 appearing in it has no influence on the solution
of the problem. It is introduced just for the sake of convenience: it will
later cancel out a factor 2 arising from differentiation. We seek an optimal
control u = u(x) together with the associated state y = y(x). The minus sign
in front of the Laplacian $\Delta$ appears to be unmotivated at first glance. It is
introduced because $\Delta$ is not a coercive operator, while $-\Delta$ is.

Figure: Boundary control.

Observe that in the above problem the cost functional is quadratic, the
state is governed by a linear elliptic partial differential equation, and the
control acts on the boundary of the domain. We thus have a linear-quadratic
elliptic boundary control problem.

Remark. The problem is strongly simplified. Indeed, in a realistic model
Laplace's equation $\Delta y = 0$ has to be replaced by the stationary heat conduction
equation div (a grad y) = 0, where the coefficient a can depend on x or even on
y. If a = a(y) or a = a(x, y), then the partial differential equation is quasilinear.
In addition, it will in many cases be more natural to describe the process by a
time-dependent partial differential equation.

Optimal heat source. In a similar way, the control can act as a heat
source in the domain $\Omega$. Problems of this kind arise if the body $\Omega$ is heated
by electromagnetic induction or by microwaves. Assuming at first that the
boundary temperature vanishes, we obtain the following problem:
\[
\min J(y, u) := \frac{1}{2} \int_\Omega |y(x) - y_\Omega(x)|^2 \, dx + \frac{\lambda}{2} \int_\Omega |u(x)|^2 \, dx,
\]
subject to
\[
\begin{aligned}
-\Delta y &= \beta\,u && \text{in } \Omega \\
y &= 0 && \text{on } \Gamma
\end{aligned}
\]
and
\[
u_a(x) \le u(x) \le u_b(x) \quad \text{in } \Omega.
\]
Here, the coefficient $\beta = \beta(x)$ is prescribed. Observe that by the special
choice $\beta = \chi_{\Omega_c}$ (where $\chi_E$ denotes the characteristic function of a set E),
it can be achieved that u acts only in a subdomain $\Omega_c \subset \Omega$. This problem
is a linear-quadratic elliptic control problem with distributed control. It
can be more realistic to prescribe an exterior temperature $y_a$ rather than
assume that the boundary temperature vanishes. Then a better model
is given by the state equation
\[
\begin{aligned}
-\Delta y &= \beta\,u && \text{in } \Omega \\
\frac{\partial y}{\partial \nu} &= \alpha\,(y_a - y) && \text{on } \Gamma.
\end{aligned}
\]

Figure: Distributed control.

1.2.2. Optimal nonstationary boundary control. Let $\Omega \subset \mathbb{R}^3$
represent a potato that is to be roasted over a fire for some period of time $T > 0$.
We denote its temperature by y = y(x, t), with $x \in \Omega$, $t \in [0, T]$.
Initially, the potato has temperature $y_0 = y_0(x)$, and we want to serve it at a
pleasant palatable temperature $y_\Omega$ at the final time T. We now introduce
notation that will be used throughout this book: we write $Q := \Omega \times (0, T)$
and $\Sigma := \Gamma \times (0, T)$. The problem then reads as follows:
\[
\min J(y, u) := \frac{1}{2} \int_\Omega |y(x, T) - y_\Omega(x)|^2 \, dx + \frac{\lambda}{2} \int_0^T \!\! \int_\Gamma |u(x, t)|^2 \, ds(x)\, dt,
\]
subject to
\[
\begin{aligned}
y_t - \Delta y &= 0 && \text{in } Q \\
\frac{\partial y}{\partial \nu} &= \alpha\,(u - y) && \text{on } \Sigma \\
y(x, 0) &= y_0(x) && \text{in } \Omega
\end{aligned}
\]
and
\[
u_a(x, t) \le u(x, t) \le u_b(x, t) \quad \text{on } \Sigma.
\]

By continued turning of the spit, we produce u(x, t). The heating process As a consequence, many of the techniques presented in this book fail in the
has to be described by the nonstationary hect equatíon, which is a parabolic hyperbolic case.
differential equation. We thus have to deal with a linear-quadratic parabolic
boundary control problem. Here and throughout this textbook, yt denotes 1.3. Examples of nonconvex problems
the partial derivative of y with respect to t.
So far, we have only considered linear differential equations. However, lin-
1.2.3. Optimal vibrations . Suppose that a group of pedestrians crosses ear models do not suffice for many real-world phenomena. Instead, one
a bridge, trying to excite oscillations in it. This can be modeled (strongly often needs quasilinear or, much simpler, semilinear equations. Recall that
abstracted) as follows: let 9 C IR2 denote the domain of the bridge, y = a second-order equation is called semilinear if the main parts (that is, the
y(x, t) its transversal displacement, u = u(x, t) the force density acting in expressions involving highest-order derivatives) of the differential operators
the vertical direction, and Yd = yd(x, t) a desired evolution of the transversal considered in the domain and on the boundary are linear with respect to the
vibrations. We then1
2 tain1 ob the optimal control problem desired solution. For such equations, the theory of optimal control is well
developed.
T
min J(y, u) 1 y(x, t) yd(x, t) I2 dx dt + ' f f Ju(x,t) I2 dx dt, Optimal control problems with semilinear state equations are, as a rule,
t nonconvex, even if the cost functional is convex. In the following section, we
subject to will discuss examples of semilinear state equations. Associated optimal con-
trol problems can be obtained by prescribing a cost functional and suitable
ytt - Dy = u in Q
constraints.
y(0) = yo in 52
yt(0) = y1 in íl 1.3.1. Problems involving semilinear elliptic equations.
y = 0 on E
Heating with radiation boundary condition . If the heat radiation of
and the heated body is taken into account, then we obtain a problem with a
ua(x,t) < U(x,t) < ub(x,t) in Q. nonlinear Stefan-Boltzmann boundary condition. In this case, the control u
This is a linear-quadratic hyperbolic control problem with distributed con- is given by the temperature of the surrounding medium:
trol. -Dy = 0 in S2
Since hyperbolic problems óy = /
c (u4 - y4) on F.
are not the subject of this text- óv
book, we refer the interested
In this example, the nonlinearity y4 occurs in the boundary condition, while
reader to the standard mono-
the hect conduction equation itself is linear.
graph by Lions [Lio71], as well
as to Ahmed and Teo [AT81]. Simplified superconductivity. The following simplified (Ginzburg-Lan-
Interesting control problems for dau) model for superconductivity was considered by Ito and Kunisch [IK96]
oscillating elastic networks have to test numerical methods for optimal control problems:
been treated by Lagnese et al.
[LLS94]. An elementary intro- -Dy,
- y + y3 = u in St
Excitation of vibrations. on F.
duction to the controllability of yJr = 0
oscillations can be found in the textbook by Krabs [ Kra95]. For analytic reasons, we will later discuss the simpler equation -Ay+y+y3 =
In the linear-quadratic case, the theory of hyperbolic problems has many u, which is also of interest in the theory of superconductivity; see [IK96].
similarities to the parabolic theory studied in this textbook. However, the
treatment of semilinear hyperbolic problems is much more difficult, since Control of stationary flows. Stationary flows of incompressible media in
the smoothing properties of the associated solution operators are weaker. two- or three-dimensional spatial domains S2 are described by the stationary
8 1. Introduction and examples 1.4. Basic concepts for the finite-dimensional case 9

Navier-Stokes equations following type:

P
+(u• V)u+Vp = f in 9 ut+2-Ot = sLu+f in Q
1 Au
u = 0 on F Twt = X20 p + g(p) + 2u in Q
div u = 0 in St; au 0ap
av
, av =
o on

see Temam [Tem79 ] and Galdi [Ga194 ]. Here, in contrast to the notation
u(., 0) = uo, cp(-, 0) = cpo in Q.
used so far , u = u(x) E R3 denotes the velocity vector of the particle located
at the space point x ; moreover , p = p(x) and f = f (x) represent the pressure In a liquid-solid transition, the quantity u = u(x, t) represents a temperature,
and the density of the volume force , respectively. The constant Re is called and the so-called phase function cp = cp(x, t) E [-1, 1] describes the degree
the Reynolds number. In this example, f is the control , and the nonlinearity of solidification, where {cp = 1} and {cp = -1} correspond to the liquid
arises from the first -order differential operator (u • V) being applied to u, and solid phases, respectively. The function f represents a controllable heat
which results in (with Di denoting the partial derivative with respect to xi) source, and -g is the derivative of a so-called "double well" potential G. One
a Diui standard form for G is G(z) = 1(z2 - 1)2. In many applications g has the
(u - V) u = u1 Dlu + u2D2u + u3D3u = ni Diu2 form g(z) = az+bz2-cz3, with bounded coefficient functions a, b, and c > 0.
i=1 Diu3 For the precise physical meaning of the quantities rc, P, •r, and ^ we refer the
interested reader to Section 4.4 in the monograph by Brokate and Sprekels
The aboye mathematical model is of particular interest in relation to electri-
[BS96].
cally conducting fluids that can be influenced by magnetic fields. A possible
In this example, the target of optimization could be the approximation
target for the optimization could be the realization of a desired stationary
of a desired evolution of the melting/solidification process. First results for
flow pattern.
related control problems have been published by Chen and Hoffmann [CH91]
1.3.2. Problems involving sernilinear parabolic equations. and by Hoffmann and Jiang [M92].

The examples from Section 1.3.1. Both of the examples involving semi- Control of nonstationary flows. Nonstationary flows of incompressible
linear elliptic equations discussed in Section 1.3.1 can be formulated in non- fluids are described by the nonstationary Navier-Stokes equations
stationary form. The first example leads to a parabolic initial-boundary
value problem with Stefan-Boltzmann condition for the temperature y(x, t): ut-ReDu+(u•V)u+Vp = f in Q
div u = 0 in Q
yt - Ay 0 in Q
ay u = 0 onE
a (u4 - y4) on E
av u(•, 0) = uo in Q.

y(., 0) 0 in9. Here, a volume force f acts on the Huid, whose velocity is initially equal
An optimal control problem for a system of this type was initially investigated to uo and is zero at the boundary ("no-slip condition"). Depending on the
by Sachs [Sac78]; see also Schmidt [Sch89]. Similarly, a nonstationary particular circumstances, other boundary conditions may also be of interest.
analogue of the simplified model for superconductivity can be studied: One of the first contributions to the mathematical theory of optimal control
of fluid flows is due to Abergel and Temam [AT90].
yt-Dy-y+y3 = u in Q
vIr = 0 on E
y(., 0) = 0 in 9. 1.4. Basic concepts for the finite-dimensional case

Some fundamental concepts of optimal control theory can easily be explained


A phase field model. Many phase change phenomena (e.g., melting or by considering optirization problems in Euclidean space with finitely many
solidification) can be modeled by systems of phase field equations of the equality constraints. A little detour finto finite-dimensional optimization has
10 1. Introduction and examples 1.4. Basic concepts for the finite-dimensional case 11

the advantage that the basic ideas will not be complicated by technical details
from partial differential equations or functional analysis.

1.4.1. Finite-dimensional optimal control problems. Suppose that
$J = J(y, u)$, $J : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, denotes a cost functional to be minimized,
and that an $n \times n$ matrix $A$, an $n \times m$ matrix $B$, and a nonempty set
$U_{ad} \subset \mathbb{R}^m$ are given (where "ad" stands for "admissible"). We consider the
optimization problem

(1.1)    $\min J(y, u), \qquad A y = B u, \quad u \in U_{ad}.$

We seek vectors $y$ and $u$ minimizing the cost functional $J$ subject to
the constraints $A y = B u$ and $u \in U_{ad}$. In this connection, we introduce
the following convention: Unless specified otherwise, throughout this book
vectors will always be regarded as column vectors.

Example. Often quadratic cost functionals are used, for instance

$J(y, u) = |y - y_d|^2 + \lambda\, |u|^2,$

where $|\cdot|$ denotes the Euclidean norm. $\diamond$

As it stands, (1.1) is a standard optimization problem in which the unknowns
$y$ and $u$ play similar roles. But this situation changes if we make
the additional assumption that the matrix $A$ has an inverse $A^{-1}$. Indeed,
we can then solve for $y$ in (1.1), obtaining

(1.2)    $y = A^{-1} B\, u,$

and for any $u \in \mathbb{R}^m$ there is a uniquely determined solution $y \in \mathbb{R}^n$; that
is, we may choose (i.e., "control") $u$ in an arbitrary way to produce the
associated $y$ as a dependent quantity. We therefore call $u$ the control vector
or, for short, the control, and $y$ the associated state vector or state. In this
way, (1.1) becomes a finite-dimensional optimal control problem.
Next, we introduce the solution matrix of our control system

$S : \mathbb{R}^m \to \mathbb{R}^n, \qquad S = A^{-1} B.$

Then $y = S u$, and, owing to (1.2), we can eliminate $y$ from $J$ to obtain the
reduced cost functional $f$,

$J(y, u) = J(S u, u) =: f(u).$

For instance, for the quadratic function in the above example we get
$f(u) = |S u - y_d|^2 + \lambda\, |u|^2$. The problem (1.1) then becomes the nonlinear optimization
problem

(1.3)    $\min f(u), \quad u \in U_{ad}.$

In this reduced problem only the control $u$ appears as an unknown.
In the following sections, we will discuss some basic ideas that will be
repeatedly encountered in similar forms in the optimal control of partial
differential equations.

1.4.2. Existence of optimal controls.

Definition. A vector $\bar u \in U_{ad}$ is called an optimal control for problem
(1.1) if $f(\bar u) \le f(u)$ for all $u \in U_{ad}$; then $\bar y := S \bar u$ is called the optimal
state associated with $\bar u$.

Optimal or locally optimal quantities will be indicated by overlining, as
in $\bar u$.

Theorem 1.1. Suppose that $J$ is continuous on $\mathbb{R}^n \times U_{ad}$ and that the set
$U_{ad}$ is nonempty, bounded, and closed. If the matrix $A$ is invertible, then
(1.1) has at least one solution.

Proof. Obviously, the continuity of $J$ implies that $f$ is also continuous on $U_{ad}$.
Moreover, as a bounded and closed set in a finite-dimensional space, $U_{ad}$ is
compact. By the well-known Weierstrass theorem, $f$ attains its minimum in
$U_{ad}$. Hence, there is some $\bar u \in U_{ad}$ such that $f(\bar u) = \min_{u \in U_{ad}} f(u)$. $\Box$

This proof becomes more complicated in the case of optimal control
problems for partial differential equations, since bounded and closed sets
need not be compact in (infinite-dimensional) function spaces.

1.4.3. First-order necessary optimality conditions. In this section, we
investigate what conditions the optimal vectors $\bar u$ and $\bar y$ must satisfy. We
do this in the hope that we will be able to extract enough information from
these conditions to determine $\bar u$ and $\bar y$. Usually, this will have to be done
using numerical methods.

Notation. We use the following notation for the derivatives of functions
$f : \mathbb{R}^m \to \mathbb{R}$:

$D_i = \dfrac{\partial}{\partial x_i}, \quad D_x = \dfrac{\partial}{\partial x}$    (partial derivatives)
$f'(x) = (D_1 f(x), \ldots, D_m f(x))$    (derivative)
$\nabla f(x) = f'(x)^T$    (gradient)

where $T$ stands for transposition. For functions $f = f(x, y) : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}$,
we denote by $D_x f$ the row vector of partial derivatives of $f$ with respect to
$x_1, \ldots, x_m$, and by $\nabla_x f$ the corresponding column vector. The expressions
$D_y f$ and $\nabla_y f$ are defined in a similar way. Moreover,

$(u, v)_{\mathbb{R}^m} = u \cdot v = \sum_{i=1}^{m} u_i v_i$

denotes the standard Euclidean scalar product in $\mathbb{R}^m$. For the sake of convenience,
we will use both kinds of notation for the scalar product between
vectors. The application of $f'(u)$ to a column vector $h \in \mathbb{R}^m$, denoted by
$f'(u)\, h$, coincides with the directional derivative of $f$ in the direction $h$,

$f'(u)\, h = (\nabla f(u), h)_{\mathbb{R}^m} = \nabla f(u) \cdot h.$

We now make the additional assumption that the cost functional $J$ is continuously
differentiable with respect to $y$ and $u$; that is, the partial derivatives
$D_y J(y, u)$ and $D_u J(y, u)$ with respect to $y$ and $u$ are continuous in
$(y, u)$. Then, by virtue of the chain rule, $f(u) = J(S u, u)$ is continuously
differentiable.

Example. Suppose that $f(u) = \frac{1}{2} |S u - y_d|^2 + \frac{\lambda}{2} |u|^2$. Then it follows that

$\nabla f(u) = S^T (S u - y_d) + \lambda u, \qquad f'(u) = (S^T (S u - y_d) + \lambda u)^T,$
$f'(u)\, h = (S^T (S u - y_d) + \lambda u, h)_{\mathbb{R}^m}.$ $\diamond$

Theorem 1.2. Let $U_{ad}$ be convex. Then any optimal control $\bar u$ for (1.1)
satisfies the variational inequality

(1.4)    $f'(\bar u)(u - \bar u) \ge 0 \quad \forall u \in U_{ad}.$

This simple yet fundamental result is a special case of Lemma 2.21 on page
63. It reflects the observation that $f$ cannot decrease in any direction at a
minimum point.
Invoking the chain rule and the rules for total differentials, we can determine
the derivative $f'$ in (1.4), which is given by $f' = D_y J\, S + D_u J$. We
find that

$f'(\bar u)\, h = D_y J(S \bar u, \bar u)\, S h + D_u J(S \bar u, \bar u)\, h$
$\phantom{f'(\bar u)\, h} = (\nabla_y J(\bar y, \bar u), A^{-1} B\, h)_{\mathbb{R}^n} + (\nabla_u J(\bar y, \bar u), h)_{\mathbb{R}^m}$
(1.5)    $\phantom{f'(\bar u)\, h} = (B^T (A^T)^{-1} \nabla_y J(\bar y, \bar u) + \nabla_u J(\bar y, \bar u), h)_{\mathbb{R}^m}.$

Hence, the variational inequality (1.4) takes the somewhat clumsy form

(1.6)    $(B^T (A^T)^{-1} \nabla_y J(\bar y, \bar u) + \nabla_u J(\bar y, \bar u), u - \bar u)_{\mathbb{R}^m} \ge 0 \quad \forall u \in U_{ad}.$

It can be considerably simplified by introducing the adjoint state, a simple
trick that is of utmost importance in optimal control theory.

1.4.4. Adjoint state and reduced gradient. As motivation, let us assume
that the use of the inverse matrix $A^{-1}$ is too costly for numerical
calculations. This is usually the case for realistic optimal control problems.
Then, a numerical method that avoids the explicit calculation of $A^{-1}$ (e.g.,
the conjugate gradient method) must be used for the solution of the linear
system $A y = b$. The same applies for $A^T$. We therefore replace the term
$(A^T)^{-1} \nabla_y J(\bar y, \bar u)$ in (1.6) by a new variable

$p := (A^T)^{-1} \nabla_y J(\bar y, \bar u).$

The quantity $p$ corresponding to the pair $(\bar y, \bar u)$ can be determined by solving
the linear system

(1.7)    $A^T p = \nabla_y J(\bar y, \bar u).$

Definition. The equation (1.7) is called the adjoint equation, and its
solution $p$ is called the adjoint state associated with $(\bar y, \bar u)$.

Example. In the case of the quadratic function $J(y, u) = \frac{1}{2} |y - y_d|^2 + \frac{\lambda}{2} |u|^2$,
we obtain the adjoint equation

$A^T p = \bar y - y_d,$

since $\nabla_y J(y, u) = y - y_d$. $\diamond$

The introduction of the adjoint state has two advantages: the first-order
necessary optimality conditions simplify, and the use of the inverse matrix
$(A^T)^{-1}$ is avoided. Also, the form of the gradient of $f$ simplifies. Indeed,
with $\bar y = S \bar u$, it follows from (1.5) that

$\nabla f(\bar u) = B^T p + \nabla_u J(\bar y, \bar u).$

The vector $\nabla f(\bar u)$ is referred to as the reduced gradient. Moreover, since
$\bar y = S \bar u$, the directional derivative $f'(\bar u)\, h$ at an arbitrary point $\bar u$ is given by

$f'(\bar u)\, h = (B^T p + \nabla_u J(\bar y, \bar u), h)_{\mathbb{R}^m}.$

The two expressions above involving the adjoint state $p$ do not depend on
whether $\bar u$ is optimal or not. We will encounter them repeatedly in control
problems for partial differential equations. Moreover, the use of the adjoint
state $p$ also simplifies Theorem 1.2:

Theorem 1.3. Suppose that the matrix $A$ is invertible, and let $\bar u$ be an
optimal control for (1.1) with associated state $\bar y$. Then the adjoint equation
(1.7) has a unique solution $p$ such that

(1.8)    $(B^T p + \nabla_u J(\bar y, \bar u), u - \bar u)_{\mathbb{R}^m} \ge 0 \quad \forall u \in U_{ad}.$
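The adjoint technique of Section 1.4.4 can be tried out numerically. The following Python sketch evaluates the reduced gradient $B^T p + \lambda u$ of the quadratic functional $J(y,u) = \frac{1}{2}|y - y_d|^2 + \frac{\lambda}{2}|u|^2$ with one state solve and one adjoint solve, and compares it with central finite differences of $f$. The matrices, the vector `y_d`, the value of `lam`, and the helper names (`solve`, `matvec`, `transpose`) are assumptions made for illustration only; a plain Gaussian elimination stands in for whatever linear solver would be used in practice.

```python
# Sketch: reduced gradient via the adjoint state (illustrative data only).

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[piv] = aug[piv], aug[k]
        for r in range(k + 1, n):
            factor = aug[r][k] / aug[k][k]
            for c in range(k, n + 1):
                aug[r][c] -= factor * aug[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(aug[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (aug[k][n] - s) / aug[k][k]
    return x

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Hypothetical problem data: n = 2 states, m = 3 controls.
A = [[2.0, 1.0], [0.0, 3.0]]
B = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
y_d = [1.0, -1.0]
lam = 0.1

def state(u):
    """State equation A y = B u."""
    return solve(A, matvec(B, u))

def f(u):
    """Reduced cost functional f(u) = J(S u, u)."""
    y = state(u)
    misfit = sum((a - b) ** 2 for a, b in zip(y, y_d))
    return 0.5 * misfit + 0.5 * lam * sum(x * x for x in u)

def reduced_gradient(u):
    y = state(u)
    # Adjoint equation A^T p = grad_y J = y - y_d
    p = solve(transpose(A), [a - b for a, b in zip(y, y_d)])
    # Reduced gradient B^T p + grad_u J = B^T p + lam * u
    return [g + lam * x for g, x in zip(matvec(transpose(B), p), u)]

u = [0.5, -0.2, 0.3]
grad = reduced_gradient(u)

# Central finite differences of f for comparison.
h = 1e-6
fd = []
for i in range(len(u)):
    up, um = list(u), list(u)
    up[i] += h
    um[i] -= h
    fd.append((f(up) - f(um)) / (2 * h))

print(grad)
print(fd)
```

The adjoint route needs two linear solves regardless of the number $m$ of controls, whereas the finite-difference comparison needs $2m$ state solves; this is the practical reason for preferring the adjoint state.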
The assertion follows directly from the variational inequality (1.6) and the
definition of $p$. In summary, we have derived the following optimality system
for the unknown vectors $\bar y$, $\bar u$, and $p$, which can be used to determine the
optimal control:

$A y = B u, \quad u \in U_{ad}$
$A^T p = \nabla_y J(y, u)$
$(B^T p + \nabla_u J(y, u), v - u)_{\mathbb{R}^m} \ge 0 \quad \forall v \in U_{ad}.$

Every solution $(\bar y, \bar u)$ to the optimal control problem (1.1) must, together
with $p$, satisfy this system.

No restrictions on u. In this case, $U_{ad} = \mathbb{R}^m$. Then $u - \bar u$ may attain
any value $h \in \mathbb{R}^m$, and thus the variational inequality (1.8) reduces to the
equation

$B^T p + \nabla_u J(\bar y, \bar u) = 0.$

Example. Suppose that

$J(y, u) = \frac{1}{2} |C y - y_d|^2 + \frac{\lambda}{2} |u|^2,$

with a given $n \times n$ matrix $C$. Then, obviously,

$\nabla_y J(y, u) = C^T (C y - y_d), \qquad \nabla_u J(y, u) = \lambda u.$

The optimality system becomes

$A y = B u, \quad u \in U_{ad}$
$A^T p = C^T (C y - y_d)$
$(B^T p + \lambda u, v - u)_{\mathbb{R}^m} \ge 0 \quad \forall v \in U_{ad}.$

If $U_{ad} = \mathbb{R}^m$, then $B^T p + \lambda \bar u = 0$. In the case where $\lambda > 0$, we can solve for
$\bar u$ to obtain

(1.10)    $\bar u = -\frac{1}{\lambda} B^T p.$

Substitution in the two other relations yields the optimality system

$A \bar y = -\frac{1}{\lambda} B B^T p$
$A^T p = C^T (C \bar y - y_d),$

which is a linear system for the unknowns $\bar y$ and $p$. Once $\bar y$ and $p$ have
been recovered from it, the optimal control $\bar u$ can be determined from (1.10).
$\diamond$

Remark. We have chosen a linear equation in (1.1) for the sake of simplicity.
The fully nonlinear problem

(1.11)    $\min J(y, u), \quad T(y, u) = 0, \quad u \in U_{ad}$

will be discussed in Exercise 2.1 on page 116.

1.4.5. Lagrangians. By using the Lagrangian function from basic calculus,
the optimality system can also be formulated as a Lagrange multiplier rule.

Definition. The function

$L : \mathbb{R}^{2n+m} \to \mathbb{R}, \qquad L(y, u, p) := J(y, u) - (A y - B u, p)_{\mathbb{R}^n},$

is called the Lagrangian function or Lagrangian.

Using $L$, we can formally eliminate the equality constraints from (1.1), while
retaining the seemingly simpler restriction $u \in U_{ad}$ in explicit form. Upon
comparison, we find that the second and third conditions in the optimality
system are equivalent to

$\nabla_y L(\bar y, \bar u, p) = 0$
$(\nabla_u L(\bar y, \bar u, p), u - \bar u)_{\mathbb{R}^m} \ge 0 \quad \forall u \in U_{ad}.$

Conclusion. The adjoint equation (1.7) is equivalent to $\nabla_y L(\bar y, \bar u, p) = 0$
and thus can be recovered by differentiating the Lagrangian with respect to
$y$. Similarly, the variational inequality follows from differentiation of $L$ with
respect to $u$.

Consequently, $(\bar y, \bar u)$ is a solution to the necessary optimality conditions of
the following minimization problem without equality constraints:

(1.12)    $\min_{y, u} L(y, u, p), \quad u \in U_{ad}, \quad y \in \mathbb{R}^n.$

By the way, this does not imply that $(\bar y, \bar u)$ can always be determined numerically
as a solution to (1.12). In fact, the "right" $p$ is usually not known,
and (1.12) may not be solvable or could even lead to wrong solutions. The
vector $p \in \mathbb{R}^n$ also plays the role of a Lagrange multiplier. It corresponds to
the equation $A y - B u = 0$.
We remark that the above conclusion remains valid for the fully nonlinear
problem (1.11), provided that the Lagrangian is defined by
$L(y, u, p) := J(y, u) - (T(y, u), p)_{\mathbb{R}^n}$.
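For the unconstrained quadratic example of Section 1.4.4, the linear optimality system in $\bar y$ and $p$ can be assembled and solved directly. The sketch below does this for the special case $C = I$ and then recovers $\bar u$ from (1.10). All numerical data and the helper names (`solve`, `matvec`, `transpose`, `matmul`, `cost`) are assumptions made for illustration; the block matrix `K` is merely one convenient way of writing the two equations as a single system.

```python
# Sketch: solve the linear optimality system for U_ad = R^m and C = I
# (illustrative data only).

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[piv] = aug[piv], aug[k]
        for r in range(k + 1, n):
            factor = aug[r][k] / aug[k][k]
            for c in range(k, n + 1):
                aug[r][c] -= factor * aug[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(aug[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (aug[k][n] - s) / aug[k][k]
    return x

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

# Hypothetical problem data: n = 2 states, m = 1 control.
A = [[2.0, 1.0], [0.0, 3.0]]
B = [[1.0], [2.0]]
y_d = [1.0, -1.0]
lam = 0.1
n = len(A)

At = transpose(A)
BBt = matmul(B, transpose(B))

# Block form of   A y + (1/lam) B B^T p = 0,   -y + A^T p = -y_d.
K = [A[i] + [BBt[i][j] / lam for j in range(n)] for i in range(n)]
K += [[-1.0 if j == i else 0.0 for j in range(n)] + At[i] for i in range(n)]
rhs = [0.0] * n + [-v for v in y_d]

sol = solve(K, rhs)
y, p = sol[:n], sol[n:]
u = [-g / lam for g in matvec(transpose(B), p)]   # formula (1.10)

def cost(ctrl):
    """Reduced functional f(u) = 1/2 |S u - y_d|^2 + lam/2 |u|^2."""
    state = solve(A, matvec(B, ctrl))
    misfit = sum((a - b) ** 2 for a, b in zip(state, y_d))
    return 0.5 * misfit + 0.5 * lam * sum(x * x for x in ctrl)

print("state  ", y)
print("adjoint", p)
print("control", u)
print("cost   ", cost(u))
```

Because the problem is strictly convex for $\lambda > 0$, the control computed this way should not be improvable by small perturbations, which gives a simple sanity check on the result.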
1.4.6. Discussion of the variational inequality. In later chapters the
admissible set $U_{ad}$ will be defined by upper and lower bounds, so-called box
constraints. We assume this here too, i.e.,

(1.13)    $U_{ad} = \{u \in \mathbb{R}^m : u_a \le u \le u_b\}.$

Here, $u_a \le u_b$ are given vectors in $\mathbb{R}^m$, where the inequalities are to be understood
componentwise, that is, $u_{a,i} \le u_i \le u_{b,i}$ for $i = 1, \ldots, m$. Rewriting
the variational inequality (1.8) as

$(B^T p + \nabla_u J(\bar y, \bar u), \bar u)_{\mathbb{R}^m} \le (B^T p + \nabla_u J(\bar y, \bar u), u)_{\mathbb{R}^m} \quad \forall u \in U_{ad},$

we find that $\bar u$ solves the linear optimization problem

$\min_{u \in U_{ad}} (B^T p + \nabla_u J(\bar y, \bar u), u)_{\mathbb{R}^m} = \min_{u \in U_{ad}} \sum_{i=1}^{m} (B^T p + \nabla_u J(\bar y, \bar u))_i\, u_i.$

If $U_{ad}$ is given as in (1.13), then it follows from the fact that the $u_i$ are
independent from each other that

$(B^T p + \nabla_u J(\bar y, \bar u))_i\, \bar u_i = \min_{u_{a,i} \le u_i \le u_{b,i}} (B^T p + \nabla_u J(\bar y, \bar u))_i\, u_i$

for $i = 1, \ldots, m$. Hence, we must have

(1.14)    $\bar u_i = \begin{cases} u_{b,i} & \text{if } (B^T p + \nabla_u J(\bar y, \bar u))_i < 0 \\ u_{a,i} & \text{if } (B^T p + \nabla_u J(\bar y, \bar u))_i > 0. \end{cases}$

No direct information can be recovered from the variational inequality for
the components that satisfy $(B^T p + \nabla_u J(\bar y, \bar u))_i = 0$. However, in many
cases useful information can still be extracted simply from the fact that this
equation holds.

1.4.7. Formulation as a Karush-Kuhn-Tucker system. Up to now,
the Lagrangian $L$ has only been used to eliminate the conditions in equation
form. The same can be done with the inequality constraints induced by $U_{ad}$.
To this end, we introduce the quantities

(1.15)    $\mu_a := (B^T p + \nabla_u J(\bar y, \bar u))_+, \qquad \mu_b := (B^T p + \nabla_u J(\bar y, \bar u))_-.$

We have $\mu_{a,i} = (B^T p + \nabla_u J(\bar y, \bar u))_i$ if the right-hand side is positive,
and $\mu_{a,i} = 0$ otherwise; likewise, $\mu_{b,i} = |(B^T p + \nabla_u J(\bar y, \bar u))_i|$ for a negative
right-hand side, and $\mu_{b,i} = 0$ otherwise. Invoking (1.14), we deduce the
relations

$\mu_a \ge 0, \quad u_a - \bar u \le 0, \quad (u_a - \bar u, \mu_a)_{\mathbb{R}^m} = 0,$
$\mu_b \ge 0, \quad \bar u - u_b \le 0, \quad (\bar u - u_b, \mu_b)_{\mathbb{R}^m} = 0.$

In optimization theory, these are usually referred to as complementary slackness
conditions or complementarity conditions.
The inequalities hold trivially, so that only the equations have to be
verified. We confine ourselves to showing the first orthogonality condition:
in view of (1.14), the strict inequality $u_{a,i} < \bar u_i$ can only be valid
if $(B^T p + \nabla_u J(\bar y, \bar u))_i \le 0$. By definition, this implies that $\mu_{a,i} = 0$, hence
$(u_{a,i} - \bar u_i)\, \mu_{a,i} = 0$. If $\mu_{a,i} > 0$, then, owing to the definition of $\mu_a$, also
$(B^T p + \nabla_u J(\bar y, \bar u))_i > 0$, and from (1.14) we conclude that $u_{a,i} = \bar u_i$.
Again, it follows that $(u_{a,i} - \bar u_i)\, \mu_{a,i} = 0$. Summation over $i$ then yields

$(u_a - \bar u, \mu_a)_{\mathbb{R}^m} = 0.$

Note that (1.15) implies that $\mu_a - \mu_b = \nabla_u J(\bar y, \bar u) + B^T p$, so that

(1.16)    $\nabla_u J(\bar y, \bar u) + B^T p - \mu_a + \mu_b = 0.$

We now introduce an extended Lagrangian $\mathcal{L}$ by adding the inequality constraints
in the following way:

$\mathcal{L}(y, u, p, \mu_a, \mu_b) := J(y, u) - (A y - B u, p)_{\mathbb{R}^n} + (u_a - u, \mu_a)_{\mathbb{R}^m} + (u - u_b, \mu_b)_{\mathbb{R}^m}.$

Then (1.16) can be expressed in the form

$\nabla_u \mathcal{L}(\bar y, \bar u, p, \mu_a, \mu_b) = 0.$

Moreover, the adjoint equation is equivalent to the equation

$\nabla_y \mathcal{L}(\bar y, \bar u, p, \mu_a, \mu_b) = 0,$

since $\nabla_y \mathcal{L} = \nabla_y L$. Hence, $\mu_a$ and $\mu_b$ are the Lagrange multipliers corresponding
to the inequality constraints $u_a - u \le 0$ and $u - u_b \le 0$. The
optimality conditions can therefore be rewritten in the following alternative
form.

Theorem 1.4. Suppose that $A$ is invertible, $U_{ad}$ is given by (1.13), and
$\bar u$ is an optimal control for (1.1) with associated state $\bar y$. Then there exist
Lagrange multipliers $p \in \mathbb{R}^n$ and $\mu_a, \mu_b \in \mathbb{R}^m$ such that the following
conditions hold:

$\nabla_y \mathcal{L}(\bar y, \bar u, p, \mu_a, \mu_b) = 0$
$\nabla_u \mathcal{L}(\bar y, \bar u, p, \mu_a, \mu_b) = 0$
$\mu_a \ge 0, \quad \mu_b \ge 0$
$(u_a - \bar u, \mu_a)_{\mathbb{R}^m} = (\bar u - u_b, \mu_b)_{\mathbb{R}^m} = 0.$
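The multipliers of (1.15) and the complementarity relations can be checked on a small instance. The sketch below makes the simplifying assumption $A = B = I$ (so that $y = u$) together with $J(y,u) = \frac{1}{2}|y - y_d|^2 + \frac{\lambda}{2}|u|^2$; under this assumption the reduced functional separates by component and the optimal control is the unconstrained minimizer $y_d/(1+\lambda)$ clipped to the box. The data and variable names are hypothetical.

```python
# Sketch: KKT multipliers for box constraints, assuming A = B = I
# (illustrative data only).

y_d = [3.0, 0.5, -2.0]
lam = 0.5
u_a = [-1.0, -1.0, -1.0]
u_b = [1.0, 1.0, 1.0]

# Componentwise minimizer of 1/2 (u - y_d)^2 + lam/2 u^2 over [u_a, u_b].
u_bar = [min(max(yd / (1.0 + lam), a), b) for yd, a, b in zip(y_d, u_a, u_b)]
y_bar = list(u_bar)                                  # state equation y = u
p = [y - yd for y, yd in zip(y_bar, y_d)]            # adjoint equation p = y - y_d
g = [pi + lam * ui for pi, ui in zip(p, u_bar)]      # B^T p + grad_u J

mu_a = [max(gi, 0.0) for gi in g]                    # positive part, cf. (1.15)
mu_b = [max(-gi, 0.0) for gi in g]                   # negative part, cf. (1.15)

# Stationarity (1.16) and the two complementarity products.
stationarity = [gi - ma + mb for gi, ma, mb in zip(g, mu_a, mu_b)]
slack_a = sum((a - u) * m for a, u, m in zip(u_a, u_bar, mu_a))
slack_b = sum((u - b) * m for u, b, m in zip(u_bar, u_b, mu_b))

print("u_bar:", u_bar)
print("mu_a :", mu_a)
print("mu_b :", mu_b)
print("stationarity residual:", stationarity)
print("complementarity:", slack_a, slack_b)
```

In this instance the first component sits at the upper bound with a negative gradient component, the third at the lower bound with a positive one, and the second lies strictly inside the box with vanishing gradient component, which are exactly the three situations distinguished in (1.14).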
The above optimality system, which combines the conditions of Theorem
1.4 with the constraints

$A \bar y - B \bar u = 0, \qquad u_a \le \bar u \le u_b,$

constitutes the famous Karush-Kuhn-Tucker conditions.

In order to be able to compare later with the results in Section 4.10, we
are now going to state the second-order sufficient optimality conditions; see,
e.g., [GT97b], [GMW81], or [Lue84]. To this end, we introduce index sets
corresponding to the active inequality constraints, $I(\bar u) = I_a(\bar u) \cup I_b(\bar u)$, and
to the strongly active inequality constraints, $A(\bar u) \subset I(\bar u)$. We have

$I_a(\bar u) = \{i : \bar u_i = u_{a,i}\}, \qquad I_b(\bar u) = \{i : \bar u_i = u_{b,i}\},$
$A(\bar u) = \{i : \mu_{a,i} > 0 \text{ or } \mu_{b,i} > 0\}.$

Moreover, let $C(\bar u)$ denote the critical cone consisting of all $h \in \mathbb{R}^m$ with the
properties

$h_i = 0$ for $i \in A(\bar u)$
$h_i \ge 0$ for $i \in I_a(\bar u) \setminus A(\bar u)$
$h_i \le 0$ for $i \in I_b(\bar u) \setminus A(\bar u)$.

By definition of $\mu_a$ and $\mu_b$, we have $i \in A(\bar u) \Leftrightarrow |(B^T p + \nabla_u J(\bar y, \bar u))_i| > 0$.
Hence, an active constraint for $\bar u$ is strongly active if and only if the
corresponding component of the gradient of $f$ does not vanish.

Theorem 1.5. Suppose that $U_{ad}$ is given by (1.13), and let $(\bar y, \bar u, p, \mu_a, \mu_b)$
satisfy the Karush-Kuhn-Tucker conditions. If

$\begin{pmatrix} y \\ u \end{pmatrix}^T \begin{pmatrix} \mathcal{L}_{yy}(\bar y, \bar u, p, \mu_a, \mu_b) & \mathcal{L}_{yu}(\bar y, \bar u, p, \mu_a, \mu_b) \\ \mathcal{L}_{uy}(\bar y, \bar u, p, \mu_a, \mu_b) & \mathcal{L}_{uu}(\bar y, \bar u, p, \mu_a, \mu_b) \end{pmatrix} \begin{pmatrix} y \\ u \end{pmatrix} > 0$

for all $(y, u) \ne (0, 0)$ with $A y = B u$ and $u \in C(\bar u)$, then $(\bar y, \bar u)$ is locally
optimal for (1.1).

In the above theorem, $\mathcal{L}_{yy}$, $\mathcal{L}_{yu}$, and $\mathcal{L}_{uu}$ denote the second-order partial
derivatives $D_y^2 \mathcal{L}$, $D_u D_y \mathcal{L}$, and $D_u^2 \mathcal{L}$, respectively. Owing to a standard compactness
argument, the definiteness condition of the theorem is equivalent
to the existence of some $\delta > 0$ such that

$\begin{pmatrix} y \\ u \end{pmatrix}^T \begin{pmatrix} \mathcal{L}_{yy}(\bar y, \bar u, p, \mu_a, \mu_b) & \mathcal{L}_{yu}(\bar y, \bar u, p, \mu_a, \mu_b) \\ \mathcal{L}_{uy}(\bar y, \bar u, p, \mu_a, \mu_b) & \mathcal{L}_{uu}(\bar y, \bar u, p, \mu_a, \mu_b) \end{pmatrix} \begin{pmatrix} y \\ u \end{pmatrix} \ge \delta\, (|y|^2 + |u|^2)$

for all corresponding $(y, u)$. If $A$ is invertible, it even suffices to postulate
that the above quadratic form is greater than or equal to $\delta\, |u|^2$.

Generalization to partial differential equations. In optimal control
problems for partial differential equations, the argumentation follows similar
lines to that above. In this case, the equation $A y = B u$ stands for an elliptic
or parabolic boundary value problem, with $A$ being a differential operator
and $B$ representing some coefficient or embedding operator. The solution
matrix $S = A^{-1} B$ corresponds to the part of the solution operator associated
with the differential equation that occurs in the cost functional. The
associated optimality conditions will be of the same form as those established
above.

Lagrangians are also powerful tools in the control theory of partial differential
equations. In the formal Lagrange method, they are used as convenient
means to formally derive optimality conditions that can easily be
memorized. Their application in the rigorous proof of optimality conditions
is not so straightforward; in fact, it is based on the Karush-Kuhn-Tucker
theory of optimization problems in Banach spaces, which will be discussed
in Chapter 6.
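As a worked instance of the second-order condition of Theorem 1.5, consider the quadratic functional $J(y,u) = \frac{1}{2}|Cy - y_d|^2 + \frac{\lambda}{2}|u|^2$ used in the examples of Section 1.4. The computation below is a sketch under the assumption $\lambda > 0$; it uses only the fact that all constraints enter the extended Lagrangian linearly, so that its second derivatives coincide with those of $J$.

```latex
% Second derivatives of the extended Lagrangian for
% J(y,u) = 1/2 |C y - y_d|^2 + (lambda/2) |u|^2  (assumption: lambda > 0)
\mathcal{L}_{yy} = C^T C, \qquad
\mathcal{L}_{yu} = \mathcal{L}_{uy}^T = 0, \qquad
\mathcal{L}_{uu} = \lambda I,
\\[1ex]
\begin{pmatrix} y \\ u \end{pmatrix}^{T}
\begin{pmatrix} C^T C & 0 \\ 0 & \lambda I \end{pmatrix}
\begin{pmatrix} y \\ u \end{pmatrix}
= |C y|^2 + \lambda\, |u|^2 \;\ge\; \lambda\, |u|^2 .
```

With $A$ invertible, the simplified criterion (quadratic form bounded below by $\delta\,|u|^2$) is therefore met with $\delta = \lambda$, for every admissible direction and not only on the critical cone.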
Chapter 2

Linear-quadratic elliptic control problems

2.1. Normed spaces

In the first few sections, we present some basic notions from functional analysis.
We are guided by the principle of covering only the material that is
absolutely necessary for a proper understanding of the subsequent sections.
The proofs will not be given; in this regard, the interested reader is referred
to standard textbooks on functional analysis such as those by Alt [Alt99],
Kantorovich and Akilov [KA64], Kreyszig [Kre78], Lusternik and Sobolev
[LS74], Wouk [Wou79], or Yosida [Yos80].
We assume that the reader is already familiar with the concept of a
linear space over the field $\mathbb{R}$ of real numbers. Standard examples include
the n-dimensional Euclidean space $\mathbb{R}^n$ and the space of continuous real-valued
functions defined on an interval $[a, b] \subset \mathbb{R}$. Their elements are vectors
$x = (x_1, \ldots, x_n)^T$ or functions $x : [a, b] \to \mathbb{R}$, respectively. In both spaces,
operations of addition "+" of two elements and multiplication by real numbers
are defined that obey the familiar rules in linear spaces.

Definition. Let $X$ be a linear space over $\mathbb{R}$. A mapping $\|\cdot\| : X \to \mathbb{R}$ is
called a norm on $X$ if the following hold for all $x, y \in X$ and $\lambda \in \mathbb{R}$:

(i) $\|x\| \ge 0$, and $\|x\| = 0 \Leftrightarrow x = 0$
(ii) $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality)
(iii) $\|\lambda x\| = |\lambda|\, \|x\|$ (homogeneity)

If $\|\cdot\|$ is a norm on $X$, then $\{X, \|\cdot\|\}$ is called a (real) normed space.
The space $\mathbb{R}^n$ is a normed space when equipped with the Euclidean norm

$|x| = \Bigl( \sum_{i=1}^{n} x_i^2 \Bigr)^{1/2}.$

The space of continuous real-valued mappings $x : [a, b] \to \mathbb{R}$ is a normed
space, denoted by $C[a, b]$, with respect to the maximum norm of $x(\cdot)$,

$\|x\|_{C[a,b]} = \max_{t \in [a,b]} |x(t)|.$

Another normed space, denoted by $C_{L^2}[a, b]$, is obtained if we endow the
space of continuous real-valued functions with the $L^2$ norm,

$\|x\|_{C_{L^2}[a,b]} = \Bigl( \int_a^b |x(t)|^2\, dt \Bigr)^{1/2}.$

The reader will be asked in Exercise 2.2 to verify that the norm axioms
(i)-(iii) are satisfied for the latter two examples.

Remark. By definition, the space $X$ and the associated norm together define
a normed space. The introduction of another norm on the same space leads to a
different normed space. However, it is usually clear which norm is under consideration;
in such a situation, we will simply refer to the normed space $X$ without
making any reference to the particular norm.

Definition. Let $\{X, \|\cdot\|\}$ be a normed space, and let $\{x_n\}_{n=1}^{\infty} \subset X$ be a
sequence.
(i) The sequence is said to be convergent if there is some $x \in X$ such that
$\lim_{n \to \infty} \|x_n - x\| = 0$.
(ii) We call $x$ the limit of the sequence, written as $\lim_{n \to \infty} x_n = x$.
(iii) The sequence is called a Cauchy sequence if for any $\varepsilon > 0$ there is some
$n_0 = n_0(\varepsilon) \in \mathbb{N}$ such that $\|x_n - x_m\| < \varepsilon$ for all $n > n_0(\varepsilon)$ and
$m > n_0(\varepsilon)$.

Any convergent sequence in a normed space is also a Cauchy sequence, but
the converse is in general false, as the following example shows.

Example. Consider the sequence of functions in the space $C_{L^2}[0, 2]$ defined
by $x_n(t) = \min\{1, t^n\}$ for $t \in [0, 2]$, $n \in \mathbb{N}$. Then

$\|x_n - x_m\|^2_{C_{L^2}[0,2]} = \int_0^1 (t^n - t^m)^2\, dt = \int_0^1 (t^{2n} - 2 t^{n+m} + t^{2m})\, dt$
$= \dfrac{1}{2n+1} - \dfrac{2}{n+m+1} + \dfrac{1}{2m+1} \le \dfrac{2}{2m+1}$

for $m \le n$. Hence, we have a Cauchy sequence. However, its pointwise limit

$x(t) = \lim_{n \to \infty} x_n(t) = \begin{cases} 0, & 0 \le t < 1 \\ 1, & 1 \le t \le 2 \end{cases}$

is not continuous on $[0, 2]$ and thus not an element of $C_{L^2}[0, 2]$. $\diamond$

Definition. A normed space $\{X, \|\cdot\|\}$ is said to be complete if every Cauchy
sequence in $X$ converges, i.e., has a limit in $X$. A complete normed space is
called a Banach space.

The spaces $\mathbb{R}^n$ and $C[a, b]$ are Banach spaces with respect to their natural
norms $|\cdot|$ and $\|\cdot\|_{C[a,b]}$, while $\{C_{L^2}[a, b], \|\cdot\|_{C_{L^2}[a,b]}\}$ is not complete and
hence not a Banach space.
In a Banach space there does not necessarily exist an equivalent to the
scalar product of two vectors in $\mathbb{R}^n$, which is fundamental for the concept of
orthogonality.

Definition. Let $H$ be a real linear space. A mapping $(\cdot\,, \cdot) : H \times H \to \mathbb{R}$ is
called a scalar product on $H$ if the following conditions are satisfied for all
$u, v, u_1, u_2 \in H$ and $\lambda \in \mathbb{R}$:
(i) $(u, u) \ge 0$, and $(u, u) = 0 \Leftrightarrow u = 0$
(ii) $(u, v) = (v, u)$
(iii) $(u_1 + u_2, v) = (u_1, v) + (u_2, v)$
(iv) $(\lambda u, v) = \lambda\, (u, v)$.
If $(\cdot\,, \cdot)$ is a scalar product on $H$, then $\{H, (\cdot\,, \cdot)\}$ is called a pre-Hilbert
space.

Remark. Again, we speak of the pre-Hilbert space $H$ instead of $\{H, (\cdot\,, \cdot)\}$ if it
is clear which scalar product is being considered on $H$.

The space $\mathbb{R}^n$ is a pre-Hilbert space with respect to the scalar product
$(u, v) := u^T v$, and $C_{L^2}[a, b]$ is a pre-Hilbert space when equipped with the
scalar product

$(u, v) = \int_a^b u(t)\, v(t)\, dt.$

Every pre-Hilbert space $\{H, (\cdot\,, \cdot)\}$ is a normed space with respect to its
natural norm (see Exercise 2.3)

$\|u\| := \sqrt{(u, u)}.$

We then have the Cauchy-Schwarz inequality:

$|(u, v)| \le \|u\|\, \|v\| \quad \forall u, v \in H.$
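The example of the sequence $x_n(t) = \min\{1, t^n\}$ in $C_{L^2}[0,2]$ can be checked numerically. The sketch below compares the closed-form squared distance $\frac{1}{2n+1} - \frac{2}{n+m+1} + \frac{1}{2m+1}$ with a midpoint-rule approximation of the integral and prints the bound $\frac{2}{2m+1}$ next to it. The function names, the quadrature rule, and the sample index pairs are assumptions chosen for illustration.

```python
# Sketch: the distances ||x_n - x_m|| in the L2 norm on [0, 2] shrink,
# illustrating the Cauchy property of x_n(t) = min(1, t^n).

def x(n, t):
    return min(1.0, t ** n)

def l2_dist_sq_numeric(n, m, cells=20000):
    """Midpoint-rule approximation of the squared L2 distance on [0, 2]."""
    h = 2.0 / cells
    total = 0.0
    for k in range(cells):
        t = (k + 0.5) * h
        d = x(n, t) - x(m, t)
        total += d * d
    return total * h

def l2_dist_sq_exact(n, m):
    """Closed form 1/(2n+1) - 2/(n+m+1) + 1/(2m+1)."""
    return 1.0 / (2 * n + 1) - 2.0 / (n + m + 1) + 1.0 / (2 * m + 1)

pairs = [(4, 2), (20, 10), (200, 100)]   # pairs (n, m) with m <= n
results = [
    (n, m, l2_dist_sq_exact(n, m), l2_dist_sq_numeric(n, m))
    for n, m in pairs
]

for n, m, exact, numeric in results:
    bound = 2.0 / (2 * m + 1)
    print(n, m, exact, numeric, bound)
```

The printed distances decrease as the indices grow, while the pointwise limit is the discontinuous step function from the text; the sequence is Cauchy but has no limit inside the space of continuous functions.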
Definition. A pre-Hilbert space $\{H, (\cdot\,, \cdot)\}$ is called a Hilbert space if it is
complete with respect to the norm

$\|u\| := \sqrt{(u, u)}.$

The Euclidean space $\mathbb{R}^n$ is a Hilbert space with respect to the standard
scalar product, while $C_{L^2}[a, b]$ is not complete and hence not a Hilbert space.

2.2. Sobolev spaces

In this section, we will recall basic notions from the theory of $L^p$ spaces and
Sobolev spaces, which are indispensable prerequisites for the next chapters.
In the following, $E \subset \mathbb{R}^N$ denotes a nonempty, bounded, and Lebesgue
measurable set having the N-dimensional Lebesgue measure $|E|$.

2.2.1. $L^p$ spaces.

Definition. We denote by $L^p(E)$, $1 \le p < \infty$, the linear space of all
(equivalence classes of) Lebesgue measurable functions $y$ that satisfy

$\int_E |y(x)|^p\, dx < \infty.$

In this connection, functions that differ only on a set of zero measure are
identified with each other and considered to belong to the same equivalence
class. Endowed with the norm

$\|y\|_{L^p(E)} = \Bigl( \int_E |y(x)|^p\, dx \Bigr)^{1/p},$

$L^p(E)$, with $1 < p < \infty$, becomes a Banach space which is reflexive (this
notion will be defined in Section 2.4).

Definition. We denote by $L^\infty(E)$ the Banach space of all (equivalence
classes of) Lebesgue measurable and essentially bounded functions, equipped
with the norm

$\|y\|_{L^\infty(E)} = \operatorname*{ess\,sup}_{x \in E} |y(x)| := \inf_{|F| = 0} \Bigl( \sup_{x \in E \setminus F} |y(x)| \Bigr).$

By "ess sup" we mean the essential maximum or supremum of a function.
This excludes any maxima that change upon the removal of single points
that are isolated in a certain sense and thus not essential. For instance, the
function $y : [0, 1] \to \mathbb{R}$ which takes the value zero on $(0, 1]$ and the value one at
$x = 0$ has maximum 1 but essential supremum 0.

In the following, $\Omega \subset \mathbb{R}^N$ is a domain, i.e., an open and connected set,
whose boundary is generally denoted by $\Gamma$. Moreover, $v : \Omega \to \mathbb{R}$ is a function
defined in $\Omega$, and the closure of a set $E$ will be denoted by $\bar E$.

Definition.
(i) Let $k \in \mathbb{N}$. We denote by $C^k(\Omega)$ the linear space of all real-valued functions
on $\Omega$ that, together with their partial derivatives up to order $k$, are
continuous in $\Omega$.
(ii) The set $\operatorname{supp} v = \overline{\{x \in \Omega : v(x) \ne 0\}}$ is called the support of $v$. It is
the smallest closed set outside of which $v$ vanishes identically.
(iii) $C_0^k(\Omega)$, $k \in \mathbb{N} \cup \{0, \infty\}$, denotes the set of all k-times continuously
differentiable functions with compact support in $\Omega$.

The case of $k = \infty$, i.e., the set $C_0^\infty(\Omega)$ of so-called test functions, is of
special interest to us. Test functions vanish on the boundary $\Gamma$ and thus
yield zero boundary integrals upon integration by parts; on the other hand,
they can be differentiated up to arbitrary order. Both of these properties
will be exploited in the definition of Sobolev spaces. We remark that, since
the topology of $C_0^\infty(\Omega)$ will not be needed here, we have used the notion of
set instead of space for $C_0^\infty(\Omega)$.
Next, we recall the notion of multi-indices, i.e., vectors $\alpha = (\alpha_1, \ldots, \alpha_N)$
having nonnegative integer components. The number $|\alpha| = \alpha_1 + \ldots + \alpha_N$ is
called the length of the multi-index. The components $\alpha_i$ are used to indicate
how often a function has to be differentiated with respect to $x_i$. For example,
the multi-index $\alpha = (1, 0, 2)$ means that we have to differentiate once with
respect to $x_1$ and twice with respect to $x_3$, but not with respect to $x_2$, that
is,

$D^{(1,0,2)} y = \dfrac{\partial^3 y}{\partial x_1\, \partial x_3^2}.$

Hence, $D^\alpha y(x)$ is shorthand for $D_1^{\alpha_1} \cdots D_N^{\alpha_N} y(x)$, and the length $|\alpha|$ represents
the total order of differentiation. We put $D^{(0)} y := y$.

Definition. Let $\Omega \subset \mathbb{R}^N$ be bounded. For any $k \in \mathbb{N} \cup \{0\}$, we denote
by $C^k(\bar\Omega)$ the linear space of all elements of $C^k(\Omega)$ that together with their
partial derivatives up to order $k$ can be continuously extended to $\bar\Omega$. In the
$k = 0$ case, we write simply $C(\bar\Omega)$ instead of $C^0(\bar\Omega)$.

The spaces $C^k(\bar\Omega)$ are Banach spaces with respect to the following norms:

$\|y\|_{C(\bar\Omega)} = \max_{x \in \bar\Omega} |y(x)|, \qquad \|y\|_{C^k(\bar\Omega)} = \sum_{|\alpha| \le k} \|D^\alpha y\|_{C(\bar\Omega)}$ for $k \in \mathbb{N}$.
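To make the multi-index notation of this section concrete, here is a worked instance for $N = 3$ and $\alpha = (1, 0, 2)$. The sample function $y(x) = x_1^2 x_3^3$ is an assumption chosen purely for illustration.

```latex
% Illustrative choice (not from the text): y(x) = x_1^2 x_3^3 on R^3
\alpha = (1,0,2), \qquad |\alpha| = 1 + 0 + 2 = 3,
\\[1ex]
D^{\alpha} y = D_1^{1} D_2^{0} D_3^{2}\, y
             = \frac{\partial^3 y}{\partial x_1\, \partial x_3^2}
             = \frac{\partial}{\partial x_1}\bigl( 6\, x_1^2 x_3 \bigr)
             = 12\, x_1 x_3 .
```

The variable $x_2$ is never differentiated, in agreement with $\alpha_2 = 0$, and the total order of differentiation equals the length $|\alpha| = 3$.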
2.2.2. Regular domains. The theory of partial differential equations requires
the spatial domains $\Omega$ to have sufficiently smooth boundary. The
following definition is given in the books by Necas [Nec67], Ladyzhenskaya
et al. [LSU68], Gajewski et al. [GGZ74], and Adams [Ada78]. Comprehensive
treatment of Lipschitz domains can be found in the monographs by
Alt [Alt99], Grisvard [Gri85], and Wloka [Wlo87].

Definition. Let $\Omega \subset \mathbb{R}^N$, $N \ge 2$, be a bounded domain with boundary $\Gamma$.
We say that $\Omega$, or $\Gamma$, belongs to the class $C^{k,1}$, $k \in \mathbb{N} \cup \{0\}$, if there exist
finitely many local coordinate systems $S_1, \ldots, S_M$, functions $h_1, \ldots, h_M$, and
numbers $a > 0$ and $b > 0$ that have the following properties:

(i) The functions $h_i$, $1 \le i \le M$, are k-times differentiable on the
closed (N - 1)-dimensional cube
$\bar Q_{N-1} = \{y = (y_1, \ldots, y_{N-1}) : |y_i| \le a,\ i = 1, \ldots, N - 1\},$
and the partial derivatives of order $k$ are Lipschitz continuous on
$\bar Q_{N-1}$.
(ii) For any $P \in \Gamma$ there is some $i \in \{1, \ldots, M\}$ such that in the coordinate
system $S_i$ there is some $y \in \bar Q_{N-1}$ with $P = (y, h_i(y))$.
(iii) In the local coordinate system $S_i$ we have
$(y, y_N) \in \Omega \Leftrightarrow y \in \bar Q_{N-1},\ h_i(y) < y_N < h_i(y) + b;$
$(y, y_N) \notin \bar\Omega \Leftrightarrow y \in \bar Q_{N-1},\ h_i(y) - b < y_N < h_i(y).$

The geometrical meaning of condition (iii) is that the domain lies locally on
one side of the boundary. Domains and boundaries of class $C^{0,1}$ are called
Lipschitz domains (or regular domains) and Lipschitz boundaries, respectively.
Boundaries of class $C^{k,1}$ are referred to as $C^{k,1}$ boundaries.
Using the local coordinate systems $S_i$, we can introduce a Lebesgue measure
on $\Gamma$ in a natural way. To this end, suppose that the set $E \subset \Gamma$ can
be completely represented by the coordinate system $S_i$, that is, for every
$P \in E$ there is some $y \in \bar Q_{N-1}$ such that $P = (y, h_i(y))$. Moreover, let
$D = (h_i)^{-1}(E) \subset \bar Q_{N-1}$ denote the preimage of $E$. Then the set $E$ is
called measurable if $D$ is measurable with respect to the (N - 1)-dimensional
Lebesgue measure. The measure of $E$ is then defined by

$|E| = \int_D \sqrt{1 + |\nabla h_i(y_1, \ldots, y_{N-1})|^2}\; dy_1 \ldots dy_{N-1};$

see [Ada78] or [GGZ74]. For a set $E$ whose representation requires several
different local coordinate systems, the measure will be put together appropriately
by using a suitable partition of unity. We also use the fact that the
Lipschitz function $h_i$ is almost everywhere differentiable by Rademacher's
theorem (see [Alt99] or [Cas92]). Having defined the surface measure, we
can proceed in the usual way to introduce the notions of measurable and
integrable functions on $\Gamma$. We denote the surface measure by $ds(x)$ or $ds$.

2.2.3. Weak derivatives and Sobolev spaces. In bounded Lipschitz domains
$\Omega$, Gauss's theorem is valid. In particular, for $y, v \in C^1(\bar\Omega)$ we have
the integration by parts formula

$\int_\Omega v(x)\, D_i y(x)\, dx = \int_\Gamma v(x)\, y(x)\, \nu_i(x)\, ds(x) - \int_\Omega y(x)\, D_i v(x)\, dx.$

Here, $\nu_i(x)$ denotes the ith component of the outward unit normal $\nu(x)$ to
$\Gamma$ at $x \in \Gamma$, and $ds$ is the Lebesgue surface measure on $\Gamma$. If, in addition,
$v = 0$ on $\Gamma$, then the boundary integral vanishes and it follows that

$\int_\Omega y(x)\, D_i v(x)\, dx = - \int_\Omega v(x)\, D_i y(x)\, dx.$

More generally, if $y \in C^k(\bar\Omega)$, $v \in C_0^k(\Omega)$, and some multi-index $\alpha$ of length
$|\alpha| \le k$ are given, then repeated integration by parts yields

$\int_\Omega y(x)\, D^\alpha v(x)\, dx = (-1)^{|\alpha|} \int_\Omega v(x)\, D^\alpha y(x)\, dx.$

This relation motivates a generalization of the classical notion of derivatives
that will be explained now. To this end, we denote by $L^1_{loc}(\Omega)$ the set of
all locally integrable functions in $\Omega$, that is, the set of all functions that are
Lebesgue integrable on every compact subset of $\Omega$.

Definition. Let $y \in L^1_{loc}(\Omega)$ and some multi-index $\alpha$ be given. If a function
$w \in L^1_{loc}(\Omega)$ satisfies

(2.1)    $\int_\Omega y(x)\, D^\alpha v(x)\, dx = (-1)^{|\alpha|} \int_\Omega w(x)\, v(x)\, dx \quad \forall v \in C_0^\infty(\Omega),$

then $w$ is called the weak derivative of $y$ (associated with $\alpha$).

In other words, $w$ is the weak derivative of $y$ if it satisfies the formula of
integration by parts in the same manner as the (strong) derivative $D^\alpha y$
would if $y$ belonged to $C^k(\bar\Omega)$. This observation and the easily proven fact
that $y$ can have at most one weak derivative justify our henceforth denoting
the weak derivative by the same symbol as the strong one, that is, we write
$w = D^\alpha y$.

Example. Consider the function $y(x) = |x|$ in $\Omega = (-1, 1)$. We can easily
check that the first-order weak derivative is given by

$y'(x) := w(x) = \begin{cases} -1, & x \in (-1, 0) \\ +1, & x \in (0, 1). \end{cases}$
Indeed, we obtain for each $v \in C_0^\infty(-1, 1)$ that

$\int_{-1}^{1} |x|\, v'(x)\, dx = \int_{-1}^{0} (-x)\, v'(x)\, dx + \int_{0}^{1} x\, v'(x)\, dx$
$= -x\, v(x) \Big|_{-1}^{0} - \int_{-1}^{0} (-1)\, v(x)\, dx + x\, v(x) \Big|_{0}^{1} - \int_{0}^{1} (+1)\, v(x)\, dx$
$= - \int_{-1}^{1} w(x)\, v(x)\, dx.$

Note that the value of $y'$ at $x = 0$ is immaterial, since an isolated point has
zero measure. $\diamond$

Weak derivatives do not necessarily exist. However, if they do, then they
may belong to "better" spaces than merely $L^1_{loc}(\Omega)$, e.g., to the space $L^p(\Omega)$.
This gives rise to the following notion:

Definition. Let $1 \le p < \infty$ and $k \in \mathbb{N}$. We denote by $W^{k,p}(\Omega)$ the linear
space of all functions $y \in L^p(\Omega)$ having weak derivatives $D^\alpha y$ in $L^p(\Omega)$ for
all multi-indices $\alpha$ of length $|\alpha| \le k$, endowed with the norm

$\|y\|_{W^{k,p}(\Omega)} = \Bigl( \sum_{|\alpha| \le k} \int_\Omega |D^\alpha y(x)|^p\, dx \Bigr)^{1/p}.$

Analogously, for $p = \infty$, $W^{k,\infty}(\Omega)$ is defined, equipped with the norm

$\|y\|_{W^{k,\infty}(\Omega)} = \max_{|\alpha| \le k} \|D^\alpha y\|_{L^\infty(\Omega)}.$

A hidden difficulty arises when one wants to assign boundary values
to functions from Sobolev spaces. For instance, how do we interpret the
statement that a function $y \in W^{k,p}(\Omega)$ vanishes on $\Gamma$? After all, since $\Gamma$,
as a subset of $\mathbb{R}^N$, has zero measure, the values of any function $y \in L^p(\Omega)$
can be changed arbitrarily on $\Gamma$ without affecting $y$ as an element of $L^p(\Omega)$;
indeed, functions that have equal values except on a set of zero measure are
regarded as equal in the sense of $L^p(\Omega)$.
We now recall the notion of the closure of a set $E \subset X$ in a normed space
$\{X, \|\cdot\|\}$, which is by definition the set

$\bar E = \{x \in X : x \text{ is the limit of some sequence } \{x_n\}_{n=1}^{\infty} \subset E\}.$

We say that a set $E \subset X$ is dense in $X$ if $\bar E = X$. With this notion, we can
define another class of Sobolev spaces.

Definition. The closure of $C_0^\infty(\Omega)$ in $W^{k,p}(\Omega)$ is denoted by $W_0^{k,p}(\Omega)$.
Moreover, we put $H_0^k(\Omega) := W_0^{k,2}(\Omega)$.

Obviously, $W_0^{k,p}(\Omega)$, endowed with the norm $\|\cdot\|_{W^{k,p}(\Omega)}$, is a normed space
and, as a closed subspace of $W^{k,p}(\Omega)$, also a Banach space. Also note that
by definition $C_0^\infty(\Omega)$ is dense in $H_0^k(\Omega)$.
The elements of $W_0^{k,p}(\Omega)$ can be regarded as functions for which all
derivatives up to order $k - 1$ vanish at the boundary. This is a consequence
of the following result, which answers the question of in what sense functions
from $W^{k,p}(\Omega)$ have boundary values.
The spaces Wk°P(Q) are Banach spaces (see, e.g., [Ada78], [W1o871). They Theorem 2.1 (Trace theorem). Let S2 C RN be a bounded Lipschitz dornain
are referred to as Sobolev spaces. For the particularly interesting case of
and let 1 < p < oo. Then there exists a linear and continuous mapping T
p = 2, we write
W"°(S2) --> LP(F) such that for all y e W1,P(S2) nC(S2) we have (Ty)(x) _
Hk(S2) := Wk'2(S2). y(x) for all x E F.
Since Hl (S2) is of special importante for our purposes, we repeat the defini-
tion given aboye more explicitly for this space. We have
In particular, for p = 2 it follows that T : H1(S2) -* L2(F). In the case
Hl(s2) = {y E L2(S2) : D¡y EL 2(s2), i = 1, ... , N}, of continuous functions, T y coincides with the restriction ylr of y to F.
and the norrn is given by The proof of the trace theorem can be found, e.g., in the monographs of
// 1/2 Adams [Ada78], Evans [Eva98] , Netas [Nec67], and Wloka [W1o87]. We
lIYIIH'(^o) _ (
f (y2 + Vy 2) dx) , note that it follows from the embedding result Theorem 7.1 on page 355 that
for p > N, the elements of W1"P(S2) can be identified with elements of C(S2).
where loyl2 = (D1 y)2 + ... + (DN y)2. With the scalar product
In this case, T defines a continuous mapping from W1"P(SZ) into C(F).
(u,
v)HI(9)
= fuvdx+ f Vu.vvdx , Definition . The element T y is called the trace of y on F, and the mapping
H1(S2) becomes a Hilbert space. T is called the trace operator.
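The weak-derivative example for $y(x) = |x|$ in this section can also be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates both sides of identity (2.1) with a midpoint rule for one particular test function. The choice of test function (a standard bump function multiplied by $1 + x$, so that the integrals do not vanish by symmetry), the names `bump`, `bump_prime`, and `weak_derivative_check`, and the grid size are all assumptions made for illustration.

```python
import numpy as np

def bump(x):
    """Test function v(x) = exp(-1/(1-x^2)) * (1+x) on (-1, 1), zero elsewhere."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    xi = x[inside]
    out[inside] = np.exp(-1.0 / (1.0 - xi**2)) * (1.0 + xi)
    return out

def bump_prime(x):
    """Classical derivative v'(x) of the test function above."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    xi = x[inside]
    phi = np.exp(-1.0 / (1.0 - xi**2))
    dphi = phi * (-2.0 * xi) / (1.0 - xi**2) ** 2
    out[inside] = dphi * (1.0 + xi) + phi
    return out

def weak_derivative_check(n_cells=20000):
    """Return both sides of (2.1) for y = |x|, w = sign(x), |alpha| = 1."""
    h = 2.0 / n_cells
    # Midpoints of the cells; with an even cell count, x = 0 is a cell boundary.
    x = -1.0 + (np.arange(n_cells) + 0.5) * h
    y = np.abs(x)
    w = np.sign(x)
    lhs = h * np.sum(y * bump_prime(x))   # integral of y * v'
    rhs = -h * np.sum(w * bump(x))        # (-1)^1 * integral of w * v
    return lhs, rhs

lhs, rhs = weak_derivative_check()
print(lhs, rhs)
```

The two printed numbers should agree up to quadrature error. A single test function proves nothing, of course; the sketch only illustrates what the identity asserts.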

Remark. In the following we will, for the sake of simplicity, use the notation $y|_\Gamma$ in place of $\tau y$. In this sense, $y|_{\Gamma_0}$ is, for measurable subsets $\Gamma_0 \subset \Gamma$, defined as the restriction of $\tau y$ to $\Gamma_0$.

Since the trace operator is continuous and thus bounded, there exists some constant $c_\tau = c_\tau(\Omega, p)$ such that
\[ \|y|_\Gamma\|_{L^p(\Gamma)} \le c_\tau\, \|y\|_{W^{1,p}(\Omega)} \quad \forall\, y \in W^{1,p}(\Omega). \]
Moreover, for bounded Lipschitz domains $\Omega$ it follows that
\[ H_0^1(\Omega) = \{ y \in H^1(\Omega) : y|_\Gamma = 0 \}; \]
see, e.g., [Ada78] or [Wlo87]. Finally, we note that in $H_0^1(\Omega)$ a norm can be defined by
\[ \|y\|^2_{H_0^1(\Omega)} := \int_\Omega |\nabla y|^2\, dx, \]
which turns out to be equivalent to the norm in $H^1(\Omega)$. Consequently, there are suitable positive constants $c_1$ and $c_2$ such that
\[ c_1\, \|y\|_{H_0^1(\Omega)} \le \|y\|_{H^1(\Omega)} \le c_2\, \|y\|_{H_0^1(\Omega)} \quad \forall\, y \in H_0^1(\Omega); \]
cf. the estimate (2.10) on page 33 and the remark following it.

2.3. Weak solutions to elliptic equations

In order to keep the exposition to a reasonable length, we shall not give a comprehensive treatment of elliptic boundary value problems. Instead, we confine ourselves to a few types of elliptic equations, for which basic concepts of optimal control theory will be developed later in this book; in particular, we shall focus on equations containing the Laplacian or, more generally, differential operators in divergence form. In this section, we generally assume that $\Omega \subset \mathbb{R}^N$, $N \ge 2$, is a bounded Lipschitz domain with boundary $\Gamma$.

2.3.1. Poisson's equation. We begin our study with the elliptic boundary value problem
\[ \begin{aligned} -\Delta y &= f && \text{in } \Omega \\ y &= 0 && \text{on } \Gamma, \end{aligned} \tag{2.2} \]
where $f \in L^2(\Omega)$ is given. Such functions $f$ may be very irregular. For example, imagine that the open unit square $\Omega \subset \mathbb{R}^2$ is divided into square subdomains in the form of a chessboard, and that $f$ equals unity on the black squares and zero on the others. Since the boundaries between the subdomains have zero Lebesgue measure, and since functions belonging to $L^2(\Omega)$ cannot be distinguished on sets of zero measure, we do not have to specify the values of $f$ on the interior boundaries.

Obviously, Poisson's equation $-\Delta y = f$ cannot have a classical solution $y \in C^2(\Omega) \cap C(\bar\Omega)$ for such an $f$. Instead, we seek a weak solution $y$ in the space $H_0^1(\Omega)$. Its definition is based on a variational formulation of (2.2).

To this end, we assume for the time being that $f$ is sufficiently smooth and that $y \in C^2(\Omega) \cap C^1(\bar\Omega)$ is a classical solution to (2.2). The domain $\Omega$ is generally assumed to be bounded. Multiplying Poisson's equation by an arbitrary test function $v \in C_0^\infty(\Omega)$ and integrating over $\Omega$, we obtain
\[ -\int_\Omega v\, \Delta y\, dx = \int_\Omega f\, v\, dx, \]
whence, upon using integration by parts,
\[ -\int_\Gamma v\, \partial_\nu y\, ds + \int_\Omega \nabla y \cdot \nabla v\, dx = \int_\Omega f\, v\, dx. \]
Here, $\partial_\nu y$ denotes the normal derivative of $y$, i.e., the directional derivative of $y$ in the direction of the outward unit normal $\nu$ to $\Gamma$. Recall that $\partial_\nu y = \nabla y \cdot \nu$. Since $v$ vanishes on $\Gamma$, it follows that
\[ \int_\Omega \nabla y \cdot \nabla v\, dx = \int_\Omega f\, v\, dx. \]
Note that this equation holds for any $v \in C_0^\infty(\Omega)$. Recalling that $C_0^\infty(\Omega)$ is dense in $H_0^1(\Omega)$, and observing that for fixed $y$ all expressions in the equation depend continuously on $v \in H_0^1(\Omega)$, we conclude its validity for all $v \in H_0^1(\Omega)$. Conversely, one can show that any sufficiently smooth $y \in H_0^1(\Omega)$ satisfying the above equation for each $v \in C_0^\infty(\Omega)$ is a classical solution to Poisson's equation $-\Delta y = f$. In summary, the following definition is justified:

Definition. We call $y \in H_0^1(\Omega)$ a weak solution to the boundary value problem (2.2) if it satisfies the so-called weak or variational formulation
\[ \int_\Omega \nabla y \cdot \nabla v\, dx = \int_\Omega f\, v\, dx \quad \forall\, v \in H_0^1(\Omega). \tag{2.3} \]

Equation (2.3) is also referred to as a variational equality. The boundary condition $y|_\Gamma = 0$ is encoded in the definition of the solution space $H_0^1(\Omega)$. It is remarkable that only (weak) first-order derivatives are needed for a second-order equation.

In order to be able to treat equations more general than Poisson's with a unified approach, we write $V = H_0^1(\Omega)$ and define the bilinear form

$a : V \times V \to \mathbb{R}$,
\[ a[y, v] := \int_\Omega \nabla y \cdot \nabla v\, dx. \tag{2.4} \]
Then the weak formulation (2.3) can be rewritten in the abstract form
\[ a[y, v] = (f, v)_{L^2(\Omega)} \quad \forall\, v \in V. \]
Next, we define the linear and continuous functional (for this notion, see Section 2.4) $F : V \to \mathbb{R}$,
\[ F(v) := (f, v)_{L^2(\Omega)}. \]
Then (2.3) attains the general form
\[ a[y, v] = F(v) \quad \forall\, v \in V. \tag{2.5} \]

We denote by $V^*$ the dual space of $V$, the space of all linear and continuous functionals on $V$ (see page 42); hence, $F \in V^*$. The following result is of fundamental importance to the existence theory for linear elliptic equations. It forms the basis for the proof of the existence and uniqueness of a weak solution to (2.2), as well as to the other linear elliptic boundary value problems investigated in this book.

Lemma 2.2 (Lax and Milgram). Let $V$ be a real Hilbert space, and let $a : V \times V \to \mathbb{R}$ denote a bilinear form. Moreover, suppose that there exist positive constants $\alpha_0$ and $\beta_0$ such that the following conditions are satisfied for all $v, y \in V$:
\[ |a[y, v]| \le \alpha_0\, \|y\|_V\, \|v\|_V \quad \text{(boundedness)} \tag{2.6} \]
\[ a[y, y] \ge \beta_0\, \|y\|_V^2 \quad \text{($V$-ellipticity).} \tag{2.7} \]
Then for every $F \in V^*$ the variational equation (2.5) admits a unique solution $y \in V$. Moreover, there is some constant $c_a > 0$, which does not depend on $F$, such that
\[ \|y\|_V \le c_a\, \|F\|_{V^*}. \tag{2.8} \]

The application of the Lax-Milgram lemma to the case of homogeneous Dirichlet boundary conditions $y|_\Gamma = 0$ requires the following estimate.

Lemma 2.3 (Friedrichs inequality). For any bounded Lipschitz domain $\Omega$ there is a constant $c(\Omega) > 0$, which depends only on the domain $\Omega$, such that
\[ \int_\Omega |y|^2\, dx \le c(\Omega) \int_\Omega |\nabla y|^2\, dx \quad \forall\, y \in H_0^1(\Omega). \]

The proof of this lemma can be found, e.g., in Alt [Alt99], Casas [Cas92], Nečas [Nec67], and Wloka [Wlo87]. Observe that the validity of the Friedrichs inequality is restricted to functions with zero boundary values, i.e., those in $H_0^1(\Omega)$; it cannot hold for general functions in $H^1(\Omega)$, as the counterexample $y(x) \equiv 1$ shows.

Theorem 2.4. If $\Omega$ is a bounded Lipschitz domain, then for every $f \in L^2(\Omega)$ problem (2.2) has a unique weak solution $y \in H_0^1(\Omega)$. Moreover, there is a constant $c_P > 0$, which does not depend on $f$, such that
\[ \|y\|_{H^1(\Omega)} \le c_P\, \|f\|_{L^2(\Omega)}. \tag{2.9} \]

Proof: We apply the Lax-Milgram lemma in $V = H_0^1(\Omega)$. To this end, we verify that the bilinear form (2.4) satisfies the conditions (2.6) and (2.7). Since $H_0^1(\Omega)$ is a subspace of $H^1(\Omega)$, we use the standard $H^1$ norm; see, however, Remark (i) following this proof. The boundedness condition (2.6) for $a$ follows from the Cauchy-Schwarz inequality:
\[ \begin{aligned} \Big| \int_\Omega \nabla y \cdot \nabla v\, dx \Big| &\le \Big( \int_\Omega |\nabla y|^2\, dx \Big)^{1/2} \Big( \int_\Omega |\nabla v|^2\, dx \Big)^{1/2} \\ &\le \Big( \int_\Omega \big( |y|^2 + |\nabla y|^2 \big)\, dx \Big)^{1/2} \Big( \int_\Omega \big( |v|^2 + |\nabla v|^2 \big)\, dx \Big)^{1/2} \\ &\le \|y\|_{H^1(\Omega)}\, \|v\|_{H^1(\Omega)}. \end{aligned} \]
To show the $V$-ellipticity, we estimate, using the Friedrichs inequality,
\[ \begin{aligned} a[y, y] = \int_\Omega |\nabla y|^2\, dx &= \frac{1}{2} \int_\Omega |\nabla y|^2\, dx + \frac{1}{2} \int_\Omega |\nabla y|^2\, dx \\ &\ge \frac{1}{2} \int_\Omega |\nabla y|^2\, dx + \frac{1}{2\, c(\Omega)} \int_\Omega |y|^2\, dx \\ &\ge \frac{1}{2} \min\{1, c(\Omega)^{-1}\}\, \|y\|^2_{H^1(\Omega)}. \end{aligned} \tag{2.10} \]
Hence, the assumptions of Lemma 2.2 are satisfied in $V = H_0^1(\Omega)$. The boundedness of the functional $F$ is again a consequence of the Cauchy-Schwarz inequality. Indeed, we have
\[ |F(v)| = |(f, v)_{L^2(\Omega)}| \le \|f\|_{L^2(\Omega)}\, \|v\|_{L^2(\Omega)} \le \|f\|_{L^2(\Omega)}\, \|v\|_{H^1(\Omega)}, \]

so that $\|F\|_{V^*} \le \|f\|_{L^2(\Omega)}$. Lemma 2.2 yields the existence of a unique solution $y$ to (2.2). Inserting the above estimate for $F$ in (2.8), we conclude that $\|y\|_{H^1(\Omega)} \le c_a\, \|F\|_{V^*} \le c_a\, \|f\|_{L^2(\Omega)}$, which proves (2.9). □

Remarks.
(i) Inequality (2.10) shows that
\[ \|y\|_{H_0^1(\Omega)} := \Big( \int_\Omega |\nabla y|^2\, dx \Big)^{1/2} \]
defines a norm in $H_0^1(\Omega)$ that is equivalent to the standard norm of $H^1(\Omega)$. If $V = H_0^1(\Omega)$ is endowed with this norm a priori, then the assumptions of Lemma 2.2 are directly fulfilled. This is one reason why $\|y\|_{H_0^1(\Omega)}$ is frequently used.
(ii) The Lax-Milgram lemma is also valid for functionals $F \in V^*$ that are not generated by some $f \in L^2(\Omega)$. This fact will be used in the next section.

2.3.2. Boundary conditions of the third kind. In a similar way, we can treat the boundary value problem
\[ \begin{aligned} -\Delta y + c_0\, y &= f && \text{in } \Omega \\ \partial_\nu y + \alpha\, y &= g && \text{on } \Gamma. \end{aligned} \tag{2.11} \]
Here, the functions $f \in L^2(\Omega)$ and $g \in L^2(\Gamma)$, as well as the nonnegative coefficient functions $c_0 \in L^\infty(\Omega)$ and $\alpha \in L^\infty(\Gamma)$, are prescribed. The boundary condition in (2.11) is usually referred to as a boundary condition of the third kind or a Robin boundary condition. Again, $\partial_\nu$ denotes the directional derivative in the direction of the outward unit normal $\nu$ to $\Gamma$.

The above problem is treated in a similar way as (2.2). We multiply the partial differential equation by an arbitrary $v \in C^1(\bar\Omega)$. Under the same assumptions as in Section 2.3.1, integration by parts leads to
\[ -\int_\Gamma v\, \partial_\nu y\, ds + \int_\Omega \nabla y \cdot \nabla v\, dx + \int_\Omega c_0\, y\, v\, dx = \int_\Omega f\, v\, dx. \]
Substitution of the boundary condition $\partial_\nu y = g - \alpha\, y$ then yields that
\[ \int_\Omega \nabla y \cdot \nabla v\, dx + \int_\Omega c_0\, y\, v\, dx + \int_\Gamma \alpha\, y\, v\, ds = \int_\Omega f\, v\, dx + \int_\Gamma g\, v\, ds \tag{2.12} \]
for all $v \in C^1(\bar\Omega)$. Using the fact that $C^1(\bar\Omega)$ is for Lipschitz domains $\Omega$ a dense subset of $H^1(\Omega)$, and assuming that $y \in H^1(\Omega)$, we finally arrive at the following definition:

Definition. A function $y \in H^1(\Omega)$ is called a weak solution to the boundary value problem (2.11) if the variational equality (2.12) holds for all $v \in H^1(\Omega)$.

The boundary condition in (2.11) does not need to be accounted for in the solution space. As a so-called natural boundary condition, it follows automatically for sufficiently smooth solutions. In order to apply the Lax-Milgram lemma to the present situation, we put $V := H^1(\Omega)$ and define the functional $F$ and the bilinear form $a$, respectively, by
\[ \begin{aligned} F(v) &:= \int_\Omega f\, v\, dx + \int_\Gamma g\, v\, ds, \\ a[y, v] &:= \int_\Omega \nabla y \cdot \nabla v\, dx + \int_\Omega c_0\, y\, v\, dx + \int_\Gamma \alpha\, y\, v\, ds. \end{aligned} \tag{2.13} \]
Observe that in this case $F$ can no longer be identified with a function $f \in L^2(\Omega)$; $F$ has a more complicated structure and can only be interpreted as an element of $V^*$. The variational formulation (2.12) is again of the form (2.5). To prove the $V$-ellipticity of $a$ this time, we need the following inequality.

Lemma 2.5. Let $\Omega \subset \mathbb{R}^N$ denote a bounded Lipschitz domain, and let $\Gamma_1 \subset \Gamma$ be a measurable set such that $|\Gamma_1| > 0$. Then there exists a constant $c(\Gamma_1) > 0$, which is independent of $y \in H^1(\Omega)$, such that
\[ \|y\|^2_{H^1(\Omega)} \le c(\Gamma_1) \Big( \int_\Omega |\nabla y|^2\, dx + \Big( \int_{\Gamma_1} y\, ds \Big)^2 \Big) \tag{2.14} \]
for all $y \in H^1(\Omega)$.

The proof of this generalization of the Friedrichs inequality can be found, e.g., in [Cas92] or [Wlo87]. The Friedrichs inequality obviously arises as a special case with $\Gamma_1 := \Gamma$ and functions $y \in H_0^1(\Omega)$. An analogous inequality holds for subsets of $\Omega$: for any set $E \subset \Omega$ having positive measure there exists some constant $c(E) > 0$, which is independent of $y \in H^1(\Omega)$, such that the generalized Poincaré inequality
\[ \|y\|^2_{H^1(\Omega)} \le c(E) \Big( \int_\Omega |\nabla y|^2\, dx + \Big( \int_E y\, dx \Big)^2 \Big) \tag{2.15} \]
holds for all $y \in H^1(\Omega)$; see [Cas92] or [GGZ74]. In the case where $E := \Omega$, Poincaré's inequality results.

We are now in a position to show the existence of a weak solution.

Theorem 2.6. Let $\Omega \subset \mathbb{R}^N$ be a Lipschitz domain, and suppose that almost-everywhere nonnegative functions $c_0 \in L^\infty(\Omega)$ and $\alpha \in L^\infty(\Gamma)$ are given such that
\[ \int_\Omega (c_0(x))^2\, dx + \int_\Gamma (\alpha(x))^2\, ds(x) > 0. \]

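Then for every given pair $f \in L^2(\Omega)$ and $g \in L^2(\Gamma)$, the boundary value problem (2.11) has a unique weak solution $y \in H^1(\Omega)$. Moreover, there is some constant $c_R > 0$, independent of $f$ and $g$, such that
\[ \|y\|_{H^1(\Omega)} \le c_R \big( \|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)} \big). \tag{2.16} \]

Proof: We apply the Lax-Milgram lemma in $V = H^1(\Omega)$. To this end, we have to verify that the bilinear form (2.13) is bounded and $V$-elliptic. In this proof, as throughout this textbook, $c > 0$ denotes a generic constant that depends only on the data of the problem. First, one easily derives
\[ |a[y, v]| = \Big| \int_\Omega \nabla y \cdot \nabla v\, dx + \int_\Omega c_0\, y\, v\, dx + \int_\Gamma \alpha\, y\, v\, ds \Big| \le \alpha_0\, \|y\|_{H^1(\Omega)}\, \|v\|_{H^1(\Omega)}, \]
i.e., the boundedness of $a$. Indeed, this is an immediate consequence of the estimates
\[ \begin{aligned} \Big| \int_\Omega c_0\, y\, v\, dx \Big| &\le \|c_0\|_{L^\infty(\Omega)}\, \|y\|_{L^2(\Omega)}\, \|v\|_{L^2(\Omega)} \le \|c_0\|_{L^\infty(\Omega)}\, \|y\|_{H^1(\Omega)}\, \|v\|_{H^1(\Omega)}, \\ \Big| \int_\Gamma \alpha\, y\, v\, ds \Big| &\le \|\alpha\|_{L^\infty(\Gamma)}\, \|y\|_{L^2(\Gamma)}\, \|v\|_{L^2(\Gamma)} \le \|\alpha\|_{L^\infty(\Gamma)}\, c\, \|y\|_{H^1(\Omega)}\, \|v\|_{H^1(\Omega)}, \end{aligned} \]
where the trace theorem has been used for the latter estimate.

To show the $V$-ellipticity, we argue as follows. In view of the assumptions, we have $c_0 \ne 0$ in $L^\infty(\Omega)$ or $\alpha \ne 0$ in $L^\infty(\Gamma)$. If $c_0 \ne 0$, then there exist a measurable set $E \subset \Omega$ with $|E| > 0$ and some $\delta > 0$ such that $c_0(x) \ge \delta$ for all $x \in E$. Hence, invoking (2.15) and the inequality $\big( \int_E y\, dx \big)^2 \le |E| \int_E y^2\, dx$, we find that
\[ \begin{aligned} a[y, y] &= \int_\Omega \big( |\nabla y|^2 + c_0\, |y|^2 \big)\, dx + \int_\Gamma \alpha\, |y|^2\, ds \ge \int_\Omega |\nabla y|^2\, dx + \delta \int_E |y|^2\, dx \\ &\ge \min\{1, \delta\} \Big( \int_\Omega |\nabla y|^2\, dx + \int_E y^2\, dx \Big) \\ &\ge \frac{\min\{1, \delta\}}{c(E)\, \max\{1, |E|\}}\, \|y\|^2_{H^1(\Omega)}. \end{aligned} \]
In the case where $\alpha \ne 0$ there exist a measurable set $\Gamma_1 \subset \Gamma$ with $|\Gamma_1| > 0$ and some $\delta > 0$ such that $\alpha(x) \ge \delta$ for all $x \in \Gamma_1$. In view of (2.14), similar reasoning yields that in this case,
\[ a[y, y] \ge \int_\Omega |\nabla y|^2\, dx + \delta \int_{\Gamma_1} |y|^2\, ds \ge \frac{\min\{1, \delta\}}{c(\Gamma_1)\, \max\{1, |\Gamma_1|\}}\, \|y\|^2_{H^1(\Omega)}. \tag{2.17} \]
Consequently, the assumptions of Lemma 2.2 are satisfied. In addition, employing the trace theorem once more, we can conclude as follows:
\[ \begin{aligned} |F(v)| &\le \int_\Omega |f\, v|\, dx + \int_\Gamma |g\, v|\, ds \\ &\le \|f\|_{L^2(\Omega)}\, \|v\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)}\, \|v\|_{L^2(\Gamma)} \\ &\le \|f\|_{L^2(\Omega)}\, \|v\|_{H^1(\Omega)} + c\, \|g\|_{L^2(\Gamma)}\, \|v\|_{H^1(\Omega)} \\ &\le c\, \big( \|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)} \big)\, \|v\|_{H^1(\Omega)}. \end{aligned} \]
But this means that $\|F\|_{V^*} \le c\, \big( \|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)} \big)$, and the asserted estimate for $\|y\|_{H^1(\Omega)}$ then follows from the Lax-Milgram lemma. This concludes the proof.

2.3.3. Differential operators in divergence form. The boundary value problems investigated in Sections 2.3.1 and 2.3.2 are special cases of the problem
\[ \begin{aligned} \mathcal{A} y + c_0\, y &= f && \text{in } \Omega \\ \partial_{\nu_{\mathcal{A}}} y + \alpha\, y &= g && \text{on } \Gamma_1 \\ y &= 0 && \text{on } \Gamma_0. \end{aligned} \tag{2.18} \]
Here, $\mathcal{A}$ is an elliptic differential operator of the form
\[ \mathcal{A} y(x) = -\sum_{i,j=1}^{N} D_i \big( a_{ij}(x)\, D_j y(x) \big), \quad x \in \Omega. \tag{2.19} \]
The coefficient functions $a_{ij}$ of $\mathcal{A}$ are assumed to belong to $L^\infty(\Omega)$ and to satisfy the symmetry condition $a_{ij}(x) = a_{ji}(x)$ for all $i, j \in \{1, \ldots, N\}$ and $x \in \Omega$. Moreover, they are assumed to satisfy with some $\gamma_0 > 0$ the condition of uniform ellipticity, that is,
\[ \sum_{i,j=1}^{N} a_{ij}(x)\, \xi_i\, \xi_j \ge \gamma_0\, |\xi|^2 \quad \forall\, \xi \in \mathbb{R}^N \tag{2.20} \]
for almost all $x \in \Omega$. In this more general case we denote by $\partial_{\nu_{\mathcal{A}}}$ the directional derivative in the direction of the conormal vector $\nu_{\mathcal{A}}$ whose components are given by
\[ (\nu_{\mathcal{A}})_i(x) = \sum_{j=1}^{N} a_{ij}(x)\, \nu_j(x), \quad 1 \le i \le N. \tag{2.21} \]
Observe that with the $N \times N$ matrix function $A = (a_{ij})$ we have $\nu_{\mathcal{A}} = A\, \nu$.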
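Conditions (2.20) and (2.21) can be made concrete at a single point $x$. The following is a minimal sketch, not taken from the original text. For a symmetric matrix $A$, the largest constant $\gamma_0$ admissible in (2.20) at that point is the smallest eigenvalue of $A$, and the conormal vector is the matrix-vector product $A\nu$. The sample matrix, the normal vector, and the helper names `ellipticity_constant` and `conormal` are assumptions chosen for illustration. Uniform ellipticity additionally requires one common $\gamma_0 > 0$ for almost all $x$, which a pointwise check does not show.

```python
import numpy as np

def ellipticity_constant(A):
    """Smallest eigenvalue of a symmetric matrix A, i.e. the best gamma_0 in (2.20) at one point."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        raise ValueError("coefficient matrix must be symmetric")
    # eigvalsh is the eigenvalue routine for symmetric (Hermitian) matrices.
    return float(np.linalg.eigvalsh(A).min())

def conormal(A, nu):
    """Conormal vector nu_A = A nu from (2.21)."""
    return np.asarray(A, dtype=float) @ np.asarray(nu, dtype=float)

# Hypothetical constant coefficient matrix (a_ij) and outward unit normal.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
nu = np.array([1.0, 0.0])

gamma0 = ellipticity_constant(A)
nu_A = conormal(A, nu)
print(gamma0, nu_A)
```

For the identity matrix the sketch returns $\gamma_0 = 1$ and $\nu_{\mathcal{A}} = \nu$, which corresponds to the Laplacian case of Sections 2.3.1 and 2.3.2.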

The boundary $\Gamma = \Gamma_0 \cup \Gamma_1$ is split into two disjoint measurable subsets $\Gamma_0$ and $\Gamma_1$, one of which may be empty. Moreover, almost-everywhere nonnegative functions $c_0 \in L^\infty(\Omega)$ and $\alpha \in L^\infty(\Gamma_1)$ are given, as well as functions $f \in L^2(\Omega)$ and $g \in L^2(\Gamma_1)$.

The appropriate solution space for problem (2.18) is
\[ V := \{ y \in H^1(\Omega) : y|_{\Gamma_0} = 0 \}. \]
We thus have $\tau y = 0$ almost everywhere in $\Gamma_0$. The associated bilinear form $a$ is given by
\[ a[y, v] := \int_\Omega \sum_{i,j=1}^{N} a_{ij}\, D_i y\, D_j v\, dx + \int_\Omega c_0\, y\, v\, dx + \int_{\Gamma_1} \alpha\, y\, v\, ds, \tag{2.22} \]
and the weak solution $y \in V$ is defined as the solution to the variational equality
\[ a[y, v] = (f, v)_{L^2(\Omega)} + (g, v)_{L^2(\Gamma_1)} \quad \forall\, v \in V. \]

We have the following well-posedness result.

Theorem 2.7. Suppose that $\Omega \subset \mathbb{R}^N$ is a bounded Lipschitz domain, and suppose that the assumptions from above are satisfied. Moreover, assume that $c_0 \in L^\infty(\Omega)$ and $\alpha \in L^\infty(\Gamma_1)$ satisfy $c_0(x) \ge 0$ and $\alpha(x) \ge 0$ almost everywhere in $\Omega$ and in $\Gamma_1$, respectively. If one of the conditions

(i) $|\Gamma_0| > 0$

(ii) $\Gamma_1 = \Gamma$ and $\int_\Omega (c_0(x))^2\, dx + \int_\Gamma (\alpha(x))^2\, ds(x) > 0$

is satisfied, then for all pairs $f \in L^2(\Omega)$ and $g \in L^2(\Gamma_1)$ problem (2.18) has a unique weak solution $y \in V$. Moreover, there is a constant $c_{\mathcal{A}} > 0$, which depends on neither $f$ nor $g$, such that
\[ \|y\|_{H^1(\Omega)} \le c_{\mathcal{A}} \big( \|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma_1)} \big) \quad \forall\, f \in L^2(\Omega),\ \forall\, g \in L^2(\Gamma_1). \tag{2.23} \]

The proof proceeds along the same lines as that of Theorem 2.6, using Lemma 2.2; see Exercise 2.4. Compare this also with the treatment of equations of the form (2.18) in [Cas92], [Lio71], or [Wlo87].

Remarks.
(i) Assumption (ii) above is equivalent to saying that at least one of the following conditions is satisfied: there is a set $E \subset \Omega$ with $|E| > 0$ such that $c_0(x) > 0$ for almost all $x \in E$; or, there is a set $D \subset \Gamma$ with $|D| > 0$ such that $\alpha(x) > 0$ for almost all $x \in D$.

(ii) In all three cases studied above, the Dirichlet boundary conditions that occurred were merely homogeneous. There are good reasons for this. First, a nonhomogeneous boundary condition of the form $y|_\Gamma = g$ automatically entails that $g$ has the regularity $g \in H^{1/2}(\Gamma)$, provided that $y \in H^1(\Omega)$ (fractional-order Sobolev spaces will be defined in Section 2.14.2). If, as in later sections, $g$ were a control, then it would have to be chosen from $H^{1/2}(\Gamma)$. This does not make sense in many practical applications.

Moreover, the standard variational formulation does not work in the case of inhomogeneous Dirichlet boundary conditions. A possible way out is a reduction to homogeneous boundary conditions by using a function that satisfies the inhomogeneous Dirichlet conditions. In Lions [Lio71], inhomogeneous Dirichlet problems for elliptic and parabolic equations were treated using the so-called transposition method. For the parabolic case, we also refer to Bensoussan et al. [BDPDM92, BDPDM93], where semigroups and the variation of constants formula were employed. Recent results on boundary control involving boundary conditions of Dirichlet type can be found in, e.g., [CR06], [KV07], and [Vex07].

(iii) The estimates (2.9), (2.16), and (2.23), of the type $\|y\| \le c\, (\|f\| + \|g\|)$, are equivalent to saying that the mappings $f \mapsto y$ and $(f, g) \mapsto y$ are continuous between the respective spaces.

Data belonging to $L^p$ spaces with $p < 2$. We reconsider problem (2.18) from page 37 in the form
\[ \begin{aligned} \mathcal{A} y + c_0\, y &= f && \text{in } \Omega \\ \partial_{\nu_{\mathcal{A}}} y + \alpha\, y &= g && \text{on } \Gamma, \end{aligned} \]
where the assumptions of Theorem 2.7 with condition (ii) are assumed to hold. Till now, it has been assumed that $f \in L^2(\Omega)$ and $g \in L^2(\Gamma)$. We are now going to demonstrate that a unique solution $y \in H^1(\Omega)$ exists also if $f \in L^r(\Omega)$ and $g \in L^s(\Gamma)$, for suitably chosen $1 < r, s < 2$. For this purpose, we interpret $f$ and $g$ as elements of $H^1(\Omega)^*$ and define
\[ F_1(v) = \int_\Omega f(x)\, v(x)\, dx, \qquad F_2(v) = \int_\Gamma g(x)\, v(x)\, ds(x). \]
From Sobolev's embedding result, Theorem 7.1 on page 355, we infer that the embedding $H^1(\Omega) \hookrightarrow L^p(\Omega)$ is continuous for all $p < \infty$ if $N = \dim \Omega = 2$, and continuous for all $p \le 2N/(N - 2)$ if $N = \dim \Omega > 2$. Owing to Hölder's inequality, we have
\[ |F_1(v)| \le \|f\|_{L^r(\Omega)}\, \|v\|_{L^p(\Omega)}, \]
where $1/r + 1/p = 1$. In the case of $N = 2$, an arbitrarily large $p$ may be chosen, that is, $r$ may be arbitrarily close to unity. Hence, for $N = 2$ we have $F_1 \in H^1(\Omega)^*$ if $f \in L^r(\Omega)$ merely for some $r > 1$. In the $N > 2$ case, the smallest possible $r$ is given by
\[ \frac{1}{r} + \frac{N - 2}{2N} = 1 \quad \Longrightarrow \quad r = \frac{2N}{N + 2}. \]

Thus, $F_1 \in H^1(\Omega)^*$ for $N > 2$ if $f \in L^r(\Omega)$ for some $r \ge 2N/(N + 2)$.

In a similar way, we can study $F_2$, invoking Theorem 7.2 on page 355. The results can be summarized as follows: if $N = 2$, then the trace $\tau y$ belongs to $L^p(\Gamma)$ for all $p < \infty$, while in the case $N > 2$ we have $\tau y \in L^p(\Gamma)$ only if $p \le 2(N - 1)/(N - 2)$. In summary, $F_2 \in H^1(\Omega)^*$ provided that $g \in L^s(\Gamma)$, where $s > 1$ if $N = 2$ and $s \ge 2 - 2/N$ if $N > 2$.

In any of these cases, the Lax-Milgram theorem ensures the existence of a uniquely determined solution $y \in H^1(\Omega)$ to the above problem. Moreover, we have, with a suitable generic constant $c > 0$, the estimate
\[ \|y\|_{H^1(\Omega)} \le c\, \big( \|f\|_{L^r(\Omega)} + \|g\|_{L^s(\Gamma)} \big). \]

2.4. Linear mappings

2.4.1. Continuous linear operators and functionals. The results of this section are listed without proof. They can be found in most standard textbooks on functional analysis, e.g., Alt [Alt99], Kantorovich and Akilov [KA64], Kreyszig [Kre78], Lusternik and Sobolev [LS74], Wouk [Wou79], and Yosida [Yos80].

In the following, $\{U, \|\cdot\|_U\}$ and $\{V, \|\cdot\|_V\}$ denote normed spaces over $\mathbb{R}$.

Definition. We say that a mapping $A : U \to V$ is linear or a linear operator if $A(u + v) = A u + A v$ and $A(\lambda v) = \lambda A v$ for all $u, v \in U$ and $\lambda \in \mathbb{R}$. A linear mapping $f : U \to \mathbb{R}$ is called a linear functional.

More generally, real- or complex-valued mappings are referred to as functionals.

Definition. We call a mapping $A : U \to V$ continuous on $U$ if for any sequence $\{u_n\}_{n=1}^\infty \subset U$ with $\lim_{n \to \infty} \|u_n - u\|_U = 0$ we have
\[ \lim_{n \to \infty} \|A u_n - A u\|_V = 0. \]

Definition. A linear operator $A : U \to V$ is said to be bounded if there is a constant $c(A) > 0$ such that
\[ \|A u\|_V \le c(A)\, \|u\|_U \quad \forall\, u \in U. \]

Theorem 2.8. A linear operator is bounded if and only if it is continuous.

Example. We take $U = V = C[0, 1]$ and consider the integral operator $A$ defined by
\[ (A u)(t) = \int_0^1 e^{t - s}\, u(s)\, ds, \quad t \in [0, 1]. \]
Obviously, $A$ is a linear mapping from $U$ into itself. To prove that $A$ is continuous, we show its boundedness and employ the above theorem. We have
\[ |(A u)(t)| \le \int_0^1 e^{t - s}\, |u(s)|\, ds \le e^t\, (1 - e^{-1}) \max_{t \in [0, 1]} |u(t)| \le (e - 1)\, \|u\|_{C[0, 1]} \]
and, therefore,
\[ \|A u\|_U = \max_{t \in [0, 1]} |(A u)(t)| \le (e - 1)\, \|u\|_U. \]
Consequently, $A$ is bounded with $c(A) = e - 1$. ◇

Definition. If $A : U \to V$ is a linear and continuous operator, then
\[ \|A\|_{\mathcal{L}(U, V)} = \sup_{\|u\|_U = 1} \|A u\|_V < +\infty. \]
The finite number $\|A\|_{\mathcal{L}(U, V)}$ is called the (operator) norm of $A$. The shorter notation $\|A\|$ is also commonly used.

Since for linear operators continuity is equivalent to boundedness, there is some $c > 0$ such that $\|A u\|_V \le c\, \|u\|_U$ for all $u \in U$. Obviously, $c = \|A\|$ is the smallest such constant. Also, the term norm is justified, because $\|A\|$ is in fact a norm on the linear space of all linear and continuous mappings from $U$ into $V$; the reader will be asked to verify this in Exercise 2.5.

Definition. $\mathcal{L}(U, V)$ denotes the normed space of all linear and continuous mappings from $U$ into $V$, endowed with the operator norm $\|\cdot\|_{\mathcal{L}(U, V)}$. If $U = V$, then we write $\mathcal{L}(U, V) =: \mathcal{L}(U)$.

The space $\mathcal{L}(U, V)$ is complete (and thus a Banach space) if $V$ is complete.

Example: multiplication operator. Let $U = V = L^\infty(\Omega)$, and let a fixed function $a \in L^\infty(\Omega)$ be given. We consider the operator $A : U \to V$ given by
\[ (A u)(x) = a(x)\, u(x) \quad \text{for almost every } x \in \Omega. \]
$A$ is bounded, since
\[ \|A u\|_V = \|a(\cdot)\, u(\cdot)\|_{L^\infty(\Omega)} \le \|a\|_{L^\infty(\Omega)}\, \|u\|_{L^\infty(\Omega)}, \]
where obviously the latter estimate cannot be improved. In conclusion, $A \in \mathcal{L}(L^\infty(\Omega))$, and $\|A\|_{\mathcal{L}(L^\infty(\Omega))} = \|a\|_{L^\infty(\Omega)}$.

As an illustration, consider the operator $A : L^\infty(0, 1) \to L^\infty(0, 1)$,
\[ (A u)(x) = x^2\, u(x). \]

We have $\|A\| = 1$, since the function $a(x) = x^2$ belongs to the unit sphere in $L^\infty(0, 1)$. ◇

Definition. The space of all continuous linear functionals on $\{U, \|\cdot\|_U\}$, denoted by $U^*$, is called the dual space of $U$.

Observe that $U^* = \mathcal{L}(U, \mathbb{R})$. The associated norm is given by
\[ \|f\|_{U^*} = \sup_{\|u\|_U = 1} |f(u)|. \]
Moreover, since $\mathbb{R}$ is a complete space, the dual space $U^*$ is always a Banach space.

Example. We consider the linear functional $f(u) = u(\tfrac{1}{2})$ on $U = C[0, 1]$. Since
\[ |f(u)| = |u(1/2)| \le \max_{t \in [0, 1]} |u(t)| = 1 \cdot \|u\|_{C[0, 1]} \quad \forall\, u \in C[0, 1], \]
we see that $f$ is bounded with $\|f\|_{U^*} \le 1$. Moreover, for $u(t) \equiv 1$ it follows that $|f(u)| = 1 = \|u\|$, and thus $\|f\|_{U^*} \ge 1$. In summary, $\|f\|_{U^*} = 1$. ◇

In the following, we are concerned with the explicit representation of continuous linear functionals, aiming at a characterization of dual spaces. Note that there can be many different ways to represent the same continuous linear functional; for instance, the expressions
\[ F(v) = \int_0^1 \ln\big( \exp(3 v(x) - 5) \big)\, dx + 5, \qquad G(v) = 3 v, \tag{2.24} \]
while looking quite different, represent the same functional on $\mathbb{R}$. The following result, which settles the representation problem for Hilbert spaces in terms of the scalar product, is of fundamental importance.

Theorem 2.9 (Riesz representation theorem). Let $\{H, (\cdot\,, \cdot)_H\}$ be a real Hilbert space. Then for any continuous linear functional $F \in H^*$ there exists a uniquely determined $f \in H$ such that $\|F\|_{H^*} = \|f\|_H$ and
\[ F(v) = (f, v)_H \quad \forall\, v \in H. \]

By virtue of this result, we can identify $H^*$ with $H$, writing $H = H^*$. For example, in the case of the functional on $H = \mathbb{R}$ for which different representations were given in (2.24), the canonical form referred to in the theorem is that of $G$, with $f = 3 \in \mathbb{R}$.

Next, we introduce the fundamental notion of reflexivity. To this end, let $U$ denote a real Banach space with associated dual space $U^*$. We fix an arbitrary $u \in U$, let $f$ vary over $U^*$, and consider the mapping $F_u : U^* \to \mathbb{R}$ induced by $u$,
\[ F_u : f \mapsto f(u). \]
Clearly, $F_u$ is linear, and its continuity is a consequence of the simple estimate
\[ |F_u(f)| = |f(u)| \le \|u\|_U\, \|f\|_{U^*}. \]
Hence, the functional $F_u$ induced by $u$ belongs to the dual space $(U^*)^* =: U^{**}$ of $U^*$. Since the mapping $u \mapsto F_u$ turns out to be injective, we may identify $u$ with $F_u$, thereby interpreting $u \in U$ as an element of $U^{**}$.

The space $U^{**}$ is called the bidual space of $U$. In light of the above identification, it is always true that $U \subset U^{**}$. The mapping $u \mapsto F_u$ from $U$ into $U^{**}$ is called the canonical embedding or canonical mapping. If this mapping is surjective, i.e., if $U = U^{**}$, then $U$ is called a reflexive space. In the case of reflexive spaces, taking the dual twice leads back to the original space. In particular, we infer from the Riesz representation theorem that Hilbert spaces are always reflexive.

Example. The spaces $L^p(E)$ introduced in Section 2.2.1 are also reflexive if $1 < p < \infty$. In fact, it can be shown that the dual space $L^p(E)^*$ can be identified with $L^q(E)$, where the conjugate exponent $q$ of $p$ is given by the relation $\frac{1}{p} + \frac{1}{q} = 1$. More precisely, to every continuous linear functional $F \in L^p(E)^*$ there corresponds a uniquely determined function $f \in L^q(E)$ such that
\[ F(u) = \int_E f(x)\, u(x)\, dx \quad \forall\, u \in L^p(E). \]
Repeating this argument, we arrive at the conclusion that the bidual space $L^p(E)^{**}$ can be identified with $L^p(E)$, which proves the reflexivity. Observe that the continuity of the above functional $F$ is a consequence of Hölder's inequality for integrals,
\[ \int_E |f(x)|\, |u(x)|\, dx \le \Big( \int_E |f(x)|^q\, dx \Big)^{1/q} \Big( \int_E |u(x)|^p\, dx \Big)^{1/p}. \tag{2.25} \]
◇

Remark. Note that the spaces $L^\infty(E)$ and $L^1(E)$ are not reflexive. Indeed, while $L^1(E)^*$ can be identified with $L^\infty(E)$, the dual space of $L^\infty(E)$ cannot be identified with $L^1(E)$.

2.4.2. Weak convergence. The contents of this subsection are of importance mainly for proving the existence of optimal controls; thus, they may for the time being be skipped by readers who are more interested in actually finding the solution to optimal control problems. In the following, the

underlying spaces will always be Banach spaces, even though not all of the results require the completeness property.

Definition. Let $U$ be a real Banach space. We say that a sequence $\{u_n\}_{n=1}^\infty \subset U$ converges weakly to some $u \in U$ if
\[ \lim_{n \to \infty} f(u_n) = f(u) \quad \forall\, f \in U^*. \]
We denote weak convergence by the symbol $\rightharpoonup$, that is, we write $u_n \rightharpoonup u$ as $n \to \infty$.

The limit $u$ is uniquely determined and is called the weak limit of the sequence. Moreover, it follows from the Banach-Steinhaus theorem, which is a consequence of the principle of uniform boundedness, that $\{\|u_n\|\}_{n=1}^\infty \subset \mathbb{R}$ is bounded for any weakly convergent sequence $\{u_n\}_{n=1}^\infty \subset U$.

Examples.
(i) If a sequence $\{u_n\}_{n=1}^\infty \subset U$ converges strongly (that is, with respect to the norm of $U$) to $u \in U$, then it also converges weakly to $u$, i.e.,
\[ u_n \to u \quad \Longrightarrow \quad u_n \rightharpoonup u \quad \text{as } n \to \infty. \]

(ii) By virtue of the Riesz representation theorem, weak convergence in a Hilbert space $\{H, (\cdot\,, \cdot)\}$ is equivalent to
\[ \lim_{n \to \infty} (v, u_n) = (v, u) \quad \forall\, v \in H. \]
Moreover, if $u_n \rightharpoonup u$ and $v_n \to v$ (strong convergence), then $(v_n, u_n) \to (v, u)$ as $n \to \infty$; see Exercise 2.8. In other words, the scalar products of the terms of a weakly convergent sequence and a strongly convergent one tend to the scalar product of the associated limits.

(iii) We consider in the Hilbert space $H = L^2(0, 2\pi)$ the sequence of functions
\[ u_n(x) = \frac{1}{\sqrt{\pi}} \sin(n x), \quad x \in (0, 2\pi). \]
Moreover, let $f \in L^2(0, 2\pi)$ be arbitrary. Then
\[ (f, u_n) = \int_0^{2\pi} f(x)\, \frac{1}{\sqrt{\pi}} \sin(n x)\, dx \]
defines the $n$th Fourier coefficient associated with $f$ with respect to the orthonormal system consisting of the functions $\sin(n x)/\sqrt{\pi}$, $n \in \mathbb{N}$, in $L^2(0, 2\pi)$. Owing to the well-known Bessel inequality, the sequence of coefficients tends to zero as $n \to \infty$, that is,
\[ (f, u_n) \to 0 \quad \text{as } n \to \infty. \]
Now observe that $0 = (f, 0)$ for all $f \in H$. Consequently, the sequence $\{u_n\}_{n=1}^\infty$ converges weakly to the zero function:
\[ u_n = \frac{1}{\sqrt{\pi}} \sin(n\, \cdot) \rightharpoonup 0 \quad \text{as } n \to \infty. \]
On the other hand, we have
\[ \|u_n\|^2 = \frac{1}{\pi} \int_0^{2\pi} \sin^2(n x)\, dx = 1 \quad \forall\, n \in \mathbb{N}. \]

Conclusion. There exist sequences that converge weakly to the zero function even though all their terms belong to the unit sphere.

The above sequence of sine functions, while converging weakly to the zero function, oscillates ever more strongly as $n$ increases. This example shows that little (if any) information about the actual pointwise convergence behavior can be extracted from the mere fact that a sequence is weakly convergent. Therefore, the notion of weak convergence is not of major importance from the numerical point of view. However, in the context of proving existence results it plays a fundamental role. We are now going to provide some results that form the conceptual basis for the application of the notion of weak convergence.

Definition. Let $U$ and $V$ denote real Banach spaces. A mapping $F : U \to V$ is said to be weakly sequentially continuous if the following holds: whenever a sequence $\{u_n\}_{n=1}^\infty \subset U$ converges weakly in $U$ to some $u \in U$, its image $\{F(u_n)\}_{n=1}^\infty \subset V$ converges weakly to $F(u)$ in $V$; that is,
\[ u_n \rightharpoonup u \quad \Longrightarrow \quad F(u_n) \rightharpoonup F(u) \quad \text{as } n \to \infty. \]

Examples.
(i) Every continuous linear operator $A : U \to V$ is weakly sequentially continuous.
The proof of this statement is easy: suppose that $u_n \rightharpoonup u$. We have to show that then $A u_n \rightharpoonup A u$, i.e., that $f(A u_n) \to f(A u)$ for all $f \in V^*$. Now if $f \in V^*$ is fixed, then the functional $F(u) := f(A u)$ is obviously linear and continuous on $U$, and thus belongs to $U^*$. Hence, we must have $F(u_n) \to F(u)$ or, in view of the definition of $F$, $f(A u_n) \to f(A u)$. Since $f$ was arbitrarily chosen, we can conclude that $A u_n \rightharpoonup A u$.

(ii) The functional $f(u) = \|u\|$ is not weakly sequentially continuous in the Hilbert space $H = L^2(0, 2\pi)$. The sequence of sine functions $u_n(x) =$

$\sin(nx)/\sqrt{\pi}$ from above serves as a counterexample. Indeed, we know that $u_n \rightharpoonup 0$ as $n \to \infty$ but

$$\lim_{n\to\infty} f(u_n) = \lim_{n\to\infty} \|u_n\| = 1 \neq \|0\| = f(0).$$

The fact that the norm in the Hilbert space $H = L^2(0, 2\pi)$ is not weakly sequentially continuous presents a problem that will have to be attended to when dealing with infinite-dimensional Banach spaces. It is one reason for the introduction of the concept of weak lower semicontinuity; cf. the example following Theorem 2.12.

Definition. Let $M$ be a subset of a real Banach space $U$. We say that $M$ is weakly sequentially closed if the limit of every weakly convergent sequence $\{u_n\}_{n=1}^\infty \subset M$ lies in $M$. We say that $M$ is weakly sequentially relatively compact if every sequence $\{u_n\}_{n=1}^\infty \subset M$ contains a weakly convergent subsequence; if, in addition, $M$ is weakly sequentially closed, then $M$ is said to be weakly sequentially compact.

The reader will be asked to verify in Exercise 2.7 that every strongly convergent sequence also converges weakly. As the above example involving sine functions shows, the converse is false in general; that is to say, in general there are more weakly convergent sequences than strongly convergent ones.

Conclusion. Any weakly sequentially closed set is also (strongly) closed; however, not every (strongly) closed set must be weakly sequentially closed.

For instance, the unit sphere in the space $H = L^2(0, 2\pi)$ is closed but not weakly sequentially closed: the sequence of sine functions $\{\sin(nx)/\sqrt{\pi}\}$ belongs to the unit sphere while its weak limit, the zero function, does not.

The next two results can be found in, e.g., [Kre78], [Wou79], and [Yos80].

Theorem 2.10. Every bounded subset of a reflexive Banach space is weakly sequentially relatively compact.

The above result is the main reason why the concept of weak convergence is of such fundamental importance: it says that the notion of weak sequential relative compactness can in a certain sense take over the role of relative compactness. It follows from a theorem of Eberlein and Shmulian that this property even characterizes reflexive Banach spaces; see [Yos80].

Definition.

(i) A subset $C$ of a real Banach space $U$ is said to be convex if for any pair $u, v \in C$ and any $\lambda \in [0, 1]$ the convex combination $\lambda u + (1 - \lambda) v$ lies in $C$.

(ii) A functional $f : C \to \mathbb{R}$ is said to be convex if

$$f(\lambda u + (1 - \lambda) v) \le \lambda f(u) + (1 - \lambda) f(v)$$

for all $\lambda \in [0, 1]$ and all $u, v \in C$. The functional is said to be strictly convex if the above inequality holds with strict inequality in place of $\le$ whenever $u \neq v$ and $\lambda \in (0, 1)$.

Theorem 2.11. Every convex and closed subset of a Banach space is weakly sequentially closed. If the space is reflexive and the set is in addition bounded, then it is weakly sequentially compact.

The first assertion of the theorem is an easy consequence of Mazur's theorem, which states that the weak limit of a weakly convergent sequence is at the same time the strong limit of a sequence consisting of suitable convex combinations of the terms of the sequence. This part of the assertion is already true in normed spaces; see [BP78] and [Wer97]. The second assertion follows from Theorem 2.10.

Theorem 2.12. Every continuous and convex functional $f : U \to \mathbb{R}$ on a Banach space $U$ is weakly lower semicontinuous; that is, for any sequence $\{u_n\}_{n=1}^\infty \subset U$ such that $u_n \rightharpoonup u$ as $n \to \infty$ we have

$$\liminf_{n\to\infty} f(u_n) \ge f(u).$$

For a proof of this result, we refer the interested reader to [BP78], [Wer97], or [Wou79]. Note that the preceding two theorems underline the key importance of the concept of convexity for the treatment of optimization problems in function spaces.

Example. The functional $f(u) = \|u\|$ is obviously continuous on any Banach space. It is also convex, since it follows from the triangle inequality and homogeneity that

$$\|\lambda u + (1 - \lambda) v\| \le \lambda \|u\| + (1 - \lambda) \|v\| \quad \forall \lambda \in [0, 1],\ \forall u, v \in U.$$

Owing to the above theorem, the norm functional is thus weakly lower semicontinuous on $U$.

Remark. In the literature, the notions of weak compactness and weak closedness in the sense of the weak topology are often used in place of weak sequential compactness and weak sequential closedness, respectively. This may lead to confusion and sometimes renders the study of the relevant literature a bit difficult. It should be noted, however, that in reflexive Banach spaces the two concepts are equivalent; see [Alt99], Section 6.7, or [Con90].

2.5. Existence of optimal controls

In this chapter, we are concerned with optimal control problems for linear elliptic differential equations. In the course of our study, we will discuss the following fundamental questions: Does a solution to the problem (i.e., an optimal control with associated optimal state) exist? What optimality conditions must possible solutions necessarily satisfy? How can their solutions be determined numerically? We first investigate the problem of existence, beginning with the simplest of the examples presented in Section 2.3, namely the boundary value problem for Poisson's equation.

We remark generally that if for a given problem existence cannot be shown by standard techniques, this is often due to mistakes made during the process of modeling; such mistakes are also likely to lead to numerical difficulties.

In this section, we make the following general assumptions on the data that characterize the problems under study. In this connection, $E$ denotes a set whose actual meaning varies from case to case and will become clear from the context.

Assumption 2.13. $\Omega \subset \mathbb{R}^N$ denotes a bounded Lipschitz domain with boundary $\Gamma$, and we assume that we are given $\lambda \ge 0$, $y_\Omega \in L^2(\Omega)$, $y_\Gamma \in L^2(\Gamma)$, $\beta \in L^\infty(\Omega)$, and $\alpha \in L^\infty(\Gamma)$ with $\alpha(x) \ge 0$ for almost every $x \in \Gamma$, as well as functions $u_a, u_b, v_a, v_b \in L^2(E)$ having the property that $u_a(x) \le u_b(x)$ and $v_a(x) \le v_b(x)$ for almost every $x \in E$.

In this connection, $y_\Omega$ and $y_\Gamma$ represent desired functions (i.e., targets to be approximated), $\alpha$ and $\beta$ are coefficient functions, and the functions $u_a$, $u_b$, $v_a$, and $v_b$ will define the sets of admissible controls acting on $E = \Omega$ or on $E = \Gamma$. In most cases to follow, the control function will be denoted by $u$. This commonly used notation goes back to the Russian word "upravlenie" for control. If, however, both a distributed control and a boundary control occur in a problem, then $u$ will denote the boundary control and $v$ the distributed control.

2.5.1. Optimal stationary heat sources. As the first case study, we investigate the problem of finding an optimal heat source under homogeneous Dirichlet boundary conditions, which can be written in the form

$$\min J(y, u) := \frac{1}{2} \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|u\|_{L^2(\Omega)}^2, \tag{2.26}$$

subject to the constraints

$$\begin{aligned} -\Delta y &= \beta u && \text{in } \Omega \\ y &= 0 && \text{on } \Gamma \end{aligned} \tag{2.27}$$

and

$$u_a(x) \le u(x) \le u_b(x) \quad \text{for almost every } x \in \Omega. \tag{2.28}$$

First, we have to decide from which class of functions the control $u$ should be selected. Continuous functions are not eligible, since the set of all continuous functions $u$ such that $u_a \le u \le u_b$ does not, as a rule, have the compactness properties needed to prove existence; for instance, this applies to the case of continuous bounds satisfying $u_a(x) < u_b(x)$ on $\Omega$. Moreover, it will turn out that optimal controls may have jump discontinuities if $\lambda = 0$. With these considerations, a natural choice for the control space is given by the Hilbert space $L^2(\Omega)$. We thus define the set of admissible controls by

$$U_{ad} = \{u \in L^2(\Omega) : u_a(x) \le u(x) \le u_b(x) \text{ for almost every } x \in \Omega\}.$$

$U_{ad}$ is a nonempty, closed, and convex subset of $L^2(\Omega)$; see Exercise 2.9. Its elements are called admissible controls.

Owing to Theorem 2.4 on page 33, to every $u \in U_{ad}$ there corresponds a unique weak solution $y \in H_0^1(\Omega)$ to the boundary value problem (2.27), called the state associated with $u$. The space

$$Y := H_0^1(\Omega)$$

is referred to as the state space. The dependence of $y$ on $u$ is expressed by the notation $y = y(u)$. The context will always ensure that this expression cannot be confused with the value $y(x)$ of $y$ at $x \in \Omega$.

Definition. We call a control $\bar u \in U_{ad}$ optimal and $\bar y = y(\bar u)$ the associated optimal state if

$$J(\bar y, \bar u) \le J(y(u), u) \quad \forall u \in U_{ad}.$$

For the treatment of the existence question, we now rewrite the optimal control problem as an optimization problem in terms of $u$.
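To make the ingredients of problem (2.26)-(2.28) concrete, the following sketch discretizes a one-dimensional analogue. It is an illustration under stated assumptions, not the book's method: the domain is taken as $\Omega = (0,1)$, the coefficient $\beta$ and the bounds $u_a$, $u_b$ are constants, the state equation is replaced by central finite differences, and the names `solve_state`, `cost`, and `is_admissible` are hypothetical.

```python
def solve_state(u, beta=1.0):
    """Approximate the state y(u): -y'' = beta*u on (0,1), y(0) = y(1) = 0.

    u holds control values at the n interior points of a uniform grid.
    The tridiagonal system (2*y_i - y_{i-1} - y_{i+1}) / h^2 = beta*u_i
    is solved with the Thomas algorithm.
    """
    n = len(u)
    h = 1.0 / (n + 1)
    diag = [2.0] * n
    rhs = [beta * ui * h * h for ui in u]
    # Forward elimination; both off-diagonals are constantly -1.
    for i in range(1, n):
        w = -1.0 / diag[i - 1]
        diag[i] += w
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    y = [0.0] * n
    y[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        y[i] = (rhs[i] + y[i + 1]) / diag[i]
    return y


def cost(u, y_target, lam, beta=1.0):
    """Discrete version of J: 0.5*||y - y_target||^2 + 0.5*lam*||u||^2."""
    n = len(u)
    h = 1.0 / (n + 1)
    y = solve_state(u, beta)
    tracking = 0.5 * h * sum((yi - ti) ** 2 for yi, ti in zip(y, y_target))
    control = 0.5 * lam * h * sum(ui * ui for ui in u)
    return tracking + control


def is_admissible(u, u_a, u_b):
    """Check the box constraint u_a <= u <= u_b at every grid point."""
    return all(u_a <= ui <= u_b for ui in u)


if __name__ == "__main__":
    u = [1.0] * 99                      # constant heat source
    y = solve_state(u)                  # exact solution is x*(1-x)/2
    print(max(y), cost(u, [0.0] * 99, lam=0.1), is_admissible(u, 0.0, 1.0))
```

In this discrete setting the map from `u` to `solve_state(u)` is linear, mirroring the linearity of the dependence $u \mapsto y(u)$, and `cost` depends on `u` alone once the state has been eliminated.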

Definition. The mapping $G : L^2(\Omega) \to H_0^1(\Omega)$, $u \mapsto y(u)$, defined by Theorem 2.4 on page 33 is called the control-to-state operator.

Obviously, $G$ is a linear mapping and, by virtue of the estimate (2.9), also continuous.

In view of the obvious estimate $\|y\|_{L^2(\Omega)} \le \|y\|_{H^1(\Omega)}$, the space $H^1(\Omega)$ and its subspace $H_0^1(\Omega)$ are linearly and continuously embedded in $L^2(\Omega)$. Therefore, $G$ may also be viewed as a continuous linear operator with range in $L^2(\Omega)$, which we will do henceforth. In other words, we consider the operator $E_Y G$ instead of $G$, where $E_Y : H^1(\Omega) \to L^2(\Omega)$ denotes the embedding operator that assigns to each function $y \in Y = H^1(\Omega)$ the same function in $L^2(\Omega)$. More precisely, we have to interpret $E_Y$ first as an operator acting between $H_0^1(\Omega)$ and $L^2(\Omega)$. However, $H_0^1(\Omega)$ is a subspace of $H^1(\Omega)$, and the norms of the two spaces are equivalent, so we avoid in this way the use of two different embedding operators. Note that $E_Y$ is a linear and continuous operator. The operator thus defined is denoted by $S$, that is,

$$S = E_Y G.$$

In the following, $S$ will always represent that part of the state $y$ that actually occurs in the quadratic cost functional. This can be either $y$ itself or its trace $y|_\Gamma$. In the problem of stationary heat sources, we thus have

$$S : L^2(\Omega) \to L^2(\Omega), \quad u \mapsto y(u).$$

The use of $S$ has the advantage that the adjoint operator $S^*$ (see Section 2.7 for the definition of this notion) also acts in the space $L^2(\Omega)$. Moreover, the optimal control problem (2.26)-(2.28) reduces to the following quadratic optimization problem in the Hilbert space $L^2(\Omega)$:

$$\min_{u \in U_{ad}} f(u) := \frac{1}{2} \|S u - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|u\|_{L^2(\Omega)}^2. \tag{2.29}$$

The functional $f$ just defined is referred to as the reduced functional. The following existence result for problem (2.29) will be applied repeatedly during the course of this textbook.

Theorem 2.14. Let $\{U, \|\cdot\|_U\}$ and $\{H, \|\cdot\|_H\}$ denote real Hilbert spaces, and let a nonempty, closed, bounded, and convex set $U_{ad} \subset U$, as well as some $y_d \in H$ and constant $\lambda \ge 0$ be given. Moreover, let $S : U \to H$ be a continuous linear operator. Then the quadratic Hilbert space optimization problem

$$\min_{u \in U_{ad}} f(u) := \frac{1}{2} \|S u - y_d\|_H^2 + \frac{\lambda}{2} \|u\|_U^2 \tag{2.30}$$

admits an optimal solution $\bar u$. If $\lambda > 0$ or $S$ is injective, then the solution is uniquely determined.

Proof: Since $f(u) \ge 0$, there exists the infimum

$$j := \inf_{u \in U_{ad}} f(u),$$

and there is a sequence $\{u_n\}_{n=1}^\infty \subset U_{ad}$ such that $f(u_n) \to j$ as $n \to \infty$. $U_{ad}$ is bounded and closed but, in contrast to the existence result of Theorem 1.1 for the finite-dimensional case, not necessarily compact. However, as a Hilbert space, $U$ is reflexive; hence, by virtue of Theorem 2.11, its bounded, closed, and convex subset $U_{ad}$ is weakly sequentially compact. Consequently, some subsequence $\{u_{n_k}\}_{k=1}^\infty$ converges weakly to some $\bar u \in U_{ad}$, that is,

$$u_{n_k} \rightharpoonup \bar u \quad \text{as } k \to \infty.$$

Since $S$ is continuous, $f$ is also continuous. At this point it would be a mistake to conclude that this implies $f(u_{n_k}) \to f(\bar u)$. Instead, we have to invoke the convexity of $f$, which together with the continuity ensures that $f$ is weakly lower semicontinuous. Consequently,

$$f(\bar u) \le \liminf_{k\to\infty} f(u_{n_k}) = j.$$

Since $\bar u \in U_{ad}$, we must have $f(\bar u) = j$, and $\bar u$ is therefore an optimal control.

The asserted uniqueness follows from the strict convexity of $f$. If $\lambda > 0$, this follows immediately from the second summand of $f$, while in the case of $\lambda = 0$ the strict convexity is a consequence of the injectivity of $S$; see Exercise 2.10. □

Remark. The proof only made use of the fact that $f$ is continuous and convex. The existence result thus holds for any functional $f : U \to \mathbb{R}$ having these properties in a Hilbert space $U$. By virtue of Theorem 2.11, the whole assertion remains true also for reflexive Banach spaces $U$.

As a consequence of the above theorem, we obtain an existence and uniqueness result for the elliptic optimal control problem (2.26)-(2.28):

Theorem 2.15. Suppose that the conditions of Assumption 2.13 are fulfilled. Then the problem (2.26)-(2.28) has at least one optimal control $\bar u$. If, in addition, $\lambda > 0$ or $\beta \neq 0$ almost everywhere in $\Omega$, then the solution is unique.

Proof: We apply the previous theorem with $U = H = L^2(\Omega)$, $y_d = y_\Omega$, and $S = E_Y G$. The set $U_{ad} = \{u \in L^2(\Omega) : u_a \le u \le u_b \text{ a.e. in } \Omega\}$ is bounded, closed, and convex. Hence, it follows from Theorem 2.14 that the corresponding problem (2.30) admits at least one solution $\bar u$, which is unique if $\lambda > 0$. In the $\lambda = 0$ case, we have $\beta \neq 0$ almost everywhere in $\Omega$, which implies that the operator $S$ is injective. Indeed, if $Su = 0$, then $y = 0$, and inserting this into the differential equation yields $\beta u = 0$ and thus $u = 0$ almost everywhere in $\Omega$. In conclusion, $S$ is injective, that is, we have uniqueness also for this case. This concludes the proof of the assertion.

Remark. In the proof of Theorem 2.14, $\bar u$ is obtained as the limit of a weakly convergent sequence $\{u_n\}$. Since the control-to-state operator $G : L^2(\Omega) \to H_0^1(\Omega)$ is a continuous linear operator, it is also weakly continuous. This implies that the sequence of states $\{y_n\}$ converges weakly in $H_0^1(\Omega)$ to $\bar y = G \bar u$.

We now allow for one or both of the inequality constraints defining $U_{ad}$ to be absent. Formally, this can be expressed by putting $u_a = -\infty$ and/or $u_b = +\infty$. Then $U_{ad}$ is no longer bounded and hence not weakly sequentially compact. However, we still have existence and uniqueness if $\lambda > 0$, as the following result shows.

Theorem 2.16. Suppose that $U_{ad}$ is nonempty, closed, and convex. If $\lambda > 0$, then problem (2.30) has a unique optimal solution.

Proof: By assumption, there exists some $u_0 \in U_{ad}$. Now observe that if $\|u\|_U^2 > 2 \lambda^{-1} f(u_0)$, then

$$f(u) = \frac{1}{2} \|S u - y_d\|_H^2 + \frac{\lambda}{2} \|u\|_U^2 \ge \frac{\lambda}{2} \|u\|_U^2 > f(u_0).$$

Therefore, the search for an optimum can be restricted to the closed, convex, and bounded set $U_{ad} \cap \{u \in U : \|u\|_U^2 \le 2 \lambda^{-1} f(u_0)\}$. The remainder of the proof now proceeds along the same lines as that of the preceding theorem.

As an immediate consequence, we obtain the following result.

Theorem 2.17. Suppose that $u_a = -\infty$ and/or $u_b = +\infty$. If $\lambda > 0$, then under the given conditions the problem (2.26)-(2.28) of finding the optimal stationary heat source has a uniquely determined optimal control.

Optimal stationary heat source with prescribed outside temperature. We now recall another variant of the problem of finding an optimal stationary heat source, in which a Robin boundary condition was given instead of a homogeneous boundary condition of Dirichlet type. The state equation is in this case given by

$$\begin{aligned} -\Delta y &= \beta u && \text{in } \Omega \\ \partial_\nu y &= \alpha (y_a - y) && \text{on } \Gamma, \end{aligned}$$

where the outside temperature $y_a \in L^2(\Gamma)$ and an almost-everywhere nonnegative function $\alpha \in L^\infty(\Gamma)$ with $\int_\Gamma (\alpha(x))^2\, ds > 0$ are prescribed.

This problem can be treated similarly to the case with homogeneous Dirichlet boundary condition, where this time the state space is given by $Y = H^1(\Omega)$. Owing to Theorem 2.6, for each pair of functions $u \in L^2(\Omega)$ and $y_a \in L^2(\Gamma)$ there is a unique weak solution $y \in Y$ to the above boundary value problem. By the superposition principle, we may decompose $y$ in the form

$$y = y(u) + y_0,$$

where $y(u) \in H^1(\Omega)$ is the solution to the boundary value problem for Poisson's equation with homogeneous boundary condition corresponding to the pair $(\beta u, y_a = 0)$, while $y_0 \in H^1(\Omega)$ solves the boundary value problem for Laplace's equation with inhomogeneous boundary condition associated with the pair $(\beta u = 0, y_a)$. Clearly, $G : u \mapsto y(u)$ is linear and maps $L^2(\Omega)$ continuously into $H^1(\Omega)$. Again, we interpret $G$ as an operator with range in $L^2(\Omega)$, that is, $S : L^2(\Omega) \to L^2(\Omega)$, $S = E_Y G$, so that

$$y = S u + y_0.$$

The problem then attains the form

$$\min_{u \in U_{ad}} f(u) := \frac{1}{2} \|S u - (y_\Omega - y_0)\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|u\|_{L^2(\Omega)}^2. \tag{2.31}$$

Invoking Theorem 2.14 and Theorem 2.16, we immediately find that the existence results established in Theorem 2.15 and in Theorem 2.17, respectively, remain valid under the above hypotheses. In particular, there exists a unique optimal control if $\lambda > 0$. If $\lambda = 0$, existence still follows if the threshold functions are bounded; we have uniqueness in this case if $\beta \neq 0$ almost everywhere in $\Omega$.

2.5.2. Optimal stationary boundary temperature. In a similar way, we can treat the problem of finding the optimal stationary boundary temperature. It has the form

$$\min J(y, u) := \frac{1}{2} \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|u\|_{L^2(\Gamma)}^2, \tag{2.32}$$

subject to the constraints

$$\begin{aligned} -\Delta y &= 0 && \text{in } \Omega \\ \partial_\nu y &= \alpha (u - y) && \text{on } \Gamma \end{aligned} \tag{2.33}$$

and

$$u_a(x) \le u(x) \le u_b(x) \quad \text{for almost every } x \in \Gamma. \tag{2.34}$$

To guarantee existence and uniqueness of a solution to the above elliptic boundary value problem, we additionally require that

$$\int_\Gamma (\alpha(x))^2\, ds(x) > 0. \tag{2.35}$$

We seek the control $u$ in $L^2(\Gamma)$ and the corresponding state $y$ in the state space $Y = H^1(\Omega)$. The set of admissible controls is

$$U_{ad} = \{u \in L^2(\Gamma) : u_a(x) \le u(x) \le u_b(x) \text{ for almost every } x \in \Gamma\}.$$

By virtue of Theorem 2.6, for any $u \in L^2(\Gamma)$ the elliptic boundary value problem (2.33) has a unique weak solution $y = y(u) \in H^1(\Omega)$. The operator $G : L^2(\Gamma) \to H^1(\Omega)$, $u \mapsto y(u)$, is continuous. We interpret $G$ as a continuous linear operator mapping $L^2(\Gamma)$ into $L^2(\Omega)$, that is, we take $S = E_Y G$ and $S : L^2(\Gamma) \to L^2(\Omega)$. We have the following result.

Theorem 2.18. Suppose that the conditions of Assumption 2.13 on page 48 and (2.35) are satisfied. Then problem (2.32)-(2.34) has an optimal control, which is unique if $\lambda > 0$.

This result is also a consequence of Theorem 2.14. By virtue of Theorem 2.16, it carries over to the case of unbounded admissible sets $U_{ad}$.

2.5.3. General elliptic equations and cost functionals *. In this section, we study the general problem

$$\min J(y, u, v) := \frac{\lambda_\Omega}{2} \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda_\Gamma}{2} \|y - y_\Gamma\|_{L^2(\Gamma)}^2 + \frac{\lambda_v}{2} \|v\|_{L^2(\Omega)}^2 + \frac{\lambda_u}{2} \|u\|_{L^2(\Gamma_1)}^2, \tag{2.36}$$

subject to the constraints

$$\begin{aligned} \mathcal{A} y + c_0 y &= \beta_\Omega v && \text{in } \Omega \\ \partial_{\nu_{\mathcal{A}}} y + \alpha y &= \beta_\Gamma u && \text{on } \Gamma_1 \\ y &= 0 && \text{on } \Gamma_0 \end{aligned} \tag{2.37}$$

and

$$\begin{aligned} v_a(x) \le v(x) \le v_b(x) &\quad \text{for a.e. } x \in \Omega \\ u_a(x) \le u(x) \le u_b(x) &\quad \text{for a.e. } x \in \Gamma_1. \end{aligned} \tag{2.38}$$

Here, the uniformly elliptic differential operator $\mathcal{A}$ and the sets $\Gamma_0$ and $\Gamma_1$ are defined as in Section 2.3.3 on page 37.

Assumption 2.19. Suppose that Assumption 2.13 on page 48 holds. In addition, let $c_0 \in L^\infty(\Omega)$, $\beta_\Omega \in L^\infty(\Omega)$, $\beta_\Gamma \in L^\infty(\Gamma_1)$ as well as constants $\lambda_\Omega \ge 0$, $\lambda_\Gamma \ge 0$, $\lambda_v \ge 0$, and $\lambda_u \ge 0$ be prescribed. Moreover, suppose that the functions $c_0$ and $\alpha$ satisfy one of the assumptions (i) or (ii) from Theorem 2.7 on page 38.

We begin our analysis by recalling that the appropriate state space in this case is

$$V = \{y \in H^1(\Omega) : y|_{\Gamma_0} = 0\}.$$

Under the above assumptions, the control-to-state mapping $G : (u, v) \mapsto y$ is linear and maps $L^2(\Gamma_1) \times L^2(\Omega)$ continuously into $V$. Again, we use $S = E_Y G$, $S : L^2(\Gamma_1) \times L^2(\Omega) \to L^2(\Omega)$. The boundary observation operator $S_\Gamma := \tau \circ G$, $(u, v) \mapsto y|_\Gamma$, maps $L^2(\Gamma_1) \times L^2(\Omega)$ continuously into $L^2(\Gamma)$.

The sets of admissible controls are given by

$$\begin{aligned} V_{ad} &= \{v \in L^2(\Omega) : v_a(x) \le v(x) \le v_b(x) \text{ for a.e. } x \in \Omega\}, \\ U_{ad} &= \{u \in L^2(\Gamma_1) : u_a(x) \le u(x) \le u_b(x) \text{ for a.e. } x \in \Gamma_1\}. \end{aligned}$$

Finally, after elimination of $y$ the cost functional $J$ attains the reduced form

$$J(y, u, v) = f(u, v) = \frac{\lambda_\Omega}{2} \|S(u, v) - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda_\Gamma}{2} \|S_\Gamma(u, v) - y_\Gamma\|_{L^2(\Gamma)}^2 + \frac{\lambda_v}{2} \|v\|_{L^2(\Omega)}^2 + \frac{\lambda_u}{2} \|u\|_{L^2(\Gamma_1)}^2.$$

In this example, both a distributed control $v$ and a boundary control $u$ occur. In addition, the cost functional contains terms of $y$ that act in the domain as well as terms that act on the boundary (distributed observation and boundary observation). Also, this functional is convex and continuous with respect to $(v, u)$, so that Theorem 2.14 on page 50 applies. Observe that the second summand of the cost functional (2.36) has an impact only on $\Gamma_1$, since $y$ is prescribed on $\Gamma_0$.
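The abstract problem (2.30) behind these existence results has a simple finite-dimensional analogue that can be solved directly. The sketch below is only an illustration under stated assumptions: $U = \mathbb{R}^n$, $H = \mathbb{R}^m$, $S$ is a small matrix, $U_{ad}$ is a box with constant bounds, and the minimizer is approximated by a projected gradient iteration. That iteration is swapped in here for demonstration and is not a method developed in this section; all function names are hypothetical.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def project_box(u, lower, upper):
    """Componentwise projection onto the box {lower <= u_i <= upper}."""
    return [min(max(ui, lower), upper) for ui in u]


def projected_gradient(S, y_d, lam, lower, upper, step, iterations=200):
    """Approximate the minimizer of 0.5*|S u - y_d|^2 + 0.5*lam*|u|^2 on a box.

    The gradient of the objective is S^T (S u - y_d) + lam * u. The step
    size is assumed to be smaller than 2 / (largest eigenvalue of
    S^T S + lam * I); this is not checked here.
    """
    n = len(S[0])
    St = transpose(S)
    u = project_box([0.0] * n, lower, upper)
    for _ in range(iterations):
        residual = [a - b for a, b in zip(matvec(S, u), y_d)]
        grad = [g + lam * ui for g, ui in zip(matvec(St, residual), u)]
        u = project_box([ui - step * gi for ui, gi in zip(u, grad)], lower, upper)
    return u


if __name__ == "__main__":
    S = [[2.0, 0.0], [0.0, 1.0]]
    print(projected_gradient(S, [1.0, -1.0], lam=1.0, lower=0.0, upper=1.0, step=0.2))
```

For the diagonal matrix used in the usage lines the problem decouples: the unconstrained minimizer is $(0.4, -0.5)$, and the box $[0,1]^2$ pushes the second component to the bound $0$, so the constrained minimizer is $(0.4, 0)$. Since $\lambda > 0$, the objective is strictly convex and this minimizer is the only one, in line with the uniqueness statement of Theorem 2.14.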

By virtue of Theorem 2.14 on page 50, there exists an optimal pair $(\bar u, \bar v) \in U_{ad} \times V_{ad}$, which is unique if $\lambda_u > 0$ and $\lambda_v > 0$. We note that for unbounded $U_{ad}$, existence follows as in Theorem 2.16 if $\lambda_u > 0$ and $\lambda_v > 0$.

2.6. Differentiability in Banach spaces

Gâteaux derivatives. For the derivation of necessary optimality conditions in the later sections of this book, we will need a generalization of the notion of derivatives. We begin here with first-order derivatives; higher-order derivatives will be encountered later in this book. We caution the reader not to confuse the meaning that the Banach spaces $\{U, \|\cdot\|_U\}$ and $\{V, \|\cdot\|_V\}$ have in this section with their later meaning in optimal control problems. In the following, $\mathcal{U}$ will always denote a nonempty and open subset of $U$, while $F$ will always denote a mapping from $\mathcal{U}$ into $V$.

Definition. Let $u \in \mathcal{U}$ and $h \in U$ be given. If the limit

$$\delta F(u, h) := \lim_{t \downarrow 0} \frac{1}{t} \big(F(u + t h) - F(u)\big)$$

exists in $V$, then it is called the directional derivative of $F$ at $u$ in the direction $h$. If this limit exists for all $h \in U$, then the mapping $h \mapsto \delta F(u, h)$ is termed the first variation of $F$ at $u$.

Observe that the openness of $\mathcal{U}$ implies that $u + t h$ belongs to $\mathcal{U}$, and therefore to the domain of $F$, provided that $t > 0$ is sufficiently small. Hence, the above definition is meaningful.

The first variation is not necessarily a linear mapping, as is demonstrated by the following example from [IT79]: the function $f : \mathbb{R}^2 \to \mathbb{R}$, which in terms of polar coordinates is given by $f(x) = r \cos(\varphi)$, has a first variation at the origin that is nonlinear with respect to $h$, namely $\delta f(0, h) = f(h)$.

Definition. Suppose that the first variation $\delta F(u, h)$ at $u \in \mathcal{U}$ exists, and suppose there exists a continuous linear operator $A : U \to V$ such that

$$\delta F(u, h) = A h \quad \forall h \in U.$$

Then $F$ is said to be Gâteaux differentiable at $u$, and $A$ is referred to as the Gâteaux derivative of $F$ at $u$. We write $A = F'(u)$.

It follows from the definition that Gâteaux derivatives can be determined as directional derivatives (which we will do below). Note also that in the case where $V = \mathbb{R}$, that is, if a functional $f : \mathcal{U} \to \mathbb{R}$ is Gâteaux differentiable at a point $u \in \mathcal{U}$, then $f'(u)$ is an element of the dual space $U^*$.

Sometimes the Gâteaux derivative is not denoted by $F'(u)$ but rather, for example, by $F'_G(u)$. This is done in order to avoid confusion with the Fréchet derivative $F'(u)$ to be introduced below. Note that if the Fréchet derivative exists, then so does the Gâteaux derivative, and we have $F'(u) = F'_G(u)$. The converse is false, in general. However, since in all examples and exercises to be considered in this book the Gâteaux derivatives that occur will also be Fréchet derivatives, we simply use for the sake of convenience the common notation $F'(u)$.

Examples.

(i) Evaluation of a function at a point.

Let $U = \mathcal{U} = C[0, 1]$. We define $f : U \to \mathbb{R}$ by

$$f(u(\cdot)) = \sin(u(1)).$$

Suppose that $h = h(x)$ is another element of $C[0, 1]$. We calculate the directional derivative of $f$ at $u(\cdot)$ in the direction $h(\cdot)$. We have

$$\begin{aligned} \lim_{t \to 0} \frac{1}{t} \big(f(u + t h) - f(u)\big) &= \lim_{t \to 0} \frac{1}{t} \big(\sin(u(1) + t h(1)) - \sin(u(1))\big) \\ &= \frac{d}{dt} \sin(u(1) + t h(1)) \Big|_{t=0} \\ &= \cos(u(1) + t h(1))\, h(1) \Big|_{t=0} = \cos(u(1))\, h(1). \end{aligned}$$

Hence, $\delta f(u, h) = \cos(u(1))\, h(1)$. The mapping $h(\cdot) \mapsto \cos(u(1))\, h(1)$ is linear and continuous with respect to $h \in C[0, 1]$, and therefore the Gâteaux derivative $f'(u)$ exists at any point $u \in U$ and satisfies

$$f'(u)\, h = \cos(u(1))\, h(1).$$

Remark. In this example, it is impossible to express $f'(u)$ without reference to the increment $h$. We therefore have to use the evaluation rule for the functional $f'(u) \in U^*$.

(ii) Square of the norm in Hilbert spaces.

Let $\{H, (\cdot\,,\cdot)_H\}$ be a real Hilbert space equipped with the standard norm $\|\cdot\|_H$. We determine the Gâteaux derivative of the functional $f : H = \mathcal{U} \to \mathbb{R}$,

$$f(u) = \|u\|_H^2.$$

We have

$$\begin{aligned} \lim_{t \to 0} \frac{1}{t} \big(f(u + t h) - f(u)\big) &= \lim_{t \to 0} \frac{1}{t} \big(\|u + t h\|_H^2 - \|u\|_H^2\big) \\ &= \lim_{t \to 0} \frac{2 t\, (u, h)_H + t^2 \|h\|_H^2}{t} \\ &= 2\, (u, h)_H, \end{aligned}$$

and therefore

$$f'(u)\, h = (2u, h)_H.$$

If we identify $H$ with its dual space $H^*$ in the sense of the Riesz representation theorem, then we obtain for $f(u) = \|u\|_H^2$ the simple formula

$$f'(u) = 2u.$$

The expression, which results from identifying $f'(u)$ with an element of $H$, is called the gradient of $f$. We thus distinguish between the derivative, which is given by the rule $f'(u)\, h = (2u, h)_H$, and the gradient $f'(u) = 2u$.

(iii) Application to the norm in $L^2(\Omega)$.

By virtue of (ii), the Gâteaux derivative of the functional

$$f(u) := \|u(\cdot)\|_{L^2(\Omega)}^2 = \int_\Omega |u(x)|^2\, dx$$

is given by

$$f'(u)\, h = \int_\Omega 2\, u(x)\, h(x)\, dx \quad \forall h \in L^2(\Omega).$$

Identification of $L^2(\Omega)^*$ and $L^2(\Omega)$ yields the gradient $(f'(u))(x) = 2 u(x)$, for almost every $x \in \Omega$.

All the mappings considered in the above examples have even better differentiability properties. In fact, they are actually Fréchet differentiable.

Fréchet derivatives. As before, let $\{U, \|\cdot\|_U\}$ and $\{V, \|\cdot\|_V\}$ denote real Banach spaces and $\mathcal{U}$ an open subset of $U$.

Definition. A mapping $F : U \supset \mathcal{U} \to V$ is said to be Fréchet differentiable at $u \in \mathcal{U}$ if there exist an operator $A \in \mathcal{L}(U, V)$ and a mapping $r(u, \cdot) : U \to V$ with the following properties: for all $h \in U$ such that $u + h \in \mathcal{U}$, we have

$$F(u + h) = F(u) + A h + r(u, h),$$

where the so-called remainder $r$ satisfies the condition

$$\frac{\|r(u, h)\|_V}{\|h\|_U} \to 0 \quad \text{as } \|h\|_U \to 0.$$

The operator $A$ is then called the Fréchet derivative of $F$ at $u$, and we write $A = F'(u)$. If $F$ is Fréchet differentiable at every point $u \in \mathcal{U}$, then $F$ is said to be Fréchet differentiable in $\mathcal{U}$.

Since $\mathcal{U}$ is an open set, we have $u + h \in \mathcal{U}$ for all $h \in U$ with sufficiently small norm. Hence, the relation to be satisfied by the remainder $r(u, h)$ is meaningful at least for all $h \in U$ from a small ball about the origin. We also remark that it is often more convenient to prove Fréchet differentiability by showing that

$$\frac{\|F(u + h) - F(u) - A h\|_V}{\|h\|_U} \to 0 \quad \text{as } \|h\|_U \to 0, \tag{2.39}$$

which is obviously equivalent to postulating that $F(u + h) - F(u) - A h = r(u, h)$, where $\|r(u, h)\|_V / \|h\|_U \to 0$ as $\|h\|_U \to 0$.

Examples.

(iv) The following function taken from [IT79] is a standard example illustrating the fact that Gâteaux differentiability is not sufficient to guarantee Fréchet differentiability: we consider the mapping $f : \mathbb{R}^2 \to \mathbb{R}$,

$$f(x, y) = \begin{cases} 1 & \text{if } y = x^2 \text{ and } x \neq 0 \\ 0 & \text{otherwise.} \end{cases}$$

This function is Gâteaux differentiable at the origin. It is, however, not even continuous at the origin, let alone Fréchet differentiable.

(v) The functional $f(u) = \sin(u(1))$ is Fréchet differentiable at every $u \in C[0, 1]$.

(vi) The mapping $f(u) = \|u\|_H^2$ is Fréchet differentiable on every Hilbert space $H$; see Exercise 2.11.

(vii) Every continuous linear operator $A$ is Fréchet differentiable. Indeed, $A(u + h) = A u + A h + r(u, h)$ holds with $r(u, h) = 0$, and we conclude: "The derivative of a continuous linear operator is given by the operator itself."

Calculation of Fréchet derivatives. Evidently, every Fréchet differentiable mapping $F$ is also Gâteaux differentiable, and the two derivatives coincide (i.e., $F'_G(u) = F'(u)$; see also the remarks following the definition of the Gâteaux derivative). Hence, the explicit form of a Fréchet derivative can be determined through the Gâteaux derivative, which ultimately amounts to

calculating the directional derivative. This has already been demonstrated on pages 57 and 58.

Theorem 2.20 (Chain rule). Suppose that Banach spaces $U$, $V$, and $Z$ are given, and let $\mathcal{U} \subset U$ and $\mathcal{V} \subset V$ denote open sets. Let $F : \mathcal{U} \to \mathcal{V}$ and $G : \mathcal{V} \to Z$ be Fréchet differentiable at $u \in \mathcal{U}$ and at $F(u) \in \mathcal{V}$, respectively. Then the composition $E = G \circ F : \mathcal{U} \to Z$, defined by $E(u) = G(F(u))$, is Fréchet differentiable at $u$, and

$$E'(u) = G'(F(u))\, F'(u).$$

Example. Let two real Hilbert spaces $\{U, (\cdot\,,\cdot)_U\}$ and $\{H, (\cdot\,,\cdot)_H\}$ be given, and let $z \in H$ be fixed. For some $S \in \mathcal{L}(U, H)$ we consider the functional $E : U \to \mathbb{R}$,

$$E(u) = \|S u - z\|_H^2.$$

In this case, $E$ can be expressed in the form $E(u) = G(F(u))$, where $G(v) = \|v\|_H^2$ and $F(u) = S u - z$. We know already from examples (ii) and (vi) that

$$G'(v)\, h = (2v, h)_H, \qquad F'(u)\, h = S h.$$

The chain rule thus yields, with $v = F(u)$,

$$\begin{aligned} E'(u)\, h = G'(F(u))\, F'(u)\, h &= (2v, F'(u)\, h)_H \\ &= 2\, (S u - z, S h)_H \\ &= 2\, (S^*(S u - z), h)_U. \end{aligned} \tag{2.40}$$

Here, $S^* \in \mathcal{L}(H^*, U^*)$ denotes the so-called adjoint of $S$, which will be defined in Section 2.7.

Remark. The above results and further information concerning the differentiability of operators and functionals can be found, e.g., in [Car67], [IT79], [Jah94], and [KA64].

2.7. Adjoint operators

If $A$ is an $m \times n$ matrix and $A^T$ its transpose, then

$$(A u, v)_{\mathbb{R}^m} = (u, A^T v)_{\mathbb{R}^n} \quad \text{for all } u \in \mathbb{R}^n \text{ and } v \in \mathbb{R}^m.$$

In a similar way, for real Hilbert spaces $\{U, (\cdot\,,\cdot)_U\}$ and $\{V, (\cdot\,,\cdot)_V\}$ one can assign to any linear and continuous operator $A : U \to V$ a so-called adjoint operator $A^*$, which allows the transformation $(A u, v)_V = (u, A^* v)_U$ for all $u \in U$ and $v \in V$.

The corresponding definition in Banach spaces is more general. To this end, let two real Banach spaces $U$ and $V$, a continuous linear operator $A : U \to V$, and a functional $f \in V^*$ be given. We can then define the functional $g = f \circ A : U \to \mathbb{R}$,

$$g(u) = f(A u).$$

The mapping $g$ is obviously linear; its continuity follows from the estimate

$$|g(u)| \le \|f\|_{V^*}\, \|A\|_{\mathcal{L}(U,V)}\, \|u\|_U.$$

Hence, $g$ belongs to the dual space $U^*$, and we have the estimate

$$\|g\|_{U^*} \le \|A\|_{\mathcal{L}(U,V)}\, \|f\|_{V^*}. \tag{2.41}$$

Definition. The mapping $A^* : V^* \to U^*$ defined by $f \mapsto g = f \circ A$ is called the adjoint operator or dual operator of $A$.

It follows from the above arguments that

$$(A^* f)(u) = f(A u) \quad \forall u \in U,$$

$$\|A^* f\|_{U^*} \le \|A\|_{\mathcal{L}(U,V)}\, \|f\|_{V^*} \quad \forall f \in V^*.$$

Remark. In many texts the notation $A'$ for the adjoint or dual operator is used. We have chosen to write $A^*$ in order to avoid any possible confusion with derivatives. We also remark that the notion of adjoint operator is often reserved for Hilbert spaces. Below we will therefore, but only for a moment, write $A^{\star}$; note the typographical difference between $A^*$ and $A^{\star}$. For the definition of the adjoint or dual operator, we follow Alt [Alt99], Kreyszig [Kre78], and Wouk [Wou79].

An immediate consequence of estimate (2.41) is that $A^*$ is continuous, so that $A^* \in \mathcal{L}(V^*, U^*)$. More precisely, we have $\|A^*\|_{\mathcal{L}(V^*,U^*)} \le \|A\|_{\mathcal{L}(U,V)}$. We even have equality of these norms; see, e.g., [LS74], [Wou79].

For better readability, in the following we will make use of the so-called duality pairing, which resembles a scalar product: if a functional $f \in V^*$ is evaluated at $v \in V$, then we write

$$f(v) = \langle f, v \rangle_{V^*,V}.$$

This notation makes the definition of the operator $A^*$ more transparent: indeed, we have

$$\langle f, A u \rangle_{V^*,V} = \langle A^* f, u \rangle_{U^*,U} =: \langle u, A^* f \rangle_{U,U^*} \quad \forall f \in V^*,\ \forall u \in U.$$

This form, while easily memorized, can lead to the misconception that $A^*$ is already explicitly determined by it (for instance, in terms of a matrix representation or via an integral operator). This is, however, not to be expected, since a functional $f \in V^*$ may admit several completely different representations; see (2.24) on page 42. Explicit expressions for adjoint operators can be derived if results like the Riesz representation theorem are available that
62 2. Linear-quadratic elliptic control problems 2.8. First-order necessary optimality conditions 63

provide a characterization in concrete form of continuous linear functionals. We therefore confine ourselves from now on to adjoint operators in Hilbert spaces.

Definition. Let real Hilbert spaces $\{U, (\cdot,\cdot)_U\}$ and $\{V, (\cdot,\cdot)_V\}$ as well as an operator $A \in \mathcal{L}(U, V)$ be given. An operator $A^\star$ is called the Hilbert space adjoint or adjoint of $A$ if
\[ (v, Au)_V = (A^\star v, u)_U \quad \forall u \in U,\ \forall v \in V. \tag{2.42} \]

Remark. The terms dual, adjoint, and Hilbert space adjoint are not used consistently in the literature. We will use adjoint operator in both Banach and Hilbert spaces, since the definition will always become clear from the context. Moreover, dual spaces and adjoint operators will generally be marked by $*$.

Examples.

(i) Let $A: \mathbb{R}^n \to \mathbb{R}^m$ denote a linear operator, which is represented by an $m \times n$ matrix also denoted by $A$. Since $(v, Au)_{\mathbb{R}^m} = (A^T v, u)_{\mathbb{R}^n}$ for all $u \in \mathbb{R}^n$ and $v \in \mathbb{R}^m$, the (Hilbert space) adjoint $A^*$ can be identified with the transposed matrix $A^T$.

(ii) We consider in the Hilbert space $L^2(0,1)$ the integral operator
\[ (Au)(t) = \int_0^t e^{(t-s)}\, u(s)\, ds. \]
It is easily seen that $A$ is linear and continuous on $L^2(0,1)$; see Exercise 2.12. Its adjoint $A^*$ can be calculated as follows:
\begin{align*}
(v, Au)_{L^2(0,1)} &= \int_0^1 v(t) \Big( \int_0^t e^{(t-s)}\, u(s)\, ds \Big)\, dt \\
&= \int_0^1 \int_s^1 v(t)\, e^{(t-s)}\, u(s)\, dt\, ds && \text{(Fubini's theorem)} \\
&= \int_0^1 u(s) \Big( \int_s^1 e^{(t-s)}\, v(t)\, dt \Big)\, ds \\
&= \int_0^1 \Big( \int_t^1 e^{(s-t)}\, v(s)\, ds \Big)\, u(t)\, dt && \text{(exchange of variables)} \\
&= (A^* v, u)_{L^2(0,1)},
\end{align*}
where the adjoint operator has the representation
\[ (A^* v)(t) = \int_t^1 e^{(s-t)}\, v(s)\, ds. \]

2.8. First-order necessary optimality conditions

In Section 2.5, the existence and uniqueness of optimal controls was demonstrated for selected types of elliptic optimal control problems. In this section, we will invoke the first derivative of the cost functional to derive conditions that optimal solutions have to satisfy. These necessary conditions allow for far-reaching conclusions concerning the form of optimal controls and the verification that numerically determined controls are actually optimal. In addition, they form the theoretical basis for the development of numerical methods.

2.8.1. Quadratic optimization in Hilbert spaces. For proving the existence of optimal controls, we transformed the control problems under investigation into a reduced quadratic optimization problem in terms of $u$, namely
\[ \min_{u \in U_{ad}} f(u) := \frac{1}{2}\, \|Su - y_d\|_H^2 + \frac{\lambda}{2}\, \|u\|_U^2. \tag{2.43} \]
To this minimization problem, the following fundamental result can be applied. It is the key to the derivation of first-order necessary optimality conditions in the presence of control constraints.

Lemma 2.21. Let $C$ denote a nonempty and convex subset of a real Banach space $U$, and let the real-valued mapping $f$ be Gâteaux differentiable in an open subset of $U$ containing $C$. If $\bar{u} \in C$ is a solution to the problem
\[ \min_{u \in C} f(u), \]
then it solves the variational inequality
\[ f'(\bar{u})(u - \bar{u}) \ge 0 \quad \forall u \in C. \tag{2.44} \]
Conversely, if $\bar{u} \in C$ solves the variational inequality (2.44) and $f$ is convex, then $\bar{u}$ is a solution to the minimization problem $\min_{u \in C} f(u)$.

Proof. Let $u \in C$ be arbitrary. Since $C$ is convex, $\bar{u} + t\,(u - \bar{u}) \in C$ for any $t \in (0, 1]$. Since $\bar{u}$ is optimal, $f(\bar{u} + t\,(u - \bar{u})) \ge f(\bar{u})$ and hence also
\[ \frac{1}{t}\, \big( f(\bar{u} + t\,(u - \bar{u})) - f(\bar{u}) \big) \ge 0 \quad \text{for } t \in (0, 1]. \]
Letting $t \downarrow 0$, we arrive at $f'(\bar{u})(u - \bar{u}) \ge 0$, which proves the validity of (2.44).

Now suppose that $\bar{u}$ solves the variational inequality. Since $f$ is convex, it follows from a standard argument that
\[ f(u) - f(\bar{u}) \ge f'(\bar{u})(u - \bar{u}) \quad \forall u \in C. \]
By (2.44), the right-hand side of this inequality is nonnegative, whence $f(u) \ge f(\bar{u})$ follows. This concludes the proof of the assertion. □

Lemma 2.21 yields a necessary, and in the case of convexity also sufficient, so-called first-order optimality condition. It is apparent that the result remains valid if merely the existence of all directional derivatives of $f$ is postulated. It can even make sense to consider only the directional derivatives with respect to all directions from a dense subspace of $U$, as the following example shows.

Example. Let $\varepsilon \in (0,1)$ be fixed, and let $C_\varepsilon = \{u \in L^2(a,b) : u(x) \ge \varepsilon \text{ for a.e. } x \in (a,b)\}$. The functional
\[ f(u) = \int_a^b \ln(u(x))\, dx, \]
which is well defined on $C_\varepsilon$, is not Gâteaux differentiable at $\bar{u} \in C_\varepsilon$, $\bar{u}(x) \equiv 1$, in the sense of $L^2(a,b)$. However, directional derivatives exist in any direction $h \in L^\infty(a,b)$. In fact, we have
\[ \delta f(\bar{u}, h) = \int_a^b \frac{h(x)}{\bar{u}(x)}\, dx = \int_a^b h(x)\, dx. \]
Functionals of this type occur in the study of interior-point methods for the solution of optimization problems in function spaces. ◇

We are now going to apply Lemma 2.21 to the quadratic optimization problem (2.43).

Theorem 2.22. Suppose that real Hilbert spaces $U$ and $H$, a nonempty and convex set $U_{ad} \subset U$, some $y_d \in H$, and a constant $\lambda \ge 0$ are given. Moreover, let $S: U \to H$ denote a continuous linear operator. Then $\bar{u} \in U_{ad}$ is a solution to the minimization problem (2.43) if and only if $\bar{u}$ solves the variational inequality
\[ (S^*(S\bar{u} - y_d) + \lambda \bar{u},\ u - \bar{u})_U \ge 0 \quad \forall u \in U_{ad}. \tag{2.45} \]

Proof. In view of (2.40), the gradient of the functional $f$ defined in (2.43) is of the form
\[ f'(\bar{u}) = S^*(S\bar{u} - y_d) + \lambda \bar{u}. \tag{2.46} \]
The assertion is thus a direct consequence of Lemma 2.21. □

In many instances it is advantageous to write the variational inequality (2.45) in the equivalent form
\[ (S\bar{u} - y_d,\ Su - S\bar{u})_H + \lambda\, (\bar{u},\ u - \bar{u})_U \ge 0 \quad \forall u \in U_{ad}, \tag{2.47} \]
which avoids the adjoint operator $S^*$.

Below, we apply the variational inequality to our various optimal control problems, following the scheme indicated in Section 1.4.

2.8.2. Optimal stationary heat source. The problem (2.26)-(2.28) defined on page 49 reads
\[ \min J(y,u) := \frac{1}{2}\, \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\, \|u\|_{L^2(\Omega)}^2, \]
subject to
\[ -\Delta y = \beta u \ \text{ in } \Omega, \qquad y = 0 \ \text{ on } \Gamma, \]
and
\[ u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega. \]
As above, we denote the solution operator of the boundary value problem by $S$, viewed as a mapping in $L^2(\Omega)$. In view of (2.45), any optimal control $\bar{u}$ must obey the variational inequality
\[ (S^*(S\bar{u} - y_\Omega) + \lambda \bar{u},\ u - \bar{u})_{L^2(\Omega)} \ge 0 \quad \forall u \in U_{ad}, \tag{2.48} \]
where the adjoint operator $S^*$ is yet to be determined. For this purpose, we prove the following preparatory result.

Lemma 2.23. Let functions $z, u \in L^2(\Omega)$ and $c_0, \beta \in L^\infty(\Omega)$ with $c_0 \ge 0$ a.e. in $\Omega$ be given, and let $y$ and $p$ denote, respectively, the weak solutions to the elliptic boundary value problems
\[ \begin{aligned} -\Delta y + c_0\, y &= \beta u, & -\Delta p + c_0\, p &= z && \text{in } \Omega, \\ y &= 0, & p &= 0 && \text{on } \Gamma. \end{aligned} \]
Then
\[ \int_\Omega z\, y\, dx = \int_\Omega \beta\, p\, u\, dx. \tag{2.49} \]

Proof. We invoke the variational formulations of the above boundary value problems. For $y$, insertion of the test function $p \in H_0^1(\Omega)$ yields
\[ \int_\Omega (\nabla y \cdot \nabla p + c_0\, y\, p)\, dx = \int_\Omega \beta\, p\, u\, dx, \]

while for $p$ we obtain with the test function $y \in H_0^1(\Omega)$ that
\[ \int_\Omega (\nabla p \cdot \nabla y + c_0\, p\, y)\, dx = \int_\Omega z\, y\, dx. \]
Since the left-hand sides are equal, the assertion immediately follows. □

Lemma 2.24. For the boundary value problem (2.27), the adjoint operator $S^*: L^2(\Omega) \to L^2(\Omega)$ is given by
\[ S^* z := \beta\, p, \]
where $p \in H_0^1(\Omega)$ is the weak solution to the boundary value problem
\[ -\Delta p = z \ \text{ in } \Omega, \qquad p = 0 \ \text{ on } \Gamma. \]

Proof. According to (2.42) on page 62, the operator $S^*$ is given by the relation
\[ (z, Su)_{L^2(\Omega)} = (S^* z, u)_{L^2(\Omega)} \quad \forall z \in L^2(\Omega),\ \forall u \in L^2(\Omega). \]
Invoking Lemma 2.23 with $c_0 = 0$ and $y = Su$, we find that
\[ (z, Su)_{L^2(\Omega)} = (z, y)_{L^2(\Omega)} = (\beta\, p, u)_{L^2(\Omega)}. \]
Owing to Theorem 2.4 on page 33, the mapping $z \mapsto \beta\, p$ is linear and continuous from $L^2(\Omega)$ into itself. Since $z$ and $u$ can be chosen arbitrarily and $S^*$ is uniquely determined, we conclude that $S^* z = \beta\, p$. □

The construction of $S^*$ in the above proof, which is based on Lemma 2.23, is not easy to understand intuitively. In Section 2.10, we will get acquainted with the formal Lagrange method, which is an effective tool for finding the form of the partial differential equation from which $S^*$ can be determined.

Remark. As we know, $S = E_Y G$ has range in the space $H_0^1(\Omega)$. However, if we had considered the operator $G: L^2(\Omega) \to H_0^1(\Omega)$ instead of $S$, then (after identifying $L^2(\Omega)^*$ with $L^2(\Omega)$) the adjoint operator $G^*: H_0^1(\Omega)^* \to L^2(\Omega)$ would have occurred. We have avoided the space $H_0^1(\Omega)^*$ by choosing $S: L^2(\Omega) \to L^2(\Omega)$. This choice restricts the applicability of the above theory to a certain extent; it is, however, simpler and suffices for the time being. In Section 2.13, we will briefly explain how to work in $H_0^1(\Omega)^*$. There are good reasons not to identify $H_0^1(\Omega)^*$ with the Hilbert space $H_0^1(\Omega)$ in this approach.

Adjoint state and optimality system. The variational inequality (2.48) can be easily transformed if $S^*$ is known.

Definition. The weak solution $p \in H_0^1(\Omega)$ to the adjoint equation
\[ -\Delta p = \bar{y} - y_\Omega \ \text{ in } \Omega, \qquad p = 0 \ \text{ on } \Gamma \tag{2.50} \]
is called the adjoint state associated with $\bar{y}$.

The right-hand side of the adjoint equation belongs to $L^2(\Omega)$, since $y_\Omega \in L^2(\Omega)$ by assumption and $\bar{y} \in Y = H_0^1(\Omega) \hookrightarrow L^2(\Omega)$. From Theorem 2.4 on page 33, we infer that (2.50) admits a unique solution $p \in H_0^1(\Omega)$. Putting $z = \bar{y} - y_\Omega$, we conclude from Lemma 2.24 that
\[ S^*(S\bar{u} - y_\Omega) = S^*(\bar{y} - y_\Omega) = \beta\, p, \]
whence, upon invoking (2.48),
\[ (\beta\, p + \lambda \bar{u},\ u - \bar{u})_{L^2(\Omega)} \ge 0 \quad \forall u \in U_{ad}. \]
Thus, it follows directly from the variational inequality (2.44) that the following result holds.

Theorem 2.25. Suppose that $\bar{u}$ is an optimal control for the problem (2.26)-(2.28) of optimal stationary heat sources from page 49, and let $\bar{y}$ denote the associated state. Then the adjoint equation (2.50) has a unique weak solution $p$ that satisfies the variational inequality
\[ \int_\Omega \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \big( u(x) - \bar{u}(x) \big)\, dx \ge 0 \quad \forall u \in U_{ad}. \tag{2.51} \]
Conversely, any control $\bar{u} \in U_{ad}$ which, together with its associated state $\bar{y} = y(\bar{u})$ and the solution $p$ to (2.50), satisfies the variational inequality (2.51) is optimal.

The sufficiency part of the statement follows from the convexity of $f$. Hence, a control $u$, together with the optimal state $y$ and the adjoint state $p$, is optimal for problem (2.26)-(2.28) if and only if the triple $(u, y, p)$ satisfies the following optimality system:
\[ \begin{aligned} -\Delta y &= \beta u, & -\Delta p &= y - y_\Omega, \\ y|_\Gamma &= 0, & p|_\Gamma &= 0, \\ & u \in U_{ad}, \\ & (\beta\, p + \lambda u,\ v - u)_{L^2(\Omega)} \ge 0 \quad \forall v \in U_{ad}. \end{aligned} \tag{2.52} \]

Discussion of pointwise optimality conditions. In this section, we perform a detailed analysis of the variational inequality (2.51). We begin

our investigation by rewriting it in the form
\[ \int_\Omega (\beta\, p + \lambda \bar{u})\, \bar{u}\, dx \le \int_\Omega (\beta\, p + \lambda \bar{u})\, u\, dx \quad \forall u \in U_{ad}, \]
hence
\[ \int_\Omega (\beta\, p + \lambda \bar{u})\, \bar{u}\, dx = \min_{u \in U_{ad}} \int_\Omega (\beta\, p + \lambda \bar{u})\, u\, dx. \tag{2.53} \]

Conclusion. Under the assumption that the expression inside the bracket in (2.53) is known, we obtain $\bar{u}$ as the solution to a linear optimization problem in a function space.

This simple observation forms the basis of the conditioned gradient method; see Section 2.12.1.

It is intuitively clear that the variational inequality can also be formulated in pointwise form. The following lemma provides insight in this direction.

Lemma 2.26. A necessary and sufficient condition for the variational inequality (2.51) to be satisfied is that for almost every $x \in \Omega$,
\[ \bar{u}(x) \begin{cases} = u_a(x) & \text{if } \beta(x)\, p(x) + \lambda\, \bar{u}(x) > 0, \\ \in [u_a(x), u_b(x)] & \text{if } \beta(x)\, p(x) + \lambda\, \bar{u}(x) = 0, \\ = u_b(x) & \text{if } \beta(x)\, p(x) + \lambda\, \bar{u}(x) < 0. \end{cases} \tag{2.54} \]
An equivalent condition is given by the pointwise variational inequality in $\mathbb{R}$,
\[ \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \big( v - \bar{u}(x) \big) \ge 0 \quad \forall v \in [u_a(x), u_b(x)], \ \text{for a.e. } x \in \Omega. \tag{2.55} \]

Proof. (i) First, we show that (2.51) implies (2.54). To this end, let $\bar{u}$, $u_a$, and $u_b$ be arbitrary but fixed representatives of the corresponding equivalence classes in the sense of $L^\infty$. Suppose that (2.54) is false. We consider the measurable sets
\[ A_+(\bar{u}) = \{x \in \Omega : \beta(x)\, p(x) + \lambda\, \bar{u}(x) > 0\}, \qquad A_-(\bar{u}) = \{x \in \Omega : \beta(x)\, p(x) + \lambda\, \bar{u}(x) < 0\}. \]
By our assumption, there is a set $E_+ \subset A_+(\bar{u})$ having positive measure such that $\bar{u}(x) > u_a(x)$ for all $x \in E_+$, or there is a set $E_- \subset A_-(\bar{u})$ having positive measure such that $\bar{u}(x) < u_b(x)$ for all $x \in E_-$. In the first case, we define the function $u \in U_{ad}$,
\[ u(x) = \begin{cases} u_a(x) & \text{for } x \in E_+, \\ \bar{u}(x) & \text{for } x \in \Omega \setminus E_+. \end{cases} \]
Then
\[ \int_\Omega \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \big( u(x) - \bar{u}(x) \big)\, dx = \int_{E_+} \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \big( u_a(x) - \bar{u}(x) \big)\, dx < 0, \]
since the first factor is positive on $E_+$ while the second is negative. This evidently contradicts (2.51). The other case can be handled in a similar way by putting $u(x) = u_b(x)$ on $E_-$ and $u(x) = \bar{u}(x)$ otherwise.

(ii) Next, we show that (2.54) implies (2.55). We have for almost every $x \in A_+(\bar{u})$ that $\bar{u}(x) = u_a(x)$, and thus $v - \bar{u}(x) \ge 0$ for all $v \in [u_a(x), u_b(x)]$. Hence, by the definition of $A_+(\bar{u})$,
\[ \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \big( v - \bar{u}(x) \big) \ge 0 \quad \text{for almost every } x \in A_+(\bar{u}). \]
Similar reasoning shows that this inequality also holds almost everywhere in $A_-(\bar{u})$. Since this is trivially the case whenever $\beta(x)\, p(x) + \lambda\, \bar{u}(x) = 0$, (2.55) holds almost everywhere in $\Omega$.

(iii) Finally, we show that (2.55) implies (2.51). To this end, let $u \in U_{ad}$ be arbitrarily chosen. Since $u(x) \in [u_a(x), u_b(x)]$ for almost every $x \in \Omega$, we may put $v := u(x)$ in (2.55) to find that
\[ \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \big( u(x) - \bar{u}(x) \big) \ge 0 \quad \text{for a.e. } x \in \Omega. \]
Integration yields that (2.51) holds. □

Next we observe that by a simple rearrangement of terms the pointwise variational inequality (2.55) can be rewritten in the form
\[ \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \bar{u}(x) \le \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, v \quad \forall v \in [u_a(x), u_b(x)], \tag{2.56} \]
for almost every $x \in \Omega$. Here, as in (2.55), $v$ is a real number, not a function.

Theorem 2.27. A control $\bar{u} \in U_{ad}$ is optimal for (2.26)-(2.28) if and only if it satisfies, together with the adjoint state $p$ from (2.50), one of the following two minimum conditions for almost all $x \in \Omega$: the weak minimum principle
\[ \min_{v \in [u_a(x), u_b(x)]} \big\{ \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, v \big\} = \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \bar{u}(x) \]

or the minimum principle
\[ \min_{v \in [u_a(x), u_b(x)]} \Big\{ \beta(x)\, p(x)\, v + \frac{\lambda}{2}\, v^2 \Big\} = \beta(x)\, p(x)\, \bar{u}(x) + \frac{\lambda}{2}\, \bar{u}(x)^2. \]

Proof. The weak minimum principle is evidently nothing but a reformulation of (2.56). The minimum principle is also easily verified: a real number $\bar{v}$ solves for fixed $x$ the (convex) quadratic optimization problem in $\mathbb{R}$,
\[ \min_{v \in [u_a(x), u_b(x)]} g(v) := \beta(x)\, p(x)\, v + \frac{\lambda}{2}\, v^2, \]
if and only if the variational inequality
\[ g'(\bar{v})(v - \bar{v}) \ge 0 \quad \forall v \in [u_a(x), u_b(x)] \]
is satisfied, that is, if
\[ \big( \beta(x)\, p(x) + \lambda\, \bar{v} \big)\, (v - \bar{v}) \ge 0 \quad \forall v \in [u_a(x), u_b(x)]. \]
The minimum condition follows from taking $\bar{v} = \bar{u}(x)$.

The derived pointwise conditions can be further evaluated in order to extract additional information. Depending on the choice of $\lambda$, different consequences result.

Case 1: $\lambda = 0$. In this case, it follows from (2.54) that almost everywhere,
\[ \bar{u}(x) = \begin{cases} u_a(x) & \text{if } \beta(x)\, p(x) > 0, \\ u_b(x) & \text{if } \beta(x)\, p(x) < 0. \end{cases} \tag{2.57} \]
At points $x \in \Omega$ where $\beta(x)\, p(x) = 0$, no information concerning $\bar{u}(x)$ can be extracted. If $\beta(x)\, p(x) \ne 0$ almost everywhere in $\Omega$, then $\bar{u}$ is a so-called bang-bang control, that is, the values $\bar{u}(x)$ coincide almost everywhere with one of the threshold values $u_a(x)$ or $u_b(x)$.

Case 2: $\lambda > 0$. We interpret the second relation in (2.54) as saying that "$\bar{u}$ is undetermined if $\lambda \bar{u} + \beta\, p = 0$". This is not really true, since the equation $\lambda \bar{u} + \beta\, p = 0$ yields $\bar{u}(x) = -\lambda^{-1} \beta(x)\, p(x)$ and therefore provides a hint towards a complete understanding of the minimum condition.

Theorem 2.28. If $\lambda > 0$, then $\bar{u}$ is an optimal control to the problem (2.26)-(2.28) if and only if it satisfies, together with the associated adjoint state $p$, the projection formula
\[ \bar{u}(x) = \mathbb{P}_{[u_a(x), u_b(x)]} \Big\{ -\frac{1}{\lambda}\, \beta(x)\, p(x) \Big\} \quad \text{for almost every } x \in \Omega. \tag{2.58} \]
Here, for real numbers $a \le b$, $\mathbb{P}_{[a,b]}$ denotes the projection of $\mathbb{R}$ onto $[a, b]$,
\[ \mathbb{P}_{[a,b]}(u) := \min\{ b, \max\{a, u\} \}. \]

Proof. The assertion is a direct consequence of Theorem 2.27: indeed, the solution to the quadratic optimization problem in $\mathbb{R}$ formulated in terms of the minimum principle
\[ \min_{v \in [u_a(x), u_b(x)]} \Big\{ \beta(x)\, p(x)\, v + \frac{\lambda}{2}\, v^2 \Big\} \]
is given by the projection formula (2.58). The reader will be asked to verify this claim in Exercise 2.13. □

Case 2a: $\lambda > 0$ and $U_{ad} = L^2(\Omega)$. In this case, the control is unconstrained, and we can infer from (2.58) (or directly from (2.55)) that
\[ \bar{u} = -\frac{1}{\lambda}\, \beta\, p. \tag{2.59} \]
Putting this in the state equation leads to the optimality system
\[ \begin{aligned} -\Delta y &= -\lambda^{-1} \beta^2\, p, & -\Delta p &= y - y_\Omega, \\ y|_\Gamma &= 0, & p|_\Gamma &= 0. \end{aligned} \]
This is a coupled system of two elliptic boundary value problems for the determination of $y = \bar{y}$ and $p$. Once $p$ has been found, the optimal control $\bar{u}$ is obtained from (2.59).

Formulation as a Karush-Kuhn-Tucker system. By the introduction of Lagrange multipliers, the variational inequality (2.51) in the optimality system can be reformulated in terms of additional equations. The associated technique was explained in Section 1.4.7.

Theorem 2.29. The variational inequality (2.51) is equivalent to the existence of almost-everywhere nonnegative functions $\mu_a, \mu_b \in L^2(\Omega)$ that satisfy the equation
\[ \beta\, p + \lambda \bar{u} - \mu_a + \mu_b = 0 \tag{2.60} \]
as well as the complementarity conditions
\[ \mu_a(x)\, \big( u_a(x) - \bar{u}(x) \big) = \mu_b(x)\, \big( \bar{u}(x) - u_b(x) \big) = 0 \quad \text{for a.e. } x \in \Omega. \tag{2.61} \]

Proof. (i) We first show that (2.60) and (2.61) are consequences of the variational inequality (2.51). To this end, we follow the treatment in Section

1.4.7 and define the functions
\[ \mu_a(x) := \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)_+, \qquad \mu_b(x) := \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)_-. \tag{2.62} \]
Here, we use the usual definitions of $s_+$ and $s_-$ for $s \in \mathbb{R}$, namely
\[ s_+ = \frac{1}{2}\, (s + |s|), \qquad s_- = \frac{1}{2}\, (|s| - s). \]
Then, by definition, $\mu_a \ge 0$, $\mu_b \ge 0$, and $\beta\, p + \lambda \bar{u} = \mu_a - \mu_b$, which shows (2.60). Moreover, in view of (2.54), the following implications are valid for almost every $x \in \Omega$:
\[ \begin{aligned} (\beta\, p + \lambda \bar{u})(x) > 0 \ &\Rightarrow\ \bar{u}(x) = u_a(x), \\ (\beta\, p + \lambda \bar{u})(x) < 0 \ &\Rightarrow\ \bar{u}(x) = u_b(x), \\ u_a(x) < \bar{u}(x) < u_b(x) \ &\Rightarrow\ (\beta\, p + \lambda \bar{u})(x) = 0. \end{aligned} \]
From these implications, we can conclude the validity of (2.61), since in both products at least one of the factors vanishes for almost all $x \in \Omega$. Indeed, suppose that $\mu_a(x) > 0$. Then obviously $\mu_b(x) = 0$; in addition, $(\beta\, p + \lambda \bar{u})(x) = \mu_a(x) > 0$, which implies that $\bar{u}(x) - u_a(x) = 0$. Next, suppose that $\mu_a(x) = 0$. We have to show that the second product also vanishes. In fact, if $\mu_b(x) > 0$, then $(\beta\, p + \lambda \bar{u})(x) < 0$, and thus $\bar{u}(x) - u_b(x) = 0$.

(ii) Conversely, assume that $\bar{u} \in U_{ad}$ satisfies (2.60) and (2.61), and let $u \in U_{ad}$ be given. We have to discuss three different cases.

First, for almost all $x$ with $u_a(x) < \bar{u}(x) < u_b(x)$, it follows from the complementarity conditions (2.61) that $\mu_a(x) = \mu_b(x) = 0$, whence, upon invoking (2.60),
\[ (\beta\, p + \lambda \bar{u})(x) = 0. \]
In conclusion, we have
\[ \big( \beta(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \big( u(x) - \bar{u}(x) \big) \ge 0. \tag{2.63} \]
In the case where $u_a(x) = \bar{u}(x)$, we find, from $u \in U_{ad}$, that $u(x) - \bar{u}(x) \ge 0$. Moreover, equation (2.61) immediately yields that $\mu_b(x) = 0$. Therefore, we can infer from equation (2.60) that
\[ \beta(x)\, p(x) + \lambda\, \bar{u}(x) = \mu_a(x) \ge 0, \]
whence inequality (2.63) again follows. The third case $\bar{u}(x) = u_b(x)$ is treated similarly. In summary, (2.63) holds for almost every $x \in \Omega$, and integration over $\Omega$ yields the validity of the variational inequality (2.51). This concludes the proof of the theorem. □

By virtue of the above theorem, we can replace the optimality system (2.52), which contains the variational inequality, by the following Karush-Kuhn-Tucker system:
\[ \begin{aligned} -\Delta y &= \beta u, & -\Delta p &= y - y_\Omega, \\ y|_\Gamma &= 0, & p|_\Gamma &= 0, \\ & \beta\, p + \lambda u - \mu_a + \mu_b = 0, \\ & u_a \le u \le u_b, \quad \mu_a \ge 0, \quad \mu_b \ge 0, \\ & \mu_a\, (u_a - u) = \mu_b\, (u - u_b) = 0. \end{aligned} \tag{2.64} \]
Here, the relations in the last three lines hold for almost every $x \in \Omega$.

Definition. The functions $\mu_a, \mu_b \in L^2(\Omega)$ defined in Theorem 2.29 are called Lagrange multipliers associated with the inequality constraints $u_a \le u$ and $u \le u_b$, respectively.

Remark. The system (2.64) can be derived directly by using a Lagrangian function provided that the existence of multipliers $\mu_a, \mu_b \in L^2(\Omega)$ is assumed; see Section 6.1. However, the existence of such multipliers cannot directly be concluded from the Karush-Kuhn-Tucker theory in Banach spaces, since the set of almost-everywhere nonnegative functions in $L^2(\Omega)$ has empty interior. By explicitly defining the multipliers $\mu_a$ and $\mu_b$, we have circumvented this difficulty here. A detailed analysis of this problem will be given in Section 6.1.

The reduced gradient of the cost functional. The calculation of the reduced gradient, that is, the gradient of $f(u) = J(y(u), u)$, is also simplified by invoking the adjoint state. The representation of $f'(u)$ given in the following lemma will apply to almost all optimal control problems to be studied in this book.

Lemma 2.30. The gradient of the functional
\[ f(u) = J(y(u), u) = \frac{1}{2}\, \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\, \|u\|_{L^2(\Omega)}^2 \]
is given by
\[ f'(u) = \beta\, p + \lambda u, \]
where $p \in H_0^1(\Omega)$ denotes the weak solution to the adjoint equation
\[ -\Delta p = y - y_\Omega \ \text{ in } \Omega, \qquad p = 0 \ \text{ on } \Gamma, \tag{2.65} \]
and $y = y(u)$ is the state associated with $u$.
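The gradient representation of Lemma 2.30 can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes a one-dimensional analogue on $\Omega = (0,1)$, a standard finite-difference discretization of $-y''$ with homogeneous Dirichlet conditions, and illustrative choices for $\beta$, $\lambda$, $y_\Omega$, the grid size, and the test direction `d` (all hypothetical). It compares the adjoint-based directional derivative $(\beta p + \lambda u, d)$ with a central difference quotient of $f$.

```python
import numpy as np

# Assumed setup: 1D analogue on (0, 1), n interior grid points, mesh size h.
n = 99
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)

# Finite-difference matrix for -d^2/dx^2 with homogeneous Dirichlet conditions.
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

beta = 1.0 + x                  # assumed coefficient beta(x)
lam = 1e-2                      # assumed regularization parameter lambda
y_target = np.sin(np.pi * x)    # assumed desired state y_Omega


def solve_state(u):
    """Discrete state equation: A y = beta * u."""
    return np.linalg.solve(A, beta * u)


def f(u):
    """Discrete reduced functional with h-weighted L2 norms."""
    y = solve_state(u)
    return 0.5 * h * np.sum((y - y_target) ** 2) + 0.5 * lam * h * np.sum(u ** 2)


def reduced_gradient(u):
    """Adjoint-based gradient beta * p + lam * u, where A p = y - y_target."""
    y = solve_state(u)
    p = np.linalg.solve(A, y - y_target)  # A is symmetric, so it is its own adjoint
    return beta * p + lam * u


rng = np.random.default_rng(0)
u = np.cos(2.0 * np.pi * x)     # arbitrary control (need not be optimal)
d = rng.standard_normal(n)      # arbitrary direction

t = 1e-2
fd = (f(u + t * d) - f(u - t * d)) / (2.0 * t)   # central difference quotient
adj = h * np.dot(reduced_gradient(u), d)          # discrete L2 inner product
print(abs(fd - adj))
```

Since the discrete functional is quadratic, the central difference quotient is exact up to rounding, so the two numbers should agree to many digits. Only two linear solves are needed per gradient evaluation, regardless of the number of control unknowns.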

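Proof. Invoking equation (2.46) on page 64, we conclude from Lemma 2.24 that
\[ f'(u)\, h = (S^*(Su - y_\Omega) + \lambda u,\ h)_{L^2(\Omega)} = (\beta\, p + \lambda u,\ h)_{L^2(\Omega)}. \]
By virtue of the Riesz representation theorem, $f'(u)$ is identified with $\beta\, p + \lambda u$. □

We conclude this section by reformulating the variational inequality (2.48) on page 65. Owing to the definition of the adjoint $S^*$, the variational inequality is equivalent to
\[ (S\bar{u} - y_\Omega,\ Su - S\bar{u})_{L^2(\Omega)} + \lambda\, (\bar{u},\ u - \bar{u})_{L^2(\Omega)} \ge 0 \quad \forall u \in U_{ad}. \tag{2.66} \]
With $\bar{y} = S\bar{u}$ and $y = Su$, it follows that
\[ f'(\bar{u})(u - \bar{u}) = (\bar{y} - y_\Omega,\ y - \bar{y})_{L^2(\Omega)} + \lambda\, (\bar{u},\ u - \bar{u})_{L^2(\Omega)} \ge 0. \tag{2.67} \]
The form (2.67) of the variational inequality makes it possible to apply the next result, Lemma 2.31, to determine $S^*$. Even though the operator $S^*$ will not appear explicitly, it will stand behind the construction. We prefer to use this approach in what follows.

2.8.3. Stationary heat sources and boundary conditions of the third kind. In this section, we will treat the optimal control problem (2.69)-(2.71) defined below. We begin our analysis by proving an analogue of Lemma 2.23 that can be applied directly to determine the adjoint equation.

Lemma 2.31. Let functions $a_\Omega, v \in L^2(\Omega)$, $a_\Gamma, u \in L^2(\Gamma)$, $c_0, \beta_\Omega \in L^\infty(\Omega)$, and $\alpha, \beta_\Gamma \in L^\infty(\Gamma)$ be given, where $\alpha \ge 0$ and $c_0 \ge 0$ almost everywhere. Moreover, let $y$ and $p$ denote the weak solutions to the elliptic boundary value problems
\[ \begin{aligned} -\Delta y + c_0\, y &= \beta_\Omega\, v, & -\Delta p + c_0\, p &= a_\Omega && \text{in } \Omega, \\ \partial_\nu y + \alpha\, y &= \beta_\Gamma\, u, & \partial_\nu p + \alpha\, p &= a_\Gamma && \text{on } \Gamma. \end{aligned} \]
Then
\[ \int_\Omega a_\Omega\, y\, dx + \int_\Gamma a_\Gamma\, y\, ds = \int_\Omega \beta_\Omega\, p\, v\, dx + \int_\Gamma \beta_\Gamma\, p\, u\, ds. \tag{2.68} \]

Proof. We use the variational formulations of the above two boundary value problems. Inserting $p \in H^1(\Omega)$ in the equation for $y$, we find that
\[ \int_\Omega (\nabla y \cdot \nabla p + c_0\, y\, p)\, dx + \int_\Gamma \alpha\, y\, p\, ds = \int_\Omega \beta_\Omega\, p\, v\, dx + \int_\Gamma \beta_\Gamma\, p\, u\, ds, \]
and insertion of $y \in H^1(\Omega)$ in the equation for $p$ yields
\[ \int_\Omega (\nabla p \cdot \nabla y + c_0\, p\, y)\, dx + \int_\Gamma \alpha\, p\, y\, ds = \int_\Omega a_\Omega\, y\, dx + \int_\Gamma a_\Gamma\, y\, ds. \]
From this, the assertion immediately follows. □

With this result in hand, it is now easy to treat the problem of finding the optimal stationary heat source for a Robin boundary condition; for the sake of simplicity, we assume the latter to be homogeneous. We also include a boundary term in the cost functional. The problem then reads:
\[ \min J(y, u) := \frac{\lambda_\Omega}{2}\, \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda_\Gamma}{2}\, \|y - y_\Gamma\|_{L^2(\Gamma)}^2 + \frac{\lambda}{2}\, \|u\|_{L^2(\Omega)}^2, \tag{2.69} \]
subject to
\[ -\Delta y = \beta u \ \text{ in } \Omega, \qquad \partial_\nu y + \alpha\, y = 0 \ \text{ on } \Gamma, \tag{2.70} \]
and
\[ u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega. \tag{2.71} \]
We postulate that $\lambda \ge 0$, $\lambda_\Omega \ge 0$, $\lambda_\Gamma \ge 0$ and $\alpha \in L^\infty(\Gamma)$, where $\alpha \ge 0$ almost everywhere and $\|\alpha\|_{L^\infty(\Gamma)} \ne 0$, and also that $y_\Omega \in L^2(\Omega)$ and $y_\Gamma \in L^2(\Gamma)$. The optimal quantities $\bar{u}$, $\bar{y}$, and $p$ then satisfy the optimality condition
\[ \int_\Omega (\beta\, p + \lambda \bar{u})\, (u - \bar{u})\, dx \ge 0 \quad \forall u \in U_{ad}, \]
where the adjoint state $p$ solves the boundary value problem
\[ -\Delta p = \lambda_\Omega\, (\bar{y} - y_\Omega) \ \text{ in } \Omega, \qquad \partial_\nu p + \alpha\, p = \lambda_\Gamma\, (\bar{y} - y_\Gamma) \ \text{ on } \Gamma. \]
The above relations are derived as in the next subsection, by invoking Lemma 2.31; see Exercise 2.14.

2.8.4. Optimal stationary boundary temperature. Let us recall the boundary control problem (2.32)-(2.34) from page 53:
\[ \min J(y, u) := \frac{1}{2}\, \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\, \|u\|_{L^2(\Gamma)}^2, \]
subject to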

\[ -\Delta y = 0 \ \text{ in } \Omega, \qquad \partial_\nu y + \alpha\, y = \alpha\, u \ \text{ on } \Gamma, \]
and
\[ u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Gamma. \]

Owing to Theorem 2.6, the control-to-state operator $G: u \mapsto y(u)$ is a continuous linear mapping from $L^2(\Gamma)$ into $H^1(\Omega)$. However, we again consider $G$ as an operator with range in $L^2(\Omega)$, that is, $S = E_Y G: L^2(\Gamma) \to L^2(\Omega)$, with the embedding operator $E_Y: H^1(\Omega) \to L^2(\Omega)$. The cost functional then attains the form
\[ J(y, u) = f(u) = \frac{1}{2}\, \|Su - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\, \|u\|_{L^2(\Gamma)}^2. \]
We now proceed similarly as in Section 2.8.2. A simpler method for the construction of the adjoint equation will be given later by the Lagrange method. To begin with, let $\bar{u} \in U_{ad}$ and $\bar{y}$ denote the optimal control and its associated state, respectively. We employ Theorem 2.22 on page 64 and rearrange the resulting variational inequality as in (2.67) to get
\[ f'(\bar{u})(u - \bar{u}) = (\bar{y} - y_\Omega,\ y - \bar{y})_{L^2(\Omega)} + \lambda\, (\bar{u},\ u - \bar{u})_{L^2(\Gamma)} \ge 0. \tag{2.72} \]

We intend to apply Lemma 2.31. Comparison of the boundary value problems satisfied by $y$ indicates that the choices $\beta_\Omega = 0$, $\beta_\Gamma = \alpha$, and $c_0 = 0$ have to be made. With this, we see that the expression $(\bar{y} - y_\Omega,\ y - \bar{y})_{L^2(\Omega)}$ attains the form of the left-hand side of equation (2.68), provided we replace $y - \bar{y}$ by $y$ and make the choices $a_\Omega = \bar{y} - y_\Omega$ and $a_\Gamma = 0$. Our plan is to express $y - \bar{y}$ in terms of $u - \bar{u}$ in order to calculate $f'(\bar{u})$ from (2.72).

In view of the above considerations, we are motivated to define $p$ as the solution to the following adjoint equation:
\[ -\Delta p = \bar{y} - y_\Omega \ \text{ in } \Omega, \qquad \partial_\nu p + \alpha\, p = 0 \ \text{ on } \Gamma. \tag{2.73} \]
The right-hand side of the differential equation belongs to $L^2(\Omega)$, since $y_\Omega \in L^2(\Omega)$ by assumption and $\bar{y} \in Y = H^1(\Omega) \hookrightarrow L^2(\Omega)$. Owing to Theorem 2.6, problem (2.73) admits a unique weak solution $p \in H^1(\Omega)$ that satisfies
\[ \int_\Omega \nabla p \cdot \nabla v\, dx + \int_\Gamma \alpha\, p\, v\, ds = \int_\Omega (\bar{y} - y_\Omega)\, v\, dx \quad \forall v \in H^1(\Omega). \tag{2.74} \]
The optimal state $\bar{y} = S\bar{u}$ is the weak solution to the state equation associated with $\bar{u}$, while $y = Su$ corresponds to $u$. Hence, by the linearity of the state equation, we have $y - \bar{y} = S(u - \bar{u})$. Lemma 2.31, applied with $y - \bar{y}$ in place of $y$ and $u - \bar{u}$ in place of the boundary control $u$, yields that
\[ \int_\Omega (\bar{y} - y_\Omega)\, (y - \bar{y})\, dx = \int_\Gamma \alpha\, p\, (u - \bar{u})\, ds. \]
With this, (2.72) becomes
\[ f'(\bar{u})(u - \bar{u}) = \int_\Gamma (\lambda \bar{u} + \alpha\, p)\, (u - \bar{u})\, ds \ge 0 \quad \forall u \in U_{ad}. \]
The form of the derivative $f'(\bar{u})$ does not depend on the fact that $\bar{u}$ is optimal. Hence, we obtain as a side result that the reduced gradient $f'(u)$ at an arbitrary $u$ is of the form
\[ f'(u) = \alpha\, p|_\Gamma + \lambda u, \tag{2.75} \]
where $p$ solves the associated adjoint equation
\[ -\Delta p = y(u) - y_\Omega \ \text{ in } \Omega, \qquad \partial_\nu p + \alpha\, p = 0 \ \text{ on } \Gamma. \]
In accordance with the Riesz representation theorem, we have expressed the derivative $f'(u)$ as an element of $L^2(\Gamma)$, namely the gradient.

Summarizing the above considerations, we have proved the following result.

Theorem 2.32. Let $\bar{u}$ denote an optimal control to the problem (2.32)-(2.34) on page 53, and let $\bar{y}$ denote the associated state. Then the adjoint equation (2.73) has a unique solution $p$ such that the variational inequality
\[ \int_\Gamma \big( \alpha(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \big( u(x) - \bar{u}(x) \big)\, ds(x) \ge 0 \quad \forall u \in U_{ad} \tag{2.76} \]
is satisfied. Conversely, every control $\bar{u} \in U_{ad}$ that, together with $\bar{y} := y(\bar{u})$ and the solution $p$ to (2.73), solves the variational inequality (2.76) is optimal.

Further discussion of the variational inequality (2.76) follows the same lines as in the case of Poisson's equation. In this case, we obtain that

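\[ \bar{u}(x) \begin{cases} = u_a(x) & \text{if } \alpha(x)\, p(x) + \lambda\, \bar{u}(x) > 0, \\ \in [u_a(x), u_b(x)] & \text{if } \alpha(x)\, p(x) + \lambda\, \bar{u}(x) = 0, \\ = u_b(x) & \text{if } \alpha(x)\, p(x) + \lambda\, \bar{u}(x) < 0, \end{cases} \tag{2.77} \]
and the weak minimum principle becomes
\[ \min_{u_a(x) \le v \le u_b(x)} \big\{ \big( \alpha(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, v \big\} = \big( \alpha(x)\, p(x) + \lambda\, \bar{u}(x) \big)\, \bar{u}(x) \]
for almost every $x \in \Gamma$.

In addition, we have the following result.

Theorem 2.33 (Minimum principle). Suppose that $\bar{u}$ is an optimal control for the problem (2.32)-(2.34) on page 53, and let $p$ denote the associated adjoint state. Then, for almost every $x \in \Gamma$, the minimum
\[ \min_{u_a(x) \le v \le u_b(x)} \Big\{ \alpha(x)\, p(x)\, v + \frac{\lambda}{2}\, v^2 \Big\} \]
is attained at $v = \bar{u}(x)$. Hence, for $\lambda > 0$ we have for almost every $x \in \Gamma$ the projection formula
\[ \bar{u}(x) = \mathbb{P}_{[u_a(x), u_b(x)]} \Big\{ -\frac{1}{\lambda}\, \alpha(x)\, p(x) \Big\}. \tag{2.78} \]
Conversely, a control $\bar{u} \in U_{ad}$ is optimal if it satisfies, together with the associated adjoint state $p$, the projection formula (2.78).

The proof of this result is identical to that for the problem of finding the optimal stationary heat source. In the unconstrained case where $u_a = -\infty$ and $u_b = \infty$, one obtains that
\[ \bar{u}(x) = -\frac{1}{\lambda}\, \alpha(x)\, p(x). \]
In the special case of $\lambda = 0$, we have to distinguish between different cases as in (2.57) on page 70. As an illustration, we choose a two-dimensional domain $\Omega$ and imagine that its boundary $\Gamma$ is unrolled onto part of the real axis. As bounds, we prescribe $u_a = -1$ and $u_b = +1$.

[Figure: Optimal control for $\lambda = 0$.]

For $\lambda > 0$, we obtain $\bar{u}$ as the projection of the function $-\lambda^{-1} \alpha\, p$ onto $[-1, 1]$.

[Figure: Optimal control for $\lambda > 0$.]

2.8.5. A linear optimal control problem. Let us consider the linear problem with distributed control $v$ and boundary control $u$:
\[ \min J(y, u, v) := \int_\Omega (a_\Omega\, y + \lambda_\Omega\, v)\, dx + \int_\Gamma (a_\Gamma\, y + \lambda_\Gamma\, u)\, ds, \]
subject to
\[ -\Delta y = \beta_\Omega\, v \ \text{ in } \Omega, \qquad \partial_\nu y + \alpha\, y = \beta_\Gamma\, u \ \text{ on } \Gamma, \]
and
\[ v_a(x) \le v(x) \le v_b(x) \ \text{ a.e. in } \Omega, \qquad u_a(x) \le u(x) \le u_b(x) \ \text{ a.e. on } \Gamma. \]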

We impose the following conditions on the data of this problem : the func- of IJuII2 vanishes. In the following, we will construct a case in which such a
tions ac, A2 and ar, \r are square integrable on their domains 9 and F, control appears for the problem
respectively, /3c and / 3r are bounded and measurable en St and F , respec-
tively, and the bounds va, Vb, ua, and ub are square integrable as well. In min f y_ yn2 dX,
addition, a is nonnegative almost everywhere and does not vanish almost
everywhere. subject to
Then , the optimality conditions for an optimal triple (y, v, u) read -Ay = u+ eO
yIr = 0
J (,612p+Al2)(v-v)dx + ((rp+,\r)( u-ú)ds > 0 bv E Vad, bu E Uad,
sz and
where the adjoint state p is given by -1 < u(x) < 1,

-Ap = an in S where en is yet to be defined.


avp + a p = ar en F. This problem differs from that of the optimal stationary heat source
The reader will be asked to derive these relations in Exercise 2.15.

Linear control problems arise, for instance, if nonlinear optimal control problems are linearized at optimal points. By linearization and application of the necessary conditions to the linearized problem, optimality conditions for nonlinear problems can be derived. This is one possible way to treat nonlinear problems.

2.9. Construction of test examples

To validate numerical methods for the solution of optimal control problems, test examples are needed for which the exact solutions are known explicitly. By means of such test examples it can be checked whether a numerical method yields correct results. Invoking the necessary optimality conditions proved above, it is not hard to construct such examples. However, partial differential equations require a different approach than ordinary ones.

Indeed, in the optimal control theory of ordinary differential equations it is possible, at least for specifically chosen examples, to solve the state equation in closed form if an analytic expression is prescribed for the control. In the case of partial differential equations, this is much more difficult: even in the simplest cases the best we can hope for is to obtain a series expansion of the state $y$ for a given $u$. Therefore, we take the opposite approach: we simply prescribe the desired solution triple $(u, y, p)$, and then adjust the state equation and the cost functional in such a way that $u$, $y$, and $p$ satisfy the necessary optimality conditions.

2.9.1. Bang-bang control. By bang-bang controls we mean control functions whose values almost all lie on the boundary of the admissible set. Such controls occur in certain situations if the regularization parameter $\lambda$ in front

investigated in Section 2.8.2 only by the term $e_\Omega$ in the state equation. It is, however, easily seen that this term influences neither the adjoint equation (2.50) nor the variational inequality (2.51); see Exercise 2.16.

As the domain, we choose again the unit square $\Omega = (0,1)^2$. We look for a chessboard function $\bar u$ as optimal control. To this end, we subdivide the unit square like a chessboard into $8 \times 8 = 64$ congruent subsquares. Within these subsquares, the optimal control $\bar u$ shall alternately attain the values $+1$ or $-1$.

In view of the necessary optimality condition (2.57), and since $u_a = -1$ and $u_b = 1$, $\bar u$ is optimal if and only if

$$\bar u(x) = -\operatorname{sign} p(x) \quad \text{for a.e. } x \in \Omega,$$

where we have put $\operatorname{sign}(0) := [-1, 1]$ to express that $\bar u$ can vary here arbitrarily in $[-1, 1]$. An adjoint state that fits the chessboard pattern is given by

$$p(x) = p(x_1, x_2) = -\frac{1}{128\,\pi^2}\,\sin(8\pi x_1)\,\sin(8\pi x_2).$$

The factor $1/(128\,\pi^2)$ simplifies other expressions. The associated control $\bar u$ has value $+1$ in the lower left subsquare of $\Omega$ and changes sign according to the chessboard pattern. Next, we choose as optimal state the function

$$\bar y(x) = \sin(\pi x_1)\,\sin(\pi x_2).$$

Note that $\bar y$ vanishes on $\Gamma$; moreover, it solves the Poisson equation

$$-\Delta \bar y = 2\pi^2\,\bar y = 2\pi^2 \sin(\pi x_1)\,\sin(\pi x_2).$$

For the state equation to be solved by $\bar y$, we must have $-\Delta \bar y = \bar u + e_\Omega$, that is,

$$e_\Omega = -\Delta \bar y - \bar u = 2\pi^2 \sin(\pi x_1)\,\sin(\pi x_2) + \operatorname{sign}\bigl(-\sin(8\pi x_1)\,\sin(8\pi x_2)\bigr).$$
82 2. Linear-quadratic elliptic control problems 2.9. Construction of test examples 83

Correspondingly, the adjoint state satisfies

$$\Delta p(x) = \frac{2\,(8\pi)^2}{128\,\pi^2}\,\sin(8\pi x_1)\,\sin(8\pi x_2) = \sin(8\pi x_1)\,\sin(8\pi x_2),$$

and it has to be a solution to the boundary value problem

$$-\Delta p = \bar y - y_\Omega \quad \text{in } \Omega, \qquad p|_\Gamma = 0.$$

Therefore, we choose $y_\Omega = \bar y + \Delta p$, that is,

$$y_\Omega(x) = \sin(\pi x_1)\,\sin(\pi x_2) + \sin(8\pi x_1)\,\sin(8\pi x_2).$$

2.9.2. Distributed control and Neumann boundary condition. In this section, we consider a problem with homogeneous Neumann boundary condition $\partial_\nu y = 0$, namely:

$$\min J(y,u) := \frac12 \int_\Omega |y - y_\Omega|^2\,dx + \int_\Gamma e_\Gamma\, y\,ds + \frac12 \int_\Omega |u|^2\,dx, \tag{2.79}$$

subject to

$$-\Delta y + y = u + e_\Omega \quad \text{in } \Omega, \qquad \partial_\nu y = 0 \quad \text{on } \Gamma,$$

and the control constraints

$$0 \le u(x) \le 1.$$

For the sake of convenience, we again choose as domain the unit square $\Omega = (0,1)^2$, which has center $\hat x = (0.5, 0.5)^T$. The functions $y_\Omega$, $e_\Omega$, and $e_\Gamma$ will again be fitted in such a way that a desired solution results. To this end, we put $r = |x - \hat x| = \sqrt{(x_1 - 0.5)^2 + (x_2 - 0.5)^2}$. The desired optimal control is the function depicted below, which is given by

$$\bar u(x) = \begin{cases} 1 & \text{if } r > 1/3, \\ 12\,r^2 - 1/3 & \text{if } r \in [1/6,\, 1/3], \\ 0 & \text{if } r < 1/6, \end{cases}$$

or, equivalently, by

$$\bar u(x) = \min\{1, \max\{0,\, 12\,r^2 - 1/3\}\}. \tag{2.80}$$

As the reader will be asked to show in Exercise 2.17, $\bar u$ can be recovered from the projection formula

$$\bar u(x) = \mathbb{P}_{[0,1]}\{-p(x)\} \quad \text{for a.e. } x \in \Omega,$$

where the adjoint state $p$ is the weak solution to the boundary value problem

$$-\Delta p + p = \bar y - y_\Omega \quad \text{in } \Omega, \qquad \partial_\nu p = e_\Gamma \quad \text{on } \Gamma. \tag{2.81}$$

We thus fix the adjoint state by putting

$$p(x) = -12\,|x - \hat x|^2 + \frac13 = -12\,r^2 + \frac13.$$

The graph of $-p$ is a paraboloid, which is cut by the planes $\{-p = 0\}$ and $\{-p = 1\}$ in such a way that the above control $\bar u$ results.

[Figure: Constructed control.]

Having defined the adjoint state and control, we now choose the associated state $\bar y$. Since its normal derivative must vanish on $\Gamma$, we simply take $\bar y(x) \equiv 1$. For this function to satisfy the state equation, the function $e_\Omega = e_\Omega(x)$ is used as compensation. Clearly, since $\Delta \bar y = 0$, we must choose $e_\Omega = 1 - \bar u$. Hence, after substituting the expression (2.80) for $\bar u$,

$$e_\Omega = 1 - \min\{1, \max\{0,\, 12\,r^2 - 1/3\}\}.$$

The functions $y_\Omega$ and $e_\Gamma$ can still be chosen in order to fit equation (2.81) for the adjoint state. Application of the Laplacian to $p$ yields

$$\Delta p = D_1^2 p + D_2^2 p = -12\,\{2 + 2\} = -48,$$

whence we conclude that

$$y_\Omega(x) = (\bar y + \Delta p - p)(x) = 1 - 48 - \frac13 + 12\,|x - \hat x|^2 = -\frac{142}{3} + 12\,r^2.$$

We still have to satisfy the boundary condition for $p$. To this end, we have included the boundary integral in the cost functional. Indeed, $e_\Gamma$ needs to satisfy the relation

$$\partial_\nu p = e_\Gamma.$$

Obviously,

$$D_1 p = -24\,(x_1 - 0.5), \qquad D_2 p = -24\,(x_2 - 0.5),$$

so that on the part of the boundary given by $\{(x_1, x_2) \in \Gamma : x_1 = 0\}$ we have

$$\partial_\nu p = -D_1 p\,|_{x_1 = 0} = 24\,(0 - 0.5) = -12.$$
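The relations collected for this Neumann example can be spot-checked on a grid. The Python sketch below assumes NumPy; the function names, the grid size, and the difference step are illustrative choices, not part of the text. It verifies the projection formula, the adjoint equation (2.81) with the constant state equal to 1 and the Laplacian of the adjoint state equal to $-48$, the state equation, and the value of the normal derivative on the edge $x_1 = 0$.

```python
import numpy as np

CENTER = 0.5  # both coordinates of the center of the unit square

def r_squared(x1, x2):
    """Squared distance r^2 to the center of the unit square."""
    return (x1 - CENTER) ** 2 + (x2 - CENTER) ** 2

def p_adj(x1, x2):
    """Prescribed adjoint state p = -12 r^2 + 1/3."""
    return -12.0 * r_squared(x1, x2) + 1.0 / 3.0

def u_opt(x1, x2):
    """Constructed control (2.80): min{1, max{0, 12 r^2 - 1/3}}."""
    return np.minimum(1.0, np.maximum(0.0, 12.0 * r_squared(x1, x2) - 1.0 / 3.0))

def y_omega(x1, x2):
    """Fitted desired state y_Omega = -142/3 + 12 r^2."""
    return -142.0 / 3.0 + 12.0 * r_squared(x1, x2)

def e_omega(x1, x2):
    """Compensation term e_Omega = 1 - u for the constant state y = 1."""
    return 1.0 - u_opt(x1, x2)

grid = np.linspace(0.0, 1.0, 41)
X1, X2 = np.meshgrid(grid, grid, indexing="ij")

# Projection formula: u = P_[0,1](-p).
projection_ok = np.allclose(u_opt(X1, X2), np.clip(-p_adj(X1, X2), 0.0, 1.0))

# Adjoint equation: -Laplace(p) + p = y - y_Omega, with Laplace(p) = -48, y = 1.
adjoint_ok = np.allclose(48.0 + p_adj(X1, X2), 1.0 - y_omega(X1, X2))

# State equation for y = 1: -Laplace(y) + y = 1 must equal u + e_Omega.
state_ok = np.allclose(u_opt(X1, X2) + e_omega(X1, X2), 1.0)

# Outward normal derivative on the edge x1 = 0 is -D_1 p; central difference in x1.
step = 1e-6
normal_derivative = -(p_adj(step, grid) - p_adj(-step, grid)) / (2.0 * step)

print(projection_ok, adjoint_ok, state_ok, normal_derivative[:3])
```

Since the adjoint state is a quadratic polynomial, the central difference reproduces its derivative up to rounding errors, so the printed normal derivative should be close to $-12$.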

On the other parts of $\Gamma$, the same value results, so we may choose $e_\Gamma(x) = \partial_\nu p(x) \equiv -12$.

2.10. The formal Lagrange method

In the preceding section, we determined the actual form of the adjoint equation in a more or less intuitive way. However, this equation can easily be derived by means of a Lagrangian function. In this connection, we recall that in the finite-dimensional case treated in Section 1.4, the adjoint equation resulted from the derivative $D_y \mathcal{L}$ of the Lagrangian with respect to $y$. Given the right formalism, this ought to be possible here as well.

Necessary optimality conditions in function spaces can be directly deduced from the Karush-Kuhn-Tucker theory for optimization problems in Banach spaces. This method, which might be called the exact Lagrange method, will be investigated in Chapter 6. In many cases, its application is difficult, requiring a lot of experience in matching the operators, functionals, and spaces involved. Indeed, the given quantities have to be differentiable in the chosen Banach spaces, adjoint operators have to be determined, and the Lagrange multipliers must exist in the right spaces. We will discuss some examples of the application of this method in Sections 2.13, 6.1.3, and 6.2.

Up to now, we have taken a different approach: we first expressed the state $y$ by means of the control-to-state mapping $G$ in the form $y = G(u)$. From this, we derived a variational inequality that was simplified by the introduction of the adjoint state $p$. The adjoint state is the Lagrange multiplier associated with the boundary value problem for the partial differential equation if this is defined as in Sections 2.13 or 6.1.3. This approach is equivalent to application of the general Karush-Kuhn-Tucker theory, since in this way one actually proves a Lagrange multiplier rule. However, because of the spaces involved, the application of Karush-Kuhn-Tucker theorems can be very complicated.

The above remarks apply to the proof of optimality conditions. It is, however, a completely different task to derive them (e.g., to determine the adjoint equation for complex problems), as well as to find a form for the conditions that is easy to memorize. For these purposes, the formal Lagrange method to be introduced here is particularly well suited. Basically, it is just the (exact) Lagrange principle described in, e.g., Ioffe and Tihomirov [IT79] and Luenberger [Lue69]. This principle will be discussed in Chapter 6, in particular for a problem involving a semilinear elliptic equation in Section 6.1.3.

The formal Lagrange method differs from the exact method in the following way: differential operators such as $-\Delta$ or $\partial_\nu$ are written formally, and all multipliers are regarded as functions, without specifying the underlying spaces. One simply assumes that the state $y$ and the multipliers that occur, as well as their derivatives, are all square integrable. In this way, $L^2$ scalar products can be used, and one avoids functionals from more general dual spaces.

This approach, while being justified in a certain sense, is not mathematically rigorous. It is not meant to be used as a tool for rigorous proofs, but primarily as a convenient means to derive and formulate the correct optimality conditions. Once these have been established, it does not matter anymore how they were found. This line of argument is particularly helpful for complex problems involving nonlinear systems of partial differential equations.

While the adjoint equation is too easy to guess for the problem of finding the optimal stationary heat source, it is quite instructive to demonstrate the basic ideas of the technique by means of the problem of finding the optimal stationary boundary temperature, defined in (2.32)-(2.34):

$$\min J(y,u) := \frac12\,\|y - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2}\,\|u\|^2_{L^2(\Gamma)}$$

subject to

$$-\Delta y = 0 \quad \text{in } \Omega, \qquad \partial_\nu y + \alpha\,y = \alpha\,u \quad \text{on } \Gamma,$$

and

$$u_a(x) \le u(x) \le u_b(x) \quad \text{a.e. on } \Gamma.$$

Here, three constraints have to be obeyed, two difficult ones (the boundary value problem) and a harmless one (the pointwise constraint $u \in U_{ad}$). We now apply the Lagrangian principle, eliminating only the equations by means of Lagrange multipliers $p_1$ and $p_2$. To this end, we define, still somewhat formally, the Lagrangian function

$$\mathcal{L} = \mathcal{L}(y,u,p) = J(y,u) - \int_\Omega (-\Delta y)\,p_1\,dx - \int_\Gamma \bigl(\partial_\nu y - \alpha\,(u - y)\bigr)\,p_2\,ds.$$

Here, the Lagrange multipliers $p_1$ and $p_2$ are functions defined on $\Omega$ and $\Gamma$, respectively, which in $\mathcal{L}$ are expressed as the vector $p := (p_1, p_2)$.

The definition of $\mathcal{L}$ is not rigorous for three reasons: we only know that $y \in H^1(\Omega)$, so neither $\Delta y$ nor $\partial_\nu y$ need to be functions. Indeed, without further knowledge concerning the higher regularity of $y$, we can only claim that $\Delta y \in H^1(\Omega)^*$ and $\partial_\nu y \in H^{-1/2}(\Gamma)$ (for the definition of $H^{-1/2}(\Gamma)$, see, e.g., Lions and Magenes [LM72]). This has the unpleasant consequence that

the integrals which occur might be meaningless. In addition, it has to be clarified what regularity $p_1$ and $p_2$ have.

Nevertheless, we continue our approach, simply taking sufficient smoothness of $p_1$ and $p_2$ for granted, and integrate by parts using the second Green's formula. For the sake of brevity, we omit the differentials in the integrals. We obtain

$$\mathcal{L}(y,u,p) = J(y,u) + \int_\Gamma p_1\,\partial_\nu y - \int_\Gamma y\,\partial_\nu p_1 + \int_\Omega y\,\Delta p_1 - \int_\Gamma \bigl(\partial_\nu y - \alpha\,(u - y)\bigr)\,p_2.$$

Recalling the Lagrange principle, we expect the pair $(\bar y, \bar u)$, together with the Lagrange multipliers $p_1$ and $p_2$, to satisfy the optimality conditions associated with the problem

$$\min \mathcal{L}(y,u,p), \quad y \text{ unconstrained}, \quad u \in U_{ad}.$$

Since $y$ is now formally unconstrained, the derivative of $\mathcal{L}$ with respect to $y$ has to vanish at the optimal point, that is,

$$D_y \mathcal{L}(\bar y, \bar u, p)\,h = \int_\Omega \bigl((\bar y - y_\Omega) + \Delta p_1\bigr)\,h\,dx + \int_\Gamma (p_1 - p_2)\,\partial_\nu h\,ds - \int_\Gamma (\partial_\nu p_1 + \alpha\,p_2)\,h\,ds = 0 \quad \forall h \in Y = H^1(\Omega). \tag{2.82}$$

Here, we have used the fact that the derivative of a linear mapping is that mapping itself. Moreover, from the box constraints for $u$ we deduce the variational inequality

$$D_u \mathcal{L}(\bar y, \bar u, p)(u - \bar u) \ge 0 \quad \forall u \in U_{ad}.$$

Let us take a closer look at equation (2.82). First, we choose some $h \in C_0^\infty(\Omega)$, so that $h = \partial_\nu h = 0$ on $\Gamma$. We obtain that

$$\int_\Omega \bigl((\bar y - y_\Omega) + \Delta p_1\bigr)\,h\,dx = 0 \quad \forall h \in C_0^\infty(\Omega),$$

and the density of $C_0^\infty(\Omega)$ in $L^2(\Omega)$ implies that

$$-\Delta p_1 = \bar y - y_\Omega \quad \text{in } \Omega.$$

This is already the first half of the adjoint system that had only been guessed at before, and we see that the first integral in (2.82) vanishes. Next, we only postulate $h|_\Gamma = 0$ and let $\partial_\nu h$ vary (see remark (iii) below). For all such $h$ we evidently have

$$\int_\Gamma (p_1 - p_2)\,\partial_\nu h\,ds = 0,$$

which is only possible if $p_1 = p_2$ on $\Gamma$. Finally, we vary $h$ on $\Gamma$ and consider the only remaining term in (2.82),

$$0 = -\int_\Gamma (\partial_\nu p_1 + \alpha\,p_2)\,h\,ds = -\int_\Gamma (\partial_\nu p_1 + \alpha\,p_1)\,h\,ds.$$

Since $h$ is arbitrary on $\Gamma$ (see remark (iii) below), we can infer that

$$\partial_\nu p_1 + \alpha\,p_1 = 0 \quad \text{on } \Gamma.$$

If we now put $p := p_1$ and $p_2 := p|_\Gamma$, then we recover the adjoint equation for the problem of the optimal stationary boundary temperature, which had been introduced intuitively before.

Observe that the variational inequality is also easily derived. We have

$$D_u \mathcal{L}(\bar y, \bar u, p)(u - \bar u) = \int_\Gamma \lambda\,\bar u\,(u - \bar u)\,ds + \int_\Gamma \alpha\,p\,(u - \bar u)\,ds \ge 0.$$

Next, we introduce the Lagrangian function in such a way that all the occurring terms are meaningful.

Definition. The Lagrangian function $\mathcal{L} : H^1(\Omega) \times L^2(\Gamma) \times H^1(\Omega) \to \mathbb{R}$ for problem (2.32)-(2.34) is defined by

$$\mathcal{L}(y,u,p) := J(y,u) - \int_\Omega \nabla y \cdot \nabla p\,dx + \int_\Gamma \alpha\,(u - y)\,p\,ds. \tag{2.83}$$

This form arises from the earlier, only formally correct form of $\mathcal{L}$, upon integrating once by parts. The terms containing $\partial_\nu y$ cancel each other out. We readily convince ourselves of the following equivalences:

$$D_y \mathcal{L}(\bar y, \bar u, p)\,h = 0 \quad \forall h \in H^1(\Omega) \iff \text{weak formulation of (2.73)};$$

$$D_u \mathcal{L}(\bar y, \bar u, p)(u - \bar u) \ge 0 \quad \forall u \in U_{ad} \iff \text{variational inequality (2.76)}.$$

Conclusion. The Lagrangian function just introduced yields the optimality conditions derived in Section 2.8.4. The adjoint equation and the variational inequality are obtained by taking the derivative with respect to the state $y$ and the control $u$, respectively.

Although $\mathcal{L}$ is now properly defined by (2.83), one question remains unanswered: How do we know that such a $p \in H^1(\Omega)$ exists? In the preceding section, we avoided this problem by defining $p$ directly as the solution to the adjoint equation; note, however, that the Lagrange method presented in this section has the sole purpose of determining the correct form of the adjoint equation. Nevertheless, the existence of $p$ as Lagrange multiplier can be concluded directly from the Karush-Kuhn-Tucker theory in Banach spaces; see Section 2.13.

Remarks.

(i) If we assume right from the beginning that the Lagrange multipliers $p_1$ and $p_2$

coincide on the boundary, then the second integration by parts leading to the term $-\Delta p_1$ is superfluous. In fact, one integration by parts suffices, and the variational form of the adjoint equation is immediately established (cf. the argument in the next section). Unfortunately, this simplified approach does not always succeed. This is the case, e.g., if boundary controls of Dirichlet type are considered; see Exercise 2.19 on page 118.

(ii) The gradient of $f(u) = J(y(u), u)$ can be obtained from

$$f'(u) = D_u \mathcal{L}(y, u, p)$$

if $y = y(u)$ and $p = p(u)$ are inserted. We thus can memorize as a rule that this reduced gradient is calculated by differentiating the Lagrangian with respect to the control.

(iii) Above, we argued that $\partial_\nu h$ is essentially arbitrary on $\Gamma$ within the set of all $h$ with $h|_\Gamma = 0$. This is a consequence of the fact that the mapping $h \mapsto (\tau h, \partial_\nu h)$ is surjective from $H^2(\Omega)$ onto $H^{3/2}(\Gamma) \times H^{1/2}(\Gamma)$ (see, e.g., [Ada78], Thm. 7.53). A similar conclusion can be drawn for $h$, since the trace operator $\tau : H^1(\Omega) \to H^{1/2}(\Gamma)$ is also a surjective mapping.

We have demonstrated the application of the Lagrangian function for problem (2.32)-(2.34) only. Other problems can be treated similarly. We also note that it is possible to eliminate the box constraints for the control by incorporating them into the Lagrangian by means of additional Lagrange multipliers $\mu_a$ and $\mu_b$. This was demonstrated earlier in Section 1.4.7. The Lagrangian (2.83) then has to be extended in the following way:

$$\mathcal{L}(y, u, p, \mu_a, \mu_b) := J(y,u) - \int_\Omega \nabla y \cdot \nabla p\,dx + \int_\Gamma \alpha\,(u - y)\,p\,ds + \int_\Gamma \bigl(\mu_a\,(u_a - u) + \mu_b\,(u - u_b)\bigr)\,ds. \tag{2.84}$$

The formal Lagrange method described in this section will repeatedly serve us well when adjoint equations are to be determined. Lagrangian functions like (2.83) or (2.84) are the proper tool for formulating the optimality conditions in a both elegant and rigorous way. Later on, we will also employ them to represent second-order sufficient optimality conditions rigorously in a form that is commonly used in both finite-dimensional optimization and the foundation of numerical methods.

2.11. Further examples *

2.11.1. Differential operators in divergence form. In the same way, the general problem (2.36)-(2.38) on page 54 can be treated using the Lagrange method:

$$\min J(y,u,v) := \frac{\lambda_\Omega}{2}\,\|y - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda_\Gamma}{2}\,\|y - y_\Gamma\|^2_{L^2(\Gamma)} + \frac{\lambda_v}{2}\,\|v\|^2_{L^2(\Omega)} + \frac{\lambda_u}{2}\,\|u\|^2_{L^2(\Gamma_1)},$$

subject to

$$\mathcal{A}y + c_0\,y = \beta_\Omega\,v \quad \text{in } \Omega, \qquad \partial_{\nu_{\mathcal{A}}} y + \alpha\,y = \beta_\Gamma\,u \quad \text{on } \Gamma_1, \qquad y = 0 \quad \text{on } \Gamma_0,$$

and

$$v_a(x) \le v(x) \le v_b(x) \quad \text{for a.e. } x \in \Omega, \qquad u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Gamma_1.$$

The state $y$ vanishes on $\Gamma_0$; therefore, only its restriction to $\Gamma_1$ contributes to the boundary term in the cost functional.

Suppose that Assumption 2.19 on page 55 holds. The natural state space is evidently

$$Y = \{\,y \in H^1(\Omega) : y|_{\Gamma_0} = 0\,\}.$$

The optimality conditions can be derived as in Section 2.8.4. First, we construct the adjoint equation. The Lagrangian can be defined by

$$\mathcal{L}(y,u,v,p) = J(y,u,v) - \int_\Omega (\mathcal{A}y + c_0\,y - \beta_\Omega\,v)\,p\,dx - \int_{\Gamma_1} (\partial_{\nu_{\mathcal{A}}} y + \alpha\,y - \beta_\Gamma\,u)\,p\,ds,$$

since the boundary condition $y|_{\Gamma_0} = 0$ is already accounted for in the space $Y$. As a simplification, we tacitly assume that the multiplier $p$ occurring in the boundary integral coincides with the trace of the multiplier $p$ appearing in the integral over $\Omega$. This follows as in Section 2.10. Again, we could have chosen different multipliers $p_1$ and $p_2$. After integration by parts and with the bilinear form $a$ defined in (2.22) on page 38 it follows that for all $h \in Y$ we have

$$D_y \mathcal{L}(\bar y, \bar u, \bar v, p)\,h = \int_\Omega \lambda_\Omega\,(\bar y - y_\Omega)\,h\,dx + \int_\Gamma \lambda_\Gamma\,(\bar y - y_\Gamma)\,h\,ds - a[h, p] = 0.$$
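The integration by parts behind this identity can be sketched as follows. The sketch is not taken from the text and rests on assumptions: the operator is taken in divergence form with symmetric coefficients $a_{ij}$, the conormal derivative is taken in its usual form, the bilinear form (2.22) is assumed to consist of the three standard terms written out below, and the multiplier $p$ is assumed to lie in $Y$, so that it vanishes on $\Gamma_0$.

```latex
% Assumptions (not quoted from the text):
%   \mathcal{A}y = -\sum_{i,j=1}^N D_i (a_{ij} D_j y),  a_{ij} = a_{ji},
%   \partial_{\nu_{\mathcal{A}}} y = \sum_{i,j=1}^N \nu_i \, a_{ij} D_j y,
%   a[y,p] = \int_\Omega \Bigl(\sum_{i,j=1}^N a_{ij} D_i y \, D_j p + c_0 \, y \, p\Bigr) dx
%            + \int_{\Gamma_1} \alpha \, y \, p \, ds,
%   p \in Y, i.e. p = 0 on \Gamma_0.
\begin{align*}
-\int_\Omega (\mathcal{A}y)\, p \, dx
  &= -\int_\Omega \sum_{i,j=1}^N a_{ij}\, D_j y \, D_i p \, dx
     + \int_{\Gamma_1} (\partial_{\nu_{\mathcal{A}}} y)\, p \, ds
     && \text{(Green's formula, } p|_{\Gamma_0} = 0\text{)} \\
\mathcal{L}(y,u,v,p)
  &= J(y,u,v) - a[y,p]
     + \int_\Omega \beta_\Omega \, v \, p \, dx
     + \int_{\Gamma_1} \beta_\Gamma \, u \, p \, ds
     && \text{(the terms with } \partial_{\nu_{\mathcal{A}}} y \text{ cancel)} \\
D_y \mathcal{L}(\bar y,\bar u,\bar v,p)\, h
  &= \int_\Omega \lambda_\Omega (\bar y - y_\Omega)\, h \, dx
     + \int_\Gamma \lambda_\Gamma (\bar y - y_\Gamma)\, h \, ds
     - a[h,p] = 0
     && \forall h \in Y.
\end{align*}
```

Only the state $y$ enters the bilinear form, so differentiating with respect to $y$ in direction $h$ simply replaces $y$ by $h$ in $a[y,p]$, which gives the last line.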

It is not necessary to integrate by parts once more in order to transfer the second-order differential operator to $p$. Indeed, the above relation is just the variational equation corresponding to the weak solution to the adjoint equation

$$\mathcal{A}p + c_0\,p = \lambda_\Omega\,(\bar y - y_\Omega) \quad \text{in } \Omega, \qquad \partial_{\nu_{\mathcal{A}}} p + \alpha\,p = \lambda_\Gamma\,(\bar y - y_\Gamma) \quad \text{on } \Gamma_1, \qquad p = 0 \quad \text{on } \Gamma_0. \tag{2.85}$$

Remark. The symmetry condition postulated for the operator $\mathcal{A}$ is a prerequisite neither for the well-posedness of the state equation nor for the optimality conditions. If this condition is dropped, then $\mathcal{A}$ has to be replaced in the adjoint equation by its (formally) associated adjoint operator, which is defined by

$$\mathcal{A}^* p(x) = -\sum_{i,j=1}^N D_j\bigl(a_{ij}(x)\,D_i p(x)\bigr).$$

We have generally postulated the symmetry condition, because it will be needed later in this textbook for certain regularity results.

The formal Lagrange method gave us a hint to introduce the adjoint equation in exactly this way. Owing to Theorem 2.7 on page 38, there exists a unique solution $p \in Y$ to this equation, and hence one and only one adjoint state associated with $\bar y$. Invoking a generalization of Lemma 2.31 on page 74 for the construction of $S^*$, one can derive the first-order necessary optimality conditions. The interested reader will be asked to do this in Exercise 2.18. Instead, we employ here the Lagrange method again. We obtain the variational inequalities

$$D_v \mathcal{L}(\bar y, \bar u, \bar v, p)(v - \bar v) = \int_\Omega (\lambda_v\,\bar v + \beta_\Omega\,p)\,(v - \bar v)\,dx \ge 0 \quad \forall v \in V_{ad},$$

$$D_u \mathcal{L}(\bar y, \bar u, \bar v, p)(u - \bar u) = \int_{\Gamma_1} (\lambda_u\,\bar u + \beta_\Gamma\,p)\,(u - \bar u)\,ds \ge 0 \quad \forall u \in U_{ad}.$$

In summary, the following necessary and sufficient first-order optimality conditions hold.

Theorem 2.34. Suppose that Assumption 2.19 holds. Then the pair of controls $(\bar v, \bar u) \in V_{ad} \times U_{ad}$, together with the associated state $\bar y \in Y$ and the corresponding adjoint state $p \in Y$ defined by (2.85), is optimal if and only if the following variational inequalities are satisfied:

$$(\lambda_v\,\bar v + \beta_\Omega\,p,\; v - \bar v)_{L^2(\Omega)} \ge 0 \quad \forall v \in V_{ad},$$

$$(\lambda_u\,\bar u + \beta_\Gamma\,p,\; u - \bar u)_{L^2(\Gamma_1)} \ge 0 \quad \forall u \in U_{ad}.$$

2.11.2. Optimal stationary heat source with given outside temperature. This problem, which was mentioned on page 4, is a special case of the above general problem with the choices $\mathcal{A} := -\Delta$, $c_0 := 0$, $\beta_\Gamma := \alpha$, $\beta_\Omega := 1$, and $u_a = u_b = y_a$. The boundary component of the control is fixed, since $y_a$ is prescribed. The cost functional is even more general; by putting $\lambda_\Gamma = \lambda_u = 0$ and $\lambda_\Omega = 1$, it can be brought into the form discussed in Section 1.2.1. Here, the control $v$ plays the role of $u$. The necessary optimality conditions follow immediately from Theorem 2.34.

2.12. Numerical methods

In this section, we are going to sketch some fundamental concepts for the numerical solution of linear-quadratic elliptic problems. This field of mathematics has been well established for many years, and there are quite a few methods available to tackle even complex problems successfully. Here, and in the other sections dealing with numerical methods, we can merely give the reader a feel for how to approach such problems numerically. It would be completely beyond the scope of this book if we attempted to present an even half-way comprehensive treatment of these methods. We therefore refer the interested reader to the relevant literature, in particular to the monographs by Betts [Bet01], Gruver and Sachs [GS80], Hinze et al. [HPUU09], Kelley [Kel99], and Ito and Kunisch [IK08].

In most cases, we ignore the fact that the problems have to be discretized, tacitly assuming that the partial differential equations that arise can be solved explicitly. To be sure, a serious numerical analysis has to include the discretization of the equations by, e.g., finite differences or the finite element method. However, without these technical details it is much easier to illustrate the special features that originate from optimization theory. Nevertheless, finite difference techniques and the finite element method (FEM) will be briefly touched upon in Sections 2.12.3 and 2.12.4, respectively.

We begin our analysis with gradient methods. Historically, these were among the first techniques by which control problems for partial differential equations were solved. These methods are slow but easily implemented, and therefore well suited for exercises and first numerical tests. Moreover, for very complex and highly nonlinear problems they are still often the first choice.

After that, we study the direct transformation into a finite-dimensional optimization problem, using the finite difference method; we also derive the reduced optimal control problem, which proves to be advantageous in certain situations. Finally, we explain the basic ideas of the so-called primal-dual

active set strategy, which is one of the most efficient and commonly used numerical methods nowadays.

2.12.1. The conditioned gradient method. We discuss the conditioned gradient method mainly for didactic reasons, since it provides much insight into how the necessary optimality conditions can be exploited for the construction of numerical methods. In addition, it is easy to implement for exercises and testing. However, this method is comparatively slow, being only linearly convergent. In general, the so-called projected gradient method converges faster. For a detailed discussion of the conditioned gradient method, we refer the reader to Gruver and Sachs [GS80].

The conditioned gradient method in Hilbert spaces. We first formulate the conditioned gradient method for an optimization problem in the Hilbert space $U$:

$$\min_{u \in U_{ad}} f(u),$$

where $f : U \to \mathbb{R}$ denotes a Gâteaux differentiable functional and $U_{ad} \subset U$ is a nonempty, bounded, closed, and convex set. Suppose that the iterates $u_1, \ldots, u_n$ have already been determined, so that $u_n$ is the current approximation to the solution. Then the following steps have to be taken:

S1 (Determination of a new descent direction) We determine a direction $v_n$ by solving the Hilbert space optimization problem

$$f'(u_n)\,v_n = \min_{v \in U_{ad}} f'(u_n)\,v.$$

This problem is linear with respect to the cost functional and, in view of the assumptions on $U_{ad}$, solvable. If $f'(u_n)(v_n - u_n) \ge 0$, then $u_n$ solves the variational inequality (why?) and is a solution. The algorithm terminates. Otherwise, if $f'(u_n)(v_n - u_n) < 0$, then $v_n - u_n$ is a descent direction.

S2 (Line search and step size control) Find $s_n \in (0, 1]$ from solving the one-dimensional optimization problem

$$f(u_n + s_n\,(v_n - u_n)) = \min_{s \in (0,1]} f(u_n + s\,(v_n - u_n)).$$

Then put $u_{n+1} := u_n + s_n\,(v_n - u_n)$, $n := n + 1$, and go to S1.

In the convex case, the sequence $\{f(u_n)\}_{n=1}^\infty$ converges in a strictly decreasing fashion to the optimal value (descent method). Note that, since $u_n, v_n \in U_{ad}$, the convex combination $u_n + s_n\,(v_n - u_n)$ also belongs to $U_{ad}$. This is an essential feature of this method.

Application to elliptic control problems. We now apply the conditioned gradient method to the problem of finding the optimal stationary heat source:

$$\min J(y,u) := \frac12\,\|y - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2}\,\|u\|^2_{L^2(\Omega)}, \tag{2.86}$$

subject to

$$-\Delta y = \beta\,u \quad \text{in } \Omega, \qquad y|_\Gamma = 0, \tag{2.87}$$

and

$$u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega. \tag{2.88}$$

The associated control space is $U = L^2(\Omega)$, $U_{ad}$ is defined as before through the given box constraints (2.88) for $u$, and the reduced cost functional $f$ is given by $f(u) = J(y(u), u)$. Owing to Lemma 2.30 on page 73, we have

$$f'(u_n)\,v = \int_\Omega (\beta\,p_n + \lambda\,u_n)\,v\,dx,$$

where $p_n$ is the solution to the adjoint equation

$$-\Delta p_n = y_n - y_\Omega \quad \text{in } \Omega, \qquad p_n|_\Gamma = 0. \tag{2.89}$$

Suppose that $u_1, \ldots, u_n$ are already known. The method then proceeds as follows:

S1 Determine the state $y_n$ corresponding to $u_n$ by solving the state equation (2.87).

S2 Determine the adjoint state $p_n$ by solving the adjoint equation (2.89).

S3 (Direction search) Find $v_n$ by solving the linear optimization problem

$$\min_{v \in U_{ad}} \int_\Omega (\beta\,p_n + \lambda\,u_n)\,v\,dx.$$

If the $v_n$ obtained from this linear problem has the same value as $u_n$, then $u_n$ is optimal, and the algorithm terminates. Since this is unlikely to happen in practice, it makes sense to incorporate a stopping criterion. For instance, one terminates the algorithm if the value for $v_n$ is not smaller at least by $\varepsilon > 0$ than that for $u_n$.

S4 (Step size control) Determine the step size $s_n$ by solving

$$f(u_n + s_n\,(v_n - u_n)) = \min_{s \in (0,1]} f(u_n + s\,(v_n - u_n)).$$

S5 Set $u_{n+1} := u_n + s_n\,(v_n - u_n)$, $n := n + 1$, and go to S1.

Remarks on the practical performance. The steps S3 and S4 can be carried out analytically in the following way:

For S3: A meaningful direction $v_n$ is evidently

$$v_n(x) := \begin{cases} u_a(x) & \text{if } \lambda\,u_n(x) + \beta(x)\,p_n(x) > 0, \\ \tfrac12\,(u_a + u_b)(x) & \text{if } \lambda\,u_n(x) + \beta(x)\,p_n(x) = 0, \\ u_b(x) & \text{if } \lambda\,u_n(x) + \beta(x)\,p_n(x) < 0. \end{cases}$$

Note that the second case is unlikely to occur in practice.

For S4: We exploit the fact that $f$ is a quadratic cost functional. We have

$$f(u_n + s\,(v_n - u_n)) = \frac12\,\|y_n + s\,(w_n - y_n) - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2}\,\|u_n + s\,(v_n - u_n)\|^2_{L^2(\Omega)},$$

where $y_n = y(u_n)$ and $w_n = y(v_n)$ denote the states associated with $u_n$ and $v_n$, respectively. Since here only dependence on the step size $s$ matters, we put $g(s) := f(u_n + s\,(v_n - u_n))$. Straightforward computation yields that

$$g(s) = \frac12\,\|y_n - y_\Omega\|^2_{L^2(\Omega)} + s\,(y_n - y_\Omega,\, w_n - y_n)_{L^2(\Omega)} + \frac{s^2}{2}\,\|w_n - y_n\|^2_{L^2(\Omega)} + \frac{\lambda}{2}\,\|u_n\|^2_{L^2(\Omega)} + \lambda\,s\,(u_n,\, v_n - u_n)_{L^2(\Omega)} + \frac{\lambda\,s^2}{2}\,\|v_n - u_n\|^2_{L^2(\Omega)}.$$

The function $g$ is a parabola of the form $g(s) = g_0 + g_1\,s + g_2\,s^2$, where the constants $g_i$ can be determined beforehand. Hence, the problem

$$s_n := \arg\min_{s \in (0,1]} g(s)$$

can be solved by hand. Its solution, which is the projection of the zero of $g'(s)$ onto $[0, 1]$, is called the exact step size.

Initially, the conditioned gradient method exhibits fast convergence, but it then becomes slow. This behavior is characteristic of gradient methods. Every iteration step requires the solution of two elliptic boundary value problems. An advantage is the possibility of carrying out the steps S3 and S4 analytically. In contrast to this, for the projected gradient method described below the determination of the step size is more complicated; it may require the solution of additional differential equations.

As mentioned initially, the conditioned gradient method can only be carried out in connection with a suitable discretization. Both the control $u$ and the state $y$ have to be replaced by discrete approximations, for instance by piecewise constant or piecewise linear functions. It should be clear how the steps of the algorithm described above have to be performed for the discretized quantities. In fact, the method just requires the solution of the elliptic boundary value problems involved and a prescription of routines for calculation of the integrals that occur. Note that this also applies to the $\lambda = 0$ case.

2.12.2. The projected gradient method. A better gradient method is the so-called projected gradient method. Since this method will be discussed in some detail in Chapter 3 for parabolic problems, here we only briefly explain its differences from the conditioned gradient method: in step S3, we choose as descent direction the negative gradient

$$v_n := -(\beta\,p_n + \lambda\,u_n).$$

To guarantee admissibility, the step size $s_n$ is determined by solving the one-dimensional minimization problem

$$f\bigl(\mathbb{P}_{[u_a,u_b]}(u_n + s_n\,v_n)\bigr) = \min_{s > 0} f\bigl(\mathbb{P}_{[u_a,u_b]}(u_n + s\,v_n)\bigr).$$

If there are no control constraints, then the optimal step size is obtained from solving

$$f(u_n + s_n\,v_n) = \min_{s > 0} f(u_n + s\,v_n).$$

In the linear-quadratic case studied here, $s_n$ can easily be determined as the exact step size (see page 94). The new approximation for the optimal control is obtained by setting

$$u_{n+1} := \mathbb{P}_{[u_a,u_b]}(u_n + s_n\,v_n),$$

regardless of how $s_n$ has been determined.

In the presence of restrictions, the determination of the step size is a nontrivial task. The exact step size can usually no longer be calculated, so an acceptable step size has to be constructed numerically. For instance, one can employ the method of bisection: starting from a small initial step size $s_0$, e.g., the step size used in the previous iteration step, one takes consecutively $s = s_0/2$, $s_0/4$, $s_0/8$, and so on, until an $s$ is found such that $f\bigl(\mathbb{P}_{[u_a,u_b]}(u_n + s\,v_n)\bigr)$ is sufficiently smaller than the previous value $f(u_n)$. Otherwise, the algorithm

is terminated after a prescribed number of bisections of the step size. Another meaningful method for the determination of the step size is Armijo's rule from the theory of nonlinear optimization; see, e.g., Nocedal and Wright [NW99] and Polak [Pol97].

For each new step size, the evaluation of the cost functional requires the solution of a partial differential equation, which is costly. Therefore, one sometimes chooses to work with a fixed, sufficiently small step size as long as a sufficiently large descent can be maintained.

Very nice expositions of the projected gradient method and its convergence properties in the finite-dimensional case can be found in Gruver and Sachs [GS80], Kelley [Kel99], and (with geometric illustration) Nocedal and Wright [NW99]; the Hilbert space case is treated in, e.g., Hinze et al. [HPUU09].

2.12.3. Transformation into a finite-dimensional quadratic optimization problem.

Derivation of a discretized problem. For the numerical treatment of problem (2.86)-(2.88), we have to discretize the elliptic boundary value problem, the state $y$, and the control $u$. If the boundary of the domain consists of pieces that are parallel to the axes, then finite difference methods offer a very simple means to this end. Here, we will pursue this approach for the sake of simplicity, even though it is unsatisfactory from the viewpoint of numerical analysis: optimal controls are, as a rule, not smooth enough to guarantee the convergence of the finite difference method; also, the related error estimates do not apply. Therefore, the finite element method is almost exclusively applied in the relevant literature. We will briefly discuss this method in Section 2.12.4. On the other hand, the finite difference method is easy to implement and thus very suitable for the purpose of testing.

Once again, we consider the problem of finding the optimal stationary heat source, (2.86)-(2.88). As the domain, we choose the two-dimensional unit square $\Omega = (0,1)^2$, which we subdivide into $n^2$ congruent subsquares:

$$\bar\Omega = \bigcup_{i,j=1}^n \bar\Omega_{ij}, \qquad \Omega_{ij} = \Bigl(\frac{i-1}{n}, \frac{i}{n}\Bigr) \times \Bigl(\frac{j-1}{n}, \frac{j}{n}\Bigr). \tag{2.90}$$

The corners of these subsquares,

$$x_{ij} = h\,\binom{i}{j}, \qquad i, j = 0, \ldots, n,$$

form an equidistant grid of step size $h = 1/n$. The finite difference method determines approximations $y_{ij}$ for the values $y(x_{ij})$ of the state at the grid points. Using the standard five-point stencil, we approximate $-\Delta y(x_{ij})$ at the inner grid points by

$$-\Delta y(x_{ij}) \approx \frac{4\,y_{ij} - \bigl[\,y_{(i-1)j} + y_{i(j-1)} + y_{(i+1)j} + y_{i(j+1)}\,\bigr]}{h^2}, \tag{2.91}$$

$i, j = 1, \ldots, n-1$. In addition, we define open squares $\Omega_{ij}$ of side length $h$ that are parallel to the axes and have midpoints $x_{ij}$ (see the figure). Within these subsquares, the control $u$ is assumed to be constant, i.e., we put $u(x) = u_{ij}$ on $\Omega_{ij}$ for $i, j = 1, \ldots, n-1$; within the remaining part of $\Omega$ in the vicinity of the boundary, we simply put $u(x) = 0$.

[Figure: Subdivision of $\Omega$.]

Next, we number the quantities $x_{ij}$, $y_{ij}$, and $u_{ij}$ lexicographically, e.g., from the southwest corner to the northeast corner of $\Omega$. In this way, we obtain vectors $\vec x = (x_1, \ldots, x_{(n-1)^2})^T$, $\vec y = (y_1, \ldots, y_{(n-1)^2})^T$, and $\vec u = (u_1, \ldots, u_{(n-1)^2})^T \in \mathbb{R}^{(n-1)^2}$. Moreover, we define the vectors $\vec y_\Omega$, $\vec u_a$, and $\vec u_b$ by putting $y_{\Omega,i} = y_\Omega(x_i)$, $u_{a,i} = u_a(x_i)$, and $u_{b,i} = u_b(x_i)$.

Finally, we replace the integrals occurring in the cost functional by midpoint rules, for example, $\int_{\Omega_{ij}} y(x)\,dx \approx h^2\,y_{ij}$. Then, dividing the cost functional by $h^2$, we infer from (2.91) a problem of the form

$$\min \frac12 \sum_{i=1}^{(n-1)^2} \bigl[(y_i - y_{\Omega,i})^2 + \lambda\,u_i^2\bigr],$$

$$A_h\,\vec y = B_h\,\vec u, \qquad \vec u_a \le \vec u \le \vec u_b.$$

Here, $A_h$ is an $(n-1)^2 \times (n-1)^2$ matrix, and $B_h$ is a diagonal matrix with the entries $b_{ii} = \beta(x_i)$.

Remark. A discretized version of the problem can also be set up using the finite element method; the matrices and the discretized cost functional then look different. We will discuss this approach in Section 2.12.4 in connection with the primal-dual active set strategy.

The above quadratic optimization problem with linear equality and inequality constraints can be implemented in existing numerical routines. For instance, the code quadprog in MATLAB can be employed. In addition, other programs for large optimization problems are available; a selection can be found on the website of NEOS (NEOS Server for Optimization). Elliptic

problems in two-dimensional domains do not present too many difficulties, as the results of Maurer and Mittelmann [MM00, MM01] for semilinear elliptic problems with control and state constraints show.

In this method, which is often referred to as first discretize, then optimize, $\vec y$ and $\vec u$ play the role of independent variables; the fact that they are state and control, respectively, is not exploited. In the next section, we will briefly demonstrate how to express the problem in terms of the control $u$ alone. The dimensionality of this problem is smaller, which reduces the storage problems arising during the solution of quadratic optimization problems. This method works as long as the computer is able to handle the partial differential equation.

Formulation of a reduced optimization problem. If the control can be expressed as a linear combination of a relatively small number of basis functions, then it makes sense to reduce the problem to a quadratic optimization problem in terms of $u$ alone. The basis functions may either result from a discretization (as described below) or be prescribed by the application under investigation.

Thus, let the control function $u$ be of the form

$$u(x) = \sum_{i=1}^{m} u_i\, e_i(x), \tag{2.92}$$

with finitely many given functions $e_i : \Omega \to \mathbb{R}$ and real variables $u_i$. Moreover, suppose that the control constraints are either equivalent to

$$u_a \le u_i \le u_b, \qquad i = 1, \dots, m,$$

with given real numbers $u_a < u_b$, or given in this form from the beginning.

Example. Suppose that the basis functions $e_{ij}(x)$ are defined, with respect to the subdivision (2.90), by

$$e_{ij}(x) = \begin{cases} 1 & x \in \Omega_{ij} \\ 0 & \text{otherwise.} \end{cases}$$

Their values on the boundaries between the subsquares are immaterial, since these sets have zero measure. We number the above functions from 1 to $m = n^2$, that is, $e_1 = e_{11}, \dots, e_n = e_{1n}, e_{n+1} = e_{21}, \dots, e_m = e_{nn}$. The resulting $u$ is obviously a step function. Note that here the control is assumed to be constant on the subsquares $\Omega_{ij}$, whereas in the last section different subsquares were used (which was due to the finite difference method used for the solution of the differential equation).

For simplicity, we again assume that the state $y$ associated with a given control $u$ can be determined exactly as the solution to the state equation. The main workload in formulating the reduced problem then consists of the determination of the functions

$$y_i(x) := (S e_i)(x), \qquad i = 1, \dots, m,$$

which are the solutions to the boundary value problems

$$-\Delta y = \beta\, e_i, \qquad y|_\Gamma = 0.$$

In other words, $m$ partial differential equations have to be solved in this approach. In the case of the subdivision into subsquares of side length $h = 0.01$ described above, this already amounts to $10^4$ equations. Therefore, setting up a reduced problem is worthwhile only if either $m$ is comparatively small or the same problem has to be solved several times with different data.

Once we have determined the functions $y_i$, we obtain the state $y = Su$ by superposition:

$$y = \sum_{i=1}^{m} u_i\, y_i. \tag{2.93}$$

Inserting the expressions (2.92) and (2.93) in the cost function yields the finite-dimensional cost functional

$$f_m(u_1, \dots, u_m) = \frac12 \Big\| \sum_{i=1}^{m} u_i y_i - y_\Omega \Big\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \Big\| \sum_{i=1}^{m} u_i e_i \Big\|_{L^2(\Omega)}^2.$$

The finite-dimensional approximation of the original problem then consists of finding $\vec u = (u_1, \dots, u_m)^\top$ as the solution to

$$\min_{u_a \le u_i \le u_b} f_m(\vec u). \tag{$P_m$}$$

A slight rearrangement of the cost functional yields

$$f_m(\vec u) = \frac12 \|y_\Omega\|_{L^2(\Omega)}^2 - \Big( y_\Omega, \sum_{i=1}^{m} u_i y_i \Big)_{L^2(\Omega)} + \frac12 \Big( \sum_{i=1}^{m} u_i y_i, \sum_{j=1}^{m} u_j y_j \Big)_{L^2(\Omega)} + \frac{\lambda}{2} \Big( \sum_{i=1}^{m} u_i e_i, \sum_{j=1}^{m} u_j e_j \Big)_{L^2(\Omega)}$$

$$= \frac12 \|y_\Omega\|_{L^2(\Omega)}^2 - \sum_{i=1}^{m} u_i\, (y_\Omega, y_i)_{L^2(\Omega)} + \frac12 \sum_{i,j=1}^{m} u_i u_j\, (y_i, y_j)_{L^2(\Omega)} + \frac{\lambda}{2} \sum_{i,j=1}^{m} u_i u_j\, (e_i, e_j)_{L^2(\Omega)}.$$

Therefore, up to the constant $\frac12 \|y_\Omega\|_{L^2(\Omega)}^2$, $(P_m)$ is equivalent to the finite-dimensional quadratic optimization problem

$$\min \Big\{ \frac12\, \vec u^\top (C + \lambda D)\, \vec u - \vec a^\top \vec u \Big\}, \qquad \vec u_a \le \vec u \le \vec u_b, \tag{2.94}$$

where $\vec u_a = (u_a, \dots, u_a)^\top$ and $\vec u_b = (u_b, \dots, u_b)^\top$. Evidently, the following quantities have to be calculated beforehand:

$$\vec a = (a_i), \qquad a_i = (y_\Omega, y_i)_{L^2(\Omega)},$$
$$C = (c_{ij}), \qquad c_{ij} = (y_i, y_j)_{L^2(\Omega)},$$
$$D = (d_{ij}), \qquad d_{ij} = (e_i, e_j)_{L^2(\Omega)}.$$

In the case of step functions, we have $(e_i, e_j)_{L^2(\Omega)} = \delta_{ij}\, \|e_i\|_{L^2(\Omega)}^2$, so that $D$ is a diagonal matrix, $D = \mathrm{diag}\big(\|e_i\|_{L^2(\Omega)}^2\big)$.

For the solution of such problems, the aforementioned MATLAB code quadprog can be used. Numerous other codes can be found on the internet website of NEOS (NEOS Server for Optimization).

2.12.4. The primal-dual active set strategy.

The infinite-dimensional case. For control problems involving partial differential equations, this method goes back to Ito and Kunisch [IK00]. Here, it will be explained for the optimal stationary heat source problem,

$$\min f(u) := \frac12 \|Su - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|u\|_{L^2(\Omega)}^2,$$

subject to

$$u \in U_{ad} = \{ u \in L^2(\Omega) : u_a(x) \le u(x) \le u_b(x) \ \text{for a.e. } x \in \Omega \},$$

where $S : L^2(\Omega) \to L^2(\Omega)$ denotes the solution operator of the boundary value problem

$$-\Delta y = u, \qquad y|_\Gamma = 0$$

(for better readability, we assume $\beta \equiv 1$).

According to the projection relation (2.58) on page 70, the optimal control has to satisfy

$$\bar u(x) = \mathbb{P}_{[u_a(x), u_b(x)]} \{ -\lambda^{-1} p(x) \},$$

where the adjoint state $p$ is the solution to problem (2.50) on page 67. Therefore, with

$$\mu := -(\lambda^{-1} p + \bar u) = -\lambda^{-1} f'(\bar u),$$

we have

$$\bar u(x) = \begin{cases} u_a(x) & \text{if } -\lambda^{-1} p(x) < u_a(x) \quad (\Leftrightarrow \mu(x) < 0) \\ -\lambda^{-1} p(x) & \text{if } -\lambda^{-1} p(x) \in [u_a(x), u_b(x)] \quad (\Leftrightarrow \mu(x) = 0) \\ u_b(x) & \text{if } -\lambda^{-1} p(x) > u_b(x) \quad (\Leftrightarrow \mu(x) > 0). \end{cases} \tag{2.95}$$

In the first case, $\mu(x) < 0$ and thus $\bar u(x) + \mu(x) < u_a(x)$, by the definition of $\mu$ and the fact that $\bar u = u_a$. Similarly, $\bar u(x) + \mu(x) > u_b(x)$ in the third case. In the second case, we have $\mu(x) = 0$, hence $\bar u(x) + \mu(x) = -\lambda^{-1} p(x) \in [u_a(x), u_b(x)]$. In summary, $u = \bar u$ satisfies the relations

$$u(x) = \begin{cases} u_a(x) & \text{if } u(x) + \mu(x) < u_a(x) \\ -\lambda^{-1} p(x) & \text{if } u(x) + \mu(x) \in [u_a(x), u_b(x)] \\ u_b(x) & \text{if } u(x) + \mu(x) > u_b(x). \end{cases} \tag{2.96}$$

Conversely, if $u \in U_{ad}$ satisfies (2.96), then $u$ satisfies the projection condition and is therefore optimal. To see this, we discuss the first relation: since $u = u_a$, it immediately follows that $\mu(x) < 0$; hence

$$0 > -\lambda^{-1} p - u = -\lambda^{-1} p - u_a,$$

so that $-\lambda^{-1} p < u_a$, and thus $u(x) = \mathbb{P}_{[u_a(x), u_b(x)]} \{ -\lambda^{-1} p(x) \}$. The other cases are treated similarly.

Summarizing, we conclude that the quantity $u + \mu$ indicates whether the inequality restrictions are active or not. These considerations motivate the primal-dual active set strategy described next.
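The equivalence between the projection relation and the case distinction (2.96) can be checked on a finite set of sample points. The following Python sketch is only an illustration under stated assumptions: the values of `p`, the bounds, and `lam` are hypothetical sample data, and the helper names `project`, `classify`, and `control_from_sets` are introduced here and do not come from the text.

```python
import numpy as np

def project(v, ua, ub):
    """Pointwise projection of v onto the interval [ua, ub]."""
    return np.minimum(np.maximum(v, ua), ub)

def classify(u, mu, ua, ub):
    """Boolean masks of the lower-active, inactive and upper-active
    points, decided by the indicator u + mu as in (2.96)."""
    s = u + mu
    lower = s < ua
    upper = s > ub
    inactive = ~(lower | upper)
    return lower, inactive, upper

def control_from_sets(p, lam, ua, ub, lower, inactive, upper):
    """Rebuild the control from the case distinction (2.96)."""
    u = np.empty_like(p)
    u[lower] = ua[lower]
    u[inactive] = -p[inactive] / lam
    u[upper] = ub[upper]
    return u

# Hypothetical sample data: adjoint state values at five points.
lam = 0.5
p = np.array([2.0, 0.3, 0.0, -0.2, -3.0])
ua = -np.ones(5)
ub = np.ones(5)

u_bar = project(-p / lam, ua, ub)   # projection relation
mu = -(p / lam + u_bar)             # mu = -(lam^{-1} p + u_bar)
lower, inactive, upper = classify(u_bar, mu, ua, ub)
u_rebuilt = control_from_sets(p, lam, ua, ub, lower, inactive, upper)
```

In this example the control rebuilt from the sets agrees with the projected control, and `mu` is negative on the lower-active points, zero on the inactive points, and positive on the upper-active points, as the argument above predicts.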

As initial guesses, two arbitrary functions $u_0, \mu_0 \in L^2(\Omega)$ are chosen, where $u_0$ is not necessarily admissible. Suppose that the iterates $u_{n-1}$ and $\mu_{n-1}$ have already been found. To determine $u_n$, the following steps are carried out:

S1 (New active, respectively, inactive set)

Put

$$A_n^b = \{ x : u_{n-1}(x) + \mu_{n-1}(x) > u_b(x) \},$$
$$A_n^a = \{ x : u_{n-1}(x) + \mu_{n-1}(x) < u_a(x) \},$$
$$I_n = \Omega \setminus (A_n^b \cup A_n^a).$$

If $A_n^a = A_{n-1}^a$ and $A_n^b = A_{n-1}^b$, then we have optimality and terminate the algorithm. Otherwise, we continue with the next step.

S2 (New control)

Determine the solution $y, p \in H_0^1(\Omega)$ to the linear system

$$-\Delta y = u, \qquad u = \begin{cases} u_a & \text{on } A_n^a \\ -\lambda^{-1} p & \text{on } I_n \\ u_b & \text{on } A_n^b, \end{cases}$$
$$-\Delta p = y - y_\Omega.$$

Observe that this system constitutes the first-order necessary optimality condition for a solvable linear-quadratic optimal control problem (namely the one that results if, in the initially given problem, the control $u$ is fixed by taking $u = u_a$ on $A_n^a$ and $u = u_b$ on $A_n^b$); hence, it admits a unique solution.

One then puts

$$u_n := u, \qquad p_n := p, \qquad \mu_n := -(\lambda^{-1} p_n + u_n), \qquad n := n + 1,$$

and continues with step S1.

It is convenient to rewrite the linear system to be solved in step S2 in a slightly different form. To this end, let $\chi_n^a$ and $\chi_n^b$ denote the characteristic functions of the sets $A_n^a$ and $A_n^b$, respectively. Then, evidently,

$$u + (1 - \chi_n^a - \chi_n^b)\, \lambda^{-1} p = \chi_n^a u_a + \chi_n^b u_b,$$

and in step S2 the following system has to be solved, where $y$ and $p$ vanish on the boundary:

$$\begin{aligned} -\Delta y - u &= 0 \\ -\Delta p - y &= -y_\Omega \\ (1 - \chi_n^a - \chi_n^b)\, \lambda^{-1} p + u &= \chi_n^a u_a + \chi_n^b u_b. \end{aligned} \tag{2.97}$$

We also observe that the case distinction in (2.95) remains unchanged if the function $\mu$ is replaced by $c\mu$ with some constant $c > 0$. We may therefore work with $c\mu_{n-1}$ in place of $\mu_{n-1}$ in step S1, which can be of benefit in numerical computations. Sometimes this is also used for the convergence analysis.

For a detailed analysis of the method and of more general variants in which, e.g., admissibility is enforced by a multiplier shift, we refer the interested reader to the book by Ito and Kunisch [IK08], as well as to Ito and Kunisch [IK00], Bergounioux et al. [BIK99], and Kunisch and Rösch [KR02]. The method can be interpreted as a semismooth Newton method, which explains the fact that it usually converges superlinearly; see [IK08] or [HPUU09].

Numerical realization using a finite element method. The numerical realization of the primal-dual active set strategy requires the setting up of a discretized analogue. In contrast to the discretization of Poisson's equation by finite differences described on page 96, this time we employ linear finite elements. To fix things, suppose that $\Omega \subset \mathbb{R}^2$ is a polygonal domain, which is subdivided by a regular triangulation into finitely many triangles with pairwise disjoint interiors. Associated with the triangulation is a finite set of continuous piecewise linear basis functions $\{\Phi_1, \dots, \Phi_\ell\} \subset H_0^1(\Omega)$. Neither the regular triangulation nor the basis functions are described in greater detail; for more information, the reader is referred to the monographs by Braess [Bra07], Brenner and Scott [BS94], Ciarlet [Cia78], and Grossmann and Roos [GR05].

The control is assumed to be piecewise constant. More precisely, $u$ is assumed to be constant on each triangle of the triangulation. We denote by $e_i$, $i = 1, \dots, m$, the associated set of unit step functions, i.e., $e_i$ equals unity on the subtriangle with index $i$ and zero elsewhere.

In summary, we use the ansatz

$$y(x) = \sum_{i=1}^{\ell} y_i\, \Phi_i(x), \qquad u(x) = \sum_{i=1}^{m} u_i\, e_i(x),$$

with the real unknowns $y_i$ and $u_j$, for $i = 1, \dots, \ell$, $j = 1, \dots, m$. Inserting these expressions into the weak form of the boundary value problem and choosing $\Phi_j$ as the test function, we find that

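$$\sum_{i=1}^{\ell} y_i \int_\Omega \nabla\Phi_i \cdot \nabla\Phi_j \, dx = \sum_{i=1}^{m} u_i \int_\Omega \Phi_j\, e_i \, dx, \qquad j = 1, \dots, \ell.$$

In terms of the unknown vectors $\vec y = (y_1, \dots, y_\ell)^\top$ and $\vec u = (u_1, \dots, u_m)^\top$, this is a system of linear equations of the form

$$K_h \vec y = B_h \vec u,$$

where the elements of the stiffness matrix $K_h$ and the matrix $B_h$ are given by

$$k_{h,ij} = \int_\Omega \nabla\Phi_i \cdot \nabla\Phi_j \, dx, \qquad b_{h,ij} = \int_\Omega \Phi_i\, e_j \, dx.$$

The positive number $h$, called the mesh size of the triangulation, is a measure of the refinement of the mesh. For the cost functional, we find after a straightforward computation the identity

$$\frac12 \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|u\|_{L^2(\Omega)}^2 = \frac12\, \vec y^\top M_h \vec y - \vec a_h^\top \vec y + \frac{\lambda}{2}\, \vec u^\top D_h \vec u + \frac12 \|y_\Omega\|_{L^2(\Omega)}^2,$$

where the entries of the mass matrices $M_h$ and $D_h$ and of the vector $\vec a_h$ are given by

$$m_{h,ij} = \int_\Omega \Phi_i \Phi_j \, dx, \qquad d_{h,ij} = \int_\Omega e_i e_j \, dx, \qquad a_{h,i} = \int_\Omega \Phi_i\, y_\Omega \, dx.$$

The constant term $\frac12 \|y_\Omega\|_{L^2(\Omega)}^2$ does not influence the minimization. Since the interiors of the triangles are pairwise disjoint, $D_h$ is a diagonal matrix.

Summarizing, we have the following discretized analogue of the initial optimal control problem:

$$\min \Big\{ \frac12\, \vec y^\top M_h \vec y - \vec a_h^\top \vec y + \frac{\lambda}{2}\, \vec u^\top D_h \vec u \Big\}, \qquad K_h \vec y = B_h \vec u, \qquad \vec u_a \le \vec u \le \vec u_b, \tag{2.98}$$

where $\vec u_a = (u_a, \dots, u_a)^\top$ and $\vec u_b = (u_b, \dots, u_b)^\top$. The associated optimality system reads

$$K_h \vec y = B_h \vec u, \qquad \vec u_a \le \vec u \le \vec u_b,$$
$$K_h \vec p = M_h \vec y - \vec a_h,$$
$$(\lambda D_h \vec u + B_h^\top \vec p)^\top (\vec v - \vec u) \ge 0 \qquad \forall\, \vec u_a \le \vec v \le \vec u_b.$$

Since $D_h$ is a diagonal matrix, we can argue as in the infinite-dimensional case. Putting

$$\vec\mu = -\big( (\lambda D_h)^{-1} B_h^\top \vec p + \vec u \big),$$

for the components of the optimal vector $\vec u$ we obtain

$$u_i = \begin{cases} u_a & \text{if } u_i + \mu_i < u_a \\ -\big( (\lambda D_h)^{-1} B_h^\top \vec p \big)_i & \text{if } u_i + \mu_i \in [u_a, u_b] \\ u_b & \text{if } u_i + \mu_i > u_b, \end{cases} \qquad 1 \le i \le m.$$

Comparison with the infinite-dimensional case thus motivates the following primal-dual active set strategy:

First, we choose initial guesses $\vec u_0$ and $\vec\mu_0$. In the $n$th step of the algorithm, we define the sets of active and inactive restrictions according to

$$A_n^b = \{ i \in \{1, \dots, m\} : u_{n-1,i} + \mu_{n-1,i} > u_b \},$$
$$A_n^a = \{ i \in \{1, \dots, m\} : u_{n-1,i} + \mu_{n-1,i} < u_a \},$$
$$I_n = \{1, \dots, m\} \setminus (A_n^b \cup A_n^a).$$

Having defined the new active sets $A_n^a$ and $A_n^b$, we determine, in analogy to the characteristic functions $\chi_n^a$ and $\chi_n^b$ introduced on page 102, the diagonal matrices $X_n^a$ and $X_n^b$ with the diagonal elements

$$X_{n,ii}^a = \begin{cases} 1 & \text{if } i \in A_n^a \\ 0 & \text{otherwise,} \end{cases} \qquad X_{n,ii}^b = \begin{cases} 1 & \text{if } i \in A_n^b \\ 0 & \text{otherwise.} \end{cases}$$

Next, we put $E_h := (\lambda D_h)^{-1} (I - X_n^a - X_n^b)$. The diagonal elements $e_{h,ii}$ of $E_h$ vanish if and only if $i \in A_n^a \cup A_n^b$. We then have to solve the following system of linear equations for $\vec p$, $\vec y$, $\vec u$:

$$\begin{pmatrix} 0 & K_h & -B_h \\ K_h & -M_h & 0 \\ E_h B_h^\top & 0 & I \end{pmatrix} \begin{pmatrix} \vec p \\ \vec y \\ \vec u \end{pmatrix} = \begin{pmatrix} 0 \\ -\vec a_h \\ X_n^a \vec u_a + X_n^b \vec u_b \end{pmatrix}. \tag{2.99}$$

Once this has been done, we put $\vec u_n := \vec u$ and $\vec\mu_n := -\big( (\lambda D_h)^{-1} B_h^\top \vec p + \vec u_n \big)$.

It can be shown that this algorithm terminates at the optimal solution after finitely many steps. This happens when $A_n^a = A_{n-1}^a$ and $A_n^b = A_{n-1}^b$ for the first time, because then $\vec u_n$ satisfies all the restrictions. In this sense, all iterates generated by the primal-dual active set strategy are inadmissible, except for the last one, when the optimum is achieved. For the proofs, we refer the reader to [BIK99] and [KR02]. It is possible to achieve the admissibility of all iterates by a shift of the multipliers; see [IK00]. Moreover, as in the infinite-dimensional case, $c\vec\mu_n$ may be used instead of $\vec\mu_n$.

Often, the piecewise linear ansatz $u = \sum_{i=1}^{\ell} u_i \Phi_i$ is used in place of the piecewise constant one. Then $D_h$ is no longer a diagonal matrix, and the above discussion of the variational inequality fails. In this situation,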

the active set strategy for the continuous case described in Section 2.12.4 is applied pointwise at the mesh points of the triangulation; that is to say, in the case distinctions the points $x \in \Omega$ are replaced by the mesh points $x_i$.

Remark. An important task is of course to estimate the error of the method, i.e., the amount by which the exact optimal control and the optimal control of the discretized problem differ. For elliptic problems we refer to [ACT02], [CM02b], [CMT05], [Hin05], [HPUU09], and [MR04], and for parabolic problems to [Mal81], [Rös04], and [TT96].

Other active set strategies. The strategy described above generates inadmissible controls until the optimum is reached. With projected Newton methods, there exist schemes that always generate admissible controls and exhibit a convergence behavior comparable to that of the primal-dual active set strategy described above. Such methods have been successfully applied to parabolic problems, in particular. We refer to Bertsekas [Ber82] and Kelley [Kel99] regarding the foundation of these methods, and to Kelley and Sachs [KS94, KS95] with respect to their application to the solution of parabolic optimal control problems.

Direct solution of the optimality system. Another technique that can be recommended is the direct numerical solution of the nonsmooth optimality system

$$-\Delta y = \beta\, \mathbb{P}_{[u_a, u_b]} \{ -\lambda^{-1} \beta p \}, \qquad -\Delta p = y - y_\Omega,$$
$$y|_\Gamma = 0, \qquad p|_\Gamma = 0.$$

The application of this method to parabolic problems will be discussed in some detail on page 170.

2.13. The adjoint state as a Lagrange multiplier *

2.13.1. Elliptic equations with data from V*. Using the theory of weak solutions, all of the elliptic boundary value problems investigated in this chapter have been transformed into the general form

$$a[y, v] = F(v), \tag{2.100}$$

with a continuous bilinear form $a : V \times V \to \mathbb{R}$ and a functional $F \in V^*$. We now consider the bilinear form from a different point of view. To this end, observe that for any fixed $y \in V$ the linear mapping $a_y : V \to \mathbb{R}$, $v \mapsto a[y, v]$, is continuous on $V$ and thus an element of $V^*$. The linear mapping $A : V \to V^*$, $y \mapsto a_y$, is continuous, since it satisfies, with the constant $\alpha_0$ from the inequality (2.6) on page 32 (continuity of the bilinear form $a$),

$$\|Ay\|_{V^*} = \sup_{\|v\|_V = 1} |a_y(v)| = \sup_{\|v\|_V = 1} |a[y, v]| \le \sup_{\|v\|_V = 1} \alpha_0\, \|y\|_V \|v\|_V = \alpha_0\, \|y\|_V.$$

Clearly, this implies that $\|A\|_{\mathcal{L}(V, V^*)} \le \alpha_0$. Hence, $A$ is bounded and therefore continuous. Moreover, we have

$$a[y, v] = a_y(v) \qquad \forall\, y, v \in V, \tag{2.101}$$

so that the variational equality $a[y, v] = F(v)$ can be regarded as an equality in $V^*$, namely,

$$Ay = F.$$

By virtue of the Lax-Milgram lemma, for any functional $F \in V^*$ this equation has under the corresponding assumptions a unique solution $y \in V$, and $\|y\|_V \le c_a \|F\|_{V^*}$. Consequently, the inverse operator $A^{-1} : V^* \to V$, $F \mapsto y$, exists and is continuous. Observe that the continuity of $A^{-1}$ can also be concluded from the well-known open mapping theorem, since $A$ is surjective. In summary, we have shown the following result.

Lemma 2.35. Every V-elliptic and bounded bilinear form $a = a[y, v]$ generates via

$$\langle Ay, v \rangle_{V^*, V} = a[y, v] \qquad \forall\, y, v \in V$$

a continuous and bijective linear operator $A : V \to V^*$. The inverse operator $A^{-1} : V^* \to V$ is continuous as well.

The application of Lemma 2.35 offers several advantages. In particular, one can work with the operator $A$ in a similar way as with the matrix $A$ in Section 1.4. The representation becomes symmetric, since it holds for the adjoint operator $A^* : (V^*)^* \to V^*$ and therefore $A^* : V \to V^*$, provided that $V$ is reflexive. In this textbook, this will always be the case.

2.13.2. Application to the proof of optimality conditions. In this section, we aim to demonstrate the advantages of Lemma 2.35 by applying it to the problem of finding the optimal stationary boundary temperature. Here, $y \in H^1(\Omega)$ is the weak solution to the state equation

$$-\Delta y = 0 \ \text{ in } \Omega, \qquad \partial_\nu y + \alpha y = \alpha u \ \text{ on } \Gamma,$$

where we postulate that $\alpha \ge 0$ almost everywhere on $\Gamma$ and $\|\alpha\|_{L^\infty(\Gamma)} > 0$. We choose $V = H^1(\Omega)$ and define $a$ and $F$ by

$$a[y, v] = \int_\Omega \nabla y \cdot \nabla v \, dx + \int_\Gamma \alpha\, y\, (\tau v) \, ds,$$
$$F(v) = \int_\Gamma \alpha\, u\, (\tau v) \, ds,$$

where $\tau : V \to L^2(\Gamma)$ denotes the trace operator. Now let $A : V \to V^*$ be the operator generated by the bilinear form $a$. Then the given problem can be rewritten in the form

$$\min J(y, u) := \frac12 \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|u\|_{L^2(\Gamma)}^2, \qquad Ay = Bu, \quad u \in U_{ad}. \tag{2.102}$$

Here, the linear operator $B : L^2(\Gamma) \to V^* = H^1(\Omega)^*$ is defined by

$$\langle Bu, v \rangle_{V^*, V} = \int_\Gamma \alpha\, u\, (\tau v) \, ds \qquad \forall\, v \in V.$$

In the following, we identify $L^2(\Omega)^*$ with $L^2(\Omega)$, but not $V^*$ with $V$ since this would, for example, necessitate using the $H^1$ scalar product at places where this is not appropriate.

Next, we define the solution operator $G := A^{-1} : V^* \to V$, $u \mapsto y = GBu$. Again, we consider $G$ as an operator $S$ with range in $L^2(\Omega)$, which is meaningful since $V \hookrightarrow L^2(\Omega)$. We thus put $S = E_V G$, with the embedding operator $E_V : V \to L^2(\Omega)$. Then $S : V^* \to L^2(\Omega)$ and

$$S = E_V A^{-1}.$$

The adjoint operator $S^*$ maps $L^2(\Omega)$ into $V$ and thus into $L^2(\Omega)$. Putting $y = E_V G B u = S B u$ in the cost functional $J(y, u)$, we find that problem (2.102) can be rewritten as

$$\min_{u \in U_{ad}} f(u) := \frac12 \|SBu - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|u\|_{L^2(\Gamma)}^2.$$

Theorem 2.14 on page 50 yields the existence of an optimal control $\bar u$, and from (2.45) in Theorem 2.22 on page 64 we deduce the associated variational inequality

$$\big( B^* S^* (\bar y - y_\Omega) + \lambda \bar u,\ u - \bar u \big)_{L^2(\Gamma)} \ge 0 \qquad \forall\, u \in U_{ad}. \tag{2.103}$$

Next, we define the adjoint state $p$ by

$$p := S^* (\bar y - y_\Omega) = (A^{-1})^* E_V^* (\bar y - y_\Omega).$$

Then $p$ solves the adjoint equation

$$A^* p = E_V^* (\bar y - y_\Omega). \tag{2.104}$$

The operator $E_V^* : L^2(\Omega) \to V^*$ assigns the function $\bar y - y_\Omega$ to itself, but we consider it as a functional on $V$, defined by the equation $(E_V^* (\bar y - y_\Omega))(v) = (\bar y - y_\Omega, v)_{L^2(\Omega)}$. It remains to determine the explicit form of $A^*$. It follows from the symmetry of $a$ that

$$\langle Ay, v \rangle_{V^*, V} = a[y, v] = a[v, y] = \langle Av, y \rangle_{V^*, V} = \langle y, Av \rangle_{V, V^*} \qquad \forall\, y, v \in V,$$

which shows that $A = A^*$. Therefore, the adjoint equation (2.104) is equivalent to the boundary value problem

$$-\Delta p = \bar y - y_\Omega, \qquad \partial_\nu p + \alpha p = 0. \tag{2.105}$$

Finally, the adjoint operator $B^* : V \to L^2(\Gamma)$ is given by $B^* p = \alpha\, (\tau p)$, so that the variational inequality (2.103) takes the form

$$\big( \alpha\, (\tau p) + \lambda \bar u,\ u - \bar u \big)_{L^2(\Gamma)} \ge 0 \qquad \forall\, u \in U_{ad}. \tag{2.106}$$

All these results have already been derived above, using different techniques. However, the method employed here is more general. It is applied, in particular, in the monograph by Lions [Lio71], which we have followed here. For instance, it allows for the use of control functions from $L^r$ spaces with $r < 2$ or even more general functionals from $V^*$ (cf. page 40). In addition, with this method the adjoint state $p$ can be defined as a Lagrange multiplier in a natural way, as will be explained in the next section.

2.13.3. The adjoint state as a multiplier. The result just proved can also be deduced from the Lagrange multiplier rule for optimization problems in Banach spaces. In anticipation of the contents of Chapter 6, we demonstrate this for problem (2.102). This is an optimization problem in a Banach space of the type (6.1) on page 324, with the equality constraint $Ay - Bv = 0$ for the unknown $u := (y, v) \in U := Y \times L^2(\Gamma)$ and the range space $Z := Y^*$ for the equation. To ensure compatibility with the notation of Chapter 6, we denote the control by $v$ (instead of $u$), and put $Y := H^1(\Omega)$ and $V_{ad} := U_{ad}$. We thus consider the problem

$$\min J(y, v) := \frac12 \|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|v\|_{L^2(\Gamma)}^2, \qquad Ay - Bv = 0, \quad v \in V_{ad}.$$
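For orientation, the adjoint calculus of Section 2.13.2 can be tried out on a finite-dimensional analogue of problem (2.102) without control constraints. The Python sketch below is an illustration under explicit assumptions, not part of the theory: the matrices `A` and `B`, the target `y_d`, and the parameter `lam` are randomly generated stand-ins, and the embedding $E_V$ is taken to be the identity. It computes the gradient of the reduced functional as $B^\top p + \lambda u$, with $p$ obtained from the adjoint equation $A^\top p = y - y_d$, and compares it with a central finite difference quotient.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, lam = 6, 3, 0.1

# Hypothetical data: a symmetric positive definite matrix A (stand-in for
# the operator generated by the bilinear form), a matrix B mapping
# controls to right-hand sides, and a target y_d.
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
B = rng.standard_normal((n, m))
y_d = rng.standard_normal(n)

def reduced_cost(u):
    """f(u) = 1/2 |A^{-1} B u - y_d|^2 + lam/2 |u|^2."""
    y = np.linalg.solve(A, B @ u)
    return 0.5 * np.dot(y - y_d, y - y_d) + 0.5 * lam * np.dot(u, u)

def adjoint_gradient(u):
    """Gradient B^T p + lam u, where A^T p = y - y_d (adjoint equation)."""
    y = np.linalg.solve(A, B @ u)
    p = np.linalg.solve(A.T, y - y_d)
    return B.T @ p + lam * u

def fd_gradient(u, eps=1e-6):
    """Central finite difference approximation of the gradient."""
    g = np.zeros_like(u)
    for i in range(u.size):
        e = np.zeros_like(u)
        e[i] = eps
        g[i] = (reduced_cost(u + e) - reduced_cost(u - e)) / (2 * eps)
    return g

u = rng.standard_normal(m)
g_adj = adjoint_gradient(u)
g_fd = fd_gradient(u)
```

Only two linear solves are needed for the adjoint gradient, independently of the number `m` of control variables, whereas the difference quotient requires `2 * m` evaluations of the cost.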

The corresponding Lagrangian function $L : Y \times L^2(\Gamma) \times Y^{**} \to \mathbb{R}$ reads, in view of (6.2) on page 325,

$$L(y, v, z^*) = J(y, v) + \langle z^*, Ay - Bv \rangle_{Y^{**}, Y^*} = \frac12 \|E_V y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2} \|v\|_{L^2(\Gamma)}^2 + \langle z^*, Ay - Bv \rangle_{Y^{**}, Y^*}.$$

Since $Y$ is reflexive, we can identify $Y^{**}$ with $Y$ and hence $z^*$ with an element of $Y$. Evidently, $A$ is a surjective operator. Therefore, the regularity condition (6.11) for equality constraints by Zowe and Kurcyusz is fulfilled. Hence, owing to Theorem 6.3 on page 330, there exists a Lagrange multiplier $z^* \in Y$ such that the variational inequality (6.13) holds, i.e., in the present case,

$$D_{(y,v)} L(\bar y, \bar v, z^*)(y - \bar y, v - \bar v) \ge 0 \qquad \forall\, (y, v) \in Y \times V_{ad}.$$

Since $y \in Y$ may be chosen arbitrarily, it follows that

$$D_y L(\bar y, \bar v, z^*) = 0,$$

so that

$$E_V^* (E_V \bar y - y_\Omega) + A^* z^* = 0.$$

Putting $p := -z^*$ and recalling that $E_V \bar y = \bar y$, we finally arrive at equation (2.104),

$$A^* p = E_V^* (\bar y - y_\Omega).$$

Its unique solution is the weak solution to problem (2.105). We have thus shown the following result.

Lemma 2.36. The adjoint state $p$ associated with the optimal control $\bar v$ of the optimal stationary heat source problem is the (uniquely determined) Lagrange multiplier corresponding to the state equation $Ay - Bv = 0$.

For the sake of completeness, we mention that the variational inequality

$$D_v L(\bar y, \bar v, z^*)(v - \bar v) \ge 0 \qquad \text{for all } v \in V_{ad}$$

implies, as is to be expected, the variational inequality (2.106), here formulated with $v$ in place of $u$.

In the same way, all the other problems for elliptic equations can be treated. A similar approach is also appropriate for parabolic equations. Difficulties arise for certain classes of nonlinear equations (see Chapter 4), since right-hand sides $v \in Y^*$ only lead to $y \in H^1(\Omega)$. We will, however, need boundedness of the state $y$, which is not guaranteed in $H^1(\Omega)$ if $N \ge 2$. In the study of problems with state constraints, we will later even need the regularity $y \in C(\bar\Omega)$.

2.14. Higher regularity for elliptic problems

2.14.1. Limited applicability of the state space $H^1(\Omega)$. In the present chapter, the state was generally chosen from the space $H^1(\Omega)$. While this is appropriate for standard linear-quadratic problems, simple changes already lead to difficulties that cannot be resolved in $H^1(\Omega)$, as the following two examples show.

Evaluation at a point. In this example, we replace the quadratic integral functional used so far by the value of the state at a fixed point $x_0 \in \Omega$. We thus consider the following problem:

$$\min y(x_0),$$

subject to

$$-\Delta y + y = 0 \ \text{ in } \Omega, \qquad \partial_\nu y + \alpha y = u \ \text{ on } \Gamma,$$

and constraints on the control $u \in L^2(\Gamma)$,

$$u_a(x) \le u(x) \le u_b(x) \qquad \text{for a.e. } x \in \Gamma.$$

This problem is not well posed in $H^1(\Omega)$ unless $\Omega$ is one-dimensional. Indeed, $y$ must be continuous for $y(x_0)$ to be defined, and functions in $H^1(\Omega)$ need not be continuous if $\dim \Omega \ge 2$. We will treat this problem in Section 6.2.1, using $Y = H^1(\Omega) \cap C(\bar\Omega)$ as the state space.

Best approximation with respect to the maximum norm. A similar situation as in the above example arises for the problem

$$\min J(y, u) := \|y - y_\Omega\|_{C(\bar\Omega)} + \frac{\lambda}{2} \|u\|_{L^2(\Gamma)}^2,$$

subject to the same restrictions as in the case of the point-evaluation functional. Here, the target $y_\Omega$ has to be approximated by $y$ uniformly, not in the quadratic mean.

Also in this case, $H^1(\Omega)$ is not the appropriate state space: the state $y$ needs to be continuous for $J$ to be defined. Moreover, while $J$ is convex, it is not differentiable. In Section 6.2.1, this problem will be treated by transforming it into a problem with differentiable cost functional and pointwise state constraints.

2.14.2. Sobolev-Slobodetskii spaces. In this section, we briefly discuss Sobolev spaces for noninteger orders of differentiation. These spaces play an important role in the theory of partial differential equations. In this book, they will be used in Section 2.15 to prove that the optimal control for the problem of finding the optimal stationary boundary temperature belongs to

$H^1(\Gamma)$. For the next definition, we follow the lines of Thm. 7.48 in Adams [Ada78].

Definition. Let $\Omega \subset \mathbb{R}^N$ be a bounded domain, and let $s > 0$ be noninteger, with $\lambda = s - [s] > 0$ being its noninteger part. Then we denote by $H^s(\Omega)$ the normed space of all functions $v \in H^{[s]}(\Omega)$ such that

$$\sum_{|\alpha| = [s]} \int_\Omega \int_\Omega \frac{|D^\alpha v(x) - D^\alpha v(y)|^2}{|x - y|^{N + 2\lambda}} \, dx\, dy < \infty,$$

endowed with the norm

$$\|v\|_{H^s(\Omega)}^2 = \|v\|_{H^{[s]}(\Omega)}^2 + \sum_{|\alpha| = [s]} \int_\Omega \int_\Omega \frac{|D^\alpha v(x) - D^\alpha v(y)|^2}{|x - y|^{N + 2\lambda}} \, dx\, dy.$$

To define the spaces $H^s(\Gamma)$, we need the representations $y_N = h_i(y)$, for $y \in Q_{N-1}$, with respect to all the local coordinate systems $S_i$, which were used to introduce the notion of $C^{k,1}$ domains in Section 2.2.2. A function $v$ belongs to $H^s(\Gamma)$ if and only if all the functions $v_i(y) = v(y, h_i(y))$ belong to the space $H^s(Q_{N-1})$.

A thorough treatment of the mathematical foundations is given in Wloka [Wlo87], Chap. 1, §4; we also refer the reader to Adams [Ada78] and Alt [Alt99]. In the same way, one can define the spaces $H^k(\Gamma)$ for integer $k$ in $C^{k-1,1}$ domains, using the Sobolev space $H^k(Q_{N-1})$; see Gajewski et al. [GGZ74].

These so-called Sobolev-Slobodetskii spaces turn out to be Hilbert spaces if equipped with the corresponding scalar product. The spaces $W^{s,p}(\Omega)$, for $p \ne 2$, are introduced in a similar way; see [Ada78]. In particular, one has $H^s(\Omega) = W^{s,2}(\Omega)$.

2.14.3. Higher regularity of solutions. The examples discussed in Section 2.14.1 show that $H^1$ regularity of solutions to elliptic boundary value problems does not suffice for important classes of optimal control problems. One therefore tries to find additional conditions relating to the smoothness of the boundary and/or the prescribed data which guarantee better regularity properties. In this section, we collect some standard results from the relevant literature.

Boundary value problems in $C^{1,1}$ domains. We consider the Dirichlet problem

$$\mathcal{A} y + \lambda y = f \ \text{ in } \Omega, \qquad y = g \ \text{ on } \Gamma \tag{2.107}$$

and the Neumann problem

$$\mathcal{A} y + \lambda y = f \ \text{ in } \Omega, \qquad \partial_{\nu_{\mathcal{A}}} y = g \ \text{ on } \Gamma, \tag{2.108}$$

with the elliptic operator $\mathcal{A}$ defined in (2.19) on page 37. We assume that the coefficient functions $a_{ij}$ obey both the symmetry condition and the ellipticity condition (2.20). In addition, we assume that $a_{ij} \in C^{0,1}(\bar\Omega)$ for all $i, j \in \{1, \dots, N\}$, and we postulate that $\Omega$ is a bounded $C^{1,1}$ domain. Moreover, $\lambda \in \mathbb{R}$ is prescribed.

Homogeneous boundary data. In this case, we can deduce the following result from Thms. 2.2.2.3 and 2.2.2.5 in Grisvard [Gri85]:

If $g = 0$, $\lambda \ge 0$, and $f \in L^2(\Omega)$, then the weak solution to the Dirichlet problem (2.107) belongs to $H^2(\Omega)$. If, in addition, $\lambda > 0$, then the same result holds for the Neumann problem (2.108).

Inhomogeneous boundary data. The following result is a consequence of Thms. 2.4.2.5 and 2.4.2.7 in [Gri85]:

Let $1 < p < \infty$. If $g \in W^{2 - 1/p,\, p}(\Gamma)$, $\lambda \ge 0$, and $f \in L^p(\Omega)$, then the weak solution $y$ to the Dirichlet problem (2.107) belongs to $W^{2,p}(\Omega)$. If $\lambda > 0$ and $g \in W^{1 - 1/p,\, p}(\Gamma)$, then the same result holds for the Neumann problem (2.108).

The results cited above were proved in [Gri85] for more general boundary operators that include, in particular, the case of Robin boundary conditions. Note that the results for homogeneous boundary conditions follow as special cases from the $W^{2,p}$ results, since $g = 0$ is smooth.

Lipschitz domains. For Lipschitz domains, the following interesting result due to Jerison and Kenig [JK81] holds:

Let $\Omega$ be a Lipschitz domain. If $\mathcal{A} = -\Delta$, $\lambda > 0$, $f = 0$, and $g \in L^2(\Gamma)$, then the weak solution $y$ to the Neumann problem belongs to $H^{3/2}(\Omega)$.

Convex domains. The following result shows that additional regularity can be expected for convex domains:

If $\Omega$ is a bounded and convex domain, then the results stated for $C^{1,1}$ domains and homogeneous boundary data remain valid; that is, if $g = 0$, $f \in L^2(\Omega)$, and $\lambda \ge 0$ or $\lambda > 0$, respectively, for the Dirichlet or Neumann problems, then $y \in H^2(\Omega)$.

This result follows from Thms. 3.2.1.2 and 3.2.1.3 in [Gri85]. Further regularity results for $C^\infty$ boundaries can be found in Triebel [Tri95]. Moreover, the Stampacchia technique can be employed to prove boundedness or continuity of the solution $y$ under slightly weaker conditions. We will come back to this in Sections 4.2 and 7.2.2.

2.15. Regularity of optimal controls

In the problems studied above, the controls were chosen from the Hilbert spaces $L^2(\Gamma)$ or $L^2(\Omega)$; hence, no better regularity than $L^2$ can at first glance be expected for the optimal controls. It turns out, however, that for $\lambda > 0$ the mere fact of optimality entails additional regularity.

Optimal stationary heat sources. We treat this problem for zero boundary temperature; the case of a boundary condition of the third kind with prescribed outside temperature can be handled analogously. The optimality system associated with the corresponding problem (2.26)-(2.28) (see page 49) reads

$$-\Delta y = \beta u, \qquad -\Delta p = y - y_\Omega,$$
$$y|_\Gamma = 0, \qquad p|_\Gamma = 0,$$
$$u = \mathbb{P}_{[u_a, u_b]} \{ -\lambda^{-1} \beta p \}.$$

We have the following regularity result.

Theorem 2.37. Suppose that $\beta$ is Lipschitz continuous on $\bar\Omega$, and assume that $u_a, u_b \in H^1(\Omega)$ and $y_\Omega \in L^2(\Omega)$. Then the optimal control for the optimal stationary heat source problem (2.26)-(2.28) on page 49 belongs to $H^1(\Omega)$.

Proof. Since $y - y_\Omega$ belongs to $L^2(\Omega)$, the solution to the adjoint equation satisfies $p \in H_0^1(\Omega)$. This holds for the product $\beta p$ only if $\beta$ is sufficiently smooth. To ensure this, we have postulated that $\beta$ is Lipschitz continuous on $\bar\Omega$. Hence, we have $\beta p \in H_0^1(\Omega)$; see Grisvard [Gri85], Thm. 1.4.1.2. Moreover, $u_a, u_b \in H^1(\Omega)$.

Finally, we claim that the projection operator $\mathbb{P}_{[u_a, u_b]}$ maps $H^1(\Omega)$ into itself. Indeed, this is a consequence of a result due to Stampacchia and Kinderlehrer [KS80], which states that the mapping $u(\cdot) \mapsto |u(\cdot)|$ maps $H^1(\Omega)$ continuously into itself. In conclusion, the optimal control $u = \bar u$ belongs to $H^1(\Omega)$. □

Optimal stationary boundary temperature. The optimality system for this problem reads

$$-\Delta y = 0, \qquad -\Delta p = y - y_\Omega,$$
$$\partial_\nu y + \alpha y = \alpha u, \qquad \partial_\nu p + \alpha p = 0,$$
$$u = \mathbb{P}_{[u_a, u_b]} \{ -\lambda^{-1} \alpha\, p|_\Gamma \}.$$

In this case, the above method fails to yield the expected regularity. Again, the adjoint equation yields $p \in H^1(\Omega)$. However, the projection relation for $u$ involves the trace of $p$, which by Theorem 7.3 merely belongs to $H^{1/2}(\Gamma)$. We can thus expect $u \in H^{1/2}(\Gamma)$ at best. Nevertheless, additional regularity can be recovered under natural conditions.

Theorem 2.38. Let $\Omega$ be a bounded $C^{1,1}$ domain, let $\alpha$ be Lipschitz continuous, and let $u_a, u_b \in H^1(\Gamma)$. Then the solution $\bar u$ to the problem of the optimal stationary boundary control belongs to $H^1(\Gamma)$.

Proof. We write the adjoint equation in the form

$$-\Delta p + p = y - y_\Omega + p, \qquad \partial_\nu p = -\alpha p.$$

The solution $p = p_1 + p_2$ is composed of two summands $p_1$ and $p_2$. Here, $p_1$ is the solution to the boundary value problem $-\Delta p_1 + p_1 = f$, $\partial_\nu p_1 = 0$, with $f := y - y_\Omega + p$. The summand $p_2$ is the solution to $-\Delta p_2 + p_2 = 0$, $\partial_\nu p_2 = g$, with $g := -\alpha p$.

Since $\Omega$ is a domain of class $C^{1,1}$, we have $p_1 \in H^2(\Omega)$ owing to the regularity results for the homogeneous Neumann problem from [Gri85], which are collected in Section 2.14.3.

Also, $p_2$ belongs to $H^2(\Omega)$: indeed, we have $p \in H^1(\Omega)$, so that $p|_\Gamma \in H^{1/2}(\Gamma)$ by virtue of Theorem 7.3 on page 356. Then, owing to Thm. 1.4.1.2 in [Gri85], the product $g = -\alpha\, p|_\Gamma$ also belongs to $H^{1/2}(\Gamma)$. The claim now follows from the regularity result for the nonhomogeneous Neumann problem in $C^{1,1}$ domains stated in Section 2.14.3: we can infer that the mapping $g \mapsto p_2$ is continuous from $H^{1/2}(\Gamma) = W^{1 - 1/2,\, 2}(\Gamma)$ into $W^{2,2}(\Omega) = H^2(\Omega)$.

In summary, $p \in H^2(\Omega)$. Hence, by virtue of Theorem 7.3, the trace of $p$ belongs at least to $H^1(\Gamma)$. The assertion now follows from the projection
116 2. Linear-quadratic elliptic control problems 2.16. Exercises 117

formula for $\bar u$ and the continuity of the mapping $u(\cdot) \mapsto |u(\cdot)|$ in $H^1(\Gamma)$, since the product of the Lipschitz function $\alpha$ and the $H^1(\Gamma)$ function $p|_\Gamma$ belongs to $H^1(\Gamma)$; see [Gri85], Thm. 1.4.1.2. □

2.16. Exercises

2.1 We sketch the treatment of the nonlinear problem

$$ \min J(y,u), \qquad T(y,u) = 0, \quad u \in U_{ad}. $$

Here, in addition to the quantities $J$ and $U_{ad}$ defined on page 10, a continuously differentiable mapping $T : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is given. Suppose that the Jacobian matrix $D_y T(\bar y, \bar u)$ is nonsingular at the optimal point $(\bar y, \bar u)$. Then the solution $z$ to the equation linearized at $(\bar y, \bar u)$,

$$ D_y T(\bar y, \bar u)(z - \bar y) + D_u T(\bar y, \bar u)(u - \bar u) = 0, $$

behaves, in a suitable neighborhood of $(\bar y, \bar u)$ and up to an error of higher order than $\|u - \bar u\|$, like the solution $y$ to the equation $T(y,u) = 0$. In comparison with the linear state equation (1.1) on page 10, $D_y T(\bar y, \bar u)$ takes over the role of $A$ and $D_u T(\bar y, \bar u)$ that of $B$. It is therefore plausible to conjecture that the pair $(\bar y, \bar u)$ satisfies the following optimality system in place of (1.9) on page 14:

$$
\begin{aligned}
& T(y,u) = 0, \quad u \in U_{ad} \\
& D_y T(y,u)^\top p = \nabla_y J(y,u) \\
& \big(D_u T(y,u)^\top p + \nabla_u J(y,u),\, v - u\big)_{\mathbb{R}^m} \ge 0 \quad \forall v \in U_{ad}.
\end{aligned}
$$

Use the implicit function theorem to prove this conjecture.

2.2 Show that the expressions $\|\cdot\|_{C[a,b]}$ (maximum norm) and $\|\cdot\|_{C_{L^2}[a,b]}$ introduced in Section 2.1 satisfy the norm axioms.

2.3 Let $\{H, (\cdot\,,\cdot)\}$ be a pre-Hilbert space. Show that $\|u\| := \sqrt{(u,u)}$ defines a norm on $H$.

2.4 Prove Theorem 2.7 on page 38.

2.5 Show that $\|A\|_{\mathcal{L}(U,V)} = \sup_{\|u\|_U = 1} \|Au\|_V$ defines a norm in $\mathcal{L}(U,V)$.

2.6 Determine the operator norm of the integral operator $A : C[0,1] \to C[0,1]$,

$$ (Au)(t) = \int_0^1 e^{(t-s)}\, u(s)\, ds, \quad t \in [0,1]. $$

2.7 Show that every strongly convergent sequence in a normed space converges weakly.

2.8 Prove that in Hilbert spaces $\{H, (\cdot\,,\cdot)\}$ the following important result holds: if $u_n \rightharpoonup u$ and $v_n \to v$, then $(u_n, v_n) \to (u, v)$ as $n \to \infty$.

2.9 Suppose that Assumption 2.13 on page 48 is fulfilled. Show that the set

$$ \{u \in L^2(\Omega) : u_a(x) \le u(x) \le u_b(x) \ \text{for a.e. } x \in \Omega\} $$

is convex and closed. Use the following well-known result: if $\|u_n\|_{L^2(\Omega)} \to 0$, then there exists a subsequence of $\{u_n\}_{n=1}^\infty$ that converges to zero almost everywhere in $\Omega$.

2.10 Suppose that $\{Y, \|\cdot\|_Y\}$ and $\{U, \|\cdot\|_U\}$ are Hilbert spaces, and let $y_d \in Y$, $\lambda \ge 0$, and an operator $S \in \mathcal{L}(U,Y)$ be given. Show that the functional

$$ f(u) = \|Su - y_d\|_Y^2 + \lambda\, \|u\|_U^2 $$

is strictly convex if $\lambda$ is positive or $S$ is injective.

2.11 Show that the following functionals are continuously Fréchet differentiable:
a) $f(u) = \sin(u(1))$, in $C[0,1]$;
b) $f(u) = \|u\|_H^2$, in any Hilbert space $\{H, (\cdot\,,\cdot)\}$.

2.12 Show that the linear integral operator

$$ (Au)(t) = \int_0^1 e^{(t-s)}\, u(s)\, ds, \quad t \in [0,1] $$

is well defined in $H = L^2(0,1)$ and maps $H$ continuously into itself.

2.13 Let real numbers $u_a$, $u_b$, $\beta$, $p$ and some $\lambda > 0$ be given. Solve the quadratic optimization problem in $\mathbb{R}$,

$$ \min_{v \in [u_a,u_b]} \Big\{ \beta\, p\, v + \frac{\lambda}{2}\, v^2 \Big\} $$

by deriving a projection formula of the type (2.58) on page 70.

2.14 Prove the necessary optimality conditions for problem (2.69)-(2.71) on page 74.

2.15 Derive the necessary optimality conditions for the linear optimal control problem on page 79. Hint: Use the fact that the value of the cost functional at an arbitrary triple cannot be smaller than that at an optimal triple, and write down the differential equation satisfied by the difference between an arbitrary and an optimal triple.

2.16 Let a bounded Lipschitz domain $\Omega \subset \mathbb{R}^N$ and functions $y_\Omega \in L^2(\Omega)$, $e_\Omega \in L^2(\Omega)$, and $e_\Gamma \in L^2(\Gamma)$ be given, where $e_\Gamma$ is assumed to be the trace of a function $y \in H^2(\Omega)$. Derive the necessary optimality conditions for the problem

$$ \min \int_\Omega |y - y_\Omega|^2\, dx, $$

subject to $-\Delta y = u + e_\Omega$, $y|_\Gamma = e_\Gamma$, and the box constraints $-1 \le u(x) \le 1$.

2.17 Derive the necessary optimality conditions for the following problem (cf. page 82):

$$ \min J(y,u) := \frac12 \int_\Omega |y - y_\Omega|^2\, dx + \int_\Gamma e_\Gamma\, y\, ds + \frac{\lambda}{2} \int_\Omega |u|^2\, dx, $$

subject to $-\Delta y + y = u + e_\Omega$, $\partial_\nu y = e_\Gamma$, and the box constraints $0 \le u(x) \le 1$.

2.18 Prove Theorem 2.34 on page 90, that is, the first-order necessary optimality conditions for the problem (2.36)-(2.38) on page 54, which were derived only formally in Section 2.11.1 by using the formal Lagrange method.

2.19 Apply the formal Lagrange method to the problem with boundary control of Dirichlet type,

$$ \min \int_\Omega |y - y_\Omega|^2\, dx + \lambda \int_\Gamma |u|^2\, ds, $$

subject to

$$
\begin{aligned}
-\Delta y &= 0 \\
y|_\Gamma &= u
\end{aligned}
$$

and $-1 \le u(x) \le 1$, and state the necessary optimality conditions that are to be expected. Hint: Use different multipliers $p_1$ and $p_2$ for the Laplace equation and the boundary condition, respectively.

Chapter 3

Linear-quadratic parabolic control problems

3.1. Introduction

Preliminary remarks. Elliptic differential equations model stationary physical processes such as heat conduction processes with equilibrium temperature distributions. If the process under investigation is nonstationary, then time comes into play as an additional physical parameter. As an archetypical case, let us consider the problem of finding the optimal nonstationary boundary temperature discussed in Section 1.2.2: our task is to control the temperature in a spatial domain $\Omega$, which is initially given by $y_0(x)$, $x \in \Omega$, in such a way that a desired final temperature distribution $y_\Omega(x)$, $x \in \Omega$, is achieved within a finite period of time $T > 0$. A simplified mathematical model for this problem reads as follows:

$$ \min J(y,u) := \frac12 \int_\Omega |y(x,T) - y_\Omega(x)|^2\, dx + \frac{\lambda}{2} \int_0^T\!\!\int_\Gamma |u(x,t)|^2\, ds(x)\, dt, \tag{3.1} $$

subject to

$$
\begin{aligned}
y_t - \Delta y &= 0 && \text{in } Q := \Omega \times (0,T) \\
\partial_\nu y + \alpha\, y &= \beta\, u && \text{on } \Sigma := \Gamma \times (0,T) \\
y(x,0) &= y_0(x) && \text{in } \Omega
\end{aligned} \tag{3.2}
$$
and

$$ u_a(x,t) \le u(x,t) \le u_b(x,t) \quad \text{for a.e. } (x,t) \in \Sigma. \tag{3.3} $$

In contrast to elliptic problems, the process evolves within the space-time cylinder $Q := \Omega \times (0,T)$. The control function $u = u(x,t)$ acts on the spatial boundary $\Gamma$; it is therefore defined on the set $\Sigma := \Gamma \times (0,T)$.

In this chapter, we will pursue a similar strategy to that in the elliptic case. First, we will show that the initial-boundary value problem (3.2) admits for any given control $u = u(x,t)$ a unique solution $y = y(x,t)$ in a suitable function space. Then we will investigate the solvability of the optimal control problem, that is, whether there exists an optimal control $\bar u$ with associated optimal state $\bar y$. This will again follow from the continuity of the solution operator $G : u \mapsto y$. Finally, necessary optimality conditions will be derived.

From the viewpoint of optimization, this approach is in principle the same as in the elliptic case. However, the theory of weak solutions is slightly more involved for parabolic equations: in addition to second-order spatial derivatives, a first-order derivative of $y$ with respect to the time variable also occurs. This makes the use of another solution space necessary, and eventually leads to the conclusion that the associated adjoint equation has to be taken as an equation that runs backwards in time.

Formal derivation of the optimality conditions. In order to get, from the very beginning, an idea of what sort of optimality conditions can be expected, we once more apply the formal Lagrange technique. It will turn out later that this approach actually leads to the correct result. To this end, we introduce the Lagrangian function $\mathcal{L}$ associated with the problem (3.1)-(3.3),

$$ \mathcal{L}(y,u,p) = J(y,u) - \iint_Q (y_t - \Delta y)\, p_1\, dx\, dt - \iint_\Sigma (\partial_\nu y + \alpha\, y - \beta\, u)\, p_2\, ds\, dt, $$

in which we account for the "difficult" constraints involving derivatives. The initial condition and the inequality constraints for $u$ are not eliminated by introducing corresponding Lagrange multipliers. We note that in this case it would suffice to choose the same function $p$ in both integrals; however, since this leads to difficulties in other situations, we prefer to work with two different functions $p_1$ and $p_2$, yet eventually arrive at the conclusion that $p_1 = p_2$ on $\Sigma$. It is advisable to take such an approach whenever there is doubt.

In the following, we will often write $y(\cdot, t)$ or, for short, $y(t)$, and we will regard $y$ as a function with values in a Banach space (for the explanation of this notion, see page 141). The first form $y(\cdot, t)$ stresses the dependence on $x$.

The set of admissible controls is defined by

$$ U_{ad} = \{u \in L^2(\Sigma) : u_a(x,t) \le u(x,t) \le u_b(x,t) \ \text{for a.e. } (x,t) \in \Sigma\}. $$

The initial condition $y(\cdot, 0) = y_0$ for $y$ has to be taken into account. Therefore, the Lagrange method at first yields the variational inequality

$$ D_y\mathcal{L}(\bar y, \bar u, p)(y - \bar y) \ge 0 $$

for all sufficiently smooth functions $y$ with $y(\cdot, 0) = y_0$. Substituting $y := y - \bar y$, we find that $D_y\mathcal{L}(\bar y, \bar u, p)\, y \ge 0$ for all $y$ satisfying $y(\cdot, 0) = 0$. Since $-y$ also belongs to this class of functions, we finally obtain that $D_y\mathcal{L}(\bar y, \bar u, p)\, y = 0$. With respect to $u$, the known variational inequality follows. In conclusion, we expect the following necessary optimality conditions:

$$
\begin{aligned}
D_y\mathcal{L}(\bar y, \bar u, p)\, y &= 0 && \text{for all } y \text{ with } y(0) = 0 \\
D_u\mathcal{L}(\bar y, \bar u, p)(u - \bar u) &\ge 0 && \text{for all } u \in U_{ad}.
\end{aligned}
$$

The first equation leads to the adjoint equation. Since the derivative of the linear and continuous mapping $y \mapsto y(\cdot, T)$ coincides with the mapping itself, we find that

$$ D_y\mathcal{L}(\bar y, \bar u, p)\, y = \int_\Omega (\bar y(T) - y_\Omega)\, y(T)\, dx - \iint_Q (y_t - \Delta y)\, p_1\, dx\, dt - \iint_\Sigma (\partial_\nu y + \alpha\, y)\, p_2\, ds\, dt. $$

Upon integrating by parts (with respect to $t$ in $y_t$ and with respect to $x$ in $\Delta y$), invoking Green's formula, and finally regrouping the terms, we obtain that any sufficiently smooth $y$ with $y(0) = 0$ satisfies

$$
\begin{aligned}
0 ={}& \int_\Omega (\bar y(T) - y_\Omega)\, y(T)\, dx - \int_\Omega y(T)\, p_1(T)\, dx + \iint_Q y\, p_{1,t}\, dx\, dt \\
& + \iint_\Sigma p_1\, \partial_\nu y\, ds\, dt - \iint_\Sigma y\, \partial_\nu p_1\, ds\, dt + \iint_Q y\, \Delta p_1\, dx\, dt \\
& - \iint_\Sigma p_2\, \partial_\nu y\, ds\, dt - \iint_\Sigma \alpha\, y\, p_2\, ds\, dt \\
={}& \int_\Omega (\bar y(T) - y_\Omega - p_1(T))\, y(T)\, dx + \iint_Q (p_{1,t} + \Delta p_1)\, y\, dx\, dt \\
& - \iint_\Sigma (\partial_\nu p_1 + \alpha\, p_2)\, y\, ds\, dt + \iint_\Sigma (p_1 - p_2)\, \partial_\nu y\, ds\, dt.
\end{aligned}
$$

Note that the term containing $y(0)$ vanishes, since $y(0) = 0$.
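For readers who want the intermediate step, the regrouping rests on two standard identities, written here formally for sufficiently smooth $y$ and $p_1$ (the same level of rigor as the surrounding derivation, not a proof). The first is integration by parts in time, the second is Green's second formula in space:

```latex
\begin{align*}
\iint_Q y_t\, p_1 \,dx\,dt
  &= \int_\Omega y(T)\, p_1(T)\,dx - \int_\Omega y(0)\, p_1(0)\,dx
     - \iint_Q y\, p_{1,t}\,dx\,dt, \\
\iint_Q \Delta y\; p_1 \,dx\,dt
  &= \iint_Q y\, \Delta p_1 \,dx\,dt
     + \iint_\Sigma \bigl( p_1\, \partial_\nu y - y\, \partial_\nu p_1 \bigr)\,ds\,dt .
\end{align*}
```

Inserting both into $-\iint_Q (y_t - \Delta y)\, p_1\, dx\, dt$ and using $y(0) = 0$ produces exactly the volume and boundary terms that are regrouped in the text.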
First, we note that for all $y \in C_0^\infty(Q)$ the expressions $y(T)$, $y(0)$ and $y$, $\partial_\nu y$ vanish on $\Omega$ and $\Sigma$, respectively. Therefore,

$$ \iint_Q (p_{1,t} + \Delta p_1)\, y\, dx\, dt = 0 \quad \forall\, y \in C_0^\infty(Q). $$

Now observe that $C_0^\infty(Q)$ is dense in $L^2(Q)$. We therefore must have

$$ p_{1,t} + \Delta p_1 = 0 \quad \text{in } Q. $$

In particular, the integral over $Q$ vanishes in the above equation. Next, we no longer require that $y(T) = 0$ and consider the set of all functions $y \in C^1(\bar Q)$ such that $y|_\Sigma = 0$. For such functions, it follows that

$$ \int_\Omega (\bar y(T) - y_\Omega - p_1(T))\, y(T)\, dx = 0. $$

The possible values $y(T)$ form a dense subset of $L^2(\Omega)$. We do not discuss the validity of this claim here, since we are arguing formally; similarly, we will claim below that the boundary values and normal derivatives of smooth functions $y$ form dense subsets in $L^2(\Sigma)$. If, however, the claim is true, then it follows that

$$ p_1(T) = \bar y(T) - y_\Omega \quad \text{in } \Omega. $$

Next, we no longer require $y|_\Sigma = 0$ and vary over all functions $y \in C^1(\bar Q)$. We obtain

$$ \iint_\Sigma (\partial_\nu p_1 + \alpha\, p_2)\, y\, ds\, dt = 0. $$

Claiming that the set $\{y|_\Sigma : y \in C^1(\bar Q)\}$ is dense in $L^2(\Sigma)$, we conclude that

$$ \partial_\nu p_1 + \alpha\, p_2 = 0 \quad \text{in } \Sigma. $$

In summary, we have

$$ \iint_\Sigma (p_1 - p_2)\, \partial_\nu y\, ds\, dt = 0 $$

for all sufficiently smooth $y$. Claiming that the set of normal derivatives $\partial_\nu y$ is dense in $L^2(\Sigma)$, we finally find that $p_2 = p_1$ on $\Sigma$. We thus put $p := p_1$ to obtain that $p_2 = p$ on $\Sigma$. In conclusion, our formal argument yields the following system as the adjoint equation:

$$
\begin{aligned}
-p_t &= \Delta p && \text{in } Q \\
\partial_\nu p + \alpha\, p &= 0 && \text{on } \Sigma \\
p(T) &= \bar y(T) - y_\Omega && \text{in } \Omega.
\end{aligned} \tag{3.4}
$$

Evaluation of the variational inequality for $D_u\mathcal{L}$ yields

$$
\begin{aligned}
D_u\mathcal{L}(\bar y, \bar u, p)(u - \bar u) &= \lambda \iint_\Sigma \bar u\, (u - \bar u)\, ds\, dt + \iint_\Sigma \beta\, p\, (u - \bar u)\, ds\, dt \\
&= \iint_\Sigma (\lambda\, \bar u + \beta\, p)(u - \bar u)\, ds\, dt \ge 0.
\end{aligned}
$$

Hence, the following variational inequality has to be satisfied:

$$ \iint_\Sigma (\lambda\, \bar u + \beta\, p)(u - \bar u)\, ds\, dt \ge 0 \quad \forall u \in U_{ad}. $$

The above derivation was only formal, with not much attention paid to mathematical rigor. For instance, in calculations we treated the time derivatives $y_t$ and $p_t$ as if they were ordinary functions; also, we have not specified the function spaces to which $y$ and $p$, as well as their derivatives, are supposed to belong. Moreover, the careless use of the initial and final values of $y$ and $p$ was rather bold. Finally, the claimed density of the boundary values of $y$ and of $\partial_\nu y$ does not always hold, unless precise smoothness assumptions are imposed on the boundary of $\Omega$. Consequently, we can for the time being only guess that our result is the correct one. (Note, however, that later in this chapter a mathematically rigorous derivation of the optimality conditions will still be given.) Nevertheless, a careful application of the Lagrangian function yields, in any case, a convenient formulation of these conditions that is also easy to memorize.

Recommended course of study of the upcoming sections. In this chapter, we will introduce the notion of weak solutions to linear parabolic equations, prove existence and uniqueness of such solutions, and then investigate the questions originating from optimization theory.

At first, however, the spatially one-dimensional parabolic case will be treated using the Fourier technique. This method does not need the theory of weak solutions to parabolic equations as a prerequisite; it therefore benefits those readers who might prefer to defer the study of this theory until later. The Fourier method is also interesting in itself, since it is equivalent to the theory of strongly continuous semigroups. In addition, in this comparatively elementary way one can deduce the well-known bang-bang principle for optimal controls in the case of a pure final-value functional.

Readers who want to acquaint themselves immediately with the theory of weak solutions may omit Section 3.2 for the time being and continue with Section 3.3, for which Section 3.2 is not required.
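The formally derived role of the backward adjoint equation can be sanity-checked numerically in a simple setting. The following sketch is not taken from the text: it assumes the one-dimensional domain $\Omega = (0,1)$ with an insulated left end, the control acting only at $x = 1$, zero initial data, a lumped finite-volume discretization in space, and implicit Euler in time; all function names are hypothetical. It checks the duality $\int_\Omega v\, y(T)\, dx = \int_0^T \beta\, p(1,t)\, u(t)\, dt$, where $p$ solves $-p_t = p_{xx}$ backwards from $p(T) = v$. This is the identity that turns the final-value term of the cost into the expression $\beta p$ in the variational inequality.

```python
import numpy as np

def discretize(n, alpha):
    """Lumped mass weights m and stiffness matrix K on the grid x_i = i/n.

    K encodes -y_xx with y_x(0) = 0 and the Robin term alpha * y(1) at x = 1.
    """
    h = 1.0 / n
    m = np.full(n + 1, h)
    m[0] = m[-1] = h / 2.0
    K = (2.0 * np.eye(n + 1) - np.eye(n + 1, k=1) - np.eye(n + 1, k=-1)) / h
    K[0, 0] = 1.0 / h
    K[-1, -1] = 1.0 / h + alpha
    return m, K

def final_state(u, beta, m, K, dt):
    """Implicit Euler for y_t = y_xx, y(., 0) = 0, boundary flux beta*u at x = 1."""
    A = np.diag(m) + dt * K
    y = np.zeros_like(m)
    for k in range(len(u)):
        rhs = m * y
        rhs[-1] += dt * beta[k] * u[k]   # control enters through the last node
        y = np.linalg.solve(A, rhs)
    return y

def adjoint_trace(v, m, K, dt, steps):
    """Backward implicit Euler for -p_t = p_xx with p(., T) = v; returns p(1, t_k)."""
    A = np.diag(m) + dt * K
    p = v.copy()
    trace = np.empty(steps)
    for k in reversed(range(steps)):     # march from t = T down to t = 0
        p = np.linalg.solve(A, m * p)
        trace[k] = p[-1]
    return trace

def duality_check(n=20, steps=50, T=1.0, alpha=1.0, seed=0):
    """Return both sides of (v, y(T)) = integral of beta * p(1, t) * u(t) dt."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    m, K = discretize(n, alpha)
    u = rng.standard_normal(steps)
    beta = rng.uniform(0.5, 1.5, steps)
    v = rng.standard_normal(n + 1)
    lhs = np.sum(m * v * final_state(u, beta, m, K, dt))
    rhs = dt * np.sum(beta * adjoint_trace(v, m, K, dt, steps) * u)
    return lhs, rhs
```

With this particular discretization the backward scheme is the exact transpose of the forward one, so the two returned numbers agree up to rounding errors; with other discretizations one should only expect agreement up to the discretization error.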
3.2. Fourier's method in the spatially one-dimensional case

3.2.1. One-dimensional model problems.

A boundary control problem. To provide some physical background, we once more interpret the following control problems as heating problems. We consider the problem of heating the one-dimensional spatial domain $\Omega = (0,1)$ in an optimal way by means of a control $u = u(t)$ that acts at the right boundary point $x = 1$:

$$ \min J(y,u) := \frac12 \int_0^1 |y(x,T) - y_\Omega(x)|^2\, dx + \frac{\lambda}{2} \int_0^T |u(t)|^2\, dt, \tag{3.6} $$

subject to

$$
\begin{aligned}
y_t(x,t) &= y_{xx}(x,t) && \text{in } (0,1) \times (0,T) \\
y_x(0,t) &= 0 && \text{in } (0,T) \\
y_x(1,t) &= \beta(t)\, u(t) - \alpha\, y(1,t) && \text{in } (0,T) \\
y(x,0) &= 0 && \text{in } (0,1),
\end{aligned} \tag{3.7}
$$

and the control constraints

$$ u_a(t) \le u(t) \le u_b(t) \quad \text{for a.e. } t \in (0,T). \tag{3.8} $$

Assumption 3.1. Assume that we are given real numbers $T > 0$ (the period of heating), $\alpha \ge 0$ (heat transmission coefficient), and $\lambda \ge 0$, as well as functions $\beta \in L^\infty(0,T)$ with $\beta(t) \ge 0$ for almost every $t \in (0,T)$, $y_\Omega \in L^2(0,1)$ (the desired final temperature distribution), and $u_a, u_b \in L^2(0,T)$ such that $u_a(t) \le u_b(t)$ for almost every $t \in (0,T)$.

The control $u$ that we seek is supposed to belong to $L^2(0,T)$. Note that any admissible control $u$ is automatically essentially bounded if $u_a$ and $u_b$ are bounded and measurable.

From the physical viewpoint, we ought to have $\beta = \alpha$, since the physically correct form of the right-hand boundary condition is given by $y_x(1,t) = \alpha\,(u(t) - y(1,t))$ (the heat flux $y_x(1,t)$ at the boundary is proportional to the difference between the outside temperature $u(t)$ and the boundary temperature $y(1,t)$). However, the above form of the boundary condition allows for admitting a Neumann boundary condition (by choosing $\alpha = 0$). It also seems appropriate for purely mathematical reasons to decouple $\alpha$ and $\beta$.

Remarks. The above problem is merely academic from the physical viewpoint. Indeed, it is rather unrealistic to consider a one-dimensional spatial domain $\Omega$, i.e., a real interval. One could imagine the heating of a very thin beam of unit length which, except for the right endpoint $x = 1$, is completely isolated. The boundary condition $y_x = 0$ at the left endpoint $x = 0$ models isolation; it may also model a symmetry condition if the beam has length 2 (with left endpoint at $x = -1$ and right endpoint at $x = 1$) and the heating occurs with the same outside temperature $u(t)$ at both endpoints.

A somewhat more realistic interpretation is the heating of an infinitely long plate of unit thickness (with both plane surfaces assumed to be orthogonal to the $x$-axis) whose right surface is heated by $u(t)$ while the left surface is isolated. Analogously, we can think of a plate of thickness 2, where both left and right outside temperatures are given by $u(t)$.

Recall, however, that our main concern in this book is not the description of heat conduction processes; instead, we intend to explain the basic ideas governing optimal control problems. The easiest way to do this is to consider simple toy models. Also, the assumption of a homogeneous boundary temperature is due only to methodological considerations.

A problem involving a controllable heat source. Another physically meaningful situation is the control of a heat source distributed in the spatial domain. A typical example is the heating of metals by electromagnetic induction. For a change, we consider a different cost functional. Here, the aim is to follow a desired nonstationary temperature evolution $y_Q(x,t)$ in $Q = (0,1) \times (0,T)$, which is modeled in the following way:

$$ \min J(y,u) := \frac12 \int_0^T\!\!\int_0^1 |y(x,t) - y_Q(x,t)|^2\, dx\, dt + \frac{\lambda}{2} \int_0^T\!\!\int_0^1 |u(x,t)|^2\, dx\, dt, \tag{3.9} $$

subject to

$$
\begin{aligned}
y_t(x,t) &= y_{xx}(x,t) + u(x,t) && \text{in } (0,1) \times (0,T) \\
y_x(0,t) &= 0 && \text{in } (0,T) \\
y_x(1,t) + \alpha\, y(1,t) &= 0 && \text{in } (0,T) \\
y(x,0) &= 0 && \text{in } (0,1)
\end{aligned} \tag{3.10}
$$

and

$$ u_a(x,t) \le u(x,t) \le u_b(x,t) \quad \text{for a.e. } (x,t) \in Q. \tag{3.11} $$

Here, homogeneous Neumann boundary conditions are prescribed (to model isolation). Analogously, a fixed outside temperature could have been prescribed in the form of a boundary condition of the third kind. The given functions $y_Q$ and $u_a \le u_b$ are assumed to belong to $L^2(Q)$.
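To make the heat source model concrete, the tracking functional (3.9) can be evaluated for a given control by solving (3.10) numerically. The sketch below is only an illustration under assumptions that are not part of the text: a uniform grid $x_i = i/n$, a lumped finite-volume discretization in space, implicit Euler in time, and simple quadrature weights; the function names `heat_source_states` and `tracking_cost` are hypothetical.

```python
import numpy as np

def heat_source_states(u, alpha, T):
    """Approximate the state of (3.10) for a distributed source u.

    u has shape (steps, n + 1): source values at times t_1..t_steps on the
    grid x_i = i / n. Returns (y, m, dt) with y of shape (steps + 1, n + 1),
    y[0] = 0 (zero initial temperature), m the spatial quadrature weights.
    """
    steps, nodes = u.shape
    n = nodes - 1
    h, dt = 1.0 / n, T / steps
    m = np.full(nodes, h)
    m[0] = m[-1] = h / 2.0
    K = (2.0 * np.eye(nodes) - np.eye(nodes, k=1) - np.eye(nodes, k=-1)) / h
    K[0, 0] = 1.0 / h              # y_x(0, t) = 0
    K[-1, -1] = 1.0 / h + alpha    # y_x(1, t) + alpha * y(1, t) = 0
    A = np.diag(m) + dt * K
    y = np.zeros((steps + 1, nodes))
    for k in range(steps):
        y[k + 1] = np.linalg.solve(A, m * (y[k] + dt * u[k]))
    return y, m, dt

def tracking_cost(u, y_Q, lam, alpha, T):
    """Discrete analogue of J(y, u) in (3.9); y_Q has the same shape as u."""
    y, m, dt = heat_source_states(u, alpha, T)
    misfit = 0.5 * dt * np.sum(m * (y[1:] - y_Q) ** 2)
    penalty = 0.5 * lam * dt * np.sum(m * u ** 2)
    return misfit + penalty
```

A quick plausibility check: for $\alpha = 0$ and the constant source $u \equiv 1$ the rod is fully insulated, so the temperature rises uniformly and the scheme returns $y(x,t_k) = t_k$ up to rounding.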
3.2.2. Integral representation of solutions: Green's function. Under suitable assumptions, the solutions to linear parabolic initial-boundary value problems can be expressed by means of Fourier series, which are obtained by separation of variables. This theory is comparatively simple, since it does not need the theory of weak solutions. Unfortunately, its applicability is limited to geometrically simple spatial domains (e.g., in the case of $N > 1$ to cubes or balls; see, however, Glashoff and Weck [GW76]).

To fix things, consider the initial-boundary value problem for the one-dimensional heat equation

$$
\begin{aligned}
y_t(x,t) - y_{xx}(x,t) &= f(x,t) \\
y_x(0,t) &= 0 \\
y_x(1,t) + \alpha\, y(1,t) &= u(t) \\
y(x,0) &= y_0(x)
\end{aligned} \tag{3.12}
$$

in $Q = (0,1) \times (0,T)$, where $f \in L^2(Q)$, $y_0 \in L^2(0,1)$, $u \in L^2(0,T)$, and the constant $\alpha \ge 0$ are given. If the data $f$, $y_0$, and $u$ are sufficiently smooth, then there exists a unique classical solution $y$, which can be expressed by means of a Green's function $G = G(x,\xi,t)$ in the form

$$
\begin{aligned}
y(x,t) ={}& \int_0^1 G(x,\xi,t)\, y_0(\xi)\, d\xi + \int_0^t\!\!\int_0^1 G(x,\xi,t-s)\, f(\xi,s)\, d\xi\, ds \\
& + \int_0^t G(x,1,t-s)\, u(s)\, ds.
\end{aligned} \tag{3.13}
$$

In the physically interesting cases for the constant $\alpha$, $G$ has the form of a Fourier series:

$$
G(x,\xi,t) =
\begin{cases}
1 + 2 \displaystyle\sum_{n=1}^\infty \cos(n\pi x)\cos(n\pi \xi)\exp(-n^2\pi^2 t) & \text{for } \alpha = 0 \\[2ex]
\displaystyle\sum_{n=1}^\infty \frac{1}{N_n}\cos(\mu_n x)\cos(\mu_n \xi)\exp(-\mu_n^2 t) & \text{for } \alpha > 0.
\end{cases} \tag{3.14}
$$

Here, the $\mu_n > 0$ denote the solutions to the equation $\mu \tan\mu = \alpha$, ordered according to increasing magnitude, while $N_n = 1/2 + \sin(2\mu_n)/(4\mu_n)$ are normalizing factors. The numbers $n\pi$ and $\mu_n$ are the eigenvalues of the differential operator $\partial^2/\partial x^2$ combined with the homogeneous version of the boundary conditions formulated in (3.12); the functions $\cos(n\pi x)$ and $\cos(\mu_n x)$ are the corresponding eigenfunctions. The above series expansions will be derived in Section 3.8. We also refer to Tychonov and Samarski [TS64].

The Green's function is nonnegative and symmetric with respect to the variables $x$ and $\xi$. For $x = \xi$, it becomes singular at $t = 0$; more precisely, $G$ has a so-called weak singularity at $(x = \xi,\ t = 0)$. Moreover, we have $y \in L^2(Q)$ if $f \in L^2(Q)$, $y_0 \in L^2(0,1)$, and $u \in L^2(0,T)$. The corresponding estimation can be found in [Trö84b], where estimates for the Green's function from Friedman [Fri64] are employed.

Definition. We call the function $y$ given by (3.13) with square integrable functions $f$, $u$, and $y_0$ the generalized solution to (3.12).

Each of the three summands in (3.13) defines a linear operator. The following two cases are interesting for the discussion of the initial-boundary value problem (3.12):

(i) $u := \beta\, u$, $f = y_0 = 0$ (boundary control):
In this case, only the final distribution $y(x,T)$ occurs in the cost functional (3.6), and (3.13) simplifies to

$$ y(x,T) = \int_0^T G(x,1,T-s)\, \beta(s)\, u(s)\, ds =: (Su)(x). \tag{3.15} $$

The integral operator $S$ represents the part of the state $y$ that appears in the cost functional, here the final value $y(T)$. The operator $S$ maps $L^2(0,T)$ continuously into $L^2(0,1)$; see [Trö84b]. We therefore consider $S$ as a mapping between these spaces,

$$ S : L^2(0,T) \to L^2(0,1). $$

The above property also follows from the equivalence with weak solutions and their properties; see Theorem 3.13 on page 150.

(ii) $u = y_0 = 0$ (distributed control):
This case arises from the control of the heat source. From (3.13), we deduce the representation

$$ y(x,t) = \int_0^t\!\!\int_0^1 G(x,\xi,t-s)\, f(\xi,s)\, d\xi\, ds =: (Sf)(x,t). \tag{3.16} $$

$S$ is a linear and continuous mapping from the space $L^2(Q) = L^2((0,1) \times (0,T))$ into itself. In the following, we consider $S$ as an operator in $L^2(Q)$, even though it is actually continuous as a linear mapping from $L^2(Q)$ into
$C([0,T], L^2(0,1))$ (see [Trö84b]). The latter is a space of functions with values in a Banach space, which will be introduced in Section 3.4.1 on page 142. It expresses a certain continuity with respect to the variable $t$, which, in particular, ensures the convergence $y(x,t) \to 0$ as $t \downarrow 0$. We conclude that

$$ S : L^2(Q) \to L^2(Q). $$

These properties may also be deduced from Theorem 3.13 on page 150 concerning weak solutions.

3.2.3. Necessary optimality conditions. With the aid of the integral representations derived above, the spatially one-dimensional parabolic optimal control problems in Section 3.2.1 can be readily treated theoretically.

Boundary control problem. We first investigate problem (3.6)-(3.8), which can be rewritten in the form

$$ \min J(y,u) := \frac12\, \|y(T) - y_\Omega\|_{L^2(0,1)}^2 + \frac{\lambda}{2}\, \|u\|_{L^2(0,T)}^2, $$

subject to $u \in U_{ad}$ and

$$
\begin{aligned}
y_t(x,t) &= y_{xx}(x,t) \\
y_x(0,t) &= 0 \\
y_x(1,t) + \alpha\, y(1,t) &= \beta(t)\, u(t) \\
y(x,0) &= 0,
\end{aligned}
$$

where $U_{ad} = \{u \in L^2(0,T) : u_a(t) \le u(t) \le u_b(t) \ \text{for a.e. } t \in (0,T)\}$.

Substituting the integral representation (3.15) for $y(x,T)$, i.e., $y(\cdot,T) = Su$, in the cost functional, we obtain a quadratic optimization problem in a Hilbert space, namely

$$ \min_{u \in U_{ad}} f(u) := \frac12\, \|Su - y_\Omega\|_{L^2(0,1)}^2 + \frac{\lambda}{2}\, \|u\|_{L^2(0,T)}^2. $$

Since $S : L^2(0,T) \to L^2(0,1)$ is continuous, we infer from Theorem 2.14 on page 50 the existence of an optimal control $\bar u$ for the boundary control problem (3.6)-(3.8), which is unique if $\lambda > 0$. Let $\bar y$ denote the associated generalized solution. In view of Theorem 2.22 on page 64, the necessary and sufficient optimality condition is given by the variational inequality (2.45) on page 64, which in the present case takes the form

$$ \big(S^*(S\bar u - y_\Omega) + \lambda\, \bar u\,,\ u - \bar u\big)_{L^2(0,T)} \ge 0 \quad \forall u \in U_{ad}. \tag{3.17} $$

The variational inequality contains the adjoint operator $S^*$. From our experience with elliptic problems, we expect it to be closely connected to an adjoint differential equation. We now determine the form of the adjoint $S^*$ of the integral operator $S$, defined through the relation

$$ (v, Su)_{L^2(0,1)} = (S^*v, u)_{L^2(0,T)} \quad \forall u \in L^2(0,T),\ \forall v \in L^2(0,1). $$

Fixing arbitrary elements $u \in L^2(0,T)$ and $v \in L^2(0,1)$ and invoking Fubini's theorem, we find that

$$
\begin{aligned}
(v, Su)_{L^2(0,1)} &= \int_0^1 v(x) \Big( \int_0^T G(x,1,T-s)\, \beta(s)\, u(s)\, ds \Big) dx \\
&= \int_0^1\!\!\int_0^T u(s)\, \beta(s)\, G(x,1,T-s)\, v(x)\, ds\, dx \\
&= \int_0^T u(s) \Big( \beta(s) \int_0^1 G(x,1,T-s)\, v(x)\, dx \Big) ds \\
&= (u, S^*v)_{L^2(0,T)} = (S^*v, u)_{L^2(0,T)}.
\end{aligned}
$$

We thus have

$$ (S^*v)(t) = \beta(t) \int_0^1 G(\xi,1,T-t)\, v(\xi)\, d\xi. \tag{3.18} $$

Lemma 3.2. $(S^*v)(t) = \beta(t)\, p(1,t)$, where $p$ is the generalized solution to the parabolic final value problem

$$
\begin{aligned}
-p_t(x,t) &= p_{xx}(x,t) \\
p_x(0,t) &= 0 \\
p_x(1,t) + \alpha\, p(1,t) &= 0 \\
p(x,T) &= v(x).
\end{aligned}
$$

Proof: Owing to the symmetry of the Green's function, it follows from equation (3.18) that

$$ (S^*v)(t) = \beta(t) \int_0^1 G(1,\xi,T-t)\, v(\xi)\, d\xi. $$

We now introduce the functions

$$ p(x,t) := \int_0^1 G(x,\xi,T-t)\, v(\xi)\, d\xi $$

and, with the time transformation $\tau = T - t$,

$$ \tilde p(x,\tau) := p(x,t) = p(x,T-\tau) = \int_0^1 G(x,\xi,\tau)\, v(\xi)\, d\xi. $$
Recalling (3.13), we see that $\tilde p$ solves the initial-boundary value problem

$$
\begin{aligned}
\tilde p_\tau(x,\tau) &= \tilde p_{xx}(x,\tau) \\
\tilde p_x(0,\tau) &= 0 \\
\tilde p_x(1,\tau) + \alpha\, \tilde p(1,\tau) &= 0 \\
\tilde p(x,0) &= v(x).
\end{aligned}
$$

The assertion now follows from the substitution $\tilde p(x,\tau) = p(x,t)$, using the fact that

$$ D_\tau \tilde p(x,\tau) = D_\tau\, p(x,T-\tau) = -D_t\, p(x,t). $$

□

Remark. The equation for $p$ runs backwards in time. It is, however, well posed, since a final condition is prescribed and not an initial condition; if an initial condition were prescribed instead, we would have the typical case of an ill-posed backward parabolic equation known from the theory of inverse problems.

We are now in a position to state the necessary (and, owing to the convexity, also sufficient) optimality conditions.

Theorem 3.3. A control $\bar u \in U_{ad}$ with associated state $\bar y$ is optimal for the one-dimensional boundary control problem (3.6)-(3.8) if and only if it satisfies the variational inequality

$$ \int_0^T \big(\beta(t)\, p(1,t) + \lambda\, \bar u(t)\big)\big(u(t) - \bar u(t)\big)\, dt \ge 0 \quad \forall u \in U_{ad}, \tag{3.19} $$

where $p \in L^2(Q)$ is the generalized solution to the adjoint equation

$$
\begin{aligned}
-p_t(x,t) &= p_{xx}(x,t) \\
p_x(0,t) &= 0 \\
p_x(1,t) + \alpha\, p(1,t) &= 0 \\
p(x,T) &= \bar y(x,T) - y_\Omega(x).
\end{aligned} \tag{3.20}
$$

Proof: The assertion follows directly from inserting the representation of $S^*$ from Lemma 3.2, with $v := \bar y(T) - y_\Omega$, into the variational inequality (3.17). □

The function $p$ is called the adjoint state associated with $(\bar u, \bar y)$. In analogy to Lemma 2.26 on page 69, we have the following result.

Lemma 3.4. The variational inequality (3.19) holds if and only if for almost every $t \in [0,T]$ the following variational inequality in $\mathbb{R}$ is satisfied:

$$ \big(\beta(t)\, p(1,t) + \lambda\, \bar u(t)\big)\big(v - \bar u(t)\big) \ge 0 \quad \forall v \in [u_a(t), u_b(t)]. \tag{3.21} $$

Conclusion. If $\lambda > 0$, then for almost every $t \in (0,T)$ the optimal control satisfies the projection relation

$$ \bar u(t) = P_{[u_a(t),u_b(t)]}\Big\{ -\frac{1}{\lambda}\, \beta(t)\, p(1,t) \Big\}. $$

If $\lambda = 0$, then for all points $t$ such that $\beta(t)\, p(1,t) \ne 0$, $\bar u(t)$ is determined by

$$
\bar u(t) =
\begin{cases}
u_a(t) & \text{if } \beta(t)\, p(1,t) > 0 \\
u_b(t) & \text{if } \beta(t)\, p(1,t) < 0.
\end{cases}
$$

The proof of this result proceeds similarly to that in the elliptic case; see page 70.

Distributed control. The problem (3.9)-(3.11) on page 125 can be treated similarly. The existence of an optimal control $\bar u$ follows from Theorem 2.14 on page 50. Since in this case $S$ is injective, we have uniqueness. For application of the necessary optimality condition (2.45) on page 64, we again have to determine the adjoint operator $S^*$ in $L^2(Q)$. Proceeding as in the case of the boundary control, we find that

$$
\begin{aligned}
(v, Su)_{L^2(Q)} &= \int_0^T\!\!\int_0^1 \Big( \int_0^t\!\!\int_0^1 G(x,\xi,t-s)\, u(\xi,s)\, d\xi\, ds \Big) v(x,t)\, dx\, dt \\
&= \int_0^T\!\!\int_0^1 u(x,t) \Big( \int_t^T\!\!\int_0^1 G(\xi,x,s-t)\, v(\xi,s)\, d\xi\, ds \Big) dx\, dt \\
&= (u, S^*v)_{L^2(Q)},
\end{aligned}
$$

so that

$$ (S^*v)(x,t) = \int_t^T\!\!\int_0^1 G(\xi,x,s-t)\, v(\xi,s)\, d\xi\, ds. $$

Lemma 3.5. The function

$$ p(x,t) = \int_t^T\!\!\int_0^1 G(\xi,x,s-t)\, v(\xi,s)\, d\xi\, ds $$
is the generalized solution to

$$
\begin{aligned}
-p_t(x,t) &= p_{xx}(x,t) + v(x,t) \\
p_x(0,t) &= 0 \\
p_x(1,t) + \alpha\, p(1,t) &= 0 \\
p(x,T) &= 0.
\end{aligned}
$$

Proof: We use the Green's function (3.14) and substitute $\tau = T - t$ and $\sigma = T - s$. Then the integration variable $\sigma$ runs from $T - t = \tau$ to $0$. In addition, $d\sigma = -ds$. We thus obtain

$$
\begin{aligned}
p(x,T-\tau) &= -\int_\tau^0\!\!\int_0^1 G(\xi,x,\tau-\sigma)\, v(\xi,T-\sigma)\, d\xi\, d\sigma \\
&= \int_0^\tau\!\!\int_0^1 G(\xi,x,\tau-\sigma)\, v(\xi,T-\sigma)\, d\xi\, d\sigma =: \tilde p(x,\tau).
\end{aligned}
$$

In view of equation (3.13), $\tilde p$ is the generalized solution to the (forward) initial value problem $\tilde p_\tau = \tilde p_{xx} + v(x,T-\tau)$, $\tilde p(x,0) = 0$, with homogeneous boundary conditions. Since $\tilde p_\tau = -p_t(x,T-\tau) = -p_t(x,t)$, the assertion of the lemma follows. □

Summarizing, we have proved the following necessary optimality conditions:

Theorem 3.6. A control $\bar u \in U_{ad}$ with associated state $\bar y$ is optimal for the optimal heat source problem (3.9)-(3.11) on page 125 if and only if with the solution $p \in L^2(Q)$ to the adjoint equation

$$
\begin{aligned}
-p_t(x,t) &= p_{xx}(x,t) + \bar y(x,t) - y_Q(x,t) \\
p_x(0,t) &= 0 \\
p_x(1,t) + \alpha\, p(1,t) &= 0 \\
p(x,T) &= 0
\end{aligned}
$$

we have the variational inequality

$$ \iint_Q (p + \lambda\, \bar u)(u - \bar u)\, dx\, dt \ge 0 \quad \forall u \in U_{ad} $$

or, equivalently, the pointwise relation

$$ \big(p(x,t) + \lambda\, \bar u(x,t)\big)\big(v - \bar u(x,t)\big) \ge 0 \quad \forall v \in [u_a(x,t), u_b(x,t)] $$

for almost every $(x,t) \in Q$.

As in the elliptic case, explicit expressions for $\bar u$ can be derived from the above theorem: if $\lambda > 0$, then the optimal control has to obey the projection formula

$$ \bar u(x,t) = P_{[u_a(x,t),u_b(x,t)]}\Big\{ -\frac{1}{\lambda}\, p(x,t) \Big\} \quad \text{for a.e. } (x,t) \in Q, $$

while for $\lambda = 0$ we have, almost everywhere in $Q$,

$$
\bar u(x,t) =
\begin{cases}
u_a(x,t) & \text{if } p(x,t) > 0 \\
u_b(x,t) & \text{if } p(x,t) < 0.
\end{cases} \tag{3.22}
$$

3.2.4. The bang-bang principle. We now reconsider the optimal boundary control problem (3.6)-(3.8), but this time without regularization, that is, for $\lambda = 0$. It turns out that the missing regularizing term is reflected in a lower regularity of the optimal control. For simplicity, we choose $u_a = -1$, $u_b = 1$, and $\beta(t) \equiv 1$; in addition, let $\alpha \ge 0$ and $y_\Omega \in L^2(0,1)$ be given. We consider the problem

$$ \min \frac12 \int_0^1 |y(x,T) - y_\Omega(x)|^2\, dx, $$

subject to

$$
\begin{aligned}
y_t(x,t) &= y_{xx}(x,t) && \text{in } (0,1) \times (0,T) \\
y_x(0,t) &= 0 && \text{in } (0,T) \\
y_x(1,t) &= u(t) - \alpha\, y(1,t) && \text{in } (0,T) \\
y(x,0) &= 0 && \text{in } (0,1)
\end{aligned}
$$

and

$$ |u(t)| \le 1 \quad \text{in } (0,T). $$

Owing to (3.22), we have

$$
\bar u(t) =
\begin{cases}
\phantom{-}1 & \text{if } p(1,t) < 0 \\
-1 & \text{if } p(1,t) > 0.
\end{cases}
$$

This relation does not yield any information for points $t$ where $p(1,t) = 0$. However, it turns out that for problems of the above type this can only occur at isolated points if the optimal value is positive.

Theorem 3.7 (Bang-bang principle). Let $\bar u$ be an optimal control of the boundary control problem (3.6)-(3.8) on page 124, where $\lambda = 0$, $u_a = -1$, $u_b = 1$, and $\beta \equiv 1$. If

$$ \|\bar y(\cdot,T) - y_\Omega\|_{L^2(0,1)} > 0, $$

then the function $p(1,t)$ has at most countably many zeros that can accumulate only at $t = T$. We thus have

$$ |\bar u(t)| = 1 \quad \text{for a.e. } t \in (0,T), $$
and $\bar u$ is piecewise equal to $\pm 1$, with at most countably many switching points at the zeros of $p(1,t)$.

[Figure omitted: plot over $t$ with values between $-1$ and $1$.]

Proof: We assume $\alpha > 0$, so that the second expression for $G$ in (3.14) applies. The case where $\alpha = 0$ can be treated by analogous reasoning.

Let $d := \bar y(\cdot,T) - y_\Omega$. Then, by assumption, $\|d\|_{L^2(0,1)} \ne 0$. In view of (3.18) and Theorem 3.3, for $x = 1$ and $0 \le t < T$ the adjoint state $p$ has the form

$$
\begin{aligned}
p(1,t) = (S^*d)(t) &= \int_0^1 G(\xi,1,T-t)\, d(\xi)\, d\xi \\
&= \sum_{n=1}^\infty \frac{1}{\sqrt{N_n}} \cos(\mu_n) \exp\big(-\mu_n^2 (T-t)\big) \underbrace{\int_0^1 \frac{1}{\sqrt{N_n}} \cos(\mu_n \xi)\, d(\xi)\, d\xi}_{=:\, d_n}.
\end{aligned}
$$

Here, $d_n$ denotes the $n$th Fourier coefficient of $d$ with respect to the orthonormal system formed by the normalized eigenfunctions $\cos(\mu_n x)/\sqrt{N_n}$; see page 127. Owing to Bessel's inequality, the sequence $\{d_n\}_{n=1}^\infty$ is square summable. Moreover, the $\mu_n$ behave asymptotically like $(n-1)\pi$ for $n \to \infty$. Since for $0 \le t < T$ the terms $\exp(-\mu_n^2(T-t))$ and all their derivatives with respect to $t$ decay exponentially fast as $n \to \infty$, we can infer that the above series is infinitely differentiable with respect to $t$ in $[0,T)$; hence, the complex extension of the series,

$$ \varphi(z) := \sum_{n=1}^\infty \frac{1}{\sqrt{N_n}} \cos(\mu_n) \exp\big(-\mu_n^2 (T-z)\big)\, d_n, $$

defines an analytic function of the complex variable $z$ in the half plane $\{\operatorname{Re}(z) < T\}$. We have to distinguish two different cases:

(i) $\varphi(z) \not\equiv 0$: By the identity theorem for analytic functions, $\varphi$ can have at most finitely many zeros in any compact subset of the half plane $\{\operatorname{Re}(z) < T\}$ and, in particular, in any real interval $[0, T-\varepsilon]$ with $0 < \varepsilon < T$.

(ii) $\varphi(z) \equiv 0$: In this case $p(1,t) \equiv 0$ in $(-\infty, T)$, that is,

$$ \sum_{n=1}^\infty \frac{1}{\sqrt{N_n}} \cos(\mu_n) \exp\big(-\mu_n^2 (T-t)\big)\, d_n = 0 \quad \forall t < T. $$

Multiplying this equation by $\exp(\mu_1^2 (T-t))$, we see that

$$ \frac{1}{\sqrt{N_1}} \cos(\mu_1)\, d_1 + \sum_{n=2}^\infty \frac{1}{\sqrt{N_n}} \cos(\mu_n) \exp\big(-(\mu_n^2 - \mu_1^2)(T-t)\big)\, d_n = 0. $$

Letting $t \to -\infty$, we find that $d_1 = 0$. Repeating this process inductively, we deduce that $d_n = 0$ for all $n \in \mathbb{N}$. In conclusion, we must have $d = 0$, which contradicts the assumption. Therefore, case (ii) cannot occur, and the assertion is proved. □

Conclusion. Under the assumptions of the bang-bang principle ($\lambda = 0$, positive optimal value) the optimal control is uniquely determined.

Proof: Let $\bar u_1$ and $\bar u_2$ be two different optimal controls with associated optimal states $\bar y_1$ and $\bar y_2$, and let $j$ denote the optimal value for $\|y(T) - y_\Omega\|$. We claim that then $u := (\bar u_1 + \bar u_2)/2$ is also optimal. Indeed, $u$ is admissible since $U_{ad}$ is convex, and the corresponding state is evidently $y = (\bar y_1 + \bar y_2)/2$. On the other hand, the triangle inequality yields that

$$ \|y(T) - y_\Omega\|_{L^2(0,1)} \le \frac12\, \|\bar y_1(T) - y_\Omega\|_{L^2(0,1)} + \frac12\, \|\bar y_2(T) - y_\Omega\|_{L^2(0,1)} = j, $$

which obviously implies that

$$ \|y(T) - y_\Omega\|_{L^2(0,1)} = j. $$

Now, owing to the last theorem, $\bar u_1$ and $\bar u_2$ are bang-bang solutions. But then, $(\bar u_1 + \bar u_2)/2$ cannot be a bang-bang solution, which contradicts the above theorem. □

Example. Consider the following problem due to Schittkowski [Sch79]:

$$ \min \frac12 \int_0^1 |y(x,T) - y_\Omega(x)|^2\, dx, $$

subject to

$$
\begin{aligned}
y_t(x,t) &= y_{xx}(x,t) \\
y_x(0,t) &= 0 \\
y_x(1,t) &= u(t) - y(1,t) \\
y(x,0) &= 0
\end{aligned}
$$

and

$$-1 \le u(t) \le 1.$$

With the choices $y_\Omega(x) = \tfrac12 (1 - x^2)$ and $T = 1.58$, one obtains a numerical solution that has a switching point at $t = 1.329$; see the figure below.

[Figure: Computed bang-bang control ($u$ over $t \in [0,T]$, taking the values $1$ and $-1$).]

By combining numerical calculations with careful estimates for the Fourier series, it can be shown that the optimal value is positive and that the optimal control has exactly one switching point in $[1.329, 1.3294]$; see [DT09].

References. Further results concerning the bang-bang principle in the control theory of parabolic equations are collected in [GS80]; see also [GS77], [GW76], [Kno77], [Sac78], [Sch80], [Sch89], and [Trö84b], to name just a few of the numerous contributions. In [Kar77], it was proved that if the maximum norm is used in place of the $L^2$ norm, then the optimal control can have only finitely many switching points for a positive optimal value. Numerical applications, using the calculation of switching points, have been reported in, e.g., [ET86], [Mac83b] and, for the case of mixed control-state constraints, [Trö84a].

Green's functions may also be used in higher dimensions. We do not pursue this here, and refer the interested reader to [GW76] or, for the application of the integral equation to semilinear equations, to [Trö84b]. A generalization of this concept is the approach via strongly continuous semigroups; detailed expositions of their use in control theory have been given in [BDPDM92], [BDPDM93], and [Fat99].

3.3. Weak solutions in $W_2^{1,0}(Q)$

We consider as a typical case the parabolic analogue to problem (2.11) on page 34,

$$\begin{aligned} y_t - \Delta y + c_0\, y &= f && \text{in } Q = \Omega \times (0,T) \\ \partial_\nu y + \alpha\, y &= g && \text{on } \Sigma = \Gamma \times (0,T) \\ y(\cdot,0) &= y_0(\cdot) && \text{in } \Omega, \end{aligned} \tag{3.23}$$

confining ourselves from the beginning to boundary conditions of the third kind. It does not present any problem to split the boundary $\Gamma$ into disjoint pieces $\Gamma_0$ and $\Gamma_1$ and prescribe homogeneous Dirichlet data on $\Gamma_0$; see (2.11) on page 34. The case of inhomogeneous nonsmooth Dirichlet data is more difficult to handle. We do not delve into this case here; boundary value problems of this type are treated, e.g., in [Lio71] or, using the theory of strongly continuous semigroups, in [BDPDM92], [BDPDM93], and [Fat99]. An approximation approach via weak solutions with Robin boundary conditions can be found in [AEFR00] and [BBEFR03].

We make the following assumptions:

Assumption 3.8. Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain with boundary $\Gamma$, and let $T > 0$ denote a fixed final time. Moreover, assume that functions $c_0 \in L^\infty(Q)$ and $\alpha \in L^\infty(\Sigma)$, where $\alpha(x,t) \ge 0$ for almost every $(x,t) \in \Sigma$, are prescribed.

The functions $c_0$, $\alpha$, $f$, and $g$ all depend on $(x,t)$; we assume that $f \in L^2(Q)$, $g \in L^2(\Sigma)$, and $y_0 \in L^2(\Omega)$.

Let us briefly discuss the difficulties we have to face when trying to define an appropriate notion of a solution to the parabolic problem (3.23). From a classical solution $y = y(x,t)$, one would expect the existence of all derivatives that appear, as well as their continuity in the interior of the space-time cylinder $Q = \Omega \times (0,T)$, i.e., $y \in C^{2,1}(Q)$. This requirement is much too restrictive for optimal control problems in which the controls belong to $L^2$ spaces. Recall that in the elliptic case we merely required the existence of the weak first-order spatial partial derivatives $D_i = \partial/\partial x_i$; the boundary value problem was brought into a variational form in which half of the derivatives were shifted to the test function $v = v(x)$.

For parabolic problems, we follow a similar approach at first. Again, we introduce a variational formulation that requires the existence of the weak first-order partial derivatives $D_i = \partial/\partial x_i$ only; the remaining half of the spatial derivatives will be shifted to the test function $v = v(x,t)$.

In addition, we have to account for the time $t$. With respect to $t$, also only the weak derivative comes into question. There are two possibilities: either we postulate the existence of the weak time derivative of $y$ (which then is not needed for the test function $v$) or we do not (in which case the weak

time derivative is shifted to $v$). In most situations, the requirement that $y_t$ belong to a space of functions, for instance $y_t \in L^2(Q)$, is too restrictive. Hence, initially, only the second possibility remains. This, however, is a source of asymmetry in the treatment of $y$ and of $v$ that renders the control theory more difficult. Eventually, we will obtain the existence of $y_t$, but as a functional and not as a function.

To begin with, we introduce two function spaces that are commonly used for the treatment of parabolic initial-boundary value problems (cf. the standard textbook [LSU68]).

Definition. We denote by $W_2^{1,0}(Q)$ the normed space of all (equivalence classes of) functions $y \in L^2(Q)$ having weak first-order partial derivatives with respect to $x_1, \ldots, x_N$ in $L^2(Q)$, endowed with the norm

$$\|y\|_{W_2^{1,0}(Q)} = \Big( \int_0^T\!\!\int_\Omega \big( |y(x,t)|^2 + |\nabla y(x,t)|^2 \big)\,dx\,dt \Big)^{1/2}.$$

Here, $\nabla$ stands for the gradient with respect to $x$, i.e., $\nabla = \nabla_x$. We have

$$W_2^{1,0}(Q) = \{ y \in L^2(Q) : D_i y \in L^2(Q) \ \ \forall\, i = 1, \ldots, N \}.$$

The space $W_2^{1,0}(Q)$ is often referred to in the literature as $H^{1,0}(Q)$; it coincides with the space $L^2(0,T;H^1(\Omega))$ that will be introduced in Section 3.4.1. One should note that in the notation $W_2^{1,0}(Q)$, the left and right upper indices indicate the order of the derivatives with respect to $x$ and $t$, respectively, while the lower index indicates the order of integrability. In contrast to this, the second upper index in $W^{k,p}(\Omega)$ reflects the order of integrability. However, context and the different domains $Q$ and $\Omega$ ought to prevent any confusion.

The elements of $W_2^{1,0}(Q)$ possess all first-order spatial derivatives in weak form, that is, there are functions $w_i \in L^2(Q)$ such that

$$\iint_Q y(x,t)\, D_i v(x,t)\,dx\,dt = - \iint_Q w_i(x,t)\, v(x,t)\,dx\,dt \quad \forall v \in C_0^\infty(Q).$$

Then, we put $D_i y(x,t) := w_i(x,t)$. Note that $W_2^{1,0}(Q)$ becomes a Hilbert space when endowed with the natural scalar product; see Ladyzhenskaya et al. [LSU68].

Definition. The space $W_2^{1,1}(Q)$, defined by

$$W_2^{1,1}(Q) = \{ y \in L^2(Q) : y_t \in L^2(Q) \text{ and } D_i y \in L^2(Q) \ \ \forall\, i = 1, \ldots, N \},$$

is a normed space with the norm

$$\|y\|_{W_2^{1,1}(Q)} = \Big( \int_0^T\!\!\int_\Omega \big( |y(x,t)|^2 + |\nabla y(x,t)|^2 + |y_t(x,t)|^2 \big)\,dx\,dt \Big)^{1/2}.$$

Here, $\nabla := \nabla_x$ is again the gradient with respect to $x$.

$W_2^{1,1}(Q)$ also becomes a Hilbert space when endowed with the natural scalar product. Note that this space coincides with $H^1(Q)$. Its elements have, apart from the weak partial derivatives with respect to the $x_i$, also a weak partial derivative with respect to $t$; that is, there is a function $w \in L^2(Q)$, denoted by $w = y_t$, such that

$$\iint_Q y(x,t)\, v_t(x,t)\,dx\,dt = - \iint_Q w(x,t)\, v(x,t)\,dx\,dt \quad \forall v \in C_0^\infty(Q).$$

We now transform problem (3.23) into a variational formulation by multiplying the parabolic partial differential equation by a test function $v \in C^1(\bar Q)$ and then integrating over $Q$. Since we do not yet know what regularity can be expected from a "solution" $y$ to (3.23), we argue formally, assuming that $y$ is a classical solution for which all the integrals appearing below exist; in particular, $y$ is assumed to be continuous on $\bar Q$. However, eventually the variational formulation should also be meaningful if we merely have $y \in W_2^{1,0}(Q)$, the space to which $y$ should belong as a weak solution. Upon integrating over $Q$ and by parts, we obtain for all $v \in C^1(\bar Q)$ that

$$\begin{aligned} &\int_0^T\!\!\int_\Omega y_t\, v\,dx\,dt - \int_0^T\!\!\int_\Omega \Delta y\, v\,dx\,dt + \int_0^T\!\!\int_\Omega c_0\, y\, v\,dx\,dt \\ &\quad = \int_\Omega y(x,t)\, v(x,t)\,dx\,\Big|_0^T - \iint_Q \big( y\, v_t - \nabla y \cdot \nabla v - c_0\, y\, v \big)\,dx\,dt - \iint_\Sigma v\, \partial_\nu y\,ds\,dt \\ &\quad = \iint_Q f\, v\,dx\,dt. \end{aligned} \tag{3.24}$$

In this formulation, $y(x,0)$ and $y(x,T)$ occur. These values are not necessarily defined, since functions $y \in W_2^{1,0}(Q)$ need not be continuous in $t$. While the given initial value $y_0(x)$ can be inserted for $y(x,0)$, the final value $y(x,T)$ cannot be eliminated so easily. The test function $v = v(x,t)$ is smoother; it belongs to $C^1(\bar Q)$. But the same operations as above can be performed even if merely $v \in W_2^{1,1}(Q)$; in particular, the functions $v(\cdot,0)$ and $v(\cdot,T)$ are for all $v \in W_2^{1,1}(Q)$ well defined as traces in $L^2(\Omega)$; see [LSU68]. It therefore makes sense to require that $v(x,T) = 0$ in order to get rid of the

term $y(x,T)$. Hence, substituting the boundary condition $\partial_\nu y = g - \alpha\, y$, we deduce that for all $v \in W_2^{1,1}(Q)$ with $v(x,T) = 0$,

$$\iint_Q \big( -y\, v_t + \nabla y \cdot \nabla v + c_0\, y\, v \big)\,dx\,dt + \iint_\Sigma \alpha\, y\, v\,ds\,dt = \iint_Q f\, v\,dx\,dt + \iint_\Sigma g\, v\,ds\,dt + \int_\Omega y_0\, v(\cdot,0)\,dx. \tag{3.25}$$

We thus arrive at the following definition.

Definition. We call a function $y \in W_2^{1,0}(Q)$ a weak solution to the initial-boundary value problem (3.23) if the variational equality (3.25) is satisfied for all $v \in W_2^{1,1}(Q)$ such that $v(\cdot,T) = 0$.

Remark. Evidently, all terms occurring in (3.25) are well defined if $y \in W_2^{1,0}(Q)$ and $v \in W_2^{1,1}(Q)$. However, so far the regularity of $y$ does not permit us to conclude the existence of the initial value $y(\cdot,0)$.

Theorem 3.9. Suppose that Assumption 3.8 holds. Then the parabolic initial-boundary value problem (3.23) has a unique weak solution in $W_2^{1,0}(Q)$. Moreover, there is a constant $c_p > 0$, which is independent of $f$, $g$, and $y_0$, such that

$$\max_{t \in [0,T]} \|y(\cdot,t)\|_{L^2(\Omega)} + \|y\|_{W_2^{1,0}(Q)} \le c_p \big( \|f\|_{L^2(Q)} + \|g\|_{L^2(\Sigma)} + \|y_0\|_{L^2(\Omega)} \big) \tag{3.26}$$

for all $f \in L^2(Q)$, $g \in L^2(\Sigma)$, and $y_0 \in L^2(\Omega)$.

The above result is a special case of the more general Theorem 7.9 in Section 7.3.1. It ensures, in particular, that $y$ is a continuous mapping from $[0,T]$ into $L^2(\Omega)$, that is, $y \in C([0,T], L^2(\Omega))$ (the latter space will be introduced in the next section). Therefore, the norm $\max_{t \in [0,T]} \|y(\cdot,t)\|_{L^2(\Omega)}$ and the initial and final values $y(\cdot,0)$ and $y(\cdot,T)$ are defined, and the initial condition $y(\cdot,0) = y_0$ is satisfied.

Conclusion. The linear mapping $(f, g, y_0) \mapsto y$ is a continuous operator from $L^2(Q) \times L^2(\Sigma) \times L^2(\Omega)$ into $W_2^{1,0}(Q)$ and into $L^2(0,T;H^1(\Omega)) \cap C([0,T], L^2(\Omega))$.

The variational formulation (3.25) has a severe disadvantage: the test function $v$ has to belong to the space $W_2^{1,1}(Q)$ and satisfy $v(\cdot,T) = 0$, and eventually we should insert the adjoint state $p$ in place of $v$. As a rule, however, $p$ neither belongs to $W_2^{1,1}(Q)$ nor needs to satisfy $p(\cdot,T) = 0$. Hence, the asymmetry between the requirements on $y$ and the requirements on the test function $v$ is a disturbing factor in the theory of optimal control, and a different approach is needed. Summarizing, we conclude that the space $W_2^{1,0}(Q)$ is not well suited for the study of optimal control problems.

3.4. Weak solutions in $W(0,T)$

3.4.1. Functions with values in Banach spaces. The concept of functions with values in Banach spaces is a fundamental tool for the treatment of nonstationary equations (evolution equations). Here, we consider only functions that are defined on a compact interval $[a,b] \subset \mathbb{R}$. We will call such functions vector-valued functions in this textbook; the term abstract functions is also commonly used in the relevant literature.

Definition. Any mapping from $[a,b] \subset \mathbb{R}$ into a Banach space $X$ is called a vector-valued function.

Examples. Depending on the choice of the space $X$, we have the following special cases:

(i) $X = \mathbb{R}$. Then a vector-valued function $y : [a,b] \to \mathbb{R}$ is just a real-valued function of one variable.

(ii) $X = \mathbb{R}^N$. Then $y : [a,b] \to \mathbb{R}^N$ is a function that assigns to the variable $t$ a vector $y(t) = (y_1(t), \ldots, y_N(t))^\top \in \mathbb{R}^N$.

(iii) $X = H^1(\Omega)$. In this case, for any $t \in [a,b]$ the value $y(t)$ of a function $y : [a,b] \to H^1(\Omega)$ is an element of $H^1(\Omega)$ and hence a function itself that acts on the spatial variable $x \in \Omega$. In other words, $y(t) = y(\cdot,t) \in H^1(\Omega)$ for any fixed $t \in [a,b]$, that is, the function $x \mapsto y(x,t)$ belongs to $H^1(\Omega)$.

The solution $y \in W_2^{1,0}(Q)$ to our parabolic initial-boundary value problem is of this type. ◊

We are now going to introduce some important spaces of vector-valued functions.
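The vector-valued point of view in example (iii) can be made concrete with a small sketch. It is an illustration only, under assumptions that are not taken from the text: the function $y(x,t) = e^{-\pi^2 t}\cos(\pi x)$ on $\Omega = (0,1)$ (which solves $y_t = y_{xx}$ with $y_x(0,t) = y_x(1,t) = 0$), the helper names `y` and `l2_norm`, and the midpoint quadrature are all hypothetical choices. The point is that `y(t)` is itself a function of $x$, whose $L^2(0,1)$ norm can be evaluated for each fixed $t$.

```python
import math

def y(t):
    """Vector-valued function: maps a time t to the spatial profile y(., t).

    Hypothetical example: y(x, t) = exp(-pi^2 t) * cos(pi x) on (0, 1).
    """
    def profile(x):
        return math.exp(-math.pi ** 2 * t) * math.cos(math.pi * x)
    return profile

def l2_norm(g, n=2000):
    """Midpoint-rule approximation of the L^2(0, 1) norm of g."""
    h = 1.0 / n
    return math.sqrt(sum(g((i + 0.5) * h) ** 2 for i in range(n)) * h)

# For each fixed t, y(t) is an element of a function space over (0, 1).
times = [0.0, 0.05, 0.1]
norms = [l2_norm(y(t)) for t in times]
for t, value in zip(times, norms):
    print(f"t = {t:.2f}   ||y(t)||_L2 = {value:.6f}")
```

For this particular choice the exact value is $\|y(t)\|_{L^2(0,1)} = e^{-\pi^2 t}/\sqrt{2}$, so the printed norms decrease in $t$ and their maximum over $[0,T]$ is attained at $t = 0$.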

Definition. Let $\{X, \|\cdot\|_X\}$ be a real Banach space. We say that a vector-valued function $y : [a,b] \to X$ is continuous at the point $t \in [a,b]$ if we have $\lim_{\tau \to t} \|y(\tau) - y(t)\|_X = 0$. We denote the space of all vector-valued functions that are continuous at every $t \in [a,b]$ by $C([a,b],X)$. The space $C([a,b],X)$ is a Banach space with respect to the norm

$$\|y\|_{C([a,b],X)} = \max_{t \in [a,b]} \|y(t)\|_X.$$

Example. The space $C([0,T], L^2(\Omega))$ consists of all real-valued functions $y = y(x,t)$ that are measurable in $\Omega \times [0,T]$, square integrable with respect to $x \in \Omega$ for every $t \in [0,T]$, and continuous in $t$ with respect to the norm of $L^2(\Omega)$. For any $t \in [0,T]$ we have

$$\int_\Omega |y(x,t)|^2\,dx < \infty,$$

and for $\tau \to t$ we have

$$\|y(\tau) - y(t)\|_{L^2(\Omega)} = \Big( \int_\Omega |y(x,\tau) - y(x,t)|^2\,dx \Big)^{1/2} \to 0.$$

The norm of $y$ is given by

$$\|y\|_{C([0,T], L^2(\Omega))} = \max_{t \in [0,T]} \|y(t)\|_{L^2(\Omega)} = \max_{t \in [0,T]} \Big( \int_\Omega |y(x,t)|^2\,dx \Big)^{1/2}. \qquad ◊$$

For functions $y \in C([a,b],X)$, the Riemann integral $I = \int_a^b y(t)\,dt$ can be defined in much the same way as in the cases where $X = \mathbb{R}$ or $X = \mathbb{R}^N$: we define the integral as the limit of Riemann sums

$$\sum_{i=1}^{k} y(\xi_i)\,(t_i - t_{i-1}),$$

with arbitrary intermediate points $\xi_i \in [t_{i-1}, t_i]$, as the step size of the subdivision $a = t_0 < t_1 < \ldots < t_k = b$ tends to zero. Note that the integral $I$ is an element of the space $X$; see Hille and Phillips [HP57].

Definition. A vector-valued function $y : [a,b] \to X$ is called a step function if there are finitely many $y_i \in X$ and Lebesgue measurable, pairwise disjoint sets $M_i \subset [a,b]$, $1 \le i \le m$, such that $[a,b] = \bigcup_{i=1}^{m} M_i$ and $y(t) = y_i$ for every $t \in M_i$, $1 \le i \le m$.

Definition. A vector-valued function $y : [a,b] \to X$ is said to be measurable if there exists a sequence $\{y_k\}_{k=1}^{\infty}$ of step functions $y_k : [a,b] \to X$ such that $y(t) = \lim_{k \to \infty} y_k(t)$ for almost every $t \in [a,b]$.

Now we are ready to introduce $L^p$ spaces of vector-valued functions.

Definition.

(i) We denote by $L^p(a,b;X)$, $1 \le p < \infty$, the linear space of all (equivalence classes of) measurable vector-valued functions $y : [a,b] \to X$ having the property that

$$\int_a^b \|y(t)\|_X^p\,dt < \infty.$$

The space $L^p(a,b;X)$ is a Banach space with respect to the norm

$$\|y\|_{L^p(a,b;X)} := \Big( \int_a^b \|y(t)\|_X^p\,dt \Big)^{1/p}.$$

(ii) We denote by $L^\infty(a,b;X)$ the Banach space of all (equivalence classes of) measurable vector-valued functions $y : [a,b] \to X$ having the property that

$$\|y\|_{L^\infty(a,b;X)} := \operatorname*{ess\,sup}_{[a,b]} \|y(t)\|_X < \infty.$$

In the spaces just defined, functions that differ at most on a subset of $[a,b]$ of zero Lebesgue measure belong to the same equivalence class and thus are regarded as equal. Observe that obviously $C([a,b],X) \subset L^p(a,b;X) \subset L^q(a,b;X)$ for $1 \le q \le p \le \infty$.

Example. For any $T > 0$ the function $y(x,t) = \dfrac{e^t}{\sqrt{x}}$ is an element of $C([0,T], L^1(0,1))$ and thus also of $L^p(0,T;L^1(0,1))$ whenever $1 \le p \le \infty$. ◊

In $L^1(a,b;X)$, and hence also in the spaces $L^p(a,b;X)$ with $1 \le p \le \infty$ and $C([a,b],X)$, the Bochner integral can be defined for any vector-valued function. For step functions $y$ it is defined by

$$\int_a^b y(t)\,dt := \sum_{i=1}^{m} y_i\,|M_i|,$$

where $y_i \in X$ is the value of $y$ on $M_i$ and $|M_i|$ denotes the Lebesgue measure of the set $M_i$, $1 \le i \le m$. It is apparent that the integral is an element of $X$. For arbitrary $y \in L^1(a,b;X)$, we can argue as follows: since $y$ is measurable, there exists a sequence $\{y_k\}_{k=1}^{\infty}$ of step functions converging almost everywhere on $[a,b]$ to $y$. Then the Bochner integral of $y$ is defined by

$$\int_a^b y(t)\,dt = \lim_{k \to \infty} \int_a^b y_k(t)\,dt.$$
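The construction of the integral via step functions can be tried out numerically in the simplest nontrivial case $X = \mathbb{R}^2$. The following is a minimal sketch under assumptions that do not come from the text: the curve $y(t) = (\cos t, \sin t)$ on $[0,\pi]$, the helper names, and the choice of step functions that take the midpoint value of $y$ on $k$ equal subintervals are all hypothetical.

```python
import math

def step_integral(values, measures):
    """Integral of a step function: sum of y_i * |M_i|, computed in X = R^N."""
    dim = len(values[0])
    return [sum(v[j] * m for v, m in zip(values, measures)) for j in range(dim)]

def step_approximation(y, a, b, k):
    """Step function y_k taking the midpoint value of y on k equal subintervals."""
    h = (b - a) / k
    values = [y(a + (i + 0.5) * h) for i in range(k)]
    measures = [h] * k  # Lebesgue measure |M_i| of each subinterval
    return values, measures

def curve(t):
    """Hypothetical continuous vector-valued function with values in R^2."""
    return [math.cos(t), math.sin(t)]

# The integrals of the step functions approach the integral of the curve.
for k in (10, 100, 1000):
    values, measures = step_approximation(curve, 0.0, math.pi, k)
    print(k, step_integral(values, measures))

integral = step_integral(*step_approximation(curve, 0.0, math.pi, 1000))
```

Since $\int_0^\pi \cos t\,dt = 0$ and $\int_0^\pi \sin t\,dt = 2$, the printed vectors should approach $(0, 2)$; the result is again an element of $X = \mathbb{R}^2$, in line with the remark that the integral belongs to $X$.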

This limit is independent of the choice of the sequence $\{y_k\}_{k=1}^{\infty}$; see Hille and Phillips [HP57] or Pazy [Paz83]. The Bochner integral is the analogue of the Lebesgue integral for $X = \mathbb{R}$.

Example. The vector-valued elements of $L^2(0,T;H^1(\Omega))$ can be viewed as real-valued functions of the variables $x$ and $t$, i.e., $y = y(x,t)$ with $x \in \Omega$ and $t \in [0,T]$. For each $t$, $y(\cdot,t)$ belongs to $H^1(\Omega)$ with respect to $x$. The norm, given by

$$\|y\|_{L^2(0,T;H^1(\Omega))} = \Big( \int_0^T \|y(t)\|_{H^1(\Omega)}^2\,dt \Big)^{1/2} = \Big( \int_0^T\!\!\int_\Omega \big( |y(x,t)|^2 + |\nabla y(x,t)|^2 \big)\,dx\,dt \Big)^{1/2},$$

is obviously the same as that in $W_2^{1,0}(Q)$. This suggests that the two spaces might coincide. Indeed, it turns out that they are isometric and isomorphic to each other, that is,

$$W_2^{1,0}(Q) \cong L^2(0,T;H^1(\Omega)).$$

More precisely, it can be shown that any $y \in W_2^{1,0}(Q)$ coincides, modulo a modification on a set of zero measure, with a function in $L^2(0,T;H^1(\Omega))$ and vice versa. For a proof, we refer the reader to [HP57]. ◊

Additional information about vector-valued functions can be found, e.g., in Hille and Phillips [HP57], Pazy [Paz83], Tanabe [Tan79], and Wloka [Wlo87].

3.4.2. Vector-valued functions and parabolic problems. Once again, we derive a variational formulation of the parabolic problem (3.23) on page 137. We proceed similarly as in the derivation of (3.25). However, this time we keep $t \in (0,T)$ fixed, multiplying (3.23) by a function $v \in H^1(\Omega) = V$ of $x$ alone and integrating over $\Omega$ only. It follows (only formally, since it is still unclear how to define the time derivative $y_t$) that

$$\int_\Omega y_t(t)\, v\,dx = - \int_\Omega \nabla y(t) \cdot \nabla v\,dx + \int_\Omega \big( f(t) - c_0(t)\, y(t) \big)\, v\,dx + \int_\Gamma \big( g(t) - \alpha(t)\, y(t) \big)\, v\,ds \quad \forall t \in [0,T], \tag{3.27}$$

where we have suppressed the dependence on $x$. Since $L^2(Q) \cong L^2(0,T;L^2(\Omega))$ and $L^2(\Sigma) \cong L^2(0,T;L^2(\Gamma))$, we may regard $f$, $g$, and $y$ as vector-valued functions in $t$ that coincide almost everywhere with the corresponding real-valued functions. For example, by Fubini's theorem the function $x \mapsto f(x,t)$ belongs to $L^2(\Omega)$ for almost all $t \in (0,T)$; modifying $f$ for the remaining $t$ to, e.g., zero, we obtain the vector-valued function $t \mapsto f(\cdot,t)$ belonging to $L^2(0,T;L^2(\Omega))$.

For almost every $t \in (0,T)$, the expression on the right-hand side of (3.27) is linear and continuous in $v$ and assigns to each $v \in H^1(\Omega)$ a real number. Hence, it defines for almost every $t$ a linear and continuous functional $F = F(t) \in H^1(\Omega)^*$, and we have

$$\int_\Omega y_t(t)\, v\,dx = F(t)\, v \quad \forall v \in H^1(\Omega), \text{ for a.e. } t \in [0,T]. \tag{3.28}$$

Since the right-hand side of (3.27) defines a functional $F = F(t) \in H^1(\Omega)^*$, so should the left-hand side; consequently, $y_t(t)$ ought to be a continuous linear functional on $H^1(\Omega)$, i.e., $y_t(t) \in H^1(\Omega)^*$ for almost every $t \in (0,T)$.

Now, every weak solution $y \in W_2^{1,0}(Q)$ can, after an appropriate modification on a null set, be understood as an element of $L^2(0,T;H^1(\Omega))$. In the proof of Theorem 3.12 it will be shown that then

$$\int_0^T \|F(t)\|_{H^1(\Omega)^*}^2\,dt < \infty;$$

in other words, $F \in L^2(0,T;H^1(\Omega)^*)$. Comparing this with the left-hand side of (3.28), we conclude that

$$y_t \in L^2(0,T;H^1(\Omega)^*). \tag{3.29}$$

This observation gives us an important hint as to in which space the derivative $y_t$ should be looked for. However, we still do not know how this derivative has to be understood. To this end, we need the notion of vector-valued distributions.

3.4.3. Vector-valued distributions. In the following, let $V$ be a Banach space, where we have $V = H^1(\Omega)$ in mind. For given $y \in L^2(0,T;V)$, we define a vector-valued distribution $T : C_0^\infty(0,T) \to V$ by setting

$$T\varphi := \int_0^T y(t)\, \varphi(t)\,dt \quad \forall \varphi \in C_0^\infty(0,T).$$

As usual, we identify the generating function $y$ with the distribution $T$, which is then investigated in place of $y$. In order to stress the dependence on $y$, the notation $T_y$ is also commonly used in the literature; but we avoid this notation for the sake of clarity.

We now introduce the derivative $T'$ as a vector-valued distribution, in terms of the Bochner integral:

$$T'\varphi := - \int_0^T y(t)\, \varphi'(t)\,dt.$$
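For a smooth generating function the distributional derivative must reproduce the classical one, that is, $-\int_0^T y(t)\,\varphi'(t)\,dt = \int_0^T y'(t)\,\varphi(t)\,dt$. The sketch below checks this numerically for $V = \mathbb{R}^2$. Everything in it is an illustrative assumption rather than part of the text: the curve $y(t) = (t^2, \sin t)$, the bump function $\varphi(t) = \exp(-1/(t(T-t)))$ with $T = 1$, and the midpoint quadrature.

```python
import math

T = 1.0

def phi(t):
    """Smooth test function with compact support in (0, T) (hypothetical choice)."""
    g = t * (T - t)
    return math.exp(-1.0 / g) if g > 0.0 else 0.0

def dphi(t):
    """Classical derivative of phi, obtained by the chain rule."""
    g = t * (T - t)
    return phi(t) * (T - 2.0 * t) / g ** 2 if g > 0.0 else 0.0

def y(t):
    """Hypothetical smooth vector-valued function with values in R^2."""
    return [t ** 2, math.sin(t)]

def dy(t):
    """Classical derivative of y."""
    return [2.0 * t, math.cos(t)]

def integrate(f, n=4000):
    """Componentwise midpoint rule on (0, T) for an R^2-valued integrand."""
    h = T / n
    total = [0.0, 0.0]
    for i in range(n):
        value = f((i + 0.5) * h)
        total = [s + v * h for s, v in zip(total, value)]
    return total

# T' applied to phi, i.e. -int y(t) phi'(t) dt, versus int y'(t) phi(t) dt.
lhs = [-c for c in integrate(lambda t: [c * dphi(t) for c in y(t)])]
rhs = integrate(lambda t: [c * phi(t) for c in dy(t)])
print(lhs)
print(rhs)
```

Both printed vectors should agree up to quadrature error, because the boundary terms in the integration by parts vanish with $\varphi(0) = \varphi(T) = 0$. This is the situation in which $T'$ is induced by the function $w = y'$.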

Likewise, one can define $T''$ by putting $T''\varphi := \int_0^T y(t)\, \varphi''(t)\,dt$, etc. Observe that $y$ is a vector-valued function, while $\varphi$ is real-valued. If a vector-valued function $w = w(t) \in L^1(0,T;V)$ exists such that

$$T'\varphi = - \int_0^T y(t)\, \varphi'(t)\,dt = \int_0^T w(t)\, \varphi(t)\,dt \quad \forall \varphi \in C_0^\infty(0,T),$$

then $T'$ can be identified with $w$, since it is induced by $w$. Since we have identified $T$ with the generating function $y$ and $T'$ with $w$, we do the same with the generating functions and define

$$y'(t) := w(t).$$

In this sense, we have $y' \in L^1(0,T;V)$ here. Hidden behind this is the formula of integration by parts, which for continuously differentiable $y : [0,T] \to V$ and $\varphi \in C_0^\infty(0,T)$ yields that

$$\int_0^T y(t)\, \varphi'(t)\,dt = y(t)\, \varphi(t)\,\Big|_0^T - \int_0^T y'(t)\, \varphi(t)\,dt = - \int_0^T y'(t)\, \varphi(t)\,dt.$$

Remark. The weak derivatives defined in (2.1) on page 27 can be introduced in the same way: to this end, one considers the real-valued distribution $T : C_0^\infty(\Omega) \to \mathbb{R}$,

$$T\varphi := \int_\Omega y(x)\, \varphi(x)\,dx \quad \forall \varphi \in C_0^\infty(\Omega),$$

and regards $y$ as a distribution. In this way, $D^\alpha y$ is initially defined as a distributional derivative. Those distributional derivatives that are generated by locally integrable functions are then weak derivatives. This approach is employed in many texts dealing with Sobolev spaces. Obviously, the above function $w$ is an analogue of the weak derivative $w = D^\alpha y$ introduced in (2.1).

If we even have $w \in L^2(0,T;V)$, then $y$ would belong to the class of functions in $L^2(0,T;V)$ having a regular derivative in $L^2(0,T;V)$. However, this class of functions is too small for our purposes. Indeed, relation (3.29) indicates that the derivative $y' = y_t$ in the parabolic equation (3.23) should belong to the larger space $L^2(0,T;V^*)$.

In the following definition, we consider $y$ as a vector-valued function in $L^2(0,T;V^*)$, so that $T$ maps into $V^*$. This is possible since $V$ is continuously embedded in $V^*$, as we shall see below.

Definition. We denote by $W(0,T)$ the linear space of all $y \in L^2(0,T;V)$ having a (distributional) derivative $y' \in L^2(0,T;V^*)$, equipped with the norm

$$\|y\|_{W(0,T)} = \Big( \int_0^T \big( \|y(t)\|_V^2 + \|y'(t)\|_{V^*}^2 \big)\,dt \Big)^{1/2}.$$

The space $W(0,T) = \{ y \in L^2(0,T;V) : y' \in L^2(0,T;V^*) \}$ is a Hilbert space with the scalar product

$$(u, v)_{W(0,T)} = \int_0^T (u(t), v(t))_V\,dt + \int_0^T (u'(t), v'(t))_{V^*}\,dt.$$

Here, we have used the abbreviation $(F, G)_{V^*} := (JF, JG)_V$, where $J : V^* \to V$ is the duality mapping from the Riesz representation theorem that assigns to each functional $F \in V^*$ the corresponding $f \in V$.

To facilitate comprehension of the arguments that follow, we now introduce the notion of a Gelfand triple. In our case, we have the following situation: the space $V = H^1(\Omega)$ is continuously and densely embedded in the Hilbert space $H = L^2(\Omega)$. By the Riesz representation theorem, we identify the dual $H^*$ with $H$. Also, $V$ is a Hilbert space that could be identified with its dual $V^*$; however, we avoid doing this for obvious reasons. For instance, in integrations by parts we tacitly use the scalar product of $H$ and not that of $V$, which would have to be used if $V$ were identified with $V^*$.

Any $f \in H$ can via

$$v \mapsto (f, v)_H \in \mathbb{R} \quad \forall v \in V$$

be regarded as an element of $V^*$. In this sense,

$$V \subset H = H^* \subset V^*.$$

Now, it can be shown that the embedding $H \subset V^*$ is also dense and continuous; see Wloka [Wlo87], page 253 and following. The chain of dense and continuous embeddings

$$V \subset H \subset V^*$$

is called a Gelfand triple. Owing to the Riesz representation theorem, functionals $F \in V^*$ can be continuously extended to the larger space $H$ if and only if they are of the form

$$F(v) = (f, v)_H,$$

with a fixed $f \in H$. In the case of $V = H^1(\Omega)$ and $H = L^2(\Omega)$, this means that

$$H^1(\Omega) \subset L^2(\Omega) \subset H^1(\Omega)^*,$$

and $F \in H^1(\Omega)^*$ can be continuously extended to $L^2(\Omega)$ if and only if there is some $f \in L^2(\Omega)$ such that

$$F(v) = \int_\Omega f(x)\, v(x)\,dx = (f, v)_{L^2(\Omega)} \quad \forall v \in H^1(\Omega).$$

The following results, which can be found in Wloka [Wlo87] or Zeidler [Zei90a], hold for any Gelfand triple. They are of fundamental importance for our purposes.
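Before turning to these results, the extension criterion for functionals in $V^*$ can be illustrated by a one-dimensional sketch. It rests on assumptions that are not part of the text: $\Omega = (0,1)$, the point evaluation $F(v) = v(1)$ (the one-dimensional counterpart of a boundary integral), and the test family $v_n(x) = x^n$. The norms of $v_n$ are computed from the closed forms $\|v_n\|_{L^2}^2 = 1/(2n+1)$ and $\|v_n'\|_{L^2}^2 = n^2/(2n-1)$.

```python
import math

def l2_norm(n):
    """Exact L^2(0, 1) norm of v_n(x) = x**n."""
    return 1.0 / math.sqrt(2 * n + 1)

def h1_norm(n):
    """Exact H^1(0, 1) norm of v_n(x) = x**n, valid for n >= 1."""
    return math.sqrt(1.0 / (2 * n + 1) + n ** 2 / (2 * n - 1))

def F(n):
    """Point evaluation F(v) = v(1) applied to v_n; it equals 1 for every n."""
    return 1.0

exponents = (1, 10, 100, 1000)
ratios_l2 = [F(n) / l2_norm(n) for n in exponents]  # grows like sqrt(2n + 1)
ratios_h1 = [F(n) / h1_norm(n) for n in exponents]  # stays bounded

for n, r_l2, r_h1 in zip(exponents, ratios_l2, ratios_h1):
    print(f"n = {n:5d}   F/||v||_L2 = {r_l2:8.3f}   F/||v||_H1 = {r_h1:6.3f}")
```

The quotient with respect to the $L^2$ norm is unbounded, so this $F$ admits no continuous extension to $H = L^2(0,1)$ and cannot be written as $(f, v)_H$ with some $f \in H$, whereas the quotient with respect to the $H^1$ norm stays bounded, consistent with $F \in V^*$.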

Theorem 3.10. Every $y \in W(0,T)$ coincides, possibly after a suitable modification on a set of zero measure, with an element of $C([0,T],H)$. In this sense, we have the continuous embedding $W(0,T) \hookrightarrow C([0,T],H)$.

In particular, it follows that for any $y \in W(0,T)$, the values $y(0)$ and $y(T)$ belong to $H$. Moreover, there exists a constant $c_E > 0$ such that

$$\|y\|_{C([0,T],H)} \le c_E\, \|y\|_{W(0,T)} \quad \forall y \in W(0,T).$$

Theorem 3.11. For all $y, p \in W(0,T)$ the formula of integration by parts holds:

$$\int_0^T (y'(t), p(t))_{V^*,V}\,dt = (y(T), p(T))_H - (y(0), p(0))_H - \int_0^T (p'(t), y(t))_{V^*,V}\,dt.$$

Here, $(F, v)_{V^*,V} := F(v)$ for $F \in V^*$ and $v \in V$. This notation, which resembles a scalar product, is commonly used in the literature.

Conclusion. Taking $p = y$ in the formula of integration by parts, we find that any $y \in W(0,T)$ satisfies the useful identity

$$\int_0^T (y'(t), y(t))_{V^*,V}\,dt = \frac12 \|y(T)\|_H^2 - \frac12 \|y(0)\|_H^2. \tag{3.30}$$

Hence, we can formally write

$$\int_0^T (y'(t), y(t))_{V^*,V}\,dt = \int_0^T \frac12 \frac{d}{dt} \|y(t)\|_H^2\,dt = \frac12 \|y(T)\|_H^2 - \frac12 \|y(0)\|_H^2.$$

3.4.4. Weak solutions from $W_2^{1,0}(Q)$ belong to $W(0,T)$. In this section, we will show that weak solutions to our parabolic initial-boundary value problems belong to $W(0,T)$. To this end, we again consider the problem (3.23) on page 137,

$$\begin{aligned} y_t - \Delta y + c_0\, y &= f && \text{in } Q \\ \partial_\nu y + \alpha\, y &= g && \text{on } \Sigma \\ y(0) &= y_0 && \text{in } \Omega, \end{aligned}$$

and require that Assumption 3.8 be satisfied. We have the following result.

Theorem 3.12. Let $y \in W_2^{1,0}(Q)$ be the weak solution to problem (3.23), which exists according to Theorem 3.9. Then $y$ belongs, possibly after a modification on a set of zero measure, to $W(0,T)$.

Proof: It follows from (3.25) that, for all $v \in W_2^{1,1}(Q)$ with $v(T) = 0$,

$$- \iint_Q y\, v_t\,dx\,dt = - \iint_Q \nabla y \cdot \nabla v\,dx\,dt - \iint_Q c_0\, y\, v\,dx\,dt - \iint_\Sigma \alpha\, y\, v\,ds\,dt + \int_\Omega y_0\, v(0)\,dx + \iint_Q f\, v\,dx\,dt + \iint_\Sigma g\, v\,ds\,dt.$$

In particular, we may insert any function of the form $v(x,t) := v(x)\, \varphi(t)$, where $\varphi \in C_0^\infty(0,T)$ and $v \in V = H^1(\Omega)$. Setting $H = L^2(\Omega)$ and $H^N = H \times H \times \ldots \times H$ ($N$ times), we find that

$$\begin{aligned} - \int_0^T (y(t)\, \varphi'(t), v)_H\,dt = {} & - \int_0^T (\nabla y(t), \nabla v)_{H^N}\, \varphi(t)\,dt - \int_0^T (c_0(t)\, y(t), v)_H\, \varphi(t)\,dt - \int_0^T (\alpha(t)\, y(t), v)_{L^2(\Gamma)}\, \varphi(t)\,dt \\ & + \int_0^T (f(t), v)_H\, \varphi(t)\,dt + \int_0^T (g(t), v)_{L^2(\Gamma)}\, \varphi(t)\,dt. \end{aligned}$$

The initial condition vanishes, since $\varphi(0) = 0$. Now $y \in L^2(Q)$, by the definition of $W_2^{1,0}(Q)$. Hence, by Fubini's theorem, $y(\cdot,t) \in L^2(\Omega)$ for almost every $t \in (0,T)$. Moreover, $D_i y \in L^2(Q)$ for $i = 1, \ldots, N$, and thus $\nabla y(\cdot,t) \in (L^2(\Omega))^N = H^N$ for almost every $t \in (0,T)$. Finally, $y(\cdot,t) \in H^1(\Omega)$, and thus $y(\cdot,t) \in L^2(\Gamma)$ for almost every $t \in (0,T)$.

On the set of measure zero in $[0,T]$ where one of the above statements possibly does not hold, we put $y(t) = 0$, which does not change the vector-valued function $y$ in the sense of $L^2$ spaces. Hence, we see that for any fixed $t$, the expressions in the integrals on the right-hand side define linear functionals $F_i(t) : H^1(\Omega) \to \mathbb{R}$, namely

$$\begin{aligned} F_1(t) &: v \mapsto -(\nabla y(t), \nabla v)_{H^N} \\ F_2(t) &: v \mapsto -(c_0(t)\, y(t), v)_H \\ F_3(t) &: v \mapsto -(\alpha(t)\, y(t), v)_{L^2(\Gamma)} \\ F_4(t) &: v \mapsto (f(t), v)_H \\ F_5(t) &: v \mapsto (g(t), v)_{L^2(\Gamma)} \end{aligned} \tag{3.31}$$

in that order. We claim that the functionals $F_i(t)$, $1 \le i \le 5$, are bounded and thus continuous on $V$ for every $t$. We verify this claim only for $F_1(t)$ and $F_3(t)$, leaving the other cases as an easy exercise for the reader. First, we have

$$|F_1(t)\, v| \le \int_\Omega |\nabla y(t)|\, |\nabla v|\,dx \le \|y(t)\|_{H^1(\Omega)}\, \|v\|_{H^1(\Omega)} \quad \forall v \in H^1(\Omega).$$

Note that the function $t \mapsto \|y(t)\|_{H^1(\Omega)}$ belongs to $L^2(0,T)$, and $\|y(t)\|_{H^1(\Omega)}$ is by construction everywhere finite. $F_3(t)$ can be estimated similarly:

$$|F_3(t)\, v| \le \int_\Gamma |\alpha(t)|\, |y(t)|\, |v|\,ds \le \|\alpha\|_{L^\infty(\Sigma)}\, \|y(t)\|_{L^2(\Gamma)}\, \|v\|_{L^2(\Gamma)} \le c\, \|\alpha\|_{L^\infty(\Sigma)}\, \|y(t)\|_{H^1(\Omega)}\, \|v\|_{H^1(\Omega)}.$$

Hence, $F_3(t)$ is bounded, and $\|F_3(t)\|_{H^1(\Omega)^*} \le c\, \|\alpha\|_{L^\infty(\Sigma)}\, \|y(t)\|_{H^1(\Omega)}$.

In this way, we find that $F_i(t) \in V^* = H^1(\Omega)^*$ for every $t$, and there is some constant $c > 0$ such that

$$\sum_{i=1}^{5} \|F_i(t)\|_{V^*} \le c\, \big( \|y(t)\|_{H^1(\Omega)} + \|f(t)\|_{L^2(\Omega)} + \|g(t)\|_{L^2(\Gamma)} \big). \tag{3.32}$$

Since the expression on the right-hand side belongs to $L^2(0,T)$, so does the expression on the left-hand side, showing that $F_i \in L^2(0,T;V^*)$ for $1 \le i \le 5$. But then the functional $F$ on the right-hand side of the variational formulation, being just the sum of the $F_i$, also belongs to $L^2(0,T;V^*)$.

Rewriting the variational formulation in terms of $F$, we obtain that for all $v \in V$ we have the chain of equalities

$$\Big( - \int_0^T y(t)\, \varphi'(t)\,dt,\ v \Big)_{L^2(\Omega)} = - \int_0^T (y(t)\, \varphi'(t), v)_{L^2(\Omega)}\,dt = \int_0^T (F(t)\, \varphi(t), v)_{V^*,V}\,dt = \Big( \int_0^T F(t)\, \varphi(t)\,dt,\ v \Big)_{V^*,V}$$

and therefore, as an equation in the space $V^*$,

$$- \int_0^T y(t)\, \varphi'(t)\,dt = \int_0^T F(t)\, \varphi(t)\,dt \quad \forall \varphi \in C_0^\infty(0,T).$$

But this means that $y' = F$ in the sense of vector-valued distributions; hence, $y' \in L^2(0,T;V^*)$. In conclusion, $y \in W(0,T)$, and the assertion is proved. ❑

Theorem 3.13. The weak solution $y$ to the problem (3.23) satisfies an estimate of the form

$$\|y\|_{W(0,T)} \le c_w\, \big( \|f\|_{L^2(Q)} + \|g\|_{L^2(\Sigma)} + \|y_0\|_{L^2(\Omega)} \big),$$

with some constant $c_w > 0$ that does not depend on $(f, g, y_0)$. In other words, the mapping $(f, g, y_0) \mapsto y$ defines a continuous linear operator from $L^2(Q) \times L^2(\Sigma) \times L^2(\Omega)$ into $W(0,T)$ and, in particular, into $C([0,T], L^2(\Omega))$.

Proof: We estimate the norm $\|y\|_{W(0,T)}$ or, more precisely, its square

$$\|y\|_{W(0,T)}^2 = \|y\|_{L^2(0,T;H^1(\Omega))}^2 + \|y'\|_{L^2(0,T;H^1(\Omega)^*)}^2.$$

For the first summand, we obtain from Theorem 3.9 on page 140, with a generic constant $c > 0$, the estimate

$$\|y\|_{L^2(0,T;H^1(\Omega))}^2 = \|y\|_{W_2^{1,0}(Q)}^2 \le c\, \big( \|f\|_{L^2(Q)}^2 + \|g\|_{L^2(\Sigma)}^2 + \|y_0\|_{L^2(\Omega)}^2 \big). \tag{3.33}$$

The second summand requires only a little more effort. Indeed, with the functionals $F_i$ defined in (3.31), we have

$$\|y'\|_{L^2(0,T;H^1(\Omega)^*)} = \Big\| \sum_{i=1}^{5} F_i \Big\|_{L^2(0,T;H^1(\Omega)^*)} \le \sum_{i=1}^{5} \|F_i\|_{L^2(0,T;H^1(\Omega)^*)}.$$

We estimate only the norm of $F_1$, leaving the others to the reader. Using the above estimates, in particular (3.32) and (3.33), we find, with a generic constant $c > 0$, that

$$\|F_1\|_{L^2(0,T;V^*)}^2 = \int_0^T \|F_1(t)\|_{V^*}^2\,dt \le \int_0^T c\, \|y(t)\|_{H^1(\Omega)}^2\,dt \le c\, \|y\|_{W_2^{1,0}(Q)}^2 \le c\, \big( \|f\|_{L^2(Q)} + \|g\|_{L^2(\Sigma)} + \|y_0\|_{L^2(\Omega)} \big)^2.$$

The norms of $F_2, \ldots, F_5$ can be estimated similarly. The assertion is thus proved. ❑

Now that the existence of the derivative $y_t := y'$ has been shown, we are in a position to reformulate the variational equality (3.25). Assuming that $y \in W(0,T)$ in (3.25) on page 140, we can keep the final value $y(T)$, which is now well defined. It follows from (3.24) on page 139 that for all $v \in W_2^{1,1}(Q)$ we have

$$- \iint_Q y\, v_t + \iint_Q \nabla y \cdot \nabla v + \iint_Q c_0\, y\, v + \iint_\Sigma \alpha\, y\, v = \iint_Q f\, v + \iint_\Sigma g\, v + \int_\Omega y_0\, v(\cdot,0) - \int_\Omega y(\cdot,T)\, v(\cdot,T),$$

where the differentials in the integrals have been omitted for the sake of brevity. We now use the facts that $y, v \in W(0,T)$ and $y(0) = y_0$. Invoking

the formula of integration by parts on page 148, we find that for all $v \in W(0,T)$,

$$\int_0^T (y_t, v)_{V^*,V}\,dt + \iint_Q \nabla y\cdot\nabla v + \iint_Q c_0\,y\,v + \iint_\Sigma \alpha\,y\,v = \iint_Q f\,v + \iint_\Sigma g\,v, \qquad y(0) = y_0, \tag{3.34}$$

where $y_t$ is a vector-valued function in $L^2(0,T;V^*)$. The extension from $v \in W_2^{1,1}(Q)$ to $v \in W(0,T)$ follows from a density argument; indeed, the integrals appearing in the above equation are continuous with respect to $v$ in $W(0,T)$, and the functions in $W_2^{1,1}(Q)$, regarded as elements of $W(0,T)$, form a dense subset of $W(0,T)$. The above variational formulation is valid even for $v \in L^2(0,T;V)$, since all of the expressions involved are continuous also in this space. Therefore, (3.34) can be rewritten in the following equivalent form:

$$\int_0^T (y_t, v)_{V^*,V}\,dt + \iint_Q (\nabla y\cdot\nabla v + c_0\,y\,v)\,dx\,dt + \iint_\Sigma \alpha\,y\,v\,ds\,dt = \iint_Q f\,v\,dx\,dt + \iint_\Sigma g\,v\,ds\,dt \quad \forall\,v \in L^2(0,T;V), \qquad y(0) = y_0. \tag{3.35}$$

The solution mapping $(f, g, y_0) \mapsto y$ corresponding to the initial-boundary value problem (3.23),

$$\begin{aligned} y_t - \Delta y + c_0\,y &= f && \text{in } Q = \Omega \times (0,T) \\ \partial_\nu y + \alpha\,y &= g && \text{on } \Sigma = \Gamma \times (0,T) \\ y(0) &= y_0 && \text{in } \Omega \end{aligned}$$

has the structure

$$y = G_Q f + G_\Sigma g + G_0 y_0, \tag{3.36}$$

with continuous linear operators $G_Q : L^2(Q) \to W(0,T)$, $G_\Sigma : L^2(\Sigma) \to W(0,T)$, and $G_0 : L^2(\Omega) \to W(0,T)$ which are defined by

$$\begin{aligned} G_Q &: f \mapsto y && \text{for } g = 0,\ y_0 = 0 \\ G_\Sigma &: g \mapsto y && \text{for } f = 0,\ y_0 = 0 \\ G_0 &: y_0 \mapsto y && \text{for } f = 0,\ g = 0. \end{aligned}$$

The basis for this representation is Theorem 3.13.

3.5. Parabolic optimal control problems

As in the elliptic case, we begin our analysis by transforming selected linear-quadratic parabolic optimal control problems into quadratic optimization problems in Hilbert spaces and proving their solvability.

To begin with, we fix some general assumptions on the given quantities. These concern the spatial domain $\Omega$ and its boundary $\Gamma$, the final time $T > 0$, target functions $y_\Omega$, $y_Q$, $y_\Sigma$ that are to be approximated, the initial distribution $y_0$, coefficients $\alpha$ and $\beta$, as well as bounds $u_a$, $u_b$, $v_a$, $v_b$ that, depending on the particular problem, have to be obeyed on either $E = Q = \Omega \times (0,T)$ or $E = \Sigma = \Gamma \times (0,T)$. The actual meaning of the set $E$ can be discerned from the context.

Assumption 3.14. Let $\Omega \subset \mathbb{R}^N$ be a domain with Lipschitz boundary $\Gamma$, and let $\lambda \ge 0$ be a fixed constant. Assume that we are given functions $y_\Omega \in L^2(\Omega)$, $y_Q \in L^2(Q)$, $y_\Sigma \in L^2(\Sigma)$, $\alpha, \beta \in L^\infty(E)$, and $u_a, u_b, v_a, v_b \in L^2(E)$ with $u_a(x,t) \le u_b(x,t)$ and $v_a(x,t) \le v_b(x,t)$ for almost every $(x,t) \in E$. Here, depending on the specific problem under study, $E = Q$ or $E = \Sigma$.

3.5.1. Optimal nonstationary boundary temperature. We consider the problem (3.1)-(3.3) from page 119:

$$\min J(y,u) := \frac12 \int_\Omega |y(x,T) - y_\Omega(x)|^2\,dx + \frac{\lambda}{2} \iint_\Sigma |u(x,t)|^2\,ds(x)\,dt,$$

subject to

$$\begin{aligned} y_t - \Delta y &= 0 && \text{in } Q \\ \partial_\nu y + \alpha\,y &= \beta\,u && \text{on } \Sigma \\ y(0) &= 0 && \text{in } \Omega \end{aligned}$$

and

$$u_a(x,t) \le u(x,t) \le u_b(x,t) \quad \text{for a.e. } (x,t) \in \Sigma.$$

Owing to Theorem 3.13 on page 150 and Theorem 3.12 on page 149, the initial-boundary value problem (3.2) on page 119 has for any given control $u \in L^2(\Sigma)$ a unique weak solution $y \in W(0,T)$, represented by (cf. equation (3.36))

$$y = G_\Sigma(\beta u).$$
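As a rough numerical illustration of the control-to-state map $u \mapsto y = G_\Sigma(\beta u)$, the following sketch discretizes the one-dimensional analogue of the state equation on $\Omega = (0,1)$, where the boundary consists of the two end points. Everything specific in it is an assumption made for illustration and is not taken from the text: the function name `solve_state`, constant coefficients `alpha` and `beta`, a finite-volume discretization in space, and the implicit Euler method in time.

```python
import numpy as np


def solve_state(u_left, u_right, y0, T=1.0, alpha=1.0, beta=1.0):
    """Approximate y_t - y_xx = 0 on (0,1) x (0,T) with Robin boundary control.

    Boundary condition (outward normal derivative): d_nu y + alpha*y = beta*u
    at x = 0 (control u_left) and x = 1 (control u_right); y(0) = y0.
    Hypothetical sketch: constant alpha and beta, uniform grid, implicit Euler.
    Returns an array of shape (n_t + 1, n_x) with one row per time level.
    """
    y = np.asarray(y0, dtype=float).copy()
    n_x = y.size
    n_t = len(u_left)
    h = 1.0 / (n_x - 1)
    dt = T / n_t

    # Lumped mass: interior cells have width h, the two boundary cells h/2.
    mass = np.full(n_x, h)
    mass[0] = mass[-1] = h / 2

    # Stiffness matrix of -y_xx; the Robin term alpha*y enters on the diagonal
    # of the two boundary rows.
    stiff = (2.0 * np.eye(n_x) - np.eye(n_x, k=1) - np.eye(n_x, k=-1)) / h
    stiff[0, 0] = stiff[-1, -1] = 1.0 / h + alpha

    system = np.diag(mass) + dt * stiff
    trajectory = [y.copy()]
    for k in range(n_t):
        rhs = mass * y
        rhs[0] += dt * beta * u_left[k]    # boundary control at x = 0
        rhs[-1] += dt * beta * u_right[k]  # boundary control at x = 1
        y = np.linalg.solve(system, rhs)
        trajectory.append(y.copy())
    return np.array(trajectory)
```

For the zero initial state the discrete map is linear in the control, mirroring the linearity of $G_\Sigma$; the last row of the returned array approximates the final value $y(T)$.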

For evaluation of the cost functional the full information about $y$ is not needed, only the final value $y(T)$. The observation operator $E_T : y \mapsto y(T)$ is a continuous and linear mapping from $W(0,T)$ into $L^2(\Omega)$, since the embedding $W(0,T) \hookrightarrow C([0,T], L^2(\Omega))$ has these properties. Hence, for some constant $c > 0$, $\|y(T)\|_{L^2(\Omega)} \le \|y\|_{C([0,T],L^2(\Omega))} \le c\,\|y\|_{W(0,T)}$, and we have

$$y(T) = E_T G_\Sigma(\beta u) =: Su.$$

Again, $S$ represents the part of the state that appears in the cost functional. In summary, the composition $u \mapsto y \mapsto y(T)$ is a continuous linear mapping

$$S : u \mapsto y(T)$$

from the control space $L^2(\Sigma)$ into the space $L^2(\Omega)$ that contains $y(T)$. Replacing the expression $y(T)$ in the cost functional (3.1) by $Su$, we eliminate the initial-boundary value problem (3.2) (of course, only theoretically). Then the optimal control problem (3.1)-(3.3) becomes a quadratic optimization problem in the Hilbert space $U = L^2(\Sigma)$:

$$\min_{u \in U_{ad}} f(u) := \frac12 \|Su - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2} \|u\|^2_{L^2(\Sigma)}, \tag{3.37}$$

where

$$U_{ad} = \{u \in L^2(\Sigma) : u_a(x,t) \le u(x,t) \le u_b(x,t) \text{ for a.e. } (x,t) \in \Sigma\}.$$

Obviously, the functional $f$ is convex and continuous, and the admissible set $U_{ad}$ is a nonempty, closed, bounded, and convex subset of the Hilbert space $L^2(\Sigma)$. Hence, we can infer from Theorem 2.14 on page 50 the following existence result.

Theorem 3.15. Suppose that Assumption 3.14 on page 153 holds with $E := \Sigma$. Then the optimization problem (3.37), and hence the optimal nonstationary boundary temperature problem (3.1)-(3.3) on page 119, has at least one optimal control $\bar u \in U_{ad}$. If $\lambda > 0$, then $\bar u$ is uniquely determined.

The problem just discussed is a parabolic boundary control problem with final-value cost functional. Next, we will use a similar approach to study a problem with distributed heat source control. For a change, we assume isolation at the boundary and minimize a functional that involves the evolution of the boundary temperature (boundary observation).

3.5.2. Optimal nonstationary heat source. This section deals with the optimal control problem

$$\min J(y,u) := \frac12 \iint_\Sigma |y(x,t) - y_\Sigma(x,t)|^2\,ds(x)\,dt + \frac{\lambda}{2} \iint_Q |u(x,t)|^2\,dx\,dt, \tag{3.38}$$

subject to

$$\begin{aligned} y_t - \Delta y &= \beta\,u && \text{in } Q \\ \partial_\nu y &= 0 && \text{on } \Sigma \\ y(0) &= 0 && \text{in } \Omega \end{aligned} \tag{3.39}$$

and

$$u_a(x,t) \le u(x,t) \le u_b(x,t) \quad \text{for a.e. } (x,t) \in Q. \tag{3.40}$$

We have to find the optimal heat source $\bar u$, aiming at the best possible approximation of a desired evolution of the boundary temperature $y_\Sigma$, where the costs due to the control action are accounted for by the term $\frac{\lambda}{2}\|u\|^2$. This can also be seen as an inverse problem: an unknown heat source $u$ distributed within the body $\Omega$ has to be recovered from measurements of the temperature evolution $y|_\Sigma$ on the surface $\Gamma$. In this context, and also in the interpretation as an optimal control problem, $\lambda > 0$ plays the role of a regularization parameter.

By virtue of Theorem 3.12 on page 149 and Theorem 3.13 on page 150 applied with $f := \beta u$, $g = 0$, and $y_0 = 0$, there exists for any $u \in L^2(Q)$ a unique weak solution $y \in W(0,T)$ to the parabolic initial-boundary value problem (3.39). The mapping $u \mapsto y$ defines a linear (since $y(x,0) = 0$) and continuous operator from $L^2(Q)$ into $W(0,T)$ and, by the definition of $W(0,T)$, into $L^2(0,T;H^1(\Omega))$ as well. With the control-to-state operator $G_Q : L^2(Q) \to W(0,T)$ defined in (3.36), it has the representation

$$y = G_Q(\beta u).$$

The evaluation of the cost functional only requires knowledge of the boundary values $y(x,t)|_\Sigma$. Since the trace operator $y \mapsto y|_\Gamma$ maps $H^1(\Omega)$ continuously into $L^2(\Gamma)$, the mapping $E_\Sigma : y \mapsto y|_\Sigma$ defines a continuous linear operator from $L^2(0,T;H^1(\Omega))$ into $L^2(0,T;L^2(\Gamma))$. Consequently, the mapping $u \mapsto y \mapsto y|_\Sigma$, i.e., the operator

$$S : u \mapsto y|_\Sigma,$$

maps the control space $L^2(Q)$ continuously into the space $L^2(0,T;L^2(\Gamma)) \cong L^2((0,T)\times\Gamma) = L^2(\Sigma)$ to which $y|_\Sigma$ belongs. With the operators thus defined, $S$ takes the form

$$Su = E_\Sigma G_Q(\beta u). \tag{3.41}$$

Substituting $y = Su$ in the cost functional $J(y,u)$, we eliminate the parabolic initial-boundary value problem to arrive at the following quadratic minimization problem in the Hilbert space $U = L^2(Q)$:

$$\min_{u \in U_{ad}} f(u) := \frac12 \|Su - y_\Sigma\|^2_{L^2(\Sigma)} + \frac{\lambda}{2} \|u\|^2_{L^2(Q)}. \tag{3.42}$$

Invoking Theorem 2.14, we conclude the existence of an optimal control, i.e., the solvability of the problem. We have thus shown the following result.

Theorem 3.16. Suppose that Assumption 3.14 on page 153 holds with $E := Q$. Then the optimal nonstationary heat source problem (3.38)-(3.40) has at least one optimal control $\bar u \in U_{ad}$. If $\lambda > 0$, then $\bar u$ is uniquely determined.

3.6. Necessary optimality conditions

In this section, we will derive the first-order necessary optimality conditions for the problems stated in Sections 3.5.1 and 3.5.2. First, a variational inequality will be derived that still involves the state $y$; then, $y$ will be eliminated by means of the adjoint state to deduce a variational inequality for the control.

3.6.1. An auxiliary result for adjoint operators. Consider the parabolic problem

$$\begin{aligned} -p_t - \Delta p + c_0\,p &= a_Q && \text{in } Q \\ \partial_\nu p + \alpha\,p &= a_\Sigma && \text{on } \Sigma \\ p(\cdot,T) &= a_\Omega && \text{in } \Omega, \end{aligned} \tag{3.43}$$

with bounded and measurable coefficient functions $c_0$ and $\alpha$ and prescribed functions $a_Q \in L^2(Q)$, $a_\Sigma \in L^2(\Sigma)$, and $a_\Omega \in L^2(\Omega)$. We define the bilinear form

$$a[t; y, v] := \int_\Omega \big(\nabla y\cdot\nabla v + c_0(\cdot,t)\,y\,v\big)\,dx + \int_\Gamma \alpha(\cdot,t)\,y\,v\,ds.$$

We have the following well-posedness result.

Lemma 3.17. The parabolic problem (3.43) has a unique weak solution $p \in W_2^{1,0}(Q)$, which is the solution to the variational problem

$$\iint_Q p\,v_t\,dx\,dt + \int_0^T a[t; p, v]\,dt = \int_\Omega a_\Omega\,v(T)\,dx + \iint_Q a_Q\,v\,dx\,dt + \iint_\Sigma a_\Sigma\,v\,ds\,dt \quad \forall\,v \in W_2^{1,1}(Q) \text{ with } v(\cdot,0) = 0.$$

We have $p \in W(0,T)$, and there is a constant $c_a > 0$, which does not depend on the given functions, such that

$$\|p\|_{W(0,T)} \le c_a \big(\|a_Q\|_{L^2(Q)} + \|a_\Sigma\|_{L^2(\Sigma)} + \|a_\Omega\|_{L^2(\Omega)}\big).$$

Proof: Let $\tau \in [0,T]$, and put $\tilde p(\tau) := p(T - \tau)$ and $\tilde v(\tau) := v(T - \tau)$. Then $\tilde p(0) = p(T)$, $\tilde p(T) = p(0)$, $\tilde v(0) = v(T)$, $\tilde v(T) = v(0)$, $\tilde a_Q(\cdot,\tau) := a_Q(\cdot, T - \tau)$, etc., and also

$$\iint_Q p\,v_t\,dx\,dt = -\iint_Q \tilde p\,\tilde v_\tau\,dx\,d\tau,$$

and so on. Consequently, the asserted variational formulation is equivalent to the definition of the weak solution to the (forward) parabolic initial-boundary value problem

$$\begin{aligned} \tilde p_\tau - \Delta \tilde p + c_0\,\tilde p &= \tilde a_Q \\ \partial_\nu \tilde p + \alpha\,\tilde p &= \tilde a_\Sigma \\ \tilde p(0) &= a_\Omega. \end{aligned}$$

By Theorem 3.9 on page 140, there is a unique weak solution $\tilde p$, which by Theorem 3.12 on page 149 belongs to $W(0,T)$. The assertion now follows from reversing the time transformation. □

Since $p \in W(0,T)$, we can, in analogy to equation (3.35) on page 152, rewrite after an integration by parts the variational formulation of the adjoint equation in the following shorter form:

$$\int_0^T \big\{ -(p_t, v)_{V^*,V} + a[t; p, v] \big\}\,dt = \iint_Q a_Q\,v\,dx\,dt + \iint_\Sigma a_\Sigma\,v\,ds\,dt \quad \forall\,v \in L^2(0,T;V), \qquad p(T) = a_\Omega. \tag{3.44}$$

Like in the elliptic case, for the derivation of the adjoint system we need the following somewhat technical result.

Theorem 3.18. Let $y \in W(0,T)$ be the solution to the parabolic problem

$$\begin{aligned} y_t - \Delta y + c_0\,y &= b_Q\,v \\ \partial_\nu y + \alpha\,y &= b_\Sigma\,u \\ y(0) &= b_\Omega\,w, \end{aligned}$$

with coefficient functions $c_0, b_Q \in L^\infty(Q)$, $\alpha, b_\Sigma \in L^\infty(\Sigma)$, and $b_\Omega \in L^\infty(\Omega)$ and controls $v \in L^2(Q)$, $u \in L^2(\Sigma)$, and $w \in L^2(\Omega)$. Moreover, let square integrable functions $a_\Omega$, $a_Q$, and $a_\Sigma$ be given, and let $p \in W(0,T)$ be the weak solution to (3.43). Then we have

$$\int_\Omega a_\Omega\,y(\cdot,T)\,dx + \iint_Q a_Q\,y\,dx\,dt + \iint_\Sigma a_\Sigma\,y\,ds\,dt = \iint_\Sigma b_\Sigma\,p\,u\,ds\,dt + \iint_Q b_Q\,p\,v\,dx\,dt + \int_\Omega b_\Omega\,p(\cdot,0)\,w\,dx.$$

Proof: We use the variational formulations for $y$ and $p$. For $y$, using the test function $p$, we have

$$\int_0^T \big\{ (y_t, p)_{V^*,V} + a[t; y, p] \big\}\,dt = \iint_Q b_Q\,p\,v\,dx\,dt + \iint_\Sigma b_\Sigma\,p\,u\,ds\,dt, \tag{3.45}$$

with the initial condition $y(0) = b_\Omega w$. Analogously, taking $y$ as test function in the equation for $p$, we find that

$$\int_0^T \big\{ -(p_t, y)_{V^*,V} + a[t; p, y] \big\}\,dt = \iint_Q a_Q\,y\,dx\,dt + \iint_\Sigma a_\Sigma\,y\,ds\,dt, \tag{3.46}$$

with the final condition $p(T) = a_\Omega$. Integrating by parts in (3.45), we obtain

$$\int_0^T \big\{ -(p_t, y)_{V^*,V} + a[t; y, p] \big\}\,dt = -(y(T), a_\Omega)_{L^2(\Omega)} + (b_\Omega w, p(0))_{L^2(\Omega)} + \iint_Q b_Q\,p\,v\,dx\,dt + \iint_\Sigma b_\Sigma\,p\,u\,ds\,dt. \tag{3.47}$$

Since the left-hand sides of this equation and equation (3.46) coincide, the right-hand sides of (3.46) and (3.47) must also be equal, hence the assertion follows. □

3.6.2. Optimal nonstationary boundary temperature. In this section, we determine the necessary optimality conditions for the problem (3.1)-(3.3) on page 119:

$$\min J(y,u) := \frac12 \|y(T) - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2} \|u\|^2_{L^2(\Sigma)},$$

subject to

$$\begin{aligned} y_t - \Delta y &= 0 \\ \partial_\nu y + \alpha\,y &= \beta\,u \\ y(0) &= y_0 \end{aligned}$$

and

$$u_a \le u \le u_b.$$

In this problem, the initial state $y_0 \in L^2(\Omega)$ may differ from zero, a situation we have avoided so far for the sake of simpler exposition. From the previous section, we know that in the case of $y_0 = 0$ the problem has an optimal control $\bar u$ with associated state $\bar y$. The reader is invited to show the corresponding result for $y_0 \ne 0$ in Exercise 3.2 on page 178.

We now determine the form of the adjoint problem. This could easily be done by means of the formal Lagrange method. However, we are sufficiently experienced by now to be able to deduce the correct form directly: each of the terms occurring in the derivative of the cost functional $J$ with respect to $y$ appears as a right-hand side in the adjoint system. The domains of definition of these terms have to coincide with the domains in which the corresponding condition in the adjoint system is valid. In the present case, the derivative of the cost functional with respect to $y$ is given by the function $\bar y(T) - y_\Omega$, which is defined in $\Omega$. Hence, the derivative must appear in the condition of the adjoint system that has to be satisfied in $\Omega$. Keeping in mind that the adjoint system ought to be a parabolic final value problem backwards in time, in which only the final value $p(T)$ has $\Omega$ as its domain, we conclude that the adjoint state $p$ associated with $\bar y$ must solve the following adjoint system:

$$\begin{aligned} -p_t - \Delta p &= 0 && \text{in } Q \\ \partial_\nu p + \alpha\,p &= 0 && \text{on } \Sigma \\ p(T) &= \bar y(T) - y_\Omega && \text{in } \Omega. \end{aligned} \tag{3.48}$$

Theorem 3.19. Let $\bar u \in U_{ad}$ be a control with associated state $\bar y$, and let $p \in W(0,T)$ be the corresponding adjoint state that solves (3.48). Then $\bar u$

is an optimal control for the optimal nonstationary boundary temperature problem (3.1)-(3.3) on page 119 if and only if the variational inequality

$$\iint_\Sigma \big(\beta(x,t)\,p(x,t) + \lambda\,\bar u(x,t)\big)\big(u(x,t) - \bar u(x,t)\big)\,ds(x)\,dt \ge 0 \tag{3.49}$$

holds for all $u \in U_{ad}$.

Proof: Let $S : L^2(\Sigma) \to L^2(\Omega)$ be the continuous linear operator that, for the homogeneous initial condition $y_0 = 0$, assigns to each control $u$ the final value $y(T)$ of the weak solution $y$ to the state equation. Moreover, let $\hat y = G_0 y_0$ denote the weak solution corresponding to $y_0 \ne 0$ and $u = 0$. Then it follows from the superposition principle for linear equations that

$$y(T) - y_\Omega = Su + (G_0 y_0)(T) - y_\Omega = Su - z,$$

where $z := y_\Omega - (G_0 y_0)(T)$. The above control problem then takes the form

$$\min_{u \in U_{ad}} f(u) := \frac12 \|Su - z\|^2_{L^2(\Omega)} + \frac{\lambda}{2} \|u\|^2_{L^2(\Sigma)}.$$

The general variational inequality (2.47) on page 65 yields, for all $u \in U_{ad}$,

$$0 \le \big(S\bar u - z,\ S(u - \bar u)\big)_{L^2(\Omega)} + \lambda\,(\bar u,\ u - \bar u)_{L^2(\Sigma)} = \int_\Omega (\bar y(T) - y_\Omega)(y(T) - \bar y(T))\,dx + \lambda \iint_\Sigma \bar u\,(u - \bar u)\,ds\,dt. \tag{3.50}$$

Here, we have used the identity

$$Su - S\bar u = Su + (G_0 y_0)(T) - (G_0 y_0)(T) - S\bar u = y(T) - \bar y(T)$$

and, once more, that $z = y_\Omega - (G_0 y_0)(T)$. Now put $\tilde y := y - \bar y$ and apply Theorem 3.18 with the specifications $a_\Omega = \bar y(T) - y_\Omega$, $a_Q = 0$, $a_\Sigma = 0$, $b_\Sigma = \beta$, $v = 0$, $w = 0$, $y := \tilde y$, and $\tilde u := u - \bar u$. Note that $w = 0$ in this situation since, by definition, $(Su)(0) = 0$; the part originating from $y_0$ is incorporated into $z$. It follows that

$$(\bar y(T) - y_\Omega,\ \tilde y(T))_{L^2(\Omega)} = \iint_\Sigma \beta\,p\,\tilde u\,ds\,dt.$$

Substituting this result in inequality (3.50), we find that

$$\begin{aligned} 0 &\le \int_\Omega (\bar y(T) - y_\Omega)(y(T) - \bar y(T))\,dx + \lambda \iint_\Sigma \bar u\,(u - \bar u)\,ds\,dt \\ &= \iint_\Sigma \beta\,p\,(u - \bar u)\,ds\,dt + \lambda \iint_\Sigma \bar u\,(u - \bar u)\,ds\,dt \\ &= \iint_\Sigma (\beta\,p + \lambda\,\bar u)(u - \bar u)\,ds\,dt, \end{aligned}$$

which concludes the proof of the assertion. □

Next, we employ the method described on page 69 for elliptic problems to derive a number of results concerning the possible form of optimal controls.

Theorem 3.20. A control $\bar u \in U_{ad}$ with associated state $\bar y$ is optimal for the problem (3.1)-(3.3) on page 119 if and only if it satisfies, together with the adjoint state $p$ from (3.48), the following conditions for almost all $(x,t) \in \Sigma$: the weak minimum principle

$$\big(\beta(x,t)\,p(x,t) + \lambda\,\bar u(x,t)\big)\big(v - \bar u(x,t)\big) \ge 0 \quad \forall\,v \in [u_a(x,t), u_b(x,t)], \tag{3.51}$$

the minimum principle

$$\beta(x,t)\,p(x,t)\,\bar u(x,t) + \frac{\lambda}{2}\,\bar u(x,t)^2 = \min_{v \in [u_a(x,t), u_b(x,t)]} \Big\{ \beta(x,t)\,p(x,t)\,v + \frac{\lambda}{2}\,v^2 \Big\}, \tag{3.52}$$

and, in the case of $\lambda > 0$, the projection formula

$$\bar u(x,t) = P_{[u_a(x,t), u_b(x,t)]} \Big\{ -\frac{1}{\lambda}\,\beta(x,t)\,p(x,t) \Big\}. \tag{3.53}$$

Proof: For the proof of this assertion one takes, starting from the variational inequality (3.49), the same series of steps that led from Theorem 2.25 on page 67 to Theorem 2.27 on page 69 in the elliptic case. □

Conclusion. In the $\lambda > 0$ case, the triple $(u, y, p)$ satisfies the optimality system

$$\begin{aligned} y_t - \Delta y &= 0 & -p_t - \Delta p &= 0 \\ \partial_\nu y + \alpha\,y &= \beta\,u & \partial_\nu p + \alpha\,p &= 0 \\ y(0) &= y_0 & p(T) &= y(T) - y_\Omega \\ u &= P_{[u_a, u_b]} \Big\{ -\frac{1}{\lambda}\,\beta\,p \Big\}. \end{aligned} \tag{3.54}$$

If $\lambda = 0$, then the projection formula has to be replaced by:

$$u(x,t) = \begin{cases} u_a(x,t) & \text{if } \beta(x,t)\,p(x,t) > 0 \\ u_b(x,t) & \text{if } \beta(x,t)\,p(x,t) < 0. \end{cases}$$

Special case: $u_a = -\infty$, $u_b = \infty$ (no control constraints). In this case, the projection formula yields $u = -\lambda^{-1}\beta p$. Hence, $u$ can be eliminated from the state equation, and we obtain the following forward-backward system of two parabolic problems for the unknown functions $y$ and $p$:

$$\begin{aligned} y_t - \Delta y &= 0 & -p_t - \Delta p &= 0 \\ \partial_\nu y + \alpha\,y &= -\beta^2 \lambda^{-1} p & \partial_\nu p + \alpha\,p &= 0 \\ y(0) &= y_0 & p(T) &= y(T) - y_\Omega. \end{aligned} \tag{3.55}$$

The solution of such systems is not an easy task. Similar equations arise as Hamiltonian systems in the optimal control of ordinary differential equations, where they are solved by means of (multiple) shooting techniques. In the case of partial differential equations, the large number of variables originating from the spatial discretization presents an additional challenge that makes the direct solution of the above optimality system difficult. One possible approach is to apply multigrid methods; see, e.g., Borzi [Bor03] and Borzi and Kunisch [BK01]. Direct solution as an elliptic system is also promising (see the recommendations concerning numerical methods starting on page 170).

3.6.3. Optimal nonstationary heat source. We recall problem (3.38)-(3.40) on page 155, which in shortened form reads

$$\min J(y,u) := \frac12 \|y - y_\Sigma\|^2_{L^2(\Sigma)} + \frac{\lambda}{2} \|u\|^2_{L^2(Q)},$$

subject to $u \in U_{ad}$ and to the state system

$$\begin{aligned} y_t - \Delta y &= \beta\,u \\ \partial_\nu y &= 0 \\ y(0) &= 0. \end{aligned}$$

Invoking the operator $G_Q : L^2(Q) \to W(0,T)$ introduced in (3.36) on page 152, we can express the solution $y$ to the state system in the form

$$y = G_Q(\beta u).$$

The cost functional involves the observation $y|_\Sigma$. The observation operator $E_\Sigma : y \mapsto y|_\Sigma$ is a continuous linear mapping from $W(0,T)$ into $L^2(0,T;L^2(\Gamma)) \cong L^2(\Sigma)$, which entails that the control-to-observation operator $S : u \mapsto y|_\Sigma$ defined in (3.41) on page 156 is continuous from $L^2(Q)$ into $L^2(\Sigma)$. Hence, the problem is equivalent to the problem $\min_{u \in U_{ad}} f(u)$, where $f$ is the reduced functional introduced in (3.42) on page 156. As in (3.50), we obtain the following as the necessary optimality condition for $\bar u$:

$$0 \le (S\bar u - y_\Sigma,\ Su - S\bar u)_{L^2(\Sigma)} + \lambda\,(\bar u,\ u - \bar u)_{L^2(Q)} \quad \forall\,u \in U_{ad},$$

that is, upon substituting $\bar y|_\Sigma = S\bar u$ and $y|_\Sigma = Su$,

$$0 \le \iint_\Sigma (\bar y - y_\Sigma)(y - \bar y)\,ds\,dt + \lambda \iint_Q \bar u\,(u - \bar u)\,dx\,dt \quad \forall\,u \in U_{ad}. \tag{3.56}$$

It is evident how the adjoint state $p$ must be defined, namely as the weak solution to the parabolic problem

$$\begin{aligned} -p_t - \Delta p &= 0 && \text{in } Q \\ \partial_\nu p &= \bar y - y_\Sigma && \text{on } \Sigma \\ p(T) &= 0 && \text{in } \Omega. \end{aligned}$$

By virtue of Lemma 3.17, it has a unique weak solution $p$.

Theorem 3.21. A control $\bar u \in U_{ad}$ is optimal for the optimal nonstationary heat source problem (3.38)-(3.40) on page 155 if and only if it satisfies, together with the adjoint state $p$ defined above, the variational inequality

$$\iint_Q (\beta\,p + \lambda\,\bar u)(u - \bar u)\,dx\,dt \ge 0 \quad \forall\,u \in U_{ad}.$$

Proof: This assertion is again a direct consequence of Theorem 3.18, with the specifications $a_\Sigma = \bar y - y_\Sigma$, $a_\Omega = 0$, $a_Q = 0$, $b_Q = \beta$, $b_\Sigma = 0$, and $b_\Omega = 0$. The steps are similar to those in the proof of Theorem 3.20. □

As in Section 3.6.2, the variational inequality just proved can be transformed into an equivalent pointwise minimum principle for $\bar u$ or, if $\lambda > 0$, into a projection formula. In particular, if $\lambda > 0$,

$$\bar u(x,t) = P_{[u_a(x,t), u_b(x,t)]} \Big\{ -\frac{1}{\lambda}\,\beta(x,t)\,p(x,t) \Big\} \quad \text{for a.e. } (x,t) \in Q.$$

3.6.4. Differential operators in divergence form *.

Statement of the problem and existence of optimal controls. The parabolic problems studied so far have been comparatively simple, since we have confined ourselves for methodological reasons to the Laplacian. However, the theory can easily be extended to more general equations. To this end, we recall the uniformly elliptic differential operator in divergence form introduced on page 37,

$$\mathcal{A}\,y(x) = -\sum_{i,j=1}^N D_i\big(a_{ij}(x)\,D_j y(x)\big),$$

under the assumptions made there. We consider the optimal control problem

$$\min J(y,v,u) := \frac{\lambda_\Omega}{2} \|y(T) - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda_Q}{2} \|y - y_Q\|^2_{L^2(Q)} + \frac{\lambda_\Sigma}{2} \|y - y_\Sigma\|^2_{L^2(\Sigma)} + \frac{\lambda_v}{2} \|v\|^2_{L^2(Q)} + \frac{\lambda_u}{2} \|u\|^2_{L^2(\Sigma)}, \tag{3.57}$$

subject to the parabolic initial-boundary value problem

$$\begin{aligned} y_t + \mathcal{A}\,y + c_0\,y &= \beta_Q\,v && \text{in } Q \\ \partial_{\nu_{\mathcal{A}}} y + \alpha\,y &= \beta_\Sigma\,u && \text{on } \Sigma \\ y(0) &= y_0 && \text{in } \Omega \end{aligned} \tag{3.58}$$

and the control constraints

$$\begin{aligned} v_a(x,t) \le v(x,t) \le v_b(x,t) &\quad \text{for a.e. } (x,t) \in Q \\ u_a(x,t) \le u(x,t) \le u_b(x,t) &\quad \text{for a.e. } (x,t) \in \Sigma. \end{aligned} \tag{3.59}$$

Here, in addition to the quantities introduced in Assumption 3.14 on page 153, we are given nonnegative constants $\lambda_\Omega$, $\lambda_Q$, $\lambda_\Sigma$, $\lambda_v$, and $\lambda_u$, as well as coefficient functions $\beta_Q \in L^\infty(Q)$ and $\beta_\Sigma \in L^\infty(\Sigma)$.

For clearer exposition, the $(x,t)$ dependence of all functions has been suppressed in the parabolic problem. Observe also that the operator $\mathcal{A}$ itself does not depend on $t$; for simplicity, we have dispensed with the time dependence of the coefficient functions $a_{ij}$. Note, however, that the existence result of Theorem 5.1 in Ladyzhenskaya et al. [LSU68] allows time-dependent coefficients $a_{ij}(x,t)$ under appropriate smoothness assumptions; see also Wloka [Wlo87]. The definition of weak solutions to problem (3.58) reads as follows:

Definition. A function $y \in W_2^{1,0}(Q)$ is said to be a weak solution to (3.58) if the following variational equation holds for all $w \in W_2^{1,1}(Q)$ such that $w(\cdot,T) = 0$:

$$\begin{aligned} \iint_Q y(x,t)\,w_t(x,t)\,dx\,dt = {} & \iint_Q \sum_{i,j=1}^N a_{ij}(x)\,D_i y(x,t)\,D_j w(x,t)\,dx\,dt \\ & + \iint_Q \big(c_0(x,t)\,y(x,t) - \beta_Q(x,t)\,v(x,t)\big)\,w(x,t)\,dx\,dt \\ & + \iint_\Sigma \big(\alpha(x,t)\,y(x,t) - \beta_\Sigma(x,t)\,u(x,t)\big)\,w(x,t)\,ds(x)\,dt \\ & - \int_\Omega y_0(x)\,w(x,0)\,dx. \end{aligned}$$

The previous definition of weak solutions is evidently a special case. We now introduce the family of bilinear forms $a[t;\cdot,\cdot] : H^1(\Omega) \times H^1(\Omega) \to \mathbb{R}$ for $t \in [0,T]$,

$$a[t; y, w] = \int_\Omega \Big( \sum_{i,j=1}^N a_{ij}(x)\,D_i y(x)\,D_j w(x) + c_0(x,t)\,y(x)\,w(x) \Big)\,dx + \int_\Gamma \alpha(x,t)\,y(x)\,w(x)\,ds(x).$$

Then the above weak formulation can be rewritten in the following somewhat simpler form: for all $w \in W_2^{1,1}(Q)$ such that $w(T) = 0$, we have, with $H = L^2(\Omega)$,

$$\int_0^T \big\{ -(y(t), w_t(t))_H + a[t; y(t), w(t)] \big\}\,dt = \int_0^T (\beta_Q(t)\,v(t), w(t))_H\,dt + \int_0^T (\beta_\Sigma(t)\,u(t), w(t))_{L^2(\Gamma)}\,dt + (y_0, w(0))_H. \tag{3.60}$$

Again, we have suppressed the dependence on $x$.

By virtue of Theorem 7.9 on page 373, under the above assumptions the initial-boundary value problem (3.58) has for any triple $(v, u, y_0) \in L^2(Q) \times L^2(\Sigma) \times L^2(\Omega)$ a unique weak solution $y \in W_2^{1,0}(Q)$. Moreover, there is a constant $c_P > 0$, which does not depend on the choice of $(v, u, y_0)$, such that

$$\|y\|_{W(0,T)} \le c_P \big(\|v\|_{L^2(Q)} + \|u\|_{L^2(\Sigma)} + \|y_0\|_{L^2(\Omega)}\big). \tag{3.61}$$

As in Theorem 3.12, it can be shown that (possibly after a modification on a set of zero measure) $y$ belongs to $W(0,T)$. Hence, the mapping $(v, u, y_0) \mapsto y$ defines a continuous linear operator from $L^2(Q) \times L^2(\Sigma) \times L^2(\Omega)$ into $W(0,T)$. In particular, the following mappings are continuous:

$(v, u, y_0) \mapsto y$ from $L^2(Q) \times L^2(\Sigma) \times L^2(\Omega)$ into $L^2(Q)$,
$(v, u, y_0) \mapsto y|_\Sigma$ from $L^2(Q) \times L^2(\Sigma) \times L^2(\Omega)$ into $L^2(\Sigma)$,
$(v, u, y_0) \mapsto y(T)$ from $L^2(Q) \times L^2(\Sigma) \times L^2(\Omega)$ into $L^2(\Omega)$.

With this information in hand, we can argue as in the preceding sections to arrive at the following result.

Conclusion. Under the above assumptions, the control problem (3.57)-(3.59) has optimal controls $\bar v$ and $\bar u$. These are uniquely determined if one of the following conditions is fulfilled: either $\lambda_v > 0$ and $\lambda_u > 0$, or $\lambda_Q > 0$ and $\beta_Q$ and $\beta_\Sigma$ are nonzero almost everywhere.

Remark. As in Section 2.3.3, it is possible to subdivide the boundary as $\Gamma = \Gamma_0 \cup \Gamma_1$ and prescribe homogeneous boundary data for $y$ on $\Gamma_0$. We leave it to the reader to work out the details.

Necessary optimality conditions. We state the optimality conditions for problem (3.57)-(3.59) without proof, since the lines of argument follow closely those of the problems discussed previously. Let us consider optimal controls $\bar v \in L^2(Q)$ and $\bar u \in L^2(\Sigma)$ with associated state $\bar y$. The corresponding adjoint state $p$ is the unique weak solution to the adjoint system

$$\begin{aligned} -p_t + \mathcal{A}\,p + c_0\,p &= \lambda_Q\,(\bar y - y_Q) && \text{in } Q \\ \partial_{\nu_{\mathcal{A}}} p + \alpha\,p &= \lambda_\Sigma\,(\bar y - y_\Sigma) && \text{on } \Sigma \\ p(T) &= \lambda_\Omega\,(\bar y(T) - y_\Omega) && \text{in } \Omega. \end{aligned}$$

The necessary and sufficient optimality condition is given by the variational inequalities

$$\iint_Q \big(\beta_Q(x,t)\,p(x,t) + \lambda_v\,\bar v(x,t)\big)\big(v(x,t) - \bar v(x,t)\big)\,dx\,dt \ge 0 \quad \forall\,v \in V_{ad},$$

$$\iint_\Sigma \big(\beta_\Sigma(x,t)\,p(x,t) + \lambda_u\,\bar u(x,t)\big)\big(u(x,t) - \bar u(x,t)\big)\,ds\,dt \ge 0 \quad \forall\,u \in U_{ad},$$

which again can be expressed in terms of pointwise relations or projection formulas. Here, $U_{ad}$ is defined as before, and $V_{ad}$ is the set of all $v \in L^2(Q)$ that respect the constraint $v_a(x,t) \le v(x,t) \le v_b(x,t)$ for almost every $(x,t) \in Q$.

3.7. Numerical methods

We first explain the projected gradient method and the formulation of a finite-dimensional reduced problem, since these techniques are easily implemented for tests. Even today, gradient techniques are still the method of choice for complex problems, for instance in three-dimensional spatial domains. At the end of this section we also provide the reader with a brief overview of recommended, more efficient methods.

3.7.1. Projected gradient methods. We sketch the projected gradient method for the problem of finding the optimal nonstationary boundary temperature. Parabolic problems with distributed control can be treated similarly; see also the elliptic case with distributed control. We have to solve the optimal control problem

$$\min J(y,u) := \frac12 \|y(T) - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2} \|u\|^2_{L^2(\Sigma)},$$

subject to $u_a \le u \le u_b$ and

$$\begin{aligned} y_t - \Delta y &= 0 \\ \partial_\nu y + \alpha\,y &= \beta\,u \\ y(0) &= y_0. \end{aligned} \tag{3.62}$$

As before, let $S : L^2(\Sigma) \to L^2(\Omega)$, $u \mapsto y(T)$, be the operator that, for the homogeneous initial datum $y_0 = 0$, assigns to each control $u$ the final value $y(T)$ of the solution to the above initial-boundary value problem. We also denote by $\hat y$ the solution that corresponds to the inhomogeneous initial datum $y_0$ and the control $u = 0$. Evidently,

$$y(x,T) = (Su)(x) + \hat y(x,T),$$

and the optimal control problem becomes a quadratic Hilbert space optimization problem,

$$\min_{u \in U_{ad}} f(u) := \frac12 \|Su + \hat y(T) - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2} \|u\|^2_{L^2(\Sigma)}.$$

The derivative of $f$ at an iterate $u_n$ is given by

$$f'(u_n)\,v = \iint_\Sigma \big(\beta(x,t)\,p_n(x,t) + \lambda\,u_n(x,t)\big)\,v(x,t)\,ds\,dt,$$

where $p_n$ is the solution to the following adjoint equation:

$$\begin{aligned} -p_t - \Delta p &= 0 \\ \partial_\nu p + \alpha\,p &= 0 \\ p(T) &= y_n(T) - y_\Omega. \end{aligned} \tag{3.63}$$

By the Riesz representation theorem, we obtain the usual representation of the reduced gradient,

$$f'(u_n) = \beta\,p_n + \lambda\,u_n.$$

The algorithm proceeds as follows: suppose that the iterates $u_1, \dots, u_n$ have already been determined.

S1 (New state) Solve the state system (3.62) with $u := u_n$ for $y =: y_n$.

S2 (New descent direction) Calculate the associated adjoint state $p_n$ by solving (3.63). Take as descent direction the negative gradient

$$v_n = -f'(u_n) = -(\beta\,p_n + \lambda\,u_n).$$

S3 (Step size control) Determine the optimal step size $s_n$ by solving

$$f\big(P_{[u_a,u_b]}\{u_n + s_n v_n\}\big) = \min_{s > 0} f\big(P_{[u_a,u_b]}\{u_n + s\,v_n\}\big).$$

S4 Put $u_{n+1} := P_{[u_a,u_b]}\{u_n + s_n v_n\}$, $n := n + 1$, GO TO S1.

The method is completely analogous to that for the elliptic case. The projection step is necessary, since $u_n + s_n v_n$ may not be admissible. Although the method converges only slowly, it is easy to implement and thus very suitable for numerical tests. Note also that parabolic problems require much more computational effort than elliptic ones, since we have time as an additional variable. Therefore, gradient methods are still useful alternatives to methods with higher order of convergence. For a detailed analysis of the method, we refer the reader to [GS80] and [HPUU09].

3.7.2. Derivation of the reduced problem. If problem (3.62) has to be solved several times, say for different initial values $y_0$, final values $y_\Omega$, or regularization parameters $\lambda$, then reducing the problem to a statement involving $u$ only may be worthwhile. This is also the case when the control has the form (3.64) with only a few functions $e_i$.

For simplicity and clarity, we imagine (as in the elliptic case) that the state $y$ can be determined exactly for a given control $u$ and that all integrals that arise can be evaluated exactly.

As in the elliptic case, we assume that the control $u = u(x,t)$ is, in terms of fixed ansatz functions $e_i$, of the form

$$u(x,t) = \sum_{i=1}^m u_i\,e_i(x,t). \tag{3.64}$$

Example. Let $N = 2$ and $\Omega = (0,1)^2$, and imagine that the (one-dimensional) boundary $\Gamma$ is unrolled onto the interval $[0,4]$ on the real axis. If $x$ varies over the boundary $\Gamma$, then we can interpret $u$ as a function of the arc length $s$ and the time $t$. Then $u$ is defined on the rectangle $[0,4] \times [0,T]$, which we split up into $n_s \cdot n_t$ nonoverlapping subrectangles.

We take the control $u$ to be a step function that is constant on each of the subrectangles. In this case, each basis function $e_i$ equals unity on exactly one subrectangle and is zero otherwise; evidently, we have $m = n_s\,n_t$ different basis functions. As mentioned before, the above ansatz can be given a priori without discretization; this is often the case in practical applications. ◇

The functions $y_i(x,T) := (S e_i)(x)$, $i = 1, \dots, m$, have to be computed in advance as the final values of the solutions $y = y_i$ to the initial-boundary value problems

$$\begin{aligned} y_t &= \Delta y \\ \partial_\nu y + \alpha\,y &= \beta\,e_i \\ y(0) &= 0. \end{aligned}$$

In addition, we need to calculate $\hat y(x,T)$, the final value of the solution to the initial-boundary value problem

$$\begin{aligned} \hat y_t &= \Delta \hat y \\ \partial_\nu \hat y + \alpha\,\hat y &= 0 \\ \hat y(0) &= y_0. \end{aligned}$$

Substituting this representation into the cost functional, we obtain the reduced cost function $f_m$ that depends only on the vector $\vec u = (u_1, \dots, u_m)^T$:

$$f_m(\vec u) = \frac12 \Big\| \sum_{i=1}^m u_i\,y_i(T) - y_\Omega + \hat y(T) \Big\|^2_{L^2(\Omega)} + \frac{\lambda}{2} \Big\| \sum_{i=1}^m u_i\,e_i \Big\|^2_{L^2(\Sigma)}.$$

With these specifications, the finite-dimensional approximation of the original optimal control problem reads

$$(P_m) \qquad \min_{\vec u_a \le \vec u \le \vec u_b} f_m(\vec u).$$

Here, the inequality $\vec u_a \le \vec u \le \vec u_b$ is to be understood in the componentwise sense. We also tacitly assume that the restriction $u_a \le u_i \le u_b$ by means of constants $u_a$ and $u_b$ is a meaningful substitute for the original constraint $u_a \le u(x,t) \le u_b$. This is certainly the case for the step functions $e_i$ introduced in the above example. Alternatively, the restrictions $u_a \le u_i \le u_b$ can be postulated a priori in the case of more general ansatz functions $e_i$.

A straightforward calculation yields

$$f_m(\vec u) = \frac12 \|\hat y(T) - y_\Omega\|^2_{L^2(\Omega)} + \sum_{i=1}^m u_i\,\big(\hat y(T) - y_\Omega,\ y_i(T)\big)_{L^2(\Omega)} + \frac12 \sum_{i,j=1}^m u_i u_j\,\big(y_i(T),\ y_j(T)\big)_{L^2(\Omega)} + \frac{\lambda}{2} \sum_{i,j=1}^m u_i u_j\,(e_i, e_j)_{L^2(\Sigma)}.$$

Consequently, $(P_m)$ is, up to the constant $\frac12 \|\hat y(T) - y_\Omega\|^2_{L^2(\Omega)}$, equivalent to the finite-dimensional reduced quadratic optimization problem

$$\min \Big\{ \frac12\,\vec u^{\,T} (C + \lambda D)\,\vec u + \vec a^{\,T} \vec u \Big\}, \qquad \vec u_a \le \vec u \le \vec u_b. \tag{3.65}$$
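A box-constrained quadratic problem of the form (3.65) is small enough to be treated by the projected gradient idea of Section 3.7.1 in finite dimensions. The sketch below is only an illustration under stated assumptions: the function name `projected_gradient_qp` is hypothetical, the matrix `H` stands for $C + \lambda D$ and is assumed symmetric positive definite, and the exact line search of step S3 is replaced by the fixed step size $1/L$, where $L$ is the largest eigenvalue of `H`. The projection onto the box is the componentwise clipping $\max\{u_a, \min\{u_b, z\}\}$.

```python
import numpy as np


def projected_gradient_qp(H, a, lower, upper, max_iter=10_000, tol=1e-12):
    """Minimize 0.5*u^T H u + a^T u subject to lower <= u <= upper.

    Hypothetical sketch: H is assumed symmetric positive definite
    (H = C + lambda*D in the notation of (3.65)). A fixed step 1/L, with L
    the largest eigenvalue of H, replaces the exact line search of step S3.
    """
    H = np.asarray(H, dtype=float)
    a = np.asarray(a, dtype=float)
    step = 1.0 / np.linalg.eigvalsh(H)[-1]       # eigvalsh sorts ascending
    u = np.clip(np.zeros_like(a), lower, upper)  # admissible starting point
    for _ in range(max_iter):
        gradient = H @ u + a                     # gradient of the quadratic
        # Gradient step followed by projection onto the box [lower, upper].
        u_new = np.clip(u - step * gradient, lower, upper)
        if np.linalg.norm(u_new - u) <= tol:
            return u_new
        u = u_new
    return u
```

For instance, with `H = 2*np.eye(2)`, `a = [-2, -8]` and the box $[0,3]^2$, the unconstrained minimizer $(1,4)$ violates the upper bound in its second component, and the routine returns the clipped point $(1,3)$. For larger or ill-conditioned problems, a dedicated quadratic programming solver is the more robust choice.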

In order to set up this problem, the following quantities must be computed beforehand:

$$\begin{aligned} \vec a &= (a_i), & a_i &= \big(\hat y(T) - y_\Omega,\ y_i(T)\big)_{L^2(\Omega)} \\ C &= (c_{ij}), & c_{ij} &= \big(y_i(T),\ y_j(T)\big)_{L^2(\Omega)} \\ D &= (d_{ij}), & d_{ij} &= (e_i, e_j)_{L^2(\Sigma)}. \end{aligned}$$

In the example of the step functions, we have $(e_i, e_j)_{L^2(\Sigma)} = \delta_{ij}\,\|e_i\|^2_{L^2(\Sigma)}$, and $D$ is a diagonal matrix, $D = \mathrm{diag}(\|e_i\|^2_{L^2(\Sigma)})$.

The reduced problem can be solved by using a standard code of quadratic optimization, e.g., quadprog in MATLAB. Alternatively, an active-set strategy could be implemented. The internet website of NEOS (NEOS Server for Optimization) contains a list of other available codes.

The approach described here is worthwhile only if $m$ is not too large, since the determination of all of the involved quantities requires the numerical solution of $m+1$ parabolic initial-boundary value problems. In principle, the method works as long as the numerical solution of the heat equation can be handled, that is, possibly also for spatially three-dimensional domains. It can be recommended whenever $u$ is a priori given as the linear combination (3.64) of a small number of ansatz functions with unknown (control) components $u_i$.

Recommended numerical methods. We now give an overview of some methods that have been applied successfully to numerous problems and can be recommended for the numerical solution of linear-quadratic parabolic optimal control problems.

The unconstrained case. If the choice of the control $u$ is unrestricted, then the coupled system (3.55) on page 162 consisting of state and adjoint systems can be solved, provided it can be handled numerically. This does not present major difficulties in one-dimensional domains. Many available codes can also handle simple two-dimensional geometries. In the case of domains in higher dimensions, multigrid techniques can be tried; see [Bor03] or [BK01].

Primal-dual active set strategies. These techniques are among the most often used methods in recent years; see Ito and Kunisch [IK00], Bergounioux et al. [BIK99], Kunisch and Rösch [KR02], as well as the detailed exposition in Ito and Kunisch [IK08]. The paper [KR02] treats the case of a general continuous linear mapping $u \mapsto y = Su$ and therefore covers parabolic problems, while the other references deal only with elliptic problems.

As in the elliptic case, at each iteration step one updates active sets for the upper and lower box constraints; the control is fixed in the next step by taking the corresponding upper or lower threshold value. To determine the remaining values, one again solves a forward-backward parabolic system without constraints.

Direct solution of the optimality system (3.54). In the presence of control constraints, it is also very promising to solve the optimality system (3.54) directly. To this end, we substitute $u = P_{[u_a,u_b]}\{-\lambda^{-1}\beta p\}$ in the state equation, using the representation $P_{[u_a,u_b]}(z) = \max\{u_a, \min\{u_b, z\}\}$, to arrive at the following nonlinear system for $y$ and $p$:

$$\begin{aligned} y_t - \Delta y &= 0 \\ \partial_\nu y + \alpha\,y &= \beta \max\{u_a, \min\{u_b, -\lambda^{-1}\beta p\}\} \\ y(0) &= y_0, \\ -p_t - \Delta p &= 0 \\ \partial_\nu p + \alpha\,p &= 0 \\ p(T) &= y(T) - y_\Omega. \end{aligned}$$

The only points of non-differentiability of the maximum and minimum functions appearing in this system are the points $u_a$ and $u_b$; these functions are, however, globally Newton differentiable; see Ito and Kunisch [IK08]. These facts explain the successes that have been reported in the literature for the direct numerical solution of the above nonsmooth system. Another possibility is to approximate the maximum and minimum to very high accuracy by smooth functions (which in some codes is done automatically); the (smooth) nonlinear system of two parabolic problems thus generated is then solved numerically. Neitzel et al. [NPS09] report good experiences with this technique using existing software packages.

Optimization after full discretization. This method, often referred to as discretize then optimize, consists (as in the elliptic case) of a full discretization of both the parabolic problem and the cost functional, leading to a (large) optimization problem. It is easily performed, in particular, for spatially one-dimensional parabolic problems; in this case, existing solvers for finite-dimensional quadratic optimization problems may be employed.

3.8. Derivation of Fourier expansions

For the sake of completeness, we derive in this section the Fourier series (3.14). To this end, we consider the weak solution $y \in W_2^{1,0}(Q) \cap W(0,T)$

to the parabolic problem

$$
\begin{aligned}
y_t(x,t) - y_{xx}(x,t) &= f(x,t) && \text{in } Q = (0,1) \times (0,T)\\
y_x(0,t) &= 0 && \text{in } (0,T)\\
y_x(1,t) + \alpha\, y(1,t) &= u(t) && \text{in } (0,T)\\
y(x,0) &= y_0(x) && \text{in } (0,1),
\end{aligned}
\tag{3.66}
$$

for fixed $\alpha > 0$, $T > 0$, and given functions $f \in L^2(Q)$, $u \in L^2(0,T)$, and $y_0 \in L^2(0,1)$. The (normalized) eigenfunctions $\{v_n(x)\}_{n=1}^{\infty}$ of the differential operator $A = -\partial^2/\partial x^2$ with the homogeneous boundary conditions

$$
\frac{\partial v_n}{\partial x}(0) = 0, \qquad \frac{\partial v_n}{\partial x}(1) + \alpha\, v_n(1) = 0
\tag{3.67}
$$

form a complete orthonormal system in $L^2(0,1)$; see, e.g., Tychonov and Samarski [TS64]. We aim to express $y$ in the form of a series expansion,

$$
y(x,t) = \sum_{n=1}^{\infty} v_n(x)\, z_n(t),
\tag{3.68}
$$

where the functions $v_n$ and $z_n$ are yet to be determined. First, we construct the eigenfunctions $v_n$. Let $\{\lambda_n\}_{n=1}^{\infty}$ denote the eigenvalues of $A$, i.e., $A v_n = \lambda_n v_n$. Then $v_n$ satisfies the homogeneous boundary conditions (3.67) and

$$
v_n''(x) + \lambda_n v_n(x) = 0 \quad \forall x \in [0,1].
\tag{3.69}
$$

The ansatz $v_n(x) = c_1 \cos(\sqrt{\lambda}\,x) + c_2 \sin(\sqrt{\lambda}\,x)$ yields, upon taking (3.67) into account, that $c_2 = 0$; consequently, $\alpha \cos(\sqrt{\lambda}) = \sqrt{\lambda}\,\sin(\sqrt{\lambda})$. Putting $\mu := \sqrt{\lambda}$, we find that $\mu$ has to satisfy the equation

$$
\mu \tan \mu = \alpha.
\tag{3.70}
$$

This equation has a countably infinite number of positive solutions $\mu_n$, which we arrange as an increasing sequence $\{\mu_n\}_{n=1}^{\infty}$. The associated functions $\cos(\mu_n x)$ satisfy the orthogonality relations

$$
\int_0^1 \cos(\mu_n x)\cos(\mu_\ell x)\,dx =
\begin{cases}
N_n, & n = \ell\\
0, & n \neq \ell,
\end{cases}
$$

with $N_n := \frac{1}{2} + \frac{\sin(2\mu_n)}{4\mu_n}$. Hence, the functions

$$
v_n(x) = \frac{1}{\sqrt{N_n}} \cos(\mu_n x)
$$

form an orthonormal system that, owing to the theory of Sturm-Liouville eigenvalue problems, is complete (see, e.g., [TS64]). The initial condition in (3.66) and the ansatz (3.68) imply that, after formally interchanging limit and summation,

$$
y_0(x) = \lim_{t \downarrow 0} y(x,t) = \sum_{n=1}^{\infty} v_n(x)\, z_n(0).
\tag{3.71}
$$

On the other hand, by virtue of the theory of Fourier series in Hilbert spaces,

$$
y_0(x) = \sum_{n=1}^{\infty} y_{0,n}\, v_n(x),
$$

with the Fourier coefficients

$$
y_{0,n} = \int_0^1 v_n(x)\, y_0(x)\,dx.
\tag{3.72}
$$

Comparison of the coefficients shows that

$$
z_n(0) = \int_0^1 v_n(x)\, y_0(x)\,dx.
$$

Next, we use the weak formulation (3.25) (on page 140) of the parabolic problem (3.66) in the spatially one-dimensional case and use $v(x,t) = \varphi(t)\, v_n(x)$, where $\varphi \in H^1(0,T)$ and $\varphi(T) = 0$, as the test function. It follows that

$$
\begin{aligned}
&-\int_0^T\!\!\int_0^1 y(x,t)\,\varphi'(t)\,v_n(x)\,dx\,dt + \int_0^T\!\!\int_0^1 y_x(x,t)\,\varphi(t)\,v_n'(x)\,dx\,dt + \alpha \int_0^T y(1,t)\,\varphi(t)\,v_n(1)\,dt\\
&\qquad = \int_0^T\!\!\int_0^1 f(x,t)\,\varphi(t)\,v_n(x)\,dx\,dt + \int_0^T u(t)\,\varphi(t)\,v_n(1)\,dt + \int_0^1 y_0(x)\,\varphi(0)\,v_n(x)\,dx.
\end{aligned}
\tag{3.73}
$$

Substitution of the series expansion (3.68) in the first integral of (3.73) yields, upon invoking the orthogonality relations,

$$
\int_0^T\!\!\int_0^1 y(x,t)\,\varphi'(t)\,v_n(x)\,dx\,dt = \int_0^T z_n(t)\,\varphi'(t)\,dt.
\tag{3.74}
$$
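The construction of the eigenfunctions can be checked numerically. The following Python sketch is an illustration only and not part of the original text: it assumes $\alpha > 0$, locates the roots $\mu_n$ of (3.70) by bisection on the interval $((n-1)\pi, (n-1)\pi + \pi/2)$, and verifies the orthonormality of the $v_n$ with a composite Simpson rule. The function names `robin_eigenvalues`, `eigenfunction`, and `inner_product` are hypothetical helpers chosen for this example.

```python
import math


def robin_eigenvalues(alpha, count):
    """First `count` positive roots of mu * tan(mu) = alpha, assuming alpha > 0."""
    roots = []
    for k in range(count):
        # g(mu) = mu*sin(mu) - alpha*cos(mu) has exactly one root in
        # (k*pi, k*pi + pi/2); the sign factor makes g increase from <0 to >0.
        sign = -1.0 if k % 2 else 1.0
        lo, hi = k * math.pi, k * math.pi + 0.5 * math.pi
        for _ in range(200):  # plain bisection, far more steps than needed
            mid = 0.5 * (lo + hi)
            if sign * (mid * math.sin(mid) - alpha * math.cos(mid)) < 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots


def eigenfunction(mu):
    """Normalized eigenfunction v(x) = cos(mu*x) / sqrt(N), N = 1/2 + sin(2mu)/(4mu)."""
    norm_sq = 0.5 + math.sin(2.0 * mu) / (4.0 * mu)
    return lambda x: math.cos(mu * x) / math.sqrt(norm_sq)


def inner_product(f, g, n=2000):
    """L2(0,1) inner product by the composite Simpson rule (n must be even)."""
    h = 1.0 / n
    total = f(0.0) * g(0.0) + f(1.0) * g(1.0)
    for i in range(1, n):
        x = i * h
        total += (4.0 if i % 2 else 2.0) * f(x) * g(x)
    return total * h / 3.0


if __name__ == "__main__":
    mus = robin_eigenvalues(alpha=1.0, count=3)
    v1, v2 = eigenfunction(mus[0]), eigenfunction(mus[1])
    print(mus)
    print(inner_product(v1, v1), inner_product(v1, v2))
```

Bisection is used here only because it needs no derivative and no external library; any bracketing root finder would serve equally well.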

The second integral can be transformed as follows, using (3.67), (3.69), the definition of $\mu_n$, and the pairwise orthogonality of the $v_n$:

$$
\begin{aligned}
\int_0^T\!\!\int_0^1 y_x(x,t)\,\varphi(t)\,v_n'(x)\,dx\,dt
&= \int_0^T \big( y(1,t)\,\varphi(t)\,v_n'(1) - y(0,t)\,\varphi(t)\,v_n'(0) \big)\,dt - \int_0^T\!\!\int_0^1 y(x,t)\,\varphi(t)\,v_n''(x)\,dx\,dt\\
&= -\alpha \int_0^T y(1,t)\,\varphi(t)\,v_n(1)\,dt + \mu_n^2 \int_0^T\!\!\int_0^1 y(x,t)\,\varphi(t)\,v_n(x)\,dx\,dt\\
&= -\alpha \int_0^T y(1,t)\,\varphi(t)\,v_n(1)\,dt + \mu_n^2 \int_0^T z_n(t)\,\varphi(t)\,dt.
\end{aligned}
\tag{3.75}
$$

Putting (3.74) and (3.75) into (3.73) yields that for all $\varphi \in H^1(0,T)$ such that $\varphi(T) = 0$,

$$
-\int_0^T z_n(t)\,\varphi'(t)\,dt + \mu_n^2 \int_0^T z_n(t)\,\varphi(t)\,dt
= \int_0^T \Big( \int_0^1 f(x,t)\,v_n(x)\,dx + u(t)\,v_n(1) \Big)\varphi(t)\,dt + \int_0^1 y_0(x)\,v_n(x)\,dx\;\varphi(0).
\tag{3.76}
$$

This is none other than the weak formulation of the ordinary initial value problem

$$
\begin{aligned}
z_n'(t) + \mu_n^2 z_n(t) &= F_n(t) \quad \text{in } (0,T)\\
z_n(0) &= z_{0,n},
\end{aligned}
\tag{3.77}
$$

where

$$
\begin{aligned}
F_n(t) &= \int_0^1 f(x,t)\,v_n(x)\,dx + u(t)\,v_n(1)\\
z_{0,n} &= \int_0^1 y_0(x)\,v_n(x)\,dx.
\end{aligned}
\tag{3.78}
$$

The solution to (3.77) is given by the variation of constants formula

$$
z_n(t) = e^{-\mu_n^2 t} z_n(0) + \int_0^t e^{-\mu_n^2 (t-s)} F_n(s)\,ds,
$$

whence, owing to (3.78),

$$
z_n(t) = \int_0^1 y_0(x)\,v_n(x)\,e^{-\mu_n^2 t}\,dx + \int_0^t\!\!\int_0^1 e^{-\mu_n^2 (t-s)} f(x,s)\,v_n(x)\,dx\,ds + \int_0^t e^{-\mu_n^2 (t-s)} u(s)\,v_n(1)\,ds.
\tag{3.79}
$$

Finally, we replace the integration variable $x$ in (3.79) by $\xi$ and insert this expression for $z_n$ into (3.68). Interchanging the summation and integration then yields

$$
\begin{aligned}
y(x,t) ={}& \int_0^1 \sum_{n=1}^{\infty} v_n(x)\,v_n(\xi)\,e^{-\mu_n^2 t}\,y_0(\xi)\,d\xi\\
&+ \int_0^t\!\!\int_0^1 \sum_{n=1}^{\infty} v_n(x)\,v_n(\xi)\,e^{-\mu_n^2 (t-s)} f(\xi,s)\,d\xi\,ds\\
&+ \int_0^t \sum_{n=1}^{\infty} v_n(x)\,v_n(1)\,e^{-\mu_n^2 (t-s)} u(s)\,ds.
\end{aligned}
\tag{3.80}
$$

By the definition of the Green's function $G$, this is the asserted formula (3.14) for $\alpha > 0$. The $\alpha = 0$ case can be treated in a similar way. The reader will be asked to do this in Exercise 3.11.

Remarks. In the above derivation, we have repeatedly interchanged limit or integration with summation. This needs careful justification, since the infinite series under the integrals in (3.80) are not uniformly convergent. However, for fixed $t$, the function $s \mapsto \sum_{n=1}^{\infty} e^{-\mu_n^2 (t-s)}$ is integrable over $[0,t]$. Indeed, since $\mu_n^2 \sim (n-1)^2\pi^2$ as $n \to \infty$, we have

$$
\sum_{n=1}^{\infty} \int_0^t e^{-\mu_n^2 (t-s)}\,ds = \sum_{n=1}^{\infty} \frac{1}{\mu_n^2}\big(1 - e^{-\mu_n^2 t}\big) < \infty.
$$

Consequently, if $f$ and $u$ are continuous or at least bounded, then the partial sums of the series containing $f$ and $u$ are easily seen to be majorized by an integrable function. Moreover, they converge pointwise on $[0,t)$. Hence, we can infer from Lebesgue's dominated convergence theorem that integration and summation may be interchanged in this case. The treatment of the integral containing $y_0$ and relation (3.71) follow easily from the $L^2$ theory of Fourier series; indeed, the sequence of Fourier coefficients $\{z_n(0)\}$ of $y_0$ is square summable, and the continuous functions $z_n(t)$ can be estimated as above. We leave the details to the reader. Finally, if $f$ and $u$ are unbounded, then they are approximated by sequences of continuous functions in $L^2(Q)$ and $L^2(0,T)$, respectively.

If $y_0$, $f$, and $u$ are sufficiently smooth, then (3.80) even defines a classical solution $y$ to the parabolic initial-boundary value problem; see, e.g., Tychonov and Samarski [TS64].

3.9. Linear continuous functionals as right-hand sides *

Parabolic equations can be written more generally than in the previous sections as equations in $L^2(0,T;V^*)$. To this end, we once more consider the initial-boundary value problem

$$
\begin{aligned}
y_t + A y + c_0\, y &= f && \text{in } Q\\
\partial_{\nu_A} y + \alpha\, y &= g && \text{on } \Sigma\\
y(0) &= y_0 && \text{in } \Omega,
\end{aligned}
\tag{3.81}
$$

with given functions $f \in L^2(0,T;L^2(\Omega))$, $g \in L^2(0,T;L^2(\Gamma))$ and coefficient functions $c_0 \in L^\infty(\Omega)$, $\alpha \in L^\infty(\Gamma)$. The elliptic differential operator $A$ is defined as in (2.19) on page 37, with bounded and measurable coefficients $a_{ij}$ that obey the symmetry condition and the uniform ellipticity condition (2.20). The weak formulation for $y \in W(0,T)$ reads as follows: $y(0) = y_0$ and, for all $v \in L^2(0,T;V)$,

$$
\int_0^T \big(y_t(t), v(t)\big)_{V^*,V}\,dt + \int_0^T a[y(t), v(t)]\,dt = \int_0^T \big(F(t), v(t)\big)_{V^*,V}\,dt.
$$

Here, $a = a[y,v]$ is the bilinear form associated with $A$, introduced on page 38. Moreover, $V = H^1(\Omega)$, and the vector-valued function $F : [0,T] \to V^*$ is defined by

$$
F(t)\,v = \int_\Omega f(x,t)\,v(x)\,dx + \int_\Gamma g(x,t)\,v(x)\,ds(x).
$$

It was proved in Section 2.13 that the bilinear form $a$ generates a continuous linear operator $A : V \to V^*$ upon taking $a[y,v] = (Ay, v)_{V^*,V}$. Therefore, the weak form of (3.81) can be rewritten as follows: $y(0) = y_0$ and, for all $v \in L^2(0,T;V)$,

$$
\int_0^T \big(y_t(t), v(t)\big)_{V^*,V}\,dt + \int_0^T \big(Ay(t), v(t)\big)_{V^*,V}\,dt = \int_0^T \big(F(t), v(t)\big)_{V^*,V}\,dt.
$$

Since $f$ and $g$ are square integrable vector-valued functions, we have

$$
\|F(\cdot)\|_{L^2(0,T;V^*)} \le c\,\big(\|f\|_{L^2(Q)} + \|g\|_{L^2(\Sigma)}\big),
$$

so that all integrals in the above relation exist. Now, the preceding equation is equivalent to

$$
\int_0^T \big(y_t(t) + Ay(t) - F(t), v(t)\big)_{V^*,V}\,dt = 0 \quad \forall v \in L^2(0,T;V),
$$

where $y_t$ is to be understood as a (regular) vector-valued distribution belonging to $L^2(0,T;V^*)$. But this implies that

$$
y_t(t) + Ay(t) - F(t) = 0 \quad \text{in } L^2(0,T;V^*),
$$

so the initial-boundary value problem (3.81) finally becomes

$$
\begin{aligned}
y'(t) + A y(t) &= F(t) \in V^* \quad \text{for a.e. } t \in (0,T)\\
y(0) &= y_0.
\end{aligned}
$$

Here, any arbitrary functional $F \in L^2(0,T;V^*)$ is permitted on the right-hand side. Now observe that under our assumptions we have, for almost every $t \in (0,T)$,

$$
\begin{aligned}
|a[y,v]| &\le \alpha_0\, \|y\|_V \|v\|_V\\
a[v,v] &\ge \beta\, \|v\|^2_{H^1(\Omega)} - \beta_0\, \|v\|^2_{L^2(\Omega)}
\end{aligned}
\tag{3.82}
$$

for all $y, v \in V$, with constants $\alpha_0 > 0$, $\beta > 0$ and $\beta_0 \in \mathbb{R}$.

Theorem 3.22. Suppose that (3.82) is satisfied. Then the evolution problem

$$
\begin{aligned}
y'(t) + A y(t) &= F(t) \in V^* \quad \text{for a.e. } t \in (0,T)\\
y(0) &= y_0
\end{aligned}
$$

has for any $F \in L^2(0,T;V^*)$ and any $y_0 \in H = L^2(\Omega)$ a unique solution $y \in W(0,T)$. Moreover, there exists a constant $c_P > 0$ such that

$$
\|y\|_{W(0,T)} \le c_P\,\big(\|F\|_{L^2(0,T;V^*)} + \|y_0\|_H\big).
$$

A more general version of this theorem (which also applies to the case of time-dependent coefficients) and its proof can be found, e.g., in Gajewski et al. [GGZ74] and Wloka [Wlo87].

An application. By virtue of Theorem 3.22, the adjoint state of a parabolic problem can be interpreted as a Lagrange multiplier associated with the parabolic differential equation. To this end, the differential equation $y' + Ay - F = 0$ is regarded as a constraint in the range space $L^2(0,T;V^*)$. Since the mapping $y \mapsto y' + Ay$ is surjective, the Karush-Kuhn-Tucker theorem, Theorem 6.3 on page 330, yields the existence of a Lagrange multiplier $z^* = p \in L^2(0,T;V^*)^* = L^2(0,T;V)$, where the latter equality follows from $(V^*)^* = V$. Expositions of this technique in the theory of optimal control can be found in, e.g., Lions [Lio71], Hinze et al. [HPUU09], and Neittaanmäki and Tiba [NT94].

3.10. Exercises

3.1 The function $y(x,t) = \ldots$ belongs to the space $C([0,T], L^1(0,1))$. Compute its norm. To which of the spaces $L^p(0,T; L^q(0,1))$, $1 \le p, q \le \infty$, does this function belong?

3.2 Prove the existence of an optimal control for the problem (3.1)-(3.3) on page 119 with inhomogeneous initial state $y_0$.

3.3 Extend the notion of a weak solution to the initial-boundary value problem (3.58) on page 164 to the problem with mixed boundary conditions:

$$
\begin{aligned}
y_t + A y + c_0\, y &= \ldots && \text{in } Q\\
\partial_{\nu_A} y + \alpha\, y &= \ldots && \text{on } \Sigma_1\\
y &= 0 && \text{on } \Sigma_0\\
y(0) &= y_0 && \text{in } \Omega.
\end{aligned}
$$

Here, $\Sigma_i = \Gamma_i \times (0,T)$, $i = 0, 1$, where the boundary pieces $\Gamma_i$ are defined as in Section 2.3.3.

3.4 Determine the first-order necessary optimality conditions for the problem (3.57)-(3.59) on page 164 by means of the formal Lagrange method.

3.5 Investigate the problem (3.1)-(3.3) on page 119 with the extended cost functional

$$
\begin{aligned}
\tilde{J}(y,u,v) := J(y,u,v) &+ \iint_Q a_Q(x,t)\,y(x,t)\,dx\,dt + \iint_\Sigma a_\Sigma(x,t)\,y(x,t)\,ds(x)\,dt\\
&+ \iint_Q v_Q(x,t)\,v(x,t)\,dx\,dt + \iint_\Sigma u_\Sigma(x,t)\,u(x,t)\,ds(x)\,dt,
\end{aligned}
$$

where $J$ is the cost functional defined in (3.1) and the functions $a_Q, v_Q \in L^2(Q)$ and $a_\Sigma, u_\Sigma \in L^2(\Sigma)$ are prescribed. Do optimal controls exist for this problem? Derive the first-order necessary optimality conditions.

3.6 Consider the initial value control problem

$$
\min J(y,w) := \frac{1}{2}\,\|y(T) - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2}\,\|w\|^2_{L^2(\Omega)},
$$

subject to

$$
\begin{aligned}
y_t - \Delta y &= 0 && \text{in } Q\\
\partial_\nu y + y &= 0 && \text{on } \Sigma\\
y(0) &= w && \text{in } \Omega
\end{aligned}
$$

and $w \in L^2(\Omega)$, $|w(x)| \le 1$ for almost every $x \in \Omega$. Suppose that Assumption 3.14 on page 153 holds. Show the existence of an optimal control and derive the necessary optimality conditions.

3.7 Write a MATLAB program for the numerical solution of the initial-boundary value problem

$$
\begin{aligned}
y_t(x,t) - y_{xx}(x,t) &= 0 && \text{in } (0,\ell) \times (0,T)\\
y_x(0,t) &= 0 && \text{in } (0,T)\\
y_x(\ell,t) + y(\ell,t) &= u(t) && \text{in } (0,T)\\
y(x,0) &= y_0(x) && \text{in } (0,\ell).
\end{aligned}
\tag{3.83}
$$

Use the implicit Euler method with time step $\tau$ for the discretization of $(0,T)$ and the difference quotient (2.91) on page 97 with step size $h$ for the discretization of $(0,\ell)$. Take the control $u$ to be a step function corresponding to the partition of $(0,T)$, and use the values of the given function $y_0$ at the spatial grid points.

3.8 Use the program from the preceding exercise to establish the finite-dimensional reduced problem for the optimal control problem

$$
\min J(y,u) := \frac{1}{2}\,\|y(T) - y_\Omega\|^2_{L^2(0,1)} + \frac{\lambda}{2}\,\|u\|^2_{L^2(0,T)},
$$

subject to (3.83) and the box constraints $|u(t)| \le 1$. Employ the method described in Section 3.7.2. Use the MATLAB code quadprog to solve the problem for the values chosen by Schittkowski [Sch79]: $\ell = 1$, $T = 1.58$, $y_\Omega(x) = 0.5\,(1 - x^2)$, $y_0(x) = 0$, $\lambda = 10^{-3}$. Take the time step to be $\tau = 1/100$ and fit the spatial step size $h$ accordingly.

3.9 Solve the reduced problem from the preceding exercise also for the choices $\lambda = 10^{-k}$, $k = -1, 0, \ldots, 5$, and $\lambda = 0$. Solve the same problem for the function $y_\Omega(x) = 0.5\,(1 - x)$, and halve the time step $\tau$ several times. What do you observe? Interpret your findings in light of Theorem 3.7 on page 133.

3.10 Develop, by slightly modifying the code written in Exercise 3.7, a program for the solution of the adjoint problem corresponding to the above optimal control problem. Compute the adjoint states associated with the optimal final states $y(T)$ obtained in Exercises 3.8 and 3.9. Verify the necessary optimality conditions numerically using the projection formula.

3.11 Use the method of Section 3.8 to construct the Green's function (3.14) in the case where $\alpha = 0$.
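The discretization described in Exercise 3.7 might be organized as in the following Python sketch (Python is used here instead of MATLAB purely for illustration). Several choices are assumptions that the text does not fix: the difference quotient (2.91) is taken to be the standard central second difference, the boundary conditions are imposed through ghost points, and the names `solve_tridiagonal` and `heat_robin_implicit_euler` are hypothetical. The control is passed as a list holding one value per time step.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def heat_robin_implicit_euler(y0, u, length, final_time, n_space, n_time):
    """Implicit Euler for y_t - y_xx = 0, y_x(0,t) = 0, y_x(l,t) + y(l,t) = u(t).

    y0 is a function of x; u is a list with one control value per time step.
    Returns the grid values of y at the final time.
    """
    h = length / n_space
    tau = final_time / n_time
    r = tau / h ** 2
    m = n_space + 1
    lower = [-r] * m
    diag = [1.0 + 2.0 * r] * m
    upper = [-r] * m
    upper[0] = -2.0 * r                      # ghost point for y_x(0, t) = 0
    lower[-1] = -2.0 * r                     # ghost point for the Robin condition
    diag[-1] = 1.0 + 2.0 * r + 2.0 * r * h
    y = [y0(j * h) for j in range(m)]
    for k in range(n_time):
        rhs = list(y)
        rhs[-1] += 2.0 * r * h * u[k]        # control enters at x = length
        y = solve_tridiagonal(lower, diag, upper, rhs)
    return y


if __name__ == "__main__":
    # Data in the spirit of Exercise 3.8: zero initial state, constant control 1.
    y_final = heat_robin_implicit_euler(lambda x: 0.0, [1.0] * 100, 1.0, 1.58, 50, 100)
    print(y_final[0], y_final[-1])
```

Because the system matrix does not change from one time step to the next, a production code would factorize it once; the sketch re-solves it in every step only to stay short.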
Chapter 4

Optimal control of semilinear elliptic equations

4.1. Preliminary remarks

In the previous chapters, we confined ourselves to linear partial differential equations and thus excluded many important applications. From now on, we will also consider nonlinear equations. A typical example is given by the optimal control problem

$$
\min J(y,u) := \frac{1}{2}\,\|y - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2} \int_\Omega |u(x)|^2\,dx,
\tag{4.1}
$$

subject to

$$
\begin{aligned}
-\Delta y + y + y^3 &= u && \text{in } \Omega\\
\partial_\nu y &= 0 && \text{on } \Gamma
\end{aligned}
\tag{4.2}
$$

and

$$
u_a \le u(x) \le u_b \quad \text{for a.e. } x \in \Omega.
\tag{4.3}
$$

The elliptic equation occurring in problem (4.2) is semilinear. Once more, we will have to discuss relevant questions such as well-posedness of (4.2), existence of optimal controls, first-order necessary optimality conditions, and numerical methods. We should expect, however, that the corresponding analysis will become more difficult.
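To make the state equation (4.2) concrete, the following Python sketch solves a one-dimensional analogue, $-y'' + y + y^3 = u$ on $(0,1)$ with $y'(0) = y'(1) = 0$, by Newton's method. It is an illustration under stated assumptions and not a method taken from the text: the central difference discretization with ghost points, the zero starting guess, and the names `solve_tridiagonal` and `solve_semilinear_state` are all choices made for this example.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_semilinear_state(u, n_space=50, tol=1e-10, max_iter=50):
    """Newton iteration for -y'' + y + y^3 = u on (0,1) with y'(0) = y'(1) = 0."""
    h = 1.0 / n_space
    m = n_space + 1
    inv_h2 = 1.0 / h ** 2
    xs = [j * h for j in range(m)]
    u_vals = [u(x) for x in xs]
    y = [0.0] * m                                # starting guess (an assumption)
    lower = [-inv_h2] * m
    upper = [-inv_h2] * m
    upper[0] = -2.0 * inv_h2                     # ghost point at x = 0
    lower[-1] = -2.0 * inv_h2                    # ghost point at x = 1
    for _ in range(max_iter):
        # Residual F(y) = A_h y + y + y^3 - u of the discretized equation.
        res = [0.0] * m
        res[0] = 2.0 * inv_h2 * (y[0] - y[1])
        for j in range(1, m - 1):
            res[j] = inv_h2 * (-y[j - 1] + 2.0 * y[j] - y[j + 1])
        res[-1] = 2.0 * inv_h2 * (y[-1] - y[-2])
        res = [res[j] + y[j] + y[j] ** 3 - u_vals[j] for j in range(m)]
        if max(abs(v) for v in res) < tol:
            break
        # Jacobian A_h + I + diag(3 y^2) is tridiagonal.
        diag = [2.0 * inv_h2 + 1.0 + 3.0 * y[j] ** 2 for j in range(m)]
        delta = solve_tridiagonal(lower, diag, upper, [-v for v in res])
        y = [y[j] + delta[j] for j in range(m)]
    return xs, y


if __name__ == "__main__":
    xs, y = solve_semilinear_state(lambda x: 2.0)
    print(max(abs(v - 1.0) for v in y))          # constant u = 2 gives y = 1
```

For a constant right-hand side $u \equiv c$ the solution is the constant $y$ with $y + y^3 = c$, which gives a simple consistency check; the linearization $-\delta'' + \delta + 3y^2\delta$ used in each Newton step is the same operator that appears in the adjoint equation.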


One might argue that, for instance, the necessary optimality conditions can easily be derived using the Lagrange technique. Indeed, they can be formulated in terms of an adjoint system which, in the sense of Chapter 2, comprises the adjoint problem corresponding to problem (4.2) linearized at $\bar{y}$:

$$
\begin{aligned}
-\Delta p + p + 3\,\bar{y}^2\, p &= \bar{y} - y_\Omega && \text{in } \Omega\\
\partial_\nu p &= 0 && \text{on } \Gamma.
\end{aligned}
$$

Moreover, we have, just as in the linear-quadratic case, the projection formula

$$
\bar{u}(x) = \mathbb{P}_{[u_a,u_b]}\Big\{ -\frac{1}{\lambda}\, p(x) \Big\}.
$$

However, it becomes immediately evident that already the use of the space $H^1(\Omega)$ may create serious problems: indeed, we need to show the differentiability of nonlinear mappings like $y(\cdot) \mapsto y(\cdot)^3$ for our analysis, and it is by no means obvious in which function spaces this should be done.

There is another unpleasant fact: the above optimal control problem is not a convex one, even though the cost functional is convex. This arises from the fact that the elliptic state equation is nonlinear. Consequently, the first-order necessary optimality conditions are no longer sufficient, and there is a need to separately consider sufficient second-order optimality conditions. Unfortunately, unexpected difficulties arise in their analysis that have to be overcome.

4.2. A semilinear elliptic model problem

4.2.1. Motivation of the upcoming approach. To motivate the next steps, we consider the semilinear elliptic boundary value problem (4.2) in a bounded Lipschitz domain $\Omega \subset \mathbb{R}^3$.

Since the function $y \mapsto y^3$ is monotone increasing, it will follow from Theorem 4.4 that a unique solution $y \in H^1(\Omega)$ exists for every given $u \in L^2(\Omega)$. By Theorem 7.1 on page 355, the embedding $H^1(\Omega) \hookrightarrow L^6(\Omega)$ is continuous for $N = 3$, so that $y^3 \in L^2(\Omega)$. Moreover, as will be shown on page 230, the Nemytskii operator $\Phi : y(\cdot) \mapsto y^3(\cdot)$ is a Fréchet differentiable mapping from $L^6(\Omega)$ into $L^2(\Omega)$. This property will be needed to derive necessary optimality conditions. Finally, it follows from Lemma 2.35 on page 107 that the operator $A := -\Delta + I$ maps $V = H^1(\Omega)$ continuously into $V^*$.

Summarizing, we may rewrite the above boundary value problem as an equation in $V^*$,

$$
A\,y + B\,\Phi(y) = B\,u,
$$

where $B$ denotes the embedding operator from $L^2(\Omega)$ into $V^*$.

In this way, the problem (4.2) can be treated in the state space $H^1(\Omega)$. This method also works if $y^3$ is replaced by the stronger nonlinearity $y^5$; indeed, we will see in Section 4.3.3 that $\Phi(y) = y^5$ is a differentiable mapping from $L^6(\Omega)$ into $L^{6/5}(\Omega)$ and thus, as the reader can check, also from $V$ into $V^*$.

This method, while being adequate for many problems, is limited to cases in which $\Phi$ maps $V$ into $V^*$. This requires growth conditions as in Section 4.3, which for instance are satisfied for $\Phi(y) = y^3$ and $\Phi(y) = y^5$ but not for $\Phi(y) = \exp(y)$. In fact, if $y \in H^1(\Omega)$, we cannot even expect $\exp(y) \in L^1(\Omega)$. We usually also have to impose restrictions on the dimension $N$ of $\Omega$.

However, we will see that under natural conditions the solution $y$ is continuous on $\bar{\Omega}$, provided $u \in L^r(\Omega)$ for a sufficiently large $r$. If this is the case, then growth conditions are superfluous, and so are restrictions on the dimension such as $N \le 3$.

There are two more reasons to look for continuous solutions $y$. First, the above method becomes more complicated in the case of parabolic problems in $W(0,T)$, since the degree of integrability of $y = y(x,t)$ on $\Omega \times (0,T)$ is lower than in the elliptic case; second, the continuity of the state $y$ will be needed anyway for the treatment of state constraints in Chapter 6. It is therefore worthwhile to prove continuity or, at least, boundedness of the state $y$.

We proceed as follows. First, we treat the elliptic boundary value problem in $H^1(\Omega)$ under strong boundedness conditions. Then we show that the solution is actually bounded or continuous and that the boundedness conditions can be weakened. To this end, we investigate a more general class of problems. Monotone increasing nonlinearities of the types $\Phi(y) = y^3$ or $\Phi(y) = \exp(y)$ are too special for many applications: if, for instance, $\Omega = \Omega_1 \cup \Omega_2$ stands for a body composed of two materials having different physical constants $\kappa_1$ and $\kappa_2$, then a nonlinearity of the form

$$
d(x,y) =
\begin{cases}
\kappa_1\, y^3, & x \in \Omega_1\\
\kappa_2\, y^3, & x \in \Omega_2
\end{cases}
$$

can be adequate. Evidently, $d$ is no longer continuous in $x$, but is still bounded and measurable. Motivated by the above considerations, we treat as a model problem the elliptic boundary value problem

$$
\begin{aligned}
A\,y + c_0(x)\,y + d(x,y) &= f && \text{in } \Omega\\
\partial_{\nu_A} y + \alpha(x)\,y + b(x,y) &= g && \text{on } \Gamma.
\end{aligned}
\tag{4.5}
$$

Here, $\Omega$, $\Gamma$, $c_0 \ge 0$, and $\alpha \ge 0$ are defined as in Section 2.1, and the nonlinear functions $d$ and $b$ are given. The elliptic differential operator $A$ is assumed to take the form (2.19) on page 37, and the functions $f$ and $g$ will play the role of controls. This class of elliptic problems, while not being overly complicated, still exhibits the essential difficulties associated with nonlinear equations. Under suitable assumptions, we will be able to avoid the occurrence of the functions $c_0$ and/or $\alpha$.

4.2.2. Solutions in $H^1(\Omega)$. We begin our analysis by investigating the existence and uniqueness of solutions to the semilinear elliptic boundary value problem (4.5) in the space $H^1(\Omega)$. To this end, we employ the theory of monotone operators.

The basic idea is simple: if a continuous function $a : \mathbb{R} \to \mathbb{R}$ is strictly monotone increasing with $\lim_{x \to \pm\infty} a(x) = \pm\infty$, then the equation $a(y) = f$ has for any $f \in \mathbb{R}$ a uniquely determined solution $y \in \mathbb{R}$. This simple principle generalizes to equations $Ay = f$ in Banach spaces.

[Figure: $f$-point $y$ of a monotone function $a$.]

In the following, $V$ denotes a real separable Hilbert space, e.g., $V = H^1(\Omega)$ or $V = H_0^1(\Omega)$. Recall that a Banach space is said to be separable if it contains a countable dense subset.

Definition. An operator $A : V \to V^*$ is said to be monotone if

$$
(Ay_1 - Ay_2,\, y_1 - y_2)_{V^*,V} \ge 0 \quad \forall\, y_1, y_2 \in V.
$$

It is said to be strictly monotone if equality can occur only if $y_1 = y_2$. We say that $A$ is coercive if

$$
\frac{(Ay, y)_{V^*,V}}{\|y\|_V} \to \infty \quad \text{as } \|y\|_V \to \infty.
$$

$A$ is said to be hemicontinuous if the real-valued function $\varphi : [0,1] \to \mathbb{R}$, $t \mapsto (A(y + tv), w)_{V^*,V}$, is continuous on $[0,1]$ for all fixed $y, v, w \in V$. Finally, if there exists some $\beta_0 > 0$ such that

$$
(Ay_1 - Ay_2,\, y_1 - y_2)_{V^*,V} \ge \beta_0\, \|y_1 - y_2\|_V^2 \quad \forall\, y_1, y_2 \in V,
$$

then $A$ is said to be strongly monotone.

Theorem 4.1 (Main theorem on monotone operators). Let $V$ be a separable Hilbert space, and let $A : V \to V^*$ be monotone, coercive, and hemicontinuous. Then the equation $Ay = f$ has for every $f \in V^*$ a solution $y \in V$. The set of all solutions is bounded, closed, and convex. If $A$ is strictly monotone, then $y$ is uniquely determined. If $A$ is moreover strongly monotone, then the inverse $A^{-1} : V^* \to V$ is a Lipschitz continuous mapping.

The above theorem is due to Browder and Minty. Its proof can be found in, e.g., Zeidler [Zei90b]. We apply it to problem (4.5) in the space $V = H^1(\Omega)$. To do this, we first have to define the notion of a weak solution to the nonlinear elliptic boundary value problem (4.5). The idea is simple: we bring the nonlinear terms $d(x,y)$ and $b(x,y)$ in (4.5) to the right-hand sides of the equations, thus obtaining a boundary value problem with the right-hand sides $\tilde{f} = f - d(\cdot, y)$ and $\tilde{g} = g - b(\cdot, y)$, respectively, and linear differential operators on the left-hand sides. For this purpose, we use the variational formulation for linear boundary value problems.

At this point, a problem arises if $b(x,y)$ or $d(x,y)$ is unbounded (e.g., for nonlinearities like $y^k$ or $e^y$): elements $y \in H^1(\Omega)$ need not be bounded. Without further assumptions, it is therefore unclear to which function spaces $d(x,y)$ and $b(x,y)$ should belong. Initially, we postulate that $d$ and $b$ be bounded on their respective domains; then $d(x,y)$ and $b(x,y)$ will be bounded, even if $y$ is not.

Assumption 4.2.

(i) $\Omega \subset \mathbb{R}^N$, $N \ge 2$, is a bounded Lipschitz domain with boundary $\Gamma$, and $A$ is an elliptic differential operator of the form (2.19) (see page 37) with bounded and measurable coefficient functions $a_{ij}$ that satisfy the symmetry condition and the condition (2.20) of uniform ellipticity.

(ii) The functions $c_0 : \Omega \to \mathbb{R}$ and $\alpha : \Gamma \to \mathbb{R}$ are bounded, measurable and almost everywhere nonnegative. Assume that at least one of these functions does not vanish almost everywhere, that is, $\|c_0\|_{L^\infty(\Omega)} + \|\alpha\|_{L^\infty(\Gamma)} > 0$.

(iii) The functions $d = d(x,y) : \Omega \times \mathbb{R} \to \mathbb{R}$ and $b = b(x,y) : \Gamma \times \mathbb{R} \to \mathbb{R}$ are bounded and measurable with respect to $x \in \Omega$ and $x \in \Gamma$, respectively, for every fixed $y \in \mathbb{R}$. Moreover, they are continuous and monotone increasing in $y$ for almost every $x \in \Omega$ and $x \in \Gamma$, respectively.

It follows from this assumption, in particular, that $d(x,0)$ and $b(x,0)$ are bounded and measurable in $\Omega$ and $\Gamma$, respectively. In view of the problem of unboundedness, we initially make a further assumption.

Assumption 4.3. For almost every $x \in \Omega$ (respectively, $x \in \Gamma$) we have $d(x,0) = 0$ (respectively, $b(x,0) = 0$). Moreover, $b$ and $d$ are globally

bounded, that is, there is a constant $M > 0$ such that for any $y \in \mathbb{R}$ we have

$$
\begin{aligned}
|b(x,y)| &\le M \quad \text{for a.e. } x \in \Gamma,\\
|d(x,y)| &\le M \quad \text{for a.e. } x \in \Omega.
\end{aligned}
\tag{4.6}
$$

The differential equation is associated with the bilinear form

$$
a[y,v] := \int_\Omega \sum_{i,j=1}^{N} a_{ij}(x)\, D_i y\, D_j v\,dx + \int_\Omega c_0\, y\, v\,dx + \int_\Gamma \alpha\, y\, v\,ds.
\tag{4.7}
$$

Definition. Suppose that Assumptions 4.2 and 4.3 hold. A function $y \in H^1(\Omega)$ is called a weak solution to (4.5) if we have, for every $v \in H^1(\Omega)$,

$$
a[y,v] + \int_\Omega d(x,y)\,v\,dx + \int_\Gamma b(x,y)\,v\,ds = \int_\Omega f\,v\,dx + \int_\Gamma g\,v\,ds.
\tag{4.8}
$$

Invoking the main theorem on monotone operators, we can prove the following well-posedness result.

Theorem 4.4. Suppose that Assumptions 4.2 and 4.3 hold. Then the elliptic boundary value problem (4.5) has for any pair $f \in L^2(\Omega)$ and $g \in L^2(\Gamma)$ of right-hand sides a unique weak solution $y \in H^1(\Omega)$. Moreover, there is some constant $c_M > 0$, which is independent of $d$, $b$, $f$, and $g$, such that

$$
\|y\|_{H^1(\Omega)} \le c_M\,\big(\|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)}\big).
$$

Proof. We apply the main theorem on monotone operators in $V = H^1(\Omega)$.

(i) Definition of a monotone operator $A : V \to V^*$

It follows from Section 2.13 that the bilinear form (4.7) generates a continuous linear operator $A_1 : V \to V^*$ through the relation

$$
(A_1 y, v)_{V^*,V} = a[y,v].
$$

This is the linear part of the nonlinear operator $A$. The first nonlinear part of $A$ is formally defined by the identity $(A_2 y)(x) := d(x, y(x))$. We have to make this formal "definition" precise: any $y \in V$ belongs to $L^2(\Omega)$. Owing to the strong assumptions imposed on $d$, the function $x \mapsto d(x, y(x))$ is measurable (by continuity of $d$ in $y$) and bounded (by boundedness of $d$). We thus have $d(\cdot, y) \in L^\infty(\Omega)$ for all $y \in V$. Therefore, the linear functional $F_d$ given by

$$
F_d(v) = \int_\Omega d(x, y(x))\, v(x)\,dx
$$

is continuous on $V$ and thus belongs to $V^*$. In this sense, using the canonical isomorphism from $V$ into $V^*$, we may identify $d(\cdot, y(\cdot))$ with $F_d \in V^*$. In conclusion, we may define $A_2 : V \to V^*$ by putting $A_2\, y = F_d$. The third part $A_3$, which corresponds to the nonlinearity $b$, can be defined similarly. Indeed, the linear functional

$$
F_b(v) = \int_\Gamma b(x, y(x))\, v(x)\,ds(x)
$$

also belongs to $V^*$ and can be identified with $b(\cdot, y(\cdot))$. In this sense, $A_3\, y = F_b$. The sum of the three operators yields the operator $A$, i.e., $A = A_1 + A_2 + A_3$.

(ii) Monotonicity of $A$

We show that each of the operators $A_i$, $1 \le i \le 3$, is monotone, so that this property then also holds for $A$. First, $A_1$ is monotone, since $a[y,y] \ge 0$ for all $y \in V$. Next, we consider $A_2$. Owing to the monotonicity of $d$ in $y$, we have $(d(x,y_1) - d(x,y_2))(y_1 - y_2) \ge 0$ for all $y_1, y_2 \in \mathbb{R}$ and all $x$. Therefore, for all $y_1, y_2 \in H^1(\Omega)$,

$$
(A_2(y_1) - A_2(y_2),\, y_1 - y_2)_{V^*,V} = \int_\Omega \big(d(x, y_1(x)) - d(x, y_2(x))\big)\big(y_1(x) - y_2(x)\big)\,dx \ge 0.
$$

Note that the boundedness condition for $d$ guarantees that the function $x \mapsto d(x, y_1(x)) - d(x, y_2(x))$ is square integrable, so that the above integral exists. In conclusion, $A_2$ is monotone. The monotonicity of $A_3$ follows from analogous reasoning.

(iii) Coercivity of $A$

$A_1$ is coercive, since the assumptions made on $c_0$ and $\alpha$ imply, as in the proof of Theorem 2.6 on page 35, that

$$
(A_1 v, v)_{V^*,V} = a[v,v] \ge \beta_0\, \|v\|_V^2 \quad \forall v \in V.
$$

The operators $A_2$ and $A_3$ contribute nonnegative terms that do not destroy the coercivity. Here, the assumption $d(x,0) = b(x,0) = 0$ is exploited. Indeed, we have, by the monotonicity of $d$,

$$
(A_2\, v, v)_{V^*,V} = \int_\Omega d(x, v(x))\, v(x)\,dx = \int_\Omega \big(d(x, v(x)) - d(x,0)\big)\big(v(x) - 0\big)\,dx \ge 0.
$$

A similar estimate holds for $A_3$, and this proves the claim that $A = A_1 + A_2 + A_3$ is coercive.

(iv) Hemicontinuity of $A$

The operator $A_1$ is linear and thus hemicontinuous. For $A_2$ we argue as follows: we put

$$
\varphi(t) := (A_2(y + t\,v), w)_{V^*,V} = \int_\Omega d(x, y(x) + t\,v(x))\, w(x)\,dx.
$$

Now let $\tau \in \mathbb{R}$ be fixed, and let $\{t_n\}_{n=1}^{\infty} \subset \mathbb{R}$ be some sequence such that $t_n \to \tau$ as $n \to \infty$. We need to show that $\varphi(t_n) \to \varphi(\tau)$ as $n \to \infty$.

Since by Assumption 4.2 $d$ is continuous with respect to $y$ for almost every $x \in \Omega$, it follows that

$$
f_n(x) := d(x, y(x) + t_n\, v(x))\, w(x) \to d(x, y(x) + \tau\, v(x))\, w(x) =: f(x)
$$

pointwise almost everywhere in $\Omega$. Moreover, the sequence $\{f_n\}_{n=1}^{\infty}$ is also pointwise almost everywhere majorized by an integrable function; indeed, it follows from (4.6) that

$$
|d(x, y(x) + t_n\, v(x))\, w(x)| \le M\,|w(x)| \quad \text{for a.e. } x \in \Omega,
$$

where $w \in L^2(\Omega)$. Thus, we can infer from Lebesgue's dominated convergence theorem that

$$
\int_\Omega d(x, y(x) + t_n\, v(x))\, w(x)\,dx \to \int_\Omega d(x, y(x) + \tau\, v(x))\, w(x)\,dx,
$$

and hence $\varphi(t_n) \to \varphi(\tau)$ as $n \to \infty$. The operator $A_3$ can be treated analogously.

(v) Well-posedness of the solution

Existence and uniqueness of a weak solution $y \in H^1(\Omega)$ now follow directly from the main theorem on monotone operators. Since $A$ is obviously strongly monotone, the asserted estimate also holds. However, it is not clear why the estimate does not depend on $d$ or $b$. Therefore, we give a direct proof of it. To this end, in the variational equation (4.8) we take $y$ itself as the test function to obtain

$$
a[y,y] + (A_2\, y, y)_{V^*,V} + (A_3\, y, y)_{V^*,V} = \int_\Omega f(x)\, y(x)\,dx + \int_\Gamma g(x)\, y(x)\,ds(x).
$$

Since the terms containing $A_2$ and $A_3$ are nonnegative, it follows that

$$
a[y,y] \le \int_\Omega f(x)\, y(x)\,dx + \int_\Gamma g(x)\, y(x)\,ds(x).
$$

This is the reason why the asserted estimate does not depend on $d$ or $b$. Now, we estimate the bilinear form $a[y,y]$ from below by the $H^1$ norm and the right-hand side from above by the Cauchy-Schwarz inequality. We then obtain

$$
\begin{aligned}
\beta_0\, \|y\|^2_{H^1(\Omega)} &\le \|f\|_{L^2(\Omega)}\, \|y\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)}\, \|y\|_{L^2(\Gamma)}\\
&\le c\,\big(\|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)}\big)\, \|y\|_{H^1(\Omega)},
\end{aligned}
$$

whence the asserted estimate follows. This concludes the proof of the theorem. $\Box$

Remarks.

(i) The above theorem does not apply directly to problem (4.2), since $d(x,y) = y^3$ does not meet Assumption 4.3. But Assumption 4.3 was only instrumental in guaranteeing that $d(\cdot, y(\cdot)) \in L^2(\Omega)$, which, as mentioned above, is true for $d(x, y(x)) = y(x)^3$ if $y \in H^1(\Omega)$ and $N \le 3$.

(ii) The above proof only made use of the fact that $f$ and $g$ generate continuous linear functionals on $V$ and therefore can be identified with elements of $V^*$. But for this to be the case the square integrability is not necessary; see page 40. In particular, the result remains valid for data $f \in L^r(\Omega)$ and $g \in L^s(\Gamma)$ whenever $r > N/2$ and $s > N - 1$, respectively; owing to Theorem 4.7 below, $y$ is then even continuous on $\bar{\Omega}$. For $N \in \{2, 3\}$ this includes also cases where $r, s < 2$.

(iii) Further techniques for the treatment of nonlinear elliptic equations can be found in the monographs by Barbu [Bar93], Lions [Lio69], Ladyzhenskaya and Ural'ceva [LU73], Neittaanmäki et al. [NST06], and Zeidler [Zei90b, Zei95].

4.2.3. Continuity of solutions. In this section, we follow ideas developed by E. Casas in [Cas93]. We begin our analysis by using a technique due to Stampacchia to show that the weak solution $y \in H^1(\Omega)$ is actually even essentially bounded, provided that the functions $f$ and $g$ have "better" properties than just square integrability. The main result of this section will be Theorem 4.8 on the continuity of $y$.

Theorem 4.5. Suppose that Assumptions 4.2 and 4.3 hold, and let $r > N/2$ and $s > N - 1$. Then for any pair $f \in L^r(\Omega)$ and $g \in L^s(\Gamma)$, there exists a unique weak solution $y \in H^1(\Omega)$ to the boundary value problem (4.5). We have $y \in L^\infty(\Omega)$, and there is some constant $c_\infty > 0$, which does not depend on $d$, $b$, $f$, or $g$, such that

$$
\|y\|_{L^\infty(\Omega)} \le c_\infty\,\big(\|f\|_{L^r(\Omega)} + \|g\|_{L^s(\Gamma)}\big).
\tag{4.10}
$$

The proof of this theorem will be given in Section 7.2.2, beginning on page 358. As the reader will be asked to verify in Exercise 4.1, we have the inequality

$$
\|y|_\Gamma\|_{L^\infty(\Gamma)} \le \|y\|_{L^\infty(\Omega)} \quad \forall y \in H^1(\Omega) \cap L^\infty(\Omega).
$$

Therefore, from (4.10) the same estimate for $\|y\|_{L^\infty(\Gamma)}$ follows.
It is noteworthy that the estimate (4.10) does not depend on the nonlinearities $d$ and $b$. The reason behind this is their monotonicity. It is therefore quite natural to presume that the postulated boundedness of the nonlinearities is dispensable. This is indeed the case, and without this boundedness assumption it is still possible to show that the solution $y$ is continuous on $\bar\Omega$. To this end, we need another preparatory result.

Lemma 4.6 ([Cas93]). Suppose that $\Omega \subset \mathbb{R}^N$ is a bounded Lipschitz domain, and let $f \in L^r(\Omega)$ and $g \in L^s(\Gamma)$ with $r > N/2$ and $s > N - 1$ be given. Then the weak solution $y$ to the Neumann problem

$$\begin{aligned} A y + y &= f \\ \partial_{\nu_A} y &= g \end{aligned}$$

is continuous on $\bar\Omega$. Moreover, there is some constant $c(r, s) > 0$, which does not depend on $f$ or $g$, such that

$$\|y\|_{C(\bar\Omega)} \le c(r, s) \big( \|f\|_{L^r(\Omega)} + \|g\|_{L^s(\Gamma)} \big).$$

Proof: For simplicity, we prove the assertion under the additional assumption that $\Omega$ has a $C^{1,1}$ boundary and that the coefficients $a_{ij}$ belong to $C^{0,1}(\bar\Omega)$. References for the general case will be given below, after this proof.

The existence of a unique weak solution $y \in H^1(\Omega)$ follows from Theorem 2.6 on page 35 (cf. the remarks on page 40 concerning data with $r < 2$ or $s < 2$). Owing to Theorem 4.5, applied with $c_0 = 1$ and $d = b = \alpha = 0$, $y$ is essentially bounded and satisfies the estimate (4.10). It remains to show that $y$ is continuous on $\bar\Omega$, since then $\|y\|_{L^\infty(\Omega)} = \|y\|_{C(\bar\Omega)}$, which implies the validity of the asserted estimate.

To this end, observe that under the above regularity assumption for $\Gamma$, the spaces $C^\infty(\bar\Omega)$ and $C^\infty(\Gamma)$ are dense in $L^r(\Omega)$ and $L^s(\Gamma)$, respectively. We may therefore choose functions $f_n \in C^\infty(\bar\Omega)$ and $g_n \in C^\infty(\Gamma)$, $n \in \mathbb{N}$, such that

$$\|f_n - f\|_{L^r(\Omega)} \to 0 \quad \text{and} \quad \|g_n - g\|_{L^s(\Gamma)} \to 0 \quad \text{as } n \to \infty.$$

Now denote by $y_n$ the unique weak solution to the above Neumann problem with right-hand sides $f_n$ and $g_n$, $n \in \mathbb{N}$. Owing to the assumed regularity of the boundary $\Gamma$ and the coefficient functions $a_{ij}$, we may apply the regularity results from Grisvard [Gri85] collected in Section 2.14.3 to conclude that $y_n \in W^{2,r}(\Omega)$. Since $W^{2,r}(\Omega)$ is for $r > N/2$ continuously embedded in $C(\bar\Omega)$ (see Theorem 7.1 on page 355), it follows that $y_n \in C(\bar\Omega)$. Moreover, the difference $y - y_n$ solves the boundary value problem

$$\begin{aligned} A(y - y_n) + y - y_n &= f - f_n \\ \partial_{\nu_A}(y - y_n) &= g - g_n. \end{aligned}$$

Recalling the definition of the sequences $\{f_n\}_{n=1}^\infty$ and $\{g_n\}_{n=1}^\infty$ and invoking Theorem 4.5 once more (in particular, the estimate (4.10)), we conclude that $\|y_n - y\|_{L^\infty(\Omega)} \to 0$ as $n \to \infty$. In particular, $\{y_n\}_{n=1}^\infty$ is a Cauchy sequence in $L^\infty(\Omega)$ and, since all the terms $y_n$ of the sequence are continuous on $\bar\Omega$, also in $C(\bar\Omega)$. Hence, $\{y_n\}_{n=1}^\infty$ has a limit in $C(\bar\Omega)$, which obviously must be $y$. This concludes the proof of the assertion. □

The proof for the case of $L^\infty$ coefficients $a_{ij}$ and Lipschitz domains is due to Casas [Cas93]. It was extended to more general situations by Alibert and Raymond [AR97]. Recent results by Griepentrog [Gri02] on the regularity of solutions to elliptic boundary value problems include the above lemma as a special case. This will be explained in Section 7.2.1, following page 356. The Hölder continuity of the solution for mixed (Dirichlet-Neumann) boundary conditions was recently proved by Haller-Dintelmann et al. [HDMRS09].

We now drop the postulate of Assumption 4.3 that the nonlinearities $b$ and $d$ be bounded. In this situation, we call $y \in H^1(\Omega) \cap L^\infty(\Omega)$ a weak solution to the boundary value problem (4.5) if it satisfies the variational equality (4.8).

Theorem 4.7. Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain, and let $r > N/2$ and $s > N - 1$. Suppose also that Assumption 4.2 holds, and that $b(x, 0) = 0$ and $d(x, 0) = 0$ for almost every $x \in \Gamma$ and $x \in \Omega$, respectively. Then the semilinear boundary value problem (4.5) has for any pair $f \in L^r(\Omega)$ and $g \in L^s(\Gamma)$ a unique weak solution $y \in H^1(\Omega) \cap L^\infty(\Omega)$. Moreover, $y$ is continuous on $\bar\Omega$, and there is some constant $c_\infty > 0$, which does not depend on $d$, $b$, $f$, or $g$, such that

$$\|y\|_{H^1(\Omega)} + \|y\|_{C(\bar\Omega)} \le c_\infty \big( \|f\|_{L^r(\Omega)} + \|g\|_{L^s(\Gamma)} \big). \tag{4.11}$$

Proof: We again follow the argument in Casas [Cas93]. First, we show that the requirement that $d$ and $b$ be bounded is dispensable. To this end, consider for arbitrary $k > 0$ the cut-off function

$$d_k(x, y) = \begin{cases} d(x, k) & \text{if } y > k \\ d(x, y) & \text{if } |y| \le k \\ d(x, -k) & \text{if } y < -k. \end{cases}$$

In the same way, we define a cut-off function $b_k$ for $b$. The functions $b_k$ and $d_k$ are uniformly bounded and satisfy Assumption 4.3. We can thus infer from Theorem 4.5 that the elliptic boundary value problem

$$\begin{aligned} A y + c_0(x)\, y + d_k(x, y) &= f && \text{in } \Omega \\ \partial_{\nu_A} y + \alpha(x)\, y + b_k(x, y) &= g && \text{on } \Gamma \end{aligned}$$
possesses a unique weak solution $y \in H^1(\Omega) \cap L^\infty(\Omega)$. Moreover, $y$ satisfies the estimate (4.10), which does not depend on $d_k$ or $b_k$ and therefore not on $k$. Now choose $k > c_\infty \big( \|f\|_{L^r(\Omega)} + \|g\|_{L^s(\Gamma)} \big)$. Then, by virtue of (4.10), $|y(x)| < k$ for almost every $x \in \Omega$ and almost every $x \in \Gamma$. Therefore, $d_k(x, y(x)) = d(x, y(x))$ and $b_k(x, y(x)) = b(x, y(x))$ almost everywhere. Consequently, $y$ is a solution to problem (4.5).

Next, we show the continuity of $y$. To this end, we rewrite the nonlinear boundary value problem solved by $y$ in the form

$$\begin{aligned} A y + y &= f + y - c_0\, y - d_k(x, y) && \text{in } \Omega \\ \partial_{\nu_A} y &= g - \alpha\, y - b_k(x, y) && \text{on } \Gamma. \end{aligned}$$

Since $y$ is by (4.10) essentially bounded on both $\Omega$ and $\Gamma$, the right-hand sides of this problem belong to $L^r(\Omega)$ and $L^s(\Gamma)$, respectively. Hence, we can infer from Lemma 4.6 that $y$ is continuous on $\bar\Omega$.

It remains to show that $y$ is unique. Indeed, owing to the monotonicity of $d$ and $b$ with respect to $y$, any solution $y \in H^1(\Omega) \cap L^\infty(\Omega)$ to problem (4.5) is at the same time the uniquely determined solution to the above cut-off system for any $k > \|y\|_{L^\infty(\Omega)}$. But this obviously entails that there can be at most one solution in the space $H^1(\Omega) \cap L^\infty(\Omega)$, which concludes the proof of the assertion. □

As the final step, we now demonstrate that the postulate $d(x, 0) = b(x, 0) = 0$ is also dispensable; this was used in the proof of Theorem 4.4 to guarantee monotonicity of the operator $A$.

Theorem 4.8. The assertion of Theorem 4.7 remains valid without the assumption $b(x, 0) = d(x, 0) = 0$, provided that the estimate (4.11) is replaced by

$$\|y\|_{H^1(\Omega)} + \|y\|_{C(\bar\Omega)} \le c_\infty \big( \|f - d(\cdot, 0)\|_{L^r(\Omega)} + \|g - b(\cdot, 0)\|_{L^s(\Gamma)} \big). \tag{4.12}$$

Proof: We rewrite the boundary value problem in the form

$$\begin{aligned} A y + c_0(x)\, y + d(x, y) - d(x, 0) &= f(x) - d(x, 0) && \text{in } \Omega \\ \partial_{\nu_A} y + \alpha(x)\, y + b(x, y) - b(x, 0) &= g(x) - b(x, 0) && \text{on } \Gamma. \end{aligned}$$

The functions $y \mapsto d(x, y) - d(x, 0)$ and $y \mapsto b(x, y) - b(x, 0)$ vanish at zero. Moreover, Assumption 4.2 obviously implies that $d(\cdot, 0) \in L^r(\Omega)$ and $b(\cdot, 0) \in L^s(\Gamma)$. Hence, Theorem 4.7 applies with the right-hand sides $f - d(\cdot, 0)$ and $g - b(\cdot, 0)$, yielding the validity of (4.12). The assertion is thus proved. □

4.2.4. Weakening of the assumptions. So far, we have studied the semilinear elliptic model problem in the form (4.5) on page 183. The required coercivity of the elliptic operator has been guaranteed by the properties of the coefficient functions $c_0$ and $\alpha$. In particular, the choice $d(x, y) = 0$ and $b(x, y) = 0$ has been possible. This raises the question of under what conditions the functions $c_0$ and $\alpha$ can be omitted, since the coercivity of the nonlinear operator follows from the properties of $d$ and $b$ alone.

As characteristic examples, let us consider the semilinear Neumann problems

$$\begin{aligned} -\Delta y + e^y &= f && \text{in } \Omega \\ \partial_\nu y &= 0 && \text{on } \Gamma \end{aligned} \tag{4.13}$$

and

$$\begin{aligned} -\Delta y + y^3 &= f && \text{in } \Omega \\ \partial_\nu y &= 0 && \text{on } \Gamma. \end{aligned} \tag{4.14}$$

We investigate whether unique solutions exist in $H^1(\Omega) \cap L^\infty(\Omega)$ that depend continuously on the right-hand side $f$.

It is easy to see that this is not the case for problem (4.13). In fact, the function $y(x) \equiv c$ solves (4.13) with the right-hand side $f(x) \equiv e^c$, and we have $f \to 0$ as $c \to -\infty$. However, for $f = 0$ there can be no solution $y \in H^1(\Omega) \cap L^\infty(\Omega)$. Indeed, if there were such a solution, then we could use the test function $v \equiv 1$ in the variational formulation to find that $\int_\Omega e^{y(x)}\, dx = 0$, which is a contradiction. Observe that this problem does not arise if a homogeneous Dirichlet condition is given: the operator $A = -\Delta$ is coercive in $H_0^1(\Omega)$.

In contrast to this, the boundary value problem (4.14) is well posed in $H^1(\Omega) \cap L^\infty(\Omega)$. This will be a consequence of the next theorem, which was communicated to me by E. Casas. It concerns the boundary value problem

$$\begin{aligned} A y + d(x, y) &= 0 && \text{in } \Omega \\ \partial_{\nu_A} y + b(x, y) &= 0 && \text{on } \Gamma. \end{aligned} \tag{4.15}$$

Here, $A$ is defined as in (4.5), and the given right-hand sides $f$ and $g$ are incorporated into $d(x, y)$ and $b(x, y)$, respectively.

Assumption 4.9. The domain $\Omega$ and the linear differential operator $A$ satisfy the conditions stated in Assumption 4.2 on page 185. The functions $d = d(x, y) : \Omega \times \mathbb{R} \to \mathbb{R}$ and $b = b(x, y) : \Gamma \times \mathbb{R} \to \mathbb{R}$ are measurable with respect to $x$ for every $y \in \mathbb{R}$, and are monotone increasing and continuous in $y$
for almost all $x \in \Omega$ and $x \in \Gamma$, respectively. Moreover, for any $M > 0$ there are functions $\psi_M \in L^r(\Omega)$ with $r > N/2$ and $\phi_M \in L^s(\Gamma)$ with $s > N - 1$ such that

$$\begin{aligned} |d(x, y)| &\le \psi_M(x) && \text{for a.e. } x \in \Omega, \text{ whenever } |y| \le M; \\ |b(x, y)| &\le \phi_M(x) && \text{for a.e. } x \in \Gamma, \text{ whenever } |y| \le M. \end{aligned} \tag{4.16}$$

Finally, one of the following two conditions holds:

(i) There exist a set $E_d \subset \Omega$ with positive measure and constants $M_d > 0$ and $\lambda_d > 0$ such that the following inequalities hold:

$$\begin{aligned} d(x, y_1) &< d(x, y_2) && \forall\, x \in E_d,\ \forall\, y_1 < y_2; \\ \big(d(x, y) - d(x, 0)\big)\, y &\ge \lambda_d\, |y|^2 && \forall\, x \in E_d,\ \forall\, |y| > M_d. \end{aligned} \tag{4.17}$$

(ii) There exist a set $E_b \subset \Gamma$ with positive measure and constants $M_b > 0$ and $\lambda_b > 0$ such that the following inequalities hold:

$$\begin{aligned} b(x, y_1) &< b(x, y_2) && \forall\, x \in E_b,\ \forall\, y_1 < y_2; \\ \big(b(x, y) - b(x, 0)\big)\, y &\ge \lambda_b\, |y|^2 && \forall\, x \in E_b,\ \forall\, |y| > M_b. \end{aligned} \tag{4.18}$$

Theorem 4.10. Suppose Assumption 4.9 holds. Then the boundary value problem (4.15) has a unique weak solution $y \in H^1(\Omega) \cap L^\infty(\Omega)$. The weak solution is continuous on $\bar\Omega$.

Proof: We follow an idea of E. Casas and consider for $n \in \mathbb{N}$ the boundary value problem

$$\begin{aligned} A y + n^{-1} y + d(x, y) &= 0 && \text{in } \Omega \\ \partial_{\nu_A} y + b(x, y) &= 0 && \text{on } \Gamma. \end{aligned}$$

In analogy to Theorem 4.8, under these slightly modified assumptions there is also a unique weak solution $y_n \in H^1(\Omega) \cap L^\infty(\Omega)$ to this problem; in this connection, we put $c_0(x) = n^{-1}$. From Theorem 7.6 on page 363, we can infer that there is some constant $K > 0$ such that

$$\|y_n\|_{L^\infty(\Omega)} \le K \quad \forall\, n \in \mathbb{N}.$$

Since $\|y_n\|_{L^\infty(\Gamma)} \le \|y_n\|_{L^\infty(\Omega)}$ for all $n \in \mathbb{N}$, the sequence $\{\|y_n\|_{L^\infty(\Gamma)}\}$ is also bounded. But then the $L^2$ norms of the functions $y_n$ and of their traces on the boundary are bounded, too.

Next, we put $v = y_n$ in the weak formulation of the above problem, for $n \in \mathbb{N}$. Using the monotonicity of $d$ and $b$ and invoking inequality (2.20) on page 37, we obtain from (4.16) that

$$\gamma_0 \int_\Omega |\nabla y_n|^2\, dx \le \int_\Omega \psi_K\, |y_n|\, dx + \int_\Gamma \phi_K\, |y_n|\, ds \le c \quad \text{for all } n \in \mathbb{N}.$$

Since $\|y_n\|_{L^\infty(\Omega)} \le K$ for all $n \in \mathbb{N}$, it follows that $\{\|y_n\|_{H^1(\Omega)}\}$ is also bounded. We may therefore select a subsequence $\{y_{n_k}\}$ such that with some $y \in H^1(\Omega) \cap L^\infty(\Omega)$ we have $y_{n_k} \rightharpoonup y$ weakly in $H^1(\Omega)$ and, by compact embedding, strongly in $L^2(\Omega)$.

By virtue of the boundedness of $\{y_{n_k}\}$ in $L^\infty(\Omega)$ and in $L^\infty(\Gamma)$, we may apply Lebesgue's dominated convergence theorem to conclude that $d(\cdot, y_{n_k}) \to d(\cdot, y)$ strongly in $L^2(\Omega)$ and $b(\cdot, y_{n_k}) \to b(\cdot, y)$ strongly in $L^2(\Gamma)$. Hence, taking the limit as $k \to \infty$ in the above sequence of boundary value problems, we find that $y$ is a solution to (4.15).

The uniqueness of the solution follows from a standard argument: suppose that we are given two weak solutions $y_1, y_2 \in H^1(\Omega) \cap L^\infty(\Omega)$. Testing the difference between the corresponding equations by $v = y_1 - y_2$, we obtain that

$$\gamma_0 \int_\Omega |\nabla(y_1 - y_2)|^2\, dx + \int_\Omega \big(d(x, y_1) - d(x, y_2)\big)(y_1 - y_2)\, dx + \int_\Gamma \big(b(x, y_1) - b(x, y_2)\big)(y_1 - y_2)\, ds \le 0. \tag{4.19}$$

Owing to the monotonicity of $d$ and $b$, all three summands are nonnegative and therefore must vanish. But then $|\nabla(y_1 - y_2)(x)| = 0$, and thus $y_1(x) - y_2(x) = c$ for almost every $x \in \Omega$, with some $c \in \mathbb{R}$ (cf. Zeidler [Zei90a], Problem 21.31a). Since $y_1 - y_2 \in H^1(\Omega)$ is equivalent in the Lebesgue sense to the continuous function $y \equiv c$, it follows from the trace theorem that also $y_1(x) - y_2(x) = c$ for almost every $x \in \Gamma$.

If $c \ne 0$, then without loss of generality we may assume that $y_1(x) < y_2(x)$ for almost every $x \in \Omega$ and almost every $x \in \Gamma$. From (4.17) and (4.18) it then follows that at least one of the last two summands in (4.19) must be positive, and we have a contradiction. Consequently, $c = 0$, and thus $y_1 = y_2$.

Finally, the continuity of the solution is a consequence of Lemma 4.6.

If the functions $d(x, y)$ and $b(x, y)$ are not only increasing in $y$ but also differentiable with respect to $y$ for almost all $x$, then the following conditions are sufficient for (4.17) and (4.18) to hold:
There are measurable sets $E_d \subset \Omega$ and $E_b \subset \Gamma$ of positive measure, as well as constants $\lambda_d > 0$ and $\lambda_b > 0$, such that

$$d_y(x, y) \ge \lambda_d \quad \forall\, x \in E_d,\ \forall\, y \in \mathbb{R}; \qquad b_y(x, y) \ge \lambda_b \quad \forall\, x \in E_b,\ \forall\, y \in \mathbb{R}. \tag{4.20}$$

If one of these conditions holds, then Theorem 4.8 applies to the boundary value problem (4.15): for instance, we put $c_0(x) := \chi(E_d)\, \lambda_d$ and write

$$d(x, y) = c_0(x)\, y + \big(d(x, y) - c_0(x)\, y\big) = c_0(x)\, y + \tilde d(x, y).$$

Then $\tilde d$ is increasing with respect to $y$, and $\|c_0\|_{L^\infty(\Omega)} \ne 0$. In a similar way, $b$ can be transformed using $\alpha(x) := \chi(E_b)\, \lambda_b$. With this, the boundary value problem (4.15) attains the form (4.5), and the assumptions of Theorem 4.8 are met with either of the two conditions in (4.20).

4.3. Nemytskii operators

4.3.1. Continuity of Nemytskii operators. Any nonlinearity $d(x, y)$ generates for a given function $y$ a new function by putting $z(x) = d(x, y(x))$. Such operators are called superposition operators or Nemytskii operators. Quite unexpectedly, it is a nontrivial task to study the differentiability properties of such operators.

Examples. The following mappings $y(\cdot) \mapsto z(\cdot)$ define Nemytskii operators:

$$z(x) = (y(x))^3, \quad z(x) = a(x)\,(y(x))^3, \quad z(x) = \sin(y(x)), \quad z(x) = (y(x) - a(x))^2.$$

The first mapping occurs in the superconductivity example, the third is the archetypical example for illustrating possible difficulties, and the fourth appears as an integrand in our quadratic cost functionals. The corresponding generating nonlinearities are evidently given by

$$d(y) = y^3, \quad d(x, y) = a(x)\, y^3, \quad d(y) = \sin(y), \quad d(x, y) = (y - a(x))^2. \tag{4.21}$$

◊

All nonlinearities of the above type that appear in this book may depend on two variables, namely the domain variable $x$ and the function variable for which a control or state function is inserted. The function variable will usually be denoted by $y$, $u$, $v$, or $w$. In the case of controls distributed in the domain, the nonlinearity $d = d(x, y)$, $x \in \Omega$, ($d$ for distributed) is used; likewise, for boundary controls the function $b = b(x, y)$, $x \in \Gamma$, ($b$ for boundary) appears.

In parabolic problems we have, in addition to the spatial variable $x$, the time variable $t$, so that $(x, t)$ represents the domain variable. To simplify the next definition, we denote by $E$ the set in which the domain variable varies. In the case of elliptic problems we have $E = \Omega$ or $E = \Gamma$, while in the parabolic case $E = \Omega \times (0, T)$ or $E = \Gamma \times (0, T)$. We generally assume that the set $E$ is bounded and Lebesgue measurable.

The numerical analysis of nonlinear optimal control problems requires the determination of first- and second-order derivatives of Nemytskii operators. In this section, we begin our analysis with continuity and first-order derivatives. Second-order derivatives will be discussed later in connection with sufficient optimality conditions.

Definition. Let $E \subset \mathbb{R}^m$, $m \in \mathbb{N}$, be a bounded and measurable set, and let $\varphi = \varphi(x, y) : E \times \mathbb{R} \to \mathbb{R}$ be a function. The mapping $\Phi$ given by

$$\Phi(y) = \varphi(\cdot, y(\cdot)),$$

which assigns to a function $y : E \to \mathbb{R}$ the function $z : E \to \mathbb{R}$, $z(x) = \varphi(x, y(x))$, is called a Nemytskii operator or superposition operator.

The analysis of Nemytskii operators in $L^p$ spaces with $1 \le p < \infty$ necessitates more or less restrictive growth conditions on $\varphi(x, y)$ with respect to $y$. Since all control and state functions to be studied in the following will be uniformly bounded, we can work in $L^\infty$ and thus with simpler conditions that are met, for example, by all elementary functions defined on the whole real line.

Definition.

(i) A function $\varphi = \varphi(x, y) : E \times \mathbb{R} \to \mathbb{R}$ is said to satisfy the Carathéodory condition if it is measurable with respect to $x$ for any fixed $y \in \mathbb{R}$ and continuous with respect to $y$ for almost every fixed $x \in E$.

(ii) $\varphi$ is said to satisfy the boundedness condition if there exists a constant $K > 0$ such that

$$|\varphi(x, 0)| \le K \quad \text{for a.e. } x \in E. \tag{4.22}$$

(iii) $\varphi$ is said to be locally Lipschitz continuous with respect to $y$ if for any constant $M > 0$ there is a constant $L(M) > 0$ such that for almost every $x \in E$ and all $y, z \in [-M, M]$ we have the estimate

$$|\varphi(x, y) - \varphi(x, z)| \le L(M)\, |y - z|. \tag{4.23}$$

The following fact is readily seen: if $\varphi$ satisfies the boundedness condition and is locally Lipschitz continuous, then for any $M > 0$ there is some $c_M > 0$ such that $|\varphi(x, y)| \le c_M$ for all $y \in [-M, M]$ and almost all $x \in E$.
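The definitions above can be illustrated with a short Python sketch that realizes a Nemytskii operator on grid functions and checks the local bound $c_M = K + L(M)\,M$ numerically. The nonlinearity $\varphi(x, y) = a(x)\, y^3$ with $a(x) = 1 + x$ on $E = [0, 1]$, the grid, the test functions, and all function names are assumptions chosen for illustration; none of them comes from the text.

```python
import math

def nemytskii(phi, xs):
    """Build Phi(y) = phi(., y(.)) acting on grid functions sampled at xs."""
    def Phi(y):
        return [phi(x, yx) for x, yx in zip(xs, y)]
    return Phi

# Assumed example data (not from the text): phi(x, y) = a(x) * y^3 on E = [0, 1].
def a(x):
    return 1.0 + x            # bounded coefficient, sup |a| = 2

def phi(x, y):
    return a(x) * y ** 3

xs = [i / 100 for i in range(101)]   # uniform grid on [0, 1]
Phi = nemytskii(phi, xs)

M = 2.0
K = 0.0                        # |phi(x, 0)| = 0 <= K, cf. (4.22)
L_M = 2.0 * 3.0 * M ** 2       # sup|a| * max_{|y|<=M} |3 y^2|, a valid L(M) in (4.23)
c_M = K + L_M * M              # resulting local bound for |phi(x, y)| when |y| <= M

# Two grid functions with values in [-M, M].
y = [M * math.sin(7.0 * x) for x in xs]
z = [M * math.cos(3.0 * x) for x in xs]

sup_Phi_y = max(abs(v) for v in Phi(y))
sup_diff = max(abs(p - q) for p, q in zip(Phi(y), Phi(z)))
sup_yz = max(abs(p - q) for p, q in zip(y, z))

print(sup_Phi_y <= c_M)              # local boundedness on [-M, M]
print(sup_diff <= L_M * sup_yz)      # Lipschitz estimate in the sup norm
```

The constant `L_M` is only one admissible choice of $L(M)$ for this particular $\varphi$; any larger value would serve equally well in (4.23).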
Examples. All functions $\varphi \in C^1(\mathbb{R})$ satisfy the above conditions. The same holds for $\varphi(x, y) := a_1(x) + a_2(x)\, b(y)$ if $a_i \in L^\infty(E)$ and $b \in C^1(\mathbb{R})$. Also, the functions from (4.21) comply with the conditions provided that $a \in L^\infty(E)$. ◊

Lemma 4.11. Suppose that the function $\varphi = \varphi(x, y) : E \times \mathbb{R} \to \mathbb{R}$ is measurable with respect to $x \in E$ for every $y \in \mathbb{R}$, and suppose that $\varphi$ satisfies the boundedness condition and is locally Lipschitz continuous with respect to $y$. Then the associated Nemytskii operator $\Phi$ is continuous in $L^\infty(E)$. Moreover, for all $r \in [1, \infty]$, we have

$$\|\Phi(y) - \Phi(z)\|_{L^r(E)} \le L(M)\, \|y - z\|_{L^r(E)}$$

for all $y, z \in L^\infty(E)$ such that $\|y\|_{L^\infty(E)} \le M$ and $\|z\|_{L^\infty(E)} \le M$.

Proof: Let $y \in L^\infty(E)$ be given. Then there is some $M > 0$ such that $|y(x)| \le M$ for almost every $x \in E$. By virtue of the conditions (4.22) and (4.23), we have, for almost every $x \in E$,

$$|\varphi(x, y(x))| \le |\varphi(x, 0)| + |\varphi(x, y(x)) - \varphi(x, 0)| \le K + L(M)\, M.$$

Hence, $\Phi(y(\cdot)) = \varphi(\cdot, y(\cdot)) \in L^\infty(E)$, and $\Phi$ maps $L^\infty(E)$ into itself.

Now let $y, z \in L^\infty(E)$ be given such that $\|y\|_{L^\infty(E)} \le M$ and $\|z\|_{L^\infty(E)} \le M$. Then it follows from the local Lipschitz continuity that, for any $1 \le r < \infty$,

$$\int_E |\varphi(x, y(x)) - \varphi(x, z(x))|^r\, dx \le L(M)^r \int_E |y(x) - z(x)|^r\, dx = L(M)^r\, \|y - z\|_{L^r(E)}^r.$$

The asserted estimate is thus proved for $1 \le r < \infty$. For $r = \infty$ the argument is even simpler. Hence, $\Phi$ is also locally Lipschitz continuous in $L^\infty(E)$. □

If $\varphi$ is moreover uniformly bounded and Lipschitz continuous on the whole real line $\mathbb{R}$, then $\Phi$ is Lipschitz continuous in any of the spaces $L^r(E)$, that is, there is some $L > 0$ such that

$$\|\Phi(y) - \Phi(z)\|_{L^r(E)} \le L\, \|y - z\|_{L^r(E)} \quad \forall\, y, z \in L^r(E).$$

This follows immediately from the above proof.

Example. $\Phi(y) = \sin(y(\cdot))$.
The associated Nemytskii operator $\Phi$ is generated by $\varphi(x, y) = \sin(y)$. Obviously,

$$|\sin(y)| \le 1 \quad \text{and} \quad |\sin(y) - \sin(z)| \le |y - z| \quad \forall\, y, z \in \mathbb{R}.$$

The function $\varphi(x, y) = \sin(y)$ is thus globally bounded and Lipschitz continuous with constant 1. Hence, $\Phi$ is globally Lipschitz continuous, and

$$\|\sin(y(\cdot)) - \sin(z(\cdot))\|_{L^r(E)} \le \|y - z\|_{L^r(E)} \quad \forall\, y, z \in L^r(E).$$

◊

4.3.2. Differentiability of Nemytskii operators.

Assumptions on the nonlinearities. To obtain differentiability properties of Nemytskii operators, higher regularity conditions have to be imposed on the function $\varphi$. To this end, let us agree upon the following terminology:

Definition. Let $E \subset \mathbb{R}^m$, $m \in \mathbb{N}$, be a bounded set, and let $\varphi = \varphi(x, y) : E \times \mathbb{R} \to \mathbb{R}$ denote a function of the domain variable $x$ and the function variable $y$. Suppose that $\varphi$ is $k$-times differentiable with respect to $y$ for almost every $x \in E$. We say that $\varphi$ satisfies the boundedness condition of order $k$ if there exists some $K > 0$ such that

$$|D_y^l \varphi(x, 0)| \le K \quad \forall\, 0 \le l \le k, \text{ for a.e. } x \in E. \tag{4.24}$$

We say that $\varphi$ satisfies the local Lipschitz condition of order $k$ if for every $M > 0$ there is some (Lipschitz) constant $L(M) > 0$ such that

$$|D_y^k \varphi(x, y_1) - D_y^k \varphi(x, y_2)| \le L(M)\, |y_1 - y_2| \tag{4.25}$$

for all $y_i \in \mathbb{R}$ such that $|y_i| \le M$, $i = 1, 2$.

If $\varphi$ depends only on the second variable, i.e. $\varphi = \varphi(y)$, then the above two conditions are equivalent to the local Lipschitz continuity of $\varphi^{(k)}$,

$$|\varphi^{(k)}(y_1) - \varphi^{(k)}(y_2)| \le L(M)\, |y_1 - y_2| \quad \forall\, y_i \in \mathbb{R} \text{ such that } |y_i| \le M,\ i = 1, 2.$$

Remark. It is easily seen that the validity of both the boundedness condition and the local Lipschitz condition of order $k$ implies local boundedness and local Lipschitz continuity of all derivatives up to order $k$: indeed, if $1 \le l \le k$, for all $y$ with $|y| \le M$ it follows that

$$|D_y^l \varphi(x, y)| \le |D_y^l \varphi(x, y) - D_y^l \varphi(x, 0)| + |D_y^l \varphi(x, 0)| \le L(M)\, |y| + K \le K(M).$$

Therefore, $D_y^l \varphi$ is locally bounded, and the derivative of order $l - 1$ is locally Lipschitz continuous; for instance, local Lipschitz continuity of order 2 implies local Lipschitz continuity of order 1, since the mean value theorem yields for $|y_i| \le M$, $i = 1, 2$, that

$$|D_y \varphi(x, y_1) - D_y \varphi(x, y_2)| = |D_y^2 \varphi(x, y_\vartheta)\, (y_1 - y_2)| \le 2\, K(M)\, M,$$

with an intermediate point $y_\vartheta$ between $y_1$ and $y_2$. Applying this argument inductively, we can work from $l = k$ down to $l = 0$, that is, to $\varphi$ itself.

First-order derivatives in $L^\infty(E)$. The differentiability of $\Phi$ seems to be perfectly clear: if $\varphi$ is continuously differentiable with respect to $y$, then the associated Nemytskii operator $\Phi$ is also expected to be differentiable. As we
will discover below, this expectation is justified only in principle: everything depends on an appropriate choice of the function spaces.

To simplify the following exposition, we will use the following familiar abbreviations for partial derivatives: we write $\varphi_y := D_y \varphi = \partial\varphi/\partial y$ and $\varphi_{yy} := D_y^2 \varphi = \partial^2\varphi/\partial y^2$.

Now, for every fixed $x$ let $\varphi$ be continuously differentiable with respect to $y$. If the Fréchet derivative of $\Phi$ at $y$ exists, then it can be determined as the Gâteaux derivative

$$\begin{aligned} (\Phi'(y)h)(x) &= \lim_{t \to 0} \frac{1}{t} \big[ \varphi(x, y(x) + t\, h(x)) - \varphi(x, y(x)) \big] \\ &= \frac{d}{dt}\, \varphi(x, y(x) + t\, h(x)) \Big|_{t=0} = \varphi_y(x, y(x))\, h(x), \end{aligned} \tag{4.26}$$

where the limit evidently exists for each fixed $x$. But this fact does not provide any information on whether the limit exists in the sense of a suitable $L^r$ space, nor does it indicate to which space the function $\varphi_y(\cdot, y(\cdot))$ and its product with $h$ should belong. In other words, mere pointwise existence of the above limit does not yet suffice to guarantee Fréchet differentiability.

Example: Sine operator. The sine function is infinitely differentiable and all of its derivatives are uniformly bounded. In view of these nice smoothness and boundedness properties, we employ the sine function as a test case in the seemingly simplest space, namely the Hilbert space $L^2(E)$. The corresponding Nemytskii operator $\Phi(y(\cdot)) = \sin(y(\cdot))$ is globally Lipschitz continuous in $L^2(E)$, as the example in the preceding section shows. We conjecture that $\Phi$ is also Fréchet differentiable. By (4.26), the derivative must be given by

$$(\Phi'(y)h)(x) = \cos(y(x))\, h(x). \tag{4.27}$$

This seems to fit, since the cosine function is bounded by 1 and thus the right-hand side of (4.27) defines a continuous linear operator in $L^2(E)$ that assigns to each $h \in L^2(E)$ the product $\cos(y(\cdot))\, h(\cdot) \in L^2(E)$.

Quite unexpectedly, however, our conjecture is wrong. In spite of all the nice smoothness properties of the sine function, the sine operator cannot be Fréchet differentiable in any of the spaces $L^p(E)$, $1 \le p < \infty$. This disappointing fact follows from a well-known result which asserts that $\Phi$ is Fréchet differentiable in the space $L^p(E)$ for $1 \le p < \infty$ if and only if $\varphi$ is an affine function with respect to $y$, that is to say, $\varphi(x, y) = \varphi_0(x) + \varphi_1(x)\, y$ with some $\varphi_0 \in L^p(E)$ and some $\varphi_1 \in L^\infty(E)$; see Krasnoselskii et al. [KZPS76]. This fact, which will be demonstrated below for the case of the sine operator, is a big obstacle that renders the optimal control theory of nonlinear problems more difficult.

The non-differentiability of the sine operator is easily verified, as the interested reader can check in Exercise 4.4(i). This is particularly simple at the zero element of the space $L^p(0, 1)$ with $1 \le p < \infty$. To see this, let $h \in L^p(0, 1)$ be given. Taylor's theorem with integral remainder, applied to the sine function, yields

$$\begin{aligned} \sin(0 + h(x)) &= \sin(0) + \cos(0)\, h(x) + \int_0^1 \big[ \cos(0 + s\, h(x)) - \cos(0) \big]\, h(x)\, ds \\ &= 0 + h(x) + r(x). \end{aligned}$$

Here, $h(x)$ is regarded as a real number for fixed $x$. For the increment $h$, we choose the step function

$$h(x) = \begin{cases} 1 & \text{in } [0, \varepsilon] \\ 0 & \text{in } (\varepsilon, 1] \end{cases}$$

with $0 < \varepsilon < 1$ and pass to the limit as $\varepsilon \downarrow 0$. The remainder $r(x)$ can be immediately read off without the integral remainder: we have $r(x) = \sin(h(x)) - h(x)$. Therefore, for $x \in [0, \varepsilon]$,

$$r(x) = \sin(1) - 1 = c \ne 0,$$

so that

$$r(x) = \begin{cases} c & \text{in } [0, \varepsilon] \\ 0 & \text{in } (\varepsilon, 1]. \end{cases}$$

If the sine operator were Fréchet differentiable at the zero function, then we would have

$$\frac{\|r\|_{L^p(0,1)}}{\|h\|_{L^p(0,1)}} \to 0 \quad \text{as } \|h\|_{L^p(0,1)} \to 0.$$

However, since $c = \sin(1) - 1$ is negative and norms are nonnegative, we obtain that

$$\frac{\|r\|_{L^p(0,1)}}{\|h\|_{L^p(0,1)}} = \frac{\big( \int_0^1 |r(x)|^p\, dx \big)^{1/p}}{\big( \int_0^1 |h(x)|^p\, dx \big)^{1/p}} = \frac{|c|\, \varepsilon^{1/p}}{\varepsilon^{1/p}} = |c| \ne 0.$$

Fortunately, the situation is not completely hopeless. Indeed, it will follow from the next lemma that the sine operator is Fréchet differentiable at least in the space $L^\infty(E)$. Moreover, we conclude from the above calculation that it ought to be differentiable from $L^{p_1}(0, 1)$ into $L^{p_2}(0, 1)$ for $1 \le p_2 < p_1$; indeed, then the corresponding quotient $\varepsilon^{1/p_2 - 1/p_1}$ tends to zero as $\varepsilon$ approaches zero; therefore, the above contradiction no longer exists. The reader will be asked to show the differentiability between this pair of spaces in Exercise 4.4(ii). ◊
Lemma 4.12. Suppose that the function $\varphi$ is measurable with respect to $x \in E$ for every $y \in \mathbb{R}$ and differentiable with respect to $y$ for almost every $x \in E$. Moreover, let both the boundedness condition (4.24) and the local Lipschitz condition (4.25) of order $k = 1$ be satisfied. Then the Nemytskii operator $\Phi$ associated with $\varphi$ is Fréchet differentiable in $L^\infty(E)$, and we have

$$(\Phi'(y)h)(x) = \varphi_y(x, y(x))\, h(x) \quad \text{for a.e. } x \in E \text{ and all } h \in L^\infty(E).$$

Proof: Let $y, h \in L^\infty(E)$ be arbitrary, and choose $M > 0$ such that $|y(x)| \le M$ and $|h(x)| \le M$ for almost every $x \in E$. Then

$$\varphi(x, y(x) + h(x)) - \varphi(x, y(x)) = \varphi_y(x, y(x))\, h(x) + r(y, h)(x)$$

with the remainder

$$r(y, h)(x) = \int_0^1 \big[ \varphi_y(x, y(x) + s\, h(x)) - \varphi_y(x, y(x)) \big]\, ds\; h(x).$$

By the Lipschitz continuity of $\varphi_y$, we can estimate, for almost all $x \in E$,

$$|r(y, h)(x)| \le L(2M) \int_0^1 s\, |h(x)|\, ds\; |h(x)| \le L(2M)\, |h(x)|^2 \le L(2M)\, \|h\|_{L^\infty(E)}^2.$$

Therefore, $\|r(y, h)\|_{L^\infty(E)} \le c\, \|h\|_{L^\infty(E)}^2$ and thus

$$\frac{\|r(y, h)\|_{L^\infty(E)}}{\|h\|_{L^\infty(E)}} \to 0 \quad \text{as } \|h\|_{L^\infty(E)} \to 0.$$

The desired convergence of the remainder is thus shown. Moreover, since $\varphi_y(\cdot, y(\cdot))$ is by the boundedness condition for $\varphi_y$ bounded, the multiplication operator $h(\cdot) \mapsto \varphi_y(\cdot, y(\cdot))\, h(\cdot)$ is a continuous linear mapping from $L^\infty(E)$ into itself. In this connection, note that the measurability of $\varphi_y$ follows from the measurability of $\varphi$, since the derivative with respect to $y$ comes from a limit of measurable functions. With this, all properties of the Fréchet derivative are proved. □

Conclusion. Every function $\varphi \in C^2(\mathbb{R})$ that depends only on $y$ generates a Fréchet differentiable Nemytskii operator in $L^\infty(E)$; indeed, $\varphi'$ is locally Lipschitz continuous.

Since the implicit function theorem will be applied later, we now introduce the notion of continuous differentiability where the operator $F'(u)$ depends continuously on $u$.

Definition. Let $F : U \to V$ be a Fréchet differentiable mapping in an open neighborhood $\mathcal{U}$ of a point $\bar u \in U$. $F$ is said to be continuously Fréchet differentiable at $\bar u$ if the mapping $u \mapsto F'(u)$ from $\mathcal{U}$ into $\mathcal{L}(U, V)$ is continuous at $\bar u$, that is, if

$$\|u - \bar u\|_U \to 0 \;\Longrightarrow\; \|F'(u) - F'(\bar u)\|_{\mathcal{L}(U,V)} \to 0.$$

$F$ is said to be continuously Fréchet differentiable in $\mathcal{U}$ if it is continuously Fréchet differentiable at every $\bar u \in \mathcal{U}$.

Lemma 4.13. Suppose that the assumptions of Lemma 4.12 hold. Then the Nemytskii operator $\Phi$ is continuously Fréchet differentiable in $L^\infty(E)$.

Proof: Let $\bar y \in L^\infty(E)$ be arbitrary but fixed, and let $\|y_n - \bar y\|_{L^\infty(E)} \to 0$, where $y_n \in L^\infty(E)$ for all $n \in \mathbb{N}$. We have to show that

$$\|\Phi'(y_n) - \Phi'(\bar y)\|_{\mathcal{L}(L^\infty(E))} \to 0 \quad \text{as } n \to \infty.$$

Evidently, there is some $M > 0$ such that

$$\|\bar y\|_{L^\infty(E)} + \|y_n\|_{L^\infty(E)} \le M \quad \forall\, n \in \mathbb{N}.$$

Using the local Lipschitz continuity of $\varphi_y$, we obtain

$$\begin{aligned} \|\Phi'(y_n) - \Phi'(\bar y)\|_{\mathcal{L}(L^\infty(E))} &= \sup_{\|v\|_{L^\infty(E)} = 1} \big\| \big[ \varphi_y(\cdot, y_n(\cdot)) - \varphi_y(\cdot, \bar y(\cdot)) \big]\, v(\cdot) \big\|_{L^\infty(E)} \\ &\le \|\varphi_y(\cdot, y_n(\cdot)) - \varphi_y(\cdot, \bar y(\cdot))\|_{L^\infty(E)} \le L(M)\, \|y_n - \bar y\|_{L^\infty(E)} \to 0. \end{aligned}$$

Hence, $\Phi$ is (Lipschitz) continuously differentiable.

Example. On $C[0, 1]$ and for $k \in \mathbb{N}$, $k \ge 2$, we consider the mapping

$$\Phi(y(\cdot)) = y(\cdot)^k.$$

The directional derivative at $\bar y \in C[0, 1]$ in the direction $y \in C[0, 1]$ is obviously

$$\Phi'(\bar y)\, y = k\, \bar y^{\,k-1}\, y.$$

We can identify $\Phi'(\bar y)$ with the function $k\, \bar y^{\,k-1}$. Now let $\{y_n\}_{n=1}^\infty \subset C[0, 1]$ be any sequence such that $y_n \to \bar y$ in $C[0, 1]$. Again, there is some $M > 0$ such that

$$\|\bar y\|_{C[0,1]} + \|y_n\|_{C[0,1]} \le M \quad \forall\, n \in \mathbb{N}.$$
Using the mean value theorem, we have with suitable $\vartheta_n(x) \in (0, 1)$ the estimate

$$\begin{aligned} \|\Phi'(y_n) - \Phi'(\bar y)\|_{\mathcal{L}(C[0,1])} &= \max_{\|y\|_{C[0,1]} = 1} \big\| \big( \Phi'(y_n) - \Phi'(\bar y) \big)\, y \big\|_{C[0,1]} \\ &= \max_{\|y\|_{C[0,1]} = 1} \big\| k \big( y_n^{k-1} - \bar y^{\,k-1} \big)\, y \big\|_{C[0,1]} \le k\, \big\| y_n^{k-1} - \bar y^{\,k-1} \big\|_{C[0,1]} \\ &\le k(k - 1) \sup_{x \in [0,1]} \big| \big( \bar y + \vartheta_n (y_n - \bar y) \big)(x) \big|^{k-2}\, \big| (y_n - \bar y)(x) \big| \\ &\le k(k - 1) \big( \|y_n\|_{C[0,1]} + \|\bar y\|_{C[0,1]} \big)^{k-2}\, \|y_n - \bar y\|_{C[0,1]} \\ &\le k(k - 1)\, M^{k-2}\, \|y_n - \bar y\|_{C[0,1]} \to 0 \quad \text{as } n \to \infty. \end{aligned}$$

Consequently, the mapping $y \mapsto \Phi'(y)$ is continuous, so that $\Phi$ is continuously differentiable. ◊

4.3.3. Derivatives in other $L^p$ spaces *. For completeness, we now collect some properties of Nemytskii operators in $L^p$ spaces with $1 \le p < \infty$. These will, for instance, be used for the analysis of the elliptic equation $-\Delta y + y + y^3 = u$ in $H^1(\Omega)$. The proofs can be found in the monographs [AZ90] and [KZPS76] and in the papers [App88] and [GKT92]. For the analysis of Nemytskii operators in Sobolev or Hölder spaces, we refer the reader to [AZ90] and [Goe92].

Continuity. Let a bounded and measurable set $E \subset \mathbb{R}^n$ be given, and assume that $\varphi = \varphi(x, y)$ satisfies the Carathéodory condition. Then the Nemytskii operator $\Phi(y) := \varphi(\cdot, y(\cdot))$ maps $L^p(E)$ into $L^q(E)$ for $1 \le q \le p < \infty$ if and only if there are functions $\alpha \in L^q(E)$ and $\beta \in L^\infty(E)$ such that the growth condition

$$|\varphi(x, y)| \le \alpha(x) + \beta(x)\, |y|^{p/q} \tag{4.28}$$

is satisfied. Moreover, the operator $\Phi$ is for $q < \infty$ automatically continuous if it maps $L^p(E)$ into $L^q(E)$; see [AZ90].

Differentiability. In addition, let the partial derivative $\varphi_y(x, y)$ exist for almost every $x \in E$, and assume that the Nemytskii operator generated by $\varphi_y(x, y)$ maps $L^p(E)$ into $L^r(E)$. If $1 \le q < p < \infty$ and $r$ satisfies the condition

$$r = \frac{p\, q}{p - q}, \tag{4.29}$$

then $\Phi$ is Fréchet differentiable from $L^p(E)$ into $L^q(E)$, and we have

$$(\Phi'(y)h)(x) = \varphi_y(x, y(x))\, h(x).$$

For the proof of the differentiability, we have to show that (4.28) holds with $q$ replaced by $r$ and $\varphi$ replaced by $\varphi_y$. This is plausible, since the product $\varphi_y\, h$ will have to belong to $L^q(E)$ for $h \in L^p(E)$. We thus determine the conjugate exponent $s$ of $p/q$ from the equation $1/s + q/p = 1$ and estimate, using Hölder's inequality (2.25) on page 43, that

$$\begin{aligned} \int_E |\varphi_y(x, y(x))|^q\, |h(x)|^q\, dx &\le \Big( \int_E |\varphi_y(x, y(x))|^{qs}\, dx \Big)^{1/s} \Big( \int_E |h(x)|^{q \cdot \frac{p}{q}}\, dx \Big)^{q/p} \\ &= \Big( \int_E |\varphi_y(x, y(x))|^r\, dx \Big)^{q/r} \Big( \int_E |h(x)|^p\, dx \Big)^{q/p}. \end{aligned}$$

Here $s = p/(p - q)$, so that $qs = r$ and $1/s = q/r$. By assumption, both integrals are finite.

Example. Let $\Omega \subset \mathbb{R}^N$ be a bounded domain, $k \ge 1$ an integer, and $\Phi$ the Nemytskii operator generated by $\varphi(y) = y^k$. In connection with equation (4.2) on page 181, we want to know for which values of $k$ the operator $\Phi$ is differentiable from $L^6(\Omega)$ into $L^{6/5}(\Omega)$. By (4.29), the derivative $\varphi_y$ has to map the space $L^6(\Omega)$ into $L^r(\Omega)$ with

$$r = \frac{6 \cdot \frac{6}{5}}{6 - \frac{6}{5}} = \frac{3}{2}.$$

We have $|\varphi_y| = k\, |y|^{k-1}$. In view of the growth condition (4.28), we postulate that $k - 1 \le p/r$ with $p = 6$ and $r = 3/2$; hence

$$k - 1 \le \frac{6}{r} = 4,$$

and thus $k \le 5$. Therefore, for $k \le 5$, $y(\cdot)^k$ is differentiable from $L^6(\Omega)$ into $L^{6/5}(\Omega)$. ◊

4.4. Existence of optimal controls

4.4.1. General assumptions for this chapter. Owing to the required assumptions on the nonlinearities, the theory involving nonlinear equations and cost functionals may become confusing. Depending on the problem (for instance, the existence of optimal controls, necessary first-order conditions, or sufficient second-order optimality conditions), the requirements differ and would have to be specified anew in each section. To avoid this, we list a set of assumptions to hold throughout the remainder of this chapter, which is in fact too strong for most results. We will discuss at the relevant places which parts of the assumptions are dispensable; see also the remark at the end of this section.

In the following, the real-valued functions $d(x, y)$, $b(x, y)$, $\varphi(x, y)$, and $\psi(x, u)$, which depend on a domain variable $x \in E$ and a real function
206 4. Optimal control of semilinear elliptic equations 4.4. Existente of optimal controls 207

variable $y$ or $u$, will repeatedly occur. Here, the specifications $E = \Omega$ and $E = \Gamma$ are possible. Moreover, thresholds $u_a, u_b, v_a, v_b : E \to \mathbb{R}$ for the controls will be prescribed.

Assumption 4.14.

(i) $\Omega \subset \mathbb{R}^N$ is a bounded Lipschitz domain.

(ii) The functions $d = d(x,y)$, $\varphi = \varphi(x,y) : \Omega \times \mathbb{R} \to \mathbb{R}$, $b = b(x,y) : \Gamma \times \mathbb{R} \to \mathbb{R}$, and $\psi = \psi(x,u) : E \times \mathbb{R} \to \mathbb{R}$, where $E = \Omega$ or $E = \Gamma$, are measurable with respect to $x$ for every $y \in \mathbb{R}$ (respectively, $u \in \mathbb{R}$) and twice differentiable with respect to $y$ (respectively, $u$) for almost every $x \in \Omega$ (respectively, $x \in \Gamma$). Moreover, they satisfy the boundedness and local Lipschitz conditions (4.24)-(4.25) of order $k = 2$; for $\varphi$ this means, for example, that there are constants $K > 0$ and $L(M) > 0$ such that for almost every $x \in \Omega$ we have

$|\varphi(x,0)| + |\varphi_y(x,0)| + |\varphi_{yy}(x,0)| \le K$,
$|\varphi_{yy}(x,y_1) - \varphi_{yy}(x,y_2)| \le L(M)\,|y_1 - y_2| \quad \forall\, y_1, y_2 \in [-M, M]$.

(iii) Additionally, $d_y(x,y) \ge 0$ for almost every $x \in \Omega$ and all $y \in \mathbb{R}$, and $b_y(x,y) \ge 0$ for almost every $x \in \Gamma$ and all $y \in \mathbb{R}$. Moreover, there are sets $E_d \subset \Omega$ and $E_b \subset \Gamma$ of positive measure and constants $\lambda_d > 0$ and $\lambda_b > 0$ such that

$d_y(x,y) \ge \lambda_d \quad \forall x \in E_d,\ \forall y \in \mathbb{R}; \qquad b_y(x,y) \ge \lambda_b \quad \forall x \in E_b,\ \forall y \in \mathbb{R}$.

(iv) The bounds $u_a, u_b, v_a, v_b : E \to \mathbb{R}$ belong to $L^\infty(E)$ for $E = \Omega$ or $E = \Gamma$ and satisfy the conditions $u_a(x) \le u_b(x)$ and $v_a(x) \le v_b(x)$ for almost every $x \in E$.

As mentioned before, the above set of assumptions is too restrictive. In fact, for the existence of optimal controls the conditions in (ii) concerning the derivatives of $\varphi$ and $\psi$ are dispensable; these conditions, including Lipschitz continuity, are needed only for the functions themselves (order $k = 0$). On the other hand, we have to postulate that $\psi$ is convex with respect to $u$. For first-order necessary optimality conditions, (ii) needs to be postulated up to order $k = 1$ only, while Assumption 4.14 is needed in its entirety for second-order conditions and for SQP methods.

Example. The following functions satisfy the above assumption:

$\varphi(x,y) = a(x)\,y + \beta(x)\,(y - y_\Omega(x))^2$ with $a, \beta, y_\Omega \in L^\infty(\Omega)$,

$d(x,y) = c_0(x)\,y + y^k$ with odd $k \in \mathbb{N}$ and $c_0(x) \ge 0$ in $\Omega$ such that $\|c_0\|_{L^\infty(\Omega)} > 0$,

$d(x,y) = c_0(x)\,y + \exp(a(x)\,y)$ with $c_0 \in L^\infty(\Omega)$, $c_0(x) \ge 0$, $\|c_0\|_{L^\infty(\Omega)} > 0$ and $a \in L^\infty(\Omega)$, $a(x) \ge 0$. ◊

Under Assumption 4.14, the existence result of Theorem 4.8 on page 192 applies to the subsequent elliptic boundary value problems. We rewrite $d$ in the form

(4.30)    $d(x,y) = c_0(x)\,y + \big(d(x,y) - c_0(x)\,y\big) = c_0(x)\,y + \tilde d(x,y)$,

with $c_0 = \chi(E_d)\,\lambda_d$. Then $\tilde d$ satisfies Assumption 4.2 on page 185. Analogously, we can rewrite $b$, defining $\alpha := \chi(E_b)\,\lambda_b$.

4.4.2. Distributed control. We exemplify this case by investigating the optimal control problem

(4.31)    $\min J(y,u) := \int_\Omega \varphi(x, y(x))\,dx + \int_\Omega \psi(x, u(x))\,dx$,

subject to

(4.32)    $-\Delta y + d(x,y) = u$ in $\Omega$
          $\partial_\nu y = 0$ on $\Gamma$

and

(4.33)    $u_a(x) \le u(x) \le u_b(x)$ for a.e. $x \in \Omega$.

We recall that problems in which the control occurs as a source term on the right-hand side of the partial differential equation are termed distributed control problems. Here, the set of admissible controls is given by

$U_{ad} = \{u \in L^\infty(\Omega) : u_a(x) \le u(x) \le u_b(x) \text{ for a.e. } x \in \Omega\}$.

Definition. A control $\bar u \in U_{ad}$ is said to be optimal if it satisfies, together with the associated optimal state $\bar y = y(\bar u)$, the inequality

$J(y(\bar u), \bar u) \le J(y(u), u) \quad \forall u \in U_{ad}$.

A control is said to be locally optimal in the sense of $L^r(\Omega)$ if there exists some $\varepsilon > 0$ such that the above inequality holds for all $u \in U_{ad}$ such that $\|u - \bar u\|_{L^r(\Omega)} < \varepsilon$.
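The admissible set and the cost functional can be made concrete on a grid. The following is a minimal sketch and not part of the original text: it assumes $\Omega = (0,1)$ with a uniform midpoint grid of mesh size `h`, and the illustrative integrands $\varphi(x,y) = \tfrac12 (y - y_\Omega(x))^2$ and $\psi(x,u) = \tfrac{\lambda}{2} u^2$. The function names `is_admissible` and `cost` are hypothetical.

```python
def is_admissible(u, ua, ub):
    """Discrete analogue of u in U_ad: ua <= u <= ub at every grid point."""
    return all(a <= v <= b for v, a, b in zip(u, ua, ub))


def cost(y, u, y_target, lam, h):
    """Midpoint-rule approximation of J(y, u) for the assumed integrands
    phi = 0.5 * (y - y_target)^2 and psi = 0.5 * lam * u^2."""
    tracking = sum(0.5 * (yi - ti) ** 2 for yi, ti in zip(y, y_target))
    control = sum(0.5 * lam * ui ** 2 for ui in u)
    return h * (tracking + control)
```

In this discrete setting, a control is optimal when it is admissible and no other admissible grid function, paired with its own state, yields a smaller value of `cost`.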

Before stating the first result on the existence of optimal controls, we note two properties of the functionals

(4.34)    $F(y) = \int_\Omega \varphi(x, y(x))\,dx, \qquad Q(u) = \int_\Omega \psi(x, u(x))\,dx$.

Both functionals are composed of a Nemytskii operator and a continuous linear integral operator from $L^1(\Omega)$ into $\mathbb{R}$. By virtue of Lemma 4.11 on page 198, $F$ is Lipschitz continuous on the set $\{y \in L^2(\Omega) : \|y\|_{L^\infty(\Omega)} \le M\}$, for any fixed but arbitrary $M > 0$. The same holds for $Q$ on $U_{ad}$, since $U_{ad}$ is bounded with respect to the $L^\infty$ norm. Moreover, the reader will be asked in Exercise 4.5 to show that $Q$ is convex on $U_{ad}$ provided that $\psi$ satisfies the convexity condition stated in (4.35) below.

Theorem 4.15. Suppose that Assumption 4.14 holds, and assume that $\psi$ is convex in $u$, that is,

(4.35)    $\psi(x, \lambda u + (1-\lambda) v) \le \lambda\,\psi(x,u) + (1-\lambda)\,\psi(x,v)$

for almost every $x \in \Omega$, all $u, v \in \mathbb{R}$, and every $\lambda \in (0,1)$. Then the problem (4.31)-(4.33) with distributed control has at least one optimal control $\bar u$ with associated optimal state $\bar y = y(\bar u) \in H^1(\Omega) \cap C(\bar\Omega)$.

Proof. We first bring the elliptic equation in (4.32) into the form (4.5) on page 183, using the transformation from (4.30). Recalling the remarks following (4.32), we may apply Theorem 4.8 on page 192 with $\tilde d$ in place of $d$. Consequently, the state problem (4.32) has for every control $u \in U_{ad}$ a uniquely determined state $y = y(u) \in H^1(\Omega) \cap C(\bar\Omega)$.

Next, observe that $U_{ad}$ is a bounded subset of $L^\infty(\Omega)$ and thus bounded in any space $L^r(\Omega)$ for $r > N/2$. Without loss of generality, we may assume $r \ge 2$. Hence, we may employ the estimate (4.11) on page 191 to conclude that there is some constant $M > 0$ such that

$\|y(u)\|_{C(\bar\Omega)} \le M$

for all states $y(u)$ that correspond to a control $u \in U_{ad}$.

By virtue of Assumption 4.14, the functional $J(y,u) = F(y) + Q(u)$ is bounded from below. Therefore, the infimum

$j = \inf_{u \in U_{ad}} J(y(u), u)$

exists. Let $\{(y_n, u_n)\}_{n=1}^\infty$ be a minimizing sequence, that is, let $u_n \in U_{ad}$ and $y_n = y(u_n)$, for $n \in \mathbb{N}$, be such that $J(y_n, u_n) \to j$ as $n \to \infty$.

We now interpret $U_{ad}$ as a subset of $L^r(\Omega)$. The reader will be asked in Exercise 4.6 to verify that $U_{ad}$ is nonempty, closed, bounded, and convex in $L^r(\Omega)$. Since $L^r(\Omega)$ is a reflexive Banach space, it follows from Theorem 2.11 that $U_{ad}$ is weakly sequentially compact. Hence, there exists a subsequence, without loss of generality $\{u_n\}_{n=1}^\infty$ itself, that converges weakly in $L^r(\Omega)$ to some $\bar u \in U_{ad}$, i.e.,

$u_n \rightharpoonup \bar u$ as $n \to \infty$.

Now we have found a candidate for the optimal control, but this has yet to be proved. To this end, we have to show that the state sequence $\{y_n\}_{n=1}^\infty$ converges in a suitable sense. This is not as straightforward as in the linear-quadratic case. To begin with, consider the sequence

$z_n(\cdot) = d(\cdot, y_n(\cdot)), \quad n \in \mathbb{N}$.

Now recall that $\|y_n\|_{L^\infty(\Omega)} \le M$ for all $n \in \mathbb{N}$. Then $\{z_n\}_{n=1}^\infty$ is also bounded in $L^\infty(\Omega)$ and, a fortiori, in $L^r(\Omega)$. Therefore, a subsequence, without loss of generality $\{z_n\}_{n=1}^\infty$ itself, converges weakly in $L^r(\Omega)$ to some $z \in L^r(\Omega)$.

Next, observe that $y_n$ solves the boundary value problem

$-\Delta y_n + y_n = R_n$
$\partial_\nu y_n = 0$,

where the right-hand side $R_n := -d(x, y_n) + u_n$ converges weakly in $L^r(\Omega)$ to $-z + \bar u$. By virtue of Theorem 2.6 on page 35, the mapping $R_n \mapsto y_n$ is linear and continuous from $L^2(\Omega)$ into $H^1(\Omega)$, and since $r \ge 2$, also from $L^r(\Omega)$ into $H^1(\Omega)$. Since every continuous linear operator is also weakly continuous, $\{y_n\}_{n=1}^\infty$ must converge weakly in $H^1(\Omega)$ to some $\bar y \in H^1(\Omega)$, i.e.,

$y_n \rightharpoonup \bar y$ as $n \to \infty$.

Moreover, since $H^1(\Omega)$ is by Theorem 7.4 on page 356 compactly embedded in $L^2(\Omega)$, we also have the strong convergence

$\|y_n - \bar y\|_{L^2(\Omega)} \to 0$ as $n \to \infty$.

The function $\bar y$ is the natural candidate for the desired optimal state.

After these somewhat lengthy preliminaries, the rest of the proof is straightforward. First, recall that $|y_n(x)| \le M$ for all $x \in \bar\Omega$. Now, the set $\{v \in L^r(\Omega) : \|v\|_{L^\infty(\Omega)} \le M\}$ is bounded, closed, and convex, and thus weakly sequentially closed. Therefore, $\bar y$ belongs to this set.

We now aim to show that $\bar y$ is the weak solution associated with $\bar u$. Once this is shown, we also know that $\bar y \in C(\bar\Omega)$. Now observe that, by Lemma 4.11 on page 198, it follows from the boundedness of $\{y_n\}_{n=1}^\infty$ in $L^\infty(\Omega)$ that

(4.36)    $\|d(\cdot, y_n) - d(\cdot, \bar y)\|_{L^2(\Omega)} \le L(M)\,\|y_n - \bar y\|_{L^2(\Omega)}$,

and thus $d(\cdot, y_n) \to d(\cdot, \bar y)$ in $L^2(\Omega)$. Moreover, the sequence $\{u_n\}_{n=1}^\infty$ also converges weakly in $L^2(\Omega)$ to $\bar u$.

Now, we have, for any $n \in \mathbb{N}$,

$\int_\Omega \nabla y_n \cdot \nabla v\,dx + \int_\Omega d(\cdot, y_n)\,v\,dx = \int_\Omega u_n\,v\,dx \quad \forall v \in H^1(\Omega)$.

Passing to the limit as $n \to \infty$, we see that $y_n \rightharpoonup \bar y$ in $H^1(\Omega)$ yields the convergence of the first integral; moreover, from $y_n \to \bar y$ in $L^2(\Omega)$ and $\|y_n\|_{L^\infty(\Omega)} \le M$ the convergence of the second integral follows, and from $u_n \rightharpoonup \bar u$ in $L^r(\Omega)$ that of the third integral. In summary, we have

$\int_\Omega \nabla \bar y \cdot \nabla v\,dx + \int_\Omega d(\cdot, \bar y)\,v\,dx = \int_\Omega \bar u\,v\,dx \quad \forall v \in H^1(\Omega)$.

In other words, $\bar y$ is the weak solution corresponding to the right-hand side $\bar u$, that is, $\bar y = y(\bar u)$.

The proof is still not complete: it remains to show the optimality of $\bar u$. At this point, one might be tempted to believe that the continuity of the functional $Q$ already suffices to conclude from the convergence $u_n \rightharpoonup \bar u$ that $Q(u_n) \to Q(\bar u)$. Unfortunately, however, nonlinear continuous functionals need not be weakly continuous, so this line of argument is inconclusive.

Here, the convexity of the functional $Q$ comes to the rescue. In fact, by Theorem 2.12 on page 47, $Q$ is therefore weakly lower semicontinuous, that is,

$u_n \rightharpoonup \bar u \;\Longrightarrow\; \liminf_{n\to\infty} Q(u_n) \ge Q(\bar u)$.

In summary, we have

$j = \lim_{n\to\infty} J(y_n, u_n) = \lim_{n\to\infty} F(y_n) + \liminf_{n\to\infty} Q(u_n)$
$\phantom{j} = F(\bar y) + \liminf_{n\to\infty} Q(u_n) \ge F(\bar y) + Q(\bar u) = J(\bar y, \bar u)$.

By definition of the infimum $j$, we therefore must have $J(\bar y, \bar u) = j$, which proves the optimality. □

Analogous reasoning yields the existence of an optimal control for zero Dirichlet boundary conditions.

Remarks. In the above proof, the following two conclusions would have been wrong:

(i) To conclude that $y_n \rightharpoonup \bar y \Rightarrow d(\cdot, y_n) \rightharpoonup d(\cdot, \bar y)$ without knowing the strong convergence $y_n \to \bar y$: indeed, nonlinear mappings need not be weakly continuous.

(ii) To conclude that $y_n \to \bar y$ in $L^2(\Omega) \Rightarrow d(\cdot, y_n) \to d(\cdot, \bar y)$ without knowing that $\|y_n\|_{L^\infty(\Omega)} \le M$ for all $n \in \mathbb{N}$.

Observe that only the boundedness and Lipschitz condition of order $k = 0$ in Assumption 4.14 have to be postulated for $\varphi$ and $\psi$ for the theorem to be valid.

Since the state equation is nonlinear, the optimization problem is nonconvex with respect to $u$. Therefore, the uniqueness of the optimal control $\bar u$ cannot be shown without imposing additional assumptions. Theoretically, arbitrarily many global and local minima are possible. The following examples demonstrate how strange things can be even for the simplest nonlinear optimization problems in Banach spaces. We will come back to this issue later in the chapter; these examples are not really optimal control problems, since they do not have a differential equation as constraint.

Examples.

(i) Consider the problem

(4.37)    $\min f(u) := -\int_0^1 \cos(u(x))\,dx, \quad 0 \le u(x) \le 2\pi, \quad u \in L^\infty(0,1)$.

Evidently, $-1$ is the optimal value, attained for instance at $\bar u \equiv 0$. But there are uncountably many other global solutions, namely, all measurable functions $u$ that attain only the values $0$ and $2\pi$. These global minima can be arbitrarily close to each other with respect to the $L^2$ norm, while their $L^\infty$ distance is always $2\pi$.

(ii) The situation is similar (cf. Alt and Malanowski [AM93]) for the problem

$\min \int_0^1 \big(u^2(x) - 1\big)^2\,dx, \quad |u(x)| \le 1, \quad u \in L^\infty(0,1)$. ◊

Boundary control. Under Assumption 4.14, the existence of at least one solution to the boundary control problem (4.49)-(4.51) to be discussed on page 218 can be shown in a similar way.

4.5. The control-to-state operator

For all of the problems addressed in this book, a unique state $y$ is assigned to the control. In the linear-quadratic case, we assigned to the state $y$ the part $Su$ that actually occurs in the cost functional. Here, we no longer follow this approach, simply because the diversity of the spaces and dual spaces involved would necessitate a very technical exposition. We therefore only consider the mapping $u \mapsto y$, which is generally denoted by $G$. The range of $G$ will always be a subset of $Y = C(\bar\Omega) \cap H^1(\Omega)$. As before, we will often write $y(u)$ in place of $G(u)$, presuming that any confusion with the value $y(x)$ of $y$ at the space point $x$ will be clarified by the context. We begin our analysis with the control acting as a source in the domain.
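To see the mapping $u \mapsto y$ at work, the following is a minimal numerical sketch, not taken from the text. It assumes the one-dimensional domain $\Omega = (0,1)$, the illustrative nonlinearity $d(x,y) = y + y^3$, a cell-centered finite-difference grid with homogeneous Neumann conditions, and an undamped Newton iteration. The names `solve_tridiagonal` and `control_to_state` are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system.
    lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def control_to_state(u, tol=1e-10, max_iter=50):
    """Discrete G(u): solve -y'' + y + y^3 = u on (0, 1) with y'(0) = y'(1) = 0.
    u holds one control value per grid cell (at least two cells)."""
    n = len(u)
    h = 1.0 / n
    y = [0.0] * n
    for _ in range(max_iter):
        # Residual F(y) = A y + y + y^3 - u, with A the Neumann Laplacian.
        res = []
        for i in range(n):
            left = y[i - 1] if i > 0 else y[i]        # ghost cell mirrors y[0]
            right = y[i + 1] if i < n - 1 else y[i]   # ghost cell mirrors y[-1]
            lap = (2.0 * y[i] - left - right) / h ** 2
            res.append(lap + y[i] + y[i] ** 3 - u[i])
        if max(abs(r) for r in res) < tol:
            break
        # Jacobian A + diag(1 + 3 y^2); it is symmetric positive definite.
        off = -1.0 / h ** 2
        diag = []
        for i in range(n):
            neighbours = (1 if i > 0 else 0) + (1 if i < n - 1 else 0)
            diag.append(neighbours / h ** 2 + 1.0 + 3.0 * y[i] ** 2)
        step = solve_tridiagonal([off] * n, diag, [off] * n, [-r for r in res])
        y = [yi + si for yi, si in zip(y, step)]
    return y
```

For the constant control $u \equiv 2$ the discrete state is the constant solving $y + y^3 = 2$, namely $y \equiv 1$. Since the assumed $d$ satisfies $d_y \ge 1$, the discrete states also inherit a maximum-norm Lipschitz bound with constant 1 and depend monotonically on the control.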

4.5.1. Distributed control. We consider the state problem (4.32)

$-\Delta y + d(x,y) = u$ in $\Omega$
$\partial_\nu y = 0$ on $\Gamma$.

By Theorem 4.8 on page 192, for any control $u \in U := L^r(\Omega)$ with $r > N/2$, there exists a unique state $y \in Y = H^1(\Omega) \cap C(\bar\Omega)$, provided that the corresponding assumptions are met (which we shall take to be the case). We denote the associated control-to-state operator by $G : U \to Y$, $G(u) = y$.

Theorem 4.16. Suppose that Assumption 4.14 on page 206 holds for $\Omega$ and $d$. Then $G$ is a Lipschitz continuous mapping from $L^r(\Omega)$, $r > N/2$, into $H^1(\Omega) \cap C(\bar\Omega)$, that is, there is a constant $L > 0$ such that

$\|y_1 - y_2\|_{H^1(\Omega)} + \|y_1 - y_2\|_{C(\bar\Omega)} \le L\,\|u_1 - u_2\|_{L^r(\Omega)}$

whenever $u_i \in L^r(\Omega)$ and $y_i = G(u_i)$, $i = 1, 2$.

Proof. By virtue of Theorem 4.10 on page 194, $y_i \in C(\bar\Omega)$ for $i = 1, 2$. Subtracting the equations satisfied by $y_1$ and $y_2$, we see that

$-\Delta(y_1 - y_2) + d(x, y_1) - d(x, y_2) = u_1 - u_2$
$\partial_\nu(y_1 - y_2) = 0$.

Evidently, we have

$d(x, y_1(x)) - d(x, y_2(x)) = -\int_0^1 \frac{d}{ds}\, d\big(x, y_1(x) + s\,(y_2(x) - y_1(x))\big)\,ds$
$\phantom{d(x, y_1(x)) - d(x, y_2(x))} = \int_0^1 d_y\big(x, y_1(x) + s\,(y_2(x) - y_1(x))\big)\,ds\;\big(y_1(x) - y_2(x)\big)$.

Owing to the continuity of the functions $d_y$, $y_1$, and $y_2$, the integral in the second line defines a bounded and measurable function $c_0 = c_0(x)$, which is nonnegative since the mapping $y \mapsto d(x,y)$ is increasing. On $E_d$ the integrand satisfies

$d_y\big(x, y_1(x) + s\,(y_2(x) - y_1(x))\big) \ge \lambda_d > 0$,

so that $c_0(x) \ge \lambda_d$ for almost every $x \in E_d$. Of course, $c_0$ also depends on $y_1$ and $y_2$, but this is immaterial for the argumentation to follow.

Now let $y = y_1 - y_2$ and $u = u_1 - u_2$. Then, in view of the above identity,

$-\Delta y + c_0(x)\,y = u$
$\partial_\nu y = 0$.

Because $c_0 \ge 0$, the function $c_0(x)\,y$ is increasing with respect to $y$. We may therefore apply Theorem 4.7 on page 191 with the above $c_0$ and the specifications $d(x,y) = 0$, $f = u$, and $g = b = \alpha = 0$. Invoking (4.11), we immediately deduce that

$\|y\|_{H^1(\Omega)} + \|y\|_{C(\bar\Omega)} \le L\,\|u\|_{L^r(\Omega)}$

for all $u$ and the associated $y$. Since the above boundary value problem is linear, we may take $u = u_1 - u_2$ and $y = y_1 - y_2$ to arrive at the asserted estimate. □

Remark. In both the preceding and the next theorem, only the boundedness and Lipschitz conditions of order $k = 1$ from Assumption 4.14 are needed for the asserted results to be valid.

Next, we determine the Fréchet derivative of the control-to-state operator at a fixed point $\bar u$. In the later applications, $\bar u$ will be a locally optimal control. We have the following result.

Theorem 4.17. Suppose that Assumption 4.14 on page 206 holds. Then for every $r > N/2$ the control-to-state operator $G$ is Fréchet differentiable from $L^r(\Omega)$ into $H^1(\Omega) \cap C(\bar\Omega)$. Its directional derivative at $\bar u \in L^r(\Omega)$ in the direction $u$ is given by

(4.38)    $G'(\bar u)\,u = y$,

where $y$ denotes the weak solution to the boundary value problem linearized at $\bar y = G(\bar u)$:

(4.39)    $-\Delta y + d_y(x, \bar y)\,y = u$ in $\Omega$
          $\partial_\nu y = 0$ on $\Gamma$.

Proof. We have to show that

$G(\bar u + u) - G(\bar u) = D\,u + r(\bar u, u)$

with a continuous linear operator $D : L^r(\Omega) \to H^1(\Omega) \cap C(\bar\Omega)$ and a mapping $r$ that satisfies

$\dfrac{\|r(\bar u, u)\|_{H^1(\Omega) \cap C(\bar\Omega)}}{\|u\|_{L^r(\Omega)}} \to 0$ as $\|u\|_{L^r(\Omega)} \to 0$.

Here, we have put $\|r\|_{H^1(\Omega) \cap C(\bar\Omega)} = \|r\|_{H^1(\Omega)} + \|r\|_{C(\bar\Omega)}$. It then follows that $G'(\bar u) = D$.

The boundary value problems satisfied by $\bar y = y(\bar u)$ and $\tilde y = y(\bar u + u)$ read, respectively,

$-\Delta \bar y + d(x, \bar y) = \bar u, \qquad -\Delta \tilde y + d(x, \tilde y) = \bar u + u$
$\partial_\nu \bar y = 0, \qquad\qquad\qquad\;\; \partial_\nu \tilde y = 0$.

Subtracting them yields

$-\Delta(\tilde y - \bar y) + \underbrace{d(x, \tilde y) - d(x, \bar y)}_{=\, d_y(x, \bar y)\,(\tilde y - \bar y) + r_d} = u$
$\partial_\nu(\tilde y - \bar y) = 0$.

The Nemytskii operator $\Phi(y) = d(\cdot, y(\cdot))$ is, by Lemma 4.12 on page 202, Fréchet differentiable from $C(\bar\Omega)$ into $L^\infty(\Omega)$. Therefore,

(4.40)    $\Phi(\tilde y) - \Phi(\bar y) = d(\cdot, \tilde y(\cdot)) - d(\cdot, \bar y(\cdot)) = d_y(\cdot, \bar y(\cdot))\,\big(\tilde y(\cdot) - \bar y(\cdot)\big) + r_d$,

with a remainder $r_d$ such that

$\dfrac{\|r_d\|_{L^\infty(\Omega)}}{\|\tilde y - \bar y\|_{C(\bar\Omega)}} \to 0$ as $\|\tilde y - \bar y\|_{C(\bar\Omega)} \to 0$.

The reader will be asked in Exercise 4.7 to show that this implies that

$\tilde y - \bar y = y + y_\rho$,

with the solution $y$ to (4.39) and a remainder $y_\rho$ that solves the boundary value problem

(4.41)    $-\Delta y_\rho + d_y(\cdot, \bar y)\,y_\rho = -r_d$
          $\partial_\nu y_\rho = 0$.

In this connection, recall that $d_y(x, \bar y) \ge \lambda_d > 0$ in $E_d$, so that this problem is uniquely solvable. From the Lipschitz continuity shown in Theorem 4.16 it follows that

$\|\tilde y - \bar y\|_{C(\bar\Omega)} + \|\tilde y - \bar y\|_{H^1(\Omega)} \le L\,\|u\|_{L^r(\Omega)}$.

Moreover,

$\dfrac{\|r_d\|_{L^\infty(\Omega)}}{\|u\|_{L^r(\Omega)}} = \dfrac{\|r_d\|_{L^\infty(\Omega)}}{\|\tilde y - \bar y\|_{C(\bar\Omega)}} \cdot \dfrac{\|\tilde y - \bar y\|_{C(\bar\Omega)}}{\|u\|_{L^r(\Omega)}} \le \dfrac{\|r_d\|_{L^\infty(\Omega)}}{\|\tilde y - \bar y\|_{C(\bar\Omega)}}\,L$,

and thus $\|r_d\|_{L^\infty(\Omega)} = o(\|u\|_{L^r(\Omega)})$. By (4.41), we also have

$\|y_\rho\|_{C(\bar\Omega)} + \|y_\rho\|_{H^1(\Omega)} = o(\|u\|_{L^r(\Omega)})$.

Denoting the continuous linear mapping $u \mapsto y$ by $D$, we conclude that

$G(\bar u + u) - G(\bar u) = \tilde y - \bar y = D\,u + y_\rho = D\,u + r(\bar u, u)$,

where $r(\bar u, u) = y_\rho$ has the required properties. This concludes the proof of the assertion. □

Conclusion. A fortiori, $G$ is Fréchet differentiable from $L^\infty(\Omega)$ into $H^1(\Omega) \cap C(\bar\Omega)$.

4.5.2. Boundary control. In this case, the situation is quite similar. In fact, under Assumption 4.14 on page 206, $G$ is continuously Fréchet differentiable from $U = L^s(\Gamma)$ into $Y = H^1(\Omega) \cap C(\bar\Omega)$ for all $s > N - 1$. The directional derivative at $\bar u \in L^s(\Gamma)$ in the direction $u$ is given by

$G'(\bar u)\,u = y$,

where $y$ denotes the weak solution to the boundary value problem linearized at $\bar y = G(\bar u)$, that is,

$-\Delta y + y = 0$
$\partial_\nu y + b_y(x, \bar y)\,y = u$.

4.6. Necessary optimality conditions

4.6.1. Distributed control. In the following, let $\bar u \in L^\infty(\Omega)$ denote some (in the sense of the $L^\infty$ norm) locally optimal control of the problem (4.31)-(4.33) on page 207. We derive the first-order necessary conditions that have to be obeyed by $\bar u$ and the associated state $\bar y$.

For $y$ we can write $y(u) = G(u)$, with the control-to-state operator $G : L^\infty(\Omega) \to H^1(\Omega) \cap C(\bar\Omega)$. The cost functional thus attains the form

$J(y,u) = J(G(u), u) = F(G(u)) + Q(u) =: f(u)$,

where $F$ and $Q$ are defined as in (4.34). Under Assumption 4.14, $f$ is a Fréchet differentiable functional in $L^\infty(\Omega)$; indeed, $F$, $Q$, and $G$ are, by virtue of Lemma 4.12 on page 202 and Theorem 4.17, Fréchet differentiable.

Suppose now that $\bar u$ is locally optimal, $U_{ad}$ is convex, and $u \in U_{ad}$ is arbitrarily chosen. Then for all sufficiently small $\lambda > 0$ the convex combination $v := \bar u + \lambda\,(u - \bar u)$ belongs to both $U_{ad}$ and the $\varepsilon$-neighborhood of $\bar u$ in which $f(\bar u) \le f(v)$ holds. Hence, for all sufficiently small $\lambda > 0$,

$f(\bar u + \lambda\,(u - \bar u)) \ge f(\bar u)$.

Division by $\lambda$ and passage to the limit as $\lambda \downarrow 0$ lead to the variational inequality of Lemma 2.21 on page 63. We have thus shown the following result.

Lemma 4.18. Suppose that Assumption 4.14 on page 206 holds, and let $\bar u$ be a locally optimal control for problem (4.31)-(4.33). Then we have the variational inequality

(4.42)    $f'(\bar u)(u - \bar u) \ge 0 \quad \forall u \in U_{ad}$.

The above result is valid for all nonlinear functionals of the type $f(u) = J(y(u), u)$ that we subsequently consider. Using the chain rule, we immediately see that the directional derivative at $\bar u$ in the direction $h$ is given

by

$f'(\bar u)\,h = F'(G(\bar u))\,G'(\bar u)\,h + Q'(\bar u)\,h$
(4.43)    $\phantom{f'(\bar u)\,h} = F'(\bar y)\,y + Q'(\bar u)\,h$
$\phantom{f'(\bar u)\,h} = \int_\Omega \varphi_y(x, \bar y(x))\,y(x)\,dx + \int_\Omega \psi_u(x, \bar u(x))\,h(x)\,dx$.

Here, $y = G'(\bar u)\,h$ is by virtue of Theorem 4.17 the weak solution to the linearized boundary value problem

(4.44)    $-\Delta y + d_y(x, \bar y)\,y = h$
          $\partial_\nu y = 0$.

Next, we define the adjoint state $p \in H^1(\Omega) \cap C(\bar\Omega)$ as the unique weak solution to the adjoint equation

(4.45)    $-\Delta p + d_y(x, \bar y)\,p = \varphi_y(x, \bar y(x))$ in $\Omega$
          $\partial_\nu p = 0$ on $\Gamma$.

Lemma 4.19. Let $y$ be the weak solution to problem (4.44) for given $h \in L^2(\Omega)$, and let $p$ be the adjoint state defined as the weak solution to problem (4.45). Then

$\int_\Omega \varphi_y(x, \bar y(x))\,y(x)\,dx = \int_\Omega p(x)\,h(x)\,dx$.

Proof. The assertion follows directly from Lemma 2.31 on page 74, with the specifications $a_\Omega(x) = \varphi_y(x, \bar y(x))$, $c_0(x) = d_y(x, \bar y(x))$ (observe that $d_y(\cdot, \bar y(\cdot)) \not\equiv 0$), $\beta_\Omega = 1$, and $a_\Gamma = \alpha = \beta_\Gamma = 0$. □

As a simple conclusion, the following expression for the directional derivative of the reduced functional $f$ at $\bar u$ in the direction $h \in L^\infty(\Omega)$ results:

(4.46)    $f'(\bar u)\,h = \int_\Omega \big(p(x) + \psi_u(x, \bar u(x))\big)\,h(x)\,dx$.

Moreover, we obtain the desired necessary optimality condition.

Theorem 4.20. Suppose that Assumption 4.14 holds. Then every locally optimal control $\bar u$ for problem (4.31)-(4.33) satisfies, together with the associated adjoint state $p \in H^1(\Omega) \cap C(\bar\Omega)$ defined by (4.45), the variational inequality

(4.47)    $\int_\Omega \big(p(x) + \psi_u(x, \bar u(x))\big)\big(u(x) - \bar u(x)\big)\,dx \ge 0 \quad \forall u \in U_{ad}$.

As in Chapter 2, we can reformulate the variational inequality in terms of a minimum principle.

Conclusion. Suppose that Assumption 4.14 holds, and suppose that $\bar u$ is a locally optimal control of problem (4.31)-(4.33) with associated adjoint state $p$. Then for almost every $x \in \Omega$ the minimum of the problem

(4.48)    $\min_{u_a(x) \le v \le u_b(x)} \big\{ \big(p(x) + \psi_u(x, \bar u(x))\big)\,v \big\}$

is attained at $v = \bar u(x)$.

Special case: The function $\psi(x,u) = \frac{\lambda}{2}\,u^2$, with $\lambda > 0$, obviously satisfies the assumptions. Clearly, $\psi_u(x,u) = \lambda u$; hence, the minimum for the problem

$\min_{u_a(x) \le v \le u_b(x)} \big\{ \big(p(x) + \lambda\,\bar u(x)\big)\,v \big\}$

is attained at $v = \bar u(x)$ for almost every $x \in \Omega$. Therefore, we have the projection formula as in (2.58) on page 70:

$\bar u(x) = \mathbb{P}_{[u_a(x), u_b(x)]}\Big\{ -\frac{1}{\lambda}\,p(x) \Big\}$ for a.e. $x \in \Omega$.

If $u_a$ and $u_b$ are continuous, then so is $\bar u$; in fact, we have $p \in H^1(\Omega) \cap C(\bar\Omega)$, and the projection operator maps continuous functions to continuous ones. If, in addition, $u_a, u_b \in H^1(\Omega)$, then, by the same token, $\bar u \in H^1(\Omega) \cap C(\bar\Omega)$.

Example. Consider the "superconductivity" problem

$\min J(y,u) := \frac{1}{2}\,\|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\,\|u\|_{L^2(\Omega)}^2$,

subject to $-2 \le u(x) \le 2$ and

$-\Delta y + y + y^3 = u$
$\partial_\nu y = 0$.

This is a special case of problem (4.31)-(4.33) on page 207, with the specifications

$\varphi(x,y) = \frac{1}{2}\,(y - y_\Omega(x))^2, \quad \psi(x,u) = \frac{\lambda}{2}\,u^2, \quad d(x,y) = y + y^3$.

If we assume $y_\Omega \in L^\infty(\Omega)$, then all requirements (measurability, boundedness, differentiability, monotonicity of $d$, convexity of $\psi$ with respect to $u$)

are met. Theorem 4.15 on page 208 yields the existence of an optimal control $\bar u$. The adjoint equation for $p$ reads

$-\Delta p + p + 3\,\bar y^2\,p = \bar y - y_\Omega$
$\partial_\nu p = 0$.

Together with the solution $p \in H^1(\Omega) \cap C(\bar\Omega)$ to the adjoint system, $\bar u$ must obey the variational inequality

$\int_\Omega (\lambda\,\bar u + p)(u - \bar u)\,dx \ge 0 \quad \forall u \in U_{ad}$.

In the case where $\lambda > 0$, the usual projection relation follows, and we have $\bar u \in H^1(\Omega) \cap C(\bar\Omega)$ provided that $u_a, u_b \in H^1(\Omega) \cap C(\bar\Omega)$. If $\lambda = 0$, then obviously $\bar u(x) = -2\,\operatorname{sign} p(x)$. ◊

Test example. The reader will be asked in Exercise 4.8 to verify that the necessary first-order optimality conditions are fulfilled for $\bar u(x) \equiv 2$ in the case where $\lambda = 1$ and $y_\Omega = 9$. ◊

Remark. In the above example, the assumption $y_\Omega \in L^r(\Omega)$, $r > N/2$, would be sufficient. Then it also follows from our regularity results that $p$ and thus also the optimal control $\bar u$ are continuous if $\lambda > 0$. In addition, the boundedness and Lipschitz conditions from Assumption 4.14 are needed only up to order $k = 1$.

4.6.2. Boundary control. Consider the problem

(4.49)    $\min J(y,u) := \int_\Omega \varphi(x, y(x))\,dx + \int_\Gamma \psi(x, u(x))\,ds(x)$,

subject to

(4.50)    $-\Delta y + y = 0$ in $\Omega$
          $\partial_\nu y + b(x,y) = u$ on $\Gamma$

and

(4.51)    $u_a(x) \le u(x) \le u_b(x)$ for a.e. $x \in \Gamma$.

Here, we put

$U_{ad} = \{u \in L^\infty(\Gamma) : u_a(x) \le u(x) \le u_b(x) \text{ for a.e. } x \in \Gamma\}$.

Under Assumption 4.14 on page 206, the control-to-state operator $G : u \mapsto y$ maps $L^\infty(\Gamma)$ into $H^1(\Omega) \cap C(\bar\Omega)$. This would also be the case if $-\Delta + I$ were replaced by $-\Delta$. However, below we will discuss an example in which absence of the term $I$ would lead to problems in the adjoint equation when $\bar y = 0$.

The necessary optimality conditions for locally optimal controls $\bar u$ can be derived in a similar way as in the case of distributed controls, except that $G$ and $G'$ have a different meaning. First, one obtains the variational inequality

(4.52)    $\int_\Omega \varphi_y(x, \bar y(x))\,y(x)\,dx + \int_\Gamma \psi_u(x, \bar u(x))\big(u(x) - \bar u(x)\big)\,ds(x) \ge 0 \quad \forall u \in U_{ad}$,

with the directional derivative $y = G'(\bar u)(u - \bar u)$ at $\bar u$ in the direction $u - \bar u$. In analogy to Theorem 4.17 on page 213, one obtains that $y$ solves the boundary value problem linearized at $\bar y$:

(4.53)    $-\Delta y + y = 0$
          $\partial_\nu y + b_y(x, \bar y)\,y = u - \bar u$.

The adjoint state $p$ is defined as the weak solution to the adjoint equation

(4.54)    $-\Delta p + p = \varphi_y(x, \bar y(x))$ in $\Omega$
          $\partial_\nu p + b_y(x, \bar y)\,p = 0$ on $\Gamma$.

We have $p \in H^1(\Omega) \cap C(\bar\Omega)$, since the right-hand side $\varphi_y$ belongs to $L^\infty(\Omega)$. For this conclusion to hold, $\varphi_y \in L^r(\Omega)$, for some $r > N/2$, already suffices. Invoking Lemma 2.31 on page 74 with the specifications $a_\Omega(x) = \varphi_y(x, \bar y(x))$, $\alpha(x) = b_y(x, \bar y(x))$, $\beta_\Omega = 0$, and $\beta_\Gamma = 1$, we find that

$\int_\Omega \varphi_y(x, \bar y(x))\,y(x)\,dx = \int_\Gamma p(x)\big(u(x) - \bar u(x)\big)\,ds(x) \quad \forall u \in L^2(\Gamma)$.

Substituting this into the variational inequality (4.52) yields the following result.

Theorem 4.21. Suppose that Assumption 4.14 on page 206 holds, and let $\bar u$ be a locally optimal control for the boundary control problem (4.49)-(4.51), with associated adjoint state $p$ defined as the solution to the boundary value problem (4.54). Then $\bar u$ satisfies the variational inequality

(4.55)    $\int_\Gamma \big(p(x) + \psi_u(x, \bar u(x))\big)\big(u(x) - \bar u(x)\big)\,ds(x) \ge 0 \quad \forall u \in U_{ad}$.

Again, the variational inequality can be rephrased in terms of a pointwise minimum principle. Since this is completely analogous to the case of distributed controls, we omit the details. As in the distributed control case, the directional derivative of $f(u) = J(G(u), u)$ at $\bar u$ in the direction $u$ can

be expressed in the form

(4.56)    $f'(\bar u)\,u = \int_\Gamma \big(p(x) + \psi_u(x, \bar u(x))\big)\,u(x)\,ds(x)$.

Example. Consider the boundary control problem

$\min J(y,u) := \frac{1}{2}\,\|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\,\|u\|_{L^2(\Gamma)}^2$,

subject to

$-\Delta y + y = 0$ in $\Omega$
$\partial_\nu y + y^3|y| = u$ on $\Gamma$

and $0 \le u(x) \le 1$.

This is a special case of the boundary control problem (4.49)-(4.51), with

$\varphi(x,y) = \frac{1}{2}\,(y - y_\Omega(x))^2, \quad \psi(x,u) = \frac{\lambda}{2}\,u^2, \quad b(x,y) = y^3|y|$.

Actually, the boundary condition is of Stefan-Boltzmann type, because $y^3|y| = y^4$ for nonnegative $y$. However, unlike $y^4$, the function $y^3|y|$ is monotone increasing. Once more, we assume $y_\Omega \in L^\infty(\Omega)$. Obviously, the requested properties concerning measurability, differentiability, monotonicity of $b$, and convexity of $\psi$ with respect to $u$ all hold. Therefore, in analogy to Theorem 4.15 on page 208, at least one optimal control $\bar u$ exists.

A simple calculation shows that $b'(y) = 4\,y^2|y|$. Hence, the adjoint boundary value problem (4.54) becomes

$-\Delta p + p = \bar y - y_\Omega$
$\partial_\nu p + 4\,\bar y^2|\bar y|\,p = 0$,

and it follows that, with its solution $p$, for all $u$ such that $0 \le u(x) \le 1$ for almost every $x \in \Gamma$ the inequality

$\int_\Gamma (\lambda\,\bar u + p)(u - \bar u)\,ds \ge 0$

holds.

4.7. Application of the formal Lagrange method

The formal Lagrange method is a powerful tool for the derivation of optimality conditions also for the nonconvex problems treated in this chapter. One easily arrives at the correct result, which can then be verified rigorously. We demonstrate this approach for the following rather general optimal control problem:

(4.57)    $\min J(y,v,u) := \int_\Omega \varphi(x, y(x), v(x))\,dx + \int_\Gamma \psi(x, y(x), u(x))\,ds(x)$,

subject to

(4.58)    $-\Delta y + d(x,y,v) = 0$ in $\Omega$
          $\partial_\nu y + b(x,y,u) = 0$ on $\Gamma$

and

(4.59)    $v_a(x) \le v(x) \le v_b(x)$ for a.e. $x \in \Omega$
          $u_a(x) \le u(x) \le u_b(x)$ for a.e. $x \in \Gamma$.

The cost functional, containing also boundary values of $y$, is more general than before. Since the method will be explained only formally, we do not state precise assumptions on the involved quantities. Clearly, the functions $\varphi$, $\psi$, $d$, and $b$ have to be measurable with respect to $x$ and differentiable with respect to $y$, $u$, and $v$. In addition, at least the monotonicity of $d$ and $b$ in $y$ must be postulated.

Since the controls $u$ and $v$ appear nonlinearly in the state problem, a general existence result for optimal controls cannot be expected. We therefore assume that locally optimal controls $\bar u$ and $\bar v$ exist, for which the necessary optimality conditions have to be derived.

In analogy to the problems studied earlier, we denote the sets of admissible controls by $V_{ad} \subset L^\infty(\Omega)$ and $U_{ad} \subset L^\infty(\Gamma)$. The Lagrangian function is formally introduced as

$\mathcal{L}(y,v,u,p) = J(y,v,u) - \int_\Omega \big(-\Delta y + d(\cdot, y, v)\big)\,p\,dx - \int_\Gamma \big(\partial_\nu y + b(\cdot, y, u)\big)\,p\,ds$.

It is apparent that the Lagrangian $\mathcal{L}$ is not meaningful if the state space $Y = H^1(\Omega) \cap C(\bar\Omega)$ is used. Therefore, we use formal integration by parts and redefine $\mathcal{L}$ by

(4.60)    $\mathcal{L}(y,v,u,p) := J(y,v,u) - \int_\Omega \big(\nabla y \cdot \nabla p + d(\cdot, y, v)\,p\big)\,dx - \int_\Gamma b(\cdot, y, u)\,p\,ds$.

This definition anticipates that the Lagrange multiplier for the boundary condition will eventually coincide with the boundary values of the multiplier

p for the differential equation . Therefore , we have in all integrals the same subject to
function p. However , we caution the reader to use different multipliers pl -Dy+y+ey = v in 9
and p2 for the differential equation and the boundary condition if there is on F
avy + y4 = u4
any doubt . In particular, this should be done in the case of a Dirichlet-type
boundary control. and
We expect there to exist a function p E H1(S2) fl C(S2) satisfying the -1 < v(x) < 1, 0 < u(x) <
following conditions:
Dyf(y,v,L,p )y= 0 Vy E H' (Q) with prescribed functions yc, vO E L°O(íl). Since the nonlinearity y4 of
Stefan-Boltzmann type is not monotone increasing, it does not fit into the
Dv£(y, v, ú, p) (v - v) > 0 V v E Vad
$$D_u\mathcal{L}(\bar y,\bar v,\bar u,p)\,(u-\bar u) \ge 0 \quad \forall\, u \in U_{ad}.$$

We explain only briefly how these relations are exploited, because the technique is completely analogous to that used in Chapter 2. The first relation yields

$$D_y\mathcal{L}(\bar y,\bar v,\bar u,p)\,y = \int_\Omega \varphi_y(\cdot,\bar y,\bar v)\,y\,dx + \int_\Gamma \psi_y(\cdot,\bar y,\bar u)\,y\,ds - \int_\Omega \nabla y\cdot\nabla p\,dx - \int_\Omega d_y(\cdot,\bar y,\bar v)\,y\,p\,dx - \int_\Gamma b_y(\cdot,\bar y,\bar u)\,y\,p\,ds = 0$$

for every $y \in H^1(\Omega)$. This is nothing but the variational formulation for the weak solution $p$ to the linear boundary value problem

$$-\Delta p + d_y(\cdot,\bar y,\bar v)\,p = \varphi_y(\cdot,\bar y,\bar v) \quad \text{in } \Omega,$$
$$\partial_\nu p + b_y(\cdot,\bar y,\bar u)\,p = \psi_y(\cdot,\bar y,\bar u) \quad \text{on } \Gamma,$$

which we interpret as the adjoint equation. Its solution $p$, the adjoint state, exists whenever $b_y$, $d_y$, $\varphi_y$, $\psi_y$ are bounded and measurable and $b_y$, $d_y$ are nonnegative and not both zero almost everywhere. From the second and third relations we deduce the variational inequalities

$$\int_\Omega \big(\varphi_v(\cdot,\bar y,\bar v) - p\,d_v(\cdot,\bar y,\bar v)\big)(v-\bar v)\,dx \ge 0 \quad \forall\, v \in V_{ad},$$
$$\int_\Gamma \big(\psi_u(\cdot,\bar y,\bar u) - p\,b_u(\cdot,\bar y,\bar u)\big)(u-\bar u)\,ds \ge 0 \quad \forall\, u \in U_{ad}.$$

The optimality system of the optimal control problem (4.57)-(4.59) then consists of the state equation, the adjoint equation, the two variational inequalities, and the inclusions $u \in U_{ad}$, $v \in V_{ad}$.

Example. As an illustration, let us investigate the problem

$$\min J(y,u,v) := \int_\Omega \big[y^2 + y_\Omega\,y + \lambda_1 v^2 + v_\Omega\,v\big]\,dx + \int_\Gamma \lambda_2 u^8\,ds,$$

general theory. We therefore replace $y^4$ by the increasing function $|y|\,y^3$, which is identical to $y^4$ for $y \ge 0$. Then Theorem 4.7 on page 191 applies, yielding for each pair $u$ and $v$ of admissible controls the existence of a unique solution $y \in H^1(\Omega) \cap C(\bar\Omega)$ to the state problem.

In spite of the occurrence of the nonlinear function $u^4$ in the boundary condition, the existence of an optimal control can be proved. To this end, we replace $u^4$ by the new control $\tilde u$. Then we have $\tilde u$ in the boundary condition and $\lambda_2 \tilde u^2$ in the cost functional, while the constraint for $u$ becomes $0 \le \tilde u \le 1$. The problem thus transformed has, in analogy to Theorem 4.15 on page 208, a pair of optimal controls $\bar v$ and $\tilde u$, whence we obtain for the original problem the optimal controls $\bar v$ and $\bar u = \tilde u^{1/4}$.

The optimal control problem under study is a special case of the class of problems (4.57)-(4.59), with the specifications

(4.61)
$$\varphi(x,y,v) = y^2 + y_\Omega(x)\,y + v_\Omega(x)\,v + \lambda_1 v^2, \qquad \psi(x,y,u) = \lambda_2 u^8,$$
$$d(x,y,v) = y + e^y - v, \qquad b(x,y,u) = |y|\,y^3 - u^4.$$

The boundedness of $y_\Omega$ and $v_\Omega$ has been assumed solely for the purpose of fitting our problem into this general class of problems. The following optimality conditions for arbitrary locally optimal controls $\bar v$ and $\bar u$ evidently remain valid for merely square integrable functions $y_\Omega$ and $v_\Omega$:

(4.62)

Adjoint problem:
$$-\Delta p + p + e^{\bar y} p = 2\bar y + y_\Omega \quad \text{in } \Omega,$$
$$\partial_\nu p + 4\,\bar y^2\,|\bar y|\,p = 0 \quad \text{on } \Gamma.$$

Variational inequalities:
$$\int_\Omega (2\lambda_1 \bar v + v_\Omega + p)(v-\bar v)\,dx \ge 0 \quad \forall\, v \in V_{ad},$$
$$\int_\Gamma (8\lambda_2 \bar u^7 + 4\,\bar u^3 p)(u-\bar u)\,ds \ge 0 \quad \forall\, u \in U_{ad}.$$
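The coefficients of the adjoint problem and of the variational inequalities in (4.62) are partial derivatives of the functions specified in (4.61). The following Python sketch checks them pointwise with central finite differences. All numerical values (the weights, the values of $y_\Omega$ and $v_\Omega$ at one fixed point $x$, and the sample arguments) are hypothetical choices made only for illustration; they do not come from the text.

```python
import math

# Hypothetical sample data; the text prescribes no numerical values.
LAMBDA_1, LAMBDA_2 = 0.5, 2.0
Y_OMEGA, V_OMEGA = 0.3, -0.7  # y_Omega(x) and v_Omega(x) at one fixed point x


def phi(y, v):
    """Distributed integrand from (4.61)."""
    return y**2 + Y_OMEGA * y + V_OMEGA * v + LAMBDA_1 * v**2


def psi(u):
    """Boundary integrand from (4.61); it does not depend on y."""
    return LAMBDA_2 * u**8


def d(y, v):
    """Nonlinearity of the differential equation from (4.61)."""
    return y + math.exp(y) - v


def b(y, u):
    """Nonlinearity of the boundary condition from (4.61)."""
    return abs(y) * y**3 - u**4


def central_diff(g, t, step=1e-6):
    """Second-order central difference approximation of g'(t)."""
    return (g(t + step) - g(t - step)) / (2.0 * step)


def derivative_checks(y, u, v, p):
    """Map each coefficient in (4.62) to (finite difference, closed form)."""
    return {
        "phi_y": (central_diff(lambda t: phi(t, v), y), 2.0 * y + Y_OMEGA),
        "d_y": (central_diff(lambda t: d(t, v), y), 1.0 + math.exp(y)),
        "b_y": (central_diff(lambda t: b(t, u), y), 4.0 * y**2 * abs(y)),
        "phi_v - p*d_v": (
            central_diff(lambda t: phi(y, t), v)
            - p * central_diff(lambda t: d(y, t), v),
            2.0 * LAMBDA_1 * v + V_OMEGA + p,
        ),
        "psi_u - p*b_u": (
            central_diff(psi, u) - p * central_diff(lambda t: b(y, t), u),
            8.0 * LAMBDA_2 * u**7 + 4.0 * u**3 * p,
        ),
    }


if __name__ == "__main__":
    # A negative y is used on purpose to exercise the |y| y^3 term.
    for name, (fd, exact) in derivative_checks(-0.8, 0.6, 0.2, 1.3).items():
        print(f"{name:15s} finite difference {fd: .8f}   closed form {exact: .8f}")
```

Choosing a negative sample value of $y$ confirms that the derivative of $|y|\,y^3$ is $4y^2|y|$ on both sides of zero, which is the coefficient appearing in the boundary condition of the adjoint problem.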

4.8. Pontryagin's maximum principle *

4.8.1. Hamiltonian functions. All of the first-order necessary optimality conditions derived above have, with respect to the controls, the form of a variational inequality, obtained by differentiating the Lagrangian with respect to the control variable.

The famous Pontryagin maximum principle avoids the differentiation with respect to the control. It was first proved for optimal control problems involving ordinary differential equations (see Pontryagin et al. [PBGM62]). An extension to the case of semilinear parabolic partial differential equations, using the integral equation method, is due to von Wolfersdorf [vW76, vW77]; he employed a general technique developed by Bittner [Bit75]. More general results for semilinear elliptic and parabolic problems have been established in recent years, beginning with Bonnans and Casas [BC91]. Later, the maximum principle was extended to include state constraints; we refer the reader to Casas [Cas86] and Bonnans and Casas [BC95] for elliptic equations, and to Casas [Cas97] and Raymond and Zidani [RZ99] for parabolic equations. We also refer in this context to the book [LY95] by Li and Yong.

We demonstrate the use of the maximum principle for the optimal control problem (4.57)-(4.59):

$$\min J(y,u,v) := \int_\Omega \varphi(x,y,v)\,dx + \int_\Gamma \psi(x,y,u)\,ds,$$

subject to

$$-\Delta y + d(x,y,v) = 0 \quad \text{in } \Omega,$$
$$\partial_\nu y + b(x,y,u) = 0 \quad \text{on } \Gamma,$$

and

$$v_a(x) \le v(x) \le v_b(x) \ \text{ for a.e. } x \in \Omega, \qquad u_a(x) \le u(x) \le u_b(x) \ \text{ for a.e. } x \in \Gamma.$$

We choose $Y = H^1(\Omega) \cap C(\bar\Omega)$ as the state space and start from the Lagrangian function (4.60). The integrands of its derivative-free terms are incorporated into two Hamiltonian functions that separately collect the terms involving $\Omega$ and $\Gamma$:

Definition. The functions $H^\Omega : \Omega \times \mathbb{R}^4 \to \mathbb{R}$ and $H^\Gamma : \Gamma \times \mathbb{R}^4 \to \mathbb{R}$ defined by

(4.63)
$$H^\Omega(x,y,v,p_0,p) = p_0\,\varphi(x,y,v) - d(x,y,v)\,p,$$
$$H^\Gamma(x,y,u,p_0,p) = p_0\,\psi(x,y,u) - b(x,y,u)\,p$$

are called Hamiltonian functions.

The Hamiltonians are functions of real variables; their arguments are not functions. The necessary optimality conditions for a locally optimal pair $(\bar u, \bar v)$ of controls, given by (4.61) and (4.62), can be rephrased more elegantly in terms of the Hamiltonian: the adjoint system is equivalent to

(4.64)
$$-\Delta p = D_y H^\Omega(x,\bar y,\bar v,1,p) \quad \text{in } \Omega,$$
$$\partial_\nu p = D_y H^\Gamma(x,\bar y,\bar u,1,p) \quad \text{on } \Gamma,$$

and the variational inequalities attain the pointwise form

$$D_v H^\Omega\big(x,\bar y(x),\bar v(x),1,p(x)\big)\,(v - \bar v(x)) \ge 0 \quad \forall\, v \in [v_a(x), v_b(x)],$$
$$D_u H^\Gamma\big(x,\bar y(x),\bar u(x),1,p(x)\big)\,(u - \bar u(x)) \ge 0 \quad \forall\, u \in [u_a(x), u_b(x)],$$

for almost all $x \in \Omega$ and $x \in \Gamma$, respectively. From this, we immediately deduce weak minimum principles, for instance,

(4.65)
$$\min_{v \in [v_a(x), v_b(x)]} \big\{ D_v H^\Omega\big(x,\bar y(x),\bar v(x),1,p(x)\big)\,v \big\} = D_v H^\Omega\big(x,\bar y(x),\bar v(x),1,p(x)\big)\,\bar v(x) \quad \text{for a.e. } x \in \Omega.$$

Hence, the minimum in (4.65) is almost everywhere attained at $\bar v(x)$.

4.8.2. The maximum principle. If the Hamiltonian $H^\Omega$ is convex with respect to $v$, then the weak minimum condition (4.65) is equivalent to the minimum condition

(4.66)
$$H^\Omega\big(x,\bar y(x),\bar v(x),1,p(x)\big) = \min_{v \in [v_a(x), v_b(x)]} H^\Omega\big(x,\bar y(x),v,1,p(x)\big)$$

for almost every $x \in \Omega$. In order to derive the corresponding maximum formulation, we multiply the Hamiltonian by $-1$ and introduce the negative adjoint state $q := -p$. In view of (4.64), $q$ solves the adjoint equation

(4.67)
$$-\Delta q = D_y H^\Omega(x,\bar y,\bar v,-1,q) \quad \text{in } \Omega,$$
$$\partial_\nu q = D_y H^\Gamma(x,\bar y,\bar u,-1,q) \quad \text{on } \Gamma.$$

The minimum condition (4.66) then becomes the maximum condition

(4.68)
$$H^\Omega\big(x,\bar y(x),\bar v(x),-1,q(x)\big) = \max_{v \in [v_a(x), v_b(x)]} H^\Omega\big(x,\bar y(x),v,-1,q(x)\big)$$

for almost every $x \in \Omega$. An analogous result holds for the boundary control $u$.

Quite unexpectedly, the maximum condition (4.68) remains valid under natural assumptions without the convexity postulate used in the above argument. Then Pontryagin's maximum principle holds:

Definition. The controls $\bar u \in U_{ad}$ and $\bar v \in V_{ad}$ obey Pontryagin's maximum principle if with $q_0 = -1$ and the adjoint state $q$ defined by (4.67) the following maximum conditions hold for almost every $x \in \Omega$ and $x \in \Gamma$, respectively:

(4.69)
$$H^\Omega\big(x,\bar y(x),\bar v(x),q_0,q(x)\big) = \max_{v \in [v_a(x), v_b(x)]} H^\Omega\big(x,\bar y(x),v,q_0,q(x)\big),$$
$$H^\Gamma\big(x,\bar y(x),\bar u(x),q_0,q(x)\big) = \max_{u \in [u_a(x), u_b(x)]} H^\Gamma\big(x,\bar y(x),u,q_0,q(x)\big).$$

Global solutions to elliptic optimal control problems must obey Pontryagin's maximum principle if certain natural conditions are postulated. In comparison with the weak minimum conditions, the maximum principle has the advantage that no partial derivatives with respect to the controls are needed. If need be, with its help those among several solutions to (4.65) can be identified that are not minimizers. Also, functionals that cannot be differentiated with respect to the control can be handled. All the problems involving linear-quadratic controls in this chapter satisfy the requirements for the maximum principle to be valid. Therefore, in all of these cases the optimal controls must obey Pontryagin's maximum principle.

In the case of control problems involving partial differential equations, the maximum principle has so far been mostly of theoretical interest. Since numerical methods require, as a rule, derivatives with respect to the control, weak minimum conditions in the form of variational inequalities usually suffice.

4.9. Second-order derivatives

If $F$ is a Fréchet differentiable mapping from an open set $\mathcal{U} \subset U$ into $V$, then $u \mapsto F'(u)$ defines an operator-valued mapping from $\mathcal{U}$ into $Z = \mathcal{L}(U,V)$. The question arises as to when this mapping is differentiable.

Definition. Let $F : \mathcal{U} \subset U \to V$ be a Fréchet differentiable mapping. If the mapping $u \mapsto F'(u)$ is Fréchet differentiable at $u \in \mathcal{U}$, then $F$ is said to be twice Fréchet differentiable at $u$. For the second derivative we write $F''(u) := (F')'(u)$.

By definition, $F''(u)$ is a continuous linear operator from $U$ into $Z = \mathcal{L}(U,V)$, that is, $F''(u) \in \mathcal{L}(U,\mathcal{L}(U,V))$. This is a rather abstract mathematical object. Fortunately, we do not need to know the operator $F''$ itself, only how it is to be evaluated at given points. For any fixed direction $u_1 \in U$, the object $F''(u)u_1$ is already somewhat simpler: it is just a linear operator in $\mathcal{L}(U,V)$. Applying this linear operator to another direction $u_2 \in U$, we obtain an element $(F''(u)u_1)u_2$ of $V$. We will not have to deal with the abstract quantity $F''(u)$, but with $F''(u)u_1$ or $(F''(u)u_1)u_2$ instead. We use the following notation:

$$F''(u)[u_1,u_2] := (F''(u)u_1)\,u_2, \qquad F''(u)\,v^2 := F''(u)[v,v].$$

With respect to $u_1$ and $u_2$, $F''(u)[u_1,u_2]$ is a symmetric and continuous bilinear form; see Cartan [Car67]. Taylor's theorem shows that for twice Fréchet differentiable mappings $F : U \to V$ we have the representation (cf. [Car67])

$$F(u+h) = F(u) + F'(u)\,h + \tfrac12 F''(u)\,h^2 + r_2^F(u,h),$$

where the second-order remainder $r_2^F$ satisfies

$$\frac{\|r_2^F(u,h)\|_V}{\|h\|_U^2} \to 0 \quad \text{as } \|h\|_U \to 0.$$

We call the mapping $F$ twice continuously Fréchet differentiable if the mapping $u \mapsto F''(u)$ is continuous, that is, if

$$\|F''(u) - F''(\bar u)\|_{\mathcal{L}(U,\mathcal{L}(U,V))} \to 0 \quad \text{whenever } \|u - \bar u\|_U \to 0.$$

Next, our interest turns to the question of how these norms can be calculated or at least estimated. For example, this can be done using the corresponding bilinear form. In fact, we have, by definition,

$$\|F''(u)\|_{\mathcal{L}(U,\mathcal{L}(U,V))} = \sup_{\|u_1\|_U = 1} \|F''(u)u_1\|_{\mathcal{L}(U,V)} = \sup_{\|u_1\|_U = 1} \Big( \sup_{\|u_2\|_U = 1} \|(F''(u)u_1)u_2\|_V \Big),$$

and so, with the notation introduced above,

(4.70)
$$\|F''(u)\|_{\mathcal{L}(U,\mathcal{L}(U,V))} = \sup_{\|u_1\|_U = 1,\ \|u_2\|_U = 1} \|F''(u)[u_1,u_2]\|_V.$$

The equivalence between $F''(u)$ and the bilinear form $F''(u)[\cdot,\cdot]$ is discussed in, e.g., [Car67], [KA64], and [Zei86].

Calculation of $F''(u)$. In place of $F''(u)$, we work with the associated bilinear form. To this end, we first determine for fixed but arbitrary $u_1 \in U$ the directional derivative $F'(u)u_1$. Then we put $\tilde F(u) := F'(u)u_1$. Clearly, $\tilde F$ maps $U$ into $V$. For its directional derivative $\tilde F'(u)u_2$ in the direction $u_2$,

we easily find that

$$\tilde F'(u)u_2 = \frac{d}{dt}\tilde F(u + t u_2)\Big|_{t=0} = \frac{d}{dt}\big(F'(u + t u_2)u_1\big)\Big|_{t=0} = \big(F''(u + t u_2)u_1\big)u_2\Big|_{t=0} = (F''(u)u_1)\,u_2 = F''(u)[u_1,u_2].$$

Example. We carry out this procedure (only formally at first) for the Nemytskii operator

$$\Phi(y) = \varphi(\cdot, y(\cdot))$$

in the space $Y = L^\infty(E)$, tacitly assuming that the derivative $\Phi''$ exists. This will be a consequence of Theorem 4.22 below, whose assumptions we suppose are satisfied. By Lemma 4.12 on page 202, we have

$$(\Phi'(y)y_1)(x) = \varphi_y(x, y(x))\,y_1(x).$$

We now put

$$\tilde\varphi(x,y) := \varphi_y(x,y)\,y_1(x)$$

and define a new Nemytskii operator by

$$\tilde\Phi(y) = \tilde\varphi(\cdot, y(\cdot)).$$

Since $\tilde\varphi$ satisfies the assumptions of Lemma 4.12, $\tilde\Phi$ is differentiable. The directional derivative in the direction $y_2$ is given by

$$(\tilde\Phi'(y)y_2)(x) = \tilde\varphi_y(x, y(x))\,y_2(x) = \varphi_{yy}(x, y(x))\,y_1(x)\,y_2(x),$$

so, summarizing, we have

$$(\Phi''(y)[y_1,y_2])(x) = \varphi_{yy}(x, y(x))\,y_1(x)\,y_2(x). \qquad \diamond$$

As the subsequent theorem will show, we have indeed calculated a second-order Fréchet derivative. In view of the representation just derived, the abstract derivative $\Phi''(y)$ can be identified with a simple real-valued function, namely $\varphi_{yy}(x, y(x))$. Moreover, the norms of the two quantities coincide, since

(4.71)
$$\|\Phi''(y)\|_{\mathcal{L}(L^\infty(E),\mathcal{L}(L^\infty(E)))} = \sup_{\|y_1\|_{L^\infty(E)} = \|y_2\|_{L^\infty(E)} = 1} \|\Phi''(y)[y_1,y_2]\|_{L^\infty(E)} = \sup_{\|y_1\|_{L^\infty(E)} = \|y_2\|_{L^\infty(E)} = 1} \|\varphi_{yy}(\cdot,y)\,y_1\,y_2\|_{L^\infty(E)} = \|\varphi_{yy}(\cdot,y)\|_{L^\infty(E)}.$$

Theorem 4.22. Suppose that the function $\varphi = \varphi(x,y) : E \times \mathbb{R} \to \mathbb{R}$ is measurable with respect to $x \in E$ for all $y \in \mathbb{R}$ and twice differentiable with respect to $y$ for almost every $x \in E$. Let $\varphi$ satisfy the boundedness and local Lipschitz conditions of order $k = 2$ from (4.24)-(4.25) on page 199. Then the Nemytskii operator $\Phi$ generated by $\varphi$ is twice continuously Fréchet differentiable in $L^\infty(E)$, and the second derivative can be evaluated through

$$(\Phi''(y)[y_1,y_2])(x) = \varphi_{yy}(x, y(x))\,y_1(x)\,y_2(x).$$

Proof: (i) We have to show that the representation of $\Phi''$ asserted in the theorem in fact represents the derivative of $\Phi'$, that is,

(4.72)
$$\frac{\|\Phi'(y+h) - \Phi'(y) - \Phi''(y)\,h\|_{\mathcal{L}(L^\infty(E))}}{\|h\|_{L^\infty(E)}} \to 0 \quad \text{as } \|h\|_{L^\infty(E)} \to 0.$$

Moreover, it must be proved that $\Phi''(y)\,h$ belongs to $\mathcal{L}(L^\infty(E))$ and that the linear mapping $h \mapsto \Phi''(y)\,h$ is bounded. Finally, we need to verify that the mapping $y \mapsto \Phi''(y)$ is in fact continuous. Right from the beginning we shall use the notation $\Phi''$, even though this will be justified only later.

For the proof, let $y \in L^\infty(E)$ and $h \in L^\infty(E)$ be given. Evidently, the linear operator $A = \Phi''(y)\,h$ has to be defined as the multiplication operator $k \mapsto Ak$,

$$(Ak)(x) = \big(\varphi_{yy}(x, y(x))\,h(x)\big)\,k(x).$$

The operator $A$ corresponds to the given functions $y$ and $h$, is generated by the bounded and measurable function $\varphi_{yy}(x, y(x))\,h(x)$, and is obviously a continuous linear operator in $L^\infty(E)$. For a better overview, we list the correspondences between the abstract operators and the associated generating functions:

$$\Phi''(y) \ \sim\ \varphi_{yy}(x, y(x)) \ \in\ \mathcal{L}(L^\infty(E),\mathcal{L}(L^\infty(E))),$$
$$\Phi''(y)\,h \ \sim\ \varphi_{yy}(x, y(x))\,h(x) \ \sim\ A \ \in\ \mathcal{L}(L^\infty(E)),$$
$$\Phi''(y)[h,k] \ \sim\ \varphi_{yy}(x, y(x))\,h(x)\,k(x) \ \sim\ Ak \ \in\ L^\infty(E).$$

(ii) The mapping $\Phi''(y) : h \mapsto \Phi''(y)\,h =: A$ defines a continuous linear operator from $L^\infty(E)$ into $\mathcal{L}(L^\infty(E))$: indeed, the linearity is evident, and the boundedness follows from (4.71), since $\varphi_{yy}(\cdot, y(\cdot))$ is bounded and measurable so that the last norm in (4.71) is finite.

(iii) Proof of (4.72): for the numerator we obviously have the estimate

$$\|\Phi'(y+h) - \Phi'(y) - \Phi''(y)\,h\|_{\mathcal{L}(L^\infty(E))} = \sup_{\|k\|_{L^\infty(E)} = 1} \big\|\big(\varphi_y(\cdot,y+h) - \varphi_y(\cdot,y) - \varphi_{yy}(\cdot,y)\,h\big)\,k\big\|_{L^\infty(E)} = \|\varphi_y(\cdot,y+h) - \varphi_y(\cdot,y) - \varphi_{yy}(\cdot,y)\,h\|_{L^\infty(E)}.$$

The function $\tilde\varphi(x,y) := \varphi_y(x,y)$ generates a Fréchet differentiable Nemytskii operator whose derivative is generated by $\tilde\varphi_y(x,y) = \varphi_{yy}(x,y)$. Therefore,

$$\frac{\|\varphi_y(\cdot,y+h) - \varphi_y(\cdot,y) - \varphi_{yy}(\cdot,y)\,h\|_{L^\infty(E)}}{\|h\|_{L^\infty(E)}} \to 0 \quad \text{as } \|h\|_{L^\infty(E)} \to 0.$$

Dividing the above chain of inequalities by $\|h\|_{L^\infty(E)}$, we thus conclude that (4.72) is valid.

(iv) Continuity of the mapping $y \mapsto \Phi''(y)$: let $y_1, y_2 \in L^\infty(E)$ be given. As in the previous conclusions, we obtain

$$\|\Phi''(y_1) - \Phi''(y_2)\|_{\mathcal{L}(L^\infty(E),\mathcal{L}(L^\infty(E)))} = \|\varphi_{yy}(\cdot,y_1) - \varphi_{yy}(\cdot,y_2)\|_{L^\infty(E)} \le L(M)\,\|y_1 - y_2\|_{L^\infty(E)},$$

where $\max\{\|y_1\|_{L^\infty(E)}, \|y_2\|_{L^\infty(E)}\} \le M$. This even proves local Lipschitz continuity. The theorem is thus proved. $\square$

Second-order derivatives in other $L^p$ spaces. The second derivative of $\Phi : L^p(E) \to L^q(E)$ exists for $1 \le 2q < p < \infty$ if $y(\cdot) \mapsto \varphi_{yy}(\cdot, y(\cdot))$ maps $L^p(E)$ into $L^r(E)$, where

(4.73)
$$r = \frac{pq}{p - 2q};$$

see [GKT92]. In this case, we have

$$(\Phi''(y)[h_1,h_2])(x) = \varphi_{yy}(x, y(x))\,h_1(x)\,h_2(x).$$

The relation (4.73) is evident, since for $h_1, h_2 \in L^p(E)$ the product $h = h_1 h_2$ belongs to $L^{p/2}(E)$; in light of formula (4.29) on page 204 for the first directional derivative in the direction $h$, the function $\varphi_{yy}$ must map into $L^r(E)$, where

$$r = \frac{\frac{p}{2}\,q}{\frac{p}{2} - q} = \frac{pq}{p - 2q}.$$

Example. We consider the sine operator $y(\cdot) \mapsto \sin(y(\cdot))$, which maps any space $L^p(E)$ into $L^q(E)$ for $1 \le q \le p \le \infty$. The same holds for the cosine operator $y(\cdot) \mapsto \cos(y(\cdot))$. Sufficient for the differentiability of the sine operator from $L^p(E)$ into $L^q(E)$ is the condition $1 \le q < p$; in fact, in this case $r = pq/(p - q) < \infty$, and it is evident that the cosine operator maps $L^p(E)$ into $L^r(E)$. Similar reasoning shows that for the existence of the second derivative one needs $q < p/2$. $\diamond$

4.10. Second-order optimality conditions

4.10.1. Basic ideas for sufficient optimality conditions. For the convex problems studied in Chapters 2 and 3, any control satisfying the first-order necessary optimality conditions is automatically globally optimal. Indeed, for convex problems the necessary optimality conditions are also sufficient. In the nonconvex case, derivatives of higher order have to be employed to guarantee local optimality.

To begin with, recall that a function $f : \mathbb{R}^n \to \mathbb{R}$ has a local minimum at $\bar u \in \mathbb{R}^n$ if, in addition to the first-order necessary condition $f'(\bar u) = 0$, the Hessian matrix $f''(\bar u)$ is positive definite; that is, there is some $\delta > 0$ such that

$$h^\top f''(\bar u)\,h \ge \delta\,|h|^2 \quad \forall\, h \in \mathbb{R}^n.$$

In infinite-dimensional spaces, the situation is quite similar, except that the theory is more challenging. We begin our analysis with a simple result that in many situations does not work in function spaces, since its assumptions cannot be satisfied in the spaces under investigation. In particular, the condition (4.74) postulated for all $h \in U$ is usually overly restrictive; it needs to be weakened.

Theorem 4.23. Let $U$ be a Banach space, let $C \subset U$ be convex, and suppose that the functional $f : U \to \mathbb{R}$ is twice continuously Fréchet differentiable in an open neighborhood of $\bar u \in C$. Let the control $\bar u$ satisfy the first-order necessary condition

$$f'(\bar u)(u - \bar u) \ge 0 \quad \forall\, u \in C,$$

and assume there is some $\delta > 0$ such that

(4.74)
$$f''(\bar u)[h,h] \ge \delta\,\|h\|_U^2 \quad \forall\, h \in U.$$

Then there are constants $\varepsilon > 0$ and $\sigma > 0$ such that we have the quadratic growth condition

$$f(u) \ge f(\bar u) + \sigma\,\|u - \bar u\|_U^2 \quad \text{for all } u \in C \text{ such that } \|u - \bar u\|_U \le \varepsilon.$$

In particular, $f$ has a local minimum in $C$ at $\bar u$.

Proof: The proof is the same as that in a finite-dimensional space, and we use the abbreviation $f''(\bar u)\,h^2 := f''(\bar u)[h,h]$. Consider the function $F : [0,1] \to \mathbb{R}$,

$F(s) := f(\bar u + s(u - \bar u))$. Then $f(u) = F(1)$ and $f(\bar u) = F(0)$, and the Taylor expansion

(4.75)
$$F(1) = F(0) + F'(0) + \tfrac12 F''(\theta), \qquad \theta \in (0,1),$$

yields

$$f(u) = f(\bar u) + f'(\bar u)(u - \bar u) + \tfrac12 f''\big(\bar u + \theta(u - \bar u)\big)(u - \bar u)^2$$
$$\ge f(\bar u) + \tfrac12 f''\big(\bar u + \theta(u - \bar u)\big)(u - \bar u)^2$$
$$= f(\bar u) + \tfrac12 f''(\bar u)(u - \bar u)^2 + \tfrac12\big[f''\big(\bar u + \theta(u - \bar u)\big) - f''(\bar u)\big](u - \bar u)^2.$$

Setting $h = u - \bar u$ in (4.74), we find that

$$f''(\bar u)(u - \bar u)^2 \ge \delta\,\|u - \bar u\|_U^2.$$

Moreover, the second derivative of $f$ is continuous, and thus there is some $\varepsilon > 0$ such that the last summand in the above chain of inequalities can be bounded in absolute value by $\delta\,\|u - \bar u\|_U^2 / 4$, provided that $\|u - \bar u\|_U \le \varepsilon$. In summary, we obtain for $\|u - \bar u\|_U \le \varepsilon$ that

$$f(u) \ge f(\bar u) + \frac{\delta}{2}\|u - \bar u\|_U^2 - \frac{\delta}{4}\|u - \bar u\|_U^2 \ge f(\bar u) + \frac{\delta}{4}\|u - \bar u\|_U^2,$$

whence the assertion follows with the choice $\sigma := \delta/4$. $\square$

This theorem will be applicable to optimal control problems involving semilinear partial differential equations if the control-to-state operator $G$ is twice continuously differentiable as a mapping from $L^2$ into the state space and if the control appears only linearly or quadratically in the cost functional and only linearly in the differential equation. If $G$ maps from $L^2$ into $C(\bar\Omega)$ or $C(\bar Q)$, then this simple situation can be expected to occur; see Sections 4.10.6 and 5.7.4.

Example. We consider the optimal control problem

$$\min J(y,u) := \frac12\|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\|u\|_{L^2(\Omega)}^2,$$

subject to

$$-\Delta y + e^y = u \quad \text{in } \Omega,$$
$$y|_\Gamma = 0 \quad \text{on } \Gamma,$$

and

$$u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega.$$

We assume that $\dim \Omega = N \le 3$ and that functions $u_a, u_b \in L^\infty(\Omega)$ and $y_\Omega \in L^2(\Omega)$ are prescribed. We have $2 > N/2 = 3/2$ and can use the fact that $A = -\Delta$ is a coercive operator in $H_0^1(\Omega)$. Therefore, the assertion of Theorem 4.10 on page 194 remains valid for this elliptic problem with zero boundary condition of Dirichlet type (see, however, the counterexample on page 193 for Neumann boundary conditions). The mapping $G : u \mapsto y$ is therefore twice continuously Fréchet differentiable from $L^2(\Omega)$ into $C(\bar\Omega)$. Hence, the reduced functional $f : L^2(\Omega) \to \mathbb{R}$,

$$f(u) = \frac12\|G(u) - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\|u\|_{L^2(\Omega)}^2,$$

is also twice continuously differentiable. As in (4.60) on page 221, we define the Lagrangian function $\mathcal{L} : (H_0^1(\Omega) \cap C(\bar\Omega)) \times L^2(\Omega) \times H_0^1(\Omega) \to \mathbb{R}$ by

$$\mathcal{L}(y,u,p) = J(y,u) - \int_\Omega \big(\nabla y\cdot\nabla p + (e^y - u)\,p\big)\,dx.$$

Then $\mathcal{L}$ is also twice continuously differentiable. Now let $\bar u$ be a control that, together with the state $\bar y = G(\bar u)$ and the adjoint state $p$, satisfies the first-order optimality conditions for the above problem.

In Theorem 4.25, we will prove the representation

(4.76)
$$f''(\bar u)[h_1,h_2] = \mathcal{L}''(\bar y,\bar u,p)[(y_1,h_1),(y_2,h_2)],$$

where $y_i$, with $i = 1$ or $2$, denotes the solution to the problem linearized at $\bar y$,

(4.77)
$$-\Delta y + e^{\bar y}\,y = h \quad \text{in } \Omega,$$
$$y|_\Gamma = 0 \quad \text{on } \Gamma,$$

with $h = h_i$. If, in addition, the triple $(\bar y,\bar u,p)$ satisfies with some $\delta > 0$ the second-order sufficient condition

$$\mathcal{L}''(\bar y,\bar u,p)(y,h)^2 \ge \delta\,\|h\|_{L^2(\Omega)}^2$$

for all $h \in L^2(\Omega)$ and associated solutions $y$ to (4.77), then the preceding theorem and (4.76) yield the local optimality of $\bar u$ in the sense of $L^2(\Omega)$. $\diamond$

The condition (4.76) has been postulated for all $h \in L^2(\Omega)$, which is an overly restrictive requirement. In fact we know, for example, that $h(x) = u(x) - \bar u(x) \ge 0$ for almost every $x$ such that $\bar u(x) = u_a(x)$ and, analogously, that $h(x) \le 0$ for almost every $x$ such that $\bar u(x) = u_b(x)$. Such sign conditions restrict the set of admissible directions $h$ for which (4.76) must be postulated. The set of the directions $h$ to be taken into account can be restricted even further by strongly active restrictions. The problem

$$\min_{u \in [-1,1]} -u^2,$$

which will be treated in Section 4.10.5, demonstrates this in a very simple way.

The choice of admissible directions $h$ influences only the strength of the second-order sufficient condition and not its general applicability. More restrictive are regularity aspects of the corresponding partial differential equation, since they limit the applicability of the $L^2$ technique: in fact, while we could admit a three-dimensional domain in the above example, we would have to postulate $N \le 2$ in the case of a boundary control under otherwise unchanged assumptions. For comparable parabolic problems, only distributed controls with $N = 1$ can be handled, while boundary controls cannot be treated in $L^2$.

We will explain the reasons for this difficulty in the next section. Later, we will introduce a two-norm technique to overcome these problems.

4.10.2. The two-norm discrepancy. The following example demonstrates that a careless application of Theorem 4.23 may easily lead to erroneous conclusions in the infinite-dimensional case. Here, no differential equation is prescribed, but the control function $u$ occurs non-quadratically in the cost functional.

(Counter)Example. We reconsider the problem (4.37) discussed earlier,

$$\min_{0 \le u(x) \le 2\pi} f(u) := -\int_0^1 \cos(u(x))\,dx.$$

Since, as a rule, the analysis is easiest in a Hilbert space setting, an obvious choice of control space is $U = L^2(0,1)$. Then we have $C = \{u \in U : 0 \le u(x) \le 2\pi \text{ for a.e. } x \in (0,1)\}$, and $\bar u \equiv 0$ is an obvious global minimizer. Let us check whether the optimality conditions are satisfied by $\bar u$. The first-order necessary condition,

$$f'(\bar u)(u - \bar u) = \int_0^1 \sin(\bar u(x))\,(u(x) - \bar u(x))\,dx = \int_0^1 \sin(0)\,u(x)\,dx = 0,$$

holds trivially. For the second-order sufficient condition, we find after a formal calculation that

$$f''(\bar u)\,u^2 = \int_0^1 \cos(0)\,u^2(x)\,dx = 1 \cdot \int_0^1 u^2(x)\,dx = 1 \cdot \|u\|_{L^2(0,1)}^2$$

for all $u \in L^2(0,1)$. Hence, the condition is satisfied with $\delta = 1$. Theorem 4.23 then implies the existence of some constant $\sigma > 0$ such that $f(u) \ge f(\bar u) + \sigma\,\|u - \bar u\|_{L^2(0,1)}^2$ for every $u \in C$ that is sufficiently close to $\bar u \equiv 0$ with respect to the $L^2$ norm.

This cannot be true. Indeed, for any arbitrarily small $\varepsilon > 0$ the function

$$u_\varepsilon(x) = \begin{cases} 2\pi & \text{if } 0 \le x \le \varepsilon, \\ 0 & \text{if } \varepsilon < x \le 1 \end{cases}$$

also yields the global minimum value $-1$ and, therefore, is globally optimal. Moreover,

$$\|u_\varepsilon - \bar u\|_{L^2(0,1)} = \Big(\int_0^1 |u_\varepsilon(x)|^2\,dx\Big)^{1/2} = \Big(\int_0^\varepsilon (2\pi)^2\,dx\Big)^{1/2} = 2\pi\sqrt{\varepsilon} \to 0 \quad \text{as } \varepsilon \downarrow 0.$$

This contradicts the quadratic growth condition, which requires that, for sufficiently small $\varepsilon > 0$, $u_\varepsilon$ must yield a larger value than $\bar u$. $\diamond$

What is wrong? The mistake is hidden in a nice trap into which we fell by tacitly assuming and taking for granted that the cosine functional $f$ is twice continuously Fréchet differentiable in the space $L^2(0,1)$. Below we will demonstrate that this is not true at the point $\bar u \equiv 0$, while the treatment of the general case will be the subject of Exercise 4.9.

If the Banach space $L^\infty(0,1)$ is chosen for $U$ in place of $L^2(0,1)$, then $f$ is twice continuously Fréchet differentiable. Here, we have a typical example of the well-known two-norm discrepancy: in $L^2(0,1)$, $f$ does not meet the requirement concerning differentiability, while $f''(\bar u)$ is positive definite,

$$f''(\bar u)\,u^2 \ge \delta\,\|u\|_{L^2(0,1)}^2.$$

On the other hand the second derivative of $f$ exists and is continuous in the space $L^\infty(0,1)$, but there cannot exist a $\delta > 0$ such that $f''(\bar u)\,u^2 \ge \delta\,\|u\|_{L^\infty(0,1)}^2$ for every $u \in L^\infty(0,1)$.

Could this be a case in which no second-order sufficient condition can be satisfied? No. Fortunately, there is a way out that was discovered by Ioffe [Iof79]: one has to work with two different norms.

To this end, we examine the remainder of second order in the Taylor expansion of $f$ in the space $L^\infty(0,1)$. Using the known series expansion of cosine at $u(x)$ and the integral form of the remainder for real-valued functions (see Heuser [Heu08], Eq. (168.6)), we obtain that

$$f(u + h) = -\int_0^1 \cos(u(x) + h(x))\,dx$$

$$= \int_0^1 \Big[ -\cos(u(x)) + \sin(u(x))\,h(x) + \int_0^1 (1 - s)\,\cos(u(x) + s\,h(x))\,h^2(x)\,ds \Big]\,dx.$$

On the other hand, by the definition of Fréchet derivatives,

$$f(u + h) = \int_0^1 \Big[ -\cos(u(x)) + \sin(u(x))\,h(x) + \tfrac12 \cos(u(x))\,h^2(x) \Big]\,dx + r_2^f(u,h),$$

where $r_2^f(u,h)$ denotes the second-order remainder of $f$ at $u$ in the direction $h$. Comparing the two representations for $f(u + h)$, we find that

(4.78)
$$r_2^f(u,h) = \int_0^1 \int_0^1 (1 - s)\,\big[\cos(u(x) + s\,h(x)) - \cos(u(x))\big]\,h^2(x)\,ds\,dx.$$

This representation is a special case of the general version of Taylor's theorem with integral remainder (cf. Cartan [Car67], Thm. 5.6.1).

It is advantageous to use the integral form of the remainder instead of the simpler Lagrangian form: in this way, a discussion of the measurability with respect to $x$ of the intermediate points $\theta = \theta(x)$, which arise for instance in the expansion

$$\cos(u(x) + h(x)) = \cos(u(x)) - \sin\big(u(x) + \theta(x)\,h(x)\big)\,h(x),$$

can be avoided.

Example. We insert here the proof that $f$ is not twice Fréchet differentiable with respect to $L^2(0,1)$ at $u \equiv 0$. To this end, we calculate $r_2^f(0,h)$ as on page 200 for

$$h(x) = \begin{cases} 1 & \text{in } [0,\varepsilon], \\ 0 & \text{in } (\varepsilon,1], \end{cases}$$

and take $\varepsilon \downarrow 0$. We get

$$r_2^f(0,h) = \int_0^\varepsilon \int_0^1 (1 - s)\,\big(\cos(0 + s) - \cos(0)\big)\,ds\,dx = \varepsilon \int_0^1 (1 - s)\,(\cos(s) - 1)\,ds = \varepsilon\,c$$

with $c \ne 0$, and thus

$$\frac{r_2^f(0,h)}{\|h\|_{L^2(0,1)}^2} = \varepsilon\,c\,\Big(\int_0^\varepsilon 1^2\,dx\Big)^{-1} = c.$$

This expression does not tend to zero as $\varepsilon \downarrow 0$, while $\|h\|_{L^2(0,1)}$ does. Consequently, the cosine functional cannot be twice Fréchet differentiable at the point $u \equiv 0$. $\diamond$

In order to overcome the two-norm discrepancy, we estimate $r_2^f(u,h)$ as follows:

(4.79)
$$|r_2^f(u,h)| \le \int_0^1 \int_0^1 (1 - s)\,s\,|h(x)|\,h^2(x)\,ds\,dx \le \frac16\,\|h\|_{L^\infty(0,1)} \int_0^1 h^2(x)\,dx = \frac16\,\|h\|_{L^\infty(0,1)}\,\|h\|_{L^2(0,1)}^2.$$

Conclusion. The second-order remainder of the cosine functional satisfies

(4.80)
$$\frac{|r_2^f(u,h)|}{\|h\|_{L^2(0,1)}^2} \to 0 \quad \text{as } \|h\|_{L^\infty(0,1)} \to 0.$$

To verify this, we just have to divide (4.79) by $\|h\|_{L^2(0,1)}^2$. Notice that in the important estimate (4.80) two different norms occur, which is characteristic of the treatment of the two-norm discrepancy.

We are now in a position to conclude the cosine example. We find that at $\bar u \equiv 0$,

$$f(0 + h) = f(0) + f'(0)\,h + \tfrac12 f''(0)\,h^2 + r_2^f(0,h)$$
$$= f(0) + 0 + \tfrac12\|h\|_{L^2(0,1)}^2 + r_2^f(0,h)$$
$$= f(0) + \|h\|_{L^2(0,1)}^2 \Big(\frac12 + \frac{r_2^f(0,h)}{\|h\|_{L^2(0,1)}^2}\Big)$$
$$\ge f(0) + \|h\|_{L^2(0,1)}^2 \Big(\frac12 - \frac16\,\|h\|_{L^\infty(0,1)}\Big)$$
$$\ge f(0) + \frac13\,\|h\|_{L^2(0,1)}^2,$$

whenever $\|h\|_{L^\infty(0,1)} \le \varepsilon = 1$. Thus, a quadratic growth condition with respect to the $L^2$ norm is satisfied in a sufficiently small $L^\infty$ neighborhood of $\bar u \equiv 0$, whence we can conclude that $\bar u$ is a locally optimal solution in the sense of $L^\infty$. Of course, this is nothing new, since we knew already that $\bar u$ is in fact even globally optimal.

In the following, the method just illustrated will be applied to optimal control problems, leading to results of the kind described above.
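The behavior of the second-order remainder of the cosine functional at $\bar u \equiv 0$ can be reproduced with a few lines of Python. The sketch below evaluates $r_2^f(0,h) = f(h) - f(0) - f'(0)h - \tfrac12 f''(0)h^2$ in closed form for step functions $h$ that equal a constant on $[0,\ell]$ and vanish elsewhere; the function names and the chosen heights and lengths are illustrative assumptions, not part of the text.

```python
import math


def f_step(height, length):
    """f(h) = -int_0^1 cos(h(x)) dx for h = height on [0, length], 0 elsewhere."""
    return -(length * math.cos(height) + (1.0 - length))


def l2_norm_sq(height, length):
    """Squared L2(0,1) norm of the step function."""
    return height**2 * length


def remainder_ratio(height, length):
    """r_2^f(0, h) / ||h||_{L2}^2, using f(0) = -1, f'(0) = 0, f''(0) h^2 = ||h||_{L2}^2."""
    r2 = f_step(height, length) - (-1.0) - 0.5 * l2_norm_sq(height, length)
    return r2 / l2_norm_sq(height, length)


# Thin steps: height 1, shrinking support. The L2 norm tends to zero,
# but the ratio stays at the constant c = 1/2 - cos(1).
thin = [remainder_ratio(1.0, eps) for eps in (1e-1, 1e-2, 1e-3)]

# Flat steps: full support, shrinking height. The L-infinity norm tends
# to zero, and so does the ratio, in agreement with (4.80).
flat = [remainder_ratio(a, 1.0) for a in (1e-1, 1e-2, 1e-3)]

# The competitor u_eps = 2*pi on [0, eps] attains the minimum value -1
# although its L2 distance 2*pi*sqrt(eps) to u = 0 is small.
competitor_value = f_step(2.0 * math.pi, 1e-4)
competitor_distance = math.sqrt(l2_norm_sq(2.0 * math.pi, 1e-4))

if __name__ == "__main__":
    print("thin steps:", thin)
    print("flat steps:", flat)
    print("f(u_eps) =", competitor_value, " L2 distance =", competitor_distance)
```

In this sketch the thin steps reproduce the constant $c = \tfrac12 - \cos(1) \ne 0$ from the example, while the flat steps respect the bound $\tfrac16\|h\|_{L^\infty(0,1)}$ from (4.79).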

4.10.3. Distributed control. In this section, we investigate second-order conditions for the problem (4.31)-(4.33):

$$\min J(y,u) := \int_\Omega \varphi(x, y(x))\,dx + \int_\Omega \psi(x, u(x))\,dx,$$

subject to

$$-\Delta y + d(x,y) = u \quad \text{in } \Omega,$$
$$\partial_\nu y = 0 \quad \text{on } \Gamma,$$

and

$$u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega.$$

We assume that the control $\bar u \in U_{ad}$ satisfies, together with the associated state $\bar y = G(\bar u)$ and the adjoint state $p$, the first-order necessary optimality conditions (4.45)-(4.47) on pages 216 and 217. Then the pair $(\bar y, \bar u)$ need not be optimal, but is our candidate for which local optimality is to be shown. Since we work in the control space $L^\infty(\Omega)$, no restrictions on $\dim \Omega = N$ are needed.

It should be emphasized that the validity of the second-order sufficient optimality conditions can only be verified once $(\bar y, \bar u)$ is known. The situation is much the same as in the minimization of real-valued functions of several real variables: after the (usually numerical) determination of a solution to the optimality system, we have to check that the sufficient optimality condition is fulfilled, which then guarantees local optimality. Observe, however, that the numerical method will usually exhibit reasonable convergence behavior only if it starts in the vicinity of a critical point satisfying the sufficient conditions. One can only hope to find a starting point in such a neighborhood.

Remark. It is a difficult task to verify a second-order sufficient condition in a function space numerically. In fact, one has to use positive definiteness obtained for a finite-dimensional approximation to the infinite-dimensional case. There is, however, a numerical method due to Rösch and Wachsmuth [RW08] that can be applied to certain classes of problems.

Second-order derivatives.

The second derivative of the control-to-state mapping. As before, we denote by $G$ the control-to-state mapping $u \mapsto y$ for the above elliptic boundary value problem. We consider $G$ as a mapping between $L^\infty(\Omega)$ and $H^1(\Omega) \cap C(\bar\Omega)$. First, we show existence and continuity of the second-order Fréchet derivative of $G$.

Theorem 4.24. Suppose that Assumption 4.14 on page 206 holds. Then the operator $G : L^\infty(\Omega) \to H^1(\Omega) \cap C(\bar\Omega)$ is twice continuously Fréchet differentiable. The second derivative $G''(u)$ is given by

$$G''(u)[u_1,u_2] = z,$$

where $z$ is the unique weak solution to the elliptic boundary value problem

(4.81)
$$-\Delta z + d_y(x,y)\,z = -d_{yy}(x,y)\,y_1\,y_2,$$
$$\partial_\nu z = 0,$$

where $y = G(u)$ and $y_i = G'(u)\,u_i \in H^1(\Omega)$ for $i = 1, 2$.

Proof: (i) Existence of the second derivative.
By virtue of Theorem 4.17 on page 213, we know that $G$ is Fréchet differentiable. To show the existence of the second derivative, we apply the implicit function theorem. To this end, we transform the elliptic boundary value problem for $y = G(u)$ into a suitable form. For this purpose, let $R : L^\infty(\Omega) \to H^1(\Omega) \cap C(\bar\Omega)$ denote the solution operator of the linear elliptic boundary value problem

$$-\Delta y + y = v \quad \text{in } \Omega,$$
$$\partial_\nu y = 0 \quad \text{on } \Gamma.$$

We regard $R$ as an operator with range in $C(\bar\Omega)$. The equation $y = G(u)$ means that

(4.82)
$$-\Delta y + y = u - d(x,y) + y \quad \text{in } \Omega,$$
$$\partial_\nu y = 0 \quad \text{on } \Gamma.$$

In terms of $R$, this means that $y = R(u - d(\cdot,y) + y)$ or, more precisely,

(4.83)
$$y - R(u - \Phi(y)) = 0,$$

where $\Phi : C(\bar\Omega) \to L^\infty(\Omega)$ denotes the Nemytskii operator generated by $d(\cdot,y) - y$. Obviously, if $y \in C(\bar\Omega)$ solves (4.83), then $y$ automatically lies in the range of $R$ and hence belongs to $H^1(\Omega)$ and is a weak solution to (4.82). Therefore, (4.82) and (4.83) are equivalent. Next, we define the operator $F : C(\bar\Omega) \times L^\infty(\Omega) \to C(\bar\Omega)$,

$$F(y,u) = y - R(u - \Phi(y)).$$

Since $\Phi$ is, by Theorem 4.22 on page 229, twice continuously differentiable and $R$ is linear and continuous, it follows from the chain rule that $F$ is also twice continuously Fréchet differentiable.

Moreover, the derivative $D_yF(y,u)$ is surjective: in fact, the equality

$$D_yF(y,u)\,w = v$$

is equivalent to $w + R\,\Phi'(y)\,w = v$ and, upon putting $\tilde w := w - v$, to $\tilde w = -R\,\Phi'(y)(\tilde w + v)$. A straightforward calculation, using the definition of $R$, shows that the latter equation is equivalent to the boundary value problem
\[
-\Delta \tilde w + d_y(x,y)\,\tilde w = -d_y(x,y)\,v + v, \qquad \partial_\nu \tilde w = 0,
\]
which for every $v \in C(\bar\Omega)$ has a unique solution $\tilde w \in H^1(\Omega) \cap C(\bar\Omega)$.

In summary, all assumptions of the implicit function theorem are satisfied and, therefore, the equation $F(y,u) = 0$ has a unique solution $y = G(u)$ for any $u$ in a suitable open neighborhood of $\bar u$. This is nothing new, since we have shown already that a unique solution $y = G(u)$ exists even for all $u \in U_{ad}$. However, the implicit function theorem also yields that $G$ inherits the smoothness properties of $F$ (see, e.g., [Car67]); therefore, $G$ is twice continuously Fréchet differentiable.

(ii) Calculation of $G''(u)$.

Taking $y = G(u)$ in the definition of $F$, we see that
\[
F(G(u),u) = G(u) - R\,u + R\,\Phi(G(u)) = 0.
\]
Differentiation in the direction $u_1$ yields, by the chain rule,
\[
G'(u)\,u_1 - R\,u_1 + R\,\Phi'(G(u))\,G'(u)\,u_1 = 0
\]
or, upon defining the operator $K : L^\infty(\Omega) \to C(\bar\Omega)$, $K(u) := G'(u)\,u_1$,
\[
K(u) - R\,u_1 + R\,\Phi'(G(u))\,K(u) = 0.
\]
Next, we calculate the directional derivative in the direction $u_2$. Using the product and chain rules, we obtain that
\[
K'(u)\,u_2 + R\,\Phi''(G(u))\,[K(u),\, G'(u)\,u_2] + R\,\Phi'(G(u))\,K'(u)\,u_2 = 0.
\]
Since $K'(u)\,u_2 = G''(u)[u_1,u_2]$ (cf. the scheme for the evaluation of second-order derivatives explained on page 227), this is in turn equivalent to the equation
\[
G''(u)[u_1,u_2] + R\,\Phi''(G(u))\,[G'(u)\,u_1,\, G'(u)\,u_2] + R\,\Phi'(G(u))\,G''(u)[u_1,u_2] = 0.
\]
With $z := G''(u)[u_1,u_2]$ and $y_i := G'(u)\,u_i$, we thus have
\[
z + R\,\bigl(\Phi'(y)\,z + \Phi''(y)[y_1,y_2]\bigr) = 0,
\]
which by the definition of $R$ (and since $\Phi(y) = d(\cdot,y) - y$) means that $z$ is the unique solution to the boundary value problem
\[
-\Delta z + z = -d_y(x,y)\,z + z - d_{yy}(x,y)\,y_1 y_2, \qquad \partial_\nu z = 0.
\]
With this, the proof of the theorem is complete.

The second derivative of the cost functional. Under Assumption 4.14 on page 206, the cost functional $J$ is twice continuously Fréchet differentiable in $Y \times L^\infty(\Omega)$. By virtue of the chain rule, $f : u \mapsto J(G(u),u)$ is then also twice continuously Fréchet differentiable in $L^\infty(\Omega)$. The second derivative can be calculated as in the preceding proof. First, we obtain
\[
f'(u)\,u_1 = D_yJ(G(u),u)\,G'(u)\,u_1 + D_uJ(G(u),u)\,u_1.
\]
Next, we calculate the directional derivative of $\tilde f(u) := f'(u)\,u_1$ in the direction $u_2$. Invoking the product and chain rules, we find that
\[
\begin{aligned}
f''(u)[u_1,u_2] = \tilde f'(u)\,u_2
 ={}& D_y^2J(G(u),u)\,[G'(u)\,u_1,\, G'(u)\,u_2] + D_uD_yJ(G(u),u)\,[G'(u)\,u_1,\, u_2] \\
 &+ D_yJ(G(u),u)\,G''(u)[u_1,u_2] \\
 &+ D_yD_uJ(G(u),u)\,[u_1,\, G'(u)\,u_2] + D_u^2J(G(u),u)\,[u_1,u_2] \\
 ={}& J''(y,u)\,[(y_1,u_1),(y_2,u_2)] + D_yJ(y,u)\,G''(u)[u_1,u_2].
\end{aligned}
\tag{4.84}
\]
To simplify the terms, we again use the abbreviations $z := G''(u)[u_1,u_2]$ and $y_i := G'(u)\,u_i$, $i = 1,2$. With this, we obtain the expression
\[
D_yJ(y,u)\,z = \int_\Omega \varphi_y(x,y(x))\,z(x)\,dx,
\]
which can be easily transformed using the adjoint state $p$. In fact, $p$ is defined as the unique weak solution to the boundary value problem
\[
-\Delta p + d_y(\cdot,y)\,p = \varphi_y(\cdot,y) \ \text{ in } \Omega, \qquad \partial_\nu p = 0 \ \text{ on } \Gamma.
\tag{4.85}
\]
By virtue of Theorem 4.24, $z$ solves the boundary value problem (4.81), whose right-hand side $\tilde u := -d_{yy}(\cdot,y)\,y_1 y_2$ can be regarded as a "control". Specifying the involved quantities appropriately, in particular putting $a_\Omega = \varphi_y$, $\beta_\Omega = 1$, and $v = \tilde u$, we conclude from Lemma 2.31 on page 74 that
\[
D_yJ(y,u)\,z = \int_\Omega p\,\tilde u\,dx = -\int_\Omega p\,d_{yy}(x,y)\,y_1 y_2\,dx.
\tag{4.86}
\]
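The role of the adjoint state in (4.86) has a direct finite-dimensional counterpart, which the following sketch checks numerically. It is an illustration only and not part of the original derivation: the one-dimensional grid, the finite-difference matrix built by the hypothetical helper `neumann_laplacian`, the coefficient vector `d_y` (standing in for $d_y(\cdot,y)$), and the random data `phi_y` and `u_tilde` are all assumptions made for the example. With a matrix $A$ representing $-\Delta + d_y$, the identity $\varphi_y^\top A^{-1}\tilde u = (A^{-\top}\varphi_y)^\top\tilde u$ mirrors $D_yJ(y,u)\,z = \int_\Omega p\,\tilde u\,dx$.

```python
import numpy as np

def neumann_laplacian(n, h):
    """Finite-difference matrix for -d^2/dx^2 on n nodes with
    homogeneous Neumann ends (illustrative discretization)."""
    A = np.zeros((n, n))
    for i in range(n):
        if i > 0:
            A[i, i] += 1.0
            A[i, i - 1] -= 1.0
        if i < n - 1:
            A[i, i] += 1.0
            A[i, i + 1] -= 1.0
    return A / h**2

rng = np.random.default_rng(0)
n = 50
h = 1.0 / (n - 1)

d_y = 1.0 + rng.random(n)            # assumed positive coefficient d_y(x, y(x))
A = neumann_laplacian(n, h) + np.diag(d_y)

phi_y = rng.standard_normal(n)       # stands in for phi_y(x, y(x))
u_tilde = rng.standard_normal(n)     # stands in for -d_yy(x, y) y1 y2

z = np.linalg.solve(A, u_tilde)      # discrete analogue of the equation for z
p = np.linalg.solve(A.T, phi_y)      # discrete analogue of the adjoint equation (4.85)

lhs = h * (phi_y @ z)                # approximates the integral of phi_y * z
rhs = h * (p @ u_tilde)              # approximates the integral of p * u_tilde
print(lhs, rhs)                      # the two values agree up to rounding
```

The adjoint vector `p` does not depend on the directions, so a single adjoint solve serves every pair $(u_1, u_2)$, whereas `z` would have to be recomputed for each pair.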

Using this in (4.84) finally yields
\[
f''(u)[u_1,u_2] = J''(y,u)\,[(y_1,u_1),(y_2,u_2)] - \int_\Omega p\,d_{yy}(x,y)\,y_1 y_2\,dx.
\tag{4.87}
\]
This expression can be further simplified using the Lagrangian function.

Definition. The Lagrangian function associated with the problem (4.31)-(4.33) on page 207 is defined by
\[
\mathcal{L}(y,u,p) = \int_\Omega \bigl(\varphi(x,y) + \psi(x,u) - (d(x,y) - u)\,p\bigr)\,dx - \int_\Omega \nabla y \cdot \nabla p\,dx.
\]
We simplify the notation for the second derivative of $\mathcal{L}$ with respect to $(y,u)$ by setting
\[
\mathcal{L}''(y,u,p)\,[(y_1,u_1),(y_2,u_2)] := D^2_{(y,u)}\mathcal{L}(y,u,p)\,[(y_1,u_1),(y_2,u_2)].
\]
Here, the increment $(y_i,u_i)$ indicates that the derivative is to be understood with respect to $(y,u)$. It follows from (4.87) that
\[
\begin{aligned}
f''(u)[u_1,u_2] &= \int_\Omega \bigl(\varphi_{yy}(x,y)\,y_1 y_2 - p\,d_{yy}(x,y)\,y_1 y_2 + \psi_{uu}(x,u)\,u_1 u_2\bigr)\,dx \\
&= \mathcal{L}''(y,u,p)\,[(y_1,u_1),(y_2,u_2)].
\end{aligned}
\tag{4.88}
\]
Again, the Lagrangian function has proved to be a powerful tool for the calculation of derivatives with respect to $u$. Summarizing, we have shown the following result.

Theorem 4.25. Suppose that Assumption 4.14 holds. Then the reduced functional $f : L^\infty(\Omega) \to \mathbb{R}$,
\[
f(u) = J(y,u) = J(G(u),u),
\]
is twice continuously Fréchet differentiable. The second derivative of $f$ can be expressed in the form
\[
f''(u)[u_1,u_2] = \mathcal{L}''(y,u,p)\,[(y_1,u_1),(y_2,u_2)].
\]
Here, $y$ is the state associated with $u$, $p$ is the corresponding adjoint state, and $y_i = G'(u)\,u_i$, $i = 1,2$, denote the solutions to the linearized problems
\[
-\Delta y_i + d_y(x,y)\,y_i = u_i \ \text{ in } \Omega, \qquad \partial_\nu y_i = 0 \ \text{ on } \Gamma.
\]

An auxiliary result. The treatment of the two-norm discrepancy requires estimates of a somewhat technical nature.

Lemma 4.26. Suppose that Assumption 4.14 holds, and let the functional $f : L^\infty(\Omega) \to \mathbb{R}$ be given by
\[
f(u) = J(y,u) = J(G(u),u).
\]
Then for each $M > 0$ there exists a constant $L(M) > 0$ such that
\[
\bigl| f''(u+h)[u_1,u_2] - f''(u)[u_1,u_2] \bigr| \le L(M)\,\|h\|_{L^\infty(\Omega)}\,\|u_1\|_{L^2(\Omega)}\,\|u_2\|_{L^2(\Omega)}
\tag{4.89}
\]
for all $u, h, u_1, u_2 \in L^\infty(\Omega)$ such that $\max\{\|u\|_{L^\infty(\Omega)}, \|h\|_{L^\infty(\Omega)}\} \le M$.

Proof. (i) Transformation using the Lagrangian.

We put $y = G(u)$ and $y_h = G(u+h)$, and denote the corresponding adjoint states by $p$ and $p_h$. Moreover, let $y_i = G'(u)\,u_i$ and $y_{i,h} = G'(u+h)\,u_i$ for $i = 1,2$. The existence of the second derivative $f''(u)$ has been proved in Theorem 4.25. Invoking relation (4.88), we find that
\[
\begin{aligned}
f''(u+h)[u_1,u_2] - f''(u)[u_1,u_2]
={}& \mathcal{L}''(y_h,u+h,p_h)\,[(y_{1,h},u_1),(y_{2,h},u_2)] - \mathcal{L}''(y,u,p)\,[(y_1,u_1),(y_2,u_2)] \\
={}& \int_\Omega \bigl(\varphi_{yy}(x,y_h)\,y_{1,h}\,y_{2,h} - \varphi_{yy}(x,y)\,y_1 y_2\bigr)\,dx \\
&- \int_\Omega p_h\,d_{yy}(x,y_h)\,y_{1,h}\,y_{2,h}\,dx + \int_\Omega p\,d_{yy}(x,y)\,y_1 y_2\,dx \\
&+ \int_\Omega \bigl(\psi_{uu}(x,u+h) - \psi_{uu}(x,u)\bigr)\,u_1 u_2\,dx.
\end{aligned}
\tag{4.90}
\]

(ii) Estimation of $y_{i,h} - y_i$ and $p_h - p$.

Owing to Theorem 4.16 on page 212, the operator $G$ is Lipschitz continuous from $L^\infty(\Omega)$ into $C(\bar\Omega)$, that is,
\[
\|y_h - y\|_{C(\bar\Omega)} \le C_L\,\|h\|_{L^\infty(\Omega)}.
\tag{4.91}
\]
Hence, with a constant $c(M) > 0$ that depends on $M$,
\[
\|d_y(x,y_h) - d_y(x,y)\|_{L^\infty(\Omega)} \le c(M)\,\|h\|_{L^\infty(\Omega)}.
\tag{4.92}
\]

In the following, $c > 0$ will always denote a generic constant. The derivatives $y_i$ and $y_{i,h}$ solve, respectively, the elliptic equations
\[
\begin{aligned}
-\Delta y_i + d_y(x,y)\,y_i &= u_i, \\
-\Delta y_{i,h} + d_y(x,y_h)\,y_{i,h} &= u_i
\end{aligned}
\tag{4.93}
\]
with Neumann boundary conditions. Since $d$ is increasing with respect to $y$, we deduce from Theorem 4.7 on page 191 the existence of a constant $c > 0$, which is independent of $h$, such that
\[
\|y_i\|_{H^1(\Omega)} \le c\,\|u_i\|_{L^2(\Omega)}, \qquad \|y_{i,h}\|_{H^1(\Omega)} \le c\,\|u_i\|_{L^2(\Omega)}.
\tag{4.94}
\]
The difference $y_{i,h} - y_i$ satisfies the equation
\[
-\Delta (y_{i,h} - y_i) + d_y(x,y)\,(y_{i,h} - y_i) = -\bigl(d_y(x,y_h) - d_y(x,y)\bigr)\,y_{i,h}.
\tag{4.95}
\]
The $L^2$ norm of the right-hand side can be estimated by means of (4.92) and (4.94):
\[
\bigl\|\bigl(d_y(x,y_h) - d_y(x,y)\bigr)\,y_{i,h}\bigr\|_{L^2(\Omega)} \le c\,\|h\|_{L^\infty(\Omega)}\,\|u_i\|_{L^2(\Omega)}.
\tag{4.96}
\]
Using the solution properties of the linear equation (4.95) and recalling that $d_y(x,y) \ge \lambda_d > 0$ on $E_d$, we find that
\[
\|y_i - y_{i,h}\|_{H^1(\Omega)} \le c\,\|h\|_{L^\infty(\Omega)}\,\|u_i\|_{L^2(\Omega)}.
\tag{4.97}
\]
The difference of the adjoint states obeys the equation
\[
-\Delta (p_h - p) + d_y(x,y)\,(p_h - p) = \varphi_y(\cdot,y_h) - \varphi_y(\cdot,y) + \bigl(d_y(x,y) - d_y(x,y_h)\bigr)\,p_h.
\]
Owing to (4.91), the $C(\bar\Omega)$ norms of $y$ and $y_h$ are uniformly bounded. This then also applies to the adjoint states $p_h$, because the boundedness of $y_h$ is inherited by the right-hand side $\varphi_y(\cdot,y_h)$ of the adjoint equation (4.85). Hence, the right-hand side of the above equation can be estimated as follows:
\[
\|\varphi_y(\cdot,y_h) - \varphi_y(\cdot,y)\|_{L^\infty(\Omega)} + \|d_y(x,y) - d_y(x,y_h)\|_{L^\infty(\Omega)}\,\|p_h\|_{L^\infty(\Omega)}
\le c\,\|y_h - y\|_{L^\infty(\Omega)} \le c\,\|h\|_{L^\infty(\Omega)}.
\]
Finally, from the difference of the adjoint equations for $p$ and $p_h$ we obtain that
\[
\|p_h - p\|_{L^\infty(\Omega)} \le c\,\|h\|_{L^\infty(\Omega)}.
\tag{4.98}
\]

(iii) Proof of the assertion. The terms in (4.90) can now be estimated. For instance, we have
\[
\begin{aligned}
&\|\varphi_{yy}(\cdot,y_h)\,y_{1,h}\,y_{2,h} - \varphi_{yy}(\cdot,y)\,y_1 y_2\|_{L^1(\Omega)} \\
&\quad\le \bigl\|\bigl(\varphi_{yy}(\cdot,y_h) - \varphi_{yy}(\cdot,y)\bigr)\,y_1 y_2\bigr\|_{L^1(\Omega)} + \|\varphi_{yy}(\cdot,y_h)\,(y_{1,h}\,y_{2,h} - y_1 y_2)\|_{L^1(\Omega)} \\
&\quad\le \|\varphi_{yy}(\cdot,y_h) - \varphi_{yy}(\cdot,y)\|_{L^\infty(\Omega)}\,\|y_1\|_{L^2(\Omega)}\,\|y_2\|_{L^2(\Omega)} \\
&\qquad + \|\varphi_{yy}(\cdot,y_h)\|_{L^\infty(\Omega)}\,\bigl(\|y_{1,h}\,(y_{2,h} - y_2)\|_{L^1(\Omega)} + \|(y_{1,h} - y_1)\,y_2\|_{L^1(\Omega)}\bigr) \\
&\quad\le c\,\|h\|_{L^\infty(\Omega)}\,\|u_1\|_{L^2(\Omega)}\,\|u_2\|_{L^2(\Omega)} + c\,\|y_{1,h}\|_{L^2(\Omega)}\,\|y_{2,h} - y_2\|_{L^2(\Omega)} + c\,\|y_2\|_{L^2(\Omega)}\,\|y_{1,h} - y_1\|_{L^2(\Omega)} \\
&\quad\le c\,\|h\|_{L^\infty(\Omega)}\,\|u_1\|_{L^2(\Omega)}\,\|u_2\|_{L^2(\Omega)}.
\end{aligned}
\]
The other integrals in (4.90) are estimated similarly, and (4.89) follows. This concludes the proof of the lemma. □

Second-order optimality conditions.

Second-order necessary conditions. For the derivation of necessary conditions, we follow [CT99]. The minimum principle (4.48) on page 217 yields for the solution to problem (4.31)-(4.33) on page 207 the representation
\[
\bar u(x) =
\begin{cases}
u_a(x) & \text{if } p(x) + \psi_u(x,\bar u(x)) > 0, \\
u_b(x) & \text{if } p(x) + \psi_u(x,\bar u(x)) < 0.
\end{cases}
\]
Hence, $\bar u$ is determined on the set of all $x$ such that $|p(x) + \psi_u(x,\bar u(x))| > 0$, and higher-order conditions are only of interest on its complement. Now let
\[
A_0(\bar u) = \{x \in \Omega : |p(x) + \psi_u(x,\bar u(x))| > 0\}.
\tag{4.99}
\]
For any $u \in U_{ad}$, $u(x) - \bar u(x)$ is nonpositive if $\bar u(x) = u_b(x)$ and nonnegative if $\bar u(x) = u_a(x)$. These facts motivate the following notation.

Definition (Critical cone). The set $C_0(\bar u)$ is defined as the set of all $h \in L^\infty(\Omega)$ such that
\[
h(x)
\begin{cases}
= 0 & \text{if } x \in A_0(\bar u), \\
\ge 0 & \text{if } x \notin A_0(\bar u) \text{ and } \bar u(x) = u_a(x), \\
\le 0 & \text{if } x \notin A_0(\bar u) \text{ and } \bar u(x) = u_b(x).
\end{cases}
\]
Hence, $h$ can be chosen freely on the inactive set $\{x \in \Omega : u_a(x) < \bar u(x) < u_b(x)\}$.
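For a control that is known only through its values on a grid, membership in the critical cone can be tested pointwise. The following sketch is an assumption-laden illustration, not the book's procedure: the function name `in_critical_cone`, the nodal arrays, and the tolerance `tol` used to decide whether $p + \psi_u(\cdot,\bar u)$ vanishes are all hypothetical choices.

```python
import numpy as np

def in_critical_cone(h, u_bar, u_a, u_b, switching, tol=1e-12):
    """Check the three conditions defining C_0(u_bar) at the grid nodes.

    switching holds the nodal values of p + psi_u(., u_bar);
    tol decides when that quantity counts as nonzero (an assumption)."""
    in_A0 = np.abs(switching) > tol          # discrete analogue of A_0(u_bar)
    at_lower = np.isclose(u_bar, u_a)        # nodes where u_bar = u_a
    at_upper = np.isclose(u_bar, u_b)        # nodes where u_bar = u_b
    zero_on_A0 = np.all(h[in_A0] == 0.0)
    sign_lower = np.all(h[~in_A0 & at_lower] >= 0.0)
    sign_upper = np.all(h[~in_A0 & at_upper] <= 0.0)
    return bool(zero_on_A0 and sign_lower and sign_upper)

# Hypothetical data on five nodes, consistent with the minimum principle:
# switching > 0 forces u_bar = u_a, switching < 0 forces u_bar = u_b.
u_a = np.full(5, -1.0)
u_b = np.full(5, 1.0)
u_bar = np.array([-1.0, -1.0, 0.0, 1.0, 1.0])
switching = np.array([0.5, 0.0, 0.0, 0.0, -0.3])

h_ok = np.array([0.0, 1.0, -2.0, -1.0, 0.0])       # admissible direction
h_bad_A0 = np.array([0.1, 1.0, 0.0, -1.0, 0.0])    # nonzero on A_0
h_bad_sign = np.array([0.0, -1.0, 0.0, 0.0, 0.0])  # wrong sign at the lower bound

print(in_critical_cone(h_ok, u_bar, u_a, u_b, switching))        # True
print(in_critical_cone(h_bad_A0, u_bar, u_a, u_b, switching))    # False
print(in_critical_cone(h_bad_sign, u_bar, u_a, u_b, switching))  # False
```

At the middle node the control is inactive, so any value of `h` is allowed there, which matches the remark that $h$ can be chosen freely on the inactive set.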

Theorem 4.27. Suppose that Assumption 4.14 on page 206 holds, and let $\bar u$ be a locally optimal control for the problem (4.31)-(4.33) on page 207. Then
\[
f''(\bar u)\,h^2 \ge 0 \quad \forall\, h \in C_0(\bar u).
\tag{4.100}
\]

Proof. Let $h \in C_0(\bar u)$ be arbitrary. Observe that even for very small $t > 0$ we cannot guarantee that $\bar u + t\,h$ belongs to $U_{ad}$. We therefore introduce the sets
\[
I_n = \{x \in \Omega : u_a(x) + 1/n \le \bar u(x) \le u_b(x) - 1/n\}, \quad n \in \mathbb{N},
\]
and consider the functions $h_n := \chi_n h$, where
\[
\chi_n(x) =
\begin{cases}
1 & \text{if } x \in I_n \text{ or } \bigl[\bar u(x) \in \{u_a(x), u_b(x)\} \text{ and } u_b(x) - u_a(x) \ge 1/n\bigr], \\
0 & \text{if } \bar u(x) \in \bigl(u_a(x),\, u_a(x) + 1/n\bigr) \cup \bigl(u_b(x) - 1/n,\, u_b(x)\bigr).
\end{cases}
\]
Hence, $\chi_n(x) = 0$ also if $\bar u(x) \in \{u_a(x), u_b(x)\}$ and $u_b(x) - u_a(x) < 1/n$, and we have $u = \bar u + t\,h_n \in U_{ad}$ for sufficiently small $t > 0$. Therefore,
\[
0 \le J(y,u) - J(\bar y,\bar u) = f(u) - f(\bar u) = f'(\bar u)\,t\,h_n + \tfrac{1}{2}\,f''(\bar u)\,t^2 h_n^2 + r_2^f(\bar u, t\,h_n),
\]
with the second-order remainder $r_2^f$ of $f$. By virtue of the minimum condition (4.48) on page 217, $h$ and also $h_n$ vanish at almost all points where $p + \psi_u(\cdot,\bar u)$ differs from zero. Therefore, $f'(\bar u)\,h_n = (p + \psi_u(\cdot,\bar u),\, h_n)_{L^2(\Omega)} = 0$, and we obtain after division by $t^2$ that
\[
0 \le \tfrac{1}{2}\,f''(\bar u)\,h_n^2 + t^{-2}\,r_2^f(\bar u, t\,h_n).
\tag{4.101}
\]
Passage to the limit as $t \downarrow 0$ shows that $f''(\bar u)\,h_n^2 \ge 0$. Now observe that $h_n(x) \to h(x)$ pointwise almost everywhere as $n \to \infty$, and $h_n(x)^2 \le h(x)^2$ pointwise everywhere for all $n \in \mathbb{N}$. We may therefore employ Lebesgue's dominated convergence theorem to deduce that $h_n \to h$ in $L^2(\Omega)$. Hence, passing to the limit as $n \to \infty$, we can conclude the validity of the assertion from the continuity of the quadratic form $f''(\bar u)\,h^2$ in $L^2(\Omega)$. □

The second-order condition (4.100) has been formulated in terms of the abstract operator $f''$, i.e., not with an explicitly given function. From Theorem 4.25, we obtain the following form, which is more popular in the theory of optimization.

Lemma 4.28. The second-order necessary condition (4.100) is equivalent to
\[
\mathcal{L}''(\bar y,\bar u,p)\,(y,h)^2 \ge 0 \quad \forall\, h \in C_0(\bar u),
\]
where $y = y(h) \in H^1(\Omega)$ is the solution to the associated linearized problem
\[
-\Delta y + d_y(x,\bar y)\,y = h, \qquad \partial_\nu y = 0.
\]

Second-order sufficient conditions. For the formulation of the sufficient conditions, we introduce the following cone:
\[
C(\bar u) = \{u \in L^\infty(\Omega) : u(x) \ge 0 \text{ if } \bar u(x) = u_a(x) \text{ and } u(x) \le 0 \text{ if } \bar u(x) = u_b(x)\}.
\tag{4.102}
\]
The following condition is an example of a second-order sufficient condition.

There exists some $\delta > 0$ such that
\[
f''(\bar u)\,u^2 \ge \delta\,\|u\|^2_{L^2(\Omega)} \quad \forall\, u \in C(\bar u).
\tag{4.103}
\]
By (4.88) on page 242, this is equivalent to the condition
\[
\int_\Omega \bigl\{\bigl(\varphi_{yy}(x,\bar y) - p\,d_{yy}(x,\bar y)\bigr)\,y^2 + \psi_{uu}(x,\bar u)\,u^2\bigr\}\,dx \ge \delta\,\|u\|^2_{L^2(\Omega)}
\tag{4.104}
\]
for all $u \in C(\bar u)$ and $y \in H^1(\Omega)$ such that
\[
-\Delta y + d_y(x,\bar y)\,y = u, \qquad \partial_\nu y = 0.
\]

Remark. The above sufficient condition is too restrictive. In fact, in comparison with $C_0(\bar u)$, the cone $C(\bar u)$ is too large. The gap between $C(\bar u)$ and $C_0(\bar u)$ can be narrowed by invoking strongly active constraints; see Section 4.10.5. However, the above form is frequently used, in particular, as a usual assumption in the convergence analysis of numerical methods.

If the pair $(\bar y,\bar u)$ satisfies both the first-order necessary conditions and the second-order sufficient condition (4.104), then $\bar u$ is a locally optimal control in the sense of $L^\infty(\Omega)$, as the following result shows.

Theorem 4.29. Suppose that Assumption 4.14 on page 206 holds. Let the control $\bar u \in U_{ad}$, together with the associated state $\bar y = G(\bar u)$ and the adjoint state $p$, satisfy the first-order necessary optimality conditions stated in Theorem 4.20 on page 216. If, in addition, $(\bar y,\bar u)$ satisfies the second-order

sufficient condition (4.104), then there exist constants $\varepsilon > 0$ and $\sigma > 0$ such that we have the quadratic growth condition
\[
J(y,u) \ge J(\bar y,\bar u) + \sigma\,\|u - \bar u\|^2_{L^2(\Omega)} \quad \forall\, u \in U_{ad} \text{ with } \|u - \bar u\|_{L^\infty(\Omega)} \le \varepsilon,
\]
where $y = G(u)$. In particular, $\bar u$ is a locally optimal control in the sense of $L^\infty(\Omega)$.

Proof. The proof is almost identical to that for the cosine functional, so we will be brief. We have
\[
J(y,u) = f(u) = f(\bar u) + f'(\bar u)(u - \bar u) + \tfrac{1}{2}\,f''\bigl(\bar u + \theta\,(u - \bar u)\bigr)\,(u - \bar u)^2
\]
with $\theta \in (0,1)$. In view of the first-order necessary condition, the first-order term is nonnegative. Indeed, it follows from the variational inequality in Theorem 4.20 (see also formula (4.43)) that
\[
f'(\bar u)(u - \bar u) = \int_\Omega \bigl(p + \psi_u(\cdot,\bar u)\bigr)\,(u - \bar u)\,dx \ge 0.
\]
Next, note that $u - \bar u \in C(\bar u)$. We estimate the second-order term from below. We obtain
\[
\begin{aligned}
f''\bigl(\bar u + \theta\,(u - \bar u)\bigr)\,(u - \bar u)^2
&= f''(\bar u)\,(u - \bar u)^2 + \bigl[f''\bigl(\bar u + \theta\,(u - \bar u)\bigr) - f''(\bar u)\bigr]\,(u - \bar u)^2 \\
&\ge \delta\,\|u - \bar u\|^2_{L^2(\Omega)} - L\,\|u - \bar u\|_{L^\infty(\Omega)}\,\|u - \bar u\|^2_{L^2(\Omega)} \\
&\ge \frac{\delta}{2}\,\|u - \bar u\|^2_{L^2(\Omega)},
\end{aligned}
\]
provided that $\|u - \bar u\|_{L^\infty(\Omega)} \le \varepsilon$ for some sufficiently small $\varepsilon > 0$. Here, we have used (4.103) and the estimate (4.89) on page 243, as well as the fact that all $u \in U_{ad}$ are bounded in $L^\infty(\Omega)$ by a common constant $M > 0$. In summary, we have
\[
J(y,u) \ge f(\bar u) + \frac{\delta}{4}\,\|u - \bar u\|^2_{L^2(\Omega)} = J(\bar y,\bar u) + \sigma\,\|u - \bar u\|^2_{L^2(\Omega)}
\]
with $\sigma = \delta/4$, provided that $\|u - \bar u\|_{L^\infty(\Omega)} \le \varepsilon$ for a sufficiently small $\varepsilon > 0$. □

In analogy to Lemma 4.28, we rewrite (4.104) in the following more commonly used form:

Lemma 4.30. The second-order sufficient condition (4.104) is equivalent to
\[
\mathcal{L}''(\bar y,\bar u,p)\,(y,u)^2 \ge \delta\,\|u\|^2_{L^2(\Omega)}
\]
for every $u \in C(\bar u)$ and every $y \in H^1(\Omega)$ such that
\[
-\Delta y + d_y(x,\bar y)\,y = u \ \text{ in } \Omega, \qquad \partial_\nu y = 0 \ \text{ on } \Gamma.
\]

4.10.4. Boundary control. In order to elucidate the second-order optimality conditions, we investigate, by way of example, the boundary control problem (4.49)-(4.51):
\[
\min\ J(y,u) := \int_\Omega \varphi(x,y(x))\,dx + \int_\Gamma \psi(x,u(x))\,ds(x),
\]
subject to
\[
-\Delta y = 0 \ \text{ in } \Omega, \qquad \partial_\nu y + b(x,y) = u \ \text{ on } \Gamma
\]
and
\[
u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Gamma.
\]
To this end, we use the associated Lagrangian function
\[
\mathcal{L}(y,u,p) = \int_\Omega \bigl(\varphi(x,y) - \nabla y \cdot \nabla p\bigr)\,dx + \int_\Gamma \bigl(\psi(x,u) - (b(x,y) - u)\,p\bigr)\,ds.
\]
The second-order sufficient condition postulates the existence of some $\delta > 0$ such that
\[
\mathcal{L}''(\bar y,\bar u,p)\,(y,u)^2 \ge \delta\,\|u\|^2_{L^2(\Gamma)}
\tag{4.105}
\]
for all $u \in C(\bar u)$ and all $y \in H^1(\Omega)$ such that
\[
-\Delta y = 0, \qquad \partial_\nu y + b_y(x,\bar y)\,y = u.
\]
Here, the cone $C(\bar u)$ is defined as in (4.102), except that $\Omega$ has to be replaced by $\Gamma$. As an explicit expression for $\mathcal{L}''$, we obtain
\[
\mathcal{L}''(\bar y,\bar u,p)\,(y,u)^2 = \int_\Omega \varphi_{yy}(x,\bar y)\,y^2\,dx - \int_\Gamma b_{yy}(x,\bar y)\,p\,y^2\,ds + \int_\Gamma \psi_{uu}(x,\bar u)\,u^2\,ds.
\]

Theorem 4.31. Suppose that Assumption 4.14 holds, and suppose that the control $\bar u \in U_{ad}$, together with the associated state $\bar y = G(\bar u)$ and the adjoint state $p$, satisfies the first-order necessary optimality condition stated in Theorem 4.21 on page 219. If, in addition, the pair $(\bar y,\bar u)$ satisfies the

second-order sufficient condition (4.105), then there are constants $\varepsilon > 0$ and $\sigma > 0$ such that we have the quadratic growth condition
\[
J(y,u) \ge J(\bar y,\bar u) + \sigma\,\|u - \bar u\|^2_{L^2(\Gamma)}
\]
for every $u \in U_{ad}$ such that $\|u - \bar u\|_{L^\infty(\Gamma)} \le \varepsilon$, where $y = G(u)$. In particular, $\bar u$ is locally optimal in the sense of $L^\infty(\Gamma)$.

The proof is completely analogous to that for the case of distributed controls.

4.10.5. Inclusion of strongly active constraints *. In the previously established results, the gap between necessary and sufficient second-order conditions is too large. In fact, in the necessary condition the nonnegativity of $\mathcal{L}''$ is postulated on the critical cone $C_0(\bar u)$, which is usually smaller than the cone $C(\bar u)$ appearing in the second-order sufficient optimality condition. In $C_0(\bar u)$ the controls vanish on the strongly active set, while in $C(\bar u)$ they only have to obey sign restrictions. Therefore, the second-order sufficient condition is actually an overly restrictive requirement. By including strongly active constraints, this gap can be closed to a certain extent. To this end, one additionally considers first-order sufficient conditions.

Example. The nonconvex function $f : \mathbb{R} \to \mathbb{R}$, $f(u) = -u^2$, has two minimizers $u_1 = -1$ and $u_2 = 1$ in $U_{ad} = [-1,1]$. At both points, the first-order necessary condition $f'(u_i)(u - u_i) \ge 0$ for all $u \in [-1,1]$ holds, and we even have $|f'(u_i)| = 2 > 0$; see the figure.

[Figure: First-order sufficient conditions.]

Second-order conditions of the previously used type cannot be valid here, since $f$ is concave. However, they are not needed, because the $u_i$ satisfy the first-order sufficient optimality conditions, given by $|f'(u_i)| \ne 0$. This already implies local optimality: for instance, for $u_1$ we have, for any $h \in (0,2)$,
\[
f(-1 + h) = f(-1) + f'(-1)\,h + r(h) = f(-1) + 2h - h^2 > -1 = f(-1).
\]
Hence, $u_1 = -1$ is locally (and here even globally) optimal. ◇

Analogous constructions can be applied in function spaces to weaken second-order sufficient optimality conditions.

A simplified example in function space. Suppose that the function $\varphi$ satisfies the conditions stated in Assumption 4.14 on page 206. We consider the minimization problem
\[
\min_{u \in U_{ad}} f(u) := \int_\Omega \varphi(x,u(x))\,dx,
\]
where $U_{ad} = \{u \in L^\infty(\Omega) : u_a(x) \le u(x) \le u_b(x) \text{ for a.e. } x \in \Omega\}$. Let the function $\bar u \in U_{ad}$ obey the first-order necessary condition
\[
\int_\Omega \varphi_u(x,\bar u(x))\,(u(x) - \bar u(x))\,dx \ge 0 \quad \forall\, u \in U_{ad}.
\]
Then, almost everywhere in $\Omega$, we must have
\[
\varphi_u(x,\bar u(x))
\begin{cases}
\ge 0 & \text{if } \bar u(x) = u_a(x), \\
\le 0 & \text{if } \bar u(x) = u_b(x).
\end{cases}
\]

Definition. For arbitrary but fixed $\tau > 0$, let
\[
A_\tau(\bar u) = \{x \in \Omega : |\varphi_u(x,\bar u(x))| > \tau\}.
\]
Then $A_\tau(\bar u)$ is said to be the set of strongly active constraints or, for short, the strongly active set.

In the special case of $\tau = 0$, we obtain the set $A_0(\bar u)$ defined in (4.99). The above definition is due to Dontchev et al. [DHPY95]. In the example under discussion, we suppose the following sufficient second-order optimality condition to be valid: there exist constants $\delta > 0$ and $\tau > 0$ such that
\[
f''(\bar u)\,h^2 \ge \delta\,\|h\|^2_{L^2(\Omega)}
\]
for all $h \in L^\infty(\Omega)$ such that
\[
h(x)
\begin{cases}
= 0 & \text{if } x \in A_\tau, \\
\ge 0 & \text{if } x \notin A_\tau \text{ and } \bar u(x) = u_a(x), \\
\le 0 & \text{if } x \notin A_\tau \text{ and } \bar u(x) = u_b(x).
\end{cases}
\]
We therefore postulate that $\varphi_{uu}(x,\bar u(x)) \ge \delta$ on $\Omega \setminus A_\tau$ and $|\varphi_u(x,\bar u(x))| > \tau$ on $A_\tau$, and hence the positive definiteness of $f''(\bar u)$ is assumed only for a proper subset of the set $C(\bar u)$ defined in (4.102) on page 247. We claim that this postulate, in combination with the first-order necessary condition, is sufficient for the local optimality of $\bar u$. This can be seen as follows.

For $u \in U_{ad}$ sufficiently close to $\bar u$, say, for $\|u - \bar u\|_{L^\infty(\Omega)} \le \varepsilon$, we have the Taylor expansion
\[
f(\bar u + h) - f(\bar u) = f'(\bar u)\,h + \tfrac{1}{2}\,f''(\bar u)\,h^2 + \tfrac{1}{2}\,\bigl(f''(\bar u + \theta h) - f''(\bar u)\bigr)\,h^2
\]

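with $h = u - \bar u$ and a suitable $\theta = \theta(x) \in (0,1)$.

We split $h$ into two parts, $h = h_1 + h_2$, in such a way that $h_2(x) = 0$ on $A_\tau$ and $h_1(x) = 0$ on $\Omega \setminus A_\tau$. The function $h_1$ exploits the first-order sufficient conditions, while for $h_2$ the positive definiteness of $f''$ applies. Obviously, $h_1(x) \ge 0$ whenever $\bar u(x) = u_a$, and $h_1(x) \le 0$ whenever $\bar u(x) = u_b$. With a remainder $r(\bar u,h)$ of second order, we obtain
\[
f(\bar u + h) - f(\bar u) = \int_\Omega \varphi_u(x,\bar u(x))\,h(x)\,dx + \frac{1}{2}\int_\Omega \varphi_{uu}(x,\bar u(x))\,h^2(x)\,dx + r(\bar u,h).
\]
Now note that $\varphi_u(x,\bar u(x))\,h(x) \ge 0$ for almost every $x \in \Omega$. Hence, invoking the fact that $h_1(x)\,h_2(x) = 0$, we obtain, with the abbreviations $\varphi_u(x) = \varphi_u(x,\bar u(x))$ and $\varphi_{uu}(x) = \varphi_{uu}(x,\bar u(x))$, that
\[
\begin{aligned}
f(\bar u + h) - f(\bar u)
&\ge \int_{A_\tau} \varphi_u(x)\,h_1(x)\,dx + \frac{1}{2}\int_\Omega \varphi_{uu}(x)\,\bigl(h_1(x) + h_2(x)\bigr)^2\,dx + r(\bar u,h) \\
&= \int_{A_\tau} \Bigl(|\varphi_u(x)\,h_1(x)| + \frac{1}{2}\,\varphi_{uu}(x)\,h_1(x)^2\Bigr)\,dx + \frac{1}{2}\int_{\Omega \setminus A_\tau} \varphi_{uu}(x)\,h_2(x)^2\,dx + r(\bar u,h) \\
&\ge \int_{A_\tau} \Bigl(\tau\,|h_1(x)| - \frac{1}{2}\,\|\varphi_{uu}\|_{L^\infty(\Omega)}\,|h_1(x)|^2\Bigr)\,dx + \frac{\delta}{2}\int_{\Omega \setminus A_\tau} h_2(x)^2\,dx + r(\bar u,h).
\end{aligned}
\]
For sufficiently small $\varepsilon \in (0,1)$, we have, for almost every $x \in \Omega$,
\[
\|\varphi_{uu}\|_{L^\infty(\Omega)}\,|h_1(x)| \le \varepsilon\,\|\varphi_{uu}\|_{L^\infty(\Omega)} \le \tau \quad \text{and} \quad |h_1(x)| \ge h_1(x)^2.
\]
Hence, for sufficiently small $\varepsilon > 0$ it follows that
\[
\begin{aligned}
f(\bar u + h) - f(\bar u)
&\ge \frac{\tau}{2}\int_{A_\tau} |h_1(x)|\,dx + \frac{\delta}{2}\int_{\Omega \setminus A_\tau} h_2(x)^2\,dx + r(\bar u,h) \\
&\ge \min\Bigl\{\frac{\tau}{2}, \frac{\delta}{2}\Bigr\} \int_\Omega \bigl(h_1(x)^2 + h_2(x)^2\bigr)\,dx + r(\bar u,h) \\
&= \frac{1}{2}\,\min\{\tau,\delta\} \int_\Omega h^2(x)\,dx + r(\bar u,h) \\
&= \|h\|^2_{L^2(\Omega)}\,\Bigl(\frac{1}{2}\,\min\{\tau,\delta\} + \frac{r(\bar u,h)}{\|h\|^2_{L^2(\Omega)}}\Bigr).
\end{aligned}
\]
As in (4.80) on page 237, we have
\[
\frac{|r(\bar u,h)|}{\|h\|^2_{L^2(\Omega)}} \to 0 \quad \text{as } \|h\|_{L^\infty(\Omega)} \to 0,
\]
whence we conclude that
\[
f(\bar u + h) - f(\bar u) \ge \sigma\,\|h\|^2_{L^2(\Omega)}
\]
with $\sigma = \frac{1}{4}\min\{\tau,\delta\}$, provided that $\varepsilon > 0$ is sufficiently small and $\|u - \bar u\|_{L^\infty(\Omega)} \le \varepsilon$. In other words, $\bar u$ is locally optimal, which proves our claim.

Strongly active constraints in elliptic optimal control problems. Obviously, the above method can also be employed to weaken the sufficient optimality conditions in control problems involving partial differential equations. In this connection, we refer the reader to [CTU96] for elliptic problems with control constraints, to [CM02a] for the case where additional constraints in integral form are imposed, and to [CDlRT08] for pointwise state constraints.

We discuss the use of strongly active constraints for the distributed control problem (4.31)-(4.33):
\[
\min\ J(y,u) := \int_\Omega \varphi(x,y(x))\,dx + \int_\Omega \psi(x,u(x))\,dx,
\]
subject to
\[
-\Delta y + d(x,y) = u \ \text{ in } \Omega, \qquad \partial_\nu y = 0 \ \text{ on } \Gamma
\]
and
\[
u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega.
\]
We assume that a control $\bar u \in U_{ad}$ with associated state $\bar y$ is given that satisfies the first-order necessary optimality condition
\[
\int_\Omega \bigl(p(x) + \psi_u(x,\bar u(x))\bigr)\,(u(x) - \bar u(x))\,dx \ge 0 \quad \forall\, u \in U_{ad}.
\tag{4.106}
\]
We define for fixed $\tau > 0$ the strongly active set
\[
A_\tau(\bar u) = \{x \in \Omega : |p(x) + \psi_u(x,\bar u(x))| > \tau\}
\]
and the $\tau$-critical cone $C_\tau(\bar u) = \{u \in L^\infty(\Omega) : u \text{ satisfies (4.107)}\}$, where
\[
u(x)
\begin{cases}
= 0 & \text{if } x \in A_\tau(\bar u), \\
\ge 0 & \text{if } x \notin A_\tau(\bar u) \text{ and } \bar u(x) = u_a, \\
\le 0 & \text{if } x \notin A_\tau(\bar u) \text{ and } \bar u(x) = u_b.
\end{cases}
\tag{4.107}
\]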
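The quadratic growth estimate for the simplified function-space example can be observed numerically on a small test case. The sketch below is constructed purely for illustration and does not come from the text: the integrand `phi`, the bounds $u_a = -1$, $u_b = 1$, the candidate `u_bar`, the midpoint quadrature, and the values of `tau`, `delta` and `eps` are all assumptions. The integrand is deliberately concave in $u$ on the strongly active set, so that a second-order condition on the whole cone $C(\bar u)$ would fail there, while $|\varphi_u(x,\bar u(x))| = 3 > \tau$ on that set and $\varphi_{uu} = 1 \ge \delta$ elsewhere.

```python
import numpy as np

n = 200
x = (np.arange(n) + 0.5) / n      # midpoints of a uniform grid on (0, 1)
w = 1.0 / n                       # midpoint quadrature weight
on_A = x < 0.5                    # strongly active set A_tau in this construction

def f(u):
    """f(u) = integral of phi(x, u(x)), with the assumed integrand
    phi = 2u - u^2/2 on A_tau (concave in u) and phi = u^2/2 elsewhere."""
    phi = np.where(on_A, 2.0 * u - 0.5 * u**2, 0.5 * u**2)
    return w * phi.sum()

# Candidate control: at the lower bound u_a = -1 on A_tau, inactive (u = 0) elsewhere.
u_bar = np.where(on_A, -1.0, 0.0)

tau, delta, eps = 1.0, 1.0, 0.5
sigma = 0.25 * min(tau, delta)    # growth constant from the argument above

rng = np.random.default_rng(1)
growth_ok = True
for _ in range(1000):
    h = eps * rng.uniform(-1.0, 1.0, n)
    h[on_A] = np.abs(h[on_A])     # keep u_bar + h admissible at the lower bound
    gap = f(u_bar + h) - f(u_bar)
    growth_ok = growth_ok and bool(gap >= sigma * w * np.sum(h**2))

print(growth_ok)                  # True for all sampled perturbations
```

In this construction the inequality even holds pointwise, since $3h - \tfrac12 h^2 \ge \tfrac14 h^2$ for $h \in [0, 0.5]$ on the active part and $\tfrac12 h^2 \ge \tfrac14 h^2$ on the rest; the sampling merely confirms it.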

The following condition is called the second-order sufficient condition:
\[
\int_\Omega \bigl\{\bigl(\varphi_{yy}(x,\bar y) - p\,d_{yy}(x,\bar y)\bigr)\,y^2 + \psi_{uu}(x,\bar u)\,u^2\bigr\}\,dx \ge \delta\,\|u\|^2_{L^2(\Omega)}
\tag{4.108}
\]
for all $u \in C_\tau(\bar u)$ and $y \in H^1(\Omega)$ such that
\[
-\Delta y + d_y(x,\bar y)\,y = u, \qquad \partial_\nu y = 0.
\]

Theorem 4.32. Suppose that Assumption 4.14 holds, and let the control $\bar u \in U_{ad}$, together with the associated state $\bar y$ and the adjoint state $p$, satisfy the first-order necessary optimality condition stated in Theorem 4.20 on page 216. If, in addition, the pair $(\bar y,\bar u)$ obeys for some $\tau > 0$ the second-order sufficient condition (4.108), then there exist constants $\varepsilon > 0$ and $\sigma > 0$ such that we have the quadratic growth condition
\[
J(y,u) \ge J(\bar y,\bar u) + \sigma\,\|u - \bar u\|^2_{L^2(\Omega)} \quad \forall\, u \in U_{ad} \text{ with } \|u - \bar u\|_{L^\infty(\Omega)} \le \varepsilon,
\]
where $y = G(u)$. In particular, $\bar u$ is locally optimal in the sense of $L^\infty(\Omega)$.

A proof of this result can be found in [CTU96]. However, one can also argue as in the proof of Theorem 5.17 on page 292 for the parabolic case; therefore, we do not give the proof here.

The issue of the gap between necessary and sufficient second-order conditions is also the subject of the monograph by Bonnans and Shapiro [BS00], where various other results concerning the use of second-order derivatives in optimization theory can be found. We also refer the reader to Casas and Mateos [CM02a] and, in connection with state constraints, to [CT02] and [CDlRT08].

4.10.6. Cases without two-norm discrepancy. So far, we have used two different norms for the derivation of sufficient second-order optimality conditions, namely, the $L^\infty$ norm for the differentiation and the $L^2$ norm for the positive definiteness of $f''(\bar u)$. This is not always necessary. Indeed, the two-norm discrepancy does not play any role if the following three conditions are met: the equation is only linear in the control, the control-to-state mapping $G$ maps $L^2$ continuously into $C(\bar\Omega)$, and the cost functional is linear-quadratic with respect to $u$ in a sense that is yet to be made precise. An example of this type has been studied on page 232. In the cases below, we will be able to work with the $L^2$ norm alone.

Distributed control. Let $\Omega$ be a bounded Lipschitz domain with $\dim \Omega = N \le 3$, and suppose Assumption 4.14 holds. Then, by Theorem 4.16 on page 212, the control-to-state operator $G : u \mapsto y$ of the distributed control problem (4.31)-(4.33) on page 207 maps $L^2(\Omega)$ Lipschitz continuously into $C(\bar\Omega)$. Hence, Theorem 4.24 on page 239 concerning $G''$ remains valid if $L^\infty(\Omega)$ is replaced by $L^2(\Omega)$; in fact, the argument carries over unchanged if we simply replace $L^\infty(\Omega)$ by $L^2(\Omega)$ in the proof (see also the discussion in the next section for $r := 2$). Consequently, under Assumption 4.14, $G$ is twice continuously differentiable as a mapping from $L^2(\Omega)$ into $C(\bar\Omega)$.

It remains to discuss the cost functional. To this end, we assume for simplicity that $\psi$ is of the form
\[
\psi(x,u) = \gamma_1(x)\,u + \gamma_2(x)\,u^2
\tag{4.109}
\]
with functions $\gamma_1, \gamma_2 \in L^\infty(\Omega)$, where $\gamma_2 \ge 0$. Then the functional
\[
\int_\Omega \bigl(\gamma_1(x)\,u(x) + \gamma_2(x)\,u(x)^2\bigr)\,dx
\]
is obviously twice continuously Fréchet differentiable in $L^2(\Omega)$.

Remark. It can be shown that a sufficient second-order condition can only hold if $\gamma_2(x) \ge \delta > 0$ for almost every $x \in \Omega$; see, e.g., [Trö00].

Based on these premises, we can replace the control space $L^\infty(\Omega)$ by $L^2(\Omega)$ in the (strong) second-order sufficient conditions.

Theorem 4.33. Suppose that Assumption 4.14 on page 206 holds for the distributed control problem (4.31)-(4.33), where $\dim \Omega = N \le 3$. Let the control $\bar u \in U_{ad} \subset L^2(\Omega)$, together with the associated state $\bar y = G(\bar u)$ and the adjoint state $p$, satisfy the first-order necessary optimality conditions stated in Theorem 4.20 on page 216. Moreover, let the function $\psi = \psi(x,u)$ be of the form (4.109). If, in addition, the pair $(\bar y,\bar u)$ satisfies the second-order sufficient condition (4.103) on page 247, then there exist constants $\varepsilon > 0$ and $\sigma > 0$ such that we have the quadratic growth condition
\[
J(y,u) \ge J(\bar y,\bar u) + \sigma\,\|u - \bar u\|^2_{L^2(\Omega)}
\]
for every $u \in U_{ad}$ with $\|u - \bar u\|_{L^2(\Omega)} \le \varepsilon$, where $y = G(u)$. In particular, $\bar u$ is a locally optimal control in the sense of $L^2(\Omega)$.

Boundary control. The situation is quite similar for the boundary control problem (4.49)-(4.51) on page 218, provided that $\Omega$ is a bounded two-dimensional Lipschitz domain. Then $G : u \mapsto y$ is twice continuously differentiable from $L^2(\Gamma)$ into $H^1(\Omega) \cap C(\bar\Omega)$. Hence, with a corresponding

modification, Theorem 4.33 remains valid for $\Omega \subset \mathbb{R}^2$. In this case, the local optimality of $\bar u$ in the sense of $L^2(\Gamma)$ results.

4.10.7. Local optimality in $L^r(\Omega)$. The results on local optimality shown so far have a weakness when the two-norm discrepancy is present. Indeed, they only ensure the local optimality of $\bar u$ in the sense of $L^\infty(\Omega)$ or $L^\infty(\Gamma)$. Hence, if $\bar u$ has jump discontinuities, any function sufficiently close to $\bar u$ in the sense of the $L^\infty$ norm has to exhibit the same jump behavior as $\bar u$; in particular, this must be the case in some $L^\infty$ neighborhood of $\bar u$ in which $\bar u$ yields the (locally) smallest value of the cost functional. It would thus be of great benefit if one were able to show the local optimality of $\bar u$ with respect to the $L^r(\Omega)$ norm for some $1 < r < \infty$; then such effects would not matter anymore.

We address this problem, by way of example, for the case of distributed controls. We already know that $G$ is for all $r > N/2$ continuously Fréchet differentiable as a mapping from $L^r(\Omega)$ into $C(\bar\Omega) \cap H^1(\Omega)$. Moreover, the linear solution operator $R : u \mapsto y$ of the boundary value problem
\[
-\Delta y + y = u, \qquad \partial_\nu y = 0,
\]
which was introduced in the proof of Theorem 4.24 on page 239, also defines a continuous mapping from $L^r(\Omega)$ into $C(\bar\Omega) \cap H^1(\Omega)$. Consequently, the equation (4.83),
\[
y - R\,(u - \Phi(y)) = F(y,u) = 0,
\]
with $\Phi(y) = d(\cdot,y) - y$ is well posed in $(C(\bar\Omega) \cap H^1(\Omega)) \times L^r(\Omega)$. We have $F : (C(\bar\Omega) \cap H^1(\Omega)) \times L^r(\Omega) \to C(\bar\Omega)$, and the implicit function theorem is applicable. Therefore, the solution operator $G$ is twice continuously differentiable from $L^r(\Omega)$ into $C(\bar\Omega) \cap H^1(\Omega)$.

As in the last section, we now postulate that
\[
\psi(x,u) = \gamma_1(x)\,u + \gamma_2(x)\,\lambda\,u^2.
\]
A closer look at the steps leading up to Theorem 4.29 reveals that $\|h\|_{L^\infty(\Omega)}$ can always be replaced by $\|h\|_{L^r(\Omega)}$ without losing the validity of the respective estimates. Also, the proof of Lemma 4.26 on page 243 remains correct if the norm $\|u - \bar u\|_{L^\infty(\Omega)}$ there is replaced by $\|u - \bar u\|_{L^r(\Omega)}$. In summary, we have the following $L^r$ version of Theorem 4.29.

Theorem 4.34. Suppose that Assumption 4.14 holds, let $r > N/2$, and let the function $\psi$ be of the form
\[
\psi(x,u) = \gamma_1(x)\,u + \gamma_2(x)\,\lambda\,u^2,
\]
with $\gamma_i \in L^\infty(\Omega)$, $i = 1,2$. Moreover, assume that the control $\bar u \in U_{ad}$, together with the associated state $\bar y = G(\bar u)$ and the adjoint state $p$, satisfies the first-order necessary optimality condition stated in Theorem 4.20 on page 216. If, in addition, the (strong) second-order sufficient condition (4.104) on page 247 is satisfied, then there exist constants $\varepsilon > 0$ and $\sigma > 0$ such that we have the quadratic growth condition
\[
J(y,u) \ge J(\bar y,\bar u) + \sigma\,\|u - \bar u\|^2_{L^2(\Omega)}
\]
for all $u \in U_{ad}$ with $\|u - \bar u\|_{L^r(\Omega)} \le \varepsilon$, where $y = G(u)$. In particular, $\bar u$ is a locally optimal control in the sense of $L^r(\Omega)$.

Remark. The above result does not directly generalize to problems in which either the control occurs nonlinearly in the differential equation or the cost functional does not have the requested quadratic form with respect to $u$. In such cases, additional assumptions have to be imposed; see [CTU96]. Also, the $L^r$ optimality was derived without reference to strongly active constraints. Otherwise, the analysis becomes more delicate; in this regard, we refer to, e.g., [TW06] for the case involving the Navier-Stokes equations.

4.11. Numerical methods

4.11.1. Projected gradient methods. In principle, this method does not differ from that for linear-quadratic parabolic problems, which was described in the section beginning on page 166. However, the nonlinearity of the equation leads to an additional complication in step S1 that renders the method rather unattractive: if the approximation $u_n$ is known, we have to solve the semilinear boundary value problem (4.50) for the associated state $y_n$. Usually, this has to be done by an iterative technique, for instance, by Newton's method. It therefore makes sense to apply a method of Newton type instead of the projected gradient method. One such technique is the SQP method, which will be discussed in detail in the next subsection.

Also, the choice of the step size is rather costly; even in the case without constraints, it can no longer be determined analytically. One has to be content with finding a step size $s_n$ that yields a sufficiently large descent. This can be done either by bisection or by Armijo's rule; see Section 2.12.2.

4.11.2. Basic idea of the SQP method. To motivate the SQP method (Sequential Quadratic Programming method), we first study a problem in the space $\mathbb{R}^n$:
\[
\min\ f(u), \quad u \in C,
\tag{4.110}
\]
where $f \in C^2(\mathbb{R}^n)$ and $C \subset \mathbb{R}^n$ is nonempty, closed, and convex. Initially, we treat the case without constraints, that is, we take $C = \mathbb{R}^n$. Then the

first-order necessary optimality condition for a local solution $\bar u$ to (4.110) reads

$$ f'(\bar u) = 0. \tag{4.111} $$

This equation can be solved via Newton's method, provided that the conditions for its convergence are met. If the iterate $u_n$ is known, the next iterate $u = u_{n+1}$ is obtained as the solution to the system of linear equations

$$ f'(u_n) + f''(u_n)(u - u_n) = 0. \tag{4.112} $$

To guarantee the unique solvability of this system, the matrix $f''(u_n)$ has to be nonsingular, that is, because we look for a local minimum, positive definite. In other words, the validity of the second-order sufficient optimality condition is a natural requirement for Newton's method to converge to a local minimizer.

We now take a different look at Newton's method. It is apparent that equation (4.112) is none other than the first-order necessary optimality condition for the linear-quadratic optimization problem

$$ \min_u \Big\{ f'(u_n)^T (u - u_n) + \tfrac12 (u - u_n)^T f''(u_n)(u - u_n) \Big\}. \tag{4.113} $$

If the Hessian $f''(u_n)$ is positive definite, then the cost functional is strictly convex, so that this minimization problem has a unique solution $u_{n+1}$. In fact, it does not make any difference whether we solve the quadratic optimization problem (4.113) or the system of linear equations (4.112), since they are equivalent. Thus, Newton's method for the solution of the nonlinear system (4.111) can alternatively be performed as the solution of a sequence of quadratic optimization problems (4.113), that is, as an SQP method.

This observation is the key to the treatment of the problem with constraints. Instead of equation (4.111), we then have the variational inequality

$$ f'(\bar u)^T (u - \bar u) \ge 0 \quad \forall u \in C. \tag{4.114} $$

A direct application of the classical Newton method is not possible. However, we can easily add the constraint $u \in C$ to problem (4.113); that is, in order to obtain the next iterate $u_{n+1}$, we solve the minimization problem

$$ \min_{u \in C} \Big\{ f'(u_n)^T (u - u_n) + \tfrac12 (u - u_n)^T f''(u_n)(u - u_n) \Big\}. \tag{4.115} $$

If the Hessian $f''(u_n)$ is positive definite, then (4.115) is uniquely solvable. The solution of the sequence of quadratic optimization problems (4.115) can be interpreted as a Newton method for a generalized equation. Just like the classical Newton method, this method converges locally quadratically. Sufficient conditions for its local convergence to $\bar u$ are, for example, the positive definiteness of $f''(\bar u)$ and the regularity requirement $f \in C^{2,1}$. In this connection, we refer the interested reader to Alt [Alt02], Robinson [Rob80], and Spellucci [Spe93]. Presentations of the analysis of Newton's method for equations in function spaces can be found in Deuflhard [Deu04] and Kantorovich and Akilov [KA64]. Klatte and Kummer [KK02] discuss generalizations to merely Lipschitz continuous functions that are relevant to optimization problems.

Direct application to optimal control problems. The basic idea just described generalizes directly to the case where $\mathbb{R}^n$ is replaced by a Banach space $U$. For instance, let $U = L^\infty(\Gamma)$ and $C = U_{ad}$, and let the functional

$$ f(u) = J(y(u), u) = \int_\Omega \varphi(x, y(x))\,dx + \int_\Gamma \psi(x, u(x))\,ds(x) $$

be given, where $y = y(u)$ denotes the weak solution to the elliptic boundary value problem (4.50),

$$ -\Delta y = 0 \ \text{ in } \Omega, \qquad \partial_\nu y + b(x, y) = u \ \text{ on } \Gamma. $$

The derivatives $f'(u_n)$ and $f''(u_n)$ are determined as in formula (4.56) on page 220 and Theorem 4.25 on page 242, respectively, using the Lagrangian function. Starting from $u_n \in U_{ad}$, we obtain $u = u_{n+1}$ as the solution to the quadratic optimal control problem

$$ \min_{u \in U_{ad}} \Big\{ f'(u_n)(u - u_n) + \tfrac12 f''(u_n)(u - u_n)^2 \Big\}. $$

There is a small but essential difference between this "SQP" method and the familiar SQP method from the literature on nonlinear optimization (this is the reason why the method is sometimes also called Newton's method): once the new iterate $u_{n+1}$ is calculated, the new state $y_{n+1}$ is determined as the solution to a nonlinear elliptic boundary value problem, $y_{n+1} = y(u_{n+1}) = G(u_{n+1})$. The calculation of $y_{n+1}$ could again be done using Newton's method. This additional effort is avoided by using, instead of the rule $y_{n+1} = G(u_{n+1})$, its linearization

$$ y_{n+1} = y_n + G'(u_n)(u_{n+1} - u_n), $$

which amounts to solving just a linear equation. We will investigate this type of SQP method in the next subsection.
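The iteration built on the quadratic subproblems (4.115) can be tried out in $\mathbb{R}^n$ with a box $C = \{u : lo \le u \le hi\}$. The following is only a minimal sketch under assumptions that are not from the text: the test function $f(u) = \tfrac12 u^T A u + \sum_i e^{u_i} - b^T u$, the data `A`, `b`, `lo`, `hi`, and the choice of a projected Gauss-Seidel sweep as the inner solver for the box-constrained quadratic problem are all illustrative.

```python
import math

def grad(u, A, b):
    # f'(u) for the illustrative f(u) = 1/2 u^T A u + sum(exp(u_i)) - b^T u
    n = len(u)
    return [sum(A[i][j] * u[j] for j in range(n)) + math.exp(u[i]) - b[i]
            for i in range(n)]

def hess(u, A):
    # f''(u) = A + diag(exp(u)); positive definite whenever A is
    n = len(u)
    return [[A[i][j] + (math.exp(u[i]) if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def solve_box_qp(g, H, u_n, lo, hi, sweeps=200):
    # Projected Gauss-Seidel for the subproblem (4.115):
    #   min g^T (u - u_n) + 1/2 (u - u_n)^T H (u - u_n)  subject to lo <= u <= hi
    n = len(u_n)
    u = list(u_n)
    for _ in range(sweeps):
        for i in range(n):
            # gradient of the quadratic model in coordinate i, without the diagonal term
            r = g[i] + sum(H[i][j] * (u[j] - u_n[j]) for j in range(n) if j != i)
            u[i] = min(hi[i], max(lo[i], u_n[i] - r / H[i][i]))
    return u

def sqp(u0, A, b, lo, hi, tol=1e-12, max_iter=50):
    # Outer iteration: each step solves one quadratic subproblem around u_n
    u = list(u0)
    history = [u]
    for _ in range(max_iter):
        u_new = solve_box_qp(grad(u, A, b), hess(u, A), u, lo, hi)
        step = max(abs(x - y) for x, y in zip(u_new, u))
        u = u_new
        history.append(u)
        if step < tol:
            break
    return u, history

# Hypothetical data, chosen so that the upper bound on the first component is active
A = [[2.0, 1.0], [1.0, 2.0]]
b = [3.0, 0.0]
lo = [-1.0, -1.0]
hi = [0.5, 1.0]
u_bar, history = sqp([0.0, 0.0], A, b, lo, hi)
print("approximate solution:", u_bar)
print("gradient at solution:", grad(u_bar, A, b))
```

At the computed point the variational inequality (4.114) can be checked componentwise: the gradient component should vanish where the bound is inactive and should be nonpositive where the upper bound is active.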

4.11.3. The SQP method for elliptic problems. Here, we discuss the method for the distributed control problem (4.31)-(4.33):

$$ \min J(y,u) := \int_\Omega \varphi(x, y(x))\,dx + \int_\Omega \psi(x, u(x))\,dx, $$

subject to

$$ \begin{aligned} -\Delta y + d(x,y) &= u && \text{in } \Omega \\ \partial_\nu y &= 0 && \text{on } \Gamma \end{aligned} $$

and

$$ u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega. $$

We postulate that Assumption 4.14 holds. We aim to determine a local reference solution $(\bar y, \bar u)$ that is supposed to satisfy the second-order sufficient optimality conditions (4.104). As before, $p$ denotes the associated adjoint state. The triple $(\bar y, \bar u, p)$ solves the optimality system

$$ \begin{aligned} -\Delta y + d(x,y) &= u, & -\Delta p + d_y(x,y)\,p &= \varphi_y(x,y), \\ \partial_\nu y &= 0, & \partial_\nu p &= 0, \end{aligned} $$

$$ \int_\Omega \big(\psi_u(x,u) + p\big)(v - u)\,dx \ge 0 \quad \forall v \in U_{ad}. $$

As in the motivation of the SQP method, we first consider the unconstrained case $U_{ad} = L^\infty(\Omega)$. Then we have the equation $\psi_u(\cdot,u) + p = 0$ instead of the variational inequality, and hence the optimality system

$$ \begin{aligned} -\Delta y + d(x,y) &= u, & -\Delta p + d_y(x,y)\,p &= \varphi_y(x,y), \\ \partial_\nu y &= 0, & \partial_\nu p &= 0, \end{aligned} $$

$$ \psi_u(x,u) + p = 0. $$

This nonlinear system for the unknowns $(y, u, p)$ can be solved by means of Newton's method; see Deuflhard [Deu04]. To this end, suppose that the iterates $(y_i, u_i, p_i)$, $1 \le i \le n$, have already been determined. Then the new iterate $(y_{n+1}, u_{n+1}, p_{n+1})$ is the solution to the optimality system linearized at $(y_n, u_n, p_n)$.

To find the latter, recall that linearization of a mapping $F$ just means to make the approximation $F(y) \approx F(y_n) + F'(y_n)(y - y_n)$. If we perform this for the first equation, we find the linearized equation

$$ -\Delta y_n - \Delta(y - y_n) + d(x,y_n) + d_y(x,y_n)(y - y_n) - u_n - (u - u_n) = 0, $$

that is,

$$ -\Delta y + d(x,y_n) + d_y(x,y_n)(y - y_n) - u = 0. $$

Hence, the linear terms are preserved. We thus obtain, as optimality system for the determination of $(y, u, p) = (y_{n+1}, u_{n+1}, p_{n+1})$, the equations

$$ \begin{aligned} -\Delta y + d(x,y_n) + d_y(x,y_n)(y - y_n) &= u \\ \partial_\nu y &= 0 \\ -\Delta p + d_y(x,y_n)\,p + p_n\,d_{yy}(x,y_n)(y - y_n) &= \varphi_y(x,y_n) + \varphi_{yy}(x,y_n)(y - y_n) \\ \partial_\nu p &= 0 \\ \psi_u(x,u_n) + \psi_{uu}(x,u_n)(u - u_n) + p &= 0. \end{aligned} \tag{4.116} $$

Obviously, this is just the optimality system for the problem

$$ \begin{aligned} \min \Big\{ & \int_\Omega \big( \varphi_y(x,y_n)(y - y_n) + \psi_u(x,u_n)(u - u_n) \big)\,dx - \frac12 \int_\Omega p_n\, d_{yy}(x,y_n)(y - y_n)^2\,dx \\ & + \frac12 \int_\Omega \big( \varphi_{yy}(x,y_n)(y - y_n)^2 + \psi_{uu}(x,u_n)(u - u_n)^2 \big)\,dx \Big\} \end{aligned} $$

subject to $u \in L^2(\Omega)$ and

$$ \begin{aligned} -\Delta y + d(x,y_n) + d_y(x,y_n)(y - y_n) &= u \\ \partial_\nu y &= 0. \end{aligned} $$

This problem is in turn equivalent to the linear-quadratic problem

$$ \min \Big\{ J'(y_n,u_n)(y - y_n, u - u_n) + \tfrac12 \mathcal{L}''(y_n,u_n,p_n)(y - y_n, u - u_n)^2 \Big\}, $$

subject to

$$ \begin{aligned} -\Delta y + d(x,y_n) + d_y(x,y_n)(y - y_n) &= u \\ \partial_\nu y &= 0. \end{aligned} $$

Thus we may solve this problem instead of the system (4.116).

While (4.116) as a system of equations does not directly generalize to the case with constraints $u \in U_{ad}$, this is evidently true for the above problem:

one only has to add the constraints. With box constraints, we have to solve in the $n$th iteration step the following problem:

$$ \min \Big\{ J'(y_n,u_n)(y - y_n, u - u_n) + \tfrac12 \mathcal{L}''(y_n,u_n,p_n)(y - y_n, u - u_n)^2 \Big\} \tag{QP$_n$} $$

subject to

$$ \begin{aligned} -\Delta y + d(x,y_n) + d_y(x,y_n)(y - y_n) &= u \\ \partial_\nu y &= 0 \end{aligned} $$

and

$$ u_a \le u \le u_b. $$

As a result, we obtain the new control $u_{n+1}$, the new state $y_{n+1}$, and then also the associated adjoint state $p_{n+1}$. The reader will be asked in Exercise 4.10 to determine the boundary value problem to be satisfied by $p_{n+1}$. With this, the algorithm of the SQP method is described. A number of questions arise: Does (QP$_n$) have a solution $(y_{n+1}, u_{n+1})$? If so, is it unique? Does the method converge, and if so, what is the order of convergence?

Theorem 4.35. Suppose that Assumption 4.14 on page 206 holds for the distributed control problem (4.31)-(4.33). Moreover, let the triple $(\bar y, \bar u, \bar p)$ satisfy the necessary optimality conditions and the second-order sufficient optimality condition (4.104) on page 247. Then there is some convergence radius $\varrho > 0$ such that the SQP method, starting from an initial guess $(y_0, u_0, p_0)$ with

$$ \max \big\{ \|y_0 - \bar y\|_{C(\bar\Omega)},\ \|u_0 - \bar u\|_{L^\infty(\Omega)},\ \|p_0 - \bar p\|_{C(\bar\Omega)} \big\} < \varrho, $$

generates a uniquely determined sequence $\{(y_n, u_n, p_n)\}_{n=1}^\infty$ of iterates. Moreover, there is a constant $c_N > 0$ such that

$$ \|(y_{n+1},u_{n+1},p_{n+1}) - (\bar y,\bar u,\bar p)\|_{C(\bar\Omega)\times L^\infty(\Omega)\times C(\bar\Omega)} \le c_N\, \|(y_n,u_n,p_n) - (\bar y,\bar u,\bar p)\|^2_{C(\bar\Omega)\times L^\infty(\Omega)\times C(\bar\Omega)} \quad \forall n \in \mathbb{N}. $$

The SQP method is therefore locally quadratically convergent under the given assumptions. Here, we have assumed the stronger sufficient optimality conditions (4.104), which do not invoke strongly active inequality constraints. This is common practice in convergence proofs for SQP methods. In [Ung97], a proof based on the Newton method for generalized equations can be found. Another, to a large extent analogous, proof for the parabolic case is given in [Trö99].

In the case of boundary controls, the structure and theory of the SQP method is completely analogous to that for distributed controls. The same is true if homogeneous boundary conditions of Dirichlet type are considered instead of Neumann conditions.

Remark. It requires considerable additional effort to develop the basic idea of the SQP method described here into reliable software tools. For instance, techniques for globalization have to be incorporated, and the solution of the quadratic subproblems has to be linked to the outer iteration in an efficient way. For details, we refer the reader to the relevant literature. However, the basic scheme described above shows excellent performance for the test examples given in this book.

4.12. Exercises

4.1 Let $\Omega$ be a bounded Lipschitz domain. Prove the inequality
$$ \|y\|_{L^\infty(\Gamma)} \le \|y\|_{L^\infty(\Omega)} \quad \forall y \in H^1(\Omega) \cap L^\infty(\Omega). $$
Hint: Use the fact that $H^1(\Omega) \cap C(\bar\Omega)$ is dense in $H^1(\Omega)$. Choose a sequence $\{y_n\} \subset H^1(\Omega) \cap C(\bar\Omega)$ such that $\|y_n - y\|_{H^1(\Omega)} \to 0$ as $n \to \infty$, and project this sequence onto the interval $[-c, c]$, where $c = \|y\|_{L^\infty(\Omega)}$. The projection operator is continuous in $H^1(\Omega) \cap C(\bar\Omega)$. Apply Theorem 2.1 on page 29.

4.2 Examine the uniqueness question for the semilinear elliptic boundary value problem (4.5) on page 183. Show that under Assumption 4.2 on page 185 there can be at most one solution $y \in H^1(\Omega) \cap L^\infty(\Omega)$.

4.3 Let $E \subset \mathbb{R}^N$ be a bounded and measurable set. For which spaces $L^q(E)$ is the Nemytskii operator $y(\cdot) \mapsto \sin(y(\cdot))$ Fréchet differentiable from $L^2(E)$ into $L^q(E)$?

4.4 (i) Prove that the Nemytskii operator $y(\cdot) \mapsto \sin(y(\cdot))$ is not Fréchet differentiable in any of the spaces $L^p(0,T)$ with $1 \le p < \infty$. Hint: The requested property of the remainder is already violated for step functions.
(ii) Show that this operator is Fréchet differentiable from $L^{p_1}(0,T)$ into $L^{p_2}(0,T)$ whenever $1 \le p_2 < p_1 \le \infty$.

4.5 Suppose that Assumption 4.14 on page 206 holds. Verify without the use of Lemma 4.11 that the functionals $F$ and $Q$ defined in (4.34), namely
$$ F(y) = \int_\Omega \varphi(x, y(x))\,dx, \qquad Q(u) = \int_\Omega \psi(x, u(x))\,dx, $$
are Lipschitz continuous in their domain of definition. Show also that $Q$ is convex.

4.6 Let $E \subset \mathbb{R}^N$ be bounded and measurable, and let $u_a, u_b \in L^\infty(E)$ be given such that $u_a(x) \le u_b(x)$ for almost every $x \in E$. Show that the set
$$ U_{ad} = \{ u \in L^r(E) : u_a(x) \le u(x) \le u_b(x) \ \text{for a.e. } x \in E \} $$
is nonempty, closed, bounded, and convex in $L^r(E)$ whenever $1 \le r \le \infty$.

4.7 Prove the representation used in (4.40):
$$ \Phi(\tilde y) - \Phi(\bar y) = d(\cdot, \tilde y(\cdot)) - d(\cdot, \bar y(\cdot)) = d_y(\cdot, \bar y(\cdot))\,(\tilde y(\cdot) - \bar y(\cdot)) + r_d, $$
with a remainder $r_d$ that satisfies $\|r_d\|_{C(\bar\Omega)} / \|\tilde y - \bar y\|_{C(\bar\Omega)} \to 0$ as $\|\tilde y - \bar y\|_{C(\bar\Omega)} \to 0$.

4.8 Show that for $\lambda \in (0, 1]$ and $y_\Omega \equiv 9$ the control $\bar u \equiv 2$ satisfies the necessary optimality conditions for the "superconductivity" problem defined on page 217. Are the sufficient second-order optimality conditions satisfied?

4.9 Show that the functional $f$ defined by
$$ f(u) = \int_0^1 \cos(u(x))\,dx $$
is not twice Fréchet differentiable in $L^2(0,1)$. Use the hint given in Exercise 4.4. In which of the spaces $L^p(0,1)$ does a second-order Fréchet derivative exist?

4.10 Derive the adjoint equation solved by the adjoint state $p_{n+1}$ for the linear-quadratic problem (QP$_n$) defined on page 262.


Chapter 5

Optimal control of semilinear parabolic equations

In this chapter we study, in analogy to the elliptic case, optimal control problems
for semilinear parabolic problems. While the existence and regularity
theory for solutions to parabolic problems differs in many aspects from that
for elliptic problems, the optimization theory for parabolic problems is rather
similar. Hence, although we give a rather comprehensive treatment of the
state problems, we do not go into as much detail as in Chapter 4 regarding
the optimization theory. We can afford this, since the corresponding proofs
are more or less the same as in the elliptic case.

5.1. The semilinear parabolic model problem

The state problems to be studied in the subsequent sections are all special
cases of the following general initial-boundary value problem:

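$$ \begin{aligned} y_t + A\,y + d(x,t,y) &= f && \text{in } Q \\ \partial_{\nu_A} y + b(x,t,y) &= g && \text{on } \Sigma \\ y(\cdot, 0) &= y_0 && \text{in } \Omega. \end{aligned} \tag{5.1} $$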

Here, $T > 0$ is a given final time, and we have set $Q := \Omega \times (0,T)$ and
$\Sigma := \Gamma \times (0,T)$. As before, $A$ is the uniformly elliptic differential operator


defined in (2.19) on page 37, and $\partial_{\nu_A}$ denotes the corresponding outward conormal derivative. For clarity, we usually suppress as in (5.1) the variables $x$ and $t$ in the function $y$ and in the given data. The functions $d$ and $b$ are defined as in the elliptic case.

In this section, we present relevant properties of the initial-boundary value problem (5.1), the main result being Theorem 5.5 on the existence and uniqueness of a continuous weak solution. These results, which are due to Casas [Cas97] and Raymond and Zidani [RZ98], form the basis of the corresponding optimal control theory. We generally assume the following:

Assumption 5.1. $\Omega \subset \mathbb{R}^N$, $N \ge 1$, is a bounded Lipschitz domain (for $N = 1$ a bounded open interval). The function $d = d(x,t,y) : Q \times \mathbb{R} \to \mathbb{R}$ is measurable with respect to $(x,t) \in Q$ for any fixed $y \in \mathbb{R}$. Similarly, let $b = b(x,t,y) : \Sigma \times \mathbb{R} \to \mathbb{R}$ satisfy the same condition with $\Sigma$ in place of $Q$. Moreover, $d$ and $b$ are monotone increasing with respect to $y$ for almost every $(x,t) \in Q$ and $(x,t) \in \Sigma$, respectively.

As standard spaces for the treatment of linear parabolic initial-boundary value problems, we have so far used $W_2^{1,0}(Q)$ and $W(0,T)$. Evidently, if $y \in W(0,T)$ or $y \in W_2^{1,0}(Q)$, then the functions $d(x,t,y(x,t))$ or $b(x,t,y(x,t))$ may be unbounded and possibly not integrable, unless further assumptions are imposed. For the proof of the existence of a unique solution to (5.1), we initially make the following additional assumption:

Assumption 5.2. The function $d = d(x,t,y) : Q \times \mathbb{R} \to \mathbb{R}$ is uniformly bounded and globally Lipschitz continuous with respect to $y$ for almost every $(x,t) \in Q$, that is, there are constants $K > 0$ and $L > 0$ such that for almost all $(x,t) \in Q$ and all $y_1, y_2 \in \mathbb{R}$ we have

$$ |d(x,t,0)| \le K, \tag{5.2} $$

$$ |d(x,t,y_1) - d(x,t,y_2)| \le L\,|y_1 - y_2|. \tag{5.3} $$

The function $b = b(x,t,y) : \Sigma \times \mathbb{R} \to \mathbb{R}$ is assumed to satisfy the same condition with $\Sigma$ in place of $Q$.

The weak formulation (3.25) on page 140, valid for linear problems, is extended to the nonlinear case as follows.

Definition. Suppose Assumptions 5.1 and 5.2 hold. A function $y \in W_2^{1,0}(Q)$ is said to be a weak solution to (5.1) if

$$ \begin{aligned} & -\iint_Q y\,v_t\,dx\,dt + \iint_Q \Big( \sum_{i,j=1}^N a_{ij}(x)\,D_i y\,D_j v + d(x,t,y)\,v \Big)\,dx\,dt + \iint_\Sigma b(x,t,y)\,v\,ds\,dt \\ & \qquad = \iint_Q f\,v\,dx\,dt + \iint_\Sigma g\,v\,ds\,dt + \int_\Omega y_0\,v(\cdot,0)\,dx \end{aligned} \tag{5.4} $$

for every $v \in W_2^{1,1}(Q)$ such that $v(x,T) = 0$.

Relation (5.4) is called the weak or variational formulation of the initial-boundary value problem (5.1). The following result is the parabolic analogue of Theorem 4.4 on page 186.

Lemma 5.3. Suppose that Assumptions 5.1 and 5.2 hold. Then for every given triple $f \in L^2(Q)$, $g \in L^2(\Sigma)$, and $y_0 \in L^2(\Omega)$ the initial-boundary value problem (5.1) has a unique weak solution $y \in W_2^{1,0}(Q)$.

The above lemma will be proved in Section 7.3.1, beginning on page 373. However, Assumption 5.2 is much too restrictive and excludes many important applications. For instance, nonlinearities such as $d(y) = y^n$, $n > 1$, fail to satisfy it. For this reason, one works with the more general conditions of local boundedness and Lipschitz continuity.

Assumption 5.4. The function $d = d(x,t,y) : Q \times \mathbb{R} \to \mathbb{R}$ satisfies on $E = Q$ the boundedness condition (5.2) and is for every $(x,t) \in E$ locally Lipschitz continuous with respect to $y$, that is, for any $M > 0$ there is some $L(M) > 0$ such that

$$ |d(x,t,y_1) - d(x,t,y_2)| \le L(M)\,|y_1 - y_2| \quad \forall y_i \in \mathbb{R} \text{ with } |y_i| \le M,\ i = 1, 2. \tag{5.5} $$

The same is assumed to hold for $b = b(x,t,y) : \Sigma \times \mathbb{R} \to \mathbb{R}$ on $E = \Sigma$.

Analogously to Theorem 4.5 on page 189, it can be shown without the strong Assumption 5.2 that, for given data $f$ and $g$ from suitable $L^p$ spaces, there exists a unique solution in $W(0,T) \cap L^\infty(Q)$. For this result to be valid, the boundedness of $d$ and $b$ is not needed, and the weaker Assumption 5.4 suffices. Therefore, the following generalization of the notion of weak solution is meaningful.

Definition. A function $y \in W_2^{1,0}(Q) \cap L^\infty(Q)$ is said to be a weak solution to (5.1) if the variational formulation (5.4) holds for all $v \in W_2^{1,1}(Q)$ satisfying the additional condition $v(x,T) = 0$.

The corresponding existence and uniqueness results in connection with optimal control problems have been proved by Casas [Cas97] and Raymond and Zidani [RZ99].
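To get a feeling for the model problem (5.1) with a nonlinearity such as $d(y) = y^3$, which is monotone and locally Lipschitz but not globally Lipschitz, one can run a simple finite-difference experiment. The sketch below is only an illustration under assumptions not made in the text: one space dimension, $\Omega = (0,1)$, $A = -\partial_{xx}$, $b \equiv 0$, $g \equiv 0$, a constant source `f`, and a linearly implicit Euler scheme in which $y^3$ is replaced by $(y^k)^2\,y^{k+1}$. The function names are hypothetical.

```python
import math

def thomas(lower, diag, upper, rhs):
    # Solve a tridiagonal system; lower[0] and upper[-1] are ignored
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / m if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def solve_semilinear_heat(y0, f, T, n_steps):
    # Linearly implicit Euler for y_t - y_xx + y^3 = f on (0,1),
    # homogeneous Neumann conditions imposed via reflected ghost points
    n = len(y0)
    h = 1.0 / (n - 1)
    dt = T / n_steps
    lower = [-1.0 / h**2] * n
    upper = [-1.0 / h**2] * n
    upper[0] = -2.0 / h**2
    lower[-1] = -2.0 / h**2
    y = list(y0)
    sup_norms = [max(abs(v) for v in y)]
    for _ in range(n_steps):
        # (1/dt + (y^k)^2) y^{k+1} - y_xx^{k+1} = y^k / dt + f
        diag = [1.0 / dt + 2.0 / h**2 + v * v for v in y]
        rhs = [v / dt + f for v in y]
        y = thomas(lower, diag, upper, rhs)
        sup_norms.append(max(abs(v) for v in y))
    return y, sup_norms

# Hypothetical data: continuous initial value, no source term
n = 41
y0 = [2.0 * math.cos(math.pi * i / (n - 1)) for i in range(n)]
y_T, sup_norms = solve_semilinear_heat(y0, 0.0, 0.1, 50)
print("sup norm at t = 0:", sup_norms[0])
print("sup norm at t = T:", sup_norms[-1])
```

Because the system matrix is diagonally dominant with nonpositive off-diagonal entries, the discrete sup norm cannot grow when `f` is zero. This mirrors, in a very simple setting, the boundedness of weak solutions that makes the nonlinear term harmless.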

Theorem 5.5. Suppose that Assumptions 5.1 and 5.4 hold. Then the semilinear parabolic initial-boundary value problem

$$ \begin{aligned} y_t + A\,y + d(x,t,y) &= f && \text{in } Q \\ \partial_{\nu_A} y + b(x,t,y) &= g && \text{on } \Sigma \\ y(0) &= y_0 && \text{in } \Omega \end{aligned} $$

has a unique weak solution $y \in W(0,T) \cap C(\bar Q)$ for any triple $f \in L^r(Q)$, $g \in L^s(\Sigma)$, and $y_0 \in C(\bar\Omega)$ with $r > N/2 + 1$ and $s > N + 1$. Moreover, there is a constant $c_\infty > 0$, which is independent of $d$, $b$, $f$, $g$, and $y_0$, such that

$$ \|y\|_{W(0,T)} + \|y\|_{C(\bar Q)} \le c_\infty \big( \|f - d(\cdot,\cdot,0)\|_{L^r(Q)} + \|g - b(\cdot,\cdot,0)\|_{L^s(\Sigma)} + \|y_0\|_{C(\bar\Omega)} \big). \tag{5.6} $$

The basic idea of the proof is the following. By virtue of Lemma 5.3, one obtains as in the elliptic case a unique solution to the problem with cutoff functions $d_k$ and $b_k$. Using techniques from [LSU68], one then shows that $\|y_k\|_{L^\infty(Q)}$ is bounded from above by a constant that does not depend on $k$. The continuity of the solution is a consequence of Lemma 7.12 on page 378, which is the parabolic analogue of Theorem 4.8 on page 192. To be able to apply this lemma, we bring the bounded and measurable functions $d(x,t,y)$ and $b(x,t,y)$ to the right-hand sides of the differential equation and the boundary condition, respectively; in this way, we obtain data from $L^r(Q)$ and $L^s(\Sigma)$, respectively. Finally, the estimate for $\|y\|_{W(0,T)}$ in (5.6) follows from Lemma 7.10.

Remark. If $y_0 \in L^\infty(\Omega)$ only, then $y \in C(\bar Q)$ can no longer be expected. In this case, one only obtains the regularity $y \in C((0,T] \times \bar\Omega) \cap L^\infty(Q)$; see Raymond and Zidani [RZ99]. In particular, this concerns the regularity of the adjoint states, because in their final condition a function may occur that is merely bounded and measurable. In this case, the norm $\|y\|_{C(\bar Q)}$ in the estimate (5.6) must be replaced by $\|y\|_{L^\infty(Q)}$.

5.2. Basic assumptions for the chapter

For better readability, we now impose a set of assumptions for this chapter that is sufficiently strong for all the following theorems to hold. However, several of the results are valid under much weaker assumptions. It will become evident in the individual theorems which parts of the general assumption are dispensable. Moreover, for the sake of a shorter exposition we confine ourselves to the Laplacian $-\Delta$, although all of the following results remain valid for the general elliptic operator $A$ studied so far.

Besides $d = d(x,t,y)$ and $b = b(x,t,y)$, the following quantities occur: in the cost functionals the functions $\phi = \phi(x,y)$, $\varphi = \varphi(x,t,y,v)$, and $\psi = \psi(x,t,y,u)$, and in the control constraints the threshold functions $u_a$, $u_b$, $v_a$, and $v_b$, which all depend on $(x,t)$. The "function variables" will be $y$, whenever the state $y(x,t)$ is to be inserted, as well as $v$ and $u$ for the controls $v(x,t)$ and $u(x,t)$. Some of these may not come up, but this will not contradict the assumptions to follow.

Again, we use the notation $Q := \Omega \times (0,T)$ and $\Sigma := \Gamma \times (0,T)$, for some fixed final time $T > 0$.

Assumption 5.6.
(i) $\Omega \subset \mathbb{R}^N$ is a bounded Lipschitz domain.
(ii) The functions

$$ \begin{aligned} d &= d(x,t,y) : Q \times \mathbb{R} \to \mathbb{R}, & \phi &= \phi(x,y) : \Omega \times \mathbb{R} \to \mathbb{R}, \\ \varphi &= \varphi(x,t,y,v) : Q \times \mathbb{R}^2 \to \mathbb{R}, & b &= b(x,t,y) : \Sigma \times \mathbb{R} \to \mathbb{R}, \\ \psi &= \psi(x,t,y,u) : \Sigma \times \mathbb{R}^2 \to \mathbb{R} \end{aligned} $$

are measurable with respect to $(x,t)$ for all $y, v, u \in \mathbb{R}$ and, for almost every $(x,t)$ in $Q$ or $\Sigma$, twice differentiable with respect to $y$, $v$, and $u$. Moreover, they satisfy the boundedness and local Lipschitz conditions (4.24)-(4.25) of order $k = 2$; this means that for $\varphi$, for example, there exist some $K > 0$ and a constant $L(M) > 0$ for any $M > 0$ such that we have, with the objects $\nabla\varphi$ and $\varphi''$ explained below,

$$ |\varphi(x,t,0,0)| + |\nabla\varphi(x,t,0,0)| + |\varphi''(x,t,0,0)| \le K, $$

$$ |\varphi''(x,t,y_1,v_1) - \varphi''(x,t,y_2,v_2)| \le L(M)\,\{ |y_1 - y_2| + |v_1 - v_2| \}, $$

for almost every $(x,t) \in Q$ and any $y_i, v_i \in [-M, M]$, $i = 1, 2$.
(iii) We have $d_y(x,t,y) \ge 0$ for almost every $(x,t) \in Q$ and $b_y(x,t,y) \ge 0$ for almost every $(x,t) \in \Sigma$. Moreover, $y_0 \in C(\bar\Omega)$.
(iv) The bounds $u_a, u_b$ and $v_a, v_b : E \to \mathbb{R}$ belong to $L^\infty(E)$ for $E = \Sigma$ and $E = Q$, respectively, and $u_a(x,t) \le u_b(x,t)$ and $v_a(x,t) \le v_b(x,t)$ for almost every $(x,t) \in E$.

Remark. The gradient $\nabla\varphi$ and the Hessian $\varphi''$ in (ii) are defined by

$$ \nabla\varphi = \begin{pmatrix} \varphi_y \\ \varphi_v \end{pmatrix}, \qquad \varphi'' = \begin{pmatrix} \varphi_{yy} & \varphi_{yv} \\ \varphi_{vy} & \varphi_{vv} \end{pmatrix}. $$

For these quantities, $|\cdot|$ represents an arbitrary norm in $\mathbb{R}^2$ or $\mathbb{R}^{2\times 2}$, respectively. Assumption 5.6 is, for instance, satisfied with functions $d, b \in C^3(\mathbb{R})$ that only

depend on the function variable, such as $d(y) = y^k$ where $k \in \mathbb{N}$ is odd or $d(y) = \exp(y)$. A typical example of an admissible $\varphi$ is given by

$$ \varphi(x,t,y,v) = \alpha(x,t)\,(y - y_Q(x,t))^2 + \beta(x,t)\,(v - v_Q(x,t))^2 $$

with $\alpha, \beta, y_Q, v_Q \in L^\infty(Q)$; see also the remarks following Assumption 4.14 on page 206 for the elliptic case.

5.3. Existence of optimal controls

We begin the study of parabolic optimal control problems by showing the existence of optimal controls. In order to be able to treat several cases at the same time, we consider a problem that contains both distributed and boundary controls as well as a cost functional that combines observations on the boundary, within the domain, and at the final time. More precisely, we consider the problem

$$ \begin{aligned} \min J(y,v,u) := {} & \int_\Omega \phi(x, y(x,T))\,dx + \iint_Q \varphi(x,t,y(x,t),v(x,t))\,dx\,dt \\ & + \iint_\Sigma \psi(x,t,y(x,t),u(x,t))\,ds\,dt, \end{aligned} \tag{5.7} $$

subject to

$$ \begin{aligned} y_t - \Delta y + d(x,t,y) &= v && \text{in } Q \\ \partial_\nu y + b(x,t,y) &= u && \text{on } \Sigma \\ y(0) &= y_0 && \text{in } \Omega \end{aligned} \tag{5.8} $$

and

$$ \begin{aligned} v_a(x,t) \le v(x,t) \le v_b(x,t) & \quad \text{for a.e. } (x,t) \in Q \\ u_a(x,t) \le u(x,t) \le u_b(x,t) & \quad \text{for a.e. } (x,t) \in \Sigma. \end{aligned} \tag{5.9} $$

If one of the two controls is not to occur, one can enforce pure boundary control by putting $v_a = v_b = 0$, or pure distributed control by putting $u_a = u_b = 0$. As the sets of admissible controls, we define

$$ \begin{aligned} V_{ad} &= \{ v \in L^\infty(Q) : v_a(x,t) \le v(x,t) \le v_b(x,t) \ \text{for a.e. } (x,t) \in Q \} \\ U_{ad} &= \{ u \in L^\infty(\Sigma) : u_a(x,t) \le u(x,t) \le u_b(x,t) \ \text{for a.e. } (x,t) \in \Sigma \}. \end{aligned} $$

In the following, we denote by $y = y(v,u)$ the state associated with the control $(v,u) \in V_{ad} \times U_{ad}$.

Definition. A pair of controls $(\bar v, \bar u) \in V_{ad} \times U_{ad}$ is said to be optimal, and $\bar y = y(\bar v, \bar u)$ the associated optimal state, if

$$ J(y(\bar v,\bar u), \bar v, \bar u) \le J(y(v,u), v, u) \quad \forall (v,u) \in V_{ad} \times U_{ad}. $$

A pair $(\bar v, \bar u) \in V_{ad} \times U_{ad}$ is said to be locally optimal in the sense of $L^r(Q) \times L^s(\Sigma)$ if there is some $\varepsilon > 0$ such that the above inequality holds for all $(v,u) \in V_{ad} \times U_{ad}$ such that $\|v - \bar v\|_{L^r(Q)} + \|u - \bar u\|_{L^s(\Sigma)} \le \varepsilon$.

In the next theorem, convexity of $\varphi$ and $\psi$ with respect to the control variables will be needed. For $\varphi$, for example, this requirement means that

$$ \varphi(x,t,y,\lambda v_1 + (1-\lambda) v_2) \le \lambda\,\varphi(x,t,y,v_1) + (1-\lambda)\,\varphi(x,t,y,v_2) $$

for almost every $(x,t) \in Q$, all $y, v_1, v_2 \in \mathbb{R}$, and every $\lambda \in (0,1)$. The convexity of $\psi$ with respect to the control variable is understood similarly.

Theorem 5.7. Suppose that Assumption 5.6 holds, and let $\varphi$ and $\psi$ be convex with respect to $v$ and $u$, respectively. Then the optimal control problem (5.7)-(5.9) has at least one optimal pair $(\bar v, \bar u)$ with associated optimal state $\bar y = y(\bar v, \bar u)$.

Proof. The proof is similar to that of Theorem 4.15 for elliptic problems. We can therefore afford to be brief. By virtue of Theorem 5.5, the state equation (5.8) has for any pair of admissible controls a unique associated state $y = y(v,u) \in W(0,T) \cap C(\bar Q)$. Now observe that $V_{ad} \times U_{ad}$ is a bounded subset of $L^\infty(Q) \times L^\infty(\Sigma)$ and, a fortiori, also of $L^r(Q) \times L^s(\Sigma)$ for $r > N/2 + 1$ and $s > N + 1$. In view of estimate (5.6), we can thus conclude that there is some $M > 0$ such that

$$ \|y(v,u)\|_{C(\bar Q)} \le M \quad \forall (v,u) \in V_{ad} \times U_{ad}. \tag{5.10} $$

Owing to Assumption 5.6 and (5.10), and since $U_{ad}$ and $V_{ad}$ are bounded, the functional $J$ is bounded from below and therefore has a finite infimum $j$. By the reflexivity of $L^r(Q) \times L^s(\Sigma)$, we may select a minimizing sequence $\{(v_n, u_n)\}_{n=1}^\infty$ that converges weakly in this space to some limit $(\bar v, \bar u)$:

$$ v_n \rightharpoonup \bar v, \quad u_n \rightharpoonup \bar u \quad \text{as } n \to \infty. $$

Since $V_{ad} \times U_{ad}$ is closed and convex, we have $(\bar v, \bar u) \in V_{ad} \times U_{ad}$, so it is an admissible pair.

Next, the strong convergence of the state sequence in a suitable space has to be shown, which requires a little more effort than in the elliptic case. To this end, define the sequences of functions $z_n(x,t) = -d(x,t,y_n(x,t))$ and $w_n(x,t) = -b(x,t,y_n(x,t))$, $n \in \mathbb{N}$. By (5.10) and Assumption 5.6,

these sequences are almost everywhere uniformly bounded. Hence, there are subsequences, without loss of generality $\{z_n\}_{n=1}^\infty$ and $\{w_n\}_{n=1}^\infty$ themselves, that converge weakly in $L^r(Q)$ and $L^s(\Sigma)$, respectively, to limits $z$ and $w$.

We now regard the semilinear parabolic initial-boundary value problem as a linear problem with right-hand sides $z_n + v_n$ and $w_n + u_n$:

$$ \begin{aligned} y_{n,t} - \Delta y_n &= z_n + v_n \\ \partial_\nu y_n &= w_n + u_n \\ y_n(0) &= y_0. \end{aligned} \tag{5.11} $$

These right-hand sides converge weakly to $z + \bar v$ and $w + \bar u$, respectively. Since the solution mapping $(v,u) \mapsto y(v,u)$ of the linear parabolic problem is weakly continuous, we can infer that the state sequence converges weakly in $W(0,T)$ to some limit $\bar y \in W(0,T)$,

$$ y_n \rightharpoonup \bar y \quad \text{as } n \to \infty. $$

At this point, we employ a regularity result from [Gri07a, Gri07b], which asserts that for zero initial condition $y_0 := 0$ the mapping $(v,u) \mapsto y(v,u)$ maps $L^r(Q) \times L^s(\Sigma)$ continuously into the space of Hölder continuous functions $C^{0,\kappa}(\bar Q)$ for some $\kappa \in (0,1)$.

Now let $\hat y \in C(\bar Q)$ denote the (fixed) contribution to $y_n$ that solves the linear parabolic problem with initial value $y_0$, homogeneous right-hand side and homogeneous boundary condition in (5.11). Then the sequence $\{y_n - \hat y\}_{n=1}^\infty$ is weakly convergent in $C^{0,\kappa}(\bar Q)$. Since $C^{0,\kappa}(\bar Q)$ is by the Arzelà-Ascoli theorem compactly embedded in $C(\bar Q)$, the sequence also converges strongly in $C(\bar Q)$. Now, $\hat y \in C(\bar Q)$, and thus

$$ y_n \to \bar y \quad \text{as } n \to \infty $$

with some $\bar y \in C(\bar Q)$. Hence, we even have uniform convergence. This simplifies the following arguments relative to the elliptic case (where, however, we could work with simpler methods).

Owing to the assumed local Lipschitz continuity of $d$ and $b$, we can infer that

$$ \begin{aligned} d(\cdot,\cdot,y_n) &\to d(\cdot,\cdot,\bar y) && \text{strongly in } L^\infty(Q) \text{ and in } L^2(Q), \\ b(\cdot,\cdot,y_n) &\to b(\cdot,\cdot,\bar y) && \text{strongly in } L^\infty(\Sigma) \text{ and in } L^2(\Sigma). \end{aligned} $$

As in the elliptic case, we now use the variational formulation of the parabolic initial-boundary value problem to conclude that $\bar y$ is the weak solution associated with the pair $(\bar v, \bar u)$, that is, $\bar y = y(\bar v, \bar u)$. To this end, note that we have, for every test function $w \in W_2^{1,1}(Q)$ with $w(T) = 0$,

$$ \begin{aligned} & -\iint_Q y_n\,w_t\,dx\,dt + \iint_Q \big( \nabla y_n \cdot \nabla w + d(x,t,y_n)\,w \big)\,dx\,dt + \iint_\Sigma b(x,t,y_n)\,w\,ds\,dt \\ & \qquad = \iint_Q v_n\,w\,dx\,dt + \iint_\Sigma u_n\,w\,ds\,dt + \int_\Omega y_0\,w(\cdot,0)\,dx. \end{aligned} $$

Passage to the limit as $n \to \infty$, using the convergences shown above, yields that

$$ \begin{aligned} & -\iint_Q \bar y\,w_t\,dx\,dt + \iint_Q \big( \nabla \bar y \cdot \nabla w + d(x,t,\bar y)\,w \big)\,dx\,dt + \iint_\Sigma b(x,t,\bar y)\,w\,ds\,dt \\ & \qquad = \iint_Q \bar v\,w\,dx\,dt + \iint_\Sigma \bar u\,w\,ds\,dt + \int_\Omega y_0\,w(\cdot,0)\,dx, \end{aligned} $$

that is, $\bar y$ is indeed a weak solution.

It remains to show the optimality of $(\bar v, \bar u)$. To do this, we have to invoke the lower semicontinuity of the cost functional. It is apparent that the arguments used in the proof of the elliptic case carry over unchanged. The assertion is thus completely proved. □

Remark. Obviously, only the boundedness and Lipschitz conditions of order $k = 0$ from Assumption 5.6 were needed in the above proof.

5.4. The control-to-state operator

In this section, we prove the continuity and differentiability of the control-to-state mapping. Again, we treat the cases of boundary and distributed controls simultaneously, by considering the problem (5.8):

$$ \begin{aligned} y_t - \Delta y + d(x,t,y) &= v && \text{in } Q \\ \partial_\nu y + b(x,t,y) &= u && \text{on } \Sigma \\ y(0) &= y_0 && \text{in } \Omega. \end{aligned} $$

Again, we denote the control-to-state mapping by $G : V \times U := L^r(Q) \times L^s(\Sigma) \to Y := W(0,T) \cap C(\bar Q)$, $(v,u) \mapsto y$. We generally assume that $r > N/2 + 1$ and $s > N + 1$.

By virtue of Lemma 4.12 on page 202, the Nemytskii operators $y(\cdot) \mapsto d(\cdot,\cdot,y(\cdot))$ and $y(\cdot) \mapsto b(\cdot,\cdot,y(\cdot))$ are continuously differentiable from $C(\bar Q)$ into $L^\infty(Q)$ and $L^\infty(\Sigma)$, respectively. By Theorem 5.5, the operator $G$ assigns to each pair of controls $(v,u) \in V \times U$ a unique state $y \in Y$. As preparation for the proof of differentiability, we first show the Lipschitz continuity of $G$.

Theorem 5.8. Under Assumption 5.6, for $r > N/2 + 1$ and $s > N + 1$ the mapping $G$ is Lipschitz continuous from $L^r(Q) \times L^s(\Sigma)$ into $W(0,T) \cap C(\bar Q)$; that is, there exists some $L > 0$ such that

$$ \|y_1 - y_2\|_{W(0,T)} + \|y_1 - y_2\|_{C(\bar Q)} \le L \big( \|v_1 - v_2\|_{L^r(Q)} + \|u_1 - u_2\|_{L^s(\Sigma)} \big) $$

for all $(v_i, u_i) \in L^r(Q) \times L^s(\Sigma)$ and associated states $y_i = G(v_i, u_i)$, $i = 1, 2$.

Proof: Owing to Theorem 5.5, $y_i \in C(\bar Q)$ for $i = 1, 2$. Subtracting the initial-boundary value problems corresponding to $y_1$ and $y_2$, we find that the differences $y = y_1 - y_2$, $u = u_1 - u_2$, and $v = v_1 - v_2$ satisfy the problem

$$ \begin{aligned} y_t - \Delta y + d(x,t,y_1) - d(x,t,y_2) &= v \\ \partial_\nu y + b(x,t,y_1) - b(x,t,y_2) &= u \\ y(0) &= 0. \end{aligned} \tag{5.12} $$

By the fundamental theorem of calculus, we have for $y_1, y_2 \in \mathbb{R}$ the identity

$$ d(x,t,y_1) - d(x,t,y_2) = \Big( \int_0^1 d_y\big(x,t,y_2 + s\,(y_1 - y_2)\big)\,ds \Big)\,(y_1 - y_2). $$

Now, we put $y_i = y_i(x,t)$, $i = 1, 2$, in this identity. Since $d_y$ is nonnegative, the above integral becomes a nonnegative function $\delta = \delta(x,t) \in L^\infty(Q)$. An analogous representation holds for $b$ with an integral term $\beta = \beta(x,t) \ge 0$. We thus have the initial-boundary value problem

$$ \begin{aligned} y_t - \Delta y + \delta(x,t)\,y &= v \\ \partial_\nu y + \beta(x,t)\,y &= u \\ y(0) &= 0. \end{aligned} $$

Note that $\delta$ and $\beta$ depend on $y_1$ and $y_2$, but this is immaterial for our arguments. In fact, in view of the boundedness and nonnegativity of $\delta$ and $\beta$, the functions $\tilde d(x,t,y) := \delta(x,t)\,y$ and $\tilde b(x,t,y) := \beta(x,t)\,y$ are increasing in $y$ and vanish at $y = 0$. By virtue of Theorem 5.5 on page 268, the solution $y$ is unique, and its norm does not depend on $\delta$ or $\beta$. Since $\tilde d(x,t,0) = \tilde b(x,t,0) = 0$, it follows from the estimate (5.6) on page 268 that the asserted estimate

$$ \|y\|_{W(0,T)} + \|y\|_{C(\bar Q)} \le L \big( \|v\|_{L^r(Q)} + \|u\|_{L^s(\Sigma)} \big) $$

is valid. This concludes the proof of the assertion. □

For the proof of differentiability, we consider a fixed $(\bar v, \bar u)$, which in the applications will be a locally optimal pair of controls.

Theorem 5.9. Suppose that Assumption 5.6 on page 269 holds. Then for $r > N/2 + 1$ and $s > N + 1$ the control-to-state operator $G$ is Fréchet differentiable from $L^r(Q) \times L^s(\Sigma)$ into $W(0,T) \cap C(\bar Q)$. The directional derivative in the direction $(v,u)$ is given by

$$ G'(\bar v, \bar u)\,(v,u) = y, $$

where, with the state $\bar y = G(\bar v, \bar u)$ corresponding to $(\bar v, \bar u)$, $y$ is the weak solution to the initial-boundary value problem linearized at $\bar y$:

$$ \begin{aligned} y_t - \Delta y + d_y(x,t,\bar y)\,y &= v && \text{in } Q \\ \partial_\nu y + b_y(x,t,\bar y)\,y &= u && \text{on } \Sigma \\ y(0) &= 0 && \text{in } \Omega. \end{aligned} \tag{5.13} $$

Proof: We subtract the parabolic problem solved by $\bar y = G(\bar v, \bar u)$ from the one solved by $\tilde y = G(\bar v + v, \bar u + u)$ to obtain the initial-boundary value problem

$$ \begin{aligned} (\tilde y - \bar y)_t - \Delta(\tilde y - \bar y) + d(x,t,\tilde y) - d(x,t,\bar y) &= v \\ \partial_\nu(\tilde y - \bar y) + b(x,t,\tilde y) - b(x,t,\bar y) &= u \\ (\tilde y - \bar y)(0) &= 0. \end{aligned} $$

The Nemytskii operators $\Phi : y \mapsto d(\cdot,\cdot,y(\cdot))$ and $\Psi : y \mapsto b(\cdot,\cdot,y(\cdot))$ are, by Lemma 4.12 on page 202, Fréchet differentiable in $L^\infty(Q)$ and $L^\infty(\Sigma)$, respectively. Therefore,

$$ \begin{aligned} \Phi(\tilde y) - \Phi(\bar y) &= d_y(\cdot,\cdot,\bar y(\cdot))\,(\tilde y(\cdot) - \bar y(\cdot)) + r_d, \\ \Psi(\tilde y) - \Psi(\bar y) &= b_y(\cdot,\cdot,\bar y(\cdot))\,(\tilde y(\cdot) - \bar y(\cdot)) + r_b, \end{aligned} $$

with remainders $r_d$, $r_b$ satisfying

$$ \begin{aligned} \|r_d\|_{L^\infty(Q)} / \|\tilde y - \bar y\|_{L^\infty(Q)} &\to 0 \quad \text{as } \|\tilde y - \bar y\|_{L^\infty(Q)} \to 0, \\ \|r_b\|_{L^\infty(\Sigma)} / \|\tilde y - \bar y\|_{L^\infty(\Sigma)} &\to 0 \quad \text{as } \|\tilde y - \bar y\|_{L^\infty(\Sigma)} \to 0. \end{aligned} $$

We now write $\tilde y - \bar y$ with a remainder $y_\rho$ in the form

$$ \tilde y - \bar y = y + y_\rho, $$

where $y$ is defined as in (5.13). The remainder $y_\rho$ then solves the initial-boundary value problem

$$ \begin{aligned} y_{\rho,t} - \Delta y_\rho + d_y(\cdot,\cdot,\bar y)\,y_\rho &= -r_d \\ \partial_\nu y_\rho + b_y(\cdot,\cdot,\bar y)\,y_\rho &= -r_b \\ y_\rho(0) &= 0. \end{aligned} $$

At this point, the Lipschitz continuity just shown comes into play: in fact, we have

$$ \|\tilde y - \bar y\|_{C(\bar Q)} \le L\,\|(v,u)\|_{L^r(Q) \times L^s(\Sigma)} \to 0. $$

The proof can now be concluded in a similar way as the proof of Theorem 4.17 on page 213. □

Conclusion. $G$ is Fréchet differentiable as a mapping from $L^\infty(Q) \times L^\infty(\Sigma)$ into $W(0,T) \cap C(\bar Q)$.

Remark. The preceding proof could also have been carried out with the aid of the implicit function theorem, without reference to the Lipschitz continuity of $G$. This technique will be applied in the proof of Theorem 5.15. However, the above argumentation is less abstract and not much longer, and it even yields the form of the derivative $G'$. Moreover, it is interesting to know that $G$ is uniformly Lipschitz continuous.

Nonlinear controls. In some applications, the physical background leads to controls that occur nonlinearly. This is, for instance, the case for the heat conduction problem with Stefan-Boltzmann boundary condition,

$$ \begin{aligned} y_t - \Delta y &= 0 \\ \partial_\nu y + \beta(x,t)\,|y|\,y^3 &= u^4 \\ y(0) &= y_0. \end{aligned} \tag{5.14} $$

Here, the mapping $G$ is composed of the Nemytskii operator $u(\cdot) \mapsto u(\cdot)^4$ and the solution operator that assigns the solution to the semilinear initial-boundary value problem to the function $\tilde u := u(\cdot)^4$. The mapping $u(\cdot) \mapsto u(\cdot)^4$ is by Lemma 4.13 on page 203 Fréchet differentiable in $L^\infty(\Sigma)$ and, a fortiori, from $L^\infty(\Sigma)$ into $L^r(\Sigma)$, for any $r \ge 1$. Hence, the composition $G : u \mapsto y$ is Fréchet differentiable from $L^\infty(\Sigma)$ into $W(0,T) \cap C(\bar Q)$.

If one wants $G$ to be differentiable with the domain of definition $L^{\tilde r}(\Sigma)$ in place of $L^\infty(\Sigma)$, then $\tilde r > r > N/2 + 1$ has to be chosen sufficiently large so as to guarantee that the mapping $u \mapsto u^4$ is differentiable from $L^{\tilde r}(\Sigma)$ into $L^r(\Sigma)$. We do not enter into details here, because then growth conditions like those in Section 4.3.3 would be necessary. Differentiability is more easily obtained with controls belonging to $L^\infty(\Sigma)$. In particular, this is true for the treatment of the two-norm discrepancy in connection with second-order sufficient optimality conditions.

Special cases. For the sake of brevity, we have so far investigated the properties of $G$ simultaneously for distributed and boundary controls. We now give an example for each of these cases in which the required conditions are fulfilled.

Distributed control. Consider the initial-boundary value problem

$$ \begin{aligned} y_t - \Delta y + d_0(x,t) + d_1(x,t)\,y^3 &= v \\ \partial_\nu y + b_0(x,t) + b_1(x,t)\,y &= 0 \\ y(0) &= 0. \end{aligned} \tag{5.15} $$

Here, we assume $d_0 \in L^r(Q)$, $b_0 \in L^s(\Sigma)$, and that almost-everywhere nonnegative functions $d_1 \in L^\infty(Q)$ and $b_1 \in L^\infty(\Sigma)$ are given. Then $d(x,t,y) := d_0(x,t) + d_1(x,t)\,y^3$ and $b(x,t,y) := b_0(x,t) + b_1(x,t)\,y$ satisfy the assumptions of the above theorem. Hence, the mapping $v \mapsto y$ is for $r > N/2 + 1$ Fréchet differentiable from $L^r(Q)$ into $W(0,T) \cap C(\bar Q)$.

Boundary control. Under analogous assumptions, the mapping $G : u \mapsto y$ for the problem with radiation boundary condition,

$$ \begin{aligned} y_t - \Delta y + d_0(x,t) + d_1(x,t)\,y &= 0 \\ \partial_\nu y + b_0(x,t) + b_1(x,t)\,|y|\,y^3 &= u \\ y(0) &= 0, \end{aligned} \tag{5.16} $$

is continuously Fréchet differentiable from $L^s(\Sigma)$ into $W(0,T) \cap C(\bar Q)$ whenever $s > N + 1$.

5.5. Necessary optimality conditions

Once more, we consider the problem (5.7)-(5.9). We aim to derive the first-order necessary conditions for a locally optimal pair $(\bar v, \bar u)$. It is clear that if we keep $u = \bar u$ fixed, then $v$ must obey the necessary conditions for the

distributed control problem with variable $v$. Similarly, if $\bar v$ is kept fixed, then $\bar u$ satisfies the necessary conditions for the corresponding problem with boundary control $u$. We can therefore investigate the cases of distributed and boundary control separately at first.

5.5.1. Distributed control. Consider the optimal control problem
\[
\min J(y,v) := \int_\Omega \phi(x, y(x,T))\,dx + \iint_Q \varphi(x,t,y(x,t),v(x,t))\,dx\,dt + \iint_\Sigma \psi(x,t,y(x,t))\,ds\,dt,
\tag{5.17}
\]
subject to
\[
\begin{aligned}
y_t - \Delta y + d(x,t,y) &= v && \text{in } Q\\
\partial_\nu y + b(x,t,y) &= 0 && \text{on } \Sigma\\
y(0) &= y_0 && \text{in } \Omega
\end{aligned}
\tag{5.18}
\]
and
\[
v_a(x,t) \le v(x,t) \le v_b(x,t) \quad \text{for a.e. } (x,t) \in Q.
\tag{5.19}
\]
We have $y = y(v) = G(v)$ with the control-to-state operator $G : L^\infty(Q) \to W(0,T)\cap C(\bar Q)$. Substituting this into $J$, we obtain the reduced cost functional $f$,
\[
J(y,v) = J(G(v),v) =: f(v).
\]
Under Assumption 5.6 on page 269, $f$ is Fréchet differentiable in $L^\infty(Q)$, since $J$ (by Lemma 4.12 on page 202) and $G$ (by Theorem 5.9) are both differentiable.

Obviously, $V_{ad}$ is convex. Hence, if $\bar v$ is locally optimal and $v \in V_{ad}$ is arbitrary, then we have for every sufficiently small $\lambda > 0$ the inequality
\[
f(\bar v + \lambda (v - \bar v)) - f(\bar v) \ge 0.
\]
Dividing by $\lambda$ and passing to the limit as $\lambda \downarrow 0$, we arrive, as in the elliptic case, at the following result.

Lemma 5.10. Suppose that Assumption 5.6 on page 269 holds. Then every locally optimal control $\bar v$ for the problem (5.17)-(5.19) satisfies the variational inequality
\[
f'(\bar v)(v - \bar v) \ge 0 \quad \forall\, v \in V_{ad}.
\tag{5.20}
\]

The derivative $f'$ can be calculated using the chain rule. We obtain
\[
\begin{aligned}
f'(\bar v)(v - \bar v) &= J_y(\bar y,\bar v)\,G'(\bar v)(v - \bar v) + J_v(\bar y,\bar v)(v - \bar v)\\
&= \int_\Omega \phi_y(x,\bar y(x,T))\,y(x,T)\,dx + \iint_Q \varphi_y(x,t,\bar y(x,t),\bar v(x,t))\,y(x,t)\,dx\,dt\\
&\quad + \iint_\Sigma \psi_y(x,t,\bar y(x,t))\,y(x,t)\,ds\,dt\\
&\quad + \iint_Q \varphi_v(x,t,\bar y(x,t),\bar v(x,t))\,(v(x,t) - \bar v(x,t))\,dx\,dt,
\end{aligned}
\tag{5.21}
\]
where $y = G'(\bar v)(v - \bar v)$ is by Theorem 5.9 the solution to the linearized problem
\[
\begin{aligned}
y_t - \Delta y + d_y(x,t,\bar y)\,y &= v - \bar v\\
\partial_\nu y + b_y(x,t,\bar y)\,y &= 0\\
y(0) &= 0.
\end{aligned}
\tag{5.22}
\]
As before, $y$ can be eliminated from (5.21) by means of an adjoint state $p = p(x,t)$. Guided by the experience gained in the treatment of the elliptic case, we define the adjoint state as the solution to the adjoint problem
\[
\begin{aligned}
-p_t - \Delta p + d_y(x,t,\bar y)\,p &= \varphi_y(x,t,\bar y,\bar v)\\
\partial_\nu p + b_y(x,t,\bar y)\,p &= \psi_y(x,t,\bar y)\\
p(x,T) &= \phi_y(x,\bar y(x,T)),
\end{aligned}
\tag{5.23}
\]
which belongs to $W(0,T)\cap L^\infty(Q)\cap C([0,T), C(\bar\Omega))$. In fact, the existence of a unique solution $p \in W(0,T)$ follows from Lemma 3.17 on page 157; the higher regularity of $p$ is a consequence of Lemma 7.12 on page 378 and the subsequent remark on the $L^\infty$ case. For this, the substitution $\tau := T - t$ has to be made. Moreover, if $\phi_y(x,y)$ is continuous in $\bar\Omega\times\mathbb{R}$, then the function $x \mapsto \phi_y(x,\bar y(x,T))$ is also continuous in $\bar\Omega$. In this case, we even have $p \in W(0,T)\cap C(\bar Q)$.
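The interplay of state equation, adjoint equation, and reduced derivative can be checked numerically on a small discrete model. The following is a minimal sketch under assumptions that are not taken from the text: a one-dimensional domain $(0,1)$, $d(y) = y^3$, homogeneous Neumann data ($b \equiv 0$), a cost consisting of a terminal tracking term plus $\frac{\lambda}{2}\|v\|^2_{L^2(Q)}$, and a semi-implicit Euler scheme on a uniform grid. All names (`solve_m`, `solve_state`, `reduced_gradient`, the grid sizes, the data) are hypothetical. The backward sweep mirrors the backward-in-time structure of the adjoint problem (5.23); the result is compared with a central difference quotient of the discrete cost.

```python
import math

# Hypothetical discretization parameters (not from the text).
n, m = 16, 20                  # grid cells in (0, 1), time steps in (0, T)
T, lam = 1.0, 0.1
h, dt = 1.0 / n, T / m
r = dt / h**2

# M = I + dt*A, with A a symmetric tridiagonal Neumann Laplacian.
LOW = [0.0] + [-r] * (n - 1)                      # entries (i, i-1)
UP = [-r] * (n - 1) + [0.0]                       # entries (i, i+1)
DIAG = [1 + r] + [1 + 2 * r] * (n - 2) + [1 + r]
x = [(i + 0.5) * h for i in range(n)]
y_target = [math.cos(math.pi * xi) for xi in x]   # assumed terminal target


def solve_m(rhs):
    """Solve M z = rhs by the Thomas algorithm (M is diagonally dominant)."""
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = UP[0] / DIAG[0], rhs[0] / DIAG[0]
    for i in range(1, n):
        denom = DIAG[i] - LOW[i] * c[i - 1]
        c[i] = UP[i] / denom
        d[i] = (rhs[i] - LOW[i] * d[i - 1]) / denom
    z = [0.0] * n
    z[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z


def solve_state(v):
    """Semi-implicit Euler for y_t - y_xx + y^3 = v with y(0) = 0."""
    y = [[0.0] * n]
    for k in range(m):
        rhs = [y[k][i] - dt * y[k][i] ** 3 + dt * v[k][i] for i in range(n)]
        y.append(solve_m(rhs))
    return y


def cost(v):
    """Discrete terminal tracking term plus (lam/2) * ||v||^2."""
    y = solve_state(v)
    track = 0.5 * h * sum((y[m][i] - y_target[i]) ** 2 for i in range(n))
    reg = 0.5 * lam * dt * h * sum(vi ** 2 for vk in v for vi in vk)
    return track + reg


def reduced_gradient(v):
    """Discrete analogue of p + lam*v, computed by a backward sweep."""
    y = solve_state(v)
    a = [y[m][i] - y_target[i] for i in range(n)]   # terminal data y(T) - target
    grad = [None] * m
    for k in range(m - 1, -1, -1):
        p = solve_m(a)                              # M is symmetric
        grad[k] = [p[i] + lam * v[k][i] for i in range(n)]
        a = [(1 - 3 * dt * y[k][i] ** 2) * p[i] for i in range(n)]
    return grad


# Arbitrary test control and direction.
v = [[math.sin(2 * math.pi * xi) * (k + 1) * dt for xi in x] for k in range(m)]
direction = [[math.cos(3 * xi + k) for xi in x] for k in range(m)]

g = reduced_gradient(v)
ad = dt * h * sum(g[k][i] * direction[k][i] for k in range(m) for i in range(n))

eps = 1e-6
v_plus = [[v[k][i] + eps * direction[k][i] for i in range(n)] for k in range(m)]
v_minus = [[v[k][i] - eps * direction[k][i] for i in range(n)] for k in range(m)]
fd = (cost(v_plus) - cost(v_minus)) / (2 * eps)
print(ad, fd)   # adjoint-based and finite-difference directional derivatives
```

The sketch differentiates the discrete scheme exactly, so the two printed numbers should agree up to finite-difference error; it says nothing about convergence of the scheme to the continuous problem.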

Lemma 5.11. Let $y$ be the weak solution to the linearized problem (5.22), and let $p$ be the weak solution to (5.23). Then we have, for all $v \in L^2(Q)$,
\[
\begin{aligned}
&\int_\Omega \phi_y(x,\bar y(x,T))\,y(x,T)\,dx + \iint_Q \varphi_y(x,t,\bar y(x,t),\bar v(x,t))\,y(x,t)\,dx\,dt\\
&\quad + \iint_\Sigma \psi_y(x,t,\bar y(x,t))\,y(x,t)\,ds(x)\,dt = \iint_Q p(x,t)\,(v(x,t) - \bar v(x,t))\,dx\,dt.
\end{aligned}
\]

Proof: The assertion follows from Theorem 3.18 on page 158 with the specifications $a_\Omega(x) = \phi_y(x,\bar y(x,T))$, $a_Q(x,t) = \varphi_y(x,t,\bar y(x,t),\bar v(x,t))$, and $a_\Sigma(x,t) = \psi_y(x,t,\bar y(x,t))$. □

From formula (5.21), we can thus deduce the form of the derivative $f'(\bar v)$:
\[
f'(\bar v)\,v = \iint_Q \bigl(p + \varphi_v(x,t,\bar y,\bar v)\bigr)\,v\,dx\,dt.
\tag{5.24}
\]
Moreover, we obtain the desired necessary optimality condition:

Theorem 5.12. Suppose that Assumption 5.6 on page 269 holds, and let $\bar v$ be locally optimal for the problem (5.17)-(5.19). If $p \in W(0,T)\cap L^\infty(Q)$ is the associated adjoint state solving problem (5.23), then the variational inequality
\[
\iint_Q \bigl(p + \varphi_v(x,t,\bar y,\bar v)\bigr)(v - \bar v)\,dx\,dt \ge 0 \quad \forall\, v \in V_{ad}
\tag{5.25}
\]
is satisfied.

As in the case of elliptic problems, the variational inequality can be reformulated in terms of a minimum principle.

Conclusion. Suppose that the assumptions of Theorem 5.12 hold, let $\bar v$ be locally optimal for problem (5.7)-(5.9), and let $p$ be the associated adjoint state. Then the minimum of the minimization problem
\[
\min_{v_a(x,t) \le v \le v_b(x,t)} \bigl\{ \bigl(p(x,t) + \varphi_v(x,t,\bar y(x,t),\bar v(x,t))\bigr)\,v \bigr\}
\tag{5.26}
\]
is for almost every $(x,t) \in Q$ attained at $v = \bar v(x,t)$.

Special case: Let $\varphi(x,t,y,v) := \varphi(x,t,y) + \frac{\lambda}{2}\,v^2$, with some $\lambda \ge 0$. Then $\varphi_v(x,t,y,v) = \lambda v$; hence, the minimum of the problem
\[
\min_{v_a(x,t) \le v \le v_b(x,t)} \bigl\{ \bigl(p(x,t) + \lambda\,\bar v(x,t)\bigr)\,v \bigr\}
\]
is for almost every $(x,t) \in Q$ attained at $v = \bar v(x,t)$. This implies, as in formula (2.58) on page 70, that in the $\lambda > 0$ case we have for almost every $(x,t) \in Q$ the projection formula
\[
\bar v(x,t) = \mathbb{P}_{[v_a(x,t),\,v_b(x,t)]}\Bigl\{ -\frac{1}{\lambda}\,p(x,t) \Bigr\}.
\]
If $v_a$ and $v_b$ are continuous, then this implies that $\bar v \in C(\bar Q)$ if $p$ is continuous. Sufficient for this to be true is the continuity of $\phi_y(x,y)$; in this case, any locally optimal control $\bar v$ must be continuous.

Example. We now discuss the "superconductivity" problem, which was already treated in the stationary situation, in the time-dependent case and with a slightly different cost functional:
\[
\min J(y,v) := \frac12\,\|y(\cdot,T) - y_\Omega\|^2_{L^2(\Omega)} + \frac12\,\|y - y_\Sigma\|^2_{L^2(\Sigma)} + \frac{\lambda}{2}\,\|v\|^2_{L^2(Q)},
\]
subject to
\[
\begin{aligned}
y_t - \Delta y + y^3 &= v\\
\partial_\nu y + \beta(x,t)\,y &= 0\\
y(\cdot,0) &= y_0
\end{aligned}
\]
and
\[
-1 \le v(x,t) \le 1 \quad \text{for a.e. } (x,t) \in Q.
\]
Evidently, the above problem is a special case of problem (5.7)-(5.9) on page 270, with the specifications
\[
\begin{aligned}
\phi(x,y) &= \tfrac12\,(y - y_\Omega(x))^2, & \varphi(x,t,y,v) &= \tfrac{\lambda}{2}\,v^2,\\
\psi(x,t,y) &= \tfrac12\,(y - y_\Sigma(x,t))^2, & d(x,t,y) &= y^3, \qquad b(x,t,y) = \beta(x,t)\,y.
\end{aligned}
\]
We assume that $y_\Omega \in C(\bar\Omega)$, $\beta \in L^\infty(\Sigma)$ with $\beta \ge 0$, $y_0 \in C(\bar\Omega)$, and $y_\Sigma \in L^\infty(\Sigma)$. Then all the required conditions (measurability with respect to $(x,t)$, boundedness, differentiability, monotonicity of $d$, convexity of $\varphi$ in $v$) are met, and Theorem 5.7 yields the existence of at least one (globally)

optimal control $\bar v$. The adjoint system solved by the associated adjoint state $p \in W(0,T)\cap C(\bar Q)$ reads
\[
\begin{aligned}
-p_t - \Delta p + 3\,\bar y^2\,p &= 0\\
\partial_\nu p + \beta\,p &= \bar y - y_\Sigma\\
p(\cdot,T) &= \bar y(\cdot,T) - y_\Omega.
\end{aligned}
\]
Evidently, we have the variational inequality
\[
\iint_Q (\lambda\,\bar v + p)(v - \bar v)\,dx\,dt \ge 0 \quad \forall\, v \in V_{ad},
\]
from which, for $\lambda > 0$, the usual projection relation and the property $\bar v \in C(\bar Q)$ follow. For $\lambda = 0$, we have $\bar v(x,t) = -\operatorname{sign} p(x,t)$. ◇

5.5.2. Boundary control. The necessary optimality conditions for the corresponding boundary control problem are derived by similar reasoning. We consider the problem
\[
\min J(y,u) := \int_\Omega \phi(x,y(x,T))\,dx + \iint_Q \varphi(x,t,y(x,t))\,dx\,dt + \iint_\Sigma \psi(x,t,y(x,t),u(x,t))\,ds\,dt,
\tag{5.27}
\]
subject to
\[
\begin{aligned}
y_t - \Delta y + d(x,t,y) &= 0 && \text{in } Q\\
\partial_\nu y + b(x,t,y) &= u && \text{on } \Sigma\\
y(0) &= y_0 && \text{in } \Omega
\end{aligned}
\tag{5.28}
\]
and
\[
u_a(x,t) \le u(x,t) \le u_b(x,t) \quad \text{for a.e. } (x,t) \in \Sigma.
\tag{5.29}
\]
Then the control-to-state operator $G = G(u) : u \mapsto y(u)$ maps $L^\infty(\Sigma)$ into $W(0,T)\cap C(\bar Q)$, since $y_0 \in C(\bar\Omega)$. We may now follow the lines of argumentation employed in the case of distributed controls. The directional derivative of the reduced functional $f(u) = J(G(u),u)$ at $\bar u$ in the direction $u$ is given by
\[
f'(\bar u)\,u = \iint_\Sigma \bigl(p + \psi_u(x,t,\bar y,\bar u)\bigr)\,u\,ds\,dt,
\tag{5.30}
\]
where $p \in W(0,T)\cap L^\infty(Q)$ is the solution to the adjoint problem
\[
\begin{aligned}
-p_t - \Delta p + d_y(x,t,\bar y)\,p &= \varphi_y(x,t,\bar y)\\
\partial_\nu p + b_y(x,t,\bar y)\,p &= \psi_y(x,t,\bar y,\bar u)\\
p(x,T) &= \phi_y(x,\bar y(x,T)).
\end{aligned}
\tag{5.31}
\]
In analogy to Theorem 5.12, we obtain the following result.

Theorem 5.13. Suppose that Assumption 5.6 on page 269 holds. Let $\bar u$ be a locally optimal control for the boundary control problem (5.27)-(5.29), and let $p \in W(0,T)\cap L^\infty(Q)$ be the associated adjoint state solving problem (5.31). Then the variational inequality
\[
\iint_\Sigma \bigl(p + \psi_u(x,t,\bar y,\bar u)\bigr)(u - \bar u)\,ds\,dt \ge 0 \quad \forall\, u \in U_{ad}
\tag{5.32}
\]
is satisfied. Moreover, the minimum of the minimization problem
\[
\min_{u_a(x,t) \le u \le u_b(x,t)} \bigl\{ \bigl(p(x,t) + \psi_u(x,t,\bar y(x,t),\bar u(x,t))\bigr)\,u \bigr\}
\tag{5.33}
\]
is for almost all $(x,t) \in \Sigma$ attained at $u = \bar u(x,t)$.

Example. Consider the problem
\[
\min J(y,u) := \frac12\,\|y(\cdot,T) - y_\Omega\|^2_{L^2(\Omega)} + \frac12\,\|y - y_Q\|^2_{L^2(Q)} + \frac{\lambda}{2}\,\|u\|^2_{L^2(\Sigma)},
\]
subject to
\[
\begin{aligned}
y_t - \Delta y &= 0\\
\partial_\nu y + y^3\,|y| &= u\\
y(0) &= y_0
\end{aligned}
\]
and
\[
0 \le u(x,t) \le 1.
\]
This is a special case of the above problem with the specifications
\[
\begin{aligned}
\phi(x,y) &= \tfrac12\,(y - y_\Omega(x))^2, & \varphi(x,t,y) &= \tfrac12\,(y - y_Q(x,t))^2,\\
\psi(x,t,y,u) &= \tfrac{\lambda}{2}\,u^2, & b(x,t,y) &= y^3\,|y|.
\end{aligned}
\]
The boundary condition is of Stefan-Boltzmann type, since $y^3\,|y| = y^4$ for $y \ge 0$; we choose this form because the mapping $y \mapsto y^3\,|y|$ is increasing while $y \mapsto y^4$ is not. Here we also assume $y_Q, y_\Omega \in C(\bar Q)$. Then Assumption

5.6 holds, and $\psi$ is convex with respect to $u$. We can thus infer from Theorem 5.7 that there exists at least one optimal control $\bar u$.

Since $b_y(y) = 4\,y^2\,|y|$, we obtain for the adjoint problem
\[
\begin{aligned}
-p_t - \Delta p &= \bar y - y_Q\\
\partial_\nu p + 4\,\bar y^2\,|\bar y|\,p &= 0\\
p(\cdot,T) &= \bar y(\cdot,T) - y_\Omega.
\end{aligned}
\]
Together with its solution $p \in W(0,T)\cap C(\bar Q)$, $\bar u$ satisfies the variational inequality
\[
\iint_\Sigma (\lambda\,\bar u + p)(u - \bar u)\,ds\,dt \ge 0 \quad \forall\, u \in U_{ad}.
\]
Moreover, we have for $\lambda > 0$ the projection relation
\[
\bar u(x,t) = \mathbb{P}_{[0,1]}\Bigl\{ -\frac{1}{\lambda}\,p(x,t) \Bigr\} \quad \text{for a.e. } (x,t) \in \Sigma.
\]
Since $p$ is continuous, we can conclude that $\bar u$ is also continuous in $\bar\Sigma$. ◇

The general case. Combining the results established for the cases of distributed and boundary control, we are finally in a position to state the necessary optimality condition for the general case. Here, the associated adjoint state $p$ is the solution to the following adjoint problem:
\[
\begin{aligned}
-p_t - \Delta p + d_y(x,t,\bar y)\,p &= \varphi_y(x,t,\bar y,\bar v) && \text{in } Q\\
\partial_\nu p + b_y(x,t,\bar y)\,p &= \psi_y(x,t,\bar y,\bar u) && \text{on } \Sigma\\
p(\cdot,T) &= \phi_y(x,\bar y(x,T)) && \text{in } \Omega.
\end{aligned}
\tag{5.34}
\]

Theorem 5.14. Suppose that Assumption 5.6 on page 269 is satisfied, let $(\bar v,\bar u)$ be a locally optimal pair for the problem (5.7)-(5.9) on page 270, and let $p \in W(0,T)\cap L^\infty(Q)$ be the associated adjoint state solving problem (5.34). Then the variational inequalities (5.25) and (5.32) and the minimum conditions (5.26) and (5.33) are satisfied.

Proof: The necessary condition for $\bar v$ follows from the fact that $\bar v$ has to solve the general problem (5.7)-(5.9) for fixed $u = \bar u$, which in turn is a special case of the distributed control problem (5.17)-(5.19) if we put $\tilde b(x,t,y) := b(x,t,y) - \bar u(x,t)$. Similarly, $\bar u$ must solve the general problem for fixed $\bar v$. In this way, we obtain the full optimality system. □

Remark. It ought to be clear that for all the first-order necessary optimality conditions established in this chapter, only the boundedness and Lipschitz conditions of order $k = 1$ from Assumption 5.6 are needed.

5.6. Pontryagin's maximum principle *

We will discuss Pontryagin's maximum principle for the following optimal control problem in which the control functions appear nonlinearly:
\[
\min J(y,v,u) := \int_\Omega \phi(x,y(x,T))\,dx + \iint_Q \varphi(x,t,y(x,t),v(x,t))\,dx\,dt + \iint_\Sigma \psi(x,t,y(x,t),u(x,t))\,ds(x)\,dt,
\tag{5.35}
\]
subject to
\[
\begin{aligned}
y_t - \Delta y + d(x,t,y,v) &= 0 && \text{in } Q\\
\partial_\nu y + b(x,t,y,u) &= 0 && \text{on } \Sigma\\
y(0) &= y_0 && \text{in } \Omega
\end{aligned}
\]
and
\[
\begin{aligned}
v_a(x,t) \le v(x,t) \le v_b(x,t) &\quad \text{for a.e. } (x,t) \in Q\\
u_a(x,t) \le u(x,t) \le u_b(x,t) &\quad \text{for a.e. } (x,t) \in \Sigma.
\end{aligned}
\]
In analogy to the elliptic case, one introduces Hamiltonian functions.

Definition. The functions $H^Q : Q\times\mathbb{R}^4 \to \mathbb{R}$, $H^\Sigma : \Sigma\times\mathbb{R}^4 \to \mathbb{R}$, and $H^\Omega : \Omega\times\mathbb{R}^3 \to \mathbb{R}$, given by
\[
\begin{aligned}
H^Q(x,t,y,v,q_0,q) &= q_0\,\varphi(x,t,y,v) - q\,d(x,t,y,v)\\
H^\Sigma(x,t,y,u,q_0,q) &= q_0\,\psi(x,t,y,u) - q\,b(x,t,y,u)\\
H^\Omega(x,y,q_0,q) &= q_0\,\phi(x,y),
\end{aligned}
\]
are called Hamiltonian functions.

We define the adjoint state $q$ as the solution to the adjoint system
\[
\begin{aligned}
-q_t - \Delta q + d_y(x,t,\bar y,\bar v)\,q &= q_0\,\varphi_y(x,t,\bar y,\bar v)\\
\partial_\nu q + b_y(x,t,\bar y,\bar u)\,q &= q_0\,\psi_y(x,t,\bar y,\bar u)\\
q(x,T) &= q_0\,\phi_y(x,\bar y(\cdot,T)).
\end{aligned}
\tag{5.38}
\]
We have $q = -p$ with the solution $p$ to (5.34), provided we put $q_0 = -1$. Using the above Hamiltonians, and putting $q_0 := -1$, we may rewrite the

adjoint system in the form
\[
\begin{aligned}
-q_t - \Delta q &= D_y H^Q(x,t,\bar y,\bar v,-1,q) && \text{in } Q\\
\partial_\nu q &= D_y H^\Sigma(x,t,\bar y,\bar u,-1,q) && \text{on } \Sigma\\
q(\cdot,T) &= D_y H^\Omega(x,\bar y(\cdot,T),-1,q) && \text{in } \Omega.
\end{aligned}
\tag{5.39}
\]

Definition. Let $q$ be the adjoint state defined in (5.39) and let $q_0 = -1$. The controls $\bar v$ and $\bar u$ are said to satisfy Pontryagin's maximum principle if the maximum conditions
\[
\begin{aligned}
\max_{v_a(x,t) \le v \le v_b(x,t)} \bigl\{ H^Q(x,t,\bar y(x,t),v,q_0,q(x,t)) \bigr\} &= H^Q(x,t,\bar y(x,t),\bar v(x,t),q_0,q(x,t)),\\
\max_{u_a(x,t) \le u \le u_b(x,t)} \bigl\{ H^\Sigma(x,t,\bar y(x,t),u,q_0,q(x,t)) \bigr\} &= H^\Sigma(x,t,\bar y(x,t),\bar u(x,t),q_0,q(x,t))
\end{aligned}
\]
are satisfied for almost every $(x,t) \in Q$ and $(x,t) \in \Sigma$, respectively.

In other words, the maxima of $H^Q$ and $H^\Sigma$ must for almost every $(x,t)$ be attained at $\bar v(x,t)$ and $\bar u(x,t)$, respectively. Under natural assumptions, it can be expected that (globally) optimal controls satisfy the maximum principle; see, e.g., the papers [Cas97], [LY95], [RZ99], [vW76], and [vW77] listed in the overview of relevant literature on the maximum principle at the beginning of Section 4.8.1.

5.7. Second-order optimality conditions

5.7.1. Second-order derivatives. We again consider the optimal control problem (5.7)-(5.9) on page 270. We have the following result.

Theorem 5.15. Suppose that Assumption 5.6 holds. Then the control-to-state mapping $G : (v,u) \mapsto y$ associated with the initial-boundary value problem (5.8) is twice continuously Fréchet differentiable from $L^\infty(Q)\times L^\infty(\Sigma)$ into $W(0,T)\cap C(\bar Q)$.

Proof. As in the elliptic case, we apply the implicit function theorem. We first derive an operator equation for $y = G(v,u)$. To this end, we reformulate the equation for $y$ in the form
\[
\begin{aligned}
y_t - \Delta y &= v - d(x,t,y) && \text{in } Q\\
\partial_\nu y &= u - b(x,t,y) && \text{on } \Sigma\\
y(0) &= y_0 && \text{in } \Omega.
\end{aligned}
\]
The linear part on the left-hand side is decomposed into three continuous linear solution operators $G_Q : L^\infty(Q) \to Y := W(0,T)\cap C(\bar Q)$, $G_\Sigma : L^\infty(\Sigma) \to Y$, and $G_0 : C(\bar\Omega) \to Y$, which correspond to the linear initial-boundary value problem
\[
\begin{aligned}
y_t - \Delta y &= v && \text{in } Q\\
\partial_\nu y &= u && \text{on } \Sigma\\
y(0) &= w && \text{in } \Omega
\end{aligned}
\]
in the following way: we have
\[
\begin{aligned}
y &= G_Q\,v && \text{for } u = 0,\ w = 0\\
y &= G_\Sigma\,u && \text{for } v = 0,\ w = 0\\
y &= G_0\,w && \text{for } u = 0,\ v = 0.
\end{aligned}
\]
In the following, we regard these operators as mappings with range in $C(\bar Q)$. It is apparent that the solution $y$ to the nonlinear equation can be expressed in the form
\[
y = G_Q\bigl(v - d(\cdot,\cdot,y)\bigr) + G_\Sigma\bigl(u - b(\cdot,\cdot,y)\bigr) + G_0\,y_0
\tag{5.40}
\]
or, equivalently,
\[
0 = y - G_Q\bigl(v - d(\cdot,\cdot,y)\bigr) - G_\Sigma\bigl(u - b(\cdot,\cdot,y)\bigr) - G_0\,y_0 =: F(y,v,u).
\]
In this way, we avoid the discussion of differential operators and the use of the space $W(0,T)$. Evidently, $F$ is a twice continuously Fréchet differentiable mapping from $C(\bar Q)\times L^\infty(Q)\times L^\infty(\Sigma)$ into $C(\bar Q)$; indeed, $G_Q$, $G_\Sigma$, and $G_0$ are continuous linear mappings, and the Nemytskii operators $y \mapsto d(\cdot,\cdot,y)$ and $y \mapsto b(\cdot,\cdot,y)$ are twice continuously Fréchet differentiable from $C(\bar Q)$ into $L^\infty(Q)$ and $L^\infty(\Sigma)$, respectively.

According to Exercise 5.1, the partial Fréchet derivative $F_y(y,v,u)$ is invertible in $C(\bar Q)$. Hence, by the implicit function theorem, the equation $F(y,v,u) = 0$ has, in some open neighborhood of any arbitrarily chosen point $(\bar y,\bar v,\bar u)$, a unique solution $y = y(v,u)$; moreover, the mapping $(v,u) \mapsto y$ is twice continuously differentiable. Since $y = G(v,u)$ is a solution, we conclude that $G$ is twice continuously differentiable. □
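The implicit-function argument can be made concrete on a scalar caricature of the fixed-point equation (5.40). The sketch below rests on assumptions that are not part of the text: the operator $G_Q$ is replaced by a hypothetical positive number `g_q`, the boundary and initial terms are dropped, and $d(y) = y^3$. Differentiating $F(y,v) = y - g_q\,(v - d(y)) = 0$ once and twice with respect to $v$ gives linear equations for the derivatives of the solution map, which the code compares with difference quotients.

```python
def d(y):
    return y ** 3


def d_y(y):
    return 3 * y ** 2


def d_yy(y):
    return 6 * y


g_q = 0.5  # hypothetical positive scalar standing in for the operator G_Q


def G(v, iters=50):
    """Solve F(y, v) = y - g_q*(v - d(y)) = 0 for y by Newton's method."""
    y = 0.0
    for _ in range(iters):
        y -= (y - g_q * (v - d(y))) / (1 + g_q * d_y(y))
    return y


def G1(v, h):
    """First derivative: z solves z + g_q*d_y(y)*z = g_q*h."""
    y = G(v)
    return g_q * h / (1 + g_q * d_y(y))


def G2(v, h1, h2):
    """Second derivative: z solves z + g_q*d_y(y)*z = -g_q*d_yy(y)*y1*y2."""
    y = G(v)
    y1, y2 = G1(v, h1), G1(v, h2)
    return -g_q * d_yy(y) * y1 * y2 / (1 + g_q * d_y(y))


v0 = 0.8
e1, e2 = 1e-5, 1e-3
fd1 = (G(v0 + e1) - G(v0 - e1)) / (2 * e1)               # central difference
fd2 = (G(v0 + e2) - 2 * G(v0) + G(v0 - e2)) / e2 ** 2    # second difference
print(G1(v0, 1.0), fd1)
print(G2(v0, 1.0, 1.0), fd2)
```

In this scalar setting the invertibility of $F_y$ reduces to $1 + g_q\,d_y(y) \ne 0$, which holds because $d$ is monotone increasing; the function-space proof needs the corresponding statement for the linearized parabolic problem.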

Theorem 5.16. Suppose that Assumption 5.6 on page 269 holds. Then the second derivative of $G$ at $(v,u)$ is given by the expression
\[
G''(v,u)\,[(v_1,u_1),(v_2,u_2)] = z,
\]
with $z$ being the uniquely determined weak solution to the parabolic initial-boundary value problem
\[
\begin{aligned}
z_t - \Delta z + d_y(x,t,y)\,z &= -d_{yy}(x,t,y)\,\{y_1 y_2 + y_1 w_2 + w_1 y_2 + w_1 w_2\}\\
\partial_\nu z + b_y(x,t,y)\,z &= -b_{yy}(x,t,y)\,\{y_1 y_2 + y_1 w_2 + w_1 y_2 + w_1 w_2\}\\
z(0) &= 0
\end{aligned}
\]
with $y = G(v,u)$, where the functions $y_i, w_i \in W(0,T)$, $i = 1, 2$, are the solutions to the following linearized initial-boundary value problems:
\[
\begin{aligned}
\partial y_i/\partial t - \Delta y_i + d_y(x,t,y)\,y_i &= 0\\
\partial_\nu y_i + b_y(x,t,y)\,y_i &= u_i\\
y_i(0) &= 0,
\end{aligned}
\qquad
\begin{aligned}
\partial w_i/\partial t - \Delta w_i + d_y(x,t,y)\,w_i &= v_i\\
\partial_\nu w_i + b_y(x,t,y)\,w_i &= 0\\
w_i(0) &= 0.
\end{aligned}
\]

Proof: The mapping $G''(v,u)$ is given in terms of second-order partial derivatives in the form
\[
G''(v,u)\,[(u_1,v_1),(u_2,v_2)] = G_{uu}[u_1,u_2] + G_{uv}[u_1,v_2] + G_{vu}[v_1,u_2] + G_{vv}[v_1,v_2],
\]
where the operators $G_{uu}$, $G_{uv}$, $G_{vu}$, and $G_{vv}$ can be determined by suitable combinations of the directions $u_i$ and $v_i$. For example, by choosing $v_1 = v_2 = 0$, we calculate $G_{uu}$. As in the preceding proof, we start from the representation (5.40),
\[
y = G(v,u) = G_Q\bigl(v - d(\cdot,\cdot,G(v,u))\bigr) + G_\Sigma\bigl(u - b(\cdot,\cdot,G(v,u))\bigr) + G_0\,y_0.
\]
Differentiating the left- and right-hand sides with respect to $u$ in the direction $u_1$, we obtain that
\[
G_u(v,u)\,u_1 = -G_Q\,d_y(\cdot,\cdot,G(v,u))\,G_u(v,u)\,u_1 + G_\Sigma\,u_1 - G_\Sigma\,b_y(\cdot,\cdot,G(v,u))\,G_u(v,u)\,u_1.
\]
A further differentiation in the direction $u_2$ yields the equation
\[
\begin{aligned}
G_{uu}(v,u)[u_1,u_2] = {}& -G_Q\bigl\{ d_{yy}(\cdot,\cdot,G(v,u))\,(G_u(v,u)\,u_1)\,(G_u(v,u)\,u_2) + d_y(\cdot,\cdot,G(v,u))\,G_{uu}(v,u)[u_1,u_2] \bigr\}\\
& -G_\Sigma\bigl\{ b_{yy}(\cdot,\cdot,G(v,u))\,(G_u(v,u)\,u_1)\,(G_u(v,u)\,u_2) + b_y(\cdot,\cdot,G(v,u))\,G_{uu}(v,u)[u_1,u_2] \bigr\}.
\end{aligned}
\]
Putting $y = G(v,u)$, $y_i = G_u(v,u)\,u_i$, and $z_{uu} := G_{uu}(v,u)[u_1,u_2]$ in this equation, we find that
\[
z_{uu} = -G_Q\bigl\{ d_{yy}(\cdot,\cdot,y)\,y_1 y_2 + d_y(\cdot,\cdot,y)\,z_{uu} \bigr\} - G_\Sigma\bigl\{ b_{yy}(\cdot,\cdot,y)\,y_1 y_2 + b_y(\cdot,\cdot,y)\,z_{uu} \bigr\},
\]
where, by virtue of Theorem 5.9, $y_1$ and $y_2$ solve the initial-boundary value problems defined in the assertion. By the definitions of $G_Q$, $G_\Sigma$, and $G_0$, the function $z_{uu}$ solves the initial-boundary value problem
\[
\begin{aligned}
z_t - \Delta z + d_y(x,t,y)\,z &= -d_{yy}(x,t,y)\,y_1 y_2\\
\partial_\nu z + b_y(x,t,y)\,z &= -b_{yy}(x,t,y)\,y_1 y_2\\
z(0) &= 0.
\end{aligned}
\]
We have thus obtained the first contribution $z_{uu}$ to the representation of $z$ asserted in the theorem. The remaining three contributions are constructed by the same procedure: $G_{uv}[u_1,v_2]$ with $u_2 = 0$ and $v_1 = 0$, $G_{vu}[v_1,u_2]$ using $u_1 = 0$ and $v_2 = 0$, and $G_{vv}[v_1,v_2]$ with $u_1 = u_2 = 0$. Superposition of the four contributions finally yields the function $z$ from the statement of the theorem and the asserted form of $G''$. □

With the above theorem, the ground is prepared to state sufficient optimality conditions and to prove their sufficiency for local optimality. We could do this for distributed and boundary controls simultaneously, but this would necessitate a rather complicated exposition. Therefore, we choose to treat the two cases separately, giving a proof only for distributed controls. The case of boundary controls can be handled analogously.

5.7.2. Distributed controls. Once again, we consider the optimal control problem (5.17)-(5.19) on page 278. Let the control $\bar v \in V_{ad}$ satisfy the associated first-order necessary optimality conditions, and let $p$ denote the corresponding adjoint state defined by (5.23). Then we have the variational

inequality (5.25),
\[
\iint_Q \bigl(p + \varphi_v(x,t,\bar y,\bar v)\bigr)(v - \bar v)\,dx\,dt \ge 0 \quad \forall\, v \in V_{ad}.
\]
Second-order necessary optimality conditions can be obtained in the same way as in the elliptic case; we do not discuss this further here. The most convenient way to state second-order sufficient conditions is to make use of the Lagrangian function, which for the problem (5.17)-(5.19) on page 278 takes the form
\[
\mathcal{L}(y,v,p) = J(y,v) - \iint_Q \bigl( (y_t + d(x,t,y) - v)\,p + \nabla y\cdot\nabla p \bigr)\,dx\,dt - \iint_\Sigma b(x,t,y)\,p\,ds\,dt.
\]
The explicit expression for the second derivative of $\mathcal{L}$ is given by
\[
\mathcal{L}''(\bar y,\bar v,p)(y,v)^2 = J''(\bar y,\bar v)(y,v)^2 - \iint_Q p\,d_{yy}(x,t,\bar y)\,y^2\,dx\,dt - \iint_\Sigma p\,b_{yy}(x,t,\bar y)\,y^2\,ds\,dt,
\]
where
\[
\begin{aligned}
J''(\bar y,\bar v)(y,v)^2 = {}& \int_\Omega \phi_{yy}(x,\bar y(x,T))\,y(x,T)^2\,dx + \iint_\Sigma \psi_{yy}(x,t,\bar y)\,y^2\,ds\,dt\\
& + \iint_Q \begin{bmatrix} y\\ v \end{bmatrix}^{\!\top}
\begin{bmatrix} \varphi_{yy}(x,t,\bar y,\bar v) & \varphi_{yv}(x,t,\bar y,\bar v)\\ \varphi_{vy}(x,t,\bar y,\bar v) & \varphi_{vv}(x,t,\bar y,\bar v) \end{bmatrix}
\begin{bmatrix} y\\ v \end{bmatrix} dx\,dt.
\end{aligned}
\]
For pedagogic reasons, we began our investigations in the elliptic case with sufficient optimality conditions that do not invoke strongly active constraints and thus are usually too restrictive; weaker conditions were studied only later. Here, we do not take this detour and incorporate strongly active constraints right from the beginning. Similarly to the elliptic case, we make the following definition:

Definition. For given $\tau > 0$ the set
\[
A_\tau(\bar v) = \{ (x,t) \in Q : |p(x,t) + \varphi_v(x,t,\bar y(x,t),\bar v(x,t))| > \tau \}
\]
is called the set of strongly active constraints for $\bar v$.

As was explained on page 250 for an optimization problem in $\mathbb{R}$, it does not make sense to postulate the positive definiteness of the second derivative of the Lagrangian in $A_\tau(\bar v)$. The set $A_\tau(\bar v)$ is fundamental for the introduction of the $\tau$-critical cone, which is defined just as in the elliptic case. It is the set of those controls for which the positive definiteness of $\mathcal{L}''$ holds upon their insertion into $\mathcal{L}''$ together with the solution $y$ to the linearized initial-boundary value problem
\[
\begin{aligned}
y_t - \Delta y + d_y(x,t,\bar y)\,y &= v && \text{in } Q\\
\partial_\nu y + b_y(x,t,\bar y)\,y &= 0 && \text{on } \Sigma\\
y(0) &= 0 && \text{in } \Omega.
\end{aligned}
\tag{5.41}
\]

Definition. The $\tau$-critical cone $C_\tau(\bar v)$ is the set of all $v \in L^\infty(Q)$ satisfying
\[
v(x,t)\ \begin{cases}
= 0 & \text{if } (x,t) \in A_\tau(\bar v)\\
\ge 0 & \text{if } \bar v(x,t) = v_a \text{ and } (x,t) \notin A_\tau(\bar v)\\
\le 0 & \text{if } \bar v(x,t) = v_b \text{ and } (x,t) \notin A_\tau(\bar v).
\end{cases}
\tag{5.42}
\]
Depending on the sign of $p + \varphi_v$, on $A_\tau(\bar v)$ we have either $\bar v = v_a$ or $\bar v = v_b$, whence the above sign conditions are derived. One may put $v = 0$ at points where the gradient of the cost functional, that is, of the function $p + \varphi_v(x,t,\bar y,\bar v)$, is at least $\tau$ in absolute value. We must assume $\tau > 0$, a restriction that is not needed in the finite-dimensional case: indeed, in the finite-dimensional case every component of a vector lying in the critical cone for which the corresponding component of the gradient is nonzero must vanish. The counterexample constructed by J. Dunn [Dun98] shows that this does not hold in function spaces. Therefore, the second-order sufficient optimality condition is formulated as follows.

There exist constants $\delta > 0$ and $\tau > 0$ such that
\[
\mathcal{L}''(\bar y,\bar v,p)(y,v)^2 \ge \delta\,\|v\|^2_{L^2(Q)} \quad \text{for all } v \in C_\tau(\bar v) \text{ and } y \in W(0,T) \text{ solving (5.41)}.
\tag{5.43}
\]

Theorem 5.17. Suppose that Assumption 5.6 holds, and assume that the pair $(\bar y,\bar v)$ obeys the first-order necessary optimality conditions of Theorem 5.12 and all the constraints of the problem (5.17)-(5.19). Moreover, suppose there exist constants $\delta > 0$ and $\tau > 0$ such that the positive definiteness condition (5.43) is fulfilled. Then there are constants $\varepsilon > 0$ and $\sigma > 0$ such that every $v \in V_{ad}$ with $\|v - \bar v\|_{L^\infty(Q)} \le \varepsilon$, together with the associated solution $y(v)$ of problem (5.18) on page 278, satisfies the quadratic growth condition
\[
J(y,v) \ge J(\bar y,\bar v) + \sigma\,\|v - \bar v\|^2_{L^2(Q)}.
\]

In particular, $\bar v$ is locally optimal in the sense of $L^\infty(Q)$.

Proof: (i) Preliminaries.

As before, $G : L^\infty(Q) \to W(0,T)\cap C(\bar Q)$, $v \mapsto y$, denotes the control-to-state operator. We know already that $G$ is twice continuously Fréchet differentiable. Now let $f(v) := J(y(v),v) = J(G(v),v)$. In analogy to Theorem 4.25 on page 242, we again have
\[
f''(\bar v)\,[v_1,v_2] = \mathcal{L}''(\bar y,\bar v,p)\,[(y_1,v_1),(y_2,v_2)],
\tag{5.44}
\]
where $p$ is the adjoint state associated with $(\bar y,\bar v)$ and $y_i := G'(\bar v)\,v_i$, $i = 1, 2$, denote the solutions to the linearized problem with right-hand sides $v_i$. With this notation, $f''(\bar v)$ can be estimated in terms of the $L^2$ norms of the increments:
\[
\begin{aligned}
|f''(\bar v)\,[v_1,v_2]| &\le |\mathcal{L}''(\bar y,\bar v,p)\,[(y_1,v_1),(y_2,v_2)]|\\
&\le c\,\bigl\{ \|y_1\|_{W(0,T)}\,\|y_2\|_{W(0,T)} + \|y_1\|_{W(0,T)}\,\|v_2\|_{L^2(Q)}\\
&\qquad + \|y_2\|_{W(0,T)}\,\|v_1\|_{L^2(Q)} + \|v_1\|_{L^2(Q)}\,\|v_2\|_{L^2(Q)} \bigr\}\\
&\le c\,\|v_1\|_{L^2(Q)}\,\|v_2\|_{L^2(Q)}.
\end{aligned}
\tag{5.45}
\]
Here, we have used the continuity of the operator $G'(\bar v)$ in the representation $y_i = G'(\bar v)\,v_i$ as a mapping from $L^2(Q)$ into $W(0,T)$, and $c > 0$ denotes a generic constant. The estimate just shown will be used repeatedly in step (iii) below.

Note also that $f'(\bar v)$ can be expressed, with $g := p + \varphi_v(\cdot,\cdot,\bar y,\bar v)$, in the form
\[
f'(\bar v)\,h = \iint_Q g(x,t)\,h(x,t)\,dx\,dt.
\]

(ii) Taylor expansion.

Let $v(\cdot) \in V_{ad}$ with $\|v - \bar v\|_{L^\infty(Q)} \le \varepsilon$ be arbitrary. For almost every $(x,t) \in Q$, we have the pointwise variational inequality
\[
g(x,t)\,(v - \bar v(x,t)) \ge 0 \quad \forall\, v \in [v_a(x,t), v_b(x,t)].
\]
Hence, with $h(x,t) = v(x,t) - \bar v(x,t)$,
\[
\begin{aligned}
f(v) - f(\bar v) &= f'(\bar v)\,h + \frac12\,f''(\bar v)\,h^2 + r_2\\
&\ge \iint_{A_\tau(\bar v)} g(x,t)\,h(x,t)\,dx\,dt + \frac12\,f''(\bar v)\,h^2 + r_2\\
&\ge \tau \iint_{A_\tau(\bar v)} |h(x,t)|\,dx\,dt + \frac12\,f''(\bar v)\,h^2 + r_2.
\end{aligned}
\]
Here, $r_2 = r_2(\bar v,h)$ denotes the second-order remainder in the Taylor expansion of $f$. We now make the decomposition $h := h_0 + h_1$, where
\[
h_0(x,t) := \begin{cases} h(x,t) & \text{if } (x,t) \notin A_\tau\\ 0 & \text{if } (x,t) \in A_\tau. \end{cases}
\]
By construction, we have $h_0 \in C_\tau(\bar v)$, since $h_0$ satisfies the sign conditions of the critical cone. With these functions, it follows that
\[
f(v) - f(\bar v) \ge \tau \iint_{A_\tau(\bar v)} |h(x,t)|\,dx\,dt + \frac12\,f''(\bar v)\,(h_0 + h_1)^2 + r_2.
\tag{5.46}
\]

(iii) Estimation of $f''(\bar v)\,(h_0 + h_1)^2$.

Invoking (5.43), $h_0 \in C_\tau(\bar v)$, and the representation (5.44), we readily see that
\[
\frac12\,f''(\bar v)\,h_0^2 \ge \frac{\delta}{2}\,\|h_0\|^2_{L^2(Q)}.
\]
Using Young's inequality, we conclude from (5.45) that, with a generic constant $c > 0$,
\[
\begin{aligned}
|f''(\bar v)\,[h_0,h_1]| &\le c\,\|h_0\|_{L^2(Q)}\,\|h_1\|_{L^2(Q)} \le \frac{\delta}{4}\,\|h_0\|^2_{L^2(Q)} + c\,\|h_1\|^2_{L^2(Q)}\\
&\le \frac{\delta}{4}\,\|h_0\|^2_{L^2(Q)} + c\,\|h_1\|_{L^1(Q)}\,\|h_1\|_{L^\infty(Q)}\\
&\le \frac{\delta}{4}\,\|h_0\|^2_{L^2(Q)} + c_1\,\varepsilon\,\|h_1\|_{L^1(Q)},
\end{aligned}
\]
because $\|h\|_{L^\infty(Q)} \le \varepsilon$. By the same token,
\[
\Bigl|\frac12\,f''(\bar v)\,h_1^2\Bigr| \le c\,\|h_1\|^2_{L^2(Q)} \le c_2\,\varepsilon\,\|h_1\|_{L^1(Q)}.
\]

Summarizing, we obtain after combining the above inequalities that
\[
\begin{aligned}
\frac12\,f''(\bar v)\,(h_0 + h_1)^2 &\ge \frac{\delta}{2}\,\|h_0\|^2_{L^2(Q)} - \Bigl[ \frac{\delta}{4}\,\|h_0\|^2_{L^2(Q)} + (c_1 + c_2)\,\varepsilon\,\|h_1\|_{L^1(Q)} \Bigr]\\
&\ge \frac{\delta}{4}\,\|h_0\|^2_{L^2(Q)} - (c_1 + c_2)\,\varepsilon\,\|h_1\|_{L^1(Q)}.
\end{aligned}
\]
Now we choose $\varepsilon > 0$ so small that $\varepsilon\,(c_1 + c_2) \le \tau/2$. Since $h_1 = 0$ on $Q\setminus A_\tau$, we can infer that
\[
\|h_1\|_{L^1(Q)} = \iint_{A_\tau(\bar v)} |h|\,dx\,dt.
\]
Substituting the above estimates into (5.46) then yields
\[
\begin{aligned}
f(v) - f(\bar v) &\ge \tau \iint_{A_\tau(\bar v)} |h|\,dx\,dt - \frac{\tau}{2} \iint_{A_\tau(\bar v)} |h|\,dx\,dt + \frac{\delta}{4}\,\|h_0\|^2_{L^2(Q)} + r_2\\
&\ge \frac{\tau}{2} \iint_{A_\tau(\bar v)} |h|\,dx\,dt + \frac{\delta}{4}\,\|h_0\|^2_{L^2(Q)} + r_2.
\end{aligned}
\]
Now we choose $\varepsilon \le 1$, which can be done without loss of generality. Then, by the definition of $h$, we have $|h(x,t)| \ge h(x,t)^2$. Since
\[
\|h_0\|^2_{L^2(Q)} = \iint_{Q\setminus A_\tau(\bar v)} h^2\,dx\,dt,
\]
we finally obtain that
\[
\begin{aligned}
f(v) - f(\bar v) &\ge \frac{\tau}{2} \iint_{A_\tau(\bar v)} h^2\,dx\,dt + \frac{\delta}{4}\,\|h\|^2_{L^2(Q\setminus A_\tau)} + r_2\\
&\ge \min\Bigl\{ \frac{\tau}{2}, \frac{\delta}{4} \Bigr\}\,\|h\|^2_{L^2(Q)} + r_2.
\end{aligned}
\]
In Exercise 5.2, the reader will be asked to show that the remainder $r_2(\bar v,h)$ satisfies, as in (4.79) on page 237,
\[
\frac{r_2(\bar v,h)}{\|h\|^2_{L^2(Q)}} \to 0 \quad \text{as } \|h\|_{L^\infty(Q)} \to 0.
\]
Hence, for sufficiently small $\varepsilon > 0$ we have the estimate
\[
f(v) - f(\bar v) \ge \frac12\,\min\Bigl\{ \frac{\tau}{2}, \frac{\delta}{4} \Bigr\}\,\|h\|^2_{L^2(Q)} = \sigma\,\|h\|^2_{L^2(Q)},
\]
which is the asserted quadratic growth. □

Remarks.

(i) The assertion of the above theorem is valid, in particular, if the definiteness condition (5.43) is postulated for the larger set of all $(y,v)$ satisfying both the linearized system (5.41) and the sign conditions in (5.42), but not the condition $v = 0$ on $A_\tau(\bar v)$. These stronger second-order sufficient conditions, which are often assumed in the convergence proofs of numerical methods, have been used by us before in the study of elliptic problems.

(ii) Another, more elegant, approach to deriving the second-order sufficient optimality conditions which avoids the need for $\delta > 0$ and $\tau > 0$ can be found in Casas et al. [CDIRT08]. However, that second-order condition is equivalent to the one used here.

5.7.3. Boundary control. In view of the analogy, we present the sufficient conditions for boundary controls without proof. We consider the optimal control problem (5.27)-(5.29) on page 282. The Lagrangian function is defined, in analogy to the case of distributed controls, by
\[
\mathcal{L}(y,u,p) = J(y,u) - \iint_Q \bigl( (y_t + d(x,t,y))\,p + \nabla y\cdot\nabla p \bigr)\,dx\,dt - \iint_\Sigma (b(x,t,y) - u)\,p\,ds\,dt.
\]
Its second derivative is given by
\[
\mathcal{L}''(\bar y,\bar u,p)(y,u)^2 = J''(\bar y,\bar u)(y,u)^2 - \iint_Q p\,d_{yy}(x,t,\bar y)\,y^2\,dx\,dt - \iint_\Sigma p\,b_{yy}(x,t,\bar y)\,y^2\,ds\,dt,
\]
where
\[
\begin{aligned}
J''(\bar y,\bar u)(y,u)^2 = {}& \int_\Omega \phi_{yy}(x,\bar y(x,T))\,y(x,T)^2\,dx + \iint_Q \varphi_{yy}(x,t,\bar y)\,y^2\,dx\,dt\\
& + \iint_\Sigma \begin{bmatrix} y\\ u \end{bmatrix}^{\!\top}
\begin{bmatrix} \psi_{yy}(x,t,\bar y,\bar u) & \psi_{yu}(x,t,\bar y,\bar u)\\ \psi_{uy}(x,t,\bar y,\bar u) & \psi_{uu}(x,t,\bar y,\bar u) \end{bmatrix}
\begin{bmatrix} y\\ u \end{bmatrix} ds\,dt.
\end{aligned}
\]
The control $\bar u \in U_{ad}$ is assumed to comply with the first-order necessary optimality conditions. With the associated adjoint state $p$, defined by the adjoint problem (5.31) on page 283, the variational inequality (5.32) is satisfied:
\[
\iint_\Sigma \bigl(p + \psi_u(x,t,\bar y,\bar u)\bigr)(u - \bar u)\,ds\,dt \ge 0 \quad \forall\, u \in U_{ad}.
\]
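The pointwise objects used in this section (the projection formula, the gradient function $p + \lambda \bar v$, the strongly active set, and the $\tau$-critical cone) can be illustrated on a handful of grid points. The sketch below is only an illustration under assumptions not taken from the text: a quadratic control cost $\frac{\lambda}{2} v^2$, constant bounds $v_a = -1$, $v_b = 1$, and invented adjoint values `p`; the helper names `project` and `in_critical_cone` are hypothetical. The same construction applies to boundary controls with $\Sigma$ in place of $Q$.

```python
def project(z, lo, hi):
    """Pointwise projection of z onto the interval [lo, hi]."""
    return max(lo, min(hi, z))


lam, tau = 1.0, 0.1
v_a, v_b = -1.0, 1.0
p = [-3.0, -0.5, 0.2, 2.0, 1.05]   # invented adjoint values on five grid points

# Projection formula: v_bar = P_[v_a, v_b](-p / lam).
v_bar = [project(-pi / lam, v_a, v_b) for pi in p]

# Gradient function g = p + lam * v_bar and strongly active set {|g| > tau}.
g = [pi + lam * vi for pi, vi in zip(p, v_bar)]
strongly_active = [abs(gi) > tau for gi in g]


def in_critical_cone(v, tol=1e-12):
    """Check the sign conditions of the tau-critical cone pointwise."""
    for vi, vb_i, active in zip(v, v_bar, strongly_active):
        if active:
            if abs(vi) > tol:              # v must vanish on the active set
                return False
        elif vb_i == v_a and vi < -tol:    # v >= 0 where v_bar = v_a
            return False
        elif vb_i == v_b and vi > tol:     # v <= 0 where v_bar = v_b
            return False
    return True


# The pointwise variational inequality g * (v - v_bar) >= 0 on [v_a, v_b]
# only needs to be checked at the two end points, since it is linear in v.
vi_holds = all(gi * (w - vi) >= 0 for gi, vi in zip(g, v_bar) for w in (v_a, v_b))

print(v_bar, strongly_active, vi_holds)
print(in_critical_cone([0.0, 1.0, -1.0, 0.0, 0.3]))
print(in_critical_cone([0.0, 0.0, 0.0, 0.0, -0.2]))
```

At the last grid point the lower bound is active but the gradient function is smaller than $\tau$ in absolute value, so the point is not strongly active and only the sign condition $v \ge 0$ is imposed there. This is exactly the situation that distinguishes the $\tau$-critical cone from the set $\{v = 0 \text{ wherever a bound is active}\}$.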



For given $\tau > 0$, the set
\[
A_\tau(\bar u) = \{ (x,t) \in \Sigma : |p(x,t) + \psi_u(x,t,\bar y(x,t),\bar u(x,t))| > \tau \}
\]
is called the set of strongly active constraints for $\bar u$. The linearized problem reads
\[
\begin{aligned}
y_t - \Delta y + d_y(x,t,\bar y)\,y &= 0 && \text{in } Q\\
\partial_\nu y + b_y(x,t,\bar y)\,y &= u && \text{on } \Sigma\\
y(0) &= 0 && \text{in } \Omega,
\end{aligned}
\tag{5.47}
\]
and the $\tau$-critical cone $C_\tau(\bar u)$ consists of all $u \in L^\infty(\Sigma)$ such that
\[
u(x,t)\ \begin{cases}
= 0 & \text{if } (x,t) \in A_\tau(\bar u)\\
\ge 0 & \text{if } \bar u(x,t) = u_a \text{ and } (x,t) \notin A_\tau(\bar u)\\
\le 0 & \text{if } \bar u(x,t) = u_b \text{ and } (x,t) \notin A_\tau(\bar u).
\end{cases}
\tag{5.48}
\]
Again, we postulate the following as the second-order sufficient condition:
\[
\mathcal{L}''(\bar y,\bar u,p)(y,u)^2 \ge \delta\,\|u\|^2_{L^2(\Sigma)} \quad \text{for all } u \in C_\tau(\bar u) \text{ and } y \in W(0,T) \text{ satisfying (5.47)}.
\tag{5.49}
\]

Theorem 5.18. Suppose that Assumption 5.6 on page 269 holds, and assume that the pair $(\bar y,\bar u)$ complies with both the first-order necessary optimality conditions stated in Theorem 5.13 on page 283 and the constraints of the optimal boundary control problem. Moreover, let there be constants $\delta > 0$ and $\tau > 0$ such that the definiteness condition (5.49) is fulfilled. Then there exist constants $\varepsilon > 0$ and $\sigma > 0$ such that for every $u \in U_{ad}$ with $\|u - \bar u\|_{L^\infty(\Sigma)} \le \varepsilon$ and associated state $y$, the quadratic growth condition
\[
J(y,u) \ge J(\bar y,\bar u) + \sigma\,\|u - \bar u\|^2_{L^2(\Sigma)}
\]
holds. In particular, $\bar u$ is locally optimal.

5.7.4. A case without two-norm discrepancy. For parabolic problems also, the two-norm discrepancy does not occur if for $L^2$ controls the regularity of the state and the differentiability of the nonlinearities fit with each other. For instance, this is the case for nonlinearities that are quadratic in a certain sense so that the second-order remainder vanishes. Such an example will be given in Section 5.10.2 by considering control of the Navier-Stokes equations. Also for the phase field model with cubic nonlinearities to be studied in Section 5.10.1, a two-norm discrepancy does not occur.

In the case of problem (5.7)-(5.9) on page 270 with the very general nonlinearities $d(x,t,y)$ and $b(x,t,y)$, the two-norm discrepancy can be excluded for the following constellation: distributed control with a suitable cost functional and a one-dimensional spatial domain $\Omega = (0,\ell)$. We take as an example the problem
\[
\min \int_0^\ell \phi(x,y(x,T))\,dx + \int_0^T \bigl( \psi_1(t,y(0,t)) + \psi_2(t,y(\ell,t)) \bigr)\,dt + \int_0^T\!\!\int_0^\ell \varphi(x,t,y,v)\,dx\,dt
\]
subject to
\[
\begin{aligned}
y_t(x,t) - y_{xx}(x,t) + d(x,t,y(x,t)) &= v(x,t) && \text{in } (0,\ell)\times(0,T)\\
-y_x(0,t) + b_1(t,y(0,t)) &= 0 && \text{in } (0,T)\\
y_x(\ell,t) + b_2(t,y(\ell,t)) &= 0 && \text{in } (0,T)\\
y(x,0) &= y_0(x) && \text{in } (0,\ell)
\end{aligned}
\]
and
\[
v_a(x,t) \le v(x,t) \le v_b(x,t).
\]
So that the two-norm discrepancy does not come into play, the cost functional must be linear-quadratic with respect to $v$. We therefore postulate the following form for $\varphi$:
\[
\varphi(x,t,y,v) = \varphi_1(x,t,y) + \varphi_2(x,t,y)\,v + \lambda(x,t)\,v^2.
\tag{5.50}
\]
For $d = 0$ and $b_1 = b_2 = 0$, that is, for the linear parabolic problem, right-hand sides $v \in L^r(Q)$ of the differential equation are mapped into $W(0,T)\cap C(\bar Q)$, provided that $y_0$ is continuous and $r > N/2 + 1$. The latter holds for $N = 1$ and $r = 2$. Hence, with a slight modification of the proof of Theorem 5.15 on page 286, it can be shown that the control-to-state operator $G$ is twice continuously Fréchet differentiable as a mapping from $L^2(Q)$ into $W(0,T)\cap C(\bar Q)$. The linear operators used there are defined as $G_Q : L^2(Q) \to Y$, $G_\Sigma : L^\infty(\Sigma) \to Y$, and $G_0 : C(\bar\Omega) \to Y$.

Consequently, we may employ $L^2(Q)$ controls for distributed control problems in one-dimensional domains. Unfortunately, this is not possible for boundary control problems.

By modifying Assumption 5.6 on page 269 appropriately to suit the one-dimensional case considered here, the reader will be able to determine in Exercise 5.3 the regularity conditions that the functions $\phi$, $\varphi_i$, and $\psi_i$ must obey in order for the cost functional to be twice continuously Fréchet differentiable in $C(\bar Q)\times L^2(Q)$. In particular, $\lambda$ needs to be bounded and

measurable, and we have to postulate $\lambda(x,t) \ge \delta > 0$ for a sufficient second-order optimality condition to hold. The assertion of Theorem 5.17 then holds with $L^2(Q)$ in place of $L^\infty(Q)$.

5.8. Test examples

In this section, we are going to discuss two test examples for nonlinear parabolic control problems in which the constructed solution satisfies sufficient optimality conditions. For this purpose, we use the same approach as for elliptic problems: the optimal quantities $\bar u$ and $\bar y$ and the associated adjoint state $p$ are chosen a priori; then certain linear parts of the cost functional and constant terms in the initial-boundary value problem are fitted in such a way that both the optimality system and a second-order sufficient optimality condition are satisfied.

The first test example is constructed in such a way that the sufficient condition holds in the entire control-state space. In the second, more complicated example, the sufficient conditions will only be satisfied where the control constraints are not strongly active. In addition, a state constraint in the form of an integral inequality will be prescribed.

5.8.1. A test example with control constraints. Let $T > 0$, $\ell > 0$, and $Q = (0,\ell) \times (0,T)$. We consider the spatially one-dimensional optimal control problem

$$
\min J(y,u) := \frac12 \int_0^\ell |y(x,T) - y_\Omega(x)|^2\,dx - \int_0^T a_y(t)\,y(\ell,t)\,dt + \int_0^T \Big( a_u(t)\,u(t) + \frac{\lambda}{2}\,u(t)^2 \Big)\,dt, \tag{5.51}
$$

subject to

$$
\begin{aligned}
y_t(x,t) - y_{xx}(x,t) &= 0 && \text{in } (0,\ell) \times (0,T)\\
-y_x(0,t) &= 0 && \text{in } (0,T)\\
y_x(\ell,t) + y(\ell,t) &= b(t) + u(t) - \varphi(y(\ell,t)) && \text{in } (0,T)\\
y(x,0) &= a(x) && \text{in } (0,\ell)
\end{aligned} \tag{5.52}
$$

and

$$
0 \le u(t) \le 1 \quad \text{for a.e. } t \in (0,T). \tag{5.53}
$$

Here, the following quantities are prescribed:

$$
\ell = \frac{\pi}{4}, \quad T = 1, \quad \lambda = \frac{\sqrt2}{2}\,\big(e^{2/3} - e^{1/3}\big), \quad \varphi(y) = y\,|y|^3,
$$

$$
y_\Omega(x) = (e + e^{-1}) \cos(x), \quad a_y(t) = e^{-2t}, \quad a_u(t) = \frac{\sqrt2}{2}\,e^{1/3},
$$

$$
a(x) = \cos(x), \quad b(t) = \frac14\,e^{-4t} - \min\Big\{1, \max\Big\{0, \frac{e^{t} - e^{1/3}}{e^{2/3} - e^{1/3}}\Big\}\Big\}.
$$

Obviously, this is an optimal control problem for the one-dimensional heat equation with boundary condition of Stefan-Boltzmann type. The reader will have the opportunity in Exercise 5.4 to verify that the following triple of functions satisfies the first-order necessary optimality conditions:

$$
\bar u(t) = \min\Big\{1, \max\Big\{0, \frac{e^{t} - e^{1/3}}{e^{2/3} - e^{1/3}}\Big\}\Big\}, \quad \bar y(x,t) = e^{-t}\cos(x), \quad p(x,t) = -e^{t}\cos(x).
$$

Here, the adjoint state $p$ solves the adjoint system

$$
\begin{aligned}
-p_t(x,t) - p_{xx}(x,t) &= 0\\
p_x(0,t) &= 0\\
p_x(\ell,t) + \big[1 + \varphi'(\bar y(\ell,t))\big]\,p(\ell,t) &= -a_y(t)\\
p(x,T) &= \bar y(x,T) - y_\Omega(x).
\end{aligned} \tag{5.54}
$$

In particular, the projection relation for $\bar u$ has to be verified. For further details, we refer the interested reader to the paper [ART02], from which this example was taken.

The basic idea for the construction of $\bar u$ is the following: the graph of an interesting optimal control should reach both the upper and the lower bounds, connecting them smoothly. In view of the projection relation, it has to be a multiple of the adjoint state $p$ between the bounds. Moreover, exponential functions of time are very good candidates for solving the heat conduction equation explicitly, especially for $p$. These considerations, and an appropriate adjustment of constants, lead to the above choice of the function $\bar u$, which is depicted on page 312.

To prove local optimality, we show that a second-order sufficient optimality condition is fulfilled. The (formal) Lagrangian function reads

$$
\begin{aligned}
\mathcal{L} ={}& J(y,u) - \int_0^T\!\!\int_0^\ell (y_t - y_{xx})\,p\,dx\,dt - \int_0^\ell \big(y(x,0) - a(x)\big)\,p(x,0)\,dx\\
&+ \int_0^T y_x(0,t)\,p(0,t)\,dt - \int_0^T \big(y_x(\ell,t) + y(\ell,t) - b(t) - u(t)\big)\,p(\ell,t)\,dt\\
&- \int_0^T \varphi(y(\ell,t))\,p(\ell,t)\,dt,
\end{aligned}
$$

which is a special case of the Lagrangian for more general boundary control problems that was introduced on page 295. Since $\Gamma = \{0, \ell\}$, we have

$$
\iint_\Sigma p\,\partial_\nu y\,ds\,dt = \int_0^T \big( p(0,t)\,\partial_\nu y(0,t) + p(\ell,t)\,\partial_\nu y(\ell,t) \big)\,dt
= \int_0^T \big( -p(0,t)\,y_x(0,t) + p(\ell,t)\,y_x(\ell,t) \big)\,dt.
$$

Moreover, since $\varphi''(y) = 12\,y^2$,

$$
\mathcal{L}''(\bar y, \bar u, p)(y,u)^2 = \|y(T)\|^2_{L^2(0,\ell)} + \lambda\,\|u\|^2_{L^2(0,T)} - 12 \int_0^T p(\ell,t)\,\bar y(\ell,t)^2\,y(\ell,t)^2\,dt
$$

and thus, since $p(x,t) \le 0$,

$$
\mathcal{L}''(\bar y, \bar u, p)(y,u)^2 \ge \lambda\,\|u\|^2_{L^2(0,T)} \tag{5.55}
$$

for all square integrable functions $y$ and $u$. Theorem 5.18 yields the local optimality of $\bar u$ in the sense of $L^\infty(0,T)$. Observe, however, that a very strong form of the second-order sufficient conditions holds; we therefore should not be too surprised at the following result.

Theorem 5.19. The pair $(\bar y, \bar u)$ defined above is (globally) optimal for the optimal control problem (5.51)-(5.53).

Proof: Let $(y,u)$ be another admissible pair. Invoking the variational inequality satisfied by the local minimizer $\bar u$ of the reduced functional $f$, we find that, with some function $\theta = \theta(x) \in (0,1)$,

$$
f(u) \ge f(\bar u) + \frac12\,f''\big(\bar u + \theta (u - \bar u)\big)(u - \bar u)^2.
$$

As in formula (5.44) on page 292, we have

$$
f''\big(\bar u + \theta(u - \bar u)\big)(u - \bar u)^2 = \mathcal{L}''(y_\theta, u_\theta, p_\theta)(y, u - \bar u)^2
\ge - \int_0^T p_\theta(\ell,t)\,\varphi''(y_\theta(\ell,t))\,y(\ell,t)^2\,dt.
$$

Here, $u_\theta = \bar u + \theta(u - \bar u)$, $y_\theta$ is the associated state, $p_\theta$ is the adjoint state, and $y$ denotes the solution to the linearized problem at $y_\theta$ corresponding to the control $u - \bar u$. Since $\varphi''$ is nonnegative, the assertion $f(u) \ge f(\bar u)$ will follow once we are able to show that all possible adjoint states $p_\theta$ are nonpositive.

We sketch the proof of this claim. To this end, we first show the following claim: for any $u_\theta \in U_{ad}$, the associated state $y_\theta$ satisfies the inequality

$$
y_\theta(x,T) - y_\Omega(x) < 0 \quad \forall x \in [0,\ell].
$$

Indeed, we immediately see that $y_\Omega \ge (e + e^{-1}) \cos(\pi/4) > 3\sqrt2/2 > 2$, $b(t) + u_\theta(t) \le 1.25$, and $y_0(x) = \cos(x) \le 1$. Moreover, since all the given data of the problem are nonnegative, it follows from standard comparison principles for parabolic problems, as in [RZ99], that the state $y_\theta$ is nonnegative. Hence, $y_\theta\,|y_\theta|^3 = y_\theta^4 \ge 0$. But then

$$
(y_{\theta,x} + y_\theta)(\ell,t) = b(t) + u(t) - y_\theta^4(\ell,t) \le 1.25.
$$

By the maximum principle, the maximum of the linear parabolic initial-boundary value problem with boundary condition $y_{\theta,x} + y_\theta = 1.25$ can be attained only at the boundary (i.e., at $x \in \{0,\ell\}$) or at $t = 0$. Consequently, we must have $y_\theta \le \max\{1, 1.25\} = 1.25$, and thus $y_\theta(x,T) - y_\Omega \le 1.25 - 2 < 0$, which proves the claim.

Hence, if we replace $\bar y$ by $y_\theta$ in the adjoint problem (5.54), then all four right-hand sides are nonpositive. Moreover, we have $1 + \varphi'(y_\theta) \ge 0$ in the boundary condition. It then follows from the aforementioned comparison principles for parabolic problems that $p_\theta$ is in fact nonpositive. This concludes the proof of the assertion.

5.8.2. A problem with a state constraint of integral form *. In [MT02], the following optimal control problem was investigated numerically with respect to second-order sufficient conditions:

$$
\min J(y,u) := \frac12 \iint_Q \alpha(x,t)\,|y(x,t) - y_Q(x,t)|^2\,dx\,dt + \frac{\lambda}{2} \int_0^T |u(t)|^2\,dt + \int_0^T \big( a_y(t)\,y(\ell,t) + a_u(t)\,u(t) \big)\,dt, \tag{5.56}
$$

subject to

$$
\begin{aligned}
y_t(x,t) - y_{xx}(x,t) &= e_Q && \text{in } Q\\
y_x(0,t) &= 0 && \text{in } (0,T)\\
y_x(\ell,t) + y(\ell,t)^2 &= e_\Sigma(t) + u(t) && \text{in } (0,T)\\
y(x,0) &= 0 && \text{in } (0,\ell)
\end{aligned} \tag{5.57}
$$

and the control and state constraints

$$
0 \le u(t) \le 1 \quad \text{for a.e. } t \in (0,T), \tag{5.58}
$$

$$
\iint_Q y(x,t)\,dx\,dt \le 0. \tag{5.59}
$$

Here, $T > 0$, $\ell > 0$, and $\lambda > 0$ are given, and we have put $Q := (0,\ell) \times (0,T)$. The functions $\alpha, y_Q, e_Q \in L^\infty(Q)$ and $a_y, a_u, e_\Sigma \in L^\infty(0,T)$ are prescribed. As before,

$$
U_{ad} = \{ u \in L^\infty(0,T) : 0 \le u(t) \le 1 \ \text{for a.e. } t \in (0,T) \}
$$

denotes the set of admissible controls. Now observe that the nonlinearity $y \mapsto y^2$ is not monotone increasing. We could circumvent this difficulty by considering the function $y \mapsto y\,|y|$ instead, in order to guarantee well-posedness of the state $y$; but this would be immaterial, since we will construct a nonnegative solution anyway.

First-order necessary conditions. Suppose that $\bar u$ is a locally optimal control for the above problem, and let $\bar y$ be the associated state. In addition to the control constraints, a state constraint is also to be respected. Therefore, the necessary optimality conditions derived so far do not apply directly. In the next chapter, we will establish the so-called Karush-Kuhn-Tucker conditions, which are valid under a certain regularity assumption. The test solution is constructed in such a way that these assumptions hold. Hence, it follows from Theorem 6.3 on page 330 that there exist an adjoint state $p \in W(0,T) \cap C(\bar Q)$ and a nonnegative Lagrange multiplier $\mu \in \mathbb{R}$ satisfying the adjoint problem

$$
\begin{aligned}
-p_t - p_{xx} &= \alpha\,(\bar y - y_Q) + \mu && \text{in } Q\\
p_x(0,t) &= 0 && \text{in } (0,T)\\
p_x(\ell,t) + 2\,\bar y(\ell,t)\,p(\ell,t) &= a_y(t) && \text{in } (0,T)\\
p(x,T) &= 0 && \text{in } (0,\ell)
\end{aligned} \tag{5.60}
$$

and the variational inequality

$$
\int_0^T \big( \lambda\,\bar u(t) + p(\ell,t) + a_u(t) \big)\,\big( u(t) - \bar u(t) \big)\,dt \ge 0 \quad \forall u \in U_{ad}. \tag{5.61}
$$

In addition, the complementary slackness condition

$$
\mu \iint_Q \bar y(x,t)\,dx\,dt = 0 \tag{5.62}
$$

must hold; see Casas and Mateos [CM02a]. The result also follows from Theorem 6.3 on page 330. Note that the variational inequality (5.61) is equivalent to the projection relation

$$
\bar u(t) = \mathbb{P}_{[0,1]} \Big\{ -\frac{1}{\lambda}\,\big( p(\ell,t) + a_u(t) \big) \Big\} \quad \text{for a.e. } t \in (0,T), \tag{5.63}
$$

where $\mathbb{P}_{[0,1]} : \mathbb{R} \to [0,1]$ denotes pointwise projection onto the interval $[0,1]$.

These optimality conditions can also be obtained using the formal Lagrange method. In this case the Lagrangian function reads

$$
\begin{aligned}
\mathcal{L}(y,u,p,\mu) ={}& J(y,u) - \iint_Q (y_t - y_{xx} - e_Q)\,p\,dx\,dt + \mu \iint_Q y(x,t)\,dx\,dt\\
&- \int_0^T \big( y_x(\ell,t) + y(\ell,t)^2 - u(t) - e_\Sigma(t) \big)\,p(\ell,t)\,dt,
\end{aligned}
$$

where we assume that the homogeneous initial and boundary conditions for $y$ are incorporated in the choice of state space. Then the conditions (5.60)-(5.61) follow from the requirements that $D_y\mathcal{L}(\bar y, \bar u, p, \mu)\,y = 0$ for all admissible increments $y$ and $D_u\mathcal{L}(\bar y, \bar u, p, \mu)(u - \bar u) \ge 0$ for all $u \in U_{ad}$.

For the construction of the test example, we fix the data as follows:

$$
T = 1, \quad \ell = \pi, \quad \lambda = 0.004, \quad a_u(t) = \lambda + 1 - (1 + 2\lambda)\,t,
$$

$$
\alpha(x,t) = \begin{cases} \alpha_0 \in \mathbb{R}, & t \in [0, \tfrac14]\\ 1, & t \in (\tfrac14, 1], \end{cases}
$$

$$
y_Q(x,t) = \begin{cases}
\dfrac{1}{\alpha(x,t)}\,\big( 1 - (2 - t)\cos(x) \big), & t \in [0, \tfrac12]\\[2mm]
\dfrac{1}{\alpha(x,t)}\,\big( 1 - (2 - t - \alpha(x,t)\,(t - 0.5)^2)\cos(x) \big), & t \in (\tfrac12, 1],
\end{cases}
$$

$$
a_y(t) = \begin{cases} 0, & t \in [0, \tfrac12]\\ 2\,(t - 0.5)^2\,(1 - t), & t \in (\tfrac12, 1], \end{cases}
$$

$$
e_Q(x,t) = \begin{cases} 0, & t \in [0, \tfrac12]\\ (t^2 + t - 0.75)\cos(x), & t \in (\tfrac12, 1], \end{cases}
$$

$$
e_\Sigma(t) = \begin{cases} 0, & t \in [0, \tfrac12]\\ (t - 0.5)^4 - (2t - 1), & t \in (\tfrac12, 1]. \end{cases}
$$

Theorem 5.20. With the above data, the quantities

$$
\bar u(t) = \max\{0, 2t - 1\}, \quad
\bar y(x,t) = \begin{cases} 0, & t \in [0, \tfrac12]\\ (t - \tfrac12)^2 \cos(x), & t \in (\tfrac12, 1], \end{cases} \quad
p(x,t) = (1 - t)\cos(x), \quad \mu = 1
$$

satisfy the first-order necessary optimality conditions for the problem (5.56)-(5.59).

Proof: Evidently, the state and adjoint problems are satisfied. Since the integral of the cosine function over $[0,\pi]$ vanishes, the constraint (5.59) is active, which entails that the complementarity condition (5.62) is valid. Moreover, $\bar u$ is an admissible control. It remains to verify the variational inequality (5.61). Invoking (5.63), we find that

$$
-\frac{1}{\lambda}\,\big( p(\pi,t) + a_u(t) \big) = 2t - 1 \ \begin{cases} < 0, & t \in [0, \tfrac12)\\ > 0, & t \in (\tfrac12, 1]. \end{cases}
$$

Therefore,

$$
\mathbb{P}_{[0,1]} \Big\{ -\frac{1}{\lambda}\,\big( p(\pi,t) + a_u(t) \big) \Big\} = \max\{0, 2t - 1\} = \bar u(t).
$$

The assertion now follows from the equivalence between the variational inequality and the projection relation. □

Second-order sufficient optimality conditions. In order to incorporate strongly active constraints, we define, for given $\tau > 0$,

$$
A_\tau(\bar u) = \{ t \in (0,T) : |\lambda\,\bar u(t) + p(\ell,t) + a_u(t)| \ge \tau \}.
$$

Owing to the variational inequalities, either $\bar u(t) = 0$ or $\bar u(t) = 1$ must hold on $A_\tau(\bar u)$, depending on the sign of $\lambda\,\bar u(t) + p(\ell,t) + a_u(t)$. The second derivative $\mathcal{L}''$ is given by

$$
\mathcal{L}''(\bar y, \bar u, p, \mu)(y,u)^2 = \iint_Q \alpha\,y^2\,dx\,dt + \int_0^T \big( -2\,p(\ell,t)\,y(\ell,t)^2 + \lambda\,u(t)^2 \big)\,dt. \tag{5.64}
$$

We also assume that the state constraint (5.59) is active. Otherwise, this constraint is meaningless, and the sufficient conditions coincide with those for pure control constraints. The second-order sufficient optimality condition reads as follows: there exist constants $\delta > 0$ and $\tau > 0$ such that

$$
\mathcal{L}''(\bar y, \bar u, p, \mu)(y,u)^2 \ge \delta \int_0^T u^2\,dt \tag{5.65}
$$

for all $y \in W(0,T)$ and $u \in L^2(0,T)$ satisfying the following conditions:

$$
\begin{aligned}
y_t - y_{xx} &= 0\\
y_x(0,t) &= 0\\
y_x(\ell,t) + 2\,\bar y(\ell,t)\,y(\ell,t) &= u(t)\\
y(x,0) &= 0,
\end{aligned} \tag{5.66}
$$

$$
u(t)\;\begin{cases}
= 0 & \text{if } t \in A_\tau(\bar u)\\
\ge 0 & \text{if } \bar u(t) = 0 \text{ and } t \notin A_\tau(\bar u)\\
\le 0 & \text{if } \bar u(t) = 1 \text{ and } t \notin A_\tau(\bar u),
\end{cases} \tag{5.67}
$$

$$
\iint_Q y(x,t)\,dx\,dt = 0.
$$

Sufficient conditions in the test example. Let us verify that our test solution $(\bar y, \bar u)$ satisfies the above conditions. In fact, by construction $\bar u$ attains the lower bound 0 on $[0, \tfrac12)$; the constraint is strongly active there, since for $b < \tfrac12$ and $t \in [0,b]$ we have

$$
\lambda\,\bar u(t) + p(\pi,t) + a_u(t) = p(\pi,t) + a_u(t) = -\lambda\,(2t - 1) \ge -\lambda\,(2b - 1).
$$

Therefore, $[0,b] \subset A_\tau(\bar u)$ for $\tau = \lambda\,|2b - 1|$.

For the verification of the second-order sufficient conditions it thus suffices to show the definiteness condition (5.65) for all $(y,u)$ such that the linearized problem (5.66) is satisfied and such that $u = 0$ on $[0,b]$ for some arbitrary but fixed $b \in (0, \tfrac12)$. We are still free to choose $\alpha$. We put

$$
\alpha(x,t) = \begin{cases} \alpha_0, & 0 \le t \le b\\ 1, & b < t \le 1. \end{cases} \tag{5.68}
$$

Theorem 5.21. Let $\alpha$ be chosen as in (5.68) with some $b \in [0, \tfrac12)$. Then the second-order sufficient condition is satisfied for any $\alpha_0 \in \mathbb{R}$ by $(\bar y, \bar u, p, \mu)$.

Proof: We choose $(y,u)$ in such a way that $u$ vanishes on $[0,b]$ and $y$ solves the initial-boundary value problem (5.66). Then $y(x,t) = 0$ on $[0,b]$. Therefore

$$
\begin{aligned}
\mathcal{L}''(\bar y, \bar u, p, \mu)(y,u)^2
&= \int_b^1\!\!\int_0^\pi |y|^2\,dx\,dt + \lambda \int_0^1 |u|^2\,dt - 2 \int_0^1 p(\pi,t)\,|y(\pi,t)|^2\,dt\\
&\ge \lambda \int_0^1 |u|^2\,dt - 2 \int_0^1 \big( -(1 - t) \big)\,|y(\pi,t)|^2\,dt \ \ge\ \lambda \int_0^1 |u|^2\,dt,
\end{aligned} \tag{5.69}
$$

whence the definiteness condition (5.65) follows. □

Observe that $\alpha_0$ has not been assumed positive. Obviously, for $\alpha_0 > 0$, $\mathcal{L}''$ would be uniformly positive definite on the entire space $W(0,1) \times L^2(0,1)$; hence, in this case the definiteness condition would be met in a very strong sense. We now choose a partially negative $\alpha$ so that $\mathcal{L}''$ becomes indefinite.

Theorem 5.22. For any sufficiently large negative $\alpha_0$ there exists a pair $(y,u)$ with the following properties: $u \ge 0$, $y$ solves the linearized problem (5.66), and

$$
\mathcal{L}''(\bar y, \bar u, p, \mu)(y,u)^2 < 0. \tag{5.70}
$$

Proof: We fix $b \in (0, \tfrac12)$ and set

$$
u(t) = \begin{cases} 1 & \text{in } [0,b]\\ 0 & \text{in } (b,1]. \end{cases}
$$

Then $y$ cannot vanish identically, so $\int_0^b\!\int_0^\pi |y|^2\,dx\,dt > 0$. Therefore,

$$
\alpha_0 \int_0^b\!\!\int_0^\pi |y|^2\,dx\,dt \to -\infty \tag{5.71}
$$

as $\alpha_0 \to -\infty$. Consequently, the expression (5.69) becomes negative for sufficiently large negative $\alpha_0$, since the additional integral term (5.71) occurs. □

For numerical purposes, one needs a rough estimate of how small $\alpha_0$ has to be chosen so as to guarantee that $\mathcal{L}''(\bar y, \bar u, p, \mu)(y,u)^2 < 0$. We must have

$$
\alpha_0 \int_0^b\!\!\int_0^\pi y^2\,dx\,dt + \int_b^1\!\!\int_0^\pi y^2\,dx\,dt + 2 \int_0^1 (1 - t)\,y^2(\pi,t)\,dt + \lambda \int_0^1 u^2\,dt < 0,
$$

and therefore we postulate that

$$
|\alpha_0| > \frac{\int_b^1\!\int_0^\pi y^2\,dx\,dt + \int_0^1 2\,(1 - t)\,y^2(\pi,t)\,dt + \lambda \int_0^1 u^2\,dt}{\int_0^b\!\int_0^\pi y^2\,dx\,dt} = \frac{I_1 + I_2 + I_3}{I_0}. \tag{5.72}
$$

Here, $b \in [0, \tfrac12)$ can be chosen arbitrarily, say $b = \tfrac14$. We evaluate the integrals $I_j$ with this choice and with

$$
u(t) = \begin{cases} 1 & \text{in } [0, \tfrac14]\\ 0 & \text{in } (\tfrac14, 1]. \end{cases}
$$

The corresponding state $y$ solves (5.66), the homogeneous heat conduction equation with zero initial condition, zero boundary condition at $x = 0$, and

$$
y_x(\pi,t) = \begin{cases} 1 & \text{in } [0, \tfrac14]\\ -2\,\bar y(\pi,t)\,y(\pi,t) & \text{in } (\tfrac14, 1]. \end{cases}
$$

A numerical evaluation of the integrals $I_j$, $0 \le j \le 3$, yields $I_0 = 0.0103271$, $I_1 = 0.0401844$, $I_2 = 0.0708107$, $I_3 = 0.001$, and thus

$$
\frac{I_1 + I_2 + I_3}{I_0} = 10.845.
$$

In this example, the second-order sufficient conditions are fulfilled, even though the bilinear form $\mathcal{L}''(\bar y, \bar u, p, \mu)$ is not positive definite in the entire space; it is negative definite in some sets where the control constraints are strongly active. This has been enforced by an analytical construction.

The question arises as to whether the second-order conditions can be verified numerically, because an explicit solution of the problem is usually impossible. Such a numerical solution of the problem has to be found by means of a discretization, for instance by using finite differences in time and finite elements in space.

For the discretized problem, the so-called reduced Hessian matrix can be determined. This is the Hessian of the cost functional taken with respect to the discrete control $u$, after $y$ has been eliminated using the discrete linearized problem. It also accounts for the active constraints; see Kelley [Kel99]. If the eigenvalues of the Hessian are sufficiently positive, then one can trust that the sufficient conditions are fulfilled, but this is no proof.

The above method has been tested numerically in [MT02].
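The arithmetic behind the threshold (5.72) and the projection identity used for Theorem 5.20 can be reproduced in a few lines. This is only a sanity check under the data as read from the text ($\lambda = 0.004$, $a_u(t) = \lambda + 1 - (1 + 2\lambda)t$, $p(\pi,t) = -(1 - t)$, and the quoted values of $I_0, \dots, I_3$); the integrals themselves are taken from the text and are not recomputed here.

```python
import math

LAM = 0.004                       # Tikhonov parameter of the test example


def u_bar(t):
    return max(0.0, 2.0 * t - 1.0)


def p_boundary(t):
    """Adjoint state p(x, t) = (1 - t) cos(x) evaluated at x = pi."""
    return (1.0 - t) * math.cos(math.pi)


def a_u(t):
    return LAM + 1.0 - (1.0 + 2.0 * LAM) * t


def projected(t):
    """Right-hand side of the projection relation (5.63)."""
    return min(1.0, max(0.0, -(p_boundary(t) + a_u(t)) / LAM))


# Largest deviation between u_bar and its projection formula on a grid.
projection_gap = max(abs(projected(k / 200) - u_bar(k / 200)) for k in range(201))

# Integrals I_0, ..., I_3 as quoted in the text for b = 1/4.
I0, I1, I2, I3 = 0.0103271, 0.0401844, 0.0708107, 0.001
ratio = (I1 + I2 + I3) / I0       # lower bound for |alpha_0| in (5.72)

print(f"projection gap: {projection_gap:.1e}, threshold: {ratio:.3f}")
```

Note that $I_3 = \lambda \int_0^1 u^2\,dt = 0.004 \cdot \tfrac14$, which agrees with the quoted value $0.001$ and supports reading the switching time as $b = \tfrac14$.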
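The reduced-Hessian test mentioned at the end of Section 5.8 can be illustrated on a deliberately simple surrogate. The sketch below is not the discretization used in [MT02]: it replaces the heat equation by a hypothetical integrator model $y_{k} = \tau \sum_{j<k} u_j$, keeps the weight $\alpha = \alpha_0 < 0$ on $[0,b]$ and $\alpha = 1$ afterwards, and compares the smallest eigenvalue of the full Hessian with that of the Hessian restricted to the controls that are not fixed by strongly active constraints. All names and parameter values are assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy setting: n time steps on [0, 1], switching time b = 1/4.
n, lam, alpha0 = 40, 0.004, -100.0
tau = 1.0 / n
k_b = n // 4                                   # grid index of t = b

# Toy linearized control-to-state map: y_{i+1} = tau * (u_0 + ... + u_i).
S = tau * np.tril(np.ones((n, n)))

# Weight alpha at the state times t_1, ..., t_n (alpha0 up to t = b, then 1).
alpha = np.where(np.arange(1, n + 1) <= k_b, alpha0, 1.0)

# Hessian of 0.5 * tau * sum(alpha * y**2) + 0.5 * lam * tau * sum(u**2)
# with respect to u, after eliminating y = S u.
H = tau * S.T @ (alpha[:, None] * S) + lam * tau * np.eye(n)

# Controls on [0, b] are fixed by strongly active constraints; only the
# remaining components enter the reduced Hessian.
free = np.arange(n) >= k_b
H_reduced = H[np.ix_(free, free)]

min_eig_full = np.linalg.eigvalsh(H).min()
min_eig_reduced = np.linalg.eigvalsh(H_reduced).min()
print(f"full: {min_eig_full:.3e}, reduced: {min_eig_reduced:.3e}")
```

Because the surrogate is causal, controls that vanish on $[0,b]$ produce states that vanish there as well, so the negative weight never enters the reduced Hessian. This mirrors the mechanism of Theorems 5.21 and 5.22, but positive eigenvalues of such a matrix remain numerical evidence only, as the text points out.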

5.9. Numerical methods

The numerical methods presented in this section are concerned with the optimal control problem (5.7)-(5.9):

$$
\min J(y,v,u) := \int_\Omega \phi(x, y(x,T))\,dx + \iint_Q \varphi(x,t,y(x,t),v(x,t))\,dx\,dt + \iint_\Sigma \psi(x,t,y(x,t),u(x,t))\,ds\,dt,
$$

subject to

$$
\begin{aligned}
y_t - \Delta y + d(x,t,y) &= v && \text{in } Q\\
\partial_\nu y + b(x,t,y) &= u && \text{on } \Sigma\\
y(0) &= y_0 && \text{in } \Omega
\end{aligned}
$$

and

$$
\begin{aligned}
v_a(x,t) \le v(x,t) \le v_b(x,t) &\quad \text{for a.e. } (x,t) \in Q\\
u_a(x,t) \le u(x,t) \le u_b(x,t) &\quad \text{for a.e. } (x,t) \in \Sigma.
\end{aligned}
$$

We present the methods only formally, assuming that all of the functions and data involved are so smooth that the expressions that arise are meaningful.

5.9.1. Gradient methods. We begin with the projected gradient method introduced in Chapter 2. To this end, we consider the reduced cost functional $f(v,u) = J(y(v,u), v, u)$. The Fréchet derivative of $f$ at $(v_n, u_n)$, with associated adjoint state $p_n$, is given by

$$
f'(v_n,u_n)(v,u) = \iint_Q \big( \varphi_v(x,t,y_n,v_n) + p_n \big)\,v\,dx\,dt + \iint_\Sigma \big( \psi_u(x,t,y_n,u_n) + p_n \big)\,u\,ds\,dt.
$$

Here, the adjoint state $p_n$ is the solution to the initial-boundary value problem

$$
\begin{aligned}
-p_t - \Delta p + d_y(x,t,y_n)\,p &= \varphi_y(x,t,y_n,v_n)\\
\partial_\nu p + b_y(x,t,y_n)\,p &= \psi_y(x,t,y_n,u_n)\\
p(x,T) &= \phi_y(x, y_n(x,T)).
\end{aligned} \tag{5.73}
$$

Suppose now that the controls $(v_j, u_j)$, $1 \le j \le n$, have already been calculated. Then the next iterate $(v_{n+1}, u_{n+1})$ is determined as follows:

S1 Find the state $y_n = y(v_n, u_n)$ by solving the initial-boundary value problem

$$
\begin{aligned}
y_t - \Delta y + d(x,t,y) &= v_n\\
\partial_\nu y + b(x,t,y) &= u_n\\
y(0) &= y_0.
\end{aligned}
$$

S2 Determine the adjoint state $p_n$ by solving the adjoint problem (5.73).

S3 (Descent directions) Put

$$
h_n := -\big( \varphi_v(\cdot, y_n, v_n) + p_n \big), \quad r_n := -\big( \psi_u(\cdot, y_n|_\Sigma, u_n) + p_n|_\Sigma \big).
$$

S4 (Step size control) Determine $s_n$ from

$$
f\big( \mathbb{P}_V(v_n + s_n h_n), \mathbb{P}_U(u_n + s_n r_n) \big) = \min_{s > 0} f\big( \mathbb{P}_V(v_n + s\,h_n), \mathbb{P}_U(u_n + s\,r_n) \big).
$$

Here, $\mathbb{P}_V$ and $\mathbb{P}_U$ denote the pointwise projections onto the admissible sets $V_{ad}$ and $U_{ad}$, respectively, that is, $\mathbb{P}_V = \mathbb{P}_{[v_a, v_b]}$ and $\mathbb{P}_U = \mathbb{P}_{[u_a, u_b]}$.

S5 (New controls) Set

$$
(v_{n+1}, u_{n+1}) := \big( \mathbb{P}_V(v_n + s_n h_n), \mathbb{P}_U(u_n + s_n r_n) \big), \quad n := n + 1, \quad \text{GO TO S1}.
$$

Remark. In contrast to the corresponding method for semilinear elliptic problems, the nonlinear problem in step S1 is comparatively easy to solve numerically. Here, semi-explicit methods can be employed: to this end, take an equidistant subdivision $t_i = i\,\tau$, with $i = 0, \dots, k$ and $\tau = T/k$, of the time interval $[0,T]$, and denote the approximation of $y_n(x,t_i)$ to be found by $w_i(x)$. Then $w_{i+1}(x)$ is determined as the solution to the linear elliptic boundary value problem

$$
\begin{aligned}
w_{i+1} - \tau\,\Delta w_{i+1} &= w_i - \tau\,d(x, t_{i+1}, w_i) + \tau\,v_n(\cdot, t_{i+1}) && \text{in } \Omega\\
\partial_\nu w_{i+1} &= -b(x, t_{i+1}, w_i) + u_n(\cdot, t_{i+1}) && \text{on } \Gamma,
\end{aligned}
$$

for $i = 0, \dots, k - 1$. In other words, $k$ linear elliptic boundary value problems have to be solved numerically.

5.9.2. The SQP method. The SQP method follows the same lines as in the elliptic case. Therefore, the individual steps are not explained in further detail. We again consider the optimal control problem (5.7)-(5.9) studied above.

Let the iterates $v_n$, $u_n$, $y_n$, and $p_n$ be determined. In the next iteration step, we have to solve the quadratic optimal control problem

$$
\min \tilde J(y,v,u) := J'(y_n,v_n,u_n)(y - y_n, v - v_n, u - u_n) + \frac12\,\mathcal{L}''(y_n,v_n,u_n,p_n)(y - y_n, v - v_n, u - u_n)^2 \tag{5.74}
$$

subject to

$$
\begin{aligned}
y_t - \Delta y + d(x,t,y_n) + d_y(x,t,y_n)\,(y - y_n) &= v && \text{in } Q\\
\partial_\nu y + b(x,t,y_n) + b_y(x,t,y_n)\,(y - y_n) &= u && \text{on } \Sigma\\
y(0) &= y_0 && \text{in } \Omega
\end{aligned} \tag{5.75}
$$

and

$$
\begin{aligned}
v_a(x,t) \le v(x,t) \le v_b(x,t) &\quad \text{for a.e. } (x,t) \in Q\\
u_a(x,t) \le u(x,t) \le u_b(x,t) &\quad \text{for a.e. } (x,t) \in \Sigma.
\end{aligned} \tag{5.76}
$$

For brevity, the Lagrangian is written down only formally:

$$
\mathcal{L} = J - \iint_Q \big( y_t - \Delta y + d(x,t,y) - v \big)\,p\,dx\,dt - \iint_\Sigma \big( \partial_\nu y + b(x,t,y) - u \big)\,p\,ds\,dt.
$$

Note that after calculation of the second derivative, the terms $y_t$, $-\Delta y$ and $\partial_\nu y$ that are only formally defined no longer appear anyway. A tedious but straightforward calculation yields for the quadratic cost functional $\tilde J$ the expression

$$
\begin{aligned}
\tilde J(y,v,u) ={}& \int_\Omega \Big\{ \phi_y(x, y_n(\cdot,T))\,\big( y(\cdot,T) - y_n(\cdot,T) \big) + \frac12\,\phi_{yy}(x, y_n(\cdot,T))\,|y(\cdot,T) - y_n(\cdot,T)|^2 \Big\}\,dx\\
&+ \iint_Q \Big\{ \begin{bmatrix} \varphi_y\\ \varphi_v \end{bmatrix} \cdot \begin{bmatrix} y - y_n\\ v - v_n \end{bmatrix}
 + \frac12 \begin{bmatrix} y - y_n\\ v - v_n \end{bmatrix}^{T} \begin{bmatrix} \varphi_{yy} & \varphi_{yv}\\ \varphi_{vy} & \varphi_{vv} \end{bmatrix} \begin{bmatrix} y - y_n\\ v - v_n \end{bmatrix} \Big\}\,dx\,dt\\
&+ \iint_\Sigma \Big\{ \begin{bmatrix} \psi_y\\ \psi_u \end{bmatrix} \cdot \begin{bmatrix} y - y_n\\ u - u_n \end{bmatrix}
 + \frac12 \begin{bmatrix} y - y_n\\ u - u_n \end{bmatrix}^{T} \begin{bmatrix} \psi_{yy} & \psi_{yu}\\ \psi_{uy} & \psi_{uu} \end{bmatrix} \begin{bmatrix} y - y_n\\ u - u_n \end{bmatrix} \Big\}\,ds\,dt\\
&- \frac12 \iint_Q p_n\,d_{yy}(x,t,y_n)\,(y - y_n)^2\,dx\,dt - \frac12 \iint_\Sigma p_n\,b_{yy}(x,t,y_n)\,(y - y_n)^2\,ds\,dt,
\end{aligned}
$$

where the first- and second-order partial derivatives of $\varphi$ and $\psi$ also depend on the variables $(x,t,y_n,v_n)$ and $(x,t,y_n,u_n)$, respectively.

The solution of the quadratic problem (5.74)-(5.76) yields the new iterates $(y_{n+1}, v_{n+1}, u_{n+1})$. These functions are substituted into the adjoint problem, which is solved for $p = p_{n+1}$. Here, the derivative of $\tilde J$ with respect to $y$ occurs on the right-hand side, in accordance with the respective domains of integration; the adjoint system therefore reads

$$
\begin{aligned}
-p_t - \Delta p + d_y(x,t,y_n)\,p ={}& \varphi_y(x,t,y_n,v_n) + \varphi_{yy}(x,t,y_n,v_n)\,(y - y_n) - p_n\,d_{yy}(x,t,y_n)\,(y - y_n)\\
\partial_\nu p + b_y(x,t,y_n)\,p ={}& \psi_y(x,t,y_n,u_n) + \psi_{yy}(x,t,y_n,u_n)\,(y - y_n) - p_n\,b_{yy}(x,t,y_n)\,(y - y_n)\\
p(T) ={}& \phi_y(x, y_n(\cdot,T)) + \phi_{yy}(x, y_n(\cdot,T))\,\big( y(\cdot,T) - y_n(\cdot,T) \big).
\end{aligned}
$$

In the practical implementation of the numerical method, this adjoint state is often obtained as a byproduct, so that it is not necessary to solve the above problem. The SQP method described above converges locally quadratically to a local minimizer in the sense of the $L^\infty$ norms of $y$, $v$, $u$, and $p$, provided the second-order sufficient optimality conditions are fulfilled; see [Trö99] and the references given there.

Example. The test example (5.51)-(5.53) on page 298 was solved with the initial guesses

$$
u_0(t) = y_0(x,t) = p_0(x,t) \equiv 0.5.
$$

The numerical solution of the parabolic problem was performed by an implicit finite difference scheme using a second-order approximation of the boundary condition. The step sizes for time and space were both chosen to be 1/200, and the controls were chosen as piecewise constant functions on the time grid.

The sequences of controls $u_n(t)$ and boundary values $y_n(\ell,t)$ are depicted in the figures that follow. Already after four iteration steps, the accuracy was higher than could be expected from numerical approximation of the partial differential equation. Graphically, a further refinement did not yield any improvement.

References. For the convergence analysis of the projected gradient method, we refer the reader to [GS80] and [Kel99]. In [GS80], other commonly used methods for the solution of discretized optimal control problems are discussed in detail. Parabolic problems with box constraints for the control can be solved very efficiently by the projected Newton method; see, e.g., [HPUU09], [KS94], [KS95], and [Hei96].
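To make the projected gradient iteration S1-S5 of Section 5.9.1 and the semi-explicit time stepping of the accompanying remark concrete, here is a minimal sketch on a hypothetical toy problem. Everything specific in it is an assumption for illustration only: the domain $\Omega = (0,1)$ with homogeneous Neumann conditions and no boundary control, the nonlinearity $d(y) = y^3$, the tracking cost $\varphi = \frac12 (y - y_d)^2 + \frac{\lambda}{2} v^2$, the finite-difference grid, and the bounds $0 \le v \le 2$. Two deviations from the text are deliberate: the adjoint is the exact adjoint of the discrete scheme (hence the transposed matrix), and the exact minimization in S4 is replaced by simple step halving.

```python
import numpy as np

# Hypothetical toy data (not from the text).
nx, k, T, lam = 21, 40, 1.0, 1e-2
h, tau = 1.0 / (nx - 1), T / k
x = np.linspace(0.0, 1.0, nx)
y_d = 0.5 * (1.0 + np.cos(np.pi * x))        # desired state, constant in time
v_a, v_b = 0.0, 2.0                          # box constraints on v

# Finite-difference Neumann Laplacian and the matrix I - tau * Laplacian.
lap = (np.diag(-2.0 * np.ones(nx)) + np.diag(np.ones(nx - 1), 1)
       + np.diag(np.ones(nx - 1), -1))
lap[0, 1] = lap[-1, -2] = 2.0
M = np.eye(nx) - tau * lap / h**2


def solve_state(v):
    """S1: semi-explicit scheme, one linear solve per time step."""
    y = np.zeros((k + 1, nx))                # y[0] is the initial state 0
    for i in range(k):
        y[i + 1] = np.linalg.solve(M, y[i] - tau * y[i] ** 3 + tau * v[i])
    return y


def solve_adjoint(y):
    """S2: adjoint of the discrete scheme, marched backward in time."""
    q = np.zeros((k + 2, nx))                # q[k + 1] = 0 closes the recursion
    for j in range(k, 0, -1):
        rhs = q[j + 1] - tau * 3.0 * y[j] ** 2 * q[j + 1] + tau * (y[j] - y_d)
        q[j] = np.linalg.solve(M.T, rhs)
    return q


def cost(y, v):
    return 0.5 * tau * h * (np.sum((y[1:] - y_d) ** 2) + lam * np.sum(v ** 2))


def projected_gradient(v, iters=20):
    history = []
    for _ in range(iters):
        y = solve_state(v)
        q = solve_adjoint(y)
        direction = -(lam * v + q[1:k + 1])  # S3: negative reduced gradient
        J, s = cost(y, v), 1.0
        history.append(J)
        while s > 1e-8:                      # S4: step halving until descent
            v_new = np.clip(v + s * direction, v_a, v_b)
            if cost(solve_state(v_new), v_new) < J:
                break
            s *= 0.5
        else:
            break                            # no descent found: stop iterating
        v = v_new                            # S5: accept the projected step
    return v, history


v_opt, history = projected_gradient(np.zeros((k, nx)))
print(f"cost: {history[0]:.4e} -> {history[-1]:.4e} in {len(history)} iterations")
```

Each call to `solve_state` performs exactly $k$ linear solves with the same matrix, which is the point of the remark: the nonlinearity is evaluated at the previous time level, so no nonlinear system has to be solved.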

[Figure: Controls $u_n(t)$ and boundary values $y_n(\ell,t)$; the control iterates are labeled $u_0$ through $u_4$ and plotted over $t \in [0,1]$.]

Many papers, especially in the parabolic case, report on the use of the SQP method; see [GT97a], [KS92], and [LS00]. Schur complement techniques for the effective application of SQP methods to discretized parabolic problems are discussed in [SW98] and [MS00]. Besides SQP methods, trust-region and interior-point methods, as well as hybrid techniques derived from them, are also gaining importance in the solution of nonlinear parabolic problems. Among numerous references, we mention [UUH99], [UU00], and [WS04] as representatives. Further references are given, e.g., in the review article [HV01].

In the case of nonlinear parabolic problems in multi-dimensional domains, the dimensionality of the discretized problems is very high. It is possible to use model reduction techniques to establish problems of much smaller dimension that exhibit important properties of the original problem and yet are relatively accurate. Among these techniques, there is the so-called POD method (POD for proper orthogonal decomposition), which plays an important role especially for flow problems. Applications of this technique can be found in, e.g., [AH01], [HV03], [KV01], [KV02], and [Vol01]. Another model reduction technique, which is commonly used for linear problems, is the balanced truncation method; see, e.g., [Ant05] and the list of references given there.

5.10. Further parabolic problems *

In this section, we demonstrate that the methods developed so far carry over to more general semilinear parabolic problems. We only sketch and explain the principal features of the approach, without entering into theoretical details that are beyond the scope of this textbook and which can be looked up in original papers. Both of the following examples will make it clear that eventually each class of nonlinear partial differential equations has to be tackled by techniques that are specifically tailored to it.

5.10.1. A phase field model. Here, we investigate an optimal control problem for the phase field model from Section 1.3.2. We establish the optimality system using the formal Lagrange method. The aim of the optimization is to approximate a desired temperature evolution $u_Q(x,t)$ and a desired solidification process $\varphi_Q(x,t)$:

$$
\min J(u,\varphi,f) := \frac{\alpha}{2} \iint_Q |u - u_Q|^2\,dx\,dt + \frac{\beta}{2} \iint_Q |\varphi - \varphi_Q|^2\,dx\,dt + \frac{\lambda}{2} \iint_Q |f|^2\,dx\,dt, \tag{5.77}
$$

subject to

$$
\begin{aligned}
u_t + \frac{\ell}{2}\,\varphi_t &= \kappa\,\Delta u + f && \text{in } Q\\
\tau\,\varphi_t &= \xi^2 \Delta\varphi + g(\varphi) + 2u && \text{in } Q\\
\partial_\nu u = 0, \quad \partial_\nu \varphi &= 0 && \text{on } \Sigma\\
u(\cdot,0) = u_0, \quad \varphi(\cdot,0) &= \varphi_0 && \text{in } \Omega
\end{aligned} \tag{5.78}
$$

and

$$
f_a(x,t) \le f(x,t) \le f_b(x,t) \quad \text{for a.e. } (x,t) \in Q. \tag{5.79}
$$

Here, the following data are prescribed: functions $u_Q, \varphi_Q \in L^2(Q)$, constants $\alpha \ge 0$, $\beta \ge 0$, $\lambda \ge 0$, positive constants $\ell$, $\kappa$, $\tau$, $\xi$, and a polynomial $g = g(z) = a z + b z^2 - c z^3$ with bounded coefficient functions $a(x,t)$, $b(x,t)$, and $c(x,t) \ge \bar c > 0$. Moreover, $\Omega \subset \mathbb{R}^N$, $N \in \{2, 3\}$, is a bounded domain of

class $C^2$, and the initial data $u_0$ and $\varphi_0$ are assumed to belong to $W^{2,\infty}(\Omega)$ and to satisfy the compatibility conditions $\partial_\nu \varphi_0 = \partial_\nu u_0 = 0$ on $\Gamma = \partial\Omega$.

The distributed heat source $f \in L^\infty(Q)$ is the control variable, and the states are given by the temperature $u$ and the phase function $\varphi$. The states are considered in the normed spaces

$$
W^{2,1}_p(Q) = \Big\{ u \in L^p(Q) : \frac{\partial u}{\partial t}, \frac{\partial u}{\partial x_i}, \frac{\partial^2 u}{\partial x_i \partial x_j} \in L^p(Q) \ \text{for } i, j = 1, \dots, N \Big\},
$$

endowed with the corresponding norm.

It has been shown in [HJ92] that the parabolic initial-boundary value system (5.78) admits for any given $f \in L^q(Q)$ with $q \ge 2$ a unique state pair $(\varphi, u) \in W^{2,1}_q(Q) \times W^{2,1}_p(Q)$, where

$$
p = \begin{cases}
\dfrac{5q}{5 - 2q} & \text{if } 2 \le q < \tfrac52 \text{ and } N = 3\\[2mm]
\text{arbitrary in } \mathbb{R}_+ & \text{if } \tfrac52 \le q \text{ and } N = 3 \text{ or } q \ge 2 \text{ and } N = 2.
\end{cases}
$$

According to [HT99], the mapping $G : f \mapsto (u, \varphi)$ is twice continuously Fréchet differentiable from $F := L^2(Q)$ into $Y := W^{2,1}_2(Q) \times W^{2,1}_2(Q)$. This follows from the fact that the Nemytskii operator $\varphi(\cdot,\cdot) \mapsto g(\varphi(\cdot,\cdot))$ is twice continuously differentiable from $W^{2,1}_2(Q)$ into $L^2(Q)$. The latter property is a consequence of the embedding $W^{2,1}_2(Q) \hookrightarrow L^6(Q)$ for $N \le 3$ and the special form of $g$ as a third-degree polynomial in $\varphi$.

For the derivation of the necessary optimality conditions, we introduce the Lagrangian function, following the lines of Section 3.1. We set

$$
\begin{aligned}
\mathcal{L}(u,\varphi,f,p,\psi) :={}& J(u,\varphi,f) - \iint_Q \Big( \big( u_t + \frac{\ell}{2}\,\varphi_t - f \big)\,p + \kappa\,\nabla u \cdot \nabla p \Big)\,dx\,dt\\
&- \iint_Q \Big( \big( \tau\,\varphi_t - g(\varphi) - 2u \big)\,\psi + \xi^2\,\nabla\varphi \cdot \nabla\psi \Big)\,dx\,dt,
\end{aligned}
$$

which already accounts for the homogeneous Neumann boundary conditions. The set of admissible controls is given by

$$
F_{ad} = \{ f \in L^2(Q) : f_a(x,t) \le f(x,t) \le f_b(x,t) \ \text{for a.e. } (x,t) \in Q \}.
$$

The Lagrange method yields the following first-order necessary optimality conditions:

$$
\begin{aligned}
D_u\mathcal{L}(\bar u, \bar\varphi, \bar f, p, \psi)\,u &= 0 && \forall u : u(0) = 0\\
D_\varphi\mathcal{L}(\bar u, \bar\varphi, \bar f, p, \psi)\,\varphi &= 0 && \forall \varphi : \varphi(0) = 0\\
D_f\mathcal{L}(\bar u, \bar\varphi, \bar f, p, \psi)\,(f - \bar f) &\ge 0 && \forall f \in F_{ad}.
\end{aligned}
$$

The zero initial conditions for the directions $u$ and $\varphi$ are prescribed so that $\bar u + s\,u$ and $\bar\varphi + s\,\varphi$ satisfy the initial conditions for $s \in \mathbb{R}$. The first of the above conditions yields, after integration by parts with respect to $t$,

$$
\iint_Q \big( \alpha\,(\bar u - u_Q)\,u - p\,u_t - \kappa\,\nabla u \cdot \nabla p + 2\,\psi\,u \big)\,dx\,dt = 0
$$

for all $u$ with $u(0) = 0$. This is just the weak formulation of the initial-boundary value problem

$$
\begin{aligned}
-p_t &= \kappa\,\Delta p + 2\psi + \alpha\,(\bar u - u_Q) && \text{in } Q\\
\partial_\nu p &= 0 && \text{on } \Sigma\\
p(T) &= 0 && \text{in } \Omega.
\end{aligned}
$$

By the same token, we obtain from the second condition that

$$
\begin{aligned}
-\frac{\ell}{2}\,p_t - \tau\,\psi_t &= \xi^2 \Delta\psi + g'(\bar\varphi)\,\psi + \beta\,(\bar\varphi - \varphi_Q) && \text{in } Q\\
\partial_\nu \psi &= 0 && \text{on } \Sigma\\
\psi(T) &= 0 && \text{in } \Omega.
\end{aligned}
$$

The third condition yields the variational inequality

$$
\iint_Q (\lambda\,\bar f + p)\,(f - \bar f)\,dx\,dt \ge 0 \quad \forall f \in F_{ad},
$$

from which pointwise minimum principles or projection relations may again be derived. Further details can be looked up in original papers.

References. The existence of optimal controls and first-order necessary optimality conditions were derived in [HJ92], and the gradient method for the numerical solution of the problem was described in [CH91]. Second-order sufficient optimality conditions and the convergence of the SQP method were treated in [HT99]. Since the nonlinearity $g$ is a polynomial of degree three, the two-norm discrepancy does not arise, and one can work in the space $Y := W^{2,1}_2(Q) \times W^{2,1}_2(Q) \times L^2(Q)$. The numerical solution of a concrete optimally controlled solidification problem was presented in [Hei97] and in [HS94]. The model reduction of such a problem is the subject of [Vol01]. The optimal control of more general (thermodynamically consistent) phase field models was studied in, e.g., [SZ92] and [LS07].

5.10.2. Nonstationary Navier-Stokes equations.

Problem formulation and definitions. The control of flows is an important and very active field of research in both mathematics and engineering. The great interest in such problems is motivated by various applications.

For instance, the behavior of the flow past an airfoil can be improved by controlled blowing or suction of air at the airfoil surface. Flow control can also prevent loss of contact of the water flow with the rudder of a ship or reduce the noise produced by jet engines. Optimal shape design of cardiac valves would enable flow patterns in artificial hearts to be improved.

In this section, we discuss, by way of example, a simplified problem in which an optimal velocity field is to be generated in a spatial domain $\Omega \subset \mathbb{R}^2$ via controllable forces acting in $\Omega$. Problems of this type arise, for example, in the flow control of magnetohydrodynamic processes. In mathematical terms, the problem reads

\[
\min J(u,f) := \frac12 \iint_Q |u(x,t) - u_Q(x,t)|^2 \,dx\,dt + \frac{\lambda}{2} \iint_Q |f(x,t)|^2 \,dx\,dt,
\tag{5.80}
\]

subject to

\[
\begin{aligned}
u_t - \frac{1}{Re}\,\Delta u + (u\cdot\nabla)u + \nabla p &= f &&\text{in } Q\\
\operatorname{div} u &= 0 &&\text{in } Q\\
u &= 0 &&\text{on } \Sigma\\
u(\cdot,0) &= u_0 &&\text{in } \Omega,
\end{aligned}
\tag{5.81}
\]

and constraints on the control $f$,

\[
f_{a,i}(x,t) \le f_i(x,t) \le f_{b,i}(x,t) \quad \text{for a.e. } (x,t) \in Q,\ i = 1,2.
\tag{5.82}
\]

We restrict our analysis to two-dimensional spatial domains $\Omega$, since only for this case is there a satisfactory result concerning well-posedness of the problem (5.81). We are given two-dimensional vector functions $u_Q, f_a, f_b \in L^2(0,T;L^2(\Omega)^2)$ with $f_a(x,t) \le f_b(x,t)$ almost everywhere, a divergence-free vector function $u_0 \in L^2(\Omega)^2$, and constants $T > 0$, $\lambda > 0$, and $Re > 0$. The force density $f \in L^2(0,T;L^2(\Omega)^2)$ plays the role of the control, while the associated state $u$ represents a velocity field. In this connection, the unknown $p$ denotes the pressure (not the adjoint state).

In practical applications, the Reynolds number $Re$ can be very large, which makes the numerical solution of (5.81) a difficult task. In fluid mechanics, the number $1/Re$ is the viscosity of the fluid, usually denoted by $\nu$; since, however, $\nu$ always denotes the outward unit normal to $\Gamma$ in this book, we use the letter $\kappa$ instead, that is, we put $\kappa := 1/Re$.

In the following, we denote by $F_{ad}$ the admissible set, which consists of all controls $f \in L^2(0,T;L^2(\Omega)^2)$ that obey the above constraints. Denoting as before the scalar product in $\mathbb{R}^2$ by the dot $\cdot$, we introduce the spaces $V := \{v \in H_0^1(\Omega)^2 : \operatorname{div} v = 0\}$, endowed with the scalar product

\[
(u,v) := \sum_{i=1}^{2} \int_\Omega \nabla u_i \cdot \nabla v_i \,dx,
\]

and $H := \{v \in L^2(\Omega)^2 : \operatorname{div} v = 0\}$. Note that for functions $v \in H$ the requirement $\operatorname{div} v = 0$ is to be understood in the sense of $(v, \nabla z) = 0$ for all $z \in H^1(\Omega)^2$. In addition, we set

\[
\mathbf{W}(0,T) := \{v = (v_1,v_2) \in W(0,T)^2 : \operatorname{div} v_i(t) = 0,\ i = 1,2,\ \text{for a.e. } t \in (0,T)\}.
\]

The abstract formulation of the state problem is introduced by means of the trilinear form $b : V \times V \times V \to \mathbb{R}$,

\[
b(u,v,w) = \big((u\cdot\nabla)v,\, w\big)_{L^2(\Omega)^2} = \int_\Omega \sum_{i,j=1}^{2} u_i\, w_j\, D_i v_j \,dx.
\]

We have $b(u,v,w) = -b(u,w,v)$ and

\[
|b(u,v,w)| \le C\, \|u\|_{L^4(\Omega)^2}\, \|v\|_{H^1(\Omega)^2}\, \|w\|_{L^4(\Omega)^2};
\]

see Temam [Tem79]. From the first relation it follows that $b(u,v,v) = 0$.

In an analogous way as in Section 2.13.1, the operator $Au := -\Delta u$ generates a continuous linear mapping from $L^2(0,T;V)$ into $L^2(0,T;V^*)$. Moreover, the operator $B$ defined by

\[
\int_0^T \big(B(u)(t),\, w(t)\big)_{V^*,V} \,dt := \int_0^T b\big(u(t),u(t),w(t)\big) \,dt
\]

maps $L^2(0,T;V)$ into $L^1(0,T;V^*)$.

Definition. A vector-valued function $u \in L^2(0,T;V)$ with $u_t \in L^1(0,T;V^*)$ is called a weak solution to the initial-boundary value problem (5.81) if $u(0) = u_0$ and, in the sense of $L^1(0,T;V^*)$,

\[
u_t + \kappa A u + B(u) = f.
\]

Theorem 5.23 ([Tem79]). For any pair $f \in L^2(0,T;V^*)$ and $u_0 \in H$, there exists a unique weak solution $u \in \mathbf{W}(0,T)$ to the problem (5.81).
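The antisymmetry $b(u,v,w) = -b(u,w,v)$ and its consequence $b(u,v,v) = 0$ can be checked numerically. The following is only a sketch under stated assumptions: the domain is taken to be the unit square, the divergence-free field `u_field` is built from a hypothetical stream function $\sin^2(\pi x)\sin^2(\pi y)$ so that it vanishes on the boundary, the fields `v_field` and `w_field` are arbitrary smooth test fields chosen for illustration, and the integral is approximated by a midpoint rule, so the identities hold only up to quadrature error.

```python
import math

def u_field(x, y):
    # Assumed example: u = (d_y s, -d_x s) with s = sin^2(pi x) sin^2(pi y).
    # This u is divergence-free and vanishes on the boundary of the unit square.
    u1 = math.pi * math.sin(math.pi * x) ** 2 * math.sin(2.0 * math.pi * y)
    u2 = -math.pi * math.sin(2.0 * math.pi * x) * math.sin(math.pi * y) ** 2
    return (u1, u2)

def v_field(x, y):
    # Arbitrary smooth test field (illustrative choice).
    return (x * y, math.sin(x + y))

def grad_v(x, y):
    # grad[i][j] = D_i v_j
    c = math.cos(x + y)
    return ((y, c), (x, c))

def w_field(x, y):
    # Second arbitrary smooth test field (illustrative choice).
    return (math.cos(x), y * y)

def grad_w(x, y):
    # grad[i][j] = D_i w_j
    return ((-math.sin(x), 0.0), (0.0, 2.0 * y))

def trilinear(first, grad_second, third, n=200):
    """Midpoint-rule approximation of b(first, second, third) on (0,1)^2."""
    h = 1.0 / n
    pts = [(k + 0.5) * h for k in range(n)]
    total = 0.0
    for x in pts:
        for y in pts:
            a = first(x, y)
            g = grad_second(x, y)
            t = third(x, y)
            total += sum(a[i] * t[j] * g[i][j] for i in range(2) for j in range(2))
    return total * h * h

b_uvw = trilinear(u_field, grad_v, w_field)
b_uwv = trilinear(u_field, grad_w, v_field)
b_uvv = trilinear(u_field, grad_v, v_field)
print("b(u,v,w) + b(u,w,v) =", b_uvw + b_uwv)
print("b(u,v,v)            =", b_uvv)
```

Both printed quantities should be close to zero and shrink further as `n` grows. The check relies on $\operatorname{div} u = 0$ and on $u$ vanishing on the boundary; the test fields themselves need not be divergence-free.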

Observe that the above theorem yields more regularity than the postulated $u_t \in L^1(0,T;V^*)$. Moreover, it can be shown that the control-to-state mapping $G : L^2(0,T;V^*) \to \mathbf{W}(0,T)$, $f \mapsto u$, is twice Fréchet differentiable.

Optimality conditions. The derivation of first- and second-order optimality conditions can in principle be performed using the methods presented earlier in this chapter. To this end, the solvability of the associated linearized problem and the regularity of its solution have to be investigated. The corresponding analysis may be found in the papers that are referred to at the end of this section. Instead of providing a rigorous analysis, we once more employ the formal Lagrange method to determine what kind of first-order necessary conditions can be expected. These conditions turn out to coincide with those established rigorously in the relevant literature. We introduce the following Lagrangian function:

\[
\mathcal{L}(u,p,f,w,q) = J(u,f) + \iint_Q q \,\operatorname{div} u \,dx\,dt - \iint_Q \big(u_t - \kappa\Delta u + (u\cdot\nabla)u + \nabla p - f\big)\cdot w \,dx\,dt.
\]

While the homogeneous boundary condition for $u$ is encoded in implicit form for the sake of brevity, this is not done with the condition $\operatorname{div} u = 0$, although it would be possible. By virtue of the formal Lagrange method, we expect that at a locally optimal $(\bar u, \bar p, \bar f, w, q)$ the following relations ought to be satisfied: $D_u\mathcal{L} = 0$ for all $u$ with $u(0) = 0$, $D_p\mathcal{L} = 0$, and $D_f\mathcal{L}\,(f - \bar f) \ge 0$ for all $f \in F_{ad}$.

From the second equality it follows that

\[
0 = D_p\mathcal{L}(\bar u,\bar p,\bar f,w,q)\,p = -\iint_Q w\cdot\nabla p \,dx\,dt = -\iint_\Sigma p\, w\cdot\nu \,ds\,dt + \iint_Q p \,\operatorname{div} w \,dx\,dt
\]

for all sufficiently smooth $p$. Letting $p$ vary over $C_0^\infty(Q)$, we immediately see that

\[
\operatorname{div} w = 0 \quad \text{in } Q,
\]

whence we conclude that also $w\cdot\nu = 0$. This relation will be satisfied anyway, since we will see below that $w|_\Sigma = 0$.

Next, we conclude from the condition $D_u\mathcal{L} = 0$ that

\[
\begin{aligned}
0 = D_u\mathcal{L}(\bar u,\bar p,\bar f,w,q)\,u
={}& \iint_Q (\bar u - u_Q)\cdot u \,dx\,dt - \iint_Q w\cdot u_t \,dx\,dt + \kappa \iint_Q w\cdot\Delta u \,dx\,dt\\
& - \iint_Q \big((\bar u\cdot\nabla)u + (u\cdot\nabla)\bar u\big)\cdot w \,dx\,dt + \iint_Q q \,\operatorname{div} u \,dx\,dt
\end{aligned}
\tag{5.83}
\]

for all sufficiently smooth $u$ with $u(0) = 0$ and $u|_\Sigma = 0$. Since $u|_\Sigma = 0$, we infer from the third Green's formula that

\[
\iint_Q w\cdot\Delta u \,dx\,dt = \iint_Q \sum_{i=1}^{2} w_i\,\Delta u_i \,dx\,dt = \iint_\Sigma \sum_{i=1}^{2} w_i\,\partial_\nu u_i \,ds\,dt + \iint_Q u\cdot\Delta w \,dx\,dt.
\]

Moreover,

\[
\begin{aligned}
\iint_Q \big((\bar u\cdot\nabla)u + (u\cdot\nabla)\bar u\big)\cdot w \,dx\,dt
&= b(\bar u,u,w) + b(u,\bar u,w) = -b(\bar u,w,u) + b(u,\bar u,w)\\
&= \iint_Q \sum_{i,j=1}^{2} \big(-\bar u_i\,(D_i w_j)\,u_j + u_i\,(D_i \bar u_j)\,w_j\big) \,dx\,dt\\
&= \iint_Q \big(-((\bar u\cdot\nabla)w)\cdot u + ((\nabla\bar u)^T w)\cdot u\big) \,dx\,dt,
\end{aligned}
\]

with the matrix $\nabla\bar u = (\nabla\bar u_1\ \nabla\bar u_2)^T$. In addition, because $u(0) = 0$,

\[
\iint_Q w\cdot u_t \,dx\,dt = \int_\Omega w(x,T)\cdot u(x,T) \,dx - \iint_Q u\cdot w_t \,dx\,dt.
\]

Invoking all of the above rearrangements of terms, we infer from (5.83) that

\[
\begin{aligned}
0 ={}& \iint_Q \big\{\bar u - u_Q + w_t + \kappa\Delta w - (\nabla\bar u)^T w + (\bar u\cdot\nabla)w - \nabla q\big\}\cdot u \,dx\,dt\\
& + \kappa \iint_\Sigma \sum_{i=1}^{2} w_i\,\partial_\nu u_i \,ds\,dt - \int_\Omega w(x,T)\cdot u(x,T) \,dx,
\end{aligned}
\]

for all relevant $u$. As before, we deduce from the fact that $u$, $u(T)$, and $\partial_\nu u_i$ can be freely chosen in $Q$, $\Omega$, and $\Sigma$, respectively, that $w$ is a weak solution to the adjoint problem

\[
\begin{aligned}
-w_t - \kappa\Delta w + (\nabla\bar u)^T w - (\bar u\cdot\nabla)w + \nabla q &= \bar u - u_Q &&\text{in } Q\\
\operatorname{div} w &= 0 &&\text{in } Q\\
w &= 0 &&\text{on } \Sigma\\
w(T) &= 0 &&\text{in } \Omega.
\end{aligned}
\]

Finally, from the derivative of $\mathcal{L}$ with respect to $f$ we infer the variational inequality

\[
\iint_Q (\lambda\bar f + w)\cdot(f - \bar f) \,dx\,dt \ge 0 \quad \forall f \in F_{ad}.
\]

The solution $(w,q)$ to the adjoint problem is to be understood as a weak solution. Here, $w$ enjoys less regularity than expected. In fact, we can guarantee merely that $w \in W^{4/3}(0,T;V)$, where

\[
W^{4/3}(0,T;V) = \{w \in L^2(0,T;V) : w_t \in L^{4/3}(0,T;V^*)\}.
\]

The regularity $w \in C([0,T],H)$ can only be obtained under additional assumptions.

References. Since the theory and numerical treatment of optimal flow control problems is currently a very active research area, there are numerous papers devoted to this subject. One of the first groundbreaking papers on the theory of first-order necessary optimality conditions for this class of problems was [AT90]. More recent works are, for instance, [Cas95], [GM99], [GM00], [Hin99], [HK01], and [Rou02]. We also refer to the overview given in [Gun95]. The technically more difficult case of boundary control was treated in [HK04]. Second-order sufficient optimality conditions were studied in [Hin99] and [HK01] in connection with numerical methods, and [RT03] contains the proof that optimal controls of stationary problems depend Lipschitz continuously on perturbations. A weaker version of the second-order conditions, using strongly active control constraints, was presented in [TW06]. Model reduction by POD methods is the subject of the papers [AH01] and [KV02]. An overview of numerical methods for the optimal control of flows was given in [Gun03]. Second-order techniques such as the SQP method were addressed in, e.g., [Hin99] and [HK04]. New grid adaption techniques were proposed in [BKR00] and in the review paper [BR01]. Various other contributions to the numerical analysis of optimal control problems are presented or cited in [HPUU09].

5.11. Exercises

5.1 Show that the partial Fréchet derivative $F_y(\bar y, \bar v, \bar u)$ defined in the proof of Theorem 5.15 is continuously invertible in $C(\bar Q)$.

5.2 Show that the second-order remainder $r_2$ of the function $f(u) := J(y(u),u)$ used in Theorem 5.17 on page 291 satisfies

\[
\frac{r_2(\bar u, h)}{\|h\|^2_{L^2(Q)}} \to 0 \quad \text{as } \|h\|_{L^\infty(Q)} \to 0.
\]

5.3 Prove that the functional

\[
\int_0^\ell \phi(x, y(x,T)) \,dx + \int_0^T \{\psi_1(t, y(0,t)) + \psi_2(t, y(\ell,t))\} \,dt + \int_0^T\!\!\int_0^\ell \varphi(x,t,y,v) \,dx\,dt
\]

is twice continuously Fréchet differentiable in $C([0,\ell]\times[0,T]) \times L^2(0,T)$, provided that $\varphi$ has the form

\[
\varphi(x,t,y,v) = \varphi_1(x,t,y) + \varphi_2(x,t,y)\,v + \lambda(x,t)\,v^2,
\]

the functions $\psi_i$ and $\varphi_i$ are sufficiently smooth, and $\lambda$ is bounded and measurable.

5.4 Verify that the functions $\bar u$, $\bar y$, and $p$ defined in the test example (5.51)-(5.53) on page 298 jointly satisfy the state problem (5.52), the adjoint problem (5.54), and the corresponding projection relation for $\bar u$.

5.5 Examine whether the Nemytskii operator $\Phi : y \mapsto y^3$ from $H^1(\Omega)$ into $L^2(\Omega)$ has first and second Fréchet derivatives. Use the results stated in Section 4.3.3.
Chapter 6

Optimization problems
in Banach spaces

6.1. The Karush-Kuhn-Tucker conditions

6.1.1. Convex problems.

The Lagrange multiplier rule . The formal Lagrange method, which was
employed repeatedly in the previous chapters, has a rigorous mathematical
foundation. In this section, we introduce the basics of this theory needed
for understanding problems with state constraints. The corresponding proofs
and further results can be found in texts dealing with optimization in general
spaces. The theory of convex problems is described in Balakrishnan [Bal65],
Barbu and Precupanu [BP78], and Ekeland and Temam [ET74]; nonconvex
differentiable problems are treated in, e.g., Ioffe and Tihomirov [IT79], Jahn
[Jah94], Luenberger [Lue69], and Tröltzsch [Trö84b].
There are numerous books dealing with the theory and numerical treat-
ment of nonlinear differentiable finite-dimensional optimization problems.
In this connection, we refer the interested reader to Alt [Alt02], Gill et al.
[GMW81], Grossmann and Terno [GT97b], Kelley [Kel99], Luenberger
[Lue84], Nocedal and Wright [NW99], Polak [Pol97], and Wright [Wri93],
to name just a few.
In the following, we generally assume that $U$ and $Z$ are real Banach
spaces, $G : U \to Z$ is in general a nonlinear mapping, and $C \subset U$ is a
nonempty and convex set.


Definition. A convex set $K \subset Z$ is said to be a convex cone if $\lambda z \in K$ whenever $z \in K$ and $\lambda > 0$.

Any convex cone induces a partial ordering $\ge_K$ in the space $Z$:

Definition. Let $K \subset Z$ be a convex cone. We write $z \ge_K 0$ if and only if $z \in K$. Analogously, we write $z \le_K 0$ if and only if $-z \in K$.

The elements in $K$ are said to be nonnegative. Note, however, that nonnegativity in the sense of this definition does not imply the usual nonnegativity in the set of real numbers, as the following example shows.

Example. Let $Z = \mathbb{R}^3$, and let $K = \{z \in \mathbb{R}^3 : z_1 = 0,\ z_2 \le 0,\ z_3 \ge 0\}$. Then $K$ is evidently a convex cone, but $z \ge_K 0$ implies nonnegativity only for $z_3$. ◊

The next definition enables us to introduce a notion of "nonnegativity" also in dual spaces. This notion will be needed for defining Lagrange multipliers, because they are elements of dual spaces.

Definition. Let $K \subset Z$ be a convex cone. Then the set

\[
K^+ = \{z^* \in Z^* : \langle z^*, z\rangle_{Z^*,Z} \ge 0 \quad \forall z \in K\}
\]

is called the dual cone of $K$.

Examples.

(i) Let $Z = L^2(\Omega)$ with a bounded domain $\Omega \subset \mathbb{R}^N$, and let

\[
K = \{z \in L^2(\Omega) : z(x) \ge 0 \text{ for a.e. } x \in \Omega\}.
\]

Here, we have $Z = Z^*$ by the Riesz representation theorem and $K^+ = K$ according to Exercise 6.1.

(ii) Let $Z$ be a Banach space and let $K = \{0\}$. Then $z \ge_K 0$ if and only if $z = 0$, and thus $K^+ = Z^*$; in fact, for any $z^* \in Z^*$ we have $\langle z^*, 0\rangle_{Z^*,Z} = 0 \ge 0$.

(iii) If $K = Z$, then all elements of $Z$ are nonnegative. Hence, $K^+ = \{0\}$ with the zero functional $0 \in Z^*$. ◊

Below, we consider the following optimization problem in a Banach space:

\[
\min f(u), \qquad G(u) \le_K 0, \quad u \in C.
\tag{6.1}
\]

The constraints in (6.1) are viewed differently: as a "complicated" inequality $G(u) \le_K 0$, which is to be eliminated by means of a Lagrange multiplier, and a "simple" constraint $u \in C$, which is accounted for explicitly. This motivates the following definition.

Definition. The function $L : U \times Z^* \to \mathbb{R}$,

\[
L(u,z^*) = f(u) + \langle z^*, G(u)\rangle_{Z^*,Z},
\tag{6.2}
\]

is called the Lagrangian function. Any $(\bar u, z^*) \in U \times K^+$ satisfying the chain of inequalities

\[
L(\bar u, v^*) \le L(\bar u, z^*) \le L(u, z^*) \quad \forall u \in C,\ \forall v^* \in K^+
\tag{6.3}
\]

is called a saddle point of $L$. If this is the case, $z^*$ is said to be a Lagrange multiplier associated with $\bar u$.

In the previous chapters, when dealing with the optimal control of partial differential equations we denoted the Lagrangian by $\mathcal{L}$. To facilitate the distinction, we use the letter $L$ here. The existence of saddle points is most easily shown for convex optimization problems.

Definition. Let $U$ be a Banach space, and let the convex cone $K \subset Z$ induce the partial ordering $\ge_K$ in the Banach space $Z$. An operator $G : U \to Z$ is said to be convex (with respect to $\le_K$) if

\[
G(\lambda u + (1-\lambda)v) \le_K \lambda\, G(u) + (1-\lambda)\, G(v) \quad \forall u, v \in U,\ \forall \lambda \in (0,1).
\]

Evidently, every linear operator is convex. In the following, we write the strict inequality $z <_K 0$ if and only if $-z$ is an interior point of $K$, that is, $z <_K 0 \Leftrightarrow -z \in \operatorname{int} K$.

Theorem 6.1. Suppose that a convex functional $f : U \to \mathbb{R}$, a convex operator $G : U \to Z$, and a solution $\bar u$ to the problem (6.1) are given. Moreover, let there exist some $\tilde u \in C$ such that $G(\tilde u) <_K 0$, that is,

\[
-G(\tilde u) \in \operatorname{int} K.
\tag{6.4}
\]

Then there is some $z^* \in K^+$ such that $(\bar u, z^*)$ is a saddle point of the Lagrangian $L$. In addition, we have the complementary slackness condition

\[
\langle z^*, G(\bar u)\rangle_{Z^*,Z} = 0.
\tag{6.5}
\]
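The saddle point inequalities (6.3) and the complementary slackness condition (6.5) can be illustrated on a one-dimensional convex problem. The sketch below is a hypothetical example, not taken from the text: it assumes $U = Z = \mathbb{R}$, $C = \mathbb{R}$, $K = [0,\infty)$, $f(u) = \frac12 (u-2)^2$, and $G(u) = u - 1$, so that the Slater condition holds with $\tilde u = 0$, the minimizer is $\bar u = 1$, and the multiplier follows from $f'(\bar u) + z^* G'(\bar u) = 0$. The inequalities are only sampled on finite grids, which is an illustration rather than a proof.

```python
def f(u):
    # Assumed convex objective (illustrative choice).
    return 0.5 * (u - 2.0) ** 2

def G(u):
    # Assumed affine constraint G(u) <= 0, i.e. u <= 1.
    return u - 1.0

def lagrangian(u, z):
    # L(u, z*) = f(u) + z* G(u), cf. (6.2) with Z = R.
    return f(u) + z * G(u)

u_bar = 1.0                 # minimizer of f subject to u <= 1
z_star = -(u_bar - 2.0)     # from f'(u_bar) + z* = 0, so z* = 1

u_samples = [-3.0 + 0.01 * k for k in range(801)]   # u in [-3, 5]
v_samples = [0.05 * k for k in range(201)]          # v* in K+ = [0, 10]

tol = 1e-12
left_ok = all(lagrangian(u_bar, v) <= lagrangian(u_bar, z_star) + tol
              for v in v_samples)
right_ok = all(lagrangian(u_bar, z_star) <= lagrangian(u, z_star) + tol
               for u in u_samples)
slackness = z_star * G(u_bar)

print("Slater point feasible strictly:", G(0.0) < 0.0)
print("saddle inequalities hold on samples:", left_ok and right_ok)
print("complementary slackness value:", slackness)
```

Here the left inequality in (6.3) is an equality for every $v^* \ge 0$ because the constraint is active at $\bar u$, and the right inequality reduces to $\frac12 (u-1)^2 \ge 0$.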

The proof of the above theorem can be found in, e.g., Luenberger [Lue69]. In the literature, the condition (6.4) is usually referred to as the Slater condition. It can only be satisfied if the cone $K$ has nonempty interior. This excludes, for instance, the case $K = \{0\}$, which corresponds to the equality constraint $G(u) = 0$. In this case, the above theorem fails to apply, but other existence results concerning Lagrange multipliers are available. The lack of interior points is a much more serious problem in the following situation.

Example. Consider the natural nonnegative cone in $Z = L^2(0,1)$,

\[
K = \{z(\cdot) \in L^2(0,1) : z(x) \ge 0 \text{ for a.e. } x \in (0,1)\}.
\]

Quite unexpectedly, we have $\operatorname{int} K = \emptyset$. How can this be possible? One is tempted to believe that, for instance, $z(x) \equiv 1$ is an interior point of $K$. Unfortunately, this is not true. In fact, the sequence $\{v_n\}_{n=1}^\infty \subset L^2(0,1)$ with

\[
v_n(x) =
\begin{cases}
1 & \text{in } [0, 1 - \tfrac1n)\\
-1 & \text{in } [1 - \tfrac1n, 1],
\end{cases}
\]

while obviously converging to $z$ with respect to the $L^2$ norm, is not contained in $K$. Consequently, $z \notin \operatorname{int} K$. This undesired behavior is simply a consequence of the fact that the $L^2$ norm, and likewise any other $L^p$ norm with $1 \le p < \infty$, measures an integral and not the maximal absolute value of a function. This fact constitutes a major obstacle in the treatment of optimization problems in function spaces. ◊

Theorem 6.2. Suppose that the mappings $f$ and $G$ in Theorem 6.1 are Gâteaux differentiable at $\bar u$. Then we have the variational inequality

\[
D_u L(\bar u, z^*)(u - \bar u) \ge 0 \quad \forall u \in C.
\]

Here and in the following, $D_u$ again denotes the partial Gâteaux or Fréchet derivative with respect to $u$. The assertion is an immediate consequence of the saddle point condition (6.3), which implies that $\bar u$ solves the problem without the constraint $G(u) \le_K 0$, namely

\[
L(\bar u, z^*) = \min_{u \in C} L(u, z^*).
\]

The associated variational inequality reads, in explicit form,

\[
f'(\bar u)(u - \bar u) + \langle z^*, G'(\bar u)(u - \bar u)\rangle_{Z^*,Z} \ge 0 \quad \forall u \in C
\]

or, equivalently,

\[
\langle f'(\bar u) + G'(\bar u)^* z^*,\, u - \bar u\rangle_{U^*,U} \ge 0 \quad \forall u \in C.
\]

In the unconstrained case where $C = U$, we get the equation

\[
f'(\bar u) + G'(\bar u)^* z^* = 0 \in U^*.
\]

Examples. We illustrate the application and limitations of the above theorems by means of simple examples that do not involve partial differential equations.

One-sided box constraints in $L^2(0,1)$. Let $u_d \in L^2(0,1)$ be given. We consider the minimization problem

\[
\min f(u) := \frac12 \int_0^1 |u(x) - u_d(x)|^2 \,dx,
\tag{6.6}
\]

subject to

\[
u(x) \le 0 \quad \text{for a.e. } x \in (0,1).
\]

The above problem is a special case of problem (6.1), with the specifications $U = Z = L^2(0,1)$ and $G = I$ (the identity mapping). The associated convex cone $K$ is the set of almost-everywhere nonnegative elements of $L^2(0,1)$, and we have $C = U$.

The problem has a unique minimizer $\bar u$, namely,

\[
\bar u(x) = \min\{u_d(x), 0\}.
\]

We investigate whether there exists an associated Lagrange multiplier. The corresponding Lagrangian function reads

\[
L(u,\mu) = f(u) + (\mu, G(u))_{L^2(0,1)} = \int_0^1 \Big(\frac12 (u(x) - u_d(x))^2 + \mu(x)\,u(x)\Big) \,dx.
\]

Here, by the Riesz representation theorem, the functional $z^* \in Z^*$ has been identified with some function $\mu \in L^2(0,1)$. We search for a Lagrange multiplier $\mu \in L^2(0,1)$. Since $\operatorname{int} K = \emptyset$, Theorem 6.1 does not apply. Instead, the Lagrange multiplier is constructed using a pointwise approach. To this end, recall that, owing to Lemma 2.21 on page 63, we have the variational inequality

\[
\int_0^1 (\bar u(x) - u_d(x))\,(u(x) - \bar u(x)) \,dx \ge 0 \quad \forall u(\cdot) \le 0.
\]

This can only be true if the implications

\[
\begin{aligned}
\bar u(x) < 0 \quad &\Rightarrow \quad \bar u(x) - u_d(x) = 0\\
\bar u(x) = 0 \quad &\Rightarrow \quad \bar u(x) - u_d(x) \le 0
\end{aligned}
\]


are valid almost everywhere. But then $\bar u(x) - u_d(x)$ must be nonpositive almost everywhere. Since we have used arguments of this kind repeatedly in Chapter 2, we do not explain this in detail here. We now define

\[
\mu(x) := -f'(\bar u)(x) = -(\bar u(x) - u_d(x)).
\]

Then, owing to the above implications, we have $\mu \ge 0$ as well as

\[
\mu(x)\,\bar u(x) = 0 \quad \text{for a.e. } x \in (0,1),
\]

which is the pointwise form of the complementary slackness condition (6.5). Finally, it follows from the definition of $\mu$ that $\mu = -f'(\bar u)$, that is,

\[
f'(\bar u) + \mu = 0,
\]

which, in turn, is equivalent to the equation $D_u L(\bar u, \mu) = 0$. Hence, the function $\mu$ defined above is a Lagrange multiplier. ◊

Two-sided box constraints in $L^2(0,1)$. We now consider the above minimization problem with the same functional $f$ as in (6.6), but this time with constraints from both above and below:

\[
\min f(u),
\tag{6.7}
\]

subject to

\[
-1 \le u(x) \le 1 \quad \text{for a.e. } x \in (0,1).
\]

Again, we put $C = U = L^2(0,1)$, and we cast the constraints in the form

\[
u(x) - 1 \le 0, \qquad -u(x) - 1 \le 0.
\]

We then have to choose $Z = L^2(0,1) \times L^2(0,1)$ and $K = L^2(0,1)_+ \times L^2(0,1)_+$, where $L^2(0,1)_+$ denotes the set of almost-everywhere nonnegative elements of $L^2(0,1)$. The convex operator $G : L^2(0,1) \to L^2(0,1) \times L^2(0,1)$ is defined by

\[
G(u) := \begin{pmatrix} u(\cdot) - 1\\ -u(\cdot) - 1 \end{pmatrix}.
\]

Although the function $\tilde u(x) \equiv 0$ obeys both inequalities strictly, we again have $\operatorname{int} K = \emptyset$ and thus cannot employ Theorem 6.1. However, the construction used in Section 1.4.7 works. The Lagrangian is now given by

\[
L(u,\mu) = L(u,\mu_a,\mu_b) = \frac12 \|u - u_d\|^2_{L^2(0,1)} + (-u - 1, \mu_a)_{L^2(0,1)} + (u - 1, \mu_b)_{L^2(0,1)}.
\]

We make the pointwise definitions

\[
\begin{aligned}
\mu_a(x) &= (f'(\bar u)(x))_+ = (\bar u(x) - u_d(x))_+\\
\mu_b(x) &= (f'(\bar u)(x))_- = (\bar u(x) - u_d(x))_-,
\end{aligned}
\tag{6.8}
\]

where, as usual, $z_+ = (z + |z|)/2$ and $z_- = (|z| - z)/2$. Obviously, $\mu_a$ and $\mu_b$ are nonnegative. The reader will be asked in Exercise 6.2 to check that the arguments from Section 1.4.7 carry over almost unchanged to give

\[
D_u L(\bar u, \mu_a, \mu_b) = f'(\bar u) + \mu_b - \mu_a = 0
\]

and the slackness conditions

\[
(-\bar u - 1, \mu_a)_{L^2(0,1)} = (\bar u - 1, \mu_b)_{L^2(0,1)} = 0.
\]

Consequently, $\mu_a$ and $\mu_b$ are Lagrange multipliers for $\bar u$.

If we assume $u_d \in L^\infty(0,1)$, then both multipliers belong to $L^\infty(0,1)$. This nice byproduct of the pointwise construction follows from the fact that $\bar u - u_d \in L^\infty(0,1)$. ◊

Remark. The problem with two-sided constraints could also be considered in the space $L^\infty(0,1)$, since in this case every admissible control $u$ is automatically bounded and measurable. Moreover, the cone $K$ of nonnegative functions in $L^\infty(0,1)$ has interior points, and $\tilde u(x) \equiv 0$ satisfies the Slater condition. Theorem 6.1 then yields the existence of Lagrange multipliers $\mu_a, \mu_b \in L^\infty(0,1)^*$. However, we do not gain much benefit from this result, since $L^\infty(0,1)^*$ is a space of continuous linear functionals that need not even be measures.

6.1.2. Differentiable problems.

Lagrange multiplier rules and constraint qualifications. We now investigate the problem (6.1) without assuming $f$ and $G$ to be convex. We consider

\[
\min f(u), \qquad G(u) \le_K 0, \quad u \in C,
\]

where $C$ is still convex. Instead of convexity, we postulate the Fréchet differentiability of $f$ and $G$. We use the same Lagrangian function $L = L(u,z^*)$ as in Section 6.1.1, but, in view of the nonconvexity, we can no longer expect a saddle point property to be valid. Therefore, Lagrange multipliers are defined in a slightly different way.
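The pointwise multiplier construction for the two-sided box constraints $-1 \le u \le 1$ of Section 6.1.1 can be reproduced on a grid. The following is a minimal sketch under assumptions that are not part of the text: the interval $(0,1)$ is sampled at cell midpoints, the target is the hypothetical choice $u_d(x) = 2\sin(2\pi x)$, and the minimizer is taken to be the pointwise projection of $u_d$ onto $[-1,1]$, which is the standard solution formula for this tracking functional. Integrals are replaced by Riemann sums.

```python
import math

n = 1000
xs = [(k + 0.5) / n for k in range(n)]                 # cell midpoints in (0, 1)
u_d = [2.0 * math.sin(2.0 * math.pi * x) for x in xs]  # assumed target function

# Pointwise projection of u_d onto [-1, 1] (assumed form of the minimizer).
u_bar = [min(1.0, max(-1.0, d)) for d in u_d]

# f'(u_bar) is identified with the function u_bar - u_d.
grad = [u - d for u, d in zip(u_bar, u_d)]

# Positive and negative parts: z_+ = max(z, 0), z_- = max(-z, 0).
mu_a = [max(g, 0.0) for g in grad]    # multiplier for the lower bound -u - 1 <= 0
mu_b = [max(-g, 0.0) for g in grad]   # multiplier for the upper bound  u - 1 <= 0

# Stationarity f'(u_bar) + mu_b - mu_a = 0, checked pointwise.
stationarity = max(abs(g + b - a) for g, a, b in zip(grad, mu_a, mu_b))

# Discrete L2 inner products for the two slackness conditions.
slack_a = sum((-u - 1.0) * a for u, a in zip(u_bar, mu_a)) / n
slack_b = sum((u - 1.0) * b for u, b in zip(u_bar, mu_b)) / n

print("max stationarity residual:", stationarity)
print("slackness (lower, upper):", slack_a, slack_b)
print("max multipliers:", max(mu_a), max(mu_b))
```

Both multipliers are nonzero only where the corresponding bound is active, which is why the two inner products vanish.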

Definition. Let $\bar u \in U$ be admissible. We call $\bar u$ a local solution of the minimization problem (6.1) if there is some $\varepsilon > 0$ such that

\[
f(\bar u) \le f(u) \quad \forall u \in C \text{ with } G(u) \le_K 0 \text{ and } \|u - \bar u\|_U \le \varepsilon.
\]

Definition. Let $\bar u$ be a local solution to the problem (6.1). Then any $z^* \in K^+$ satisfying the conditions

\[
D_u L(\bar u, z^*)(u - \bar u) \ge 0 \quad \forall u \in C
\tag{6.9}
\]

\[
\langle z^*, G(\bar u)\rangle_{Z^*,Z} = 0
\tag{6.10}
\]

is called a Lagrange multiplier associated with $\bar u$.

In order that the existence of such a Lagrange multiplier be guaranteed, a so-called constraint qualification must be postulated. Since such a condition involves the locally optimal control itself, it usually cannot be verified without knowledge of this function. There are various constraint qualifications. A rather general one, which suffices for our purposes, is the Zowe-Kurcyusz condition (see Zowe and Kurcyusz [ZK79]).

Definition. Suppose that $\bar u \in C$ with $G(\bar u) \le_K 0$ is given. We call the sets

\[
C(\bar u) := \{\alpha\,(u - \bar u) : \alpha \ge 0,\ u \in C\}, \qquad K(\bar z) := \{\beta\,(z - \bar z) : \beta \ge 0,\ z \in K\}
\]

the conical hulls to $C$ and $K$ at $\bar u$ and $\bar z$, respectively. The condition

\[
G'(\bar u)\, C(\bar u) + K(-G(\bar u)) = Z
\tag{6.11}
\]

is called the Zowe-Kurcyusz constraint qualification.

The above relation is obviously equivalent to saying that for any $z \in Z$ the equation

\[
\alpha\, G'(\bar u)(u - \bar u) + \beta\,(v + G(\bar u)) = z
\tag{6.12}
\]

is solvable with suitable $u \in C$, $v \ge_K 0$, $\alpha \ge 0$, and $\beta \ge 0$. Recall that $v \ge_K 0$ if and only if $v \in K$.

Theorem 6.3. Let $\bar u$ be a local solution to problem (6.1), and let $f$ and $G$ be continuously Fréchet differentiable in an open neighborhood of $\bar u$. If the constraint qualification (6.11) holds, then there exists a Lagrange multiplier $z^* \in Z^*$ associated with $\bar u$. Moreover, the set of Lagrange multipliers associated with $\bar u$ is bounded.

The proof of this multiplier rule is due to Zowe and Kurcyusz [ZK79]. From (6.9) it follows that

\[
\langle f'(\bar u) + G'(\bar u)^* z^*,\, u - \bar u\rangle_{U^*,U} \ge 0 \quad \forall u \in C.
\tag{6.13}
\]

Remark. Sometimes it is difficult or even meaningless to establish $G'(\bar u)^*$ in explicit form. Then (6.13) is replaced by the equivalent inequality

\[
f'(\bar u)(u - \bar u) + \langle z^*, G'(\bar u)(u - \bar u)\rangle_{Z^*,Z} \ge 0 \quad \forall u \in C.
\]

Example. Of particular interest is the minimization problem with both equality and set constraints:

\[
\min f(u), \qquad G(u) = 0, \quad u \in C,
\tag{6.14}
\]

where $f$, $G$, and $C$ are defined as before. In this special case, the constraint qualification (6.11) reads

\[
G'(\bar u)\, C(\bar u) = Z.
\tag{6.15}
\]

If it is satisfied, then a Lagrange multiplier $z^* \in Z^*$ exists such that the variational inequality (6.13) is valid. The complementary slackness condition (6.10) is meaningless for equality constraints.

In Section 6.1.3, we will apply this result to the special case of $G(u) = Ay - Bv = 0$, where $A : Y \to Y^*$ is a continuously invertible operator representing an elliptic differential operator, $y$ denotes the state, and $v \in V_{ad} \subset V$ is the control.

In this case, we have $Z := Y^*$, $U := Y \times V$, and $C := Y \times V_{ad}$. The constraint qualification is always satisfied, since the equation

\[
G'(\bar u)(u - \bar u) = A(y - \bar y) - B(v - \bar v) = z
\]

is solvable for any $z \in Z = Y^*$ with $v = \bar v$ and $y = A^{-1} z + \bar y$. The element $u - \bar u = (y - \bar y, v - \bar v)$ belongs to the cone $C(\bar u)$. ◊

Discussion of the Zowe-Kurcyusz constraint qualification. In the following, we illustrate the application of condition (6.11) for various types of constraints, first in the general situation, and then for pointwise constraints in function spaces.
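The solvability argument for the example $G(u) = Ay - Bv$ has a direct finite-dimensional analogue. The sketch below is an illustration under assumptions not made in the text: $Y = V = \mathbb{R}^2$, `A` is a hypothetical invertible $2\times 2$ matrix standing in for the elliptic operator, `B` is the identity, and `y_bar`, `v_bar` are arbitrary reference points with `v_bar` in the box $[-1,1]^2$. For each right-hand side $z$, the choice $v = \bar v$, $y = A^{-1}z + \bar y$ is verified to solve the linearized equation.

```python
def solve2(A, z):
    """Solve the 2x2 system A y = z by Cramer's rule (A assumed invertible)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(z[0] * A[1][1] - A[0][1] * z[1]) / det,
            (A[0][0] * z[1] - z[0] * A[1][0]) / det]

def matvec(M, x):
    """Matrix-vector product for nested lists."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

# Hypothetical data: A plays the role of the invertible state operator,
# B the role of the control operator.
A = [[2.0, -1.0], [-1.0, 2.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
y_bar = [0.3, -0.2]
v_bar = [0.5, 1.0]          # lies in the assumed admissible box [-1, 1]^2

def solve_linearized(z):
    """Return (y, v) with A(y - y_bar) - B(v - v_bar) = z, using v = v_bar."""
    dy = solve2(A, z)                         # y - y_bar = A^{-1} z
    y = [yb + d for yb, d in zip(y_bar, dy)]
    return y, list(v_bar)

def residual(z):
    """Maximum-norm residual of the linearized equation for the computed (y, v)."""
    y, v = solve_linearized(z)
    a_part = matvec(A, [p - q for p, q in zip(y, y_bar)])
    b_part = matvec(B, [p - q for p, q in zip(v, v_bar)])
    return max(abs(a - b - zi) for a, b, zi in zip(a_part, b_part, z))

test_rhs = [[1.0, 0.0], [0.0, 1.0], [-3.5, 2.25], [100.0, -7.0]]
residuals = [residual(z) for z in test_rhs]
print("residuals:", residuals)
```

The control component never has to move, so the constraint $v \in V_{ad}$ plays no role in the surjectivity argument; only the invertibility of $A$ is used.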

Pure equality constraints $G(u) = 0$. With $C = U$ and $K = \{0\}$, (6.11) becomes

\[
G'(\bar u)\, U = Z.
\tag{6.16}
\]

In other words, the operator $G'(\bar u)$ must be surjective. This surjectivity requirement comes from the classical Lagrange multiplier rule for equality constraints. The relation (6.13) attains the form

\[
f'(\bar u) + G'(\bar u)^* z^* = 0.
\tag{6.17}
\]

Inequality constraints. Let the constraints be given as in (6.1). If the minimizer $\bar u$ satisfies $G(\bar u) <_K 0$, that is, if $-G(\bar u) \in \operatorname{int} K$, then the constraint qualification (6.11) is fulfilled; the reader will be asked to verify this in Exercise 6.3. Since the constraint is not active, this case is not interesting.

The following linearized Slater condition is sufficient for the Zowe-Kurcyusz constraint qualification (6.11) to hold:

\[
\exists\, \tilde u \in C : \quad G(\bar u) + G'(\bar u)(\tilde u - \bar u) <_K 0.
\tag{6.18}
\]

This is easily seen: the Zowe-Kurcyusz condition postulates for any $z \in Z$ the existence of constants $\alpha \ge 0$, $\beta \ge 0$ and elements $k \in K$, $u \in C$ such that the equation

\[
\alpha\, G'(\bar u)(u - \bar u) + \beta\,(k + G(\bar u)) = z
\]

is valid. To show this, put $\alpha = \beta$, $u = \tilde u$, and $\tilde z = G(\bar u) + G'(\bar u)(\tilde u - \bar u)$. Then the above equation reduces to $\alpha(\tilde z + k) = z$ and, since $K$ is a cone, to

\[
\alpha \tilde z + q = z,
\]

with $q \in K$. Now we choose $\alpha$ so large that $z - \alpha\tilde z \ge_K 0$. This is possible, because by (6.18) $\tilde z$ lies in the interior of $-K$. With this, we satisfy the above condition with the choice $q = z - \alpha\tilde z \ge_K 0$.

If both $K$ and $C$ have interior points, then (6.18) is equivalent to the following condition (cf. Penot [Pen82]):

\[
\exists\, h \in \operatorname{int} C(\bar u) : \quad G(\bar u) + G'(\bar u)\, h <_K 0.
\tag{6.19}
\]

Equality and inequality constraints. Suppose that the constraints have the form

\[
G_1(u) = 0, \qquad G_2(u) \le_K 0, \qquad u \in C.
\]

Then the following condition is sufficient for (6.11) to hold (cf. [HPUU09], Lemma 1.14): $G_1'(\bar u)$ is surjective, and

\[
\exists\, h \in C(\bar u) : \quad G_1'(\bar u)\, h = 0, \quad G_2(\bar u) + G_2'(\bar u)\, h <_K 0.
\tag{6.20}
\]

As the following examples will show, the applicability of the Zowe-Kurcyusz constraint qualification to inequality constraints in function spaces is essentially restricted to cones of nonnegative functions with nonempty interior.

One-sided box constraints for $u$. We begin our analysis with a problem involving one-sided constraints:

\[
\min f(u) := \int_\Omega \psi(x, u(x)) \,dx, \qquad u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega.
\tag{6.21}
\]

Here, $u_b \in L^\infty(\Omega)$ is given, and the function $\psi$ is sufficiently smooth and satisfies a suitable growth condition in order to guarantee that the given integral functional $f$ be continuously differentiable in $U = L^2(\Omega)$. The above minimization problem is of the form

\[
\min f(u), \qquad G(u) \le_K 0,
\]

with $G(u)(x) := u(x) - u_b(x)$. As an affine continuous operator, $G$ is differentiable from $U$ into $Z = U$. The cone $K$ is given by the set of almost-everywhere nonnegative elements of $L^2(\Omega)$.

In this case, the Zowe-Kurcyusz constraint qualification (6.11) is satisfied: in view of $C = L^2(\Omega)$, we have $C(\bar u) = L^2(\Omega)$. And since $G'(\bar u)$ is the identity mapping, for any $z \in L^2(\Omega)$ there is some $u \in L^2(\Omega) = C(\bar u)$ such that $G'(\bar u)\, u = z$: one simply chooses $u = z$.

Consequently, Theorem 6.3 may be applied in $L^2(\Omega)$, where, in view of the Riesz representation theorem, every $z^* \in L^2(\Omega)^*$ can be identified with some $\mu \in L^2(\Omega)$. Hence, for any local solution $\bar u$ there exists some almost-everywhere nonnegative multiplier $\mu \in L^2(\Omega)$ such that $f'(\bar u) + G'(\bar u)^* \mu = 0$. We may identify $f'(\bar u)$ with the function $\psi_u(\cdot, \bar u(\cdot)) \in L^2(\Omega)$, and $G'(\bar u)^*$ is the identity operator in $L^2(\Omega)$. We therefore find that

\[
\psi_u(x, \bar u(x)) + \mu(x) = 0, \qquad \mu(x) \ge 0, \qquad \text{for a.e. } x \in \Omega.
\]
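The pointwise relation $\psi_u(x,\bar u(x)) + \mu(x) = 0$, $\mu(x) \ge 0$ for the one-sided constraint $u \le u_b$ can be checked on a grid. The following sketch rests on assumptions that are not in the text: $\Omega = (0,1)$, the hypothetical integrand $\psi(x,u) = \frac12 u^2 + \log\cosh u - g(x)\,u$ (convex in $u$ with quadratic growth), the data $g(x) = 3\sin(2\pi x)$ and $u_b(x) = 0.5 + 0.25x$, and a pointwise minimization, which is justified here because both the functional and the constraint act pointwise. The multiplier is taken as the negative part of $\psi_u(x,\bar u(x))$.

```python
import math

def psi_u(u, g):
    # Derivative in u of the assumed integrand u^2/2 + log(cosh u) - g*u.
    return u + math.tanh(u) - g

def unconstrained_root(g, iters=80):
    # psi_u is strictly increasing in u, and psi_u(g - 1) < 0 < psi_u(g + 1)
    # because |tanh| < 1, so bisection on [g - 1, g + 1] finds the unique root.
    lo, hi = g - 1.0, g + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if psi_u(mid, g) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

n = 400
xs = [(k + 0.5) / n for k in range(n)]
g = [3.0 * math.sin(2.0 * math.pi * x) for x in xs]   # assumed data
u_b = [0.5 + 0.25 * x for x in xs]                    # assumed upper bound

# Pointwise minimizer of a convex function under u <= u_b:
# the unconstrained root, cut off at the bound.
u_bar = [min(unconstrained_root(gi), ub) for gi, ub in zip(g, u_b)]

# Multiplier as the negative part of psi_u at u_bar: z_- = max(-z, 0).
mu = [max(-psi_u(u, gi), 0.0) for u, gi in zip(u_bar, g)]

stationarity = max(abs(psi_u(u, gi) + m) for u, gi, m in zip(u_bar, g, mu))
complementarity = max(abs(m * (u - ub)) for m, u, ub in zip(mu, u_bar, u_b))

print("max |psi_u + mu|     :", stationarity)
print("max |mu * (u - u_b)| :", complementarity)
print("largest multiplier   :", max(mu))
```

Where the bound is inactive, $\psi_u$ vanishes up to the bisection tolerance and so does $\mu$; where it is active, $\psi_u < 0$ and $\mu = -\psi_u > 0$.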

In this example, the Zowe-Kurcyusz condition was applicable even though the cone of nonnegative functions in $L^2(\Omega)$ had empty interior. This is in a certain sense an exceptional case. Alternatively, we could have constructed the multiplier directly as in (6.8) by setting $\mu(x) = (\psi_u(x, \bar u(x)))_-$.

Two-sided box constraints for $u$. We consider the same problem as above, but this time with the two-sided control constraints

\[
u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega,
\]

with bounded and measurable functions $u_a \le u_b$. We fit these constraints into the abstract framework of (6.1) by choosing the operator $G$ to be of the form

\[
G(u) := \begin{pmatrix} u_a - u\\ u - u_b \end{pmatrix}.
\]

Evidently, $G$ is a continuously differentiable mapping from $L^2(\Omega)$ into $L^2(\Omega) \times L^2(\Omega)$. However, the Zowe-Kurcyusz constraint qualification cannot be directly satisfied in the form (6.11), as can be shown with a little effort. Again, we have the problem that the cone of nonnegative functions in $L^2(\Omega)$ has empty interior. It would also not be helpful to work in $L^\infty(\Omega)$ instead, since then we would at best obtain measures as multipliers for the control constraints. As in (6.8), a possible way out is to define Lagrange multipliers by

\[
\mu_a(x) := \psi_u(x, \bar u(x))_+, \qquad \mu_b(x) := \psi_u(x, \bar u(x))_-,
\]

with which the Karush-Kuhn-Tucker conditions are fulfilled.

Second-order optimality conditions. The scope of the Karush-Kuhn-Tucker theory in Banach spaces also encompasses second-order necessary and sufficient optimality conditions. For illustration, we only discuss the problem (6.14):

\[
\min f(u), \qquad G(u) = 0, \quad u \in C,
\]

additionally assuming that $f$ and $G$ are twice continuously Fréchet differentiable. Suppose that $\bar u$ satisfies, together with $z^* \in Z^*$, the first-order necessary condition

\[
f'(\bar u)(u - \bar u) + \langle z^*, G'(\bar u)(u - \bar u)\rangle_{Z^*,Z} \ge 0 \quad \forall u \in C.
\tag{6.22}
\]

Moreover, let there exist some $\delta > 0$ such that

\[
L''(\bar u, z^*)[u,u] := f''(\bar u)[u,u] + \langle z^*, G''(\bar u)[u,u]\rangle_{Z^*,Z} \ge \delta\, \|u\|_U^2
\tag{6.23}
\]

for all $u \in C(\bar u)$ such that

\[
G'(\bar u)\, u = 0.
\tag{6.24}
\]

Lemma 6.4. If $\bar u$ is admissible for problem (6.14) and the conditions (6.22)-(6.24) are fulfilled, then $\bar u$ is locally optimal for (6.14).

These second-order sufficient optimality conditions follow from general results due to Maurer and Zowe [MZ79, Mau81]. The lemma applies only to problems in which the two-norm discrepancy does not play a role. Since we have not provided any further information concerning the structure of the set $C$, we are not in a position to define and make use of strongly active constraints. The conditions above are thus too restrictive. In the case of inequality constraints of the form $G(u) \le_K 0$, first-order sufficient optimality conditions can also be employed; see [MZ79]. For partial differential equations with state constraints, we refer to [CDlRT08]. In the case of pointwise constraints in function spaces, usually strongly active sets in the sense of Dontchev et al. [DHPY95] are used for this purpose.

6.1.3. A semilinear elliptic problem. Let $\Omega \subset \mathbb{R}^N$, $N \le 3$, be a bounded Lipschitz domain. For given $v \in L^2(\Omega)$, we consider the elliptic boundary value problem

\[
\begin{aligned}
-\Delta y + y + y^3 &= v &&\text{in } \Omega\\
\partial_\nu y &= 0 &&\text{on } \Gamma.
\end{aligned}
\]

As shown on pages 181 to 183, this problem is easy to handle in the state space $Y = H^1(\Omega)$. Introducing the mapping $A : Y \to Y^*$ generated by the elliptic operator $-\Delta + I$, the Nemytskii operator $\Phi : Y \to V = L^2(\Omega)$, $y(\cdot) \mapsto y(\cdot)^3$, and the embedding operator $B : L^2(\Omega) \to Y^*$, we can transform the above boundary value problem into the equation $Ay + B\,\Phi(y) = B\,v$ in $Y^*$.

In the following, we are going to demonstrate how Theorem 6.3 and Lemma 6.4 can be applied to a corresponding optimal control problem. To this end, we study the minimization of

\[
J(y,v) := \frac12 \|y - y_\Omega\|^2_{L^2(\Omega)} + \frac{\lambda}{2} \|v\|^2_{L^2(\Omega)},
\]

subject to the above elliptic state problem and to the control constraint $-1 \le v(x) \le 1$ for almost every $x \in \Omega$. With the embedding operator $E_Y : H^1(\Omega) \to L^2(\Omega)$ and the admissible set

\[
V_{ad} = \{v \in L^2(\Omega) : -1 \le v(x) \le 1 \text{ for a.e. } x \in \Omega\},
\]

we obtain the problem

\[ \min J(y, v) := \frac{1}{2}\,\|E_Y y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\,\|v\|_{L^2(\Omega)}^2, \tag{6.25} \]

subject to

\[ A y + B(\Phi(y) - v) = 0, \qquad v \in V_{ad}. \tag{6.26} \]

Obviously, this is a special case of the problem (6.14),

\[ \min J(u), \qquad G(u) = 0, \quad u \in C, \]

with the specifications $U := Y \times V$, $u := (y, v)$, $G : Y \times V \to Y^*$, $G(u) := A y + B(\Phi(y) - v)$, and $C := Y \times V_{ad}$.

First-order necessary conditions. According to the results established in Section 4.3.3, the Nemytskii operator $\Phi$ is continuously differentiable from $H^1(\Omega)$ into $L^2(\Omega)$. This implies that the operator $G$ is continuously Fréchet differentiable from $Y \times V$ into $Y^*$.

Moreover, we claim that for any $\bar u = (\bar y, \bar v) \in U$, $G'(\bar u)$ is a surjective mapping. To prove this claim, we first observe that $G'(\bar u)(y, v) = A y + B(\Phi'(\bar y)\, y - v)$. Now consider, for an arbitrary right-hand side $z \in Y^*$, the equation

\[ A y + B(\Phi'(\bar y)\, y - v) = z. \tag{6.27} \]

On the left-hand side, we have the differential operator $\tilde A$,

\[ \tilde A y = -\Delta y + y + 3 \bar y^2 y. \]

By virtue of Theorem 4.7 on page 191, $\bar y$ is continuous and thus bounded. Hence, the coefficient function $c_0(x) = 1 + 3 \bar y(x)^2$ is both positive and bounded. It therefore follows that the bilinear form $a[\cdot, \cdot]$ generated by $\tilde A$ meets the conditions of the Lax-Milgram lemma. From this, we readily deduce that the equation (6.27) admits a solution for any $z \in Z = Y^*$: we simply put $v = 0$ and determine the unique solution $y \in Y$ to the equation $A y + B\, \Phi'(\bar y)\, y = z$. This proves the claim, and $G'(\bar u)$ is thus surjective.

In addition, in Exercise 6.5 the interested reader will have the opportunity to check that the constraint qualification (6.15) on page 331 is also fulfilled. Consequently, Theorem 6.3 applies, yielding the existence of a Lagrange multiplier $z^* \in Z^* = (Y^*)^* = Y$. Putting $p := -z^*$, we obtain for the associated Lagrangian function

\[ L(u, p) = L(y, v, p) = J(y, v) - \langle A y + B(\Phi(y) - v),\, p\rangle_{Y^*,Y}. \]

The first minus sign is needed so that the adjoint problem will take on the form familiar to us from the previous chapters. If $\bar u = (\bar y, \bar v)$ is locally optimal, that is, if $\bar v$ is a locally optimal control, then, by virtue of Theorem 6.3, the variational inequality

\[ D_{(y,v)} L(\bar y, \bar v, p)(y - \bar y, v - \bar v) \ge 0 \quad \forall (y, v) \in C \]

is valid. In terms of $y$, this evidently means that $D_y L(\bar y, \bar v, p)\, y = 0$ for all $y \in Y$, that is,

\[ (\bar y - y_\Omega, y) - \langle A^* p, y\rangle - \langle B^* p, \Phi'(\bar y)\, y\rangle = 0 \quad \forall y \in H^1(\Omega). \]

Now, $B^* = E_Y$ and $A^* = A$. Therefore,

\[ \bar y - y_\Omega - A p - \Phi'(\bar y)^* B^* p = 0, \]

and the Lagrange multiplier $p$ turns out to be the unique solution to the adjoint boundary value problem

\[ \begin{aligned} -\Delta p + p + 3 \bar y^2 p &= \bar y - y_\Omega && \text{in } \Omega \\ \partial_\nu p &= 0 && \text{on } \Gamma. \end{aligned} \]

Evaluating the above variational inequality for $v$, we find that

\[ D_v L(\bar y, \bar v, p)(v - \bar v) \ge 0 \quad \forall v \in V_{ad}, \]

and we finally arrive at the variational inequality

\[ (p + \lambda \bar v,\, v - \bar v)_{L^2(\Omega)} \ge 0 \quad \forall v \in V_{ad}. \]

The above optimality conditions also resulted from the analysis of the optimal control problem (4.31)-(4.33) on page 207, which was performed in the state space $H^1(\Omega) \cap C(\bar\Omega)$. The method presented here has the advantage that the conditions can be derived directly from the optimization theory in Banach spaces. It only works in this simple form if the given nonlinearity is differentiable in $H^1(\Omega)$. Since $H^1(\Omega) \not\subset C(\bar\Omega)$ for $N \ge 2$, it does not apply to problems with pointwise state constraints.

Remark. In applying the general result Theorem 6.3 in function spaces, usually a compromise has to be made between two conflicting restraints: in order that the constraint qualification be valid and, at the same time, the nonlinearities be differentiable, the range space $Z$ should not be too large; on the other hand, $Z$ also should not be too small, since otherwise the dual space $Z^*$ becomes too large in the sense that it contains functions of low regularity that can no longer be interpreted as (weakly differentiable) solutions to adjoint problems.
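The variational inequality for the control derived above can be evaluated pointwise. The following is a minimal sketch, assuming $\lambda > 0$ and the admissible interval $[-1, 1]$ of this example: it builds the candidate control $\bar v(x) = P_{[-1,1]}(-p(x)/\lambda)$ by clipping and then checks the inequality $(p + \lambda \bar v)(v - \bar v) \ge 0$ against test values $v \in [-1, 1]$. The function names and the sample values of the adjoint state are hypothetical and serve only as an illustration.

```python
def project(value, lower=-1.0, upper=1.0):
    """Clip a real number to the interval [lower, upper]."""
    return max(lower, min(upper, value))


def control_from_adjoint(p, lam):
    """Pointwise candidate control v(x) = P_[-1,1](-p(x)/lam); assumes lam > 0."""
    if lam <= 0:
        raise ValueError("this sketch assumes lam > 0")
    return [project(-p_i / lam) for p_i in p]


def satisfies_variational_inequality(p, v_bar, lam, trials=21, tol=1e-12):
    """Check (p + lam*v_bar)*(v - v_bar) >= 0 at every sample point
    for equally spaced test values v in [-1, 1]."""
    test_values = [-1.0 + 2.0 * k / (trials - 1) for k in range(trials)]
    return all(
        (p_i + lam * v_i) * (v - v_i) >= -tol
        for p_i, v_i in zip(p, v_bar)
        for v in test_values
    )


# Hypothetical adjoint values at five sample points of the domain.
p_samples = [3.0, 0.4, 0.0, -0.25, -2.0]
lam = 0.5
v_bar = control_from_adjoint(p_samples, lam)
print(v_bar)
print(satisfies_variational_inequality(p_samples, v_bar, lam))
```

Where $p + \lambda \bar v$ is positive the control sits at the lower bound, where it is negative the control sits at the upper bound, and in between the control equals $-p/\lambda$; a control that ignores this rule fails the check.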

Second-order sufficient conditions. Since the functional $J$ and the Nemytskii operator $\Phi$ are twice continuously Fréchet differentiable in $H^1(\Omega) \times L^2(\Omega)$ and $H^1(\Omega)$, respectively, so is the Lagrangian. Thus, in view of Lemma 6.4, the following condition is sufficient for local optimality: the pair $(\bar y, \bar v)$ satisfies both the first-order necessary conditions and the definiteness condition

\[ L''(\bar y, \bar v, p)[(y, v), (y, v)] \ge \delta \left( \|y\|_{H^1(\Omega)}^2 + \|v\|_{L^2(\Omega)}^2 \right) \]

for all pairs $(y, v)$ satisfying the boundary value problem

\[ \begin{aligned} -\Delta y + y + 3 \bar y^2 y &= v && \text{in } \Omega \\ \partial_\nu y &= 0 && \text{on } \Gamma. \end{aligned} \]

Then $\bar v$ is locally optimal in the sense of the norm of $L^2(\Omega)$. The above definiteness condition is already valid if we merely have, with a modified $\delta$,

\[ L''(\bar y, \bar v, p)[(y, v), (y, v)] \ge \delta\, \|v\|_{L^2(\Omega)}^2. \]

The explicit expression for the second derivative $L''$ is

\[ L''(\bar y, \bar v, p)[(y, v), (y, v)] = \|y\|_{L^2(\Omega)}^2 + \lambda\, \|v\|_{L^2(\Omega)}^2 - 6 \int_\Omega p\, \bar y\, y^2\, dx. \]

6.2. Control problems with state constraints

State constraints naturally arise in many applications. A typical example is that of heating problems in which the temperature is forbidden to exceed or fall short of certain prescribed threshold values. Such problems raise interesting, and in parts still unsolved, mathematical questions. Here, we shall only briefly address some basic ideas in order to enable the reader to consult the relevant literature for a more in-depth study. For a comprehensive treatment of the elliptic case, we refer the reader to Neittaanmäki et al. [NST06]. For simplicity, we confine ourselves to elliptic problems; the theory for parabolic problems is quite similar.

The necessary optimality conditions to be proved below may also be derived from Pontryagin's maximum principle for state-constrained elliptic problems; for this purpose, the maximum condition is transformed into a variational inequality. In the case of boundary controls, the corresponding maximum principle was proved by Alibert and Raymond [AR97] and by Casas [Cas93]. The same applies to state-constrained parabolic problems, which were treated in Casas [Cas97] and in Raymond and Zidani [RZ99]. However, the proof of Pontryagin's maximum principle is very technical, while the optimality conditions to be presented here can be obtained much more simply by means of the Lagrange method in Banach spaces. This technique was employed also in the works of Casas [Cas86] and Tröltzsch [Trö84b]. In the following, we apply it to derive first-order necessary conditions. We do not pursue second-order sufficient conditions, referring the reader to the papers [CTU00] and [RT00]. Thus far, second-order sufficient conditions for problems with pointwise state constraints in the whole domain could only be shown for low-dimensional domains; see Casas et al. [CDlRT08].

6.2.1. Convex problems.

An elliptic problem with pointwise state constraints. Let $\Omega \subset \mathbb{R}^N$ denote a bounded Lipschitz domain. We consider the optimal control problem

\[ \min J(y, u) := \frac{1}{2}\,\|y - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\,\|u\|_{L^2(\Omega)}^2, \tag{6.28} \]

subject to

\[ \begin{aligned} -\Delta y + y &= u && \text{in } \Omega \\ \partial_\nu y &= 0 && \text{on } \Gamma \end{aligned} \tag{6.29} \]

and the constraints

\[ \begin{aligned} u_a(x) \le u(x) &\le u_b(x) && \text{for a.e. } x \in \Omega, \\ y(x) &\le 0 && \forall x \in \bar\Omega. \end{aligned} \tag{6.30} \]

We are given $y_\Omega \in L^2(\Omega)$, $\lambda \ge 0$, and $u_a, u_b \in L^\infty(\Omega)$ such that $u_a \le u_b$ almost everywhere. In addition to the usual box constraints for the control, a pointwise state constraint for $y$ is imposed. By virtue of Lemma 4.6 on page 190, for every $u \in L^r(\Omega)$ with $r > N/2$, the elliptic boundary value problem (6.29) has a unique weak solution $y \in H^1(\Omega) \cap C(\bar\Omega)$. We therefore choose for the controls $u$ the space $U = L^r(\Omega)$ with some arbitrary $r > N/2$.

The control-to-state mapping $G : u \mapsto y$ is considered as a map between two different pairs of spaces: in the cost functional, $y$ appears as an $L^2$ function; there, we define $y = S u = E_Y G u$, with the embedding operator $E_Y : H^1(\Omega) \to L^2(\Omega)$. We then have $S : L^r(\Omega) \to L^2(\Omega)$.

In the state constraint (6.30), the continuity of $y$ is exploited, because the cone $K$ of nonnegative functions in $C(\bar\Omega)$ has interior points. Here, we regard $u \mapsto y$ as a mapping from $L^r(\Omega)$ into $C(\bar\Omega)$. To avoid the introduction of further notation, we denote this mapping also by $G$, although we had introduced $H^1(\Omega) \cap C(\bar\Omega)$ as the range space of $G$. From this, no confusion should arise.
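To make the control-to-state mapping $G : u \mapsto y$ of (6.29) and the state constraint in (6.30) concrete, the following sketch discretizes the one-dimensional analogue $-y'' + y = u$ on $(0, 1)$ with $y'(0) = y'(1) = 0$. It is only an illustration under stated assumptions: the interval, the second-order finite-difference scheme with mirrored ghost points, the Thomas algorithm, and all function names are choices made here, not part of the text.

```python
import math


def solve_state(u, length=1.0):
    """Approximate the solution of -y'' + y = u on (0, length), y' = 0 at both ends.

    u holds the control values on a uniform grid; the result holds the state
    values on the same grid (second-order finite differences).
    """
    n = len(u)
    if n < 3:
        raise ValueError("need at least three grid points")
    h = length / (n - 1)
    inv_h2 = 1.0 / (h * h)

    # Row i of the tridiagonal system:
    # lower[i]*y[i-1] + diag[i]*y[i] + upper[i]*y[i+1] = u[i].
    lower = [-inv_h2] * n
    diag = [2.0 * inv_h2 + 1.0] * n
    upper = [-inv_h2] * n
    upper[0] = -2.0 * inv_h2      # mirrored ghost point y[-1] = y[1]
    lower[n - 1] = -2.0 * inv_h2  # mirrored ghost point y[n] = y[n-2]

    # Thomas algorithm: forward elimination, then back substitution.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = u[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (u[i] - lower[i] * d[i - 1]) / denom
    y = [0.0] * n
    y[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        y[i] = d[i] - c[i] * y[i + 1]
    return y


def is_state_feasible(y, tol=0.0):
    """Discrete version of the state constraint y(x) <= 0 on the closed domain."""
    return max(y) <= tol


# Hypothetical control u(x) = -1 + 0.5*cos(pi*x) on 101 grid points.
grid = [i / 100 for i in range(101)]
control = [-1.0 + 0.5 * math.cos(math.pi * x) for x in grid]
state = solve_state(control)
print(max(state), is_state_feasible(state))
```

For a constant control the discrete state reproduces the same constant, which mirrors the continuous problem; the map is linear in $u$, as is $G$ in the convex problem considered here.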

With the specifications $U = L^r(\Omega)$, $Z = C(\bar\Omega)$, and $K = \{ y \in C(\bar\Omega) : y(x) \ge 0 \ \forall x \in \bar\Omega \}$, the optimal control problem (6.28)-(6.30) then turns out to be a special case of the optimization problem (6.1), namely,

\[ \min f(u) := \frac{1}{2}\,\|S u - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\,\|u\|_{L^2(\Omega)}^2, \]

subject to

\[ G(u) \le_K 0, \qquad u \in C, \]

where $C := U_{ad} = \{ u \in L^r(\Omega) : u_a(x) \le u(x) \le u_b(x) \ \text{for a.e. } x \in \Omega \}$.

Through the representation

\[ \langle z^*, y\rangle_{Z^*,Z} = \int_{\bar\Omega} y(x)\, d\mu(x), \]

the dual space $Z^*$ can be identified with the space $M(\bar\Omega)$ of all regular Borel measures $\mu$ defined on $\bar\Omega$; see Alt [Alt99]. Note that in view of the compactness of $\bar\Omega$, the notion of Radon measure, which is also commonly used, is equivalent to the notion of regular Borel measure.

We thus identify $z^*$ with $\mu \in M(\bar\Omega)$. The nonnegativity $z^* \ge_{K^+} 0$ then means that

\[ \int_{\bar\Omega} y(x)\, d\mu(x) \ge 0 \quad \forall y(\cdot) \ge 0, \]

which is equivalent to nonnegativity of the measure $\mu$. The Lagrangian function is given by

\[ L(u, \mu) = f(u) + \int_{\bar\Omega} (G u)(x)\, d\mu(x) = \frac{1}{2}\,\|S u - y_\Omega\|_{L^2(\Omega)}^2 + \frac{\lambda}{2}\,\|u\|_{L^2(\Omega)}^2 + \int_{\bar\Omega} y(x)\, d\mu(x). \]

To guarantee the existence of a Lagrange multiplier, we now suppose that the Slater condition is valid: we assume that there is some control $\hat u \in U_{ad}$ such that the associated state $\hat y = G \hat u$ satisfies the strict inequality

\[ \hat y(x) < 0 \quad \forall x \in \bar\Omega. \tag{6.31} \]

Since $\hat y$ is continuous on $\bar\Omega$, there is some $\delta > 0$ such that $\hat y(x) \le -\delta < 0$ for all $x \in \bar\Omega$. In other words, $-\hat y \in \operatorname{int} K$, and the Slater condition formulated in Theorem 6.1 is fulfilled. Hence, for any solution $\bar u$ to the given problem there exists an associated Lagrange multiplier $\mu \in M(\bar\Omega)$ such that $(\bar u, \mu)$ is a saddle point of the Lagrangian. By virtue of Theorem 6.2, it follows that

\[ D_u L(\bar u, \mu)(u - \bar u) \ge 0 \quad \forall u \in U_{ad}, \]

and hence

\[ f'(\bar u)(u - \bar u) + \int_{\bar\Omega} G'(\bar u)(u - \bar u)\, d\mu \ge 0. \tag{6.32} \]

The first term on the left-hand side can be reformulated as in Chapter 4 by means of an adjoint state $p_1$:

\[ f'(\bar u)(u - \bar u) = \int_\Omega (p_1 + \lambda \bar u)(u - \bar u)\, dx. \]

Here, $p_1 \in H^1(\Omega)$ is the weak solution to the elliptic boundary value problem

\[ \begin{aligned} -\Delta p_1 + p_1 &= \bar y - y_\Omega && \text{in } \Omega \\ \partial_\nu p_1 &= 0 && \text{on } \Gamma. \end{aligned} \tag{6.33} \]

We aim to deal with the second term in a similar way. Since $G'(\bar u)$ maps $L^r(\Omega)$ continuously into $C(\bar\Omega)$, the dual operator $G'(\bar u)^*$ maps $M(\bar\Omega)$ continuously into $L^r(\Omega)^* = L^{r'}(\Omega)$, where $r' = r/(r - 1)$. Hence, there exists some function $p_2 \in L^{r'}(\Omega)$ such that

\[ \int_{\bar\Omega} G'(\bar u)(u - \bar u)\, d\mu = \langle \mu, G'(\bar u)(u - \bar u)\rangle_{M(\bar\Omega), C(\bar\Omega)} = \langle G'(\bar u)^* \mu, u - \bar u\rangle_{L^{r'}(\Omega), L^r(\Omega)} = \int_\Omega p_2\, (u - \bar u)\, dx. \tag{6.34} \]

We now decompose $\mu$ into $\mu_\Omega + \mu_\Gamma$, where, as indicated by the subscripts, $\mu_\Omega$ and $\mu_\Gamma$ have their supports in $\Omega$ and $\Gamma$, respectively. Owing to the linearity of $G$, we have $G'(\bar u)(u - \bar u) = G(u - \bar u) = y - \bar y$, where $y$ and $\bar y$ denote the states associated with $u$ and $\bar u$, respectively. Hence, $p_2$ satisfies the relation

\[ \int_{\bar\Omega} (y - \bar y)\, d\mu = \int_\Omega (y - \bar y)\, d\mu_\Omega + \int_\Gamma (y - \bar y)\, d\mu_\Gamma = \int_\Omega p_2\, (u - \bar u)\, dx. \]

A comparison with the variational inequality (2.67) on page 74 shows how the function $p_2$ has to be defined. Formally, we must have

\[ \begin{aligned} -\Delta p_2 + p_2 &= \mu_\Omega && \text{in } \Omega \\ \partial_\nu p_2 &= \mu_\Gamma && \text{on } \Gamma. \end{aligned} \tag{6.35} \]

More precisely, $p_2$ should be a function that satisfies the variational equation

\[ \int_\Omega (\nabla p_2 \cdot \nabla v + p_2\, v)\, dx = \int_\Omega v\, d\mu_\Omega + \int_\Gamma v\, d\mu_\Gamma \]

for all $v \in H^1(\Omega)$ which are so smooth that all the integrals appearing in the equation are meaningful. In this way, we have arrived at an elliptic boundary value problem with a measure on the right-hand side. Note that the measure $\mu$ generates a functional that does not belong to $H^1(\Omega)^*$, in general. Since the elements $v \in H^1(\Omega)$ are not necessarily continuous, not every functional from $C(\bar\Omega)^*$ can be applied to $v$. Moreover, the integrability properties of the gradients $\nabla p$ and $\nabla v$ have to match each other. In conclusion, the definition of a "solution" to (6.35) requires a somewhat different formulation. For the unique solvability of (6.35) in a suitable sense, we refer the reader to Theorem 7.7 on page 366. With $p = p_1 + p_2$, one then arrives at the following result.

Theorem 6.5. Suppose that the control $\bar u$, together with the associated state $\bar y$, is optimal for the problem (6.28)-(6.30), and suppose that the Slater condition (6.31) is fulfilled. Then there exist a regular Borel measure $\mu \in M(\bar\Omega)$ of the form $\mu = \mu_\Omega + \mu_\Gamma$, where $\mu_\Omega := \mu|_\Omega$ and $\mu_\Gamma := \mu|_\Gamma$, and an associated adjoint state $p \in W^{1,s}(\Omega)$, with $s \in [1, \frac{N}{N-1})$ arbitrary, such that the adjoint problem (6.36), the variational inequality (6.37), and the complementarity condition (6.38) are satisfied. We thus have:

\[ \begin{aligned} -\Delta p + p &= \bar y - y_\Omega + \mu_\Omega && \text{in } \Omega \\ \partial_\nu p &= \mu_\Gamma && \text{on } \Gamma, \end{aligned} \tag{6.36} \]

\[ \int_\Omega (\lambda \bar u + p)(u - \bar u)\, dx \ge 0 \quad \forall u \in U_{ad}, \tag{6.37} \]

\[ \mu \ge 0, \qquad \int_{\bar\Omega} \bar y(x)\, d\mu(x) = 0. \tag{6.38} \]

Proof: The variational inequality with $p = p_1 + p_2$ follows from the relations (6.32)-(6.34). The representation of $p$ as the solution to the adjoint problem is a consequence of the fact that $p_2$ solves, by Theorem 7.7, the elliptic boundary value problem (6.35). Finally, the complementarity condition results from the relation (6.5) of the Lagrange multiplier rule. □

In the above theorem, only the state constraints have been eliminated by using a Lagrange multiplier, while the box constraints for the control have been accounted for by the variational inequality. By means of the pointwise construction (6.8) on page 329, these constraints can also be included through multipliers $\mu_a, \mu_b \in L^s(\Omega)$. To this end, we set

\[ \mu_a(x) := (\lambda \bar u(x) + p(x))_+, \qquad \mu_b(x) := (\lambda \bar u(x) + p(x))_-. \]

Then, by definition,

\[ \mu_a \ge 0, \qquad \mu_b \ge 0, \qquad \lambda \bar u + p + \mu_b - \mu_a = 0, \]

and the usual pointwise analysis of the variational inequality (6.37) yields that for almost all $x \in \Omega$ the complementarity conditions

\[ (u_a(x) - \bar u(x))\, \mu_a(x) = 0, \qquad (\bar u(x) - u_b(x))\, \mu_b(x) = 0 \]

are valid. In summary, we have derived the following optimality system for the quantities $\bar u$, $\bar y$, $\mu_a$, $\mu_b$, and $p$:

\[ \begin{array}{rclcrcl} -\Delta \bar y + \bar y &=& \bar u & \qquad & -\Delta p + p &=& \bar y - y_\Omega + \mu_\Omega \\ \partial_\nu \bar y &=& 0, & & \partial_\nu p &=& \mu_\Gamma, \end{array} \]

\[ \lambda \bar u + p + \mu_b - \mu_a = 0, \]

\[ \mu \ge 0, \qquad \int_{\bar\Omega} \bar y(x)\, d\mu(x) = 0, \]

\[ \begin{aligned} \mu_a(x) \ge 0, \quad (u_a(x) - \bar u(x))\, \mu_a(x) &= 0 && \text{for a.e. } x \in \Omega, \\ \mu_b(x) \ge 0, \quad (\bar u(x) - u_b(x))\, \mu_b(x) &= 0 && \text{for a.e. } x \in \Omega. \end{aligned} \]

Remark. Also in this case, the formal Lagrange method leads to the correct result: in fact, defining the Lagrangian function as

\[ \mathcal{L}(u, y, p, \mu_a, \mu_b, \mu) := J(y, u) + \int_{\bar\Omega} y\, d\mu - \int_\Omega \{ \nabla y \cdot \nabla p + (y - u)\, p - (u_a - u)\, \mu_a - (u - u_b)\, \mu_b \}\, dx, \]

we find that the equality $D_y \mathcal{L} = 0$ leads to the adjoint system while $D_u \mathcal{L} = 0$, in combination with the usual complementarity conditions, yields the other relations.

Minimization of the state at a point. In the next convex problem, no state constraints will be prescribed. We will see, however, that simple linear optimal control problems with box constraints for the control may already lead to adjoint differential equations involving measures if the cost functional does not have the usual integral form. Hence this problem, which is a boundary control problem for a change, fits into this chapter.

We again consider a bounded Lipschitz domain $\Omega \subset \mathbb{R}^N$, $N \ge 2$, and assume that functions $\alpha, u_a, u_b \in L^\infty(\Gamma)$ are given such that $\alpha \ge 0$ and $u_a \le u_b$ almost everywhere in $\Gamma$. We change the problem of achieving an optimal stationary temperature distribution a little bit by specifying that

the temperature $y$ be minimized at a prescribed point $x_0 \in \Omega$. We thus consider the following problem:

\[ \min J(y) := y(x_0), \tag{6.39} \]

subject to

\[ \begin{aligned} -\Delta y + y &= 0 && \text{in } \Omega \\ \partial_\nu y + \alpha y &= u && \text{on } \Gamma \end{aligned} \tag{6.40} \]

and

\[ u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Gamma. \tag{6.41} \]

At first glance, the problem (6.39)-(6.41), being only linear, seems to be simpler than the linear-quadratic optimal control problems investigated in Chapter 2. It turns out, however, that from a theoretical viewpoint it is just as difficult.

We consider $y$ in the state space $Y = H^1(\Omega) \cap C(\bar\Omega)$, so that the value $y(x_0)$ will be well defined. The existence of at least one optimal control $\bar u$ is easily proved: the existence result of Theorem 4.15 remains valid for the linear optimal control problem under study, although the cost functional is not of integral type as in (4.31) on page 207. However, this property of the cost functional was not used in the proof of the theorem; in fact, we only needed its continuity, which is evident for (6.39). To guarantee that $y \in C(\bar\Omega)$, we consider the set $U_{ad}$ of admissible controls in some space $L^p(\Gamma)$ for $p > N - 1$.

Difficulties first arise in the derivation of necessary optimality conditions, since the cost functional is not of integral type. Application of the formal Lagrange method to determine the adjoint problem fails, as the reader may verify. However, by means of the Dirac measure $\mu = \delta_{x_0}$, the cost functional can be written in the integral form

\[ y(x_0) = f(y) = \int_{\bar\Omega} y(x)\, d\mu(x). \]

Clearly, $f$ is continuous on $Y$.

Proceeding as in the previous section, we now make use of the linear solution operator $G : u \mapsto y$, which maps $L^p(\Gamma)$ continuously into $Y$. We obtain

\[ f'(\bar u)(u - \bar u) = \int_{\bar\Omega} (y - \bar y)\, d\mu = \int_{\bar\Omega} G(u - \bar u)\, d\mu \ge 0 \quad \forall u \in U_{ad}. \]

This inequality is of the same form as (6.32), except that $\mu$ represents the derivative of $J$, not a multiplier.

The next steps are completely analogous to those taken in the preceding problem: we define $p \in W^{1,s}(\Omega)$ as the solution to the adjoint system

\[ \begin{aligned} -\Delta p + p &= \delta_{x_0} \\ \partial_\nu p + \alpha p &= 0. \end{aligned} \tag{6.42} \]

From Theorem 7.7 on page 366, we infer that

\[ \int_{\bar\Omega} (y - \bar y)\, d\mu = \int_\Gamma p\, (u - \bar u)\, ds. \]

Consequently, $\bar u$ must obey the variational inequality

\[ \int_\Gamma p\, (u - \bar u)\, ds \ge 0 \quad \forall u \in U_{ad}. \tag{6.43} \]

The relations (6.42) and (6.43) constitute the necessary optimality conditions.

Best approximation in the maximum norm. We now discuss a convex problem with nonsmooth cost functional which can be transformed into a convex differentiable problem with state constraints. Here, the distance to the desired target function $y_\Omega$ is no longer measured in terms of the $L^2$ norm, but rather with respect to the maximum norm. We thus consider the optimal control problem

\[ \min J(y, u) := \|y - y_\Omega\|_{C(\bar\Omega)} + \frac{\lambda}{2}\,\|u\|_{L^2(\Gamma)}^2, \tag{6.44} \]

subject to

\[ \begin{aligned} -\Delta y + y &= 0 && \text{in } \Omega \\ \partial_\nu y + \alpha y &= u && \text{on } \Gamma \end{aligned} \]

and

\[ u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Gamma. \]

Here, $y_\Omega \in C(\bar\Omega)$ is prescribed, and all the other data are chosen as in (6.39)-(6.41). The constant $\lambda$ is nonnegative and hence may vanish. Evidently, the cost functional in (6.44) is not differentiable. In addition, it is not even well defined in the space $H^1(\Omega)$, because elements of $H^1(\Omega)$ do not have to be continuous. In order that $y \in C(\bar\Omega)$ automatically, we assume that $u_a, u_b \in L^\infty(\Gamma)$. Then $u$ is bounded and measurable, and Theorem 4.7 on page 191 ensures the desired continuity of $y$.
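The lack of differentiability of the maximum-norm term in (6.44) can be seen already with two sample points. The sketch below is an illustration only: it replaces the maximum over the closed domain by a maximum over hypothetical grid samples and compares the left and right difference quotients of $t \mapsto \max_i |y_i + t\, d_i - y_{\Omega,i}|$ at $t = 0$; all names and sample values are assumptions made here.

```python
def max_norm_distance(y, y_target):
    """Discrete stand-in for the maximum norm of y - y_target (max over samples)."""
    return max(abs(a - b) for a, b in zip(y, y_target))


def one_sided_quotients(y, y_target, direction, step=1e-3):
    """Left and right difference quotients of t -> max_norm_distance(y + t*direction)."""
    base = max_norm_distance(y, y_target)
    forward = [a + step * d for a, d in zip(y, direction)]
    backward = [a - step * d for a, d in zip(y, direction)]
    right = (max_norm_distance(forward, y_target) - base) / step
    left = (max_norm_distance(backward, y_target) - base) / (-step)
    return left, right


# Two samples at which the deviation |y - y_target| attains its maximum simultaneously.
y_samples = [1.0, 1.0]
target = [0.0, 0.0]
direction = [1.0, -1.0]
left, right = one_sided_quotients(y_samples, target, direction)
print(left, right)
```

Along this direction the functional behaves like $1 + |t|$, so the two one-sided quotients have opposite signs and no derivative exists at $t = 0$. This kink is what the reformulation with pointwise state constraints removes.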

By means of a simple and often used trick, we transform the given problem into a linear-quadratic and thus differentiable optimization problem. To this end, we put

\[ \eta = \max_{x \in \bar\Omega} |y(x) - y_\Omega(x)|. \]

Evidently, we have

\[ -\eta \le y(x) - y_\Omega(x) \le \eta \quad \forall x \in \bar\Omega. \]

Therefore, our problem can be rewritten in the equivalent form

\[ \min \left\{ \eta + \frac{\lambda}{2}\,\|u\|_{L^2(\Gamma)}^2 \right\} \tag{6.45} \]

subject to $\eta \in \mathbb{R}$,

\[ \begin{aligned} -\Delta y + y &= 0 && \text{in } \Omega \\ \partial_\nu y + \alpha y &= u && \text{on } \Gamma, \end{aligned} \tag{6.46} \]

the control constraints

\[ u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Gamma, \tag{6.47} \]

and the state constraints

\[ \begin{aligned} y(x) &\le y_\Omega(x) + \eta \\ -y(x) &\le -y_\Omega(x) + \eta \end{aligned} \qquad \text{for all } x \in \bar\Omega. \tag{6.48} \]

We have thus removed the nondifferentiability at the expense of adding pointwise state constraints. But we have already demonstrated for the problem (6.28)-(6.30) how such state constraints can be handled. The mathematically rigorous derivation of the necessary optimality conditions will be a task given to the reader in Exercise 6.3. It should be noted that the Slater condition (6.4) can always be satisfied with sufficiently large $\eta > 0$.

Formal derivation of the optimality conditions. For simplicity, we employ the formal Lagrange method, which leads to the correct result. At first, we eliminate only the state constraints by means of Lagrange multipliers, but not the elliptic boundary value problem or the box constraints for the control. In this way, the state-constrained problem is transformed into a problem that only has control constraints and thus can be formally treated using the theory of Chapter 2. To this end, we introduce as the Lagrangian function

\[ L(y, u, \eta, \mu_1, \mu_2) = \eta + \frac{\lambda}{2}\,\|u\|_{L^2(\Gamma)}^2 + \int_{\bar\Omega} (y - y_\Omega - \eta)\, d\mu_1 + \int_{\bar\Omega} (-y + y_\Omega - \eta)\, d\mu_2, \]

with regular Borel measures $\mu_1$ and $\mu_2$. The reason for this choice is given by the following interpretation of our problem: we minimize the cost functional (6.45), subject to the state constraints (6.48) and the convex constraint $(y, u, \eta) \in C$, where

\[ C := \{ (y, u, \eta) \in H^1(\Omega) \times L^2(\Gamma) \times \mathbb{R} : y \text{ and } u \text{ satisfy (6.46) and (6.47)} \}. \]

The associated Lagrange multiplier rule follows from Theorem 6.1 and Theorem 6.2. Since the postulated constraint qualification is satisfied with sufficiently large $\eta$, there exist Lagrange multipliers $\mu_1, \mu_2 \in M(\bar\Omega)$ satisfying the saddle point condition (6.3). By means of these multipliers, we can eliminate the state constraints: in fact, by (6.3) the triple $(\bar y, \bar u, \bar\eta)$ solves the following optimal control problem without state constraints:

\[ \min \tilde J(y, u, \eta) := L(y, u, \eta, \mu_1, \mu_2), \]

subject to

\[ \begin{aligned} -\Delta y + y &= 0 && \text{in } \Omega \\ \partial_\nu y + \alpha y &= u && \text{on } \Gamma, \end{aligned} \]

\[ \eta \in \mathbb{R}, \qquad u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Gamma. \]

The controls are the variable $\eta \in \mathbb{R}$ and the function $u \in U_{ad} = \{ u \in L^2(\Gamma) : u_a(x) \le u(x) \le u_b(x) \ \text{for a.e. } x \in \Gamma \}$.

The new problem has only box constraints for the control $u$. The corresponding theory was treated in Chapter 2. If the necessary optimality conditions for problems with cost functionals of integral type are formally applied, then one obtains the derivative of the cost functional with respect to $y$ on the right-hand side of the adjoint problem for $p$; more precisely, the part defined in $\Omega$ appears in the elliptic equation, while the part defined on $\Gamma$ occurs in the boundary condition. We therefore have to determine the

derivative $D_y \tilde J$ for fixed multipliers $\mu_1$ and $\mu_2$. It follows that

\[ D_y \tilde J(\bar y, \bar u, \bar\eta)\, y = \int_{\bar\Omega} y(x)\, (d\mu_1(x) - d\mu_2(x)) = \int_\Omega y\, (d\mu_1 - d\mu_2)|_\Omega + \int_\Gamma y\, (d\mu_1 - d\mu_2)|_\Gamma. \]

Therefore, the adjoint problem is given by

\[ \begin{aligned} -\Delta p + p &= (\mu_1 - \mu_2)|_\Omega && \text{in } \Omega \\ \partial_\nu p + \alpha p &= (\mu_1 - \mu_2)|_\Gamma && \text{on } \Gamma. \end{aligned} \tag{6.49} \]

By virtue of Theorem 7.7 on page 366, it has a unique solution $p \in W^{1,s}(\Omega)$ with $s \in [1, \frac{N}{N-1})$.

To derive the other necessary conditions, we make use of the Lagrangian function $\mathcal{L}$ associated with the minimization problem for $\tilde J$, which is given by

\[ \mathcal{L}(y, u, \eta) = \tilde J(y, u, \eta) - \int_\Omega \nabla y \cdot \nabla p\, dx - \int_\Omega y\, p\, dx - \int_\Gamma (\alpha y - u)\, p\, ds. \]

We must have $D_u \mathcal{L}(\bar y, \bar u, \bar\eta)(u - \bar u) \ge 0$ for all $u \in U_{ad}$, that is,

\[ \int_\Gamma (\lambda \bar u + p)(u - \bar u)\, ds \ge 0 \quad \forall u \in U_{ad}. \tag{6.50} \]

Moreover, we have

\[ D_\eta \mathcal{L} = D_\eta L = 1 - \int_{\bar\Omega} d\mu_1 - \int_{\bar\Omega} d\mu_2, \]

and thus the condition $D_\eta \mathcal{L}(\bar y, \bar u, \bar\eta) = 0$ implies that

\[ \int_{\bar\Omega} (d\mu_1 + d\mu_2) = 1. \tag{6.51} \]

In summary, $(\bar y, \bar u, \bar\eta)$ has to satisfy the optimality system consisting of the state problem (6.46), the adjoint problem (6.49), the constraints (6.47) and (6.48), the complementary slackness conditions

\[ \int_{\bar\Omega} (\bar y - y_\Omega - \bar\eta)\, d\mu_1 = \int_{\bar\Omega} (-\bar y + y_\Omega - \bar\eta)\, d\mu_2 = 0, \]

the variational inequality (6.50), and equation (6.51).

The above derivation was only formal, since we treated the integral functional $\tilde J$ as if it were a continuous linear functional on $L^2(\Omega)$ with respect to $y$. However, if we take the same approach as in the problem of minimizing the value of the state at a given point, then the minimization of $\tilde J$ without state constraints leads to the same result in a mathematically more rigorous way.

6.2.2. A nonconvex problem. We now give another illustrative example of the use of Theorem 6.3. We consider the optimal control problem for a semilinear elliptic equation:

\[ \min J(y, u) := \int_\Omega \varphi(x, y(x))\, dx + \int_\Omega \psi(x, u(x))\, dx, \tag{6.52} \]

subject to

\[ \begin{aligned} -\Delta y + d(x, y) &= u && \text{in } \Omega \\ \partial_\nu y &= 0 && \text{on } \Gamma \end{aligned} \tag{6.53} \]

and to the box and state constraints

\[ u_a(x) \le u(x) \le u_b(x) \quad \text{for a.e. } x \in \Omega, \tag{6.54} \]

\[ y(x) \le 0 \quad \forall x \in \bar\Omega. \tag{6.55} \]

We suppose that Assumption 4.14 on page 206 holds. By the nonlinearity of the mapping $G : u \mapsto y$, the above problem is a nonconvex one. This would also be the case if a convex quadratic cost functional were chosen in place of the functional in (6.52). By Theorem 4.17 on page 213, the mapping $G : u \mapsto y$ is continuously differentiable from $L^r(\Omega)$ into $H^1(\Omega) \cap C(\bar\Omega)$, for $r > N/2$. We therefore fix some $r > N/2$ and put $f(u) = J(G(u), u)$. Then the above problem becomes an optimization problem in a Banach space, namely,

\[ \min f(u), \qquad u \in C, \quad G(u) \le_K 0, \tag{6.56} \]

where

\[ \begin{aligned} C &= \{ u \in L^r(\Omega) : u_a(x) \le u(x) \le u_b(x) \ \text{for a.e. } x \in \Omega \}, \\ K &= \{ y \in C(\bar\Omega) : y(x) \ge 0 \ \forall x \in \bar\Omega \}. \end{aligned} \]

This problem is a special case of the problem (6.1), with the specifications $U = L^r(\Omega)$ and $Z = C(\bar\Omega)$. $K$ has nonempty interior in $Z$. Therefore, it makes sense to postulate the validity of the linearized Slater condition (6.18) on page 332: we require that there are some $\varepsilon > 0$ and some $\hat u \in C$ such that, with $\hat y = G'(\bar u)(\hat u - \bar u)$ and $\bar y = G(\bar u)$,

\[ \bar y(x) + \hat y(x) \le -\varepsilon \quad \forall x \in \bar\Omega. \tag{6.57} \]
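On a grid, the linearized Slater condition (6.57) reduces to a sign check on the sum of the optimal state and the linearized state. The following sketch computes the largest admissible margin $\varepsilon$ from sampled values; the function name and the sample data are hypothetical, and the grid maximum only approximates the maximum over the closed domain.

```python
def slater_margin(y_bar, y_hat):
    """Largest eps with y_bar[i] + y_hat[i] <= -eps at every sample.

    A positive return value means the sampled version of (6.57) holds;
    zero or a negative value means it fails for this choice of y_hat.
    """
    return -max(a + b for a, b in zip(y_bar, y_hat))


# Hypothetical samples: the state constraint is active at the first point,
# so the linearized state must be strictly negative there.
y_bar = [0.0, -0.2, -0.5]
y_hat = [-0.3, -0.1, 0.1]
print(slater_margin(y_bar, y_hat))
```

At points where the state constraint is active the direction $\hat y$ has to be strictly negative, whereas at inactive points it may even be positive as long as the sum stays below $-\varepsilon$.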

The function $\hat y$ thus defined is the solution to the linearized problem

\[ \begin{aligned} -\Delta \hat y + d_y(x, \bar y)\, \hat y &= \hat u - \bar u && \text{in } \Omega \\ \partial_\nu \hat y &= 0 && \text{on } \Gamma. \end{aligned} \]

Under the condition (6.57), the Zowe-Kurcyusz constraint qualification (6.11) on page 330 is fulfilled. The multiplier $z^*$ associated with the state constraint $y \le 0$ can be identified with some $\mu \in M(\bar\Omega)$, and the Lagrangian $L$ corresponding to problem (6.56) is given by

\[ L(u, \mu) = f(u) + \int_{\bar\Omega} G(u)\, d\mu = f(u) + \int_\Omega G(u)\, d\mu_\Omega + \int_\Gamma G(u)\, d\mu_\Gamma, \]

where $\mu_\Omega$ and $\mu_\Gamma$ are the restrictions of $\mu \in M(\bar\Omega)$ to $\Omega$ and $\Gamma$, respectively. From Theorem 6.3, we deduce the following necessary optimality conditions.

Lemma 6.6. Suppose that $\bar u$ is a local solution to problem (6.56) that satisfies the constraint qualification (6.57). Then there is a nonnegative regular Borel measure $\mu \in M(\bar\Omega)$ such that

\[ \begin{aligned} D_u L(\bar u, \mu)(u - \bar u) &\ge 0 \quad \forall u \in C, \\ \int_{\bar\Omega} \bar y(x)\, d\mu(x) &= 0. \end{aligned} \tag{6.58} \]

Substituting the derivative $D_u L$ of the above Lagrangian into (6.58) yields

\[ f'(\bar u)(u - \bar u) + \int_\Omega G'(\bar u)(u - \bar u)\, d\mu_\Omega + \int_\Gamma G'(\bar u)(u - \bar u)\, d\mu_\Gamma \ge 0 \tag{6.59} \]

for all $u \in C$. We thus obtain the following result.

Lemma 6.7. Under the above assumptions, $\bar u$ solves the linear optimal control problem

\[ \min j(y, u) := \int_\Omega \varphi_y(x, \bar y(x))\, y(x)\, dx + \int_\Omega \psi_u(x, \bar u(x))\, u(x)\, dx + \int_\Omega y(x)\, d\mu_\Omega(x) + \int_\Gamma y(x)\, d\mu_\Gamma(x), \tag{6.60} \]

subject to

\[ \begin{aligned} -\Delta y + d_y(x, \bar y)\, y &= u, \qquad u_a(x) \le u(x) \le u_b(x), \\ \partial_\nu y &= 0. \end{aligned} \]

Proof: The assertion follows from the expression for $G'(\bar u)$, upon rearranging (6.59) for $\bar u$. □

By the above lemma, $\bar u$ solves a convex problem in which only the box constraints for the control occur. From this point onward, the optimality conditions can be derived as in the preceding section. To this end, we define the adjoint state $p$ as the solution to

\[ \begin{aligned} -\Delta p + d_y(x, \bar y)\, p &= \varphi_y(x, \bar y) + \mu_\Omega && \text{in } \Omega \\ \partial_\nu p &= \mu_\Gamma && \text{on } \Gamma. \end{aligned} \tag{6.61} \]

Owing to Theorem 7.7, the above elliptic problem has a unique solution $p$ that belongs to $W^{1,s}(\Omega)$ for every $s < N/(N - 1)$. With this $p$, the variational inequality

\[ \int_\Omega \big( p(x) + \psi_u(x, \bar u(x)) \big)\, (u(x) - \bar u(x))\, dx \ge 0 \quad \forall u \in C \tag{6.62} \]

is satisfied. The first-order necessary optimality conditions are thus derived in the form of a variational inequality for $\bar u$. Alternatively, one can transform the variational inequality (6.62) by means of Lagrange multipliers into a set of equations and formulate all optimality conditions in terms of a Karush-Kuhn-Tucker system.

To this end, we define Lagrange multipliers $\mu_a$ and $\mu_b$ associated with the box constraints for $u$ by

\[ \mu_a(x) = \big( p(x) + \psi_u(x, \bar u(x)) \big)_+, \qquad \mu_b(x) = \big( p(x) + \psi_u(x, \bar u(x)) \big)_-. \]

By virtue of the embedding result of Theorem 7.1 on page 355, these pointwise multipliers define functions belonging to $L^q(\Omega)$ for any $q < N/(N - 2)$. As has been explained repeatedly before, one can now transform the variational inequality into an equation plus complementary slackness conditions. The reader will be asked in Exercise 6.4 to produce the corresponding argument. One then arrives at the following result.

Theorem 6.8. Suppose that $\bar u$ is a locally optimal control for the problem (6.52)-(6.55) with associated state $\bar y$, and suppose that the constraint qualification (6.57) is fulfilled. Then there exist $\mu_a$ and $\mu_b$ belonging to $L^q(\Omega)$ for all $q < N/(N - 2)$, some $\mu \in M(\bar\Omega)$, and an adjoint state $p$ belonging to $W^{1,s}(\Omega)$ for all $s < N/(N - 1)$ such that $u = \bar u$, $y = \bar y$, $p$, $\mu_a$, $\mu_b$, and $\mu$ satisfy the following optimality system:

\[ \begin{array}{rclcrcl} -\Delta y + d(x, y) &=& u & \qquad & -\Delta p + d_y(x, y)\, p &=& \varphi_y(x, y) + \mu_\Omega \\ \partial_\nu y &=& 0 & & \partial_\nu p &=& \mu_\Gamma \end{array} \]

\[ p + \psi_u(x, u) + \mu_b - \mu_a = 0 \]

\[ \mu \ge 0, \qquad \int_{\bar\Omega} y(x)\, d\mu(x) = 0 \]

\[ \begin{aligned} \mu_a(x) \ge 0, \quad (u_a(x) - u(x))\, \mu_a(x) &= 0 && \text{for a.e. } x \in \Omega \\ \mu_b(x) \ge 0, \quad (u(x) - u_b(x))\, \mu_b(x) &= 0 && \text{for a.e. } x \in \Omega. \end{aligned} \]

References. State constraints have attracted much interest because of their importance in various applications. We only mention a small selection of the numerous relevant papers.

In the case of elliptic problems, first-order necessary optimality conditions were treated, for instance, in [AR97], [BC91], [BC95], [Cas86], and [Cas93], while the papers [CM02a], [CT99], [CTU00], [MT06], and [CDlRT08] deal with second-order conditions. For a comprehensive treatment of the elliptic case, we refer the reader to Neittaanmäki et al. [NST06]. In the parabolic case, we refer to [Cas97], [Mac81], [Mac82], [Mac83a], [RZ98], [RZ99], and [Trö84b] for first-order optimality conditions, and to [GT93], [RT00], and [CDlRT08] for second-order conditions. The structure of Lagrange multipliers for state constraints was studied in [BK02a].

Error estimates for finite element approximations and state constraints were investigated in [CM02b] and [TT96]. A more detailed exposition of this subject, as well as references to the relevant literature, can be found in [HPUU09]. Numerical techniques for the solution of state-constrained elliptic problems are given in, e.g., [BK02a], [BK02b], [GR01], [MRT06], [MT06], [MM01], and [MM00]. For parabolic problems, we refer to [AM84], [AM89], [LS00], and [Trö84a]. A comprehensive treatment of numerical methods for the optimal control of partial differential equations is contained in [IK08] and [HPUU09].

Further references for optimal control problems. A comprehensive treatment of the optimal control theory for linear-quadratic elliptic, parabolic, and hyperbolic problems can be found in the standard textbook [Lio71]. Various practical applications were presented in [But69] and [But75]. Nonlinear problems were treated in, e.g., [Bar93], [HPUU09], [Lio69], [NT94], [NST06], and [Tib90]. Numerical methods for the optimal control of flow problems were discussed in [Gun03]. In [NST06], further results concerning the theory of elliptic problems with state constraints were given. The use of strongly continuous semigroups, in place of weak solutions, for parabolic problems can be found in [BDPDM92], [BDPDM93], [Fat99], [LT00a], and [LT00b]. Riccati techniques for the solution of control problems, as well as results concerning controllability and observability, are covered comprehensively in [LT00a] and [LT00b]. An introduction to the modern theory of controllability and stabilizability is given in [Cor07]. In this context, coupled systems of partial differential equations were treated in [Las02]. The use of Riccati operators was also addressed in [BDPDM92], [BDPDM93], and [Lio71].

6.3. Exercises

6.1 Determine the dual cone $K^+$ to the cone $K$ of almost-everywhere nonnegative functions in $L^p(\Omega)$ for $1 \le p < \infty$.

6.2 Prove that the functions $\mu_1$ and $\mu_2$ defined in relation (6.8) on page 329 are Lagrange multipliers associated with the optimal control $\bar u$ of problem (6.7).

6.3 Derive the first-order necessary optimality conditions for the optimal control problem with maximum norm functional defined on page 345.

6.4 Derive the optimality system stated in Theorem 6.8 on page 351 by means of the variational inequality (6.62) on page 351.

6.5 Show that for the elliptic problem (6.25)-(6.26) on page 336, every solution $(\bar y, \bar v)$ satisfies the constraint qualification (6.15) on page 331.

Chapter 7

Supplementary results on partial differential equations

7.1. Embedding results

The usefulness of Sobolev spaces is to a large extent determined by embedding results and trace theorems that will be provided in this chapter. We follow the standard text by Adams [Ada78].

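Theorem 7.1. Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain. Moreover, let $1 < p < \infty$, and let $m$ be a nonnegative integer. Then the following embeddings exist and are continuous:

• for $mp < N$: $W^{m,p}(\Omega) \hookrightarrow L^q(\Omega)$ if $1 \le q \le \frac{Np}{N - mp}$
• for $mp = N$: $W^{m,p}(\Omega) \hookrightarrow L^q(\Omega)$ if $1 \le q < \infty$
• for $mp > N$: $W^{m,p}(\Omega) \hookrightarrow C(\bar\Omega)$.

In particular, if $\Omega \subset \mathbb{R}^2$, then $H^1(\Omega) = W^{1,2}(\Omega) \hookrightarrow L^q(\Omega)$ for all $1 \le q < \infty$, and if $\Omega \subset \mathbb{R}^3$, then $H^1(\Omega) \hookrightarrow L^6(\Omega)$. The smoothness properties of boundary values are described by the following result.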

Theorem 7.2. Let $m \in \mathbb{N}$ with $m > 0$, and let the boundary $\Gamma$ be of class $C^{m-1,1}$. Then for $mp < N$ the trace operator $\tau$ is continuous from $W^{m,p}(\Omega)$ into $L^r(\Gamma)$, provided that $1 \le r \le \frac{(N-1)\,p}{N - mp}$. If $mp = N$, then $\tau$ is continuous for all $1 \le r < \infty$.
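The exponent ranges in Theorems 7.1 and 7.2 can be tabulated mechanically. The sketch below uses exact rational arithmetic so that non-integer values of $p$ are handled without rounding; the function names and the string return values are conventions chosen here, not notation from the text, and the hypotheses on $\Omega$, $\Gamma$, and $p$ are not checked.

```python
from fractions import Fraction


def embedding_exponent(N, m, p):
    """Theorem 7.1: largest q with W^{m,p}(Omega) -> L^q(Omega) when m*p < N.

    Returns "any finite q" when m*p == N and "continuous" when m*p > N.
    """
    p = Fraction(p)
    if m * p < N:
        return N * p / (N - m * p)
    if m * p == N:
        return "any finite q"
    return "continuous"


def trace_exponent(N, m, p):
    """Theorem 7.2: largest r with trace W^{m,p}(Omega) -> L^r(Gamma) when m*p < N.

    Returns "any finite r" when m*p == N; the case m*p > N is not covered
    by the theorem as stated, so None is returned.
    """
    if m < 1:
        raise ValueError("Theorem 7.2 requires m > 0")
    p = Fraction(p)
    if m * p < N:
        return (N - 1) * p / (N - m * p)
    if m * p == N:
        return "any finite r"
    return None


print(embedding_exponent(3, 1, 2))               # H^1 in three dimensions
print(embedding_exponent(2, 1, 2))               # H^1 in two dimensions
print(trace_exponent(3, 1, 2))                   # traces of H^1 in three dimensions
print(embedding_exponent(3, 1, Fraction(3, 2)))  # limiting case s = N/(N-1), N = 3
```

The last call uses the borderline exponent $s = N/(N-1)$ and returns $N/(N-2)$ for $N = 3$, which is the upper bound quoted for the integrability of adjoint states in $W^{1,s}(\Omega)$, $s < N/(N-1)$, in Section 6.2.2.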


The above theorems follow from Theorem 5.4 and Theorem 5.22 in [Ada78]. We also refer to [Eva98] and [Wlo82]. Collections of results on Sobolev spaces can be found in [Fur99] and [GGZ74]. For an extension of the above theorems to noninteger $m$, see [Ada78], Theorems 7.57 and 7.53 and Remark 7.56, as well as the comprehensive treatment in [Tri95]. By means of fractional-order Sobolev spaces, one can obtain a more precise characterization of the trace mapping. From Theorem 7.53 in [Ada78], we have the following result for integers $m \ge 1$:

Theorem 7.3. Suppose that $\Omega$ is a domain of class $C^m$, and let $1 < p < \infty$. Then the trace operator $\tau$ is continuous from $W^{m,p}(\Omega)$ onto $W^{m-1/p,p}(\Gamma)$.

In particular, the continuity of the mapping $\tau : H^1(\Omega) \to H^{1/2}(\Gamma)$ follows; $\tau$ is even surjective. The following result was used in the existence proof for optimal controls.

Theorem 7.4 (Rellich). Suppose that $\Omega$ is a bounded Lipschitz domain, and let $1 < p < \infty$ and $m \in \mathbb{N}$, with $m > 0$. Then every bounded set in $W^{m,p}(\Omega)$ is relatively compact in $W^{m-1,p}(\Omega)$.

The above property is called a compact embedding. In particular, bounded subsets of $H^1(\Omega)$ are relatively compact in $L^2(\Omega)$. We remark that the above results remain valid under somewhat weaker requirements on the boundary $\Gamma$ (regular boundaries, boundaries satisfying a cone condition); see, e.g., [Ada78], [GGZ74], or [Gri85].

This result was proved on page 190 under the simplifying assumption that the coefficient functions $a_{ij}$ and the boundary $\Gamma$ of $\Omega$ are all sufficiently smooth. However, it remains valid for the elliptic differential operator $A$ introduced on page 37 if the coefficient functions $a_{ij}$ belong to $L^\infty(\Omega)$ and $\Omega$ is a bounded Lipschitz domain. This can be verified using the following line of argument that was communicated to me by J. Griepentrog.

Let $c_0 \in L^\infty(\Omega)$ be nonnegative almost everywhere, and let $\|c_0\|_{L^\infty(\Omega)} > 0$. In the following, we put $V := H^1(\Omega)$ and $G := \bar\Omega$. We then consider the elliptic operator $L \in \mathcal{L}(V, V^*)$ defined by

\[ \langle L y, v\rangle_{V^*,V} = \int_\Omega \Big( \sum_{i,j=1}^{N} a_{ij}(x)\, D_i y(x)\, D_j v(x) + c_0(x)\, y(x)\, v(x) \Big)\, dx. \]

Then the linear elliptic Neumann boundary value problem $L y = F$ has for every functional $F \in V^*$ a unique solution $y \in V$.

In [Gri02], Theorem 4.12, it was proved that for any $\omega \in [0, N]$ two families of Sobolev-Campanato spaces $W_0^{1,2,\omega}(G) \subset V$ and $Y^{-1,2,\omega}(G) \subset V^*$ with the following property can be found: there exists a constant $\bar\omega \in (N - 2, N)$ such that the restriction of $L$ to $W_0^{1,2,\omega}(G)$ is a linear isomorphism between $W_0^{1,2,\omega}(G)$ and $Y^{-1,2,\omega}(G)$, for any $\omega \in [0, \bar\omega)$.

Here (i.e., in the case of homogeneous Neumann data), $W_0^{1,2,\omega}(G)$ coincides with the Sobolev-Campanato space

\[ W^{1,2,\omega}(\Omega) = \{ u \in V : |\nabla u| \in \mathcal{L}^{2,\omega}(\Omega) \} \]

of all functions in V that have weak derivatives in the Campanato space
£2,,(q). In this context , the fact
7.2. Elliptic equations that for w E (N-2, N) the space Wó'2'"(G)
is continuously enibedded in the Hilder space C°'K(G) with K = (N - w)/2
In this section, we will first discuss the proof of Lemma 4.6 in the case is of particular importance. We only have te guarantee that under the given
where the coefficient functions of the differential operator are nonsmooth assumptions f and g belong to Y-1,2,"(G). We need w e (N - 2, N) for this
and the domain is Lipschitz. Then we will prove the results concerning property to hold.
essential boundedness of the solution to the semilinear elliptic boundary By Theorem 3.9 in [Gri02] , for any w E [0, N) all those functionals
value problem (4.5) in Section 4.2. We will also present an existente result F E V* that belong to Y-1'2°w(G) can be represented in the forro
for elliptic problems with measures as data.

7.2.1. Elliptic regularity and continuity of solutions . Owing to Lemnia (F, (P)v*,v = fi(x)DtW(x) dx + f f (x),p(x) dx + f g(x),p(x) ds(x),
19 =1 r
4.6, the weak solution y to the linear elliptic boundary value problem
where
Ay+y = f
N+2)(c)
¿9"y = g f E C2N/(N+2),"N/(

with given data f E Lr( Sl) and g E Ls(F), where r > N/2 and s > N - 1, is g E &N-1)lN'w(N-1)1N(F)

continuous in 9 and thus belongs to Hl (9) i C(9).


The mapping (f1, .... fN, f, g) H F defines a continuous linear operator.
To apply these results, we need the connection between Campanato spaces and the usual Lebesgue spaces; see Remark 3.10 in [Gri02]:

(i) If $q > 2$ and $\omega_q = N(1 - 2/q)$, then $L^q(\Omega)$ is continuously embedded in $\mathcal{L}^{2,\omega_q}(\Omega)$. In particular, $\omega_q > N - 2$ if $q > N$.

(ii) If $r > 2N/(N+2)$ and $\omega_r = 2 + N(1 - 2/r)$, then $L^r(\Omega)$ is continuously embedded in $\mathcal{L}^{2N/(N+2),\, \omega_r N/(N+2)}(\Omega)$. In particular, $\omega_r > N - 2$ if $r > N/2$.

(iii) If $s > 2(N-1)/N$ and $\omega_s = 1 + (N-1)(1 - 2/s)$, then $L^s(\Gamma)$ is continuously embedded in $\mathcal{L}^{2(N-1)/N,\, \omega_s (N-1)/N}(\Gamma)$. In particular, $\omega_s > N - 2$ if $s > N - 1$.

The latter two conditions, that is, $r > N/2$ and $s > N - 1$, are just the assumptions of Lemma 4.6.

7.2.2. Stampacchia's method. In this section, we prove the validity of Theorem 4.5 on page 189, that is, the boundedness of the solution to the elliptic problem (4.5) on page 183:
\[
\begin{aligned}
A y + c_0(x)\, y + d(x, y) &= f && \text{in } \Omega \\
\partial_{\nu_A} y + \alpha(x)\, y + b(x, y) &= g && \text{on } \Gamma.
\end{aligned}
\]
To this end, we employ a method due to Stampacchia. It makes use of the following auxiliary result (cf. Kinderlehrer and Stampacchia [KS80], Lemma B.1).

Lemma 7.5. Let $k_0 \in \mathbb{R}$, and suppose that $\varphi$ is a nonnegative and nonincreasing function defined in $[k_0, \infty)$ and having the following property: for every $h > k > k_0$,
\[
\varphi(h) \le \frac{C}{(h-k)^a}\, \varphi(k)^b
\]
with constants $C > 0$, $a > 0$, and $b > 1$. Then $\varphi(k_0 + \delta) = 0$, where
\[
\delta^a = C\, \varphi(k_0)^{b-1}\, 2^{\frac{ab}{b-1}}. \tag{7.2}
\]

Proof of Theorem 4.5: We modify a proof given in [Sta65] and [KS80] for homogeneous Dirichlet boundary conditions to deal with the present case.

(i) Preliminaries

Existence and uniqueness of a solution $y \in H^1(\Omega)$ follow from Theorem 4.4 on page 186. It remains to show the boundedness and the validity of the estimate (4.10) on page 189. To this end, we will test the solution $y$ to (4.5) in the variational formulation with the part of $y$ that is larger than $k > 0$ in absolute value, and then show that this part vanishes for sufficiently large $k$.

In the statement of the theorem, integrability properties of $f$ and $g$ were postulated. Here, we denote the orders of integrability by $\tilde r$ and $\tilde s$, respectively. We thus have $f \in L^{\tilde r}(\Omega)$ and $g \in L^{\tilde s}(\Gamma)$, where $\tilde r > N/2$ and $\tilde s > N - 1$.

We first assume $N \ge 3$ and explain at the end of the proof which modifications have to be made for the case of $N = 2$. We fix some $\lambda \in \big(1, \tfrac{N-1}{N-2}\big)$ sufficiently close to unity such that
\[
\tilde r > r := \frac{N}{N - \lambda (N-2)}, \qquad \tilde s > s := \frac{N-1}{N - 1 - \lambda (N-2)}.
\]
Since $N \ge 3$, and owing to the choice of $\lambda$, we obviously have $r > 1$ and $s > 1$. If we succeed in proving the result for $r$ and $s$, then it will be valid for all $\tilde r > r$ and $\tilde s > s$. The conjugate exponents $r'$ and $s'$ for $r$ and $s$ are given by
\[
\frac{1}{r'} = 1 - \frac{1}{r} = \lambda\, \frac{N-2}{N}, \qquad
\frac{1}{s'} = 1 - \frac{1}{s} = \lambda\, \frac{N-2}{N-1}. \tag{7.3}
\]
Below, we will use the embedding estimates
\[
\begin{aligned}
\|v\|_{L^p(\Omega)} &\le c\, \|v\|_{H^1(\Omega)} && \text{for } \frac{1}{p} = \frac{1}{2} - \frac{1}{N} = \frac{N-2}{2N} = \frac{1}{2 \lambda r'}, \\
\|v\|_{L^q(\Gamma)} &\le c\, \|v\|_{H^1(\Omega)} && \text{for } \frac{1}{q} = \frac{1}{2} - \frac{1}{2(N-1)} = \frac{N-2}{2(N-1)} = \frac{1}{2 \lambda s'}.
\end{aligned} \tag{7.4}
\]
Since $2r' < p$ and $2s' < q$, this implies that
\[
\|v\|_{L^{2r'}(\Omega)} \le c\, \|v\|_{H^1(\Omega)}, \qquad \|v\|_{L^{2s'}(\Gamma)} \le c\, \|v\|_{H^1(\Omega)}.
\]
Next, we define for each $k > 0$ a function $v_k \in H^1(\Omega)$, such that
\[
v_k(x) = \begin{cases}
y(x) - k & \text{if } y(x) > k \\
0 & \text{if } |y(x)| \le k \\
y(x) + k & \text{if } y(x) < -k.
\end{cases}
\]
We aim to show that $v_k$ vanishes almost everywhere for sufficiently large $k$, which then implies the boundedness of $y$. For the sake of brevity, we suppress the subscript $k$, writing $v_k$ simply as $v$. We introduce the sets
\[
\Omega(k) = \{ x \in \Omega : |y(x)| > k \}, \qquad \Gamma(k) = \{ x \in \Gamma : |(\tau y)(x)| > k \},
\]
where $\tau y$ denotes the trace of $y$ on $\Gamma$.

(ii) Consequences of the monotonicity assumptions

We now derive the inequality (7.6) below. First, we claim that
\[
\int_\Omega d(x, y)\, v\, dx \ge 0, \qquad \int_\Gamma b(x, y)\, v\, ds \ge 0. \tag{7.5}
\]
To see this, let $\Omega_+(k) := \{ x : y(x) > k \}$. Using the monotonicity of $d$ with respect to $y$ and the fact that $d(x, 0) = 0$, we find that
\[
\begin{aligned}
\int_{\Omega_+(k)} d(x, y)\, v\, dx &= \int_{\Omega_+(k)} d(x, y)\, (y - k)\, dx \\
&= \int_{\Omega_+(k)} d(x, y - k + k)\, (y - k)\, dx \ge \int_{\Omega_+(k)} d(x, y - k)\, (y - k)\, dx \ge 0.
\end{aligned}
\]
By the same token,
\[
\int_{\Omega_-(k)} d(x, y)\, v\, dx \ge 0,
\]
where $\Omega_-(k) := \{ x : y(x) < -k \}$. This proves the claim for the first integral in (7.5). The claim for the integral over $\Gamma$ follows by analogous reasoning.

From the variational formulation for $y$, we infer that, with the bilinear form $a[y, v]$ defined in (4.7) on page 186,
\[
a[y, v] + \int_\Omega d(x, y)\, v\, dx + \int_\Gamma b(x, y)\, v\, ds = \int_\Omega f\, v\, dx + \int_\Gamma g\, v\, ds,
\]
whence, in view of (7.5),
\[
a[y, v] \le \int_\Omega f\, v\, dx + \int_\Gamma g\, v\, ds. \tag{7.6}
\]
At this point, we can already see that the nonlinearities $d$ and $b$ will not play a role in the estimation.

(iii) Estimation of $\|v\|_{H^1(\Omega)}$

We claim that
\[
a[v, v] \le a[y, v]. \tag{7.7}
\]
Indeed, we obviously have $D_i y = D_i v$ in $\Omega(k)$ and $v = 0$ in $\Omega \setminus \Omega(k)$, whence it follows that
\[
\int_\Omega \sum_{i,j=1}^N a_{ij}(x)\, D_i y\, D_j v\, dx = \int_\Omega \sum_{i,j=1}^N a_{ij}(x)\, D_i v\, D_j v\, dx.
\]
Moreover, since $y - k > 0$ in $\Omega_+(k)$, $y + k < 0$ in $\Omega_-(k)$, and $v = 0$ in $\Omega \setminus \Omega(k)$,
\[
\begin{aligned}
\int_\Omega c_0\, y\, v\, dx &= \int_{\Omega_+(k)} c_0\, y\, (y - k)\, dx + \int_{\Omega_-(k)} c_0\, y\, (y + k)\, dx \\
&= \int_{\Omega_+(k)} c_0 \big[ (y - k)^2 + (y - k)\, k \big]\, dx + \int_{\Omega_-(k)} c_0 \big[ (y + k)^2 - (y + k)\, k \big]\, dx \\
&\ge \int_\Omega c_0\, v^2\, dx.
\end{aligned}
\]
The integral $\int_\Gamma \alpha\, y\, v\, ds$ can be treated similarly. From (7.6) and (7.7) and the coercivity properties of the elliptic boundary value problem, we conclude that, with some $\beta > 0$,
\[
\beta\, \|v\|_{H^1(\Omega)}^2 \le \int_\Omega f\, v\, dx + \int_\Gamma g\, v\, ds. \tag{7.8}
\]

(iv) Estimation of both sides of (7.8)

Now recall the first embedding inequality in (7.4) and Young's inequality $ab \le \varepsilon a^2 + \frac{1}{4\varepsilon} b^2$ for all $a, b \in \mathbb{R}$ and $\varepsilon > 0$. With some generic constant $c > 0$, and using Hölder's inequality, we can estimate the first term of the right-hand side as follows:
\[
\begin{aligned}
\int_\Omega f\, v\, dx &\le \|f\|_{L^r(\Omega)}\, \|v\|_{L^{r'}(\Omega)} \\
&\le \|f\|_{L^r(\Omega)} \Big( \int_{\Omega(k)} |v|^{2r'}\, dx \Big)^{\frac{1}{2r'}} \Big( \int_{\Omega(k)} 1\, dx \Big)^{\frac{1}{2r'}} \\
&= \|f\|_{L^r(\Omega)}\, \|v\|_{L^{2r'}(\Omega)}\, |\Omega(k)|^{\frac{1}{2r'}} \le c\, \|f\|_{L^r(\Omega)}\, \|v\|_{H^1(\Omega)}\, |\Omega(k)|^{\frac{1}{2r'}} \\
&\le c\, \|f\|_{L^r(\Omega)}^2\, |\Omega(k)|^{\frac{1}{r'}} + \varepsilon\, \|v\|_{H^1(\Omega)}^2 = c\, \|f\|_{L^r(\Omega)}^2\, |\Omega(k)|^{\frac{2\lambda}{p}} + \varepsilon\, \|v\|_{H^1(\Omega)}^2.
\end{aligned}
\]
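The number $\varepsilon > 0$ is yet to be determined. By similar reasoning, for the boundary integral we find that
\[
\begin{aligned}
\int_\Gamma g\, v\, ds &\le \|g\|_{L^s(\Gamma)}\, \|v\|_{L^{s'}(\Gamma)} \le \|g\|_{L^s(\Gamma)}\, \|v\|_{L^{2s'}(\Gamma)}\, |\Gamma(k)|^{\frac{1}{2s'}} \\
&\le c\, \|g\|_{L^s(\Gamma)}^2\, |\Gamma(k)|^{\frac{2\lambda}{q}} + \varepsilon\, \|v\|_{H^1(\Omega)}^2.
\end{aligned}
\]
Now, by choosing $\varepsilon := \beta/4$, we may absorb into the left-hand side of (7.8) the terms $\varepsilon\, \|v\|_{H^1(\Omega)}^2$ occurring in the last two inequalities.

Next, we split the expression $\|v\|_{H^1(\Omega)}^2$ on the left-hand side of (7.8) into two equal parts. Invoking (7.8) and the two estimates given in (7.4), we find that
\[
\Big( \int_{\Omega(k)} |v|^p\, dx \Big)^{\frac{2}{p}} + \Big( \int_{\Gamma(k)} |v|^q\, ds \Big)^{\frac{2}{q}} \le c\, \|v\|_{H^1(\Omega)}^2,
\]
whence, by the definition of $v$,
\[
\Big( \int_{\Omega(k)} (|y| - k)^p\, dx \Big)^{\frac{2}{p}} + \Big( \int_{\Gamma(k)} (|y| - k)^q\, ds \Big)^{\frac{2}{q}} \le c\, \|v\|_{H^1(\Omega)}^2. \tag{7.9}
\]

(v) Application of Lemma 7.5

Suppose that $h > k$. Then $\Omega(h) \subset \Omega(k)$ and $\Gamma(h) \subset \Gamma(k)$, and thus $|\Omega(h)| \le |\Omega(k)|$ and $|\Gamma(h)| \le |\Gamma(k)|$. Therefore,
\[
\Big( \int_{\Omega(k)} (|y| - k)^p\, dx \Big)^{\frac{2}{p}} \ge \Big( \int_{\Omega(h)} (|y| - k)^p\, dx \Big)^{\frac{2}{p}} \ge \Big( \int_{\Omega(h)} (h - k)^p\, dx \Big)^{\frac{2}{p}} = (h - k)^2\, |\Omega(h)|^{\frac{2}{p}}.
\]
The boundary integral is estimated similarly. Finally, we infer from (7.9) and (7.8) that
\[
\begin{aligned}
(h - k)^2 \Big( |\Omega(h)|^{\frac{2}{p}} + |\Gamma(h)|^{\frac{2}{q}} \Big)
&\le c\, \big( \|f\|_{L^r(\Omega)}^2 + \|g\|_{L^s(\Gamma)}^2 \big) \Big( |\Omega(k)|^{\frac{2\lambda}{p}} + |\Gamma(k)|^{\frac{2\lambda}{q}} \Big) \\
&\le c\, \big( \|f\|_{L^r(\Omega)}^2 + \|g\|_{L^s(\Gamma)}^2 \big) \Big( |\Omega(k)|^{\frac{2}{p}} + |\Gamma(k)|^{\frac{2}{q}} \Big)^{\lambda}.
\end{aligned}
\]
Here, we have used the fact that for all $a \ge 0$, $b \ge 0$, and $\lambda \ge 1$ the inequality $a^\lambda + b^\lambda \le (a + b)^\lambda$ is valid. Putting $\varphi(h) := |\Omega(h)|^{\frac{2}{p}} + |\Gamma(h)|^{\frac{2}{q}}$, we therefore obtain the inequality
\[
(h - k)^2\, \varphi(h) \le c\, \big( \|f\|_{L^r(\Omega)}^2 + \|g\|_{L^s(\Gamma)}^2 \big)\, \varphi(k)^{\lambda},
\]
for all $h > k \ge 0$. We now apply Lemma 7.5 with the specifications
\[
a = 2, \qquad b = \lambda > 1, \qquad k_0 = 0, \qquad C = c\, \big( \|f\|_{L^r(\Omega)}^2 + \|g\|_{L^s(\Gamma)}^2 \big).
\]
We obtain $\delta^2 = \tilde c\, \big( \|f\|_{L^r(\Omega)}^2 + \|g\|_{L^s(\Gamma)}^2 \big)$, whence the assertion follows: in fact, $\varphi(\delta) = 0$ means that $|y(x)| \le \delta$ for almost every $x \in \Omega$ and $|(\tau y)(x)| \le \delta$ for almost every $x \in \Gamma$.

(vi) Modification for the $N = 2$ case

Let $r > N/2 = 1$ and $s > N - 1 = 1$ be the orders of integrability of $f$ and $g$ assumed in the theorem. In the case of $N = 2$, the embedding inequalities (7.4) are valid for all $p < \infty$ and $q < \infty$. We therefore define $p$ and $q$ with $\lambda > 1$ by
\[
\frac{1}{p} = \frac{1}{2 \lambda r'}, \qquad \frac{1}{q} = \frac{1}{2 \lambda s'}.
\]
With this specification, all conclusions subsequent to (7.4) in the $N \ge 3$ case carry over to $N = 2$, yielding the validity of the assertion. This concludes the proof. □

Remark. In the above proof, the boundedness of $y$ on $\Gamma$ has been shown directly, too. Since $\|y\|_{L^\infty(\Gamma)} \le \|y\|_{L^\infty(\Omega)}$, it would already follow from the boundedness of $y$; see Exercise 4.1.

The above method does not apply directly to the equation $-\Delta y + y^k = f$, with $k \in \mathbb{N}$ odd, subject to a Neumann boundary condition. We now discuss an extension of Stampacchia's method due to E. Casas. This technique applies to boundary value problems of the form considered in (4.15) on page 193:
\[
\begin{aligned}
A y + d(x, y) &= 0 && \text{in } \Omega \\
\partial_{\nu_A} y + b(x, y) &= 0 && \text{on } \Gamma.
\end{aligned}
\]

Theorem 7.6. Suppose that Assumption 4.9 on page 193 is satisfied. For each $n \in \mathbb{N}$ let $y_n$ denote the unique weak solution to the elliptic boundary value problem
\[
\begin{aligned}
A y + n^{-1} y + d(x, y) &= 0 && \text{in } \Omega \\
\partial_{\nu_A} y + b(x, y) &= 0 && \text{on } \Gamma.
\end{aligned}
\]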
Then there is some $K > 0$ such that
\[
\|y_n\|_{L^\infty(\Omega)} \le K \quad \text{for every } n \in \mathbb{N}.
\]

Proof: We shall only explain how the preceding proof has to be modified. Since we have zero right-hand sides $f$ and $g$, we cannot additionally assume that $d(x, 0) = 0$ and $b(x, 0) = 0$. In order to have the same situation as in the preceding proof, we cut off $d$ and $b$ at $k \in \mathbb{N}$ and at $-k$, respectively, to obtain the functions $d_k$ and $b_k$ defined on page 192. Putting
\[
f(x) := -d_k(x, 0), \quad g(x) := -b_k(x, 0), \quad d(x, y) := d_k(x, y) - d_k(x, 0), \quad b(x, y) := b_k(x, y) - b_k(x, 0),
\]
we then have, as in the preceding proof, $d(x, 0) = 0$ and $b(x, 0) = 0$, as well as $f \in L^r(\Omega)$ and $g \in L^s(\Gamma)$. Note that all of these functions do not depend on $n \in \mathbb{N}$.

We now assume that in Assumption 4.9 the inequality (i) is satisfied by $d$. If (ii) is valid instead, we can argue analogously using $b$.

Now let $y := y_n$, where $n \in \mathbb{N}$ is arbitrary but fixed. The following easily verified relations are essential for our method to work: as in the preceding proof, the monotonicity of $d$ yields that on $\Omega \setminus E_d$, we have
\[
\begin{aligned}
\big( d(x, y(x)) - d(x, 0) \big) \big( y(x) - k \big) &\ge 0, && x \in \Omega_+(k) \\
d(x, y(x))\, v(x) &= 0, && x \in \Omega \setminus \Omega(k) \\
\big( d(x, y(x)) - d(x, 0) \big) \big( y(x) + k \big) &\ge 0, && x \in \Omega_-(k).
\end{aligned}
\]
Hence, $d(x, y(x))\, v(x) \ge 0$ for all $x \in \Omega \setminus E_d$. Moreover, for all $x \in E_d$,
\[
d(x, y(x))\, v(x) \;
\begin{cases}
\ge \lambda_d\, |y(x)|\, |v(x)| \ge \lambda_d\, |v(x)|^2, & |y(x)| > k \\
= 0 = \lambda_d\, |v(x)|^2, & |y(x)| \le k.
\end{cases} \tag{7.10}
\]
We may therefore modify the arguments employed between (7.5) and (7.6) as follows:
\[
\begin{aligned}
a[y, v] + \int_\Omega d(x, y)\, v\, dx + \int_\Gamma b(x, y)\, v\, ds
&\ge \gamma_0 \int_\Omega |\nabla v|^2\, dx + \int_\Omega d(x, y)\, v\, dx \\
&\ge \gamma_0 \int_\Omega |\nabla v|^2\, dx + \lambda_d \int_{E_d} |v|^2\, dx.
\end{aligned}
\]
By the generalized Poincaré inequality (2.15) on page 35, the latter expression is the square of a norm that is equivalent to the standard norm of $H^1(\Omega)$. In this way, we eventually arrive at an inequality resembling (7.8) in the preceding proof, whence the subsequent arguments carry over unchanged. We thus obtain a bound $\delta > 0$ for $\|y\|_{L^\infty(\Omega)} = \|y_n\|_{L^\infty(\Omega)}$ that does not depend on $n$. The assertion thus follows with $K := \delta$. □

7.2.3. Elliptic problems with measures. Since in the case of pointwise state constraints measures occur on the right-hand sides of the associated adjoint problems, a corresponding extension of the theory of elliptic and parabolic problems is needed. The foundations of such a theory are due to Casas [Cas86, Cas93] and Alibert and Raymond [AR97] for elliptic problems, and to Casas [Cas97] and Raymond and Zidani [RZ99] for the parabolic case. In the following, we cite such a result for the boundary value problem
\[
\begin{aligned}
A p + c_0\, p &= \mu_\Omega && \text{in } \Omega \\
\partial_{\nu_A} p + \alpha\, p &= \mu_\Gamma && \text{on } \Gamma.
\end{aligned} \tag{7.11}
\]
Here, $\Omega \subset \mathbb{R}^N$ is a bounded Lipschitz domain. The Borel measures $\mu_\Omega$ and $\mu_\Gamma$, which are supported in $\Omega$ and on $\Gamma$, respectively, are the corresponding restrictions of a regular Borel measure $\mu \in M(\bar\Omega)$; that is, we have $\mu = \mu_\Omega + \mu_\Gamma$. The differential operator $A$ is the elliptic operator introduced in (2.19) on page 37; we assume that the coefficient functions $a_{ij} \in L^\infty(\Omega)$ obey the ellipticity condition (2.20) and (only for simplicity) the symmetry condition $a_{ij}(x) = a_{ji}(x)$. Moreover, we assume that we are given functions $c_0 \in L^\infty(\Omega)$ and $\alpha \in L^\infty(\Gamma)$ such that $c_0 \ge 0$, $\alpha \ge 0$, and $\|\alpha\|_{L^\infty(\Gamma)} + \|c_0\|_{L^\infty(\Omega)} > 0$.

With the above elliptic problem, we associate the bilinear form
\[
a[p, v] = \int_\Omega \Big( \sum_{i,j=1}^N a_{ij}(x)\, D_i p(x)\, D_j v(x) + c_0(x)\, p(x)\, v(x) \Big)\, dx + \int_\Gamma \alpha(x)\, p(x)\, v(x)\, ds(x).
\]
Moreover, we define for $r > N/2$ and $s > N - 1$ the linear space
\[
V^{r,s} = \{ v \in H^1(\Omega) : A v \in L^r(\Omega),\ \partial_{\nu_A} v \in L^s(\Gamma) \},
\]
where $A v$ is to be understood in the sense of distributions and $\partial_{\nu_A} v$ is defined as in [Cas93].

Definition. A function $p \in W^{1,\sigma}(\Omega)$, where $\sigma \ge 1$, is called a weak solution to the problem (7.11) if it satisfies the variational equality
\[
a[p, v] = \int_\Omega v(x)\, d\mu_\Omega(x) + \int_\Gamma v(x)\, d\mu_\Gamma(x) \quad \forall\, v \in C^1(\bar\Omega). \tag{7.12}
\]

The following theorem is a special case of a result proved in [Cas93].

Theorem 7.7. Under the above assumptions, the boundary value problem (7.11) has a unique weak solution $p$ such that $p \in W^{1,\sigma}(\Omega)$ for all $1 \le \sigma < N/(N-1)$, with the property that for all $v \in V^{r,s}$ the integration by parts formula
\[
\int_\Omega p\, (A v + c_0\, v)\, dx + \int_\Gamma p\, (\partial_{\nu_A} v + \alpha\, v)\, ds = \int_\Omega v\, d\mu_\Omega + \int_\Gamma v\, d\mu_\Gamma
\]
is valid. Moreover, there exists a constant $c_\sigma > 0$, which does not depend on $\mu$, such that $\|p\|_{W^{1,\sigma}(\Omega)} \le c_\sigma\, \|\mu\|_{M(\bar\Omega)}$.

Remark. Weak solutions to (7.11) do not have to be unique, as the detailed discussion in [AR97] shows. Uniqueness is only ensured if the integration by parts formula (Green's formula) is valid. A definition of $\partial_{\nu_A} p$ is given in [Cas93]; see also [AR97]. For a definition of the norm $\|\mu\|_{M(\bar\Omega)}$, we refer the reader to [Alt99].

7.3. Parabolic problems

7.3.1. Solutions in $W(0,T)$.

The linear problem. Following the monograph by Ladyzhenskaya et al. [LSU68], we introduce the following spaces:

Definition. We denote by $V_2(Q)$ the space $W_2^{1,0}(Q) \cap L^\infty(0,T; L^2(\Omega))$, endowed with the norm
\[
\|y\|_{V_2(Q)} = \operatorname*{ess\,sup}_{t \in [0,T]} \|y(t)\|_{L^2(\Omega)} + \Big( \iint_Q |\nabla_x y(x,t)|^2\, dx\, dt \Big)^{1/2},
\]
and by $V_2^{1,0}(Q)$ the space $W_2^{1,0}(Q) \cap C([0,T], L^2(\Omega))$, endowed with the norm
\[
\|y\|_{V_2^{1,0}(Q)} = \max_{t \in [0,T]} \|y(t)\|_{L^2(\Omega)} + \Big( \iint_Q |\nabla_x y(x,t)|^2\, dx\, dt \Big)^{1/2}.
\]

Observe that in the case of intersections like $W_2^{1,0}(Q) \cap L^\infty(0,T; L^2(\Omega))$, the elements of $W_2^{1,0}(Q)$ have to be identified with vector-valued functions belonging to $L^2(0,T; H^1(\Omega))$; otherwise, one would be trying to compare real-valued functions with vector-valued ones.

For the sake of better readability, we once more write down the initial-boundary value problem (3.23) under investigation:
\[
\begin{aligned}
y_t + A y + c_0\, y &= f && \text{in } Q = \Omega \times (0,T) \\
\partial_{\nu_A} y + \alpha\, y &= g && \text{on } \Sigma = \Gamma \times (0,T) \\
y(\cdot, 0) &= y_0 && \text{in } \Omega.
\end{aligned} \tag{7.13}
\]
Here, the uniformly elliptic differential operator $A$ is defined as in (2.19) on page 37.

Theorem 7.8. Suppose that $\Omega \subset \mathbb{R}^N$ is a bounded Lipschitz domain, and let functions $c_0 \in L^\infty(Q)$, $\alpha \in L^\infty(\Sigma)$ with $\alpha \ge 0$ almost everywhere on $\Sigma$, $y_0 \in L^2(\Omega)$, $f \in L^2(Q)$, and $g \in L^2(\Sigma)$ be given. Moreover, let the differential operator $A$ have coefficients $a_{ij} \in L^\infty(\Omega)$ such that $a_{ij} = a_{ji}$ and such that the uniform ellipticity condition (2.20) on page 37 is fulfilled. Then the initial-boundary value problem (7.13) has a unique solution in $V_2(Q)$.

Proof: The assertion follows from Theorem 5.1 in Chapter III of [LSU68]. Its proof indicates the changes that have to be made in comparison with the proof of Theorem 4.1 in Chapter III for the case of homogeneous Dirichlet boundary conditions. We follow the proof in [LSU68] and sketch the modifications for boundary conditions of the third kind that are needed to understand the proof of Lemma 5.3 concerning our semilinear problem.

Without loss of generality, we may assume that $c_0(x,t) \ge 0$ for almost every $(x,t) \in Q$. Indeed, if this property fails to hold, we simply substitute $y(x,t) = e^{\lambda t}\, \tilde y(x,t)$. Then in the resulting differential equation for $\tilde y$ the term $(\lambda + c_0)\, \tilde y$ occurs in place of $c_0\, y$, and this is nonnegative for sufficiently large $\lambda > 0$.

(i) Galerkin approximation

We set $V = H^1(\Omega)$ and $H = L^2(\Omega)$. Since $V$ is a separable Hilbert space, we may choose a countable dense set of linearly independent elements $\{v_i\}_{i=1}^\infty$ in $V$. After possibly performing an orthogonalization process with respect to the scalar product of $H$, we may assume that $\{v_i\}_{i=1}^\infty$ forms an orthonormal system in $H$ which is also complete in $H$.

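
The orthogonalization process with respect to the scalar product of $H$ can be illustrated numerically. The following is a minimal sketch under our own assumptions, not a construction taken from [LSU68]: we take $\Omega = (0,1)$, start from the monomials $1, x, \ldots, x^4$ (which belong to $H^1(0,1)$), replace the $L^2$ scalar product by a Gauss-Legendre quadrature, and apply the modified Gram-Schmidt procedure. The function name `l2_orthonormalize` is hypothetical.

```python
import numpy as np


def l2_orthonormalize(funcs, x, w):
    """Modified Gram-Schmidt w.r.t. the discrete L2 product (f, g) = sum(w * f * g)."""
    basis = []
    for f in funcs:
        v = np.asarray(f(x), dtype=float)
        for e in basis:
            # remove the component of v in the direction of e
            v = v - np.sum(w * v * e) * e
        # normalize in the discrete L2 norm
        v = v / np.sqrt(np.sum(w * v * v))
        basis.append(v)
    return np.array(basis)


# Gauss-Legendre nodes and weights, mapped from (-1, 1) to Omega = (0, 1)
nodes, weights = np.polynomial.legendre.leggauss(20)
x = 0.5 * (nodes + 1.0)
w = 0.5 * weights

# linearly independent starting functions 1, x, ..., x^4
monomials = [lambda t, k=k: t**k for k in range(5)]
E = l2_orthonormalize(monomials, x, w)

# Gram matrix of the resulting functions; it should be close to the identity
gram = (E * w) @ E.T
print(np.allclose(gram, np.eye(5), atol=1e-10))
```

Since the quadrature rule with 20 nodes integrates polynomials of degree up to 39 exactly, the discrete scalar product coincides with the $L^2(0,1)$ scalar product on the functions involved here.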
For arbitrary but fixed $n \in \mathbb{N}$, we now determine approximations $y_n = y_n(x,t)$ through the ansatz
\[
y_n(x,t) = \sum_{i=1}^n u_i^n(t)\, v_i(x), \tag{7.14}
\]
with unknown functions $u_i^n : [0,T] \to \mathbb{R}$, $i = 1, \ldots, n$. In the following, we write $(\cdot\,, \cdot)$ and $\|\cdot\|$ for the scalar product and the norm, respectively, in $H$, and $(\cdot\,, \cdot)_\Gamma$ for the scalar product in $L^2(\Gamma)$. We also define
\[
a[t; y, v] := \int_\Omega \Big\{ \sum_{i,j=1}^N a_{ij}(x)\, D_i y(x)\, D_j v(x) + c_0(x,t)\, y(x)\, v(x) \Big\}\, dx + \int_\Gamma \alpha(x,t)\, y(x)\, v(x)\, ds(x).
\]
We interpret $y_n = y_n(\cdot, t)$ as a function with values in $H^1(\Omega)$. Multiplying the parabolic equation by $v_j$ and integrating over $\Omega$ by parts, we find that
\[
\Big( \frac{d}{dt} y_n(t), v_j \Big) + a[t; y_n(t), v_j] = (f(t), v_j) + (g(t), v_j)_\Gamma \tag{7.15}
\]
for almost every $t \in (0,T)$. The initial condition for $y$ is equivalent to $(y(\cdot, 0), v) = (y_0, v)$ for all $v \in V$. Therefore, we postulate that
\[
(y_n(\cdot, 0), v_j) = (y_0, v_j), \quad 1 \le j \le n. \tag{7.16}
\]
Substituting the ansatz (7.14) in (7.15) and using the orthonormality, we see that
\[
\begin{aligned}
\frac{d}{dt} u_j^n(t) + \sum_{i=1}^n u_i^n(t)\, a[t; v_i, v_j] &= b_j(t) \quad \text{for a.e. } t \in (0,T), \\
u_j^n(0) &= (y_0, v_j),
\end{aligned} \tag{7.17}
\]
for $j = 1, \ldots, n$, with the given functions $b_j(t) = (f(t), v_j) + (g(t), v_j)_\Gamma$. Owing to Carathéodory's theorem, this initial value problem for a system of $n$ linear ordinary differential equations on $[0,T]$ for the unknown vector function $u^n = (u_1^n, \ldots, u_n^n)^\top$ has a unique absolutely continuous solution $u^n \in (H^1(0,T))^n$. Multiplying (7.15) by $u_j^n(t)$ and adding the resulting equations from $j = 1$ to $j = n$, we obtain, for almost every $t \in (0,T)$,
\[
\Big( \frac{d}{dt} y_n(t), y_n(t) \Big) + a[t; y_n(t), y_n(t)] = (f(t), y_n(t)) + (g(t), y_n(t))_\Gamma. \tag{7.18}
\]
Evidently, since $u^n \in (H^1(0,T))^n$, the function $y_n : [0,T] \to L^2(\Omega)$ is almost everywhere differentiable with respect to $t$ in $(0,T)$.

(ii) Estimates for $\{y_n\}$

For any arbitrary but fixed $\tau \in (0,T]$, we have the identity
\[
\int_0^\tau \Big( \frac{d}{dt} y_n(t), y_n(t) \Big)\, dt = \frac{1}{2} \int_0^\tau \frac{d}{dt} \|y_n(t)\|^2\, dt = \frac{1}{2} \|y_n(\tau)\|^2 - \frac{1}{2} \|y_n(0)\|^2.
\]
Integration of (7.18) over $[0,\tau]$ therefore yields that
\[
\frac{1}{2} \|y_n(\tau)\|^2 + \int_0^\tau a[t; y_n(t), y_n(t)]\, dt = \frac{1}{2} \|y_n(0)\|^2 + \int_0^\tau \big\{ (f(t), y_n(t)) + (g(t), y_n(t))_\Gamma \big\}\, dt. \tag{7.19}
\]
By virtue of Bessel's inequality, we have
\[
\|y_n(0)\|^2 = \sum_{j=1}^n |u_j^n(0)|^2 = \sum_{j=1}^n |(y_0, v_j)|^2 \le \|y_0\|^2. \tag{7.20}
\]
Moreover, because $c_0(x,t) \ge 0$ and $\alpha(x,t) \ge 0$,
\[
a[t; v, v] \ge \gamma_0\, \big\| |\nabla v| \big\|^2 \quad \forall\, v \in V, \tag{7.21}
\]
where $\gamma_0$ is as in (2.20) on page 37. Using some standard estimates (which may be omitted here) and invoking Gronwall's lemma, we can infer from (7.19) and (7.20) that
\[
\max_{t \in [0,T]} \|y_n(t)\| \le c\, \big( \|y_0\| + \|f\|_{L^2(Q)} + \|g\|_{L^2(\Sigma)} \big). \tag{7.22}
\]
Recalling (7.21), and inserting this estimate for $y_n$ in $C([0,T], H)$ into (7.19) with $\tau = T$, we see that there is a constant $K > 0$ such that
\[
\|y_n\|_{C([0,T], H)} + \|y_n\|_{W_2^{1,0}(Q)} \le K \quad \forall\, n \in \mathbb{N}. \tag{7.23}
\]
In particular, $\|y_n(t)\|^2 \le K^2$, and in view of the orthonormality,
\[
\sum_{i=1}^n |u_i^n(t)|^2 \le K^2 \quad \forall\, t \in [0,T],\ \forall\, n \in \mathbb{N}. \tag{7.24}
\]

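
The Galerkin system (7.17) and the estimates (7.19), (7.20) can be tried out on a toy problem. The sketch below rests entirely on our own simplifying assumptions and is not the general situation of Theorem 7.8: $\Omega = (0,1)$, $A = -d^2/dx^2$, constant $c_0 \ge 0$ and $\alpha \ge 0$, $f = 0$, $g = 0$, the hypothetical initial datum $y_0(x) = x(1-x)$, the $L^2$-orthonormal cosine basis $1, \sqrt{2}\cos(k\pi x)$ (indexed from $k = 0$), and the implicit Euler method for the time integration. The function name `galerkin_heat_1d` is hypothetical.

```python
import numpy as np


def galerkin_heat_1d(n, T=0.5, steps=200, c0=1.0, alpha=1.0, m=2000):
    """Galerkin ODE system u' + M u = 0, u(0) = ((y0, v_j))_j, on Omega = (0, 1)."""
    # midpoint quadrature grid for the L2(0,1) scalar product
    x = (np.arange(m) + 0.5) / m
    w = 1.0 / m
    k = np.arange(n)

    def basis(points):
        # rows: v_0 = 1, v_k = sqrt(2) cos(k pi x), orthonormal in L2(0,1)
        V = np.sqrt(2.0) * np.cos(np.pi * np.outer(k, points))
        V[0, :] = 1.0
        return V

    V = basis(x)
    Vb = basis(np.array([0.0, 1.0]))  # basis values on Gamma = {0, 1}

    # matrix of the bilinear form: stiffness + c0 * mass + alpha * boundary term
    M = np.diag((np.pi * k) ** 2) + c0 * np.eye(n) + alpha * (Vb @ Vb.T)

    y0 = x * (1.0 - x)                       # hypothetical initial datum
    u = (V @ y0) * w                         # u_j(0) = (y0, v_j)
    y0_norm = np.sqrt(np.sum(y0 ** 2) * w)   # discrete L2 norm of y0

    dt = T / steps
    lhs = np.eye(n) + dt * M
    norms = [np.linalg.norm(u)]              # ||y_n(t)|| = |u(t)| by orthonormality
    for _ in range(steps):
        u = np.linalg.solve(lhs, u)          # one implicit Euler step
        norms.append(np.linalg.norm(u))
    return np.array(norms), y0_norm


norms, y0_norm = galerkin_heat_1d(8)
print(norms[0] <= y0_norm)           # Bessel's inequality, cf. (7.20)
print(bool(np.all(np.diff(norms) < 0)))  # the L2 norm decreases when f = g = 0
```

With vanishing data and a nonnegative bilinear form, the energy identity (7.19) shows that $\|y_n(\tau)\|$ cannot increase in time; the implicit Euler iterates inherit this property because the matrix `M` is symmetric and positive definite.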
(iii) Convergence properties of the sequences $\{u_j^n\}$ and $\{y_n\}$

Owing to (7.24), we have $|u_j^n(t)| \le K$ for all $t$, $j$, and $n$. Moreover, it follows from (7.15) by integration over time that for any fixed $j \in \mathbb{N}$ the sequence $\{u_j^n\}_{n=1}^\infty$ forms an equicontinuous set in $C[0,T]$. Hence, the Arzelà-Ascoli theorem may be applied to any of these sequences of functions. We now combine this theorem with a suitable diagonal selection procedure to establish the existence of a subsequence $\{n_k\}_{k=1}^\infty$ of indices such that
\[
\lim_{k \to \infty} u_j^{n_k} = u_j \quad \text{strongly in } C[0,T] \quad \forall\, j \in \mathbb{N}.
\]
To this end, we proceed as follows: first, we select a subsequence $\{u_1^{n_\ell}\}$, $\ell = 1, 2, \ldots$, that converges uniformly on $[0,T]$. We now choose for each $j$ the element $u_j^{n_1}$ as the first term. Next, we consider the sequence $\{u_2^{n_\ell}\}$, $\ell = 2, 3, \ldots$, from which we again select a uniformly convergent subsequence $\{u_2^{n_{\ell_m}}\}$. Obviously, the sequence $\{u_1^{n_{\ell_m}}\}$, being a subsequence of $\{u_1^{n_\ell}\}$, also converges uniformly. Now we choose for each $j$ the element $u_j^{n_{\ell_1}}$ as the second term; that is, we put $n_2 := n_{\ell_1}$.

Continuing this selection process inductively for all $j \in \mathbb{N}$, we obtain a subsequence $\{n_k\}_{k=1}^\infty$ of indices such that all the sequences $\{u_j^{n_k}\}$ converge uniformly on $[0,T]$. Observe that for any of the sequences $\{u_j^{n_k}\}$, at most the first $j - 1$ terms do not belong to the selected convergent subsequences.

With the limit functions $u_j$ thus constructed, we define the function
\[
y(x,t) := \sum_{i=1}^\infty u_i(t)\, v_i(x), \quad (x,t) \in Q.
\]
It can then be shown that the sequence $\{y_{n_k}(\cdot, t)\}$ converges weakly in $L^2(\Omega)$ to $y(\cdot, t)$, uniformly with respect to $t \in [0,T]$. Owing to the weak lower sequential semicontinuity of the norm, we infer from the estimate (7.23) that $\|y(t)\| \le K$ for almost all $t$, which means that $y \in L^\infty(0,T; L^2(\Omega))$. Moreover, $y_{n_k}(0)$ even converges strongly in $L^2(\Omega)$ to $y_0$; indeed, we have, as $k \to \infty$,
\[
\|y_{n_k}(0) - y_0\| = \Big\| \sum_{i=1}^{n_k} u_i^{n_k}(0)\, v_i - \sum_{i=1}^\infty (y_0, v_i)\, v_i \Big\| = \Big\| \sum_{i=n_k+1}^\infty u_i(0)\, v_i \Big\| \to 0,
\]
since $\sum_{i=1}^\infty |u_i(0)|^2 < \infty$ by (7.24). All of the derivations above can be found in full detail in [LSU68].

(iv) $y$ is a weak solution

By virtue of the estimate (7.23), we may assume without loss of generality that $\{y_{n_k}\}_{k=1}^\infty$ converges weakly in $W_2^{1,0}(Q)$ to $y$. Next observe that we can take as test function in (7.15) any function $v_m$ of the form
\[
v_m(x,t) = \sum_{j=1}^m \alpha_j(t)\, v_j(x), \quad m \le n,
\]
where $\alpha_j \in C^1[0,T]$ satisfies $\alpha_j(T) = 0$ for $1 \le j \le m$. It then follows from (7.15) that
\[
\Big( \frac{d}{dt} y_{n_k}(t), v_m(t) \Big) + a[t; y_{n_k}(t), v_m(t)] = (f(t), v_m(t)) + (g(t), v_m(t))_\Gamma,
\]
whence, upon integrating over $[0,T]$ by parts,
\[
-\int_0^T \Big( y_{n_k}(t), \frac{d}{dt} v_m(t) \Big)\, dt + \int_0^T a[t; y_{n_k}(t), v_m(t)]\, dt
= \iint_Q f\, v_m\, dx\, dt + \iint_\Sigma g\, v_m\, ds\, dt + \big( y_{n_k}(\cdot, 0), v_m(\cdot, 0) \big).
\]
Now recall that $y_{n_k} \to y$ weakly in $W_2^{1,0}(Q)$ and $y_{n_k}(0) \to y_0$ strongly in $L^2(\Omega)$. Passage to the limit as $k \to \infty$ in the above equation therefore yields
\[
-\int_0^T \Big( y(t), \frac{d}{dt} v_m(t) \Big)\, dt + \int_0^T a[t; y(t), v_m(t)]\, dt
= \iint_Q f\, v_m\, dx\, dt + \iint_\Sigma g\, v_m\, ds\, dt + \int_\Omega y_0\, v_m(\cdot, 0)\, dx.
\]
Finally, we use the fact that, by Lemma 4.12 in Chapter II of [LSU68], the set of all functions $v_m$ of the above type is dense in the class of all functions from $W_2^{1,1}(Q)$ having zero final value. Therefore, $y$ satisfies the variational formulation and is thus a weak solution. This concludes the proof of the assertion. □

Remark. The assumption that $\alpha$ be nonnegative is dispensable; see, e.g., Raymond and Zidani [RZ98].

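
For the reader's convenience, we spell out the integration by parts with respect to time that is used in step (iv) of the preceding proof. It is the standard formula for absolutely continuous functions, applied to the Galerkin approximation $y_{n_k}$ and a test function $v_m$ with $v_m(T) = 0$:
\[
\begin{aligned}
\int_0^T \Big( \frac{d}{dt} y_{n_k}(t), v_m(t) \Big)\, dt
&= \big( y_{n_k}(T), v_m(T) \big) - \big( y_{n_k}(0), v_m(0) \big) - \int_0^T \Big( y_{n_k}(t), \frac{d}{dt} v_m(t) \Big)\, dt \\
&= - \big( y_{n_k}(0), v_m(0) \big) - \int_0^T \Big( y_{n_k}(t), \frac{d}{dt} v_m(t) \Big)\, dt.
\end{aligned}
\]
This explains both the minus sign in front of the time integral and the appearance of the initial term $(y_{n_k}(\cdot, 0), v_m(\cdot, 0))$ on the right-hand side of the integrated identity.
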
We now turn to the uniqueness question. The proof of uniqueness of the solution is somewhat technical. It is based on the energy equality
\[
\frac{1}{2} \|y(T)\|_{L^2(\Omega)}^2 + \int_0^T a[t; y(t), y(t)]\, dt
= \frac{1}{2} \|y(0)\|_{L^2(\Omega)}^2 + \int_0^T \big[ (f(t), y(t))_{L^2(\Omega)} + (g(t), y(t))_{L^2(\Gamma)} \big]\, dt. \tag{7.25}
\]
Clearly, the above equality follows from the variational formulation for the solution when $y$ itself is inserted as the test function. However, this is not allowed, since $y$ does not necessarily have the properties $y \in W_2^{1,1}(Q)$ and $y(T) = 0$ that are required for test functions. In [LSU68], it was shown by means of averaged functions that the above energy balance equation is indeed valid, provided we have more regularity, namely, $y \in V_2^{1,0}(Q)$. Hence, if the assumptions of Theorem 7.8 hold and $y$ is a solution belonging to $V_2^{1,0}(Q)$, then there exists a constant $c_P > 0$, which does not depend on $f$, $g$, or $y_0$, such that
\[
\max_{t \in [0,T]} \|y(t)\|_{L^2(\Omega)} + \|y\|_{W_2^{1,0}(Q)} \le c_P\, \big( \|f\|_{L^2(Q)} + \|g\|_{L^2(\Sigma)} + \|y_0\|_{L^2(\Omega)} \big). \tag{7.26}
\]
Since the energy balance equation (7.25) has the same structure as equation (7.19), the inequality (7.26) can be derived from it in the same way as the estimates (7.22) and (7.23) for $y_n$ were derived from (7.19).

By means of similar estimates, it can then be shown that in $W_2^{1,0}(Q)$ there is also at most one solution; see [LSU68], Chapter III, Theorem 3.3. The reason for this is the fact that the difference of two solutions solves the initial-boundary value problem with homogeneous and thus smooth data. In this way, an estimate resembling (7.25) can be employed, which finally leads to the conclusion that the difference of two solutions must vanish; see [LSU68], Chapter III, Theorem 3.2.

By virtue of Theorem 7.8, we know that there is at least one solution in $V_2(Q)$. In [LSU68], Chapter III, Theorem 4.2, it is shown that any weak solution in $V_2(Q)$ even belongs to $V_2^{1,0}(Q)$. In view of the uniqueness in $W_2^{1,0}(Q)$, we may thus conclude that the unique solution $y$ belongs to $V_2^{1,0}(Q)$. This information makes it possible to derive the estimate (7.26) for the solution.

Remark. The results cited from [LSU68], which were proved there for homogeneous Dirichlet boundary conditions, carry over to the case of boundary conditions of the third kind.

In summary, we have the following main result.

Theorem 7.9. Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain, and suppose that functions $c_0 \in L^\infty(Q)$, $\alpha \in L^\infty(\Sigma)$, $y_0 \in L^2(\Omega)$, $f \in L^2(Q)$, and $g \in L^2(\Sigma)$ are given. Moreover, let the differential operator $A$ satisfy the conditions stated in Theorem 7.8. Then the initial-boundary value problem (7.13) has a unique solution in $W_2^{1,0}(Q)$ that belongs to $V_2^{1,0}(Q)$. In addition, the solution satisfies the estimate (7.26) with a constant $c_P > 0$ that does not depend on $f$, $g$, or $y_0$.

Remark. The proof becomes simpler if one works in the space $W(0,T)$ right from the beginning; see Lions [Lio71] or Wloka [Wlo82]. We have followed the arguments of Ladyzhenskaya et al. [LSU68] in order to be able to introduce the space $W_2^{1,0}(Q)$ first, in analogy to the treatment of weak solutions in the elliptic case.

The semilinear equation. We now study the semilinear parabolic initial-boundary value problem
\[
\begin{aligned}
y_t + A y + d(x,t,y) &= f && \text{in } Q \\
\partial_{\nu_A} y + b(x,t,y) &= g && \text{on } \Sigma \\
y(\cdot, 0) &= y_0 && \text{in } \Omega.
\end{aligned} \tag{7.27}
\]
Here, we follow in parts the proof for the linear case, and also employ ideas from Gajewski et al. [GGZ74] and Wloka [Wlo82]. Initially, we impose the strong conditions of uniform boundedness and uniform Lipschitz continuity of $d$ and $b$ with respect to $y$. We prove Lemma 5.3, which was stated on page 267.

Lemma 5.3. Suppose that Assumptions 5.1 and 5.2 from page 266 hold, and that the assumptions on $A$ stated in Theorem 7.8 are satisfied. Then the initial-boundary value problem (7.27) has for any triple of data $f \in L^2(Q)$, $g \in L^2(\Sigma)$, and $y_0 \in L^2(\Omega)$ a unique weak solution $y \in W(0,T)$.

Proof: (i) Galerkin approximation

We use the same notation as in the proof of Theorem 7.8; in particular, $\{v_i\}_{i=1}^\infty$ has the same meaning and properties. Again, we make the ansatz
$y_n(x,t) = \sum_{i=1}^n u_i^n(t)\, v_i(x)$. This time, the bilinear form $a$ has the form
\[
a[y, v] = \int_\Omega \sum_{i,j=1}^N a_{ij}(x)\, D_i y\, D_j v\, dx.
\]
As in the proof of Theorem 7.8, we obtain from the parabolic problem the following initial value problem for a nonlinear system of ordinary differential equations:
\[
\begin{aligned}
\frac{d}{dt} u_j^n(t) + \sum_{i=1}^n u_i^n(t)\, a[v_i, v_j] + \Phi_j(t, u^n(t)) &= b_j(t), \\
u_j^n(0) &= (y_0, v_j),
\end{aligned} \tag{7.28}
\]
for $j = 1, \ldots, n$. Here, we have set $b_j(t) := (f(t), v_j) + (g(t), v_j)_\Gamma$, and
\[
\Phi_j(t, u) := \Big( d\Big(\cdot, t, \sum_{i=1}^n u_i v_i\Big), v_j \Big) + \Big( b\Big(\cdot, t, \sum_{i=1}^n u_i v_i\Big), v_j \Big)_\Gamma.
\]
By assumption, the mapping $\Phi : [0,T] \times \mathbb{R}^n \to \mathbb{R}^n$ is uniformly Lipschitz continuous and uniformly bounded. Hence, owing to Carathéodory's theorem, for every $n \in \mathbb{N}$ the above initial value problem has a unique absolutely continuous solution $u^n(\cdot) \in (H^1(0,T))^n$ in $[0,T]$. As in the linear case, we obtain, for almost every $t \in (0,T)$,
\[
\begin{aligned}
\Big( \frac{d}{dt} y_n(t), y_n(t) \Big) + a[y_n(t), y_n(t)] &+ \big( d(\cdot, t, y_n(t)), y_n(t) \big) + \big( b(\cdot, t, y_n(t)), y_n(t) \big)_\Gamma \\
&= (f(t), y_n(t)) + (g(t), y_n(t))_\Gamma.
\end{aligned} \tag{7.29}
\]

(ii) Estimates for $\{y_n\}$

Without loss of generality, we may assume that $d(\cdot, \cdot, 0) = b(\cdot, \cdot, 0) = 0$ (otherwise, we subtract these terms from both sides of (7.27)). As in the derivation of (7.19), it follows from the monotonicity of $d$ and $b$ that
\[
\big( d(\cdot, t, y_n), y_n \big) = \big( d(\cdot, t, y_n) - d(\cdot, t, 0), y_n - 0 \big) \ge 0.
\]
An analogous inequality holds for $b$. We therefore have
\[
\frac{1}{2} \|y_n(\tau)\|^2 + \int_0^\tau a[y_n(t), y_n(t)]\, dt \le \frac{1}{2} \|y_n(0)\|^2 + \int_0^\tau \big\{ (f(t), y_n(t)) + (g(t), y_n(t))_\Gamma \big\}\, dt, \tag{7.30}
\]
as well as the estimate (7.20) for $y_n(0)$. We thus obtain an analogous inequality in place of the equality (7.19). Since we estimated from above anyway in the further steps of the proof of Theorem 7.8, we may argue as in that proof to arrive at the estimate (7.23):
\[
\|y_n\|_{C([0,T], H)} + \|y_n\|_{W_2^{1,0}(Q)} \le K \quad \forall\, n \in \mathbb{N}.
\]

(iii) Weak convergence of $\{y_n\}$

From the above estimate, we conclude that some subsequence, without loss of generality $\{y_n\}_{n=1}^\infty$ itself, converges weakly in $W_2^{1,0}(Q)$ to some $y \in W_2^{1,0}(Q)$. Since $d$ and $b$ are nonlinear, we cannot conclude from this that $d(\cdot, y_n)$ (respectively, $b(\cdot, y_n)$) converges weakly to $d(\cdot, y)$ (respectively, $b(\cdot, y)$). However, we do know that these sequences are bounded in $L^2(Q)$ and $L^2(\Sigma)$, respectively. We may therefore assume without loss of generality that
\[
d(\cdot, y_n) \rightharpoonup D \ \text{ in } L^2(Q), \qquad b(\cdot, y_n) \rightharpoonup B \ \text{ in } L^2(\Sigma), \tag{7.31}
\]
with suitable $D \in L^2(Q)$ and $B \in L^2(\Sigma)$. Passing to the limit as $n \to \infty$, we find that $y$ is the weak solution to a linear auxiliary problem: indeed, as in the linear case, it follows that for all functions $v_m(\cdot, t) = \sum_{i=1}^m \alpha_i(t)\, v_i(\cdot)$ where $\alpha_i \in C^1[0,T]$ and $\alpha_i(T) = 0$ for $1 \le i \le m$, we have
\[
\begin{aligned}
-\int_0^T \Big( y(t), \frac{d}{dt} v_m(t) \Big)\, dt &+ \int_0^T \big\{ a[y(t), v_m(t)] + (D(t), v_m(t)) + (B(t), v_m(t))_\Gamma \big\}\, dt \\
&= \iint_Q f\, v_m\, dx\, dt + \iint_\Sigma g\, v_m\, ds\, dt + \int_\Omega y_0\, v_m(\cdot, 0)\, dx.
\end{aligned}
\]
Since the set of all functions $v_m$ of the above form is dense in $W_2^{1,1}(Q)$ (see [LSU68], Chapter II, Lemma 4.12), we can infer that
\[
\begin{aligned}
-\int_0^T \Big( y(t), \frac{d}{dt} v(t) \Big)\, dt &+ \int_0^T \big\{ a[y(t), v(t)] + (D(t), v(t)) + (B(t), v(t))_\Gamma \big\}\, dt \\
&= \iint_Q f\, v\, dx\, dt + \iint_\Sigma g\, v\, ds\, dt + \int_\Omega y_0\, v(\cdot, 0)\, dx
\end{aligned}
\]
for all $v \in W_2^{1,1}(Q)$ with $v(T) = 0$. But this is none other than the variational equation for a weak solution $y$ having initial value $y(0) = y_0$. Hence, if we were to succeed in showing that $D(x,t) = d(x,t,y(x,t))$ and
376 7. Supplementary results on partial differential equations 7.3. Parabolic problems 377

$B(x,t) = b(x,t,y(x,t))$ almost everywhere, then $y$ would be the desired solution and the proof would be complete.

To this end, we put $Y := L^2(0,T;V)$. Then $Y^* = L^2(0,T;V^*)$, and we have

$$y' + w = F \tag{7.32}$$

in $Y^*$, where $F \in Y^*$ is defined by

$$(F, v)_{Y^*,Y} = \iint_Q f\, v \, dx\, dt + \iint_\Sigma g\, v \, ds\, dt$$

and $w \in Y^*$ by

$$(w, v)_{Y^*,Y} = \int_0^T \big\{ a[y(t), v(t)] + (D(t), v(t)) + (B(t), v(t))_\Gamma \big\}\, dt.$$

(iv) $y$ is a weak solution to the nonlinear problem

We follow the lines of the proof of Theorem 1.1 in Chapter VI of [GGZ74]. By construction, $y_n \in W(0,T)$ for all $n \in \mathbb{N}$. As in part (i) of the proof of Theorem 4.4, we define a monotone operator $A : W(0,T) \to L^2(0,T;V^*)$ by $A = A_1 + A_2 + A_3$, with $A_i : W(0,T) \to L^2(0,T;V^*)$, $y \mapsto z_i$, for $i = 1,2,3$, where

$$z_1(t) = a[y(t), \cdot\,], \qquad z_2(t) = d(\cdot, t, y(t)), \qquad z_3(t) = b(\cdot, t, y(t)).$$

From (7.29), we infer that

$$\int_0^T (y_n'(t), y_n(t))_{V^*,V}\, dt + \int_0^T (A(y_n)(t), y_n(t))_{V^*,V}\, dt = \iint_Q f\, y_n \, dx\, dt + \iint_\Sigma g\, y_n \, ds\, dt$$

for every $n \in \mathbb{N}$. By virtue of formula (3.30) on page 148, and using the definition of $F$, we have

$$\int_0^T (A(y_n)(t), y_n(t))_{V^*,V}\, dt = (F, y_n)_{Y^*,Y} + \frac12 \|y_n(0)\|_H^2 - \frac12 \|y_n(T)\|_H^2.$$

Evidently, $y_n(0) \to y_0$ strongly in $H = L^2(\Omega)$. Moreover, it follows from $y_n \rightharpoonup y$ in $W(0,T)$ and the continuity of the linear operator $y \mapsto y(T)$ in $W(0,T)$ that also $y_n(T) \rightharpoonup y(T)$ in $H$. Therefore,

$$\liminf_{n \to \infty} \|y_n(T)\|_H \ge \|y(T)\|_H,$$

whence, upon using (3.30) once more,

$$\limsup_{n \to \infty}\, (A(y_n), y_n)_{Y^*,Y} \le (F, y)_{Y^*,Y} + \frac12 \|y(0)\|_H^2 - \frac12 \|y(T)\|_H^2 = (F, y)_{Y^*,Y} - (y', y)_{Y^*,Y} = (w, y)_{Y^*,Y}$$

by (7.32). In summary, we have shown the relations

$$y_n \rightharpoonup y, \qquad A(y_n) \rightharpoonup w, \qquad \limsup_{n \to \infty}\, (A(y_n), y_n) \le (w, y).$$

Hence, we can infer from Lemma 7.11 below that $A(y) = w$, whence, finally, $D = d(\cdot, y)$ and $B = b(\cdot, y)$ follow. This concludes the proof of the assertion. We remark that an alternative way of concluding this proof can be found in Lions [Lio69]. ❑

The assumed uniform boundedness of the nonlinearities is too restrictive. Therefore, the above theorem has restricted applicability; it is more of an auxiliary result. In Theorem 5.5, the above restriction was removed. In the proof of Theorem 5.5, we made use of the following estimate instead:

Lemma 7.10. The solution $y \in W(0,T)$ that exists by Lemma 5.3 satisfies the estimate

$$\|y\|_{W(0,T)} \le c_P \big( \|f - d(\cdot,0)\|_{L^2(Q)} + \|g - b(\cdot,0)\|_{L^2(\Sigma)} + \|y_0\|_{L^2(\Omega)} \big) \tag{7.33}$$

with a constant $c_P > 0$ that depends neither on $f$ nor on $g$.

Proof: Since $y \in W(0,T)$, we may test the nonlinear equation with $y$ itself to obtain the nonlinear analogue of the energy balance equation (7.25):

$$\frac12 \|y(T)\|^2_{L^2(\Omega)} + \int_0^T \big\{ a[y(t), y(t)] + (d(\cdot,t,y(t)), y(t)) + (b(\cdot,t,y(t)), y(t))_\Gamma \big\}\, dt = \frac12 \|y(0)\|^2_{L^2(\Omega)} + \int_0^T \big\{ (f(t), y(t))_{L^2(\Omega)} + (g(t), y(t))_{L^2(\Gamma)} \big\}\, dt.$$

Now add to both sides of the above equation the expression

$$-\int_0^T \big\{ (d(\cdot,t,0), y(t)) + (b(\cdot,t,0), y(t))_\Gamma \big\}\, dt.$$
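For orientation, the result of this addition can be written out explicitly. The display below is only a sketch of the intermediate step, with the pairings taken in $L^2(\Omega)$ and $L^2(\Gamma)$ as in the energy balance above; it is not quoted from the original text.

```latex
\begin{align*}
&\tfrac12 \|y(T)\|^2_{L^2(\Omega)} + \int_0^T a[y(t),y(t)]\,dt \\
&\qquad + \int_0^T \big(d(\cdot,t,y(t)) - d(\cdot,t,0),\, y(t)\big)\,dt
        + \int_0^T \big(b(\cdot,t,y(t)) - b(\cdot,t,0),\, y(t)\big)_\Gamma\,dt \\
&\quad = \tfrac12 \|y(0)\|^2_{L^2(\Omega)}
        + \int_0^T \big(f(t) - d(\cdot,t,0),\, y(t)\big)\,dt
        + \int_0^T \big(g(t) - b(\cdot,t,0),\, y(t)\big)_\Gamma\,dt .
\end{align*}
```

The two difference terms on the left are nonnegative by monotonicity, in the same way as in the derivation of (7.30), and the right-hand side now involves only the shifted data $f - d(\cdot,0)$, $g - b(\cdot,0)$ and the initial value $y_0$, which are the quantities appearing in (7.33).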

Applying Young's inequality to the terms on the right-hand side and invoking the monotonicity of $b$ and $d$, we then easily conclude the validity of (7.33). This concludes the proof of the assertion. ❑

Finally, we provide the reader with the auxiliary result that was applied in the final step of the proof of Lemma 5.3. Its proof can be found in [GGZ74], Chapter III, Lemma 1.3, or in [Zei95].

Lemma 7.11. Let $Y$ be a reflexive Banach space, and suppose that $A : Y \to Y^*$ is a monotone and demicontinuous operator. Moreover, let

$$y_n \rightharpoonup y \ \text{ in } Y, \qquad A(y_n) \rightharpoonup w \ \text{ in } Y^* \quad \text{as } n \to \infty$$

and

$$\limsup_{n \to \infty}\, (A(y_n), y_n)_{Y^*,Y} \le (w, y)_{Y^*,Y}.$$

Then $A(y) = w$.

7.3.2. Continuity of solutions. In the following, $A$ is the uniformly elliptic differential operator with coefficients in $L^\infty(\Omega)$ that was introduced in (2.19) on page 37. The continuity result below is a consequence of Theorem 6.8 in Griepentrog [Gri07a] on maximal parabolic regularity. Since a proper understanding of this result requires some knowledge of the Sobolev-Morrey spaces defined in [Gri07b] and their embedding properties, we will briefly comment on Griepentrog's results at the end of this section.

Lemma 7.12. Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain, and suppose that there are given functions $f \in L^r(Q)$, $g \in L^s(\Sigma)$, and $y_0 \in C(\bar\Omega)$, where $r > N/2 + 1$ and $s > N + 1$. Then the weak solution $y$ to the linear parabolic initial-boundary value problem with Neumann boundary condition

$$y_t + A y = f \ \text{ in } Q, \qquad \partial_{\nu_A} y = g \ \text{ on } \Sigma, \qquad y(0) = y_0 \ \text{ in } \Omega$$

belongs to $W(0,T) \cap C(\bar Q)$. Moreover, there exists some constant $c(r,s) > 0$, which does not depend on $f$, $g$, or $y_0$, such that

$$\|y\|_{W(0,T)} + \|y\|_{C(\bar Q)} \le c(r,s) \big( \|f\|_{L^r(Q)} + \|g\|_{L^s(\Sigma)} + \|y_0\|_{C(\bar\Omega)} \big).$$

Proof: In realizing the idea for this proof, the author acknowledges J. Griepentrog's help.

(i) In the case where $y_0 = 0$, the assertion follows from Theorem 6.8 in [Gri07a]; this will be explained later. Owing to the superposition principle, it therefore remains to show the continuity of the solution $y$ to the initial-boundary value problem

$$y_t + A y = 0 \ \text{ in } Q, \qquad \partial_{\nu_A} y = 0 \ \text{ on } \Sigma, \qquad y(0) = y_0 \ \text{ in } \Omega.$$

The estimate then follows from the continuity of the respective solution mappings.

To this end, we adapt the notation from [Gri07a, Gri07b] and put $S := (0,T)$ and $V := H^1(\Omega)$. Moreover, we define the continuous linear operator $A : L^2(S;V) \to L^2(S;V^*)$,

$$\int_S ((Ay)(t), v(t))_{V^*,V}\, dt = \int_S \int_\Omega \sum_{\alpha,\beta=1}^{N} a_{\alpha\beta}(x)\, D_\alpha y(x,t)\, D_\beta v(x,t)\, dx\, dt. \tag{7.34}$$

With this, the linear initial-boundary value problem attains the form

$$y_t + Ay = 0, \qquad y(0) = y_0, \tag{7.35}$$

with homogeneous Neumann boundary conditions. Problem (7.35) has for any initial datum $y_0 \in L^2(\Omega)$ a unique solution $y \in W(0,T)$. In the following, we will, as in [Gri07a, Gri07b], denote $W(0,T)$ by $W(S;V)$. We aim to show that the solution $y$ belongs to $C(\bar S; C(\bar\Omega)) = C(\bar Q)$ if $y_0 \in C(\bar\Omega)$.

To this end, let $y_*$ and $y^*$ denote the minimum and maximum, respectively, of $y_0$ in $\bar\Omega$. It is then a consequence of the maximum principle for parabolic equations and the structure of the operator $A$ that the solution $y$ of (7.35) satisfies

$$y_* \le y(x,t) \le y^* \quad \text{for a.e. } (x,t) \in \Omega \times S. \tag{7.36}$$

To verify this, one can test the variational formulation of (7.35) with the functions $v = (y - y^*)_+ \in L^2(S;V)$ and $v = (y_* - y)_+ \in L^2(S;V)$. Since this argument, while not trivial, is rather standard in the theory of parabolic equations, we do not go into the details here.

(ii) Next, we approximate $y_0$ by a sequence of smooth initial data $y_{0,k}$ and show that the associated solutions $y_k$ are continuous.

To this end, we use the fact that $y_0 \in C(\bar\Omega)$. It follows from the classical Stone-Weierstrass theorem that there exists a sequence $y_{0,k} \in C^\infty(\bar\Omega)$ of initial values that converges uniformly on $\bar\Omega$ to $y_0$ as $k \to \infty$. By virtue of Theorem 2.4 in [Gri07b], the solutions $y_k \in W(S;V)$ of problem (7.35), which correspond to the regularized initial values $y_{0,k}$, converge as $k \to \infty$

in $W(S;V)$ to the solution $y$ to problem (7.35) that is associated with the initial datum $y_0$. We claim that $y_k \in C(\bar\Omega \times \bar S)$ for all $k \in \mathbb{N}$.

For this purpose, we define $v_k(t) := y_{0,k}$ for all $t \in S$ and $k \in \mathbb{N}$. The functions $v_k$ are constant in time and smooth in space. Let $w_k \in W(S;V)$ denote the associated solutions to the problem

$$(w_k)_t + A w_k = -A v_k, \qquad w_k(0) = 0. \tag{7.37}$$

Owing to Theorem 5.6 in [Gri07b], every $A v_k$, $k \in \mathbb{N}$, belongs to the Sobolev-Morrey space $L_2^{N+2}(S;V^*)$. By virtue of the maximal parabolic regularity for problems of type (7.37) (see [Gri07a], Theorem 6.8), there is some exponent $\omega \in (N, N+2)$ such that $w_k$ belongs to the Sobolev-Morrey space $W^\omega(S;V)$ for every $k \in \mathbb{N}$. By Theorems 3.4 and 6.8 in [Gri07b], the latter space is continuously embedded in $C(\bar S; C(\bar\Omega))$. Hence $w_k$, and thus also $y_k = w_k + v_k$, belongs to the space $C(\bar S; C(\bar\Omega))$ for every $k \in \mathbb{N}$, which proves our claim.

(iii) It is now easy to deduce the continuity of $y$. Indeed, for any $k, \ell \in \mathbb{N}$ the difference $y_k - y_\ell \in W(S;V)$ is the solution to problem (7.35) associated with the initial datum $y_{0,k} - y_{0,\ell}$. From step (ii), it follows that $y_k - y_\ell \in C(\bar S; C(\bar\Omega))$. Hence, applying (7.36) to $y_k - y_\ell$, we have the estimate

$$\min_{x \in \bar\Omega} \big( y_{0,k}(x) - y_{0,\ell}(x) \big) \le y_k(x,t) - y_\ell(x,t) \le \max_{x \in \bar\Omega} \big( y_{0,k}(x) - y_{0,\ell}(x) \big) \tag{7.38}$$

for all $k, \ell \in \mathbb{N}$ and all $(x,t) \in \bar\Omega \times \bar S$. Now, $\{y_{0,k}\}_{k=1}^{\infty}$ converges uniformly to $y_0$, and therefore is a Cauchy sequence in $C(\bar\Omega)$. But then it follows immediately from (7.38) that $\{y_k\}_{k=1}^{\infty}$ is a Cauchy sequence and thus convergent in $C(\bar S; C(\bar\Omega))$. Since $y_k \to y$ in $W(S;V)$ and $y$ is the solution of (7.35) associated with $y_0$, we finally obtain that $y \in C(\bar S; C(\bar\Omega))$, which concludes the proof of the lemma. ❑

The above result can be found in [Cas97]. In [RZ99], a detailed proof using strongly continuous semigroups was given under somewhat more restrictive smoothness assumptions on the coefficients of the differential operator and on the boundary $\Gamma$.

Meanwhile, the papers [Gri07b, Gri07a] on maximal parabolic regularity include the above result as a special case. In this connection, $\Omega$ can even be a bounded Lipschitz domain in the sense of Grisvard [Gri85]; this fact may become important, since two three-dimensional rectangular boxes laid on top of each other in the form of a cross define a Lipschitz domain in the sense of Grisvard but not in the sense of Nečas, and hence they do not constitute a regular domain in the sense of our definition given in Section 2.2.2.

Remarks.

(i) If the coefficients of $A$ and the boundary $\Gamma$ are so smooth that there exists a Green's function $G(x,\xi,t)$ for the above linear initial-boundary value problem, then Lemma 7.12 can be proved rather easily. In fact, in this case $y$ can be represented in the form

$$y(x,t) = \int_0^t \int_\Omega G(x,\xi,t-\tau)\, f(\xi,\tau)\, d\xi\, d\tau + \int_0^t \int_\Gamma G(x,\xi,t-\tau)\, g(\xi,\tau)\, ds(\xi)\, d\tau + \int_\Omega G(x,\xi,t)\, y_0(\xi)\, d\xi.$$

We have already met the spatially one-dimensional analogue of this formula, (3.13) on page 126. For $t > 0$ and $x, \xi \in \bar\Omega$ with $x \ne \xi$, the function $G$ obeys an estimate of the form

$$|G(x,\xi,t)| \le c_1\, t^{-N/2} \exp\Big( -c_2\, \frac{|x - \xi|^2}{t} \Big)$$

with positive constants $c_1$ and $c_2$. With this, the lemma can be proved directly by estimation; see [Trö84b], Lemma 5.6.6.

(ii) The mapping $y_0 \mapsto y$ is continuous from $C(\bar\Omega)$ into $C(\bar Q)$ if, for instance, the elliptic differential operator generates a continuous semigroup in $C(\bar\Omega)$. This property was shown in [War06] for the Laplacian and homogeneous variational boundary conditions, and in [Nit09] for the above operator $A$. Inhomogeneous Dirichlet boundary conditions were treated in [ABHN01].

Maximal regularity of parabolic problems. In [Gri07b, Gri07a], parabolic equations of the type $u' + Au + Bu = F$ were investigated for zero initial conditions. For our purposes, the special case

$$u' + Au = F, \qquad u(0) = 0 \tag{7.39}$$

suffices, since for $y_0 = 0$ the parabolic initial-boundary value problem from Lemma 7.12 can be rewritten in the form (7.39). To this end, the operator $A : L^2(0,T;V) \to L^2(0,T;V^*)$ introduced in (7.34) on page 379 is used. We consider (7.39) in the time interval $S = (0,T)$ and with $V = H^1(\Omega)$.

In this context, the Morrey spaces $L_2^\omega(S; L^2(\Omega))$, $L_2^\omega(S; L^2(\Gamma))$, and $L_2^\omega(S; V^*)$ defined in [Gri07b] are needed. We do not give a precise definition of these spaces here, since that would be beyond the scope of this book. Indeed, we are mainly concerned with their embedding properties and with regularity results for the solutions to (7.39). The space

$$L_2^\omega(S;V) = \big\{ u \in L^2(S;V) : u \in L_2^\omega(S; L^2(\Omega)),\ |\nabla u| \in L_2^\omega(S; L^2(\Omega)) \big\}$$

also plays a role. Finally, in the (for us) important case of the duality mapping $E : V \to V^*$, the Sobolev-Morrey space

$$W^\omega(S;V) = \big\{ u \in L_2^\omega(S;V) : u' \in L_2^\omega(S;V^*) \big\}$$

is needed; see Definition 6.1 in [Gri07b] with $X = Y = V$. The interested reader will find a comprehensive discussion of all these Sobolev-Morrey spaces and their properties in [Gri07b].

Remark. In [Gri07a], when $G = \Omega \cup \Gamma \subset \mathbb{R}^N$ the notation $H_0^1(G)$ is used for the subspace of $H^1(\Omega)$ consisting of those functions that vanish on the Dirichlet boundary, that is, on the complement of the boundary portion $\Gamma \subset \partial\Omega$ on which Neumann data are prescribed. Since in our case $\Gamma = \partial\Omega$, the Dirichlet boundary is empty, and thus $H_0^1(G) = H^1(\Omega)$ and $H^{-1}(G) = H^1(\Omega)^* = V^*$.

Understanding of the results from [Gri07a] on maximal parabolic regularity and their application to the proof of Lemma 7.12 will be facilitated by the following facts.

(i) By Theorem 6.8 in [Gri07a], there is some $\bar\omega \in (N, N+2]$ such that for any $\omega \in [0, \bar\omega)$ the restriction of the parabolic differential operator

$$P : u \mapsto u' + Au$$

to the space $\{u \in W^\omega(S;V) : u(0) = 0\}$ is a linear isomorphism between $\{u \in W^\omega(S;V) : u(0) = 0\}$ and $L_2^\omega(S;V^*)$.

(ii) The space $W^\omega(S;V)$ has the important property that for $\omega > N$ it is continuously embedded in some space $C^{0,\kappa}(\bar Q)$ of Hölder continuous functions; see Remark 6.1 in [Gri07a]. Then the assertion of Lemma 7.12 for the case of $y_0 = 0$ follows from the fact that under our assumptions on $r$ and $s$ the mapping $(f,g) \mapsto y$ is continuous from $L^r(Q) \times L^s(\Sigma)$ into $W^\omega(S;V)$ for some $\omega \in (N, \bar\omega)$. This can be deduced from the results in [Gri07b] in the following way:

(iii) Owing to Theorem 5.6 in [Gri07b], for $\omega \in [0, N+2]$ the space $L_2^\omega(S;V^*)$ contains all those functionals $F \in L^2(S;V^*)$ that can be represented in the form

$$\int_S (F(t), \varphi(t))_{V^*,V}\, dt = \sum_{i=1}^{N} \int_S \int_\Omega f_i(x,t)\, D_i \varphi(x,t)\, dx\, dt + \int_S \int_\Omega f(x,t)\, \varphi(x,t)\, dx\, dt + \int_S \int_\Gamma g(x,t)\, \varphi(x,t)\, ds(x)\, dt,$$

where

$$f_1, \ldots, f_N \in L_2^{\omega}(S; L^2(\Omega)), \qquad f \in L_2^{\omega-2}(S; L^2(\Omega)), \qquad g \in L_2^{\omega-1}(S; L^2(\Gamma)),$$

and the mapping $(f_1, \ldots, f_N, f, g) \mapsto F$ defines a continuous linear operator. The connection between these Morrey spaces and the usual Lebesgue spaces, in which our assumptions for the data $f$ and $g$ are formulated, is revealed by Remarks 3.4 and 3.7 in [Gri07b]. We obtain:

(a) If $q \ge 2$ and $\omega_q = (N+2)(1 - 2/q)$, then $L^q(Q)$ is continuously embedded in $L_2^{\omega_q}(S; L^2(\Omega))$, and we have $\omega_q > N$ if $q > N + 2$.

(b) If $r \ge 2$ and $\omega_r = 2 + (N+2)(1 - 2/r)$, then $L^r(Q)$ is continuously embedded in $L_2^{\omega_r - 2}(S; L^2(\Omega))$, and we have $\omega_r > N$ if $r > N/2 + 1$.

(c) If $s \ge 2$ and $\omega_s = 1 + (N+1)(1 - 2/s)$, then $L^s(\Sigma)$ is continuously embedded in $L_2^{\omega_s - 1}(S; L^2(\Gamma))$, and we have $\omega_s > N$ if $s > N + 1$.

Obviously, the last two conditions, namely $r > N/2 + 1$ and $s > N + 1$, are exactly those imposed on $f$ and $g$ in Lemma 7.12.
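The threshold conditions in items (a) to (c) are elementary arithmetic, and a short script makes them easy to check for a given dimension. The following Python sketch evaluates the three Morrey exponents with exact rational arithmetic. The function names (`omega_q`, `omega_r`, `omega_s`, `data_admissible`) are illustrative choices of ours and do not come from the book or from [Gri07b]; the script only restates the formulas quoted above and says nothing about the embeddings themselves.

```python
from fractions import Fraction


def omega_q(N, q):
    """Exponent from item (a): (N + 2)(1 - 2/q)."""
    q = Fraction(q)
    return (N + 2) * (1 - 2 / q)


def omega_r(N, r):
    """Exponent from item (b): 2 + (N + 2)(1 - 2/r)."""
    r = Fraction(r)
    return 2 + (N + 2) * (1 - 2 / r)


def omega_s(N, s):
    """Exponent from item (c): 1 + (N + 1)(1 - 2/s)."""
    s = Fraction(s)
    return 1 + (N + 1) * (1 - 2 / s)


def data_admissible(N, r, s):
    """True if both exponents exceed N, i.e. r > N/2 + 1 and s > N + 1."""
    return omega_r(N, r) > N and omega_s(N, s) > N


if __name__ == "__main__":
    N = 3
    # At the critical values the exponents equal N exactly.
    print(omega_r(N, Fraction(N, 2) + 1), omega_s(N, N + 1), omega_q(N, N + 2))
    # Slightly larger integrability pushes both exponents above N.
    print(data_admissible(N, 3, 5))
```

Working with `Fraction` avoids rounding at the critical values $r = N/2 + 1$, $s = N + 1$ and $q = N + 2$, where each exponent equals $N$ exactly, so the strict inequalities in Lemma 7.12 are seen to be sharp for these formulas.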
Bibliography

[ABHN01] W. Arendt, C. Batty, M. Hieber, and F. Neubrander, Vector-valued Laplace Transforms and Cauchy Problems, Birkhäuser, Basel, 2001.
[ACT02] N. Arada, E. Casas, and F. Tröltzsch, Error estimates for the numerical approximation of a semilinear elliptic control problem, Comput. Optim. Appl. 23 (2002), 201-229.
[Ada78] R. A. Adams, Sobolev Spaces, Academic Press, Boston, 1978.
[AEFR00] N. Arada, H. El Fekih, and J.-P. Raymond, Asymptotic analysis of some control problems, Asymptot. Anal. 24 (2000), 343-366.
[AH01] K. Afanasiev and M. Hinze, Adaptive control of a wake flow using proper orthogonal decomposition, Shape Optimization and Optimal Design, Lect. Notes Pure Appl. Math., vol. 216, Marcel Dekker, 2001, pp. 317-332.
[Alt99] H. W. Alt, Lineare Funktionalanalysis, Springer, Berlin, 1999.
[Alt02] W. Alt, Nichtlineare Optimierung, Vieweg, Wiesbaden, 2002.
[AM84] W. Alt and U. Mackenroth, On the numerical solution of state constrained coercive parabolic optimal control problems, Optimal Control of Partial Differential Equations (K.-H. Hoffmann and W. Krabs, eds.), Int. Ser. Numer. Math., vol. 68, Birkhäuser, 1984, pp. 44-62.
[AM89] W. Alt and U. Mackenroth, Convergence of finite element approximations to state constrained convex parabolic boundary control problems, SIAM J. Control Optim. 27 (1989), 718-736.
[AM93] W. Alt and K. Malanowski, The Lagrange-Newton method for nonlinear optimal control problems, Comput. Optim. Appl. 2 (1993), 77-100.
[Ant05] A. C. Antoulas, Approximation of Large-Scale Dynamical Systems, SIAM, Philadelphia, 2005.
[App88] J. Appell, The superposition operator in function spaces - A survey, Expo. Math. 6 (1988), 209-270.
[AR97] J.-J. Alibert and J.-P. Raymond, Boundary control of semilinear elliptic equations with discontinuous leading coefficients and unbounded controls, Numer. Funct. Anal. Optim. 18 (1997), no. 3-4, 235-250.


[ART02] N. Arada, J.-P. Raymond, and F. Tröltzsch, On an augmented Lagrangian SQP method for a class of optimal control problems in Banach spaces, Comput. Optim. Appl. 22 (2002), 369-398.
[AT81] N. U. Ahmed and K. L. Teo, Optimal Control of Distributed Parameter Systems, North Holland, New York, 1981.
[AT90] F. Abergel and R. Temam, On some control problems in fluid mechanics, Theor. Comput. Fluid Dyn. 1 (1990), 303-325.
[AZ90] J. Appell and P. P. Zabrejko, Nonlinear Superposition Operators, Cambridge University Press, Cambridge, 1990.
[Bal65] A. V. Balakrishnan, Optimal control problems in Banach spaces, SIAM J. Control 3 (1965), 152-180.
[Bar93] V. Barbu, Analysis and Control of Nonlinear Infinite Dimensional Systems, Academic Press, Boston, 1993.
[BBEFR03] F. Ben Belgacem, H. El Fekih, and J.-P. Raymond, A penalized Robin approach for solving a parabolic equation with nonsmooth Dirichlet boundary condition, Asymptot. Anal. 34 (2003), 121-136.
[BC91] F. Bonnans and E. Casas, Une principe de Pontryagine pour le contrôle des systèmes semilinéaires elliptiques, J. Differential Equations 90 (1991), 288-303.
[BC95] F. Bonnans and E. Casas, An extension of Pontryagin's principle for state-constrained optimal control of semilinear elliptic equations and variational inequalities, SIAM J. Control Optim. 33 (1995), 274-298.
[BDPDM92] A. Bensoussan, G. Da Prato, M. C. Delfour, and S. K. Mitter, Representation and Control of Infinite Dimensional Systems, Vol. I, Birkhäuser, Basel, 1992.
[BDPDM93] A. Bensoussan, G. Da Prato, M. C. Delfour, and S. K. Mitter, Representation and Control of Infinite Dimensional Systems, Vol. II, Birkhäuser, Basel, 1993.
[Ber82] D. M. Bertsekas, Projected Newton methods for optimization problems with simple constraints, SIAM J. Control Optim. 20 (1982), 221-246.
[Bet01] J. T. Betts, Practical Methods for Optimal Control Using Nonlinear Programming, SIAM, Philadelphia, 2001.
[BIK99] M. Bergounioux, K. Ito, and K. Kunisch, Primal-dual strategy for constrained optimal control problems, SIAM J. Control Optim. 37 (1999), 1176-1194.
[Bit75] L. Bittner, On optimal control of processes governed by abstract functional, integral and hyperbolic differential equations, Math. Methods Oper. Res. 19 (1975), 107-134.
[BK01] A. Borzi and K. Kunisch, A multigrid method for optimal control of time-dependent reaction diffusion processes, Fast Solution of Discretized Optimization Problems (K. H. Hoffmann, R. Hoppe, and V. Schulz, eds.), Int. Ser. Numer. Math., vol. 138, Birkhäuser, 2001, pp. 513-524.
[BK02a] M. Bergounioux and K. Kunisch, On the structure of the Lagrange multiplier for state-constrained optimal control problems, Systems Control Lett. 48 (2002), 16-176.
[BK02b] M. Bergounioux and K. Kunisch, Primal-dual active set strategy for state-constrained optimal control problems, Comput. Optim. Appl. 22 (2002), 193-224.
[BKR00] R. Becker, H. Kapp, and R. Rannacher, Adaptive finite element methods for optimal control of partial differential equations: Basic concepts, SIAM J. Control Optim. 39 (2000), 113-132.
[Bor03] A. Borzi, Multigrid methods for parabolic distributed optimal control problems, J. Comput. Appl. Math. 157 (2003), 365-382.
[BP78] V. Barbu and Th. Precupanu, Convexity and Optimization in Banach Spaces, Editura Academiei, Bucharest, and Sijthoff & Noordhoff, Leyden, 1978.
[BR01] R. Becker and R. Rannacher, An optimal control approach to a posteriori error estimation in finite element methods, Acta Numer. 10 (2001), 1-102.
[Bra07] D. Braess, Finite Elements: Theory, Fast Solvers, and Applications in Elasticity Theory, 4th Edition, Cambridge University Press, Cambridge, 2007.
[BS94] S. C. Brenner and L. R. Scott, The Mathematical Theory of Finite Element Methods, Springer, New York, 1994.
[BS96] M. Brokate and J. Sprekels, Hysteresis and Phase Transitions, Springer, New York, 1996.
[BS00] F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, Springer, New York, 2000.
[But69] A. G. Butkovskii, Distributed Control Systems, American Elsevier, New York, 1969.
[But75] A. G. Butkovskii, Methods for the Control of Distributed Parameter Systems (in Russian), Isd. Nauka, Moscow, 1975.
[Car67] H. Cartan, Calcul Différentiel. Formes Différentielles, Hermann, Paris, 1967.
[Cas86] E. Casas, Control of an elliptic problem with pointwise state constraints, SIAM J. Control Optim. 4 (1986), 1309-1322.
[Cas92] E. Casas, Introduccion a las Ecuaciones en Derivadas Parciales, Universidad de Cantabria, Santander, 1992.
[Cas93] E. Casas, Boundary control of semilinear elliptic equations with pointwise state constraints, SIAM J. Control Optim. 31 (1993), 993-1006.
[Cas95] E. Casas, Optimality conditions for some control problems of turbulent flow, Flow Control (New York) (M. D. Gunzburger, ed.), Springer, 1995, pp. 127-147.
[Cas97] E. Casas, Pontryagin's principle for state-constrained boundary control problems of semilinear parabolic equations, SIAM J. Control Optim. 35 (1997), 1297-1327.
[CDIRT08] E. Casas, J. C. De los Reyes, and F. Tröltzsch, Sufficient second-order optimality conditions for semilinear control problems with pointwise state constraints, SIAM J. Optim. 12 (2008), no. 2, 616-643.
[CH91] Z. Chen and K. H. Hoffmann, Numerical solutions of the optimal control problem governed by a phase field model, Estimation and Control of Distributed Parameter Systems (W. Desch, F. Kappel, and K. Kunisch, eds.), Int. Ser. Numer. Math., vol. 100, Birkhäuser, 1991, pp. 79-97.
[Cia78] P. G. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[CM02a] E. Casas and M. Mateos, Second order sufficient optimality conditions for semilinear elliptic control problems with finitely many state constraints, SIAM J. Control Optim. 40 (2002), 1431-1454.
[CM02b] E. Casas and M. Mateos, Uniform convergence of the FEM. Applications to state constrained control problems, J. Comput. Appl. Math. 21 (2002), 67-100.
[CMT05] E. Casas, M. Mateos, and F. Tröltzsch, Error estimates for the numerical approximation of boundary semilinear elliptic control problems, Comput. Optim. Appl. 31 (2005), 193-220.

[Con90] J. B. Conway, A Course in Functional Analysis, Wiley & Sons, Warsaw/Dordrecht, 1990.
[Cor07] J. M. Coron, Control and Nonlinearity, American Mathematical Society, Providence, 2007.
[CR06] E. Casas and J.-P. Raymond, Error estimates for the numerical approximation of Dirichlet boundary control for semilinear elliptic equations, SIAM J. Control Optim. 45 (2006), 1586-1611.
[CT99] E. Casas and F. Tröltzsch, Second-order necessary optimality conditions for some state-constrained control problems of semilinear elliptic equations, Appl. Math. Optim. 39 (1999), 211-227.
[CT02] E. Casas and F. Tröltzsch, Second-order necessary and sufficient optimality conditions for optimization problems and applications to control theory, SIAM J. Optim. 13 (2002), 406-431.
[CTU96] E. Casas, F. Tröltzsch, and A. Unger, Second order sufficient optimality conditions for a nonlinear elliptic control problem, Z. Anal. Anwendungen (ZAA) 15 (1996), 687-707.
[CTU00] E. Casas, F. Tröltzsch, and A. Unger, Second order sufficient optimality conditions for some state-constrained control problems of semilinear elliptic equations, SIAM J. Control Optim. 38 (2000), no. 5, 1369-1391.
[Deu04] P. Deuflhard, Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms, Springer Series in Computational Mathematics, vol. 35, Springer, Berlin, 2004.
[DHPY95] A. L. Dontchev, W. W. Hager, A. B. Poore, and B. Yang, Optimality, stability, and convergence in nonlinear control, Appl. Math. Optim. 31 (1995), 297-326.
[DT09] V. Dhamo and F. Tröltzsch, Some aspects of reachability for parabolic boundary control problems with control constraints, accepted for publication in: Comput. Optim. Appl. (2009).
[Dun98] J. C. Dunn, On second order sufficient optimality conditions for structured nonlinear programs in infinite-dimensional function spaces, Mathematical Programming with Data Perturbations (A. Fiacco, ed.), Marcel Dekker, 1998, pp. 83-107.
[ET74] I. Ekeland and R. Temam, Analyse Convexe et Problèmes Variationnels, Dunod, Gauthier-Villars, 1974.
[ET86] K. Eppler and F. Tröltzsch, On switching points of optimal controls for coercive parabolic boundary control problems, Optimization 17 (1986), 93-101.
[Eva98] L. C. Evans, Partial Differential Equations, Graduate Studies in Mathematics, vol. 19, American Mathematical Society, Providence, 1998.
[Fat99] H. O. Fattorini, Infinite Dimensional Optimization and Control Theory, Cambridge University Press, Cambridge, 1999.
[Fri64] A. Friedman, Partial Differential Equations of Parabolic Type, Prentice-Hall, Englewood Cliffs, 1964.
[Fur99] A. V. Fursikov, Optimal Control of Distributed Systems. Theory and Applications, American Mathematical Society, Providence, 1999.
[Gal94] P. G. Galdi, An Introduction to the Navier-Stokes Equations, Springer, New York, 1994.
[GGZ74] H. Gajewski, K. Gröger, and K. Zacharias, Nichtlineare Operatorgleichungen und Operatordifferentialgleichungen, Akademie-Verlag, Berlin, 1974.
[GKT92] H. Goldberg, W. Kampowsky, and F. Tröltzsch, On Nemytskij operators in Lp-spaces of abstract functions, Math. Nachr. 155 (1992), 127-140.
[GM99] M. D. Gunzburger and S. Manservisi, The velocity tracking problem for Navier-Stokes flows with bounded distributed controls, SIAM J. Control Optim. 37 (1999), 1913-1945.
[GM00] M. D. Gunzburger and S. Manservisi, Analysis and approximation of the velocity tracking problem for Navier-Stokes flows with distributed control, SIAM J. Numer. Anal. 37 (2000), 1481-1512.
[GMW81] P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, Academic Press, London, 1981.
[Goe92] M. Goebel, Continuity and Fréchet-differentiability of Nemytskij operators in Hölder spaces, Monatsh. Math. 113 (1992), 107-119.
[GR01] T. Grund and A. Rösch, Optimal control of a linear elliptic equation with a supremum-norm functional, Optim. Methods Softw. 15 (2001), 299-329.
[GR05] C. Grossmann and H.-G. Roos, Numerische Behandlung partieller Differentialgleichungen, Teubner, Wiesbaden, 2005.
[Gri85] P. Grisvard, Elliptic Problems in Nonsmooth Domains, Pitman, Boston, 1985.
[Gri02] J. A. Griepentrog, Linear elliptic boundary value problems with non-smooth data: Campanato spaces of functionals, Math. Nachr. 243 (2002), 19-42.
[Gri07a] J. A. Griepentrog, Maximal regularity for nonsmooth parabolic problems in Sobolev-Morrey spaces, Adv. Differential Equations 12 (2007), 1031-1078.
[Gri07b] J. A. Griepentrog, Sobolev-Morrey spaces associated with evolution equations, Adv. Differential Equations 12 (2007), 781-840.
[GS77] K. Glashoff and E. W. Sachs, On theoretical and numerical aspects of the bang-bang principle, Numer. Math. 29 (1977), 93-113.
[GS80] W. A. Gruver and E. W. Sachs, Algorithmic Methods in Optimal Control, Pitman, London, 1980.
[GT93] H. Goldberg and F. Tröltzsch, Second-order sufficient optimality conditions for a class of nonlinear parabolic boundary control problems, SIAM J. Control Optim. 31 (1993), 1007-1025.
[GT97a] H. Goldberg and F. Tröltzsch, On a SQP-multigrid technique for nonlinear parabolic boundary control problems, Optimal Control: Theory, Algorithms and Applications (Dordrecht) (W. W. Hager and P. M. Pardalos, eds.), Kluwer Academic Publishers, 1997, pp. 154-177.
[GT97b] C. Grossmann and J. Terno, Numerik der Optimierung, Teubner, Stuttgart, 1997.
[Gun95] M. D. Gunzburger (ed.), Flow Control, Springer, New York, 1995.
[Gun03] M. D. Gunzburger, Perspectives in Flow Control and Optimization, SIAM, Philadelphia, 2003.
[GW76] K. Glashoff and N. Weck, Boundary control of parabolic differential equations in arbitrary dimensions: Supremum-norm problems, SIAM J. Control Optim. 14 (1976), 662-681.
[HDMRS09] R. Haller-Dintelmann, C. Meyer, J. Rehberg, and A. Schiela, Hölder continuity and optimal control for nonsmooth elliptic problems, Appl. Math. Optim. 60 (2009), 397-428.

[Hei96] M. Heinkenschloss, Projected sequential quadratic programming methods, SIAM J. Optim. 6 (1996), 373-417.
[Hei97] M. Heinkenschloss, The numerical solution of a control problem governed by a phase field model, Optim. Methods Softw. 7 (1997), 211-263.
[Heu08] H. Heuser, Lehrbuch der Analysis, Teil 2, Vieweg+Teubner, Wiesbaden, 2008.
[Hin99] M. Hinze, Optimal and instantaneous control of the instationary Navier-Stokes equations, Habilitation thesis, Technische Universität Berlin, 1999.
[Hin05] M. Hinze, A variational discretization concept in control constrained optimization: The linear-quadratic case, Comput. Optim. Appl. 30 (2005), 45-63.
[HJ92] K.-H. Hoffmann and L. Jiang, Optimal control problem of a phase field model for solidification, Numer. Funct. Anal. Optim. 13 (1992), 11-27.
[HK01] M. Hinze and K. Kunisch, Second order methods for optimal control of time-dependent fluid flow, SIAM J. Control Optim. 40 (2001), 925-946.
[HK04] M. Hinze and K. Kunisch, Second order methods for boundary control of the instationary Navier-Stokes system, Z. Angew. Math. Mech. (ZAMM) 84 (2004), 171-187.
[HP57] E. Hille and R. S. Phillips, Functional Analysis and Semigroups, Colloquium Publications, vol. 31, American Mathematical Society, Providence, 1957.
[HPUU09] M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich, Optimization with PDE Constraints, Mathematical Modelling: Theory and Applications, vol. 23, Springer, Berlin, 2009.
[HS94] M. Heinkenschloss and E. W. Sachs, Numerical solution of a constrained control problem for a phase field model, Control and Estimation of Distributed Parameter Systems (W. Desch, F. Kappel, and K. Kunisch, eds.), Int. Ser. Numer. Math., vol. 118, Birkhäuser, 1994, pp. 171-188.
[HT99] M. Heinkenschloss and F. Tröltzsch, Analysis of the Lagrange-SQP-Newton method for the control of a phase field equation, Control Cybernet. 28 (1999), no. 2, 178-211.
[HV01] M. Heinkenschloss and L. N. Vicente, Analysis of inexact trust-region SQP algorithms, SIAM J. Optim. 12 (2001), 283-302.
[HV03] D. Hömberg and S. Volkwein, Control of laser surface hardening by a reduced-order approach using proper orthogonal decomposition, Math. Comput. Modelling 38 (2003), 1003-1028.
[IK96] K. Ito and K. Kunisch, Augmented Lagrangian-SQP methods for nonlinear optimal control problems of tracking type, SIAM J. Control Optim. 34 (1996), 874-891.
[IK00] K. Ito and K. Kunisch, Augmented Lagrangian methods for nonsmooth, convex optimization in Hilbert spaces, Nonlinear Anal., Theory Methods Appl. 41 (2000), 591-616.
[Jah94] J. Jahn, Introduction to the Theory of Nonlinear Optimization, Springer, Berlin, 1994.
[JK81] D. S. Jerison and C. E. Kenig, The Neumann problem on Lipschitz domains, Bull. Amer. Math. Soc., New Ser. 4 (1981), 203-207.
[KA64] L. V. Kantorovich and G. B. Akilov, Functional Analysis in Normed Spaces, Pergamon Press, Oxford, 1964.
[Kar77] A. Karafiat, The problem of the number of switches in parabolic equations with control, Ann. Pol. Math. XXXIV (1977), 289-316.
[Kel99] C. T. Kelley, Iterative Methods for Optimization, SIAM, Philadelphia, 1999.
[KK02] D. Klatte and B. Kummer, Nonsmooth Equations in Optimization: Regularity, Calculus, Methods and Applications, Kluwer Academic Publishers, Dordrecht, 2002.
[Kno77] G. Knowles, Über das Bang-Bang-Prinzip bei Kontrollproblemen aus der Wärmeleitung, Arch. Math. 29 (1977), 300-309.
[KR02] K. Kunisch and A. Rösch, Primal-dual active set strategy for a general class of constrained optimal control problems, SIAM J. Optim. 13 (2002), 321-334.
[Kra95] W. Krabs, Optimal Control of Undamped Linear Vibrations, Heldermann, Lemgo, 1995.
[Kre78] E. Kreyszig, Introductory Functional Analysis with Applications, Wiley, New York, 1978.
[KS80] D. Kinderlehrer and G. Stampacchia, An Introduction to Variational Inequalities and Their Applications, Academic Press, New York, 1980.
[KS92] F.-S. Kupfer and E. W. Sachs, Numerical solution of a nonlinear parabolic control problem by a reduced SQP method, Comput. Optim. Appl. 1 (1992), 113-135.
[KS94] C. T. Kelley and E. W. Sachs, Multilevel algorithms for constrained compact fixed point problems, SIAM J. Sci. Comput. 15 (1994), 645-667.
[KS95] C. T. Kelley and E. W. Sachs, Solution of optimal control problems by a pointwise projected Newton method, SIAM J. Control Optim. 33 (1995), 1731-1757.
[KV01] K. Kunisch and S. Volkwein, Galerkin proper orthogonal decomposition methods for parabolic problems, Numer. Math. 90 (2001), 117-148.
[KV02] K. Kunisch and S. Volkwein, Galerkin proper orthogonal decomposition methods for a general equation in fluid dynamics, SIAM J. Numer. Anal. 40 (2002), 492-515.
[KV07] K. Kunisch and B. Vexler, Constrained Dirichlet boundary control in L2 for a class of evolution equations, SIAM J. Control Optim. 46 (2007), 1726-1753.
[KZPS76] M. A. Krasnoselskii, P. P. Zabreiko, E. I. Pustylnik, and P. E. Sobolevskii, Integral Operators in Spaces of Summable Functions, Noordhoff, Leyden, 1976.
[IK08] , Lagrange Multiplier Approach to Variational Problems and Appli- [Las02l I. Lasiecka, Mathematical Control Theory of Coupled PDEs, CBMS-NSF Re-
cations, SIAM, Philadelphia, 2008. gional Conference Series in Applied Mathematics, vol. 75, SIAM, Philadel-
phia, 2002.
(Iof79] A. D. Ioffe, Necessary and sufficient conditions for a local minimum. 3: Sec-
ond orden conditions and augmented duality, SIAM J. Control Optim. 17 [Lio69] J. L. Lions, Quelques Méthodes des Résolution des Problémes aux Limites
(1979), 266-288. non Linéaires, Dunod, Gauthier-Villars, Paris, 1969.
[IT79] A. D. Ioffe and V. M. Tihomirov, Theory of Extremal Problems, North- [Lio71] , Optimal Control of Systems Governed by Partial Differential Equa-
Holland, Amsterdam, 1979. tions, Springer, Berlin, 1971.

[LLS94] J. E. Lagnese, G. Leugering, and E. J. P. G. Schmidt, Modeling, Analysis and Control of Dynamic Elastic Multi-Link Structures, Birkhäuser, Boston, 1994.
[LM72] J. L. Lions and E. Magenes, Nonhomogeneous Boundary Value Problems and Applications, vol. 1-3, Springer, Berlin, 1972.
[LS74] L. A. Lusternik and V. J. Sobolev, Elements of Functional Analysis, Wiley & Sons, New York, 1974.
[LS00] F. Leibfritz and E. W. Sachs, Inexact SQP interior point methods and large scale optimal control problems, SIAM J. Control Optim. 38 (2000), 272-293.
[LS07] C. Lefter and J. Sprekels, Control of a phase field system modeling non-isothermal phase transitions, Adv. Math. Sci. Appl. 17 (2007), 181-194.
[LSU68] O. A. Ladyzhenskaya, V. A. Solonnikov, and N. N. Ural'ceva, Linear and Quasilinear Equations of Parabolic Type, American Mathematical Society, Providence, 1968.
[LT00a] I. Lasiecka and R. Triggiani, Control Theory for Partial Differential Equations: Continuous and Approximation Theories. I: Abstract Parabolic Systems, Cambridge University Press, Cambridge, 2000.
[LT00b] I. Lasiecka and R. Triggiani, Control Theory for Partial Differential Equations: Continuous and Approximation Theories. II: Abstract Hyperbolic-like Systems over a Finite Time Horizon, Cambridge University Press, Cambridge, 2000.
[LU73] O. A. Ladyzhenskaya and N. N. Ural'ceva, Linear and Quasilinear Equations of Elliptic Type (in Russian), Izd. Nauka, Moscow, 1973.
[Lue69] D. G. Luenberger, Optimization by Vector Space Methods, Wiley, New York, 1969.
[Lue84] D. G. Luenberger, Linear and Nonlinear Programming, Addison Wesley, Reading, Massachusetts, 1984.
[LY95] X. Li and J. Yong, Optimal Control Theory for Infinite Dimensional Systems, Birkhäuser, Boston, 1995.
[Mac81] U. Mackenroth, Time-optimal parabolic boundary control problems with state constraints, Numer. Funct. Anal. Optim. 3 (1981), 285-300.
[Mac82] U. Mackenroth, Convex parabolic boundary control problems with state constraints, J. Math. Anal. Appl. 87 (1982), 256-277.
[Mac83a] U. Mackenroth, On a parabolic distributed optimal control problem with restrictions on the gradient, Appl. Math. Optim. 10 (1983), 69-95.
[Mac83b] U. Mackenroth, Some remarks on the numerical solution of bang-bang type optimal control problems, Numer. Funct. Anal. Optim. 5 (1983), 457-484.
[Mal81] K. Malanowski, Convergence of approximations vs. regularity of solutions for convex, control-constrained optimal control problems, Appl. Math. Optim. 8 (1981), 69-95.
[Mau81] H. Maurer, First and second order sufficient optimality conditions in mathematical programming and optimal control, Math. Program. Study 14 (1981), 163-177.
[MM00] H. Maurer and H. D. Mittelmann, Optimization techniques for solving elliptic control problems with control and state constraints. I: Boundary control, J. Comput. Appl. Math. 16 (2000), 29-55.
[MM01] H. Maurer and H. D. Mittelmann, Optimization techniques for solving elliptic control problems with control and state constraints. II: Distributed control, J. Comput. Appl. Math. 18 (2001), 141-160.
[MR04] C. Meyer and A. Rösch, Superconvergence properties of optimal control problems, SIAM J. Control Optim. 43 (2004), 970-985.
[MRT06] C. Meyer, A. Rösch, and F. Tröltzsch, Optimal control of PDEs with regularized pointwise state constraints, Comput. Optim. Appl. 33 (2006), 209-228.
[MS82] J. Macki and A. Strauss, Introduction to Optimal Control Theory, Springer, Berlin, 1982.
[MS00] B. Maar and V. Schulz, Interior point multigrid methods for topology optimization, Structural Optimization 19 (2000), 214-224.
[MT02] H. D. Mittelmann and F. Tröltzsch, Sufficient optimality in a parabolic control problem, Trends in Industrial and Applied Mathematics (Dordrecht) (A. H. Siddiqi and M. Kocvara, eds.), Kluwer Academic Publishers, 2002, pp. 305-316.
[MT06] C. Meyer and F. Tröltzsch, On an elliptic optimal control problem with pointwise mixed control-state constraints, Recent Advances in Optimization. Proceedings of the 12th French-German-Spanish Conference on Optimization held in Avignon, September 20-24, 2004 (A. Seeger, ed.), Lect. Notes Econ. Math. Syst., vol. 563, Springer, 2006, pp. 187-204.
[MZ79] H. Maurer and J. Zowe, First- and second-order conditions in infinite-dimensional programming problems, Math. Program. 16 (1979), 98-110.
[Nec67] J. Nečas, Les Méthodes Directes en Théorie des Equations Elliptiques, Academia, Prague, 1967.
[Nit09] R. Nittka, Regularity of solutions of linear second order elliptic and parabolic boundary value problems on Lipschitz domains, arXiv:0906.5285v1, June 2009.
[NPS09] I. Neitzel, U. Prüfert, and T. Slawig, Strategies for time-dependent PDE control with inequality constraints using an integrated modeling and simulation environment, Numer. Algorithms 50 (2009), 241-269.
[NST06] P. Neittaanmäki, J. Sprekels, and D. Tiba, Optimization of Elliptic Systems: Theory and Applications, Springer, Berlin, 2006.
[NT94] P. Neittaanmäki and D. Tiba, Optimal Control of Nonlinear Parabolic Systems: Theory, Algorithms, and Applications, Marcel Dekker, New York, 1994.
[NW99] J. Nocedal and S. J. Wright, Numerical Optimization, Springer, New York, 1999.
[Paz83] A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer, New York, 1983.
[PBGM62] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The Mathematical Theory of Optimal Processes, Wiley, New York, 1962.
[Pen82] J.-P. Penot, On regularity conditions in mathematical programming, Math. Program. Study 19 (1982), 167-199.
[Pol97] E. Polak, Optimization: Algorithms and Consistent Approximations, Applied Math. Sciences, vol. 124, Springer, New York, 1997.
[Rob80] S. M. Robinson, Strongly regular generalized equations, Math. Oper. Res. 5 (1980), 43-62.
[Rös04] A. Rösch, Error estimates for parabolic optimal control problems with control constraints, Z. Anal. Anwendungen (ZAA) 23 (2004), 353-376.

[Rou02] T. Roubíček, Optimization of steady-state flow of incompressible fluids, Analysis and Optimization of Differential Systems (Boston) (V. Barbu, I. Lasiecka, D. Tiba, and C. Varsan, eds.), Kluwer Academic Publishers, 2002, pp. 357-368.
[RT00] J.-P. Raymond and F. Tröltzsch, Second order sufficient optimality conditions for nonlinear parabolic control problems with state constraints, Discrete Contin. Dyn. Syst. 6 (2000), 431-450.
[RT03] T. Roubíček and F. Tröltzsch, Lipschitz stability of optimal controls for the steady state Navier-Stokes equations, Control Cybernet. 32 (2003), 683-705.
[RW08] A. Rösch and D. Wachsmuth, Numerical verification of optimality conditions, SIAM J. Control Optim. 47 (2008), no. 5, 2557-2581.
[RZ98] J.-P. Raymond and H. Zidani, Pontryagin's principle for state-constrained control problems governed by parabolic equations with unbounded controls, SIAM J. Control Optim. 36 (1998), 1853-1879.
[RZ99] J.-P. Raymond and H. Zidani, Hamiltonian Pontryagin's principles for control problems governed by semilinear parabolic equations, Appl. Math. Optim. 39 (1999), 143-177.
[Sac78] E. W. Sachs, A parabolic control problem with a boundary condition of the Stefan-Boltzmann type, Z. Angew. Math. Mech. (ZAMM) 58 (1978), 443-449.
[Sch79] K. Schittkowski, Numerical solution of a time-optimal parabolic boundary-value control problem, J. Optimization Theory Appl. 27 (1979), 271-290.
[Sch80] E. J. P. G. Schmidt, The bang-bang principle for the time-optimal problem in boundary control of the heat equation, SIAM J. Control Optim. 18 (1980), 101-107.
[Sch89] E. J. P. G. Schmidt, Boundary control for the heat equation with non-linear boundary condition, J. Differential Equations 78 (1989), 89-121.
[Spe93] P. Spellucci, Numerische Verfahren der nichtlinearen Optimierung, Birkhäuser, Basel, 1993.
[Sta65] G. Stampacchia, Le problème de Dirichlet pour les équations elliptiques du second ordre à coefficients discontinus, Ann. Inst. Fourier, Grenoble 15 (1965), 189-258.
[SW98] V. Schulz and G. Wittum, Multigrid optimization methods for stationary parameter identification problems in groundwater flow, Multigrid Methods V (W. Hackbusch and G. Wittum, eds.), Lect. Notes Comput. Sci. Eng., vol. 3, Springer, 1998, pp. 276-288.
[SZ92] J. Sprekels and S. Zheng, Optimal control problems for a thermodynamically consistent model of phase-field type for phase transitions, Adv. Math. Sci. Appl. 1 (1992), 113-125.
[Tan79] H. Tanabe, Equations of Evolution, Pitman, London, 1979.
[Tem79] R. Temam, Navier-Stokes Equations, North-Holland, Amsterdam, 1979.
[Tib90] D. Tiba, Optimal Control of Nonsmooth Distributed Parameter Systems, Lect. Notes Math., vol. 1459, Springer, Berlin, 1990.
[Tri95] H. Triebel, Interpolation Theory, Function Spaces, Differential Operators, J. A. Barth, Heidelberg-Leipzig, 1995.
[Trö84a] F. Tröltzsch, The generalized bang-bang principle and the numerical solution of a parabolic boundary-control problem with constraints on the control and the state, Z. Angew. Math. Mech. (ZAMM) 64 (1984), 551-557.
[Trö84b] F. Tröltzsch, Optimality Conditions for Parabolic Control Problems and Applications, Teubner Texte zur Mathematik, vol. 62, Teubner, Leipzig, 1984.
[Trö99] F. Tröltzsch, On the Lagrange-Newton-SQP method for the optimal control of semilinear parabolic equations, SIAM J. Control Optim. 38 (1999), 294-312.
[Trö00] F. Tröltzsch, Lipschitz stability of solutions of linear-quadratic parabolic control problems with respect to perturbations, Dyn. Contin. Discrete Impulsive Syst. 7 (2000), 289-306.
[TS64] A. N. Tychonov and A. A. Samarski, Partial Differential Equations of Mathematical Physics, Vol. I, Holden-Day, San Francisco, 1964.
[TT96] D. Tiba and F. Tröltzsch, Error estimates for the discretization of state constrained convex control problems, Numer. Funct. Anal. Optim. 17 (1996), 1005-1028.
[TW06] F. Tröltzsch and D. Wachsmuth, Second-order sufficient optimality conditions for the optimal control of Navier-Stokes equations, ESAIM: Control Optim. Calc. Var. 12 (2006), 93-119.
[Ung97] A. Unger, Hinreichende Optimalitätsbedingungen 2. Ordnung und Konvergenz des SQP-Verfahrens für semilineare elliptische Randsteuerprobleme, Ph.D. thesis, Technische Universität Chemnitz, 1997.
[UU00] M. Ulbrich and S. Ulbrich, Superlinear convergence of affine-scaling interior point Newton methods for infinite-dimensional nonlinear problems with pointwise bounds, SIAM J. Control Optim. 38 (2000), 1938-1984.
[UUH99] M. Ulbrich, S. Ulbrich, and M. Heinkenschloss, Global convergence of trust-region interior-point algorithms for infinite-dimensional nonconvex minimization subject to pointwise bounds, SIAM J. Control Optim. 37 (1999), 731-764.
[Vex07] B. Vexler, Finite element approximation of elliptic Dirichlet optimal control problems, Numer. Funct. Anal. Optim. 28 (2007), 957-973.
[Vol01] S. Volkwein, Optimal control of a phase-field model using proper orthogonal decomposition, Z. Angew. Math. Mech. (ZAMM) 81 (2001), 83-97.
[vW76] L. v. Wolfersdorf, Optimal control for processes governed by mildly nonlinear differential equations of parabolic type I, Z. Angew. Math. Mech. (ZAMM) 56 (1976), 531-538.
[vW77] L. v. Wolfersdorf, Optimal control for processes governed by mildly nonlinear differential equations of parabolic type II, Z. Angew. Math. Mech. (ZAMM) 57 (1977), 11-17.
[War06] M. Warma, The Robin and Wentzell-Robin Laplacians on Lipschitz domains, Semigroup Forum 73 (2006), no. 1, 10-30.
[Wer97] D. Werner, Funktionalanalysis, Springer, Berlin, 1997.
[Wlo82] J. Wloka, Partielle Differentialgleichungen, Teubner, Leipzig, 1982.
[Wlo87] J. Wloka, Partial Differential Equations, Cambridge University Press, Cambridge, 1987.
[Wou79] A. Wouk, A Course of Applied Functional Analysis, Wiley, New York, 1979.
[Wri93] S. J. Wright, Primal-Dual Interior-Point Methods, SIAM, Philadelphia, 1993.
[WS04] M. Weiser and A. Schiela, Function space interior point methods for PDE constrained optimization, Proc. Appl. Math. Mech. (PAMM) 4 (2004), 43-46.

[Yos80] K. Yosida, Functional Analysis, Springer, New York, 1980.
[Zei86] E. Zeidler, Nonlinear Functional Analysis and its Applications I: Fixed-point Theorems, Springer, New York, 1986.
[Zei90a] E. Zeidler, Nonlinear Functional Analysis and its Applications II/A: Linear Monotone Operators, Springer, New York, 1990.
[Zei90b] E. Zeidler, Nonlinear Functional Analysis and its Applications II/B: Nonlinear Monotone Operators, Springer, New York, 1990.
[Zei95] E. Zeidler, Applied Functional Analysis and its Applications. Main Principles and their Applications, Springer, New York, 1995.
[ZK79] J. Zowe and S. Kurcyusz, Regularity and stability for the mathematical programming problem in Banach spaces, Appl. Math. Optim. 5 (1979), 49-62.

Index

active set strategy, 101, 106

Banach space, 23
bang-bang
  control, 80
  principle, 133
bilinear form, 31, 165, 227
bisection method, 257
Bochner integral, 143
Borel measure
  regular, 342
boundary condition
  inhomogeneous Dirichlet, 39
  Neumann, 82, 113, 124, 125, 190
  of the third kind, 34
  Robin, 34
  Stefan-Boltzmann, 7, 8, 220, 223, 276, 283, 299
boundary control, 4
boundary observation, 55, 154
boundedness condition, 197
  of order k, 199

c (generic constant), 36
C[a, b], 22
C([a, b], X), 142
C_0^∞(Ω), 25
Carathéodory condition, 197, 204
chain rule, 60
cone
  τ-critical, 296
  convex, 324
  critical, 245
  dual, 324
conormal, 37
constraint
  strongly active, 233, 251, 290, 296
constraint qualification, 330
  Zowe-Kurcyusz, 330
control
  admissible, 49
  bang-bang, 70
  distributed, 5
  locally optimal, 207, 271
  optimal, 49, 207, 271
control-to-state operator, 50
convergence
  strong, 22
  weak, 44
cost functional, 4

ds, 4
D,, Dy, 11
∂_ν, 31
∂_{ν_A}, 37
derivative
  Fréchet, 59
  Gâteaux, 56
  weak, 27
descent direction, 92
differential operator
  elliptic, 37
  in divergence form, 37, 163
Dirac measure, 344
directional derivative, 56
distribution
  vector-valued, 145
domain, 25
  Lipschitz, 26
  of class C^{k,1}, 26

E_Y, 50
ellipticity
  uniform, 37
embedding
  compact, 356
  continuous, 355
equation
  adjoint, 13, 67, 76, 122, 159, 163, 166, 216, 219, 279, 342
  generalized, 259
  semilinear, 7
  semilinear elliptic, 7, 8
  semilinear parabolic, 8, 9
error analysis, 106

finite difference method, 96
first-order condition
  sufficient, 250
formulation
  weak, 31
Fréchet derivative
  continuous, 203
  first-order, 275
  of a Nemytskii operator, 202, 204
  second-order, 226, 229, 230, 238, 242, 288
function
  Green's, 126, 136, 381
  measurable vector-valued, 142
  vector-valued, 141
functional, 40
  continuous, 40
  convex, 47
  linear, 32, 40
  reduced, 50
  strictly convex, 47
  weakly lower semicontinuous, 47

Γ, 3
Gelfand triple, 147
gradient, 11, 58
  reduced, 13, 73, 77, 88
gradient method
  conditioned, 92
  projected, 95, 167, 308
growth condition, 204
  quadratic, 231, 237, 248, 250, 254, 255, 257, 291, 296

H^k(Ω), H_0^k(Ω), 28
H^s(Γ), 112
Hamiltonian function, 285
heat source, 4
Hilbert space, 24

index
  conjugated, 43
inequality
  Cauchy-Schwarz, 23
  Friedrichs, 33
  generalized Friedrichs, 35
  generalized Poincaré, 35
  Hölder, 43
  Poincaré, 35
integral operator, 40, 62
  adjoint, 129
integration by parts, 27, 148

Karush-Kuhn-Tucker
  conditions, 18
  system, 73, 351

L^p(a, b; X), 143
L^p(E), 24
𝓛(U, V), 41
Lagrangian function, 15, 17, 87, 89, 120, 221, 325, 336
Lax-Milgram lemma, 32
Lipschitz condition
  local, of order k, 199
Lipschitz continuity
  local, 197
Lipschitz domain, 26

M(Ω), 340, 366
main theorem
  on monotone operators, 185
mapping
  continuous, 40
mass matrix, 104
maximum condition, 226
maximum norm, 111
  minimization of, 345
maximum principle
  Pontryagin's, 226, 286
minimum principle, 70, 78
  weak, 70
multiplier
  Lagrange, 15, 17, 73, 85, 110, 325, 328-330, 351
multiplier rule, 15, 331

ν, 4
Navier-Stokes equations
  nonstationary, 9, 316
  stationary, 8
Nemytskii operator, 196, 197
Neumann problem, 190
Newton method, 257-259
  projected, 106
norm
  axioms, 21
  of a linear operator, 41
normal derivative, 31

observation operator, 154, 162
operator
  adjoint, 61
  bounded, 40
  continuous, 40
  convex, 325
  dual, 61
  monotone, 184
optimality system, 14, 67, 73, 161, 343, 351
optimization problem
  quadratic, in Hilbert space, 50

partial ordering, 324
phase field model, 9, 313
Poisson's equation, 30
problem
  reduced, 11, 98, 168
projection formula, 70, 78, 131, 133, 161, 163, 217, 281, 284

Q, 5

regularization parameter, 4, 155
remainder, 237
  in integral form, 235, 236

S, 50
Σ, 5
saddle point, 325
second-order condition
  necessary, 246, 247
  sufficient, 247-249, 254, 255, 257, 262, 291, 296, 300, 305
semigroup, 136
set
  convex, 47
  strongly active, 251, 253, 290, 296
  weakly sequentially closed, 47
Slater condition, 326, 329
  linearized, 332
Sobolev space, 28
Sobolev-Slobodetskii space, 112
solution
  generalized, 127
  weak, 31, 140, 164, 186, 191, 266, 267, 317, 366
space
  bidual, 43
  complete, 23
  dual, 42
  normed, 21
  reflexive, 43
SQP method, 258, 259, 262, 309
state
  adjoint, 13, 67
  optimal, 207
state constraint
  integral, 302
  pointwise, 111, 337, 339, 346, 349
state equation, 3
step function, 98, 142, 168
step size
  Armijo, 96, 257
  exact, 94
  for bisection method, 95, 257
stiffness matrix, 104
superconductivity, 7, 217, 281
surface element, 4
surface measure
  Lebesgue's, 27
switching point, 134

τ, 29, 355
test function, 31, 139, 141
theorem
  Taylor's, 227
  Browder-Minty, 185
  Rellich's, 356
  Riesz representation, 42
trace operator, 29
  continuity of, 356
trace theorem, 29
two-norm discrepancy, 235, 254, 256, 296

V-elliptic, 107
variational equality, 140, 164
variational formulation, 31, 267
variational inequality, 12, 63, 64, 215, 216, 219, 278, 280, 283
  in Hilbert space, 63
  pointwise, 68

W(0, T), 146
W^{k,p}(Ω), W_0^{k,p}(Ω), 28
W_2^{1,0}(Q), 138
W_2^{1,1}(Q), 138
weakly
  convergent, 44
  sequentially closed, 46
  sequentially compact, 46
  sequentially continuous, 45

y_t, 6
