OPTCON LQ Optimal Control 2024-10-16


Optimal Control

Linear Quadratic (LQ) Optimal Control

Prof. Giuseppe Notarstefano

Department of Electrical, Electronic, and Information Engineering


Alma Mater Studiorum Università di Bologna
giuseppe.notarstefano@unibo.it

Special thanks to Dr. L. Sforni for his support in preparing the slides.
The present slides are for internal use of the course
Optimal Control @ University of Bologna.
Linear Quadratic (LQ) Optimal Control Problem
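Consider the linear quadratic optimal control problem

\[
\begin{aligned}
\min_{\substack{x_1,\dots,x_T \\ u_0,\dots,u_{T-1}}} \quad & \sum_{t=0}^{T-1} \frac{1}{2}\Big( x_t^\top Q_t x_t + u_t^\top R_t u_t \Big) + \frac{1}{2} x_T^\top Q_T x_T \\
\text{s.t.} \quad & x_{t+1} = A_t x_t + B_t u_t, \qquad t = 0,\dots,T-1 \\
& x_0 = x_{\text{init}}
\end{aligned}
\]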

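• $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$
• $A_t \in \mathbb{R}^{n \times n}$
• $B_t \in \mathbb{R}^{n \times m}$
• $Q_t \in \mathbb{R}^{n \times n}$ and $Q_t = Q_t^\top \ge 0$ for all $t = 0,\dots,T-1$
• $Q_T \in \mathbb{R}^{n \times n}$ and $Q_T = Q_T^\top \ge 0$
• $R_t \in \mathbb{R}^{m \times m}$ and $R_t = R_t^\top > 0$ for all $t = 0,\dots,T-1$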

Prof. Giuseppe Notarstefano – Optimal Control – LQ Optimal Control 1 | 12


Refresher: first-order optimality conditions
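Consider the optimal control problem

\[
\begin{aligned}
\min_{x,u} \quad & \sum_{t=0}^{T-1} \ell_t(x_t, u_t) + \ell_T(x_T) \\
\text{subj. to} \quad & x_{t+1} = f_t(x_t, u_t), \qquad t = 0,\dots,T-1 \\
& x_0 = x_{\text{init}}
\end{aligned}
\]

First-order necessary optimality condition

Let $(x^*, u^*)$ be an optimal (minimum) trajectory. Then there exists $\lambda^*$ such that

\[
\begin{aligned}
x^*_{t+1} &= f_t(x^*_t, u^*_t), & t &= 0,\dots,T-1 \\
\lambda^*_t &= \nabla_1 f_t(x^*_t, u^*_t)\,\lambda^*_{t+1} + \nabla_1 \ell_t(x^*_t, u^*_t), & t &= T-1,\dots,1 \\
0 &= \nabla_2 f_t(x^*_t, u^*_t)\,\lambda^*_{t+1} + \nabla_2 \ell_t(x^*_t, u^*_t), & t &= 0,\dots,T-1
\end{aligned}
\]

with $x^*_0 = x_{\text{init}}$ and $\lambda^*_T = \nabla \ell_T(x^*_T)$.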



LQ Problem: first-order optimality condition
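Consider the LQ problem

\[
\begin{aligned}
\min_{\substack{x_1,\dots,x_T \\ u_0,\dots,u_{T-1}}} \quad & \sum_{t=0}^{T-1} \frac{1}{2}\Big( x_t^\top Q_t x_t + u_t^\top R_t u_t \Big) + \frac{1}{2} x_T^\top Q_T x_T \\
\text{s.t.} \quad & x_{t+1} = A_t x_t + B_t u_t, \qquad t = 0,\dots,T-1 \\
& x_0 = x_{\text{init}}
\end{aligned}
\]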

The first-order conditions, which are necessary and sufficient for $(x^*, u^*)$ to be an optimal trajectory, become: there exists $\lambda^*$ such that

\[
\begin{aligned}
x^*_{t+1} &= A_t x^*_t + B_t u^*_t, & t &= 0,1,\dots,T-1 \\
\lambda^*_t &= A_t^\top \lambda^*_{t+1} + Q_t x^*_t, & t &= T-1,\dots,0 \\
0 &= B_t^\top \lambda^*_{t+1} + R_t u^*_t, & t &= 0,\dots,T-1
\end{aligned}
\]

with $x^*_0 = x_{\text{init}}$ and $\lambda^*_T = Q_T x^*_T$.
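
As a short worked step, the LQ conditions follow from the general first-order conditions by computing the gradients of the problem data. The block below assumes the convention that $\nabla_1 f_t$ and $\nabla_2 f_t$ denote the transposed Jacobians with respect to the first and second argument; that convention is not stated on the slides and is only inferred from the formulas.

```latex
% Problem data of the LQ problem
f_t(x_t,u_t) = A_t x_t + B_t u_t, \qquad
\ell_t(x_t,u_t) = \tfrac{1}{2} x_t^\top Q_t x_t + \tfrac{1}{2} u_t^\top R_t u_t, \qquad
\ell_T(x_T) = \tfrac{1}{2} x_T^\top Q_T x_T

% Gradients (transposed Jacobians for f_t, using Q_t = Q_t^\top and R_t = R_t^\top)
\nabla_1 f_t = A_t^\top, \qquad \nabla_2 f_t = B_t^\top, \qquad
\nabla_1 \ell_t = Q_t x_t, \qquad \nabla_2 \ell_t = R_t u_t, \qquad
\nabla \ell_T = Q_T x_T
```

Substituting these into the general costate and stationarity equations gives $\lambda^*_t = A_t^\top \lambda^*_{t+1} + Q_t x^*_t$, $0 = B_t^\top \lambda^*_{t+1} + R_t u^*_t$ and $\lambda^*_T = Q_T x^*_T$.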



LQ Problem: optimal solution (I)
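Starting from $B_t^\top \lambda^*_{t+1} + R_t u^*_t = 0$, we can write

\[
u^*_t = -R_t^{-1} B_t^\top \lambda^*_{t+1}
\]

It can be proven that there exists a matrix $P_t = P_t^\top \ge 0$ such that

\[
\lambda^*_t = P_t x^*_t .
\]

Assuming that this holds at time $t+1$ for some $t \le T-1$, we have

\[
u^*_t = -R_t^{-1} B_t^\top P_{t+1} x^*_{t+1}
\]

Substituting the dynamics constraint for $x^*_{t+1}$:

\[
u^*_t = -R_t^{-1} B_t^\top P_{t+1} (A_t x^*_t + B_t u^*_t)
\]

Multiplying by $R_t$ and collecting $u^*_t$ gives $(R_t + B_t^\top P_{t+1} B_t)\, u^*_t = -B_t^\top P_{t+1} A_t x^*_t$. Solving for $u^*_t$, it follows that

\[
u^*_t = -(R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t x^*_t, \qquad t = 0,\dots,T-1
\]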



LQ Problem: optimal solution (II)
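We thus obtain

\[
\begin{aligned}
x^*_{t+1} &= A_t x^*_t + B_t u^*_t \\
&= A_t x^*_t - B_t (R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t x^*_t
\end{aligned}
\]

Multiplying both sides by $A_t^\top P_{t+1}$, we obtain

\[
A_t^\top \underbrace{P_{t+1} x^*_{t+1}}_{\lambda^*_{t+1}} = A_t^\top P_{t+1} A_t x^*_t - A_t^\top P_{t+1} B_t (R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t x^*_t
\]

Adding $Q_t x^*_t$ to both sides and collecting $x^*_t$, we obtain

\[
\underbrace{A_t^\top \lambda^*_{t+1} + Q_t x^*_t}_{\lambda^*_t} = \underbrace{\Big( A_t^\top P_{t+1} A_t - A_t^\top P_{t+1} B_t (R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t + Q_t \Big)}_{P_t} x^*_t
\]

that is, $\lambda^*_t = P_t x^*_t$, where we have used the first-order necessary condition $\lambda^*_t = A_t^\top \lambda^*_{t+1} + Q_t x^*_t$.

Starting from the terminal condition $P_T = Q_T$, the matrix $P_t$ can be computed by

\[
P_t = A_t^\top P_{t+1} A_t - A_t^\top P_{t+1} B_t (R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t + Q_t .
\]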
LQ Problem: optimal solution (III)

Defining the gain matrix $K^*_t$ as

\[
K^*_t = -(R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t
\]

the optimal control law for the linear quadratic problem is

\[
\begin{aligned}
x^*_{t+1} &= A_t x^*_t + B_t u^*_t \\
u^*_t &= K^*_t x^*_t
\end{aligned}
\qquad t = 0,\dots,T-1, \qquad x^*_0 = x_{\text{init}}
\]

where $P_t$ is obtained by iterating backward in time:

\[
\begin{aligned}
P_T &= Q_T \\
P_t &= A_t^\top P_{t+1} A_t - A_t^\top P_{t+1} B_t (R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t + Q_t, \qquad t = T-1,\dots,0
\end{aligned}
\]

which is known as the difference Riccati equation.

Remark: the optimal control law $u^*_t = K^*_t x^*_t$ is a state feedback controller.
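
The backward Riccati pass and the forward state-feedback pass can be sketched in a few lines of NumPy. This is a minimal illustration, not the course implementation: it assumes time-invariant matrices for brevity, and the function name `lqr_finite_horizon` and the double-integrator data are hypothetical choices made for this example.

```python
import numpy as np

def lqr_finite_horizon(A, B, Q, R, QT, T):
    """Backward Riccati recursion; returns gains K[0..T-1] and P[0..T]."""
    P = [None] * (T + 1)
    K = [None] * T
    P[T] = QT                                    # terminal condition P_T = Q_T
    for t in range(T - 1, -1, -1):
        H = R + B.T @ P[t + 1] @ B               # R + B' P_{t+1} B
        K[t] = -np.linalg.solve(H, B.T @ P[t + 1] @ A)
        # Equivalent to the Riccati equation, since B' P_{t+1} A = -H K_t
        P[t] = Q + A.T @ P[t + 1] @ A + A.T @ P[t + 1] @ B @ K[t]
    return K, P

# Hypothetical example data: a discrete-time double integrator
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q, R, QT = np.eye(2), np.array([[1.0]]), np.eye(2)
T = 20
x_init = np.array([1.0, 0.0])

K, P = lqr_finite_horizon(A, B, Q, R, QT, T)

# Forward pass: apply the state feedback u_t = K_t x_t
xs, us = [x_init], []
for t in range(T):
    us.append(K[t] @ xs[t])
    xs.append(A @ xs[t] + B @ us[t])

# The costates lambda_t = P_t x_t should satisfy the first-order conditions
lam = [P[t] @ xs[t] for t in range(T + 1)]
stationarity = max(np.abs(B.T @ lam[t + 1] + R @ us[t]).max() for t in range(T))
costate = max(np.abs(lam[t] - A.T @ lam[t + 1] - Q @ xs[t]).max() for t in range(T))

# The cost along the trajectory should equal (1/2) x_init' P_0 x_init
cost = 0.5 * xs[T] @ QT @ xs[T] + sum(
    0.5 * (xs[t] @ Q @ xs[t] + us[t] @ R @ us[t]) for t in range(T))
predicted_cost = 0.5 * x_init @ P[0] @ x_init
print(stationarity, costate, cost, predicted_cost)
```

The gains depend only on the problem data, not on $x_{\text{init}}$, so the backward pass can be run once and reused for any initial condition.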



Other formulations of the Riccati equation
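The usual Riccati recursion reads

\[
P_t = Q_t + A_t^\top P_{t+1} A_t - A_t^\top P_{t+1} B_t (R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t
\]

By exploiting the matrix inversion lemma (see footnote 1), we can write

\[
\begin{aligned}
P_t &= Q_t + A_t^\top P_{t+1} A_t - A_t^\top P_{t+1} B_t (R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} A_t \\
&= Q_t + A_t^\top P_{t+1} \Big( I - B_t (R_t + B_t^\top P_{t+1} B_t)^{-1} B_t^\top P_{t+1} \Big) A_t \\
&= Q_t + A_t^\top P_{t+1} \Big( I - B_t \big( (I + B_t^\top P_{t+1} B_t R_t^{-1}) R_t \big)^{-1} B_t^\top P_{t+1} \Big) A_t \\
&= Q_t + A_t^\top P_{t+1} \Big( I - B_t R_t^{-1} (I + B_t^\top P_{t+1} B_t R_t^{-1})^{-1} B_t^\top P_{t+1} \Big) A_t \\
&= Q_t + A_t^\top P_{t+1} (I + B_t R_t^{-1} B_t^\top P_{t+1})^{-1} A_t \\
&= Q_t + A_t^\top (I + P_{t+1} B_t R_t^{-1} B_t^\top)^{-1} P_{t+1} A_t
\end{aligned}
\]

Footnote 1: for generic matrices $A$, $B$, $C$ of compatible dimensions with $A$ invertible, $(A + BC)^{-1} = A^{-1} - A^{-1} B (I + C A^{-1} B)^{-1} C A^{-1}$.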
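
The equivalence of these forms can be checked numerically. The sketch below evaluates the usual recursion and the two compact forms on randomly generated data; the dimensions, the seed and all variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2

# Random problem data (illustrative only)
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
M = rng.standard_normal((n, n))
P_next = M @ M.T                       # symmetric positive semidefinite P_{t+1}
N = rng.standard_normal((m, m))
R = N @ N.T + np.eye(m)                # symmetric positive definite R_t
Q = np.eye(n)
R_inv = np.linalg.inv(R)
I = np.eye(n)

# Usual form: Q + A'PA - A'PB (R + B'PB)^{-1} B'PA
usual = (Q + A.T @ P_next @ A
         - A.T @ P_next @ B
         @ np.linalg.solve(R + B.T @ P_next @ B, B.T @ P_next @ A))

# Compact form: Q + A'P (I + B R^{-1} B' P)^{-1} A
compact_1 = Q + A.T @ P_next @ np.linalg.solve(I + B @ R_inv @ B.T @ P_next, A)

# Compact form: Q + A' (I + P B R^{-1} B')^{-1} P A
compact_2 = Q + A.T @ np.linalg.solve(I + P_next @ B @ R_inv @ B.T, P_next @ A)

print(np.max(np.abs(usual - compact_1)), np.max(np.abs(usual - compact_2)))
```

The compact forms invert an $n \times n$ matrix instead of an $m \times m$ one, so whether they are preferable in practice depends on the relative sizes of $n$ and $m$.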
Infinite horizon LQ optimal control (I)
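Consider the infinite-horizon optimal control problem

\[
\begin{aligned}
\min_{\substack{x_1,x_2,\dots \\ u_0,u_1,\dots}} \quad & \sum_{t=0}^{\infty} \frac{1}{2}\Big( x_t^\top Q x_t + u_t^\top R u_t \Big) \\
\text{s.t.} \quad & x_{t+1} = A x_t + B u_t, \qquad t = 0,1,\dots \\
& x_0 = x_{\text{init}}
\end{aligned}
\]

where

• $x \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$
• $A \in \mathbb{R}^{n \times n}$
• $B \in \mathbb{R}^{n \times m}$
• $Q \in \mathbb{R}^{n \times n}$ and $Q = Q^\top \ge 0$
• $R \in \mathbb{R}^{m \times m}$ and $R = R^\top > 0$

Assumption: The pair $(A, B)$ is controllable and the pair $(A, C)$, with $Q = C^\top C$, is observable.

Remark: The controllability assumption guarantees a finite cost. Indeed, if the system is controllable, there exists an input sequence driving the state to zero in finite time, so that the cost is finite.

Remark: The role of observability will be clarified later.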



Infinite horizon LQ optimal control (II)
Proposition: Let the pair $(A, B)$ be controllable and the pair $(A, C)$, with $Q = C^\top C$, be observable. Then the following hold:

• there exists a unique positive definite $P_\infty$ that is an equilibrium solution of the difference Riccati equation. That is, $P_\infty$ is a solution of

\[
P_\infty = Q + A^\top P_\infty A - A^\top P_\infty B (R + B^\top P_\infty B)^{-1} B^\top P_\infty A
\]

which is called the algebraic Riccati equation.

• the optimal control is a state feedback given by

\[
\begin{aligned}
K^* &= -(R + B^\top P_\infty B)^{-1} (B^\top P_\infty A) \\
u^*_t &= K^* x^*_t \\
x^*_{t+1} &= A x^*_t + B u^*_t, \qquad t = 0,1,\dots, \qquad x^*_0 = x_{\text{init}}
\end{aligned}
\]

and it asymptotically stabilizes the system.

Remark: The observability of $(A, C)$ guarantees that if the stage cost goes to zero, then the state trajectory goes to zero.
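
Since $P_\infty$ is an equilibrium of the difference Riccati equation, one simple way to approximate it is to iterate that equation until it stops changing. The sketch below does this and then checks the claims of the proposition numerically. It is only an illustration: the function name `solve_are_by_iteration`, the tolerance, the initialization $P = Q$ and the double-integrator data are assumptions of this example, and dedicated solvers are usually preferable in practice.

```python
import numpy as np

def solve_are_by_iteration(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Iterate the difference Riccati equation until a fixed point is reached."""
    P = Q.copy()
    for _ in range(max_iter):
        H = R + B.T @ P @ B
        P_new = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(H, B.T @ P @ A)
        if np.max(np.abs(P_new - P)) < tol:
            return P_new
        P = P_new
    raise RuntimeError("Riccati iteration did not converge")

# Hypothetical example: double integrator, controllable, and observable with C = I
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P_inf = solve_are_by_iteration(A, B, Q, R)
K_inf = -np.linalg.solve(R + B.T @ P_inf @ B, B.T @ P_inf @ A)

# Residual of the algebraic Riccati equation at P_inf
are_residual = np.max(np.abs(
    Q + A.T @ P_inf @ A
    - A.T @ P_inf @ B @ np.linalg.solve(R + B.T @ P_inf @ B, B.T @ P_inf @ A)
    - P_inf))

# The closed loop A + B K is asymptotically stable iff its spectral radius is < 1
spectral_radius = np.max(np.abs(np.linalg.eigvals(A + B @ K_inf)))
min_eig_P = np.min(np.linalg.eigvalsh(P_inf))
print(are_residual, spectral_radius, min_eig_P)
```

Note that the open-loop matrix $A$ in this example has both eigenvalues equal to 1, so the stability of the closed loop is entirely due to the feedback.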



Affine LQR (I)
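Consider an LQR problem with affine terms in the cost and affine dynamics

\[
\begin{aligned}
\min_{\substack{x_1,\dots,x_T \\ u_0,\dots,u_{T-1}}} \quad & \sum_{t=0}^{T-1} \left( \begin{bmatrix} q_t \\ r_t \end{bmatrix}^\top \begin{bmatrix} x_t \\ u_t \end{bmatrix} + \frac{1}{2} \begin{bmatrix} x_t \\ u_t \end{bmatrix}^\top \begin{bmatrix} Q_t & S_t^\top \\ S_t & R_t \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix} \right) + q_T^\top x_T + \frac{1}{2} x_T^\top Q_T x_T \\
\text{subj. to} \quad & x_{t+1} = A_t x_t + B_t u_t + c_t, \qquad t = 0,\dots,T-1
\end{aligned}
\]

where:
• $Q_t \in \mathbb{R}^{n_x \times n_x}$ and $Q_t = Q_t^\top \ge 0$ for all $t = 0,\dots,T$
• $R_t \in \mathbb{R}^{n_u \times n_u}$ and $R_t = R_t^\top > 0$ for all $t = 0,\dots,T-1$
• $S_t \in \mathbb{R}^{n_u \times n_x}$ such that the problem is convex (see footnote 2).

It can be conveniently solved by augmenting the state as

\[
\tilde{x}_t := \begin{bmatrix} 1 \\ x_t \end{bmatrix}
\]

Footnote 2: namely, $Q_t - S_t^\top R_t^{-1} S_t$ is positive semidefinite.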
Affine LQR (II)
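Augmenting the state as

\[
\tilde{x}_t := \begin{bmatrix} 1 \\ x_t \end{bmatrix}
\]

we can rewrite the cost and system matrices as

\[
\tilde{Q}_t := \begin{bmatrix} 0 & q_t^\top \\ q_t & Q_t \end{bmatrix}, \quad
\tilde{S}_t := \begin{bmatrix} r_t & S_t \end{bmatrix}, \quad
\tilde{R}_t := R_t, \quad
\tilde{A}_t := \begin{bmatrix} 1 & 0 \\ c_t & A_t \end{bmatrix}, \quad
\tilde{B}_t := \begin{bmatrix} 0 \\ B_t \end{bmatrix}.
\]

Then, we solve the associated LQR problem

\[
\begin{aligned}
\min_{\substack{\tilde{x}_1,\dots,\tilde{x}_T \\ u_0,\dots,u_{T-1}}} \quad & \sum_{t=0}^{T-1} \frac{1}{2} \begin{bmatrix} \tilde{x}_t \\ u_t \end{bmatrix}^\top \begin{bmatrix} \tilde{Q}_t & \tilde{S}_t^\top \\ \tilde{S}_t & \tilde{R}_t \end{bmatrix} \begin{bmatrix} \tilde{x}_t \\ u_t \end{bmatrix} + \frac{1}{2} \tilde{x}_T^\top \tilde{Q}_T \tilde{x}_T \\
\text{s.t.} \quad & \tilde{x}_{t+1} = \tilde{A}_t \tilde{x}_t + \tilde{B}_t u_t, \qquad t = 0,\dots,T-1 \\
& \tilde{x}_0 = \begin{bmatrix} 1 \\ x_{\text{init}} \end{bmatrix}
\end{aligned}
\]

where $\tilde{Q}_T$ is built from $q_T$ and $Q_T$ in the same way as $\tilde{Q}_t$.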
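
The augmentation can be checked numerically: for any state and input, the augmented stage cost and dynamics should reproduce the affine ones. The sketch below builds the augmented matrices with NumPy and compares both on random data. The helper name `augment`, the dimensions and the random data are assumptions made for this example.

```python
import numpy as np

def augment(A, B, c, Q, R, S, q, r):
    """Build the augmented matrices for the state x_aug = [1, x]."""
    n, m = B.shape
    A_aug = np.block([[np.ones((1, 1)), np.zeros((1, n))],
                      [c.reshape(n, 1), A]])
    B_aug = np.vstack([np.zeros((1, m)), B])
    Q_aug = np.block([[np.zeros((1, 1)), q.reshape(1, n)],
                      [q.reshape(n, 1), Q]])
    S_aug = np.hstack([r.reshape(m, 1), S])
    return A_aug, B_aug, Q_aug, S_aug, R

rng = np.random.default_rng(1)
n, m = 3, 2

# Random affine LQR data at a single time step (illustrative only)
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
c = rng.standard_normal(n)
Q = np.eye(n)
R = np.eye(m)
S = 0.1 * rng.standard_normal((m, n))
q = rng.standard_normal(n)
r = rng.standard_normal(m)

A_aug, B_aug, Q_aug, S_aug, R_aug = augment(A, B, c, Q, R, S, q, r)

# Arbitrary state and input, and the corresponding augmented state
x = rng.standard_normal(n)
u = rng.standard_normal(m)
x_aug = np.concatenate([[1.0], x])

# Stage cost: affine form versus purely quadratic augmented form
affine_cost = q @ x + r @ u + 0.5 * (x @ Q @ x + 2 * u @ S @ x + u @ R @ u)
augmented_cost = 0.5 * (x_aug @ Q_aug @ x_aug
                        + 2 * u @ S_aug @ x_aug + u @ R_aug @ u)

# Dynamics: the first augmented component stays 1, the rest is A x + B u + c
next_affine = A @ x + B @ u + c
next_aug = A_aug @ x_aug + B_aug @ u
print(affine_cost, augmented_cost)
```

Note that the augmented weight matrix is in general indefinite because of its zero top-left entry, which is harmless here since the first component of the augmented state is fixed to 1.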



Affine LQR (III)

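The optimal solution of the problem reads

\[
\begin{aligned}
u^*_t &= K^*_t x^*_t + \sigma^*_t \\
x^*_{t+1} &= A_t x^*_t + B_t u^*_t + c_t
\end{aligned}
\qquad t = 0,\dots,T-1
\]

where

\[
\begin{aligned}
K^*_t &= -(R_t + B_t^\top P_{t+1} B_t)^{-1} (S_t + B_t^\top P_{t+1} A_t) \\
\sigma^*_t &= -(R_t + B_t^\top P_{t+1} B_t)^{-1} \big( r_t + B_t^\top p_{t+1} + B_t^\top P_{t+1} c_t \big) \\
p_t &= q_t + A_t^\top p_{t+1} + A_t^\top P_{t+1} c_t - K_t^{*\top} \big( R_t + B_t^\top P_{t+1} B_t \big) \sigma^*_t \\
P_t &= Q_t + A_t^\top P_{t+1} A_t - K_t^{*\top} (R_t + B_t^\top P_{t+1} B_t) K^*_t
\end{aligned}
\]

with $p_T = q_T$ and $P_T = Q_T$.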
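
The recursion above can be sketched directly in NumPy. The example below runs the backward pass, rolls out the closed loop, and then checks that small random perturbations of the input sequence never lower the cost, which is what optimality of a convex problem requires. It assumes time-invariant data for brevity; the function names and the numerical values are hypothetical.

```python
import numpy as np

def affine_lqr(A, B, c, Q, R, S, q, r, QT, qT, T):
    """Backward pass for the affine LQ problem (time-invariant data)."""
    P, p = QT, qT                                # P_T = Q_T, p_T = q_T
    K, sigma = [None] * T, [None] * T
    for t in range(T - 1, -1, -1):
        H = R + B.T @ P @ B                      # R + B' P_{t+1} B
        K[t] = -np.linalg.solve(H, S + B.T @ P @ A)
        sigma[t] = -np.linalg.solve(H, r + B.T @ p + B.T @ P @ c)
        # Update p before P: both right-hand sides use P_{t+1}
        p = q + A.T @ p + A.T @ P @ c - K[t].T @ H @ sigma[t]
        P = Q + A.T @ P @ A - K[t].T @ H @ K[t]
    return K, sigma

# Hypothetical data: double integrator with a constant drift c
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
c = np.array([0.1, -0.2])
Q, QT = np.eye(2), np.eye(2)
R = np.array([[1.0]])
S = np.array([[0.1, 0.0]])                       # Q - S' R^{-1} S stays PSD
q, qT = np.array([1.0, 0.0]), np.array([0.0, 1.0])
r = np.array([0.5])
T = 15
x_init = np.array([1.0, -1.0])

def total_cost(us):
    """Roll out the affine dynamics and accumulate the affine-quadratic cost."""
    xs = [x_init]
    for u in us:
        xs.append(A @ xs[-1] + B @ u + c)
    J = qT @ xs[-1] + 0.5 * xs[-1] @ QT @ xs[-1]
    for x, u in zip(xs, us):                     # pairs (x_t, u_t), t = 0..T-1
        J += q @ x + r @ u + 0.5 * (x @ Q @ x + 2 * u @ S @ x + u @ R @ u)
    return J

K, sigma = affine_lqr(A, B, c, Q, R, S, q, r, QT, qT, T)

# Closed-loop rollout with u_t = K_t x_t + sigma_t
x, us_opt = x_init, []
for t in range(T):
    us_opt.append(K[t] @ x + sigma[t])
    x = A @ x + B @ us_opt[t] + c

# Small random perturbations of the inputs should never decrease the cost
rng = np.random.default_rng(2)
J_opt = total_cost(us_opt)
J_perturbed = [total_cost([u + 1e-3 * rng.standard_normal(1) for u in us_opt])
               for _ in range(50)]
print(J_opt, min(J_perturbed))
```

If the feedforward term or the drift were dropped from the recursion, the cost gradient at the computed inputs would be nonzero and roughly half of the small perturbations would lower the cost, so this check is sensitive to such mistakes.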
