
Lecture 1: Linear Algebra


Erfan Nozari

September 24, 2022

Linear algebra is the most fundamental pillar of linear systems and controls. A comprehensive coverage
of linear algebra can take years, and is way beyond our scope here. In this lecture I cover only some of
the basic concepts and results that we will use later in the course. For a nice and more comprehensive
treatment, but still without proofs, see Chapter 3 of Chen's textbook. For a pretty comprehensive
treatment, you can take a stop at Tom Bewley's encyclopedia, "Numerical Renaissance: Simulation,
Optimization, & Control". If you want something in between, which is nicely readable and moderately
comprehensive, try Gilbert Strang's "Introduction to Linear Algebra".

Contents
1.1 Vectors and Matrices
    1.1.1 Definition (Scalars, vectors, and matrices)
    1.1.2 Definition (Matrix multiplication)
    1.1.3 Theorem (Breakdown of matrix multiplication)
    1.1.4 Exercise (Breakdown of matrix multiplication)
1.2 Linear (In)dependence
    1.2.1 MATLAB (Linearity of matrix multiplication)
    1.2.2 Definition (Linear combination)
    1.2.3 Definition (Linear (in)dependence)
    1.2.4 Remark (Determining linear independence)
    1.2.5 Exercise (Linear (in)dependence)
1.3 Rank and Determinant
    1.3.1 Definition (Rank)
    1.3.2 Example (Rank of matrices)
    1.3.3 Theorem (Rank)
    1.3.4 MATLAB (Rank)
    1.3.5 Definition (Full rank matrices)
    1.3.6 Definition (Determinant)
    1.3.7 Theorem (Rank and determinant)
    1.3.8 MATLAB (Determinant)
    1.3.9 Theorem (Determinant of product and transpose)
    1.3.10 Exercise (Rank using definition and determinant)
1.4 Linear Systems of Equations
    1.4.1 Existence of solutions
    1.4.2 Uniqueness of Solutions
    1.4.3 Example (Uniqueness of solutions)
    1.4.4 Definition (Null space)
    1.4.5 Finding the Solutions
    1.4.6 Definition (Matrix inverse)
    1.4.7 MATLAB (Inverse and pseudo-inverse)
1.5 Eigenvalues and Eigenvectors
    1.5.1 Example (2D mappings)
    1.5.2 Definition (Eigenvalue and eigenvector)
    1.5.3 Example (3x3 matrix with unique eigenvalues)
    1.5.4 Example (3x3 matrix with repeated eigenvalues)
    1.5.5 Theorem (Independence of eigenvectors)
    1.5.6 MATLAB (Eigenvalues & eigenvectors)
1.6 Diagonalization
    1.6.1 Exercise (Diagonalization)

1.1 Vectors and Matrices


You probably know very well what a vector and a matrix are.

Definition 1.1.1 (Scalars, vectors, and matrices) A “scalar”, for the purpose of this course, is either
a real (R) or a complex (C) number. A “vector”, again for the purpose of this course, is an ordered set of
numbers, depicted as a column:
 
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}_{n \times 1}

Almost always, our vectors are column vectors. But occasionally, we need row vectors as well, which we may
show using the transpose T notation:

x^T = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}_{1 \times n}


And finally a “matrix” is a rectangular ordered set of numbers,


 
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}_{m \times n}

As you have noticed, throughout this course, we use bold-faced small letters for vectors and bold-faced capital
letters for matrices. □
Vectors and matrices of the same size can be added together, and both vectors and matrices can be multiplied
by a scalar, not super interesting. What is more interesting and, as you will see, essentially the basis of linear
algebra, is matrix multiplication. You probably have seen the basic definition.

Definition 1.1.2 (Matrix multiplication) For two matrices Am×n and Br×p , their product C = AB is
only defined if n = r, in which case the entries of Cm×p are defined as
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}.


The above definition gives little intuition about what a matrix multiplication really does. To see this, we
need to notice two facts.

Theorem 1.1.3 (Breakdown of matrix multiplication) Let ai and bi denote the i’th column of A and
B, respectively:
   
A = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}, \qquad B = \begin{bmatrix} b_1 & b_2 & \cdots & b_p \end{bmatrix}

Then

(i) The matrix-matrix multiplication AB applies to each column of B separately, that is


   
A \begin{bmatrix} b_1 & b_2 & \cdots & b_p \end{bmatrix} = \begin{bmatrix} Ab_1 & Ab_2 & \cdots & Ab_p \end{bmatrix}    (1.1)

In other words, the i’th column of AB is A times the i’th column of B.


(ii) Each matrix-vector multiplication Abi is a weighted sum of the columns of A, that is
 
Ab_i = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} b_{1i} \\ b_{2i} \\ \vdots \\ b_{ni} \end{bmatrix} = b_{1i} a_1 + b_{2i} a_2 + \cdots + b_{ni} a_n    (1.2)

You can easily show both of these properties using Definition 1.1.2, but they are going to be super useful in
understanding linear algebra as it really is.
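As a quick sanity check (my own sketch, not part of the original notes), the following MATLAB snippet verifies both breakdown rules numerically on random matrices; the sizes m, n, p are arbitrary choices.

% Numerical check of Theorem 1.1.3 on random matrices
m = 4; n = 3; p = 5;
A = rand(m, n);
B = rand(n, p);
C = A * B;                      % built-in matrix product

% (i) the i'th column of AB is A times the i'th column of B, Eq. (1.1)
C1 = zeros(m, p);
for i = 1:p
    C1(:, i) = A * B(:, i);
end

% (ii) each A*b_i is a weighted sum of the columns of A, Eq. (1.2)
C2 = zeros(m, p);
for i = 1:p
    for k = 1:n
        C2(:, i) = C2(:, i) + B(k, i) * A(:, k);
    end
end

disp(norm(C - C1))              % both differences are zero up to roundoff
disp(norm(C - C2))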

Exercise 1.1.4 (Breakdown of matrix multiplication) Use the above two rules to compute AB, where
   
A = \begin{bmatrix} 0 & 2 & 1 \\ -1 & 1 & -3 \\ 3 & -1 & 3 \end{bmatrix}, \qquad B = \begin{bmatrix} -4 & 0 \\ 2 & 1 \\ -1 & 0 \end{bmatrix}



You might be wondering why I put so much emphasis on these simple properties. One reason is the intuition
they give you about matrix multiplication. But it's more than that. The real reason is the important role
that matrix-vector multiplication plays in linear algebra. Notice that for a matrix Am×n and a vector xn×1,
their product

ym×1 = Ax

is also a vector. In other words, a matrix is more than a rectangular array of numbers! It defines a function
that maps vectors in Rn to vectors in Rm.

This view of matrices as maps (or operators) is arguably the basis of linear algebra!

1.2 Linear (In)dependence


Matrices, when viewed as maps, have an important property: they are linear:

A(α1 x1 + α2 x2 ) = α1 Ax1 + α2 Ax2 (1.3)

where x1 , x2 are vectors and α1 , α2 are scalars. In other words, if you have already computed y1 = Ax1 and
y2 = Ax2 , and you want to compute A(α1 x1 + α2 x2 ), you don’t have to use matrix multiplication (which is
computationally expensive) anymore. You can simply reuse y1 and y2 and “combine” them using the same
coefficients α1 and α2 .

MATLAB 1.2.1 (Linearity of matrix multiplication) To see the advantage of using the right hand
side of Eq. (1.3) over its left hand side, compare

n = 1e4;
x1 = rand(n, 1);
x2 = rand(n, 1);
A = rand(n);
tic
for i = 1:1e3
    alpha1 = rand;
    alpha2 = rand;
    y = A * (alpha1 * x1 + alpha2 * x2);
end
toc

with

n = 1e4;
x1 = rand(n, 1);
x2 = rand(n, 1);
A = rand(n);
tic
y1 = A * x1;
y2 = A * x2;
for i = 1:1e3
    alpha1 = rand;
    alpha2 = rand;
    y = alpha1 * y1 + alpha2 * y2;
end
toc

If you haven’t seen them before, rand(n, m) generates a matrix of size n × m with random entries in the
interval [0, 1], rand(n) is equivalent to rand(n, n) (not rand(n, 1)), and rand is equivalent to rand(1).
The tic, toc pair compute the CPU time for all the commands computed in between. □
In general, terms of the form α1 x1 + α2 x2 appear over and over again in linear algebra, so they have been
given a name!

Definition 1.2.2 (Linear combination) A linear combination of a set of vectors x1, x2, . . . , xn is any
weighted sum of them, that is, any vector

xn+1 = α1 x1 + α2 x2 + · · · + αn xn (1.4)

for some scalars (also called “coefficients”) α1 , . . . , αn ∈ R. □


For obvious reasons, the vector x_{n+1} in Eq. (1.4) is said to be linearly dependent on the vectors x1, x2, . . . , xn
(because you can obtain x_{n+1} from a linear combination of x1, x2, . . . , xn). Notice that this is not always the
case. For example, you cannot find any linear combination of

x_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad x_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}

that gives you

x_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

and, so, x3 is linearly independent from x1 and x2. But

x_4 = \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix}

is in fact linearly dependent on x1 and x2 (right?).


Here, you might have noticed that not only can x3 not be obtained from any linear combination of x1 and
x2, but x2 cannot be obtained from any linear combination of x1 and x3 either, and the same holds for x1. On the
other hand, not only can x4 be obtained from x1 and x2,

x4 = 2x1 + 3x2

but x1 can also be obtained from x4 and x2,

x1 = (1/2) x4 − (3/2) x2

and x2 can be obtained from x4 and x1 as well:

x2 = (1/3) x4 − (2/3) x1

In other words (assuming that no coefficients are 0), linear dependence and linear independence are symmetric
properties among a set of vectors: either they are all linearly dependent on each other, or none is linearly
dependent on the rest. The formal version is:

Definition 1.2.3 (Linear (in)dependence) A set of vectors x1, x2, . . . , xn is "linearly dependent" if
there exists a set of scalars α1, . . . , αn, at least one of which is not equal to 0, such that

α1 x1 + α2 x2 + · · · + αn xn = 0 (1.5)

In contrast, if Eq. (1.5) holds only for α1 = · · · = αn = 0 (which it clearly always does), then x1 , x2 , . . . , xn
are called “linearly independent”. □
Now, if I give you a set of vectors x1 , x2 , . . . , xn , how can you say if they are linearly independent or not?
For example,
     
x_1 = \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}, \quad x_2 = \begin{bmatrix} -3 \\ 3 \\ 1 \end{bmatrix}, \quad x_3 = \begin{bmatrix} 7 \\ -3 \\ 3 \end{bmatrix}    (1.6a)

are linearly independent, but

x_1 = \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}, \quad x_2 = \begin{bmatrix} -3 \\ 3 \\ 1 \end{bmatrix}, \quad x_3 = \begin{bmatrix} 7 \\ -3 \\ -3 \end{bmatrix}    (1.6b)

are not. There are various ways for determining it, and we will get back to it in more detail in Section 1.3.
But for now, we can say the following.

Remark 1.2.4 (Determining linear independence) For now,

• First, notice that if two vectors are linearly dependent, it means that

α1 x1 + α2 x2 = 0

for some α1 and α2, at least one of which, say α1, is nonzero. This means that

x1 = −(α2/α1) x2

or, in words, two vectors are linearly dependent if and only if one is a multiple of the other.

• Second, if you have n vectors in Rm and n > m, they are necessarily linearly dependent. So
in R2 you cannot have 3 linearly independent vectors, in R3 you cannot have 4 linearly independent
vectors, and so on. We will see why shortly, in Section 1.3.

• Also, just remember that you can always check linear independence directly from Definition 1.2.3. At
the end of the day, Eq. (1.5) is a system of linear equations with unknowns α1, α2, . . . , αn. Later in this
note I will discuss systems of linear equations in detail, but you can always solve them manually (for
example using the substitution method, reviewed here). If the only answer is α1 = α2 = · · · = αn = 0,
then the vectors are linearly independent. If there are also nonzero answers, then the vectors are
linearly dependent; a small MATLAB check along these lines is sketched below.
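To make that last bullet concrete, here is a minimal MATLAB sketch (my own illustration, not part of the original notes) that applies Definition 1.2.3 to the vectors in Eq. (1.6a) and Eq. (1.6b). It stacks the vectors as the columns of a matrix and uses null(), which returns a basis for all solutions of A*alpha = 0; an empty result means only alpha = 0 works.

% Vectors from Eq. (1.6a) as columns; look for nonzero solutions of
% alpha1*x1 + alpha2*x2 + alpha3*x3 = 0, i.e. A*alpha = 0
A = [ 2 -3  7;
      0  3 -3;
     -1  1  3];
isempty(null(A))    % true: only alpha = 0 works, so the vectors are independent

% Vectors from Eq. (1.6b): same test
B = [ 2 -3  7;
      0  3 -3;
     -1  1 -3];
isempty(null(B))    % false: there are nonzero alpha's, so they are dependent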

Exercise 1.2.5 (Linear (in)dependence) Check whether each of the following sets of vectors is linearly
dependent or independent:

• Vectors in Eq. (1.6a)

• Vectors in Eq. (1.6b)


• The columns of
 
A = \begin{bmatrix} 2 & 3 & -2 & 3 & -3 \\ 3 & 1 & 0 & -2 & 2 \\ -3 & -3 & 3 & 3 & -3 \end{bmatrix}    (1.7)

• The rows of A (or the columns of AT ).



Determining linear independence by solving the corresponding system of linear equations is pretty tedious. In fact, we
will later do the opposite (solve linear systems of equations using linear independence). To determine linear
independence more easily, we need rank and determinant.

1.3 Rank and Determinant


For a set of vectors x1, x2, . . . , xn ∈ Rm, we start by placing them into a matrix (as columns):

A = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}

So the matrix A has n columns and m rows. Notice that this same matrix can also be seen as a stack of m
n-dimensional row vectors,

A = \begin{bmatrix} y_1^T \\ y_2^T \\ \vdots \\ y_m^T \end{bmatrix}

where each row vector y_i^T denotes the i'th row of A (or, equivalently, the column vector y_i is the transpose
of the i'th row of A). Now, we are ready to define what the rank of a matrix is:

Definition 1.3.1 (Rank) Consider a matrix A ∈ Rm×n as above.

• The column-rank of A is the number of linearly independent columns of A (the number of linearly
independent vectors among x1, x2, . . . , xn).

• The row-rank of A is the number of linearly independent rows of A (the number of linearly independent
vectors among y1, y2, . . . , ym).

The notion of rank essentially translates a property of a set of vectors into a property of a matrix, but it is very
fundamental to linear algebra. Let's see a couple of examples.


Example 1.3.2 (Rank of matrices) Consider again the example vectors in Eq. (1.6). The first set of
vectors were linearly independent (as you checked in Exercise 1.2.5). Therefore, when we put them side by
side in the matrix

A = \begin{bmatrix} 2 & -3 & 7 \\ 0 & 3 & -3 \\ -1 & 1 & 3 \end{bmatrix}

it has column rank equal to 3. What about its row rank? The rows of the matrix are

y_1^T = \begin{bmatrix} 2 & -3 & 7 \end{bmatrix}, \quad y_2^T = \begin{bmatrix} 0 & 3 & -3 \end{bmatrix}, \quad y_3^T = \begin{bmatrix} -1 & 1 & 3 \end{bmatrix}

which are also linearly independent (check). So the row rank of the matrix is also 3.

Now, consider the vectors in Eq. (1.6b). They are not linearly independent, because x3 = 2x1 − x2. But x1
and x2 are indeed linearly independent, because they are not multiples of each other (remember from the last
section). So, putting them side by side, the matrix

B = \begin{bmatrix} 2 & -3 & 7 \\ 0 & 3 & -3 \\ -1 & 1 & -3 \end{bmatrix}

has column rank 2. What about its row rank? The rows of B are

y_1^T = \begin{bmatrix} 2 & -3 & 7 \end{bmatrix}, \quad y_2^T = \begin{bmatrix} 0 & 3 & -3 \end{bmatrix}, \quad y_3^T = \begin{bmatrix} -1 & 1 & -3 \end{bmatrix}

which are not linearly independent either, because y3 = −(1/2)y1 − (1/6)y2. And similar to x1 and x2, y1 and y2
are linearly independent because they are not multiples of each other. So exactly 2 of y1, y2, y3 are
linearly independent, and the row rank of B is also 2. □
The fact that the column rank and the row rank of both A and B were equal was not a coincidence. This is
always the case:

Theorem 1.3.3 (Rank) For any matrix Am×n , its column rank equals its row rank, which is called the
rank of the matrix. As a consequence,
rank(A) ≤ min{m, n}. (1.8)

Recall that in the previous section (second point after Definition 1.2.3), I told you that if you have n vectors
in Rm and n > m, they are necessarily linearly dependent. Now you can see why from Eq. (1.8).
For example, consider the same A matrix in Eq. (1.7). The rank of the matrix can at most be 3, because it
has only 3 rows and its row-rank (the number of independent rows) cannot be more than the number of rows!
In this case, the rank is in fact 3, which means that out of the 5 columns, no more than 3 of them can be
simultaneously independent from each other.

MATLAB 1.3.4 (Rank) Compute the rank of a matrix using rank. □
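For example, here is a quick check (a sketch of mine) on the matrices of Example 1.3.2:

A = [2 -3 7; 0 3 -3; -1 1  3];
B = [2 -3 7; 0 3 -3; -1 1 -3];
rank(A)    % 3: all columns/rows are independent
rank(B)    % 2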


Now you know how to easily check the independence of a set of vectors x1 , . . . , xn : stack them side by side
in a matrix
 
A = x1 · · · xn
and check the rank of A (using MATLAB for example). If
rank(A) = #columns = n
the vectors x1 , . . . , xn are linearly independent, and dependent if rank(A) < n. This motivates the following
definition:

Definition 1.3.5 (Full rank matrices) A matrix Am×n is

• full column rank if rank(A) = n;


• full row rank if rank(A) = m;
• full rank if rank(A) = min{m, n}.

However, the problem of determining linear independence doesn't end here. Consider the same A in Eq. (1.7).
rank(A) = 3 < 5, so the columns of A are not linearly independent. You know even more: among the 5
columns, at most 3 are linearly independent. But which 3? Note that column rank = 3 does not mean that
any selection of 3 columns is linearly independent. Clearly, the last column is minus the 4th column, so
any selection of 3 columns that includes both the fourth and last columns will be linearly dependent. Instead,
rank = 3 means that at least one selection of 3 columns is linearly independent. For example, the first,
second, and third columns are linearly independent, and so are the first, second, and fourth columns.
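As an aside (my own sketch, not part of the original notes), MATLAB can also identify one such maximal set of independent columns: the second output of rref() lists the pivot columns of the reduced row echelon form, and those columns of A form a linearly independent set of size rank(A). For the A of Eq. (1.7):

A = [ 2  3 -2  3 -3;
      3  1  0 -2  2;
     -3 -3  3  3 -3];
[R, jb] = rref(A);    % jb holds the indices of the pivot (independent) columns
jb                    % e.g. [1 2 3]: columns 1, 2, 3 form an independent set
rank(A(:, jb))        % equals numel(jb) = rank(A) = 3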
A notion closely related to rank and linear (in)dependence is determinant.

Definition 1.3.6 (Determinant) Consider any n-by-n matrix A ∈ Rn×n .

• If n = 1 (so A = a is a scalar), then its determinant equals the scalar itself:

det(a) = a

• If n = 2,

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad det(A) = ad − bc

• If n = 3,

A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}, \qquad det(A) = aei + bfg + cdh − ceg − bdi − afh

which is much easier to remember and calculate using the familiar diagonal (Sarrus) picture: add the products
along the three "down" diagonals and subtract the products along the three "up" diagonals.

• If n > 1 (including n = 2 and 3 above, but really used for n ≥ 4), the determinant of A is defined
based on an expansion over any arbitrary row or column. For instance, choose row 1. Then

det(A) = |A| = \sum_{j=1}^{n} (−1)^{1+j} a_{1j} \det(A_{−(1,j)})

where A_{−(1,j)} is the (n − 1)-by-(n − 1) matrix obtained from A by removing its 1st row and j'th column.

We sometimes use |A| instead of det(A). □


The determinant also has a nice geometrical interpretation,
when n = 2 : | det(A)| = area of the parallelogram formed by the columns (or rows) of A
when n = 3 : | det(A)| = volume of the parallelepiped formed by the columns (or rows) of A

The main use of determinants for us, however, is not their geometrical interpretation, but rather their relation
to independence of vectors (finally!):

Theorem 1.3.7 (Rank and determinant) Consider a matrix A ∈ Rm×n .

(i) If m = n (square matrix), then


det(A) ̸= 0 ⇔ rank(A) = n ⇔ all columns/rows of A are linearly independent

(ii) In general (square matrix or not),


rank(A) = dimension of largest square sub-matrix that is nonsingular

A matrix is called “nonsingular” if its determinant is nonzero, and “singular” otherwise. □


To see how to apply this theorem, let us revisit Example 1.3.2. First,

det(A) = 36 ≠ 0

so the rank of A is 3 and all of its columns/rows are independent. For B, we have

det(B) = 0

and so the rank of B cannot be 3. But still, its rank may be 2, 1, or even 0 (which is clearly not the case,
because the rank of a matrix is only 0 if all of its entries are 0). To see if its rank is 2 or 1, we have to find
the largest square sub-matrix that is nonsingular. Easily, you can find many 2 × 2 nonsingular submatrices,
for example

\begin{bmatrix} 2 & -3 \\ 0 & 3 \end{bmatrix}, \quad \begin{bmatrix} 2 & -3 \\ -1 & 1 \end{bmatrix}, \quad \begin{bmatrix} 2 & 7 \\ -1 & -3 \end{bmatrix}

so the rank of B is 2. To also see an example of a matrix with rank 1, take

C = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \\ 1 & 2 & 3 \end{bmatrix}

All columns (or rows) are multiples of each other, and you cannot find any 2 × 2 nonsingular submatrices.
You can, however, find 1 × 1 nonsingular submatrices (any entry of C), which means rank(C) = 1.

MATLAB 1.3.8 (Determinant) In MATLAB, use the function det to obtain the determinant of a matrix.
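For instance (a small sketch of mine), the determinant-based reasoning of this section can be reproduced numerically for the matrices of Example 1.3.2:

A = [2 -3 7; 0 3 -3; -1 1  3];
B = [2 -3 7; 0 3 -3; -1 1 -3];
det(A)               % 36: nonzero, so rank(A) = 3
det(B)               % 0 (up to roundoff), so rank(B) < 3
det(B(1:2, 1:2))     % 6: a nonsingular 2-by-2 submatrix, so rank(B) = 2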

Before finishing this section, let’s see a couple of basic properties of the determinant:

Theorem 1.3.9 (Determinant of product and transpose) For two square matrices A and B,

det(AT ) = det(A)

and

det(AB) = det(A) det(B).

Exercise 1.3.10 (Rank using definition and determinant) Compute the rank of

A = \begin{bmatrix} -3 & -3 & -4 & -4 \\ 3 & 3 & 4 & 4 \\ 0 & 1 & -4 & 2 \end{bmatrix}

in 3 ways: from Definition 1.2.3, from Theorem 1.3.7, and using MATLAB. From both of the first two
methods, determine a maximal set of linearly independent columns. □

1.4 Linear Systems of Equations


The notions of rank and determinant not only help with determining whether a set of vectors are independent
or not, they also help with solving linear systems of equations.
Consider a general linear system of equations

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
⋮
am1 x1 + am2 x2 + · · · + amn xn = bm

containing m equations and n unknowns x1 , . . . , xn . Now you can easily see that this system can be written
in matrix form as

Ax = b (1.9)

So the question is: for a given A and b, find all x that solve this equation. In general, Eq. (1.9) can have 0,
1, or infinitely many solutions.
The above question is essentially composed of two questions:

1) Does there exist any solution to Eq. (1.9)?

2) If a solution exists, is it unique?

We answer them in order:



1.4.1 Existence of solutions


Recall from the end of Section 1.1 that A defines a map from Rn to Rm . Now, we are given a specific vector
b in Rm and asked whether there are any vectors x in Rn that map to it. Again, the property in Eq. (1.2)
comes in handy. Notice that
 
Ax = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n

so whether Eq. (1.9) has a solution is the same as asking whether b is linearly dependent on the columns of
A. To check this, you can just create a larger augmented matrix
 
Aaug = \begin{bmatrix} A & b \end{bmatrix}

where you append b to the columns of A. Then

if rank(Aaug ) = rank(A) ⇒ b is linearly dependent on columns of A ⇒ at least 1 solution exists


if rank(Aaug ) > rank(A) ⇒ b is linearly independent from columns of A ⇒ no solutions exist

(what if rank(Aaug ) < rank(A)?) and you can check the ranks either using determinants via Theorem 1.3.7,
or directly using rank() in MATLAB. Note that if you are using the determinants, your job often becomes
easier if you remove any linearly dependent columns of A before appending b to it (right?).
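Here is a minimal MATLAB sketch of this test (my own illustration, with arbitrarily chosen data):

A  = [1 2; 2 4; 3 6];          % rank 1: the second column is twice the first
b1 = [3; 6; 9];                % lies along the columns of A
b2 = [1; 0; 0];                % does not

rank([A b1]) == rank(A)        % true:  at least one solution exists
rank([A b2]) == rank(A)        % false: the rank increases, so no solution exists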

1.4.2 Uniqueness of Solutions


Let us look at this through an example.

Example 1.4.3 (Uniqueness of solutions) Consider

A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 2 & 3 & 5 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}

Since b is linearly dependent on the columns of A, at least one solution exists; one such solution is
x = [1 −1 0]^T (check that Ax = b). But that is not the only solution: x′ = [2 0 −1]^T also solves this equation,
so does x′′ = [3 1 −2]^T, and many others. Why is this? Because the columns a1, a2, a3 of A satisfy

a_1 + a_2 − a_3 = 0 \quad \Rightarrow \quad A \underbrace{\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}}_{z} = 0

And so, for any scalar α,

A(x + αz) = Ax + αAz = b + 0 = b

so all x + αz are solutions as well. Note that this only happens because there exists a nonzero vector z such
that Az = 0. □
The above example motivates the definition of another fundamental concept in linear algebra:


Definition 1.4.4 (Null space) The null space of a matrix A is the set

{z | Az = 0}.


Let me emphasize that the null space is never empty because it always contains at least the zero vector.
So to determine whether the solution to Ax = b (assuming that at least one exists) is unique, we need to
determine whether the null space of A contains any nonzero vectors. Again, Eq. (1.2) comes in handy!
 
Az = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} = z_1 a_1 + z_2 a_2 + \cdots + z_n a_n

and so the question of whether

Az = 0

for a nonzero z is precisely the question of whether

z1 a1 + z2 a2 + · · · + zn an = 0

for a nonzero set of coefficients z1, z2, . . . , zn, which is precisely the question of whether the columns of A are linearly dependent (Definition 1.2.3)! So,

The solution to Ax = b (if any) is unique ⇔ A has full column rank (1.10)

When the solution to the equation Ax = b is not unique, then you essentially have a solution space rather
than a solution. Obtaining the solution space is very easy once you have the null space: if {z1 , z2 , . . . , zk } is
a basis for the null space of A and x is one solution to the equation, then

Solution space = {x + α1 z1 + α2 z2 + · · · + αk zk | α1 , α2 , . . . , αk ∈ R}. (1.11)
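As an illustration (mine, not the author's), MATLAB's null() returns a basis for the null space, which together with one particular solution gives the whole solution space of Eq. (1.11). Reusing the A and b of Example 1.4.3:

A = [1 0 1; 0 1 1; 2 3 5];
b = [1; -1; -1];
Z = null(A);          % basis of the null space: one column, proportional to [1; 1; -1]
x = [1; -1; 0];       % one particular solution (A*x equals b)
alpha = 2.5;          % any scalar works
norm(A * (x + alpha * Z(:, 1)) - b)    % ~0: still a solution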

1.4.5 Finding the Solutions


At this point you might ask: ok, even if a unique solution exists, how to find it? This is done using the
notion of matrix inverse:

Definition 1.4.6 (Matrix inverse) For a square and nonsingular matrix A, there exists a unique matrix
A^{-1} such that

AA^{-1} = A^{-1}A = I

where I is the identity matrix

I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.


You may remember that for a 2-by-2 matrix, its inverse is given by

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \Rightarrow \quad A^{-1} = \frac{1}{ad − bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}

For larger matrices, we will use MATLAB to find their inverses.
Let us now see how we can use matrix inverse to solve systems of linear equations:

• If m = n (A is square) and A is nonsingular, then any b is linearly dependent on the columns of A.
Therefore, at least one solution exists. But this solution is also unique by Eq. (1.10), because A has
full column rank as well. To find this unique solution, simply multiply both sides of Ax = b by A^{-1}
from the left:

x = A^{-1} b    (1.12)

• If m > n (A is a tall matrix) and A is full column rank (it cannot be full row rank, right?), then we
have a unique solution only if b is linearly dependent on the columns of A. If so, to find that solution,

Ax = b ⇒ A^T Ax = A^T b ⇒ (A^T A)^{-1} A^T Ax = (A^T A)^{-1} A^T b ⇒ x = (A^T A)^{-1} A^T b
• If m < n (A is a fat matrix), the equation never has a unique solution (even if A is full row-rank),
because A cannot be full column rank (right?). Given any solution (that you can find, e.g., using
elimination of variables), you can build the entire solution space as in Eq. (1.11).
If, however, A is full row rank, then we are sure that the system of equations always has a solution
for any b (why?). In this case, we also know (from a theorem we don't prove) that the square matrix
AA^T is nonsingular, and

x = A^T (AA^T)^{-1} b

is one solution to the equation (just plug it in and check!).

If you compare the three cases above, you will notice that the matrices (A^T A)^{-1} A^T (for full column rank
A) and A^T (AA^T)^{-1} (for full row rank A) are essentially taking the place of A^{-1} in Eq. (1.12). In fact,
if A is square and nonsingular, both of them become equal to A^{-1} (because (A^T)^{-1} = (A^{-1})^T and
(AB)^{-1} = B^{-1} A^{-1}). In other words, (A^T A)^{-1} A^T is an extension of A^{-1} for non-square, full column rank
matrices and A^T (AA^T)^{-1} is an extension of A^{-1} for non-square, full row rank matrices. As such, both of
them are called the "pseudo-inverse" of A, denoted A†. Therefore, whenever A has full rank,

x = A† b    (1.13)

is a solution to Ax = b. It goes beyond our course, but be aware that A† is in fact defined for any matrix
(even the zero matrix), and whenever Ax = b has any solution at all, Eq. (1.13) is one of them.

MATLAB 1.4.7 (Inverse and pseudo-inverse) To find the inverse of a matrix, use the inv() function.
Similarly, use pinv() to find the pseudo-inverse. However, if you want to invert a matrix only for the purpose
of solving a linear equation, as in Eq. (1.12) or Eq. (1.13), a computationally advantageous way is to use
MATLAB's left division:

x = inv(A) * b;     % Using matrix inverse
x = pinv(A) * b;    % Using matrix pseudo-inverse
x = A \ b;          % Using left division


Using left division also allows you to solve systems of equations without a unique solution, or even non-square
systems of equations. If your system of equations has infinitely many solutions, A \ b returns one of them.
If your system has no solutions, then it returns an x for which the error Ax − b is smallest (in magnitude).□

1.5 Eigenvalues and Eigenvectors


OK, you have made it so far, and we are finally ready to learn about eigenvectors and eigenvalues! You’ll see
later why I put so much emphasis on them – they are one of the most important and widely used constructs
in control theory.
Throughout this section, I will focus on square matrices A ∈ Rn×n , because eigenvalues and eigenvectors are
only defined for square matrices.
Recall, from Section 1.1, that matrices are not just arrays of numbers, but mappings from one vector space
to another. So A maps from Rn to Rn . In some cases, we can very easily describe what this mapping does:

Example 1.5.1 (2D mappings) Consider a few simple mappings in two dimensions:

• A = I maps any vector to itself (identity mapping).

• A = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} reflects any vector with respect to the vertical axis:

A \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x \\ y \end{bmatrix}

Similarly, A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} reflects any vector with respect to the horizontal axis.

• A = \begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix} = kI for k > 0 scales any vector by a factor of k:

A \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} kx \\ ky \end{bmatrix}

• A = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = -I reflects any vector with respect to the origin:

A \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -x \\ -y \end{bmatrix}

• A = \begin{bmatrix} \cos θ & -\sin θ \\ \sin θ & \cos θ \end{bmatrix} rotates any vector by θ radians counter-clockwise.

What about more complex matrices? For example, how can we describe (or even intuitively understand) what

A = \begin{bmatrix} 5 & -1 \\ -1 & 5 \end{bmatrix}    (1.14)

does to vectors? Here is how it maps a whole bunch of random points (the red dot shows the origin, the blue
dots are random x's, the red dots are the corresponding Ax, and the arrows show the mapping):
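Here is a small MATLAB sketch (my reconstruction, not the code behind the original figure) that produces this kind of picture:

A = [5 -1; -1 5];
X = randn(2, 50);                 % 50 random points x (as columns)
Y = A * X;                        % their images Ax
plot(X(1,:), X(2,:), 'b.', Y(1,:), Y(2,:), 'r.'); hold on
quiver(X(1,:), X(2,:), Y(1,:) - X(1,:), Y(2,:) - X(2,:), 0);   % arrows from x to Ax
plot(0, 0, 'ro'); axis equal      % mark the origin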

It already gives us a sense: the arrows are all pointing outwards, showing that A performs some sort of
enlargement (scaling with a scale k > 1). But this enlargement is not uniform; it is more pronounced along
a NorthWest-SouthEast axis. Notice that this NorthWest-SouthEast axis can be described by the vector

v_1 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}

This vector is indeed special, since

Av_1 = \begin{bmatrix} 5 & -1 \\ -1 & 5 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} -6 \\ 6 \end{bmatrix} = 6v_1

In other words, v1 is special because when A acts on it, the result is a multiple of v1 again! The effect of A
on v1 is a pure scaling. (If you think this is not so special, try a whole bunch of random vectors and check
whether Av comes out exactly as a multiple of v.)

Note that this special property also clearly holds for any multiple of v1:

A(αv_1) = αAv_1 = α · 6v_1 = 6(αv_1)

In other words, the effect of A on the whole NorthWest-SouthEast axis is a 6-fold enlargement.
But this does not tell us all about A. What about directions other than the NorthWest-SouthEast
axis? We can visually see that no other direction is scaled as much. To see what A does to other vectors, we
can search for any other vectors on which the effect of A is a pure scaling. In other
words, are there any vectors v, other than v1 and its multiples, such that

Av = λv (1.15)

for some scalar λ. Clearly, v = 0 satisfies this, but that is not what we are looking for.
Eq. (1.15) is fortunately a linear system of equations:

(A − λI)v = 0

and we are looking for nonzero vectors v that satisfy it. The difficulty compared to a usual linear system of
equations is that λ is also unknown. But notice one thing. If λ is such that A − λI is nonsingular, then

(A − λI)v = 0 ⇒ (A − λI)−1 (A − λI)v = (A − λI)−1 0 ⇒ v = 0

In other words, for any λ such that A − λI is nonsingular, Eq. (1.15) has only the trivial solution v = 0,
which is of no help. This is very useful, because we know we have to restrict our attention to values of λ for
which A − λI is singular:
 
det(A − λI) = 0 ⇔ det \begin{bmatrix} 5-λ & -1 \\ -1 & 5-λ \end{bmatrix} = 0 ⇔ (5 − λ)^2 − 1 = 0 ⇔ λ^2 − 10λ + 24 = 0

This already gives a polynomial equation in λ, with solutions

λ1 = 6
λ2 = 4

λ1 = 6 is what we had originally found by guessing v1. So for λ1, there is no need to solve Eq. (1.15),
because we already know its solution. But what about the solution to (A − λ2 I)v = 0?

(A − λ_2 I)v = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} v_1 - v_2 \\ v_2 - v_1 \end{bmatrix}    (1.16)

So (A − λ2 I)v = 0 if and only if v_1 = v_2 (here v_1, v_2 denote the two entries of v). Not surprisingly, we did
not get a unique solution v, because we found λ2 precisely so that we get infinitely many solutions. It is not
hard to see that the vectors v for which v_1 = v_2 are all multiples of

v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}

and constitute the SouthWest-NorthEast axis in the picture. Now the picture makes even more sense: the
mapping A scales all the vectors along the NorthWest-SouthEast axis by 6 times, and all the vectors along
the SouthWest-NorthEast axis by 4 times.
What about other vectors x that lie neither on the NorthWest-SouthEast axis nor on the SouthWest-
NorthEast axis? Well, notice that v1 and v2 are linearly independent, and we can decompose any other
vector into a linear combination of them (using standard orthogonal projection you learned in geometry):

x = α1 v1 + α2 v2


[Figure: the vector x decomposed into its components α1 v1 and α2 v2 along the eigenvector directions v1 and v2.]

To find the unknown coefficients α1 and α2, you can simply use your knowledge of linear algebra!

x = α_1 v_1 + α_2 v_2 = \underbrace{\begin{bmatrix} v_1 & v_2 \end{bmatrix}}_{V} \begin{bmatrix} α_1 \\ α_2 \end{bmatrix} \quad \Rightarrow \quad \begin{bmatrix} α_1 \\ α_2 \end{bmatrix} = V^{-1} x
Note that V is invertible precisely because v1 and v2 are independent. If they weren’t, then we wouldn’t be
able to decompose x based on them. But fortunately, eigenvectors always turn out to be linearly independent
(more on this later).
Now, we can clearly see what happens when A is applied to x:
Ax = A(α1 v1 + α2 v2 )
= α1 Av1 + α2 Av2
= 6α1 v1 + 4α2 v2
So A scales the component of any vector along the NorthWest-SouthEast axis (along v1 ) by 6 times and
its component along the SouthWest-NorthEast axis (along v2 ) by 4 times. Visually, this looks a bit like a
combination of scaling (stretching) and rotation (rotation away from the ±v2 and towards ±v1 ), but there
isn’t any rotation really, it’s just an unequal scaling.

Definition 1.5.2 (Eigenvalue and eigenvector) For any matrix A ∈ Rn×n , there exist precisely n num-
bers λ (potentially complex, and potentially repeated) for which the equation
Av = λv
has nonzero solutions. These numbers are called the eigenvalues of A, and the corresponding nonzero vectors
v that solve this equation are called the eigenvectors of A. □
Let’s see a few examples of finding eigenvalues and eigenvectors.

Example 1.5.3 (3x3 matrix with unique eigenvalues) Consider the matrix
 
A = \begin{bmatrix} -5 & 3 & 7 \\ -5 & 3 & 5 \\ -4 & 2 & 6 \end{bmatrix}


To find its eigenvalues, we need to solve the equation

det(A − λI) = 0

\begin{vmatrix} -5-λ & 3 & 7 \\ -5 & 3-λ & 5 \\ -4 & 2 & 6-λ \end{vmatrix} = 0

−λ^3 + 4λ^2 − 6λ + 4 = 0

This is a polynomial equation, and we know from Section 0.2 that it has exactly 3 (potentially repeated,
potentially complex) roots. Using hand calculations or MATLAB, we can see that its solutions are

λ1 = 2
λ2 = 1 + j
λ3 = 1 − j

which we sometimes show more compactly as

λ1 = 2
λ2,3 = 1 ± j

To find the eigenvectors associated with each eigenvalue, we simply solve the equation (A − λ_i I)v = 0, as
we did in Eq. (1.16). Note that this is nothing but finding the null space of A − λ_i I. For λ1, this becomes

(A − 2I) \underbrace{\begin{bmatrix} a \\ b \\ c \end{bmatrix}}_{v} = 0 \quad \Leftrightarrow \quad \begin{cases} -7a + 3b + 7c = 0 \\ -5a + b + 5c = 0 \\ -4a + 2b + 4c = 0 \end{cases}

and the second equation gives b = 5a − 5c. Substituting this into the other two equations then gives

\begin{cases} 8a − 8c = 0 \\ 6a − 6c = 0 \end{cases} \quad \Leftrightarrow \quad c = a

Therefore, any vector

v_1 = \begin{bmatrix} a \\ 0 \\ a \end{bmatrix}, for any a ∈ C, a ≠ 0

is an eigenvector corresponding to λ1 = 2. This gives an entire line in 3D space that is scaled by 2 under A
(similar to the NorthWest-SouthEast and SouthWest-NorthEast directions for the matrix A in Eq. (1.14)).
If you prefer (or are asked to provide) a single eigenvector associated with λ1, pick any a ∈ C that you like,
for example

v_1 = \begin{bmatrix} 2j \\ 0 \\ 2j \end{bmatrix}.
To find the eigenvector associated with λ2, we proceed similarly:

(A − (1 + j)I) \begin{bmatrix} a \\ b \\ c \end{bmatrix} = 0 \quad \Leftrightarrow \quad \begin{cases} -(6 + j)a + 3b + 7c = 0 \\ -5a + (2 - j)b + 5c = 0 \\ -4a + 2b + (5 - j)c = 0 \end{cases}

and the last equation gives b = 2a − \frac{5 - j}{2} c.

Substituting the last equation into the first two gives

\begin{cases} -ja - (0.5 - 1.5j)c = 0 \\ -(1 + 2j)a + (0.5 + 3.5j)c = 0 \end{cases}

These two equations may not immediately look like multiples of each other, but they are. To see this, solve
one and substitute into the other:

1st eq ⇔ a = (1.5 + 0.5j)c, and substituting this into the 2nd eq gives −(0.5 + 3.5j)c + (0.5 + 3.5j)c = 0

which holds for any c. Substituting a = (1.5 + 0.5j)c into b = 2a − \frac{5 - j}{2} c also gives b = (0.5 + 1.5j)c.
Therefore, any vector
 
v_2 = \begin{bmatrix} (1.5 + 0.5j)c \\ (0.5 + 1.5j)c \\ c \end{bmatrix}, for any c ∈ C, c ≠ 0

is an eigenvector associated with λ2 = 1 + j. Again, if you want a single eigenvector, pick your choice of c ≠ 0,
such as c = 2, which gives v_2 = \begin{bmatrix} 3 + j & 1 + 3j & 2 \end{bmatrix}^T.
Finally, to find the eigenvector associated with λ3 = 1 − j, you can repeat the same process as above, which
gives

v_3 = \begin{bmatrix} (1.5 - 0.5j)c \\ (0.5 - 1.5j)c \\ c \end{bmatrix}, for any c ∈ C, c ≠ 0

Notice that v3 is the complex conjugate of v2. This is not by chance. Whenever you have two eigenvalues
that are complex conjugates of each other, their corresponding eigenvectors are also complex conjugates of
each other (can you prove this?). □
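To double-check these hand calculations, here is a quick sketch of mine using MATLAB's eig(), which is formally introduced in MATLAB 1.5.6 below:

A = [-5 3 7; -5 3 5; -4 2 6];
[V, D] = eig(A);
diag(D)      % eigenvalues 2, 1+j, 1-j (possibly listed in a different order)
V            % each column is an eigenvector scaled to unit norm, i.e. a (complex)
             % multiple of the corresponding hand-computed eigenvector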
In the examples that we have seen so far, all the eigenvalues of A have been distinct. This is not always the
case. The following is an example.

Example 1.5.4 (3x3 matrix with repeated eigenvalues) This time consider the matrix
 
A = \begin{bmatrix} -5 & 2 & -6 \\ 6 & -1 & 6 \\ 6 & -2 & 7 \end{bmatrix}

Similar to the previous examples, the eigenvalues are found using

det(A − λI) = 0

\begin{vmatrix} -5-λ & 2 & -6 \\ 6 & -1-λ & 6 \\ 6 & -2 & 7-λ \end{vmatrix} = 0

−λ^3 + λ^2 + λ − 1 = 0

This equation has three roots equal to

λ1 = −1
λ2 = λ3 = 1


two of which are repeated. This is perfectly fine. However, finding the eigenvectors becomes a bit more
complicated. For λ1 = −1 which is not repeated, everything is as before. You solve (A + I)v = 0 and will
find that any vector

v_1 = \begin{bmatrix} a \\ -a \\ -a \end{bmatrix}, for any a ∈ C, a ≠ 0

is an eigenvector corresponding to λ1.
In order to find the eigenvectors corresponding to both λ2 and λ3, we have to solve the same equation

(A − I) \begin{bmatrix} a \\ b \\ c \end{bmatrix} = 0 \quad \Leftrightarrow \quad \begin{cases} -6a + 2b - 6c = 0 \\ -6a + 2b - 6c = 0 \\ -6a + 2b - 6c = 0 \end{cases} \quad \Leftrightarrow \quad b = 3a + 3c
You can see that this time we get only one independent equation in 3 variables, so we have two free variables to choose
arbitrarily (I chose a and c, but any two would work). Therefore, any vector

v_{2,3} = \begin{bmatrix} a \\ 3a + 3c \\ c \end{bmatrix} = a \begin{bmatrix} 1 \\ 3 \\ 0 \end{bmatrix} + c \begin{bmatrix} 0 \\ 3 \\ 1 \end{bmatrix}, for any a, c ∈ C, (a, c) ≠ 0    (1.17)

is an eigenvector corresponding to λ_{2,3} = 1. Notice the difference with the previous example, where the eigenvalues
were distinct: there, we found one line of eigenvectors corresponding to each eigenvalue. Now that we have a
repeated eigenvalue, we have found a plane of eigenvectors, with the same dimension (2) as the multiplicity of the
repeated eigenvalue.

Similar to before, if you want to (or are asked to) give two specific eigenvectors corresponding to λ2 and λ3
(instead of the plane of eigenvectors in Eq. (1.17)), you can pick any two linearly independent vectors from
that plane, for example,

v_2 = \begin{bmatrix} 1 \\ 3 \\ 0 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 0 \\ 3 \\ 1 \end{bmatrix}

The case of repeated eigenvalues can get more complex than this. In the above example, we were able to
find two linearly independent vectors that satisfy (A − I)v = 0. In other words, the dimension of the null
space of A − I was 2, equal to the multiplicity of the repeated eigenvalue. We may not always be so lucky!
Consider for example the matrix

A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}

It is not hard to see that the two eigenvalues are λ1 = λ2 = 3. So we would ideally need two linearly
independent vectors that satisfy (A − 3I)v = 0. This is impossible, because the null space of A − 3I is only
1-dimensional. To see this, notice that

(A − 3I) \begin{bmatrix} a \\ b \end{bmatrix} = 0 \quad \Leftrightarrow \quad \begin{cases} b = 0 \\ 0 = 0 \end{cases}

So we can only choose a freely, but b must be zero, and only the vectors

v_1 = \begin{bmatrix} a \\ 0 \end{bmatrix}, for any a ∈ C, a ≠ 0


are eigenvectors corresponding to λ1 = λ2 = 3. What about the other eigenvector v2? It doesn't exist!
For these kinds of matrices where not enough eigenvectors can be found (which can only happen if we have
repeated eigenvalues), we have to supplement the eigenvectors with additional vectors called “generalized
eigenvectors”. Good news, that’s beyond our course!
Let’s go back to Example 1.5.4 where we were lucky and able to find two linearly independent eigenvectors
corresponding to the repeated eigenvalue. You might have noticed that not only v2 and v3 are linearly inde-
pendent, but all three eigenvalues {v1 , v2 , v3 } are linearly independent. The same was true in Example 1.5.3
where the eigenvalues were distinct. This is again not by chance:

Theorem 1.5.5 (Independence of eigenvectors) Consider a matrix A ∈ Rn×n that has distinct eigenvalues,
or a matrix A ∈ Rn×n that has repeated eigenvalues but for which we are able to find as many linearly
independent eigenvectors as the multiplicity of each repeated eigenvalue. In both cases, the set of all eigenvectors
{v1, v2, . . . , vn} is linearly independent. □
In the above theorem, I focused on a specific set of (square) matrices: those that either have distinct
eigenvalues, or even if they have repeated eigenvalues, we are able to find as many independent eigenvectors
as the multiplicity of each repeated eigenvalue. For reasons that we will see shortly, these matrices are called
diagonalizable.
Before closing this section, here is how to do all of these in MATLAB.

MATLAB 1.5.6 (Eigenvalues & eigenvectors) The function eig() gives you the eigenvalues and
eigenvectors. If you only want the eigenvalues, type

lambda = eig(A);

and it will give you a column vector lambda containing the eigenvalues of A. If you want the eigenvectors as
well, type

[V, D] = eig(A);

and it will give you two matrices the same size as A. D is a diagonal matrix with the eigenvalues of A on its
diagonal, and V is a matrix with the eigenvectors of A as its columns (with the correct ordering, such that the
i'th column of V is the eigenvector corresponding to the i'th diagonal element of D). □

1.6 Diagonalization
The process of diagonalization is one in which a square matrix is “transformed” into a diagonal one using a
change of coordinates.
Consider a diagonalizable matrix A ∈ Rn×n (you know what that means from the last section, right?).
As always, we look at A as a map from Rn to Rn :

y = Ax (1.18)

Now let λ1, λ2, . . . , λn be the eigenvalues of A (potentially repeated) and v1, v2, . . . , vn be the corresponding
linearly independent eigenvectors. The matrix

V = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}


is nonsingular. So let’s define


x̂ = V−1 x, ŷ = V−1 y
These vectors are the new coordinates of x and y in a coordinate system consisting of the columns of V. But
anyhow, substituting these into Eq. (1.18), we get
Vŷ = AVx̂

ŷ = V^{-1}AV x̂ = Â x̂

Let’s look at  = V−1 AV more closely. First, let us look at the product
 
AV = A v1 v2 · · · vn
Recall from Eq. (1.1) that this is equal to
 
AV = Av1 Av2 ··· Avn
but vi are not any vectors, they are the eigenvectors of A, so this simplifies to
 
AV = λ1 v1 λ2 v2 · · · λn vn
 
λ1 0 · · · 0
 0 λ2 · · · 0 

 
= v1 v2 · · · vn  . .. .. .. 
 .. . . . 
0 0 · · · λn
Can you convince yourself of the last equality? Notice that this is nothing but the product of two matrices,
so you can first break it into n matrix-vector products from Eq. (1.1), each of which is a linear combination
from Eq. (1.2). The matrix
 
Λ = \begin{bmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{bmatrix}

is an important matrix for us, because it contains the eigenvalues of A on its diagonal. Using this new
notation, we get

AV = VΛ

or, in other words,

Â = Λ

This is called "diagonalization", and Λ is also called the diagonalized version of A.
If this seems like a lot to digest, notice that this is what we did at the beginning of Section 1.5, where we
were analyzing what the matrix \begin{bmatrix} 5 & -1 \\ -1 & 5 \end{bmatrix} does to vectors in the plane. After going step by step through
the extraction of eigenvalues and eigenvectors, at the end we decomposed any vector x into its components
along v1 and v2 and used the fact that in this new "coordinate system", A becomes a pure scaling in each
direction, which is precisely what the diagonal matrix Λ does. Diagonalization will be a very valuable tool
later in the study of linear systems.
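A quick MATLAB sketch (mine) that verifies the diagonalization relation for the 2-by-2 matrix of Eq. (1.14):

A = [5 -1; -1 5];
[V, D] = eig(A);          % columns of V are eigenvectors, D is the diagonal Lambda
Ahat = V \ A * V;         % same as inv(V)*A*V
norm(Ahat - D)            % ~0: A is diagonalized by its eigenvector matrix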

Exercise 1.6.1 (Diagonalization) Diagonalize these matrices (find the eigenvalues and eigenvectors and
check Â = V^{-1}AV against Λ):

• A = \begin{bmatrix} -4 & 1 & -2 \\ 0 & -4 & 0 \\ 1 & 4 & -2 \end{bmatrix}

• A = \begin{bmatrix} -1 & 2 & 1 \\ 2 & -4 & -2 \\ 1 & -2 & -1 \end{bmatrix}

□


ME 120 – Linear Systems and Control


Copyright © 2022 by Erfan Nozari. Permission is granted to copy, distribute and modify this file, provided that the original
source is acknowledged.
