Chapter 2
Eigenvalues and Eigenvectors
Note: These lecture notes aim to give a clear and crisp presentation of some topics in Linear Algebra.
Comments/suggestions are welcome via e-mail to Dr. Suresh Kumar at sukuyd@gmail.com.
A real number λ is an eigenvalue of an n-square matrix A iff there exists a non-zero n-vector X
such that AX = λX or (A − λIn)X = 0. The non-zero vector X is called an eigenvector of A corresponding
to the eigenvalue λ. Since the non-zero vector X is a non-trivial solution of the homogeneous system
(A − λIn)X = 0, we must have |A − λIn| = 0. This equation, known as the characteristic equation of A,
yields the eigenvalues of A. So to find the eigenvalues of A, we solve the equation |A − λIn| = 0.
The set Eλ = {X : AX = λX} is known as the eigenspace of λ. Note that Eλ contains all eigenvectors
of A corresponding to the eigenvalue λ, in addition to the vector X = 0 since A0 = λ0. Of course, by
definition, X = 0 is not an eigenvector of A.
Ex. Find eigenvalues and eigenvectors of A =
[ 12  −51 ]
[  2  −11 ].

Sol. Here, the characteristic equation of A, that is, |A − λI2| = 0, reads as

| 12 − λ    −51    |
|   2     −11 − λ  |  =  0,

that is,

λ² − λ − 30 = 0.
So the eigenvalues of A are λ = 6 and λ = −5.
For λ = 6, the eigenvectors are the non-zero solutions X of (A − 6I2)X = 0, that is,
[ 6  −51 ] [x1]   [0]
[ 2  −17 ] [x2] = [0].
Applying R1 → (1/6)R1, we get
[ 1  −17/2 ] [x1]   [0]
[ 2   −17  ] [x2] = [0].
Applying R2 → R2 − 2R1, we get
[ 1  −17/2 ] [x1]   [0]
[ 0    0   ] [x2] = [0].
So x1 = (17/2)x2, and the eigenvectors corresponding to λ = 6 are the non-zero multiples of X1 = (17, 2)^T.
Next, for λ = −5, the eigenvectors are the non-zero solutions of (A + 5I2)X = 0, that is,
[ 17  −51 ] [x1]   [0]
[  2   −6 ] [x2] = [0].
Applying R1 → (1/17)R1 and then R2 → R2 − 2R1, we have
[ 1  −3 ] [x1]   [0]
[ 0   0 ] [x2] = [0].
So x1 = 3x2, and the eigenvectors corresponding to λ = −5 are the non-zero multiples of X2 = (3, 1)^T.
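These results are easy to cross-check numerically, for instance with NumPy (a sketch; note that numpy.linalg.eig returns unit-norm eigenvectors, so they agree with ours only up to scaling):

import numpy as np

A = np.array([[12., -51.],
              [2., -11.]])

vals, vecs = np.linalg.eig(A)
print(vals)                      # eigenvalues 6 and -5 (order may vary)

# Each column of vecs is an eigenvector. Rescaling so that the last
# component is 1 recovers multiples of (17, 2)^T and (3, 1)^T.
for lam, v in zip(vals, vecs.T):
    v = v / v[-1]                # eigenvectors are determined only up to a scalar
    print(lam, v)                # 6.0 -> [8.5 1.],  -5.0 -> [3. 1.]
    assert np.allclose(A @ v, lam * v)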
Ex. Find eigenvalues and eigenvectors of A =
[ −4   8  −12 ]
[  6  −6   12 ]
[  6  −8   14 ].

Sol. The characteristic equation of A, that is, |A − λI3| = 0, reads as

| −4 − λ     8      −12    |
|   6      −6 − λ    12    |  =  0.
|   6       −8     14 − λ  |
This leads to a cubic equation in λ given by
λ³ − 4λ² + 4λ = 0, that is, λ(λ − 2)² = 0.
So the eigenvalues of A are λ = 0, 2, 2.
First, consider λ = 0. The homogeneous system (A − 0I3)X = 0, on row reduction, leads to the equations
x1 + x3 = 0, x2 − x3 = 0.
So x1 = −x3 and x2 = x3, and the eigenvectors corresponding to λ = 0 are the non-zero multiples of X1 = (−1, 1, 1)^T.
Now, let us find the eigenvectors of A corresponding to λ = 2. For this, we need to find non-zero
solutions X of the homogeneous system (A − 2I3 )X = 0, that is,
[ −6   8  −12 ] [x1]   [0]
[  6  −8   12 ] [x2] = [0]
[  6  −8   12 ] [x3]   [0].
This system row reduces to the single equation
x1 − (4/3)x2 + 2x3 = 0.
Thus, the eigenvectors corresponding to λ = 2 are the non-trivial linear combinations of the vectors
X2 = (4/3, 1, 0)^T and X3 = (−2, 0, 1)^T. So E2 = {aX2 + bX3 : a, b ∈ R} is the eigenspace corresponding to λ = 2.
Note: The algebraic multiplicity of an eigenvalue is defined as the number of times it repeats. In the
above example, the eigenvalue λ = 2 repeats two times, so its algebraic multiplicity is 2. Also, we get
two linearly independent eigenvectors X2 = (4/3, 1, 0)^T and X3 = (−2, 0, 1)^T corresponding to λ = 2.
The following example shows that there may not exist as many linearly independent eigenvectors as the
algebraic multiplicity of an eigenvalue.
Ex. If A =
[ 0  1  0 ]
[ 0  0  1 ]
[ 0  0  0 ],
then the eigenvalues of A are λ = 0, 0, 0. The eigenvectors corresponding to λ = 0 are the non-zero
multiples of the vector X = (1, 0, 0)^T. The eigenspace corresponding to λ = 0, therefore, is
E0 = {a(1, 0, 0)^T : a ∈ R}. Please try this example yourself. Notice that there is only one linearly
independent eigenvector X = (1, 0, 0)^T corresponding to the repeated eigenvalue (repeating thrice) λ = 0.
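The deficiency can be seen numerically via the rank-nullity theorem (a sketch): the eigenspace E0 is the null space of A − 0I = A, and its dimension is 3 − rank(A).

import numpy as np

A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])

print(np.linalg.eigvals(A))          # [0. 0. 0.]

# rank(A) = 2, so the null space of A (the eigenspace E0) has dimension
# 3 - 2 = 1: only one linearly independent eigenvector exists.
print(np.linalg.matrix_rank(A))      # 2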
2.3 Similar Matrices and Diagonalization
A square matrix A is said to be similar to a matrix B if there exists a non-singular matrix P such that
P⁻¹AP = B. In case B is a diagonal matrix, we say that A is a diagonalizable matrix. Thus, a square
matrix A is diagonalizable if there exists a non-singular matrix P such that P⁻¹AP = D, where D is a
diagonal matrix.
Suppose A possesses n linearly independent eigenvectors X1, X2, ..., Xn corresponding to the eigenvalues
λ1, λ2, ..., λn, and let P = [X1 X2 ... Xn] be the matrix with these eigenvectors as columns. Then P is
non-singular, and
AP = [AX1 AX2 ... AXn] = [λ1X1 λ2X2 ... λnXn] = PD,
where D is the diagonal matrix
[ λ1  0   ...  0  ]
[ 0   λ2  ...  0  ]
[ ..  ..  ...  .. ]
[ 0   0   ...  λn ].
This shows that if we construct P from eigenvectors of A, then A is diagonalizable, and P⁻¹AP = D has
the eigenvalues of A at the diagonal places.
Note: If A has n different eigenvalues, then it can be proved that there exist n linearly independent
eigenvectors of A and consequently A is diagonalizable. However, there may exist n linearly independent
eigenvectors even if A has repeated eigenvalues, as we have seen earlier. Such a matrix is also, of course,
diagonalizable. In case A does not have n linearly independent eigenvectors, it is not diagonalizable.
Ex. If A =
[ 12  −51 ]
[  2  −11 ],
then P =
[ 17  3 ]
[  2  1 ]
and P⁻¹AP =
[ 6   0 ]
[ 0  −5 ]. (Verify!)
Ex. If A =
[ −4   8  −12 ]
[  6  −6   12 ]
[  6  −8   14 ],
then P =
[ 4  −2  −1 ]
[ 3   0   1 ]
[ 0   1   1 ]
and P⁻¹AP =
[ 2  0  0 ]
[ 0  2  0 ]
[ 0  0  0 ]. (Verify!)
Note: If A is a diagonalizable matrix, that is, P⁻¹AP = D or A = PDP⁻¹, then for any positive integer
n, we have Aⁿ = PDⁿP⁻¹. For,
A² = (PDP⁻¹)² = PDP⁻¹PDP⁻¹ = PD²P⁻¹.
Likewise, A³ = PD³P⁻¹. So in general, Aⁿ = PDⁿP⁻¹.
This result can be utilized to evaluate powers of a diagonalizable matrix easily.
Ex. Determine A², where A =
[ −4   8  −12 ]
[  6  −6   12 ]
[  6  −8   14 ].
Sol. Here P =
[ 4  −2  −1 ]
[ 3   0   1 ]
[ 0   1   1 ],
P⁻¹ =
[  1  −1   2 ]
[  3  −4   7 ]
[ −3   4  −6 ]
and D =
[ 2  0  0 ]
[ 0  2  0 ]
[ 0  0  0 ].
So A² = PD²P⁻¹ =
[ 4  −2  −1 ] [ 4  0  0 ] [  1  −1   2 ]   [ −8   16  −24 ]
[ 3   0   1 ] [ 0  4  0 ] [  3  −4   7 ] = [ 12  −12   24 ]
[ 0   1   1 ] [ 0  0  0 ] [ −3   4  −6 ]   [ 12  −16   28 ].
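The same computation in NumPy (a sketch, with P and D taken from above):

import numpy as np

A = np.array([[-4., 8., -12.],
              [6., -6., 12.],
              [6., -8., 14.]])
P = np.array([[4., -2., -1.],
              [3., 0., 1.],
              [0., 1., 1.]])
D = np.diag([2., 2., 0.])

# A^2 via diagonalization: A^2 = P D^2 P^{-1}
A2 = P @ D @ D @ np.linalg.inv(P)
assert np.allclose(A2, A @ A)        # agrees with the direct product A.A
print(A2)                            # [[-8. 16. -24.], [12. -12. 24.], [12. -16. 28.]]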
Ex. Show that similar matrices have the same eigenvalues. Also, discuss their eigenvectors.
Sol. Suppose A and B are similar matrices. Then there exists a non-singular matrix P such that
P⁻¹AP = B. If λ is an eigenvalue of B, then we have
|B − λI| = |P⁻¹AP − λP⁻¹P| = |P⁻¹(A − λI)P| = |P⁻¹||A − λI||P| = |A − λI| = 0,
since |P⁻¹||P| = 1. So λ is an eigenvalue of A as well, that is, similar matrices have the same eigenvalues.
Their eigenvectors, however, need not be the same: if X is an eigenvector of B corresponding to λ, then
BX = λX gives A(PX) = λ(PX), so PX (not X itself, in general) is an eigenvector of A corresponding to λ.
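A quick numerical illustration of this fact (a sketch, with an arbitrarily chosen non-singular P):

import numpy as np

A = np.array([[12., -51.],
              [2., -11.]])
P = np.array([[1., 2.],
              [0., 1.]])               # any non-singular matrix will do
B = np.linalg.inv(P) @ A @ P           # B is similar to A

print(np.sort(np.linalg.eigvals(A)))   # [-5.  6.]
print(np.sort(np.linalg.eigvals(B)))   # [-5.  6.]  (same eigenvalues)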
Norm of a vector: The norm (or length) of a vector X = (x1, x2, ..., xn)^T is defined as
||X|| = √(x1² + x2² + ... + xn²).
Dot product: The dot product of the vectors X = (x1, x2, ..., xn)^T and Y = (y1, y2, ..., yn)^T, denoted
by X.Y, is defined as
X.Y = X^T Y = x1y1 + x2y2 + ... + xnyn.
The vectors X and Y are orthogonal if X.Y = 0. Further, X and Y are orthonormal if X.Y = 0 and
||X|| = 1 = ||Y||.
So, if two nonzero vectors X and Y are orthogonal, then X/||X|| and Y/||Y|| are orthonormal.
Orthogonal and orthonormal matrices: A matrix is orthogonal (orthonormal) if all of its column
vectors are mutually orthogonal (orthonormal). Note that an orthonormal matrix P has the property
P⁻¹ = P^T.
The Gram-Schmidt Process: This process converts a LI set of vectors, say {X1, X2, X3, ..., Xn}, to an
orthogonal set of vectors {Y1, Y2, Y3, ..., Yn} as follows:
(i) Y1 = X1,
(ii) Y2 = X2 − (X2.Y1)Y1/||Y1||²,
(iii) Y3 = X3 − (X3.Y1)Y1/||Y1||² − (X3.Y2)Y2/||Y2||²,
and so on.
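The process translates directly into code; here is a minimal NumPy sketch (the function name gram_schmidt is ours, not a library routine):

import numpy as np

def gram_schmidt(vectors):
    """Turn a LI list of vectors into an orthogonal list: each Y_k is
    X_k minus its projections onto the previously computed Y's."""
    ortho = []
    for x in vectors:
        y = x.astype(float)
        for u in ortho:
            y = y - (x @ u) / (u @ u) * u   # subtract projection of x onto u
        ortho.append(y)
    return ortho

# X1 and X2 from the example below (eigenvectors for the eigenvalue -1)
Y1, Y2 = gram_schmidt([np.array([-1., 1., 0.]), np.array([-1., 0., 1.])])
print(Y1)        # [-1.  1.  0.]
print(Y2)        # [-0.5 -0.5  1. ]
print(Y1 @ Y2)   # 0.0, as required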
Theorem: If A is a real symmetric matrix of order n × n, then
(i) A has n real eigenvalues,
(ii) A has n LI eigenvectors,
(iii) A is diagonalizable,
(iv) eigenvectors of A corresponding to different eigenvalues are mutually orthogonal,
(v) there exists an orthonormal matrix P with columns as linearly independent eigenvectors of A such
that P⁻¹AP = D, where D is a diagonal matrix with the eigenvalues of A at the diagonal places.
The above theorem tells us some useful properties of a real symmetric matrix. First, all the eigenvalues
of A are real. Further, A is diagonalizable, and it is diagonalized by an orthonormal matrix P where
column vectors of P are linearly independent eigenvectors of A. Also, note that the eigenvectors of A
corresponding to different eigenvalues are mutually orthogonal. Thus, to find P , first we need eigenvalues
of the matrix. If all eigenvalues are different, then the corresponding eigenvectors are mutually orthogonal.
In case of repeated eigenvalues, we can use the Gram-Schmidt process to generate mutually orthogonal vectors
as explained in the following example.
Ex. Determine an orthonormal matrix P, and use it to diagonalize A =
[ 0  1  1 ]
[ 1  0  1 ]
[ 1  1  0 ].
Sol. The eigenvalues of A are −1, −1 and 2. The eigenvectors corresponding to the repeated eigenvalue
−1 are X1 = (−1, 1, 0)^T and X2 = (−1, 0, 1)^T, while the eigenvector corresponding to the eigenvalue 2
is X3 = (1, 1, 1)^T.
We find that X1.X3 = 0 and X2.X3 = 0. So X1 and X2 are orthogonal to X3, as expected. But X1.X2 ≠ 0.
Using the Gram-Schmidt process, we find the orthogonal vectors:
Y1 = X1 = (−1, 1, 0)^T,
Y2 = X2 − (X2.Y1)Y1/||Y1||² = (−1, 0, 1)^T − (−1/2, 1/2, 0)^T = (−1/2, −1/2, 1)^T.
Also, Y1.X3 = 0 and Y2.X3 = 0. Thus, {Y1, Y2, X3} is an orthogonal set, and
{Y1/||Y1||, Y2/||Y2||, X3/||X3||} is an orthonormal set. It follows that the orthonormal matrix P that
diagonalizes the given matrix A is given by
P =
[ −1/√2  −1/√6  1/√3 ]
[  1/√2  −1/√6  1/√3 ]
[    0    2/√6  1/√3 ].
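This diagonalization is easy to verify with NumPy (a sketch; for real symmetric matrices numpy.linalg.eigh returns an orthonormal eigenvector matrix directly, with eigenvalues in ascending order):

import numpy as np

A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
P = np.array([[-1/np.sqrt(2), -1/np.sqrt(6), 1/np.sqrt(3)],
              [1/np.sqrt(2), -1/np.sqrt(6), 1/np.sqrt(3)],
              [0., 2/np.sqrt(6), 1/np.sqrt(3)]])

assert np.allclose(P.T @ P, np.eye(3))   # P is orthonormal, so P^{-1} = P^T
print(np.round(P.T @ A @ P, 10))         # diag(-1, -1, 2)

vals, Q = np.linalg.eigh(A)              # same diagonalization, up to basis choice
print(vals)                              # [-1. -1.  2.]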
2.5 Quadratic form
If A is a symmetric matrix of order n × n, and X is a column vector of n variables, then X^T AX is a
homogeneous expression of second degree in the n variables, called a quadratic form. For example, if
A =
[  1  2  −1 ]
[  2  3   1 ]
[ −1  1   4 ]
and X = (x1, x2, x3)^T, then
X^T AX = x1² + 3x2² + 4x3² + 4x1x2 − 2x1x3 + 2x2x3
is a quadratic form, which is a homogeneous expression of second degree in x1 , x2 and x3 .
Now suppose P is an orthonormal matrix that diagonalizes A, and consider the transformation X = PY,
where Y = (y1, y2, ..., yn)^T. Then
X^T AX = (PY)^T A(PY) = Y^T (P^T AP)Y = Y^T DY = λ1y1² + λ2y2² + ... + λnyn²,
where λ1, λ2, ..., λn are the eigenvalues of A.
Notice that the transformed quadratic form Y^T DY does not carry the cross product terms. It is called
the canonical form of the quadratic form X^T AX.
The number of nonzero terms in the canonical form (number of nonzero eigenvalues of A) is called the
rank (r); the number of positive terms in the canonical form (number of positive eigenvalues of A) is called
the index (i); the difference of the numbers of positive and negative terms in the canonical form (difference
of the number of positive and negative eigenvalues of A) is called the signature (s) of the quadratic form
X^T AX. Further, the quadratic form X^T AX is said to be (i) positive definite if all the eigenvalues of A
are positive; (ii) negative definite if all the eigenvalues of A are negative; (iii) positive semi-definite if the
smallest eigenvalue of A is 0; (iv) negative semi-definite if the largest eigenvalue of A is 0; (v) indefinite
if A has both positive and negative eigenvalues.
Ex. Transform the quadratic form Q = x21 − x23 − 4x1 x2 + 4x2 x3 to its canonical form, and hence find its
rank, index and signature.
Sol. The given quadratic form can be expressed in matrix notation as Q = X^T AX, where
A =
[  1  −2   0 ]
[ −2   0   2 ]
[  0   2  −1 ]
and X = (x1, x2, x3)^T.
The eigenvalues of the matrix A are 0, −3, 3, and the corresponding orthonormal eigenvectors are
(2/3, 1/3, 2/3)^T, (−1/3, −2/3, 2/3)^T and (−2/3, 2/3, 1/3)^T, respectively. (Verify!)
So the transformation
X = PY =
[ 2/3  −1/3  −2/3 ] [y1]
[ 1/3  −2/3   2/3 ] [y2]
[ 2/3   2/3   1/3 ] [y3]
leads to the canonical form
Q = X^T AX = Y^T DY = (y1  y2  y3)
[ 0   0  0 ] [y1]
[ 0  −3  0 ] [y2]  =  −3y2² + 3y3².
[ 0   0  3 ] [y3]
We notice that r = 2, i = 1 and s = 0. Further, the quadratic form is indefinite.
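The rank, index and signature can also be read off numerically from the eigenvalues (a sketch):

import numpy as np

A = np.array([[1., -2., 0.],
              [-2., 0., 2.],
              [0., 2., -1.]])

vals = np.round(np.linalg.eigvalsh(A), 10)   # eigenvalues, with float noise removed
print(vals)                                  # [-3.  0.  3.]

r = np.count_nonzero(vals)                       # rank: nonzero eigenvalues -> 2
i = np.count_nonzero(vals > 0)                   # index: positive eigenvalues -> 1
s = i - np.count_nonzero(vals < 0)               # signature -> 0
print(r, i, s)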
Remark: In case the eigenvalues are equal, we use the derivative of equation (2.1) for determining the
constants in the remainder. For, if f(x) = 0 has a repeated root α, then f(α) = 0 and f′(α) = 0.
2.7 Applications of Eigenvalues and Eigenvectors
Knowledge of the mathematics of eigenvalues and eigenvectors is very important. Some applications of
eigenvalues and eigenvectors are given below.
(Source: http://www.soest.hawaii.edu)
1. Communication systems: Eigenvalues were used by Claude Shannon to determine the theoretical
limit to how much information can be transmitted through a communication medium like your telephone
line or through the air. This is done by calculating the eigenvectors and eigenvalues of the communica-
tion channel (expressed as a matrix), and then water-filling on the eigenvalues. The eigenvalues are then,
in essence, the gains of the fundamental modes of the channel, which themselves are captured by the
eigenvectors.
2. Google’s PageRank: Google’s extraordinary success as a search engine was due to their clever use
of eigenvalues and eigenvectors. From the time it was introduced in 1998, Google's methods for delivering
the most relevant results for our search queries have evolved in many ways. See the link Google's PageRank
for more details.
3. Designing bridges: The natural frequency of the bridge is the eigenvalue of smallest magnitude of
a system that models the bridge. Engineers exploit this knowledge to ensure the stability of their
constructions.
4. Designing car stereo systems: Eigenvalue analysis is also used in the design of car stereo systems,
where it helps reproduce the vibration of the car due to the music.
5. Electrical Engineering: The application of eigenvalues and eigenvectors is useful for decoupling
three-phase systems through symmetrical component transformation.
6. Underground oil search: Oil companies frequently use eigenvalue analysis to explore land for oil.
Oil, dirt, and other substances all give rise to linear systems which have different eigenvalues, so eigenvalue
analysis can give a good indication of where oil reserves are located. Oil companies place probes around
a site to pick up the waves that result from a huge truck used to vibrate the ground. The waves are
changed as they pass through the different substances in the ground. The analysis of these waves directs
the oil companies to possible drilling sites.
Eigenvalues are not only used to explain natural occurrences, but also to discover new and better
designs for the future. Some of the results are quite surprising. If you were asked to build the strongest
column that you could to support the weight of a roof using only a specified amount of material, what
shape would that column take? Most of us would build a cylinder like most other columns that we have
seen. However, Steve Cox of Rice University and Michael Overton of New York University proved, based
on the work of J. Keller and I. Tadjbakhsh, that the column would be stronger if it was largest at the
top, middle, and bottom. At certain points between these, the column could be smaller because the
column would not naturally buckle there anyway. Does that surprise you? This new design was
discovered through the study of the eigenvalues of the system involving the column and the weight from
above. Note that this column would not be the strongest design if any significant pressure came from the
side, but when a column supports a roof, the vast majority of the pressure comes directly from above.
Very roughly, the eigenvalues of a linear mapping are a measure of the distortion induced by the
transformation, and the eigenvectors tell you how the distortion is oriented. It is precisely this
rough picture which makes PCA (Principal Component Analysis, a statistical procedure) very useful.