
Linear Algebra 1

Keshav Dogra∗

∗ Department of Economics, Columbia University, kd2338@columbia.edu

The following material is based on Chapter 1 of Sydsaeter et al., "Further Mathematics for Economic Analysis" and Sergei Treil, "Linear Algebra Done Wrong" (available at http://www.math.brown.edu/~treil/papers/LADW/LADW.html).

1 Basic Linear Algebra


An m × n matrix is a rectangular array with m rows and n columns:

 
$$A = (a_{ij})_{m \times n} = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}$$

where $a_{ij}$ denotes the element in the ith row and the jth column.
An n-vector is an ordered n-tuple of numbers. We can think of an n-vector as either a 1 × n matrix (a row vector) or as an n × 1 matrix (a column vector).
An m × n matrix A can be written as a set of column vectors or as a set of row vectors. That is,

$$A = \begin{pmatrix} a_1 & a_2 & \dots & a_n \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{pmatrix}$$

where

$$a_i = \begin{pmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{mi} \end{pmatrix}$$

are column vectors and

$$\alpha_j = \begin{pmatrix} a_{j1} & a_{j2} & \dots & a_{jn} \end{pmatrix}$$

are row vectors.


Addition and scalar multiplication of matrices are defined in the obvious way:

$$A + B = (a_{ij} + b_{ij})_{m \times n}, \qquad \alpha A = (\alpha a_{ij})_{m \times n}, \qquad A - B = (a_{ij} - b_{ij})_{m \times n}$$

The dot product (also called the inner product) of two vectors $a = (a_1, a_2, \dots, a_n)$ and $b = (b_1, b_2, \dots, b_n)$ is defined as

$$a \cdot b = \sum_{i=1}^{n} a_i b_i$$

If a and b are regarded as column vectors, we can write the dot product as $a'b$. The dot product has the properties

$$a \cdot b = b \cdot a, \qquad a \cdot (b + c) = a \cdot b + a \cdot c, \qquad (\alpha a) \cdot b = a \cdot (\alpha b) = \alpha (a \cdot b)$$

We say two vectors a and b are orthogonal if $a \cdot b = 0$.
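As a quick numerical check of the definition and of orthogonality, here is a minimal numpy sketch (the vectors are arbitrary examples):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -1.0, 2.0])

# Dot product: sum of elementwise products, a . b = sum_i a_i b_i.
print(a @ b)          # 1*4 + 2*(-1) + 3*2 = 8.0
print(np.dot(a, b))   # equivalent

# Orthogonal vectors have a zero dot product.
u = np.array([1.0, 1.0, 0.0])
v = np.array([1.0, -1.0, 0.0])
print(u @ v)          # 0.0
```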


Suppose A is an m × n matrix and B is an n × p matrix. Then the product of A and B is the m × p matrix C = AB, whose element in the ith row and jth column is the dot product of the ith row of A and the jth column of B:

$$c_{ij} = \sum_{r=1}^{n} a_{ir} b_{rj}$$

This means that the columns of C are linear combinations of the columns of A. In particular, the jth column of C is

$$\begin{pmatrix} c_{1j} \\ c_{2j} \\ \vdots \\ c_{mj} \end{pmatrix} = b_{1j} \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix} + b_{2j} \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{pmatrix} + \dots + b_{nj} \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}$$

Also, the rows of C are linear combinations of the rows of B; the ith row of C is

$$(c_{i1}, c_{i2}, \dots, c_{ip}) = a_{i1}(b_{11}, b_{12}, \dots, b_{1p}) + a_{i2}(b_{21}, b_{22}, \dots, b_{2p}) + \dots + a_{in}(b_{n1}, b_{n2}, \dots, b_{np})$$

The product AB is defined only if the number of columns in A equals the number of rows in
B. If this is true we say the matrices are conformable.
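To see the "columns of C are combinations of the columns of A" view concretely, here is a minimal numpy sketch (the matrix values are arbitrary):

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])        # 3 x 2
B = np.array([[1., 0., 2.],
              [0., 1., 1.]])    # 2 x 3

C = A @ B                       # 3 x 3 product, c_ij = sum_r a_ir * b_rj

# Column j of C is a linear combination of the columns of A,
# with weights given by column j of B.
j = 2
print(np.allclose(C[:, j], B[0, j] * A[:, 0] + B[1, j] * A[:, 1]))  # True
```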
Matrix multiplication satisfies

(AB)C = A(BC)
A(B + C) = AB + AC
(A + B)C = AC + BC

Matrix multiplication is not commutative: in general, AB ≠ BA. Moreover, AB = 0 does not imply that either A or B equals the matrix of zeros, 0.
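Both points are easy to see numerically (a minimal sketch; the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[1., 1.],
              [0., 0.]])
B = np.array([[1., 0.],
              [0., 0.]])
print(np.allclose(A @ B, B @ A))   # False: AB and BA differ

# The product of two nonzero matrices can be the zero matrix.
N = np.array([[0., 1.],
              [0., 0.]])
print(N @ N)                        # [[0. 0.], [0. 0.]]
```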
A matrix is square if it has the same number of rows and columns. The nth power of a square matrix is defined as

$$A^n = \underbrace{AA \cdots A}_{n \text{ times}}$$

A square matrix is diagonal if all its off-diagonal elements are zero. We sometimes write

$$\mathrm{diag}\{d_1, d_2, \dots, d_n\} = \begin{pmatrix} d_1 & 0 & \dots & 0 \\ 0 & d_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & d_n \end{pmatrix}$$

The nth power of a diagonal matrix $\mathrm{diag}\{d_1, d_2, \dots, d_n\}$ is $\mathrm{diag}\{d_1^n, d_2^n, \dots, d_n^n\}$.
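A minimal numpy check of the diagonal-power rule (the diagonal entries are arbitrary):

```python
import numpy as np

D = np.diag([2., 3., 5.])

# D^3 equals diag{2^3, 3^3, 5^3}: powers act entrywise on the diagonal.
print(np.allclose(np.linalg.matrix_power(D, 3),
                  np.diag([2.0**3, 3.0**3, 5.0**3])))   # True
```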
The identity matrix of order n, denoted by $I_n$ or I, is the n × n matrix with ones on the main diagonal and zeros elsewhere,

$$I_n = \mathrm{diag}\{1, 1, \dots, 1\} = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}$$

If the matrices are conformable, then

$$IA = A, \qquad BI = B$$

A square matrix is upper triangular if all entries below the main diagonal are zero. A square
matrix is lower triangular if all entries above the main diagonal are zero. A matrix is triangular
if it is either lower triangular or upper triangular.
The transpose of a matrix A, denoted by A′ or $A^T$, is obtained by interchanging rows and columns (if B = A′, then $a_{ij} = b_{ji}$). The following rules apply:

$$(A')' = A, \qquad (A + B)' = A' + B', \qquad (\alpha A)' = \alpha A', \qquad (AB)' = B'A'$$

A square matrix is symmetric if A = A′.


A square matrix is orthogonal if A′A = I. A square matrix is idempotent if AA = A.
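A small numpy sketch checking the transpose rule for products, along with the orthogonal and idempotent definitions (the rotation angle and the projection matrix are arbitrary examples):

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.], [5., 6.]])
B = np.array([[1., 0., 2.], [0., 1., 1.]])
print(np.allclose((A @ B).T, B.T @ A.T))   # True: (AB)' = B'A'

# A rotation matrix is orthogonal: Q'Q = I.
t = 0.3
Q = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
print(np.allclose(Q.T @ Q, np.eye(2)))     # True

# Projection onto the first coordinate is idempotent: PP = P.
P = np.array([[1., 0.],
              [0., 0.]])
print(np.allclose(P @ P, P))               # True
```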

1.1 Determinants and Inverses


Intuitively, the determinant of an n × n matrix A, denoted |A|, is a measure of the n-dimensional volume of the parallelepiped determined by the columns of A. (A parallelepiped, or parallelotope, is just the n-dimensional version of a parallelogram.) For example, for a 2 × 2 matrix A, the absolute value of the determinant is the area of the parallelogram formed by the rows of A. (We take absolute values because sometimes the determinant is negative.)
The determinant has the following properties, which are natural given that it is a measure of
volume:

• If two rows (or two columns) of A are interchanged, its determinant changes sign but its
absolute value remains the same;

• If a single row (or column) of A is multiplied by a number c, the determinant is multiplied by c;

• If two rows (or columns) of A are proportional, then |A| = 0;

• The value of |A| remains unchanged if a multiple of a row is added to another row (or if a
multiple of a column is added to another column);

• The determinant of a (lower or upper) triangular matrix equals the product of its diagonal entries, $a_{11} a_{22} \cdots a_{nn}$. In particular, this holds for diagonal matrices, and the determinant of the identity matrix equals 1.

• If two square matrices have identical columns $x_1, \dots, x_n$ except for their kth column, which equals $u_k$ for one matrix and $v_k$ for the other, then the sum of these matrices' determinants equals the determinant of the matrix whose columns are all equal to $x_1, \dots, x_n$ except for the kth column, which equals $u_k + v_k$. That is,

$$\begin{vmatrix} x_1 & \cdots & u_k + v_k & \cdots & x_n \end{vmatrix} = \begin{vmatrix} x_1 & \cdots & u_k & \cdots & x_n \end{vmatrix} + \begin{vmatrix} x_1 & \cdots & v_k & \cdots & x_n \end{vmatrix}$$

The determinant |A| of a 2 × 2 matrix is defined as

$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{21} a_{12}$$

To define the determinant of an n × n matrix, we need to introduce some extra terms.


A minor of A of order k is obtained by deleting all but k rows and k columns, and taking the determinant of the resulting k × k matrix. For the moment, we are only concerned with minors of order n − 1. The ijth minor of order n − 1, call it $M_{ij}$, is obtained by deleting the ith row and jth column of A, and then taking the determinant of the resulting (n − 1) × (n − 1) matrix:

$$M_{ij} = \begin{vmatrix} a_{11} & \dots & a_{1,j-1} & a_{1,j+1} & \dots & a_{1n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{i-1,1} & \dots & a_{i-1,j-1} & a_{i-1,j+1} & \dots & a_{i-1,n} \\ a_{i+1,1} & \dots & a_{i+1,j-1} & a_{i+1,j+1} & \dots & a_{i+1,n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \dots & a_{n,j-1} & a_{n,j+1} & \dots & a_{nn} \end{vmatrix}$$

The ijth cofactor of A, $A_{ij}$, is the ijth minor multiplied by $(-1)^{i+j}$:

$$A_{ij} = (-1)^{i+j} M_{ij}$$

Having defined the cofactor, we can provide an expression for the determinant. For an n × n matrix A, the determinant |A| can be defined by expanding along any row i = 1, 2, ..., n:

$$|A| = a_{i1} A_{i1} + a_{i2} A_{i2} + \dots + a_{in} A_{in}$$

where $A_{ij}$ is the ijth cofactor of A.
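The expansion translates directly into code. Below is a minimal recursive sketch, expanding along the first row; it is exponential-time and purely illustrative, with np.linalg.det as the practical alternative:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row (illustrative)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Minor M_{1j}: delete the first row and the jth column (0-based).
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        # Cofactor sign (-1)^{1+j} in 1-based indices is (-1)**j here.
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

A = np.array([[2., 1., 0.],
              [1., 3., 4.],
              [0., 5., 6.]])
print(det_cofactor(A), np.linalg.det(A))   # both -10.0 (up to rounding)
```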


The determinant has the properties

$$|A'| = |A|, \qquad |AB| = |A| \cdot |B|$$

The inverse $A^{-1}$ of an n × n square matrix A is the matrix B that satisfies

$$AB = I_n, \qquad BA = I_n$$

The inverse of a matrix A may or may not exist; if it does exist, we say A is invertible. $A^{-1}$ exists if and only if $|A| \neq 0$. The unique inverse of A, if it exists, is given by

 
$$A^{-1} = \frac{1}{|A|}\,\mathrm{adj}(A), \qquad \text{where} \quad \mathrm{adj}(A) = \begin{pmatrix} A_{11} & A_{21} & \dots & A_{n1} \\ A_{12} & A_{22} & \dots & A_{n2} \\ \vdots & \vdots & & \vdots \\ A_{1n} & A_{2n} & \dots & A_{nn} \end{pmatrix}$$

where $A_{ij}$ is the ijth cofactor defined above. Note that the order of subscripts in the adjoint matrix adj(A) is the opposite of what you might expect. In practice, this formula is almost never useful except in the special case of 2 × 2 matrices:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
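For the 2 × 2 case the formula is easy to implement and check against numpy (a minimal sketch; the matrix values are arbitrary):

```python
import numpy as np

def inv_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    a, b = M[0]
    c, d = M[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: |A| = 0")
    return np.array([[ d, -b],
                     [-c,  a]]) / det

M = np.array([[3., 1.],
              [2., 1.]])
print(inv_2x2(M))           # [[ 1. -1.], [-2.  3.]]
print(np.linalg.inv(M))     # agrees
```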

For invertible A and B, the inverse has the properties:

$$AA^{-1} = A^{-1}A = I$$
$$(A^{-1})^{-1} = A$$
$$(AB)^{-1} = B^{-1}A^{-1}$$
$$(A')^{-1} = (A^{-1})'$$
$$(A + B)^{-1} = A^{-1}(A^{-1} + B^{-1})^{-1}B^{-1}$$

(The last identity also requires that A + B and $A^{-1} + B^{-1}$ be invertible.)

1.2 Cramer’s Rule


Consider a linear system of n equations in n unknowns

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2 \\ &\ \ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n &= b_n \end{aligned}$$

which we can write more compactly as Ax = b, where the matrix A and vector b are parameters and we want to find vectors x that solve this system of equations. The system has a unique solution if and only if $|A| \neq 0$. In this case, the solution is

$$x_j = |A_j| / |A|, \qquad j = 1, 2, \dots, n$$

where $A_j$ is the matrix formed by replacing the jth column of A with the vector b.¹

¹ Cramer's Rule is not an efficient method for solving linear systems. It is better to use row reduction. See LADW, Chapter 2.

If b equals the vector of zeros 0, so we have Ax = 0, the system is homogeneous. A homogeneous system always has the trivial solution x = 0. It has nontrivial solutions if and only if |A| = 0.
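Cramer's rule translates directly into code. A minimal sketch for illustration (as the footnote notes, row reduction, here via np.linalg.solve, is preferable in practice):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b by Cramer's rule (illustrative; prefer np.linalg.solve)."""
    detA = np.linalg.det(A)
    if np.isclose(detA, 0.0):
        raise ValueError("|A| = 0: no unique solution")
    n = len(b)
    x = np.empty(n)
    for j in range(n):
        Aj = A.copy()
        Aj[:, j] = b          # replace the jth column of A with b
        x[j] = np.linalg.det(Aj) / detA
    return x

A = np.array([[2., 1.],
              [1., 3.]])
b = np.array([5., 10.])
print(cramer_solve(A, b))       # [1. 3.]
print(np.linalg.solve(A, b))    # same
```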

2 Vectors
An n-vector is an ordered n-tuple of numbers. We can think of an n-vector as either a 1 × n matrix (a row vector) or as an n × 1 matrix (a column vector).
Let S be a set of n × 1 vectors. S is a vector space in n-dimensional space if

1. If x1 , x2 ∈ S, then x1 + x2 ∈ S [S is closed under summation]

2. If x ∈ S, and α is a real scalar, then αx ∈ S [S is closed under scalar multiplication].

Note that:

• If α = 0, we get the null vector $(0, 0, \dots, 0)'$, so any vector space S contains the null vector;

• If x1 , x2 ∈ S and α, β ∈ R, then αx1 + βx2 ∈ S. (In fact, this condition implies 1. and 2.)

We can give a more general definition of a vector space, in which the objects we call 'vectors' are not necessarily n × 1 arrays of numbers. A vector space V is a collection of objects called vectors, closed under two operations, addition of vectors and scalar multiplication (that is, if $v, w \in V$, then $\alpha v + \beta w \in V$ for any real scalars α, β), such that the following properties hold:

• v + w = w + v for all v, w ∈ V ;

• (u + v) + w = u + (v + w), for all u, v, w ∈ V

• There exists a zero vector denoted 0 such that v + 0 = v for all v ∈ V


1 Cramer’s Rule is not an efficient method for solving linear systems. It is better to use row reduction. See LADW,
Chapter 2.
2 That is, if v, w ∈ V , then αv + βw ∈ V for any real scalars α, β.

7
• For every vector v ∈ V there exists a vector w such that v + w = 0. We denote this ‘additive
inverse’ vector as −v.

• 1v = v for all v ∈ V

• (αβ)v = α(βv) for all v ∈ V and all scalars α, β;

• α(u + v) = αu + αv, for all u, v ∈ V and all scalars α;

• (α + β)v = αv + βv for all v ∈ V and all scalars α, β.

For example, the space $P_n$ of all polynomials of degree at most n, consisting of all polynomials of the form

$$p(t) = a_0 + a_1 t + \dots + a_n t^n,$$

is a vector space.
If S, T are vector spaces with S ⊂ T , then S is a vector subspace of T .
Let V be a vector space, and $v_1, v_2, \dots, v_p \in V$ a collection of vectors. A linear combination of these vectors is a sum $\sum_{k=1}^{p} \alpha_k v_k$. A system of vectors $v_1, v_2, \dots, v_p \in V$ is called a basis if any vector $v \in V$ has a unique representation as a linear combination of these vectors.
For example, take the vector space $V = \mathbb{R}^n$. Consider the vectors

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}$$

This system of vectors is a basis in $\mathbb{R}^n$, since any vector

$$v = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

can be uniquely represented as the linear combination

$$v = x_1 e_1 + x_2 e_2 + \dots + x_n e_n$$

The system $e_1, e_2, \dots, e_n$ is called the standard basis in $\mathbb{R}^n$.
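Finding the coordinates of a vector in a given basis amounts to solving a linear system. A minimal numpy sketch (the basis and vector are arbitrary examples):

```python
import numpy as np

# Columns of B form a (non-standard) basis of R^2: they are independent.
B = np.array([[1., 1.],
              [0., 1.]])
v = np.array([3., 2.])

# Coordinates c in this basis satisfy v = c_1 b_1 + c_2 b_2, i.e. B c = v.
c = np.linalg.solve(B, v)
print(c)                        # [1. 2.]
print(np.allclose(B @ c, v))    # True: the representation is exact
```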


The linear span of a set of vectors $v_1, \dots, v_n$ is the set of all linear combinations of these vectors. If any vector $v \in V$ has a representation (not necessarily unique) as a linear combination of these vectors, that is, if the linear span of these vectors is equal to the whole space V, then we say these vectors span V, and we call them a spanning system (also: generating system, or complete system). Note that every basis is a spanning system, but not every spanning system is a basis.
The dimension of a vector space V , denoted dim V , is the number of vectors in a basis of V .
(It can be shown that any basis of V has the same number of vectors.)

2.1 Inner products


If S is a vector space, a function $\langle x, y \rangle$ defined for all $x, y \in S$ is an inner product if, for all $x, y, z \in S$ and any scalar α:

1. $\langle x, x \rangle \geq 0$, and $\langle x, x \rangle = 0$ iff $x = 0$

2. $\langle x, y \rangle = \langle y, x \rangle$

3. $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$

4. $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$

The most common example of an inner product, in a Euclidean space, is the dot product. The dot product of two vectors $a = (a_1, a_2, \dots, a_n)$ and $b = (b_1, b_2, \dots, b_n)$ is defined as

$$a \cdot b = \sum_{i=1}^{n} a_i b_i$$

If a and b are regarded as column vectors, we can write the dot product as $a'b$. The dot product has the properties

$$a \cdot b = b \cdot a, \qquad a \cdot (b + c) = a \cdot b + a \cdot c, \qquad (\alpha a) \cdot b = a \cdot (\alpha b) = \alpha (a \cdot b)$$

The Cauchy-Schwarz inequality states that if x and y are elements of a vector space S, and $\langle x, y \rangle$ is an inner product, then

$$\langle x, y \rangle^2 \leq \langle x, x \rangle \cdot \langle y, y \rangle$$

For example, in a Euclidean space with the dot product, we have

$$\left( \sum_i x_i y_i \right)^2 \leq \left( \sum_i x_i^2 \right) \left( \sum_i y_i^2 \right)$$

Two vectors a and b are orthogonal (we denote this by a ⊥ b) if their inner product is zero, $a \cdot b = 0$.
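A quick numerical check of the inequality for the dot product (a minimal sketch with random vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

# (x . y)^2 <= (x . x)(y . y) holds for any vectors x, y.
print((x @ y) ** 2 <= (x @ x) * (y @ y))   # True
```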

2.2 Normed vector spaces
A normed vector space is a vector space S, together with a function (called a norm) ||·|| : S → R,
such that for all x, y ∈ S and α ∈ R:

1. ||x|| ≥ 0, ||x|| = 0 iff x = 0

2. ||αx|| = |α| · ||x||

3. ||x + y|| ≤ ||x|| + ||y||

In $\mathbb{R}^n$, the Euclidean norm or length of the vector $a = (a_1, a_2, \dots, a_n)$ is

$$||a|| = \sqrt{a \cdot a} = \sqrt{a_1^2 + a_2^2 + \dots + a_n^2}$$

Some other norms in $\mathbb{R}^n$ are:

• $||x||_1 = \sum_{i=1}^{n} |x_i|$ is the $L_1$ norm (or sum-norm)

• $||x||_2 = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}$ is the $L_2$ norm (which is another name for the Euclidean norm)

• $||x||_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$ is the $L_p$ norm

• $||x||_\infty = \max_{1 \leq i \leq n} |x_i|$ is the sup-norm.
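All four norms are available through np.linalg.norm (a minimal sketch; the vector is an arbitrary example):

```python
import numpy as np

x = np.array([3., -4., 0.])

print(np.linalg.norm(x, 1))       # L1 (sum) norm: 7.0
print(np.linalg.norm(x, 2))       # L2 (Euclidean) norm: 5.0
print(np.linalg.norm(x, 3))       # L3 norm: (27 + 64)**(1/3) ~= 4.498
print(np.linalg.norm(x, np.inf))  # sup-norm: 4.0
```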

2.3 Metric spaces


A metric space is a set S, together with a metric (distance function) ρ : S × S → R, such that
for all x, y, z ∈ S:

• ρ(x, y) ≥ 0, with equality iff x = y;

• ρ(x, y) = ρ(y, x);

• ρ(x, z) ≤ ρ(x, y) + ρ(y, z) (the triangle inequality)

It is standard to view any normed vector space (S, || · ||) as a metric space where the metric is taken
to be ρ(x, y) = ||x − y||.

3 Linear Independence
The n vectors $a_1, a_2, \dots, a_n$ are linearly dependent if some nontrivial linear combination of these vectors equals zero; that is, if there exist numbers $c_1, c_2, \dots, c_n$, not all zero, such that

$$c_1 a_1 + c_2 a_2 + \dots + c_n a_n = 0$$

If this equation holds only in the case $c_1 = c_2 = \dots = c_n = 0$, then the vectors are linearly independent. Equivalently, a set of vectors is linearly independent if none of them can be expressed as a linear combination of the others.
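One practical test stacks the vectors as the columns of a matrix and compares the rank (introduced in Section 4) with the number of vectors. A minimal numpy sketch (the values are arbitrary):

```python
import numpy as np

# Columns are the vectors to test; here a3 = a1 + a2.
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [0., 0., 0.]])

# Independent iff the rank equals the number of vectors.
print(np.linalg.matrix_rank(A) == A.shape[1])   # False: dependent set
```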

3.1 Linear Dependence and Systems of Linear Equations


Consider the system of m equations in n unknowns

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2 \\ &\ \ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= b_m \end{aligned}$$

We can write this in matrix form (Ax = b), or in vector form:

$$x_1 a_1 + \dots + x_n a_n = b$$

where $a_j$ is the jth column of A. We can prove that if the system has more than one solution, the vectors $a_1, \dots, a_n$ are linearly dependent. Suppose the system has two solutions, u and v. Then $u_1 a_1 + \dots + u_n a_n = b$ and $v_1 a_1 + \dots + v_n a_n = b$. Subtracting one from the other,

$$(u_1 - v_1) a_1 + \dots + (u_n - v_n) a_n = 0$$

If the two solutions are different, $(u_1 - v_1), \dots, (u_n - v_n)$ are not all equal to zero, and the vectors are linearly dependent. Thus if the system has more than one solution, the vectors $a_1, \dots, a_n$ are linearly dependent. Equivalently, if the vectors are linearly independent, the system has at most one solution.
A set of vectors $v_1, \dots, v_n$ in a vector space V is a basis iff it is linearly independent and spans V.

4 The Rank of a Matrix


The linear span of the rows of an m × n matrix A (a subspace of $\mathbb{R}^n$) is the row space of A.³ The linear span of the columns of A is the column space of A (a subspace of $\mathbb{R}^m$).

³ Recall that the linear span of a set of vectors $v_1, \dots, v_n$ is the set of all linear combinations of these vectors.

The row rank of A is the dimension of the row space of A.⁴ The column rank of A is the dimension of the column space of A.

⁴ Recall that the dimension of a vector space V, denoted dim V, is the number of vectors in a basis of V.

Theorem 4.1. The row rank and the column rank of a matrix A are equal. We call their value
the rank of A, r(A).

FMEA gives an alternative, equivalent definition: r(A) is the largest number of column vectors
in A that form a linearly independent set. This definition is equivalent because the maximum
number of linearly independent vectors in a set equals the dimension of the linear span of that set.
The rank of a matrix equals the rank of its transpose: $r(A) = r(A')$. It follows that the rank of a matrix is less than or equal to the smaller of its number of rows and its number of columns. A matrix has full rank if its rank equals the smaller of these two numbers.
A minor of A of order k is obtained by deleting all but k rows and k columns, and taking the
determinant of the resulting k × k matrix.
The rank r(A) of a matrix A equals the order of the largest minor of A that does not equal
zero.
The rank of a matrix is not affected by elementary row or column operations. Elementary row
(column) operations are:

1. Interchanging two rows (columns) of a matrix;

2. Multiplying a row (column) of a matrix by a number $a \neq 0$;

3. Adding a times row (column) j to row (column) k.

One way to find the rank of a matrix is to perform row or column operations until the number of linearly independent row or column vectors is clear.⁵

⁵ If you know how to perform row reduction, one way to find the rank of a matrix is to put the matrix in reduced echelon form. For an echelon-form matrix, only those rows which contain a pivot are linearly independent; likewise, only those columns which contain a pivot are independent. (This is one way to prove the theorem stated above.) Again, see LADW Chapter 2.
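A minimal numpy sketch of both points, computing a rank and checking invariance under a row operation (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],    # = 2 x (first row), so the rows are dependent
              [1., 0., 1.]])
print(np.linalg.matrix_rank(A))   # 2

# Elementary row operations leave the rank unchanged.
B = A.copy()
B[1] = B[1] - 2 * B[0]            # add (-2) x row 0 to row 1
print(np.linalg.matrix_rank(B))   # still 2
```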

5 Main Results on Linear Systems


Theorem 5.1. Suppose $x_1$ satisfies Ax = b. Let H be the set of all solutions to Ax = 0. Then the solutions of Ax = b are $\{x = x_1 + x_h : x_h \in H\}$.

Proof. Take $x_1, x_h$ such that $Ax_1 = b$ and $Ax_h = 0$. Let $x = x_1 + x_h$. Then we have

$$Ax = A(x_1 + x_h) = Ax_1 + Ax_h = b + 0 = b$$

So any $x = x_1 + x_h$, $x_h \in H$, solves Ax = b.

Conversely, let x solve Ax = b, and let $x_h = x - x_1$. Then

$$Ax_h = A(x - x_1) = b - b = 0$$

so $x_h \in H$. So any solution of Ax = b is $x_1 + x_h$ for some $x_h \in H$.

A corollary of this theorem is that, assuming Ax = b has some solution, this solution is unique
if and only if Ax = 0 has a unique solution (namely, x = 0).
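The structure "particular solution plus any homogeneous solution" can be checked numerically. A minimal sketch, using lstsq for one particular solution and the SVD for a null-space basis (the system is an arbitrary underdetermined example):

```python
import numpy as np

A = np.array([[1., 1., 0.],
              [0., 1., 1.]])   # 2 equations, 3 unknowns
b = np.array([2., 3.])

x1 = np.linalg.lstsq(A, b, rcond=None)[0]   # a particular solution of Ax = b

# Null-space basis from the SVD: rows of Vt beyond rank(A) span H.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
N = Vt[rank:].T                              # columns satisfy A @ n = 0

# x1 plus any element of H is again a solution.
x = x1 + 5.0 * N[:, 0]
print(np.allclose(A @ x, b))                 # True
```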
Consider again the system of m equations in n unknowns

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= b_2 \\ &\ \ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n &= b_m \end{aligned}$$

or, in matrix form, Ax = b, where A is the m × n coefficient matrix.


We define the augmented matrix of this system to be the m × (n + 1) matrix $A_b$ that contains A in the first n columns and b in the last column:

$$(A \mid b) = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} & b_1 \\ a_{21} & a_{22} & \dots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} & b_m \end{pmatrix} = A_b \quad \text{(using the notation in FMEA)}$$

Either $r(A_b) = r(A)$, or $r(A_b) = r(A) + 1$. (In general, adding columns to a matrix can never decrease the rank, and adding one column can increase the rank, that is, the maximum number of linearly independent columns, by at most one.)

Theorem 5.2. The system Ax = b has at least one solution if and only if the rank of A equals the rank of $A_b$.

Corollary 5.3. If A is n × n and has full rank, then Ax = b has a unique solution. (We already knew that, if this system had a solution, that solution would be unique; now we know it does indeed have a solution.)

Theorem 5.4. Suppose the system has solutions, with $r(A) = r(A_b) = k$.

• If k < m (the rank of these matrices is less than the number of equations) then m − k
equations are superfluous: if we choose any subsystem of equations corresponding to k linearly
independent rows, any solution to these equations also satisfies the remaining m−k equations.

• If k < n (the rank of these matrices is less than the number of unknowns) then there exist
n − k variables that can be freely chosen, with the values of the remaining k variables uniquely
determined by the choice of these n−k free variables. The system has n−k degrees of freedom.
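A minimal numpy sketch of both counts in Theorem 5.4 (the system is an arbitrary example with one redundant equation):

```python
import numpy as np

# Three equations, three unknowns, but the third equation is the sum of
# the first two, so r(A) = r(A_b) = k = 2 and the system is consistent.
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.]])
b = np.array([1., 2., 3.])
Ab = np.column_stack([A, b])

k = np.linalg.matrix_rank(A)
print(k == np.linalg.matrix_rank(Ab))            # True: solutions exist
print("superfluous equations:", A.shape[0] - k)  # m - k = 1
print("degrees of freedom:",    A.shape[1] - k)  # n - k = 1
```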

