CS3220 Lecture Notes: Singular Value Decomposition and Applications
1 From QR to SVD
We’ve just been looking at an orthogonal matrix factorization of an m×n matrix
A that gives us an orthogonal factor on the left:
A = QR
In the full-rank case the row space of a tall matrix or the column space of
a wide matrix are uninteresting, because either the rows (of a tall matrix) or
the columns (of a wide matrix) span their whole space. So we only ever need
to know about either the rows or the columns, and we can pick one of these
two factorizations. If the matrix is not full rank (it is rank deficient), then both
the row space and the column space are interesting (meaning that they are less
than the full space ℝ^m or ℝ^n), but we will still be stuck choosing one or the
other.
The next factorization we’ll look at, the SVD, has orthogonal factors on both
sides, and it works fine in the presence of rank deficiency:
A = U ΣV T (or AV = U Σ)
The SVD writes A as a product of two orthogonal transformations with a di-
agonal matrix (a scaling operation) in between. It says that we can replace
any transformation by a rotation from “input” coordinates into convenient co-
ordinates, followed by a simple scaling operation, followed by a rotation into
“output” coordinates. Furthermore, the diagonal scaling Σ comes out with its
elements sorted in decreasing order.
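To make this concrete, here is a minimal NumPy sketch (the matrix A is just an arbitrary example) that computes the factorization and checks that A = UΣV^T, that U and V are orthogonal, and that the singular values come out sorted:

import numpy as np

# An arbitrary example matrix; any m x n matrix works here.
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# full_matrices=True gives the full factorization: U is m x m, Vt holds V^T (n x n),
# and s holds the singular values in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x n diagonal matrix Sigma and check A = U Sigma V^T.
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)

print(np.allclose(A, U @ Sigma @ Vt))      # True: A = U Sigma V^T
print(np.allclose(U.T @ U, np.eye(3)),
      np.allclose(Vt @ Vt.T, np.eye(2)))   # True True: orthogonal factors
print(np.all(np.diff(s) <= 0))             # True: sigma_1 >= sigma_2 >= ...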
In two dimensions, A maps the unit circle to an ellipse. The left singular vectors u1, u2 are along the major and
minor axes of that ellipse (being on the left they live in the “output” space).
The right singular vectors v1, v2 are the vectors that get mapped to the major
and minor axes (being on the right they live in the “input” space).
If we break the transformation down into these three stages we see a circle
being rotated to align the vs with the coordinate axes, then scaled along those
axes, then rotated to align the ellipse with the us.
Another way to say some of this is that Avi = σi ui, which you can see in
this 2D example:
\[
V^T v_1 = \begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix} v_1
        = \begin{bmatrix} 1 \\ 0 \end{bmatrix};
\qquad
U\Sigma \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \sigma_1 u_1
\]
We can easily promote this idea to 3D, though it becomes harder to draw:
\[
A = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix}
    \begin{bmatrix} a & & \\ & b & \\ & & 0 \end{bmatrix}
    \begin{bmatrix} v_1^T \\ v_2^T \\ v_3^T \end{bmatrix}
\]
This means one of the singular values (the last one, since we sort them in
decreasing order) is zero: the unit sphere gets mapped to a flat ellipse lying in
a plane, and the last left singular vector u3 is the normal to that ellipse.
A rank-deficient matrix is also one that has a nontrivial null space: some
direction that gets mapped to zero. In this case, that vector is v3 , since
\[
V^T v_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\qquad\text{and}\qquad
\Sigma \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
     = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
More generally, suppose A has rank r, and partition U = [U1 U2] and V = [V1 V2]
so that U1 and V1 hold the first r columns. Then:
• U1 is a basis for ran(A), aka span(A), aka the column space of A (just like
in QR) (dim = r). After multiplication by Σ only the first r entries can
be nonzero, so only vectors in span(U1) can be produced by multiplication
with A.
• U2 is a basis for ran(A)⊥ , aka null(AT ) (dim = m − r). Since all the
entries of ΣV T x corresponding to those columns of U are zero, no vectors
in span(U2 ) can be generated by multiplication with A.
• V2 is a basis for null(A), aka ran(A)⊥ (dim = n − r). Since any vector
in span(V2 ) will end up multiplying the zero singular values, it will get
mapped to zero.
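Numerically these bases can be read off from the SVD by splitting U and V at the rank r. Here is a small sketch; the example matrix is deliberately rank-deficient (its third column is the sum of the first two):

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 9.0],
              [7.0, 8.0, 15.0]])   # rank 2: column 3 = column 1 + column 2

U, s, Vt = np.linalg.svd(A)
tol = 1e-10 * s[0]                 # generous cutoff; the "true" third singular value is 0
r = int(np.sum(s > tol))           # numerical rank, here r = 2

U1, U2 = U[:, :r], U[:, r:]        # bases for ran(A) and ran(A)-perp = null(A^T)
V1, V2 = Vt[:r, :].T, Vt[r:, :].T  # bases for ran(A^T) and null(A)

print(r)                           # 2
print(np.allclose(A @ V2, 0))      # True: V2 spans the null space of A
print(np.allclose(U2.T @ A, 0))    # True: nothing A produces has a U2 component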
Throwing away U2 and V2 (and the corresponding zero block of Σ) leaves the reduced factorization A = U1 Σ1 V1^T. This is exactly analogous to the “skinny” QR factorization.
For a square, full-rank system Ax = b, the SVD gives the solution directly:
\[
x = V\Sigma^{-1}U^T b.
\]
For the least squares problem min_x ‖Ax − b‖, the orthogonality of U lets us write
\[
\|Ax - b\|^2 = \|U^T(Ax - b)\|^2
  = \left\| \begin{bmatrix} \Sigma_1 & 0 \end{bmatrix} V^T x - U_1^T b \right\|^2
    + \|U_2^T b\|^2.
\]
The first term can be made zero; the second is the residual. But because of rank
deficiency, making the first term zero is an underdetermined problem: the matrix
[Σ1 0]V^T is wider than tall. If we let y = V^T x and c = U1^T b, then split y into
[y1; y2], the system to be solved is
\[
\begin{bmatrix} \Sigma_1 & 0 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = c,
\qquad\text{i.e.,}\qquad
\Sigma_1 y_1 = c.
\]
Since y2 does not change the answer we’ll go for the minimum-norm solution
and set y2 = 0. Then x = V1 y1 .
So the solution process boils down to just:
1. Compute SVD.
2. c = U1^T b.
3. y1 = Σ1^{-1} c.
4. x = V1 y1.
Or, more briefly,
\[
x = V_1 \Sigma_1^{-1} U_1^T b. \tag{1}
\]
This is just like the full-rank square solution, but with U2 and V2 left out.
I can also explain this process in words:
1. Transform b into the U basis.
2. Throw out components we can’t produce anyway, since there’s no point
trying to match them.
3. Undo the scaling effect of A.
4. Fill in zeros for the components that A doesn’t care about, since it doesn’t
matter what they are.
5. Transform out of the V basis to obtain x.
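Here is a short NumPy sketch of this recipe (the matrix, right-hand side, and rank cutoff are arbitrary choices for illustration), checked against np.linalg.lstsq, which also returns the minimum-norm solution for rank-deficient problems:

import numpy as np

def svd_solve(A, b, cutoff=1e-12):
    """Minimum-norm least-squares solution x = V1 Sigma1^{-1} U1^T b."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # 1. compute the SVD
    r = int(np.sum(s > cutoff * s[0]))                #    (decide on a rank)
    c = U[:, :r].T @ b                                # 2. c = U1^T b
    y1 = c / s[:r]                                    # 3. y1 = Sigma1^{-1} c
    return Vt[:r, :].T @ y1                           # 4. x = V1 y1  (y2 = 0)

# Rank-deficient example: third column = first + second.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])

x = svd_solve(A, b)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_ref))       # True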
3.3 Pseudoinverse
If A is square and full-rank it has an inverse. Its SVD has a square Σ that has
nonzero entries all the way down the diagonal. We can invert Σ easily by taking
the reciprocals of these diagonal entries:
\[
\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n), \qquad
\Sigma^{-1} = \operatorname{diag}(1/\sigma_1, 1/\sigma_2, \ldots, 1/\sigma_n),
\]
so that
\[
A^{-1} = (U\Sigma V^T)^{-1} = V\Sigma^{-1}U^T. \tag{2}
\]
So the SVD of A−1 is closely related to the SVD of A. We simply have swapped
the roles of U and V and inverted the scale factors in Σ. Earlier we interpreted
U and V as bases for the “output” and “input” spaces of A, so it makes sense
that they are exchanged in the inverse. A nice thing about this way of getting
the inverse is that it’s easy to see before we start whether and where things will
go badly: the only thing that can go wrong is for the singular values to be zero
or nearly zero. In our other procedures for the inverse we just plowed ahead and
waited to see if things would blow up. (Of course, don’t forget that computing
the SVD is many times more expensive than a procedure like LU or QR.)
If A is non-square and/or rank deficient, it no longer has an inverse, but we
can go ahead and generalize the SVD-based inverse, modeling it on the rank-
deficient least squares process we just saw. Looking back at (1), we can see that
we are treating the matrix
\[
A^+ = V_1 \Sigma_1^{-1} U_1^T \tag{3}
\]
like an inverse for A, in the sense that we solve the problem Ax ≈ b by writing
x = A+ b. It is also just like the formula in (2) for A−1 except that we have kept
only the first r rows and/or columns of each matrix.
The matrix A+ is known as the pseudoinverse of A. Multiplying a vector by
the pseudoinverse solves a least-squares problem and gives you the minimum-
norm solution. Note that writing down a pseudoinverse for a matrix always
involves deciding on a rank for the matrix first.
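For example, the truncated formula (3) can be coded up directly and compared against np.linalg.pinv, which also computes the pseudoinverse from the SVD (the example matrix below is arbitrary):

import numpy as np

def pseudoinverse(A, cutoff=1e-12):
    """A^+ = V1 Sigma1^{-1} U1^T, keeping only singular values above a cutoff."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > cutoff * s[0]))    # deciding on a rank is part of the definition
    return Vt[:r, :].T @ np.diag(1.0 / s[:r]) @ U[:, :r].T

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 5.0]])               # tall; this particular example is full rank

print(np.allclose(pseudoinverse(A), np.linalg.pinv(A)))   # True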
There’s another way in which the pseudoinverse is like an inverse: it is the
matrix that comes closest to acting like an inverse, in the sense that multiplying
A by A+ almost produces the identity:
\[
A^+ = \arg\min_{X \in \mathbb{R}^{n\times m}} \|AX - I_m\|_F
    = \arg\min_{X \in \mathbb{R}^{n\times m}} \|XA - I_n\|_F .
\]
To see this, note that the Frobenius norm is unchanged by orthogonal transformations,
and change variables to X′ = V^T X U (so that X = V X′ U^T). Then
\[
AX - I = (U\Sigma V^T)(V X' U^T) - I
\]
\[
U^T(AX - I)U = \Sigma X' - I,
\]
so ‖AX − I‖_F = ‖ΣX′ − I‖_F. The best we can do is to choose X′ = Σ+, the n × m
diagonal matrix with entries 1/σ1, . . . , 1/σr on the diagonal and zeros elsewhere,
so that
\[
\Sigma\Sigma^+ = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}.
\]
Now that we know the optimal X′ we can compute the corresponding X, which
is A+, and you can easily verify that A+ comes out to the expression in (3).
Here are some facts about A+ for special cases:
• A full rank, square: A+ = A^{-1} (as it has to be if it’s going to be the closest
thing to an inverse, when the inverse in fact exists!)
• A full rank, tall: A+ A = I_n but AA+ ≠ I_m. By the normal equations
we learned earlier, A+ = (A^T A)^{-1} A^T, since both matrices compute the
(unique) solution to a full-rank least squares problem.
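These facts are easy to check numerically; a quick sketch with an arbitrary tall, full-rank example:

import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])     # tall, full column rank

Ap = np.linalg.pinv(A)

print(np.allclose(Ap @ A, np.eye(2)))                    # True:  A^+ A  = I_n
print(np.allclose(A @ Ap, np.eye(3)))                    # False: A A^+ != I_m
print(np.allclose(Ap, np.linalg.solve(A.T @ A, A.T)))    # True:  A^+ = (A^T A)^{-1} A^T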
3.4 Sensitivity and conditioning
How do errors in the data propagate into the least squares solution? Recall the
square, full-rank case first: a perturbation ∆b in the right-hand side produces a
perturbation ∆x in the solution, with
\[
\Delta x = A^{-1}\Delta b, \qquad \|\Delta x\| \le \|A^{-1}\|\,\|\Delta b\|.
\]
Likewise,
\[
b = Ax, \qquad \|b\| \le \|A\|\,\|x\|, \qquad \|x\| \ge \|b\|/\|A\|,
\]
so dividing gives ‖∆x‖/‖x‖ ≤ ‖A‖‖A^{-1}‖ · ‖∆b‖/‖b‖ = cond(A) ‖∆b‖/‖b‖.
In the (full rank) least-squares case, x = A+ b and ∆x = A+ ∆b, but Ax ≠ b.
From the SVD of A+ it’s clear that ‖A+‖ = 1/σn, and we’ll define cond(A) =
σ1/σn = ‖A‖‖A+‖. We can then follow roughly the same argument to get
\[
\Delta x = A^{+}\Delta b, \qquad \|\Delta x\| \le \|A^{+}\|\,\|\Delta b\|.
\]
Likewise,
\[
\|Ax\| \le \|A\|\,\|x\|, \qquad \|x\| \ge \|Ax\|/\|A\|.
\]
So the condition number of the least squares problem is cond(A) times the ratio
‖b‖/‖Ax‖. This ratio is often written 1/cos θ, where θ is the angle between the
vector b and the range of A. If b ∈ ran(A) then cos θ = 1 and the conditioning
is the same as for the equality system. As b approaches being orthogonal to ran(A),
cos θ decreases toward zero and the condition number of the least squares problem
becomes arbitrarily large. This is not too surprising: in this case the output of
the problem (x) becomes small, but the propagated error depends only on ‖A+‖,
which doesn’t depend on b, so the relative error can be very large.
On the other hand, it takes a special b to have cos θ very near zero. The
sensitivity of the solution to errors in b is similar to the equality case for “non-
special” values of b.
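Here is a small numerical experiment illustrating this (the matrix and vectors are arbitrary examples): the same tiny perturbation ∆b produces a vastly larger relative change in x when b has only a tiny component in ran(A):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))        # random tall matrix, (almost surely) full rank
Ap = np.linalg.pinv(A)

def rel_change(b, db):
    """Relative change in the least-squares solution caused by perturbing b by db."""
    x = Ap @ b
    return np.linalg.norm(Ap @ db) / np.linalg.norm(x)

b_in = A @ np.array([1.0, 1.0, 1.0])    # b exactly in ran(A): cos(theta) = 1
q = b_in / np.linalg.norm(b_in)

b_out = rng.standard_normal(20)
b_out -= A @ (Ap @ b_out)               # remove the ran(A) component entirely ...
b_out += 1e-6 * q                       # ... then put back only a tiny in-range part

db = 1e-8 * rng.standard_normal(20)     # same perturbation in both experiments
print(rel_change(b_in, db))             # tiny relative change
print(rel_change(b_out, db))            # many orders of magnitude larger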
Let’s examine the effects of errors in A. In the equality case we saw that
these errors behave the same as errors in b. This time we change A to A + E
and ask how we have to change x to maintain equality:
(A + E)(x + ∆x) = b
Ax + Ex + A∆x + E∆x = b
A∆x ≈ −Ex
In this last step we have discarded E∆x on the basis that it is second-order
in terms of the errors and is therefore negligible for small errors relative to the
other terms, and we have canceled Ax with b. Carrying on and applying the
usual matrix-norm bounds:
\[
\Delta x \approx -A^{-1} E x, \qquad
\|\Delta x\| \le \|A^{-1}\|\,\|E\|\,\|x\|, \qquad
\frac{\|\Delta x\|}{\|x\|} \le \|A^{-1}\|\,\|E\| = \mathrm{cond}(A)\,\frac{\|E\|}{\|A\|}.
\]
So relative errors in A propagate to relative errors in x the same way as errors
in b do.
Now let’s look at this in the least-squares case. For this we will use the
normal equations, and we’ll need a fact about the matrix AT A:
\[
A^T A = V\Sigma^T U^T U\,\Sigma V^T = V(\Sigma^T\Sigma)V^T
\]
(where U, Σ, V is the SVD of A). So its singular values are the squares of A’s
singular values. This means ‖A^T A‖ = σ1² = ‖A‖², ‖(A^T A)^{-1}‖ = 1/σn² =
‖A+‖², and therefore cond(A^T A) = σ1²/σn² = cond(A)².
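You can see the squaring numerically (the random matrix below is an arbitrary example):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 5))

print(np.linalg.cond(A))          # cond(A) = sigma_1 / sigma_n
print(np.linalg.cond(A.T @ A))    # matches cond(A)**2 up to roundoff
print(np.linalg.cond(A) ** 2)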
Now, following the approach used for equality systems, we ask about the
effect of changing A to A + E, using the normal equations to get an equation
for its effect on ∆x:
\[
(A + E)^T(A + E)(x + \Delta x) = (A + E)^T b
\]
\[
A^T A\,x + A^T A\,\Delta x + A^T E\,x + E^T A\,x \approx A^T b + E^T b
\]
In the second line we have discarded four terms that are second or third order
in the errors (e.g. A^T E ∆x). Canceling A^T A x with A^T b we get
\[
A^T A\,\Delta x \approx -A^T E\,x - E^T A\,x + E^T b = -A^T E\,x + E^T r,
\]
where r = b − Ax is the residual, so
\[
\Delta x \approx -(A^T A)^{-1} A^T E\,x + (A^T A)^{-1} E^T r,
\qquad
\frac{\|\Delta x\|}{\|x\|} \lesssim
\left(\mathrm{cond}(A) + \mathrm{cond}(A)^2\,\frac{\|r\|}{\|Ax\|}\right)\frac{\|E\|}{\|A\|},
\]
so we see in this case that the condition number for error in x propagated from
error in E is the sum of two pieces: the familiar condition number of A, and a
second term proportional to the square of that condition number and the ratio
‖r‖/‖Ax‖. This ratio is known as tan θ since it is the tangent of the angle
between b and the range of A. This one is more troubling because it will be
large except for “special” values of b that are very near ran(A).
3.5 Numerical rank
Sometimes we have to deal with rank-deficient matrices that are corrupted by
noise. In fact we always have to do this, because when you take some rank-
deficient matrix and represent it in floating point numbers, you’ll usually end
up with a full-rank matrix! (This is just like representing a point on a plane
in 3D: it’s only if we are lucky that a point with floating-point coordinates happens
to lie exactly on the plane; usually it has to be approximated by one near the plane
instead.)
When our matrix comes from some kind of measurement that has uncertainty
associated with it, it will be quite far from rank-deficient even if the underlying
“true” matrix is rank-deficient.
SVD is a good tool for dealing with numerical rank issues because it answers
the question:
Given a rank-k matrix, how far is it from the nearest rank-(k − 1)
matrix?
That is, how rank-k is it? Or: how precisely do we have to know the matrix in
order to be sure that it’s not actually rank-(k − 1)?
SVD lets us get at these questions because, since rank and distances are
unaffected by the orthogonal transformations U and V , numerical rank questions
about A are reduced to questions about the diagonal matrix Σ.
If Σ has k nonzero entries on its diagonal, we can make it rank-(k − 1) by zeroing
the k-th diagonal entry. The distance between the two matrices is equal to the
entry we set to zero, namely σk, in both the 2-norm and the Frobenius norm. It’s not
surprising (though I won’t prove it) that this is the closest rank-(k − 1) matrix.
If we have a rank-k matrix and perturb it by adding a noise matrix that has
norm less than ε, it can’t change the zero singular values by more than ε. This
means noise has a nicely definable effect on our ability to detect rank: if the
singular values are more than ε we know they are for real (they did not just
come from the noise), and if they are less than ε we don’t know whether they
came from the “real” matrix or from the noise.
We can roll these ideas up in a generalization of rank known as numeri-
cal rank: whereas rank(A) is the (exact) rank of the matrix A, we can write
rank(A, ε) for the highest rank we’re sure A has if it could be corrupted by noise
of norm ε. That is, rank(A, ε) is the highest r for which there is no matrix of
rank less than r within a distance ε of A (equivalently, the number of singular
values of A that are larger than ε).
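In code this just means counting singular values above ε; the little function below is a sketch (the name numerical_rank and the choice of ε are ours to make):

import numpy as np

def numerical_rank(A, eps):
    """rank(A, eps): the number of singular values strictly greater than eps."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > eps))

# A rank-1 matrix plus a little noise is technically full rank ...
rng = np.random.default_rng(2)
u, v = rng.standard_normal(6), rng.standard_normal(4)
A = np.outer(u, v) + 1e-8 * rng.standard_normal((6, 4))

print(np.linalg.matrix_rank(A))       # 4 with a machine-precision tolerance
print(numerical_rank(A, eps=1e-6))    # 1 once we allow for noise of norm ~1e-8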
[Aside: recall the outer product version of matrix multiply, from way back
at the beginning:
\[
C = AB, \qquad c_{ij} = \sum_k a_{ik} b_{kj}, \qquad C = \sum_k a_{:k}\, b_{k:}.
\]
Applied to the SVD this gives
\[
A = U\Sigma V^T = \sum_i \sigma_i\, u_i v_i^T.
\]
This is a sum of a bunch of rank-1 matrices (outer product matrices are always
rank 1), and the norms of these matrices are steadily decreasing, since the norm
of the i-th term is σi.
By truncating this sum and including only the first r terms, we wind up
with a matrix of rank r that approximates A. In fact, it is the best rank-r
approximation, in both the 2-norm and the Frobenius norm.
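A quick check of this (the matrix and the truncation level r = 3 are arbitrary): the 2-norm error equals the first discarded singular value, and the Frobenius-norm error is the root-sum-square of all the discarded ones:

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 6))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

r = 3
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]   # keep only the first r terms of the sum

print(np.linalg.norm(A - A_r, 2), s[r])                           # equal: sigma_{r+1}
print(np.linalg.norm(A - A_r, 'fro'), np.sqrt(np.sum(s[r:]**2)))  # equal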
Interpreting this low-rank approximation in terms of column vectors leads
to a powerful and widely used statistical tool known as principal component
analysis. If we have a lot of vectors that we think are somehow related—they
came from some process that only generates certain kinds of vectors—then one
way to try to identify the structure in the data is to try and find a subspace that
all the vectors are close to. If the subspace fits the data well, then that means
the data are low rank: they don’t have as many degrees of freedom as there
are measurements for each data point. Then, with a basis for that subspace in
hand, we can talk about our data in terms of combinations of those basis vectors,
which is more efficient, especially if we can make do with a fairly small number
of vectors. It can make the data easier to understand by reducing the dimension
to where we can manage to think about it, or it can make computations more
efficient by allowing us to work with the (short) vectors of coefficients in place
of the (long) original vectors.
To be a little more specific, let’s say we have a bunch of m-vectors called
a1 , . . . , an . Stack them, columnwise, into a matrix A, and then compute the
SVD of A.
A = U ΣV T
The first n columns of U are the principal components of the data vectors
a1, . . . , an. The entries of Σ tell us the relative importance of the principal
components. The rows of V, scaled by the corresponding singular values, give the
coefficients of the data vectors in the principal components basis. Once we have
this representation of the data, we
can look at the singular values to see how many of the principal components
seem to be important. If we use only the first k principal components, leaving
out the remaining n − k components, we’ve chosen a k-dimensional subspace,
and the error of approximating our data vectors by their projections into that
subspace is limited by the magnitude of the singular values belonging to the
components we left off. This subspace is optimal in the 2-norm sense, and in
the F-norm sense.
One adjustment to this process, which sacrifices a bit of optimality in ex-
change for more meaningful principal components, is to first subtract the mean
of the ai s from each ai . Of course, you need to keep the mean vector around and
add it back in whenever you are approximating vectors from the components.
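Putting the recipe together, here is a small sketch on synthetic data (the data-generation step is only there to manufacture vectors that really do lie near a 2-dimensional subspace): subtract the mean, compute the SVD, keep the first k components, and reconstruct:

import numpy as np

rng = np.random.default_rng(4)
# 200 data vectors in R^10 that vary (almost) only in 2 directions.
directions = rng.standard_normal((10, 2))
coeffs = rng.standard_normal((2, 200))
A = directions @ coeffs + 0.01 * rng.standard_normal((10, 200))  # columns are data points

mean = A.mean(axis=1, keepdims=True)
Ac = A - mean                                  # subtract the mean of the a_i's

U, s, Vt = np.linalg.svd(Ac, full_matrices=False)
print(s[:4])                                   # first two dominate: data are nearly rank 2

k = 2
components = U[:, :k]                          # principal components
coords = components.T @ Ac                     # k-dimensional coordinates of each data point
approx = mean + components @ coords            # reconstruct, adding the mean back in
print(np.linalg.norm(A - approx) / np.linalg.norm(A))   # small relative error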
Sources
• Our textbook: Heath, Scientific Computing: An introductory survey, 2e.
Section 3.6.
• Golub and Van Loan, Matrix Computations, Third edition. Johns Hopkins
Univ. Press, 1996. Chapter 5.