VIETNAM GENERAL CONFEDERATION OF LABOR
TON DUC THANG UNIVERSITY
FACULTY OF INFORMATION TECHNOLOGY
FINAL REPORT
Advised by
Mrs. Truong Thi Kim Tien
HO CHI MINH CITY, 2025
ACKNOWLEDGEMENT
We would like to express our sincere gratitude to Mrs. Truong Thi Kim Tien, our
instructor and mentor, for her valuable guidance and support throughout the final report of
our project. She has been very helpful and patient in providing us with constructive
feedback and suggestions to improve our work. We have learned a lot from her expertise
and experience. We are honored and privileged to have her as our teacher and supervisor.
Huynh Quoc Dai
Le Duc Anh
DECLARATION OF AUTHORSHIP
We hereby declare that this is our own project, guided by Mrs. Truong Thi
Kim Tien; the research content and results contained herein are truthful and have not been
published in any form before. The data in the tables used for analysis, comments, and evaluation
were collected by the main author from different sources, which are clearly stated in the
reference section.
In addition, the project also uses some comments, assessments, and data
of other authors and organizations, with citations and annotated sources.
If any violation is found, we will take full responsibility for the content of
our project. Ton Duc Thang University is not involved in any copyright
infringements we may have committed during the implementation process (if any).
Huynh Quoc Dai
Le Duc Anh
CHAPTER 1: SOLUTION
- Let A be an m×n real matrix. Before introducing the concept of Singular Value
Decomposition (SVD), we first define the singular values of A, which are
fundamental components of the SVD.
- Definition:
The singular values of A are defined as the square roots of the eigenvalues of the
matrix AᵀA (if m ≤ n) or AAᵀ (if m > n). Since AᵀA and AAᵀ are symmetric and
positive semi-definite matrices, all of their eigenvalues are real and non-negative:
λ₁ ≥ λ₂ ≥ … ≥ λₙ ≥ 0, and σᵢ = √λᵢ.
Indeed, if x is an eigenvector of AᵀA with eigenvalue λ, then λ‖x‖² = xᵀAᵀAx = ‖Ax‖².
Since ‖Ax‖² ≥ 0 and ‖x‖² > 0, it follows that λ ≥ 0, hence σᵢ ≥ 0 for all i.
+ This is because A and AᵀA share the same kernel, so their ranks are equal. Therefore,
the number of non-zero singular values (i.e., positive σᵢ) equals rank(A).
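As a quick numerical check of the definition above (our own illustration, with an arbitrary matrix, not one from the report), the square roots of the eigenvalues of AᵀA agree with the singular values returned by a library SVD, and counting the non-zero ones gives rank(A):

import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])                 # arbitrary 3x2 example matrix
# A^T A is symmetric positive semi-definite, so eigvalsh applies.
lams = np.linalg.eigvalsh(A.T @ A)         # eigenvalues, ascending order
sigmas = np.sqrt(lams[::-1])               # singular values, descending
print(sigmas)                              # [4.2426... 2.0]
print(np.linalg.svd(A, compute_uv=False))  # same values from the library SVD
print(int((sigmas > 1e-12).sum()))         # 2 = rank(A)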
Geometric Interpretation
+The singular values describe how A stretches or compresses unit vectors in ℝⁿ.
Specifically:
+ The largest singular value σ₁ equals the maximum value of ‖Ax‖ over unit vectors x:
max{‖Ax‖ : ‖x‖ = 1} = √λ₁ = σ₁
+Similarly, σ₂ is the maximum of ‖Ax‖ for unit vectors orthogonal to the one that
gives σ₁, and so on.
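This can be sanity-checked numerically (a sketch with a random matrix of our choosing): no unit vector gives ‖Ax‖ above σ₁, and the first right singular vector v₁ attains it.

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))                  # arbitrary example matrix
sigma1 = np.linalg.svd(A, compute_uv=False)[0]
# Evaluate ||Ax|| for many random unit vectors x in R^4.
X = rng.normal(size=(100_000, 4))
X /= np.linalg.norm(X, axis=1, keepdims=True)
print(np.linalg.norm(X @ A.T, axis=1).max() <= sigma1 + 1e-12)  # True
v1 = np.linalg.svd(A)[2][0]                  # first right singular vector
print(np.isclose(np.linalg.norm(A @ v1), sigma1))               # True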
The full decomposition is:
A = U Σ Vᵀ
- Where:
U ∈ ℝᵐˣᵐ and V ∈ ℝⁿˣⁿ are orthogonal matrices whose columns are the left and right
singular vectors, respectively, and
Σ = [ σ₁ 0 … 0 ]
    [ 0 σ₂ … 0 ]
    [ … … … … ]
    [ 0 0 … σᵣ ] ∈ ℝᵐˣⁿ
- All off-diagonal entries are zero, and singular values appear on the diagonal in
descending order.
Interpretation
+ Avᵢ = σᵢuᵢ, where vᵢ is a right singular vector and uᵢ is the corresponding left singular vector.
- For a symmetric matrix, the SVD is closely related to the eigendecomposition:
A = PDPᵀ
Where:
P is an orthogonal matrix of eigenvectors and D is the diagonal matrix of eigenvalues λᵢ.
+ In this case, the singular values of A are the absolute values of the eigenvalues:
σᵢ = |λᵢ|
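A small check of this fact (again our own example; the symmetric matrix below is arbitrary and has a negative eigenvalue):

import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, -3.0]])                # symmetric, indefinite
lams = np.linalg.eigvalsh(A)                # real eigenvalues
print(np.sort(np.abs(lams))[::-1])          # |λᵢ| in descending order
print(np.linalg.svd(A, compute_uv=False))   # identical: σᵢ = |λᵢ|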
To compute A = U Σ Vᵀ in practice, we proceed as follows.
Step-by-Step Procedure
Step 1. Compute AᵀA.
Step 2. Find the eigenvalues λ₁ ≥ λ₂ ≥ … ≥ λₙ ≥ 0 of AᵀA.
Step 3. Find a corresponding orthonormal set of eigenvectors v₁, …, vₙ and let
V = [v₁, v₂, …, vₙ]. Then V is an n × n orthogonal matrix.
Step 4. Build the diagonal matrix Σ ∈ ℝᵐˣⁿ, where the diagonal entries are σᵢ = √λᵢ.
Step 5. For each σᵢ > 0, compute Avᵢ and normalize to obtain uᵢ = (1/σᵢ)Avᵢ; complete
these to an orthonormal basis of ℝᵐ to form U.
Step 6. Verify that A = U Σ Vᵀ.
+ This confirms that the factorization is valid and that the resulting matrices are
consistent with the definition of SVD.
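The whole procedure can be followed literally in numpy. A minimal sketch, assuming a full-rank square matrix so that Step 5 never divides by zero (the example matrix is ours, not the report's):

import numpy as np

A = np.array([[4.0, 0.0],
              [3.0, -5.0]])                 # arbitrary full-rank 2x2 matrix
# Steps 1-3: eigen-decompose A^T A to get the eigenvalues and V.
lams, V = np.linalg.eigh(A.T @ A)           # ascending order
order = np.argsort(lams)[::-1]              # reorder to descending
lams, V = lams[order], V[:, order]
# Step 4: Sigma holds sigma_i = sqrt(lambda_i) on the diagonal.
Sigma = np.diag(np.sqrt(lams))
# Step 5: u_i = (1/sigma_i) A v_i (all sigma_i > 0 here).
U = A @ V / np.sqrt(lams)
# Step 6: verify the factorization.
print(np.allclose(A, U @ Sigma @ V.T))      # True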
+The Singular Value Decomposition (SVD) is a powerful and versatile tool in linear
algebra with wide applications across mathematics, computer science, and
engineering. Below are five practical and theoretical applications of SVD, as
discussed in the course.
Rank Estimation
+SVD helps determine the numerical rank of a matrix by identifying how many
singular values are significantly greater than zero. In real-world data (such as
measurements), noise may cause all singular values to be non-zero. However, only a
few large singular values indicate the true dimensional structure, while the rest
correspond to noise.
For example, if only two singular values are large and the rest are near zero, the
matrix is close to having rank 2. This is especially useful when analyzing noisy
data.
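A sketch of this idea on synthetic data (the tolerance below is an illustrative choice, not a universal rule):

import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 40))  # true rank 2
noisy = B + 1e-6 * rng.normal(size=B.shape)              # add measurement noise
s = np.linalg.svd(noisy, compute_uv=False)
tol = s[0] * 1e-4                             # relative threshold (illustrative)
print(int((s > tol).sum()))                   # 2: two significant singular values
print(np.linalg.matrix_rank(noisy, tol=tol))  # 2 as well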
Low-Rank Approximation
Using only the top s < r singular values and corresponding vectors, we can construct a
low-rank approximation of the matrix:
Aₛ = σ₁u₁v₁ᵀ + σ₂u₂v₂ᵀ + … + σₛuₛvₛᵀ
+ This technique is commonly used to compress data while preserving the most
important features. By the Eckart–Young–Mirsky theorem, Aₛ is the best possible
rank-s approximation of A in both the 2-norm and the Frobenius norm.
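A sketch verifying both error formulas on a random matrix (the choice s = 2 is arbitrary):

import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 5))
U, s, Vt = np.linalg.svd(A)
k = 2                                        # keep the top s = 2 triplets
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # σ₁u₁v₁ᵀ + σ₂u₂v₂ᵀ
# 2-norm error is σ₃; Frobenius error is the sqrt of the remaining σᵢ² summed.
print(np.isclose(np.linalg.norm(A - A_k, 2), s[k]))      # True
print(np.isclose(np.linalg.norm(A - A_k, 'fro'),
                 np.sqrt((s[k:] ** 2).sum())))           # True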
Image Compression
+ A grayscale image is simply a matrix of pixel intensities, so storing only the top s
singular values and vectors compresses the image while keeping its dominant visual structure.
Solving Ill-Conditioned Systems
+ In systems where the matrix is nearly singular (ill-conditioned), direct methods like
Gaussian elimination are unstable. SVD allows for a numerically stable solution by
inverting only the significant singular values and ignoring those close to zero.
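A minimal sketch of this, using numpy's pinv, whose rcond argument implements exactly such a cut-off (the near-singular system below is our own example):

import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-12]])          # nearly singular matrix
b = np.array([2.0, 2.0])
# pinv inverts only singular values above rcond * sigma_max.
x = np.linalg.pinv(A, rcond=1e-8) @ b
print(x)                                    # ~[1. 1.]: a stable least-squares solution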
Using Python or symbolic tools (or numerical approximation), the eigenvalues are
approximately:
+Note that the second row is exactly half of the first row, so rank = 1.
All other singular values are zero, and the corresponding vectors can be chosen arbitrarily
(to complete the orthonormal bases).
We have:
+ Definition
A = U₁ Σ₁ V₁ᵀ
+ Where U₁, Σ₁, and V₁ keep only the singular vectors and singular values associated
with σᵢ > 0 (their dimensions are given precisely below).
+ This decomposition is a more compact form of the full SVD, which includes zero
singular values and additional orthonormal vectors that are not essential for
reconstructing the original matrix.
+We now compute the compact (reduced) singular value decomposition for both
example matrices from Sections 5 and 6. The compact SVD retains only the non-zero
singular values and their corresponding singular vectors.
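A sketch of computing a compact SVD numerically. The matrix is a stand-in: rank 1 with its second row equal to half its first, mirroring the Section 6 example (the actual entries are our assumption):

import numpy as np

A = np.array([[2.0, 4.0],
              [1.0, 2.0]])                  # rank 1: row 2 is half of row 1
U, s, Vt = np.linalg.svd(A)
r = int((s > 1e-12).sum())                  # numerical rank, here r = 1
U1, S1, V1t = U[:, :r], np.diag(s[:r]), Vt[:r, :]
print(r)                                    # 1
print(np.allclose(A, U1 @ S1 @ V1t))        # True: the compact SVD reconstructs A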
For the first example matrix (from Section 5), the eigenvalues of AᵀA give the singular
values. We retain:
σ₁ ≈ 6.96
σ₂ ≈ 2.80
Step 5: Compute Av₁ and Av₂, and normalize both to obtain orthonormal vectors u₁, u₂;
then form the matrix U₁ ∈ ℝ²ˣ². This yields the compact decomposition
A = U₁ Σ₁ V₁ᵀ
As shown in Section 6, the matrix has rank r = 1. Its compact SVD keeps only the
first singular value and the associated vectors:
A = U₁ Σ₁ V₁ᵀ
+This reduced form captures the exact structure of the rank-1 matrix using minimal
components.
Definition
A = U₁ Σ₁ V₁ᵀ
Where:
U₁ ∈ ℝᵐˣʳ is a matrix whose columns are the left singular vectors corresponding
to the non-zero singular values,
Σ₁ ∈ ℝʳˣʳ is a diagonal matrix with singular values σ₁ ≥ σ₂ ≥ … ≥ σᵣ > 0, and
V₁ ∈ ℝⁿˣʳ contains the corresponding right singular vectors.
For s < r, the truncated SVD keeps only the top s singular values and their vectors:
Aₛ = Uₛ Σₛ Vₛᵀ
Where Uₛ ∈ ℝᵐˣˢ, Σₛ ∈ ℝˢˣˢ, and Vₛ ∈ ℝⁿˣˢ consist of the first s columns (and values)
of U₁, Σ₁, and V₁.
2-norm error: ‖A − Aₛ‖₂ = σₛ₊₁
+This property makes truncated SVD highly useful in data science and machine
learning applications.
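As one such usage, a sketch with scikit-learn's TruncatedSVD (assuming scikit-learn is available; the data is random and purely illustrative):

import numpy as np
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 20))              # rows = samples, columns = features
svd = TruncatedSVD(n_components=2, random_state=0)
Z = svd.fit_transform(X)                    # 100 x 2 reduced representation
print(Z.shape, svd.explained_variance_ratio_)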
Given the matrix A₁:
+ From the full or compact SVD (see Section 5), we previously found:
To perform a rank-1 truncated SVD, we keep only the largest singular value σ₁ and
the corresponding vectors u₁, v₁:
This matrix is the best rank-1 approximation of A₁ in terms of both the Frobenius and
spectral norms.
Conclusion:
+By keeping only the largest singular value, we obtain a low-rank version of A1 that
captures its most significant structure. Although some details are lost (since rank is
reduced from 2 to 1), the overall pattern remains recognizable.
Image compression
Dimensionality reduction
Recommendation systems
Approach 1 (Non-Centering)
+ We are given the following user–movie rating matrix, where missing values are
indicated by "?" (the fourth movie is Casablanca; see the interpretation below):
User     Movie 1  Movie 2  Movie 3  Casablanca
Liam        5        5        1         ?
Noah        4        5        1         1
Oliver      4        4        1         2
James       5        4        1         1
Olivia      1        2        5         5
Emma        ?        1        5         5
Mia         ?        1        5         5
Luna        5        4        1         2
+For each missing value, we take the average of the corresponding row mean and
column mean:
For example:
Row mean ≈ 3.67, Column mean ≈ 3.38 → Fill = (3.67 + 3.38)/2 ≈ 3.52
Row mean ≈ 3.67, Column mean ≈ 4.0 → Fill = (3.67 + 4.0)/2 ≈ 3.83
+ Next, compute a rank-1 truncated SVD of the filled matrix and multiply the factors
back together: Â = U₁ Σ₁ V₁ᵀ.
This gives us an approximate version of the rating matrix, where missing values are
predicted via reconstruction.
+ These values are used as predicted ratings for the missing entries.
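Approach 1 can be reproduced end to end with numpy (a sketch built from the ratings above; the outputs may differ slightly from the report's hand-computed values because of rounding in the fill step):

import numpy as np

R = np.array([[5, 5, 1, np.nan],
              [4, 5, 1, 1],
              [4, 4, 1, 2],
              [5, 4, 1, 1],
              [1, 2, 5, 5],
              [np.nan, 1, 5, 5],
              [np.nan, 1, 5, 5],
              [5, 4, 1, 2]])                # rows: Liam..Luna; 4 movie columns

# Step 1: fill each missing entry with (row mean + column mean) / 2.
row_means = np.nanmean(R, axis=1)
col_means = np.nanmean(R, axis=0)
miss = np.isnan(R)
filled = R.copy()
filled[miss] = ((row_means[:, None] + col_means[None, :]) / 2)[miss]

# Step 2: rank-1 truncated SVD, then multiply back to predict ratings.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
R_hat = s[0] * np.outer(U[:, 0], Vt[0])
print(np.round(R_hat[miss], 2))             # predictions for Liam, Emma, Mia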
Approach 2 (Centering)
+ We start from the same user–movie rating matrix as above.
As before, fill missing entries using the average of the row and column means; the
filled matrix then has no missing entries.
For each row (user), subtract that user's row mean from all entries in the row: let
Ãᵢⱼ = Aᵢⱼ − μᵢ, where μᵢ is the mean of row i.
Compute the rank-1 truncated SVD of the centered matrix: Ã ≈ U₁ Σ₁ V₁ᵀ.
To recover the predicted ratings, add back the row means that were subtracted in the
centering step. The matrix then returns to its original rating scale, and the missing
values are predicted.
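Approach 2 continues the sketch above, reusing the filled matrix and the miss mask (again our illustration of the steps just described):

# Center: subtract each user's mean rating to remove per-user bias.
mu = filled.mean(axis=1, keepdims=True)
C = filled - mu
# Rank-1 truncated SVD of the centered matrix, then add the means back.
U, s, Vt = np.linalg.svd(C, full_matrices=False)
R_hat = s[0] * np.outer(U[:, 0], Vt[0]) + mu
print(np.round(R_hat[miss], 2))             # centered predictions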
+In this section, we compare and reflect on the two different approaches used for
predicting missing ratings in the recommendation system: Approach 1 (Non-
Centering) and Approach 2 (Centering). Both approaches utilized truncated SVD
with rank 1, but differed in how the data was preprocessed.
The non-centered approach assumes that all users rate movies similarly, without
adjusting for personal rating habits.
The centered approach subtracts each user’s average rating before SVD, helping
to remove user bias (e.g., generous vs. strict raters).
+As a result, centered predictions are generally lower, especially for users who tend to
give high ratings (like Liam). This adjustment often leads to more balanced and
realistic estimates.
Interpretation
Liam’s predicted rating for Casablanca dropped from 3.42 to 2.44 after
centering, suggesting that although Liam tends to rate movies highly, Casablanca
may not align with his true preferences.
Emma and Mia have identical observed ratings, so both methods gave them
the same prediction. However, the centered values are lower, reflecting the
adjustment for their high average ratings.