
JSS ACADEMY OF TECHNICAL EDUCATION,

BENGALURU
Department of Computer Science and Engineering

Module-2
Topic: Feature Engineering
by

Dr. P B Mallikarjuna

1
Curse of Dimensionality
• Data Sparsity

• Increased computation

• Overfitting

• Distances lose meaning

• Performance degradation: algorithms, especially those relying on distance measurements like k-nearest neighbors, degrade as the number of dimensions grows

• High-dimensional data is hard to visualize, making exploratory data analysis more difficult

2
Dimensionality Reduction
Dimensionality reduction is the process of reducing the number of features in a
dataset (Feature Matrix/Data Matrix).

• Subspace Methods: Projection of data from a high-dimensional space to a lower-dimensional space

PCA (Principal Component Analysis) – Unsupervised algorithm

LDA (Linear Discriminant Analysis) – Supervised algorithm

• Feature Subset Selection:

Sequential Forward Selection (SFS)
Sequential Backward Selection (SBS)
Sequential Floating Forward Selection (SFFS)
Sequential Floating Backward Selection (SFBS)

3
Issues Associated with Features

• Curse of dimensionality
• Misleading features
• Redundant features

These issues create the need for

Feature Engineering

4
Feature Engineering : Dimensionality Reduction
• Filter Methods
Correlation Index, PCA, LDA, FLDA, Autoencoders

• Wrapper Methods
SFS, SBS, SFFS, SFBS

✓ Wrapper methods evaluate feature subsets based on the performance of a chosen machine learning model

✓ Filter methods evaluate features based on statistical measures, without involving a machine learning model
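
As a rough illustration of the difference (a sketch, assuming scikit-learn is available; the dataset and parameter values are placeholders, not part of the notes):

# Filter vs. wrapper selection with scikit-learn (illustrative sketch)
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, SequentialFeatureSelector, f_classif
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Filter: score each feature with a statistical test (ANOVA F-value), no model involved
filter_sel = SelectKBest(score_func=f_classif, k=2).fit(X, y)
print("Filter keeps features:", filter_sel.get_support(indices=True))

# Wrapper: score feature subsets by the cross-validated accuracy of a chosen model
wrapper_sel = SequentialFeatureSelector(
    KNeighborsClassifier(), n_features_to_select=2, direction="forward", cv=5
).fit(X, y)
print("Wrapper keeps features:", wrapper_sel.get_support(indices=True))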

5
Filter Method vs. Wrapper Method

6
Wrapper Algorithm
• Supervised
• The best discriminating feature is assumed to carry the most useful information for classification
• Quadratic complexity: roughly n(n+1)/2 subset evaluations in the worst case
• Approximate algorithm
• Wraps around the learning algorithm
• Classifier sensitive
• Execution time is longer than filter methods
• It is based on a greedy approach

7
Wrapper Methods
• Sequential Forward Selection (SFS)

• Sequential Backward Selection (SBS)

• Sequential Floating Forward Selection (SFFS)

• Sequential Floating Backward Selection (SFBS)

8
Sequential Forward Selection (SFS)

• Method of inclusion

• Starts with empty set

• At each step, it adds the best feature such that the criterion function is maximized

• Stops when the criterion function fails to improve
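
The SFS procedure above can be written as a minimal sketch, assuming a criterion(subset) function (a hypothetical placeholder, e.g. cross-validated accuracy of a classifier) that returns a score to maximize:

# Sketch of Sequential Forward Selection (criterion is an assumed scoring function)
def sfs(all_features, criterion):
    selected = []                                   # start with the empty set
    remaining = list(all_features)
    best_score = float("-inf")
    while remaining:
        # add the single feature that maximizes the criterion
        score, best_f = max(((criterion(selected + [f]), f) for f in remaining),
                            key=lambda t: t[0])
        if score <= best_score:                     # criterion fails to improve -> stop
            break
        selected.append(best_f)
        remaining.remove(best_f)
        best_score = score
    return selected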

9
SFS - Example

10
Sequential Backward Selection (SBS)

• Method of deduction

• Starts with set of all features

• At each step, it eliminates the worst feature such that the criterion function is maximized

• Stops when the criterion function fails to improve
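
A mirror-image sketch for SBS, under the same assumption of a criterion(subset) scoring function:

# Sketch of Sequential Backward Selection (criterion is an assumed scoring function)
def sbs(all_features, criterion):
    selected = list(all_features)                   # start with the full feature set
    best_score = criterion(selected)
    while len(selected) > 1:
        # eliminate the single feature whose removal maximizes the criterion
        score, worst_f = max(((criterion([g for g in selected if g != f]), f)
                              for f in selected), key=lambda t: t[0])
        if score <= best_score:                     # criterion fails to improve -> stop
            break
        selected.remove(worst_f)
        best_score = score
    return selected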

11
SBS - Example

12
Sequential Floating Forward Selection (SFFS)

• Method of inclusion and deduction

• Starts with empty set

• At each step – a forward walk followed by backward walk(s)

Forward walk:
It adds the best feature such that the criterion function is maximized
Backward walk:
It eliminates the worst feature such that the criterion function is maximized

• Stops when the criterion function fails to improve
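
A sketch of this floating version, again assuming a criterion(subset) scoring function; each forward walk is followed by backward walk(s) that may undo earlier inclusions:

# Sketch of Sequential Floating Forward Selection (criterion is an assumed scoring function)
def sffs(all_features, criterion):
    selected, remaining = [], list(all_features)
    best_score = float("-inf")
    while remaining:
        # forward walk: add the best feature
        score, best_f = max(((criterion(selected + [f]), f) for f in remaining),
                            key=lambda t: t[0])
        if score <= best_score:                     # criterion fails to improve -> stop
            break
        selected.append(best_f); remaining.remove(best_f); best_score = score
        # backward walk(s): drop features as long as removing one improves the criterion
        improved = True
        while improved and len(selected) > 2:
            score, worst_f = max(((criterion([g for g in selected if g != f]), f)
                                  for f in selected), key=lambda t: t[0])
            improved = score > best_score
            if improved:
                selected.remove(worst_f); remaining.append(worst_f); best_score = score
    return selected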

13
Sequential Floating Backward Selection (SFBS)

• Method of deduction and inclusion

• Starts with set of all features (Full set)

• At each step – a backward walk followed by forward walk(s)

Backward walk:
It eliminates the worst feature such that the criterion function is maximized
Forward walk:
It adds the best feature such that the criterion function is maximized

• Stops when the criterion function fails to improve
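
SFBS is the mirror image of SFFS; a sketch under the same criterion(subset) assumption:

# Sketch of Sequential Floating Backward Selection (criterion is an assumed scoring function)
def sfbs(all_features, criterion):
    selected = list(all_features)                   # start with the full set
    removed, best_score = [], criterion(selected)
    while len(selected) > 1:
        # backward walk: eliminate the worst feature
        score, worst_f = max(((criterion([g for g in selected if g != f]), f)
                              for f in selected), key=lambda t: t[0])
        if score <= best_score:                     # criterion fails to improve -> stop
            break
        selected.remove(worst_f); removed.append(worst_f); best_score = score
        # forward walk(s): re-add previously removed features while that improves the criterion
        improved = True
        while improved and removed:
            score, best_f = max(((criterion(selected + [f]), f) for f in removed),
                                key=lambda t: t[0])
            improved = score > best_score
            if improved:
                selected.append(best_f); removed.remove(best_f); best_score = score
    return selected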

14
Principal Component Analysis (PCA)
• It is a way of identifying patterns in data, and expressing the data in such a
way as to highlight their similarities and differences.

• Since patterns in data can be hard to find in data of high dimension, where
the luxury of graphical representation is not available, PCA is a powerful
tool for analyzing data

• Once these patterns are found, the data can be compressed by reducing the number of dimensions without much loss of information

• Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms high-dimensional data into a lower-dimensional subspace while preserving the most important information (variance)

• The variance of a dataset represents the spread or distribution of data points


15
Principal Component Analysis (PCA)
• PCA finds the Principal components (Eigen Vectors) that represent the
directions of maximum variance in the data. They are uncorrelated and
ordered (ranked) based on the amount of variance they capture (Eigen
Values)

• Each principal component is a linear combination of the original features

• Principal Components are perpendicular (uncorrelated) to each other

• The eigenvector with the highest eigenvalue is the first Principal Component (PC1) and it captures the highest variance

• The second principal component (PC2) captures the next highest variance, and this continues for all components, ensuring dimensionality reduction while preserving information
16
Steps in PCA
✓ Get some dataset

✓ Subtract the mean from each of the data dimensions in the dataset. The
mean subtracted is the average across each dimension

✓ Compute the covariance matrix: it helps determine how strongly features vary together. In the case of a 3-dimensional dataset, the covariance matrix is a 3 × 3 matrix containing the variances of each dimension on the diagonal and the pairwise covariances cov(x, y), cov(x, z), cov(y, z) off the diagonal

17
Steps in PCA
✓ Calculate the eigenvectors and eigenvalues of the covariance matrix

✓ Sort eigenvectors by eigenvalues. The eigenvector with the highest eigenvalue is the first Principal Component (PC1) and it captures the highest variance

✓ Choosing components and forming a feature vector

✓ Transform the Data: Project the original data onto the selected principal
components.

Transformed Data = Data Matrix (mean-subtracted) × Selected Eigenvectors Matrix
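
A minimal NumPy sketch of these steps (the random data matrix and the number of components k are placeholders):

# Minimal NumPy sketch of the PCA steps above
import numpy as np

X = np.random.rand(100, 5)                 # example data: 100 samples, 5 features
X_centered = X - X.mean(axis=0)            # subtract the mean of each dimension

cov = np.cov(X_centered, rowvar=False)     # covariance matrix (5 x 5)
eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues/eigenvectors of the symmetric matrix

order = np.argsort(eigvals)[::-1]          # sort eigenvectors by decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2                                      # choose the top-k principal components
W = eigvecs[:, :k]                         # feature vector (selected eigenvectors)
X_transformed = X_centered @ W             # Transformed Data = Data matrix x Eigenvector matrix
print(X_transformed.shape)                 # (100, 2)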

18
Example Problem on PCA

19
Example Problem on PCA

20
Example Problem on PCA

21
Example Problem on PCA

22
Example Problem on PCA

23
Linear Discriminant Analysis (LDA)
• Linear Discriminant Analysis (LDA) is a supervised dimensionality
reduction technique primarily used for classification tasks

• It projects high-dimensional data onto a lower-dimensional space while maximizing class separability

• Maximize the between-class variance

• Minimize the within-class variance

• Improve class separation in a lower-dimensional space

• LDA finds the best projection direction that maximizes class separability

• In LDA, the eigenvectors obtained from the scatter matrices are not
necessarily perpendicular (orthogonal) to each other

24
Steps in LDA
✓ Compute the Mean Vectors
If there are C classes, calculate the mean vector for each class:

  μ_c = (1/N_c) Σ_{x ∈ class c} x

where N_c is the number of samples in class c, and x represents the feature vectors
✓ Compute the Scatter Matrices
• Within-Class Scatter Matrix (S_W): Measures the variance within each class

  S_W = Σ_c Σ_{x ∈ class c} (x − μ_c)(x − μ_c)ᵀ

• Between-Class Scatter Matrix (S_B): Measures the variance between different class means

  S_B = Σ_c N_c (μ_c − μ)(μ_c − μ)ᵀ

where μ is the overall mean of all data points

25
Steps in LDA
✓ Compute the Eigenvalues and Eigenvectors
Solve the eigenvalue problem for S_W⁻¹ S_B:

  S_W⁻¹ S_B v = λ v

where v are the eigenvectors and λ are the eigenvalues

✓ Select the Top 𝑘 Eigenvectors


• Choose the top 𝑘 eigenvectors corresponding to the largest eigenvalues
• These eigenvectors form the transformation matrix 𝑊

✓ Project the Data


Transform the original dataset X using the learned projection matrix W:

  Y = X W

This reduces the dimensionality while preserving class separability
26
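
A minimal NumPy sketch of the LDA steps above on a small two-class toy dataset (the data values, labels, and k = 1 are illustrative assumptions):

# Minimal NumPy sketch of the LDA steps above (two classes, two features)
import numpy as np

X = np.array([[4.0, 2.0], [2.0, 4.0], [2.0, 3.0], [3.0, 6.0],   # class 0
              [9.0, 10.0], [6.0, 8.0], [9.0, 5.0], [8.0, 7.0]]) # class 1
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

mu = X.mean(axis=0)                                   # overall mean
d = X.shape[1]
S_W = np.zeros((d, d))                                # within-class scatter
S_B = np.zeros((d, d))                                # between-class scatter
for c in np.unique(y):
    Xc = X[y == c]
    mu_c = Xc.mean(axis=0)
    S_W += (Xc - mu_c).T @ (Xc - mu_c)
    diff = (mu_c - mu).reshape(-1, 1)
    S_B += len(Xc) * diff @ diff.T

eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)   # solve S_W^-1 S_B v = lambda v
order = np.argsort(eigvals.real)[::-1]
W = eigvecs[:, order][:, :1].real                     # top k = 1 eigenvector
X_lda = X @ W                                         # project the data: Y = X W
print(X_lda.shape)                                    # (8, 1)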


Example Problem on LDA

Each data point has two features (x₁, x₂)

27
Example Problem on LDA

28
Example Problem on LDA

29
Example Problem on LDA

30
Example Problem on LDA

31
Example Problem on LDA

32
Example Problem on LDA

33
Example Problem on LDA

34
JSS ACADEMY OF TECHNICAL EDUCATION,
BENGALURU
Department of Computer Science and Engineering

Machine Learning
by

Dr. P B Mallikarjuna

Date: 19.08.2023

35
Matrix Decomposition
• Matrix decomposition is the process of breaking down a matrix into a
product of simpler matrices

• Matrix decomposition is also known as matrix factorization

• This is widely used in numerical analysis, optimization, machine learning, and linear algebra for solving equations, reducing dimensionality, and improving computational efficiency (see the sketch after the list below)
✓ LU Decomposition (Lower-Upper Decomposition)

✓ QR Decomposition

✓ Eigenvalue Decomposition : used in PCA, LDA

✓ Singular Value Decomposition (SVD)
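
The decompositions listed above can be sketched with NumPy/SciPy (assuming SciPy is installed; the matrix A is an arbitrary example):

# Sketch: the four decompositions above on a small example matrix
import numpy as np
from scipy.linalg import lu

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

P, L, U = lu(A)                      # LU decomposition: A = P @ L @ U
Q, R = np.linalg.qr(A)               # QR decomposition: A = Q @ R
evals, evecs = np.linalg.eig(A)      # eigenvalue decomposition (used in PCA, LDA)
U2, S, Vt = np.linalg.svd(A)         # singular value decomposition: A = U2 @ diag(S) @ Vt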

36
Singular Value Decomposition (SVD)
• Singular Value Decomposition (SVD) is a fundamental matrix factorization
technique in linear algebra

• It is used in many applications such as dimensionality reduction, image compression, machine learning, and signal processing

• It is an unsupervised dimensionality reduction technique

37
Steps in SVD
✓ Consider some data (Matrix A)

✓ Compute AᵀA and AAᵀ

• Compute the eigenvalues and eigenvectors of AᵀA and AAᵀ

• The eigenvectors of AᵀA form the columns of V

• The eigenvectors of AAᵀ form the columns of U

✓ Compute the Singular Values

• The square roots of the eigenvalues of AᵀA (or AAᵀ) give the singular values in Σ

✓ Construct U, Σ, Vᵀ

• Arrange the singular values in descending order in Σ

• Use the corresponding eigenvectors for U and V
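
A NumPy sketch of this procedure (the matrix A is an illustrative example; here the columns of U are obtained as u_i = A v_i / σ_i, which matches the eigenvectors of AAᵀ up to sign and keeps U and V consistent):

# Sketch of SVD via the eigen-decomposition of AᵀA
import numpy as np

A = np.array([[3.0, 2.0],
              [2.0, 3.0],
              [2.0, -2.0]])

AtA = A.T @ A
eigvals, V = np.linalg.eigh(AtA)              # eigenvectors of AᵀA form the columns of V
order = np.argsort(eigvals)[::-1]             # arrange in descending order
eigvals, V = eigvals[order], V[:, order]

sigma = np.sqrt(np.clip(eigvals, 0, None))    # singular values = sqrt of eigenvalues of AᵀA
Sigma = np.diag(sigma)
U = A @ V / sigma                             # u_i = A v_i / sigma_i (eigenvectors of AAᵀ, consistent signs)

print(np.allclose(A, U @ Sigma @ V.T))        # reconstruction check: A = U Σ Vᵀ -> True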


38
Example Problem on SVD

39
Example Problem on SVD

40
Example Problem on SVD

41
Example Problem on SVD

42
Example Problem on SVD

43
Example Problem on SVD

44
Example Problem on SVD
To compute the eigenvalues of AAᵀ

45
Example Problem on SVD

46
