PRESENTATION

NAME: TRIPTI AHMED
ID: 20-42322-1
COURSE: MACHINE LEARNING
FEATURE SELECTION
• Feature selection is the process of choosing the most relevant features from your dataset to build a more accurate and efficient machine learning model.
• The objective of feature selection is to remove irrelevant and/or redundant features and retain only the relevant ones. Irrelevant features can be removed without affecting learning performance; redundant features can likewise be removed, because the information they carry is already provided by other features.
FEATURE REDUCTION

• Feature reduction is the process of reducing the dimensionality of your dataset by transforming or projecting it into a lower-dimensional space.
• Feature reduction, also known as dimensionality reduction, reduces the number of features without losing important information, which makes otherwise resource-heavy computation cheaper.
DIFFERENCES BETWEEN FEATURE SELECTION AND FEATURE REDUCTION
• Feature selection keeps the original features and selects a subset of them, while feature reduction transforms or projects the data into a lower-dimensional space.
• Feature selection is often used when you want to maintain the interpretability of features, while feature reduction is used when you are more concerned with reducing computational complexity.
WHERE TO USE FEATURE SELECTION AND FEATURE REDUCTION?

• Use feature selection when you have a large number of features and you want to choose the most relevant ones to improve model performance.
• Use feature reduction when you want to reduce dimensionality, improve efficiency, or visualize data in a lower-dimensional space.
TECHNIQUES FOR FEATURE SELECTION AND FEATURE REDUCTION
• Feature selection techniques (a sketch of the first two follows this list):
• a. Correlation analysis: calculate the correlation between numerical features and the target variable (e.g., revenue), and select features with high absolute correlation values.
• b. Chi-square test: for categorical features, use the chi-square test to assess the dependency between each categorical feature and the target variable.
• c. Recursive Feature Elimination (RFE): use RFE with a suitable machine learning model to rank and select important features, as demonstrated in the implementation section below.
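
To make the first two techniques concrete, here is a minimal sketch using pandas and scikit-learn on a small synthetic dataset. The column names (ad_spend, num_visits, revenue), the 0.3 correlation cutoff, and the binarized target for the chi-square test are illustrative assumptions, not values from the slides.

# A minimal sketch of (a) correlation analysis and (b) the chi-square test.
# The dataset is synthetic; column names and thresholds are illustrative.
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ad_spend":   rng.normal(100, 20, 200),
    "num_visits": rng.integers(0, 50, 200),
    "noise":      rng.normal(0, 1, 200),
})
df["revenue"] = 3 * df["ad_spend"] + 5 * df["num_visits"] + rng.normal(0, 10, 200)

# (a) Correlation analysis: rank numerical features by |corr| with the target.
corr = df.corr()["revenue"].drop("revenue").abs().sort_values(ascending=False)
selected = corr[corr > 0.3].index.tolist()   # 0.3 is an arbitrary cutoff
print("Correlation-selected features:", selected)

# (b) Chi-square test: dependency between a count/categorical feature and a
# categorical target. chi2 requires non-negative inputs, hence the count column.
X_cat = df[["num_visits"]]
y_cat = (df["revenue"] > df["revenue"].median()).astype(int)  # binarized target
scores = SelectKBest(chi2, k=1).fit(X_cat, y_cat)
print("Chi-square score for num_visits:", scores.scores_[0])

Note that the chi-square test only applies to non-negative feature values, which is why the count-valued column is used for it here.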
TECHNIQUES FOR FEATURE SELECTION AND FEATURE REDUCTION

• Feature reduction techniques (a sketch of LDA and t-SNE follows this list; PCA is implemented in a later section):
• a. Principal Component Analysis (PCA): apply PCA for dimensionality reduction when you have a large number of numerical features. PCA transforms the features into a new set of uncorrelated features (principal components), and you can select a subset of these components.
• b. Linear Discriminant Analysis (LDA): LDA is a dimensionality reduction technique that is particularly useful for classification tasks. It seeks to maximize the separation between classes by transforming the data into a lower-dimensional space.
• c. t-distributed Stochastic Neighbor Embedding (t-SNE): t-SNE is used for visualization and non-linear dimensionality reduction. It can be helpful for exploring the structure of your data.
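
For LDA and t-SNE, a minimal sketch on scikit-learn's built-in iris dataset might look as follows; the component counts and the perplexity value are illustrative defaults, not tuned choices.

# A minimal sketch of LDA and t-SNE on a toy classification dataset.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.manifold import TSNE

X, y = load_iris(return_X_y=True)

# LDA: supervised projection that maximizes between-class separation.
# At most (n_classes - 1) components, so 2 for the 3-class iris data.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
print("LDA output shape:", X_lda.shape)       # (150, 2)

# t-SNE: non-linear embedding, mainly for 2-D/3-D visualization.
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print("t-SNE output shape:", X_tsne.shape)    # (150, 2)

Note that LDA is supervised (it uses the class labels y) and can produce at most n_classes - 1 components, whereas t-SNE is unsupervised and is mainly used for 2-D or 3-D visualization rather than as input to downstream models.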
IMPLEMENT ONE TECHNIQUE OF FEATURE SELECTION

• The Recursive Feature Elimination (RFE) method is used for feature selection.
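
The slides do not reproduce the code itself, so the following is a minimal sketch of RFE with scikit-learn on a synthetic regression dataset; the LinearRegression estimator and the choice to keep four features are assumptions made for illustration.

# A minimal RFE sketch; the estimator and the number of features to keep
# are illustrative choices, not the values from the original slides.
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=200, n_features=10, n_informative=4,
                       random_state=0)

# RFE repeatedly fits the estimator and drops the weakest feature
# until only n_features_to_select remain.
rfe = RFE(estimator=LinearRegression(), n_features_to_select=4)
rfe.fit(X, y)

print("Selected feature mask:", rfe.support_)
print("Feature ranking (1 = kept):", rfe.ranking_)

Because rfe.support_ is a boolean mask over the original columns, the selected features stay directly interpretable, which is the main appeal of selection over reduction.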


IMPLEMENT ONE TECHNIQUE OF FEATURE REDUCTION
• The Principal Component Analysis (PCA) method is used for feature reduction.
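
Similarly, a minimal PCA sketch with scikit-learn, here on the iris dataset; standardizing the features first and keeping two components are illustrative choices, not values from the slides.

# A minimal PCA sketch: standardize, then project onto two components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Standardize so each feature contributes on a comparable scale.
X_scaled = StandardScaler().fit_transform(X)

# Project the 4 original features onto 2 uncorrelated principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print("Reduced shape:", X_reduced.shape)                  # (150, 2)
print("Explained variance ratio:", pca.explained_variance_ratio_)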
OUTCOME OF FEATURE REDUCTION
• Dimensionality reduction: reduces the number of features, making the data more manageable.
• Noise reduction: focuses the model on informative features, mitigating overfitting risks.
• Data visualization: projects the data to lower dimensions for exploration.
• Improved model performance: minimizes overfitting in complex models.
• Reduced resource demands: smaller datasets require less memory and storage.
WHAT WOULD HAPPEN IF FEATURE REDUCTION IS NOT IMPLEMENTED
1. Dimensionality challenge:
• High-dimensional data, with many features, leads to increased complexity and slower model training, and it can be challenging to find meaningful patterns in such data.
2. Overfitting risk:
• When you have more features than data points, the risk of overfitting is high. Models may become too specialized to the training data and fail to generalize to new, unseen data.
3. Data demands:
• Models with numerous features require larger datasets to make accurate predictions, and collecting a sufficient amount of data can be expensive or impractical.
4. Model interpretability:
• More features make models less interpretable. Understanding which features drive predictions becomes complex, which can be a problem for decision-making.
5. Inefficient resource use:
• Models with a high number of features demand more computational resources, including memory and processing power. This increased resource consumption can be inefficient and costly.
