ML Techniques
Feature extraction is a crucial step in machine learning where raw data is transformed into a
reduced set of features (or variables) that retain the most important information while reducing
computational complexity.
Dimensionality Reduction – Instead of using all raw data, we extract or construct a smaller set of
meaningful features.
Example: In image processing, instead of using every pixel, we might extract edges, textures, or
color histograms.
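As a minimal sketch of this idea, the snippet below reduces an image to a color-histogram feature vector. The image is random data standing in for a real photo, and the bin count of 16 is an arbitrary illustrative choice; in practice you would load an actual image with a library such as Pillow or OpenCV.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)  # H x W x RGB

def color_histogram(img: np.ndarray, bins: int = 16) -> np.ndarray:
    """Concatenate normalized per-channel histograms into one feature vector."""
    feats = []
    for channel in range(img.shape[2]):
        hist, _ = np.histogram(img[:, :, channel], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())  # normalize so image size does not matter
    return np.concatenate(feats)

features = color_histogram(image)
print(features.shape)  # (48,) -- 3 channels x 16 bins, instead of 128*128*3 raw pixels
```

The 49,152 raw pixel values collapse to 48 numbers that still describe the overall color content of the image.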
Preserving Relevant Information – The extracted features should capture patterns that are
useful for the ML model.
Example: In text classification, instead of using all words, we might use TF-IDF (Term
Frequency-Inverse Document Frequency) to select the most discriminative words.
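A small sketch of TF-IDF extraction using scikit-learn follows; the toy corpus and the cap of 10 features are illustrative assumptions, not values from the notes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks rallied as markets opened",
    "investors sold stocks after the report",
]

vectorizer = TfidfVectorizer(stop_words="english", max_features=10)
X = vectorizer.fit_transform(corpus)       # sparse matrix: documents x retained terms

print(vectorizer.get_feature_names_out())  # the discriminative vocabulary that was kept
print(X.shape)                             # (4, 10) instead of one column per word
```

Terms that appear everywhere (stop words) are dropped, and the remaining words are weighted by how distinctive they are for each document.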
Efficiency – A reduced feature set speeds up training and prediction, ideally with little or no loss in model performance.
Principal Component Analysis (PCA) – Transforms data into a lower-dimensional space while retaining as much of the variance as possible.
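A minimal PCA sketch with scikit-learn is shown below; the digits dataset and the choice of 2 components are assumptions made for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_digits(return_X_y=True)            # 64 pixel features per digit image
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                  # (1797, 2): 64 features compressed to 2
print(pca.explained_variance_ratio_)    # share of variance each component retains
```

The explained-variance ratio is the standard way to report how much information the retained components preserve.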
Handcrafted Features (e.g., SIFT, HOG for images) – Designed based on domain knowledge.
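Below is a sketch of computing a handcrafted HOG descriptor with scikit-image; the synthetic grayscale image and the HOG parameters (9 orientations, 8x8 cells, 2x2 blocks) are illustrative assumptions.

```python
import numpy as np
from skimage.feature import hog

rng = np.random.default_rng(0)
image = rng.random((64, 64))                 # grayscale stand-in for a real image

descriptor = hog(
    image,
    orientations=9,                          # gradient-direction bins per cell
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
)
print(descriptor.shape)                      # fixed-length gradient-based feature vector
```

Unlike PCA, nothing is learned from data here: the descriptor encodes domain knowledge that local edge orientations are what matter for recognizing shapes.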
How to Convince the Reader That Key Information Was Retained?
Compare model performance (accuracy, F1-score) before and after feature extraction (see the sketch after this list).
Use visualization (e.g., PCA scatter plots) to show that clusters or patterns are preserved.
Perform statistical tests (e.g., correlation analysis) to ensure extracted features relate to the
target variable.
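The following sketch illustrates the before-vs-after comparison, assuming the digits dataset, a logistic-regression classifier, and PCA down to 16 components; all of these are illustrative choices, not prescribed by the notes.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline: all 64 raw pixel features.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))
baseline.fit(X_train, y_train)
acc_full = accuracy_score(y_test, baseline.predict(X_test))

# Reduced: 16 principal components instead of 64 features.
reduced = make_pipeline(StandardScaler(), PCA(n_components=16),
                        LogisticRegression(max_iter=2000))
reduced.fit(X_train, y_train)
acc_pca = accuracy_score(y_test, reduced.predict(X_test))

print(f"accuracy with all 64 features: {acc_full:.3f}")
print(f"accuracy with 16 PCA features: {acc_pca:.3f}")
# Comparable accuracy with far fewer features is the evidence that
# the key information was retained.
```

The same pattern extends to the other checks: plotting the first two principal components colored by class shows whether clusters survive the reduction, and correlating each extracted feature with the target shows whether it carries predictive signal.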