ML in a Nutshell

The document outlines various machine learning models categorized into supervised learning, unsupervised learning, recommender systems, model evaluation & optimization, and optimization & regularization. It details specific algorithms within these categories, their use cases, advantages, and disadvantages. Additionally, it covers general concepts such as the bias-variance tradeoff and data filtering.


Machine Learning Models

├── 1. Supervised Learning
│   ├── Linear Regression
│   ├── Logistic Regression
│   ├── Decision Trees
│   ├── Support Vector Machines (SVM)
│   ├── K-Nearest Neighbors (K-NN)
│   ├── Naïve Bayes
│   ├── Linear Discriminant Analysis (LDA)
│   └── Ensemble Methods
│       ├── Random Forest
│       ├── Bagging
│       └── Gradient Boosting / XGBoost / LightGBM
├── 2. Unsupervised Learning
│   ├── K-Means Clustering
│   ├── Hierarchical Clustering
│   └── Dimensionality Reduction (e.g., PCA)
├── 3. Recommender Systems
│   ├── Collaborative Filtering
│   ├── Matrix Factorization
│   └── Implicit vs. Explicit Feedback
├── 4. Model Evaluation & Optimization
│   ├── Metrics (F1, RMSE, AUC, etc.)
│   └── Cross-Validation
├── 5. Optimization & Regularization
│   ├── Gradient Descent
│   └── Regularization
│       ├── Ridge
│       ├── Lasso
│       └── Elastic Net
└── 6. General Concepts
    ├── Supervised vs. Unsupervised
    ├── Bias-Variance Tradeoff
    ├── Occam’s Razor
    └── Data Filtering


1. Supervised Learning
Supervised learning models learn from labeled data. They are used primarily for prediction
and classification.

A. Linear Regression

Use Case: Predict continuous values

• Pros:
o Simple and easy to interpret
o Fast and computationally inexpensive
• Cons:
o Assumes linearity
o Sensitive to outliers and multicollinearity
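
Given the pros and cons above, a minimal fitting sketch with scikit-learn; the synthetic data (y = 3x + 2 plus noise) is an illustrative assumption, not from the original notes:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Illustrative synthetic data: y = 3x + 2 plus Gaussian noise
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=(100, 1))
    y = 3 * X[:, 0] + 2 + rng.normal(0, 1, size=100)

    model = LinearRegression().fit(X, y)
    print(model.coef_, model.intercept_)  # should recover roughly 3 and 2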

B. Logistic Regression

Use Case: Binary classification

• Pros:
o Efficient and interpretable
o Works well with linearly separable data
• Cons:
o Poor performance with non-linear data
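
A minimal scikit-learn sketch on illustrative synthetic data; note that the model outputs class probabilities, not just hard labels:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Illustrative binary classification problem
    X, y = make_classification(n_samples=200, n_features=5, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    clf = LogisticRegression().fit(X_tr, y_tr)
    print(clf.predict_proba(X_te[:3]))  # per-class probabilities
    print(clf.score(X_te, y_te))        # accuracy on held-out data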

C. Decision Trees

Use Case: Classification and regression

• Pros:
o Easy to understand and visualize
o Captures non-linear relationships
• Cons:
o Prone to overfitting
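
A minimal sketch; capping max_depth is one common guard against the overfitting noted above (the depth of 3 is an illustrative choice):

    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = make_classification(n_samples=200, n_features=4, random_state=0)

    # A shallow tree trades some training fit for less overfitting
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    print(export_text(tree))  # human-readable view of the learned splits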

D. Support Vector Machines (SVM)

Use Case: Classification with large feature spaces

• Pros:
o Effective in high-dimensional spaces
o Can handle non-linear data using kernel trick
• Cons:
o Choosing the right kernel is challenging
o Requires feature scaling
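
A minimal sketch pairing the RBF kernel with the feature scaling the cons call for (kernel and C are illustrative choices):

    from sklearn.datasets import make_classification
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)

    # Scale first: SVMs are sensitive to feature scales
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X, y)
    print(clf.score(X, y))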

E. K-Nearest Neighbors (K-NN)


Use Case: Classification

• Pros:
o Simple and effective
o No training phase beyond storing the data (a "lazy" learner)
• Cons:
o Slow for large datasets
o Sensitive to irrelevant features and feature scaling
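
A minimal sketch; scaling is included because of the sensitivity noted above, and k = 5 is an illustrative choice:

    from sklearn.datasets import make_classification
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=200, n_features=5, random_state=0)

    # fit() only stores the data; the distance work happens at predict time
    knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
    knn.fit(X, y)
    print(knn.predict(X[:3]))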

F. Naïve Bayes

Use Case: Classification

• Pros:
o Works well with high-dimensional data
o Fast and requires less training data
• Cons:
o Assumes feature independence, which rarely holds
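
A minimal sketch using the Gaussian variant (an assumption; multinomial and Bernoulli variants suit count and binary features):

    from sklearn.datasets import make_classification
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=200, n_features=20, random_state=0)

    # Each feature is treated as conditionally independent given the class
    nb = GaussianNB().fit(X, y)
    print(nb.score(X, y))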

G. LDA (Linear Discriminant Analysis)

Use Case: Dimensionality reduction and classification

• Pros:
o Effective in finding linear combinations for class separation
• Cons:
o Assumes Gaussian distribution and equal covariance
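
A minimal sketch showing both roles, projection and classification, on illustrative 3-class data:

    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    X, y = make_classification(n_samples=300, n_features=10, n_classes=3,
                               n_informative=5, random_state=0)

    # n_components is capped at n_classes - 1, so 3 classes allow at most 2
    lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
    print(lda.transform(X).shape)  # reduced to 2 dimensions
    print(lda.score(X, y))         # the same fitted model also classifies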

H. Ensemble Methods

i. Random Forest

• Pros:
o Reduces overfitting compared to individual trees
o Handles large datasets well
• Cons:
o Can be slow and memory-intensive

ii. Bagging

• Pros:
o Reduces variance
• Cons:
o Less interpretable

iii. Gradient Boosting / XGBoost / LightGBM

• Pros:
o High predictive power
o Handles missing data and categorical features (LightGBM)
• Cons:
o Prone to overfitting if not tuned properly
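
A minimal sketch contrasting the two ensemble styles above with scikit-learn (hyperparameters are illustrative); XGBoost and LightGBM ship as separate packages with similar fit/predict interfaces:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)

    # Bagging-style ensemble: many decorrelated trees, predictions averaged
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    # Boosting: trees built sequentially, each correcting its predecessors
    gb = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                    random_state=0).fit(X, y)
    print(rf.score(X, y), gb.score(X, y))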

2. Unsupervised Learning
Unsupervised learning finds patterns in unlabeled data.

A. K-Means Clustering

• Pros:
o Simple and scalable
o Efficient for large datasets
• Cons:
o Requires predefined number of clusters
o Sensitive to outliers and scaling
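
A minimal sketch; the scaling step and the choice of k = 3 reflect the cons above and are illustrative:

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.preprocessing import StandardScaler

    X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

    X_scaled = StandardScaler().fit_transform(X)
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)
    print(km.inertia_)      # within-cluster sum of squares
    print(km.labels_[:10])  # cluster assignment per point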

B. Hierarchical Clustering

• Pros:
o Dendrogram provides visual intuition
o No need to predefine number of clusters
• Cons:
o Computationally intensive
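
A minimal sketch that draws the dendrogram mentioned above (Ward linkage is an illustrative choice; plotting assumes matplotlib is installed):

    import matplotlib.pyplot as plt
    from scipy.cluster.hierarchy import dendrogram, linkage
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=50, centers=3, random_state=0)

    # Ward linkage merges the pair of clusters with least variance increase
    Z = linkage(X, method="ward")
    dendrogram(Z)
    plt.show()  # cut the tree at a chosen height to get flat clusters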

C. Dimensionality Reduction (e.g., PCA)

• Pros:
o Helps in visualization and removing multicollinearity
• Cons:
o May lose interpretability
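
A minimal PCA sketch projecting 10 illustrative features onto 2 components:

    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA

    X, _ = make_classification(n_samples=200, n_features=10, random_state=0)

    # Keep the two directions of maximum variance
    pca = PCA(n_components=2)
    X_2d = pca.fit_transform(X)
    print(pca.explained_variance_ratio_)  # variance captured per component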

3. Recommender Systems
A. Collaborative Filtering

• Pros:
o Personalized recommendations
• Cons:
o Suffers from cold start and sparsity
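
A minimal user-based sketch using cosine similarity between rating vectors (the tiny ratings matrix is an illustrative assumption; item-based and model-based variants exist):

    import numpy as np

    # Illustrative user-item matrix: rows are users, columns are items
    R = np.array([[5, 3, 0, 1],
                  [4, 0, 0, 1],
                  [1, 1, 0, 5],
                  [1, 0, 4, 4]], dtype=float)

    # Cosine similarity between users; recommend from nearest neighbors
    unit = R / np.linalg.norm(R, axis=1, keepdims=True)
    sim = unit @ unit.T
    print(np.round(sim, 2))  # users 0 and 1 come out most alike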

B. Matrix Factorization

• Pros:
o Effective for large, sparse matrices
• Cons:
o Requires hyperparameter tuning, and factorizing large matrices is computationally costly
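
A minimal sketch of factorizing a ratings matrix with plain SGD; the matrix, the number of latent factors k, and the learning/regularization rates are all illustrative assumptions:

    import numpy as np

    # Illustrative 4-user x 5-item ratings; 0 marks "unrated"
    R = np.array([[5, 3, 0, 1, 0],
                  [4, 0, 0, 1, 1],
                  [1, 1, 0, 5, 4],
                  [0, 1, 5, 4, 0]], dtype=float)

    rng = np.random.default_rng(0)
    k = 2                                    # latent factors (assumed)
    P = rng.normal(scale=0.1, size=(4, k))   # user factor matrix
    Q = rng.normal(scale=0.1, size=(5, k))   # item factor matrix

    lr, reg = 0.01, 0.02
    for _ in range(2000):                    # SGD over observed entries only
        for u, i in zip(*R.nonzero()):
            err = R[u, i] - P[u] @ Q[i]
            p_u = P[u].copy()
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])

    print(np.round(P @ Q.T, 1))  # predicted ratings fill in the blanks
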
C. Implicit vs. Explicit Feedback

• Explicit feedback (e.g., ratings) – more accurate but scarce

• Implicit feedback (e.g., views, clicks) – abundant but noisy

4. Model Evaluation & Optimization


A. Metrics

• Silhouette Score – Measures clustering quality
• Davies-Bouldin Index – Clustering quality; lower values are better
• Inertia – Within-cluster sum of squares; the internal objective K-Means minimizes
• RMSE/MAE – Regression error magnitudes; R² – proportion of variance explained
• F1 Score – Harmonic mean of precision and recall; well suited to imbalanced data
• MRR, NDCG – Ranking metrics for recommender systems
• AUC – Discriminatory power of classifiers across decision thresholds
• Recall – Important when missing positives is costly
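
A minimal sketch computing a few of these with scikit-learn (labels and scores are illustrative):

    import numpy as np
    from sklearn.metrics import (f1_score, mean_squared_error,
                                 recall_score, roc_auc_score)

    y_true = np.array([0, 0, 1, 1, 1])             # ground-truth labels
    y_pred = np.array([0, 1, 1, 1, 0])             # hard predictions
    y_score = np.array([0.2, 0.6, 0.9, 0.7, 0.4])  # predicted probabilities

    print(f1_score(y_true, y_pred))        # balances precision and recall
    print(recall_score(y_true, y_pred))    # fraction of positives found
    print(roc_auc_score(y_true, y_score))  # needs scores, not hard labels
    print(np.sqrt(mean_squared_error([3.0, 2.5, 4.0], [2.8, 2.6, 3.7])))  # RMSE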

B. Cross-Validation

• Pros:
o More reliable model evaluation
• Cons:
o Slower training process
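
A minimal 5-fold sketch (model and data are illustrative):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=200, n_features=5, random_state=0)

    # Each point is held out exactly once; slower, but a steadier estimate
    scores = cross_val_score(LogisticRegression(), X, y, cv=5)
    print(scores.mean(), scores.std())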

5. Optimization & Regularization


A. Gradient Descent

• Pros:
o General-purpose; applies to any differentiable objective
• Cons:
o May converge to local minima on non-convex objectives
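
A minimal NumPy sketch minimizing mean squared error for a one-variable linear model; the data, learning rate, and iteration count are illustrative:

    import numpy as np

    # Illustrative data: y = 4x + 1 plus noise
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, 100)
    y = 4 * x + 1 + rng.normal(0, 0.1, 100)

    w, b, lr = 0.0, 0.0, 0.1
    for _ in range(1000):
        err = (w * x + b) - y
        w -= lr * 2 * np.mean(err * x)  # d(MSE)/dw
        b -= lr * 2 * np.mean(err)      # d(MSE)/db

    print(w, b)  # should approach 4 and 1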

B. Regularization (Ridge, Lasso, Elastic Net)

• Purpose: Prevent overfitting by penalizing large weights

i. Ridge (L2 penalty) – Shrinks weights; good for correlated features

ii. Lasso (L1 penalty) – Performs feature selection by driving some weights exactly to zero

iii. Elastic Net – Combines the L1 and L2 penalties
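
A minimal sketch of all three penalties (the alpha values are illustrative; in practice they are chosen by cross-validation):

    from sklearn.datasets import make_regression
    from sklearn.linear_model import ElasticNet, Lasso, Ridge

    X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                           random_state=0)

    ridge = Ridge(alpha=1.0).fit(X, y)    # shrinks all weights
    lasso = Lasso(alpha=1.0).fit(X, y)    # zeros out some weights
    enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)  # blend of both

    print((lasso.coef_ == 0).sum(), "of 20 Lasso weights are exactly zero")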

6. General Concepts
• Supervised vs. Unsupervised Learning – Learning from labeled data vs. discovering patterns in unlabeled data
• Bias-Variance Tradeoff – Underfitting vs. overfitting
• Occam’s Razor – Prefer simpler models
• Data Filtering – Preprocessing step for cleaning data
