UCS-401 - CSE7th M L Lect 06 - Done


Machine Learning

Topics
Regression: Assessing the performance of regression - error measures; Overfitting - catalysts for overfitting.

Prepared by Maneesh Kumar Jamwal, 21-10-2024
Regression
Regression is a type of supervised machine
learning algorithm used for predicting a
continuous output variable based on one or more
input features.
Its goal is to establish a relationship between the
dependent variable (target) and independent
variables (features).
Types of Regression:
1. Linear Regression: This method fits a straight
line (the line of best fit) through the data points,
with the equation
Y = β0 + β1X + ϵ, where Y is the predicted output,
X is the input feature, β0 and β1 are the
coefficients, and ϵ (epsilon) represents the error term.
2. Polynomial Regression: Extends linear
regression by allowing polynomial relationships
between input and output. It fits a curve to the
data instead of a straight line.
3. Logistic Regression: Although typically used for
classification, logistic regression predicts the
probability of a binary outcome, using the logistic
(sigmoid) function to map predicted values to a
probability between 0 and 1.
4. Ridge and Lasso Regression: Both are types of
linear regression with regularization to prevent
overfitting. Ridge adds an L2 penalty (squared
magnitude of coefficients), while Lasso adds an L1
penalty (absolute value of coefficients).
Regression is widely used in forecasting, risk
analysis, and other areas where predicting
numeric values is essential.
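As a minimal sketch of the linear model above, the least-squares coefficients β0 and β1 for a single feature can be computed in closed form. This is illustrative code (not from the lecture), using made-up data points:

```python
# Ordinary least squares for simple linear regression Y = b0 + b1*X.
# Closed form: b1 = cov(X, Y) / var(X), b0 = mean(Y) - b1 * mean(X).

def fit_simple_linear(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    b0 = my - b1 * mx
    return b0, b1

# Data lying exactly on Y = 2 + 3X, so the fit recovers those coefficients.
b0, b1 = fit_simple_linear([1, 2, 3, 4], [5, 8, 11, 14])
print(b0, b1)  # → 2.0 3.0
```

With real (noisy) data the recovered coefficients would only approximate the true relationship, and the residual term ϵ captures what the line cannot explain.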
Assessing performance of Regression- Error measures:
In regression analysis, error measures assess how well a
model predicts continuous values compared to actual
values. Common error measures include:
1. Mean Squared Error (MSE): This metric calculates the
average of the squared differences between predicted
and actual values. MSE penalizes larger errors more
heavily.
MSE = (1/n) ∑i=1..n (yi − ŷi)²
2. Root Mean Squared Error (RMSE): The square
root of MSE, providing an error metric in the same
units as the target variable, making it easier to
interpret.
RMSE = √MSE
3. Mean Absolute Error (MAE): This is the average
of the absolute differences between predicted and
actual values. It gives equal weight to all errors.
MAE = (1/n) ∑i=1..n |yi − ŷi|
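The three error measures can be computed side by side. This sketch uses plain Python with illustrative values (not from the lecture); note how the single large error of 3 inflates MSE far more than MAE, reflecting MSE's heavier penalty on large errors:

```python
import math

# actual = yi, predicted = ŷi in the formulas above.
def mse(actual, predicted):
    return sum((y - yh) ** 2 for y, yh in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    return math.sqrt(mse(actual, predicted))

def mae(actual, predicted):
    return sum(abs(y - yh) for y, yh in zip(actual, predicted)) / len(actual)

y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 10.0]  # absolute errors: 1, 0, 3

print(mse(y_true, y_pred))   # (1 + 0 + 9) / 3 ≈ 3.333
print(rmse(y_true, y_pred))  # √3.333 ≈ 1.826, in the units of y
print(mae(y_true, y_pred))   # (1 + 0 + 3) / 3 ≈ 1.333
```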
Overfitting- Catalysts for Overfitting.
Overfitting occurs when a model learns not only
the underlying patterns in the data but also the
noise and random fluctuations, leading to
excellent performance on training data but poor
generalization to unseen data.
Several factors can act as catalysts for overfitting:
1. Complex Models: Using overly complex
algorithms or architectures (like deep neural
networks with many layers) can lead to overfitting,
especially when the model has more parameters
than necessary.
2. Insufficient Training Data: A small dataset
increases the likelihood that the model will
memorize rather than generalize from the data.
3. High Dimensionality: When the feature space is
high-dimensional, the risk of overfitting increases,
as the model may find patterns that don't
generalize well.
4. Noise in Data: If the training data contains a lot
of noise or irrelevant features, the model can learn
these misleading signals instead of the true
patterns.
5. Lack of Regularization: Without techniques like
L1 or L2 regularization, dropout, or early stopping,
models can easily overfit to the training data.
6. Inadequate Cross-Validation: Poor validation
strategies can lead to misleading performance
metrics, giving the impression that the model is
performing well when it may not generalize.
7. Imbalanced Datasets: When certain classes are
overrepresented, the model may overfit to these
dominant classes while neglecting minority classes.
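Catalyst 5 (lack of regularization) can be made concrete with a one-feature ridge fit. For centered data with no intercept, minimizing the squared error plus an L2 penalty λ·b² gives a closed-form slope, and increasing λ visibly shrinks it toward zero. This is an illustrative sketch with made-up data, not material from the lecture:

```python
# Ridge regression with one centered feature and no intercept:
# minimizing sum((y - b*x)^2) + lam * b^2 yields
#   b = sum(x*y) / (sum(x^2) + lam).

def ridge_slope(xs, ys, lam):
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [-2.0, -1.0, 1.0, 2.0]
ys = [-4.1, -1.9, 2.1, 3.9]  # roughly y = 2x plus a little noise

for lam in (0.0, 1.0, 10.0):
    print(lam, ridge_slope(xs, ys, lam))
# As lam grows, the slope shrinks toward 0: an unregularized model
# (lam = 0) is free to chase noise, while the L2 penalty pulls
# coefficients back, trading a little training error for stability.
```

Lasso behaves analogously but with an L1 penalty, which can shrink coefficients exactly to zero and thereby drop features entirely.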
