
Bias and Variance

Minati Rath
Example: Linear regression (housing prices)

[Figure: three fits of Price vs. Size: a linear function, a quadratic function, and a higher-order polynomial.]

Bias vs. variance in linear regression

[Figure: the same Price vs. Size data with the three fits labeled "High bias (underfitting)", "Just right", and "High variance (overfitting)".]

Overfitting
If we have too many features, the learned hypothesis may fit the training set very well, but fail to generalize to new examples.
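As a quick illustration (not from the slides), the sketch below fits polynomials of degree 1, 3 and 15 to a small synthetic dataset; the low-degree fit underfits, while the high-degree fit matches the training points closely but generalizes poorly. The data and degrees are made up for illustration.

# Sketch: bias (underfitting) vs. variance (overfitting) with polynomial fits.
# Synthetic data; degrees chosen only to illustrate the effect.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)  # noisy training data

x_new = np.linspace(0, 1, 200)
y_new = np.sin(2 * np.pi * x_new)  # noise-free "new examples"

for degree in (1, 3, 15):
    coeffs = np.polyfit(x, y, deg=degree)                 # fit polynomial of given degree
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    new_err = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, new-data MSE {new_err:.3f}")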


Bias vs. variance in logistic regression

Example: Logistic regression


Sources of noise and error
While learning a target function using a training set
Two sources of noise
Some training points may not come exactly from the target
function: stochastic noise
The target function may be too complex to capture using the
chosen hypothesis set: deterministic noise
Generalization error: the model fits the noise in the training data, and this misfit carries over (is extrapolated) to the test set

Ways to handle noise


Validation
Check performance on data other than training data, and tune model
accordingly
Regularization
Constrain the model so that the noise cannot be learned too well
Validation
Divide given data into train set and test set
E.g., 80% train and 20% test
Better to select randomly
Learn parameters using training set
Check performance (validate the model) on test set, using
measures such as accuracy, misclassification rate, etc.
Trade-off: more data for training vs. validation
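
A minimal sketch of this kind of validation with scikit-learn; the dataset here is a synthetic stand-in for whatever feature matrix X and labels y you actually have:

# Sketch: 80/20 random train/test split, then validate with accuracy.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)  # stand-in data

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0)            # random 80% train / 20% test split

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # learn parameters on train set

y_pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred))
print("misclassification rate:", 1 - accuracy_score(y_test, y_pred))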
An example: model selection
• Which order polynomial will best fit a given dataset? Polynomials available: h1, h2, …, h10
• As if an extra parameter - degree of the polynomial - is to be
learned
• Approach 1
– Divide into train and test set
– Train each hypothesis on train set, measure error on test set
– Select the hypothesis with minimum test set error
• Problem with the previous approach
– The test set error we computed is not a true estimate of
generalization error
– Since our extra parameter (order of polynomial) is fit to the test
set
An example: model selection

Approach 2
– Divide data into train set (60%), validation set
(20%) and test set (20%)
– Select that hypothesis which gives lowest error on
validation set
– Use test set to estimate generalization error

Note: Test set not at all seen during training
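
A sketch of Approach 2 for the polynomial example, assuming a one-dimensional regression problem with synthetic data; the split sizes follow the 60/20/20 scheme above:

# Sketch: pick the polynomial degree on a validation set, report error on a held-out test set.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)  # synthetic data

# 60% train / 20% validation / 20% test split
idx = rng.permutation(len(x))
train, val, test = idx[:120], idx[120:160], idx[160:]

def mse(coeffs, xs, ys):
    return np.mean((np.polyval(coeffs, xs) - ys) ** 2)

# Fit h1..h10 on the train set, score each on the validation set
fits = {d: np.polyfit(x[train], y[train], deg=d) for d in range(1, 11)}
best_degree = min(fits, key=lambda d: mse(fits[d], x[val], y[val]))

# The test set is touched only once, to estimate generalization error
print("selected degree:", best_degree)
print("test MSE:", mse(fits[best_degree], x[test], y[test]))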


Popular methods of evaluating a classifier
• Holdout method
– Split data into train and test set (usually 2/3 for train and 1/3 for
test). Learn model using train set and measure performance
over test set
– Usually used when there is sufficiently large data, since both the
train and test sets must each get a sizable part of it
• Repeated Holdout method
– Repeat the Holdout method multiple times with different
subsets used for train/test
– In each iteration, a certain portion of data is randomly selected
for training, rest for testing
– The error rates on the different iterations are averaged to yield
an overall error rate
– More reliable than simple Holdout
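
A sketch of repeated holdout with scikit-learn on the same kind of synthetic stand-in data; ten random 2/3 train, 1/3 test splits are averaged:

# Sketch: repeated holdout; repeat a random train/test split several times and average the error.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)  # stand-in data

error_rates = []
for seed in range(10):                               # 10 holdout iterations
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=1/3, random_state=seed)      # 2/3 train, 1/3 test, new split each time
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    error_rates.append(1 - model.score(X_te, y_te))  # misclassification rate on this split

print("overall error rate:", np.mean(error_rates))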
Popular methods of evaluating a classifier
• k-fold cross-validation
– First step: data is split into k subsets of equal size;
– Second step: each subset in turn is used for testing and the
remainder for training
– Performance measures averaged over all folds

Popular choice for k: 10 or 5


Advantage: every available data point is used both to train and to test the model (in different folds)
k-fold cross validation (shown for k=3): each row is one iteration over the data, split into three folds

train   train   test
train   test    train
test    train   train
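
A sketch of k-fold cross-validation using scikit-learn's cross_val_score (k = 5 here), again on synthetic stand-in data:

# Sketch: k-fold cross-validation; each fold is the test set once, the rest train; scores are averaged.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)  # stand-in data

model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5)      # 5 folds: 5 train/test rounds

print("per-fold accuracy:", scores)
print("mean accuracy over folds:", np.mean(scores))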
Regularization
Addressing overfitting: Two ways
1. Reduce number of features
— Manually select which features to keep
— Problem: loss of some information (discarded features)
2. Regularization
— Keep all the features, but reduce magnitude/values of parameters
— Works well when we have a lot of features, each of which contributes a
bit to predicting the output

Intuition of regularization

[Figure: two fits of Price vs. Size of house.]

Suppose we penalize the parameters of the higher-order terms and make them really small.


Combatting Overfitting
➢ The problem of overfitting can be overcome by increasing the number
of input training data points
➢ The number of input data points should be at least 10
times the number of parameters or features
➢ But what if we have fewer data points?
➢ Put a bound on the regression coefficients by using regularization

Regularization for linear regression


In regularized linear regression, we choose to minimize
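
In the usual notation (m training examples, n features, hypothesis h_θ), the regularized least-squares objective referred to here typically takes the form (the exact scaling may differ from the slide):

J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]

Note that the penalty sum starts at j = 1, so θ0 is left unregularized, consistent with the convention below.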

By convention, regularization is
not applied on θ0 (makes little
difference to the solution)
λ: Regularization parameter

Smaller values of parameters lead to more generalizable models, less overfitting
L1, L2 and Elastic net Regularization
What we are discussing is called L2 regularization or “ridge”
regularization – it adds the squared magnitudes of the parameters as the
penalty term

Look up L1 or “Lasso” regularization


– adds the absolute values of the parameters as the penalty term

Elastic Net (Combination of L1 and L2 Regularization)


Effect: Combines the benefits of both Ridge and Lasso. It allows
for some coefficients to be set to zero (like Lasso) while shrinking
others (like Ridge). It is useful when there is multicollinearity, and
some feature selection is needed
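
A sketch comparing the three penalties with scikit-learn; the data is synthetic and the regularization strengths (alpha, l1_ratio) are placeholder values to experiment with:

# Sketch: L2 (Ridge), L1 (Lasso) and Elastic Net regularization for linear regression.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# Stand-in data with many features, only a few of which are informative
X, y = make_regression(n_samples=100, n_features=30, n_informative=5,
                       noise=10.0, random_state=0)

models = {
    "ridge (L2)": Ridge(alpha=1.0),                       # shrinks all coefficients
    "lasso (L1)": Lasso(alpha=1.0),                       # drives some coefficients to exactly zero
    "elastic net": ElasticNet(alpha=1.0, l1_ratio=0.5),   # mix of L1 and L2 penalties
}

for name, model in models.items():
    model.fit(X, y)
    n_zero = np.sum(model.coef_ == 0)
    print(f"{name:12s}: {n_zero} of {len(model.coef_)} coefficients are exactly zero")

Lasso and Elastic Net typically zero out some coefficients, which is the feature-selection effect mentioned above; Ridge only shrinks them.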
