DS Notes Unit - V


UNIT V

MODEL EVALUATION

GENERALIZATION ERROR

EVALUATION METRICS

Model evaluation metrics measure the performance of a machine learning model and are an integral component of any data science project. They aim to estimate the generalization accuracy of a model on future (unseen/out-of-sample) data.

Confusion Matrix

A confusion matrix is a matrix representation of the prediction results of a binary classification test. It is often used to describe the performance of a classification model on a set of test data for which the true values are known.

The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

Each prediction can be one of four outcomes, based on how it matches up to the actual value:

True Positive (TP): Predicted True and True in reality.

True Negative (TN): Predicted False and False in reality.

False Positive (FP): Predicted True and False in reality.

False Negative (FN): Predicted False and True in reality.
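As a quick illustration, here is a minimal sketch assuming scikit-learn; the label vectors below are made up purely for illustration.

from sklearn.metrics import confusion_matrix

# Toy labels: 1 = positive class, 0 = negative class
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary data, the flattened matrix comes back in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)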

Now let us understand this concept using hypothesis testing.

A hypothesis is a speculation or theory based on insufficient evidence that lends itself to further testing and experimentation. With further testing, a hypothesis can usually be proven true or false.

A null hypothesis is a hypothesis that says there is no statistical significance between the two variables in the hypothesis. It is the hypothesis that the researcher is trying to disprove.

Ideally, we would always reject the null hypothesis when it is false, and accept it when it is indeed true.

Even though hypothesis tests are meant to be reliable, there are two types of errors that can occur. These errors are known as Type I and Type II errors.

For example, when examining the effectiveness of a drug, the null hypothesis would be that the drug does not affect the disease.

Type I Error: equivalent to False Positives (FP).

The first kind of error involves the rejection of a null hypothesis that is true. Going back to the example of a drug being used to treat a disease: if we reject the null hypothesis in this situation, then we claim that the drug does have some effect on the disease. But if the null hypothesis is true, then, in reality, the drug does not combat the disease at all. The drug is falsely claimed to have a positive effect on the disease.

Type II Error: equivalent to False Negatives (FN).

The other kind of error occurs when we accept a false null hypothesis. This sort of error is called a Type II error and is also referred to as an error of the second kind.

CROSS VALIDATION

Cross-validation is a technique for assessing how a statistical analysis generalizes to an independent data set. It is a technique for evaluating machine learning models by training several models on subsets of the available input data and evaluating them on the complementary subset of the data. Using cross-validation, there is a high chance that we can detect overfitting with ease.

K-Fold Cross Validation

First, I would like to introduce you to a golden rule: never mix training and test data. Your first step should always be to isolate the test data set and use it only for final evaluation. Cross-validation will thus be performed on the training set.

Initially, the entire training data set is broken up into k equal parts. The first part is kept as the holdout (testing) set and the remaining k-1 parts are used to train the model. The trained model is then tested on the holdout set. This process is repeated k times, each time changing the holdout set. Thus, every data point gets an equal opportunity to be included in the test set.

Usually, k is equal to 3 or 5. It can be extended to higher values like 10 or 15, but this becomes extremely computationally expensive and time-consuming. Let us have a look at how we can implement this with a few lines of Python code and the scikit-learn API.

from sklearn.model_selection import cross_val_score

# model is any scikit-learn estimator; cv=5 requests 5-fold cross-validation
print(cross_val_score(model, X_train, y_train, cv=5))

We pass the model or classifier object, the features, the labels, and the parameter cv, which indicates the k for k-fold cross-validation. The method will return a list of k score values, one for each iteration. In general, we take their average and use it as a consolidated cross-validation score.

import numpy as np
print(np.mean(cross_val_score(model, X_train, y_train, cv=5)))
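For a self-contained version, here is a minimal sketch assuming scikit-learn; the iris data and logistic-regression classifier are arbitrary choices for illustration.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Golden rule: isolate a test set that cross-validation never touches
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X_train, y_train, cv=5)  # 5-fold CV on the training set only
print(scores)           # one accuracy value per fold
print(np.mean(scores))  # consolidated cross-validation score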

Although it might be computationally expensive, cross-validation is essential for evaluating the performance of the learning model.

Overfitting and Underfitting :

What is Overfitting?

When a model performs very well for training data but has poor performance with test data (new
data), it is known as overfitting. In this case, the machine learning model learns the details and
noise in the training data such that it negatively affects the performance of the model on test data.
Overfitting can happen due to low bias and high variance.

Reasons for Overfitting

Data used for training is not cleaned and contains noise (garbage values)

The model has a high variance

The size of the training dataset is not large enough

The model is too complex

Ways to Tackle Overfitting

Using K-fold cross-validation (see the sketch after this list)

Using regularization techniques such as Lasso and Ridge

Training the model with sufficient data

Adopting ensembling techniques
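To make the first remedy concrete, here is a minimal sketch, assuming scikit-learn and using an unconstrained decision tree on the iris data purely for illustration, of how cross-validation can expose overfitting.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# An unconstrained tree can memorize the training data (low bias, high variance)
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)

train_score = tree.score(X_train, y_train)
cv_score = cross_val_score(tree, X_train, y_train, cv=5).mean()

# A training score far above the cross-validation score is a warning sign of overfitting
print(f"train accuracy: {train_score:.3f}, cross-validation accuracy: {cv_score:.3f}")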

What is Underfitting?

When a model has not learned the patterns in the training data well and is unable to generalize
well on the new data, it is known as underfitting. An underfit model has poor performance on the
training data and will result in unreliable predictions. Underfitting occurs due to high bias and low
variance.

Reasons for Underfitting

Data used for training is not cleaned and contains noise (garbage values) in it

The model has a high bias

The size of the training dataset is not large enough

The model is too simple

Ways to Tackle Underfitting

Increase the number of features in the dataset

Increase model complexity


Reduce noise in the data

Increase the duration of training the data

Ridge Regression

Ridge regression is a model tuning method that is used to analyse any data that suffers from multicollinearity. This method performs L2 regularization. When multicollinearity occurs, least-squares estimates are unbiased, but their variances are large, so the predicted values may be far away from the actual values.

The cost function for ridge regression:

Min(||Y - X(theta)||^2 + lambda * ||theta||^2)

Lambda is the penalty term. The lambda given here is denoted by an alpha parameter in the ridge function.

Ridge Regression Models

For any type of regression machine learning model, the usual regression equation forms the base
which is written as:

Y = XB + e

Where Y is the dependent variable, X represents the independent variables, B is the regression coefficients to be estimated, and e represents the errors or residuals.
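As a minimal sketch in Python, assuming scikit-learn, whose Ridge estimator calls the penalty term alpha; the synthetic data below is purely for illustration.

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# Synthetic regression data purely for illustration
X, y = make_regression(n_samples=100, n_features=4, noise=10.0, random_state=0)

# alpha plays the role of the lambda penalty in the cost function above
ridge = Ridge(alpha=1.6)
ridge.fit(X, y)

print(ridge.intercept_, ridge.coef_)  # estimated coefficients B in Y = XB + e
print(ridge.predict(X[:5]))           # predictions for the first five rows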

Ridge Regression Predictions


We now show how to make predictions from a ridge regression model. In particular, we will make predictions based on the ridge regression model created for Example 1 with lambda = 1.6. The raw input data is repeated in range A1:E19 of Figure 1 and the unstandardized regression coefficients calculated in Figure 2 of the Ridge Regression Analysis Tool are repeated in range G2:H6 of Figure 1.

The predictions for the input data are shown in column J. In fact, the values in range J2:J19 can
be calculated by the array formula

=H2+MMULT(A2:D19,H3:H6).

Alternatively, they can be calculated by the array formula

=RidgePred(A2:D19,A2:D19,E2:E19,H9)

Real Statistics Function: The Real Statistics Resource Pack provides the following functions.

RidgeMSE(Rx, Ry, lambda) = MSE of the Ridge regression defined by the x data in Rx, y data
in Ry and the given lambda value.

RidgePred(Rx0, Rx, Ry, lambda): returns an array of predicted y values for the x data in range
Rx0 based on the Ridge regression model defined by Rx, Ry and lambda; if Rx0 contains only
one row then only one y value is returned.

GRID SEARCH

Grid search is a technique used to identify the optimal hyperparameters of a model, i.e. those that result in the most accurate predictions. Unlike model parameters, hyperparameters are not learned from the data; to find good values, we create a model for each combination of hyperparameters.

Grid search is thus considered a very traditional hyperparameter optimization method, since we basically try out all possible combinations. The models are then evaluated through cross-validation. The model boasting the best accuracy is naturally considered to be the best.

A model hyperparameter is a characteristic of a model that is external to the model and whose value cannot be estimated from the data.

Cross validation

We have mentioned that cross-validation is used to evaluate the performance of the models.
Cross-validation measures how a model generalizes itself to an independent dataset. We use
cross-validation to get a good estimate of how well a predictive model performs.

With this method, we have a pair of datasets: an independent dataset and a training dataset. We
can partition a single dataset to yield the two sets. These partitions are of the same size and are
referred to as folds. A model in consideration is trained on all folds, bar one.

The excluded fold is then used to test the model. This process is repeated until all folds have been used as the test set. The average performance of the model across all folds is then used to estimate its performance.

In the technique known as k-fold cross-validation, the user specifies the number of folds, represented by k. This means that when k = 5, there are 5 folds.

Figure: K-fold cross-validation with k = 5.

Grid search implementation

The example given below is a basic implementation of grid search. We first specify the
hyperparameters we seek to examine. Then we provide a set of values to test.

1. Load dataset.

The first step is loading the dataset using from sklearn.datasets import load_iris and iris = load_iris(). The iris dataset is built into the scikit-learn library in Python. The data is stored in a 150 x 4 array.
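In code, that first step is simply:

from sklearn.datasets import load_iris

iris = load_iris()
print(iris.data.shape)  # (150, 4): 150 samples, 4 features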

2. Import GridSearchCV, svm and SVR.

After loading the dataset, we then import GridSearchCV, as well as svm and SVR:

from sklearn.model_selection import GridSearchCV
from sklearn import svm
from sklearn.svm import SVR

3. Set estimator parameters.

In this implementation, we use the rbf kernel of the SVR model. rbf stands for the radial basis
function. It introduces some form of non-linearity to the model since the data in use is non-linear.
By this, we mean that the data arrangement follows no specific sequence.

estimator=SVR(kernel='rbf')
4. Specify hyperparameters and range of values.

For the rbf kernel, the three hyperparameters to tune are C, epsilon, and gamma. We can give each one several values to choose from.
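A hedged sketch of such a grid is shown below; the specific values are arbitrary choices for illustration.

param_grid = {
    'C': [0.1, 1, 10, 100],          # regularization strength
    'epsilon': [0.01, 0.1, 0.5, 1],  # width of the epsilon-insensitive tube
    'gamma': [0.001, 0.01, 0.1, 1],  # kernel coefficient for the rbf kernel
}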

5. Evaluation.

We mentioned that cross-validation is carried out to estimate the performance of a model. In k-fold cross-validation, k is the number of folds. As shown below, through cv=5, we use cross-validation to train the model 5 times. This means that 5 is the k value.

6. Fitting the data.

We do this through grid.fit(X,y), which performs the fitting for every combination of parameters.
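Putting the steps together, here is a minimal end-to-end sketch. It assumes the iris data and the arbitrary hyperparameter values above; note that iris is a classification dataset, so fitting SVR to it simply mirrors the walkthrough rather than best practice.

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

iris = load_iris()
X, y = iris.data, iris.target

estimator = SVR(kernel='rbf')
param_grid = {
    'C': [0.1, 1, 10, 100],
    'epsilon': [0.01, 0.1, 0.5, 1],
    'gamma': [0.001, 0.01, 0.1, 1],
}

# cv=5 requests 5-fold cross-validation; each hyperparameter combination is evaluated on every fold
grid = GridSearchCV(estimator, param_grid, cv=5)
grid.fit(X, y)

print(grid.best_params_)  # the combination with the best cross-validation score
print(grid.best_score_)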

