
Department of CE

ML: Machine Learning (3170724)
Unit no. 3: Modelling and Evaluation

Prof. Hetvy Jadeja


Outline:
Selecting a Model: Predictive/Descriptive
Training a Model for supervised learning
Model Representation and Interpretability
Evaluating performance of a model
Improving performance of a model


Introduction
• The structured representation of raw input data to a meaningful pattern is called a model.
• The model might take different forms: a mathematical equation, a graph, a tree structure or a computational block.
• The decision regarding which model is to be selected for a specific data set is based on the problem to be solved and the type of data.
• E.g., when the problem is related to prediction and the target field is numeric and continuous, a regression model is selected.
• The process of selecting a model and fitting that specific model to the data set is called model training.
Selecting a Model
 Input variables can be denoted by X, while individual input variables are represented as X1, X2, …, Xn and the output variable by the symbol Y. The relationship between X and Y is represented in the general form: Y = f(X) + e, where f is the target function and 'e' is a random error term.
 A cost function helps to measure the extent to which the model is going wrong in estimating the relationship between X and Y.
 Loss function is synonymous with cost function – the only difference being that a loss function is usually defined on a single data point, while the cost function is computed over the entire training data set.
 Machine learning is an optimization problem. We define a model and tune its parameters to find the most suitable solution to a problem. However, we need a way to evaluate the quality or optimality of a solution. This is done using an objective function.
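As a small illustration of the loss/cost distinction (a sketch in Python, assuming squared error as the loss; the numbers are made up):

    import numpy as np

    # Hypothetical predictions and targets, for illustration only.
    y_true = np.array([3.0, 5.0, 2.5, 7.0])
    y_pred = np.array([2.8, 5.4, 2.0, 8.0])

    # Loss: defined on a single data point (here, squared error).
    loss_per_point = (y_true - y_pred) ** 2

    # Cost / objective function: aggregates the loss over the whole training
    # set (here, mean squared error), which the optimizer tries to minimize.
    cost = loss_per_point.mean()
    print(loss_per_point, cost)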
Selecting a Model

 For different types of ML, the model that has to be created/trained is different. Multiple factors play a role in the selection of a model; the two most important ones are:
 The kind of problem to be solved
 The nature of the underlying data
 There is no single model that works best for every machine learning problem – the "No Free Lunch" theorem.
 That is why, while doing data exploration, we need to understand the data characteristics, combine this understanding with the problem we are trying to solve, and then decide which model to select for solving the problem.
Predictive Models

 Predictive models predict the category or class to which a data instance belongs; these are classification models, e.g. KNN, DT, NB, ...
1. Predicting win/loss in a cricket match
2. Predicting whether a transaction is fraudulent
3. Predicting whether a customer may move to another product
 Predictive models are also used to predict numerical values of the target; these are regression models, e.g. LR
1. Prediction of revenue growth in the succeeding year
2. Prediction of rainfall amount in the coming monsoon
3. Prediction of potential flu patients and demand for flu shots next winter
Descriptive Models

 There is no target (Y) in the case of unsupervised learning.
 Descriptive models which group together similar data instances, i.e. data instances having similar values of the different features, are called clustering models, e.g. k-means.
1. Customer grouping or segmentation based on social, demographic and other attributes
2. Grouping of music based on different aspects like genre, language, etc.
3. Grouping of commodities in an inventory
 Descriptive models related to pattern discovery are used for market basket analysis of transactional data.
Training a Model

 Holdout method
 K-fold cross-validation method
 Bootstrap sampling
 Lazy v/s Eager learners
Holdout

 In the case of supervised learning, a model is trained using the labelled input data. However, how can we understand the performance of the model?
 The test data may not be available immediately.
 Also, the label value of the test data is not known.
That is the reason why a part of the input data is held
back (that is how the name holdout originates) for
evaluation of the model.
 This subset of the input data is used as the test data
for evaluating the performance of a trained model.
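A minimal sketch of the holdout method with scikit-learn (the 70/30 split ratio, the iris data and the KNN classifier are illustrative assumptions, not prescribed by the slide):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)

    # Hold back 30% of the labelled data as test data; stratify keeps the
    # class proportions similar in both partitions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=42)

    model = KNeighborsClassifier().fit(X_train, y_train)
    print(accuracy_score(y_test, model.predict(X_test)))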
K-fold CV

 With holdout, smaller data sets may pose the challenge of dividing the data of some of the classes proportionally amongst the training and test data sets.
 A special variant, called repeated holdout, is employed to ensure the randomness of the composed data sets.
 In repeated holdout, several random holdouts are used to measure the model performance. Finally, the average of all performances is taken.
 This process of repeated holdout is the basis of the k-fold cross-validation technique, in which the data set is divided into k completely distinct or non-overlapping random partitions called folds.
K-fold CV

 E.g. 3-fold CV

 Model1: Trained on Fold1 + Fold2, Tested on Fold3

 Model2: Trained on Fold2 + Fold3, Tested on Fold1

 Model3: Trained on Fold1 + Fold3, Tested on Fold2
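A sketch of the 3-fold scheme above using scikit-learn (the decision-tree classifier and the iris data are illustrative choices):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)

    # 3 folds: each fold is used once as test data, the other two for training.
    cv = KFold(n_splits=3, shuffle=True, random_state=42)
    scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=cv)

    print(scores)          # one accuracy value per fold
    print(scores.mean())   # the average performance reported for the model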


K-fold CV

 In 10-fold cross-validation, each of the 10 folds comprises approximately 10% of the data. One fold is used as the test data for validating the performance of a model trained on the remaining 9 folds (or 90% of the data). This is repeated 10 times, once with each of the 10 folds being used as the test data and the remaining folds as the training data. The average performance across all folds is reported.
 Leave-one-out cross-validation (LOOCV) is an extreme case of k-fold cross-validation that uses one record or data instance at a time as the test data. This maximizes the amount of data available for training, but it is computationally very expensive.
Bootstrap Sampling

 Bootstrap sampling or bootstrapping is a popular way to identify training and test data sets from the input data set.
 It uses the technique of Simple Random Sampling with Replacement.
 Bootstrapping randomly picks data instances from the input data set, with the possibility of the same data instance being picked multiple times.
 This means that from an input data set having N data instances, bootstrapping can create one or more training data sets of N data instances, with some instances repeated and some not appearing at all; the instances left out can be used as the test data.
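A sketch of bootstrap sampling with scikit-learn's resample utility (using the never-picked, "out-of-bag" instances as test data is an assumed convention here):

    import numpy as np
    from sklearn.utils import resample

    X = np.arange(20).reshape(-1, 1)   # toy data set with N = 20 instances
    y = np.arange(20) % 2

    # Simple random sampling with replacement over the row indices.
    idx = resample(np.arange(len(X)), replace=True, n_samples=len(X), random_state=1)
    X_train, y_train = X[idx], y[idx]            # may contain repeated instances

    # Instances never drawn (out-of-bag) can be held out as test data.
    oob = np.setdiff1d(np.arange(len(X)), idx)
    X_test, y_test = X[oob], y[oob]
    print(len(np.unique(idx)), "distinct instances in training;", len(oob), "out-of-bag")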
K-fold CV v/s
Bootstrap Sampling
 K-fold CV splits the data into non-overlapping folds, so each instance appears in the test data exactly once; bootstrap sampling draws instances with replacement, so an instance may appear in a training set multiple times or not at all.
Lazy v/s Eager

 Eager learning follows the general principles of machine learning –


it constructs a generalized target function during the training
phase.
 It follows the typical steps of machine learning, i.e. abstraction
and generalization and comes up with a trained model at the end
of the learning phase.
 Hence, when the test data comes in for classification, the eager
learner is ready with the model and doesn’t need to refer back to
the training data.
 Eager learners take more time in the learning phase than the lazy
learners.
 E.g. SVM, DT, NB, NN
Lazy v/s Eager

 Lazy learning, on the other hand, completely skips the abstraction and generalization processes.
 A lazy learner doesn't 'learn' anything. It uses the training data as-is and uses that knowledge to classify the unlabelled test data.
 Since lazy learning uses the training data exactly as given, it is also known as rote learning.
 Due to its dependency on the given training data instances, it is also known as instance learning.
 Lazy learners take very little time in training because not much training actually happens. However, they take more time in testing, as for each tuple of test data a comparison-based assignment of label happens.
 E.g. KNN
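A small sketch contrasting the two (KNN as the lazy learner, a decision tree as the eager learner; the data set and split are arbitrary choices for illustration):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X_train, X_test, y_train, y_test = train_test_split(
        *load_iris(return_X_y=True), test_size=0.3, random_state=0)

    # Lazy: fit() essentially just stores the training data; the real work
    # (distance comparisons) happens at prediction time.
    lazy = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

    # Eager: fit() builds a generalized model (the tree); prediction no longer
    # needs the training data.
    eager = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

    print(lazy.score(X_test, y_test), eager.score(X_test, y_test))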
Model Representation &
Interpretability
 A key consideration in learning the target function from the training data
is the extent of generalization.
 This is because the input data is just a limited, specific view and the new,
unknown data in the test data set may be differing from the training data.
 Underfitting: If the target function is kept too simple, it may not be able to
capture the essential nuances and represent the underlying data well.
 Underfitting results in both poor performance with training data as well as
poor generalization to test data.
 Underfitting can be avoided by
 using more training data
 reducing features by effective feature selection
Model Representation &
Interpretability
 Overfitting refers to a situation where the model has been designed in such a way that it emulates the training data too closely.
 Any specific deviation in the training data, like noise or outliers, gets embedded in the model.
 It adversely impacts the model performance on the test data.
 Overfitting results in good performance on the training data set, but poor generalization to the test data set.
 Overfitting can be avoided by
 using re-sampling techniques like k-fold cross-validation
 holding back a validation data set
Model Representation &
Interpretability
 Errors in learning can be due to 'bias' and due to 'variance'.
 Bias: errors due to bias arise from simplifying assumptions made by the model to make the target function less complex or easier to learn. Underfitting results in high bias.
 Variance: errors due to variance occur from differences in the training data sets used to train the model.
 Ideally, the difference between the data sets should not be significant, and models trained using different training data sets should not be too different. However, in the case of overfitting, since the model closely matches the training data, even a small difference in the training data gets magnified in the model, resulting in high variance.
Evaluating Model Performance

 FOR CLASSIFICATION
 There are four possibilities for cricket match win/loss
prediction:
1. the model predicted win and the team won
2. the model predicted win and the team lost
3. the model predicted loss and the team won
4. the model predicted loss and the team lost
 The first case is where the model has correctly classified data instances as the class of interest: True Positive (TP) cases.
 The second case is where the model has incorrectly classified data instances as the class of interest: False Positive (FP) cases.
 The third case is where the model has incorrectly classified data instances as not the class of interest: False Negative (FN) cases.
 The fourth case is where the model has correctly classified data instances as not the class of interest: True Negative (TN) cases.
 These four counts are summarized in a Confusion Matrix.
Evaluating Model Performance

 Accuracy: model accuracy is given by the total number of correct classifications (either as the class of interest, i.e. True Positive, or as not the class of interest, i.e. True Negative) divided by the total number of classifications done.
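A sketch of the accuracy computation from the four confusion-matrix counts (the win/loss labels below are made-up values; 1 = win, the class of interest):

    from sklearn.metrics import accuracy_score, confusion_matrix

    y_actual    = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    y_predicted = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()

    # Accuracy = (TP + TN) / (TP + TN + FP + FN)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    print(accuracy, accuracy_score(y_actual, y_predicted))   # both give the same value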
Evaluating Model Performance

 Cohen's kappa is a measure of how closely the instances classified by the machine learning classifier match the data labelled as ground truth.
 The kappa value of a model indicates the model accuracy adjusted for the agreement that would be expected purely by chance.
 P(a) is the proportion of observed agreement between actual and predicted labels.
 P(pr) is the proportion of expected (chance) agreement between actual and predicted labels.
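The kappa formula itself did not survive the slide export; in terms of the two proportions above, the standard definition is kappa = (P(a) − P(pr)) / (1 − P(pr)). A sketch with made-up labels, cross-checked against scikit-learn:

    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    y_actual    = np.array([1, 1, 1, 0, 0, 0, 1, 0, 1, 0])
    y_predicted = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])

    p_a = np.mean(y_actual == y_predicted)          # observed agreement
    # Expected (chance) agreement from the marginal class proportions.
    p_pr = (np.mean(y_actual == 1) * np.mean(y_predicted == 1)
            + np.mean(y_actual == 0) * np.mean(y_predicted == 0))

    kappa = (p_a - p_pr) / (1 - p_pr)
    print(kappa, cohen_kappa_score(y_actual, y_predicted))   # the two values match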
Evaluating Model Performance

 Sensitivity of a model measures the proportion of positive cases which were correctly classified.
 E.g. in malignancy detection, the sensitivity measure gives the proportion of tumours which are actually malignant and are predicted as malignant.
 In such problems, a high value of sensitivity is more desirable than a high value of accuracy.
Evaluating Model Performance

 Specificity of a model measures the proportion of negative examples which are correctly classified.
 Specificity is also called the True Negative Rate (TNR).
Evaluating Model Performance

 Precision indicates the reliability of a model in


predicting a class of interest.
 For the example of win / loss prediction, precision
indicates how often the model predicts the win
correctly.
 Recall indicates the proportion of correct prediction
of positives to the total number of positives. For the
example of win / loss prediction, recall resembles
what proportion of the total wins were predicted
correctly.
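The formulas behind these metrics, computed from the confusion-matrix counts (a sketch reusing the made-up win/loss labels from the accuracy example):

    from sklearn.metrics import confusion_matrix, precision_score, recall_score

    y_actual    = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    y_predicted = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
    tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()

    sensitivity = tp / (tp + fn)   # = recall = True Positive Rate
    specificity = tn / (tn + fp)   # = True Negative Rate
    precision   = tp / (tp + fp)   # reliability of the positive predictions

    print(sensitivity, recall_score(y_actual, y_predicted))
    print(precision, precision_score(y_actual, y_predicted))
    print(specificity)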
Evaluating Model Performance

 The Receiver Operating Characteristic (ROC) curve helps in visualizing the performance of a classification model. It shows the efficiency of a model in detecting true positives while avoiding false positives.
 In the ROC curve, FPR (x-axis) is plotted against TPR (y-axis) at different classification thresholds.
 The area under the curve (AUC) value is the area of the two-dimensional space under the curve from (0, 0) to (1, 1), where each point on the curve gives a (FPR, TPR) pair at a specific classification threshold.
 The AUC value ranges from 0 to 1: an AUC of 0.5 indicates no predictive ability (equivalent to random guessing), and values closer to 1 indicate better performance.
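A sketch of tracing the ROC curve and computing AUC with scikit-learn (the ground-truth labels and predicted win probabilities are assumed values):

    import numpy as np
    from sklearn.metrics import roc_curve, roc_auc_score

    y_actual = np.array([1, 1, 0, 1, 0, 0, 1, 0])
    y_score  = np.array([0.9, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1])

    fpr, tpr, thresholds = roc_curve(y_actual, y_score)   # one point per threshold
    auc = roc_auc_score(y_actual, y_score)

    print(list(zip(fpr, tpr)))   # the (FPR, TPR) points that trace the ROC curve
    print(auc)                   # area under that curve, between 0 and 1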
Evaluating Model Performance

 FOR PREDICTION
 A regression model which
ensures that the difference
between predicted and
actual values is low can be
considered as a good model.
 The distance between the
actual value and the fitted
or predicted value, i.e. ŷ is
known as residual.
 The regression model can
be considered to be fitted
well if the difference
between actual and
predicted value, i.e. the
residual value is less.
Evaluating Model Performance

 Sum of Squares Total (SST) = sum of the squared differences of each observation from the overall mean
 Sum of Squared Errors (SSE) of prediction = sum of the squared residuals
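A sketch of the two sums of squares, together with the usual summary derived from them, R² = 1 − SSE/SST (the actual and predicted values are made up):

    import numpy as np
    from sklearn.metrics import r2_score

    y_actual    = np.array([3.0, 5.0, 2.5, 7.0, 4.5])
    y_predicted = np.array([2.8, 5.4, 2.0, 7.5, 4.0])

    sst = np.sum((y_actual - y_actual.mean()) ** 2)   # total variation around the mean
    sse = np.sum((y_actual - y_predicted) ** 2)       # sum of squared residuals

    r_squared = 1 - sse / sst
    print(sst, sse, r_squared, r2_score(y_actual, y_predicted))   # last two values match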
Evaluating Model Performance

 For a data set clustered into ‘k’ clusters, silhouette width is calculated
as:
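The formula did not survive the slide export; the standard per-instance definition, which is then averaged over the data set, is s(i) = (b(i) − a(i)) / max(a(i), b(i)), where a(i) is the mean distance from instance i to the other members of its own cluster and b(i) is the mean distance to the members of the nearest other cluster. A sketch (the blob data and k = 3 are illustrative assumptions):

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    # Toy data with an obvious cluster structure.
    X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

    labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
    print(silhouette_score(X, labels))   # average silhouette width, in [-1, +1]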
Evaluating Model Performance

 FOR CLUSTERING
1. Internal evaluation
 Internal evaluation methods generally measure cluster quality based on the homogeneity of data belonging to the same cluster and the heterogeneity of data belonging to different clusters. The homogeneity/heterogeneity is decided by some similarity measure.
 The silhouette coefficient, one of the most popular internal evaluation methods, uses distance (Euclidean or Manhattan distance most commonly) between data elements as the similarity measure. The value of silhouette width ranges between –1 and +1, with a high value indicating high intra-cluster homogeneity and inter-cluster heterogeneity.
Evaluating Model Performance

2. External evaluation
 In this approach, the class label is known for the data set subjected to clustering. However, quite obviously, the known class labels are not a part of the data used in clustering. The clustering algorithm is assessed based on how closely its results match those known class labels.
 Purity is one of the most popular measures for clustering algorithms – it evaluates the extent to which each cluster contains a single class.
 For a data set having 'n' data instances and 'c' known class labels, for which 'k' clusters are generated, purity is measured as:
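The purity formula missing above is, in its standard form, purity = (1/n) Σ_k max_c |cluster_k ∩ class_c|, i.e. each cluster is credited with its single most frequent class. A sketch with made-up cluster and class labels:

    import numpy as np

    # Hypothetical known class labels and cluster assignments for n = 10 instances.
    classes  = np.array(['a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c'])
    clusters = np.array([ 0,   0,   1,   1,   1,   1,   2,   2,   2,   2 ])

    n, total = len(classes), 0
    for k in np.unique(clusters):
        members = classes[clusters == k]
        _, counts = np.unique(members, return_counts=True)
        total += counts.max()          # count of the most frequent class in cluster k

    print(total / n)                   # purity = 0.8 for these labels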
Improving Model Performance

 Model parameter tuning is the process of adjusting the model fitting options. E.g. in KNN, the model can be tuned by using different values of K.
 Ensemble: several models with diverse strengths are combined together. One model may learn one type of data set well while struggling with another type; another model may perform well with the data set the first one struggled with.
 An ensemble helps in reducing the bias of the different models and also in reducing the variance.
 Ensemble methods combine weaker learners to create a stronger model.
Improving Model Performance

 Build a number of models based on the training data.
 To diversify the models generated, bootstrapping can be used to generate unique training data sets.
 Alternatively, the same training data may be used but the combined models are quite varied, e.g. SVM, neural network, KNN, etc.
 The outputs from the different models are combined using a combination function, e.g. majority voting of the different models.
 For example, if 3 out of 5 models predict 'win' and 2 predict 'loss', then the final outcome of the ensemble using majority vote would be a 'win'.
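A sketch of combining heterogeneous models with majority voting, using scikit-learn's VotingClassifier (the three base models and the iris data are illustrative choices):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import VotingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)

    ensemble = VotingClassifier(
        estimators=[('svm', SVC()),
                    ('tree', DecisionTreeClassifier(random_state=0)),
                    ('knn', KNeighborsClassifier())],
        voting='hard')   # 'hard' = majority vote on the predicted class labels

    print(cross_val_score(ensemble, X, y, cv=5).mean())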
Improving Model Performance

 Bagging (bootstrap aggregating): uses bootstrap


sampling method to generate multiple training data
sets, which are used to train a set of models using
the same learning algorithm. The outcomes are
combined by majority voting (classification) or by
average (regression).
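A minimal bagging sketch with scikit-learn's BaggingClassifier (its default base learner is a decision tree; the data set is an arbitrary choice):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # 25 base models, each trained on a bootstrap sample of the data; their
    # predictions are combined by voting (a regressor would average them).
    bagging = BaggingClassifier(n_estimators=25, random_state=0)

    print(cross_val_score(bagging, X, y, cv=5).mean())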
Improving Model Performance

 Boosting is another ensemble-based technique in which weak learning models are trained on resampled data and the outcomes are combined using a weighted voting approach based on the performance of the various models.
 Algorithm for boosting: initialize a weight vector with uniform weights
 Loop:
– Apply a weak learner to the weighted training examples (instead of the original training set, draw bootstrap samples with weighted probability)
– Increase the weights of misclassified examples
 Weighted majority voting on the trained classifiers gives the final prediction
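A sketch using AdaBoost, one standard boosting algorithm (scikit-learn's implementation reweights the training instances directly rather than drawing weighted bootstrap samples; the data set is an arbitrary choice):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # Weak learners (shallow trees by default) are trained sequentially, with
    # misclassified instances up-weighted each round; the final prediction is
    # a weighted vote of all the learners.
    boosting = AdaBoostClassifier(n_estimators=50, random_state=0)

    print(cross_val_score(boosting, X, y, cv=5).mean())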
Improving Model Performance

 Random forest is another ensemble-based technique. It is an ensemble of decision trees – hence the name random forest, indicating a forest of decision trees.
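A minimal random forest sketch with scikit-learn (each tree is grown on a bootstrap sample and considers a random subset of features at each split; the data set is an arbitrary choice):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    print(cross_val_score(forest, X, y, cv=5).mean())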
Department of CE

ML: Machine Learning (3170724)
Unit no. 3: Modelling and Evaluation

Thanks

Prof. Hetvy Jadeja
