
Department of CE

ML: Machine Learning (3170724)
Unit no. 3: Modelling and Evaluation

Prof. Hetvy Jadeja


Outline:
Selecting a Model: Predictive/Descriptive
Training a Model for supervised learning
Model Representation and Interpretability
Evaluating performance of a model
Improving performance of a model


Introduction
• The structured representation of raw input data to a meaningful pattern is called a model.
• The model might take different forms: a mathematical equation, a graph, a tree structure or a computational block.
• The decision regarding which model is to be selected for a specific data set is based on the problem to be solved and the type of data.
• E.g., when the problem is related to prediction and the target field is numeric and continuous, a regression model is selected.
• The process of selecting a model and fitting that specific model to the data set is called model training.
Selecting a Model
 Input variables can be denoted by X, while individual input variables are represented as X1, X2, …, Xn and the output variable by the symbol Y. The relationship between X and Y is represented in the general form: Y = f(X) + e, where f is the target function and 'e' is a random error term.
 A cost function helps to measure the extent to which the model is going wrong in estimating the relationship between X and Y.
 Loss function is synonymous with cost function – the only difference being that a loss function is usually defined on a single data point, while the cost function is computed over the entire training data set.
 Machine learning is an optimization problem. We define a model and tune its parameters to find the most suitable solution to a problem. However, we need a way to evaluate the quality or optimality of a solution. This is done using an objective function.
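As a small illustration of the loss/cost distinction (a sketch in Python, assuming squared error as the loss; the numbers are made up):

    import numpy as np

    # Hypothetical predictions and targets, for illustration only.
    y_true = np.array([3.0, 5.0, 2.5, 7.0])
    y_pred = np.array([2.8, 5.4, 2.0, 8.0])

    # Loss: defined on a single data point (here, squared error).
    loss_per_point = (y_true - y_pred) ** 2

    # Cost / objective function: aggregates the loss over the whole training
    # set (here, mean squared error), which the optimizer tries to minimize.
    cost = loss_per_point.mean()
    print(loss_per_point, cost)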
Selecting a Model

 For different types of ML, the model that has to be created/trained is different. Multiple factors play a role in the selection of a model; the two most important ones are:
 The kind of problem to be solved
 The nature of the underlying data
 There is no single model that works best for every machine learning problem – the "No Free Lunch" theorem.
 That is why, while doing data exploration, we need to understand the data characteristics, combine this understanding with the problem we are trying to solve, and then decide which model to select for solving the problem.
Predictive Models

 Predictive models predict the category or class to which a data instance belongs; these are classification models, e.g. KNN, DT, NB, ...
1. Predicting win/loss in a cricket match
2. Predicting whether a transaction is fraudulent
3. Predicting whether a customer may move to another product
 Predictive models are also used to predict numerical values of the target; these are regression models, e.g. LR
1. Prediction of revenue growth in the succeeding year
2. Prediction of rainfall amount in the coming monsoon
3. Prediction of potential flu patients and demand for flu shots next winter
Descriptive Models

 There is no target (Y) in the case of unsupervised learning.
 Descriptive models which group together similar data instances, i.e. data instances having similar values of the different features, are called clustering models, e.g. k-means.
1. Customer grouping or segmentation based on social, demographic and other attributes
2. Grouping of music based on different aspects like genre, language, etc.
3. Grouping of commodities in an inventory
 Descriptive models related to pattern discovery are used for market basket analysis of transactional data.
Training a Model

 Holdout method
 K-fold cross-validation method
 Bootstrap sampling
 Lazy v/s Eager learners
Holdout

 In the case of supervised learning, a model is trained using the labelled input data. However, how can we understand the performance of the model?
 The test data may not be available immediately.
 Also, the label value of the test data is not known.
That is the reason why a part of the input data is held
back (that is how the name holdout originates) for
evaluation of the model.
 This subset of the input data is used as the test data
for evaluating the performance of a trained model.
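A minimal sketch of the holdout method with scikit-learn (the 70/30 split ratio, the iris data and the KNN classifier are illustrative assumptions, not prescribed by the slide):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)

    # Hold back 30% of the labelled data as test data; stratify keeps the
    # class proportions similar in both partitions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=42)

    model = KNeighborsClassifier().fit(X_train, y_train)
    print(accuracy_score(y_test, model.predict(X_test)))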
K-fold CV

 With holdout, smaller data sets may pose the challenge of dividing the data of some of the classes proportionally amongst the training and test data sets.
 A special variant, called repeated holdout, is employed to ensure the randomness of the composed data sets.
 In repeated holdout, several random holdouts are used to measure the model performance. Finally, the average of all performances is taken.
 This process of repeated holdout is the basis of the k-fold cross-validation technique, in which the data set is divided into k completely distinct or non-overlapping random partitions called folds.
K-fold CV

 E.g. 3-fold CV

 Model1: Trained on Fold1 + Fold2, Tested on Fold3

 Model2: Trained on Fold2 + Fold3, Tested on Fold1

 Model3: Trained on Fold1 + Fold3, Tested on Fold2
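A sketch of the 3-fold scheme above using scikit-learn (the decision-tree classifier and the iris data are illustrative choices):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)

    # 3 folds: each fold is used once as test data, the other two for training.
    cv = KFold(n_splits=3, shuffle=True, random_state=42)
    scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=cv)

    print(scores)          # one accuracy value per fold
    print(scores.mean())   # the average performance reported for the model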


K-fold CV

 In 10-fold cross-validation, each of the 10 folds comprises approximately 10% of the data. One fold is used as the test data for validating the performance of a model trained on the remaining 9 folds (or 90% of the data). This is repeated 10 times, once with each of the 10 folds being used as the test data and the remaining folds as the training data. The average performance across all folds is reported.
 Leave-one-out cross-validation (LOOCV) is an extreme case of k-fold cross-validation that uses one record or data instance at a time as the test data. This maximizes the amount of data available for training, but it is computationally very expensive.
Bootstrap Sampling

 Bootstrap sampling or bootstrapping is a popular way to identify training and test data sets from the input data set.
 It uses the technique of Simple Random Sampling with Replacement.
 Bootstrapping randomly picks data instances from the input data set, with the possibility of the same data instance being picked multiple times.
 This means that from an input data set having N data instances, bootstrapping can create one or more training data sets of N data instances, with some instances repeated and some not appearing at all; the instances left out can be used as the test data.
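A sketch of bootstrap sampling with scikit-learn's resample utility (using the never-picked, "out-of-bag" instances as test data is an assumed convention here):

    import numpy as np
    from sklearn.utils import resample

    X = np.arange(20).reshape(-1, 1)   # toy data set with N = 20 instances
    y = np.arange(20) % 2

    # Simple random sampling with replacement over the row indices.
    idx = resample(np.arange(len(X)), replace=True, n_samples=len(X), random_state=1)
    X_train, y_train = X[idx], y[idx]            # may contain repeated instances

    # Instances never drawn (out-of-bag) can be held out as test data.
    oob = np.setdiff1d(np.arange(len(X)), idx)
    X_test, y_test = X[oob], y[oob]
    print(len(np.unique(idx)), "distinct instances in training;", len(oob), "out-of-bag")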
K-fold CV v/s
Bootstrap Sampling
 K-fold CV splits the data into non-overlapping folds, so each instance appears in the test data exactly once; bootstrap sampling draws instances with replacement, so an instance may appear in a training set multiple times or not at all.
Lazy v/s Eager

 Eager learning follows the general principles of machine learning –


it constructs a generalized target function during the training
phase.
 It follows the typical steps of machine learning, i.e. abstraction
and generalization and comes up with a trained model at the end
of the learning phase.
 Hence, when the test data comes in for classification, the eager
learner is ready with the model and doesn’t need to refer back to
the training data.
 Eager learners take more time in the learning phase than the lazy
learners.
 E.g. SVM, DT, NB, NN
Lazy v/s Eager

 Lazy learning, on the other hand, completely skips the abstraction and generalization processes.
 A lazy learner doesn't 'learn' anything. It uses the training data as-is and uses that knowledge to classify the unlabelled test data.
 Since lazy learning uses the training data exactly as given, it is also known as rote learning.
 Due to its dependency on the given training data instances, it is also known as instance learning.
 Lazy learners take very little time in training because not much training actually happens. However, they take more time in testing, as for each tuple of test data a comparison-based assignment of label happens.
 E.g. KNN
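A small sketch contrasting the two (KNN as the lazy learner, a decision tree as the eager learner; the data set and split are arbitrary choices for illustration):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X_train, X_test, y_train, y_test = train_test_split(
        *load_iris(return_X_y=True), test_size=0.3, random_state=0)

    # Lazy: fit() essentially just stores the training data; the real work
    # (distance comparisons) happens at prediction time.
    lazy = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

    # Eager: fit() builds a generalized model (the tree); prediction no longer
    # needs the training data.
    eager = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

    print(lazy.score(X_test, y_test), eager.score(X_test, y_test))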
Model Representation &
Interpretability
 A key consideration in learning the target function from the training data
is the extent of generalization.
 This is because the input data is just a limited, specific view and the new,
unknown data in the test data set may be differing from the training data.
 Underfitting: If the target function is kept too simple, it may not be able to
capture the essential nuances and represent the underlying data well.
 Underfitting results in both poor performance with training data as well as
poor generalization to test data.
 Underfitting can be avoided by
 using more training data
 reducing features by effective feature selection
Model Representation &
Interpretability
 Overfitting refers to a situation where the model has been designed in such a way that it emulates the training data too closely.
 Any specific deviation in the training data, like noise or outliers, gets embedded in the model.
 It adversely impacts the model performance on the test data.
 Overfitting results in good performance on the training data set, but poor generalization to the test data set.
 Overfitting can be avoided by
 using re-sampling techniques like k-fold cross-validation
 holding back a validation data set
Model Representation &
Interpretability
 Errors in learning can be due to 'bias' and due to 'variance'.
 Bias: errors due to bias arise from simplifying assumptions made by the model to make the target function less complex or easier to learn. Underfitting results in high bias.
 Variance: errors due to variance occur from differences in the training data sets used to train the model.
 Ideally, the difference between the data sets should not be significant, and models trained using different training data sets should not be too different. However, in the case of overfitting, since the model closely matches the training data, even a small difference in the training data gets magnified in the model, resulting in high variance.
Evaluating Model Performance

 FOR CLASSIFICATION
 There are four possibilities for cricket match win/loss
prediction:
1. the model predicted win and the team won
2. the model predicted win and the team lost
3. the model predicted loss and the team won
4. the model predicted loss and the team lost
 The first case is where the model has correctly classified data instances as the class of interest: True Positive (TP) cases.
 The second case is where the model has incorrectly classified data instances as the class of interest: False Positive (FP) cases.
 The third case is where the model has incorrectly classified data instances as not the class of interest: False Negative (FN) cases.
 The fourth case is where the model has correctly classified data instances as not the class of interest: True Negative (TN) cases.
 These four counts are summarized in a Confusion Matrix.
Evaluating Model Performance

 Accuracy: model accuracy is given by the total number of correct classifications (either as the class of interest, i.e. True Positive, or as not the class of interest, i.e. True Negative) divided by the total number of classifications done.
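A sketch of the accuracy computation from the four confusion-matrix counts (the win/loss labels below are made-up values; 1 = win, the class of interest):

    from sklearn.metrics import accuracy_score, confusion_matrix

    y_actual    = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    y_predicted = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()

    # Accuracy = (TP + TN) / (TP + TN + FP + FN)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    print(accuracy, accuracy_score(y_actual, y_predicted))   # both give the same value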
Evaluating Model Performance

 Cohen's kappa is a measure of how closely the instances classified by the machine learning classifier match the data labelled as ground truth.
 The kappa value of a model indicates the model accuracy adjusted for the agreement that would be expected purely by chance.
 P(a) is the proportion of observed agreement between actual and predicted labels.
 P(pr) is the proportion of expected (chance) agreement between actual and predicted labels.
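The kappa formula itself did not survive the slide export; in terms of the two proportions above, the standard definition is kappa = (P(a) − P(pr)) / (1 − P(pr)). A sketch with made-up labels, cross-checked against scikit-learn:

    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    y_actual    = np.array([1, 1, 1, 0, 0, 0, 1, 0, 1, 0])
    y_predicted = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])

    p_a = np.mean(y_actual == y_predicted)          # observed agreement
    # Expected (chance) agreement from the marginal class proportions.
    p_pr = (np.mean(y_actual == 1) * np.mean(y_predicted == 1)
            + np.mean(y_actual == 0) * np.mean(y_predicted == 0))

    kappa = (p_a - p_pr) / (1 - p_pr)
    print(kappa, cohen_kappa_score(y_actual, y_predicted))   # the two values match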
Evaluating Model Performance

 Sensitivity of a model measures the proportion of positive cases which were correctly classified.
 E.g. in malignancy detection, the sensitivity measure gives the proportion of tumours which are actually malignant and are predicted as malignant.
 In such problems, a high value of sensitivity is more desirable than a high value of accuracy.
Evaluating Model Performance

 Specificity of a model measures the proportion of negative examples which are correctly classified.
 Specificity is also called the True Negative Rate (TNR).
Evaluating Model Performance

 Precision indicates the reliability of a model in


predicting a class of interest.
 For the example of win / loss prediction, precision
indicates how often the model predicts the win
correctly.
 Recall indicates the proportion of correct prediction
of positives to the total number of positives. For the
example of win / loss prediction, recall resembles
what proportion of the total wins were predicted
correctly.
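The formulas behind these metrics, computed from the confusion-matrix counts (a sketch reusing the made-up win/loss labels from the accuracy example):

    from sklearn.metrics import confusion_matrix, precision_score, recall_score

    y_actual    = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    y_predicted = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
    tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()

    sensitivity = tp / (tp + fn)   # = recall = True Positive Rate
    specificity = tn / (tn + fp)   # = True Negative Rate
    precision   = tp / (tp + fp)   # reliability of the positive predictions

    print(sensitivity, recall_score(y_actual, y_predicted))
    print(precision, precision_score(y_actual, y_predicted))
    print(specificity)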
Evaluating Model Performance

 The Receiver Operating Characteristic (ROC) curve helps in visualizing the performance of a classification model. It shows the efficiency of a model in detecting true positives while avoiding false positives.
 In the ROC curve, FPR (x-axis) is plotted against TPR (y-axis) at different classification thresholds.
 The area under the curve (AUC) value is the area of the two-dimensional space under the curve from (0, 0) to (1, 1), where each point on the curve gives a (FPR, TPR) pair at a specific classification threshold.
 The AUC value ranges from 0 to 1: an AUC of 0.5 indicates no predictive ability (equivalent to random guessing), and values closer to 1 indicate better performance.
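A sketch of tracing the ROC curve and computing AUC with scikit-learn (the ground-truth labels and predicted win probabilities are assumed values):

    import numpy as np
    from sklearn.metrics import roc_curve, roc_auc_score

    y_actual = np.array([1, 1, 0, 1, 0, 0, 1, 0])
    y_score  = np.array([0.9, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1])

    fpr, tpr, thresholds = roc_curve(y_actual, y_score)   # one point per threshold
    auc = roc_auc_score(y_actual, y_score)

    print(list(zip(fpr, tpr)))   # the (FPR, TPR) points that trace the ROC curve
    print(auc)                   # area under that curve, between 0 and 1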
Evaluating Model Performance

 FOR PREDICTION
 A regression model which
ensures that the difference
between predicted and
actual values is low can be
considered as a good model.
 The distance between the
actual value and the fitted
or predicted value, i.e. ŷ is
known as residual.
 The regression model can
be considered to be fitted
well if the difference
between actual and
predicted value, i.e. the
residual value is less.
Evaluating Model Performance

 Sum of Squares Total (SST) = sum of the squared differences of each observation from the overall mean
 Sum of Squared Errors (SSE) of prediction = sum of the squared residuals
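A sketch of the two sums of squares, together with the usual summary derived from them, R² = 1 − SSE/SST (the actual and predicted values are made up):

    import numpy as np
    from sklearn.metrics import r2_score

    y_actual    = np.array([3.0, 5.0, 2.5, 7.0, 4.5])
    y_predicted = np.array([2.8, 5.4, 2.0, 7.5, 4.0])

    sst = np.sum((y_actual - y_actual.mean()) ** 2)   # total variation around the mean
    sse = np.sum((y_actual - y_predicted) ** 2)       # sum of squared residuals

    r_squared = 1 - sse / sst
    print(sst, sse, r_squared, r2_score(y_actual, y_predicted))   # last two values match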
Evaluating Model Performance

 For a data set clustered into ‘k’ clusters, silhouette width is calculated
as:
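The formula did not survive the slide export; the standard per-instance definition, which is then averaged over the data set, is s(i) = (b(i) − a(i)) / max(a(i), b(i)), where a(i) is the mean distance from instance i to the other members of its own cluster and b(i) is the mean distance to the members of the nearest other cluster. A sketch (the blob data and k = 3 are illustrative assumptions):

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    # Toy data with an obvious cluster structure.
    X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

    labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
    print(silhouette_score(X, labels))   # average silhouette width, in [-1, +1]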
Evaluating Model Performance

 FOR CLUSTERING
1. Internal evaluation
 Internal evaluation methods generally measure cluster quality based on the homogeneity of data belonging to the same cluster and the heterogeneity of data belonging to different clusters. The homogeneity/heterogeneity is decided by some similarity measure.
 The silhouette coefficient, one of the most popular internal evaluation methods, uses distance (Euclidean or Manhattan distance most commonly) between data elements as the similarity measure. The value of silhouette width ranges between –1 and +1, with a high value indicating high intra-cluster homogeneity and inter-cluster heterogeneity.
Evaluating Model Performance

2. External evaluation
 In this approach, the class label is known for the data set subjected to clustering. However, quite obviously, the known class labels are not a part of the data used in clustering. The clustering algorithm is assessed based on how closely its results match those known class labels.
 Purity is one of the most popular measures for clustering algorithms – it evaluates the extent to which each cluster contains a single class.
 For a data set having 'n' data instances and 'c' known class labels, for which 'k' clusters are generated, purity is measured as:
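The purity formula missing above is, in its standard form, purity = (1/n) Σ_k max_c |cluster_k ∩ class_c|, i.e. each cluster is credited with its single most frequent class. A sketch with made-up cluster and class labels:

    import numpy as np

    # Hypothetical known class labels and cluster assignments for n = 10 instances.
    classes  = np.array(['a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c'])
    clusters = np.array([ 0,   0,   1,   1,   1,   1,   2,   2,   2,   2 ])

    n, total = len(classes), 0
    for k in np.unique(clusters):
        members = classes[clusters == k]
        _, counts = np.unique(members, return_counts=True)
        total += counts.max()          # count of the most frequent class in cluster k

    print(total / n)                   # purity = 0.8 for these labels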
Improving Model Performance

 Model parameter tuning is the process of adjusting the model fitting options. E.g. in KNN, the model can be tuned by using different values of K.
 Ensemble: several models with diverse strengths are combined together. One model may learn one type of data set well while struggling with another type; another model may perform well with the data set the first one struggled with.
 An ensemble helps in reducing the bias of the different models and also in reducing the variance.
 Ensemble methods combine weaker learners to create a stronger model.
Improving Model Performance

 Build a number of models based on the training data.
 To diversify the models generated, bootstrapping can be used to generate unique training data sets.
 Alternatively, the same training data may be used but the combined models are quite varied, e.g. SVM, neural network, KNN, etc.
 The outputs from the different models are combined using a combination function, e.g. majority voting of the different models.
 For example, if 3 out of 5 models predict 'win' and 2 predict 'loss', then the final outcome of the ensemble using majority vote would be a 'win'.
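A sketch of combining heterogeneous models with majority voting, using scikit-learn's VotingClassifier (the three base models and the iris data are illustrative choices):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import VotingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)

    ensemble = VotingClassifier(
        estimators=[('svm', SVC()),
                    ('tree', DecisionTreeClassifier(random_state=0)),
                    ('knn', KNeighborsClassifier())],
        voting='hard')   # 'hard' = majority vote on the predicted class labels

    print(cross_val_score(ensemble, X, y, cv=5).mean())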
Improving Model Performance

 Bagging (bootstrap aggregating): uses bootstrap


sampling method to generate multiple training data
sets, which are used to train a set of models using
the same learning algorithm. The outcomes are
combined by majority voting (classification) or by
average (regression).
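A minimal bagging sketch with scikit-learn's BaggingClassifier (its default base learner is a decision tree; the data set is an arbitrary choice):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # 25 base models, each trained on a bootstrap sample of the data; their
    # predictions are combined by voting (a regressor would average them).
    bagging = BaggingClassifier(n_estimators=25, random_state=0)

    print(cross_val_score(bagging, X, y, cv=5).mean())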
Improving Model Performance

 Boosting is another ensemble-based technique in which weak learning models are trained on resampled data and the outcomes are combined using a weighted voting approach based on the performance of the various models.
 Algorithm for boosting: initialize a weight vector with uniform weights
 Loop:
– Apply a weak learner to the weighted training examples (instead of the original training set, draw bootstrap samples with weighted probability)
– Increase the weights of misclassified examples
 Weighted majority voting on the trained classifiers gives the final prediction
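A sketch using AdaBoost, one standard boosting algorithm (scikit-learn's implementation reweights the training instances directly rather than drawing weighted bootstrap samples; the data set is an arbitrary choice):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # Weak learners (shallow trees by default) are trained sequentially, with
    # misclassified instances up-weighted each round; the final prediction is
    # a weighted vote of all the learners.
    boosting = AdaBoostClassifier(n_estimators=50, random_state=0)

    print(cross_val_score(boosting, X, y, cv=5).mean())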
Improving Model Performance

 Random forest is another ensemble-based technique. It is an ensemble of decision trees – hence the name random forest, indicating a forest of decision trees.
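A minimal random forest sketch with scikit-learn (each tree is grown on a bootstrap sample and considers a random subset of features at each split; the data set is an arbitrary choice):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    print(cross_val_score(forest, X, y, cv=5).mean())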
Department of CE

ML: Machine Learning (3170724)
Unit no. 3: Modelling and Evaluation

Thanks

Prof. Hetvy Jadeja
