Chapter - 1
Introduction to Machine Learning
Dr. Tatwadarshi P. N.
Introduction
To solve a problem on a computer, we need an algorithm.
An algorithm is a sequence of instructions that should be carried out to
transform the input to output.
For example, one can devise an algorithm for sorting.
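As a small illustration of an algorithm that transforms an input into an output, here is a sketch of insertion sort. The function name `insertion_sort` and the sample list are chosen for this example only; any other sorting algorithm would illustrate the point equally well.

```python
def insertion_sort(values):
    """Return a new list with the items of `values` in ascending order."""
    result = list(values)  # copy so the input list is left untouched
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger items one slot to the right to make room for `current`.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


print(insertion_sort([5, 2, 9, 1]))  # [1, 2, 5, 9]
```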
Machine Learning
Classical Approach Vs Machine Learning
How does Machine Learning work?
A Machine Learning system learns from historical data, builds a
prediction model, and, whenever it receives new data, predicts
the output for it.
The accuracy of the predicted output depends in part on the amount of
data: more representative data generally helps to build a better
model, which predicts the output more accurately.
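The learn-then-predict cycle described above can be sketched with a least-squares line fitted to a handful of past observations. The data (hours studied against exam score) and the helper names `fit_line` and `predict` are made up for illustration; a straight line is only one of many possible prediction models.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from historical (x, y) pairs by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the output for a new, unseen input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical data: hours studied -> exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

model = fit_line(hours, scores)   # learning step
print(predict(model, 6))          # prediction for new data: 62.0
```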
Why is ML required?
Rapid increase in the production of data
Key Terminologies
Key Tasks of Machine Learning
Classification
In classification, our job is to predict what class an instance of data
should fall into.
Regression
Regression is the prediction of a numeric value.
Clustering
In unsupervised learning, there’s no label or target value given for the
data.
A task where we group similar items together is known as clustering.
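To make the idea of grouping similar items concrete, here is a minimal sketch of k-means on one-dimensional data. The function name `kmeans_1d`, the sample points and the starting centroids are assumptions made for this example; it is not a general-purpose implementation.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Group 1-D points around centroids; no labels are used."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if the cluster is empty
                centroids[i] = sum(cluster) / len(cluster)
    return centroids, clusters


points = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(points, centroids=[0, 5])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1, 2, 3], [10, 11, 12]]
```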
Types of Machine Learning
Machine Learning Classification
[Figure: tree diagram rooted at "Machine Learning". Labels recoverable from the diagram: Regression, Linear, Multivariate, Classification, Clustering, K-means, PCA, Association, Semi-Supervised Learning.]
Unsupervised Learning
“Unsupervised learning is a type of machine learning in which models
are trained using unlabeled dataset and are allowed to act on that
data without any supervision.”
Unsupervised learning cannot be directly applied to a regression or
classification problem because, unlike supervised learning, we have
the input data but no corresponding output data.
The goal of unsupervised learning is to find the underlying structure
of the dataset, group that data according to similarities, and
represent that dataset in a compressed format.
Semi Supervised Learning
Reinforcement Learning
Issues in Machine Learning
Which algorithm should be selected for a given problem?
When and how can prior knowledge held by the learner guide the process of
generalizing from examples?
What is the best strategy for choosing a useful next training experience, and how
does the choice of this strategy affect the complexity of the learning problem?
How can the learner automatically alter its representation to improve its ability
to represent and learn the target function?
Applications of Machine Learning
Automating Employee Access Control
Protecting Animals
Predicting Emergency Room Wait Times
Identifying Heart Failure
Predicting Strokes and Seizures
Predicting Hospital Readmissions
Stop Malware
Understand Legalese
Improve Cybersecurity
Get Ready For Smart Cars
Choosing the Right Algorithm
Are you trying to fit your data into some discrete groups? If so and
that’s all you need, you should look into clustering.
Do you need to have some numerical estimate of how strong the fit
is into each group? If you answer yes, then you probably should look
into a density estimation algorithm.
You should spend some time getting to know your data, and the
more you know about it, the better you’ll be able to build a
successful application.
Steps in developing a machine learning
application
Reading a dataset
Continuous Data Categorical Data
Cross Validation
In machine learning, we cannot fit a model on the training data and
simply assume that it will work accurately on real data.
We must make sure that the model has captured the correct patterns
in the data and is not picking up too much noise. For this
purpose, we use the cross-validation technique: the data is split
into several parts, and the model is repeatedly trained on some parts
and evaluated on the part that was held out.
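A minimal sketch of k-fold cross-validation follows. The helper names (`k_fold_indices`, `cross_validate`), the toy data and the placeholder "predict the training mean" model are all assumptions made for illustration; in practice the data is usually shuffled before splitting, which this sketch skips to stay deterministic.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(xs, ys, k, fit, predict):
    """Return the mean squared error on the held-out fold, for each fold."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        scores.append(sum(errors) / len(errors))
    return scores


# Placeholder model: always predict the mean of the training targets.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model


xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 12]
scores = cross_validate(xs, ys, 3, fit_mean, predict_mean)
print(scores)  # [37.0, 1.0, 37.0]
```

The spread between fold scores is itself informative: a model whose held-out error varies widely from fold to fold is sensitive to which data it was trained on.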
Overfitting and Underfitting
When we talk about a Machine Learning model, we are really talking
about how well it performs, which is measured by its prediction
errors.
Let us consider that we are designing a machine learning model.
A model is said to be a good machine learning model if it generalizes
properly to any new input data from the problem domain.
This lets us make predictions about future data that the model has
never seen.
Now, suppose we want to check how well our machine learning
model learns and generalizes to new data.
For that, we look at overfitting and underfitting, which are the two
main causes of poor performance in machine learning algorithms.
Bias and Variance
Bias: The error introduced by the assumptions a model makes to make
the target function easier to learn. In practice it shows up as the
error rate on the training data. When that error rate is high, we
call it high bias, and when it is low, we call it low bias.
Variance: The sensitivity of the model to the particular training
data it was given. In practice it shows up as the error rate on the
testing data relative to the training data. When that error rate is
high, we call it high variance, and when it is low, we call it low
variance.
Low Bias: Suggests fewer assumptions about the form of the target
function.
High Bias: Suggests more assumptions about the form of the target
function.
Underfitting
A statistical model or a machine learning algorithm is said to
underfit when it cannot capture the underlying trend of the data,
i.e., it performs poorly on the training data as well as on the
testing data.
It usually happens when we have too little data to build an accurate
model, or when we try to fit a linear model to non-linear data.
In such cases, the rules of the machine learning model are too simple
to capture the patterns in the data, and therefore the
model will probably make a lot of wrong predictions.
Underfitting can be avoided by using more data and by adding
features through feature engineering.
Reasons for Underfitting:
High bias and low variance
The training dataset is too small.
The model is too simple.
Training data is not cleaned and also contains noise in it.
Techniques to reduce underfitting:
Increase model complexity
Increase the number of features by performing feature engineering.
Remove noise from the data.
Increase the number of epochs or increase the duration of training to
get better results.
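The effect of a model that is too simple, and of adding a feature to fix it, can be sketched on a toy dataset. The data (y = x squared), the helper names and the choice of x squared as the engineered feature are assumptions made for this illustration only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(ys, preds):
    """Mean squared error between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]  # non-linear target

# Too simple: a straight line in x cannot follow the curve (underfit).
slope, intercept = fit_line(xs, ys)
line_mse = mse(ys, [slope * x + intercept for x in xs])

# Feature engineering: use x**2 as the input feature instead of x.
squared = [x * x for x in xs]
slope2, intercept2 = fit_line(squared, ys)
quad_mse = mse(ys, [slope2 * z + intercept2 for z in squared])

print(line_mse)  # 12.0, high error even on the training data
print(quad_mse)  # 0.0
```

Note that the underfit line has a high error on the very data it was trained on, which is the signature of high bias.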
Overfitting
A statistical model is said to be overfitted when it fits the
training data closely but does not make accurate predictions on
testing data.
When a model is trained for too long, or is too flexible for the
data, it starts learning from the noise and inaccurate entries in
the data set.
Testing with the test data then results in high variance:
the model does not categorize the data correctly, because it has
learned too many details and too much noise.
Non-parametric and non-linear methods are more prone to overfitting,
because these types of machine learning algorithms have
more freedom in building the model based on the dataset and
can therefore build unrealistic models.
One way to avoid overfitting is to use a linear algorithm if we have
linear data, or to limit parameters such as the maximal depth if we are
using decision trees.
Reasons for Overfitting are as follows:
High variance and low bias
The model is too complex
The training dataset is too small
Techniques to reduce overfitting:
Increase training data.
Reduce model complexity.
Early stopping during the training phase (keep an eye on the validation loss
during training and stop as soon as it begins to increase).
Ridge Regularization and Lasso Regularization
Use dropout for neural networks to tackle overfitting.
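Early stopping from the list above can be sketched as a rule applied to a recorded sequence of validation losses. The function name `early_stopping_epoch`, the `patience` parameter and the loss values are assumptions made for this example; deep learning libraries offer their own callbacks for this.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch with the best validation loss,
    scanning only until the loss has failed to improve for `patience`
    consecutive epochs."""
    best_epoch = 0
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss has stopped improving
    return best_epoch


# Hypothetical validation loss per epoch.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
print(early_stopping_epoch(val_losses))  # 3
```

A patience greater than one avoids stopping on a single noisy epoch.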
Overfitting and Underfitting
Use these steps to determine whether your machine learning model, deep
learning model or neural network is currently underfit or overfit.
Ensure that you track the validation loss alongside the training loss in the
training phase.
When your validation loss is still decreasing, the model is still underfit.
When your validation loss is increasing, the model is overfit.
When your validation loss is no longer changing, the model is either perfectly fit
or in a local minimum.
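The rules above can be written as a rough check on the most recent validation losses. The function name `diagnose_fit`, the `window` and `tolerance` parameters and the three return labels are assumptions made for this sketch; real loss curves are noisy, so the result is only a hint.

```python
def diagnose_fit(val_losses, window=3, tolerance=1e-3):
    """Classify the trend of the last `window` validation losses."""
    recent = val_losses[-window:]
    change = recent[-1] - recent[0]
    if change < -tolerance:
        return "underfit"   # validation loss is still decreasing
    if change > tolerance:
        return "overfit"    # validation loss is increasing
    return "flat"           # perfectly fit, or stuck in a local minimum


print(diagnose_fit([1.0, 0.8, 0.6, 0.5]))    # underfit
print(diagnose_fit([0.5, 0.4, 0.45, 0.5]))   # overfit
print(diagnose_fit([0.5, 0.4, 0.4, 0.4]))    # flat
```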
Performance Metrics
Performance Metrics for Regression
Three error metrics are commonly used for evaluating and reporting
the performance of a regression model: Mean Absolute Error (MAE),
Mean Squared Error (MSE) and Root Mean Squared Error (RMSE).
MAE, MAPE, MSE, RMSE and R2
Root Mean Squared Error (RMSE) is the square root of the Mean Squared
Error. It measures the standard deviation of the residuals.
Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) penalize
large prediction errors more heavily than Mean Absolute Error (MAE). However,
RMSE is more widely used than MSE to compare the performance of a
regression model with other models, as it has the same units
as the dependent variable (Y-axis).
MSE is a differentiable function, which makes it easier to work with
mathematically than a function like MAE, which is not differentiable
at zero. Therefore, in many models, RMSE is used as the
default metric for calculating the loss function despite being harder to
interpret than MAE.
MAE is more robust to data with outliers.
Lower values of MAE, MSE, and RMSE imply higher accuracy of a
regression model. For R squared, however, a higher value is considered
desirable.
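The five metrics named in this section can be computed from their standard definitions as follows. The function name `regression_metrics` and the sample values are made up for illustration; MAPE is reported as a percentage and assumes that no true value is zero.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MAPE (%), MSE, RMSE and R squared."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    # Assumes no zero in y_true, otherwise MAPE is undefined.
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical true values and predictions.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 37]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On this sample, MAE is 2.5 and MSE is 6.5, so RMSE (about 2.55) sits slightly above MAE because the two larger errors are weighted more heavily.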
Performance Metrics for Classification
Classification is a type of supervised machine learning problem where
the goal is to predict, for one or more observations, the category or
class they belong to.
An important element of any machine learning workflow is the
evaluation of the performance of the model. This is the process
where we use the trained model to make predictions on previously
unseen, labelled data. In the case of classification, we then evaluate
how many of these predictions the model got right.
In real-world classification problems, it is usually impossible for a
model to be 100% correct. When evaluating a model it is, therefore,
useful to know, not only how wrong the model was, but in which way
the model was wrong.
7 Metrics to Measure Classification Performance
Accuracy
Confusion Matrix
Precision
Recall
F1 score
AUC/ROC
Kappa
Accuracy
The overall accuracy of a model is simply the number of correct predictions
divided by the total number of predictions. An accuracy score is a
value between 0 and 1, where a value of 1 indicates a perfect model.
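Accuracy, together with the four counts that make up a binary confusion matrix, can be computed as in the sketch below. The labels (1 for positive, 0 for negative), the sample predictions and the function names are assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Correct predictions divided by total predictions."""
    correct = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return correct / len(y_true)


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))  # (3, 1, 3, 1)
print(accuracy(y_true, y_pred))          # 0.75
```

The confusion counts show in which way the model was wrong (one false positive and one false negative here), which the single accuracy figure hides.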
Recall
Recall tells us how good the model is at correctly predicting all the
positive observations in the dataset. It does not include
information about the false positives, so it is more useful in cases
such as the cancer example, where missing a positive case is costlier
than raising a false alarm.
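Recall, along with precision and the F1 score from the list of metrics, follows directly from the confusion-matrix counts. The sketch below uses the standard definitions; the function name and the counts are invented for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions from true positives, false positives
    and false negatives."""
    precision = tp / (tp + fp)   # of the predicted positives, how many are right
    recall = tp / (tp + fn)      # of the actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here precision is 0.8 while recall is only about 0.667: the model is usually right when it predicts the positive class, but it misses a third of the actual positives.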
Sensitivity vs Specificity
Sensitivity and Specificity, like precision and recall, are used to
evaluate model performance. Sensitivity is the same quantity as recall.
In the data science community, precision and recall are mainly used to
evaluate model performance, but in the medical world, Sensitivity and
Specificity are used to evaluate a medical test.
In medical terms, Sensitivity indicates the ability to detect the disease,
while Specificity refers to the percentage of people who do not actually
have the disease and test negative.
Sensitivity measures how well a machine learning model can detect positive
instances.
In other words, it measures how likely you are to get a positive result when
the condition you are testing for is actually present.
Specificity measures the proportion of actual negatives that are correctly
identified by the model.
It is also called the True Negative Rate (TNR).
The sum of the True Negative Rate and the False Positive Rate is 1.
A higher specificity indicates that the model correctly
identifies most of the negative results.
A lower specificity indicates that the model mislabels many negative results as
positive.
In medical terms, Specificity is the proportion of people not
suffering from the disease who are correctly predicted as not
suffering from the disease.
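Sensitivity and specificity can be computed from the same four confusion-matrix counts. The sketch below uses made-up screening numbers (50 people with the disease, 100 without) and a hypothetical function name.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity, false_positive_rate)."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    false_positive_rate = fp / (fp + tn)  # complement of specificity
    return sensitivity, specificity, false_positive_rate


# Hypothetical screening test: 50 people with the disease, 100 without.
sens, spec, fpr = sensitivity_specificity(tp=40, fn=10, tn=90, fp=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} fpr={fpr:.2f}")
# sensitivity=0.80 specificity=0.90 fpr=0.10
```

Specificity and the false positive rate share the same denominator (all actual negatives), so they always add up to 1.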