
ML UNIT 2 Sir

Uploaded by

sampathmandru18
Copyright
© © All Rights Reserved

MACHINE LEARNING

Supervised Learning(Regression/Classification)
BTech III Year – II Semester
Computer Science & Engineering

UNIT-II

SYLLABUS
MACHINE LEARNING
Unit II: Supervised Learning(Regression/Classification):
Basic Methods:
▪ Distance based Methods,
▪ Nearest Neighbours,
▪ Decision Trees,
▪ Naive Bayes,
Linear Models:
▪ Linear Regression,
▪ Logistic Regression,
▪ Generalized Linear Models,
▪ Support Vector Machines,
Binary Classification:
▪ Multiclass/Structured outputs,
▪ MNIST, Ranking.

Types of Learning

The learning methods in ML can be broadly classified into three basic types: supervised, unsupervised,
and reinforcement learning.

Supervised and Unsupervised Learning.

Supervised Learning
1. As its name suggests, supervised machine learning is based on supervision.
2. In the supervised learning technique, we train the machine using a "labelled"
dataset, and based on that training, the machine predicts the output.
3. Here, labelled data means that the training inputs are already mapped to the correct output.
More precisely, we first train the machine with inputs and their corresponding
outputs, and then ask it to predict the output for a test dataset. Supervised
machine learning can be divided into two types of problems: Regression and
Classification.

Machine Learning Basic Methods: Distance Based Methods

1. Distance-based algorithms are machine learning algorithms that classify queries by
computing distances between these queries and a number of internally stored exemplars.
2. Exemplars that are closest to the query have the largest influence on the classification assigned
to the query.
3. Distance-based algorithms are nonparametric methods that can be used for classification.
4. These algorithms classify objects by the dissimilarity between them, as measured by distance
functions.
5. Machine learning algorithms use distance metrics to recognize similarities among the data.
A distance metric is a function that tells us the distance between two points in the
dataset.

Machine Learning Basic Methods: Distance Based Methods

The different types of distances used in Machine Learning are:

1. Euclidean distance
2. Manhattan distance
3. Minkowski distance
4. Hamming distance
5. Cosine similarity

[Figure: Euclidean distance and Manhattan distance]
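
The five measures listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names and the sample points are assumptions made for the example, not part of the slides.

```python
import math

def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length numeric vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def euclidean(a, b):
    """Straight-line distance (Minkowski with p = 2)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences (Minkowski with p = 1)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical sample points chosen so the results are easy to check by hand.
p, q = (0, 0), (3, 4)
print(euclidean(p, q))                # 5.0
print(manhattan(p, q))                # 7
print(hamming("karolin", "kathrin"))  # 3
```

Note that cosine similarity measures closeness (larger means more similar), whereas the other four measure distance (smaller means more similar).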

Basic Methods: K-Nearest Neighbor(KNN)
1. K-Nearest Neighbour is one of the simplest machine learning algorithms based on the supervised
learning technique.
2. The K-NN algorithm assumes similarity between the new case/data and the available cases and puts the
new case into the category that is most similar to the available categories.
3. The K-NN algorithm stores all the available data and classifies a new data point based on
similarity. This means that when new data appears, it can be easily classified into a well-suited
category by using the K-NN algorithm.
4. The K-NN algorithm can be used for regression as well as for classification, but it is mostly used for
classification problems.
5. K-NN is a non-parametric algorithm, which means it does not make any assumption about the
underlying data.
6. It is also called a lazy learner algorithm because it does not learn from the training set
immediately; instead, it stores the dataset and performs the computation on it only at the time of
classification.
Basic Methods: K-Nearest Neighbor(KNN)

• The K-NN algorithm:

• Step-1: Select the number K of neighbors.
• Step-2: Calculate the Euclidean distance from the new data point to each
training data point.
• Step-3: Take the K nearest neighbors as per the
calculated Euclidean distance.
• Step-4: Among these K neighbors, count the number
of data points in each category.
• Step-5: Assign the new data point to the category
for which the number of neighbors is maximum.
• Step-6: The model is ready to be used.
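
The steps above can be sketched as follows. This is a minimal illustration, not a production implementation: the function name `knn_classify` and the toy data are assumptions made for the example, and ties between categories are broken arbitrarily.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 2: Euclidean distance from the query to every stored point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 3: keep the k nearest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Steps 4-5: count the labels among the neighbours and pick the most common.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters labelled "A" and "B".
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_classify(points, labels, (2, 2), k=3))  # A
print(knn_classify(points, labels, (6, 5), k=3))  # B
```

Because every prediction scans the whole training set, the cost per query grows with the number of stored points, which is the computational drawback usually attributed to K-NN.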

Basic Methods: K-Nearest Neighbor(KNN)(Example)

Basic Methods: K-Nearest Neighbor(KNN)

Advantages of KNN Algorithm:

1. It is simple to implement.
2. It is robust to noisy training data when K is large enough.
3. It can be more effective if the training data is large.
Disadvantages of KNN Algorithm:
1. The value of K always needs to be determined, which can sometimes be difficult.
2. The computation cost is high because the distance from the new point to every
training sample must be calculated.

Machine Learning Basic Methods: Decision Trees

1. A decision tree is one of the predictive modeling approaches used in statistics, data
mining, and machine learning.
2. Decision Tree is a supervised learning technique that can be used for both classification and
regression problems.
3. It is most commonly used for solving classification problems.
4. It is a tree-structured classifier, where internal nodes represent the features of a dataset,
branches represent the decision rules, and each leaf node represents the outcome.
5. In a decision tree, there are two types of nodes: the Decision Node and the Leaf Node. Decision
nodes are used to make a decision and have multiple branches, whereas Leaf nodes are the
output of those decisions and do not contain any further branches.
6. The decisions or tests are performed on the basis of the features of the given dataset.

Machine Learning Basic Methods: Decision Trees

It is a graphical representation for getting all the possible solutions to a problem/decision based on
given conditions.

1. Internal nodes represent the features of a dataset.
2. Branches represent the decision rules.
3. Each leaf node represents the outcome.

Machine Learning Basic Methods: Decision Trees Algorithm

Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
Step-3: Divide S into subsets that contain the possible values of the best attribute.
Step-4: Generate the decision tree node that contains the best attribute.
Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3.
Continue this process until a stage is reached where the nodes cannot be classified further;
these final nodes are called leaf nodes.
Example: The basic flow of a decision tree for decision making with labels (Rain(Yes), No Rain(No)).
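
Step-2 depends on an Attribute Selection Measure, which the slides do not name. The sketch below assumes information gain (entropy reduction), one common choice; the function names, the attribute names `cloudy` and `windy`, and the four-row dataset are all hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on `attribute`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attributes):
    """Step-2: pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))

# Hypothetical toy data: "cloudy" separates the labels perfectly, "windy" not at all.
rows = [
    {"cloudy": "yes", "windy": "yes"},
    {"cloudy": "yes", "windy": "no"},
    {"cloudy": "no", "windy": "yes"},
    {"cloudy": "no", "windy": "no"},
]
labels = ["Yes", "Yes", "No", "No"]
print(best_attribute(rows, labels, ["cloudy", "windy"]))  # cloudy
```

A full tree builder would call `best_attribute` on each subset recursively, as Step-5 describes, and stop when a subset contains a single label.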
Machine Learning Basic Methods: Decision Trees Example
Machine Learning Basic Methods: Decision Trees

Advantages of the Decision Tree

▪ It is simple to understand, as it follows the same process that a human follows while
making any decision in real life.
▪ It can be very useful for solving decision-related problems.
▪ It helps to think about all the possible outcomes for a problem.
▪ It requires less data cleaning than other algorithms.

Disadvantages of the Decision Tree

➢ A decision tree can contain many layers, which makes it complex.
➢ With more class labels, the computational complexity of the decision tree may increase.
Machine Learning Basic Methods: Naive Bayes Methods

1. The Naïve Bayes algorithm is a supervised learning algorithm that is based on Bayes'
theorem and used for solving classification problems.
2. It is mainly used in text classification, which involves a high-dimensional training dataset.
3. It is a probabilistic classifier, which means it predicts on the basis of the probability
of an object belonging to a class.
4. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature
in a class is unrelated to the presence of any other feature.
5. Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment
analysis, and classifying articles.
Machine Learning Basic Methods: Naive Bayes Methods
Example:

A fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter. Even if
these features depend on each other or upon the existence of the other features, all of these properties
are treated as contributing independently to the probability that this fruit is an apple, and that is why
the method is known as 'Naive'.

The name Naïve Bayes is made up of two words, Naïve and Bayes, which can be described as follows:

1. Naïve: It is called Naïve because it assumes that the occurrence of a certain feature is independent
of the occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and
taste, then a red, spherical, and sweet fruit is recognized as an apple. Each feature individually
contributes to identifying it as an apple, without depending on the others.
2. Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.
Machine Learning Basic Methods: Naive Bayes Methods
1. A Naive Bayes model is easy to build and particularly useful for very large data sets.
2. Along with its simplicity, Naive Bayes is known to sometimes outperform even highly sophisticated
classification methods.
3. Bayes' theorem provides a way of calculating the posterior probability P(c|x) from P(c), P(x) and P(x|c):

P(c|x) = P(x|c) * P(c) / P(x)

where
• P(c|x) is the posterior probability of
class (c, target) given predictor (x,
attributes).
• P(c) is the prior probability of the class.
• P(x|c) is the likelihood, which is the
probability of the predictor given the class.
• P(x) is the prior probability of the
predictor.
Machine Learning Basic Methods: Naive Bayes algorithm
1. Convert the given dataset into frequency tables.
2. Generate a likelihood table by finding the probabilities of the given features.
3. Use Bayes' theorem to calculate the posterior probability.
Example:

Consider a training data set of weather conditions and the corresponding target variable 'Play'
(indicating whether play is possible). We need to classify whether players
will play or not based on the weather condition.
Machine Learning Basic Methods: Naive Bayes Example

Problem: Players will play if the weather is sunny. Is this statement correct?

We can solve it using the posterior probability method discussed above.

P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)

Here we have
P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, P(Yes) = 9/14 = 0.64

Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 = 0.60, which is higher than P(No | Sunny) = 1 - 0.60 = 0.40,
so the statement is correct: players are more likely to play when it is sunny.
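
The calculation above can be checked with exact fractions. This is a small sketch using the counts quoted in the example (3 of 9, 9 of 14, 5 of 14); the helper name `posterior` is an assumption made for illustration.

```python
from fractions import Fraction

def posterior(likelihood, prior, evidence):
    """Bayes' theorem: P(c|x) = P(x|c) * P(c) / P(x)."""
    return likelihood * prior / evidence

# Probabilities taken from the example above.
p_sunny_given_yes = Fraction(3, 9)
p_yes = Fraction(9, 14)
p_sunny = Fraction(5, 14)

p_yes_given_sunny = posterior(p_sunny_given_yes, p_yes, p_sunny)
print(p_yes_given_sunny)         # 3/5
print(float(p_yes_given_sunny))  # 0.6
```

Working with fractions avoids the rounding in 0.33 * 0.64 / 0.36, which evaluates to roughly 0.59 rather than exactly 0.60.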
Machine Learning Basic Methods: Naive Bayes Method

Advantages

1. It is easy and fast to predict the class of a test data set. It also performs well in multi-class prediction.
2. When the assumption of independence holds, a Naive Bayes classifier performs better compared to
other models like logistic regression, and it needs less training data.
3. It performs well with categorical input variables compared to numerical variables. For
numerical variables, a normal distribution is assumed (bell curve, which is a strong assumption).

Disadvantages

1. If a categorical variable has a category in the test data set that was not observed in the training data
set, then the model will assign it a 0 (zero) probability and will be unable to make a prediction. This
is often known as "Zero Frequency". To solve this, we can use a smoothing technique. One of
the simplest smoothing techniques is called Laplace estimation.
2. On the other hand, Naive Bayes is also known to be a bad estimator, so the probability outputs from
predict_proba (the scikit-learn method that returns class probabilities) are not to be taken too seriously.
3. Another limitation of Naive Bayes is the assumption of independent predictors. In real life, it is
almost impossible to get a set of predictors that are completely independent.
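
The "Zero Frequency" problem and its Laplace fix can be illustrated with a short sketch. The formula assumed here is the usual add-one (more generally add-alpha) estimate; the function name and the counts are hypothetical.

```python
def smoothed_likelihood(feature_count, class_count, n_categories, alpha=1):
    """Laplace (add-alpha) estimate of P(feature value | class).

    feature_count: times the feature value was seen together with the class
    class_count:   total training examples of the class
    n_categories:  number of distinct values the feature can take
    """
    return (feature_count + alpha) / (class_count + alpha * n_categories)

# Hypothetical counts: a feature value never seen with this class in training.
unsmoothed = smoothed_likelihood(0, 9, 3, alpha=0)
smoothed = smoothed_likelihood(0, 9, 3)
print(unsmoothed)  # 0.0, which would zero out the whole posterior
print(smoothed)    # 1/12, small but non-zero
```

With alpha = 1 every category receives one imaginary observation, so an unseen category gets a small probability instead of eliminating the class outright.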
Machine Learning Basic Methods: Linear Models:

Linear Models:
▪ Linear Regression
▪ Logistic Regression
Machine Learning Basic Methods: Regression

1. Linear models describe a continuous response variable as a function of one or more predictor
variables.
2. A regression model provides a function that describes the relationship between one or more
independent variables and a response, dependent, or target variable.
3. Regression is a method to determine the statistical relationship between a dependent variable and
one or more independent variables.
4. Regression analysis is a predictive modeling technique that analyzes the relation between the target
or dependent variable and independent variable in a dataset.
Types of regression techniques:
• Linear Regression
• Logistic Regression
• Ridge Regression
• Lasso Regression
• Polynomial Regression
• Bayesian Linear Regression
Machine Learning Basic Methods: Linear Regression

1. Linear regression analysis is used to predict the value of a variable based on the value of another
variable.
2. The variable you want to predict is called the dependent variable. The variable you are using
to predict the other variable's value is called the independent variable.
3. It is a supervised learning and statistical method that is used for predictive analysis.
4. Linear regression performs the task of predicting a dependent variable value (y) based on a given
independent variable (x).
5. General function for Linear Regression:

Y = a + bX

• where X is the independent variable (predictor variable),
• Y is the dependent variable (target variable),
• b is the slope of the line, and a is the intercept (constant).
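
As an illustration, the intercept a and slope b can be estimated from data. The slides do not name a fitting method, so this sketch assumes ordinary least squares; the function name and the data points are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares estimates of intercept a and slope b in Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of X and Y divided by the variance of X.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data lying exactly on Y = 1 + 2X.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_simple_linear_regression(xs, ys)
print(a, b)       # 1.0 2.0
print(a + b * 5)  # 11.0, the prediction for X = 5
```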
Machine Learning Basic Methods: Linear Regression

Types of Linear Regression


Linear regression can be divided into two types of the algorithm:
1. Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable,
then such a Linear Regression algorithm is called Simple Linear Regression.
2. Multiple Linear regression:
If more than one independent variable is used to predict the value of a numerical dependent
variable, then such a Linear Regression algorithm is called Multiple Linear Regression.
Machine Learning Basic Methods:
Linear Regression: Example House Price Prediction
Machine Learning Basic Methods: Logistic Regression

1. Logistic regression is a statistical model that uses the logistic function to model a conditional
probability.
2. It is an example of supervised learning. It is used to calculate or predict the probability of a
binary (yes/no) event occurring.
3. Because the outcome is a probability, the dependent variable is bounded between 0 and 1.
4. Logistic regression is used for solving classification problems.
5. A logistic regression model predicts a dependent variable by analyzing the relationship
between it and one or more existing independent variables.
Example: logistic regression could be used to predict whether
• a political candidate will win or lose an election,
• a high school student will be admitted to a particular college or not,
• an employee can buy a car or not, based on salary.
Machine Learning Basic Methods: Logistic Regression

Logistic Function (Sigmoid Function):


1. The sigmoid function is a mathematical function used to map the predicted values to
probabilities.
2. It maps any real value into another value within the range 0 to 1.
3. The output of logistic regression must lie between 0 and 1 and cannot go beyond these
limits, so it forms an "S"-shaped curve. This S-shaped curve is called the sigmoid function or
the logistic function.
Machine Learning Basic Methods: Logistic Regression

Logistic Function (Sigmoid Function):


In the formula of the logistic model, p = 1 / (1 + e^-(b0 + b1X)):
• when b0 + b1X = 0, p is 0.5;
• when b0 + b1X > 0, p moves towards 1;
• when b0 + b1X < 0, p moves towards 0.
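
The behaviour described above can be verified numerically. This is a minimal sketch; the names `sigmoid` and `predict_probability` and the sample coefficients are assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z to a value strictly between 0 and 1."""
    # Note: math.exp overflows for very large negative z (below about -700).
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, b0, b1):
    """p = sigmoid(b0 + b1 * x) for a single predictor x."""
    return sigmoid(b0 + b1 * x)

print(sigmoid(0))                             # 0.5
print(sigmoid(6) > 0.99, sigmoid(-6) < 0.01)  # True True

# Hypothetical coefficients: b0 + b1 * x = 0 at x = 2, so p = 0.5 there.
print(predict_probability(2, b0=-4.0, b1=2.0))  # 0.5
```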
Machine Learning Basic Methods: Linear vs Logistic Regression
Generalized Linear Models

1. The Generalized Linear Model (GLM) is an advanced statistical modelling technique
formulated by John Nelder and Robert Wedderburn in 1972.
2. It should not be confused with the general linear model, which includes a number of different
statistical models such as ANOVA, ANCOVA, MANOVA, MANCOVA, ordinary linear regression,
the t-test and the F-test.
3. The general linear model is a generalization of multiple linear regression to the case of more
than one dependent variable.
4. In the general linear model, the dependent variable must be linearly associated with the values of the
independent variables, whereas in the generalized linear model the relationship between the
dependent variable and the independent variables can be non-linear.
5. There are three components in generalized linear models:
• Linear predictor
• Link function
• Probability distribution
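
To make the three components concrete, the sketch below computes a linear predictor and passes it through three commonly used inverse link functions, each conventionally paired with a response distribution. The pairings shown (identity with Normal, logit with Bernoulli, log with Poisson) are standard textbook choices rather than something stated in the slides, and all names and coefficients are hypothetical.

```python
import math

def linear_predictor(coefficients, intercept, x):
    """eta = intercept + sum of coefficient_j * x_j."""
    return intercept + sum(c * v for c, v in zip(coefficients, x))

# Inverse link functions map the linear predictor to the mean of the response.
INVERSE_LINKS = {
    "identity": lambda eta: eta,                        # Normal response (linear regression)
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # Bernoulli response (logistic regression)
    "log": lambda eta: math.exp(eta),                   # Poisson response (count data)
}

def glm_mean(coefficients, intercept, x, link):
    """Expected response under the chosen link function."""
    return INVERSE_LINKS[link](linear_predictor(coefficients, intercept, x))

# Hypothetical model: eta = 0.5 * 2.0 - 0.25 * 4.0 + 0.0 = 0.0
args = ([0.5, -0.25], 0.0, [2.0, 4.0])
print(glm_mean(*args, link="identity"))  # 0.0
print(glm_mean(*args, link="logit"))     # 0.5
print(glm_mean(*args, link="log"))       # 1.0
```

The same linear predictor yields a different mean under each link, which is how a GLM lets the response depend non-linearly on the predictors while the model stays linear in its coefficients.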
Support Vector Machines

1. Support Vector Machine (SVM) is a supervised machine learning algorithm used for both
classification and regression.
2. The objective of the support vector machine algorithm is to find a hyperplane in an
N-dimensional space (N being the number of features) that distinctly classifies the data points.
3. The dimension of the hyperplane depends upon the number of features.
4. SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme
cases are called support vectors, and hence the algorithm is termed Support Vector Machine.
5. If the number of input features is two, then the hyperplane is just a line. If the number of input
features is three, then the hyperplane becomes a 2-D plane. It becomes difficult to visualize when
the number of features exceeds three.
Support Vector Machines

1. Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in
n-dimensional space, but we need to find the best decision boundary for classifying the
data points. This best boundary is known as the hyperplane of SVM.
2. Support Vectors:
The data points or vectors that are closest to the hyperplane and that affect the position of
the hyperplane are termed support vectors. Since these vectors support the hyperplane, they are
called support vectors.
Support Vector Machines Linear Separators
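• Binary classification can be viewed as the task of separating classes in feature space:

f(x) = sign(w^T x + b)

w^T x + b = 0 (on the separating hyperplane)
w^T x + b > 0 (on one side of the hyperplane)
w^T x + b < 0 (on the other side of the hyperplane)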
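
The decision rule f(x) = sign(w^T x + b) can be sketched directly. The weight vector, bias, and test points below are hypothetical values chosen so that each of the three cases appears once.

```python
def decision_value(w, b, x):
    """Compute w^T x + b for vectors given as sequences of numbers."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """f(x) = sign(w^T x + b): +1 or -1, or 0 if x lies exactly on the hyperplane."""
    value = decision_value(w, b, x)
    return (value > 0) - (value < 0)

# Hypothetical separator: the line x1 + x2 = 3 in two dimensions.
w, b = (1.0, 1.0), -3.0
print(classify(w, b, (3, 3)))  # 1
print(classify(w, b, (0, 1)))  # -1
print(classify(w, b, (1, 2)))  # 0
```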
Support Vector Machines Linear Separators

• Which of the linear separators is optimal?

[Figure: candidate linear separators]
Support Vector Machines Classification Margin

• Distance from example x_i to the separator is

r = (w^T x_i + b) / ||w||

• Examples closest to the hyperplane are support vectors.
• Margin ρ of the separator is the distance between support vectors.
Support Vector Machines Maximum Margin Classification

• Maximizing the margin is good according to intuition and PAC theory.


• Implies that only support vectors matter; other training examples are ignorable.
Support Vector Machines Linear SVM Mathematically

• Let training set {(x_i, y_i)}, i = 1..n, x_i ∈ R^d, y_i ∈ {-1, 1} be separated by a hyperplane with
margin ρ. Then for each training example (x_i, y_i):

w^T x_i + b ≤ -ρ/2 if y_i = -1
w^T x_i + b ≥ ρ/2 if y_i = 1

which is equivalent to y_i (w^T x_i + b) ≥ ρ/2

• For every support vector x_s the above inequality is an equality. After rescaling w and b by ρ/2 in
the equality, we obtain that the distance between each x_s and the hyperplane is

r = y_s (w^T x_s + b) / ||w|| = 1 / ||w||

• Then the margin can be expressed through (rescaled) w and b as:

ρ = 2r = 2 / ||w||
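
A quick numerical check of the two formulas above, r = 1 / ||w|| and ρ = 2 / ||w||. The weight vector and the two support vectors are hypothetical and were chosen so that y_s (w^T x_s + b) = 1 holds exactly, that is, w and b are already rescaled.

```python
import math

def distance_to_hyperplane(w, b, x, y):
    """r = y * (w^T x + b) / ||w|| for a point x with label y in {-1, +1}."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w

def margin(w):
    """rho = 2 / ||w||, valid once support vectors satisfy y (w^T x + b) = 1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical rescaled separator with ||w|| = 5.
w, b = (3.0, 4.0), -1.0
support_pos = (0.0, 0.5)  # w^T x + b = +1, label +1
support_neg = (0.0, 0.0)  # w^T x + b = -1, label -1

print(distance_to_hyperplane(w, b, support_pos, +1))  # 0.2
print(distance_to_hyperplane(w, b, support_neg, -1))  # 0.2
print(margin(w))                                      # 0.4
```

Since the margin is 2 / ||w||, maximizing the margin is the same as minimizing ||w|| subject to y_i (w^T x_i + b) ≥ 1 for every training example.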
Support Vector Machines Advantages

Advantages of SVM:

1. It is effective in high-dimensional cases.
2. It is memory efficient, as it uses a subset of training points in the decision function,
called support vectors.
3. Different kernel functions can be specified for the decision function, and it is
possible to specify custom kernels.
Binary Classification:
Multiclass/Structured Outputs

1. Classification means categorizing data and forming groups based on similarities.
2. In a dataset, the independent variables or features play a vital role in classifying data.
3. In multiclass classification, the dependent or target variable has more than two classes.
4. Algorithms such as Naïve Bayes, Decision Trees, SVM, Random Forest classifier, KNN,
and logistic regression can be used for classification.
5. Examples of multi-class classification are:
• classification of news into different categories,
• classifying books according to subject,
• classifying students according to their streams, etc.


Classification: Binary vs Multiclass Classification

| Parameters | Binary classification | Multi-class classification |
|---|---|---|
| No. of classes | It is a classification into two groups, i.e. it classifies objects into at most two classes. | There can be any number of classes, i.e. it classifies objects into more than two classes. |
| Algorithms used | The most popular algorithms used for binary classification are: Logistic Regression; k-Nearest Neighbors; Decision Trees; Support Vector Machine; Naive Bayes | Popular algorithms that can be used for multi-class classification include: k-Nearest Neighbors; Decision Trees; Naive Bayes; Random Forest; Gradient Boosting |
| Examples | Email spam detection (spam or not); Churn prediction (churn or not); Conversion prediction (buy or not) | Face classification; Plant species classification; Optical character recognition |
MNIST

• The MNIST database (Modified National Institute of Standards and Technology
database) is a large database of handwritten digits that is commonly used for training
various image processing systems.
• It provides a baseline for testing image processing systems.