
UNIT-3

SUPERVISED MACHINE LEARNING

1. CLASSIFICATION AND REGRESSION
2. SOME SAMPLE DATASETS
3. K-NEAREST NEIGHBORS
4. LINEAR MODELS
5. NAIVE BAYES CLASSIFIERS
6. DECISION TREES
1) What is supervised machine learning?
Supervised machine learning:
• Supervised learning algorithms are trained using labeled data.
• A supervised learning model takes direct feedback to check whether it is predicting the correct output or not.
Examples: text categorization, face detection, signature recognition.

2) How does Supervised Learning work?


In supervised learning, models are trained using a labeled dataset, where the model learns about each type of data. Once the training process is completed, the model is tested on test data (held out from the training data), and then it predicts the output.
The working of supervised learning can be easily understood by the following example:
Suppose we have a dataset of different types of shapes, including squares, rectangles, triangles, and hexagons. The first step is to train the model on each shape:
o If the given shape has four sides, and all the sides are equal, then it will
be labeled as a Square.
o If the given shape has three sides, then it will be labeled as a triangle.
o If the given shape has six equal sides, then it will be labeled as hexagon.
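As a minimal illustrative sketch (not part of the original notes), the shape example can be encoded with two hypothetical features, the number of sides and whether all sides are equal, and a classifier trained on the labeled examples in Python:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical features: [number_of_sides, all_sides_equal (1 = yes, 0 = no)]
X = np.array([[4, 1], [4, 0], [3, 0], [6, 1]])
y = np.array(["square", "rectangle", "triangle", "hexagon"])

model = DecisionTreeClassifier(random_state=0).fit(X, y)
print(model.predict([[4, 1]]))  # expected: ['square']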

3) Explain different types of supervised machine learning?


There are two types:
a) Regression b) Classification

a) Regression:
Regression algorithms are used when there is a relationship between the input variable and the output variable. They are used for the prediction of continuous variables, such as weather forecasting, market trends, changes in temperature, or fluctuations in electricity demand.
Some popular regression algorithms which come under supervised learning:
o Linear Regression
o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression

b) Classification:
Classification techniques predict discrete responses, for example in medical imaging, speech recognition, and credit scoring.
Classification algorithms are used when the output variable is categorical, i.e., the output belongs to a class such as Yes-No, Male-Female, True-False, etc.
Some popular classification algorithms which come under supervised learning:
o Logistic Regression
o Decision Tree algorithm
o Support Vector Machines
o Random Forest algorithm

4) What are the advantages of supervised machine learning?


Advantages of Supervised learning
• Supervised learning allows collecting data and produces output based on previous experience.
• It helps to optimize performance criteria with the help of experience.
• Supervised machine learning helps to solve various types of real-world computation problems.
• It performs both classification and regression tasks.
• It allows estimating or mapping the result to a new sample.
• We have complete control over choosing the number of classes we want in the training data.
Disadvantages of Supervised learning
• Classifying big data can be challenging.
• Training requires a lot of computation time, so supervised learning can be slow.
• Supervised learning cannot handle all complex tasks in Machine Learning.
• It requires a labeled dataset.
• It requires a training process.

5) Explain applications of supervised ML?


There are many examples of supervised learning being used in everyday life.
Here are a few examples:
Spam filters:
• Many email clients use supervised learning algorithms to filter out spam
emails.
• The algorithms are trained on a dataset of labeled emails (spam and non-
spam) and use this information to predict whether a new email is spam or
not.
Fraud detection:
• Many financial institutions use supervised learning algorithms to identify
fraudulent activity.
• The algorithms are trained on a dataset of labeled transactions (fraudulent
and non-fraudulent) and use this information to flag potentially fraudulent
transactions in real-time.
Recommendation systems:
• Many online platforms, such as Netflix and Amazon, use supervised
learning algorithms to make recommendations to users based on their past
activity.
• The algorithms are trained on a dataset of user behavior (e.g. which
movies or products a user has watched or purchased) and use this
information to suggest similar movies or products to the user.
Speech recognition:
• Many voice assistants, such as Apple’s Siri and Amazon’s Alexa, use
supervised learning algorithms to process and interpret spoken
commands.
• The algorithms are trained on a dataset of labeled speech data
(transcribed speech and the corresponding text) and use this information
to transcribe and interpret spoken commands.
Image classification:
• Many image recognition systems, such as those used by social media
platforms to automatically tag photos, use supervised learning algorithms
to classify images based on their content.
• The algorithms are trained on a dataset of labeled images (e.g. images of
cats and dogs) and use this information to classify new images.

Medical diagnosis:
• It is very common to use supervised algorithms in the medical field for
diagnosis purposes.
• Supervised learning is integral to medical diagnosis as it enables
machines to learn from labeled data, aiding in predictive modeling,
decision support, early detection, and personalized medicine.
Bioinformatics:
• This is among the most widely used supervised learning applications, and we all use it regularly.
• In this context, it refers to identifying individuals from stored biological traits such as fingerprints, iris patterns, and earlobe shape.
• Mobile phones are now clever enough to read our biometric data and then verify us in order to increase system security.
Object recognition in computer vision:
• This type of software is used when you have to identify objects.
• You train the algorithm on a big labeled dataset, and it can then recognize a new object using what it has learned.
6) Differences between Regression and Classification.

Briefly explain classification.


Classification techniques predict discrete responses, for example in medical imaging, speech recognition, email spam detection, and credit scoring.
Classification algorithms are used when the output variable is categorical, i.e., the output belongs to a class such as Yes-No, Male-Female, True-False, etc.
The algorithm which implements the classification on a dataset is known as a classifier. There are two types of classifications:
o Binary Classifier: If the classification problem has only two possible outcomes, then it is called a Binary Classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
o Multi-class Classifier: If a classification problem has more than two outcomes, then it is called a Multi-class Classifier.
Examples: classification of types of crops, classification of types of music.
Popular classification algorithms in ML are:
a) Logistic Regression:
Description: Logistic regression is a linear classification algorithm used for
binary classification tasks. It estimates the probability that a given input belongs
to a particular class.
Advantages: Simple, interpretable, works well for linearly separable data.
Application: Spam detection, customer churn prediction.
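The following is a minimal sketch of logistic regression for binary classification, assuming scikit-learn and synthetic data standing in for a real spam dataset:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic two-class data standing in for, e.g., spam vs. non-spam features
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
# Probability estimates that each sample belongs to class 1
print("P(class = 1):", clf.predict_proba(X_test[:3])[:, 1])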

b) Support Vector Machines (SVM):


Description: SVM is a versatile classification algorithm that finds the optimal hyperplane to separate classes in the feature space. It can handle both linear and non-linear classification tasks.
Advantages: Effective in high-dimensional spaces, works well with clear margin
of separation.
Application: Text categorization, image recognition.
c) Decision Trees:
Description: Decision trees are non-linear classifiers that recursively split the
data based on feature values to make predictions. They create a tree-like structure
of decisions.
Advantages: Easy to interpret, can handle both numerical and categorical data.
Application: Customer segmentation, medical diagnosis.

d) Random Forest:
Description: Random Forest is an ensemble method that consists of multiple decision trees. It improves prediction accuracy and reduces overfitting by aggregating the predictions of individual trees.
Advantages: Robust to overfitting, handles high-dimensional data well.
Application: Credit risk analysis, image classification.

e) K-Nearest Neighbors (KNN):


Description: KNN is an instance-based classifier that classifies data points
based on the majority class among their k nearest neighbors in the feature space.
Advantages: Simple, non-parametric, easy to implement.
Application: Pattern recognition, recommendation systems.

f) Neural Networks:
Description: Neural networks are deep learning classifiers that consist of
interconnected layers of nodes. They learn complex patterns in the data through
training with backpropagation.
Advantages: Capable of learning intricate patterns, suitable for large datasets.
Application: Image recognition, speech recognition.
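As a hedged sketch of a small neural network classifier (the notes do not prescribe a library; scikit-learn's MLPClassifier is assumed here):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Small image-classification example: 8x8 handwritten digit images
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer of 64 units; weights are learned by backpropagation
nn = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
nn.fit(X_train, y_train)
print("test accuracy:", nn.score(X_test, y_test))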

7) Briefly explain Regression in ML?


Regression algorithms are used when there is a relationship between the input variable and the output variable. They are used for the prediction of continuous variables, such as weather forecasting, market trends, changes in temperature, or fluctuations in electricity demand.
Some popular Regression algorithms which come under supervised learning:
a) Linear Regression:
Description: Linear regression models the relationship between the
independent variables and the continuous target variable by fitting a linear
equation to the data.
Advantages: Simple, interpretable, computationally efficient.
Application: Predicting house prices, estimating sales revenue.
b) Polynomial Regression:
Description: Extends linear regression by adding polynomial terms to the
model, allowing it to capture non-linear relationships.
Advantages: Can model a broader range of data shapes than linear regression.
Application: Situations where the relationship between variables is curved,
such as growth rates, trajectories, and other natural phenomena.

c) Ridge Regression:
Description: Ridge regression is a regularized form of linear regression that adds a penalty term to the cost function to prevent overfitting by shrinking the coefficients.
Advantages: Handles multicollinearity, reduces model complexity.
Application: Stock price prediction, risk analysis.

d) Lasso Regression:
Description: Lasso regression is another regularized linear regression technique
that uses the L1 norm penalty for feature selection by shrinking some
coefficients to zero.
Advantages: Feature selection, interpretable models.
Application: Marketing spend optimization, medical cost prediction.

e) Random Forest Regression:


Description: Random Forest regression is an ensemble method that combines
multiple decision trees to improve prediction accuracy and handle non-linear
relationships.
Advantages: Robust to overfitting, handles large datasets.
Application: Demand forecasting, stock market analysis.

f) Decision Tree Regression:


Description: Decision tree regression builds a tree structure to make predictions
by partitioning the feature space into regions and assigning a constant value to
each region.
Advantages: Easy to interpret, handles both numerical and categorical data.
Application: Sales forecasting, risk assessment.
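As a hedged sketch, several of the regression algorithms described above can be compared on the same synthetic dataset with scikit-learn; the alpha values and data are illustrative assumptions:

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data: y depends non-linearly on x, plus noise
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X).ravel() + 0.1 * rng.randn(300)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Ridge (L2 penalty)": Ridge(alpha=1.0),
    "Lasso (L1 penalty)": Lasso(alpha=0.01),
    "Decision tree": DecisionTreeRegressor(max_depth=4, random_state=0),
    "Random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "R^2 on test data:", round(model.score(X_test, y_test), 3))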

8) Briefly Explain K-Nearest Neighbors algorithm?


Definition:
K-Nearest Neighbors is a simple algorithm that stores all the available cases and classifies new data or cases based on a similarity measure. It is mostly used to classify a data point based on how its neighbors are classified.
Features
o K-Nearest Neighbors is one of the simplest Machine Learning algorithms, based on the Supervised Learning technique.
o The K-NN algorithm assumes similarity between the new case/data and the available cases, and puts the new case into the category that is most similar to the available categories.
o The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category using the K-NN algorithm.
o K-NN algorithm can be used for Regression as well as for Classification
but mostly it is used for the Classification problems.
o K-NN is a non-parametric algorithm, which means it does not make any
assumption on underlying data.
o It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the dataset and, at the time of classification, performs an action on the dataset.
o KNN algorithm at the training phase just stores the dataset and when it gets
new data, then it classifies that data into a category that is much similar to
the new data.
o Example: Suppose we have an image of a creature that looks similar to both a cat and a dog, and we want to know whether it is a cat or a dog. For this identification, we can use the KNN algorithm, as it works on a similarity measure. Our KNN model will find the features of the new image most similar to the cat and dog images and, based on the most similar features, will put it in either the cat or the dog category.
Why do we need a K-NN Algorithm?
Suppose there are two categories, Category A and Category B, and we have a new data point x1: in which of these categories will this data point lie? To solve this type of problem, we need a K-NN algorithm. With the help of K-NN, we can easily identify the category or class of a particular data point.

How does K-NN work?


The K-NN working can be explained on the basis of the below algorithm:
o Step-1: Select the number K of neighbors.
o Step-2: Calculate the Euclidean distance between the new data point and the existing data points.
o Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.
o Step-4: Among these K neighbors, count the number of data points in each category.
o Step-5: Assign the new data point to the category for which the number of neighbors is maximum.
o Step-6: Our model is ready.
Suppose we have a new data point and we need to put it in the required category.
Firstly, we will choose the number of neighbors; say k = 5.
Next, we will calculate the Euclidean distance between the data points. The Euclidean distance is the distance between two points, which we have already studied in geometry. For two points (x1, y1) and (x2, y2) it can be calculated as:

d = √((x2 − x1)² + (y2 − y1)²)
o By calculating the Euclidean distance, we get the nearest neighbors: three nearest neighbors in category A and two nearest neighbors in category B.
o As 3 of the 5 nearest neighbors are from category A, the new data point must belong to category A.
How to select the value of K in the K-NN Algorithm?
Below are some points to remember while selecting the value of K in the K-NN
algorithm:
o There is no particular way to determine the best value for "K", so we need to try several values and pick the best among them. A commonly used default value for K is 5.
o A very low value for K, such as K = 1 or K = 2, can be noisy and subject to the effects of outliers in the model.
o Larger values for K smooth out noise, but make class boundaries less distinct and increase computation.
Advantages of KNN Algorithm:
o It is simple to implement.
o Easy to understand
o It is robust to the noisy training data
o It can be more effective if the training data is large.
o It can perform well with enough representative data
Disadvantages of KNN Algorithm:
o It always needs determination of the value of K, which may be complex at times.
o It is sensitive to irrelevant features.
o It requires high memory storage, since the entire training set is kept.
o Prediction can be slow when the value of K or the dataset is large, since distances to all stored training points must be computed.
Applications of KNN
1. Text mining
2. Agriculture
3. Finance
4. Medical
5. Facial recognition
6. Recommendation systems (Amazon, Hulu, Netflix, etc)
Note: Read Lab Program 7 (KNN).
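As a minimal sketch (not the lab program itself), a KNN classifier with K = 5 using scikit-learn, where the Iris dataset stands in for any labeled data:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Iris stands in for any labeled dataset; Euclidean distance is the default metric
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)  # Step 1: select K = 5
knn.fit(X_train, y_train)                  # "training" only stores the data (lazy learner)
print("test accuracy:", knn.score(X_test, y_test))
# Each prediction is a majority vote among the 5 nearest neighbors
print("predicted class:", knn.predict(X_test[:1]))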

9) Explain LINEAR MODELS?


A linear model is a type of ML algorithm that is commonly used for supervised learning tasks such as regression. It is based on the idea of fitting a linear equation to a set of data points, which can then be used to make predictions about new data.
The equation for a simple linear regression model can be expressed as

y = mx + b

where
y is the dependent variable,
x is the independent variable,
m is the slope of the line, and b is the intercept.

There are two types of linear models in ML


a) Linear regression b) Logistic regression

a) Linear regression:
Linear regression is one of the easiest and most popular Machine Learning
algorithms. It is a statistical method that is used for predictive analysis. Linear
regression makes predictions for continuous/real or numeric variables such as
sales, salary, age, product price, etc.
The linear regression algorithm shows a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name linear regression. Since linear regression shows a linear relationship, it finds how the value of the dependent variable changes according to the value of the independent variable.
The linear regression model provides a sloped straight line representing the relationship between the variables.

Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε

Here,
y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error
The values for x and y variables are training datasets for Linear Regression
model representation.

Types of Linear Regression


Linear regression can be further divided into two types of algorithm:
o Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical
dependent variable, then such a Linear Regression algorithm is called
Simple Linear Regression.
o Multiple Linear regression:
If more than one independent variable is used to predict the value of a
numerical dependent variable, then such a Linear Regression algorithm is
called Multiple Linear Regression.
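A minimal sketch of simple linear regression fitting y = mx + b, assuming scikit-learn and synthetic data:

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data following y = 2x + 1, plus random noise
rng = np.random.RandomState(0)
x = rng.uniform(0, 10, size=(50, 1))
y = 2 * x.ravel() + 1 + rng.randn(50)

reg = LinearRegression().fit(x, y)
print("slope m ~", reg.coef_[0])        # should be close to 2
print("intercept b ~", reg.intercept_)  # should be close to 1
print("prediction at x = 4:", reg.predict([[4.0]]))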

b) Logistic regression:
Logistic regression is one of the most popular Machine Learning
algorithms, which comes under the Supervised Learning technique. It is
used for predicting the categorical dependent variable using a given set of
independent variables.
o Logistic regression predicts the output of a categorical dependent variable. Therefore the outcome must be a categorical or discrete value: Yes or No, 0 or 1, True or False, etc. But instead of giving the exact values 0 and 1, it gives probabilistic values which lie between 0 and 1.
o Logistic Regression is much like Linear Regression except in how it is used: Linear Regression is used for solving regression problems, whereas Logistic Regression is used for solving classification problems.
o In Logistic regression, instead of fitting a regression line, we fit an "S"
shaped logistic function, which predicts two maximum values (0 or 1).
o The curve from the logistic function indicates the likelihood of something
such as whether the cells are cancerous or not, a mouse is obese or not
based on its weight, etc.
o Logistic Regression is a significant machine learning algorithm because it
has the ability to provide probabilities and classify new data using
continuous and discrete datasets.
o Logistic Regression can be used to classify observations using different types of data and can easily determine the most effective variables for the classification.

Logistic Function (Sigmoid Function):


o The sigmoid function is a mathematical function used to map the predicted
values to probabilities.
o It maps any real value into another value within a range of 0 and 1.
o The output of logistic regression must be between 0 and 1, and cannot go beyond this limit, so it forms a curve like an "S" shape. The S-shaped curve is called the sigmoid function or the logistic function.
o In logistic regression, we use the concept of a threshold value, which defines the probability of either 0 or 1: values above the threshold tend to 1, and values below the threshold tend to 0.
Assumptions for Logistic Regression:
o The dependent variable must be categorical in nature.
o The independent variable should not have multi-collinearity.
Logistic Regression Equation:
The Logistic regression equation can be obtained from the Linear Regression
equation. The mathematical steps to get Logistic Regression equations are given
below:
o We know the equation of a straight line can be written as:

y = b0 + b1x1 + b2x2 + ... + bnxn

o In logistic regression, y can be between 0 and 1 only, so let's divide the above equation by (1 − y):

y / (1 − y)   (0 for y = 0, and infinity for y = 1)

o But we need a range between −infinity and +infinity, so taking the logarithm of the equation, it becomes:

log[ y / (1 − y) ] = b0 + b1x1 + b2x2 + ... + bnxn

The above equation is the final equation for logistic regression.
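The sigmoid mapping and the log-odds relationship above can be checked numerically; a small sketch with NumPy, using hypothetical coefficients b0 and b1:

import numpy as np

def sigmoid(z):
    # Map any real value z into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients b0 (intercept) and b1 (slope) for one feature
b0, b1 = -2.0, 0.8
x = np.array([-2.0, 0.0, 2.0, 5.0])
z = b0 + b1 * x  # the linear part: b0 + b1*x
p = sigmoid(z)   # predicted probability that y = 1
print("probabilities:", p)
print("log-odds recovered:", np.log(p / (1 - p)))  # equals b0 + b1*x
print("class labels (threshold 0.5):", (p >= 0.5).astype(int))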


Note: Write Lab Program 8.

10) Explain NAIVE BAYES CLASSIFIERS?


o Naïve Bayes algorithm is a supervised learning algorithm, which is based
on Bayes theorem and used for solving classification problems.
o It is mainly used in text classification that includes a high-dimensional
training dataset.
o The Naïve Bayes classifier is one of the simplest and most effective classification algorithms, and it helps in building fast machine learning models that can make quick predictions.
o It is a probabilistic classifier, which means it predicts on the basis of
the probability of an object.
o Some popular examples of Naïve Bayes Algorithm are spam filtration,
Sentimental analysis, and classifying articles.

Why is it called Naïve Bayes?


The Naïve Bayes algorithm comprises the two words Naïve and Bayes, which can be described as:
o Naïve: It is called Naïve because it assumes that the occurrence of a certain feature is independent of the occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple. Each feature individually contributes to identifying it as an apple, without depending on the others.
o Bayes: It is called Bayes because it depends on the principle of Bayes'
Theorem.

Bayes' Theorem:
o Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used
to determine the probability of a hypothesis with prior knowledge. It
depends on the conditional probability.
o The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) · P(A) / P(B)

Where,
P(A|B) is Posterior probability: Probability of hypothesis A on the observed
event B.
P(B|A) is Likelihood probability: Probability of the evidence given that the
probability of a hypothesis is true.
P(A) is Prior Probability: Probability of hypothesis before observing the
evidence.
P(B) is Marginal Probability: Probability of Evidence.

Working of Naïve Bayes' Classifier:


Working of Naïve Bayes' Classifier can be understood with the help of the below
example:
Suppose we have a dataset of weather conditions and a corresponding target variable "Play". Using this dataset, we need to decide whether we should play or not on a particular day according to the weather conditions. To solve this problem, we need to follow the steps below:
1. Convert the given dataset into frequency tables.
2. Generate Likelihood table by finding the probabilities of given features.
3. Now, use Bayes theorem to calculate the posterior probability.
Problem: If the weather is sunny, then the Player should play or not?
Solution: To solve this, first consider the below dataset:
Day   Outlook    Play
0     Rainy      Yes
1     Sunny      Yes
2     Overcast   Yes
3     Overcast   Yes
4     Sunny      No
5     Rainy      Yes
6     Sunny      Yes
7     Overcast   Yes
8     Rainy      No
9     Sunny      No
10    Sunny      Yes
11    Rainy      No
12    Overcast   Yes
13    Overcast   Yes
Frequency table for the Weather Conditions:
Weather    Yes   No
Overcast   5     0
Rainy      2     2
Sunny      3     2
Total      10    4
Likelihood table of weather conditions:

Weather    No            Yes           Total
Overcast   0             5             5/14 = 0.35
Rainy      2             2             4/14 = 0.29
Sunny      2             3             5/14 = 0.35
All        4/14 = 0.29   10/14 = 0.71


Applying Bayes' theorem:

P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
P(Sunny|Yes) = 3/10 = 0.3
P(Sunny) = 5/14 = 0.35
P(Yes) = 10/14 = 0.71
So P(Yes|Sunny) = 0.3 * 0.71 / 0.35 = 0.60

P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny)
P(Sunny|No) = 2/4 = 0.5
P(No) = 4/14 = 0.29
P(Sunny) = 5/14 = 0.35
So P(No|Sunny) = 0.5 * 0.29 / 0.35 = 0.41

As we can see from the above calculation, P(Yes|Sunny) > P(No|Sunny). Hence, on a sunny day, the player can play the game.
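The hand calculation above can be reproduced in a few lines of plain Python, with the numbers taken directly from the tables:

# Values read directly from the frequency/likelihood tables above
p_sunny_given_yes = 3 / 10  # P(Sunny|Yes)
p_yes = 10 / 14             # P(Yes), about 0.71
p_sunny = 5 / 14            # P(Sunny), about 0.35
p_sunny_given_no = 2 / 4    # P(Sunny|No)
p_no = 4 / 14               # P(No), about 0.29

p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
p_no_given_sunny = p_sunny_given_no * p_no / p_sunny
print(round(p_yes_given_sunny, 2))  # 0.6
print(round(p_no_given_sunny, 2))   # 0.4 exactly; 0.41 with the rounded values above
print("Play" if p_yes_given_sunny > p_no_given_sunny else "Don't play")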
Advantages of Naïve Bayes Classifier:
o Naïve Bayes is one of the fastest and easiest ML algorithms for predicting the class of a dataset.
o It can be used for binary as well as multi-class classification.
o It performs well in multi-class prediction compared to other algorithms.
o It is the most popular choice for text classification problems.

Disadvantages of Naïve Bayes Classifier:


o Naive Bayes assumes that all features are independent or unrelated, so it
cannot learn the relationship between features.

Applications of Naïve Bayes Classifier:


o It is used for Credit Scoring.
o It is used in medical data classification.
o It can be used in real-time predictions because Naïve Bayes Classifier is
an eager learner.
o It is used in Text classification such as Spam filtering and Sentiment
analysis.

Types of Naïve Bayes Model:


There are three types of Naive Bayes Model, which are given below:
o Gaussian: The Gaussian model assumes that features follow a normal
distribution. This means if predictors take continuous values instead of
discrete, then the model assumes that these values are sampled from the
Gaussian distribution.
o Multinomial: The Multinomial Naïve Bayes classifier is used when the data is multinomially distributed. It is primarily used for document classification problems, i.e., determining which category a particular document belongs to, such as Sports, Politics, Education, etc. The classifier uses the frequency of words as the predictors.
o Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the predictor variables are independent Boolean variables, such as whether a particular word is present in a document or not. This model is also well known for document classification tasks.
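As a hedged sketch, the Gaussian variant applied to continuous features using scikit-learn (MultinomialNB and BernoulliNB from the same module cover the other two types):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Iris has continuous features, so the Gaussian model applies
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_train, y_train)
print("test accuracy:", nb.score(X_test, y_test))
# sklearn.naive_bayes also provides MultinomialNB and BernoulliNB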

11) Explain DECISION TREES?


Decision Tree is a Supervised learning technique that can be used for both
classification and Regression problems, but mostly it is preferred for solving
Classification problems.
It is a tree-structured classifier, where internal nodes represent the
features of a dataset, branches represent the decision rules and each leaf node
represents the outcome.
o In a Decision tree, there are two nodes, which are the Decision Node
and Leaf Node. Decision nodes are used to make any decision and have
multiple branches, whereas Leaf nodes are the output of those decisions
and do not contain any further branches.
o The decisions or the test are performed on the basis of features of the given
dataset.
o It is a graphical representation for getting all the possible solutions to a
problem/decision based on given conditions.
o It is called a decision tree because, similar to a tree, it starts with the root
node, which expands on further branches and constructs a tree-like
structure.
o In order to build a tree, we use the CART algorithm, which stands for
Classification and Regression Tree algorithm.
o A decision tree simply asks a question and, based on the answer (Yes/No), further splits the tree into subtrees.
Note: A decision tree can contain categorical data (YES/NO) as well as numeric
data.

Why use Decision Trees?


There are various algorithms in Machine learning, so choosing the best algorithm
for the given dataset and problem is the main point to remember while creating a
machine learning model. Below are the two reasons for using the Decision tree:
o Decision Trees usually mimic human thinking ability while making a
decision, so it is easy to understand.
o The logic behind the decision tree can be easily understood because it
shows a tree-like structure.

Decision Tree Terminologies


Root Node: The root node is where the decision tree starts. It represents the entire dataset, which further gets divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output nodes; the tree cannot be segregated further after reaching a leaf node.
Splitting: Splitting is the process of dividing the decision node/root node into sub-nodes according to the given conditions.
Branch/Sub-Tree: A tree formed by splitting the tree.
Pruning: Pruning is the process of removing unwanted branches from the tree.
Parent/Child node: The root node of the tree is called the parent node, and the other nodes are called the child nodes.

How does the Decision Tree algorithm Work?


o Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
o Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
o Step-3: Divide S into subsets that contain possible values for the best attributes.
o Step-4: Generate the decision tree node which contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where you cannot classify the nodes any further; call the final node a leaf node.
Example: Suppose there is a candidate who has a job offer and wants to decide
whether he should accept the offer or Not. So, to solve this problem, the decision
tree starts with the root node (Salary attribute by ASM). The root node splits
further into the next decision node (distance from the office) and one leaf node
based on the corresponding labels. The next decision node further gets split into
one decision node (Cab facility) and one leaf node. Finally, the decision node splits into two leaf nodes (Accepted offer and Declined offer).
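As a minimal sketch of this idea, scikit-learn's DecisionTreeClassifier (an implementation of CART) can be trained on encoded features loosely following the job-offer example; the feature values and labels here are hypothetical:

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical encoded features: [salary_lakhs, distance_km, cab_facility (1/0)]
X = np.array([[3, 5, 0], [9, 4, 1], [8, 30, 0], [10, 25, 1], [4, 10, 1], [9, 28, 0]])
y = np.array(["Decline", "Accept", "Decline", "Accept", "Decline", "Decline"])

tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
print(export_text(tree, feature_names=["salary", "distance", "cab"]))
print(tree.predict([[9, 20, 1]]))  # high salary, moderate distance, cab available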
Advantages of the Decision Tree
o It is simple to understand, as it follows the same process that a human follows while making a decision in real life.
o It can be very useful for solving decision-related problems.
o It helps to think about all the possible outcomes for a problem.
o There is less requirement of data cleaning compared to other algorithms.

Disadvantages of the Decision Tree


o The decision tree contains lots of layers, which makes it complex.
o It may have an overfitting issue, which can be resolved using
the Random Forest algorithm.
o For more class labels, the computational complexity of the decision tree
may increase.

Ex: Lab Program 9
