Unit-II
Some popular classification algorithms are decision trees, random forests, support vector machines (SVM),
logistic regression, etc.
2. Regression
The key objective of regression-based tasks is to predict output labels or responses, which are continuous
numeric values, for the given input data. Basically, regression models use the input data features
(independent variables) and their corresponding continuous numeric output values (dependent or outcome
variables) to learn specific associations between inputs and corresponding outputs.
Some popular regression algorithms are linear regression, polynomial regression, Lasso regression, etc.
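For illustration, a minimal regression sketch using scikit-learn's LinearRegression on made-up data (the feature values and outputs below are invented for demonstration only):
from sklearn.linear_model import LinearRegression
import numpy as np

# Illustrative data: one input feature and a continuous output value
X = np.array([[1], [2], [3], [4], [5]])               # e.g. years of experience
y = np.array([30000, 35000, 41000, 46000, 52000])     # e.g. salary

model = LinearRegression()
model.fit(X, y)                         # learn the association between inputs and outputs
print(model.predict(np.array([[6]])))   # predict a continuous value for an unseen input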
Decision Tree
● A Decision tree is a tree-like structure used to make decisions and analyze their
possible consequences. The algorithm splits the data into subsets based on
features, where each internal node represents a decision on a feature and each
leaf node represents the final prediction.
● A decision tree is a supervised learning algorithm used for both classification and
regression problems. It is represented as a tree structure where each internal
node represents a test on an attribute, each branch represents the outcome of the
test, and each leaf node represents a class label or a predicted value. The goal of
a decision tree is to split the dataset into subsets based on the value of an
attribute, repeating this process until each subset contains only instances that
belong to a single class or have similar values.
Attribute Selection Measures
Choosing the right attribute to split the data at each node is a critical step in
building an accurate decision tree. The most common methods used for attribute
selection are information gain and Gini index.
Information Gain
Information gain measures the reduction in entropy obtained by splitting the dataset
on an attribute; the attribute with the highest information gain is the best candidate
for the split.
Gini Index
Attributes that result in the lowest Gini index after the split are considered the best
candidates for splitting the data.
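For illustration, a minimal sketch of how these measures can be computed from class counts; the counts [6, 7], [5, 0] and [1, 7] are the node values that appear in the worked example below:
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def gini(counts):
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

parent = [6, 7]                  # root node: 6 "NO" and 7 "GO" examples
children = [[5, 0], [1, 7]]      # the two child nodes produced by the first split

info_gain = entropy(parent) - sum(
    (sum(child) / sum(parent)) * entropy(child) for child in children)
print(gini(parent), info_gain)   # gini of the root is 0.497, as in the example below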
Example of Decision Tree
import pandas
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

df = pandas.read_csv("data.csv")

# Map the categorical columns to numbers so the tree can use them
d = {'UK': 0, 'USA': 1, 'N': 2}
df['Nationality'] = df['Nationality'].map(d)
d = {'YES': 1, 'NO': 0}
df['Go'] = df['Go'].map(d)
print(df)

features = ['Age', 'Experience', 'Rank', 'Nationality']
X = df[features]
y = df['Go']

dtree = DecisionTreeClassifier()
dtree = dtree.fit(X, y)

tree.plot_tree(dtree, feature_names=features)
plt.show()
Output
Rank <= 6.5 means that every comedian with a rank of 6.5 or lower will follow the True arrow (to the left), and the
rest will follow the False arrow (to the right).
gini = 0.497 refers to the quality of the split, and is always a number between 0.0 and 0.5, where 0.0 would mean
all of the samples got the same result, and 0.5 would mean that the split is done exactly in the middle.
samples = 13 means that there are 13 comedians left at this point in the decision, which is all of them since this is
the first step.
value = [6, 7] means that of these 13 comedians, 6 will get a "NO", and 7 will get a "GO".
The Gini method uses this formula:
gini = 1 - (x/n)² - (y/n)²
where x is the number of positive answers ("GO"), n is the number of samples, and y is the number of
negative answers ("NO"), which gives us this calculation:
1 - (7/13)² - (6/13)² = 0.497
gini = 0.0 means all of the samples got the same result.
samples = 5 means that there are 5 comedians left in this branch (5 comedians with a Rank of 6.5 or lower).
value = [5, 0] means that 5 will get a "NO" and 0 will get a "GO".
Nationality
Nationality <= 0.5 means that the comedians with a nationality value of less than 0.5 will follow the arrow to the left (which
means everyone from the UK), and the rest will follow the arrow to the right.
gini = 0.219 refers to the quality of the split; being fairly close to 0.0, it means most of the samples in this branch get the
same result.
samples = 8 means that there are 8 comedians left in this branch (8 comedians with a Rank higher than 6.5).
value = [1, 7] means that of these 8 comedians, 1 will get a "NO" and 7 will get a "GO".
Advantages of DT
● Simple to understand and to interpret. Trees can be visualized.
● Requires little data preparation. Other techniques often require data normalization,
creation of dummy variables, and removal of blank values. Some tree and
algorithm combinations support missing values.
● The cost of using the tree (i.e., predicting data) is logarithmic in the number of data
points used to train the tree.
● Able to handle both numerical and categorical data. However, the scikit-learn
implementation does not support categorical variables for now. Other techniques are
usually specialized in analyzing datasets that have only one type of variable.
● Able to handle multi-output problems.
Disadvantages of DT
● Decision-tree learners can create over-complex trees that do not generalize the data well.
This is called overfitting. Mechanisms such as pruning, setting the minimum number of
samples required at a leaf node, or setting the maximum depth of the tree are necessary to
avoid this problem (a minimal sketch of these controls follows this list).
● Decision trees can be unstable because small variations in the data might result in a
completely different tree being generated. This problem is mitigated by using decision trees
within an ensemble.
● Predictions of decision trees are neither smooth nor continuous, but piecewise constant
approximations. Therefore, they are not good at extrapolation.
● The problem of learning an optimal decision tree is known to be NP-complete under several
aspects of optimality and even for simple concepts. There are concepts that are hard to learn
because decision trees do not express them easily, such as XOR, parity or multiplexer
problems.
● Decision tree learners create biased trees if some classes dominate. It is therefore
recommended to balance the dataset prior to fitting with the decision tree.
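As a minimal sketch of the overfitting controls mentioned in the list above (the parameter values are illustrative, not tuned):
from sklearn.tree import DecisionTreeClassifier

# Limiting the depth and requiring a minimum number of samples per leaf
# are common ways to keep the tree from fitting noise in the training data.
dtree = DecisionTreeClassifier(
    max_depth=3,            # stop splitting after three levels
    min_samples_leaf=5,     # each leaf must contain at least 5 training samples
    ccp_alpha=0.01)         # cost-complexity pruning strength
dtree = dtree.fit(X, y)     # X, y as prepared in the comedian example above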
Appropriate Problems for Decision Tree Learning
Decision tree learning is generally best suited to problems with the following characteristics:
● Instances are represented by attribute-value pairs.
○ There is a finite list of attributes (e.g. hair colour) and each instance stores a value for
that attribute (e.g. blonde).
○ When each attribute has a small number of distinct values (e.g. blonde, brown, red) it
is easier for the decision tree to reach a useful solution.
○ The algorithm can be extended to handle real-valued attributes (e.g. a floating point
temperature).
● The target function has discrete output values.
○ A decision tree classifies each example as one of the output values.
○ The simplest case exists when there are only two possible classes (Boolean
classification).
○ However, it is easy to extend the decision tree to produce a target function with
more than two possible output values.
○ Although it is less common, the algorithm can also be extended to produce a target
function with real-valued outputs.
● Disjunctive descriptions may be required.
○ Decision trees naturally represent disjunctive expressions.
● The training data may contain errors.
○ Errors in the classification of examples, or in the attribute values describing those examples
are handled well by decision trees, making them a robust learning method.
● The training data may contain missing attribute values.
○ Decision tree methods can be used even when some training examples have unknown values
(e.g., humidity is known for only a fraction of the examples).
ID3 Algorithm
● ID3 or Iterative Dichotomiser 3 Algorithm is used in machine learning for building
decision trees from a given dataset. It was developed in 1986 by Ross Quinlan. It is a
greedy algorithm that builds a decision tree by recursively partitioning the data set into
smaller and smaller subsets until all data points in each subset belong to the same class.
It employs a top-down approach, recursively selecting features to split the dataset based
on information gain.
● The ID3 algorithm selects the feature that provides the most information about the target
variable. The decision tree is built top-down, starting with the root node, which represents
the entire dataset. At each node, the ID3 algorithm selects the attribute that provides the
most information gain about the target variable. The attribute with the highest information
gain is the one that best separates the data points into different categories.
Steps in ID3 Algorithm
1. Determine the entropy of the overall dataset using the class distribution.
2. For each feature:
● Calculate the entropy for each of its unique categorical values.
● Assess the information gain obtained by splitting on that feature.
3. Choose the feature that generates the highest information gain.
4. Apply the above steps recursively to build the decision tree structure.
Step 1: Calculating entropy for the dataset
import math

def calculate_entropy(data, target_column):
    total_rows = len(data)
    target_values = data[target_column].unique()
    entropy = 0
    for value in target_values:
        p = len(data[data[target_column] == value]) / total_rows   # class proportion
        entropy -= p * math.log2(p)
    return entropy
Step 2: Calculating information gain for a feature
def calculate_information_gain(data, feature, target_column):
    unique_values = data[feature].unique()
    weighted_entropy = 0
    for value in unique_values:
        subset = data[data[feature] == value]
        weighted_entropy += (len(subset) / len(data)) * calculate_entropy(subset, target_column)
    information_gain = calculate_entropy(data, target_column) - weighted_entropy
    return information_gain
Step 3: Assessing the best feature with the highest information gain and building the tree
def id3(data, features, target_column):
    if len(data[target_column].unique()) == 1:    # all examples share one class
        return data[target_column].iloc[0]
    if len(features) == 0:                        # no features left: return the majority class
        return data[target_column].mode().iloc[0]
    # Split on the feature with the highest information gain
    gains = {f: calculate_information_gain(data, f, target_column) for f in features}
    best_feature = max(gains, key=gains.get)
    tree = {best_feature: {}}
    for value in data[best_feature].unique():
        subset = data[data[best_feature] == value]
        tree[best_feature][value] = id3(subset, [f for f in features if f != best_feature], target_column)
    return tree
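A small usage sketch of the functions above on a hand-made toy dataset (the rows below are invented for illustration):
import pandas as pd

data = pd.DataFrame({
    'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rain', 'Rain'],
    'Windy':   ['False', 'True', 'False', 'False', 'True'],
    'Play':    ['No', 'No', 'Yes', 'Yes', 'No'],
})
tree = id3(data, ['Outlook', 'Windy'], 'Play')
print(tree)   # nested dict, e.g. {'Outlook': {'Sunny': 'No', 'Overcast': 'Yes', 'Rain': {...}}}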
Issues in Decision Tree Learning
1. Overfitting:
Decision trees can become overly complex and fit the training data too closely, including noise and irrelevant
details. This leads to poor performance on new, unseen data.
2. Sensitivity to Data Changes:
Small variations in the training data can lead to significant differences in the resulting decision tree, making the
model unstable.
3. Bias Towards Features with More Levels:
Decision trees may favor features with many unique values, potentially leading to biased models that neglect other
important features.
4. Difficulty in Capturing Linear Relationships:
Decision trees inherently represent non-linear relationships and may struggle with datasets where relationships
are primarily linear.
K-Nearest Neighbor (KNN)
The k-Nearest Neighbors (KNN) algorithm is a supervised machine learning algorithm used for both
classification and regression tasks. It operates on the principle that similar data points exist in close
proximity to each other.
Supervised Learning:
KNN requires labeled training data, where each data point has a known class or value.
Distance Metric:
KNN relies on calculating distances between data points. The most common metric is Euclidean
distance, but others like Manhattan, Minkowski, and Hamming distances can also be used.
K Value:
The 'k' represents the number of nearest neighbors considered when making a prediction. The
choice of 'k' is crucial and can impact the model's performance.
No Training Phase:
KNN is a "lazy learner," meaning it doesn't have an explicit training phase. The training data is
simply stored and used during prediction.
KNN is a simple, supervised machine learning (ML) algorithm that can be used for
classification or regression tasks - and is also frequently used in missing value imputation. It
is based on the idea that the observations closest to a given data point are the most "similar"
observations in a data set, and we can therefore classify unforeseen points based on the
values of the closest existing points. By choosing K, the user can select the number of
nearby observations to use in the algorithm.
https://colab.research.google.com/drive/1l0t7JHRHXFAr_r_bd7FWLvy2rV5LL_Ge?pli=1&authuser=1#scrollTo=vJOqkRd3nfxY
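A minimal scikit-learn sketch of the KNN idea described above (the toy points and labels are invented for illustration):
from sklearn.neighbors import KNeighborsClassifier

# Toy 2-D points with two classes
X = [[1, 2], [2, 3], [3, 3], [6, 7], [7, 8], [8, 8]]
y = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3)   # k = 3 nearest neighbors
knn.fit(X, y)                               # "lazy" learner: this mostly just stores the data
print(knn.predict([[2, 2], [7, 7]]))        # classify new points by majority vote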
K Means Clustering
https://colab.research.google.com/drive/1BDvrp2jol_afAow3ZAnzse77dK-yeF-Y
Distance Metrics Used in KNN Algorithm
1. Euclidean Distance
2. Manhattan Distance
3. Minkowski Distance
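A minimal pure-Python sketch of these three distance metrics, assuming two equal-length feature vectors:
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p=3):
    # p = 1 gives Manhattan distance, p = 2 gives Euclidean distance
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

print(euclidean([1, 2], [4, 6]), manhattan([1, 2], [4, 6]))   # 5.0 and 7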
Working of KNN algorithm
Step 1: Selecting the optimal value of K
● K represents the number of nearest neighbors that need to be considered while making a
prediction.
Step 2: Calculating the distance
● To measure the similarity between the target and training data points, Euclidean distance is
used. Distance is calculated between each data point in the dataset and the target point.
Step 3: Finding the nearest neighbors
● The k data points with the smallest distances to the target point are the nearest neighbors.
Step 4: Voting for Classification or Taking Average for Regression
● When you want to classify a data point into a category like spam or not spam, the KNN
algorithm looks at the K closest points in the dataset. These closest points are called
neighbors. The algorithm then looks at which category the neighbors belong to and picks
the one that appears the most. This is called majority voting.
● For regression, the prediction is instead the average (or a distance-weighted average) of the
values of the K nearest neighbors.
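A small sketch of the voting and averaging step, assuming the labels or values of the K nearest neighbors have already been collected:
from collections import Counter

# Classification: majority vote among the K nearest neighbors' labels
neighbor_labels = ['spam', 'not spam', 'spam']
prediction = Counter(neighbor_labels).most_common(1)[0][0]
print(prediction)                                  # 'spam'

# Regression: average of the K nearest neighbors' values
neighbor_values = [3.2, 2.8, 3.5]
print(sum(neighbor_values) / len(neighbor_values))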