
HAZARA UNIVERSITY MANSEHRA

Starting with the name of ALLAH Almighty, Who is Rehman and Raheem.

Name: SAAD IQBAL

Roll no: 301-211073

Department: Information Technology

Section: BSCS-B

Semester: 6th

Subject: Artificial Intelligence

Submitted to: Miss Muneeba Darwaish

Assignment 2
Question 1
You are given a dataset containing information about weather conditions and whether people
decide to play tennis under those conditions. The dataset includes the following attributes:
1. Outlook: {Sunny, Overcast, Rain}
2. Temperature: {Hot, Mild, Cool}
3. Humidity: {High, Normal}
4. Wind: {Weak, Strong}
5. PlayTennis: {Yes, No} (target attribute)

The dataset is as follows:


Outlook Temperature Humidity Wind PlayTennis
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No

Using the ID3 algorithm, answer the following questions:


1. Entropy Calculation: Calculate the initial entropy of the dataset with respect to the target
attribute "PlayTennis".
2. Information Gain Calculation: Calculate the information gain for each attribute (Outlook,
Temperature, Humidity, and Wind) with respect to the target attribute.
3. Attribute Selection: Based on the information gain calculated, identify which attribute the
ID3 algorithm will choose as the root node for the decision tree.
4. Subsequent Steps: Describe the next steps the ID3 algorithm will take after selecting the
root node.
STEP 1: Entropy Calculation
To calculate the initial entropy of the dataset with respect to the target attribute "PlayTennis",
we need to find the probability of each class (Yes and No) and then apply the entropy formula.

Calculate the entropy of the full dataset:


Total instances = 14
Instances with "PlayTennis = Yes" = 9
Instances with "PlayTennis = No" = 5

Probability of "PlayTennis = Yes" = p(Yes) = 9/14 = 0.6429


Probability of "PlayTennis = No" = p(No) = 5/14 = 0.3571

Entropy(S) = -p(Yes) * log2(p(Yes)) - p(No) * log2(p(No))


= -(0.6429 * log2(0.6429)) - (0.3571 * log2(0.3571))
= -(0.6429 * (-0.6374)) - (0.3571 * (-1.4854))
= 0.4098 + 0.5305
= 0.9403
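
As a sanity check, the same calculation can be reproduced in a few lines of Python (a minimal sketch; the `entropy` helper and `labels` list are our own naming, not part of the assignment):

    import math

    # The 14 PlayTennis labels from the table above: 9 Yes, 5 No.
    labels = ["Yes"] * 9 + ["No"] * 5

    def entropy(labels):
        # Shannon entropy in bits; classes with zero count never appear,
        # so 0 * log2(0) (taken as 0 by convention) is never evaluated.
        n = len(labels)
        return -sum(
            (labels.count(c) / n) * math.log2(labels.count(c) / n)
            for c in set(labels)
        )

    print(round(entropy(labels), 4))  # prints 0.9403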

STEP 2: Information Gain Calculation

To calculate the information gain for each attribute, we split the data on that attribute's values, compute the entropy of each subset, and subtract the weighted average of those entropies from the entropy of the whole set:

Gain(S, A) = Entropy(S) - Σ (|Sv|/|S| * Entropy(Sv))

where the sum runs over each value v of attribute A and Sv is the subset of instances with A = v.

a) Information Gain for "Outlook"


➢ Entropy of sunny:
Outlook = Sunny
Instances with "PlayTennis = Yes" = 2
Instances with "PlayTennis = No" = 3
Entropy (Sunny) = -(2/5 * log2(2/5)) - (3/5 * log2(3/5)) = 0.9710

➢ Entropy of overcast:
Outlook = Overcast
Instances with "PlayTennis = Yes" = 4
Instances with "PlayTennis = No" = 0
Entropy (Overcast) = -(4/4 * log2(4/4)) - (0/4 * log2(0/4)) = 0 (with 0 * log2(0) taken as 0)

➢ Entropy of rain:
Outlook = Rain
Instances with "PlayTennis = Yes" = 3
Instances with "PlayTennis = No" = 2
Entropy (Rain) = -(3/5 * log2(3/5)) - (2/5 * log2(2/5)) = 0.9710

➢ Information gain of outlook


Information Gain (Outlook) = Entropy(S) - ((5/14 * Entropy(Sunny)) + (4/14 * Entropy(Overcast)) + (5/14 * Entropy(Rain)))
= 0.9403 - ((5/14 * 0.9710) + (4/14 * 0) + (5/14 * 0.9710))
= 0.9403 - (0.3468 + 0 + 0.3468)
= 0.9403 - 0.6936
= 0.2467
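
The same weighted-entropy computation generalizes to any attribute. Below is a short Python sketch (our own helper names; the `data` tuples transcribe the table above) that reproduces the Outlook figure and is reused for the remaining attributes:

    import math

    # The 14 rows as (Outlook, Temperature, Humidity, Wind, PlayTennis).
    data = [
        ("Sunny", "Hot", "High", "Weak", "No"),
        ("Sunny", "Hot", "High", "Strong", "No"),
        ("Overcast", "Hot", "High", "Weak", "Yes"),
        ("Rain", "Mild", "High", "Weak", "Yes"),
        ("Rain", "Cool", "Normal", "Weak", "Yes"),
        ("Rain", "Cool", "Normal", "Strong", "No"),
        ("Overcast", "Cool", "Normal", "Strong", "Yes"),
        ("Sunny", "Mild", "High", "Weak", "No"),
        ("Sunny", "Cool", "Normal", "Weak", "Yes"),
        ("Rain", "Mild", "Normal", "Weak", "Yes"),
        ("Sunny", "Mild", "Normal", "Strong", "Yes"),
        ("Overcast", "Mild", "High", "Strong", "Yes"),
        ("Overcast", "Hot", "Normal", "Weak", "Yes"),
        ("Rain", "Mild", "High", "Strong", "No"),
    ]
    ATTRS = {"Outlook": 0, "Temperature": 1, "Humidity": 2, "Wind": 3}

    def entropy(rows):
        # Entropy of the PlayTennis column (last field) of the given rows.
        n = len(rows)
        counts = {}
        for r in rows:
            counts[r[-1]] = counts.get(r[-1], 0) + 1
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    def info_gain(rows, attr):
        # Gain(S, A) = Entropy(S) - sum over values v of |Sv|/|S| * Entropy(Sv).
        i = ATTRS[attr]
        gain = entropy(rows)
        for v in set(r[i] for r in rows):
            subset = [r for r in rows if r[i] == v]
            gain -= len(subset) / len(rows) * entropy(subset)
        return gain

    print(round(info_gain(data, "Outlook"), 4))  # prints 0.2467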

b) Information Gain for "Temperature"


➢ Entropy of hot:
Temperature = Hot
Instances with "PlayTennis = Yes" = 2
Instances with "PlayTennis = No" = 2
Entropy (Hot) = -(2/4 * log2(2/4)) - (2/4 * log2(2/4)) = 1

➢ Entropy of mild:
Temperature = Mild
Instances with "PlayTennis = Yes" = 4
Instances with "PlayTennis = No" = 2
Entropy (Mild) = -(4/6 * log2(4/6)) - (2/6 * log2(2/6)) = 0.9183

➢ Entropy of cool:
Temperature = Cool
Instances with "PlayTennis = Yes" = 3
Instances with "PlayTennis = No" = 1
Entropy (Cool) = -(3/4 * log2(3/4)) - (1/4 * log2(1/4)) = 0.8113
➢ Information gain of temperature:
Information Gain (Temperature) = Entropy(S) - ((4/14 * Entropy(Hot)) + (6/14 *
Entropy(Mild)) + (4/14 * Entropy(Cool)))
= 0.9403 - ((4/14 * 1) + (6/14 * 0.9183) + (4/14 * 0.8113))
= 0.9403 - (0.2857 + 0.3936 + 0.2318)
= 0.9403 - 0.9111
= 0.0292

c) Information Gain for "Humidity"


➢ Entropy of high:
Humidity = High
Instances with "PlayTennis = Yes" = 3
Instances with "PlayTennis = No" = 4
Entropy (High) = -(3/7 * log2(3/7)) - (4/7 * log2(4/7)) = 0.9852

➢ Entropy of normal:
Humidity = Normal
Instances with "PlayTennis = Yes" = 6
Instances with "PlayTennis = No" = 1
Entropy (Normal) = -(6/7 * log2(6/7)) - (1/7 * log2(1/7)) = 0.5917

➢ Information gain of humidity:


Information Gain(Humidity) = Entropy(S) - ((7/14 * Entropy(High)) + (7/14 *
Entropy(Normal)))
= 0.9403 - ((7/14 * 0.9852) + (7/14 * 0.5917))
= 0.9403 - (0.4926 + 0.2959)
= 0.9403 - 0.7885
= 0.1518

d) Information Gain for "Wind"


➢ Entropy of weak:
Wind = Weak
Instances with "PlayTennis = Yes" = 6
Instances with "PlayTennis = No" = 2
Entropy (Weak) = -(6/8 * log2(6/8)) - (2/8 * log2(2/8)) = 0.8113

➢ Entropy of strong:
Wind = Strong
Instances with "PlayTennis = Yes" = 3
Instances with "PlayTennis = No" = 3
Entropy (Strong) = -(3/6 * log2(3/6)) - (3/6 * log2(3/6)) = 1

➢ Information gain of wind:


Information Gain (Wind) = Entropy(S) - ((8/14 * Entropy (Weak)) + (6/14 *
Entropy(Strong)))

= 0.9403 - ((8/14 * 0.8113) + (6/14 * 1))


= 0.9403 - (0.4636 + 0.4286)
= 0.9403 - 0.8922
= 0.0481
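
With the `info_gain` helper from the sketch above, all four hand-computed gains can be checked in one loop (a usage sketch, reusing `data` from the earlier block):

    for attr in ("Outlook", "Temperature", "Humidity", "Wind"):
        print(attr, round(info_gain(data, attr), 4))
    # Outlook 0.2467, Temperature 0.0292, Humidity 0.1518, Wind 0.0481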

STEP 3: Attribute Selection

Based on the information gain calculations, the attribute with the highest information gain is "Outlook" (0.2467, versus 0.0292 for Temperature, 0.1518 for Humidity, and 0.0481 for Wind). Therefore, the ID3 algorithm will choose "Outlook" as the root node of the decision tree.

STEP 4: Subsequent Steps

After selecting "Outlook" as the root node, the ID3 algorithm will create branches for each
possible value of "Outlook" (Sunny, Overcast, and Rain). Then, for each branch, the algorithm
will recursively construct the decision tree by selecting the next attribute with the highest
information gain on the subset of instances corresponding to that branch.

The process continues until all instances in a branch belong to the same class (i.e., the entropy is zero) or there are no remaining attributes to split on. In the latter case, the algorithm assigns the majority class of that subset as the leaf node. The same procedure is applied recursively to every branch until all instances are classified.
