Decision Tree ID3 Problem
Example 4
Let's understand this with the help of an example. Consider a piece of data collected over the course of 14 days, where the features are Outlook, Temperature, Humidity, Wind, and the outcome variable is whether Golf was played on the day. Now, our job is to build a predictive model which takes in the above 4 parameters and predicts whether Golf will be played on the day. We'll build a decision tree to do that using the ID3 algorithm (ID3, which stands for Iterative Dichotomiser 3, is a classification algorithm that follows a greedy approach of building a decision tree by selecting, at each step, the attribute that yields maximum Information Gain (IG), or equivalently minimum Entropy (H)).
Day | Outlook  | Temperature | Humidity | Wind   | Play Golf
D1  | Sunny    | Hot         | High     | Weak   | No
D2  | Sunny    | Hot         | High     | Strong | No
D3  | Overcast | Hot         | High     | Weak   | Yes
D4  | Rain     | Mild        | High     | Weak   | Yes
D5  | Rain     | Cool        | Normal   | Weak   | Yes
D6  | Rain     | Cool        | Normal   | Strong | No
D7  | Overcast | Cool        | Normal   | Strong | Yes
D8  | Sunny    | Mild        | High     | Weak   | No
D9  | Sunny    | Cool        | Normal   | Weak   | Yes
D10 | Rain     | Mild        | Normal   | Weak   | Yes
D11 | Sunny    | Mild        | Normal   | Strong | Yes
D12 | Overcast | Mild        | High     | Strong | Yes
D13 | Overcast | Hot         | Normal   | Weak   | Yes
D14 | Rain     | Mild        | High     | Strong | No
The ID3 algorithm will perform the following tasks recursively (a minimal sketch of these steps in code is given after the list):
1. Create a root node for the tree.
2. If all examples are positive, return the leaf node 'positive'.
3. Else, if all examples are negative, return the leaf node 'negative'.
4. Calculate the entropy of the current state, H(S).
5. For each attribute x, calculate the entropy with respect to that attribute, denoted by H(S, x).
6. Select the attribute which has the maximum value of IG(S, x), where the Information Gain is IG(S, x) = H(S) - H(S, x).
7. Remove the attribute that offers the highest IG from the set of attributes.
8. Repeat until we run out of attributes, or the decision tree has only leaf nodes.
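The steps above can be written as a short recursive procedure. The following is a minimal Python sketch under the usual ID3 conventions; the names (entropy, information_gain, id3, target) and the list-of-dicts representation of the examples are illustrative choices, not from the text, and edge cases such as unseen attribute values are ignored.

import math
from collections import Counter

def entropy(examples, target):
    """H(S): entropy of the target label over a list of example dicts."""
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target):
    """IG(S, x) = H(S) - sum over values v of |S_v|/|S| * H(S_v)."""
    total = len(examples)
    remainder = 0.0
    for value in set(ex[attribute] for ex in examples):
        subset = [ex for ex in examples if ex[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(examples, attributes, target):
    """Recursively grow a decision tree represented as nested dicts."""
    labels = [ex[target] for ex in examples]
    # Steps 2-3: all examples share one label -> return a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # Step 8: no attributes left -> return the majority label as a leaf.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Steps 4-6: pick the attribute with maximum information gain.
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    remaining = [a for a in attributes if a != best]  # step 7
    tree = {best: {}}
    for value in set(ex[best] for ex in examples):
        subset = [ex for ex in examples if ex[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree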
Now, let's go ahead and grow the decision tree. The initial step is to calculate H(S), the Entropy of the current state. In the above example, we can see that there are 9 Yes's and 5 No's.

Yes | No | Total
9   | 5  | 14
Entropy(S) = \sum_{x \in X} -p(x)\,\log_2 p(x)

Entropy(S) = -\left(\frac{9}{14}\right)\log_2\left(\frac{9}{14}\right) - \left(\frac{5}{14}\right)\log_2\left(\frac{5}{14}\right) = 0.94
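As a quick sanity check, the same number can be reproduced in a couple of lines of Python (the variable names here are just illustrative):

import math

# Entropy of the full data set: 9 "Yes" and 5 "No" out of 14 examples.
p_yes, p_no = 9 / 14, 5 / 14
H_S = -p_yes * math.log2(p_yes) - p_no * math.log2(p_no)
print(round(H_S, 2))  # 0.94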
Remember that the Entropy is 0 if all members belong to the same class, and 1 when half of them belong to one class and the other half belong to the other class, which is perfect randomness. Here it's 0.94, which means the distribution is fairly random. Now, the next step is to choose the attribute that gives us the highest Information Gain; that attribute becomes the root node. Let's start by calculating IG(S, Wind) for the Wind attribute.
IG(S, Wind) = H(S) - P(S_{weak}) \times H(S_{weak}) - P(S_{strong}) \times H(S_{strong})

H(S) = 0.94, which we had already calculated in the previous step. Out of all the 14 examples, we have 8 places where the wind is weak and 6 where it is strong.

Wind = Weak | Wind = Strong | Total
8           | 6             | 14

P(S_{weak}) = Number of weak examples / Total = 8/14
P(S_{strong}) = Number of strong examples / Total = 6/14

Now, out of the 8 Weak examples, 6 of them were 'Yes' for Play Golf and 2 of them were 'No'.

Entropy(S_{weak}) = -\left(\frac{6}{8}\right)\log_2\left(\frac{6}{8}\right) - \left(\frac{2}{8}\right)\log_2\left(\frac{2}{8}\right) = 0.811

Similarly, out of the 6 Strong examples, 3 of them were 'Yes' for Play Golf and 3 of them were 'No'.

Entropy(S_{strong}) = -\left(\frac{3}{6}\right)\log_2\left(\frac{3}{6}\right) - \left(\frac{3}{6}\right)\log_2\left(\frac{3}{6}\right) = 1

Remember, here half of the examples are 'Yes' and the other half are 'No', which is perfect randomness.

IG(S, Wind) = H(S) - P(S_{weak}) \times H(S_{weak}) - P(S_{strong}) \times H(S_{strong})
            = 0.94 - \left(\frac{8}{14}\right)(0.811) - \left(\frac{6}{14}\right)(1) = 0.048
This tells us that considering 'Wind' as the splitting attribute gives us an Information Gain of 0.048. Now we must similarly calculate the Information Gain for all the features:
IG(S, Outlook) = 0.246
IG(S, Temperature) = 0.029
IG(S, Humidity) = 0.151
IG(S, Wind) = 0.048 (previous example)
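These gains can all be checked the same way; for example, the following snippet reproduces IG(S, Wind) from the class counts in each Wind partition (the two-class entropy helper and its name are just an illustrative choice):

import math

def entropy(pos, neg):
    """Entropy of a two-class split with pos positive and neg negative examples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

H_S = entropy(9, 5)       # full set: 9 Yes, 5 No      -> 0.940
H_weak = entropy(6, 2)    # Wind = Weak: 6 Yes, 2 No   -> 0.811
H_strong = entropy(3, 3)  # Wind = Strong: 3 Yes, 3 No -> 1.000
IG_wind = H_S - (8 / 14) * H_weak - (6 / 14) * H_strong
print(round(IG_wind, 3))  # 0.048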
We can clearly see that IG(S, Outlook) has the highest information gain of 0.246, hence we choose the Outlook attribute as the root node. At this point, the decision tree looks like:
[Fig.: Decision tree with Outlook as the root node and branches Sunny, Overcast, and Rain]
Here we observe that whenever the Outlook is Overcast, Play Golf is always 'Yes'. This is no coincidence: the simple tree resulted because the highest information gain is given by the attribute Outlook. Now we simply apply recursion; you might want to look back at the algorithm steps described earlier. Since we've used Outlook, we have three attributes remaining: Humidity, Temperature, and Wind. And we had three possible values of Outlook: Sunny, Overcast, and Rain. The Overcast node already ended up as a leaf node 'Yes', so we're left with two subtrees to compute: Sunny and Rain. The next step is to compute H(S_{sunny}).
The table where the value of Outlook is Sunny looks like:

Temperature | Humidity | Wind   | Play Golf
Hot         | High     | Weak   | No
Hot         | High     | Strong | No
Mild        | High     | Weak   | No
Cool        | Normal   | Weak   | Yes
Mild        | Normal   | Strong | Yes
Entropy(S_{sunny}) = -\left(\frac{2}{5}\right)\log_2\left(\frac{2}{5}\right) - \left(\frac{3}{5}\right)\log_2\left(\frac{3}{5}\right) = 0.97
In a similar fashion, we compute the following values:

IG(S_{sunny}, Humidity) = 0.97
IG(S_{sunny}, Temperature) = 0.57

We can see that the highest Information Gain is given by Humidity, so the Sunny branch is grown with Humidity as its splitting attribute.
[Fig.: Decision tree after the Sunny branch is split on Humidity: Outlook at the root; Sunny → Humidity (High → No, Normal → Yes); Overcast → Yes; the Rain subtree is grown analogously]
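To make predictions with the grown tree, one simply walks it from the root, following the branch that matches each attribute value of the new example. Below is a minimal sketch, assuming the tree is represented as nested dicts; the Rain subtree is left as a placeholder since the worked example above only resolves the Sunny branch, and the function name classify is illustrative.

# Tree as grown so far: Overcast is a 'Yes' leaf and Sunny splits on Humidity.
tree = {
    "Outlook": {
        "Overcast": "Yes",
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Rain": None,  # placeholder: grow this subtree the same way as Sunny
    }
}

def classify(tree, example):
    """Walk the nested-dict tree until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))                # attribute tested at this node
        tree = tree[attribute][example[attribute]]  # follow the matching branch
    return tree

print(classify(tree, {"Outlook": "Sunny", "Humidity": "Normal"}))  # Yes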