Data Mining

A decision tree uses attributes to split a node into subsets, with each leaf representing a class. It works by testing attribute values at each node to determine the path to a leaf class. The goal is to find the attribute that best splits the data to reduce impurity and diversity between child nodes.


Decision Tree

A decision tree works much like the game of twenty questions: each internal node of the tree denotes a question, or a test on the value of an independent attribute; each branch represents an outcome of the test; and each leaf represents a class.

Assume that each object has a number of independent attributes and one dependent attribute (the class).
Example (From Han and Kamber)
age     income   student  credit_rating  buys_computer
<=30    high     no       fair           no
<=30    high     no       excellent      no
31…40   high     no       fair           yes
>40     medium   no       fair           yes
>40     low      yes      fair           yes
>40     low      yes      excellent      no
31…40   low      yes      excellent      yes
<=30    medium   no       fair           no
<=30    low      yes      fair           yes
>40     medium   yes      fair           yes
<=30    medium   yes      excellent      yes
31…40   medium   no       excellent      yes
31…40   high     yes      fair           yes
>40     medium   no       excellent      no
A Decision Tree (From Han and Kamber)

age?
  <=30   -> student?
              no  -> buys_computer = no
              yes -> buys_computer = yes
  31..40 -> buys_computer = yes
  >40    -> credit_rating?
              excellent -> buys_computer = no
              fair      -> buys_computer = yes
Decision Tree

To classify an object, the appropriate attribute value is tested at each node, starting from the root, to determine the branch taken. The path defined by these tests leads to a leaf node, which is the class the model believes the object belongs to.

Decision trees are an attractive technique since the results are easy to understand. The rules can often be expressed in natural language, e.g. if the student has GPA > 3.0 and class attendance > 90% then the student is likely to get a D.
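As an illustration (not from the slides), a tree like the Han and Kamber example above can be stored as nested dictionaries and an object classified by following one branch per test; the representation below is an assumption, with attribute names taken from that example.

```python
# A minimal sketch (not from the slides): the Han and Kamber tree above as
# nested dictionaries, and a classifier that walks from the root to a leaf
# by testing one attribute value at each internal node.

tree = {
    "attribute": "age",
    "branches": {
        "<=30": {
            "attribute": "student",
            "branches": {"no": "no", "yes": "yes"},
        },
        "31..40": "yes",
        ">40": {
            "attribute": "credit_rating",
            "branches": {"excellent": "no", "fair": "yes"},
        },
    },
}

def classify(node, obj):
    """Follow the branch chosen by each attribute test until a leaf (class label) is reached."""
    while isinstance(node, dict):
        node = node["branches"][obj[node["attribute"]]]
    return node

# Example object: a young student with fair credit.
print(classify(tree, {"age": "<=30", "student": "yes", "credit_rating": "fair"}))  # -> "yes"
```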
Basic Algorithm
1. The training data is S. Discretise all continuous-valued attributes. Let the root node contain S.
2. If all the objects in the root node belong to the same class, then stop.
3. Split the next leaf node by selecting the attribute A, from amongst the independent attributes, that best divides or splits the objects in the node into subsets, and create a decision tree node for it.
4. Split the node according to the values of A.
5. Stop if either of the following conditions is met, otherwise continue with step 3:
   -- the data in each subset belongs to a single class;
   -- there are no remaining attributes on which the sample may be further divided.
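The following Python sketch (an illustration, not code from the slides) mirrors the recursion described above. The splitting measure is left as a parameter, since the choice of criterion (information or the Gini index) is only discussed later in the deck.

```python
# A minimal sketch (not from the slides) of the recursive procedure above.
# `score(groups, target)` is any impurity-style measure of a candidate split,
# where `groups` maps each attribute value to the objects in that subset;
# a lower score means a better (purer) split.

from collections import Counter

def build_tree(objects, attributes, target, score):
    """Recursively build a decision tree from a list of dict-like objects."""
    classes = [obj[target] for obj in objects]
    # Stop: all objects in this node belong to a single class.
    if len(set(classes)) == 1:
        return classes[0]
    # Stop: no remaining attributes to split on; label with the majority class.
    if not attributes:
        return Counter(classes).most_common(1)[0][0]

    def partition(attr):
        groups = {}
        for obj in objects:
            groups.setdefault(obj[attr], []).append(obj)
        return groups

    # Choose the attribute whose split scores best (lowest impurity/diversity).
    best = min(attributes, key=lambda a: score(partition(a), target))
    remaining = [a for a in attributes if a != best]
    return {
        "attribute": best,
        "branches": {
            value: build_tree(subset, remaining, target, score)
            for value, subset in partition(best).items()
        },
    }
```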
Building A Decision Tree

The aim is to build a decision tree consisting of a root node, a number of internal nodes, and a number of leaf nodes. Building the tree starts with the root node, then splits the data into two or more children nodes, and continues splitting those at lower-level nodes, and so on until the process is complete.

The method uses induction based on the training data. We illustrate it using a simple example.
An Example (from the text)
Sore Throat  Fever  Swollen Glands  Congestion  Headache  Diagnosis
Yes          Yes    Yes             Yes         Yes       Strep throat
No           No     No              Yes         Yes       Allergy
Yes          Yes    No              Yes         No        Cold
Yes          No     Yes             No          No        Strep throat
No           Yes    No              Yes         No        Cold
No           No     No              Yes         No        Allergy
No           No     Yes             No          No        Strep throat
Yes          No     No              Yes         Yes       Allergy
No           Yes    No              Yes         Yes       Cold
Yes          Yes    No              Yes         Yes       Cold

• The first five attributes are symptoms and the last attribute is the diagnosis. All attributes are categorical.
• We wish to predict the diagnosis class.
An Example (from the text)
Consider each of the attributes in turn to see which would be a “good” one to start with.
Sore Throat   Diagnosis
No Allergy
No Cold
No Allergy
No Strep throat
No Cold
Yes Strep throat
Yes Cold
Yes Strep throat
Yes Allergy
Yes Cold

• Sore throat does not predict diagnosis.

An Example (from the text)
Is the symptom fever any better?

Fever Diagnosis
No Allergy
No Strep throat
No Allergy
No Strep throat
No Allergy
Yes Strep throat
Yes Cold
Yes Cold
Yes Cold
Yes Cold

Fever is better but not perfect.

An Example (from the text)
Try swollen glands

Swollen Glands   Diagnosis
No Allergy
No Cold
No Cold
No Allergy
No Allergy
No Cold
No Cold
Yes Strep throat
Yes Strep throat
Yes Strep throat

Good. Swollen glands = yes means Strep Throat.


An Example (from the text)
Try congestion

Congestion Diagnosis
No Strep throat
No Strep throat
Yes Allergy
Yes Cold
Yes Cold
Yes Allergy
Yes Allergy
Yes Cold
Yes Cold
Yes Strep throat

Not helpful.
An Example (from the text)

Try the symptom headache

Headache Diagnosis
No Cold
No Cold
No Allergy
No Strep throat
No Strep throat
Yes Allergy
Yes Allergy
Yes Cold
Yes Cold
Yes Strep throat

Not helpful.
An Example
This manual approach does not work if there are many attributes and a large training set. We need an algorithm that selects, as the split attribute, the attribute that best discriminates among the target classes.

How do we find the attribute that is most influential in determining the dependent attribute?

The tree continues to grow until it is no longer possible to find better ways to split the objects.

Finding the Split

One approach involves measuring the data’s diversity (or uncertainty) and choosing the split attribute that minimises diversity amongst the children nodes, i.e. that maximises:

diversity(before split) – diversity(left child) – diversity(right child)

We discuss two approaches. One is based on information theory and the other on the work of Gini, who devised a measure of the level of income inequality in a country.
Finding the Split

Since our aim is to find nodes whose objects all belong to the same class (called pure nodes), the term impurity is sometimes used to measure how far a node is from being pure.

The aim of the split then is to reduce impurity, i.e. to maximise:

impurity(before split) – impurity(left child) – impurity(right child)

Impurity is just a different term for the same idea. Information theory or the Gini index may be used to find the split attribute that reduces impurity by the largest amount.
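The worked example that follows uses the information measure; the Gini measure is only mentioned, not defined, in these slides. As a hedged aside, here is a minimal sketch assuming the commonly used Gini impurity (1 minus the sum of squared class proportions):

```python
# A minimal sketch (assumption: the standard Gini impurity 1 - sum(p_i^2),
# which the slides mention but do not define).

from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 0 for a pure node."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def gini_split(groups):
    """Weighted Gini impurity of the children produced by a candidate split.
    `groups` maps each attribute value to the list of class labels in that child."""
    total = sum(len(labels) for labels in groups.values())
    return sum(len(labels) / total * gini(labels) for labels in groups.values())

# Swollen glands from the example: yes -> 3 strep throat; no -> 4 cold, 3 allergy.
print(gini_split({
    "yes": ["strep"] * 3,
    "no": ["cold"] * 4 + ["allergy"] * 3,
}))  # about 0.34
```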
Information Theory
Suppose there is a variable s that can take either value x or value y.
• If s is always going to be x then there is no information and there is no uncertainty.
• What about p(x) = 0.9 and p(y) = 0.1?
• What about p(x) = 0.5 and p(y) = 0.5?

The measure of information is

I = – sum (pi log(pi))

Information
Information is defined as –pi*log(pi), where pi is the probability of some event.

pi is at most 1, so log(pi) is never positive and –pi*log(pi) is never negative.

Note that the log of 1 is zero, the log of any number greater than 1 is positive, and the log of any positive number smaller than 1 is negative.

Information Theory
For an event with two equally likely outcomes, such as a fair coin toss:

I = 2*(– 0.5 log(0.5))

This comes out to 1.0 and is the maximum information for an event with two possible values. It is also called entropy: a measure of the minimum number of bits required to encode the information.

Consider a die with six equally likely outcomes. The information is:

I = 6 * (– (1/6) log(1/6)) = 2.585

Therefore three bits are required to represent the outcome of rolling the die.
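A short Python sketch (not from the slides) of this information/entropy measure, checked against the coin and die values quoted above:

```python
# A minimal sketch (not from the slides) of the information / entropy measure
# I = -sum(p_i * log2(p_i)) discussed above.

import math

def information(probabilities):
    """Entropy in bits of a distribution given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(information([0.5, 0.5]))    # fair coin: 1.0 bit
print(information([1 / 6] * 6))   # fair die: about 2.585 bits
print(information([0.9, 0.1]))    # biased coin: about 0.469 bits (less uncertainty)
```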

Information Theory
Why is the information lower if a toss is more likely to give a head than a tail?

If a loaded die had a much greater chance of showing a 6, say 50% or even 75%, does the roll of the die have less or more information?

The information is:

50%: I = 5 * (– (0.1) log(0.1)) – 0.5*log(0.5)
75%: I = 5 * (– (0.05) log(0.05)) – 0.75*log(0.75)

How many bits are required to represent the outcome of rolling the loaded die?
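Evaluating the two expressions numerically (with the entropy sketch from the previous slide repeated here so this snippet runs on its own) gives, approximately:

```python
import math

def information(probabilities):
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Loaded die with a 50% chance of a 6 (other faces 0.1 each) and
# a 75% chance of a 6 (other faces 0.05 each).
print(information([0.5] + [0.1] * 5))    # about 2.16 bits (vs 2.585 for a fair die)
print(information([0.75] + [0.05] * 5))  # about 1.39 bits
```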
Information Gain
• Select the attribute with the highest information gain
• Assume the training data S has two classes, P and N
– Let S contain a total of s objects, p of class P and n of
class N (so p + n = s)
– The amount of information in S, given the two classes P and N, is

  I(p, n) = –(p/s) log2(p/s) – (n/s) log2(n/s)

Information Gain
• Assume that using an attribute A the set S is
partitioned into {S1, S2 , …, Sv}
– If Si contains pi examples of P and ni examples of N, the entropy, or the expected information needed to classify objects in all the subtrees Si, is

  E(A) = sum over i = 1..v of ((pi + ni) / (p + n)) * I(pi, ni)

– The encoding information that would be gained by branching on A is

  Gain(A) = I(p, n) – E(A)
Back to the Example
There are 10 (s = 10) samples and three classes.
Strep throat = t = 3
Cold = c = 4
Allergy = a = 3

Information = –(3/10) log(3/10) – (4/10) log(4/10) – (3/10) log(3/10) = 1.57

Let us now consider using each of the symptoms to split the sample.

Example
Sore Throat
Yes has t = 2, c = 2, a =1, total 5
No has t = 1, c = 2, a = 2, total 5
I(y) = 2*(–2/5 log(2/5)) – (1/5 log(1/5)) = 1.52
I(n) = 2*(–2/5 log(2/5)) – (1/5 log(1/5)) = 1.52
Information = 0.5*1.52 + 0.5*1.52 = 1.52

Fever
Yes has t = 1, c = 4, a =0, total 5
No has t = 2, c = 0, a = 3, total 5
I(y) = –(1/5) log(1/5) – (4/5) log(4/5) = 0.72
I(n) = –(2/5) log(2/5) – (3/5) log(3/5) = 0.97
Information = 0.5*I(y) + 0.5*I(n) = 0.846

Example
Swollen Glands
Yes has t = 3, c = 0, a =0, total 3
No has t = 0, c = 4, a = 3, total 7
I(y) = 0
I(n) = –(4/7) log(4/7) – (3/7) log(3/7) = 0.985
Information = 0.3*I(y) + 0.7*I(n) = 0.69

Congestion
Yes has t = 1, c = 4, a = 3, total 8
No has t = 2, c = 0, a = 0, total 2
I(y) = –(1/8) log(1/8) – (4/8) log(4/8) – (3/8) log(3/8) = 1.41
I(n) = 0
Information = 0.8*I(y) + 0.2*I(n) = 1.12

Example
Headache
Yes has t = 1, c = 2, a = 2, total 5
No has t = 2, c = 2, a = 1, total 5
I(y) = 2*(–2/5 log(2/5)) – (1/5 log(1/5)) = 1.52
I(n) = 2*(–2/5 log(2/5)) – (1/5 log(1/5)) = 1.52
Information = 0.5*1.52 + 0.5*1.52 = 1.52

So the values for information are:

Sore Throat     1.52
Fever           0.85
Swollen Glands  0.69
Congestion      1.12
Headache        1.52

Swollen Glands gives the smallest value, and hence the largest information gain, so it is chosen as the split attribute at the root. A sketch that recomputes these values follows.
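As a check (an illustration, not part of the slides), the following Python snippet recomputes the weighted information of each candidate split from the ten training samples; attribute and class names are taken from the example table.

```python
# A minimal sketch (not from the slides) that recomputes the weighted
# information (expected entropy) of each candidate split attribute for the
# ten training samples in the example.

import math
from collections import Counter

# (sore_throat, fever, swollen_glands, congestion, headache, diagnosis)
samples = [
    ("yes", "yes", "yes", "yes", "yes", "strep throat"),
    ("no",  "no",  "no",  "yes", "yes", "allergy"),
    ("yes", "yes", "no",  "yes", "no",  "cold"),
    ("yes", "no",  "yes", "no",  "no",  "strep throat"),
    ("no",  "yes", "no",  "yes", "no",  "cold"),
    ("no",  "no",  "no",  "yes", "no",  "allergy"),
    ("no",  "no",  "yes", "no",  "no",  "strep throat"),
    ("yes", "no",  "no",  "yes", "yes", "allergy"),
    ("no",  "yes", "no",  "yes", "yes", "cold"),
    ("yes", "yes", "no",  "yes", "yes", "cold"),
]
attributes = ["sore throat", "fever", "swollen glands", "congestion", "headache"]

def entropy(labels):
    """I = -sum(p_i log2 p_i) over the class proportions in `labels`."""
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def split_information(index):
    """Weighted entropy of the children produced by splitting on attribute `index`."""
    groups = {}
    for row in samples:
        groups.setdefault(row[index], []).append(row[-1])
    total = len(samples)
    return sum(len(labels) / total * entropy(labels) for labels in groups.values())

root = entropy([row[-1] for row in samples])
print(f"information before split: {root:.2f}")  # about 1.57
for i, name in enumerate(attributes):
    info = split_information(i)
    print(f"{name:15s} info = {info:.2f}  gain = {root - info:.2f}")
# Expected: sore throat 1.52, fever 0.85, swollen glands 0.69,
# congestion 1.12, headache 1.52 (so swollen glands has the largest gain).
```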

Decision Tree
Continuing the process one more step finds Fever as the next split attribute, giving the final result:

Swollen Glands?
  Yes -> Diagnosis = Strep Throat
  No  -> Fever?
           Yes -> Diagnosis = Cold
           No  -> Diagnosis = Allergy
Information Gain
Assume that using an attribute A the set S is
partitioned into {S1, S2 , …, Sv}
– If Si contains pi samples of class P and ni of class N, the entropy, or the expected information needed to classify objects in all the subtrees Si, is

  E(A) = sum over i = 1..v of ((pi + ni) / (p + n)) * I(pi, ni)

– The information gain by branching on A is

  Gain(A) = I(p, n) – E(A)
