
Decision Tree

https://www.saedsayad.com/decision_tree.htm

1
Classification and Prediction
 Classification is the process of finding a model (or function) that describes and distinguishes data classes or concepts.
 The model is derived from the analysis of a set of training data (i.e., data objects whose class labels are known).
 The model is then used to predict the class label of objects for which the class label is unknown.

2
Decision Tree Induction
 A decision tree is a flowchart-like tree structure in which each internal node denotes a test on an attribute value, each branch represents an outcome of the test, and each leaf represents a class or a class distribution.
 At each node, the algorithm chooses the “best” attribute for partitioning the data into individual classes.
 Constructing a decision tree classifier requires no domain knowledge or parameter setting, which makes decision trees appropriate for exploratory knowledge discovery.
 Decision trees can easily be converted to classification rules.
 Decision trees can handle multidimensional data.

4
Decision Tree Induction: Training Dataset

This dataset follows an example from Quinlan’s ID3 (the “playing tennis” example).

age     income   student  credit_rating  buys_computer
<=30    high     no       fair           no
<=30    high     no       excellent      no
31…40   high     no       fair           yes
>40     medium   no       fair           yes
>40     low      yes      fair           yes
>40     low      yes      excellent      no
31…40   low      yes      excellent      yes
<=30    medium   no       fair           no
<=30    low      yes      fair           yes
>40     medium   yes      fair           yes
<=30    medium   yes      excellent      yes
31…40   medium   no       excellent      yes
31…40   high     yes      fair           yes
>40     medium   no       excellent      no
5
Output: A Decision Tree for “buys_computer”

The induced tree tests age at the root and branches on its three values:

age?
  <=30   -> student?
              no  -> no
              yes -> yes
  31..40 -> yes
  >40    -> credit_rating?
              excellent -> no
              fair      -> yes
6
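
As noted on the previous slide, a tree like this converts directly to classification rules: each root-to-leaf path is one rule. A minimal Python sketch of the tree above (the function name classify is illustrative, not from the slides):

```python
def classify(age, student, credit_rating):
    """Apply the buys_computer tree above; each branch is one rule."""
    if age == "<=30":
        # IF age <= 30 AND student = yes THEN buys_computer = yes
        return "yes" if student == "yes" else "no"
    if age == "31..40":
        # IF age in 31..40 THEN buys_computer = yes
        return "yes"
    # age > 40: the decision depends on credit rating
    # IF age > 40 AND credit_rating = fair THEN buys_computer = yes
    return "yes" if credit_rating == "fair" else "no"

print(classify("<=30", "yes", "fair"))  # -> yes
```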
Algorithm for Decision Tree Induction
 Basic algorithm (a greedy algorithm):
– The tree is constructed in a top-down, recursive, divide-and-conquer manner.
– At the start, all the training examples are at the root.
– Attributes are categorical (continuous-valued attributes are discretized in advance).
– Examples are partitioned recursively based on the selected attributes.
– Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain), as sketched in the code below.

7
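
A minimal, self-contained Python sketch of this greedy recursion, assuming each example is a (feature_dict, label) pair; the structure follows the bullet points above rather than Quinlan’s exact pseudocode:

```python
import math
from collections import Counter

def entropy(labels):
    """Expected information needed to classify a tuple (see the next slides)."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def build_tree(rows, attrs):
    """Top-down recursive divide-and-conquer induction on categorical data."""
    labels = [y for _, y in rows]
    # Stop when the partition is pure or no test attributes remain;
    # the leaf then predicts the majority class of the partition.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]

    def weighted_entropy(a):
        # Entropy after splitting on a: maximizing information gain
        # is the same as minimizing this quantity.
        parts = [[y for x, y in rows if x[a] == v]
                 for v in {x[a] for x, _ in rows}]
        return sum(len(p) / len(rows) * entropy(p) for p in parts)

    best = min(attrs, key=weighted_entropy)
    rest = [a for a in attrs if a != best]
    return {best: {v: build_tree([(x, y) for x, y in rows if x[best] == v], rest)
                   for v in {x[best] for x, _ in rows}}}
```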
Attribute Selection Measure for ID3: Information Gain

 Select the attribute with the highest information gain.
 Let p_i be the probability that an arbitrary tuple in D belongs to class C_i.
 Expected information (entropy) needed to classify a tuple in D:

   Info(D) = -Σ_{i=1..m} p_i log2(p_i)

 Information needed (after using A to split D into v partitions) to classify D:

   Info_A(D) = Σ_{j=1..v} (|D_j| / |D|) × Info(D_j)

 Information gained by branching on attribute A:

   Gain(A) = Info(D) - Info_A(D)
9
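
These formulas translate almost line for line into Python. A minimal sketch, assuming a partition is represented simply as a list of class labels and a split as a list of such lists:

```python
import math
from collections import Counter

def info(labels):
    """Info(D) = -sum_i p_i * log2(p_i), the entropy of a label list."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_after_split(partitions):
    """Info_A(D): weighted entropy after splitting D into D_1 .. D_v."""
    n = sum(len(p) for p in partitions)
    return sum(len(p) / n * info(p) for p in partitions)

def gain(labels, partitions):
    """Gain(A) = Info(D) - Info_A(D)."""
    return info(labels) - info_after_split(partitions)
```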
Attribute Selection: Information Gain

 Class P: buys_computer = “yes” (9 tuples); Class N: buys_computer = “no” (5 tuples).

   Info(D) = I(9,5) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940

 Splitting on age partitions D as follows:

   age     p_i  n_i  I(p_i, n_i)
   <=30    2    3    0.971
   31…40   4    0    0
   >40     3    2    0.971

   Info_age(D) = (5/14) I(2,3) + (4/14) I(4,0) + (5/14) I(3,2) = 0.694

 Here (5/14) I(2,3) means that “age <=30” covers 5 of the 14 samples, with 2 yes’es and 3 no’s. Hence

   Gain(age) = Info(D) - Info_age(D) = 0.246

 Similarly (from the training dataset on slide 5):

   Gain(income) = 0.029
   Gain(student) = 0.151
   Gain(credit_rating) = 0.048

10
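
The slide’s numbers for age can be reproduced with the info, info_after_split, and gain helpers sketched after the previous slide (the variable names here are illustrative):

```python
# Class labels of the 14 training tuples, grouped by the age attribute
by_age = {
    "<=30":  ["no", "no", "no", "yes", "yes"],   # 2 yes, 3 no
    "31…40": ["yes", "yes", "yes", "yes"],       # 4 yes, 0 no
    ">40":   ["yes", "yes", "no", "yes", "no"],  # 3 yes, 2 no
}
labels = [l for part in by_age.values() for l in part]  # 9 yes, 5 no

i_d = info(labels)
i_age = info_after_split(list(by_age.values()))
print(f"Info(D)      = {i_d:.3f}")    # 0.940
print(f"Info_age(D)  = {i_age:.3f}")  # 0.694
# 0.246 as on the slide (it subtracts the rounded values; exact gain ~ 0.247)
print(f"Gain(age)    = {round(i_d, 3) - round(i_age, 3):.3f}")
```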
Gain Ratio for Attribute Selection (C4.5)

• The information gain measure is biased towards attributes with a large number of values.
• C4.5 (a successor of ID3) uses the gain ratio to overcome this problem:

   SplitInfo_A(D) = -Σ_{j=1..v} (|D_j| / |D|) × log2(|D_j| / |D|)

– GainRatio(A) = Gain(A) / SplitInfo_A(D)
• Ex.: income splits the 14 tuples into partitions of sizes 4 (high), 6 (medium), and 4 (low), so

   SplitInfo_income(D) = -(4/14) log2(4/14) - (6/14) log2(6/14) - (4/14) log2(4/14) = 1.557

– GainRatio(income) = 0.029 / 1.557 = 0.019
• The attribute with the maximum gain ratio is selected as the splitting attribute.
11
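
A minimal sketch of the split information computation, verifying the numbers above (partition sizes 4, 6, 4 for income = high, medium, low):

```python
import math

def split_info(sizes):
    """SplitInfo_A(D) = -sum_j |D_j|/|D| * log2(|D_j|/|D|)."""
    n = sum(sizes)
    return -sum(s / n * math.log2(s / n) for s in sizes)

si = split_info([4, 6, 4])        # income: high=4, medium=6, low=4
print(f"{si:.3f}")                # 1.557
print(f"{0.029 / si:.3f}")        # 0.019 -> GainRatio(income)
```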
Comparing Attribute Selection Measures

• Both measures, in general, return good results, but:
– Information gain is biased towards multivalued attributes.
– Gain ratio tends to prefer unbalanced splits in which one partition is much smaller than the others.

12
Thank you for your attention.

Any Questions?

13
