
Decision tree

Decision tree learning is one of the most widely adopted algorithms for classification.
As the name indicates, it builds a model in the form of a tree structure. Its classification
accuracy is comparable with other methods, and it is very efficient.
A decision tree is used for multi-dimensional analysis with multiple classes. It is
characterized by fast execution time and ease in the interpretation of the rules. The goal of
decision tree learning is to create a model (based on the past data called past vector) that
predicts the value of the output variable based on the input variables in the feature vector.
Each node (or decision node) of a decision tree corresponds to one of the features in the
feature vector. From every node, there are edges to children, wherein there is an edge for
each of the possible values (or range of values) of the feature associated with the
node. The tree terminates at different leaf nodes (or terminal nodes), where each
leaf node represents a possible value for the output variable. The output variable is
determined by following a path that starts at the root and is guided by the values
of the input variables.
A decision tree is usually represented in the format depicted in Figure 7.8.

[Figure: a small two-level tree with root node 'A', branch node 'B', and T/F branches leading to leaf nodes]

FIG. 7.8
Decision tree structure

Each internal node (represented by boxes) tests an attribute (represented as 'A' and 'B'
within the boxes). Each branch corresponds to an attribute value (T/F in the above
case). Each leaf node assigns a classification. The first node is called the Root Node;
branches from the root node lead to Branch Nodes and, finally, to Leaf Nodes. Here, 'A'
is the Root Node (first node), 'B' is a Branch Node, and 'T' and 'F' are Leaf Nodes.
Thus, a decision tree consists of three types of nodes:
• Root Node
• Branch Node
• Leaf Node

Figure 7.9 shows an example decision tree for car driving - the decision to be
taken is whether to 'Keep Going' or to 'Stop', which depends on various situations as
depicted in the figure. If the signal is RED in colour, then the car should be stopped.
If there is not enough gas (petrol) in the car, the car should be stopped at the next
available gas station.
[Figure: driving decision tree - the questions 'Is there a stoplight ahead?', 'Is the stoplight red?' and 'Is there enough gas in the car?' lead to the outcomes 'Keep going', 'Stop' and 'Find a petrol station and buy gas']
FIG. 7.9
Decision tree example

7.5.2.1 Building a decision tree


Decision trees are built corresponding to the training data following an approach
called recursive partitioning. The approach splits the data into multiple subsets on
the basis of the feature values. It starts from the root node, which is nothing but
the entire data set. It first selects the feature which predicts the target class in the
strongest way. The decision tree splits the data set into multiple partitions, with the
data in each partition having a distinct value for the feature based on which the
partitioning has happened. This is the first set of branches. Likewise, the algorithm
continues splitting the nodes on the basis of the feature which helps in the best
partition. This continues till a stopping criterion is reached. The usual stopping
criteria are -
1. All or most of the examples at a particular node have the same class
2. All features have been used up in the partitioning
3. The tree has grown to a pre-defined threshold limit
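The sketch below (not from the text) shows what this recursive partitioning loop looks like in Python. Here choose_best_feature is only a placeholder for whichever splitting measure is used (information gain is discussed later in this section), and the three stopping criteria above appear as the base cases of the recursion.

from collections import Counter

def choose_best_feature(rows, features, target):
    # Placeholder: a real implementation ranks features by a purity measure
    # such as information gain (Section 7.5.2.4); here we simply take the first.
    return features[0]

def build_tree(rows, features, target, depth=0, max_depth=5):
    labels = [row[target] for row in rows]
    # Stopping criterion 1: all examples at this node have the same class
    if len(set(labels)) == 1:
        return labels[0]
    # Stopping criteria 2 and 3: no features left, or depth threshold reached
    if not features or depth >= max_depth:
        return Counter(labels).most_common(1)[0][0]
    best = choose_best_feature(rows, features, target)
    tree = {best: {}}
    for value in set(row[best] for row in rows):
        subset = [row for row in rows if row[best] == value]
        remaining = [f for f in features if f != best]
        tree[best][value] = build_tree(subset, remaining, target, depth + 1, max_depth)
    return tree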
Let us try to understand this in the context of an example. Global Technology
Solutions (GTS), a leading provider of IT solutions, is coming to the College of
Engineering and Management (CEM) for hiring B.Tech. students. Last year, during
campus recruitment, they had shortlisted 18 students for the final interview. Being a
company of international repute, they follow a stringent interview process to select
only the best of the students. The information related to the interview evaluation
results of the shortlisted students (hiding the names) on the basis of different evaluation
parameters is available for reference in Figure 7.10. Chandra, a student of CEM,
wants to find out if he may be offered a job in GTS. His CGPA is quite high. His self-
evaluation on the other parameters is as follows:
Communication - Bad; Aptitude - High; Programming skills - Bad
CGPA      Communication   Aptitude   Programming Skill   Job offered?
High      Good            High       Good                Yes
Medium    Good            High       Good                Yes
Low       Bad             Low        Good                No
Low       Good            Low        Bad                 No
High      Good            High       Bad                 Yes
High      Good            High       Good                Yes
Medium    Bad             Low        Bad                 No
Medium    Bad             Low        Good                No
High      Bad             High       Good                Yes
Medium    Good            High       Good                Yes
Low       Bad             High       Bad                 No
Low       Bad             High       Bad                 No
Medium    Good            High       Bad                 Yes
Low       Good            Low        Good                No
High      Bad             Low        Bad                 No
Medium    Bad             High       Good                No
High      Bad             Low        Bad                 No
Medium    Good            High       Bad                 Yes

FIG. 7.10
Training data for GTS recruitment
Let us try to solve this problem, i.e. predicting whether Chandra will get a job offer,
by using the decision tree model. First, we need to draw the decision tree corresponding
to the training data given in Figure 7.10. According to the table, the job offer condition
(i.e. the outcome) is FALSE for all the cases where Aptitude = Low, irrespective of the
other conditions. So, the feature Aptitude can be taken up as the first node of the
decision tree.
For Aptitude = High, the job offer condition is TRUE for all the cases where
Communication = Good. For cases where Communication = Bad, the job offer condition
is TRUE for all the cases where CGPA = High.
Figure 7.11 depicts the complete decision tree diagram for the table given in
Figure 7.10.
[Figure: decision tree for the GTS data - START leads to Aptitude? (Low: Job not offered; High: Communication?); Communication? (Good: Job offered; Bad: CGPA?); CGPA? (High: Job offered; Medium/Low: Job not offered)]

FIG. 7.11
Decision tree based on the training data

7.5.2.2 Searching a decision tree


By using the above decision tree depicted in Figure 7.11, we need to predict whether
Chandra might get a job offer for the given parameter values: CGPA = High,
Communication = Bad, Aptitude = High, Programming skills = Bad. There are multiple
ways to search through the trained decision tree for a solution to the given
prediction problem.

Exhaustive search
1. Place the item in the first group (class). Recursively examine solutions with the
item in the first group (class).
2. Place the item in the second group (class). Recursively examine solutions with
the item in the second group (class).
3. Repeat the above steps until the solution is reached.
Exhaustive search travels through the decision tree exhaustively, but it will take
much time when the decision tree is big with multiple leaves and multiple attribute
values.
Branch and bound search
Branch and bound uses an existing best solution to sidestep searching of the entire
decision tree in full. When the algorithm starts, the best solution is well defined to
have the worst possible value; thus, any solution it finds out is an improvement. This
makes the algorithm initially run down to the left-most branch of the tree, even
though that is unlikely to produce a realistic result. In the partitioning problem, that
solution corresponds to putting every item in one group, and it is an unacceptable
solution. A programme can speed up the process by using a fast heuristic to find an
initial solution, which can then be used as an input for branch and bound. If the heuristic is
right, the savings can be substantial.

[Figure: the same decision tree as Figure 7.11, with the path Aptitude = High → Communication = Bad → CGPA = High highlighted, ending at 'Job offered']
FIG. 7.12
Decision tree based on the training data (depicting a sample path)
Figure 7.12 depicts a sample path (thick line) for the conditions CGPA = High,
Communication = Bad, Aptitude = High and Programming skills = Bad. According to
the above decision tree, the prediction can be made that Chandra will get the job offer.
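As an aside (not part of the text), the tree of Figure 7.11 can be written down as a nested dictionary and the prediction for Chandra obtained by walking one path from the root, exactly as the thick line in Figure 7.12 does:

tree = {
    "Aptitude": {
        "Low": "Job not offered",
        "High": {
            "Communication": {
                "Good": "Job offered",
                "Bad": {
                    "CGPA": {
                        "High": "Job offered",
                        "Medium": "Job not offered",
                        "Low": "Job not offered",
                    }
                },
            }
        },
    }
}

def classify(node, example):
    # Internal nodes are {feature: {value: subtree}}; leaves are plain strings.
    while isinstance(node, dict):
        feature, branches = next(iter(node.items()))
        node = branches[example[feature]]
    return node

chandra = {"CGPA": "High", "Communication": "Bad",
           "Aptitude": "High", "Programming Skill": "Bad"}
print(classify(tree, chandra))   # -> Job offered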
There are many implementations of decision tree, the most prominent ones being
C5.0, CART (Classification and Regression Tree), CHAID (Chi-square Automatic
Interaction Detector) and ID3 (Iterative Dichotomiser 3) algorithms. The biggest
challenge of a decision tree algorithm is to find out which feature to split upon. The
main driver for identifying the feature is that the data should be split in such a way
that the partitions created by the split contain examples belonging to a single
class. If that happens, the partitions are considered to be pure. Entropy is a measure
of impurity of an attribute or feature adopted by many algorithms such as ID3 and
C5.0. The information gain is calculated on the basis of the decrease in entropy (S)
after a data set is split according to a particular attribute (A). Constructing a decision
tree is all about finding an attribute that returns the highest information gain (i.e. the
most homogeneous branches).

Note:
Like information gain, there are other measures like Gini index or chi-square for
individual nodes to decide the feature on the basis of which the split has to be applied.
The CART algorithm uses Gini index, while the CHAID algorithm uses chi-square
for deciding the feature for applying split.

7.5.2.3 Entropy of a decision tree


Let us say S is the sample set of training examples. Then, Entropy(S), measuring the
impurity of S, is defined as

Entropy(S) = -Σ (i = 1 to c) p_i log2(p_i)

where c is the number of different class labels and p_i refers to the proportion of
values falling into the i-th class label.
For example, with respect to the training data in Figure 7.10, we have two values
for the target class 'Job Offered?' - Yes and No. The value of p_i for class value 'Yes'
is 0.44 (i.e. 8/18) and that for class value 'No' is 0.56 (i.e. 10/18). So, we can calculate
the entropy as
Entropy(S) = -0.44 log2(0.44) - 0.56 log2(0.56) = 0.99
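This figure is easy to reproduce with a few lines of Python (an illustrative helper, not the book's code): for the 'Job Offered?' column with 8 Yes and 10 No values,

import math

def entropy(counts):
    # Entropy of a class distribution given as a list of class counts.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(round(entropy([8, 10]), 2))   # -> 0.99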
7.5.2.4 Information gain of a decision tree

The information gain is calculated on the basis of the decrease in entropy (S) after a
data set is split according to a particular attribute (A). Constructing a decision tree is
all about finding an attribute that returns the highest information gain (i.e. the most
homogeneous branches). If the information gain is 0, it means that there is no reduction
in entropy due to the split of the data set according to that particular feature. On
the other hand, the maximum amount of information gain which may happen is the
entropy of the data set before the split.
Information gain for a particular feature A is calculated as the difference between the
entropy before the split (S_bs) and the entropy after the split (S_as):

Information Gain(S, A) = Entropy(S_bs) - Entropy(S_as)

For calculating the entropy after the split, the entropy for all partitions needs to be
considered. Then, the weighted summation of the entropy for each partition can be taken
as the total entropy after the split. For performing the weighted summation, the proportion
of examples falling into each partition is used as the weight:

Entropy(S_as) = Σ (i = 1 to n) w_i × Entropy(p_i)
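A short sketch of these two formulas (names are illustrative, not from the text): the weighted post-split entropy and the resulting information gain, checked here on the 'Aptitude' split of Figure 7.10 (High gives 8 Yes / 3 No, Low gives 0 Yes / 7 No).

import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, partition_counts):
    # parent_counts: class counts before the split, e.g. [8, 10]
    # partition_counts: one list of class counts per partition after the split
    total = sum(parent_counts)
    entropy_after = sum(sum(p) / total * entropy(p) for p in partition_counts)
    return entropy(parent_counts) - entropy_after

print(round(information_gain([8, 10], [[8, 3], [0, 7]]), 2))   # -> 0.47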
Let us examine the value of information gain for the training data set shown in
Figure 7.10. We will find the value of entropy at the beginning before any split happens
and then again after the split happens. We will compare the values for all the cases -
1. when the feature 'CGPA' is used for the split
2. when the feature 'Communication' is used for the split
3. when the feature 'Aptitude' is used for the split
4. when the feature 'Programming Skill' is used for the split
Figure 7.13a gives the entropy values for the first level split for each of the cases
mentioned above.
As calculated, the entropy of the data set before the split (i.e. Entropy(S_bs)) is 0.99, and
the entropy of the data set after the split (i.e. Entropy(S_as)) is
• 0.69 when the feature 'CGPA' is used for the split
• 0.63 when the feature 'Communication' is used for the split
• 0.52 when the feature 'Aptitude' is used for the split
• 0.95 when the feature 'Programming Skill' is used for the split
(a) Original data set:

               Yes     No      Total
Count          8       10      18
pi             0.44    0.56
-pi*log2(pi)   0.52    0.47    0.99

Total Entropy = 0.99

(b) Split data set (based on the feature 'CGPA'):

CGPA = High:
               Yes     No      Total
Count          4       2       6
pi             0.67    0.33
-pi*log2(pi)   0.39    0.53    0.92

CGPA = Medium:
               Yes     No      Total
Count          4       3       7
pi             0.57    0.43
-pi*log2(pi)   0.46    0.52    0.99

CGPA = Low:
               Yes     No      Total
Count          0       5       5
pi             0.00    1.00
-pi*log2(pi)   0.00    0.00    0.00

Total Entropy = 0.69    Information Gain = 0.30

(c) Split data set (based on the feature 'Communication'):

Communication = Good:
               Yes     No      Total
Count          7       2       9
pi             0.78    0.22
-pi*log2(pi)   0.28    0.48    0.76

Communication = Bad:
               Yes     No      Total
Count          1       8       9
pi             0.11    0.89
-pi*log2(pi)   0.35    0.15    0.50

Total Entropy = 0.63    Information Gain = 0.36

(d) Split data set (based on the feature 'Aptitude'):

Aptitude = High:
               Yes     No      Total
Count          8       3       11
pi             0.73    0.27
-pi*log2(pi)   0.33    0.51    0.85

Aptitude = Low:
               Yes     No      Total
Count          0       7       7
pi             0.00    1.00
-pi*log2(pi)   0.00    0.00    0.00

Total Entropy = 0.52    Information Gain = 0.47

(e) Split data set (based on the feature 'Programming Skill'):

Programming Skill = Good:
               Yes     No      Total
Count          5       4       9
pi             0.56    0.44
-pi*log2(pi)   0.47    0.52    0.99

Programming Skill = Bad:
               Yes     No      Total
Count          3       6       9
pi             0.33    0.67
-pi*log2(pi)   0.53    0.39    0.92

Total Entropy = 0.95    Information Gain = 0.04

FIG. 7.13A
Entropy and information gain calculation (Level 1)
Therefore, the information gain from the feature 'CGPA' = 0.99 - 0.69 = 0.30,
whereas the information gain from the feature 'Communication' = 0.99 - 0.63 = 0.36.
Likewise, the information gain for 'Aptitude' and 'Programming Skill' is 0.47 and
0.04, respectively.
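These four values can be verified with the same helpers (redefined here so the snippet stands alone), feeding in the (Yes, No) counts of each partition from Figure 7.13A:

import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, partitions):
    total = sum(parent)
    after = sum(sum(p) / total * entropy(p) for p in partitions)
    return entropy(parent) - after

parent = [8, 10]                                     # 8 Yes, 10 No overall
splits = {
    "CGPA":              [[4, 2], [4, 3], [0, 5]],   # High, Medium, Low
    "Communication":     [[7, 2], [1, 8]],           # Good, Bad
    "Aptitude":          [[8, 3], [0, 7]],           # High, Low
    "Programming Skill": [[5, 4], [3, 6]],           # Good, Bad
}
for feature, parts in splits.items():
    print(feature, round(information_gain(parent, parts), 2))
# CGPA 0.3, Communication 0.36, Aptitude 0.47, Programming Skill 0.04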
Hence, it is quite evident that among all the features, 'Aptitude' results in the best
information gain when adopted for the split. So, at the first level, a split will be applied
according to the value of 'Aptitude'; in other words, 'Aptitude' will be the first node
of the decision tree formed. One important point to be noted here is that for Aptitude
= Low, the entropy is 0, which indicates that the result will always be the same irrespective
of the values of the other features. Hence, the branch towards Aptitude = Low
will not continue any further.
As a part of level 2, we will thus have only one branch to navigate in this case - the
one for Aptitude = High. Figure 7.13b presents the calculations for level 2. As can be seen
from the figure, the entropy value is as follows:
• 0.85 before the split
• 0.33 when the feature 'CGPA' is used for the split
• 0.30 when the feature 'Communication' is used for the split
• 0.80 when the feature 'Programming Skill' is used for the split
Hence, the information gain after the split with the features CGPA, Communication
and Programming Skill is 0.52, 0.55 and 0.05, respectively. Hence, the feature
Communication should be used for this split as it results in the highest information
gain. So, at the second level, a split will be applied on the basis of the value of
Communication. Again, the point to be noted here is that for Communication =
Good, the entropy is 0, which indicates that the result will always be the same irrespective
of the values of the other features. Hence, the branch towards Communication =
Good will not continue any further.

Aptitude = High (subset of the training data):

CGPA      Communication   Programming Skill   Job offered?
High      Good            Good                Yes
Medium    Good            Good                Yes
High      Good            Bad                 Yes
High      Good            Good                Yes
High      Bad             Good                Yes
Medium    Good            Good                Yes
Low       Bad             Bad                 No
Low       Bad             Bad                 No
Medium    Good            Bad                 Yes
Medium    Bad             Good                No
Medium    Good            Bad                 Yes

(a) Level 2 starting set:

               Yes     No      Total
Count          8       3       11
pi             0.73    0.27
-pi*log2(pi)   0.33    0.51    0.85

Total Entropy = 0.85

(b) Split data set (based on the feature 'CGPA'):

CGPA = High:
               Yes     No      Total
Count          4       0       4
pi             1.00    0.00
-pi*log2(pi)   0.00    0.00    0.00

CGPA = Medium:
               Yes     No      Total
Count          4       1       5
pi             0.80    0.20
-pi*log2(pi)   0.26    0.46    0.72

CGPA = Low:
               Yes     No      Total
Count          0       2       2
pi             0.00    1.00
-pi*log2(pi)   0.00    0.00    0.00

Total Entropy = 0.33    Information Gain = 0.52

(c) Split data set (based on the feature 'Communication'):

Communication = Good:
               Yes     No      Total
Count          7       0       7
pi             1.00    0.00
-pi*log2(pi)   0.00    0.00    0.00

Communication = Bad:
               Yes     No      Total
Count          1       3       4
pi             0.25    0.75
-pi*log2(pi)   0.50    0.31    0.81

Total Entropy = 0.30    Information Gain = 0.55

(d) Split data set (based on the feature 'Programming Skill'):

Programming Skill = Good:
               Yes     No      Total
Count          5       1       6
pi             0.83    0.17
-pi*log2(pi)   0.22    0.43    0.65

Programming Skill = Bad:
               Yes     No      Total
Count          3       2       5
pi             0.60    0.40
-pi*log2(pi)   0.44    0.53    0.97

Total Entropy = 0.80    Information Gain = 0.05

FIG. 7.13B
Entropy and information gain calculation (Level 2)
As a part of level 3, we will thus have only one branch to navigate in this case - the
one for Communication = Bad. Figure 7.13c presents the calculations for level 3. As can
be seen from the figure, the entropy value is as follows:
• 0.81 before the split
• 0 when the feature 'CGPA' is used for the split
• 0.50 when the feature 'Programming Skill' is used for the split

Aptitude = High & Communication = Bad (subset of the training data):

CGPA      Programming Skill   Job offered?
High      Good                Yes
Low       Bad                 No
Low       Bad                 No
Medium    Good                No

(a) Level 3 starting set:

               Yes     No      Total
Count          1       3       4
pi             0.25    0.75
-pi*log2(pi)   0.50    0.31    0.81

Total Entropy = 0.81

(b) Split data set (based on the feature 'CGPA'):

CGPA = High:
               Yes     No      Total
Count          1       0       1
pi             1.00    0.00
-pi*log2(pi)   0.00    0.00    0.00

CGPA = Medium:
               Yes     No      Total
Count          0       1       1
pi             0.00    1.00
-pi*log2(pi)   0.00    0.00    0.00

CGPA = Low:
               Yes     No      Total
Count          0       2       2
pi             0.00    1.00
-pi*log2(pi)   0.00    0.00    0.00

Total Entropy = 0.00    Information Gain = 0.81

(c) Split data set (based on the feature 'Programming Skill'):

Programming Skill = Good:
               Yes     No      Total
Count          1       1       2
pi             0.50    0.50
-pi*log2(pi)   0.50    0.50    1.00

Programming Skill = Bad:
               Yes     No      Total
Count          0       2       2
pi             0.00    1.00
-pi*log2(pi)   0.00    0.00    0.00

Total Entropy = 0.50    Information Gain = 0.31

FIG. 7.13C
Entropy and information gain calculation (Level 3)
Hence, the information gain after the split with the feature CGPA is 0.81, which
is the maximum possible information gain (as the entropy before the split was 0.81).
Hence, obviously, a split will be applied on the basis of the value of CGPA.
Because the maximum information gain is already achieved, the tree will not continue
any further.

7.5.2.5 Algorithm for decision tree


Input: Training data set, test data set (or data points)
Steps:
  Do for all attributes
    Calculate the entropy E_i of the attribute F_i
    if E_i < E_min
      then E_min = E_i and F_min = F_i
    end if
  End do
  Split the data set into subsets using the attribute F_min
  Draw a decision tree node containing the attribute F_min and split the data
  set into subsets
  Repeat the above steps until the full tree is drawn covering all the attributes
  of the original table
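A self-contained Python rendering of this greedy procedure is sketched below (an ID3-style learner written for this example, not the book's code). It selects at each node the attribute with the highest information gain, so on the 18 rows of Figure 7.10 it should reproduce the tree of Figure 7.11 (Aptitude, then Communication, then CGPA).

import math
from collections import Counter

FEATURES = ["CGPA", "Communication", "Aptitude", "Programming Skill"]
ROWS = [
    ("High","Good","High","Good","Yes"),  ("Medium","Good","High","Good","Yes"),
    ("Low","Bad","Low","Good","No"),      ("Low","Good","Low","Bad","No"),
    ("High","Good","High","Bad","Yes"),   ("High","Good","High","Good","Yes"),
    ("Medium","Bad","Low","Bad","No"),    ("Medium","Bad","Low","Good","No"),
    ("High","Bad","High","Good","Yes"),   ("Medium","Good","High","Good","Yes"),
    ("Low","Bad","High","Bad","No"),      ("Low","Bad","High","Bad","No"),
    ("Medium","Good","High","Bad","Yes"), ("Low","Good","Low","Good","No"),
    ("High","Bad","Low","Bad","No"),      ("Medium","Bad","High","Good","No"),
    ("High","Bad","Low","Bad","No"),      ("Medium","Good","High","Bad","Yes"),
]
DATA = [dict(zip(FEATURES + ["Job offered?"], row)) for row in ROWS]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, feature, target):
    labels = [r[target] for r in rows]
    after = 0.0
    for value in set(r[feature] for r in rows):
        part = [r[target] for r in rows if r[feature] == value]
        after += len(part) / len(rows) * entropy(part)
    return entropy(labels) - after

def build(rows, features, target="Job offered?"):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                 # pure partition -> leaf
        return labels[0]
    if not features:                          # no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: info_gain(rows, f, target))
    node = {best: {}}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        node[best][value] = build(subset, [f for f in features if f != best], target)
    return node

print(build(DATA, FEATURES))   # root should be 'Aptitude', as derived above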

7.5.2.6 Avoiding overfitting in decision trees - pruning

The decision tree algorithm, unless a stopping criterion is applied, may keep growing
indefinitely - splitting for every feature and dividing into smaller partitions till the
point that the data is perfectly classified. This, as is quite evident, results in the overfitting
problem. To prevent a decision tree from getting overfitted to the training data, pruning
of the decision tree is essential. Pruning a decision tree reduces the size of the tree
such that the model is more generalized and can classify unknown and unlabelled
data in a better way.
There are two approaches to pruning:
• Pre-pruning: Stop growing the tree before it reaches perfection.
• Post-pruning: Allow the tree to grow entirely and then post-prune some of the
branches from it.
In the case of pre-pruning, the tree is stopped from growing further once it reaches
a certain number of decision nodes or decisions. Hence, in this strategy, the algorithm
avoids overfitting as well as optimizes the computational cost. However, it also stands a
chance of ignoring important information contributed by a feature which was skipped,
thereby resulting in missing out certain patterns in the data.
On the other hand, in the case of post-pruning, the tree is allowed to grow to the
full extent. Then, by using certain pruning criteria, e.g. error rates at the nodes, the
size of the tree is reduced. This is a more effective approach in terms of classification
accuracy, as it considers all minute information available from the training data.
However, the computational cost is obviously more than that of pre-pruning.
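In practice (not covered in the text), both strategies are available in library implementations; for instance, scikit-learn's DecisionTreeClassifier exposes pre-pruning through parameters such as max_depth and min_samples_leaf, and post-pruning through cost-complexity pruning (ccp_alpha). The sketch below uses the iris data set purely as a stand-in, since the GTS table would first need one-hot encoding.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)   # stand-in data for illustration

# Pre-pruning: stop growing once depth / leaf-size thresholds are reached.
pre_pruned = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                                    min_samples_leaf=2).fit(X, y)

# Post-pruning: compute the cost-complexity pruning path, pick an alpha,
# and refit; larger ccp_alpha values prune the fully grown tree more heavily.
path = DecisionTreeClassifier(criterion="entropy").cost_complexity_pruning_path(X, y)
post_pruned = DecisionTreeClassifier(criterion="entropy",
                                     ccp_alpha=path.ccp_alphas[-2]).fit(X, y)
print(pre_pruned.get_depth(), post_pruned.get_depth())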
7.5.2.7 Strengths of decision tree

• It produces very simple, understandable rules. For smaller trees, not much mathematical
and computational knowledge is required to understand this model.
• It works well for most of the problems.
• It can handle both numerical and categorical variables.
• It can work well both with small and large training data sets.
• Decision trees provide a definite clue of which features are more useful for
classification.

7.5.2.8 Weaknesses of decision tree


• Decision tree models are often biased towards features having a larger number
of possible values, i.e. levels.
• This model gets overfitted or underfitted quite easily.
• Decision trees are prone to errors in classification problems with many classes
and a relatively small number of training examples.
• A decision tree can be computationally expensive to train.
• Large trees are complex to understand.
7.5.2.9 Application of decision tree
there is a finite list of attributes
Decision tree can be applied in a dataset in which
attribute (e.g. High for the attri
and each data instance stores a value for that
values (e.g. 'High.
bute CGPA). When each attribute has a small number of distinct
an
'Medium, Low'), it is easier/quicker for thedecision tree to suggest (or choose)(e.g.
effective solution.This algorithm can beextended to handle real-value attributes
a floating point temperature).
The most straightforward case exists when there are only two possible values tor
an attribute (Boolean classification). Example: Communication has only two values
as Good' or 'Bad. It is also easy to extend the decision tree tocreate a target fune
tionwith more than two possible output values. Example: CGPA can take one of the
values from 'High, 'Medium} and Low: Irrespective of whether it is a binary value
multiple values, it is discrete in nature. For example, Aptitude can take the value of
either 'High' or Low. It is not possible to assign the value of both High' and 'Low 0
theattribute Aptitude to draw adecision tree.
There should be no infinite loops in taking a decision. As we move from the root
node to the next level node, it should move step by step towards the decision node.
Otherwise, the algorithm may not give the final result for a given data. If a set of code
goes into a loop, it would repeat itself forever, unless the system crashes.
A decision tree can be used even for instances with missing attributes and for
instances with errors in the classification of examples or in the attribute values
describing those examples; such instances are handled well by decision trees, thereby
making them a robust learning method.
