
Classification

NOTE: THIS PRESENTATION SHOULD BE CONSIDERED AS
SUPPORTING MATERIAL ONLY. FOR DETAILED STUDY,
STUDENTS MUST REFER TO THE TEXT BOOKS AND REFERENCE
BOOKS MENTIONED IN THE SYLLABUS.



Classification vs. Prediction
• Classification
– predicts categorical class labels
– classifies data (constructs a model) based on the training
set and the values (class labels) in a classifying attribute
and uses it in classifying new data
• Prediction
– models continuous-valued functions, i.e., predicts unknown
or missing values
• Typical applications
– Credit approval
– Target marketing
– Medical diagnosis
– Fraud detection



Classification—A Two-Step Process
• Model construction: describing a set of predetermined classes
– Each tuple/sample is assumed to belong to a predefined class, as
determined by the class label attribute
– The set of tuples used for model construction is training set
– The model is represented as classification rules, decision trees, or
mathematical formulae
• Model usage: for classifying future or unknown objects
– Estimate the accuracy of the model
• The known label of test sample is compared with the classified
result from the model
• Accuracy rate is the percentage of test set samples that are
correctly classified by the model
– If the accuracy is acceptable, use the model to classify data tuples
whose class labels are not known



Process (1): Model Construction

The training data are fed to a classification algorithm, which produces the classifier (model):

NAME | RANK           | YEARS | TENURED
Mike | Assistant Prof | 3     | no
Mary | Assistant Prof | 7     | yes
Bill | Professor      | 2     | yes
Jim  | Associate Prof | 7     | yes
Dave | Assistant Prof | 6     | no
Anne | Associate Prof | 3     | no

Resulting classifier (model):
IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
Process (2): Using the Model in Prediction

The classifier is applied to the testing data to estimate accuracy, and then to unseen data:

NAME    | RANK           | YEARS | TENURED
Tom     | Assistant Prof | 2     | no
Merlisa | Associate Prof | 7     | no
George  | Professor      | 5     | yes
Joseph  | Assistant Prof | 7     | yes

Unseen data: (Jeff, Professor, 4) → Tenured? yes (since rank = ‘professor’)
Classification Examples
• Teachers classify students’ grades as A, B, C, D
or F.
• Identify mushrooms as poisonous or edible.
• Predict when a river will flood.
• Identify individuals who are credit risks.
• Speech recognition
• Pattern recognition



Classification Ex: Grading
• If x >= 90 then grade = A.
• If 80 <= x < 90 then grade = B.
• If 70 <= x < 80 then grade = C.
• If 60 <= x < 70 then grade = D.
• If x < 60 then grade = F.

(The slide draws these rules as a decision tree, testing x against the
thresholds 90, 80, 70, and 60 in turn.)
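A minimal sketch of these rules as code (hypothetical Python, not from the slides): the classifier is just a chain of threshold tests mirroring the decision tree.

    def grade(x: float) -> str:
        # Each test mirrors one branch of the grading decision tree.
        if x >= 90: return "A"
        if x >= 80: return "B"
        if x >= 70: return "C"
        if x >= 60: return "D"
        return "F"

    print([grade(v) for v in (95, 85, 75, 65, 55)])  # ['A', 'B', 'C', 'D', 'F']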
Classification problem
The classification problem can be expressed as: Use the training database given and predict
the class label of a previously unseen instance

Database (each tuple with two alternative class-label assignments, Output1 and Output2)
Name Gender Height Output1 Output2
Kristina F 1.6m Short Medium
Jim M 2m Tall Medium
Maggie F 1.9m Medium Tall
Martha F 1.88m Medium Tall
Stephanie F 1.7m Short Medium
Bob M 1.85m Medium Medium
Kathy F 1.6m Short Medium
Dave M 1.7m Short Medium
Worth M 2.2m Tall Tall
Steven M 2.1m Tall Tall
Debbie F 1.8m Medium Medium
Todd M 1.95m Medium Medium
Kim F 1.9m Medium Tall
Amy F 1.8m Medium Medium
Wynette F 1.75m Medium Medium



Bayesian Classification: Why?
• A statistical classifier: performs probabilistic prediction, i.e.,
predicts class membership probabilities
• Foundation: Based on Bayes’ Theorem.
• Performance: A simple Bayesian classifier, naïve Bayesian
classifier, has comparable performance with decision tree and
selected neural network classifiers
• Incremental: Each training example can incrementally
increase/decrease the probability that a hypothesis is correct —
prior knowledge can be combined with observed data
• Standard: Even when Bayesian methods are computationally
intractable, they can provide a standard of optimal decision
making against which other methods can be measured



Bayesian Theorem: Basics
• Let X be a data sample (“evidence”): class label is unknown
• Let H be a hypothesis that X belongs to class C
• Classification is to determine P(H|X), the probability that the
hypothesis holds given the observed data sample X
• P(H) (prior probability), the initial probability
– E.g., X will buy computer, regardless of age, income, …
• P(X): probability that sample data is observed
• P(X|H) (posterior probability): the probability of observing the
sample X, given that the hypothesis holds
– E.g., Given that X will buy computer, the prob. that X is 31..40,
medium income
Bayesian Theorem
• Given training data X, the posterior probability of a hypothesis H,
P(H|X), follows Bayes' theorem:

  P(H|X) = \frac{P(X|H) \, P(H)}{P(X)}

• Informally, this can be written as
  posterior = likelihood × prior / evidence
• Predict that X belongs to class Ci iff the probability P(Ci|X) is the
highest among all the P(Ck|X) for all the k classes
• Practical difficulty: requires initial knowledge of many
probabilities, and significant computational cost
Naïve Bayesian Classifier: Training Dataset

Classes:
  C1: buys_computer = ‘yes’
  C2: buys_computer = ‘no’

Data sample to classify:
  X = (age <= 30, income = medium, student = yes, credit_rating = fair)

age   | income | student | credit_rating | buys_computer
<=30  | high   | no      | fair          | no
<=30  | high   | no      | excellent     | no
31…40 | high   | no      | fair          | yes
>40   | medium | no      | fair          | yes
>40   | low    | yes     | fair          | yes
>40   | low    | yes     | excellent     | no
31…40 | low    | yes     | excellent     | yes
<=30  | medium | no      | fair          | no
<=30  | low    | yes     | fair          | yes
>40   | medium | yes     | fair          | yes
<=30  | medium | yes     | excellent     | yes
31…40 | medium | no      | excellent     | yes
31…40 | high   | yes     | fair          | yes
>40   | medium | no      | excellent     | no
Naïve Bayesian Classifier: An Example
• P(Ci):
  P(buys_computer = “yes”) = 9/14 = 0.643
  P(buys_computer = “no”) = 5/14 = 0.357

• Compute P(X|Ci) for each class:
  P(age = “<=30” | buys_computer = “yes”) = 2/9 = 0.222
  P(age = “<=30” | buys_computer = “no”) = 3/5 = 0.600
  P(income = “medium” | buys_computer = “yes”) = 4/9 = 0.444
  P(income = “medium” | buys_computer = “no”) = 2/5 = 0.400
  P(student = “yes” | buys_computer = “yes”) = 6/9 = 0.667
  P(student = “yes” | buys_computer = “no”) = 1/5 = 0.200
  P(credit_rating = “fair” | buys_computer = “yes”) = 6/9 = 0.667
  P(credit_rating = “fair” | buys_computer = “no”) = 2/5 = 0.400

• For X = (age <= 30, income = medium, student = yes, credit_rating = fair):
  P(X|buys_computer = “yes”) = 0.222 × 0.444 × 0.667 × 0.667 = 0.044
  P(X|buys_computer = “no”) = 0.600 × 0.400 × 0.200 × 0.400 = 0.019

• P(X|Ci) × P(Ci):
  P(X|buys_computer = “yes”) × P(buys_computer = “yes”) = 0.044 × 0.643 = 0.028
  P(X|buys_computer = “no”) × P(buys_computer = “no”) = 0.019 × 0.357 = 0.007

Therefore, X belongs to class (“buys_computer = yes”)
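The same hand computation can be scripted. A minimal sketch, assuming the 14-tuple training set above (the tuple layout and variable names are illustrative, not from the slides):

    from collections import Counter, defaultdict

    # Each tuple: (age, income, student, credit_rating, buys_computer)
    data = [
        ("<=30", "high", "no", "fair", "no"),
        ("<=30", "high", "no", "excellent", "no"),
        ("31..40", "high", "no", "fair", "yes"),
        (">40", "medium", "no", "fair", "yes"),
        (">40", "low", "yes", "fair", "yes"),
        (">40", "low", "yes", "excellent", "no"),
        ("31..40", "low", "yes", "excellent", "yes"),
        ("<=30", "medium", "no", "fair", "no"),
        ("<=30", "low", "yes", "fair", "yes"),
        (">40", "medium", "yes", "fair", "yes"),
        ("<=30", "medium", "yes", "excellent", "yes"),
        ("31..40", "medium", "no", "excellent", "yes"),
        ("31..40", "high", "yes", "fair", "yes"),
        (">40", "medium", "no", "excellent", "no"),
    ]

    class_counts = Counter(row[-1] for row in data)        # yes: 9, no: 5
    cond = defaultdict(Counter)                            # (attribute, value) -> class counts
    for *attrs, label in data:
        for i, v in enumerate(attrs):
            cond[(i, v)][label] += 1

    def score(x, label):
        """P(X|Ci) * P(Ci) under the naive independence assumption."""
        s = class_counts[label] / len(data)                 # prior P(Ci)
        for i, v in enumerate(x):
            s *= cond[(i, v)][label] / class_counts[label]  # P(x_i | Ci)
        return s

    x = ("<=30", "medium", "yes", "fair")
    print({c: round(score(x, c), 3) for c in ("yes", "no")})  # {'yes': 0.028, 'no': 0.007}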



Naïve Bayesian Classifier: Comments
• Advantages
– Easy to implement
– Good results obtained in most of the cases
• Disadvantages
– Assumption: class conditional independence, therefore loss of
accuracy
– Practically, dependencies exist among variables
• E.g., in hospital patient data: profile (age, family history, etc.),
symptoms (fever, cough, etc.), and disease (lung cancer, diabetes, etc.)
• Dependencies among these cannot be modeled by the naïve Bayesian
classifier



Decision Tree Induction( ID3): Training Dataset
age income student credit_rating buys_computer
<=30 high no fair no
<=30 high no excellent no
31…40 high no fair yes
>40 medium no fair yes
>40 low yes fair yes
>40 low yes excellent no
31…40 low yes excellent yes
<=30 medium no fair no
<=30 low yes fair yes
>40 medium yes fair yes
<=30 medium yes excellent yes
31…40 medium no excellent yes
31…40 high yes fair yes
>40 medium no excellent no
Output: A Decision Tree for “buys_computer”

age?
├─ <=30   → student?
│            ├─ no  → no
│            └─ yes → yes
├─ 31..40 → yes
└─ >40    → credit_rating?
             ├─ excellent → no
             └─ fair      → yes


Attribute Selection Measure:
Information Gain (ID3)
• Select the attribute with the highest information gain
• Let p_i be the probability that an arbitrary tuple in D
belongs to class Ci, estimated by |C_{i,D}| / |D|
• Expected information (entropy) needed to classify a tuple in D:

  Info(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)

• Information needed (after using A to split D into v partitions)
to classify D:

  Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Info(D_j)

• Information gained by branching on attribute A:

  Gain(A) = Info(D) - Info_A(D)
Attribute Selection: Information Gain

• Class P: buys_computer = “yes” (9 tuples)
• Class N: buys_computer = “no” (5 tuples)

  Info(D) = I(9,5) = -\frac{9}{14}\log_2(\frac{9}{14}) - \frac{5}{14}\log_2(\frac{5}{14}) = 0.940

For attribute age:

age   | p_i | n_i | I(p_i, n_i)
<=30  | 2   | 3   | 0.971
31…40 | 4   | 0   | 0
>40   | 3   | 2   | 0.971

  Info_age(D) = \frac{5}{14} I(2,3) + \frac{4}{14} I(4,0) + \frac{5}{14} I(3,2) = 0.694

Here \frac{5}{14} I(2,3) means that “age <= 30” covers 5 of the 14 samples,
with 2 yes’es and 3 no’s. Hence

  Gain(age) = Info(D) - Info_age(D) = 0.246

Similarly,

  Gain(income) = 0.029
  Gain(student) = 0.151
  Gain(credit_rating) = 0.048

Since age has the highest information gain, it is selected as the splitting attribute.
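These numbers are easy to verify in code. A minimal sketch (the helper name info is illustrative):

    from math import log2

    def info(*counts):
        """Expected information I(p, n, ...) = -sum p_i * log2(p_i)."""
        total = sum(counts)
        return -sum(c / total * log2(c / total) for c in counts if c > 0)

    info_D = info(9, 5)                                          # 0.940
    info_age = (5/14)*info(2, 3) + (4/14)*info(4, 0) + (5/14)*info(3, 2)
    print(round(info_age, 3))                                    # 0.694
    print(round(info_D - info_age, 3))                           # Gain(age) = 0.246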
C4.5

• This approach uses gain ratio instead of gain.
• Decision tree algorithm C4.5 improves ID3 in the following ways:
– Handling both continuous and discrete attributes: to handle continuous
attributes, C4.5 creates a threshold and then splits the list into those whose
attribute value is above the threshold and those that are less than or equal to it.
– Handling training data with missing attribute values: C4.5 allows attribute values
to be marked as ? for missing. Missing attribute values are simply not used in
gain and entropy calculations.
– Pruning trees after creation: C4.5 goes back through the tree once it has been
created and attempts to remove branches that do not help by replacing them
with leaf nodes.
– Splitting: C4.5 splits on the attribute with the largest gain ratio, among
attributes whose information gain is at least as large as the average gain.
– Rule generation: the learned tree can be converted into a set of if-then rules.



Gain Ratio for Attribute Selection (C4.5)
• The information gain measure is biased towards attributes with a
large number of values
• C4.5 (a successor of ID3) uses gain ratio to overcome the
problem (a normalization of information gain):

  SplitInfo_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \log_2(\frac{|D_j|}{|D|})

– GainRatio(A) = Gain(A) / SplitInfo_A(D)
• Ex. income splits D into 4 low, 6 medium, and 4 high tuples, so

  SplitInfo_income(D) = -\frac{4}{14}\log_2(\frac{4}{14}) - \frac{6}{14}\log_2(\frac{6}{14}) - \frac{4}{14}\log_2(\frac{4}{14}) = 1.557

– gain_ratio(income) = 0.029 / 1.557 = 0.019
• The attribute with the maximum gain ratio is selected as the
splitting attribute
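A minimal sketch of the gain-ratio computation, in the style of the info helper from the previous sketch (names illustrative):

    from math import log2

    def split_info(*sizes):
        """SplitInfo_A(D) over partition sizes |D_1|, ..., |D_v|."""
        total = sum(sizes)
        return -sum(s / total * log2(s / total) for s in sizes)

    gain_income = 0.029                        # from the information-gain slide
    si_income = split_info(4, 6, 4)            # income: 4 low, 6 medium, 4 high -> 1.557
    print(round(gain_income / si_income, 3))   # gain_ratio(income) = 0.019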
Algorithm for Decision Tree Induction
• Basic algorithm (a greedy algorithm)
– Tree is constructed in a top-down recursive divide-and-conquer
manner
– At start, all the training examples are at the root
– Attributes are categorical (if continuous-valued, they are
discretized in advance)
– Examples are partitioned recursively based on selected
attributes
– Test attributes are selected on the basis of a heuristic or
statistical measure (e.g., information gain)
• Conditions for stopping partitioning
– All samples for a given node belong to the same class
– There are no remaining attributes for further partitioning
– There are no samples left
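A minimal sketch of this greedy, top-down loop using information gain (structure and names are illustrative; rows are assumed to be dicts keyed by attribute):

    from collections import Counter
    from math import log2

    def entropy(labels):
        n = len(labels)
        return -sum(c / n * log2(c / n) for c in Counter(labels).values())

    def info_gain(rows, labels, attr):
        n = len(labels)
        rem = 0.0
        for value, size in Counter(row[attr] for row in rows).items():
            subset = [l for row, l in zip(rows, labels) if row[attr] == value]
            rem += size / n * entropy(subset)
        return entropy(labels) - rem

    def build_tree(rows, labels, attrs):
        if len(set(labels)) == 1:                   # all samples in one class
            return labels[0]
        if not attrs:                               # no attributes left: majority class
            return Counter(labels).most_common(1)[0][0]
        best = max(attrs, key=lambda a: info_gain(rows, labels, a))
        node = {}
        for value in {row[best] for row in rows}:   # partition recursively
            sub = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
            srows, slabels = zip(*sub)
            node[(best, value)] = build_tree(list(srows), list(slabels),
                                             [a for a in attrs if a != best])
        return node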
Gini Index (CART, IBM IntelligentMiner)
• If a data set D contains examples from n classes, the gini index,
gini(D), is defined as

  gini(D) = 1 - \sum_{j=1}^{n} p_j^2

where p_j is the relative frequency of class j in D
• If a data set D is split on A into two subsets D1 and D2, the gini
index gini_A(D) is defined as

  gini_A(D) = \frac{|D_1|}{|D|} gini(D_1) + \frac{|D_2|}{|D|} gini(D_2)

• Reduction in impurity:

  \Delta gini(A) = gini(D) - gini_A(D)

• The attribute that provides the smallest gini_split(D) (or the largest
reduction in impurity) is chosen to split the node
Computation of Gini Index
• Ex. D has 9 tuples with buys_computer = “yes” and 5 with “no”:

  gini(D) = 1 - (\frac{9}{14})^2 - (\frac{5}{14})^2 = 0.459

• Suppose the attribute income partitions D into 10 tuples in D1: {low,
medium} and 4 tuples in D2: {high}:

  gini_{income \in \{low, medium\}}(D) = \frac{10}{14} Gini(D_1) + \frac{4}{14} Gini(D_2) = 0.443

Similarly, Gini_{low,high} is 0.458 and Gini_{medium,high} is 0.450, so the
best binary split on income is {low, medium} (and {high}).
• Evaluating age, we obtain {youth, senior} or {middle_aged} as the best split,
with a Gini index of 0.357
• The attributes student and credit_rating are both binary, with Gini index
values of 0.367 and 0.429.
• Therefore, the attribute age with splitting subset {youth, senior} gives the
minimum Gini index overall and is used as the splitting criterion.
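A minimal sketch reproducing the income computation (the yes/no counts per partition are read off the 14-tuple training set):

    def gini(*counts):
        total = sum(counts)
        return 1 - sum((c / total) ** 2 for c in counts)

    g_D = gini(9, 5)                                     # 0.459
    # D1 = income in {low, medium}: 7 yes, 3 no; D2 = {high}: 2 yes, 2 no
    g_split = (10/14) * gini(7, 3) + (4/14) * gini(2, 2)
    print(round(g_D, 3), round(g_split, 3))              # 0.459 0.443
    print(round(g_D - g_split, 3))                       # reduction in impurity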
Comparing Attribute Selection Measures

• The three measures, in general, return good results but


– Information gain:
• biased towards multivalued attributes
– Gain ratio:
• tends to prefer unbalanced splits in which one partition is
much smaller than the others
– Gini index:
• biased to multivalued attributes
• has difficulty when number of classes is large

Overfitting and Tree Pruning
• Overfitting: An induced tree may overfit the training data
– Too many branches, some may reflect anomalies due to
noise or outliers
– Poor accuracy for unseen samples
• Two approaches to avoid overfitting
– Prepruning: Halt tree construction early; do not split a node
if this would result in the goodness measure falling below a
threshold
• Difficult to choose an appropriate threshold
– Postpruning: Remove branches from a “fully grown” tree—
get a sequence of progressively pruned trees
• Use a set of data different from the training data to
decide which is the “best pruned tree”
Regression models
• Regression models can be used to approximate the given data.
• In (simple)linear regression, the data are modeled to fit a straight
line.
• For example, a random variable, y (called a response variable), can
be modeled as a linear function of another random variable, x
(called a predictor variable), with the equation
• y=wx+b
• where the variance of y is assumed to be constant. In the context of
data mining, x and y are numeric database attributes.
• The coefficients, w and b (called regression coefficients), specify the
slope of the line and the y-intercept, respectively.
• These coefficients can be solved for by the method of least squares,
which minimizes the error between the actual data points and the
estimate of the line.
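A minimal least-squares sketch for the line y = wx + b (function name and sample numbers are illustrative, not from the slides):

    def least_squares(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        # slope w = sum (x - mx)(y - my) / sum (x - mx)^2 ; intercept b = my - w*mx
        w = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
        return w, my - w * mx

    # e.g., x = years of experience, y = salary in $1000s (made-up numbers)
    w, b = least_squares([3, 8, 9, 13, 3], [30, 57, 64, 72, 36])
    print(round(w, 2), round(b, 2))     # fitted slope and y-intercept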



Multiple linear regression
Multiple linear regression is an extension of (simple) linear
regression that allows a response variable, y, to be modeled as a
linear function of two or more predictor variables, e.g.

  y = b + w_1 x_1 + w_2 x_2

for two predictor variables x_1 and x_2.



Classifier Model Evaluation Metrics:
Confusion Matrix
Confusion Matrix:

Actual class \ Predicted class | C1                   | ¬C1
C1                             | True Positives (TP)  | False Negatives (FN)
¬C1                            | False Positives (FP) | True Negatives (TN)

Example of Confusion Matrix:

Actual class \ Predicted class | buy_computer = yes | buy_computer = no | Total
buy_computer = yes             | 6954               | 46                | 7000
buy_computer = no              | 412                | 2588              | 3000
Total                          | 7366               | 2634              | 10000

• Given m classes, an entry CM_{i,j} in a confusion matrix indicates the
number of tuples in class i that were labeled by the classifier as class j
• May have extra rows/columns to provide totals
Classifier Evaluation Metrics: Accuracy,
Error Rate, Sensitivity and Specificity

A \ P | C  | ¬C | Total
C     | TP | FN | P
¬C    | FP | TN | N
Total | P’ | N’ | All

• Classifier accuracy (recognition rate): percentage of test set tuples
that are correctly classified
  Accuracy = (TP + TN) / All
• Error rate: 1 - accuracy, or
  Error rate = (FP + FN) / All
• Sensitivity: true positive recognition rate
  Sensitivity = TP / P
• Specificity: true negative recognition rate
  Specificity = TN / N
Classifier Evaluation Metrics:
Precision and Recall, and F-measures
• Precision (exactness): what % of tuples that the classifier labeled as
positive are actually positive?
  Precision = TP / (TP + FP)
• Recall (completeness): what % of positive tuples did the classifier
label as positive?
  Recall = TP / (TP + FN) = TP / P
• F measure (F1 or F-score): harmonic mean of precision and recall:
  F1 = 2 × Precision × Recall / (Precision + Recall)
• F_β: weighted measure of precision and recall:
  F_β = (1 + β²) × Precision × Recall / (β² × Precision + Recall)
Classifier Evaluation Metrics: Example

Actual class \ Predicted class | cancer = yes | cancer = no | Total | Recognition (%)
cancer = yes                   | 90           | 210         | 300   | 30.00 (sensitivity)
cancer = no                    | 140          | 9560        | 9700  | 98.56 (specificity)
Total                          | 230          | 9770        | 10000 | 96.50 (accuracy)

– Precision = 90/230 = 39.13%    Recall = 90/300 = 30.00%
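A minimal sketch computing these metrics from the cancer confusion matrix (variable names are illustrative):

    TP, FN = 90, 210          # actual cancer = yes row
    FP, TN = 140, 9560        # actual cancer = no row
    P, N = TP + FN, FP + TN   # 300 positives, 9700 negatives

    accuracy    = (TP + TN) / (P + N)     # 0.9650
    sensitivity = TP / P                  # 0.3000 (also the recall)
    specificity = TN / N                  # 0.9856
    precision   = TP / (TP + FP)          # 0.3913
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    print(accuracy, sensitivity, specificity, precision, round(f1, 2))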
Holdout
• In this method, the given data are randomly partitioned into
two independent sets, a training set and a test set.
• Typically, two-thirds of the data are allocated to the training
set, and the remaining one-third is allocated to the test set.
• The training set is used to derive the model. The model’s
accuracy is then estimated with the test set .

Random subsampling
• Random subsampling is a variation of the
holdout method in which the holdout method
is repeated k times.
• The overall accuracy estimate is taken as the
average of the accuracies obtained from each
iteration.



Cross-validation
• In k-fold cross-validation, the initial data are randomly partitioned into k
mutually exclusive subsets or “folds,” D1, D2, …, Dk, each of
approximately equal size.
• Training and testing is performed k times.
• In iteration i, partition Di is reserved as the test set, and the remaining
partitions are collectively used to train the model.
• That is, in the first iteration, subsets D2, …, Dk collectively serve as the
training set to obtain a first model, which is tested on D1;
• the second iteration is trained on subsets D1, D3, …, Dk and tested on
D2; and so on.
• Unlike the holdout and random subsampling methods, here each sample is
used the same number of times for training and once for testing.
• For classification, the accuracy estimate is the overall number of correct
classifications from the k iterations, divided by the total number of tuples
in the initial data.
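A minimal sketch of the k-fold partitioning (random.shuffle stands in for the random partitioning; names are illustrative):

    import random

    def k_fold_indices(n, k, seed=0):
        idx = list(range(n))
        random.Random(seed).shuffle(idx)
        folds = [idx[i::k] for i in range(k)]    # k mutually exclusive folds
        for i in range(k):                       # iteration i: D_i is the test set
            train = [j for f in folds[:i] + folds[i+1:] for j in f]
            yield train, folds[i]

    for train, test in k_fold_indices(n=10, k=5):
        print(sorted(test))    # each sample appears in exactly one test fold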



Bootstrap
• The bootstrap method samples the given
training tuples uniformly with replacement.
• That is, each time a tuple is selected, it is equally
likely to be selected again and re-added to the
training set.
• For instance, imagine a machine that randomly
selects tuples for our training set.
• In sampling with replacement, the machine is
allowed to select the same tuple more than once.
.632 bootstrap

• Suppose we are given a data set of d tuples.


• The data set is sampled d times, with replacement,
resulting in a bootstrap sample or training set of d samples.
• It is very likely that some of the original data tuples will
occur more than once in this sample.
• The data tuples that did not make it into the training set
end up forming the test set.
• Suppose we were to try this out several times. As it turns
out, on average, 63.2% of the original data tuples will end
up in the bootstrap sample, and the remaining 36.8% will
form the test set (hence, the name, .632 bootstrap).
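The 63.2% figure is easy to check empirically. A minimal sketch (d and the number of trials are arbitrary choices):

    import random

    d, trials = 1000, 100
    rng = random.Random(0)
    fractions = []
    for _ in range(trials):
        sample = {rng.randrange(d) for _ in range(d)}    # d draws with replacement
        fractions.append(len(sample) / d)                # fraction of distinct tuples drawn
    print(round(sum(fractions) / trials, 3))  # ~0.632; the remaining ~36.8% form the test set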



Comparing Classifier Performance using ROC
(Receiver Operating Characteristic) Curves
• Receiver operating characteristic curves are a useful visual tool for comparing two
classification models.
• An ROC curve for a given model shows the trade-off between the true positive rate
(TPR) and the false positive rate (FPR).
• Given a test set and a model, TPR is the proportion of positive (or “yes”) tuples that
are correctly labeled by the model; FPR is the proportion of negative (or “no”)
tuples that are mislabeled as positive.
• Given that TP, FP, P, and N are the number of true positive, false positive, positive,
and negative tuples, respectively, we know that TPR = TP/P , which is sensitivity.
• Furthermore, FPR = FP/N , which is 1 - specificity.
• For a two-class problem, an ROC curve allows us to visualize the trade-off between
the rate at which the model can accurately recognize positive cases versus the rate
at which it mistakenly identifies negative cases as positive for different portions of
the test set.
• Any increase in TPR occurs at the cost of an increase in FPR. The area under the
ROC curve is a measure of the accuracy of the model.
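A minimal sketch of turning scored test tuples into ROC points, assuming the classifier returns a probability per tuple (names are illustrative):

    def roc_points(scored):          # scored: list of (probability, actual_label) pairs
        scored = sorted(scored, reverse=True)            # most confident positives first
        P = sum(1 for _, y in scored if y == 1)
        N = len(scored) - P
        tp = fp = 0
        points = [(0.0, 0.0)]
        for _, y in scored:          # lower the threshold one tuple at a time
            if y == 1:
                tp += 1
            else:
                fp += 1
            points.append((fp / N, tp / P))    # (FPR, TPR) = (1 - specificity, sensitivity)
        return points

    print(roc_points([(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.5, 0)]))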
Techniques to Improve Classification
Accuracy
• An ensemble for classification is a composite
model, made up of a combination of
classifiers.
• The individual classifiers vote, and a class label
prediction is returned by the ensemble based
on the collection of votes.
• Ensembles tend to be more accurate than
their component classifiers.



Bagging (bootstrap aggregation)

• The bagging algorithm creates an ensemble of classification models for a
learning scheme, where each model gives an equally weighted prediction.
• Input:
– D, a set of d training tuples;
– k, the number of models in the ensemble;
– a classification learning scheme (decision tree algorithm, naïve Bayesian,
etc.).
• Output: The ensemble—a composite model, M.
• Method:
– (1) for i = 1 to k do // create k models:
– (2) create bootstrap sample, Di, by sampling D with replacement;
– (3) use Di and the learning scheme to derive a model, Mi;
– (4) endfor
• To use the ensemble to classify a tuple, X:
– let each of the k models classify X and return the majority vote,
as in the sketch below.
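A minimal sketch of this method; learn stands in for the chosen learning scheme and is an assumption, not something the slide specifies:

    import random
    from collections import Counter

    def bagging(D, k, learn, seed=0):
        rng = random.Random(seed)
        models = []
        for _ in range(k):                          # (1) for i = 1 to k
            Di = [rng.choice(D) for _ in D]         # (2) bootstrap sample of D
            models.append(learn(Di))                # (3) derive model Mi
        def ensemble(x):                            # classify X: majority vote
            return Counter(m(x) for m in models).most_common(1)[0][0]
        return ensemble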



Boosting
• In boosting, weights are also assigned to each
training tuple.
• A series of k classifiers is iteratively learned.
• After a classifier, Mi, is learned, the weights are
updated to allow the subsequent classifier, Mi+1,
to “pay more attention” to the training tuples
that were misclassified by Mi.
• The final boosted classifier, M, combines the votes
of each individual classifier, where the weight of
each classifier’s vote is a function of its accuracy.
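AdaBoost is one concrete boosting algorithm. A minimal AdaBoost-style sketch of the weight update just described, assuming labels in {-1, +1} and a weighted base learner learn (both assumptions, not from the slide):

    import math

    def boost(D, labels, k, learn):
        n = len(D)
        w = [1 / n] * n                             # equal tuple weights to start
        models, alphas = [], []
        for _ in range(k):
            m = learn(D, labels, w)                 # train on the weighted tuples
            err = sum(wi for wi, x, y in zip(w, D, labels) if m(x) != y)
            if err == 0 or err >= 0.5:
                break
            alpha = 0.5 * math.log((1 - err) / err)  # vote weight grows with accuracy
            w = [wi * math.exp(alpha if m(x) != y else -alpha)  # misclassified tuples
                 for wi, x, y in zip(w, D, labels)]             # gain weight
            s = sum(w)
            w = [wi / s for wi in w]                # renormalize to a distribution
            models.append(m)
            alphas.append(alpha)
        def ensemble(x):                            # accuracy-weighted vote
            return 1 if sum(a * m(x) for a, m in zip(alphas, models)) >= 0 else -1
        return ensemble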
Random Forests
• We now present another ensemble method called random
forests.
• Imagine that each of the classifiers in the ensemble is a
decision tree classifier so that the collection of classifiers is
a “forest.”
• The individual decision trees are generated using a random
selection of attributes at each node to determine the split.
• More formally, each tree depends on the values of a
random vector sampled independently and with the same
distribution for all trees in the forest.
• During classification, each tree votes and the most popular
class is returned.
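A minimal sketch of the per-node random attribute selection, reusing the hypothetical info_gain helper from the ID3 sketch earlier; f is the number of attributes considered at each split:

    import random
    from collections import Counter

    def build_random_tree(rows, labels, attrs, f, rng):
        if len(set(labels)) == 1:
            return labels[0]
        if not attrs:
            return Counter(labels).most_common(1)[0][0]
        cand = rng.sample(attrs, min(f, len(attrs)))    # random subset at this node
        best = max(cand, key=lambda a: info_gain(rows, labels, a))
        node = {}
        for value in {row[best] for row in rows}:
            sub = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
            srows, slabels = zip(*sub)
            node[(best, value)] = build_random_tree(list(srows), list(slabels),
                                                    [a for a in attrs if a != best],
                                                    f, rng)
        return node

    # A forest is k such trees, each grown on a bootstrap sample; during
    # classification, each tree votes and the most popular class wins.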

