Predictive Modeling
1
In the process of discussing supervised segmentation, we
introduce one of the fundamental ideas of data mining:
finding or selecting important, informative variables or
“attributes” of the entities described by the data.
What exactly it means to be “informative” varies among applications, but generally, information is a quantity that reduces uncertainty about something.
2
Outline
Models, Induction, and Prediction
Supervised Segmentation
3
Models, Induction, and Prediction
A model is a simplified representation of reality created
to serve a purpose. It is simplified based on some
assumptions about what is and is not important for the
specific purpose, or sometimes based on constraints on
information or tractability.
For example, a map is a model of the physical world. It
abstracts away a tremendous amount of information that
the mapmaker deemed irrelevant for its purpose. It
preserves, and sometimes further simplifies, the relevant
information.
4
Models, Induction, and Prediction
In data science, a predictive model is a formula for
estimating the unknown value of interest: the target.
The formula could be mathematical, or it could be a
logical statement such as a rule. Often it is a hybrid
of the two.
Given our division of supervised data mining into
classification and regression, we will consider
classification models (and class-probability
estimation models) and regression models.
5
Terminology: Prediction
In common usage, prediction means to forecast a future
event.
In data science, prediction more generally means to estimate
an unknown value. This value could be something in the
future (true prediction, in the common-usage sense), but it could also
be something in the present or in the past.
Indeed, since data mining usually deals with historical data,
models very often are built and tested using events from the
past.
The key is that the model is intended to be used to estimate
an unknown value.
6
Models, Induction, and Prediction
Supervised learning is model creation where the model describes a relationship between a set of selected variables (attributes or features) and a predefined variable called the target variable. The model
estimates the value of the target variable as a function (possibly
a probabilistic function) of the features.
So, for our churn-prediction problem we would like to build a
model of the propensity to churn as a function of customer
account attributes, such as age, income, length of time with the
company, number of calls to customer service, overage
charges, customer demographics, data usage, and others.
7
Models, Induction, and Prediction
8
Many Names for the Same Things
The principles and techniques of data science historically have
been studied in several different fields, including machine
learning, pattern recognition, statistics, databases, and others.
As a result there often are several different names for the
same things. We typically will refer to a dataset, whose form usually is the
same as a table of a database or a spreadsheet: rows correspond to the entities described by the data, and columns to their attributes (features).
9
Many Names for the Same Things
The features (table columns) have many different
names as well. Statisticians speak of independent
variables or predictors as the attributes supplied
as input. In operations research you may also
hear explanatory variable.
The target variable, whose values are to be predicted, is commonly called the dependent variable in statistics.
10
Models, Induction, and Prediction
The creation of models from data is known as
model induction.
The procedure that creates the model from the data is called the induction algorithm or learner.
11
Outline
Models, Induction, and Prediction
Supervised Segmentation
12
Supervised Segmentation
13
Supervised Segmentation
14
Outline
Models, Induction, and Prediction
Supervised Segmentation
15
Selecting Informative Attributes
16
Selecting Informative Attributes
Attributes:
◦head-shape: square, circular
◦body-shape: rectangular, oval
◦body-color: gray, white
Target variable:
◦write-off: Yes, No
17
Selecting Informative Attributes
So let’s ask ourselves:
◦ which of the attributes would be best to segment these people into
groups, in a way that will distinguish write-offs from non-write-offs?
Technically, we would like the resulting groups to be as pure
as possible. By pure we mean homogeneous with respect to
the target variable. If every member of a group has the same
value for the target, then the group is pure. If there is at least
one member of the group that has a different value for the
target variable than the rest of the group, then the group is
impure.
Unfortunately, in real data we seldom expect to find a variable that will split the data into perfectly pure segments.
18
Selecting Informative Attributes
We therefore need a purity measure.
The most common splitting criterion is called information gain, and it is based on a purity measure called entropy.
19
Selecting Informative Attributes
Entropy is a measure of disorder that can be
applied to a set, such as one of our individual
segments.
Disorder corresponds to how mixed (impure) the segment is with respect to the value of the target variable. Formally, for a set whose members take target values with proportions p1, p2, ...:
entropy = - p1 × log2(p1) - p2 × log2(p2) - ...
20
Selecting Informative Attributes
21
Selecting Informative Attributes
p(non-write-off) = 7 / 10 = 0.7
p(write-off) = 3 / 10 = 0.3
entropy(S) = - 0.7 × log2(0.7) - 0.3 × log2(0.3)
           ≈ - 0.7 × (-0.51) - 0.3 × (-1.74)
           ≈ 0.88
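To make the calculation concrete, here is a minimal Python sketch of the same entropy computation; the function name and the list encoding of the segment are illustrative choices, not anything prescribed by the slides.

```python
import math

def entropy(labels):
    """Entropy of a set, given the target-variable values of its members."""
    n = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n      # proportion of the set with this value
        result -= p * math.log2(p)       # accumulate - p * log2(p)
    return result

# The 10-person segment from the slide: 7 non-write-offs, 3 write-offs.
S = ["no"] * 7 + ["yes"] * 3
print(round(entropy(S), 2))              # 0.88
```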
22
Selecting Informative Attributes
23
Selecting Informative Attributes
entropy(parent)
= - p( • ) × log2 p( • ) - p( ☆ ) × log2 p( ☆ )
≈ - 0.53 × (-0.9) - 0.47 × (-1.1)
≈ 0.99 (very impure)
24
Selecting Informative Attributes
The entropy of the left child is:
entropy(Balance < 50K) = - p( • ) × log2 p( • ) - p( ☆ ) × log2 p( ☆ )
≈ - 0.92 × ( - 0.12) - 0.08 × ( - 3.7)
≈ 0.39
The entropy of the right child is:
entropy(Balance ≥ 50K) = - p( • ) × log2 p( • ) - p( ☆ ) × log2 p( ☆ )
≈ - 0.24 × ( - 2.1) - 0.76 × ( - 0.39)
≈ 0.79
25
Selecting Informative Attributes
Information Gain
= entropy(parent) - [ p(Balance < 50K) × entropy(Balance < 50K)
                    + p(Balance ≥ 50K) × entropy(Balance ≥ 50K) ]
≈ 0.99 - (0.43 × 0.39 + 0.57 × 0.79)
≈ 0.37
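A similarly minimal sketch of the information-gain formula, plugging in the parent entropy, the child proportions, and the child entropies from this slide; representing each child as a (proportion, entropy) pair is an assumption made just for illustration.

```python
def information_gain(parent_entropy, children):
    """children: list of (proportion_of_parent, entropy) pairs, one per child segment."""
    weighted_child_entropy = sum(p * e for p, e in children)
    return parent_entropy - weighted_child_entropy

# Numbers from the Balance split above.
ig = information_gain(0.99, [(0.43, 0.39), (0.57, 0.79)])
print(round(ig, 2))   # 0.37
```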
26
Selecting Informative Attributes
entropy(parent) ≈ 0.99
entropy(Residence=OWN) ≈ 0.54
entropy(Residence=RENT) ≈ 0.97
entropy(Residence=OTHER) ≈ 0.98
27
Numeric variables
We have not discussed what exactly to do if the attribute is
numeric.
Numeric variables can be “discretized” by choosing a split
point (or many split points).
For example, Income could be divided into two or more ranges.
Information gain can be applied to evaluate the segmentation
created by this discretization of the numeric attribute. We still
are left with the question of how to choose the split point(s) for
the numeric attribute.
Conceptually, we can try all reasonable split points, and choose
the one that gives the highest information gain.
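One way to sketch the “try all reasonable split points” idea in Python: evaluate the midpoints between consecutive distinct values of the numeric attribute and keep the split with the highest information gain. The income/write-off data at the bottom is hypothetical, purely for illustration.

```python
import math

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(v) / n) * math.log2(labels.count(v) / n)
                for v in set(labels))

def best_numeric_split(values, labels):
    """Return (split_point, information_gain) maximizing gain for the test value < split_point."""
    parent = entropy(labels)
    pairs = sorted(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, 0.0)
    for lo, hi in zip(distinct, distinct[1:]):
        split = (lo + hi) / 2                       # candidate split: midpoint
        left = [label for v, label in pairs if v < split]
        right = [label for v, label in pairs if v >= split]
        gain = parent - (len(left) / len(labels)) * entropy(left) \
                      - (len(right) / len(labels)) * entropy(right)
        if gain > best[1]:
            best = (split, gain)
    return best

# Hypothetical example: income in $1000s and whether the account was written off.
income = [25, 32, 40, 55, 61, 78]
write_off = ["yes", "yes", "yes", "no", "no", "no"]
print(best_numeric_split(income, write_off))        # (47.5, 1.0)
```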
28
Outline
Models, Induction, and Prediction
Supervised Segmentation
29
Example: Attribute Selection with Information Gain
30
Example: Attribute Selection with Information Gain
31
Example: Attribute Selection with Information Gain
32
Outline
Models, Induction, and Prediction
Supervised Segmentation
33
Supervised Segmentation with Tree-Structured Models
34
Supervised Segmentation with Tree-Structured Models
Consider a segmentation of the data to take the form of a “tree,” such as that shown in Figure 3-10.
35
Supervised Segmentation with Tree-Structured Models
37
Supervised Segmentation with Tree-Structured Models
38
Supervised Segmentation with Tree-Structured Models
39
Selecting Informative Attributes
Attributes:
◦head-shape: square, circular
◦body-shape: rectangular, oval
◦body-color: gray, white
Target variable:
◦write-off: Yes, No
40
Supervised Segmentation with Tree-Structured Models
41
Supervised Segmentation with Tree-Structured Models
Figure 3-12. Second partitioning: the oval body people sub-grouped by head type.
42
Supervised Segmentation with Tree-Structured Models
Figure 3-13. Third partitioning: the rectangular body people sub-grouped by body color.
43
Figure 3-14. The classification tree resulting from the splits done in Figure 3-11 to Figure 3-13.
44
Supervised Segmentation with Tree-Structured Models
45
Visualizing Segmentations
46
Trees as Sets of Rules
You classify a new unseen instance by starting at
the root node and following the attribute tests
downward until you reach a leaf node, which
specifies the instance’s predicted class.
If we trace down a single path from the root node
to a leaf, collecting the conditions as we go, we
generate a rule.
Each rule consists of the attribute tests along the
path connected with AND.
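As an illustration of this traversal, here is one possible (hypothetical) Python representation of such a tree, with a classify function that follows the attribute tests from the root down to a leaf; the example tree mirrors the Balance/Age rules on the next slide, and the node layout is an assumption for illustration, not a standard library structure.

```python
# A leaf is just a class label (a string); an internal node holds an attribute
# test and the two subtrees for the test's outcomes.
class Node:
    def __init__(self, attribute, threshold, if_true, if_false):
        self.attribute = attribute     # e.g. "Balance"
        self.threshold = threshold     # test: instance[attribute] < threshold
        self.if_true = if_true
        self.if_false = if_false

def classify(tree, instance):
    """Follow the attribute tests downward until reaching a leaf (a class label)."""
    while isinstance(tree, Node):
        goes_left = instance[tree.attribute] < tree.threshold
        tree = tree.if_true if goes_left else tree.if_false
    return tree

# Tree mirroring the Balance/Age example on the next slide.
tree = Node("Balance", 50_000,
            Node("Age", 50, "Write-off", "No Write-off"),
            Node("Age", 45, "Write-off", "No Write-off"))

print(classify(tree, {"Balance": 30_000, "Age": 62}))   # No Write-off
```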
47
Trees as Sets of Rules
48
Trees as Sets of Rules
IF (Balance < 50K) AND (Age < 50) THEN Class=Write-off
IF (Balance < 50K) AND (Age ≥ 50) THEN Class=No Write-off
IF (Balance ≥ 50K) AND (Age < 45) THEN Class=Write-off
IF (Balance ≥ 50K) AND (Age ≥ 45) THEN Class=No Write-off
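The same four rules can be written directly as a small Python function; the function name and the units (dollars, years) are illustrative assumptions.

```python
def predict_write_off(balance, age):
    """The four rules above as nested conditions (balance in dollars, age in years)."""
    if balance < 50_000:
        return "Write-off" if age < 50 else "No Write-off"
    else:
        return "Write-off" if age < 45 else "No Write-off"

print(predict_write_off(balance=120_000, age=40))   # Write-off
```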
49
Trees as Sets of Rules
The classification tree is equivalent to this rule
set.
Every classification tree can be expressed as a set of rules in this way.
50
Probability Estimation
In many decision-making problems, we would like a
more informative prediction than just a classification.
For example, consider our churn-prediction problem: if we have a limited budget for retention incentives, we would like to target the customers most likely to churn, which requires an estimate of each customer's probability of churning rather than just a yes/no classification.
51
Probability Estimation Tree
52
Probability Estimation
If we are satisfied to assign the same class probability
to every member of the segment corresponding to a
tree leaf, we can use instance counts at each leaf to
compute a class probability estimate.
For example, if a leaf contains n positive instances and m negative instances, the probability that a new instance falling into that leaf belongs to the positive class can be estimated as n / (n + m).
53
Probability Estimation
A problem: we may be overly optimistic about the
probability of class membership for segments with
very small numbers of instances. At the extreme, if
a leaf happens to have only a single instance,
should we be willing to say that there is a 100%
probability that members of that segment will have
the class that this one instance happens to have?
This phenomenon is one example of a fundamental
issue in data science (“overfitting”).
54
Probability Estimation
Instead of simply computing the frequency, we would
often use a “smoothed” version of the frequency-
based estimate, known as the Laplace correction, the
purpose of which is to moderate the influence of
leaves with only a few instances.
The equation for binary class probability estimation becomes:
p(c) = (n + 1) / (n + m + 2),
where n is the number of instances in the leaf belonging to class c and m is the number of instances in the leaf not belonging to class c.
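A minimal sketch contrasting the raw frequency-based estimate with the Laplace-corrected estimate, using the single-instance leaf discussed on the previous slide.

```python
def frequency_estimate(n_pos, n_neg):
    """Raw frequency-based estimate: n / (n + m)."""
    return n_pos / (n_pos + n_neg)

def laplace_estimate(n_pos, n_neg):
    """Laplace-corrected estimate for binary class probability: (n + 1) / (n + m + 2)."""
    return (n_pos + 1) / (n_pos + n_neg + 2)

# A leaf with a single positive instance: the raw frequency claims 100%,
# while the smoothed estimate is far less extreme.
print(frequency_estimate(1, 0))   # 1.0
print(laplace_estimate(1, 0))     # 0.666...
```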
55
Example: Addressing the Churn
Problem with Tree Induction
We have a historical data set of 20,000 customers.
At the point of collecting the data, each customer either had stayed with the company or had left (churned).
56
Example: Addressing the Churn
Problem with Tree Induction
57
Example: Addressing the Churn
Problem with Tree Induction
How good is each of these variables individually?
For this we measure the information gain of each attribute, as discussed earlier.
58
59
Example: Addressing the Churn
Problem with Tree Induction
The answer is that the table ranks each feature
by how good it is independently, evaluated
separately on the entire population of instances.
Nodes in a classification tree depend on the instances above them in the tree.
60
Example: Addressing the Churn
Problem with Tree Induction
Therefore, except for the root node, features in a
classification tree are not evaluated on the entire
set of instances.
The information gain of a feature depends on the set of instances against which it is evaluated, so the ordering of features chosen by the tree may differ from their independent ranking over the whole population.
61