2-Overview of Data Mining

This document provides an overview of predictive analytics and data mining processes. It discusses key concepts like supervised vs. unsupervised learning, labeled vs. unlabeled data, and structured vs. unstructured data. Predictive analytics uses machine learning models to predict outcomes like customer churn, risk of default, or likelihood of clicking an ad based on historical data patterns. The data mining process involves collecting and preparing data, then applying supervised or unsupervised algorithms to discover patterns and relationships to help make predictions on new data.

ISOM3360 Data Mining for Business Analytics

Overview of Data Mining Process

Instructor: Yi Yang
Department of ISOM
Spring 2023
Last Lecture
Course overview

This Lecture
Data

Overview of data mining process

Prediction is at the heart of making
decisions under uncertainty. Our
businesses and personal lives are
riddled with such decisions.

Uncertainty constrains strategy.


Better prediction creates
opportunities for new business
structures and strategies to
compete.

Predictive machine learning
ML is the process of training a piece of software, called a model, that
learns patterns from a dataset. This predictive model can then
make predictions about previously unseen data.

ML can be applied in situations where it is very challenging (or
impossible) to define patterns by hand.

Patterns, patterns, patterns

[Figure: images labeled "face" and "non-face" go into an ML model, which induces a pattern from the data and outputs "It's a face".]
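The idea of a model that induces a pattern from labeled data and then predicts on unseen data can be sketched in a few lines. This is only an illustration, not the method used in the lecture: it uses a nearest-centroid classifier, and the two numeric features per image and all of their values are made up.

```python
def train(examples):
    """Learn one centroid (mean feature vector) per label.

    examples: list of (features, label) pairs.
    Returns a dict mapping each label to its centroid.
    """
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        for i, x in enumerate(features):
            sums[label][i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical training data: two made-up numeric features per image.
training_data = [
    ((0.9, 0.8), "face"),
    ((0.8, 0.9), "face"),
    ((0.2, 0.1), "non-face"),
    ((0.1, 0.3), "non-face"),
]

model = train(training_data)
unseen = (0.7, 0.75)             # an image the model has not seen before
prediction = predict(model, unseen)
print(prediction)
```

Nobody wrote a rule for what a face looks like here; the "pattern" is whatever the centroids capture from the labeled examples.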
Predictive analytics

We use predictions to take action in a product or in a business process.

E.g., a system predicts that a user will like a new camera, so it sends
emails about this new camera to the user.

That is the main difference from descriptive analysis.

Prediction ≠ Decision
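
One way to read "Prediction ≠ Decision" is that the model only outputs a score, and a separate rule turns that score into an action. A minimal sketch, assuming hypothetical predicted probabilities that a user will like the camera, and made-up values for the payoff and the cost of an email:

```python
def decide_send_email(p_like, value_if_like=20.0, cost_per_email=0.5):
    """Decision rule (an assumption, not from the slides):
    send the email only if the expected gain exceeds the cost of sending."""
    return p_like * value_if_like > cost_per_email


# Hypothetical model outputs: probability that each user likes the new camera.
predictions = {"user_a": 0.30, "user_b": 0.01}

decisions = {user: decide_send_email(p) for user, p in predictions.items()}
print(decisions)
```

The same prediction can lead to a different decision if the cost or the payoff changes, which is why the two are kept separate.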

Exercise
You, as a company marketing director, want to know the answers to the
following questions. Which ones require a data mining solution?

Who are the high-value customers?

Is there an age difference between the high-value customers and the low-value
customers?

Will a particular new customer be a high-value customer?

How much in sales should I expect a new customer to generate?

Customer  Gender  Age  Membership  Monthly Purchase  Amount
Alice     F       25   Y           5                 $120
Bob       M       40   Y           3                 $30
Charlie   M       35   Y           6                 $210
Doug      M       18   N           4                 $95
…         …       …    …           …                 …

Let's define customers whose amount > $100 as high-value customers. The
rest are low-value customers.
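
Questions about customers who are already in the table can be answered with a threshold and an average, as the sketch below shows for the four visible rows. Questions about a new customer cannot be looked up this way, because that customer is not in the data yet. The field names are chosen here for illustration.

```python
# The four visible rows of the table (the remaining rows are elided in the slide).
customers = [
    {"name": "Alice",   "gender": "F", "age": 25, "member": True,  "monthly_purchases": 5, "amount": 120},
    {"name": "Bob",     "gender": "M", "age": 40, "member": True,  "monthly_purchases": 3, "amount": 30},
    {"name": "Charlie", "gender": "M", "age": 35, "member": True,  "monthly_purchases": 6, "amount": 210},
    {"name": "Doug",    "gender": "M", "age": 18, "member": False, "monthly_purchases": 4, "amount": 95},
]

HIGH_VALUE_THRESHOLD = 100  # amount > $100 defines a high-value customer


def is_high_value(customer):
    return customer["amount"] > HIGH_VALUE_THRESHOLD


def mean_age(group):
    return sum(c["age"] for c in group) / len(group)


high = [c for c in customers if is_high_value(c)]
low = [c for c in customers if not is_high_value(c)]

print([c["name"] for c in high])        # who the high-value customers are
print(mean_age(high), mean_age(low))    # age comparison between the groups
```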

Exercise

Say you work at a digital media company that provides an
online video streaming service. You have lots of data
about lots of users watching lots of movies and TV shows. What
decisions can benefit from predictive analytics?

Data mining process

Terminology: data
Label: the thing we are predicting. The label can be the kind of
animal in a picture, a binary indicator of whether an email is spam,
a house price, or the Chinese translation of an English sentence.

Features: variables that represent the data. The number of features can
run into the millions.

What can be features in a spam detector?

What can be features in predicting customer churn?

Example: a particular instance of data.

Dataset: a set of examples.

Name  Balance  Age  Default
Mike  123,000  50   No
Mary  51,100   40   Yes
Bill  68,000   55   No
Jim   74,000   46   Yes
Dave  23,000   44   No
Anne  100,000  50   Yes
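
The terms above map onto code fairly directly. A minimal sketch using the table from this slide, assuming that Balance and Age are the features, Default is the label, and Name is only an identifier (the slide does not say which columns play which role):

```python
# Dataset: a set of examples. Each dict is one example (one row of the table).
dataset = [
    {"name": "Mike", "balance": 123_000, "age": 50, "default": "No"},
    {"name": "Mary", "balance": 51_100,  "age": 40, "default": "Yes"},
    {"name": "Bill", "balance": 68_000,  "age": 55, "default": "No"},
    {"name": "Jim",  "balance": 74_000,  "age": 46, "default": "Yes"},
    {"name": "Dave", "balance": 23_000,  "age": 44, "default": "No"},
    {"name": "Anne", "balance": 100_000, "age": 50, "default": "Yes"},
]

FEATURES = ["balance", "age"]   # assumed feature columns
LABEL = "default"               # assumed label column

# Split every example into its feature vector and its label.
X = [[example[f] for f in FEATURES] for example in dataset]
y = [example[LABEL] for example in dataset]

print(X[0], y[0])
```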
Types of features

Numeric feature
the number of items bought by a customer (e.g., 12)

the time that a customer spends on the website (e.g., 16.49 min)

Categorical feature
Example: industry sector (Education, Computer, Agriculture, Energy, etc.)

Example: location region (Sai Kung, Sha Tin, Wan Chai, etc.)

The balance in a bank account is a ______ feature.

Zipcode (e.g. 200041) is a _____ feature.
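
Numeric features can go into a model as they are, while categorical features usually need an encoding first. The sketch below uses one-hot encoding, which is a common choice but is not named in the slides; the customer record and the list of sectors are made up for illustration.

```python
def one_hot(value, categories):
    """Encode a categorical value as a list with a single 1 at its category."""
    return [1 if value == category else 0 for category in categories]


SECTORS = ["Education", "Computer", "Agriculture", "Energy"]

# Hypothetical customer with two numeric features and one categorical feature.
customer = {"items_bought": 12, "minutes_on_site": 16.49, "sector": "Computer"}

feature_vector = (
    [customer["items_bought"], customer["minutes_on_site"]]  # numeric: used as-is
    + one_hot(customer["sector"], SECTORS)                   # categorical: encoded
)
print(feature_vector)
```

The encoding avoids implying an order or a distance between categories, which a single integer code per sector would do.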


An Example: Customer Churn

Which customers may exit the contract, and which may stay?

Descriptive analytics

What about unstructured data

Merrill Lynch cited a rule of thumb that somewhere around 90% of all
potentially usable business information may originate in
unstructured form.

Unstructured data: Text

Unstructured data: Image

Two learning paradigms
Supervised learning (prediction): learns a model
that predicts labels for unseen data.
House price prediction (numerical label)
Credit card default (categorical (binary) label)

Unsupervised learning (relationship mining): finds relationships
in training data without reference to labels.
Customer clustering

Key: is there a label that we are trying to predict?
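
The difference between the two paradigms shows up in what the algorithm is given. In the sketch below, the supervised function receives (feature, label) pairs, while the clustering function receives only the feature values. Both algorithms (1-nearest-neighbour and 2-means on a single feature) and the monthly spend figures are chosen here for illustration and are not from the slides.

```python
def nearest_neighbor_predict(labeled, x):
    """Supervised: predict the label of the closest labeled example."""
    _, label = min(labeled, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(values, iterations=10):
    """Unsupervised: split 1-D values into two clusters (k-means with k=2)."""
    if min(values) == max(values):
        return list(values), []          # nothing to separate
    c0, c1 = min(values), max(values)    # initial cluster centers
    for _ in range(iterations):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return g0, g1


# Hypothetical monthly spend per customer.
spend = [20, 25, 30, 200, 210, 220]

# Supervised: every training example comes with a label.
labeled = list(zip(spend, ["low", "low", "low", "high", "high", "high"]))
predicted = nearest_neighbor_predict(labeled, 190)

# Unsupervised: same values, no labels; the groups are discovered.
cluster_a, cluster_b = two_means(spend)
print(predicted, cluster_a, cluster_b)
```

The clustering step finds two groups but does not name them; deciding that one group means "high-value" is left to whoever reads the result.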
Supervised learning learns A->B

Input (A)           Output (B)        Application
customer            churn?            churn prediction
customer            conversion?       targeting
ad, user info       click?            online advertising
email               spam?             spam filtering
customer complaint  category?         document categorization
English             Chinese           machine translation
audio               text transcript   speech recognition
CT image            coronavirus?      disease diagnosis
image, radar info   driving path      self-driving car
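
Taking the "email → spam?" row as an example, learning A->B means fitting a function from labeled (A, B) pairs and then applying it to a new A. The sketch below uses a deliberately simple word-vote scheme, which is an assumption for illustration and not a production spam filter; the emails are made up.

```python
from collections import Counter


def train(pairs):
    """Learn a score per word from (email_text, is_spam) pairs.

    A word gains +1 for each spam email it appears in and -1 for each
    non-spam email it appears in.
    """
    scores = Counter()
    for text, is_spam in pairs:
        for word in set(text.lower().split()):
            scores[word] += 1 if is_spam else -1
    return scores


def predict(scores, text):
    """Map input A (email text) to output B (spam or not)."""
    total = sum(scores[word] for word in set(text.lower().split()))
    return total > 0


# Hypothetical labeled examples: input A is the text, output B is the flag.
training_pairs = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday", False),
]

scores = train(training_pairs)
is_spam = predict(scores, "claim your free prize")
print(is_spam)
```

Every row of the table has the same shape: only the types of A and B, and the model in between, change.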
