CSE 411 ML CH 1
Introduction
Fall 2023
Data and Learning Machine Learning Supervised Learning Unsupervised Learning Reinforcement Learning
Outline
1 Data and Learning
2 Machine Learning
3 Supervised Learning
4 Unsupervised Learning
5 Reinforcement Learning
Big Data
Why “Learn”?
Data Mining
Machine Learning
Applications
■ Association
■ Supervised Learning
⧈ Classification
⧈ Regression
■ Unsupervised Learning
■ Reinforcement Learning
Learning Associations
■ Basket analysis: P(Y∣X) is the probability that somebody who buys X also buys Y,
where X and Y are products or services.
⧈ Example: P(chips∣beer) = 0.7
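One simple way to estimate such a conditional probability is to count transactions: the fraction of baskets containing X that also contain Y. The sketch below assumes each transaction is a set of item names; the function name `conditional_probability` and the five sample baskets are made up for illustration and are not from the course material.

```python
def conditional_probability(transactions, x, y):
    """Estimate P(y | x) as count(x and y) / count(x).

    Returns None when no transaction contains x, since the
    estimate is undefined in that case.
    """
    with_x = [t for t in transactions if x in t]
    if not with_x:
        return None
    with_both = sum(1 for t in with_x if y in t)
    return with_both / len(with_x)


# Hypothetical transaction database (illustrative data only).
transactions = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"beer"},
    {"chips", "milk"},
    {"beer", "chips", "milk"},
]

# Four baskets contain beer; three of those also contain chips.
p_chips_given_beer = conditional_probability(transactions, "beer", "chips")
print(p_chips_given_beer)
```

With real data the same counting would run over a much larger transaction log, and X could be replaced by a set of items by testing whether that set is a subset of each basket.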
Question
In basket analysis, we want to find the dependence between two items X and Y. Given
a database of customer transactions, how can we find these dependencies? How would
we generalize this to more than two items?
Supervised Learning
Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
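The discriminant above can be written directly as a function. This is a minimal sketch: the function name `classify_risk` and the threshold values are hypothetical, chosen only to make the example runnable. In practice the two thresholds would be learned from labeled customer data.

```python
def classify_risk(income, savings, theta1, theta2):
    """Apply the rule: IF income > theta1 AND savings > theta2
    THEN low-risk ELSE high-risk."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"


# Hypothetical thresholds (assumed values, not from the slides).
THETA1 = 30_000  # income threshold
THETA2 = 5_000   # savings threshold

print(classify_risk(45_000, 8_000, THETA1, THETA2))  # both conditions hold
print(classify_risk(45_000, 2_000, THETA1, THETA2))  # savings too low
```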
Classification: Applications
Classification: Question
Question
In a daily newspaper, find five sample news reports for each category of politics, sports,
and the arts. Go over these reports and find words that are used frequently for each
category, which may help you discriminate between different categories. For example, a
news report on politics is likely to include words such as “government,” “recession,”
“congress,” and so forth, whereas a news report on the arts may include “album,”
“canvas,” or “theater.” There are also words such as “goal” that are ambiguous.
■ x : car attributes
■ y : price
■ y = g(x∣θ)
⧈ g(): model
⧈ θ: parameters
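As an illustration of a model g() with parameters, the sketch below assumes the simplest case: a single car attribute (age in years) and a linear model, price = slope * age + intercept, fitted by ordinary least squares. The linear form, the function names, and the data points are all assumptions made for this example; the slides do not fix a particular model.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def g(x, params):
    """The model: predict price from age given (slope, intercept)."""
    slope, intercept = params
    return slope * x + intercept


# Made-up training data: car age in years and price.
ages = [1, 2, 3, 4]
prices = [18_000, 16_000, 14_000, 12_000]

params = fit_line(ages, prices)
predicted = g(5, params)  # predicted price of a 5-year-old car
print(params, predicted)
```

Here learning means choosing the parameter values that make the model's outputs match the training prices as closely as possible; prediction then applies the fitted model to a new input.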
Regression: Question
Question
In estimating the price of a used car, it makes more sense to estimate the percent
depreciation over the original price than to estimate the absolute price. Why?
■ Prediction of future cases: Use the rule to predict the output for future inputs
■ Knowledge extraction: The rule is easy to understand
■ Compression: The rule is simpler than the data it explains
■ Outlier detection: Exceptions that are not covered by the rule, e.g., fraud
Unsupervised Learning
Reinforcement Learning
Question
How would you approach a computer programming competition?