AAI Lecture 10 Sp 25

Supervised learning is a machine learning method that uses labeled data to train models for predicting outcomes on unlabeled data; its most common types are classification and regression. Linear regression, a key supervised-learning technique, establishes relationships between independent and dependent variables and is widely used for predictive modeling in various fields. Challenges in linear regression include outliers and the assumptions of linearity and homoscedasticity; models are evaluated with methods such as mean squared error and cross-validation.


Machine Learning

Supervised Learning

Supervised Learning
• A type of machine learning method in which:
• labeled data is provided to the machine learning system in order to train it,
• the system creates a model from the labeled data to understand the dataset and learn from each example,
• on the basis of that learning, the model predicts the output for unlabeled data.
• Basically, it discovers patterns in the data that relate data attributes to a target (class) attribute.

• The performance of the model is measured by how accurately it predicts new data.
Supervised Learning
• The two most common types of supervised learning are:
• Classification
• prediction/classification of discrete values
• outputs are discrete labels
• e.g., Male or Female, True or False, Spam or Not Spam
• Regression
• algorithms used to predict continuous values
• outputs are real-valued
• e.g., price, salary, age, stock market prediction
What is Linear Regression?

• Regression Analysis: Linear Regression is a type of regression analysis that is used to establish relationships between independent variables and dependent variables. It assumes a linear relationship between the variables.
• Linear Function: Linear Regression is based on the concept of a linear function, y = b0 + b1x, where y is the dependent variable, x is the independent variable, b1 is the slope of the line, and b0 is the y-intercept.
• Data Points & Scatter Plots: Linear Regression is used to draw a line of best fit through a series of data points on a scatter plot. The line is drawn in such a way that it predicts the value of the dependent variable based on the value of the independent variable.
Linear regression
• Linear regression at its simplest expresses the mathematical
relationship between two variables or attributes.
• A linear relationship between an outcome variable and a predictor (or set of predictors) is the simplest form of such a relationship.
Why is Linear Regression Important?

• Predictive Modeling: Linear Regression is a powerful predictive modeling technique which is used to make accurate predictions on a range of variables based on their relationship with the dependent variable.
• Data Analysis: Linear Regression helps in analyzing data more efficiently by establishing relationships between variables and understanding the impact of one variable on another.
• Business Decision Making: Linear Regression is extensively used in business decision making, particularly in marketing and finance, to forecast trends and future events based on historical data.
Types of Linear Regression Models

1. Simple Linear Regression: involves only one independent variable and one dependent variable.
2. Multiple Linear Regression: involves two or more independent variables and one dependent variable.
3. Logistic Regression: used when the dependent variable is categorical in nature and cannot be measured numerically.
4. Polynomial Regression: involves relationships where the dependent variable is related to independent variables raised to a power.
Applications of Linear Regression

• Stock Market: Linear Regression is used in trend analysis to predict the future value of stocks based on their historical performance.
• Medical: Linear Regression is used to predict the outcome of a particular treatment based on various factors such as age, gender, and medical history.
• Weather Forecasting: Linear Regression models are used in weather forecasting to predict temperatures, precipitation, and other meteorological parameters.
• Education: Linear Regression is used in educational institutions to predict a student's performance based on various factors such as demographics, socio-economic status, and previous academic performance.
Challenges in Linear Regression

1. Outliers: Outliers can have a significant impact on the slope and accuracy of the line of best fit.
2. Linearity: Linear Regression assumes that the relationship between variables is linear, but in reality this may not always be the case.
3. Homoscedasticity: Linear Regression assumes that the variance of the dependent variable is constant across different values of the independent variable, but in reality this may not always be the case.
Evaluation Methods in Linear Regression

• Mean Squared Error: measures the average squared difference between the predicted and actual values. The lower the value, the better the model.
• R-squared: measures the proportion of variation in the dependent variable that is explained by the independent variable. A value closer to 1 indicates a better model.
• Cross Validation: divides the data into k folds and evaluates the model k times, training on k−1 folds and testing on the remaining fold. The average performance is taken as the final evaluation metric.
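As a hedged sketch (not from the slides), the first two metrics can be computed in plain Python; the data points below are hypothetical illustration values:

```python
def mse(y_true, y_pred):
    # Mean squared error: average of the squared residuals.
    n = len(y_true)
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n

def r_squared(y_true, y_pred):
    # R^2 = 1 - (residual sum of squares / total sum of squares).
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical true and predicted values for illustration:
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.1, 7.2, 8.9]
```

A cross-validation loop would simply call these metrics on each held-out fold and average the results.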

What is Linear?
• Remember this? [Figure not reproduced]
What is linear
• A slope of 2 means that
every 1-unit change in X
yields a 2-unit change in Y.

A simplistic example
• Suppose you run a social networking site that charges a monthly
subscription fee of $25. Each month you collect data and count
your number of users and total revenue

• S = {(x, y) = (1,25) , (10,250) , (100,2500) , (200,5000)}

• There’s a linear pattern.


• The coefficient relating x and y is 25.
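As a hedged sketch (using the standard least-squares formulas, not code from the lecture), the coefficient 25 can be recovered from S:

```python
# Recover the per-user rate from S = {(1,25), (10,250), (100,2500), (200,5000)}
# using the standard least-squares slope and intercept formulas.
xs = [1, 10, 100, 200]
ys = [25, 250, 2500, 5000]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# slope = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / \
        sum((x - x_bar) ** 2 for x in xs)
intercept = y_bar - slope * x_bar
```

Because the data are perfectly linear (y = 25x), the fitted slope is 25 and the intercept is 0.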

Equations for finding slope and intercept

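The slide shows these equations as images; the standard least-squares estimates usually presented here are:

```latex
\hat{w}_1 = \frac{\sum_{i=1}^{m}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{m}(x_i-\bar{x})^2},
\qquad
\hat{w}_0 = \bar{y} - \hat{w}_1\,\bar{x}
```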
Linear regression equation

Trend and variation
• FINDING THE MODEL
• As we consider it to be a linear relationship, the functional form will be:
• y = mx + c, written here as y = w0 + w1x

• FITTING THE MODEL
• Find a line that minimizes the distance between all the points and the line.
Example
• Dataset giving the living areas and prices of 50
houses

Example
• We can plot this data. [Plot not reproduced]

Given data like this, how can we learn to predict the prices of other houses as a function of the size of their living areas?
Predictions
• Predicting in this manner is equivalent to “drawing a line through the data”.

[Figure not reproduced: observed data points and a fitted prediction line; legend: Observed days, Prediction]
Notations

• The “input” variables – x^(i) (living area in this example)
• The “output” or target variable that we are trying to predict – y^(i) (price)
• A pair (x^(i), y^(i)) is called a training example
• A list of m training examples {(x^(i), y^(i)); i = 1, ..., m} is called a training set
• X denotes the space of input values, and Y the space of output values
Hypothesis
• Generally we’ll have more than one input feature:
• x1 = living area
• x2 = # of bedrooms
• The hypothesis becomes h(x) = w0 + w1x1 + w2x2
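A minimal sketch of evaluating such a multi-feature hypothesis; the weights below are arbitrary illustrative values, not from the lecture:

```python
def h(w, x):
    # Hypothesis h_w(x) = w0 + w1*x1 + ... + wn*xn,
    # with w = [w0, w1, ..., wn] and x = [x1, ..., xn].
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

# Hypothetical weights for [living area, # of bedrooms]:
w = [50.0, 0.1, 20.0]
price = h(w, [2104, 3])  # prediction for a 2104 sq ft, 3-bedroom house
```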
Multi-Variable Linear Regression
Types of Loss/Error

Error/Loss Calculation

Practice
• Given c = 32, m = 10, find the value of the error for the given data (the data table is an image on the slide and is not reproduced here):

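Since the slide's data table is an image, here is a hedged sketch with hypothetical (x, y) points showing how such an error could be computed for the line y = mx + c with m = 10, c = 32:

```python
# Practice sketch: sum of squared errors for y = m*x + c with m=10, c=32.
# The (x, y) points below are hypothetical; the slide's actual table is
# not reproduced in this text.
m_slope, c = 10, 32
points = [(1, 45), (2, 55), (3, 60)]  # hypothetical data

def sse(points, m_slope, c):
    # Sum of squared residuals between observed y and predicted m*x + c.
    return sum((y - (m_slope * x + c)) ** 2 for x, y in points)

error = sse(points, m_slope, c)
```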
Choosing the regression line

Which of these lines to choose?

[Figure not reproduced: two candidate lines drawn through the same scatter of points]

Choosing the regression line

The model is y = h_w(x) = w0 + w1·x. Consider a point x_i. The predicted value is ŷ_i = h_w(x_i) = w0 + w1·x_i, while the true value for x_i is y_i. The error or residual is ŷ_i − y_i.

[Figure not reproduced: scatter plot marking y_i, ŷ_i, and the residual between them at x_i]
Choosing the regression line

How to choose the best-fit line? In other words, how to choose the w's?

min_w Σ_{i=1}^{m} (h_w(x^(i)) − y^(i))²

Minimize the sum of the squared distances of the points (the y_i's) from the line over the m training examples. (Why squared?)
Choosing the regression line

Sum the error over the m training examples; squaring it means we don't get negative values, and the factor of 1/2 simplifies later calculations:

J(w) = (1/2) Σ_{i=1}^{m} (h_w(x^(i)) − y^(i))²

Each term is the difference between what the hypothesis predicted and the actual value. Find the w which minimizes this expression.
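The cost function above can be sketched as follows (the 1/2 factor kept as on the slide):

```python
def h(w0, w1, x):
    # Simple linear hypothesis h_w(x) = w0 + w1*x.
    return w0 + w1 * x

def cost_J(w0, w1, xs, ys):
    # J(w) = (1/2) * sum over the m training examples of
    # (h_w(x_i) - y_i)^2, matching the slide's definition.
    return 0.5 * sum((h(w0, w1, x) - y) ** 2 for x, y in zip(xs, ys))
```

For example, on data generated exactly by y = 1 + 2x the cost at (w0, w1) = (1, 2) is zero.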
Equation to Calculate MSE Values

min J (w)
w
Gradient Descent
• Choose initial values of w0 and w1 and continue moving the
direction of steepest descente
J(w)

32
W0
W1
Gradient Descent
• Choose initial values of w0 and w1 and keep moving in the direction of steepest descent.
• The step size is controlled by a parameter called the learning rate (α).
• The starting point is important.
Gradient Descent - Steps

Gradient descent

For J(w1), where w1 is a real number, repeat the update:

w1 := w1 − α · (d/dw1) J(w1)

If w1 is already at the minimum, the derivative term will be zero and the algorithm will converge in the first iteration.
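A minimal gradient-descent sketch for the simple linear model; the data, learning rate, and step count are illustrative assumptions, not values from the lecture:

```python
def gradient_descent(xs, ys, lr=0.01, steps=5000):
    # Minimize J(w) = (1/2) * sum_i (w0 + w1*x_i - y_i)^2 by repeatedly
    # stepping w0, w1 in the direction of steepest descent.
    w0 = w1 = 0.0
    for _ in range(steps):
        # Partial derivatives of J with respect to w0 and w1.
        grad0 = sum((w0 + w1 * x - y) for x, y in zip(xs, ys))
        grad1 = sum((w0 + w1 * x - y) * x for x, y in zip(xs, ys))
        w0 -= lr * grad0
        w1 -= lr * grad1
    return w0, w1

# Hypothetical data generated from y = 2x + 1:
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w0, w1 = gradient_descent(xs, ys)
```

With this small, noise-free dataset the iterates converge to the exact least-squares solution (w0, w1) = (1, 2); too large a learning rate would instead make the updates diverge.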
Logistic regression (Logit Model)
• Applicable in cases where there is one dependent variable and one or more independent variables.
• The difference between multiple and logistic regression is that the target variable in the logistic approach is discrete (binary or an ordinal value).
• The dependent variable is finite or categorical:
• either P or Q (binary regression), or a range of limited options P, Q, R, or S.
• When the outcome is limited to just two possible values, plain linear regression is a poor fit; logistic regression addresses this issue by returning a probability score that shows the chances of any particular event.

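As a hedged sketch of how a logit model maps a linear score to a probability (the weights and features below are hypothetical, not from the lecture):

```python
import math

def sigmoid(z):
    # Logistic function: maps any real score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    # P(y = 1 | x) = sigmoid(w0 + w1*x1 + ... + wn*xn).
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return sigmoid(z)

# Hypothetical weights for [visit_count, minutes_on_site]:
w = [-3.0, 0.4, 0.05]
p = predict_proba(w, [5, 30])  # probability this visitor accepts the offer
```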
Logistic regression (Logit Model)
• Example: One can determine the likelihood of choosing an offer
on your website (dependent variable).
• For analysis purposes, you can look at various visitor
characteristics such as the sites they came from, count of visits to
your site, and activity on your site (independent variables).
• This can help determine the probability of certain visitors who are
more likely to accept the offer.
• As a result, it allows you to make better decisions on whether to
promote the offer on your site or not.

