
Comparing Data and

Making Predictions
(Linear Regression)
Course Title: Artificial Intelligence
Instructor: Dr. Umara Zahid
Objective / Learning Outcomes
• The objective of this lecture is to enable students to compare bivariate data using different methods and to make predictions about the future
• At the end, students will be able to compare data using scatter plots, the line of best fit, and regression
• Students will be able to predict future outcomes
• Based on the kind of data, students will be able to select a suitable technique for comparing it
• Students will perform the linear regression calculation manually and in Python, RapidMiner, and SPSS
• Students will be able to solve real-world problems using regression
Types of Data
Univariate and Bivariate Data
Univariate data
• Univariate: one variable (one type of data)
• Example: Travel Time (minutes): 15, 29, 8, 42, 35, 21, 18, 42, 26; the variable is Travel Time
• Example: Puppy Weight: 2.5, 3.5, 3.3, 3.1, 2.6, 3.6, 2.4; the variable is Puppy Weight
• Lots of things can be done with univariate data (see the Python sketch after this slide):
1. Find a central value using the mean, median and mode
2. Find how spread out it is using the range, quartiles and standard deviation
3. Make plots like bar graphs, pie charts and histograms

Bivariate data
• Bivariate: two variables (there are two types of data)
• With bivariate data we have two sets of related data that we want to compare
• Example: Sales vs Temperature; the two variables are Ice Cream Sales and Temperature
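As a quick illustration of the univariate summaries listed above, here is a minimal Python sketch that computes them for the Travel Time data from this slide; the use of the standard statistics module and numpy is just one convenient choice.

import statistics
import numpy as np

# Travel Time (minutes) from the univariate example above
travel_time = [15, 29, 8, 42, 35, 21, 18, 42, 26]

# 1. Central values
print("mean  :", statistics.mean(travel_time))
print("median:", statistics.median(travel_time))
print("mode  :", statistics.mode(travel_time))

# 2. Spread
print("range    :", max(travel_time) - min(travel_time))
print("quartiles:", np.percentile(travel_time, [25, 50, 75]))
print("std dev  :", statistics.stdev(travel_time))  # sample standard deviation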
Bivariate Comparison Examples
• An ice cream shop keeps
track of how much ice
cream they sell versus the
temperature on that day
• Here are their figures for
the last 12 days
Scatter plot of Ice cream sales data
• Now we can easily see that warmer weather and more ice cream
sales are linked, but the relationship is not perfect
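The scatter plot itself is an image on the slide; the short matplotlib sketch below shows one way such a plot can be produced. The temperature and sales numbers here are made-up placeholders, not the shop's actual figures.

import matplotlib.pyplot as plt

# Placeholder data: 12 days of temperature (deg C) vs ice cream sales
temperature_c = [12.1, 14.3, 16.4, 18.2, 19.5, 21.0, 22.8, 24.1, 25.6, 17.7, 20.3, 23.2]
ice_cream_sales = [180, 215, 325, 400, 412, 445, 520, 560, 610, 390, 460, 540]

plt.scatter(temperature_c, ice_cream_sales)
plt.xlabel("Temperature (deg C)")
plt.ylabel("Ice Cream Sales")
plt.title("Ice Cream Sales vs Temperature")
plt.show()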
Ways to compare Data
• We can use:
1. Tables,
2. Scatter Plots,
3. Correlation,
4. Linear regression
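Of these, correlation gives a single number summarizing how strongly the two variables move together. A minimal sketch using numpy's corrcoef, again with placeholder data, is:

import numpy as np

# Placeholder data; substitute the real paired measurements
temperature_c = np.array([12.1, 14.3, 16.4, 18.2, 19.5, 21.0, 22.8, 24.1])
ice_cream_sales = np.array([180, 215, 325, 400, 412, 445, 520, 560])

# Pearson correlation coefficient between the two variables
r = np.corrcoef(temperature_c, ice_cream_sales)[0, 1]
print("correlation:", r)  # close to +1 indicates a strong positive linear relationship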
What is Linear Regression?
• Linear Regression (LR) is a statistical model used to predict the
relationship between independent and dependent variables
• LR examines two factors:
1. Which variables in particular are significant predictors of the outcome variable?
2. How significant is the regression line, i.e. can it be used to make predictions with the highest possible accuracy?
If the regression line is inaccurate, we cannot use it.
Linear Regression (Line of Best Fit)
• Linear regression fits a straight line or
surface that minimizes the discrepancies
between predicted and actual output
values.
• We can also draw a "Line of Best Fit" (also
called a "Trend Line") on our scatter plot
• Try to have the line as close as possible to
all points, and as many points above the
line as below
• But for better accuracy we can calculate
the line using Least Squares Regression
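In Python, one common way to compute such a least-squares line is numpy.polyfit with degree 1; this is only one possible tool choice, and the data below are placeholders.

import numpy as np

# Placeholder (x, y) data; replace with your own measurements
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

# Fit a degree-1 polynomial, i.e. a straight line y = m*x + b, by least squares
m, b = np.polyfit(x, y, 1)
print(f"line of best fit: y = {m:.3f}x + {b:.3f}")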
Regression Equation
• LR is based on drawing lines using data so we
will look at some Euclidean Geometry
• The simplest form of a linear regression equation, with one dependent and one independent variable, is:
Y = m*X + C
• Two points on this line are plotted on the slide
• Y is the dependent variable
• X is the independent variable
• Y depends on X (e.g. crop yield depends on rainfall)
• 'm' is the slope of the line: m = (Y2 - Y1) / (X2 - X1), the change in Y divided by the change in X between the two plotted points
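As a tiny numeric illustration of the slope and intercept (the two points below are made-up values, not taken from the slide):

# Two made-up points on a line
x1, y1 = 2.0, 5.0
x2, y2 = 6.0, 13.0

m = (y2 - y1) / (x2 - x1)  # slope: change in Y over change in X -> 2.0
c = y1 - m * x1            # intercept, from Y = m*X + C -> 1.0
print(f"Y = {m}*X + {c}")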
Another Example: Sea Level Rise
Interpolation and Extrapolation
• Interpolation is where we find a value inside our set of data points. For example, we can use linear interpolation to estimate the sales at 21 °C.
• Extrapolation is where we find a value outside our set of data points. Here we would use linear extrapolation to estimate the sales at 29 °C (which is higher than any temperature value we have).
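The sketch below illustrates the difference using a fitted line; both the data and the resulting coefficients are placeholders, since the actual ice cream figures appear only on the slide.

import numpy as np

# Placeholder temperature (deg C) and sales data
temp = np.array([12.0, 14.5, 16.0, 18.5, 20.0, 22.5, 25.0])
sales = np.array([185, 230, 310, 400, 440, 510, 610])

# Least-squares line fitted to the placeholder data
m, b = np.polyfit(temp, sales, 1)

print("interpolation at 21 C:", m * 21 + b)  # 21 C lies inside the observed range
print("extrapolation at 29 C:", m * 29 + b)  # 29 C lies outside it, so treat the estimate with caution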
Finding the best fit line
• Minimizing the distance: there are lots of ways to measure the distance between the line and the data points, such as the sum of squared errors, the sum of absolute errors, or the root mean square error (see the sketch after this slide).
• We keep moving the line through the data points until the best-fit line has the smallest sum of squared distances between the data points and the regression line.
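A minimal sketch of the squared-error criterion described above; the data points and the two candidate lines are placeholders chosen only for illustration.

import numpy as np

# Placeholder data points
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

def sum_squared_errors(m, b):
    """Sum of squared vertical distances between the data and the line y = m*x + b."""
    predicted = m * x + b
    return float(np.sum((y - predicted) ** 2))

# The candidate line with the smaller SSE fits the data better
print("y = 2x + 0  SSE:", sum_squared_errors(2.0, 0.0))
print("y = 1x + 2  SSE:", sum_squared_errors(1.0, 2.0))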
Least Squares Regression
• For better accuracy we calculate the line using Least Squares
Regression.
• Equation of straight line: y = mx + b
• y = how far up
• x = how far along
• m = Slope or Gradient (how steep the line is)
• b = the Y Intercept (where the line crosses the Y axis)
Steps: To find the line of best fit for N points
• Step 1: For each (x, y) point calculate x² and xy
• Step 2: Sum all x, y, x² and xy, which gives us Σx, Σy, Σx² and Σxy
• Step 3: Calculate the slope m:
  m = (N Σxy - Σx Σy) / (N Σx² - (Σx)²)
  (N is the number of points.)
• Step 4: Calculate the intercept b:
  b = (Σy - m Σx) / N
• Step 5: Assemble the equation of the line: y = mx + b
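These five steps translate directly into a short Python function; a minimal sketch, with placeholder points in the usage line, is shown here.

def least_squares_line(points):
    """Return slope m and intercept b of the least-squares line through (x, y) points."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_x2 = sum(x * x for x, _ in points)  # Steps 1-2: x squared, summed
    sum_xy = sum(x * y for x, y in points)  # Steps 1-2: x*y, summed
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # Step 3
    b = (sum_y - m * sum_x) / n                                   # Step 4
    return m, b                                                   # Step 5: y = m*x + b

# Placeholder points; substitute the values from your own table
m, b = least_squares_line([(1, 2), (2, 4), (3, 5), (4, 4), (5, 6)])
print(f"y = {m:.3f}x + {b:.3f}")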


Example
• Sam recorded how many hours of sunshine there were versus how many ice creams were sold at the shop from Monday to Friday (the table of values appears on the slide)
• Let us find the m (slope) and b (y-intercept) that best suit that data
• Step 1: For each (x, y), calculate x² and xy
• Step 2: Sum x, y, x² and xy (this gives us Σx, Σy, Σx² and Σxy)
• N (number of data values) = 5
Example (continued)
• Step 3: Calculate the slope m: m ≈ 1.518
• Step 4: Calculate the intercept b: b ≈ 0.305
Example (continued)
• Assemble the equation of the line: y = 1.518x + 0.305
Predicting from previous results
• Sam hears the weather forecast, which says "we expect 8 hours of sun tomorrow", so he uses the above equation to estimate how many ice creams he will sell:
• y = 1.518 × 8 + 0.305 ≈ 12.4, so about 12 ice creams
• Sam makes fresh waffle cone mixture for 14 ice creams, just in case. Yum.
