Linear Non Linear Regression

Linear regression finds the best-fitting straight line to describe the relationship between two variables. It determines the line that minimizes the sum of the squared distances between the data points and the line. The document discusses how linear regression was developed and provides an example of using it to model the relationship between driving speed and stopping distance, finding an equation and correlation coefficient that best fits the data.

Uploaded by

ledmabaya23
MDM4U Linear and Non-Linear Regression

Linear regression analyzes the relationship between two variables X and Y, and
determines the best straight line through the data.

The term “regression”, like many statistical terms, is used in statistics quite
differently than it is used in other contexts. The method was first used to examine
the relationship between the heights of fathers and sons. The two were related, of
course, but the slope was less than 1. Why? The heights of sons regressed to the
mean: sons of unusually tall or short fathers tended to be closer to average
height than their fathers. The term “regression” is now used for many sorts of
curve fitting.

In general, the goal of linear regression is to find the line that best predicts Y from
X. Linear regression does this by finding the line that minimizes the sum of the
squares of the vertical distances of the points (actual data) from the line
(estimated data).
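
The minimization described above can be written out as a worked formula. This is a sketch in standard notation that does not appear in the handout: a is the intercept, b is the slope, and x-bar and y-bar are the means of the X and Y values.

```latex
\[
  \text{minimize}\quad S(a, b) = \sum_{i=1}^{n} \bigl(y_i - (a + b\,x_i)\bigr)^2
\]
\[
  b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
           {\sum_{i=1}^{n} (x_i - \bar{x})^2},
  \qquad
  a = \bar{y} - b\,\bar{x}
\]
```

Each term in S is the square of one vertical distance, so the line with the smallest S is the least squares line.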

Note that linear regression assumes that data are linear, and finds the slope and
intercept that make a straight line best fit the data.

The data in the table and scatter plot below relate driving speed to stopping distance.

Speed of Car (km/h)      25   35   45   55   60   70   80   90   100   110
Stopping Distance (m)    10   15   21   27   33   42   54   61    78   103

[Figure: Stopping Distance Scatter Plot. Distance (vertical axis) plotted against Speed (horizontal axis), with two fitted lines.]
Distance = 1.03Speed - 24.6; r^2 = 0.95
Distance = 0.969Speed - 21

The least squares regression line (red, Distance = 1.03Speed - 24.6) minimizes the
sum of the squared vertical distances of the points from the line and is very
tedious to create by hand (so we won’t). As you can see from the graph, it is
pretty close to the median-median line (blue, Distance = 0.969Speed - 21).
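
Although the least squares line is tedious by hand, it takes only a few lines of code. The following is a minimal sketch in plain Python using the speed and stopping distance data from the table; the function and variable names are chosen here for illustration and are not part of the handout.

```python
# Speed (km/h) and stopping distance (m) from the table in this handout.
speed = [25, 35, 45, 55, 60, 70, 80, 90, 100, 110]
distance = [10, 15, 21, 27, 33, 42, 54, 61, 78, 103]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means, and sum of squared x deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(speed, distance)
print(f"Distance = {slope:.2f} * Speed + ({intercept:.1f})")
# prints: Distance = 1.03 * Speed + (-24.6)
```

The rounded result agrees with the equation of the least squares line quoted with the scatter plot.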
[Figure: Collection 2 Scatter Plot. distance (vertical axis) plotted against speed (horizontal axis), with the same two fitted lines.]
distance = 1.03speed - 24.6; r^2 = 0.95; Sum of squares = 434.4
distance = 0.969speed - 21; Sum of squares = 463.2
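
The two sums of squares can be checked directly from the data. The sketch below assumes the rounded coefficients printed with the plot, so its results may differ slightly from 434.4 and 463.2, which were presumably computed from unrounded coefficients. The helper name sum_of_squares is made up for this example.

```python
# Speed (km/h) and stopping distance (m) from the table in this handout.
speed = [25, 35, 45, 55, 60, 70, 80, 90, 100, 110]
distance = [10, 15, 21, 27, 33, 42, 54, 61, 78, 103]


def sum_of_squares(slope, intercept, xs, ys):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Rounded coefficients as displayed with the scatter plot.
ss_least_squares = sum_of_squares(1.03, -24.6, speed, distance)
ss_median_median = sum_of_squares(0.969, -21, speed, distance)

print(f"least squares line: {ss_least_squares:.1f}")
print(f"median-median line: {ss_median_median:.1f}")
```

The line with the smaller sum of squares is the better fit in the least squares sense.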

For the line of best fit in the least-squares method,


1- the sum of the residuals is zero (the positive and negative residuals cancel out)
2- the sum of the squares of the residuals has the least possible value

Residual value – signed vertical distance between a point and the regression line
(actual value minus predicted value).
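
Both properties of the least squares line can be checked numerically. This is a rough sketch on the stopping distance data; the nudge sizes (0.05 for the slope, 2 for the intercept) are arbitrary values picked for illustration.

```python
# Speed (km/h) and stopping distance (m) from the table in this handout.
speed = [25, 35, 45, 55, 60, 70, 80, 90, 100, 110]
distance = [10, 15, 21, 27, 33, 42, 54, 61, 78, 103]

# Fit the least squares line from the deviations about the means.
n = len(speed)
mean_x = sum(speed) / n
mean_y = sum(distance) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(speed, distance))
         / sum((x - mean_x) ** 2 for x in speed))
intercept = mean_y - slope * mean_x


def residuals(b, a):
    """Residuals (actual minus predicted) for the line y = b*x + a."""
    return [y - (b * x + a) for x, y in zip(speed, distance)]


def sum_sq(b, a):
    """Sum of the squared residuals for the line y = b*x + a."""
    return sum(r ** 2 for r in residuals(b, a))


best = residuals(slope, intercept)
print(f"sum of residuals: {abs(sum(best)):.6f}")  # zero up to rounding error
print(f"sum of squares:   {sum_sq(slope, intercept):.1f}")

# Any small change to the slope or intercept makes the sum of squares larger.
nudges = [(0.05, 0.0), (-0.05, 0.0), (0.0, 2.0), (0.0, -2.0)]
for db, da in nudges:
    print(f"slope {db:+.2f}, intercept {da:+.1f}: "
          f"{sum_sq(slope + db, intercept + da):.1f}")
```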

Recall: Correlation coefficient (r) – used to measure the strength and direction
of the linear relationship modelled by the least squares line. It is a measure of
how well a regression line fits a set of data. The sign of r matches the sign of
the slope, while a value close to ±1 indicates a strong correlation and a value
close to zero indicates a weak correlation.
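
As a quick illustration, r can be computed for the stopping distance data from the same sums of deviations used for the least squares line. The variable names below are assumptions made for this sketch.

```python
import math

# Speed (km/h) and stopping distance (m) from the table in this handout.
speed = [25, 35, 45, 55, 60, 70, 80, 90, 100, 110]
distance = [10, 15, 21, 27, 33, 42, 54, 61, 78, 103]

n = len(speed)
mean_x = sum(speed) / n
mean_y = sum(distance) / n

# Sums of squared deviations and of products of deviations.
s_xx = sum((x - mean_x) ** 2 for x in speed)
s_yy = sum((y - mean_y) ** 2 for y in distance)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(speed, distance))

r = s_xy / math.sqrt(s_xx * s_yy)
print(f"r = {r:.2f}")  # prints: r = 0.97
```

Here r is positive, matching the upward slope of the line, and close to 1, which indicates a strong correlation.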

Coefficient of determination (r^2) – used to measure the strength of the
relationship modelled by the least squares line. An r^2 value of 0.8 means that 80%
of the variation in the dependent variable is accounted for by its linear
relationship with the independent variable.
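
One way to see the "fraction of variation" reading is to compare the variation left over after fitting the line with the total variation in the stopping distances. The sketch below does this for the handout's data; ss_total and ss_residual are names introduced here.

```python
# Speed (km/h) and stopping distance (m) from the table in this handout.
speed = [25, 35, 45, 55, 60, 70, 80, 90, 100, 110]
distance = [10, 15, 21, 27, 33, 42, 54, 61, 78, 103]

n = len(speed)
mean_x = sum(speed) / n
mean_y = sum(distance) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(speed, distance))
         / sum((x - mean_x) ** 2 for x in speed))
intercept = mean_y - slope * mean_x

# Total variation of the distances about their mean.
ss_total = sum((y - mean_y) ** 2 for y in distance)
# Variation left unexplained by the least squares line.
ss_residual = sum((y - (slope * x + intercept)) ** 2
                  for x, y in zip(speed, distance))

r_squared = 1 - ss_residual / ss_total
print(f"r^2 = {r_squared:.2f}")  # prints: r^2 = 0.95
```

So about 95% of the variation in stopping distance is accounted for by the linear relationship with speed, in agreement with the r^2 value shown with the plot.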

Predictive model – the median-median line and the least squares line are
examples of linear regression models for the data. It is also possible to model
data using other equations (quadratic, exponential).
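
As one possible illustration of a non-linear model, the sketch below fits an exponential curve, distance = a * e^(b * speed), to the stopping distance data. It uses a log transformation (a straight line fitted to the natural log of the distances), which is a common shortcut and not a method described in this handout; all names are hypothetical.

```python
import math

# Speed (km/h) and stopping distance (m) from the table in this handout.
speed = [25, 35, 45, 55, 60, 70, 80, 90, 100, 110]
distance = [10, 15, 21, 27, 33, 42, 54, 61, 78, 103]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def sum_sq(model):
    """Sum of squared residuals of a model (a function of speed), in metres squared."""
    return sum((y - model(x)) ** 2 for x, y in zip(speed, distance))


# Linear model: distance = slope * speed + intercept.
lin_slope, lin_intercept = least_squares_line(speed, distance)
ss_linear = sum_sq(lambda x: lin_slope * x + lin_intercept)

# Exponential model: fit ln(distance) = ln(a) + b * speed, then undo the log.
b, ln_a = least_squares_line(speed, [math.log(y) for y in distance])
a = math.exp(ln_a)
ss_exponential = sum_sq(lambda x: a * math.exp(b * x))

print(f"linear:      sum of squares = {ss_linear:.1f}")
print(f"exponential: distance = {a:.2f} * e^({b:.4f} * speed), "
      f"sum of squares = {ss_exponential:.1f}")
```

Comparing the two sums of squares shows which curve stays closer to the data. Note that the log transformation minimizes squared errors in ln(distance), so the exponential curve found this way is not necessarily the least squares curve in the original units.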
