
MM 225 – AI and Data Science

Day 24: Supervised Learning: Regression Analysis - 2

Instructors: Hina Gokhale, MP Gururajan, N. Vishwanathan

1 October 2024
Least Squares Estimators of $\beta_0, \beta_1$
◦ Let $(Y_i, x_i) : i = 1, 2, \ldots, n$ be the data.

◦ These can be expressed as $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, $i = 1, 2, \ldots, n$.

◦ We want to find $\beta_0, \beta_1$ by minimizing the squared error between the values of $Y_i$ and the estimator $\beta_0 + \beta_1 x_i$.

◦ Denote the estimated values of $\beta_0, \beta_1$ by $\hat{\beta}_0$ and $\hat{\beta}_1$ respectively.

◦ We want to find the $\beta_0$ and $\beta_1$ that minimize the sum of squared deviations of the observations from the regression line:

$$SS = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 x_i \right)^2$$



Setting the partial derivatives of SS to zero:

$$\frac{\partial SS}{\partial \beta_0} = -2 \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right) = 0$$

$$\frac{\partial SS}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i \left( y_i - \beta_0 - \beta_1 x_i \right) = 0$$

Simplifying these two equations leads to the least squares normal equations:


$$n \beta_0 + \beta_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$

$$\beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$

Further simplification gives

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}$$

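As a quick illustration, here is a minimal NumPy sketch of these closed-form estimators; the data arrays are invented purely for demonstration and are not from the course:

```python
import numpy as np

# Minimal sketch of the closed-form least squares estimators.
# The data arrays below are invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

x_bar, y_bar = x.mean(), y.mean()

# Slope: (sum x_i y_i - n*x_bar*y_bar) / (sum x_i^2 - n*x_bar^2)
beta1_hat = (np.sum(x * y) - n * x_bar * y_bar) / (np.sum(x**2) - n * x_bar**2)
# Intercept: y_bar - beta1_hat * x_bar
beta0_hat = y_bar - beta1_hat * x_bar

print(f"beta0_hat = {beta0_hat:.4f}, beta1_hat = {beta1_hat:.4f}")
```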


Notations
Let the estimated values of $Y_i$ and $\epsilon_i$ be denoted by $\hat{Y}_i$ and $e_i$ for $i = 1, 2, \ldots, n$:

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \quad \text{and} \quad e_i = Y_i - \hat{Y}_i$$

Sum of squares of residuals:

$$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2$$

$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n}$$

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y}) = \sum_{i=1}^{n} x_i Y_i - \frac{\left( \sum_{i=1}^{n} x_i \right)\left( \sum_{i=1}^{n} Y_i \right)}{n}$$

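In the same spirit, a short sketch that computes $S_{xx}$ and $S_{xy}$ and confirms that the slope estimator can equivalently be written as $\hat{\beta}_1 = S_{xy} / S_{xx}$ (same invented data as before):

```python
import numpy as np

# Sketch: computing S_xx and S_xy, and the residuals e_i = Y_i - Y_hat_i.
# Same invented data as the previous sketch.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

S_xx = np.sum((x - x.mean())**2)                 # = sum x_i^2 - (sum x_i)^2 / n
S_xy = np.sum((x - x.mean()) * (y - y.mean()))   # = sum x_i y_i - (sum x_i)(sum y_i) / n

beta1_hat = S_xy / S_xx                          # equivalent to the formula above
beta0_hat = y.mean() - beta1_hat * x.mean()
residuals = y - (beta0_hat + beta1_hat * x)
SSE = np.sum(residuals**2)
print(S_xx, S_xy, beta1_hat, SSE)
```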


Properties of Estimated Residuals

◦ $SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2$

◦ It can be shown that $E(SSE) = (n-2)\,\sigma^2$.

◦ Hence $SSE/(n-2)$ is an unbiased estimator of $\sigma^2$; it is denoted by

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2}{n-2}$$

◦ SSE can be simplified as

$$SSE = SST - \hat{\beta}_1 S_{xy}$$

where $SST = \text{total corrected sum of squares} = \sum_{i=1}^{n} \left( Y_i - \bar{Y} \right)^2$.

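A sketch computing $\hat{\sigma}^2$ both directly from the residuals and via the shortcut $SSE = SST - \hat{\beta}_1 S_{xy}$ (invented data as before); the two SSE values should agree up to rounding:

```python
import numpy as np

# Sketch: the unbiased estimate of sigma^2, computed two equivalent ways.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

S_xx = np.sum((x - x.mean())**2)
S_xy = np.sum((x - x.mean()) * (y - y.mean()))
beta1_hat = S_xy / S_xx
beta0_hat = y.mean() - beta1_hat * x.mean()

# Way 1: directly from the residuals.
SSE_direct = np.sum((y - (beta0_hat + beta1_hat * x))**2)

# Way 2: SSE = SST - beta1_hat * S_xy.
SST = np.sum((y - y.mean())**2)
SSE_shortcut = SST - beta1_hat * S_xy

sigma2_hat = SSE_direct / (n - 2)   # unbiased estimator of sigma^2
print(SSE_direct, SSE_shortcut, sigma2_hat)
```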


Properties of Estimated Residuals
◦ For $i = 1, 2, \ldots, n$: $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ with $E(\epsilon_i) = 0$ and $Var(\epsilon_i) = \sigma^2$.
◦ This implies that

$$E(Y_i) = \beta_0 + \beta_1 x_i, \qquad Var(Y_i) = \sigma^2$$

$$E(\hat{\beta}_1) = \frac{\sum (x_i - \bar{x})\, E(Y_i)}{\sum x_i^2 - n \bar{x}^2} = \beta_1 \quad \text{and} \quad Var(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}$$

$$E(\hat{\beta}_0) = \sum_{i=1}^{n} \frac{E(Y_i)}{n} - \bar{x}\, E(\hat{\beta}_1) = \beta_0 \quad \text{and} \quad Var(\hat{\beta}_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)$$
Properties of LS Estimators
$A = \hat{\beta}_0$ is an unbiased estimator of $\beta_0$.
$B = \hat{\beta}_1$ is an unbiased estimator of $\beta_1$.

It can be shown that

$$Cov(A, B) = -\frac{\sigma^2 \bar{x}}{S_{xx}}$$

The standard errors of the intercept and slope estimators are, respectively,

$$SE(\hat{\beta}_0) = \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)}$$

$$SE(\hat{\beta}_1) = \sqrt{\frac{\sigma^2}{S_{xx}}}$$

The estimated standard errors of $\hat{\beta}_0$ and $\hat{\beta}_1$ are obtained by replacing $\sigma^2$ with its unbiased estimate $\hat{\sigma}^2$.

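A sketch of the estimated standard errors, plugging $\hat{\sigma}^2$ into the formulas above (invented data as before):

```python
import numpy as np

# Sketch: estimated standard errors of intercept and slope,
# obtained by replacing sigma^2 with sigma2_hat.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

S_xx = np.sum((x - x.mean())**2)
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / S_xx
beta0_hat = y.mean() - beta1_hat * x.mean()
sigma2_hat = np.sum((y - (beta0_hat + beta1_hat * x))**2) / (n - 2)

se_beta0 = np.sqrt(sigma2_hat * (1.0 / n + x.mean()**2 / S_xx))
se_beta1 = np.sqrt(sigma2_hat / S_xx)
print(se_beta0, se_beta1)
```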


Testing Hypotheses on Regression Parameters
$Y = \beta_0 + \beta_1 x + \epsilon$, and $\beta_0$ and $\beta_1$ are estimated as $\hat{\beta}_0$ and $\hat{\beta}_1$.
How do we know that this relationship is statistically significant?
If $\beta_1 = 0$, then Y does not depend on x!

Therefore, it is of interest to test the hypothesis

$$H_0 : \beta_1 = 0$$

More generally, it is of interest to test the hypothesis

$$H_0 : \beta_1 = \beta_{1,0}$$

To test the hypothesis statistically, an additional assumption is needed:

$$\epsilon \sim N(0, \sigma^2)$$



Distribution of the Regression Estimators
$$\epsilon \sim N(0, \sigma^2)$$
Hence $Y \sim N(\beta_0 + \beta_1 x,\ \sigma^2)$.

The estimator $\hat{\beta}_1 = \dfrac{\sum (x_i - \bar{x})\, Y_i}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}$ is a linear combination of the independent random variables $Y_i$.

Hence $\hat{\beta}_1 \sim N\!\left( \beta_1,\ \dfrac{\sigma^2}{S_{xx}} \right)$.

Similarly, $\hat{\beta}_0 \sim N\!\left( \beta_0,\ \sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right) \right)$.

And $\dfrac{(n-2)\,\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2}$.

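These distributional claims are easy to check by simulation. Below is a small Monte Carlo sketch (all parameter values invented for illustration) verifying that the empirical standard deviation of $\hat{\beta}_1$ matches $\sqrt{\sigma^2 / S_{xx}}$:

```python
import numpy as np

# Sketch: Monte Carlo check that beta1_hat ~ N(beta1, sigma^2 / S_xx).
# All parameter values here are invented for illustration.
rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = len(x)
beta0, beta1, sigma = 0.0, 2.0, 0.5
S_xx = np.sum((x - x.mean())**2)

estimates = []
for _ in range(10_000):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=n)
    estimates.append(np.sum((x - x.mean()) * (y - y.mean())) / S_xx)

# Empirical sd of beta1_hat should be close to sqrt(sigma^2 / S_xx).
print(np.std(estimates), np.sqrt(sigma**2 / S_xx))
```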


The Test Statistic for Testing $\beta_1 = \beta_{1,0}$
The unbiased estimator of $\beta_1$ is B, with estimated $SE(B) = \sqrt{\hat{\sigma}^2 / S_{xx}}$.

Hence the test statistic for testing $\beta_1 = \beta_{1,0}$ is

$$T = \frac{\hat{\beta}_1 - \beta_{1,0}}{\sqrt{\hat{\sigma}^2 / S_{xx}}}$$

When $H_0$ is true, $T \sim t(n-2)$.

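A sketch of this test for $H_0: \beta_1 = 0$, including the two-sided p-value from the $t(n-2)$ distribution (invented data; SciPy assumed available):

```python
import numpy as np
from scipy import stats

# Sketch: t statistic for H0: beta1 = beta1_0 and its two-sided p-value.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)
beta1_0 = 0.0   # H0: the slope is zero (Y does not depend on x)

S_xx = np.sum((x - x.mean())**2)
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / S_xx
beta0_hat = y.mean() - beta1_hat * x.mean()
sigma2_hat = np.sum((y - (beta0_hat + beta1_hat * x))**2) / (n - 2)

T = (beta1_hat - beta1_0) / np.sqrt(sigma2_hat / S_xx)
p_value = 2 * stats.t.sf(abs(T), df=n - 2)   # two-sided p-value
print(T, p_value)
```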


Critical Region for Testing $\beta_1 = \beta_{1,0}$

With $T = \dfrac{\hat{\beta}_1 - \beta_{1,0}}{\sqrt{\hat{\sigma}^2 / S_{xx}}}$, the critical region for a given α is:

◦ $\beta_1 \ne \beta_{1,0}$ (two-sided alternative): reject if $T < t_{\alpha/2}(n-2)$ or $T > t_{1-\alpha/2}(n-2)$
◦ $\beta_1 < \beta_{1,0}$ (one-sided alternative): reject if $T < t_{\alpha}(n-2)$
◦ $\beta_1 > \beta_{1,0}$ (one-sided alternative): reject if $T > t_{1-\alpha}(n-2)$



The Test Statistic for Testing $\beta_0 = \beta_{0,0}$
The unbiased estimator of $\beta_0$ is A, with estimated $SE(A) = \sqrt{\hat{\sigma}^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)}$.

Hence the test statistic for testing $\beta_0 = \beta_{0,0}$ is

$$T = \frac{\hat{\beta}_0 - \beta_{0,0}}{\sqrt{\hat{\sigma}^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)}}$$

When $H_0$ is true, $T \sim t(n-2)$.

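The analogous sketch for the intercept test $H_0: \beta_0 = 0$ (again invented data):

```python
import numpy as np
from scipy import stats

# Sketch: t statistic for H0: beta0 = beta0_0 and its two-sided p-value.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)
beta0_0 = 0.0   # H0: the intercept is zero

S_xx = np.sum((x - x.mean())**2)
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / S_xx
beta0_hat = y.mean() - beta1_hat * x.mean()
sigma2_hat = np.sum((y - (beta0_hat + beta1_hat * x))**2) / (n - 2)

se_beta0 = np.sqrt(sigma2_hat * (1.0 / n + x.mean()**2 / S_xx))
T = (beta0_hat - beta0_0) / se_beta0
p_value = 2 * stats.t.sf(abs(T), df=n - 2)
print(T, p_value)
```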


Critical Region for Testing $\beta_0 = \beta_{0,0}$

With $T = \dfrac{\hat{\beta}_0 - \beta_{0,0}}{\sqrt{\hat{\sigma}^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)}}$, the critical region for a given α is:

◦ $\beta_0 \ne \beta_{0,0}$ (two-sided alternative): reject if $T < t_{\alpha/2}(n-2)$ or $T > t_{1-\alpha/2}(n-2)$
◦ $\beta_0 < \beta_{0,0}$ (one-sided alternative): reject if $T < t_{\alpha}(n-2)$
◦ $\beta_0 > \beta_{0,0}$ (one-sided alternative): reject if $T > t_{1-\alpha}(n-2)$
Prediction
Suppose a new value $Y_{n+1}$ is to be predicted at $x = x_{n+1}$.
The point estimator of $Y_{n+1}$ is given by $\hat{Y}_{n+1} = \hat{\beta}_0 + \hat{\beta}_1 x_{n+1}$.
The error in prediction is $e_p = Y_{n+1} - \hat{Y}_{n+1}$.
Note that

$$E(e_p) = E(Y_{n+1} - \hat{Y}_{n+1}) = 0$$

$$Var(\hat{Y}_{n+1}) = Var(\hat{\beta}_0 + \hat{\beta}_1 x_{n+1}) = \sigma^2 \left( \frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{S_{xx}} \right)$$

$$Var(Y_{n+1}) = \sigma^2$$

Also note that $Y_{n+1}$ refers to a future observation while $\hat{Y}_{n+1}$ is estimated from the fitted model. Hence $Y_{n+1}$ and $\hat{Y}_{n+1}$ are independent.

Therefore:

$$Var(e_p) = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{S_{xx}} \right)$$



Prediction Interval
Thus we have

$$Y_{n+1} - \hat{\beta}_0 - \hat{\beta}_1 x_{n+1} \sim N\!\left( 0,\ \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{S_{xx}} \right) \right)$$

and hence

$$\frac{Y_{n+1} - \hat{Y}_{n+1}}{\sqrt{\hat{\sigma}^2 \left( 1 + \dfrac{1}{n} + \dfrac{(x_{n+1} - \bar{x})^2}{S_{xx}} \right)}} \sim t(n-2)$$

Therefore the prediction interval at the 100(1-α)% confidence level is

$$\hat{y}_{n+1} - t_{n-2,\,\alpha/2} \sqrt{\hat{\sigma}^2 \left( 1 + \frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{S_{xx}} \right)} \;\le\; Y_{n+1} \;\le\; \hat{y}_{n+1} + t_{n-2,\,\alpha/2} \sqrt{\hat{\sigma}^2 \left( 1 + \frac{1}{n} + \frac{(x_{n+1} - \bar{x})^2}{S_{xx}} \right)}$$

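A sketch computing this interval at a hypothetical new point $x_{n+1} = 6$ with α = 0.05 (invented data; SciPy assumed available):

```python
import numpy as np
from scipy import stats

# Sketch: 95% prediction interval for Y at a new point x_new.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)
x_new, alpha = 6.0, 0.05   # hypothetical new point and level

S_xx = np.sum((x - x.mean())**2)
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / S_xx
beta0_hat = y.mean() - beta1_hat * x.mean()
sigma2_hat = np.sum((y - (beta0_hat + beta1_hat * x))**2) / (n - 2)

y_hat_new = beta0_hat + beta1_hat * x_new
half_width = stats.t.ppf(1 - alpha / 2, df=n - 2) * np.sqrt(
    sigma2_hat * (1 + 1.0 / n + (x_new - x.mean())**2 / S_xx))
print(y_hat_new - half_width, y_hat_new + half_width)
```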


Summary

Statistical properties of the estimated errors (residuals)
Statistical properties of the least squares estimators
Testing of hypotheses for regression coefficients
Prediction and prediction interval



Thank you….



Difference Between Mean Value Prediction and Prediction of a Future Value of Y
Suppose we are interested in predicting Y at a given $x_{n+1}$, say $Y(x_{n+1})$.

There is a difference between the mean response $\beta_0 + \beta_1 x_0$ and $Y(x_{n+1})$, as the sketch after this list illustrates:

◦ Example: let $x_0$ be a temperature and Y be the response of an experiment carried out at temperature $x_0$. Then:
◦ When several experiments are carried out at a given $x_0$, the expected value is the mean value $\beta_0 + \beta_1 x_0$.
◦ However, if only one experiment is carried out, Y will be a single response. The present case (the prediction interval above) relates to this possibility.

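To make the distinction concrete, here is a sketch comparing the variance of the estimated mean response at a point $x_0$ (the formula for $Var(\hat{Y})$ from the Prediction slide) with the variance of the prediction error for a single future observation; the data and $x_0$ are invented, and the latter variance exceeds the former by $\sigma^2$:

```python
import numpy as np

# Sketch: variance of the estimated mean response at x0 versus the
# variance of the prediction error for one new observation at x0.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)
x0 = 3.5   # hypothetical point of interest

S_xx = np.sum((x - x.mean())**2)
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / S_xx
beta0_hat = y.mean() - beta1_hat * x.mean()
sigma2_hat = np.sum((y - (beta0_hat + beta1_hat * x))**2) / (n - 2)

var_mean = sigma2_hat * (1.0 / n + (x0 - x.mean())**2 / S_xx)      # mean response
var_pred = sigma2_hat * (1 + 1.0 / n + (x0 - x.mean())**2 / S_xx)  # single new Y
print(var_mean, var_pred)   # var_pred exceeds var_mean by sigma2_hat
```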
