The document discusses the assumptions and properties of simple linear regression. It begins by presenting the true population regression function and the sample regression function in its stochastic form. It then outlines several useful properties of the sample regression line, including that the sum of the residuals is zero, the sample covariance between the regressor and the residuals is zero, and the point $(\bar{X}, \bar{Y})$ is always on the regression line. The document introduces the concepts of total, explained, and residual sum of squares. It derives the expected values and variances of the OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$. Finally, it discusses the assumptions of the simple linear regression model, including that the error term has a zero conditional mean and constant variance.

Uploaded by

Catalin Parvu


Simple regression (continued)

True population regression function

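$Y = \beta_0 + \beta_1 X + u$

We need to estimate the PRF on the basis of the sample regression function (SRF), which in stochastic form is

$Y = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{u}$

The SRF is determined on the basis of a simple assumption: $E(u) = 0$. In that case we can write the SRF in its deterministic form as:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$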

Useful algebraic properties of the SRL.

(1) The sum of the OLS residuals is 0, and hence their sample average must also be equal to 0: $\sum_{i=1}^{n} \hat{u}_i = 0$. This says nothing about the residual for any particular observation and needs no proof, as it follows from the first order condition.

(2) The sample covariance between the regressor and the residuals is zero: $\sum_{i=1}^{n} X_i \hat{u}_i = 0$. In the sample, the residuals are uncorrelated with the explanatory variable. This also follows from the first order condition.

(3) The point $(\bar{X}, \bar{Y})$ is always on the regression line. This follows from equation 6a: $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$

(4) The sample average of the fitted values ($\hat{Y}$) is the same as the sample average of the actual values ($Y$). In other words, $\bar{\hat{Y}} = \bar{Y}$.

(5) The fitted values and the residuals are uncorrelated: $\sum_{i=1}^{n} \hat{Y}_i \hat{u}_i = 0$

(6) $Y_i = \hat{Y}_i + \hat{u}_i$, or $\hat{u}_i = Y_i - \hat{Y}_i$
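The algebraic properties listed above can be checked numerically. The following is a minimal sketch, not part of the original handout: the data are simulated, the true line (2 + 0.5 X) and the helper name `ols_fit` are assumptions made purely for illustration.

```python
import random


def ols_fit(x, y):
    """Hypothetical helper: OLS intercept and slope for y regressed on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar  # the intercept formula quoted in property (3)
    return b0, b1


# Simulated data (assumed values, for illustration only).
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(50)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

b0, b1 = ols_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sum_resid = sum(resid)                                            # property (1)
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))            # property (2)
line_at_x_bar = b0 + b1 * x_bar                                   # property (3)
mean_fitted = sum(fitted) / n                                     # property (4)
sum_fitted_resid = sum(fi * ui for fi, ui in zip(fitted, resid))  # property (5)

print("sum of residuals:", sum_resid)
print("sum of X times residuals:", sum_x_resid)
print("line at X-bar minus Y-bar:", line_at_x_bar - y_bar)
print("mean fitted minus Y-bar:", mean_fitted - y_bar)
print("sum of fitted times residuals:", sum_fitted_resid)
```

All five printed quantities should be zero up to floating-point rounding, whatever data are used, because they follow from the first order conditions rather than from the particular sample.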
Some new concepts
(1) Total sum of squares (TSS or SST)
(2) Explained sum of squares (ESS)
(3) Residual sum of squares (RSS)
TSS = $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$: the total variation of the actual Y values about their sample mean. This is the total variation in Y, the spread of Y.

ESS = $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$: the total variation of the estimated Y values about their mean. In other words, the variation in Y which is explained by the regression.

Empirical Economics
26163
Handout 2

RSS = $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{u}_i^2$: the unexplained variation.

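[Figure 6-8, Gujarati: for an actual observation $Y_i$, the total deviation $Y_i - \bar{Y}$ is split into $\hat{Y}_i - \bar{Y}$ (variation in $Y_i$ explained by the regression) and $\hat{u}_i = Y_i - \hat{Y}_i$ (variation in $Y_i$ not explained by the regression), drawn around the SRF $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$.]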

It can easily be proven that TSS = ESS + RSS, but in order to tell how well the RHS variable explains the variation in the LHS variable we compute what is known as the $R^2$.
This is known as the coefficient of determination: $R^2$ = ESS/TSS or 1 - [RSS/TSS].
In our example: 1 - [9.5515/393.60] = 0.9757
$R^2$ = 0.98 means that 98% of the variation in Y is explained by the regression.
If the SRF fits the data well then the ESS should be much larger than the RSS.
If the SRF fits the data poorly then the RSS should be much larger than the ESS.
In the extreme where X explains no variation in Y, RSS = TSS and ESS = 0. At the other extreme, where X explains all of the variation in Y, RSS = 0 and ESS = TSS.
These are polar cases.
If ESS is large relative to RSS, the SRF explains a substantial proportion of the variation in Y.
If RSS is large relative to ESS, the SRF explains only a small proportion of the variation in Y.
We would like the ESS to be bigger than the RSS.
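The decomposition and the two equivalent formulas for $R^2$ can be sketched as follows. The five data points are made up for illustration and are not the data behind the handout's example; only the last line reuses the handout's own figures (RSS = 9.5515, TSS = 393.60).

```python
# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept.
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)                # total variation
ess = sum((fi - y_bar) ** 2 for fi in fitted)           # explained variation
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation

r2_from_ess = ess / tss
r2_from_rss = 1 - rss / tss

# The arithmetic of the handout's example.
handout_r2 = 1 - 9.5515 / 393.60

print("TSS:", tss, "ESS + RSS:", ess + rss)
print("R2 (ESS/TSS):", r2_from_ess, "R2 (1 - RSS/TSS):", r2_from_rss)
print("Handout example R2:", round(handout_r2, 4))
```

The two routes to $R^2$ agree only because TSS = ESS + RSS holds, which in turn relies on the regression including a constant.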
Expected values and variances of the OLS estimators
By utilising the statistical properties of OLS estimation we can obtain the expected values and variances of the OLS estimators.
This means that we are looking at the properties of the distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$ (they are estimators of the parameters $\beta_0$ and $\beta_1$ that appear in the population model).

1 Linear in parameters (the $\beta$s): $Y = \beta_0 + \beta_1 X + u$

We know that the explanatory variable can be expressed in logs (log X), be squared ($X^2$) or enter as a reciprocal (1/X): linearity in the parameters places no restriction on how Y and X relate to each other.
2 Random sampling: each observation has an equal chance of being chosen. If we are looking at a cross section of individuals then each individual has an equal chance of being chosen.

3 Zero conditional mean: $E(u \mid X) = 0$: the mean of the error term does not depend on the explanatory variable. That is, given the value of X, the mean or expected value of u is zero.
The factors which are not explicitly included in the model, and are therefore subsumed in u, do not systematically affect the mean value of Y: positive values of u cancel out negative values of u.
If $E(u \mid X) = 0$ holds then $E(Y \mid X) = \beta_0 + \beta_1 X$.
4 X is nonstochastic.
Nonstochastic means that X is fixed. Consider the relationship between wages (Y) and education (X). We may subdivide education into those with primary (7), secondary (14) and tertiary (17 or 19 etc.) education. Each time X is fixed at a certain level. For each of the fixed Xs there is an associated distribution of Y.
If each X is nonstochastic then the assumption $E(u \mid X) = 0$ is automatically fulfilled (given $E(u) = 0$), since by extension if X is fixed then it is independent of u.
One can use this assumption of a fixed X to explore the covariance of X and u: $\mathrm{Cov}(X, u) = 0$.
Since the X are nonstochastic, $E(Xu) = X E(u)$, so

$\mathrm{Cov}(X, u) = E(Xu) - X E(u) = 0$

5 The X values are not all the same: there is variation in the independent variable.

Using all of the above we can show that $\hat{\beta}$ is an unbiased estimator of $\beta$:

$E(\hat{\beta}_1) = \beta_1$

$E(\hat{\beta}_0) = \beta_0$
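Unbiasedness is a statement about repeated samples, so it can be illustrated with a small Monte Carlo experiment. This is a sketch under assumed values: the population parameters (1, 2), the error standard deviation (3), the fixed X values 1 to 20 and the helper name `ols_fit` are all hypothetical choices, not taken from the handout.

```python
import random


def ols_fit(x, y):
    """Hypothetical helper: OLS intercept and slope for y regressed on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


rng = random.Random(42)
beta0, beta1, sigma = 1.0, 2.0, 3.0   # assumed population values
x = [float(i) for i in range(1, 21)]  # X is held fixed across samples
reps = 5000

b0_draws, b1_draws = [], []
for _ in range(reps):
    # New errors each replication: zero mean, constant variance.
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    b0_hat, b1_hat = ols_fit(x, y)
    b0_draws.append(b0_hat)
    b1_draws.append(b1_hat)

mean_b0 = sum(b0_draws) / reps
mean_b1 = sum(b1_draws) / reps
print("average intercept estimate:", mean_b0, "true value:", beta0)
print("average slope estimate:", mean_b1, "true value:", beta1)
```

Any single estimate can be far from the true value; it is the average over many samples that should settle close to $\beta_0$ and $\beta_1$.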

Variances of the OLS estimators

Firstly, how do we calculate the variance ($\sigma^2$) of the regression (the error variance)?

If we assume that u and X are independent then the distribution of u does not depend on X:

$E(u \mid X) = 0$

This means $E(u) = 0$, and $\mathrm{Var}(u \mid X)$ is equal to some unchanging value denoted by $\sigma^2$:

$\mathrm{Var}(u \mid X) = E\{[u - E(u)]^2\}$

$\mathrm{Var}(u \mid X) = E(u^2) = \sigma^2$


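Hence

$E(u^2) = \sigma^2$

However, we cannot observe these values; what we have are estimates. Hence we use an estimator of $\sigma^2$, which is $\hat{\sigma}^2$. Also, $\hat{u}$ will be an estimator of $u$, so that the estimator for the error variance becomes

$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2$

The denominator is $n - 2$ because the residuals satisfy two restrictions, $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} X_i \hat{u}_i = 0$ (the sample counterparts of $E(u) = 0$ and $E(Xu) = 0$).
And the standard error of the regression (or root mean square error) is defined as

$\hat{\sigma} = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2}$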
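As a worked illustration of the error variance estimator and the standard error of the regression, the sketch below uses five made-up observations (they are not from the handout) and divides the residual sum of squares by n - 2.

```python
import math

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept.
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Residuals and their sum of squares.
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
rss = sum(ui ** 2 for ui in resid)

# Two parameters were estimated, so two degrees of freedom are lost.
sigma2_hat = rss / (n - 2)
ser = math.sqrt(sigma2_hat)  # standard error of the regression

print("RSS:", rss)
print("estimated error variance:", sigma2_hat)
print("standard error of the regression:", ser)
```

With these numbers the slope is 1.99 and the intercept 0.05, so the residuals are 0.06, -0.13, 0.18, -0.21 and 0.10, giving an RSS of 0.107 spread over n - 2 = 3 degrees of freedom.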

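Our primary interest in $\hat{\sigma}^2$ and $\hat{\sigma}$ is to use them to help us derive the variance of the estimators (and hence their standard errors). That is, we use $\hat{\sigma}^2$ (and $\hat{\sigma}$) to derive the standard errors of our estimators $\hat{\beta}_0$ and $\hat{\beta}_1$, denoted as se($\hat{\beta}_0$) and se($\hat{\beta}_1$) respectively.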

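$\mathrm{var}(\hat{\beta}_1) = \sigma^2 / TSS_X = \sigma^2 / \sum_{i=1}^{n} (X_i - \bar{X})^2$

(1) The larger the error variance $\sigma^2$, the larger will be the variance of the estimator.
(2) The larger the variability in X, the smaller the variance of $\hat{\beta}_1$.
(3) We can also find the standard deviation of $\hat{\beta}_1$:

$\mathrm{sd}(\hat{\beta}_1) = \sqrt{\mathrm{var}(\hat{\beta}_1)} = \sqrt{\sigma^2 / TSS_X} = \sigma / \sqrt{TSS_X}$

The sd($\hat{\beta}_1$) is simply the square root of the variance.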

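We do not have $\sigma$; we have an estimator, $\hat{\sigma}$. We can now use it to derive an estimator of sd($\hat{\beta}_1$), where sd($\hat{\beta}_1$) = $\sigma / s_X$ with $s_X = \sqrt{TSS_X}$.

Since we are using $\hat{\sigma}$ and NOT $\sigma$, it means that we are finding an estimator of sd($\hat{\beta}_1$). The natural estimator of the latter is referred to as the standard error of $\hat{\beta}_1$: se($\hat{\beta}_1$) = $\hat{\sigma} / s_X$ is an estimator of sd($\hat{\beta}_1$) = $\sigma / s_X$.

This can be defined as

$\mathrm{se}(\hat{\beta}_1) = \hat{\sigma} / \left[ \sum_{i=1}^{n} (X_i - \bar{X})^2 \right]^{1/2}$ or $\mathrm{se}(\hat{\beta}_1) = \sqrt{\widehat{\mathrm{var}}(\hat{\beta}_1)}$

In order to find se($\hat{\beta}_0$):

$\mathrm{var}(\hat{\beta}_0) = \sigma^2 n^{-1} \sum_{i=1}^{n} X_i^2 / \sum_{i=1}^{n} (X_i - \bar{X})^2$ and

$\mathrm{se}(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2 n^{-1} \sum_{i=1}^{n} X_i^2 / \sum_{i=1}^{n} (X_i - \bar{X})^2}$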
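The standard error formulas can be evaluated step by step. The sketch below again uses five made-up observations (an assumption for illustration, not the handout's data) and replaces the unknown error variance with its estimate, RSS divided by n - 2.

```python
import math

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
tss_x = sum((xi - x_bar) ** 2 for xi in x)  # total variation in X
sum_x2 = sum(xi ** 2 for xi in x)

# OLS slope and intercept.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / tss_x
b0 = y_bar - b1 * x_bar

# Estimated error variance: RSS divided by n - 2.
rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sigma2_hat = rss / (n - 2)

# Estimated variances of the estimators, then their standard errors.
var_b1_hat = sigma2_hat / tss_x
var_b0_hat = sigma2_hat * (sum_x2 / n) / tss_x
se_b1 = math.sqrt(var_b1_hat)
se_b0 = math.sqrt(var_b0_hat)

print("se(slope):", se_b1)
print("se(intercept):", se_b0)
```

Doubling the spread of X while holding the estimated error variance fixed would quadruple `tss_x` and so halve `se_b1`, which is point (2) above in numerical form.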
Note: Importance of the minimum variance property.
Another ideal property of an estimator is that of minimum variance. That is, in the class of all linear unbiased estimators, var($\hat{\beta}_1$) and var($\hat{\beta}_0$) must be the lowest. If this is achieved then the estimators are said to be efficient. Therefore, in addition to being unbiased, estimators should also be efficient. An efficient estimator is a reliable estimator.

Continuing with assumptions

6 $\mathrm{Var}(u \mid X) = \sigma^2$
This is referred to as homoskedasticity. It says that the error term has a constant variance for all observations.
This is quite different from $E(u \mid X) = 0$. The latter involves the expected value; what we are looking at now concerns the variance.
This property plays no role in showing the unbiasedness of the estimators.
$\mathrm{var}(u \mid X) = \sigma^2$ implies $\mathrm{var}(Y \mid X) = \sigma^2$ (see proof below).
We assumed that var(u) is fixed at $\sigma^2$, that is

$\mathrm{var}(u) = E[u - E(u)]^2$

$\mathrm{var}(u) = E[u - 0]^2$
$\mathrm{var}(u) = E(u^2) = \sigma^2$
And, since $Y - E(Y \mid X) = u$, $\mathrm{var}(Y \mid X) = E(u^2) = \mathrm{var}(u) = \sigma^2$


Homoskedasticity can be shown graphically.

In the diagram we have the population regression (Fig 2.8, Wooldridge).

[Figure 2.8, Wooldridge: the population regression line $E(Y \mid X) = \beta_0 + \beta_1 X$, with Y on the vertical axis and the conditional distributions of Y shown at different values of X.]

The conditional distributions have the same spread.
The variance of each of these distributions is the same for the various values of X.
Individual values of Y are spread around their mean with the same variance.
What does this mean in terms of homoskedasticity?
All the Y values corresponding to the various Xs are equally reliable, in the sense that reliability is judged by how closely the Y values are distributed around their means.
An interpretation using the model with wages and education
In our education example we are saying that var(u | education) = $\sigma^2$: the variance of u does not depend on the level of education.
This implies var(wage | education) = $\sigma^2$.
The variability in wages about its mean is the same across all education levels.
The variability in wages about its mean for those with only primary school education is the same as the variability in wages about its mean for those with only secondary school education.
This may not be so: individuals with no or little education may just be earning the minimum wage, hence there is no or little variability in the distribution of wages across this group. Similarly, individuals with more education have more job opportunities, which could lead to more variability in wages, hence the distribution of wages across this group may be more spread out.


Heteroskedasticity
Figure 2.9, Wooldridge.

$\mathrm{var}(u \mid X)$ increases with X.

This also means that $\mathrm{var}(Y \mid X)$ increases with X.
The distribution spreads out as X increases. For small values of X the distribution is tight and for large values of X there is a greater spread.

[Figure 2.9, Wooldridge: the population regression line $E(Y \mid X) = \beta_0 + \beta_1 X$, with Y on the vertical axis and X on the horizontal axis.]

Not all the Y values corresponding to the various Xs are equally reliable, in the sense that reliability is judged by how closely the Y values are distributed around their means.
If we pick an individual from a group with a smaller variance, it is more likely that the corresponding Y for this individual will be close to the mean than if we pick an individual from a group with a bigger variance.
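The contrast between the two cases can be simulated. In this sketch the education levels (7, 14, 17) follow the handout's example, but the error standard deviations (a constant 2.0, and 0.3 times the education level) and the helper name `error_variance_by_level` are assumptions chosen purely for illustration.

```python
import random
import statistics

rng = random.Random(1)
education_levels = [7, 14, 17]  # primary, secondary, tertiary
draws_per_level = 4000


def error_variance_by_level(sd_of_error):
    """Hypothetical helper: sample variance of simulated errors at each level.

    sd_of_error maps an education level to the standard deviation of u.
    """
    result = {}
    for level in education_levels:
        errors = [rng.gauss(0.0, sd_of_error(level))
                  for _ in range(draws_per_level)]
        result[level] = statistics.pvariance(errors)
    return result


# Homoskedastic case: the spread of u is the same at every level.
homo = error_variance_by_level(lambda level: 2.0)

# Heteroskedastic case: the spread of u grows with the level of education.
hetero = error_variance_by_level(lambda level: 0.3 * level)

for level in education_levels:
    print(level, "homoskedastic:", round(homo[level], 2),
          "heteroskedastic:", round(hetero[level], 2))
```

In the first case the three variances should be roughly equal; in the second they should rise with education, which is the pattern described for wages above.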

To summarise: given the value of X, the variance of $u_i$ is the same for all observations.
That is, the conditional variances of $u_i$ are identical.

$\sigma^2 = E(u^2) = \mathrm{var}(u)$

$\sigma^2$ is the unconditional variance of u and is often called the error variance or the disturbance variance.

The square root, $\sigma$, is the standard deviation of the error. A larger $\sigma$ means the distribution of the unobservables affecting the dependent variable is more spread out.
This has implications for the variance of the slope estimator.


Using the above info we can show

Why OLS? Summary on the properties of the OLS estimators
OLS is popular because of its strong properties, which are summarised in the Gauss-Markov theorem: the OLS estimators are BLUE (best linear unbiased estimators).
The estimators are best: they have the smallest variance (efficient).
They are linear functions of the random variable Y.
They are unbiased.
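The three parts of BLUE can be illustrated by comparing the OLS slope with another estimator that is also linear in Y and unbiased. The comparison estimator used here, the slope of the line through the first and last observations, is an assumption chosen for illustration and does not appear in the handout; the population values are likewise hypothetical.

```python
import random
import statistics

rng = random.Random(7)
beta0, beta1, sigma = 1.0, 2.0, 3.0   # assumed population values
x = [float(i) for i in range(1, 21)]  # X is held fixed across samples
n = len(x)
x_bar = sum(x) / n
tss_x = sum((xi - x_bar) ** 2 for xi in x)

# Linear: the OLS slope is a weighted sum of the Y values with fixed weights.
w = [(xi - x_bar) / tss_x for xi in x]

ols_draws, endpoint_draws = [], []
for _ in range(5000):
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    ols_draws.append(sum(wi * yi for wi, yi in zip(w, y)))
    # Alternative linear unbiased estimator: slope through the end points.
    endpoint_draws.append((y[-1] - y[0]) / (x[-1] - x[0]))

mean_ols = statistics.fmean(ols_draws)
mean_endpoint = statistics.fmean(endpoint_draws)
var_ols = statistics.pvariance(ols_draws)
var_endpoint = statistics.pvariance(endpoint_draws)

print("mean of OLS slope:", mean_ols, "variance:", var_ols)
print("mean of end-point slope:", mean_endpoint, "variance:", var_endpoint)
```

Both estimators should average out close to the true slope, but the OLS draws should be visibly less dispersed, which is what "best" means in the theorem.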
Regression through the origin
This means estimating a regression without a constant. In many cases this may be a realistic situation: for example, tax paid should be zero when income is zero. Or, if savings is solely a function of current income, then it should be zero when an individual is unemployed and receiving no income.

$\tilde{\beta}_1 = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2}$   (12)

Notice the difference in the estimate when $\hat{\beta}_1$ (from the regression with a constant) is the estimator:

$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} Y_i X_i - n \bar{Y} \bar{X}}{\sum_{i=1}^{n} X_i^2 - n \bar{X}^2}$
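The two slope formulas can be compared directly on the same data. The five observations below are made up for illustration; the point is only that the two estimators generally give different numbers unless the sample means happen to make the correction terms vanish.

```python
# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Regression through the origin: no constant is estimated.
slope_origin = sum_xy / sum_x2

# Regression with a constant: raw sums corrected by the sample means.
slope_with_constant = (sum_xy - n * y_bar * x_bar) / (sum_x2 - n * x_bar ** 2)

# The same slope written in deviations from the means, as a cross-check.
slope_deviation_form = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    / sum((xi - x_bar) ** 2 for xi in x)
)

print("slope through the origin:", slope_origin)
print("slope with a constant:", slope_with_constant)
```

Here the slope with a constant is 1.99 while the slope through the origin is slightly larger, because forcing the line through (0, 0) absorbs the small positive intercept into the slope.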


Summary
We can now specify the 2-variable linear regression model by listing its important
assumptions:
1. Linear in parameters: $Y = \beta_0 + \beta_1 X + u$
2. X is nonstochastic: its value is fixed (this means that each independent variable is controlled by the researcher, who can change its value in accordance with the experimental objectives). This assumption implies assumption 3: if the Xs are fixed they must by extension be independent of the error.
3. Zero conditional mean: $E(u \mid X) = 0$
4. The error has a zero expected value: $E(u) = 0$.
5. The error has a constant variance for all observations, that is, $\mathrm{var}(u \mid X) = E(u^2) = \sigma^2$ (the property of homoskedasticity).
6. The random variables $u_i$ are statistically independent, thus $E(u_i u_j) = 0$ for $i \neq j$. This means that the errors associated with any two observations are independent (reflects an absence of serial correlation).
7. The error term is normally distributed (we return to this later on).
8. Other assumptions
The X values must not all be the same: variation is necessary to use regression analysis as a research tool.
The number of observations is greater than the number of parameters to be estimated. Otherwise the parameters could not be estimated: from a single observation there is no way to obtain two parameters (two unknowns and one equation).
The regression model is correctly specified. That is, there is no specification bias.
