Simple Regression (Continued)

The document discusses the assumptions and properties of simple linear regression. It begins by presenting the true population regression function and stochastic regression function. It then outlines several useful properties of the stochastic regression line, including that the sum of residuals is zero, the sample covariance between regressors and residuals is zero, and the point $(\bar{X}, \bar{Y})$ is always on the regression line. The document introduces the concepts of total, explained, and residual sum of squares. It derives the expected values and variances of the OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$. Finally, it discusses the assumptions of the simple linear regression model, including that the error term has a zero conditional mean and constant variance.

Uploaded by

Catalin Parvu

Simple regression (continued)

True population regression function

$Y = \beta_0 + \beta_1 X + u$

We need to estimate the PRF on the basis of the SRF (stochastic form):

$Y = \hat{\beta}_0 + \hat{\beta}_1 X + \hat{u}$

The SRF is determined on the basis of a simple assumption: $E(u) = 0$. In which case we
can write the SRF in its deterministic form as:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$

Useful algebraic properties of the SRL.


(1) The sum of the OLS residuals is 0, and hence their average must be equal to 0:
$\sum_{i=1}^{n} \hat{u}_i = 0$. This says nothing about the residual for any particular observation and needs
no proof, as it follows from the first order condition.


(2) The sample covariance between the regressor and the residuals is zero. This
follows from the first order condition $\sum_{i=1}^{n} X_i \hat{u}_i = 0$. In the sample, the residuals
are NOT linearly related to the explanatory variable.
(3) The point $(\bar{X}, \bar{Y})$ is always on the regression line. This follows from equation
6a: $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$

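(4) The sample average of the fitted values ($\hat{Y}$) is the same as the sample average
of the actual values ($Y$). In other words, $\bar{\hat{Y}} = \bar{Y}$.
(5) The fitted values and the residuals are uncorrelated: $\sum_{i=1}^{n} \hat{Y}_i \hat{u}_i = 0$
(6) $Y_i = \hat{Y}_i + \hat{u}_i$, or $\hat{u}_i = Y_i - \hat{Y}_i$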
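
The algebraic properties listed above can be checked numerically. The following is a minimal sketch in Python, assuming a small made-up data set and a hypothetical helper called ols_fit; neither comes from the handout. Each printed quantity should be zero up to floating-point rounding.

```python
# Illustrative check of the algebraic properties of the fitted line.
# The data and the helper name ols_fit are assumptions made for this sketch.

def ols_fit(x, y):
    """Return (b0, b1): the OLS intercept and slope for one regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar  # intercept from the first order condition
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

b0, b1 = ols_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

x_bar = sum(x) / n
y_bar = sum(y) / n

checks = {
    "(1) sum of residuals": sum(resid),
    "(2) sum of X * residual": sum(xi * ui for xi, ui in zip(x, resid)),
    "(3) line at X-bar minus Y-bar": (b0 + b1 * x_bar) - y_bar,
    "(4) mean fitted minus Y-bar": sum(fitted) / n - y_bar,
    "(5) sum of fitted * residual": sum(fi * ui for fi, ui in zip(fitted, resid)),
}

for name, value in checks.items():
    # Each value should be zero up to floating-point rounding.
    print(f"{name}: {value:.2e}")
```

Property (6) holds by construction here, because the residuals are computed as the actual values minus the fitted values.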
Some new concepts
(1) Total sum of squares (TSS or SST)
(2) Explained sum of squares (ESS)
(3) Residual sum of squares (RSS)
TSS = $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$ - the total variation of the actual Y values about their sample mean.
This is the total variation in Y, that is, the spread of Y.

ESS = $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$ - the total variation of the estimated Y values about their mean.
In other words, the variation in Y which is explained by the regression.

Empirical Economics
26163
Handout 2

RSS = $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \hat{u}_i^2$ - the unexplained variation.

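[Figure 6-8, Gujarati: for an actual observation $Y_i$, the total deviation $Y_i - \bar{Y}$ is split into $\hat{Y}_i - \bar{Y}$, the variation in $Y_i$ explained by the regression, and $\hat{u}_i = Y_i - \hat{Y}_i$, the variation in $Y_i$ not explained by the regression. The line drawn is the SRF: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$.]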

It can easily be proven that TSS = ESS + RSS. To tell us how well the RHS
variables explain the variation in the LHS variable, we compute what is known as the $R^2$.
This is known as the coefficient of determination: $R^2$ = ESS/TSS or 1-[RSS/TSS].
In our example: 1-[9.5515/393.60]=0.9757
$R^2$ = 0.98 means that about 98% of the variation in Y is explained by the regression.
If the SRF fits the data well then the ESS should be much larger than the RSS.
If the SRF fits the data poorly then the RSS should be much larger than the ESS.
In the extreme where X explains no variation in Y, RSS=TSS and ESS=0, so $R^2$ = 0.
At the other extreme, where the SRF fits the data perfectly, RSS=0 and ESS=TSS, so $R^2$ = 1.
These are polar cases.
If ESS is large relative to RSS, the SRF explains a substantial proportion of
the variation in Y.
If RSS is large relative to ESS, the SRF explains a small proportion of the
variation in Y.
We would like the ESS to be bigger than the RSS.
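
The decomposition of the total sum of squares and the two equivalent formulas for the coefficient of determination can be sketched in Python as follows. The data set and the function name sums_of_squares are assumptions made for illustration; only the figures 9.5515 and 393.60 are taken from the handout's example.

```python
# Sums of squares and R-squared for a simple regression.
# The data and the helper name sums_of_squares are illustrative assumptions.

def sums_of_squares(x, y):
    """Return (tss, ess, rss) for an OLS regression of y on x with intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    ess = sum((fi - y_bar) ** 2 for fi in fitted)            # explained variation
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    return tss, ess, rss

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

tss, ess, rss = sums_of_squares(x, y)
r2_from_ess = ess / tss
r2_from_rss = 1 - rss / tss
print(tss, ess + rss)            # the two should agree up to rounding
print(r2_from_ess, r2_from_rss)  # the two formulas should agree up to rounding

# The handout's own example: RSS = 9.5515 and TSS = 393.60.
r2_handout = 1 - 9.5515 / 393.60
print(round(r2_handout, 4))
```

Both formulas give the same value only when the regression includes an intercept, which is the case throughout this section.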
Expected values and variances of the OLS estimators
By utilising the statistical properties of OLS estimation we can obtain expected values
and variance of the OLS estimators.
This means that we are looking at the properties of the distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$
(they are estimators of the parameters $\beta_0$ and $\beta_1$ that appear in the population model).

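The derivations rely on the following assumptions.
1 Linear in parameters (the $\beta$s): $Y = \beta_0 + \beta_1 X + u$
We know that the variables can be transformed: X, for example, can be expressed
in logs (log X), be squared ($X^2$) or enter as a reciprocal (1/X). Linearity restricts
only how the parameters enter the model, not how Y and X relate to each other.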
2 Random sampling - each observation has an equal chance of being chosen. If
we are looking at a cross section of individuals then each individual has an equal
chance of being chosen.

3 Zero conditional mean: $E(u \mid X) = 0$: the mean of the error term does not depend on the
explanatory variable. That is, given the value of X, the mean or expected value of u
is zero.
The factors which are not explicitly included in the model, and are
therefore subsumed in u, do not systematically affect the mean
value of Y: positive us cancel out negative us.
If $E(u \mid X) = 0$ holds then $E(Y \mid X) = \beta_0 + \beta_1 X$
4 X is nonstochastic.
Nonstochastic means that X is fixed. Consider the relationship between
wages (Y) and education (X). We may subdivide education into those with primary
(7), secondary (14) and tertiary (17 or 19 etc.) education. Each time X is fixed at a certain
level. For each of the fixed Xs there is an associated distribution of Y.
If each X is nonstochastic then the assumption [$E(u \mid X) = 0$] is automatically
fulfilled (given $E(u) = 0$), since by extension if X is fixed then it is independent of u.
One can use this assumption of a fixed X to explore the covariance of X and u:
Cov(X, u) = 0
Since the X are nonstochastic and $E(u) = 0$,

$\mathrm{Cov}(X, u) = E(Xu) = X\,E(u) = 0$

5 The X values are not all the same: there is variation in the independent variable.

Using all of the above we can show that $\hat{\beta}$ is an unbiased estimator of $\beta$:

$E(\hat{\beta}_1) = \beta_1$

$E(\hat{\beta}_0) = \beta_0$
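
Unbiasedness is a statement about the average of the estimates over repeated samples, which a small simulation can illustrate. The sketch below assumes made-up "true" parameter values, fixed X values and normally distributed errors. Normality is used only to generate the data; it is not needed for unbiasedness.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed "true" population values, chosen only for illustration.
BETA0, BETA1, SIGMA = 1.0, 0.5, 2.0

x = [float(i) for i in range(1, 21)]  # X held fixed across repeated samples
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

def draw_estimates():
    """Draw one sample from the population model and return (b0, b1)."""
    y = [BETA0 + BETA1 * xi + random.gauss(0.0, SIGMA) for xi in x]
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

reps = 2000
draws = [draw_estimates() for _ in range(reps)]
mean_b0 = sum(d[0] for d in draws) / reps
mean_b1 = sum(d[1] for d in draws) / reps

# The averages should be close to BETA0 and BETA1, even though
# any single estimate can be far from the true value.
print(mean_b0, mean_b1)
```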

Variances of the OLS estimators


Firstly, how do we calculate the variance ($\sigma^2$) of the regression (the error variance)?

If we assume that u and X are independent then the distribution of u does not depend
on X:

$E(u \mid X) = 0$

This means $E(u) = 0$, and $\mathrm{Var}(u \mid X)$ is equal to some unchanging value denoted by $\sigma^2$:

$\mathrm{Var}(u \mid X) = E\{[u - E(u \mid X)]^2 \mid X\}$

$\mathrm{Var}(u \mid X) = E\{[u - E(u)]^2 \mid X\}$

$\mathrm{Var}(u \mid X) = E(u^2 \mid X) = E(u^2) = \sigma^2$


Hence

$E(u^2) = \sigma^2$
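However, we cannot observe these values; what we have are the estimates. Hence we use
an estimator of $\sigma^2$, which is $\hat{\sigma}^2$. Also, $\hat{u}$ will be an estimator of $u$, so that the estimator
for the error variance becomes

$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - 2}$

The denominator is $n - 2$ because there are two restrictions on the residuals,
$E(\hat{u}) = 0$ and $E(X\hat{u}) = 0$ (in the sample, $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} X_i \hat{u}_i = 0$).
And the standard error of the regression (or root mean square error) is defined as

$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - 2}}$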
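
As a worked illustration of the error variance estimator and the standard error of the regression, the following Python sketch computes both from the residuals of a fitted line. The data are made up for the example and are not from the handout.

```python
import math

# Made-up data, used only to illustrate the calculation.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Residuals: actual values minus fitted values.
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

rss = sum(ui ** 2 for ui in resid)
sigma2_hat = rss / (n - 2)    # divide by n - 2, not n: two parameters were estimated
ser = math.sqrt(sigma2_hat)   # standard error of the regression

print(rss, sigma2_hat, ser)
```

Dividing by n instead of n - 2 would understate the error variance on average, because the residuals are forced to satisfy the two first order conditions.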

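Our primary interest in $\hat{\sigma}^2$ and $\hat{\sigma}$ is to use them to help us derive the variance of
the estimators (and hence their standard errors). That is, we use $\hat{\sigma}^2$ (and $\hat{\sigma}$)
to derive the standard errors of our estimators $\hat{\beta}_0$ and $\hat{\beta}_1$,
denoted as $se(\hat{\beta}_0)$ and $se(\hat{\beta}_1)$ respectively.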

$\mathrm{var}(\hat{\beta}_1) = \sigma^2 / TSS_X = \sigma^2 / \sum_{i=1}^{n} (X_i - \bar{X})^2$

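(1) The larger the error variance $\sigma^2$, the larger will be the variance of the estimator.
(2) The larger the variability in X, the smaller the variance of $\hat{\beta}_1$.
(3) We can also find the standard deviation of $\hat{\beta}_1$:

$sd(\hat{\beta}_1) = \sqrt{\mathrm{var}(\hat{\beta}_1)} = \sqrt{\sigma^2 / TSS_X} = \sigma / \sqrt{TSS_X}$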

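The $sd(\hat{\beta}_1)$ is simply the square root of the variance: $\sigma / s_X$
[$s_X = \sqrt{SST_X}$, where $SST_X$ is the same quantity as $TSS_X$].

We do not have $\sigma$; we have an estimator, $\hat{\sigma}$. We can now use it to derive an estimator of
$sd(\hat{\beta}_1)$, that is $\hat{\sigma} / s_X$.

Since we are using $\hat{\sigma}$ and NOT $\sigma$ to derive $sd(\hat{\beta}_1)$, it means that we are finding
an estimator of $sd(\hat{\beta}_1)$. The natural estimator of the latter is referred to as the
standard error of $\hat{\beta}_1$: $se(\hat{\beta}_1) = \hat{\sigma} / s_X$
is an estimator of $sd(\hat{\beta}_1) = \sigma / s_X$.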

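This can be defined as

$se(\hat{\beta}_1) = \frac{\hat{\sigma}}{\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right]^{1/2}}$

or

$se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$

In order to find the $se(\hat{\beta}_0)$:

$\mathrm{var}(\hat{\beta}_0) = \frac{\sigma^2 n^{-1} \sum_{i=1}^{n} X_i^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$ and

$se(\hat{\beta}_0) = \left[\frac{\hat{\sigma}^2 n^{-1} \sum_{i=1}^{n} X_i^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}\right]^{1/2}$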
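
The standard error formulas can be evaluated directly once the residuals are available. The sketch below assumes made-up data and a hypothetical helper named ols_standard_errors; it uses the error variance estimator with n - 2 in the denominator and the usual homoskedastic formulas for the two standard errors.

```python
import math

def ols_standard_errors(x, y):
    """Return (se_b0, se_b1) for a simple OLS regression of y on x.

    Assumes homoskedastic errors. The helper name is hypothetical.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    tss_x = sum((xi - x_bar) ** 2 for xi in x)   # variation in X
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / tss_x
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)                   # estimated error variance
    se_b1 = math.sqrt(sigma2_hat / tss_x)
    mean_x_squared = sum(xi ** 2 for xi in x) / n
    se_b0 = math.sqrt(sigma2_hat * mean_x_squared / tss_x)
    return se_b0, se_b1

# Made-up data, used only to illustrate the calculation.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

se_b0, se_b1 = ols_standard_errors(x, y)
print(se_b0, se_b1)
```

Spreading the X values further apart raises the variation in X and therefore lowers the standard error of the slope, in line with point (2) above.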
Note: Importance of the minimum variance property.
Another ideal property of an estimator is that of minimum variance. That is, in the class
of all linear unbiased estimators, $\mathrm{var}(\hat{\beta}_1)$ and $\mathrm{var}(\hat{\beta}_0)$ must be the lowest. If this is achieved then
the estimators are said to be efficient. Therefore, in addition to being unbiased,
estimators should also be efficient. An efficient estimator is a reliable estimator.

Continuing with assumptions


6 $\mathrm{Var}(u \mid X) = \sigma^2$
This is referred to as homoskedasticity. It says that the error term has a constant
variance for all observations.
This is quite different from $E(u \mid X) = 0$. The latter involves the expected value;
what we are looking at now concerns the variance.
This property plays no role in showing the unbiasedness of the estimators.
$\mathrm{var}(u \mid X) = \sigma^2$ implies $\mathrm{var}(Y \mid X) = \sigma^2$ (see proof below).
We assumed that var(u) is fixed at $\sigma^2$, that is

$\mathrm{var}(u) = E[u - E(u)]^2$

$\mathrm{var}(u) = E[u - 0]^2$
$\mathrm{var}(u) = E(u^2) = \sigma^2$
And, since $Y - E(Y \mid X) = u$, $\mathrm{var}(Y \mid X) = E(u^2) = \mathrm{var}(u) = \sigma^2$


Homoskedasticity can be shown graphically.

In the diagram we have the population regression (Fig 2.8 Wooldridge).

[Figure 2.8, Wooldridge: the population regression line $E(Y \mid X) = \beta_0 + \beta_1 X$, with the conditional distribution of Y drawn at $X_1$, $X_2$ and $X_3$.]

The conditional distributions have the same spread.
The variance of each of these distributions is the same for the various values of X.
Individual values of Y are spread around their mean with the same variance.
What does this mean in terms of homoskedasticity?
All the Y values corresponding to the various Xs are equally reliable, in the sense that
reliability is being judged by how closely the Y values are distributed around the
means.
An interpretation using the model with wages and education
In our education example we are saying that $\mathrm{var}(u \mid \text{education}) = \sigma^2$: the
variance in u does not depend on the level of education.
Equivalently, $\mathrm{var}(\text{wage} \mid \text{education}) = \sigma^2$.
The variability in wages about its mean is the same across all
education levels.
For example, the variability in wages about its mean for those with only primary
school education is the same as the variability in wages about its
mean for those with only secondary school education.
This may not be so: individuals with no or little education may just be earning
the minimum wage, hence no or little variability in the distribution of wages
across this group. Similarly, individuals with more education have more job
opportunities, which could lead to more variability in wages, hence the distribution
of wages across this group may be more spread out.


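Heteroskedasticity
Figure 2.9 Wooldridge.

$\mathrm{var}(u \mid X)$ increases with X.

This also means that $\mathrm{var}(Y \mid X)$ increases with X.
The distribution spreads out as X increases. For small values of X the distribution is tight
and for large values of X there is a greater spread.

[Figure 2.9, Wooldridge: the population regression line $E(Y \mid X) = \beta_0 + \beta_1 X$, with conditional distributions of Y at $X_1$, $X_2$ and $X_3$ whose spread widens as X increases.]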

Not all the Y values corresponding to the various Xs are equally reliable in the
sense reliability is being judged by how closely the Y values are distributed
around the means.
If we pick an individual from a group with a smaller variance it will be more likely
that the corresponding Y for this individual will be closer to the mean than if we
pick an individual from a group with a bigger variance.
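
The contrast between homoskedastic and heteroskedastic errors can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the error standard deviations (a constant 2.0 in one case, 0.5 times X in the other), the normal distribution and the helper name simulate_errors are all made up; only the education levels 7, 14 and 17 come from the handout's example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

def simulate_errors(x_value, n_draws, heteroskedastic):
    """Draw errors u at a fixed X.

    The standard deviation grows with X only in the heteroskedastic case.
    Both standard deviation rules are assumptions made for illustration.
    """
    sd = 0.5 * x_value if heteroskedastic else 2.0
    return [random.gauss(0.0, sd) for _ in range(n_draws)]

levels = [7, 14, 17]  # education levels from the handout's example
n_draws = 5000

var_homo = {lvl: statistics.variance(simulate_errors(lvl, n_draws, False))
            for lvl in levels}
var_hetero = {lvl: statistics.variance(simulate_errors(lvl, n_draws, True))
              for lvl in levels}

for lvl in levels:
    # Homoskedastic variances stay near 4; heteroskedastic ones rise with X.
    print(lvl, round(var_homo[lvl], 2), round(var_hetero[lvl], 2))
```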

To summarise: under homoskedasticity, given the value of X, the variance of $u_i$ is the same for all observations.
That is, the conditional variances of $u_i$ are identical.

$\sigma^2 = E(u^2) = \mathrm{var}(u)$

1. $\sigma^2$ is the unconditional variance of u and is often called the error variance
or the disturbance variance.

2. The square root, $\sigma$, is the standard deviation of the error. A larger $\sigma$ means
the distribution of the unobservables affecting the dependent variable is
more spread out.

3. This has implications for the variance of the slope parameters.


Using the above info we can show


Why OLS? Summary of the properties of the OLS estimators
OLS is popular because of its strong properties, which are summarised in the Gauss-Markov theorem: the OLS estimators are BLUE (best linear unbiased estimators).
The estimators are best: they have the smallest variance (efficient)
They are linear functions of the random variable Y
They are unbiased
Regression through the origin
This means estimating a regression without a constant. In many cases this may be a realistic
situation: for example, tax paid should be zero when income is zero. Or, if savings is solely
a function of current income, then it should be zero when an individual is unemployed and
receiving no income.
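$\tilde{\beta}_1 = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2}$ (12)

Notice the difference in the estimate when $\hat{\beta}_1$ is the estimator.

$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} Y_i X_i - n \bar{Y} \bar{X}}{\sum_{i=1}^{n} X_i^2 - n \bar{X}^2}$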
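
To see how the two slope estimators differ in practice, the following Python sketch computes both on the same made-up data set (the data are an assumption for illustration, not from the handout). The first slope is for the regression through the origin; the second is for the regression with a constant.

```python
# Made-up data, used only to compare the two slope formulas.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi ** 2 for xi in x)

# Regression through the origin: no constant, raw sums only.
slope_origin = sum_xy / sum_xx

# Regression with a constant: the same sums, corrected for the sample means.
slope_intercept = (sum_xy - n * y_bar * x_bar) / (sum_xx - n * x_bar ** 2)

print(slope_origin, slope_intercept)
```

The two slopes coincide only when the correction terms vanish, for instance when the sample mean of X is zero.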


Summary
We can now specify the 2-variable linear regression model by listing its important
assumptions:
1. Linear in parameters: $Y = \beta_0 + \beta_1 X + u$
2. X is nonstochastic: its value is fixed. (This means that each independent variable is
controlled by the researcher, who can change its value in accordance with the
experimental objectives.) This assumption implies assumption 3: that is, if the Xs are
fixed they must by extension be independent of the error.
3. Zero conditional mean: $E(u \mid X) = 0$
4. The error has a zero expected value: $E(u) = 0$.
5. The error has a constant variance for all observations, that is, $\mathrm{var}(u \mid X) = E(u^2) = \sigma^2$ (the
property of homoskedasticity)
6. The random variables $u_i$ are statistically independent, thus $E(u_i u_j) = 0$ for $i \neq j$. This means that
the errors associated with any two observations are independent (reflects an absence of
serial correlation).
7. The error term is normally distributed (return to this later on)
8. Other assumptions
The X values must not all be the same: variation is necessary to use regression
analysis as a research tool.
The number of observations is greater than the number of parameters to be
estimated. Otherwise the parameters could not be estimated: from a single
observation there is no way to obtain two parameters (2 unknowns and one
equation).
The regression model is correctly specified. That is, there is no specification
bias.
