
Econometric Methods/ ECON0704/ July-December 2017

Sarmila Banerjee
Lecture II
K-Variable CLRM

• Matrix representation of the basic model and linear independence among regressors;

• Differentiation of linear and quadratic forms;

• Derivation of the OLS estimator β̂;

• Derivation of V(β̂) and its estimator s_u²;

• Characteristic roots, vectors, idempotent matrices;

• Gauss-Markov Theorem (BLUE property);

Specification of k-Variable CLRM

$$Y_i = \beta_1 + \beta_2 X_{2i} + \ldots + \beta_k X_{ki} + u_i, \quad i = 1, \ldots, n;$$

$$u_i\ \text{iid},\quad u_i \sim IN(0, \sigma_u^2)\ \forall i \;\Rightarrow\; \text{(i)}\ E(u_i) = 0\ \forall i,\quad \text{(ii)}\ V(u_i) = \sigma_u^2\ \forall i,\quad \text{(iii)}\ Cov(u_i, u_j) = 0\ \forall i \neq j;$$

$$X\ \text{non-stochastic} \;\Rightarrow\; cov(X, u) = 0;$$

$$\rho(X) = k < n.$$

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_{21} & \cdots & X_{k1} \\ 1 & X_{22} & \cdots & X_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & X_{2n} & \cdots & X_{kn} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$

i.e., Y = Xβ + U, where the intercept term is represented by a sum vector: [1, 1, …, 1]'.
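The following is a minimal numerical sketch, not part of the original lecture, of this data-generating process in matrix form; numpy is assumed, and the values chosen for n, k, β and σ_u are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 100, 4                            # illustrative sample size and number of regressors
beta = np.array([2.0, 0.5, -1.0, 3.0])   # assumed true coefficients (beta_1 is the intercept)
sigma_u = 1.5                            # assumed standard deviation of the disturbance

# Non-stochastic data matrix: the sum vector (a column of ones) plus (k - 1) regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

# Disturbances u_i ~ IN(0, sigma_u^2), so Y = X beta + U
U = rng.normal(scale=sigma_u, size=n)
Y = X @ beta + U
```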

$$E(U) = 0 \;\Leftrightarrow\; E(u_i) = 0\ \forall i;$$

with

$$Var(U) = E[\{U - E(U)\}\{U - E(U)\}']$$

This is the variance-covariance matrix of U, which has the V(u_i)'s along the diagonal and the Cov(u_i, u_j)'s, ∀ i ≠ j, along the off-diagonal:

$$Var(U) = E(UU') = E\begin{bmatrix} u_1^2 & u_1 u_2 & \cdots & u_1 u_n \\ u_1 u_2 & u_2^2 & \cdots & u_2 u_n \\ \vdots & & \ddots & \vdots \\ u_1 u_n & \cdots & \cdots & u_n^2 \end{bmatrix} = \sigma_u^2 I_n$$

• The data matrix [X] of order (n×k) being non-stochastic means that in repeated sampling the values of the X's are held constant;

• This assumption is sufficient to guarantee stochastic independence between X_j and u, for all j; i.e., cov(X_ji, u_i) = 0 for all i & j; or, cov(X, u) = 0;

• Since the regression equation has a linear structure, the included causes should not interact among themselves while producing the effect;

• This implies the columns of matrix X should be linearly independent: ρ(X) = k;

• However, for the model to be estimable an additional assumption is needed: [k < n].

• The number of unknowns in the k-variable model is k β_j's and n u_i's, i.e., (k + n) unknowns against n equations, where the u_i's are stochastic.

• A simple algebraic solution is therefore not possible;

• Statistical solutions are searched for;

• Since u_i ~ IN(0, σ_u²) ∀ i, only (k + 1) parameters are really unknown in the system, viz., [(β_1, β_2, …, β_k) and σ_u²];

• The estimators are obtained by minimizing the RSS (residual sum of squares), and k F.O.C.s (first-order conditions) are obtained as:

$$\frac{\partial (e'e)}{\partial \hat{\beta}} = 0, \quad \text{where RSS} = e'e \ \text{and}\ e = (Y - \hat{Y}) = (Y - X\hat{\beta});$$

• These F.O.C.s give the constraints that the optimal value of RSS has to satisfy.

• So, in the presence of k parameters, and therefore k F.O.C.s, out of the n e_i's only (n − k) can be chosen freely.

• Thus, (n − k) stands for the degrees of freedom of the system and indicates the statistical power of the inferences drawn about the unknown parameters.

• So, the requirement in compact form turns out to be: ρ(X) = k < n; a quick numerical check of this rank condition is sketched below.
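As a quick sketch of the rank condition (assuming numpy; n = 100 and k = 4 are the same illustrative values as above), ρ(X) = k < n can be checked numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

print(np.linalg.matrix_rank(X) == k and k < n)   # True: columns of X are linearly independent

# If one column were an exact linear combination of the others,
# the rank would not increase and (X'X) would become singular:
X_bad = np.column_stack([X, X[:, 1] + 2 * X[:, 2]])
print(np.linalg.matrix_rank(X_bad))              # still k, not k + 1
```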

OLS Estimation

$$e'e = (Y - X\hat{\beta})'(Y - X\hat{\beta})$$
$$= Y'Y - Y'X\hat{\beta} - \hat{\beta}'X'Y + \hat{\beta}'(X'X)\hat{\beta}$$
$$= Y'Y - 2Y'X\hat{\beta} + \hat{\beta}'(X'X)\hat{\beta}$$

Since Y'Xβ̂ and β̂'X'Y are both scalars and one is the transpose of the other, they are equal;

Y'Xβ̂ is a linear function of β̂, and

β̂'(X'X)β̂ is a quadratic function of β̂;

Linear & Quadratic Forms

Differentiation of linear form:

$$\sum_{i=1}^{n} c_i x_i = c'x$$

is a linear form in x, where x is (n×1) and c is (n×1);

$$\frac{\partial (c'x)}{\partial x} = \begin{bmatrix} \partial (c'x)/\partial x_1 \\ \vdots \\ \partial (c'x)/\partial x_n \end{bmatrix} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = c; \quad \text{so,}\quad \frac{\partial (Y'X\hat{\beta})}{\partial \hat{\beta}} = X'Y$$

Differentiation of Quadratic Form:

Consider the form x'Ax, where x is (n×1) and A is (n×n), with A = A' (i.e., square and symmetric);

For n=2, the expression can be expanded as:

$$x'Ax = (x_1\ \ x_2)\begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = a_{11}x_1^2 + 2a_{12}x_1 x_2 + a_{22}x_2^2$$

$$\frac{\partial (x'Ax)}{\partial x} = \begin{bmatrix} \partial (x'Ax)/\partial x_1 \\ \partial (x'Ax)/\partial x_2 \end{bmatrix} = \begin{bmatrix} 2a_{11}x_1 + 2a_{12}x_2 \\ 2a_{12}x_1 + 2a_{22}x_2 \end{bmatrix} = 2Ax$$
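A small numerical sketch (assuming numpy; the symmetric matrix A and the point x below are arbitrary illustrative values) confirming the rule ∂(x'Ax)/∂x = 2Ax by a central finite-difference approximation:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 3.0]])        # square and symmetric: A = A'
x = np.array([0.7, -1.2])         # arbitrary evaluation point

def quad(v):
    """The quadratic form v'Av."""
    return v @ A @ v

eps = 1e-6
grad_fd = np.array([(quad(x + eps * e) - quad(x - eps * e)) / (2 * eps)
                    for e in np.eye(2)])   # numerical gradient, one coordinate at a time

print(grad_fd)     # approximately [2a11*x1 + 2a12*x2, 2a12*x1 + 2a22*x2]
print(2 * A @ x)   # analytical gradient 2Ax, matching to numerical precision
```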

$$\therefore \frac{\partial [\hat{\beta}'(X'X)\hat{\beta}]}{\partial \hat{\beta}} = 2(X'X)\hat{\beta}$$

$$\therefore \frac{\partial (e'e)}{\partial \hat{\beta}} = -2X'Y + 2(X'X)\hat{\beta} = 0$$

$$\therefore (X'X)\hat{\beta} = X'Y;$$

(X'X) is a (k×k) square, symmetric matrix and, in general, ρ(X'X) ≤ min[ρ(X), ρ(X')]; in fact, ρ(X'X) = ρ(X).

Again, ρ(X) = ρ(X') = k.

So, ρ(X'X) = k, which implies (X'X) is non-singular;

Therefore, |X'X| ≠ 0 and (X'X)^-1 exists;

$$\hat{\beta}_{OLS} = (X'X)^{-1}X'Y$$
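A minimal sketch (assuming numpy and the simulated X and Y from the earlier illustration) of computing β̂_OLS; solving the normal equations (X'X)β̂ = X'Y directly is numerically preferable to forming (X'X)^-1 explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 4
beta = np.array([2.0, 0.5, -1.0, 3.0])          # assumed true coefficients
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ beta + rng.normal(scale=1.5, size=n)

# Solve the normal equations (X'X) beta_hat = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against numpy's least-squares routine
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))        # True
print(beta_hat)                                 # close to the assumed true beta
```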

Properties of β̂_OLS:

• Linear in Y; since Y is a linear function of U, so will be β̂, and the marginal distribution of each β̂_j would be normal;

• β̂ is an unbiased estimator of β: E(β̂) = β;

$$\hat{\beta}_{OLS} = (X'X)^{-1}X'Y = (X'X)^{-1}X'(X\beta + U) = \beta + (X'X)^{-1}X'U$$

$$E(\hat{\beta}) = \beta + [(X'X)^{-1}X'E(U)] = \beta, \quad \because\ E(U) = 0;$$

• Variance-covariance matrix of β̂:

$$Var(\hat{\beta}) = E[\{\hat{\beta} - E(\hat{\beta})\}\{\hat{\beta} - E(\hat{\beta})\}'] = E[(X'X)^{-1}X'UU'X(X'X)^{-1}]$$
$$= (X'X)^{-1}X'E(UU')X(X'X)^{-1} = \sigma_u^2\,(X'X)^{-1}$$
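A sketch of a small Monte Carlo check (assuming numpy; the design, true β and σ_u are illustrative) that, with X held fixed in repeated sampling, the OLS estimator is unbiased and its sampling variance-covariance matrix is approximately σ_u²(X'X)^-1:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, sigma_u = 100, 4, 1.5
beta = np.array([2.0, 0.5, -1.0, 3.0])
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])   # held fixed across replications

reps = 20000
estimates = np.empty((reps, k))
XtX_inv_Xt = np.linalg.inv(X.T @ X) @ X.T
for r in range(reps):
    Y = X @ beta + rng.normal(scale=sigma_u, size=n)   # fresh disturbances each replication
    estimates[r] = XtX_inv_Xt @ Y                      # OLS estimate for this sample

print(estimates.mean(axis=0))                  # approximately beta: unbiasedness
print(np.cov(estimates, rowvar=False))         # approximately sigma_u^2 (X'X)^{-1}
print(sigma_u**2 * np.linalg.inv(X.T @ X))
```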

• β̂ is the minimum variance estimator in the class of all linear and unbiased estimators of β; i.e., among the linear unbiased estimators of β, the OLS estimator has the minimum variance.

• Minimum Variance Property (Best): a direct comparison of the variance-covariance matrices would be inappropriate, as they contain information not only about the variances but about the covariances as well, and the claim is placed only against the variances.

Way out:

• Consider a linear combination of the elements of the β-vector: b = c'β, where c is a (k×1) vector of constant coefficients;

• Here b is a scalar, so for b a direct comparison of variances is possible;

• Again, the coefficient vector c can be specified in different ways to represent individual elements of the β-vector;

• Suppose c' = [0, 0, …, 1, 0, …, 0], with 1 in the jth position. Then b = c'β = β_j and c'β̂ = β̂_j;

• The OLS estimator of b is

$$\hat{b} = c'\hat{\beta} = c'(X'X)^{-1}X'Y = c'(X'X)^{-1}X'(X\beta + U) = c'\beta + c'(X'X)^{-1}X'U$$

• E(b̂) = c'β = b, as E(U) = 0.

$$V(\hat{b}) = E[\{\hat{b} - E(\hat{b})\}\{\hat{b} - E(\hat{b})\}'] = E[c'(X'X)^{-1}X'UU'X(X'X)^{-1}c] = \sigma_u^2\, c'(X'X)^{-1}c$$
' '

• Propose another linear estimator of b, say b*, defined as: b* = a'Y, where a is an (n×1) vector of constant coefficients;

• b* = a'(Xβ + U) = a'Xβ + a'U

• E(b*) = c'β = b, iff a'X = c';

$$V(b^*) = E[\{b^* - c'\beta\}\{b^* - c'\beta\}'] = E[a'UU'a] = \sigma_u^2\, a'a$$

$$V(b^*) - V(\hat{b}) = \sigma_u^2\,[a'a - a'X(X'X)^{-1}X'a] = \sigma_u^2\, a'[\,I_n - X(X'X)^{-1}X'\,]\,a = \sigma_u^2\, a'[\,I_n - N\,]\,a = \sigma_u^2\, a'Ma,$$

where N = X(X'X)^{-1}X' and M = I_n − N;

• Here both N and M are idempotent matrices; a numerical check follows the definition below;

• [a'Ma] is a quadratic form in a;

Idempotent Matrix: An idempotent matrix A of order (n×n) is a square, symmetric matrix which reproduces itself after being multiplied by itself; i.e., A = A² = … = A^m, where m is any positive integer;
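A minimal numerical sketch (assuming numpy and the same illustrative X as before) verifying that N = X(X'X)^-1X' and M = I_n − N are symmetric and idempotent, and that a'Ma is non-negative for an arbitrary a:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

N = X @ np.linalg.inv(X.T @ X) @ X.T      # projection ("hat") matrix
M = np.eye(n) - N                         # residual-maker matrix

print(np.allclose(N @ N, N), np.allclose(M @ M, M))   # True True: both idempotent
print(np.allclose(M, M.T))                            # True: M is symmetric

a = rng.normal(size=n)                    # arbitrary coefficient vector
print(a @ M @ a >= 0)                     # True: the quadratic form a'Ma is non-negative
```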

Characteristic Roots & Vectors

Definition:

• Characteristic roots (latent roots / eigenvalues) and characteristic vectors (latent vectors / eigenvectors) are usually introduced in the context of the linear algebra of matrices.

• In linear algebra, a characteristic vector of a linear transformation is a non-zero vector (X) whose direction does not change when that linear transformation is applied to it.

• The transformation is represented as: AX = λX;

• Matrix A acts by stretching the vector X, not changing its direction, so X is a characteristic vector of A with λ as the associated characteristic root.

• So, an eigenvector is a nonzero vector X which is mapped by the linear transformation A into a vector λX, a scalar multiple of itself. That is, it is a vector X such that AX = λX, where λ is a scalar called an eigenvalue.

• If A is an n-dimensional matrix, then the equation AX = λX is a system of n linear homogeneous equations: (AX − λX) = 0; or, equivalently, (A − λI)X = 0;

• It can produce a non-trivial solution for X if and only if the matrix (A − λI) is singular, i.e., |A − λI| = 0; this is known as the characteristic equation;

• This is an n-degree polynomial in λ, producing λ1, λ2, …, λn as the n characteristic roots of A, with n associated characteristic vectors X1, X2, …, Xn;

• For a given root, all the associated characteristic vectors are scalar multiples of one another and, therefore, parallel and collinear; vectors associated with distinct roots, on the other hand, are linearly independent of each other.

Example: Consider the 2×2 real, square, symmetric matrix

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{bmatrix}$$

$$\therefore |A - \lambda I| = 0 \;\Rightarrow\; \begin{vmatrix} a_{11} - \lambda & a_{12} \\ a_{12} & a_{22} - \lambda \end{vmatrix} = 0 \;\Rightarrow\; \lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{12}^2) = 0;$$

$$\lambda = \frac{(a_{11} + a_{22}) \pm \sqrt{(a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}^2)}}{2}; \quad \text{where}$$

$$\sqrt{(a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}^2)} = \sqrt{(a_{11} - a_{22})^2 + (2a_{12})^2} \geq 0;$$

• This implies that the roots λ1 and λ2 are real;

• Suppose a11 = a22 = 1 and a12 = 2; then λ1 = 3 and λ2 = −1. Solving the characteristic function AX = λX for λ = 3, we get x11 = x12;

• Similarly, the solution corresponding to λ = −1 gives x21 = −x22;

• i.e., the solution comes in the form of a ratio and is solvable only up to a scalar multiple.

• To get an exact solution the system needs to be closed by adding another equation involving the same variables.

• We normalize the characteristic vectors by setting their norms equal to 1; in a 2-dimensional plane this produces the equation of the unit circle, [(x_i1)² + (x_i2)² = 1], i = 1, 2;

9
[] [ ]
1 1
X 1= √ ∧ X 2= √
2 2
1 1

√2 √2
Using that, ; where

‖X ‖=X X =1∀i=1,2∧( X X =0∀i, j ) ;


i i′ i i′ j

• Construct a matrix P by taking the characteristic vectors as columns;

• This P is orthogonal, i.e., P'P = I so that P' = P^-1 (in this example P also happens to be symmetric, P = P');

Diagonalization of Matrix A:

• Let (AXi = λiXi) be the characteristic equation corresponding to λi;

• Then, Xj'AXi = Xj'λiXi = λiXj'Xi;

• Or, [X1 X2 … Xn]'A[X1 X2 … Xn] = diag[λ1 λ2 … λn]

• Or, P'AP = Λ, or, A = PΛP' → the matrix A can be completely characterized in terms of its eigenvalues and the associated eigenvectors.

• Hence, the matrices A and Λ are similar matrices and the transformation is called a similarity transformation.
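A minimal numerical sketch (assuming numpy) of the worked 2×2 example: for a11 = a22 = 1 and a12 = 2 the roots are 3 and −1, the normalized vectors are proportional to (1, 1)'/√2 and (1, −1)'/√2, and P'AP reproduces the diagonal matrix Λ:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])               # a11 = a22 = 1, a12 = 2

# eigh is designed for real symmetric matrices; it returns normalized eigenvectors
lam, P = np.linalg.eigh(A)               # columns of P are the characteristic vectors
print(lam)                               # [-1.  3.]
print(P)                                 # columns proportional (up to sign) to (1,-1)'/sqrt(2) and (1,1)'/sqrt(2)

print(np.allclose(P.T @ P, np.eye(2)))          # True: P is orthogonal, P' = P^{-1}
print(np.allclose(P.T @ A @ P, np.diag(lam)))   # True: P'AP = diag(lambda_1, lambda_2)
print(np.allclose(P @ np.diag(lam) @ P.T, A))   # True: A = P Lambda P'
```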

Definiteness of Matrices:

• Consider a quadratic form X'AX; if all characteristic roots of A are positive, then irrespective of the value of X (≠ 0), X'AX > 0 and A is called positive definite;

• Similarly, negative definiteness and positive semi-definiteness can be defined for the quadratic form X'AX according as the λi's are all negative or all non-negative.

• Consider an idempotent matrix M: M = M²;

• MX = λX → M²X = λMX = λ²X; since M² = M, hence λ = λ², or λ = 1 or 0 (so M is positive semi-definite).

$$Var(b^*) - Var(\hat{b}) = \sigma_u^2\, a'Ma \geq 0$$

$$\Rightarrow Var(b^*) \geq Var(\hat{b});$$

• Hence, the BLUE property;
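A sketch of the Gauss-Markov comparison (assuming numpy; the choice c = [0, 1, 0, 0]' targeting β_2 and the competing weight vector a_alt are arbitrary illustrative constructions satisfying a'X = c'):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, sigma_u = 100, 4, 1.5
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

c = np.zeros(k); c[1] = 1.0                 # target b = c'beta = beta_2
a_ols = X @ np.linalg.inv(X.T @ X) @ c      # OLS weights: b_hat = a_ols' Y

# Any a with a'X = c' yields an unbiased b* = a'Y; adding a component lying in the
# column space of M = I - X(X'X)^{-1}X' leaves a'X unchanged and creates a competitor.
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
a_alt = a_ols + M @ rng.normal(size=n)

print(np.allclose(a_ols @ X, c), np.allclose(a_alt @ X, c))   # True True: both unbiased
print(sigma_u**2 * a_ols @ a_ols)    # V(b_hat) = sigma_u^2 c'(X'X)^{-1} c
print(sigma_u**2 * a_alt @ a_alt)    # V(b*) is larger: the BLUE property in action
```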

Unbiased estimator of V(u_i) = σ_u²:

• e_i helps to form an idea about u_i;

• Try V(e_i) = [RSS/n] as a candidate estimator of σ_u²;

• RSS = e'e;
$$e = (Y - \hat{Y}) = (Y - X\hat{\beta}) = Y - X(X'X)^{-1}X'Y$$
$$= X\beta + U - X\beta - X(X'X)^{-1}X'U$$
$$= [\,I_n - X(X'X)^{-1}X'\,]\,U = MU$$

$$e'e = U'M'MU = U'MU; \quad \because\ M' = M \ \text{and}\ M^2 = M;$$



• E(e'e) = E(U'MU);

• For n = 2,

$$E(U'MU) = E\left[(u_1\ \ u_2)\begin{bmatrix} m_{11} & m_{12} \\ m_{12} & m_{22} \end{bmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}\right] = m_{11}E(u_1^2) + m_{22}E(u_2^2) + 2m_{12}E(u_1 u_2)$$
$$= (m_{11} + m_{22})\,\sigma_u^2 = tr(M)\,\sigma_u^2;$$

Now,

$$tr(M) = tr[\,I_n - X(X'X)^{-1}X'\,] = tr(I_n) - tr[\,X'X(X'X)^{-1}\,] = tr(I_n) - tr(I_k) = (n - k),$$

using [tr(A ± B) = tr(A) ± tr(B)] and [tr(ABC) = tr(CAB) = tr(BCA)].

$$\therefore E(e'e) = E(U'MU) = \sigma_u^2 (n - k)$$

$$\Rightarrow E\!\left(\frac{e'e}{n - k}\right) = \sigma_u^2; \quad \text{or,}\quad s_u^2 = \frac{RSS}{n - k};$$

s_u² is an unbiased estimator of σ_u²;
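A sketch of a small Monte Carlo check (assuming numpy and an illustrative design with a deliberately small n so the bias is visible) that RSS/(n − k) is approximately unbiased for σ_u², while the naive RSS/n is biased downward:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma_u = 40, 4, 1.5
beta = np.array([2.0, 0.5, -1.0, 3.0])
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

reps = 20000
s2_unbiased, s2_naive = np.empty(reps), np.empty(reps)
for r in range(reps):
    Y = X @ beta + rng.normal(scale=sigma_u, size=n)
    e = Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)   # OLS residuals
    rss = e @ e
    s2_unbiased[r] = rss / (n - k)
    s2_naive[r] = rss / n

print(sigma_u**2)            # 2.25
print(s2_unbiased.mean())    # approximately 2.25
print(s2_naive.mean())       # approximately 2.25 * (n - k) / n, i.e. biased downward
```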

Note:
Since tr(M) = (n − k) and the characteristic roots of M are either 1 or 0, ρ(M) = (n − k); i.e., only (n − k) columns of M are linearly independent and the remaining k are not. So, the system has (n − k) degrees of freedom.
