
02 Advance

The document discusses statistical methods for testing economic hypotheses, including:
- linear and non-linear hypothesis testing using techniques such as least squares estimation, confidence intervals, and the delta method;
- multiple hypothesis testing and the assumptions of the least squares estimator, such as exogeneity and full rank of the explanatory variables;
- statistical inference procedures involving test statistics, rejection regions, and Type I and Type II errors based on the significance level;
- examples of simple hypothesis testing using t-tests and p-values.


Testing economic hypotheses. Multiple hypothesis testing.
Linear and non-linear hypotheses. Confidence intervals. Delta method.

Jakub Mućk
SGH Warsaw School of Economics

Jakub Mućk Advanced Applied Econometrics Testing economic hypotheses 1 / 36


Least squares estimator

Multiple regression

The least squares estimator is applied to the model:

y = β0 + β1 x1 + β2 x2 + ... + βK xK + ε, (1)

where
- y is the dependent (outcome) variable;
- x1, x2, ..., xK is the set of independent variables;
- ε is the error term.

The dependent variable is explained by components that vary with the independent variables and by the error term.
β0 is the intercept.
β1, β2, ..., βK are the coefficients (slopes) on x1, x2, ..., xK.
β1, β2, ..., βK measure the effect of a change in x1, x2, ..., xK on the expected value of y (ceteris paribus).
Assumptions of the least squares estimators I

Assumption #1: true DGP (data generating process):

y = Xβ + ε. (2)

Assumption #2: the expected value of the error term is zero:

E(ε) = 0, (3)

which implies that E(y) = Xβ.

Assumption #3: spherical variance-covariance matrix of the error term:

var(ε) = E(εε′) = Iσ². (4)

In particular:
- the variance of the error term equals σ²:
  var(ε) = σ² = var(y); (5)
- the covariance between any pair εi and εj is zero:
  cov(εi, εj) = 0. (6)

Assumption #4: exogeneity. The independent variables are not random and therefore they are not correlated with the error term:

E(X′ε) = 0. (7)
Assumptions of the least squares estimators II

Assumption #5: the full rank of the matrix of explanatory variables (there is no so-called collinearity):

rank(X) = K + 1 ≤ N. (8)

Assumption #6 (optional): the normally distributed error term:

ε ∼ N(0, σ²). (9)
Gauss-Markov Theorem

Under assumptions A#1-A#5 of the multiple linear regression model, the least squares estimator β̂OLS has the smallest variance among all linear unbiased estimators of β.

β̂OLS is the Best Linear Unbiased Estimator (BLUE) of β.
The least squares estimator

The least squares estimator:

β̂OLS = (X′X)⁻¹ X′y. (10)

The variance of the least squares estimator:

Var(β̂OLS) = σ² (X′X)⁻¹. (11)

If the (optional) assumption about the normal distribution of the error term is satisfied, then

β̂OLS ∼ N(β, Var(β̂OLS)). (12)
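The closed-form expressions (10)-(11) can be sketched in a few lines of NumPy. The simulated data and all variable names below are illustrative assumptions, not from the slides:

```python
import numpy as np

# Minimal sketch of the OLS formulas on simulated data (illustrative values).
rng = np.random.default_rng(0)
N, K = 200, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, K))])  # intercept + K regressors
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=N)

# beta_hat = (X'X)^{-1} X'y  -- equation (10)
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# Var(beta_hat) = sigma^2 (X'X)^{-1} -- equation (11), with sigma^2
# replaced by its unbiased estimate from the residuals
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (N - K - 1)
var_beta_hat = sigma2_hat * XtX_inv
```

On simulated data the estimates land close to the true coefficients, and the diagonal of `var_beta_hat` gives the squared standard errors.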
Statistical inference

Statistical inference

Statistical inference is the process of using sample data to deduce properties of the underlying population.
Statistical inference consists of:
- estimation of the underlying parameters,
- testing hypotheses.
Hypotheses testing

Hypothesis testing compares a conjecture we have about a population with the information contained in a sample of data.
The hypotheses are formed about economic behavior.
In statistical inference, the hypotheses are then represented as statements about model parameters.
The general procedure involves:
1. A null hypothesis H0,
2. An alternative hypothesis H1,
3. A test statistic,
4. A rejection region,
5. A conclusion.
The test statistics and rejection region

Based on the value of a test statistic we decide either to reject the null
hypothesis or not to reject it.
The rejection region consists of values that are unlikely and that have low
probability of occurring when the null hypothesis is true.
The rejection region depends on:
- the distribution of the test statistic when the null is true,
- the alternative hypothesis,
- the level of significance.
Type I & Type II error and significance level

Type I error is a situation in which we reject the null hypothesis when it is true.
Type II error is a situation in which we do not reject the null hypothesis when it is false.
Significance level α:

P(Type I error) = α. (13)

α is usually arbitrarily chosen to be 0.01, 0.05 or 0.10.
Probability value (p-value)

Standard practice is to use the probability value (p-value). This is the smallest significance level at which the null hypothesis could be rejected.
Given the p-value, we do not have to compare the test statistic with the corresponding critical value.
If the p-value is lower than the significance level (α), then we are able to reject the null.
Testing simple hypotheses

t-test and possible alternative hypotheses

Based on the t statistic:

t = (β̂iLS − βi) / se(β̂iLS) ∼ tN−(K+1), (14)

we can consider the following alternative hypotheses:
1. H1: βi < c,
2. H1: βi ≠ c,
3. H1: βi > c.

Test of significance:
- The null H0: βi = 0,
- The alternative H1: βi ≠ 0,
- The t-test statistic:

  t = β̂iLS / se(β̂iLS) ∼ tN−(K+1), (15)

  which is the inverse of the relative standard error.
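As a sketch, the significance test in (15) and its p-value can be computed with SciPy. The coefficient estimate, standard error, and sample sizes below are made-up illustrative numbers:

```python
from scipy import stats

# Illustrative inputs (not from the slides)
beta_hat, se, N, K = 0.48, 0.15, 100, 3

t_stat = beta_hat / se          # eq. (15): inverse of the relative standard error
df = N - (K + 1)                # degrees of freedom of the t distribution

# Two-sided p-value: the smallest significance level at which H0: beta_i = 0
# can be rejected against H1: beta_i != 0
p_value = 2 * stats.t.sf(abs(t_stat), df)

reject_at_5pct = p_value < 0.05
```

Here t = 3.2 with 96 degrees of freedom, so the null of insignificance is rejected at any conventional level.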
One-tail test with alternative "greater than"

The null and alternative:

H0: βi = c
H1: βi > c

[Figure: density of the t statistic with the upper tail of mass α shaded; critical value tc = t(1−α, N−(K+1)).]

The null hypothesis can be rejected if t ≥ t(1−α, N−(K+1)).
One-tail test with alternative "less than"

The null and alternative:

H0: βi = c
H1: βi < c

[Figure: density of the t statistic with the lower tail of mass α shaded; critical value tc = t(α, N−(K+1)).]

The null hypothesis can be rejected if t ≤ t(α, N−(K+1)).
Two-tail test with alternative "not equal to"

The null and alternative:

H0: βi = c
H1: βi ≠ c

[Figure: density of the t statistic with mass α/2 shaded in each tail; critical values tc = t(α/2, N−(K+1)) and tc = t(1−α/2, N−(K+1)).]

The null hypothesis can be rejected if t ≤ t(α/2, N−(K+1)) or t ≥ t(1−α/2, N−(K+1)).
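The critical values that define the three rejection regions above come from the quantile function of the t distribution. The significance level and degrees of freedom below are illustrative:

```python
from scipy import stats

# Illustrative choices: alpha = 0.05, N - (K+1) = 30 degrees of freedom
alpha, df = 0.05, 30

t_right = stats.t.ppf(1 - alpha, df)   # H1: beta_i > c, reject if t >= t_right
t_left = stats.t.ppf(alpha, df)        # H1: beta_i < c, reject if t <= t_left
t_two = stats.t.ppf(1 - alpha / 2, df) # H1: beta_i != c, reject if |t| >= t_two
```

By symmetry of the t distribution the lower critical value is the negative of the upper one, and the two-tail critical value is larger than the one-tail value at the same α.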
Linear combination of parameters I

A linear combination of parameters:

λ = c1 β1 + c2 β2, (16)

where c1 and c2 are some constants.

Under assumptions #1-#5 (i.e., without normality of the error term) the least squares estimators β̂1LS and β̂2LS are the best linear unbiased estimators of β1 and β2.
Moreover, λ̂LS = c1 β̂1LS + c2 β̂2LS is also the BLUE of λ.
- The estimator λ̂LS is unbiased because:

  E(λ̂LS) = E(c1 β̂1LS) + E(c2 β̂2LS) = c1 E(β̂1LS) + c2 E(β̂2LS) = c1 β1 + c2 β2 = λ. (17)

The variance of the linear combination of the LS estimates:

var(λ̂) = var(c1 β̂1LS + c2 β̂2LS) (18)
        = c1² var(β̂1LS) + c2² var(β̂2LS) + 2 c1 c2 cov(β̂1LS, β̂2LS). (19)

Therefore we can estimate the variance of λ̂ by replacing the variances and covariance with their estimates:

v̂ar(λ̂) = c1² v̂ar(β̂1LS) + c2² v̂ar(β̂2LS) + 2 c1 c2 côv(β̂1LS, β̂2LS). (20)
Linear combination of parameters II

If the assumption of error-term normality holds, or if the sample is large, then λ̂ has a normal distribution:

λ̂ = c1 β̂1LS + c2 β̂2LS ∼ N(λ, var(λ̂)). (21)

The standard t-statistic for the linear combination is:

t = (λ̂ − λ) / √var(λ̂) = (λ̂ − λ) / se(λ̂) ∼ tN−(K+1). (22)

Based on the above formulation, a variety of hypotheses can be tested. The null is typically:

H0: λ = c1 β1 + c2 β2 = λ0, (23)

while the possible alternative hypotheses are:

H1: λ = c1 β1 + c2 β2 ≠ λ0,
H1: λ = c1 β1 + c2 β2 < λ0,
H1: λ = c1 β1 + c2 β2 > λ0.
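A tiny numerical sketch of the variance formula (20); the weights, variance, and covariance estimates below are illustrative assumptions:

```python
import numpy as np

# Illustrative inputs: weights of the linear combination and the estimated
# variances/covariance of the two coefficients (not from the slides).
c1, c2 = 1.0, 1.0
var_b1, var_b2, cov_b1b2 = 0.04, 0.09, -0.02

# Equation (20): var-hat(lambda-hat) with squared weights on the variances
var_lambda = c1**2 * var_b1 + c2**2 * var_b2 + 2 * c1 * c2 * cov_b1b2
se_lambda = np.sqrt(var_lambda)
```

Note that a negative covariance between the two estimators shrinks the variance of their sum, which is why the covariance term cannot be dropped.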
Confidence intervals

Point vs Interval estimation

A point estimate is a single value of the estimator (e.g., the sample mean).
Interval estimation provides a range of values in which the true parameter is likely to fall.
Interval estimation allows us to account for the precision with which the unknown parameter is estimated. The precision is typically measured with the variance.
Confidence intervals I

Under the assumption of normality of the error term, the least squares estimator β̂LS satisfies:

β̂LS ∼ N(β, Σ), (24)

where Σ is the variance-covariance matrix of the least squares estimator.

For illustrative purposes we focus on the slope parameter in the simple regression model (β̂1LS):

β̂1LS ∼ N(β1, σ² / Σᵢ(xi − x̄)²). (25)

A standardized normal random variable can be obtained from β̂1LS by subtracting its mean and dividing by its standard deviation:

Z = (β̂1LS − β1) / √(σ² / Σᵢ(xi − x̄)²) ∼ N(0, 1). (26)

Based on the features of the standard normal distribution:

P(−1.96 ≤ Z ≤ 1.96) = 0.95. (27)
Confidence intervals II

We can substitute for Z:

P(−1.96 ≤ (β̂1LS − β1) / √(σ² / Σᵢ(xi − x̄)²) ≤ 1.96) = 0.95, (28)

and after manipulations:

P(β̂1LS − 1.96 √(σ² / Σᵢ(xi − x̄)²) ≤ β1 ≤ β̂1LS + 1.96 √(σ² / Σᵢ(xi − x̄)²)) = 0.95. (29)

The two end-points β̂1LS ± 1.96 √(σ² / Σᵢ(xi − x̄)²) provide an interval estimator.
In repeated sampling, 95% of the intervals constructed this way will contain the true value of the parameter β1.
This easy derivation of an interval estimator relies on the assumption of normality of the error term and on knowing the variance of the error term σ².
Obtaining interval estimates

In simple regression, replacing σ² by its estimate σ̂² produces a random variable t:

t = (β̂1LS − β1) / √(σ̂² / Σᵢ(xi − x̄)²) = (β̂1LS − β1) / √v̂ar(β̂1LS) = (β̂1LS − β1) / se(β̂1LS). (30)

In the multiple regression model, the t ratio, i.e. t = (β̂jLS − βj)/se(β̂jLS), has a t-distribution with N − (K + 1) degrees of freedom:

t ∼ tN−(K+1), (31)

where K is the number of explanatory variables.

The critical value tc from a t distribution can be found as follows:

P(t ≥ tc) = P(t ≤ −tc) = α/2, (32)

where α is an arbitrarily chosen probability (significance level).

The confidence interval:

P(−tc ≤ t ≤ tc) = 1 − α, (33)

and after manipulations (using the definition of the t random variable):

P(β̂1LS − tc se(β̂1LS) ≤ β1 ≤ β̂1LS + tc se(β̂1LS)) = 1 − α. (34)
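The interval in (34) can be computed directly; the point estimate, standard error, and sample sizes below are illustrative assumptions:

```python
from scipy import stats

# Illustrative inputs (not from the slides)
beta_hat, se, N, K, alpha = 1.8, 0.25, 60, 2, 0.05

df = N - (K + 1)
t_c = stats.t.ppf(1 - alpha / 2, df)   # critical value from eq. (32)-(33)

# Equation (34): beta_hat +/- t_c * se(beta_hat)
ci_lower = beta_hat - t_c * se
ci_upper = beta_hat + t_c * se
```

With 57 degrees of freedom the critical value is close to 2, so the 95% interval is roughly the estimate plus or minus two standard errors.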
Testing joint hypotheses

Testing joint hypotheses

A null hypothesis with multiple conjectures, expressed with more than one
equal sign, is called a joint hypothesis.
[Example] Wages (w) and experience (exper):

w = β0 + β1 exper + β2 exper² + ε. (35)

- Are wages related to experience?
- To answer this question we should jointly test H0: β1 = 0 and H0: β2 = 0.
- The joint null is H0: β1 = β2 = 0.
- A test of H0 is a joint test of whether both conjectures hold simultaneously.
Restricted least squares estimator

The restricted least squares estimator is obtained by minimizing the sum of squared errors (SSE) subject to a set of restrictions on the unknown parameters, given the data:

SSE(β0, β1, ..., βK) = Σᵢ₌₁ᴺ [yi − β0 − β1 x1i − ... − βK xKi]²

subject to the restrictions.
Examples of restrictions:
- β1 = β2,
- β1 = 2.
Wald test I

The Wald test allows us to test a set of linear restrictions.
The F-statistic determines what constitutes a large or a small reduction in the sum of squared errors:

F = [(SSER − SSEU)/J] / [SSEU/(N − K)], (36)

where:
- J is the number of restrictions,
- N is the number of observations,
- K is the number of coefficients in the unrestricted model,
- SSER is the sum of squared errors in the restricted model,
- SSEU is the sum of squared errors in the unrestricted model.

If the null is true, then the F-statistic has an F-distribution with J numerator degrees of freedom and N − K denominator degrees of freedom.
If the null is false, the difference between the sums of squared errors in the restricted model (SSER) and the unrestricted model (SSEU) becomes large, and the null can be rejected.
- In other words, the imposed restrictions significantly reduce the ability of the model to fit the data.

The F-test can be used in many applications:
- Testing economic hypotheses.
Wald test II

- Testing the significance of the model.
- Excluding/including a set of explanatory variables.

Alternatively, the W statistic can be used, defined as

W = J · F, (37)

where W is χ²-distributed with J degrees of freedom.
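A sketch of the F-statistic (36) and the equivalent W statistic (37); the SSE values and dimensions below are illustrative assumptions:

```python
from scipy import stats

# Illustrative inputs: restricted/unrestricted SSE, J restrictions,
# N observations, K coefficients in the unrestricted model.
SSE_R, SSE_U = 120.0, 100.0
J, N, K = 2, 100, 5

# Equation (36)
F = ((SSE_R - SSE_U) / J) / (SSE_U / (N - K))
p_value = stats.f.sf(F, J, N - K)   # P(F(J, N-K) >= F) under the null

# Equation (37): chi-square form with J degrees of freedom
W = J * F
```

Here the restrictions raise the SSE by 20%, which yields F = 9.5 and a p-value far below conventional significance levels, so the restrictions would be rejected.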
Testing the significance of the model

Multiple regression model with K explanatory variables:

y = β0 + β1 x1 + ... + βK xK + ε. (38)

Test of the overall significance of the regression model. The null hypothesis:

H0: β1 = β2 = ... = βK = 0, (39)

while the alternative is that at least one coefficient is different from zero.
In this test the restricted model is:

y = β0 + ε, (40)

which implies that SSER = SST.

Thus, the F-statistic in the overall significance test can be written as:

F = [(SST − SSE)/K] / [SSE/(N − K − 1)]. (41)
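A sketch of the overall-significance F-statistic (41); SST, SSE, and the dimensions below are illustrative assumptions:

```python
from scipy import stats

# Illustrative inputs: total and residual sums of squares, sample size,
# number of explanatory variables.
SST, SSE = 500.0, 320.0
N, K = 120, 4

# Equation (41): restricted model y = beta_0 + eps, so SSE_R = SST
F = ((SST - SSE) / K) / (SSE / (N - K - 1))
p_value = stats.f.sf(F, K, N - K - 1)
```

With these numbers the regressors jointly explain a large share of the variation, so the null that all slopes are zero is rejected.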
Linear restrictions in matrix form

General notation:

R β = q, (42)

where R is the J × (K + 1) matrix describing the linear restrictions and q is the vector of constants, one for each restriction.

Example #1: test of the overall significance of the regression model:

R = [ 0 1 0 ... 0        q = [ 0
      0 0 1 ... 0              0
      ...                      ...
      0 0 0 ... 1 ]            0 ]

Example #2: the restrictions
1. β1 = β3,
2. β2 = ν,
3. β1 + β4 = γ,
can be described as

R = [ 0 1 0 −1 0         q = [ 0
      0 0 1  0 0               ν
      0 1 0  0 1 ]             γ ]
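Example #2 can be written down and checked numerically. Here ν and γ are set to arbitrary illustrative values, and `beta` is one parameter vector constructed to satisfy all three restrictions:

```python
import numpy as np

# Illustrative constants for the restrictions beta_2 = nu, beta_1 + beta_4 = gamma
nu, gamma = 2.0, 1.5

# R beta = q for a model with coefficients (beta_0, ..., beta_4), eq. (42)
R = np.array([[0.0, 1.0, 0.0, -1.0, 0.0],   # beta_1 - beta_3 = 0
              [0.0, 0.0, 1.0,  0.0, 0.0],   # beta_2 = nu
              [0.0, 1.0, 0.0,  0.0, 1.0]])  # beta_1 + beta_4 = gamma
q = np.array([0.0, nu, gamma])

# A parameter vector that satisfies all three restrictions
beta = np.array([0.7, 0.4, nu, 0.4, gamma - 0.4])
```

Multiplying `R @ beta` and comparing with `q` verifies that the matrix encoding matches the three restrictions as stated.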
t and F statistics

If a single restriction is considered, both the t and F statistics can be used.
The results will be identical.
This is due to an exact relationship between the t- and F-distributions: the square of a t random variable with df degrees of freedom is an F random variable with 1 degree of freedom in the numerator and df degrees of freedom in the denominator.
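This exact relationship can be verified numerically with SciPy's quantile functions; α and the degrees of freedom below are arbitrary illustrative choices:

```python
from scipy import stats

# Illustrative significance level and degrees of freedom
alpha, df = 0.05, 40

# Two-sided t critical value and the corresponding F(1, df) critical value
t_crit = stats.t.ppf(1 - alpha / 2, df)
f_crit = stats.f.ppf(1 - alpha, 1, df)
```

The squared two-sided t critical value coincides with the F(1, df) critical value at the same significance level, which is why a single-restriction t test and F test always agree.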
Non-sample information

In many cases we have information over and above the information contained in the sample observations.
This non-sample information can be taken from, e.g., economic theory.
[Example] Production function. Consider the regression of logged output (y) on logged capital (k) and logged labor input (l):

y = β0 + β1 k + β2 l + ε. (43)

The natural assumption to verify is constant returns to scale (CRS). In this case:

β1 + β2 = 1. (44)
Delta method

Delta method I

The delta method is a popular strategy for estimating the variance of a nonlinear function of the parameters.
Key assumption: g(β) is a nonlinear, continuously differentiable function of the parameters.
Taylor expansion around the true value of the parameters, i.e., β:

g(β̂) = g(β) + (∂g(β)/∂β)′ (β̂ − β) + o(||β̂ − β||), (45)

where

∂g(β)/∂β = (∂g/∂β1, ∂g/∂β2, ..., ∂g/∂βK)′. (46)

After manipulation:

g(β̂) − g(β) = (∂g(β)/∂β)′ (β̂ − β) + o(||β̂ − β||), (47)

and taking the variance:

var(g(β̂) − g(β)) = (∂g(β)/∂β)′ var(β̂) (∂g(β)/∂β). (48)
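A sketch of the delta method for an illustrative nonlinear function g(β) = β1/β2; the point estimates and covariance matrix below are made-up assumptions:

```python
import numpy as np

# Illustrative estimates of two coefficients and their covariance matrix
beta_hat = np.array([2.0, 4.0])
var_beta = np.array([[0.10, 0.02],
                     [0.02, 0.05]])

# Gradient of g(beta) = beta_1 / beta_2 at beta_hat, eq. (46):
# (1/beta_2, -beta_1/beta_2^2)
grad = np.array([1 / beta_hat[1], -beta_hat[0] / beta_hat[1] ** 2])

# Delta-method variance, eq. (48): grad' Var(beta_hat) grad
var_g = grad @ var_beta @ grad
se_g = np.sqrt(var_g)
```

The resulting `se_g` is the approximate standard error of the ratio estimate, which can then feed into the t-statistics and confidence intervals discussed earlier.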
