2 - Time Series Regression (pt.1)

The document discusses time series regression models. It explains that time series data differs from cross-sectional data in that time series observations are dependent on each other due to their ordering over time. Static regression models relate a dependent variable to independent variables measured in the same time period. Finite distributed lag models allow independent variables to affect the dependent variable with a lag. The coefficients in these models can be interpreted as the impact or change in the dependent variable from temporary or permanent changes in the independent variables.


Time Series Regression
Part I
Econometrics II

Bruno Damásio
bdamasio@novaims.unl.pt
@bmpdamasio

Carolina Vasconcelos
cvasconcelos@novaims.unl.pt
@vasconceloscm

2022/2023

Nova Information Management School (NOVA IMS)
NOVA University of Lisbon
Instituto Superior de Estatística e Gestão da Informação
Universidade Nova de Lisboa
Table of contents

1. Basic Regression Analysis with Time Series Data
• The Nature of Time Series Data
• Time Series Regression Models
• Finite Sample Properties of OLS
• Functional Form and Dummy Variables
• Trends and Seasonality

2. Further Issues in Time Series
• Stationary and Weakly Dependent Time Series
• Asymptotic Properties of OLS
• Using Highly Persistent Time Series in Regression Analysis
• Dynamically Complete Models
Basic Regression Analysis with Time Series Data
Cross Sectional Data versus Time Series Data

• Cross Sectional Data
1. The sample xi, i = 1, . . . , n is collected from independent experiments and assumed to be a random sample.
2. The order of the sample xi, i = 1, . . . , n is irrelevant.
3. The observed sample xi, i = 1, . . . , n is a realization (only one possible outcome) from the population.

• Time Series Data
1. The dependence among observations of the time series xt, t = 1, . . . , T is prominent.
2. The order of the time series xt, t = 1, . . . , T is of considerable importance.
3. The actual observed time series xt, t = 1, . . . , T is a realization (only one possible outcome) of the underlying stochastic process {xt : t = 1, 2, . . . } (or simply xt).
Cross Sectional Data versus Time Series Data

• A sequence of random variables indexed by time is called a stochastic process.
• When we collect a time series data set, we obtain one possible outcome, or realization, of the stochastic process.
• The set of all possible realizations of a time series process plays the role of the population in cross-sectional analysis.
• The sample size for a time series data set is the number of time periods over which we observe the variables of interest.
Static Models

• Suppose that we have time series data available on two variables, say y and z, where yt and zt are dated contemporaneously.
• A static model relating y to z is

yt = β0 + β1 zt + ut , t = 1, 2, . . . , n (1)

• The name “static model” comes from the fact that we are modeling a contemporaneous relationship between y and z.
• Usually, a static model is postulated when a change in z at time t is believed to have an immediate effect on y: ∆yt = β1 ∆zt , when ∆ut = 0.
Static Models

Example
Let mrdrtet denote the murders per 10,000 people in a particular city during year t, let convrtet denote the murder conviction rate, let unemt be the local unemployment rate, and let yngmalet be the fraction of the population consisting of males between the ages of 18 and 25. Then, a static multiple regression model explaining murder rates is

mrdrtet = β0 + β1 convrtet + β2 unemt + β3 yngmalet + ut (2)

Using a model such as this, we can estimate the ceteris paribus effect of an increase in the conviction rate on a particular criminal activity.
Finite Distributed Lag Models

In a finite distributed lag (FDL) model, we allow one or more variables to affect y with a lag.

Example
For annual observations, consider the model

gfrt = α0 + δ0 pet + δ1 pet−1 + δ2 pet−2 + ut (3)

where gfrt is the general fertility rate (children born per 1,000 women of childbearing age) and pet is the real dollar value of the personal tax exemption.
The goal is to assess whether, in the aggregate, the decision to have children is linked to the tax value of having a child.
Equation (3) recognizes that, for both biological and behavioral reasons, decisions to have children would not immediately result from changes in the personal exemption.
Finite Distributed Lag Models

• Equation (3) is an example of the model

yt = α0 + δ0 zt + δ1 zt−1 + δ2 zt−2 + ut (4)

which is an FDL of order two.

• How do we interpret the coefficients in equation (4)? We can consider two types of increase in z:
• a temporary increase
• a permanent increase.
Finite Distributed Lag Models

• To interpret the coefficients in equation (4) under a temporary increase, suppose that z is a constant, equal to c, in all time periods before time t. At time t, z increases by one unit to c + 1 and then reverts to its previous level at time t + 1:

. . . , zt−2 = c, zt−1 = c, zt = c + 1, zt+1 = c, zt+2 = c, . . .

• That is, the increase in z is temporary.
Finite Distributed Lag Models

• To focus on the ceteris paribus effect of z on y, we set the error term in each time period to zero. Then,

yt−1 = α0 + δ0 c + δ1 c + δ2 c
yt = α0 + δ0 (c + 1) + δ1 c + δ2 c
yt+1 = α0 + δ0 c + δ1 (c + 1) + δ2 c
yt+2 = α0 + δ0 c + δ1 c + δ2 (c + 1)
yt+3 = α0 + δ0 c + δ1 c + δ2 c,
Finite Distributed Lag Models

Considering the previous equations, we have:

• yt − yt−1 = δ0 , which is the immediate change in y due to the one-unit increase in z at time t. This is called the impact propensity or impact multiplier.
• yt+1 − yt−1 = δ1 , which is the change in y one period after the temporary change.
• yt+2 − yt−1 = δ2 , which is the change in y two periods after the temporary change.
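The dynamic responses above are easy to check numerically. The sketch below uses illustrative coefficient values for α0, δ0, δ1, δ2 and a pre-shock level c (all assumed, not estimated from any data): it simulates a temporary one-unit increase in z at a single period and recovers each δj as the deviation of y from its baseline.

```r
# Illustrative FDL(2) coefficients (assumed values, not estimates)
a0 <- 1; d <- c(0.5, 0.3, 0.1)    # d = (delta0, delta1, delta2)
c0 <- 2                           # pre-shock level of z

# z is constant at c0, except for a one-unit temporary increase at t = 10
z <- rep(c0, 20); z[10] <- c0 + 1

# y_t = a0 + d0*z_t + d1*z_{t-1} + d2*z_{t-2}, with the errors set to zero
y <- sapply(3:20, function(t) a0 + d[1]*z[t] + d[2]*z[t-1] + d[3]*z[t-2])

base <- a0 + sum(d) * c0          # baseline value of y when z = c0 throughout
# Deviations from baseline at times t, t+1, t+2 recover delta0, delta1, delta2
y[8:10] - base                    # 0.5 0.3 0.1
```

After t + 2 the deviation is zero again: with an FDL of order two, a temporary shock has no effect beyond two periods.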
Finite Distributed Lag Models

• For a permanent increase in z, we set z equal to c before time t. At time t, z increases permanently to c + 1. Again, setting the errors to zero:

yt−1 = α0 + δ0 c + δ1 c + δ2 c
yt = α0 + δ0 (c + 1) + δ1 c + δ2 c
yt+1 = α0 + δ0 (c + 1) + δ1 (c + 1) + δ2 c
yt+2 = α0 + δ0 (c + 1) + δ1 (c + 1) + δ2 (c + 1),
Finite Distributed Lag Models

Considering the previous equations, we have:

• yt − yt−1 = δ0 , which is the immediate change in y due to the permanent increase in z at time t.
• yt+1 − yt−1 = δ0 + δ1 , which is the change in y one period after the permanent change.
• yt+2 − yt−1 = δ0 + δ1 + δ2 , which is the change in y two periods after the permanent change. After two periods, there are no further changes in y.

The sum of the coefficients on current and lagged z, δ0 + δ1 + δ2 , is the long-run change in y given a permanent increase in z. This is called the long-run propensity (LRP) or long-run multiplier.
The LRP is the cumulative effect after all changes have taken place; it is simply the sum of all the coefficients on the zt−j .
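A quick numerical check of the LRP, again with illustrative (assumed) coefficients: after a permanent one-unit increase in z, the cumulative change in y settles at δ0 + δ1 + δ2.

```r
# Illustrative FDL(2) coefficients (assumed, not estimated)
a0 <- 1; d <- c(0.5, 0.3, 0.1); c0 <- 2

# Permanent one-unit increase in z starting at t = 10
z <- c(rep(c0, 9), rep(c0 + 1, 11))

# y_t with errors set to zero, as in the slides
y <- sapply(3:20, function(t) a0 + d[1]*z[t] + d[2]*z[t-1] + d[3]*z[t-2])
base <- a0 + sum(d) * c0

# Once all lags have adjusted, the change in y equals the LRP = sum of deltas
y[length(y)] - base               # 0.9 = 0.5 + 0.3 + 0.1
```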
Time Series Regression Assumptions

TS1: Linear in Parameters
The stochastic process {(xt1 , xt2 , . . . , xtk , yt ) : t = 1, 2, . . . , n} follows the linear model

yt = β0 + β1 xt1 + · · · + βk xtk + ut , (5)

where {ut : t = 1, 2, . . . , n} is the sequence of errors or disturbances and n is the number of observations (time periods).

TS2: No Perfect Collinearity
In the sample (and therefore in the underlying time series process), no independent variable is constant nor a perfect linear combination of the others.
Time Series Regression Assumptions

TS3: Zero Conditional Mean
For each t, the expected value of the error ut , given the explanatory variables for all time periods, is zero. Mathematically,

E[ut |X] = 0, t = 1, 2, . . . , n (6)

TS4: Homoskedasticity
The error ut has the same variance given any value of the explanatory variables. In other words,

Var(ut | X) = σ² , t = 1, 2, . . . , n (7)
Time Series Regression Assumptions

TS5: No Serial Correlation
Conditional on X, the errors in two different time periods are uncorrelated: Corr(ut , us ) = 0, for all t ≠ s.
Unbiasedness of OLS

Theorem 1
Unbiasedness of OLS
Under Assumptions TS1 to TS3 we have:

E[β̂j | X] = βj (8)
Sampling Variances of the OLS Estimators

Theorem 2
Variances of OLS estimators
Under assumptions TS1-TS5 the variance-covariance matrix of the OLS estimators is:

Var(β̂ | X) = σ²(X′X)⁻¹

Theorem 3
Gauss-Markov Theorem
Under assumptions TS1-TS5, the OLS estimators are BLUE (Best Linear Unbiased Estimator), conditional on X.
Inference under the Classical Linear Model Assumptions

TS6: Normality
The errors ut are independent of X and are independently and identically distributed as Normal(0, σ²).
Inference under the Classical Linear Model Assumptions

Theorem 4
Normal Sampling Distributions
Under Assumptions TS1 through TS6, the CLM assumptions for time series, the OLS estimators are normally distributed, conditional on X. Further, under the null hypothesis, each t statistic has a t distribution, and each F statistic has an F distribution. The usual construction of confidence intervals is also valid.
Functional Form

All of the functional forms can be used in time series regressions. The most important of these is the natural logarithm: time series regressions with constant percentage effects appear often in applied work.
Functional Form

Example
We can use logarithmic functional forms in distributed lag models. For example, for quarterly data, suppose that money demand (Mt ) and gross domestic product (GDPt ) are related by

log(Mt ) = α0 + δ0 log(GDPt ) + δ1 log(GDPt−1 ) + δ2 log(GDPt−2 ) + δ3 log(GDPt−3 ) + δ4 log(GDPt−4 ) + ut (9)

The impact propensity in this equation, δ0 , is also called the short-run elasticity: it measures the immediate percentage change in money demand given a 1% increase in GDP.
The LRP, δ0 + δ1 + δ2 + δ3 + δ4 , is sometimes called the long-run elasticity: it measures the percentage increase in money demand after four quarters given a permanent 1% increase in GDP.
Dummy Variables

• Binary or dummy independent variables are also quite useful in time series applications.
• Since the unit of observation is time, a dummy variable represents whether, in each time period, a certain event has occurred.

Examples
• For annual data, we can indicate in each year whether a Democrat or a Republican is president of the United States by defining a variable democt , which is unity if the president is a Democrat, and zero otherwise.
• Looking at the effects of capital punishment on murder rates in Texas, we can define a dummy variable for each year equal to one if Texas had capital punishment during that year, and zero otherwise.
Dummy Variables

Suppose we want to explain the general fertility rate (gfr), i.e., the number of children born to every 1,000 women of childbearing age. For the years 1913 through 1984, we have obtained the following estimation:

ĝfrt = 98.86 + 0.083 pet − 24.24 ww2t − 31.59 pillt

where pe is the personal tax exemption, ww2 is a binary variable which takes on the value unity during the years 1941 through 1945, when the US was involved in World War II, and pill is unity from 1963 onward, when the birth control pill was made available for contraception.

• During World War II, there were about 24 fewer births for every 1,000 women of childbearing age, holding other factors fixed.
• After the introduction of the birth control pill, there were about 31 fewer births for every 1,000 women of childbearing age, holding other factors fixed.
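A regression along these lines can be reproduced with the fertil3 data shipped in the wooldridge R package used elsewhere in these slides (assuming the package is installed; the variable names gfr, pe, ww2 and pill follow Wooldridge's FERTIL3 file):

```r
library(wooldridge)

# Static regression of the fertility rate on the tax exemption and two dummies
reg_gfr <- lm(gfr ~ pe + ww2 + pill, data = fertil3)
coef(reg_gfr)
# Expect an intercept near 99, a small positive coefficient on pe,
# and large negative coefficients on ww2 and pill, as reported above.
```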
Trending Time Series

• Many economic time series have a common tendency of growing over time.
• We must recognize that some series contain a time trend in order to draw causal inference using time series data.
• What kind of statistical models adequately capture trending behavior? One popular formulation is to write the series {yt } as

yt = α0 + α1 t + et , t = 1, 2, . . . , n (10)

where {et } is an i.i.d. sequence with E[et ] = 0 and Var(et ) = σe² .

• Note how the parameter α1 is the effect resulting from the linear time trend.
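A short simulation of model (10), with assumed values α0 = 1 and α1 = 0.5, shows that regressing a trending series on t recovers the trend slope:

```r
set.seed(1)
n <- 200
t <- 1:n
e <- rnorm(n)                 # i.i.d. errors with mean zero

y <- 1 + 0.5 * t + e          # linear trend model (10): a0 = 1, a1 = 0.5

trend_fit <- lm(y ~ t)
coef(trend_fit)["t"]          # slope estimate, close to 0.5
```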
Interpreting trend

• Holding all other factors fixed, α1 measures the change in yt from one period to the next due to the passage of time.
• If ∆et = 0, then

∆yt = yt − yt−1 = α1

• If α1 > 0, then, on average, yt is growing over time and therefore has an upward trend.
• If α1 < 0, then yt has a downward trend.
Interpreting trend

• Many economic time series are better approximated by an exponential trend.
• In practice, an exponential trend in a time series is captured by:

log(yt ) = β0 + β1 t + et , t = 1, 2, . . . (11)

• Recall that, for small changes, ∆log(yt ) is approximately the growth rate in y from period t − 1 to period t, i.e., ∆log(yt ) ≈ (yt − yt−1 )/yt−1 .
• If ∆et = 0, we have ∆log(yt ) = β1 .
• β1 is approximately the average per-period growth rate in yt .
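Analogously for model (11): simulating a series that grows at roughly 2% per period (an assumed rate, for illustration only) and regressing its log on t recovers β1 as the average per-period growth rate:

```r
set.seed(2)
n <- 300
t <- 1:n
e <- rnorm(n, sd = 0.05)

y <- exp(0.1 + 0.02 * t + e)  # exponential trend: log(y_t) = 0.1 + 0.02 t + e_t

exp_fit <- lm(log(y) ~ t)
coef(exp_fit)["t"]            # close to 0.02, the average per-period growth rate
```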
Using Trending Variables in Regression Analysis

• Accounting for explained or explanatory variables that are trending is fairly straightforward in regression analysis.
• Trending variables do not violate the classical assumptions.
• However, we must be careful to allow for the fact that unobserved, trending factors that affect yt might also be correlated with the explanatory variables.
• The phenomenon of finding a relationship between two or more trending variables simply because each is growing over time is an example of a spurious regression problem.
Using Trending Variables in Regression Analysis

Considering

yt = β0 + β1 x1t + β2 x2t + β3 t + ut (12)

• Allowing for a trend in this equation explicitly recognizes that yt may be growing (β3 > 0) or shrinking (β3 < 0) over time for reasons essentially unrelated to x1t and x2t .
• If x1t and x2t are trending themselves, then omitting t from the regression will yield biased estimators of β1 and β2 .
Using Trending Variables in Regression Analysis

library(wooldridge)
lm(linvpc ~ lprice, data=hseinv)

##
## Call:
## lm(formula = linvpc ~ lprice, data = hseinv)
##
## Coefficients:
## (Intercept) lprice
## -0.5502 1.2409

lm(linvpc ~ lprice + t, data=hseinv)

##
## Call:
## lm(formula = linvpc ~ lprice + t, data = hseinv)
##
## Coefficients:
## (Intercept) lprice t
## -0.913060 -0.380961 0.009829
Detrending of Regressions

(Board.)
Computing R-Squared When the Dependent Variable is Trending

• R-squareds in time series regressions are often very high, especially compared with typical R-squareds for cross-sectional data.
• This is due to:
1. Time series data often come in aggregate form (such as average hourly wages in the U.S. economy), and aggregates are often easier to explain than outcomes on individuals, families, or firms, which is often the nature of cross-sectional data.
2. The usual and adjusted R-squareds for time series regressions can be artificially high when the dependent variable is trending.
• If the dependent variable is trending, SST/(n − 1) can substantially overestimate the variance in yt , because it does not account for the trend in yt .
Computing R-Squared When the Dependent Variable is Trending

• The simplest method is to compute the usual R-squared in a regression where the dependent variable has already been detrended.
• Let ÿt be the residuals of the regression of yt on t.
• Then, we regress ÿt on x1t , x2t and t, and analyze the R-squared from this regression.
• The R-squared from this regression is

1 − SSR / Σ ÿt² (sum over t = 1, . . . , n)

where SSR is identical to the sum of squared residuals from the standard regression.
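A sketch of this procedure on simulated data (all series and coefficient values assumed for illustration): detrend y first, then compute the R-squared of the regression of the detrended series on the regressors and the trend. The SSR is the same in both regressions, so only the total sum of squares, and hence the R-squared, changes.

```r
set.seed(3)
n <- 100
t <- 1:n
x1 <- 0.05 * t + rnorm(n)                       # a trending regressor
y  <- 1 + 0.5 * x1 + 0.1 * t + rnorm(n)         # trending dependent variable

# Usual regression and its (possibly inflated) R-squared
r2_usual <- summary(lm(y ~ x1 + t))$r.squared

# Step 1: detrend y (residuals from regressing y on t)
y_dd <- resid(lm(y ~ t))
# Step 2: regress detrended y on x1 and t; its R-squared equals
# 1 - SSR / sum(y_dd^2), with SSR from the standard regression
r2_detrended <- summary(lm(y_dd ~ x1 + t))$r.squared

c(usual = r2_usual, detrended = r2_detrended)   # detrended one is smaller
```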
Seasonality

• If a time series is observed at monthly or quarterly intervals (or even weekly or daily), it may exhibit seasonality.
• For example, retail sales in the fourth quarter are typically higher than in the previous three quarters because of the Christmas holiday.
• Many economic time series that do display seasonal patterns are often seasonally adjusted before they are reported for public use. A seasonally adjusted series is one that, in principle, has had the seasonal factors removed from it.
• Sometimes, we do work with seasonally unadjusted data, and it is useful to know that simple methods are available for dealing with seasonality in regression models.
Seasonality

• Generally, we can include a set of seasonal dummy variables to account for seasonality in the dependent variable, the independent variables, or both.

yt = β0 + δ1 febt + δ2 mart + · · · + δ11 dect + β1 xt1 + · · · + βk xtk + ut , (13)

where febt , mart , . . . , dect are dummy variables indicating whether time period t corresponds to the appropriate month. In this formulation, January is the base month.

• If there is no seasonality in yt , once all the xtj have been controlled for, then δ1 through δ11 are all zero. This is easily tested via an F test.
Seasonality

library(wooldridge)
reg_seas <- lm(lchnimp ~ lchempi + feb + mar +
apr + may + jun + jul +
aug + sep + oct + nov + dec,
data = barium)

library(car)
linearHypothesis(reg_seas, c('feb','mar', 'apr', 'may',
'jun', 'jul', 'aug', 'sep',
'oct', 'nov', 'dec'))

## Linear hypothesis test
##
## Hypothesis:
## feb = 0
## mar = 0
## apr = 0
## may = 0
## jun = 0
## jul = 0
## aug = 0
## sep = 0
## oct = 0
## nov = 0
## dec = 0
##
## Model 1: restricted model
## Model 2: lchnimp ~ lchempi + feb + mar + apr + may + jun + jul + aug +
## sep + oct + nov + dec
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 129 47.438
## 2 118 44.191 11 3.2463 0.788 0.6515
Further Issues in Time Series
Stationary and Nonstationary Time Series

• There are some key concepts to learn in order to apply the usual large sample approximations in regression analysis with time series data.
• The notion of stationary process has played an important role in the analysis of time series.

Stationary Stochastic Process
The stochastic process {xt : t = 1, 2, . . . } is stationary if for every collection of time indices 1 ≤ t1 < t2 < · · · < tm , the joint distribution of (xt1 , xt2 , . . . , xtm ) is the same as the joint distribution of (xt1+h , xt2+h , . . . , xtm+h ) for all integers h ≥ 1.
Stationary and Nonstationary Time Series

• Sometimes, a weaker form of stationarity suffices.

Covariance Stationary Process
The stochastic process {xt : t = 1, 2, . . . } with a finite second moment E[xt²] < ∞ is covariance stationary if:

1. E(xt ) is constant;
2. Var(xt ) is constant;
3. For any t, h ≥ 1, Cov(xt , xt+h ) depends only on h and not on t.
Stationary and Nonstationary Time Series

How is stationarity used in time series econometrics?

• On a technical level, stationarity simplifies statements of the law of large numbers (LLN) and the CLT.
• On a practical level, if we want to understand the relationship between two or more variables using regression analysis, we need to assume some sort of stability over time.
• If we allow the relationship between two variables (say, yt and xt ) to change arbitrarily in each time period, then we cannot hope to learn much about how a change in one variable affects the other variable if we only have access to a single time series realization.
• Assumptions TS4 and TS5 imply that the variance of the error process is constant over time and that the correlation between errors in two adjacent periods is equal to zero, which is clearly constant over time.
Weakly Dependent Time Series

• Stationarity has to do with the joint distributions of a process as it moves through time.
• A very different concept is that of weak dependence, which places restrictions on how strongly related the random variables xt and xt+h can be as the time distance between them, h, gets large.

A covariance stationary time series is weakly dependent if the correlation between xt and xt+h goes to zero “sufficiently quickly” as h → ∞.
Weakly Dependent Time Series

Why is weak dependence important for regression analysis?

• Essentially, it replaces the assumption of random sampling in implying that the LLN and the CLT hold.
• The most well-known CLT for time series data requires stationarity and some form of weak dependence: thus, stationary, weakly dependent time series are ideal for use in multiple regression analysis.
Moving Average Process

• The simplest example of a weakly dependent time series is an independent, identically distributed sequence: a sequence that is independent is trivially weakly dependent.
• A more interesting example of a weakly dependent sequence is

xt = θ0 + ϵt + θ1 ϵt−1 , t = 1, 2, . . . (14)

where {ϵt : t = 0, 1, . . . } is an i.i.d. sequence with zero mean and variance σϵ² .

• The process {xt } is called a moving average process of order one [MA(1)]: xt is a weighted average of ϵt and ϵt−1 .
MA(1)

• Unconditional mean: E[xt ] = µ = θ0 < ∞
• Unconditional variance: Var(xt ) = (1 + θ1²)σϵ² < ∞
• Stationarity: always!

Important Result: Invertibility
The MA(1) process is invertible if the roots of the MA polynomial θ(x) = 1 − θ1 x = 0 lie outside the unit circle (equivalently, the inverse roots lie inside), i.e., |θ1 | < 1.
MA(1) process

library(tsibble)
set.seed(12345)
# simulate an MA(1): x_t = u_t - 0.8 u_{t-1}
# (201 draws so that x[2:201] is fully defined)
u = rnorm(201)
x = numeric(201)
for (i in 2:length(u)) x[i] <- u[i]-0.8*u[i - 1]
e <- tsibble(sample = 1:200, u = x[2:201], index = sample)
MA(1) process

library(fpp3)
e %>%
autoplot(u) + labs(title='MA(1)', y ='', x='')

[Figure: time plot of the simulated MA(1) series]
MA(q)

A moving average process of order q is given by:

xt = θ0 + ϵt − θ1 ϵt−1 − θ2 ϵt−2 − . . . − θq ϵt−q , ϵt ∼ w.n.(0, σϵ²)

• Unconditional mean: E[xt ] = µ = θ0 < ∞
• Unconditional variance: γ0 = Var(xt ) = (1 + θ1² + θ2² + . . . + θq²)σϵ² < ∞
• Stationarity: always!

Important Result: Invertibility
The MA(q) process is invertible if the roots of the MA polynomial θ(x) = 1 − θ1 x − . . . − θq x^q = 0 lie outside the unit circle (equivalently, the inverse roots lie inside).
Autoregressive Process

• Another popular example is the process:

yt = ρ0 + ρ1 yt−1 + ϵt , t = 1, 2, . . . (15)

where {ϵt : t = 0, 1, . . . } is an i.i.d. sequence with zero mean and variance σϵ² .

• The process {yt } is called an autoregressive process of order one [AR(1)].
• The crucial assumption for weak dependence of an AR(1) process is the stability condition |ρ1 | < 1. Then, we say that {yt } is a stable AR(1) process.
AR(1)

• Unconditional mean: E[yt ] = µ = ρ0 /(1 − ρ1 ) < ∞
• Unconditional variance: Var(yt ) = σϵ² /(1 − ρ1²)

Important Result: Stationarity
The AR(1) process is stationary if the roots of the AR polynomial ρ(x) = 1 − ρ1 x = 0 lie outside the unit circle (equivalently, the inverse roots lie inside), i.e., |ρ1 | < 1.
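These formulas can be checked against the parameter values used in the simulation on the next slide (ρ0 = 2, ρ1 = 0.6, σϵ = 1): the implied mean is 2/(1 − 0.6) = 5 and the implied variance is 1/(1 − 0.36) = 1.5625. A long simulated path, sketched below, gets close to both:

```r
set.seed(42)
n <- 50000                      # long path so sample moments settle down
y <- numeric(n)
y[1] <- 5                       # start at the theoretical mean 2/(1 - 0.6)
for (t in 2:n) y[t] <- 2 + 0.6 * y[t - 1] + rnorm(1)

mean(y)                         # close to 5 = rho0/(1 - rho1)
var(y)                          # close to 1.5625 = 1/(1 - rho1^2)
```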
AR(1)

library(tsibble)
set.seed(12345)
# simulate an AR(1): x_t = 2 + 0.6 x_{t-1} + u_t
# (201 draws of u so that the loop never indexes past its end)
x <- rnorm(201)
u <- rnorm(201)
for (i in 2:length(x)){x[i] <- 2+0.6*x[i - 1] + u[i]}
y <- tsibble(sample = 1:200, x = x[2:201], index = sample)
AR(1)

library(fpp3)
y %>%
autoplot(x) + labs(title='AR(1)', y ='', x='')

[Figure: time plot of the simulated AR(1) series]
AR(p)

An autoregressive model of order p is given by:

yt = ρ0 + ρ1 yt−1 + . . . + ρp yt−p + ϵt , ϵt ∼ w.n.(0, σϵ²)

• Unconditional mean: E(yt ) = µ = ρ0 /(1 − ρ1 − . . . − ρp )

Important Result: Stationarity
The AR(p) process is stationary if the roots of the AR polynomial ρ(x) = 1 − ρ1 x − . . . − ρp x^p = 0 lie outside the unit circle (equivalently, the inverse roots lie inside).
Asymptotic Properties of OLS

TS1’: Linearity and Weak Dependence
We assume the model is exactly as in Assumption TS1, but now we add the assumption that {(xt , yt ) : t = 1, 2, . . . } is stationary and weakly dependent. In particular, the LLN and the CLT can be applied to sample averages.

TS2’: No Perfect Collinearity
Same as TS2.

TS3’: Zero Conditional Mean
The explanatory variables Xt = (xt1 , xt2 , . . . , xtk ) are contemporaneously exogenous:

E[ut |Xt ] = 0, t = 1, 2, . . . , n (16)
Asymptotic Properties of OLS

TS4’: Homoskedasticity
The errors are contemporaneously homoskedastic, that is,

Var(ut | xt ) = σ² , t = 1, 2, . . . , n (17)

TS5’: No Serial Correlation
For all t ≠ s, E(ut us |xt , xs ) = 0.
Asymptotic Properties of OLS

Theorem 5
Consistency of OLS
Under Assumptions TS1’ to TS3’, the OLS estimators are consistent: plim β̂j = βj , j = 0, 1, . . . , k.

Theorem 6
Asymptotic Normality of OLS
Under TS1’-TS5’, the OLS estimators are asymptotically normally distributed. Further, the usual OLS standard errors, t statistics, F statistics, and LM statistics are asymptotically valid.
Highly Persistent Time Series

• In the simple AR(1) model, the assumption |ρ1 | < 1 is crucial for the series to be weakly dependent. It turns out that many economic time series are better characterized by the AR(1) model with ρ1 = 1. In that case, we can write:

yt = yt−1 + ϵt , t = 1, 2, . . . (18)

This process is called a random walk or a unit root process.
Random Walk Process

• To find the expected value of yt , we can write it as:

yt = ϵt + ϵt−1 + · · · + ϵ1 + y0

Taking the expected value of both sides gives

E(yt ) = E(ϵt ) + E(ϵt−1 ) + · · · + E(ϵ1 ) + E(y0 )
= E(y0 ), for all t ≥ 1.

• Therefore, the expected value of a random walk does not depend on t.
Random Walk Process

• To find the variance of yt , we use the same decomposition (the ϵt are independent of each other and of y0 ):

Var(yt ) = Var(ϵt ) + Var(ϵt−1 ) + · · · + Var(ϵ1 ) + Var(y0 )
= Var(y0 ) + σϵ² t

• Therefore, the variance of a random walk increases as a linear function of time.
• This shows that the process cannot be stationary.
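This linear variance growth is easy to see by simulation: generate many independent random-walk paths (with y0 = 0 and σϵ = 1 assumed) and compare the cross-sectional variance at time t with the theoretical value σϵ² t = t.

```r
set.seed(7)
n_paths <- 5000
T_len <- 100

# Each column is one random walk: the cumulative sum of i.i.d. N(0,1) shocks
paths <- apply(matrix(rnorm(n_paths * T_len), nrow = T_len), 2, cumsum)

# Cross-sectional variance across paths at t = 25, 50, 100
# versus the theoretical values 25, 50, 100
round(apply(paths[c(25, 50, 100), ], 1, var), 1)
```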
Highly Persistent Behavior

• Even more importantly, a random walk displays highly persistent behavior in the sense that the value of y today is important for determining the value of y in the very distant future.
• To see this, we can write, for h periods hence,

yt+h = ϵt+h + ϵt+h−1 + · · · + ϵt+1 + yt

Now, suppose at time t, we want to compute the expected value of yt+h given the current value yt . Since the expected value of ϵt+j given yt is zero for all j ≥ 1, we have

E(yt+h |yt ) = yt , for all h ≥ 1

• This means that, no matter how far in the future we look, our best prediction of yt+h is today’s value, yt .
Highly Persistent Behavior

• We can contrast this with the stable AR(1) case (with ρ0 = 0), where a similar argument can be used to show that

E(yt+h |yt ) = ρ1^h yt , for all h ≥ 1

• Under stability, |ρ1 | < 1, and so E(yt+h |yt ) approaches zero as h → ∞: the value of yt becomes less and less important, and E(yt+h |yt ) gets closer and closer to the unconditional expected value, E(yt ) = 0.
How to verify in practice if a time series is stationary or not?

1. Time-series plot =⇒ Is the mean of the observed time series more or less constant over time or not? Is the variance of the observed time series roughly constant or not? Although useful, it is only a visual and totally informal method.
2. Unit root tests =⇒ formal statistical inference tests to detect unit roots.
Non Stationary Processes

Trend Stationary Process

yt = α + βt + ϵt

Difference Stationary Process (Random Walk with Drift)

yt = α + yt−1 + ϵt

(Discuss)
Non Stationary Processes

[Figure: a simulated difference stationary series (top) and trend stationary series (bottom)]
Trend Stationarity vs Difference Stationarity

• A trend stationary process will be stationary after trend removal.
• A difference stationary process will be stationary after differencing.
• How to decide if we have a TSP or a DSP?
Difference Stationarity

library(gridExtra)
library(fpp3)
## set random number seed
set.seed(123)
## length of time series
T <- 100
## initialize {x_t} and {w_t}
x <- w <- rnorm(n=T, mean=0, sd=1)
## Random walk without drift: x_t = x_{t-1} + w_t
## (the recursion starts at the second observation)
for(t in 2:T) { x[t] <- x[t-1] + w[t] }
y <- tsibble(sample=1:100, x = x, index=sample)
p1 <- y %>%
autoplot(x) + labs(title='Non Stationary Series', x='', y='')
p2 <- y %>%
autoplot(difference(x)) + labs(title='Differenced Series', x='', y='')
grid.arrange(p1, p2)
Difference Stationarity

[Figure: the simulated random walk (non stationary) and its first difference (stationary)]
NOVA
IMS Difference Operator
Information
Management
School

• A first difference can be written as

∆yt = yt − yt−1 = yt − Lyt = (1 − L)yt

• A second-order difference:

∆∆yt = yt − 2yt−1 + yt−2 = (1 − 2L + L2 )yt = (1 − L)2 yt

• The d-th-order difference can be written as

∆d yt = (1 − L)d yt
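As a quick sanity check (a sketch, not from the slides), base R's diff() implements exactly these operators:

```r
set.seed(1)
y  <- cumsum(rnorm(10))          # an arbitrary series
d1 <- diff(y)                    # (1 - L) y_t
d2 <- diff(y, differences = 2)   # (1 - L)^2 y_t
# the second difference is the first difference applied twice ...
all.equal(d2, diff(d1))
# ... and equals y_t - 2 y_{t-1} + y_{t-2} term by term
all.equal(d2, y[3:10] - 2 * y[2:9] + y[1:8])
```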

Unit root testing: Dickey-Fuller test

• Consider the following AR(1) model: yt = α + ρyt−1 + ut
• And the test:

H0 : ρ = 1 vs. H1 : ρ < 1, or equivalently H0 : θ = 0 vs. H1 : θ < 0

in the model
∆yt = α + θyt−1 + ut , where θ = ρ − 1
Under the null, tθ ∼ DF
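The auxiliary regression can be run by hand with lm(); the following is a minimal sketch on simulated data (not the slides' example) showing where the DF t-statistic comes from:

```r
set.seed(42)
n    <- 200
y    <- cumsum(rnorm(n))          # simulated random walk (true theta = 0)
dy   <- diff(y)                   # Delta y_t
ylag <- y[-n]                     # y_{t-1}
fit  <- lm(dy ~ ylag)             # Delta y_t = alpha + theta * y_{t-1} + u_t
t_theta <- summary(fit)$coefficients["ylag", "t value"]
# compare t_theta with DF (not Normal) critical values, e.g. -2.88 at 5%
t_theta
```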

Dickey-Fuller test

library(urca)

#Difference stationary process


set.seed(1234)
n=200
y = numeric(n)
e = rnorm(n)
for (t in 2:n) {
y[t] = 2 + y[t - 1] + e[t]
}

summary(ur.df(y, type='drift'))

Dickey-Fuller test

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.9637 -0.6220 -0.1578 0.5580 3.0348
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.5314398 0.1831636 8.361 1.17e-14 ***
## z.lag.1 0.0011390 0.0006414 1.776 0.0773 .
## z.diff.lag 0.1042094 0.0722390 1.443 0.1507
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.011 on 195 degrees of freedom
## Multiple R-squared:  0.03125,   Adjusted R-squared:  0.02131
## F-statistic: 3.145 on 2 and 195 DF, p-value: 0.04525
##
##
## Value of test-statistic is: 1.7757 60.2901
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau2 -3.46 -2.88 -2.57
## phi1 6.52 4.63 3.81
Dickey-Fuller Test

• The Dickey-Fuller statistic does not follow a standard distribution, so the critical values are specific to this test.
• The test is sensitive to deterministic regressors (i.e., trend and intercept), so one must be careful with the inclusion of these terms in the auxiliary regression.
• The test is also sensitive to autocorrelation. One solution is the Augmented Dickey-Fuller (ADF) test, which includes lags of the dependent variable in the auxiliary regression:

∆yt = α + δt + θyt−1 + γ1 ∆yt−1 + · · · + γp−1 ∆yt−p+1 + ut

• The number of lags can be chosen according to information criteria or through the general-to-specific (GTS) approach.
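The information-criterion route can be automated with ur.df's selectlags argument ("AIC" or "BIC"); a minimal sketch on simulated data (the slides instead apply the t-sig search to indpro below):

```r
library(urca)
set.seed(123)
y <- cumsum(rnorm(120))   # simulated unit-root series (stand-in for indpro)
# start from a generous maximum lag and keep the BIC-minimising lag order,
# an automatic alternative to the manual general-to-specific search
adf <- ur.df(y, type = "drift", lags = 8, selectlags = "BIC")
summary(adf)
```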

GTS t-sig procedure

library(Quandl)
indpro = Quandl("FRED/INDPRO", collapse="monthly",
start_date="2013-01-01", type="ts")
plot(indpro)
[Figure: time-series plot of indpro (monthly US industrial production index), 2013–2022]

GTS t-sig procedure

library(Quandl)
summary(ur.df(indpro, type='drift', lags=6))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.4813 -0.2941 0.0985 0.5124 3.1348
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 16.444576 6.407731 2.566 0.0119 *
## z.lag.1 -0.163595 0.063918 -2.559 0.0121 *
## z.diff.lag1 0.308495 0.107051 2.882 0.0049 **
## z.diff.lag2 -0.166367 0.111114 -1.497 0.1377
## z.diff.lag3 0.030113 0.111206 0.271 0.7871
## z.diff.lag4 -0.005395 0.110552 -0.049 0.9612
## z.diff.lag5 0.053911 0.105506 0.511 0.6106
## z.diff.lag6 0.055066 0.104410 0.527 0.5992
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.58 on 94 degrees of freedom
## Multiple R-squared:  0.1736,   Adjusted R-squared:  0.1121
GTS t-sig procedure

summary(ur.df(indpro, type='drift', lags=5))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.5350 -0.3063 0.0810 0.4891 3.1639
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.47025 6.13908 2.520 0.01339 *
## z.lag.1 -0.15391 0.06124 -2.513 0.01363 *
## z.diff.lag1 0.30218 0.10564 2.861 0.00519 **
## z.diff.lag2 -0.17671 0.10877 -1.625 0.10752
## z.diff.lag3 0.02181 0.10927 0.200 0.84223
## z.diff.lag4 -0.02479 0.10410 -0.238 0.81230
## z.diff.lag5 0.06114 0.10332 0.592 0.55543
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.567 on 96 degrees of freedom
## Multiple R-squared:  0.1705,   Adjusted R-squared:  0.1186
## F-statistic: 3.288 on 6 and 96 DF, p-value: 0.005511
##
GTS t-sig procedure

summary(ur.df(indpro, type='drift', lags=4))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.5203 -0.3170 0.1037 0.5128 3.2623
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 14.550231 5.888871 2.471 0.0152 *
## z.lag.1 -0.144719 0.058741 -2.464 0.0155 *
## z.diff.lag1 0.292762 0.103557 2.827 0.0057 **
## z.diff.lag2 -0.186184 0.106690 -1.745 0.0841 .
## z.diff.lag3 0.001478 0.102928 0.014 0.9886
## z.diff.lag4 -0.015782 0.101977 -0.155 0.8773
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.553 on 98 degrees of freedom
## Multiple R-squared:  0.1674,   Adjusted R-squared:  0.1249
## F-statistic: 3.941 on 5 and 98 DF, p-value: 0.002677
##
##
GTS t-sig procedure

summary(ur.df(indpro, type='drift', lags=3))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.5129 -0.3027 0.0838 0.5072 3.2508
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 14.7684909 5.6309381 2.623 0.01009 *
## z.lag.1 -0.1469048 0.0561707 -2.615 0.01029 *
## z.diff.lag1 0.2951468 0.1013350 2.913 0.00442 **
## z.diff.lag2 -0.1810848 0.1001664 -1.808 0.07364 .
## z.diff.lag3 -0.0009431 0.1007733 -0.009 0.99255
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.538 on 100 degrees of freedom
## Multiple R-squared:  0.1672,   Adjusted R-squared:  0.1339
## F-statistic: 5.02 on 4 and 100 DF, p-value: 0.0009956
##
##
## Value of test-statistic is: -2.6153 3.4689
GTS t-sig procedure

summary(ur.df(indpro, type='drift', lags=2))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression drift
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.5089 -0.3110 0.0751 0.5086 3.2549
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 14.71618 5.36698 2.742 0.00721 **
## z.lag.1 -0.14641 0.05354 -2.735 0.00736 **
## z.diff.lag1 0.29487 0.09460 3.117 0.00237 **
## z.diff.lag2 -0.18205 0.09799 -1.858 0.06609 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.523 on 102 degrees of freedom
## Multiple R-squared:  0.1669,   Adjusted R-squared:  0.1424
## F-statistic: 6.811 on 3 and 102 DF, p-value: 0.0003133
##
##
## Value of test-statistic is: -2.7348 3.7841
##
GTS t-sig procedure

# Levels: tau2 = -2.7348 > -2.88, so we cannot reject the unit root at 5%;
# difference the series and test again
diff_indpro <- diff(indpro)

summary(ur.df(diff_indpro, lags=6))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.4261 -0.4125 0.0731 0.6087 3.8494
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -1.23740 0.29657 -4.172 6.72e-05 ***
## z.diff.lag1 0.45087 0.26915 1.675 0.0972 .
## z.diff.lag2 0.17451 0.24251 0.720 0.4736
## z.diff.lag3 0.11719 0.20852 0.562 0.5755
## z.diff.lag4 0.03453 0.17242 0.200 0.8417
## z.diff.lag5 0.02860 0.13262 0.216 0.8297
## z.diff.lag6 0.02035 0.10449 0.195 0.8460
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.633 on 94 degrees of freedom
## Multiple R-squared:  0.4544,   Adjusted R-squared:  0.4138
## F-statistic: 11.19 on 7 and 94 DF, p-value: 3.172e-10
GTS t-sig procedure

summary(ur.df(diff_indpro, lags=5))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.4252 -0.4017 0.0837 0.6307 3.8164
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -1.21056 0.26544 -4.561 1.51e-05 ***
## z.diff.lag1 0.42361 0.23581 1.796 0.0756 .
## z.diff.lag2 0.14790 0.20575 0.719 0.4740
## z.diff.lag3 0.09240 0.17055 0.542 0.5892
## z.diff.lag4 0.01100 0.13119 0.084 0.9334
## z.diff.lag5 0.01162 0.10344 0.112 0.9108
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.618 on 96 degrees of freedom
## Multiple R-squared:  0.4541,   Adjusted R-squared:  0.42
## F-statistic: 13.31 on 6 and 96 DF, p-value: 6.394e-11
##
##
##
GTS t-sig procedure

summary(ur.df(diff_indpro, lags=4))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.4128 -0.3895 0.0849 0.6293 3.8196
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -1.197832 0.232413 -5.154 1.32e-06 ***
## z.diff.lag1 0.410786 0.199097 2.063 0.0417 *
## z.diff.lag2 0.135645 0.168150 0.807 0.4218
## z.diff.lag3 0.080946 0.129533 0.625 0.5335
## z.diff.lag4 0.002342 0.102319 0.023 0.9818
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.601 on 98 degrees of freedom
## Multiple R-squared:  0.4541,   Adjusted R-squared:  0.4262
## F-statistic: 16.3 on 5 and 98 DF, p-value: 1.119e-11
##
##
## Value of test-statistic is: -5.1539
GTS t-sig procedure

summary(ur.df(diff_indpro, lags=3))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.4125 -0.3841 0.0844 0.6187 3.8172
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -1.19448 0.19611 -6.091 2.1e-08 ***
## z.diff.lag1 0.40775 0.16127 2.528 0.013 *
## z.diff.lag2 0.13270 0.12753 1.040 0.301
## z.diff.lag3 0.07887 0.10069 0.783 0.435
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.585 on 100 degrees of freedom
## Multiple R-squared:  0.454,   Adjusted R-squared:  0.4322
## F-statistic: 20.79 on 4 and 100 DF, p-value: 1.713e-12
##
##
## Value of test-statistic is: -6.091
##
GTS t-sig procedure

summary(ur.df(diff_indpro, lags=2))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.3731 -0.3976 0.0587 0.6589 3.7907
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -1.10561 0.15936 -6.938 3.74e-10 ***
## z.diff.lag1 0.32501 0.12166 2.672 0.00879 **
## z.diff.lag2 0.07051 0.09945 0.709 0.47994
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.575 on 102 degrees of freedom
## Multiple R-squared:  0.4506,   Adjusted R-squared:  0.4344
## F-statistic: 27.88 on 3 and 102 DF, p-value: 2.997e-13
##
##
## Value of test-statistic is: -6.9378
##
## Critical values for test statistics:
GTS t-sig procedure

summary(ur.df(diff_indpro, lags=1))

##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.2565 -0.3927 0.0431 0.6137 3.7892
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -1.03269 0.12067 -8.558 1.09e-13 ***
## z.diff.lag 0.27157 0.09471 2.867 0.00501 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.563 on 104 degrees of freedom
## Multiple R-squared:  0.4481,   Adjusted R-squared:  0.4375
## F-statistic: 42.23 on 2 and 104 DF, p-value: 3.759e-14
##
##
## Value of test-statistic is: -8.5583
##
## Critical values for test statistics:
## 1pct 5pct 10pct
Dynamically Complete Models

• In the AR(1) model, we showed that the errors {ut } must be serially uncorrelated in the sense that Assumption TS.5' is satisfied: assuming no serial correlation is practically the same as assuming that only one lag of y appears in E (yt |yt−1 , yt−2 , . . . ).
• Can we make a similar statement for other regression models? Yes.
• Consider the following static regression model
yt = β0 + β1 zt + ut
where yt and zt are contemporaneously dated.
• For consistency of OLS, we only need E(ut |zt ) = 0. In general, the {ut } will be serially correlated. But if we assume

E (ut |zt , yt−1 , zt−1 , . . . ) = 0 (19)

then Assumption TS.5' holds.
NOVA
IMS Dynamically Complete Models
Information
Management
School

Dynamically Complete Models

In the general model

yt = β0 + β1 xt1 + · · · + βk xtk + ut

where the explanatory variables xt = (xt1 , . . . , xtk ) may or may not contain lags of y or z, Equation (19) becomes

E (ut |xt , yt−1 , xt−1 , . . . ) = E (ut |xt )

In other words, whatever is in xt , enough lags have been included so that further lags of y and the explanatory variables do not matter for explaining yt . When this condition holds, we have a dynamically complete model.
Dynamically Complete Models

Consider the output below, where we regressed the change in the general fertility rate, ∆gfr, on the change in the personal tax exemption, ∆pe, allowing for two lags of ∆pe:
summary(lm(diff(gfr) ~ diff(pe) + diff(pe_1) + diff(pe_2), data=fertil3))

##
## Call:
## lm(formula = diff(gfr) ~ diff(pe) + diff(pe_1) + diff(pe_2),
## data = fertil3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.8307 -2.1842 -0.1912 1.8442 11.4506
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.96368 0.46776 -2.060 0.04339 *
## diff(pe) -0.03620 0.02677 -1.352 0.18101
## diff(pe_1) -0.01397 0.02755 -0.507 0.61385
## diff(pe_2) 0.10999 0.02688 4.092 0.00012 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.859 on 65 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared:  0.2325,   Adjusted R-squared:  0.1971
## F-statistic: 6.563 on 3 and 65 DF, p-value: 0.0006054

Dynamically Complete Models

Adding ∆gfrt−1 to the model leads to a significant estimate:


summary(lm(diff(gfr) ~ diff(pe) + diff(pe_1) + diff(pe_2) + diff(gfr_1), data=fertil3))

##
## Call:
## lm(formula = diff(gfr) ~ diff(pe) + diff(pe_1) + diff(pe_2) +
## diff(gfr_1), data = fertil3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.7491 -2.2345 0.0776 1.7393 9.2857
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.702159 0.453799 -1.547 0.126724
## diff(pe) -0.045472 0.025642 -1.773 0.080926 .
## diff(pe_1) 0.002064 0.026778 0.077 0.938800
## diff(pe_2) 0.105135 0.025590 4.108 0.000115 ***
## diff(gfr_1) 0.300242 0.105903 2.835 0.006125 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.666 on 64 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared:  0.3181,   Adjusted R-squared:  0.2755
## F-statistic: 7.464 on 4 and 64 DF, p-value: 5.336e-05

Dynamically Complete Models

This means that the model is not dynamically complete: further terms that are relevant for explaining the dependent variable can still be added. The fact that the model is not dynamically complete suggests that there may be serial correlation in the errors.
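One common way to check this suspicion is a Breusch-Godfrey LM test on the incomplete model's residuals; the following is a hypothetical sketch on simulated data (bgtest() is from the lmtest package, not the slides):

```r
library(lmtest)
set.seed(1)
n <- 200
x <- rnorm(n)
e <- rnorm(n)
y <- numeric(n)
# true DGP includes y_{t-1}, so a static regression of y on x alone
# is dynamically incomplete
for (t in 2:n) y[t] <- 0.5 * y[t - 1] + x[t] + e[t]
fit <- lm(y ~ x)        # omits the lag of y
bgtest(fit, order = 1)  # small p-value signals serial correlation in u_t
```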
