
AUTOCORRELATION

RAKESH SRIVASTAVA
DEPARTMENT OF STATISTICS
THE M.S.UNIVERSITY OF BARODA
VADODARA-390 002
PURPOSE:

1. To detect non-randomness.

2. To identify an appropriate time series model if the data are not random.

When autocorrelation is used to detect non-randomness, usually only the first (lag-1) autocorrelation coefficient (ACC) is examined; when it is used to identify an appropriate time series (T.S.) model, the ACCs are usually plotted for many lags.
QUESTIONS:

The concept of the ACC is required to answer the following questions:

1. Was the sample data set generated from a random process?
2. Would a non-linear T.S. model be more appropriate for these data than a simple constant-plus-error model?
IMPORTANCE:

Randomness is one of the key assumptions in determining whether a univariate process is in control.
If the assumptions of constant location and scale, randomness, and a fixed distribution are reasonable, then the univariate process can be modeled as

Yi = A0 + Ei,   where Ei is the error term.

If the randomness assumption is not valid, a different model needs to be used. This will typically be either a T.S. model or a non-linear model with time as the independent variable.
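As a quick illustration of this use of the lag-1 ACC (a minimal Python sketch, not part of the original slides; the simulated series and the ±2/√n rule-of-thumb band are illustrative assumptions):

```python
import numpy as np

def lag1_autocorrelation(y):
    """Sample lag-1 autocorrelation coefficient of a series y."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()
    return np.sum(dev[1:] * dev[:-1]) / np.sum(dev ** 2)

rng = np.random.default_rng(0)
y = rng.normal(size=200)              # a purely random (white noise) series
r1 = lag1_autocorrelation(y)
band = 2 / np.sqrt(len(y))            # rough 95% band under randomness
print(f"lag-1 ACC = {r1:.3f}, approximate band = +/-{band:.3f}")
# |r1| well inside the band is consistent with randomness (Yi = A0 + Ei);
# |r1| outside it suggests a time series model is needed.
```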
DEFINITION:-
The term autocorrelation may be defined as "correlation between members of a series of observations ordered in time [as in time series data] or space [as in cross-sectional data]."
In the regression context, the classical linear regression model (CLRM) assumes that such correlation does not exist in the disturbances ui. Symbolically, no autocorrelation means
E(ui uj) = 0,   i ≠ j
Put simply, the classical linear regression model assumes that the disturbance term relating to any observation is not influenced by the disturbance term relating to any other observation.
However, if there is any such influence, we have autocorrelation. Symbolically,
E(ui uj) ≠ 0,   i ≠ j

Let us visualize some of the plausible patterns of auto- and non-autocorrelation (see the residual patterns (a) to (e) shown later under "Patterns of autocorrelation and nonautocorrelation").

FIRST-ORDER AUTOREGRESSIVE SCHEME
The autoregressive structure is
ut = ρut-1 + vt,   -1 < ρ < 1
where ρ = the coefficient of the autocorrelation relationship and vt = a random term which fulfils all the usual assumptions of a random variable, that is,
E(vt) = 0,   var(vt) = σ²v,   cov(vt, vt+s) = 0 for s ≠ 0.

The complete form of the first-order Markov process (the pattern of autocorrelation for all the values of u) is
ut = vt + ρvt-1 + ρ²vt-2 + ρ³vt-3 + ⋯
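A short simulation (a sketch with arbitrary parameter values) makes the scheme concrete: each disturbance equals ρ times its previous value plus a fresh shock, and the sample autocorrelations die out roughly as ρ^s, as the Markov form implies:

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho, sigma_v = 300, 0.8, 1.0

u = np.zeros(n)
v = rng.normal(scale=sigma_v, size=n)   # white-noise shocks v_t
for t in range(1, n):
    u[t] = rho * u[t - 1] + v[t]        # u_t = rho*u_{t-1} + v_t (AR(1))

# Sample autocorrelations fall off roughly as rho**s.
for s in (1, 2, 3):
    r_s = np.corrcoef(u[s:], u[:-s])[0, 1]
    print(f"lag {s}: sample corr = {r_s:.2f}, rho**{s} = {rho**s:.2f}")
```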
CAUSES OF AUTOCORRELATION:-
1. Inertia:- Most economic time series exhibit inertia, or sluggishness: in the upswing of a business cycle, the value of a series at one point in time is greater than its previous value. Thus there is a "momentum" built into them, and it continues until something happens to slow it down. Therefore, in regressions involving time series data, successive observations are likely to be interdependent. Examples of such time series are GNP, price indexes, production, employment, and unemployment, all of which exhibit (business) cycles.
2. Specification Bias:- Sometimes autocorrelation patterns such as those shown in figures (a) to (d) occur not because successive observations are correlated but because the regression model is not "correctly" specified. By incorrect specification of a model we mean that either some important variables that should be included in the model are not included, or unnecessary variables are included. This is the case of specification bias. Often the inclusion of such omitted variables removes the correlation pattern observed among the residuals. For example, suppose we have the following demand model:

Yt = β1 + β2X2t + β3X3t + β4X4t + ut …(1)


CONT…

where Y = quantity of coffee demanded, X2 = price of coffee, X3 = consumer income, and X4 = price of tea. However, for some reason we run the following regression:
Yt = β1 + β2X2t + β3X3t + vt …(2)
Now if (1) is the "correct" model or the "truth" or true relation, running (2) is tantamount to letting vt = β4X4t + ut.
To the extent that the price of tea affects the consumption of coffee, the error or disturbance term v will reflect a systematic pattern, thus creating (false) autocorrelation.
Model specification bias can also occur through a wrong functional form. Suppose the correct functional form of some regression model is linear, and instead one uses a log-linear model; then there are chances of autocorrelation in the disturbances.
3. Cobweb Phenomenon:- The supply of many agricultural commodities reflects the so-called cobweb phenomenon, where supply reacts to price with a lag of one time period because supply decisions take time to implement (the gestation period). Thus, at the beginning of this year's planting of crops, farmers are influenced by the price prevailing last year, so that their supply function is
CONT…

Supplyt= β1 + β2Pt-1+ ut …(3)


Suppose at the end of period t, price Pt turns out to be lower than Pt-1. Therefore, in period t+1 farmers may very well decide to produce less than they did in period t. Obviously, in this situation the disturbances ut are not expected to be random, because if the farmers overproduce in year t, they are likely to reduce their production in t+1, and so on, leading to a cobweb pattern.
4. Lags:- If in the regression model one of the explanatory variables is the previous value (i.e. lagged value) of the dependent variable, then this regression is known as an autoregression. For example,
Consumptiont = β1 + β2incomet + β3consumptiont-1 + ut …(4)
In the given example one can easily understand that consumers do not change their consumption habits readily, for psychological, technological, or institutional reasons. Here, if we neglect the lagged term in the given example, the resulting error will reflect a systematic pattern due to the influence of lagged consumption on current consumption.
5. "Manipulation" of Data:- In empirical analysis, the raw data are often manipulated. For example, in time series regressions involving quarterly data,
CONT…

such data are usually derived from monthly data by simply adding three monthly observations and dividing the sum by 3. This averaging introduces smoothness into the data by dampening the fluctuations in the monthly data. Therefore, a graph plotting the quarterly data looks much smoother than one of the monthly data, and this smoothness may itself lend a systematic pattern to the disturbances, thereby introducing autocorrelation. Another source of manipulation is interpolation or extrapolation of the data.

6. Data Transformation:- As an example of this, consider the following model:

Yt = β1 + β2Xt + ut …(5)

where, say, Y = consumption expenditure and X = income. Since eq. (5) holds true at every time period, it also holds true in the previous time period (t-1). So we can write eq. (5) as
Yt-1 = β1 + β2Xt-1 + ut-1 …(6)
Yt-1, Xt-1 and ut-1 are known as the lagged values of Y, X and u, respectively,
CONT…

lagged by one period. Now if we subtract eq. (6) from eq. (5), we obtain

ΔYt = β2ΔXt + Δut …(7)

where Δ, known as the first difference operator, tells us to take successive differences of the variables in question. Thus, ΔYt = (Yt - Yt-1), ΔXt = (Xt - Xt-1), and Δut = (ut - ut-1). For empirical purposes, we write the above equation as
ΔYt = β2ΔXt + vt …(8)
where vt = Δut = (ut - ut-1).
Even if the error term in equation (5) satisfies the standard OLS assumptions, particularly the assumption of no autocorrelation, it can be shown that the error term vt in equation (8) is autocorrelated. It may be noted here that models involving lagged regressands, such as (4) above, are known as dynamic regression models.
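A quick numerical check (a sketch with invented data) illustrates the point: even when ut is white noise, the differenced error vt = Δut is autocorrelated, with a theoretical lag-1 correlation of -0.5:

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.normal(size=100_000)      # u_t satisfies the no-autocorrelation assumption
v = np.diff(u)                    # v_t = u_t - u_{t-1}, the error of the differenced model (8)

r1 = np.corrcoef(v[1:], v[:-1])[0, 1]
print(f"lag-1 correlation of v_t: {r1:.3f}  (theory: -0.5)")
```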
7. Nonstationarity:- Whenever we are dealing with time series data, we may have to find out whether the given series is stationary or not. If it does not satisfy the stationarity conditions, then the series is nonstationary. In simple words, when the mean, variance and covariances of a series are time-variant, the series is said to be nonstationary.
OLS ESTIMATION IN THE PRESENCE OF AUTOCORRELATION
Let us once again revert to the two-variable regression model to get the basic idea,
Yt = β1 + β2Xt + ut …(8)
Let us assume that the error terms are correlated, i.e. E(ut ut+s) ≠ 0 (s ≠ 0), and that they are generated by the following mechanism:
ut = ρut-1 + εt,   -1 < ρ < 1 …(9)
where ρ (rho) is known as the coefficient of autocovariance and εt is a stochastic disturbance term that satisfies the standard OLS assumptions, namely,
E(εt) = 0
var(εt) = σ²ε
cov(εt, εt+s) = 0,   s ≠ 0 …(10)
An error term with the preceding properties is called a white noise error term.
Equation (9) postulates that the value of the disturbance term in period t is equal to rho times its value in the previous period plus a purely random error term.
The scheme (9) is known as a Markov first-order scheme, or simply a first-order autoregressive scheme, usually denoted AR(1).
CONT…

The name autoregressive is appropriate because eq. (9) can be interpreted as a regression of ut on itself lagged one period. It is first order because only ut and its immediate past value are involved; that is, the maximum lag is 1. If the model were ut = ρ1ut-1 + ρ2ut-2 + εt, it would be an AR(2), or second-order autoregressive, scheme, and so on.
In passing, note that rho, the coefficient of autocovariance in eq. (9), can also be interpreted as the first-order coefficient of autocorrelation, or more accurately, the coefficient of autocorrelation at lag 1.
Given the AR(1) scheme, it can be shown that:

var(ut) = E(u²t) = σ²ε / (1 - ρ²) …(11)

cov(ut, ut+s) = E(ut ut+s) = ρ^s σ²ε / (1 - ρ²) …(12)

cor(ut, ut+s) = ρ^s …(13)
CONT…

where cov(ut, ut+s) means the covariance between error terms s periods apart and cor(ut, ut+s) means the correlation between error terms s periods apart. Note the symmetry property of covariances and correlations: cov(ut, ut+s) = cov(ut, ut-s) and cor(ut, ut+s) = cor(ut, ut-s).
Since ρ is a constant between -1 and +1, eq. (11) shows that under the AR(1) scheme the variance of ut is still homoscedastic, but ut is correlated with its past values. It is critical to note that |ρ| < 1, that is, the absolute value of rho is less than one. If, for example, rho is one, the variances and covariances listed above are not defined. If |ρ| < 1, we say that the AR(1) process given in eq. (9) is stationary; that is, the mean, variance and covariance of ut do not change over time. If |ρ| < 1, it is also clear from eq. (12) that the value of the covariance declines as we go further into the past.
Now return to our two-variable regression model: Yt = β1 + β2Xt + ut. We know that the OLS estimator of the slope coefficient is

β̂2 = Σxtyt / Σx²t …(14)
CONT…

and its variance is given by

var(β̂2) = σ² / Σx²t …(15)

where small letters as usual denote deviations from the mean values.
Now under the AR(1) scheme, it can be shown that the variance of this estimator is:

var(β̂2)AR(1) = (σ² / Σx²t)[1 + 2ρ(Σxtxt+1 / Σx²t) + 2ρ²(Σxtxt+2 / Σx²t) + … + 2ρ^(n-1)(x1xn / Σx²t)] …(16)

A comparison of eq. (16) with eq. (15) shows that the former is equal to the latter times a term that depends on ρ as well as the sample autocorrelations between the values taken by the regressor X at various lags. In general we cannot foretell whether var(β̂2) is less than or greater than var(β̂2)AR(1).
To give some idea of the difference between the variances given in eqs. (15) and (16), assume that the regressor X also follows a first-order autoregressive scheme with coefficient of autocorrelation r. Then it can be shown that eq. (16) reduces to:

var(β̂2)AR(1) = (σ² / Σx²t) × (1 + ρr) / (1 - ρr) …(17)
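To get a feel for how far the usual OLS formula (15) can be off, the correction factor in eq. (17) can be evaluated for a few illustrative values of ρ and r (a sketch; the values are arbitrary, not from any data set in these notes):

```python
import numpy as np

def ar1_variance_factor(rho, r):
    """Ratio of var(b2) under AR(1) errors (eq. 17) to the usual OLS formula (eq. 15),
    assuming X itself follows an AR(1) scheme with autocorrelation r."""
    return (1 + rho * r) / (1 - rho * r)

for rho in (0.2, 0.5, 0.8):
    factor = ar1_variance_factor(rho, rho)   # take r = rho for illustration
    print(f"rho = r = {rho}: true variance is {factor:.2f} times what the OLS formula reports")
```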
THE BLUE ESTIMATOR IN THE PRESENCE OF AUTOCORRELATION

In the two-variable model, assuming the AR(1) process, we can show that the BLUE estimator of β2 is given by the following expression:

β̂2^GLS = [Σ(t=2 to n)(xt - ρxt-1)(yt - ρyt-1)] / [Σ(t=2 to n)(xt - ρxt-1)²] + C …(18)

where C is a correction factor that may be disregarded in practice. Note that the subscript t now runs from t = 2 to t = n. Its variance is given by

var(β̂2^GLS) = σ² / [Σ(t=2 to n)(xt - ρxt-1)²] + D …(19)

where D too is a correction factor that may also be disregarded in practice.
The estimator β̂2^GLS, as the superscript suggests, is obtained by the method of GLS. In GLS we incorporate any additional information we have (e.g., the nature of the heteroscedasticity or of the autocorrelation) directly into the estimating procedure by transforming the variables, whereas in OLS such side information is not directly taken into consideration.
As we can see, the GLS estimator of β2 given in equation (18) incorporates the autocorrelation parameter ρ in the estimating formula, whereas the OLS formula given in equation (14) simply neglects it. Intuitively, this is the reason why the GLS estimator is BLUE and not the OLS estimator: the GLS estimator makes the most use of the available information. It hardly needs to be added that if ρ = 0, there is no additional information to be considered and hence the GLS and OLS estimators are identical.
In short, under autocorrelation it is the GLS estimator given in equation (18) that is BLUE, and the minimum variance is now given by equation (19), not by equation (15) or (16).
CONSEQUENCES OF OLS IN THE PRESENCE OF AUTOCORRELATION

OLS ESTIMATION ALLOWING FOR AUTOCORRELATION

As noted, β̂2 is not BLUE, and even if we use var(β̂2)AR(1), the confidence intervals derived from it are likely to be wider than those based on the GLS procedure. That is, β̂2 is not asymptotically efficient. The implication of this finding for hypothesis testing is clear: we are likely to declare a coefficient statistically insignificant (i.e., not different from zero) even though in fact (i.e., based on the correct GLS procedure) it may be statistically significant. This difference can be seen clearly from the following figure:

[Figure: the GLS 95% confidence interval lies inside the wider OLS 95% confidence interval]
CONT…

In this figure we show the 95% OLS [AR(1)] and GLS confidence intervals assuming that the true β2 = 0. Consider a particular estimate of β2, say b2. Since b2 lies in the OLS confidence interval, we could accept the hypothesis that the true β2 is zero with 95% confidence; but if we were to use the (correct) GLS confidence interval, we would reject the null hypothesis that the true β2 is zero, for b2 lies in the region of rejection.
The message is: to establish confidence intervals and to test hypotheses, one should use GLS and not OLS, even though the estimators derived from the latter are unbiased and consistent.
OLS ESTIMATION DISREGARDING AUTOCORRELATION
The situation is potentially very serious if we not only use β̂2 but also continue to use var(β̂2) = σ²/Σx²t, which completely disregards the problem of autocorrelation; that is, we mistakenly believe that the usual assumptions of the classical linear model hold true. Errors will arise for the following reasons:
➢ The residual variance σ̂² is likely to underestimate the true σ².
➢ As a result, we are likely to overestimate R².
➢ Even if σ² is not underestimated, var(β̂2) may underestimate var(β̂2)AR(1), its variance under (first-order) autocorrelation, even though the latter is inefficient compared with var(β̂2^GLS).
➢ Therefore, the usual t and F tests of significance are no longer valid, and if applied are likely to give seriously misleading conclusions about the statistical significance of the estimated regression coefficients.
DETECTING AUTOCORRELATION
➢ Graphical Method
The non-autocorrelation assumption of the classical model relates to the population disturbances ut, which are not directly observable. What we have instead are their proxies, the residuals ût, which can be obtained by the usual OLS procedure. Although the ût are not the same thing as the ut, very often a visual examination of the û's gives us some clue about the likely presence of autocorrelation in the u's.
PATTERNS OF AUTOCORRELATION AND NONAUTOCORRELATION
CONT…

Figures (a) to (d) show a discernible pattern among the û's: figure (a) shows a cyclical pattern; figures (b) and (c) suggest an upward or downward trend in the disturbances; and figure (d) indicates both linear and quadratic trends in the disturbances. Only figure (e) shows no systematic pattern, supporting the non-autocorrelation assumption of the classical linear regression model.
RESIDUALS AND STANDARDIZED RESIDUALS
There are various ways of examining the residuals. We can simply plot them against time, the time sequence plot, as in the figure above. Alternatively, we can plot the standardized residuals against time. The standardized residuals are simply the residuals divided by the standard error of the regression σ̂; that is, they are (ût/σ̂). Notice that the ût are measured in the units in which the regressand Y is measured, whereas the standardized residuals are pure numbers (devoid of units of measurement) and can therefore be compared with the standardized residuals of other regressions. Moreover, the standardized residuals, like the ût, have zero mean and approximately unit variance. In large samples (ût/σ̂) is approximately normally distributed with zero mean and unit variance.
Now we plot ût against ût-1, that is, plot the residuals at time t against their value at time (t-1) [see the figure below], a kind of empirical test of the AR(1) scheme. The figure reveals that most of the residuals are bunched in the northeast and southwest quadrants, suggesting a strong positive correlation in the residuals.
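These graphical checks are easy to reproduce (a minimal sketch; the regression, data and parameter values are invented purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
n, rho = 120, 0.7
x = np.linspace(0, 10, n)
e = np.zeros(n)
for t in range(1, n):                       # AR(1) disturbances
    e[t] = rho * e[t - 1] + rng.normal()
y = 2.0 + 0.5 * x + e

X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta                        # OLS residuals u_hat_t

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3.5))
ax1.plot(resid); ax1.set_title("residuals vs time")
ax2.scatter(resid[1:], resid[:-1], s=10)    # u_hat_t against u_hat_{t-1}
ax2.set_title("u_hat_t vs u_hat_{t-1}")
plt.tight_layout(); plt.show()
# Points bunching in the NE and SW quadrants indicate positive autocorrelation.
```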
CONT…

➢ Durbin-Watson d Test
The most celebrated test for detecting serial correlation is the one developed by the statisticians Durbin and Watson. It is popularly known as the Durbin-Watson d statistic, which is defined as

d = Σ(t=2 to n)(ût - ût-1)² / Σ(t=1 to n)û²t

which is simply the ratio of the sum of squared differences in successive residuals to the RSS. Note that in the numerator of the d statistic the number of observations is n - 1, because one observation is lost in taking successive differences.
A great advantage of the d statistic is that it is based on the estimated residuals, which are routinely computed in regression analysis. Because of this advantage, it is now common practice to report the Durbin-Watson d along with summary measures such as R², adjusted R², and the t and F statistics.
➢ Assumptions of the Durbin-Watson d Test
1. The regression model includes the intercept term. If it is not present, as in the case of regression through the origin, it is essential to rerun the regression including the intercept term to obtain the RSS.
2. The explanatory variables, the X's, are non-stochastic, or fixed in repeated sampling.
3. The disturbances ut are generated by the first-order autoregressive scheme: ut = ρut-1 + εt. Therefore, the test cannot be used to detect higher-order autoregressive schemes.
4. The error term ut is assumed to be normally distributed.
5. The regression model does not include the lagged value(s) of the dependent variable as one of the explanatory variables. Thus, the test is inapplicable in models of the following type:
Yt = β1 + β2X2t + β3X3t + … + βkXkt + λYt-1 + ut
where Yt-1 is the one-period lagged value of Y. Such models are known as autoregressive models.
6. There are no missing observations in the data.
➢ The actual test procedure can be explained with the aid of the following figure, which divides the range of d from 0 to 4 into five regions:

0 to dL: reject H0 (evidence of positive autocorrelation)
dL to dU: zone of indecision
dU to 4 - dU: do not reject H0 or H0* or both
4 - dU to 4 - dL: zone of indecision
4 - dL to 4: reject H0* (evidence of negative autocorrelation)
CONT….

where

d = Σ(ût - ût-1)² / Σû²t

which can be written as

d = [Σû²t + Σû²t-1 - 2Σûtût-1] / Σû²t

Since Σû²t and Σû²t-1 differ in only one observation, they are approximately equal. Now let us define

ρ̂ = Σûtût-1 / Σû²t

as the sample first-order coefficient of autocorrelation, an estimator of ρ. We can then express the above equation as

d ≈ 2(1 - ρ̂)

But since -1 ≤ ρ̂ ≤ 1, this implies that

0 ≤ d ≤ 4

These are the bounds of d; any estimated d value must lie within these limits.
CONT….

➢ It is apparent from the above equation that if ρ̂ = 0, that is, if there is no serial correlation (of the first order), d is expected to be about 2. Therefore, as a rule of thumb, if d is found to be 2 in an application, one may assume that there is no first-order autocorrelation. If ρ̂ = +1, indicating perfect positive correlation in the residuals, d ≈ 0. Therefore, the closer d is to 0, the greater the evidence of positive serial correlation. This relationship is evident from the first equation because, if there is positive autocorrelation, the ût will be bunched together and their differences will therefore tend to be small. As a result, the numerator sum of squares will be smaller in comparison with the denominator sum of squares, which remains fixed for any given regression. Similarly, if ρ̂ = -1, d ≈ 4; the closer d is to 4, the greater the evidence of negative serial correlation.
➢ Durbin-Watson d Test: Decision Rules

Null hypothesis | Decision | If
No positive autocorrelation | Reject | 0 < d < dL
No positive autocorrelation | No decision | dL ≤ d ≤ dU
No negative correlation | Reject | 4 - dL < d < 4
No negative correlation | No decision | 4 - dU ≤ d ≤ 4 - dL
No autocorrelation, positive or negative | Do not reject | dU < d < 4 - dU

1. H0: ρ = 0 versus H1: ρ > 0. Reject H0 at the α level if d < dU. That is, there is statistically significant positive autocorrelation.
2. H0: ρ = 0 versus H1: ρ < 0. Reject H0 at the α level if the estimated (4 - d) < dU. That is, there is statistically significant evidence of negative autocorrelation.
3. H0: ρ = 0 versus H1: ρ ≠ 0. Reject H0 at the 2α level if d < dU or (4 - d) < dU. That is, there is statistically significant evidence of autocorrelation, positive or negative.
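A minimal sketch of the computation and of the decision rules in the table above (the simulated residuals and the critical values dL and dU are placeholders; in practice dL and dU come from the Durbin-Watson tables for the given n, k and α):

```python
import numpy as np

def durbin_watson(resid):
    """d = sum_{t=2}^n (u_t - u_{t-1})^2 / sum_{t=1}^n u_t^2."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

def dw_decision(d, dL, dU):
    """Apply the decision rules from the table above."""
    if d < dL:
        return "reject H0: evidence of positive autocorrelation"
    if d <= dU:
        return "zone of indecision"
    if d < 4 - dU:
        return "do not reject H0: no first-order autocorrelation"
    if d <= 4 - dL:
        return "zone of indecision"
    return "reject H0*: evidence of negative autocorrelation"

# hypothetical residuals and illustrative table values (dL, dU depend on n, k and alpha)
rng = np.random.default_rng(4)
u = rng.normal(size=40).cumsum() * 0.2 + rng.normal(size=40)
d = durbin_watson(u)
print(f"d = {d:.3f} ->", dw_decision(d, dL=1.44, dU=1.54))
```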
CONT….

➢ The Breusch-Godfrey Test:-
To avoid some of the pitfalls of the Durbin-Watson d test of autocorrelation, the statisticians Breusch and Godfrey have developed a test of autocorrelation that is general in the sense that it allows for (1) regressors that include lagged values of the regressand; (2) higher-order autoregressive schemes, such as AR(1), AR(2), etc.; and (3) simple or higher-order moving averages of white noise error terms.
The BG test, also known as the LM test, proceeds as follows. We use the two-variable regression model to illustrate the test, although many regressors can be added to the model; lagged values of the regressand can also be added. Let
Yt = β1 + β2Xt + ut …(1.1)
Assume that the error term ut follows the pth-order autoregressive, AR(p), scheme:
ut = ρ1ut-1 + ρ2ut-2 + … + ρput-p + εt …(1.2)
where εt is a white noise error term.
CONT….

The null hypothesis H0 to be tested is that

H0: ρ1 = ρ2 = … = ρp = 0 …(1.3)

that is, there is no serial correlation of any order. The BG test involves the following steps:
1. Estimate (1.1) by OLS and obtain the residuals ût.
2. Regress ût on the original Xt (if there is more than one X variable in the original model, include them also) and on ût-1, ût-2, …, ût-p, where the latter are the lagged values of the residuals from step 1. Thus, if p = 4, we introduce four lagged values of the residuals as additional regressors in the model. Note that to run this regression we have only (n - p) observations. In short, run the following regression:
ût = α1 + α2Xt + ρ̂1ût-1 + ρ̂2ût-2 + … + ρ̂pût-p + εt …(1.4)
and obtain the R² from this regression.
3. If the sample size is large (technically, infinite), Breusch and Godfrey have shown that
(n - p)R² ~ χ²p …(1.5)
CONT….

That is, asymptotically, (n - p) times the R² value obtained from the auxiliary regression (1.4) follows the chi-square distribution with p df. If in an application (n - p)R² exceeds the critical chi-square value at the chosen level of significance, we reject the null hypothesis, in which case at least one of the rho's in (1.2) is statistically significantly different from zero.
The following practical points about the BG test may be noted:
➢ The regressors included in the regression model may contain lagged values of the regressand Y; that is, Yt-1, Yt-2, etc., may appear as explanatory variables. Contrast this with the Durbin-Watson d test restriction that there be no lagged values of the regressand among the regressors.
➢ As noted earlier, the BG test is applicable even if the disturbances follow a pth-order moving average (MA) process, that is, if the ut are generated as follows:
ut = εt + λ1εt-1 + λ2εt-2 + … + λpεt-p
where εt is a white noise error term, that is, an error term that satisfies all the classical assumptions.
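In practice the auxiliary regression need not be run by hand; for example, statsmodels provides acorr_breusch_godfrey (a sketch; the simulated data and the choice p = 2 are only illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(5)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                       # AR(1) disturbances
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=2)  # p = 2
print(f"LM statistic = {lm_stat:.2f} (asymptotically chi-square, p df), p-value = {lm_pval:.4f}")
# A small p-value rejects H0: rho_1 = ... = rho_p = 0 (no serial correlation).
```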
CONT…

➢ If in equation (1.2) p = 1, meaning first-order autocorrelation, then the BG test is known as Durbin's M test.
➢ A drawback of the BG test is that the value of p, the lag length, cannot be specified a priori. Some experimentation with the p value is inevitable. Sometimes one can use the so-called Akaike and Schwarz information criteria to select the lag length.
REMEDIAL MEASURES
If, after applying one or more of the diagnostic tests of autocorrelation discussed in the previous section, we find that there is autocorrelation, what then? We have four options:
1. Try to find out whether the autocorrelation is pure autocorrelation and not the result of mis-specification of the model.
2. If it is pure autocorrelation, one can use an appropriate transformation of the original model so that in the transformed model we do not have the problem of (pure) autocorrelation.
3. In large samples, we can use the Newey-West method to obtain standard errors of OLS estimators that are corrected for autocorrelation. This method is actually an extension of White's heteroscedasticity-consistent standard errors method.
4. In some situations we can continue to use the OLS method.
MODEL MIS-SPECIFICATION VERSUS PURE AUTOCORRELATION

If we take as an example real consumption (index) = Y and productivity (index) = X for the United States from 1959 to 1998, and run regressions based on different models, the available results are as under:
1. Ŷt = 29.5192 + 0.7136Xt
   se = (1.9423) (0.0241)
   t = (15.1977) (29.6066)
   r² = 0.9584, d = 0.1229

2. Ŷt = 1.4752 + 1.3057Xt + 0.9032t
   se = (13.18) (0.1765) (0.4203)
   t = (0.1119) (4.7230) (-2.1490)
   r² = 0.9632, d = 0.2046
   = 2.6755

3. Ŷt = 16.2181 + 1.9488Xt + 0.0079
   t = (-5.4891) (24.9868) (-15.9363)
   r² = 0.9947, d = 1.02
CORRECTING FOR PURE AUTOCORRELATION: THE METHOD OF GLS
Consider the two-variable regression model
Yt = β1 + β2Xt + ut …(2.1)
and assume that the error term follows the AR(1) scheme, namely
ut = ρut-1 + εt,   -1 < ρ < 1
We now consider two cases: (1) ρ is known, and (2) ρ is not known but has to be estimated.

1) ρ is known
If the coefficient of first-order autocorrelation is known, the problem of autocorrelation can be easily solved: since equation (2.1) holds true at time t, it also holds true at time (t-1). Hence
Yt-1 = β1 + β2Xt-1 + ut-1 …(2.2)
Multiplying equation (2.2) by ρ on both sides, we obtain
ρYt-1 = ρβ1 + ρβ2Xt-1 + ρut-1 …(2.3)
Subtracting eq. (2.3) from (2.1) gives
(Yt - ρYt-1) = β1(1 - ρ) + β2(Xt - ρXt-1) + εt …(2.4)
where εt = ut - ρut-1.
We can express eq. (2.4) as
Y*t = β*1 + β*2X*t + εt …(2.5)
where Y*t = (Yt - ρYt-1), X*t = (Xt - ρXt-1), β*1 = β1(1 - ρ) and β*2 = β2.
Regression (2.4) is known as the generalized, or quasi-, differenced equation. It involves regressing Y on X, not in the original form, but in a difference form obtained by subtracting a proportion (= ρ) of the value of a variable in the previous time period from its value in the current time period.
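A minimal sketch of this quasi-difference transformation when ρ is known (the data are simulated and ρ is assumed given; note that one observation is lost, unless a Prais-Winsten correction is applied to the first observation):

```python
import numpy as np

rng = np.random.default_rng(6)
n, rho = 200, 0.7
beta1, beta2 = 1.0, 2.0
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + rng.normal()     # AR(1) disturbances with known rho
y = beta1 + beta2 * x + u

# Quasi-differenced variables: Y*_t = Y_t - rho*Y_{t-1}, X*_t = X_t - rho*X_{t-1}
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]
# The transformed intercept column is (1 - rho), so its coefficient is beta1 itself.
Z = np.column_stack([np.full(n - 1, 1 - rho), x_star])
b1_gls, b2_gls = np.linalg.lstsq(Z, y_star, rcond=None)[0]
print(f"GLS estimates: beta1 = {b1_gls:.3f}, beta2 = {b2_gls:.3f}")
```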
2) ρ is not known:-
The Cochrane-Orcutt iterative procedure to estimate ρ:-
An alternative to estimating ρ from the Durbin-Watson d is the frequently used Cochrane-Orcutt method, which uses the estimated residuals to obtain information about the unknown ρ.
To explain the method, consider the two-variable model:
Yt = β1 + β2Xt + ut …..(3.1)
and assume that ut is generated by the AR(1) scheme, namely,
ut = ρut-1 + εt …..(3.2)
Cochrane and Orcutt then recommend the following steps to estimate ρ:
1) Estimate the two-variable model by the standard OLS routine and obtain the residuals ût.
2) Using the estimated residuals, run the following regression:
ût = ρ̂ût-1 + vt …..(3.3)
which is the empirical counterpart of the AR(1) scheme given previously.
3) Using ρ̂ obtained from eq. (3.3), run the generalized difference equation, namely,
(Yt - ρ̂Yt-1) = β1(1 - ρ̂) + β2(Xt - ρ̂Xt-1) + (ut - ρ̂ut-1)
or,
Y*t = β*1 + β*2X*t + u*t ……(3.4)
4) Since a priori it is not known whether the ρ̂ obtained from eq. (3.3) is the "best" estimate of ρ, substitute the values of β̂*1 and β̂*2 obtained from eq. (3.4) into the original regression [i.e., eq. (3.1)] and obtain new residuals, say û*t, as
û*t = Yt - β̂*1 - β̂*2Xt ….(3.5)
which can be easily computed since Yt, Xt, β̂*1 and β̂*2 are all known.
5) Now estimate the regression
û*t = ρ̂*û*t-1 + wt ….(3.6)
which is similar to eq. (3.3). Thus ρ̂* is the second-round estimate of ρ.
Since we do not know whether this second-round estimate of ρ is the best estimate of ρ, we can go on to the third-round estimate, and so on. As the preceding steps suggest, the Cochrane-Orcutt method is iterative. But how long should we go on? The general procedure is to stop carrying out the iterations when the successive estimates of ρ differ by a very small amount, say, by less than 0.01 or 0.005.
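The iteration is straightforward to code from the steps above (a sketch on simulated data; statsmodels' GLSAR with iterative_fit performs a closely related procedure):

```python
import numpy as np

def cochrane_orcutt(y, x, tol=0.005, max_iter=50):
    """Iterative Cochrane-Orcutt estimation for Y_t = b1 + b2*X_t + u_t with AR(1) errors."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]          # step 1: OLS on the original model
    rho = 0.0
    for _ in range(max_iter):
        u = y - X @ b                                  # residuals from the original model
        rho_new = np.sum(u[1:] * u[:-1]) / np.sum(u[:-1] ** 2)  # step 2: regress u_t on u_{t-1}
        y_star = y[1:] - rho_new * y[:-1]              # step 3: generalized difference regression
        x_star = x[1:] - rho_new * x[:-1]
        Z = np.column_stack([np.ones(n - 1), x_star])
        g = np.linalg.lstsq(Z, y_star, rcond=None)[0]
        b = np.array([g[0] / (1 - rho_new), g[1]])     # recover beta1 from beta1*(1 - rho)
        if abs(rho_new - rho) < tol:                   # stop when successive rho estimates settle
            break
        rho = rho_new
    return b, rho_new

# usage with simulated data (arbitrary parameter values)
rng = np.random.default_rng(7)
n = 300
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u
(b1, b2), rho_hat = cochrane_orcutt(y, x)
print(f"beta1 = {b1:.3f}, beta2 = {b2:.3f}, rho_hat = {rho_hat:.3f}")
```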
The Cochrane-Orcutt Two-Step Procedure:-
This is a shortened version of the iterative process. In step one we estimate ρ from the first iteration, that is, from regression (3.3), and in step two we use that estimate of ρ to run the generalized difference equation. Sometimes in practice this two-step method gives results quite similar to those obtained from the more elaborate iterative procedure discussed above.
Durbin's Two-Step Method of Estimating ρ:-
To illustrate this method, let us write the generalized difference equation equivalently as

Yt = β1(1 - ρ) + β2Xt - ρβ2Xt-1 + ρYt-1 + εt …..(3.7)

Durbin suggests the following two-step procedure for estimating ρ:
1) Treat eq. (3.7) as a multiple regression model, regressing Yt on Xt, Xt-1, and Yt-1, and treat the estimated value of the regression coefficient of Yt-1 (= ρ̂) as an estimate of ρ. Although biased, it provides a consistent estimate of ρ.
2) Having obtained ρ̂, transform the variables as Y*t = (Yt - ρ̂Yt-1) and X*t = (Xt - ρ̂Xt-1), and run the OLS regression on the transformed variables as in

Y*t = β*1 + β*2X*t + εt

where β*1 = β1(1 - ρ̂) and β*2 = β2.
MODEL MISSPECIFICATION
➢ One of the assumptions of the classical linear regression model (CLRM) is that the regression model used in the analysis is "correctly" specified: if the model is not correctly specified, we encounter the problem of model specification error or model specification bias.
MODEL SELECTION CRITERIA

➢ Be data admissible.
The model should reflect the data generating process, and predictions made from the model must be logically possible.
➢ Be consistent with theory.
It must make good economic sense.
➢ Have weakly exogenous regressors.
The explanatory variables, or regressors, must be uncorrelated with the error term.
➢ Exhibit parameter constancy.
The values of the parameters should be stable.
➢ Exhibit data coherency.
The residuals estimated from the model must be purely random.
➢ Be encompassing.
The model should encompass, or include, all rival models in the sense that it is capable of explaining their results.
➢ Be parsimonious.
The model should be compact.
TYPES OF SPECIFICATION ERROR
Assume that on the basis of the criteria just listed we arrive at a model that we accept as a good model. To be concrete, let this model be

Yi = β1 + β2Xi + β3X²i + β4X³i + u1i …(3.1)

where Y = total cost of production and X = output. Equation (3.1) is the familiar textbook example of the cubic total cost function.
But suppose that for some reason a researcher decides to use the following model:

Yi = α1 + α2Xi + α3X²i + u2i …(3.2)

Note that we have changed the notation to distinguish this model from the true model.
Since (3.1) is assumed true, adopting (3.2) would constitute a specification error, the error consisting of omitting a relevant variable (X³i) from the true model.
Cont…

Therefore, the error term u2i in (3.2) is in fact

u2i = u1i + β4X³i …(3.3)

Now suppose that another researcher uses the following model:

Yi = λ1 + λ2Xi + λ3X²i + λ4X³i + λ5X⁴i + u3i …(3.4)

If (3.1) is the truth, (3.4) also constitutes a specification error, the error here consisting of including an unnecessary or irrelevant variable, in the sense that the true model assumes λ5 to be zero. The new error term is in fact

u3i = u1i - λ5X⁴i …(3.5)

Now assume that another researcher postulates the following model:

ln Yi = γ1 + γ2Xi + γ3X²i + γ4X³i + u4i …(3.6)

In relation to the true model, (3.6) would also constitute a specification bias, the bias here being the use of the wrong functional form.
Cont…

Finally, consider the researcher who uses the following model:

Y*i = β*1 + β*2X*i + u*i …(3.7)

where Y*i = Yi + εi and X*i = Xi + wi, with εi and wi being the errors of measurement. What (3.7) states is that instead of using the true Yi and Xi we are using their proxies, Y*i and X*i, which may contain errors of measurement. Therefore, in (3.7) we commit the measurement error bias.
Another type of specification error relates to the way the stochastic error ui (or ut) enters the regression model. Consider, for instance, the following model without the intercept term:

Yi = βXiui …(3.8)

where the stochastic error term enters multiplicatively, with the property that ln ui satisfies the assumptions of the CLRM, against the following model

Yi = αXi + ui …(3.9)
Cont…

where the error term enters additively. Although the variables are the same in the two models, we have denoted the slope coefficient in (3.8) by β and the slope coefficient in (3.9) by α. Now if (3.8) is the correct or "true" model, would the estimated α provide an unbiased estimate of the true β? That is, will E(α̂) = β? If that is not the case, improper stochastic specification of the error term will constitute another source of specification error.
To sum up, in developing an empirical model one is likely to commit one or more of the following specification errors: omission of a relevant variable, inclusion of an unnecessary variable, adoption of the wrong functional form, errors of measurement, and incorrect specification of the stochastic error term.
CONSEQUENCES OF MODEL SPECIFICATION ERROR

➢ Under-fitting a Model (Omitting a Relevant Variable)

Suppose the true model is

Yi = β1 + β2X2i + β3X3i + ui

but for some reason we fit the following model:

Yi = α1 + α2X2i + vi

The consequences of omitting variable X3 are as follows (a simulation illustrating the first point appears after this list):

▪ If the left-out variable X3 is correlated with the included variable X2, that is, if r23, the correlation coefficient between the two variables, is nonzero, then α̂1 and α̂2 are biased as well as inconsistent. That is, E(α̂1) ≠ β1 and E(α̂2) ≠ β2, and the bias does not disappear as the sample size gets larger.
▪ Even if X2 and X3 are not correlated, α̂1 is biased, although α̂2 is now unbiased.
Cont…

▪ The disturbance variance σ² is incorrectly estimated.
▪ The conventionally measured variance of α̂2 is a biased estimator of the variance of the true estimator β̂2.
▪ In consequence, the usual confidence interval and hypothesis-testing procedures are likely to give misleading conclusions about the statistical significance of the estimated parameters.
▪ As another consequence, the forecasts based on the incorrect model and the forecast (confidence) intervals will be unreliable.
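A short simulation (a sketch; the coefficient values are arbitrary) makes the first consequence concrete: when the omitted X3 is correlated with X2, the OLS estimate of the coefficient on X2 converges to the wrong value, and increasing the sample size does not help:

```python
import numpy as np

rng = np.random.default_rng(8)
beta1, beta2, beta3 = 1.0, 2.0, 1.5

def fit_underfitted(n):
    x2 = rng.normal(size=n)
    x3 = 0.8 * x2 + rng.normal(size=n)        # X3 correlated with X2 (r23 != 0)
    y = beta1 + beta2 * x2 + beta3 * x3 + rng.normal(size=n)   # true model
    X = np.column_stack([np.ones(n), x2])     # fitted model omits X3
    return np.linalg.lstsq(X, y, rcond=None)[0][1]              # alpha2_hat

for n in (100, 10_000):
    est = np.mean([fit_underfitted(n) for _ in range(200)])
    print(f"n = {n:>6}: average alpha2_hat = {est:.3f}  (true beta2 = {beta2})")
# alpha2_hat converges to beta2 + beta3*cov(X2,X3)/var(X2) = 2 + 1.5*0.8 = 3.2, not to 2.
```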
Cont…

➢ Inclusion of an Irrelevant Variable (Over-fitting a Model)

Now let us assume that

Yi = β1 + β2X2i + ui

is the true model, but we fit the following model:

Yi = α1 + α2X2i + α3X3i + vi

The consequences of this specification error are as follows:

▪ The OLS estimators of the parameters of the incorrect model are all unbiased and consistent, that is, E(α̂1) = β1, E(α̂2) = β2, and E(α̂3) = 0.
▪ The error variance σ² is correctly estimated.
▪ The usual confidence interval and hypothesis-testing procedures remain valid.
▪ However, the estimated α's will generally be inefficient; that is, their variances will generally be larger than those of the β̂'s of the true model.
➢ Errors of Measurement
➢ Errors of measurement in the Dependent Variable Y
Consider the following model:

Y*i = α + βXi + ui …(3.12)

where Y*i = permanent consumption expenditure, Xi = current income, and ui = stochastic disturbance term.
Since Y*i is not directly measurable, we may use an observable expenditure variable Yi such that

Yi = Y*i + εi …(3.13)

where εi denotes errors of measurement in Y*i. Therefore, instead of estimating (3.12), we estimate
Cont…

Yi = α + βXi + (ui + εi) = α + βXi + vi …(3.14)

where vi = ui + εi is a composite error term, containing the population disturbance term and the measurement error term.
For simplicity, assume that E(ui) = E(εi) = 0, cov(Xi, ui) = 0 (which is the assumption of the classical linear regression model), and cov(Xi, εi) = 0, that is, the errors of measurement in Y*i are uncorrelated with Xi; and cov(ui, εi) = 0, that is, the equation error and the measurement error are uncorrelated. With these assumptions, it can be seen that β estimated from either (3.12) or (3.14) will be an unbiased estimator of the true β; that is, the errors of measurement in the dependent variable Y do not destroy the unbiasedness property of the OLS estimators. However, the variances and standard errors of β estimated from (3.12) and (3.14)
Cont…

will be different because, employing the usual formulas, we obtain

Model (3.12): var(β̂) = σ²u / Σx²i

Model (3.14): var(β̂) = σ²v / Σx²i = (σ²u + σ²ε) / Σx²i

Obviously, the latter variance is larger than the former. Therefore, although the errors of measurement in the dependent variable still give unbiased estimates of the parameters and their variances, the estimated variances are now larger than in the case where there are no such errors of measurement.
➢ Errors of Measurement in the Explanatory Variable X
Now assume that instead of (3.12), we have the following model:

Yi = α + βX*i + ui …(3.17)

where Yi = current consumption expenditure, X*i = permanent income, and ui = disturbance term.
Suppose instead of observing X*i we observe

Xi = X*i + wi …(3.18)

where wi represents errors of measurement in X*i. Therefore, instead of estimating (3.17), we estimate

Yi = α + β(Xi - wi) + ui
   = α + βXi + (ui - βwi)
   = α + βXi + zi …(3.20)

where zi = ui - βwi is a compound of the equation and measurement errors.
Now even if we assume that wi has zero mean, is serially independent, and is uncorrelated with ui, we can no longer assume that the composite error term zi is independent of the explanatory variable Xi, because [assuming E(zi) = 0]

cov(zi, Xi) = E[zi - E(zi)][Xi - E(Xi)]
            = E(ui - βwi)(wi)   [using (3.18)]
            = E(-βw²i)
            = -βσ²w

Thus, the explanatory variable and the error term in (3.20) are correlated, which violates the crucial assumption of the classical linear regression model that the explanatory variable is uncorrelated with the stochastic disturbance term. If this assumption is violated, it can be shown that the OLS estimators are not only biased but also inconsistent; that is, they remain biased even if the sample size n increases indefinitely.
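A brief simulation (a sketch with invented values) shows this inconsistency: the OLS slope is attenuated toward zero by the factor σ²X*/(σ²X* + σ²w), and increasing n does not remove the bias:

```python
import numpy as np

rng = np.random.default_rng(9)
beta, sigma_x, sigma_w = 2.0, 1.0, 1.0      # true slope, sd of true X*, sd of measurement error

def ols_slope(n):
    x_star = rng.normal(scale=sigma_x, size=n)           # true (unobserved) regressor X*
    x_obs = x_star + rng.normal(scale=sigma_w, size=n)    # observed X = X* + w
    y = 0.5 + beta * x_star + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x_obs])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

for n in (100, 100_000):
    print(f"n = {n:>7}: estimated slope = {ols_slope(n):.3f}")
print("plim of slope:", beta * sigma_x**2 / (sigma_x**2 + sigma_w**2))   # = 1.0, not 2.0
```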
