Chap 5 MCQ

1. Which of the following assumptions are required to show the consistency, unbiasedness and efficiency of the OLS estimator?

(i) E(u_t) = 0
(ii) Var(u_t) = σ²
(iii) Cov(u_t, u_t-j) = 0 for all j ≠ 0
(iv) u_t ~ N(0, σ²)

(i), (ii), and (iii) only

All of the assumptions listed in (i) to (iii) are required to show that the OLS estimator has
the desirable properties of consistency, unbiasedness and efficiency. However, it is not
necessary to assume normality (iv) to derive the above results for the coefficient
estimates. This assumption is only required in order to construct test statistics that follow
the standard statistical distributions – in other words, it is only required for hypothesis
testing and not for coefficient estimation.
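
To illustrate the point that normality matters for inference rather than for estimation, here is a minimal Python sketch (the data-generating process and coefficient values are invented for this example). OLS is fitted to data with deliberately non-normal errors; the coefficient estimates still come out close to their true values, while only the exact distributions of the test statistics would rely on normality.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000

# Simulated regressor and markedly non-normal (demeaned chi-squared) errors
x = rng.normal(size=n)
u = rng.chisquare(df=3, size=n) - 3          # E(u) = 0, but skewed, non-normal
y = 1.0 + 2.0 * x + u                        # true intercept 1, true slope 2

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# Coefficient estimates remain close to the true values despite non-normal errors;
# only the exact finite-sample distribution of the t-statistics relies on normality.
print(res.params)        # approximately [1, 2]
```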

2. If a Durbin Watson statistic takes a value close to zero, what will be the value of the first
order autocorrelation coefficient?

Ans. Recall that the formula relating the value of the Durbin Watson (DW) statistic and the coefficient of first order autocorrelation, ρ, is:

DW ≈ 2(1 − ρ)


Thus, if DW is close to zero, the first order autocorrelation coefficient, ρ, must be close to +1. A value of ρ close to –1 would instead suggest strong negative autocorrelation, while answering "close to +1 or –1" would amount to saying that there is strong autocorrelation without knowing whether it is positive or negative. Such a situation would not happen in practice because DW can distinguish between the two: positive and negative autocorrelation result in completely different values of the DW statistic.
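
A short sketch of this relationship using the durbin_watson function from statsmodels (the AR(1) residual series is simulated purely for illustration); the implied first order autocorrelation coefficient can be backed out as ρ̂ ≈ 1 − DW/2.

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
n = 500

# Simulate strongly positively autocorrelated "residuals": u_t = 0.95 u_{t-1} + e_t
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.95 * u[t - 1] + e[t]

dw = durbin_watson(u)
rho_hat = 1 - dw / 2          # rearranging DW ≈ 2(1 - rho)
print(dw, rho_hat)            # DW close to 0, rho_hat close to +1
```
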
17. Near multicollinearity occurs when
Near multicollinearity is defined as the situation where there is a high, but not perfect,
correlation between two or more of the explanatory variables. There is no definitive
answer as to how big the correlation has to be before it is defined as “high”. If the
explanatory variables were highly correlated with the error terms, this would imply that
these variables were stochastic (and OLS optimality requires that they are not). If the
explanatory variables are highly correlated with the dependent variable, this would not
be multicollinearity, which only considers the relationship between the explanatory
variables.
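
In practice, near multicollinearity is usually diagnosed by inspecting the correlation matrix of the explanatory variables or their variance inflation factors (VIFs). A sketch using statsmodels, with data simulated so that x1 and x2 are highly, but not perfectly, correlated:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
n = 200

x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.1 * rng.normal(size=n)    # highly, but not perfectly, correlated with x1
x3 = rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2, x3]))

# Correlation between the two near-collinear explanatory variables
print(np.corrcoef(x1, x2)[0, 1])

# VIF for each regressor; large values (a common rule of thumb is 10) flag near multicollinearity
for i in range(1, X.shape[1]):
    print(variance_inflation_factor(X, i))
```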

Which of the following may be consequences of one or more of the CLRM assumptions
being violated?
(i) The coefficient estimates are not optimal

(ii) The standard error estimates are not optimal

(iii) The distributions assumed for the test statistics are inappropriate

(iv) Conclusions regarding the strength of relationships between the dependent and independent variables may be invalid.

If one or more of the assumptions is violated, either the coefficients could be wrong or
their standard errors could be wrong, and in either case, any hypothesis tests used to
investigate the strength of relationships between the explanatory and explained
variables could be invalid. So all of (i) to (iv) are true.

What is the meaning of the term “heteroscedasticity”?

By definition, heteroscedasticity means that the variance of the errors is not constant.

 Consider the following regression model

(2)
Suppose that a researcher is interested in conducting White’s heteroscedasticity test
using the residuals from an estimation of (2). What would be the most appropriate form
for the auxiliary regression?

(i)

(ii)

(iii)

(iv)

The first thing to think about is what should be the dependent variable for the auxiliary
regression. Two possibilities are given in the question: u_t² and u_t. Recall that the formula for the variance of any random variable u_t is

Var(u_t) = E[(u_t − E(u_t))²]

and that E(u_t) is zero, by the first assumption of the classical linear regression model. Therefore, the variance of the random variable simplifies to E[u_t²]. Thus, our proxy for the variance of the disturbances at each point in time t becomes the squared residual. Thus, answers c and d, which contain u rather than u², are both incorrect. The next issue is to determine what should be the
explanatory variables in the auxiliary regression. Since, in order to be homoscedastic,
the disturbances should have constant variance with respect to all variables, we could
put any variables we wished in the equation. However, White’s test employs the original
explanatory variables, their squares, and their pairwise cross-products. A regression
containing a lagged value of u as an explanatory variable would be appropriate for
testing for autocorrelation (i.e. whether u is related to its lagged values) but not for
heteroscedasticity. Thus (ii) is correct.
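
A sketch of White's test in Python (the regression and data below are made up, since equation (2) is not reproduced here). The het_white function in statsmodels builds exactly this kind of auxiliary regression of the squared residuals on the regressors, their squares and their cross-products, and reports the resulting LM statistic.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(3)
n = 300

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * (1 + np.abs(x1))    # heteroscedastic errors for illustration
y = 0.5 + 1.0 * x1 - 0.5 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
res = sm.OLS(y, X).fit()

# het_white regresses the squared residuals on x1, x2, their squares and cross-product
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(res.resid, X)
print(lm_stat, lm_pvalue)
```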

Consider the following regression model

(2)

Suppose that model (2) is estimated using 100 quarterly observations, and that a test of
the type described in question 4 is conducted. What would be the appropriate χ² critical value with which to compare the test statistic, assuming a 10% size of test?
The chi-squared distribution has only one degree of freedom parameter, which is the
number of restrictions being placed on the model under the null hypothesis. The null
hypothesis of interest will be that all of the coefficients in the auxiliary regression, except
the intercept, are jointly zero. Thus, under the correct answer to question 4, b, this would
mean that α2 through α6 were jointly zero under the null hypothesis. This
implies a chi-squared distribution with 5 degrees of freedom. The 10% critical value from
this distribution is approximately 9.24 and thus 4 is correct.
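
The critical value quoted above can be checked directly, for example with scipy:

```python
from scipy.stats import chi2

# 10% upper-tail critical value of a chi-squared distribution with 5 degrees of freedom
print(chi2.ppf(0.90, df=5))   # approximately 9.236
```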

What would be the consequences for the OLS estimator if heteroscedasticity is present in a
regression model but ignored?

Under heteroscedasticity, provided that all of the other assumptions of the classical
linear regression model are adhered to, the coefficient estimates will still be consistent
and unbiased, but they will be inefficient. Thus c is correct. The upshot is that whilst this
would not result in wrong coefficient estimates, our measure of the sampling variability
of the coefficients, the standard errors, would probably be wrong. The stronger the
degree of heteroscedasticity (i.e. the more the variance of the errors changed over the
sample), the more inefficient the OLS estimator would be.

Which of the following are plausible approaches to dealing with a model that exhibits
heteroscedasticity?
(i) Take logarithms of each of the variables

(ii) Use suitably modified standard errors

(iii) Use a generalised least squares procedure

(iv) Add lagged values of the variables to the regression equation.

(i), (ii) and (iii) are all plausible approaches to dealing with heteroscedasticity. The
choice as to which one should be used in practice will depend on the model being
constructed and also on which remedies are available to the researcher. Taking
logarithms of all of the variables and running a regression model using logarithms of the
variables instead of the raw variables themselves often works as a “cure” for
heteroscedasticity since this transformation can often turn a non-linear multiplicative
relationship between variables into a linear additive one. However, logs of a variable
cannot be taken if the variable could have negative or zero values for some
observations (since the log of a negative or zero number is not defined). An alternative
approach would be not to remove the heteroscedasticity, but instead to simply allow for
its presence. Since heteroscedasticity may imply that the standard errors are wrong, a
remedy would be to use White’s modification to the standard errors, which will lead to
standard errors that are robust in the presence of heteroscedasticity. A generalised least
squares (GLS) procedure could also be used. This would involve weighting or scaling
each observation, but feasible GLS must be used, which means that for this method to
apply, the form of the heteroscedasticity must be known. In practice, this is rarely the
case so that GLS is not often used in financial econometrics. Finally, it would not make
sense to add lagged values of the explanatory variables as a possible response to a
finding of heteroscedasticity. Adding the lagged values would probably mean that the
heteroscedasticity will still be there and unaccounted for, and taking lags is often an
approach to a finding of residual autocorrelation rather than heteroscedasticity.
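
Allowing for heteroscedasticity via White-type robust standard errors is a one-line change in most packages. A minimal sketch in statsmodels, again with simulated heteroscedastic data ('HC1' is one of several robust covariance options available):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 300
x = rng.normal(size=n)
y = 0.5 + 1.0 * x + rng.normal(size=n) * (1 + np.abs(x))   # heteroscedastic errors

X = sm.add_constant(x)

ols_res = sm.OLS(y, X).fit()                   # conventional OLS standard errors
robust_res = sm.OLS(y, X).fit(cov_type="HC1")  # same coefficients, White-type robust SEs

print(ols_res.bse)      # standard errors that ignore the heteroscedasticity
print(robust_res.bse)   # heteroscedasticity-robust standard errors
```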

Negative residual autocorrelation is indicated by which one of the following?

Negative residual autocorrelation implies a negative relationship between one residual and the immediately preceding or immediately following ones. This implies that, if
negative autocorrelation is present, the residuals will be changing sign more frequently
than they would if there were no autocorrelation. Thus negative autocorrelation would
result in an alternating pattern in the residuals, as they keep crossing the time axis. A
cyclical pattern would have arisen in the residuals if they were positively autocorrelated.
This would be since adjacent residuals would have the same sign more frequently than
would have been the case if there were no autocorrelation, resulting in the time series
plot of the residuals not crossing the time axis very often. A complete randomness in the
residuals would occur if there were no autocorrelation, while the residuals being all close
to zero could occur if there were significant autocorrelation in either direction or if there
were not significant autocorrelation!

Which of the following could be used as a test for autocorrelation up to third order?

The Durbin Watson test is one for detecting residual autocorrelation, but it is designed to
pick up first order autocorrelation (that is, a statistically significant relationship between a
residual and the residual one period ago). As such, the test would not detect third order
autocorrelation (that is, a statistically significant relationship between a residual and the
residual three periods ago). The Breusch-Godfrey test is also a test for autocorrelation,
but it takes a more general auxiliary regression approach, and therefore it can be used
to test for autocorrelation of an order higher than one. White’s test and the RESET tests
are not autocorrelation tests, but rather are tests for heteroscedasticity and appropriate
functional form respectively.
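
A sketch of the Breusch-Godfrey test for autocorrelation up to third order using statsmodels (the data are simulated just for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(5)
n = 400
x = rng.normal(size=n)

# Build errors with some serial correlation so that the test has something to find
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + e[t]

y = 1.0 + 2.0 * x + u
res = sm.OLS(y, sm.add_constant(x)).fit()

# nlags=3 tests the joint null of no autocorrelation up to third order
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(res, nlags=3)
print(lm_stat, lm_pvalue)
```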

Suppose that the Durbin Watson test is applied to a regression containing two explanatory
variables plus a constant (e.g. equation 2 above) with 50 data points. The test statistic takes a
value of 1.53. What is the appropriate conclusion?

The value of the test statistic is given at 1.53, so all that remains to be done is to find the
critical values. Recall that the DW statistic has two critical values: a lower and an upper
one. If there are 2 explanatory variables plus a constant in the regression, this would
imply that using my notation, k = 3 and k’ = 3-1 = 2. Thus, we would look in the k’=2
column for the lower and upper values, which would be in the row corresponding to n =
50 data points. The relevant critical values would be 1.40 and 1.63. Therefore, since the
test statistic falls between the lower and upper critical values, the result is in the
inconclusive region. We therefore cannot say from the result of this test whether first
order serial correlation is present or not.
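
The decision rule can be written out explicitly. In the sketch below, the lower and upper critical values are the ones quoted in the text (1.40 and 1.63 for k' = 2 and n = 50); the helper function is an illustration only, not a library routine.

```python
def dw_conclusion(dw, d_lower, d_upper):
    """Classify a Durbin Watson statistic against its lower/upper critical values."""
    if dw < d_lower:
        return "reject H0: evidence of positive first order autocorrelation"
    if dw > 4 - d_lower:
        return "reject H0: evidence of negative first order autocorrelation"
    if d_upper <= dw <= 4 - d_upper:
        return "do not reject H0: no evidence of first order autocorrelation"
    return "inconclusive region"

# Critical values for k' = 2, n = 50 as quoted in the text
print(dw_conclusion(1.53, 1.40, 1.63))   # falls between 1.40 and 1.63 -> inconclusive
```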

Suppose that a researcher wishes to test for autocorrelation using an approach based
on an auxiliary regression. Which one of the following auxiliary regressions would be
most appropriate?

(i)

(ii)

(iii)

(iv)

Residual autocorrelation is concerned with the relationship between the current (time t)
value of the residual and its previous values. Therefore, the dependent variable in the
auxiliary regression must be the residual itself and not its square. So (i) and (ii) are
clearly inappropriate. This also suggests that it should be lagged values of the residuals
that should be the regressors in the auxiliary regression and not any of the original
explanatory variables (the x’s).
 If OLS is used in the presence of autocorrelation, which of the following will be likely
consequences?

(i) Coefficient estimates may be misleading

(ii) Hypothesis tests could reach the wrong conclusions

(iii) Forecasts made from the model could be biased

(iv) Standard errors may be inappropriate

The consequences of autocorrelation are similar to those of heteroscedasticity. Thus the coefficient estimates will still be okay (i.e. they will be consistently and unbiasedly
estimated) provided that the other assumptions of the classical linear regression model
are valid. The OLS estimator will be inefficient in the presence of autocorrelation, which
implies that the standard errors could be sub-optimal. Since the standard errors may be
inappropriate in the presence of autocorrelation, it is true that hypothesis tests could
reach the wrong conclusion, since the t-test statistic contains the coefficient standard
error in it. As the parameter estimates should still be correct, forecasts obtained from the
model will only use the coefficients and not the standard errors, so the forecasts should
be unbiased. Therefore (ii) and (iv) are likely consequences and so a is correct.

Which of the following are plausible approaches to dealing with residual autocorrelation?

(i) Take logarithms of each of the variables

(ii) Add lagged values of the variables to the regression equation

(iii) Use dummy variables to remove outlying observations

(iv) Try a model in first differenced form rather than in levels.

(ii) and (iv)

Autocorrelation often arises as a result of dynamic (i.e. time-series) structure in the dependent variable that is not being captured by the model that has been estimated.
Such structure will end up in the residuals, resulting in residual autocorrelation.
Therefore, an appropriate response would be one that ensures that the model allows for
this dynamic structure. Either adding lagged values of the variables or using a model in
first differences will be plausible approaches. However, estimating a static model with no
lags in the logarithmic form would not allow for the dynamic structure in y and would
therefore not remove any residual autocorrelation that had been present. Taking logs is
often proposed as a response to heteroscedasticity or non-linearity, as detected by the
White and Ramsey tests respectively. Similarly, removing a small number of outliers will
also probably not remove the autocorrelation. Removing outliers using dummy variables
is often suggested as a response to residual non-normality.
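
A rough sketch of the 'add lags' remedy, with an invented data-generating process in which y adjusts only slowly to x. The static levels regression leaves the dynamics in the residuals, while including the lagged dependent variable soaks them up (the DW statistic is used purely descriptively here, since it is not strictly valid once a lagged dependent variable is included).

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(6)
n = 400
x = np.cumsum(rng.normal(size=n))            # a slowly-evolving regressor
y = np.zeros(n)
for t in range(1, n):
    # y adjusts only gradually to x, a classic source of dynamic structure
    y[t] = 0.7 * y[t - 1] + 0.3 * x[t] + rng.normal()

# Static model in levels: the dynamic structure ends up in the residuals
static = sm.OLS(y[1:], sm.add_constant(x[1:])).fit()

# Dynamic model: add the lagged dependent variable as a regressor
X_dyn = sm.add_constant(np.column_stack([y[:-1], x[1:]]))
dynamic = sm.OLS(y[1:], X_dyn).fit()

print(durbin_watson(static.resid))   # typically well below 2
print(durbin_watson(dynamic.resid))  # much closer to 2 once the lag is included
```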

Which of the following could result in autocorrelated residuals?


(i) Slowness of response of the dependent variable to changes in the values of the
independent variables

(ii) Over-reactions of the dependent variable to changes in the independent variables

(iii) Omission of relevant explanatory variables that are autocorrelated

(iv) Outliers in the data

(i), (ii) and (iii)

Autocorrelation often arises as a result of dynamic (i.e. time-series) structure in the dependent variable that is not being captured by the model that has been estimated.
This dynamic structure could result either from slow adjustment of the dependent
variable to changes in the independent variables or from over-adjustment of the
dependent variable to changes in the independent variables. The former may be termed
“under-reaction” and would result in positive residual autocorrelation, while the latter
may be termed “over-reaction” which would result in negative residual autocorrelation. It
is also the case that omitting a relevant explanatory variable (in other words, one that is
an important determinant of y) that is itself autocorrelated, will also result in residual
autocorrelation. Outliers in the data are unlikely to cause residual autocorrelation.
 Including relevant lagged values of the dependent variable on the right hand side of a regression
equation could lead to which one of the following?

Biased but consistent

Including lagged values of the dependent variable y will cause the assumption of the
CLRM that the explanatory variables are non-stochastic to be violated. This arises since
the lagged value of y is now being used as an explanatory variable and, since y at time
t-1 will depend on the value of u at time t-1, it must be the case that lagged values of y
are stochastic (i.e. they have some random influences and are not fixed in repeated
samples). The result of this is that the OLS estimator in the presence of lags of the
dependent variable will produce biased but consistent coefficient estimates. Thus, as the
sample size increases towards infinity, we will still obtain the optimal parameter
estimates, although these estimates could be biased in small samples. Note that no
problem of this kind arises whatever the sample size when using only lags of the
explanatory variables in the regression equation.
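
This small-sample bias is easy to see by simulation. The sketch below (with invented parameter values) repeatedly estimates y_t = φ y_{t-1} + u_t by OLS, with true φ = 0.5, for a small and a larger sample; the average estimate sits noticeably below 0.5 in the small sample and moves towards it as T grows, consistent with 'biased but consistent'.

```python
import numpy as np

rng = np.random.default_rng(7)

def ols_ar1_estimate(T, phi=0.5):
    """OLS estimate of phi in y_t = phi * y_{t-1} + u_t for one simulated sample."""
    y = np.zeros(T)
    u = rng.normal(size=T)
    for t in range(1, T):
        y[t] = phi * y[t - 1] + u[t]
    y_lag, y_now = y[:-1], y[1:]
    return np.sum(y_lag * y_now) / np.sum(y_lag ** 2)

for T in (20, 1000):
    estimates = [ols_ar1_estimate(T) for _ in range(1000)]
    # mean estimate below 0.5 for T = 20, very close to 0.5 for T = 1000
    print(T, np.mean(estimates))
```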

Which one of the following is NOT a plausible remedy for near multicollinearity?

Principal components analysis (PCA) is a plausible response to a finding of near multicollinearity. This technique works by transforming the original explanatory variables
into a new set of explanatory variables that are constructed to be orthogonal to one
another. The regression is then one of y on a constant and the new explanatory
variables. Another possible approach would be to drop one of the collinear variables,
which will clearly solve the multicollinearity problem, although there may be other
objections to doing this. Another approach would involve using a longer run of data.
Such an approach would involve increasing the size of the sample, which would imply
more information upon which to base the parameter estimates, and therefore a
reduction in the coefficient standard errors, thus counteracting the effect of the
multicollinearity. Finally, taking logarithms of the variables will not remove any near
multicollinearity.
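
A bare-bones sketch of the principal components idea using numpy (the collinear regressors are simulated; in applied work a dedicated PCA routine would usually be used). The constructed components are mutually orthogonal, so a regression of y on a constant and a subset of them does not suffer from near multicollinearity.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)     # near-collinear with x1
x3 = rng.normal(size=n)

X = np.column_stack([x1, x2, x3])
Xc = X - X.mean(axis=0)                      # centre the explanatory variables

# Principal components: project the centred data onto the right singular vectors
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Xc @ Vt.T                       # columns are mutually orthogonal by construction

# The new regressors are uncorrelated with one another (correlation matrix ~ identity)
print(np.round(np.corrcoef(components, rowvar=False), 6))
```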

What will be the properties of the OLS estimator in the presence of multicollinearity?

In fact, in the presence of near multicollinearity, the OLS estimator will still be consistent,
unbiased and efficient. This is the case since none of the four (Gauss-Markov)
assumptions of the CLRM have been violated. You may have thought that, since the
standard errors are usually wide in the presence of multicollinearity, the OLS estimator
must be inefficient. But this is not true – the multicollinearity will simply mean that it is
hard to obtain small standard errors due to insufficient separate information between the
collinear variables, not that the standard errors are wrong.
Which one of the following is NOT an example of mis-specification of functional form?

A “mis-specification of functional form” will occur when the researcher assumes a linear
model for the relationship between the explanatory variables and the explained variables
but a non-linear relationship is more appropriate. Clearly, then, a, b and c are all
examples of mis-specification of functional form since a linear model has been proposed
but in all three of these cases the true relationship between y and x is better
represented using a particular non-linear function of x. But, if a relevant variable that is
not a function of the included variables (i.e. a completely separate variable z) is
excluded, this will not cause a rejection of the null hypothesis for the Ramsey RESET
test.
If the residuals from a regression estimated using a small sample of data are not normally
distributed, which one of the following consequences may arise?
Only assumptions labelled 1-4 in the lecture material are required to show the
consistency, unbiasedness and efficiency of the OLS estimator, and not the assumption
that the disturbances are normally distributed. The latter assumption is only required for
hypothesis testing and not for optimally determining the parameter estimates. Therefore,
the only problem that may arise if the residuals from a small-sample regression are not
normally distributed is that the test statistics may not follow the required distribution. You
may recall that the normality assumption was in fact required to show that, when the
variance of the disturbances is unknown and has to be estimated, the t-statistics follow a
t-distribution.
By definition, a leptokurtic distribution is one that has fatter tails than a normal
distribution and is more peaked at the mean. In other words, a leptokurtic distribution will
have more of the "probability mass" in the tails and close to the centre, and less in the
shoulders of the distribution. Skewness is not a required characteristic of leptokurtic distributions.
Under the null hypothesis of a Bera-Jarque test, the distribution has which properties? The null hypothesis of the Bera-Jarque test is that the series is normally distributed, so under the null the distribution has zero skewness and a coefficient of kurtosis of 3 (i.e. zero excess kurtosis).
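
A quick sketch of checking these properties with the Bera-Jarque test as implemented in statsmodels (residual series simulated for illustration):

```python
import numpy as np
from statsmodels.stats.stattools import jarque_bera

rng = np.random.default_rng(9)

normal_resid = rng.normal(size=1000)
fat_tailed_resid = rng.standard_t(df=3, size=1000)   # leptokurtic: fatter tails than normal

for resid in (normal_resid, fat_tailed_resid):
    jb_stat, jb_pvalue, skew, kurtosis = jarque_bera(resid)
    # Under the null of normality: skewness close to 0 and kurtosis close to 3
    print(jb_stat, jb_pvalue, skew, kurtosis)
```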

Which one of the following would be a plausible response to a finding of residual non-normality?

As noted above, using dummy variables to remove a small number of large outliers from the sample is often suggested as a plausible response, since a few extreme residuals are typically what causes the normality assumption to be rejected.

If a relevant variable is omitted from a regression equation, the consequences would be that:
(i) The standard errors would be biased

(ii) If the excluded variable is uncorrelated with all of the included variables, all of the
slope coefficients will be inconsistent.

(iii) If the excluded variable is uncorrelated with all of the included variables, the intercept
coefficient will be inconsistent.

(iv) If the excluded variable is uncorrelated with all of the included variables, all of the
slope and intercept coefficients will be consistent and unbiased but inefficient.

If a relevant variable is omitted from a regression equation, then the standard conditions
for OLS optimality will not apply. These conditions implicitly assumed that the model was
correctly specified in the sense that it includes all of the relevant variables. If relevant
variables (that is, variables that are in fact important determinants of y) are excluded
from the model, the standard errors could be biased (thus (i) is true), and the slope coefficients will be inconsistently estimated unless the excluded variable is uncorrelated with all of the included explanatory variables - thus (ii) is wrong. If this condition holds, the slope estimates will be consistent, unbiased and efficient (so (iv) is wrong), but the intercept estimator will still be inconsistent (so (iii) is correct).
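
A simulation sketch of these results (all numbers invented): when the omitted variable z is correlated with the included regressor x, the slope on x is biased and inconsistent; when z is uncorrelated with x, the slope is fine but the intercept absorbs the mean contribution of z.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 100_000

x = rng.normal(size=n)
z_corr = 0.8 * x + rng.normal(size=n)        # omitted variable correlated with x
z_uncorr = 2.0 + rng.normal(size=n)          # omitted variable uncorrelated with x, mean 2

for z, label in ((z_corr, "correlated z"), (z_uncorr, "uncorrelated z")):
    y = 1.0 + 2.0 * x + 1.0 * z + rng.normal(size=n)   # true model includes z
    res = sm.OLS(y, sm.add_constant(x)).fit()           # z wrongly omitted
    print(label, res.params)
    # correlated z: slope far from its true value of 2 (biased/inconsistent)
    # uncorrelated z: slope close to 2, but intercept close to 1 + E(z) = 3, not 1
```
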
Which of the following consequences might apply if an explanatory variable in a regression
is measured with error?
(i) The corresponding parameter will be estimated inconsistently

(ii) The corresponding parameter estimate will be biased towards zero

(iii) The assumption that the explanatory variables are non-stochastic will be violated

(iv) No serious consequences will arise

When there is measurement error in an explanatory variable, all of (i) to (iii) could occur. So parameter estimation may be inconsistent (thus parameter estimates will not converge upon their true values even as the sample size tends to infinity), the corresponding parameter estimate will tend to be biased towards zero (the classical attenuation bias), and measurement error obviously implies noise in the explanatory variables, so that they will be stochastic.
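
A minimal sketch of the attenuation effect for the classical errors-in-variables case with a single regressor (all numbers invented): the more measurement noise is added to x, the further the estimated slope shrinks towards zero.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 100_000

x_true = rng.normal(size=n)
y = 1.0 + 2.0 * x_true + rng.normal(size=n)      # true slope is 2

for noise_sd in (0.0, 0.5, 1.0):
    x_observed = x_true + noise_sd * rng.normal(size=n)   # measurement error in x
    res = sm.OLS(y, sm.add_constant(x_observed)).fit()
    # Classical attenuation: plim of slope = 2 * var(x_true) / (var(x_true) + noise_sd**2)
    print(noise_sd, res.params[1])
```
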
 Which of the following consequences might apply if the explained variable in a regression is
measured with error?
(i) The corresponding parameter will be estimated inconsistently

(ii) The corresponding parameter estimate will be biased towards zero

(iii) The assumption that the explanatory variables are non-stochastic will be violated

(iv) No serious consequences will arise


In the case where the explained variable has measurement error, there will be no
serious consequences – the standard regression framework is designed to allow for this
as an error term also influences the value of the explained variable. This is in stark
contrast to the situation where there is measurement error in the explanatory variables,
which is a potentially serious problem because they are assumed to be non-stochastic.

Which of the following statements is TRUE concerning OLS estimation?

OLS minimises the sum of the squares of the vertical distances from the points to the
line. The reason that vertical rather than horizontal distances are chosen is due to the
set up of the classical linear regression model that assumes x is non-stochastic.
Therefore, the question becomes one of how to find the best fitting values of y given the
values of x. If we took horizontal distances, this would mean that we were choosing
fitted values for x, which wouldn’t make sense since x is fixed. The reason that squares
of the vertical distances are taken rather than the vertical distances themselves is that
some of the points will lie above the fitted line and some below, cancelling each other
out. Therefore, a criterion that minimised the sum of the distances would not give unique
parameter estimates since an infinite number of lines would satisfy this.
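
A short sketch of this idea: the OLS estimates come from solving the normal equations implied by minimising the sum of squared vertical distances, and they match what a standard least-squares routine produces (data simulated for illustration).

```python
import numpy as np

rng = np.random.default_rng(12)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])

# Minimising sum((y - X b)^2) over b gives the normal equations X'X b = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Same answer from numpy's least-squares routine
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)
print(beta_lstsq)
```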
