Assumptions of the OLS Method
$$E(U_i \mid X_i) = 0$$
▪ This assumption can be pictured as in the figure, which shows a few values of the variable X and the Y
populations associated with each of them.
▪ As shown, each Y population corresponding to a given X is distributed around its mean value (shown by
the circled points on the population regression function, PRF), with some Y values above the mean and
some below it.
▪ The distances above and below the mean values are nothing but the $U_i$, whose average for any given X
should be zero.
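The idea in the bullets above can be checked with a minimal sketch. The two Y populations below are hypothetical illustrative numbers (they are not read off the figure), and `conditional_mean` is a helper name introduced here only for illustration. The disturbances are measured from each conditional mean, so they average to zero by construction.

```python
# Hypothetical Y populations (weekly consumption) at two income levels X.
# These numbers are illustrative assumptions, not data from the figure.
populations = {
    80: [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
}


def conditional_mean(values):
    """Mean of the Y population at one X value, i.e. a point on the PRF."""
    return sum(values) / len(values)


mean_disturbance = {}
for x, ys in populations.items():
    mu = conditional_mean(ys)      # E(Y | X = x)
    u = [y - mu for y in ys]       # disturbances U_i = Y_i - E(Y | X_i)
    mean_disturbance[x] = sum(u) / len(u)
    print(f"X={x}: E(Y|X)={mu}, U={u}, mean of U={mean_disturbance[x]}")
```

Some disturbances are positive and some negative, but within each X group they cancel out on average.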
4. Homoscedasticity or equal variance of Ui.
▪ Given the value of X, the variance of 𝑈𝑖 is the same for all observations.
▪ That is, the conditional variances of $U_i$ are identical.
▪ Symbolically, we have,
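$$
\begin{aligned}
\operatorname{var}(U_i \mid X_i) &= E[U_i - E(U_i \mid X_i)]^2 \\
&= E(U_i^2 \mid X_i) && \text{since } E(U_i \mid X_i) = 0 \\
&= \sigma^2
\end{aligned}
$$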
▪ This equation states that the variance of $U_i$ for each $X_i$ is some positive constant number equal to
$\sigma^2$.
▪ Technically, this equation represents the assumption of homoscedasticity, that is, equal (homo) spread
(scedasticity) or equal variance.
▪ Put simply, the variation around the regression line (which is the line of average relationship between Y
and X) is the same across the X values; it neither increases nor decreases as X varies.
▪ Diagrammatically, the situation is as depicted in the following figure.
▪ In contrast, consider the next figure, where the conditional variance of the Y population varies with X.
▪ This situation is known appropriately as heteroscedasticity, or unequal spread, or variance.
▪ Symbolically, this situation can be written as
$$\operatorname{var}(U_i \mid X_i) = \sigma_i^2$$
▪ Note the subscript i on $\sigma_i^2$, which indicates that the variance of the Y population is no longer
constant.
▪ In both figures, let Y represent weekly consumption expenditure and X weekly income.
▪ Both figures show that as income increases, average consumption expenditure also increases.
▪ But in the homoscedasticity figure, the variance of consumption expenditure remains the same at all
levels of income, whereas in the heteroscedasticity figure it increases as income increases.
▪ In other words, richer families on the average consume more than poorer families, but there is
also more variability in the consumption expenditure of the former.
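The contrast between the two figures can be sketched with a small simulation. Everything here is an assumption made for illustration: the income levels, the standard deviations, and the function name `simulate_spread` are hypothetical and do not come from the slides.

```python
import random
import statistics


def simulate_spread(heteroscedastic, n_per_level=2000, seed=0):
    """Return the sample variance of the disturbances U at each income level X.

    Assumed setup: U is drawn from a normal distribution with mean zero.
    Under homoscedasticity its standard deviation is fixed at 10;
    under heteroscedasticity it grows in proportion to X (0.05 * X).
    """
    rng = random.Random(seed)
    variances = {}
    for x in (80, 160, 240):  # assumed weekly income levels
        sd = 0.05 * x if heteroscedastic else 10.0
        u = [rng.gauss(0.0, sd) for _ in range(n_per_level)]
        variances[x] = statistics.variance(u)
    return variances


homo = simulate_spread(heteroscedastic=False)
hetero = simulate_spread(heteroscedastic=True)
print("homoscedastic var(U | X):  ", homo)
print("heteroscedastic var(U | X):", hetero)
```

In the first case the estimated variance stays close to the same constant at every income level; in the second it rises with income, which matches the richer-families example above.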
5. No autocorrelation between the disturbances.
▪ Given any two X values, $X_i$ and $X_j$ ($i \neq j$), the correlation between the corresponding
disturbances $U_i$ and $U_j$ is zero.
▪ Symbolically,
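$$
\begin{aligned}
\operatorname{cov}(U_i, U_j \mid X_i, X_j) &= E\{[U_i - E(U_i)] \mid X_i\}\{[U_j - E(U_j)] \mid X_j\} \\
&= E(U_i \mid X_i)(U_j \mid X_j) \\
&= 0
\end{aligned}
$$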
▪ In words, this equation postulates that the disturbances $U_i$ and $U_j$ are uncorrelated. Technically,
this is the assumption of no serial correlation, or no autocorrelation.
▪ This means that, given Xi, the deviations of any two Y values from their mean value do not
exhibit patterns such as those shown in Figure 3.6a and b.
▪ In Figure 3.6a, we see that the u’s are positively correlated: a positive u is followed by a positive
u, or a negative u by a negative u.
▪ In Figure 3.6b, the u’s are negatively correlated: a positive u is followed by a negative u, and vice
versa.
▪ If the disturbances (deviations) follow systematic patterns, such as those shown in Figure 3.6a
and b, there is auto- or serial correlation, and what Assumption 5 requires is that such
correlations be absent.
▪ Figure 3.6c shows that there is no systematic pattern to the u’s, thus indicating zero correlation.
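The three patterns described above can be imitated with a short simulation. This is only a sketch under an assumed first-order autoregressive scheme, u_t = rho * u_(t-1) + e_t, which the slides do not specify; `ar1_disturbances` and `lag1_correlation` are hypothetical helper names.

```python
import random


def ar1_disturbances(rho, n=5000, seed=1):
    """Generate disturbances u_t = rho * u_(t-1) + e_t with standard normal e_t."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        u.append(rho * u[-1] + rng.gauss(0.0, 1.0))
    return u


def lag1_correlation(u):
    """Sample correlation between each disturbance and the one before it."""
    a, b = u[1:], u[:-1]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (ss_a * ss_b) ** 0.5


r_pos = lag1_correlation(ar1_disturbances(0.8))    # pattern like Figure 3.6a
r_neg = lag1_correlation(ar1_disturbances(-0.8))   # pattern like Figure 3.6b
r_none = lag1_correlation(ar1_disturbances(0.0))   # pattern like Figure 3.6c
print(f"positive: {r_pos:.2f}, negative: {r_neg:.2f}, none: {r_none:.2f}")
```

Assumption 5 corresponds to the third case, where successive disturbances show a correlation close to zero.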
6. Zero covariance between 𝑈𝑖 and 𝑋𝑖
▪ This assumption states that the disturbance u and explanatory variable X are uncorrelated.
▪ Symbolically,
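$$
\begin{aligned}
\operatorname{cov}(U_i, X_i) &= E\{[U_i - E(U_i)][X_i - E(X_i)]\} \\
&= E[U_i (X_i - E(X_i))] && \text{since } E(U_i) = 0 \\
&= E(U_i X_i) - E(X_i) E(U_i) && \text{since } E(X_i) \text{ is nonstochastic} \\
&= E(U_i X_i) && \text{since } E(U_i) = 0 \\
&= 0
\end{aligned}
$$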
▪ We assumed that X and u (which may represent the influence of all the omitted variables) have
separate (and additive) influences on Y.
▪ But if X and u are correlated, it is not possible to assess their individual effects on Y.
▪ Thus, if X and u are positively correlated, X increases when u increases and it decreases when
u decreases.
▪ Similarly, if X and u are negatively correlated, X increases when u decreases and it decreases
when u increases. In either case, it is difficult to isolate the influence of X and u on Y.
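Why the correlation matters can be seen in a small simulation. This is a sketch under assumed values: the model Y = 1 + 2X + u, the construction X = Z + gamma * u, and the helper names `ols_slope` and `simulate_slope` are all hypothetical choices made for illustration.

```python
import random


def ols_slope(x, y):
    """OLS slope: sum of cross-deviations divided by sum of squared X deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_slope(gamma, n=20000, seed=2):
    """Estimate the slope when X = Z + gamma * u, so cov(X, u) = gamma.

    Assumed true model: Y = 1 + 2 * X + u, with Z and u independent
    standard normal draws.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [zi + gamma * ui for zi, ui in zip(z, u)]
    y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]
    return ols_slope(x, y)


slope_uncorrelated = simulate_slope(gamma=0.0)   # Assumption 6 holds
slope_positive = simulate_slope(gamma=1.0)       # X and u positively correlated
slope_negative = simulate_slope(gamma=-1.0)      # X and u negatively correlated
print(slope_uncorrelated, slope_positive, slope_negative)
```

With zero covariance the estimate settles near the assumed true slope of 2. With positive or negative correlation the estimate absorbs part of the effect of u and moves above or below 2, so the separate influence of X cannot be isolated.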
7. The number of observations n must be greater than the number of parameters to be estimated.
▪ Alternatively, the number of observations n must be greater than the number of explanatory
variables.
8. Variability in X values.
▪ The X values in a given sample must not all be the same. Technically, var (X) must be a finite
positive number.
▪ If all the X values are identical, then $X_i = \bar{X}$ (Why?) and the denominator of the OLS slope
estimator, $\sum (X_i - \bar{X})^2$, will be zero, making it impossible to estimate $\beta_2$ and
therefore $\beta_1$. Intuitively, we readily see why this assumption is important.
▪ The sample variance of X is
$$\operatorname{var}(X) = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$
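A minimal sketch of this assumption, using small made-up samples: `sample_variance` implements the formula above, and `ols_slope_or_none` is a hypothetical helper that returns `None` when the sum of squared deviations of X is zero, since the slope cannot then be computed.

```python
def sample_variance(x):
    """Sample variance: sum of squared deviations from the mean over n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    return sum((xi - mean_x) ** 2 for xi in x) / (n - 1)


def ols_slope_or_none(x, y):
    """OLS slope estimate, or None when X has no variation."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return None  # denominator is zero: the slope is not defined
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


y = [70, 65, 90, 95]              # hypothetical consumption values
x_varying = [80, 100, 120, 140]   # hypothetical incomes with variation
x_constant = [100, 100, 100, 100] # every family has the same income

print(sample_variance(x_varying), ols_slope_or_none(x_varying, y))
print(sample_variance(x_constant), ols_slope_or_none(x_constant, y))
```

When every X equals the sample mean, the variance is zero and no slope can be estimated, whatever the Y values are.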
9. The regression model is correctly specified.
◦ Alternatively, there is no specification bias or error in the model used in
empirical analysis.
◦ An econometric investigation begins with the specification of the econometric
model underlying the phenomenon of interest.
◦ Some important questions that arise in the specification of the model include
the following: (1) What variables should be included in the model? (2) What is
the functional form of the model? Is it linear in the parameters, the variables, or
both? (3) What are the probabilistic assumptions made about the Yi, the Xi,
and the ui entering the model?
10. There is no perfect multicollinearity.
▪ There are no perfect linear relationships among the explanatory variables.
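For a model with two explanatory variables, one way to see the problem is through the quantity sum(x2^2) * sum(x3^2) - (sum(x2 * x3))^2, computed from deviations about the means. It appears in the denominator of the OLS slope estimators and equals zero under a perfect linear relationship. The sketch below uses made-up numbers, and `collinearity_determinant` is a hypothetical helper name.

```python
def deviations(values):
    """Deviations of each value from the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def collinearity_determinant(x2, x3):
    """Determinant of the 2x2 matrix of sums of squares and cross-products
    of the two regressors, both measured as deviations from their means."""
    d2, d3 = deviations(x2), deviations(x3)
    s22 = sum(a * a for a in d2)
    s33 = sum(b * b for b in d3)
    s23 = sum(a * b for a, b in zip(d2, d3))
    return s22 * s33 - s23 ** 2


x2 = [1, 2, 3, 4, 5]
x3_perfect = [2, 4, 6, 8, 10]   # exactly 2 * x2: perfect multicollinearity
x3_other = [2, 3, 7, 8, 10]     # related to x2, but not perfectly

det_perfect = collinearity_determinant(x2, x3_perfect)
det_other = collinearity_determinant(x2, x3_other)
print(det_perfect, det_other)
```

A zero determinant means the separate coefficients on the two regressors cannot be computed, whereas any positive value, however small, still allows estimation.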