ASSOSA UNIVERSITY SCHOOL OF GRADUATE STUDIES
MBA PROGRAM
GROUP ASSIGNMENT
GILGEL BELES-CENTER
JUNE 26, 2014
PART I: SAY TRUE OR FALSE AND EXPLAIN
Heteroscedasticity can arise when observations that are either very small or very large relative to the other observations are present in the sample (outliers). Heteroscedasticity can also be caused by the omission of important variables from the model.
Heteroscedasticity is a problem because ordinary least squares (OLS) regression assumes that
all residuals are drawn from a population that has a constant variance (homoscedasticity). To
satisfy the regression assumptions and be able to trust the results, the residuals should have a
constant variance.
The t-distribution is defined by its degrees of freedom, which are related to the sample size. The t-distribution is most useful for small sample sizes, when the population standard deviation is not known, or both. As the sample size increases, the t-distribution becomes more similar to the standard normal distribution.
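A minimal sketch in Python (assuming scipy is installed; the degrees-of-freedom values are illustrative) of how the t-distribution's two-sided 5% critical value approaches the standard normal value of about 1.96 as the degrees of freedom grow:

from scipy import stats

for df in (5, 30, 100, 1000):
    # two-sided 5% critical value of the t-distribution with df degrees of freedom
    print(f"df={df:5d}: t critical = {stats.t.ppf(0.975, df):.4f}")

# the standard normal critical value the t values converge to
print(f"normal: z critical = {stats.norm.ppf(0.975):.4f}")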
Answers
QUESTION 1. A) Multicollinearity
Reasons
Multicollinearity occurs when independent variables in a regression model are highly correlated. Two variables are considered to be perfectly collinear if their correlation coefficient is +/-1.0. Multicollinearity among independent variables will result in less reliable statistical inferences. Multicollinearity exists whenever an independent variable is highly correlated with one or more of the other independent variables in the regression equation.
QUESTION 2. Answers
Multicollinearity is a problem when the correlations among the independent variables are strong. In some cases, multiple regression results may seem paradoxical. For instance, the model may fit the data well (high F-test) even though none of the X variables has a statistically significant impact on explaining Y. How is this possible? When two X variables are highly correlated, they both convey essentially the same information. When this happens, the X variables are collinear and the results show collinearity. Fortunately, there is a statistic called the Variance Inflation Factor (VIF) that measures multicollinearity in the model. Why is multicollinearity a problem?
Multicollinearity increases the standard errors of the coefficients. Increased standard errors in turn mean that the coefficients of some independent variables may be found not to be significantly different from 0, whereas without multicollinearity, and with lower standard errors, these same coefficients might have been found significant, and the researcher might not have come to null findings in the first place. In other words, multicollinearity misleadingly inflates the standard errors, making some variables statistically insignificant when they should otherwise be significant. It is like two or more people singing loudly at the same time: one cannot discern which voice is which; they offset each other. How can multicollinearity be detected?
Formally, variance inflation factors (VIF) measure how much the variances of the estimated coefficients are increased relative to the case of no correlation among the X variables. If no two X variables are correlated, then all the VIFs will be 1. If the VIF for one of the variables is around or greater than 5, there is collinearity associated with that variable. The easy solution is: if two or more variables have a VIF around or greater than 5, one of these variables must be removed from the regression model. The VIF for the j-th predictor is calculated using this formula:

VIF_j = 1 / (1 - R_j^2),

where R_j^2 is the R^2 obtained by regressing the j-th predictor on all the other predictors.
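As a minimal sketch (the data and the variable names x1, x2, x3 are hypothetical), VIFs can be computed with statsmodels; note how the nearly collinear pair x1, x2 receives large VIFs:

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)  # nearly collinear with x1
x3 = rng.normal(size=200)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

for i, name in enumerate(X.columns):
    if name != "const":  # the constant's VIF is not meaningful
        print(name, variance_inflation_factor(X.values, i))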
QUESTION 4.
One of the assumptions made about the residuals/errors in OLS regression is that the errors have the same but unknown variance. This is known as constant variance or homoscedasticity. When this assumption is violated, the problem is known as heteroscedasticity.
Consequences of Heteroscedasticity
• The OLS estimators, and regression predictions based on them, remain unbiased and
consistent.
• The OLS estimators are no longer the BLUE (Best Linear Unbiased Estimators)
because they are no longer efficient, so the regression predictions will be inefficient
too.
Heteroscedasticity can be detected with formal tests such as the following (a Breusch-Pagan example is sketched after the list):
1. Bartlett Test
2. Breusch Pagan Test
3. Score Test
4. F-Test
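A minimal sketch (synthetic data constructed so the error variance grows with x) of the Breusch-Pagan test using statsmodels; a small p-value indicates heteroscedasticity:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=200)
y = 2 + 3 * x + rng.normal(scale=x)  # error spread increases with x
X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid

lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(resid, X)
print(f"LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")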
QUESTION 5. A) Unbiased
The estimated parameters are unbiased estimators of the population parameters, so the expected or mean values of the estimated parameters are equal to the true population parameters:

E(beta_hat) = beta
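A minimal Monte Carlo sketch (synthetic data with a true slope of 2) illustrating unbiasedness: averaging the OLS slope estimates over many samples recovers the true parameter:

import numpy as np

rng = np.random.default_rng(0)
true_beta, slopes = 2.0, []
for _ in range(5000):
    x = rng.normal(size=50)
    y = 1.0 + true_beta * x + rng.normal(size=50)
    slopes.append(np.polyfit(x, y, 1)[0])  # OLS slope for this sample

print(np.mean(slopes))  # close to 2.0, i.e. E(beta_hat) = beta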
QUESTION 5. B) Autocorrelation
Autocorrelation refers to the degree of correlation between the values of the same variables
across different observations in the data. The concept of autocorrelation is most often
discussed in the context of the time series data in which observations occur at different points
in time (e.g., air temperature measured on different days of the month). For example, one might expect the air temperature on the 1st day of the month to be more similar to the temperature on the 2nd day than to the temperature on the 31st day. If the temperature values that occurred closer together in time are, in fact, more similar than the temperature values that occurred farther apart in time, the data are autocorrelated.
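A minimal sketch (the temperature values are made up for illustration) of measuring lag-1 autocorrelation by correlating the series with itself shifted by one day:

import numpy as np

temps = np.array([21.0, 22.5, 23.1, 22.8, 24.0, 25.2, 24.7, 23.9])
# correlate each day's temperature with the next day's
lag1 = np.corrcoef(temps[:-1], temps[1:])[0, 1]
print(f"lag-1 autocorrelation = {lag1:.3f}")  # positive: adjacent days are similar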
PART III
Answer. The coefficient of determination measures the proportion of variability in the values of the dependent variable (price) explained by its linear relation with the independent variables (BDR, Bath, Hsize, Lsize, Age and Poor):

R^2 = Explained variation / Total variation

The expression for the adjusted R^2 is

adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - k - 1),

where n is the sample size and k denotes the number of slope coefficients corresponding to the independent variables. Or, equivalently:

adjusted R^2 = 1 - [SSR / (n - k - 1)] / [TSS / (n - 1)],

where SSR is the sum of squared residuals and TSS is the total sum of squares.
The R^2 is 0.85. Our R^2 is between 0 and 1 inclusive, that is, 0 ≤ 0.85 ≤ 1. This means that 85% of the variability in home selling price is explained by its linear relationship with BDR, Bath, Hsize, Lsize, Age and Poor, and 15% of the variation is due to other factors or chance that are not part of the model. The adjusted R-square, a measure of explanatory power, is 0.8546. The standard error of the regression is 0.45, which is an estimate of the variation of the observed home prices about the regression line.
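A minimal sketch of the adjusted R^2 formula above; the sample size n is not reported in this excerpt, so the value passed below is purely illustrative:

def adjusted_r2(r2, n, k):
    # adjusted R^2 for n observations and k slope coefficients
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(0.85, n=200, k=6))  # n = 200 is a hypothetical sample size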
B) Ceteris paribus, we test whether the coefficient of each variable is significantly different from zero. We have calculated a t-statistic for each independent variable in the table above. If the t-statistic is greater than 1.96 or less than -1.96, the coefficient on that variable is significantly different from zero at the 5% level.
The coefficients of BDR, Bath, Hsize, Age and Poor are significantly different from zero, so we reject the null hypothesis and accept the alternative hypothesis at the 5% critical value. According to the results, the coefficient of Lsize is not statistically significantly different from zero, so we fail to reject the null hypothesis for Lsize.
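A minimal sketch of the 5% decision rule just applied; the coefficient and standard error values below are hypothetical, since the table of standard errors is not reproduced here:

def is_significant(coef, se, crit=1.96):
    # reject H0: coefficient = 0 when |t| = |coef / se| exceeds the critical value
    return abs(coef / se) > crit

print(is_significant(26.9, 8.9))     # True: |t| is about 3.0
print(is_significant(0.005, 0.004))  # False: |t| is 1.25, below 1.96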
C) The fitted equation, with Price measured in thousands of dollars and the estimated constant denoted b0, is

Predicted Price = b0 + 0.567 BDR + 26.9 Bath + 0.239 Hsize + 0.005 Lsize + 0.1 Age - 56.9 Poor
D) There are two types of variables in the model. In a study of price, multiple regression analysis is done to examine the relationship between home price and six potential predictors. These are the dependent variable, i.e., home price, and the independent variables, which include the six variables BDR, Bath, Hsize, Lsize, Age and Poor. Each independent variable has its own coefficient, which tells the nature of its relationship with the dependent variable, home price. The constant is the predicted value of home price when all of the independent variables have a value of zero.
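A minimal sketch (with synthetic data standing in for the real sample, which is not shown here; the data-generating coefficients below only loosely echo the estimates in the text) of fitting this specification with statsmodels' formula API:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "BDR": rng.integers(1, 6, n), "Bath": rng.integers(1, 4, n),
    "Hsize": rng.uniform(800, 3500, n), "Lsize": rng.uniform(1000, 9000, n),
    "Age": rng.integers(0, 60, n), "Poor": rng.integers(0, 2, n),
})
df["Price"] = (100 + 0.567 * df["BDR"] + 26.9 * df["Bath"] + 0.239 * df["Hsize"]
               + 0.005 * df["Lsize"] + 0.1 * df["Age"] - 56.9 * df["Poor"]
               + rng.normal(0, 40, n))  # Price in $1,000s; intercept 100 is made up

model = smf.ols("Price ~ BDR + Bath + Hsize + Lsize + Age + Poor", data=df).fit()
print(model.params)  # the Intercept is the predicted Price when all regressors are 0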
Generally, we interpret the estimated coefficient of each significant variable as follows:
1. We first look at the sign of each coefficient: the explanatory variables BDR, Bath, Hsize, Lsize and Age affect the dependent variable, home price, positively, while the independent variable Poor affects it negatively.
2. We then look at the magnitude of each coefficient, i.e., by how much a one-unit change in the explanatory variable changes the home price:
BDR: 0.567 × $1,000 = $567 (price is measured in units of $1,000).
A one-unit increase in BDR increases the home price by $567, ceteris paribus.
Bath: 26.9 × $1,000 = $26,900.
A one-unit increase in Bath increases the home price by $26,900, ceteris paribus.
Hsize: 0.239 × $1,000 = $239.
A one-unit increase in Hsize increases the home price by $239, ceteris paribus.
Lsize: 0.005 × $1,000 = $5.
A one-unit increase in Lsize increases the home price by $5, ceteris paribus.
According to the results, the effect of Lsize is only $5 on a home price measured in $1,000s, which is very low compared with the effects of the other explanatory variables, and it is insignificant.
Age: 0.1 × $1,000 = $100.
A one-unit increase in Age increases the home price by $100, ceteris paribus.
Poor: -56.9 × $1,000 = -$56,900.
A one-unit change in Poor decreases the home price by $56,900, ceteris paribus.
Of the six explanatory variables, the focus should be on the five significant ones: BDR, Bath, Hsize, Age and Poor. These variables have a significant effect on the dependent variable, home price; their coefficients are significantly different from zero, so we reject the null hypothesis and accept the alternative hypothesis. The coefficient of Lsize, whose effect is only $5 on a home price measured in $1,000s, is very low compared with the other explanatory variables and is insignificant, so we fail to reject the null hypothesis for Lsize, while the other variables remain significant.