Econometrics Cha 4

Econometrics chapter four essential for exit exam and for general knowledges.

Uploaded by

abrishasha383
Copyright
© © All Rights Reserved
CHAPTER FOUR: MULTIPLE REGRESSION

ANALYSIS WITH QUALITATIVE INFORMATION

After studying this chapter, students will be able to explain and apply the following topics:
4.1 Describing Qualitative Information
4.2 Dummy as Independent Variables
4.3 Dummy as Dependent Variable
4.3.1 The Linear Probability Model (LPM)
4.3.2 The Logit and Probit Models
4.3.3 Interpreting the Probit and Logit Model Estimates

5/14/2023 Ayu
4.1 Describing Qualitative Information
• What is qualitative information?
• It describes qualities or characteristics. It is non-numerical information that we obtain or gather for a given variable.
• It is recorded as an indicator variable. The terms indicator, binary, dummy, two-category categorical, and dichotomous variable are used interchangeably.
• It is collected using questionnaires, interviews, or observation.
• In regression analysis the dependent variable can be influenced by variables that are essentially qualitative in nature, such as sex, race, color, religion, nationality, geographical region, political upheavals, and party affiliation.
• One way we could "quantify" such attributes is by constructing artificial variables that take on values of 1 or 0, with 1 indicating the presence (or possession) of that attribute and 0 indicating its absence.
• Variables that assume such 0 and 1 values are called dummy, indicator, binary, categorical, or dichotomous variables. Such variables are essentially a device to classify data into mutually exclusive categories.
• Dummy variables are a data-classifying device in that they divide a sample into various subgroups based on qualities or attributes and implicitly allow one to run individual regressions for each subgroup.
• The category that receives the value of zero is called the base, reference, or benchmark group, and all comparisons are made in relation to it.
• Dummy variables can be incorporated in regression models just as easily as quantitative variables.

Examples of qualitative variables:
• Gender may play a role in determining salary levels.
• Different ethnic groups may follow different consumption patterns.
• Educational levels can affect earnings from employment.
Other examples of qualitative variables are:
• Marital status (single, married, separated, divorced)
• Employment status (employed, unemployed)
• Union membership
• Owning a house
• Voting in elections (no, yes, undecided)
• Political party membership (Republican, Democrat, other)
4.2 The nature of dummy variables
• In regression analysis the dependent variable is frequently influenced not only by variables that can be readily quantified on some well-defined scale (e.g., income, output, prices, costs, height, and temperature), but also by variables that are essentially qualitative in nature (e.g., sex, race, color, religion, nationality, wars, earthquakes, strikes, political upheavals, and changes in government economic policy).
• For example, holding all other factors constant, female college professors are found to earn less than their male counterparts, and nonwhites are found to earn less than whites. This pattern may result from sex or racial discrimination.
• Since such qualitative variables usually indicate the presence or absence of a "quality" or an attribute, they can be quantified by dummy variables that take the value 1 or 0:
o In regression analysis, dummy variables are mainly used to capture qualitative attributes or characteristics.
o Dummy variables thus sort the data into mutually exclusive categories.
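The construction of a 0/1 dummy from a qualitative attribute can be sketched as follows. This is only an illustration: the records and the helper name `make_dummy` are hypothetical and do not come from the chapter.

```python
# Hypothetical observations of a qualitative variable (sex); not data from the chapter.
sex = ["male", "female", "female", "male"]


def make_dummy(values, category):
    """Return a list of 0/1 values: 1 where the observation equals `category`, else 0."""
    return [1 if v == category else 0 for v in values]


# D = 1 for male, D = 0 for female, so female is the base (reference) group.
d_male = make_dummy(sex, "male")
print(d_male)
```

Each observation falls in exactly one category, which is what makes the categories mutually exclusive.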
4.3. Dummy as Independent Variables
A regression model may contain regressors that are all exclusively dummy, or qualitative, in nature. Such models are called analysis of variance (ANOVA) models.
The significance of the difference between the means of two samples can be judged through either the z-test or the t-test. When we want to examine the significance of the differences among more than two sample means at the same time, the ANOVA technique enables us to perform this simultaneous test.
On the other hand, regression models containing a mixture of quantitative and qualitative variables are called analysis of covariance (ANCOVA) models.
The interpretation of dummy variables remains the same in both the ANCOVA and ANOVA models.
4.3.1 Regression with only qualitative variables
As a matter of fact, a regression model may contain explanatory variables that are exclusively dummy, or qualitative, in nature.
Example: Consider the following model for the salary of a college professor as a function of gender:
Yi = α + βDi + ui -----------------------------(1)
where Yi = annual salary of a college professor
Di = 1 if male college professor
   = 0 otherwise (i.e., female professor)
α = intercept term and β = coefficient of the dummy.
Model (1) may enable us to find out whether gender makes any difference in a college professor's salary, assuming that all other variables such as age, degree attained, and years of experience are held constant.
• The coefficient β tells by how much the mean salary of a male college professor differs from that of a female college professor.
If D = 0, E(Y | D = 0) = α
If D = 1, E(Y | D = 1) = α + β
• Thus, the difference between the two groups (in mean values of Y) is: E(Y | D = 1) − E(Y | D = 0) = β
• The significance of this difference is tested by a t-test of β = 0.
• Therefore, the mean salary of a female college professor is E(Yi | Di = 0) = α, and the mean salary of a male college professor is E(Yi | Di = 1) = α + β. Female is the base category, in the sense that comparisons are made with that category.
• The coefficient attached to the dummy variable D can be called the differential intercept coefficient, because it tells by how much the intercept of the category that receives the value of 1 differs from the intercept of the base category.
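The statement that α is the mean of the base group and β is the difference in group means can be checked numerically. The sketch below uses made-up salaries (an assumption, not data from the chapter) and the textbook formulas for simple OLS; the function name `ols_simple` is hypothetical.

```python
from statistics import mean

# Hypothetical salaries (in thousands); D = 1 for male, D = 0 for female.
salary = [30.0, 34.0, 32.0, 40.0, 44.0, 42.0, 46.0]
d = [0, 0, 0, 1, 1, 1, 1]


def ols_simple(x, y):
    """OLS intercept and slope for y = alpha + beta * x + u."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    return alpha, beta


alpha_hat, beta_hat = ols_simple(d, salary)

# Group means computed directly, for comparison with the OLS estimates.
mean_female = mean([s for s, di in zip(salary, d) if di == 0])
mean_male = mean([s for s, di in zip(salary, d) if di == 1])

print(alpha_hat, mean_female)             # intercept vs. mean of the base group
print(beta_hat, mean_male - mean_female)  # dummy coefficient vs. difference in means
```

With a single dummy regressor the two pairs of numbers coincide up to rounding, which is why model (1) is simply a comparison of two group means.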
• If the estimator of β is positive and statistically significant, the average salary of male college professors exceeds the average salary of female college professors by an amount equal to the estimate of β.
• On the other hand, if the estimator of β is negative and statistically significant, the average salary of female college professors exceeds that of male college professors by the absolute value of the estimate of β.
• If the estimator of β is statistically insignificant, there is no statistically significant difference between the average salaries of female and male college professors.
Example 4.1:
1. Suppose the following regression of income (in thousands) is estimated for 22 accountants (standard errors are given in parentheses):
Ŷ = 35.20 + 10.25D
    (43.82)  (3.45)
o where Y is income in thousands and D is a dummy variable taking the value 1 if male and 0 if female. Answer the following questions:
A. Find the average salary of a male accountant.
B. Find the average salary of a female accountant.
C. Find the difference in average salary between male and female accountants (as heads of household).
D. Test whether the differential intercept (the coefficient of the dummy) is statistically significant.
E. Interpret the results.
Solution:
A. The estimated mean salary of a male accountant is the sum of the intercept and the coefficient of the dummy (i.e., α̂ + β̂ = 35.20 + 10.25 = 45.45, or $45,450).
B. The intercept term gives the estimated mean salary of a female accountant (i.e., α̂ = 35.20, or $35,200).
C. The difference between males and females is given by the coefficient of the dummy variable, which equals 10.25 (= 45.45 − 35.20), or $10,250.
D. The t-statistic shows that the gender differential is statistically significant: the t-calculated value, β̂/se(β̂) = 10.25/3.45 = 2.97, is greater than the t-tabulated value at the 5% significance level for a two-tailed test (about 2.086 with 22 − 2 = 20 degrees of freedom). Since β̂ is positive and statistically significant, the average salary of male accountants exceeds the average salary of female accountants by an amount equal to the estimate of β (10.25 thousand).
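The arithmetic in this solution can be reproduced with a few lines. This is a sketch that treats 3.45 as the standard error of the dummy coefficient, as the solution does; the critical value 2.086 is taken from a standard t table for 20 degrees of freedom, and the variable names are illustrative.

```python
# Estimates reported in Example 4.1 (income in thousands).
alpha_hat = 35.20   # intercept: mean income of the base group (female)
beta_hat = 10.25    # coefficient of the male dummy
se_beta = 3.45      # standard error of the dummy coefficient, as used in the solution
n_obs = 22

mean_female = alpha_hat
mean_male = alpha_hat + beta_hat

# t-test of H0: beta = 0 with n - 2 degrees of freedom.
t_stat = beta_hat / se_beta
deg_freedom = n_obs - 2
t_critical_5pct = 2.086  # two-tailed 5% value for 20 df, from a standard t table
is_significant = abs(t_stat) > t_critical_5pct

print(round(mean_male, 2), round(mean_female, 2))
print(round(t_stat, 2), deg_freedom, is_significant)
```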
Figure 4.1: Average salary as shown by dummy regressors
4.3.2 Regression on one quantitative variable and one qualitative variable with two classes, or categories
• Consider the model: Yi = α1 + α2Di + βXi + ui ----------------(3)
where Yi = annual salary of a college professor
Xi = years of teaching experience
Di = 1 if male
   = 0 otherwise
• Model (3) contains one quantitative variable (years of teaching experience) and one qualitative variable (gender) that has two classes (or levels, classifications, or categories), namely, male and female.
What is the meaning of this equation?
• Assuming, as usual, that E(ui) = 0, we see that
o Mean salary of a female college professor:
E(Yi | Xi, Di = 0) = α1 + βXi ----------------------------------(4)
o Mean salary of a male college professor:
E(Yi | Xi, Di = 1) = (α1 + α2) + βXi -----------------(5)
o Geometrically, we have the situation shown in the figure below (for illustration, it is assumed that α1 > 0).
• Model (3) postulates that the male and female college professors' salary functions in relation to years of teaching experience have the same slope (β) but different intercepts.
• In other words, it is assumed that the level of the male professor's mean salary is different from that of the female professor's mean salary (by α2), but the rate of change in the mean annual salary by years of experience is the same for both sexes.
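Example 4.2

. reg wage sexfemale educ

      Source |       SS       df       MS              Number of obs =      99
-------------+------------------------------           F(  2,    96) =    7.65
       Model |  2497908.51     2  1248954.25           Prob > F      =  0.0008
    Residual |  15676652.1    96   163298.46           R-squared     =  0.1374
-------------+------------------------------           Adj R-squared =  0.1195
       Total |  18174560.6    98    185454.7           Root MSE      =   404.1

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   sexfemale |  -165.5421   81.60353    -2.03   0.045    -327.5239   -3.560389
        educ |   59.61774   17.04221     3.50   0.001     25.78921    93.44627
       _cons |   338.5187   244.7084     1.38   0.170    -147.2236     824.261
------------------------------------------------------------------------------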
1. Assume the regression result above comes from a model of the form of equation (3), with Y being the hourly wage rate, D a dummy for women (sexfemale = 1 if female, 0 if male), and X years of schooling. The dependent variable is expressed in US dollars ($). Standard errors are given in parentheses:
Ŷ = 338.5 − 165.5D + 59.6X
   (244.7)  (81.6)   (17.04)
A. Find the slope and the intercept of the wage function for males and for females.
B. Find the difference in average hourly wages between males and females (as heads of household).
C. Interpret the estimated coefficients and the model result.
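One way to read the answers off the Stata output is sketched below. It assumes that `sexfemale` equals 1 for women and 0 for men, which is what the variable name suggests; the function name `predicted_wage` and the 12 years of schooling used in the comparison are illustrative choices.

```python
# Coefficients and standard error copied from the Stata output of Example 4.2.
b_cons = 338.5187
b_female = -165.5421   # assumed: sexfemale = 1 for women, 0 for men
b_educ = 59.61774
se_female = 81.60353


def predicted_wage(female, educ):
    """Fitted wage for a given value of the dummy and years of schooling."""
    return b_cons + b_female * female + b_educ * educ


# Same slope for both groups, different intercepts.
intercept_male = b_cons
intercept_female = b_cons + b_female
slope_both = b_educ

# The wage gap at any common level of schooling equals the dummy coefficient.
gap_at_12_years = predicted_wage(1, 12) - predicted_wage(0, 12)

# t-ratio of the dummy coefficient, matching the t column of the output.
t_female = b_female / se_female

print(round(intercept_male, 4), round(intercept_female, 4), slope_both)
print(round(gap_at_12_years, 4), round(t_female, 2))
```

Under that assumption, women are estimated to earn about 165.5 less than men with the same schooling, and each extra year of schooling adds about 59.6 for both groups.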
4.3.3 Regression on one quantitative variable and one qualitative variable with more than two classes
Suppose that, on the basis of cross-sectional data, we want to regress the annual expenditure on health care by an individual on the income and education of the individual.
Since the variable education is qualitative in nature, suppose we consider three mutually exclusive levels of education: less than high school, high school, and college.
Now, unlike the previous case, we have more than two categories of the qualitative variable education.
Therefore, following the rule that the number of dummies be one less than the number of categories of the variable (m − 1), we should introduce two dummies to take care of the three levels of education.
• Assuming that the three educational groups have a common slope but different intercepts in the regression of annual expenditure on health care on annual income, we can use the following model:
Yi = α1 + α2D2i + α3D3i + βXi + ui ----------(6)
where Yi = annual expenditure on health care
Xi = annual income
D2 = 1 if high school education
   = 0 otherwise
D3 = 1 if college education
   = 0 otherwise
• Note: The intercept α1 reflects the "less than high school education" category, which is the base category.
• The differential intercepts α2 and α3 tell by how much the intercepts of the other two categories differ from the intercept of the base category, which can be readily checked as follows:
• Assuming E(ui) = 0, we obtain
E(Yi | D2 = 0, D3 = 0, Xi) = α1 + βXi
E(Yi | D2 = 1, D3 = 0, Xi) = (α1 + α2) + βXi
E(Yi | D2 = 0, D3 = 1, Xi) = (α1 + α3) + βXi
which are, respectively, the mean health care expenditure functions for the three levels of education, namely, less than high school, high school, and college. Geometrically, the situation is shown in fig 1.2 (for illustrative purposes it is assumed that α3 > α2).
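The m − 1 rule for a three-category variable can be illustrated with a short sketch. The coefficient values passed to `mean_expenditure` are hypothetical (the chapter reports no estimates for this model), and both function names are illustrative.

```python
# Education levels; the first one is the base category and gets no dummy.
LEVELS = ["less than high school", "high school", "college"]


def education_dummies(level):
    """Return (D2, D3) for a given education level, using m - 1 = 2 dummies."""
    if level not in LEVELS:
        raise ValueError(f"unknown education level: {level}")
    return tuple(1 if level == name else 0 for name in LEVELS[1:])


def mean_expenditure(level, income, a1, a2, a3, beta):
    """E(Y | D2, D3, X) = a1 + a2*D2 + a3*D3 + beta*X, as in model (6)."""
    d2, d3 = education_dummies(level)
    return a1 + a2 * d2 + a3 * d3 + beta * income


# Hypothetical coefficients, chosen only to illustrate the parallel lines.
coef = dict(a1=100.0, a2=50.0, a3=120.0, beta=0.05)
for level in LEVELS:
    print(level, education_dummies(level), mean_expenditure(level, 1000.0, **coef))
```

At any common income, the gap between a category and the base category equals that category's differential intercept.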
Illustrative Example

4.3.4 Regression on one quantitative variable and two qualitative variables
• The technique of dummy variables can be easily extended to handle more than one qualitative variable.
• Let us revert to the college professors' salary regression, but now assume that in addition to years of teaching experience and sex, the skin color of the teacher is also an important determinant of salary.
• For simplicity, assume that color has two categories, black and white, and that sex has two categories, male and female. We can now write:
Yi = α1 + α2D2i + α3D3i + βXi + ui ------(7)
where Yi = annual salary
Xi = years of teaching experience
D2 = 1 if male
   = 0 otherwise
D3 = 1 if white
   = 0 otherwise
• Notice that each of the two qualitative variables, sex and color, has two categories and hence needs one dummy variable.
• Note also that the omitted, or base, category now is "black female professor." Assuming E(ui) = 0, we can obtain the following regressions from equation (7):
Mean salary for black female professor:
E(Yi | D2 = 0, D3 = 0, Xi) = α1 + βXi
Mean salary for black male professor:
E(Yi | D2 = 1, D3 = 0, Xi) = (α1 + α2) + βXi
Mean salary for white female professor:
E(Yi | D2 = 0, D3 = 1, Xi) = (α1 + α3) + βXi
Mean salary for white male professor:
E(Yi | D2 = 1, D3 = 1, Xi) = (α1 + α2 + α3) + βXi
• Once again, it is assumed that the preceding regressions differ only in the intercept coefficient but not in the slope coefficient β.
Example
1. Suppose we run the regression of Y on four explanatory variables and a constant:
o Ŷ = 2736 + 12598D1 + 10969D2 + 5.197X1 + 10.562X2
o where Y is the price of the house,
o D1 = 1 if the house has a driveway, 0 if it does not,
o D2 = 1 if the house has a recreation room, 0 otherwise,
o X1 is the size of the garden and X2 is land rent.
Required: Calculate the expected price if the house has (i) no driveway and no recreation room, and (ii) a driveway and a recreation room, ceteris paribus. Interpret the coefficients of all explanatory variables.
Solution
I. If the house has no driveway (D1 = 0) and no recreation room (D2 = 0), its expected price is 2736 + 5.197X1 + 10.562X2, which equals $2736 when X1 = X2 = 0.
II. If the house has a driveway, its value will be, ceteris paribus, $12598 more.
III. If the house has a recreation room, its value will be, ceteris paribus, $10969 more.
IV. If the house has a driveway and a recreation room, its value will be, ceteris paribus, 12598 + 10969 = $23567 more.
V. Increasing the size of the garden by 1 square foot will increase the price of the house by $5.197, whether or not the house has a driveway or a recreation room.
VI. If land rent increases by one birr, the price of the house will rise by about $10.56, ceteris paribus.
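The ceteris paribus comparisons in this solution can be verified by evaluating the fitted equation directly. The function name `house_price` and the garden size and land rent values used in the last check are illustrative assumptions.

```python
def house_price(driveway, rec_room, garden_size, land_rent):
    """Fitted price from Y = 2736 + 12598*D1 + 10969*D2 + 5.197*X1 + 10.562*X2."""
    return (2736 + 12598 * driveway + 10969 * rec_room
            + 5.197 * garden_size + 10.562 * land_rent)


# Intercept: no driveway, no recreation room, and X1 = X2 = 0.
base_price = house_price(0, 0, 0, 0)

# Premiums relative to the base house, holding X1 and X2 fixed.
driveway_premium = house_price(1, 0, 0, 0) - base_price
rec_room_premium = house_price(0, 1, 0, 0) - base_price
both_premium = house_price(1, 1, 0, 0) - base_price

# One extra unit of garden size, at arbitrary (illustrative) values of the other inputs.
garden_effect = house_price(1, 0, 101, 50) - house_price(1, 0, 100, 50)

print(base_price, driveway_premium, rec_room_premium, both_premium)
print(round(garden_effect, 3))
```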
4.3.5 Dummy Variable Trap
If the regression contains a constant term, the number of dummy variables must be one less than the number of classes of each qualitative variable.
If all categories of a qualitative variable are included together with the intercept, there will be perfect multicollinearity and the regression cannot be estimated. This is called the dummy variable trap.
The trap arises because the dummy variables created for all categories (one-hot encoding) always sum to one, which is exactly the value of the constant term.
This means that one variable can be predicted perfectly from the others, so the individual coefficients cannot be estimated or interpreted.
• There are two ways to avoid the dummy variable trap.
• First, introduce as many dummy variables as the number of categories of that variable and omit the intercept term from the model: Yi = β1D1i + β2D2i + β3D3i + ui
• Second, include the intercept term and introduce only (m − 1) dummies, where m is the number of categories of the qualitative variable. The omitted category is the base group, and the coefficients attached to the dummy variables must always be interpreted in relation to the base, or reference, group.
• For example, suppose we want to look at the effect of location (Addis Ababa, Hawassa, Arba Minch) on a person's salary in thousands of Birr (Y). If Arba Minch is dropped, it becomes the base group and the model is:
• Multiple regression model: Y = β0 + β1D1 + β2D2 + e, where D1 and D2 are the dummies for the two remaining cities.
• To distinguish the two categories, male and female, we have introduced only one dummy variable Di. If Di = 1 always denotes a male, then when Di = 0 we know that it is a female, since there are only two possible outcomes.
• Hence, one dummy variable suffices to distinguish two categories.
• The general rule is this: if a qualitative variable has m categories, introduce only m − 1 dummy variables.
• In the above example, sex has two categories, and hence we introduced only a single dummy variable. If this rule is not followed, we shall fall into what might be called the dummy variable trap, that is, the situation of perfect multicollinearity.
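The perfect multicollinearity behind the dummy variable trap can be seen in a small numerical sketch. The sample below is hypothetical, and the helpers `gram` and `det` are illustrative; the point is that X'X is singular when an intercept is combined with a dummy for every category, so the OLS normal equations have no unique solution.

```python
# Hypothetical sample of five observations.
sex = ["male", "female", "female", "male", "female"]

# Columns: constant, D_male, D_female  (intercept plus a dummy for every category).
x_trap = [[1, int(s == "male"), int(s == "female")] for s in sex]
# Columns: constant, D_male  (intercept plus m - 1 = 1 dummy).
x_ok = [[1, int(s == "male")] for s in sex]


def gram(x):
    """Return X'X for a design matrix given as a list of rows."""
    k = len(x[0])
    return [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]


def det(m):
    """Determinant by cofactor expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


# In the trap design the two dummies add up to the constant column in every row.
dummies_sum_to_constant = all(row[1] + row[2] == row[0] for row in x_trap)

print(dummies_sum_to_constant)
print(det(gram(x_trap)))  # zero: X'X cannot be inverted
print(det(gram(x_ok)))    # non-zero: OLS is well defined
```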
4.3.6 ANOVA and ANCOVA MODELS
1. ANOVA stands for analysis of variance. It is a regression model in which the dependent variable is quantitative in nature, but all the explanatory variables are qualitative in nature (dummies).
There are two major types of ANOVA models:
• ANOVA model with one qualitative variable
• ANOVA model with two qualitative variables
2. ANCOVA stands for analysis of covariance. It is a regression model that contains a mixture of qualitative and quantitative variables.
• NB. The interpretation of dummy variables remains the same in both the ANCOVA and ANOVA models.
4.4 Dummy as Dependent Variable
• A qualitative response model describes situations in which the dependent variable in a regression equation represents a discrete choice and takes only a limited number of values. Equivalently, it is a model for a dependent variable whose range of values is substantively restricted.
• On occasion, the variable that we are trying to explain may be discrete rather than continuous.
• Models that involve such variables are called:
o Qualitative response models
o Discrete choice models
o Categorical dependent variable models
o Dummy dependent variable models
o Dichotomous dependent variable models
o Limited dependent variable models
• If the dependent variable of the model is a dummy, the usual OLS technique is no longer the most suitable one. Instead, the maximum likelihood estimation technique is generally used, because when the dependent variable is a dummy the objective is to find the probability of something happening for the given values of the regressors.
• In regression analysis, we often face a qualitative response (dependent) variable of the "yes" or "no" type.
• Discrete choice models dealing with such binary responses are called binary choice models.
• At this juncture, it is important to distinguish between:
o Binary choices: the dependent variable can take two values.
o Multiple choices: the dependent variable can take more than two values.
o Multinomial (unordered) choices: for example, work as a teacher, a clerk, self-employed, a professional, or a factory worker.
o Multinomial ordered choices: for example, strongly agree, agree, neutral, disagree.
• There are several types of such models. Some of them are:
o Linear probability model (LPM)
o Probit model
o Logit model
o Tobit (censored regression) model
o Heckman two-stage model, etc.
• Technically, it is possible to estimate binary choice models using OLS.
• Such a linear model for binary choices, estimated by OLS, is called the linear probability model (LPM).
• The primary objective in categorical response models is to explain how observations fall into each category.
• For example, in the labor market case we may wish to explain the labor force participation decision of a woman by linking the dependent variable to explanatory variables like age, education, marital status, etc.
Basic framework of binary models

4.4.1 The Linear Probability Model
In the 1960s and early 1970s the linear probability model was widely used, mainly because it can be easily estimated using multiple regression analysis.
It is a multiple regression model whose dependent variable is binary rather than continuous.
The term linear probability model comes from the fact that the right-hand side of the equation is linear, while the left-hand side is a probability.
Because the dependent variable Y is binary, the population regression function corresponds to the probability that the dependent variable equals 1 given the explanatory variables, Xs, i.e.
E(Y | X1, …, Xk) = P(Y = 1 | X1, …, Xk) = β0 + β1X1 + … + βkXk
Interpreting the coefficients of an LPM
o βj is the change in the probability that Y = 1 associated with a unit change in Xj, holding the other regressors fixed, i.e.
βj = ΔP(Y = 1 | X) / ΔXj
The regression coefficients in the LPM are estimated by OLS.
Heteroscedasticity-robust OLS standard errors can be used to construct confidence intervals and hypothesis tests.
Let Pi be the probability that Yi = 1 (probability of success); then 1 − Pi is the probability that Yi = 0 (probability of failure). Therefore:

Yi      Probability
0       1 − Pi
1       Pi
Advantages of the linear probability model
• It is easy to estimate, and its results are easy to interpret.
Drawbacks of LPM
I. The partial effect of any explanatory variable is constant. The dependent variable is discrete while the independent variables are a combination of discrete and continuous variables.
II. The disturbances are not normally distributed. Because the dependent variable Yi assumes only two values (0 or 1), the disturbances also take only two values for a given Xi; that is, the error term follows the Bernoulli distribution. As a result, ui is not normally distributed.
III. The disturbances are heteroscedastic. The variance of the error term is Var(ui) = Pi(1 − Pi). This shows heteroscedasticity because Pi = β1 + β2Xi, so the variance of ui depends on the value of Xi.
IV. R² as a measure of goodness of fit is questionable. Corresponding to given values of the regressors (X's), the dependent variable (Y) is either 0 or 1. Therefore, all the Y values lie either along the X axis or along the line corresponding to Y = 1, and generally no LPM is expected to fit such a scatter well. As a result, the computed R² is of limited value in dichotomous response models.
V. The restriction 0 ≤ E(Yi | Xi) ≤ 1 is not fulfilled: OLS estimation of the LPM gives no guarantee that the estimated probability lies between 0 and 1, because the probability increases linearly with the regressors. One can either constrain the fitted values of the LPM to lie between 0 and 1, or use estimation techniques other than OLS that guarantee the restriction. This is the real problem with OLS estimation of the LPM.
• It is this weakness that gives rise to better methods of estimating binary dependent variable models (the logit and probit models).
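Two of the drawbacks of the LPM, fitted probabilities outside the 0-1 interval and an error variance that changes with X, can be illustrated with a short sketch. The coefficients `b1` and `b2` and the X values are hypothetical and chosen only to make the problem visible; they are not estimates from the chapter.

```python
def lpm_probability(x, b1, b2):
    """Fitted probability from a linear probability model, P = b1 + b2 * x."""
    return b1 + b2 * x


def lpm_error_variance(p):
    """Variance of the LPM disturbance, Var(u) = P * (1 - P)."""
    return p * (1.0 - p)


# Hypothetical coefficients.
b1, b2 = -0.2, 0.1

fitted = {x: lpm_probability(x, b1, b2) for x in (0, 5, 10, 15)}

# X values whose fitted "probability" falls outside [0, 1].
out_of_range = [x for x, p in fitted.items() if p < 0.0 or p > 1.0]

# Error variance at the X values with admissible probabilities: it is not constant.
variances = {x: lpm_error_variance(p) for x, p in fitted.items() if 0.0 <= p <= 1.0}

print(fitted)
print(out_of_range)
print(variances)
```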
• What is needed is a probability model that has the following two features:
o As Xi increases, Pi = E(Y | X) = P(Y = 1 | X) increases but never steps outside the 0-1 interval.
o The relationship between Pi and Xi is nonlinear, that is, "one which approaches zero at slower and slower rates as Xi gets small and approaches one at slower and slower rates as Xi gets very large".
• Such an S-shaped curve closely resembles the cumulative distribution function (CDF) of a random variable.
• The CDF of a random variable X is simply the probability that it takes a value less than or equal to x₀, where x₀ is some specified numerical value of X.
• F(X), the CDF of X, is F(x₀) = P(X ≤ x₀).
4.4.2 The Binary Logit Model
• The logit model uses the cumulative logistic distribution function to transform the model so that the probabilities follow the S-shape described above.
• The binomial logit is an estimation technique for equations with dummy dependent variables that avoids the unboundedness problem of the linear probability model.
• The binomial logit model (BLM) is non-linear; it avoids the problem by using a variant of the cumulative logistic function:
Pi = 1 / (1 + e^(−Zi)), where Zi = β1 + β2Xi
Note that both the response and non-response probabilities lie in the interval [0, 1] and, hence, are interpretable.
Odds ratio: the ratio of the response probability (Pi) to the non-response probability (1 − Pi), that is, Pi / (1 − Pi).
• Li = ln[Pi / (1 − Pi)], the log of the odds ratio, is linear in X as well as in β (the parameters). L is called the logit, and hence the name logit model.
• Thus, the log-odds ratio is a linear function of the explanatory variables.
• For the LPM it is Pi itself that is assumed to be a linear function of the explanatory variables.
• If the odds ratio is equal to 1, then both outcomes have equal probability.
• If the odds ratio is equal to 2, then the outcome Yi = 1 is twice as likely as the outcome Yi = 0.
• The odds ratio is always non-negative.
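The link between the probability, the odds ratio, and the logit can be sketched in a few lines. The function names are illustrative; the formulas are the standard logistic function P = 1 / (1 + e^(−Z)) and its inverse, L = ln[P / (1 − P)].

```python
import math


def logistic(z):
    """Cumulative logistic function: maps any real Z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds in favor of Y = 1: P / (1 - P)."""
    return p / (1.0 - p)


def logit(p):
    """Log of the odds; the inverse of the logistic function."""
    return math.log(odds(p))


for z in (-3.0, 0.0, 1.7, 3.0):
    p = logistic(z)
    print(z, round(p, 4), round(odds(p), 4), round(logit(p), 4))
```

The last column reproduces Z, which shows that the log-odds are linear in whatever enters Z linearly, even though the probability itself is not.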
Features of the logit model
• As Pi goes from 0 to 1 (i.e., as Zi varies from −∞ to +∞), the logit Li goes from −∞ to +∞. Although the probabilities lie between 0 and 1, the logits are not so bounded.
• Although the logit is linear in X, the probabilities themselves are not. This property is in contrast with the LPM, where the probabilities increase linearly with X.
• If the logit Li is positive, the odds that Y = 1 increase as the value of the explanatory variable(s) increases. The logit becomes increasingly large and positive as the odds ratio increases from 1 to infinity, and increasingly large and negative as the odds ratio decreases from 1 to 0.
• The LPM assumes that Pi is linearly related to Xi; the logit model assumes that the log of the odds ratio is linearly related to Xi.
• Interpretation: Be reminded that we do not directly interpret the coefficients of the variables as effects on the probability; rather, we interpret their marginal effects.
• The coefficient β in the logit (non-linear) model is not a measure of the change in probability for a unit change in the covariate x. It is interpreted in terms of the log-odds or the odds ratio.
• The coefficient β measures the change in the log-odds for a unit change in a covariate. That is, a unit increase in X1 multiplies the odds by e^β1, which is an increase of approximately 100β1% in the odds when β1 is small.
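How a logit coefficient translates into a change in the odds can be sketched as follows. Since the log-odds are linear in X, a unit increase in X multiplies the odds by e^β; the values of β used here are hypothetical and the function names are illustrative.

```python
import math


def odds_multiplier(beta):
    """Factor by which the odds change for a one-unit increase in X."""
    return math.exp(beta)


def pct_change_in_odds(beta):
    """Exact percentage change in the odds for a one-unit increase in X."""
    return 100.0 * (math.exp(beta) - 1.0)


# Hypothetical coefficients: the 100*beta% rule is close only for small beta.
for beta in (0.05, 0.5, 1.0):
    print(beta, round(odds_multiplier(beta), 4),
          round(pct_change_in_odds(beta), 2), 100.0 * beta)
```

For β = 0.05 the exact change is close to 5%, while for β = 1 the exact change is far above 100%, so the percentage reading should be reserved for small coefficients.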
I. Individual data
• The standard errors are asymptotic; hence we use the Z statistic instead of the t statistic.
• The conventional R² is not meaningful in binary response models.
• The likelihood ratio (LR) test, a chi-square test with degrees of freedom equal to the number of regressors, plays the same role in the logit model as the F-test of joint significance in the multiple regression model.
II. Grouped data

Marginal Effect
• Reporting marginal effects instead of odds ratios is more popular in economics. In most applications, the primary goal is to explain the effect of Xj on the response probability Pr(Y = 1), not on the log-odds.
• The changes in probabilities (slopes) can be computed, though they are not constant, and are termed marginal effects.
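For the logit model the marginal effect of Xj on the probability is βj·P·(1 − P), the derivative of the logistic function with respect to Xj. The sketch below evaluates it at several values of X; the coefficients `beta_0` and `beta_1` are hypothetical and the function names are illustrative.

```python
import math


def logistic(z):
    """Cumulative logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect(beta_j, z):
    """dP/dXj in a logit model, evaluated at the index value z = X'beta."""
    p = logistic(z)
    return beta_j * p * (1.0 - p)


# Hypothetical logit coefficients for a single regressor.
beta_0, beta_1 = -2.0, 0.8

# The marginal effect changes with X: largest where P = 0.5, smaller in the tails.
effects = {x: logit_marginal_effect(beta_1, beta_0 + beta_1 * x)
           for x in (0.0, 2.5, 6.0)}
print(effects)

# Finite-difference check of the formula at one point.
z0, step = 0.7, 1e-6
numeric_slope = (logistic(z0 + step) - logistic(z0 - step)) / (2.0 * step)
print(numeric_slope, logit_marginal_effect(1.0, z0))
```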
• Each slope coefficient shows how the log of the odds in favor of the outcome changes as the value of the X variable changes by one unit.
• βi, the slope, measures the marginal effect of Xi on the log-odds in favor of Y = 1.
• The intercept β0 is the value of the log-odds when all Xi's are zero.
Merits of Logit Model
• Logit analysis produces statistically sound results. By transforming a dichotomous dependent variable into a continuous variable ranging from −∞ to +∞, the problem of out-of-range estimates is avoided.
• Logit analysis provides results that can be easily interpreted, and the method is simple to apply.
• It gives parameter estimates that are asymptotically consistent, efficient, and normal, so that the analogue of the regression t-test can be applied.
Demerits of Logit Model
Difference between Logit and LPM
• In the LPM the slope coefficient measures the marginal effect of a unit change in the explanatory variable on the probability of the outcome, holding other variables constant.
• In the logit model, the marginal effect of a unit change in the explanatory variable depends not only on the coefficient of that variable but also on the level of probability from which the change is measured.
• In the logit model, the marginal effect therefore depends on the values of all the explanatory variables in the model.
• The LPM assumes that Pi is linearly related to Xi, whereas the logit model assumes that the log of the odds ratio is linearly related to Xi.

4.4.3 The Probit Model
The probit model uses the cumulative normal distribution function; hence it is sometimes referred to as the normit model.
The probit model is similar to the logit model except that the logistic function is replaced by the normal distribution function.
The estimating model that emerges from the normal cumulative distribution function is popularly known as the probit model.
In the probit model, G is the standard normal cumulative distribution function (cdf), which is expressed as an integral: G(z) = Φ(z) = ∫ from −∞ to z of φ(v) dv, where φ(v) = (2π)^(−1/2) exp(−v²/2) is the standard normal density.
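The standard normal cdf used by the probit model can be evaluated with the error function and compared with the logistic cdf used by the logit model. In this sketch the rescaling factor 1.6 is a common rule of thumb, introduced here as an assumption, that puts the two curves on a comparable scale; the function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cdf, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_cdf(z):
    """Cumulative logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


SCALE = 1.6  # assumed rule-of-thumb factor relating the logit and probit index scales

for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, round(normal_cdf(z), 5), round(logistic_cdf(SCALE * z), 5))
```

Near the centre the two curves are close, while far from zero the rescaled logistic cdf approaches 0 and 1 more slowly than the normal cdf.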

• The latent variable Y* is assumed to be a linear function of the observed X's through the structural model.
• However, since the latent dependent variable is unobserved, the model cannot be estimated using OLS.
• Maximum likelihood can be used instead. The choice is between normal errors and logistic errors, resulting in the probit (normit) and logit models, respectively.
• Maximization of the likelihood function for either the probit or the logit model is accomplished by nonlinear estimation methods.
• Both models are used to predict an outcome variable that is categorical (which violates the assumption of linearity in ordinary regression) from one or more categorical or continuous explanatory variables.
Similarities between Logit and Probit Models
• Both models give qualitatively similar results.
• In both models, the sign of a coefficient can be interpreted directly but not its magnitude, because the two models have different coefficient scales.
• In both cases, as with the LPM, it is assumed that E[εi | Xi] = 0.
• Both the logit and probit models are S-shaped, non-linear response functions.
• Both the probit and the logit models are estimated by maximum likelihood estimation.
• Both the probit and logit models have the same basic structure:
o Estimate a latent variable Y* using a linear model.
o Y* ranges from negative infinity to positive infinity.
o Use a non-linear function to transform Y* into a predicted probability that Y = 1, which lies between 0 and 1.
Difference between Logit and Probit Models
 The main difference is that the logistic distribution has
slightly fatter tails (the conditional probability Pi approaches
zero or one at a slower rate in logit than in probit).
 In practice many researchers choose the logit model because of
its comparative mathematical simplicity.
 The parameters of the two models are scaled differently, i.e.,
the parameter estimates in logistic regression tend to be about
1.6 to 1.8 times as large as those in the corresponding probit model.
 If we assume that the error term follows a normal distribution,
the coefficients obtained by maximizing the likelihood (ML)
function are the coefficients of the probit model.
 If we assume that the error term follows a logistic distribution,
the coefficients obtained from the ML function are the
coefficients of the logit model.
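The scale difference can be checked against the logit and probit estimates reported in the worked example of this chapter (the regression of grade on gpa, ase and pc). The sketch below simply divides each logit coefficient by its probit counterpart; the dictionary names are illustrative.

```python
# Coefficients copied from the chapter's logit and probit output tables.
logit_coef = {"gpa": 2.826113, "ase": 0.0951577, "pc": 2.378688}
probit_coef = {"gpa": 1.62581, "ase": 0.0517289, "pc": 1.426332}

# Ratio of each logit estimate to the matching probit estimate.
ratios = {name: logit_coef[name] / probit_coef[name] for name in logit_coef}

for name, ratio in ratios.items():
    print(name, round(ratio, 2))
```

All three ratios fall between roughly 1.6 and 1.9, so the logit estimates are on a visibly larger scale even though the two models describe the same data.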
Difference and similarities between Logit and Probit Models
Both use a CDF (the cumulative logistic function, CLF, and the cumulative
normal function, CNF, respectively) to transform the model so that
the probabilities follow an S-shape, but they differ in the relative thickness
of the tails: the logit tails are relatively thicker than the probit tails. This
difference would, however, disappear as the sample size gets large.
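A rough numerical look at the tails: the sketch below compares the standard normal CDF at a few negative index values with the logistic CDF evaluated at the same values multiplied by 1.6. The factor 1.6 is an assumed, approximate logit-to-probit scale used only to put the two curves on a comparable footing.

```python
import math
from statistics import NormalDist

SCALE = 1.6  # assumed rough logit-to-probit scale factor, for comparison only

def logistic_cdf(z):
    """Logistic CDF used by the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-1.0, -2.0, -3.0, -4.0):
    probit_p = NormalDist().cdf(z)
    logit_p = logistic_cdf(SCALE * z)
    print(z, probit_p, logit_p)
```

Far from zero, the logit probability stays noticeably larger than the probit one, which is what "fatter tails" means in practice.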
How to interpret coefficients in both models?
 In both the logit and probit models:
 If β > 0, P increases as X increases.
 If β < 0, P decreases as X increases.
 β cannot be interpreted as a simple slope as in
ordinary regression, because the rate at which the curve
ascends or descends changes with the value
of X. In other words, the change in P is not constant as in
ordinary regression.
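For the logit model, the slope of the probability curve with respect to X equals β·P·(1 − P), so it depends on where on the curve it is evaluated. The sketch below uses the GPA logit coefficient from this chapter's example as β and three arbitrarily chosen probability levels.

```python
def logit_marginal_effect(beta, p):
    """Slope dP/dX of the logit curve at a point where the probability is p."""
    return beta * p * (1.0 - p)

beta_gpa = 2.826113  # GPA coefficient from the chapter's logit output

# Illustrative probability levels, chosen arbitrarily.
for p in (0.1, 0.5, 0.9):
    print(p, round(logit_marginal_effect(beta_gpa, p), 4))
```

The effect is largest at P = 0.5 and shrinks symmetrically toward the tails, which is why a single coefficient cannot be read as a constant change in probability.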
Example on Logit and Probit
1. Suppose that we want to examine the effect of routine weekly
exercises on the performance of students.
To this end, suppose we gave routine exercises to second-year
section A students and, at the end of the semester, computed the
average score in exercises (ASE) for each student.
The dependent variable in this example is dichotomous: Y_i = 1
for students scoring A and Y_i = 0 for students
scoring other grades (B, C, D, F and FX).
There are two continuous explanatory variables (GPA and ASE) and
one categorical explanatory variable, PC ownership,
where PC = 1 for students who own a PC and PC = 0 for students
who do not.
Example on Logit and Probit
A. Interpretation of Logit Model
logit grade gpa ase pc

Logistic regression                        Number of obs  =        32
                                           LR chi2(3)     =     15.40
                                           Prob > chi2    =    0.0015
Log likelihood = -12.889633                Pseudo R2      =    0.3740

grade  |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------+-----------------------------------------------------------------
gpa    |   2.826113   1.262941     2.24    0.025    .3507938    5.301432
ase    |   .0951577   .1415542     0.67    0.501   -.1822835    .3725988
pc     |   2.378688   1.064564     2.23    0.025      .29218    4.465195
_cons  |  -13.02135   4.931325    -2.64    0.008   -22.68657    -3.35613
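The summary statistics in the header of this output are linked to each other. Assuming the reported Pseudo R2 is McFadden's measure (1 minus the ratio of the fitted to the constant-only log-likelihood) and that the LR chi2 statistic is twice the gap between those two log-likelihoods, the sketch below recovers the constant-only log-likelihood and the Pseudo R2 from the printed figures.

```python
# Figures copied from the logit output above.
ll_model = -12.889633   # log likelihood of the fitted model
lr_chi2 = 15.40         # LR chi2(3) statistic

# LR chi2 = 2 * (ll_model - ll_null), so the constant-only log-likelihood is:
ll_null = ll_model - lr_chi2 / 2.0

# McFadden's pseudo R-squared (assumed definition of the reported Pseudo R2).
pseudo_r2 = 1.0 - ll_model / ll_null

print(round(ll_null, 6), round(pseudo_r2, 4))
```

The result rounds to 0.3740, matching the Pseudo R2 printed in the output, up to the rounding of the LR statistic.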

Interpretation of Logit

 GPA: for every one-unit increase in GPA, we expect
a 2.826113 increase in the log-odds of getting an A grade,
holding all other independent variables constant.
 ASE: for every one-unit increase in ASE (that is, for every
additional point scored in exercises), we expect a
0.0951577 increase in the log-odds of getting an A grade,
holding all other explanatory variables constant.
 PC: for a one-unit increase in PC (in other words,
a student going from no PC to PC ownership), we
expect a 2.378688 increase in the log-odds of getting an A
grade, holding all other independent variables
constant.
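Written out as an equation, the estimated logit model behind these statements is as follows, where P_i denotes Pr(Y_i = 1) and the subscripted variable names are notation introduced here for illustration:

```latex
\ln\!\left(\frac{P_i}{1 - P_i}\right)
  = -13.02135 + 2.826113\,\mathrm{GPA}_i
    + 0.0951577\,\mathrm{ASE}_i + 2.378688\,\mathrm{PC}_i
```

Each coefficient is therefore the change in the log-odds, not in the probability, for a one-unit change in the corresponding variable.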

Interpretation of Logit
B. Odds Ratio Interpretation of Logit Model
logit grade gpa ase pc, or

Logistic regression                        Number of obs  =        32
                                           LR chi2(3)     =     15.40
                                           Prob > chi2    =    0.0015
Log likelihood = -12.889633                Pseudo R2      =    0.3740

grade  | Odds Ratio   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------+-----------------------------------------------------------------
gpa    |   16.87972   21.31809     2.24    0.025    1.420194    200.6239
ase    |   1.099832   .1556859     0.67    0.501    .8333651    1.451502
pc     |   10.79073   11.48743     2.23    0.025    1.339344    86.93802
_cons  |   2.21e-06   .0000109    -2.64    0.008    1.40e-10      .03487

Odds Ratio Interpretation
• GPA: As GPA increases by one point, the odds of getting
an A (versus another grade: B, C, D, F, or FX) are
multiplied by about 16.88, holding the other variables
constant.
• ASE: As ASE increases by one point, the odds of
getting an A are multiplied by about 1.10, holding the
other variables constant.
• PC: For PC owners, the odds of getting an A are about
10.79 times the odds for non-owners, holding the other
variables constant.
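The odds ratios are the exponentiated logit coefficients, which can be verified directly from the two output tables in this example. The sketch below is only a check of that relationship; the variable names are illustrative.

```python
import math

# Logit coefficients (log-odds scale) copied from the first output table.
logit_coef = {"gpa": 2.826113, "ase": 0.0951577, "pc": 2.378688}

# Exponentiating a log-odds coefficient gives the corresponding odds ratio.
odds_ratios = {name: math.exp(b) for name, b in logit_coef.items()}

for name, ratio in odds_ratios.items():
    print(name, round(ratio, 4))
```

The results agree with the Odds Ratio column (16.87972, 1.099832, 10.79073) up to rounding of the printed coefficients.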
Interpretation of Logit
C. Probability Interpretation of Logit Model
. mfx

Marginal effects after logit
      y  = Pr(grade) (predict)
         =  .25282025

variable |     dy/dx   Std. Err.      z    P>|z|   [    95% C.I.    ]         X
---------+----------------------------------------------------------------------
gpa      |  .5338589     .23704    2.25    0.024    .069273   .998445   3.11719
ase      |  .0179755     .02624    0.69    0.493   -.033448   .069399   21.9375
pc*      |  .4564984     .18105    2.52    0.012     .10164   .811357     .4375

(*) dy/dx is for discrete change of dummy variable from 0 to 1


Marginal Effect (mfx) Interpretation
• Both logit and probit give us similar results.
• The effects below are evaluated at the sample means of the
explanatory variables (the X column of the mfx output).
• GPA: As GPA increases by one point, the probability
of a student getting an A grade increases by about 53.39
percentage points.
• ASE: As ASE increases by one point, the probability of
a student getting an A grade increases by about 1.80
percentage points.
• PC: Moving from no PC ownership to PC ownership
increases the probability of a student getting an A grade
by about 45.65 percentage points.
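These marginal effects can be reproduced by hand from the logit coefficients and the variable means shown in the X column of the mfx output. The sketch below assumes the effects are evaluated at those means: for the continuous variables it uses β·P·(1 − P), and for the PC dummy it takes the difference in predicted probability between PC = 1 and PC = 0.

```python
import math

def logistic_cdf(z):
    """Logistic CDF used by the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

# Coefficients from the logit output and means from the X column of mfx.
coef = {"gpa": 2.826113, "ase": 0.0951577, "pc": 2.378688}
cons = -13.02135
means = {"gpa": 3.11719, "ase": 21.9375, "pc": 0.4375}

# Predicted probability with every regressor at its mean.
index_at_means = cons + sum(coef[k] * means[k] for k in coef)
p_at_means = logistic_cdf(index_at_means)

# Continuous variables: slope of the logistic curve at the means.
density = p_at_means * (1.0 - p_at_means)
me_gpa = coef["gpa"] * density
me_ase = coef["ase"] * density

# Dummy variable: discrete change from pc = 0 to pc = 1, other variables at means.
index_without_pc = index_at_means - coef["pc"] * means["pc"]
me_pc = logistic_cdf(index_without_pc + coef["pc"]) - logistic_cdf(index_without_pc)

print(round(p_at_means, 4), round(me_gpa, 4), round(me_ase, 4), round(me_pc, 4))
```

The values agree with the reported Pr(grade) = .25282025 and with the dy/dx column up to rounding of the printed means and coefficients.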

Interpretation of Probit Model

D. Probit Estimation
probit grade gpa ase pc

Probit regression                          Number of obs  =        32
                                           LR chi2(3)     =     15.55
                                           Prob > chi2    =    0.0014
Log likelihood = -12.818803                Pseudo R2      =    0.3775

grade  |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------+-----------------------------------------------------------------
gpa    |    1.62581   .6938825     2.34    0.019    .2658255    2.985795
ase    |   .0517289   .0838903     0.62    0.537   -.1126929    .2161508
pc     |   1.426332   .5950379     2.40    0.017    .2600795    2.592585
_cons  |   -7.45232   2.542472    -2.93    0.003   -12.43547   -2.469166
Interpretation of Probit
GPA: For a one-unit increase in GPA, the probit
index (Z-score) increases by 1.62581.
ASE: For a one-unit increase in ASE, the probit
index increases by 0.0517.
PC: Owning a PC increases the Z-score
by 1.426.
In each case the other explanatory variables are held constant.
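To turn the probit index into a probability, the index is passed through the standard normal CDF. The sketch below does this with every regressor set to its sample mean; the means are taken from the X column of the logit mfx output in this example, and using them here is an assumption made for illustration.

```python
from statistics import NormalDist

# Probit coefficients copied from the probit output table.
coef = {"gpa": 1.62581, "ase": 0.0517289, "pc": 1.426332}
cons = -7.45232

# Sample means from the X column of the mfx output (assumed to apply here too).
means = {"gpa": 3.11719, "ase": 21.9375, "pc": 0.4375}

# Probit index (Z-score) at the means, then the implied probability of an A.
z_at_means = cons + sum(coef[k] * means[k] for k in coef)
p_probit = NormalDist().cdf(z_at_means)

print(round(z_at_means, 4), round(p_probit, 4))
```

The implied probability of roughly 0.27 is close to the logit prediction of about 0.25 at the same point, in line with the remark that the two models give similar results.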
