Probit Model

The probit model is used for binary dependent variables and models the probability of observing a value of one using a cumulative distribution function. It can be estimated using maximum likelihood which involves iterative solutions. The probit model can be interpreted as a latent variables specification or a conditional mean specification. EViews allows estimation of probit models and provides output including coefficient estimates, test statistics and measures of fit.

Uploaded by

Nidhi Kaushik

PROBIT MODEL

BACKGROUND

Suppose that a binary dependent variable, y, takes on values of zero and one. A simple linear
regression of y on x is not appropriate since, among other things, the implied model of the
conditional mean places inappropriate restrictions on the residuals of the model. Furthermore, the
fitted value of y from a simple linear regression is not restricted to lie between zero and one.

Instead, we adopt a specification that is designed to handle the specific requirements of binary
dependent variables. Suppose that we model the probability of observing a value of one as:

$$\Pr(y_i = 1 \mid x_i, \beta) = 1 - F(-x_i'\beta)$$

where F is a continuous, strictly increasing function that takes a real value and returns a value
ranging from zero to one.
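
As a minimal sketch of this specification, the Python snippet below evaluates the probability of observing a one as 1 - F(-x'b) when F is the standard normal CDF (the probit case). The function names and the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_prob(x, beta):
    """Pr(y = 1 | x, beta) = 1 - F(-x'beta), with F the standard normal CDF."""
    index = sum(xj * bj for xj, bj in zip(x, beta))
    return 1.0 - std_normal_cdf(-index)

# Hypothetical coefficients: a constant and one slope.
beta = [-1.0, 0.5]
for x1 in (0.0, 2.0, 4.0):
    # The first element of x is the constant term.
    print(x1, round(probit_prob([1.0, x1], beta), 4))
```

Because F is strictly increasing and bounded by zero and one, the printed probabilities rise with the index and never leave the unit interval.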

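The choice of the function F determines the type of binary model. It follows that:

$$\Pr(y_i = 0 \mid x_i, \beta) = F(-x_i'\beta)$$

Given such a specification, we can estimate the parameters of this model using the method of
maximum likelihood. The log likelihood function is given by:

$$l(\beta) = \sum_{i=1}^{n} \left[ y_i \log\left(1 - F(-x_i'\beta)\right) + (1 - y_i) \log F(-x_i'\beta) \right]$$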

The first order conditions for this likelihood are nonlinear so that obtaining parameter estimates
requires an iterative solution. By default, EViews uses a second derivative method for iteration and
computation of the covariance matrix of the parameter estimates.

EViews allows you to override these defaults using the Options dialog.
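
To make the iterative solution concrete, here is a rough Newton-Raphson sketch for the probit log likelihood in plain Python. It is one example of a second derivative method and is not a description of how EViews implements its optimizer; the data set, starting values, tolerance, and function names are all hypothetical.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def solve(A, b):
    """Solve A v = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * v[k] for k in range(r + 1, n))
        v[r] = (M[r][n] - s) / M[r][r]
    return v

def log_likelihood(beta, X, y):
    """Probit log likelihood; Phi(z) equals 1 - Phi(-z) by symmetry."""
    ll = 0.0
    for xi, yi in zip(X, y):
        z = sum(a * b for a, b in zip(xi, beta))
        ll += yi * math.log(Phi(z)) + (1 - yi) * math.log(Phi(-z))
    return ll

def fit_probit(X, y, tol=1e-10, max_iter=50):
    """Newton-Raphson iterations started from beta = 0."""
    k = len(X[0])
    beta = [0.0] * k
    for iteration in range(1, max_iter + 1):
        grad = [0.0] * k
        neg_hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            z = sum(a * b for a, b in zip(xi, beta))
            q = 2 * yi - 1                      # +1 when y = 1, -1 when y = 0
            lam = q * phi(q * z) / Phi(q * z)   # score contribution per unit of x
            w = lam * (lam + z)                 # minus the second-derivative weight
            for a in range(k):
                grad[a] += lam * xi[a]
                for b in range(k):
                    neg_hess[a][b] += w * xi[a] * xi[b]
        step = solve(neg_hess, grad)            # Newton step: (-H)^(-1) g
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            return beta, iteration
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical data: a constant plus one regressor, with overlapping outcomes.
X = [[1.0, v] for v in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)]
y = [0, 0, 1, 0, 1, 0, 1, 1]
beta_hat, n_iter = fit_probit(X, y)
print(beta_hat, n_iter, log_likelihood(beta_hat, X, y))
```

The inverse of the negative Hessian at convergence is also the usual second-derivative estimate of the coefficient covariance matrix, which is why the same method can serve both purposes.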

There are two alternative interpretations of this specification that are of interest. First, the binary
model is often motivated as a latent variables specification. Suppose that there is an unobserved
latent variable $y_i^*$ that is linearly related to x:

$$y_i^* = x_i'\beta + u_i$$

where $u_i$ is a random disturbance. Then the observed dependent variable is determined by
whether $y_i^*$ exceeds a threshold value:

$$y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$

In this case, the threshold is set to zero, but the choice of a threshold value is irrelevant so long as a
constant term is included in $x_i$. Then:

$$\Pr(y_i = 1 \mid x_i, \beta) = \Pr(y_i^* > 0) = \Pr(x_i'\beta + u_i > 0) = 1 - F_u(-x_i'\beta)$$

where $F_u$ is the cumulative distribution function of u. Common models include probit (standard
normal), logit (logistic), and gompit (extreme value) specifications for the F function.
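
The three choices can be compared side by side. The sketch below writes each model as a function giving the probability of a one for a given index value z = x'b. The gompit form assumes the Type-I extreme value parameterization exp(-exp(-z)); treat that parameterization, and the function names, as assumptions rather than as something stated in this text.

```python
import math

def probit_response(z):
    """Pr(y = 1) under a standard normal disturbance."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_response(z):
    """Pr(y = 1) under a logistic disturbance."""
    return 1.0 / (1.0 + math.exp(-z))

def gompit_response(z):
    """Pr(y = 1) under a Type-I extreme value disturbance (assumed form)."""
    return math.exp(-math.exp(-z))

# Evaluate all three at a few index values.
for z in (-2.0, 0.0, 2.0):
    print(z,
          round(probit_response(z), 4),
          round(logit_response(z), 4),
          round(gompit_response(z), 4))
```

Probit and logit are symmetric around an index of zero, where both give 0.5, while the extreme value form is skewed and does not.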

In principle, the coding of the two numerical values of y is not critical since each of the binary
responses only represents an event. Nevertheless, EViews requires that you code y as a zero-one
variable. This restriction yields a number of advantages. For one, coding the variable in this fashion
implies that the expected value of y is simply the probability that y = 1:

$$E(y_i \mid x_i, \beta) = 1 \cdot \Pr(y_i = 1 \mid x_i, \beta) + 0 \cdot \Pr(y_i = 0 \mid x_i, \beta) = \Pr(y_i = 1 \mid x_i, \beta)$$

This convention provides us with a second interpretation of the binary specification: as a conditional
mean specification. It follows that we can write the binary model as a regression model:

$$y_i = \left(1 - F(-x_i'\beta)\right) + \varepsilon_i$$

where $\varepsilon_i$ is a residual representing the deviation of the binary $y_i$ from its conditional mean.
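
The latent variable and conditional mean interpretations can be checked against each other by simulation. The sketch below draws a latent value as the index plus a standard normal disturbance, records a one whenever the latent value is positive, and compares the sample mean of the zero-one outcomes with the probit probability. The regressor values, coefficients, sample size, and seed are hypothetical.

```python
import math
import random

def std_normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simulate_binary(x, beta, n, seed=0):
    """Draw n outcomes y = 1 if x'beta + u > 0 else 0, with u ~ N(0, 1)."""
    rng = random.Random(seed)
    index = sum(xj * bj for xj, bj in zip(x, beta))
    draws = []
    for _ in range(n):
        y_star = index + rng.gauss(0.0, 1.0)   # unobserved latent variable
        draws.append(1 if y_star > 0 else 0)   # threshold at zero
    return draws

x = [1.0, 1.5]        # constant and one regressor (hypothetical)
beta = [-0.5, 0.8]    # hypothetical coefficients, so x'beta = 0.7
y = simulate_binary(x, beta, n=100_000)

sample_mean = sum(y) / len(y)
model_prob = 1.0 - std_normal_cdf(-0.7)
print(sample_mean, model_prob)
```

With a zero-one coding the sample mean is just the share of ones, so it should settle near the model probability as the number of draws grows.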

Estimating Binary Models in EViews:

• To estimate a binary dependent variable model, choose Object/New Object… from the
main menu and select the Equation object.
• From the Equation Specification dialog, select the BINARY - Binary Choice (Logit, Probit,
and Extreme Value) estimation method.
• The dialog will change to reflect your choice. Alternatively, enter the keyword binary in the
command line and press ENTER.
• There are two parts to the binary model specification. First, in the Equation Specification
field, you may type the name of the binary dependent variable followed by a list of
regressors, or you may enter an explicit expression for the index. Next, select from among
the three distributions for your error term.

ILLUSTRATION

For example, consider the probit specification example described in Greene (2008, pp. 781-783),
where we analyze the effectiveness of teaching methods on grades.

The variable GRADE represents improvement on grades following exposure to the new teaching
method PSI.

Also controlling for alternative measures of knowledge (GPA and TUCE), we have the specification:

Once you have specified the model, click OK.

EViews estimates the parameters of the model using iterative procedures, and will display
information in the status line.

EViews requires that the dependent variable be coded as zero or one; observations with any other
value are dropped from the estimation.

Following estimation, EViews displays results in the equation window. The top part of the estimation
output is given by:
The header contains basic information regarding the estimation technique (ML for maximum
likelihood) and the sample used in estimation, as well as information on the number of iterations
required for convergence, and on the method used to compute the coefficient covariance matrix.

Displayed next are the coefficient estimates, asymptotic standard errors, z-statistics and
corresponding p-values.

Interpretation of the coefficient values is complicated by the fact that estimated coefficients from a
binary model cannot be interpreted as the marginal effect on the dependent variable. The marginal
effect of $x_j$ on the conditional probability is given by:

$$\frac{\partial E(y_i \mid x_i, \beta)}{\partial x_{ij}} = f(-x_i'\beta)\,\beta_j$$

where $f(x) = dF(x)/dx$ is the density function corresponding to F. Note that $\beta_j$ is weighted by
a factor f that depends on the values of all of the regressors in x. The direction of the effect of a
change in $x_j$ depends only on the sign of the $\beta_j$ coefficient. Positive values of $\beta_j$ imply that
increasing $x_j$ will increase the probability of the response; negative values imply the opposite.
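
A short sketch may help show how the weighting works for the probit case, where f is the standard normal density. The function below scales each coefficient by the density evaluated at the index, so the same coefficient gives a different marginal effect at different regressor values. The function name and all numbers are hypothetical.

```python
import math

def std_normal_pdf(z):
    """Standard normal density, the f that corresponds to the probit F."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def probit_marginal_effects(x, beta):
    """Return f(-x'beta) * beta_j for every coefficient.

    The entry for the constant term is included for completeness but has
    no marginal-effect interpretation.
    """
    index = sum(xj * bj for xj, bj in zip(x, beta))
    scale = std_normal_pdf(-index)   # the density is symmetric, so f(-z) = f(z)
    return [scale * bj for bj in beta]

beta = [-1.0, 0.5]                   # hypothetical constant and slope
for x1 in (0.0, 2.0, 6.0):
    effects = probit_marginal_effects([1.0, x1], beta)
    print(x1, round(effects[1], 4))
```

The slope effect is largest where the index is near zero and shrinks toward zero in the tails, but its sign always matches the sign of the coefficient.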

In addition to the summary statistics of the dependent variable, EViews also presents the following
summary statistics:

First, there are several familiar summary descriptive statistics: the mean and standard deviation of
the dependent variable, standard error of the regression, and the sum of the squared residuals. The
latter two measures are computed in the usual fashion using the ordinary residuals
$e_i = y_i - E(y_i \mid x_i, \hat\beta) = y_i - (1 - F(-x_i'\hat\beta))$.
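
As an illustration of these residual-based measures, the sketch below computes ordinary residuals as the observed zero-one value minus the fitted probit probability, then the sum of squared residuals and a standard error of the regression. Dividing by the number of observations minus the number of coefficients is an assumption about what "the usual fashion" means here; the data and coefficients are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def residual_summary(X, y, beta):
    """Ordinary residuals, their sum of squares, and the regression S.E."""
    n, k = len(y), len(beta)
    residuals = []
    for xi, yi in zip(X, y):
        index = sum(a * b for a, b in zip(xi, beta))
        fitted = 1.0 - std_normal_cdf(-index)   # fitted probability of a one
        residuals.append(yi - fitted)
    ssr = sum(e * e for e in residuals)
    se_reg = math.sqrt(ssr / (n - k))           # assumed degrees-of-freedom correction
    return residuals, ssr, se_reg

# Hypothetical data and coefficient values.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]]
y = [0, 1, 0, 1]
residuals, ssr, se_reg = residual_summary(X, y, beta=[-1.0, 0.5])
print(residuals, ssr, se_reg)
```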

Additionally, there are several likelihood based statistics:

•Log likelihood is the maximized value of the log likelihood function.

•Avg. log likelihood is the log likelihood divided by the number of observations.

•Restr. log likelihood is the maximized log likelihood value when all slope coefficients are restricted
to zero, $l(\tilde\beta)$. Since the constant term is included, this specification is equivalent to estimating the
unconditional mean probability of “success”.

•The LR statistic tests the joint null hypothesis that all slope coefficients except the constant are zero
and is computed as $-2\left(l(\tilde\beta) - l(\hat\beta)\right)$. This statistic, which is only reported when you include a constant in your
specification, is used to test the overall significance of the model. The degrees of freedom is one less
than the number of coefficients in the equation, which is the number of restrictions under test.

•Probability (LR stat) is the p-value of the LR test statistic. Under the null hypothesis, the LR test
statistic is asymptotically distributed as a $\chi^2$ variable, with degrees of freedom equal to the number of
restrictions under test.

•McFadden R-squared is the likelihood ratio index computed as $1 - l(\hat\beta)/l(\tilde\beta)$, where $l(\tilde\beta)$ is the restricted log
likelihood. As the name suggests, this is an analog to the $R^2$ reported in linear regression models. It has
the property that it always lies between zero and one.
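
The likelihood based statistics can be reproduced by hand once the two log likelihood values are known. In the sketch below, the restricted log likelihood is computed from the share of ones in the sample, which is what a constant-only binary model fits. The unrestricted value of -50.0 and the sample counts are hypothetical placeholders, not output from any estimation.

```python
import math

def restricted_loglik(n_ones, n_obs):
    """Log likelihood of a constant-only model: fitted probability = share of ones."""
    p = n_ones / n_obs
    return n_ones * math.log(p) + (n_obs - n_ones) * math.log(1.0 - p)

def lr_statistic(ll_unrestricted, ll_restricted):
    """LR = -2 * (restricted log likelihood - unrestricted log likelihood)."""
    return -2.0 * (ll_restricted - ll_unrestricted)

def mcfadden_r2(ll_unrestricted, ll_restricted):
    """Likelihood ratio index: 1 - unrestricted / restricted log likelihood."""
    return 1.0 - ll_unrestricted / ll_restricted

# Hypothetical sample: 40 ones out of 100 observations.
ll_restricted = restricted_loglik(n_ones=40, n_obs=100)
ll_unrestricted = -50.0   # hypothetical maximized log likelihood

print("Restr. log likelihood:", ll_restricted)
print("Avg. log likelihood:  ", ll_unrestricted / 100)
print("LR statistic:         ", lr_statistic(ll_unrestricted, ll_restricted))
print("McFadden R-squared:   ", mcfadden_r2(ll_unrestricted, ll_restricted))
```

Because the unrestricted log likelihood can never be lower than the restricted one, and both are negative, the ratio lies between zero and one and so does the index.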
