
DRAFT

DISCRETE CHOICE METHODS AND THEIR APPLICATIONS TO SHORT TERM TRAVEL DECISIONS
Moshe Ben-Akiva and Michel Bierlaire

Chapter for the Transportation Science Handbook


DRAFT
MIT, April 1999

Introduction

Modeling travel behavior is a key aspect of demand analysis,
where aggregate demand is the accumulation of individuals’
decisions. In this chapter, we focus on “short-term” travel
decisions. The most important short-term travel decisions include
choice of destination for a non-work trip, choice of travel mode,
choice of departure time and choice of route. It is important to
note that short-term decisions are conditional on long-term travel
and mobility decisions such as car ownership and residential and
work locations.
The analysis of travel behavior is typically disaggregate, meaning
that the models represent the choice behavior of individual
travelers. Discrete choice analysis is the methodology used to
analyze and predict travel decisions. Therefore, we begin this
chapter with a review of the theoretical and practical aspects of
discrete choice models. After a brief discussion of general
assumptions, we introduce the random utility model, which is the
most common theoretical basis of discrete choice models. We then
present the alternative discrete choice model forms such as Logit,
Nested Logit, Generalized Extreme Value and Probit, as well as
more recent developments such as Hybrid Logit and the Latent
Class choice model. Finally, we elaborate on the applications of
these models to two specific short-term travel decisions: route
choice and departure time choice.

Discrete Choice Models

We provide here a brief overview of the general framework of
discrete choice models. We refer the reader to Ben-Akiva and
Lerman (1985) for the detailed developments.

General modeling assumptions

The framework for a discrete choice model can be presented by a
set of general assumptions. We distinguish among assumptions
about the:
1. decision-maker -- defining the decision-making entity and its
characteristics;
2. alternatives -- determining the options available to the decision-
maker;
3. attributes -- measuring the benefits and costs of an alternative to
the decision-maker; and
4. decision rule -- describing the process used by the decision-
maker to choose an alternative.

Decision-maker
Discrete choice models are also referred to as disaggregate
models, meaning that the decision-maker is assumed to be an
individual. The “individual” decision-making entity depends on the
particular application. For instance, we may consider that a group
of persons (a household or an organization, for example) is the
decision-maker. In doing so, we may ignore all internal interactions
within the group, and consider only the decisions of the group as a
whole. We refer to “decision-maker” and “individual”
interchangeably throughout this chapter. To explain the
heterogeneity of preferences among decision-makers, a
disaggregate model must include their characteristics such as the
socio-economic variables of age, gender, education and income.

Alternatives
Analyzing individual decision making requires not only
knowledge of what has been chosen, but also of what has not been
chosen. Therefore, assumptions must be made about available
options, or alternatives, that an individual considers during a choice
process. The set of considered alternatives is called the choice set.
A discrete choice set contains a finite number of alternatives that
can be explicitly listed. The choice of a travel mode is a typical
example of a choice from a discrete choice set. The identification
of the list of alternatives is a complex process usually referred to as
choice set generation. The most widely used method for choice set
generation uses deterministic criteria of alternative availability. For
example, the possession of a driver’s license determines the
availability of the auto drive option.
The universal choice set contains all potential alternatives in the
application’s context. The choice set is the subset of the universal
choice set considered by, or available to, a particular individual.

Alternatives in the universal choice set that are not available to the
individual are therefore excluded from the choice set.
In addition to availability, the decision-maker’s awareness of the
alternative could also affect the choice set. The behavioral aspects
of awareness introduce uncertainty in modeling the choice set
generation process and motivate the use of probabilistic choice set
generation models that predict the probability of each feasible
choice set within the universal set. A discrete choice model with a
probabilistic choice set generation model is described later in this
chapter as a special case of the latent class choice model.

Attributes
Each alternative in the choice set is characterized by a set of
attributes. Note that some attributes may be generic to all
alternatives, and some may be alternative-specific.
An attribute is not necessarily a directly measurable quantity. It
can be any function of available data. For example, instead of
considering travel time as an attribute of a transportation mode, the
logarithm of the travel time may be used, or the effect of out-of-
pocket cost may be represented by the ratio between the out-of-
pocket cost and the income of the individual. Alternative
definitions of attributes as functions of available data must usually
be tested to identify the most appropriate.

Decision rule

The decision rule is the process used by the decision-maker to
evaluate the attributes of the alternatives in the choice set and
determine a choice. Most models used for travel behavior
applications are based on utility theory, which assumes that the
decision-maker’s preference for an alternative is captured by a
value, called utility, and the decision-maker selects the alternative in
the choice set with the highest utility.

This concept, employed by consumer theory of micro-economics,
presents strong limitations for practical applications. The
underlying assumptions of this approach are often violated in
decision-making experiments. The complexity of human behavior
suggests that the decision rule should include a probabilistic
dimension.
Some models assume that the decision rule is intrinsically
probabilistic, and even complete knowledge of the problem would
not overcome the uncertainty. Others consider the individuals’
decision rules as deterministic, and motivate the uncertainty from
the limited capability of the analyst to observe and capture all the
dimensions of the choice process, due to its complexity.
Specific families of models can be derived depending on the
assumptions about the source of uncertainty. Models with
probabilistic decision rules, like the model proposed by Luce
(1959), or the “elimination by aspects” approach proposed by
Tversky (1972), assume a deterministic utility and a probabilistic
decision process. Random utility models, used intensively in
econometrics and in travel behavior analysis, are based on
deterministic decision rules, where utilities are represented by
random variables.

Random Utility theory

Random utility models assume, as does the economic consumer
theory, that the decision-maker has a perfect discrimination
capability. However, the analyst is assumed to have incomplete
information and, therefore, uncertainty must be taken into account.
Manski (1977) identifies four different sources of uncertainty:
unobserved alternative attributes; unobserved individual
characteristics (also called “unobserved taste variations”);
measurement errors; and proxy, or instrumental, variables.

The utility is modeled as a random variable in order to reflect this
uncertainty. More specifically, the utility that individual n associates
with alternative i in the choice set Cn is given by
Uin = Vin + εin,
where Vin is the deterministic (or systematic) part of the utility, and
εin is the random term, capturing the uncertainty. The alternative
with the highest utility is chosen. Therefore, the probability that
alternative i is chosen by decision-maker n from choice set Cn is
$$P(i \mid C_n) = P[U_{in} \geq U_{jn} \ \forall j \in C_n] = P\left[U_{in} = \max_{j \in C_n} U_{jn}\right].$$

In the following we introduce the assumptions necessary to make a
random utility model operational.

Location and scale parameters


Considering two arbitrary real numbers α and µ, where µ > 0,
we have that
$$P[U_{in} \geq U_{jn} \ \forall j \in C_n] = P[\mu U_{in} + \alpha \geq \mu U_{jn} + \alpha \ \forall j \in C_n] = P[U_{in} - U_{jn} \geq 0 \ \forall j \in C_n].$$


The above illustrates the fact that only the signs of the differences
between utilities are relevant here, and not utilities themselves. The
concept of ordinal utility is relative and not absolute. In order to
estimate and use a specific model, arbitrary values have to be
selected for α and µ. The selection of the scale parameter µ is
usually based on a convenient normalization of one of the variances
of the random terms. The location parameter α is usually set to
zero. See also the discussion below of Alternative Specific
Constants.

Alternative Specific Constants


The means of the random terms can be assumed to be equal to any
convenient value c (usually zero, or the Euler constant γ for Logit
models). This is not a restrictive assumption. If we denote the mean
of the error term of alternative i by mi = E[εin], we can define a new
random variable ein = εin - mi+c such that E[ein]=c. We have
P[ Uin ≥ Ujn ∀ j ∈ Cn] = P[ Vin+ mi + ein ≥ Vjn+ mj + ejn ∀ j ∈ Cn],
a model in which the deterministic parts of the utilities are Vin+ mi
and the random terms are ein (with mean c). The terms mi are then
included as Alternative Specific Constants (ASC) that capture the
means of the random terms. Therefore, we may assume without
loss of generality that the error terms of random utility models have
a constant mean c by including alternative specific constants in the
deterministic part of the utility functions.
As only differences between utilities are relevant, only differences
between ASCs are relevant as well. It is common practice to define
the location parameter α as the negative of one of the ASCs. This
is equivalent to constraining that ASC to zero. From a modeling
viewpoint, the choice of the particular alternative whose ASC is
constrained is arbitrary. However, Bierlaire, Lotan and Toint
(1997) have shown that the estimation process may be affected by
this choice. In the context of the Multinomial Logit Model, they
show that constraining the sum of ASCs to 1 is optimal for the
speed of convergence of the estimation process. This result is also
generalized for the Nested Logit Model.

The deterministic term of the utility


The deterministic term Vin of each alternative is a function of the
attributes of the alternative itself and the characteristics of the
decision-maker. That is

Vin = V(zin, Sn)


where zin is the vector of attributes as perceived by individual n for
alternative i, and Sn is the vector of characteristics of individual n.
This formulation is simplified using any appropriate vector valued
function h that defines a new vector of attributes from both zin and
Sn, that is
xin = h(zin, Sn).
The choice of h is very general, and several forms may be tested to
identify the best representation in a specific application. It is
usually assumed to be continuous and monotonic in zin. For a
linear-in-the-parameters utility specification, h must be a fully determined
function (meaning that it does not contain unknown parameters).
Then we have
Vin = V(xin).
A linear-in-the-parameters function is denoted as follows:
$$V_{in} = \sum_{k} \beta_k x_{ink}.$$

The deterministic term of the utility is therefore fully specified by
the vector of parameters β.

The random part of the utility


Among the many potential models that can be derived for the
random parts of the utility functions, we describe below the most
popular. The models within the Logit family are based on a
probability distribution function of the maximum of a series of
random variables, introduced by Gumbel (1958). Probit and Probit-
like models are based on the Normal distribution motivated by the
Central Limit Theorem.

The main advantage of the Probit model is its ability to capture
all correlations among alternatives. However, due to the high
complexity of its formulation, very few applications have been
developed. The Logit model has been much more popular, because
of its tractability, but it imposes restrictions on the covariance
structure. These restrictions may be unrealistic in some contexts. The
derivation of other models in the “Logit family” is aimed at relaxing
these restrictions, while maintaining tractability.
We discuss here the specification and properties of the models
from the Logit family (the Multinomial Logit model, the Nested
Logit model, the Cross-Nested model and the Generalized Extreme
Value model). After presenting the Probit model, we introduce
more advanced models. The Generalized Factor Analytical
Representation and the Hybrid Logit models are designed to bridge
the gap between Logit and Probit models. The Latent Class Choice
model is a further extension designed to explicitly include in the
model discrete unobserved factors.

The LOGIT family

Logit-based models have been widely used for travel demand
analysis. Practitioners and researchers have used, refined and
extended the original Binary Logit Model to obtain a class of
models based on similar assumptions. We refer to this class as the
Logit-family.

Multinomial Logit Model


The Logistic Probability Unit, or the Logit Model, was first
introduced in the context of binary choice where the logistic
distribution is used. Its generalization to more than two alternatives
is referred to as the Multinomial Logit Model. The Multinomial
Logit Model is derived from the assumption that the error terms of
the utility functions are independent and identically Gumbel
distributed (or Type I extreme value). That is, εin for all i,n is
distributed as:
$$F(\varepsilon) = \exp\left[-e^{-\mu(\varepsilon - \eta)}\right], \quad \mu > 0,$$
$$f(\varepsilon) = \mu e^{-\mu(\varepsilon - \eta)} \exp\left[-e^{-\mu(\varepsilon - \eta)}\right],$$

where η is a location parameter and µ is a strictly positive scale
parameter. The mean of this distribution is
$$\eta + \gamma/\mu,$$
where
$$\gamma = \lim_{k \to \infty} \left( \sum_{i=1}^{k} \frac{1}{i} - \ln(k) \right) \cong 0.5772$$
is the Euler constant. The variance of the distribution is
$$\frac{\pi^2}{6\mu^2}.$$
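The moments stated above can be checked numerically. The following Python sketch draws from the Gumbel distribution by inverting F and compares the sample mean and variance with η + γ/µ and π²/(6µ²). The function names, the seed and the parameter values are illustrative assumptions and are not part of the chapter.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler constant, gamma


def gumbel_cdf(eps, eta=0.0, mu=1.0):
    """F(eps) = exp[-exp(-mu (eps - eta))]."""
    return math.exp(-math.exp(-mu * (eps - eta)))


def gumbel_pdf(eps, eta=0.0, mu=1.0):
    """f(eps) = mu exp(-mu (eps - eta)) exp[-exp(-mu (eps - eta))]."""
    z = math.exp(-mu * (eps - eta))
    return mu * z * math.exp(-z)


def gumbel_draw(rng, eta=0.0, mu=1.0):
    """Inverse-CDF sampling: solve F(eps) = u for eps."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return eta - math.log(-math.log(u)) / mu


def sample_moments(n, eta, mu, seed=0):
    """Sample mean and unbiased sample variance of n Gumbel draws."""
    rng = random.Random(seed)
    draws = [gumbel_draw(rng, eta, mu) for _ in range(n)]
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)
    return mean, var


if __name__ == "__main__":
    eta, mu = 1.0, 2.0  # hypothetical location and scale
    mean, var = sample_moments(100000, eta, mu)
    print("sample mean", mean, "theory", eta + EULER_GAMMA / mu)
    print("sample var ", var, "theory", math.pi ** 2 / (6 * mu ** 2))
```

With a large number of draws the sample moments should approach the theoretical values; the agreement is only approximate because the check is based on simulation.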
The probability that a given individual n chooses alternative i within
the choice set Cn is given by
$$P(i \mid C_n) = \frac{e^{\mu V_{in}}}{\sum_{j \in C_n} e^{\mu V_{jn}}}.$$
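A minimal Python sketch of this formula follows, combined with a linear-in-the-parameters utility. The helper names, the coefficient values and the attribute values are hypothetical and serve only to illustrate the computation.

```python
import math


def linear_utility(beta, x):
    """V = sum_k beta_k * x_k for one alternative."""
    return sum(b * xk for b, xk in zip(beta, x))


def mnl_probabilities(V, mu=1.0):
    """Multinomial Logit probabilities.

    V maps each alternative in the choice set to its deterministic utility.
    The largest scaled utility is subtracted before exponentiating, which
    leaves the probabilities unchanged and avoids numerical overflow.
    """
    m = max(mu * v for v in V.values())
    expv = {i: math.exp(mu * v - m) for i, v in V.items()}
    denom = sum(expv.values())
    return {i: e / denom for i, e in expv.items()}


if __name__ == "__main__":
    # Hypothetical coefficients for (travel time in hours, cost).
    beta = [-2.0, -0.5]
    attributes = {"car": [0.4, 3.0], "bus": [0.7, 1.5], "bike": [0.9, 0.0]}
    V = {mode: linear_utility(beta, x) for mode, x in attributes.items()}
    print(mnl_probabilities(V))
```

Adding the same constant to every utility leaves the result unchanged, which illustrates why only differences between utilities (and between alternative specific constants) matter.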

An important property of the Multinomial Logit Model is
Independence from Irrelevant Alternatives (IIA). This property can
be stated as follows: The ratio of the probabilities of any two
alternatives is independent of the choice set. That is, for any choice
sets C1 and C2 such that C1 ⊆ Cn and C2 ⊆ Cn, and for any
alternatives i and j in both C1 and C2, we have

$$\frac{P(i \mid C_1)}{P(j \mid C_1)} = \frac{P(i \mid C_2)}{P(j \mid C_2)}.$$

An equivalent definition of the IIA property is: The ratio of the
choice probabilities of any two alternatives is unaffected by the
systematic utilities of any other alternatives.
The IIA property of Multinomial Logit Models is a limitation for
some practical applications. This limitation is often illustrated by the
red bus/blue bus paradox in the modal choice context. We use here
instead the following path choice example.

Consider a commuter traveling from origin O to destination D.
He/she is confronted with the path choice problem described in
Figure 1 below, where the choice set is {1,2a,2b} and the only
attribute considered for the choice is travel time. We assume
furthermore that the travel time for any alternative is the same, that
is V(1) = V(2a) = V(2b) = T, and that the travel time on the small
sections a and b is δ.

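[Figure 1: network from origin O to destination D with Path 1 and Path 2; Path 2 includes the two short alternative sections a and b.]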

The probability of each alternative provided by the Multinomial
Logit Model for this example is
$$P(1 \mid \{1,2a,2b\}) = P(2a \mid \{1,2a,2b\}) = P(2b \mid \{1,2a,2b\}) = \frac{e^{\mu T}}{\sum_{j \in \{1,2a,2b\}} e^{\mu T}} = \frac{1}{3}.$$

Clearly, this result is independent of the value of δ. However,
when δ is significantly smaller than the total travel time T, we
expect the probabilities to be close to 50%/25%/25%. The
Multinomial Logit Model is not consistent with this intuitive result.
This situation appears in choice problems with significantly
correlated random utilities, as is clearly the case in the path choice
example. Indeed, alternatives 2a and 2b are so similar that their
utilities share many unobserved attributes of the path and, therefore,
the assumption of independence of the random parts is not valid in
this context.
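The IIA property and the path choice example can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names and the common utility value T = 1 are assumptions, and δ does not appear because the Multinomial Logit probabilities do not depend on it.

```python
import math


def mnl(V, mu=1.0):
    """Multinomial Logit probabilities for a dict of deterministic utilities."""
    expv = {i: math.exp(mu * v) for i, v in V.items()}
    denom = sum(expv.values())
    return {i: e / denom for i, e in expv.items()}


def odds(V, i, j, mu=1.0):
    """Ratio P(i) / P(j) within the choice set described by V."""
    p = mnl(V, mu)
    return p[i] / p[j]


# Hypothetical common systematic utility for the three paths of Figure 1.
T = 1.0
full_set = {"1": T, "2a": T, "2b": T}
reduced_set = {"1": T, "2a": T}  # same problem with path 2b removed

if __name__ == "__main__":
    print(mnl(full_set))  # each path gets probability 1/3
    # IIA: the odds of path 1 against path 2a do not depend on path 2b.
    print(odds(full_set, "1", "2a"), odds(reduced_set, "1", "2a"))
```

The equal split among the three paths holds whatever the overlap between paths 2a and 2b, which is the limitation discussed above.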

Nested Logit Model


The Nested Logit Model, first proposed by Ben-Akiva (1973 and
1974), is an extension of the Multinomial Logit Model designed to
capture some correlations among alternatives. It is based on the
partitioning of the choice set Cn into M nests Cmn such that
$$C_n = \bigcup_{m=1}^{M} C_{mn}$$
and
$$C_{mn} \cap C_{m'n} = \emptyset \quad \forall m \neq m'.$$

The utility function of each alternative is composed of a term
specific to the alternative and a term associated with the nest. If i is
an alternative from nest Cmn, we have
$$U_{in} = \tilde{V}_{in} + \tilde{\varepsilon}_{in} + \tilde{V}_{C_{mn}} + \tilde{\varepsilon}_{C_{mn}}.$$

The error terms $\tilde{\varepsilon}_{in}$ and $\tilde{\varepsilon}_{C_{mn}}$ are assumed to be independent. As in
the Multinomial Logit Model, the error terms $\tilde{\varepsilon}_{in}$ are assumed to be
independent and identically Gumbel distributed, with scale
parameter µm (it can be different for each nest). The distribution of
$\tilde{\varepsilon}_{C_{mn}}$ is such that the random variable $\max_{j \in C_{mn}} U_{jn}$ is Gumbel
distributed with scale parameter µ.


Each nest within the choice set is associated with a composite
utility
$$V_{C_{mn}} = \tilde{V}_{C_{mn}} + \frac{1}{\mu_m} \ln \sum_{j \in C_{mn}} e^{\mu_m \tilde{V}_{jn}}.$$
The second term is called expected maximum utility, LOGSUM,
inclusive value or accessibility in the literature. The probability for
individual n to choose alternative i within nest Cmn is given by
P (i|Cn ) = P(Cmn |Cn ) P(i|Cmn )
where
$$P(C_{mn} \mid C_n) = \frac{e^{\mu V_{C_{mn}}}}{\sum_{l=1}^{M} e^{\mu V_{C_{ln}}}}$$
and
$$P(i \mid C_{mn}) = \frac{e^{\mu_m \tilde{V}_{in}}}{\sum_{j \in C_{mn}} e^{\mu_m \tilde{V}_{jn}}}.$$

Parameters µ and µm reflect the correlation among alternatives
within the nest Cmn. The covariance between the utility of two
alternatives i and j in nest Cmn is

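$$\mathrm{Cov}(U_{in}, U_{jn}) = \begin{cases} \mathrm{var}(\tilde{\varepsilon}_{C_{mn}}) & \text{if } i \text{ and } j \in C_{mn}, \\ 0 & \text{otherwise,} \end{cases}$$
and the correlation is
$$\mathrm{Corr}(U_{in}, U_{jn}) = \begin{cases} 1 - \dfrac{\mu^2}{\mu_m^2} & \text{if } i \text{ and } j \in C_{mn}, \\ 0 & \text{otherwise.} \end{cases}$$
Therefore, as the correlation is non-negative, we have
$$0 \leq \frac{\mu}{\mu_m} \leq 1,$$
and
$$\frac{\mu}{\mu_m} = 1 \Leftrightarrow \mathrm{Corr}(U_{in}, U_{jn}) = 0.$$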
The parameters µ and µm are closely related in the model.
Actually, only their ratio is meaningful. It is not possible to identify
them separately. A common practice is to arbitrarily constrain one
of them to a specific value (usually 1).
As an example, we apply now the Nested Logit Model to the
route choice problem described in Figure 1. We partition the choice
set Cn={1,2a,2b} into C1n={1} and C2n={2a,2b}. The probability of
choosing path 1 is given by
$$P(1 \mid \{1,2a,2b\}) = \frac{1}{1 + 2^{\mu/\mu_2}},$$
where µ2 is the scale parameter of the random term associated with
C2n, and µ is the scale parameter of the choice between C1n and C2n.

Note that we require 0 ≤ µ/µ2 ≤ 1. The probability of each of the two
other paths is
$$P(2a \mid C_n) = P(2b \mid C_n) = \frac{1}{2} \cdot \frac{2^{\mu/\mu_2}}{1 + 2^{\mu/\mu_2}}.$$
In this example, we need to normalize either µ or µ2 to 1. In the
latter case we have
$$P(1 \mid \{1,2a,2b\}) = \frac{1}{1 + 2^{\mu}}$$
and
$$P(2a \mid C_n) = P(2b \mid C_n) = \frac{1}{2} \left( \frac{2^{\mu}}{1 + 2^{\mu}} \right)$$
and we require that 0 ≤ µ ≤1. Note that for µ=1 we obtain the
MNL result. For µ approaching zero, we obtain the expected result
when paths 2a and 2b fully overlap. A model where the scale
parameter µ is normalized to 1 is said to be “normalized from the
top.”
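The Nested Logit formulas and the route choice example can be sketched in Python as follows. The function and argument names are illustrative assumptions; the nest-level terms default to zero and µ2 is normalized to 1, as in the example above.

```python
import math


def nested_logit(V_alt, nests, mu, mu_m, V_nest=None):
    """Two-level Nested Logit probabilities.

    V_alt  : dict alternative -> alternative-specific utility
    nests  : dict nest -> list of alternatives (a partition of the choice set)
    mu     : scale parameter of the choice among nests
    mu_m   : dict nest -> scale parameter within the nest
    V_nest : dict nest -> nest-specific utility (defaults to 0)
    """
    V_nest = V_nest or {m: 0.0 for m in nests}
    inner = {m: sum(math.exp(mu_m[m] * V_alt[j]) for j in alts)
             for m, alts in nests.items()}
    # Composite utility: nest term plus (1 / mu_m) times the logsum.
    composite = {m: V_nest[m] + math.log(inner[m]) / mu_m[m] for m in nests}
    denom = sum(math.exp(mu * v) for v in composite.values())
    probs = {}
    for m, alts in nests.items():
        p_nest = math.exp(mu * composite[m]) / denom
        for i in alts:
            probs[i] = p_nest * math.exp(mu_m[m] * V_alt[i]) / inner[m]
    return probs


def route_example(mu, T=1.0):
    """Figure 1 example with the scale of nest C2 normalized to 1."""
    V = {"1": T, "2a": T, "2b": T}
    nests = {"C1": ["1"], "C2": ["2a", "2b"]}
    return nested_logit(V, nests, mu=mu, mu_m={"C1": 1.0, "C2": 1.0})


if __name__ == "__main__":
    for mu in (1.0, 0.5, 0.01):
        print(mu, route_example(mu))
```

For µ = 1 the sketch returns the Multinomial Logit split, and as µ decreases toward zero the probability of path 1 approaches 1/2, in line with the closed-form expressions above.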
A model where one of the parameters µm is normalized to 1 is
said to be “normalized from the bottom.” The latter may produce a
simpler formulation of the model. We illustrate it using the
following example.
In the context of a mode choice with Cn={bus, metro, car, bike},
we consider a model with two nests: C1n={bus,metro} contains the
public transportation modes and C2n={car,bike} contains the private
transportation modes. For the example’s sake, we consider the
following deterministic terms of the utility functions:
Vbus=β1 tbus; Vmetro=β1 tmetro; Vcar=β2 tcar; Vbike=β2 tbike

where ti is the travel time using mode i and β1 and β2 are
parameters to be estimated. Note that we have one parameter for
private and one for public transportation, and we have not included
the alternative specific constants in order to keep the example
simple.
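Applying the Nested Logit Model, we obtain
$$P(\text{bus}) = \frac{e^{\mu_1 \beta_1 t_{bus}}}{e^{\mu_1 \beta_1 t_{bus}} + e^{\mu_1 \beta_1 t_{metro}}} \cdot \frac{e^{\frac{\mu}{\mu_1} \ln\left(e^{\mu_1 \beta_1 t_{bus}} + e^{\mu_1 \beta_1 t_{metro}}\right)}}{e^{\frac{\mu}{\mu_1} \ln\left(e^{\mu_1 \beta_1 t_{bus}} + e^{\mu_1 \beta_1 t_{metro}}\right)} + e^{\frac{\mu}{\mu_2} \ln\left(e^{\mu_2 \beta_2 t_{car}} + e^{\mu_2 \beta_2 t_{bike}}\right)}}.$$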
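Define θ1= µ/µ1, θ2= µ/µ2 , β1*=µ1β1 and β2*=µ2β2 to obtain
$$P(\text{bus}) = \frac{e^{\beta_1^* t_{bus}}}{e^{\beta_1^* t_{bus}} + e^{\beta_1^* t_{metro}}} \cdot \frac{e^{\theta_1 \ln\left(e^{\beta_1^* t_{bus}} + e^{\beta_1^* t_{metro}}\right)}}{e^{\theta_1 \ln\left(e^{\beta_1^* t_{bus}} + e^{\beta_1^* t_{metro}}\right)} + e^{\theta_2 \ln\left(e^{\beta_2^* t_{car}} + e^{\beta_2^* t_{bike}}\right)}},$$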
with 0≤θ1,θ2≤1.
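The equivalence between the two parameterizations of the mode choice example can be verified numerically. The sketch below is illustrative only; the travel times, coefficients and scale parameters are hypothetical values chosen so that θ1 and θ2 lie between 0 and 1.

```python
import math


def p_bus_original(t, beta1, beta2, mu, mu1, mu2):
    """P(bus) written with the scale parameters mu, mu1 and mu2."""
    pub = math.exp(mu1 * beta1 * t["bus"]) + math.exp(mu1 * beta1 * t["metro"])
    prv = math.exp(mu2 * beta2 * t["car"]) + math.exp(mu2 * beta2 * t["bike"])
    within = math.exp(mu1 * beta1 * t["bus"]) / pub
    top_pub = math.exp((mu / mu1) * math.log(pub))
    top_prv = math.exp((mu / mu2) * math.log(prv))
    return within * top_pub / (top_pub + top_prv)


def p_bus_normalized(t, beta1_star, beta2_star, theta1, theta2):
    """P(bus) written with theta_m = mu / mu_m and beta_m* = mu_m beta_m."""
    pub = math.exp(beta1_star * t["bus"]) + math.exp(beta1_star * t["metro"])
    prv = math.exp(beta2_star * t["car"]) + math.exp(beta2_star * t["bike"])
    within = math.exp(beta1_star * t["bus"]) / pub
    top_pub = math.exp(theta1 * math.log(pub))
    top_prv = math.exp(theta2 * math.log(prv))
    return within * top_pub / (top_pub + top_prv)


if __name__ == "__main__":
    # Hypothetical travel times (hours) and parameters.
    t = {"bus": 0.5, "metro": 0.4, "car": 0.3, "bike": 0.8}
    beta1, beta2, mu, mu1, mu2 = -2.0, -3.0, 0.5, 1.0, 2.0
    a = p_bus_original(t, beta1, beta2, mu, mu1, mu2)
    b = p_bus_normalized(t, mu1 * beta1, mu2 * beta2, mu / mu1, mu / mu2)
    print(a, b)  # the two values should coincide up to rounding
```

Only θ1, θ2, β1* and β2* enter the second function, which is why the original scale parameters cannot be identified separately from the coefficients.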
This formulation simplifies the estimation process. For this
reason, it has been adopted by the Ben-Akiva and Lerman (1985)
textbook and in estimation packages like ALOGIT (Daly, 1987)
and HieLoW (Bierlaire, 1995, Bierlaire and Vandevyvere, 1995).
We emphasize here that these packages should be used with caution
when the same parameters are present in more than one nest.
Specific techniques inspired from artificial trees proposed by
Bradley and Daly (1991) must be used to obtain a correct
specification of the model. In the above example, if µ1=µ2, then
imposing the restriction β1=β2 is straightforward. However, for the
case of µ1≠µ2 and β1=β2=β, we define β*=µ1µ2β and create
artificial nodes below each alternative, with a scale µ2 for the first
nest and scale µ1 for the second. We refer the reader to Koppelman
and Chen (1998) for further discussion.
A direct extension of the Nested Logit Model consists in
partitioning some or all nests into sub-nests which can, in turn, be
divided into sub-nests. The model described above is valid at every
layer of the nesting, and the whole model is generated recursively.
Because of the complexity of these models, their structure is usually
represented as a tree. Clearly, the number of potential structures
reflecting the correlation among alternatives can be very large. No
technique has been proposed thus far to identify the most
appropriate correlation structure directly from the data.
The Nested Logit Model is designed to capture choice problems
where alternatives within each nest are correlated. No correlation
across nests can be captured by the Nested Logit Model. When
alternatives cannot be partitioned into well separated nests to reflect
their correlation, the Nested Logit Model is not appropriate.

Cross-Nested Logit Model


The Cross-Nested Logit Model is a direct extension of the
Nested Logit Model, where each alternative may belong to more
than one nest. Similar to the Nested Logit Model, the choice set Cn
is partitioned into M nests Cmn. Moreover, for each alternative i and
each nest m, parameters αim (0≤αim≤1) representing the degree of
“membership” of alternative i in nest m are defined. The utility of
alternative i is given by
$$U_{imn} = \tilde{V}_{in} + \tilde{\varepsilon}_{in} + \tilde{V}_{C_{mn}} + \tilde{\varepsilon}_{C_{mn}} + \ln \alpha_{im}.$$

The error terms $\tilde{\varepsilon}_{in}$ and $\tilde{\varepsilon}_{C_{mn}}$ are independent. The error terms
$\tilde{\varepsilon}_{in}$ are independent and identically Gumbel distributed, with unit
scale parameter (this assumption is not the most general, but
simplifies the derivation of the model). The distribution of $\tilde{\varepsilon}_{C_{mn}}$ is
such that the random variable $\max_{j \in C_{mn}} U_{jmn}$ is Gumbel distributed with
scale parameter µ. The probability for individual n to choose
alternative i is given by

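$$P(i \mid C_n) = \sum_{m=1}^{M} P(C_{mn} \mid C_n) \, P(i \mid C_{mn}),$$
where
$$P(C_{mn} \mid C_n) = \frac{e^{\mu V_{C_{mn}}}}{\sum_{l=1}^{M} e^{\mu V_{C_{ln}}}},$$
$$P(i \mid C_{mn}) = \frac{\alpha_{im} e^{\tilde{V}_{in}}}{\sum_{j \in C_{mn}} \alpha_{jm} e^{\tilde{V}_{jn}}},$$
and
$$V_{C_{mn}} = \tilde{V}_{C_{mn}} + \ln \sum_{j \in C_{mn}} \alpha_{jm} e^{\tilde{V}_{jn}}.$$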
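The Cross-Nested Logit probabilities above can be computed with the following Python sketch. The data layout (a dictionary of membership parameters per nest) and the example values are assumptions made for illustration; the unit scale within nests follows the simplifying assumption stated in the text.

```python
import math


def cross_nested_logit(V_alt, alpha, mu, V_nest=None):
    """Cross-Nested Logit probabilities with unit scale inside the nests.

    V_alt  : dict alternative -> alternative-specific utility
    alpha  : dict nest -> dict alternative -> membership alpha_im (> 0);
             alternatives not listed in a nest have alpha_im = 0
    mu     : scale parameter of the choice among nests
    V_nest : dict nest -> nest-specific utility (defaults to 0)
    """
    nests = list(alpha)
    V_nest = V_nest or {m: 0.0 for m in nests}
    inner = {m: sum(a * math.exp(V_alt[j]) for j, a in alpha[m].items())
             for m in nests}
    composite = {m: V_nest[m] + math.log(inner[m]) for m in nests}
    denom = sum(math.exp(mu * composite[m]) for m in nests)
    probs = {i: 0.0 for i in V_alt}
    for m in nests:
        p_nest = math.exp(mu * composite[m]) / denom
        for i, a in alpha[m].items():
            # Sum over nests of P(nest) * P(alternative | nest).
            probs[i] += p_nest * a * math.exp(V_alt[i]) / inner[m]
    return probs


if __name__ == "__main__":
    # Hypothetical example: path "2a" belongs to both nests.
    V = {"1": 1.0, "2a": 1.0, "2b": 1.0}
    alpha = {"A": {"1": 1.0, "2a": 0.5}, "B": {"2a": 0.5, "2b": 1.0}}
    print(cross_nested_logit(V, alpha, mu=0.5))
```

When every alternative belongs to exactly one nest with membership 1, the sketch reduces to a Nested Logit model with unit within-nest scales.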

This model was first presented by McFadden (1978) as a special
case of the GEV model that is presented below. It was applied by
Small (1987) for departure time choice and by Vovsha (1998) for
route choice.

Generalized Extreme Value


The Generalized Extreme Value (GEV) model has been derived
from the random utility model by McFadden (1978). This general
model consists of a large family of models that include the
Multinomial Logit and the Nested Logit models. The probability of
choosing alternative i within Cn is

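$$P(i \mid C_n) = \frac{e^{V_{in}} \, \dfrac{\partial G}{\partial e^{V_{in}}}\left(e^{V_{1n}}, \ldots, e^{V_{J_n n}}\right)}{\mu \, G\left(e^{V_{1n}}, \ldots, e^{V_{J_n n}}\right)}.$$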

Jn is the number of alternatives in Cn and G is a non-negative
differentiable function defined on $\mathbb{R}_+^{J_n}$ with the following
properties:

1. G is homogeneous of degree µ > 0 (see footnote 1),
2. $\lim_{x_i \to \infty} G(x_1, \ldots, x_i, \ldots, x_{J_n}) = \infty, \ \forall i = 1, \ldots, J_n$,
3. the kth partial derivative with respect to k distinct xi is non-negative if k is odd, and non-positive if k is even, that is, for any distinct i1,…,ik ∈ {1,…,Jn} we have
$$(-1)^k \frac{\partial^k G}{\partial x_{i_1} \cdots \partial x_{i_k}}(x) \leq 0 \quad \forall x \in \mathbb{R}_+^{J_n}.$$
The Multinomial Logit Model, the Nested Logit Model and the
Cross-Nested Logit Model are GEV models, with
$$G(x) = \sum_{i=1}^{J_n} x_i^{\mu}$$
for the Logit model,
$$G(x) = \sum_{m=1}^{M} \left( \sum_{i \in C_{mn}} x_i^{\mu_m} \right)^{\mu/\mu_m}$$
for the Nested Logit model and
$$G(x) = \sum_{m=1}^{M} \left( \sum_{j \in C_n} \alpha_{jm} x_j^{\mu_m} \right)^{\mu/\mu_m}$$
for the Cross-nested Logit model.

Footnote 1: McFadden’s original formulation with µ=1 was generalized to
µ>0 by Ben-Akiva and François (1983).
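As an illustration of how a generating function G yields choice probabilities, the Python sketch below evaluates the GEV probability formula with a central finite-difference approximation of the partial derivative of G. The numerical differentiation, the step size and all names are assumptions made for this sketch; an actual implementation would normally use analytical derivatives.

```python
import math


def gev_probabilities(V, G, mu, h=1e-6):
    """P(i) = y_i * dG/dy_i (y) / (mu * G(y)) with y_i = exp(V_i).

    The derivative is approximated by a central difference with a
    relative step h (an illustrative choice).
    """
    y = [math.exp(v) for v in V]
    g = G(y)
    probs = []
    for i, yi in enumerate(y):
        step = h * yi
        up = y[:i] + [yi + step] + y[i + 1:]
        down = y[:i] + [yi - step] + y[i + 1:]
        dG = (G(up) - G(down)) / (2.0 * step)
        probs.append(yi * dG / (mu * g))
    return probs


def make_logit_G(mu):
    """G(x) = sum_i x_i^mu (Multinomial Logit)."""
    return lambda x: sum(xi ** mu for xi in x)


def make_nested_G(nests, mu, mu_m):
    """G(x) = sum_m (sum_{i in nest m} x_i^mu_m)^(mu / mu_m) (Nested Logit).

    nests is a list of lists of alternative indices; mu_m lists the
    corresponding nest scale parameters.
    """
    return lambda x: sum(
        sum(x[i] ** mm for i in nest) ** (mu / mm)
        for nest, mm in zip(nests, mu_m)
    )


if __name__ == "__main__":
    V = [1.0, 1.0, 1.0]  # paths 1, 2a, 2b with a hypothetical common utility
    print(gev_probabilities(V, make_logit_G(1.0), mu=1.0))
    G_nested = make_nested_G([[0], [1, 2]], mu=0.5, mu_m=[1.0, 1.0])
    print(gev_probabilities(V, G_nested, mu=0.5))
```

With the Logit generating function the result matches the Multinomial Logit formula, and with the nested function it matches the route choice example given for the Nested Logit Model, up to the finite-difference error.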

Multinomial Probit Model

The Probability Unit (or Probit) model should have been called
Normit, for Normal Probability Unit model. It is derived from the
assumption that the error terms of the utility functions are normally
distributed. The Probit model captures explicitly the correlation
among all alternatives. Therefore, we adopt a vector notation for
the utility functions:
Un = Vn + εn,
where Un, Vn and εn are (Jn×1) vectors. The vector of error terms
εn=[ε1n,ε2n,...,εJn]T is multivariate normal distributed with a vector of
means 0 and a JnxJn variance-covariance matrix Σn.
The probability that a given individual n chooses alternative i
from the choice set Cn is given by
P(i|Cn ) = P(U jn − U in ≤ 0 ∀j ∈ Cn ) .

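Denoting by ∆i the (Jn−1)×Jn matrix such that
$$\Delta_i U_n = [U_{1n} - U_{in}, \ldots, U_{(i-1)n} - U_{in}, U_{(i+1)n} - U_{in}, \ldots, U_{J_n n} - U_{in}]^T,$$
we have that
$$\Delta_i U_n \sim N(\Delta_i V_n, \Delta_i \Sigma_n \Delta_i^T).$$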
The density function is given by
$$f_i(x) = \lambda \exp\left[ -\frac{1}{2} (x - \Delta_i V_n)^T (\Delta_i \Sigma_n \Delta_i^T)^{-1} (x - \Delta_i V_n) \right],$$
where
$$\lambda = (2\pi)^{-\frac{J_n - 1}{2}} \, |\Delta_i \Sigma_n \Delta_i^T|^{-1/2},$$
and
$$P(i \mid C_n) = P(\Delta_i U_n \leq 0) = \int_{-\infty}^{0} \cdots \int_{-\infty}^{0} f_i(x) \, dx_1 \cdots dx_{i-1} \, dx_{i+1} \cdots dx_{J_n}.$$

The matrix ∆i is such that the ith column contains -1 everywhere.
If the ith column is removed, the remaining (Jn−1)×(Jn−1) matrix is
the identity matrix. For example, in the case of a trinomial choice
model, we have
$$\Delta_2 = \begin{pmatrix} 1 & -1 & 0 \\ 0 & -1 & 1 \end{pmatrix}.$$
We note that the multifold integral becomes intractable even for a
relatively low number of alternatives. Moreover, the number of
unknown parameters in the variance-covariance matrix grows with
the square of the number of alternatives. We refer the reader to
McFadden (1989) for a detailed discussion of multinomial Probit
models. We present below the Generalized Factor Analytic
formulation designed to decrease the degree of complexity of Probit
models.
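The difference matrix ∆i introduced above and the covariance matrix of the utility differences can be built with a short Python sketch. The helper names and the plain list-of-rows matrix representation are illustrative assumptions.

```python
def delta_matrix(i, J):
    """Return Delta_i as a list of rows, for the 1-based index i among J alternatives.

    Each row corresponds to one alternative j != i and computes U_j - U_i.
    """
    rows = []
    for j in range(1, J + 1):
        if j == i:
            continue
        row = [0] * J
        row[j - 1] = 1
        row[i - 1] = -1
        rows.append(row)
    return rows


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def difference_covariance(i, Sigma):
    """Covariance of Delta_i U, that is Delta_i Sigma Delta_i^T."""
    D = delta_matrix(i, len(Sigma))
    return matmul(matmul(D, Sigma), transpose(D))


if __name__ == "__main__":
    print(delta_matrix(2, 3))  # the trinomial example from the text
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(difference_covariance(2, identity))
```

Even when the error terms themselves are uncorrelated, the differences share the term Uin and are therefore correlated, which is visible in the off-diagonal entries of the resulting matrix.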

Generalized Factor Analytic Specification of the Random Utility

The general form of the factor analytic specification is

Un = Vn + εn = Vn + Fn ζn,
where Un is a (Jn×1) vector of utilities, Vn is a (Jn×1) vector of
deterministic utilities, εn is a (Jn×1) vector of random terms, ζn is a
(M×1) vector of factors which are IID standard normal distributed,
and Fn is a Jn × M matrix of loadings that map the factors to the
random utility vector. This specification is very general. If M = J,
the number of alternatives in the universal set, we can define the
matrix F as the Cholesky factor of the variance-covariance matrix
Σ, that is Σ=F FT. Fn is then obtained by removing the rows
associated with unavailable alternatives. We describe here special
cases of factor analytical representations. They are discussed in
more detail by Ben-Akiva and Bolduc (1996).

Heteroscedasticity
A heteroscedastic model (see footnote 2) is obtained when Fn is a Jn×Jn
diagonal matrix. Let T be a diagonal matrix containing the alternative
specific standard deviations σi. Fn is obtained from T by removing the rows
and columns of the unavailable alternatives. We obtain the
following model, in scalar form:
Uin = Vin + σiζin.

Footnote 2: Heteroscedasticity here refers to different variances among the
alternatives. We use it in this context to refer to a diagonal
variance-covariance matrix with potentially different terms on the
diagonal.

Factor Analytic
In this model, the general matrix Fn is divided into a matrix of
loadings Qn and a diagonal matrix T containing the factor specific
standard deviations. We obtain the following model,
Un =Vn + Qn T ζn.
Or, in scalar form:
$$U_{in} = V_{in} + \sum_{m=1}^{M} q_{imn} \sigma_m \zeta_{mn},$$
where qimn are the elements of Qn and σm are the diagonal elements
of T. The matrix Qn is normalized so that
$$\sum_{i,m} q_{imn}^2 = 1 \quad \forall n.$$

When the matrix Qn is known and does not need to be estimated,
the model is referred to as the Error Component Formulation.

General Autoregressive Process


We consider the case where the error term εn is generated from a
first-order autoregressive process:
$$\varepsilon_n = \rho W_n \varepsilon_n + T \zeta_n,$$
where Wn is a (Jn× Jn) matrix of weights describing the influence of
each component of the error terms on the others, and $\zeta_n \sim N(0, I_{J_n})$,
as above. Then we have
$$\varepsilon_n = (I - \rho W_n)^{-1} T \zeta_n,$$
which is a special case of the factor analytic representation with
$$Q_n = (I - \rho W_n)^{-1}.$$

Hybrid Logit Model

The Multinomial Probit with a Logit kernel, or Hybrid Logit (see footnote 3),
model was introduced by Ben-Akiva and Bolduc (1996). It is
intended to bridge the gap between Logit and Probit models by
combining the advantages of both of them. It is based on the
following utility functions:
Uin = Vin + ξin + υin,
where ξin are normally distributed and capture correlation between
alternatives, and υin are independent and identically distributed
Gumbel variables.
If the ξin are given, the model corresponds to a Multinomial Logit
formulation:
$$P(i | C_n, \xi_n) = \frac{e^{V_{in} + \xi_{in}}}{\sum_{j \in C_n} e^{V_{jn} + \xi_{jn}}},$$

where $\xi_n = [\xi_{1n}, \ldots, \xi_{J_n n}]^T$ is the vector of unobserved random terms.
Therefore, the probability of choosing alternative i is given by
$$P(i | C_n) = \int_{\xi_n} P(i | C_n, \xi_n) \, f(\xi_n) \, d\xi_n,$$

where f(ξn) is the probability density function of ξn. This model is a
generalization of the Multinomial Probit Model when the
distribution f(ξn) is a multivariate normal. Other distributions may
also be used. The earliest application of this model to capture
random coefficients in the Logit Model (see below) was by Cardell
and Dunbar (1980). More recent results highlighted the robustness
of Hybrid Logit (see McFadden and Train, 1997).

[3] Sometimes called Mixed Logit.

Hybrid Logit with factor analytical representation.


The Hybrid Logit Model can be combined with the factor
analytical representation presented above to allow practical
estimation using a simulated maximum likelihood procedure (see
Ben-Akiva and Bolduc, 1996). The “Probit” error term is
transformed using any appropriate factor analytical representation
to obtain the following choice probability:
$$P(i | C_n) = \int_{\zeta_n} P(i | C_n, \zeta_n) \, N(0, I_M) \, d\zeta_n.$$

This formulation of the multinomial Probit is especially useful when
the number of alternatives is so high that the use of probability
simulators is required.
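
The integral above has no closed form, but it can be approximated by averaging the conditional Logit probabilities over random draws of ζn. The Python sketch below shows this idea under simplifying assumptions: the loading matrix F, the utilities and the number of draws are hypothetical, and plain pseudo-random draws are used rather than any particular simulator from the literature.

```python
import math
import random

def logit_probs(utilities):
    """Multinomial Logit probabilities, computed in a numerically stable way."""
    m = max(utilities)
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

def simulated_hybrid_logit(V, F, n_draws=2000, seed=0):
    """Average the conditional Logit probabilities over draws zeta ~ N(0, I_M)."""
    rng = random.Random(seed)
    J, M = len(V), len(F[0])
    acc = [0.0] * J
    for _ in range(n_draws):
        zeta = [rng.gauss(0.0, 1.0) for _ in range(M)]
        # "Probit" error term xi = F zeta (factor analytic representation).
        xi = [sum(F[i][m] * zeta[m] for m in range(M)) for i in range(J)]
        p = logit_probs([V[i] + xi[i] for i in range(J)])
        acc = [a + pi for a, pi in zip(acc, p)]
    return [a / n_draws for a in acc]

V = [0.0, 0.0, 0.0]
F = [[1.0], [1.0], [0.0]]   # hypothetical: alternatives 0 and 1 share one factor
P = simulated_hybrid_logit(V, F)
```

When all loadings are zero the simulated probabilities reduce to the Multinomial Logit probabilities, which is a convenient check of such an implementation.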

Random coefficients
We conclude our discussion of the Hybrid Logit model with a
formulation of the Multinomial Logit Model with randomly
distributed coefficients:
$$U_n = V_n + \upsilon_n = X_n \beta_n + \upsilon_n.$$
Assume that βn ~ N(β, Ω). If Γ is the Cholesky factor of Ω such that
$\Gamma \Gamma^T = \Omega$, we replace βn by β + Γζn to obtain
$$U_n = X_n \beta + X_n \Gamma \zeta_n + \upsilon_n.$$
It is a Hybrid Logit model with a factor analytic representation
with $F_n = X_n \Gamma$.
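
As a small numerical illustration of this construction, the following Python sketch computes the Cholesky factor of a 2×2 covariance matrix Ω and the implied loadings Xn Γ. The values of Ω and Xn and the helper names are assumptions chosen for the example.

```python
import math

def cholesky(omega):
    """Lower-triangular G with G G^T = omega (omega symmetric positive definite)."""
    n = len(omega)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(G[i][k] * G[j][k] for k in range(j))
            if i == j:
                G[i][j] = math.sqrt(omega[i][i] - s)
            else:
                G[i][j] = (omega[i][j] - s) / G[j][j]
    return G

def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

omega = [[4.0, 1.2], [1.2, 1.0]]               # hypothetical covariance of beta_n
gamma = cholesky(omega)                        # Gamma such that Gamma Gamma^T = Omega
X = [[10.0, 1.0], [15.0, 0.0], [12.0, 0.5]]    # hypothetical attributes, 3 alternatives
F = matmul(X, gamma)                           # loadings F_n = X_n Gamma
```

The matrix F can then play the role of the loading matrix in a simulated Hybrid Logit probability.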

Latent class choice model

Latent class choice models are also designed to capture unobserved
heterogeneity. The underlying assumption is that the heterogeneity
is generated by discrete constructs. These constructs are not
directly observable and therefore are represented by latent classes.

For example, heterogeneity may be produced by taste variations
across segments of the population, or when choice sets considered
by individuals vary (latent choice set).
The latent class choice model is given by:
$$P(i | X_n) = \sum_{s=1}^{S} P(i | X_n; \beta_s, C_s) \, P(s | X_n; \theta),$$

where S is the number of latent classes, Xn is the vector of attributes
of alternatives and characteristics of decision-maker n, βs are the
choice model parameters specific to class s, Cs is the choice set
specific to class s, and θ is an unknown parameter vector.
The model
$$P(s | X_n; \theta)$$
is the class membership model, and
$$P(i | X_n; \beta_s, C_s)$$
is the class-specific choice model.
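
A minimal Python sketch of this mixture is given below. It assumes, purely for illustration, that both the class membership model and the class-specific choice models are Multinomial Logit models; the class names, coefficients and attribute values are hypothetical.

```python
import math

def logit_probs(utils):
    """Multinomial Logit probabilities for a dict {alternative: utility}."""
    m = max(utils.values())
    expu = {k: math.exp(u - m) for k, u in utils.items()}
    total = sum(expu.values())
    return {k: e / total for k, e in expu.items()}

def latent_class_probs(x, classes, membership_utils):
    """P(i|X) = sum_s P(i|X; beta_s, C_s) * P(s|X; theta)."""
    class_probs = logit_probs(membership_utils)      # class membership model
    out = {i: 0.0 for i in x}
    for s, spec in classes.items():
        # Class-specific choice model, restricted to the class choice set C_s.
        utils = {i: sum(b * xk for b, xk in zip(spec["beta"], x[i]))
                 for i in spec["choice_set"]}
        for i, p in logit_probs(utils).items():
            out[i] += class_probs[s] * p
    return out

# Attributes per alternative: [travel time in minutes, cost].
x = {"car": [20.0, 3.0], "bus": [35.0, 1.5], "walk": [60.0, 0.0]}
classes = {
    "time_sensitive": {"beta": [-0.10, -0.2], "choice_set": ["car", "bus"]},
    "cost_sensitive": {"beta": [-0.02, -1.0], "choice_set": ["car", "bus", "walk"]},
}
P = latent_class_probs(x, classes, {"time_sensitive": 0.0, "cost_sensitive": 0.0})
```

Here the membership utilities are passed in directly; in an estimated model they would be functions of Xn and θ.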

A special case: latent choice sets


A special case is the choice model with latent choice sets:
$$P_n(i) = \sum_{C_n \in G} P(i | C_n) \, P(C_n),$$

where G is the set of all non-empty subsets of the universal choice
set M, and P(i|Cn) is a choice model. We note here that the size of
G grows exponentially with the size of the universal choice set.
The latent choice set can be modeled using the concept of
alternative availability. Then, a list of constraints or criteria is used
to characterize the availability of alternatives. For each alternative i,
a binary random variable Ain is defined such that Ain=1 if alternative
i is available to individual n, and 0 otherwise. A list of Kin
constraints is defined as follows:
$$A_{in} = 1 \text{ if } H_{ink} \ge 0, \quad \forall k = 1, \ldots, K_{in}.$$
For example, in a path choice context, one may consider that a path
is not available if the ratio between its length and the shortest path
length is above some threshold, represented by a random variable.
The associated availability constraint for path i would then be:
$$L_i / L^* \le 2 + \varepsilon,$$
where L* is the length of the shortest path, Li is the length of path i
and ε is a random variable with zero mean. It means that, on average,
paths longer than twice the length of the shortest path are rejected.
The probability for an alternative to be available is given by
$$P(A_{in} = 1) = P(H_{ink} \ge 0, \ \forall k = 1, \ldots, K_{in}).$$
The latent choice set probability is then:
$$P(C_n) = \frac{P(A_{in} = 1, \forall i \in C_n \text{ and } A_{jn} = 0, \forall j \notin C_n)}{1 - P(A_{ln} = 0, \forall l \in M)}.$$
If the availability criteria are assumed to be independent, we have

$$P(C_n) = \frac{\prod_{i \in C_n} P(A_{in} = 1) \prod_{j \notin C_n} P(A_{jn} = 0)}{1 - \prod_{l \in M} P(A_{ln} = 0)}.$$

Swait and Ben-Akiva (1987) estimate a latent choice set model
of mode choice in a Brazilian city.
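
The latent choice set formulation with independent availabilities can be sketched in Python as follows. The sketch enumerates all non-empty subsets of a small universal choice set, which is only practical for a handful of alternatives; the availability probabilities, the utilities and the use of a Multinomial Logit model for P(i|Cn) are assumptions made for the example.

```python
import math
from itertools import combinations

def choice_set_probs(avail):
    """P(C) for every non-empty subset C, assuming independent availabilities."""
    alts = sorted(avail)
    p_none = math.prod(1.0 - avail[a] for a in alts)   # empty choice set
    probs = {}
    for r in range(1, len(alts) + 1):
        for C in combinations(alts, r):
            p = math.prod(avail[a] if a in C else 1.0 - avail[a] for a in alts)
            probs[C] = p / (1.0 - p_none)
    return probs

def choice_probs(V, avail):
    """P(i) = sum over choice sets C of P(i|C) P(C), with a Logit P(i|C)."""
    out = {i: 0.0 for i in V}
    for C, p_C in choice_set_probs(avail).items():
        denom = sum(math.exp(V[j]) for j in C)
        for i in C:
            out[i] += p_C * math.exp(V[i]) / denom
    return out

avail = {"bus": 0.9, "car": 0.6, "walk": 0.3}   # hypothetical P(A_in = 1)
V = {"bus": -1.0, "car": -0.5, "walk": -2.0}    # hypothetical utilities
set_probs = choice_set_probs(avail)
P = choice_probs(V, avail)
```

With three alternatives there are seven non-empty choice sets; the count doubles with each additional alternative, which is the exponential growth noted above.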

Route choice applications

The route choice problem can be stated as follows. Given a
transportation network composed of nodes, links, origins and
destinations, and given an origin o, a destination d and a
transportation mode m, what is the chosen route between o and d
on mode m? This discrete choice problem has specific
characteristics. First, the universal choice set is usually very large.
Second, not all physically feasible alternatives are considered by the
decision-maker. Third, the alternatives are usually correlated, due
to overlapping paths.
We now describe typical assumptions associated with route
choice models.

Decision-Maker

The traveler’s characteristics most often used for route choice
applications are:
• Value-of-time. Obviously, travel time is a key attribute of
alternative routes. Its influence on behavior, however, may
vary across individuals. A Wall Street broker is likely to
perceive and evaluate travel time differently from a retired
Floridian. The sensitivity of an individual to travel time is
usually referred to as the value-of-time. It can be represented
by a continuous variable (e.g., the dollar-value equivalent of
a minute spent traveling) or by a discrete variable identifying
the decision-maker’s value-of-time as low, medium or high.
• Access to information. Information about network conditions
may significantly influence route choice behavior. Therefore,
it may be important that a route choice model explicitly
differentiates travelers with access to such information from
those without access. It may be modeled by a single binary
attribute (access/no access) or by several binary variables
identifying the type of information available to the traveler
(pre-trip information, on-board computer, etc.).
• Trip purpose. The purpose of the trip may significantly
influence the route choice behavior. For example, a trip to
work may be associated with a penalty for late arrival, while
a shopping trip would usually have no such penalty.
However, note that the trip purpose may be highly correlated
with the value-of-time.

Alternatives

Identifying the choice set in a route choice context is a difficult
task. Two main approaches can be considered.
First, it may be assumed that each individual can potentially
choose any path between her/his origin and destination. The choice
set is easy to identify, but the number of alternatives can be very
large, causing operational problems in estimating and applying the
model. Moreover, this assumption is behaviorally unrealistic.
Second, a restricted number of paths may be considered in the
choice set. The choice set generation can be deterministic or
stochastic, depending on the analyst’s knowledge of the problem.
Dial (1971) proposes to include in the choice set “reasonable”
paths composed of links that would not move the traveler farther
away from her/his destination. The labeling approach (proposed by
Ben-Akiva et al., 1984) includes paths meeting specific criteria,
such as shortest paths, fastest paths, most scenic paths, paths with
fewest stop lights, paths with least congestion, paths with greatest
portion of freeways, paths with no left turns, etc.
An application of an implicit probabilistic choice set generation
model has been proposed by Cascetta and Papola (1998), where the
utility function associated with path i by individual n is defined as

$$U_{in} = V_{in} + \ln q_{in} + \varepsilon_{in},$$

where qin is a random variable with mean
$$\bar{q}_{in} = \frac{1}{1 + \exp\left(-\sum_k \gamma_k X^A_{ink}\right)}.$$
$X^A_{ink}$ are the attributes for availability and perception of the path
and γk are parameters to be estimated.
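
To give a feeling for how the availability term acts on the choice probabilities, the Python sketch below replaces qin by its mean and assumes i.i.d. Gumbel error terms, so that the model reduces to a Multinomial Logit with utilities Vin + ln of the mean availability. Both simplifications, as well as the attribute values and function names, are assumptions of this illustration rather than a description of the estimation procedure of Cascetta and Papola (1998).

```python
import math

def mean_availability(x_avail, gamma):
    """q_bar = 1 / (1 + exp(-sum_k gamma_k * x_k))."""
    return 1.0 / (1.0 + math.exp(-sum(g * x for g, x in zip(gamma, x_avail))))

def implicit_availability_probs(V, x_avail, gamma):
    """Logit probabilities with utilities V_i + ln(q_bar_i)."""
    u = {i: V[i] + math.log(mean_availability(x_avail[i], gamma)) for i in V}
    m = max(u.values())
    expu = {i: math.exp(ui - m) for i, ui in u.items()}
    total = sum(expu.values())
    return {i: e / total for i, e in expu.items()}

V = {"p1": -1.0, "p2": -1.0}             # identical systematic utilities
x_avail = {"p1": [2.0], "p2": [-2.0]}    # hypothetical availability attribute
P = implicit_availability_probs(V, x_avail, gamma=[1.0])
```

Although both paths have the same systematic utility, the path with the lower mean availability receives a smaller probability, without the choice set ever being enumerated explicitly.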
Some recent models (Nguyen and Pallottino, 1987, Nguyen,
Pallottino and Gendreau, 1998) consider hyperpaths instead of
paths as alternatives. A hyperpath is a collection of paths with
associated strategies at decision nodes. This technique is
particularly appropriate for a public transportation network.

Attributes

In describing the attributes of the alternatives to be included in
the utility function, we need to distinguish between link-additive
and non-link-additive attributes.
If i is a path composed of links a ∈ Γi, xi is a link-additive
attribute of i if
$$x_i = \sum_{a \in \Gamma_i} x_a,$$

where xa is the corresponding attribute of link a. For example, the
travel time on a path is the sum of the travel times on links
composing the path. Qualitative attributes are in general non-link-additive.
For example, a binary variable xi equal to one if the path is
a habitual path and 0 otherwise, is non-additive. In the context of
public transportation, variables like transfers and fares are usually
not link-additive. The distinction is important because some models,
designed to avoid path enumeration, use link attributes and not path
attributes.
Among the many attributes that can potentially be included in a
utility function, travel time is probably the most important. But
what does travel time mean for the decision-maker? How does
she/he perceive travel time? Many models are based on the
assumption that most travelers are sufficiently experienced and
knowledgeable about usual network conditions and, therefore, are
able to estimate travel times accurately. This assumption may be
satisfactory for planning applications using static models. With the
emergence of Intelligent Transportation Systems, models that are
able to predict the impact of real-time information have been
developed. In this context, the "perfect knowledge" assumption is
inconsistent with the ITS services that provide information.
Several approaches can be used to capture perceptions of travel
times. One approach represents travel time as a random variable in
the utility function. This idea was introduced by Burrell (1968) and
is captured by a random utility model. Also, the uncertainty or the
variability of travel time along a given path can be explicitly
included as an attribute of the path.
In addition to travel time, the following attributes are usually
included.
• Path length. The length of the path is likely to influence the
decision maker’s choice. Also, this attribute is easy to
measure. Note that it may be highly correlated with travel
time, especially in uncongested networks.
• Travel cost. In addition to the obvious behavioral motivation,
including travel cost in the utility function is necessary to
forecast the impact of tolls and congestion pricing, for
example. It is common practice to distinguish the so-called
out-of-pocket costs (like tolls), which are directly associated
with a specific trip, from other general costs (like car
operating costs).
• Transit specific. Attributes specific to route choice in transit
networks include number of transfers, waiting and walking
time and service frequency.
• Others. Traffic conditions (e.g. level of congestion, volume
of conflicting traffic streams or pedestrian movements),
obstacles (e.g. number of stop signs, number of traffic lights,
number of left turns against traffic), road types (e.g. dummy
variable capturing preference for freeways) and road
condition (e.g. surface quality, number of lanes, safety,
scenery) are some of the other attributes that may be
considered. Whether to include them in the utility function
depends on their behavioral pertinence in a specific context,
and on data availability.

Decision Rules

Shortest path
The simplest possible decision rule in the route choice context
assumes that each individual chooses the path with the highest
utility. Models based on deterministic utility maximization are
supported by efficient algorithms to compute shortest paths in a
graph (e.g. Dijkstra, 1959, and Dial, 1969). However, the
behavioral limitations of this approach have motivated the
development of stochastic models based on the random utility
model.
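
The deterministic rule can be illustrated with a compact Python version of Dijkstra's shortest path algorithm. In this sketch the utility of a path is taken to be the negative of a link-additive, non-negative cost, so that the highest-utility path is the least-cost path; the small network and its link costs are hypothetical.

```python
import heapq

def shortest_path(graph, origin, destination):
    """Dijkstra's algorithm; graph[u] = {v: non-negative link cost}."""
    dist = {origin: 0.0}
    pred = {}
    heap = [(0.0, origin)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                     # stale heap entry
        done.add(u)
        if u == destination:
            break
        for v, c in graph.get(u, {}).items():
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    if destination not in dist:
        return float("inf"), []          # destination not reachable
    path = [destination]
    while path[-1] != origin:
        path.append(pred[path[-1]])
    return dist[destination], path[::-1]

graph = {"o": {"a": 4.0, "b": 2.0}, "a": {"d": 3.0},
         "b": {"a": 1.0, "d": 6.0}, "d": {}}
cost, path = shortest_path(graph, "o", "d")
```

Every traveler between o and d is assigned to the same path under this rule, which is the behavioral limitation that the stochastic models address.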

Logit route choice


A Multinomial Logit Model with an efficient algorithm for route
choice has been proposed by Dial (1971). Using the concept of
“reasonable paths” to define the choice set and assuming the path
attributes to be link-additive, this algorithm avoids explicit path
enumeration.
As described earlier, the IIA property of the Multinomial Logit
Model is the major weakness of Dial’s algorithm in the context of
highly overlapping routes. Therefore, its use is limited to networks
with specific topologies. A Logit model may also be used with a
choice set generation model, such as the Labeling approach, that
results in a small size choice set with limited overlap.

Probit route choice


Given the shortcomings of the Logit route choice model, Probit
models have been proposed in the context of stochastic network
loading by Burrell (1968) and Daganzo and Sheffi (1977). The two
problems in this case are (i) the complexity of the variance-
covariance matrix and (ii) the lack of an analytical formulation for
the probabilities. The covariance structure can be simplified when
path utilities are link-additive, the variance of link utility is
proportional to the utility itself, and the covariance of utilities of
two different links is zero. A Monte-Carlo simulation is often used
to circumvent the absence of a closed analytical form.
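
The Monte-Carlo approach under the simplified covariance structure can be sketched as follows in Python. Perceived link costs (the negative of link utilities) are drawn independently from normal distributions whose variance is proportional to the mean link cost, path costs are obtained by summing over links, and the share of each path is the fraction of draws in which it is the cheapest. The proportionality constant theta, the number of draws and the small network are assumptions made for the illustration.

```python
import math
import random

def probit_route_shares(paths, link_cost, theta=0.1, n_draws=5000, seed=0):
    """Simulated Probit shares with independent, link-additive normal errors."""
    rng = random.Random(seed)
    counts = {name: 0 for name in paths}
    for _ in range(n_draws):
        # Perceived link cost: mean c_a and variance theta * c_a (assumption).
        draw = {a: rng.gauss(c, math.sqrt(theta * c)) for a, c in link_cost.items()}
        best = min(paths, key=lambda name: sum(draw[a] for a in paths[name]))
        counts[best] += 1
    return {name: k / n_draws for name, k in counts.items()}

# Path 1 is independent; paths 2a and 2b share the link "common".
link_cost = {"direct": 10.0, "common": 8.0, "a": 2.0, "b": 2.0}
paths = {"1": ["direct"], "2a": ["common", "a"], "2b": ["common", "b"]}
shares = probit_route_shares(paths, link_cost)
```

Because the two overlapping paths share the draw of the common link, their perceived costs are correlated, which is how the link-based error structure captures overlap.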

C-Logit
The C-Logit model, proposed by Cascetta et al. (1996) in the
context of route choice, is a Multinomial Logit Model which
captures the correlation among alternatives in a deterministic way.
They add to the deterministic part of the utility function a term,
called “commonality factor”, that captures the degree of similarity
between the alternative and all other alternatives in the choice set:
$$P(i | C_n) = \frac{e^{\mu (V_{in} - CF_{in})}}{\sum_{j \in C_n} e^{\mu (V_{jn} - CF_{jn})}}.$$

Cascetta et al. (1996) propose the following specification for the
commonality factor:
$$CF_{in} = \beta_{CF} \ln \sum_{j \in C_n} \left( \frac{L_{ij}}{\sqrt{L_i L_j}} \right)^{\gamma},$$

where Lij is the length [4] of links common to paths i and j, and Li and
Lj are the overall length of paths i and j, respectively. βCF is a
coefficient to be estimated. The parameter γ may be estimated or
constrained to a convenient value, often 1 or 2.
Considering the path choice example in Figure 1, the
commonality factor for path 1 is zero because it does not overlap
with any other path. The commonality factor for paths 2a and 2b is
$$\beta_{CF} \ln\left(1 + [(T - \delta)/T]^{\gamma}\right).$$
Note that the commonality factor of an alternative is not one of
its attributes. It can be viewed as a measure of how the alternative
is perceived within a choice set.
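
The commonality factors quoted above can be reproduced numerically with the following Python sketch. It assumes that the Figure 1 network consists of path 1 of length T and of paths 2a and 2b, each of length T, which share a common section of length T−δ; this layout, the values T=10 and δ=2, and the function names are assumptions chosen to be consistent with the expressions in the text.

```python
import math

def commonality_factors(paths, length, beta_cf=1.0, gamma=1.0):
    """CF_i = beta_cf * ln sum_j (L_ij / sqrt(L_i L_j))**gamma, with j in the choice set."""
    L = {p: sum(length[a] for a in links) for p, links in paths.items()}
    cf = {}
    for i, links_i in paths.items():
        total = 0.0
        for j, links_j in paths.items():
            common = sum(length[a] for a in set(links_i) & set(links_j))
            total += (common / math.sqrt(L[i] * L[j])) ** gamma
        cf[i] = beta_cf * math.log(total)
    return cf

def c_logit_probs(V, cf, mu=1.0):
    """C-Logit probabilities with utilities V_i - CF_i."""
    expu = {i: math.exp(mu * (V[i] - cf[i])) for i in V}
    total = sum(expu.values())
    return {i: e / total for i, e in expu.items()}

T, delta = 10.0, 2.0
length = {"direct": T, "common": T - delta, "a": delta, "b": delta}
paths = {"1": ["direct"], "2a": ["common", "a"], "2b": ["common", "b"]}
cf = commonality_factors(paths, length)
P = c_logit_probs({"1": 0.0, "2a": 0.0, "2b": 0.0}, cf)
```

With equal systematic utilities, a plain Multinomial Logit would assign one third to each path, whereas the commonality factor shifts probability toward the non-overlapping path 1.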

PS-Logit
Path-Size Logit is an application of the notion of elemental
alternatives and size variables. See Ben-Akiva and Lerman
(1985, Chapter 9) for details about models with elemental and aggregate
alternatives. In the route choice context, we assume that an

[4] Or any other link-additive attribute.

overlapping path may not be perceived as a distinct alternative.
Indeed, a path contains links which may be shared by several paths.
Hence, the size of a path with one or more shared links may be less
than one. We include a size variable in the utility of a path to obtain
the following model:
$$P(i | C_n) = \frac{e^{\mu (V_{in} + \ln S_{in})}}{\sum_{j \in C_n} e^{\mu (V_{jn} + \ln S_{jn})}},$$

where the size Sin is defined by

$$S_{in} = \sum_{a \in \Gamma_i} \frac{l_a}{L_i} \, \frac{1}{\sum_{j \in C_n} \delta_{aj} \dfrac{L^*_{C_n}}{L_j}}$$

and Γi is the set of links in path i; la and Li are the length of link a
and path i, respectively; δaj is the link-path incidence variable that is
one if link a is on path j and 0 otherwise; and L*Cn is the length of
the shortest path in Cn.
Considering again the path choice problem from Figure 1, the
size of path 1 is 1, and the size of paths 2a and 2b is (T+δ)/(2T). It is
interesting to note that the size variable formulation is equivalent to
the commonality factor formulation for the extreme cases where
δ=0 or δ=T, assuming that βCF=1 and for any value of γ. However,
the two models are different for intermediate values.
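
The path sizes quoted for this example can be checked with a short Python sketch. As before, the layout of the Figure 1 network is an assumption: path 1 has length T, and paths 2a and 2b each have length T and share a section of length T−δ. The numerical values T=10 and δ=2 and the function names are chosen only for illustration.

```python
import math

def path_sizes(paths, length):
    """S_i = sum_a (l_a / L_i) / sum_j delta_aj * (L_star / L_j)."""
    L = {p: sum(length[a] for a in links) for p, links in paths.items()}
    L_star = min(L.values())                     # shortest path in the choice set
    S = {}
    for i, links in paths.items():
        S[i] = sum(
            (length[a] / L[i])
            / sum(L_star / L[j] for j, links_j in paths.items() if a in links_j)
            for a in links
        )
    return S

def ps_logit_probs(V, S, mu=1.0):
    """Path-Size Logit probabilities with utilities V_i + ln S_i."""
    expu = {i: math.exp(mu * (V[i] + math.log(S[i]))) for i in V}
    total = sum(expu.values())
    return {i: e / total for i, e in expu.items()}

T, delta = 10.0, 2.0
length = {"direct": T, "common": T - delta, "a": delta, "b": delta}
paths = {"1": ["direct"], "2a": ["common", "a"], "2b": ["common", "b"]}
S = path_sizes(paths, length)
P = ps_logit_probs({"1": 0.0, "2a": 0.0, "2b": 0.0}, S)
```

For these values the size of paths 2a and 2b is (T+δ)/(2T) = 0.6, and path 1 keeps a size of one.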

Departure time choice applications

Modeling the choice of departure time appears in the context of
dynamic traffic assignment as an extension of the route choice
problem. It is important to distinguish between the departure time choice
itself and the choice of changing departure time. The latter appears
usually in the context of Traveler Information Systems, where
individuals may revisit a previous choice using additional
information. We now describe typical modeling assumptions
associated with the departure time choice model.

Decision-Maker

The relevant traveler’s socioeconomic characteristics are similar to
those of route choice models. Additional characteristics important
for departure time choice are desired arrival time and penalties for
early and late arrival.
In the context of departure time change, the individual’s
“habitual” or “historical” departure time must also be known.

Alternatives

The choice set specification for departure time models is an
intricate problem. First, the continuous time must be discretized. A
reasonable compromise must be found between a fine temporal
resolution and the model complexity. Indeed, there is a potentially
large number of alternatives, particularly for realistic dynamic traffic
applications. Second, the correlation among alternatives cannot be
ignored, especially when time intervals are short. Choosing between
the 7:45-7:50 and 7:50-7:55 time intervals differs from choosing
between 7:45-7:50 and 8:45-8:50. In the first case, the two
alternatives are likely to share unobserved attributes. Third, the
perception of the alternatives depends on trip travel time. Most
individuals round time and the rounding may depend on the travel
time and travel time variability. For short trips, 7:52 may be
rounded to 7:50, whereas for long trips it may be approximated by
8:00.

The choice set generation consists of defining an acceptable
range of departure time intervals considered by an individual n. A
common procedure is based on the desired arrival time AT*n. Let
$[AT_{n,\min}; AT_{n,\max}]$ be the feasible arrival time interval, and let
$[TT_{n,\min}; TT_{n,\max}]$ be the range of travel times. Then the interval of
acceptable departure times is
$$[DT_{n,\min}; DT_{n,\max}] = [AT_{n,\min} - TT_{n,\max}; \ AT_{n,\max} - TT_{n,\min}].$$
Small (1987) analyzed the impact of truncating the
departure time choice set. He concluded that there is no problem if
the true model is a Multinomial Logit Model. Some adjustments are
needed if a Cross-Nested Logit with ordered alternatives is
assumed.
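
The construction of the acceptable departure time range and its discretization can be sketched in a few lines of Python. Times are expressed in minutes after midnight; the arrival window, the travel time range and the 5-minute interval length are hypothetical values.

```python
def departure_window(at_min, at_max, tt_min, tt_max):
    """[DT_min, DT_max] = [AT_min - TT_max, AT_max - TT_min], in minutes."""
    return at_min - tt_max, at_max - tt_min

def departure_intervals(dt_min, dt_max, step=5):
    """Split the window into consecutive intervals of `step` minutes."""
    return [(t, min(t + step, dt_max)) for t in range(dt_min, dt_max, step)]

def hhmm(minutes):
    """Format minutes after midnight as h:mm."""
    return f"{minutes // 60}:{minutes % 60:02d}"

# Feasible arrivals between 8:30 and 9:00, travel times between 20 and 45 minutes.
dt_min, dt_max = departure_window(at_min=510, at_max=540, tt_min=20, tt_max=45)
intervals = departure_intervals(dt_min, dt_max)
```

In this example the acceptable departure times run from 7:45 to 8:40, which gives eleven 5-minute alternatives.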
In the context of departure time change, the alternatives may be
described in a relative way. Antoniou et al. (1997) propose a choice
set with five alternatives: do not change, switch to an earlier or a
later departure, by one or two time intervals.

Attributes

Travel time is a key attribute of departure time alternatives. Other
important attributes are the early and late schedule delays. Given a
desired arrival time AT*n, a penalty-free interval is defined:
$[AT^*_{n,\min}; AT^*_{n,\max}]$. It is assumed that the individual suffers no
penalty if the arrival time lies within the interval. The actual arrival
time ATn is equal to DTn + TT(DTn), where TT(DTn) is the travel
time if the trip starts at time DTn. The early schedule delay is
defined as
$$\max[AT^*_{n,\min} - AT_n, \ 0]$$
and the late schedule delay is defined as
$$\max[AT_n - AT^*_{n,\max}, \ 0].$$
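
These two definitions translate directly into code. In the Python sketch below, times are in minutes after midnight, and the travel time profile and the penalty-free interval are hypothetical.

```python
def schedule_delays(dt, travel_time, at_star_min, at_star_max):
    """Early and late schedule delays for a departure at time dt."""
    at = dt + travel_time(dt)                    # actual arrival time
    early = max(at_star_min - at, 0)
    late = max(at - at_star_max, 0)
    return early, late

def travel_time(dt):
    """Hypothetical profile: 40 minutes in the 8:00-9:00 peak, 30 minutes otherwise."""
    return 40 if 480 <= dt < 540 else 30

# Penalty-free arrival interval from 8:45 to 8:55.
early_case = schedule_delays(470, travel_time, 525, 535)    # leaves at 7:50
on_time_case = schedule_delays(490, travel_time, 525, 535)  # leaves at 8:10
late_case = schedule_delays(500, travel_time, 525, 535)     # leaves at 8:20
```

At most one of the two delays is positive for a given departure time, and both are zero when the arrival falls inside the penalty-free interval.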

In the context of departure time change, a penalty can also be
associated with departure times very different from the habitual
choice, capturing the inertia associated with habits.

Decision Rules

Small (1982) and Cascetta et al. (1992) use Multinomial Logit
Models for departure time choice. However, the
aforementioned correlation among alternatives is not captured by
such models. Small (1987) proposed an Ordered Generalized
Extreme Value model. It is a Cross-Nested Logit Model, where m
adjacent departure time intervals are nested together, capturing
their intrinsic correlation. A single departure time interval belongs
to m different nests, which is the source of the cross-nested structure.
In the context of departure time change, Antoniou et al. (1997)
propose a Nested Logit Model for joint choice of departure time
and route. Liu and Mahmassani (1998) propose a Probit model
where day-to-day correlation is assumed.

Conclusion

Discrete choice methods are constantly evolving to accommodate
the requirements of specific applications. This is an exciting field of
research, where a deep understanding of the underlying theoretical
assumptions is necessary both to apply the models and to develop new
ones. In this chapter, we have summarized the fundamental aspects
of discrete choice theory, and we have introduced recent model
developments, illustrating their richness. A discussion of route
choice and departure time choice applications has shown how
specific aspects of real applications must be addressed.

Acknowledgment

We wish to thank Scott Ramming for his useful input and
suggestions. We also benefited from comments by Joan Walker,
Andrea Papola and Julie Bernardi.

References
Anderson, S. P., de Palma, A. and Thisse, J.-F. (1992). Discrete Choice Theory of
Product Differentiation, MIT Press, Cambridge, Ma.

Antoniou, C., Ben-Akiva, M. E., Bierlaire, M. and Mishalani, R. (1997). Demand
Simulation for Dynamic Traffic Assignment, Proceedings of the 8th
IFAC/IFIP/IFORS symposium on transportation systems.

Ben-Akiva, M. and François, B. (1983). µ homogeneous generalized extreme value
model, Working paper, Department of Civil Engineering, MIT, Cambridge, Ma.

Ben-Akiva, M. E. (1973). Structure of passenger travel demand models, PhD thesis,
Department of Civil Engineering, MIT, Cambridge, Ma.

Ben-Akiva, M. E. (1974). Structure of passenger travel demand models,
Transportation Research Record 526.

Ben-Akiva, M. E. and Boccara, B. (1995). Discrete choice models with latent choice
sets, International Journal of Research in Marketing 12: 9–24.

Ben-Akiva, M. E. and Lerman, S. R. (1985). Discrete Choice Analysis: Theory and
Application to Travel Demand, MIT Press, Cambridge, Ma.

Ben-Akiva, M. E., Bergman, M. J., Daly, A. J. and Ramaswamy, R. (1984). Modeling
inter-urban route choice behaviour, in J. Volmuller and R. Hamerslag (eds),
Proceedings from the ninth international symposium on transportation and traffic
theory, VNU Science Press, Utrecht, Netherlands, pp. 299–330.

Ben-Akiva, M. E., Cyna, M. and de Palma, A. (1984). Dynamic model of peak period
congestion, Transportation Research B 18(4–5): 339–355.

Bierlaire, M. (1995). A robust algorithm for the simultaneous estimation of hierarchical
logit models, GRT Report 95/3, Department of Mathematics, FUNDP.

Bierlaire, M. (1998). Discrete choice models, in M. Labbé, G. Laporte, K. Tanczos and
Ph. Toint (eds), Operations Research in Traffic and Transportation Management,
Vol. 166 of NATO ASI Series, Series F: Computer and Systems Sciences, Springer
Verlag, pp. 203–227.

Bierlaire, M. and Vandevyvere, Y. (1995). HieLoW: the interactive user's guide,
Transportation Research Group - FUNDP, Namur.

Bierlaire, M., Lotan, T. and Toint, Ph. L. (1997). On the overspecification of
multinomial and nested logit models due to alternative specific constants,
Transportation Science 31(4): 363–371.

Bolduc, D., Fortin, B. and Fournier, M.-A. (1996). The effect of incentive policies on
the practice location of doctors: A multinomial Probit analysis, Journal of Labor
Economics 14(4): 703.

Bradley, M. A. and Daly, A. (1991). Estimation of logit choice models using mixed
stated preferences and revealed preferences information, Methods for understanding
travel behaviour in the 1990's, International Association for Travel Behaviour,
Québec, pp. 116–133. 6th international conference on travel behaviour.

Burrell, J. E. (1968). Multipath route assignment and its application to capacity
restraint, Proceedings of the 4th international symposium on the theory of road
traffic flow, Karlsruhe, Germany.

Cardell and Dunbar (1980). Measuring the societal impacts of automobile downsizing,
Transportation Research A 14(5–6): 423–434.

Cascetta, E. and Papola, A. (1998). Random utility models with implicit
availability/perception of choice alternatives for the simulation of travel demand,
Technical report, Universita degli Studi di Napoli Federico II.

Cascetta, E., Nuzzolo, A. and Biggiero, L. (1992). Analysis and Modeling of
Commuters’ Departure Time and Route Choice in Urban Networks, Proceedings of
the Second International CAPRI Seminar on Urban Traffic Networks.

Cascetta, E., Nuzzolo, A., Russo, F. and Vitetta, A. (1996). A modified logit route
choice model overcoming path overlapping problems. Specification and some
calibration results for interurban networks, Proceedings of the 13th International
Symposium on the Theory of Road Traffic Flow (Lyon, France).

Chang, G. L. and Mahmassani, H. S. (1986). Experiments with departure time choice
dynamics of urban commuters, Transportation Research B 20(4): 297–320.

Chang, G. L. and Mahmassani, H. S. (1988). Travel time prediction and departure time
adjustment dynamics in a congested traffic system, Transportation Research B
22(3): 217–232.

Daganzo, C. F. and Sheffi, Y. (1977). On stochastic models of traffic assignment,
Transportation Science 11(3): 253–274.

Daly, A. (1987). Estimating “tree” logit models, Transportation Research B
21(4): 251–268.

de Palma, A., Khattak, A. J. and Gupta, D. (1997). Commuters’ departure time
decisions in Brussels, Belgium, Transportation Research Record 1607: 139–146.

Dial, R. B. (1969). Algorithm 360: shortest path forest with topological ordering,
Communications of ACM 12: 632–633.

Dial, R. B. (1971). A probabilistic multipath traffic assignment algorithm which
obviates path enumeration, Transportation Research 5(2): 83–111.

Dijkstra, E. W. (1959). A note on two problems in connection with graphs, Numerische
Mathematik 1: 269–271.

Gumbel, E. J. (1958). Statistics of Extremes, Columbia University Press, New York.

Hendrickson, C. and Kocur, G. (1981). Schedule delay and departure time decisions in
a deterministic model, Transportation Science 15: 62–77.

Hendrickson, C. and Plank, E. (1984). The flexibility of departure times for work trips,
Transportation Research A 18: 25–36.

Hensher, D. A. and Johnson, L. W. (1981). Applied discrete choice modeling, Croom
Helm, London.

Horowitz, J. L., Koppelman, F. S. and Lerman, S. R. (1986). A self-instructing course
in disaggregate mode choice modeling, Technology Sharing Program, US
Department of Transportation, Washington, D.C. 20590.

Khattak, A. J. and de Palma, A. (1997). The impact of adverse weather conditions on
the propensity to change travel decisions: a survey of Brussels commuters,
Transportation Research A 31(3): 181–203.

Koppelman, F. S. and Wen, C.-H. (1997). The paired combinatorial logit model:
properties, estimation and application, Transportation Research Board, 76th Annual
Meeting, Washington DC. Paper #970953.

Koppelman, F. S. and Wen, C.-H. (1998). Alternative nested logit models: Structure,
properties and estimation, Transportation Research B. (forthcoming).

Liu, Y.-H. and Mahmassani, H. (1998). Dynamic aspects of departure time and route
decision behavior under ATIS: modeling framework and experimental results,
presented at the 77th annual meeting of the Transportation Research Board,
Washington DC.

Luce, R. (1959). Individual choice behavior: a theoretical analysis, J. Wiley and Sons,
New York.

Luce, R. D. and Suppes, P. (1965). Preference, utility and subjective probability, in
R. D. Luce, R. R. Bush and E. Galanter (eds), Handbook of Mathematical
Psychology, J. Wiley and Sons, New York.

Manski, C. (1977). The structure of random utility models, Theory and Decision
8: 229–254.

McFadden, D. (1978). Modeling the choice of residential location, in A. K. et al. (ed.),
Spatial interaction theory and residential location, North-Holland, Amsterdam,
pp. 75–96.

McFadden, D. (1989). A method of simulated moments for estimation of discrete
response models without numerical integration, Econometrica 57(5): 995–1026.

McFadden, D. and Train, K. (1997). Mixed multinomial logit models for discrete
response, Technical report, University of California, Berkeley, Ca.

Nguyen, S. and Pallottino, S. (1987). Traffic assignment for large scale transit
networks, in A. Odoni (ed.), Flow control of congested networks, Springer Verlag.

Nguyen, S., Pallottino, S. and Gendreau, M. (1998). Implicit Enumeration of
Hyperpaths in a Logit Model for Transit Networks, Transportation Science 32(1).

Small, K. (1982). The scheduling of consumer activities: work trips, The American
Economic Review pp. 467–479.

Small, K. (1987). A discrete choice model for ordered alternatives, Econometrica
55(2): 409–424.

Swait, J. D. and Ben-Akiva, M. (1987). Incorporating Random Constraints in Discrete
Models of Choice Set Generation, Transportation Research B 21(2).

Tversky, A. (1972). Elimination by aspects: a theory of choice, Psychological Review
79: 281–299.

Vovsha, P. (1997). Cross-nested logit model: an application to mode choice in the
Tel-Aviv metropolitan area, Transportation Research Board, 76th Annual Meeting,
Washington DC. Paper #970387.

Whynes, D., Reed, G. and Newbold, P. (1996). General practitioners' choice of
referral destination: A Probit analysis, Managerial and Decision Economics
17(6): 587.

Yai, T., Iwakura, S. and Morichi, S. (1997). Multinomial Probit with structured
covariance for route choice behavior, Transportation Research B 31(3): 195–208.
