
Statistical Foundations of Business Analytics

Chapter 5: Generalized Linear Models

Tim Ederer

Mini 2, 2024
Tepper Business School
Introduction

Chapters 1-4 give you a complete toolkit to make inference about β


• They mostly focus on cases where y is continuous

What happens when y is not continuous?


• Binary outcomes: binary choice, credit default,...
• Categorical outcomes: duration, multiple choice,...

Binary Outcomes
Linear Probability Model

What happens when yi is binary?


• Example: yi = 1 if consumer i chooses product A and yi = 0 otherwise

Linear regression model is not appropriate!


• Linear probability model: E[yi |xi ] = P(yi = 1|xi ) = xi′ β
• Can lead to predictions where P(yi = 1|xi ) is below 0 or above 1 (see the sketch below)

Need to think about an alternative model


• Build model where P(yi = 1|xi ) ∈ [0, 1]
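Before turning to that alternative, a minimal sketch of the problem, assuming hypothetical simulated data (the variable names and values below are made up for illustration, not from the chapter):

```r
# Linear probability model on simulated binary data:
# fitted "probabilities" can fall below 0 or above 1
set.seed(1)
n <- 200
x <- rnorm(n, mean = 0, sd = 2)
p <- pnorm(x)                       # true choice probability (probit-style data-generating process)
y <- rbinom(n, size = 1, prob = p)  # binary outcome y_i
lpm <- lm(y ~ x)                    # linear probability model E[y_i | x_i] = x_i' beta
range(fitted(lpm))                  # typically extends below 0 and above 1
```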

Latent Outcomes

Assume that there is a continuous latent outcome yi∗ such that


y_i = \begin{cases} 1 & \text{if } y_i^* \geq 0 \\ 0 & \text{otherwise} \end{cases}

Examples
• Choice: yi∗ could be the utility/valuation of a specific product
• Credit default: yi∗ could be the solvency of a company
• yi∗ is normalized such that you buy the product or you default when yi∗ ≥ 0

How does that help us?

Adapting the Linear Regression Model

We can use the linear regression model to relate yi∗ to xi

yi∗ = xi′ β + εi

Under this structure, all you need to know is β!


• Causal analysis: how a change in xi would change yi∗ and eventually yi
• Forecasting: what yi would be under a counterfactual realization of xi

How do we use data on (yi , xi ) to learn about β?


• How can we overcome the challenge that yi∗ is not observed?

Assumptions

We still need EXO and RANK


• E[εi |xi ] = 0 and no perfect collinearity between elements of x

But given that we do not observe yi∗ we need more structure

P(yi = 1|xi ) = P(xi′ β + εi ≥ 0|xi )

Can we recover β from P(yi = 1|xi )?

Probit

Answer is YES if you specify the distribution of ε


• Reminder: this is not needed in the standard linear regression model

Probit model: εi |xi ∼ N (0, 1)


• P(yi = 1|xi ) = Φ(xi′ β)
• Φ(.) is the c.d.f. of the standard normal distribution
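For example, in R the probit probability is just pnorm applied to the index (the β and xi values below are made-up illustrations, not estimates from the chapter):

```r
# Probit: P(y_i = 1 | x_i) = Phi(x_i' beta)
beta <- c(0.5, -1.0)      # hypothetical coefficient vector
x_i  <- c(1, 0.3)         # hypothetical regressors (intercept and one covariate)
pnorm(sum(x_i * beta))    # Phi(x_i' beta) -- always between 0 and 1
```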

β is identified under this assumption!


• If we observed the population, we could directly derive β

Logit

Alternative to probit: logit model


• Different assumption on distribution of εi

More convenient than probit because of tractable analytical expressions

P(y_i = 1 \mid x_i) = \frac{\exp\{x_i'\beta\}}{1 + \exp\{x_i'\beta\}} \qquad \text{and} \qquad P(y_i = 0 \mid x_i) = \frac{1}{1 + \exp\{x_i'\beta\}}

β is also identified under this assumption


• Use the fact that log P(yi = 1|xi ) − log P(yi = 0|xi ) = xi′ β
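A quick numerical check of these expressions in R, with made-up values for β and xi (plogis is the logistic c.d.f.):

```r
# Logit probabilities and the log-odds identity
beta <- c(0.5, -1.0)              # hypothetical coefficients
x_i  <- c(1, 0.3)                 # hypothetical regressors
p1 <- plogis(sum(x_i * beta))     # P(y_i = 1 | x_i) = exp(x'b) / (1 + exp(x'b))
p0 <- 1 - p1                      # P(y_i = 0 | x_i) = 1 / (1 + exp(x'b))
log(p1) - log(p0)                 # recovers x_i' beta = 0.2
```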

What About Inference?

β is identified, now what?


• Does not tell us how we can make inference about β with our sample

We cannot use OLS for these models


• OLS would only work if we observed yi∗

How can we find an alternative estimator for β?


• Use Maximum Likelihood Estimation (MLE)
• Intuition: find the value of β such that the predictions of the model are closest to the data

Maximum Likelihood Estimation
Likelihood

We want an estimator that “fits” the data best


• Need to measure how likely the observed outcomes are under our model

Likelihood of individual i: l(β; yi , xi )


• How likely is it, given a value of β, that I observe (yi , xi )?

Likelihood of the sample: \prod_{i=1}^{n} l(\beta; y_i, x_i)
• How likely is it, given a value of β, that I observe (yi , xi ) for all i = 1, ..., n?
• If this value is small, the model fits poorly =⇒ we should change β!

Maximum Likelihood Estimator

MLE is the value of β that maximizes the log of the likelihood of the sample
• We take the log for mathematical and computational tractability

Log-likelihood of the sample: L(\beta; y, X) = \sum_{i=1}^{n} \log l(\beta; y_i, x_i)
• Allows us to transform the product into a sum =⇒ easier to compute in R

The maximum likelihood estimator for β is defined as

\hat{\beta}^{ML} = \arg\max_{\beta} \, L(\beta; y, X)

Examples: Logit and Probit

Likelihood of individual i in the logit model


l(\beta; y_i, x_i) = \left( \frac{\exp\{x_i'\beta\}}{1 + \exp\{x_i'\beta\}} \right)^{y_i} \left( \frac{1}{1 + \exp\{x_i'\beta\}} \right)^{1 - y_i}

Likelihood of individual i in the probit model


l(\beta; y_i, x_i) = \Phi(x_i'\beta)^{y_i} \left( 1 - \Phi(x_i'\beta) \right)^{1 - y_i}
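As an illustrative sketch (simulated data and made-up names, not the chapter's own code), these individual likelihoods can be combined into the sample log-likelihood and maximized numerically, here for the logit case:

```r
# Hand-coded logit MLE on simulated data
set.seed(42)
n <- 500
X <- cbind(1, rnorm(n))                       # intercept + one regressor
beta_true <- c(-0.5, 1)
y <- rbinom(n, 1, plogis(X %*% beta_true))    # simulate binary outcomes from the logit model

neg_loglik <- function(beta, y, X) {
  p <- plogis(X %*% beta)                     # P(y_i = 1 | x_i)
  -sum(y * log(p) + (1 - y) * log(1 - p))     # minus the log-likelihood of the sample
}

fit <- optim(c(0, 0), neg_loglik, y = y, X = X, method = "BFGS")
fit$par                                       # should be close to beta_true
```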

Illustration in R: MLE with Probit Model

Assume that yi∗ = βxi + εi with β = 1


• Probit: εi |xi ∼ N (0, 1)
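A minimal sketch of such an illustration, assuming a hypothetical sample size and using glm with a probit link to compute the MLE:

```r
# Simulate y_i* = beta * x_i + eps_i with beta = 1 and probit errors, then estimate beta by MLE
set.seed(123)
n <- 1000
x <- rnorm(n)
eps <- rnorm(n)                  # eps_i | x_i ~ N(0, 1)
y_star <- 1 * x + eps            # latent outcome with beta = 1
y <- as.numeric(y_star >= 0)     # observed binary outcome

probit_fit <- glm(y ~ x - 1, family = binomial(link = "probit"))  # no intercept, matching the model above
coef(probit_fit)                 # MLE of beta, should be close to 1
```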

Properties of MLE

β̂ ML is unbiased, consistent and efficient


• Only under EXO and RANK

Unbiased and consistent estimator for the variance of the MLE

\widehat{\mathrm{Var}}(\hat{\beta}^{ML} \mid X) = \left[ - \frac{\partial^2 L(\hat{\beta}; y, X)}{\partial \beta \, \partial \beta'} \right]^{-1} = \hat{I}^{-1}

β̂ ML is normally distributed for large n

\hat{\beta}^{ML} \mid X \sim \mathcal{N}\left( \beta, \hat{I}^{-1} \right)
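Continuing the hand-coded logit sketch from a few slides back (it reuses neg_loglik, y and X defined there, so it is an illustrative continuation rather than standalone code), the estimated variance can be read off the Hessian that optim returns:

```r
# Standard errors from the inverse observed information
# fit$hessian is the Hessian of the *negative* log-likelihood, so no extra sign flip is needed
fit <- optim(c(0, 0), neg_loglik, y = y, X = X, method = "BFGS", hessian = TRUE)
vcov_hat <- solve(fit$hessian)            # estimate of Var(beta_hat | X) = I_hat^{-1}
se <- sqrt(diag(vcov_hat))
cbind(estimate = fit$par, std_error = se)
```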

Recap

Linear regression model is not appropriate for binary outcomes


• Leads to incoherent predictions

Alternative model: latent variable model


• Impose linear regression model on continuous latent outcome yi∗

We can make inference about β even if we do not observe yi∗


• Need to impose distributional assumption on ε (probit, logit,...)
• β is identified and β̂ ML is unbiased, consistent and efficient
• =⇒ we can make inference about β using the tools from Chapter 2!

Categorical Outcomes
Categorical Outcomes

What should we do when yi is a categorical variable?


• Multiple choice: yi is the product chosen by consumer i
• Survival analysis: yi is the duration before an event occurs (credit default, insurance claim)

As in the binary case, the linear regression model is not appropriate


• Rely instead on latent variable model
• Use linear regression model to link continuous latent outcome to x
• Estimate parameters of interest via Maximum Likelihood

Focus of this chapter: multiple choice analysis


• Useful for analysis of consumer behavior, optimal pricing, advertisement strategy

Discrete Choice Model

Let yi denote consumer i's choice among J products

y_i = \begin{cases} 1 & \text{if } i \text{ chooses product } 1 \\ 2 & \text{if } i \text{ chooses product } 2 \\ \vdots & \\ J & \text{if } i \text{ chooses product } J \end{cases}

Goal: study the relationship between consumer choice yi and (z1 , z2 , ..., zJ )
• zj : product j's characteristics (e.g. price, quality)

Latent Utility Model

Define yi as a function of uij , the utility consumer i gets from buying product j

y_i = \begin{cases} 1 & \text{if } u_{i1} \geq u_{ij} \text{ for all } j \\ 2 & \text{if } u_{i2} \geq u_{ij} \text{ for all } j \\ \vdots & \\ J & \text{if } u_{iJ} \geq u_{ij} \text{ for all } j \end{cases}

Use linear regression model to link uij to zj

uij = zj′ β + εij

Conditional Logit

Assumptions needed
• As always we need EXO and RANK
• Fix distribution of ε: εij ∼ Gumbel(0, 1)

This is called the conditional logit model

P(y_i = j \mid z_1, \ldots, z_J) = \frac{\exp\{z_j'\beta\}}{\sum_{k=1}^{J} \exp\{z_k'\beta\}}
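A small numerical sketch of these probabilities in R, with made-up product characteristics and coefficients:

```r
# Conditional logit choice probabilities over J = 3 hypothetical products
beta <- c(-2, 1)                    # hypothetical coefficients on (price, quality)
Z <- rbind(c(1.0, 0.5),             # product 1: price, quality
           c(1.5, 1.5),             # product 2
           c(2.0, 1.0))             # product 3
v <- Z %*% beta                     # z_j' beta for each product
probs <- exp(v) / sum(exp(v))       # P(y_i = j | z_1, ..., z_J)
probs                               # strictly between 0 and 1, sums to 1 across products
```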

Estimation of Conditional Logit Model

Use Maximum Likelihood to estimate β


• β̂ ML is unbiased, consistent and efficient under EXO and RANK

Likelihood of individual i
l(\beta; y_i, z_1, \ldots, z_J) = \prod_{j=1}^{J} \left( \frac{\exp\{z_j'\beta\}}{\sum_{k=1}^{J} \exp\{z_k'\beta\}} \right)^{\mathbb{1}\{y_i = j\}}
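An illustrative sketch of the full estimation step (simulated choices and made-up product characteristics; the names below are not from the chapter):

```r
# Conditional logit MLE on simulated choice data
set.seed(7)
n <- 1000; J <- 3
Z <- cbind(price = c(1.0, 1.5, 2.0), quality = c(0.5, 1.5, 1.0))  # hypothetical characteristics
beta_true <- c(-2, 1)
p <- exp(Z %*% beta_true); p <- p / sum(p)       # true choice probabilities
y <- sample(1:J, n, replace = TRUE, prob = p)    # simulated choices y_i

neg_loglik <- function(beta, y, Z) {
  v <- Z %*% beta
  logp <- v - log(sum(exp(v)))                   # log P(y_i = j | z_1, ..., z_J)
  -sum(logp[y])                                  # minus the log-likelihood of the sample
}

fit <- optim(c(0, 0), neg_loglik, y = y, Z = Z, method = "BFGS")
fit$par                                          # should be close to beta_true
```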

Summary

This chapter: what happens when yi is not continuous?


• Linear regression model is not appropriate anymore

Alternative: latent variable model


• Need additional assumptions (fix distribution of errors)
• Need to change the estimator (maximum likelihood estimator)

Very useful applications


• Consumer choice analysis, duration analysis, pricing strategy,...
• Essential in economics, finance, strategy, marketing

Thank you and good luck!

