
Statistical Foundations of Business Analytics

Chapter 5: Generalized Linear Models

Tim Ederer

Mini 2, 2024
Tepper Business School
Introduction

Chapters 1-4 give you a complete toolkit to make inference about β


• They mostly focus on cases where y is continuous

What happens when y is not continuous?


• Binary outcomes: binary choice, credit default,...
• Categorical outcomes: duration, multiple choice,...

Binary Outcomes
Linear Probability Model

What happens when yi is binary?


• Example: yi = 1 if consumer i chooses product A and yi = 0 otherwise

Linear regression model is not appropriate!


• Linear probability model: E[yi |xi ] = P(yi = 1|xi ) = xi′ β
• Can lead to predictions where P(yi = 1|xi ) is below 0 or above 1 (see the sketch below)

Need to think about an alternative model


• Build model where P(yi = 1|xi ) ∈ [0, 1]
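Before turning to that alternative, a minimal sketch of the problem, assuming hypothetical simulated data (the variable names and values below are made up for illustration, not from the chapter):

```r
# Linear probability model on simulated binary data:
# fitted "probabilities" can fall below 0 or above 1
set.seed(1)
n <- 200
x <- rnorm(n, mean = 0, sd = 2)
p <- pnorm(x)                       # true choice probability (probit-style data-generating process)
y <- rbinom(n, size = 1, prob = p)  # binary outcome y_i
lpm <- lm(y ~ x)                    # linear probability model E[y_i | x_i] = x_i' beta
range(fitted(lpm))                  # typically extends below 0 and above 1
```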

Latent Outcomes

Assume that there is a continuous latent outcome yi∗ such that


y_i = \begin{cases} 1 & \text{if } y_i^* \geq 0 \\ 0 & \text{otherwise} \end{cases}

Examples
• Choice: yi∗ could be the utility/valuation of a specific product
• Credit default: yi∗ could be the solvency of a company
• yi∗ is normalized such that you buy the product or you default when yi∗ ≥ 0

How does that help us?

Adapting the Linear Regression Model

We can use the linear regression model to relate yi∗ to xi

yi∗ = xi′ β + εi

Under this structure, all you need to know is β!


• Causal analysis: how a change in xi would change yi∗ and eventually yi
• Forecasting: what yi would be under a counterfactual realization of xi

How do we use data on (yi , xi ) to learn about β?


• How can we overcome the challenge that yi∗ is not observed?

Assumptions

We still need EXO and RANK


• E[εi |xi ] = 0 and no perfect collinearity between elements of x

But given that we do not observe yi∗ we need more structure

P(yi = 1|xi ) = P(xi′ β + εi ≥ 0|xi )

Can we recover β from P(yi = 1|xi )?

Probit

Answer is YES if you specify the distribution of ε


• Reminder: this is not needed in the standard linear regression model

Probit model: εi |xi ∼ N (0, 1)


• P(yi = 1|xi ) = Φ(xi′ β)
• Φ(.) is the c.d.f. of the standard normal distribution
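For example, in R the probit probability is just pnorm applied to the index (the β and xi values below are made-up illustrations, not estimates from the chapter):

```r
# Probit: P(y_i = 1 | x_i) = Phi(x_i' beta)
beta <- c(0.5, -1.0)      # hypothetical coefficient vector
x_i  <- c(1, 0.3)         # hypothetical regressors (intercept and one covariate)
pnorm(sum(x_i * beta))    # Phi(x_i' beta) -- always between 0 and 1
```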

β is identified under this assumption!


• If we observed the population, we could directly derive β

Logit

Alternative to probit: logit model


• Different assumption on distribution of εi

More convenient than probit because of tractable analytical expressions

P(y_i = 1 \mid x_i) = \frac{\exp\{x_i'\beta\}}{1 + \exp\{x_i'\beta\}} \qquad \text{and} \qquad P(y_i = 0 \mid x_i) = \frac{1}{1 + \exp\{x_i'\beta\}}

β is also identified under this assumption


• Use the fact that log P(yi = 1|xi ) − log P(yi = 0|xi ) = xi′ β
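A quick numerical check of these expressions in R, with made-up values for β and xi (plogis is the logistic c.d.f.):

```r
# Logit probabilities and the log-odds identity
beta <- c(0.5, -1.0)              # hypothetical coefficients
x_i  <- c(1, 0.3)                 # hypothetical regressors
p1 <- plogis(sum(x_i * beta))     # P(y_i = 1 | x_i) = exp(x'b) / (1 + exp(x'b))
p0 <- 1 - p1                      # P(y_i = 0 | x_i) = 1 / (1 + exp(x'b))
log(p1) - log(p0)                 # recovers x_i' beta = 0.2
```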

What About Inference?

β is identified, now what?


• Does not tell us how we can make inference about β with our sample

We cannot use OLS for these models


• OLS would only work if we observed yi∗

How can we find an alternative estimator for β?


• Use Maximum Likelihood Estimation (MLE)
• Intuition: find the value of β such that the predictions of the model are closest to the data

Maximum Likelihood Estimation
Likelihood

We want an estimator that “fits” the data best


• Need to measure how likely the observed outcomes are under our model

Likelihood of individual i: l(β; yi , xi )


• How likely is it, given a value of β, that I observe (yi , xi )?

Likelihood of the sample: \prod_{i=1}^{n} l(\beta; y_i, x_i)
• How likely is it, given a value of β, that I observe (yi , xi ) for all i = 1, ..., n?
• If this value is small, the model fits poorly =⇒ we should change β!

Maximum Likelihood Estimator

MLE is the value of β that maximizes the log of the likelihood of the sample
• We take the log for mathematical and computational tractability

Log-likelihood of the sample: L(\beta; y, X) = \sum_{i=1}^{n} \log l(\beta; y_i, x_i)
• Allows us to transform the product into a sum =⇒ easier to compute in R

The maximum likelihood estimator for β is defined as

\hat{\beta}^{ML} = \arg\max_{\beta} \, L(\beta; y, X)

Examples: Logit and Probit

Likelihood of individual i in the logit model


l(\beta; y_i, x_i) = \left( \frac{\exp\{x_i'\beta\}}{1 + \exp\{x_i'\beta\}} \right)^{y_i} \left( \frac{1}{1 + \exp\{x_i'\beta\}} \right)^{1 - y_i}

Likelihood of individual i in the probit model


l(\beta; y_i, x_i) = \Phi(x_i'\beta)^{y_i} \left( 1 - \Phi(x_i'\beta) \right)^{1 - y_i}
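As an illustrative sketch (simulated data and made-up names, not the chapter's own code), these individual likelihoods can be combined into the sample log-likelihood and maximized numerically, here for the logit case:

```r
# Hand-coded logit MLE on simulated data
set.seed(42)
n <- 500
X <- cbind(1, rnorm(n))                       # intercept + one regressor
beta_true <- c(-0.5, 1)
y <- rbinom(n, 1, plogis(X %*% beta_true))    # simulate binary outcomes from the logit model

neg_loglik <- function(beta, y, X) {
  p <- plogis(X %*% beta)                     # P(y_i = 1 | x_i)
  -sum(y * log(p) + (1 - y) * log(1 - p))     # minus the log-likelihood of the sample
}

fit <- optim(c(0, 0), neg_loglik, y = y, X = X, method = "BFGS")
fit$par                                       # should be close to beta_true
```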

Illustration in R: MLE with Probit Model

Assume that yi∗ = βxi + εi with β = 1


• Probit: εi |xi ∼ N (0, 1)
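A minimal sketch of such an illustration, assuming a hypothetical sample size and using glm with a probit link to compute the MLE:

```r
# Simulate y_i* = beta * x_i + eps_i with beta = 1 and probit errors, then estimate beta by MLE
set.seed(123)
n <- 1000
x <- rnorm(n)
eps <- rnorm(n)                  # eps_i | x_i ~ N(0, 1)
y_star <- 1 * x + eps            # latent outcome with beta = 1
y <- as.numeric(y_star >= 0)     # observed binary outcome

probit_fit <- glm(y ~ x - 1, family = binomial(link = "probit"))  # no intercept, matching the model above
coef(probit_fit)                 # MLE of beta, should be close to 1
```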

Properties of MLE

β̂ ML is unbiased, consistent and efficient


• Only under EXO and RANK

Unbiased and consistent estimator for the variance of the MLE

\widehat{\mathrm{Var}}(\hat{\beta}^{ML} \mid X) = \left[ - \frac{\partial^2 L(\hat{\beta}; y, X)}{\partial \beta \, \partial \beta'} \right]^{-1} = \hat{I}^{-1}

β̂ ML is normally distributed for large n

\hat{\beta}^{ML} \mid X \sim \mathcal{N}\left( \beta, \hat{I}^{-1} \right)
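Continuing the hand-coded logit sketch from a few slides back (it reuses neg_loglik, y and X defined there, so it is an illustrative continuation rather than standalone code), the estimated variance can be read off the Hessian that optim returns:

```r
# Standard errors from the inverse observed information
# fit$hessian is the Hessian of the *negative* log-likelihood, so no extra sign flip is needed
fit <- optim(c(0, 0), neg_loglik, y = y, X = X, method = "BFGS", hessian = TRUE)
vcov_hat <- solve(fit$hessian)            # estimate of Var(beta_hat | X) = I_hat^{-1}
se <- sqrt(diag(vcov_hat))
cbind(estimate = fit$par, std_error = se)
```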

Recap

Linear regression model is not appropriate for binary outcomes


• Leads to incoherent predictions

Alternative model: latent variable model


• Impose linear regression model on continuous latent outcome yi∗

We can make inference about β even if we do not observe yi∗


• Need to impose distributional assumption on ε (probit, logit,...)
• β is identified and β̂ ML is unbiased, consistent and efficient
• =⇒ we can make inference about β using the tools from Chapter 2!

Categorical Outcomes
Categorical Outcomes

What should we do when yi is a categorical variable?


• Multiple choice: yi is the product chosen by consumer i
• Survival analysis: yi is the duration before an event occurs (credit default, insurance claim)

As in the binary case, the linear regression model is not appropriate


• Rely instead on latent variable model
• Use linear regression model to link continuous latent outcome to x
• Estimate parameters of interest via Maximum Likelihood

Focus of this chapter: multiple choice analysis


• Useful for analysis of consumer behavior, optimal pricing, advertisement strategy

Discrete Choice Model

Let yi denote consumer i's choice among J products

y_i = \begin{cases} 1 & \text{if } i \text{ chooses product } 1 \\ 2 & \text{if } i \text{ chooses product } 2 \\ \vdots & \\ J & \text{if } i \text{ chooses product } J \end{cases}

Goal: study the relationship between consumer choice yi and (z1 , z2 , ..., zJ )
• zj : product j's characteristics (e.g. price, quality)

Latent Utility Model

Define yi as a function of uij , the utility consumer i gets from buying product j

y_i = \begin{cases} 1 & \text{if } u_{i1} \geq u_{ij} \text{ for all } j \\ 2 & \text{if } u_{i2} \geq u_{ij} \text{ for all } j \\ \vdots & \\ J & \text{if } u_{iJ} \geq u_{ij} \text{ for all } j \end{cases}

Use linear regression model to link uij to zj

uij = zj′ β + εij

Conditional Logit

Assumptions needed
• As always we need EXO and RANK
• Fix distribution of ε: εij ∼ Gumbel(0, 1)

This is called the conditional logit model

P(y_i = j \mid z_1, \ldots, z_J) = \frac{\exp\{z_j'\beta\}}{\sum_{k=1}^{J} \exp\{z_k'\beta\}}
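A small numerical sketch of these probabilities in R, with made-up product characteristics and coefficients:

```r
# Conditional logit choice probabilities over J = 3 hypothetical products
beta <- c(-2, 1)                    # hypothetical coefficients on (price, quality)
Z <- rbind(c(1.0, 0.5),             # product 1: price, quality
           c(1.5, 1.5),             # product 2
           c(2.0, 1.0))             # product 3
v <- Z %*% beta                     # z_j' beta for each product
probs <- exp(v) / sum(exp(v))       # P(y_i = j | z_1, ..., z_J)
probs                               # strictly between 0 and 1, sums to 1 across products
```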

Estimation of Conditional Logit Model

Use Maximum Likelihood to estimate β


• β̂ ML is unbiased, consistent and efficient under EXO and RANK

Likelihood of individual i
l(\beta; y_i, z_1, \ldots, z_J) = \prod_{j=1}^{J} \left( \frac{\exp\{z_j'\beta\}}{\sum_{k=1}^{J} \exp\{z_k'\beta\}} \right)^{\mathbb{1}\{y_i = j\}}
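An illustrative sketch of the full estimation step (simulated choices and made-up product characteristics; the names below are not from the chapter):

```r
# Conditional logit MLE on simulated choice data
set.seed(7)
n <- 1000; J <- 3
Z <- cbind(price = c(1.0, 1.5, 2.0), quality = c(0.5, 1.5, 1.0))  # hypothetical characteristics
beta_true <- c(-2, 1)
p <- exp(Z %*% beta_true); p <- p / sum(p)       # true choice probabilities
y <- sample(1:J, n, replace = TRUE, prob = p)    # simulated choices y_i

neg_loglik <- function(beta, y, Z) {
  v <- Z %*% beta
  logp <- v - log(sum(exp(v)))                   # log P(y_i = j | z_1, ..., z_J)
  -sum(logp[y])                                  # minus the log-likelihood of the sample
}

fit <- optim(c(0, 0), neg_loglik, y = y, Z = Z, method = "BFGS")
fit$par                                          # should be close to beta_true
```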

Summary

This chapter: what happens when yi is not continuous?


• Linear regression model is not appropriate anymore

Alternative: latent variable model


• Need additional assumptions (fix distribution of errors)
• Need to change the estimator (maximum likelihood estimator)

Very useful applications


• Consumer choice analysis, duration analysis, pricing strategy,...
• Essential in economics, finance, strategy, marketing

Thank you and good luck!

