The Problem of Modeling Rare Events in ML-based Logistic Regression - Assessing Potential Remedies Via MC Simulations
Heinz Leitgöb
University of Linz, Austria
Problem
• In logistic regression, MLEs are consistent but only
asymptotically unbiased -> in finite samples, MLEs may be
heavily biased away from 0
$$\mathbf{b} = \left(\mathbf{X}^{T}\mathbf{W}\mathbf{X}\right)^{-1}\mathbf{X}^{T}\mathbf{W}\boldsymbol{\xi} \qquad (1)$$
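$$\log L(\boldsymbol{\beta}) = \sum_{i=1}^{n} \Big[ y_i \log \pi(\mathbf{x}_i, \boldsymbol{\beta}) + (1 - y_i) \log\big(1 - \pi(\mathbf{x}_i, \boldsymbol{\beta})\big) \Big] \qquad (4)$$
$$\mathbf{q} = \frac{\partial \log L(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} = \mathbf{0} \qquad (5)$$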
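The log-likelihood (4) and the score condition (5) can be illustrated with a minimal NumPy sketch. It assumes the standard logit link, π(x, β) = 1 / (1 + exp(−xβ)), and solves q = 0 by Newton-Raphson. The function names and the toy data are hypothetical and not taken from the talk.

```python
import numpy as np

def log_likelihood(beta, X, y):
    """Log-likelihood (4); X is assumed to contain an intercept column."""
    eta = X @ beta
    # y*log(pi) + (1-y)*log(1-pi) simplifies to y*eta - log(1 + exp(eta))
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def score(beta, X, y):
    """Score vector q from (5): X^T (y - pi)."""
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (y - pi)

def fit_ml(X, y, tol=1e-10, max_iter=50):
    """Solve q = 0 by Newton-Raphson (no safeguards against separation)."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = pi * (1.0 - pi)                    # diagonal of W
        info = X.T @ (X * W[:, None])          # X^T W X
        step = np.linalg.solve(info, score(beta, X, y))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical toy data (not separated, so the MLE is finite)
x = np.array([-2.0, -1.5, -1.0, 0.0, 0.5, 1.0, 1.5, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])

beta_hat = fit_ml(X, y)
print("beta_hat:", beta_hat)
print("score at beta_hat:", score(beta_hat, X, y))
```

At the solution the score should be numerically zero, which is a quick check that (5) holds.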
First result:
Exact logistic regression is only applicable when
• n is (very) small (<200)
• covariates are discrete (best: dichotomous)
• # of covariates is small
$$\tilde{\boldsymbol{\beta}} = \hat{\boldsymbol{\beta}} - \mathrm{bias}(\hat{\boldsymbol{\beta}}) \qquad (6)$$
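A minimal sketch of the correction in (6), assuming that b in (1) is the bias term, that W = diag(π̂ᵢ(1 − π̂ᵢ)), and that ξᵢ = 0.5 Qᵢᵢ (2π̂ᵢ − 1) with Q = X(XᵀWX)⁻¹Xᵀ. This is the King/Zeng form without prior-correction weights; the talk does not spell out ξ, so treat that definition as an assumption. The function name is hypothetical.

```python
import numpy as np

def king_zeng_correction(X, beta_hat):
    """Return (beta_tilde, bias) following (1) and (6).

    Assumes xi_i = 0.5 * Q_ii * (2*pi_i - 1), i.e. no case-control weights.
    """
    pi = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
    W = pi * (1.0 - pi)                                  # diagonal of W
    info_inv = np.linalg.inv(X.T @ (X * W[:, None]))     # (X^T W X)^{-1}
    q_diag = np.einsum("ij,jk,ik->i", X, info_inv, X)    # diagonal of Q
    xi = 0.5 * q_diag * (2.0 * pi - 1.0)
    bias = info_inv @ (X.T @ (W * xi))                   # equation (1)
    return beta_hat - bias, bias                         # equation (6)

# Intercept-only example: 5 events in n = 100 observations.
# Here the MLE has the closed form log(k / (n - k)).
n, k = 100, 5
X = np.ones((n, 1))
beta_hat = np.array([np.log(k / (n - k))])

beta_tilde, bias = king_zeng_correction(X, beta_hat)
print("ML intercept:       ", beta_hat[0])
print("estimated bias:     ", bias[0])
print("corrected intercept:", beta_tilde[0])
```

In the intercept-only case the estimated bias reduces to (π̂ − 0.5) / (n π̂ (1 − π̂)), which is negative for rare events, so the correction moves the intercept towards 0.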
$$\Pr(y_i = 1 \mid \mathbf{x}_i) \approx \tilde{\pi}_i + C_i \qquad (7)$$
with
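$$\mathbf{q}_{PML} = \mathbf{q}_{ML} + \frac{1}{2}\,\mathrm{tr}\!\left(\mathbf{i}^{-1}\frac{\partial \mathbf{i}}{\partial \boldsymbol{\beta}}\right) \qquad (11)$$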
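For the logit model, the penalized score (11) is commonly written as Xᵀ(y − π + h(1/2 − π)), where h holds the diagonal of the hat matrix W^{1/2} X (XᵀWX)⁻¹ Xᵀ W^{1/2}. The sketch below assumes that form and uses plain Newton-type steps without step-halving. The function name and the example are hypothetical.

```python
import numpy as np

def fit_firth(X, y, tol=1e-10, max_iter=100):
    """Penalized ML (Firth) for logistic regression via the modified score.

    Assumes q_PML = X^T (y - pi + h * (0.5 - pi)); no step-halving.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = pi * (1.0 - pi)
        info_inv = np.linalg.inv(X.T @ (X * W[:, None]))      # i^{-1}
        h = W * np.einsum("ij,jk,ik->i", X, info_inv, X)      # hat diagonal
        q_pml = X.T @ (y - pi + h * (0.5 - pi))               # equation (11)
        step = info_inv @ q_pml
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Intercept-only example with zero events: the ML intercept diverges
# to -infinity, while the PML estimate stays finite.
n = 20
X = np.ones((n, 1))
y_none = np.zeros(n)
beta_pml = fit_firth(X, y_none)
print("PML intercept with 0 events:", beta_pml[0])
```

In the intercept-only case the PML solution has the closed form log((k + 0.5) / (n − k + 0.5)) for k events, which gives a simple way to check the iteration.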
Italics indicate that the 95% CI does not contain the true value
ESRA 2013, Ljubljana 10
MC simulation results ― King/Zeng correction
Table 3a: King/Zeng correction, mean intercepts by sample size n
p (β0)        n = 5,000    n = 1,000    n = 500      n = 250      n = 100
0.5  (0)      -0.0001084    0.0004286    0.0018100    0.0001658    0.0027876
0.1  (-3.3)   -3.2996170   -3.2995520   -3.3017820   -3.3001730   -3.2840120
0.05 (-4.3)   -4.3016320   -4.2998200   -4.3032600   -4.2905530   -4.0773970
0.01 (-6.6)   -6.5957210   -6.5828270   -6.3526430    -            -
Italics indicate that the 95% CI does not contain the true value
MC simulation results ― Firth’s PMLE
Table 4a: Firth’s PMLE, mean intercepts by sample size n
p (β0)        n = 5,000    n = 1,000    n = 500      n = 250      n = 100
0.5  (0)      -0.0001084    0.0004286    0.0018100    0.0001657    0.0027907
0.1  (-3.3)   -3.2996230   -3.2996900   -3.3023610   -3.3027510   -3.3129370
0.05 (-4.3)   -4.3016500   -4.3002850   -4.3053100   -4.3010580   -4.3168090
0.01 (-6.6)   -6.5961770   -6.6058880   -6.5950600    -            -
Italics indicate that the 95% CI does not contain the true value
Comparison of mean intercepts
Graph 1: Comparison of mean intercepts (p = 0.05 -> β0 = -4.3)
[Plot of mean β0 (y-axis, -5.5 to -3.5) against n (x-axis: 5,000; 1,000; 500; 250; 100). Series: ML_β0, King/Zeng_β0, PML_β0, true_β0. Data labels surviving from the plot: -4.077397, -4.316809, -4.399801, -4.507153, -5.083293.]
Comparison of mean slopes
Graph 2: Comparison of mean slopes (p = 0.05; β1 = 2)
[Plot of mean β1 (y-axis, 1.5 to 2.5) against n (x-axis: 5,000; 1,000; 500; 250; 100). Series: ML_β1, King/Zeng_β1, PML_β1, true_β1. Data labels surviving from the plot: 2.450068, 2.117404, 2.056378, 2.001215, 1.895128.]
Comparison of probabilities
• PMLEs appear to be unbiased, even with small n and very few
events. Further advantages: PML estimation always converges and
solves the “problem of separation” (Heinze/Schemper 2002)
heinz.leitgoeb@jku.at