(Seminar) Likelihood-Free Frequentist Inference

The document discusses likelihood-free inference (LFI), particularly in the context of using machine learning to improve inferential methods when traditional likelihoods cannot be evaluated. It highlights the development of new approaches that provide valid inference and diagnostics with finite-sample guarantees, addressing challenges in constructing confidence sets and testing hypotheses. The work aims to unify machine learning with classical statistics to enhance the efficiency and reliability of statistical inference in complex data scenarios.

Likelihood-Free Frequentist Inference

Ann B. Lee
Department of Statistics & Data Science / Machine Learning Department
Carnegie Mellon University

Collaborators: Luca Masserano (CMU); Nic Dalmasso (JP Morgan AI); Rafael Izbicki (UFSCar);
Mikael Kuusela (CMU); Tommaso Dorigo (Padova); Alex Shen (CMU)
Simulators are Ubiquitous in Science

Credit: Dalmasso (adapted from Cranmer et al., 2020)

For many complex phenomena, the only meaningful model (theory) may be in the form of simulations.
Likelihood-Based Inference

Likelihood: $L(D; \theta)$

Example: $X_1, \ldots, X_n \sim N(\theta, I_d)$, where $n = 10$, $\theta = 0$

Confidence set: $\hat{R}(D) = \{\theta \in \Theta \mid \lambda(D; \theta) \ge \hat{C}_{\theta,\alpha}\}$

Example: $X_1, \ldots, X_n \sim 0.5\,N(\theta, 1) + 0.5\,N(-\theta, 1)$
What is Likelihood-Free Inference?

The likelihood $L(D; \theta)$ cannot be evaluated. But it is implicitly encoded by the simulator…

Inference on parameters in this setting is called likelihood-free inference (LFI).

Example: $X_1, \ldots, X_n \sim 0.5\,N(\theta, 1) + 0.5\,N(-\theta, 1)$

Goal: confidence sets $\hat{R}(D) = \{\theta \in \Theta \mid \lambda(D; \theta) \ge \hat{C}_{\theta,\alpha}\}$ with
$$P_{D|\theta}\left(\theta \in \hat{R}(D) \mid \theta\right) = 1 - \alpha, \quad \forall \theta \in \Theta.$$

Image credit: Nic Dalmasso
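To make the setting concrete, here is a minimal sketch (illustrative code, not from the slides) of a forward simulator for the mixture example above. In LFI we assume only the ability to sample like this, treating the likelihood itself as unavailable; the later sketches assume a `simulate` function of this form.

```python
import numpy as np

def simulate(theta, n, rng):
    """Forward simulator for X_i ~ 0.5 N(theta, 1) + 0.5 N(-theta, 1).

    In LFI we only assume the ability to draw from F_theta like this;
    the likelihood itself is treated as unavailable."""
    signs = rng.choice([-1.0, 1.0], size=n)        # pick a mixture component
    return rng.normal(loc=signs * theta, scale=1.0)

rng = np.random.default_rng(0)
D_obs = simulate(theta=0.0, n=10, rng=rng)         # "observed" data, n = 10
```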
Classical LFI: Approximate Bayesian Computation (ABC)

Image credit: Sunnaker et al., 2013
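As a point of reference, here is a bare-bones sketch of rejection ABC for the same setup; the summary statistic, the uniform prior, and the tolerance `eps` are illustrative assumptions, not part of the slides.

```python
import numpy as np

def rejection_abc(D_obs, simulate, rng, num_draws=100_000, eps=0.05):
    """Rejection ABC: keep prior draws of theta whose simulated data land
    within eps of the observed data, as measured by a summary statistic."""
    s_obs = np.mean(np.abs(D_obs))                  # illustrative summary statistic
    kept = []
    for _ in range(num_draws):
        theta = rng.uniform(-5.0, 5.0)              # illustrative uniform prior
        D_sim = simulate(theta, n=len(D_obs), rng=rng)
        if abs(np.mean(np.abs(D_sim)) - s_obs) < eps:
            kept.append(theta)
    return np.array(kept)                           # approximate posterior sample
```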
Changing LFI Landscape [Cranmer et al., PNAS 2020]

More recent developments use ML algorithms to directly estimate key inferential quantities from simulated data:

- Posteriors, $f(\theta \mid x)$ [e.g., Papamakarios et al., 2016; Lueckmann et al., 2016; Izbicki et al., 2019; Greenberg et al., 2019]
- Likelihoods, $f(x \mid \theta)$ or $f(x \mid \theta)/g(x)$ [e.g., Izbicki et al., 2014; Thomas et al., 2016; Durkan et al., 2020; Brehmer et al., 2020]
- Likelihood ratios, $f(x \mid \theta_1)/f(x \mid \theta_2)$ [e.g., Cranmer et al., 2015; Thomas et al., 2016; Hermans et al., 2020; Durkan et al., 2020; Brehmer et al., 2020]

These new training-based approaches can handle complex high-dimensional data without prior dimension reduction, and they provide “amortized” inference.
So What’s Missing in the LFI-ML Literature?

Given observed data, we would like to constrain parameters of interest using an assumed theoretical/simulation model, with valid measures of uncertainty no matter the value of the unknown parameter.

There is a shortage of practical inferential and diagnostic tools with finite-sample guarantees of conditional coverage.
Open Problems in LFI

Confidence sets with correct conditional coverage (for small n)?

Most approaches that estimate likelihoods or likelihood ratios:
- rely on asymptotic assumptions (Wilks 1938) for downstream inference,
- do not assess validity across the entire parameter space, or
- use costly MC simulations at fixed parameter settings on a grid.
Unified Inference Machinery for Frequentist LFI

Bridges ML with classical statistics to provide:

(i) valid inference: confidence sets and tests with finite-sample guarantees (Type I error control and power)

(ii) practical diagnostics: check actual coverage across the entire parameter space

Goal: modular and computationally efficient procedures
- Can leverage generative, predictive and posterior algorithms
- Compatible with any test statistic and prior

https://github.com/lee-group-cmu/lf2i

LF2I:
https://arxiv.org/abs/2002.10399 (ICML 2021)
https://arxiv.org/abs/2107.03920
https://arxiv.org/abs/2205.15680 (AISTATS 2023)
Equivalence of Tests and Confidence Sets

Data $D = \{X_1, \ldots, X_n\} \sim F_\theta$

Test statistic $\lambda(D; \theta)$; critical values $C_{\theta_0, \alpha}$

Reject $H_0: \theta = \theta_0 \iff \lambda(D; \theta_0) < C_{\theta_0, \alpha}$

Theorem (Neyman 1937)
Constructing a $1 - \alpha$ confidence set for $\theta$ is equivalent to testing
$$H_0: \theta = \theta_0 \quad \text{vs.} \quad H_A: \theta \ne \theta_0$$
for every $\theta_0 \in \Theta$.
1. Fix $\theta_0$. Find the rejection region for the test statistic $\lambda$.
2. Repeat for every $\theta$ in the parameter space.
3. Observe data $D = \mathcal{D}$. Evaluate $\lambda(\mathcal{D}; \theta)$.
4. Construct the $(1 - \alpha)$ confidence set for $\theta$.


Challenges

The Neyman construction itself. See L. Lyons, “Open Statistical Issues in Particle Physics”, AOAS 2008.

Validation of frequentist coverage. See R. Cousins, “Lectures on Statistics in Theory: Prelude to Statistics in Practice”, arXiv:1807.05996, 2018.
How Do We Turn the Neyman Construction and Validation into Practical Procedures?

The Neyman construction requires one to test
$$H_0: \theta = \theta_0 \quad \text{vs.} \quad H_A: \theta \ne \theta_0$$
for every $\theta_0 \in \Theta$.

Key insight:

1. the test statistic $\lambda(D; \theta)$,
2. the critical values $C_{\theta_0, \alpha}$ or p-values $p(D; \theta_0)$ of the test, and
3. the coverage $P_{D|\theta}\left(\theta \in \hat{R}(D)\right)$ of the constructed confidence set

are conditional distribution functions of the (unknown) parameters, and often vary smoothly across the parameter space $\Theta$.
Efficient Construction of Finite-Sample Confidence Sets

Rather than running a batch of Monte Carlo simulations for every null hypothesis $\theta = \theta_0$ on, e.g., a fine enough grid in $\Theta$, we can interpolate across the parameter space using training-based ML algorithms.

Our Inference Machinery


Test Statistics: Leverage ML Classification/Prediction Algorithms

Examples of LF2I test statistics:

- classification/odds → ACORE (approximate LRT) [Dalmasso et al., 2020; arXiv:2002.10399]
- classification/odds → BFF (approximate Bayes factor) [Dalmasso et al., 2021; arXiv:2107.03920]
- prediction or posterior estimation → WALDO (modified Wald test statistic) [Masserano et al., 2022; arXiv:2205.15680]
Center Branch: Estimating Odds and Test Statistic

Parameter: $\theta \in \Theta$. Simulated data: $X, x \in \mathcal{X}$. Observed data: $X^{\text{obs}}, x^{\text{obs}} \in \mathcal{X}$.

Ingredients:
1. A proposal distribution $\pi(\theta)$ over the parameter space $\Theta$
2. A forward simulator $F_\theta$, with $F_{\theta_1} \ne F_{\theta_2}$ for $\theta_1 \ne \theta_2 \in \Theta$
3. A reference distribution $G$ over the feature space $\mathcal{X}$, with $F_\theta \ll G$ for all $\theta \in \Theta$
4. A simulated sample of size $B$ to estimate odds and test statistic
Estimate Odds via Probabilistic Classification

Simulate two samples:
$\{(\theta_k, X_k, Y_k = 1)\}_{k=1}^{B/2}$, where $\theta \sim \pi(\theta)$, $X \sim F_\theta$
$\{(\theta_l, X_l, Y_l = 0)\}_{l=1}^{B/2}$, where $\theta \sim \pi(\theta)$, $X \sim G$

Train a probabilistic classifier $r$:
$$r: (\theta, X) \longrightarrow P(Y = 1 \mid X, \theta)$$

Define the odds at $\theta \in \Theta$ and fixed $x \in \mathcal{X}$ as
$$O(x; \theta) := \frac{P(Y = 1 \mid x, \theta)}{P(Y = 0 \mid x, \theta)} = \frac{f_\theta(x)}{g(x)}.$$

Interpretation: the chance that $x$ was generated from $F_\theta$ rather than $G$.
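A minimal sketch of this step for the one-dimensional mixture example from earlier; the gradient-boosted classifier and the uniform reference distribution $G$ are illustrative choices, not prescribed by the slides.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
B = 20_000

# Y = 1 sample: (theta, X) with theta ~ pi(theta), X ~ F_theta (one draw each).
theta1 = rng.uniform(-5.0, 5.0, size=B // 2)
signs = rng.choice([-1.0, 1.0], size=B // 2)
X1 = rng.normal(signs * theta1, 1.0)

# Y = 0 sample: (theta, X) with theta ~ pi(theta), X ~ G (uniform reference).
theta0 = rng.uniform(-5.0, 5.0, size=B // 2)
X0 = rng.uniform(-8.0, 8.0, size=B // 2)

features = np.column_stack([np.concatenate([theta1, theta0]),
                            np.concatenate([X1, X0])])
labels = np.concatenate([np.ones(B // 2), np.zeros(B // 2)])
clf = GradientBoostingClassifier().fit(features, labels)

def odds_hat(x, theta):
    """Estimated O(x; theta) = P(Y=1|x,theta) / P(Y=0|x,theta) ~ f_theta(x)/g(x)."""
    p = clf.predict_proba(np.array([[theta, x]]))[0, 1]
    return p / (1.0 - p)
```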
ACORE and BFF are Approximations of the LR Statistic and the Bayes Factor, Respectively

Lemma (Fisher’s Consistency)
If $\hat{P}(Y = 1 \mid \theta, x) = P(Y = 1 \mid \theta, x)$ for all $\theta, x$, then

1. $\hat{\Lambda}(D; \Theta_0) = \mathrm{LR}(D; \Theta_0) \equiv \log \dfrac{\sup_{\theta \in \Theta_0} L(D; \theta)}{\sup_{\theta \in \Theta} L(D; \theta)}$, and

2. $\hat{\tau}(D; \Theta_0) = \mathrm{BF}(D; \Theta_0) \equiv \dfrac{P(D \mid H_0)}{P(D \mid H_1)} = \dfrac{\int_{\Theta_0} L(D; \theta)\, d\pi_0(\theta)}{\int_{\Theta_1} L(D; \theta)\, d\pi_1(\theta)}$.

Note: The Bayes factor is often used as a Bayesian alternative to significance testing, but here we are treating it as a frequentist test statistic.
Test Statistics Based on Odds: ACORE and BFF

Suppose we want to test
$$H_0: \theta = \theta_0 \quad \text{vs.} \quad H_1: \theta \ne \theta_0.$$

For observed data $D = \{X_1^{\text{obs}}, \ldots, X_n^{\text{obs}}\}$, we define

ACORE (Approximate Computation via Odds Ratio Estimation):
$$\hat{\Lambda}(D; \theta_0) := \log \frac{\prod_{i=1}^n \hat{O}(X_i^{\text{obs}}; \theta_0)}{\sup_{\theta \in \Theta} \prod_{i=1}^n \hat{O}(X_i^{\text{obs}}; \theta)}$$

BFF (Bayesian Frequentist Factor):
$$\hat{\tau}(D; \theta_0) := \frac{\prod_{i=1}^n \hat{O}(X_i^{\text{obs}}; \theta_0)}{\int_\Theta \left( \prod_{i=1}^n \hat{O}(X_i^{\text{obs}}; \theta) \right) d\pi_\tau(\theta)},$$
where $\pi_\tau(\theta)$ is a probability distribution over the parameter space.
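Given an estimated odds function such as `odds_hat` above, both statistics reduce to a few lines. In this sketch, the grid-based sup and the uniform-weight average standing in for the integral against $\pi_\tau$ are illustrative simplifications for a scalar $\theta$.

```python
import numpy as np

def log_prod_odds(D_obs, theta, odds_hat):
    """log of prod_i O_hat(X_i^obs; theta), computed on the log scale."""
    return sum(np.log(odds_hat(x, theta)) for x in D_obs)

def acore(D_obs, theta0, odds_hat, theta_grid):
    """ACORE: odds product at theta0 versus its sup over the parameter grid."""
    log_sup = max(log_prod_odds(D_obs, th, odds_hat) for th in theta_grid)
    return log_prod_odds(D_obs, theta0, odds_hat) - log_sup

def bff(D_obs, theta0, odds_hat, theta_grid):
    """BFF: odds product at theta0 versus its average over the grid,
    a Monte Carlo stand-in for the integral against pi_tau."""
    log_prods = [log_prod_odds(D_obs, th, odds_hat) for th in theta_grid]
    denom = np.mean(np.exp(log_prods))
    return np.exp(log_prod_odds(D_obs, theta0, odds_hat)) / denom
```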
Left Branch: Estimate Critical Values or P-Values

We use $B'$ simulations to estimate critical values.
Estimating Critical Values $C_{\theta_0, \alpha}$

To control the Type I error at level $\alpha$, reject $H_0: \theta = \theta_0$ when $\lambda(D; \theta_0) < C_{\theta_0, \alpha}$, where
$$C_{\theta_0, \alpha} = \sup_{C \in \mathbb{R}} \left\{ C : P_{D|\theta_0}\left(\lambda(D; \theta_0) < C\right) \le \alpha \right\}.$$

Problem: We need to compute $P_{D|\theta}(\lambda(D; \theta) < C)$ for every $\theta \in \Theta$.

Solution: $F_{\lambda|\theta}(C \mid \theta) \equiv P_{D|\theta}(\lambda(D; \theta) < C \mid \theta)$ is a conditional CDF, so we can estimate its $\alpha$-quantile $F_{\lambda|\theta}^{-1}(\alpha \mid \theta)$ via quantile regression.
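A sketch of this estimation step, assuming a `simulate` function and a test statistic `lam(D, theta)` of the form in the earlier sketches; scikit-learn's gradient-boosted quantile regression stands in for any consistent quantile regression estimator.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def estimate_critical_values(simulate, lam, alpha, rng, B_prime=5_000, n=10):
    """Left branch: learn the map theta -> alpha-quantile of lambda(D; theta)."""
    thetas = rng.uniform(-5.0, 5.0, size=B_prime)        # theta ~ pi(theta)
    stats = np.array([lam(simulate(th, n=n, rng=rng), th) for th in thetas])
    qr = GradientBoostingRegressor(loss="quantile", alpha=alpha)
    qr.fit(thetas.reshape(-1, 1), stats)
    return lambda theta: qr.predict(np.array([[theta]]))[0]   # C_hat_{theta, alpha}
```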
Construct Confidence Set via Neyman Inversion
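Combining the two branches, the inversion itself is one comparison per $\theta$ on a grid. This is a sketch for scalar $\theta$; `lam` and `c_hat` are the test statistic and the estimated critical-value function from the previous sketches.

```python
import numpy as np

def confidence_set(D_obs, lam, c_hat, theta_grid):
    """Neyman inversion: retain every theta whose test is NOT rejected,
    i.e. lambda(D_obs; theta) >= C_hat_{theta, alpha}."""
    return np.array([th for th in theta_grid if lam(D_obs, th) >= c_hat(th)])

# Example usage (names from the earlier sketches):
# theta_grid = np.linspace(-5.0, 5.0, 201)
# R_hat = confidence_set(D_obs, lam, c_hat, theta_grid)
```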
Are the Constructed Confidence Sets Valid?

Theorem (Validity for any test statistic)
Let $\hat{C}_{B'}$ be the critical value of a level-$\alpha$ test based on the statistic $\lambda(D; \theta_0)$. Then, if the quantile regression estimator is consistent,
$$\hat{C}_{B'} \xrightarrow[B' \to \infty]{P} C^*,$$
where $C^*$ is such that
$$P_{D|\theta_0}\left(\lambda(D; \theta_0) \le C^*\right) = \alpha.$$

If $B'$ is large enough, we can construct a confidence set with guaranteed nominal coverage regardless of the observed sample size $n$.
Right Branch: Assessing Conditional Coverage of $\hat{R}(D)$

How do we check the coverage of constructed confidence sets across $\Theta$?

Note:
$$\hat{R}(D) = \left\{ \theta \in \Theta \mid \lambda(D; \theta) \ge \hat{C}_{\theta, \alpha} \right\}$$
$$P_{D|\theta}\left(\theta \in \hat{R}(D) \mid \theta\right) = E_{D|\theta}\left[ \mathbb{I}\left(\theta \in \hat{R}(D)\right) \mid \theta \right]$$

Procedure:
1. Sample $\theta_i$ and data $D_i \sim F_{\theta_i}$
2. Construct the confidence set $\hat{R}(D_i)$
3. For $\{\theta_i, \hat{R}(D_i)\}_{i=1}^{B''}$, regress $Z_i := \mathbb{I}(\theta_i \in \hat{R}(D_i))$ on $\theta_i$

How close is the actual coverage to the nominal confidence level $1 - \alpha$?
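A sketch of the right branch; logistic regression of the coverage indicator on $\theta$ stands in for any probabilistic classifier or nonparametric regression of $Z_i$ on $\theta_i$.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def diagnose_coverage(simulate, covers, rng, B2=2_000, n=10):
    """Right branch: estimate P(theta in R_hat(D) | theta) across Theta.

    `covers(D, theta)` should return True iff theta lies in the confidence
    set built from D (e.g., lam(D, theta) >= c_hat(theta))."""
    thetas = rng.uniform(-5.0, 5.0, size=B2)
    Z = np.array([covers(simulate(th, n=n, rng=rng), th) for th in thetas],
                 dtype=int)
    model = LogisticRegression().fit(thetas.reshape(-1, 1), Z)
    grid = np.linspace(-5.0, 5.0, 101).reshape(-1, 1)
    # Compare the estimated coverage curve against the nominal level 1 - alpha.
    return grid.ravel(), model.predict_proba(grid)[:, 1]
```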
Ex: Estimate Critical Values (GMM; n = 1000) & Run Diagnostics Across the Parameter Space

(Left) LR with 1000 MC simulations at each θ on a fine grid
(Center) Assume a chi-squared distribution for the LR statistic
(Right) LR with quantile regression, using B' = 1000 simulations total
Ex: Construct Confidence Sets (MVG data)

When d = 2, ACORE and BFF confidence sets (for B = B' = 5000) are similar in size to the exact LR confidence sets.
LF2I scales well for <10 parameters
LF2I scales well for <10 parameters. However…

One more issue: the “theory” space is not the only thing affecting the data. Every step of the forward process comes with its own parameters (we understand the process generally, but need additional knobs to model the data):
$$p(x \mid \theta) = \int dz_d\, dz_h\, dz_p\; p(x \mid z_d, \theta_x)\, p(z_d \mid z_h, \theta_d)\, p(z_h \mid z_p, \theta_h)\, p(z_p \mid \theta_p, \theta_{\text{th}})$$

Here $\theta_x, \theta_d, \theta_h, \theta_p$ are nuisance parameters, while $\theta_{\text{th}}$ contains the core “theory” parameters of interest (e.g., the “Higgs mass”).

Credit: Lukas Heinrich
Hybrid Methods and Confidence Sets

Hybrid methods (which maximize or average over nuisance parameters) do not always control the Type I error of statistical tests.

“For small sample sizes, there is no theorem as to whether profiling or marginalization will give better frequentist coverage for the parameter of interest” (Cousins 2018).

Can our diagnostic tools provide guidance as to which method to choose for the problem at hand?
Poisson Counting Experiment
[cf. Lyons, 2008; Cowan et al., 2011; Cowan, 2012]

Particle collision events are counted in the presence of a background process.

The observed data D consist of n = 10 observations of X = (N_B, N_S), where
- N_B is the number of events in the background region (assume γ = 1)
- N_S is the number of events in the signal region

Unknown parameters: signal strength (s); two nuisance parameters (b and ε)
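A hypothetical simulator for this experiment. The exact parametrization below, N_B ~ Pois(γb) and N_S ~ Pois(b + εs), is one common convention in the counting-experiment literature (cf. Lyons 2008; Cowan et al. 2011) and should be treated as an illustrative assumption rather than the precise model used in the talk.

```python
import numpy as np

def simulate_counts(s, b, eps, rng, n=10, gamma=1.0):
    """Hypothetical counting-experiment simulator (parametrization assumed):
    N_B ~ Pois(gamma * b) in the background region (gamma = 1 as on the slide),
    N_S ~ Pois(b + eps * s) in the signal region.
    Returns n draws of X = (N_B, N_S)."""
    NB = rng.poisson(gamma * b, size=n)
    NS = rng.poisson(b + eps * s, size=n)
    return np.column_stack([NB, NS])

rng = np.random.default_rng(2)
D = simulate_counts(s=10.0, b=100.0, eps=0.8, rng=rng)  # illustrative values
```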
Diagnostics to Check Coverage Across the Entire Parameter Space

h-BFF (which averages over the nuisance parameters) performs best, in the sense of having the largest proportion of the parameter space with correct coverage (CC) and only a small fraction of the parameter space with undercoverage (UC).
Our diagnostic tool can identify regions in parameter space with undercoverage (UC), correct coverage (CC) and overcoverage (OC).
(Bottom: heat maps of the upper limit of the 2σ prediction band)
Take-Away: LF2I

LF2I can construct finite-sample confidence sets with nominal coverage, and provide diagnostics, even without a tractable likelihood. (It does not rely on large n or on costly MC samples.)
Take-Away: LF2I

Validity: Any existing or new test statistic (not only estimates of the LR statistic) can be used in our framework to create frequentist confidence sets. (Scales to ~10 parameters.)

Power: Hardest to achieve in practice; this is the area where most statistical and computational advances will take place.

Nuisance parameters and diagnostics: There is no guarantee that hybrid methods are valid. However, we have a practical tool for assessing coverage across the entire parameter space.

https://github.com/lee-group-cmu/lf2i
Current Projects (2023-)

- Constructing test statistics that are invariant to nuisance parameters (with Luca Masserano and Rafael Izbicki) → next time?
- Nuisance-parametrized LF2I of atmospheric cosmic-ray showers (with Alex Shen, Tommaso Dorigo, Michele Doro, Luca Masserano) → next talk by Alex!

https://github.com/lee-group-cmu/lf2i
Acknowledgments

Nic Dalmasso (JP Morgan AI): original LF2I framework
Rafael Izbicki (UFSCar)
Luca Masserano (CMU)
Mikael Kuusela (CMU)
Tommaso Dorigo (INFN/Padova)
David Zhao (CMU)

This work is funded in part by NSF DMS-2053804 and NSF PHY-2020295.
EXTRA SLIDES START HERE
Predictive AI Approach Can Be Very Powerful, But One Needs to Correct for Bias
[with Luca Masserano, Tommaso Dorigo, Rafael Izbicki and Mikael Kuusela]

Source: Dorigo et al., 2020 [Kieseler et al., July 2021; arXiv:2107.02119]
Slide credit: Luca Masserano

https://arxiv.org/abs/2205.15680 (AISTATS 2023)
Back to the muon energy calorimeter problem: LF2I/Waldo confidence sets derived from CNN predictions achieve correct coverage across the parameter space.

(Figure: prediction sets. Figure credit: Luca Masserano)
Ex: Credible Regions from Neural (NF) Posteriors

Blue contours: 95% credible regions from normalizing flows (overly confident when the prior is poorly specified).
Ex: LF2I/Waldo Confidence Sets Derived from the Same Neural Posteriors ⇨ Correct Coverage

Waldo guarantees coverage everywhere, even if the prior is poorly specified. A well-specified prior ⇨ power (tighter constraints).
