18.650 – Fundamentals of Statistics

2. Foundations of Inference
Goals
In this unit, we introduce a mathematical formalization of statistical modeling to make principled sense of the trinity of statistical inference. We will make sense of the following statements:

1. Estimation: a single number.
   "p̂ = R̄n is an estimator for the proportion p of couples that turn their head to the right."
   (Side question: is 64.5% also an estimator for p?)
2. Confidence intervals:
   "[0.56, 0.73] is a 95% confidence interval for p."
3. Hypothesis testing: a simple yes-or-no question.
   "We find statistical evidence that more couples turn their head to the right when kissing."
The rationale behind statistical modeling
▶ Let X1, . . . , Xn be n independent copies of X.
▶ The goal of statistics is to learn the distribution of X.
▶ If X ∈ {0, 1}, easy! It's Ber(p) and we only have to learn the parameter p = IP(X = 1).
▶ It can be more complicated. For example, here is a (partial) dataset with the number of siblings (including self) that were collected from college students a few years back: 2, 3, 2, 4, 1, 3, 1, 1, 1, 1, 1, 2, 2, 3, 2, 2, 2, 3, 2, 1, 3, 1, 2, 3, . . .
▶ We could make no assumption and try to learn the pmf:

      x          1    2    3    4    5    6    ≥7
      IP(X = x)  p1   p2   p3   p4   p5   p6   Σ_{i≥7} p_i

  That's 7 parameters to learn (6, really, since the probabilities sum to 1).
▶ Or we could assume that X − 1 ∼ Poiss(λ). That's 1 parameter to learn!
Statistical model
Formal definition

Let the observed outcome of a statistical experiment be a sample X1, . . . , Xn of n i.i.d. random variables in some measurable space E (usually E ⊆ IR) and denote by IP their common distribution. A statistical model associated to that statistical experiment is a pair

   (E, (IP_θ)_{θ∈Θ}),

where:
▶ E is called the sample space;
▶ (IP_θ)_{θ∈Θ} is a family of probability measures on E;
▶ Θ is any set, called the parameter set.
Parametric, nonparametric and semiparametric models
▶ Usually, we will assume that the statistical model is well specified, i.e., defined such that IP = IP_θ for some θ ∈ Θ.
▶ This particular θ is called the true parameter, and is unknown: the aim of the statistical experiment is to estimate θ, or to check its properties when they have a special meaning (is θ > 2? is θ ≠ 1/2? . . . ).
▶ We often assume that Θ ⊆ IR^d for some d ≥ 1: the model is then called parametric.
▶ Sometimes we could have Θ be infinite-dimensional, in which case the model is called nonparametric.
▶ If Θ = Θ1 × Θ2, where Θ1 is finite-dimensional and Θ2 is infinite-dimensional: semiparametric model. In these models, we only care to estimate the finite-dimensional parameter; the infinite-dimensional one is called the nuisance parameter. We will not cover such models in this class.
Examples of parametric models
1. For n Bernoulli trials:
   ({0, 1}, (Ber(p))_{p∈(0,1)}).
2. If X1, . . . , Xn iid∼ Poiss(λ) for some unknown λ > 0:
   (IN, (Poiss(λ))_{λ>0}).
3. If X1, . . . , Xn iid∼ N(µ, σ²) for some unknown µ ∈ IR and σ² > 0:
   (IR, (N(µ, σ²))_{(µ,σ²)∈IR×(0,∞)}).
4. If X1, . . . , Xn iid∼ N_d(µ, I_d) for some unknown µ ∈ IR^d:
   (IR^d, (N_d(µ, I_d))_{µ∈IR^d}).
Examples of nonparametric models
1. If X1, . . . , Xn ∈ IR are i.i.d. with unknown unimodal¹ pdf f:
   E = IR, Θ = {unimodal pdfs on IR}.
2. If X1, . . . , Xn ∈ [0, 1] are i.i.d. with unknown invertible cdf F:
   E = [0, 1], Θ = {invertible cdfs on [0, 1]}.

¹ Increases on (−∞, a) and then decreases on (a, ∞) for some a ∈ IR.
Further examples
Sometimes we do not have simple notation, like (Ber(p))_{p∈(0,1)}, to write (IP_θ)_{θ∈Θ}, and we have to be more explicit:
1. Linear regression model: if (X1, Y1), . . . , (Xn, Yn) ∈ IR^d × IR are i.i.d. from the linear regression model Y_i = β⊤X_i + ε_i with ε_i iid∼ N(0, 1), for an unknown β ∈ IR^d, and X_i ∼ N_d(0, I_d) independent of ε_i:
   E = IR^d × IR, Θ = IR^d.
2. Cox proportional hazards model: if (X1, Y1), . . . , (Xn, Yn) ∈ IR^d × IR and the conditional distribution of Y given X = x has CDF F of the form

   F(t) = 1 − exp(−∫₀ᵗ h(u) e^{β⊤x} du),

   where h is an unknown non-negative nuisance function and β ∈ IR^d is the parameter of interest.
Identifiability
The parameter θ is called identifiable iff the map θ ∈ Θ ↦ IP_θ is injective, i.e.,
   θ ≠ θ′ ⇒ IP_θ ≠ IP_θ′,
or equivalently:
   IP_θ = IP_θ′ ⇒ θ = θ′.

Examples
1. In all previous examples, the parameter is identifiable.
2. If X_i = 1I(Y_i ≥ 0) (indicator function), where Y1, . . . , Yn iid∼ N(µ, σ²), for some unknown µ ∈ IR and σ² > 0, are unobserved: µ and σ² are not identifiable (but θ = µ/σ is).
Exercises
a) Which of the following is a statistical model?
1. ({1}, (Ber(p))_{p∈(0,1)})
2. ({0, 1}, (Ber(p))_{p∈(0.2,0.4)})
3. Both 1 and 2
4. None of the above

b) Let X1, . . . , Xn iid∼ U([0, a]) for some unknown a > 0. Which one of the following is the associated statistical model?
1. ([0, a], (U([0, a]))_{a>0})
2. (IR₊, (U([0, a]))_{a>0})
3. (IR, (U([0, a]))_{a>0})
4. None of the above
Exercises
c) Let X_i = Y_i², where Y1, . . . , Yn iid∼ U([0, a]), for some unknown a, are unobserved. Is a identifiable?
1. Yes
2. No

d) Let X_i = 1I(Y_i ≥ a/2), where Y1, . . . , Yn iid∼ U([0, a]), for some unknown a, are unobserved. Is a identifiable?
1. Yes
2. No
Estimation

Parameter estimation
Definitions
▶ Statistic: any measurable² function of the sample, e.g., X̄n, max_i X_i, X1 + log(1 + |Xn|), the sample variance, etc.
▶ Estimator of θ: any statistic whose expression does not depend on θ.
▶ An estimator θ̂n of θ is weakly (resp. strongly) consistent if
   θ̂n → θ in IP_θ-probability (resp. IP_θ-almost surely) as n → ∞.
▶ An estimator θ̂n of θ is asymptotically normal if
   √n (θ̂n − θ) →(d) N(0, σ²) as n → ∞.
   The quantity σ² is then called the asymptotic variance of θ̂n.

² Rule of thumb: if you can compute it exactly once given data, it is measurable. You may have some issues with things that are implicitly defined.
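Both asymptotic properties can be seen in a quick simulation. Here is a minimal sketch (my own illustration, not part of the slides) for θ̂n = X̄n with X ∼ Ber(p), where p = 0.35 and n = 1000 are arbitrary choices:

```python
# Consistency: X̄n concentrates around p.
# Asymptotic normality: √n(X̄n - p) has spread ≈ √(p(1-p)).
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 0.35, 1_000, 10_000

x_bar = rng.binomial(n, p, size=reps) / n       # `reps` independent copies of X̄n
print(np.abs(x_bar - p).mean())                 # small: X̄n is close to p
print(np.sqrt(n) * (x_bar - p).std())           # ≈ √(p(1-p)) ≈ 0.477
```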
Bias of an estimator
▶ Bias of an estimator θ̂n of θ:
   bias(θ̂n) = IE[θ̂n] − θ.
▶ If bias(θ̂n) = 0, we say that θ̂n is unbiased.
▶ Example: assume that X1, . . . , Xn iid∼ Ber(p) and consider the following estimators for p:
   ▶ p̂n = X̄n: bias(p̂n) = 0
   ▶ p̂n = X1: bias(p̂n) = 0
   ▶ p̂n = (X1 + X2)/2: bias(p̂n) = 0
   ▶ p̂n = 1I(X1 = 1, X2 = 2): bias(p̂n) = −p (as written, the event {X2 = 2} is impossible for a Bernoulli variable, so p̂n ≡ 0)
Variance of an estimator
An estimator is a random variable, so we can compute its variance. In the previous examples:
▶ p̂n = X̄n: var(p̂n) = p(1 − p)/n
▶ p̂n = X1: var(p̂n) = p(1 − p)
▶ p̂n = (X1 + X2)/2: var(p̂n) = p(1 − p)/2
▶ p̂n = 1I(X1 = 1, X2 = 2): var(p̂n) = 0 (p̂n is constant)
Quadratic risk
▶ We want estimators to have low bias and low variance at the same time.
▶ The risk (or quadratic risk) of an estimator θ̂n ∈ IR is
   R(θ̂n) = IE[|θ̂n − θ|²].
▶ Low quadratic risk means that both bias and variance are small:
   quadratic risk = bias² + variance.
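The decomposition can be checked numerically. Below is a small Monte Carlo sketch (mine, not from the slides; p = 0.3 and n = 50 are arbitrary) for three of the Bernoulli estimators above:

```python
# For each estimator, estimate bias, variance and risk from many replications
# and check that risk ≈ bias² + variance.
import numpy as np

rng = np.random.default_rng(0)
p, n, reps = 0.3, 50, 100_000

x = rng.binomial(1, p, size=(reps, n))
for est in (x.mean(axis=1), x[:, 0], (x[:, 0] + x[:, 1]) / 2):
    bias, var = est.mean() - p, est.var()
    risk = np.mean((est - p) ** 2)
    print(f"bias² + var = {bias**2 + var:.5f}, risk = {risk:.5f}")  # they agree
```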
Exercises
Let X1, X2, . . . , Xn be a random sample from U([a, a + 1]). Questions a), b), c) and d) are about this sample.
a) Find IE[X̄n].
b) Is X̄n − 1/2 an unbiased estimator for a?
c) Find the variance of X̄n − 1/2.
d) Find the quadratic risk of X̄n − 1/2.
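A quick simulation sketch can be used to sanity-check your answers (this is my own addition; a = 3 and n = 40 are arbitrary choices, not part of the problem):

```python
# The simulation suggests IE[X̄n] = a + 1/2, so X̄n - 1/2 is unbiased for a,
# with variance var(X̄n) = 1/(12n) since var(U([a, a+1])) = 1/12.
import numpy as np

rng = np.random.default_rng(0)
a, n, reps = 3.0, 40, 100_000

x_bar = rng.uniform(a, a + 1, size=(reps, n)).mean(axis=1)
print(x_bar.mean(), a + 0.5)               # both ≈ 3.5
print((x_bar - 0.5).var(), 1 / (12 * n))   # both ≈ 0.00208
```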
Confidence intervals

Confidence intervals
Let (E, (IP_θ)_{θ∈Θ}) be a statistical model based on observations X1, . . . , Xn, and assume Θ ⊆ IR. Let α ∈ (0, 1).
▶ Confidence interval (C.I.) of level 1 − α for θ: any random interval I (depending on X1, . . . , Xn) whose boundaries do not depend on θ and such that
   IP_θ[I ∋ θ] ≥ 1 − α, ∀θ ∈ Θ.³
▶ C.I. of asymptotic level 1 − α for θ: any random interval I whose boundaries do not depend on θ and such that
   lim_{n→∞} IP_θ[I ∋ θ] ≥ 1 − α, ∀θ ∈ Θ.

³ I ∋ θ means that I contains θ. This notation emphasizes the randomness of I, but we can equivalently write θ ∈ I.
A confidence interval for the kiss example
▶ Recall that we observe R1, . . . , Rn iid∼ Ber(p) for some unknown p ∈ (0, 1).
▶ Statistical model: ({0, 1}, (Ber(p))_{p∈(0,1)}).
▶ Recall that our estimator for p is p̂ = R̄n.
▶ From the CLT:
   √n (R̄n − p)/√(p(1 − p)) →(d) N(0, 1) as n → ∞.
   This means (precisely) the following:
▶ Let Φ(x) be the cdf of N(0, 1) and Φ_n(x) the cdf of √n (R̄n − p)/√(p(1 − p)).
▶ Then Φ_n(x) ≈ Φ(x) (CLT) when n becomes large. Hence, for all x > 0,
   IP[|R̄n − p| ≥ x] ≈ 2(1 − Φ(x√n/√(p(1 − p)))).
Confidence interval?
▶ For a fixed α ∈ (0, 1), if q_{α/2} is the (1 − α/2)-quantile of N(0, 1), then with probability ≈ 1 − α (if n is large enough!),
   R̄n ∈ [p − q_{α/2}√(p(1 − p))/√n, p + q_{α/2}√(p(1 − p))/√n].
▶ It yields
   lim_{n→∞} IP([R̄n − q_{α/2}√(p(1 − p))/√n, R̄n + q_{α/2}√(p(1 − p))/√n] ∋ p) = 1 − α.
▶ But this is not a confidence interval, because its boundaries depend on the unknown parameter p.
▶ To fix this, there are 3 solutions.
Solution 1: Conservative bound
▶ Note that no matter the (unknown) value of p,
   p(1 − p) ≤ 1/4.
▶ Hence, roughly with probability at least 1 − α,
   R̄n ∈ [p − q_{α/2}/(2√n), p + q_{α/2}/(2√n)].
▶ We get the asymptotic confidence interval
   I_conserv = [R̄n − q_{α/2}/(2√n), R̄n + q_{α/2}/(2√n)].
▶ Indeed,
   lim_{n→∞} IP(I_conserv ∋ p) ≥ 1 − α.
Solution 2: Solving the (quadratic) equation for p
▶ We have a system of two inequalities in p:
   R̄n − q_{α/2}√(p(1 − p))/√n ≤ p ≤ R̄n + q_{α/2}√(p(1 − p))/√n.
▶ Together they form a quadratic inequality in p:
   (p − R̄n)² ≤ q²_{α/2} p(1 − p)/n.
   We need to find the roots p1 < p2 of
   (1 + q²_{α/2}/n) p² − (2R̄n + q²_{α/2}/n) p + R̄n² = 0.
▶ This leads to a new confidence interval I_solve = [p1, p2] such that
   lim_{n→∞} IP(I_solve ∋ p) = 1 − α.
   (It is complicated to write in a generic way, so let us wait until we have values for n, α and R̄n to plug in.)
Solution 3: Plug-in
▶ Recall that by the LLN, p̂ = R̄n → p a.s./in IP-probability as n → ∞.
▶ So by Slutsky's theorem, we also have
   √n (R̄n − p)/√(p̂(1 − p̂)) →(d) N(0, 1) as n → ∞.
▶ This leads to a new confidence interval
   I_plug-in = [R̄n − q_{α/2}√(p̂(1 − p̂))/√n, R̄n + q_{α/2}√(p̂(1 − p̂))/√n]
   such that
   lim_{n→∞} IP(I_plug-in ∋ p) = 1 − α.
95% asymptotic CI for the kiss example
Recall that in the kiss example we had n = 124 and R̄n = 0.645. Assume α = 5%.
For I_solve, we have to find the roots of
   1.03 p² − 1.32 p + 0.41 = 0: p1 = 0.53, p2 = 0.75.
We get the following confidence intervals of asymptotic level 95%:
▶ I_conserv = [0.56, 0.73]
▶ I_solve = [0.53, 0.75]
▶ I_plug-in = [0.56, 0.73]
There are many⁴ other possibilities in software, even ones that use the exact distribution of nR̄n ∼ Bin(n, p):
▶ I_R default = [0.55, 0.73]

⁴ See R. Newcombe (1998). Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods.
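The three intervals can be reproduced numerically. Here is a minimal sketch (my own, not part of the slides) using numpy/scipy; for I_solve it uses the rounded coefficients printed above, since the roots are sensitive to that rounding:

```python
import numpy as np
from scipy.stats import norm

n, r_bar, alpha = 124, 0.645, 0.05
q = norm.ppf(1 - alpha / 2)                 # q_{α/2} ≈ 1.96

# Solution 1: conservative bound, using p(1-p) <= 1/4
h = q / (2 * np.sqrt(n))
print(r_bar - h, r_bar + h)                 # ≈ [0.56, 0.73]

# Solution 2: roots of the quadratic 1.03 p² - 1.32 p + 0.41 = 0
print(np.roots([1.03, -1.32, 0.41]))        # ≈ [0.75, 0.53]

# Solution 3: plug-in, replacing p by p̂ = R̄n in the variance
h = q * np.sqrt(r_bar * (1 - r_bar) / n)
print(r_bar - h, r_bar + h)                 # ≈ [0.56, 0.73]
```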
Exercises
a) Let I, J be some 95% and 98% asymptotic confidence intervals (respectively) for p. Which one of the following statements is correct?
1. We always have I ⊂ J.
2. We always have J ⊂ I.
3. None of the above.

b) Find a 98% asymptotic confidence interval for p.
Exercises
c) Consider a new experiment in which there are 150 participants; 75 turned left and 75 turned right. Which of the following is the correct answer?
1. [0, 0.5] is a 50% asymptotic confidence interval for p
2. [0.5, 1] is a 50% asymptotic confidence interval for p
3. [0.466, 0.533] is a 50% asymptotic confidence interval for p
4. [0.48, 0.52] is a 50% asymptotic confidence interval for p
5. Both (1) and (2)
6. (1), (2) and (3)
7. (1), (2), (3) and (4)
Exercises
d) If [0.34, 0.57] is a 95% confidence interval for an unknown
proportion p, then the probability that p is in this interval is
1. 0.025
2. 0.05
3. 0.95
4. None of the above

e) If [0.34, 0.57] is a 95% confidence interval for an unknown


proportion p, is it also a 98% confidence interval?
1. Yes
2. No

f) If [0.34, 0.57] is a 95% confidence interval for an unknown


proportion p, is it also a 90% confidence interval?
1. Yes
2. No
Another example: The T

Statistical problem
▶ You observe the times (in minutes) between arrivals of the T at Kendall: T1, . . . , Tn.
▶ You assume that these times are:
   ▶ mutually independent, and
   ▶ exponential random variables with common parameter λ.
▶ You want to estimate the value of λ, based on the observed arrival times.
Discussion of the modeling assumptions
▶ Mutual independence of T1, . . . , Tn: plausible but not completely justified (often the case with independence).
▶ T1, . . . , Tn are exponential r.v.'s: justified by the lack of memory of the exponential distribution:
   IP[T1 > t + s | T1 > t] = IP[T1 > s], ∀s, t ≥ 0.
   Also, T_i > 0 almost surely!
▶ The exponential distributions of T1, . . . , Tn have the same parameter: on average, all inter-arrival times are the same. True only for a limited time period (rush hour ≠ 11pm).
Estimator
▶ Density of T1:
   f(t) = λe^{−λt}, ∀t ≥ 0.
▶ IE[T1] = 1/λ.
▶ Hence, a natural estimate of 1/λ is
   T̄n := (1/n) Σ_{i=1}^n T_i.
▶ A natural estimator of λ is
   λ̂ := 1/T̄n.
First properties
▶ By the LLN,
   T̄n → 1/λ a.s./in IP-probability as n → ∞.
▶ Hence,
   λ̂ → λ a.s./in IP-probability as n → ∞.
▶ By the CLT,
   √n (T̄n − 1/λ) →(d) N(0, 1/λ²) as n → ∞.
▶ How does the CLT transfer to λ̂? How do we find an asymptotic confidence interval for λ?
The Delta method
Let (Zn)_{n≥1} be a sequence of r.v. that satisfies
   √n (Zn − θ) →(d) N(0, σ²) as n → ∞
for some θ ∈ IR and σ² > 0 (the sequence (Zn)_{n≥1} is said to be asymptotically normal around θ).
Let g : IR → IR be continuously differentiable at the point θ. Then:
▶ (g(Zn))_{n≥1} is also asymptotically normal;
▶ more precisely,
   √n (g(Zn) − g(θ)) →(d) N(0, g′(θ)² σ²) as n → ∞.
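To see the statement in action, here is a small simulation sketch (my own, not from the slides) with g(x) = 1/x, anticipating the T example: if Zn = T̄n is asymptotically normal around 1/λ with σ² = 1/λ², then g′(1/λ)² σ² = λ⁴/λ² = λ², so λ̂ = 1/T̄n should fluctuate like N(λ, λ²/n).

```python
# λ = 2 and n = 500 are arbitrary choices for the illustration.
import numpy as np

rng = np.random.default_rng(0)
lam, n, reps = 2.0, 500, 20_000

# draw `reps` samples of size n and form λ̂ = 1/T̄n for each
t_bar = rng.exponential(scale=1 / lam, size=(reps, n)).mean(axis=1)
lam_hat = 1 / t_bar

# √n(λ̂ - λ) should look N(0, λ²): its empirical sd should be ≈ λ
z = np.sqrt(n) * (lam_hat - lam)
print(z.std(), lam)   # both ≈ 2.0
```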
Consequence of the Delta method
▶ √n (λ̂ − λ) →(d) N(0, λ²) as n → ∞.
▶ Hence, for α ∈ (0, 1) and when n is large enough,
   |λ̂ − λ| ≤ q_{α/2} λ/√n
   with probability approximately 1 − α.
▶ Can [λ̂ − q_{α/2}λ/√n, λ̂ + q_{α/2}λ/√n] be used as an asymptotic confidence interval for λ? No: its bounds depend on the unknown λ.
Three solutions
1. The conservative bound: we have no a priori way to bound λ, so this route is not available here.
2. We can solve for λ:
   |λ̂ − λ| ≤ q_{α/2}λ/√n ⇔ (1 − q_{α/2}/√n) λ ≤ λ̂ ≤ (1 + q_{α/2}/√n) λ
   ⇔ λ̂ (1 + q_{α/2}/√n)^{−1} ≤ λ ≤ λ̂ (1 − q_{α/2}/√n)^{−1}.
   It yields
   I_solve = [λ̂ (1 + q_{α/2}/√n)^{−1}, λ̂ (1 − q_{α/2}/√n)^{−1}].
3. Plug-in yields
   I_plug-in = [λ̂ (1 − q_{α/2}/√n), λ̂ (1 + q_{α/2}/√n)].
95% asymptotic CI for the T example
Assume that n = 64, T̄n = 6.23 and α = 5%.
We get the following confidence intervals of asymptotic level 95%:
▶ I_solve = [0.13, 0.21]
▶ I_plug-in = [0.12, 0.20]
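A short numeric sketch (mine, using the slide's values n = 64, T̄n = 6.23, α = 0.05) reproduces both intervals:

```python
import numpy as np
from scipy.stats import norm

n, t_bar, alpha = 64, 6.23, 0.05
lam_hat = 1 / t_bar                    # λ̂ = 1/T̄n ≈ 0.161
q = norm.ppf(1 - alpha / 2)            # q_{α/2} ≈ 1.96
r = q / np.sqrt(n)                     # q_{α/2}/√n = 0.245

print(lam_hat / (1 + r), lam_hat / (1 - r))   # I_solve   ≈ [0.13, 0.21]
print(lam_hat * (1 - r), lam_hat * (1 + r))   # I_plug-in ≈ [0.12, 0.20]
```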
Meaning of a confidence interval
Take I_plug-in = [0.12, 0.20] for example. What is the meaning of "I_plug-in is a confidence interval of asymptotic level 95%"?

Does it mean that
   lim_{n→∞} IP(λ ∈ [0.12, 0.20]) ≥ .95?
No: λ is deterministic, so once the interval has been computed, it either contains λ or it does not.

There is a frequentist interpretation⁵: if we were to repeat this experiment (collect 64 observations) many times, then λ would be in the resulting confidence interval about 95% of the time.

[Figure: many repeated confidence intervals, almost all of which capture the true parameter; image credit: openintro.org.]

⁵ The frequentist approach is often contrasted with the Bayesian approach.
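The frequentist interpretation can itself be simulated. Below is a sketch of my own (assuming, for the sake of illustration, that the true λ = 0.16): repeat the experiment many times and count how often the plug-in interval contains λ.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
lam_true, n, reps = 0.16, 64, 10_000
q = norm.ppf(0.975)

times = rng.exponential(scale=1 / lam_true, size=(reps, n))
lam_hat = 1 / times.mean(axis=1)
half = q * lam_hat / np.sqrt(n)        # plug-in half-width
covered = (lam_hat - half <= lam_true) & (lam_true <= lam_hat + half)
print(covered.mean())                  # close to 0.95
```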
Hypothesis testing

How to board a plane?

What is the fastest boarding method?
What is the fastest method to board a plane: R2F or WilMA?
▶ R2F = Rear to Front.
▶ WilMA = Window, Middle, Aisle. It is basically an OUTSIDE to INSIDE method.
The data
We collected data from two different airlines: JetBlue (R2F) and United (WilMA). We got the following results (the sample averages, standard deviations and sizes below are sufficient statistics):

                     R2F    WilMA
   Average (mins)    24.2   15.9
   Std. Dev (mins)    2.1    1.3
   Sample size         72     56

Does this really mean WilMA is better?
Model and Assumptions
▶ Let X (resp. Y) denote the boarding time of a random JetBlue (resp. United) flight.
▶ We assume that X ∼ N(µ1, σ1²) and Y ∼ N(µ2, σ2²).
▶ Let n and m denote the JetBlue and United sample sizes, respectively.
▶ We have X1, . . . , Xn independent copies of X, and Y1, . . . , Ym independent copies of Y.
▶ We further assume that the two samples are independent.
We want to answer the question:

   Is µ1 = µ2, or is µ1 > µ2?

By making modeling assumptions, we have reduced the number of ways the hypothesis µ1 = µ2 may be rejected. We do not allow that µ1 < µ2!
We have two samples: this is a two-sample test.
A first heuristic
Simple heuristic:
   "If X̄n > Ȳm, then µ1 > µ2."
This could go wrong if I randomly pick only full flights in my sample X1, . . . , Xn and empty flights in my sample Y1, . . . , Ym.
Better heuristic:
   "If X̄n > Ȳm + (some buffer accounting for random fluctuations), then µ1 > µ2."
To make this intuition more precise, we need to take the size of the random fluctuations of X̄n and Ȳm into account!
Waiting time in the ER
▶ The average waiting time in the Emergency Room (ER) in the US is 30 minutes, according to the CDC.
▶ Some patients claim that the new Princeton-Plainsboro hospital has a longer waiting time. Is it true?
▶ Here, we collect only one sample: X1, . . . , Xn (waiting times in minutes for n random patients) with unknown expected value IE[X1] = µ.
▶ We want to know if µ > 30.
This is a one-sample test.
Heuristic
Heuristic:
   "If X̄n + (some buffer) < 30, then conclude that µ < 30 (no evidence for the patients' claim)."
Example 1
According to a survey conducted in 2017 on 4,971 randomly sampled Americans, 32% report getting at least some of their news on YouTube. Can we conclude that at most a third of all Americans get at least some of their news on YouTube?

▶ n = 4,971; X1, . . . , Xn iid∼ Ber(p);
▶ X̄n = 0.32.
▶ If it were true that p = .33: by the CLT,
   √n (X̄n − .33)/√(.33(1 − .33)) ≈ N(0, 1).
▶ Here √n (X̄n − .33)/√(.33(1 − .33)) ≈ −1.50.
▶ Conclusion: −1.50 is not an unusual value for a standard Gaussian, so the data does not let us conclude that p < .33.
The Standard Gaussian distribution
[Figure: the standard bell curve. About 68% of the mass lies within µ ± σ, 95% within µ ± 2σ, and 99.7% within µ ± 3σ.]
Example 2
Example 2: A coin is tossed 30 times, and Heads are obtained 13 times. Can we conclude that the coin is significantly unfair?

▶ n = 30; X1, . . . , Xn iid∼ Ber(p);
▶ X̄n = 13/30 ≈ .43.
▶ If it were true that p = .5: by the CLT,
   √n (X̄n − .5)/√(.5(1 − .5)) ≈ N(0, 1).
▶ Our data gives √n (X̄n − .5)/√(.5(1 − .5)) ≈ −.77.
▶ The number −.77 is a perfectly plausible realization of a random variable Z ∼ N(0, 1).
▶ Conclusion: the data does not give us evidence that the coin is unfair.
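The two normalized statistics are quick to compute. Here is a sketch of my own using the slides' numbers:

```python
import numpy as np

def z_stat(n, xbar, p0):
    """√n (X̄n - p0) / √(p0(1-p0)), approximately N(0,1) under p = p0 (CLT)."""
    return np.sqrt(n) * (xbar - p0) / np.sqrt(p0 * (1 - p0))

print(z_stat(4971, 0.32, 0.33))   # Example 1 (YouTube): ≈ -1.50
print(z_stat(30, 13 / 30, 0.5))   # Example 2 (coin): ≈ -0.73 (≈ -.77 if X̄n is first rounded to .43)
```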
Statistical formulation
▶ Consider a sample X1, . . . , Xn of i.i.d. random variables and a statistical model (E, (IP_θ)_{θ∈Θ}).
▶ Let Θ0 and Θ1 be disjoint subsets of Θ.
▶ Consider the two hypotheses:
   H0: θ ∈ Θ0 vs. H1: θ ∈ Θ1.
▶ H0 is the null hypothesis, H1 is the alternative hypothesis.
▶ If we believe that the true θ is either in Θ0 or in Θ1, we may want to test H0 against H1.
▶ We want to decide whether to reject H0 (look for evidence against H0 in the data).
Asymmetry in the hypotheses
▶ H0 and H1 do not play a symmetric role: the data is only used to try to disprove H0.
▶ In particular, lack of evidence does not mean that H0 is true ("innocent until proven guilty").
▶ A test is a statistic ψ ∈ {0, 1} such that:
   ▶ if ψ = 0, H0 is not rejected;
   ▶ if ψ = 1, H0 is rejected.
▶ Coin example: H0: p = 1/2 vs. H1: p ≠ 1/2.
▶ ψ = 1I{|X̄n − 1/2| > C}, for some C > 0.
▶ How to choose the threshold C?
Errors
▶ Rejection region of a test ψ:
   R_ψ = {x ∈ Eⁿ : ψ(x) = 1}.
▶ Type 1 error of a test ψ (rejecting H0 when it is actually true):
   α_ψ : Θ0 → IR, θ ↦ IP_θ[ψ = 1].
▶ Type 2 error of a test ψ (not rejecting H0 although H1 is actually true):
   β_ψ : Θ1 → IR, θ ↦ IP_θ[ψ = 0].
▶ Power of a test ψ:
   π_ψ = inf_{θ∈Θ1} (1 − β_ψ(θ)).
Level, test statistic and rejection region
▶ A test ψ has level α if
   α_ψ(θ) ≤ α, ∀θ ∈ Θ0.
▶ A test ψ has asymptotic level α if
   lim_{n→∞} α_ψ(θ) ≤ α, ∀θ ∈ Θ0.
▶ In general, a test has the form
   ψ = 1I{Tn > c},
   for some statistic Tn and threshold c ∈ IR.
▶ Tn is called the test statistic. The rejection region is
   R_ψ = {Tn > c}.
One-sided vs two-sided tests
We can refine the terminology when θ ∈ Θ ⊂ IR and H0 is of the form
   H0: θ = θ0, i.e., Θ0 = {θ0}.
▶ If H1: θ ≠ θ0: two-sided test.
▶ If H1: θ > θ0 or H1: θ < θ0: one-sided test.
Examples:
▶ Boarding method: one-sided (H1: µ1 > µ2).
▶ Waiting time in the ER: one-sided (H1: µ > 30).
▶ The kiss example: one-sided (H1: p > 1/2).
▶ Fair coin: two-sided (H1: p ≠ 1/2).
One- or two-sided tests will have different rejection regions.
Bernoulli experiment
▶ Let X1, . . . , Xn iid∼ Ber(p), for some unknown p ∈ (0, 1).
▶ We want to test
   H0: p = 1/2 vs. H1: p ≠ 1/2
   with asymptotic level α ∈ (0, 1).
▶ Let Tn = √n |p̂n − 0.5|/√(.5(1 − .5)), where p̂n = X̄n is the MLE.
▶ If H0 is true, then by the CLT,
   IP[Tn > q_{α/2}] → α as n → ∞ (e.g., 0.05 for α = 5%).
▶ Let ψ_α = 1I{Tn > q_{α/2}}.
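As a sanity check, here is a minimal sketch (my own, not from the slides) implementing ψ_α and applying it to the coin data from Example 2:

```python
import numpy as np
from scipy.stats import norm

def two_sided_test(x, alpha=0.05, p0=0.5):
    """Return (reject?, Tn) for i.i.d. Bernoulli data x under H0: p = p0."""
    n, p_hat = len(x), np.mean(x)
    t_n = np.sqrt(n) * abs(p_hat - p0) / np.sqrt(p0 * (1 - p0))
    return t_n > norm.ppf(1 - alpha / 2), t_n

coin = np.array([1] * 13 + [0] * 17)   # 13 heads in 30 tosses (Example 2)
print(two_sided_test(coin))            # (False, ≈0.73): H0 is not rejected
```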
Examples
For α = 5%, q_{α/2} = 1.96 and q_α = 1.645.

Fair coin
H0 is not rejected at the asymptotic level 5% by the test ψ_{5%} (Tn ≈ .77 < 1.96).

News on YouTube
H0: p ≥ 0.33 vs. H1: p < 0.33. This is a one-sided test.
We reject if
   √n (p̂n − p)/√(p(1 − p)) < −c
for some c > 0. But what value for p ∈ Θ0 = [0.33, 1) should we choose?
The type 1 error is the function p ↦ IP_p[ψ = 1]. To control the level, we need to find the p that maximizes it over Θ0
→ no need for computations, it's clearly p = 0.33.
H0 is not rejected at the asymptotic level 5% by the test ψ_{5%} (−1.50 > −1.645).
p-value
Definition
The (asymptotic) p-value of a test ψ_α is the smallest (asymptotic) level α at which ψ_α rejects H0. It is random: it depends on the sample.

Golden rule
p-value ≤ α ⇔ H0 is rejected by ψ_α at the (asymptotic) level α.
The smaller the p-value, the more confidently one can reject H0.

▶ Example 1: p-value = IP[|Z| > 3.21] ≪ .01.
▶ Example 2: p-value = IP[|Z| > .77] ≈ .44.
Exercise: Cookies⁶
Students are asked to count the number of chocolate chips in 32 cookies for a class activity. They found that the cookies on average had 14.77 chocolate chips with a standard deviation of 4.37 chocolate chips. The packaging for these cookies claims that there are at least 20 chocolate chips per cookie. One student thinks this number is unreasonably high, since the average they found is much lower. Another student claims the difference might be due to chance. What do you think (compute a p-value)?

⁶ From the textbook OpenIntro Statistics.
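One way to attack the exercise (a sketch, not the official solution): test H0: µ ≥ 20 vs. H1: µ < 20, using the normalized statistic with the sample standard deviation plugged in for σ.

```python
import numpy as np
from scipy.stats import norm

n, xbar, s, mu0 = 32, 14.77, 4.37, 20
t = np.sqrt(n) * (xbar - mu0) / s   # ≈ -6.77
p_value = norm.cdf(t)               # one-sided: IP[Z < t] for Z ~ N(0,1)
print(t, p_value)                   # p-value ≈ 6e-12: reject H0
```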
Exercise: kiss

Recall that in the Kiss example we observed 80 out of 124 couples


turning their head to the right. Formulate the statistical hypothesis
problem, compute the p-value and conclude.

Exercise: Machine learning predicts breast cancer
A vast problem in breast cancer is false positives, that is, surgery performed on benign tumors. A new machine learning procedure claims to improve significantly on the state of the art (95% false positives) while preserving the same true positive rate (detecting malignant tumors as malignant). To verify this claim, we collected data on 297 benign tumors. The algorithm recommended surgery on 206 of them.

Let p denote the proportion of benign tumors on which the algorithm prescribes surgery.
Formulate the statistical hypothesis problem, compute the p-value and conclude.
Recap
▶ A statistical model is a pair of the form (E, (IP_θ)_{θ∈Θ}), where E is the sample space and (IP_θ)_{θ∈Θ} is a family of candidate probability distributions.
▶ A model can be well specified and identifiable.
▶ The trinity of statistical inference: estimation, confidence intervals and testing.
▶ Estimator: one value, whose performance can be measured by consistency, asymptotic normality, bias, variance and quadratic risk.
▶ Confidence intervals provide "error bars" around estimators. Their size depends on the confidence level.
▶ Hypothesis testing: we want a yes/no answer to a question about an unknown parameter. Tests are characterized by hypotheses, level, power, test statistic and rejection region. Under the null hypothesis, the value of the unknown parameter becomes known (no need for plug-in).
