
MIT18 05S14 Class23slides PDF

The document discusses confidence intervals, beginning with an overview of confidence intervals for estimating the proportion θ in a Bernoulli distribution based on polling data. It then covers the central limit theorem and how it allows constructing large-sample confidence intervals for the mean without assuming normality. Finally, it presents three views of defining confidence intervals: using standardized point statistics, based on hypothesis tests, and as intervals that satisfy a formal probability property.


Confidence Intervals II

18.05 Spring 2014


Agenda

Polling: estimating θ in Bernoulli(θ).


CLT ⇒ large sample confidence intervals for the mean.
Three views of confidence intervals.
Constructing a confidence interval without normality:
the exact binomial confidence interval for θ

January 1, 2017 2 / 16
Polling confidence interval
Also called a binomial proportion confidence interval
Polling means sampling from a Bernoulli(θ) distribution,
i.e. data $x_1, \ldots, x_n \sim \text{Bernoulli}(\theta)$.

Conservative normal confidence interval for θ:

$$\bar{x} \pm z_{\alpha/2} \cdot \frac{1}{2\sqrt{n}}$$

Proof uses the CLT and the observation $\sigma = \sqrt{\theta(1-\theta)} \le 1/2$.

Rule-of-thumb 95% confidence interval for θ:

$$\bar{x} \pm \frac{1}{\sqrt{n}}$$

(Reason: $z_{0.025} \approx 2$.)
Board question

For a poll to find the proportion θ of people supporting X we know
that a (1 − α) confidence interval for θ is given by

$$\left[\bar{x} - \frac{z_{\alpha/2}}{2\sqrt{n}},\ \bar{x} + \frac{z_{\alpha/2}}{2\sqrt{n}}\right].$$

1. How many people would you have to poll to have a margin of error
of 0.01 with 95% confidence? (You can do this in your head.)

2. How many people would you have to poll to have a margin of error
of 0.01 with 80% confidence? (You’ll want R or another calculator here.)

3. If n = 900, compute the 95% and 80% confidence intervals for θ.
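
The arithmetic behind these three questions can be sketched in Python. This is a minimal sketch, assuming the conservative interval above, whose margin of error is $z_{\alpha/2}/(2\sqrt{n})$. The function names `z_critical`, `sample_size` and `margin_of_error` are illustrative and not from the slides, and the standard-library `statistics.NormalDist` is used here in place of R.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(conf):
    """Right critical value z_{alpha/2} for confidence level conf = 1 - alpha."""
    alpha = 1 - conf
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size(margin, conf):
    """Smallest n for which z_{alpha/2} / (2 sqrt(n)) <= margin."""
    return ceil((z_critical(conf) / (2 * margin)) ** 2)


def margin_of_error(n, conf):
    """Half-width of the conservative interval for a poll of size n."""
    return z_critical(conf) / (2 * sqrt(n))


# Questions 1 and 2: required poll sizes for a margin of error of 0.01.
print(sample_size(0.01, 0.95))
print(sample_size(0.01, 0.80))

# Question 3: half-widths for n = 900; the interval is x-bar +/- this value.
print(margin_of_error(900, 0.95))
print(margin_of_error(900, 0.80))
```

With the rule of thumb $z_{0.025} \approx 2$, question 1 reduces to $n = (2/(2 \cdot 0.01))^2 = 10000$, which is the in-your-head version; the exact critical value gives a slightly smaller n. Question 3 states no sample mean, so the sketch reports only the margin of error.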

Concept question: overnight polling

During the presidential election season, pollsters often do ‘overnight
polls’ and report a ‘margin of error’ of about ±5%.

The number of people polled is in which of the following ranges?


(a) 0 – 50
(b) 50 – 100
(c) 100 – 300
(d) 300 – 600
(e) 600 – 1000

National Council on Public Polls: Press Release, Sept 1992
“The National Council on Public Polls expressed concern today about the
current spate of overnight Presidential polls. [...] Overnight polls do a
disservice to both the media and the research industry because of the
considerable potential for the results to be misleading. The overnight
interviewing period may well mean some methodological compromises, the
most serious of which is...”

...what?

“...the inability to make callbacks, resulting in samples that do not
adequately represent such groups as single member households, younger
people, and others who are apt to be out on any given night. As overnight
polls often result in findings that are less reliable than those from more
carefully conducted polls, if the media reports them, it should be with
great caution.”
http://www.ncpp.org/?q=node/42

Large sample confidence interval
Data $x_1, \ldots, x_n$ independently drawn from a distribution that may not
be normal but has finite mean and variance.

A version of the central limit theorem says that for large n,

$$\frac{\bar{x} - \mu}{s/\sqrt{n}} \approx \text{N}(0, 1)$$

i.e. the sampling distribution of the studentized mean is
approximately standard normal.

So for large n the (1 − α) confidence interval for µ is approximately

$$\left[\bar{x} - \frac{s}{\sqrt{n}} \cdot z_{\alpha/2},\ \bar{x} + \frac{s}{\sqrt{n}} \cdot z_{\alpha/2}\right]$$

This is called the large sample confidence interval.
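
As a rough illustration, the large sample interval can be computed directly from data. The sketch below assumes the formula $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$ with s the sample standard deviation. The function name `large_sample_ci` and the simulated exponential data are made up for the example and do not come from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def large_sample_ci(data, conf=0.95):
    """Approximate (1 - alpha) confidence interval for the mean, via the CLT."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical non-normal data: 500 draws from an exponential distribution
# with mean 2 (seeded so the run is repeatable).
rng = random.Random(1)
sample = [rng.expovariate(0.5) for _ in range(500)]
print(large_sample_ci(sample, conf=0.95))
```

The data here are clearly not normal, but with n = 500 the studentized mean is close enough to standard normal for the interval to be a reasonable approximation.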


Review: confidence intervals for normal data
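Suppose the data $x_1, \ldots, x_n$ is drawn from $\text{N}(\mu, \sigma^2)$.
Confidence level = 1 − α

z confidence interval for the mean (σ known):

$$\left[\bar{x} - \frac{z_{\alpha/2} \cdot \sigma}{\sqrt{n}},\ \bar{x} + \frac{z_{\alpha/2} \cdot \sigma}{\sqrt{n}}\right] \quad\text{or}\quad \bar{x} \pm \frac{z_{\alpha/2} \cdot \sigma}{\sqrt{n}}$$

t confidence interval for the mean (σ unknown):

$$\left[\bar{x} - \frac{t_{\alpha/2} \cdot s}{\sqrt{n}},\ \bar{x} + \frac{t_{\alpha/2} \cdot s}{\sqrt{n}}\right] \quad\text{or}\quad \bar{x} \pm \frac{t_{\alpha/2} \cdot s}{\sqrt{n}}$$

$\chi^2$ confidence interval for $\sigma^2$:

$$\left[\frac{n-1}{c_{\alpha/2}}\, s^2,\ \frac{n-1}{c_{1-\alpha/2}}\, s^2\right]$$

t and $\chi^2$ have n − 1 degrees of freedom.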
Three views of confidence intervals

View 1: Define/construct CI using a standardized point statistic.

View 2: Define/construct CI based on hypothesis tests.

View 3: Define CI as any interval statistic satisfying a formal
mathematical property.

View 1: Using a standardized point statistic
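Example. $x_1, \ldots, x_n \sim \text{N}(\mu, \sigma^2)$, where σ is known.
The standardized sample mean follows a standard normal distribution:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim \text{N}(0, 1)$$

Therefore:

$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2} \mid \mu\right) = 1 - \alpha$$

Pivot to:

$$P\left(\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \mid \mu\right) = 1 - \alpha$$

This is the (1 − α) confidence interval:

$$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$

Think of it as $\bar{x} \pm \text{error}$.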
View 1: Other standardized statistics

The t and $\chi^2$ statistics fit this paradigm as well:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim t(n-1)$$

$$X^2 = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)$$

View 2: Using hypothesis tests

Set up: Unknown parameter θ. Test statistic x.

For any value $\theta_0$, we can run an NHST with null hypothesis

$$H_0: \theta = \theta_0$$

at significance level α.

Definition. Given x, the (1 − α) confidence interval contains all $\theta_0$
which are not rejected when they are the null hypothesis.

Definition. A type 1 CI error occurs when the confidence interval
does not contain the true value of θ.
For a 1 − α confidence interval, the type 1 CI error rate is α.

Board question: exact binomial confidence interval

Use this table of binomial(8, θ) probabilities to:

1. Find the (two-sided) rejection region with significance level 0.10
for each value of θ.
2. Given x = 7, find the 90% confidence interval for θ.
3. Repeat for x = 4.

θ \ x   0      1      2      3      4      5      6      7      8
0.1     0.430  0.383  0.149  0.033  0.005  0.000  0.000  0.000  0.000
0.3     0.058  0.198  0.296  0.254  0.136  0.047  0.010  0.001  0.000
0.5     0.004  0.031  0.109  0.219  0.273  0.219  0.109  0.031  0.004
0.7     0.000  0.001  0.010  0.047  0.136  0.254  0.296  0.198  0.058
0.9     0.000  0.000  0.000  0.000  0.005  0.033  0.149  0.383  0.430

Solution
For each θ, the non-rejection region is blue, the rejection region is red.
In each row, the rejection region has probability at most α = 0.10.

θ \ x   0      1      2      3      4      5      6      7      8
0.1     0.430  0.383  0.149  0.033  0.005  0.000  0.000  0.000  0.000
0.3     0.058  0.198  0.296  0.254  0.136  0.047  0.010  0.001  0.000
0.5     0.004  0.031  0.109  0.219  0.273  0.219  0.109  0.031  0.004
0.7     0.000  0.001  0.010  0.047  0.136  0.254  0.296  0.198  0.058
0.9     0.000  0.000  0.000  0.000  0.005  0.033  0.149  0.383  0.430

For x = 7 the 90% confidence interval for θ is [0.7, 0.9].
These are the values of θ we wouldn’t reject as null hypotheses. They
are the blue entries in the x = 7 column.

For x = 4 the 90% confidence interval for θ is [0.3, 0.7].
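
The test-inversion procedure from View 2 can be sketched in code for this example. This is a minimal sketch under one stated assumption: the two-sided rejection region is built by giving each tail probability at most α/2, which is one common convention and may not be the exact rule used to colour the table. The binomial probabilities are computed with `math.comb` rather than read from the table, and the function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, theta):
    """P(X = k) for X ~ binomial(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def non_rejection_region(n, theta, alpha):
    """Values of x not rejected when H0: theta is tested at level alpha.

    Assumption: each tail of the rejection region has probability <= alpha/2.
    """
    pmf = [binom_pmf(k, n, theta) for k in range(n + 1)]
    lo, tail = 0, 0.0
    # Grow the left rejection tail while it stays within alpha/2.
    while lo <= n and tail + pmf[lo] <= alpha / 2:
        tail += pmf[lo]
        lo += 1
    hi, tail = n, 0.0
    # Grow the right rejection tail while it stays within alpha/2.
    while hi >= lo and tail + pmf[hi] <= alpha / 2:
        tail += pmf[hi]
        hi -= 1
    return set(range(lo, hi + 1))


def exact_binomial_ci(x, n, thetas, alpha):
    """Range of grid values theta whose test does not reject the observed x."""
    kept = [t for t in thetas if x in non_rejection_region(n, t, alpha)]
    return (min(kept), max(kept)) if kept else None


grid = [0.1, 0.3, 0.5, 0.7, 0.9]  # the theta values in the table
print(exact_binomial_ci(7, 8, grid, alpha=0.10))
print(exact_binomial_ci(4, 8, grid, alpha=0.10))
```

On this five-point grid the sketch returns the same endpoints as the solution for x = 7 and x = 4. A finer grid of θ values would give a tighter description of the interval.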

View 3: Formal
Recall: An interval statistic is an interval $I_x$ computed from data x.
This is a random interval because x is random.
Suppose x is drawn from $f(x \mid \theta)$ with unknown parameter θ.

Definition:
A (1 − α) confidence interval for θ is an interval statistic $I_x$ such that

$$P(I_x \text{ contains } \theta \mid \theta) = 1 - \alpha$$

for all possible values of θ (and hence for the true value of θ).

Note: equality in this equation is often relaxed to ≥ or ≈.

= : z, t, $\chi^2$
≥ : rule-of-thumb and exact binomial (polling)
≈ : large sample confidence interval
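
The formal property can be checked empirically by simulation. The sketch below estimates $P(I_x \text{ contains } \theta \mid \theta)$ for the rule-of-thumb polling interval $\bar{x} \pm 1/\sqrt{n}$. The choices θ = 0.3, n = 400 and 2000 trials are arbitrary values picked for illustration, and `rule_of_thumb_coverage` is a hypothetical helper name.

```python
import random
from math import sqrt


def rule_of_thumb_coverage(theta, n, trials, seed=0):
    """Fraction of simulated polls whose interval x-bar +/- 1/sqrt(n) contains theta."""
    rng = random.Random(seed)
    margin = 1 / sqrt(n)
    hits = 0
    for _ in range(trials):
        # One poll: n Bernoulli(theta) draws, summarized by the sample mean.
        xbar = sum(rng.random() < theta for _ in range(n)) / n
        if xbar - margin <= theta <= xbar + margin:
            hits += 1
    return hits / trials


print(rule_of_thumb_coverage(theta=0.3, n=400, trials=2000))
```

Because the rule-of-thumb interval uses the bound σ ≤ 1/2, the estimated coverage for θ away from 1/2 should come out above the nominal 95%, which is the "≥" case in the list above.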
MIT OpenCourseWare
https://ocw.mit.edu

18.05 Introduction to Probability and Statistics


Spring 2014

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
