MIT18 05S14 Class23slides PDF
January 1, 2017 2 / 16
Polling confidence interval
Also called a binomial proportion confidence interval
Polling means sampling from a Bernoulli(θ) distribution,
i.e. data $x_1, \ldots, x_n \sim \text{Bernoulli}(\theta)$.
1. How many people would you have to poll to have a margin of error
of 0.01 with 95% confidence? (You can do this in your head.)
2. How many people would you have to poll to have a margin of error
of 0.01 with 80% confidence? (You’ll want R or another calculator here.)
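Both questions can be checked numerically. The sketch below assumes the conservative normal interval for a proportion, $\bar{x} \pm z_{\alpha/2} \cdot \frac{1}{2\sqrt{n}}$, which uses the bound $\sigma \le 1/2$ for a Bernoulli variable; that formula is not stated in this section, and the function name `poll_sample_size` is made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def poll_sample_size(margin, confidence):
    """Smallest n with z_{alpha/2} / (2 * sqrt(n)) <= margin.

    Assumes the conservative bound sigma <= 1/2 for Bernoulli data.
    """
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # right-tail critical value
    return ceil((z_crit / (2 * margin)) ** 2)


# In-your-head version for 95%: take z ~ 2, so n = (2 / (2 * 0.01))**2 = 10000.
print(poll_sample_size(0.01, 0.95))  # uses z ~ 1.96 instead of the rule of thumb
print(poll_sample_size(0.01, 0.80))
```

With the exact critical value the 95% answer comes out a little below the rule-of-thumb figure of 10000, because 1.96 is slightly less than 2.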
Concept question: overnight polling
National Council on Public Polls: Press Release, Sept 1992
“The National Council on Public Polls expressed concern today about the
current spate of overnight Presidential polls. [...] Overnight polls do a
disservice to both the media and the research industry because of the
considerable potential for the results to be misleading. The overnight
interviewing period may well mean some methodological compromises, the
most serious of which is..”
...what?
Large sample confidence interval
Data $x_1, \ldots, x_n$ independently drawn from a distribution that may not
be normal but has finite mean and variance.
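For such data the central limit theorem suggests an approximate interval for the mean. The sketch below assumes the usual large-sample form $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$, with the sample standard deviation $s$ standing in for the unknown $\sigma$; this formula is not written out in this section, and the data values are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def large_sample_ci(data, alpha):
    """Approximate (1 - alpha) interval for the mean, valid for large n."""
    n = len(data)
    xbar = fmean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    error = z_crit * s / sqrt(n)
    return xbar - error, xbar + error


# Hypothetical data; a real application would want a much larger sample.
waits = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
low, high = large_sample_ci(waits, alpha=0.05)
print(f"approximate 95% interval: ({low:.3f}, {high:.3f})")
```

The interval is only approximate: its actual coverage approaches $1 - \alpha$ as $n$ grows.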
View 1: Using a standardized point statistic
Example. $x_1, \ldots, x_n \sim N(\mu, \sigma^2)$, where $\sigma$ is known.
The standardized sample mean follows a standard normal distribution.
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
Therefore:
$$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2} \,\middle|\, \mu\right) = 1 - \alpha$$
Pivot to:
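$$P\left(\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \,\middle|\, \mu\right) = 1 - \alpha$$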
This is the (1 − α) confidence interval:
$$\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
Think of it as $\bar{x} \pm \text{error}$.
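The z interval is easy to compute directly. A minimal sketch, assuming normal data with known $\sigma$ as in the example; the function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_confidence_interval(data, sigma, alpha):
    """(1 - alpha) interval for mu when sigma is known."""
    xbar = fmean(data)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    error = z_crit * sigma / sqrt(len(data))      # the "error" term
    return xbar - error, xbar + error


# Hypothetical sample of size 4 with known sigma = 1.
low, high = z_confidence_interval([4.0, 5.0, 6.0, 5.0], sigma=1.0, alpha=0.05)
print(f"95% interval for mu: ({low:.3f}, {high:.3f})")
```

The error term shrinks like $1/\sqrt{n}$, so halving the width of the interval takes four times as much data.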
View 1: Other standardized statistics
$$X^2 = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)$$
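Pivoting this statistic in the same way as the z statistic gives an interval for $\sigma^2$. A sketch, assuming normal data and writing $c_{\alpha/2}$ for the right-tail critical value of $\chi^2(n-1)$ (this notation is an assumption, not defined in this section):

```latex
P\left(c_{1-\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < c_{\alpha/2} \,\middle|\, \sigma\right) = 1 - \alpha
\quad\Longrightarrow\quad
\frac{(n-1)s^2}{c_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{c_{1-\alpha/2}}
```

Because the $\chi^2$ distribution is not symmetric, the resulting interval is not of the form $s^2 \pm \text{error}$.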
View 2: Using hypothesis tests
Consider a test of the null hypothesis $H_0: \theta = \theta_0$
at significance level $\alpha$.
Definition. Given $x$, the $(1 - \alpha)$ confidence interval consists of all values $\theta_0$
that are not rejected when they are used as the null hypothesis.
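To see that this definition agrees with View 1, one can invert a two-sided z-test numerically. The sketch below assumes normal data with known $\sigma$ and scans a grid of candidate values $\mu_0$; the function names and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_rejects(xbar, mu0, sigma, n, alpha):
    """Two-sided z-test of H0: mu = mu0 at significance level alpha."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


def interval_by_inversion(xbar, sigma, n, alpha, grid):
    """Endpoints of the set of grid values mu0 that are not rejected."""
    kept = [mu0 for mu0 in grid if not z_test_rejects(xbar, mu0, sigma, n, alpha)]
    return min(kept), max(kept)


grid = [i / 1000 for i in range(0, 4001)]  # candidate mu0 from 0 to 4
low, high = interval_by_inversion(xbar=2.0, sigma=1.0, n=25, alpha=0.05, grid=grid)
print(low, high)  # matches xbar -/+ z_{alpha/2} * sigma / sqrt(n) up to grid spacing
```

The non-rejected values form exactly the interval from View 1, up to the resolution of the grid.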
Board question: exact binomial confidence interval
Solution
For each θ, the non-rejection region is blue, the rejection region is red.
In each row, the rejection region has probability at most α = 0.10.
θ \ x    0      1      2      3      4      5      6      7      8
0.1    0.430  0.383  0.149  0.033  0.005  0.000  0.000  0.000  0.000
0.3    0.058  0.198  0.296  0.254  0.136  0.047  0.010  0.001  0.000
0.5    0.004  0.031  0.109  0.219  0.273  0.219  0.109  0.031  0.004
0.7    0.000  0.001  0.010  0.047  0.136  0.254  0.296  0.198  0.058
0.9    0.000  0.000  0.000  0.000  0.005  0.033  0.149  0.383  0.430
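The table entries are binomial(8, θ) probabilities, so the rejection regions and the resulting confidence sets can be recomputed. The sketch below assumes a two-sided rejection region in which each tail has probability at most α/2; that is one common convention and is an assumption here, since the rule used to pick the regions is not spelled out in this section. Function names are made up for illustration.

```python
from math import comb


def binom_pmf(n, k, theta):
    """P(X = k) for X ~ binomial(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def rejection_region(n, theta, alpha):
    """Two-sided region: grow each tail while its probability stays <= alpha/2."""
    pmf = [binom_pmf(n, k, theta) for k in range(n + 1)]
    region = set()
    for order in (range(n + 1), range(n, -1, -1)):  # left tail, then right tail
        tail = 0.0
        for k in order:
            if tail + pmf[k] > alpha / 2:
                break
            tail += pmf[k]
            region.add(k)
    return region


def confidence_set(n, x, thetas, alpha):
    """All candidate theta whose test does not reject the observed x."""
    return [t for t in thetas if x not in rejection_region(n, t, alpha)]


thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
for x in (7, 4):
    print(x, confidence_set(8, x, thetas, alpha=0.10))
```

Only the five tabulated values of θ are tested here; a finer grid of θ values would trace out the endpoints of the interval more precisely.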
View 3: Formal
Recall: An interval statistic is an interval $I_x$ computed from data $x$.
This is a random interval because $x$ is random.
Suppose $x$ is drawn from $f(x \mid \theta)$ with unknown parameter $\theta$.
Definition:
A $(1 - \alpha)$ confidence interval for $\theta$ is an interval statistic $I_x$ such that
$$P(I_x \text{ contains } \theta \mid \theta) = 1 - \alpha$$
for all possible values of $\theta$ (and hence for the true value of $\theta$).
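The probability in this definition refers to the random interval, not to θ, and a simulation makes that concrete. The sketch below assumes normal data with known $\sigma$ and the z interval from View 1; the parameter values, seed, and function names are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(data, sigma, alpha):
    """(1 - alpha) z interval for the mean when sigma is known."""
    xbar = fmean(data)
    error = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(len(data))
    return xbar - error, xbar + error


def estimate_coverage(mu, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated intervals that contain the true mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        data = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(data, sigma, alpha)
        hits += low < mu < high
    return hits / trials


coverage = estimate_coverage(mu=3.0, sigma=2.0, n=10, alpha=0.10, trials=4000)
print(f"estimated coverage: {coverage:.3f}")  # should be close to 1 - alpha = 0.90
```

Changing `mu` should leave the estimated coverage near 0.90, which is the "for all possible values of θ" part of the definition.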
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.