12 Unknown Proportions
Module 4: Decisions with Data
The z-test
How can we make evidence-based decisions? Is an observed result due to
chance or something else? How can we test whether a population has a certain
proportion?
The t-test
How can we test whether an unknown population has a certain mean?
The 𝜒²-test
How can we compare the frequencies of categories?
Today’s outline
A review of the Central Limit Theorem (CLT)
Central Limit Theorem
· Let $X_1, \ldots, X_n$ be $n$ random draws with replacement from a box and let:
- $S = X_1 + \cdots + X_n = \sum_{i=1}^{n} X_i$ be the sample sum
- $\bar{X} = \frac{X_1 + \cdots + X_n}{n} = \frac{S}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i$ be the sample average
· For large 𝑛 , the box of all possible sample sums has mean 𝐸(𝑆) = 𝑛𝜇 , SD
𝑆𝐸(𝑆) = 𝜎√𝑛 and is approximately normal
· For large 𝑛 , the box of all possible sample means has mean 𝐸(𝑋¯ ) = 𝜇, SD
𝑆𝐸(𝑋¯ ) = 𝜎/√𝑛 and is approximately normal
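· For example (a quick sketch, not from the slides, using a hypothetical box with tickets 1, 2, 3, 4):
# Simulate the sample mean of n = 100 draws with replacement, repeated 10000 times
box = c(1, 2, 3, 4)
n = 100
xbars = replicate(10000, mean(sample(box, size = n, replace = TRUE)))
mean(xbars)                                 # close to E(Xbar) = mu = 2.5
sd(xbars)                                   # close to SE(Xbar) = sigma/sqrt(n)
sqrt(mean((box - mean(box))^2)) / sqrt(n)   # sigma/sqrt(n) = 1.118/10 = 0.1118
hist(xbars)                                 # roughly normal shape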
The 0-1 box
Important special case: 0-1 boxes
· An important example is where the box only contains 0s and 1s.
· Let 𝑝 denote the proportion of 1s in the box, and 𝑁 the number of tickets. Then there are:
- (1 − 𝑝)𝑁 0s, and
- 𝑝𝑁 1s:
[Box diagram: 0 ⋯ 0 ((1 − 𝑝)𝑁 of these)   1 ⋯ 1 (𝑝𝑁 of these)]
𝜇 and 𝜎 only depend on 𝑝
· We can calculate the mean and SD of the box in terms of $p$:
- the mean of the box is $\mu = \frac{pN}{N} = p$;
- the mean square of the box is also $p$, and so the SD of the box is $\sigma = \sqrt{p - p^2} = \sqrt{p(1-p)}$.
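· A quick numerical check (a sketch, not from the slides), using a hypothetical box with 𝑁 = 1000 tickets and 𝑝 = 0.25:
box = rep(c(0, 1), times = c(750, 250))    # 750 zeros and 250 ones, so p = 0.25
mean(box)                                  # mu = p = 0.25
sqrt(mean((box - mean(box))^2))            # population SD of the box
sqrt(0.25 * 0.75)                          # sqrt(p * (1 - p)) = 0.4330..., agrees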
Prediction intervals
Introduction to prediction intervals
· A 100 ⋅ 𝛾% (two-sided) prediction interval for the sample sum 𝑆 is an interval
[𝑎, 𝑏] such that there is a 100 ⋅ 𝛾% chance that 𝑆 lands in it:
𝑃(𝑎 ≤ 𝑆 ≤ 𝑏) = 𝛾
· A 100 ⋅ 𝛾% (two-sided) prediction interval for the sample average 𝑋¯ is an interval
[𝑎, 𝑏] such that there is a 100 ⋅ 𝛾% chance that 𝑋¯ lands in it:
𝑃(𝑎 ≤ 𝑋¯ ≤ 𝑏) = 𝛾
· How can we find 𝑎 and 𝑏?
Derivation with 𝑋¯
· Note the following:
qnorm(0.025)
## [1] -1.959964
· This means that 2.5% of the area under the standard normal curve is to the left of -1.96 and
2.5% is to the right of 1.96.
· In other words, 95% of the area under the standard normal curve is between -1.96 and 1.96.
· 𝑋¯ is approximately normal with mean 𝐸(𝑋¯ ) and SD equal to 𝑆𝐸(𝑋¯ ) .
· Equivalently, $\frac{\bar{X} - E(\bar{X})}{SE(\bar{X})}$ is approximately standard normal, with mean 0 and SD 1.
Derivation with 𝑋¯
· Finally note that
$$P(a \le \bar{X} \le b) = P\left( \frac{a - E(\bar{X})}{SE(\bar{X})} \le \frac{\bar{X} - E(\bar{X})}{SE(\bar{X})} \le \frac{b - E(\bar{X})}{SE(\bar{X})} \right)$$
· So if I choose $a$ such that $\frac{a - E(\bar{X})}{SE(\bar{X})} = -1.96$ and $b$ such that $\frac{b - E(\bar{X})}{SE(\bar{X})} = 1.96$, then
the quantity $\frac{\bar{X} - E(\bar{X})}{SE(\bar{X})}$ will land between -1.96 and 1.96 with 95% probability.
· Rearranging these two equations we get:
$$a = E(\bar{X}) - 1.96 \cdot SE(\bar{X}), \qquad b = E(\bar{X}) + 1.96 \cdot SE(\bar{X})$$
Putting this together
· So an (approximate) 95% prediction interval for the sample average $\bar{X}$ is:
$$\left[\, E(\bar{X}) - 1.96 \cdot SE(\bar{X}),\; E(\bar{X}) + 1.96 \cdot SE(\bar{X}) \,\right]$$
· For the 0-1 box, $E(\bar{X}) = p$ and $SE(\bar{X}) = \sqrt{\frac{p(1-p)}{n}}$.
· So an (approximate) 95% prediction interval for the sample average $\bar{X}$ from the 0-1 box is:
$$\left[\, p - 1.96\sqrt{\frac{p(1-p)}{n}},\; p + 1.96 \cdot \sqrt{\frac{p(1-p)}{n}} \,\right]$$
· For other values of 100 ⋅ 𝛾% like 90% or 99%, the “1.96” needs to be adjusted
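· As an illustration, here is a small helper function (hypothetical, not part of the slides) that computes this interval for a given 𝑝, 𝑛 and confidence level:
# Hypothetical helper: approximate prediction interval for the sample
# proportion from a 0-1 box with known p, using the normal approximation
pred.interval = function(p, n, conf.level = 0.95) {
  z = qnorm(1 - (1 - conf.level) / 2)   # 1.96 for 95%, 2.576 for 99%
  se = sqrt(p * (1 - p) / n)
  c(lower = p - z * se, upper = p + z * se)
}
pred.interval(p = 0.4, n = 49)          # roughly (0.26, 0.54); see Example 1 below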
Example 1
· Suppose we draw 𝑛 = 49 times randomly from a box with 𝑝 = 0.4 .
· What is the 95% prediction interval for 𝑋¯ ?
· Solution:
- the expected value is 𝐸(𝑋¯ ) = 𝜇 = 𝑝 = 0.4 .
- the standard error is
$$SE(\bar{X}) = \sigma/\sqrt{n} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{1}{49} \times \frac{2}{5} \times \frac{3}{5}} = \frac{\sqrt{6}}{35} \approx 0.07.$$
- the distribution of $\bar{X}$ has a (roughly) normal shape.
- Hence substituting into our prediction interval gives us $0.4 \pm 1.96 \times 0.07$, i.e. approximately $(0.26, 0.54)$.
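· A quick R check of this arithmetic (a sketch, not from the slides):
se = sqrt(0.4 * 0.6 / 49)        # sqrt(6)/35, about 0.07
0.4 + c(-1, 1) * 1.96 * se       # about (0.26, 0.54)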
Visualisation
· The histogram of 𝑋¯ is approximated by a normal curve centred at 0.4.
· 𝑋¯ lands in the interval (0.26, 0.54) with probability 95%.
What if 𝑝 = 0.2 instead of 0.4?
· It is interesting to see how this changes if the proportion in the box is 0.2 instead of
0.4.
· We then get
- $E(\bar{X}) = \mu = p = 0.2$
- $SE(\bar{X}) = \sqrt{\frac{0.2 \times 0.8}{49}} = \frac{0.4}{7} \approx 0.057$
· The resulting 95% prediction interval is $0.2 \pm 1.96 \times 0.057$, or $(0.09, 0.31)$.
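· The same quick check in R for 𝑝 = 0.2 (sketch):
0.2 + c(-1, 1) * 1.96 * sqrt(0.2 * 0.8 / 49)   # about (0.09, 0.31)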
Interval now a bit narrower
· Note that the interval is now a little narrower, i.e. 0.22 units wide (compared with
0.28 when 𝑝 = 0.4).
Simulation for 𝑝 = 0.4
# Simulate 1000 samples of size 49 from a 0-1 box with p = 0.4 and count how
# often the sample proportion falls outside the 95% prediction interval
too.big = logical(1000)
too.small = logical(1000)
for (i in 1:1000) {
  samp = sample(c(0, 1), prob = c(0.6, 0.4), replace = TRUE, size = 49)
  prop = mean(samp)
  too.big[i] = prop > (0.4 + 1.96 * 0.07)
  too.small[i] = prop < (0.4 - 1.96 * 0.07)
}
num.too.small = sum(too.small)
num.too.big = sum(too.big)
num.just.right = 1000 - num.too.small - num.too.big
cbind(num.too.small, num.just.right, num.too.big)
Simulation for 𝑝 = 0.2
# Same simulation, now for a box with p = 0.2
too.big = logical(1000)
too.small = logical(1000)
for (i in 1:1000) {
  samp = sample(c(0, 1), prob = c(0.8, 0.2), replace = TRUE, size = 49)
  prop = mean(samp)
  too.big[i] = prop > (0.2 + 1.96 * 0.057)
  too.small[i] = prop < (0.2 - 1.96 * 0.057)
}
num.too.small = sum(too.small)
num.too.big = sum(too.big)
num.just.right = 1000 - num.too.small - num.too.big
cbind(num.too.small, num.just.right, num.too.big)
Size of prediction intervals
· The variability in the sample proportion gets smaller as the 𝑝 in the box gets
further from 0.5.
· This is precisely reflected in $SE(\bar{X}) = \sqrt{\frac{p(1-p)}{n}}$.
p = 0:1000/1000
plot(p, p * (1 - p), type = "l")
Prediction interval for 𝑆
· A 95% prediction interval for the sample sum $S$ is: $\left[\, np - 1.96\sqrt{n\,p(1-p)},\; np + 1.96\sqrt{n\,p(1-p)} \,\right]$
· Again, the value “1.96” needs to be adjusted for different values of 100 ⋅ 𝛾%.
Confidence intervals
Interval of values consistent with each 𝑝
· The previous section showed us how the sample mean/proportion 𝑋¯ behaves for
a known box proportion 𝑝 .
· We saw that each value 𝑝 has associated with it an interval of values consistent
with that 𝑝 , characterized as a 95% “prediction interval” for the sample proportion.
- the interval is centred at 𝑝
- its width depends on 𝑛 and 𝑝
- the interval is wider the closer 𝑝 is to 0.5.
Turning things around
· What if the “population” proportion 𝑝 is unknown?
· Suppose we draw 𝑛 = 49 times from a box with unknown 𝑝 and observe the sample proportion 𝑥¯ = 14/49 = 2/7.
How about both 𝑝 = 0.2 and 𝑝 = 0.4 ?
· We replicate our graph from before, showing intervals of values consistent with
both 𝑝 = 0.2 and 𝑝 = 0.4 , when 𝑛 = 49 .
· The vertical green line shows our observed value 𝑥¯ = 2/7.
Furthest values
· Clearly, there exist "upper" and "lower" values of 𝑝 for which the observation is
just on the edge.
· These values form the endpoints of a 95% (two-sided) confidence interval for
the unknown 𝑝 (this is called a Wilson’s confidence interval).
· In other words, we want to find all the 𝑝's that make our observation 𝑥¯ land in the interval
$$\left[\, p - 1.96\sqrt{\frac{p(1-p)}{n}},\; p + 1.96 \cdot \sqrt{\frac{p(1-p)}{n}} \,\right]$$
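· One way to find these two values numerically (a sketch, not the method used by binom.confint() on the next slide) is a simple grid search over 𝑝:
# Grid search: keep the p's whose 95% prediction interval contains xbar = 14/49
xbar = 14 / 49
n = 49
ps = seq(0.0001, 0.9999, by = 0.0001)
inside = abs(xbar - ps) <= 1.96 * sqrt(ps * (1 - ps) / n)
range(ps[inside])   # approximately 0.178 and 0.424 (up to grid resolution)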
The R binom package
· The R package binom computes these endpoints using the binom.confint()
function.
library(binom)
binom.confint(x = 14, n = 49, method = "wilson")  # note: the argument 'x' is the sample sum (count of 1s)
Sanity check
· This shows us the "extreme" values of 𝑝 for which 𝑥¯ = 2/7 ≈ 0.285 still falls in the
95% prediction interval: 𝑝 = 0.178 and 𝑝 = 0.424.
· We can check this to be sure:
- For $p = 0.178$ the 95% prediction interval is:
$$0.178 \pm 1.96 \cdot \sqrt{\frac{0.178 \times 0.822}{49}} = (0.071, 0.285)$$
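· The same check in R (sketch):
0.178 + c(-1, 1) * 1.96 * sqrt(0.178 * 0.822 / 49)   # about (0.071, 0.285)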
Interpreting the confidence interval
· Suppose we construct a 95% confidence interval from a box with a proportion 𝑝 of
1 s.
· We know there is a 95% chance that 𝑋¯ will fall in the prediction interval. The
confidence interval will include that 𝑝 if and only if that happens!
· Equivalently, there is a 95% chance that 𝑝 will fall in the confidence interval.
· It is not 𝑝 that is random here; it is the confidence interval, since it depends on the
observed value of 𝑋¯.
· I use will here because this is different from 𝑝 falling into the confidence interval
computed using the observed 𝑥¯.
· The latter is a deterministic statement that is either true or false.
Demonstration
· Let us see how (Wilson’s) confidence interval works when repeatedly sampling
from a box with a known 𝑝 .
library(binom)  # for binom.confint()

# Simulate 1000 samples of size 50 from a 0-1 box with p = 0.3 and count how
# often the 95% Wilson interval misses the true p
p = 0.3
n = 50
over.est = logical(1000)
under.est = logical(1000)
for (i in 1:1000) {
  samp = sample(c(0, 1), prob = c(1 - p, p), replace = TRUE, size = n)
  s = sum(samp)
  w = binom.confint(s, n, method = "wilson")
  over.est[i] = w$lower > p
  under.est[i] = w$upper < p
}
num.over.est = sum(over.est)
num.under.est = sum(under.est)
num.covering = 1000 - num.over.est - num.under.est
cbind(num.under.est, num.covering, num.over.est)
We see that close to 95% of the time, the interval covers the “true” value of 𝑝 = 0.3 .
Properties of the (Wilson) confidence interval
· Under repeated sampling from a 0-1 box, the 95% Wilson confidence interval
covers the “true” proportion 𝑝 in (approx.) 95% of samples.
· This is a long-run property of the procedure.
· For a single data set, you don’t know if it has covered the true value or not.
- You just know that the procedure you have used is 95% reliable in the long
run.
· Note that the interval is not (in general) symmetric about the observed sample
proportion 𝑥¯ .
- The midpoint of the interval is somewhere between 𝑥¯ and 0.5.
Different confidence levels
· We can change the confidence level by replacing 1.96 with another value.
qnorm(0.995)
## [1] 2.575829
(which leaves 0.5% in the upper tail of the standard normal curve, as needed for a 99% interval).
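· For example (a sketch), the multipliers for two-sided 90%, 95% and 99% intervals:
qnorm(c(0.95, 0.975, 0.995))   # about 1.645, 1.960, 2.576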
Changing the confidence level using binom.confint()
· Using binom.confint() we simply set the conf.level= argument to the
desired level:
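· For example, a 99% interval for the earlier data (14 ones in 49 draws):
library(binom)
binom.confint(x = 14, n = 49, conf.level = 0.99, method = "wilson")   # endpoints roughly 0.153 and 0.469 (next slide)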
Sanity check
· This shows us the "extreme" values of 𝑝 for which 𝑥¯ = 2/7 ≈ 0.285 still falls in the
99% prediction interval: 𝑝 = 0.153 and 𝑝 = 0.469.
· We can check this to be sure:
- For $p = 0.153$ the 99% prediction interval is:
$$0.153 \pm 2.576 \cdot \sqrt{\frac{0.153 \times 0.847}{49}} = (0.021, 0.285)$$
Example
· The file march2023.csv has daily weather observations from the Canterbury
Racecourse weather station for March 2023.
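· The summary below was presumably produced along these lines (a sketch, assuming the file is in the working directory and is read into a data frame called mar.2023, the name used later):
mar.2023 = read.csv("march2023.csv")
summary(mar.2023)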
## X Date Minimum.temperature..degC.
## Mode:logical Length:31 Min. :11.60
## NA's:31 Class :character 1st Qu.:16.20
## Mode :character Median :17.20
## Mean :16.89
## 3rd Qu.:18.20
## Max. :21.60
## Maximum.temperature..degC. Rainfall..mm. Evaporation..mm. Sunshine..hours.
## Min. :22.70 Min. : 0.000 Mode:logical Mode:logical
## 1st Qu.:25.00 1st Qu.: 0.000 NA's:31 NA's:31
## Median :26.40 Median : 0.000
## Mean :27.77 Mean : 2.058
## 3rd Qu.:29.50 3rd Qu.: 1.700
## Max. :38.10 Max. :31.400
## Direction.of.maximum.wind.gust. Speed.of.maximum.wind.gust..km.h.
## Length:31 Min. :24.00
## Class :character 1st Qu.:31.00
## Mode :character Median :37.00
## Mean :38.23
## 3rd Qu.:46.00
## Max. :57.00
## Time.of.maximum.wind.gust X9am.Temperature..degC. X9am.relative.humidity....
## Length:31 Min. :17.20 Min. : 42.00
Rainfall
mar.2023$Rain
## [1] 0.0 0.4 0.0 3.2 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 5.0 2.0 31.4
## [16] 0.0 0.0 0.0 0.0 0.0 2.6 0.0 0.2 4.0 0.0 1.4 8.8 0.4 3.8 0.4
## [31] 0.0
x = sum(mar.2023$Rain > 0)   # number of days with rainfall > 0, which is 14
binom.confint(x = 14, n = 31, method = "wilson")
· The data is thus consistent with the “true” 𝑝 being anywhere in the range
(0.29, 0.62).
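· As a cross-check (a sketch, not from the slides): base R's prop.test() with correct = FALSE computes the same Wilson score interval:
prop.test(x = 14, n = 31, correct = FALSE)$conf.int   # roughly (0.29, 0.62)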