Chapter 7 - Sampling Distributions CLT
Statistical techniques
• A sample is drawn from the population.
• The sample is used to make decisions and draw conclusions about the population.
Statistical inference
• Parameter estimation
• Hypothesis testing
Point estimate
A statistic (estimator), computed from a sample of a population having parameters, gives a single numerical value.
A statistic is a function of the observations in a random sample X1, X2, …, Xn.
The distribution of a statistic is called its sampling distribution.
For example, the probability distribution of X̄ is the sampling distribution of the mean.
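The idea of a sampling distribution can be illustrated with a small simulation. The sketch below is only an illustration: the population N(100, 10²), the sample size n = 25, the number of repetitions, and the function name simulate_sample_means are all assumptions chosen for the demonstration, not part of the slides.

```python
import random
import statistics


def simulate_sample_means(n, reps, mu=100.0, sigma=10.0, seed=0):
    """Draw `reps` random samples of size `n` from N(mu, sigma^2); return their means.

    mu, sigma and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


means = simulate_sample_means(n=25, reps=20_000)

# The simulated sample means should centre near mu, with spread near sigma / sqrt(n).
print("mean of sample means:", round(statistics.fmean(means), 2))
print("std of sample means: ", round(statistics.stdev(means), 2))
```

With these assumed values the spread of the simulated means should come out close to σ/√n = 10/√25 = 2.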
(Point) Estimation problems
Estimate:
• Mean μ
• Proportion p
• Variance σ²
----------------
Note that the sampling distribution of X̄ is normal with mean 100 ohms and standard deviation σ/√n = 10/√25 = 2.
P(X̄ < 95) = P((X̄ − 100)/2 < (95 − 100)/2) = P(Z < −2.5) = 0.0062
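The calculation above can be checked numerically. This is a minimal sketch using Python's standard-library NormalDist; the variable names are chosen here for illustration, and the values (mean 100, σ = 10, n = 25) are taken from the example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100.0, 10.0, 25      # values from the example

se = sigma / sqrt(n)                # standard deviation of the sample mean
xbar_dist = NormalDist(mu=mu, sigma=se)

p = xbar_dist.cdf(95)               # P(Xbar < 95)
print(f"standard error = {se:.1f}, P(Xbar < 95) = {p:.4f}")
```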
CLT – Ex2
Customers at a popular restaurant are waiting to be served.
Waiting times are independent and exponentially distributed
with mean 1/λ = 30 minutes. If 16 customers are waiting, what
is the probability that their average wait is less than 25
minutes?
---
Let X̄ be the average waiting time of the 16 customers.
Since the waiting time is exponentially distributed, the mean and
standard deviation of an individual waiting time are μ = σ = 30.
By the central limit theorem, Z = (X̄ − 30)/(30/√16) is approximately N(0, 1), so
P(X̄ < 25) ≈ P(Z < −0.667) = 0.252
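The sketch below reproduces the CLT approximation for this example and, as a side check that is not part of the slides, compares it with the exact probability. The exact value relies on the standard fact that a sum of n independent exponential waiting times follows an Erlang distribution; the variable names are illustrative.

```python
from math import exp, sqrt
from statistics import NormalDist

mean_wait = 30.0    # mu = sigma = 30 minutes (exponential distribution)
n = 16
threshold = 25.0

# CLT approximation: Xbar is approximately N(mu, sigma^2 / n).
z = (threshold - mean_wait) / (mean_wait / sqrt(n))
p_clt = NormalDist().cdf(z)

# Exact check (assumption: uses the Erlang distribution of the sum):
# P(Xbar < t) = 1 - sum_{k=0}^{n-1} e^{-x} x^k / k!,  with x = n * t / mean_wait.
x = n * threshold / mean_wait
term, tail = exp(-x), 0.0
for k in range(n):
    tail += term            # add the k-th term e^{-x} x^k / k!
    term *= x / (k + 1)     # advance to the (k+1)-th term
p_exact = 1.0 - tail

print(f"z = {z:.3f}, CLT approximation = {p_clt:.4f}, exact = {p_exact:.4f}")
```

With only 16 observations from a skewed distribution, the two values are close but not identical, which is a reminder that the CLT result is an approximation.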
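Approximate Sampling Distribution of a Difference in Sample Means
If two independent populations have means μ1, μ2 and variances σ1², σ2², and samples of sizes n1 and n2 are drawn, then
Z = [X̄1 − X̄2 − (μ1 − μ2)] / √(σ1²/n1 + σ2²/n2)
is approximately N(0, 1) when the conditions of the central limit theorem apply.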
Ex. The effective life of a component used in a jet-turbine aircraft engine is a random
variable with mean 5000 hours and standard deviation 40 hours. The distribution of
effective life is fairly close to a normal distribution. The engine manufacturer introduces
an improvement into the manufacturing process for this component that increases the
mean life to 5050 hours and decreases the standard deviation to 30 hours. Suppose that
a random sample of n1 = 16 components is selected from the “old” process and a
random sample of n2 = 25 components is selected from the “improved” process.
What is the probability that the difference in the two sample means is at least 25 hours?
Assume that the old and improved processes can be regarded as independent
populations.
Example (cont.)
• We have
• X̄1 ~ N(μ1, σ1²/n1) = N(5000, 100)
• X̄2 ~ N(μ2, σ2²/n2) = N(5050, 36)
➔ X̄2 − X̄1 ~ N(μ2 − μ1, σ1²/n1 + σ2²/n2) = N(50, 136)
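The probability asked for in the example, P(X̄2 − X̄1 ≥ 25), follows from this normal distribution. A minimal sketch of the computation, using the values stated in the example and variable names chosen here for illustration:

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 5000.0, 40.0, 16   # "old" process
mu2, sigma2, n2 = 5050.0, 30.0, 25   # "improved" process

diff_mean = mu2 - mu1                            # mean of Xbar2 - Xbar1
diff_var = sigma1**2 / n1 + sigma2**2 / n2       # variances add for independent samples

diff_dist = NormalDist(mu=diff_mean, sigma=sqrt(diff_var))
p_at_least_25 = 1.0 - diff_dist.cdf(25.0)        # P(Xbar2 - Xbar1 >= 25)

print(f"mean = {diff_mean:.0f}, variance = {diff_var:.0f}, "
      f"P(difference >= 25) = {p_at_least_25:.4f}")
```

The standardized value is (25 − 50)/√136 ≈ −2.14, so the probability is about 0.984.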
Like hurricanes and earthquakes, geomagnetic storms are natural
hazards with possible severe impact on the Earth. Severe storms can
cause communication and utility breakdowns, leading to possible
blackouts. The National Oceanic and Atmospheric Administration beams
electron and proton flux data in various energy ranges to various stations
on the Earth to help forecast possible disturbances. The following are 25
readings of proton flux in the 47-68 keV range (units are in p/(cm²-sec-ster-MeV)) on the evening of December 28, 2011:
2310 2320 2010 10800 2190 3360 5640 2540 3360 11800 2010 3430 10600
7370 2160 3200 2020 2850 3500 10200 8550 9500 2260 7730 2250
a/ Find a point estimate of the mean proton flux in this time period.
b/ Find a point estimate of the standard deviation of the proton flux in this
time period.
c/ Find an estimate of the standard error of the estimate in part a/.
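One way to compute the three estimates is sketched below. It assumes the usual choices: the sample mean for part a/, the sample standard deviation with divisor n − 1 for part b/, and s/√n for part c/. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# The 25 proton flux readings from the exercise.
flux = [
    2310, 2320, 2010, 10800, 2190, 3360, 5640, 2540, 3360, 11800,
    2010, 3430, 10600, 7370, 2160, 3200, 2020, 2850, 3500, 10200,
    8550, 9500, 2260, 7730, 2250,
]

n = len(flux)
mean_hat = mean(flux)        # a/ point estimate of the mean
sd_hat = stdev(flux)         # b/ sample standard deviation (divisor n - 1, an assumption)
se_hat = sd_hat / sqrt(n)    # c/ estimated standard error of the sample mean

print(f"n = {n}, mean = {mean_hat:.1f}, std = {sd_hat:.1f}, standard error = {se_hat:.1f}")
```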
Exercises
• (3x + 13) mod 21
• (5x + 15) mod 21
• (7x + 17) mod 21