STAT401 Lecture 10

The document is a review for Midterm 1 covering Chapters 1-7.3, focusing on Descriptive Statistics, Basic Probability Theory, and Inferential Statistics. It includes various statistical concepts such as data organization, summary statistics, probability models, and confidence intervals, along with problems for practical application. Key topics include sample means, variance, normal distributions, and Bernoulli trials.

Uploaded by

msn101

Lecture 10

Review for Midterm 1


(Chapters 1-7.3)
Review of Lectures 1-9
• Descriptive Statistics

• Basic Probability Theory

• Inferential Statistics
Descriptive Statistics
• Organize Data
• Frequency table, Pie chart, Bar chart,
histogram, dotplot, stem-and-leaf diagram
• Summarize Data
• mean, median, mode, midrange
• variance, standard deviation, range
• five-number summary, boxplot
Review: Frequency table, Pie chart, Bar chart, histogram
Review: Dotplot and stem-and-leaf plot

Dotplots and stem-and-leaf plots retain every observation, so the original dataset can be recovered from them.


Review: Boxplot

Q1 = 23, Q2 = 30.5, Q3 = 36.5; Potential outlier: 66; Adjacent values: 5 and 43.
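The potential outlier on this slide can be checked with a short sketch. It assumes the usual 1.5 × IQR fence rule, which the slide does not state explicitly; only the quartiles and the values 5, 43 and 66 come from the slide.

```python
# Quartiles quoted on the boxplot slide.
q1, q3 = 23.0, 36.5

iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # values below this are potential outliers
upper_fence = q3 + 1.5 * iqr     # values above this are potential outliers

def is_potential_outlier(x):
    """Return True if x lies outside the 1.5 * IQR fences (assumed rule)."""
    return x < lower_fence or x > upper_fence

print("IQR:", iqr, "fences:", lower_fence, upper_fence)
for value in (5, 43, 66):
    print(value, "potential outlier:", is_potential_outlier(value))
```

Under this rule 66 falls above the upper fence, while the adjacent values 5 and 43 lie inside the fences.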
Review: sample mean and sample variance
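$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]$$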
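As a quick check that the definition of the sample variance and its shortcut form agree, here is a minimal sketch. The data list is hypothetical and chosen only for illustration; it is not the dataset from the lecture.

```python
def sample_mean(xs):
    """Sample mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance from the definition, with divisor n - 1."""
    n = len(xs)
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Sample variance from the shortcut form: sum of squares minus (sum)^2 / n."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

data = [2, 4, 4, 5, 7]  # hypothetical observations, for illustration only

print("mean:", sample_mean(data))
print("variance (definition):", sample_variance(data))
print("variance (shortcut):", sample_variance_shortcut(data))
```

The shortcut form needs only the running totals of the observations and their squares, which is convenient for hand calculation.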
Problem 1. In a survey, the observations of a sample are shown
in the stem-and-leaf diagram below

a. Find the summary statistics and fill in the following table

Sample size Minimum Maximum Range Median Mode Midrange

b. Find the mean and variance of the observations whose stem is “20” in the stem-and-leaf diagram above.
Basic Probability Theory
• Axioms of probability
• Sample space, events, probability
• Addition and complementation rules
• Equal-likelihood models, Counting rules
• Random Variables
• Discrete: Bernoulli, binomial, Poisson
• Continuous: Normal (Gaussian) $\mathcal{N}(\mu, \sigma^2)$, Student’s $t_\nu$ and $\chi^2_k$ distributions
Review: Basic Probability
Experiment: Action with unpredictable outcomes
Sample space (𝑺): Set of all outcomes of an experiment
Event (𝑨, 𝑩, 𝑪 …): Subset of the sample space 𝑆

Equal-likelihood model (ELM): Suppose an experiment has finitely many possible outcomes, all of which are equally likely to happen. Let $S$ be the sample space of such an experiment. Then for any event $A$,

$$P(A) = \frac{|A|}{|S|}.$$

• Random selection from a collection of objects
• Fair coins, fair dice, balanced dimes, …
Review: Basic Probability
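• Let $B \sim \mathrm{Bern}(p)$ be a Bernoulli random variable with success probability $p$, and let $q = 1 - p$. Then
$$E(B) = p \quad \text{and} \quad \mathrm{Var}(B) = pq$$

• Let $X \sim \mathrm{binomial}(n, p)$. Then
$$X = \sum B_i = B_1 + B_2 + \cdots + B_n,$$
where the $B_i \sim \mathrm{Bern}(p)$ are independent. The PMF of $X$ is
$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x},$$
where $x = 0, 1, 2, \dots, n$ is the number of successes.
$$E(X) = np \quad \text{and} \quad \mathrm{Var}(X) = npq$$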
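The binomial PMF and its moments can be checked numerically. The sketch below uses illustrative values n = 10 and p = 0.3, which are assumptions and not taken from the lecture.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ binomial(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # illustrative values only
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                              # should be 1
mean = sum(x * px for x, px in enumerate(pmf))                # should equal n * p
var = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))   # should equal n * p * q

print("sum of PMF:", total)
print("mean:", mean, "expected:", n * p)
print("variance:", var, "expected:", n * p * (1 - p))
```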
Review: Basic Probability
• $X \sim \mathcal{N}(\mu, \sigma^2)$ denotes a normal random variable $X$ such that $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$
Review: Basic Probability
• $t \sim t_\nu$ denotes a random variable with a $t$-distribution whose degrees of freedom are df = $\nu$. Note $E(t) = 0$.
Review: Basic Probability
• $\chi^2 \sim \chi^2_k$ denotes a random variable with a $\chi^2_k$-distribution whose degrees of freedom are df = $k$. Note $E(\chi^2) = k$.
Problem 2. When an 8-sided die is rolled, the possible outcomes
are 1, 2, 3, 4, 5, 6, 7 and 8. E.g., 4 outcomes are shown below.

a. Find the sample space 𝑆;

b. Let 𝐴 be the event the die comes up odd; let 𝐵 be the event
the die comes up 4 or more. Find the events 𝐴 and 𝐵.

c. Suppose the die is fair. Find the probability 𝑃(𝐴 ∪ 𝐵).

d. Suppose the die is rolled independently 6 times. Find the probability that the die comes up odd exactly 3 times.
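One way to work through Problem 2 is sketched below, using the equal-likelihood model for parts a to c and the binomial PMF for part d. The helper `prob` is a hypothetical name introduced for this sketch; exact fractions are used so no rounding is involved.

```python
from fractions import Fraction
from math import comb

# a. Sample space of an 8-sided die.
S = set(range(1, 9))

# b. A: the die comes up odd; B: the die comes up 4 or more.
A = {s for s in S if s % 2 == 1}
B = {s for s in S if s >= 4}

def prob(event):
    """Equal-likelihood model: P(event) = |event| / |S| (fair die assumed)."""
    return Fraction(len(event), len(S))

# c. P(A or B), computed directly and via the addition rule.
p_union = prob(A | B)
p_union_addition_rule = prob(A) + prob(B) - prob(A & B)

# d. Six independent rolls, "odd" counts as a success, so X ~ binomial(6, P(A)).
p_odd = prob(A)
p_exactly_3 = comb(6, 3) * p_odd ** 3 * (1 - p_odd) ** 3

print("S =", sorted(S))
print("A =", sorted(A), " B =", sorted(B))
print("P(A or B) =", p_union)
print("P(exactly 3 odd in 6 rolls) =", p_exactly_3)
```

Under these assumptions the sketch gives P(A ∪ B) = 7/8 and a probability of 5/16 for part d.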
Inferential Statistics
• Normal model $x \sim \mathcal{N}(\mu, \sigma^2)$
• Mean: $z$- and $t$-intervals for $\mu$; sample size estimation given $\sigma$, $\alpha$ and $E$
• Variance: $\chi^2$-interval for $\sigma$ and $\sigma^2$; sample size estimation given $\alpha$ and $d$
• Bernoulli model $b \sim \mathrm{Bern}(p)$
• Proportion: $z$-interval for $p$; sample size estimation given $\alpha$ and $E$, with or without an educated guess $\hat{p}_g$
Review: Normal model
$x \sim \mathcal{N}(\mu, \sigma^2)$
[Figure: a simple random sample of size $n$ drawn from a population; values shown in the illustration include 143, 162, 168, 188 and 192]
Statistics: $\bar{x}$, $s^2$
Parameters: $\mu$, $\sigma^2$

The normal model is widely applicable to continuous quantitative variables on a population, e.g., height, weight, length, temperature. In the illustration above, $x$ is height.
Review: Point and interval estimations
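• Point estimate:
• $\bar{x}$ is an unbiased estimator of $\mu$
• $s^2$ is an unbiased estimator of $\sigma^2$
• Interval estimate: $(1 - \alpha)100\%$ confidence intervals
• $z$-interval ($\sigma$ known)
$$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
• $t$-interval
$$\bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
• $\chi^2$-interval
$$\frac{(n-1)\,s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)\,s^2}{\chi^2_{1-\alpha/2}}$$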
Review: Margin of error and sample size
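• $z$-interval:
• margin of error $E = z_{\alpha/2}\,\sigma/\sqrt{n}$
• sample size required given $E$ and $\alpha$
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$

• $\chi^2$-interval:
• relative margin of error $d$
• sample size required given $d$ and $\alpha$
$$n = \frac{1}{2}\left(\frac{z_{\alpha/2}}{d}\right)^2$$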
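As a short worked step, the sample size for the $z$-interval follows from solving the margin-of-error expression for $n$. The final rounding-up step is an assumption carried over from the rule stated for proportions later in the review.

$$E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \;\Longrightarrow\; \sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{E} \;\Longrightarrow\; n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$

If the result is not an integer, rounding up to the next integer keeps the margin of error at or below $E$.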
Confidence interval for population mean (𝝈 known)

(n > 30)
Problem 4. A sample of 32 bags of the same brand of candies was selected at random. The sample mean weight was 2.4 ounces. Assume the population distribution of bag weights is normal with population variance 0.04 squared ounces.

a. Construct a 98% confidence interval for the population mean weight of the bags.
b. What is the sample size required so that a 95% confidence interval for the population mean weight has a length 0.01?
Confidence interval for 𝝁 (𝝈 known)
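A possible way to work Problem 4 numerically is sketched below. It assumes the $z$-interval and sample-size formulas from the review slides, takes the margin of error in part b as half the requested interval length, and gets critical values from Python's standard library. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_critical(alpha):
    """Upper alpha/2 critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_interval(xbar, sigma, n, alpha):
    """(1 - alpha)100% z-interval for the mean when sigma is known."""
    margin = z_critical(alpha) * sigma / sqrt(n)
    return xbar - margin, xbar + margin

def sample_size_for_mean(sigma, margin, alpha):
    """Smallest n whose z-interval margin of error is at most `margin`."""
    return ceil((z_critical(alpha) * sigma / margin) ** 2)

sigma = sqrt(0.04)  # population standard deviation from the given variance

# a. 98% confidence interval, so alpha = 0.02.
low, high = z_interval(xbar=2.4, sigma=sigma, n=32, alpha=0.02)

# b. 95% interval of length 0.01, so E = 0.01 / 2 and alpha = 0.05.
n_required = sample_size_for_mean(sigma=sigma, margin=0.01 / 2, alpha=0.05)

print(f"98% CI for the mean: ({low:.4f}, {high:.4f})")
print("sample size for part b:", n_required)
```

With these inputs the interval is roughly (2.318, 2.482) and the required sample size is 6147. Hand calculations with table values rounded to two or three decimals may differ slightly in the last digit.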
Confidence interval for 𝝁 (𝝈 unknown)

(n > 30)
Problem 6. A pharmaceutical company makes tranquilizers. It is
assumed that the distribution for the length of time they last is
approximately normal. Researchers in a hospital used the drug on
a random sample of 9 patients. The sample mean effective time
is 4.61 and the sample standard deviation is 0.78.

a. Construct a 95% confidence interval for the population mean effective time of the tranquilizers.
b. Suppose the population variance of the effective time is
0.36. What is the sample size required for the confidence
interval, so that we can be 98% confident that the sample
mean effective time is within 0.03 of the population mean
effective time?
Confidence interval for 𝝁 (𝝈 unknown)
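Problem 6 can be sketched in the same style. Part a uses the $t$-interval with 8 degrees of freedom. The critical value 2.306 is copied from a standard $t$-table rather than computed, because the Python standard library has no $t$ quantile function. Part b switches to the $z$-based sample-size formula since the population variance is given.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Table value (assumption: looked up, not computed): t_{0.025} with 8 degrees of freedom.
T_CRIT_025_DF8 = 2.306

def t_interval(xbar, s, n, t_crit):
    """t-interval for the mean, given a critical value with n - 1 degrees of freedom."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# a. 95% confidence interval from n = 9, sample mean 4.61, sample sd 0.78.
low, high = t_interval(xbar=4.61, s=0.78, n=9, t_crit=T_CRIT_025_DF8)

# b. Population variance 0.36 given, 98% confidence, margin of error 0.03.
z = NormalDist().inv_cdf(1 - 0.02 / 2)
n_required = ceil((z * sqrt(0.36) / 0.03) ** 2)

print(f"95% CI for the mean: ({low:.4f}, {high:.4f})")
print("sample size for part b:", n_required)
```

This gives roughly (4.010, 5.210) for part a and 2165 for part b. Using a rounded table value such as $z_{0.01} \approx 2.33$ in part b would give a somewhat larger answer.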
Confidence interval for 𝝈𝟐 and 𝝈
Problem 7. A post office experiments with a single waiting line
and finds that for a random sample of 31 customers, the waiting
times for customers have a standard deviation of 4.7 minutes.
Suppose the waiting times for customers follow a normal
distribution.

a. Construct a 99% confidence interval for the population standard deviation of the waiting times for customers in this post office.
b. What is the sample size required for the confidence
interval, so that we can be 98% confident that the sample
standard deviation of the waiting times is within 5% of the
population standard deviation of the waiting times?
Confidence interval for 𝝈𝟐 and 𝝈
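For Problem 7, a minimal sketch of the $\chi^2$-interval follows. The two critical values for 30 degrees of freedom are copied from a standard $\chi^2$ table (an assumption of this sketch; they are not computed), and part b uses the relative-margin formula $n = \frac{1}{2}(z_{\alpha/2}/d)^2$ from the review slides.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Table values (assumption: looked up, not computed) for 30 degrees of freedom.
CHI2_UPPER_005 = 53.672   # chi-square value with right-tail area 0.005
CHI2_LOWER_995 = 13.787   # chi-square value with right-tail area 0.995

n, s = 31, 4.7

# a. 99% confidence interval for the variance, then for the standard deviation.
var_low = (n - 1) * s ** 2 / CHI2_UPPER_005
var_high = (n - 1) * s ** 2 / CHI2_LOWER_995
sd_low, sd_high = sqrt(var_low), sqrt(var_high)

# b. 98% confidence, relative margin of error d = 0.05.
z = NormalDist().inv_cdf(1 - 0.02 / 2)
n_required = ceil(0.5 * (z / 0.05) ** 2)

print(f"99% CI for the variance: ({var_low:.3f}, {var_high:.3f})")
print(f"99% CI for the standard deviation: ({sd_low:.3f}, {sd_high:.3f})")
print("sample size for part b:", n_required)
```

The interval for $\sigma$ comes out at roughly (3.51, 6.93) minutes, and part b gives 1083 with an unrounded $z$ value.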
Review: Bernoulli model
$b \sim \mathrm{Bern}(p)$, $n$ Bernoulli trials
[Figure: a simple random sample of size $n$, $\{b_1, b_2, \dots, b_n\}$, drawn from a population of 0/1 outcomes]
Statistic: $\hat{p}$
Parameter: $p$

$E(b) = \mu_b = p$
$\mathrm{Var}(b) = \sigma_b^2 = pq$
Number of successes: $x = b_1 + b_2 + \cdots + b_n = \sum b_i$
$\hat{p} = x/n = \sum b_i / n = \bar{b}$
Review: Point and interval estimations
• Point estimate:
• $\hat{p}$ is an unbiased estimator of $p$

• Interval estimate: $(1 - \alpha)100\%$ confidence interval
• $z$-interval for $p$
$$\hat{p} - z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} < p < \hat{p} + z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$$
where $\hat{q} = 1 - \hat{p}$.
Review: Margin of error and sample size
Review: Sample size for confidence interval of 𝒑
• The sample size required for a $(1 - \alpha)100\%$ CI of $p$ with margin of error $E$ is
$$n = \frac{1}{4}\left(\frac{z_{\alpha/2}}{E}\right)^2$$
rounded up to the next larger integer if necessary.

• If an educated guess $\hat{p}_g$ for $\hat{p}$ is available, then the reduced sample size is
$$n = \hat{p}_g\,\hat{q}_g\left(\frac{z_{\alpha/2}}{E}\right)^2, \quad \hat{q}_g = 1 - \hat{p}_g,$$
rounded up to the next larger integer if necessary.

• If there is a range for the observed value of $\hat{p}$, then apply the above formula with the educated guess $\hat{p}_g$ being the value closest to $0.5$ in the range.
Confidence interval for 𝒑
Problem 8. A new insect spray A is to be tested. 300 insects are
released into a room, and after 1 hour the numbers of dead
insects are counted. It is found that 180 insects are dead.

a. Construct a 98% confidence interval for the population killing rate of spray A.
b. What is the sample size required so that a 99% confidence
interval for the population killing rate of spray A has a
length 0.06?
c. Given that an educated guess of the sample killing rate of
spray A will be between 0.6 and 0.9, what is the sample
size required for the confidence interval, so that we can be
98% confident that the sample killing rate of spray A is
within 0.03 of the population killing rate of spray A?
Confidence interval for 𝒑
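Problem 8 can be sketched as follows. The sketch assumes part b is meant to be solved without a prior guess (the $\frac{1}{4}$ formula), takes the margin of error in part b as half the interval length, and for part c picks the value in the guessed range closest to 0.5, as the review slide prescribes. The function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_critical(alpha):
    """Upper alpha/2 critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def proportion_interval(successes, n, alpha):
    """(1 - alpha)100% z-interval for a population proportion."""
    p_hat = successes / n
    margin = z_critical(alpha) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def sample_size_for_proportion(margin, alpha, p_guess=0.5):
    """n = p_g * q_g * (z / E)^2; p_guess = 0.5 reproduces the 1/4 formula."""
    return ceil(p_guess * (1 - p_guess) * (z_critical(alpha) / margin) ** 2)

def closest_to_half(low, high):
    """Value in the interval [low, high] that is closest to 0.5."""
    return min(max(0.5, low), high)

# a. 180 dead insects out of 300, 98% confidence.
low, high = proportion_interval(successes=180, n=300, alpha=0.02)

# b. 99% interval of length 0.06, so E = 0.03; no prior guess assumed.
n_b = sample_size_for_proportion(margin=0.06 / 2, alpha=0.01)

# c. Guess between 0.6 and 0.9, 98% confidence, E = 0.03.
p_guess = closest_to_half(0.6, 0.9)
n_c = sample_size_for_proportion(margin=0.03, alpha=0.02, p_guess=p_guess)

print(f"98% CI for p: ({low:.4f}, {high:.4f})")
print("sample size for part b:", n_b)
print("guess used in part c:", p_guess, "sample size:", n_c)
```

With unrounded critical values this gives roughly (0.534, 0.666) for part a, 1844 for part b and 1444 for part c. Answers based on rounded table values of $z$ can differ by a few units.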