Math-138 Unit 3 Packet Fall 2024 (Canvas)
MATH-138 STATISTICS
Unit 3 Classroom Activities and Examples
Revised AY24-25
Howard Community College
Allison Bell
Lamont Vaughan
Unit 3 Key Concepts
Chapter 8
If the population mean 𝜇 is unknown, a sample mean can be used as a starting point to create an
interval for the population mean. That starting point is referred to as a ____________________.
In order for the estimate to be useful, we must describe how close it is likely to be. By adding a
number to the point estimate and subtracting the same number from it, an interval can be created.
Point Estimate − Margin of Error < 𝜇 < Point Estimate + Margin of Error
____________________< 𝜇 <____________________
This can also be written as 𝑝𝑜𝑖𝑛𝑡 𝑒𝑠𝑡𝑖𝑚𝑎𝑡𝑒 ± (𝑐𝑟𝑖𝑡𝑖𝑐𝑎𝑙 𝑣𝑎𝑙𝑢𝑒 × 𝑠𝑡𝑎𝑛𝑑𝑎𝑟𝑑 𝑒𝑟𝑟𝑜𝑟).
There are actually many different Student’s t distributions and they are distinguished by a
quantity related to the sample size called the degrees of freedom.
When using the Student’s t distribution to construct a confidence interval for a population mean,
the number of degrees of freedom is 1 less than the sample size.
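The interval formula above can be sketched in Python. This is only an illustration and not part of the calculator procedures: the function name t_interval and the sample values are hypothetical, and the critical value is assumed to be read from a t table (about 2.064 for 95% confidence with 24 degrees of freedom) rather than computed.

```python
import math

def t_interval(sample_mean, sample_sd, n, t_critical):
    """Return (lower, upper) for point estimate ± critical value × standard error.

    t_critical must be looked up with n - 1 degrees of freedom;
    this sketch does not compute it.
    """
    standard_error = sample_sd / math.sqrt(n)
    margin_of_error = t_critical * standard_error
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Hypothetical sample: n = 25, sample mean 50, sample standard deviation 10.
# Degrees of freedom = 25 - 1 = 24; the 95% critical value is about 2.064.
lower, upper = t_interval(50.0, 10.0, 25, 2.064)
print(f"{lower:.3f} < mu < {upper:.3f}")
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries; a statistics package or the calculator would supply it directly.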
Student’s t distributions are symmetric and unimodal, just
like the normal distribution. However, they are more spread
out because s is, on average, a bit smaller than 𝜎. Also, since
s is random, replacing 𝜎 with s increases the spread.
Assumptions/Conditions
The assumptions for constructing a confidence interval for a population mean when σ is not known
are:
1. Simple Random Sample
2. 𝑛 > 30 or approximately normal population
A simple method to check for normality is to draw a dotplot or boxplot of the sample to look for
skewness or outliers.
Note on Interpreting a Confidence Interval: The confidence interval does not guarantee that the
parameter is within the interval. Therefore, we say that we have “blank” % confidence that the
population parameter is in this interval.
CALCULATOR PROCEDURES
Example 1
A food chemist analyzed the calorie content for a popular type of chocolate cookie. Following are the
numbers of calories in a sample of eight cookies.
Find a 98% confidence interval for the mean number of calories in this type of cookie.
Example 2
Hoping to lure more shoppers downtown, a city builds a new public parking garage. The city plans to
pay for the structure through parking fees. During a two-year period, a random sample of 44 days of
daily fees were collected. The mean of this sample was $126, and the standard deviation was $15. A
consultant predicted the daily average income would be $130. Construct a 90% confidence interval
to determine if the consultant is correct.
a. Construct the 90% confidence interval for the mean fees collected daily.
b. Was the consultant correct that the daily average would be $130? Explain.
Example 3
A company has developed a new type of lightbulb and wants to estimate its mean lifetime. A simple
random sample of 100 bulbs had a sample mean lifetime of 750.2 hours with a sample standard deviation of
30 hours.
b. Construct a 95% confidence interval for the population mean lifetime of all bulbs
manufactured by this new process.
Section 8.3 – Confidence Intervals for a Proportion
To construct a confidence interval, we again need a point estimate. The point estimate for the
population proportion 𝑝 is ____________
The Central Limit Theorem for Proportions provides the standard error of 𝑝̂ which is
___________________
To compute the margin of error, we multiply the standard error by the critical value, and the
confidence interval for a proportion is
𝑝̂ ± 𝑧𝛼/2 ∙ √( 𝑝̂(1 − 𝑝̂) / 𝑛 )
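As a rough sketch of the proportion interval (point estimate ± critical value × standard error), the Python below uses the standard library's NormalDist to find the critical value. The function name proportion_interval and the counts (120 successes out of 400) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

def proportion_interval(x, n, confidence):
    """Return (lower, upper) for a population proportion.

    x is the number of individuals in the category of interest,
    n is the sample size, confidence is a decimal such as 0.95.
    """
    p_hat = x / n                                    # point estimate
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_critical * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error

# Hypothetical sample: 120 of 400 respondents are in the category of interest.
lower, upper = proportion_interval(120, 400, 0.95)
print(f"{lower:.4f} < p < {upper:.4f}")
```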
Assumptions/ Conditions
CALCULATOR PROCEDURES
Example 1
A poll found that 38% of a random sample of 1012 adults said that they believe in ghosts.
a. Compute a 90% confidence interval of our estimate of adults who believe in ghosts.
b. Find the margin of error for this poll if we want 90% confidence in our estimate of
adults who believe in ghosts (hint: use only the interval from part a!).
d. Find the margin of error if we want to be 99% confident (hint: find the 99%
confidence interval first!).
Example 2
In a survey of 800 parents, 632 said that music education has a positive effect on academic
performance. Construct a 95% confidence interval for the proportion of parents who believe that
music education has a positive effect.
Example 3
A simple random sample of 200 third graders in a large school district was chosen to participate
in an after-school program to improve reading skills. After completing the program, the children
were tested, and 142 of them showed improvement in their reading skills.
a. Construct a 95% confidence interval for the proportion of third graders in the
school district whose reading scores would improve after completing the program.
b. Is it reasonable to conclude that more than 60% of the students would improve
their reading scores after completing the program?
Example 4
A national health organization warns that 30% of the middle school students nationwide have
been drunk. Concerned, a local health agency randomly and anonymously surveys 110 of the
1212 middle school students in its city. Only 21 of them reported having been drunk.
b. Create a 95% confidence interval for the proportion of the city’s middle school students
who have been drunk.
c. Is there any reason to believe that the national level of 30% is not true of the middle
school students in this city?
FINDING SAMPLE SIZE
A smaller margin of error makes the confidence interval more useful. If we wish to make the
margin of error of a confidence interval smaller while keeping the confidence level the same, we
can do this by making the sample size larger.
Letting 𝑚 represent the margin of error, 𝑚 = 𝑧𝛼/2 ∙ √( 𝑝̂(1 − 𝑝̂) / 𝑛 ), we can rewrite this formula and solve for 𝑛.
Sample Size 𝑛 = 𝑝̂(1 − 𝑝̂) ∙ (𝑧𝛼/2 / 𝑚)²
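The sample size formula can be sketched in Python as follows. The function name and the inputs (a prior estimate of 0.40 and a target margin of error of 0.04) are hypothetical. Rounding up is an assumption of this sketch: it ensures the margin of error is no larger than the target.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p_hat, margin_of_error, confidence):
    """Smallest n with n >= p_hat(1 - p_hat)(z / m)^2."""
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = p_hat * (1 - p_hat) * (z_critical / margin_of_error) ** 2
    return math.ceil(n_exact)      # round up to a whole number of individuals

# Hypothetical planning values: p_hat = 0.40, m = 0.04, 95% confidence.
n_needed = sample_size_for_proportion(0.40, 0.04, 0.95)
print(n_needed)
```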
Example 1
In a sample of 800 parents, 632 believed that music education had a positive effect on academic
performance. Estimate the sample size needed so that a 95% confidence interval will have a
margin of error of 0.025.
Example 2
In preparing a report on the economy, we need to estimate the percentage of businesses
that plan to hire additional employees in the next 60 days.
b. Suppose we want to reduce the margin of error to 3%. What sample size will
suffice?
Chapter 9
There are two hypotheses: one is called the null hypothesis and the other is called the alternate
hypothesis.
Example 1
State appropriate null and alternate hypotheses:
a. Boxes of a certain kind of cereal are labeled as containing 20 ounces. An inspector thinks
that the mean weight may be less than this.
b. Last year, the mean monthly rent for an apartment in a certain city was $800. A real
estate agent believes that the mean rent is higher this year.
c. Scores on a standardized test have a mean of 70. Modifications are made to the test, and
an educator believes that the mean may have changed.
Decision and Conclusion
When performing a hypothesis test, there are two decision options:
• Reject 𝐻₀ – Conclusion: There is enough evidence to support 𝐻₁
• Do Not Reject 𝐻₀ – Conclusion: There is not enough evidence to support 𝐻₁
Notice that the decision for the hypothesis test references 𝐻₀ while the conclusion will
reference 𝐻₁ within context of the situation.
Types of Errors
Decision Rule
We reject the null hypothesis when the data provide strong evidence against it. The P-value is
the probability, assuming that 𝐻₀ is true, of observing a value for the test statistic that is at least
as extreme as the value actually observed.
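For a test statistic that follows the standard normal distribution (as in the tests for proportions), the P-value definition above can be sketched in Python. The function name, the labels 'less', 'greater' and 'two-sided', and the test statistic 2.10 are all hypothetical. The sketch assumes the usual rule of rejecting 𝐻₀ when the P-value is less than 𝛼. Tests for means use the Student's t distribution instead, which this sketch does not cover.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative):
    """P-value for a standard normal test statistic.

    alternative: 'less' (left-tailed), 'greater' (right-tailed),
    or 'two-sided'. These labels are chosen for this sketch.
    """
    standard_normal = NormalDist()
    if alternative == "less":
        return standard_normal.cdf(z)
    if alternative == "greater":
        return 1 - standard_normal.cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - standard_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

alpha = 0.05
p_value = p_value_from_z(2.10, "two-sided")    # hypothetical test statistic
decision = "Reject H0" if p_value < alpha else "Do not reject H0"
print(round(p_value, 4), decision)
```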
Example 1
Decide if a Type I error, Type II error, or a correct decision occurs in each case:
a. A test is made of 𝐻₀ : 𝜇 = 100 versus 𝐻₁ : 𝜇 ≠ 100. The true value of 𝜇 is 150 and
𝐻₀ is rejected.
CALCULATOR PROCEDURES
Example 2 - Means
Generic drugs are lower-cost substitutes for brand-name drugs. Before a generic drug can be sold
in the United States, it must be tested and found to perform equivalently to the brand name
product.
The U.S. Food and Drug Administration is now supervising the testing of a new generic
ointment. The brand-name ointment is known to deliver a mean of 3.5 micrograms of active
ingredient to each square centimeter of skin. As part of the testing, seven subjects apply the
ointment. Six hours later, the amount of drug that has been absorbed into the skin is measured.
How strong is the evidence that the mean amount absorbed differs from 3.5 micrograms? Use the
𝛼 = 0.05 level of significance.
Example 3 - Means
Hoping to lure more shoppers downtown, a city builds a new public parking garage. The
city plans to pay for the structure through parking fees. During a two-year period, a random
sample of 44 days of daily fees were collected. The mean of this sample was $126 and the
standard deviation was $15. A consultant predicted the daily average income would be
$130.
a. Check the necessary conditions.
b. Would the consultant be correct at the 10% significance level? Write your
hypotheses, perform the test, find the test statistic, the p-value, and state your
conclusion.
Example 4 – Means
Judy is an ad designer who designs the newspaper ads for the Giant grocery store. Electronic
counters at the entrance total the number of people entering the store. Before Judy was hired, the
mean number of people entering every day was 3018. Since she started working at Giant, the
management thinks this average has increased. A random sample of 42 business days gave an
average of 3333 people entering the store daily, with a standard deviation of 287.
Does this indicate that the average number of people entering the store every day has increased?
Write your hypotheses, perform the test, find the test statistic, the p-value, and state your
conclusion (Use α = .05).
CALCULATOR PROCEDURES
Example 5 – Proportions
A company is willing to renew its advertising contract with a local radio station only if the
station can prove that more than 20% of the residents of the city have heard the ad and recognize
the company’s product. The radio station conducts a random phone survey of 600 people and
finds 133 people that recognize the product.
a. Use an alpha of 0.05 to discuss what this survey reveals about renewing the contract.
b. Which possible error could have occurred during the above hypothesis test? State the
error in the context of the problem.
Example 6 – Proportions
A company hopes to improve customer satisfaction, setting their goal to attain less than 8%
negative comments. A random survey of 350 customers found only 27 with complaints.
Does this provide evidence that the company has reached its goal? (Use a 0.05 significance
level to state your conclusion)
Write your hypotheses, perform the test, find the test statistic, the p-value, and state your
conclusion.
Example 7 – Proportions
The Pew Research Center reported that only 15% of 18- to 24-year-olds read a daily newspaper.
The publisher of a local newspaper wants to know whether the percentage of newspaper readers
among students at a nearby large university differs from the percentage among 18- to 24-year-
olds in general. She surveys a simple random sample of 200 students at the university and finds
that 40 of them, or 20%, read a newspaper each day. Can she conclude that the proportion of
students who read a daily newspaper differs from 0.15? Use the 𝛼 = 0.05 level of significance.
Write your hypotheses, perform the test, find the test statistic, the p-value, and state your
conclusion.
Chapter 10 & 11 Part 1
Confidence Intervals & Hypothesis Tests for Two Independent Samples
INDEPENDENT SAMPLES
For independent sampling the observations in one sample do not influence the observations in
the other.
Confidence Intervals
We wish to construct a confidence interval for the difference 𝜇₁ − 𝜇₂. Because we will still not
likely know the value of 𝜎₁ or of 𝜎₂, the confidence interval will use the t-distribution.
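The pieces of a two-sample interval can be sketched in Python from summary statistics. The function name and the numbers are hypothetical. The degrees of freedom use the Welch approximation, which is an assumption of this sketch: calculators and software commonly report it for unpooled samples, while some textbooks use the smaller of 𝑛₁ − 1 and 𝑛₂ − 1 for hand calculation.

```python
import math

def two_sample_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (difference, standard error, Welch degrees of freedom)."""
    v1 = sd1 ** 2 / n1             # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2             # estimated variance of the second sample mean
    difference = mean1 - mean2     # point estimate of mu1 - mu2
    standard_error = math.sqrt(v1 + v2)
    welch_df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return difference, standard_error, welch_df

# Hypothetical summary statistics for two independent samples.
diff, se, df = two_sample_summary(52.0, 2.0, 10, 50.0, 2.0, 10)
print(diff, round(se, 4), round(df, 2))
```

The interval is then the difference ± (t critical value × standard error), with the critical value looked up using the chosen degrees of freedom.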
CALCULATOR PROCEDURES
Example 1
A drug company has developed a new drug designed to reduce high blood pressure. To test the
drug, a sample of 15 patients is recruited to take the new drug. Their systolic blood pressures are
reduced by an average of 28.3 mmHg, with a standard deviation of 12.0 mmHg. In addition,
another sample of 20 patients take a pre-existing drug. The blood pressures in this group are
reduced by an average of 17.1 mmHg with a standard deviation of 9.0 mmHg. Assume that
blood pressure reductions are approximately normally distributed. Find a 95% confidence
interval for the difference between the population mean reduction for the new drug and that of
the pre-existing drug.
Example 2
The Acme Company has developed a new battery. The engineer in charge thinks that the
new battery will operate longer than the old battery. The company selects a simple
random sample of 100 new batteries and 100 old batteries. The old batteries run
continuously for an average of 190 minutes with a standard deviation of 20 minutes; the
new batteries, an average of 200 minutes with a standard deviation of 40 minutes.
Compute and interpret a 95% confidence interval for the difference in average running
times for the batteries.
Example 3
A total of 23 Gossett High School students were admitted to State University. Of those students,
7 were offered athletic scholarships. The school’s guidance counselor looked at their composite
ACT scores (shown in the table), wondering if State U. might admit people with lower scores if
they also were athletes. Assuming this group of students is representative of students throughout
the state, what do you think? Construct a 98% confidence interval for the mean difference
between athletes and non-athletes.
The null hypothesis says that the population means are equal: 𝐻₀ : 𝜇₁ = 𝜇₂
• 𝐻₁ : 𝜇₁ < 𝜇₂
• 𝐻₁ : 𝜇₁ > 𝜇₂
• 𝐻₁ : 𝜇₁ ≠ 𝜇₂
CALCULATOR PROCEDURES
Example 4
Below is the sugar content (in mg) of various cereals, separated by whether they are
intended to be marketed to children or adults. Does the data provide evidence to suggest
there is more sugar in children’s cereals? Write your hypotheses, perform the test, find the
test statistic, the p-value, and state your conclusion (Use α = .05).
𝑠. = 6.6 𝑠/ = 8.1
𝑛. = 18 𝑛/ = 18
Example 5
Below are test results from two different classes on Exam #2. Did Class #1 score higher than
Class #2? Write your hypotheses, perform the test, find the test statistic, the p-value, and
state your conclusion (Use α = .01).
Class #1 Class #2
77 79 86 60
80 76 72 88
77 66 29 71
85 98 29 75
71 68 73 43
53 64 70 79
60 74 83 57
64 90 50 100
68 80 71 77
76 64 71 83
72 80 52 93
69 85 67 55
75 48 53
Example 6
The National Assessment of Educational Progress (NAEP) tested a sample of students who had
used a computer in their mathematics classes, and another sample of students who had not used a
computer. The sample mean score for students using a computer was 309, with a sample
standard deviation of 29. For students not using a computer, the sample mean was 303, with a
sample standard deviation of 32. Assume there were 60 students in the computer sample, and 40
students in the sample that hadn’t used a computer. Can you conclude that the population mean
scores differ? Use the α = 0.05 level.
Chapter 10 & 11 Part 2
Confidence Intervals & Hypothesis Tests for Two Proportions
NOTATION:
• 𝑝₁ and 𝑝₂ are the population proportions of the category of interest in the two
populations.
• 𝑥₁ and 𝑥₂ are the numbers of individuals in the category of interest in the two samples.
• 𝑛₁ and 𝑛₂ are the two sample sizes.
Confidence Intervals
When the sample sizes are large enough, 𝑝̂₁ and 𝑝̂₂ are approximately normally distributed, so
the critical value is 𝑧𝛼/2.
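A confidence interval for 𝑝₁ − 𝑝₂ can be sketched in Python using the notation above. The function name and the counts are hypothetical. The standard error shown is the usual unpooled form, stated here as an assumption since this packet leaves the formula to the calculator.

```python
import math
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Return (lower, upper) for p1 - p2 using a normal critical value."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(
        p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2
    )
    margin_of_error = z_critical * standard_error
    difference = p_hat1 - p_hat2
    return difference - margin_of_error, difference + margin_of_error

# Hypothetical samples: 30 of 100 in the first group, 20 of 100 in the second.
low, high = two_proportion_interval(30, 100, 20, 100, 0.95)
print(f"{low:.4f} < p1 - p2 < {high:.4f}")
```

An interval that contains 0, as this hypothetical one does, is consistent with the two population proportions being equal.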
CALCULATOR PROCEDURES
Example 1
A random sample of 50 children living in a community with a high level of ozone pollution had
their lung capacities measured, and 14 of them had capacities that were below normal. A second
random sample of 80 children was drawn from a community with a low level of ozone pollution,
and 12 of them had lung capacities that were below normal. Construct a 95% confidence interval
for the difference between the proportions of children with diminished lung capacity between the
two communities.
Example 2
A survey of 430 randomly selected adults found that 21% of the 222 freshmen and 18% of
the 208 sophomores had purchased books online.
a. Create a 95% confidence interval for the difference between the proportion of
freshmen and sophomores who purchase books online.
b. Is there evidence that freshmen are more likely to make online purchases of books?
Example 3
Researchers at the National Cancer Institute released the results of a study that investigated the
effect of weedkillers on house pets. They examined 827 dogs from homes where weedkillers
were used on a regular basis, diagnosing malignant lymphoma in 473 of them. Of the 130 dogs
from homes where no weedkillers were used, only 19 were found to have lymphoma.
a. Construct a 95% confidence interval for the difference in the two proportions.
The null hypothesis says that the population proportions are equal: 𝐻₀ : ______________________
• 𝐻₁ : 𝑝₁ < 𝑝₂
• 𝐻₁ : 𝑝₁ > 𝑝₂
• 𝐻₁ : 𝑝₁ ≠ 𝑝₂
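A two-proportion test statistic can be sketched in Python as below. The function name and counts are hypothetical. The sketch assumes the common pooled standard error, which combines both samples because the null hypothesis says the two proportions are equal, and it shows a right-tailed P-value.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the pooled sample proportion."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)      # combined proportion, assuming H0 is true
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / standard_error

# Hypothetical samples: 30 of 100 versus 20 of 100, testing H1: p1 > p2.
z = two_proportion_z(30, 100, 20, 100)
p_value = 1 - NormalDist().cdf(z)       # area to the right of z
print(round(z, 3), round(p_value, 4))
```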
CALCULATOR PROCEDURES
Example 4
The General Social Survey took a poll that asked 350 employed people aged 25–40 whether they
used a computer at work, and 259 said they did. They also asked the same question of 500
employed people aged 41–65, and 384 of them said that they used a computer at work. Can you
conclude that the proportion who use a computer at work is greater among those aged 41–65 than
among those aged 25–40? Use the 𝛼 = 0.05 level.
Example 5
Out of a total of 82,486 accidents involving drivers aged 15–24 years, 4243 of them, or 5.1%,
occurred in a driveway. Out of a total of 219,170 accidents involving drivers aged 25–64 years,
10,701 of them, or 4.9%, occurred in a driveway. Can you conclude that accidents involving
drivers aged 15–24 are more likely to occur in driveways than accidents involving drivers aged
25–64? Use 𝛼 = 0.05.
Example 6
Would being part of a support group that meets regularly help people who are wearing the
nicotine patch actually quit smoking? A county health department tries an experiment
using several hundred volunteers who are planning to use the patch. The subjects were
randomly divided into two groups. People in Group 1 were given the patch and attended a
weekly discussion meeting with counselors and others trying to quit. People in Group 2
also used the patch but did not participate in the counseling groups. After six months 46 of
the 143 smokers in Group 1 and 30 of the 151 smokers in Group 2 had successfully stopped
smoking. Do these results suggest that such support groups could be an effective way to
help people stop smoking? Use an alpha level of 0.05.
Chapter 10 & 11 Part 3
Confidence Intervals & Hypothesis Tests for Matched Pairs
NOTATION
• 𝑑̄ is the sample mean of the differences between the values in the matched pairs.
• 𝑠_d is the sample standard deviation of the differences between the values in the matched
pairs.
If we let 𝜇₁ and 𝜇₂ represent the population means and 𝜇_d represent the population mean of the
difference, then 𝜇_d = 𝜇₁ − 𝜇₂. The paired data reduce the two-sample problem to a one-sample
problem.
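The reduction to a one-sample problem can be sketched in Python: compute the differences, then treat them as a single sample. The data below are hypothetical, and the direction of subtraction (after minus before) is a choice made for this sketch.

```python
import math
from statistics import mean, stdev

# Hypothetical matched pairs: one before and one after value per individual.
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]

# One difference per pair turns two samples into one.
differences = [a - b for a, b in zip(after, before)]

d_bar = mean(differences)          # sample mean of the differences
s_d = stdev(differences)           # sample standard deviation of the differences
n = len(differences)

# Test statistic for H0: mu_d = 0, with n - 1 degrees of freedom.
t_statistic = d_bar / (s_d / math.sqrt(n))
print(differences, d_bar, round(s_d, 4), round(t_statistic, 3))
```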
CALCULATOR PROCEDURES
Individual 1 2 3 4 5
Before 170 164 168 158 183
After 145 132 129 135 145
Example 2
Eight students in a statistics class were asked to report the number of hours they slept on
weeknights and on weekends. The table below shows the results.
Student 1 2 3 4 5 6 7 8
Weeknight hours 8 5.5 7.5 8 7 6 6 8
Weekend hours 4 7 10.5 12 11 9 6 9
a. How can you tell that this is a problem involving dependent samples?
b. Create a 90% confidence interval on the mean differences and interpret the
interval in the context of the problem.
Hypothesis Tests
CALCULATOR PROCEDURES
Example 4
Using the data about a tune-up improving car engine gas mileage, test 𝐻₀ : 𝜇_d = 0 versus
𝐻₁ : 𝜇_d > 0. Use the 𝛼 = 0.01 significance level.
Automobile 1 2 3 4 5 6 7 8
After 35.44 35.17 31.07 31.57 26.48 23.11 25.18 32.39
Before 33.76 34.30 29.55 30.90 24.92 21.78 24.30 31.25
Difference 1.68 0.87 1.52 0.67 1.56 1.33 0.88 1.14
Example 5
A soft drink company is conducting research to select a new design for the can. A random
sample of participants has been selected. Instead of a typical taste test with two different sodas,
they actually give each participant the same soda twice. One drink is served in a predominantly
red can, the other in a predominantly blue can. The order is chosen randomly. Participants are
asked to rate each drink on a scale of 1 to 10. Thus, the company wishes to test if the color of the
can influences the rating. The ratings were recorded for each participant. The data are shown in
the table below. Does this sample indicate that there is a difference in ratings? Use a significance
level of 𝛼 = 0.1.
A composition teacher wishes to see whether a new grammar program will reduce the
number of grammatical errors her students make when writing a two-page essay. The
student scores are shown below. Using an alpha of 0.01, can it be concluded that the
number of errors has been reduced using this new program?
Student 1 2 3 4 5 6
Score before 12 9 5 5 14 15
Score after 17 13 4 10 16 15
a. Write your hypotheses, perform the test, find the test statistic, the p-value, and state
your conclusion.
b. Compute and interpret a 98% confidence interval for the mean difference in scores
before and after.
Chapter 12 – Goodness of Fit Test
Hypothesis tests for qualitative data, also called categorical data, are based on the chi-square
distribution. The test statistic used for these tests is called the chi-square statistic, denoted 𝜒².
There are actually many different chi-square distributions, each with a different number of
degrees of freedom. The figure below shows several examples of chi-square distributions.
Chi-square distributions are skewed to the right, and the values of the 𝜒² statistic are always
greater than or equal to 0. They are never negative.
The notation 𝜒²_𝛼 represents the value that has an area of 𝛼 to its right. We can consult a table to
find critical values associated with the chi-square distribution.
Chi-Square Goodness of Fit Test Hypotheses
𝐻₀ : The distribution of counts occurs in a manner consistent with the model provided.
𝐻₁ : The distribution of counts occurs in a manner that is inconsistent with the model
provided.
EXPECTED FREQUENCIES
The expected frequencies (counts) are the mean counts that would occur if 𝐻₀ were true.
If the probabilities specified by 𝐻₀ are 𝑝₁, 𝑝₂, …, and the total number of trials is 𝑛, the expected
frequencies are computed by multiplying the sample size by the probability of each category.
Note: If no probabilities are given, divide the sample size 𝑛 by the number of categories.
The statistic that measures how large the differences are between the observed and expected
frequencies is called the chi-square statistic.
Let 𝑘 be the number of categories, let 𝑂₁, …, 𝑂ₖ be the observed frequencies, and let 𝐸₁, …, 𝐸ₖ be
the expected frequencies. The expected counts of all categories must be at least 5.
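The chi-square statistic adds up (observed − expected)² / expected over all categories, the same quantity the calculator steps build with (L1 – L2)^2 / L2. A minimal Python sketch follows; the function names, the observed counts, and the probabilities are hypothetical.

```python
def expected_counts(n, probabilities):
    """Expected frequency of each category: sample size times probability."""
    return [n * p for p in probabilities]

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: three categories with probabilities given by H0.
observed = [12, 8, 20]
probabilities = [0.25, 0.25, 0.50]
n = sum(observed)

expected = expected_counts(n, probabilities)
assert all(e >= 5 for e in expected)    # condition: every expected count at least 5

chi_square = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # number of categories minus 1
print(expected, round(chi_square, 3), degrees_of_freedom)
```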
TI-84
STAT → EDIT
Then type the Observed values in L1 and the Expected values in L2.
2nd MODE to get back to the main screen
STAT → TESTS → D: 𝜒² GOF-Test
Observed: L1
Expected: L2
df: (# of categories minus 1)
Press Enter – both the test statistic and p-value are given (no other information is needed)
TI-83
STAT → EDIT
Then type the Observed values in L1 and the Expected values in L2.
Highlight L3 and type in (L1 – L2)^2 / L2, then press Enter.
2nd MODE to get back to the main screen
2nd STAT → MATH, then choose 5: sum( and enter 2nd 3, so the screen shows sum(L3), then press
Enter. This will be your Test Statistic.
Example 2
A high school principal believes that when the SAT tests are given, the group consists of 10%
freshmen, 20% sophomores, 40% juniors, and 30% seniors. The group that just took the test
consisted of 12 freshmen, 18 sophomores, 45 juniors, and 25 seniors. Using a 10% significance
level, test the principal’s conjecture.
Monday 125
Tuesday 105
Wednesday 120
Thursday 114
Friday 136
Total
Chapter 12 – Tests for Independence
Tests for Independence involve two categorical variables. We are interested in determining
whether the distribution of one variable differs, depending on the value of the other variable.
If the distribution of one variable is the same for all values of the other variable, the variables are
______________________________.
The expected frequency for a cell represents the number of individuals we would expect to find
in that cell under the assumption that the two variables are independent.
Given a contingency table, to obtain each cell’s expected count, multiply the cell’s row total by
the column total and divide by the total sample size.
If the differences between the observed and expected frequencies tend to be large, we will reject
the null hypothesis of independence.
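The expected-count rule (row total × column total ÷ total sample size) can be sketched in Python. The function name and the 2 × 2 table are hypothetical. The degrees of freedom, (rows − 1) × (columns − 1), are the standard value for this test and are not stated in this packet.

```python
def expected_table(observed):
    """Expected count for each cell: row total * column total / sample size."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]

# Hypothetical 2 x 2 contingency table of observed frequencies.
observed = [[10, 20],
            [30, 40]]
expected = expected_table(observed)

# Chi-square statistic: sum of (O - E)^2 / E over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(observed, expected)
    for o, e in zip(observed_row, expected_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(expected, round(chi_square, 4), degrees_of_freedom)
```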
We enter the observed frequencies that are given in the contingency table into
the MATRIX function. We do so by pressing 2nd x⁻¹ (MATRIX), then the right arrow to navigate
to the EDIT menu. We will place the observed frequencies into MATRIX [A], so
we press ENTER. The contingency table is made up of rows versus columns. For example, we
type 4 ENTER, 3 ENTER to indicate that we will be entering 4 rows and 3 columns of
data. Then enter the data.
Now, press 2nd MODE to QUIT the matrix mode. Next, press STAT, scroll to
the TESTS menu and select 𝜒²-Test.
Make sure that [A] is listed in the Observed field. The calculator will place the expected
frequencies into matrix [B] by default. Choose the Calculate option to display the test
statistic and the P-value.
We now go back into the matrix menu by pressing 2nd x⁻¹ (MATRIX) to view matrix [B].
Example 1
Perform a test of the null hypothesis that major and hours studying are independent.
Use 𝛼 = 0.01.
Major
Hours Studying Per Week Humanities Social Science Business Engineering
0–10 68 106 131 40
11–20 119 103 127 81
More than 20 70 52 51 52
Example 2
Are students’ post-graduation plans independent of the college they graduate from within the
same university?
Agriculture Arts & Sciences Engineering ILR
Employed 209 198 177 101
Grad School 104 171 158 33
Other 135 115 39 16
Example 3
A poll conducted at a U.S. university classifies respondents by whether they lived on or off
campus and their political party. The results were investigated for evidence of an
association between campus living status and party affiliation.