
Statistical Inference - AP - TV

The document covers key concepts in statistical inference, including sampling techniques, point estimation, sampling distributions, interval estimation, and hypothesis testing. It explains the importance of sampling populations, the central limit theorem, and how to construct confidence intervals for means. Additionally, it details hypothesis testing procedures, including null and alternative hypotheses, Type I and II errors, and the use of p-values in decision-making.

Uploaded by

Hai Yen
Copyright
© All Rights Reserved

Statistical Inference

Learning objectives
❖ Selecting a Sample
❖ Point Estimation
❖ Sampling Distributions
❖ Interval Estimation
❖ Hypothesis Tests
Selecting samples
Population, sample and individual cases

Source: Saunders et al. (2009)


Figure 7.1 Population, sample and individual cases
LO1 Explain why a sample is often the only feasible
way to learn something about a population

Why Sample the Population?


1. To contact the whole population would
be time consuming.
2. The cost of studying all the items in a
population may be prohibitive.
3. The physical impossibility of checking
all items in the population.
4. The destructive nature of some tests.
5. The sample results are adequate.
Overview of sampling techniques

Source: Saunders et al. (2009)


Figure 7.2 Sampling techniques
Simple Random Sample
Simple Random Sample: A sample selected so that each
item or person in the population has the same chance of
being included.

EXAMPLE:
A population consists of 845 employees of Nitra Industries. A
sample of 52 employees is to be selected from that population.
The name of each employee is written on a small slip of paper, and
all of the slips are deposited in a box. After they have been thoroughly
mixed, the first selection is made by drawing a slip out of the box
without looking at it. This process is repeated until the sample of
52 employees is chosen.
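
The slip-drawing procedure can be mimicked in software. The following is a minimal sketch, assuming the 845 employees are identified by hypothetical ID numbers 1 to 845; the IDs and the seed are illustrative and not part of the original example.

```python
import random

# Hypothetical employee IDs 1..845 stand in for the slips of paper.
population = list(range(1, 846))

rng = random.Random(42)               # fixed seed only so the draw is repeatable
sample = rng.sample(population, 52)   # 52 distinct IDs, drawn without replacement

print(len(sample), sorted(sample)[:5])
```

Because `random.sample` draws without replacement and treats every ID alike, each employee has the same chance of being included, which is the defining property of a simple random sample.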


Sample the Population


1. Sampling from a finite population
2. Sampling from an infinite population

LO2 Point Estimation

Point Estimation

LO3 Sampling Distributions

Sampling Distributions
- Sampling Distribution of the Sample Mean
- Sampling Distribution of the Sample Proportion (covered in Chapter 6)

LO4 Describe the sampling distribution of the
mean.

Sampling Distribution of the


Sample Mean
The sampling distribution of
the sample mean is a probability
distribution consisting of all
possible sample means of a
given sample size selected from a
population.

Sampling Distribution of the
Sample Means - Example
Tartus Industries has seven production employees (considered the
population). The hourly earnings of each employee are given in the
table below.

1. What is the population mean?


2. What is the sampling distribution of the sample mean for samples of size 2?
3. What is the mean of the sampling distribution?
4. What observations can be made about the population and the sampling distribution?

These observations can be made:


a. The mean of the distribution of the sample mean ($7.71) is equal to the mean of the
population.
b. The shape of the sampling distribution of the sample mean and the shape of the frequency
distribution of the population values are different. The distribution of the sample mean tends
to be more bell-shaped and to approximate the normal probability distribution.
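
The first observation can be checked by listing every possible sample of size 2. The sketch below is only illustrative: the earnings table is not reproduced in this text, so the seven hourly earnings are ASSUMED values chosen to be consistent with the $7.71 population mean quoted above.

```python
from itertools import combinations
from statistics import mean

# ASSUMED hourly earnings ($) for the seven employees. The original table is
# not available here; these values are illustrative and merely consistent
# with the $7.71 population mean quoted in the text.
earnings = [7, 7, 8, 8, 7, 8, 9]

population_mean = mean(earnings)

# Every possible sample of size 2 drawn without replacement: C(7, 2) = 21.
sample_means = [mean(pair) for pair in combinations(earnings, 2)]
mean_of_sample_means = mean(sample_means)

print(len(sample_means))
print(round(population_mean, 2), round(mean_of_sample_means, 2))
```

Whatever the seven values are, the mean of all 21 sample means equals the population mean, because each employee appears in the same number of pairs.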

LO4 Describe the sampling distribution of the
mean.

Sampling Distribution of the


Sample Mean
When the expected value of a point estimator equals the
population parameter, the point estimator is unbiased.

◼ For a finite population, where the total number of objects is N and the size
of the sample is n, the following adjustment is made to the standard error
of the sample mean:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
◼ However, if n/N < .05, the finite-population correction factor may be ignored.
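
The finite-population adjustment can be sketched as a small helper. The function name `standard_error` and the numbers in the usage lines are hypothetical, and the correction factor used is the standard one, sqrt((N - n) / (N - 1)).

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the sample mean.

    The finite-population correction is applied only when a population
    size N is given and n/N is at least 0.05.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Hypothetical numbers: sigma = 10, sample of 20.
print(standard_error(10, 20))           # population treated as infinite
print(standard_error(10, 20, N=100))    # n/N = 0.20, so the correction applies
print(standard_error(10, 20, N=10000))  # n/N = 0.002, correction ignored
```

The correction always shrinks the standard error, since sampling a large share of a finite population leaves less room for sampling variation.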

LO5 Explain the central limit
theorem.
Central Limit Theorem
CENTRAL LIMIT THEOREM If all samples of a particular size are
selected from any population, the sampling distribution of the sample
mean is approximately a normal distribution. This approximation
improves with larger samples.

◼ If the population follows a normal probability distribution, then for
any sample size the sampling distribution of the sample mean will
also be normal.
◼ If the population distribution is symmetrical (but not normal), the
normal shape of the distribution of the sample mean emerges with
samples as small as 10.
◼ If the distribution is skewed or has thick tails, it may require
samples of 30 or more to observe the normality feature.
◼ The mean of the sampling distribution is equal to μ and the variance
is equal to σ²/n.
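
The last point can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be exponential with rate 1 (a strongly skewed distribution with μ = 1 and σ² = 1), and the sample size, number of samples, and seed are arbitrary choices.

```python
import random
from statistics import mean, pvariance

rng = random.Random(0)  # fixed seed for repeatability

# ASSUMED population: exponential with rate 1 (right-skewed),
# so mu = 1 and sigma^2 = 1. This choice is illustrative only.
mu, sigma2 = 1.0, 1.0
n = 30              # size of each sample
num_samples = 5000  # number of repeated samples

sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

print("mean of sample means:", round(mean(sample_means), 3), "vs mu =", mu)
print("variance of sample means:", round(pvariance(sample_means), 4),
      "vs sigma^2/n =", round(sigma2 / n, 4))
```

Even though the population is far from normal, the sample means cluster around μ with a spread close to σ²/n.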

Central Limit Theorem - Example


Spence Sprockets, Inc. employs 40 people and faces some major decisions regarding health
care for these employees. Before making a final decision on what health care plan to purchase,
Ed decides to form a committee of five representative employees. The committee will be asked to
study the health care issue carefully and make a recommendation as to what plan best fits the
employees’ needs. Ed feels the views of newer employees toward health care may differ from
those of more experienced employees. If Ed randomly selects this committee, what can he
expect in terms of the mean years with Spence Sprockets for those on the committee? How does
the shape of the distribution of years of experience of all employees (the population) compare
with the shape of the sampling distribution of the mean? The lengths of service (rounded to the
nearest year) of the 40 employees currently on the Spence Sprockets, Inc., payroll are as follows.

[Figure: sample means for 25 samples of five employees and for 25 samples of 20 employees]

LO6 Interval Estimation

Interval Estimation
- Interval estimation of population mean
- Interval estimation of population proportion (covered in Chapter 6)

Confidence Intervals
for Means
LO6 Define a confidence estimate.

Confidence Interval Estimates


◼ A confidence interval is a range of values
constructed from sample data so that the
population parameter is likely to occur within
that range at a specified probability. The
specified probability is called the level of
confidence.

C.I. = point estimate ± margin of error


Some terms
◼ Interval estimate is another name for confidence interval.
◼ Confidence level is the confidence associated with an interval
estimate. For example, if an interval estimation procedure provides
intervals such that 95% of the intervals formed using the procedure
will include the population parameter, the interval estimate is said to
be constructed at the 95% confidence level.
◼ Confidence coefficient is the confidence level expressed as a
decimal value. For example, 0.95 is the confidence coefficient for a
95% confidence level.
◼ Level of significance is the probability that the interval estimation
procedure will generate an interval that does not contain the value of the
parameter being estimated.
Level of significance = 1 – confidence coefficient

Factors Affecting Confidence
Interval Estimates
The width of a confidence interval is
determined by:
1. The sample size, n.
2. The variability in the population, σ, usually
estimated by s.
3. The desired level of confidence.

LO7 Compute a confidence interval for the
population mean when the population standard
deviation is known.

Confidence Intervals for a Mean – σ Known

$$\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$$

x̄ − sample mean
z − z-value for a particular confidence level
σ − the population standard deviation
n − the number of observations in the sample


Interval Estimates - Interpretation


For a 95% confidence interval, about 95% of similarly
constructed intervals will contain the parameter being estimated.
Also, 95% of the sample means for a specified sample size will lie
within 1.96 standard errors (standard deviations of the sampling
distribution) of the hypothesized population mean.

How to Obtain z value for a Given
Confidence Level
The 95 percent confidence refers
to the middle 95 percent of the
observations. Therefore, the
remaining 5 percent are equally
divided between the two tails.

Following is a portion of Appendix B.1.
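
Instead of a table lookup, the same z value can be obtained from the standard normal distribution in software. A minimal sketch using Python's standard library (the variable names are illustrative):

```python
from statistics import NormalDist

confidence = 0.95
alpha = 1 - confidence                    # 5 percent shared by the two tails
z = NormalDist().inv_cdf(1 - alpha / 2)   # cumulative area of 0.975

print(round(z, 2))
```

Each tail holds 2.5 percent, so the z value is the point with 97.5 percent of the area to its left, which rounds to 1.96.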


Example: Confidence Interval for a Mean – σ Known
The American Management Association wishes to
have information on the mean income of middle
managers in the retail industry. A random sample of
256 managers reveals a sample mean of $45,420. The
standard deviation of this population is $2,050. The
association would like answers to the following
questions:

What is a reasonable range of values for the


population mean?

Suppose the association decides to use the 95 percent


level of confidence:
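
The computation for this example can be sketched as follows, using only the figures given in the problem (n = 256, sample mean $45,420, σ = $2,050). The variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 256
x_bar = 45_420   # sample mean ($)
sigma = 2_050    # population standard deviation ($)
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sigma / math.sqrt(n)      # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(round(lower), round(upper))
```

Rounded to the nearest dollar, the limits are $45,169 and $45,671, so a reasonable range for the population mean is roughly $45,169 to $45,671.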

Example: Confidence Interval for a Mean - Interpretation
The American Management Association wishes to have information
on the mean income of middle managers in the retail industry. A
random sample of 256 managers reveals a sample mean of $45,420. The
standard deviation of this population is $2,050. The confidence limits are
$45,169 and $45,671.

What is the interpretation of the confidence limits $45,169 and $45,671?

If we select many samples of 256 managers, and for each sample we
compute the mean and then construct a 95 percent confidence interval, we
could expect about 95 percent of these confidence intervals to contain the
population mean. Conversely, about 5 percent of the intervals would not
contain the population mean annual income, µ.
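
This repeated-sampling interpretation can be checked by simulation. The sketch below ASSUMES a normal population with a "true" mean of $45,400, a value invented for the simulation since µ is never known in practice; the number of trials and the seed are also arbitrary.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(1)  # fixed seed for repeatability

# ASSUMED "true" population for the simulation (mu is hypothetical).
mu, sigma, n = 45_400, 2_050, 256
z = NormalDist().inv_cdf(0.975)
margin = z * sigma / math.sqrt(n)

trials = 2000
covered = 0
for _ in range(trials):
    # Draw one sample of 256 incomes and build its 95 percent interval.
    x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    if x_bar - margin <= mu <= x_bar + margin:
        covered += 1

coverage = covered / trials
print(coverage)
```

The printed proportion should land near 0.95, with the remaining intervals missing µ purely because of sampling error.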

LO8 Compute a confidence interval for the population
mean when the population standard deviation is not
known.
Population Standard Deviation (σ) Unknown
In most sampling situations the population standard deviation (σ) is
not known. Below are some examples where it is unlikely the
population standard deviations would be known.
1. The Dean of the Business College wants to estimate the mean number of
hours full-time students work at paying jobs each week. He selects a
sample of 30 students, contacts each student and asks them how many
hours they worked last week.

2. The Dean of Students wants to estimate the distance the typical commuter
student travels to class. She selects a sample of 40 commuter students,
contacts each, and determines the one-way distance from each student’s
home to the center of campus.

3. The Director of Student Loans wants to know the mean amount owed on
student loans at the time of graduation. The director selects a
sample of 20 graduating students and contacts each to find the information.


Characteristics of the t-distribution


1. It is, like the z distribution, a continuous distribution.
2. It is, like the z distribution, bell-shaped and symmetrical.
3. There is not one t distribution, but rather a family of t distributions.
All t distributions have a mean of 0, but their standard deviations
differ according to the sample size, n.
4. The t distribution is more spread out and flatter at the center than
the standard normal distribution. As the sample size increases,
however, the t distribution approaches the standard normal
distribution.

Confidence Interval for the Mean –
Example using the t-distribution

A tire manufacturer wishes to investigate the tread life of its
tires. A sample of 10 tires driven 50,000 miles revealed a
sample mean of 0.32 inch of tread remaining with a standard
deviation of 0.09 inch. Construct a 95 percent confidence interval
for the population mean. Would it be reasonable for the
manufacturer to conclude that after 50,000 miles the
population mean amount of tread remaining is 0.30 inches?

Given in the problem:
n = 10
x̄ = 0.32
s = 0.09

Compute the C.I. using the t-distribution (since σ is unknown):
$$\bar{X} \pm t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$$
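
The tire example can be worked through numerically. In this sketch the critical value 2.262 is read from a standard Student's t table (two-sided 95 percent, 9 degrees of freedom) and hardcoded, an assumption made so that the example needs only Python's standard library.

```python
import math

n = 10
x_bar = 0.32   # sample mean, inches of tread remaining
s = 0.09       # sample standard deviation, inches

# t_{0.025, 9} from a standard Student's t table (hardcoded assumption).
t_crit = 2.262

margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(round(lower, 3), round(upper, 3))
print(lower <= 0.30 <= upper)
```

The interval runs from about 0.256 to 0.384 inch. Because 0.30 lies inside it, the manufacturer could reasonably conclude that the population mean tread remaining is 0.30 inch.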


Student’s t-distribution Table

Hypothesis tests
LO 9 Hypothesis tests

Hypothesis tests
- Developing null and alternative hypotheses
- Type I and II Errors
- Hypothesis test of the population mean
- Hypothesis test of the population proportion (covered in Chapter 6)


Null and Alternate Hypothesis

NULL HYPOTHESIS A statement about the value of
a population parameter developed for the purpose of
testing numerical evidence.

ALTERNATE HYPOTHESIS A statement that is accepted
if the sample data provide sufficient evidence that the
null hypothesis is false.

LO12 Describe the Type I and Type II
errors.

Decisions and Errors in Hypothesis Testing


Type of Errors in Hypothesis Testing

◼ Type I Error
▪ Defined as rejecting the null hypothesis when it is actually true.
▪ The probability of a Type I error is denoted by the Greek letter “α”.
▪ This probability is also known as the significance level of a test.

◼ Type II Error
▪ Defined as failing to reject the null hypothesis when it is actually false.
▪ The probability of a Type II error is denoted by the Greek letter “β”.
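
To make the two probabilities concrete, the sketch below computes β for a right-tailed z test with σ known. The helper name `type_ii_probability` and all numeric inputs are hypothetical, chosen only for illustration. α is fixed by the analyst, while β depends on how far the true mean lies from the hypothesized one.

```python
import math
from statistics import NormalDist

def type_ii_probability(mu0, mu1, sigma, n, alpha):
    """beta for the right-tailed z test of H0: mu <= mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)        # critical value
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))   # true mean in standard-error units
    return std_normal.cdf(z_alpha - shift)         # P(fail to reject H0 | mu = mu1)

# Hypothetical inputs chosen only for illustration.
alpha = 0.05
beta_small_shift = type_ii_probability(mu0=100, mu1=102, sigma=10, n=25, alpha=alpha)
beta_large_shift = type_ii_probability(mu0=100, mu1=106, sigma=10, n=25, alpha=alpha)

print(round(beta_small_shift, 3), round(beta_large_shift, 3))
```

The further the true mean is from the hypothesized value, the smaller β becomes, because a false null hypothesis is then easier to detect.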

Single Sample
Hypothesis Tests –
Population Means With
Sigma Known
LO 13 Define the term test statistic and explain how it is
used.

Test Statistic versus Critical Value

TEST STATISTIC A value, determined from sample


information, used to determine whether to reject the
null hypothesis.

Example: z, t

CRITICAL VALUE The dividing point between the


region where the null hypothesis is rejected and the
region where it is not rejected.

LO 16 Conduct a test of hypothesis about a population
mean.

Hypothesis Setups for Testing a Mean (μ)

LO 17 Compute and interpret a p-value

p-Value in Hypothesis Testing


◼ p-VALUE is the probability of observing a sample value
as extreme as, or more extreme than, the value
observed, given that the null hypothesis is true.

◼ In testing a hypothesis, we can also compare the p-value
to the significance level (α).

◼ Decision rule using the p-value:


Reject H0 if p-value < significance level


p-Value in Hypothesis Testing -


Example
Recall the last problem where the
hypothesis and decision rules were set
up as:

H0: μ ≤ 200
H1: μ > 200
Reject H0 if Z > Zα
where Z = 1.55 and Zα = 2.33

Reject H0 if p-value < α
0.0606 is not < 0.01

Conclude: Fail to reject H0
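
The p-value and the critical value in this example can be reproduced from the standard normal distribution. A minimal sketch, using only the test statistic Z = 1.55 and α = 0.01 stated above (the variable names are illustrative):

```python
from statistics import NormalDist

z = 1.55       # computed test statistic from the example
alpha = 0.01   # significance level

p_value = 1 - NormalDist().cdf(z)          # right-tailed test (H1: mu > 200)
z_crit = NormalDist().inv_cdf(1 - alpha)   # critical value

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(round(p_value, 4), round(z_crit, 2), decision)
```

Both routes lead to the same decision: the p-value of about 0.0606 is not below 0.01, and 1.55 does not exceed the critical value of about 2.33.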


Single Sample Hypothesis


Tests – Population Means
With Sigma Unknown
LO 18

Testing for the Population Mean:


Population Standard Deviation Unknown

◼ When the population standard deviation (σ) is
unknown, the sample standard deviation (s) is used in
its place.
◼ The test statistic follows the t-distribution with n − 1 degrees of
freedom and is computed using the formula:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

Testing for the Population Mean: Population
Standard Deviation Unknown - Example
The McFarland Insurance Company Claims Department
reports the mean cost to process a claim is $60. An
industry comparison showed this amount to be larger
than most other insurance companies, so the company
instituted cost-cutting measures. To evaluate the effect of
the cost-cutting measures, the Supervisor of the Claims
Department selected a random sample of 26 claims
processed last month. The sample information is reported
below.
At the .01 significance level, is it reasonable to conclude that the
mean cost to process a claim is now less than $60?

Testing for a Population Mean with an
Unknown Population Standard Deviation- Example

Step 1: State the null hypothesis and the alternate hypothesis.

H0: μ ≥ $60
H1: μ < $60
(note: keyword in the problem “now less than”)

Step 2: Select the level of significance.


α = 0.01 as stated in the problem

Step 3: Select the test statistic.


Use t-distribution since σ is unknown
Step 4: Formulate the decision rule.
Reject H0 if t < −tα,n−1

Step 5: Make a decision and interpret the result.


Because -1.818 does not fall in the rejection region, H0 is not rejected at
the .01 significance level. We have not demonstrated that the cost-
cutting measures reduced the mean cost per claim to less than $60.
The difference of $3.58 ($56.42 - $60) between the sample mean and the
population mean could be due to sampling error.
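
The decision in Step 5 can be retraced numerically. In this sketch the critical value 2.485 is taken from a standard Student's t table (one-tailed α = 0.01, 25 degrees of freedom) and hardcoded, which is an assumption since the table portion is not reproduced here. The sample standard deviation is also not shown in this text, so the code only backs out the value implied by the reported t statistic.

```python
import math

n = 26
mu0 = 60.0        # hypothesized mean cost per claim ($)
x_bar = 56.42     # sample mean ($)
t_stat = -1.818   # test statistic reported in the example

# t_{0.01, 25} from a standard Student's t table (hardcoded assumption).
t_crit = 2.485

reject_h0 = t_stat < -t_crit   # left-tailed decision rule

# Sample standard deviation implied by t = (x_bar - mu0) / (s / sqrt(n)).
implied_s = (x_bar - mu0) * math.sqrt(n) / t_stat

print(reject_h0, round(implied_s, 2))
```

Since -1.818 is not below -2.485, H0 is not rejected, which agrees with the conclusion stated above.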