Statistical Inference 417

Good for students

Uploaded by

maxwell amponsah

Statistical Inference

MLB 417
LEARNING OUTCOMES
• After studying this chapter, the student will be able to:
• 1. Explain the importance and basic principles of estimation
• 2. Calculate interval estimates for a variety of parameters
• 3. Interpret a confidence interval
• 4. Identify the basic properties and uses of the t distribution, chi-square distribution, and F distribution
Introduction
• Means and variances in most cases are calculated from
samples drawn from populations.
• These statistics serve as estimates of the corresponding
population parameters.
• These estimates are expected to differ by some amount from
the parameters they estimate.
• Estimation procedures take these differences into account,
thereby providing a foundation for statistical inference.
• The two basic areas of statistical inference are estimation and
hypothesis testing.
Introduction
• Statistical inference procedures can be used to reach
conclusions about the target population only when the target
population and the sampled population are the same.
• For example, to assess the effectiveness of a method for
treating arthritis, the target population will consist of all
patients suffering from the disease.
• It is not practical to draw a sample from this population.
• Instead, a sample may be selected from all arthritis patients seen in some specific clinic; these patients form the sampled population.
• Inferences about this sampled population may be drawn on the basis of the information in the sample.
Definitions
• Statistical inference is the procedure by which we reach a
conclusion about a population on the basis of the
information contained in a sample drawn from that
population.
• It includes methods like point estimation, interval estimation
and hypothesis testing which are all based on probability
theory.
• In essence, it is about making decisions in the face of uncertainty.
• Estimation entails calculating, from the data of a sample, some statistic that is offered as an approximation of the corresponding parameter of the population from which the sample was drawn.
Definitions
• An estimate is a single value computed from a sample.
• An estimator is the rule used to compute this value, or estimate.
• A point estimate is a single numerical value used to estimate the corresponding population parameter.
• An interval estimate consists of two numerical values defining a range of values that, with a specified degree of confidence, most likely includes the parameter being estimated.
• The sampled population is the population from which one actually draws a sample.
• The target population is the population about which one wishes to make an inference.
Point and Interval Estimates
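• A point estimate is a single number.
• A confidence interval provides additional information about the variability of the estimate.

[Figure: a confidence interval drawn as a segment running from the lower confidence limit to the upper confidence limit, with the point estimate between them; the distance between the limits is the width of the confidence interval.]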
Point Estimators – Most common to use sample values

• Sample mean estimates population mean μ:

$\hat{\mu} = \bar{y} = \frac{\sum y_i}{n}$

• Sample std. dev. estimates population std. dev. σ:

$\hat{\sigma} = s = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}}$

• Sample proportion $\hat{p}$ estimates population proportion $p$
Confidence Intervals

• A confidence interval (CI) is an interval of numbers believed to contain the parameter value.

• The probability that the method produces an interval that contains the parameter is called the confidence level. Most studies use a confidence level close to 1, such as 0.95 or 0.99.
Confidence Interval Estimate

• An interval gives a range of values:

• Takes into consideration variation in sample statistics from
sample to sample
• Based on observations from 1 sample
• Gives information about closeness to unknown population
parameters
• Stated in terms of level of confidence
• e.g. 95% confident, 99% confident
• Can never be 100% confident
Confidence Interval

• In practice you only take one sample of size n

• In practice you do not know µ so you do not know if the
interval actually contains µ
• However you do know that 95% of the intervals formed in
this manner will contain µ
• Thus, based on the one sample you actually selected, you can be 95% confident that your interval contains µ (this is a 95% confidence interval)
General Formula
• The general formula for all confidence intervals is:
Point Estimate ± (Critical Value)(Standard Error)
Where:
• Point Estimate is the sample statistic estimating the population parameter of interest
• Critical Value is a table value based on the sampling distribution of the point estimate and the desired confidence level
• Standard Error is the standard deviation of the point estimate
Confidence Intervals
[Diagram: confidence intervals are constructed for a population mean (σ known or σ unknown) and for a population proportion.]
Confidence Interval for 𝜇
(𝜎 known)

• Assumptions
• Population standard deviation σ is
known
• Population is normally distributed
• If population is not normal, use large
sample

Confidence Interval for 𝜇
(𝜎 known)
• Confidence interval estimate:

$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

where $\bar{X}$ is the point estimate,
$Z_{\alpha/2}$ is the normal distribution critical value for a probability of $\alpha/2$ in each tail, and
$\sigma/\sqrt{n}$ is the standard error
Finding the Critical Value, Zα/2

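• Consider a 95% confidence interval: $1 - \alpha = 0.95$, so $\alpha = 0.05$ and $\alpha/2 = 0.025$ in each tail.
• In Z units, the critical values are $-Z_{\alpha/2} = -1.96$ and $Z_{\alpha/2} = 1.96$, with 0 at the center.
• In X units, these correspond to the lower confidence limit, the point estimate, and the upper confidence limit.

[Figure: standard normal curve with an area of 0.025 in each tail beyond ±1.96.]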
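The table lookup for the critical value can also be done numerically. The sketch below is illustrative only; it assumes Python's standard-library `statistics.NormalDist`, and the helper name `z_critical` is made up for this example.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Return Z_{alpha/2}, the two-sided standard normal critical value."""
    alpha = 1.0 - confidence
    # The upper critical value leaves an area of alpha/2 in the right tail.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

z95 = z_critical(0.95)
z99 = z_critical(0.99)
print(f"95%: {z95:.2f}   99%: {z99:.3f}")
```

For a 95% interval this reproduces the table value 1.96 used throughout these slides.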
Example 1
• Data on percentage saturation of bile for 31 male patients are as follows: $\bar{x} = 84.65$, s = 24.00
• Find the 95% confidence interval for the mean.
Solution

• $SE(\bar{x}) = \frac{24.00}{\sqrt{31}} = 4.31$
• 84.65 ± (1.96)(4.31) = (76.2; 93.1)
• We are 95% confident that the true mean saturation level is between 76.2 and 93.1
• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean
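The arithmetic in this solution can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides; the function name `mean_ci` is hypothetical, and the critical value 1.96 is passed in as the table value.

```python
import math

def mean_ci(xbar, sd, n, critical):
    """Point estimate +/- critical value * standard error."""
    se = sd / math.sqrt(n)          # standard error of the mean
    margin = critical * se          # margin of error
    return se, xbar - margin, xbar + margin

# Bile saturation example: n = 31, mean 84.65, s = 24.00
se, lower, upper = mean_ci(84.65, 24.00, 31, 1.96)
print(f"SE = {se:.2f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```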
Confidence Interval for 𝜇 (𝜎 Unknown)
• If the population standard deviation is unknown, the population should be normally distributed; if it is not, a large sample should be used
• The sample standard deviation, S, can be substituted for the population standard deviation σ.
• This introduces extra uncertainty, since S varies from sample to sample
• So the t distribution with n-1 degrees of freedom is used instead of the normal distribution.
• The degrees of freedom measure the amount of information available in the data that can be used to estimate σ²; hence, they measure the reliability of s² as an estimate of σ².
Confidence Interval for 𝜇
(𝜎 Unknown)

• Confidence Interval Estimate:

$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$

(where $t_{\alpha/2}$ is the critical value of the t distribution with n - 1 degrees of freedom and an area of α/2 in each tail)
Student’s t Distribution
• The t is a family of distributions
• The tα/2 value depends on degrees of freedom
(d.f.)
• Number of observations that are free to vary
after sample mean has been calculated
• d.f. = n - 1

Student’s t Distribution
Note: t → Z as n increases
• t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

[Figure: density curves centered at 0 for the standard normal (t with df = ∞), t (df = 13), and t (df = 5).]
Example

A random sample of n = 25 has $\bar{X} = 50$ and S = 8. Form a 95% confidence interval for μ

• d.f. = n – 1 = 24, so $t_{\alpha/2} = t_{0.025} = 2.0639$

The confidence interval is

$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}} = 50 \pm (2.0639) \frac{8}{\sqrt{25}}$

46.698 ≤ μ ≤ 53.302
Confidence Interval for a Proportion
• Estimating a population proportion, p, is similar to estimating a population mean.
• A sample is drawn from the population of interest, and the sample proportion, $\hat{p}$, is computed. This sample proportion is used as the point estimator of the population proportion. A confidence interval is obtained by the general formula
• estimator ± (reliability coefficient)*(standard error of the estimator)
The standard error of the sample proportion is estimated by $\sqrt{\hat{p}(1 - \hat{p})/n}$.
The 100(1 – α) percent confidence interval for p is given by
$\hat{p} \pm z_{1-\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n}$
Example 1:

What percentage of 18-22 year-old Ghanaians report being “very happy”?
Recent GSS data: 35 of n = 164 say they are “very happy” (others report being “pretty happy” or “not too happy”)

$\hat{p} = 35/164 = 0.213$

$se = \sqrt{\hat{p}(1 - \hat{p})/n} = \sqrt{0.213(0.787)/164} = 0.032$
Example 1:

95% CI is 0.213 ± 1.96(0.032), or 0.213 ± 0.063 (i.e., “margin of error” = 0.063), which gives (0.15, 0.28).

We’re 95% confident that the population proportion who are “very happy” is between 0.15 and 0.28.
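As a quick check of this example, the interval can be computed directly from the counts. The following is an illustrative sketch; `proportion_ci` is a hypothetical helper name, and z = 1.96 is the table value for 95% confidence.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
    return p_hat, se, p_hat - z * se, p_hat + z * se

# "Very happy" example: 35 of 164 respondents
p_hat, se, lower, upper = proportion_ci(35, 164)
print(f"p_hat = {p_hat:.3f}, se = {se:.3f}, CI = ({lower:.2f}, {upper:.2f})")
```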
Example 2
In a survey, 18 percent of Internet users reported having used the Internet to search for information regarding medicines. The sample consisted of 1220 adult Internet users. Construct a 95 percent confidence interval for the proportion of Internet users in the sampled population who have searched for information on medicines.
Solution
$\hat{p} = 0.18$
The reliability coefficient corresponding to a confidence level of .95 is 1.96.
The estimate of the standard error is $\sqrt{(0.18)(0.82)/1220} = 0.0110$
The 95 percent confidence interval for p, based on these data, is
0.18 ± 1.96(0.0110)
0.18 ± 0.022, that is, (0.158, 0.202)
Example 2 cont.
We are 95 percent confident that the population proportion p is
between .158 and .202.
Thus, we expect, with 95 percent confidence, to find
somewhere between 15.8 percent and 20.2 percent of adult
Internet users to have used it for information on medicine.

Example 3:

• Weight measured before and after a period of treatment
• y = weight at end – weight at beginning
• The results are shown below for n = 17 girls receiving the treatment.
y = 11.4, 11.0, 5.5, 9.4, 13.6, -2.9, -0.1, 7.4, 21.5, -5.3, -3.8, 13.4, 13.1, 9.0, 3.9, 5.7, 10.7
SPSS: Analyze
---------------------------------------------------------------------------------------
Variable N Mean Std.Dev. Std. Error Mean
weight_change 17 7.265 7.157 1.736
----------------------------------------------------------------------------------------
se obtained as $se = s/\sqrt{n} = 7.157/\sqrt{17} = 1.736$
Since n = 17, df = 16, t-score for 95% confidence is 2.12
95% CI for population mean weight change is
$\bar{y} \pm t(se)$, which is 7.265 ± 2.12(1.736), or (3.58, 10.94)
We can predict that the population mean weight change is positive (i.e., the treatment is effective, on average), with value between about 3.6 and 10.9 pounds
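The SPSS summary and the interval above can be reproduced from the 17 raw values. This is a minimal sketch using only the Python standard library; the t-score 2.12 is taken from the slide (df = 16) rather than computed.

```python
import math
import statistics

# Weight changes (end minus beginning) for the n = 17 girls
changes = [11.4, 11.0, 5.5, 9.4, 13.6, -2.9, -0.1, 7.4, 21.5,
           -5.3, -3.8, 13.4, 13.1, 9.0, 3.9, 5.7, 10.7]

n = len(changes)
mean = statistics.mean(changes)
sd = statistics.stdev(changes)      # sample standard deviation (divisor n - 1)
se = sd / math.sqrt(n)

t_crit = 2.12                       # table value for df = 16, 95% confidence
lower = mean - t_crit * se
upper = mean + t_crit * se
print(f"mean = {mean:.3f}, sd = {sd:.3f}, se = {se:.3f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Small differences in the last decimal place relative to the slide come from rounding the intermediate values.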
Example 4

• Consider the problem of estimating the prevalence of a disease in 45 to 54 year-old women in Accra. Suppose that a random sample of n = 5000 women is selected from this age group and x = 28 are found to have the disease. Calculate the 95% confidence interval.

• Solution
• Point estimate: $\hat{p} = \frac{28}{5000} = 0.0056$
• $SE(\hat{p}) = \sqrt{\frac{0.0056(1 - 0.0056)}{5000}} = 0.0011$
• The 95% confidence interval is given by
• 0.0056 ± 1.96(0.0011) = (0.0034; 0.0078)
Exercise
• 1. A researcher found that of 472 mechanically ventilated patients,
63 had clinical evidence of ventilator-associated pneumonia.
Construct a 95 percent confidence interval for the proportion of all
mechanically ventilated patients at these hospitals who may be
expected to develop the disease.

• 2. 125 unemployed male high-school dropouts between the ages of 16 and 21 were sampled. 88 stated that they were regular consumers of alcoholic beverages. Construct a 95 percent confidence interval for the population proportion.
DETERMINING SAMPLE SIZE
• During the early stages of planning a survey, the question of how large a sample to take is important.
• There are different formulae for determining the appropriate sample size when different sampling techniques are used.
• The methods here assume simple random sampling, which gives every unit an equal probability of selection.
• Approaches to determining the sample size include
• using a census for small populations,
• imitating the sample size of similar studies,
• using published tables, and
• applying formulas to calculate a sample size.
Determining Sample Size

[Diagram: sample size may be determined for estimating the mean or for estimating the proportion.]

Sample size depends on:

• Size of the population standard deviation, σ
• The desired degree of reliability, z
• The desired margin of error (half the interval width), e
Determining Sample Size
The objectives in interval estimation are to obtain narrow intervals with high reliability. The width of the interval is determined by the magnitude of the quantity
(reliability coefficient) * (standard error of the estimator) = margin of error.

For the mean, the interval is

$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

so the sampling error (margin of error) is

$e = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Determining Sample Size - Cochran’s formula
(continued)
For the mean:

$e = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Now solve for n to get

$n = \frac{Z_{\alpha/2}^2 \sigma^2}{e^2}$
Determining Sample Size
• Thus, the sample size requires knowledge of σ², which is usually not known. It has to be estimated by either:
• 1. selecting a pilot sample and estimating σ with the sample standard deviation, S, or
• 2. using estimates from previous or similar studies.
Example 1
• A biostatistician is to advise on the size of a sample to be taken for a survey among a population of teenage girls to determine their average daily protein intake (measured in grams).
• How will he provide this assistance?
• These three items of information should be provided:
• (1) the desired width of the confidence interval,
• (2) the level of confidence desired, and
• (3) the magnitude of the population variance
Example 1 (continued)
• Assume an interval about 10 grams wide is desired (within 5 grams of the population mean in either direction, that is, a margin of error of 5 grams).
• Also assume that a confidence coefficient of .95 is decided on.
• From past experience, the population standard deviation is probably about 20 grams.
• Thus, z = 1.96, σ = 20, e = 5
• $n = \frac{(1.96)^2 (20)^2}{(5)^2} = 61.47$
• A sample of size 62 is advised
Sample Size Example 2

If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?

$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$

So the required sample size is n = 220
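Both sample-size examples for the mean follow the same recipe: compute n from the formula, then round up to the next whole number. A short sketch, with the hypothetical helper name `sample_size_for_mean`:

```python
import math

def sample_size_for_mean(z, sigma, e):
    """n = z^2 * sigma^2 / e^2, rounded up to the next whole unit."""
    n_exact = (z * sigma / e) ** 2
    return n_exact, math.ceil(n_exact)

# Protein intake example: z = 1.96, sigma = 20, e = 5
exact_1, n_1 = sample_size_for_mean(1.96, 20, 5)
# Second example: z = 1.645 (90% confidence), sigma = 45, e = 5
exact_2, n_2 = sample_size_for_mean(1.645, 45, 5)

print(f"{exact_1:.2f} -> {n_1}")
print(f"{exact_2:.2f} -> {n_2}")
```

Rounding up rather than to the nearest integer ensures the margin of error is no larger than the one requested.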
Exercise
• 1. To estimate the mean weight of babies born in a hospital, how large a sample of birth records should be taken if a 99 percent confidence interval that is 1 pound wide is desired? Assume that a reasonable estimate of the standard deviation is 1 pound.
• What sample size is required if the confidence coefficient is lowered to .95?
• 2. In order to estimate the mean age of persons bitten by dogs, a sample is to be drawn from the department’s records of reported dog bites. A 95 percent confidence interval is desired, a margin of error of 2.5 years is acceptable, and previous studies suggest that the population standard deviation is about 15 years.
• How large a sample should be drawn?
Determination Of Sample Size For Estimating Proportions -
Cochran’s formula
• This is essentially the same as that described for estimating a population mean.
• $n = \frac{z^2 p q}{e^2}$, where q = 1 - p
• p is the proportion in the population possessing the characteristic of interest. Since this is unknown, a pilot sample may be taken and an estimate computed to be used in place of p.
• If it is impossible to come up with a better estimate, one may set p equal to .5 and solve for n.
Example
1. How large a sample size do we need to estimate a population proportion to within 0.03, with probability 0.95? Assume the population proportion is 50%.
Solution
$n = \frac{(1.96)^2 (0.50)(0.50)}{(0.03)^2} = \frac{0.9604}{0.0009} = 1067.11$, so a sample of size 1068 is needed (rounding up, as in the examples for the mean).

2. To determine what proportion of families in a certain area are medically indigent, it is believed that the proportion cannot be greater than .35. A 95 percent confidence interval is desired with e = 0.05. What size sample of families should be selected?
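The first example can be verified numerically. The sketch below is illustrative; `sample_size_for_proportion` is a hypothetical helper name, and rounding up follows the convention used for the mean earlier in these slides.

```python
import math

def sample_size_for_proportion(z, p, e):
    """n = z^2 * p * q / e^2 with q = 1 - p, rounded up."""
    q = 1 - p
    n_exact = (z ** 2) * p * q / (e ** 2)
    return n_exact, math.ceil(n_exact)

# Example 1: z = 1.96, p = 0.50, e = 0.03
n_exact, n_required = sample_size_for_proportion(1.96, 0.50, 0.03)
print(f"{n_exact:.2f} -> {n_required}")
```

Setting p = 0.5 maximizes p(1 - p), so it gives the most conservative (largest) sample size when nothing better is known.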
Practice 1
• Determine the sample size that would be required to estimate
the true proportion to within .03 with 95 percent confidence of
adults living in a large metropolitan area having hepatitis B
virus. In a similar metropolitan area the proportion of adults
with the characteristic is reported to be .20.
• If data from another metropolitan area were not available and a pilot sample could not be drawn, what sample size would be required?
Practice 2
• An administrator at Alpha-Beta clinic, wishes to know what
proportion of discharged patients is unhappy with the care received
during hospitalization. How large a sample should be drawn if the
margin of error is assumed to be 0.05, the confidence coefficient is
.95, and no other information is available?
• How large should the sample be if p is approximated by 0.25?

Yamane’s or Slovin's formula

• This is an alternative to Cochran’s formula. According to Yamane, for a 95% confidence level and p = 0.5, the size of the sample should be

$n = \frac{N}{1 + N e^2}$

• where N is the population size and e is the level of precision
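A minimal sketch of the finite-population formula, assuming the usual form n = N / (1 + N e²); the function name `yamane_sample_size` is made up for this illustration, and published tables may round the result differently.

```python
def yamane_sample_size(population: int, e: float) -> float:
    """Yamane's simplified formula for 95% confidence and p = 0.5."""
    return population / (1 + population * e ** 2)

n_1000 = yamane_sample_size(1000, 0.05)   # +/-5% precision
n_500 = yamane_sample_size(500, 0.05)
print(f"N = 1000: {n_1000:.1f}   N = 500: {n_500:.1f}")
```

As N grows, the result approaches 1 / e², which is 400 for e = 0.05.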
Using Published Tables
• Table 1. Sample Size for ±5% and ±10% Precision Levels where Confidence Level is 95% and P = 0.5

Size of population   Sample size (n), ±5%   Sample size (n), ±10%
100                  81                     51
125                  96                     56
150                  110                    61
200                  134                    67
250                  154                    72
300                  172                    76
350                  187                    78
400                  201                    81
450                  212                    82
500                  222                    83
1000                 286                    91
2000                 333                    95
3000                 353                    97
4000                 364                    98
5000                 370                    98
7000                 378                    99
9000                 383                    99
10000                385                    99
HYPOTHESIS TESTING
• Overview
• Basics of Hypothesis Testing
• Key Concepts in Hypothesis Testing
• Testing a Claim About a Mean: σ Known
• Testing a Claim About a Mean: σ Not Known

HYPOTHESIS TESTING
• Overview
• Hypothesis testing is the second of two general areas of
statistical inference. The main goal in many research studies
is to check whether the data collected support certain
statements or predictions.
• A hypothesis test involves collecting data from a sample and
evaluating the data. Then, a decision is made as to whether or
not there is sufficient evidence, based upon analyses of the
data, to reject the null hypothesis.
• Hypothesis testing consists of two contradictory hypotheses
or statements, a decision based on the data, and a conclusion
Basic Concepts
• A hypothesis may be defined simply as a statement about one
or more populations
• It is frequently concerned with the parameters of the
populations about which the statement is made.
• A hospital administrator may hypothesize that the average
length of stay of patients admitted to the hospital is 5 days;
• A public health nurse may hypothesize that a particular
educational program will result in improved communication
between nurse and patient;
• A physician may hypothesize that a certain drug will be effective in 90 percent of the cases for which it is used.
Basic Concepts
• The null hypothesis - H₀ - is a statement that the value of a population parameter (such as proportion, mean, or standard deviation) is equal to some claimed value.
• The null hypothesis states that the “null” condition exists; that is, there is nothing new happening, the old theory is still true, the old standard is correct, and the system is in control.
• The alternative hypothesis – H₁ or HA - is the statement that the parameter has a value that somehow differs from the null hypothesis. The alternative hypothesis, on the other hand, states that the new theory is true, there are new standards, the system is out of control, and/or something is happening
Basic Concepts
• NOTE: new hypotheses that researchers want to “prove” are stated in the alternative hypothesis.
• The alternative hypothesis must use one of these symbols: ≠, <, >
Identifying Null and Alternative Hypotheses
• 1. In 2013, 70% of 18-year-old Ghanaians participated in volunteer work.
• i) A researcher believes that this percentage is different today.
• ii) A researcher believes that this percentage is lower today than in 2013
• iii) A researcher believes that this percentage is higher today than in 2013
Basic Concepts - Solution
• In all three cases the null hypothesis is H₀: p = 70%; the percentage of 18-year-old Ghanaians who participate in volunteer work is the same today as in 2013.
• i) H₁: p ≠ 70%; the percentage of 18-year-old Ghanaians who participate in volunteer work today is different from the 70% of 2013.
• ii) H₁: p < 70%; the percentage of 18-year-old Ghanaians who participate in volunteer work today is lower than the 70% of 2013.
• iii) H₁: p > 70%; the percentage of 18-year-old Ghanaians who participate in volunteer work today is higher than the 70% of 2013.
Class Exercise
• Identify the Null and Alternative Hypothesis. Express the
corresponding null and alternative hypotheses in symbolic form.
• a) The proportion of drivers who admit to running red lights is
greater than 0.5.
• b) The mean height of professional basketball players is at most
7ft.
• c) The standard deviation of IQ scores of actors is equal to 15.

Solution
• a) The proportion of drivers who admit to running red lights is greater than 0.5.
• H₀: p = 0.5
• H₁: p > 0.5
• b) The mean height of professional basketball players is at most 7 ft.
• H₀: µ = 7 (the claim µ ≤ 7 contains equality, so it belongs with the null hypothesis)
• H₁: µ > 7
• c) The standard deviation of IQ scores of actors is equal to 15.
• H₀: σ = 15
• H₁: σ ≠ 15
Test Statistic
• The test statistic is a value used in making a decision about
the null hypothesis, and is found by converting the sample
statistic to a score with the assumption that the null
hypothesis is true.

Critical Value

• A critical value is any value that separates the critical region (where we reject the null hypothesis) from the values of the test statistic that do not lead to rejection of the null hypothesis. The critical values depend on the nature of the null hypothesis, the sampling distribution that applies, and the significance level α.
Critical Region

• The critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null hypothesis.

• The significance level (denoted by α) is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true.
Two-tailed Test
• H0 : = ……

Right-tailed Test and Left-tailed Test
Significance Level
• The term level of significance reflects the fact that hypothesis tests are sometimes called significance tests, and a computed value of the test statistic that falls in the rejection region is said to be significant.
• The level of significance, α, specifies the area under the curve of the distribution of the test statistic that is above the values on the horizontal axis constituting the rejection region.
• The more frequently encountered values of α are .01, .05, and .10
Types of Errors
• 1. Type I Error - the error committed when a true null hypothesis is rejected.
• 2. Type II Error - the error committed when a false null hypothesis is not rejected.
HYPOTHESIS TESTING
• To perform a hypothesis test:
• 1. Set up two contradictory hypotheses (null hypothesis and
alternative hypothesis)
• 2. Collect sample data
• 3. Determine the correct distribution to perform the
hypothesis test.
• 4. Compute the test statistic:
test statistic = (relevant statistic - hypothesized parameter) / (standard error of the relevant statistic)
• 5. Make a decision and write a meaningful conclusion.
HYPOTHESIS TESTING
• Conclusion. If the null hypothesis is rejected, we conclude that the
alternative is true. If the null hypothesis is not rejected, the
conclusion is that the null hypothesis may be true.
• A p value is the probability that the computed value of a test
statistic is at least as extreme as a specified value of the test
statistic when the null hypothesis is true.
• Thus, the p value is the smallest value of α for which we can reject
a null hypothesis.
• Reject H0 if the P-value ≤ α (where α is the significance level, such as 0.05).
• Fail to reject H0 if the P-value > α.
HYPOTHESIS TESTING
• The purpose of hypothesis testing is to assist making
decisions.
• The administrative or clinical decision usually depends on
the statistical decision.
• If the null hypothesis is rejected, the administrative or
clinical decision is compatible with the alternative
hypothesis. The reverse is usually true if the null
hypothesis is not rejected. The administrative or clinical
decision, however, may take other forms, such as a
decision to gather more data
Example

• Blood glucose levels for obese patients have a mean of 100 with a standard deviation of 15. A researcher thinks that a diet high in raw cornstarch will have an effect, either positive or negative, on blood glucose levels. A sample of 30 patients who have tried the raw cornstarch diet has a mean glucose level of 140. Test the hypothesis that the raw cornstarch had an effect.
Solution
• Step 1: H₀: μ = 100
• Step 2: H₁: μ ≠ 100
• Step 3: We’ll use α = 0.05 for this example. As this is a two-tailed test, split the alpha into two: 0.05/2 = 0.025
• Step 4: The z-score for an area of 0.5 - 0.025 = 0.475 is 1.96.
• Step 5: Find the test statistic using this formula:
• $z = \frac{140 - 100}{15/\sqrt{30}} = 14.61$
• Step 6: If the test statistic from Step 5 is less than -1.96 or greater than 1.96 (Step 4), reject the null hypothesis. In this case it is greater, so you can reject the null hypothesis.
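The six steps above can be sketched in a few lines of Python. This is an illustration only, using the standard library's `NormalDist` in place of the z table; the function name `one_sample_z` is hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z(xbar, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

alpha = 0.05
z = one_sample_z(xbar=140, mu0=100, sigma=15, n=30)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # two-tailed p-value

reject_h0 = abs(z) >= z_crit
print(f"z = {z:.2f}, critical = ±{z_crit:.2f}, reject H0: {reject_h0}")
```

The p-value route gives the same decision: H₀ is rejected whenever `p_value <= alpha`.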
Example
• Researchers are interested in the mean age of a certain population. The data available to the researchers are the ages of a simple random sample of 10 individuals drawn from the population of interest. From this sample a mean of 27 has been computed, and the variance is taken to be 20, treated here as the known population variance. It is assumed that the sample comes from a population whose ages are approximately normally distributed. Can we conclude that the mean age of this population is different from 30 years? Let α = .05.
• Solution
• H₀: μ = 30
• H₁: μ ≠ 30
Solution
• The decision rule: reject H₀ if the computed value of the test statistic is either ≥ 1.96 or ≤ -1.96; otherwise do not reject H₀.
• Test statistic: $z = \frac{27 - 30}{\sqrt{20/10}} = -2.12$
• Statistical decision
• Reject the null hypothesis since -2.12 is in the rejection region, that is, the computed value of the test statistic is significant at the .05 level.
• The conclusion is that the population mean is not equal to 30.
HYPOTHESIS TESTING
• Sampling from Normally Distributed Populations with
Population Variance unknown

• We wish to check whether normal body temperature may be less than 98.6 degrees. In a random sample of n = 18 individuals, the sample mean was found to be 98.217 and the standard deviation was .684. Assume the population is normally distributed. Use alpha = 0.05.
• H0 : μ = 98.6
• HA: μ < 98.6
Solution
• Left-tailed test, α = 0.05, df = 18 - 1 = 17

• t critical value = -1.740

• $t = \frac{98.217 - 98.6}{0.684/\sqrt{18}} = -2.375631 \approx -2.38$

• Our test value is smaller than the critical value of -1.74, so we reject H0.

• We have enough evidence to support the claim that average body temperature is less than 98.6 degrees
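The same calculation as a short Python sketch. The critical value is entered from the t table (df = 17, one tail, α = 0.05) because the standard library has no t distribution; the variable names are illustrative.

```python
import math

n = 18
xbar = 98.217        # sample mean
s = 0.684            # sample standard deviation
mu0 = 98.6           # hypothesized mean under H0

t_stat = (xbar - mu0) / (s / math.sqrt(n))
t_crit = -1.740      # table value: left tail, alpha = 0.05, df = n - 1 = 17

# Left-tailed test: reject H0 when the statistic falls at or below the critical value
reject_h0 = t_stat <= t_crit
print(f"t = {t_stat:.2f}, critical = {t_crit}, reject H0: {reject_h0}")
```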
Exercise
• 1. The ages (years) of 16 subjects with eye defects are: 62,
62, 68, 48, 51, 60, 51, 57, 57, 41, 62, 50, 53, 34, 62, 61. Can
we conclude that the mean age of the population from which
the sample may be presumed to have been drawn is less than
60 years? Let α = .05
• 2. A sample of 18 patients were investigated concerning their
oral status. The mean teeth index value was 10.3 with a
standard deviation of 7.3. Is this sufficient evidence to allow
us to conclude that the mean index is greater than 9.0 in a
population of similar subjects?
Exercise
• 3. A study was made of a sample of 25 records of patients seen on an outpatient basis at a chronic disease hospital. The mean number of outpatient visits per patient was 4.8, and the sample standard deviation was 2. Can it be concluded from these data that the population mean is greater than four visits per patient? Let the probability of committing a type I error be .05. What assumptions are necessary?
• 4. Forty-nine adolescents served as the subjects in a study. The variable of interest was the diameter of skin test reaction to an antigen. The sample mean and standard deviation were 21 and 11 mm, respectively. Can it be concluded from these data that the population mean is less than 30?
Exercise
• 5. To find out whether the mean daily caloric intake in the adult rural population of a developing country is less than 2000, a sample of 500 was taken; it had a mean of 1985 and a standard deviation of 210. Use a significance level of 5%.
• 6. A survey of 100 similar-sized hospitals revealed a mean daily census in the pediatrics service of 27 with a standard deviation of 6.5. Do these data provide sufficient evidence to indicate that the population mean is greater than 25? Let α = .10; α = .05
Analysis of Variance
• Analysis of variance (ANOVA) is a method of testing the equality of three or more population means by analyzing sample variances
• It is a technique for comparing the means of three or more different populations or samples.
• The one-way analysis of variance is used to test the claim that three or more population means are equal.
• This is an extension of the two independent samples t-test.
• When there are 3 or more means being compared, statistical significance can be ascertained by conducting one statistical test, ANOVA, or by repeated t-tests; repeated t-tests, however, inflate the overall probability of a Type I error.
WHEN TO USE ANOVA
• ANOVA is used in applications such as the following:
• 1. To determine if there is sufficient evidence to support the claim that three groups have different mean blood pressure levels when group 1 is treated with 2 tablets of aspirin each day, group 2 with 1 tablet each day, and group 3 with a placebo.
• 2. To test the claim that the cereals on different shelves have the same mean sugar content, since it is believed that supermarkets place high-sugar cereals on shelves that are at eye level for children.
Assumptions of one way ANOVA
• Populations are normally distributed
• The data are randomly sampled and independently chosen from the populations
• The populations have equal variances.
• ANOVA is based on a comparison of two different estimates of the common population variance: the variance among (between) samples and the variance within samples.
• It is one-way since the sample data are separated into groups according to one characteristic, or factor.
One-Way ANOVA
• Hypotheses
• H₀: μ₁ = μ₂ = μ₃ = … = μₖ
• All population means are equal
• i.e., no factor effect (no variation in means among groups)
• H₁ :Not all of the population means are the same
• At least one population mean is different
• i.e., there is a factor effect
• Does not mean that all population means are different (some
pairs may be the same)
One-Way ANOVA
• ANOVA analyzes the variance among values.
• It measures the variation by summing the squares of the differences between each value and the mean.
• This is called the sum of squares.
• The variation has two components when the data are from several groups:
• 1. variation from differences among the group means (between-groups sum of squares)
• 2. variation from differences among the subjects within each group (within-groups sum of squares)
Computing a one-way ANOVA
• Here is the basic one-way ANOVA table, where k is the number of groups and N is the total number of observations:

Source            SS    df      MS                     F
Between (Among)   SSA   k - 1   MSA = SSA / (k - 1)    F = MSA / MSW
Within            SSW   N - k   MSW = SSW / (N - k)
Total             SST   N - 1
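To make the table concrete, the sketch below fills in every cell for a small data set. The three groups are hypothetical numbers invented for this illustration (they do not come from the slides), and the variable names simply mirror the table entries.

```python
# Hypothetical data: three groups of three observations each (illustration only)
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

k = len(groups)                                   # number of groups
N = sum(len(g) for g in groups)                   # total number of observations
grand_mean = sum(sum(g) for g in groups) / N
group_means = [sum(g) / len(g) for g in groups]

# Between-groups sum of squares: group size times squared deviation of each group mean
ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Within-groups sum of squares: squared deviations from each group's own mean
ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
sst = ssa + ssw                                   # total sum of squares

msa = ssa / (k - 1)
msw = ssw / (N - k)
f_stat = msa / msw

print(f"SSA = {ssa:.2f}, SSW = {ssw:.2f}, SST = {sst:.2f}")
print(f"MSA = {msa:.2f}, MSW = {msw:.2f}, F = {f_stat:.2f}")
```

The computed F would then be compared with the tabulated F value for k - 1 and N - k degrees of freedom.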
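Decision Rule
• Reject H₀ if the computed F is greater than or equal to the critical value of F with k - 1 and N - k degrees of freedom at the chosen α; otherwise do not reject H₀.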
Class exercise
• A study was conducted to test the question as to whether cigarette smoking is associated with reduced serum levels in men aged 35 to 45. The outcome is as follows:

Source            SS       df    MS       F
Between (Among)   0.7248   3     0.2416   9.152
Within            8.1516   309   0.0264
Total             8.8764   312

a. What is the null hypothesis?
b. What is the alternative hypothesis?
c. Identify the value of the test statistic.
d. Find the critical value for a 0.05 significance level.
e. How many groups are there?
f. Based on the preceding results, what do you conclude about equality of the population means?
Chi-Square Test
• The chi-square test is the most frequently employed statistical technique for the analysis of count or frequency data.
• One may wish to know, for the population from which the sample was drawn, if a certain variable differs according to gender.
• The data may consist of frequencies for the categories of one variable and for the categories of another variable.
• One might want to know if, in the population from which the sample was drawn, there is a relationship between the variables of interest.
• Chi-square assumes values between 0 and infinity
• Chi-square is used for testing hypotheses where the data available for analysis are in the form of frequencies.
Types of Chi-Square Tests
• Tests of Goodness-of-fit - appropriate when one wishes to decide
whether an observed distribution of frequencies is compatible with a
preconceived or hypothesized distribution (e.g., normal or binomial).
• Tests of Independence - to test the null hypothesis that two criteria
of classification, when applied to the same set of entities, are
independent. For example, if socioeconomic status and area of
residence of the inhabitants of a certain city are independent, we
would expect to find the same proportion of families in the low,
medium, and high socioeconomic groups in all areas of the city.
• Tests of Homogeneity – to test the null hypothesis that samples are
drawn from populations that are homogeneous with respect to
some criterion of classification.
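For a test of independence, the frequencies expected under the null hypothesis are usually obtained from the table margins: the expected count in a cell is its row total times its column total, divided by the grand total. The sketch below illustrates that calculation. The function name `expected_frequencies` and the counts (two areas of a city by low, medium, and high socioeconomic status) are hypothetical and are not data from the slides.

```python
def expected_frequencies(table):
    """Expected cell counts under independence for a table of observed counts.

    Each expected count is (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts: rows are two areas of a city,
# columns are low, medium, and high socioeconomic status.
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
expected = expected_frequencies(observed)
print(expected)
```

If status and area were independent, each area would show the same proportions of low, medium, and high status families, which is why both rows of the expected table are identical here.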
Chi-Square Tests
• The chi-square statistic is most appropriate for use with
categorical variables, such as marital status (married, single,
widowed, and divorced).
• The quantitative data used in the computation of the test statistic
are the frequencies associated with each category of the one or
more variables under study.
• There are two sets of frequencies with which we are concerned,
observed frequencies and expected frequencies. The observed
frequencies are the number of subjects or objects in our sample
that fall into the various categories of the variable of interest.
The expected frequencies are the number of subjects or objects
we would expect to observe if the null hypothesis were true.
Chi-Square Tests
• The computed value of χ² is compared with the tabulated χ²
value with k-r degrees of freedom. k is equal to the number
of groups available, and r is the number of restrictions or
constraints imposed.
• The decision rule, then, is: Reject H₀ if the computed χ² is greater
than or equal to the tabulated χ² for the chosen value of α.
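The comparison described above can be sketched for a goodness-of-fit test. The slides do not spell out the statistic, so this assumes the usual form: the sum over categories of (observed - expected)² / expected. The function name, the observed counts, and the hypothesized proportions for the four marital-status categories are all hypothetical. The critical value 7.815 is the tabulated χ² for α = 0.05 with 3 degrees of freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical sample of 100 people: married, single, widowed, divorced.
observed = [45, 35, 8, 12]

# Hypothetical proportions stated by the null hypothesis.
hypothesized = [0.5, 0.3, 0.1, 0.1]
n = sum(observed)
expected = [n * p for p in hypothesized]

chi2 = chi_square_statistic(observed, expected)

k = len(observed)  # number of groups (categories)
r = 1              # one restriction: expected counts must sum to n
df = k - r

# Tabulated chi-square for alpha = 0.05 with 3 degrees of freedom.
chi2_critical = 7.815
reject_h0 = chi2 >= chi2_critical
print(round(chi2, 4), df, reject_h0)
```

With these made-up counts the computed statistic falls below the tabulated value, so H₀ would not be rejected at the 0.05 level.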
