STAT 266 - Lecture 2

The document discusses various statistical distributions including the t-distribution, chi-square distribution, F-distribution and how they relate to constructing confidence intervals. Specifically: - The t-distribution is used to estimate population means and compare two population means when sample sizes are small (less than 30). - The chi-square distribution is used to test hypotheses about categorical data and determine critical values. - The F-distribution is used to compare the variances of two populations by taking the ratio of their sample variances. - Confidence intervals can be constructed for a population mean, variance, proportion or ratio of variances using the relevant distribution and critical values based on the sample size and desired confidence level

Outline

1 Sampling Distributions of Other Statistics

2 Confidence Intervals
C.I for Population Variance and Ratio of two Variances

DR S.K. Appiah () Math 271 November 21, 2016 1 / 44


The Student t-distribution

The Student t-distribution is used to estimate the population mean (µ) and to compare two population means when the sample size(s) is/are small (n < 30). The distribution is as shown below:



Properties of t-distribution

It is a continuous distribution which is symmetrical about its mean, t = 0.
It has a variance greater than 1, which approaches 1 as n → ∞.
It is less peaked at the center and has heavier tails than the Normal distribution.
The shape of the distribution depends on its degrees of freedom, (n − 1); it approaches the normal distribution as (n − 1) → ∞.
The random variable t ranges from −∞ to ∞.
The probability \(P(t \ge t_{\alpha,(n-1)}) = \alpha\), where \(t_{\alpha,(n-1)}\) is the critical value.



Determination of Critical Values

The following are some critical values of the t-distribution read from the table:
\(t_{0.10,15} = 1.341\)
\(t_{0.025,24} = 2.064\)
\(t_{0.01,5} = 3.365\)
\(t_{0.05,4} = 2.132\)
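The table values above can be checked numerically. The following is a minimal sketch, assuming only the Python standard library: it integrates the t density with Simpson's rule and confirms that the upper-tail area beyond each critical value is close to the stated α. The helper names (`t_pdf`, `simpson`, `t_upper_tail`) are illustrative and not part of the lecture.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log(1 + t * t / df))

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def t_upper_tail(t, df):
    """P(T >= t) for t >= 0, using the symmetry of the density about 0."""
    return 0.5 - simpson(lambda u: t_pdf(u, df), 0.0, t)

# (alpha, degrees of freedom, critical value) as listed on the slide
table = [(0.10, 15, 1.341), (0.025, 24, 2.064), (0.01, 5, 3.365), (0.05, 4, 2.132)]
for alpha, df, crit in table:
    tail = t_upper_tail(crit, df)
    print(f"df={df:2d}  t={crit:.3f}  P(T >= t)={tail:.4f}  table alpha={alpha}")
```

Each printed tail area should agree with the table α up to the rounding of the three-decimal critical values.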



The Chi-Square Distribution



Graph of Chi-square Distribution



Properties of Chi-Square Distribution

It is a continuous distribution.
The critical values, \(\chi^2_{\alpha,(n-1)}\), are all positive.
The shape of the distribution depends on its degrees of freedom, (n − 1). The diagram below shows the various curves associated with their degrees of freedom, (n − 1).
The distribution is markedly skewed to the right when the degrees of freedom are small.
The skewness diminishes rapidly, and the distribution becomes approximately normal, as the degrees of freedom increase to 30 or more.



Determination of Critical Values

\(\chi^2_{0.025,(21)} = 35.4789\)
\(\chi^2_{0.95,(17)} = 8.67176\)
\(\chi^2_{0.05,(4)} = 9.48773\)
Using 12 degrees of freedom, \(P(\chi^2 \ge 18.5494) = 0.10\)
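As a hedged check of these table entries, the sketch below integrates the chi-square density numerically (standard library only) and reports the upper-tail area beyond each listed value. The helper names are illustrative assumptions; the density shortcut at x = 0 is only valid for more than 2 degrees of freedom, which covers every case here.

```python
import math

def chi2_pdf(x, df):
    """Chi-square density; returns 0 at x <= 0, which is correct for df > 2."""
    if x <= 0:
        return 0.0
    k = df / 2
    return math.exp((k - 1) * math.log(x) - x / 2 - k * math.log(2) - math.lgamma(k))

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def chi2_upper_tail(x, df):
    """P(chi-square >= x) computed as 1 minus the area from 0 to x."""
    return 1.0 - simpson(lambda u: chi2_pdf(u, df), 0.0, x)

# (alpha, degrees of freedom, critical value) as listed on the slide
table = [(0.025, 21, 35.4789), (0.95, 17, 8.67176), (0.05, 4, 9.48773), (0.10, 12, 18.5494)]
for alpha, df, crit in table:
    tail = chi2_upper_tail(crit, df)
    print(f"df={df:2d}  x={crit:8.5f}  P(chi2 >= x)={tail:.4f}  table alpha={alpha}")
```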



F-Distribution

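This is used when we want to compare the variances of two populations. Let the sample variances be \(s_1^2\) and \(s_2^2\); then the ratio
\[ F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} = \frac{s_1^2}{s_2^2} \]
(the second equality holding when \(\sigma_1^2 = \sigma_2^2\)) has a sampling distribution called the F-distribution with degrees of freedom \(v_1 = (n_1 - 1)\) and \(v_2 = (n_2 - 1)\).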



Graph of F-distribution

Below is an F-distribution graph



Critical Values

We can read the following critical values from the various F-distribution tables:
(i) \(F_{0.10,(1,6)} = 3.78\)
(ii) \(F_{0.05,(5,19)} = 2.74\)
(iii) \(F_{0.01,(12,5)} = 9.89\)
(iv) \(F_{0.10,(15,10)} = 2.24\)
(v) \(F_{0.025,(5,10)} = 4.24\)
(vi) \(F_{0.99,(12,5)} = (F_{0.01,(5,12)})^{-1} = (5.06)^{-1} = 0.198\)
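Entry (vi) uses the reciprocal property of the F-distribution. A short sketch of why it holds, assuming \(F \sim F(v_1, v_2)\), so that \(1/F \sim F(v_2, v_1)\):
\[
P\!\left(F \ge F_{1-\alpha,(v_1,v_2)}\right) = 1-\alpha
\;\Longrightarrow\;
P\!\left(\frac{1}{F} \ge \frac{1}{F_{1-\alpha,(v_1,v_2)}}\right) = \alpha
\;\Longrightarrow\;
\frac{1}{F_{1-\alpha,(v_1,v_2)}} = F_{\alpha,(v_2,v_1)}
\]
With \(\alpha = 0.01\), \(v_1 = 12\) and \(v_2 = 5\), this gives \(F_{0.99,(12,5)} = 1/F_{0.01,(5,12)} = 1/5.06 \approx 0.198\).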





Construction of Confidence Intervals

Interval estimation provides a range of values believed to contain the unknown parameter with a stated degree of confidence. A point estimate with its associated standard error gives a measure of the range of likely values for the parameter.
We consider a practical example to illustrate the construction of confidence intervals. Suppose we wish to estimate the mean yearly income of workers in an establishment. We randomly draw, say, 20 of the \({}^{N}C_{n}\) possible samples, each of size n = 20 observations, and construct a confidence interval for the true mean yearly income, µ, for each sample as shown in the diagram below.

The true mean income (µ), which is fixed, is represented by the vertical line, while the thick horizontal line segments represent the random intervals of the samples. We note that the random intervals vary about µ and that the intervals for samples 6 and 19 do not contain µ, while the rest capture it. Hence the required interval estimator for µ is obtained with \(\left(\frac{18}{20}\right)(100\%) = 90\%\) degree of confidence.
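The repeated-sampling idea can be imitated with a small simulation. This is only a sketch under stated assumptions: incomes are taken to be normally distributed with a hypothetical mean of 50,000 and a known standard deviation of 8,000 (neither value comes from the lecture), and each interval is a 90% z-interval.

```python
import random
import statistics

def coverage_experiment(mu, sigma, n, num_samples, z, seed=0):
    """Count how many of num_samples z-intervals (sigma known) contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        half_width = z * sigma / n ** 0.5
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits

# Critical value Z_{alpha/2} for a 90% interval (alpha = 0.10)
z90 = statistics.NormalDist().inv_cdf(0.95)

# Hypothetical population values, chosen only for illustration
hits = coverage_experiment(mu=50_000, sigma=8_000, n=20, num_samples=20, z=z90)
print(f"{hits} of 20 intervals contain mu")
```

With only 20 samples the count fluctuates from run to run; over many repetitions the proportion of intervals that capture µ settles near 0.90.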



Suppose the sampling distribution of the estimator \(\hat{\theta}\) for \(\theta\) is normally distributed. Then the critical value, k, takes a standard score, \(Z_{\alpha/2}\), for large n.

The maximum error of estimation of the parameter \(\theta\), denoted by E, is defined as:
\[ E = Z_{\alpha/2}\, Se(\bar{x}) \quad \text{for } n \ge 30 \]
Confidence Intervals for Population Mean and Proportion
The (1 − α)100% confidence interval for the population mean (µ) may be constructed for two cases: large sample size and small sample size.
When n is large (n ≥ 30) and \(\sigma^2\) is known, the (1 − α)100% confidence interval for µ is given by
\[ \bar{x} \pm Z_{\alpha/2}\, Se(\bar{x}) = \bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]
where σ is estimated by the sample standard deviation s, if it is unknown.
The maximum error of estimation is
\[ E = Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]
from which we obtain
\[ n = \frac{Z_{\alpha/2}^2\, \sigma^2}{E^2} \]
When n is small (n < 30), the sampling distribution of \(\bar{x}\) follows the t-distribution and the confidence interval becomes
\[ \bar{x} \pm t_{\alpha/2,(n-1)}\, Se(\bar{x}) = \bar{x} \pm t_{\alpha/2,(n-1)} \cdot \frac{s}{\sqrt{n}} \]
where \(t_{\alpha/2,(n-1)}\) is the critical value obtained from the t-distribution with (n − 1) degrees of freedom.



Confidence interval for proportion

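\[ \hat{p} \pm Z_{\alpha/2}\, Se(\hat{p}) = \hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\frac{p(1-p)}{n}} \approx \hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
where \(\hat{p} = x/n\) is the point estimator for p, n ≥ 30, and both \(n\hat{p}\) and \(n(1-\hat{p})\) are ≥ 5.
The error of estimation becomes
\[ E = Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
from which we have
\[ n = \frac{Z_{\alpha/2}^2\, \hat{p}(1-\hat{p})}{E^2} \]
If there is no prior knowledge of p or \(\hat{p}\), \(\hat{p}(1-\hat{p})\) is replaced by its maximum value 0.25.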



Illustrative Example

1.(a) A sample survey conducted in a city showed that 180 families spend on average $180.45 per week on food, with a standard deviation of $22.60. What can we say with 95% confidence about the maximum error if $180.45 is used as an estimate of the actual average weekly food expenditure of families in the population? Obtain the interval estimate for this population average and comment on your result.



Solution

Given a confidence coefficient of 1 − α = 0.95, where α = 0.05 and \(Z_{\alpha/2} = 1.96\), we compute the maximum error of estimation
\[ E = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = (1.96) \frac{22.60}{\sqrt{180}} = 3.30 \]
The actual population average is estimated within the interval estimate
\[ \bar{x} \pm E = 180.45 \pm 3.30 = (177.15, 183.75) \]
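The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch using the standard library's `statistics.NormalDist` for the critical value; the function name `mean_ci_large_sample` is an illustrative choice, not something defined in the lecture.

```python
from statistics import NormalDist

def mean_ci_large_sample(xbar, s, n, confidence):
    """Large-sample interval: xbar +/- Z_{alpha/2} * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    error = z * s / n ** 0.5
    return error, (xbar - error, xbar + error)

error, (lower, upper) = mean_ci_large_sample(xbar=180.45, s=22.60, n=180, confidence=0.95)
print(f"E = {error:.2f}, interval = ({lower:.2f}, {upper:.2f})")
# prints: E = 3.30, interval = (177.15, 183.75)
```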



Example 2

The financial aid office of a University wishes to estimate the mean cost of recommended textbooks per semester for students. In order for the estimate to be useful, they want it to be within $8,500 of the true mean cost.
(i) How large a sample of semesters should be considered in order to be 99% confident of achieving this level of accuracy? Assume a standard deviation of $40,000.
(ii) Obtain a 95% interval estimate for the true mean cost if a random sample of size 40 has a mean of $80,000 and a standard deviation of $35,000.



Solution
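One possible working, sketched in Python with the large-sample formulas \(n = Z_{\alpha/2}^2 \sigma^2 / E^2\) and \(\bar{x} \pm Z_{\alpha/2}\, s/\sqrt{n}\) from the preceding slides. The function names are illustrative, and the critical values come from `statistics.NormalDist` rather than a printed table, so a rounded table value such as 2.58 can shift the required sample size by one.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_for_mean(sigma, error, confidence):
    """Smallest n with Z_{alpha/2} * sigma / sqrt(n) <= error."""
    return math.ceil((z_critical(confidence) * sigma / error) ** 2)

def mean_ci(xbar, s, n, confidence):
    """Large-sample interval xbar +/- Z_{alpha/2} * s / sqrt(n)."""
    e = z_critical(confidence) * s / math.sqrt(n)
    return xbar - e, xbar + e

# Sample size for 99% confidence, error 8,500, sigma 40,000
n_required = sample_size_for_mean(sigma=40_000, error=8_500, confidence=0.99)

# 95% interval from n = 40, mean 80,000, s = 35,000
lower, upper = mean_ci(xbar=80_000, s=35_000, n=40, confidence=0.95)

print(f"required n = {n_required}")
print(f"95% interval = ({lower:,.0f}, {upper:,.0f})")
```

Under these assumptions the required sample size comes out at 147, and the 95% interval is roughly ($69,154, $90,846).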



Example 3

To estimate the proportion of unemployed, an economist selected 400 persons at random from a large group of people in a community and found that 26 persons were unemployed.
(i) Estimate the true proportion of unemployed using a confidence coefficient of 0.90.
(ii) How many persons must be sampled to reduce the error of estimation to 0.012?



Solution
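One possible working, sketched in Python with the proportion formulas \(\hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\) and \(n = Z_{\alpha/2}^2\, \hat{p}(1-\hat{p})/E^2\) given earlier. The function names are illustrative; the critical value is computed rather than read from a table, so using the rounded value 1.645 can change the required sample size by one.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def proportion_ci(successes, n, confidence):
    """Point estimate and large-sample interval for a population proportion."""
    p_hat = successes / n
    e = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, (p_hat - e, p_hat + e)

def sample_size_for_proportion(p_hat, error, confidence):
    """Smallest n whose error of estimation does not exceed `error`."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / error ** 2)

p_hat, (lower, upper) = proportion_ci(successes=26, n=400, confidence=0.90)
n_required = sample_size_for_proportion(p_hat, error=0.012, confidence=0.90)

print(f"p_hat = {p_hat:.3f}, 90% interval = ({lower:.4f}, {upper:.4f})")
print(f"required n = {n_required}")
```

Here \(\hat{p} = 26/400 = 0.065\), the 90% interval is approximately (0.045, 0.085), and about 1142 persons are needed for an error of 0.012.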



C.I. for Difference between Two Population Means

Determining an interval estimate for the difference between two population means \((\mu_1 - \mu_2)\) is a means of comparing two population parameters. The (1 − α)100% confidence interval for \((\mu_1 - \mu_2)\), as for the single parameter µ, is considered for two cases of sample size.
If the sample sizes are large, \(n_1 \ge 30\) and \(n_2 \ge 30\), we have
\[ (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\, Se(\bar{x}_1 - \bar{x}_2) = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
\(\sigma_1^2\) and \(\sigma_2^2\) are estimated using \(s_1^2\) and \(s_2^2\) respectively, if they are unknown.



Cont’d

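If \(n_1 = n_2 = n\), then the error of estimation will be given by
\[ E = Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = Z_{\alpha/2} \sqrt{\frac{\sigma_1^2 + \sigma_2^2}{n}} \]
from which we have
\[ n = \frac{Z_{\alpha/2}^2\, (\sigma_1^2 + \sigma_2^2)}{E^2} \]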



Small Sample Size (n1 < 30, and n2 < 30)

If the sample sizes are small, the two population variances are assumed to be equal (i.e. \(\sigma_1^2 = \sigma_2^2\)) and the interval becomes
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,(n_1+n_2-2)}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]
where
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
which is the estimate of the common variance \(\sigma^2\).



C.I Difference in means of Paired Observation

The confidence interval for \(\mu_d\) is given by
\[ \bar{d} \pm t_{\alpha/2,(n-1)} \cdot \frac{s_d}{\sqrt{n}} \]
where n is the number of sample observations, and \(\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i\) is the mean difference between the paired sample observations.
The standard deviation of the paired differences is given as
\[ s_d = \sqrt{\frac{1}{n-1}\left[\sum_{i=1}^{n} d_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} d_i\right)^2\right]} \]
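The paired-difference formulas can be sketched in a few lines of Python. The five differences below are hypothetical values invented purely for illustration, and the critical value \(t_{0.025,4} = 2.776\) is supplied from a standard t table rather than computed.

```python
import math

def paired_ci(differences, t_crit):
    """Mean difference, s_d (computational formula) and the interval d_bar +/- t * s_d / sqrt(n)."""
    n = len(differences)
    total = sum(differences)
    sum_sq = sum(d * d for d in differences)
    d_bar = total / n
    s_d = math.sqrt((sum_sq - total ** 2 / n) / (n - 1))
    e = t_crit * s_d / math.sqrt(n)
    return d_bar, s_d, (d_bar - e, d_bar + e)

# Hypothetical paired differences (e.g. after minus before); not from the lecture
diffs = [2, 4, 1, 3, 5]
d_bar, s_d, (lower, upper) = paired_ci(diffs, t_crit=2.776)  # t_{0.025,4} from a t table
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```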



C.I for Difference in Proportion

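The (1 − α)100% confidence interval for \((p_1 - p_2)\) is given by
\[ (\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \]
where \(n_1\) and \(n_2\) are large and \(\hat{p}_1 = x_1/n_1\) and \(\hat{p}_2 = x_2/n_2\).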



Illustrative Example 1

(a) A mid-semester examination in Statistics was given to 55 students randomly selected from class A and also to another 40 students randomly selected from class B. The mean scores obtained from both samples and the standard deviations are as shown below.
Class A    \(n_A = 55\)    \(\bar{x}_A = 66\)    \(s_A = 10\)
Class B    \(n_B = 40\)    \(\bar{x}_B = 62\)    \(s_B = 8\)
Construct a 95% confidence interval for the difference in the mean scores.



Solution (a)

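The 95% confidence interval for the difference \((\mu_A - \mu_B)\) between the mean scores, where both \(n_A \ge 30\) and \(n_B \ge 30\), is
\[
\begin{aligned}
(\bar{x}_A - \bar{x}_B) \pm Z_{\alpha/2} \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}
&= (66 - 62) \pm (1.96) \sqrt{\frac{10^2}{55} + \frac{8^2}{40}} \\
&= 4 \pm 1.96(1.85) \\
&= 4 \pm 3.626 \\
&= (0.374, 7.626)
\end{aligned}
\]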
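As a check on this working, the sketch below recomputes the interval without rounding the standard error to 1.85. It assumes the standard library's `NormalDist` for the critical value, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def two_mean_ci(x1, s1, n1, x2, s2, n2, confidence):
    """Large-sample interval for mu1 - mu2 with variances estimated by s1^2 and s2^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x1 - x2
    return se, (diff - z * se, diff + z * se)

se, (lower, upper) = two_mean_ci(66, 10, 55, 62, 8, 40, confidence=0.95)
print(f"Se = {se:.4f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```

The unrounded result is about (0.376, 7.624), which agrees with the slide's (0.374, 7.626) up to the rounding of the standard error. Since the interval excludes 0, the data suggest a difference between the two class means.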



Example 1(b)

(b) A random sample of size \(n_1 = 10\) was selected from a population, with \(\bar{x}_1 = 13.9\) and \(s_1 = 2.1\). Another random sample of size \(n_2 = 8\) was selected from a different population, with \(\bar{x}_2 = 11.3\) and \(s_2 = 2.8\). If the difference between the two mean scores is to be estimated,
(i) find an estimate for the common variance of the populations, and
(ii) provide a 95% confidence interval for the difference.



Solution 1(b)
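(i) To estimate the difference \((\mu_1 - \mu_2)\) for small sample sizes, we assume that the two population variances are equal; the common variance is estimated by
\[
\begin{aligned}
s_p^2 &= \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \\
&= \frac{(10 - 1)(2.1)^2 + (8 - 1)(2.8)^2}{10 + 8 - 2} \\
&= 5.91 = (2.43)^2
\end{aligned}
\]
(ii)
\[
\begin{aligned}
(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,(n_1+n_2-2)}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
&= (13.9 - 11.3) \pm t_{0.025,(16)}\,(2.43) \sqrt{\frac{1}{10} + \frac{1}{8}} \\
&= 2.6 \pm (2.120)(2.43)(0.474) \\
&= 2.6 \pm 2.44 \\
&= (0.16, 5.04)
\end{aligned}
\]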
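The pooled-variance working can be verified with a short sketch. The critical value \(t_{0.025,(16)} = 2.120\) is passed in as read from the t table, and the function name `pooled_ci` is an illustrative choice.

```python
import math

def pooled_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """Pooled variance and the small-sample interval for mu1 - mu2 (equal variances assumed)."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    e = t_crit * math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return sp2, (diff - e, diff + e)

sp2, (lower, upper) = pooled_ci(13.9, 2.1, 10, 11.3, 2.8, 8, t_crit=2.120)
print(f"sp^2 = {sp2:.2f}, 95% interval = ({lower:.2f}, {upper:.2f})")
```

This gives a pooled variance of about 5.91 and an interval of roughly (0.16, 5.04), i.e. 2.6 ± 2.44.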
Example 2

Two independent random samples taken from two different populations produced the following results:
Sample 1    \(n_1 = 400\)    \(\hat{p}_1 = 0.48\)
Sample 2    \(n_2 = 300\)    \(\hat{p}_2 = 0.36\)

(i) What is the point estimate of the difference between the two population proportions, \((p_1 - p_2)\)?

(ii) Develop a 99% confidence interval for the difference \((p_1 - p_2)\).



Solution 2

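(i) The point estimate of \((p_1 - p_2)\) is \(\hat{p}_1 - \hat{p}_2 = 0.48 - 0.36 = 0.12\).

(ii) The 99% confidence interval for \((p_1 - p_2)\) is
\[
\begin{aligned}
(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
&= (0.48 - 0.36) \pm Z_{0.005} \sqrt{\frac{0.48(1 - 0.48)}{400} + \frac{0.36(1 - 0.36)}{300}} \\
&= 0.12 \pm (2.58)(0.03731) \\
&= 0.12 \pm 0.0963
\end{aligned}
\]
Hence the difference \((p_1 - p_2) \in (0.024, 0.216)\).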
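A quick numerical check of this interval, sketched with the standard library. The critical value is computed by `NormalDist` (about 2.576) instead of the table value 2.58, so the bounds differ slightly in the fourth decimal place; the function name is illustrative.

```python
import math
from statistics import NormalDist

def two_proportion_ci(p1, n1, p2, n2, confidence):
    """Large-sample interval for p1 - p2 using the estimated standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return se, (diff - z * se, diff + z * se)

se, (lower, upper) = two_proportion_ci(0.48, 400, 0.36, 300, confidence=0.99)
print(f"Se = {se:.5f}, 99% interval = ({lower:.3f}, {upper:.3f})")
```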





Confidence Intervals for Population Variance

The confidence interval for the population variance is based on the sampling distribution of the statistic
\[ \chi^2 = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{(n-1)s^2}{\sigma^2} \]
which has the chi-square distribution with (n − 1) degrees of freedom. The confidence interval for the population variance \(\sigma^2\) is illustrated as follows:



Cont’d

The confidence interval for the population variance, \(\sigma^2\), is
\[ \frac{(n-1)s^2}{\chi^2_{\alpha/2,(n-1)}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,(n-1)}} \]



C.I for Ratio of Variances
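The construction of the confidence interval for the ratio \(\sigma_1^2/\sigma_2^2\) is based on the ratio \(F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}\) (which reduces to \(s_1^2/s_2^2\) when the two variances are equal). This ratio has the F-distribution with degrees of freedom \(v_1 = (n_1 - 1)\) and \(v_2 = (n_2 - 1)\), where we find the two critical values \(F_{1-\alpha/2,(v_1,v_2)}\) and \(F_{\alpha/2,(v_1,v_2)}\).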



Cont’d

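The two critical values satisfy
\[ P\left( F_{1-\alpha/2,(v_1,v_2)} \le \frac{s_1^2}{s_2^2} \cdot \frac{\sigma_2^2}{\sigma_1^2} \le F_{\alpha/2,(v_1,v_2)} \right) = 1 - \alpha \]
from which, using \(F_{1-\alpha/2,(v_1,v_2)} = 1/F_{\alpha/2,(v_2,v_1)}\), we obtain the (1 − α)100% confidence interval for the ratio \(\sigma_2^2/\sigma_1^2\) as
\[ \frac{s_2^2}{s_1^2} \cdot \frac{1}{F_{\alpha/2,(v_2,v_1)}} \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{s_2^2}{s_1^2} \cdot F_{\alpha/2,(v_1,v_2)} \]
or, equivalently, for the ratio \(\sigma_1^2/\sigma_2^2\),
\[ \frac{s_1^2}{s_2^2} \cdot \frac{1}{F_{\alpha/2,(v_1,v_2)}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2} \cdot F_{\alpha/2,(v_2,v_1)} \]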



Illustrative Example

(a) Acid gases must be removed from other gases in chemical production to minimize corrosion of plants. Two methods for removing acid gases gave the following corrosion rates (in mm/year) in experimental tests:

Method I     1.7   0.8   0.8   1.2   0.7   0.9   -
Method II    0.9   0.7   0.8   2.1   1.0   1.4   1.5
(i) Estimate the ratio of the variances in corrosion rates using a 95% confidence interval.
(ii) What assumptions can you make for your answer in (i) to be valid?
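A possible working for part (i), sketched in Python. The sample variances are computed from the data above; the two critical values, \(F_{0.025,(5,6)} = 5.99\) and \(F_{0.025,(6,5)} = 6.98\), are assumptions read from a standard F table rather than computed, and the function name is illustrative.

```python
from statistics import variance

method_1 = [1.7, 0.8, 0.8, 1.2, 0.7, 0.9]
method_2 = [0.9, 0.7, 0.8, 2.1, 1.0, 1.4, 1.5]

def variance_ratio_ci(sample_1, sample_2, f_upper_12, f_upper_21):
    """Interval for sigma1^2 / sigma2^2.

    f_upper_12 is F_{alpha/2,(v1,v2)} and f_upper_21 is F_{alpha/2,(v2,v1)},
    both supplied by the caller (e.g. from an F table).
    """
    ratio = variance(sample_1) / variance(sample_2)  # s1^2 / s2^2, (n - 1) divisor
    return ratio, (ratio / f_upper_12, ratio * f_upper_21)

# Table values assumed for alpha/2 = 0.025 with v1 = 5 and v2 = 6
ratio, (lower, upper) = variance_ratio_ci(method_1, method_2, f_upper_12=5.99, f_upper_21=6.98)
print(f"s1^2/s2^2 = {ratio:.4f}, 95% interval = ({lower:.4f}, {upper:.3f})")
```

With these inputs the ratio of sample variances is about 0.574 and the interval is roughly (0.096, 4.01). Because it contains 1, the data do not indicate a difference between the two variances. For part (ii), the interval rests on independent random samples from normally distributed populations.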



Example b

(b) A random sample of size 5 from a normally distributed population yields the following results: \(\sum_{i=1}^{n} x_i = 23\) and \(\sum_{i=1}^{n} x_i^2 = 135\).
Give the estimates of the population variance and standard deviation using
(i) a point estimate and
(ii) a 95% confidence interval.
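A possible working, assuming the second part asks for a 95% confidence interval. The sketch uses only the standard library: the chi-square critical values are found by bisection on a numerically integrated density, and all helper names are illustrative (the density shortcut at x = 0 is valid for more than 2 degrees of freedom, which holds here).

```python
import math

def chi2_pdf(x, df):
    """Chi-square density; returns 0 at x <= 0, which is correct for df > 2."""
    if x <= 0:
        return 0.0
    k = df / 2
    return math.exp((k - 1) * math.log(x) - x / 2 - k * math.log(2) - math.lgamma(k))

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def chi2_upper_tail(x, df):
    """P(chi-square >= x)."""
    return 1.0 - simpson(lambda u: chi2_pdf(u, df), 0.0, x)

def chi2_critical(alpha, df):
    """Value c with P(chi-square >= c) = alpha, located by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, df) > alpha:  # widen the bracket until it contains c
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, sum_x, sum_x2 = 5, 23, 135
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)           # point estimate of the variance
lower = (n - 1) * s2 / chi2_critical(0.025, n - 1)  # divide by the upper critical value
upper = (n - 1) * s2 / chi2_critical(0.975, n - 1)  # divide by the lower critical value

print(f"s^2 = {s2:.2f}, s = {math.sqrt(s2):.3f}")
print(f"95% interval for variance = ({lower:.2f}, {upper:.2f})")
print(f"95% interval for std dev  = ({math.sqrt(lower):.2f}, {math.sqrt(upper):.2f})")
```

The point estimates are \(s^2 = 7.3\) and \(s \approx 2.70\); the interval for the variance is approximately (2.62, 60.28), which is very wide because the sample has only 5 observations.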



Assumptions of Estimation of Parameters

The following assumptions are made in the estimation of population parameters.
(i) The sample consists of n independent observations randomly selected.
(ii) The sampled population is normally distributed with mean µ and variance \(\sigma^2\).
(iii) In comparing two populations for small sample sizes, the variances are assumed to be equal.
(iv) The sample size(s) may be large or small (that is, n ≥ 30 or n < 30).

