CH 7

The document discusses statistical intervals based on a single sample. It defines confidence intervals and how they can provide a margin of error around a point estimate. It also presents the formula for finding a confidence interval for a population mean when the population standard deviation is known, and provides examples and practice problems.


BITS Pilani

Pilani Campus

Course No: MATH F113

Probability and Statistics


Chapter 7: Statistical Intervals Based on a Single Sample
Sumanta Pasari
sumanta.pasari@pilani.bits-pilani.ac.in
Interval Estimation
• A point estimate cannot be expected to equal the population parameter exactly; at best it is close to it.

• An interval estimate is usually obtained by adding and subtracting a margin of error to the point estimate:

Interval Estimate = Point Estimate ± Margin of Error

• Interval estimation tells us how close the point estimate is likely to be to the value of the parameter.
• Why do we use the term "confidence interval"?
BITS Pilani, Pilani Campus
Interval (CI) Estimation
– Instead of considering a statistic as a point estimator, we
may use random intervals to understand the parameter.
– In this case, the end points of the interval are RVs and we
can talk about the probability (in the sense of frequency
definition) that it brackets the parameter value.
Confidence Interval: A 100(1 - α)% confidence interval for a parameter θ is a random interval [L1, L2] such that
P[L1 ≤ θ ≤ L2] = 1 - α, regardless of the value of θ.
Theorem 7.1: Interval estimation for µ: σ known
Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population with mean $\mu$ (unknown) and variance $\sigma^2$ (known). Then we know that

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \;\Rightarrow\; \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

Taking two points $\pm z_{\alpha/2}$ symmetrically about the origin, we get

$$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha.$$

Here $(1 - \alpha)$ is known as the confidence level, and $\alpha$ is the level of significance.


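Interval estimation for µ: σ known

Rearranging the inequalities to isolate $\mu$ gives

$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha.$$

Hence, the confidence interval for the population mean $\mu$ with confidence level $100(1 - \alpha)\%$ is

$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right).$$

The endpoints of the confidence interval are called confidence limits.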


Interval estimation for µ: σ known

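For the most commonly used confidence level, 95%, $z_{0.025} = 1.96$.

Hence, the 95% CI for $\mu$ is $\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$.

That is, $P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) = 0.95$.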
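The critical value $z_{\alpha/2}$ for any confidence level can be computed instead of read from a table. The sketch below is not part of the original slides; it uses Python's standard-library `statistics.NormalDist`, and the helper name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    # z_{alpha/2} leaves area alpha/2 in the right tail of N(0, 1),
    # so it is the (1 - alpha/2) quantile of the standard normal.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

For the 95% level this gives 1.960, the value used in the formula above.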


Practice Problems
Ex.1. The mean of a sample of size 50 from a normal population is observed to be 15.68. If the s.d. of the population is 3.27, find (a) 80%, (b) 95%, (c) 99% confidence intervals for the population mean. Can you find the respective margins of error? What is the length of the CI in each case?

Sol. (b) First check the two assumptions: (i) normality, (ii) $\sigma$ known.
Step 1: Here $n = 50$, $\bar{x} = 15.68$, $\sigma = 3.27$, and $\alpha = 0.05$. We need a CI for $\mu$.
Step 2: As $\alpha = 0.05$, we need to find $z_{\alpha/2}$ such that $P(Z \le z_{\alpha/2}) = 0.975$. From the cumulative normal distribution table, $z_{\alpha/2} = 1.96$.
Step 3: The CI for $\mu$ ($\sigma$ known) is $\left(\bar{x} - z_{0.025}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{0.025}\frac{\sigma}{\sqrt{n}}\right) = (14.77, 16.59)$.
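The three intervals asked for in Ex.1, with their margins of error and lengths, can be checked with a short Python sketch. This is an illustration rather than part of the slides; the function name `z_interval` is an assumption, and the critical values come from `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Two-sided CI for the mean of a normal population with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # margin of error
    return xbar - margin, xbar + margin, margin

# Data of Ex.1: n = 50, sample mean 15.68, population s.d. 3.27
for level in (0.80, 0.95, 0.99):
    low, high, margin = z_interval(15.68, 3.27, 50, level)
    print(f"{level:.0%}: ({low:.2f}, {high:.2f}), "
          f"margin = {margin:.2f}, length = {2 * margin:.2f}")
```

The length of each interval is twice its margin of error, so a higher confidence level produces a wider interval for the same data.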


Interval estimation for µ: σ known

Confusions and confusions?

In the above example, we found that the 95% CI for μ is (14.77, 16.59). Does this mean that the unknown μ lies within this fixed interval with probability 0.95? That is,
P[μ lies in (14.77, 16.59)] = 0.95, right?

If not, then what is the interpretation of “95% confidence”?

- Long-run relative frequency?
- A single replication (realization) of the random interval is not enough, or at least not satisfactory.


95% CIs for population mean

One hundred 95% CIs (asterisks identify intervals that do not include μ).
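The long-run relative frequency interpretation can be illustrated by simulation: draw many samples from a population whose mean is known, build a 95% CI from each, and count how often the interval covers the true mean. The sketch below is an illustration only; the population parameters (μ = 10, σ = 2), the sample size, the seed, and the function name `coverage` are all assumptions chosen for the demonstration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, replications, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(replications):
        # One replication: a fresh sample of size n and its sample mean
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / replications

print(coverage(mu=10.0, sigma=2.0, n=25, confidence=0.95, replications=2000))
```

The printed fraction should be close to 0.95. Each individual interval either contains μ or does not; the 95% describes the procedure over many replications, not any single realized interval.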


Practice Problems
HW 1. Studies have shown that the random variable X, the processing time required to do a multiplication on a new 3-D computer, is normally distributed with mean μ and standard deviation 2 microseconds. A random sample of 16 observations is taken.
(a) These data are obtained:
42.65 45.15 39.32 44.44
41.63 41.54 41.59 45.68
46.50 41.35 44.37 40.27
43.87 43.79 43.28 40.70
Based on these data, find an unbiased estimate for μ.
(b) Find a 95% confidence interval for μ.
Confidence Level, Precision, and
Sample Size
Ex.2. Extensive monitoring of a computer time-sharing system has suggested that response time to a particular editing command is normally distributed with standard deviation 25 millisec.

A new operating system has been installed, and we wish to estimate the true average response time $\mu$ for the new environment.

Assuming that response times are still normally distributed with $\sigma = 25$, what sample size is necessary to ensure that the resulting 95% CI has a width of (at most) 10?
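One way to approach Ex.2: the width of the CI is $w = 2 z_{\alpha/2}\,\sigma/\sqrt{n}$, so requiring $w \le 10$ gives $n \ge (2 z_{\alpha/2}\,\sigma / w)^2$. The Python sketch below works this out; it is not taken from the slides, and the function name `sample_size_for_width` is an illustrative assumption.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_width(sigma, width, confidence):
    """Smallest n for which the two-sided z-interval is at most `width` wide."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # width = 2 * z * sigma / sqrt(n)  =>  n = (2 * z * sigma / width) ** 2
    return ceil((2 * z * sigma / width) ** 2)

n_required = sample_size_for_width(sigma=25, width=10, confidence=0.95)
print(n_required)  # 97, since (2 * 1.96 * 25 / 10)^2 = 96.04 is rounded up
```

The result is rounded up because n must be an integer and a smaller n would give a wider interval.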
Impracticality of Assumptions in CI

In practice, we usually face two main problems when applying the previous CI formula.
• What if the population is not normal? (A large sample size is then needed. Can the CLT help?)
• What if the population variance is unknown?
7.2: Large Sample CI for µ

Let $X_1, X_2, \ldots, X_n$ (large sample) be a random sample from a population with mean $\mu$ (unknown) and variance $\sigma^2$ (known). Then, using the CLT,

$$\bar{X} \approx N\left(\mu, \frac{\sigma^2}{n}\right) \;\Rightarrow\; \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \approx N(0, 1).$$

The $100(1 - \alpha)\%$ CI for $\mu$ is given as $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$.
7.2: Large Sample CI for µ

Let $X_1, X_2, \ldots, X_n$ be a large random sample ($n > 40$) from a population with mean $\mu$ (unknown), and let $S^2$ be the sample variance. Then

$$\frac{\bar{X} - \mu}{S/\sqrt{n}} \approx N(0, 1) \quad \text{(approximately a standard normal distribution)}$$

$$\Rightarrow\; P\left(\bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}}\right) \approx 1 - \alpha.$$

Hence, the large-sample confidence interval for the population mean $\mu$ with confidence level $100(1 - \alpha)\%$ is approximately $\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$; $n > 40$ is needed.
HW 2

One Sided CI for µ

Enjoy: https://www.youtube.com/watch?v=sv5-9wiXjVk
Deriving a CI: Example 7.5

(https://en.wikipedia.org/wiki/Relationships_among_probability_distributions)
Sample/Population Proportion

Let us draw a random sample $X_1, X_2, \ldots, X_n$ of size $n$ from a population, where

$X_i = 1$ if the $i$-th member of the sample has the trait,
$X_i = 0$ if the $i$-th member of the sample does not have the trait.

Then $X = \sum_{i=1}^{n} X_i$ gives the number of objects in the sample with the trait, and the statistic $X/n$ gives the proportion of the sample with the trait. Note that $X$ is a binomial RV with parameters $n$ (known) and $p$.
Sample Proportion
The statistic that estimates the parameter $p$, the proportion of a population that has some property, is the sample proportion

$$\hat{p} = \frac{\text{number in sample with the trait (success)}}{\text{sample size}} = \frac{X}{n}.$$

Properties:
(i) As the sample size increases ($n$ large), the sampling distribution of $\hat{p}$ becomes approximately normal (WHY?).
(ii) The mean of $\hat{p}$ is $p$, and the variance of $\hat{p}$ is $\frac{p(1 - p)}{n}$ (WHY?).
(iii) Can we get estimators of $p$, both point and interval?
Sample Proportion

Note that $\hat{p} = \frac{X}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$, where each $X_i$ is an independent point binomial (Bernoulli) RV, that is, $P(X_i = 1) = p$ and $P(X_i = 0) = 1 - p$.

x_i      1    0
f(x_i)   p    1-p

E[X_i] = 1(p) + 0(1 - p) = p
Var(X_i) = E[X_i^2] - (E[X_i])^2 = p - p^2 = p(1 - p)
Form of Sampling Distribution of Sample Proportion

If $np \ge 10$ and $n(1 - p) \ge 10$, the sampling distribution of $\hat{p}$ can be approximated by a normal distribution with mean $p$ and s.d. $\sqrt{\frac{p(1 - p)}{n}}$.

Knowing $E(\hat{p}) = 0.60$ and $\sigma_{\hat{p}} = 0.0894$, can we find
(i) $P(0.55 \le \hat{p} \le 0.65)$?
(ii) $P(0.50 \le \hat{p} \le 0.65)$?
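Under the normal approximation, both probabilities are differences of normal CDF values. The following Python sketch evaluates them with the mean 0.60 and s.d. 0.0894 stated on the slide; it is an illustration, and the names `p_hat_dist` and `prob_between` are assumptions.

```python
from statistics import NormalDist

# Approximate sampling distribution of p-hat: N(0.60, 0.0894^2)
p_hat_dist = NormalDist(mu=0.60, sigma=0.0894)

def prob_between(dist, a, b):
    """P(a <= X <= b) for a continuous distribution with a cdf method."""
    return dist.cdf(b) - dist.cdf(a)

prob_i = prob_between(p_hat_dist, 0.55, 0.65)
prob_ii = prob_between(p_hat_dist, 0.50, 0.65)
print(f"(i)  P(0.55 <= p_hat <= 0.65) = {prob_i:.4f}")
print(f"(ii) P(0.50 <= p_hat <= 0.65) = {prob_ii:.4f}")
```

Interval (i) is symmetric about the mean, so its probability equals 2Φ(0.05/0.0894) - 1, roughly 0.42; interval (ii) extends further to the left and is therefore more probable, roughly 0.58.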


(Traditional) Confidence Interval on p

Interval Estimate = Point Estimate ± Margin of Error

Note that for large $n$ (using the CLT),

$$\hat{p} \approx N\left(p, \frac{p(1 - p)}{n}\right) \;\Rightarrow\; \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} \approx N(0, 1).$$

Taking two points $\pm z_{\alpha/2}$ symmetrically about the origin, we get

$$P\left(-z_{\alpha/2} \le \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} \le z_{\alpha/2}\right) \approx 1 - \alpha.$$

Here $(1 - \alpha)$ is known as the confidence level.
Confidence Interval on p

$$P\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}\right) \approx 1 - \alpha$$

As $p$ is unknown, the above confidence bounds are not statistics. So replace $p$ by its unbiased estimator $\hat{p}$; the CI on $p$ with confidence level $(1 - \alpha)$ is then

$$\left(\hat{p}_{obs} - z_{\alpha/2}\sqrt{\frac{\hat{p}_{obs}(1 - \hat{p}_{obs})}{n}},\; \hat{p}_{obs} + z_{\alpha/2}\sqrt{\frac{\hat{p}_{obs}(1 - \hat{p}_{obs})}{n}}\right).$$

The endpoints of the confidence interval are called confidence limits.


Sample Size for Estimating p
We can be $100(1 - \alpha)\%$ sure that $\hat{p}$ and $p$ differ by at most $d$, where $d$ is given by

$$d = z_{\alpha/2}\sqrt{\frac{\hat{p}_{obs}(1 - \hat{p}_{obs})}{n}}.$$

Thus, the sample size for estimating $p$ when a prior estimate is available is

$$n = \frac{z_{\alpha/2}^2\, \hat{p}_{obs}(1 - \hat{p}_{obs})}{d^2}.$$
Sample Size for Estimating p

It can be shown that $\hat{p}_{obs}(1 - \hat{p}_{obs}) \le \frac{1}{4}$.
Thus, the sample size for estimating $p$ when a prior estimate is not available is

$$n = \frac{z_{\alpha/2}^2}{4d^2}.$$


Problem Solving
Ex 3
A study of electromechanical protection devices used in electrical power systems showed that of 193 devices that failed when tested, 75 were due to mechanical part failures.
a) Find a point estimate for p, the proportion of failures that are due to mechanical failures.
b) Find a 95% confidence interval on p.
c) How large a sample is required to estimate p to within 0.03 with 95% confidence?
Problem Solving
Random variable X = number of failed devices, among the 193 failed devices, whose failure was due to mechanical failure.
X has an approximately normal distribution with mean 193p and variance 193p(1 - p).
a) Point estimate for p: $\hat{p}_{obs} = x/n = 75/193 = 0.3886$.
b) The 95% confidence interval on p is

$$\frac{75}{193} \pm z_{0.025}\sqrt{\frac{\frac{75}{193}\left(1 - \frac{75}{193}\right)}{193}} = (0.3198, 0.4574).$$

c) Using the prior estimate: $n = \frac{1.96^2 \times 0.389 \times 0.611}{0.03^2} \approx 1015$.
Without using a prior estimate: $n = \frac{z_{\alpha/2}^2}{4d^2} = \frac{1.96^2}{4 \times 0.03^2} \approx 1068$.
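The three parts of Ex 3 can be reproduced with a few lines of Python. This is a sketch under the same large-sample normal approximation as the slides, not an addition to the method; the variable names are illustrative, and the critical value is computed with `statistics.NormalDist` instead of being read from a table.

```python
from math import ceil, sqrt
from statistics import NormalDist

x, n = 75, 193          # mechanical failures among failed devices
d = 0.03                # desired bound on |p_hat - p|
z = NormalDist().inv_cdf(0.975)  # z_{0.025}, about 1.96

# (a) point estimate
p_hat = x / n

# (b) traditional 95% CI: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)
margin = z * sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

# (c) sample sizes, rounded up to the next integer
n_with_prior = ceil(z**2 * p_hat * (1 - p_hat) / d**2)
n_without_prior = ceil(z**2 / (4 * d**2))  # uses p(1 - p) <= 1/4

print(f"p_hat = {p_hat:.4f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(n_with_prior, n_without_prior)
```

The sample size without a prior estimate is larger because it guards against the worst case p = 1/2.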
Score CI for p

Score CI for p (Example 8; Page 281)

Practice Problems
7.3: CI based on a normal population,
Interval estimation for µ: σ unknown
Interval Estimate for µ = $\bar{X}$ ± Margin of Error
• If the sample size is small and the population standard deviation σ is unknown, then what?
– As in the large-sample case, we can use the sample standard deviation S to calculate the margin of error.
• With a small sample, what is the distribution of the RV obtained by replacing σ by S? Can it be a normal distribution, as earlier?
– In this case, assuming that the population is normal, we obtain a T-distribution. HOW?
Interval estimation for µ: σ unknown

Note:
Let $Z \sim N(0, 1)$ and let $\chi^2_\nu$ be an independent chi-squared RV with $\nu$ degrees of freedom. Then $T = \frac{Z}{\sqrt{\chi^2_\nu/\nu}}$ follows a T-distribution with $\nu$ dof.

Theorem:
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a normal distribution with mean $\mu$ and variance $\sigma^2$. Then

$$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim T_{n-1}.$$


T-Distribution
• A random variable T with $\nu$ degrees of freedom (the parameter) is a continuous r.v. with density

$$f(t) = \frac{\Gamma((\nu + 1)/2)}{\Gamma(\nu/2)\sqrt{\pi\nu}}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu + 1)/2}, \quad -\infty < t < \infty.$$

• The density plot is bell shaped and symmetric about 0.
• The variance of T decreases as $\nu$ increases. In fact, T approximates the standard normal for large $\nu$.


Properties of T-distribution

CDF of T-distribution
• Critical values for the t-distribution are given in Table A.5.
• By $t_{\alpha,\nu}$ we denote the value of the t-variable such that the area under its density to its right is α (the degrees of freedom ν must be mentioned separately).


Interval estimation for µ: σ unknown

The T-distribution is symmetric and becomes approximately standard normal for large n.
Taking two points $\pm t_{\alpha/2,\,n-1}$ symmetrically about the origin, we get

$$P\left(-t_{\alpha/2,\,n-1} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha.$$

Here $(1 - \alpha)$ is known as the confidence level, and $\alpha$ is the level of significance. So,

$$P\left(\bar{X} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}\right) = 1 - \alpha.$$

Hence, the CI for $\mu$ ($\sigma$ unknown) with confidence level $100(1 - \alpha)\%$ is

$$\left(\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\; \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right).$$

Interval estimation for µ: σ unknown (small sample)


Examples
Ex.4. Seven laboratory experiments on the value of g (acceleration due to gravity, assumed to follow a normal distribution) at Pilani gave a mean of 977.51 cm/s² and an s.d. of 4.42 cm/s². Find the 95% CI for the true value of g (i.e., the population mean).
Sol.
Step 1: Here $n = 7$, $\bar{x} = 977.51$, $s = 4.42$, and $\alpha = 0.05$. We need a CI for $\mu$. The population is also known to be normally distributed.
Step 2: As $\alpha = 0.05$, we need to find $t_{\alpha/2}$ from the t-distribution with $(n - 1) = 6$ degrees of freedom, such that $P(T \le t_{\alpha/2}) = 0.975$. From the t-distribution table, $t_{0.025,6} = 2.447$.
Step 3: The CI for $\mu$ ($\sigma$ unknown) is $\left(\bar{x} - t_{0.025,6}\frac{s}{\sqrt{n}},\; \bar{x} + t_{0.025,6}\frac{s}{\sqrt{n}}\right) = (973.42, 981.60)$, since $2.447 \times 4.42/\sqrt{7} \approx 4.09$. (If 4.42 is instead the divisor-$n$ standard deviation, the standard error is $4.42/\sqrt{6}$ and the interval becomes (973.09, 981.93).)
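As a numerical check of Ex.4, the sketch below plugs the tabulated value $t_{0.025,6} = 2.447$ into $\bar{x} \pm t\, s/\sqrt{n}$. It is an illustration, not part of the slides, and the variable names are assumptions. It also evaluates the variant with standard error $s/\sqrt{n-1}$, which is appropriate only if 4.42 was computed with divisor $n$ rather than $n - 1$; which convention the data used is not stated in the source.

```python
from math import sqrt

n, xbar, s = 7, 977.51, 4.42
t_crit = 2.447  # t_{0.025, 6}, taken from the t-table quoted in the example

# Standard t-interval: s is the sample s.d. (divisor n - 1)
margin = t_crit * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(f"s/sqrt(n):     ({lower:.2f}, {upper:.2f})")

# Variant: 4.42 treated as the divisor-n s.d., so the standard error is s/sqrt(n - 1)
margin_alt = t_crit * s / sqrt(n - 1)
lower_alt, upper_alt = xbar - margin_alt, xbar + margin_alt
print(f"s/sqrt(n - 1): ({lower_alt:.2f}, {upper_alt:.2f})")
```

The first interval is about (973.42, 981.60) and the second about (973.09, 981.93), so the choice of convention shifts each limit by roughly 0.33 cm/s².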
https://en.wikipedia.org/wiki/Student%27s_t-distribution

Similar to Table A.5.

Annotated t-table, similar to Table A.5, labelled with: degrees of freedom, cumulative probability 1 - α/2, tail area α/2, and critical value t_{α/2}.


Example 11 (Page 288)

• Even as traditional markets for sweetgum lumber have declined, large section solid timbers traditionally used for construction bridges and mats have become increasingly scarce.

• The article “Development of Novel Industrial Laminated Planks from Sweetgum Lumber” (J. of Bridge Engr., 2008: 64–66) described the manufacturing and testing of composite beams designed to add value to low-grade sweetgum lumber.


Example 11 (Page 288)
cont’d

• Here is data on the modulus of rupture (psi; the article contained summary data expressed in MPa):

6807.99 7637.06 6663.28 6165.03 6991.41 6992.23
6981.46 7569.75 7437.88 6872.39 7663.18 6032.28
6906.04 6617.17 6984.12 7093.71 7659.50 7378.61
7295.54 6702.76 7440.17 8053.26 8284.75 7347.95
7422.69 7886.87 6316.67 7713.65 7503.33 7674.99


Example 11 (Page 288)
cont’d

• Let’s now calculate a confidence interval for the true average MOR using a confidence level of 95%. The CI is based on n – 1 = 29 degrees of freedom, so the necessary t critical value is $t_{0.025,29} = 2.045$. The interval estimate is then $\bar{x} \pm t_{0.025,29}\, s/\sqrt{n}$.
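The interval for the true average MOR can be computed directly from the 30 observations listed in the example. The sketch below is an illustration (variable names are assumptions); it uses the tabulated critical value 2.045 quoted above and the standard-library `statistics` functions for the sample mean and sample standard deviation.

```python
from math import sqrt
from statistics import fmean, stdev

# Modulus of rupture data (psi) from Example 11
mor = [
    6807.99, 7637.06, 6663.28, 6165.03, 6991.41, 6992.23,
    6981.46, 7569.75, 7437.88, 6872.39, 7663.18, 6032.28,
    6906.04, 6617.17, 6984.12, 7093.71, 7659.50, 7378.61,
    7295.54, 6702.76, 7440.17, 8053.26, 8284.75, 7347.95,
    7422.69, 7886.87, 6316.67, 7713.65, 7503.33, 7674.99,
]

n = len(mor)
xbar = fmean(mor)
s = stdev(mor)        # sample s.d. with divisor n - 1
t_crit = 2.045        # t_{0.025, 29}, from the t-table

margin = t_crit * s / sqrt(n)
mor_ci = (xbar - margin, xbar + margin)
print(f"n = {n}, mean = {xbar:.3f}, s = {s:.3f}")
print(f"95% CI for true average MOR: ({mor_ci[0]:.2f}, {mor_ci[1]:.2f})")
```

The t-interval assumes the MOR values come from an approximately normal population, as required throughout this section.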


Examples

HW.3. Seven laboratory experiments on the value of g (acceleration due to gravity, assumed to follow a normal distribution) at Pilani gave a mean of 977.51 cm/s² and an s.d. of 4.42 cm/s². Find the 80%, 90% and 95% CIs for the population mean. Can you find the respective margins of error (bounds on the error of estimation)?

HW.4. A sample of size 15 taken from a larger population (normally distributed) has a sample mean of 12 and a sample variance of 25. Construct a 95% CI for the population mean when the population s.d. is 5. What is the length of the CI?


One Sided CI
HW 5: A one-sided confidence interval can be used to bound the maximum or the minimum value of the population mean.
An interval $(-\infty, L_1]$ such that $P(\mu \le L_1) = 1 - \alpha$ allows us to place a bound on the maximum value of the population mean:

$$L_1 = \bar{X} + t_{\alpha,\,n-1}\, S/\sqrt{n}.$$

An interval $[L_2, \infty)$ such that $P(\mu \ge L_2) = 1 - \alpha$ allows us to place a bound on the minimum value of the population mean:

$$L_2 = \bar{X} - t_{\alpha,\,n-1}\, S/\sqrt{n}.$$
One Sided CI
Use the following data on X, the time that a commercial airliner stays at the gate during a through flight, to find a 95% one-sided confidence interval that puts a bound on the minimum time in minutes for μ:
25 29 32 37 40 27 30 35 38 41 42 45 45 47 49 50 55 53 60
(Assume normality of the population.)

Solution: $\bar{x} = 41.05$, $s^2 = 98.61$, $s = 9.93$, $n = 19$.

$$L_{obs} = \bar{x} - t_{0.05,18}\frac{s}{\sqrt{n}} = 41.05 - 1.734 \times \frac{9.93}{\sqrt{19}} = 37.10$$

CI = $[37.10, \infty)$
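The lower confidence bound for the gate-time data can be verified from the raw observations. In this sketch (an illustration with assumed variable names), the critical value 1.734 is the tabulated $t_{0.05,18}$: a one-sided 95% bound puts the whole α = 0.05 in one tail.

```python
from math import sqrt
from statistics import fmean, stdev

gate_times = [25, 29, 32, 37, 40, 27, 30, 35, 38, 41,
              42, 45, 45, 47, 49, 50, 55, 53, 60]

n = len(gate_times)
xbar = fmean(gate_times)
s = stdev(gate_times)   # sample s.d. with divisor n - 1
t_crit = 1.734          # t_{0.05, 18}, from the t-table

# One-sided 95% lower confidence bound on the mean
lower_bound = xbar - t_crit * s / sqrt(n)
print(f"mean = {xbar:.2f}, s^2 = {s**2:.2f}, s = {s:.2f}")
print(f"95% one-sided CI: [{lower_bound:.2f}, inf)")
```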
7.4: Interval estimation of Variability

Recall: $S^2$ is an unbiased estimator for $\sigma^2$.

Theorem: Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a normal population with mean $\mu$ and s.d. $\sigma$. Then

$$\frac{(n - 1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared distribution with $(n - 1)$ degrees of freedom.
https://jekyll.math.byuh.edu/courses/m321/handouts/mean_var_indep.pdf

Recall: The chi-squared distribution with $(n - 1)$ degrees of freedom is a gamma distribution with $\alpha = (n - 1)/2$, $\beta = 2$.

Recall Chi-squared Distribution
Interval estimation of Variability
Theorem: Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a normal population with mean $\mu$ and s.d. $\sigma$. Then a $100(1 - \alpha)\%$ confidence interval estimate for $\sigma^2$ is given by

$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}.$$

Proof:

$$P\left(\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n - 1)S^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha$$

$$\Rightarrow\; \frac{(n - 1)S^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}.$$
Interval estimation for σ²

Thus the $100(1 - \alpha)\%$ CI for $\sigma^2$ is

$$\left(\frac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}},\; \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right).$$

A confidence interval for $\sigma$ has lower and upper limits that are the square roots of the corresponding limits in the interval for $\sigma^2$.

An upper or a lower confidence bound results from replacing $\alpha/2$ with $\alpha$ in the corresponding limit of the CI.
Interval estimation for σ²

Recall that the $100(1 - \alpha)\%$ CI for $\sigma^2$ is

$$\left(\frac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}},\; \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right).$$

Ex.5. A sample of size 9 from a normal population is given below. Find the 90% CI for the mean $\mu$ of the population. Also find the 90% CI for the variance $\sigma^2$ of the population, and the 90% CI for $\sigma$.
Sample: 0, 1, -1, 1, 1, 0, -1, -2, 3.
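Interval estimation for σ²
Sol. Here $n = 9$, $s^2 = 79/36 \approx 2.19$ (check), and $\alpha = 0.10$.
To find the CI for the population variance $\sigma^2$, first find $\chi^2_{1-\alpha/2,\,n-1} = \chi^2_{0.95,8} = 2.733$ and $\chi^2_{\alpha/2,\,n-1} = \chi^2_{0.05,8} = 15.507$. With $(n - 1)s^2 = \sum (x_i - \bar{x})^2 \approx 17.56$, the 90% CI for $\sigma^2$ is

$$\left(\frac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}},\; \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right) = \left(\frac{17.56}{15.507},\; \frac{17.56}{2.733}\right) = (1.13, 6.42).$$

The length of the CI for $\sigma^2$ is $6.42 - 1.13 = 5.29$. The 90% CI for $\sigma$ is (?, ?).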
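Computing directly from the nine listed observations gives a sample variance of $s^2 = 79/36 \approx 2.19$, and the CI for $\sigma^2$ follows from the two tabulated chi-squared values. The Python sketch below is an illustration, not part of the slides; the variable names are assumptions, and the critical values 2.733 and 15.507 are the table values quoted in the solution.

```python
from statistics import variance

sample = [0, 1, -1, 1, 1, 0, -1, -2, 3]
n = len(sample)
s2 = float(variance(sample))   # sample variance with divisor n - 1

chi2_lower_tail = 2.733    # chi^2_{0.95, 8}: area 0.95 to its right
chi2_upper_tail = 15.507   # chi^2_{0.05, 8}: area 0.05 to its right

# 90% CI for sigma^2: divide (n - 1) s^2 by the larger value for the lower limit
var_ci = ((n - 1) * s2 / chi2_upper_tail, (n - 1) * s2 / chi2_lower_tail)
length = var_ci[1] - var_ci[0]

print(f"s^2 = {s2:.4f}")
print(f"90% CI for sigma^2: ({var_ci[0]:.2f}, {var_ci[1]:.2f}), length = {length:.2f}")
```

Note that the interval is not symmetric about $s^2$, because the chi-squared distribution is skewed.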

HW.6. The heights in inches of 8 students of a college, chosen at random, were as follows: 62.2, 62.4, 63.1, 63.2, 65.5, 66.2, 66.3, 66.5. Compute the 90% and 95% CIs for the variance of the population of heights, assuming it to be normal. Also, find the length of the interval in each case.
Similar to Table A.7 of your book.

Annotated chi-squared table, labelled with: degrees of freedom, cumulative probability 1 - α/2, tail area α, and critical value Chi-sq(α). Even this is fine.


One Sided CI on σ²
Ex 6: X is the actual length of 63 mm nails. Use the given data to find a 95% one-sided confidence interval on the variance in length.
63.0 63.1 63.0 63.0 62.9 63.0 63.0
63.1 62.8 63.1 63.1 63.0 62.9 63.2
The manufacturer wants to be sure that the population variance of the length of the nails being produced does not exceed 0.03. Assuming normality of the population, does this sample indicate that this is the case? Explain.
Sol:


Interval estimation for σ²

$\sum_{i=1}^{14} x_i = 882.2$; $\sum_{i=1}^{14} x_i^2 = 55591.34$; $s^2 = 0.0105$

We want $L$ such that $P(\sigma^2 \le L) = 0.95$.

$$L_{obs} = \frac{(n - 1)s^2}{\chi^2_{1-\alpha,\,13}} = \frac{13 \times 0.0105}{5.89} = 0.023$$

Since 0.023 < 0.03, yes, this is the case.
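The upper confidence bound on the variance of the nail lengths can be recomputed from the 14 measurements. The sketch below is illustrative (variable names are assumptions) and uses the critical value 5.89 quoted in the solution for $\chi^2_{0.95,13}$.

```python
from statistics import variance

lengths = [63.0, 63.1, 63.0, 63.0, 62.9, 63.0, 63.0,
           63.1, 62.8, 63.1, 63.1, 63.0, 62.9, 63.2]

n = len(lengths)
s2 = variance(lengths)   # sample variance with divisor n - 1
chi2_crit = 5.89         # chi^2_{0.95, 13}, as used in the solution

# One-sided 95% upper confidence bound on sigma^2
upper_bound = (n - 1) * s2 / chi2_crit
within_spec = upper_bound <= 0.03

print(f"s^2 = {s2:.4f}, upper bound = {upper_bound:.3f}, within spec: {within_spec}")
```

Because the 95% upper bound lies below the manufacturer's limit of 0.03, the sample supports the claim that the variance does not exceed it.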


Supplementary HW

Extra Class on Sunday at 9 AM
