
06 Stat Est

The document discusses statistical estimation and sampling distributions. It defines parameters and statistics, and describes how to estimate population parameters like the mean and proportion from sample statistics. It also introduces the concept of sampling distributions and how the sampling distribution of the sample mean follows a normal distribution for large sample sizes.

Uploaded by

Konica Rokeya
Copyright
© © All Rights Reserved

Statistical Estimation and Sampling Distributions
Mahbub Latif, PhD

January 2024
Plan
Point estimate
Sampling distributions
Sampling distribution of sample mean
Sampling distribution of sample proportion

Statistical inference

Descriptive statistics deal with describing the observations of a sample; e.g. the sample mean, sample variance, etc. are tools of descriptive statistics
Inferential statistics deal with making a statement (conclusion) about a population characteristic using the information obtained from a sample
There are two methods of statistical inference
Estimation and test of hypothesis
Two methods of estimation
point estimation
interval estimation (confidence interval)

Point Estimate

Parameters

A parameter is a property of a probability distribution


E.g. mean, variance or a particular quantile of a probability distribution may be a
parameter
Parameters are usually unknown and the main goal of statistical inference is to estimate
parameters using sample data

Example (Machine breakdown)

Let $p_0$ be the probability of machine breakdown due to "operator misuse"

$p_0$ is a parameter because it is an unknown quantity of the corresponding probability distribution

Example (Milk contents)

Let $\mu$ and $\sigma^2$ be the mean and variance of the probability distribution of milk contents of a container

$\mu$ and $\sigma^2$ are parameters of the distribution of milk contents.

Statistics

A statistic is a property of a sample; e.g. the sample mean and sample variance are examples of statistics
A statistic is a function of the sample observations, and it can be used to estimate unknown parameters
Statistics are random variables whose observed values can be calculated from a set of observed data.

Statistics

Let $x_1, \ldots, x_n$ be a random sample from a probability distribution $f(x)$

Any function of the sample observations, say $T(x_1, \ldots, x_n)$, is a statistic

Sample mean $\bar{x}$ is a statistic

$$\bar{x} = \frac{x_1 + \cdots + x_n}{n}$$

Sample variance $s^2$ is also a statistic

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Estimation

Estimation is a procedure by which the information contained within a sample is used to investigate properties of the population from which the sample is drawn

A point estimate of an unknown parameter $\theta$ is a statistic $\hat{\theta}$ that represents a "best guess" of the value of $\theta$

A point estimate $\hat{\theta}$ may not be exactly equal to the corresponding parameter $\theta$, but a good point estimator is a good indicator of the actual value of the parameter
A point estimate can only be as good as the data set from which it is calculated, e.g., whether the sample is randomly selected, representative of the population, etc.

The relationship between a point estimate $\hat{\theta}$ and a parameter $\theta$

Estimation of the population mean

Estimation (Sample I)

Sample observations: 72.3, 77.7, 82.1, 70.8, 81.6, 80.2, 80.7, 88.9, 70.2, 90.1

Point estimates: $\bar{x} = \hat{\mu} = 79.46$ and $s^2 = \hat{\sigma}^2 = 47.9$

Estimation (Sample II)

Sample observations: 75, 81.5, 73.3, 92.8, 82.6, 73.4, 83.9, 85.9, 84.6, 77.6

Point estimates: $\bar{x} = \hat{\mu} = 81.06$ and $s^2 = \hat{\sigma}^2 = 39.11$
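
The point estimates for Samples I and II can be reproduced with a few lines of Python. This is a minimal sketch that applies the sample mean and sample variance (with the n − 1 divisor) to the listed observations; the function names `sample_mean` and `sample_variance` are illustrative and do not come from the slides.

```python
def sample_mean(xs):
    """Sample mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


# Observations copied from the two example samples above.
sample_1 = [72.3, 77.7, 82.1, 70.8, 81.6, 80.2, 80.7, 88.9, 70.2, 90.1]
sample_2 = [75, 81.5, 73.3, 92.8, 82.6, 73.4, 83.9, 85.9, 84.6, 77.6]

for name, xs in [("Sample I", sample_1), ("Sample II", sample_2)]:
    print(name, round(sample_mean(xs), 2), round(sample_variance(xs), 2))
```

The two samples come from the same population, yet they give different estimates; this sample-to-sample variation is what a sampling distribution describes.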
Estimation of population proportion

Consider estimating a parameter $p_0$, the probability that a machine breakdown is due to operator misuse
Suppose a sample of $n$ machine breakdowns is recorded, of which $x_0$ are due to operator misuse

Estimation of population proportion

The point estimate of the unknown parameter $p_0$

$$\hat{p}_0 = \frac{x_0}{n}$$

Estimate of population proportion due to operator misuse

$$\hat{p}_0 = \frac{13}{46} = 0.28$$

Type        Frequency
Electrical  9
Mechanical  24
Misuse      13
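
As a small check of the arithmetic, the estimate can be computed from the frequency table. This is only a sketch; the dictionary name `breakdowns` and the variable names are assumptions made for illustration.

```python
# Breakdown counts by cause, taken from the frequency table above.
breakdowns = {"Electrical": 9, "Mechanical": 24, "Misuse": 13}

n = sum(breakdowns.values())   # total number of recorded breakdowns
x0 = breakdowns["Misuse"]      # breakdowns due to operator misuse
p0_hat = x0 / n                # point estimate of p0

print(n, x0, round(p0_hat, 2))
```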

Summary

Let $x_1, \ldots, x_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$

The point estimate of $\mu$

$$\hat{\mu} = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

The point estimate of $\sigma^2$

$$\hat{\sigma}^2 = s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Summary

Let $x_1, \ldots, x_n$ be a random sample from $X \sim B(1, p)$

The point estimate of $p$

$$\hat{p} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Sampling distribution

Sampling distribution

Probability distribution of a statistic is known as a sampling distribution of the statistic


E.g. probability distribution of the sample mean x̄ is its sampling distribution
The main goal of statistical inference is to estimate unknown population characteristics
(parameters) using sample observations
A number of samples of a specific size (say n) can be drawn from the population and from
each of these samples, the statistic of interest (e.g. x̄) can be calculated
The distribution of these statistics constitutes the sampling distribution

Sampling distribution of a sample mean

Let $x_1, \ldots, x_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$

The sample mean

$$\hat{\mu} = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

For a large $n$, the sampling distribution of the sample mean $\bar{x}$ is approximately normal

$$\hat{\mu} = \bar{X} \sim N(\mu, \sigma^2/n)$$

Note this result holds for any population with a finite variance, whether skewed or bell-shaped

Sampling distribution of a sample mean

The standard deviation (sd) of a sampling distribution is known as the standard error (se)
The standard error of the sample mean

$$se(\bar{x}) = \frac{\sigma}{\sqrt{n}}$$

The corresponding Z statistic

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma} \sim N(0, 1)$$

Sampling distribution of a sample mean

Sample I: 10.35, 18.76, 6.55, 3.37, 5.88


Sample mean: x̄ = 8.982
Sample II: 23.65, 6.42, 2.94, 5.66, 1.06
Sample mean: x̄ = 7.946
Sample III: 0.59, 5.79, 39.59, 11.73, 9.97
Sample mean: x̄ = 13.534
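
The three sample means above differ noticeably even though each sample has the same size. The sketch below first checks the quoted means and then simulates many sample means to show that their spread is close to $\sigma/\sqrt{n}$. The exponential population with mean 10, the seed, and the number of replications are assumptions chosen for illustration; the slides do not state which distribution generated these samples.

```python
import random
import statistics

# The three samples of size 5 listed above.
samples = {
    "I": [10.35, 18.76, 6.55, 3.37, 5.88],
    "II": [23.65, 6.42, 2.94, 5.66, 1.06],
    "III": [0.59, 5.79, 39.59, 11.73, 9.97],
}
means = {name: statistics.fmean(xs) for name, xs in samples.items()}
print({name: round(m, 3) for name, m in means.items()})


def simulate_sample_means(n, reps, mean=10.0, seed=1):
    """Draw `reps` samples of size n from an exponential population
    (an assumed population with mean = sd = `mean`) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0 / mean) for _ in range(n)])
        for _ in range(reps)
    ]


for n in (5, 50):
    xbars = simulate_sample_means(n, reps=5000)
    # Empirical sd of the sample means versus the theoretical sigma / sqrt(n).
    print(n, round(statistics.stdev(xbars), 2), round(10.0 / n ** 0.5, 2))
```

A histogram of `xbars` for n = 50 would look much closer to a bell shape than one for n = 5, which is the behaviour the large-sample result describes.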

Exponential distribution

Normal distribution

Exercise 7.3.24

Suppose that components have weights that are normally distributed with μ = 341 and
σ = 2.

An experimenter measures the weights of a random sample of 20 components in order to estimate μ.

What is the probability that the experimenter's estimate of μ will be less than 341.5?

Exercise 7.3.24

The estimate of $\mu$ is the sample mean

$$\hat{\mu} = \bar{x} = \frac{\sum x}{20}$$

The sampling distribution of the sample mean

$$\hat{\mu} = \bar{x} \sim N(341, 4/20)$$

The probability that $\hat{\mu}$ is less than 341.5 is

$$P(\hat{\mu} < 341.5) = \Phi\left(\frac{341.5 - 341}{2/\sqrt{20}}\right) = \Phi(1.12) = 0.8686$$
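
The same probability can be checked numerically instead of reading it from a normal table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 341.0, 2.0, 20

# Sampling distribution of the sample mean: N(mu, sigma^2 / n).
se = sigma / sqrt(n)
xbar_dist = NormalDist(mu, se)

z = (341.5 - mu) / se          # standardised value, about 1.118
prob = xbar_dist.cdf(341.5)    # P(sample mean < 341.5)

print(round(z, 3), round(prob, 4))
```

Because the code does not round z to 1.12 before evaluating Φ, the result is slightly below the table-based value, at roughly 0.868.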
Exercise 7.3.2
Consider a sample $X_1, \ldots, X_n$ of normally distributed random variables with mean $\mu$ and variance $\sigma^2 = 1$.

If $n = 10$, what is the probability

$$P(|\mu - \bar{X}| \le 0.3)?$$

What is this probability when $n = 30$?
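
One possible way to approach this exercise numerically: since $\bar{X} \sim N(\mu, \sigma^2/n)$, the event $|\mu - \bar{X}| \le 0.3$ is equivalent to $|Z| \le 0.3\sqrt{n}/\sigma$ for a standard normal $Z$. The helper name `prob_within` is hypothetical, and this is a sketch rather than the intended textbook solution.

```python
from math import sqrt
from statistics import NormalDist


def prob_within(delta, sigma, n):
    """P(|Xbar - mu| <= delta) when Xbar ~ N(mu, sigma^2 / n)."""
    z = delta / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    return std_normal.cdf(z) - std_normal.cdf(-z)


for n in (10, 30):
    print(n, round(prob_within(0.3, 1.0, n), 4))
```

The probability is larger for n = 30 because the standard error $\sigma/\sqrt{n}$ is smaller, so the sample mean is more tightly concentrated around $\mu$.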

t-distribution

We have shown for a large $n$

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$

If the population variance is unknown and needs to be estimated from the sample of $n$ observations, the statistic obtained by replacing $\sigma$ with its estimate does not follow the standard normal distribution; it follows a distribution known as the t-distribution
Similar to the standard normal distribution, the t-distribution is symmetric about its mean 0
The t-distribution has one parameter, which is the degrees of freedom (df)

t-distribution

The following t statistic follows a t-distribution with $n - 1$ degrees of freedom

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}$$

where $s$ is the sample standard deviation

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

Quantiles of the t-distribution with different df can be obtained from a t-table

Comparison between t and standard normal distributions

P (t(1) > 6.314) = 0.05

P (t(9) > 1.833) = 0.05

P (Z > 1.645) = 0.05
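
These tail probabilities can be checked without a t-table. The sketch below assumes only the Python standard library is available, so it integrates the t density numerically with Simpson's rule rather than calling a statistics package; the function names `t_pdf` and `t_upper_tail` are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the t-distribution with `df` degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(c, df, steps=2000):
    """P(T > c) for c >= 0, using symmetry about 0 and Simpson's rule
    on [0, c]; `steps` must be even."""
    h = c / steps
    total = t_pdf(0.0, df) + t_pdf(c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


print(round(t_upper_tail(6.314, 1), 3))
print(round(t_upper_tail(1.833, 9), 3))
```

The cut-off needed for a 5% upper tail falls from 6.314 at 1 df to 1.833 at 9 df, approaching the standard normal value 1.645 as the degrees of freedom grow.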

Exercise 7.3.9
The breaking strengths of 35 pieces of cotton thread are measured.
The sample mean is $\bar{x} = 974.3$ and the sample variance is $s^2 = 452.1$.
Construct a point estimate of the average breaking strength of this type of cotton thread.
What is the standard error of your point estimate?

Sampling distribution of a sample proportion

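Let $X \sim B(n, p)$. For a large $n$, the sample proportion $\hat{p} = X/n$ approximately follows a normal distribution

$$\hat{p} \sim N\left(p, \frac{p(1 - p)}{n}\right)$$

The standard error of the sample proportion:

$$se(\hat{p}) = \sqrt{\frac{p(1 - p)}{n}}$$

The corresponding Z statistic

$$Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} \sim N(0, 1)$$
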
Sampling distribution of a sample proportion

A coin that is suspected of being biased is tossed many times in order to investigate the
possible bias. Consider the following two scenarios:
Scenario I: The coin is tossed 100 times and 40 heads are obtained
Scenario II: The coin is tossed 1000 times and 400 heads are obtained
What is the difference, if any, between the interpretations of these two sets of experimental
results?

Sampling distribution of a sample proportion

Point estimate $\hat{p} = 0.40$ is the same for both scenarios

Scenario I

$$se(\hat{p}) = \sqrt{\frac{0.4(1 - 0.4)}{100}} = \sqrt{0.0024} \approx 0.049$$

Scenario II

$$se(\hat{p}) = \sqrt{\frac{0.4(1 - 0.4)}{1000}} = \sqrt{0.00024} \approx 0.015$$
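
The standard errors for the two scenarios can be computed directly from $se(\hat{p}) = \sqrt{\hat{p}(1 - \hat{p})/n}$. Here $\hat{p}(1 - \hat{p})/n$ equals 0.0024 and 0.00024 for the two scenarios, and the standard error is the square root of that quantity. The function name `se_proportion` is an illustrative assumption, and the unknown $p$ is replaced by its estimate $\hat{p}$.

```python
from math import sqrt


def se_proportion(p_hat, n):
    """Estimated standard error of a sample proportion."""
    return sqrt(p_hat * (1 - p_hat) / n)


for heads, tosses in [(40, 100), (400, 1000)]:
    p_hat = heads / tosses
    print(tosses, p_hat, round(se_proportion(p_hat, tosses), 4))
```

The point estimate is identical in both scenarios, but the standard error with 1000 tosses is smaller by a factor of $\sqrt{10}$, so the second experiment gives much stronger evidence about the bias of the coin.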

Exercise 7.3.7
Consider a sample $X_1, \ldots, X_n$ of normally distributed random variables with mean $\mu$.

Suppose that $n = 21$.

What is the value of $c$ for which

$$P\left(\left|\frac{\bar{X} - \mu}{S}\right| \le c\right) = 0.95?$$

Exercise 7.3.21
Unknown to an experimenter, when a coin is tossed there is a probability of p = 0.63 of
obtaining a head.
The experimenter tosses the coin 300 times in order to estimate the probability p.
What is the probability that the experimenter's point estimate of p will be within the interval
(0.62, 0.64)?

Exercise 7.3.8
In a consumer survey, 234 people out of a representative sample of 450 people say that they
prefer product A to product B.
Let p be the proportion of all consumers who prefer product A to product B.
Construct a point estimate of p.
What is the standard error of your point estimate?

Exercise 7.3.20
The capacitances of certain electronic components have a normal distribution with a mean
μ = 174 and a standard deviation σ = 2.8.

If an engineer randomly selects a sample of n = 30 components and measures their capacitances, what is the probability that the engineer's point estimate of the mean μ will be within the interval (173, 175)?

Exercise 7.3.23
A scientist reports that the proportion of defective items from a process is 12.6%. If the
scientist’s estimate is based on the examination of a random sample of 360 items from the
process, what is the standard error of the scientist's estimate?
Exercise 7.3.34
Consider the data set 7, 9, 14, 15, 22. Obtain the standard error of the sample mean.
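
A possible numerical approach to Exercise 7.3.34, offered as a sketch: because the population standard deviation $\sigma$ is not given, the code assumes it is replaced by the sample standard deviation $s$ in $se(\bar{x}) = \sigma/\sqrt{n}$. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

data = [7, 9, 14, 15, 22]

x_bar = mean(data)          # point estimate of the population mean
s = stdev(data)             # sample standard deviation (n - 1 divisor)
se = s / sqrt(len(data))    # estimated standard error of the sample mean

print(x_bar, round(s, 3), round(se, 3))
```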

Exercise 7.7.18
The probability that a medical treatment is effective is 0.68, unknown to a researcher. In an
experiment to investigate the effectiveness of the treatment, the researcher applies the
treatment in 140 cases and measures whether the treatment is effective or not.
What is the probability that the researcher's estimate of the probability that the medical
treatment is effective is within 0.05 of the correct answer?
