
Sampling

The document discusses the importance and methodology of sampling in statistical investigations, highlighting its merits such as cost-effectiveness and detailed analysis, as well as demerits like potential misleading results. It explains key concepts such as parameters and statistics, hypothesis testing, and the significance of sample sizes, including the use of t-tests and chi-square tests for data analysis. Various examples illustrate the application of these statistical methods in real-world scenarios.

Uploaded by

mavnish604

Data play an important role in any statistical investigation. Sampling is a popular method of collecting data. The fundamental assumption behind the sampling method is that if the units of a sample are selected at random, its characteristics will be almost the same as those of the universe (population).

Sample

A sample is a small part of the universe that represents its salient features.

"A sample is that part of the universe which we select for the purpose of investigation."

Importance of sampling

Samples are devices for learning about large masses by observing a few individuals. In fact, we are living in the age of sampling.

Merits

1. Economic method: We don't need to investigate the whole population.
2. Saving of time and labour: Both are saved, not only in conducting the sampling enquiry but also in processing, editing and analysing the data.
3. Testing of accuracy: Sampling methods allow us to check accuracy by comparing the results of two or more samples.
4. Detailed and intensive enquiry: As the number of units investigated is limited, it is possible to study them in detail and intensively.
5. Only method in many cases: If the population is too large, or if the testing of units is destructive, we are left with no other way but to use sampling.

Demerits

1. Misleading results: If a sample survey is not properly planned and carefully executed, its results may be misleading.
2. Need of specialised knowledge
3. Heterogeneous units: If the units of the population are too heterogeneous, sampling may not be suitable.

Sampling theory

It is the study of the relationship existing between a population and the samples drawn from the population.

Parameter and statistic

A statistic is a characteristic of a sample, while a parameter is a characteristic of a population.
Suppose we have a town whose population is 50,000. The statistical measures based on the data of all these persons will be parameters.
On the other hand, if we draw a sample of 5,000 persons and compute various statistical measures such as the mean, SD etc., they will be statistics.
Thus, a parameter is a statistical measure which relates to the population and is based on population data, whereas a statistic is a statistical measure which relates to the sample and is based on sample data.
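As a rough illustration of this distinction, the following Python sketch builds a synthetic population of 50,000 values and compares its parameters with the statistics of one random sample of 5,000. The ages, the seed and all variable names are assumptions made for the example; they are not data from the text.

```python
import random
import statistics

# Hypothetical data: ages for a synthetic "town" of 50,000 people.
rng = random.Random(0)                      # fixed seed so the run is repeatable
population = [rng.randint(0, 90) for _ in range(50_000)]

# Parameters: computed from every unit of the population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)       # population SD (divides by N)

# Statistics: computed from a random sample of 5,000 units.
sample = rng.sample(population, 5_000)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)                # sample SD (divides by n - 1)

print(f"parameter mu    = {mu:.2f}, statistic x_bar = {x_bar:.2f}")
print(f"parameter sigma = {sigma:.2f}, statistic s     = {s:.2f}")
```

The sample values should land close to the population values, but not exactly on them; that gap is what sampling theory studies.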
Objectives of sampling theory
1. Estimation of parameters
(a) Point estimate
An estimate of a population parameter given by a single number.
(b) Interval estimate
A statement of two values between which the parameter is expected to lie.
2. Testing of hypothesis
Standard error
The standard deviation of the sampling distribution of a statistic is called its standard error. For example, the standard deviation of the sampling distribution of sample means is called the standard error of the mean.
Parameters (population)    Statistics (sample)
Population size N          Sample size n
Population mean µ          Sample mean x̄
Population SD σ            Sample SD s
Large and small samples A sample with size > 30 is known as a large sample
while any sample with size ≤ 30 is known as a small sample.
Hypothesis testing and errors A hypothesis is a statement about the
population parameter.
1. Null Hypothesis (H0 )
2. Alternate Hypothesis (H1 )
Errors
1. Type I error
Rejecting a null hypothesis when it is true.
2. Type II error
Accepting a null hypothesis when it is false.
True position     H0 accepted         H0 rejected
H0 is true        Correct decision    Type I error
H0 is not true    Type II error       Correct decision
The maximum probability of a type I error is known as the level of significance, and it is fixed in advance. For example, if the level of significance is fixed at 5 %, there is a possibility of making a type I error (rejecting a true null hypothesis) in 5 out of 100 cases. We can reduce the type I error by lowering the level of significance. However, for a fixed sample size, reducing the type I error increases the chance of a type II error (accepting a false null hypothesis).
Critical value The value is obtained from a standard table at a particular
level of significance.
Two tailed test and One tailed test
Critical value (zα)    1 %            5 %
Two tailed test        |zα| = 2.58    |zα| = 1.96
Right tailed test      zα = 2.33      zα = 1.64
Left tailed test       zα = −2.33     zα = −1.64
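The tabulated critical values can be reproduced from the standard normal distribution. The sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module (Python 3.8 or later); the function name `critical_values` is an assumption made for the example.

```python
from statistics import NormalDist

def critical_values(alpha):
    """Return (two_tailed, right_tailed, left_tailed) z critical values."""
    z = NormalDist()                         # standard normal: mean 0, SD 1
    two_tailed = z.inv_cdf(1 - alpha / 2)    # alpha split between both tails
    right_tailed = z.inv_cdf(1 - alpha)      # all of alpha in the right tail
    left_tailed = z.inv_cdf(alpha)           # all of alpha in the left tail
    return two_tailed, right_tailed, left_tailed

for alpha in (0.01, 0.05):
    two, right, left = critical_values(alpha)
    print(f"alpha = {alpha}: two tailed ±{two:.2f}, right {right:.2f}, left {left:.2f}")
```

Rounded to two decimals, these agree with the table; the 5 % one tailed value is about 1.645 before rounding.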
Example
In the study of the mean, the null hypothesis is H0 : µ = µ0.
The possible alternative hypotheses are
1. H1 : µ ≠ µ0 (i.e. µ > µ0 or µ < µ0). It is a two tailed test.
2. H1 : µ < µ0 (one tailed test or left sided test).
3. H1 : µ > µ0 (one tailed test or right sided test).
Test of Significance for Single Mean
Under the null hypothesis (H0) that the sample has been drawn from a population with mean µ and variance σ², i.e. that there is no significant difference between the sample mean (x̄) and the population mean (µ), the test statistic (for large samples) is
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
If the SD of the population is not known, then we use
\[ Z = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]
Here, s is the SD of the sample.

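Confidence limits for µ: The 95 % confidence interval for µ is given by
\[ |z| \le 1.96 \;\Rightarrow\; \left| \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \right| \le 1.96 \;\Rightarrow\; \bar{x} - 1.96 \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}} \]
Similarly, we can obtain the 99 % confidence limits for µ as
\[ \bar{x} \pm 2.58 \frac{\sigma}{\sqrt{n}} \]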

Problem: A sample of 900 members has a mean of 3.4 cm. Is the sample from a large population with mean 3.25 cm and SD 2.61 cm? Find the 95 % confidence limits of the true mean.

Solution
Null hypothesis (H0): The sample has been drawn from the population with mean µ = 3.25 cm and SD σ = 2.61 cm.
Here, we are given x̄ = 3.4 cm, n = 900, µ = 3.25 cm and σ = 2.61 cm.
\[ Z = \frac{3.40 - 3.25}{2.61 / \sqrt{900}} = \frac{0.15 \times 30}{2.61} = 1.72 \]
Since |Z| < 1.96, H0 can be accepted at the 5 % level of significance.
The 95 % confidence limits are
\[ \bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}} = 3.40 \pm 1.96 \times \frac{2.61}{\sqrt{900}} = 3.40 \pm 0.1705 \]
Hence the limits are 3.2295 and 3.5705.
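The large-sample test and the confidence limits can be checked with a few lines of Python. This is a minimal sketch: the function names `z_statistic` and `confidence_limits` are assumptions made for the example, and the inputs are the figures from the problem above.

```python
from math import sqrt

def z_statistic(x_bar, mu, sd, n):
    """Z = (x_bar - mu) / (sd / sqrt(n)); sd is sigma, or s if sigma is unknown."""
    return (x_bar - mu) / (sd / sqrt(n))

def confidence_limits(x_bar, sd, n, z_crit=1.96):
    """Return (lower, upper) limits x_bar ± z_crit * sd / sqrt(n)."""
    margin = z_crit * sd / sqrt(n)
    return x_bar - margin, x_bar + margin

# Figures from the problem: n = 900, x_bar = 3.4, mu = 3.25, sigma = 2.61
z = z_statistic(3.4, 3.25, 2.61, 900)
lower, upper = confidence_limits(3.4, 2.61, 900)
accept_h0 = abs(z) < 1.96        # two tailed test at the 5 % level

print(f"Z = {z:.2f}, accept H0: {accept_h0}")
print(f"95 % limits: {lower:.4f} to {upper:.4f}")
```

Passing `z_crit=2.58` to `confidence_limits` gives the 99 % limits instead.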

Problem: A sample of 1600 units is found to have a mean of 3.4 cm. Can it be reasonably regarded as a simple sample from a large population with mean 3.2 cm and SD 2.3 cm?

Solution
Here, n = 1600, µ = 3.2, x̄ = 3.4 and σ = 2.3
H0 : The sample is drawn from a population with mean 3.2 cm.
Now,
\[ |Z| = \frac{|\bar{x} - \mu|}{\sigma / \sqrt{n}} = \frac{|3.4 - 3.2|}{2.3 / \sqrt{1600}} = 3.478 \]
Since |Z| > 3, we reject H0.

Problem A population has a mean of 159.7 cm and SD 4.5 cm. How large a sample would be necessary to make the standard error of the mean less than or equal to 0.5 cm?

Solution
Given n = ?, µ = 159.7, σ = 4.5 and SE ≤ 0.5
\[ SE = \frac{\sigma}{\sqrt{n}} \;\Rightarrow\; 0.5 = \frac{4.5}{\sqrt{n}} \;\Rightarrow\; 0.5 \times \sqrt{n} = 4.5 \;\Rightarrow\; \sqrt{n} = 9 \]
We get n = 81.
Therefore, the size of the sample must be at least 81.
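Solving SE = σ/√n for n gives n ≥ (σ/SE)². A small sketch of that rearrangement follows; the helper name `min_sample_size` is hypothetical.

```python
from math import ceil

def min_sample_size(sigma, max_se):
    """Smallest n with sigma / sqrt(n) <= max_se, i.e. n >= (sigma / max_se)**2."""
    # round() removes floating-point noise before the ceiling is taken,
    # so a value such as 81.00000000000001 is not pushed up to 82.
    return ceil(round((sigma / max_se) ** 2, 9))

n = min_sample_size(sigma=4.5, max_se=0.5)
print(n)
```

The ceiling matters when (σ/SE)² is not a whole number, since a sample size must be an integer and rounding down would leave the standard error above the target.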

Problem An automatic machine was designed to pack 2.0 kg of Vanaspati. A sample of 100 tins was examined to test the machine. The average weight was found to be 1.94 kg with SD 0.10 kg. Is the machine working properly?

Solution
H0 : The machine is working properly (µ = 2 kg).
Given n = 100, µ = 2 kg, x̄ = 1.94 kg and s = 0.10 kg
Now,
\[ |Z| = \frac{|\bar{x} - \mu|}{s / \sqrt{n}} = \frac{|1.94 - 2|}{0.10 / \sqrt{100}} = 6 \]
Since the calculated value of |Z| is greater than the tabular value at both the 5 % level (1.96) and the 1 % level (2.58), we reject H0. Hence, the machine is not working properly.
Problem In the past the average length of an outgoing telephone call from a business office has been 143 seconds. A manager wishes to check whether that average has decreased after the introduction of policy changes. A sample of 100 telephone calls produced a mean of 133 seconds, with a standard deviation of 35 seconds. Perform the relevant test at the 1 % level of significance.
Solution
H0 : µ = 143, H1 : µ < 143 (left tailed test)
Given n = 100, x̄ = 133, s = 35
Now,
\[ |Z| = \frac{|\bar{x} - \mu|}{s / \sqrt{n}} = \frac{|133 - 143|}{35 / \sqrt{100}} = 2.86 \]
The calculated value of |Z| is greater than the tabular value (2.33 for a one tailed test at the 1 % level), hence we reject H0.
Equivalently, keeping the sign,
\[ Z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{133 - 143}{35 / \sqrt{100}} = -2.86 \]
The calculated value of Z is less than the tabular value (−2.33), hence we reject H0.

Problem
The average household size in a certain region several years ago was 3.14 persons. A sociologist wishes to test, at the 5 % level of significance, whether it is different now. Perform the test using the information collected by the sociologist: in a random sample of 75 households, the average size was 2.98 persons, with sample standard deviation 0.82 persons.

Solution
H0 : µ = 3.14, H1 : µ ≠ 3.14 (two tailed test)
Given n = 75, x̄ = 2.98, s = 0.82
Now,
\[ |Z| = \frac{|\bar{x} - \mu|}{s / \sqrt{n}} = \frac{|2.98 - 3.14|}{0.82 / \sqrt{75}} = 1.69 \]
The calculated value of |Z| is less than the tabular value (1.96), hence we accept H0.
Small sample test or t-test
The t-test is used to test the significance of
1. The mean of a small sample
2. The difference between the means of two small samples, i.e. to compare two small samples
Test of significance of the mean of small sample
Steps involved
To test the significance of a sample mean at the 5 % level of significance:
• H0 : The population mean (µ) is equal to the given value of the mean (i.e. µ = µ0).
• Calculate
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \quad \text{or} \quad t = \frac{(\bar{x} - \mu)\sqrt{n}}{s} \]
• Compare the calculated value with the tabular value of t at (n − 1) degrees of freedom (dof = n − 1) and 5 % level of significance.
Problem A random sample of size 20 drawn from a normal population yielded the following results: x̄ = 49.2, s = 1.33. Test H0 : µ = 50 vs. H1 : µ ≠ 50 at α = 0.01. (The tabular value of t at 19 dof and 1 % level of significance is 2.86.)
Using the formula
\[ t = \frac{|\bar{x} - \mu|}{s / \sqrt{n}} = \frac{|49.2 - 50|}{1.33 / \sqrt{20}} = 2.690 \]
Since the calculated value is less than the tabular value, we accept H0.
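The steps of the small-sample test can be sketched in Python from the summary figures alone. The function name `t_statistic` is an assumption made for the example; the tabular value 2.86 is the one quoted in the problem, not one computed here.

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), compared at n - 1 degrees of freedom."""
    return (x_bar - mu0) / (s / sqrt(n))

# Figures from the problem above.
t = t_statistic(x_bar=49.2, mu0=50, s=1.33, n=20)
dof = 20 - 1
t_table = 2.86                    # tabular value quoted in the problem
accept_h0 = abs(t) < t_table      # two tailed comparison

print(f"|t| = {abs(t):.3f}, dof = {dof}, accept H0: {accept_h0}")
```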

Problem
The heights of 9 children selected at random from a given colony had a mean of 63.5 cm and variance 6.25 cm². Test, at the 5 % level of significance, the hypothesis that the children of the given colony are on average 65 cm tall, against the alternative that they are shorter than 65 cm on average. (The value of t for 8 d.f. at 5 % level of significance is 2.262.)
Solution
H0 : The average height of the children is 65 cm, i.e. µ = 65. H1 : µ < 65
n = 9, x̄ = 63.5 cm, variance = 6.25 (so SD = 2.5) and µ = 65
Using the formula
\[ t = \frac{|\bar{x} - \mu|}{s / \sqrt{n}} = \frac{|63.5 - 65|}{2.5 / \sqrt{9}} = 1.8 \]
Since the calculated value is less than the tabular value, we accept H0.

Problem Six boys are selected at random from a school and their marks in Mathematics are found to be 63, 63, 64, 66, 60 and 68 out of 100. In the light of these marks, discuss the general observation that the mean mark in Mathematics in the school was 66. (The value of t for 5 d.f. at 5 % level of significance is 2.571.)

Solution
H0 : µ = 66
Marks (x)    d = x − 64    d²
63           −1            1
63           −1            1
64           0             0
66           2             4
60           −4            16
68           4             16
Σx = 384     Σd = 0        Σd² = 38
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{384}{6} = 64 \]
\[ s = \sqrt{\frac{\sum d^2}{n - 1}} = \sqrt{\frac{38}{6 - 1}} = 2.757 \]
Using the formula
\[ t = \frac{|\bar{x} - \mu|}{s / \sqrt{n}} = \frac{|64 - 66|}{2.757 / \sqrt{6}} = 1.777 \]
Since the calculated value is less than the tabular value, we accept H0.
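When the raw observations are available, the mean and the sample SD (with the n − 1 divisor) can be computed directly instead of through a deviation table. The sketch below uses `statistics.mean` and `statistics.stdev` from the Python standard library; the variable names are assumptions made for the example.

```python
import statistics
from math import sqrt

marks = [63, 63, 64, 66, 60, 68]
mu0 = 66

n = len(marks)
x_bar = statistics.mean(marks)    # 384 / 6
s = statistics.stdev(marks)       # sample SD: sqrt(sum(d^2) / (n - 1))
t = abs(x_bar - mu0) / (s / sqrt(n))

t_table = 2.571                   # tabular value quoted in the problem (5 d.f., 5 %)
print(f"mean = {x_bar}, s = {s:.3f}, t = {t:.3f}, accept H0: {t < t_table}")
```

Note that `statistics.pstdev` would divide by n rather than n − 1 and so would not match the formula used in the solution.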

χ² test
The chi-square test is a measure which
• tells us the magnitude of the difference between the actual or observed frequencies (fo) and the corresponding theoretical or expected frequencies (fe).
• indicates whether the difference is significant or merely due to sampling fluctuations.
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
Use of Chi-square test
• Test of independence
• Test of goodness of fit
Problem The following figures show the distribution of digits in numbers chosen at random from a telephone directory:

Digit        0      1      2     3     4      5     6      7     8     9     Total
Frequency    1026   1107   997   966   1075   933   1107   972   964   853   10,000
Test whether the digits may be taken to occur equally frequently in the directory. (The tabular value of χ² at 5 % level of significance for 9 degrees of freedom is 16.919.)

Solution H0 : The digits occur equally frequently.

Expected frequency of each digit: fe = 10,000 / 10 = 1000
Digit    fo       fe      (fo − fe)    (fo − fe)²    (fo − fe)²/fe
0        1026     1000    26           676           0.676
1        1107     1000    107          11449         11.449
2        997      1000    −3           9             0.009
3        966      1000    −34          1156          1.156
4        1075     1000    75           5625          5.625
5        933      1000    −67          4489          4.489
6        1107     1000    107          11449         11.449
7        972      1000    −28          784           0.784
8        964      1000    −36          1296          1.296
9        853      1000    −147         21609         21.609
Total    10,000                                      58.542
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 58.542 \]
The calculated value of χ² is much greater than the tabular value, hence we reject our null hypothesis.
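The goodness-of-fit calculation in the table can be sketched in a few lines of Python. The function name `chi_square` is an assumption made for the example; the tabular value 16.919 is the one quoted in the problem.

```python
def chi_square(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all classes."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

observed = [1026, 1107, 997, 966, 1075, 933, 1107, 972, 964, 853]
# Under H0 every digit is equally likely, so each expected frequency is total / 10.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square(observed, expected)
dof = len(observed) - 1
chi2_table = 16.919               # tabular value quoted in the problem (9 d.f., 5 %)

print(f"chi-square = {chi2:.3f}, dof = {dof}, reject H0: {chi2 > chi2_table}")
```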

Problem
In an anti-malaria campaign in a certain area, quinine was administered to 812 persons out of the total population of 3,248. The number of fever cases is given below.
Treatment     Fever    No fever
Quinine       20       792
No quinine    220      2216
Discuss the usefulness of quinine in checking malaria. The tabular value of χ² at 5 % level of significance for 1 degree of freedom is 3.841.

Solution The given data can be represented as

Observed frequencies (fo)
            Quinine    No quinine    Total
No fever    792        2216          3008
Fever       20         220           240
Total       812        2436          3248
H0 : Quinine is not effective in checking malaria.
Expected frequencies (fe) = (row total × column total) / grand total
            Quinine                     No quinine                    Total
No fever    (812 × 3008)/3248 = 752     (2436 × 3008)/3248 = 2256     3008
Fever       (812 × 240)/3248 = 60       (2436 × 240)/3248 = 180       240
Total       812                         2436                          3248
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(792 - 752)^2}{752} + \frac{(20 - 60)^2}{60} + \frac{(2216 - 2256)^2}{2256} + \frac{(220 - 180)^2}{180} \]
\[ = 2.128 + 26.667 + 0.709 + 8.889 = 38.393 \]
The calculated value of χ² is much greater than the tabular value, hence we reject our null hypothesis.
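For a test of independence, the expected frequencies come from the row and column totals of the observed table. The following is a minimal sketch of that calculation for a table of any size; the function names `expected_frequencies` and `chi_square` are assumptions made for the example.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over every cell of the table."""
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )

# Rows: no fever, fever. Columns: quinine, no quinine.
observed = [[792, 2216], [20, 220]]
expected = expected_frequencies(observed)
chi2 = chi_square(observed, expected)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(f"chi-square = {chi2:.3f}, dof = {dof}")
```

The unrounded sum can differ from the hand calculation in the third decimal place, because the worked solution rounds each term before adding.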

Problem A sample of 400 undergraduate students and 400 postgraduate students was taken to know their opinion about autonomous colleges. 290 of the undergraduate and 310 of the postgraduate students favoured the autonomous status. Present these facts in the form of a table and test, at the 5 % level of significance, that the opinions regarding the autonomous status of colleges are independent of the level of classes of students. (The tabular value of χ² at 5 % level of significance for 1 degree of freedom is 3.841.)

Solution The given data can be represented as

Observed frequencies (fo)
             Undergraduate    Postgraduate    Total
Favoured     290              310             600
Against      110              90              200
Total        400              400             800
H0 : The opinions are independent of the level of classes of students.
Expected frequencies (fe)
             Undergraduate              Postgraduate               Total
Favoured     (400 × 600)/800 = 300      (400 × 600)/800 = 300      600
Against      (400 × 200)/800 = 100      (400 × 200)/800 = 100      200
Total        400                        400                        800
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(290 - 300)^2}{300} + \frac{(310 - 300)^2}{300} + \frac{(110 - 100)^2}{100} + \frac{(90 - 100)^2}{100} \]
\[ = 0.333 + 0.333 + 1 + 1 = 2.667 \]
The calculated value of χ² is less than the tabular value, hence we accept our null hypothesis.

Problem A set of five similar coins is tossed 320 times and the result is given in the following table.
No. of heads    0    1     2     3      4     5
Frequency       6    27    72    112    71    32
Test the hypothesis that the data follow a binomial distribution. (The tabular value of χ² at 5 % level of significance for 5 degrees of freedom is 11.07.)

Solution H0 : The data follow the binomial distribution.

P(H) = 1/2, P(T) = 1/2.
Let r represent the number of heads; then the theoretical frequencies are obtained as
\[ 320 \times P(r = 0) = 320 \binom{5}{0} \left(\tfrac{1}{2}\right)^5 = 10 \]
\[ 320 \times P(r = 1) = 320 \binom{5}{1} \left(\tfrac{1}{2}\right)^5 = 50 \]
\[ 320 \times P(r = 2) = 320 \binom{5}{2} \left(\tfrac{1}{2}\right)^5 = 100 \]
\[ 320 \times P(r = 3) = 320 \binom{5}{3} \left(\tfrac{1}{2}\right)^5 = 100 \]
\[ 320 \times P(r = 4) = 320 \binom{5}{4} \left(\tfrac{1}{2}\right)^5 = 50 \]
\[ 320 \times P(r = 5) = 320 \binom{5}{5} \left(\tfrac{1}{2}\right)^5 = 10 \]
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(6 - 10)^2}{10} + \frac{(27 - 50)^2}{50} + \frac{(72 - 100)^2}{100} + \frac{(112 - 100)^2}{100} + \frac{(71 - 50)^2}{50} + \frac{(32 - 10)^2}{10} = 78.68 \]
The calculated value of χ² is greater than the tabular value, hence we reject our null hypothesis.
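The theoretical frequencies for the binomial fit can be generated rather than written out term by term. This sketch uses `math.comb` (Python 3.8 or later); the function name `binomial_expected` is an assumption made for the example.

```python
from math import comb

def binomial_expected(n_trials, n_coins, p=0.5):
    """Expected frequency of r heads: n_trials * C(n_coins, r) * p^r * (1 - p)^(n_coins - r)."""
    return [
        n_trials * comb(n_coins, r) * p**r * (1 - p) ** (n_coins - r)
        for r in range(n_coins + 1)
    ]

observed = [6, 27, 72, 112, 71, 32]      # frequencies for 0, 1, ..., 5 heads
expected = binomial_expected(n_trials=320, n_coins=5)
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

print(expected)
print(f"chi-square = {chi2:.2f}")
```

Setting `p` to a value other than 0.5 would give the expected frequencies for biased coins, which is an extension beyond the problem as stated.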
