
STS 201 Solution

The Applied Statistics (STS 201) Tutorial Workbook is designed for 200-level students at the Federal University of Agriculture, Abeokuta, to enhance their understanding of statistics through exercises and explanations. It covers various topics including data presentation, probability, hypothesis testing, and data summarization, providing structured sections for student engagement. The workbook serves as a supplementary resource alongside recommended textbooks, aiming to facilitate learning and comprehension in applied statistics.


Applied Statistics (STS 201) Tutorial Workbook 2020


NAME:

MATRIC NUMBER:

DEPARTMENT:

COLLEGE:

PAYMENT RECEIPT NO.:


APPLIED

STATISTICS

(STS 201)

TUTORIAL WORKBOOK

Department of Statistics
College of Physical Science (COLPHYS)
Federal University of Agriculture,
Abeokuta, Ogun State

© copyright, Department of Statistics 2020


All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher.

Designed by the Department of Statistics,
Federal University of Agriculture,
Abeokuta.

TABLE OF CONTENTS

Section

A. Nature of Statistics

B. Data Presentation

C. Diagrammatic Representation of Data

D. Summarization of Data

E. Probability

F. Sampling Distribution

G. Estimation of Parameters

H. Test of Hypothesis

I. Correlation and Regression

J. Others

Tables

STUDENT DATA

PERSONAL INFORMATION

NAME: ___________________________
MATRIC NO: ___________________________________
DEPT: _______________________
COLLEGE: _____________________________________
RECEIPT NO: ____________________________________

ASSESSMENTS

Section    Marks
1
2
3
4
5
6
7

PREFACE


The Department of Mathematical Statistics of the University of Agriculture, Abeokuta, offers STS 201 (Applied Statistics) as one of the statistics courses taught to 200-level students in some of the University's programmes.

The Department prides itself on the quality of the teachers who have taught this course for many years. To complement the textbooks recommended for this course and to help students understand the subject thoroughly, the Department has decided to produce this workbook. While the workbook is not a text on its own, it is to serve as a platform for better understanding of the course. For this purpose, sufficient workspace is provided at the end of each section. This is to ensure that students work through the exercises and have them corrected by teachers and instructors.

Although the work is written primarily for students who register for STS 201 Applied Statistics, it is also adequate for students in other higher institutions taking courses equivalent to STS 201.

The Department wishes to thank its members who have contributed immensely to this workbook.

The Head of Department


(On behalf of the Department)

SECTION A


NATURE OF STATISTICS

1. Explain the following:

(a) Data (b) Observation

(c) Information (d) Phenomenon

(e) Statistics (f) Population

(g) Variable (h) Sample

(i) Attribute (j) Census

(k) Descriptive Statistics (l) Statistical Method

(m) Discrete Variable (n) Continuous Variable

2. Mention the two types of data and illustrate with examples

3. (a) Discuss the sources of data and the various methods of collection

(b) What are the advantages and disadvantages of these methods?

4. (a) What are the characteristics of a good questionnaire?

5. (a) What is a measurement scale?

(b) Explain the four (4) basic kinds of measurement scale


SOLUTIONS

1.

a. Data is the record of observations of a phenomenon. It is a mass of unprocessed information obtained from the measurement or counting of a characteristic or phenomenon.

b. Observation is the recording of information, usually in numerical form. It involves accurately watching and noting phenomena as they occur in nature, and it is the classical method of scientific enquiry. As a method of data collection, the observer positions himself or herself to watch the subjects, and the approach depends on the nature and size of the community.

c. Information: These are facts or details about a subject. It is also a numerical quantity that measures the uncertainty in the outcome of an experiment or analysis carried out.

d. Phenomenon: The characteristic of an object or event that is susceptible to observation. It is the segment of reality that is under observation.

e. Statistics: It is the science that deals with the collection, organization, presentation and summarization of numerical data. It deals with the description and analysis of numerical data with a view to making inferences about the population from which the data are obtained.

f. Population: This is the collection of all items under investigation. It may be finite or infinite.

g. Variable: This refers to a feature of the phenomenon that can be assigned numerical values. It is a characteristic of an object or event that is capable of being assigned numerical values and can be expressed in quantitative terms, e.g. length in metres or weight in kilograms.

h. Sample: This is the representative part of a population observed for the purpose of making scientific decisions about the population.


i. Attribute: This describes a characteristic of an object or event which is not capable of numerical definition or of being assigned numerical values, e.g. marital status, beauty.

j. Census: This is the complete enumeration of all the units of the population.

k. Descriptive Statistics: This is the act of summarizing and giving a descriptive account of numerical information in the form of reports, charts and diagrams.

l. Statistical method: This is the act of making inductive (inferential) statements about a population from the quantities computed from its representative sample. It has to do with estimation and tests of hypotheses.

m. Discrete Variable: This is a variable whose values change by steps. Its values may be obtained by counting. It normally takes integer values, e.g. number of cars.

n. Continuous Variable: This is a variable which may take all values within a given range. Its values are obtained by measurement, e.g. time, volume.

2. The two types of data are:

A. PRIMARY DATA: Data generated first hand or obtained directly from respondents by personal interview, measurement or observation.

B. SECONDARY DATA: Data obtained from publications, newspapers, magazines, journals and other printed information.

3. a (i) Documentary Source: A means of collecting data from already documented or published abstracts. Such information can be extracted from internal secondary data and external secondary data.


(ii) Observation: This method involves accurately watching and noting phenomena as they occur in nature, and it is the classic method of scientific enquiry. Actual measurement or counting comes under the heading of observation.

(iii) Postal Questionnaires: This is one of the most widely used methods of data collection, mostly in social surveys. Questionnaires are mailed out to respondents, who in turn are expected to send them back by post when they are duly completed.

(iv) Personal interview: This is the method used in most surveys. It could be formal or less formal.

(v) Telephone: The method of collecting data from respondents by telephone.

3b (i) Observation

ADVANTAGES

- Observers can be highly accurate

- There is no problem of respondent error

DISADVANTAGES

- It is limited in application

- Bias may occur and mistakes are possible.

(ii) Postal Questionnaires

ADVANTAGES

- Generally cheaper than other methods

- It may be easier to ask some questions than in personal interviews.

DISADVANTAGES

- No opportunity to supplement the respondent data

- Response rate is low.

(iii) Telephone

ADVANTAGES

- It is faster than any other method.

- It is more flexible than questionnaires

DISADVANTAGES

- The investigator may not be able to ascertain whether he or she is talking to the right person

- Response cannot be supplemented with observational data

(iv) Personal interview

ADVANTAGES

- Allows for call back when necessary

- Most appropriate method when urgency is attached

DISADVANTAGES

- Expensive when compared to other methods

- Bias on the part of interviewers may distort respondents' answers.

4. i Questionnaires must be written in simple language

ii Questions must not be technical except for specialized surveys.

iii Vague words must be avoided

iv Leading questions should be avoided

v Questions must be asked in logical sequence

5. a. A measurement scale is a scheme for assigning numbers or symbols to specify differing attributes.

b. i NOMINAL SCALE: This is a naming scale that classifies objects into categories. A number is assigned to a group only to give it a distinct identity.

ii ORDINAL SCALE: This involves the assignment of numbers or symbols for the purpose of identifying ordered relations of some characteristic.

iii INTERVAL SCALE: Numbers are assigned for the purpose of identifying ordered relations of some characteristic, the order having equal intervals and an arbitrary zero point.

iv RATIO SCALE: Numbers are assigned for the purpose of identifying ordered relations of some characteristic, the order having equal intervals and an absolute zero point.

SECTION B

DATA PRESENTATION

1. Explain the following:

(a) Raw Data (b) Array

(c) Distribution (d) Frequency Distribution

(e) Cumulative frequency distribution (f) Relative frequency distribution

2. Differentiate between a histogram and a frequency polygon and illustrate with an example


3. From the following distribution

Classes      No. of Objects (f)
10 – 14      19
15 – 19      24
20 – 24      37
25 – 29      81
30 – 34      43
35 – 39      30
40 – 44      16
Total        250
Find:

(a) The class interval

(b) The class boundaries

(c) The class mark

(d) The class width or size of class

(e) The cumulative frequency of the distribution

(f) The relative frequency of the distribution

4. Consider the distribution given below

Ages (years to the next birthday)    Frequency
15 < 20      37
20 < 25      81
25 < 30      43
30 < 35      24
35 < 40      9
40 < 45      6
Total        200

(a) Construct a cumulative frequency graph


(b) Construct a histogram
(c) Construct a frequency polygon
(d) Construct a bar chart

5. A company administers an aptitude test to 100 applicants for a job with the
company. The following are the times taken to complete a simple task for
each applicant, measured to the nearest second.
44 92 72 45 85 61 66 46 59 57 52 40 93
54

52 64 65 44 51 66 92 58 74 42 43 56 46
52 45 56 68 40 48 76 71 99 51 72 52 56
69 58

40 76 70 42 52 46 73 59 41 55 74 66 64
47 58 46 52 54 63 89 87 41 57 68 59 81
82 60 67 68 97 57 47 53 61 52 49 47 86
55 54 48 85 45 84 53 49 47 70 78 58 96
54 62 60 57 58 41 70

(a) Construct a frequency table for the above data using classes of 40 – 49, 50 – 59, 60 – 69, etc.

(b) Construct a cumulative frequency distribution.

(c) Construct a relative frequency distribution

(d) Draw the histogram


(e) Draw the ogive

SOLUTIONS

1. Raw Data: This is data that has not been processed for use. It is also known as primary data. It is collected data that has not been numerically organized.

Array: This is a systematic arrangement of objects, usually in rows and columns. It is an arrangement of numerical data in ascending or descending order of magnitude.

Distribution: This is a listing showing all possible values or intervals of the data and how often they occur.

Frequency distribution: This is a tabular arrangement showing the various values of a variable together with their corresponding numbers of occurrences.

Cumulative frequency distribution: This is used to determine the number of observations that lie above or below a particular value in a data set.

Relative frequency distribution: This shows the fraction or percentage of the data values in each class.

2. i A histogram is a bar graph while a frequency polygon is a line graph.

ii A histogram is a representation of a frequency distribution in the form of a diagram consisting of rectangles whose bases correspond to the class intervals and whose heights are proportional to the frequencies, while a frequency polygon is the plot of class mark or mid-point against frequency for a distribution with equal class intervals.

iii A histogram is made up of two-dimensional rectangles while a frequency polygon is a closed figure with many (more than four) sides.

iv More than one histogram cannot be drawn on the same axis while the frequency polygons of several distributions can be plotted on the same axis.

v A histogram cannot be drawn from a polygon but a frequency polygon may be drawn from a histogram by joining the mid-points of the upper horizontal sides of the rectangles.

[Figure: histogram]

[Figure: frequency polygon]

3a. Class interval: the range of values covered by a class in a frequency distribution, e.g. 10 – 14, 15 – 19.

The difference between the limits of each class interval is 14 − 10 = 4.

b. Class boundary $= \frac{14 + 15}{2} = \frac{29}{2} = 14.5$, so the class boundaries are 9.5 – 14.5, 14.5 – 19.5, ..., 39.5 – 44.5.

c. Class mark $= \frac{\text{Upper class limit} + \text{Lower class limit}}{2} = \frac{14 + 10}{2} = \frac{24}{2} = 12$ for the first class.

d. Class width = Upper boundary − Lower boundary

= 14.5 − 9.5 = 5.0

e.
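Parts (e) and (f) of question 3 follow mechanically from the table. The sketch below is only an illustration, not the workbook's own working: it recomputes the class boundaries, class marks, cumulative frequencies and relative frequencies from the class limits and frequencies given in the question. The variable names are chosen here for illustration.

```python
# Class limits and frequencies as given in question 3.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
freqs = [19, 24, 37, 81, 43, 30, 16]

total = sum(freqs)

# Adjacent class limits differ by 1, so each boundary lies 0.5 beyond the limit.
boundaries = [(low - 0.5, high + 0.5) for low, high in classes]

# Class mark = midpoint of the class limits.
marks = [(low + high) / 2 for low, high in classes]

# Cumulative frequency = running total of the frequencies.
cumulative = []
running = 0
for f in freqs:
    running += f
    cumulative.append(running)

# Relative frequency = class frequency divided by total frequency.
relative = [f / total for f in freqs]

for (low, high), (lb, ub), m, f, cf, rf in zip(
    classes, boundaries, marks, freqs, cumulative, relative
):
    print(f"{low}-{high}  {lb}-{ub}  mark={m}  f={f}  cf={cf}  rf={rf:.3f}")
```

The last cumulative frequency should equal the total frequency, and the relative frequencies should sum to 1, which gives a quick check on the table.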

4a

4b

4c

4d

5a Frequency table

5b Relative frequency table

5c Histogram

5d Ogive

SECTION C

DIAGRAMMATIC REPRESENTATION OF DATA

1. Explain the following:

(a) Bar Chart (b) Component Bar Chart

(c) Multiple Bar chart (d) Pie Chart

2. Differentiate between a bar diagram and a pie chart

3. Draw a pie-chart using the following information

Marital Status    No. of Women
Single            670
Married           480
Separated         120
Divorced          330
Widow             400

4. The following data gives the enrolment of junior students in a secondary school.

Session    No. of Male    No. of Female    No. of Students
90/91      500            1,000            1,500
91/92      750            1,000            1,750
92/93      840            960              1,800
93/94      1,050          950              2,000

Present the information in:


(a) Simple bar Diagram

(b) Component bar chart

(c) Percentage component bar

(d) Multiple bar diagram

SOLUTIONS

1a. Bar Chart: This is a diagrammatic representation of qualitative data consisting of separated vertical or horizontal bars of equal width whose heights or lengths are drawn proportional to the frequencies of the classes they represent. Bar charts may be simple, multiple or component.

b. Component Bar Chart: This is used when the data involve more than one category. Simple bars are divided into sections and each component corresponds in size to the magnitude it represents. It allows the representation of more variables and allows comparisons to be made within each component.

c. Multiple Bar Chart: This is a variation of the component chart which is also used to show data comprising two or more categories. It allows comparisons to be made both within and across classes in actual terms. It is also referred to as a compound bar chart.

d. Pie Chart: This is a circle which is divided into sectors by radial lines. The circle represents the total mass of data under consideration, while the various sectors show the relative sizes or proportions of the different variables or values.
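Because each sector of a pie chart is proportional to its share of the total, the sector angles for question 3 can be computed as (frequency / total) × 360°. The sketch below is a minimal illustration using the marital status figures from that question; the variable names are chosen here for illustration only.

```python
# Number of women by marital status, as given in question 3.
women = {
    "Single": 670,
    "Married": 480,
    "Separated": 120,
    "Divorced": 330,
    "Widow": 400,
}

total = sum(women.values())

# Each sector angle is the category's share of the total, scaled to 360 degrees.
angles = {status: count / total * 360 for status, count in women.items()}

for status, angle in angles.items():
    print(f"{status:<10} {women[status]:>5}  {angle:6.1f} degrees")
```

The angles should add up to 360°, which is a useful check before drawing the chart.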

2. DIFFERENCE BETWEEN PIE CHART AND BAR DIAGRAMS

SECTION D

SUMMARISATION OF DATA

1. Explain the following:

(a) Measures of Central Tendency (b) Skewness

(c) Measures of Variability (d) Coefficient of Variation

2. What are the properties of a typical value or central tendency?

3. Consider the following distribution

Ages (years to the next birthday)    Frequency
15 < 20      37
20 < 25      81
25 < 30      43
30 < 35      24
35 < 40      9
40 < 45      6
Total        200

Calculate:

(a) The mean (b) Median (c) Mode (d) Variance

(e) Standard deviation (f) Quartile deviation (g) Coefficient of variation

(h) Skewness

SOLUTIONS

1a Measures of central tendency: These are typical values that are representative of a set of data and tend to lie centrally within the data when arranged according to magnitude. They are also called averages. The measures include the mean, median and mode.

B Skewness: This is the degree of departure from symmetry of a distribution. If the frequency curve of a distribution has a longer tail to the right, the distribution is said to be skewed to the right, or to have positive skewness. If the reverse is true, it is said to be skewed to the left, or to have negative skewness.

$\text{Skewness} = \frac{\text{mean} - \text{mode}}{\text{standard deviation}}$

C. Measure of Variability: The degree to which numerical data tend to spread about an average value of the distribution. The most common measures of variability are the range, mean deviation, semi-interquartile range, percentile range, interquartile range and standard deviation.

D. Coefficient of variation: This is the ratio of a measure of dispersion to a measure of location, expressed as a percentage. It is also called relative dispersion or coefficient of dispersion. It is denoted by V and, using the standard deviation $s$ and the mean $\bar{x}$, is given by

$V = \frac{s}{\bar{x}} \times 100\%$

2. PROPERTIES OF A TYPICAL VALUE OR CENTRAL TENDENCY
1. The algebraic sum of the deviations of a set of numbers from their arithmetic mean is zero.
2. The sum of the squares of the deviations of a set of numbers $x_i$ from any number $a$ is a minimum if and only if $a = \bar{x}$.
3. If $f_1$ numbers have mean $m_1$, $f_2$ numbers have mean $m_2$, ..., and $f_k$ numbers have mean $m_k$, then the mean of all the numbers is

$\bar{x} = \frac{f_1 m_1 + f_2 m_2 + \dots + f_k m_k}{f_1 + f_2 + \dots + f_k}$

i.e. a weighted arithmetic mean of all the means.

3.

A. The mean

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{4925}{200} = 24.625$ years

B. Median

$\text{Median} = L_1 + \left(\frac{N/2 - Cf_b}{F_m}\right) W$

Where:
L1 = Lower class boundary of median class = 19.5
N = Total frequency Σf = 200
Cfb = Cumulative frequency of the class preceding the median class = 37
Fm = Frequency of median class = 81
W = Class width or class size = 5

$\text{Median} = 19.5 + \left(\frac{100 - 37}{81}\right) \times 5$

= 23.4 years

C. Mode

$\text{Mode} = L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) W$

Where:

L1 = Lower class boundary of modal class = 19.5

Δ1 = Difference between the frequency of the modal class and the class preceding it = 81 − 37 = 44

Δ2 = Difference between the frequency of the modal class and the class after it = 81 − 43 = 38

W = class width = 5

$\text{Mode} = 19.5 + \left(\frac{44}{44 + 38}\right) \times 5 \approx 22.2$ years

Σf = 200, Σfx = 4,925, Σfx² = 128,725


Variance

Standard Deviation

Quartile deviation

Where, for the first quartile Q1:

L = 19.5, Cfb = 37, F = 81, w = 5

and for the third quartile Q3:

L = 24.5, Cfb = 118, F = 43, w = 5

Hence quartile deviation $= \frac{Q_3 - Q_1}{2} = \frac{28.22 - 20.30}{2} \approx 3.96$

G. Co-efficient of variation

H. Skewness
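The grouped-data measures asked for in question 3 can be cross-checked numerically. The sketch below is a minimal illustration, not the workbook's own working. It assumes class boundaries 14.5, 19.5, ..., 39.5 (consistent with the lower boundary 19.5 used in the solution), class marks 17, 22, ..., 42, and the "divide by N" form of the variance; the workbook's exact variance formula is not shown, so that choice is an assumption. The function and variable names are hypothetical.

```python
from math import sqrt

# Frequencies from the question; boundaries assumed as in the solution (L1 = 19.5).
freqs = [37, 81, 43, 24, 9, 6]
lower_bounds = [14.5, 19.5, 24.5, 29.5, 34.5, 39.5]
width = 5

marks = [lb + width / 2 for lb in lower_bounds]  # 17, 22, 27, 32, 37, 42
n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, marks))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, marks))

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2  # assumption: population (divide by N) form
std_dev = sqrt(variance)


def grouped_quantile(p):
    """Interpolate the p-th quantile: L + ((p*N - Cfb) / F) * W."""
    target = p * n
    cum_before = 0
    for lb, f in zip(lower_bounds, freqs):
        if cum_before + f >= target:
            return lb + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("p must lie between 0 and 1")


median = grouped_quantile(0.5)
q1 = grouped_quantile(0.25)
q3 = grouped_quantile(0.75)
quartile_deviation = (q3 - q1) / 2

# Mode by interpolation within the modal class.
i = freqs.index(max(freqs))
d1 = freqs[i] - (freqs[i - 1] if i > 0 else 0)
d2 = freqs[i] - (freqs[i + 1] if i < len(freqs) - 1 else 0)
mode = lower_bounds[i] + d1 / (d1 + d2) * width

coeff_variation = std_dev / mean * 100
skewness = (mean - mode) / std_dev

print(f"mean={mean:.3f} median={median:.2f} mode={mode:.2f}")
print(f"variance={variance:.3f} sd={std_dev:.3f} QD={quartile_deviation:.2f}")
print(f"CV={coeff_variation:.1f}% skewness={skewness:.3f}")
```

If the sample form of the variance (dividing by N − 1) is preferred, only the `variance` line changes; the other measures are unaffected.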


SECTION E

PROBABILITY

1. Define the following:

(a) Experiment (b) Sample Space

(c) Sample point (d) Event

(e) Permutation (f) Combination

(g) Mutually Exclusive (h) Independent Event

(i) Dependent event (j) Conditional probability

2 (a) List two types of probability distribution

(b) List four examples of discrete probability distribution

(c) State the properties of the Binomial Distribution

(d) What are the properties of a Normal distribution?

3. (a) Show that the letters of word ANTICIPATION can be arranged in three times
as many ways as the letters of the word COMMENCEMENT

(b) In the random experiment of tossing 5 coins, list the event that

(i) At least 3 heads occur

(ii) Exactly 2 heads

(iii) No head at all

4 (a) Simplify the following:

(i) 10
C4 (ii) 10
P4 (iii) C4
5
(iv) P4
10

(b) If nP5 / nC3 = 2 : 1, what is the value of n?

(c) If nP3 / nC4 = 6, find n.

5. Using normal tables, find the values of the following probabilities.

(a) P(Z < 0.20) (b) P (Z < - 1.62)

(c) P ( 0.57 < Z < 1.62) (d) P (-1.50 < Z < 2.50)

SOLUTIONS

1a Experiment is any procedure that can be repeated indefinitely and has a well-defined set of outcomes, e.g. the flip of a fair coin.

B. Sample space is the set of all the possible outcomes of a specific random experiment. It is denoted by S.

C. Sample point is each of the elements in a sample space. O is not a sample point because it is not obtainable.

D. Event is a set of outcomes of an experiment (a subset of the sample space) to which a probability is assigned.

E. Permutation is the number of possible arrangements of objects with regard to the order of arrangement. It is denoted by nPr (r ≤ n).

F. Combination is the selection of objects without regard to the order of arrangement. It is denoted by nCr, where r ≤ n.

G. Mutually Exclusive: Two or more events are mutually exclusive if the occurrence of one excludes the occurrence of the others, i.e. the events cannot occur simultaneously, so that P(E1 ∩ E2) = 0.

H. Independent event: Two events are said to be independent if the probability that one event occurs is not affected by the occurrence or non-occurrence of the other, i.e. P(B|A) = P(B). An example of two independent events is rolling a die and flipping a coin.

I. Dependent Event: The occurrence of one event affects the probability of occurrence of the other event.

J. Conditional Probability: This measures the dependence of events by looking at the probability of an event given that some other event has first occurred. It is denoted by P(B|A) = P(A ∩ B) / P(A).


2a. Two types of probability distribution

i. Discrete probability distributions, e.g. the Binomial Distribution and the Poisson Distribution

ii. Continuous probability distributions, e.g. the Normal Distribution

2b. Four examples of discrete probability distributions are:

i. Binomial Distribution

ii. Poisson Distribution

iii. Geometric Distribution

iv. Bernoulli Distribution

2c. Properties of the Binomial Distribution

i. The number of trials is fixed before the experiment begins.

ii. The result of every trial can be classified into one of two mutually exclusive events, success or failure.

iii. The result of any trial is independent of the results of all other trials.

iv. The probability of a success or a failure does not change from trial to trial.

2d. Properties of the Normal Distribution

i. It is bell-shaped and symmetrical about x = μ.

ii. Approximately 95% of the distribution lies within 2 standard deviations of the mean, and 99.7% of the distribution lies within 3 standard deviations of the mean.

iii. The maximum value of f(x) occurs when x = μ and is given by $\frac{1}{\sigma\sqrt{2\pi}}$.

iv. The total area under the normal curve is 1.


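3a. ANTICIPATION

n = 12, A = 2, N = 2, T = 2, I = 3, C = 1, P = 1, O = 1.

Number of arrangements $= \frac{12!}{3!\,2!\,2!\,2!} = 9{,}979{,}200$

COMMENCEMENT

n = 12, C = 2, O = 1, M = 3, E = 3, N = 2, T = 1

Number of arrangements $= \frac{12!}{2!\,3!\,3!\,2!} = 3{,}326{,}400$

3 times as many ways as the letters of the word COMMENCEMENT

= 3 × 3,326,400

= 9,979,200 ways, which is the number of arrangements of ANTICIPATION.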
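The counts in 3a can be verified directly. The sketch below is a small illustration that counts the distinct arrangements of a word as n! divided by the factorial of each letter's number of repeats; the function name `arrangements` is hypothetical.

```python
from collections import Counter
from math import factorial


def arrangements(word):
    """Number of distinct orderings of the letters of `word`."""
    count = factorial(len(word))
    # Divide out the orderings of each repeated letter among itself.
    for repeats in Counter(word).values():
        count //= factorial(repeats)
    return count


anticipation = arrangements("ANTICIPATION")
commencement = arrangements("COMMENCEMENT")

print(anticipation, commencement, anticipation == 3 * commencement)
```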

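3b. A fair coin has p = q = 1/2

n = number of coins = 5

where p = probability of a head and q = probability of a tail

i. At least 3 heads means 3 heads, 4 heads or 5 heads:

$P(X \ge 3) = P(3) + P(4) + P(5) = \frac{10 + 5 + 1}{32} = \frac{1}{2}$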


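ii. Probability of exactly two heads

$P(2) = \binom{5}{2} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^3 = \frac{10}{32} = 0.3125$

iii. Probability of no heads at all, i.e. all tails

$P(0) = \left(\frac{1}{2}\right)^5 = \frac{1}{32}$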
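Since the question asks for the events to be listed, it may help to enumerate the sample space of five coin tosses. The sketch below is one way to do this, assuming a fair coin so that all 32 outcomes are equally likely; the helper name `event` is hypothetical.

```python
from itertools import product
from math import comb

# All 2^5 = 32 equally likely outcomes of tossing 5 coins.
outcomes = list(product("HT", repeat=5))


def event(condition):
    """Return the outcomes (as strings) whose number of heads satisfies `condition`."""
    return ["".join(o) for o in outcomes if condition(o.count("H"))]


at_least_3 = event(lambda heads: heads >= 3)
exactly_2 = event(lambda heads: heads == 2)
no_head = event(lambda heads: heads == 0)

print(len(at_least_3), len(at_least_3) / len(outcomes))
print(len(exactly_2), len(exactly_2) / len(outcomes))
print(no_head, len(no_head) / len(outcomes))

# The same probability from the binomial formula, for comparison.
p_exactly_2 = comb(5, 2) * 0.5 ** 2 * 0.5 ** 3
```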

4a.

C.

D.

E.

C.
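The combination and permutation values in question 4 can be checked with Python's `math.comb` and `math.perm`. The sketch below is illustrative only. It evaluates 10C4 and 10P4, and it reads part (c) as nP3 / nC4 = 6, which is an assumption about the garbled notation in the question; the search range for n is also arbitrary.

```python
from math import comb, perm

# nCr = n! / (r! (n - r)!)  and  nPr = n! / (n - r)!
c_10_4 = comb(10, 4)
p_10_4 = perm(10, 4)
print(c_10_4, p_10_4)

# Part (c), read here as nP3 / nC4 = 6 (assumption): search small n by brute force.
# Comparing nP3 with 6 * nC4 avoids any division.
solutions = [n for n in range(4, 51) if perm(n, 3) == 6 * comb(n, 4)]
print(solutions)
```

Algebraically, nP3 / nC4 = 24 / (n − 3), so setting this equal to 6 gives n = 7, in agreement with the search.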

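5a. P(Z < 0.20)

On the normal table we locate 0.2 in the first column and read the entry under the heading .00, which gives the area between 0 and 0.20 as 0.0793. Adding the area of 0.5 to the left of zero:

P(Z < 0.20) = 0.5 + 0.0793 = 0.5793

B. P(Z < −1.62)

For a negative value of Z, the probability is 0.5 minus the corresponding entry in the table:

0.5 − 0.4474 = 0.0526

P(Z < −1.62) = 0.0526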


C. P(0.57 < Z < 1.62)

This is the difference between the table entries corresponding to

Z = 1.62 and Z = 0.57

0.4474 − 0.2157

= 0.2317

D. P(−1.50 < Z < 2.50)

This is the sum of the table entries for Z = 2.50 and Z = 1.50

0.4938 + 0.4332

= 0.9270
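The table look-ups in question 5 can be cross-checked without a table, because the standard normal cumulative probability can be written in terms of the error function: Φ(z) = ½[1 + erf(z/√2)]. The sketch below is a minimal illustration; the function name `phi` is chosen here and is not from the workbook.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


p_a = phi(0.20)                 # P(Z < 0.20)
p_b = phi(-1.62)                # P(Z < -1.62)
p_c = phi(1.62) - phi(0.57)     # P(0.57 < Z < 1.62)
p_d = phi(2.50) - phi(-1.50)    # P(-1.50 < Z < 2.50)

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"({label}) {p:.4f}")
```

Note that a table giving areas between 0 and z, such as the one used in these solutions, corresponds to `phi(z) - 0.5` for positive z.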


SECTION F

SAMPLING DISTRIBUTION

1. Explain the following:

(a) Sample distribution (b) Population distribution
(c) Sampling distribution (d) Statistic
(e) Parameter (f) Central Limit theorem
(g) Sampling with replacement
(h) Sampling without replacement

2. A finite population consists of the numbers 1, 2, 3, 4, 5.

a. Construct the sampling distribution of the mean ($\bar{x}$) when samples of size n are drawn
i) With replacement
ii) Without replacement

b) Verify that $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$

3. The distribution of student heights has a mean of 168cm and a standard deviation of 4cm. What is the probability that the mean of a random sample of 64 students is greater than 162cm?

4. A normal population with unknown variance is believed to have a mean of 20. Is one likely to obtain a random sample of size 9 from this population which has a mean $\bar{x}$ = 24 and standard deviation s = 4.1? If not, what conclusion will you draw?

5. a) Find $t_{0.025}$ when ν = 14

b) Find $t_{0.1}$ when ν = 10 (c) Find $t_{0.995}$ when ν = 7

d) Find $P(-t_{0.005} < t < t_{0.00})$ (e) Find $P(t > -t_{0.025})$


6. Given a random sample of size 24 from a normal distribution, find k such that:

a) $P(-2.069 < t < k) = 0.965$

b) $P(k < t < 2.807) = 0.095$

c) $P(-k < t < k) = 0.90$

7. The UNAAB feed mill claims that the average content of each bag of feed is 50kg. If a random sample of 10 bags has contents of 52, 48, 50, 53, 51, 50, 49, 47, 50, 52 kilograms, would you agree with the manufacturer's claim?

SOLUTIONS

1a. Sample distribution: This covers all possible samples of size n that can be drawn from a given population (either with or without replacement). For each sample we can compute a statistic (such as the mean or the standard deviation) that will vary from sample to sample.

1b. Population distribution: This is the distribution of the values of the members of the population. It has mean denoted by μ, variance σ² and standard deviation σ.

1c. Sampling Distribution: This is the important concept which makes it possible to make inferences from a sample to its related population. It is the distribution of a statistic, considered as a random variable, when derived from a random sample of size n. It depends on the underlying distribution of the population, the statistic being considered, the sampling procedure employed and the sample size used.

1d. Statistic: This is a quantity calculated from the values of the observations in the sample, e.g. the arithmetic mean. The sample mean is a statistic that estimates the population mean, and a test statistic is used to test a hypothesis. Other examples are the sample proportion, variance and median.

1e. Parameter: This is a population value that is usually estimated from a sample. It can be regarded as a numerical characteristic of a population. Among the parameterized distributions are the Normal distribution, the Poisson distribution, the Binomial distribution and the exponential distribution.

1f. Central Limit Theorem: This is a statistical theorem which states that, given a sufficiently large sample size from a population with finite variance, the sampling distribution of the sample mean is approximately normal and the mean of all sample means from the same population is approximately equal to the mean of the population.
$\mu_{\bar{x}} = \mu$

1g. Sampling with replacement: This is sampling where each member of the population may be chosen more than once. A unit selected at random from the population is returned to the population before the next unit is drawn.

1h. Sampling without replacement: This means that when an element is selected for the sample, it is not put back into the population before the next elements are drawn.

2. Population size N = 5
Let us assume that samples of size n = 2 are drawn.
i The number of possible samples that can be drawn with replacement is $N^n = 5^2 = 25$ samples

ii The number of possible samples that can be drawn without replacement is $^{N}C_n = {}^{5}C_2 = 10$ samples


b. The population mean
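The sampling distributions for the population 1, 2, 3, 4, 5 can be built by brute-force enumeration. The sketch below is an illustration under the assumption, made in the solution, that samples of size n = 2 are drawn. It uses ordered samples with replacement and unordered samples without replacement, and exact fractions to avoid rounding. The function name `mean_and_variance` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

population = [1, 2, 3, 4, 5]
n = 2  # assumed sample size, as in the solution


def mean_and_variance(values):
    """Mean and (divide-by-count) variance of a list of numbers, as exact fractions."""
    values = list(values)
    mu = Fraction(sum(values), 1) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, var


mu, sigma2 = mean_and_variance(population)

# Sample means for every possible sample.
means_with = [Fraction(sum(s), n) for s in product(population, repeat=n)]
means_without = [Fraction(sum(s), n) for s in combinations(population, n)]

mu_with, var_with = mean_and_variance(means_with)
mu_without, var_without = mean_and_variance(means_without)

print(len(means_with), mu_with, var_with)          # with replacement
print(len(means_without), mu_without, var_without)  # without replacement
```

With replacement, the variance of the sample mean equals σ²/n. Without replacement it is smaller by the finite population correction factor (N − n)/(N − 1), which is why the second variance differs from σ²/n.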


SECTION G
ESTIMATION OF PARAMETERS
1. Explain the following:
(a) Point Estimate (b) Interval Estimate

(c)Unbiased Estimate (d) Biased Estimate

(e) Estimator (f) Estimate

(g) Efficient estimator (h) Confidence Interval

2. A random sample of 25 UNAAB employees showed an average contribution of N12,500 to the Ogun State Poverty Alleviation Programme, with a standard deviation of N225. Construct a
a) 90% confidence interval

b) 95% confidence interval

c) 99% confidence interval

for the average contribution by all employees of UNAAB to the Ogun State Poverty Alleviation Programme. Assume the contributions are normally distributed.

3. A random sample of 400 smokers at Abeokuta is selected and 125 are found to have a preference for a brand called Benson and Hedges. Construct a
a) 90% confidence interval

b) 95% confidence interval

c) 99% confidence interval

for the proportion of the population of cigarette smokers at Abeokuta who prefer Benson and Hedges.

4. A study is to be made by UNAAB students offering STS 201 on adolescent sexuality and fertility at Abeokuta. How large a sample is needed if they wish to be at least
a) 90% confident

b) 95% confident

c) 99% confident

that the estimate differs from the true proportion by an amount not exceeding 0.01?

SOLUTIONS

1a Point estimate: This is an estimate of a population parameter given by a single number. Point estimates do not tell us anything about the intrinsic reliability or precision of the method of estimation being used.

1b Interval estimate: This is an estimate of a population parameter given by two numbers between which the parameter may be considered to lie. An interval estimate indicates the precision or accuracy of an estimate.

1c Unbiased estimate: This is an estimate whose expectation is equal to the value of the population parameter. $\bar{x}$ and $s^2$ are unbiased since $E(\bar{x}) = \mu$ and $E(s^2) = \sigma^2$.

1d Biased estimate: This is the result of a statistical technique whose expected value differs from the value being estimated.

1e Estimator: This is a function or formula that is used to estimate the unknown population parameter. The sample average $\bar{x}$ is an estimator.

1f Estimate: This is the value obtained by using an estimator.

1g Efficient estimator: An estimator is efficient if it has the minimum variance in the class of unbiased estimators.

1h Confidence interval: The confidence interval for a population parameter is obtained as point estimate ± margin of error.


2.

n = 25, x̄ = 12,500, s = 225. Since n < 30, the t-distribution is used:

x̄ ± t_{1-α/2} · s/√n

i 90% confidence interval

12500 ± t_{1-0.1/2} · 225/√25

12500 ± t_{0.95} · 45

v = degrees of freedom

v = n - 1 = 25 - 1 = 24

t_{0.95} = 1.71 at v = 24

12500 ± 1.71(45)

12423.05 < µ < 12576.95

ii 95% confidence interval

x̄ ± t_{1-α/2} · s/√n

12500 ± t_{1-0.05/2} · 225/√25

12500 ± t_{1-0.025} · 45

12500 ± t_{0.975} · 45

v = 24

t_{0.975} = 2.06 at 24 degrees of freedom

12500 ± 2.06(45)

12407.3 < µ < 12592.7

iii 99% confidence interval

x̄ ± t_{1-α/2} · s/√n

12500 ± t_{1-0.01/2} · 225/√25

12500 ± t_{1-0.005} · 225/5

12500 ± t_{0.995} · 45

t_{0.995} = 2.80 at 24 degrees of freedom

12500 ± 2.80(45)

12374 < µ < 12626
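
The arithmetic in solution 2 can be checked with a short Python sketch. This is only an illustration: the function name t_interval is hypothetical, and the critical values are the rounded table values quoted in the solution (1.71, 2.06 and 2.80 at 24 degrees of freedom), not values computed by the code.

```python
import math


def t_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Sample summary from solution 2: n = 25, mean = 12500, s = 225.
# Rounded t-table values for 24 degrees of freedom, as quoted in the solution.
T_CRIT = {90: 1.71, 95: 2.06, 99: 2.80}

intervals = {level: t_interval(12500, 225, 25, t) for level, t in T_CRIT.items()}

for level, (lower, upper) in intervals.items():
    print(f"{level}% CI: {lower:.2f} < mu < {upper:.2f}")
```

Using more decimal places for t would change the limits slightly.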

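3. n = 400, x = 125, so the sample proportion is x/n = 125/400 = 0.3125

A for 90% confidence interval

z_{α/2} = 1.645

The proportion of the population of cigarette smokers is likely to be between 27.4 and 35.1 percent.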

B for 95% confidence interval

z_{α/2} = 1.960

The population proportion of smokers is likely to be between 26.7% and 35.8%.

C for 99% confidence interval

z_{α/2} = 2.576

Population proportion lies between

0.2528 < P < 0.3722

The population proportion of smokers is likely to be between 25.3% and 37.2%.
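
The three intervals in solution 3 can be reproduced with the sketch below. It assumes the usual large-sample interval for a proportion, sample proportion ± z · √(pq/n), which the solution appears to use. The function name proportion_interval is hypothetical. The z values come from Python's statistics.NormalDist rather than from a table, so they differ very slightly from 1.645, 1.960 and 2.576.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Data from question 3: 125 of 400 smokers prefer the brand.
prop_intervals = {c: proportion_interval(125, 400, c) for c in (0.90, 0.95, 0.99)}

for c, (lower, upper) in prop_intervals.items():
    print(f"{c:.0%} CI: {lower:.4f} < p < {upper:.4f}")
```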

4. E = ± 0.01 (the maximum allowable error)

n is the unknown sample size
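
The rest of this solution is not legible in the source. As a hedged sketch only, the sample size can be estimated from n = z² · p · q / E², assuming the conservative choice p = 0.5 because no prior estimate of the proportion is given. Both that assumption and the function name sample_size_for_proportion are illustrative and not taken from the workbook.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(error, confidence, p=0.5):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= error.

    p = 0.5 is an assumed worst case, used when no prior estimate exists.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z * z * p * (1 - p) / error ** 2)


# Question 4: the estimate must be within 0.01 of the true proportion.
sizes = {c: sample_size_for_proportion(0.01, c) for c in (0.90, 0.95, 0.99)}

for c, n in sizes.items():
    print(f"{c:.0%} confidence: n = {n}")
```

Hand calculations with rounded table values of z (for example 1.645 or 2.576) may give answers that differ from these by a few units.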


SECTION H
TEST OF HYPOTHESIS

1. Explain the following:

(a) Statistical hypothesis (b) Null hypothesis

(c) Alternative hypothesis (d) Type 1 error

(e) Type 2 error (f) One tailed test

(g) Two tailed test (h) Acceptance region

(i) Critical region (j) Level of significance

2. State the null and alternative hypotheses to be used in the following claims and determine generally
where the critical region is located:

a) The mean rainfall at Abeokuta during the month of June is 68cm.

b) On the average, students attend lectures within 2.5 kilometers of their homes at UNAAB, Abeokuta

c) No more than 20% of the students of UNAAB contributed to Ogun Poverty Alleviation Fund

d) The proportion of voters favouring the new Students Union President is 0.63

3. In Abeokuta, men have a mean height of 168cm with a standard deviation of 8cm, and women have a mean
height of 160cm with a standard deviation of 5cm. In a random sample of 100 married couples, the average
height difference between husband and wife was 5cm. Does this suggest that the height of a partner affects
the decision to propose marriage?


4. Two drugs, A and B, were tested for a certain effect on UNAAB laboratory mice. Two samples,
each of 50 mice, were chosen randomly. One drug was administered to each sample, and a measure, X, of the
effect was obtained. The results were as follows:

Drug A: ΣX = 1,750, ΣX² = 62,520

Drug B: ΣX = 1,980, ΣX² = 85,610

Test, at the 1% and 5% levels of significance, the hypothesis that the two drugs have the same mean effect
against the alternative hypothesis that drug B has a higher mean effect. You may assume that the effects
on mice of each drug are normally distributed.

SOLUTIONS

1a A statistical hypothesis is a hypothesis that is testable on the basis of observed data modelled as the
realised values taken by a collection of random variables. A set of data is modelled as being realised values
of a collection of random variables having a joint probability distribution in some set of possible joint
distributions.

1b A null hypothesis is a statement that there is no statistically significant relationship between the two
variables in the hypothesis. It is the assumption that the researcher is seeking to disprove. For example,
there is no statistically meaningful relationship between the type of water fed to the plants and the growth
of the plants. A researcher normally wants to reject the null hypothesis, to show that there is a
statistically significant relationship between the two variables in the hypothesis.

1c Alternative hypothesis is a position that states something is happening, a new theory is preferred instead
of an old one (null hypothesis). It is usually consistent with the research hypothesis because it is
constructed from literature review, previous studies, etc. However, the research hypothesis is sometimes
consistent with the null hypothesis.


1d. A type I error is the rejection of a true null hypothesis (also known as a "false positive" finding or
conclusion; example: "an innocent person is convicted").

1e. A type II error is the non-rejection of a false null hypothesis (also known as a "false negative" finding
or conclusion; example: "a guilty person is not convicted").

1f. A one-tailed test is a statistical test in which the critical area of a distribution is one-sided so that it is
either greater than or less than a certain value, but not both.

1g. A two-tailed test is a method in which the critical area of a distribution is two-sided and tests
whether a sample is greater than or less than a certain range of values. It is used in null-hypothesis testing
and testing for statistical significance.

1i A critical region, also known as the rejection region, is a set of values for the test statistic for which the
null hypothesis is rejected. i.e. if the observed test statistic is in the critical region then we reject the null
hypothesis and accept the alternative hypothesis.

1h. Region of acceptance: For a hypothesis test, a researcher collects sample data. From the sample data,
the researcher computes a test statistic. If the statistic falls within a specified range of values, the
researcher cannot reject the null hypothesis. That range of values is called the region of acceptance.

1j. The level of significance is the probability of rejecting the null hypothesis when it is true, that is,
the probability of a Type I error. It sets the threshold for deciding whether a result is statistically
significant enough for the null hypothesis to be rejected. The level of significance is denoted by α.
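
The definitions of the critical region, the acceptance region and the level of significance can be illustrated with a short Python sketch. It assumes a test statistic that follows the standard normal distribution. The names critical_values and reject_null are hypothetical and chosen only for this example.

```python
from statistics import NormalDist


def critical_values(alpha, tail="two"):
    """Return (lower, upper) boundaries of the acceptance region.

    None means the region is unbounded on that side.
    """
    z = NormalDist()
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)
        return -c, c
    if tail == "right":
        return None, z.inv_cdf(1 - alpha)
    if tail == "left":
        return z.inv_cdf(alpha), None
    raise ValueError("tail must be 'two', 'right' or 'left'")


def reject_null(z_stat, alpha, tail="two"):
    """True when the test statistic falls in the critical region."""
    lower, upper = critical_values(alpha, tail)
    below = lower is not None and z_stat < lower
    above = upper is not None and z_stat > upper
    return bool(below or above)


# The same statistic can lead to different decisions in one- and two-tailed tests.
print(reject_null(1.7, 0.05, tail="two"))
print(reject_null(1.7, 0.05, tail="right"))
```

With α = 0.05 a statistic of 1.7 falls in the acceptance region of the two-tailed test but in the critical region of the right-tailed test, because the one-tailed test places the whole 5% in one tail.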


2a.

SECTION I

CORRELATION AND REGRESSION


1. Explain the following:

(a) Regression (b) Correlation (c) Scatter diagram

(d) Regression coefficient (e) Correlation coefficient

(f) Positive correlation (g) Negative correlation (h) Regression line

(i) Coefficient of determination (j) Rank correlation

2. The table below gives the index figures for the production and the price of an article
over ten consecutive years.

Year 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999

Production 92 96 103 108 109 108 96 103 109 103

Price 109 111 94 93 89 84 100 106 87 97

a) Find an equation that expresses the index figure for the production in terms of
the price of an article.
b) Calculate the coefficient of correlation (r).
c) Use Spearman's rank correlation to calculate the correlation coefficient.
d) Determine the percentage of variation that is explained by the regression equation.

3. A study was made by UNAAB Mills Ltd to determine the relationship between
weekly advertising expenditures and sales. Assume the coefficient of determination is
r² = 0.81 and the prediction equation is y = 0.957 - 0.032x.

a) What is the relationship between advertising and sales?


b) Find the correlation coefficient r

4. The following are estimated average unemployment percentages, 1990-2000, for 10
states in Nigeria.

State Male(x) Female(y)

Lagos 4.12 3.29
Abeokuta 2.60 1.61
Anambra 3.24 3.58
Imo 4.32 4.99
Benue 2.88 3.02
Plateau 3.26 3.19
Kano 2.97 3.19
Kaduna 3.21 2.58
Rivers 1.12 0.19
Delta 1.04 0.99

Fit regression lines (i) for y on x and (ii) for x on y, and find the
coefficient of correlation between x and y.

SOLUTIONS
1a Regression is a statistical tool that helps to study how one variable moves in response to
changes in another variable, on the basis of an assumed relationship between
them.


1b Correlation is the measurement of the relationship between two variables Y and X.

1c Scatter Diagram: A scatter diagram portrays the direction, form and strength of any relationship
between quantitative variables.

1d Regression coefficient: A regression coefficient gives the rate of change of a particular
variable with another variable, i.e. it gives the rate of change of Y per unit change in X.

1e Correlation coefficient: This is a quantity that measures the strength of the linear relationship
between 2 quantitative variables.

1f Positive correlation; this is when an increase in one variable is associated with an increase in the
other and vice versa

1g Negative correlation: This is when an increase in one variable is associated with a decrease in
the other and vice versa.

1i Coefficient of determination: The coefficient of determination r² is the number that indicates
the proportion of the variance in the dependent variable that is predictable from the independent
variable.

1j Rank correlation: This is a measure of the strength of relationship between two qualitative
variables or attributes. It is used when exact measurement of the variables is inaccurate,
impossible or impracticable.
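
As an illustration of the regression coefficient, the correlation coefficient and the coefficient of determination, the sketch below fits a least-squares line of production on price using the index figures from question 2. It is a minimal sketch and not the workbook's own solution. The function name fit_line_and_r is hypothetical.

```python
import math

# Index figures from question 2 (1990 to 1999).
production = [92, 96, 103, 108, 109, 108, 96, 103, 109, 103]  # y
price = [109, 111, 94, 93, 89, 84, 100, 106, 87, 97]          # x


def fit_line_and_r(x, y):
    """Least-squares line y = intercept + slope * x and Pearson's r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = fit_line_and_r(price, production)
r_squared = r ** 2

print(f"production = {intercept:.3f} + ({slope:.4f}) * price")
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
```

The negative slope and negative r show that production tends to fall as price rises. r squared multiplied by 100 gives the percentage of the variation in production explained by the regression line.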

2.

Where:
WSAR= Weight simple average relative
W = Weight

Pn = Price at current year

Po = Price at base year

2b.

SECTION J


OTHERS
1a. What is ANOVA?
b. Write the assumptions for analysis of variance.
The data in the table below represent 5 random samples of size 5 from independent
normal distributions with means µ1, µ2, ..., µ5 and common variance σ².
Test the hypothesis at 5% that µ1 = µ2 = ... = µ5.

A B C D E

5 9 3 2 7

4 7 5 3 6

8 8 2 4 9

6 6 3 1 4

3 9 7 4 7

Total 26 39 20 14 33

2. Four different tests were used in the treatment of a course and the final
grades of the students are recorded below.

1 2 3 4

60 80 97 67

80 81 84 84

69 73 93 90

65 69 79 78

75 92 61 72


Test the hypothesis at α = 5% that there is no difference in the final grades
from the four different tests.

3a) What is a run?

b) From the following arrangement of M and F, is there evidence of
randomness at α = 5%?

MM F MMMM FF MMMM FM FF MMMM F M FF

MMMMM F MMM F M F MMM.

4. Consider the following measurements, which are weights of some people in
kilograms

163 165 160 189 161 171 158 151 169 162 151

169 162 163 139 172 165 148 166 172 163 187 173

Test the null hypothesis µ=163 against the alternative µ>163 at α = 5%

5. Consider the following three samples

Sample I Sample II Sample III

29 36 24
36 17 18
37 19 20
36 21 24
36 26 25
35 29 28
39 27 31
38 21 34
40 32 30
23 33 22
27 21
16

Use the Kruskal-Wallis test at α = 0.05 to test

Ho: µ1 = µ2 = µ3

6. The following are the weights in kilograms, before and after, of 16 persons who
stayed on a certain reducing diet for four weeks.

Before After

147.0 137.9
183.5 176.2
232.1 219.0
161.6 163.8
197.5 193.5
206.3 201.4
177.0 180.6
215.4 203.2
147.7 149.0
208.1 195.4
166.8 158.5
131.9 134.4
150.3 149.3
297.2 189.1
159.8 159.1
171.7 173.2

Use the Wilcoxon signed-rank test at α = 0.05 to test whether the weight-reducing diet is
effective.

7. Define

a) Statistical Quality Control (SQC)

b) Process Control

c) What is a sampling inspection plan?

d) What are the causes of variation?

e) What is a control chart?

f) List the types of control charts and their examples

g) The table below shows the thickness of some materials produced by a machine.

Draw control charts for the mean and range and comment on the state of control

Sample 1 2 3 4 5 6 7 8 9 10

13 14 11 14 12 10 10 8 13 5
8 12 10 10 10 12 16 12 8 8
15 19 16 9 13 10 8 8 14 5
8 10 8 13 7 8 10 10 7 10

9a) Define time series

b) List the components of a time series

c) The following gives the volume of passengers (’000) of an airline over a period of
time.

Year Jan-Mar Apr-Jun Jul-sept Oct-Dec


1991 24 35 55 30

1994 20 42 70 26

1995 26 37 82 38

1996 25 38 90 40

Estimate the trend

i. Using the Least Square Method


ii. Using the method of moving averages smoothing out the trend.

10a) What is an Index Number?


b) What are the uses of Index Numbers?

c) What are the limitations of Index Numbers?

d) Given the following table

Commodity 1970 Price 1970 Quantity 1980 Price 1980 Quantity

A 4 10 8 16

B 5 12 6 12

C 2 5 4 10

D 3 4 5 8

E 4 12 7 10

Find the Laspeyres and Paasche price indices

Find Fisher’s ideal price index.


1a ANOVA: Analysis of Variance describes the partition of the response variable sum
of squares in a linear model into explained and unexplained components. It also
refers to procedures for fitting and testing linear models in which the explanatory
variables are categorical.

1b (i) Observations were randomly and independently chosen from the populations

ii Population distributions are normal for each group

iii Population variances are equal for all groups.
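
Under these assumptions, the one-way ANOVA F statistic for the data of question 1 can be computed as in the sketch below. It is a minimal illustration and not the workbook's own working. The function name one_way_anova is hypothetical. The tabulated 5% critical value of about 2.87 for (4, 20) degrees of freedom mentioned in the comment comes from standard F tables and is not computed by the code.

```python
# Samples A to E from question 1 (5 observations each).
samples = {
    "A": [5, 4, 8, 6, 3],
    "B": [9, 7, 8, 6, 9],
    "C": [3, 5, 2, 3, 7],
    "D": [2, 3, 4, 1, 4],
    "E": [7, 6, 9, 4, 7],
}


def one_way_anova(groups):
    """Return (ss_between, ss_within, f_stat) for a list of samples."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    # Explained variation: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Unexplained variation: observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    return ss_between, ss_within, f_stat


ss_between, ss_within, f_stat = one_way_anova(list(samples.values()))

# Compare f_stat with the tabulated F(4, 20) value at the 5% level (about 2.87).
print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}, F = {f_stat:.2f}")
```

If the computed F exceeds the tabulated value, the hypothesis of equal means is rejected at the 5% level.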

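10d i) Laspeyres price index = (ΣPnqo / ΣPoqo) × 100
= {(80+72+20+20+84)/(40+60+10+12+48)} × 100
= (276/170) × 100
= 162.35

ii) Paasche price index = (ΣPnqn / ΣPoqn) × 100
= {(128+72+40+40+70)/(64+60+20+24+40)} × 100
= (350/208) × 100
= 168.27

iii) Fisher's ideal index = √(Laspeyres × Paasche)
= √(162.35 × 168.27)
= 165.28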
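
The index calculations above can be checked with the following Python sketch, which uses the prices and quantities from the table in question 10d. The function name price_indices and the tuple layout are illustrative choices only.

```python
import math

# (base price, base quantity, current price, current quantity) for 1970 and 1980.
commodities = {
    "A": (4, 10, 8, 16),
    "B": (5, 12, 6, 12),
    "C": (2, 5, 4, 10),
    "D": (3, 4, 5, 8),
    "E": (4, 12, 7, 10),
}


def price_indices(rows):
    """Return (laspeyres, paasche, fisher) price indices, base = 100."""
    sum_pn_q0 = sum(pn * q0 for p0, q0, pn, qn in rows)
    sum_p0_q0 = sum(p0 * q0 for p0, q0, pn, qn in rows)
    sum_pn_qn = sum(pn * qn for p0, q0, pn, qn in rows)
    sum_p0_qn = sum(p0 * qn for p0, q0, pn, qn in rows)
    laspeyres = 100 * sum_pn_q0 / sum_p0_q0  # base-year quantities as weights
    paasche = 100 * sum_pn_qn / sum_p0_qn    # current-year quantities as weights
    fisher = math.sqrt(laspeyres * paasche)  # geometric mean of the two
    return laspeyres, paasche, fisher


laspeyres, paasche, fisher = price_indices(list(commodities.values()))
print(f"Laspeyres = {laspeyres:.2f}, Paasche = {paasche:.2f}, Fisher = {fisher:.2f}")
```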
10a An index number is a number that expresses the relative change in
price, quantity, or value compared to a base period.

10b Uses of index numbers
*They help in wage and salary negotiation and adjustment of allowances
*Used in international comparison
*Construction of cost of living index


10c Limitations of index numbers

The choice of representative commodities may lead to fallacious conclusions as they are based
on samples.
There may be errors in the choice of base periods or weights, etc.
Comparisons of changes in variables over long periods are not reliable.
They may be useful for one purpose but not for another.
They are specialized types of averages and hence are subject to all those limitations which an
average suffers from.
