1.1 Definition of Statistics
INTRODUCTION
Analyze data
Draw conclusions
1.2 Types of Statistics
Statistics
Descriptive statistics
Consists of methods for organizing, displaying, and describing
data by using tables, graphs, and summary measures.
Deals with the description and analysis of a given group of data.
Presents information in a convenient, usable, and comprehensible
form.
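As a small illustration of summary measures, the following Python sketch describes a made-up set of ten quiz scores. The data and the variable names are assumptions chosen for illustration; they do not come from these notes.

```python
import statistics

# Hypothetical quiz scores for ten students (illustrative data only).
scores = [55, 62, 62, 70, 71, 75, 78, 80, 88, 94]

# Summary measures that describe the data set.
mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value
score_range = max(scores) - min(scores)   # spread between the extremes

print(mean_score, median_score, mode_score, score_range)
```

Each number condenses the whole data set into one figure, which is what descriptive statistics aims to do.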
1.3 Population versus Sample
Population
Consists of all elements, i.e. individuals, items, or objects, whose
characteristics are being studied.
Sample
A portion of the population selected for study.
For example, you could randomly choose a few of your classmates and
ask them whether they like to eat chocolate ice cream. Suppose you find
that 60% of the classmates you asked like to eat chocolate ice cream. This is a
statistic because you chose only some students (the sample) from your class
(the population). The 60% is the proportion in the sample, which only
estimates the proportion in the whole class.
Now, suppose you ask every classmate whether they like to eat strawberry ice
cream, and 75% of them do. This is a parameter because you asked
everyone in the class.
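The distinction between a parameter and a statistic can be sketched in Python. The class of 40 students and their answers are hypothetical, and the seed is fixed only so that the run is repeatable.

```python
import random

# Hypothetical class of 40 students: True means the student likes
# chocolate ice cream (illustrative data, not from the notes).
population = [True] * 26 + [False] * 14

# Parameter: computed from every member of the population.
parameter = sum(population) / len(population)

# Statistic: computed from a random sample of 10 classmates.
rng = random.Random(1)  # fixed seed so the run is repeatable
sample = rng.sample(population, 10)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.2f}, statistic = {statistic:.2f}")
```

The parameter is fixed for a given population, while the statistic changes from one random sample to the next.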
Sample survey
- The technique of collecting information from a portion of the
population.
- The data obtained from sample surveys is called sample data.
Census
- A survey which includes every member of the entire population.
- The data collected is called census data.
- It gives more detailed and accurate information, but it takes longer
to analyze and is costly.
Methods of survey
i) Phone survey
Advantages:
- Saves time and is easy to conduct.
- People tend to be more candid in their opinions.
- Covers a wider geographic area.
Disadvantages:
- Some people do not have a phone or do not answer the call.
- Not all people have a chance of being surveyed.
- Costly if an overseas survey is conducted.
i) i)
ii) ii)
iii) iii)
iv)
Four common ways to select a random sample are discussed below:
Examples:
i)
ii)
Examples:
i) Suppose there are 5000 students in a college. 100 students are
to be selected by using the systematic random sampling
method to conduct a survey. Since 5000/100 = 50, a starting
number k between 1 and 50 (inclusive) is first selected at random.
The kth student is chosen, and then every 50th student after
that (k, k + 50, k + 100, ...).
ii) Every fourth person is selected from the phone directory to
conduct a survey on their monthly cellular phone data usage.
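The college example can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes that the population size is a multiple of the sample size (as with 5000 and 100) and that students are numbered from 1 to 5000.

```python
import random

def systematic_sample(population_size, sample_size, rng):
    """Return the 1-based positions chosen by systematic random sampling."""
    interval = population_size // sample_size   # 5000 // 100 = 50
    start = rng.randint(1, interval)            # random start in 1..50, inclusive
    # Take the start, then every `interval`-th position after it.
    return list(range(start, population_size + 1, interval))

rng = random.Random(2024)  # fixed seed so the run is repeatable
chosen = systematic_sample(5000, 100, rng)
print(chosen[:5], len(chosen))
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the sampling interval.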
Example:
A study is conducted to estimate the average income of all Assistant
Professors (A.P.) in UTAR Sungai Long campus.
First, subdivide the population according to faculty.
Then, determine the sample size to be collected, say 80. Finally,
samples are chosen randomly from each faculty, proportionally, as follows:
Faculty       CFS   FMC   FAM   FES   FCI
No. of A.P.    24     2    10    40     4
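Proportional allocation can be sketched in Python. The notes do not list the number of Assistant Professors in each faculty, so the population counts below are hypothetical, chosen only so that a sample of 80 reproduces the allocation in the table.

```python
# Hypothetical numbers of Assistant Professors per faculty (assumed for
# illustration; the notes do not list the population counts).
population = {"CFS": 240, "FMC": 20, "FAM": 100, "FES": 400, "FCI": 40}
sample_size = 80

total = sum(population.values())

# Each faculty contributes in proportion to its share of the population.
allocation = {
    faculty: round(sample_size * count / total)
    for faculty, count in population.items()
}

print(allocation)
```

With real counts the rounded allocations may not add up exactly to the sample size, in which case one or two groups need a manual adjustment.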
Other examples:
i) Subdivide all the UTAR students according to their program of
study, and then collect samples from each group in proportion to
its size to conduct a survey on their favorite outdoor
activity.
ii) The Dean wishes to know whether there is any difference in the online
survey responses given by students from different years of
study in UTAR. So, he collects samples from each
year of study in proportion to its size and then analyzes them.
Cluster sampling
Divide the population into at least two groups, called clusters, such
that each cluster is representative of the population. The clusters
do not overlap. Then, a random sample of clusters is
selected. Finally, a random sample of elements is selected from each of the
selected clusters. These random samples are then
combined to form a cluster random sample.
Examples:
i) A survey is conducted in Kuala Lumpur to determine whether
the doctors are satisfied with their salary. A few hospitals in
Kuala Lumpur are randomly selected and all doctors in the
selected hospitals are interviewed.
ii) Suppose a batch of 100 boxes of light bulbs is produced
by a machine and each box contains 10 light bulbs. 5 boxes are
selected randomly and all light bulbs inside these 5 boxes are
inspected.
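The light bulb example can be sketched in Python. The labelling of each bulb as a (box, position) pair is an assumption made for illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# 100 boxes (clusters), each holding 10 bulbs labelled (box, position).
boxes = {box: [(box, pos) for pos in range(1, 11)] for box in range(1, 101)}

# Stage 1: randomly select 5 boxes.
selected_boxes = rng.sample(sorted(boxes), 5)

# Stage 2 (as in the example): inspect every bulb in the selected boxes.
inspected = [bulb for box in selected_boxes for bulb in boxes[box]]

print(selected_boxes, len(inspected))
```

Randomness enters only when the boxes are chosen; every bulb inside a chosen box is inspected, so 50 bulbs are checked in total.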
i) Convenience sampling
A sample that includes the most accessible members from the
population.
Examples:
a) A survey is conducted by a lecturer to determine the average GPA
of students in UTAR. Samples are only collected from his own
lecture class.
b) Questionnaires are sent through email to friends to conduct a
survey on the behavior of UTAR students in lecture class.
iii) Quota sampling
A sample selected in such a way that each group or
subpopulation is represented in the sample in exactly the same
proportion as in the target population.
Types of error
Sampling error (chance error) and nonsampling error (systematic error)
Sampling error or chance error
The difference between the result obtained from a sample survey and
the result that would have been obtained from a census. It
occurs because of chance and cannot be avoided. A sampling error can
occur only in a sample survey; it does not occur in a census.
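A sampling error can be illustrated with a short Python simulation. The population of 1,000 monthly data-usage values is randomly generated and purely hypothetical, and the sketch assumes that no nonsampling error is present.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: monthly data usage (GB) of 1,000 people.
population = [rng.uniform(1, 20) for _ in range(1000)]
census_mean = statistics.mean(population)   # result a census would give

# A sample survey of 50 people gives a slightly different result.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

# Sampling error = sample result - census result (no nonsampling error assumed).
sampling_error = sample_mean - census_mean
print(f"census {census_mean:.2f}, sample {sample_mean:.2f}, error {sampling_error:+.2f}")
```

Drawing a different sample changes the error, which is why it is attributed to chance.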
Nonsampling errors
Errors that occur in collecting, recording, and tabulating the
data. They happen because of human mistakes, not chance. They can be
minimized if questions are prepared carefully and data are handled
cautiously.
a) Selection error
Occurs because the sampling frame is not representative of the
population.
Examples:
i)
ii)
b) Response error
Occurs when those who are included in the survey do not
provide true information.
Examples:
i)
ii)
c) Nonresponse error
Occurs when many of the selected people (the sample) do not respond to
the survey.
Examples:
i)
ii)
Examples:
i)
ii)
Types of variable
Qualitative Quantitative
Qualitative Variable
A variable that cannot be measured numerically but can be
classified or ranked into two or more nonnumeric categories. The
data collected on such a variable are called qualitative data.
Examples: beauty, color, gender, race, exam grade.
Quantitative Variable
A variable whose value can be measured numerically. It can be
classified as either discrete or continuous. The data
collected on a quantitative variable are called quantitative data.
a) Discrete variable
A variable whose values are countable; it can be represented
by integers only.
Examples:
i)
ii)
iii)
b) Continuous variable
A variable whose exact value cannot be measured; the
precision depends on the instrument used. It can assume any
numerical value, including fractions and decimals, over a
certain interval or intervals.
Examples:
i)
ii)
iii)
Data classification
In addition to being classified as qualitative or quantitative,
variables can be classified by how they are categorized, counted, or
measured.
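Level     Type          Example      Property
Nominal   Qualitative   Postcode     No order or ranking
Ordinal   Qualitative   Exam grade   Can be ranked
Interval  Quantitative  Temperature  Zero value exists but is arbitrary (it does not mean "none")
Ratio     Quantitative  Weight       True zero (zero means the quantity does not exist)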
Types of data
Cross-section data
Data collected from different elements at the same point in time or
for the same period of time.
Examples:
i) Total population in each state of Malaysia in the year 2016.
ii)
Time series data
Data collected from the same element for the same variable at
different points in time or for different periods of time.
Examples:
i) Total population in Malaysia for each year from 2010 to 2016.
ii)