
1.1 Definition of Statistics

This document provides an introduction to statistics, including:
1. Definitions of statistics as both numerical facts and the field of study involving methods to collect, organize, analyze and draw conclusions from data.
2. The two main types of statistics: descriptive statistics, which organize and summarize data, and inferential statistics, which make predictions about populations from samples.
3. Key concepts including populations, samples, parameters and statistics, and why samples are used instead of whole populations for surveys.
4. Common sampling techniques such as simple random sampling, systematic random sampling, stratified random sampling and cluster sampling, with examples of each.

Uploaded by

Byakuya Bleach
Copyright
© © All Rights Reserved

CHAPTER 1

INTRODUCTION

1.1 Definition of Statistics

The meaning of statistics


The word ‘statistics’ has two meanings.
I) Statistics refers to numerical facts.
- The total number of students in UTAR was over 10,000 in the year 2010.
- The height of a student, the passing rate of an examination, …
II) Statistics refers to the field or discipline of study.
- Statistics is a group of methods used to collect, organize, summarize, present and analyze data, as well as to draw valid conclusions and to make reasonable decisions on the basis of such analysis.

Statistical methods in problem solving

i) Identify problems / issues
ii) Plan and collect data
iii) Classify and simplify data
iv) Analyze data
v) Draw conclusions

1.2 Types of Statistics

Statistics

Descriptive statistics
- Consists of methods for organizing, displaying, and describing data by using tables, graphs, and summary measures.
- Deals with the description and analysis of a given group of data.
- Presents information in a convenient, usable and comprehensible form.

Example: Record the marks gained by every student in a Statistics test and analyze them to determine the mean, mode, median and standard deviation. Then, summarize the dataset in a table and plot a bar chart / pie chart to present the dataset.
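
The summary measures named in the example can be sketched in a few lines of Python. This is only an illustration: the marks below are made-up values, and the variable names are assumptions, not part of the original notes.

```python
import statistics

# Hypothetical marks of ten students in a Statistics test (illustrative only).
marks = [55, 62, 62, 70, 71, 75, 78, 80, 85, 92]

mean_mark = statistics.mean(marks)      # arithmetic average of the marks
median_mark = statistics.median(marks)  # middle value of the sorted marks
mode_mark = statistics.mode(marks)      # most frequently occurring mark
std_mark = statistics.stdev(marks)      # sample standard deviation (n - 1 divisor)

print(f"mean={mean_mark}, median={median_mark}, mode={mode_mark}, sd={std_mark:.2f}")
```

Note that `statistics.stdev` treats the marks as a sample; if the ten students were the whole group of interest, `statistics.pstdev` would be the population version.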
Inferential statistics
- Consists of methods that use sample results to make decisions or predictions about a population.
- Deals with the problems of making inferences or drawing conclusions about a population based on information obtained from samples taken from that population.

Example: Based on a sample survey, it is estimated that 65% of the professors in Malaysia are satisfied with their current job and salary.

The reasons for learning statistics are the following:


(i) To know how to properly present and describe information.
(ii) To know how to obtain reliable forecasts of variables of
interest.
(iii) To know how to improve processes.
(iv) To know how to draw conclusions about large populations
based on information obtained from samples.

1.3 Population versus Sample

Population
Consists of all elements, i.e. individuals, items, or objects, whose characteristics are being studied.

Sample
A portion of the population selected for study.

Example: Suppose we want to study the average height of the students in UTAR. To do so, 1000 students in Year 1 Semester 1 are selected and their heights are measured. State the population and the sample.
Population : All students in UTAR.
Sample : The 1000 Year 1 Semester 1 students whose heights were measured.

Parameter versus statistic


The definitions of statistic and parameter are similar. Both are numerical descriptions of groups, for example “38% of dog owners prefer XYZ Brand of canned dog food.” The difference is that a parameter describes a population while a statistic describes a sample.

For example, you could randomly choose a few of your classmates and ask them whether they like to eat chocolate ice-cream. Suppose you find that 60% of them like to eat chocolate ice-cream. This is a statistic because you only chose some students (sample) from your class (population). The 60% is the proportion among the sampled students, and it serves as an estimate of the proportion in the whole class.

Now, you could ask every classmate whether he or she likes to eat strawberry ice-cream. Suppose 75% of them like it. This is a parameter because you asked everyone in the class.
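
The distinction can be illustrated with a short Python sketch. The class of 40 students and the 24/16 split are hypothetical values chosen for illustration; the parameter is computed from the whole class, while the statistic is computed from a random sample of 10 classmates.

```python
import random

# Hypothetical class of 40 students: True means the student likes
# chocolate ice-cream. The 24/16 split is made up for illustration.
class_likes = [True] * 24 + [False] * 16

# Parameter: computed from every member of the population (the whole class).
parameter = sum(class_likes) / len(class_likes)

# Statistic: computed from a random sample of 10 classmates.
rng = random.Random(1)  # fixed seed so the run is repeatable
sample = rng.sample(class_likes, 10)
statistic = sum(sample) / len(sample)

print(f"parameter (whole class)  = {parameter:.2f}")
print(f"statistic (sample of 10) = {statistic:.2f}")
```

Changing the seed changes the statistic but never the parameter, which is the point of the distinction.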

Why choose sample?


i) Saves time
ii) Saves cost
iii) It may be impossible to identify or collect data from all members of the population.

- Sometimes it is impossible to identify all members of the population.

Example: Conduct a survey about opinions on a TV program. We do not know in advance who has watched that particular TV program.

- Sometimes conducting a survey means destroying the items sampled.

Example: Conduct a survey to estimate the average lifetime of non-rechargeable batteries. This would destroy all the batteries used in the survey.

Sample survey
- The technique of collecting information from a portion of the
population.
- The data obtained from sample surveys is called sample data.
Census
- A survey that includes every member of the population. The data collected are called census data.
- It gives more detailed and more accurate information, but it is costly and takes longer to carry out and analyze.

Methods of survey
i) Phone survey
Advantages:
- Saves time and is easy to conduct.
- People tend to be more candid in their opinions.
- Covers a wider geographical area.
Disadvantages:
- Some people do not have a phone or do not answer the call.
- Not all people have a chance of being surveyed.
- Costly if an overseas survey is conducted.

ii) Mailed/online questionnaire
Advantages:
- Covers a wider geographical area.
- Less expensive.
- Respondents can remain anonymous if desired.
Disadvantages:
- Low number of responses.
- Inappropriate answers to questions.
- Some people may have difficulty reading or understanding the questions.

iii) Personal interview
Advantages:
- Obtains in-depth responses.
- Provides a better understanding of the answers given by the respondents.
Disadvantages:
- The interviewer must be trained in asking questions.
- Costly and time consuming.

1.4 Sampling Techniques


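Sampling techniques

Random sampling
i) Simple random sampling
ii) Systematic random sampling
iii) Stratified random sampling
iv) Cluster sampling

Nonrandom sampling
i) Convenience sampling
ii) Judgment sampling
iii) Quota sampling
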
Four common ways to select a random sample are discussed below:

Simple random sampling


Samples are selected in such a way that each member of the population has an equal chance of being selected.

Examples:
i)

ii)

Statistical software such as MS Excel, SAS, SPSS, MATLAB, Mathematica, etc. can be used to generate a list of random numbers.
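
As a minimal sketch of the same idea, Python's standard `random` module (not one of the packages listed above) can draw a simple random sample directly. The function name `simple_random_sample` and the list of 100 student IDs are assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every member has an equal chance."""
    rng = random.Random(seed)  # seed is optional; fix it for a repeatable draw
    return rng.sample(population, n)

# Hypothetical sampling frame: student IDs 1 to 100.
student_ids = list(range(1, 101))
chosen = simple_random_sample(student_ids, 10, seed=42)
print(sorted(chosen))
```

`random.Random.sample` selects without replacement, so no student can appear twice in the sample.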

Systematic random sampling


In systematic random sampling, we first randomly select one member
from the first k units (by lottery or computer software).
k = [(population size) / (sample size)]
Then every kth member, starting with the first selected member, is
included in the sample.

Examples:
i) Suppose there are 5000 students in a college. 100 students are to be selected by using the systematic random sampling method to conduct a survey. Here k = 5000 / 100 = 50. First, a starting number between 1 and 50 (inclusive) is selected randomly, and then every 50th student counting from that starting student is chosen.
ii) Every fourth person is selected from the phone directory to conduct a survey on their monthly cellular phone data usage.
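
The college example, with k = 5000 / 100 = 50, can be sketched in Python as follows. The function name `systematic_sample` and the numbering of students from 1 to 5000 are assumptions for illustration only.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start among the first k units, then every k-th unit."""
    k = len(population) // sample_size  # sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, k)           # 1-based position within the first k units
    # Slice from the starting member with step k; trim in case the
    # population size is not an exact multiple of the sample size.
    return population[start - 1::k][:sample_size]

# Hypothetical list of students numbered 1 to 5000.
students = list(range(1, 5001))
sample = systematic_sample(students, 100, seed=7)
print(sample[:5])
```

Only the starting point is random; once it is fixed, the rest of the sample is fully determined by the interval k.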

Stratified random sampling


We first subdivide the population into at least two subgroups (strata) whose members share the same characteristic, such as gender or age bracket. Then, samples are selected from each stratum. Usually, the sizes of the samples selected from different strata are proportional to the sizes of the strata.
Advantage: produces more reliable results than simple random sampling if the measurements within each stratum are very similar.
Disadvantage: this sampling method is unusable when all the members of the population are identical, i.e. cannot be grouped.

Example:
A study is conducted to estimate the average income of all Assistant Professors (A.P.) in UTAR Sungai Long campus.
First, subdivide the population according to faculty as follows:

Faculty       CFS   FMC   FAM   FES   FCI
No. of A.P.   120    10    50   200    20

Then, decide on the total sample size to be collected, say 80. Since the population has 400 A.P. in total, the sampling fraction is 80 / 400 = 0.2. Finally, samples are chosen randomly from each faculty, proportionally, as follows:

Faculty       CFS   FMC   FAM   FES   FCI
No. of A.P.    24     2    10    40     4
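
The proportional allocation in the table can be checked with a short Python sketch. The faculty counts and the total sample size of 80 come from the example; the function name `proportional_allocation` is an assumption for illustration.

```python
# Stratum sizes from the example: number of A.P. in each faculty.
strata_sizes = {"CFS": 120, "FMC": 10, "FAM": 50, "FES": 200, "FCI": 20}
total_sample = 80

def proportional_allocation(strata_sizes, total_sample):
    """Give each stratum a share of the sample proportional to its size."""
    population_size = sum(strata_sizes.values())
    # Rounding is exact here; with other numbers the rounded shares may
    # not add up to total_sample and would need a manual adjustment.
    return {
        name: round(total_sample * size / population_size)
        for name, size in strata_sizes.items()
    }

allocation = proportional_allocation(strata_sizes, total_sample)
print(allocation)  # {'CFS': 24, 'FMC': 2, 'FAM': 10, 'FES': 40, 'FCI': 4}
```

Each share equals the stratum size multiplied by 80 / 400 = 0.2, which reproduces the second row of the table.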

Other examples:
i) Subdivide all the UTAR students according to their program of study, and then collect samples from each group in proportion to its size to conduct a survey on their favorite outdoor activity.

ii) A dean wishes to know whether there is any difference in the online survey responses given by students from different years of study in UTAR. So, he collects samples from each year of study in proportion to its size and then analyzes them.

Cluster sampling
Divide the population into at least two groups called clusters such that each group is representative of the population. The clusters do not overlap one another. Then, a random sample of clusters is selected. Finally, a random sample of elements from each of the selected clusters is selected. These random samples are then combined to form a cluster random sample. In some surveys, as in the examples below, every element of the selected clusters is included.

Examples:
i) A survey is conducted in Kuala Lumpur to determine whether
the doctors are satisfied with their salary. A few hospitals in
Kuala Lumpur are randomly selected and all doctors in the
selected hospitals are interviewed.
ii) Suppose a batch of 100 boxes of light bulbs is produced by a machine and each box contains 10 light bulbs. 5 boxes are selected randomly and all light bulbs inside these 5 boxes are inspected.
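
The light bulb example can be sketched in Python as follows. The way bulbs are labelled, as (box, position) pairs, is an assumption made for illustration.

```python
import random

rng = random.Random(3)  # fixed seed so the run is repeatable

# 100 boxes (clusters), each holding 10 bulbs labelled (box, position).
boxes = {box: [(box, pos) for pos in range(1, 11)] for box in range(1, 101)}

# Step 1: select 5 clusters (boxes) at random.
selected_boxes = rng.sample(sorted(boxes), 5)

# Step 2: inspect every bulb inside the selected boxes.
inspected = [bulb for box in selected_boxes for bulb in boxes[box]]

print(sorted(selected_boxes))
print(len(inspected), "bulbs inspected")
```

The randomness applies to the boxes, not to individual bulbs, which is what distinguishes cluster sampling from simple random sampling of bulbs.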

In a nonrandom sample, some elements of the population may have no chance of being included in the sample. Three types of nonrandom sampling are:

i) Convenience sampling
A sample that includes the most accessible members from the
population.
Examples:
a) A survey is conducted by a lecturer to determine the average GPA
of students in UTAR. Samples are only collected from his own
lecture class.
b) Questionnaires are sent through email to friends to conduct a
survey on the behavior of UTAR students in lecture class.

When to use convenience sampling?


When there are no criteria or conditions for selecting the sample, so that anyone can take part in the survey.

Advantage: easy to conduct, data can be collected in a short time, and less costly.

Disadvantage: bias in the sample collection and a high level of sampling error.

ii) Judgment sampling


A sample that includes members selected from the population based on the judgment or prior knowledge of an expert or experimenter.

Example: A researcher wishes to know the most popular luxury car brand in Malaysia. He may use his own judgment to select high-income targets such as ministers, CEOs, senior managers, directors, etc.

When to use judgment sampling?

When the number of people who could take part in the survey is limited. This sampling method aims to collect information/opinions from a specific group of persons.
Advantage: less costly and saves time.

Disadvantage: bias in the sample collection, and the reliability of the expert's judgment cannot be evaluated.

iii) Quota sampling
A sample selected in such a way that each group or subpopulation is represented in the sample in exactly the same proportion as in the target population.

Example: An interviewer has been told to interview 50 males and 100 females from the LKC FES in UTAR to determine their mode of transport to UTAR.

When to use quota sampling?

When a researcher aims to collect data from certain subgroups of the population.

Advantage: Allows the researcher to observe interactions and to make comparisons among the subgroups.

Disadvantage: Bias in the sample collection, and the potential sampling error might not be detected.

1.5 Types of Error

Types of error
i) Sampling error (chance error)
ii) Nonsampling error (systematic error)

Sampling error or chance error
The difference between the result obtained from a sample survey and the result that would have been obtained from a census. It occurs because of chance, and it cannot be avoided. A sampling error can occur only in a sample survey. It does not occur in a census.
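
A small simulation can make this definition concrete. The population of 1000 student heights below is simulated, not real data, and the variable names are assumptions for illustration; the sampling error is taken as the sample mean minus the mean a census would give.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: heights (cm) of 1000 students, simulated.
heights = [rng.gauss(165, 8) for _ in range(1000)]

# Result a census would give: the mean over every member of the population.
census_mean = statistics.mean(heights)

# Result of a sample survey: the mean over a random sample of 50 students.
sample = rng.sample(heights, 50)
sample_mean = statistics.mean(sample)

# Sampling error: sample result minus census result.
sampling_error = sample_mean - census_mean
print(f"census mean = {census_mean:.2f}, sample mean = {sample_mean:.2f}")
print(f"sampling error = {sampling_error:.2f}")
```

Drawing a different sample generally gives a different sampling error, while repeating the census on the same population always gives the same mean.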

Nonsampling errors
Errors that occur in collecting, recording, and tabulating the data. They happen because of human mistakes and not chance. They can be minimized if questions are prepared carefully and data are handled cautiously.

a) Selection error
Occurs because the sampling frame is not representative of the population.

Examples:
i)

ii)

b) Response error
Occurs when those who are included in the survey do not provide true information.

Examples:
i)

ii)

c) Nonresponse error
Occurs when many of the selected people (sample) do not respond to the survey.

Examples:
i)

ii)

d) Voluntary response error


Occurs when a survey is not conducted on a randomly selected sample but people are invited to respond voluntarily to the survey. Usually, only respondents with strong opinions about the issues involved will respond to the survey.

Examples:
i)

ii)

1.6 Types of Variable and Data

Types of variable

Qualitative Quantitative

Qualitative Variable
A variable that cannot be measured numerically but can be classified or ranked into two or more nonnumeric categories. The data collected on such a variable are called qualitative data.
Examples: beauty, color, gender, race, exam grade.

Quantitative Variable
A variable whose value can be measured numerically. It can be classified as either a discrete or a continuous variable. The data collected on a quantitative variable are called quantitative data.

a) Discrete variable
A variable whose values are countable, typically represented by integers.

Examples:
i)
ii)
iii)

b) Continuous variable
A variable that can assume any numerical value, including fractions and decimals, over a certain interval or intervals. Its exact value cannot be recorded; the precision depends on the measuring instrument used.

Examples:
i)
ii)
iii)

Data classification
In addition to being classified as qualitative or quantitative, variables can be classified by how they are categorized, counted, or measured.

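Level      Type           Example            Property
Nominal    Qualitative    Postcode           No order or ranking
Ordinal    Qualitative    Exam grade         Can be ranked
Interval   Quantitative   Temperature (°C)   No true zero (a zero reading does not mean "none")
Ratio      Quantitative   Weight             True zero exists (zero means "none")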

Types of data

Cross-section data
Data collected from different elements at the same point in time or
for the same period of time.

Examples:
i) Total population in each state of Malaysia in year 2016.

ii)

Time series data
Data collected from the same element for the same variable at
different points in time or for different periods of time.

Examples:
i) Total population in Malaysia for each year from 2010 to 2016.

ii)
