Sta104 Chapter 1
CHAPTER 1:
INTRODUCTION TO
STATISTICS
PRIMARY DATA
ADVANTAGES:
1. Primary data are more accurate, reliable and up-to-date.
2. If the data needed by decision makers are not available from other
sources (secondary data), then primary data have to be gathered.
3. Primary data usually satisfy the objectives of the research.
DISADVANTAGES:
1. Data gathered from primary sources are very costly, time-consuming
and require a lot of manpower.
SECONDARY DATA
ADVANTAGES:
1. Requires less time, cost and effort than collecting primary data.
DISADVANTAGES:
1. May contain printing errors and errors made when transcribing
from the primary sources.
2. Secondary data may not be able to fulfill the objectives of the research.
3. Individuals who use secondary data do not know the conditions under
which the data were collected and summarized. Therefore, the
intended user must first determine whether the data are relevant.
Response in numerical form
Example: age, weight and height.
DISCRETE
Can be measured precisely by counting, such as number of flowers,
number of pens and number of students.
CONTINUOUS
Response that can only be approximated to some accuracy using
measuring devices, such as temperature and time.
6. For each of the following situations, state whether the area of statistics
used is descriptive or inferential.
a) Obesity has become a major problem in Malaysia. It is found that 35% of
primary school children in the country are obese. Of these, 58% of the
children are male.
b) A study on “Obesity Among Primary School Children” conducted by
Professor Kamariah, which involved a random sample of 1500 primary school
children, indicated that these children are obese because they preferred
eating at fast food restaurants and playing video games rather than sports
activities.
Prepared by: Fadila Amira Razali
METHODS OF COLLECTING DATA
• TELEPHONE INTERVIEW
• DIRECT INTERVIEW
• MAILED QUESTIONNAIRES
• DIRECT OBSERVATION
• ONLINE SURVEY
Advantages
• Cheap way to reach respondents globally
• Response rates are higher than with the mailed questionnaire method
Disadvantages
• Respondents may not answer the questionnaire
• Respondents might not understand the questions very well
SAMPLING FRAME
-A list of all population members
Eg: List of all students registered in UiTM Raub
SAMPLING UNIT
-The element listed in the frame
PILOT STUDY
-Small exploratory exercise conducted on a small number of respondents before the actual
survey is done.
-Objectives: to improve questionnaire, to identify problems that occur during the survey, to
predict cost, time and workforce needed.
SAMPLING ERROR
-Error that arises because a sample cannot give complete information on a population
NON-SAMPLING ERROR
-Errors that occur in a survey due to non-response from respondents, faulty measuring
devices, respondents giving false information, or mistakes in recording and analyzing data
Solution Steps:
A total of 200 students of a school are grouped according to their race. The sample size
needed is 60.
(n = 60, N = 200)
Sample size for each group: $n_i = \frac{N_i}{N} \times n$

Groups   Number of students, $N_i$   Sample, $n_i$
A        20                          6
B        60                          18
C        80                          24
D        40                          12
TOTAL    200                         60
Next, elements are selected from each group by using a random procedure, usually simple random sampling (SRS).
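The allocation in the table can be checked with a short calculation. The following is a minimal Python sketch, not part of the original notes; the function name `proportional_allocation` and the dictionary layout are assumptions made for illustration. It applies the formula n_i = (N_i / N) × n to the four groups above.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Return n_i = (N_i / N) * n for each group (illustrative helper)."""
    population_size = sum(strata_sizes.values())  # N
    # Note: with other figures, rounding can make the parts sum to
    # slightly more or less than n, so the total should be checked.
    return {
        group: round(size * sample_size / population_size)
        for group, size in strata_sizes.items()
    }


# Figures from the worked example: N = 200 students, n = 60.
strata = {"A": 20, "B": 60, "C": 80, "D": 40}
allocation = proportional_allocation(strata, sample_size=60)

print(allocation)                # {'A': 6, 'B': 18, 'C': 24, 'D': 12}
print(sum(allocation.values()))  # 60
```

Each group contributes to the sample in the same proportion as it appears in the population (here 30% of every group), which is what keeps the sample representative of all groups.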
ADVANTAGES
1. Reduces cost, time and workforce, since only a few of the clusters formed are
selected as the sample.
DISADVANTAGES
1. The sample might not be representative of the population as a whole, since
nothing is known about the clusters not sampled.
2. It is preferable to divide the population into a large number of small clusters
than a small number of large clusters.
Solution:
This is a cluster sampling technique because the sample consists of all households
in five housing estates selected out of 60 housing estates. Households outside the
selected housing estates are not included in the sample. The sampling frame is the
list of 60 housing estates.
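The selection described in this solution can be sketched in Python as follows. This is an illustration only: the estate names, the household lists and the fixed seed are all hypothetical and do not come from the original question. The point it shows is that whole clusters are chosen at random, and then every household in a chosen cluster enters the sample.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: the list of 60 housing estates (the clusters).
estates = [f"Estate {i}" for i in range(1, 61)]

# Hypothetical household lists; a real survey would obtain these per estate.
households = {
    estate: [f"{estate} / Household {j}" for j in range(1, rng.randint(20, 40) + 1)]
    for estate in estates
}

# Step 1: choose 5 of the 60 clusters by simple random sampling.
selected_estates = rng.sample(estates, k=5)

# Step 2: take every household in each selected cluster.
sample = [h for estate in selected_estates for h in households[estate]]

print(selected_estates)
print(len(sample))
```

Note that the frame needed here is only the list of estates, not a list of every household in all 60 estates, which is where the savings in cost and time come from.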
E.g: Suppose we need a random sample of 2,000 residents from the Malaysian population.
Since Malaysia consists of 13 states and 3 federal territories, with many districts within each,
and many villages within each district, we could apply the multistage sampling technique.
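A multistage selection of this kind can be sketched in Python. The hierarchy below (6 states, 4 districts per state, 5 villages per district, 50 residents per village) is entirely hypothetical and much smaller than the real population; the function name `multistage_sample` and its parameters are likewise assumptions for illustration. At each stage a simple random sample is taken from the units chosen in the previous stage.

```python
import random

# Hypothetical hierarchy: state -> district -> village -> list of residents.
population = {
    f"State {s}": {
        f"District {s}.{d}": {
            f"Village {s}.{d}.{v}": [
                f"Resident {s}.{d}.{v}.{r}" for r in range(1, 51)
            ]
            for v in range(1, 6)
        }
        for d in range(1, 5)
    }
    for s in range(1, 7)
}


def multistage_sample(population, n_states, n_districts, n_villages,
                      n_residents, rng):
    """Sample states, then districts, then villages, then residents."""
    sample = []
    for state in rng.sample(sorted(population), n_states):
        districts = population[state]
        for district in rng.sample(sorted(districts), n_districts):
            villages = districts[district]
            for village in rng.sample(sorted(villages), n_villages):
                sample.extend(rng.sample(villages[village], n_residents))
    return sample


rng = random.Random(7)  # fixed seed so the sketch is repeatable
residents = multistage_sample(population, n_states=3, n_districts=2,
                              n_villages=2, n_residents=10, rng=rng)
print(len(residents))  # 3 * 2 * 2 * 10 = 120
```

In practice the number of units taken at each stage would be chosen so that the final stage yields the required total, such as the 2,000 residents in the example.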
QUESTION 1
Ali Travel Agency, a nationwide travel agency, offers special rates on Phuket-Penang
cruises to Malaysian citizens. A researcher at Ali Travel Agency wants to conduct research on
the ages of the people taking the cruise. A sample of 100 customers who took a cruise last year
was selected.
QUESTION 2
A group of researchers plan to carry out a survey on the number of vehicles bought in Town
Y from January to June 2017. These vehicles can be categorized according to the types.
Type             Number of authorized dealers
Motorcycle       50
Car              85
Heavy Vehicles   20
In order to save cost and time, they plan to survey only 40 of these authorized dealers.
a) State the population.
b) Which is the most appropriate sampling technique that may be used by the researchers?
Give ONE reason for your answer.
c) Calculate the sample size that represents each type of vehicle using the proportional
sampling technique.
d) Which is the most suitable method of data collection? Give TWO advantages of this
method.
The manager then selects randomly a number of employees from each branch. He decides
to select at random a total of 400 employees from 1000 employees.