
01 Econ115a Mod1 Lesson1 BasicStatisticalConcepts

The document discusses introductory statistical concepts including defining statistics, the two branches of statistics, and key terms like population and sample. It covers topics like data collection, different sampling methods, and how to determine sample size.

Uploaded by

lyriemaecutara0

Econ 115a – Econometrics

Module 1: Introduction to Statistics and SPSS



Lesson 1: Basic Statistical Concepts

Learning objectives:
- Define Statistics
- Differentiate the two branches of Statistics
- Define basic concepts in Statistics

Outline
1.1 What is Statistics?
1.2 Branches of Statistics
1.3 Population vs Sample
1.4 Sampling
1.5 Data Collection
1.6 Sources of Bias

1.1 What is Statistics?



Definitions of Statistics
- the science of collecting, analyzing, presenting, and interpreting data (Anderson et al., 2020).

- a discipline consisting of a body of methods for collecting and analyzing data (Agresti & Finlay, 1997).

- a collection of methods for collecting, displaying, analyzing, and drawing conclusions from data.

Statistics is the study of data: it involves describing properties of data and drawing
conclusions about a population based on information in a sample.

Statistics is a mathematical science that is concerned with the collection, analysis,
interpretation or explanation, and presentation of data.

Everything that deals with the collection, processing, interpretation, and presentation
of data belongs to the domain of statistics, and so does the detailed planning that
precedes all these activities.

Statistical methods can be used to find answers to some basic questions like:
• What kind and how much data need to be collected?

• How should we organize and summarize the data?

• How can we analyze the data and draw conclusions from it?

• How can we assess the strength of the conclusions and evaluate their uncertainty?

Statistics also provides methods for:


1. Design: Planning and carrying out research studies.

2. Description: Summarizing and exploring data.

3. Inference: Making predictions and generalizing about phenomena represented by the data.

1.2 Branches of Statistics



Branches of Statistics
1. Descriptive Statistics
- the branch of statistics that involves organizing, displaying, and describing data.
- gives an overview of the attributes of the data; the focus is on describing the data
obtained from a sample or a population.

Example:
The average age of citizens who voted for the winning candidate in the last election,
the average length of all books about statistics, the variation in the weight of 100
boxes of cereal selected from a factory’s production line.

2. Inferential Statistics
- the branch of statistics that involves drawing conclusions about a population based
on information contained in a sample taken from that population.
- makes inferences about the population and provides measures of how well the data
support a hypothesis.

Example:
A survey that sampled 2,001 full- or part-time workers ages 50 to 70, conducted by
the American Association of Retired Persons (AARP), discovered that 70% of those
polled planned to work past the traditional mid-60s retirement age.

In general, statistics is the study of data: describing properties of the data, which is
called descriptive statistics, and drawing conclusions about a population of interest
from information extracted from a sample, which is called inferential statistics.

Computing a single number to summarize the data is an operation of descriptive
statistics; using it to make a statement about the population is an operation of
inferential statistics.
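
As a minimal sketch of this distinction, the Python snippet below first summarizes a sample (descriptive) and then uses it to make a statement about the population (inferential). The cereal-box weights are made-up values, and the interval uses z = 1.96 as a simplifying assumption; a t-value would suit a sample this small better.

```python
import math
import statistics

# Hypothetical sample: weights (in grams) of 10 cereal boxes.
weights = [498, 502, 501, 499, 503, 497, 500, 504, 496, 500]

# Descriptive statistics: summarize the sample itself.
sample_mean = statistics.mean(weights)
sample_sd = statistics.stdev(weights)  # sample standard deviation (n - 1 divisor)

# Inferential statistics: use the sample to say something about the population.
# Approximate 95% interval for the population mean (assumes z = 1.96).
standard_error = sample_sd / math.sqrt(len(weights))
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print(f"sample mean = {sample_mean}, sample sd = {sample_sd:.2f}")
print(f"approx. 95% interval for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```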

Commonly used methods in Statistics:


1. Descriptive Statistics
1.a Measures of Central Tendency
1.b Measures of Dispersion/Variability
1.c Frequency tables and Custom Tables
1.d Kurtosis and Skewness
1.e Graphs (Histogram, Box plot, Stem-and-Leaf, Bar, Pie, Scatter plot, Q-Q plot, etc.)

2. Inferential Statistics
2.a Tests for differences between groups (z-test, t-tests, ANOVA, Mann-Whitney U, Wilcoxon,
Kruskal-Wallis, etc.)
2.b Testing for relationship (Correlation, Chi-square, etc.)
2.c Regression Analyses (OLS, Logit, Probit, Tobit, etc.)
2.d Structural Equation Modelling (PLS, Covariance-based)

1.3 Population vs Sample



Population and Sample


Population (N)
- all the members of a group about which
you want to draw a conclusion. The members
share one or more characteristics.
Sample (n)
- a subset of the population.

The process of selecting a sample is known as sampling.


The number of elements in the sample is the sample size.
Image source: https://www.omniconvert.com/what-is/sample-size/

Parameter vs Statistic
Parameter
- a number that summarizes some aspect of the population as a whole.
- a numerical measure that describes a characteristic of a population.

Statistic
- a number computed from the sample data.
- a numerical measure that describes a characteristic of a sample.
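
To illustrate, the sketch below builds a hypothetical population of 1,000 ages, computes the population mean (a parameter), and compares it with the mean of a random sample of 50 (a statistic). The data and the seed are assumptions made only for this illustration.

```python
import random
import statistics

random.seed(115)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of N = 1,000 people.
population = [random.randint(18, 65) for _ in range(1000)]

# Parameter: computed from every member of the population.
mu = statistics.mean(population)

# Statistic: computed from a sample of n = 50 drawn from that population.
sample = random.sample(population, 50)
x_bar = statistics.mean(sample)

print(f"parameter (population mean) = {mu:.2f}")
print(f"statistic (sample mean)     = {x_bar:.2f}")
```

Drawing another sample from the same population would give a different statistic, while the parameter stays fixed.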

Why do we need to do sampling?


- Time constraint
- Budget constraint
- Requirement for the method of analysis
- So that judgments (inferences) about the population can be made from the sample

1.4 Sampling

Two (2) Sampling Categories


1. Probability Sampling
- each unit of the population has a known,
nonzero probability of being selected to be part
of the sample.

2. Non-probability Sampling
- the probability of a unit being selected is unknown;
selection relies on convenience or the researcher’s judgment.

Image source: https://keydifferences.com/difference-between-probability-and-non-probability-sampling.html



Probability Sampling
1. Simple Random Sampling (SRS) – every member, and every possible set of members of a
given size, has an equal chance of being included in the sample.

2. Stratified Sampling – the population is first split into groups. The overall sample
consists of some members from every group. The members from each group are
chosen randomly.

3. Cluster Sampling – the population is first split into groups. The overall sample
consists of every member from some of the groups. The groups are selected
randomly.

4. Systematic Sampling – members of the population are put in some order. A
starting point is selected at random, and every kth member after it is selected to be
in the sample.

Others: Two-stage random sampling, multi-stage random sampling
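
The four designs above can be sketched in a few lines of Python. The population of 20 students grouped by grade level is hypothetical, and the function names are chosen only for this illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20 students, five in each of four grade levels.
grades = [7] * 5 + [8] * 5 + [9] * 5 + [10] * 5
population = [(f"S{i:02d}", g) for i, g in enumerate(grades, start=1)]

def simple_random(pop, n):
    """Every possible set of n members is equally likely."""
    return random.sample(pop, n)

def stratified(pop, per_group):
    """Some members, chosen at random, from every group."""
    groups = {}
    for member in pop:
        groups.setdefault(member[1], []).append(member)
    return [m for g in groups.values() for m in random.sample(g, per_group)]

def cluster(pop, n_groups):
    """Every member from some groups; the groups are chosen at random."""
    labels = sorted({member[1] for member in pop})
    chosen = set(random.sample(labels, n_groups))
    return [m for m in pop if m[1] in chosen]

def systematic(pop, k):
    """Random starting point, then every k-th member of the ordered list."""
    start = random.randrange(k)
    return pop[start::k]

print(simple_random(population, 4))
print(stratified(population, 2))
print(cluster(population, 2))
print(systematic(population, 5))
```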



Non-probability Sampling
1. Convenience Sampling – the researcher selects anyone he or she happens to come
across.

2. Purposive Sampling – only those elements of the population that best suit the
purpose of the study are selected.

3. Quota Sampling – a two-stage non-probability sampling method that first sets
quotas for subgroups of the population and then selects elements until each quota is
filled, so that the sample reflects the population’s characteristics.

4. Referral/Snowball Sampling – the first elements selected from the population are
asked to recommend other elements who fit the description of the sample needed.

How to determine sample size?


1. Slovin’s formula
2. Proportionate Sampling
3. Cochran’s Sampling
4. Other methods

1. Slovin’s formula (Yamane, 1967)

$$n = \frac{N}{1 + N e^2}$$
where:
n = sample size
N = population size
e = margin of error (as a decimal, e.g. 0.05 for 5%)

NOTE: The higher the margin of error (e), the smaller the required sample size (n).

Example: Your school has 1,500 students and you wish to draw a sample for your study.

For e = 1%:
$$n = \frac{1{,}500}{1 + 1{,}500(0.01)^2} = \frac{1{,}500}{1.15} \approx 1{,}304$$

For e = 5%:
$$n = \frac{1{,}500}{1 + 1{,}500(0.05)^2} = \frac{1{,}500}{4.75} \approx 316$$

For e = 10%:
$$n = \frac{1{,}500}{1 + 1{,}500(0.10)^2} = \frac{1{,}500}{16} \approx 94$$

NOTE: Slovin’s formula should only be used when estimating a population proportion
at a 95% confidence level.*

*Punzalan, R. B. and Tejada, J. J. (2012). On the Misuse of Slovin’s Formula. The Philippine Statistician, Vol. 61, No. 1.
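
The computations in the example can be checked with a short Python function. This is only a sketch of the formula as given above; results are rounded to the nearest whole number to match the example, although rounding up is often preferred in practice.

```python
def slovin(population_size, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2), with e given as a decimal."""
    return population_size / (1 + population_size * margin_of_error ** 2)

# School of N = 1,500 students, at three margins of error.
for e in (0.01, 0.05, 0.10):
    n = slovin(1500, e)
    print(f"e = {e:.0%}: n = {n:.2f}, rounded = {round(n)}")
```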

2. Proportionate Sampling
- samples are allocated according to each group’s weight/proportion in the population.

Grade Level (JHS)   No. of students   Proportion   Sample size
Grade 7                         400       26.67%            84
Grade 8                         300       20.00%            63
Grade 9                         500       33.33%           106
Grade 10                        300       20.00%            63
TOTAL                         1,500      100.00%           316

Slovin’s identified sample size (e = 5%) = 316
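
One way to reproduce the allocation in the table is sketched below in Python. It assumes largest-remainder rounding (round every share down, then give the leftover units to the groups with the largest fractional parts); this is an assumption about how the sample sizes were rounded so that they add up to 316.

```python
def proportional_allocation(strata_sizes, total_sample):
    """Split total_sample across strata in proportion to their sizes.

    Uses largest-remainder rounding so the parts add up to total_sample.
    """
    population = sum(strata_sizes.values())
    exact = {k: total_sample * v / population for k, v in strata_sizes.items()}
    alloc = {k: int(x) for k, x in exact.items()}  # round every share down first
    leftover = total_sample - sum(alloc.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

jhs = {"Grade 7": 400, "Grade 8": 300, "Grade 9": 500, "Grade 10": 300}
print(proportional_allocation(jhs, 316))
```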



3. Cochran’s Sampling (Cochran, 1977)


- this procedure is used when the population size is not known.

$$n_0 = \frac{Z^2 \, p(1 - p)}{e^2}$$
where:
n_0 = sample size
Z = Z-value from the z-table corresponding to the chosen confidence level
e = desired level of precision (margin of error)
p = estimated proportion of the population which has the attribute in question

Z-value (score) from the z-table



Example:
Suppose you are studying the inhabitants of a large community, say a housing
subdivision, and you want to find out how many households have cable television.
You don’t have much information about them to begin with, so you assume that half
of the households do, giving p = 0.50. If you want 95% confidence (Z = 1.96) with a
5% margin of error, you can now compute your sample size:

$$n_0 = \frac{Z^2 \, p(1 - p)}{e^2} = \frac{1.96^2 \times 0.5 \times (1 - 0.5)}{0.05^2} \approx 384$$

If we want 99% confidence (Z = 2.58) and a margin of error of at most 1%, assuming the
proportion is 50%:

$$n_0 = \frac{Z^2 \, p(1 - p)}{e^2} = \frac{2.58^2 \times 0.5 \times (1 - 0.5)}{0.01^2} = 16{,}641$$
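
Both computations can be reproduced with a small Python sketch. The function names are illustrative. The helper `z_value` uses `statistics.NormalDist` from the Python standard library in place of a printed z-table; it gives about 2.576 for 99% confidence, whereas the example above uses the rounded table value 2.58.

```python
from statistics import NormalDist

def cochran(z, p, e):
    """Cochran's formula: n0 = Z^2 * p * (1 - p) / e^2."""
    return z ** 2 * p * (1 - p) / e ** 2

def z_value(confidence):
    """Two-sided z-value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Values used in the two examples (z taken from the z-table).
print(cochran(1.96, 0.5, 0.05))
print(cochran(2.58, 0.5, 0.01))

# z-values computed instead of looked up.
print(round(z_value(0.95), 3), round(z_value(0.99), 3))
```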

4. Other methods
4.1 Neuman (2006)
For N = 1,000: sample about 30% of the population
For N = 10,000: sample about 10% of the population
For N = 150,000: sample about 1% of the population

4.2 Krejcie & Morgan (1970) Sample Size Table


4.3 Power Analysis (G-Power, Sample Power)

Krejcie & Morgan (1970) Sample Size Table

Source: Krejcie, R.V., & Morgan, D.W., (1970). Determining Sample Size for Research
Activities. Educational and Psychological Measurement.

You may also use online sample size calculators:

Surveymonkey: https://www.surveymonkey.com/mp/sample-size-calculator/

Qualtrics: https://www.qualtrics.com/au/experiencemanagement/research/determine-sample-size/

Raosoft: http://www.raosoft.com/samplesize.html

Note:
- the smaller the margin of error, the larger the required sample size for the same population.
- the higher the confidence level, the larger the required sample size.
- the larger the sample size, the more precise the estimates, so there is less of a
chance that your results are due to chance/coincidence.
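
The relationships in the first two points can be seen by rearranging Cochran’s formula to give the margin of error for a given sample size, e = Z * sqrt(p(1 - p) / n). The Python sketch below assumes p = 0.5 and a large population; the function name is illustrative.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Margin of error for a proportion, rearranged from Cochran's formula."""
    return z * math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size halves the margin of error.
for n in (100, 400, 1600):
    print(f"n = {n}: e = {margin_of_error(n):.4f}")

# For the same n, a higher confidence level (larger Z) widens the margin,
# so a larger sample is needed to keep the same margin of error.
print(f"n = 400 at Z = 2.58: e = {margin_of_error(400, z=2.58):.4f}")
```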

1.5 Data Collection



Data Collection
- is the process of gathering and measuring information on variables of interest,
in an established systematic fashion that enables one to answer stated research
questions, test hypotheses, and evaluate outcomes (Responsible Conduct of Research
(RCR) - Northern Illinois University).

General types of methods for Data Collection


1. Survey Method
- carefully planned and organized study or enquiry to collect data on the subject of
the study/enquiry.
2. Observation Method
- records data as things occur, making use of an appropriate and accepted method of
measurement.
3. Experimental Method
- collects data through well designed and controlled statistical experiments.

Common methods of Data Collection in the Social Sciences


1. Interview
- involves the services of trained enumerators (face-to-face, telephone, or video call).

2. Questionnaires and surveys
- used to ask structured closed-ended or open-ended questions.

3. Observation
- gathering firsthand information.

4. Focus Group Discussion (FGD)
- a group interview of people who have something in common (usually with a maximum
of 12 people per session).

Common tools/platforms for Data Collection


1. Pen-and-paper questionnaires
2. Telephone/Cellphone/E-mail/Other internet-based platforms
- Gmail, Yahoo Mail, Outlook, Facebook/Messenger, etc.
3. Online forms
- Google Forms, Microsoft Forms, SurveyMonkey, JotForms, etc.
4. Online & Offline Data Collection Toolkits
- Open Data Kit (ODK), Kobo, SurveyCTO, Enketo, etc.

1.6 Sources of Bias



Sources of Bias
Bias is a source of systematic error.

In statistics, we have a saying: “Garbage in equals garbage out.”

If you select your subjects in a way that is biased — that is, favoring certain individuals
or groups of individuals — then your results will also be biased.

The sample needs to be a good representation of the study population.



If the sample is biased, it is not representative of the study population, and conclusions
drawn from the study sample might not apply to the study population.

A statistic used to estimate a parameter is unbiased if the expected value of its
sampling distribution is equal to the value of the parameter being estimated.
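
This definition can be checked by brute force on a tiny example. The Python sketch below lists every equally likely sample of size 2 (drawn with replacement) from a hypothetical four-value population and averages each statistic over all of those samples.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [2, 4, 6, 8]        # hypothetical tiny population
mu = mean(population)            # parameter: population mean
sigma2 = pvariance(population)   # parameter: population variance

# Every possible ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# Expected value of a statistic = its average over all equally likely samples.
expected_mean = mean(mean(s) for s in samples)
expected_var_n1 = mean(variance(s) for s in samples)   # divisor n - 1
expected_var_n = mean(pvariance(s) for s in samples)   # divisor n

print(mu, expected_mean)          # sample mean: expected value equals mu
print(sigma2, expected_var_n1)    # n - 1 divisor: expected value equals sigma2
print(sigma2, expected_var_n)     # n divisor: expected value falls below sigma2
```

In this example the sample mean and the sample variance with divisor n - 1 are unbiased, while the variance with divisor n systematically underestimates the population variance.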

Bias enters a study in two primary ways:


1. During the selection and retention of the subjects of study – related to sampling.
2. In the way information is collected about the subjects – related to data collection.

Types of Biases:
1. Sample Selection Bias
2. Information Bias

1. Sample Selection Bias


1.a Selection bias: if some potential subjects are more likely than others to be
selected for the study sample.

The sample is selected in a way that systematically excludes part of the population.

1.b Volunteer bias: arises because people who volunteer to be in studies are usually
not representative of the population as a whole.

1.c Nonresponse bias: the other side of volunteer bias. Just as people who volunteer
to take part in a study are likely to differ systematically from those who do not, so
people who decline to participate in a study when invited to do so very likely differ
from those who consent to participate.

1.d Informative censoring: can create bias in any longitudinal study (a study in which
subjects are followed over a period of time).
Losing subjects (participants) during a long-term study is common, but the real
problem comes when subjects do not drop out at random, but for reasons related to
the study's purpose.
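
A small numerical sketch of selection bias (1.a): the hypothetical households below differ in income depending on whether they have a landline, and a telephone survey can reach only those that do. All values are invented for illustration.

```python
from statistics import mean

# Hypothetical population: (has_landline, monthly income in thousands).
households = [
    (True, 60), (True, 55), (True, 70), (True, 65),
    (False, 30), (False, 25), (False, 35), (False, 20),
]

# Parameter of interest: mean income of all households.
true_mean = mean(income for _, income in households)

# A telephone survey systematically excludes households without a landline.
reachable = [income for has_landline, income in households if has_landline]
survey_mean = mean(reachable)

print(f"population mean income  = {true_mean}")
print(f"telephone survey result = {survey_mean}")
```

Even if every reachable household responds, the estimate stays too high, because the excluded part of the population differs systematically from the part that can be sampled.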

2. Information Bias
2.a Interviewer bias: when bias is introduced into the data collected because of the
attitudes or behavior of the interviewer.

2.b Recall bias: the fact that people with a life experience such as suffering from a
serious disease or injury are more likely to remember events that they believe are
related to that experience.

2.c Social desirability bias: caused by people’s desire to present themselves in a
favorable light.

2.d Detection bias: the fact that certain characteristics may be more likely to be
detected or reported in some people than in others.

A test or treatment for a disease may perform differently according to some
characteristic of the study participant, which itself may influence the likelihood of
disease detection or the effectiveness of the treatment.

Detection bias can occur in trials when groups differ in the way outcome information
is collected or the way outcomes are verified.

Example:
Larger men have bigger prostates, which makes diagnosing prostate cancer via biopsy
more difficult (it is harder to hit the target).

Therefore, men with larger prostates are less likely to be accurately diagnosed with
prostate cancer. Thus, a real association between obesity and prostate cancer risk may
be underestimated.

There are many other biases that may be present or may exist in any study.

To learn more about other forms and sources of bias (especially for health- and
life-related studies), you may visit a database maintained by the University of Oxford
through the link:

https://catalogofbias.org/

References:
Beginning Statistics (2012). https://2012books.lardbucket.org/books/beginning-statistics/
Cochran, W.G. (1977). Sampling Techniques. 3rd ed. New York: John Wiley & Sons.
Best, J. W., & Kahn, J. V. (2006). Research in Education (10th ed.). Pearson Education.
Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to Design and Evaluate Research in Education (8th ed.). McGraw-Hill.
Isotalo, Jarkko (n.d.) Basic Statistics.
Wahl, M. (2013). Crash Course on Basic Statistics. University of New York at Stony Brook
Yamane, Y. (1967). Mathematical Formulae for Sample Size Determination.
https://www.statisticshowto.com/probability-and-statistics/how-to-use-slovins-formula/
https://www.statisticshowto.com/probability-and-statistics/find-sample-size/#Cochran
