Elementary Statistics

The document discusses fundamental concepts in statistics including descriptive and inferential statistics. It defines key terms like population, sample, parameter, statistic, variable and different types of variables. It also discusses different levels of measurement for variables including nominal, ordinal, interval and ratio scales.

STATISTICS AND PROBABILITY

Module 1: FUNDAMENTAL CONCEPTS OF STATISTICS


Statistics is the science that deals with the methods of collecting, organizing, summarizing, presenting, analyzing,
and interpreting quantitative data to support more effective, efficient, reliable, and valid decisions about a
particular problem.
Statistics is widely used in education. Research has become a common feature in all branches of activity.
Statistics is necessary for formulating policies to start a new course, for assessing the facilities available
for new courses, and so on. Educators, students, and all individuals in the academe engage in research work to
test past knowledge and develop new knowledge. This work is made possible with the help of statistics.
TWO AREAS OF STUDYING STATISTICS
1. Descriptive Statistics
These are statistical methods that organize, summarize, and describe data, providing an organized visual
presentation of the data collected without drawing generalizations (inferences) about a larger group or
population.
Ex. Measures of central tendency (mean, median, mode) and measures of variability (range,
interquartile range, variance, semi-interquartile range, and standard deviation).
2. Inferential Statistics
These are statistical techniques used to estimate or predict a population parameter from a sample statistic. They
include making a decision, estimate, prediction, or generalization about a population based on a sample. The
idea is that we study the characteristics of a sample data set, and the results derived from the sample allow us
to draw conclusions and generalizations about the population from which the sample was taken.
Population- refers to the complete set of all possible elements, individuals, objects, observations, or
measurements of interest. It is a group of individuals/subjects that share the same characteristics.
Sample- refers to a portion, part, or subset of the population of interest that is chosen for
analysis. It is the data set that the investigator or researcher plans to study with the objective of
making a generalization about the population.
Statistic- is a summary measure used to describe the characteristics of a sample: a value calculated
using the data from a sample, such as the mean, median, standard deviation, or variance. A single such
figure is a statistic (singular); a collection of more than one figure is called
statistics (plural).
Parameter- is a summary measure used to describe the characteristics of a population: a value
calculated using the data from the whole population.
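The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below is made up purely for illustration (1,000 hypothetical student ages), and the sample size of 50 is likewise an assumption.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 1,000 students (made-up values).
population = [random.randint(17, 25) for _ in range(1000)]

# Parameter: computed from every element of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values are usually close but not identical; inferential statistics is about reasoning from the second to the first.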
TYPES OF VARIABLES
A variable is a condition or characteristic that the researcher manipulates, controls, or observes.
Example: Age, sex, business income and expenses, country of birth, capital expenditure, class grades,
eye color, and vehicle type are examples of variables.
General Classification of Variables
1. Qualitative Variable- It is a general category for any variable with no numerical characteristics; it is
gathered through categorical responses. It simply describes the quality or characteristics of something.
Examples. Gender, religious affiliation, class sections, place of birth, hair color, eye color.
2. Quantitative Variable- It is a general category for any variable that can be counted, measured, or has a
numerical value associated with it. It simply describes the amount or number of something. This variable can be
categorized as either a discrete quantitative variable or a continuous quantitative variable.
Examples. Number of Students, Length of time to solve a statistical problem, grades in Statistics subject,
monthly salary.
• Discrete Variable- A variable that can only assume certain values, usually with “gaps” between
values. Typically, discrete variables result from counting.
Examples. The number of students in the family, number of newly constructed school buildings in EVSU
Tanauan, number of COVID-19 deaths recorded.
• Continuous Variable- A variable that results from a measuring process and can assume
any value within a specific range of values.
Examples: minutes remaining in class, student’s height, patient's body temperature, student's general weighted
average in Statistics.
3. Other Variables- These are variables commonly utilized in research projects. You need to be
familiar with the distinction and uses of each variable.
A Constant is any quantity or number whose value does not change.
A Dependent Variable is a variable that changes because of the independent variable. In other words, it
is the response to the change in the independent variable. It is also called the outcome of an experiment.
An Independent Variable is a variable that is set or manipulated by the researcher and is not affected by
the other variables in the study.
An Intervening Variable is a hypothetical variable used to explain causal links between other variables.
A Control Variable is a variable that the researcher wants to remain constant and unchanging. It is commonly
used in experimental studies.
A Nuisance Variable is an extraneous variable that increases variability overall.
A Moderator Variable is a qualitative (e.g., sex, race, class) or quantitative (e.g., level of
reward) variable that affects the direction and/or strength of the relation between an independent or predictor
variable and a dependent or criterion variable.
A Mediator Variable is a variable that functions as a mediator to the extent that it
accounts for the relation between the predictor and the criterion.
In a correlational study, a moderator variable is one that influences the strength of a relationship between
two other variables, while a mediator variable is one that explains the relationship between the two other
variables.
For example, consider the relationship between social class (SES) and frequency of breast self-exams
(BSE). Age might be a moderator variable, in that the relation between SES and BSE could be
stronger for older women and less strong or nonexistent for younger women. Education might be a mediator
variable, in that it explains why there is a relation between SES and BSE: when you remove the effect of
education, the relation between SES and BSE disappears.
TYPES OF DATA
A datum is a single piece of information.
Data are multiple pieces of information: known facts, figures, observations, statistics, records, and reports, among others.
Ungrouped data are given as raw, individual data points; the information is unorganized.
Example. Examination scores of 12 education students on a 10-item test: 9, 9, 9, 8, 10, 10, 7, 8, 7, 5, 9, 10
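As an illustration, the descriptive measures named earlier (mean, median, mode, range, variance, standard deviation) can be computed for these 12 scores with Python's standard `statistics` module. This is a minimal sketch; it assumes the scores are treated as a sample, so the variance and standard deviation use the n - 1 denominator.

```python
import statistics

# Ungrouped data: the 12 examination scores from the example.
scores = [9, 9, 9, 8, 10, 10, 7, 8, 7, 5, 9, 10]

mean_score = statistics.mean(scores)        # arithmetic average
median_score = statistics.median(scores)    # middle value of the sorted scores
mode_score = statistics.mode(scores)        # most frequent score
score_range = max(scores) - min(scores)     # highest minus lowest
sample_variance = statistics.variance(scores)  # uses n - 1 (sample variance)
sample_sd = statistics.stdev(scores)           # square root of the sample variance

print(f"mean={mean_score:.2f} median={median_score} mode={mode_score}")
print(f"range={score_range} variance={sample_variance:.2f} sd={sample_sd:.2f}")
```

If the 12 students were instead regarded as the whole population, `statistics.pvariance` and `statistics.pstdev` would be the matching functions.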
Grouped data are data formed by aggregating individual observations or data values of a variable into groups
so that a frequency distribution formed out of these groups serves as a convenient means of summarizing or
analyzing the data.
Example.
Age     Frequency
22      20
21      30
20      15
19      15
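A short sketch of how a grouped (frequency) table like this one can be summarized: each age is weighted by its frequency. The variable names are illustrative only.

```python
# Grouped data from the example: age -> frequency.
age_frequency = {22: 20, 21: 30, 20: 15, 19: 15}

# Total number of observations is the sum of the frequencies.
n = sum(age_frequency.values())

# Weighted mean: sum of (value * frequency) divided by n.
mean_age = sum(age * freq for age, freq in age_frequency.items()) / n

# Modal age: the value with the highest frequency.
modal_age = max(age_frequency, key=age_frequency.get)

print(f"n={n} mean age={mean_age} modal age={modal_age}")
```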

VARIABLE/LEVEL OF DATA MEASUREMENTS


1. Nominal Scale- is the lowest level of data measurement. It is used merely for classification or identification
purposes; no measurement of data is involved, only counting. There is no ordering of the categories, and the
sizes of the categories have no meaning. These data categories are mutually exclusive and exhaustive, which
means each observation falls into one and only one category.
Example. Eye color, House ownership, Religious Affiliation
Note:
• Mutually exclusive. A data value, item, or observation contained in one category must be
excluded from every other category.
• Exhaustive. Every data value, element, item, object, or observation must be classified in at least one
category.
2. Ordinal Scale- is a level of data measurement that has the characteristics of a nominal scale, but this time the
data/variables have naturally ordered categories. Ordering the measurements is possible; however, the distance
between data values cannot be determined or is meaningless.
Example. Faculty Rank (categorized as Instructor, Assistant Professor, Associate Professor, Professor)
Beauty Contest (2nd Runner up, 1st Runner up, Winner).
3. Interval Scale- is a level of data measurement that possesses all the characteristics of the ordinal scale.
It is possible to differentiate between any two classes based on the degree of difference.
Distances between any two points are of known size, the unit of measurement is constant (but arbitrary), and the
zero point is arbitrary (there is no natural zero point). In addition, with interval measurements, the operations of
addition and subtraction have meaning.
Example. The temperature on the Celsius scale on certain winter days in Canberra, Australia: (12°C, 0°C,
-5°C). Note that the zero value is just a point on the scale and does not represent the absence of the condition.
4. Ratio Scale- is the highest level of data measurement and has all the characteristics of the interval scale.
Unlike the interval scale, it has a true zero, which indicates a total absence of the
property being measured. It also permits comparisons such as being twice as high. The ratio between two
data values is meaningful.
Example. 1. The number of students in Statistics class.
2. Weight (A person who weighs 120 lbs. is twice as heavy as a person who weighs 60 lbs.).
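One way to see the difference between the interval and ratio scales is to change the unit of measurement and check whether a ratio survives. The sketch below uses two hypothetical temperatures (20°C and 10°C, not taken from the example above) and the 120 lb / 60 lb weights; the conversion constants are the standard ones.

```python
import math

def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

LB_TO_KG = 0.45359237  # exact definition of the pound in kilograms

# Interval scale: hypothetical temperatures of 20°C and 10°C.
ratio_celsius = 20 / 10
ratio_fahrenheit = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio scale: the weights from the example, 120 lb and 60 lb.
ratio_pounds = 120 / 60
ratio_kilograms = (120 * LB_TO_KG) / (60 * LB_TO_KG)

print(ratio_celsius, ratio_fahrenheit)   # ratio changes with the unit
print(ratio_pounds, ratio_kilograms)     # ratio is preserved
```

Because the zero of the Celsius scale is arbitrary, "twice as hot" depends on the unit chosen; weight has a true zero, so "twice as heavy" holds in any unit.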

Module 2: SAMPLE SIZE AND SAMPLING DESIGN IN RESEARCH


STRATEGIES FOR DETERMINING SAMPLE SIZE
1. Census or Complete Enumeration (Small Population)
2. Through Literature or Published Tables of Sample Size selection
3. Adopting the Sample Size of similar Study.
4. Sample Size Determination Formula
- Level of Confidence
- Degree of Variability of Subjects
- Margin of Precision or Error
5. Sample Size Calculator/Software
6. Sample size recommendation based on someone with authority.

SAMPLE SIZE COMPUTATION CRITERIA


o Level of Confidence
o Level of Precision or Margin of Error (Sampling error)
o Degree of Variability of Subjects

Level of Confidence
The key idea encompassed in the Central Limit Theorem is that when a population is repeatedly sampled, the
average value of the attribute obtained from those samples approaches the true population value, and the
sample values are distributed approximately normally around it.
The level of confidence measures the certainty of the estimate.
*Widely Used Confidence Levels
90%, 95%, & 99%
Example: Using a 95% confidence level, what does it mean?
About 95 out of 100 samples will have the true population value within the range of precision specified.
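The "95 out of 100 samples" reading can be checked with a small simulation. This is only a sketch under stated assumptions: the population is hypothetical (normal with mean 50 and standard deviation 10), each sample has 100 observations, and the interval uses the usual normal multiplier 1.96.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

TRUE_MEAN, TRUE_SD = 50.0, 10.0    # hypothetical population values
N_SAMPLES, SAMPLE_SIZE = 1000, 100  # number of repeated samples, size of each
Z_95 = 1.96                         # normal multiplier for 95% confidence

covered = 0
for _ in range(N_SAMPLES):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    mean = statistics.mean(sample)
    # Margin of error of the sample mean at the 95% level.
    margin = Z_95 * statistics.stdev(sample) / SAMPLE_SIZE ** 0.5
    if mean - margin <= TRUE_MEAN <= mean + margin:
        covered += 1

coverage = covered / N_SAMPLES
print(f"intervals containing the true mean: {coverage:.1%}")
```

The printed share should land near 95%, though any single run will differ slightly.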
Margin of Precision or Error
It is the range in which the true value of the population is estimated to be. This range is often presented in
percentage form (e.g. ± 5%).
It measures the uncertainty of the estimate.
Example. Suppose the researcher finds that 60% of the sampled students adopted the synchronous mode of
classes, with a ±5% margin of error. This tells us that between 55% and 65% of the population have adopted
the modality.
Degree of Variability of the Attributes Being Measured
Refers to the distribution of attributes in the population. The more heterogeneous a population, the larger the
sample size required to obtain a given level of precision. The less variable (more homogeneous) a population,
the smaller the sample size.
*Commonly used percentages of variability are 20%, 50%, and 80%.
SAMPLE SIZE DETERMINATION FORMULA
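The formula itself does not appear in this copy of the module. As a stand-in, the sketch below uses Cochran's formula for a proportion, n0 = z^2 * p * (1 - p) / e^2, with an optional finite population correction. This choice is an assumption, not necessarily the formula the module intended; it is shown because it uses exactly the three criteria listed above (confidence level through z, variability through p, and margin of error through e). The population of 5000 is a hypothetical value.

```python
import math

def cochran_sample_size(z, p, e):
    """Cochran's formula for a proportion (assumed here, see lead-in).

    z: z-value for the confidence level (1.96 for 95%)
    p: degree of variability as a proportion (0.5 is the most conservative)
    e: margin of error as a proportion (0.05 for +/- 5%)
    """
    return z ** 2 * p * (1 - p) / e ** 2

def finite_population_correction(n0, population_size):
    """Adjust n0 downward when the population is small."""
    return n0 / (1 + (n0 - 1) / population_size)

n0 = cochran_sample_size(z=1.96, p=0.5, e=0.05)
n = finite_population_correction(n0, population_size=5000)  # hypothetical N

# Sample sizes are rounded up to the next whole respondent.
print(math.ceil(n0))  # large-population sample size
print(math.ceil(n))   # corrected for a population of 5000
```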

SAMPLING
The process of choosing individual members or a portion of the population in order to draw statistical
conclusions and estimate the characteristics of the entire population.
TWO GENERAL METHODS OF SAMPLING
Probability sampling involves random selection, allowing you to make strong statistical inferences about the
whole group.
Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you
to easily collect data.
Probability Sampling
1. Simple Random Sampling In a simple random sample, each member/subject of the target population has
an equal chance of being selected as part of the sample. Your sampling frame
(the complete list of individuals or elements, depending on the subject of interest) should include the whole
population.
Common techniques: Lottery technique, Fishbowl technique
For example, in an organization of 500 employees, if the HR team decides to conduct team building
activities, it is highly likely that they would prefer picking chits out of a bowl. In this case, each of the 500
employees has an equal opportunity of being selected.
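Drawing chits from a bowl can be mimicked with `random.sample`, which selects without replacement so that every employee has the same chance. The sample size of 20 is a hypothetical choice; the example above does not state one.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Sampling frame: employee numbers 1 to 500.
employees = list(range(1, 501))

# Draw 20 employees without replacement (20 is a hypothetical sample size).
chosen = random.sample(employees, k=20)

print(sorted(chosen))
```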
2. Systematic Random Sampling Using systematic random sampling, researchers select sample members from
a population at regular intervals. It requires selecting a starting point for the sample and a fixed
interval that is repeated through the population list.
For example, a researcher intends to collect a systematic sample of 500 people in a population of 5000.
He/she numbers each element of the population from 1-5000 and will choose every 10th individual to be a part
of the sample (Total population/ Sample Size = 5000/500 = 10).
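The interval calculation and selection can be sketched as follows. The random starting point within the first interval is an assumption added here; the example above only states that every 10th individual is chosen.

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

population_size = 5000
sample_size = 500

# Sampling interval: total population / sample size = 5000 / 500 = 10.
interval = population_size // sample_size

# Random start somewhere in the first interval (assumed, see lead-in).
start = random.randint(1, interval)

# Every 10th element from the starting point onward.
sample = list(range(start, population_size + 1, interval))

print(interval, start, len(sample))
```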
3. Stratified Random Sampling is a technique in which the researcher divides the population into subgroups
(strata) that do not overlap but are representative of the full population. During sampling, it is possible to
organize these groups and then draw samples from each group independently.
Example: The company has 800 female employees and 200 male employees. You want to ensure that
the sample reflects the gender balance of the company, so you sort the population into two strata based on
gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a
representative sample of 100 people.
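The proportional allocation in this example (80 women and 20 men out of 100) can be reproduced with a short sketch. The employee labels such as "F1" and "M1" are made up for illustration.

```python
import random

random.seed(11)  # fixed seed so the sketch is repeatable

# Strata from the example; the labels are hypothetical identifiers.
strata = {
    "female": [f"F{i}" for i in range(1, 801)],  # 800 female employees
    "male": [f"M{i}" for i in range(1, 201)],    # 200 male employees
}
total = sum(len(members) for members in strata.values())
sample_size = 100

sample = {}
for name, members in strata.items():
    # Each stratum contributes in proportion to its share of the population.
    k = round(sample_size * len(members) / total)
    # Simple random sampling within the stratum.
    sample[name] = random.sample(members, k)

print({name: len(chosen) for name, chosen in sample.items()})
```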
4. Cluster Random Sampling also entails splitting the population into subgroups, but each subgroup should
have characteristics that are similar to those of the entire population. Instead of randomly selecting individuals
from each subgroup, entire subgroups are selected.
Example: The company has offices in 10 cities across the country (all with roughly the same number of
employees in similar roles). You don’t have the capacity to travel to every office to collect your data, so you use
random sampling to select 3 offices – these are your clusters.
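In code, the difference from stratified sampling is that the random draw is made over the groups, not over the individuals. A minimal sketch with hypothetical office names:

```python
import random

random.seed(5)  # fixed seed so the sketch is repeatable

# The 10 offices are the clusters (names are hypothetical).
offices = [f"Office {i}" for i in range(1, 11)]

# Randomly select 3 whole offices; data is then collected
# from the employees of these offices only.
clusters = random.sample(offices, k=3)

print(clusters)
```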
5. Multi-Stage Random Sampling Uses a combination of random sampling designs to obtain the needed
samples from a large population.
Non-Random Sampling
1. Convenience Sampling- In this type of sampling, researchers choose participants according to their own
convenience. The researcher selects the nearest available persons as respondents. In convenience sampling,
subjects who are readily accessible or available to the researcher are selected.
2. Purposive Sampling- In this type of sampling, the researcher chooses the participants as per his/her own
judgment, keeping in mind the purpose of the study. It uses the judgment of an expert in selecting cases, or
it selects cases with a specific purpose in mind.
3. Quota Sampling- You select your sample according to some fixed quota. This type of sampling is somewhat
related to stratified sampling. Make sure that the sample represents each group or stratum of the population.
Unlike stratified sampling, the researcher in the quota sampling method selects the subjects who are immediately
available and fulfill the criteria.
4. Snowball Sampling- Also called "chain referral sampling," in this method the sample is collected in
various stages. This method is appropriate when the members of a special population are difficult to locate. It
begins with the collection of data from one or more contacts usually known to the person collecting the data. At
the end of the data collection process (e.g., questionnaire, survey, or interview), the data collector asks the
respondent to provide contact information for other potential respondents. These potential respondents are
contacted, interviewed, and further asked to provide more contacts. This process goes on until the purpose of the
researcher is achieved.
Sample Scenario:
1. A researcher looking to analyze the characteristics of people belonging to different annual income divisions
will create strata (groups) according to the annual family income. E.g. – less than $20,000, $21,000 – $30,000,
$31,000 to $40,000, $41,000 to $50,000, etc. By doing this, the researcher concludes the characteristics of
people belonging to different income groups. Marketers can analyze which income groups to target and which
ones to eliminate to create a roadmap that would bear fruitful results.
Answer: Stratified Random Sampling
2. Company A distributes leaflets about upcoming events or promotions of its new toothpaste product by
standing at the mall entrance and giving out pamphlets to whoever passes by.
Answer: Convenience Sampling
3. All employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly
select a starting point: number 6. From number 6 onwards, every 10th person on the list is selected (6, 16, 26,
36, and so on), and you end up with a sample of 100 people.
Answer: Systematic Random Sampling
4. The company has offices in 10 cities across the country (all with roughly the same number of employees in
similar roles). You don’t have the capacity to travel to every office to collect your data, so you use random
sampling to select 3 offices – these are your clusters.
Answer: Cluster Random Sampling
5. You are researching the experiences of students with successful businesses in your school. Since there is no list of
all students with successful businesses in the school, you meet one person who agrees to participate in the
research, and she puts you in contact with other students she knows in the area who could be included in the
survey.
Answer: Snowball Sampling
Module 4: INTRODUCTION TO DATA COLLECTION AND METHODS / TECHNIQUES OF DATA
COLLECTION
DATA
➢ Data are various kinds of information formatted in a particular way.
➢ Data are a collection of facts such as numbers, words, measurements, observations, or just descriptions of things.
DATA COLLECTION
➢ Is the process of gathering, measuring, and analyzing accurate data from a variety of relevant sources to find
answers to research problems, answer questions, evaluate outcomes, and forecast trends and probabilities.

Two General Sources of Data


Primary Source of Data Collection
➢ As the name implies, this is original, first-hand data collected by the researcher. This process is the
initial information gathering step, performed before anyone carries out any further or related research. Primary
data results are highly accurate, provided the researcher collects the information personally. However, primary
data collection is potentially time-consuming and expensive.
Methods of primary data collection
1. Interviews
The researcher asks questions of a large sample of people, either through direct interviews or by means of
communication such as phone or email. This method is by far the most common means of data gathering.
2. Focus Group
Focus groups, like interviews, are a commonly used technique. The group consists of anywhere from a half dozen
to a dozen people, led by a moderator, brought together to discuss the issue.
3. Questionnaire
Questionnaires are a simple, straightforward data collection method. Respondents get a series of questions,
either open-ended or closed-ended, related to the matter at hand.
Secondary Data Collection
Unlike primary data collection, there are no specific collection methods. Instead, since the information has
already been collected, the researcher consults various data sources such as:
• Journal articles
• Websites
• Full-blown research papers
• Government publications
• Books
• Internal records
It is second-hand data: tried and tested data that has previously been analyzed and filtered and is
already available.
TWO CATEGORIES FOR PRIMARY DATA COLLECTION METHODS
1. Qualitative Methods
➢ It is the process of collecting and analyzing non-numerical data (e.g., text, video, or audio) to understand
concepts, opinions, or experiences. It can be used to gather in-depth insights into a problem or generate new ideas
for research.
2. Quantitative Methods
➢ Refers to the collection of numerical data that can be analyzed using statistical methods.
➢ This type of data collection is often used in surveys, experiments, and other research
methods.
➢ It measures variables and establishes relationships between variables.
➢ The data collected through quantitative methods are typically in the form of numbers, such as response
frequencies, means, and standard deviations, and can be analyzed using statistical software.
➢ It answers the “what and who” questions rather than the “why”.
METHODS FOR QUANTITATIVE DATA COLLECTION
1. Surveys/Questionnaires
➢ Defined as objective or closed-ended questions used to gain detailed insights from respondents about a
certain research topic. These questions form the core of a survey and are used to gather numerical data to
determine statistical results.
a. Online Survey
Online surveys are quick and easy to send out, either via email or messenger. They can also appear in pop-ups on
websites or via a link embedded in social media. From the participants' point of view, online surveys are
convenient to complete and submit, using whichever device they prefer (mobile phone, tablet, or computer).
Anonymity is also a plus point: online survey software can keep respondents' identities confidential.
SOME SOFTWARE/APPLICATIONS USED FOR ONLINE SURVEY
Google Forms, SurveyMonkey, QuestionPro, Jotform, Amazon MTurk, SurveyGizmo, Qualtrics, Google
Consumer Surveys
b. Offline Survey
While online surveys are by far the most common way to collect quantitative data today, there
are still some harder-to-reach respondents for whom other mediums can be beneficial; for example, those who
aren’t tech-savvy or who don’t have a stable internet connection. The anonymity of the respondents is still
possible.
Some Methods of offline survey
In-person Paper-pencil survey - It is a process of personal interviewing in which the pollster holds a
printed-out questionnaire, reads each question to the respondent, and fills the answers into the questionnaire.
Mail survey - Mail surveys are often described as straightforward and comprehensible, with few open-ended
questions. Compared with telephone surveys, the cost of conducting a mail survey is typically lower. To make the
most of it, a mail survey should be used when the researcher wishes to know whether there are changes to the
product or service that the consumer would want, the consumers’ opinion regarding the company’s eventual plans,
or time-sensitive matters.
2. Interviews
Interviews are another popular way of polling a population. They can be thought of as a survey in
verbal, in-person or virtual face-to-face format. The online format of interviews is becoming more popular
nowadays, as it is cheaper and logistically easier to organize than face-to-face interviews, yet still allows the
interviewer to see the respondent.
An interviewer runs through a survey with the respondents, asking mainly closed-ended questions (yes or no,
multiple choice questions, or questions with a rating scale) that ask how strongly the respondent agrees with the
statements.
The advantage of structured interviews is that the interviewer can pace the survey, making sure the respondents
give enough consideration to each question.
Methods/Techniques of the Interview Data Collection Method
Face-to-face interviews have the distinct advantage of enabling the researcher to establish rapport with potential
participants and therefore gain their cooperation. These interviews yield the highest response rates in survey
research. They also allow the researcher to clarify ambiguous answers and, when appropriate, seek follow-up
information. Disadvantages are that they are impractical when large samples are involved and that they are
time-consuming and expensive. (Leedy and Ormrod, 2001)
Telephone interviews are less time-consuming and less expensive, and the researcher has ready access to anyone
on the planet who has a telephone. A disadvantage is that the response rate is not as high as that of the
face-to-face interview, although it is considerably higher than that of the mailed questionnaire. The sample may
be biased to the extent that people without phones are part of the population about whom the researcher wants to
draw inferences.
Computer Assisted Personal Interviewing (CAPI): is a form of personal interviewing, but instead of completing
a questionnaire, the interviewer brings along a laptop or hand-held computer to enter the information directly
into the database. This method saves time involved in processing the data, as well as saving the interviewer
from carrying around hundreds of questionnaires. However, this type of data collection method can be
expensive to set up and requires that interviewers have computer and typing skills.
3. Observation
➢ Is a technique that focuses on recording the number or types of people who do a certain action, such as
choosing a specific product from a grocery shelf, speaking to a company representative at an event, or
passing through a certain area within a given time frame.
➢ Observation studies in quantitative research are similar in nature to a qualitative ethnographic study (in
which a researcher also observes consumers in their natural habitats), yet observation studies for quantitative
research remain focused on the numbers: how many people do an action, how many of a product consumers
pick up, etc.

4. Review Existing Documents


This means reviewing existing research to see how it can contribute to understanding a new issue in question.
There are numerous documents that can be analyzed to support primary data, or used as an end in themselves.
Secondary data collection can include reviewing public records, government research, company databases,
existing reports, paid-for research publications, magazines, journals, case studies, websites, and more.
Aside from being used as secondary research alone, document review can also be used in anticipation of primary
research, to understand which knowledge gaps need to be filled and to nail down the issues that might be
important to explore in a primary research study.
Why Is Data Collection Necessary / Important?
Data collection plays a significant role in any research study. The research work carried out can vary from field
to field, but all research work is based on data that is analyzed and interpreted to get information. Data is one
of the vital resources for all research, since conclusions are drawn from the results of statistical analysis. The
quality of the data collection methods improves the accuracy and validity of the study outcomes or findings.