WEEK 2 Data Collection and Organization 335401
COURSE TITLE: BIOSTATISTICS
COURSE CODE: BIOSTAT
1. _________________:
• Descriptive and categorical data that cannot be measured numerically.
• Examples include gender, colors, or types of diseases.
2. _________________:
• Numerical data that represents measurable quantities.
• Examples include height, weight, and temperature.
Primary data (collected first-hand for the study):
• Advantages:
• Specific to the research question.
• Highly reliable and customizable to meet study needs.
• Disadvantages:
• Time-consuming and resource-intensive.
• Can be expensive for large-scale studies.
Secondary data (previously collected by others):
• Advantages:
• Cost-effective and readily available.
• Suitable for longitudinal studies where historical data is needed.
• Disadvantages:
• May not perfectly align with the research objectives.
• Potential biases or errors in the original collection process.
SAMPLING TECHNIQUES
Probability sampling refers to a sampling technique in which each member of the population has a
known, non-zero chance of being selected for the sample.
Main Features:
•______________: The selection process is random, ensuring that every individual has an equal or known
chance of being selected.
•______________: Probability sampling aims to eliminate any researcher bias, ensuring a more objective
selection.
•______________: It allows for statistical methods to estimate population parameters and calculate the
margin of error.
Benefits of Probability Sampling:
•Representativeness: This method tends to produce samples that reflect the population’s structure.
•Accuracy: Reliable and generalizable results, useful for large-scale surveys, opinion polls, and more.
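As a minimal sketch of the definition above, the following Python snippet draws a simple random sample, one common form of probability sampling. The sampling frame of 500 patient IDs, the sample size, and the function name `simple_random_sample` are illustrative assumptions, not part of the lesson material.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; each member has the same chance n/N."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(population), n)


# Hypothetical sampling frame: patient IDs 1 to 500.
patient_ids = range(1, 501)
sample = simple_random_sample(patient_ids, n=20, seed=42)

# The inclusion probability is known and non-zero for every patient.
inclusion_probability = 20 / 500
print(sorted(sample))
print(inclusion_probability)
```

Because the selection is left to a random number generator rather than to the researcher, the chance of inclusion can be stated in advance for every member of the frame.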
TYPES OF PROBABILITY SAMPLING
Systematic Sampling:
• Process: The population is ordered, and every nth individual is
selected starting from a random starting point.
• Strengths: Simpler than simple random sampling, especially with large
populations.
• Weaknesses: Can introduce bias if there is a hidden pattern in the
population
• Example: Selecting every 10th person on a list of customers.
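The process above can be sketched in Python as follows. The customer list, the interval of 10, and the function name `systematic_sample` are assumptions made for illustration.

```python
import random


def systematic_sample(ordered_population, step, seed=None):
    """Pick a random start within the first `step` items, then every step-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random starting index in [0, step)
    return ordered_population[start::step]


# Hypothetical ordered list of 100 customers.
customers = [f"customer_{i:03d}" for i in range(1, 101)]
sample = systematic_sample(customers, step=10, seed=1)
print(sample)
```

If the list happened to repeat a pattern every 10 entries (for example, every 10th customer being a wholesale account), this sample would consist entirely of one kind of customer, which is the hidden-pattern weakness noted above.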
Cluster Sampling:
• Process: The population is divided into clusters, usually geographically or
naturally occurring groups. Entire clusters are randomly selected, and all
individuals within the chosen clusters are surveyed.
• Strengths: Cost-effective and practical when populations are spread over
large geographical areas.
• Weaknesses: Clusters may not be representative of the population, and
homogeneity within clusters can reduce sample diversity.
• Example: Selecting several schools in a city and surveying all students in
those schools.
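A minimal sketch of the school example, assuming a small made-up dictionary of schools and students. The names `cluster_sample` and `schools` are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then include every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members


# Hypothetical clusters: schools in a city and their students.
schools = {
    "School A": ["A1", "A2", "A3"],
    "School B": ["B1", "B2"],
    "School C": ["C1", "C2", "C3", "C4"],
    "School D": ["D1", "D2"],
}
chosen, students = cluster_sample(schools, n_clusters=2, seed=7)
print(chosen)
print(students)
```

Randomness enters only at the cluster level: once a school is chosen, all of its students are surveyed.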
ADVANTAGES OF PROBABILITY SAMPLING
• Representation: The random nature ensures that every member of the population has a
fair chance of being included, leading to a more accurate representation of the
population.
• Bias Reduction: Reduces selection bias by giving each individual a known, non-zero
chance of being chosen instead of leaving the choice to the researcher.
• Statistical Analysis: Probability sampling enables the application of statistical techniques
such as confidence intervals, significance tests, and margin of error calculations, which
quantify the uncertainty in the results.
• Transparency and Reproducibility: Probability sampling is transparent, as the process
can be replicated by others, enhancing research reliability and credibility.
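To illustrate the margin of error mentioned above, here is a sketch for a sample proportion. It assumes a simple random sample that is large enough for the normal approximation, and uses z = 1.96 for an approximate 95% confidence level. The survey counts and the function name are made up for the example.

```python
import math


def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation interval for a proportion: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)


# Hypothetical survey: 120 of 400 randomly sampled respondents report a symptom.
p_hat, margin, interval = proportion_confidence_interval(120, 400)
print(f"estimate = {p_hat:.3f}, margin of error = {margin:.3f}")
print(f"approximate 95% CI = ({interval[0]:.3f}, {interval[1]:.3f})")
```

The calculation is only justified because the selection probabilities are known. The same arithmetic applied to a convenience sample would not carry this interpretation.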
NON-PROBABILITY SAMPLING OVERVIEW
• Definition: Non-probability sampling does not involve random selection, and the probability of any given individual
being selected is not known.
• Key Features:
• Subjective Selection: The selection process is based on the researcher’s judgment or convenience rather than
randomness.
• Lower Cost and Time: Non-probability sampling is quicker and more affordable to conduct, especially for
smaller or exploratory studies.
• Limitations: Results from non-probability sampling cannot be generalized to the entire population with high
confidence.
• Benefits:
• Cost-Effective: Cheaper and quicker than probability sampling methods.
• Flexibility: Useful for exploratory, qualitative, or pilot research, where broad generalization is not the priority.
TYPES OF NON-PROBABILITY SAMPLING
Convenience Sampling:
• Process: The researcher selects participants who are easiest to
reach or most accessible.
• Strengths: Fast, inexpensive, and simple to use.
• Weaknesses: Results may not be representative, leading to
biased conclusions.
• Example: Surveying people in a shopping mall because they are
easily accessible.
Quota Sampling:
• Process: The population is divided into subgroups, and the researcher
selects participants non-randomly to meet a specific quota for each
subgroup.
• Strengths: Ensures that important subgroups are represented in the
sample.
• Weaknesses: The non-random selection can introduce bias.
• Example: Surveying a fixed number of individuals from different
income levels without random selection.
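The quota process can be sketched as below. Participants are accepted in the order they happen to arrive (a non-random order) until each subgroup's quota is filled. The arrival list, income labels, and quota sizes are invented for illustration.

```python
def quota_sample(arrivals, group_of, quotas):
    """Accept people in arrival order until every subgroup quota is met."""
    remaining = dict(quotas)
    selected = []
    for person in arrivals:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):  # all quotas filled
            break
    return selected


# Hypothetical arrivals: (participant ID, income level).
people = [
    ("P01", "low"), ("P02", "low"), ("P03", "middle"),
    ("P04", "low"), ("P05", "high"), ("P06", "middle"),
    ("P07", "high"), ("P08", "middle"), ("P09", "high"),
]
quotas = {"low": 2, "middle": 2, "high": 2}
selected = quota_sample(people, group_of=lambda p: p[1], quotas=quotas)
print(selected)
```

Every subgroup is represented, but who gets in depends on arrival order rather than chance, which is where bias can enter.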
Snowball Sampling:
• Process: Initial participants are selected, and they then refer other
participants. This process continues, creating a "snowball" effect.
• Strengths: Particularly useful for hard-to-reach populations, like drug users
or homeless people.
• Weaknesses: May not represent the wider population, because participants tend to
refer people similar to themselves.
• Example: Studying individuals in underground social groups or hidden
populations.
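A small sketch of the referral process, assuming a made-up referral network in which each participant names the people they would refer. The names `snowball_sample` and `referrals` are hypothetical.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Start from seed participants and follow referrals until max_size is reached."""
    selected = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(selected) < max_size:
        person = queue.popleft()
        selected.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # do not recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return selected


# Hypothetical referral network.
referrals = {
    "P1": ["P2", "P3"],
    "P2": ["P4"],
    "P3": ["P4", "P5"],
    "P6": ["P7"],
}
sample = snowball_sample(referrals, seeds=["P1"], max_size=10)
print(sample)
```

In this example P6 and P7 are never reached because nobody in the seed's network refers them, which illustrates why a snowball sample may not represent the wider population.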
DISADVANTAGES OF NON-PROBABILITY SAMPLING
• Bias: Results are highly susceptible to researcher bias, and the sample
may not reflect the diversity of the larger population.
• Limited Generalizability: Since the sample is not random, findings cannot
be confidently generalized to the population.
• Lack of Statistical Analysis: Non-probability sampling does not support
the use of statistical methods for hypothesis testing or confidence
intervals, making it harder to quantify uncertainty.
COMPARISON OF PROBABILITY AND NON-PROBABILITY SAMPLING
Feature                Probability Sampling                                  Non-Probability Sampling
Selection Process      Random, with equal chance for all                     Subjective, based on researcher's choice
Representation         More representative of the population                 Less representative
Generalizability       Can generalize results to the population              Cannot generalize results confidently
Statistical Inference  Enables statistical analysis (e.g., margin of error)  Limited or no statistical inference
Cost and Time          Often more expensive and time-consuming               Quicker and cheaper
ORGANIZING DATA
• Groups large datasets into intervals or classes for easier analysis and
visualization.
• Purpose: Helps identify patterns or trends in continuous data.
• Steps:
• Determine the range of the data by subtracting the smallest value from the largest.
• Decide on the number of classes based on the dataset size.
• Calculate class width by dividing the range by the number of classes.
• Create class intervals and tally the occurrences for each class.
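The steps above can be sketched in Python as follows. The ages, the choice of four classes, and the convention of rounding the class width up to a whole number are assumptions for this example, which is written with integer-valued data in mind.

```python
import math


def grouped_frequency(values, n_classes):
    """Build (lower, upper, count) rows; each class covers lower <= x < upper."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest                       # step 1: range
    width = max(1, math.ceil(data_range / n_classes))   # step 3: class width, rounded up
    counts = [0] * n_classes
    for v in values:                                    # step 4: tally
        # The largest value is placed in the last class if it lands on the upper edge.
        index = min(int((v - lowest) // width), n_classes - 1)
        counts[index] += 1
    return [(lowest + i * width, lowest + (i + 1) * width, c)
            for i, c in enumerate(counts)]


# Hypothetical ages of 12 patients, grouped into 4 classes (step 2).
ages = [23, 35, 41, 29, 52, 47, 38, 61, 33, 45, 58, 27]
table = grouped_frequency(ages, n_classes=4)
for lower, upper, count in table:
    print(f"{lower}-{upper - 1}: {count}")
```

Here the range is 61 - 23 = 38, so with four classes the width is 38 / 4 = 9.5, rounded up to 10.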
RECALL LESSON
DATA VISUALIZATION
• A graphical representation of
categorical data using rectangular bars.
• Key Features:
• Bars are proportional in length to the
frequencies or values they represent.
• Suitable for comparing different categories.
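As a rough illustration of bars whose length is proportional to frequency, the sketch below prints a text-only bar chart using the standard library. The blood-type data and the function name `text_bar_chart` are invented for the example. A plotting library would normally be used for a real figure.

```python
from collections import Counter


def text_bar_chart(categories, symbol="#"):
    """Return one line per category; bar length equals the category's frequency."""
    counts = Counter(categories)
    label_width = max(len(str(name)) for name in counts)
    lines = []
    for name, count in sorted(counts.items()):
        lines.append(f"{name:<{label_width}} | {symbol * count} ({count})")
    return lines


# Hypothetical categorical data: blood types of 10 donors.
blood_types = ["A", "O", "B", "O", "A", "O", "AB", "O", "A", "B"]
for line in text_bar_chart(blood_types):
    print(line)
```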
HISTOGRAMS
TABLES
• Tables systematically organize data into rows and columns, providing a structured
format for summarizing and comparing values.
• Key Features:
• Enables clear and precise presentation of large datasets.
• Allows for the inclusion of detailed numerical or categorical information.
• Suitable for presenting raw data or summaries such as means, medians, and frequencies.
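A short sketch of a summary table with one row per group and columns for the count, mean, and median. The clinic names and blood pressure readings are made-up values, and `summary_rows` is a hypothetical helper.

```python
import statistics


def summary_rows(groups):
    """Return (group name, n, mean, median) for each group of numeric values."""
    rows = []
    for name, values in groups.items():
        rows.append((name, len(values),
                     statistics.mean(values), statistics.median(values)))
    return rows


# Hypothetical systolic blood pressure readings (mmHg) from two clinics.
readings = {
    "Clinic 1": [120, 130, 140],
    "Clinic 2": [110, 150],
}
rows = summary_rows(readings)

print(f"{'Group':<10}{'n':>4}{'Mean':>8}{'Median':>8}")
for name, n, mean, median in rows:
    print(f"{name:<10}{n:>4}{mean:>8.1f}{median:>8.1f}")
```

Both clinics share the same mean and median here even though their readings are spread differently, which is the kind of detail a table of summaries can hide and a table of raw data can show.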
Reflective Questions:
• Can you think of a research scenario where non-probability sampling would be
more effective than probability sampling?
• What are some challenges you might face when using probability sampling in
large-scale surveys?
• What are the most critical factors to consider when choosing a sampling
technique?
• How can data visualization improve communication with non-expert audiences?
FOR NEXT MEETING