WEEK 2 Data Collection and Organization 335401

This document provides a comprehensive overview of data collection and organization in the context of biostatistics, emphasizing the importance of accurate data for scientific research. It covers various types of data, methods of data collection, sampling techniques, and strategies for organizing and visualizing data effectively. Additionally, it discusses the advantages and disadvantages of different sampling methods and highlights best practices for ensuring data quality.

DATA COLLECTION AND ORGANIZATION
COURSE TITLE: BIOSTATISTICS
COURSE CODE: BIOSTAT

COURSE CREDIT: 3 UNITS OF LECTURE, 3 HOURS OF LECTURE


PREPARED BY: ARLIE JOY CORDERO, RMT
INTRODUCTION

• ________ is the foundation of scientific research, enabling
researchers to derive meaningful insights, answer critical
questions, and validate hypotheses.
• Proper collection and organization of data ensure accuracy,
reliability, and validity in analysis and interpretation.
• This presentation covers types of data, methods of data
collection, sampling techniques, and strategies for organizing
data effectively.
WHAT IS DATA?

• Data refers to raw facts, measurements, observations, or
descriptions collected for analysis and interpretation.
• Data serves as the building block for information and
knowledge generation, providing a basis for evidence-based
decision-making in various fields such as science, healthcare,
and business.
TYPES OF DATA

1. Qualitative Data:
• Descriptive and categorical data that cannot be measured numerically.
• Examples include gender, colors, or types of diseases.

2. Quantitative Data:
• Numerical data that represents measurable quantities.
• Examples include height, weight, and temperature.

Subtypes of Quantitative Data:
• Discrete Data: Whole numbers, such as the number of patients in a ward.
• Continuous Data: Can take any value within a range, such as blood pressure readings.
IMPORTANCE OF DATA COLLECTION

• Ensures that the information gathered is accurate, complete,
and relevant to the research objectives.
• Minimizes errors and biases in the analysis phase.
• Facilitates the generation of reliable evidence for decision-
making and policy formulation.
DATA COLLECTION METHODS

1. Primary Data Collection:
• Definition: Gathering new and original data directly from sources.
Methods:
• Surveys and Questionnaires: Structured tools designed to collect standardized responses
from participants. Effective for gathering opinions, preferences, or demographic information.
• Interviews: Conversations conducted either face-to-face or virtually to explore detailed
responses and obtain qualitative insights.
• Observations: Monitoring and recording behaviors, events, or conditions as they naturally
occur without interference.
• Experiments: Controlled studies designed to manipulate variables and test specific
hypotheses under predefined conditions.
DATA COLLECTION METHODS

2. Secondary Data Collection:
• Definition: Using pre-existing data collected by others for a different
purpose.
Sources:
• Government publications, research articles, organizational reports, and historical
archives.
Advantages: Cost-effective and time-saving, as the data is already available.
Disadvantages: May lack relevance or accuracy for the specific research
objective.
ADVANTAGES AND DISADVANTAGES OF DATA COLLECTION
METHODS
PRIMARY DATA COLLECTION:
• Advantages:
• Specific to the research question.
• Highly reliable and customizable to meet study needs.
• Disadvantages:
• Time-consuming and resource-intensive.
• Can be expensive for large-scale studies.
SECONDARY DATA COLLECTION:
• Advantages:
• Cost-effective and readily available.
• Suitable for longitudinal studies where historical data is needed.
• Disadvantages:
• May not perfectly align with the research objectives.
• Potential biases or errors in the original collection process.
SAMPLING TECHNIQUES

• Sampling is the process of selecting a subset of individuals or units
from a larger population to represent the whole.
• A well-designed sample allows researchers to generalize findings to
the entire population without having to study every individual.
IMPORTANCE OF SAMPLING:
What is Sampling?
Sampling is a process of selecting a subset of individuals or
items from a larger population for study. By studying this smaller
sample, researchers can make inferences or generalizations about the
entire population.
Importance of Sampling:
• Time Efficiency: It is impractical and often unnecessary to study the entire
population.
• Cost-Effective: Sampling reduces the cost associated with data collection.
• Practicality: Allows for feasible data collection in large populations or when
dealing with limited resources.
Two Main Types of Sampling:
• Probability Sampling
• Non-Probability Sampling
PROBABILITY SAMPLING OVERVIEW

Probability sampling refers to a sampling technique in which each member of the population has a
known, non-zero chance of being selected for the sample.
Main Features:
•______________: The selection process is random, ensuring that every individual has an equal or known
chance of being selected.
•______________: Probability sampling aims to eliminate any researcher bias, ensuring a more objective
selection.
•______________: It allows for statistical methods to estimate population parameters and calculate the
margin of error.
Benefits of Probability Sampling:
•Representativeness: This method tends to produce samples that reflect the population’s structure.
•Accuracy: Reliable and generalizable results, useful for large-scale surveys, opinion polls, and more.
TYPES OF PROBABILITY SAMPLING

Simple Random Sampling:


• Process: Each member of the population is assigned a unique number. Using
a random number generator or a lottery system, individuals are selected at
random.
• Strengths: Provides unbiased selection.
• Weaknesses: May be impractical for large populations.
• Example: Drawing 100 names from a list of 1000 employees for a survey.
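The process above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the employee ID list and the function name simple_random_sample are assumptions made for the example, and the standard library's random.sample plays the role of the lottery.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical sampling frame: employee IDs 1 through 1000.
employees = list(range(1, 1001))
sample = simple_random_sample(employees, 100, seed=42)
print(len(sample), "employees selected, for example:", sorted(sample)[:5])
```

Recording the seed alongside the results lets another researcher repeat exactly the same draw.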
TYPES OF PROBABILITY SAMPLING

Stratified Random Sampling:


• Process: The population is divided into distinct, non-overlapping groups (strata) based on
specific characteristics, such as age, income, or education. Random samples are then
taken from each group.
• Strengths: Ensures that subgroups are well represented in the sample.
• Weaknesses: Requires knowledge of the strata beforehand and can be more complex to
implement.
• Example: Surveying students in a university by dividing them into strata based on
their majors.
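As a rough sketch of the university example, the code below groups units by stratum and then draws a random sample inside each group. The student list, the 10% sampling fraction, and the function name stratified_sample are illustrative assumptions; drawing the same fraction from every stratum (proportional allocation) is only one of several possible allocation choices.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # non-overlapping groups
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical students as (student_id, major) pairs.
majors = ["Biology"] * 60 + ["Nursing"] * 30 + ["Pharmacy"] * 10
students = [(i, major) for i, major in enumerate(majors)]
chosen = stratified_sample(students, lambda s: s[1], fraction=0.1, seed=1)
print(len(chosen), "students selected")
```

Because each major contributes in proportion to its size, small strata such as the hypothetical Pharmacy group are still guaranteed a place in the sample.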
TYPES OF PROBABILITY SAMPLING

Systematic Sampling:
• Process: The population is ordered, and every nth individual is
selected starting from a random starting point.
• Strengths: Simpler than simple random sampling, especially with large
populations.
• Weaknesses: Can introduce bias if there is a hidden pattern in the
population.
• Example: Selecting every 10th person on a list of customers.
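A minimal sketch of the same idea in Python, assuming a hypothetical ordered customer list of 200 names and an interval of 10. The function name systematic_sample is invented for the example.

```python
import random


def systematic_sample(ordered_units, k, seed=None):
    """Select every k-th unit after a random start among the first k positions."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random index from 0 to k - 1
    return ordered_units[start::k]


# Hypothetical ordered list of 200 customers.
customers = [f"customer_{i:03d}" for i in range(1, 201)]
picked = systematic_sample(customers, k=10, seed=7)
print(len(picked), "customers selected, starting with", picked[0])
```

If the list order follows a cycle that matches the interval (for example, every 10th record is a weekend visit), the sample inherits that pattern, which is the hidden-pattern weakness noted above.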
TYPES OF PROBABILITY SAMPLING

Cluster Sampling:
• Process: The population is divided into clusters, usually geographically or
naturally occurring groups. Entire clusters are randomly selected, and all
individuals within the chosen clusters are surveyed.
• Strengths: Cost-effective and practical when populations are spread over
large geographical areas.
• Weaknesses: Clusters may not be representative of the population, and
homogeneity within clusters can reduce sample diversity.
• Example: Selecting several schools in a city and surveying all students in
those schools.
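The school example can be sketched as one-stage cluster sampling: whole clusters are drawn at random and every member of a chosen cluster is included. The school names, student labels, and the function name cluster_sample are assumptions for illustration only.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and include every member of each one."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen_names for m in clusters[name]]
    return chosen_names, members


# Hypothetical schools (clusters) and their students.
schools = {
    "School A": ["A1", "A2", "A3"],
    "School B": ["B1", "B2"],
    "School C": ["C1", "C2", "C3", "C4"],
    "School D": ["D1", "D2"],
}
chosen_schools, surveyed = cluster_sample(schools, n_clusters=2, seed=3)
print(chosen_schools, "->", len(surveyed), "students surveyed")
```

Note that randomness enters only at the cluster level, so the final sample size depends on which clusters happen to be drawn.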
ADVANTAGES OF PROBABILITY SAMPLING

• Representation: Random selection gives every member of the population a known
chance of being included, which tends to produce a sample that reflects the
population.
• Bias Reduction: Reduces selection bias because chance, not the researcher,
determines who is chosen.
• Statistical Analysis: Probability sampling enables the application of statistical techniques
such as confidence intervals, significance tests, and margin of error calculations, which
quantify the uncertainty in the results.
• Transparency and Reproducibility: The selection procedure can be documented and
replicated by others, enhancing research reliability and credibility.
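To make the margin of error mentioned above concrete, here is a small sketch for a sample proportion from a simple random sample. It uses the common normal approximation with z = 1.96 for roughly 95% confidence; the figures (30% prevalence observed in 400 respondents) are invented for illustration.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def confidence_interval(p_hat, n, z=1.96):
    """Interval p_hat +/- margin of error, clipped to the range 0..1."""
    moe = margin_of_error(p_hat, n, z)
    return max(0.0, p_hat - moe), min(1.0, p_hat + moe)


# Hypothetical survey: 30% of 400 randomly sampled respondents report a condition.
moe = margin_of_error(0.30, 400)
low, high = confidence_interval(0.30, 400)
print(f"margin of error = {moe:.3f}, interval = ({low:.3f}, {high:.3f})")
```

With these assumed numbers the margin of error is about 0.045, so the estimate would be reported as roughly 30% plus or minus 4.5 percentage points.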
NON-PROBABILITY SAMPLING OVERVIEW

• Definition: Non-probability sampling does not involve random selection, and the probability of any given individual
being selected is not known.
• Key Features:
• Subjective Selection: The selection process is based on the researcher’s judgment or convenience rather than
randomness.
• Lower Cost and Time: Non-probability sampling is quicker and more affordable to conduct, especially for
smaller or exploratory studies.
• Limitations: Results from non-probability sampling cannot be generalized to the entire population with high
confidence.
• Benefits:
• Cost-Effective: Cheaper and quicker than probability sampling methods.
• Flexibility: Useful for exploratory, qualitative, or pilot research, where broad generalization is not the priority.
TYPES OF NON-PROBABILITY SAMPLING

Convenience Sampling:
• Process: The researcher selects participants who are easiest to
reach or most accessible.
• Strengths: Fast, inexpensive, and simple to use.
• Weaknesses: Results may not be representative, leading to
biased conclusions.
• Example: Surveying people in a shopping mall because they are
easily accessible.
TYPES OF NON-PROBABILITY SAMPLING

Judgmental or Purposive Sampling:


• Process: The researcher selects participants based on their
knowledge or expertise, or because they meet specific criteria.
• Strengths: Useful for selecting specialized or experienced
individuals.
• Weaknesses: Can lead to researcher bias and poor
generalization.
• Example: Interviewing experts in a field for insights into a new
technology.
TYPES OF NON-PROBABILITY SAMPLING

Quota Sampling:
• Process: The population is divided into subgroups, and the researcher
selects participants non-randomly to meet a specific quota for each
subgroup.
• Strengths: Ensures that important subgroups are represented in the
sample.
• Weaknesses: The non-random selection can introduce bias.
• Example: Surveying a fixed number of individuals from different
income levels without random selection.
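A rough sketch of quota filling, assuming people are taken in the order they happen to arrive (a non-random choice) until each income group's quota is met. The arrival list, group labels, and quota sizes are hypothetical.

```python
def quota_sample(arrivals, group_of, quotas):
    """Accept people in arrival order until each group's quota is full."""
    counts = {group: 0 for group in quotas}
    accepted = []
    for person in arrivals:
        group = group_of(person)
        if group in counts and counts[group] < quotas[group]:
            accepted.append(person)
            counts[group] += 1
        if counts == quotas:  # every quota is filled, stop recruiting
            break
    return accepted


# Hypothetical respondents as (person_id, income_level), in arrival order.
arrivals = [("p1", "low"), ("p2", "low"), ("p3", "high"), ("p4", "middle"),
            ("p5", "low"), ("p6", "middle"), ("p7", "high"), ("p8", "high")]
accepted = quota_sample(arrivals, lambda p: p[1],
                        {"low": 2, "middle": 2, "high": 2})
print([p[0] for p in accepted])
```

The quotas guarantee subgroup representation, but who fills each quota depends entirely on arrival order, which is where the bias noted above can enter.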
TYPES OF NON-PROBABILITY SAMPLING

Snowball Sampling:
• Process: Initial participants are selected, and they then refer other
participants. This process continues, creating a "snowball" effect.
• Strengths: Particularly useful for hard-to-reach populations, like drug users
or homeless people.
• Weaknesses: May not represent the wider population, and can suffer from a
"snowball" effect where only similar individuals are selected.
• Example: Studying individuals in underground social groups or hidden
populations.
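The referral process can be pictured as following a chain of contacts outward from the initial participants. The sketch below assumes a hypothetical referral map (who names whom) and a cap on the number of referral waves; neither comes from the lecture.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_waves):
    """Follow referral chains outward from the seed participants, wave by wave."""
    recruited = list(seeds)
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        person, wave = queue.popleft()
        if wave == max_waves:
            continue  # do not ask this person for further referrals
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                recruited.append(contact)
                queue.append((contact, wave + 1))
    return recruited


# Hypothetical referral map: each participant names people they know.
referrals = {"ana": ["ben", "cara"], "ben": ["dan"],
             "cara": ["ben", "eli"], "dan": ["fay"]}
recruited = snowball_sample(referrals, seeds=["ana"], max_waves=2)
print(recruited)
```

Everyone recruited is connected to the first participant, which illustrates why snowball samples tend to contain similar individuals.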
ADVANTAGES OF NON-PROBABILITY SAMPLING

• Speed and Convenience: Faster and easier to implement compared to
probability sampling, especially when time and resources are limited.
• Cost-Effective: Particularly beneficial for pilot studies, qualitative
research, and smaller projects.
• Flexibility in Selection: Ideal for niche populations where a probability
sample may not be feasible.
LIMITATIONS OF NON-PROBABILITY SAMPLING

• Bias: Results are highly susceptible to researcher bias, and the sample
may not reflect the diversity of the larger population.
• Limited Generalizability: Since the sample is not random, findings cannot
be confidently generalized to the population.
• Limited Statistical Inference: Standard methods for hypothesis testing and
confidence intervals assume random selection, so they cannot be applied with
confidence to non-probability samples, making it harder to quantify uncertainty.
COMPARISON OF PROBABILITY AND NON-PROBABILITY
SAMPLING
Feature | Probability Sampling | Non-Probability Sampling
Selection Process | Random, with a known chance for all | Subjective, based on researcher's choice
Representation | More representative of the population | Less representative
Generalizability | Can generalize results to the population | Cannot generalize results confidently
Statistical Inference | Enables statistical analysis (e.g., margin of error) | Limited or no statistical inference
Cost and Time | Often more expensive and time-consuming | Quicker and cheaper
ORGANIZING DATA

Organizing data involves systematically arranging collected information to facilitate
analysis and interpretation.
Steps:
1. Data Cleaning: Removing errors, duplicates, or incomplete records to ensure
accuracy.
2. Data Categorization: Grouping data into meaningful categories or classes based on
common features.
3. Data Summarization: Presenting data in summarized formats, such as tables,
charts, or graphs, to highlight key insights.
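The three steps can be sketched on a tiny, made-up set of patient records. The field names, the rule that a record is incomplete when a required field is empty, and the rule that two records are duplicates when all required fields match are assumptions chosen for the example.

```python
from collections import Counter


def clean_records(records, required_fields):
    """Drop records missing a required field and drop repeated records."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(rec.get(field) in (None, "") for field in required_fields):
            continue  # incomplete record
        key = tuple(rec[field] for field in required_fields)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Hypothetical raw records.
records = [
    {"patient_id": 1, "age": 34, "diagnosis": "asthma"},
    {"patient_id": 1, "age": 34, "diagnosis": "asthma"},  # duplicate
    {"patient_id": 2, "age": None, "diagnosis": "COPD"},  # missing age
    {"patient_id": 3, "age": 51, "diagnosis": "COPD"},
]
cleaned = clean_records(records, ["patient_id", "age", "diagnosis"])  # step 1
summary = Counter(rec["diagnosis"] for rec in cleaned)  # steps 2 and 3
print(len(cleaned), "records kept;", dict(summary))
```

In practice, incomplete records are sometimes corrected or imputed rather than dropped; dropping is simply the easiest rule to show.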
FREQUENCY DISTRIBUTION TABLES

• Summarizes data by listing values or ranges alongside their
corresponding frequencies.
• Purpose: Provides a clear overview of how often each value or
category appears in the dataset.
• Steps to Create:
• List all unique values or intervals in the dataset.
• Count the occurrences of each value or interval.
• Record the frequencies in a tabular format.
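The steps above can be sketched with Python's collections.Counter, which does the listing and counting in one pass. The blood type values are invented sample data.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)  # unique values with their occurrence counts
    return sorted(counts.items())


# Hypothetical categorical data: blood types of ten patients.
blood_types = ["O", "A", "B", "O", "AB", "A", "O", "O", "B", "A"]
table = frequency_table(blood_types)
print("Value  Frequency")
for value, freq in table:
    print(f"{value:<7}{freq}")
```

The frequencies should always add up to the number of observations, which is a quick check that nothing was miscounted.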
GROUPED FREQUENCY DISTRIBUTION

• Groups large datasets into intervals or classes for easier analysis and
visualization.
• Purpose: Helps identify patterns or trends in continuous data.
• Steps:
• Determine the range of the data by subtracting the smallest value from the largest.
• Decide on the number of classes based on the dataset size.
• Calculate class width by dividing the range by the number of classes, and round up to a
convenient value.
• Create class intervals and tally the occurrences for each class.
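A minimal sketch of these steps for whole-number data. The ages are invented, and the width rule used here (range divided by the number of classes, rounded down, plus one) is an assumption chosen so that the largest value always lands in the last class; textbooks differ on the exact rounding convention.

```python
def grouped_frequency(values, n_classes):
    """Build (lower, upper, count) classes with inclusive whole-number limits."""
    lowest, highest = min(values), max(values)
    data_range = highest - lowest
    width = data_range // n_classes + 1  # wide enough to cover the maximum
    table = []
    for i in range(n_classes):
        lower = lowest + i * width
        upper = lower + width - 1
        count = sum(1 for v in values if lower <= v <= upper)
        table.append((lower, upper, count))
    return table


# Hypothetical ages of ten patients, grouped into four classes.
ages = [21, 25, 34, 38, 42, 45, 47, 52, 58, 60]
classes = grouped_frequency(ages, 4)
for lower, upper, count in classes:
    print(f"{lower}-{upper}: {count}")
```

For these assumed ages the range is 39 and the class width is 10, giving the classes 21-30, 31-40, 41-50, and 51-60.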
RECALL LESSON
DATA VISUALIZATION

• Data visualization is the process of presenting data in a graphical or
pictorial format to enhance understanding and communication.
Purpose:
• Makes complex data more accessible and interpretable.
• Reveals patterns, trends, and relationships that may not be apparent in
raw data.
BAR GRAPHS

• A graphical representation of
categorical data using rectangular bars.
• Key Features:
• Bars are proportional in length to the
frequencies or values they represent.
• Suitable for comparing different categories.
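The idea that bar length is proportional to frequency can be shown even without a plotting library. The sketch below prints a text bar chart for made-up diagnosis counts; a real report would normally use a charting tool instead.

```python
def text_bar_chart(frequencies, symbol="#"):
    """Return one line per category; bar length equals the frequency."""
    width = max(len(label) for label in frequencies)
    return [f"{label:<{width}} | {symbol * count} {count}"
            for label, count in frequencies.items()]


# Hypothetical category counts.
diagnoses = {"Asthma": 5, "COPD": 3, "Pneumonia": 2}
lines = text_bar_chart(diagnoses)
print("\n".join(lines))
```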
HISTOGRAMS

• Histograms are graphical representations of continuous data.
They group data points into intervals (bins) and display the
frequency of data within each interval using adjacent bars.
This visualization technique is particularly useful for
identifying the distribution and spread of a dataset.
• Key Features:
• Adjacent bars signify that the data is continuous.
• The height of each bar corresponds to the frequency of data
within that interval.
• Useful for detecting patterns such as skewness, modality, and
outliers in data.
PIE CHARTS

• A pie chart is a circular statistical graphic divided into slices to
illustrate the proportion of different categories within a dataset.
Each slice represents a category's share of the whole.
• Key Features:
• Provides a visual summary of relative proportions.
• Best suited for comparing parts of a whole when there are limited
categories.
• Not ideal for datasets with numerous categories or when precise
comparisons are required.

• Example: A pie chart showing the percentage distribution of
various types of respiratory diseases in a patient population.
LINE GRAPHS

• A line graph uses points connected by lines to
show trends or changes in data over time or
another continuous variable.
• Key Features:
• Displays relationships between two variables,
typically with time on the x-axis.
• Ideal for showing trends, progressions, or
fluctuations.
• Can display multiple data series for comparison.
• Example: A line graph illustrating the trend in
patient admissions over months in a respiratory
therapy department.
SCATTER PLOTS

• A scatter plot displays data points on a two-dimensional plane
to depict the relationship between two quantitative variables.
• Key Features:
• Each point represents a pair of values for the variables being
compared.
• Useful for identifying correlations, clusters, or patterns.
• Can indicate positive, negative, or no correlation between
variables.

• Example: A scatter plot showing the relationship between lung
capacity and age in patients.
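The direction and strength of the relationship seen in a scatter plot is often summarized with the Pearson correlation coefficient. The sketch below computes it from first principles; the age and lung capacity values are invented for illustration and are not real patient data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired observations: age (years) and lung capacity (liters).
ages = [25, 35, 45, 55, 65]
lung_capacity = [5.8, 5.4, 5.1, 4.6, 4.2]
r = pearson_r(ages, lung_capacity)
print(f"r = {r:.3f}")
```

A value near -1 indicates a strong negative correlation, near +1 a strong positive correlation, and near 0 little or no linear relationship.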
TABLES AS DATA PRESENTATION TOOLS

• Tables systematically organize data into rows and columns, providing a structured
format for summarizing and comparing values.
• Key Features:
• Enables clear and precise presentation of large datasets.
• Allows for the inclusion of detailed numerical or categorical information.
• Suitable for presenting raw data or summaries such as means, medians, and frequencies.

• Example: A table summarizing patient demographics and clinical characteristics.


ADVANTAGES OF DATA VISUALIZATION

• Enhances understanding of complex datasets by simplifying
information.
• Facilitates better communication of findings to stakeholders.
• Reveals hidden patterns, trends, or relationships that may not be
apparent in raw data.
• Supports informed decision-making by providing actionable insights.
STEPS IN DATA COLLECTION AND ORGANIZATION

1. Define Objectives: Clearly outline the goals of the study or analysis.
2. Select Data Collection Methods: Choose methods based on the type of data
required and resources available.
3. Collect Data: Execute the data collection plan while ensuring accuracy and
reliability.
4. Clean and Verify Data: Correct or remove errors and inconsistencies, and handle
missing values.
5. Organize Data: Arrange data into structured formats, such as tables or graphs, for
analysis and interpretation.
CHALLENGES IN DATA COLLECTION

• Incomplete Data: Missing or incomplete records can compromise the
validity of findings.
• Bias: Sampling or measurement biases can skew results.
• Resource Constraints: Limited time, budget, or personnel may affect
data quality.
• Ethical Concerns: Issues related to privacy, consent, or data misuse
must be addressed.
BEST PRACTICES IN DATA COLLECTION AND
ORGANIZATION

• Develop a clear and detailed data collection plan.
• Train personnel involved in data collection to ensure consistency and
accuracy.
• Use standardized tools and protocols.
• Regularly validate and audit data to detect errors or inconsistencies.
• Ensure ethical considerations are upheld throughout the process.
APPLICATION OF SAMPLING TECHNIQUES IN RESEARCH

• Probability Sampling: Used in large-scale epidemiological studies to
ensure representativeness.
• Non-Probability Sampling: Common in qualitative research where
exploratory insights are prioritized.
• Mixed-Methods Sampling: Combines both techniques for
comprehensive analysis.
REAL-WORLD EXAMPLES OF DATA PRESENTATION

• Healthcare: Visualizing patient outcomes across different
treatment groups using bar graphs and line charts.
• Education: Using tables to summarize student performance
metrics.
• Business: Employing pie charts to depict market share
distributions.
IMPORTANCE OF DATA PRESENTATION

• Effective data presentation makes complex information
more understandable and accessible.
• Allows quick identification of patterns, outliers, and trends.
ETHICAL CONSIDERATIONS IN DATA COLLECTION

• Ensuring informed consent from participants.
• Protecting confidentiality and privacy.
• Avoiding data manipulation or misrepresentation.
CHALLENGES IN DATA COLLECTION AND
ORGANIZATION

• Missing or incomplete data.
• Inconsistent or inaccurate recording.
• Resource constraints.
BEST PRACTICES IN DATA COLLECTION AND
ORGANIZATION

• Use standardized tools and protocols.
• Double-check and validate data entries.
• Maintain detailed documentation.
CASE STUDY: REAL-WORLD APPLICATION

• Example: A hospital study on the effectiveness of a new drug
for diabetes management.
• Steps:
• Data collection through patient surveys and medical records.
• Sampling based on stratified random sampling.
• Data organization into tables and graphs for analysis.
QUESTIONS AND DISCUSSION

Reflective Questions:
• Can you think of a research scenario where non-probability sampling would be
more effective than probability sampling?
• What are some challenges you might face when using probability sampling in
large-scale surveys?
• What are the most critical factors to consider when choosing a sampling
technique?
• How can data visualization improve communication with non-expert audiences?
FOR NEXT MEETING

• Assessment of this week’s topics.
