Definitions Stat Exam 1
4. List and describe the different types of probability sampling methods. Provide
one real-world example of when each would be appropriately used.
o Simple Random Sampling: Every member has an equal chance of being
selected. Example: Randomly selecting patients from a hospital database.
o Systematic Sampling: Every nth member is selected. Example: Every 10th
visitor to a clinic is chosen for a survey.
o Stratified Sampling: The population is divided into subgroups (strata), and
a random sample is taken from each subgroup. Example: Surveying different
age groups in a national health survey.
o Cluster Sampling: The population is divided into clusters, and some
clusters are selected randomly. Example: Randomly selecting schools from a
district to study student health behaviors.
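The four methods above can be sketched with Python's standard library. This is a minimal illustration, not a survey-grade implementation: the patient IDs, the age strata, the school rosters, and all function names are hypothetical and chosen only to mirror the examples in the list.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has the same chance of being selected."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Pick a random starting point, then take every `step`-th member."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample from each subgroup (stratum)."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(0)          # fixed seed so the sketch is reproducible
patients = list(range(1, 101))  # hypothetical patient IDs 1..100

srs = simple_random_sample(patients, 10, rng)
systematic = systematic_sample(patients, 10, rng)

# Hypothetical age strata built from the same patient list.
strata = {
    "under_40": patients[:40],
    "40_to_64": patients[40:80],
    "65_plus": patients[80:],
}
stratified = stratified_sample(strata, 5, rng)

# Hypothetical schools (clusters), each with its own list of student IDs.
schools = {"school_a": [1, 2, 3], "school_b": [4, 5], "school_c": [6, 7, 8, 9]}
clusters = cluster_sample(schools, 2, rng)
```

Note the difference between the last two functions: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from some subgroups.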
5. What are common applications of probability sampling in healthcare
research? Provide two examples.
o Example 1: Using stratified sampling to ensure a representative sample of
patients from various age groups for a health survey.
o Example 2: Using simple random sampling to select individuals from a
population for a clinical trial.
Types of Data
8. What are the three measures of central tendency? Define each and explain how
they are used in data analysis.
o Mean: The average of all values in a data set. Useful for data that is
normally distributed.
o Median: The middle value when the data is ordered. Used for skewed data
or when there are outliers.
o Mode: The value that appears most frequently. Used for categorical data.
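As a small illustration of the three measures above, the sketch below uses Python's `statistics` module. The lengths of stay and blood types are made-up values; the single large value is included on purpose to show how an outlier affects the mean but not the median.

```python
import statistics

# Hypothetical lengths of stay in days; the 30 is a deliberate outlier.
stays = [2, 3, 3, 4, 5, 6, 30]

mean_stay = statistics.mean(stays)      # pulled upward by the outlier
median_stay = statistics.median(stays)  # middle value of the sorted data
mode_stay = statistics.mode(stays)      # most frequent value

# The mode also works for categorical data, where a mean is undefined.
blood_types = ["O", "A", "O", "B", "A", "O"]
common_type = statistics.mode(blood_types)

print(mean_stay, median_stay, mode_stay, common_type)
```

Here the mean is about 7.6 days while the median is 4 days, which is why the median is usually preferred for skewed data.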
9. Explain what is meant by unimodal, bimodal, and multimodal distributions.
o Unimodal: A distribution with one peak. Example: Heights of a population.
o Bimodal: A distribution with two peaks. Example: Ages of everyone in a
retirement home, with one peak for younger staff and one for older residents.
o Multimodal: A distribution with multiple peaks. Example: Survey results
where there are different clusters of opinions.
10. How is the range calculated? What does it tell you about a data set?
o Range: The difference between the maximum and minimum values in a data
set. It gives an idea of the spread or variability of the data but is sensitive to
outliers.
11. Define standard deviation. How do you interpret the standard deviation of a
data set?
o Standard deviation is a measure of the amount of variation or dispersion in
a data set. A small standard deviation means the data points are close to the
mean, while a large standard deviation means the data points are spread out
over a wider range.
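The range and standard deviation can be computed directly, as in the sketch below. It assumes a small illustrative data set (5, 7, 9, 11, 13) and shows both the sample standard deviation (dividing by n - 1) and the population standard deviation (dividing by n), since the two are easy to confuse.

```python
import statistics

values = [5, 7, 9, 11, 13]

value_range = max(values) - min(values)    # maximum minus minimum
sample_sd = statistics.stdev(values)       # divides by n - 1
population_sd = statistics.pstdev(values)  # divides by n

print(value_range, round(sample_sd, 2), round(population_sd, 2))
```

For this data set the range is 8, the sample standard deviation is about 3.16, and the population standard deviation is about 2.83.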
12. Explain the concept of dispersion in statistics. How does it relate to variability
in data?
o Dispersion refers to the spread of data points around the central tendency
(mean or median). Measures of dispersion, like range and standard
deviation, provide insights into how much variability exists in a data set.
13. What is hypothesis testing? Describe the steps involved in testing a hypothesis.
o Hypothesis testing is a statistical method used to determine whether there is
enough evidence to reject a null hypothesis.
o Steps:
1. State the null and alternative hypotheses.
2. Choose the significance level (alpha).
3. Collect and analyze the data.
4. Compute the test statistic and p-value.
5. Make a decision to reject or fail to reject the null hypothesis.
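The five steps above can be sketched as a one-sample, two-sided z-test. This is only one possible test, chosen because it needs nothing beyond the standard library. The blood pressure readings, the null value of 120 mmHg, and the assumed known population standard deviation of 10 are all made up for illustration, and `one_sample_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    n = len(sample)
    # Step 4: test statistic and p-value.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: decision at the chosen significance level.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision

# Step 1: H0: mean == 120, H1: mean != 120.
# Step 2: alpha = 0.05 (the default above).
# Step 3: hypothetical systolic readings in mmHg.
readings = [128, 131, 125, 122, 135, 129, 127, 133, 126, 130]

z, p_value, decision = one_sample_z_test(readings, mu0=120, sigma=10)
print(round(z, 2), round(p_value, 4), decision)
```

With real data the population standard deviation is rarely known, so a t-test is the more common choice; the structure of the five steps stays the same.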
14. Explain the difference between rejecting and accepting the null hypothesis.
o Rejecting the null hypothesis means there is enough evidence to support
the alternative hypothesis.
o Failing to reject the null hypothesis (often loosely called "accepting" it)
means there is insufficient evidence to reject it, not that it has been shown
to be true.
15. What does a p-value indicate in hypothesis testing?
o A p-value indicates the probability of obtaining results at least as extreme as
the ones observed, assuming the null hypothesis is true. A p-value below a
pre-determined threshold (e.g., 0.05) suggests that the results are statistically
significant.
16. What is the significance of a confidence interval (CI) in statistical analysis?
How do you interpret a 95% confidence interval?
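o A confidence interval provides a range of plausible values for the true
population parameter. A 95% confidence interval means that if the study were
repeated many times, about 95% of the intervals built this way would contain
the true value. It does not mean there is a 95% probability that the true
value lies within one particular interval.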
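A 95% confidence interval for a mean can be sketched as below. This version assumes a normal approximation (critical value of about 1.96); for small samples a t critical value would give a somewhat wider and more accurate interval. The readings are hypothetical, and `mean_confidence_interval` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * stdev(sample) / sqrt(len(sample))  # z times the standard error
    center = mean(sample)
    return center - margin, center + margin

# Hypothetical systolic blood pressure readings in mmHg.
readings = [128, 131, 125, 122, 135, 129, 127, 133, 126, 130]

low, high = mean_confidence_interval(readings)
print(round(low, 1), round(high, 1))
```

Raising the confidence level or shrinking the sample widens the interval; a larger sample narrows it.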
Power of a Study
18. What is the power of a study? How does it relate to the probability of correctly
rejecting a false null hypothesis?
o The power of a study is the probability that the study will correctly reject a
false null hypothesis. A higher power reduces the chance of making a Type
II error (failing to reject a false null hypothesis).
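Power is 1 minus the probability of a Type II error. As a rough sketch, the function below computes the power of a one-sided z-test, assuming a known population standard deviation; the effect size, standard deviation, and sample sizes are hypothetical, and `z_test_power` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test to detect a true mean shift of `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection cutoff under H0
    shift = effect * sqrt(n) / sigma        # true shift in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)

power_small = z_test_power(effect=5, sigma=10, n=10)
power_large = z_test_power(effect=5, sigma=10, n=50)
type_ii_error = 1 - power_large  # chance of missing the effect with n = 50

print(round(power_small, 2), round(power_large, 2), round(type_ii_error, 2))
```

With these made-up inputs, increasing the sample size from 10 to 50 raises the power from roughly 0.47 to roughly 0.97, which is why sample size planning is a central part of study design.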
Frequency / Percentage
23. How do you calculate frequency and percentage from a data set? Provide an
example.
o Frequency is the count of how often each value occurs in a data set.
o Percentage is calculated by dividing the frequency of a value by the total
number of observations and multiplying by 100.
o Example: If 30 out of 100 patients report a certain symptom, the frequency
is 30, and the percentage is (30/100) * 100 = 30%.
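The calculation above can be reproduced with `collections.Counter`. The list of reports is made-up data constructed to match the 30-out-of-100 example.

```python
from collections import Counter

# Hypothetical symptom reports for 100 patients (30 with the symptom).
reports = ["symptom"] * 30 + ["no symptom"] * 70

frequencies = Counter(reports)   # count of each distinct value
total = sum(frequencies.values())
percentages = {value: count * 100 / total
               for value, count in frequencies.items()}

print(frequencies["symptom"], percentages["symptom"])
```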
24. Explain why frequency distributions and percentages are useful in healthcare
research.
o They help summarize and present large amounts of data in a meaningful
way. Percentages allow for easy comparisons between different groups or
time points.
25. Describe the key differences between a line graph and a bar graph. When
should each be used?
o Line graph: Used to display trends over time or continuous data.
o Bar graph: Used to compare discrete categories or groups.
26. What types of data are best represented using a histogram, and why?
o Histogram: Best for continuous data where you want to display the
frequency distribution. It helps visualize the shape of the data distribution.
27. How would you interpret a pie chart showing the distribution of various health
conditions in a population?
o A pie chart visually displays the proportion of each health condition in the
population, making it easy to compare relative sizes. Larger segments
represent more common conditions, while smaller segments represent less
common conditions.
1. Definitions / Concepts Related to Probability and Nonprobability
Sampling
Advantages | Disadvantages
Reduces sampling bias. | Time-consuming and expensive.
Provides accurate and generalizable results. | Requires access to the entire population list (sampling frame).
Statistical analysis can be used to calculate error probabilities. | May be difficult in large-scale studies.
3. Nonprobability Sampling
Sampling Method | Description | Example
Simple Random Sampling | Every individual has an equal chance of being selected. | Randomly selecting 100 patients from a hospital registry.
Systematic Sampling | Every nth individual is selected from a list. | Selecting every 10th visitor from a hospital for a survey.
Stratified Sampling | The population is divided into subgroups, and random samples are taken from each subgroup. | Surveying patients of different age groups in a national health survey.
Cluster Sampling | The population is divided into clusters, and entire clusters are selected randomly. | Selecting a few schools randomly in a district for a study on student health behaviors.
Example | Description
Stratified Sampling | Ensuring a representative sample of patients across different age groups for a survey.
Simple Random Sampling | Selecting patients from a clinic for a clinical trial to study the effectiveness of a new medication.
6. Types of Data
Type of Data | Description | Example
Nominal | Categories with no specific order. | Gender (Male, Female).
Ordinal | Categories with an order but no equal intervals between them. | Pain scale (None, Mild, Moderate, Severe).
Interval | Data with equal intervals between values, but no true zero point. | Temperature in Celsius.
Ratio | Data with equal intervals and a true zero point. | Weight (in kilograms).
Description | Example
The range is calculated as the difference between the maximum and minimum values in a data set. | For the data set 5, 10, 15, 20, 25: Range = 25 - 5 = 20.
Description | Example
Standard deviation measures the spread of data points around the mean. A smaller standard deviation means data points are close to the mean. | In a data set (5, 7, 9, 11, 13), the sample standard deviation is about 3.16, indicating a moderate spread.
12. Dispersion
Description | Example
Dispersion refers to how spread out data points are in a data set. | A data set (5, 6, 7, 8, 9) has low dispersion, while (1, 10, 15, 20, 25) has high dispersion.
Step | Description
State Hypotheses | Formulate the null and alternative hypotheses.
Choose Significance Level | Set the significance level (alpha), usually 0.05.
Collect Data | Gather the data necessary for analysis.
Analyze Data | Perform statistical analysis to compute test statistics and p-values.
Make Decision | Reject or fail to reject the null hypothesis based on the p-value.
Action | Description
Reject Null Hypothesis | There is enough evidence to support the alternative hypothesis.
Fail to Reject Null Hypothesis | There is insufficient evidence to support the alternative hypothesis.
Description | Example
A p-value indicates the probability of observing data at least as extreme as the data obtained, assuming the null hypothesis is true. A p-value below 0.05 usually means the results are statistically significant. | A p-value of 0.03 means that, if the null hypothesis were true, results at least this extreme would occur about 3% of the time. At a 0.05 significance level this is statistically significant.
Description | Example
A confidence interval (CI) provides a range of values that is likely to contain the true population parameter. A 95% CI means we are 95% confident the true value lies within the interval. | A 95% CI of [120, 130] for blood pressure suggests the true population mean blood pressure is plausibly between 120 and 130.
Description | Example
A Likert scale measures attitudes or opinions on a range from strongly agree to strongly disagree. | A 5-point Likert scale assessing patient satisfaction (Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree).
Analysis | Description
Descriptive Statistics | Mean, median, mode to summarize Likert scale data.
Non-parametric Tests | Chi-square test, Mann-Whitney U test for Likert data analysis.
Graph Type | Description | Example
Line Graph | Displays trends over time or continuous data. | Tracking patient blood pressure readings over time.
Bar Graph | Compares discrete categories or groups. | Comparing the number of patients with different types of health conditions.
25. Histogram
Description | Example
A histogram displays the frequency distribution of continuous data. | A histogram showing the distribution of patients' ages in a hospital.
Description | Example
A pie chart shows the proportion of different categories in a whole. | A pie chart showing the distribution of different health conditions in a population (e.g., 40% cardiovascular disease, 30% diabetes).