
Biostatistics3 2

The document covers fundamental concepts in biostatistics, focusing on measures of central tendency, including mode, median, and mean, as well as percentages, proportions, and ratios. It explains how to calculate these measures and when to use them, alongside examples for clarity. Additionally, it discusses quartiles, percentiles, and the interquartile range as measures of position and variability.
Biostatistics

Measures of Central Tendency

Percentages and Proportions
• Report relative size.
• Compare the number of cases in a specific category to the number of cases in all categories.
• Compare a part (specific category) to a whole (all categories).
• The part is the numerator (f).
• The whole is the denominator (N).
• Proportion = f / N; percentage = (f / N) * 100
Percentages and Proportions: Example
• What percentage of social science majors are male?
• of (whole) = all social science majors: 97 + 132 = 229
• is (part) = male social science majors: 97
• (97/229) * 100 = (.4236) * 100 = 42.36%
• 42.36% of social science majors are male
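
The part-over-whole calculation above can be sketched in Python. The function names `proportion` and `percentage` are illustrative choices, not from the slides; the counts 97 and 132 are the ones used in the example.

```python
def proportion(part, whole):
    """Return f / N: the part divided by the whole."""
    if whole == 0:
        raise ValueError("the whole (N) must be non-zero")
    return part / whole


def percentage(part, whole):
    """Return (f / N) * 100."""
    return proportion(part, whole) * 100


male_majors = 97              # the part (f)
all_majors = 97 + 132         # the whole (N), as summed on the slide

print(f"{percentage(male_majors, all_majors):.2f}%")
```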
Ratios

• Compare the relative sizes of categories.
• Compare parts to parts.
• Ratio = f1 / f2
• f1 = number of cases in the first category
• f2 = number of cases in the second category


Ratios - example
• In a class of 23 females and 19 males, the ratio of males to females is:
• 19/23 = 0.83
• For every female, there are 0.83 males.
• In the same class, the ratio of females to males is:
• 23/19 = 1.21
• For every male, there are 1.21 females.
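
A minimal sketch of the same part-to-part comparison, assuming a helper named `ratio` (a name chosen here for illustration). Swapping the arguments gives the inverse ratio, as in the two examples above.

```python
def ratio(f1, f2):
    """Return f1 / f2: cases in the first category per case in the second."""
    if f2 == 0:
        raise ValueError("the second category (f2) must be non-zero")
    return f1 / f2


females, males = 23, 19

print(f"males per female: {ratio(males, females):.2f}")
print(f"females per male: {ratio(females, males):.2f}")
```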
Rate
• Expresses the number of actual occurrences of an event (births, deaths, homicides) relative to the number of possible occurrences per some unit of time.
• Example: the birth rate is the number of births divided by the population size, multiplied by 1000, per year.
• If a town of 2300 had 17 births last year, the birth rate is:
• (17/2300) * 1000 = (.00739) * 1000 = 7.39
• The town had 7.39 births for every 1000 residents.
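
The birth-rate arithmetic can be sketched as a small function. The name `rate_per` and its `per` parameter are assumptions made for this illustration; the slide fixes the multiplier at 1000.

```python
def rate_per(occurrences, population, per=1000):
    """Return actual occurrences per `per` possible occurrences."""
    if population == 0:
        raise ValueError("population must be non-zero")
    return occurrences / population * per


births = 17
town_population = 2300

print(f"{rate_per(births, town_population):.2f} births per 1000 residents")
```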
Percentage Change
• Measures the relative increase or decrease in a variable over time.
• Percentage change = ((f2 – f1) / f1) * 100
• f1 is the first (or earlier) frequency.
• f2 is the second (or later) frequency.
Percentage Change: Example

• In 1990, a state had a murder rate of 7.3.
• By 2000, the rate had increased to 10.7.
• What was the relative change?
• ((10.7 – 7.3) / 7.3) * 100 = (3.4 / 7.3) * 100 = 46.58%
• The rate increased by 46.58%.
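
A short sketch of the percentage-change calculation, with the subtraction done before the division. The function name `percentage_change` is an illustrative choice; `f1` and `f2` follow the slide's notation for the earlier and later values.

```python
def percentage_change(f1, f2):
    """Return ((f2 - f1) / f1) * 100 for an earlier value f1 and a later value f2."""
    if f1 == 0:
        raise ValueError("the earlier value (f1) must be non-zero")
    return (f2 - f1) / f1 * 100


rate_1990 = 7.3
rate_2000 = 10.7

print(f"{percentage_change(rate_1990, rate_2000):.2f}%")
```

A negative result would indicate a relative decrease.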


Measures of Central Tendency
• When working with a large data set, it can be useful to represent the entire data set with a single value that describes the "middle" or "average" value of the entire set. In statistics, that single value is called a measure of central tendency.
• A measure of central tendency is a descriptive statistic that describes the average, or typical, value of a set of scores.
• There are three common measures of central tendency:
• the mode
• the median
• the mean
Mode
• The mode is the score that occurs most frequently in a set of data.

12, 15, 11, 11, 7, 13

The mode is 11.
Bimodal Distributions
• When a distribution has two “modes,” it is called bimodal.

Sometimes a set of data will have more than one mode.

For example, in the following set the numbers 5 and 7 both appear twice.
2, 9, 5, 7, 8, 6, 4, 7, 5
5 and 7 are both modes, and this set is said to be bimodal.
Multimodal Distributions
• If a distribution has more than 2 “modes,” it is called multimodal.

Sometimes there is no mode in a set of data.

3, 8, 7, 6, 12, 11, 2, 1
All the numbers in this set occur only once; therefore, there is no mode in this set.
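
The three cases above (one mode, two modes, no mode) can be sketched with a frequency count. The function name `modes` is illustrative, and returning an empty list when every score occurs only once is an assumption that follows the slides' convention.

```python
from collections import Counter


def modes(scores):
    """Return the most frequent score(s), or [] if no score repeats."""
    if not scores:
        return []
    counts = Counter(scores)
    highest = max(counts.values())
    if highest == 1:
        return []  # every score occurs once: no mode (the slides' convention)
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([12, 15, 11, 11, 7, 13]))          # one mode
print(modes([2, 9, 5, 7, 8, 6, 4, 7, 5]))      # bimodal
print(modes([3, 8, 7, 6, 12, 11, 2, 1]))       # no mode
```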
When To Use the Mode

• The mode is not a very useful measure of central tendency
• It is insensitive to large changes in the data set
• That is, two data sets that are very different from each other can have the same mode
• The mode is primarily used with nominally scaled data
• It is the only measure of central tendency that is appropriate for nominally scaled data
What is the mode?

$100, $275, $300, $325, $350, $375, $500

There is no mode!
The Median
• It is the score in the middle; half of the scores are larger than the median and half of the scores are smaller than the median
• Conceptually, it is easy to calculate the median
• Sort the data from highest to lowest
• Find the score in the middle
• middle = (N + 1) / 2
• If N, the number of scores, is even, the median is the average of the middle two scores
Median Example -1
• What is the median of the following scores:
10 8 14 15 7 3 3 8 12 10 9
• Sort the scores:
15 14 12 10 10 9 8 8 7 3 3
• Determine the middle score:
middle = (N + 1) / 2 = (11 + 1) / 2 = 6
• Middle score = median = 9

Median Example - 2
• What is the median of the following scores:
24 18 19 42 16 12
• Sort the scores:
42 24 19 18 16 12
• Determine the middle score:
middle = (N + 1) / 2 = (6 + 1) / 2 = 3.5
• Median = average of 3rd and 4th scores:
(19 + 18) / 2 = 18.5
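
The procedure used in both median examples can be sketched as follows. The function name `median_of` is an illustrative choice; the steps (sort from highest to lowest, locate position (N + 1) / 2, average the two middle scores when N is even) follow the slides.

```python
def median_of(scores):
    """Return the median using the (N + 1) / 2 position rule."""
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores, reverse=True)   # highest to lowest, as on the slides
    n = len(ordered)
    middle = (n + 1) / 2                     # 1-based position of the middle score
    if n % 2 == 1:
        return ordered[int(middle) - 1]      # odd N: a single middle score
    upper = ordered[n // 2 - 1]              # even N: the two scores around the middle
    lower = ordered[n // 2]
    return (upper + lower) / 2


print(median_of([10, 8, 14, 15, 7, 3, 3, 8, 12, 10, 9]))   # Example 1
print(median_of([24, 18, 19, 42, 16, 12]))                  # Example 2
```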

When To Use the Median
• The median is often used when the distribution of scores is
either positively or negatively skewed
• The few really large scores (positively skewed) or really
small scores (negatively skewed) will not overly influence
the median

Mean
• The mean is:
• the arithmetic average of all the scores
• (ΣX)/N
• The mean of a population is represented by the Greek letter μ; the mean of a sample is represented by X̄

Calculating the Mean
• Calculate the mean of the following data:
1 5 4 3 2
• Sum the scores (ΣX):
1 + 5 + 4 + 3 + 2 = 15
• Divide the sum (ΣX = 15) by the number of scores (N = 5):
15 / 5 = 3
• Mean = X̄ = 3
When To Use the Mean
• You should use the mean when
• the data are interval or ratio scaled (many people will use the mean with ordinally scaled data too)
• and the data are not skewed
• The mean is preferred because it is sensitive to every score
• If you change one score in the data set, the mean will change
Mean Example
An electronics store sells CD players at the following
prices: $350, $275, $500, $325, $100, $375, and $300.
What is the mean price?

$350 + $275 + $500 + $325 + $100 + $375 + $300 = $2225
$2225 / 7 = $317.86

The mean or average price of a CD player is $317.86.
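
The sum-then-divide calculation can be sketched as below. The function name `mean_of` is illustrative; the prices are the ones from the example, and the short list 1 5 4 3 2 is the earlier worked example.

```python
def mean_of(scores):
    """Return the arithmetic mean: the sum of the scores divided by N."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(scores) / len(scores)


cd_prices = [350, 275, 500, 325, 100, 375, 300]

print(sum(cd_prices))                    # sum of the prices
print(f"${mean_of(cd_prices):.2f}")      # mean price
print(mean_of([1, 5, 4, 3, 2]))          # the earlier five-score example
```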


Relations Between the Measures of Central Tendency

• In symmetrical distributions, the median and mean are equal
• For normal distributions, mean = median = mode
• In positively skewed distributions, the mean is greater than the median
• In negatively skewed distributions, the mean is smaller than the median
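
As a rough illustration of these relations, the sketch below compares the mean with the median. The function `skew_direction` is a hypothetical helper and only a heuristic: mean and median can coincide in a distribution that is not symmetrical. The first data set is the CD player prices from the mean example; the second is made up for this illustration.

```python
import math
import statistics


def skew_direction(scores):
    """Heuristic only: report the skew suggested by comparing mean and median."""
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    if math.isclose(mean, median):
        return "none"
    return "positive" if mean > median else "negative"


cd_prices = [350, 275, 500, 325, 100, 375, 300]   # the low $100 price pulls the mean down
made_up_scores = [1, 2, 3, 4, 20]                 # hypothetical data with one large score

print(skew_direction(cd_prices))
print(skew_direction(made_up_scores))
```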

Quartiles, Deciles and Percentiles
(Measure of Position)
• The median splits the data into two equal-sized halves
• Quartiles split the data into quarters
• Deciles split the data into tenths
• Percentiles split the data into hundredths, so a percentile can mark any split of our choosing


50% - - 50%

Lowest Data Median Highest Data


Value 50% value Value
Quartiles
25% 25% 25% 25%
Q1 Q2 Q3

Deciles 1/10

10% 10% 10% 10% 10% 10% 10% 10% 10% 10%
Percentile Computation
• To formalize the computational procedure, let Lp refer to the location of a desired percentile. So if we wanted to find the 33rd percentile we would use L33, and if we wanted the median, the 50th percentile, then L50.
• The number of observations is n, so if we want to locate the median, its position is at (n + 1)/2. In general, Lp = (n + 1)(P/100), where P is the desired percentile.

Percentiles - Example

Locate the median, the first quartile, and the third quartile for the data below:

$2,038 $1,758 $1,721 $1,637
$2,097 $2,047 $2,205 $1,787
$2,287 $1,940 $2,311 $2,054
$2,406 $1,471 $1,460

Percentiles – Example (cont.)


Step 1: Organize the data from lowest to highest value

$1,460 $1,471 $1,637 $1,721
$1,758 $1,787 $1,940 $2,038
$2,047 $2,054 $2,097 $2,205
$2,287 $2,311 $2,406
Percentiles – Example (cont.)

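Step 2: Compute the first and third quartiles. Locate L25 and L75 using:

L25 = (15 + 1) * (25/100) = 4
L75 = (15 + 1) * (75/100) = 12

Therefore, the first and third quartiles are located at the 4th and 12th positions, respectively:
Q1 (value at L25) = $1,721
Q3 (value at L75) = $2,205
The median is located at L50 = (15 + 1) * (50/100) = 8, the 8th position: $2,038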
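
The location rule Lp = (n + 1)(P/100) can be sketched as a function. The name `percentile` is illustrative. When the location is not a whole number, this sketch interpolates linearly between the two neighbouring values; that is an assumption, though at a .5 position it gives the same result as averaging the two neighbours.

```python
def percentile(data, p):
    """Return the p-th percentile using the location Lp = (n + 1) * p / 100."""
    if not data:
        raise ValueError("need at least one observation")
    ordered = sorted(data)                 # lowest to highest
    n = len(ordered)
    location = (n + 1) * p / 100           # 1-based position Lp
    lower_pos = int(location)              # whole part of the position
    fraction = location - lower_pos        # fractional part of the position
    if lower_pos < 1:
        return ordered[0]                  # location falls before the first value
    if lower_pos >= n:
        return ordered[-1]                 # location falls at or after the last value
    lower = ordered[lower_pos - 1]
    upper = ordered[lower_pos]
    return lower + fraction * (upper - lower)


amounts = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
           2287, 1940, 2311, 2054, 2406, 1471, 1460]

for p in (25, 50, 75):
    print(p, percentile(amounts, p))
```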
Quartile Measures
Calculating The Quartiles: Example
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)

Q1 is in the (9+1)*25/100 = 2.5 position of the ranked data, so Q1 = (12+13)/2 = 12.5

Q2 is in the (9+1)*50/100 = 5th position of the ranked data, so Q2 = median = 16

Q3 is in the (9+1)*75/100 = 7.5 position of the ranked data, so Q3 = (18+21)/2 = 19.5

Q1 and Q3 are measures of non-central location

Q2, the median, is a measure of central tendency
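
These quartiles can be cross-checked with Python's standard library. This sketch assumes Python 3.8 or later, where `statistics.quantiles` is available; its `"exclusive"` method places the cut points at multiples of (n + 1)/4, which matches the position rule used above.

```python
import statistics

ordered_array = [11, 12, 13, 16, 16, 17, 18, 21, 22]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(ordered_array, n=4, method="exclusive")

print(q1, q2, q3)
```

Other quantile conventions exist (for example the `"inclusive"` method), and they can give slightly different quartiles for the same data.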
Boxplots
A box plot is a graphical display, based on quartiles, that helps us
picture a set of data.

To construct a box plot, we need only five statistics:

1. the minimum value,
2. Q1 (the first quartile),
3. the median,
4. Q3 (the third quartile), and
5. the maximum value.
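
The five statistics can be collected in one step. This is a sketch: `five_number_summary` is a hypothetical helper name, it assumes Python 3.8 or later for `statistics.quantiles`, and it is run here on the nine-value ordered array from the quartile example rather than on box plot data.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a box plot."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return (min(data), q1, median, q3, max(data))


ordered_array = [11, 12, 13, 16, 16, 17, 18, 21, 22]

print(five_number_summary(ordered_array))
```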

Boxplot - Example
Step 1: Create an appropriate scale along the horizontal axis.
Step 2: Draw a box that starts at Q1 (15 minutes) and ends at Q3 (22 minutes). Inside the box we place a vertical line to represent the median (18 minutes).
Step 3: Extend horizontal lines from the box out to the minimum value (13 minutes) and the maximum value (30 minutes).
Distribution Shape and
The Boxplot

[Figure: three boxplots, each marked Q1, Q2, Q3, for a negatively skewed, a symmetrical, and a positively skewed distribution.]
Quartile Measures:
The Interquartile Range (IQR)

― The IQR is Q3 – Q1 and measures the spread in the middle 50% of the data
― The IQR is a measure of variability that is not influenced by outliers or extreme values
― Measures like Q1, Q3, and IQR that are not influenced by outliers are called resistant measures
The Interquartile Range

Example:
[Figure: boxplot with minimum = 11, Q1 = 12.5, median (Q2) = 16, Q3 = 19.5, maximum = 22; each of the four segments holds 25% of the data.]

Interquartile range = 19.5 – 12.5 = 7
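
A short sketch of the IQR and of what "resistant" means in practice. The helper names `iqr` and `value_range` are illustrative, `statistics.quantiles` requires Python 3.8 or later, and the second data set (maximum replaced by 220) is a made-up outlier case, not from the slides.

```python
import statistics


def iqr(data):
    """Return Q3 - Q1, the spread of the middle 50% of the data."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return q3 - q1


def value_range(data):
    """Return maximum - minimum, for comparison with the IQR."""
    return max(data) - min(data)


ordered_array = [11, 12, 13, 16, 16, 17, 18, 21, 22]
with_outlier = ordered_array[:-1] + [220]      # hypothetical extreme value

print(iqr(ordered_array), value_range(ordered_array))
print(iqr(with_outlier), value_range(with_outlier))
```

In this sketch the range changes sharply once the extreme value is introduced, while the IQR stays the same.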
