Lecture 2-Summarizing Data - HSciences Biostats - 010232en

Biostatics Lecture UPNG SMHS

Uploaded by

Oxy Maine


SUMMARIZING DATA: MEASURES OF CENTRAL TENDENCY, PART A

Elias Namosha
Division of Public Health, SMHS-UPNG
Introduction to Biostatistics
01st March, 2022
OBJECTIVES
Given a set of data, you should be able to:

• Choose an appropriate measure of central location (mean, median, or mode)
• Calculate the mean
• Identify and use the median and the mode

These measures describe the location of the data and are used mostly in descriptive statistics.
MEASURE OF CENTRAL LOCATION
Definition: a single value that represents an entire frequency distribution.

Also known as:

• “Measure of the center”
• “Measure of central tendency”

• A measure of central tendency describes the middle, or midpoint, of a data distribution.
• The aim is to find one value that conveys information about the entire frequency distribution.
MEAN
$$\text{Mean} = \frac{\text{Sum of values for each member of a given population}}{\text{Number of population members}}$$

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} \quad \text{or simply} \quad \mu = \frac{\sum X}{N}$$

This is the “average” (e.g., height) of all the members.


1. MEAN
Method for identification
1. Sum up all of the values
2. Divide the sum by n
Definition: the “average” (center of gravity)
0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
12, 13, 14, 16, 18, 18, 19, 22, 27, 49
Sum = 360; n = 30
Mean = 360 / 30 = ?
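The two-step method above can be checked with a short Python sketch. The variable names (`length_of_stay`, `total`, `n`, `mean`) are illustrative choices, not part of the lecture; the values are the 30 observations listed on the slide.

```python
# The 30 observations from the slide.
length_of_stay = [
    0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
    9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
    12, 13, 14, 16, 18, 18, 19, 22, 27, 49,
]

total = sum(length_of_stay)  # step 1: sum up all of the values
n = len(length_of_stay)      # number of observations
mean = total / n             # step 2: divide the sum by n

print(f"Sum = {total}; n = {n}; mean = {mean}")
```

Running the sketch confirms the sum and count given on the slide and prints the value that answers the question.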
MEAN – PROPERTIES / USES

• Probably the most common measure of central location
• Uses all of the data
• Affected by extreme values (outliers)
• Best for normally distributed data
• Not usually equal to one of the original values
• Good statistical properties
2. MEDIAN

Definition: the middle value

Method for identification:

1. Arrange observations in order
2. Find middle rank as (n + 1) / 2
3. Identify the value at the middle rank; if the rank falls between two positions (n is even), average the two adjacent values
LENGTH OF STAY DATA

0, 2, 3, 4, 5, 5, 6, 7, 8, 9, 9, 9, 10, 10, 10,

10, 10, 11, 12, 12, 12, 13, 14, 16, 18, 18,

19, 22, 27, 49

What is the median for this data set?

LENGTH OF STAY DATA
n = 30
Middle rank = (30 + 1) / 2 = 15.5, i.e., between the 15th and 16th positions
Value at 15th position = 10
Value at 16th position = 10
So median = (10 + 10) / 2 = 10

0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
9, 9, 10, 10, 10, M 10, 10, 11, 12, 12,
12, 13, 14, 16, 18, 18, 19, 22, 27, 49
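The middle-rank method can be sketched in Python as follows. The function name `median_by_rank` is a hypothetical helper written for this illustration; averaging the two neighbouring values when the rank is fractional is the usual convention and is assumed here.

```python
def median_by_rank(values):
    """Median via the (n + 1) / 2 middle-rank rule."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)       # step 1: arrange observations in order
    n = len(ordered)
    rank = (n + 1) / 2             # step 2: find the middle rank
    if rank.is_integer():          # n is odd: one middle observation
        return ordered[int(rank) - 1]
    lower = ordered[int(rank) - 1]  # observation just below the middle rank
    upper = ordered[int(rank)]      # observation just above the middle rank
    return (lower + upper) / 2      # step 3: average the two neighbours


length_of_stay = [
    0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
    9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
    12, 13, 14, 16, 18, 18, 19, 22, 27, 49,
]

median = median_by_rank(length_of_stay)
print(median)
```

For the 30 length of stay values the rank is 15.5, so the function averages the 15th and 16th observations, matching the slide.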
MEDIAN – PROPERTIES / USES

• Does not use all the data available
• Insensitive to extreme values (outliers)
• Poor statistical properties
• Measure of choice for skewed data
• Equals an original value if n is odd

Because the median depends only on the middle rank rather than on every value, it is insensitive to extreme values. The median is the preferred measure of central tendency for skewed data.
3. MODE: METHOD FOR IDENTIFICATION
Definition: the value that occurs most frequently

1a. Arrange data into a frequency distribution, showing the values of the variable and the frequency with which each value occurs.

1b. Alternatively, arrange raw data in ascending order.

2. Identify the value that occurs most often.

The mode is the simplest measure used to describe central tendency.
LENGTH OF STAY DATA

0, 2, 3, 4, 5, 5, 6, 7, 8, 9, 9, 9, 10, 10, 10,

10, 10, 11, 12, 12, 12, 13, 14, 16, 18, 18,

19, 22, 27, 49

Identify the value that occurs most often in this dataset.
Another way to understand a data distribution is to depict the values graphically, with the number of observations on the y-axis and the values of the variable on the x-axis. In such a graph of the length of stay data, it is immediately apparent that the mode is 10.
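A frequency distribution (method 1a) can be built in a few lines of Python. This is a minimal sketch using the standard-library `collections.Counter`; the names `frequency`, `highest`, and `modes` are illustrative. It returns a list because a data set may have more than one mode.

```python
from collections import Counter

length_of_stay = [
    0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
    9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
    12, 13, 14, 16, 18, 18, 19, 22, 27, 49,
]

# Step 1a: frequency distribution (value -> number of occurrences).
frequency = Counter(length_of_stay)

# Step 2: keep every value that reaches the highest frequency.
highest = max(frequency.values())
modes = sorted(value for value, count in frequency.items() if count == highest)

for value in sorted(frequency):
    print(f"{value:>2} | {'*' * frequency[value]}")  # crude text histogram
print("Mode(s):", modes)
```

The text histogram plays the role of the graph: the longest bar sits at the mode.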
MODE – PROPERTIES / USES

• Easiest measure to understand, explain, and identify
• Always equals an original value
• Insensitive to extreme values (outliers)
• Poor statistical properties
• May be more than one mode
• Does not use all the data

The mode is the easiest measure of central tendency to identify, explain, and understand but, unfortunately, it is also the least valuable.
COMPARISON OF MODE, MEDIAN AND MEAN
• Mode: most common value
• Median: central value
• Arithmetic mean: average value
• Mean uses all data, so it is sensitive to outliers
• Mean has best statistical properties
• Mean preferred for normally distributed data
• Median preferred for skewed data
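To see why the mean is sensitive to outliers while the median is not, the sketch below recomputes both for the length of stay data with and without its largest value (49). Dropping that observation is done purely for illustration and is not a step from the lecture; the variable names are assumptions.

```python
import statistics

with_outlier = [
    0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
    9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
    12, 13, 14, 16, 18, 18, 19, 22, 27, 49,
]
# Illustration only: the same data without the extreme value 49.
without_outlier = [x for x in with_outlier if x != 49]

mean_with = statistics.mean(with_outlier)
mean_without = statistics.mean(without_outlier)
median_with = statistics.median(with_outlier)
median_without = statistics.median(without_outlier)

print(f"Mean:   {mean_with:.2f} -> {mean_without:.2f}")
print(f"Median: {median_with} -> {median_without}")
```

A single extreme observation moves the mean by more than a day, while the median stays where it was.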
NORMAL CURVE

• In a normal curve (normal distribution), the mean, median, and mode all have the same value.

THREE CURVES WITH DIFFERENT SKEWING

• Most curves are not perfectly normal; they exhibit some degree of skewing.
• In a skewed curve, the mean, median, and mode generally differ.

SUMMARIZING DATA: MEASURES OF DISPERSION (SPREAD), PART B
OBJECTIVES
Describe the following measures of spread/dispersion:
– Range
– Interquartile range
– Variance
– Standard deviation
MEASURES OF VARIATION
Definition: measures that quantify the variation, dispersion, or spread of a set of data around its central location
Also known as:
• “Measure of dispersion”
• “Measure of spread”
Common measures:
• Range
• Interquartile range
• Variance / standard deviation
• Standard error
• 95% CI
RANGE
Definition: difference between the largest and smallest values

Properties / Uses
• Can be reported as two values (minimum and maximum) or as one (their difference)
• Greatly affected by outliers
• Usually used with median
Range
2 4 20
3 49 22
12 10 11
5 0 18
27 10 18
6 5 13
7 9 14
8 10 9
9 10 12
12 16
Spreadsheet formulas: =MIN(A1:C10) and =MAX(A1:C10)
What is the range of this dataset?
Range

Length of hospital stay for pneumonia

MIN: 0
MAX: 49
MODE: 10
MEDIAN: 10
MEAN: 12

Have a look at the dataset above.

What is the range of length of stay?
• This graph gives us a great visual representation of the spread of the data.
• However, statisticians and epidemiologists tend to like numbers, so how do we describe this ‘spread’ of data with numbers?
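One numeric answer is the range. The sketch below mirrors the spreadsheet's MIN and MAX formulas with Python's built-in `min` and `max`, applied to the 30 length of stay values used throughout the lecture; the variable names are illustrative.

```python
length_of_stay = [
    0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
    9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
    12, 13, 14, 16, 18, 18, 19, 22, 27, 49,
]

smallest = min(length_of_stay)   # counterpart of =MIN(...)
largest = max(length_of_stay)    # counterpart of =MAX(...)
data_range = largest - smallest  # range as a single value

print(f"Range: {smallest} to {largest} (difference = {data_range})")
```

The output reports the range both ways: as the pair (minimum, maximum) and as their difference.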
INTERQUARTILE RANGE
Definition: the central 50% of a distribution

Properties / Uses
• Used with median
• Five-number summary for box-and-whisker diagram:
– Maximum (100%, largest value)
– Third quartile (75%)
– Median (50%)
– First quartile (25%)
– Minimum (0%, smallest value)

THE MIDDLE HALF OF THE OBSERVATIONS IN A FREQUENCY DISTRIBUTION LIE WITHIN THE INTERQUARTILE RANGE

The white space under the curve represents the interquartile range in this graphic (length of stay data).
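A five-number summary for the length of stay data can be sketched with Python's standard-library `statistics.quantiles`. Quartile conventions differ between textbooks and software, so the quartiles below reflect that function's default ("exclusive") method, which is an assumption here and not necessarily the convention used in the lecture.

```python
import statistics

length_of_stay = [
    0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
    9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
    12, 13, 14, 16, 18, 18, 19, 22, 27, 49,
]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(length_of_stay, n=4)

five_number_summary = {
    "minimum": min(length_of_stay),
    "first quartile": q1,
    "median": q2,
    "third quartile": q3,
    "maximum": max(length_of_stay),
}
iqr = q3 - q1  # width of the central 50% of the distribution

for label, value in five_number_summary.items():
    print(f"{label:>15}: {value}")
print(f"{'IQR':>15}: {iqr}")
```

Unlike the range, the interquartile range is barely affected by the single 49-day stay, which is why it pairs naturally with the median.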
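MEASURES OF VARIABILITY / SPREAD
• Units of variance are the square of the units of the variable of interest.
• It’s more common to present the square root of variance, the standard deviation:

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} \quad \text{or, for a sample,} \quad \sigma = \sqrt{\frac{\sum (X - \mu)^2}{N - 1}}$$

$$SD = \sqrt{\frac{\sum (X - \mu)^2}{N - 1}}$$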
VARIANCE AND STANDARD DEVIATION
• Variance = average of the squared deviations from the mean: Σ(x − mean)² / n
• Standard deviation is simply the square root of variance
• Standard deviation is a measure of variation that quantifies how closely clustered the observed values are to the mean
• Standard deviation is a measure of how spread out the numbers are; it is usually given the Greek symbol sigma ‘σ’

To compute the variance, take the difference between each observation and the mean, square each difference, sum the squares, and divide by the number of observations. Standard deviation is the square root of variance. The smaller the variance or standard deviation, the more tightly clustered the data are.
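The calculation described above can be followed step by step in Python. This is a minimal sketch on the length of stay data; the variable names are illustrative, and the N − 1 (sample) version is included alongside the divide-by-n (population) version for comparison.

```python
import math

length_of_stay = [
    0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
    9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
    12, 13, 14, 16, 18, 18, 19, 22, 27, 49,
]

n = len(length_of_stay)
mean = sum(length_of_stay) / n

# Square each deviation from the mean, then add the squares up.
sum_of_squares = sum((x - mean) ** 2 for x in length_of_stay)

population_variance = sum_of_squares / n        # divide by n
sample_variance = sum_of_squares / (n - 1)      # divide by n - 1
population_sd = math.sqrt(population_variance)  # SD = square root of variance
sample_sd = math.sqrt(sample_variance)

print(f"Sum of squared deviations: {sum_of_squares}")
print(f"Variance (n):     {population_variance:.2f}   SD: {population_sd:.2f}")
print(f"Variance (n - 1): {sample_variance:.2f}   SD: {sample_sd:.2f}")
```

The standard library offers the same results through `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev`. Note that the variance is in days squared, while the standard deviation is back in days.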
STANDARD DEVIATION – PROPERTIES / USES

Standard deviation is usually calculated only when data are more or less normally distributed (bell-shaped curve).

For normally distributed data,

• 68.3% of the data fall within plus/minus 1 SD
• 95.4% of the data fall within plus/minus 2 SD
• 99.7% of the data fall within plus/minus 3 SD
AREAS UNDER THE NORMAL CURVE THAT LIE BETWEEN 1, 2, AND 3 STANDARD DEVIATIONS ON EACH SIDE OF THE MEAN

In a normal distribution, about 95% of data values are contained within the mean plus or minus two SDs. Don’t worry about the math and the formula here; focus on the concept.
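For readers who do want to see where these percentages come from, they can be reproduced with the standard-library `statistics.NormalDist`. This is only a sketch of the idea; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: share_within(k) for k in (1, 2, 3)}

for k, share in coverage.items():
    print(f"Within plus/minus {k} SD: {share:.2%}")
```

The computed shares are about 68.27%, 95.45%, and 99.73%, which the slide quotes in rounded form. Because the proportions depend only on the number of SDs from the mean, the same figures hold for any normal distribution, whatever its mean and SD.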
SUMMARY
• Mode: simple, not always useful
• Median: best for skewed data
• Arithmetic mean: best for normally distributed data
• Geometric mean: use for lab titers. (The geometric mean differs from the regular, arithmetic mean; it is not a simple average. Titers are lab tests that measure the presence and amount of antibodies in blood.)
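The lecture does not give a formula for the geometric mean, so the sketch below relies on the standard definition (the n-th root of the product of n positive values) as implemented by `statistics.geometric_mean`. The titer values are hypothetical, invented only to show how the two means differ on data that double from one step to the next.

```python
import statistics

# Hypothetical reciprocal antibody titers (1:10, 1:20, ...); not lecture data.
titers = [10, 20, 40, 80, 160]

arithmetic_mean = statistics.mean(titers)
geometric_mean = statistics.geometric_mean(titers)

print(f"Arithmetic mean: {arithmetic_mean}")
print(f"Geometric mean:  {geometric_mean:.1f}")
```

On these made-up values the geometric mean lands on the middle dilution (40), while the arithmetic mean is pulled upward by the largest titer.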
SUMMARY
• Range: use with median
• Standard deviation: use with mean. (Standard deviation shows how much individuals within the same sample differ from the sample mean.)
• Standard error: used to construct confidence intervals. (Standard error shows how close your sample mean is likely to be to the population mean.)

This also means that the standard error should decrease as the sample size increases, because the estimate of the population mean improves. The standard deviation is not systematically affected by sample size.
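The lecture does not state a formula for the standard error; the sketch below assumes the usual one for a sample mean, SE = SD / √n. The helper name `standard_error` and the larger sample size of 120 are hypothetical, chosen only to show the effect of sample size when the SD stays the same.

```python
import math
import statistics

length_of_stay = [
    0, 2, 3, 4, 5, 5, 6, 7, 8, 9,
    9, 9, 10, 10, 10, 10, 10, 11, 12, 12,
    12, 13, 14, 16, 18, 18, 19, 22, 27, 49,
]


def standard_error(sd, n):
    """Standard error of the mean, assuming SE = SD / sqrt(n)."""
    return sd / math.sqrt(n)


sd = statistics.stdev(length_of_stay)  # sample SD (n - 1 denominator)

se_30 = standard_error(sd, len(length_of_stay))
se_120 = standard_error(sd, 120)  # hypothetical: 4x the sample, same SD

print(f"SD = {sd:.2f}")
print(f"SE with n = 30:  {se_30:.2f}")
print(f"SE with n = 120: {se_120:.2f}")
```

Quadrupling the sample size halves the standard error, while the SD used in the calculation is unchanged.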
END OF
PRESENTATION
THANK YOU!
