
Statistics - Lecture Slides 3 - For Lecture

The document discusses statistical concepts including quartiles (Q1, Q2, Q3), interquartile range (IQR), and the identification of outliers. It explains how to calculate these metrics and provides examples using data sets, including boxplots to visualize the distribution of data. Additionally, it covers the concepts of variance and standard deviation in relation to sample and population data.

Stand for ambition.

kent.ac.uk
[Figure: the ordered data split into four parts, each containing 25% of the data, separated by the lower quartile, the median and the upper quartile.]

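• $Q_2$: the median
• $Q_1$: the lower quartile
• $Q_3$: the upper quartile
• $Q_1$, $Q_2$, $Q_3$ split the ordered data into four parts, each containing 25% of the data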

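• Position of $Q_2$: $\frac{n+1}{2}$
• Position of $Q_1$: $\frac{1}{4} \times n$
• Position of $Q_3$: $\frac{3}{4} \times n$
• A position that is not a whole number $\implies$ round up to the next whole number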



• Interquartile range: $IQR = Q_3 - Q_1$
[Figure: three distributions, labelled positively skewed, symmetrical and negatively skewed.]

[Figure: a data set in which one observation stands out as being 'unusually' small. Is it an outlier?]

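An observation is an outlier if it lies more than $1.5 \times IQR$ below $Q_1$ or above $Q_3$, that is, if it is
• less than $Q_1 - 1.5 \times IQR$, or
• greater than $Q_3 + 1.5 \times IQR$.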
$Q_1 = 70$, $Q_3 = 100$
$IQR = Q_3 - Q_1 = 100 - 70 = 30$
$1.5 \times IQR = 1.5 \times 30 = 45$
$Q_1 - 1.5 \times IQR = 70 - 45 = 25$
$Q_3 + 1.5 \times IQR = 100 + 45 = 145$
$Q_1 = 3.2$, $Q_3 = 4$
$IQR = Q_3 - Q_1 = 4 - 3.2 = 0.8$
$1.5 \times IQR = 1.5 \times 0.8 = 1.2$
$Q_1 - 1.5 \times IQR = 3.2 - 1.2 = 2$
$Q_3 + 1.5 \times IQR = 4 + 1.2 = 5.2$
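The two fence calculations above can be sketched in a few lines of Python. This is only an illustration: the function name `outlier_fences` and the parameter `k` are hypothetical, not taken from the slides.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) for the k x IQR outlier rule."""
    iqr = q3 - q1                      # interquartile range
    return q1 - k * iqr, q3 + k * iqr  # observations outside these are outliers

# First example above: Q1 = 70, Q3 = 100
print(outlier_fences(70, 100))   # (25.0, 145.0)

# Second example above: Q1 = 3.2, Q3 = 4.
# The result is approximately (2.0, 5.2), up to floating-point rounding.
print(outlier_fences(3.2, 4))
```

Any observation below the first value or above the second would be flagged as an outlier.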

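[Figure: annotated boxplot]
• Smallest observation that is NOT an outlier = 2
• Lower quartile = 2.5
• Median = 3
• Upper quartile = 5
• Largest observation that is NOT an outlier = 7
• Outlier = 10
• The box is the middle 50% of the data.
• The whiskers cover the remaining 50% of the data (the smallest and the largest values), apart from any outliers.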
1. Which School achieved the highest mark?
2. Which School achieved the lowest mark?
3. Which School had a mark that stood out as being unusually low for that
school?
4. Which School had the largest IQR?
5. Which School had the largest range of marks?
6. The marks in which School were more variable?
7. True or False: the median mark in School A was greater than the median mark
in School B?
8. Which School’s median mark was closest to 60?
• Symmetrical distribution: whiskers approximately equal and median in centre of box
• Positively skewed distribution: upper whisker longer and median towards lower quartile
• Negatively skewed distribution: lower whisker longer and median towards upper quartile


[Figure: boxplot labelled, from left to right, with the smallest observation that is not an outlier, $Q_1$, $Q_2$, $Q_3$, the largest observation that is not an outlier, and the largest observation.]

We need to know:
1. $Q_1$, $Q_2$ and $Q_3$
2. all outliers
3. smallest value that is not an outlier
4. largest value that is not an outlier
Example: consider the scores (already in ascending order)
18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100
$n = 15$
1. Median ($Q_2$): position $= \frac{n+1}{2} = \frac{15+1}{2} = 8 \implies Q_2 = 68$
   Lower quartile ($Q_1$): position $= \frac{1}{4} \times n = \frac{1}{4} \times 15 = 3.75 \implies$ round up to 4 $\implies Q_1 = 52$
   Upper quartile ($Q_3$): position $= \frac{3}{4} \times n = \frac{3}{4} \times 15 = 11.25 \implies$ round up to 12 $\implies Q_3 = 87$
2. $IQR = Q_3 - Q_1 = 87 - 52 = 35$, so $1.5 \times IQR = 52.5$
   Lower fence $= Q_1 - 1.5 \times IQR = 52 - 52.5 = -0.5$
   Upper fence $= Q_3 + 1.5 \times IQR = 87 + 52.5 = 139.5$
   Observations $< -0.5$ or $> 139.5$ are outliers. All scores lie between 18 and 100, so there are no outliers.
3. Smallest observation that is not an outlier = 18
4. Largest observation that is not an outlier = 100
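The four steps of this example can be reproduced with a short Python sketch. It uses the position rule from the example (position $\frac{1}{4} \times n$ or $\frac{3}{4} \times n$, rounded up when not a whole number). Two points are assumptions rather than something the slides state: a whole-number position is used as it is, and the median is taken from the standard library's `statistics.median`. The function name `boxplot_summary` is hypothetical.

```python
import math
import statistics

def boxplot_summary(data, k=1.5):
    """Quartiles, fences, outliers and whisker ends for a boxplot."""
    xs = sorted(data)
    n = len(xs)

    def at(position):
        # Positions are 1-based; a non-integer position is rounded up.
        # Assumption: a whole-number position is used unchanged.
        return xs[math.ceil(position) - 1]

    q1, q2, q3 = at(n / 4), statistics.median(xs), at(3 * n / 4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    inside = [x for x in xs if lower <= x <= upper]
    return {
        "q1": q1, "q2": q2, "q3": q3,
        "fences": (lower, upper),
        "outliers": [x for x in xs if x < lower or x > upper],
        "whiskers": (inside[0], inside[-1]),  # smallest/largest non-outliers
    }

scores = [18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100]
summary = boxplot_summary(scores)
print(summary)
```

For the fifteen scores this gives quartiles 52, 68 and 87, fences of -0.5 and 139.5, no outliers, and whiskers ending at 18 and 100, matching the hand calculation. Statistical software often uses a different quartile convention, so its quartiles may differ slightly from this rule.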
Information we need:
$Q_1 = 52$, $Q_2 = 68$, $Q_3 = 87$
There are no outliers.
Smallest observation that is not an outlier = 18
Largest observation that is not an outlier = 100

We can draw the boxplot: this is what it would look like.


• Lower fence: $Q_1 - 1.5 \times IQR$
• Upper fence: $Q_3 + 1.5 \times IQR$

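Mean absolute deviation: $\frac{\sum |x - \bar{x}|}{n}$, where $x$ is an observation and $\bar{x}$ is the sample mean.

Step 1: find the mean $\bar{x}$.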
$\bar{x} = \frac{\sum x}{n}$
$\sum x = 92 + 75 + 95 + 90 + 98 = 450$, $n = 5$, so $\bar{x} = \frac{450}{5} = 90$.

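Step 2: find each deviation $x - \bar{x}$, then $\frac{\sum |x - \bar{x}|}{n}$.

$x$      $x - \bar{x}$      $|x - \bar{x}|$
92       92 - 90 = 2        2
75       75 - 90 = -15      15
95       95 - 90 = 5        5
90       90 - 90 = 0        0
98       98 - 90 = 8        8
Total    0                  30

$\sum |x - \bar{x}| = 30$, $n = 5$, so the mean absolute deviation is $\frac{30}{5} = 6$.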
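The same calculation can be checked with a minimal Python sketch; the function name `mean_absolute_deviation` is chosen here for illustration only.

```python
def mean_absolute_deviation(xs):
    """Average absolute distance of the observations from their mean."""
    mean = sum(xs) / len(xs)
    return sum(abs(x - mean) for x in xs) / len(xs)

marks = [92, 75, 95, 90, 98]
print(mean_absolute_deviation(marks))  # 6.0
```

The absolute values matter: without them the deviations always sum to zero, as the Total row of the table shows.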

• Sample mean ($\bar{x}$)
• Sample variance ($s^2$)
• Sample standard deviation ($s$)



$s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$

or, equivalently,

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}$

• The more spread out a data set is, the higher the value of $s^2$.
• Note that $s^2$ is never negative.

$s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$

Step 1: find the mean $\bar{x}$.
$\bar{x} = \frac{\sum x}{n}$
$\sum x = 79 + 75 + 80 + 86 = 320$, $n = 4$, so $\bar{x} = \frac{320}{4} = 80$.

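Step 2: find each deviation $x - \bar{x}$, square it, and apply $s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$.

$x$      $x - \bar{x}$      $(x - \bar{x})^2$
79       79 - 80 = -1       $(-1)^2 = 1$
75       75 - 80 = -5       $(-5)^2 = 25$
80       80 - 80 = 0        $0^2 = 0$
86       86 - 80 = 6        $6^2 = 36$
Total    0                  62

$\sum (x - \bar{x})^2 = 62$, $n = 4$, so $s^2 = \frac{62}{4-1} = 20.67$.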
Suppose we use the other formula: $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}$

$\sum x = 79 + 75 + 80 + 86 = 320$
$\sum x^2 = 79^2 + 75^2 + 80^2 + 86^2 = 25662$

$s^2 = \frac{25662 - \frac{320^2}{4}}{3} = \frac{25662 - 25600}{3} = \frac{62}{3} = 20.67$
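Both forms of the sample variance can be compared in Python. The sketch below is illustrative: the two function names are hypothetical, while `statistics.variance` is the standard library's sample variance (divisor $n - 1$) and is used only as a cross-check.

```python
import math
import statistics

def variance_definition(xs):
    """s^2 = sum((x - mean)^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def variance_shortcut(xs):
    """s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

data = [79, 75, 80, 86]
s2 = variance_definition(data)
print(round(s2, 2))                                    # 20.67
print(math.isclose(s2, variance_shortcut(data)))       # True
print(math.isclose(s2, statistics.variance(data)))     # True
print(round(math.sqrt(s2), 2))                         # sample standard deviation s
```

The shortcut form needs only $\sum x$ and $\sum x^2$, which is convenient by hand, but with floating-point numbers it can lose accuracy when the mean is large relative to the spread.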

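• Population size: $N$
• Population mean: $\mu$
• Population variance: $\sigma^2$

$\mu = \frac{\sum x}{N}$

$\sigma^2 = \frac{1}{N} \sum (x - \mu)^2$ and $\sigma^2 = \frac{\sum x^2}{N} - \mu^2$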
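A short Python sketch of the two population formulas follows. The slides give no population data set, so the four values below are an assumption made purely for illustration: they are treated as a complete population. The function names are hypothetical; `statistics.pvariance` is the standard library's population variance (divisor $N$).

```python
import statistics

def population_variance(xs):
    """sigma^2 = sum((x - mu)^2) / N."""
    N = len(xs)
    mu = sum(xs) / N
    return sum((x - mu) ** 2 for x in xs) / N

def population_variance_shortcut(xs):
    """sigma^2 = sum(x^2) / N - mu^2."""
    N = len(xs)
    mu = sum(xs) / N
    return sum(x * x for x in xs) / N - mu ** 2

# Assumption: these four values form the whole population.
population = [79, 75, 80, 86]
print(population_variance(population))            # 15.5
print(population_variance_shortcut(population))   # 15.5
print(statistics.pvariance(population))           # 15.5
```

Dividing by $N$ instead of $n - 1$ makes the result smaller than the sample variance of the same four numbers.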

• variance

• sample variance
• population variance

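a) $\sum x = 1800$ and $n = 30$, so $\bar{x} = \frac{1800}{30} = 60$ (the estimate of $\mu$).

b) $\sum x^2 = 124888.9$, so
$s^2 = \frac{1}{n-1} \left( \sum x^2 - \frac{(\sum x)^2}{n} \right) = \frac{1}{30-1} \left( 124888.9 - \frac{1800^2}{30} \right) = 582.3759$ (the estimate of $\sigma^2$).
$s = \sqrt{582.3759} = 24.1325$ (the estimate of $\sigma$).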
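When only the totals $\sum x$, $\sum x^2$ and $n$ are available, as in this exercise, the estimates can be computed directly from them. The following is a minimal sketch; the function name `sample_stats_from_sums` is hypothetical.

```python
import math

def sample_stats_from_sums(sum_x, sum_x2, n):
    """Return (mean, sample variance, sample standard deviation)."""
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, variance, math.sqrt(variance)

mean, variance, sd = sample_stats_from_sums(1800, 124888.9, 30)
print(mean)                 # 60.0
print(round(variance, 4))   # 582.3759
print(round(sd, 4))         # 24.1325
```

The standard deviation is the square root of the variance, so it is in the same units as the data.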

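• Mean absolute deviation: $\frac{\sum |x - \bar{x}|}{n}$

• Sample variance: $s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$ or $s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}$

• Population mean: $\mu = \frac{\sum x}{N}$

• Population variance: $\sigma^2 = \frac{1}{N} \sum (x - \mu)^2$ and $\sigma^2 = \frac{\sum x^2}{N} - \mu^2$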

• sample/population variance
