PDF Notes

Chapter Two discusses methods for describing data sets, focusing on frequency distributions, measures of center, and measures of dispersion. It explains different types of data, including quantitative and qualitative variables, and introduces graphical methods for visualizing data such as bar charts and histograms. The chapter also covers descriptive statistics, including measures of central tendency like mean, median, and mode, and emphasizes the impact of extreme values on these measures.

Uploaded by

anwilliams2004


Chapter Two

Methods for Describing Sets of Data

Purpose
• This chapter covers three complementary aspects of data
description: frequency distributions, measures of center,
and measures of dispersion. Together they help us “paint a
picture” of our data by giving us information about its
shape, center, and spread.


Example: A car insurance company evaluates many variables before deciding on an appropriate rate for automobile insurance.

Types of Data
• Quantitative (numeric) variables have measurements that are recorded on a naturally occurring numerical scale.
   o Discrete variables arise from a counting process.
   o Continuous variables arise from a measuring process.

• Qualitative (categorical) variables have measurements that cannot be measured on a natural numerical scale; they can only be classified into distinct categories.
   o A nominal scale classifies data into distinct categories in which no order or ranking is implied. Nominal categories cannot be ordered!!
   o An ordinal scale classifies data into distinct categories in which order or ranking is implied. Ordinal categories can be ordered!!


Example: An insurance company evaluates many variables before deciding on an appropriate rate for automobile insurance.
1. The number of claims the principal driver has made in the last 3 years is a
A. Categorical, ordinal scale
B. Numerical, discrete
C. Numerical, continuous
2. The odometer reading on the car being insured is a
A. Categorical, ordinal scale
B. Numerical, discrete
C. Numerical, continuous
3. The color of the car being insured is a
A. Categorical, ordinal scale
B. Categorical, nominal scale
C. Numerical, continuous

Graphical Methods

Sections 2.1 & 2.2


Visualizing Categorical Data

• Statistical pictures used to visualize data
   o One categorical variable
      • Frequency Distribution
      • Bar Chart
      • Pie Chart

Frequency Distribution
• A frequency distribution is a table that displays the number of occurrences (frequency) of each category or class in a data set.

• Relative frequency = $\dfrac{\text{class frequency}}{n}$, where $n$ is the total number of observations. Class percentage = relative frequency × 100.

Example: Impairment of language ability.

Subject   Type of Aphasia      Subject   Type of Aphasia
1         Broca’s              12        Broca’s
2         Anomic               13        Anomic
3         Anomic               14        Broca’s
4         Conduction           15        Anomic
5         Broca’s              16        Anomic
6         Conduction           17        Anomic
7         Conduction           18        Conduction
8         Anomic               19        Broca’s
9         Conduction           20        Anomic
10        Anomic               21        Conduction
11        Conduction           22        Anomic

Summary Table (Impairment):
Class        Frequency   Relative frequency   Percentage
Anomic       10          0.455                45.5
Broca's      5           0.227                22.7
Conduction   7           0.318                31.8
TOTALS       22          1.000                100.0
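
The summary table for the aphasia example can be rebuilt from the 22 raw observations. The following is a minimal Python sketch, not part of the original slides; the function name `frequency_table` is only illustrative.

```python
from collections import Counter

# Aphasia type for subjects 1..22, transcribed from the example above.
aphasia = [
    "Broca's", "Anomic", "Anomic", "Conduction", "Broca's", "Conduction",
    "Conduction", "Anomic", "Conduction", "Anomic", "Conduction",
    "Broca's", "Anomic", "Broca's", "Anomic", "Anomic", "Anomic",
    "Conduction", "Broca's", "Anomic", "Conduction", "Anomic",
]


def frequency_table(values):
    """Return {class: (frequency, relative frequency, percentage)}."""
    n = len(values)
    counts = Counter(values)
    return {
        cls: (freq, round(freq / n, 3), round(100 * freq / n, 1))
        for cls, freq in sorted(counts.items())
    }


table = frequency_table(aphasia)
for cls, (freq, rel, pct) in table.items():
    # One row per class: name, frequency, relative frequency, percentage.
    print(f"{cls:<12}{freq:>4}{rel:>8.3f}{pct:>7.1f}")
```

The relative frequencies sum to 1 and the percentages to 100, up to rounding, which is a quick check that no observation was dropped.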


Bar Chart
Bar chart – a series of bars, with each bar representing the class frequency, class relative frequency, or class percentage.
• Can be used for two or three variables simultaneously

Example: Impairment of language ability.

[Figure: Bar Chart of Type. Horizontal axis: Type (Anomic, Broca's, Conduction). Vertical axis: Percentage, from 0 to 100. Bar heights: Anomic 45.5, Broca's 22.7, Conduction 31.8.]

Pie Chart
Pie chart – uses sections of a circle to represent the class
frequency/class relative frequency/class percentage.

Example: Impairment of language ability.

[Figure: Pie Chart of Type. Slices: Anomic 45.5%, Broca's 22.7%, Conduction 31.8%.]


Visualizing Numeric Data

• Statistical pictures used to visualize data
   o One numeric variable
      • Dotplot
      • Stem-and-leaf plot
      • Histogram

Dotplot
• A dotplot is a graph that is used to show the distribution of a numeric variable when the sample size is small.

Example: A group of thirty-six 2-year-old sows of the same breed were bred to Yorkshire boars. The number of piglets surviving to 21 days of age was recorded for each sow.


Histogram
• A histogram is a graphical display that results when we replace the dots of a dotplot with bars.
   o In histograms, the bars usually touch. A gap between bars is not arbitrary, as the spacing in a bar chart is; it means that no observations fell in that class.
Example: A group of thirty-six 2-year-old sows of the same breed were bred to Yorkshire boars. The number of piglets surviving to 21 days of age was recorded for each sow.

Example: Serum CK. Creatine phosphokinase (CK) is an enzyme related to muscle and brain function. As part of a study to determine the natural variation in CK concentration, blood was drawn from 36 male volunteers. Their serum concentrations of CK (measured in U/l) are given in Table 2.2.6.


Example: Serum CK

[Figures: two histograms of the serum CK data, one drawn with 25 classes and one with 5 classes.]


Describing the Shape of a Histogram

Modality, symmetry, and skew.

Mode: Peak/peaks of the histogram
• Unimodal: one peak
• Bimodal: two peaks
• Multimodal: two or more peaks

Tails: The distribution is
• Left-skewed if the left tail is longer than the right tail.
• Right-skewed if the left tail is shorter than the right tail.
• Symmetric if the left and right tails are approximately equal (mirror images). If this is NOT the case, it is asymmetric.

How would we describe the shape of the distribution?


Boxplots

Section 2.7 & 2.6

Terminology
• PERCENTILE: the pth percentile is a value such that p% of the observations fall below (or at) that value and (100-p)% fall above (or at) that value.

• QUARTILES (Q) divide the distribution into four parts.
   o Q1 (Q(.25)) divides the lower 25% from the upper 75% of the distribution.
   o Q2 divides the lower 50% of the distribution from the upper 50% of the distribution (the median of the entire data set).
   o Q3 (Q(.75)) divides the lower 75% from the upper 25% of the distribution.


Terminology
• INTERQUARTILE RANGE: describes the middle 50% of data.
   o Robust measure of variability (resistant to extreme values)
   o IQR = Q3 - Q1

• FIVE-NUMBER SUMMARY includes:
   Minimum, Q1, Median (Q2), Q3, Maximum

Terminology
• Outlier: a data point that differs markedly from the rest of the data.

• A data point is an outlier if it falls outside the fences:
   o Data point < Lower Fence = $Q_1 - 1.5 \times \mathrm{IQR}$
   o Data point > Upper Fence = $Q_3 + 1.5 \times \mathrm{IQR}$

• An asterisk (*) or a dot represents an outlier on a boxplot.

STAT 205 26

• Example: The pulse rates (beats/min) of 12 college students were measured. Here are the data arranged in order:
62 64 68 70 70 74 74 76 76 78 78 80

Find the five-number summary and IQR.
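
One way to check a hand calculation is a short Python sketch like the one below. It is not from the slides, and it assumes the median-of-halves convention for quartiles (Q1 is the median of the lower half of the ordered data, Q3 the median of the upper half). Software packages use several different quartile conventions, so other tools may report slightly different Q1 and Q3. The function names are illustrative.

```python
def median(sorted_values):
    """Median of an already sorted list: middle value, or mean of the two middle values."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # observations below the median position
    upper = data[half + (n % 2):]   # observations above it (middle value skipped if n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])


pulse = [62, 64, 68, 70, 70, 74, 74, 76, 76, 78, 78, 80]
minimum, q1, med, q3, maximum = five_number_summary(pulse)
iqr = q3 - q1
print(minimum, q1, med, q3, maximum, iqr)
```

Under this convention the pulse data give Q1 = 69 and Q3 = 77, the same quartiles the slides use for this example.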

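Boxplot for Data with No Outliers

A boxplot is a graph of the 5-number summary.

[Figure: boxplot marked, from left to right, with Minimum, Q1, Median, Q3, and Maximum. Each of the four segments between them holds 25% of the data, and the box from Q1 to Q3 spans the IQR.]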


Constructing a Boxplot with Outliers

Upper inner fence = Q3 + 1.5 (IQR)
Lower inner fence = Q1 - 1.5 (IQR)

If there are outliers, the whisker is drawn to the smallest or largest value that is not an outlier, and a special character is drawn to denote the outliers. (See page 50 of the text)

[Figure: boxplot annotated with Q1, Q2, Q3, the IQR, and the upper and lower inner fences.]

• Example: The pulses of 12 college students were measured. Here are the data arranged in order:
62 64 68 70 70 74 74 76 76 78 78 80

Are there any outliers?
Data point < Lower Fence = $Q_1 - 1.5 \times \mathrm{IQR}$
Data point > Upper Fence = $Q_3 + 1.5 \times \mathrm{IQR}$

$Q_1 = \dfrac{68 + 70}{2} = 69$, $Q_3 = \dfrac{76 + 78}{2} = 77$
$\mathrm{IQR} = 77 - 69 = 8$
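
The outlier check for the pulse data can be sketched in Python as follows. This is an illustration rather than part of the slides; it takes Q1 = 69 and Q3 = 77 from the example as given, and the name `find_outliers` is hypothetical.

```python
def find_outliers(values, q1, q3):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # A value is flagged only if it lies strictly outside the fences.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers


pulse = [62, 64, 68, 70, 70, 74, 74, 76, 76, 78, 78, 80]
lower_fence, upper_fence, outliers = find_outliers(pulse, q1=69, q3=77)
print(lower_fence, upper_fence, outliers)
```

With an IQR of 8 the fences are 57 and 89. Every pulse rate lies between them, so no point is flagged and the whiskers run to the minimum (62) and maximum (80).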


Boxplot from R

5-number summary: 62, 69, 74, 77, 80

[Figure: box plot of the pulse data drawn in R, with the IQR marked.]


Distribution Shape and The Boxplot

[Figure: three boxplots labeled Left-Skewed, Symmetric, and Right-Skewed, each marked with Q1, Q2, and Q3.]

DESCRIPTIVE STATISTICS:
MEASURES OF CENTER
Section 2.3


Definitions
• Statistic: a numerical measure that is calculated from the sample data.

• Parameter: a numerical measure that is calculated from the population data.

Measures of Central Tendency

• Measures of center are used to describe the center or location of the data.

• Three commonly used measures
   o Mean
   o Median
   o Mode


Mean
The mean of a variable is computed by summing all the values of the variable in the data set and dividing by the number of observations.

The sample mean ($\bar{y}$) for a sample of size n is

$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{y_1 + y_2 + y_3 + \cdots + y_n}{n}$$

where
$y_i$ is the $i$th value of variable Y
$n$ is the sample size

*Population mean is denoted by $\mu$.

Median
Median: the middle value of the data set. (At most 50% of the data are greater than M and at most 50% of the data are less than M.)

Steps to calculate M:
o Order the n data values from smallest to largest.
o The observation in position $(n+1)/2$ in the ordered list is the median M.
o If $(n+1)/2$ is not a whole number, the median is the average of the two middle observations.


Mode
• The mode of a variable is the most frequent observation of the variable in the data set.

• If no observation occurs more often than the others, we say the data have no mode.

• Two modes: bimodal

Weight Gain of Lambs. The following are the 2-week weight gains (lb) of six young lambs of the same breed that had been raised on the same diet:
As recorded: 11 13 19 2 10 1
In order:    1 2 10 11 13 19

Find the mean, median, and mode of this dataset (by hand).
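
A hand calculation for the lamb data can be checked with a sketch like the one below, which is not part of the slides. The median uses the $(n+1)/2$ position rule. The `modes` helper is an assumption about how to read the "no mode" convention: it returns an empty list when every distinct value occurs equally often.

```python
from collections import Counter


def mean(values):
    """Sum of the values divided by the number of observations."""
    return sum(values) / len(values)


def median(values):
    """Median via the (n + 1) / 2 position rule on the ordered data."""
    data = sorted(values)
    position = (len(data) + 1) / 2       # 1-based position in the ordered list
    if position.is_integer():
        return data[int(position) - 1]
    below = data[int(position) - 1]      # observation just below the position
    above = data[int(position)]          # observation just above the position
    return (below + above) / 2


def modes(values):
    """Most frequent value(s); empty list if all distinct values tie (no mode)."""
    counts = Counter(values)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)


gains = [11, 13, 19, 2, 10, 1]
print(round(mean(gains), 2), median(gains), modes(gains))
```

For the six lambs, n + 1 = 7, so the position is 3.5 and the median is the average of the third and fourth ordered values (10 and 11). No gain repeats, so the data have no mode.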


What if...? Extreme Values

Weight Gain of Lambs. The following are the 2-week weight gains (lb) of six young lambs of the same breed that had been raised on the same diet:

1 2 10 11 13 19

• What if we add an observation of 100 pounds?

1 2 10 11 13 19 100

• ONE extreme value raised the mean from about 9.33 lb to about 22.29 lb, a change of about 12.95 lb, while the median moved only from 10.5 lb to 11 lb.
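
The effect of the added 100 lb observation can be reproduced with Python's standard `statistics` module. This is a small illustrative sketch, not something the slides provide.

```python
from statistics import mean, median

gains = [1, 2, 10, 11, 13, 19]
with_extreme = gains + [100]   # the extra 100 lb observation from the example

for label, data in (("original", gains), ("with 100 lb", with_extreme)):
    print(f"{label:<12} mean = {mean(data):6.2f}   median = {median(data):5.1f}")

# How far each measure of center moved after adding the extreme value.
mean_shift = mean(with_extreme) - mean(gains)
median_shift = median(with_extreme) - median(gains)
print(f"mean shift = {mean_shift:.2f}, median shift = {median_shift:.2f}")
```

The mean shifts by roughly 13 lb while the median shifts by half a pound, which is the contrast the example is meant to show.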

Extreme Values
• The MEAN is STRONGLY AFFECTED by extreme values.

• The MEDIAN is less sensitive than the mean to extreme values.
   o Because the median is not affected by large outlying values as much as the mean, we say it is robust.
   o Extreme values pull the mean toward the longer tail, that is, in the direction of the skew.


Shapes of Distributions

Which To Use?
The most appropriate measure of central tendency depends on the data set:

• Approximately symmetric and unimodal: mean

• Skewed: median

• Categorical: mode


MEASURES OF DISPERSION
Sections 2.4 & 2.6

Measures of Variation
• Measures of dispersion give us an idea about the spread of a distribution: are the observations all nearly equal, or do they differ substantially from each other?

• Measures of Dispersion
   o Range
   o Standard deviation & Variance
   o IQR


Range
• Simplest measure of variation.
• RANGE = largest value – smallest value
• Does not consider how the values cluster or distribute between the extremes.
Example: The data below represent the waiting times at a local urban outpatient facility. Waiting time is measured from the time the patient registered to the time he or she received the care service. Data were collected for a sample of 10 patients. Determine the range.

Values  29  31  35  39  39  40  43  44  44  52
Ranks    1   2   3   4   5   6   7   8   9  10

Variance & Standard deviation

• Common measure of the spread of values in a distribution.
• Shows variation about the mean.

The sample variance ($s^2$) is

$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$

The sample standard deviation ($s$) is

$$s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}}$$

**$s$ is measured in the same units as the data.

where
$\bar{y}$ is the sample mean
$y_i$ is the $i$th value of variable Y
$n$ is the sample size
deviation $= y_i - \bar{y}$ (difference between the observation and the sample mean)


Example: Standard Deviation

The data below represent the waiting times at a local urban outpatient facility. Waiting time is measured from the time the patient registered to the time he or she received the care service. Data were collected for a sample of 10 patients. Compute the sample standard deviation.

$\bar{y} = 39.6$, $n = 10$

$y_i$    $y_i - \bar{y}$          $(y_i - \bar{y})^2$
39       39 − 39.6 = −0.6         (−0.6)² = 0.36
29       29 − 39.6 = −10.6        (−10.6)² = 112.36
43       43 − 39.6 = 3.4          (3.4)² = 11.56
52       52 − 39.6 = 12.4         (12.4)² = 153.76
39       39 − 39.6 = −0.6         (−0.6)² = 0.36
44       44 − 39.6 = 4.4          (4.4)² = 19.36
40       40 − 39.6 = 0.4          (0.4)² = 0.16
31       31 − 39.6 = −8.6         (−8.6)² = 73.96
44       44 − 39.6 = 4.4          (4.4)² = 19.36
35       35 − 39.6 = −4.6         (−4.6)² = 21.16
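
The remaining steps (sum the squared deviations, divide by n − 1, take the square root) can be sketched in Python as below. This is an illustration rather than the slides' own solution; the result is cross-checked against `statistics.stdev`, which also uses the n − 1 divisor.

```python
import math
import statistics

waiting = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    y_bar = sum(values) / n
    squared_devs = [(y - y_bar) ** 2 for y in values]
    return sum(squared_devs) / (n - 1)


s2 = sample_variance(waiting)
s = math.sqrt(s2)
print(f"variance = {s2:.2f}, standard deviation = {s:.2f}")

# Cross-check against the standard library's sample standard deviation.
assert math.isclose(s, statistics.stdev(waiting))
```

The squared deviations in the table sum to 412.4, so the variance is 412.4 / 9, about 45.82, and the standard deviation is about 6.77, in the same units as the waiting times.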

INTERQUARTILE RANGE:
• Describes the middle 50% of data.
• IQR = Q3 - Q1

Example: Data were collected for a sample of 10 patients. Compute the IQR.
Values  29  31  35  39  39  40  43  44  44  52
Ranks    1   2   3   4   5   6   7   8   9  10


Extreme Values
• Range and Standard Deviation are AFFECTED by extreme values.

• IQR is less sensitive than the range and standard deviation to extreme values.
   o Because the IQR is not affected by large outlying values, we say it is robust.
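
To see this robustness numerically, the sketch below compares the three measures of spread for the waiting-time data before and after one value is made extreme. The altered value (152 in place of 52) is hypothetical and not from the slides, and the quartiles again assume the median-of-halves convention.

```python
import statistics


def iqr(values):
    """IQR with Q1 and Q3 taken as medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = statistics.median(data[:half])
    q3 = statistics.median(data[half + (n % 2):])
    return q3 - q1


def spread(values):
    """Range, sample standard deviation, and IQR of a data set."""
    return {
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),
        "iqr": iqr(values),
    }


waiting = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]
# Hypothetical change for illustration: the longest wait becomes 152 minutes.
with_extreme = waiting[:-1] + [152]

original = spread(waiting)
changed = spread(with_extreme)
for name in ("range", "sd", "iqr"):
    print(f"{name:<6} {original[name]:8.2f} -> {changed[name]:8.2f}")
```

The range and standard deviation grow sharply, while the IQR is unchanged because the middle 50% of the data did not move.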

The Empirical Rule

• For unimodal, approximately symmetric distributions (think bell-shaped), we are able to use the Empirical Rule:

• About 68% of observations are within one standard deviation of the mean (in either direction): $\bar{y} \pm 1s$
• About 95% of observations are within two standard deviations of the mean (in either direction): $\bar{y} \pm 2s$
• About 99.7% of observations are within three standard deviations of the mean (in either direction): $\bar{y} \pm 3s$


Illustration of the Empirical Rule

Example
The Health and Nutrition Examination Study of 1976-1980 (HANES) studied the heights of adults (aged 18-24). The distribution of heights is bell-shaped with the following means and standard deviations:
Women   Mean ($\bar{y}$): 65.0 inches   Standard deviation ($s$): 2.5 inches
Men     Mean ($\bar{y}$): 70.0 inches   Standard deviation ($s$): 2.8 inches

Find the intervals of the Empirical Rule for the men.

Approximately 68%:

Approximately 95%:

Approximately 99.7%:

[Figure: bell curve of men's heights with the horizontal axis marked at 61.6, 64.4, 67.2, 70, 72.8, 75.6, and 78.4 inches.]
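
The three intervals follow directly from mean ± k standard deviations for k = 1, 2, 3. A minimal Python sketch (not from the slides; the function name is illustrative) for the men's heights:

```python
def empirical_rule_intervals(mean, sd):
    """Return {coverage label: (low, high)} for mean +/- 1, 2, and 3 standard deviations."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * sd, mean + k * sd) for k, label in coverage.items()}


men = empirical_rule_intervals(mean=70.0, sd=2.8)
for label, (low, high) in men.items():
    print(f"About {label} of men: {low:.1f} to {high:.1f} inches")
```

The endpoints match the values marked on the horizontal axis of the bell curve: 67.2 and 72.8, 64.4 and 75.6, and 61.6 and 78.4 inches. Calling the same function with mean 65.0 and standard deviation 2.5 gives the women's intervals.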


Summary
 The End!!
