Measures of Central Tendency & Variation

The document discusses descriptive statistics, focusing on histograms and their role in visualizing data distributions, including concepts like central tendency, dispersion, and outliers. It explains the differences between histograms and bar charts, and provides examples of various distribution shapes such as normal, skewed, and bimodal distributions. Additionally, it covers measures of central tendency, including mean, median, and mode, and their applications in summarizing data.

Slide

4-1

Descriptive Statistics
Slide
4-2

Histograms
Looking at the Distribution of the
Data
Slide
4-3 Histogram
• A Picture of a list of numbers
Data: 11, 15, 8, 26, 10, 5, 15
[Figure: histogram of the data; x-axis: Data value (0 to 30), y-axis: Frequency (0 to 4)]
• BARS ARE HIGH when many elementary units fall within this range
• Shows typical value (center), dispersion (variability), distribution shape, outliers (if any)
Slide
4-4 Histogram
• A Picture of a list of numbers
Data: 11, 15, 8, 26, 10, 5, 15
[Figure: histogram of the data, annotated "Normal distribution"; x-axis: Data value (0 to 30), y-axis: Frequency (0 to 4)]
• BARS ARE HIGH when many elementary units fall within this range
• Shows typical value (center), dispersion (variability), distribution shape, outliers (if any)
Slide
4-5 Stem-&-Leaf Diagram (Histogram)
• Columns (or rows) of numbers form histogram
bars
• Here, the data value “15” is recorded as a “5” in
the “10” column
Data: 11, 15, 8, 26, 10, 5, 15
          5
          0
    5     5
    8     1     6
    0    10    20    30
Slide
4-6 Histogram and Bar Chart
• Histogram is a bar chart of the frequencies of the
data
– Histogram: bar height represents number of cases
within the range & bar width represents class interval
– Ordinary bar chart: bar height represents data value for
just one case & bar width doesn’t have any meaning
– Histogram is used for classified quantitative data; bar
chart is used for qualitative (categorical) data
• Histogram shows overall distribution
– Histogram: the “big picture” of patterns in the data
– Ordinary bar chart: often too much detail (each
individual case)
Slide
4-7
Example 2
• Consider the following sales data in thousands
of shillings. Form a frequency distribution with
5 classes and construct a histogram for the
data.
29 44 12 53 21 34 39 25 48 23
17 24 27 32 34 15 42 21 28 37
Slide
4-8
29 44 12 53 21 34 39 25 48 23
17 24 27 32 34 15 42 21 28 37
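The frequency distribution asked for in Example 2 could be tabulated along the following lines. This is only a sketch: the slides do not state the class limits, so the five classes used here (width 10, starting at 10, upper limit excluded) and the function name `frequency_distribution` are assumptions.

```python
# Sales data (thousands of shillings) from Example 2.
sales = [29, 44, 12, 53, 21, 34, 39, 25, 48, 23,
         17, 24, 27, 32, 34, 15, 42, 21, 28, 37]

def frequency_distribution(data, start, width, n_classes):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = [0] * n_classes
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    classes = [(start + i * width, start + (i + 1) * width)
               for i in range(n_classes)]
    return list(zip(classes, counts))

# Assumed classes: 10-20, 20-30, 30-40, 40-50, 50-60 (upper limit excluded).
table = frequency_distribution(sales, start=10, width=10, n_classes=5)
for (lower, upper), count in table:
    # The row of '#' characters is a rough text histogram.
    print(f"{lower:>2} to under {upper:<2} | {count:>2} | {'#' * count}")
```

A different choice of starting point or class width would give a different, equally valid table.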
Slide
4-9
Example 2
• Consider the following sales data in thousands
of shillings. Plot a stem & Leaf diagram.

29 44 12 53 21 34 39 25 48 23
17 24 27 32 34 15 42 21 28 37
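One possible way to build the stem-and-leaf diagram for these data is sketched below. It assumes the tens digit is the stem and the ones digit is the leaf; the helper name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Sales data (thousands of shillings) from Example 2.
sales = [29, 44, 12, 53, 21, 34, 39, 25, 48, 23,
         17, 24, 27, 32, 34, 15, 42, 21, 28, 37]

def stem_and_leaf(data):
    """Group each value's ones digit (leaf) under its tens digit (stem)."""
    stems = defaultdict(list)
    for x in sorted(data):
        stems[x // 10].append(x % 10)
    return dict(stems)

display = stem_and_leaf(sales)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

Each printed row plays the role of one histogram bar: the longer the row, the more values fall in that range.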
Slide
4-10
Slide
4-11 Distribution Shapes (Ideal)
• Normal
– Symmetric
– Bell-Shaped
• Skewed
– Not symmetric
– Can cause trouble
– Transform? Logarithm?
• Bimodal
– Two clear groups
– Find out why!
– Analyze separately?
Slide
4-12 Idealized Normal Distributions
• Can shift center, width (diversity) of distribution
• In idealized form, without the randomness of data
Slide
4-13 Data from a Normal Distribution
• All are sampled from the same idealized normal
distribution. Note the random differences.
[Figure: four histograms of samples drawn from the same normal distribution; x-axis: 60 to 140, y-axis: Frequency (0 to 30)]
Slide
4-14
Fig 3.2.1
Example: Mortgage Interest Rates
• Values from about 5.7% to 6.6%
• Typical: from about 6.2% to 6.4%
• Diversity among institutions
• Special features: gap just below 6.5%, some low rates
[Figure: histogram of mortgage interest rates; x-axis: Interest rate (5.5% to 7.0%), y-axis: Frequency (lenders)]
Slide
4-15 Idealized Skewed Distributions
• Not symmetric
• Various shapes are possible
• In idealized form, without the randomness of data
Slide
4-16 Example: Commercial Bank Assets
Fig 3.4.2

• Most banks are smaller: tall bars at the left
• A few banks are larger (to the right)
• A skewed distribution
[Figure: histogram of commercial bank assets; x-axis: Bank assets ($ billions, 0 to 500), y-axis: Frequency (banks)]
Slide
4-17 Bimodal Distribution
Fig 3.5.1

• Two distinct groups in the data (ask “why?”)
• Example: yields of money market funds
– Tax-exempt funds pay a lower rate
– Taxable funds generally pay more
[Figure: histogram of money market fund yields; x-axis: Yield (2% to 6%), y-axis: Frequency (funds)]
Slide
4-18 Outlier
• A data value very different from the others
• Difficult to see distribution of most of the data,
even after changing histogram scale

Defects: 11, 19, 23, 15, 18, 19, 13, 268, 25, 9
[Figure: two histograms of the defects data drawn at different scales; x-axis: 0 to 300, y-axis: Frequency]
Slide
4-19 Outlier: What to Do?
• Note the outlier. If error, then fix it
• (Perhaps) analyze with and without outlier(s)
– If similar answers, then no problem
• OK to omit outlier(s) IF not part of situation
under study
– e.g., Lab analysis, dropped test tube
• OK to omit, if studying normal operation, not laboratory
accidents
– e.g., Statistical audit, “special occurrence” error
• Use care. Such an error in a sample may represent other
“explainable” errors in accounts that were not examined
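The advice to analyze with and without the outlier can be tried on the defects data from the outlier slide. The sketch below assumes the ten values are read correctly from that slide, with 268 as the suspected outlier.

```python
from statistics import mean, median

# Defects data as read from the outlier slide; 268 is the suspected outlier.
defects = [11, 19, 23, 15, 18, 19, 13, 268, 25, 9]
without_outlier = [x for x in defects if x != 268]

summary = {
    "with outlier": (mean(defects), median(defects)),
    "without outlier": (mean(without_outlier), median(without_outlier)),
}
for label, (avg, med) in summary.items():
    print(f"{label}: average = {avg:.2f}, median = {med}")
```

Here the two averages differ a great deal while the two medians barely move, so the answers are not similar and the outlier deserves a closer look before it is kept or omitted.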
Slide
4-20 Example: TV Advertising
Fig 3.6.5

• One advertiser (Regal Communications) had increased TV spending 2,353.7%
[Figure: histogram; x-axis: Percent Increase in Syndicated TV Spending (0% to 2,000%), y-axis: Frequency (Advertisers)]
Slide
4-21 Data Mining Promotions Received
Fig 3.6.5

• Number of promotions received by 20,000 people in the donations database
[Figure: histogram; x-axis: Promotions (0 to 200), y-axis: Number of people (0 to 3,000)]
Slide
4-22 More Detail in Promotions
Fig 3.6.5

• Reduce bar width from 10 to 1 promotion
• With large data set, can see interesting structure
– such as the peak at about 15 promotions
[Figure: histogram with bar width 1; x-axis: Promotions (0 to 180), y-axis: Number of people (0 to 600)]
Slide
4-23 Data Mining Donations
Fig 3.6.5

• Size of donation received in response to mailing
• Note: many donations of $0 among these 20,000
– Difficult to see anything else! (six donated $100)
[Figure: histogram; x-axis: Donation ($0 to $120), y-axis: Number of people (0 to 20,000)]
Slide
4-24 More Detail in Donations
Fig 3.6.5

• Keep only the 989 who donated (eliminate $0)
– to see detail among those who made a gift
• Can now see the distribution of the gift amounts
[Figure: histogram; x-axis: Donation ($0 to $120), y-axis: Number of people (0 to 300)]
Slide
4-25 Even More Detail in Donations
Fig 3.6.5

• With so much data (989 people)
– we can use smaller bars to see more details
• Note the “spikes” at $5, $10, $15, $20, $25, and $50
[Figure: histogram with narrower bars; x-axis: Donation ($0 to $120), y-axis: Number of people (0 to 200)]
Slide
4-26

Numerical Descriptive
Measures
Landmark Summaries:
Interpreting Typical Values and
Percentiles
Slide
4-27 Numerical Descriptive
• Large data sets can often be adequately
described by just a few numbers
– Parameters describe populations
– Statistics describe samples
• Types of descriptive measures: measures of
– Central tendency
– Dispersion
– Shape
– Relationships
Slide
4-28

Measures of Central
Tendency
Landmark Summaries:
Interpreting Typical Values and
Percentiles
Slide
4-29 Measures of Central Tendency
• Also referred to as averages
• An average is a single value, which is
considered as the most representative (or
typical) value for a given set of data
• Are used to give an impression of the size of all
items in a given set of data
Slide
4-30

• Objectives
– To get one single value that describes the characteristic
of the entire data
– By condensing the mass of data in one single value, we
get an idea of the entire data; easy to remember and
figure out
– To facilitate comparison; by reducing the mass of data
in one single figure comparison is made possible -
comparison can be made either at a point in time or
over a period of time
Slide
4-31 Average or Mean
• Add the data, divide by n or N (the number of elementary units)
– Sample average: $\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}$
– Population average: $\mu = \dfrac{X_1 + X_2 + \cdots + X_N}{N}$
• Divides total equally. The only such summary
• A representative, central number (if data set is approximately normal)
• Summation notation: $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$ and $\mu = \dfrac{1}{N}\sum_{i=1}^{N} X_i$
– Σ is capital Greek sigma
Slide
4-32
Fig 4.1.1
Example: Number of Defects
• Defects measured for each of 10 production lots
4, 1, 3, 7, 3, 0, 7, 14, 5, 9

[Figure: histogram; x-axis: Defects per lot (0 to 20), y-axis: Frequency (lots)]
Average is 5.3 defects per lot (53 defects / 10 lots)
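The "add the data, divide by n" rule can be checked directly on the ten lots. A minimal sketch follows; for the ten values exactly as listed on the slide, the total is 53.

```python
# Defects per lot, as listed on the slide.
defects = [4, 1, 3, 7, 3, 0, 7, 14, 5, 9]

def average(values):
    """Sum the data and divide by the number of elementary units."""
    return sum(values) / len(values)

total = sum(defects)
avg = average(defects)
print(f"n = {len(defects)}, total = {total}, average = {avg}")

# "Divides total equally": n lots at the average reproduce the total.
print(avg * len(defects))
```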
Slide
4-33 Median
• Also summarizes the data
• The middle one
– Put data in order
– Pick middle one (or average middle two if n is even)
– Median (9, 4, 5) = Median (4, 5, 9) = 5
– Median (9, 4, 5, 7) = Median (4, 5, 7, 9) = (5+7)/2 = 6
• Rank of the median is (1+n)/2
– If n=3, rank is (1+3)/2 = 2
– If n=4, rank is (1+4)/2 = 2.5 (so average 2nd and 3rd)
– If n=262, rank is (1+262)/2 = 131.5
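The rank rule can be written out as a short sketch. The function name `median_by_rank` is illustrative; it returns both the rank (1+n)/2 and the median, averaging the two neighbouring values when the rank ends in .5.

```python
import math

def median_by_rank(values):
    """Median via the rank rule: rank = (1 + n) / 2 in the sorted data."""
    ordered = sorted(values)
    rank = (1 + len(ordered)) / 2
    lower = ordered[math.floor(rank) - 1]   # ranks are 1-based
    upper = ordered[math.ceil(rank) - 1]    # same element when rank is whole
    return rank, (lower + upper) / 2

print(median_by_rank([9, 4, 5]))      # rank 2.0
print(median_by_rank([9, 4, 5, 7]))   # rank 2.5, average of 2nd and 3rd
```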
Slide
4-34 Median (continued)
• A representative, central number
– If data set has a center
• Less sensitive to outliers than the average
• For skewed data, represents the “typical case”
better than the average does
– e.g., incomes
• Average income for a country equally divides the total, which
may include some very high incomes
• Median income chooses the middle person (half earn less, half
earn more), giving less influence to high incomes (if any)
Slide
4-35 Example: Spending
• Customers plan to spend ($thousands)
3.8, 1.4, 0.3, 0.6, 2.8, 5.5, 0.9, 1.1
• Rank ordered from smallest to largest
0.3, 0.6, 0.9, 1.1, 1.4, 2.8, 3.8, 5.5
1 2 3 4 5 6 7 8 (ranks)
• Rank of median = (1+8)/2 = 4.5
• Median is (1.1+1.4)/2 = 1.25
– Smaller than the average, 2.05
• Due to slight skewness?
[Figure: stem-and-leaf display of the data (leaf = tenths digit) over stems 0 to 5, with the median and average marked]
Slide
4-36
Fig 4.1.2
Example: The Crash of 1987
• Dow-Jones Industrials, stock-price changes as
each stock began trading that fateful morning
• Fairly normal
• Mean and median are similar
[Figure: histogram; x-axis: Percent change at opening (-20% to 0%), y-axis: Frequency]
Median = -8.6%
Average = -8.2%
Slide
4-37
Fig 4.1.3
Example: Incomes
• Personal income of 100 people
• Average is higher than median due to skewness

[Figure: histogram; x-axis: Income ($0 to $200,000), y-axis: Frequency (0 to 50)]
Average = $38,710
Median = $27,216
Slide
4-38 Mode
• Also summarizes the data
• Most common data value
– Middle of tallest histogram bar
• Problems:
– Depends on how you draw histogram (bin width)
– Might be more than one mode (two tallest bars)
• Good if most data values are “correct”
• Good for nominal data (e.g., elections)
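A small sketch of finding the mode by counting, which also handles the "more than one mode" case. The vote data and candidate names are hypothetical, invented only to show nominal data; they do not come from the slides.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count (there may be several)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)

# Hypothetical nominal data (not from the slides): votes for candidates.
votes = ["Amina", "Ben", "Amina", "Chen", "Ben", "Amina"]
print(modes(votes))             # a single mode
print(modes([1, 2, 2, 3, 3]))   # two values tie, so two modes
```

Counting works on names as well as numbers, which is why the mode is usable for nominal data.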
Slide
4-39 Normal Distribution
• Average, median, and mode are identical
– If the data come from a normal distribution
[Figure: normal curve; the average, median, and mode coincide at the center]
Slide
4-40 Skewed Distribution
• Average, median, and mode are different
– The few large (or small) values influence the mean
more than the median
– The highest point is not in the center

[Figure: skewed distribution with the average, median, and mode marked at different positions]
Slide
4-41 Which summary to use?
• Average
– Best for normal data
– Preserves totals
• Median
– Good for skewed data or data with outliers, provided
you do not need to preserve or estimate total amounts
• Mode
– Best for categories (nominal data).
– The mode is the only summary computable for nominal
data!
Slide
4-42 Which Summary? (continued)
• Average requires quantitative data (numbers)
• Median works with quantitative or ordinal
• Mode works with quantitative, ordinal, or nominal

            Quantitative   Ordinal   Nominal
Average     Yes            -         -
Median      Yes            Yes       -
Mode        Yes            Yes       Yes
Slide
4-43 Weighted Average
• Ordinary average gives same weight to all
elementary units
$\bar{X} = \dfrac{1}{n}X_1 + \dfrac{1}{n}X_2 + \cdots + \dfrac{1}{n}X_n$
• Weighted average allows different weights
$\bar{X} = w_1 X_1 + w_2 X_2 + \cdots + w_n X_n$
• Weights must add up to 1
$w_1 + w_2 + \cdots + w_n = 1$
– If not, then divide each by their total
Slide
4-44 Weighted Average (continued)
• Average is per elementary unit
– The average of your course grades is your “average per
course”
• Weighted average is per unit of weight
– Your GPA (grade point average) is a weighted average,
using credit hours to define the weights. The weighted
average is your “average per credit hour”
Slide
4-45 Example: Portfolio Rate of Return
• Portfolio expected return (an interest rate,
indicating performance) is the weighted average
of the expected rates of return of assets in the
portfolio, weighted by $dollars invested

• Portfolio contains three stocks. One ($1,000 invested) is expected to return 20%. Another ($1,800 invested) expects 15%. Third is $2,200 and 30%.
• Total invested is 1,000+1,800+2,200 = $5,000
Slide
4-46 Example (continued)
• Weights are
w1 = $1,000/$5,000 = 0.20
w2 = $1,800/$5,000 = 0.36
w3 = $2,200/$5,000 = 0.44

• Weighted average is
0.20(20%) + 0.36(15%) + 0.44(30%) = 22.6%
– The expected return for the portfolio.
– Each stock is represented in proportion to $ invested
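The portfolio calculation can be reproduced with a short sketch. The function name `weighted_average` is illustrative; it divides each weight by the total so that dollar amounts can be passed in directly.

```python
def weighted_average(values, weights):
    """Weighted average; weights are divided by their total so they sum to 1."""
    total = sum(weights)
    return sum(v * (w / total) for v, w in zip(values, weights))

invested = [1000, 1800, 2200]     # dollars invested in each stock
expected_return = [20, 15, 30]    # expected return of each stock, in percent

portfolio_return = weighted_average(expected_return, invested)
print(f"Portfolio expected return: {portfolio_return:.1f}%")
```

Passing equal weights to the same function gives the ordinary average, which is the special case where every weight is 1/n.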
Slide
4-47 Percentiles
• Landmark summaries in the same measurement
units as the data
– e.g., dollars, people, miles per gallon, …
• Some familiar percentiles
– Smallest data value is 0th percentile
– Median is 50th percentile
– Largest data value is 100th percentile
– 90th percentile is larger than 90% of elementary units
• Finding percentiles
– Difficult to see from histogram
– Easy using CDF (Cumulative Distribution Function)
Slide
4-48 Cumulative Distribution Function
• Data axis horizontally (as in histogram)
• Cumulative percent vertically
• Equal vertical jump at each data value
0.3, 0.6, 0.9, 1.1, 1.4, 2.8, 3.8, 5.5

[Figure: CDF step plot; x-axis: Spending ($0 to $6), y-axis: Cumulative Percent (0% to 100%), with the 80% level marked]
80th percentile is $3.80
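Reading a percentile off the CDF can be sketched as follows. This assumes one simple convention, taking the smallest data value whose cumulative percent reaches the requested level; other percentile conventions exist, and the function names are illustrative.

```python
spending = [0.3, 0.6, 0.9, 1.1, 1.4, 2.8, 3.8, 5.5]  # $thousands

def cdf_points(values):
    """Sorted data paired with cumulative percent; each value adds 100/n."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, 100 * (i + 1) / n) for i, x in enumerate(ordered)]

def percentile_from_cdf(values, p):
    """Smallest data value whose cumulative percent reaches p."""
    for x, cumulative in cdf_points(values):
        if cumulative >= p:
            return x
    raise ValueError("p must not exceed 100")

for x, cumulative in cdf_points(spending):
    print(f"{x:>4} -> {cumulative:5.1f}%")
print(percentile_from_cdf(spending, 80))
```

With 8 values each jump is 12.5%, so the 80% level is first reached at the seventh value.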
Slide
4-49 Five-Number Summary
• Selected landmarks to represent entire data set
– Median = 50th percentile
– Quartiles
• LQ = Lower Quartile = 25th percentile
– Rank = $\dfrac{1 + \operatorname{int}(\text{rank of median})}{2} = \dfrac{1 + \operatorname{int}\left(\frac{1+n}{2}\right)}{2}$
– int discards the decimal, if any: int(10.5)=10, int(35)=35
• UQ = Upper Quartile = 75th percentile
– Rank is n+1–[rank of lower quartile]
– Extremes
• Smallest = 0th percentile
• Largest = 100th percentile
Slide
4-50 Five-Number Summary (continued)
• Provides information about
– Central summary
• Median
– Range of the data
• Largest – smallest
– “Middle half” of the data
• From LQ to UQ
– Skewness
• If median is not approximately half way between quartiles
Slide
4-51 Box Plot
• Displays five-number summary
[Figure: box plot on a 0 to 8 scale marking Smallest, Lower Quartile, Median, Upper Quartile, and Largest; the box spans the middle half of the data]
• Less detail than histogram
– Easier to compare many groups
Slide
4-52 Example: Spending
• Spending rank ordered from smallest to largest
0.3, 0.6, 0.9, 1.1, 1.4, 2.8, 3.8, 5.5
1 2 3 4 5 6 7 8 (ranks)
• Rank of median = (1+8)/2 = 4.5
• Rank of LQ = (1+4)/2 = 2.5, where 4 = int(4.5)
• Rank of UQ = 8+1-2.5 = 6.5

• LQ is (0.6+0.9)/2 = 0.75
• UQ is (2.8+3.8)/2 = 3.3
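The rank rules for the quartiles can be collected into one sketch that returns the whole five-number summary. The function names are illustrative; the rank formulas are the ones given on the slides.

```python
import math

def value_at_rank(ordered, rank):
    """Value at a 1-based rank; a rank ending in .5 averages its neighbours."""
    lower = ordered[math.floor(rank) - 1]
    upper = ordered[math.ceil(rank) - 1]
    return (lower + upper) / 2

def five_number_summary(values):
    ordered = sorted(values)
    n = len(ordered)
    median_rank = (1 + n) / 2
    lq_rank = (1 + int(median_rank)) / 2   # int() discards the decimal part
    uq_rank = n + 1 - lq_rank
    return (ordered[0],
            value_at_rank(ordered, lq_rank),
            value_at_rank(ordered, median_rank),
            value_at_rank(ordered, uq_rank),
            ordered[-1])

spending = [3.8, 1.4, 0.3, 0.6, 2.8, 5.5, 0.9, 1.1]  # $thousands
smallest, lq, med, uq, largest = five_number_summary(spending)
print(smallest, lq, med, uq, largest)
```

Statistical packages often use slightly different quartile conventions, so their results may not match this rule exactly.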
Slide
4-53 Example: Spending (continued)
• Five-number summary
0.3, 0.75, 1.25, 3.3, 5.5
Smallest, LQ, Median, UQ, Largest
• Box plot

[Figure: box plot; x-axis: Spending ($thousands, 0 to 5)]
– Shows some skewness (lack of symmetry)
Slide
4-54 Identifying Outliers
• Outliers are defined as observations, if any, either:
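– More than UQ + 1.5 (UQ − LQ), or
– Less than LQ − 1.5 (UQ − LQ)
• Outliers are far from the center of the distribution
– and may be interesting as special cases
[Figure: number line with LQ and UQ marked; the box has width UQ − LQ, with a distance of 1.5(UQ − LQ) on each side; lower outliers lie beyond the left limit and upper outliers beyond the right limit]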
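The two outlier limits can be computed directly from the quartiles. In the sketch below the quartiles 0.75 and 3.3 are those of the spending example; the extra value 12.0 is hypothetical and is added only to show a flagged observation (the quartiles are held fixed for illustration).

```python
def outlier_limits(lq, uq):
    """Lower and upper limits: LQ - 1.5(UQ - LQ) and UQ + 1.5(UQ - LQ)."""
    spread = uq - lq
    return lq - 1.5 * spread, uq + 1.5 * spread

def find_outliers(values, lq, uq):
    """Observations below the lower limit or above the upper limit."""
    low, high = outlier_limits(lq, uq)
    return [x for x in values if x < low or x > high]

spending = [0.3, 0.6, 0.9, 1.1, 1.4, 2.8, 3.8, 5.5]  # $thousands
low, high = outlier_limits(0.75, 3.3)
print(f"limits: {low:.3f} to {high:.3f}")
print(find_outliers(spending, 0.75, 3.3))            # none in the spending data
print(find_outliers(spending + [12.0], 0.75, 3.3))   # 12.0 is hypothetical
```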
Slide
4-55
Fig 4.2.3
Example: Technology CEO Pay
• CEO compensation in technology companies
– Detailed box plot identifies outliers
• and identifies the most extreme non-outliers,
• gives more detail than the (ordinary) box plot

[Figure: detailed box plot and ordinary box plot of technology CEO pay; x-axis: $0 to $10,000,000; firms labeled in the detailed box plot: Apple Computer, AMD, IBM, Sun Microsystems]
Slide
4-56
Fig 4.2.3
Example: CEO Compensation
• Box plots to compare firms within industry groups
– Utilities group generally shows lower compensation
– Highest-paid are in Financial Services group

[Figure: box plots by industry group (Utilities, Technology, Financial, Energy); x-axis: $0 to $30,000,000]

Slide
4-57
Fig 4.2.3
CEO Compensation (continued)
• Detailed box plots (with outliers and most extreme
non-outliers named)

[Figure: detailed box plots by industry group (Utilities, Technology, Financial, Energy); x-axis: $0 to $30,000,000; labeled firms include GPU, Enron, Duke Energy, Apple Computer, AMD, IBM, Sun Microsystems, Berkshire Hathaway, Lehman Brothers, Merrill Lynch, Goldman Sachs, Citigroup, Morgan Stanley Dean Witter, Bear Stearns, Baker Hughes, and Phillips Petroleum]

Slide
4-58
Fig 4.2.4
Mining the Donations Database
• More frequent donors (top) tend to give smaller
current donation amounts (shift to left)
[Figure: box plots of current donation size, grouped by number of previous gifts in the past 2 years (top group: 4+); x-axis: Size of current donation ($0 to $100)]
Slide
4-59
Fig 4.2.9
Example: Business Failures
• Per million people, by state
• 90th percentile is 432.4
• 50th percentile is 260.2
[Figure: CDF; x-axis: Failures (0 to 700), y-axis: Cumulative Percent (0% to 100%)]
Slide
4-60
Fig 4.2.10
Example: Business Failures
• Compare histogram, box plot, and CDF
[Figure: three panels on a common Failures axis (0 to 500): Histogram, Box plot, and CDF (0% to 100%)]

GEOMETRIC MEAN
• The geometric mean (GM) of a set of n numbers is defined as the nth root of the product of the n numbers
• $GM = (X_1 \cdot X_2 \cdots X_n)^{1/n}$
• Here, geometric mean is used to average percentages, indexes, and relatives
• Example 1: it is known that the price of a commodity has risen by 6%, 13%, 11%, and 15% in each of 4 successive years
– Determine the GM (average) rise
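One way to work Example 1 is sketched below. The slide does not say how the percentages enter the formula, so this sketch makes an assumption: each yearly rise is first converted to a growth factor (6% becomes 1.06), the geometric mean of the factors is taken, and 1 is subtracted at the end.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    return math.prod(values) ** (1 / len(values))

# Assumption: each yearly rise is converted to a growth factor (6% -> 1.06).
rises = [0.06, 0.13, 0.11, 0.15]
factors = [1 + r for r in rises]

average_rise = geometric_mean(factors) - 1
print(f"Average yearly rise: {average_rise:.2%}")

# Check: four years at the average rise give the same overall price change.
print((1 + average_rise) ** 4, math.prod(factors))
```

Under this reading, the average rise is the constant yearly rate that would produce the same total price increase over the four years.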
• Example 2: Suppose a small firm had been growing over a 4-year period with its average number of employees per year given as 85, 97, 116, and 129
– Determine the GM (mean rise in employees per year)
• Another use of the geometric mean is to determine the average percent increase in sales, production or other business or economic series from one time period to another.
• The formula for this type of problem is:
$GM = \sqrt[n]{\dfrac{\text{value at end of period}}{\text{value at beginning of period}}} - 1$

Example: The total number of females enrolled in Kenyan Universities increased from 755,000 in 2002 to 835,000 in 2010.
The period 2002 to 2010 covers 9 years, so there are 9 − 1 = 8 yearly changes and the 8th root is taken.
$GM = \sqrt[8]{\dfrac{835{,}000}{755{,}000}} - 1 = 0.0127$
That is, the geometric mean rate of increase is 1.27%.
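The enrollment calculation can be checked with a few lines. The function name `average_growth_rate` is illustrative, and `periods` is the number of yearly changes between the first and last value (8 here).

```python
def average_growth_rate(start_value, end_value, periods):
    """Geometric mean rate of change per period between two values."""
    return (end_value / start_value) ** (1 / periods) - 1

rate = average_growth_rate(755_000, 835_000, periods=8)
print(f"Geometric mean rate of increase: {rate:.4f}")

# Check: growing 755,000 at this rate for 8 periods recovers the end value.
print(755_000 * (1 + rate) ** 8)
```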
Slide
4-65

Measures of Spread
Variability: Dealing with
Diversity
Slide
4-66 Variability: Introduction
• Also known as dispersion, spread, uncertainty,
diversity, risk
• It is the extent to which the values in a set of
observations differ from each other, i.e., it
describes the degree of spread in a distribution
• If all the values are similar, the dispersion is low; if
there is a wide range of different values, the
dispersion is high
• Measure of Dispersion
– Is a measure, which helps to describe the amount of
dispersion, spread, or variability in a set of observations
• Importance of Measuring Variations
Slide
4-67
– It indicates how far an average is representative
of the entire data, i.e., small variation means the average
is representative and vice versa
– To determine the nature and cause of variation in order
to control the variation itself
– To enable comparison to be made of two or more series
with regard to their variability i.e. a means of
determining uniformity or consistency; low degree of
variation means high uniformity or consistency
– To facilitate the use of other statistical measures. For
example, correlation analysis, testing of hypotheses, the
analysis of fluctuations etc. are all based on measures
of variation
Slide
4-68 Examples
• Stock market, daily change, is uncertain
– Not the same, day after day!
• Risk of a business venture
– There are potential rewards, but possible losses
• Uncertain payoffs and risk aversion
– Which would you rather have
• $1,000,000 for sure
• $0 or $2,000,000, each outcome equally likely
– Both have same average! ($1,000,000)
– Most would prefer the choice with less uncertainty
Slide
4-69 Types
• Variance & standard deviation
• Coefficient of Variation
• Range
• Inter Quartile Range
• Quartile deviation
Slide
4-70 Standard Deviation S
• Measures variability by answering:
– “Approximately how far from average are the data
values?” (same measurement units as the data)
– The square root of the average squared deviation
• (dividing by n-1 instead of n for a sample)
• For a sample
$S = \sqrt{\dfrac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}}$
• For a population
$\sigma = \sqrt{\dfrac{(X_1 - \mu)^2 + (X_2 - \mu)^2 + \cdots + (X_N - \mu)^2}{N}}$
Slide
4-71
Slide
4-72
Slide
4-73 Example: Spending
• Customers plan to spend ($thousands)
3.8, 1.4, 0.3, 0.6, 2.8, 5.5, 0.9, 1.1
• Average is 2.05. Sum of squared deviations is
(3.8–2.05)^2 + (1.4–2.05)^2 + … + (1.1–2.05)^2 = 23.34
• Divide by 8–1=7 and take square root:
$\sqrt{\dfrac{23.34}{7}} = \sqrt{3.334286} = 1.83$ = Standard deviation
• Customers plan to spend about 1.83 (thousand, i.e., $1,830) more or less than the average, 2.05.
– Some plan to spend more, others less than average
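The steps of this calculation (average, sum of squared deviations, divide by n−1, square root) can be traced in a short sketch. The function name `sample_std_dev` is illustrative.

```python
import math

spending = [3.8, 1.4, 0.3, 0.6, 2.8, 5.5, 0.9, 1.1]  # $thousands

def sample_std_dev(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(values)
    avg = sum(values) / n
    sum_sq = sum((x - avg) ** 2 for x in values)
    return avg, sum_sq, math.sqrt(sum_sq / (n - 1))

avg, sum_sq, s = sample_std_dev(spending)
print(f"average = {avg:.2f}, sum of squared deviations = {sum_sq:.2f}, S = {s:.2f}")
```

Dividing by n instead of n−1 in the last step would give the population formula.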
Slide
4-74 Example: Spending (continued)
• On the histogram
– Average is located near the center of the distribution
– Standard deviation is a distance away from the average
– Standard deviation is the typical distance from average

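[Figure: histogram of spending; x-axis: spending (0 to 7), y-axis: Frequency (0 to 3); the average X̄ = 2.05 is marked, with one standard deviation S = 1.83 on each side]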
Slide
4-75
Fig 5.1.3
Normal Distribution and Std. Dev.
• For a normal distribution only
• 2/3 of data within one standard deviation of the average
(either above or below)
• 95% for 2 std. devs.
• 99.7% for 3
[Figure: normal curve with intervals marked: 2/3 of the data within one standard deviation of the average, 95% of the data within two, 99.7% of the data within three]
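The percentages for a normal distribution can be computed rather than memorized. The sketch below uses the standard relationship between the normal distribution and the error function, P(|Z| < k) = erf(k/√2); the function name `share_within` is illustrative.

```python
import math

def share_within(k):
    """Proportion of a normal distribution within k std. devs. of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The value for one standard deviation is a little above 2/3, which is why the slide's "2/3" should be read as a convenient approximation.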


Slide
4-76 Skewed Distribution and Std. Dev.
• No simple rule for percentages within one, two, three standard deviations of the average
• Standard deviation retains its interpretation as the standard measure of typically how far the observations are from average

Slide
4-77 Example: Quality Control Charts
• Control limits are often set at
3 standard deviations from the average
• If the process is normally distributed, then
– Over the long run, observations will stay within the
control limits 99.7% of the time
• If the process goes out of control, you will know
[Figure: quality control chart; y-axis: Quality (0 to 100), with a point marked "Out of control"]
Slide
4-78 Example: The Stock Market
• Daily stock market returns, S&P500 index, first
half of 2001. Standard deviation is 1.43%
– Average daily percent change: -0.03%
– Typical day: about 1.5 percentage points up or down
[Figure: histogram of daily returns; x-axis: Stock market return (-5% to 5%), y-axis: Frequency (days, 0 to 30); the average is marked, with one standard deviation on each side]
Slide
4-79
Fig 5.1.11
Mining the Donations Database
• 989 people made donations
– Average donation $15.77, standard deviation $11.68
– Skewed distribution for donation amounts
[Figure: histogram; x-axis: Donation amount ($0 to $120), y-axis: Number of people (0 to 300); the average donation is marked, with one standard deviation on each side]
Slide
4-80 The Range
• The difference: Largest – Smallest

• Good features
– Easy and fast to compute
– Describe the data
– Check the data: Is the range too big to be reasonable?

• Problem
– Very sensitive to just two data values
• Compare to standard deviation, which combines all data values
Slide
4-81 Example: Spending
• $Thousands: 3.8, 1.4, 0.3, 0.6, 2.8, 5.5, 0.9, 1.1
• The range is 5.2
– larger than the standard deviation, 1.83

[Figure: histogram of spending; x-axis: spending (0 to 7), y-axis: Frequency (0 to 3); the range, 5.5–0.3 = 5.2, is marked along with the average and one standard deviation]
Slide
4-82 Coefficient of Variation
• A relative measure of variability
• The ratio: Standard deviation divided by average
– For a sample: S/X̄
– For a population: σ/μ
• No measurement units. A pure number. Answers:
– “Typically, in percentage terms, how far are data values
from average?”
• Useful for comparing situations of different sizes
– To see how variability compares after adjusting for size
Slide
4-83 Example: Portfolio Performance
• You have invested $100 in each of 5 stocks
– Results: $116, 83, 105, 113, 98
– Average is $103, std. dev. is $13.21
• Your friend has invested $1,000 in each stock
– Results: $1,160, 830, 1,050, 1,130, 980
– Average is $1,030, std. dev. is $132.10
• Coefficients of variation are identical
13.21/103 = 132.10/1,030 = 0.128 = 12.8%
• Typically, results for these 5 stocks were
approximately 12.8% from their average value
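The comparison of the two portfolios can be reproduced with Python's standard `statistics` module, whose `stdev` function uses the sample (n−1) formula. The function name `coefficient_of_variation` is illustrative.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the average (a pure number)."""
    return stdev(values) / mean(values)

yours = [116, 83, 105, 113, 98]            # results from $100 per stock
friends = [1160, 830, 1050, 1130, 980]     # results from $1,000 per stock

cv_yours = coefficient_of_variation(yours)
cv_friends = coefficient_of_variation(friends)
print(f"yours: {cv_yours:.1%}, friend's: {cv_friends:.1%}")
```

The standard deviations differ by a factor of 10, but the ratio to the average is the same for both.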
Slide
4-84 Adding a Constant to the Data
• If the same number is added to each data value:
– The average changes by this same number
• The center of the distribution shifts by the same amount
– The standard deviation is unchanged
• Each data value stays the same distance from average
• Example: Order amounts: $3, 6, 9, 5, 8
– Average is $6.20, std. dev. is $2.39
– Now add shipping and handling, $1 per order:
$4, 7, 10, 6, 9
– Average rises by $1 to $7.20, but std. dev. is still $2.39
Slide
4-85 Multiplying the Data by a Constant
• If each data value is multiplied by some number:
– The average is multiplied by this same number
• The center of the distribution shifts by the same multiple
– The standard deviation is also multiplied by this same
number (after ignoring any minus sign)
• The distribution is widened (or narrowed) by this factor
• Example: Order amounts: $3, 6, 9, 5, 8
– Average is $6.20, std. dev. is $2.39
– Add 10% sales tax: $3.30, $6.60, $9.90, $5.50, $8.80
– Average rises by 10% to $6.82
– Std. dev. also rises by 10%, to $2.63
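Both rules, adding a constant and multiplying by a constant, can be checked on the order amounts with a short sketch using the standard `statistics` module (sample standard deviation).

```python
from statistics import mean, stdev

orders = [3, 6, 9, 5, 8]                     # order amounts, in dollars
with_shipping = [x + 1 for x in orders]      # add $1 shipping and handling
with_tax = [x * 1.10 for x in orders]        # add 10% sales tax

for label, data in [("original", orders),
                    ("plus $1", with_shipping),
                    ("times 1.10", with_tax)]:
    print(f"{label:>10}: average = {mean(data):.2f}, std. dev. = {stdev(data):.2f}")
```

Adding the constant shifts only the average; multiplying scales both the average and the standard deviation by the same factor.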
Slide
4-86 Example: International Exchange Rates
• Suppose $1 is worth 1.146 European euros
– Assume for now that this rate is constant
• Your firm is anticipating
– Average profits worth 850,000 euros
– Standard deviation (uncertainty) of 100,000 euros
• In dollars, after conversion, your firm anticipates
– Average profits worth 850,000/1.146 = $741,710
– Standard deviation of 100,000/1.146 = $87,260
• Relative risk is the same in $ and in euros
– Coefficient of variation is 11.8%
