Tutorial 2 - Answer Key
January 27, 2023
Reasoning:
The mode is 4, the median is 4.5, and the mean is 5.4. Thus, Statement (i) is
false, and Statement (ii) is true. Furthermore, since the mean is greater than
the median, we can conclude that the distribution is positively skewed (or
skewed to the right); so, Statement (iii) is true.
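A quick check of this reasoning can be sketched in Python. The ten values below are hypothetical (not the original data), chosen only so that they reproduce the same mode, median, and mean:

```python
import statistics

# Hypothetical data chosen so that mode = 4, median = 4.5, mean = 5.4.
data = [2, 3, 4, 4, 4, 5, 6, 7, 9, 10]

mode = statistics.mode(data)
median = statistics.median(data)
mean = statistics.mean(data)

# A mean above the median suggests a longer right tail (positive skew).
if mean > median:
    skew_direction = "right"
elif mean < median:
    skew_direction = "left"
else:
    skew_direction = "none"

print(mode, median, mean, skew_direction)
```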
QUESTION 2.2
Which of the following statements are true regarding
population standard deviations?
i. A standard deviation of 100 is considered large
ii. Population and sample standard deviations are
always equal
iii. Variance can be described as the mean squared
difference of measurements from the mean
a) ii only
b) iii only
c) i and ii
d) i and iii
e) i, ii, and iii
Reasoning:
Statement (i) is false: standard deviations are relative (e.g., if we are dealing
with values in the millions, a standard deviation of 100 is small). Statement
(ii) is false: a sample standard deviation depends on the sample drawn, so it
need not equal the population standard deviation. Statement (iii) is true by
definition of variance.
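Statements (ii) and (iii) can be illustrated with a small sketch. The population and the subsample below are made-up values for illustration only:

```python
import statistics

# Hypothetical population and one hypothetical sample drawn from it.
population = [1, 2, 3, 4, 5]
sample = [1, 2, 5]

# Statement (iii): variance as the mean squared difference from the mean.
mu = statistics.mean(population)
mean_sq_dev = sum((x - mu) ** 2 for x in population) / len(population)
pop_variance = statistics.pvariance(population)

# Statement (ii): the sample standard deviation depends on the sample drawn.
pop_sd = statistics.pstdev(population)
sample_sd = statistics.stdev(sample)

print(mean_sq_dev, pop_variance, pop_sd, sample_sd)
```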
QUESTION 2.3
A delivery company has a mean delivery time of 3.2 days
at a standard deviation of 0.35 days. According to
Chebyshev’s Theorem, approximately 84% of delivery
times fall within what interval?
a) [1.013, 5.388]
b) [2.850, 3.550]
c) [2.325, 4.075]
d) [2.688, 3.810]
e) None of the above
Reasoning:
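A sketch of the calculation, assuming the usual statement of Chebyshev's Theorem (at least 1 - 1/k^2 of observations lie within k standard deviations of the mean). The variable names are illustrative:

```python
import math

mean, sd = 3.2, 0.35   # days, from the question
coverage = 0.84        # target proportion of delivery times

# Solve 1 - 1/k**2 = coverage for k.
k = math.sqrt(1 / (1 - coverage))

# The interval is the mean plus or minus k standard deviations.
lower = mean - k * sd
upper = mean + k * sd

print(round(k, 3), round(lower, 3), round(upper, 3))
```

With k = 2.5, the interval works out to [2.325, 4.075], which corresponds to option (c).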
Reasoning:
If the index i is an integer, we take the average of the i-th and (i+1)-th
values. In this case, we average the 3rd and 4th values: 51.5 ($51,500).
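The positional rule can be sketched as follows. The data are hypothetical (six made-up salaries in thousands of dollars, chosen so the 3rd and 4th values average to 51.5). The index formula i = (p/100)n and the round-up step for a non-integer index are assumptions based on a common textbook convention, not something stated in this answer key:

```python
import math

def percentile_by_position(values, p):
    """Locate the p-th percentile (0 < p < 100) by its position in sorted data."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    i = p * len(ordered) / 100
    if i == int(i):
        i = int(i)
        # Integer index: average the i-th and (i+1)-th values (1-based).
        return (ordered[i - 1] + ordered[i]) / 2
    # Assumed convention for a non-integer index: round up to the next position.
    return ordered[math.ceil(i) - 1]

# Hypothetical salaries in thousands of dollars.
salaries = [48, 50, 51, 52, 55, 60]
print(percentile_by_position(salaries, 50))
```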
Reasoning:
$\bar{x} = 240.67$
$s^2 = \frac{1}{6} \sum_{i=1}^{6} (x_i - \bar{x})^2 = 2952.89$
$s = \sqrt{s^2} = \sqrt{2952.89} = 54.3405$
$CV = \frac{s}{\bar{x}} (100) = \frac{54.3405}{240.67} (100) = 22.5788$
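The last two steps can be checked numerically. This sketch starts from the mean and variance reported above, since the underlying observations are not reproduced here:

```python
import math

mean = 240.67        # sample mean from the reasoning above
variance = 2952.89   # variance from the reasoning above

# Standard deviation is the square root of the variance.
sd = math.sqrt(variance)

# Coefficient of variation: standard deviation as a percentage of the mean.
cv = sd / mean * 100

print(round(sd, 4), round(cv, 4))
```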
QUESTION 2.6
The five-number summary is described by:
a) The minimum, mean, median, mode, and maximum
b) The first, second, third, fourth, and fifth quartiles
c) The minimum, 25th percentile, 50th percentile, 75th
percentile, and maximum
d) The mean, median, mode, standard deviation, and
variance
e) None of the above
Reasoning:
The five-number summary is made up of the minimum (non-outlier) value, the
25th percentile (or first quartile), the median (or 50th percentile, or second
quartile), the 75th percentile (or third quartile), and the maximum (non-
outlier) value.
QUESTION 2.7
A fire chief is timing his firefighters in a drill to see how
quickly they can get ready in case of an emergency call.
He obtains the five-number summary 58, 64, 66, 70, 77,
expressed in seconds. Which of the following is/are
false?
a) The fastest 25% of firefighters took 64 seconds or less
to get ready
b) The slowest 75% of firefighters took 70 seconds or
more to get ready
c) Half the firefighters took between 58 and 66 seconds
to get ready
d) Half the firefighters took less than 64 seconds or over
70 seconds to get ready
e) None of the above
Reasoning:
Option (a) is true: the first quartile is given as 64, which means that 25% of
the firefighters required 64 seconds or less to get ready. Option (b) is false:
the third quartile (75th percentile) is given as 70, which means that the fastest
75% of firefighters took 70 seconds or less to get ready (equivalently, the
slowest 25% took 70 seconds or more). Option (c) is true: the span of 58 to 66
describes the observations below the median (which represents 50% of the
observations). Option (d) is true: the first and third quartiles are given as 64
and 70, respectively; the interval between them is the interquartile range and
contains 50% of the observations, so the remaining 50% must lie outside it
(below 64 or above 70).
QUESTION 2.8
Fill in the blanks: When the mean is ________ the median,
the dataset will be ________.
a) less than; skewed left
b) greater than; skewed left
c) greater than; negatively skewed
d) less than; positively skewed
e) None of the above
Reasoning:
When the mean is less (greater) than the median, the dataset will be skewed
left (right). In other words, when the mean is less (greater) than the median,
the dataset will be negatively (positively) skewed.
QUESTION 2.9
Which of the following is true of box plots?
i. The interquartile range describes the middle half of
the data
ii. If the median equals the mean, the distribution is
symmetric
iii. The median can never be an outlier
a) i only
b) ii only
c) iii only
d) i and ii
e) i, ii, and iii
Reasoning:
Statement (i) is true: the IQR ranges from the 25th to 75th percentiles
(representing half the data). Statement (ii) is true by definition. Statement (iii)
is true: outliers are described as any value outside the interval [𝑄1 − 1.5𝐼𝑄𝑅,
𝑄3 + 1.5𝐼𝑄𝑅] (this interval clearly includes the median).
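As an illustration of Statement (iii), the outlier fences can be computed for a concrete case. This sketch reuses the quartiles from Question 2.7 (64, 66, 70) purely as example inputs:

```python
# Quartiles reused from Question 2.7 as example inputs.
q1, median, q3 = 64, 66, 70

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The median lies between Q1 and Q3, so it always falls inside the fences.
median_is_outlier = not (lower_fence <= median <= upper_fence)

print(lower_fence, upper_fence, median_is_outlier)
```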
QUESTION 2.10
Consider a sample of 50 observations ranging between 0
and 10. If the median is 5, what can be inferred about
the distribution?
a) It is symmetrically distributed
b) The mean is also 5
c) About 25 observations lie between 2.5 and 7.5
d) All of the above
e) None of the above
Reasoning:
Option (a) is false: consider a distribution in which the first quartile is 4 and
the third quartile is 9; this would clearly be positively skewed. Option (b) is
false: nothing can be inferred about the mean. Option (c) is false by the same
logic as Option (a); we do not know anything about the other quartiles.
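One possible counterexample can be built directly. The dataset below is hypothetical: 50 observations between 0 and 10 with a median of 5 for which none of options (a) to (c) holds:

```python
import statistics

# Hypothetical sample: 24 zeros, two fives, and 24 eights (50 observations).
data = [0] * 24 + [5, 5] + [8] * 24

median = statistics.median(data)
mean = statistics.mean(data)

# Count the observations falling between 2.5 and 7.5 inclusive.
in_middle = sum(2.5 <= x <= 7.5 for x in data)

print(len(data), median, mean, in_middle)
```

Here the median is 5, but the mean is 4.04, the values are not symmetric about 5, and only 2 observations lie between 2.5 and 7.5.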
QUESTION 2.11
Which of the following statements is true?
a) If we have an even number of observations, the
median must appear in the dataset
b) If we have an odd number of observations, the
median must appear in the dataset
c) If we have an even number of observations, the mean
must appear in the dataset
d) If we have an odd number of observations, the
mean must appear in the dataset
e) None of the above
Reasoning:
Option (a) is false: consider the dataset {1, 2, 3, 4} (the median is 2.5, which
does not appear). Option (b) is true: consider the dataset {𝑎, 𝑏, 𝑐, 𝑑, 𝑒}, where 𝑎
≤ 𝑏 ≤ 𝑐 ≤ 𝑑 ≤ 𝑒 (the median is 𝑐, which does appear). Option (c) is false:
consider the dataset {1, 2, 3, 4} (the mean is 2.5, which does not appear).
Option (d) is false: consider the dataset {0, 2, 3, 4, 5} (the mean is 2.8, which
does not appear).
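The datasets used in this reasoning can be checked with a short sketch (the variable names are illustrative):

```python
import statistics

even_data = [1, 2, 3, 4]
odd_data = [0, 2, 3, 4, 5]

# Does the median or mean appear as an actual observation?
median_in_even = statistics.median(even_data) in even_data
median_in_odd = statistics.median(odd_data) in odd_data
mean_in_even = statistics.mean(even_data) in even_data
mean_in_odd = statistics.mean(odd_data) in odd_data

print(median_in_even, median_in_odd, mean_in_even, mean_in_odd)
```

Note that the odd case shows only one example; the general argument is that the median of an odd number of observations is the middle ordered value.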
QUESTION 2.12
Consider the box plot below:
Reasoning:
Option (a) is true; these intervals represent the lower and upper halves of the
interquartile range (IQR), each representing a quarter of the data. Option (b)
is true; the distribution is clearly negatively skewed, which indicates that the
mean is less than the median. Option (c) is true; the IQR spans the interval [15,
20]. Option (d) is false; the interval [20, 22] represents the interval spanning
the third quartile to the maximum (non-outlier) value (which contains 25% of
the data), and the interval [19, 20] represents the interval spanning the
median to the third quartile (which also contains 25% of the data).
SUPP. QUESTION 2.13
Let S denote a set of five unique integers. If you decrease
the smallest number in S by one unit and you increase
the largest by one unit, which of the following would
remain unchanged?
i. Mean
ii. Median
iii. Mode
a) ii only
b) iii only
c) i and ii
d) ii and iii
e) i, ii, and iii
Reasoning:
Let $S = \{x_1, x_2, x_3, x_4, x_5\}$ such that $x_1 < x_2 < x_3 < x_4 < x_5$, and after the
transformation, let $S' = \{x_1 - 1, x_2, x_3, x_4, x_5 + 1\}$. Consider the mean:
$\bar{x}_S = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \frac{(x_1 - 1) + x_2 + x_3 + x_4 + (x_5 + 1)}{5} = \bar{x}_{S'}$
The mean clearly remains unchanged. Now, consider the median: in both sets,
the middle value is $x_3$, and thus, the median remains unchanged. Since all
numbers are unique, there is no mode; this still holds in S' because the
endpoints move outward by one unit and so cannot coincide with any other
value. Thus, the mode remains unchanged.
*Note: You can show this for a very simple set of five unique numbers.
Consider 𝑆 = {1, 2, 3, 4, 5}.
**Note: You could instead argue that S is multimodal (each of its five unique
integers is a mode) and that S' is as well, but with different modes, since its
first and last observations differ from those of S. Under that reading the mode
changes, so Option (c) is also valid.
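The simple set suggested in the first note can be checked with a short sketch. It uses `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for most frequent and so reflects the multimodal reading in the second note:

```python
import statistics

S = [1, 2, 3, 4, 5]
# Decrease the smallest value by one and increase the largest by one.
S_prime = [S[0] - 1] + S[1:-1] + [S[-1] + 1]

mean_unchanged = statistics.mean(S) == statistics.mean(S_prime)
median_unchanged = statistics.median(S) == statistics.median(S_prime)

# Under the multimodal reading, every unique value counts as a mode.
modes_S = statistics.multimode(S)
modes_S_prime = statistics.multimode(S_prime)

print(S_prime, mean_unchanged, median_unchanged, modes_S, modes_S_prime)
```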