
Stat7055 T01

The document contains questions about statistical concepts and analysis. It includes questions about classifying variables, calculating correlations and variances, describing central tendencies, and analyzing relationships in sample data.

Uploaded by

hydrogenbearowo
Copyright
© All Rights Reserved

STAT7055 Topic 01 Tutorial Questions

1. Data was collected on 105 homes in Canberra in 2003. For each house, the following
information was collected: the estimated price of the house (in dollars); the number of
bedrooms; the size of the house (in square metres); whether or not a pool was present (yes
or no); the distance from Civic; the rating of the insulation in the house (none, average
or high); the suburb; the number of bathrooms; and the type of internet connectivity
available (dialup, ADSL or the NBN, where dialup is the slowest connection and the NBN
is the fastest). Classify each variable as either nominal, ordinal, discrete or continuous.

2. You work in a country where every resident plays a sport every day. However, the only
two sports played are table tennis (when it is raining) and golf (when it is sunny). Your
job is to provide statistical analysis to the management of a company that sells “ping-
pong” (table tennis) balls directly through the internet. Over the past eight months you
have collected the following data:

Month   Marketing expenditure ($)   Number of rainy days   Number of sales
1       4150                        6                      778
2       3000                        10                     779
3       2500                        25                     4200
4       10600                       2                      250
5       12000                       7                      300
6       8000                        20                     6000
7       1500                        18                     1500
8       6850                        9                      500

For this data, the sample coefficients of variation for marketing expenditure, number of
rainy days per month, and number of sales have been calculated to be 0.642849, 0.656009,
and 1.194023, respectively.
(a) The marketing manager has told you that it simply makes sense that there is a
strong and positive correlation between marketing expenditure and the number
of sales made. Provide some analysis regarding this relationship. What do you
conclude from your results?
(b) Using the data above, calculate the correlation coefficient between the number of
rainy days per month and the number of sales. The covariance between the number
of rainy days per month and the number of sales has been calculated as 14012.23.
(c) What does the result in (b) suggest? Provide a potential reason for this
result.
Try using R to calculate the sample correlation coefficients from the raw data given in
the table.
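
The tutorial suggests R for this step. As a rough cross-check only, the same calculation can be sketched in plain Python. The helper names `sample_cov` and `sample_corr` are illustrative and do not come from the tutorial; both assume the n − 1 denominator for the sample covariance and variance.

```python
from math import sqrt

# Raw data from the table in Question 2 (months 1 to 8).
marketing = [4150, 3000, 2500, 10600, 12000, 8000, 1500, 6850]
rainy_days = [6, 10, 25, 2, 7, 20, 18, 9]
sales = [778, 779, 4200, 250, 300, 6000, 1500, 500]


def sample_cov(x, y):
    """Sample covariance with an n - 1 denominator (illustrative helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def sample_corr(x, y):
    """Sample correlation: covariance over the product of the standard deviations."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))


cov_rain_sales = sample_cov(rainy_days, sales)
r_rain_sales = sample_corr(rainy_days, sales)
r_marketing_sales = sample_corr(marketing, sales)

print(f"cov(rainy days, sales)  = {cov_rain_sales:.2f}")
print(f"corr(rainy days, sales) = {r_rain_sales:.3f}")
print(f"corr(marketing, sales)  = {r_marketing_sales:.3f}")
```

The first printed value should agree with the covariance of 14012.23 quoted in part (b), which is a quick way to confirm the data were entered correctly.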


3. A quality control officer in a chocolate factory records the number of minutes it takes
for the company’s signature chocolate bar to melt at room temperature. He recorded
the following 11 times for 11 different chocolate bars:

14 20 20 12 9 13 35 12 11 12 46

(a) Calculate the mean, mode and median of the times.


(b) It turned out that the quality control officer occasionally fell asleep while recording
the time for a chocolate bar to melt, leading to some incorrectly large melting times.
Based on this information, which would be a better measure of central tendency for
this data, the mean or the median?
(c) Calculate the IQR of the times.
(d) Calculate the difference between the 60th percentile and the 10th percentile.
(e) To what percentile does a time of 15.5 minutes correspond?
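
For checking hand calculations in part (a), a minimal sketch using Python's standard `statistics` module is given here. It deliberately stops at the mean, median and mode: the IQR and percentile answers in parts (c) to (e) depend on the percentile convention used in the lectures, which software defaults may not match.

```python
import statistics

# Melting times (minutes) for the 11 chocolate bars in Question 3.
times = [14, 20, 20, 12, 9, 13, 35, 12, 11, 12, 46]

mean_time = statistics.mean(times)      # arithmetic mean
median_time = statistics.median(times)  # middle value of the sorted data
mode_time = statistics.mode(times)      # most frequent value

print("sorted:", sorted(times))
print("mean:", round(mean_time, 3))
print("median:", median_time)
print("mode:", mode_time)
```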

4. There is a shortcut version for calculating the sample variance given by the following
formula:

\[
s^2 = \frac{1}{n-1}\left(\left(\sum_{i=1}^{n} X_i^2\right) - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right)
\]

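Show that this is equivalent to the definition given in the lectures. In other words, show
that:

\[
\frac{1}{n-1}\left(\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2\right) = \frac{1}{n-1}\left(\left(\sum_{i=1}^{n} X_i^2\right) - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right)
\]
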
Bonus: Show that the shortcut version of the sample covariance given below is equivalent
to the definition given in lectures.

\[
s_{XY} = \frac{1}{n-1}\left(\left(\sum_{i=1}^{n} X_i Y_i\right) - \frac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n}\right)
\]
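
A numerical check is not a proof, but it can catch algebra slips before the derivation is attempted. The sketch below compares the definition and shortcut forms on data taken from elsewhere in this tutorial; the function names are illustrative, and pairing the Question 3 times with the Question 5 weights is arbitrary, done only to have two lists of equal length.

```python
from math import isclose


def var_definition(x):
    """Sample variance from the definition: squared deviations over n - 1."""
    n = len(x)
    mean = sum(x) / n
    return sum((xi - mean) ** 2 for xi in x) / (n - 1)


def var_shortcut(x):
    """Sample variance from the shortcut: sum of squares minus (sum)^2 / n."""
    n = len(x)
    return (sum(xi ** 2 for xi in x) - sum(x) ** 2 / n) / (n - 1)


def cov_definition(x, y):
    """Sample covariance from the definition: cross deviations over n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def cov_shortcut(x, y):
    """Sample covariance from the shortcut: sum of products minus (sum x)(sum y) / n."""
    n = len(x)
    return (sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n) / (n - 1)


# Question 3 melting times and Question 5 frog weights (arbitrary pairing).
x = [14, 20, 20, 12, 9, 13, 35, 12, 11, 12, 46]
y = [13, 26, 22, 16, 18, 28, 14, 15, 15, 17, 25]

print(isclose(var_definition(x), var_shortcut(x)))
print(isclose(cov_definition(x, y), cov_shortcut(x, y)))
```

Both comparisons use `math.isclose` rather than `==` because the two forms accumulate floating-point rounding differently.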


5. The Hula painted frog is an extremely rare species of frog that was thought to be extinct
but was rediscovered in 2011. Only 11 are believed to be living in the wild. Suppose the
weights of these 11 frogs are known and given in the table below (in grams):

13 26 22 16 18 28 14 15 15 17 25

(a) Calculate the population variance of these 11 frogs.

Suppose now we take five random samples of size four from this population, with each
new sample being taken after returning the previous sample to the population. The five
samples, along with some sample statistics, are listed below:

Sample            X̄        Σ_{i=1}^{n} X_i^2
13, 22, 18, 16    17.25    1233
26, 15, 17, 15    18.25    1415
14, 18, 15, 25    18       1370
25, 14, 16, 17    18       1366
13, 26, 25, 18    20.5     1794

(b) Calculate the sample variance for each of the five samples.
(c) Calculate the sample variance for each of the five samples, but this time using n as
the denominator, instead of n − 1. That is, calculate:

\[
s^{*2} = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2
\]

(d) Calculate the average of the five sample variances in part (b) and the average of
the five sample variances in part (c). What do you notice?
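
The comparison in parts (a) to (d) can be sketched with Python's standard `statistics` module, as a check on hand calculations rather than a replacement for them. The sketch relies on `statistics.variance` dividing by n − 1 and `statistics.pvariance` dividing by n, so applying `pvariance` to a sample reproduces the n-denominator quantity in part (c).

```python
import statistics

# Weights (grams) of the 11 frogs, treated as the full population.
population = [13, 26, 22, 16, 18, 28, 14, 15, 15, 17, 25]

# The five samples of size four listed in the table.
samples = [
    [13, 22, 18, 16],
    [26, 15, 17, 15],
    [14, 18, 15, 25],
    [25, 14, 16, 17],
    [13, 26, 25, 18],
]

pop_var = statistics.pvariance(population)           # part (a): divide by N
s2 = [statistics.variance(s) for s in samples]       # part (b): divide by n - 1
s2_star = [statistics.pvariance(s) for s in samples] # part (c): divide by n

avg_s2 = statistics.mean(s2)            # part (d), first average
avg_s2_star = statistics.mean(s2_star)  # part (d), second average

print("population variance:", round(pop_var, 3))
print("average with n - 1: ", round(avg_s2, 3))
print("average with n:     ", round(avg_s2_star, 3))
```

Printing the three values side by side makes it easy to see which of the two averages sits closer to the population variance.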


6. The average score for a class of 30 students was 75. The 20 male students in the class
averaged 70. The boxplots for the scores for the male and female students are given
below.

[Figure: side-by-side boxplots of scores for male and female students; vertical axis runs from 40 to 100.]

(a) What was the average of the 10 female students in the class?
(b) Describe the relationship between the median and the mean for both male students
and female students.
(c) Did a greater proportion of male students or female students score at least 83?

7. Discussion Question
Some scientists are conducting a study to investigate the effects of exercise and caffeine
on sleep quality. A random sample of 300 people aged between 20 and 50 were included
in the study. For a particular day, each person was asked to record the number of
cups of coffee/tea they drank, the number of minutes they exercised, and the number
of hours they slept that night. The scientists have asked you to help them analyse their
data. They would like to summarise each variable in their sample data. They are also
interested in determining whether doing more exercise or consuming less caffeine is more
likely to cause the person to sleep for longer. Discuss some approaches that you could
use to help the scientists. Remember to note any important issues that need to be
considered in the analysis or in the interpretation of the analysis.

8. swirl
Work through lessons 1 and 2 of the R Programming course.

