
Solutions Homework Week 6

This document contains solutions to an integrative assignment involving analyzing and visualizing data using different statistical techniques. It includes examples of constructing tables and plots like scatterplots, bar plots, and box plots to represent various relationships in datasets. Statistical tests discussed include t-tests, Wilcoxon rank sum tests, permutation tests, chi-squared tests, and calculating confidence intervals.

WEEK 6 – SOLUTIONS HOMEWORK

Q1
a) Suppose an equal number of students is assigned to each design:
                   Design 1   Design 2   Total
More than 1 min       12          5        17
Less than 1 min       13         20        33
Total                 25         25        50

b)
          1st year   3rd year   Total
Yes          85        117       202
No          276        104       380
Total       361        221       582

Q2
a) See below for the marginals (expected counts in parentheses)

                              GENDER
Lied at least once     MALE            FEMALE          Total
YES                    3228            10295           13 523
                       (exp=6268.3)    (exp=7254.7)
NO                     9659            4620            14 279
                       (exp=6618.7)    (exp=7660.3)
Total                  12 887          14 915          27 802

b) The most appropriate description is the conditional distribution by gender (independent
variable), i.e. column percentages: 25.05% for men and 69.02% for women.
c) Women lie more often (or at least, women more often admit to lying).
d) $\chi^2 = \frac{(3228 - 6268.3)^2}{6268.3} + \cdots + \frac{(4620 - 7660.3)^2}{7660.3} = 5352$, $df = 1$. P-value is very small. It is not
necessary to worry about the exact value of the P-value, because of this extreme result.
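
The numbers in Q2 can be checked with a short script. This is a minimal sketch in plain Python, not part of the original solutions, and the variable names are illustrative. It recomputes the expected counts from a), the column percentages from b) and the chi-squared statistic from d) from the observed counts in the table.

```python
# Observed counts from the Q2 table: rows = YES/NO, columns = MALE/FEMALE.
observed = [[3228, 10295],
            [9659, 4620]]

row_totals = [sum(row) for row in observed]        # [13523, 14279]
col_totals = [sum(col) for col in zip(*observed)]  # [12887, 14915]
n = sum(row_totals)                                # 27802

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Column percentages: share of each gender that answered YES.
pct_yes = [100 * observed[0][j] / col_totals[j] for j in range(2)]

# Chi-squared statistic: sum over cells of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)

print([round(p, 2) for p in pct_yes])  # column percentages for YES
print(round(chi2), df)                 # test statistic and degrees of freedom
```

Working from the unrounded expected counts gives the same statistic as the hand calculation, up to rounding.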
Q3
The main problem here is that we are not dealing with a two-way table. Each of the
119 students can fall into multiple categories: they can appear in multiple rows if they have seen
several of the films, and they may even appear in multiple columns (if, for example, they have
problems both at night and during the day).

Another problem is that percentages rather than counts are reported. Since we know the sample
size n, however, it is possible to convert these percentages into counts.
INTEGRATIVE ASSIGNMENT
1.
a) The best two options are the mean and the median.
b) The extent to which the response time distributions are skewed would be an important thing
to know when deciding whether to use the median or the mean. With a heavily-skewed
distribution, the mean will be affected by values that are far away from the typical
observations; the median will not exhibit this behavior.
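
A small numerical illustration of this point, as a sketch with made-up response times (the values below are hypothetical and not taken from the assignment):

```python
from statistics import mean, median

# Hypothetical response times in ms; the last value is an extreme outlier.
response_times = [300, 320, 340, 350, 2500]

print(mean(response_times))    # 762: pulled far above the typical values
print(median(response_times))  # 340: stays at the middle observation
```

A single extreme value moves the mean well away from the bulk of the data, while the median is unchanged as long as the middle observation stays the same.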

2.
a) Assuming that a person’s sex is unrelated to whether they plan to have hip surgery, the
proportion of males in the sample should be the same as in the population at large. In order to
assess this, we can construct a table like this:

                    Gender
              Male    Female    Total
Number                            80
Proportion                         1

b) In order to do this, we would use the formula for a confidence interval for a single
population proportion:

$$\mathrm{CI} = \frac{Y}{n} \pm z^* \sqrt{\frac{\frac{Y}{n}\left(1 - \frac{Y}{n}\right)}{n}}$$

where Y is the number of females in the sample. This is the same as the formula

$$\mathrm{CI} = \hat{p} \pm z^* \, \mathrm{SE}_{\hat{p}}$$
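
The interval can be computed as in the sketch below. The function name `proportion_ci` is illustrative, and the count Y = 48 is a made-up value, since the solutions give only the total n = 80.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a single population proportion."""
    p_hat = successes / n
    # z* is the standard normal quantile that leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se

# n = 80 is the sample total; Y = 48 females is a hypothetical count.
lower, upper = proportion_ci(48, 80)
print(round(lower, 3), round(upper, 3))
```

If the population proportion falls outside the resulting interval, the sample proportion differs from it by more than sampling variation would typically explain.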
3.
a) Since both scores will be quantitative variables, the best plot in this case would
be a scatterplot.
b) In this case, a scatterplot would not yield as much information. The math scores
have only four possible values, so it would be more insightful to plot the average
IQ score for every math rating, along with standard errors. This would yield four
bars, one for each math rating.
c) Since the language-IQ plot involves two (roughly) continuous variables, a good statistic would be
either 𝑟, if the relationship is linear, or Kendall's τ, if the relationship is not linear. The
mathematics score is a little trickier. The best answer, of the statistics we have discussed so
far, is Kendall's 𝜏.
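
The difference between the two statistics can be seen in a small example. This is a sketch in plain Python with hypothetical data; the helper names are illustrative, and the tie-corrected version of Kendall's τ (often called τ-b) is assumed here because a four-valued math rating produces many ties.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: strength of a linear relationship."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def kendall_tau_b(x, y):
    """Kendall's tau with a correction for ties (tau-b)."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                # tied on both variables: ignored
        elif dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1         # pair ordered the same way on x and y
        else:
            discordant += 1         # pair ordered in opposite ways
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + only_x_tied) * (untied + only_y_tied))

# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3))  # below 1, since the relationship is curved
print(kendall_tau_b(x, y))        # 1.0, since the relationship is monotone
```

Because τ depends only on the ordering of the observations, it remains usable when one variable is an ordinal rating with a handful of values.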

4.
a) You could use a box plot or a bar plot, with standard errors, for this purpose. The plot would
have two boxes/bars, and the y axis would show the RT score.
b) An independent samples t-test would probably be best, assuming that the scores
are approximately normally distributed. Otherwise, one of the non-parametric
alternatives (Wilcoxon rank sum test or permutation test) would be more appropriate.

c) For the t-test, the null and alternative hypotheses are

$$H_0: \mu_1 = \mu_2$$
$$H_a: \mu_1 < \mu_2$$

where the subscripts 1 and 2 denote the two sexes.
For the Wilcoxon, the statement will be about the median instead of the mean µ. For the
permutation test, the null is that the population distribution of RT scores is the same for
males and females.
d) For the t test, the formula is

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

For the Wilcoxon test, the test statistic is the sum of the ranks of all observations in one of
the groups.
e) In this case, the appropriate statistic to build a confidence interval around is $\bar{X}_1 - \bar{X}_2$. The
formula for the confidence interval is $\mathrm{CI} = (\bar{X}_1 - \bar{X}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.
f) Assuming that the interval is a 95% confidence interval, the corresponding one-sided test will be at α = 0.025.
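
The statistics in d) and e) can be sketched in plain Python as follows. The RT scores are hypothetical, the function names are illustrative, and the critical value t* is passed in by hand because it has to be looked up in a t table. The value 2.776 used below is the 95% table value for 4 degrees of freedom, which assumes the conservative choice df = min(n1, n2) - 1.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(group1, group2):
    """t statistic with unpooled variances, together with its standard error."""
    se = sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    return (mean(group1) - mean(group2)) / se, se

def diff_ci(group1, group2, t_star):
    """Confidence interval for the difference in means, given a table value t*."""
    _, se = two_sample_t(group1, group2)
    diff = mean(group1) - mean(group2)
    return diff - t_star * se, diff + t_star * se

def rank_sum(group1, group2):
    """Wilcoxon rank sum: sum of the ranks of group1 within the pooled sample."""
    pooled = sorted(list(group1) + list(group2))

    def midrank(value):
        # Tied values share the average of the positions they occupy.
        first = pooled.index(value) + 1
        last = first + pooled.count(value) - 1
        return (first + last) / 2

    return sum(midrank(v) for v in group1)

# Hypothetical RT scores for the two groups.
rt_1 = [410, 395, 430, 405, 420]
rt_2 = [440, 455, 425, 460, 450]

t, se = two_sample_t(rt_1, rt_2)
print(round(t, 3))                      # negative: group 1 has the lower mean
print(diff_ci(rt_1, rt_2, t_star=2.776))
print(rank_sum(rt_1, rt_2))             # 16.0
```

The rank sum uses only the ordering of the pooled observations, which is why it does not rely on the normality assumption mentioned in b).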

5.
a) A scatterplot would be the most appropriate plot to show whether participants
score worse one month after surgery. On the x axis would be the score before,
and on the y axis would be the score one month after. It would also be helpful
to place a line at y = x to show where we’d expect points to fall if there were no
change in the scores.

b) Either a sign test or a matched-pairs t test would be appropriate for answering
this question.
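
Both tests work on the within-person differences. The sketch below uses hypothetical before and after scores and assumes that a lower score is worse; the function names are illustrative.

```python
from math import comb, sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Matched-pairs t statistic on the differences after - before (df = n - 1)."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def sign_test_p(before, after):
    """One-sided exact sign test: P(this many or more decreases) if p = 0.5."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # ties are dropped
    n = len(diffs)
    decreases = sum(d < 0 for d in diffs)
    return sum(comb(n, k) for k in range(decreases, n + 1)) / 2 ** n

# Hypothetical scores before surgery and one month after.
before = [52, 48, 60, 55, 47, 58, 50, 53]
after = [49, 47, 57, 56, 44, 54, 48, 50]

print(round(paired_t(before, after), 3))  # negative: scores dropped on average
print(sign_test_p(before, after))         # 9/256, about 0.035
```

The sign test uses only the direction of each change, so it needs no normality assumption, at the cost of ignoring how large the changes are.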
6.
a) There are two acceptable answers to this question. The first is a clustered bar plot,
where the height of a bar is the frequency, and each bar represents a different sex-by-score
combination. Another plot would be a bar plot showing the average score
for women and the average score for men (on the 1-4 scale), with standard errors.
The former plot is the better plot.
b) Because the mathematics score is discrete and probably not close to normal, the most
appropriate test would be a χ² test. One factor would be math score (4 rows) and one factor
would be sex (2 columns). The resulting 4 by 2 table would contain, in each cell, the
frequency for each combination of score and sex.
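
Building the 4 by 2 table from raw records can be sketched as follows. The records are hypothetical and the names are illustrative; the point is how the cell frequencies and the degrees of freedom arise.

```python
from collections import Counter

# Hypothetical raw records: one (math score, sex) pair per student.
records = ([(1, "M")] * 10 + [(1, "F")] * 6 +
           [(2, "M")] * 14 + [(2, "F")] * 12 +
           [(3, "M")] * 12 + [(3, "F")] * 16 +
           [(4, "M")] * 4 + [(4, "F")] * 6)

cell_counts = Counter(records)
scores, sexes = [1, 2, 3, 4], ["M", "F"]

# 4 x 2 table of observed frequencies: rows = math score, columns = sex.
observed = [[cell_counts[(s, g)] for g in sexes] for s in scores]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-squared statistic, with expected count = row total * column total / n.
chi2 = 0.0
for i in range(len(scores)):
    for j in range(len(sexes)):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed[i][j] - expected) ** 2 / expected

df = (len(scores) - 1) * (len(sexes) - 1)  # (4 - 1) * (2 - 1) = 3
print(observed)
print(round(chi2, 2), df)
```

With four rows and two columns, the statistic is compared against a chi-squared distribution with 3 degrees of freedom.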
