Worksheet 1 and The Rest Solutions

The document consists of a series of worksheets that cover statistical concepts, including variable types, study designs, summary statistics, probability calculations, and confidence intervals. It includes practical exercises for calculating medians, interquartile ranges, means, standard deviations, and probabilities related to various scenarios. The document emphasizes the importance of understanding statistical methods and their applications in research.


Worksheet #1

1. Use SOME of the terms listed below to fill in the blanks.

nominal       ordinal       dependent    sample        parameter
statistic     descriptive   variable     experimental
independent   inferential   population   continuous

Suppose a researcher is interested in whether wait times for mental health services in Canada
have increased or decreased in recent years. Suppose further that the researcher collects data
regarding the wait times for all mental health services in Ontario, as a way to estimate the wait
times for all mental health services across Canada.

In this context, wait times for mental health services is an example of a __variable__ that is
measured on a __continuous__ scale of measurement. The __sample__ of Ontario mental
health services was drawn from the __population__ of all Canadian mental health services,
and therefore the average wait times in Ontario would be an example of a __statistic__, and
the average wait times in Canada (if known) would be an example of a __parameter__. In
general, any numbers or analyses used to summarize wait times in Ontario would be in the
category called __descriptive__ statistics; any numbers or analyses used to generalize from
what we learn about Ontario wait times to Canada as a whole would be in the category of
__inferential__ statistics.

2. For each of the following, determine whether the variable being measured is
discrete/categorical or continuous and explain your answer.
a. Social networking (number of daily minutes on Facebook)

Continuous. Time is infinitely divisible.

b. Preference between digital or analog watch

Discrete/categorical. There are two separate and distinct categories (analog and digital).
c. Number of correct answers on a statistics quiz

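Continuous, if the score is treated as a measure of the underlying knowledge of statistics:
the quiz could be a 5-point quiz, a 10-point quiz, or a 50-point quiz, which indicates that
the knowledge being measured can be divided indefinitely. Note that the raw number of
correct answers on any single quiz is a count and takes only whole-number values.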

3. Ford and Torok (2008) found that motivational signs were effective in increasing physical
activity on a college campus. Signs such as “Step up to a healthier lifestyle” and “An average
person burns 10 calories a minute walking up the stairs” were posted by the elevators and
stairs in a college building. Students and faculty increased their use of the stairs during
times that the signs were posted compared to times when there were no signs.

a. Identify the independent and dependent variables for this study.

The independent variable is whether or not the motivational signs were posted, and the
dependent variable is amount of use of the stairs.

b. What scale of measurement is used for the independent variable?

Posting versus not posting is measured on a nominal scale.

4. A study follows a large group of women with untreated dysplasia of the uterine cervix,
documenting the number who improve, stay unchanged, or progress into cervical cancer
after 2 years in the study. What type of study design did the researchers use?

Observational cohort study design.

5. A community assesses a random sample of its residents by telephone questionnaire to
assess the association between obesity and diagnosed diabetes. This study design is best
described as:

Observational cross-sectional study design.

Worksheet #2

6. By Hand:
For the following sample (n=10) of diastolic blood pressure values, calculate the median and
IQR. Are there outliers in this small data set?
63, 65, 70, 70, 72, 77, 81, 82, 64, 76 (mm Hg)

Rank order the values: 63, 64, 65, 70, 70, 72, 76, 77, 81, 82

Median = (70+72)/2 = 71 mm Hg

Q1 = 65 mm Hg
Note: There are 5 values below the median (lower half); the middle value (Q1) is 65

Q3 = 77 mm Hg
Note: There are 5 values above the median (upper half); the middle value (Q3) is 77

IQR = Q3 − Q1 = 77 − 65 = 12 mm Hg

Check for outliers:

Lower fence: Q1 − (1.5 × IQR) = 65 − (1.5 × 12) = 47 mm Hg
Upper fence: Q3 + (1.5 × IQR) = 77 + (1.5 × 12) = 95 mm Hg

There are no outliers because none of the values are smaller than 47 or larger than 95 mm
Hg.
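
The by-hand steps above can be checked with a short Python sketch. It assumes the same median-of-halves rule for the quartiles that the worked solution uses; statistical packages may apply other quartile conventions and report slightly different Q1 and Q3 values. The function name `iqr_fences` is illustrative only.

```python
from statistics import median

def iqr_fences(values):
    """Median-of-halves quartiles and 1.5 x IQR outlier fences."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]       # lower half (excludes the middle value when n is odd)
    upper = data[n - half:]   # upper half
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    return median(data), q1, q3, iqr, q1 - 1.5 * iqr, q3 + 1.5 * iqr

dbp = [63, 65, 70, 70, 72, 77, 81, 82, 64, 76]  # diastolic BP, mm Hg
med, q1, q3, iqr, low_fence, high_fence = iqr_fences(dbp)
outliers = [x for x in dbp if x < low_fence or x > high_fence]
print(med, q1, q3, iqr, low_fence, high_fence, outliers)
```

The printed values should match the median, quartiles, IQR, and fences worked out by hand, with an empty list of outliers.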

7. Computer Practice (SPSS):
Consider the following small data set measuring Body Mass Index (in kg/m2) in ten patients:
24.4, 26.4, 24.9, 25.5, 22.8, 29.6, 31.9, 28.8, 31.5, 26.8

a. Calculate simple summary statistics (specifically mean, median, standard deviation and
range).
X̄ = 27.26 kg/m2    s = 3.07 kg/m2    Median = 26.60 kg/m2    Range = 9.10 kg/m2

b. In 1-2 sentences, report on the results of your analysis (remember units!)


The mean BMI in this sample is 27.26 kg/m2 (s=3.07 kg/m2). The BMI values ranged
from a low of 22.80 to a high of 31.90 kg/m2 and the data are slightly positively
skewed (mean is higher than the median).
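
The worksheet asks for SPSS, but the same summary statistics can be cross-checked with Python's standard library. This is only a sketch for verification, not the SPSS procedure; the variable names are illustrative.

```python
from statistics import mean, median, stdev

bmi = [24.4, 26.4, 24.9, 25.5, 22.8, 29.6, 31.9, 28.8, 31.5, 26.8]  # kg/m2

bmi_mean = mean(bmi)
bmi_median = median(bmi)
bmi_sd = stdev(bmi)                # sample SD (divides by n - 1)
bmi_range = max(bmi) - min(bmi)

print(f"mean={bmi_mean:.2f} median={bmi_median:.2f} "
      f"s={bmi_sd:.2f} range={bmi_range:.2f}")
```

Because the mean comes out above the median, the slight positive skew noted in the write-up is visible in these numbers as well.
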
8. By Hand:
Now calculate the standard deviation for the above data points by hand. Make sure to include
the correct number of decimal places and units. Confirm that you obtained the same answer as
above.

$$s^2 = \frac{SS}{n-1} = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n-1} = \frac{7516.12 - 7431.076}{9} = 9.4493\ \text{kg}^2/\text{m}^4$$

$$s = \sqrt{\frac{SS}{n-1}} = \sqrt{9.4493} = 3.07\ \text{kg/m}^2$$
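
As a check on the arithmetic, the computational formula for the sum of squares can be evaluated step by step. This sketch assumes the BMI values from question 7; the intermediate names (`sum_x`, `sum_x2`, `ss`) are illustrative.

```python
from math import sqrt

bmi = [24.4, 26.4, 24.9, 25.5, 22.8, 29.6, 31.9, 28.8, 31.5, 26.8]  # kg/m2

n = len(bmi)
sum_x = sum(bmi)                    # sum of X
sum_x2 = sum(x * x for x in bmi)    # sum of X squared
ss = sum_x2 - sum_x ** 2 / n        # SS = sum(X^2) - (sum X)^2 / n
variance = ss / (n - 1)             # s^2, in kg^2/m^4
sd = sqrt(variance)                 # s, in kg/m2

print(f"sum_x2={sum_x2:.2f} ss={ss:.3f} s2={variance:.4f} s={sd:.2f}")
```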

Worksheet #3

9. By Hand:
The probability of adults with allergies reporting symptom relief with a specific medication is
0.80. If the medication is given to 10 new patients with allergies, what is the probability that it
is effective in exactly seven? In one sentence, report on the results of your analyses.

This is an example of a binomial (n=10, p=0.80, x=7)

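$$
\begin{aligned}
P(X=7) &= \frac{10!}{(10-7)!\,7!}\,(0.8)^{7}\,(1-0.8)^{10-7} \\
&= \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(3 \cdot 2 \cdot 1)(7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)} \times 0.2097 \times 0.008 \\
&= \frac{720}{6} \times 0.0016776 = 0.2013
\end{aligned}
$$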

Interpretation: There is a 20.13% probability that exactly 7 of 10 patients will report relief
from symptoms.
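
The binomial calculation above can be verified with a few lines of Python. This is a minimal sketch using `math.comb` for the binomial coefficient; the helper name `binom_pmf` is illustrative. The by-hand version rounds 0.8^7 to 0.2097, so the unrounded result differs slightly beyond the fourth decimal place.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

p7 = binom_pmf(7, 10, 0.80)
print(f"C(10,7)={comb(10, 7)}  P(X=7)={p7:.4f}")
```
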
10. Using the same information from the above question, now calculate the probability that the
allergy medication is effective in one or more patients. In one sentence, report on the
results of your analyses.

To solve the problem using as few steps as possible, we must apply the complement rule (see
Week 3 slides). The complement in this case is that the medication is effective in 0 patients
(X=0). When this probability is subtracted from 1, we get the probability that the allergy
medication is effective in one or more patients.

n=10, p=0.80, x=0

$$P(X=0) = \frac{10!}{(10-0)!\,0!}\,(0.8)^{0}\,(1-0.8)^{10-0} = \frac{10!}{10!}\,(1)\,(0.2)^{10} = 0.0000001024$$

The probability that the allergy medication is effective in one or more patients = 1 –
0.0000001024 = 0.9999998976 ≈ 1.00 (rounded to two decimal places)

Therefore, there is a nearly 100% probability that the allergy medication is effective in one or
more patients in a sample of 10 new patients with allergies.
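
The complement rule can be confirmed by comparing it with the longer route of summing P(X = x) for x = 1 through 10. This is a sketch for checking only; the helper name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.80
p_none = binom_pmf(0, n, p)                 # P(X = 0) = 0.2 ** 10
p_at_least_one = 1 - p_none                 # complement rule
p_direct = sum(binom_pmf(x, n, p) for x in range(1, n + 1))  # long way

print(f"P(X=0)={p_none:.10f}  P(X>=1)={p_at_least_one:.10f}")
```

Both routes give the same probability up to floating-point error, which is why the one-step complement is the preferred method.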

11. Review the following slides containing a magic trick. Explain how this card trick is possible.
What is the probability of your card disappearing?

The trick relies on the sample space for a deck of cards. Specifically, we are interested in
the outcomes containing face cards, of which there are 12 in a full deck of cards. The program
works by changing all of the face cards in the slide; thus, every card you could have chosen
disappears in the trick. So, the probability of your chosen card disappearing is 100%!

Worksheet #4

12. By Hand:
Suppose systolic blood pressure is assumed to be normally distributed with µ = 112 mm Hg
and σ = 16 mm Hg in a population. If a person is randomly sampled from this population,
what is the probability that this person’s BP is between 108 mm Hg and 114 mm Hg?

$$P(108 < X < 114) = P\left(\frac{108-112}{16} < Z < \frac{114-112}{16}\right) = P(-0.25 < Z < 0.125) \approx P(-0.25 < Z < 0.12)^{*}$$

*Note: must round Z-score to 2 decimal places in order to use Standard Normal Table
= .5478 – (1 – .5987)

= .5478 – .4013

= .1465

Concluding sentence: The probability of a person’s BP being between 108 mm Hg and 114
mm Hg is 0.1465 (or 14.65%).
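
To see the effect of rounding the Z-score for the table lookup, the same probability can be computed with `statistics.NormalDist` from Python's standard library. This sketch evaluates both the table-style answer (Z rounded to 0.12) and the unrounded answer; the variable names are illustrative.

```python
from statistics import NormalDist

sbp = NormalDist(mu=112, sigma=16)   # systolic BP, mm Hg
std = NormalDist()                   # standard normal, mean 0 and SD 1

# Table-style answer: upper Z-score rounded to 0.12 as in the worked solution
p_table = std.cdf(0.12) - std.cdf(-0.25)

# Unrounded answer: uses Z = 0.125 implicitly
p_exact = sbp.cdf(114) - sbp.cdf(108)

print(f"table-style={p_table:.4f}  unrounded={p_exact:.4f}")
```

The table-style value reproduces 0.1465, while the unrounded value is slightly larger (about 0.148), which shows how much the two-decimal Z-score restriction costs in precision.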

13. By Hand:
Data on men’s serum cholesterol levels (in mg/dL) were analyzed to determine the SEM.
Here’s the output:

Using the information provided in the output, calculate the 95% confidence interval, and
interpret the results.
95% CI = 264.43 ± (1.96) (13.759)
= 264.43 ± 26.968
= 237.462 to 291.398
= 237.46 to 291.40
Based on this sample, our estimate of the mean for serum cholesterol levels in men is
264.43 mg/dL, and we are 95% confident that the interval from 237.46 to 291.40 mg/dL
captures the true population mean (in the long run, 95% of intervals constructed this way
would capture it).

Now calculate the 99% confidence interval and interpret the results.
99% CI = 264.43 ± (2.58) (13.759)
= 264.43 ± 35.498
= 228.932 to 299.928
= 228.93 to 299.93
Based on this sample, our estimate of the mean for serum cholesterol levels in men is
264.43 mg/dL, and we are 99% confident that the interval from 228.93 to 299.93 mg/dL
captures the true population mean (in the long run, 99% of intervals constructed this way
would capture it).
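
Both intervals follow the same pattern, mean ± Z × SEM, which can be sketched as a small Python function. The mean (264.43) and SEM (13.759) are the values used in the worked solution; the function name `z_interval` is illustrative.

```python
def z_interval(mean, sem, z):
    """Confidence interval as mean +/- z * SEM."""
    margin = z * sem
    return mean - margin, mean + margin

mean_chol, sem_chol = 264.43, 13.759     # mg/dL, from the worked solution

ci95 = z_interval(mean_chol, sem_chol, 1.96)
ci99 = z_interval(mean_chol, sem_chol, 2.58)

print(f"95% CI: {ci95[0]:.2f} to {ci95[1]:.2f} mg/dL")
print(f"99% CI: {ci99[0]:.2f} to {ci99[1]:.2f} mg/dL")
```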

14. Why is the 99% confidence interval calculated above wider than the 95% confidence
interval?
Because it allows one to be more confident that the unknown population parameter (mean)
is contained within the interval.
Also, the Z-score values used to calculate the 99% CI (Z=±2.58) are larger than the values used
for 95% CI (Z=±1.96).
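
The critical values themselves come from the standard normal distribution: a higher confidence level leaves less area in the tails, which pushes the cutoff further from zero. A minimal sketch using `statistics.NormalDist.inv_cdf` follows; the helper name `z_critical` is illustrative.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical Z value for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

z95 = z_critical(0.95)
z99 = z_critical(0.99)

print(f"z95={z95:.2f}  z99={z99:.2f}")
```

Since the margin of error is Z × SEM and the SEM is fixed by the sample, the larger critical value directly produces the wider 99% interval.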
