
Statistics 324 Homework 1

Jonathan Nolden

*Submit your homework to Canvas by the due date and time. Email your lecturer if you
have extenuating circumstances and need to request an extension.
*If an exercise asks you to use R, include a copy of the code and output. Please edit your
code and output to be only the relevant portions.
*If a problem does not specify how to compute the answer, you may use any appropriate
method. I may ask you to use R or use manual calculations on your exams, so practice
accordingly.
*You must include an explanation and/or intermediate calculations for an exercise to be
complete.
*Be sure to submit the HWK1 Autograde Quiz which will give you ~20 of your 40 accuracy
points.
*50 points total: 40 points accuracy, and 10 points completion

Basics of Statistics and Summarizing Data Numerically and Graphically

(I)
Exercise 1. A number of individuals are interested in the proportion of citizens within a
county who will vote to use tax money to upgrade a professional baseball stadium in the
upcoming vote. Consider the following methods:
The Baseball Team Owner surveyed 8,000 people attending one of the baseball games
held in the stadium. Seventy eight percent (78%) of respondents said they supported the
use of tax money to upgrade the stadium.
The Pollster generated 1,000 random numbers between 1-52,661 (number of county
voters in last election) and surveyed the 1,000 citizens who corresponded to those
numbers on the voting roll. Forty three percent (43%) of respondents said they supported
the use of tax money to upgrade the stadium.
a. What is the population of interest? What is the parameter of interest? Will this
parameter ever be calculated?
The population of interest is all county citizens who will vote in the upcoming election;
the data are their Yes/No votes on the stadium measure. (The 52,661 figure is the number of
voters in the last election, so it only approximates the size of this population.) The
parameter of interest is the proportion of those voters who vote to use tax money to
upgrade the professional baseball stadium. This parameter will be calculated: the upcoming
vote is effectively a census of the population, so once the ballots are counted the true
proportion will be known.
b. What were the sample sizes used and statistics calculated from those samples?
Are these simple random samples from the population of interest?
The baseball team owner’s sample size was 8,000 people, all attending a game at the
stadium. The statistic is that 78% of respondents supported using tax money to upgrade the
stadium. This is not a simple random sample from the population: only people at a game
could be selected, so most voters had no chance of being surveyed. It is also likely
biased, because people who attend games are probably more willing than the typical voter
to vote yes on upgrading the stadium.
The Pollster’s sample size was 1,000 people. The statistic is that 43% of respondents
supported using tax money to upgrade the stadium. This is a simple random sample from the
voting roll of the last election: every person on the roll had the same chance of being
selected. It is a simple random sample from the population of interest to the extent that
the roll matches the people who will vote in the upcoming election.
c. The baseball team owner claims that the survey done at the baseball stadium will
better predict the voting outcome because the sample size was much larger. What
is your response?
The baseball team owner is incorrect. Even though his sample is larger than the
pollster’s, he did not randomly select it from the population of interest. He surveyed
only people attending baseball games, a group that is likely biased towards spending the
money on the stadium. A larger sample reduces random sampling error, but it does not
remove this selection bias. Therefore the pollster’s result should better predict the
vote even though he surveyed fewer people, because his survey was a simple random sample
and the owner’s was not.
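The point about sample size versus random selection can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the share of voters who are baseball fans (30%) and the support rates among fans (80%) and non-fans (27%) are hypothetical numbers chosen for illustration, not data from the exercise.

```r
set.seed(324)  # arbitrary seed so the run is reproducible

N <- 52661                              # county voters (from the exercise)
fan <- rbinom(N, 1, 0.30) == 1          # hypothetical: 30% are baseball fans
p_yes <- ifelse(fan, 0.80, 0.27)        # hypothetical support rates
vote_yes <- rbinom(N, 1, p_yes) == 1    # each voter's Yes/No response

true_p <- mean(vote_yes)                # the parameter

# Owner-style sample: 8,000 people drawn only from the fans
owner_est <- mean(vote_yes[sample(which(fan), 8000)])

# Pollster-style sample: simple random sample of 1,000 from all voters
srs_est <- mean(vote_yes[sample(N, 1000)])

c(true = true_p, owner = owner_est, pollster = srs_est)
```

Under these assumptions the larger sample lands far from the parameter, while the smaller random sample typically lands within a few percentage points of it.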
Exercise 2. There are 12 numbers in a sample, and the mean is 𝑥‾ = 24. The minimum of
the sample is accidentally changed from 11.9 to 1.19.
a. Is it possible to determine the direction in which (increase/decrease) the mean
(𝑥‾) changes? Or how much the mean changes? If so, by how much does it change? If
not, why not?
It is possible to determine both the direction and the size of the change in the mean when
11.9 is changed to 1.19. The mean will decrease because one value gets smaller while the
other 11 values stay the same, so the sum gets smaller. The sum drops by 11.9 - 1.19 = 10.71,
so the mean decreases by 10.71/12 = 0.8925, from 24 to 23.1075, as shown in the calculation
below. This calculation is possible because we know the sample size and both the old and
new values.
(11.9-1.19)/12

## [1] 0.8925

b. Is it possible to determine the direction in which the median changes? Or how much
the median changes? If so, by how much does it change? If not, why not?
It is possible to determine that the median does not change. With 12 values, the median is
the average of the 6th and 7th values in sorted order. The changed value was the minimum
before the change and is still the minimum after it, so the order of the other 11 values
and the two middle values stay the same. The median changes by 0.
c. Is it possible to predict the direction in which the standard deviation changes? If so,
does it get larger or smaller? If not, why not? Describe why it is difficult to predict by
how much the standard deviation will change in this case.
The standard deviation will get larger. The standard deviation measures how far the data
points typically fall from the mean. The value 11.9 was already below the mean of 24, and
changing it to 1.19 moves it even farther below; the mean only drops by 0.8925, so this
point’s deviation grows by much more than the other deviations change. It is difficult to
predict by how much the standard deviation will change because the calculation uses the
deviations of all 12 data points from the mean, and we are only given 1 of the 12 points.
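The three answers in Exercise 2 can be checked on a concrete sample. The sketch below uses a hypothetical data set: only n = 12, the minimum of 11.9, and the mean of 24 come from the exercise, and the other 11 values are made up. The last lines use the identity that lowering a value a by d raises the sum of squared deviations by 2d(x̄ − a) + d²(1 − 1/n), which is positive whenever a is below the mean. The increase in the sum of squares is therefore known, but the new standard deviation still depends on the original sum of squares, which the exercise does not give.

```r
# Hypothetical sample: n = 12, minimum 11.9, mean 24 (the other 11 values are made up)
x <- c(11.9, 15, 18, 20, 22, 24, 25, 27, 29, 31, 32, 33.1)
y <- x
y[which.min(y)] <- 1.19          # the accidental change

mean(x) - mean(y)                # drop in the mean: (11.9 - 1.19)/12
median(x) == median(y)           # the median is unchanged
sd(y) > sd(x)                    # the standard deviation gets larger

# Increase in the sum of squared deviations, using only known quantities
d <- 11.9 - 1.19
ss_increase <- 2 * d * (24 - 11.9) + d^2 * (1 - 1/12)
all.equal(ss_increase, sum((y - mean(y))^2) - sum((x - mean(x))^2))
```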
Exercise 3: After manufacture, computer disks are tested for errors. The table below
tabulates the number of errors detected on each of the 100 disks produced in a day.

Number of Defects    Number of Disks
0                    41
1                    31
2                    15
3                    8
4                    5
a. Describe the type of data that is being recorded about the sample of 100 disks,
being as specific as possible.
The type of data being recorded is Quantitative - Discrete. It is quantitative because each
observation is a numerical count of defects, not a category. It is discrete because a disk
can’t have 1.5 defects; the count is a whole number such as 1 or 2.
b. A frequency histogram showing the number of errors on the 100 disks is given
below. Write the R code to produce this frequency histogram. Be sure to create
useful labels. Hints: use the rep() function to define your defect data. Also use ylim
and breaks to format your graph.
Defects <- c(rep(0, 41), rep(1, 31), rep(2, 15), rep(3, 8), rep(4, 5))
hist(Defects, breaks = seq(-0.5, 4.5, by = 1), ylim = c(0, 50), labels = TRUE,
     main = "Defect Histogram", xlab = "Number of Defects", ylab = "Number of Disks")
Defect Histogram
c. What is the shape of the histogram for the number of defects observed in this
sample? Why does that make sense in the context of the question?
The histogram is right skewed. This shape makes sense because the factory is trying to
produce disks with no defects, so a disk with 4 defects is much less likely than a disk
with 0 defects. The counts cannot go below 0, and as the number of defects increases the
number of disks with that many defects decreases, which leaves a long tail to the right.
d. Calculate the mean and median number of errors detected on the 100 disks by
hand and with R. How do the mean and median values compare and is that
consistent with what we would guess based on the shape? [You can use LaTeX such
as $\bar{x} = \frac{value1}{value2}$ to help you show your work neatly.]

The median is 1 and the mean is 1.05, so the mean is larger than the median by 0.05. This
is consistent with what we would guess based on the shape: in right skewed data the few
large values (the 3’s and 4’s) pull the mean above the median. The difference is small
because most of the disks have 0 or 1 defects and there are not many 2’s, 3’s, or 4’s in
comparison. The calculations are shown below.
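#Hand Calculations
#mean = (0*41 + 1*31 + 2*15 + 3*8 + 4*5)/100 = 105/100 = 1.05
mean_hand <- (0*41 + 1*31 + 2*15 + 3*8 + 4*5)/100
mean_hand

## [1] 1.05

#median_hand = 1: with n = 100 the median is the average of the 50th and 51st sorted
#values; sorted positions 42 through 72 are all 1, so both values are 1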

#R Calculations
mean_R = mean(Defects) #mean_R = 1.05
mean_R

## [1] 1.05

Median_R = median(Defects) #Median_R = 1

Median_R

## [1] 1
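The hand reasoning for the median can also be done directly from the frequency table by looking at cumulative counts. This is a minimal sketch, not part of the original solution; `value_at` is a hypothetical helper name introduced here, and the median line assumes an even number of observations, as in this exercise.

```r
defects <- 0:4
disks   <- c(41, 31, 15, 8, 5)    # counts from the table in the exercise

cum_disks <- cumsum(disks)        # running total of disks: 41 72 87 95 100
n <- sum(disks)

# Smallest defect count whose cumulative count reaches a given sorted position
value_at <- function(pos) defects[which(cum_disks >= pos)[1]]

# n is even here, so the median averages the 50th and 51st sorted values
median_tab <- (value_at(n / 2) + value_at(n / 2 + 1)) / 2
mean_tab   <- sum(defects * disks) / n

c(mean = mean_tab, median = median_tab)
```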

e. Calculate the sample standard deviation "by hand" and using R. Are the values
consistent between the two methods? How would our calculation differ if instead
we know that these 100 values were the whole population? [hint: use multiplication
instead of repeated addition]
My hand calculation of the sample standard deviation matches the value from R, so the two
methods are consistent. If these 100 values were the whole population, we would use the
population formula instead of the sample formula. The key difference is in the denominator
of the variance: for a sample the denominator is (n-1), while for a population it is n.
Treating the data as a population lowers the standard deviation by about 0.006, from 1.158
to 1.152.
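#Hand Calculation
#sum of squared deviations = 41(0-1.05)^2 + 31(1-1.05)^2 + 15(2-1.05)^2 + 8(3-1.05)^2 + 5(4-1.05)^2 = 132.75
#-> sd_sample = sqrt(132.75/99) = 1.158
#-> sd_pop = sqrt(132.75/100) = 1.152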

#R Calculations
# Standard Deviation of sample (named sd_sample so the variable does not share a name with sd())
sd_sample <- sd(Defects, na.rm = FALSE)
sd_sample

## [1] 1.157976

#standard deviation of population

sd_pop <- sqrt(sum((Defects - mean(Defects))^2) / length(Defects))
sd_pop

## [1] 1.152172
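The hint about using multiplication instead of repeated addition can be followed in R by working from the frequency table rather than the 100 individual values. A minimal sketch; the variable names are chosen here for illustration and are not from the original solution.

```r
defects <- 0:4
disks   <- c(41, 31, 15, 8, 5)    # counts from the table in the exercise
n    <- sum(disks)
xbar <- sum(defects * disks) / n

# Multiply each squared deviation by its count instead of adding 100 terms
ss <- sum(disks * (defects - xbar)^2)

sd_sample_tab <- sqrt(ss / (n - 1))   # sample formula, denominator n - 1
sd_pop_tab    <- sqrt(ss / n)         # population formula, denominator n

c(ss = ss, sample = sd_sample_tab, population = sd_pop_tab)
```

Because both formulas share the same numerator, the population value is always the sample value multiplied by sqrt((n - 1)/n), which is very close to 1 when n = 100.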

f. Construct a boxplot for the number of errors data using R with helpful labels.
Explain how the shape of the data (identified in (c)) can be seen from the boxplot
using words such as minimum, 1st quartile, median,3rd quartile, and maximum.
The boxplot shown below matches the shape of the data from the histogram in part c. The
minimum and the 1st quartile are both 0, so there is no whisker on the left side of the
box: at least 25% of the disks sit at the smallest possible value. The median is 1, which
matches the median calculated in part d. The 3rd quartile is 2, so at least 75% of the
disks have 2 or fewer defects. The maximum is 4, which is the end of the right whisker and
also the largest value in the histogram. The right whisker runs from 2 to 4 while the left
whisker has length 0, so the upper 25% of the data is spread over a much wider range than
the lower 25%. That long right tail is the right skew seen in the histogram.
# horizontal = TRUE puts the data values on the x-axis, so xlab names the errors
boxplot(Defects, main = "Boxplot of Number of Errors on Disks",
        xlab = "Number of Errors", ylab = "Disks", horizontal = TRUE)
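The quartiles read off the boxplot can be confirmed numerically. This sketch rebuilds the data and uses base R's `fivenum()` and `boxplot.stats()`; the name `bp` is only an illustrative variable.

```r
Defects <- c(rep(0, 41), rep(1, 31), rep(2, 15), rep(3, 8), rep(4, 5))

fivenum(Defects)               # minimum, lower hinge, median, upper hinge, maximum
bp <- boxplot.stats(Defects)   # the statistics boxplot() draws by default
bp$stats                       # whisker ends, box edges, and median
bp$out                         # points flagged as outliers, if any
```

The lower whisker end and the lower edge of the box come out equal, which is why no left whisker is visible in the plot.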

g. Explain why the histogram is better able to show the discrete nature of the data than
a boxplot.
Histograms are better at showing discrete data (like this example) than boxplots. This is
because a histogram displays the frequency of each value, so you can see how many disks
had 0, 1, 2, 3, or 4 defects and that no values fall in between. The boxplot doesn’t
display this information. It summarizes the distribution with only five numbers (minimum,
1st quartile, median, 3rd quartile, and maximum) and not the individual frequencies. For
example, from the boxplot we know at least 75% of the disks have 2 or fewer defects, but we
don’t know how many disks had exactly 2 defects. Also, the histogram shows the full shape
of the distribution, whereas the boxplot only indicates roughly how the data is skewed
left or right.
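One way to see what the boxplot leaves out is to compare the disk data with a second data set that has different frequencies but the same five-number summary. The second set of counts below is made up purely for this comparison and is not from the exercise.

```r
counts_a <- c(41, 31, 15, 8, 5)   # the disk data from the exercise
counts_b <- c(30, 42, 15, 8, 5)   # made-up alternative counts for comparison

a <- rep(0:4, times = counts_a)
b <- rep(0:4, times = counts_b)

fivenum(a)                        # identical five-number summaries ...
fivenum(b)
table(a)                          # ... but different frequencies
table(b)
```

Both data sets would produce the same box and whiskers, while their histograms would differ in the heights of the bars at 0 and 1.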
