Stat 231 A1

This document contains Nathan Cheung's answers to questions in Stat 231 Assignment 1. It includes summaries of uniform, exponential, and normal distributions based on sample data. Nathan analyzes relationships between variables like weight and cost, finding a weak negative correlation. Overall, the assignment helped Nathan learn basic statistics concepts, get introduced to the R programming language, and model real-world data with different distributions.

Nathan Cheung

20815290
Stat 231 Assignment 1
23/5/2021

Question 1

a) $E(X) = \frac{1}{2}(a+b) = \frac{1}{2}(2+5) = \frac{7}{2} = 3.5$
$\mathrm{Var}(X) = \frac{1}{12}(b-a)^2 = \frac{1}{12}(5-2)^2 = \frac{9}{12} = 0.75$
b) The sample mean and variance are 3.499 and 0.753 respectively, each within 0.01 of
the theoretical mean and variance.
c) Since the sample follows a uniform distribution, which is symmetric, the skewness
should be close to zero. The skewness of the generated sample is 0.044.
d) The sample kurtosis is less than 3, since the data are distributed uniformly and do not
have a peak. The kurtosis of the generated sample is 1.720.
e)

A normal distribution is not a good approximation for a uniform distribution. Even though
the normal distribution is symmetric, which gives it zero skewness, it has its highest
density at the centre and its lowest at the ends, giving it a bell shape with a kurtosis
of 3, well above the sample kurtosis of 1.720. A uniform distribution has constant density
for all x in its range, therefore the normal distribution does not model it well.
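The comparison in Question 1 can be sketched in R as follows. This is a minimal illustration, not the code used for the assignment: the seed, the sample size of 1000, and the helper names `sample_skewness` and `sample_kurtosis` are assumptions. The helpers use the divide-by-n moment definitions so that no extra package is needed.

```r
# Moment helpers written in base R (hypothetical names, divide-by-n definitions).
sample_skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}
sample_kurtosis <- function(x) {
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2
}

set.seed(231)                        # assumed seed, for reproducibility only
u <- runif(1000, min = 2, max = 5)   # assumed sample size of 1000

# Theoretical values for a uniform distribution on [2, 5].
theory <- c(mean = (2 + 5) / 2, variance = (5 - 2)^2 / 12,
            skewness = 0, kurtosis = 1.8)
observed <- c(mean = mean(u), variance = var(u),
              skewness = sample_skewness(u), kurtosis = sample_kurtosis(u))

print(round(rbind(theory, observed), 3))
```

The theoretical kurtosis of 1.8 for any uniform distribution is well below the normal value of 3, which is the numerical side of the argument in part e).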
Question 2
a) $E(X) = 0.5$
$\mathrm{Var}(X) = 0.5^2 = 0.25$
The mean and variance from the sample are 0.478 and 0.254 respectively, which differ
only slightly from the theoretical mean and variance.
b) The distribution is positively skewed since the data are concentrated at low values,
leaving the right tail longer than the left. The skewness of the sample is 1.525.
c)

The screenshot above shows the cumulative distribution function for the sample and its
normal approximation. The normal distribution is not a good approximation for the
sample. The sample follows an exponential distribution, meaning its data are positively
skewed with higher density at low values, while the normal distribution is not skewed at all.
This explains the difference in the shape of the CDFs: the exponential CDF rises steeply
at first and then flattens out, while the normal CDF is S-shaped and symmetric about its
mean.
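The difference between the two CDFs can be tabulated directly. The sketch below assumes the normal approximation uses the same mean and standard deviation as the exponential distribution (both 0.5 here); the grid of x values is an arbitrary choice for illustration.

```r
theta <- 0.5                 # exponential mean from part a)
x <- seq(0, 3, by = 0.5)     # arbitrary grid of evaluation points

exp_cdf  <- pexp(x, rate = 1 / theta)            # 1 - exp(-x / theta)
norm_cdf <- pnorm(x, mean = theta, sd = theta)   # normal with matching mean and sd

print(round(data.frame(x, exp_cdf, norm_cdf), 3))

# The normal approximation places probability below zero, where the
# exponential distribution has none.
print(pnorm(0, mean = theta, sd = theta))
```

The successive increases in `exp_cdf` shrink as x grows, which is the steep-then-flat shape, while the normal CDF already has positive probability at x = 0.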
Question 3

a)
Mean = 0.364
Variance = 0.920
Skewness = -0.059
Kurtosis = 2.311

b)
Mean = 0.008
Variance = 0.889
Skewness = 0.065
Kurtosis = 2.951

c)
Mean = 0.015
Variance = 0.976
Skewness = 0.052
Kurtosis = 2.982

d) As the sample size increases, the sample mean tends toward 0, which is the
theoretical mean, and the variance tends toward 1, which is the theoretical variance.
The skewness and kurtosis approach 0 and 3 respectively as the sample size increases,
so the larger the sample, the better it is approximated by the normal distribution.
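The effect of sample size in Question 3 can be reproduced with a short simulation. The sample sizes and seed below are assumptions (the sizes used in parts a) to c) are not stated here), and `skew` and `kurt` are hypothetical helper names using divide-by-n moments.

```r
# Hypothetical base-R helpers for the third and fourth standardized moments.
skew <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5
kurt <- function(x) mean((x - mean(x))^4) / mean((x - mean(x))^2)^2

set.seed(231)                     # assumed seed, for reproducibility only
sizes <- c(30, 1000, 100000)      # assumed sample sizes

# One row of summary statistics per sample size, drawn from N(0, 1).
summaries <- t(sapply(sizes, function(n) {
  z <- rnorm(n)
  c(n = n, mean = mean(z), variance = var(z),
    skewness = skew(z), kurtosis = kurt(z))
}))

print(round(summaries, 3))
```

Individual runs do not move monotonically toward the theoretical values, but the largest sample should sit closest to mean 0, variance 1, skewness 0, and kurtosis 3.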
Question 4
a) The discount offered is a continuous variate.
b) My R commands and output:
> min(dataset$Discount_offered)
[1] 1

> max(dataset$Discount_offered)
[1] 64

> median(dataset$Discount_offered)
[1] 7

> quantile(dataset$Discount_offered,0.25)
25%
4
> quantile(dataset$Discount_offered,0.75)
75%
10.25

> max(dataset$Discount_offered) - min(dataset$Discount_offered) #range
[1] 63

> quantile(dataset$Discount_offered,0.75) - quantile(dataset$Discount_offered,0.25) #interquartile range
75%
6.25

c) My R commands and output:

> round(mean(dataset$Discount_offered), digits=3)
[1] 12.92

> round(sd(dataset$Discount_offered), digits=3)

[1] 15.361

> round(skewness(dataset$Discount_offered), digits=3)

[1] 1.854

The sample mean and the standard deviation are close, while the skewness is greater
than zero, meaning most observations lie at low values, leaving a long right tail.
These properties are similar to those of an exponential distribution, whose mean and
standard deviation are equal.
d)
My R commands and output:
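> hist(dataset$Discount_offered,breaks=50, main="Discount Offered",col="seashell",xlab="Amount",freq=FALSE)
> # dexp() defaults to rate = 1, so set the rate to 1/mean to put the curve on the scale of the data
> curve(dexp(x,rate=1/mean(dataset$Discount_offered)),col="red", add = TRUE)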

e) The sample mean and the sample standard deviation are 12.92 and 15.361 respectively.
They are fairly close, differing by 2.441.

f) The general shape of the exponential distribution and the histogram is similar, with
most observations at low values and fewer falling on the right, leaving a long right
tail. The calculated skewness for the sample is 1.854, which is close to the skewness of
2 for an exponential distribution. Also, as mentioned in the previous part, the sample mean
and standard deviation are quite close, as they are for the exponential distribution,
whose mean and standard deviation are equal.
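The checks used in Question 4 can be sketched on simulated data. Since the assignment's data set is not reproduced here, `discount` below is a hypothetical stand-in drawn from an exponential distribution with an assumed mean of 13; the seed and sample size are also assumptions.

```r
set.seed(231)                           # assumed seed, for reproducibility only
discount <- rexp(500, rate = 1 / 13)    # hypothetical stand-in for the discount column

theta_hat <- mean(discount)             # estimate of the exponential mean
rate_hat  <- 1 / theta_hat              # dexp() is parameterised by rate = 1 / mean

# For an exponential distribution the mean and standard deviation are equal.
print(c(mean = theta_hat, sd = sd(discount)))

# Density-scaled histogram with the fitted exponential density overlaid.
hist(discount, breaks = 50, freq = FALSE, col = "seashell",
     main = "Simulated discounts", xlab = "Amount")
curve(dexp(x, rate = rate_hat), col = "red", add = TRUE)
```

Passing the rate explicitly matters because `dexp` defaults to `rate = 1`, which corresponds to a mean of 1 rather than the sample mean.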
Question 5
a) The cost of the product is a continuous variate.
b) My R commands and output:
> round(mean(dataset$Cost_of_the_Product),3)
[1] 211.474

> round(median(dataset$Cost_of_the_Product),3)
[1] 218

> round(sd(dataset$Cost_of_the_Product),3)
[1] 47.851

> round(skewness(dataset$Cost_of_the_Product),3)
[1] -0.206

> round(kurtosis(dataset$Cost_of_the_Product),3)
[1] 2.015

c)
My R commands and output:
> hist(dataset$Cost_of_the_Product,breaks=50, main="Cost of the Product",col="seashell",xlab="Amount",freq=FALSE)
> curve(dnorm(x,mean(dataset$Cost_of_the_Product),sd(dataset$Cost_of_the_Product)), col="red",add=TRUE)

d) From a numerical standpoint, the skewness of the sample data is -0.206, which is fairly
close to the 0 skewness of the normal distribution. The kurtosis of the sample is 2.015,
which is lower than the normal distribution's 3, so the sample is somewhat flatter than a
normal distribution, although it still shows a peak in the middle. The majority of the
sample data are located in the middle of the histogram, between about 150 and 250, forming
a roughly bell shaped curve. Also the tails of the histogram are roughly the same length
and the overall shape of the sample is symmetrical, meaning the normal distribution can be
a rough estimate for the sample data.

Question 6
a)
My R commands and outputs:
plot(dataset$Weight_in_gms, dataset$Cost_of_the_Product,
xlab="Weight (g)",
ylab="Cost of product ($)",
pch=19,
col="darkblue",
cex.axis=1.25,
cex.lab=1.5)
> x<-dataset$Weight_in_gms
> y<-dataset$Cost_of_the_Product
> RegModel <- lm(y~x)
> abline(RegModel)

b) My R commands and outputs:

> round(cor(x,y), 3)
[1] -0.085 #Correlation of x and y as defined in a)

c) As illustrated in the scatter plot, the sample data are scattered all across the graph. As
suggested by the linear regression model, which has the equation y = -0.002447x
+ 220.196, the two variates have a weak negative relationship. Also, from part b) the
correlation is calculated as -0.085, which supports the conclusion that the correlation
is weak and negative.
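The link between the fitted slope and the correlation in Question 6 can be checked numerically. The data below are simulated stand-ins with assumed parameters, not the assignment's data set; the point is only the identity slope = r * sd(y) / sd(x) for simple linear regression.

```r
set.seed(231)                                          # assumed seed
weight <- runif(500, min = 1000, max = 6000)           # hypothetical weights in grams
cost   <- 220 - 0.0025 * weight + rnorm(500, sd = 48)  # hypothetical costs

fit   <- lm(cost ~ weight)
r     <- cor(weight, cost)
slope <- unname(coef(fit)["weight"])

# In simple linear regression the slope equals r * sd(y) / sd(x),
# so a weak correlation and a shallow slope go together.
print(c(r = r, slope = slope, implied_slope = r * sd(cost) / sd(weight)))

# The proportion of variance explained is the squared correlation.
print(c(r_squared = r^2, lm_r_squared = summary(fit)$r.squared))
```

A small value of r therefore implies both a shallow slope relative to the spread of the data and a very small proportion of variance explained.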

Question 7
a) The mode of shipment is a categorical variate.
b) My R commands and outputs:
> table(dataset$Mode_of_Shipment)

Flight   Road   Ship
    88     61    351

c)
My R commands and outputs:
> boxplot(formula = Weight_in_gms ~ Mode_of_Shipment,
data = dataset,
outline=TRUE,
frame=TRUE,
col="seashell",
ylab="Weight (g)",
cex.axis=1.15,
cex.lab=1.5)
d) The 3 data sets have a lot of similarities. From the box plot, the interquartile ranges of
the 3 modes of shipment are similar, with flight having the highest maximum and minimum
values. The data sets are also asymmetric, with the median located near the top of the box
for all 3 of them, which suggests a wider spread below the median than above it. The range
of weight is large for all the modes of shipment, spanning from 1000 to around 6000 g.
Finally, there are no outliers in the weights for any mode of shipment.
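The "no outliers" statement relies on the rule R's box plot uses to flag points: anything more than 1.5 times the box height beyond the hinges. A minimal sketch on hypothetical data, with one deliberately extreme weight added so that the rule has something to flag:

```r
set.seed(231)                                  # assumed seed
w <- c(runif(200, min = 1000, max = 6000),     # hypothetical weights in grams
       20000)                                  # one deliberately extreme value

hinges <- fivenum(w)[c(2, 4)]                  # lower and upper hinges of the box
spread <- diff(hinges)                         # height of the box

lower_fence <- hinges[1] - 1.5 * spread
upper_fence <- hinges[2] + 1.5 * spread

# Points outside the fences are the ones drawn individually on a box plot.
outliers <- w[w < lower_fence | w > upper_fence]
print(outliers)

# boxplot.stats() applies the same rule with its default coef = 1.5.
print(boxplot.stats(w)$out)
```

When no point falls outside the fences, as in the assignment's box plot, both results are empty.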

Question 8
This homework allowed me to understand basic concepts of statistics, for example
the five-number summary, as well as how to interpret the shape of a distribution and its
variability. This assignment acted as an introduction to the software R for me, as we were
instructed to perform basic tasks such as importing data sets and plotting histograms.
During the exercise, I learnt how to model data with different distributions; for instance, I
was instructed to overlay an exponential distribution over a histogram. After this exercise,
I feel much more confident using R, since I had limited prior knowledge of it. It also
refreshed my memory of the basics of empirical studies in statistics.
