Solution STA101 Assignment 1&2 Summer24

The document contains solutions for STA101 assignments, including statistical analyses and frequency distributions based on student data. It covers various topics such as qualitative and quantitative data, frequency distribution tables, and measures of central tendency like mean, median, and mode. Additionally, it includes graphical representations and calculations related to stock prices and employee salary bands.

Uploaded by

montaxul.hasan

STA101 Assignment 1&2 Solution

Answer to the question no. 1


Marks: 45
1. (i) [5.5]
Qualitative             Quantitative (discrete)          Quantitative (continuous)
Hair color              Shoe size                        Foot length
Computer password       Shirt size (10, 12, 14, etc.)    Height
License plate number                                     Time to drive to campus
Shirt size (S, M, L)
Zip code

(ii)
Nominal      Ordinal    Interval              Ratio
Zip code     Grade      IQ                    Height
Gender       Rating     SAT score             Time
Eye color    Ranking    Temperature (F, C)    Weight

2. A sample of 100 students was taken, and these students were asked about the amount of
money they possess. The following table gives the frequency distribution of their responses.
[7]
Amount of Money (Tk.)    Number of Students
0 - 99                   18
100 - 199                K
200 - 299                12
300 - 399                K-4
400 - 499                √81
500 - 599                √49
600 - 699                8
700 - 799                9
800 - 899                6
900 - 999                5

a) Find the value of K and class midpoints.

STA101 (Introduction to Statistics) _Assignment 1&2_Summer 24


b) Do all the classes have the same width? If so, what is the width?
c) Prepare the relative frequency and percentage distribution columns.
d) Prepare a cumulative frequency distribution. Calculate the cumulative relative frequencies
and cumulative percentages for all classes.
e) Find the percentage of the students who possess:
i. A minimum of Tk. 500
ii. A maximum of Tk. 499
f) Represent the data set in a suitable graph with appropriate information (title, axes, labels, etc.) and comment on the graph.

Answer to the question no. 2


a)
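The frequencies must sum to 100, with √81 = 9 and √49 = 7:
18 + K + 12 + (K - 4) + 9 + 7 + 8 + 9 + 6 + 5 = 100, so 2K + 70 = 100 and K = 15.

Amount       Midpoint (x_i)    Frequency (f_i)
0 - 99       49.5              18
100 - 199    149.5             K = 15
200 - 299    249.5             12
300 - 399    349.5             K - 4 = 11
400 - 499    449.5             9
500 - 599    549.5             7
600 - 699    649.5             8
700 - 799    749.5             9
800 - 899    849.5             6
900 - 999    949.5             5
Total                          $\sum_{i=1}^{10} f_i = 100$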
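As a quick check of part a), the value of K and the class midpoints can be recomputed in a few lines. This is only a sketch: the variable names (`limits`, `known_total`, `k`, `midpoints`, `frequencies`) are illustrative and not part of the assignment.

```python
# Class limits 0-99, 100-199, ..., 900-999.
limits = [(lo, lo + 99) for lo in range(0, 1000, 100)]

# Frequencies that do not depend on K (sqrt(81) = 9, sqrt(49) = 7).
known_total = 18 + 12 + 9 + 7 + 8 + 9 + 6 + 5

# The sample has 100 students: known_total + K + (K - 4) = 100.
k = (100 - known_total + 4) // 2

# Midpoint of a class = (lower limit + upper limit) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in limits]

frequencies = [18, k, 12, k - 4, 9, 7, 8, 9, 6, 5]
print(k, midpoints, sum(frequencies))
```

The printed total should come back to 100, which confirms that K = 15 is consistent with the sample size.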



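b) Yes, all the classes have the same width. The class width is 100, the difference between consecutive lower limits (for example, 100 - 0) or, equivalently, between the class boundaries (99.5 - (-0.5)). The difference between the limits of a single class, 99 - 0 = 99, understates the width because the limits are whole numbers.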

c & d)
Amount       f_i    Relative     Percentage    Cumulative    Cumulative            Cumulative
                    Frequency                  Frequency     Relative Frequency    Percentage
0 - 99 *     18     0.18         18            18            0.18                  18
100 - 199    15     0.15         15            33            0.33                  33
200 - 299    12     0.12         12            45            0.45                  45
300 - 399    11     0.11         11            56            0.56                  56
400 - 499    9      0.09         9             65            0.65                  65
500 - 599    7      0.07         7             72            0.72                  72
600 - 699    8      0.08         8             80            0.80                  80
700 - 799    9      0.09         9             89            0.89                  89
800 - 899    6      0.06         6             95            0.95                  95
900 - 999    5      0.05         5             100           1.00                  100
Total        $\sum_{i=1}^{10} f_i = 100$

* Modal class

e)
i. For a minimum of Tk. 500, we consider the classes from 500-599 to 900-999.

Total frequency of those classes = 7 + 8 + 9 + 6 + 5 = 35

Percentage = (35 / 100) × 100% = 35%

ii. For a maximum of Tk. 499, we consider the classes from 0-99 to 400-499.

Total frequency of those classes = 18 + 15 + 12 + 11 + 9 = 65

Percentage = (65 / 100) × 100% = 65%
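The columns in parts c) and d) and the two percentages in part e) can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from itertools import accumulate

# Frequencies for the classes 0-99, 100-199, ..., 900-999 (with K = 15).
frequencies = [18, 15, 12, 11, 9, 7, 8, 9, 6, 5]
n = sum(frequencies)

relative = [f / n for f in frequencies]
percentage = [100 * f / n for f in frequencies]

# Running totals give the cumulative columns.
cumulative = list(accumulate(frequencies))
cumulative_relative = [c / n for c in cumulative]
cumulative_percentage = [100 * c / n for c in cumulative]

# Part e): classes 500-599 onward are the last five; classes up to 400-499 are the first five.
at_least_500 = 100 * sum(frequencies[5:]) / n
at_most_499 = cumulative_percentage[4]

print(cumulative, at_least_500, at_most_499)
```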

f)

3. The following data set represents the record high temperatures in degree Fahrenheit (℉)
for each of the 50 US states: [5.5]

106 98 96 108 90 93 89 103 104 119

111 85 97 102 85 109 93 120 98 102

90 96 114 108 91 100 96 105 89 96

107 99 113 125 88 122 110 85 99 90

93 102 123 110 111 101 92 96 89 116

a) Construct a suitable frequency distribution table using interval 85 – 95, 95 – 105 and
so on. [2]
b) Construct a stem and leaf plot and mention the interesting features like maximum and
minimum value, range, modal value and median value. [3.5]



Answer to the question no. 3
a)
Class Limit    Tally                Frequency    Relative Frequency    Percentage
85-95          IIII IIII IIII       15           0.30                  30
95-105         IIII IIII IIII II    17           0.34                  34
105-115        IIII IIII II         12           0.24                  24
115-125        IIII                 5            0.10                  10
125-135        I                    1            0.02                  2
Total                               50           1.00                  100

* Lower limit included and upper limit excluded
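The class counts can be checked by assigning each temperature to its class with integer division, which matches the "lower limit included, upper limit excluded" convention. The function name `class_frequencies` and its parameters are hypothetical and used only for this sketch.

```python
temps = [
    106, 98, 96, 108, 90, 93, 89, 103, 104, 119,
    111, 85, 97, 102, 85, 109, 93, 120, 98, 102,
    90, 96, 114, 108, 91, 100, 96, 105, 89, 96,
    107, 99, 113, 125, 88, 122, 110, 85, 99, 90,
    93, 102, 123, 110, 111, 101, 92, 96, 89, 116,
]

def class_frequencies(data, start=85, width=10, classes=5):
    """Count values per class [start, start + width), [start + width, ...), and so on."""
    counts = [0] * classes
    for value in data:
        # Integer division puts a value equal to an upper limit into the next class.
        counts[(value - start) // width] += 1
    return counts

counts = class_frequencies(temps)
for i, count in enumerate(counts):
    lower = 85 + 10 * i
    print(f"{lower}-{lower + 10}: {count} (relative frequency {count / len(temps):.2f})")
```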

b) Stem and Leaf Plot:


Key: 10|4 means 104

Stem Leaf
8 5558999
9 000123336666678899
10 0122234567889
11 00113469
12 0235



Maximum value = 125

Minimum value = 85

Range = 125-85 = 40

Modal value = 96 (it occurs most often, 5 times)

Median value = average of the 25th and 26th ordered values = (99 + 100) / 2 = 99.5
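The stem and leaf plot and the summary values above can be cross-checked with the standard library. This is a sketch, not the method required by the assignment; the names `stems`, `value_range`, `mode_value` and `mode_count` are illustrative.

```python
from collections import Counter, defaultdict
from statistics import median

temps = [
    106, 98, 96, 108, 90, 93, 89, 103, 104, 119,
    111, 85, 97, 102, 85, 109, 93, 120, 98, 102,
    90, 96, 114, 108, 91, 100, 96, 105, 89, 96,
    107, 99, 113, 125, 88, 122, 110, 85, 99, 90,
    93, 102, 123, 110, 111, 101, 92, 96, 89, 116,
]

# Stem = tens part (8, 9, 10, 11, 12); leaf = units digit.
stems = defaultdict(list)
for value in sorted(temps):
    stems[value // 10].append(value % 10)

for stem in sorted(stems):
    print(f"{stem:>2} | {''.join(map(str, stems[stem]))}")

value_range = max(temps) - min(temps)
mode_value, mode_count = Counter(temps).most_common(1)[0]
print(max(temps), min(temps), value_range, mode_value, mode_count, median(temps))
```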

4. The number of Tesla, Inc. employees who will be selected for various salary bands in 2023 is
demonstrated in the following table: [5.0]

Wages ($)    No. of Employees
40k – 50k    Y + 8
50k – 60k    24 + √49
60k – 70k    19
70k – 80k    √121 + Y
80k – 90k    √225

Here, Y is the last digit of your student ID (e.g., 20100012, 21123415, etc.). Suppose your ID
is 20100012; then the second-to-last row (70k – 80k) will be √121 + 2 = 13. Estimate the following:

a) Find the Range of wages ($) and complete the frequency distribution table. [0.5+0.5]
b) Find:
i. Mean [1.0]
ii. Median [1.5]
iii. Mode [1.5]



Answer to the question no. 4
a) Range = 90 – 40 = 50 (in thousands of dollars, i.e. $50k)
Let, Y = 2
Frequency of class (70– 80) = 11 + Y= 13
Complete frequency distribution table:

Wages ($ thousand)        Frequency (f_i)    Mid value (x_i)    Cumulative frequency    f_i x_i
40 - 50                   10                 45                 10                      450
50 - 60 (Modal Class)     31                 55                 41                      1705
60 - 70 (Median Class)    19                 65                 60                      1235
70 - 80                   13                 75                 73                      975
80 - 90                   15                 85                 88                      1275
Total                     $\sum_{i=1}^{5} f_i = 88$                                     $\sum_{i=1}^{5} f_i x_i = 5640$

b)
i) Mean:

$\bar{x} = \frac{\sum_{i=1}^{5} f_i x_i}{\sum_{i=1}^{5} f_i} = \frac{5640}{88} = 64.09$

The mean is 64.09

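ii) Median:

$\frac{n}{2} = \frac{88}{2} = 44$, so the median class is 60 - 70, the first class whose cumulative frequency reaches 44.

$L_m = 60$ (lower limit of the median class)
$F = 41$ (cumulative frequency before the median class)
$f_m = 19$ (frequency of the median class)
$c = 10$ (class width)

$\text{Median} = L_m + \frac{\frac{n}{2} - F}{f_m} \times c = 60 + \frac{44 - 41}{19} \times 10 = 61.5789$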

The median is 61.58


iii) Mode:

$L_0 = 50$
$\Delta_1 = 31 - 10 = 21$
$\Delta_2 = 31 - 19 = 12$
$c = 10$

$\text{Mode} = L_0 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c = 50 + \frac{21}{21 + 12} \times 10 = 56.36$

The mode is 56.36

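Results for each possible value of Y. The median is the same for every Y: n/2 = 42 + Y and F = 39 + Y both increase by Y, so n/2 - F = 3 and the median is 60 + (3/19) × 10 = 61.57895.

Y    Mean        Median      Mode
0    64.28571    61.57895    56.57143
1    64.18605    61.57895    56.47059
2    64.09091    61.57895    56.36364
3    64          61.57895    56.25
4    63.91304    61.57895    56.12903
5    63.82979    61.57895    56
6    63.75       61.57895    55.86207
7    63.67347    61.57895    55.71429
8    63.6        61.57895    55.55556
9    63.52941    61.57895    55.38462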
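The grouped-data formulas for the mean, median and mode used in this answer can be sketched as one function of Y. The function name `grouped_stats` and its arguments are hypothetical; the class limits and frequencies are taken from the question, with wages in thousands of dollars.

```python
def grouped_stats(y, width=10):
    """Grouped mean, median and mode for the wage table, given the ID digit y."""
    lowers = [40, 50, 60, 70, 80]
    # sqrt(49) = 7, sqrt(121) = 11, sqrt(225) = 15
    freqs = [y + 8, 24 + 7, 19, 11 + y, 15]
    mids = [lo + width / 2 for lo in lowers]
    n = sum(freqs)

    mean = sum(f * x for f, x in zip(freqs, mids)) / n

    # Median class: first class whose cumulative frequency reaches n / 2.
    before = 0
    for lo, f in zip(lowers, freqs):
        if before + f >= n / 2:
            median = lo + (n / 2 - before) / f * width
            break
        before += f

    # Modal class: class with the largest frequency.
    m = freqs.index(max(freqs))
    d1 = freqs[m] - (freqs[m - 1] if m > 0 else 0)
    d2 = freqs[m] - (freqs[m + 1] if m < len(freqs) - 1 else 0)
    mode = lowers[m] + d1 / (d1 + d2) * width

    return mean, median, mode

mean, median, mode = grouped_stats(2)
print(round(mean, 2), round(median, 2), round(mode, 2))  # 64.09 61.58 56.36
```

Calling `grouped_stats` with another digit recomputes all three measures from the updated frequencies, including the cumulative frequency before the median class.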



5. The stock prices of AAA Cable Company for 24 trading days are given in the table
below: [9.5]
92 87 72 65 86 77 81 69

88 77 62 65 57 47 31 69

52 54 68 63 42 45 49 58

For the given information:

a. Construct a stem and leaf plot [1]
b. Determine the quartiles Q1, Q2 and Q3 [1.5]
c. Determine the deciles D5 and D8 [1]
d. Determine the percentiles P30, P80 and P67 [1.5]
e. Determine the IQR [0.5]
f. Draw a box and whiskers plot. Also find the outliers, if any. [2.5]
g. Calculate the coefficient of skewness and comment on the shape of the distribution [1.5]



Answer to the question no. 5
Unarranged   sl #   Arranged   Quartiles                              Deciles                                Percentiles
data                data
92           1      31
87           2      42
72           3      45
65           4      47
86           5      49
77           6      52         Q1 = AM of 6th and 7th values = 53
81           7      54
69           8      57                                                                                       P30: 7.2th → 8th value = 57
88           9      58
77           10     62
62           11     63
65           12     65         Q2 = AM of 12th and 13th values = 65   D5 = AM of 12th and 13th values = 65
57           13     65
47           14     68
31           15     69
69           16     69
52           17     72                                                                                       P67: 16.08th → 17th value = 72
54           18     77         Q3 = AM of 18th and 19th values = 77
68           19     77
63           20     81                                                D8: 19.2th → 20th value = 81           P80: 19.2th → 20th value = 81
42           21     86
45           22     87
49           23     88
58           24     92



a) Stem and leaf plot:

Key: 3|9 means 39 (stem 3, leaf 9)

Stem Leaf
3 1
4 2,5,7,9
5 2,4,7,8
6 2,3,5,5,8,9,9
7 2,7,7
8 1,6,7,8
9 2

b) Determining the quartiles Q1, Q2 and Q3:

Here n = 24. For Q1: (1 × 24) / 4 = 6 (an integer).
So Q1 = (1/2) [value of 6th observation + value of 7th observation] = (52 + 54) / 2 = 53

For Q2: (2 × 24) / 4 = 12 (an integer).
So Q2 = (1/2) [value of 12th observation + value of 13th observation] = (65 + 65) / 2 = 65

For Q3: (3 × 24) / 4 = 18 (an integer).
So Q3 = (1/2) [value of 18th observation + value of 19th observation] = (77 + 77) / 2 = 77



c) Determining the deciles D5 and D8:

Here n = 24. For D5: (5 × 24) / 10 = 12 (an integer).
So D5 = (1/2) [value of 12th observation + value of 13th observation] = (65 + 65) / 2 = 65

For D8: (8 × 24) / 10 = 19.2 (not an integer).
So D8 = [value of 20th observation] = 81

d) Determining the percentiles P30, P80 and P67:

Here n = 24. For P30: (30 × 24) / 100 = 7.2 (not an integer).
So P30 = [value of 8th observation] = 57

For P80: (80 × 24) / 100 = 19.2 (not an integer).
So P80 = [value of 20th observation] = 81

For P67: (67 × 24) / 100 = 16.08 (not an integer).
So P67 = [value of 17th observation] = 72

e) IQR = Q3 - Q1
= 77 - 53
= 24
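The position rule used in parts b) to d) (take the mean of the i-th and (i+1)-th ordered values when i = k × n / parts is an integer, otherwise round the position up) can be written as one helper. The function name `quantile` and its arguments are hypothetical, and other textbooks and libraries use different quantile conventions, so this sketch only mirrors the rule applied in this answer.

```python
import math

prices = [
    92, 87, 72, 65, 86, 77, 81, 69,
    88, 77, 62, 65, 57, 47, 31, 69,
    52, 54, 68, 63, 42, 45, 49, 58,
]
ordered = sorted(prices)

def quantile(ordered, k, parts):
    """k-th of `parts` equal divisions (assumes 0 < k < parts)."""
    n = len(ordered)
    if (k * n) % parts == 0:
        i = k * n // parts
        # Integer position: mean of the i-th and (i+1)-th values (1-based).
        return (ordered[i - 1] + ordered[i]) / 2
    # Non-integer position: round up to the next observation.
    return ordered[math.ceil(k * n / parts) - 1]

q1, q2, q3 = (quantile(ordered, k, 4) for k in (1, 2, 3))
d5, d8 = quantile(ordered, 5, 10), quantile(ordered, 8, 10)
p30, p80, p67 = (quantile(ordered, k, 100) for k in (30, 80, 67))
iqr = q3 - q1
print(q1, q2, q3, d5, d8, p30, p80, p67, iqr)
```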



f) Box and whiskers plot:

Outliers are individual data points that fall outside the fences, which lie 1.5 times
the interquartile range (IQR) below Q1 and above Q3. The whiskers extend to the
minimum and maximum values that lie within the fences.

So,

[Lower fence, Upper fence] = [Q1 - 1.5 × IQR, Q3 + 1.5 × IQR]
= [53 - (1.5 × 24), 77 + (1.5 × 24)]
= [17, 113]

Since no data point lies outside this range (the minimum is 31 and the maximum is 92), there are no outliers.
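The fence calculation can be verified directly against the data. This is a small sketch that takes Q1 = 53 and Q3 = 77 from part b) as given; the variable names are illustrative.

```python
prices = [
    92, 87, 72, 65, 86, 77, 81, 69,
    88, 77, 62, 65, 57, 47, 31, 69,
    52, 54, 68, 63, 42, 45, 49, 58,
]

q1, q3 = 53, 77  # quartiles from part b)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any value beyond a fence would be flagged as an outlier.
outliers = [p for p in prices if p < lower_fence or p > upper_fence]
print(lower_fence, upper_fence, outliers)  # 17.0 113.0 []
```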
g) Coefficient of skewness:

Bowley's coefficient of skewness

$= \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} = \frac{(77 - 65) - (65 - 53)}{77 - 53} = 0$

Because the coefficient of skewness equals 0, the distribution is approximately symmetric.
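Bowley's coefficient depends only on the three quartiles, so it is easy to wrap in a helper. The function name `bowley_skewness` is hypothetical, and the labels for the sign of the result follow the usual reading (positive means right-skewed, negative means left-skewed).

```python
def bowley_skewness(q1, q2, q3):
    """Quartile-based skewness: ((Q3 - Q2) - (Q2 - Q1)) / (Q3 - Q1)."""
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)

coefficient = bowley_skewness(53, 65, 77)

if coefficient > 0:
    shape = "right-skewed"
elif coefficient < 0:
    shape = "left-skewed"
else:
    shape = "approximately symmetric"

print(coefficient, shape)
```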

6. A study on a range of automotive lubricants reported the following data on oxidation-induction time (min) for various commercial oils: [12.5]
Sample 1:
87 103 130 160 180 195 132 145 211 105
145 153 152 138 87 99 93 119 129

Sample 2:
99 102 110 33 56 112 130 111 124 155
201 209 103 66 84 75 107 202 59

a) What are the sample sizes of samples 1 and 2 individually? [0.5]
b) Compute the sample mean, variance, and standard deviation for sample 1. [1+2.5+0.5]
c) Compute the sample mean, variance, and standard deviation for sample 2. [1+2.5+0.5]
d) Compute the Coefficient of variation for sample 1. [0.5]
e) Compute the Coefficient of variation for sample 2. [0.5]
f) Which measure should one consider to compare the performance/consistency among the
sample data, and why? [2]
g) For which sample of commercial oils is the relative variability of oxidation-induction time
higher? [1]



Answer to the question no. 6

a) Sample size for sample 1=19


Sample size for sample 2=19

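b) For sample 1:

$\text{Sample mean} = \bar{x} = \frac{87 + 103 + 130 + \dots + 119 + 129}{19} = \frac{2563}{19} = 134.895$

$\text{Variance} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{22765.7895}{19 - 1} = 1264.766$

$\text{Standard deviation} = \sqrt{\text{Variance}} = 35.564$

c) For sample 2:

$\text{Sample mean} = \bar{x} = \frac{99 + 102 + 110 + \dots + 202 + 59}{19} = \frac{2138}{19} = 112.526$

$\text{Variance} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{44576.7368}{19 - 1} = 2476.485$

$\text{Standard deviation} = \sqrt{\text{Variance}} = 49.764$

d) Coefficient of variation, $CV_1 = \frac{SD}{\bar{x}} \times 100 = \frac{35.564}{134.895} \times 100 = 26.364\%$

e) Coefficient of variation, $CV_2 = \frac{SD}{\bar{x}} \times 100 = \frac{49.764}{112.526} \times 100 = 44.225\%$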

f) Depending on the situation, one should consider either the coefficient of variation (CV)
or the standard deviation to compare the performance/consistency of the two samples.
Here the sample means differ, so the CV is the appropriate measure.

The coefficient of variation is the ratio of the standard deviation to the
mean, and it is a useful statistic for comparing the degree of variation from one data
series to another, even if the means are drastically different from one another.

g) As CV1 < CV2, Sample 2 of commercial oils has relatively higher variation in oxidation-induction
time than Sample 1.
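The figures in parts b) to e) can be reproduced with Python's `statistics` module, whose `variance` and `stdev` functions use the n - 1 denominator applied above. This is a sketch; the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev, variance

sample_1 = [87, 103, 130, 160, 180, 195, 132, 145, 211, 105,
            145, 153, 152, 138, 87, 99, 93, 119, 129]
sample_2 = [99, 102, 110, 33, 56, 112, 130, 111, 124, 155,
            201, 209, 103, 66, 84, 75, 107, 202, 59]

def summarize(sample):
    """Return (mean, sample variance, sample SD, coefficient of variation in %)."""
    m = mean(sample)
    var = variance(sample)  # divides by n - 1
    sd = stdev(sample)      # square root of the sample variance
    cv = sd / m * 100
    return m, var, sd, cv

mean_1, var_1, sd_1, cv_1 = summarize(sample_1)
mean_2, var_2, sd_2, cv_2 = summarize(sample_2)

print(f"Sample 1: mean={mean_1:.3f} var={var_1:.3f} sd={sd_1:.3f} cv={cv_1:.3f}%")
print(f"Sample 2: mean={mean_2:.3f} var={var_2:.3f} sd={sd_2:.3f} cv={cv_2:.3f}%")
print("Higher relative variability:", "Sample 2" if cv_2 > cv_1 else "Sample 1")
```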
