Tutorial: CTM 2019 Probability and Statistics For Data Sciences

The document provides a tutorial on probability and statistics concepts. It includes worked examples on Poisson probabilities, confidence intervals, and hypothesis tests comparing the means of two populations. The key points are: 1. The number of lions seen on safaris can be modeled by a Poisson distribution with a mean and variance of 5. The probability of seeing 3 lions is about 0.14. 2. A confidence interval for the average difference in eye refraction between left and right eyes was calculated as (-0.66, 0.25) at the 90% level based on a sample of 17 patients. 3. Comparing wear on rubber samples from two companies, a t-test showed the data supports the claim that

Uploaded by

Pavan Kumar

TUTORIAL

CTM 2019

Probability and Statistics for Data Sciences


1. Suppose the average number of lions seen on a 1-day safari is 5.

❑ What is the sample space of the number of lions seen on any 1-day safari?
➢ {0, 1, 2, 3, …}
➢ Countably infinite sample space. (In practice the count cannot exceed N, the total number of lions that exist, but the Poisson model places no upper bound.)

❑ What family of probability distributions does it belong to? Discrete or continuous?
➢ Poisson distribution (discrete)

❑ What are the mean and variance of the distribution?
➢ Mean = Variance = 5
❑ What is the probability that tourists will see 3 lions on the next 1-day safari?
➢ Let X be the number of lions seen on a 1-day safari, so X ~ Poisson(5).
➢ $P(X = 3) = \dfrac{e^{-5}\, 5^{3}}{3!} \approx 0.1404$

❑ What is the probability that tourists will see fewer than 4 lions on the next 1-day safari?
➢ $P(X < 4) = P(X=0) + P(X=1) + P(X=2) + P(X=3)$
➢ $= \dfrac{e^{-5}\, 5^{0}}{0!} + \dfrac{e^{-5}\, 5^{1}}{1!} + \dfrac{e^{-5}\, 5^{2}}{2!} + \dfrac{e^{-5}\, 5^{3}}{3!}$
➢ $= 0.0067 + 0.0337 + 0.0842 + 0.1404 = 0.2650$
❑ What is the probability that tourists will see 5 lions on a 2-day safari?
➢ The average number of lions seen on a 1-day safari is 5, so for a 2-day safari the average is 2 × 5 = 10.
➢ Let Y be the number of lions seen on a 2-day safari, so Y ~ Poisson(10).
➢ $P(Y = 5) = \dfrac{e^{-10}\, 10^{5}}{5!} \approx 0.0378$
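The three Poisson probabilities in Question 1 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `poisson_pmf` is illustrative and not part of the tutorial. Values computed at full precision may differ slightly from hand calculations that round intermediate terms.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# 1-day safari: rate of 5 lions per day
p_three = poisson_pmf(3, 5.0)
p_fewer_than_four = sum(poisson_pmf(k, 5.0) for k in range(4))  # k = 0, 1, 2, 3

# 2-day safari: rate doubles to 10, assuming the daily rate stays constant
p_five_in_two_days = poisson_pmf(5, 10.0)

print(f"P(X = 3) = {p_three:.4f}")
print(f"P(X < 4) = {p_fewer_than_four:.4f}")
print(f"P(Y = 5) = {p_five_in_two_days:.4f}")
```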
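2. Let X and Y be two jointly continuous random variables with joint PDF

$$f_{XY}(x,y) = \begin{cases} 6xy, & 0 \le x \le 1,\ 0 \le y \le \sqrt{x} \\ 0, & \text{otherwise} \end{cases}$$

❑ Find $f_X(x)$ and $f_Y(y)$
➢ Note: $R_X = R_Y = [0, 1]$
➢ To find $f_X(x)$ for $0 \le x \le 1$, we can write
$$f_X(x) = \int_0^{\sqrt{x}} 6xy \, dy = 3x^2 \quad \text{for } 0 \le x \le 1,\ 0 \text{ otherwise}$$
➢ To find $f_Y(y)$ for $0 \le y \le 1$, note that $y \le \sqrt{x}$ is equivalent to $x \ge y^2$, so we can write
$$f_Y(y) = \int_{y^2}^{1} 6xy \, dx = 3y(1 - y^4) \quad \text{for } 0 \le y \le 1,\ 0 \text{ otherwise}$$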
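As a sanity check on the two marginals, the joint PDF can be integrated numerically and compared with the closed forms. The sketch below assumes the support 0 ≤ y ≤ √x implied by the integration limits; the midpoint-rule helper and the test point (0.5, 0.5) are illustrative choices, not part of the tutorial.

```python
import math

def joint_pdf(x: float, y: float) -> float:
    """Joint PDF 6xy on 0 <= x <= 1, 0 <= y <= sqrt(x); 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= math.sqrt(x):
        return 6.0 * x * y
    return 0.0

def midpoint_integral(f, a: float, b: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def marginal_x(x: float) -> float:
    # Integrate out y over [0, 1]; the PDF is zero outside the support.
    return midpoint_integral(lambda y: joint_pdf(x, y), 0.0, 1.0)

def marginal_y(y: float) -> float:
    # Integrate out x over [0, 1]; the PDF is zero for x < y^2.
    return midpoint_integral(lambda x: joint_pdf(x, y), 0.0, 1.0)

x0, y0 = 0.5, 0.5
fx_numeric, fx_closed = marginal_x(x0), 3 * x0**2
fy_numeric, fy_closed = marginal_y(y0), 3 * y0 * (1 - y0**4)

print(f"f_X({x0}): numeric {fx_numeric:.4f}, closed form {fx_closed:.4f}")
print(f"f_Y({y0}): numeric {fy_numeric:.4f}, closed form {fy_closed:.4f}")
```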


❑ Are X and Y independent?
➢ $f_X(x)\, f_Y(y) = 9x^2 y (1 - y^4) \ne 6xy = f_{XY}(x,y)$
➢ Therefore, X and Y are not independent.

❑ Find the conditional PDF of X given Y = y
➢ Conditional PDF:
$$f_{X|Y}(x|y) = \frac{f_{XY}(x,y)}{f_Y(y)} = \frac{6xy}{3y(1 - y^4)} = \frac{2x}{1 - y^4}, \quad y^2 \le x \le 1;\ 0 \text{ otherwise}$$
❑ Find E[X|Y=y] for 0 ≤ y ≤ 1
➢ $$E[X|Y=y] = \int_{y^2}^{1} x \cdot \frac{2x}{1 - y^4} \, dx = \frac{2(1 - y^6)}{3(1 - y^4)}$$
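The conditional mean can be verified numerically in the same spirit. The sketch below integrates x · f(x|y) over y² ≤ x ≤ 1 with a midpoint rule and compares the result with the closed form; the function names and the test value y = 0.5 are illustrative assumptions.

```python
def conditional_pdf(x: float, y: float) -> float:
    """f_{X|Y}(x|y) = 2x / (1 - y^4) on y^2 <= x <= 1; 0 elsewhere."""
    if y * y <= x <= 1.0:
        return 2.0 * x / (1.0 - y**4)
    return 0.0

def conditional_mean_numeric(y: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of x * f(x|y) over [y^2, 1]."""
    a, b = y * y, 1.0
    h = (b - a) / steps
    midpoints = (a + (i + 0.5) * h for i in range(steps))
    return h * sum(x * conditional_pdf(x, y) for x in midpoints)

def conditional_mean_closed(y: float) -> float:
    """Closed form 2(1 - y^6) / (3(1 - y^4))."""
    return 2.0 * (1.0 - y**6) / (3.0 * (1.0 - y**4))

y0 = 0.5
numeric = conditional_mean_numeric(y0)
closed = conditional_mean_closed(y0)
print(f"E[X | Y = {y0}]: numeric {numeric:.4f}, closed form {closed:.4f}")
```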
3. A topic of interest in ophthalmology is whether or not spherical refraction differs between the left and right eye on average. In a study to investigate this, refraction was measured on the left and right eye of 17 patients. The differences (right-left) in diopters were $d_1, d_2, \dots, d_{17}$, and elementary calculations gave $\sum_{i=1}^{17} d_i = -3.50$ and $\sum_{i=1}^{17} d_i^2 = 19.13$.

3a. Provide a 90% confidence interval for the average difference (right-left).

➢ Sample mean $\bar{d} = \dfrac{\sum_{i=1}^{17} d_i}{n} = -3.50/17 = -0.2059$
➢ Sample variance $s^2 = \dfrac{1}{17-1}\left(19.13 - \dfrac{(-3.5)^2}{17}\right) = 1.15059$
➢ Therefore, the sample standard deviation is s = 1.07266
➢ 90% confidence interval using the t distribution with 16 degrees of freedom ($t_{0.05,16} = 1.746$):
$$-0.2059 \pm 1.746 \cdot \frac{1.07266}{\sqrt{17}} = (-0.66,\ 0.25)$$
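The interval in 3a can be reproduced from the two summary sums alone. This is a minimal standard-library sketch; the critical value 1.746 is hard-coded from a t table (16 degrees of freedom, upper-tail area 0.05) because the Python standard library has no t-distribution quantile function.

```python
import math

n = 17
sum_d = -3.50      # sum of the differences d_i
sum_d_sq = 19.13   # sum of the squared differences d_i^2
t_crit = 1.746     # t_{0.05,16}, taken from a t table (hard-coded assumption)

mean_d = sum_d / n
var_d = (sum_d_sq - sum_d**2 / n) / (n - 1)   # sample variance
std_d = math.sqrt(var_d)

margin = t_crit * std_d / math.sqrt(n)
ci = (mean_d - margin, mean_d + margin)

print(f"mean = {mean_d:.4f}, s^2 = {var_d:.5f}, s = {std_d:.5f}")
print(f"90% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```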
3b. If the population standard deviation of the difference in diopters is known to be σ = 1.2, using the same sample of 17 patients, calculate the confidence level for which the interval for the average difference is (-0.9568, 0.545).

➢ Given CI: (-0.9568, 0.545)
➢ The half-width of the interval equals the margin of error:
$$\frac{0.545 - (-0.9568)}{2} = \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2}$$
➢ $1.5018/2 = (1.2/\sqrt{17}) \cdot z_{\alpha/2}$, taking 1.2 as the standard deviation σ
➢ $z_{\alpha/2} = 0.7509/0.2910 = 2.58$
➢ $\alpha = 0.01$
➢ 99% confidence level
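The step from z to the confidence level can be checked with `statistics.NormalDist` from the standard library. The sketch treats 1.2 as the standard deviation σ, as the worked solution does; the variable names are illustrative.

```python
import math
from statistics import NormalDist

lower, upper = -0.9568, 0.545
sigma = 1.2   # treated as the standard deviation, as in the worked solution
n = 17

half_width = (upper - lower) / 2
z = half_width / (sigma / math.sqrt(n))

# For a two-sided interval, confidence = P(-z < Z < z) = 2 * Phi(z) - 1
confidence = 2 * NormalDist().cdf(z) - 1

print(f"z = {z:.2f}, confidence level = {confidence:.3f}")
```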
3c. If the population standard deviation of the difference in diopters is known to be σ = 1.2, for a confidence level of 95%, what is the minimum sample size such that the error in estimating the average difference is less than 0.2?

➢ $\dfrac{\sigma}{\sqrt{n}}\, z_{\alpha/2} < 0.2$
➢ $(1.96 \times 1.2)/\sqrt{n} < 0.2$
➢ $\sqrt{n} > (1.96 \times 1.2)/0.2$
➢ $\sqrt{n} > 11.76$
➢ $n > 138.30$
➢ The sample size must be at least 139.
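The sample-size calculation generalizes to a small helper. This is a sketch under the same assumption that 1.2 is the standard deviation σ; `min_sample_size` is a hypothetical name, and the z value comes from `statistics.NormalDist` rather than the rounded table value 1.96.

```python
import math
from statistics import NormalDist

def min_sample_size(sigma: float, max_error: float, confidence: float) -> int:
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) strictly below max_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n must strictly exceed (z * sigma / max_error)^2, hence floor + 1
    return math.floor((z * sigma / max_error) ** 2) + 1

n_required = min_sample_size(sigma=1.2, max_error=0.2, confidence=0.95)
print(f"minimum sample size = {n_required}")
```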
4. Two companies manufacture a rubber material
intended for use in an automotive application. The
part will be subjected to abrasive wear in the field
application, so we decide to compare the material
produced by each company in a test. Twenty-five
samples of material from each company are tested in
an abrasion test, and the amount of wear after 1000
cycles is observed. For company 1, the sample mean
and standard deviation of wear are x̅1 = 20
milligrams/1000 cycles and s1 = 2 milligrams/1000
cycles, while for company 2 we obtain x̅2 = 15
milligrams/1000 cycles and s2 = 8 milligrams/1000
cycles.
a) Do the data support the claim that the two companies produce material with different mean wear? Use α = 0.05, and assume each population is normally distributed but that their variances are not equal.
H0: µ1 - µ2 = 0 or µ1 = µ2
H1: µ1 - µ2 ≠ 0 or µ1 ≠ µ2
α = 0.05, two-tailed test
The t-test statistic (independent samples, unequal variances) is:
$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

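Degrees of freedom: under H0, $t_0$ approximately follows a $t_k$ distribution, where the Welch-Satterthwaite approximation gives
$$k = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} \approx 27$$
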
Reject the null hypothesis if $t_0 < -t_{0.025,27} = -2.052$ or $t_0 > t_{0.025,27} = 2.052$.
x̅1 = 20, x̅2 = 15, d0 = 0
s1 = 2, s2 = 8, n1 = 25, n2 = 25
$$t_0 = \frac{(20 - 15) - 0}{\sqrt{\dfrac{2^2}{25} + \dfrac{8^2}{25}}} = 3.03$$
Since 3.03 > 2.052, reject the null hypothesis and conclude that the data support the claim that the two companies produce material with significantly different mean wear at the 0.05 level of significance.
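The test statistic and its degrees of freedom can be computed directly from the summary statistics. In this sketch the degrees of freedom use the Welch-Satterthwaite approximation, which is an assumption about the formula intended on the slide; the critical value 2.052 is hard-coded from a t table, and `welch_t` is an illustrative name.

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t0 = (mean1 - mean2 - d0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, df

t0, df = welch_t(mean1=20, s1=2, n1=25, mean2=15, s2=8, n2=25)

t_crit = 2.052                 # t_{0.025,27}, from a t table (hard-coded)
reject_h0 = abs(t0) > t_crit   # two-tailed decision at alpha = 0.05

print(f"t0 = {t0:.2f}, df = {df:.2f}, reject H0: {reject_h0}")
```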
b) What is the P-value for this test?

P-value = 2P(t>3.03) (two-tailed)


0.005 < P-value < 0.010
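Without a t table, the P-value can be approximated by integrating the t density numerically. The sketch below uses Simpson's rule and only the standard library; the helper names are illustrative, and 27 degrees of freedom are taken from part (a).

```python
import math

def t_pdf(t: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t: float, df: float, steps: int = 10_000) -> float:
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry P(T > 0) = 0.5, so subtract the area between 0 and t.
    return 0.5 - total * h / 3

p_value = 2 * t_upper_tail(3.03, 27)   # two-tailed
print(f"P-value = {p_value:.4f}")
```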
c) Do the data support a claim that the material from company 1 has higher mean wear than the material from company 2? Use the same assumptions as in part (a).
One-tailed test
H0: µ1 - µ2 ≤ 0, H1: µ1 - µ2 > 0
α = 0.05
Test statistic:
$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Reject the null hypothesis if $t_0 > t_{0.05,27}$, where $t_{0.05,27} = 1.703$
x̅1 = 20, x̅2 = 15, d0 = 0
s1 = 2, s2 = 8, n1 = 25, n2 = 25
$$t_0 = \frac{(20 - 15) - 0}{\sqrt{\dfrac{2^2}{25} + \dfrac{8^2}{25}}} = 3.03$$
Since 3.03 > 1.703, reject the null hypothesis and conclude that the data support the claim that the material from company 1 has higher mean wear than the material from company 2 at the 0.05 level of significance.
