
Data Analysis

Dr. Rahul Pandya

https://www.probabilitycourse.com/
Introduction

▶ Discuss limit theorems and convergence modes for random variables.
▶ Limit theorems are among the most fundamental results in
probability theory.
▶ Two important limit theorems: the law of large numbers
(LLN) and the central limit theorem (CLT).
▶ Importance of these theorems in practice.
▶ Discuss the convergence of sequences of random variables.
Limit Theorems

▶ Discuss two important theorems in probability: the law of large numbers (LLN) and the central limit theorem (CLT).
▶ The LLN states that the average of a large number of i.i.d.
random variables converges to the expected value.
▶ The CLT states that, under some conditions, the sum of a
large number of random variables has an approximately
normal distribution.
Law of Large Numbers

▶ The law of large numbers has a central role in probability and statistics.
▶ States that if you repeat an experiment independently a large
number of times and average the result, what you obtain
should be close to the expected value.
▶ Two main versions: the weak and strong laws of large numbers.
▶ Focus on the weak law of large numbers (WLLN).
▶ Define the sample mean:

X̄ = (X1 + X2 + ... + Xn) / n

Sample Mean

▶ A common notation for the sample mean is Mn.
▶ If the Xi have CDF FX(x), we may write the sample mean as Mn(X) to indicate the distribution of the Xi's.
▶ The sample mean X̄ = Mn (X ) is also a random variable.
▶ Expectation:

E[X̄] = E[X1 + X2 + ... + Xn]/n = nE[X]/n = E[X]

▶ Variance:
Var(X̄) = Var(X1 + X2 + ... + Xn)/n² = Var(X)/n

Weak Law of Large Numbers (WLLN)

▶ Let X1, X2, ..., Xn be i.i.d. random variables with a finite expected value E[Xi] = µ < ∞.
▶ Then, for any ϵ > 0:

lim(n→∞) P(|X̄ − µ| ≥ ϵ) = 0.

▶ Proof using Chebyshev's inequality (the proof additionally assumes Var(X) < ∞, while the WLLN itself only requires a finite mean):

P(|X̄ − µ| ≥ ϵ) ≤ Var(X)/(nϵ²)
▶ This goes to zero as n → ∞.
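
A quick simulation makes this concrete. Below is a minimal Python sketch (assuming NumPy is available; the Exponential distribution with mean µ = 3 is an arbitrary illustrative choice, not from the slides): the estimated probability shrinks as n grows.

import numpy as np

# Minimal WLLN simulation sketch (assumes NumPy; Exponential(mean=3)
# is an arbitrary example distribution).
rng = np.random.default_rng(0)
mu, eps, reps = 3.0, 0.1, 1000

for n in [10, 100, 1000, 10000]:
    # Draw `reps` independent sample means, each based on n observations
    sample_means = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)
    # Empirical estimate of P(|X_bar - mu| >= eps); it shrinks as n grows
    print(n, np.mean(np.abs(sample_means - mu) >= eps))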

Central Limit Theorem (Slide 1)
▶ The central limit theorem (CLT) is one of the most important
results in probability theory.

▶ It states that, under certain conditions, the sum of a large number of random variables is approximately normal.

▶ Here, we state a version of the CLT that applies to i.i.d. random variables.

▶ Suppose that X1, X2, . . . , Xn are i.i.d. random variables with expected value E[Xi] = µ < ∞ and variance Var(Xi) = σ² < ∞.

▶ The sample mean is given by:

X̄ = (X1 + X2 + . . . + Xn) / n

Central Limit Theorem (Slide 2)

▶ The sample mean has mean E[X̄] = µ and variance Var(X̄) = σ²/n.

▶ Thus, the normalized random variable

Zn = (X̄ − µ)/(σ/√n) = (X1 + X2 + . . . + Xn − nµ)/(√n σ)

▶ Zn has mean E[Zn] = 0 and variance Var(Zn) = 1.

▶ The central limit theorem states that the CDF of Zn


converges to the standard normal CDF.

Central Limit Theorem
▶ The Central Limit Theorem (CLT)

▶ Let X1, X2, . . . , Xn be i.i.d. random variables with expected value E[Xi] = µ < ∞ and variance 0 < Var(Xi) = σ² < ∞.

▶ Then, the random variable

Zn = (X̄ − µ)/(σ/√n) = (X1 + X2 + . . . + Xn − nµ)/(√n σ)

▶ converges in distribution to the standard normal random variable as n goes to infinity; that is,

lim(n→∞) P(Zn ≤ x) = Φ(x),  for all x ∈ R,

▶ where Φ(x) is the standard normal CDF.
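
To see this convergence concretely, here is a minimal simulation sketch (assuming NumPy and SciPy; Exponential(1), with µ = σ = 1, is an arbitrary and deliberately skewed choice) that compares the empirical CDF of Zn with Φ:

import numpy as np
from scipy.stats import norm

# Sketch: empirical CDF of Zn vs. the standard normal CDF.
# Xi ~ Exponential(1), so mu = sigma = 1 (an arbitrary skewed example).
rng = np.random.default_rng(1)
n, reps, mu, sigma = 100, 50_000, 1.0, 1.0

sums = rng.exponential(scale=1.0, size=(reps, n)).sum(axis=1)
zn = (sums - n * mu) / (np.sqrt(n) * sigma)

for x in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    print(x, np.mean(zn <= x), norm.cdf(x))  # empirical P(Zn <= x) vs Phi(x)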


Understanding the Central Limit Theorem

▶ An interesting aspect of the CLT is that the distribution of the Xi's does not matter.
▶ The Xi ’s can be:
▶ Discrete
▶ Continuous
▶ Mixed random variables
▶ Let’s assume that Xi ’s are Bernoulli(p):
▶ E [Xi ] = p
▶ Var (Xi ) = p(1 − p)
Understanding the Central Limit Theorem

▶ Yn = X1 + X2 + . . . + Xn has a Binomial(n, p) distribution, and its normalized version is

Zn = (Yn − np)/√(np(1 − p))


▶ Figure 7.1 shows the PMF of Zn for different values of n:
▶ The shape of the PMF approaches a normal PDF curve as n
increases.
▶ Zn is a discrete random variable with a PMF, not a PDF.
▶ The CLT states that the CDF of Zn converges to the standard
normal CDF.
▶ Although Zn has a PMF rather than a PDF, the shape of the PMF traces the limiting normal density, so the figure is useful for visualizing convergence to the normal distribution.
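
A small numerical check (a sketch assuming SciPy is available) makes the same point without a plot: the maximum gap between the CDF of Zn and Φ shrinks as n grows.

import numpy as np
from scipy.stats import binom, norm

# Sketch: how far is the CDF of Zn from Phi? The gap shrinks with n.
p = 0.5
for n in [5, 30, 100]:
    k = np.arange(n + 1)                              # support of Yn
    z = (k - n * p) / np.sqrt(n * p * (1 - p))        # support of Zn
    gap = np.max(np.abs(binom.cdf(k, n, p) - norm.cdf(z)))
    print(n, gap)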

Understanding the Central Limit Theorem

Figure: Zn is the normalized sum of n independent Bernoulli(p) random variables. The shape of its PMF, PZn(z), resembles the normal curve as n increases.
Understanding the Central Limit Theorem

▶ As another example, let’s assume that Xi ’s are Uniform(0,1):


▶ E[Xi] = 1/2
▶ Var(Xi) = 1/12
▶ In this case, we have:

Zn = (X1 + X2 + . . . + Xn − n/2)/√(n/12)
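
The same kind of simulation sketch (assuming NumPy and SciPy; the values of n below are arbitrary) works here, and even modest n gives a good match to Φ:

import numpy as np
from scipy.stats import norm

# Sketch: Zn for sums of Uniform(0,1) variables vs. the normal CDF.
rng = np.random.default_rng(2)
reps = 100_000
for n in [2, 5, 30]:
    s = rng.uniform(size=(reps, n)).sum(axis=1)
    zn = (s - n / 2) / np.sqrt(n / 12)
    print(n, np.mean(zn <= 1.0), norm.cdf(1.0))  # empirical vs. Phi(1)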

Understanding the Central Limit Theorem

Figure: Zn is the normalized sum of n independent Uniform(0,1) random variables. The shape of its PDF, fZn(z), gets closer to the normal curve as n increases.
Normalization of Random Variables

▶ We could have directly looked at Yn = X1 + X2 + ... + Xn.
▶ Why do we normalize it first and say that the normalized version (Zn) becomes approximately normal?
▶ This is because E [Yn ] = nE [Xi ] and Var (Yn ) = nσ 2 go to
infinity as n goes to infinity.
▶ We normalize Yn in order to have a finite mean and variance
(E [Zn ] = 0, Var (Zn ) = 1).
▶ The CDF of Zn is obtained by scaling and shifting the CDF of
Yn .
Importance of the Central Limit Theorem

▶ The importance of the CLT stems from the fact that, in many
real applications, a certain random variable of interest is a
sum of a large number of independent random variables.
▶ Examples include:
▶ Laboratory measurement errors modeled by normal random
variables.
▶ Gaussian noise in communication and signal processing.
▶ Percentage changes in asset prices modeled by normal random
variables.
▶ Random sampling from a population to obtain statistical
knowledge.
▶ The CLT simplifies computations significantly, especially when
dealing with sums of a large number of i.i.d. random variables.
▶ A common rule of thumb is that for n ≥ 30 the normal approximation is very good, though the accuracy depends on the underlying distribution.
Applying the Central Limit Theorem (CLT)

▶ Write the random variable of interest Y as the sum of n i.i.d. random variables Xi:

Y = X1 + X2 + . . . + Xn

Finding Mean and Variance

▶ Find E[Y] and Var(Y):

E[Y] = nµ,  Var(Y) = nσ²

▶ where µ = E[Xi] and σ² = Var(Xi).


Conclusion of CLT

▶ According to the CLT, we conclude that:

(Y − E[Y])/√Var(Y) = (Y − nµ)/(√n σ)

is approximately standard normal.

Finding Probability Using CLT

▶ To find P(y1 ≤ Y ≤ y2 ), write:


 
P(y1 ≤ Y ≤ y2) = P((y1 − nµ)/(√n σ) ≤ (Y − nµ)/(√n σ) ≤ (y2 − nµ)/(√n σ))
              ≈ Φ((y2 − nµ)/(√n σ)) − Φ((y1 − nµ)/(√n σ))
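
This recipe translates directly into code. Below is a minimal helper (a sketch assuming SciPy; the function name clt_interval_prob is our own, not from the slides) implementing the approximation above:

from math import sqrt
from scipy.stats import norm

def clt_interval_prob(y1, y2, n, mu, sigma):
    """CLT approximation to P(y1 <= Y <= y2) for Y = X1 + ... + Xn,
    where the Xi are i.i.d. with mean mu and standard deviation sigma."""
    scale = sqrt(n) * sigma
    return norm.cdf((y2 - n * mu) / scale) - norm.cdf((y1 - n * mu) / scale)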

Problem Statement

A bank teller serves customers standing in the queue one by one. Suppose that the service time Xi for customer i has mean E[Xi] = 2 minutes and Var(Xi) = 1. Assume that service times for different customers are independent. Let Y be the total time the bank teller spends serving 50 customers.
Find P(90 < Y < 110).

Solution

▶ Let Y = X1 + X2 + · · · + X50, the total service time for 50 customers.
▶ The mean and variance of Y can be found as:

E[Y] = 50 · E[Xi] = 50 · 2 = 100

Var(Y) = 50 · Var(Xi) = 50 · 1 = 50


▶ By the Central Limit Theorem (CLT), Y is approximately
normal for large n.

Finding P(90 < Y < 110)

▶ Standardize the variable:

Z = (Y − E[Y])/√Var(Y) = (Y − 100)/√50

▶ Now, calculate P(90 < Y < 110):

P(90 < Y < 110) = P((90 − 100)/√50 < Z < (110 − 100)/√50)
               = P(−10/√50 < Z < 10/√50)

▶ Simplifying:

= P(−1.414 < Z < 1.414)

Conclusion

▶ Using standard normal distribution tables,

P(−1.414 < Z < 1.414) ≈ 0.8427
▶ Therefore, the probability that the total service time is
between 90 and 110 minutes is approximately 84.27%.
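
Instead of reading a table, the value can be checked numerically (a sketch assuming SciPy; clt_interval_prob is the hypothetical helper sketched earlier):

from scipy.stats import norm

# P(-1.414 < Z < 1.414), as read from a Z-table
print(norm.cdf(1.414) - norm.cdf(-1.414))            # about 0.8427
# Equivalently, via the helper sketched earlier:
# clt_interval_prob(90, 110, n=50, mu=2, sigma=1)    # about 0.8427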
Z-Table or Standard Normal Table

Figure: Z-Table or Standard Normal Table


Problem Statement

In a communication system, each data packet consists of 1000 bits. Due to noise, each bit may be received in error with probability 0.1. It is assumed that bit errors occur independently.
Find the probability that there are more than 120 errors in a
certain data packet.
Solution: Step 1 - Binomial Distribution

Let n = 1000 be the number of bits in a packet, and p = 0.1 be the probability of error. The total number of errors Y then follows a binomial distribution:

Y ∼ Binomial(n = 1000, p = 0.1)

We need to find P(Y > 120).

Defining the Problem

Let us define Xi as the indicator random variable for the i-th bit in
the packet. That is,

Xi = 1 if the i-th bit is received in error, Xi = 0 otherwise.

The Xi ’s are i.i.d. and

Xi ∼ Bernoulli(p = 0.1).

If Y is the total number of bit errors in the packet, then

Y = X1 + X2 + · · · + Xn .
Mean and Variance of Xi

Since Xi ∼ Bernoulli(p = 0.1), we have:

E[Xi] = µ = p = 0.1,  Var(Xi) = σ² = p(1 − p) = 0.09.

Using the Central Limit Theorem

Using the Central Limit Theorem (CLT), we can estimate:


 
P(Y > 120) = P((Y − nµ)/(√n σ) > (120 − nµ)/(√n σ))

Substituting the values:

P(Y > 120) = P((Y − 100)/√90 > (120 − 100)/√90) ≈ 1 − Φ(20/√90).

Final Probability

From the standard normal distribution table:

P(Y > 120) ≈ 1 − Φ(2.11) ≈ 0.0175.

Therefore, the probability that there are more than 120 errors in
the data packet is approximately 1.75%.
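
Since Y is binomial, the CLT answer can also be compared against the exact probability (a minimal sketch assuming SciPy is available):

from math import sqrt
from scipy.stats import binom, norm

n, p = 1000, 0.1
exact = binom.sf(120, n, p)                                   # exact P(Y > 120)
approx = 1 - norm.cdf((120 - n * p) / sqrt(n * p * (1 - p)))  # CLT estimate
print(exact, approx)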

Z-Table or Standard Normal Table

Figure: Z-Table or Standard Normal Table


Continuity Correction

Let us assume that Y ∼ Binomial(n = 20, p = 1/2), and suppose that we are interested in P(8 ≤ Y ≤ 10).
We know that a Binomial(n = 20, p = 1/2) random variable can be written as the sum of n i.i.d. Bernoulli(p) random variables:

Y = X1 + X2 + . . . + Xn .
Expectation and Variance

Since Xi ∼ Bernoulli(p = 1/2), we have

E[Xi] = µ = p = 1/2,  Var(Xi) = σ² = p(1 − p) = 1/4.

Thus, we may want to apply the CLT to write

P(8 ≤ Y ≤ 10) = P((8 − nµ)/(√n σ) < (Y − nµ)/(√n σ) < (10 − nµ)/(√n σ))
             = P((8 − 10)/√5 < (Y − nµ)/(√n σ) < (10 − 10)/√5)
             ≈ Φ(0) − Φ(−2/√5) = 0.3145.

Exact Probability Calculation

Since here, n = 20 is relatively small, we can actually find P(8 ≤ Y ≤ 10) accurately. We have

P(8 ≤ Y ≤ 10) = Σ_{k=8}^{10} (20 choose k) p^k (1 − p)^(n−k)
             = [(20 choose 8) + (20 choose 9) + (20 choose 10)] (1/2)^20 = 0.4565.
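
The exact value can be confirmed with a one-liner (a sketch assuming SciPy is available):

from scipy.stats import binom

# Exact P(8 <= Y <= 10) for Y ~ Binomial(20, 1/2)
print(binom.cdf(10, 20, 0.5) - binom.cdf(7, 20, 0.5))   # about 0.4565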

Approximation Error

We notice that our approximation is not so good. Part of the error is due to the fact that Y is a discrete random variable and we are using a continuous distribution to find P(8 ≤ Y ≤ 10).
Here is a trick to get a better approximation, called continuity correction. Since Y can only take integer values, we can write

P(8 ≤ Y ≤ 10) = P(7.5 < Y < 10.5)


Applying Continuity Correction

We can express this as:

P(7.5 < Y < 10.5) = P((7.5 − nµ)/(√n σ) < (Y − nµ)/(√n σ) < (10.5 − nµ)/(√n σ))
                 = P((7.5 − 10)/√5 < (Y − nµ)/(√n σ) < (10.5 − 10)/√5)
                 ≈ Φ(0.5/√5) − Φ(−2.5/√5)
                 = 0.4567.

As we see, using continuity correction, our approximation improved significantly.
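
Side by side, the two approximations look like this (a minimal sketch assuming SciPy is available):

from math import sqrt
from scipy.stats import norm

n, p = 20, 0.5
mu, sd = n * p, sqrt(n * p * (1 - p))   # mu = 10, sd = sqrt(5)

plain = norm.cdf((10 - mu) / sd) - norm.cdf((8 - mu) / sd)          # about 0.3145
corrected = norm.cdf((10.5 - mu) / sd) - norm.cdf((7.5 - mu) / sd)  # about 0.4567
print(plain, corrected)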

Application of Continuity Correction

The continuity correction is particularly useful when we would like to find P(y1 ≤ Y ≤ y2), where Y is binomial and y1 and y2 are close to each other.

Continuity Correction for Discrete Random Variables
Let X1 , X2 , . . . , Xn be independent discrete random variables and
let

Y = X1 + X2 + . . . + Xn .

Finding Probability Using CLT

Suppose that we are interested in finding P(A) = P(l ≤ Y ≤ u) using the CLT, where l and u are integers. Since Y is an integer-valued random variable, we can write

P(A) = P(l − 1/2 ≤ Y ≤ u + 1/2).

It turns out that the above expression sometimes provides a better approximation for P(A) when applying the CLT. This is called the continuity correction, and it is particularly useful when the Xi's are Bernoulli (i.e., Y is binomial).
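
As a general utility, the corrected approximation can be wrapped in a small helper (a sketch assuming SciPy; the function name cc_prob is our own, not from the slides):

from math import sqrt
from scipy.stats import norm

def cc_prob(l, u, n, mu, sigma):
    """Continuity-corrected CLT approximation to P(l <= Y <= u)
    for integer-valued Y = X1 + ... + Xn with i.i.d. Xi."""
    scale = sqrt(n) * sigma
    return norm.cdf((u + 0.5 - n * mu) / scale) - norm.cdf((l - 0.5 - n * mu) / scale)

# Example: Y ~ Binomial(20, 1/2) gives about 0.4567, matching the slide above
print(cc_prob(8, 10, n=20, mu=0.5, sigma=0.5))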

