Lecture 4 - Part III
1 / 49
Outline
References
Chapter 2: Weak Law of Large Numbers and the Central Limit Theorem
Textbooks
Larsen, R., & Marx, M. (2012). Introduction to mathematical statistics and its
applications. Pearson.
Stock, J., & Watson, M. (2015). Introduction to econometrics. Pearson.
Textbook Clarification
I Larsen and Marx (2012) is a more advanced auxiliary text if you want to go
beyond the material here.
Chapter 2: Normal, χ-Squared, t and F Distributions
The Normal Distribution
The Normal Distribution
I The probability density function of a normally distributed random variable X
with mean µX and variance σX² is f(x) = (1/√(2πσX²)) · exp(−(x − µX)²/(2σX²))
I and 95% of its probability falls into the interval [µX − 1.96σX , µX + 1.96σX ]
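The 1.96 rule can be checked numerically. A minimal sketch using only the Python standard library, with the standard normal CDF written via the error function:

```python
from math import erf, sqrt

def normal_cdf(z):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# P(muX - 1.96*sigmaX <= X <= muX + 1.96*sigmaX) reduces, after
# standardizing, to P(-1.96 <= Z <= 1.96) for a standard normal Z
p = normal_cdf(1.96) - normal_cdf(-1.96)
print(round(p, 4))  # 0.95
```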
The Standard Normal Distribution
The Standard Normal Distribution
I What does that mean?
I and d is the standardized value of c: d = (c − µY)/σY
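A short sketch of this standardization step (the values of µY, σY, and c below are hypothetical, chosen only for illustration):

```python
from math import erf, sqrt

def normal_cdf(z):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu_Y, sigma_Y = 30.0, 5.0   # hypothetical mean and sd of Y
c = 38.0                    # hypothetical threshold

# d is the standardized value of c
d = (c - mu_Y) / sigma_Y

# P(Y <= c) equals P(Z <= d) for a standard normal Z
print(d, round(normal_cdf(d), 4))
```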
Probability Density Functions of χ2 Distributions
The Student t Distribution
Probability Density Functions of t Distributions
The F Distribution
Probability Density Functions of F Distributions
Chapter 2: Random Sampling and the Distribution of the
Sample Average
Sample vs Population
The key idea in statistics is that we use information from a sample in order to
obtain insights about the population!
I Almost all the techniques discussed later on involve averages or weighted
averages of data from a sample.
I This section first talks about random sampling and then the distribution of the
sample mean
Random Sampling
I The simplest form of random sampling is to randomly select n units from the
population, with each unit selected with the same probability
I Example:
I If we randomly select days on which we record our commuting time, ...
I then we obtain a series of (y1 , . . . , yn ) records of commuting time.
I Each yi is a realization of random variable Yi which is the commuting time on
day i.
I and since we chose the days at random (in advance), the commuting time on
one day provides no information about commuting time on another day.
I This means that the random variables (Y1 , . . . , Yn ) are independent.
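A minimal sketch of simple random sampling in Python (the population of 365 days and the sample size n = 10 are made-up values):

```python
import random

random.seed(0)
days_in_year = list(range(1, 366))   # hypothetical population: the days of a year
n = 10

# simple random sampling: every day has the same probability of being chosen,
# and the days are chosen in advance, independently of the commuting times
sampled_days = random.sample(days_in_year, n)
print(sorted(sampled_days))
```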
Independently and identically distributed (i.i.d.)
I The previous example explains why simple random sampling yields a sample
whose observations are independently and identically distributed
The Sampling Distribution of the Sample Average
Mean and Variance of y
I Since y is a random variable we can calculate the expected value (mean) and
variance:
E[y] = (1/n) Σᵢ₌₁ⁿ E[yi] = (1/n) · nµY = µY
I Here we use the fact that each yi was drawn from the same distribution and
thus has the same mean µY
For the variance we have:

Var[y] = Var[(1/n) Σᵢ₌₁ⁿ yi] = (1/n²) Σᵢ₌₁ⁿ Var[yi] + (1/n²) Σᵢ₌₁ⁿ Σⱼ≠ᵢ Cov[yi, yj]

Since the yi were drawn independently, all the covariance terms are zero, so:

Var[y] = (1/n²) Σᵢ₌₁ⁿ Var[yi] = (1/n²) · nσY² = σY²/n

For the standard deviation we take the square root: σy = σY/√n
Mean and Variance of y
Let’s state the results again:
I The expected value of the sample mean (y ) is given by:
E[y ] = µY
I The variance and standard deviation of the sample mean (y) are given by:

Var[y] = σy² = σY²/n

Std.Dev[y] = σy = σY/√n

I These results hold whatever the common distribution of the random variables
Y1 , ..., Yn is! If it were a normal distribution we could even say that

y ∼ N(µY , σY²/n)
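Both results can be checked by simulation. A sketch with a made-up population mean and standard deviation (a normal population is used here for convenience, but the mean and variance results hold for any common distribution):

```python
import random

random.seed(0)
mu_Y, sigma_Y = 4.0, 3.0             # hypothetical population mean and sd
n, reps = 25, 20_000                 # sample size and number of replications

# draw many samples of size n and record each sample average
ybars = []
for _ in range(reps):
    sample = [random.gauss(mu_Y, sigma_Y) for _ in range(n)]
    ybars.append(sum(sample) / n)

mean_ybar = sum(ybars) / reps
var_ybar = sum((y - mean_ybar) ** 2 for y in ybars) / reps
print(mean_ybar)                     # close to mu_Y = 4.0
print(var_ybar)                      # close to sigma_Y**2 / n = 0.36
```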
Mean and Variance of y
Chapter 2: Weak Law of Large Numbers and the Central
Limit Theorem
The sampling distribution of y
I We said that only if we know that Y1 , ..., Yn are each normally distributed
can we say that we know the distribution of y, namely:

y ∼ N(µY , σY²/n)

I If we don’t know the exact distribution of each Y1 , ..., Yn , we can only derive
two characteristics of the distribution of y, namely its mean and its variance:

E[y] = µY

Var[y] = σy² = σY²/n

I But if we want to make statements about a likely range within which y falls,
we need more than E[y] and Var[y]. We need to know the (sampling) distribution of y
The asymptotic distribution of y
I There are two remarkable results in probability theory which allow us to learn
more about the behaviour of y if we assume that the sample becomes very large: n → ∞
The Law of Large Numbers
I The first of these results is the (weak) law of large numbers which says that
if the sample size is large, y will be very close to µY with high probability.
Central Limit Theorem
I The second of these results is the central limit theorem (CLT). It says that,
when the sample size is large, the sampling distribution of the standardized
sample average (y − µY)/σy is approximately normal.
I Remarkably, the CLT holds independent of how the random variables Y are
distributed.
I This result will allow us to perform hypothesis tests and make statements
about how certain we are about results obtained from our sample!
The CLT in Action
I Let’s carry out a small Monte Carlo experiment to see whether this is all true!
I The setting:
I Draw a random sample of size n from a uniform distribution over the interval
[1, 100] (for that type of distribution we have µ = 50.5 and σ 2 = 816.75) and
compute the standardized sample average.
I Generate 10,000 draws (samples) and see how the standardized sample
averages are distributed.
I So a practical question is how large should each of these 10,000 samples be?
What should we choose for n? Let’s start with n=5.
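A sketch of this experiment using only the Python standard library (the histogram inspection from the slides is replaced here by checking that the standardized averages have mean ≈ 0 and variance ≈ 1):

```python
import random
from math import sqrt

random.seed(0)
mu, sigma2 = 50.5, 816.75            # mean and variance of Uniform[1, 100]
n, draws = 5, 10_000                 # sample size and number of samples

# for each sample: draw n uniforms, average them, standardize the average
z = []
for _ in range(draws):
    ybar = sum(random.uniform(1, 100) for _ in range(n)) / n
    z.append((ybar - mu) / sqrt(sigma2 / n))

# the CLT says z should be approximately N(0, 1) as n grows
mean_z = sum(z) / draws
var_z = sum((x - mean_z) ** 2 for x in z) / draws
print(mean_z, var_z)                 # close to 0 and 1, respectively
```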
10,000 samples, each of size n = 5
10,000 samples, each of size n = 20
10,000 samples, each of size n = 50
What happens if we sample from a very uneven distribution?
Richest 100 observations from a Pareto vs Normal Distribution
10,000 samples, each of size n = 5
10,000 samples, each of size n = 20
10,000 samples, each of size n = 100
10,000 samples, each of size n = 2000
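The slow convergence for a heavy-tailed distribution can be reproduced with a sketch like the following. The Pareto shape α = 8 with scale 1 is an assumption, chosen so that the mean, the variance, and the skewness estimate are all well behaved; skewness 0 is the normal benchmark:

```python
import random
from math import sqrt

random.seed(0)
alpha = 8.0                                     # hypothetical Pareto shape (scale 1)
mu = alpha / (alpha - 1)                        # population mean
var = alpha / ((alpha - 1) ** 2 * (alpha - 2))  # population variance

def pareto_draw():
    """Inverse-CDF sampling from a Pareto(alpha) distribution with scale 1."""
    return (1.0 - random.random()) ** (-1.0 / alpha)

def skew_of_standardized_average(n, draws=1000):
    """Skewness of the standardized sample average over many Monte Carlo draws."""
    z = []
    for _ in range(draws):
        ybar = sum(pareto_draw() for _ in range(n)) / n
        z.append((ybar - mu) / sqrt(var / n))
    m = sum(z) / draws
    return sum((x - m) ** 3 for x in z) / draws

# a normal distribution has skewness 0: the standardized average is still
# clearly skewed for n = 5, but roughly symmetric for n = 2000
skew_small = skew_of_standardized_average(5)
skew_large = skew_of_standardized_average(2000)
print(skew_small, skew_large)
```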
Central Limit Theorem
Further Thoughts on Sampling
Sampling Problems
I Selection bias results when a subset of the units in the population is
less/more likely to be included in the sample (and the research design does
not or cannot take this into account)
I pure internet surveys
I survivorship bias (in firm data)
I differential non-response bias results for example when the probability of
participating in a survey is systematically linked to some characteristics of the
units in the population. (wealthy households less likely to participate in surveys
on household finances)
I population characteristics
I e.g.: the true proportion intending to vote for a certain party in the next election
(can be more complicated like an average treatment effect)
I it is unknown and we try to estimate it from the sample
I sample characteristics
I values we obtain from the sample (like the proportion of people intending to vote
for X)
I we use them to make statements about the population (statistical significance
of an effect, margin of error around the voting intention we observe in the sample)
If your sample is biased (see above), your conclusions about the population will
be biased!