Lecture1 - Copy (1) Copy 2

The lecture introduces statistical inference, which involves drawing conclusions about populations from data. It outlines the process of defining population parameters, estimators, and the importance of sampling methods, emphasizing the need for random samples to accurately represent the population. The lecture concludes with a roadmap for applying statistical inference to analyze relationships between variables, such as the impact of education on earnings.

Uploaded by

jpbennett223
Lecture 1: ECON-UA 266 - Intro to Econometrics
Statistical Inference

Sahar Parsa

Statistical Inference
What is statistical inference?

Statistical inference: the process of drawing conclusions about populations or scientific truths from data.

Statistical inference

1. Define a population relationship or parameter of interest

2. Define an estimator/statistic

3. Properties of the estimator and sampling distribution

4. Estimate

5. Hypothesis testing and confidence intervals

What is the population?

Suppose we are interested in the expected value of the annual earnings of students graduating from NYU.

“What are the expected earnings of NYU students?”

The object of interest is a population expectation, where the population is NYU students.

The population is described by a random variable and its probability distribution.

How do we describe a population?

Formally,

1. $X$ is defined as a random variable taking all the possible values of earnings.

2. $f_X(x)$ is defined as the probability density function of the earnings of NYU students.

3. $\mu_X = \mathbb{E}[X]$ is the expected value of the earnings of students at NYU. It is the unknown parameter we are interested in (for instance, $\mu_X = 5.7$; $\mu$ is read "mu").
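
For a continuous random variable, the expectation in item 3 is the density-weighted average of all possible earnings. As a reminder, this is the standard definition written with the symbols used above:

```latex
\mu_X = \mathbb{E}[X] = \int_{-\infty}^{\infty} x \, f_X(x) \, dx
```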

Problem statement

Problem: We do not observe the population relationship/parameter, …

… but we might have observational data on earnings for a sample of individuals at NYU.

This dataset is a sample from the population distribution of interest.

What is a sample?

A sample: a subset of the population that provides information about it without having to survey the entire group.

Statistical inference, population and sample

From the population to the sample: the sample is generated from a population distribution.

From the sample to the population: we observe the sample (NOT THE POPULATION) and make inferences about the population.

Good and bad samples

A good sample recreates the characteristics of the entire population on a smaller scale.

Whether a sample is a good sample depends on the sampling method, i.e., how participants are selected for a sample.

Bad sampling methods include those that:

1. Gather data from outside the population being studied

2. Gather data that overrepresent or underrepresent a subgroup of the population (not random)
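
The effect of overrepresenting a subgroup can be sketched with a small simulation. Everything below is an illustrative assumption rather than lecture data: the population is a made-up list of 1,000 earnings values (in thousands of dollars) with a low-earning and a high-earning subgroup, and the variable names are hypothetical.

```python
import random
import statistics

# Hypothetical population (illustrative numbers, not from the lecture):
# 800 people earning 50 and 200 people earning 150 (thousands of dollars).
population = [50] * 800 + [150] * 200
population_mean = statistics.mean(population)  # (800*50 + 200*150) / 1000

rng = random.Random(0)  # fixed seed so the run is reproducible

# Random sampling: every member has the same chance of being selected.
random_sample = rng.sample(population, 100)
random_mean = statistics.mean(random_sample)

# Bad sampling: only the high-earning subgroup is ever selected.
high_earners = [x for x in population if x == 150]
biased_sample = rng.sample(high_earners, 100)
biased_mean = statistics.mean(biased_sample)

print("population mean:   ", population_mean)
print("random-sample mean:", random_mean)
print("biased-sample mean:", biased_mean)
```

The biased sample can only ever report the subgroup's earnings, no matter how many people are surveyed, whereas the random-sample mean varies around the population mean.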

Random sample

A random sample is a good sample:

1. representative: the sample includes only members of the population being studied.

2. random: every member of the population being studied has an equal chance of being selected for the sample.

Random samples are formalized by the notion of independent and identically distributed (i.i.d.) random variables.

Independent and identically distributed

$X_i$, $i = 1, \dots, N$, are independent and identically distributed, i.e.:

i. Independent: $X_i$ is independent of $X_j$ for all $i \neq j$.

ii. Identically distributed: each $X_i$ is drawn from the same distribution $f_X(x)$, where $f_X(\cdot)$ is a probability density function.
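
One consequence worth writing out: when both conditions hold, the joint density of the sample factorizes into a product of the same marginal density. This is the standard implication of i.i.d. sampling, stated with the $f_X$ defined above:

```latex
f_{X_1, \dots, X_N}(x_1, \dots, x_N) = \prod_{i=1}^{N} f_X(x_i)
```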

To sum up

1. Parameter: an unknown constant describing the population behavior/relationship we are interested in.

2. Sample: a random sample from the population.

How do we infer about the parameter using a sample? With a statistic/estimator.

Statistics and estimators

A statistic is any function of the dataset.

• An estimator is a statistic used to gauge the unknown population parameter of interest with the data.
• It has a sampling distribution.
• It underlies estimates, hypothesis tests, and confidence intervals.

Backbone of statistical inference

Illustration

Recall the NYU student earnings example.

Randomly sample from the population of student earnings at NYU (random sample):

$\{X_i : i = 1, 2, \dots, N\} = \{X_1, X_2, X_3, \dots, X_N\}$

• Estimate $\mu_X$ using the estimator $\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}$. Is this the only estimator?
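
The sample average is not the only candidate: any function of the data can serve as an estimator. The sketch below compares three such functions on one sample. The sample values and the function names are made up for illustration; the lecture does not name the alternatives.

```python
import statistics


def sample_mean(data):
    """Estimator X-bar: the sum of the observations divided by N."""
    return sum(data) / len(data)


def sample_median(data):
    """An alternative estimator of the center: the middle observation."""
    return statistics.median(data)


def first_observation(data):
    """A crude estimator that ignores all but one data point."""
    return data[0]


# Hypothetical earnings sample in thousands of dollars (made-up values).
sample = [62.0, 48.5, 90.0, 71.5, 55.0]

for estimator in (sample_mean, sample_median, first_observation):
    print(estimator.__name__, estimator(sample))
```

All three are valid statistics. Which one is preferable depends on the properties of its sampling distribution, which is the next topic.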

The sampling distribution of a statistic

A statistic/estimator is a random variable because:

The source of randomness is that we do not observe the population but a sample.

A statistic is a function of random variables (the random sample).

It has a sampling distribution: different samples generate different estimates, and these estimates follow a probability distribution.

The sampling distribution of a statistic

Suppose we draw a random sample of 5 students from the students at NYU and record their earnings (in thousands of dollars):

Let the sample be denoted $S_1 = \{75, 40, 120, 250, 155\}$:

• The sample average is $\bar{x}_{S_1} = \frac{75 + 40 + 120 + 250 + 155}{5} = 128$

Let's draw another sample of earnings and denote it $S_2$, where $S_2 = \{30, 55, 135, 90, 70\}$:

• The sample average is $\bar{x}_{S_2} = \frac{30 + 55 + 135 + 90 + 70}{5} = 76$

We can tell that different samples will generate different estimates.
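
The arithmetic for the two samples can be checked in a few lines. The data are the two samples given on the slide; the helper name `sample_average` is only an illustrative choice.

```python
# The two samples from the slide, in thousands of dollars.
s1 = [75, 40, 120, 250, 155]
s2 = [30, 55, 135, 90, 70]


def sample_average(sample):
    """Apply the same estimator (the sample average) to any sample."""
    return sum(sample) / len(sample)


x_bar_s1 = sample_average(s1)  # 640 / 5
x_bar_s2 = sample_average(s2)  # 380 / 5

print(x_bar_s1, x_bar_s2)  # 128.0 76.0
```

The same estimator applied to two samples from the same population returns two different estimates.
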
Sampling distribution

Across repeated samples, the estimates themselves follow a distribution: the sampling distribution.

Sampling distribution and sample size

The larger the sample, the smaller the variance of the sampling distribution.
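
A small simulation can make this visible. For an i.i.d. sample with population variance $\sigma^2$, the sample average has variance $\sigma^2 / N$, so a tenfold larger sample should shrink the variance roughly tenfold. The population below (normal with mean 100 and standard deviation 40) and the number of repetitions are assumptions chosen for illustration, not values from the lecture.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for reproducibility


def draw_sample_mean(n):
    """Draw n i.i.d. observations and return their average."""
    # Hypothetical population: normal with mean 100 and sd 40 (an assumption).
    return statistics.mean([rng.gauss(100, 40) for _ in range(n)])


def sampling_variance(n, repetitions=2000):
    """Approximate the variance of the sample average by repeated sampling."""
    means = [draw_sample_mean(n) for _ in range(repetitions)]
    return statistics.variance(means)


var_n5 = sampling_variance(5)    # theory: 40**2 / 5 = 320
var_n50 = sampling_variance(50)  # theory: 40**2 / 50 = 32

print("variance of sample average, N = 5: ", var_n5)
print("variance of sample average, N = 50:", var_n50)
```

The simulated variances will not match the theoretical values exactly, since they are themselves estimates from a finite number of repetitions.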

Why do we care?

1. We can’t take the estimate from an estimator at face value.

2. We need a method to tell us how confident we should be about the true value (population value): hypothesis testing and confidence intervals.

Notations

Conventional notation (unless stated otherwise):

• Random variables in upper case letters, except for estimators (see below): $X, Y, \dots$

• Real variables in lower case letters: $x, y, \dots$

• Population parameters in Greek letters: $\alpha$ (alpha), $\beta$ (beta), $\mu$ (mu), $\dots$

Notations

• Estimators of population parameters with a hat $\hat{\ }$ (unless stated otherwise): $\hat{\alpha}$ is an estimator for $\alpha$.

• Estimates in Roman letters: $a$ for a value that $\hat{\alpha}$ takes for a given realization of a sample.

Example: Random sample of $N$ individuals $\{X_1, \dots, X_N\}$ i.i.d. from the population random variable $X$ with unknown population mean $\mu$ and unknown population variance $\sigma^2$.

1. The estimator for the unknown mean $\mu$ is $\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}$, which is also a random variable.

2. The estimate given a realized sample $S = \{x_1, \dots, x_N\}$ is $\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$.

Roadmap

Step 1. Define the population parameter

Step 2. Define the estimator

Step 3. Define the properties of the estimator (need to make assumptions about the population distribution and the sample distribution)

Step 4. Estimate, hypothesis testing and confidence intervals

To sum up, in this class,

We will apply statistical inference methods to the relationship between variables.

For instance: “How does an extra year of education affect an individual’s earnings?”

1. We are interested in a population relationship/parameter.

2. We will use a sample from the population.
