Lec2 ASE

The document discusses linear regression models. It explains that a linear regression model specifies a dependent variable (y) as a linear function of one or more independent variables (x). The slope coefficient (β1) measures the average change in the dependent variable (y) from a one-unit change in the independent variable (x). While nonlinear relationships can be approximated by linear models, this introduces approximation errors. The model also includes a stochastic error term (ε) to account for uncertainty. Observations from a random sample are used to estimate the coefficients.

Linear Regression

Karim Nchare

African School of Economics

November 2020
Functional relations

- Quantitative characteristics of the world are usually entangled in functional relations.
- A regression or model specifies an explained variable as a function of an explanatory variable:

  y = f(x)

- y is the regressand, response variable, explained variable, dependent variable, or outcome.
- x is the regressor, predictor variable, explanatory variable, independent variable, or control variable.
(figure: example of a quadratic regression)
Rate of change

∆x = x1 − x0 and ∆y = y1 − y0 = f(x1) − f(x0)

- The rate of change measures how y responds to changes in x:

  ∆y/∆x = (f(x1) − f(x0))/(x1 − x0) = (f(x0 + ∆x) − f(x0))/∆x

- It depends both on the initial point and the magnitude of the change.
Linear Model

- A model is linear if it can be written as:

  y = β0 + β1x

- This means that the graph of the regression is a (straight) line.
Slope coefficient

- The slope of a linear model equals β1, independently of x0 and ∆x:

  ∆y/∆x = (y1 − y0)/(x1 − x0)
        = ((β0 + β1x1) − (β0 + β1x0))/(x1 − x0)
        = β1(x1 − x0)/(x1 − x0)
        = β1
The linearity assumption

- The linearity assumption is less restrictive than it appears.
- The following model is clearly nonlinear:

  y = log(γ0 x^γ1)

- After some relabelling:

  β0 = log(γ0)
  β1 = γ1
  z = log(x)

- We obtain the linear model:

  y = log(γ0) + γ1 log(x) = β0 + β1 z
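As a numerical sanity check, here is a minimal Python sketch (not from the slides; it assumes numpy is available, and the values γ0 = 2 and γ1 = 0.5 are made up for illustration) showing that an ordinary linear fit on z = log(x) recovers both parameters of the nonlinear model:

```python
import numpy as np

gamma0, gamma1 = 2.0, 0.5          # made-up "true" parameters
x = np.linspace(1.0, 10.0, 50)
y = np.log(gamma0 * x**gamma1)     # the nonlinear model y = log(gamma0 * x^gamma1)

z = np.log(x)                      # relabelled regressor z = log(x)
beta1, beta0 = np.polyfit(z, y, 1) # linear fit y = beta0 + beta1 * z

print(beta0, np.log(gamma0))       # beta0 recovers log(gamma0)
print(beta1, gamma1)               # beta1 recovers gamma1
```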


Approximating nonlinear models

- Suppose that the true relationship between x and y is given by

  y = f(x)

- We can always abstract from potential nonlinearities and use a linear model:

  ỹ = β0 + β1x ≈ y = f(x)

- If f is not linear, then the approximation will be inexact and there will be approximation errors:

  ε = y − ỹ

(figure: approximating a nonlinear model with a linear one)
Multivariate regressions

- The value of the response variable may be a function of many regressors:

  y = f(x1, x2, ..., xk)

- We can still have linear models:

  y = β0 + β1x1 + β2x2 + · · · + βkxk

- In this case, each coefficient βj still measures the change in y from a one-unit change in xj, holding every other variable constant:

  ∆y/∆xj = βj

- For multivariate regressions, linearity assumes separability.
Unobserved variables

- We may not know or observe all the variables which affect y:

  y = β0 + β1x1 + β2x2 + · · · + βkxk

- If β2x2 + · · · + βkxk is unobserved, we can still approximate y with the variables that we do observe:

  ỹ = β0 + β1x1

- As before, this approximation is inexact and has an approximation error:

  ε = y − ỹ = β2x2 + · · · + βkxk
Stochastic regression

- Most of the time there is uncertainty because (at least):
  - we are not certain about the linearity of the regression
  - we cannot list all the relevant regressors
  - we may have measurement error issues
- Uncertainty is captured by a stochastic error term ε:

  y = β0 + β1x + ε

- β0 + β1x is called the deterministic component of the model.
- ε is called the random component of the model.
Stochastic regression

- Assume that the error has zero mean conditional on x.
- Then the deterministic component corresponds to the mean of y conditional on x:

  E(y|x) = E(β0 + β1x + ε | x) = β0 + β1x

- The slope coefficient then measures the average per-unit effect of a change in x on the average value of y conditional on x:

  E(y|x1) − E(y|x0) = β1(x1 − x0)
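A small simulation can illustrate this property. The sketch below is not part of the lecture; the coefficient values and the discrete design for x are arbitrary choices. It checks that the sample mean of y at each value of x tracks the deterministic component β0 + β1x:

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 1.0, 2.0                             # illustrative coefficients
x = rng.integers(0, 5, size=100_000).astype(float)  # a few distinct x values
eps = rng.normal(0.0, 1.0, size=x.size)             # E(eps | x) = 0 by construction
y = beta0 + beta1 * x + eps

for v in np.unique(x):
    # sample conditional mean of y vs. the deterministic component
    print(v, y[x == v].mean(), beta0 + beta1 * v)
```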


Random Sample

- We are usually interested in different observations, coming from:
  - cross-sectional data – different sources
  - time series – a single source at different times
  - panel data – different time series from different sources
- We assume that the data come from a random sample {xi, yi, εi}.
- xi and yi are observed but εi is not, and we have a collection of equations:

  yi = β0 + β1xi + εi

- In the case of a multivariate regression:

  yi = β0 + β1x1i + · · · + βkxki + εi
Predictions and Residuals

- Suppose that we have estimates β̂0 and β̂1; the estimated model is then:

  ŷ = β̂0 + β̂1x

- Given an estimated model, for each realization of xi the predicted value of yi is:

  ŷi = β̂0 + β̂1xi

- The corresponding residual is:

  ei = yi − ŷi

- Notice we cannot guarantee that ei = εi unless we know β0 and β1.
(figures: a linear regression – random sample; the estimated model; errors vs. residuals)
Example: Height and Weight model

- Contest game:
  - If you guess the weight of a participant within 10 lb of the actual weight, you get paid $2.
  - Otherwise you pay him or her $3.
- You could use height (observable) to estimate the weight:

  WEIGHTi = β0 + β1 HEIGHTi + εi

- Given estimated coefficients β̂0 = 103.4 and β̂1 = 6.38, you can make predictions:

  WEIGHT̂i = 103.4 + 6.38 HEIGHTi
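A sketch of how these predictions would be used in the contest, hard-coding the slide's estimates; the participant heights and weights below are invented purely for illustration:

```python
def predict_weight(height):
    # slide estimates: beta0-hat = 103.4, beta1-hat = 6.38
    return 103.4 + 6.38 * height

def payoff(actual, guess):
    # win $2 if the guess is within 10 lb, otherwise pay $3
    return 2.0 if abs(actual - guess) <= 10.0 else -3.0

# (height, actual weight) pairs -- hypothetical participants
participants = [(10.0, 165.0), (12.0, 185.0), (8.0, 140.0)]
for height, weight in participants:
    guess = predict_weight(height)
    print(height, round(guess, 1), payoff(weight, guess))
```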
(figures: height and weight – predictions, observations, residuals)
Estimating linear models

- Begin from a dataset coming from a random sample {xi, yi}.
- We assume that x and y are related by a model:

  yi = β0 + β1xi + εi

- We do not observe εi or the true coefficients β0 and β1.
- Our objective now is to generate estimates β̂0 and β̂1 of these coefficients to obtain an estimated model:

  ŷi = β̂0 + β̂1xi
(figures: linear regression – data generating process; realized random sample; the best/closest linear model)
The best linear model

- Two uses for the estimated model:
  - Prediction: given {xi, yi}, what is the predicted value ŷ for a new value of x?
  - Policy: given {xi, yi}, what is the average change in y associated with a change in x:

    ∆ŷ = β̂1∆x ≈ β1∆x = ∆y

- Better predictions when yi ≈ ŷi, i.e. when the residuals are small.
- Policy implications only make sense if we establish causality.
- Better policy implications when β̂1 ≈ β1, i.e. when e ≈ ε.
Ordinary least squares

Given a dataset, the ordinary least squares (OLS) estimates of β0 and β1 are the numbers β̂0 and β̂1 which minimize the sum of squared residuals:

  SSR = Σ (yi − β̂0 − β̂1xi)²   (sum over i = 1, ..., n)

The OLS estimated model is ŷi = β̂0 + β̂1xi.

- We wish to have small residuals, where small means in magnitude, not sign:

  ei = yi − ŷi = yi − β̂0 − β̂1xi
(figures: OLS – random samples and the corresponding estimated models)
Computing OLS

- When β1 = 0, we know that β̂0 = ȳ. Why?
- Now suppose instead that we know β0 = 0, i.e. yi = β1xi + εi. In this case we obtain:

  β̂1 = Σ xiyi / Σ xi²

- In the general case, the OLS estimates are given by:

  β̂1 = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²
  β̂0 = ȳ − β̂1x̄

- Notice that β̂1 looks like a sample analogue of cov(x, y)/var(x).
- The OLS estimates guarantee that Σ êi = 0.
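A minimal numpy implementation of these closed-form estimates, run on simulated data (the "true" coefficients 105.22 and 6.38 are borrowed from the height–weight example purely to make the output recognizable):

```python
import numpy as np

def ols(x, y):
    # closed-form OLS estimates from the slide
    xbar, ybar = x.mean(), y.mean()
    b1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()
    b0 = ybar - b1 * xbar
    return b0, b1

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 200)
y = 105.22 + 6.38 * x + rng.normal(0.0, 5.0, 200)  # simulated sample

b0, b1 = ols(x, y)
residuals = y - (b0 + b1 * x)
print(b0, b1)           # close to 105.22 and 6.38
print(residuals.sum())  # essentially zero, as the slide claims
```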
Example: height and weight – Computing OLS

  β̂1 = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)² = 590.2/92.55 ≈ 6.38
  β̂0 = ȳ − β̂1x̄ = 169 − 6.38 × 10 ≈ 105.22
  ŷi = 105.22 + 6.38xi
(figures: further examples – geography of trade; military service and income; income vs. fecundity; public debt vs. growth)
The need for an intercept

- Most of the time we will be interested in β1 rather than β0.
- One could simply estimate

  yi = β1xi + εi

- But if β0 ≠ 0, we may get bad estimates.
Multivariate regressions

- The analysis extends to multivariate models:

  yi = β0 + β1x1i + · · · + βkxki + εi

- The interpretation is slightly different: β̂k indicates the response to changes in xk holding the other regressors constant.
- OLS is defined in the same way: by minimizing the SSR.
- The formulas require linear algebra.
- OLS is never done by hand: we use computers.
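In practice, "using computers" can be as simple as the sketch below, which builds a design matrix with an intercept column and calls numpy's least-squares solver; the data-generating coefficients are arbitrary choices, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(size=n)  # arbitrary coefficients

X = np.column_stack([np.ones(n), x1, x2])        # design matrix with intercept
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None) # minimizes the SSR
print(beta_hat)                                  # approximately [1, 2, -3]
```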
Example: Financial Aid

- Response variable: FINAIDi – grant per year to applicant i
- Regressors:
  - PARENTi – feasible contribution from parents
  - HSRANKi – GPA rank in high school
  - GENDERi – gender dummy (1 if male, 0 if female)

(figures: financial aid dataset)
Example: financial aid, OLS

Estimated OLS model (ignoring GENDER and HSRANK):

  FINAID̂i = 15897 − 0.34 PARENTi

Estimated OLS model (ignoring GENDER):

  FINAID̂i = 8927 − 0.36 PARENTi + 87.4 HSRANKi
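Using the second estimated model for prediction might look like this; the applicant's PARENT and HSRANK values are hypothetical:

```python
def predicted_finaid(parent, hsrank):
    # slide estimates, ignoring GENDER
    return 8927 - 0.36 * parent + 87.4 * hsrank

# hypothetical applicant: parents can contribute $10,000, high-school rank 90
print(predicted_finaid(parent=10_000, hsrank=90))  # 8927 - 3600 + 7866 = 13193.0
```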
Interaction terms

- If the effect of x1 on y depends on the value of x2, include an interaction term x1x2 in the regression:

  y = β0 + β1x1 + β2x2 + β3x1x2 + ε

- The average effect of a one-unit change in x1 on y is then given by β1 + β3x2:

  E(y|x1′, x2) − E(y|x1, x2) = (x1′ − x1)(β1 + β3x2)
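A short sketch of this logic (all coefficient values invented for illustration): the effect of a one-unit change in x1 computed from the regression function matches β1 + β3x2 at each value of x2:

```python
def expected_y(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=-0.25):
    # E(y | x1, x2) for the interaction model, with invented coefficients
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

for x2 in (0.0, 2.0, 4.0):
    effect = expected_y(1.0, x2) - expected_y(0.0, x2)  # one-unit change in x1
    print(x2, effect, 2.0 + (-0.25) * x2)               # equals b1 + b3 * x2
```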
(figures: Anscombe's quartet – data, scatterplots, and estimated models)
Evaluating an estimated model

- Is the equation supported by sound theory/common sense?
- How well does the estimated model fit the data?
- Is the dataset reasonably large and accurate?
- Is OLS the best estimator to be used?
- Do the estimated coefficients correspond to prior expectations?
- Are all the important variables included?
- In case we want to do policy: are the estimated parameters structural?
Explained variation

- Regressions are used to explain y.
- In particular, we wish to explain why and when yi differs from E(y).
- The variation in y can be decomposed as:

  yi − E(y) = β0 + β1xi + εi − β0 − β1E(x)
            = β1(xi − E(x)) + εi

  where β1(xi − E(x)) is the explained part and εi the unexplained part.
- In sample terms, using ȳ and x̄ in place of E(y) and E(x):

  yi − ȳ = β1(xi − x̄) + εi

- One way to evaluate estimated models is to measure the proportion of the variance of y that we are able to explain.
(figures: variance decomposition examples)
Variance decomposition

  SST = Σ (yi − ȳ)² = Σ (yi − ŷi + ŷi − ȳ)²
      = Σ (yi − ŷi)² + 2 Σ (yi − ŷi)(ŷi − ȳ) + Σ (ŷi − ȳ)²
      = Σ (yi − ŷi)² + Σ (ŷi − ȳ)²
      = Sum of Squared Residuals + Sum of Squares Explained
      = SSR + SSE

The cross term drops out because OLS residuals sum to zero and are uncorrelated with the fitted values.
Goodness of fit: R²

- We have decomposed the total variation (SST) into the explained variation (SSE) and the unexplained or residual variation (SSR).
- R² measures how much of the variation of y can be explained by the variation of x according to the estimated model:

  R² = SSE/SST = (SST − SSR)/SST = 1 − SSR/SST

- The higher the R², the closer the model is to the data, and since 0 ≤ SSR ≤ SST we know that 0 ≤ R² ≤ 1.
- It does not measure:
  - how linear/tight the relation between x and y is (correlation)
  - the inclination of the estimated line (the slope coefficient)
  - the strength of the causal relation between x and y
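The decomposition and the equivalent R² formulas are easy to verify numerically. The following sketch re-derives the OLS fit on simulated data (all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 10.0, 300)
y = 2.0 + 1.5 * x + rng.normal(0.0, 2.0, 300)  # arbitrary coefficients

# OLS fit (closed form, as derived earlier)
b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = ((y - y.mean()) ** 2).sum()
ssr = ((y - y_hat) ** 2).sum()         # residual (unexplained) variation
sse = ((y_hat - y.mean()) ** 2).sum()  # explained variation

print(np.isclose(sst, ssr + sse))  # True: the cross term vanishes under OLS
print(sse / sst, 1.0 - ssr / sst)  # the two R^2 expressions agree
```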
(figures: R² examples – height and weight, computing OLS)
Adding more regressors

- Adding a regressor always decreases the SSR, and therefore always increases R², even if y is independent of it. Why?
- More variables thus mechanically improve the R², even though each added parameter uses up a degree of freedom.
- The adjusted R² controls for this bias:

  R̄² = 1 − (SSR/(n − K)) / (SST/(n − 1))

  where n is the sample size and K is the number of parameters.
- R̄² = R² when K = 1, and R̄² ≈ R² when n is very large.
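A sketch of the adjusted R² penalty (assuming numpy; all data simulated, and K counting every estimated parameter including the intercept): adding a pure-noise regressor raises R² slightly, while the adjusted R² typically falls.

```python
import numpy as np

def r2_stats(X, y):
    # R^2 and adjusted R^2; K counts all estimated parameters
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = ((y - X @ beta) ** 2).sum()
    sst = ((y - y.mean()) ** 2).sum()
    n, K = X.shape
    return 1 - ssr / sst, 1 - (ssr / (n - K)) / (sst / (n - 1))

rng = np.random.default_rng(4)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
noise = rng.normal(size=n)  # a regressor unrelated to y

X1 = np.column_stack([np.ones(n), x])
X2 = np.column_stack([np.ones(n), x, noise])
print(r2_stats(X1, y))  # (R^2, adjusted R^2) without the junk regressor
print(r2_stats(X2, y))  # R^2 weakly rises; adjusted R^2 typically falls
```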
(figures: ANOVA table; example – water supply variables)
