Chapter 0_Multiple Regression Models

The document provides an overview of multiple regression models, including basic concepts, types of regression models, and model testing methods. It emphasizes the importance of econometrics in quantifying and forecasting economic relationships, and discusses various regression techniques such as simple and multiple regression, as well as models with dummy variables. Additionally, it addresses common defects in regression models and methods for their resolution.

Uploaded by

Linh Phạm

NATIONAL ECONOMIC UNIVERSITY

Chapter 0

Multiple regression models

Dr. Phung Minh Duc


Contents
1. Basic Concepts
2. Simple Regression Model
3. Multiple Regression Model
4. Several types of regression models
5. Model with dummy variable
6. Model Tests
7. Some defects in the regression model
8. Practice
1 Basic concepts

What is Econometrics?
• Econometrics = Economics + Metrics (measurement using mathematics and statistics)
• Econometrics combines economics, mathematics, and statistics to quantify, test, and forecast economic relationships.
Mathematical Modeling
Case study

Research question:
What is the impact of VAT on the poverty rate in Vietnam?

Qualitative or quantitative?
Quantitative methods
1 Basic concepts

State Hypotheses
Theoretical model
Model Setting
Econometric Model
Data Collection

Parameter Estimate

Analyze the results

Forecast

Conclusions
1 Data classification

❖ Data Structures:
✓ Cross sectional data;
✓ Time series;
✓ Panel data.
❖ Data format:
✓ Quantitative;
✓ Qualitative
❖ Data source:
✓ Primary;
✓ Secondary.
2 Simple Regression Model

Population Regression Function: $Y = \beta_0 + \beta_1 X + u$

❖ $Y$: Dependent variable;
❖ $X$: Independent variable;
❖ $u$: Random error
❖ $\beta_0$: Intercept coefficient ($\beta_0 = E(Y \mid X = 0)$)
❖ $\beta_1$: Slope coefficient (the mean effect of $X$ on $Y$)
o $\beta_1 = 0$: $X$ does not affect $Y$
o $\beta_1 > 0$: $X$ ↑ (↓) 1 unit → $Y$ ↑ (↓) by $\beta_1$ units on average
o $\beta_1 < 0$: $X$ ↑ (↓) 1 unit → $Y$ ↓ (↑) by $|\beta_1|$ units on average
❖ Assumption $E(u \mid X) = 0$ → $E(Y \mid X) = \beta_0 + \beta_1 X$
Population (unknown) | Sample (data)

Ordinary Least Squares Method (OLS):
Find $\hat{\beta}_0, \hat{\beta}_1$ such that $RSS = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \to \min$
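As a minimal sketch of the minimization above, the closed-form OLS solution for one regressor can be computed directly. The function name `ols_simple` and the five data points are illustrative assumptions, not taken from the slides.

```python
def ols_simple(x, y):
    """Return (b0, b1) that minimize the residual sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sample covariance of X and Y divided by sample variance of X.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical sample, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = ols_simple(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
rss = sum(e ** 2 for e in residuals)
```

Any other choice of intercept and slope gives a larger `rss` on the same data.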
Assumptions of OLS method:

❖ AS1: Samples are random, independent
❖ AS2: $E(u \mid X) = 0$
❖ AS3: Homoscedasticity
$\mathrm{var}(u \mid X) = \sigma^2$; $\mathrm{var}(u_i \mid X_i) = \mathrm{var}(u_j \mid X_j)\ \forall i \neq j$
Theorem:
If AS1-AS3 are satisfied then $E(\hat{\beta}_0) = \beta_0$, $E(\hat{\beta}_1) = \beta_1$ (unbiased) and they are the best (minimum-variance) linear unbiased estimators.
Coefficient of determination:
❖ The coefficient
$R^2 = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$
indicates the share (%) of the variation of the dependent variable in the sample that is explained by the model (by the variation of the independent variables).
❖ Adjusted R-squared coefficient:
$\bar{R}^2 = 1 - (1 - R^2)\frac{n-1}{n-k}$
where $k$ is the number of estimated coefficients, including the intercept.
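The two coefficients can be sketched in a few lines. The helper names and the sample values below are assumptions for illustration; `k` is taken to count every estimated coefficient, intercept included.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - RSS / TSS for observed y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - rss / tss


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 with n observations and k estimated coefficients."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


# Hypothetical observed values and fitted values from a one-regressor fit.
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(y, y_hat)
r2_adj = adjusted_r_squared(r2, n=len(y), k=2)
```

The adjusted value is lower than the plain one because it penalizes each extra estimated coefficient.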
3 Multiple regression models

The necessity of multiple regression models


$Y = \beta_0 + \beta_1 X + u$ (1)
❖ If $\mathrm{cov}(u, X) \neq 0$ then $X$ is called an endogenous independent variable → Assumption 2 is not satisfied → $E(\hat{\beta}_0) \neq \beta_0$, $E(\hat{\beta}_1) \neq \beta_1$: the estimators are biased
❖ If $Z$ is the factor in $u$ that is correlated with $X$, then adding $Z$ to the model gives the multiple regression model
$Y = \beta_0 + \beta_1 X + \beta_2 Z + u$ (2)
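The size of the bias can be written out as a short derivation. This is a sketch under the assumption that model (2) is the true model and that its error $u$ is uncorrelated with both $X$ and $Z$; the symbol $\tilde{\beta}_1$ (the OLS slope from the short regression of $Y$ on $X$ alone) is notation introduced here, not in the slides.

```latex
\begin{aligned}
Y &= \beta_0 + \beta_1 X + \beta_2 Z + u,
   \qquad \operatorname{cov}(u, X) = \operatorname{cov}(u, Z) = 0 \\
\tilde{\beta}_1 &\xrightarrow{\;p\;} \frac{\operatorname{cov}(X, Y)}{\operatorname{var}(X)}
   = \beta_1 + \beta_2 \, \frac{\operatorname{cov}(X, Z)}{\operatorname{var}(X)}
\end{aligned}
```

The second term vanishes only if $\beta_2 = 0$ or $X$ and $Z$ are uncorrelated, which is why an omitted factor matters only when it both affects $Y$ and is correlated with $X$.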
Advantages of multiple regression models

❖ A more suitable functional form
❖ Better estimation quality
❖ More diverse information
❖ More useful forecasts
General multiple regression model

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + u$ (1)
In which:
❖ $E(Y \mid X_1, X_2, \dots, X_k) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
❖ $\beta_0 = E(Y \mid X_1 = \dots = X_k = 0)$
❖ $\beta_j = \frac{\partial E(Y)}{\partial X_j}$: the partial effect of $X_j$ on $E(Y)$
❖ The sample multiple regression model has the form:
$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \dots + \hat{\beta}_k X_{ki} + e_i$ (2)
Ordinary Least Squares Method

Find $\hat{\beta}_j$ such that: $\sum e_i^2 = \sum (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} - \dots - \hat{\beta}_k X_{ki})^2 \to \min$
Assumptions:
❖ AS1: Samples are random, independent
❖ AS2: $E(u \mid X_1, \dots, X_k) = 0$
❖ AS3: $\mathrm{var}(u \mid X_1, \dots, X_k) = \sigma^2$ (Homoscedasticity)
❖ AS4: The independent variables do not have perfect multicollinearity
Theorem: If AS1-AS4 are satisfied then the OLS estimators are BLUE (Best Linear Unbiased Estimators) and consistent (the sample estimates converge to the population values as $n$ grows large).
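One way to carry out this minimization is to solve the normal equations $(X'X)\hat{\beta} = X'Y$. The sketch below uses only the Python standard library; the function names and the noise-free sample (generated from $Y = 1 + 2X_1 - 3X_2$) are assumptions made for illustration, and statistical software uses more numerically robust methods.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # A zero pivot means X'X is singular, i.e. AS4 fails.
            raise ValueError("perfect multicollinearity: X'X is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_multiple(columns, y):
    """columns holds the regressors X1..Xk; returns [b0, b1, ..., bk]."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    return solve_linear(xtx, xty)


# Hypothetical data built without noise from Y = 1 + 2*X1 - 3*X2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]

coefficients = ols_multiple([x1, x2], y)
```

Because the sample contains no error term, the estimates recover the generating coefficients up to rounding.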
The coefficient of determination:

The coefficient
$R^2 = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}$
indicates the share (%) of the variation of the dependent variable in the sample that is explained by the model (by the variation of the independent variables).
4 Several types of regression models


❖ Linear-linear model: $Y = \beta_0 + \beta_1 X + u$
=> If $X$ increases by 1 unit then $E(Y)$ changes by $\beta_1$ units
❖ Log-log model: $\ln Y = \beta_0 + \beta_1 \ln X + u$
=> If $X$ increases by 1% then $E(Y)$ changes by approximately $\beta_1$%
❖ Linear-log model: $Y = \beta_0 + \beta_1 \ln X + u$
=> If $X$ increases by 1% then $E(Y)$ changes by approximately $\frac{\beta_1}{100}$ units
❖ Log-linear model: $\ln Y = \beta_0 + \beta_1 X + u$
=> If $X$ increases by 1 unit then $E(Y)$ changes by approximately $100\beta_1$%
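The percentage reading of the log-linear model is an approximation that works well for small coefficients. The sketch below compares it with the exact change $100(e^{\beta_1} - 1)$ implied by $\ln Y = \beta_0 + \beta_1 X + u$ when $u$ is held fixed; the coefficient values are made up for illustration.

```python
import math


def approx_percent_change(beta1, delta_x=1.0):
    """Rule-of-thumb reading of a log-linear slope: 100 * beta1 * delta_x."""
    return 100.0 * beta1 * delta_x


def exact_percent_change(beta1, delta_x=1.0):
    """Exact percentage change in Y implied by ln Y = b0 + b1*X, u fixed."""
    return 100.0 * (math.exp(beta1 * delta_x) - 1.0)


# Hypothetical slopes: a small one and a large one.
small_approx = approx_percent_change(0.05)
small_exact = exact_percent_change(0.05)
large_approx = approx_percent_change(0.5)
large_exact = exact_percent_change(0.5)
```

With a slope of 0.05 the two readings differ by about a tenth of a percentage point; with a slope of 0.5 the gap is roughly fifteen points, so the exact formula is the safer choice for large coefficients.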
❖ Quadratic model:
$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + u$
❖ Model with an interaction term:
$Y = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 XZ + u$
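In both models the effect of $X$ is no longer a single number. Differentiating the conditional mean, under the assumption that the error has zero conditional mean, gives:

```latex
\begin{aligned}
\text{Quadratic:}\quad
  & \frac{\partial E(Y \mid X)}{\partial X} = \beta_1 + 2\beta_2 X \\
\text{Interaction:}\quad
  & \frac{\partial E(Y \mid X, Z)}{\partial X} = \beta_1 + \beta_3 Z
\end{aligned}
```

So in the quadratic model the effect depends on the level of $X$ itself, and in the interaction model it depends on the level of $Z$; $\beta_1$ alone is the effect only at $X = 0$ or $Z = 0$, respectively.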
5 Model with dummy variable

❖ Binary variable 0-1


➢ D is a binary variable, for example:
D = 1 for FDI enterprises
D = 0 for other enterprises
➢ Model with dummy variable: $Y = \beta_0 + \beta_1 D + \beta_2 X + u$
For FDI enterprises: $Y = (\beta_0 + \beta_1) + \beta_2 X + u$
For other enterprises: $Y = \beta_0 + \beta_2 X + u$
❖ Dummy variable with m categories: $D_1, \dots, D_m$ ($D_1$ is the base)

$Y = \beta_0 + \beta_2 D_2 + \dots + \beta_m D_m + \beta X + u$

At $D_1$: $Y = \beta_0 + \beta X + u$
At $D_2$: $Y = (\beta_0 + \beta_2) + \beta X + u$
...
At $D_m$: $Y = (\beta_0 + \beta_m) + \beta X + u$
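Building the $m - 1$ dummies from a categorical column can be sketched as follows. The function name `make_dummies` and the ownership categories are hypothetical; the base category gets no column, so its effect is absorbed by the intercept.

```python
def make_dummies(values, base):
    """Return {category: 0/1 list} for every category except the base."""
    categories = sorted(set(values))
    if base not in categories:
        raise ValueError("base category does not occur in the data")
    return {
        c: [1 if v == c else 0 for v in values]
        for c in categories
        if c != base
    }


# Hypothetical ownership types with three categories; "state" is the base.
ownership = ["state", "FDI", "private", "FDI"]
dummies = make_dummies(ownership, base="state")
```

Including a column for the base category as well would make the dummies sum to the constant term, which is the perfect multicollinearity ruled out by AS4.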
6 Model Tests

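❖ T-test
$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_j X_j + \dots + \beta_m X_m + u$
o Hypotheses: $H_0: \beta_j = 0$ against $H_1: \beta_j \neq 0$
o $T$ statistic: $T_0 = \hat{\beta}_j / se(\hat{\beta}_j)$
o If the observed value satisfies $|T_0| > t_{\alpha/2}^{(n-k)}$ then $H_0$ is rejected ($k$ is the number of estimated coefficients, including the intercept)
❖ P-value
o If P-value < $\alpha$ then $H_0$ is rejected
o If P-value > $\alpha$ then there is no evidence to reject $H_0$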
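For the one-regressor case the test statistic can be computed by hand. The sketch below assumes homoscedastic errors, uses made-up data, and takes the critical value 3.182 (two-sided 5% level, 3 degrees of freedom) from a standard t table rather than computing it.

```python
import math


def slope_t_statistic(x, y):
    """Return (b1, se_b1, t0) for Y = b0 + b1*X + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)  # two coefficients estimated
    se_b1 = math.sqrt(sigma2_hat / s_xx)
    return b1, se_b1, b1 / se_b1


# Hypothetical sample, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b1, se_b1, t0 = slope_t_statistic(x, y)
t_critical = 3.182  # from a t table: alpha = 0.05, two-sided, df = n - 2 = 3
reject_h0 = abs(t0) > t_critical
```

Here the statistic falls below the critical value, so with this tiny sample there is no evidence to reject $H_0$ at the 5% level.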
7 Some defects in the regression model

❖ The model omits an available variable

o Problem: The obtained estimate is biased (when the omitted variable is correlated with the included ones)
o Solution: Add the omitted variable to the model
❖ The model omits an unobserved variable
o Problem: The obtained estimate is biased
o Solution: Use a proxy variable, the instrumental variable method, or a panel data model
❖ The model contains irrelevant variables
o Problem: The estimation results are less precise
o Solution: Remove the irrelevant variables from the model.
❖ The model has an incorrect functional form

o Problem: The obtained estimate is biased
o Test: Ramsey RESET test (estat ovtest)
o Solution: Add squared variables or use logarithmic transformations of variables.
❖ Heteroskedasticity
o Problem: The usual standard errors are invalid, so the t and F tests are unreliable
o Test: White test (estat imtest, white)
o Solution: Use robust standard errors (robust).
❖ Multicollinearity
PRACTICE ON STATA
Some defects

Omitted variable
❖ The omitted variable is available in the data:
regress Y X1 X2 … Xk X
test X
If the P-value is small (below the chosen significance level) then the variable X was missing => Add variable X to the model
If the P-value is large then the variable X does not explain the change in the mean value of the dependent variable => The variable X can be removed from the model.
❖ Unobserved omitted variable => Endogeneity => Biased estimate

▪ Find a proxy for the unobservable variable
▪ Instrumental variable (IV) method
▪ Panel data model
Incorrect functional form

regress Y X1 X2 … Xk X
estat ovtest
If the P-value is small then the functional form is wrong
=> Add squared variables or use the log form
Heteroskedasticity
regress Y X1 X2 … Xk X
estat imtest, white
If the P-value is small then there is heteroskedasticity
=> regress Y X1 X2 … Xk X, robust
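To show what a robust standard error changes, here is a minimal sketch for the one-regressor case using the heteroskedasticity-robust (White) variance with the $n/(n-k)$ small-sample scaling, often called HC1. The function name and data are assumptions for illustration; this is not a reproduction of Stata's internal code.

```python
import math


def robust_se_slope(x, y):
    """Return (b1, HC1 robust standard error of b1) for Y = b0 + b1*X + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Each squared residual is weighted by its own (x - x_bar)^2, so no
    # common error variance is assumed across observations.
    meat = sum(((xi - x_bar) ** 2) * (ei ** 2) for xi, ei in zip(x, residuals))
    var_hc0 = meat / s_xx ** 2
    var_hc1 = var_hc0 * n / (n - 2)  # k = 2 estimated coefficients
    return b1, math.sqrt(var_hc1)


# Hypothetical sample, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b1, se_robust = robust_se_slope(x, y)
```

The slope estimate is the same as under ordinary OLS; only the standard error, and therefore the t statistic and P-value, changes.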
Multicollinearity
vif
    Variable |       VIF       1/VIF
-------------+----------------------
          X1 |      1.09    0.918716
          X2 |      1.14    0.876805
          X3 |     12.37    0.030858
If VIF > 10 (or 1/VIF < 0.1) then there is multicollinearity
Check for multicollinearity before choosing the variables
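The VIF of a regressor $X_j$ is $1/(1 - R_j^2)$, where $R_j^2$ comes from regressing $X_j$ on the other regressors. With exactly two regressors, $R_j^2$ is the squared correlation between them, which the sketch below exploits; the function name and data are illustrative assumptions.

```python
def vif_two_regressors(x1, x2):
    """VIF shared by both regressors when the model has exactly two."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r_squared = s12 ** 2 / (s11 * s22)  # squared sample correlation
    return 1.0 / (1.0 - r_squared)


# Hypothetical regressors, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]

vif = vif_two_regressors(x1, x2)
tolerance = 1.0 / vif  # the "1/VIF" column
```

These two regressors are clearly correlated, yet the VIF stays well below the threshold of 10 used in the rule of thumb.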
PRACTICE
