CH - 3 - Simple and Multiple Linear Regressions in Stata

Application to Cross Sectional Econometrics in Stata

Mengistu Yismaw (MSc.)
Department of Economics, Debre Markos University (Burie Campus)
Email: menyis.2012@gmail.com

CHAPTER THREE: CROSS SECTIONAL ECONOMETRICS

Chapter outline

Simple linear regression
- Regression with only qualitative (dummy) regressors: ANOVA
  - Specification
  - Estimation
  - Interpretation

Multiple linear regression
- Regression with qualitative and quantitative regressors: ANCOVA
  - Specification
  - Estimation
  - Interpretation
  - Test of CLRM assumptions
  - Violations of some of the CLRM assumptions
  - Interaction effect

Qualitative response regression models
- Dummy as dependent variable (binary choice model): Linear Probability Model (LPM)
  - Specification
  - Estimation
  - Interpretation

Methodology of econometric analysis

What steps do econometricians follow in analyzing an economic problem? Broadly speaking, classical econometric methodology proceeds along the following lines:
1. Develop a statement of theory or hypothesis
2. Specify the mathematical model of the theory
3. Specify the statistical, or econometric, model
4. Obtain the data
5. Estimate the parameters of the econometric model
6. Test hypotheses
7. Forecast or predict
8. Use the model for control or policy purposes

2.1. Simple Linear Regression

Simple linear regression = a single regressor (independent variable).
A regression with only a qualitative (dummy) regressor is an ANOVA regression.

Step 1: Develop a statement of theory or hypothesis

Suppose we want to know whether there is a productivity difference between male- and female-headed households, i.e. Gender is our independent variable.
- Gender is a dummy (binary or nominal scale) variable.
- Nominal scale variable: a type of variable that gives qualitative information only.

$$\text{Gender} = \begin{cases} 1 & \text{if male} \\ 0 & \text{if female} \end{cases}$$

Note: the coding '0' or '1' above is used for identification purposes only.
- The values of a nominal scale variable therefore cannot be divided, subtracted or ordered for comparison.
- This type of variable is sometimes called a dummy variable.

Step 2: Specification of the mathematical model of the theory

$$\text{Yield} = \beta_0 + \beta_1\,\text{Gender}$$

Step 3: Specification of the statistical, or econometric, model

Let the simple linear regression model be:

$$\text{Yield} = \beta_0 + \beta_1\,\text{Gender} + \mu_i$$

Step 4: Obtaining the data

- The next step is to go to the field and collect the data.
- Then enter the data into the appropriate software and format.

Remember the ways of entering data into Stata:
i. Entering the data directly into Stata
ii. Entering the data into Excel and importing it into Stata
iii. Entering the data into SPSS and saving it in the appropriate Stata format, or using the Stat/Transfer software

- The next step is estimating the model.

Step 5: Estimation of the parameters of the econometric model

Ordinary Least Squares (OLS) estimation using Stata:

Statistics > Linear models and related > Linear regression > select the dependent and independent variables > click Submit > click OK

Syntax: reg depvar indepvar
Example: reg Yield Gender_n1

Some statistical manipulations

[Annotated Stata regression output. Only the following annotations are recoverable; one further reported value, 0.4610, has an illegible label.]
- Degrees of freedom: (1, 30 - 2) = (1, 28), from the sample size (30) and the number of estimated parameters (2)
- Estimate the residuals
- Estimate the fitted value: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1\,\text{Gender}$
- $se(\hat{\beta}_1) = 1.088$, $t = 5.08$
- CI for $\beta_1$: $\hat{\beta}_1 \pm t_{\alpha/2}\, se(\hat{\beta}_1)$, where $t_{\alpha/2}$ is the t value at (30 - 2 degrees of freedom, 5%/2) = 2.048
- CI for $\beta_1$: 5.5278 ± 2.048(1.088) ≈ (3.2986, 7.7568)

To estimate the RSS (residual sum of squares) and the ESS (model sum of squares), follow the steps shown on the slides (screenshots not reproduced in this text).

Interpretation of coefficients

What does the estimate 5.527 show?
- It is the coefficient for Male. It shows that the average productivity of male-headed households is higher than that of female-headed households by 5.527 (significant at 1%). Remember the t-test result in ch-2.

What about the estimate 3.5?
- It is the average productivity of the omitted category (female-headed households).

Why did we omit one category (Female)?
- To avoid falling into the dummy variable trap.

Average productivity of male-headed hhs = 3.5 + 5.527 × 1 = 9.027 (remember the t-test result in ch-2), or use prediction.
Average productivity of female-headed hhs = 3.5 + 5.527 × 0 = 3.5, or use prediction.
The productivity difference between male- and female-headed hhs = 9.027 - 3.5 = 5.527.

Exercise
- Is there a significant difference in average productivity between households with and without access to credit?
- What is the average productivity of households with access to credit?
- How much of the productivity variation is explained by access to credit?

2.2. Multiple Linear Regression

Multiple linear regression = many regressors (independent variables).
Suppose both dummy and continuous variables are used as independent variables: this is an ANCOVA regression.

Suppose you are going to analyze various determinants of maize productivity. Based on your literature review, you think that maize productivity can be affected by:
- Age of the household head
- Land fragmentation
- Fertilizer applied per hectare
- Household land size
- Gender of the household head

Then the multiple linear regression model will be:

$$\text{Yield} = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{fragment} + \beta_3\,\text{fertlizer} + \beta_4\,\text{land} + \beta_5\,\text{Gender} + \mu_i$$

Note: the estimation technique is the same as for the simple linear regression model.

Syntax: reg depvar indepvars
Example: reg Yield age fragment fertlizer land Gender_n1

- Based on the p-values, only two of the five explanatory variables (fragment and Gender) are significant.
  Note: only significant variables will be analyzed.
- Estimated equation of the regression model (partly illegible in the source): Yield = 5.5617 + 0.055 age … 0.676 fragment - 0.00… fertlizer …
- Then let us analyze the coefficients of the significant variables.
- However, before analyzing the result, it is important to judge the adequacy of the model using some diagnostic tests.
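The estimation and interpretation steps above can be sketched as a short do-file fragment. This is only an illustration under stated assumptions: the chapter's dataset is taken to be already loaded, Gender_n1 is taken to be coded 1 = male and 0 = female, and the new variable names Yield_hat and u_simple are hypothetical. It shows how the group means and the confidence interval worked out by hand can be recovered from Stata's stored results.

```stata
* Sketch only: assumes the chapter's dataset is already in memory and
* contains Yield, age, fragment, fertlizer, land and Gender_n1 (1 = male).

* Simple regression on the gender dummy (ANOVA-type model)
regress Yield Gender_n1

* Group means implied by the coefficients:
* _b[_cons] is the mean of the omitted (female) category,
* _b[_cons] + _b[Gender_n1] is the mean of the male category
display "Female mean: " _b[_cons]
display "Male mean:   " _b[_cons] + _b[Gender_n1]

* 95% confidence interval for the gender coefficient, built by hand
* from stored results (e(df_r) holds the residual degrees of freedom)
scalar tcrit = invttail(e(df_r), 0.025)
display "Lower bound: " _b[Gender_n1] - tcrit*_se[Gender_n1]
display "Upper bound: " _b[Gender_n1] + tcrit*_se[Gender_n1]

* Fitted values and residuals ("or use prediction" in the text);
* Yield_hat and u_simple are hypothetical variable names
predict Yield_hat, xb
predict u_simple, residuals

* Multiple regression with continuous and dummy regressors (ANCOVA-type model)
regress Yield age fragment fertlizer land Gender_n1
```

The hand-built interval should agree with the interval Stata prints in the regression table, which is a quick way to check the formula.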
- In particular, inferences based on OLS results are valid only if the classical linear regression model (CLRM) assumptions hold.
- Now let us test some of the CLRM assumptions using diagnostic tests.

i. Multicollinearity test

- The term multicollinearity means the existence of a perfect or exact linear relationship among all or some of the explanatory variables of the regression model.
- It can be detected using various techniques, such as auxiliary regressions, pair-wise correlations among regressors, and the variance inflation factor (VIF) and/or tolerance (1/VIF).
- VIF is the most commonly used. It measures how much the variance of an estimator is inflated by the presence of multicollinearity.

Note: multicollinearity is a matter of degree and not of kind. The question is not whether it is present or absent, but how severe it is (high or perfect).

Informal test: high R² but few significant t-ratios

Formal tests:
- Run auxiliary regressions
- Test pair-wise correlations among regressors. Decision: best if less than 0.50
- Test for variance inflation factor and tolerance. Decision: as a rule of thumb, if VIF > 10 or 1/VIF < 0.10 (close to zero), there is multicollinearity.

Since our result shows that the VIF of every variable is less than 10 and the 1/VIF of every variable is greater than 0.10, multicollinearity is not a problem in our model.

Note:
- Multicollinearity is not a problem for nonlinear relationships between variables.
- Multicollinearity is essentially a sample (regression) phenomenon, not a population one.

Remedial measures if there is a multicollinearity problem
- Drop one or more of the perfectly collinear variables
- Take the sample over a wider area (increase the sample size)
- Take new data
- Transform the variables (take the square, natural logarithm, ...)
- Combine cross-sectional and time series data
- Do nothing: "Multicollinearity is God's will". According to Blanchard, multicollinearity is essentially a data deficiency problem, not a problem with OLS or statistical technique in general.

ii. Test of homoscedasticity

- It is a test of the variance of the error (disturbance) term.
- If the error term does not have a constant variance, we say there is a heteroscedasticity problem.
- The nature of the variance of the error term can be judged by the Breusch-Pagan test.

Stata command: hettest

Then you get the test result (Stata output not reproduced in this text).

Decision: if the P-value is sufficiently small, i.e. below the chosen significance level (usually 10%), we reject the null hypothesis (H0) of homoscedasticity (constant variance) and accept the alternative hypothesis (H1).
- Since our result shows that the P-value is less than 10%, we have to reject H0.
- Then there is no constant variance (there is a heteroscedasticity problem) in our model.

Remedial measures for a heteroscedasticity problem
- Check for outliers (in the dependent variable)
- Use robust regression
  Example: reg Yield age fragment fertlizer land Gender_n1, robust

Note: hettest is not appropriate after robust regression.

iii. Model specification test

Model specification tests basically deal with:
- The exclusion of relevant explanatory variables
- The inclusion of irrelevant variables
- Functional form error

Take the Ramsey RESET test.

Syntax: ovtest

Decision: if the P-value is sufficiently small, that is, below the chosen significance level (usually 10%), we reject the null hypothesis (H0) that the model has no omitted variables and accept the alternative hypothesis (H1). If H0 is not rejected, this implies that there is no model specification problem.

iv. Normality of the disturbance term

There are various ways of testing the normality of ui.
For example:
- Histogram of the residuals with a normal curve
- Normal probability plot, and others

Test of normality of the disturbance term using Stata

- First, generate the disturbance term (ui).
  Syntax: predict ui, residual
- Second, test the normality of the disturbance term (ui).
  a. Draw a histogram of ui with a normal curve.
     Syntax: histogram ui, normal
     [Figure: histogram of ui with normal curve]
  b. Draw the normal probability and quantile plots.
     Syntax: pnorm ui or qnorm ui
     [Figures: normal probability plot and quantile plot of ui]
- Both graphs show that the disturbance term (ui) is almost normal.

Violations of Some of the CLRM Assumptions

The presence of multicollinearity

- We said that multicollinearity means the existence of a perfect or exact linear relationship among all or some of the explanatory variables of the regression model.
- Let us assume that the variable fertilizer is twice that of age.
- Then let us create a hypothetical variable called age3, which is a function of age.
  Syntax: gen age3=50+age

Note: we deliberately set the 10th observation of age3 to 95 instead of 75. Otherwise Stata will drop one of the perfectly correlated variables in the regression.

After running the regression with the new data, we get the VIF result (Stata output not reproduced in this text).

Categorical variables as a regressor

Suppose: educational level (EducLevel)

Syntax: reg depvar i.categoricalvar
Example: reg Yield fertlizer Gender_n1 i.EducLevel_n1

Note:
- When you put i. in front of a categorical variable, the software automatically drops one category (usually the lowest), which becomes your benchmark.
- If you do not put i. in front of a categorical variable, the software treats it as a continuous variable and your estimate will be wrong.

Answer the following questions based on the regression result (Stata output not reproduced in this text):
A. What does 4.612 show?
B. What is the average productivity difference between male- and female-headed hhs?
C. What is the difference in average productivity between hhs with illiterate heads and hhs with heads who completed secondary education?
D. What is the difference in average productivity between hhs with heads who completed secondary education and hhs with heads who completed post-secondary education?

Discussion question
- What is the average productivity of households managed by male heads who completed secondary education?

To find the average productivity of households managed by male heads who completed secondary education:
1st: generate an interaction variable of Gender and educational level.
2nd: run the regression using the newly generated variable:
reg Yield fertlizer i.IntGenEduc

The average productivity of households managed by male heads who completed secondary education is 3.82.

The linear probability model (LPM)

Suppose you intend to investigate the effect of gender and land size on access to credit.

Model:

$$\text{Credit\_Dummy\_n1} = \beta_0 + \beta_1\,\text{land} + \beta_2\,\text{Gender} + \mu_i$$

Since the dependent variable takes only the values 0 or 1, the fitted value of the model can be interpreted as the probability of observing a 1 (access to credit) given the explanatory variables.

Though the LPM is not entirely correct, we can use OLS to estimate it.

Interpretation of coefficients
A. Interpret the intercept
B. Interpret the coefficient of land
C. Interpret the coefficient of Gender

Answer
A. The probability of access to credit for female-managed HHs with no land is 0.168, or about 17%.
   Note: if the intercept term is negative, it is interpreted as zero (because a probability cannot be negative).
B. The coefficient of land shows that for a one-hectare increase in the HH's land size, the probability of access to credit decreases on average by 0.00069 (about 0.07 percentage points), but it is not statistically significant.
   However, we can estimate the actual probability of access to credit for a particular HH land size.
   Example: suppose a male-managed HH with a land size of 5 hectares:
   $$E(Y \mid \text{land} = 5, \text{Gender} = 1) = 0.168 - 0.00067 \times 5 + 0.499 \times 1 = 0.664$$
   Or use prediction.
C. The coefficient of Gender shows that the probability of access to credit for male-managed HHs is greater than that for female-managed HHs by about 50 percentage points on average.

Importing Stata results to Microsoft Word

1. Using asdoc

Syntax: add asdoc before Stata commands, except for figure commands.
Examples:
asdoc sum
asdoc reg Yield age fragment fertlizer land Gender_n1 EducLevel_n1 Credit_Dummy_n1

- The software automatically saves your result in the Microsoft Word file "Myfile.doc" in your current working directory.
- Click on "Myfile.doc" in the Stata results window to open the document.

2. Using outreg2

It is used for regression results.

Syntax (note: run both commands together):
Example:
reg Yield age fragment fertlizer land Gender_n1 EducLevel_n1 Credit_Dummy_n1
outreg2 using Table1.doc, replace

- The software automatically saves your result in the Microsoft Word file "Table1" in your current working directory.
- Click on "Table1" in the Stata results window to open the document.

Note: outreg2 is usually used for publication purposes. For your senior essay, please use the asdoc option.
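The diagnostic tests, the LPM prediction and the export steps covered in this chapter can be collected in one do-file sketch. It is an illustration, not the author's own do-file: it assumes the chapter's dataset is in memory with the variable names used in the examples, and it uses the estat forms (estat vif, estat hettest, estat ovtest), which are the current equivalents of the shorter command names shown in the text. asdoc and outreg2 are user-written commands and are assumed to be installed already.

```stata
* Sketch only: assumes the chapter's dataset is in memory with the variables
* Yield, age, fragment, fertlizer, land, Gender_n1 and Credit_Dummy_n1.
* asdoc and outreg2 are user-written; install once with
*   ssc install asdoc
*   ssc install outreg2

* --- Multiple regression and diagnostic tests ---
regress Yield age fragment fertlizer land Gender_n1
estat vif              // VIF and 1/VIF for each regressor
estat hettest          // Breusch-Pagan test, H0: constant variance
estat ovtest           // Ramsey RESET test, H0: no omitted variables

* --- Normality of the disturbance term ---
predict ui, residuals  // residuals from the regression above
histogram ui, normal   // histogram with normal curve
pnorm ui               // standardized normal probability plot
qnorm ui               // quantile-normal plot

* --- Remedy for heteroscedasticity: robust standard errors ---
regress Yield age fragment fertlizer land Gender_n1, vce(robust)

* --- Linear probability model ---
regress Credit_Dummy_n1 land Gender_n1
* Predicted probability for a male-managed HH with 5 hectares of land
margins, at(land=5 Gender_n1=1)

* --- Export results to Word ---
asdoc regress Yield age fragment fertlizer land Gender_n1
regress Yield age fragment fertlizer land Gender_n1
outreg2 using Table1.doc, replace
```

Running the tests immediately after the non-robust regression matters, because estat hettest is not appropriate after a regression with robust standard errors. The margins line is one way to do what the text calls "use prediction" for the LPM example.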
