This document provides an overview of the classical linear regression model (CLRM) in its simple (one-regressor) form. It defines regression as an attempt to explain movements in a dependent variable (y) using one or more independent variables (x). Regression differs from correlation in that it treats y and x asymmetrically: y is assumed to be random, while x is assumed to be fixed in repeated samples. Simple regression models the relationship between y and a single x variable through an equation of the form yt = α + βxt + ut, where ut is a random disturbance term. The parameters α and β are estimated by procedures such as ordinary least squares (OLS).

3 - 4. CLRM: An Overview

Bhanu Pratap Singh

GIM
Learning Objectives

- Regression Model.
- Regression vs Correlation.
- Simple Regression.
- Terminology.
- Assumptions.
- Properties of OLS Estimators.
What is a Regression Model

- Concerned with the relationship between a given variable and one or more other variables.
- An attempt to explain movements in a variable by one or more other variables.
- Names of y and the xs in regression models: y is variously called the dependent variable, the regressand or the explained variable, while the xs are called the independent variables, the regressors or the explanatory variables.
Regression vs Correlation

- Correlation.
  - The degree of linear association between two variables.
  - If y and x are correlated, it means that y and x are being treated in a completely symmetrical way.
  - It is not implied that changes in x cause changes in y, or that changes in y cause changes in x.
- Regression.
  - The dependent variable (y) and the independent variable(s) (the xs) are treated very differently.
  - The y variable is assumed to be random or ‘stochastic’ in some way, i.e., to have a probability distribution.
  - The x variables are, however, assumed to have fixed (‘non-stochastic’) values in repeated samples.
- (A small numerical sketch of this distinction follows below.)
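The asymmetry can be made concrete with a small numerical sketch (the data and the use of Python/NumPy are illustrative assumptions, not part of the slides): the correlation coefficient is identical whichever way round the variables are taken, whereas the slope from regressing y on x is not the reciprocal of the slope from regressing x on y unless the correlation is perfect.

    import numpy as np

    # Made-up data for illustration only
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

    # Correlation is symmetric: corr(x, y) equals corr(y, x)
    print(np.corrcoef(x, y)[0, 1])   # 0.8
    print(np.corrcoef(y, x)[0, 1])   # 0.8 again

    # Regression is asymmetric: regressing y on x and x on y are different exercises
    slope_y_on_x = np.polyfit(x, y, 1)[0]     # slope from regressing y on x
    slope_x_on_y = np.polyfit(y, x, 1)[0]     # slope from regressing x on y
    print(slope_y_on_x, 1.0 / slope_x_on_y)   # 0.8 versus 1.25 for these data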
Simple Regression

- Suppose that y depends on only one x variable.
- Examples in finance:
  - How asset returns vary with their level of market risk.
  - Measuring the long-term relationship between stock prices and dividends.
  - Constructing an optimal hedge ratio.
Simple Regression

- Suppose that a researcher has some idea regarding a relationship between two variables y and x.
- A sensible first stage would be to form a scatter plot of them.
- It appears that there is an approximate positive linear relationship.
- This can be described approximately by a straight line.
Simple Regression

- Big question: to what extent can this relationship be described by an equation, and can it be estimated using a defined procedure?
- It is possible to use the general equation for a straight line:
- y = α + βx
- However, this relationship is an exact one.
- A more realistic model adds a random disturbance term, denoted by u:
- yt = α + βxt + ut
- where t denotes the observation number.
Simple Regression

Reasons for the inclusion of the disturbance term

- Omitted variables that affect y.
- Measurement error in y.
- Random outside influences on y that cannot be modelled, e.g. a terrorist attack.
Simple Regression

- There are various estimation procedures to determine appropriate values of α and β:
  - Ordinary Least Squares.
  - Method of Moments.
  - Maximum Likelihood.
Simple Regression

OLS: the most common method used to fit a line to the data

- Suppose that the sample of data contains only five observations.
- The method entails taking each vertical distance from a data point to the fitted line, squaring it and then minimising the total sum of squares.
Simple Regression

OLS: the most common method used to fit a line to the data

- Let ût denote the residual: the difference between the actual value of y and the value fitted by the model.
- ût = yt − ŷt
- Thus, OLS involves minimising the residual sum of squares (RSS).
- Minimise Σ ût², i.e. Σ (yt − ŷt)², where the sum runs over t = 1, …, 5 (a computational sketch follows below).
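As a minimal computational sketch of what is being minimised (the five observations below are made up, and Python/NumPy is just one convenient way to write it down), the residual sum of squares can be evaluated for any candidate intercept and slope; OLS picks the pair that makes it smallest.

    import numpy as np

    # Hypothetical sample of five observations
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.8, 4.1, 5.9, 8.2, 9.9])

    def rss(alpha_hat, beta_hat):
        """Residual sum of squares for the candidate line yhat = alpha_hat + beta_hat * x."""
        residuals = y - (alpha_hat + beta_hat * x)
        return np.sum(residuals ** 2)

    # Evaluate the loss for two candidate lines; OLS chooses the parameters
    # that make this quantity as small as possible over all possible lines.
    print(rss(0.0, 2.0))
    print(rss(-0.11, 2.03))   # the OLS estimates for this made-up sample give the smallest RSS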
Simple Regression

OLS: the most common method used to fit a line to the data

- Equation of the fitted line: ŷt = α̂ + β̂xt
- Hence, the following function (also known as the loss function) is minimised:
- L = Σ (yt − ŷt)² = Σ (yt − α̂ − β̂xt)²
- The coefficient estimators for the slope and the intercept are given by
- β̂ = Σ (xt − x̄)(yt − ȳ) / Σ (xt − x̄)²
- α̂ = ȳ − β̂x̄
- Hence, given observations on xt and yt, it is always possible to calculate the values of the two parameters, α̂ and β̂, that best fit the set of data.
- This method of finding the optimum is known as OLS (a worked numerical sketch follows below).
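A short sketch of the closed-form estimators quoted above, applied to a made-up five-observation sample (NumPy is used only for convenience; the arithmetic could equally be done by hand):

    import numpy as np

    # Hypothetical data; in practice these would be the observed xt and yt
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.8, 4.1, 5.9, 8.2, 9.9])

    x_bar, y_bar = x.mean(), y.mean()

    # Slope: sum of cross-products of deviations over sum of squared deviations of x
    beta_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

    # Intercept: the fitted line passes through the point of means (x_bar, y_bar)
    alpha_hat = y_bar - beta_hat * x_bar

    print(alpha_hat, beta_hat)   # about -0.11 and 2.03 for this sample
    # np.polyfit(x, y, 1) gives the same slope and intercept, which is a useful check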
Simple Regression

OLS: the most common method used to fit a line to the data

- Example: some data on the excess returns on fund XXX and the excess returns on a market index.
- The fund manager has an intuition that the β on this fund is positive.
- We want to determine the relationship between x and y given these data.
Simple Regression

OLS: the most common method used to fit a line to the data

- Example: scatter plot of the data.
- There appears to be a positive, approximately linear relationship between x and y.
- Using the OLS formulas for the slope and intercept: α̂ = −1.74 and β̂ = 1.64.
- Fitted line: ŷt = α̂ + β̂xt ⇒ ŷt = −1.74 + 1.64xt
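- For instance (a purely illustrative figure, not taken from the slides): if the excess return on the market index in some period were 20%, the fitted line would predict an excess return on fund XXX of ŷ = −1.74 + 1.64 × 20 = 31.06%.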
Terminology

- PRF (population regression function)
  - Represents the true relationship between the variables.
  - Also known as the data generating process (DGP).
  - yt = α + βxt + ut
  - Note the disturbance term: even if we had the entire population, it would still not be possible to obtain a perfect line.
- SRF (sample regression function)
  - The relationship estimated from a sample (a small simulation below illustrates the PRF/SRF distinction).
  - ŷt = α̂ + β̂xt
  - yt = α̂ + β̂xt + ût
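The PRF/SRF distinction can be illustrated with a small simulation (all numbers below are invented for the sketch): data are generated from a known population regression function, and the sample regression function estimated by OLS recovers values close to, but not exactly equal to, the true α and β.

    import numpy as np

    rng = np.random.default_rng(42)

    # Population regression function (PRF / DGP): true parameters, chosen for illustration
    alpha, beta = 2.0, 0.5
    T = 100

    x = rng.uniform(0.0, 10.0, size=T)   # regressor values
    u = rng.normal(0.0, 1.0, size=T)     # random disturbances
    y = alpha + beta * x + u             # yt = alpha + beta*xt + ut

    # Sample regression function (SRF): estimates from this particular sample
    x_bar, y_bar = x.mean(), y.mean()
    beta_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    alpha_hat = y_bar - beta_hat * x_bar

    print(alpha_hat, beta_hat)   # close to (2.0, 0.5) but not identical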
Assumptions

- The classical linear regression model rests on assumptions about the disturbance terms:
  1. E(ut) = 0: the errors have zero mean.
  2. var(ut) = σ² < ∞: the variance of the errors is constant and finite (homoscedasticity).
  3. cov(ui, uj) = 0 for i ≠ j: the errors are uncorrelated with one another (no autocorrelation).
  4. cov(ut, xt) = 0: the errors are uncorrelated with the corresponding x values.
  5. ut ∼ N(0, σ²): the errors are normally distributed (needed for hypothesis testing, not for BLUE).
Properties of OLS Estimators

- If assumptions 1 - 4 hold, then the estimators determined by OLS are BLUE (Best Linear Unbiased Estimators).
- Consistency
  - The estimates will converge to their true values as the sample size increases to infinity.
  - With an inconsistent estimator, even with infinite data we could not be sure that the estimated value would be close to its true value.
- Unbiasedness
  - On average, the estimated values for the coefficients will be equal to their true values.
  - E(α̂) = α and E(β̂) = β.
- Efficiency
  - An estimator is said to be efficient if no other unbiased estimator has a smaller variance.
  - An efficient estimator has a probability distribution that is narrowly dispersed around the true value.
- (A Monte Carlo sketch of unbiasedness and consistency follows below.)
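A Monte Carlo sketch (a stylised experiment with invented parameter values, not part of the slides) of what unbiasedness and consistency mean in practice: across many simulated samples the OLS estimates of β average out to the true value, and their dispersion shrinks as the sample size grows.

    import numpy as np

    rng = np.random.default_rng(0)
    alpha, beta = 1.0, 2.0   # true parameter values, chosen for the experiment

    def ols_beta(T):
        """OLS slope estimate from one simulated sample of size T."""
        x = rng.uniform(0.0, 5.0, size=T)
        y = alpha + beta * x + rng.normal(0.0, 1.0, size=T)
        x_bar, y_bar = x.mean(), y.mean()
        return np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

    for T in (20, 200, 2000):
        estimates = np.array([ols_beta(T) for _ in range(1000)])
        # Unbiasedness: the mean of the estimates is close to the true beta of 2.0.
        # Consistency: the standard deviation of the estimates falls as T grows.
        print(T, estimates.mean(), estimates.std())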
