

OMGT 3223
Lecture 5: Regression Analysis
Lecture Outline

- Overview of the concept of regression analysis
- Least squares method
- Simple linear regression
  - Model development, parameter estimation
  - Simple linear regression using Excel
  - Interpretation of results
- Multiple Linear Regression
  - Overview of the concept of multivariate regression
  - Model development, parameter estimation
  - Multiple linear regression using Excel
  - Interpretation of results

Predictive Models

In this topic we turn our attention to regression models. Recall that quantitative models can be classified as descriptive,
prescriptive or predictive models. Regression models are predictive models.

Predictive models are similar to descriptive models but allow for the fact that the outputs cannot be predicted exactly
from the inputs. In a predictive model the exact functional relationship between the inputs and outputs cannot be fully
described with the inputs available.

Regression Analysis

Regression analysis examines the relationship between a dependent variable and one or more independent variables. Regression is a rich and multi-faceted topic (entire courses cover nothing but regression!) and is one of the most widely applied tools in statistics. Our objectives are to review simple and multiple regression and to learn how to perform regression analysis using Excel.

Linear regression is a mathematical technique that relates a dependent variable to one or more independent variables in
the form of a linear equation.
1. Simple linear regression (aka Bivariate regression) generates a linear equation that best fits the observed data to
a single independent or predictor variable.
2. Multiple linear regression generates a linear equation that best fits the observed data to multiple independent or
predictor variables.


Simple Linear Regression

Basic Form of the Simple Linear Regression Equation:


In the case of simple linear regression, we start with a series of n observed pairs of data (xi, yi). We want to fit a linear
equation to the data.
ŷi = b0 + b1xi
Recall that a line is specified by two parameters:
1. The intercept (b0): the predicted value of y when x = 0.
2. The slope (b1): the change in the predicted value of y for a one-unit increase in x.

Determining the “Best” Fit Line:


Consider the following set of points. Assume we want to fit a linear approximation to the data.

There are several different lines that provide a “good” approximation to the data. Therefore, we need some criteria to
determine the “best” fit line. Different criteria are used in different applications, all of which are based on minimizing
some “error” function. Errors, also known as residuals, can be measured for each y data point.


Fitting the Regression Equation:


In order to generate the regression equation, we need to find the coefficients b0 and b1 that will minimize the errors or
residuals. By far, the most common error function is the sum of the squared residuals.

SSE = Σ (yi − ŷi)², with the sum taken over i = 1, …, n

where yi is the observed (actual) value and ŷi is the corresponding predicted value.

The Ordinary Least Squares (OLS) Method:


The Ordinary Least Squares (OLS) method can be used to estimate the slope and intercept that minimize the sum of the
squared residuals. The OLS method is an “optimization” problem where the intercept (b0) and slope (b1) coefficients
represent the decision variables, while the sum of squared residuals (SSE) represents the objective function we seek to
minimize.
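As a sketch only (the course itself performs this step in Excel), the closed-form OLS solution to this optimization problem can be written out in Python, reusing the illustrative data from the SSE example above:

    # Standard closed-form OLS estimates for simple linear regression:
    #   b1 = sum((xi - x_bar) * (yi - y_bar)) / sum((xi - x_bar)**2)
    #   b0 = y_bar - b1 * x_bar
    x = [1, 2, 3, 4, 5]
    y = [2.1, 3.9, 6.2, 8.1, 9.8]

    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar

    print(b0, b1)        # estimated intercept and slope
    print(b0 + b1 * 6)   # predicted y_hat for a new x value of 6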

The Regression Equation:


Linear regression analysis generates a linear equation to predict y for specified values of x. In simple linear regression,
the model estimates an intercept (b0) and a slope (b1) for the regression line:

ŷi = b0 + b1xi

The regression line predicts the value of y for a given x. The predicted y value (ŷi) is the value calculated from the regression equation for the corresponding x value (xi). The actual y values will be scattered around that prediction.


Evaluating Simple Linear Regression Models

1. The Correlation Coefficient:


Correlation represents a measure of the strength of the relationship between the independent and the dependent
variable(s). Correlation is measured by the correlation coefficient.

Interpretation: The correlation coefficient, r, measures the strength and direction of a linear relationship.
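For reference (the formula itself is not shown in these notes), the sample correlation coefficient is computed as

r = Σ(xi − x̄)(yi − ȳ) / √[ Σ(xi − x̄)² · Σ(yi − ȳ)² ]

where x̄ and ȳ are the sample means of x and y. The value of r always lies between −1 and +1.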

Correlation Coefficient Examples:

(Example scatter plots illustrating r = 0.7, r = 0.3, r = 0, r = −0.7, r = −0.3, and r = 0.)

2. The Coefficient of Determination:


The coefficient of determination (denoted R²) represents a measurement of the variability of the data around the regression line. R² is a standard way to measure the "fit" of the regression model.

Interpretation: The coefficient of determination, R², measures the proportion of the total variation in y that is explained by the regression line, i.e., R² measures the percentage of variation in the dependent variable y resulting from changes in the independent variable x.
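For reference (not stated explicitly in these notes), R² can be computed as R² = 1 − SSE/SST, where SST = Σ(yi − ȳ)² is the total sum of squares. In simple linear regression, R² also equals the square of the correlation coefficient (R² = r²); for example, r = 0.8 corresponds to R² = 0.64, i.e., 64% of the variation in y is explained by the regression line.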


Coefficient of Determination Examples:

Simple Linear Regression in Excel

Two alternative approaches can be used to perform simple linear regression analysis in Excel.
1. The Data Analysis Add-In can be used to develop and evaluate a simple linear regression model.
2. Individual Excel functions can also be used to estimate and evaluate a simple linear regression model.
   - The =INTERCEPT and =SLOPE functions can be used to estimate a simple regression line.
   - The =CORREL and =RSQ functions can be used to evaluate a simple linear regression model.
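For instance, if the x values (tuition) were in cells A2:A11 and the y values (applications) in cells B2:B11 (these cell ranges are hypothetical and depend on how the workbook is laid out), the model could be estimated and evaluated with =INTERCEPT(B2:B11, A2:A11), =SLOPE(B2:B11, A2:A11), =CORREL(A2:A11, B2:B11), and =RSQ(B2:B11, A2:A11). Note that the dependent (y) range comes first in INTERCEPT, SLOPE, and RSQ.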

Simple Linear Regression Example:


The registrar at a university believes that the decline in the number of freshman applications experienced in recent years is directly related to tuition increases. The registrar has collected enrollment and tuition figures for the past decade.

Download the file Simple Linear Regression Example.xls and complete the following tasks:
1. Use the data to develop a simple linear regression model.
2. Forecast the number of applications for the university if tuition increases to $10,000 per year and if tuition is
lowered to $7,000 per year.
3. Evaluate the regression model.


Scatter plot of the data:

Simple Linear Regression Example Solution:

Simple Linear Regression Example with Data Analysis:


Data Analysis: Interpretation of Results

Regression Model:
The following equation predicts the average number of freshman applications for a given tuition cost.

# Applications =

Model Evaluation:
Tuition cost “explains” about 65% of the variation in the number of freshman applications in this sample.
The standard error of 408.4 is the estimated standard deviation of the number of applications about the regression line (i.e., the typical size of a residual).

ANOVA Table: The ANOVA table is the middle section of the regression output. ANOVA stands for ANalysis Of
VAriance. ANOVA studies the overall variation in the y variable and the data in the ANOVA table can help us perform
diagnostics on the model.

Model Significance: We can use the ANOVA table to determine whether our regression model is statistically significant. The Significance-F value is the p-value of the overall F-test: it tells us whether the model can make statistically significant predictions.

- A model with a Significance-F value under .05 is generally regarded as a statistically significant model.

Multiple Linear Regression

In simple (bivariate) regression models we limited ourselves to a single predictor or independent variable. However, in many cases we may have multiple potential predictor variables (e.g., when predicting house prices).

Simple regression can easily be extended to allow for multiple predictors. When we have more than one predictor we refer to the model as a multiple regression model. Multiple regression is a more powerful extension of simple linear regression.


Multiple Linear Regression Models:


Multiple regression relates a dependent variable to more than one independent variable (e.g., new residential construction is a function of several independent variables such as income growth, housing prices, etc.).

A multiple regression model has the following general form:

Ŷ = b0 + b1X1 + b2X2 + … + bkXk

The mean values of the dependent or response variable Y are estimated as a linear function of the multiple independent or
predictor X variables.
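As an illustration only (the course itself uses Excel's Data Analysis Add-In, and the numbers below are made up), a multiple regression fit can be sketched in Python using NumPy's least-squares solver:

    import numpy as np

    # Made-up data: k = 2 predictor columns (X1, X2) for n = 6 observations.
    X = np.array([[1.0, 0.5],
                  [2.0, 1.0],
                  [3.0, 1.5],
                  [4.0, 2.5],
                  [5.0, 2.0],
                  [6.0, 3.5]])
    y = np.array([3.1, 4.0, 6.8, 7.2, 9.5, 10.9])

    # Prepend a column of ones so the intercept b0 is estimated as well.
    X_design = np.column_stack([np.ones(len(y)), X])

    # Ordinary least squares: minimizes the sum of squared residuals.
    b, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    print(b)             # [b0, b1, b2]
    print(X_design @ b)  # fitted values y_hat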

Data Format:
The n observed values of the proposed predictor variables X1, X2, …, Xk are arranged as an n × k matrix, with the n corresponding values of the response variable Y in an additional column.

Multiple Linear Regression in Excel

The Data Analysis Add-In can be used to develop Multiple Linear Regression models.

Multiple Linear Regression Example:


The Dean of the College of Business has initiated a fund raising campaign. One of the selling points the Dean plans to use
with potential donors is that increasing the college’s private endowment will improve its ranking among business schools
as published in various news magazines. The Dean would like to demonstrate that there is a relationship between funding
and the rankings. She has collected the following data showing the private endowments ($ millions) and annual budgets ($
millions) from nine institutions, plus the ranking of each school.

Download the file Multiple Linear Regression Example.xls and complete the following tasks:
1. Use Excel to develop a multiple linear regression equation using ‘Private Endowment’ and ‘Annual Budget’ as
the independent or predictor variables.
2. Forecast a ranking for a private endowment of $70 million and an annual budget of $40 million.
3. Evaluate the regression model.


Scatter Plot of the Data:

Multiple Linear Regression with Data Analysis:

Data Analysis: Output Analysis


Regression Model:
The following equation predicts the ranking position for a given endowment figure and a given budget level.

Ranking Position = 124 − 1.83 × Endowment + 0.41 × Budget
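Using this equation for task 2 of the example, a private endowment of $70 million and an annual budget of $40 million give a forecast ranking of 124 − 1.83 × 70 + 0.41 × 40 = 124 − 128.1 + 16.4 ≈ 12.3, i.e., a ranking of roughly 12th place.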

Model Evaluation:
The model “explains” about 91.4% of the variation in the ranking positions in this sample.

The Adjusted R²: Adjusted R² is a measure similar to R². The Adjusted R² adjusts the R² down, adding a penalty for including more independent variables. We need the Adjusted R² because adding more predictor variables will always make R² increase (or at least stay the same!).

- Unlike R², the Adjusted R² will increase only if the additional predictor variables improve the model more than would be expected by chance. Adding more predictors may make the Adjusted R² decrease if the new variable is not a good predictor.
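For reference (the formula is not shown in these notes), the usual adjustment is Adjusted R² = 1 − (1 − R²) × (n − 1) / (n − k − 1), where n is the number of observations and k is the number of predictor variables; the penalty grows as more predictors are added.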

Model Significance:
We can determine whether a multiple regression model as a whole is statistically significant. As in simple regression, the Significance-F value tells us if the model can make statistically significant predictions.

- A model with a Significance-F value under .05 is generally regarded as a statistically significant model.

Variable Significance:
We can also determine how significant each predictor (i.e., each independent variable) is. A t-statistic and a p-value are calculated for each independent variable. The p-value for an independent variable is the probability of obtaining a coefficient estimate at least as far from zero as the one observed if that variable actually had no effect; a small p-value therefore suggests the variable's apparent effect is unlikely to be due to chance alone.

- A variable with a p-value under .05 is generally regarded as a statistically significant predictor.

