CH 14 PPLN
A Decision-Making Approach
6th Edition
Chapter 14
Multiple Regression
and Model Building
y = β0 + β1x1 + β2x2 + … + βkxk + ε
Estimated multiple regression model:
ŷ = b0 + b1x1 + b2x2 + … + bkxk
where ŷ is the estimated (or predicted) value of y, b0 is the estimated intercept, and b1, …, bk are the estimated slope coefficients
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-
Hall, Inc. Chap 14-4
Multiple Regression Model
Two variable model
[Figure: regression plane ŷ = b0 + b1x1 + b2x2 over the x1 and x2 axes, with the slope for variable x1 and the slope for variable x2 labeled]
[Figure: sample observation yi at (x1i, x2i), the fitted plane ŷ = b0 + b1x1 + b2x2, and the residual e = (y − ŷ) between them]
The best fit equation, ŷ, is found by minimizing the sum of squared errors, Σe²
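The least squares idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data values are made up, and the helper names `solve`, `fit`, and `sse` are hypothetical. It solves the normal equations (X'X)b = X'y for the two variable model, which is one standard way to minimize Σe².

```python
# Minimal sketch: fit y-hat = b0 + b1*x1 + b2*x2 by least squares.
# All data values and helper names are illustrative assumptions.

def solve(a, b):
    """Solve a linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(x1, x2, y):
    """Return [b0, b1, b2] from the normal equations (X'X) b = X'y."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)

def sse(coef, x1, x2, y):
    """Sum of squared errors e = (y - y-hat) for the given coefficients."""
    b0, b1, b2 = coef
    return sum((yi - (b0 + b1 * a + b2 * b)) ** 2 for a, b, yi in zip(x1, x2, y))

# Made-up sample: six observations of two x variables and y.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [5.1, 6.9, 5.2, 7.8, 7.1, 9.0]

coef = fit(x1, x2, y)
print("b0, b1, b2 =", [round(c, 4) for c in coef])
print("SSE =", round(sse(coef, x1, x2, y), 4))
```

Moving any coefficient away from the fitted value should not lower the SSE, which is a quick way to check the fit.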
Multiple Regression
Assumptions
Errors (residuals): e = (y − ŷ)
Excel:
Tools / Data Analysis... / Regression
PHStat:
PHStat / Regression / Multiple Regression…
ANOVA         df    SS           MS           F         Significance F
Regression     2    29460.027    14730.013    6.53861   0.01201
Residual      12    27033.306     2252.776
Total         14    56493.333
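The columns of the ANOVA table are tied together by simple arithmetic: each mean square is its sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. A short sketch, using the values from the table above (the variable names are illustrative):

```python
# Values copied from the ANOVA table above; names are illustrative.
ssr, sse = 29460.027, 27033.306      # regression and residual sums of squares
df_regression, df_residual = 2, 12   # k and n - k - 1

sst = ssr + sse                      # total sum of squares
msr = ssr / df_regression            # mean square regression
mse = sse / df_residual              # mean square error
f_stat = msr / mse                   # F test statistic

print("SST =", round(sst, 3))
print("MSR =", round(msr, 3))
print("MSE =", round(mse, 3))
print("F   =", round(f_stat, 5))
```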
Check the “confidence and prediction interval estimates” box
Input values
Predicted ŷ value
Confidence interval for the mean y value, given these x’s
Prediction interval for an individual y value, given these x’s
Multiple Coefficient of
Determination
Reports the proportion of total variation in y
explained by all x variables taken together
What is the net effect of adding a new variable?
We lose a degree of freedom when a new x variable is added
Did the new x variable add enough explanatory power to offset the loss of one degree of freedom?
Adjusted R²: R²A = 1 − (1 − R²) × (n − 1) / (n − k − 1)
(where n = sample size, k = number of independent variables)
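As a worked check of the two formulas, the sketch below computes R² as the explained share of total variation and then applies the adjustment. The sums of squares, n = 15, and k = 2 are taken from the regression output in this chapter; the variable names are illustrative.

```python
# Sums of squares from the chapter's ANOVA output; n and k from the same example.
ssr, sst = 29460.027, 56493.333
n, k = 15, 2

r2 = ssr / sst                                   # proportion of variation explained
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)    # penalizes the lost degrees of freedom

print("R Square          =", round(r2, 5))
print("Adjusted R Square =", round(adj_r2, 5))
```

The adjusted value is lower than R² because the factor (n − 1) / (n − k − 1) is greater than 1 whenever k ≥ 1.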
Test Statistic:
t = (bi − 0) / sbi   (df = n − k − 1)
Are Individual Variables
Significant?
(continued)
Regression Statistics
Multiple R           0.72213
R Square             0.52148
Adjusted R Square    0.44172
Standard Error       47.46341
Observations         15

ANOVA         df    SS           MS           F         Significance F
Regression     2    29460.027    14730.013    6.53861   0.01201
Residual      12    27033.306     2252.776
Total         14    56493.333

t-value for Price is t = -2.306, with p-value .0398
t-value for Advertising is t = 2.855, with p-value .0145
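The individual t tests can also be read by comparing each |t| with a critical value. The sketch below assumes a two-tailed test at α = 0.05, which the output itself does not state; 2.1788 is the standard t-table critical value for 12 degrees of freedom. The names `T_CRITICAL_DF12` and `is_significant` are illustrative.

```python
# t-values and sample details copied from the regression output above.
t_values = {"Price": -2.306, "Advertising": 2.855}
n, k = 15, 2
df = n - k - 1  # degrees of freedom for each t test

# Assumption: two-tailed test at alpha = 0.05.
# 2.1788 is the t-table critical value for 12 degrees of freedom.
T_CRITICAL_DF12 = 2.1788

def is_significant(t, critical):
    """Reject H0: beta_i = 0 when |t| exceeds the critical value."""
    return abs(t) > critical

for name, t in t_values.items():
    verdict = "significant" if is_significant(t, T_CRITICAL_DF12) else "not significant"
    print(name, "t =", t, "->", verdict)
```

Both conclusions agree with the reported p-values, which are below .05.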
sε = √(SSE / (n − k − 1)) = √MSE
Is this value large or small? It must be compared with the mean of y to judge
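A quick check of the standard error formula, using the residual sum of squares and sample details from this chapter's regression output (variable names are illustrative):

```python
import math

# SSE, n, and k from the chapter's regression output.
sse = 27033.306
n, k = 15, 2

mse = sse / (n - k - 1)        # mean square error
std_error = math.sqrt(mse)     # standard error of the estimate

print("Standard Error =", round(std_error, 5))
```

The result matches the Standard Error line of the Excel output.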
VIFj = 1 / (1 − R²j)
R²j is the coefficient of determination when the jth
independent variable is regressed against the
remaining k – 1 independent variables
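A minimal sketch of the VIF calculation for the simplest case, k = 2, where each x variable is regressed on the single other x variable, so R²j is the squared correlation between the two. The data values and the names `r_squared_simple` and `vif` are assumptions for illustration; with more than two x variables, R²j would come from a multiple regression instead.

```python
def r_squared_simple(x, z):
    """R^2 from regressing x on z with an intercept (one predictor)."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    sxx = sum((a - mean_x) ** 2 for a in x)
    szz = sum((b - mean_z) ** 2 for b in z)
    return sxz ** 2 / (sxx * szz)

def vif(r2_j):
    """Variance inflation factor for a given R^2_j."""
    return 1.0 / (1.0 - r2_j)

# Made-up values for two x variables (illustrative only).
price = [5.5, 7.5, 8.0, 8.0, 6.8, 7.5, 4.5, 6.4]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7]

r2_j = r_squared_simple(price, advertising)
print("R^2_j =", round(r2_j, 4))
print("VIF   =", round(vif(r2_j), 4))
```

A VIF of 1 means the variable is uncorrelated with the other x variable; the value grows without bound as R²j approaches 1.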
Let:
y = pie sales
x1 = price
x2 = holiday (x2 = 1 if a holiday occurred during the week; x2 = 0 if there was no holiday that week)
ŷ = b0 + b1x1 + b2x2
[Figure: sales (y) against price (x1) as two parallel lines with different intercepts and the same slope: intercept b0 + b2 for Holiday, intercept b0 for No Holiday]
If H0: β2 = 0 is rejected, then “Holiday” has a significant effect on pie sales
Interpretation of the Dummy
Variable Coefficient (with 2
Levels)
Example: Sales = 300 − 30(Price) + 15(Holiday)
Sales: number of pies sold per week
Price: pie price in $
Holiday: 1 if a holiday occurred during the week; 0 if no holiday occurred
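To see what the dummy coefficient does, the example equation can be evaluated at the same price with and without a holiday. The price of $5 is an arbitrary value chosen for illustration, and `predicted_sales` is a hypothetical helper name.

```python
def predicted_sales(price, holiday):
    """Sales = 300 - 30(Price) + 15(Holiday), with holiday coded 0 or 1."""
    return 300 - 30 * price + 15 * holiday

price = 5  # arbitrary illustrative price in $
no_holiday = predicted_sales(price, 0)
with_holiday = predicted_sales(price, 1)

print("No holiday:", no_holiday)                  # 150
print("Holiday:   ", with_holiday)                # 165
print("Difference:", with_holiday - no_holiday)   # 15
```

At any given price the difference is the dummy coefficient, 15 pies per week: the intercept shifts while the price slope stays the same.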
ŷ = b0 + b1x1 + b2x2 + b3x3
b2 shows the impact on price if the house is a
ranch style, compared to a condo
b3 shows the impact on price if the house is a
split level style, compared to a condo
Interpreting the Dummy
Variable Coefficients (with 3
Levels)
Suppose the estimated equation is
ŷ = 20.43 + 0.045x1 + 23.53x2 + 18.84x3
For a condo: x2 = x3 = 0
ŷ = 20.43 + 0.045x1
For a ranch: x3 = 0
ŷ = 20.43 + 0.045x1 + 23.53
With the same square feet, a ranch will have an estimated average price of 23.53 thousand dollars more than a condo.
For a split level: x2 = 0
ŷ = 20.43 + 0.045x1 + 18.84
With the same square feet, a split level will have an estimated average price of 18.84 thousand dollars more than a condo.
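The three-level case can be sketched the same way. The code assumes, following the interpretation above, that x1 is square feet, that price is in thousands of dollars, and that condo is the baseline level with x2 = 1 for a ranch and x3 = 1 for a split level. The function name and the 2000 square feet value are illustrative.

```python
def predicted_price(square_feet, style):
    """y-hat = 20.43 + 0.045*x1 + 23.53*x2 + 18.84*x3 (thousands of dollars)."""
    if style not in ("condo", "ranch", "split level"):
        raise ValueError("style must be 'condo', 'ranch', or 'split level'")
    x2 = 1 if style == "ranch" else 0        # dummy for ranch
    x3 = 1 if style == "split level" else 0  # dummy for split level
    return 20.43 + 0.045 * square_feet + 23.53 * x2 + 18.84 * x3

size = 2000  # arbitrary illustrative size in square feet
condo = predicted_price(size, "condo")
for style in ("ranch", "split level"):
    premium = predicted_price(size, style) - condo
    print(style, "premium over condo:", round(premium, 2))
```

Two dummy variables are enough for three levels because the baseline level is represented by both dummies being 0.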
Nonlinear Relationships
The relationship between the dependent
variable and an independent variable may not
be linear
Useful when scatter diagram indicates non-
linear relationship
Example: Quadratic model
y = β0 + β1xj + β2xj² + ε
[Figure: plots of y against x and of residuals against x]
[Figure: four quadratic curves against x1, one for each sign combination: β1 < 0, β2 > 0; β1 > 0, β2 > 0; β1 < 0, β2 < 0; β1 > 0, β2 < 0]
β1 = the coefficient of the linear term
β2 = the coefficient of the squared term
Testing for Significance:
Quadratic Model
Test for Overall Relationship
F test statistic = MSR / MSE
Testing the Quadratic Effect
Compare quadratic model
y = β0 + β1xj + β2xj² + ε
with the linear model
y = β0 + β1xj + ε
Hypotheses
H0: β2 = 0 (No 2nd order polynomial term)
HA: β2 ≠ 0 (2nd order polynomial term is needed)
Higher Order Models
[Figure: residual plots against x, one labeled Not Independent and one labeled Independent]
The Normality Assumption