Statistics Module 11

The document discusses multiple linear regression analysis. It provides examples of using multiple linear regression to predict travel time for deliveries based on miles traveled and deliveries.

Uploaded by

Gaile Yabut
Business Statistics: Module 11. Multiple Linear Regression

Module 11. Multiple Linear Regression

Regression analysis

• A parametric tool used to describe the linear relationship between the independent and dependent variables.

• Develops a model to predict the values of the dependent variable based on the values of the independent variables.

Multiple linear regression (MLR)

• A regression analysis that involves two or more independent variables and one dependent variable, in which the relationship among the variables is estimated by a linear equation.

• The MLR formula is as follows:

Y = a + b1X1 + b2X2 + b3X3 + ... or Y = b0 + b1X1 + b2X2 + b3X3 + ... (the formula continues according to the number of independent variables)

Where:
Y = dependent variable
X1, X2, X3 = independent variables
a or b0 = Y-intercept of the regression equation, or the value of Y when every X = 0; the constant
b1, b2, b3 = slopes, or the change in Y for every unit change in the corresponding X while the other independent variables are held constant; the partial regression coefficients of the independent variables

• In addition, the coefficient of correlation and the coefficient of determination are used in regression analysis to measure the strength of the relationship between the independent and dependent variables. These two measures were discussed in Module 9.
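The MLR formula can be sketched as a small Python function. This is only an illustration: the function name `predict` and the coefficient values are hypothetical and do not come from any data set in this module.

```python
def predict(intercept, slopes, xs):
    """Evaluate Y = a + b1*X1 + b2*X2 + ... for one observation."""
    if len(slopes) != len(xs):
        raise ValueError("need exactly one X value per slope")
    return intercept + sum(b * x for b, x in zip(slopes, xs))


# Hypothetical coefficients, chosen only to illustrate the formula.
a = 1.0
b = [2.0, 0.5]
y_hat = predict(a, b, [3.0, 4.0])  # 1.0 + 2.0*3.0 + 0.5*4.0
print(y_hat)
```

The same function works for any number of independent variables, as long as one slope is supplied for each X value.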

The difference between this module and the previous ones is that we will not compute the multiple linear regression manually. Instead, we will discuss the multiple linear regression output produced by statistical software, such as the Data Analysis tool of MS Excel.

Problem illustrations

1. A trucking company makes deliveries throughout the region, and some of its deliveries are delayed. To minimize or avoid delays, the manager wants to develop work schedules for the drivers. To do so, he wants to predict the drivers' daily travel time. He believes that the number of miles traveled and the number of deliveries are directly related to travel time. He gathered data from 10 randomly selected drivers who had deliveries assigned on the same day. The results are shown in the table below.

Perform a multiple regression analysis and determine whether there is a significant linear relationship between the independent variables (number of miles traveled and number of deliveries) and the dependent variable (travel time). Also, determine the extent of the relationship among the variables under study.

Driver   Number of miles traveled (X1)   Number of deliveries (X2)   Travel time in hours (Y)
A        100                             4                           9.3
B        50                              3                           4.8
C        100                             4                           8.9
D        100                             2                           6.5
E        50                              2                           4.2
F        80                              2                           6.2
G        75                              3                           7.4
H        65                              4                           6.0
I        90                              3                           7.0
J        90                              2                           6.1

Given these data, we will use the Regression tool of MS Excel Data Analysis. The results of the analysis are as follows:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.947248
R Square            0.897278
Adjusted R Square   0.867929
Standard Error      0.582776
Observations        10

ANOVA
             df   SS         MS         F         Significance F
Regression   2    20.76661   10.3833    30.5726   0.000347
Residual     7    2.377394   0.339628
Total        9    23.144

                                Coefficients   Standard Error   t Stat     P-value
Intercept                       -0.78387       0.967542         -0.81016   0.444512
Number of miles traveled (X1)   0.059413       0.010055         5.909002   0.000594
Number of deliveries (X2)       0.920966       0.22483          4.096277   0.004595
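For readers who want to check the coefficients without Excel, here is a minimal sketch in plain Python. It solves the least-squares normal equations directly, which is an assumption about how to reproduce the fit and not a description of what Excel does internally. The helper names `solve` and `fit_ols` are hypothetical.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(xs, y):
    """Return [b0, b1, ...] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1.0 carries the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Data from the trucking table (drivers A to J).
miles = [100, 50, 100, 100, 50, 80, 75, 65, 90, 90]
deliveries = [4, 3, 4, 2, 2, 2, 3, 4, 3, 2]
hours = [9.3, 4.8, 8.9, 6.5, 4.2, 6.2, 7.4, 6.0, 7.0, 6.1]

coefs = fit_ols(list(zip(miles, deliveries)), hours)
print([round(c, 5) for c in coefs])  # intercept, miles slope, deliveries slope
```

The three printed values should agree with the Coefficients column of the Excel output up to rounding.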

To test the null hypothesis of no significant linear relationship between the independent variables (number of miles traveled and number of deliveries) and travel time in hours, look at the Significance F value shown in the ANOVA table and compare it with the 0.05 level of significance.

If the Significance F value is less than the .05 level of significance, reject the null hypothesis.

If the Significance F value is greater than the .05 level of significance, do not reject the null hypothesis.

As shown in the ANOVA table, the Significance F value of 0.000347 is less than the .05 level of significance, so the null hypothesis is rejected. This means that at least one of the independent variables (number of miles traveled and number of deliveries) has a significant linear relationship with travel time in hours.
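The F value and Significance F in the ANOVA table can be recomputed from the sums of squares and degrees of freedom. The sketch below uses a closed form for the upper-tail F probability that holds only when the regression (numerator) degrees of freedom equal 2, as they do for a model with two independent variables. For other models a statistical package would be needed.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_regression, df_regression = 20.76661, 2
ss_residual, df_residual = 2.377394, 7

ms_regression = ss_regression / df_regression  # mean square = SS / df
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Assumption: this closed form is valid ONLY for 2 numerator degrees of freedom.
if df_regression != 2:
    raise ValueError("closed-form p-value requires 2 regression df")
significance_f = (1 + 2 * f_stat / df_residual) ** (-df_residual / 2)

alpha = 0.05
decision = "reject H0" if significance_f < alpha else "do not reject H0"
print(round(f_stat, 4), round(significance_f, 6), decision)
```

Comparing `significance_f` with `alpha` is the same decision rule described above.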

To determine which of the independent variables (number of miles traveled and number of deliveries) has a significant relationship with travel time in hours, look at the p-values shown in the coefficients table and compare each of them with the .05 level of significance.

If the p-value is less than the .05 level of significance, that particular independent variable has a significant linear relationship with the dependent variable.

Both the number of miles traveled (X1) and the number of deliveries (X2) have p-values (0.000594 and 0.004595, respectively) that are less than the .05 level of significance, which means both independent variables have significant linear relationships with travel time in hours. Because both coefficients are positive, travel time in hours increases as the number of miles traveled and the number of deliveries increase.
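Each t Stat in the output is the coefficient divided by its standard error, so it can be checked by hand. The sketch below recomputes the two t values and compares them with 2.365, the two-tailed critical t value for a .05 level of significance and 7 residual degrees of freedom taken from a standard t table. Comparing against a critical value in this way is an alternative to reading the p-value and leads to the same decision.

```python
# (coefficient, standard error) pairs copied from the Excel output.
terms = {
    "Number of miles traveled (X1)": (0.059413, 0.010055),
    "Number of deliveries (X2)": (0.920966, 0.22483),
}

# Two-tailed critical value, alpha = 0.05, 7 residual df (from a standard t table).
T_CRITICAL = 2.365

t_stats = {name: coef / se for name, (coef, se) in terms.items()}
for name, t in t_stats.items():
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{name}: t = {t:.3f} ({verdict})")
```

Small differences from the t Stat column come from the rounding of the printed coefficients and standard errors.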

To develop the multiple linear regression equation or model, use the coefficient values shown in the table.

Multiple linear regression equation or model: Y = -0.78387 + 0.059413X1 + 0.920966X2

This model means that travel time (Y) increases by 0.059413 hours for every additional mile traveled (X1) when the number of deliveries is held constant, and that travel time (Y) increases by 0.920966 hours for every additional delivery (X2) when the number of miles traveled is held constant. In addition, the estimated travel time (Y) is -0.78387 hours if both the number of miles traveled and the number of deliveries are zero (0). This value has no practical meaning, because zero miles and zero deliveries lie outside the range of the data.

Use the model to predict travel time in hours for given values of the number of miles traveled and the number of deliveries. For example:

What is the estimated travel time in hours if the number of miles traveled is 115 and the number of deliveries is 6?

Y = -0.78387 + 0.059413X1 + 0.920966X2
  = -0.78387 + 0.059413(115) + 0.920966(6)
  = -0.78387 + 6.832495 + 5.525796 = 11.57
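The arithmetic can be checked with a few lines of Python using the coefficients from the output. The range check at the end is an added caution and is not part of the original problem: it flags that 115 miles and 6 deliveries fall outside the values observed in the sample, so the estimate is an extrapolation.

```python
# Coefficients copied from the Excel output.
intercept = -0.78387
b_miles = 0.059413
b_deliveries = 0.920966

miles, deliveries = 115, 6
travel_time = intercept + b_miles * miles + b_deliveries * deliveries
print(round(travel_time, 2))  # estimated travel time in hours

# Smallest and largest values observed in the 10-driver sample.
observed_miles = (50, 100)
observed_deliveries = (2, 4)
extrapolating = not (
    observed_miles[0] <= miles <= observed_miles[1]
    and observed_deliveries[0] <= deliveries <= observed_deliveries[1]
)
print(extrapolating)
```

Evaluating the equation this way gives about 11.57 hours.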

To determine the extent of the linear relationship between the independent and dependent variables, refer to the Regression Statistics table, which shows the Multiple R and R Square values.

The number of miles traveled and the number of deliveries have a very strong positive linear relationship with travel time in hours, based on the Multiple R value of 0.9472. Likewise, the R Square value indicates that 89.73% of the variation in travel time in hours is explained by the number of miles traveled and the number of deliveries, and that the remaining 10.27% is due to unexplained factors or factors that were not included in the study.
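The entries in the Regression Statistics table follow from the ANOVA sums of squares. The sketch below recomputes them with the standard formulas: R Square = SS regression / SS total, Multiple R is its square root, Adjusted R Square = 1 - (1 - R Square)(n - 1)/(n - k - 1), and Standard Error is the square root of the residual mean square.

```python
import math

# Sums of squares copied from the ANOVA table.
ss_regression, ss_residual, ss_total = 20.76661, 2.377394, 23.144
n, k = 10, 2  # number of observations, number of independent variables

r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
standard_error = math.sqrt(ss_residual / (n - k - 1))

print(round(multiple_r, 6), round(r_square, 6),
      round(adjusted_r_square, 6), round(standard_error, 6))
```

The share of variation left unexplained is `1 - r_square`, which is the 10.27% mentioned in the interpretation.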

2. The president of a chain of fast-food restaurants randomly selected 10 franchisees and recorded, for each franchisee, last year's net profit, counter sales, and drive-through sales (all in million dollars). The gathered data are shown in the table below. Perform a multiple regression analysis and determine whether there is a significant linear relationship between the independent variables (counter sales and drive-through sales) and the dependent variable (net profit). Also, determine the extent of the relationship among the variables under study. What is the estimated net profit for counter sales of 9.1 and drive-through sales of 8.9 (in million dollars)?

Franchisee   Net profit (Y)   Counter sales (X1)   Drive-through sales (X2)
1            1.5              8.4                  7.5
2            0.8              3.3                  4.5
3            1.2              5.8                  8.4
4            1.4              10.0                 7.8
5            0.2              4.7                  2.4
6            0.8              7.7                  4.8
7            0.6              4.5                  2.5
8            1.3              8.6                  3.4
9            0.4              5.9                  2.0
10           0.6              6.3                  4.1
Regression Statistics
Multiple R          0.876635
R Square            0.76849
Adjusted R Square   0.702344
Standard Error      0.243719
Observations        10

ANOVA
             df   SS         MS         F          Significance F
Regression   2    1.380208   0.690104   11.61812   0.00597
Residual     7    0.415792   0.059399
Total        9    1.796

                           Coefficients   Standard Error   t Stat     P-value
Intercept                  -0.22098       0.266616         -0.82883   0.434549
Counter sales (X1)         0.086339       0.044043         1.96033    0.090775
Drive-through sales (X2)   0.113513       0.039179         2.897269   0.023076

The null hypothesis of no significant linear relationship between the independent variables (counter sales and drive-through sales) and the net profit of the franchisees is rejected because the Significance F value of 0.00597 is less than the .05 level of significance. This means that at least one of the independent variables has a significant linear relationship with the net profit of the franchisees.

In particular, drive-through sales has a significant positive linear relationship with the franchisees' net profit, as its p-value of 0.023076 is less than the .05 level of significance. Counter sales, whose p-value of 0.090775 is greater than the .05 level of significance, does not have a significant linear relationship with net profit.

Multiple linear regression equation or model: Y = -0.22098 + 0.086339X1 + 0.113513X2

The model suggests that the estimated net profit is -0.22098 million dollars if both counter sales and drive-through sales are zero. It also shows that net profit changes by 0.086339 million dollars for every one-million-dollar change in counter sales when drive-through sales is held constant, and by 0.113513 million dollars for every one-million-dollar change in drive-through sales when counter sales is held constant.

What is the estimated net profit for counter sales of 9.1 and drive-through sales of 8.9 (in million dollars)?

Y = -0.22098 + 0.086339X1 + 0.113513X2
  = -0.22098 + 0.086339(9.1) + 0.113513(8.9)
  = -0.22098 + 0.785685 + 1.010266 = 1.57
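As a quick check of the substitution, the same equation can be evaluated in Python with the coefficients from the output. The variable names are illustrative only.

```python
# Coefficients copied from the Excel output (all values in million dollars).
intercept = -0.22098
b_counter = 0.086339
b_drive_through = 0.113513

counter_sales, drive_through_sales = 9.1, 8.9
net_profit = (intercept
              + b_counter * counter_sales
              + b_drive_through * drive_through_sales)
print(round(net_profit, 2))  # estimated net profit in million dollars
```

The result is about 1.57 million dollars. Because counter sales was not individually significant at the .05 level, its term contributes to the estimate with more uncertainty than the drive-through term.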

Based on the Multiple R value of 0.876635, counter sales and drive-through sales have a high positive linear relationship with the net profit of the franchisees. Likewise, the R Square value indicates that 76.85% of the variation in net profit is explained by counter sales and drive-through sales, and that the remaining 23.15% is due to other factors that were not covered in this study.

End of Module Exercises

1. A consumer products company wants to measure the effectiveness of different types of advertising media in the promotion of its products. The company is interested in the effectiveness of radio advertising and newspaper advertising. A sample of 20 cities with roughly equal populations is selected for the study during a test period of one month. Each city is allocated a specific expenditure level for both radio advertising and newspaper advertising. The sales of the product and the levels of media expenses during the test period were recorded, and a multiple regression analysis was performed on the data. Interpret the results of the multiple regression analysis shown below. What will the sales of the product be if radio advertising costs $20,000 and newspaper advertising costs $15,000?

Regression Statistics
Multiple R          0.904754997
R Square            0.818581605
Adjusted R Square   0.797238265
Standard Error      138249.7504
Observations        20

ANOVA
             df   SS            MS         F          Significance F
Regression   2    1.46608E+12   7.33E+11   38.35302   4.99794E-07
Residual     17   3.24921E+11   1.91E+10
Total        19   1.791E+12

                        Coefficients   Standard Error   t Stat     P-value
Intercept               27499.14883    136293.3165      0.201764   0.842495
radio advertising       14.39166902    1.907186513      7.546021   8.01E-07
newspaper advertising   18.55405098    3.066064857      6.051422   1.3E-05

2. The owner of a chain of health spas has selected 10 of her smaller clubs for a test in which she varies the size of the newspaper ad and the amount of the initiation fee discount to see how this might affect the number of prospective members who visit each club during the following week. The results were recorded, and a multiple regression analysis was performed on the data. Interpret the results of the multiple regression analysis shown below. Determine the number of new visitors for an ad column size of 9 inches and a discount amount of $75.

Regression Statistics
Multiple R          0.799957448
R Square            0.639931918
Adjusted R Square   0.537055323
Standard Error      3.54273123
Observations        10

ANOVA
             df   SS            MS         F          Significance F
Regression   2    156.143388    78.07169   6.220384   0.028012134
Residual     7    87.85661197   12.55094
Total        9    244

                     Coefficients   Standard Error   t Stat     P-value
Intercept            11.49005792    4.013760848      2.862666   0.024245
ad column - inches   2.139333977    0.612334395      3.493735   0.010078
discount amount      0.031486486    0.04511417       0.697929   0.507735

3. The Conde Nast Traveler Gold List provides ratings for the top 20 small cruise ships. Each score represents the percentage of respondents who rated a ship as excellent or very good on several criteria, including shore excursions and food/dining. An overall score is also reported and used to rank the ships. The data were analyzed using multiple regression analysis, the output of which is shown below. Interpret the results. Predict the overall score for a cruise ship with a shore excursions score of 80 and a food/dining score of 90.

Regression Statistics
Multiple R          0.859345037
R Square            0.738473893
Adjusted R Square   0.707706116
Standard Error      1.376504389
Observations        20

ANOVA
             df   SS            MS         F          Significance F
Regression   2    90.95450632   45.47725   24.00154   1.11912E-05
Residual     17   32.21099368   1.894764
Total        19   123.1655

                   Coefficients   Standard Error   t Stat     P-value
Intercept          45.17795848    6.951847689      6.498698   5.45765E-06
shore excursions   0.252892474    0.041891237      6.036882   1.33357E-05
food/dining        0.248188929    0.061605727      4.028667   0.00087138

4. The Tire Rack, an online distributor of tires and wheels, conducts extensive testing to provide customers with products that are right for their vehicle, driving style, and driving conditions. In addition, The Tire Rack maintains an independent consumer survey to help drivers help each other by sharing their long-term tire experiences (The Tire Rack website, August 1, 2016). The survey uses a 1 to 10 rating scale, with 10 as the highest rating, for 18 high-performance all-season tires. The tread wear variable rates quickness of wear based on the driver's expectations; the dry traction variable rates the grip of a tire on a dry road; the steering variable rates the tire's steering responses; and the buy again variable rates the driver's desire to purchase the same tire again. These data were analyzed using multiple regression analysis, the results of which are shown in the table below. Interpret the results.

Regression Statistics
Multiple R          0.96956582
R Square            0.94005788
Adjusted R Square   0.926225083
Standard Error      0.57191668
Observations        17

ANOVA
             df   SS            MS         F             Significance F
Regression   3    66.68549411   22.2285    67.95862652   3.36066E-08
Residual     13   4.252152953   0.327089
Total        16   70.93764706

               Coefficients   Standard Error   t Stat     P-value
Intercept      -10.3397728    2.126333551      -4.86272   0.000310027
tread wear     1.200056035    0.146227731      8.206761   1.68927E-06
dry traction   0.621872989    0.504775274      1.23198    0.239775859
steering       0.325237913    0.424731891      0.765749   0.457504454

5. In testing 10 sedans, an automobile publication rated each on 13 different characteristics, including ride, handling, and driver comfort. Each vehicle also received an overall rating. The data were recorded, and a multiple regression analysis was performed on them. The table below shows the results of the analysis. Interpret the results. What is the estimated overall rating for a vehicle that scores 6 on ride, 9 on handling, and 7 on driver comfort?

Regression Statistics
Multiple R          0.818288937
R Square            0.669596785
Adjusted R Square   0.504395177
Standard Error      3.099875295
Observations        10

ANOVA
             df   SS            MS         F            Significance F
Regression   3    116.8446389   38.94821   4.05320986   0.068370032
Residual     6    57.65536105   9.609227
Total        9    174.5

                 Coefficients   Standard Error   t Stat     P-value
Intercept        46.74070022    15.53545447      3.008647   0.023741891
ride             3.463894967    1.157780465      2.991841   0.024262446
handling         3.915754923    1.30102027       3.009757   0.023707941
driver comfort   -1.90809628    1.294945434      -1.4735    0.191049531

References

Albright, S. et al. (2015). Business analytics: data analysis and decision making (5th
ed). Cengage Learning.
Anderson, D., Sweeney, D.J., et al. (2018). Modern business statistics. Australia:
Cengage Learning.
Antivola, H. (2015). Business statistics: a modular approach. Books Atbp. Publishing.
Anywhere Math. (2016). Introduction to Statistics.
https://www.youtube.com/watch?v=LMSyiAJm99g.
Berenson, M.L., Levine, D.M., & Krehbiel, T.C. (2015). Basic business statistics:
concepts and applications. Pearson Education South Asia Pte. Ltd.
Bowerman, B. (2017). Business statistics in practice: using modeling, data, and
analytics (8th ed.). McGraw-Hill Education.
Jaggia, S. (2019). Business statistics: communicating with numbers (3rd ed.). McGraw-
Hill Education.
Lee, N. (2016). Business statistics: using excel & SPSS. Sage.
Mukaka, M.M. (2012). A guide to appropriate use of correlation coefficient in medical
research. Malawi Medical Journal, v.24(3).
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3576830/
Simple Learning Pro. (2015). Mean, median, mode, range, and standard deviation.
https://www.youtube.com/watch?v=mk8tOD0t8M0.
Sharpe, N. (2015). Business statistics (3rd ed.). Pearson Education.
Weier, R.M. (2014). Introduction to business statistics, 7th edition. Cengage Learning
Asia Pte. Ltd.
Willoughby, D. (2015). An essential guide to business statistics. John Wiley & Sons.