IE266 S25 Week12

The document discusses Multiple Linear Regression (MLR), which involves models with multiple independent variables to capture complex relationships. It outlines the steps for MLR, including selecting variables, estimating coefficients, specifying error distribution, and assessing model utility through tests and ANOVA. The document emphasizes the importance of unbiased estimators and the relationship between independent variables and the dependent variable in regression analysis.

Multiple Linear Regression

• Most applications require more complex relations than first-order models.
• Models that include more than one independent variable, or terms such as x², x³, 1/x, are called Multiple Linear Regression (MLR) models.

Y = β₀ + β₁x₁ + ⋯ + βₖxₖ + ε

Each xⱼ is a regressor. The regression coefficient βⱼ is the expected change in Y per unit increase in xⱼ, when all other independent variables are held constant.

METU, IE266, S25 1


Steps of the MLR

1. Choose which independent variables to include
2. Collect sample data
3. Estimate β₀, β₁, …, βₖ
4. Specify the probability distribution of εᵢ and estimate its variance
5. Check the utility of the model (if necessary, go to Step 1)
6. If the model is adequate, estimate E[Y] or predict Y

First we will focus on Steps 3–6, then discuss Step 1.



Step 3- Estimate 𝜷𝟎 , 𝜷𝟏 , ⋯ , 𝜷𝒌

Regression model:

Y = β₀ + β₁x₁ + β₂x₂ + ⋯ + βₖxₖ + ε,  ε ~ N(0, σ²)

Yᵢ = β₀ + β₁xᵢ₁ + β₂xᵢ₂ + ⋯ + βₖxᵢₖ + εᵢ,  i = 1, 2, …, n,  εᵢ iid N(0, σ²)

Regression equation:

E[Y] = β₀ + β₁x₁ + β₂x₂ + ⋯ + βₖxₖ


Step 3- Estimate 𝜷𝟎 , 𝜷𝟏 , ⋯ , 𝜷𝒌

In vector notation:

y = [y₁, y₂, …, yₙ]ᵀ (n×1),  Y = [Y₁, Y₂, …, Yₙ]ᵀ (n×1),  ε = [ε₁, ε₂, …, εₙ]ᵀ (n×1),
β = [β₀, β₁, β₂, …, βₖ]ᵀ ((k+1)×1),

and x is the n×(k+1) matrix whose i-th row is [1  xᵢ₁  ⋯  xᵢₖ].

Regression model:

Y = xβ + ε

Regression equation:

E[Y] = xβ
Step 3- Estimate 𝜷𝟎 , 𝜷𝟏 , ⋯ , 𝜷𝒌

Least squares method

min L(β) = Σᵢ₌₁ⁿ (yᵢ − β₀ − β₁xᵢ₁ − β₂xᵢ₂ − ⋯ − βₖxᵢₖ)²

In vector notation:

L(β) = (y − xβ)ᵀ(y − xβ)
     = yᵀy − (xβ)ᵀy − yᵀxβ + (xβ)ᵀxβ
     = yᵀy − 2βᵀxᵀy + βᵀxᵀxβ

∂L/∂β = −2xᵀy + 2xᵀxβ = 0  ⇒  xᵀxβ = xᵀy

β̂ = (xᵀx)⁻¹xᵀy

Estimated regression equation: ŷ = xβ̂
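A minimal NumPy sketch of this computation, on a small made-up dataset (all numbers are purely illustrative):

```python
import numpy as np

# Made-up sample: n = 5 observations, k = 2 regressors.
X = np.column_stack([np.ones(5),                  # column of 1s for beta_0
                     [1.0, 2.0, 3.0, 4.0, 5.0],   # x_1
                     [2.0, 1.0, 4.0, 3.0, 5.0]])  # x_2
y = np.array([3.1, 4.9, 9.2, 10.8, 15.1])

# Normal equations x^T x beta = x^T y; solve() is numerically
# preferable to forming the inverse (x^T x)^{-1} explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Estimated regression equation: y_hat = x beta_hat
y_hat = X @ beta_hat
```

A quick sanity check on the solution: the least-squares residuals y − ŷ are orthogonal to every column of x, which is exactly the normal-equations condition.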
Step 3- Estimate 𝜷𝟎 , 𝜷𝟏 , ⋯ , 𝜷𝒌

Is β̂ a vector of unbiased estimators?

E[β̂] = E[(xᵀx)⁻¹xᵀY] = E[(xᵀx)⁻¹xᵀ(xβ + ε)]
     = E[(xᵀx)⁻¹(xᵀx)β + (xᵀx)⁻¹xᵀε] = β + (xᵀx)⁻¹xᵀE[ε]
     = β

cov(β̂) = σ²(xᵀx)⁻¹

var(β̂ⱼ) = σ²[(xᵀx)⁻¹]ⱼⱼ
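The unbiasedness claim can be checked numerically: averaging β̂ over many simulated samples should recover the true β. A small Monte Carlo sketch (the true coefficients, the design, and σ = 1 are all assumed values for the simulation):

```python
import numpy as np

# Monte Carlo check of E[beta_hat] = beta.
rng = np.random.default_rng(0)
beta = np.array([1.0, 2.0, -0.5])                             # assumed true beta
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])  # fixed design

estimates = []
for _ in range(2000):
    y = X @ beta + rng.normal(0.0, 1.0, size=20)   # fresh noise each replicate
    estimates.append(np.linalg.solve(X.T @ X, X.T @ y))

mean_est = np.mean(estimates, axis=0)   # should be close to the true beta
```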


Example (cont.) (Nb of employees vs. Sales revenue)

Find β̂₀ and β̂₁ using matrix operations.


Step 4 - Specify distribution of ε
Let i denote the i-th observation:

Yᵢ = β₀ + β₁xᵢ₁ + β₂xᵢ₂ + ⋯ + βₖxᵢₖ + εᵢ

Regression model assumptions:

1. E[εᵢ] = 0
2. εᵢ are iid ~ N(0, σ²)

Hence Yᵢ ~ N(β₀ + β₁xᵢ₁ + β₂xᵢ₂ + ⋯ + βₖxᵢₖ, σ²).


Step 4 - Specify distribution of ε
Unbiased estimator of σ²:

SSE = (y − xβ̂)ᵀ(y − xβ̂) = yᵀy − yᵀxβ̂ − (xβ̂)ᵀy + β̂ᵀxᵀxβ̂
    = yᵀy − yᵀxβ̂        (since xᵀxβ̂ = xᵀy)
    = yᵀy − yᵀŷ

MSE = SSE / (n − k − 1) = σ̂²
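Continuing in NumPy, σ̂² = MSE on a made-up dataset (values illustrative only):

```python
import numpy as np

# Made-up data: n = 6 observations, k = 2 regressors, so df = n - k - 1 = 3.
X = np.column_stack([np.ones(6),
                     [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                     [1.0, 4.0, 2.0, 6.0, 3.0, 5.0]])
y = np.array([2.0, 6.1, 4.8, 9.9, 7.2, 10.3])

n, p = X.shape                                  # p = k + 1
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat

SSE = y @ y - y @ y_hat                         # y^T y - y^T y_hat
MSE = SSE / (n - p)                             # unbiased estimate of sigma^2
```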


Step 5 - Is the model useful?

1. Tests on individual regression coefficients
2. Analysis of variance (ANOVA)


Step 5 - Tests on individual regression coefficients
• Point estimator: β̂ = (xᵀx)⁻¹xᵀY, a linear function of the Yᵢ's, so

β̂ⱼ ~ N(βⱼ, σ²[(xᵀx)⁻¹]ⱼⱼ)

Steps of HT:
0. Parameter: βⱼ
1. H₀: βⱼ = 0
   Hₐ: βⱼ ≠ 0
2. Test statistic: t₀ = (β̂ⱼ − 0) / √(MSE·[(xᵀx)⁻¹]ⱼⱼ) ~ t₍ₙ₋ₚ₎
5. Critical value: reject H₀ if |t₀| > t₍α/2, n−p₎

This tests the significance of βⱼ, given that the other variables are in the model.
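A sketch of the individual t-test on simulated data; the value 2.052 for t₍₀.₀₂₅, ₂₇₎ is hardcoded from a t-table to keep the example dependency-free:

```python
import numpy as np

# Simulated data: x_1 truly matters (beta_1 = 3), x_2 does not.
rng = np.random.default_rng(1)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = 1.0 + 3.0 * X[:, 1] + rng.normal(size=n)

p = k + 1
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
MSE = resid @ resid / (n - p)

se = np.sqrt(MSE * np.diag(XtX_inv))    # sqrt(MSE * [(x^T x)^{-1}]_jj)
t0 = beta_hat / se                      # (beta_hat_j - 0) / se_j

t_crit = 2.052                          # t_{0.025, n-p} = t_{0.025, 27} from a t-table
significant = np.abs(t0) > t_crit
```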
Step 5 - Tests on individual regression coefficients

H₀: β₁ = 0    H₀: β₂ = 0    …    H₀: βₖ = 0
Hₐ: β₁ ≠ 0    Hₐ: β₂ ≠ 0          Hₐ: βₖ ≠ 0

Suppose α = 0.05, k = 10, and all H₀ are true. Then

P(fail to reject all H₀ | all H₀ true) = 0.95¹⁰ ≅ 0.6,

so the family-wise Type I error probability is about 0.4.
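The 0.95¹⁰ ≅ 0.6 computation, as a two-line check (independence of the tests is an assumption):

```python
# With alpha = 0.05 per test and k = 10 true null hypotheses, the chance
# of at least one false rejection across the family is far above 0.05.
alpha, k = 0.05, 10
p_all_correct = (1 - alpha) ** k        # P(no test rejects) = 0.95^10
familywise_error = 1 - p_all_correct    # about 0.4
```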


Step 5 – Analysis of Variance

H₀: β₁ = β₂ = ⋯ = βₖ = 0
Hₐ: H₀ is not true

SST = SSR + SSE

yᵀy − nȳ² = (yᵀxβ̂ − nȳ²) + (yᵀy − yᵀxβ̂)

with degrees of freedom n−1 = k + (n−p).

Steps of HT:
1. H₀: β₁ = β₂ = ⋯ = βₖ = 0
   Hₐ: H₀ is not true
2. Test statistic: F₀ = MSR/MSE = (SSR/k) / (SSE/(n−p)) ~ F₍ₖ, ₙ₋ₚ₎
5. Critical value: reject H₀ if f₀ > f₍α, k, n−p₎

Note F₀ = (R²/k) / ((1−R²)/(n−p)), where R² = SSR/SST is the multiple coefficient of determination.
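A NumPy sketch of the ANOVA decomposition and F statistic on simulated data; the critical value 3.07 for f₍₀.₀₅, ₃, ₂₁₎ is taken from an F-table:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = 2.0 + 1.5 * X[:, 1] - 2.0 * X[:, 2] + rng.normal(size=n)  # x_3 is noise

p = k + 1
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

SST = y @ y - n * y.mean() ** 2         # y^T y - n * ybar^2
SSE = y @ y - y @ (X @ beta_hat)
SSR = SST - SSE

F0 = (SSR / k) / (SSE / (n - p))
R2 = SSR / SST

# Equivalent form of F0 in terms of R^2:
F0_alt = (R2 / k) / ((1 - R2) / (n - p))

reject = F0 > 3.07                      # f_{0.05, 3, 21} from an F-table
```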
Example. Home size vs other factors
A model is to be built to relate Home Size (Y) to several factors:
Income, Family Size and Education.





Step 5 - Is the model useful?

Note that as the number of variables increases, so does R² = SSR/SST. For example, with n = 3 observations, the model

yᵢ = β₀ + β₁xᵢ₁ + β₂xᵢ₂,  i = 1, 2, 3

fits the data exactly, regardless of whether the variables are useful. The adjusted R² penalizes model size:

R²_adj = 1 − (SSE/(n−p)) / (SST/(n−1))
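A tiny numeric illustration (the SSE and SST values are made up): adding a nearly useless regressor lowers SSE slightly, yet adjusted R² drops.

```python
def r2_adj(SSE, SST, n, p):
    """R^2_adj = 1 - (SSE / (n - p)) / (SST / (n - 1))."""
    return 1.0 - (SSE / (n - p)) / (SST / (n - 1))

# Hypothetical fits: the extra variable cuts SSE by only 0.5,
# so the degrees-of-freedom penalty dominates.
n, SST = 20, 100.0
two_regressors = r2_adj(SSE=40.0, SST=SST, n=n, p=3)    # p = k + 1 = 3
three_regressors = r2_adj(SSE=39.5, SST=SST, n=n, p=4)  # barely smaller SSE
```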


Step 5 - Is the model useful?
Example. Home size vs other factors



Step 6. Is the model adequate?
The adequacy of the model is checked through
1. Whether the assumptions on the error term are valid
2. Outliers and influential observations
3. Statistical significance of 𝛽0
4. Lack-of-fit test for repeated observations
5. Multicollinearity

If necessary, transformation is considered.



Step 6. 1. Assumptions on the error term
Assumption 1: E[εᵢ] = 0 — residuals should be randomly scattered around 0 (plot eᵢ vs. ŷᵢ).

Assumption 2: V[εᵢ] = σ² — εᵢ has the same variance for all i (plot eᵢ vs. ŷᵢ).

Assumption 3: ε ~ N(0, σ²) — check a histogram of the residuals and a normal probability plot.

Assumption 4: cov(εᵢ, εⱼ) = 0 — when data is time-sequenced, plot eᵢ against observation order.


Example (cont). Home size vs other factors



Step 6. 2. Outliers and influential observations

Same as in SLR



Step 6. 3. Statistical significance of 𝜷𝟎

Same as in SLR



Step 6. 4. Lack-of-fit test for repeated observations

ANOVA Lack-of-Fit analysis

Analysis of Variance

Source           DF    SS     MS      F
Regression       k     SSR    MSR     MSR/MSE
Residual Error   n-p   SSE    MSE
  Lack of Fit    m-p   SSLOF  MSLOF   MSLOF/MSPE
  Pure Error     n-m   SSPE   MSPE
Total            n-1   SST

m: number of "distinct values" for the x-vector
p = k + 1


Step 6. 5. Multicollinearity

Do any two independent variables add redundant information? Are they correlated?

Signs of multicollinearity:
1. t values of the variables are non-significant, while the F-test for H₀: β₁ = β₂ = 0 is highly significant.
2. β̂ⱼ values have signs opposite to what is expected.
3. Correlations visible in scatterplots.
4. The variance inflation factor for a βⱼ parameter is > 5 or 10:

VIFⱼ = 1 / (1 − Rⱼ²),

where Rⱼ² = SSRⱼ/SSTⱼ is the multiple coefficient of determination for the regression of xⱼ on the other regressors:

E[xⱼ] = α₀ + α₁x₁ + ⋯ + α₍ⱼ₋₁₎x₍ⱼ₋₁₎ + α₍ⱼ₊₁₎x₍ⱼ₊₁₎ + ⋯ + αₖxₖ
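A sketch of the VIF computation: regress each xⱼ on the remaining regressors and apply 1/(1 − Rⱼ²). The data are simulated so that x₃ nearly duplicates x₁:

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing x_j on the others."""
    n, k = X.shape
    out = []
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        xj = X[:, j]
        coef, *_ = np.linalg.lstsq(others, xj, rcond=None)
        resid = xj - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((xj - xj.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(3)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
x3 = x1 + rng.normal(scale=0.05, size=50)   # x3 is almost a copy of x1
vifs = vif(np.column_stack([x1, x2, x3]))   # vifs for x1 and x3 blow past 10
```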


Step 6. Estimation and Prediction

After validation is checked, the model can be used for estimation and prediction.

(1−α)100% CI for E[Y|x₀]:

x₀ = [1, x₀₁, x₀₂, …, x₀ₖ]ᵀ,  ŷ₀ = x₀ᵀβ̂

ŷ₀ ± t₍α/2, n−p₎ √(σ̂² x₀ᵀ(xᵀx)⁻¹x₀)

(1−α)100% PI for a future observation Y|x₀:

ŷ₀ ± t₍α/2, n−p₎ √(σ̂² (1 + x₀ᵀ(xᵀx)⁻¹x₀))
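A sketch of both intervals on simulated data; the t-table value 2.110 for t₍₀.₀₂₅, ₁₇₎ is hardcoded, and x₀ is an arbitrary illustrative point:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = 1.0 + 2.0 * X[:, 1] + X[:, 2] + rng.normal(size=n)

p = k + 1
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)            # MSE

x0 = np.array([1.0, 0.5, -0.5])                 # [1, x01, x02]
y0_hat = x0 @ beta_hat
h0 = x0 @ XtX_inv @ x0                          # x0^T (x^T x)^{-1} x0
t_val = 2.110                                   # t_{0.025, n-p} = t_{0.025, 17}

ci = (y0_hat - t_val * np.sqrt(sigma2_hat * h0),        # CI for E[Y|x0]
      y0_hat + t_val * np.sqrt(sigma2_hat * h0))
pi = (y0_hat - t_val * np.sqrt(sigma2_hat * (1 + h0)),  # PI for a new Y|x0
      y0_hat + t_val * np.sqrt(sigma2_hat * (1 + h0)))
```

The PI is always wider than the CI, since it carries the extra "1 +" term for the variance of a new observation.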


Remarks

• First use ANOVA (F-test). If the regression model is significant, use a limited number of individual t-tests.

• Then look at residuals and problems such as multicollinearity to check whether any transformations are necessary. Decide whether to exclude any outliers.

• Lack-of-fit can be applied also in the absence of replications, by data subsetting.




Step 1. Selection of Variables in MLR

Assume there are k candidate variables:

Y = β₀ + β₁x₁ + ⋯ + βₖxₖ + ε

the model with all variables included, possibly with transformed variables such as x₁ = 1/x, x₂ = ln x.

Methods for variable selection:
0. Scatter plot
1. All possible regressions (with/without directed t search)
2. Forward selection
3. Backward elimination
4. Stepwise regression

Performance measures for evaluating the models in general:
i. R²
ii. MSE ≡ R²_adj
iii. Cp
0. Scatter plot

E[Y] = β₀ + β₁x₁ + β₂x₁² + β₃(1/x₂)

Add: β₄x₁x₂ + β₅x₁²x₂


1. All possible regressions
1. Fit all the regression equations involving one candidate variable
2. … two candidate variables
3. …
k. … k candidate variables

Without directed t search:
Evaluate all 2ᵏ equations (e.g. if k = 10, 2ᵏ = 1024 equations) and use the performance measures to compare them.

With directed t search:
Construct the full model. Sort the variables in descending order of t. Add one variable at a time, until t < t̄ (≡ p > α_in). With those (remaining) candidate variables, conduct the "all possible regressions without directed t search" method.
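The brute-force enumeration is easy to sketch with itertools on simulated data (only the first candidate variable truly matters here):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n, k = 30, 3
Z = rng.normal(size=(n, k))                     # candidate regressors
y = 1.0 + 2.0 * Z[:, 0] + rng.normal(size=n)

# Fit every subset (2^k models, intercept always included) and record its SSE.
sse = {}
for r in range(k + 1):
    for subset in combinations(range(k), r):
        X = np.column_stack([np.ones(n)] + [Z[:, j] for j in subset])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        res = y - X @ beta
        sse[subset] = res @ res

n_models = len(sse)                             # 2^3 = 8
```

Adding variables can never increase SSE, which is exactly why raw R² alone cannot choose among these models.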
Performance measures for evaluating the models

(i) R²

Let R²_p denote the R² when the number of regressor variables plus the constant term is p, p ≤ k + 1:

R²_p = SSR(p) / SST

Plot R²_p against p and look for the kink where additional variables stop yielding meaningful gains.


(ii) MSE ≡ R²_adj

R²_adj(p) = 1 − MSE(p) / (SST/(n−1))


(iii) C_p

Does the suggested model with p − 1 regressor variables mimic the actual regression model well? Is the estimated regression equation an unbiased estimator of E[Y]?

If the "perfect" model has p − 1 regressor variables, then the estimator ŷ has zero bias and total variance pσ².

Let Γ_p = (1/σ²)(bias² + variance) = pσ²/σ² = p

C_p is an estimator of Γ_p:

C_p = (n − p)·MSE(p)/MSE(full) − (n − 2p)

If a model is good, then either C_p must be low, or, if C_p is slightly larger, it must be close to p. Note the full model has C_p = k + 1. (Equivalent to low MSE, high R²_adj.)
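A quick check of the full-model property C_p = k + 1 using the formula above (the SSE numbers are made up):

```python
def mallows_cp(sse_p, mse_full, n, p):
    """C_p = (n - p) * MSE(p) / MSE(full) - (n - 2p); note (n - p) * MSE(p) = SSE(p)."""
    return sse_p / mse_full - (n - 2 * p)

# Hypothetical full model: n = 20, k = 4 regressors, SSE(full) = 30.
n, k = 20, 4
sse_full = 30.0
mse_full = sse_full / (n - (k + 1))                  # MSE of the full model
cp_full = mallows_cp(sse_full, mse_full, n, k + 1)   # equals k + 1 by construction
```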
(iii) C_p

In Simple Linear Regression, p = 2:

Γ₂ = (1/σ²) Σᵢ₌₁ⁿ E[(ŷᵢ − E[yᵢ])²]
   = (1/σ²) Σᵢ₌₁ⁿ (bias² + variance)
   = (1/σ²) Σᵢ₌₁ⁿ ((E[ŷᵢ] − E[yᵢ])² + var(ŷᵢ))
   = (1/σ²) Σᵢ₌₁ⁿ (0 + σ²(1/n + (xᵢ − x̄)²/Sₓₓ))
   = 1 + Σᵢ₌₁ⁿ (xᵢ − x̄)²/Sₓₓ
⇒ Γ₂ = 2


Example.



Example. All possible regressions



2. Forward selection

Add candidate variables one-by-one in the order of highest t (≡ highest partial F ≡ lowest p), as long as the thresholds are met:

t > t_in ≡ F > F_in ≡ p < α_in.

Note α_in could be weaker than the significance level α.
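A forward-selection sketch driven by the partial F statistic (simulated data; F_in = 4.0 is an arbitrary threshold, roughly α_in ≈ 0.05):

```python
import numpy as np

def sse_of(cols, y):
    """SSE of a least-squares fit with intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

def forward_select(Z, y, F_in=4.0):
    n, k = Z.shape
    selected = []
    while len(selected) < k:
        sse_cur = sse_of([Z[:, j] for j in selected], y)
        best_j, best_F = None, F_in
        for j in range(k):
            if j in selected:
                continue
            sse_new = sse_of([Z[:, i] for i in selected + [j]], y)
            p_new = len(selected) + 2                          # params after adding j
            F = (sse_cur - sse_new) / (sse_new / (n - p_new))  # partial F, 1 df
            if F > best_F:
                best_j, best_F = j, F
        if best_j is None:
            break                       # no remaining candidate clears F_in
        selected.append(best_j)
    return selected

rng = np.random.default_rng(6)
Z = rng.normal(size=(60, 4))
y = 2.0 + 3.0 * Z[:, 1] - 2.0 * Z[:, 3] + rng.normal(size=60)
chosen = forward_select(Z, y)           # picks up variables 1 and 3
```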


Partial F statistic

Y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₄x₄ + β₅x₅ + ε

H₀: β₁ = β₂ = β₃ = 0 (given β₄, β₅)
Hₐ: H₀ not true

SSR(β₁, β₂, β₃ | β₀, β₄, β₅)
  = SSR(β₁, β₂, β₃, β₄, β₅ | β₀) − SSR(β₄, β₅ | β₀)
  = (SST − SSE(x₁, x₂, x₃, x₄, x₅)) − (SST − SSE(x₄, x₅))
  = SSE(x₄, x₅) − SSE(x₁, x₂, x₃, x₄, x₅)

F₀ = [SSR(β₁, β₂, β₃ | β₀, β₄, β₅)/3] / [SSE(x₁, …, x₅)/(n−5−1)] ~ F₍₃, n−5−1₎
Example.

Y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ε

H₀: β₁ = β₂ = 0 (given β₃)
Hₐ: H₀ not true

With SST = 68.405, n = 15, and SSR(·|β₀) values 60.76 (full model) and 49.54 (x₃ only):

F₀ = [(60.76 − 49.54)/2] / [(68.405 − 60.76)/11] = 5.61/0.695 = 8.07

8.07 > f₍₀.₀₅, ₂, ₁₁₎ = 3.98, so reject H₀.
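The example's arithmetic, reproduced directly from the slide's numbers:

```python
# Partial F test: H0: beta_1 = beta_2 = 0 given beta_3 (SSR/SST values from the slide).
SST, n = 68.405, 15
SSR_full, SSR_reduced = 60.76, 49.54    # SSR(.|beta_0) with and without x1, x2
r = 2                                   # number of coefficients under test
df_err = n - 3 - 1                      # full model: k = 3, so df = 11

SSE_full = SST - SSR_full               # 7.645
F0 = ((SSR_full - SSR_reduced) / r) / (SSE_full / df_err)   # about 8.07
```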


2. Forward selection. Example (cont). Soft drink delivery



2. Forward selection. Step 1



2. Forward selection. Step 2


2. Forward selection. Step 3

Nb of locations


2. Forward selection. “Nb of machines” is not eligible to enter



2. Forward selection. Example (cont). Soft drink delivery

Forward selection



3. Backward elimination

Begin with the full model (with k variables).

Check each variable's contribution using the t-test (or partial F-test, or p value).

If the variable with the highest p exceeds a threshold (α_out), remove that variable.

Rerun the analysis with the remaining variables.

Continue until no further variables can be deleted.


3. Backward elimination
Example (cont). Step 1: Full model (p=5)



3. Backward elimination
Example (cont). Step 2: p=4

None of the variables can be taken out.


4. Stepwise regression
• The most widely used variable selection method
• Combines the forward and backward methods
• α_in (F_in): threshold on the p value (F statistic value) to add a variable to the model
• α_out (F_out): threshold on the p value to eliminate a variable from the model
• α_in ≤ α_out. Experiment with different α_in and α_out.

Steps
(i) Consider the set of remaining candidate variables.
(ii) Are there any variables with p < α_in? If yes, add the variable with the lowest p; otherwise go to (iv).
(iii) Rerun the analysis. Are there any variables with p > α_out? If so, select the variable with the highest p and eliminate it from the model and from the set of candidate variables. Go to Step (i).
(iv) Stop.
4. Stepwise regression.
Example (cont). Soft drink delivery

Conduct the stepwise regression.





Remarks

1. The t-tests used in stepwise, backward, and forward selection are equivalent to a partial F-test with a single variable (entering or leaving).

2. Once the model is constructed, it must be checked for validity (assumptions on the error term, collinearity, outliers, etc.).

3. There may not be a single best model, but several good models. One may verify with alternative variable selection methods.

4. If k is not large, the all possible regressions method is recommended, since the risk of multicollinearity in the final model is minimized.
