OM Forecasting
➢ Forecast:
➢ An educated guess about the future, based on objective analysis to the extent possible.
➢ Despite the presence of inaccuracies, some forecast is better
than no forecast.
➢ An analytical forecast is better than an intuitive one.
➢ What can be forecast?
➢ Demand - Common
➢ Environmental - Social, Political, Economic
➢ Sales characteristics: RCTS
➢ Users:
➢ Long range – Facilities
➢ Medium range – Production Planning
➢ Short range – Product Forecast
Principles of Forecasting
➢ Forecast in two steps:
➢ First fit various models to past data; then project the best-fitting model into the future.
[Figure: Demand plotted against time, illustrating the basic demand patterns: level, linear trend, seasonal pattern, and random variation.]
Simple Moving Average
➢ An averaging period “𝑛” is selected.
F_{t+1} = (1/n) Σ_{i=t−n+1}^{t} X_i
Month | Demand | 2-Month MA | 3-Month MA | 5-Month MA | 10-Month MA
Jan | 100 | – | – | – | –
Feb | 110 | – | – | – | –
Mar | 120 | 105 | – | – | –
Apr | 130 | 115 | 110.00 | – | –
May | 125 | 125 | 120.00 | – | –
Jun | 115 | 127.5 | 125.00 | 117 | –
Jul | 105 | 120 | 123.33 | 120 | –
Aug | 95 | 110 | 115.00 | 119 | –
Sep | 85 | 100 | 105.00 | 114 | –
Oct | 87.5 | 90 | 95.00 | 105 | –
Nov | 97.5 | 86.25 | 89.17 | 97.5 | 107.25
Dec | 107.5 | 92.5 | 90.00 | 94 | 107
Jan | 117.5 | 102.5 | 97.50 | 94.5 | 106.75
Feb | 127.5 | 112.5 | 107.50 | 99 | 106.5
Mar | 137.5 | 122.5 | 117.50 | 107.5 | 106.25
Apr | 132.5 | 132.5 | 127.50 | 117.5 | 107.5
May | 122.5 | 135 | 132.50 | 124.5 | 109.25
Jun | 112.5 | 127.5 | 130.83 | 127.5 | 111
Jul | 102.5 | 117.5 | 122.50 | 126.5 | 112.75
Understanding Simple Moving Average
[Figure: Demand together with the 2-, 3-, 5- and 10-month moving averages plotted against time.]
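The moving-average computation above is straightforward to script. Below is a minimal Python sketch (the demand values are taken from the table above; the function name is ours) that reproduces the 3-month column:

```python
# Simple moving average: the forecast for the next period is the mean of the
# last n observations. Reproduces the 3-Month MA column of the table above.
demand = [100, 110, 120, 130, 125, 115, 105, 95, 85, 87.5,
          97.5, 107.5, 117.5, 127.5, 137.5, 132.5, 122.5, 112.5, 102.5]

def sma_forecasts(series, n):
    """Return one-step-ahead forecasts; entry t is the forecast for period t."""
    return [None] * n + [sum(series[t - n:t]) / n for t in range(n, len(series))]

print(sma_forecasts(demand, 3)[3])    # 110.0  -> forecast for April
print(sma_forecasts(demand, 3)[10])   # 89.17  -> forecast for November
```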
Weighted Moving Average
➢ Assigns weights to the observations in the averaging period, typically giving more weight to recent data: F_{t+1} = Σ_{i=1}^{n} w_i X_{t−i+1}, with Σ w_i = 1.
➢ Like the simple moving average, it lags behind a trend.
Exponential Smoothing
F_{t+1} = α·y_t + (1 − α)·F_t
Expanding recursively, i.e., repeatedly replacing forecast terms (F) with observed terms (y):
F_{t+1} = α·y_t + α(1 − α)·y_{t−1} + α(1 − α)²·y_{t−2} + ⋯ + α(1 − α)ⁿ·y_{t−n} + ⋯ + (1 − α)^{t−1}·F_2
We can initialize F_2 = y_1, i.e., the response in the 1st period is taken as the forecast for the 2nd period.
F_{10} = α·y_9 + (1 − α)·F_9 = 77 units
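A minimal Python sketch of simple exponential smoothing, assuming α = 0.1 (the value consistent with the forecast column of the error table below) and using the nine-period demand series from that table:

```python
# Simple exponential smoothing: F[t+1] = alpha*y[t] + (1-alpha)*F[t],
# initialized with F_2 = y_1 (first observation used as the first forecast).
def ses_forecast(y, alpha):
    F = y[0]                       # F_2 = y_1
    for obs in y[1:]:
        F = alpha * obs + (1 - alpha) * F
    return F                       # forecast for the period after the data ends

demand = [78, 65, 90, 71, 80, 101, 84, 60, 73]    # periods 24-32, table below
print(round(ses_forecast(demand, alpha=0.1), 1))  # ~77.8; the slides round to 77
```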
Use
➢ To set safety stocks or safety capacity
➢ To ensure a desired level of protection against stockouts
Measures of Forecast Error
Cumulative sum of Forecast Errors (CFE):
CFE = Σ_{t=1}^{n} e_t = Σ_{t=1}^{n} (y_t − F_t)
Root Mean Squared Error (RMSE):
RMSE = √( (1/n) Σ_{t=1}^{n} e_t² )
Measures of Forecast Error
7. Tracking Signal (𝑇𝑆):
TS_n = CFE_n / MAD_n = Σ_{t=1}^{n} e_t / ( (1/n) Σ_{t=1}^{n} |e_t| )
Period | Demand y_t | Forecast F_t | Error e_t | Abs. error | e_t/y_t | Abs. e_t/y_t | MAD_t | TS_t
24 | 78 | – | – | – | – | – | – | –
25 | 65 | 78 | −13 | 13 | −0.20 | 0.20 | 13.00 | −1.00
26 | 90 | 77 | 13 | 13 | 0.15 | 0.15 | 13.15 | 0.02
27 | 71 | 78 | −7 | 7 | −0.10 | 0.10 | 11.11 | −0.61
28 | 80 | 77 | 3 | 3 | 0.03 | 0.03 | 9.00 | −0.45
29 | 101 | 78 | 23 | 23 | 0.23 | 0.23 | 11.88 | 1.63
30 | 84 | 80 | 4 | 4 | 0.05 | 0.05 | 10.58 | 2.21
31 | 60 | 80 | −20 | 20 | −0.34 | 0.34 | 11.97 | 0.26
32 | 73 | 78 | −5 | 5 | −0.07 | 0.07 | 11.14 | −0.20
𝑀𝐴𝐷𝑡- Mean Absolute Deviation up to period 𝑡
𝑇𝑆𝑡 - Tracking Signal up to period 𝑡
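These measures are easy to compute directly; a minimal Python sketch using the demand and (rounded) forecast columns of the table above:

```python
# Forecast-error measures for periods 25-32: CFE, MAD, RMSE and the
# tracking signal TS = CFE / MAD.
from math import sqrt

y = [65, 90, 71, 80, 101, 84, 60, 73]   # demand, periods 25-32
F = [78, 77, 78, 77, 78, 80, 80, 78]    # forecasts, rounded as in the table

e = [yt - Ft for yt, Ft in zip(y, F)]
CFE = sum(e)
MAD = sum(abs(et) for et in e) / len(e)
RMSE = sqrt(sum(et ** 2 for et in e) / len(e))
TS = CFE / MAD

# Values match the table's final MAD and TS up to rounding of the forecasts.
print(CFE, round(MAD, 2), round(RMSE, 2), round(TS, 2))
```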
Holt's Method (Double Exponential Smoothing)
L_t = α·y_t + (1 − α)(L_{t−1} + b_{t−1})   (level update)
b_t = β(L_t − L_{t−1}) + (1 − β)·b_{t−1}   (trend update)
F_{t+m} = L_t + m·b_t   (forecast m periods ahead)
Holt-Winters Method (Multiplicative Model)
Initialization
L_4 = 375
b_4 = −12.5
• Initialization of seasonal component estimates:
S_1 = y_1 / L_4 = 500 / 375 = 1.333
S_2 = y_2 / L_4 = 350 / 375 = 0.933
S_3 = y_3 / L_4 = 250 / 375 = 0.667
S_4 = y_4 / L_4 = 400 / 375 = 1.067
Here, 𝛼 = 0.4, 𝛽 = 0.1, 𝛾 = 0.3, 𝑝 = 4
𝑚 = 1 (unless otherwise stated)
Results
Year Quarter 𝒕 Sales, 𝒚𝒕 𝑳𝒕 𝒃𝒕 𝑺𝒕 𝑭𝒕+𝒎
2014 1 1 500 1.333
2 2 350 0.933
3 3 250 0.667
4 4 400 375 -12.50 1.067
2015 1 5 450 352.50 -13.50 1.316 483.33
2 6 350 353.40 -12.06 0.950 316.40
3 7 200 324.80 -13.71 0.651 227.56
4 8 300 299.15 -14.91 1.048 331.83
2016 1 9 350 276.91 -15.64 1.301 374.16
2 10 200 240.93 -17.67 0.914 248.32
3 11 150 226.06 -17.39 0.655 145.43
4 12 400 277.94 -10.47 1.165 218.58
2017 1 13 550 329.64 -4.25 1.411 347.88
2 14 350 348.35 -1.95 0.941 297.52
3 15 250 360.50 -0.54 0.667 226.90
4 16 550 404.81 3.94 1.223 419.35
Sample Calculations
• Level estimate for period 5 (L_5):
L_5 = α(y_5 / S_1) + (1 − α)(L_4 + b_4)
L_5 = (0.4)(450 / 1.333) + (1 − 0.4)(375 + (−12.5)) = 352.5
• Seasonal component estimate for period 5 (S_5):
S_5 = γ(y_5 / L_5) + (1 − γ)·S_1
S_5 = (0.3)(450 / 352.5) + (1 − 0.3)(1.333) = 1.316
• Trend estimate for period 5 (b_5):
b_5 = β(L_5 − L_4) + (1 − β)·b_4 = (0.1)(352.5 − 375) + (0.9)(−12.5) = −13.5
• Forecast for period 6:
F_6 = (L_5 + 1·b_5)·S_2 = (352.5 − 13.5)(0.933) = 316.4
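These recursions are easy to script. Below is a minimal Python sketch of the multiplicative Holt-Winters updates, assuming the level is initialized as the first-year average and the trend as the difference between the first two yearly averages divided by p (an assumption that reproduces L_4 = 375 and b_4 = −12.5):

```python
# Multiplicative Holt-Winters smoothing, reproducing the quarterly example.
alpha, beta, gamma, p = 0.4, 0.1, 0.3, 4
y = [500, 350, 250, 400, 450, 350, 200, 300,
     350, 200, 150, 400, 550, 350, 250, 550]

L = sum(y[:p]) / p                               # L4 = 375
b = (sum(y[p:2 * p]) / p - sum(y[:p]) / p) / p   # b4 = -12.5
S = [yt / L for yt in y[:p]]                     # 1.333, 0.933, 0.667, 1.067

for t in range(p, len(y)):
    F = (L + b) * S[t % p]                       # one-step-ahead forecast
    L_new = alpha * y[t] / S[t % p] + (1 - alpha) * (L + b)
    S[t % p] = gamma * y[t] / L_new + (1 - gamma) * S[t % p]
    b = beta * (L_new - L) + (1 - beta) * b
    L = L_new
    print(t + 1, round(F, 2))   # period 5 -> 483.33, period 6 -> 316.40, ...
```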
Holt-Winters Method (Additive Model)
Initialization
L_4 = 375
b_4 = −12.5
• Initialization of seasonal component estimates:
S_1 = y_1 − L_4 = 500 − 375 = 125
S_2 = y_2 − L_4 = 350 − 375 = −25
S_3 = y_3 − L_4 = 250 − 375 = −125
S_4 = y_4 − L_4 = 400 − 375 = 25
Here, α = 0.4, β = 0.1, γ = 0.3, p = 4
m = 1 (unless otherwise stated)
Sample Calculations
F_5 = F_{4+1} = L_4 + 1·b_4 + S_1 = 375 − 12.5 + 125 = 487.5
L_5 = α(y_5 − S_1) + (1 − α)(L_4 + b_4) = (0.4)(450 − 125) + (0.6)(362.5) = 347.5
S_5 = γ(y_5 − L_5) + (1 − γ)·S_1 = (0.3)(450 − 347.5) + (0.7)(125) = 118.25
b_5 = β(L_5 − L_4) + (1 − β)·b_4 = (0.1)(347.5 − 375) + (0.9)(−12.5) = −14
F_6 = F_{5+1} = L_5 + 1·b_5 + S_2 = 347.5 − 14 − 25 = 308.5
Regression Models
Y(X) = a_0 + a_1·X_1 + a_2·X_2 + ⋯ + a_{n−1}·X_{n−1} + a_n·X_n   (multiple linear regression)
Y(X) = a_0 + a_1·X + a_2·X² + ⋯ + a_{n−1}·X^{n−1} + a_n·Xⁿ   (polynomial regression)
[Figure: Scatter plot of response values Y1 to Y5 against predictor values X1 to X6.]
Method of Ordinary Least Squares (OLS)
e_i = Y_i − Ŷ_i
But, Ŷ_i = a + b·X_i
Therefore, e_i = Y_i − a − b·X_i
Let E(a, b) be the sum of the squared errors:
E(a, b) = Σ_{i=1}^{n} (Y_i − Ŷ_i)² = Σ_{i=1}^{n} (Y_i − a − b·X_i)²
⇒ Setting the partial derivatives ∂E/∂a and ∂E/∂b to zero, it can be shown that the resulting least squares estimators b and a are
b = Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) / Σ_{i=1}^{n} (X_i − X̄)²
a = (Σ_{i=1}^{n} Y_i − b Σ_{i=1}^{n} X_i) / n = Ȳ − b·X̄
The formula for b can also be represented in alternative ways, such as
b = E[(X − E[X])(Y − E[Y])] / E[(X − E[X])²] = (E[XY] − E[X]·E[Y]) / (E[X²] − [E[X]]²)
b = Cov(X, Y) / Var(X)
If the predictor values are coded so that
Σ_{i=1}^{n} X_i = 0
then
b = Σ_{i=1}^{n} X_i·Y_i / Σ_{i=1}^{n} X_i²
and
a = (Σ_{i=1}^{n} Y_i) / n = Ȳ
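A minimal Python sketch of these estimator formulas (the function name is ours; the test data are the regression-vs-LP observations that appear later in this section):

```python
# Ordinary least squares for Y = a + bX using the closed-form estimators:
# b = sum((X-Xbar)(Y-Ybar)) / sum((X-Xbar)^2),  a = Ybar - b*Xbar.
def ols(X, Y):
    n = len(X)
    Xbar, Ybar = sum(X) / n, sum(Y) / n
    b = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / \
        sum((x - Xbar) ** 2 for x in X)
    a = Ybar - b * Xbar
    return a, b

X = [1, 2, 3, 4, 5, 6, 7]
Y = [17500, 19000, 23000, 33000, 37250, 35500, 41500]
print(ols(X, Y))   # -> (12500.0, 4258.93) approximately
```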
Correlation Coefficient
• Correlation (Dependence): It is the existence of a
statistical (not necessarily causal) relationship between a
pair of variables.
R² = Σ_{i=1}^{n} (Ŷ_i − Ȳ)² / Σ_{i=1}^{n} (Y_i − Ȳ)²
r_XY = Cov(X, Y) / (σ_X·σ_Y) = (E[XY] − E[X]·E[Y]) / √( (E[X²] − [E[X]]²)(E[Y²] − [E[Y]]²) )
r_XY = (78.091 − 6 × 10.273) / √( (46 − [6]²)(134.818 − [10.273]²) ) = 0.9615
𝑹𝟐 = 𝟎. 𝟗𝟐𝟒𝟒
Regression Analysis
• A simple regression model (Y = a + bX) developed for making forecasts must suitably answer the following questions: (1) Is the slope b significantly different from zero? (2) How precisely can forecasts be made from the model?
• The first question is addressed by testing the hypotheses H_0: b = 0; H_1: b ≠ 0
SE_b = √( [Σ_{i=1}^{n} (Y_i − Ŷ_i)² / (n − k − 1)] / Σ_{i=1}^{n} (X_i − X̄)² )
SE_b = √( (24.355 / (11 − 2)) / 110 ) = 0.1568
t = b / SE_b = 1.645 / 0.1568 = 10.491
• At 𝛼 = 0.05,
𝑡𝑐 = 𝑡0.025,9 = 2.262
• Since 𝑡 > 𝑡𝑐 , we can conclude that 𝑏 is significantly
different from 0.
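A quick numeric check of this test in Python (scipy is used only for the critical value; the summary quantities are those given above):

```python
# t-test for the slope: SE_b = sqrt( (SSE/(n-k-1)) / Sxx ), t = b / SE_b.
from math import sqrt
from scipy.stats import t as t_dist

SSE, n, k, Sxx, b = 24.355, 11, 1, 110.0, 1.645
SE_b = sqrt((SSE / (n - k - 1)) / Sxx)
t_stat = b / SE_b
t_crit = t_dist.ppf(1 - 0.05 / 2, df=n - k - 1)   # two-sided, alpha = 0.05

print(round(SE_b, 4), round(t_stat, 3), round(t_crit, 3))
# 0.1568, 10.488, 2.262 -> t > t_crit, reject H0 (slides report t = 10.491
# because they round SE_b before dividing)
```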
Regression Analysis
• The second question is concerned with the precision with which forecasts can be made from the fitted model, i.e., the standard error of a forecast at a new predictor value X′.
SE_F_(X′) = √( [Σ_{i=1}^{n} (Y_i − Ŷ_i)² / (n − k − 1)] · (1 + 1/n + (X′ − X̄)² / Σ_{i=1}^{n} (X_i − X̄)²) )
SE_F_12 = √( (24.355 / (11 − 2)) · (1 + 1/11 + (12 − 6)²/110) ) = 1.959
[Figure: Scatter plot of Yield (mol/kmol) against Temperature (20 to 60 °C), showing a curved relationship.]
Solution
• Fitting a quadratic regression model to the data and
applying OLS method to determine the coefficients,
the best-fit regression equation is found to be:
Predicted Yield = 218.436 − 11.504 · Temp + 0.303 · Temp²
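A sketch of how such a quadratic fit can be obtained with numpy; the temperature/yield arrays below are placeholders, since the underlying data table is shown only as a plot:

```python
# Quadratic (degree-2 polynomial) regression via least squares.
import numpy as np

temp = np.array([20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0, 60.0])    # hypothetical
yield_ = np.array([110.0, 120.0, 145.0, 185.0, 240.0, 300.0, 380.0, 460.0, 550.0])

c2, c1, c0 = np.polyfit(temp, yield_, deg=2)   # coefficients, highest power first
print(f"Yield = {c0:.3f} + {c1:.3f}*Temp + {c2:.3f}*Temp^2")
```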
Exponential model: Y = a·e^(bX), so that ln Y = ln a + b·X
[Figure: Sales (in 10^5 units) vs. period, and Log(Sales) vs. period; the log-transformed series is approximately linear in time.]
Example
• Here, the natural logarithm of the response
variable (sales) is linear in terms of the
predictor variable (time).
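A sketch of fitting such an exponential trend by regressing ln(sales) on time; the sales series below is hypothetical, since the original data are shown only as a plot:

```python
# Exponential trend Y = a*exp(b*X): fit ln(Y) = ln(a) + b*X by OLS.
import numpy as np

period = np.arange(1, 31)
sales = 1.2 * np.exp(0.12 * period)      # hypothetical series in 10^5 units

b, ln_a = np.polyfit(period, np.log(sales), deg=1)   # slope, intercept
a = np.exp(ln_a)
print(f"Y = {a:.3f} * exp({b:.3f} * X)")  # recovers a = 1.2, b = 0.12 exactly
```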
Harmonic Regression
Y(t) = A sin(2πt/p) + B cos(2πt/p)
With first and second order frequencies:
Y(t) = A_1 sin(2πt/p) + B_1 cos(2πt/p) + A_2 sin(4πt/p) + B_2 cos(4πt/p)
With an added linear trend:
Y(t) = A + B·t + C sin(2πt/p) + D cos(2πt/p)
[Figure: Paper production vs. period (periods 1 to 48), showing an increasing trend with a repeating cyclical pattern.]
Harmonic Regression (example)
• Assuming an additive model and considering only the first
and second order frequencies, the regression equation
can be written as:
Y(t) = A + B·t + C sin(2πt/p) + D cos(2πt/p) + E sin(4πt/p) + F cos(4πt/p)
Ŷ_49 = 4768.4 units
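A sketch of fitting the harmonic model above by ordinary least squares on a design matrix of trend and sinusoidal terms. The helper names are ours, and p = 12 is an assumption based on the monthly data; the slides do not state the period or the fitted coefficients:

```python
# Harmonic regression: Y(t) = A + B*t + C*sin(2*pi*t/p) + D*cos(2*pi*t/p)
#                             + E*sin(4*pi*t/p) + F*cos(4*pi*t/p)
import numpy as np

def harmonic_fit(y, p):
    t = np.arange(1, len(y) + 1)
    X = np.column_stack([np.ones_like(t, dtype=float), t,
                         np.sin(2 * np.pi * t / p), np.cos(2 * np.pi * t / p),
                         np.sin(4 * np.pi * t / p), np.cos(4 * np.pi * t / p)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef                     # [A, B, C, D, E, F]

def harmonic_predict(coef, t, p):
    A, B, C, D, E, F = coef
    return (A + B * t + C * np.sin(2 * np.pi * t / p) + D * np.cos(2 * np.pi * t / p)
            + E * np.sin(4 * np.pi * t / p) + F * np.cos(4 * np.pi * t / p))

# y = np.array([...])             # the 48-month paper-production series
# coef = harmonic_fit(y, p=12)    # p = 12 assumed for monthly data
# print(harmonic_predict(coef, t=49, p=12))
```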
Regression vs Linear Programming Model
Run the regression model on the dataset below to get the intercept (a) and the slope (b), then calculate the forecasted value (ŷ = a + bx) for each observation.
x | y
1 | 17500
2 | 19000
3 | 23000
4 | 33000
5 | 37250
6 | 35500
7 | 41500
Results with Regression
𝑎 = 12500
𝑏 = 4258.93
ŷ = 12500 + 4258.93x
x | y | ŷ
1 17500 16758.93
2 19000 21017.86
3 23000 25276.79
4 33000 29535.72
5 37250 33794.65
6 35500 38053.57
7 41500 42312.5
• Mean Absolute Percentage Error (MAPE)= 7.67%
Linear Programming Model
Let 𝐸𝑖 (percentage error), 𝑎 and 𝑏 (regression coefficients) be the
decision variables.
Objective:
Min Z = (1/n) Σ_{i=1}^{n} E_i
subject to
1. ŷ_i = a + b·x_i; ∀i ∈ {1, 2, 3, …, n}
2. E_i ≥ (ŷ_i − y_i)/y_i; ∀i ∈ {1, 2, 3, …, n}
3. E_i ≥ (y_i − ŷ_i)/y_i; ∀i ∈ {1, 2, 3, …, n}
𝐸𝑖 ≥ 0; ∀𝑖 ∈ {1,2,3, … , 𝑛}
𝑎, 𝑏 unrestricted in sign
Results with LP using GAMS:
𝑍 = 0.0719
𝑎 = 13500, 𝑏 = 4000
ŷ = 13500 + 4000x
MAPE = 7.19%
MAPE obtained using Linear programming model (7.19%) is less than that
obtained using regression (7.67%).
LP model considering costs of overestimation
and underestimation
• Let 𝐶𝑖 (cost), 𝑎 and 𝑏 (regression coefficients) be the decision
variables.
• Objective: Min Z = Σ_{i=1}^{n} C_i
subject to
1. 𝑦ො𝑖 = 𝑎 + 𝑏𝑥𝑖 ; ∀𝑖 ∈ {1,2,3, … , 𝑛}
2. C_i ≥ (ŷ_i − y_i)·p_o; ∀i ∈ {1, 2, 3, …, n}
3. C_i ≥ (y_i − ŷ_i)·p_u; ∀i ∈ {1, 2, 3, …, n}
𝐶𝑖 ≥ 0; ∀𝑖 ∈ 1,2,3, … , 𝑛
𝑎, 𝑏 unrestricted in sign
• Here, 𝑝𝑜 and 𝑝𝑢 are the unit costs of overestimating and
underestimating the response variable, respectively.
• Special case: 𝑝𝑜 = 𝑝𝑢 = 1. Here, the objective of minimizing the
total cost becomes equivalent to minimizing the mean absolute
deviation.
Assumptions of Linear Regression
• Unlike the first two methods, heteroscedasticity-consistent standard errors (HCSEs) don't alter the regression coefficients and are much simpler to use.
Example
• The heights and weights of a sample of 100 athletes in a sports academy are
tabulated below. Fit a linear regression model to predict the weight of an athlete from
his/her height and check for homoscedasticity assumption using graphical
inspection.
Height (m) | Weight (kg) | Height (m) | Weight (kg) | Height (m) | Weight (kg) | Height (m) | Weight (kg)
1.73 71.3 1.69 68.3 1.62 58.9 1.87 92.4
1.85 80.0 1.70 65.1 1.81 85.4 1.83 87.2
1.80 77.5 1.73 67.9 1.66 59.9 1.84 77.7
1.87 87.7 1.74 70.1 1.84 75.1 1.73 73.5
1.89 86.1 1.71 61.8 1.60 54.2 1.66 62.1
1.60 51.9 1.68 65.5 1.89 92.7 1.60 53.5
1.73 73.0 1.70 60.6 1.80 78.3 1.66 60.3
1.68 61.5 1.86 79.8 1.81 80.6 1.80 71.1
1.71 72.4 1.85 92.2 1.59 54.3 1.89 83.0
1.69 67.3 1.66 61.5 1.71 70.3 1.85 78.7
1.84 87.7 1.84 84.3 1.87 82.8 1.77 79.7
1.63 55.3 1.74 64.5 1.77 72.7 1.87 75.5
1.72 64.2 1.78 77.4 1.85 74.2 1.74 72.0
1.71 70.3 1.68 67.1 1.60 52.6 1.84 82.9
1.67 62.4 1.64 56.7 1.75 65.0 1.67 66.4
1.60 56.8 1.66 63.0 1.87 86.8 1.61 54.9
1.75 77.1 1.85 83.6 1.86 78.2 1.77 67.0
1.79 82.1 1.63 57.6 1.62 57.6 1.63 54.6
1.90 97.4 1.61 57.5 1.80 79.3 1.63 60.5
1.64 61.7 1.76 72.0 1.67 59.9 1.86 94.0
1.68 62.8 1.83 78.8 1.83 77.1 1.77 72.7
1.81 84.0 1.68 66.5 1.85 77.3 1.90 92.3
1.60 56.9 1.84 86.7 1.90 81.0 1.67 62.6
1.84 86.1 1.64 56.7 1.76 74.0 1.72 62.4
1.78 81.0 1.67 58.5 1.65 56.3 1.62 54.4
Solution
• The regression equation is determined as:
Predicted Weight = −131.18 + 115.97 · Height
• Plotting the residuals against the predicted values, we find that the error
variance is not constant, but appears to be increasing proportionally with
the predicted value (and hence with the independent variable). Thus, the
data set is found to be heteroscedastic by graphical inspection.
[Figure: Residuals (kg) plotted against predicted weight (50 to 90 kg); the spread of the residuals widens as the predicted value increases.]
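A sketch of this graphical check in Python (the height/weight arrays are abbreviated here; the full 100-athlete table is above):

```python
# Fit weight = a + b*height, then plot residuals against predicted values
# to inspect the constant-variance (homoscedasticity) assumption.
import numpy as np
import matplotlib.pyplot as plt

height = np.array([1.73, 1.85, 1.80, 1.87, 1.89, 1.60])   # ... all 100 values
weight = np.array([71.3, 80.0, 77.5, 87.7, 86.1, 51.9])   # ... all 100 values

b, a = np.polyfit(height, weight, deg=1)   # slope, intercept
predicted = a + b * height
residuals = weight - predicted

plt.scatter(predicted, residuals)
plt.axhline(0.0)
plt.xlabel("Predicted weight (kg)"); plt.ylabel("Residual (kg)")
plt.show()   # a funnel shape (spread growing with prediction) signals heteroscedasticity
```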
Variable Selection in Regression
• Since Engine Power is significant (𝑝-value < 𝛼 = 0.1) and has the
lowest 𝑝-value, it is selected for entry into the model.
Forward Addition – Step 2
• Mileage vs. Power and Displacement
Coefficients Standard Error t Stat p-value
Intercept 59.876 3.412 17.546 6.38E-19
Engine Power (kW) -0.335 0.073 -4.595 5.41E-05
Engine displacement (litres) -0.683 0.989 -0.691 0.494
• Since Displacement is insignificant and has the highest 𝑝-value, it is the first
predictor to be removed from the model.
Step 2:
• Mileage vs. Power and Acceleration
Coefficients Standard Error t Stat p-value
Intercept 87.739 6.674 13.146 4.22E-15
Engine Power (kW) -0.412 0.030 -13.588 1.60E-15
Acceleration (0-100 kmph) (sec) -1.620 0.383 -4.232 1.59E-04
• Here, both variables are significant and no further variable removal is possible. The final regression equation (from the coefficient table above) is:
Mileage = 87.739 − 0.412 · Power − 1.620 · Acceleration
• Here, Error(y, ŷ) is some metric of forecast error for the model, e.g. the sum of squared errors.
LASSO Regression
• LASSO stands for Least Absolute Shrinkage and Selection Operator.
• It is also known as L1 regularisation.
• It is similar to the Ridge regression approach; the only difference is that the penalty term includes the sum of absolute values of the regression coefficients instead of the sum of squares of the regression coefficients:
Ridge: Min Σ_{i=1}^{n} (y_i − ŷ_i)² + λ Σ_j b_j²
LASSO: Min Σ_{i=1}^{n} (y_i − ŷ_i)² + λ Σ_j |b_j|
Adjusted Coefficient of Determination
R²_adj = 1 − (1 − R²)(n − 1) / (n − k − 1)
• Consider the same mileage example from before (sample size n = 38), using the regression of Mileage on Power and Acceleration obtained above, for which R² = 0.8411:
R²_Adj(Pow, Acc) = 1 − (1 − R²)(n − 1) / (n − k − 1) = 1 − (1 − 0.8411)(38 − 1) / (38 − 2 − 1) = 0.832
• Adding the insignificant Displacement variable gives:
R² = R²_(Pow, Acc, Disp) = 0.8412
Adjusted Coefficient of Determination
• We find that there is a marginal increase in the value of 𝑅2
despite adding an insignificant variable.
R²_Adj(Pow, Acc, Disp) = 1 − (1 − R²)(n − 1) / (n − k − 1) = 1 − (1 − 0.8412)(38 − 1) / (38 − 3 − 1) = 0.827
• The adjusted R², however, decreases (0.832 → 0.827), correctly penalizing the insignificant variable.
Multicollinearity
1. Two predictors are highly correlated with each other (that is, they
have a correlation coefficient close to +1 or -1). In this case, knowing
the value of one of the variables tells you a lot about the value of the
other variable.
𝑌 = 𝑏0 + 𝑏1 𝑋1 + 𝑏2 𝑋2 + ⋯ + 𝑏𝑛 𝑋𝑛
VIF_k = 1 / (1 − R_k²)
• Where 𝑅𝑘2 is the coefficient of multiple determination
considering 𝑋𝑘 as the dependent variable and all other 𝑋𝑖
(𝑖 ∈ {1,2,3, … , 𝑛}\{𝑘}) as independent variables
R²_Power = 0.0638
VIF_Power = 1 / (1 − R²_Power) = 1 / (1 − 0.0638) = 1 / 0.9362 = 1.0682
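A sketch of the VIF computation: regress each predictor on the remaining ones and apply the formula above. The function name is ours, and the predictor matrix X is hypothetical (columns could be, e.g., power, displacement, acceleration):

```python
# Variance Inflation Factor: VIF_k = 1 / (1 - R_k^2), where R_k^2 comes from
# regressing predictor k on all the other predictors.
import numpy as np

def vif(X, k):
    others = np.delete(X, k, axis=1)
    A = np.column_stack([np.ones(len(X)), others])   # add intercept column
    coef, *_ = np.linalg.lstsq(A, X[:, k], rcond=None)
    fitted = A @ coef
    ss_res = np.sum((X[:, k] - fitted) ** 2)
    ss_tot = np.sum((X[:, k] - X[:, k].mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 / (1.0 - r2)

# Usage: X = np.column_stack([power, disp, accel]); print(vif(X, 0))
```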
Stationarity
• A time series is (weakly) stationary if:
μ_t = μ, ∀t (constant mean)
σ_t = σ, ∀t (constant variance)
ρ(y_t, y_{t+h}) = ρ_h (autocorrelation depending only on the lag h)
• Trend rules out series (a), (c), (e), (f) and (i).
Differencing
y′_t = y_t − y_{t−1}
• Transformations such as logarithms can help to stabilize the variance of a time-series.
• Applying differencing twice gives the second difference:
y″_t = y′_t − y′_{t−1} = (y_t − y_{t−1}) − (y_{t−1} − y_{t−2})
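First and second differences are one-liners with numpy; the series below is the start of the paper-production data from the ARIMA example later in this section:

```python
# First and second differencing of a series y_t.
import numpy as np

y = np.array([2740, 2805, 2835, 2840, 2895, 2905])   # first months of the paper data
dy = np.diff(y)         # y'_t  = y_t - y_{t-1}
d2y = np.diff(y, n=2)   # y''_t = y'_t - y'_{t-1}
print(dy, d2y)          # [65 30  5 55 10], [-35 -25 50 -45]
```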
AR(1): (X_t − μ) = φ_1(X_{t−1} − μ) + ε_t
Writing Ẋ_t = X_t − μ for the mean-centred series, the point forecast drops the unpredictable error term:
X̂_t = X_t − ε_t = (μ + Ẋ_t) − ε_t = (μ + φ_1·Ẋ_{t−1} + ε_t) − ε_t
X̂_t = μ + φ_1·Ẋ_{t−1}
φ_1 = Cov(Ẋ_t, Ẋ_{t−1}) / Var(Ẋ_{t−1})
Example
• Consider the following demand for TV sets over a 12-month period.
Using AR(1) process, predict the demand for the 13th month. Also,
compare the AR(1) model with MA(1) model and a five-period simple
moving average model using any error metric.
Period Demand Xt
1 106
2 110
3 118
4 105
5 115
6 100
7 112
8 106
9 118
10 102
11 112
12 110
AR(1) Process
Period | Demand X_t | Ẋ_t | Ẋ_{t−1} | Ẋ_t·Ẋ_{t−1} | (Ẋ_t)²
1 106 -3.5 12.25
2 110 0.5 -3.5 -1.75 0.25
3 118 8.5 0.5 4.25 72.25
4 105 -4.5 8.5 -38.25 20.25
5 115 5.5 -4.5 -24.75 30.25
6 100 -9.5 5.5 -52.25 90.25
7 112 2.5 -9.5 -23.75 6.25
8 106 -3.5 2.5 -8.75 12.25
9 118 8.5 -3.5 -29.75 72.25
10 102 -7.5 8.5 -63.75 56.25
11 112 2.5 -7.5 -18.75 6.25
12 110 0.5 2.5 1.25 0.25
Sum 1314 -256.25 379
Average 109.5 -23.3 31.6
AR(1) Process
• The data from the previous table can be used to
estimate the value of 𝜑1 for the AR Process.
φ_1 = Cov(Ẋ_t, Ẋ_{t−1}) / Var(Ẋ_{t−1}) = −256.25 / 379 ≈ −0.676
X̂_t = μ + φ_1·Ẋ_{t−1}
X̂_13 = 109.5 + (−0.676)(110 − 109.5) ≈ 109.2
AR(1) Process
• Residual sum of squares from forecasts using AR(1) model:
Period | Demand X_t | Forecast X̂_t | Error ε_t | ε_t²
1 106
2 110 112 -2 4
3 118 109.2 8.8 77.44
4 105 103.6 1.4 1.96
5 115 112.7 2.3 5.29
6 100 105.7 -5.7 32.49
7 112 116.2 -4.2 17.64
8 106 107.8 -1.8 3.24
9 118 112.2 5.8 33.64
10 102 103.6 -1.6 2.56
11 112 115 -3 9
12 110 109.2 0.8 0.64
Sum 187.9
Average 17.08
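A sketch reproducing the AR(1) estimate from the tables above (φ̂_1 ≈ −256.25/379 ≈ −0.676, giving X̂_13 ≈ 109.2):

```python
# AR(1): estimate phi_1 = sum(Xdot_t * Xdot_{t-1}) / sum(Xdot_t^2) on the
# mean-centred demand, then forecast X_13 = mu + phi_1 * Xdot_12.
demand = [106, 110, 118, 105, 115, 100, 112, 106, 118, 102, 112, 110]

mu = sum(demand) / len(demand)     # 109.5
xd = [d - mu for d in demand]      # mean-centred series

num = sum(xd[t] * xd[t - 1] for t in range(1, len(xd)))   # -256.25
den = sum(x ** 2 for x in xd)      # 379, matching the table's sum of squares
phi1 = num / den

print(round(phi1, 3), round(mu + phi1 * xd[-1], 1))   # -0.676, 109.2
```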
Simple 5-Period Moving Average (MA) Model
Period | Demand X_t | Forecast X̂_t | Error ε_t | ε_t²
1 106
2 110
3 118
4 105
5 115
6 100 110.8 -10.8 116.6
7 112 109.6 2.4 5.76
8 106 110 -4 16
9 118 107.6 10.4 108.2
10 102 110.2 -8.2 67.24
11 112 107.6 4.4 19.36
12 110 110 0 0
Sum 1314 333.2
Average 109.5 47.6
Moving Average Model
• A Moving Average (MA) model is similar to an AR model,
but uses lagged residuals instead of lagged observations
to predict the response.
MA(1): X_t − μ = ε_t − θ_1·ε_{t−1}
Moving Average Model
X̂_t = μ − θ_1·ε_{t−1}
Period | Demand X_t | Forecast X̂_t | Error ε_t | ε_t²
1 106
2 110 111.3 -1.3 1.7
3 118 110.2 7.8 60.8
4 105 105.6 -0.6 0.4
5 115 109.8 5.2 27.0
6 100 106.9 -6.9 47.6
7 112 113 -1 1
8 106 110 -4 16
9 118 111.5 6.5 42.3
10 102 106.3 -4.3 18.5
11 112 111.7 0.3 0.1
12 110 109.4 0.6 0.4
Sum 1314 215.7
Average 109.5 19.6
ARMA(𝑝, 𝑞) Models
• An ARMA model combines features of AR models and
MA models.
➢ ARMA(1, 1):
(X_t − μ) = φ_1(X_{t−1} − μ) + ε_t + θ_1·ε_{t−1}
➢ ARMA(2, 1):
(X_t − μ) = φ_1(X_{t−1} − μ) + φ_2(X_{t−2} − μ) + ε_t + θ_1·ε_{t−1}
➢ ARMA(1, 2):
(X_t − μ) = φ_1(X_{t−1} − μ) + ε_t + θ_1·ε_{t−1} + θ_2·ε_{t−2}
ARIMA(𝒑, 𝒅, 𝒒) Models
• ARIMA(𝑝, 𝑑, 𝑞) models are extensions of ARMA(𝑝, 𝑞)
models that incorporate differencing, thereby allowing
non-stationary time-series to be modelled.
• The Box-Jenkins methodology comprises five steps:
1. Data preparation
2. Model selection
3. Parameter estimation
4. Model Checking
5. Forecasting
Box-Jenkins Modelling
➢ Data Preparation:
➢ This involves cleaning the data and performing necessary
transformations and differencing to make the time series
stationary, to stabilize the variance of the data and minimize
the effects of obvious patterns like trend and seasonality.
The order of differencing (𝑑) is generally decided at this
stage.
➢ Model Selection:
➢ The model selection may be done by using plots of
Autocorrelation function (ACF), Partial Autocorrelation
function (PACF) and Extended Autocorrelation function
(EACF). These plots assist in deciding how many lagged
observations (𝑝) and residuals (𝑞) are needed.
Box-Jenkins Modelling
➢ Parameter Estimation:
➢ The model parameters (the φ and θ coefficients) are estimated from the data, typically by maximum likelihood or least squares.
➢ Forecasting:
➢ Once the model has been deemed fit for purpose, the
final step is to use the model to make forecasts.
Feedback from the performance of the model may be used to make minor modifications and to refine it further.
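A sketch of this workflow with statsmodels; the order (1, 1, 1) is illustrative, not the order identified in the slides:

```python
# Box-Jenkins in practice: difference/identify, fit an ARIMA(p, d, q),
# check the diagnostics, then forecast.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

y = np.array([2740, 2805, 2835, 2840, 2895, 2905, 2990, 3070, 3185,
              3275, 3320, 3305, 3285, 3255, 3235, 3225, 3260, 3345])  # first 18 months

model = ARIMA(y, order=(1, 1, 1))   # p=1, d=1, q=1 (illustrative choice)
fit = model.fit()
print(fit.summary())                 # parameter estimates and diagnostics
print(fit.forecast(steps=3))         # forecasts for the next 3 periods
```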
ARIMA Example
• Consider the same paper production example from before:
Period Paper_Prod Period Paper_Prod
Jan-17 2740 Jan-19 3870
Feb-17 2805 Feb-19 3850
Mar-17 2835 Mar-19 3810
Apr-17 2840 Apr-19 3800
May-17 2895 May-19 3790
Jun-17 2905 Jun-19 3820
Jul-17 2990 Jul-19 3910
Aug-17 3070 Aug-19 3980
Sep-17 3185 Sep-19 4030
Oct-17 3275 Oct-19 4110
Nov-17 3320 Nov-19 4195
Dec-17 3305 Dec-19 4235
Jan-18 3285 Jan-20 4325
Feb-18 3255 Feb-20 4395
Mar-18 3235 Mar-20 4475
Apr-18 3225 Apr-20 4510
May-18 3260 May-20 4495
Jun-18 3345 Jun-20 4470
Jul-18 3405 Jul-20 4450
Aug-18 3595 Aug-20 4435
Sep-18 3725 Sep-20 4425
Oct-18 3790 Oct-20 4485
Nov-18 3850 Nov-20 4585
Dec-18 3875 Dec-20 4635
Solution
• From a plot of the production data, it can be clearly seen
that the data exhibits cyclicity as well as an increasing
trend.
[Figure: Paper production vs. period (periods 1 to 48), showing an increasing trend with a cyclical pattern.]
Solution
➢ Data Preparation:
➢ An initial analysis of the data reveals no extreme outliers or
sudden changes of magnitude or missing observations.