Ass1 Q2 Daisy Econometric Prediction ARIMA
2023-06-19
a. Plot the time series. On a strictly visual basis, does the series
appear to be stationary?
First, we import the data and format it so that we can plot and analyse the series.
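# Load the packages used below: tseries for adf.test(); forecast for Acf(), Pacf(), Arima() and auto.arima()
library(tseries)
library(forecast)
# Import dataset
# you will need to change the directory to reflect where you have stored the file
US_Inflation <- read.csv("US_Inflation.csv")
View(US_Inflation)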
[Figure: US Annual Inflation. y-axis: US_Inflation$Ann_Infl (1 to 6); x-axis: Time]
As the graph shows, the series does not appear to be stationary: its level and its variability both change
over time. Between time 0 and time 250 the inflation rate mostly stays between 1 and 3 percent, and the
series looks roughly stationary over that period. After time 250, however, the rate climbs sharply and peaks
above 6 percent. On a visual basis, we conclude that the series is not stationary.
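One way to back up this visual impression is to overlay a trailing mean on the series. The sketch below is only an illustration: it uses a simulated stand-in series (calm for 250 observations, then higher and more volatile), and the 24-observation window is an arbitrary choice, not something taken from the assignment.

```r
# Simulated stand-in for the inflation series (NOT the real data):
# 250 calm observations around 2, then 30 higher, noisier observations around 5.
set.seed(7)
x <- c(rnorm(250, mean = 2, sd = 0.4), rnorm(30, mean = 5, sd = 0.8))

# Trailing (one-sided) moving average; the window length is an assumption.
window <- 24
roll_mean <- stats::filter(x, rep(1 / window, window), sides = 1)

# A stationary series would keep the red line roughly flat.
plot(x, type = "l", xlab = "Time", ylab = "Simulated rate")
lines(roll_mean, col = "red", lwd = 2)
```

With the real data, `x` would be replaced by `US_Inflation$Ann_Infl`.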
Next, we run the Augmented Dickey-Fuller (ADF) test to check the inflation time series for stationarity.
###Run Augmented Dickey-Fuller to test whether the inflation time-series is stationary.
Step 1: Hypotheses
H0: ρ = 1 (the series has a unit root and is non-stationary).
H1: ρ < 1 (the series is stationary).
adf.test(US_Inflation$Ann_Infl)
##
## Augmented Dickey-Fuller Test
##
## data: US_Inflation$Ann_Infl
## Dickey-Fuller = -2.1581, Lag order = 6, p-value = 0.5095
## alternative hypothesis: stationary
From the Dickey-Fuller test, we obtain a test statistic of -2.1581 and an associated p-value of 0.5095.
Since the p-value is greater than alpha = 0.05, we fail to reject the null hypothesis that ρ = 1.
Step 4: Conclusion
Based on this test, we conclude that the US inflation figures for the period January 2000 to April
2023 are non-stationary.
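For readers who want to see where the reported statistic comes from, the sketch below rebuilds the ADF regression in base R on a simulated random walk (a stand-in, not the inflation data). It assumes the setup that `tseries::adf.test` uses by default, namely a constant, a linear trend, and a lag order of trunc((n - 1)^(1/3)), which gives 6 for n = 280 and matches the output above. The variable names are illustrative. Only the statistic is computed; the p-value needs Dickey-Fuller critical values, not the usual t table.

```r
# Simulated random walk standing in for the 280 inflation observations.
set.seed(1)
x <- cumsum(rnorm(280))
n <- length(x)

# Assumed default lag order of tseries::adf.test: trunc((n - 1)^(1/3)).
k <- trunc((n - 1)^(1 / 3))

# Each row of z holds the current difference followed by its k lags.
dx <- diff(x)
z <- embed(dx, k + 1)
dy <- z[, 1]
dy_lags <- z[, -1, drop = FALSE]

# Lagged level x[t-1], aligned with dy, plus a linear time trend.
x_lag <- x[(k + 1):(n - 1)]
trend <- seq_along(x_lag)

# ADF regression: dy on a constant, the lagged level, the trend and lagged differences.
fit <- lm(dy ~ x_lag + trend + dy_lags)

# The Dickey-Fuller statistic is the t-ratio on the lagged level.
df_stat <- summary(fit)$coefficients["x_lag", "t value"]
print(df_stat)
```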
c. Is there visual evidence that the inflation series follows an autoregressive (AR)
process? A moving average (MA) process?
Yes. The two graphs below give clear visual evidence about which type of process the inflation series follows.
# Autocorrelation function (ACF) and partial autocorrelation function (PACF) to look for evidence of an AR or MA model
Acf(US_Inflation$Ann_Infl)
[Figure: ACF of Series US_Inflation$Ann_Infl. y-axis: ACF (−0.2 to 1.0); x-axis: Lag (5 to 20)]
Pacf(US_Inflation$Ann_Infl)
[Figure: Partial ACF of Series US_Inflation$Ann_Infl. y-axis: Partial ACF (−0.2 to 1.0); x-axis: Lag (5 to 20)]
As the two graphs above show, the inflation series looks like an autoregressive (AR) process rather than a
moving average (MA) process. In the first graph, the ACF decays gradually from about 1.0 towards -0.2 as the
lag increases from 0 to 20, instead of cutting off after a few lags as it would for an MA process. In the
Partial ACF graph, there is a single large spike close to 1 at lag 1, and for lags from 2 to over 20 the
partial autocorrelations only fluctuate between about -0.1 and 0.1. A slowly decaying ACF together with a
PACF that cuts off after the first lag is the typical signature of an AR process, so there is enough visual
evidence to conclude that the inflation series follows an AR process.
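The pattern described above can be reproduced on simulated data. The sketch below uses base R only; the AR coefficient of 0.8, the sample size, and the seed are arbitrary assumptions chosen for illustration, not estimates from the inflation series.

```r
# Simulate a stationary AR(1) process (the coefficient is an arbitrary choice).
set.seed(123)
x <- arima.sim(model = list(ar = 0.8), n = 500)

# Sample ACF (first element is lag 0) and sample PACF (first element is lag 1).
acf_vals <- acf(x, lag.max = 10, plot = FALSE)$acf[, 1, 1]
pacf_vals <- pacf(x, lag.max = 10, plot = FALSE)$acf[, 1, 1]

# For an AR(1), the ACF tails off gradually while the PACF has one
# large spike at lag 1 and stays near zero afterwards.
print(round(acf_vals[2:6], 2))
print(round(pacf_vals[1:5], 2))
```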
# Run Arima(1,1,0), Arima(2,1,0) and Arima(3,1,0), compare AIC to choose the better AR model
Arima(US_Inflation$Ann_Infl, order = c(1,1,0))
## Series: US_Inflation$Ann_Infl
## ARIMA(1,1,0)
##
## Coefficients:
## ar1
## 0.429
## s.e. 0.054
##
## sigma^2 = 0.02802: log likelihood = 103.21
## AIC=-202.42 AICc=-202.38 BIC=-195.16
Arima(US_Inflation$Ann_Infl, order = c(2,1,0))
## Series: US_Inflation$Ann_Infl
## ARIMA(2,1,0)
##
## Coefficients:
## ar1 ar2
## 0.4465 -0.0409
## s.e. 0.0598 0.0599
##
## sigma^2 = 0.02807: log likelihood = 103.45
## AIC=-200.89 AICc=-200.8 BIC=-190
Arima(US_Inflation$Ann_Infl, order = c(3,1,0))
## Series: US_Inflation$Ann_Infl
## ARIMA(3,1,0)
##
## Coefficients:
## ar1 ar2 ar3
## 0.4381 0.0519 -0.2053
## s.e. 0.0585 0.0643 0.0587
##
## sigma^2 = 0.02698: log likelihood = 109.42
## AIC=-210.83 AICc=-210.69 BIC=-196.31
##
## Fitting models using approximations to speed things up...
##
## ARIMA(2,1,2) with drift : -222.4697
## ARIMA(0,1,0) with drift : -143.6764
## ARIMA(1,1,0) with drift : -197.8241
## ARIMA(0,1,1) with drift : -187.9188
## ARIMA(0,1,0) : -144.3907
## ARIMA(1,1,2) with drift : -203.6367
## ARIMA(2,1,1) with drift : -194.6601
## ARIMA(3,1,2) with drift : -224.2217
## ARIMA(3,1,1) with drift : -204.3787
## ARIMA(4,1,2) with drift : -224.9197
## ARIMA(4,1,1) with drift : -202.8613
## ARIMA(5,1,2) with drift : -222.1578
## ARIMA(4,1,3) with drift : -223.9524
## ARIMA(3,1,3) with drift : -224.5022
## ARIMA(5,1,1) with drift : -214.3436
## ARIMA(5,1,3) with drift : -221.9291
## ARIMA(4,1,2) : -226.2285
## ARIMA(3,1,2) : -225.3467
## ARIMA(4,1,1) : -204.0654
## ARIMA(5,1,2) : -223.4056
## ARIMA(4,1,3) : -224.9849
## ARIMA(3,1,1) : -205.6091
## ARIMA(3,1,3) : -225.2427
## ARIMA(5,1,1) : -216.2136
## ARIMA(5,1,3) : Inf
##
## Now re-fitting the best model(s) without approximations...
##
## ARIMA(4,1,2) : -228.8365
##
## Best model: ARIMA(4,1,2)
AIC value
modelA$aic
## [1] -228.8365
In time series modeling, the AIC (Akaike Information Criterion) is a statistical measure used to compare
different models and select the one that best fits the data while penalizing complexity. A lower AIC value
indicates a better trade-off between fit and complexity. The AIC formula is given by:
AIC = -2 * log-likelihood + 2 * (number of parameters)
As the trace shows, among the preliminary (approximate) fits the ARIMA(4,1,2) model without drift has
the lowest AIC value, -226.2285, which indicates that it has the best balance between fit and complexity
according to the AIC criterion.
After identifying the ARIMA(4,1,2) model as having the lowest AIC value in the preliminary fits, the model
is re-fitted without approximations. The re-fitted ARIMA(4,1,2) model has a slightly adjusted AIC value of
-228.8365.
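As a worked check of the formula, the AIC and BIC of the three AR fits above can be recomputed from their reported log likelihoods. The sketch assumes that the parameter count is the number of AR coefficients plus one for the innovation variance, and that the BIC uses n = 279 observations (280 minus one lost to differencing). Both assumptions are inferred from the printed output, which they reproduce up to rounding of the log likelihood.

```r
# Log likelihoods as printed in the Arima() output above.
loglik <- c("ARIMA(1,1,0)" = 103.21,
            "ARIMA(2,1,0)" = 103.45,
            "ARIMA(3,1,0)" = 109.42)

# Assumed parameter count: number of AR coefficients + 1 for sigma^2.
k <- c(2, 3, 4)

# Assumed effective sample size after first differencing.
n <- 279

aic <- -2 * loglik + 2 * k
bic <- -2 * loglik + k * log(n)

print(round(cbind(aic, bic), 2))
```

Of these three fits, ARIMA(3,1,0) has the lowest AIC (-210.83 in the output above).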
Next, we extract the model's coefficients:
# Model's coefficients
modelA$coef
The ARIMA(4,1,2) model has the following coefficient values for the autoregressive (AR)
and moving average (MA) terms:
1. AR Terms:
• AR1 (ar1): 0.4677
• AR2 (ar2): -0.7858
• AR3 (ar3): 0.1899
• AR4 (ar4): -0.0397
2. MA Terms:
• MA1 (ma1): -0.0288
• MA2 (ma2): 0.9298
Because the model uses first differencing (d = 1), these coefficients describe the month-to-month change in
inflation rather than its level:
1. AR Terms:
• AR1 (ar1): The coefficient 0.4677 indicates that the current change is positively related to the
change at lag 1.
• AR2 (ar2): The coefficient -0.7858 indicates a strong negative relationship between the change at
lag 2 and the current change.
• AR3 (ar3): The coefficient 0.1899 indicates a moderate positive relationship with the change at lag 3.
• AR4 (ar4): The coefficient -0.0397 indicates a weak negative relationship with the change at lag 4.
2. MA Terms:
• MA1 (ma1): The coefficient -0.0288 indicates a weak negative effect of the white noise error term
at lag 1 on the current change.
• MA2 (ma2): The coefficient 0.9298 indicates a strong positive effect of the white noise error term
at lag 2 on the current change.
These coefficient values describe how past changes in the series (AR terms) and past white noise
errors (MA terms) contribute to the prediction of the current value in the ARIMA(4,1,2) model.
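To make the roles of the coefficients concrete, the sketch below evaluates the ARMA part of the model for one period. It assumes R's documented sign convention for `arima`, in which the MA terms enter with a plus sign, and it leaves out the new error term for the current period. The function name and the input values are hypothetical, chosen only to show the arithmetic.

```r
# Coefficients as reported above.
ar <- c(0.4677, -0.7858, 0.1899, -0.0397)
ma <- c(-0.0288, 0.9298)

# Hypothetical helper: predicted change in inflation, given the last four
# changes (most recent first) and the last two errors (most recent first).
#   predicted change = sum(ar * past changes) + sum(ma * past errors)
predict_change <- function(past_changes, past_errors) {
  sum(ar * past_changes) + sum(ma * past_errors)
}

# Made-up inputs: a 0.1 point rise last month, no earlier changes, no past errors.
print(predict_change(c(0.1, 0, 0, 0), c(0, 0)))
```

Adding the predicted change to the latest observed inflation rate gives the one-step forecast of the level.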
Next, we extract the residuals and plot their ACF and PACF to check for autocorrelation.
# Model's residuals
modelA$residuals
## Time Series:
## Start = 1
## End = 280
## Frequency = 1
## [1] 0.001999999 0.168388610 0.106231711 -0.169228265 0.163462857
## [6] 0.091423258 -0.079539492 0.069672098 0.010748484 -0.076702034
## [11] 0.112872641 -0.047749728 -0.005777780 0.117450514 -0.033622442
## [16] -0.126864280 -0.044403336 0.282300529 -0.103179441 -0.088181529
## [21] -0.049205227 0.132766908 0.168834743 -0.289062079 -0.063236924
## [26] 0.193158080 -0.186566498 0.024762667 -0.034709343 -0.106713831
## [31] -0.004321017 0.191191010 -0.323318733 0.075149096 -0.097282071
## [36] -0.032818830 -0.029061425 -0.210396928 0.125173917 -0.162060942
## [41] 0.110563961 -0.158039849 0.056139866 -0.156979217 -0.040010003
## [46] 0.129942499 -0.246143783 0.055368387 0.049910654 0.091873776
## [51] 0.301233106 0.014848018 -0.177374756 0.312758555 -0.120381531
## [56] -0.163005258 0.333150148 -0.030901928 0.140256184 -0.121664622
## [61] 0.135188182 0.132129968 -0.181990904 -0.121614742 0.118761642
## [66] -0.139122564 0.094153654 -0.075867761 -0.073106491 0.188198207
## [71] -0.048007173 0.021251200 -0.124471011 0.105968637 0.021173700
## [76] 0.125042956 -0.013601681 0.193728899 0.065261300 0.021122878
## [81] 0.037734496 -0.197764747 0.016338704 0.058911382 0.049871394
## [86] -0.089053463 -0.174301960 -0.047663798 0.001030773 -0.028065426
## [91] -0.050307702 -0.064304933 0.087717202 0.083723201 -0.006926373
## [96] 0.049795856 0.120686080 -0.226026843 0.138395607 -0.104821911
## [101] 0.035618440 0.092962981 0.045747280 -0.057278516 0.015409830
## [106] -0.261314617 -0.077553078 -0.101469783 -0.037459897 0.108935950
## [111] -0.057354932 0.086691631 -0.113913650 -0.054552533 -0.146454417
## [116] -0.015565412 0.140346264 0.127162379 -0.130748126 0.112207499
## [121] -0.155986487 -0.228741869 -0.097384027 -0.090369750 0.073352357
## [126] -0.044966717 -0.039458076 0.032728972 -0.062371578 -0.185451937
## [131] 0.267619404 -0.051591192 0.140861469 0.012549702 0.087727572
## [136] 0.084683007 0.141632435 -0.004642961 0.163553186 0.160049713
## [141] -0.094890896 0.071615319 0.113479189 -0.023568077 0.053402903
## [146] -0.138340543 0.175694409 -0.010665375 -0.062118784 -0.114832718
## [151] 0.005198552 -0.124891555 0.125527168 -0.069185398 -0.106111943
## [156] 0.081111555 0.026383998 0.044333461 -0.174000344 -0.120870786
## [161] 0.154276899 -0.117380518 0.033961390 0.076816339 -0.078567319
## [166] 0.028708496 -0.019723603 -0.004296646 -0.085757542 0.048300398
## [171] 0.102542928 0.030259017 0.133365340 -0.158251201 0.060365651
## [176] -0.163715477 0.059644204 0.092798749 -0.161577928 -0.073520672
## [181] 0.097317036 0.115537858 -0.018914418 -0.080131014 -0.025127096
## [186] 0.205535000 -0.092107438 -0.096178518 0.159910763 0.051223934
## [191] 0.031372897 -0.012486595 0.106253581 0.127484505 -0.178334328
## [196] -0.113325294 0.215724745 0.009180670 -0.106712741 0.065432495
## [201] -0.041698303 -0.036683910 -0.013085035 0.078113281 0.082658750
## [206] -0.142413771 -0.174587001 0.027336503 -0.124315131 0.019978348
## [211] -0.029957476 0.014571482 0.020327657 0.087036291 -0.163168903
## [216] 0.139736485 0.011385637 -0.028052074 0.265644255 -0.102620986
## [221] 0.085803135 0.094139152 0.066656737 -0.272792280 0.087286270
## [226] -0.016034575 0.107107031 -0.115309438 -0.005329633 -0.015904175
## [231] -0.044757951 0.081690446 -0.162398392 0.159746965 0.107272074
## [236] 0.109328410 -0.134519950 -0.063374420 0.136014365 -0.007801017
## [241] -0.107696926 0.100183821 -0.243756516 -0.581262200 0.082590738
## [246] 0.147236737 0.291295576 -0.205433962 -0.017138510 0.093134541
## [251] 0.062284324 -0.159412140 -0.243503996 0.130786130 0.419774010
## [256] 1.109553884 0.033598031 0.334351319 -0.174322435 -0.068580410
## [261] 0.042108077 0.495020165 0.043480726 0.460247759 0.313979702
## [266] 0.185595388 -0.082798488 -0.278506224 0.031772382 0.014561066
## [271] -0.078569845 0.331684057 0.206549572 -0.432414532 -0.204382631
## [276] -0.040338584 0.062329736 -0.204620691 0.049407129 -0.026607659
We plot the ACF and Partial ACF of the residuals to check for autocorrelation.
# Check whether the specification of the best ARIMA model essentially eliminated the auto correlation?
# If all the values are within the 95% confidence interval, the problem of auto correlation of errors ha
acf(modelA$residuals)
[Figure: ACF of Series modelA$residuals. y-axis: ACF (−0.2 to 1.0); x-axis: Lag (0 to 20)]
pacf(modelA$residuals)
[Figure: Partial ACF of Series modelA$residuals. y-axis: Partial ACF (−0.3 to 0.2); x-axis: Lag (5 to 20)]
As the graphs show, almost all of the values lie within the 95% confidence interval, so the problem of
autocorrelation in the errors has largely been eliminated. However, a few lags still fall outside the bounds,
so the model could be improved further.
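A formal complement to reading the ACF plot is the Ljung-Box test, which is not part of the original analysis. The sketch below applies base R's `Box.test` to the residuals of an ARIMA(1,1,0) fitted to simulated data; the series, the model order, and the choice of 10 lags are assumptions made for illustration. `fitdf` is set to the number of estimated ARMA coefficients so that the degrees of freedom are adjusted.

```r
# Simulated ARIMA(1,1,0) series standing in for the inflation data.
set.seed(2023)
x <- cumsum(arima.sim(model = list(ar = 0.4), n = 279))

# Fit the same kind of model with base R.
fit <- arima(x, order = c(1, 1, 0))

# Ljung-Box test on the residuals; H0: no autocorrelation up to lag 10.
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
print(lb)
```

A large p-value would mean there is no evidence of remaining autocorrelation. For the ARIMA(4,1,2) model in this report, the corresponding call would use `modelA$residuals` and `fitdf = 6`.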
e. Use the results from part (d) to forecast the inflation rates for the next two
months.
## Point Forecast Lo 95 Hi 95
## 281 5.406332 5.097076 5.715588
## 282 5.439327 4.897414 5.981241
[Figure: Forecasts from ARIMA(4,1,2). y-axis ticks: 1 to 6]
The above result is the forecast of inflation for the next 2 periods (281 and 282) from the ARIMA(4,1,2) model.
The output reports the 95% prediction interval for each period.
The results can be interpreted as follows:
1. Point Forecast:
• For period 281, the point forecast for inflation is approximately 5.4063.
• For period 282, the point forecast for inflation is approximately 5.4393.
The “Point Forecast” provides the estimated values for inflation in the two future periods based on the
model.
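2. Lower Bound (Lo 95):
• For period 281, the lower bound of the 95% prediction interval for inflation is approximately
5.0971.
• For period 282, the lower bound of the 95% prediction interval for inflation is approximately
4.8974.
The “Lo 95” column represents the lower limit of the 95% prediction interval. Under the model, there is
only about a 2.5% probability that the actual inflation value for a period will fall below this bound.
3. Upper Bound (Hi 95):
• For period 281, the upper bound of the 95% prediction interval for inflation is approximately
5.7156.
• For period 282, the upper bound of the 95% prediction interval for inflation is approximately
5.9812.
The “Hi 95” column represents the upper limit of the 95% prediction interval. Under the model, there is
only about a 2.5% probability that the actual inflation value for a period will fall above this bound.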
In summary, the forecasted inflation values for the next two periods are approximately 5.4063 and 5.4393,
with corresponding 95% prediction intervals that provide a range of likely values. The prediction intervals
are useful for assessing the uncertainty associated with the forecasts, and they indicate the range within
which the actual inflation values are likely to fall with a 95% confidence level.
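The prediction intervals can be unpacked with a little arithmetic. The sketch below assumes the intervals are the usual normal-based ones, that is, point forecast plus or minus z times the forecast standard error with z = qnorm(0.975), and backs out the implied standard errors from the forecast table above. The data frame and its column names are illustrative.

```r
# Forecast table as printed above for the ARIMA(4,1,2) model.
fc <- data.frame(
  period = c(281, 282),
  point  = c(5.406332, 5.439327),
  lo95   = c(5.097076, 4.897414),
  hi95   = c(5.715588, 5.981241)
)

# Assumption: a 95% interval is point +/- z * se with z from the normal distribution.
z <- qnorm(0.975)

# Implied forecast standard error for each horizon.
fc$se <- (fc$hi95 - fc$lo95) / (2 * z)

# Under that assumption the interval is symmetric around the point forecast.
fc$lower_gap <- fc$point - fc$lo95
fc$upper_gap <- fc$hi95 - fc$point

print(fc)
```

The implied standard error is larger for period 282 than for period 281, which is why the second interval is wider: uncertainty accumulates as the forecast horizon grows.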
Check the best Arima model using BIC and compare with the best model using AIC
##
## Fitting models using approximations to speed things up...
##
## ARIMA(2,1,2) with drift : -200.6824
## ARIMA(0,1,0) with drift : -136.414
## ARIMA(1,1,0) with drift : -186.9305
## ARIMA(0,1,1) with drift : -177.0251
## ARIMA(0,1,0) : -140.7595
## ARIMA(1,1,2) with drift : -185.4806
## ARIMA(2,1,1) with drift : -176.504
## ARIMA(3,1,2) with drift : -198.8033
## ARIMA(2,1,3) with drift : -201.6534
## ARIMA(1,1,3) with drift : -180.51
## ARIMA(3,1,3) with drift : -195.4525
## ARIMA(2,1,4) with drift : -196.0295
## ARIMA(1,1,4) with drift : -188.0333
## ARIMA(3,1,4) with drift : -190.864
## ARIMA(2,1,3) : -206.3523
## ARIMA(1,1,3) : -185.6745
## ARIMA(2,1,2) : -205.4202
## ARIMA(3,1,3) : -199.8242
## ARIMA(2,1,4) : -200.7217
## ARIMA(1,1,2) : -190.676
## ARIMA(1,1,4) : -193.0171
## ARIMA(3,1,2) : -203.5594
## ARIMA(3,1,4) : -194.5848
##
## Now re-fitting the best model(s) without approximations...
##
## ARIMA(2,1,3) : -208.8781
##
## Best model: ARIMA(2,1,3)
## Point Forecast Lo 95 Hi 95
## 281 5.405174 5.096350 5.713997
## 282 5.440815 4.900573 5.981057
1. Model Selection: The code uses the “auto.arima” function to automatically select the best ARIMA
model based on the Bayesian Information Criterion (BIC). The output shows several candidate models
with their corresponding BIC values. The model with the lowest BIC value is considered the best.
In this case, the best model selected is ARIMA(2,1,3) without drift, and its BIC value is -208.8781.
2. Forecast:
We then forecast the next two periods (periods 281 and 282).
• For period 281, the point forecast for inflation is approximately 5.4052, with a 95% prediction interval
ranging from approximately 5.0964 (Lo 95) to 5.7140 (Hi 95). This means there is a 95% confidence
that the actual inflation value for period 281 will fall within this interval.
• For period 282, the point forecast for inflation is approximately 5.4408, with a 95% prediction interval
ranging from approximately 4.9006 (Lo 95) to 5.9811 (Hi 95).
These results provide you with a forecast of inflation for the next two periods based on the selected
ARIMA(2,1,3) model. The point forecasts give you the estimated values, while the prediction intervals
indicate the range of likely values with a 95% confidence level.
Comparison of the AIC and BIC results.
Both the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) are used to assess
the goodness of fit of different models and help select the best model. The primary difference between them
is how they balance goodness of fit and model complexity:
AIC (Akaike Information Criterion):
AIC penalizes models for complexity, but less strictly than BIC: each additional parameter adds 2 to the
criterion. A lower AIC value indicates a better model. AIC tends to favor models that fit the data well even
if they have more parameters.
BIC (Bayesian Information Criterion):
BIC penalizes complex models more strictly than AIC: each additional parameter adds log(n) to the criterion,
where n is the number of observations, and log(n) exceeds 2 for any series with 8 or more observations. A lower
BIC value indicates a better model, with a stronger preference for simpler models.
The AIC and BIC results for the model selection compare as follows:
AIC result for the best model (ARIMA(4,1,2)): -228.8365
BIC result for the best model (ARIMA(2,1,3)): -208.8781
Both values are negative, but they are measured on different scales, so the AIC of one model cannot be
compared directly with the BIC of another. What matters is which model each criterion ranks first.
Interpretation:
The AIC selects the ARIMA(4,1,2) model, which has six AR and MA coefficients. The BIC, which puts more
emphasis on simplicity, selects the ARIMA(2,1,3) model, which has five.
In summary, the AIC and BIC criteria provide slightly different perspectives on model selection. In this
specific case, the AIC favors the ARIMA(4,1,2) model, while the BIC, being more cautious about model
complexity, selects the slightly smaller ARIMA(2,1,3) model. The two models give almost identical forecasts
(5.4063 and 5.4393 against 5.4052 and 5.4408), so the choice between them makes little practical difference
here and can depend on the specific goals and trade-offs of the analysis.
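The idea of ranking candidate models by each criterion can be sketched in base R with a small exhaustive grid. This is an illustration only: it is not the stepwise search that `auto.arima` performs, the series is simulated, and the grid limits and helper name are assumptions.

```r
# Simulated ARIMA(1,1,0) series standing in for the inflation data.
set.seed(42)
x <- cumsum(arima.sim(model = list(ar = 0.5), n = 300))

# Candidate orders ARIMA(p,1,q) for p, q in 0..2 (grid limits are arbitrary).
orders <- expand.grid(p = 0:2, q = 0:2)

# Hypothetical helper: fit one candidate and return its AIC and BIC,
# or NA for both if the fit fails.
score <- function(p, q) {
  fit <- tryCatch(arima(x, order = c(p, 1, q)), error = function(e) NULL)
  if (is.null(fit)) return(c(aic = NA_real_, bic = NA_real_))
  c(aic = AIC(fit), bic = BIC(fit))
}

# One row per candidate, with columns p, q, aic and bic.
grid <- cbind(orders, t(mapply(score, orders$p, orders$q)))

# Each criterion picks the row with its own lowest value.
print(grid[which.min(grid$aic), ])
print(grid[which.min(grid$bic), ])
```

Because the BIC penalty per parameter is larger, the BIC winner never has more parameters than the AIC winner when both are chosen from the same set of fits.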