Practical 9 - Time-Series Forecasting
# The dataset consists of monthly totals of international airline passengers, 1949 to 1960.
# The main aim is to predict the next ten years.
# Each observation is the total number of passengers (in thousands) for one month.
AirPassengers
View(AirPassengers)
# class() tells us whether the data series is stored in a time series format.
class(AirPassengers) # "ts": AirPassengers is a time series object in R.
str(AirPassengers)
start(AirPassengers)
end(AirPassengers)
# The start() and end() functions give the times of the first and last observations of a
# time series object. For AirPassengers these are January 1949 and December 1960.
frequency(AirPassengers)
# frequency() returns the number of observations per unit of time in a time series object.
# Here the unit is one year and the data are monthly, so one cycle has 12 observations.
summary(AirPassengers)
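The attributes inspected above can be cross-checked against each other. The following is a minimal sketch, assuming only base R and the built-in dataset; the names `ap`, `n_obs`, `n_years` and `first_year` are illustrative and not part of the practical.

```r
# Cross-check the time-series attributes of the built-in AirPassengers data.
ap <- AirPassengers                        # 'ap' is just a local alias

n_obs   <- length(ap)                      # total number of monthly observations
n_years <- end(ap)[1] - start(ap)[1] + 1   # calendar years covered

# The number of observations should equal years times observations per year.
stopifnot(n_obs == n_years * frequency(ap))

# tsp() returns the start time, end time and frequency in a single vector.
print(tsp(ap))

# cycle() gives the position of each observation within the year (1 = January).
print(head(cycle(ap), 12))

# window() extracts a sub-period, here the first year only.
first_year <- window(ap, start = c(1949, 1), end = c(1949, 12))
print(first_year)
```

If the `stopifnot()` check passes silently, the series has no missing months relative to its declared start, end and frequency.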
#Exploring the data
# Visualization of data
plot(AirPassengers)
This command plots the AirPassengers dataset as a time series graph.
The x-axis represents time (years from 1949 to 1960).
The y-axis represents the number of airline passengers (in thousands).
The plot shows three features:
Upward trend: The number of passengers increases over time.
Seasonality: Recurring peaks within each year indicate seasonal patterns in air travel.
Increasing variability: The seasonal fluctuations become larger as the number of
passengers increases.
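The three features read off the plot can also be checked numerically. This is a rough sketch using base R only; the names `yearly_mean`, `yearly_sd` and `yearly` are illustrative choices, and the per-year standard deviation is used here as one simple (assumed) measure of variability.

```r
ap <- AirPassengers

# Seasonality: distribution of passenger counts for each calendar month.
boxplot(ap ~ cycle(ap), names = month.abb,
        xlab = "Month", ylab = "Passengers (thousands)")

# Trend: mean number of passengers per year.
yearly_mean <- aggregate(ap, FUN = mean)

# Variability: standard deviation of the 12 monthly values within each year.
yearly_sd <- aggregate(ap, FUN = sd)

# Collect both into one table, one row per year.
yearly <- data.frame(year = as.numeric(time(yearly_mean)),
                     mean = as.numeric(yearly_mean),
                     sd   = as.numeric(yearly_sd))
print(yearly)
```

A `mean` column that rises from year to year corresponds to the upward trend, and an `sd` column that rises with it corresponds to the growing fluctuations.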
# The abline() function can be used to add vertical, horizontal or regression lines to a plot.
abline(reg=lm(AirPassengers~time(AirPassengers)))
Adds a regression (trend) line to the existing plot(AirPassengers).
Uses lm(AirPassengers ~ time(AirPassengers)) to fit a linear model where:
AirPassengers is the dependent variable.
time(AirPassengers) provides the time index as the independent variable.
abline() then draws the fitted regression line on the plot.
This trend line helps visualize the overall growth in airline passengers over
time.
An upward-sloping line confirms an increasing trend in air travel.
Since AirPassengers exhibits seasonality and roughly exponential growth, a
simple linear trend may not fit the data well. For a better fit, consider a log
transformation or exponential smoothing models.
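As an illustration of the log-transformation suggestion, the trend line can be fitted on the log scale instead. This is a sketch under the assumption that growth is approximately exponential; `yr`, `fit_log` and `growth` are illustrative names, not part of the practical.

```r
ap <- AirPassengers
yr <- time(ap)                      # time index in (fractional) years

# Fit a straight line to the logged series: log(passengers) = a + b * year.
fit_log <- lm(log(ap) ~ yr)

# Plot the logged series with its fitted trend line.
plot(log(ap), ylab = "log(passengers)")
abline(fit_log, col = "red")

# On the log scale the slope b is a growth rate per year;
# exp(b) - 1 converts it to a proportional increase per year.
growth <- exp(coef(fit_log)[["yr"]]) - 1
print(growth)
```

On the log scale the seasonal swings should look more even in size from year to year, which is the variance-stabilizing effect the transformation is meant to have.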
pacf(diff(log(AirPassengers))) #p=0
# In the PACF plot the first lag lies outside the significance limits, and the second is
# also outside but only marginally, so p = 0 is chosen as the AR order here.
The command pacf(diff(log(AirPassengers))) in R is used to compute and
plot the Partial Autocorrelation Function (PACF) of the differenced
logarithm of the AirPassengers dataset. Here’s a breakdown of the steps:
1. log(AirPassengers)
   - Takes the natural logarithm of the monthly international airline
     passenger numbers (1949 to 1960) to stabilize the variance (it reduces
     the effect of exponential growth).
2. diff(log(AirPassengers))
   - Computes the first difference, which removes the trend and brings the
     series closer to stationarity.
3. pacf(diff(log(AirPassengers)))
   - Plots the PACF, which shows the correlation of the series with each of
     its lags after controlling for the intermediate lags.
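The three steps above can be sketched one at a time, with the PACF values printed instead of plotted. This is a minimal sketch using base R; the names `log_ap`, `dlog_ap`, `p`, `bound` and `result` are illustrative, and the bound assumes the usual approximate 95% limits drawn as dashed lines on the plot.

```r
ap <- AirPassengers

log_ap  <- log(ap)          # step 1: stabilize the variance
dlog_ap <- diff(log_ap)     # step 2: first difference to remove the trend

# Step 3: compute the PACF without drawing it.
p <- pacf(dlog_ap, plot = FALSE)

# Approximate 95% significance limits, based on the number of observations used.
bound <- qnorm(0.975) / sqrt(p$n.used)

# Lags are stored in years for a monthly series, so convert them to months.
result <- data.frame(lag_months = round(as.numeric(p$lag) * frequency(ap)),
                     pacf       = as.numeric(p$acf),
                     outside    = abs(as.numeric(p$acf)) > bound)
print(head(result, 12))
```

Rows with `outside` equal to `TRUE` are the lags whose bars cross the dashed lines in the plotted version, which is what the choice of p is read from.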