CP2403 - Module 10
Time Series
Summary of Module 9
• Regression when the explanatory variable is categorical
• Logistic Regression
Learning outcome – Module 10
By the end of Module 10, you should be able to perform basic time
series analysis
Topics covered in Week 10
• Time Series Analysis
Time Series
• A collection of data points recorded at constant time intervals
• Time dependent
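A minimal sketch of what "constant time intervals" means in practice, assuming pandas is available. The daily readings are made-up illustration values, not data from this module:

```python
import pandas as pd

# Hypothetical daily readings; the values are made up for illustration.
index = pd.date_range("2024-01-01", periods=7, freq="D")
series = pd.Series([12.1, 12.4, 11.9, 12.8, 13.0, 12.7, 13.3], index=index)

# Constant time intervals: every gap between consecutive timestamps is equal.
gaps = series.index.to_series().diff().dropna()
print(gaps.unique())

# pandas can infer the sampling frequency from a regularly spaced index.
print(pd.infer_freq(series.index))
```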
Example: Climate Data
Example: Google Search for ‘Diet’
Shows a seasonal pattern
Time Series Analysis
1. Visualize the time series
2. Stationarize the series
3. Plot ACF/PACF charts & find optimal parameters
4. Build the ARIMA model
5. Make predictions
1. Visualize the Time Series
• Python Example
Some inferences
• The year-on-year trend clearly shows that the number of passengers has
been increasing without fail.
• The variance and the mean value in July and August are much higher
than in the rest of the months.
• Even though the mean value of each month is quite different, the
variance within each month is small. Hence, there is a strong seasonal
effect with a cycle of 12 months or less.
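Inferences like these can be checked numerically by grouping the series by year and by month. The sketch below uses a synthetic monthly series (an assumption for illustration, not the passenger data shown in the lecture) with an upward trend and a mid-year peak:

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (NOT the lecture's dataset): upward trend plus a
# 12-month seasonal cycle that peaks mid-year, plus a little noise.
rng = np.random.default_rng(0)
index = pd.date_range("2000-01-01", periods=120, freq="MS")
t = np.arange(120)
seasonal = 30 * np.sin(2 * np.pi * (t % 12 - 3) / 12)
values = 100 + 2.0 * t + seasonal + rng.normal(0, 3, 120)
series = pd.Series(values, index=index, name="passengers")

# Year-on-year trend: the mean of each calendar year.
yearly_mean = series.groupby(series.index.year).mean()

# Seasonality: mean and standard deviation for each month of the year.
# Note: the std here also reflects the trend across years.
monthly_stats = series.groupby(series.index.month).agg(["mean", "std"])

print(yearly_mean.round(1))
print(monthly_stats.round(1))
```

Calling `series.plot()` (with matplotlib installed) gives the corresponding line chart.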
2. Stationarize Time Series
• Before stationarizing a series, check whether it is already stationary.
• Criteria: these 3 properties should not be a function of time (i.e.
they should not change over time)
• Mean
• Variance
• Covariance
• If the time series is not stationary, it cannot be modeled directly; it
has to be made stationary first
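Written as formulas, the three criteria are the conditions usually called weak (covariance) stationarity. The symbols y_t, μ, σ² and γ_k are notation introduced here for illustration, not taken from the slides:

```latex
\begin{aligned}
\mathbb{E}[y_t] &= \mu && \text{for all } t \\
\operatorname{Var}(y_t) &= \sigma^2 && \text{for all } t \\
\operatorname{Cov}(y_t, y_{t+k}) &= \gamma_k && \text{for all } t \text{ (depends only on the lag } k\text{)}
\end{aligned}
```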
Stationary Series - Mean
Green – mean is the same over time
Red – mean is not the same over time
Stationary Series - Variance
Green – variance is the same over time
Red – variance changes over time
Stationary Series - Covariance
Green – covariance is the same over time
Red – covariance changes over time
Checking whether a time series is stationary
• Calculate the rolling mean
• Calculate the rolling variance/standard deviation
• Dickey-Fuller test of stationarity
Python example
2. Stationarize Time Series - Decomposing
• Model both trend and seasonality, then remove them from the
series.
Python example
• Convert the series to log scale
• Decompose it into trend, seasonal and residual components
• Remove the trend and seasonality
• Use only the residual
Correlation of Two Time Series
• Lag-one autocorrelation: the correlation between a series and a copy of
itself shifted by one time step
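A short sketch of lag-one autocorrelation computed by correlating a series with a shifted copy of itself. Both series are synthetic and the helper name `lag_autocorr` is hypothetical; pandas offers the same calculation as `Series.autocorr`:

```python
import numpy as np
import pandas as pd

# Synthetic series for illustration: white noise and its running sum.
rng = np.random.default_rng(3)
noise = pd.Series(rng.normal(0, 1, 500))
walk = noise.cumsum()


def lag_autocorr(series, lag=1):
    """Pearson correlation between the series and itself shifted by `lag`."""
    return series.corr(series.shift(lag))


r_noise = lag_autocorr(noise)  # consecutive values are unrelated
r_walk = lag_autocorr(walk)    # each value is close to the previous one

print("noise:", round(r_noise, 3))
print("walk: ", round(r_walk, 3))
```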
3. Plot ACF & PACF charts & find optimal parameters
• Autocorrelation
• measure of the internal correlation within a time series
• way of measuring and explaining internal association between observations in
a time series
• ACF -> Autocorrelation function
• PACF -> Partial autocorrelation function
The two dotted lines on either side of 0 mark the confidence interval.
p = the lag at which the PACF chart first crosses the upper confidence bound. In
this case, p = 2.
q = the lag at which the ACF chart first crosses the upper confidence bound. In
this case, q = 2.
4. Build ARIMA model
ARIMA - Auto-Regressive Integrated Moving Average
5. Make predictions
Summary of Module 10
• Time Series Analysis
Prac 10 Overview