MIT BA 12 - Time Series Session Notes
Types of data :
1. String
2. Integer
3. Float
4. Character
5. Structured & Unstructured
6. Variables
7. Date and Time
8. Qualitative Data
9. Quantitative Data
10. Binary
11. Discrete
12. Continuous
13. Finite and Infinite data
14. Time Series Data
15. Cross Sectional Data
16. Panel Data
Time Series Data : Focuses on how a single
variable changes over time.
Eg : Temperature, Sales of a particular product,
Stock price of a particular stock, Weather etc
Cross Sectional Data : Compares different
entities at a single point in time.
Eg : A survey that takes responses from different
people about their income, expenditure etc.
on the same day, political polls, fashion
trends, population data taken on a particular
day.
Panel Data : Combines aspects of cross-sectional
data and time series data by observing multiple
entities over multiple periods in time. It
analyses how different subjects change across
different time periods.
Eg : Research data, academic performance of
various students over various years, income of
various individuals over various periods of time,
census data taken at different time
intervals, appraisal data of various individuals.
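The difference between the three data types is mainly in how the observations are indexed. The sketch below uses made-up names and values (all hypothetical) only to show the shape of each type.

```python
# Hypothetical, made-up values used only to illustrate the shape of each type.

# Time series: one entity, one variable, many time periods.
time_series = {"2019": 100, "2020": 110, "2021": 125}

# Cross-sectional: many entities, one point in time.
cross_section = {"Asha": 52000, "Ben": 61000, "Chen": 47000}

# Panel: many entities, each observed over many time periods.
panel = {
    ("Asha", "2019"): 50000, ("Asha", "2020"): 52000,
    ("Ben", "2019"): 58000, ("Ben", "2020"): 61000,
    ("Chen", "2019"): 45000, ("Chen", "2020"): 47000,
}

# A panel is indexed by both entity and time period.
entities = sorted({entity for entity, _ in panel})
periods = sorted({period for _, period in panel})
```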
Naïve Methods
In the Grayson food data we can see a long-term
trend. When a trend is present in the data, it is
difficult to get accurate forecasts using a
Simple Moving Average (SMA), because the average
lags behind the trend. Hence, we use Naïve
methods to produce the forecasts.
1. Simple Naïve Method (SNM)
- Moving average of order 1
- The forecast for the next period is the
most recent observed value:
Forecast(t) = Y(t-1)
2. Trend Naïve Method
- Trends can vary over different intervals of
time, that is yearly, monthly, quarterly,
daily etc.
- Forecast(t) = SNM forecast + most recent change
= Y(t-1) + (Y(t-1) - Y(t-2))
3. Seasonal Naïve Method
- There can be a strong correlation between the
same periods on a monthly or yearly basis.
- If you have monthly data, then each month is
expected to show a pattern similar to the
same month in the previous year:
Forecast(t) = Y(t-m), where m is the number of
periods in a season (m = 12 for monthly data)
E.g.: If there is a hike in sales in July 2019,
then the same hike in sales can be
observed in July 2020 as well.
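The three naïve methods above can be sketched in a few lines. This is a minimal illustration, not the code used in the session: the function names, the made-up `sales` list, and the season length of 3 are all assumptions.

```python
def simple_naive(series):
    """Forecast the next value as the most recent observation."""
    return series[-1]


def trend_naive(series):
    """Forecast the next value as the most recent value plus the most recent change."""
    return series[-1] + (series[-1] - series[-2])


def seasonal_naive(series, season_length):
    """Forecast the next value as the value observed one full season earlier."""
    return series[-season_length]


# Made-up data; a season length of 3 periods is assumed for illustration.
sales = [10, 12, 15, 11, 13, 17]

next_simple = simple_naive(sales)         # 17
next_trend = trend_naive(sales)           # 17 + (17 - 13) = 21
next_seasonal = seasonal_naive(sales, 3)  # 11
```

For monthly data with a yearly pattern, `season_length` would be 12, so the forecast for July 2020 is the July 2019 value.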
MIT BA 12
Time Series Analysis – Session 4
Error Analysis:
Method of Mean Square Error :
- Find the error: (Actual value – Predicted
value)
- Find the squared error: (Actual value –
Predicted value)^2
- Take the mean of the squared errors
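The three steps of the Mean Square Error calculation map directly to code. The sketch below uses made-up actual and predicted values; the function name is an assumption.

```python
def mean_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    # Step 1: error = actual - predicted
    errors = [a - p for a, p in zip(actual, predicted)]
    # Step 2: square each error
    squared_errors = [e ** 2 for e in errors]
    # Step 3: mean of the squared errors
    return sum(squared_errors) / len(squared_errors)


# Made-up values for illustration.
actual = [10, 12, 14]
predicted = [11, 12, 12]
mse = mean_squared_error(actual, predicted)  # errors -1, 0, 2 -> (1 + 0 + 4) / 3
```

Because the errors are squared, large misses are penalised more heavily than small ones, and positive and negative errors do not cancel out.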
Problem Statement :
- Use the sales (2010-19) data
- Check for null values
- Use forward fill or backward fill to
fill in the null values
- Plot the data and check for long term
trends
- Forecast the data for the next day using
the method of SNM, Trend Naïve and
Seasonal Naïve.
- Perform the error analysis using MAPE
(Mean Absolute Percentage Error) and MSE
and find out which forecast is the most
accurate.
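The null-handling and MAPE steps of the problem statement can be sketched in plain Python. The sales data is not reproduced in these notes, so the values below are made up and the function names are assumptions. With pandas, the same filling is usually done with the `ffill()` and `bfill()` methods.

```python
def forward_fill(values):
    """Replace each None with the last non-null value seen before it."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def backward_fill(values):
    """Replace each None with the next non-null value that follows it."""
    return forward_fill(values[::-1])[::-1]


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; assumes no actual value is 0."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Made-up series with missing values.
sales = [None, 100, None, 120, 130]
cleaned = backward_fill(forward_fill(sales))  # a leading None needs the backward fill

# Evaluate the simple naïve method: each value is predicted by the previous one.
actual = cleaned[1:]
predicted = cleaned[:-1]
snm_mape = mape(actual, predicted)
```

Computing the same error measure for each of the three forecasts and choosing the lowest value is one way to decide which forecast is the most accurate.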
Time Series Decomposition
Time Series data can be affected by:
- Trend
- Seasonality
- Irregularity
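1. Additive Model
Y = T + S + C + I
(T = Trend, S = Seasonality, C = Cyclical
component, I = Irregularity)
2. Multiplicative Model
Y = T * S * C * I
(the forecasting steps below use the simplified
form Y = Trend * Seasonality)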
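To see how the two models differ, the sketch below combines the same made-up trend with an additive seasonal effect and with a multiplicative seasonal index. All values are hypothetical, and only the trend and seasonal components are included.

```python
# Made-up components for illustration.
trend = [100, 110, 120, 130]
seasonal_effect = [5, -5, 5, -5]         # additive: expressed in units of Y
seasonal_index = [1.05, 0.95, 1.05, 0.95]  # multiplicative: ratio around 1

# Additive model: Y = Trend + Seasonality
additive = [t + s for t, s in zip(trend, seasonal_effect)]

# Multiplicative model: Y = Trend * Seasonality
multiplicative = [t * s for t, s in zip(trend, seasonal_index)]

# Size of the seasonal swing at each period under each model.
additive_swing = [abs(y - t) for y, t in zip(additive, trend)]
multiplicative_swing = [abs(y - t) for y, t in zip(multiplicative, trend)]
```

In the additive model the seasonal swing stays the same size as the trend rises, while in the multiplicative model it grows in proportion to the trend.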
Steps to be followed for forecasting Sales Data:
1. Moving Average
2. Centered Moving Average (CMA)
3. Residual and Seasonal Component
4. Seasonal variation
5. Deseasonalize the data
- Remove the seasonal variation from the
original data
6. Trend
- Ytrend = Intercept + Slope * Time Period
- This is the straight line Y = mX + C
(m = Slope, C = Intercept)
7. Multiplicative model
- Y = Trend * Seasonality
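The steps above can be sketched end to end in plain Python. This is one possible reading of the procedure, not the exact worksheet from the session. It assumes quarterly data (an even period of 4) and made-up sales values, and all function names are hypothetical. Rescaling the seasonal indices so they average to 1 is a common convention added here as an assumption.

```python
from statistics import mean


def centered_moving_average(y, period):
    """Steps 1-2: moving average of length `period`, then centered (even period assumed)."""
    ma = [mean(y[i:i + period]) for i in range(len(y) - period + 1)]
    # Averaging two consecutive moving averages centers the value on a real period.
    cma = [mean(ma[i:i + 2]) for i in range(len(ma) - 1)]
    offset = period // 2  # cma[k] lines up with y[k + offset]
    return cma, offset


def seasonal_indices(y, period):
    """Steps 3-4: ratio Y / CMA per observation, averaged per season."""
    cma, offset = centered_moving_average(y, period)
    ratios = {s: [] for s in range(period)}
    for k, c in enumerate(cma):
        t = k + offset
        ratios[t % period].append(y[t] / c)
    raw = [mean(ratios[s]) for s in range(period)]
    scale = period / sum(raw)  # assumption: rescale so the indices average to 1
    return [r * scale for r in raw]


def linear_trend(values):
    """Step 6: least-squares line through time periods 1, 2, ..., n."""
    x = list(range(1, len(values) + 1))
    x_bar, y_bar = mean(x), mean(values)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, values))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    return intercept, slope


def forecast(y, period, steps):
    """Steps 5-7: deseasonalize, fit the trend, then Y = Trend * Seasonality."""
    idx = seasonal_indices(y, period)
    deseasonalized = [v / idx[i % period] for i, v in enumerate(y)]
    intercept, slope = linear_trend(deseasonalized)
    result = []
    for h in range(1, steps + 1):
        t = len(y) + h  # 1-based time period being forecast
        trend = intercept + slope * t
        result.append(trend * idx[(t - 1) % period])
    return result


# Made-up quarterly sales for three years.
sales = [88, 144, 130, 140, 120, 192, 170, 180, 152, 240, 210, 220]
indices = seasonal_indices(sales, period=4)
next_year = forecast(sales, period=4, steps=4)
```

With monthly data the period would be 12. The series needs to cover enough full seasons for every season to receive at least one Y / CMA ratio.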