
B. Tech CSE AIML 5th Sem
AI for Real World Application (PCC-CSM503)
Notes: Module 6

Introduction to Time Series


A time series is a sequence of data points recorded over time, often at regular intervals. These
data points represent values that vary over time and are commonly used to analyze patterns,
trends, and seasonal variations. Time series analysis is critical in various fields such as finance,
economics, weather forecasting, biology, and social sciences.

Definition:
A time series is an ordered sequence of values, typically denoted as x_t, where t represents the time index (e.g., t = 1, 2, 3, ...).

Components of Time Series:

○ Trend: The long-term movement in the data. It shows whether the series is
increasing, decreasing, or stagnant over time.
○ Seasonality: Regular, repeating patterns or cycles of behavior observed in the
series over fixed intervals (e.g., daily, monthly, or yearly).
○ Cyclic Variations: Irregular patterns in the data due to business or economic
cycles that do not have a fixed period.
○ Residual (Noise): Random variation or fluctuations that cannot be explained by
the trend, seasonality, or cyclic components.

Types of Time Series:

○ Univariate Time Series: Involves a single variable recorded over time (e.g., stock
prices).
○ Multivariate Time Series: Involves multiple variables that are often interrelated
(e.g., temperature and humidity data).

Applications of Time Series

1. Forecasting: Predicting future values based on past observations (e.g., sales forecasting,
stock price prediction).
2. Anomaly Detection: Identifying unusual patterns or outliers in the data.
3. Trend Analysis: Analyzing long-term movements to inform strategic decisions.
4. Seasonal Adjustment: Removing seasonal effects to understand the underlying trends.


Time Series Models

1. Deterministic Models: Assume the time series can be represented using mathematical
functions (e.g., linear or exponential trend models).

2. Stochastic Models: Assume the data is generated by a probabilistic process:

○ Autoregressive (AR) Models: Models that use past values of the series to predict
future values.
○ Moving Average (MA) Models: Models that use past forecast errors to improve
predictions.
○ ARMA (Autoregressive Moving Average) Models: Combines AR and MA
models.
○ ARIMA (Autoregressive Integrated Moving Average) Models: Extends
ARMA by adding differencing to handle non-stationarity.
○ Seasonal ARIMA (SARIMA): Incorporates seasonal components into ARIMA
models.
○ State-Space Models: Includes Kalman filters to model time-varying processes.
3. Machine Learning Approaches:

○ Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM): Designed to handle sequential data with dependencies over time.
○ Prophet: A forecasting tool developed by Facebook for series with strong seasonal trends.

Key Properties of Time Series

1. Stationarity: A time series is stationary if its statistical properties (mean, variance, autocorrelation) remain constant over time.

○ Tests for Stationarity: Augmented Dickey-Fuller (ADF) test, KPSS test.
○ Non-Stationarity Handling: Differencing, detrending, or applying transformations like logarithms.
2. Lag: The difference in time between a data point and its prior value.

3. Autocorrelation: The correlation of a time series with its own lagged values.

○ Autocorrelation Function (ACF): Measures the correlation at different lags.


○ Partial Autocorrelation Function (PACF): Measures the correlation at a specific
lag, removing effects from intermediate lags.
4. Seasonality and Periodicity: The presence of repeating patterns over a fixed interval.


Steps in Time Series Analysis

1. Data Collection: Gather and preprocess the time series data, ensuring it is complete and
free of errors.
2. Exploratory Data Analysis (EDA):
○ Plot the time series to visualize trends and patterns.
○ Decompose the series into components (trend, seasonality, and residual); see the decomposition sketch after this list.
3. Stationarity Testing: Check if the series is stationary or requires transformations.
4. Model Selection: Choose an appropriate model based on data characteristics.
5. Parameter Estimation: Estimate model parameters using techniques like maximum
likelihood or least squares.
6. Validation: Evaluate the model's performance using metrics such as Mean Absolute
Error (MAE), Root Mean Squared Error (RMSE), or Mean Absolute Percentage Error
(MAPE).
7. Forecasting: Predict future values based on the chosen model.
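For step 2, a minimal decomposition sketch with statsmodels (the monthly values below are hypothetical; seasonal_decompose needs a period argument or a datetime-indexed series):

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly data: linear trend plus a mid-year seasonal bump
idx = pd.date_range("2020-01-01", periods=36, freq="MS")
values = [i + 10 * ((i % 12) in (5, 6, 7)) for i in range(36)]
series = pd.Series(values, index=idx)

# Split into trend, seasonal, and residual components
result = seasonal_decompose(series, model="additive", period=12)
result.plot()
plt.show()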

Common Challenges

1. Seasonality and Trends: Decomposing and accurately modeling these components can
be complex.
2. Non-Stationarity: Many real-world time series are non-stationary and require
transformations.
3. High Variability: Extreme fluctuations and noise can make modeling difficult.
4. Data Gaps: Missing data points can lead to biased models.
5. Overfitting: Overly complex models may fit the noise instead of the actual signal.

Tools and Libraries for Time Series Analysis

1. Python:
○ Libraries: pandas, numpy, statsmodels, scikit-learn, prophet,
sktime, tensorflow, keras.
2. R:
○ Libraries: forecast, TSA, tseries.
3. Software:
○ MATLAB, SAS, Excel (for basic analysis).


Stationary Time Series

A stationary time series is one whose statistical properties, such as mean, variance, and
autocorrelation, remain constant over time. Stationarity is a critical concept in time series
analysis because many analytical methods and models (e.g., ARIMA) assume that the underlying
data is stationary.

Types of Stationarity

1. Strict Stationarity:

○ The joint distribution of any collection of observations is invariant to shifts in time; this strong condition is rarely testable in practice.
2. Weak (or Second-Order) Stationarity:

○ Only the first two moments (mean and variance) are constant, and the autocovariance depends only on the lag.
○ This is a less strict condition and is more commonly assumed in time series analysis.

Importance of Stationarity

1. Modeling Requirements: Many models like AR, MA, ARMA, and ARIMA assume
stationarity. For non-stationary data, transformations like differencing are required.
2. Predictability: A stationary series is easier to forecast because its statistical properties do
not change over time.
3. Simplified Analysis: With constant statistical properties, analysis and hypothesis testing
become more robust.

Testing for Stationarity

1. Visual Inspection:

○ Plot the time series: Look for trends, seasonality, or changing variance.
○ Plot the autocorrelation function (ACF): A stationary series typically has rapidly
decaying autocorrelations.
2. Statistical Tests:

○ Augmented Dickey-Fuller (ADF) Test:
■ Null Hypothesis (H_0): The series is non-stationary.
■ If the p-value is less than the significance level (e.g., 0.05), reject H_0, indicating stationarity.
○ KPSS (Kwiatkowski-Phillips-Schmidt-Shin) Test:
■ Null Hypothesis (H_0): The series is stationary.
■ A low p-value indicates non-stationarity.
○ Phillips-Perron (PP) Test: Similar to the ADF test but more robust to serial correlation and heteroscedasticity in the errors.
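Both tests are available in statsmodels; a minimal sketch, assuming series is a pandas Series holding the data:

from statsmodels.tsa.stattools import adfuller, kpss

# ADF: null hypothesis = non-stationary (small p-value suggests stationarity)
adf_stat, adf_p = adfuller(series)[:2]
print(f"ADF statistic: {adf_stat:.3f}, p-value: {adf_p:.4f}")

# KPSS: null hypothesis = stationary (small p-value suggests non-stationarity)
kpss_stat, kpss_p = kpss(series, regression="c", nlags="auto")[:2]
print(f"KPSS statistic: {kpss_stat:.3f}, p-value: {kpss_p:.4f}")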


Converting Non-Stationary Series to Stationary

If a time series is not stationary, the following transformations can be applied:

1. Differencing:

○ Subtract the previous value from the current value to remove trends: y_t = x_t - x_{t-1}
○ For seasonal effects, use seasonal differencing: y_t = x_t - x_{t-m}, where m is the seasonal period.

2. Transformation:

○ Apply logarithmic, square root, or other transformations to stabilize variance, e.g., y_t = \log(x_t).

3. Detrending:

○ Subtract a fitted trend (e.g., linear or polynomial) from the series.

4. Decomposition:

○ Separate the series into trend, seasonal, and residual components, then analyze the residuals.

5. Smoothing:

○ Apply techniques like moving averages to remove high-frequency noise and stabilize the series.
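These transformations are one-liners with pandas and NumPy; a sketch, assuming series is a pandas Series (the period m = 12 is an assumption for monthly data):

import numpy as np

# First-order differencing: y_t = x_t - x_{t-1}
diff1 = series.diff().dropna()

# Seasonal differencing: y_t = x_t - x_{t-12}
seasonal_diff = series.diff(12).dropna()

# Log transform to stabilize variance (values must be strictly positive)
log_series = np.log(series)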

Examples of Non-Stationary Time Series

1. Trending Series: Data with an upward or downward trend over time.


2. Seasonal Series: Data with periodic patterns.
3. Changing Variance: Series with heteroscedasticity (non-constant variance).

Examples of Stationary Series

1. White Noise: Random data with no discernible pattern.


2. Differenced Series: After applying differencing to a trending series, it often becomes
stationary.

Applications of Stationary Series

1. Financial Analysis:
○ Stationary returns in stock markets are analyzed rather than raw prices.
2. Econometrics:

○ GDP growth rates are often modeled as stationary series.


3. Signal Processing:
○ Analyze stationary signals for noise filtering or pattern recognition.

Smoothing Time Series

Smoothing is a technique in time series analysis used to reduce noise or fluctuations in data to reveal underlying trends, seasonal patterns, or cycles. It simplifies the data by removing random variations while retaining important features.

Why Smooth a Time Series?

1. Highlight Trends: Smoothing helps in visualizing long-term trends by eliminating short-term fluctuations.
2. Reduce Noise: Filters out random noise to make the series more interpretable.
3. Improve Forecasting: Provides a cleaner input for models, leading to more
accurate predictions.
4. Seasonality Adjustment: Facilitates the removal of seasonal effects for clearer
trend analysis.
5. Detect Anomalies: Makes it easier to identify unusual events or outliers in the
data.

Types of Smoothing Techniques


1. Moving Average (MA) Smoothing

● Definition: Averages a fixed number of consecutive data points in a sliding window.
● Formula: y_t = \frac{1}{k} \sum_{i=0}^{k-1} x_{t-i}, where k is the window size, x_t are the data points, and y_t is the smoothed value.
● Properties:
○ Reduces random fluctuations.
○ Larger k: Smoother output but greater lag in responding to trend changes.
● Variants:
○ Weighted Moving Average (WMA): Assigns more weight to recent values.
○ Exponential Moving Average (EMA): Applies exponentially decreasing weights.
2. Exponential Smoothing

● Definition: Averages data by giving exponentially higher weights to recent observations.
● Formula: y_t = \alpha x_t + (1 - \alpha) y_{t-1}, where \alpha is the smoothing constant (0 < \alpha < 1).


● Variants:
○ Simple Exponential Smoothing: Assumes no trend or seasonality.
○ Holt’s Method: Extends exponential smoothing to include trends.
○ Holt-Winters Method: Accounts for both trend and seasonality.
3. Locally Weighted Scatterplot Smoothing (LOWESS or LOESS)

● Definition: Fits a local regression model to subsets of the data.


● Characteristics:
○ Non-parametric and flexible.
○ Effective for non-linear trends.
4. Savitzky-Golay Smoothing

● Definition: Fits a low-degree polynomial to a moving window of data (see the sketch after this list).
● Advantages:
○ Preserves the shape of the data, such as peaks and valleys.
○ Suitable for high-frequency data with noise.
5. Spline Smoothing

● Definition: Uses piecewise polynomials to smooth the data while ensuring continuity.
● Types:
○ Natural splines.
○ Cubic splines.
● Use Case: Handles non-linear trends effectively.
6. Kernel Smoothing

● Definition: Uses kernel functions (e.g., Gaussian) to assign weights to data points
based on proximity.
● Advantage: Smooths data while maintaining structure.
7. Moving Median Smoothing

● Definition: Replaces each data point with the median of a window of neighboring
values.
● Advantages:
○ Robust to outliers.
○ Effective for reducing sharp spikes.
8. Fourier Transform Smoothing

● Definition: Transforms the data into the frequency domain, removes high-frequency components (noise), and applies the inverse transform to recover a smoothed series.
● Use Case: Effective for periodic data.
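As a concrete illustration of method 4 above, a brief Savitzky-Golay sketch with SciPy (the window length and polynomial order are arbitrary choices for this synthetic signal):

import numpy as np
from scipy.signal import savgol_filter

# Noisy sine wave standing in for high-frequency data
x = np.linspace(0, 4 * np.pi, 200)
noisy = np.sin(x) + np.random.normal(0, 0.2, x.size)

# Fit a cubic polynomial inside a sliding window of 21 points
smooth = savgol_filter(noisy, window_length=21, polyorder=3)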

Steps for Smoothing

1. Visualize Raw Data: Plot the data to understand its structure and variability.
2. Select a Technique: Choose a smoothing method based on the nature of the data
(e.g., linear trend, seasonal patterns).
3. Apply Smoothing:
○ Use computational tools (e.g., Python, R) to implement the technique.
○ Adjust parameters like window size or smoothing constant.
4. Evaluate the Result:
○ Compare the smoothed series to the raw series.
○ Ensure the smoothed data retains key patterns while removing noise.

Applications of Smoothing

1. Trend Analysis: Identifying long-term trends in economic, financial, or social data.
2. Seasonality Removal: Focusing on non-seasonal components for further analysis.
3. Data Preprocessing: Preparing time series for forecasting or machine learning
models.
4. Signal Processing: Filtering noise from sensor or experimental data.
5. Anomaly Detection: Highlighting significant deviations from expected patterns.

Advantages of Smoothing

1. Noise Reduction: Removes unwanted fluctuations, making trends clearer.


2. Flexibility: A variety of methods exist for different data characteristics.
3. Improved Interpretability: Simplifies complex data for easier understanding and
analysis.

Disadvantages of Smoothing

1. Lag Effect: Techniques like moving averages delay trend detection.


2. Loss of Detail: Fine-grained variations or short-term patterns may be obscured.
3. Subjectivity: Choice of smoothing method and parameters may affect results.
4. Not Universal: Some techniques may not work well for highly volatile or
complex data.

Choosing the Right Smoothing Technique

1. Based on Purpose:

○ For short-term fluctuation removal: Moving Average or Median Smoothing.
○ For long-term trend analysis: Exponential Smoothing or LOWESS.
○ For non-linear data: Splines or Kernel Smoothing.
2. Based on Data Properties:

○ Stationary Data: Simple Moving Averages.


○ Non-stationary Data: Exponential Smoothing with trend and seasonality
adjustments.

Example of Smoothing

Raw Data: Monthly sales data with irregular spikes and seasonality.

● Step 1: Use a 3-month Moving Average to smooth short-term spikes.


● Step 2: Apply Holt-Winters Exponential Smoothing to capture trends and
seasonality.
● Result: The smoothed data reveals a clear upward trend with consistent seasonal
peaks.

Implementation in Python
import pandas as pd
import matplotlib.pyplot as plt

# Example data
data = [120, 130, 115, 140, 135, 145, 150, 160, 155, 165, 170, 180]
series = pd.Series(data)

# Moving Average
window = 3
smoothed = series.rolling(window=window).mean()

# Plot
plt.plot(series, label="Original")
plt.plot(smoothed, label="Smoothed (Moving Average)", color="red")
plt.legend()
plt.show()
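Exponential smoothing is just as direct; a sketch building on the series above (alpha = 0.3 and seasonal_periods = 4 are illustrative assumptions, not tuned values):

from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Exponential Moving Average with pandas
ema = series.ewm(alpha=0.3).mean()

# Holt-Winters: additive trend and seasonality, quarterly period
hw = ExponentialSmoothing(series, trend="add", seasonal="add", seasonal_periods=4).fit()
print(hw.forecast(3))  # smoothed three-step-ahead forecast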

This approach demonstrates how smoothing can be applied to remove noise and reveal meaningful patterns in time series data.

Smoothing is a foundational technique in time series analysis, crucial for understanding the data and preparing it for further modeling and interpretation.

Autocorrelation Functions (ACF)

Autocorrelation measures the relationship between a time series and its lagged version
over successive time periods. The autocorrelation function (ACF) quantifies these
correlations and provides insights into the time series' structure, such as trends,
seasonality, or randomness.

Key Concepts

1. Lag:

○ The time difference between observations.
○ For a time series x_t, the lag-1 autocorrelation compares x_t with x_{t-1}, lag-2 compares x_t with x_{t-2}, and so on.
2. Autocorrelation:

○ The correlation of a time series with a lagged version of itself.
○ Formula for lag k (a NumPy translation of this formula follows this list):
r_k = \frac{\sum_{t=k+1}^{N} (x_t - \bar{x})(x_{t-k} - \bar{x})}{\sum_{t=1}^{N} (x_t - \bar{x})^2}
where:
■ x_t: Observation at time t.
■ \bar{x}: Mean of the time series.
■ N: Number of observations.
■ k: Lag.
3. Partial Autocorrelation:

○ Measures the correlation between x_t and x_{t-k}, removing the influence of the intermediate lags (1, 2, ..., k-1).
○ Useful for identifying direct relationships.
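The lag-k formula above translates directly into NumPy, as shown in this small sketch (note that the denominator is the full-series sum of squares, exactly as in the definition):

import numpy as np

def autocorr(x, k):
    # Sample autocorrelation r_k for lag k
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    den = np.sum((x - xbar) ** 2)
    if k == 0:
        return 1.0
    num = np.sum((x[k:] - xbar) * (x[:-k] - xbar))
    return num / den

# Example: lag-1 autocorrelation of a short series
print(autocorr([12, 14, 15, 13, 16, 18, 17, 19, 21, 20], 1))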

Autocorrelation Function (ACF)

The ACF plots the autocorrelation coefficients r_k for different lags k. It helps identify:

1. Trends: A slowly decaying ACF suggests a trend.


2. Seasonality: Peaks in the ACF at regular intervals indicate seasonality.
3. Stationarity:
○ Stationary series: ACF decays rapidly.
○ Non-stationary series: ACF decays slowly or remains high.

Partial Autocorrelation Function (PACF)

The PACF plots partial autocorrelations for different lags, showing the direct effect of lag k on x_t without mediation by intermediate lags.

● Use:
○ Identify the order of an autoregressive (AR) process.
○ Distinguish between AR and MA models in ARIMA.

Interpreting ACF and PACF


1. Stationary Series

● ACF: Decays to zero quickly (e.g., within a few lags).


● PACF: Shows significant spikes for initial lags only.
2. Non-Stationary Series

● ACF: Decays slowly or remains high.


● PACF: May show a significant spike at lag 1 but decreases after differencing.
3. Seasonal Series

● ACF: Repeating patterns at seasonal lags.


● PACF: Significant spikes at seasonal lags.
4. Pure AR Process

● ACF: Exponentially decays or oscillates.


● PACF: Significant spikes up to the order of the AR process, then cuts off.
5. Pure MA Process

● ACF: Significant spikes up to the order of the MA process, then cuts off.
● PACF: Exponentially decays or oscillates.

Significance in ACF and PACF Plots

● The horizontal bands around zero in ACF/PACF plots represent the 95% confidence interval (approximately ±1.96/√N under the white-noise assumption).
○ Spikes outside this interval indicate statistically significant correlations.
○ Random data (white noise) typically has all spikes within the confidence bounds.

Applications of ACF and PACF

1. Model Identification:

○ Determine the order of AR and MA components for ARIMA models.


○ AR order: Observed from PACF.
○ MA order: Observed from ACF.
2. Seasonality Detection:

○ Identify periodic patterns by locating regularly spaced significant spikes in the ACF.
3. Stationarity Testing:

○ A slowly decaying ACF suggests non-stationarity, requiring differencing.


4. Forecasting:

○ Use significant autocorrelations to create better predictive models.

Steps to Compute and Plot ACF/PACF

1. Compute Autocorrelation:

○ Calculate r_k for various lags using statistical software (e.g., Python, R).
2. Generate Plots:

○ Use tools like statsmodels in Python or the acf function in R to create ACF and PACF plots.
3. Interpret the Results:

○ Examine patterns in the plots to understand data characteristics.

Python Example: ACF and PACF

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Example data
data = [12, 14, 15, 13, 16, 18, 17, 19, 21, 20]
series = pd.Series(data)

# ACF and PACF plots
fig, ax = plt.subplots(2, 1, figsize=(10, 8))

# Plot ACF
plot_acf(series, ax=ax[0], lags=5, title="Autocorrelation Function")

# Plot PACF (PACF lags must stay below half the sample size)
plot_pacf(series, ax=ax[1], lags=4, title="Partial Autocorrelation Function")

plt.tight_layout()
plt.show()

ACF vs PACF

Feature | ACF | PACF
Definition | Correlation of a series with its lags. | Direct correlation after removing intermediate effects.
Use | Identify moving average (MA) order. | Identify autoregressive (AR) order.
Decay | Gradual for AR; cuts off for MA. | Gradual for MA; cuts off for AR.

Practical Considerations

1. Sample Size:

○ ACF estimates are less reliable for small datasets.


○ Larger datasets provide more accurate lag correlations.
2. Multiple Lags:

○ Use sufficient lags to capture all patterns, especially in seasonal data.


3. White Noise:

○ A purely random series has no significant autocorrelation at any lag.

Summary

● The ACF quantifies the relationship between a time series and its lags, while the
PACF measures direct correlations after accounting for intermediate lags.
● Both are essential tools for time series analysis, particularly for identifying
ARIMA model components, detecting seasonality, and understanding series
behavior.

Autoregressive Integrated Moving Average (ARIMA) Models

ARIMA models are statistical tools for analyzing and forecasting time series data. The
model combines three key components: Autoregression (AR), Differencing
(Integration, I), and Moving Averages (MA). These models are widely used for
understanding patterns in time series and making future predictions.

Key Components of ARIMA

1. Autoregressive (AR) Component:

○ The AR component models the relationship between an observation and its lagged values.
○ p: The number of lagged observations used in the model.
○ Formula: x_t = c + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + \epsilon_t
where:
■ x_t: Current value of the series.
■ \phi_i: Autoregressive coefficients.
■ c: Constant term.
■ \epsilon_t: Error term (white noise).
2. Integrated (I) Component:

○ Involves differencing the time series to achieve stationarity by removing trends or seasonality.
○ d: The number of differencing steps required.
○ First-order differencing: y_t = x_t - x_{t-1}
3. Moving Average (MA) Component:

○ The MA component captures the relationship between the observation and residual errors from past forecasts.
○ q: The number of lagged forecast errors included in the model.
○ Formula: x_t = c + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \dots + \theta_q \epsilon_{t-q}
where:
■ \theta_i: Moving average coefficients.

Model Notation

The ARIMA model is represented as ARIMA(p, d, q):

● p: Order of the AR component.
● d: Degree of differencing.
● q: Order of the MA component.

Steps to Build an ARIMA Model

1. Understand the Data:

○ Plot the time series to identify trends, seasonality, and stationarity.


2. Stationarity Testing:

○ Use visual inspection or statistical tests like the Augmented Dickey-Fuller (ADF) test to check for stationarity.
○ If non-stationary, apply differencing (d).
3. Identify AR and MA Orders:

○ Use the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF):
■ ACF: Helps identify q (MA order).
■ PACF: Helps identify p (AR order).
4. Fit the Model:

○ Estimate the model coefficients for the chosen orders (p, d, q) using statistical tools (e.g., Python, R).
5. Evaluate the Model:

○ Check residuals for randomness using diagnostic tools like:
■ Residual plots.
■ The Ljung-Box test (see the sketch after this list).
○ Ensure no significant autocorrelation remains.
6. Forecast:

○ Use the model to predict future values and evaluate accuracy using metrics like Mean Squared Error (MSE) or Mean Absolute Error (MAE).
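For the residual check in step 5, statsmodels provides the Ljung-Box test directly; a sketch, assuming model_fit is a fitted ARIMA result like the one in the example further below:

from statsmodels.stats.diagnostic import acorr_ljungbox

# Null hypothesis: residuals are independently distributed (white noise)
residuals = model_fit.resid
print(acorr_ljungbox(residuals, lags=[10]))  # large p-values support the null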

ACF and PACF Interpretation

Component | ACF Behavior | PACF Behavior
AR(p) | Gradual decay | Cuts off after p lags
MA(q) | Cuts off after q lags | Gradual decay
ARMA(p, q) | Mixed decay/cutoff behavior | Mixed decay/cutoff behavior

Extensions of ARIMA

1. Seasonal ARIMA (SARIMA):

○ Incorporates seasonality into ARIMA: SARIMA(p, d, q)(P, D, Q, s), where:
■ P, D, Q: Seasonal AR, I, and MA orders.
■ s: Seasonal period (e.g., 12 for monthly data).
2. ARIMAX:

○ Extends ARIMA by incorporating exogenous variables.
3. AutoARIMA:

○ Automatically selects the best values of p, d, and q based on criteria like AIC or BIC.
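A SARIMA fit follows the same pattern as ARIMA; a minimal sketch using statsmodels' SARIMAX class (the orders and the monthly period s = 12 are illustrative assumptions, not tuned values, and series is assumed to be a pandas Series of monthly observations):

from statsmodels.tsa.statespace.sarimax import SARIMAX

# SARIMA(1,1,1)(1,1,1,12): non-seasonal and seasonal AR/I/MA terms
model = SARIMAX(series, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
model_fit = model.fit(disp=False)
print(model_fit.forecast(steps=12))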

Advantages of ARIMA

1. Versatile: Can handle a variety of time series types.


2. Integration: Addresses non-stationarity through differencing.
3. Reliable: Provides robust forecasts for univariate data.

Limitations of ARIMA

1. Requires Stationarity: Not suitable for strongly non-stationary data.


2. Assumes Linearity: Does not capture complex, non-linear relationships.
3. Single Variable: Cannot directly handle multivariate series.

Practical Example
Python Implementation

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt

# Example time series
data = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
series = pd.Series(data)

# Fit ARIMA model (p=1, d=1, q=1)
model = ARIMA(series, order=(1, 1, 1))
model_fit = model.fit()

# Summary of the model
print(model_fit.summary())

# Forecast future values
forecast = model_fit.forecast(steps=5)
print("Forecast:", forecast)

# Plot results
plt.plot(series, label='Original Data')
plt.plot(range(len(series), len(series) + len(forecast)), forecast, label='Forecast', color='red')
plt.legend()
plt.show()


Applications

1. Economics: Forecasting GDP, inflation, or stock prices.


2. Finance: Modeling asset prices or sales trends.
3. Weather: Predicting temperature or precipitation patterns.
4. Demand Forecasting: Anticipating product sales or inventory needs.

Summary

● ARIMA is a foundational model for time series forecasting.


● It combines AR, I, and MA components to handle both stationary and
non-stationary data.
● A systematic approach involving stationarity checks, ACF/PACF analysis, and
parameter tuning is essential for effective use.
● Extensions like SARIMA and ARIMAX further enhance its applicability.

Deep Learning and Time Series Analysis

Deep learning has emerged as a transformative approach for time series analysis,
providing advanced techniques to model complex temporal relationships, handle
high-dimensional data, and predict future trends. Unlike traditional methods like
ARIMA, which rely on predefined statistical assumptions, deep learning methods can
automatically learn from raw data, making them ideal for tackling diverse and non-linear
time series problems.

Core Concepts

1. Time Series Characteristics:

○ Sequential Nature: Data points depend on previous values.


○ Components: Trend, seasonality, cyclicity, and noise.
○ Challenges: Non-linearity, multivariate dependencies, and missing data.
2. Deep Learning for Time Series:

○ Uses neural networks with multiple layers to learn features directly from
the data.
○ Handles both univariate (single variable) and multivariate (multiple
variables) series.


Key Deep Learning Models for Time Series


1. Recurrent Neural Networks (RNNs):

● Designed for sequential data.


● Outputs depend on previous inputs via internal memory.
● Limitations:
○ Struggles with long-term dependencies due to vanishing/exploding
gradients.
2. Long Short-Term Memory (LSTM):

● Overcomes RNN limitations by introducing memory gates to control information flow.
● Applications:
○ Forecasting long-term trends.
○ Capturing sequential patterns in time series data.
3. Gated Recurrent Units (GRU):

● Similar to LSTMs but with fewer parameters.


● Combines forget and input gates into a single update gate.
● Faster to train compared to LSTM with comparable accuracy.
4. Convolutional Neural Networks (CNNs):

● Extract spatial and temporal features from time series data.


● Often used in combination with RNNs or LSTMs for hybrid models.
5. Transformers:

● Uses self-attention mechanisms to model relationships across all time steps simultaneously.
● Strengths:
○ Captures both short-term and long-term dependencies effectively.
○ Processes data in parallel, making it computationally efficient.
● Applications: Time series forecasting, anomaly detection.
6. Hybrid Models:

● Combines multiple architectures to leverage their individual strengths.


● Example: CNN-LSTM combines CNN for feature extraction and LSTM for
sequence modeling.
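A minimal univariate LSTM forecasting sketch in Keras, assuming TensorFlow is installed; the synthetic data, window size, and layer sizes are all illustrative assumptions:

import numpy as np
from tensorflow import keras

# Synthetic series: a noisy sine wave standing in for real data
series = np.sin(np.linspace(0, 20, 200)) + np.random.normal(0, 0.1, 200)

# Sliding windows: predict the next value from the previous 10
window = 10
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., np.newaxis]  # shape: (samples, timesteps, features)

model = keras.Sequential([
    keras.layers.Input(shape=(window, 1)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=16, verbose=0)

# One-step-ahead forecast from the last observed window
print(model.predict(series[-window:].reshape(1, window, 1), verbose=0))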


Applications in Time Series Analysis

1. Forecasting:

○ Predicting future values based on historical data.


○ Example: Energy demand, stock prices, weather conditions.
2. Anomaly Detection:

○ Identifying deviations from normal patterns.


○ Example: Fraud detection, system fault monitoring.

3. Classification:

○ Assigning labels to time series data.


○ Example: Activity recognition, disease diagnosis using ECG data.
4. Clustering:

○ Grouping similar time series based on patterns.


○ Example: Customer segmentation in marketing.
5. Causal Analysis:

○ Understanding cause-effect relationships in multivariate time series.

Advantages of Deep Learning in Time Series

1. Non-linear Modeling:

○ Captures complex relationships without manual feature engineering.


2. Scalability:

○ Handles large datasets and multivariate series effectively.


3. Automatic Feature Learning:

○ Extracts meaningful features directly from raw data.


4. Flexibility:

○ Works with irregular, noisy, and non-stationary data.


Challenges of Deep Learning for Time Series

1. Data Requirements:

○ Requires large datasets for effective training.


○ Solution: Use transfer learning or synthetic data generation.
2. Computational Cost:

○ Deep networks require significant computational resources.


○ Solution: Optimize model architecture, use GPUs.
3. Overfitting:

○ Complex models may overfit small datasets.


○ Solution: Apply regularization techniques (e.g., dropout, weight decay).

4. Interpretability:

○ Deep learning models are often seen as "black boxes."


○ Solution: Use explainability tools like attention mechanisms or SHAP.

Steps for Time Series Analysis Using Deep Learning

1. Data Preparation:

○ Normalize data to improve convergence.


○ Handle missing values using imputation techniques.
○ Create sliding windows for supervised learning.
2. Model Selection:

○ Choose architecture based on the problem:


■ LSTMs for sequential dependencies.
■ Transformers for long-term relationships.
3. Model Training:

○ Train on a training set while monitoring performance on a validation set.


○ Use optimizers like Adam or SGD.
4. Evaluation:

○ Metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), RMSE; see the metrics sketch after this list.
○ Check residuals for patterns to ensure model quality.


5. Forecasting and Deployment:

○ Use the trained model to make predictions on new data.


○ Deploy using frameworks like TensorFlow or PyTorch.
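The metrics from step 4 are one-liners with scikit-learn and NumPy; a sketch, assuming y_true and y_pred are arrays of actual and predicted values:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
print(f"MAE={mae:.3f}, MSE={mse:.3f}, RMSE={rmse:.3f}")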

Key Techniques for Improving Deep Learning Models

1. Regularization:

○ Prevent overfitting with techniques like dropout and early stopping (see the sketch after this list).


2. Hyperparameter Tuning:

○ Optimize learning rates, number of layers, and units using grid search or
Bayesian optimization.
3. Data Augmentation:

○ Generate synthetic time series data to increase training set size.

4. Transfer Learning:

○ Use pre-trained models and fine-tune them for specific tasks.


5. Hybrid Architectures:

○ Combine deep learning models with traditional methods like ARIMA for
better performance.
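As an example of technique 1, dropout and early stopping slot directly into the Keras sketch shown earlier (the layer sizes and patience value are illustrative):

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(10, 1)),
    keras.layers.LSTM(32),
    keras.layers.Dropout(0.2),  # randomly zero 20% of activations during training
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop training when validation loss stops improving
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)
# model.fit(X, y, validation_split=0.2, epochs=100, callbacks=[early_stop])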

Comparison: Deep Learning vs. Traditional Models

Aspect | Traditional Models (e.g., ARIMA) | Deep Learning Models
Linearity | Assumes linear relationships | Models non-linear relationships
Feature Engineering | Manual | Automatic
Multivariate Handling | Limited | Handles multiple variables easily
Scalability | Limited | Scales with large datasets
Complexity | Simple | Complex
Performance | Limited in non-linear scenarios | High for non-linear data

Prepared By: Dept. of CSE-AIML, BWU
