
Mid-Term Assignment

Course : IT Application in Banking and Finance

Question 1:
Using any package, download any three series from the categories: financial market index,
foreign exchange rate, commodity market index, stock price, and cryptocurrency, to answer the
following questions:
Importing libraries
Before I start coding, I have to prepare the environment for the code to run. This code imports
various libraries and modules that are commonly used for financial data analysis, statistical tests,
time series analysis, and visualization in Python.

NumPy is a fundamental package for numerical computations in Python. It provides support for
arrays and matrices, along with a collection of mathematical functions to operate on these arrays.
array, copy, diag, log, pi, size, sqrt, sum, and zeros are specific functions and constants from
NumPy for various numerical operations.
SciPy builds on NumPy and provides additional functionality for scientific and technical
computing. It includes modules for optimization, integration, interpolation, eigenvalue problems,
and statistics; its stats module provides a wide range of statistical functions and tests.
Pandas is a powerful data manipulation and analysis library, providing data structures like
DataFrames which are essential for data analysis tasks.
Matplotlib is a plotting library for creating static, animated, and interactive visualizations in
Python. pyplot is a module in Matplotlib used for plotting.
ARCH is a package for Autoregressive Conditional Heteroskedasticity models, which are used
in time series analysis for modeling volatility.
Statsmodels is a library for statistical models and tests. It provides classes and functions for the
estimation of many different statistical models, as well as for conducting statistical tests.
acorr_ljungbox is used for testing autocorrelation in residuals.
adfuller is the Augmented Dickey-Fuller test for unit roots.
SimpleExpSmoothing and ExponentialSmoothing are methods for time series forecasting.
sm is a general import of the statsmodels API, providing access to a wide range of statistical
models.
Seaborn is a visualization library based on Matplotlib that provides a high-level interface for
drawing attractive statistical graphics.
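
Since the original import cell is not reproduced here, a minimal import block consistent with the libraries described above would look like this (the exact set of names imported in the original notebook is an assumption):

```python
import numpy as np
from numpy import array, copy, diag, log, pi, size, sqrt, sum, zeros
from scipy import stats
import pandas as pd
import matplotlib.pyplot as plt
from arch import arch_model
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.holtwinters import SimpleExpSmoothing, ExponentialSmoothing
import seaborn as sns
```
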
Collecting data

This line imports the yfinance library and gives it the alias yf. yfinance is a Python library that
provides an easy-to-use interface to download historical market data from Yahoo Finance. By
importing it with an alias yf, you can use the shorter yf prefix to access the functions and classes
within the yfinance library. This line imports the pandas library and gives it the alias pd.
Pandas is a powerful data manipulation and analysis library in Python. It provides data
structures like DataFrame and Series, which are essential for handling and analyzing structured
data.
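
The two import lines described here are simply:

```python
import yfinance as yf
import pandas as pd
```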

The code below sets up the parameters to download historical stock price data for three
companies—Coca-Cola (KO), Apple Inc. (AAPL), and JPMorgan Chase (JPM)—from Yahoo
Finance for the period between January 1, 2019, and January 1, 2024. The tickers list contains
the ticker symbols for these three stocks. The start_date and end_date variables define the date
range for the historical data retrieval. This setup is typically followed by using a financial data
library, like yfinance, to fetch and manipulate the specified stock data over the given time
period.
This code segment downloads historical stock price data for three companies separately from
Yahoo Finance. Each yf.download function call fetches data for a specific company identified by
its ticker symbol ('AAPL' for Apple Inc., 'KO' for Coca-Cola, and 'JPM' for JPMorgan Chase)
over the specified date range (from start_date to end_date). The retrieved data for each
company is stored in separate variables (aapl_data, ko_data, and jpm_data) for further analysis
or processing. This approach allows for individual handling and analysis of each company's
stock price data.
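
A sketch of this setup and the three download calls, assuming the standard `yf.download` interface; `auto_adjust=False` is added here so the 'Adj Close' column used later is present (newer yfinance versions adjust prices by default):

```python
tickers = ['KO', 'AAPL', 'JPM']
start_date = '2019-01-01'
end_date = '2024-01-01'

# Download each company's history separately from Yahoo Finance
aapl_data = yf.download('AAPL', start=start_date, end=end_date, auto_adjust=False)
ko_data = yf.download('KO', start=start_date, end=end_date, auto_adjust=False)
jpm_data = yf.download('JPM', start=start_date, end=end_date, auto_adjust=False)
```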

This code snippet prints the first few rows of the historical stock price data for Apple Inc.
(AAPL), which was previously downloaded and stored in the variable aapl_data. The
print("AAPL Data:") statement serves as a header to indicate the data being displayed.
Subsequently, print(aapl_data.head()) is used to output the initial rows of the DataFrame
aapl_data, providing a concise preview of the dataset's structure and contents, including details
such as the date, opening price, high price, low price, closing price, and volume traded for each
trading day within the specified date range.
This code snippet displays the initial rows of the historical stock price data for The Coca-Cola
Company (KO), previously downloaded and stored in the variable `ko_data`. The statement
`print("\nCoca (KO) Data:")` introduces the output by printing a header indicating the data
being shown, with a blank line preceding it for clarity. Following this, `print(ko_data.head())`
outputs the first few rows of the DataFrame `ko_data`, offering a concise overview of the
dataset's structure and content, including details such as the date, opening price, high price, low
price, closing price, and volume traded for each trading day within the specified date range.

This code snippet prints a header followed by the initial rows of the historical stock price data for
JPMorgan Chase & Co. (JPM), which has been previously downloaded and stored in the variable
`jpm_data`. The header, `print("\nJPMorgan (JPM) Data:")`, introduces the output, with a
blank line inserted before it for clarity. Subsequently, `print(jpm_data.head())` displays the first
few rows of the DataFrame `jpm_data`, offering a succinct overview of the dataset's structure
and content, encompassing details such as the date, opening price, high price, low price, closing
price, and volume traded for each trading day within the specified date range.
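
The three previews described above amount to:

```python
print("AAPL Data:")
print(aapl_data.head())

print("\nCoca (KO) Data:")
print(ko_data.head())

print("\nJPMorgan (JPM) Data:")
print(jpm_data.head())
```
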
This code segment augments each DataFrame (`aapl_data`, `ko_data`, and `jpm_data`)
containing historical stock price data with a new column named 'ticker'. Each row in these
DataFrames corresponds to a specific date and includes various attributes such as opening price,
high price, low price, closing price, and volume traded. The assignment statements
`aapl_data['ticker'] = 'AAPL'`, `ko_data['ticker'] = 'KO'`, and `jpm_data['ticker'] = 'JPM'`
ensure that the 'ticker' column in each DataFrame is populated with the respective ticker symbol
('AAPL', 'KO', or 'JPM'), effectively labeling each row with the corresponding company's
stock ticker. This additional column facilitates subsequent data manipulation or analysis tasks by
providing a categorical identifier indicating the origin of each data point within the DataFrames.
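
The three labeling assignments quoted above are:

```python
# Label each row with the originating ticker symbol
aapl_data['ticker'] = 'AAPL'
ko_data['ticker'] = 'KO'
jpm_data['ticker'] = 'JPM'
```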

This code concatenates the three DataFrames (`aapl_data`, `ko_data`, and `jpm_data`)
containing historical stock price data for Apple Inc. (AAPL), The Coca-Cola Company (KO),
and JPMorgan Chase & Co. (JPM), respectively, into a single DataFrame named
`combined_data`. By using the `pd.concat()` function, the rows of each DataFrame are stacked
vertically to create a unified DataFrame, allowing for comprehensive analysis across all three
companies. The resulting `combined_data` DataFrame incorporates the data from all three
stocks, facilitating comparative analysis, visualization, or further processing of the combined
dataset.
By executing print(combined_data), the entire contents of the combined DataFrame are
displayed, providing a comprehensive overview of the aggregated data. This output includes all
rows and columns from the merged DataFrames, allowing for analysis and visualization of the
consolidated historical stock price data for the specified companies within the specified date
range.
This code snippet removes any rows from the DataFrame `combined_data` that contain at least
one missing value (NaN). By executing `combined_data.dropna(inplace=True)`, the
DataFrame is modified in place to eliminate any rows with missing values across all columns.
This operation ensures that the resulting DataFrame contains only complete data, facilitating
subsequent analysis or modeling tasks where missing values could impact the accuracy or
reliability of the results.
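
The concatenation, inspection, and cleaning steps described in the last three paragraphs can be written as:

```python
# Stack the three frames vertically into a single DataFrame
combined_data = pd.concat([aapl_data, ko_data, jpm_data])
print(combined_data)

# Drop any rows containing at least one missing value, in place
combined_data.dropna(inplace=True)
```
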
Equally-weighted portfolio

This code calculates the mean values of the 'Open', 'Adj Close', 'Low', and 'High' prices across
the individual stock DataFrames within the `combined_data` DataFrame. It employs the `pivot`
function to restructure the DataFrame, converting the specified columns into separate columns
based on the ticker symbols. Then, it computes the mean value for each row across these newly
created columns using the `mean` function along the specified axis. The resulting mean values
are stored in new columns named 'port_open', 'port_close', 'port_low', and 'port_high',
respectively, representing the average prices across all stocks in the portfolio for each
corresponding price category.

This code merges multiple DataFrames containing average values of 'Adj Close', 'Low', 'High',
and 'Open' prices across all stocks in the portfolio. It begins by merging the 'port_close' and
'port_low' DataFrames on the common 'Date' column using `pd.merge()`, resulting in a
DataFrame `closelow` with two columns: 'port_close' and 'port_low'. Next, it merges
`closelow` with the 'port_high' DataFrame on the 'Date' column, creating a new DataFrame
`clh` that contains three columns: 'port_close', 'port_low', and 'port_high'. Finally, it merges
`clh` with the 'port_open' DataFrame on the 'Date' column, producing the final DataFrame
`port` with all four columns: 'port_close', 'port_low', 'port_high', and 'port_open',
representing the average prices across all stocks in the portfolio for each corresponding price
category on each date.
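
A sketch of the pivot-and-merge logic described in the last two paragraphs. The `portfolio_mean` helper is hypothetical (the original code is not shown); resetting the index turns the Date index into a column so that `pd.merge` can join on 'Date' as the text describes:

```python
def portfolio_mean(column, name):
    # Pivot so each ticker becomes a column, then average across tickers per date
    wide = combined_data.pivot(columns='ticker', values=column)
    return wide.mean(axis=1).rename(name).reset_index()

port_open = portfolio_mean('Open', 'port_open')
port_close = portfolio_mean('Adj Close', 'port_close')
port_low = portfolio_mean('Low', 'port_low')
port_high = portfolio_mean('High', 'port_high')

# Merge the averaged series pairwise on the shared 'Date' column
closelow = pd.merge(port_close, port_low, on='Date')
clh = pd.merge(closelow, port_high, on='Date')
port = pd.merge(clh, port_open, on='Date')

# Restore Date as the index for time-series plotting later on
port = port.set_index('Date')
```
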
Returns, absolute returns, squared returns

This code calculates various types of returns for the portfolio represented by the DataFrame
`port`. First, it computes the percentage change in the 'port_close' column to determine the
returns for each day, storing the results in a new column named 'returns'. Next, it calculates the
absolute returns by taking the absolute values of the returns and assigns them to the 'Abs_returns'
column. Then, it computes the squared returns by squaring the returns and stores them in the
'Squared_returns' column. Lastly, it calculates the logarithmic returns by applying the natural
logarithm to the 'port_close' column and taking the difference of consecutive values, assigning
the results to the 'Log_returns' column. These operations provide various metrics to analyze the
performance and volatility of the portfolio over time.
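
The four return series are computed directly on the `port` frame:

```python
port['returns'] = port['port_close'].pct_change()        # simple daily returns
port['Abs_returns'] = port['returns'].abs()              # magnitude only
port['Squared_returns'] = port['returns'] ** 2           # volatility proxy
port['Log_returns'] = np.log(port['port_close']).diff()  # log returns
```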

This code snippet prints a subset of columns ('returns', 'Abs_returns', and 'Squared_returns') from
the DataFrame `port`, which likely contains calculated returns data for a portfolio. The comment
above the print statement indicates that a NaN value in the first row of the selected columns
signifies that January 1, 2019, serves as the benchmark date for the portfolio's performance
calculation. By displaying these specific columns, the code enables the user to inspect and
analyze the portfolio's returns, absolute returns, and squared returns over time, with the first row
serving as a reference point for benchmarking purposes.
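
The inspection step is a single print:

```python
# NaN in the first row: the first observation serves as the benchmark date
print(port[['returns', 'Abs_returns', 'Squared_returns']])
```
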
Plotting data

This code segment utilizes Matplotlib to generate a single plot visualizing the time series of
different types of returns for the portfolio stored in the DataFrame `port`. It sets the figure size to
14x7 inches using `plt.figure(figsize=(14, 7))` for better visibility. Subsequently, it plots the
'returns', 'Abs_returns', and 'Squared_returns' columns from the `port` DataFrame on the same
plot, assigning different labels to each curve ('Returns', 'Absolute Returns', and 'Squared
Returns'). The plot is titled 'Time Series of Portfolio Returns' and labeled on the x-axis with
'Date' and on the y-axis with 'Value'. Lastly, it adds a legend to the plot to differentiate between
the plotted curves and displays the plot using `plt.show()`. This visualization offers a
comprehensive view of the portfolio's returns, absolute returns, and squared returns over time,
facilitating analysis of their trends and patterns.
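
A sketch of the plotting code as described:

```python
plt.figure(figsize=(14, 7))
plt.plot(port.index, port['returns'], label='Returns')
plt.plot(port.index, port['Abs_returns'], label='Absolute Returns')
plt.plot(port.index, port['Squared_returns'], label='Squared Returns')
plt.title('Time Series of Portfolio Returns')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.show()
```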

The time series plot of portfolio returns from 2019 to 2024 displays the raw returns (blue line),
absolute returns (orange line), and squared returns (green line). A significant period of high
volatility is evident around 2020, likely due to a major market event such as the COVID-19
pandemic, as shown by the pronounced spikes in both absolute and squared returns. The plot
indicates volatility clustering, where periods of high volatility are followed by more high
volatility. After the 2020 spike, the returns stabilize but still exhibit minor volatility spikes. This
visualization is instrumental in understanding market behavior, assessing risk, and informing
investment strategies by highlighting periods of heightened and reduced volatility.

This code snippet uses Matplotlib to create a figure with three subplots arranged vertically, each
displaying a different time series of portfolio returns. The `plt.subplots(nrows=3, ncols=1,
figsize=(15, 5))` function call initializes a figure with three subplots arranged in a single column,
with a specified figure size. Each subplot is then individually configured: the first subplot
(`axes[0]`) plots the 'returns' column from the `port` DataFrame with a blue line, labeled as
'Returns', and titled 'Time Series of Portfolio Returns'. The second subplot (`axes[1]`) plots the
'Abs_returns' column with a green line and labels it as 'Absolute Returns'. The third subplot
(`axes[2]`) plots the 'Squared_returns' column with a red line and labels it as 'Squared Returns'.
All subplots are labeled on the x-axis with 'Date' and on the y-axis with 'Value', and each subplot
includes a legend to distinguish between the plotted curves. Finally, `plt.tight_layout()` ensures
optimal spacing between subplots, and `plt.show()` displays the figure. This setup allows for a
clear comparison of different aspects of portfolio returns over time.
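
A sketch of the three-panel figure as described:

```python
fig, axes = plt.subplots(nrows=3, ncols=1, figsize=(15, 5))

axes[0].plot(port.index, port['returns'], color='blue', label='Returns')
axes[0].set_title('Time Series of Portfolio Returns')
axes[1].plot(port.index, port['Abs_returns'], color='green', label='Absolute Returns')
axes[2].plot(port.index, port['Squared_returns'], color='red', label='Squared Returns')

for ax in axes:
    ax.set_xlabel('Date')
    ax.set_ylabel('Value')
    ax.legend()

plt.tight_layout()  # optimal spacing between subplots
plt.show()
```
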
The provided graph presents a time series analysis of portfolio returns from 2019 to 2024,
divided into three subplots: raw returns (blue), absolute returns (green), and squared returns
(red). All three plots highlight a significant spike in volatility around 2020, likely due to a major
market event such as the COVID-19 pandemic, followed by a period of relative stability with
occasional minor fluctuations. The absolute and squared returns particularly emphasize the
magnitude and impact of these volatility spikes, demonstrating the heightened market risk during
2020 and subsequent stabilization. This analysis is essential for understanding risk and volatility
patterns in the portfolio, aiding in informed financial decision-making and risk management.
Descriptive statistics

This code computes descriptive statistics for three columns ('returns', 'Abs_returns', and
'Squared_returns') from the DataFrame `port`, which likely contains calculated returns data for a
portfolio. The `describe()` function calculates summary statistics, including count, mean,
standard deviation, minimum, 25th percentile (Q1), median (50th percentile), 75th percentile
(Q3), and maximum values, for each specified column. By selecting only these three columns
using `port[['returns', 'Abs_returns', 'Squared_returns']]`, the code focuses on summarizing
the returns, absolute returns, and squared returns of the portfolio. This provides valuable insights
into the central tendency, variability, and distribution of these different types of returns, aiding in
the analysis and understanding of the portfolio's performance characteristics.
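
The summary is a single call:

```python
print(port[['returns', 'Abs_returns', 'Squared_returns']].describe())
```
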
The descriptive statistics table for the portfolio's returns, absolute returns, and squared returns,
based on 1,257 observations, reveals several insights. The average return is quite small at
0.000878, with a standard deviation of 0.015602, indicating moderate variability. The returns
range from a minimum of -0.124939 to a maximum of 0.126790. The absolute returns have a
mean of 0.010347 and a standard deviation of 0.011707, highlighting the average magnitude of
returns irrespective of direction. The squared returns, representing volatility, have a mean of
0.000244 and exhibit significant skewness with a standard deviation of 0.000893. These statistics
collectively illustrate the overall distribution, central tendency, and variability in the portfolio's
performance over the observed period.
ACF

This code segment utilizes Matplotlib and Pandas to create an autocorrelation plot for the returns
of a portfolio stored in the DataFrame `port`. The `autocorrelation_plot()` function from
Pandas' plotting module is used to generate the autocorrelation plot, which displays the
correlation of a series with itself at different lags. Specifically, it plots the autocorrelation of the
'returns' column after dropping any rows with missing values using `port['returns'].dropna()`.
Autocorrelation measures the linear relationship between lagged versions of a time series,
providing insights into the presence of patterns or trends in the data over time. Finally,
`plt.show()` displays the autocorrelation plot using Matplotlib. This visualization helps in
identifying any significant correlations between the returns at different time lags, aiding in the
analysis of the portfolio's performance dynamics and potential predictability.
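
The plot is generated with Pandas' plotting helper:

```python
from pandas.plotting import autocorrelation_plot

autocorrelation_plot(port['returns'].dropna())
plt.show()
```
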
The autocorrelation plot of the portfolio returns shows a rapid decline in autocorrelation values
from the initial lags, indicating a short-term dependency where past returns slightly influence
immediate future returns. As the lag increases, the autocorrelation values hover around zero and
fall within the confidence intervals, suggesting no significant long-term autocorrelation and
minimal influence of past returns on future returns beyond the short-term. This pattern indicates
that the returns largely follow a random walk behavior with only slight short-term predictability.
Normality, Ljung-Box tests
I conduct a variety of tests (Kolmogorov-Smirnov, Anderson-Darling, Shapiro-Wilk, and
Ljung-Box) and obtain the p-values below:
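
A sketch of the four tests; since SciPy's `anderson` does not return a p-value, statsmodels' `normal_ad` is assumed here for the Anderson-Darling statistic:

```python
from statsmodels.stats.diagnostic import acorr_ljungbox, normal_ad

r = port['returns'].dropna()

ks_stat, ks_p = stats.kstest(r, 'norm')  # Kolmogorov-Smirnov against N(0, 1)
ad_stat, ad_p = normal_ad(r)             # Anderson-Darling normality test
sw_stat, sw_p = stats.shapiro(r)         # Shapiro-Wilk
lb = acorr_ljungbox(r, lags=[10])        # Ljung-Box at lag 10

print(f"Kolmogorov-Smirnov: stat={ks_stat:.4f}, p={ks_p:.4e}")
print(f"Anderson-Darling:   stat={ad_stat:.4f}, p={ad_p:.4e}")
print(f"Shapiro-Wilk:       stat={sw_stat:.4f}, p={sw_p:.4e}")
print(lb)
```
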
The results of the Kolmogorov-Smirnov, Anderson-Darling, and Shapiro-Wilk tests all strongly
reject the null hypothesis of normality, with exceedingly small p-values indicating that the series
significantly deviates from a normal distribution. The Kolmogorov-Smirnov test statistic of
0.4745 and the Anderson-Darling test statistic of 23.7481, both with p-values near zero, along
with the Shapiro-Wilk test statistic of 0.8887 and p-value of 3.7877e-29, provide overwhelming
evidence against normality. The Ljung-Box test for autocorrelation at lag 10 yields a test statistic
of 197.8987 with a p-value of 4.4260e-37, indicating that the series is not independently
distributed and exhibits significant autocorrelation. The lack of normality and independence in
the series can be attributed to the presence of volatility clustering, where large changes in returns
tend to be followed by large changes and small changes by small changes, a common
characteristic in financial time series data. This behavior violates the assumptions of normality
and independence, leading to the observed test results.
The analysis of the time series data collected during the COVID-19 pandemic reveals
significant deviations from normality and independence, which can be rationalized by
considering the unique market conditions induced by the pandemic. The COVID-19 pandemic
caused unprecedented economic disruptions, leading to extreme market volatility and heightened
uncertainty. Studies have documented that during this period, financial markets experienced
sudden and severe movements, as well as volatility clustering, where large changes in returns
were followed by further large changes, rather than random, independent fluctuations (Baker et
al., 2020; Zhang et al., 2020). The rejection of normality by the Kolmogorov-Smirnov,
Anderson-Darling, and Shapiro-Wilk tests, as well as the lack of independence indicated by the
Ljung-Box test, align with these observations. The extreme volatility and autocorrelation
observed can be attributed to persistent market shocks and the propagation of volatility over
time, which are characteristic of the market behavior during the COVID-19 pandemic (Ashraf,
2020; Goodell, 2020). Thus, the market conditions during the pandemic provide a plausible
rationale for the non-normality and dependence in the return series.
Question 2
EWMA

This code computes the exponentially weighted moving average (EWMA) of the 'port_close'
column in the DataFrame `port`, using a span of 12 periods. The `ewm()` function calculates the
EWMA, and the `mean()` function computes the mean value over the specified span. The
resulting EWMA values are stored in a new column named 'EWMA12' in the `port` DataFrame.
Subsequently, the code plots both the 'port_close' and 'EWMA12' columns against the date index
using `port[['port_close', 'EWMA12']].plot()`, generating a visualization of the original closing
prices and their corresponding EWMA values over time. Finally, the EWMA values are assigned
to the variable `ewma_res` for further analysis or processing. This approach is commonly used to
smooth out fluctuations in time series data and identify underlying trends or patterns.
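
The EWMA computation and plot described here:

```python
# 12-period exponentially weighted moving average of the portfolio close
port['EWMA12'] = port['port_close'].ewm(span=12).mean()

port[['port_close', 'EWMA12']].plot(figsize=(14, 7))
plt.show()

ewma_res = port['EWMA12']  # kept for the ARIMA fit below
```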

The provided plot displays the time series of the portfolio's closing prices (`port_close`)
alongside its 12-period Exponentially Weighted Moving Average (EWMA12). The EWMA12
line closely follows the actual closing prices, smoothing out short-term fluctuations while
preserving the overall trend. This indicates that the EWMA is effective in highlighting the
underlying trend and reducing noise, making it easier to identify periods of sustained upward or
downward movements. The close alignment between the two lines suggests a stable trend with
relatively low volatility over the period, except for notable disruptions around early 2020, likely
due to the COVID-19 pandemic. The pandemic-induced market shock is evident in the sharp
decline and subsequent recovery, reflecting the extreme volatility and rapid market adjustments
during this period (Baker et al., 2020; Zhang et al., 2020). The subsequent periods show more
stable growth, aligning with the global market recovery trends post-pandemic (Goodell, 2020).
These observations are consistent with studies documenting the impact of COVID-19 on
financial markets, highlighting increased volatility and rapid changes in market conditions.

This code snippet utilizes the ARIMA (AutoRegressive Integrated Moving Average) model from
the statsmodels library to fit a time series model to the exponentially weighted moving average
(EWMA) values stored in the variable `ewma_res`. First, it imports the ARIMA model class
from statsmodels. Then, it initializes an ARIMA model with an order of (0, 0, 1), indicating that
the model has no autoregressive (AR) or differencing (I) components, and only one moving
average (MA) component. Next, the `fit()` method is called on the ARIMA model to estimate the
model parameters using maximum likelihood estimation. The summary statistics of the fitted
model are then printed using the `summary()` method, providing information such as coefficient
estimates, standard errors, p-values, and goodness-of-fit statistics. This process allows for the
analysis of the ARIMA model's performance in capturing the underlying patterns or trends in the
EWMA values and assessing its suitability for forecasting future values.
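
A sketch of the ARIMA fit as described:

```python
from statsmodels.tsa.arima.model import ARIMA

# ARIMA(0, 0, 1): no AR terms, no differencing, one MA term
ewma_model = ARIMA(ewma_res, order=(0, 0, 1))
ewma_results = ewma_model.fit()
print(ewma_results.summary())
```
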
The SARIMAX results for the EWMA12 series, modeled as ARIMA(0,0,1), indicate a strong
model fit. The constant term is significant (coef = 96.9524, p-value = 0.000), suggesting a stable
level around 97. The moving average component (ma.L1) has a coefficient close to 1 (0.9989)
with a highly significant p-value (0.000), indicating a strong short-term correlation in the data.
The model's AIC (9827.158) and BIC (9842.570) values suggest a relatively good fit. The
Ljung-Box test statistic (1237.78) with a p-value of 0.00 indicates significant autocorrelation in
residuals, while the Jarque-Bera test statistic (103.25) with a p-value of 0.00 confirms that
residuals are not normally distributed, likely due to skewness and kurtosis. The
Heteroskedasticity (H) test with a p-value of 0.00 suggests variance changes over time. These
findings align with the financial market behavior during COVID-19, marked by high volatility
and autocorrelation (Baker et al., 2020; Zhang et al., 2020).
GARCH (1,1)

This code segment utilizes the ARCH (Autoregressive Conditional Heteroskedasticity) model
from the arch library to fit a GARCH (Generalized Autoregressive Conditional
Heteroskedasticity) model to the log returns of a financial asset. Firstly, it prepares the data by
selecting the 'Log_returns' column from the DataFrame `port`, dropping any missing values, and
then filtering the data for the year 2013 onwards. It also computes the difference between the
natural logarithm of the 'port_low' and 'port_open' columns, dropping any missing values and
filtering the data for the year 2013 onwards.
Next, it initializes a GARCH model using the `arch_model()` function from the arch library,
specifying the log returns as the dependent variable, setting the volatility model to GARCH with
one lag for both the autoregressive (p) and moving average (q) components, and selecting the
Skewed Student's t distribution for the residuals.
Then, the `fit()` method is called on the GARCH model to estimate the model parameters using
maximum likelihood estimation. The `disp="off"` argument suppresses the display of
optimization convergence messages, and the `first_obs` and `last_obs` arguments set the
estimation period from January 1, 2013, to December 29, 2023. Finally, the fitted GARCH
model results are stored in the variable `garch_res`. This process allows for the analysis of
volatility dynamics in the log returns of the financial asset and assessing the adequacy of the
GARCH model in capturing these dynamics.
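
A sketch of the GARCH(1,1) estimation, assuming the log returns are scaled to percent (a common convention with the arch package); the scaling, like the variable names, is an assumption since the original cell is not shown:

```python
lreturns = port['Log_returns'].dropna() * 100
low_open = (np.log(port['port_low']) - np.log(port['port_open'])).dropna()

# GARCH(1,1) with a Skewed Student's t distribution for the residuals
garch_mod = arch_model(lreturns, vol='GARCH', p=1, q=1, dist='skewt')
garch_res = garch_mod.fit(disp='off', first_obs='2013-01-01', last_obs='2023-12-29')
print(garch_res.summary())
```
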
The GARCH model results for the log returns indicate significant findings on volatility
dynamics. The mean return (\(\mu\)) is 0.1223 with a highly significant p-value (7.015e-05),
suggesting a positive average return. The GARCH components show that \(\omega\) (constant) is
0.0380, \(\alpha[1]\) (ARCH term) is 0.1040, and \(\beta[1]\) (GARCH term) is 0.8780, all with
highly significant p-values, indicating that both past shocks and past variances significantly
influence current volatility. The high \(\beta[1]\) value suggests strong volatility clustering,
where high volatility tends to persist, a characteristic often seen in financial markets during
crises like the COVID-19 pandemic (Zhang et al., 2020). The negative skewness parameter
(\(\lambda = -0.0709\)) indicates asymmetry in the distribution of returns, consistent with market
stress conditions (Bollerslev, 1986). These findings align with other studies showing increased
volatility and persistent shocks in financial markets during the pandemic (Goodell, 2020; Ashraf,
2020).
GJR(1,1,1)

This code segment utilizes the GJR-GARCH (Glosten-Jagannathan-Runkle Generalized
Autoregressive Conditional Heteroskedasticity) model from the arch library to fit a GJR-GARCH model to the log returns of
a financial asset. Firstly, it prepares the data by selecting the 'logreturns' variable, which likely
contains the log returns of the asset. Then, it initializes a GJR-GARCH model using the
`arch_model()` function from the arch library. The model is specified with one lag for the
autoregressive (p) and moving average (q) components of the conditional variance, and one lag
for the asymmetry (o) term, which captures the leverage effect. Additionally, the mean is set to
'constant', indicating a constant mean model, the volatility model is specified as GARCH, and the
distribution is set to Student's t distribution.
Next, the `fit()` method is called on the GJR-GARCH model to estimate the model parameters
using maximum likelihood estimation. The `disp="off"` argument suppresses the display of
optimization convergence messages, and the `last_obs` argument specifies the end date of the
estimation period as January 1, 2024. Finally, the fitted GJR-GARCH model results are stored in
the variable `gjr_model`. This process allows for the analysis of volatility dynamics in the log
returns of the financial asset, particularly focusing on capturing the asymmetry or leverage effect
in volatility changes.
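
A comparable sketch of the GJR-GARCH(1,1,1) fit; here `gjr_model` holds the fit results, matching the variable name in the text:

```python
logreturns = port['Log_returns'].dropna() * 100

# o=1 adds the asymmetry (leverage) term on top of GARCH(1,1)
gjr_mod = arch_model(logreturns, mean='Constant', vol='GARCH',
                     p=1, o=1, q=1, dist='t')
gjr_model = gjr_mod.fit(disp='off', last_obs='2024-01-01')
print(gjr_model.summary())
```
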
The GJR-GARCH model results for the log returns, using a standardized Student's t-distribution,
show significant coefficients and insights into the volatility dynamics. The mean return (\(\mu\))
is exceptionally high at 17.0410, indicating an unusually high average return, which could
suggest model misspecification or data issues, especially given the optimizer's convergence
warning. The volatility model components, including \(\alpha[1]\) (0.6116) and \(\beta[1]\)
(0.6943), are highly significant, indicating that past shocks and past variances strongly influence
current volatility. The negative \(\gamma[1]\) coefficient (-0.6116) suggests asymmetry in
volatility, consistent with the leverage effect where negative shocks increase volatility more than
positive shocks, as documented in financial literature (Glosten, Jagannathan, & Runkle, 1993).
The model's AIC (41619.1) and BIC (41649.9) suggest a fit, but the exceptionally high mean
return and convergence issues warrant caution and further model diagnostics. These findings
align with studies showing increased volatility and asymmetry during financial crises like the
COVID-19 pandemic, where market conditions led to persistent and asymmetric volatility
patterns (Bollerslev, 1986; Goodell, 2020).
This code segment facilitates the visualization of annualized volatility for the FTSE (Financial
Times Stock Exchange) index using the GJR-GARCH(1,1,1) model. Initially, it ensures proper
date formatting for plotting by registering date converters. Subsequently, it computes the
annualized volatility from the GARCH volatility estimates and plots the results against
corresponding dates. The plot is configured to ensure optimal visualization, with tight x-axis
scaling, formatted date labels, and appropriate spacing. This visualization offers valuable insights
into the fluctuation and risk characteristics of the FTSE index over time, aiding investors and
analysts in understanding and assessing market volatility dynamics.
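
A sketch of the volatility plot; the annualization factor of 252 trading days is an assumption, as the original scaling is not shown:

```python
import matplotlib.dates as mdates
from pandas.plotting import register_matplotlib_converters

register_matplotlib_converters()  # proper date handling when plotting

# Annualize the daily conditional volatility (252-trading-day assumption)
ann_vol = gjr_model.conditional_volatility * np.sqrt(252)

fig, ax = plt.subplots(figsize=(14, 5))
ax.plot(ann_vol.index, ann_vol)
ax.set_xlim(ann_vol.index[0], ann_vol.index[-1])             # tight x-axis scaling
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))  # formatted date labels
fig.autofmt_xdate()
ax.set_title('FTSE Annualized Volatility (GJR-GARCH(1,1,1))')
plt.tight_layout()
plt.show()
```
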
The plot of FTSE annualized volatility estimated using the GJR-GARCH(1,1,1) model vividly
captures the impact of the COVID-19 pandemic on market volatility. The sharp spike in
volatility reaching over 100% around early 2020 corresponds to the onset of the pandemic,
reflecting extreme market uncertainty and panic (Zhang et al., 2020). This period of heightened
volatility is followed by a gradual decline, yet the volatility remains elevated compared to
pre-pandemic levels, indicating persistent market instability and periodic volatility clusters.
Subsequent peaks in volatility around 2022 and 2023 suggest that market conditions continued to
be influenced by residual shocks and ongoing economic uncertainties. The GJR-GARCH model
effectively captures these volatility dynamics, including the asymmetry where negative shocks
have a larger impact on volatility, consistent with the leverage effect (Glosten, Jagannathan, &
Runkle, 1993). These observations align with the broader literature on financial market behavior
during crises, which highlights increased volatility and prolonged market adjustments
(Bollerslev, 1986; Goodell, 2020).
AIC and SBIC
I calculate the AIC and SBIC of each model separately, then compare them to find the model
with the lowest values. That model can be considered the best-fitting model for this portfolio
analysis.
This code segment fits an ARIMA (AutoRegressive Integrated Moving Average) model to the
exponentially weighted moving average (EWMA) values of a financial asset. It initializes the
ARIMA model using the `ARIMA()` function from the statsmodels library, specifying the
EWMA values as the dependent variable and setting the model order as (0, 0, 1), indicating no
autoregressive (AR) or differencing (I) components and one moving average (MA) component.
Next, it estimates the model parameters using the `fit()` method, storing the results in the
variable `ewma_results`. From the fitted model results, it retrieves the Akaike Information
Criterion (AIC) and Bayesian Information Criterion (BIC) values, which are metrics used to
evaluate the goodness of fit and model complexity. These AIC and BIC values provide insights
into the relative performance of the ARIMA model in capturing the underlying dynamics of the
EWMA values, facilitating model comparison and selection.
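
The fit and the two criteria amount to:

```python
ewma_results = ARIMA(ewma_res, order=(0, 0, 1)).fit()
ewma_aic = ewma_results.aic
ewma_bic = ewma_results.bic
```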

This code segment involves estimating a GARCH (Generalized Autoregressive Conditional
Heteroskedasticity) model with a Skewed Student's t distribution. Firstly, it initializes the
GARCH model using the `arch_model()` function from the arch library, specifying the log
returns of a financial asset (`lreturns`) as the input data. The model is configured with a
GARCH(1,1) specification, indicating one lag for both the autoregressive (p) and moving
average (q) components of the conditional variance. Additionally, it specifies the distribution as
Skewed Student's t. Next, the model parameters are estimated using the `fit()` method, with
options to suppress display of optimization messages (`disp="off"`) and to specify the first and
last observation dates. Finally, the Akaike Information Criterion (AIC) and Bayesian Information
Criterion (BIC) values are extracted from the fitted model results (`garch_res`) and stored in
variables `garch_aic` and `garch_bic`, respectively. These metrics serve as indicators of model fit
and help in comparing the relative performance of the GARCH model.
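
A sketch of this step, reusing the specification from Question 2:

```python
garch_res = arch_model(lreturns, vol='GARCH', p=1, q=1, dist='skewt').fit(
    disp='off', first_obs='2013-01-01', last_obs='2023-12-29')
garch_aic = garch_res.aic
garch_bic = garch_res.bic
```
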
This code segment estimates a GJR-GARCH (Generalized Autoregressive Conditional
Heteroskedasticity with Asymmetric Effects) model for the log returns of a financial asset.
Initially, it initializes the GJR-GARCH model using the `arch_model()` function from the arch
library, specifying the log returns (`logreturns`) as the input data. The model is configured with
one lag for the autoregressive (p), moving average (q), and asymmetric (o) components, along
with a constant mean and a Student's t distribution. Next, the model parameters are estimated
using the `fit()` method, with options to suppress the display of optimization messages
(`disp="off"`) and to specify the last observation date. Subsequently, the Akaike Information
Criterion (AIC) and Bayesian Information Criterion (BIC) values are extracted from the fitted
model results (`gjr_res`) and stored in variables `gjr_aic` and `gjr_bic`, respectively. These
metrics serve as indicators of model fit and assist in comparing the relative performance of the
GJR-GARCH model.
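
And the corresponding GJR-GARCH step:

```python
gjr_res = arch_model(logreturns, mean='Constant', vol='GARCH',
                     p=1, o=1, q=1, dist='t').fit(disp='off', last_obs='2024-01-01')
gjr_aic = gjr_res.aic
gjr_bic = gjr_res.bic
```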

This code snippet conducts a comparison of different financial volatility models, including
Exponentially Weighted Moving Average (EWMA), GARCH(1,1) (Generalized Autoregressive
Conditional Heteroskedasticity), and GJR(1,1,1) (GARCH with Asymmetric Effects). Initially, it
gathers the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values
for each model into respective lists. Subsequently, it iterates through the lists of model names
(`models`), AIC values, and BIC values simultaneously using the `zip()` function. During each
iteration, it prints the model name alongside its corresponding AIC and BIC values, providing a
comprehensive comparison of the models' goodness of fit. This comparison aids in selecting the
most appropriate model based on their relative performance in terms of model fit and
complexity.
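
The comparison loop described above:

```python
models = ['EWMA', 'GARCH(1,1)', 'GJR(1,1,1)']
aics = [ewma_aic, garch_aic, gjr_aic]
bics = [ewma_bic, garch_bic, gjr_bic]

for name, aic, bic in zip(models, aics, bics):
    print(f"{name}: AIC = {aic:.3f}, BIC = {bic:.3f}")
```
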
The model comparison results show distinct differences in the performance of the EWMA,
GARCH(1,1), and GJR-GARCH(1,1,1) models, based on their AIC and BIC values. The EWMA
model has an AIC of 9827.158 and a BIC of 9842.570, which are significantly higher than those
of the GARCH(1,1) model, which has an AIC of 4052.320 and a BIC of 4083.134. This suggests
that the GARCH(1,1) model provides a better fit to the data, capturing the volatility dynamics
more effectively than the simpler EWMA model. The GJR-GARCH(1,1,1) model, however,
shows extremely high AIC (41619.076) and BIC (41649.895) values, which are indicative of
poorer model performance or potential overfitting issues, despite its complexity and ability to
capture asymmetries in the volatility process.
The rationale behind these results can be attributed to the inherent characteristics of the models
and the nature of the data. The GARCH(1,1) model captures both the clustering of volatility and
the persistence of shocks, which are common features in financial time series, especially during
periods of market stress such as the COVID-19 pandemic (Bollerslev, 1986). The high AIC and
BIC values for the GJR-GARCH(1,1,1) model could be due to overparameterization or
convergence issues, as indicated by the warning in the model output. Additionally, while the
GJR-GARCH model is designed to capture asymmetries and leverage effects (Glosten,
Jagannathan, & Runkle, 1993), the specific dataset might not benefit sufficiently from this added
complexity, leading to poorer overall performance metrics.
In conclusion, based on the AIC and BIC criteria, the GARCH(1,1) model is preferred for
modeling the volatility of the portfolio returns during the COVID-19 period, providing a balance
between model complexity and goodness of fit.
Backtesting
This code segment performs a comprehensive backtesting analysis of a GARCH(1,1) model with
a Skewed Student's t distribution, focusing on Value at Risk (VaR) estimation. Initially, the
GARCH(1,1) model is estimated using the arch library, with the Skewed Student's t distribution.
The Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values are
then computed as indicators of model fit and complexity. Subsequently, the VaR is estimated
using the fitted GARCH model, and backtesting is performed by comparing the actual returns to
the estimated VaR. The hit rates for VaR exceedance are calculated and printed. Two plots are
generated: one showing the estimated VaR and another displaying the actual returns along with
VaR violations. The hit rate is displayed on the second plot. Further backtesting tests are
conducted, including the Kupiec Probability of Failure (POF) test, Christoffersen test, and
Traffic Light test, to assess the model's performance in capturing risk. Finally, summary statistics
such as annualized return, cumulative return, and maximum drawdown are computed for both
the original returns and the returns adjusted for VaR, providing insights into the effectiveness of
the VaR model in managing risk. I conduct the backtesting step by step in the code; each step is
discussed above, and the results follow.
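
A condensed sketch of the VaR estimation and hit-rate step, following the standard arch forecasting workflow; the Kupiec POF, Christoffersen, and Traffic Light tests and the drawdown statistics are omitted here, since no single standard library call provides them:

```python
# In-sample one-step-ahead conditional mean and variance
fcast = garch_res.forecast(start=lreturns.index[0], reindex=False)
cond_mean = fcast.mean.iloc[:, 0]
cond_var = fcast.variance.iloc[:, 0]

# Skewed Student's t quantiles at the 1% and 5% levels,
# using the fitted shape parameters (the last two entries of params)
q = garch_res.model.distribution.ppf([0.01, 0.05], garch_res.params[-2:])
VaR = -cond_mean.values[:, None] - np.sqrt(cond_var).values[:, None] * q[None, :]
VaR = pd.DataFrame(VaR, columns=['1%', '5%'], index=cond_var.index)

rets = lreturns.reindex(VaR.index)

# Hit rate: fraction of days the realized return breaches -VaR
for level in ['1%', '5%']:
    hits = rets < -VaR[level]
    print(f"{level} VaR hit rate:")
    print(hits.value_counts(normalize=True))
```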

The results for the hit rates of 1% and 5% Value at Risk (VaR) indicate the performance of the
VaR model in predicting extreme losses. The 1% VaR hit rate shows no breaches (False: 1.0),
suggesting that the model is highly conservative or that the market conditions during the
COVID-19 pandemic were not extreme enough to trigger such losses. On the other hand, the 5%
VaR hit rate shows breaches approximately 4.22% of the time (True: 0.042164), which aligns
closely with the expected 5%, indicating better calibration and accuracy of the model at this
confidence level. The results imply that while the 5% VaR model is relatively well-calibrated, the
1% VaR model may be overestimating risk. This could be due to the heightened market volatility
during the pandemic, leading to more conservative risk estimates. Regular validation and
recalibration of VaR models are essential, especially in volatile market environments, to ensure a
balance between conservativeness and accuracy in risk assessment (Bollerslev, 1986; Zhang et
al., 2020).

The two plots provide a detailed visualization of the 5% Value at Risk (VaR) estimated using a
GARCH model and the corresponding hit rate of actual returns violating this VaR threshold. The
upper plot shows the estimated VaR, which adjusts dynamically over time, capturing the
increased volatility during the COVID-19 pandemic around early 2020, as evidenced by the
sharp dip. This reflects heightened risk and market uncertainty during that period (Zhang et al.,
2020). The lower plot juxtaposes the actual returns with the estimated 5% VaR threshold, with
violations marked by black triangles. The hit rate of 4.22% is close to the expected 5%,
indicating a well-calibrated VaR model. The visual alignment of the actual returns breaching the
VaR during periods of significant market stress, such as the pandemic's onset, confirms the
model's effectiveness in capturing extreme risk scenarios (Bollerslev, 1986; Glosten,
Jagannathan, & Runkle, 1993). These observations underscore the importance of robust risk
management practices in volatile market conditions and validate the model's utility in predicting
potential losses.
Question 3
Using the data above, fit multivariate GARCH frameworks such as BEKK, DCC, ADCC,
cDCC, etc.
Estimate and report the results

The provided R code installs four essential packages for financial time series analysis and
GARCH modeling. The rmgarch package is used for fitting multivariate GARCH models,
including Dynamic Conditional Correlation (DCC) and BEKK, which are critical for modeling
volatility and correlations in multiple time series. The rugarch package offers a comprehensive
framework for univariate GARCH models, such as standard GARCH, EGARCH, and TGARCH,
enabling detailed volatility analysis for individual time series. The xts package (eXtensible Time
Series) provides tools for creating, manipulating, and analyzing time series data, crucial for
handling irregular time series common in financial datasets. Lastly, the quantmod package
(Quantitative Financial Modelling Framework) facilitates the modeling, testing, and analysis of
financial data, including data retrieval and charting, making it a vital tool for quantitative
financial analysis. Together, these packages form a robust toolkit for conducting advanced time
series analysis and volatility modeling in R.
The R code snippet begins by loading four essential libraries: `rmgarch`, `rugarch`, `xts`, and
`quantmod`. The `rmgarch` library is used for fitting multivariate GARCH models, which are
crucial for analyzing volatility and correlations between multiple time series. The `rugarch`
library provides tools for univariate GARCH modeling, supporting various GARCH model
types. The `xts` library facilitates handling and manipulating time series data, essential for
efficient data operations in time series analysis. The `quantmod` library offers functionalities for
modeling, testing, and analyzing financial data, including data retrieval and charting. After
loading the libraries, the code reads data from a CSV file specified by `"path_to_your_data.csv"`,
replacing this with the actual data path. The data is then converted into a time series object using
the `xts` function, assuming the data contains 'returns' and 'returns2' columns and a 'Date' column
to order the data by dates. This preparation step ensures the data is in the correct format for
subsequent multivariate GARCH modeling and analysis.

Backtest the multivariate models using the VaR approach


References
Ashraf, B. N. (2020). Stock markets’ reaction to COVID-19: Cases or fatalities? *Research in
International Business and Finance*, 54, 101249.
Baker, S. R., Bloom, N., Davis, S. J., & Terry, S. J. (2020). The unprecedented stock market
impact of COVID-19. National Bureau of Economic Research.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. *Journal of
Econometrics*, 31(3), 307-327.
Glosten, L. R., Jagannathan, R., & Runkle, D. E. (1993). On the relation between the expected
value and the volatility of the nominal excess return on stocks. *The Journal of Finance*, 48(5),
1779-1801.
Goodell, J. W. (2020). COVID-19 and finance: Agendas for future research. *Finance Research
Letters*, 101512.
Zhang, D., Hu, M., & Ji, Q. (2020). Financial markets under the global pandemic of COVID-19.
*Finance Research Letters*, 101528.
