Engineering Applications of Artificial Intelligence: Mohit Beniwal, Archana Singh, Nand Kumar
ARTICLE INFO

Keywords: Deep learning; Bi-LSTM; DNN; RNN; CNN; GRU; Forecasting; Index; LSTM; Long-term; Prediction; Machine learning; Artificial intelligence; Technical analysis; Fundamental analysis

ABSTRACT

Deep machine learning algorithms play an important role in facilitating the development of predictive models for the stock market. However, most studies focus on predicting next-day stock prices or movements, limiting the usability of the predictive model for investors. This study extensively explores the ability of deep learning models to predict out-of-sample the daily prices of global stock indices over a long term, up to a year. The performance of six models, including Deep Neural Network (DNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (Bi-LSTM), Gated Recurrent Unit (GRU), and Convolutional Neural Network (CNN), is compared using Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE). The models predict the long-term daily prices of five global stock indices, namely the Nifty, the Dow Jones Industrial Average (DJIA), the DAX performance index (DAX), the Nikkei 225 (NI225), and the Shanghai Stock Exchange composite Index (SSE). The results confirm the superiority of LSTM for predicting long-term daily prices. The Bi-LSTM does not improve the result of LSTM but performs better than other algorithms. CNN overfits the training data and poorly forecasts the long-term stock prices of global indices on the testing data. This research demonstrates the potential of deep learning models for long-term stock price forecasting, offering valuable insights for investors. Additionally, the patterns of predicted daily prices can be helpful in building trading and risk management decision systems.
1. Introduction

The stock market is a complex and dynamic system, and predicting its future prices is a challenging task (Beniwal et al., 2023a). Although the Efficient Market Hypothesis (Fama, 1970) and Random Walk Hypothesis (Fama, 1995) suggest that it is futile to predict the stock market, accurate stock market predictions are crucial for making informed investment decisions, reducing risk, and maximizing returns. Traditional financial models rely on fundamental and technical analysis to forecast future prices. Fundamental analysis involves analyzing financial statements and economic indicators. On the other hand, technical analysis relies on analyzing historical prices to predict future prices. In addition to these traditional methods, statistical and econometric methods such as Auto-Regressive Integrated Moving Average (ARIMA), Seasonal Auto-Regressive Integrated Moving Average (SARIMA), and Vector Auto-Regression (VAR) have also been used to forecast stock market time series. However, these linear methods may not capture the complex and non-linear relationships between variables that affect stock prices (Shahi et al., 2020). Recently, machine learning algorithms have emerged as powerful tools for stock market analysis, providing a more objective and data-driven approach to predicting stock prices. These algorithms can extract hidden patterns that are difficult to detect by other methods.

Various machine learning algorithms have been used in stock market analysis, such as Support Vector Machines (SVM), Decision Trees (DT), Random Forests (RF), and Naive Bayes (NB). These algorithms have demonstrated varying degrees of success in predicting stock prices. The accuracy of predictions is subject to a margin of error, which is influenced by the choice of algorithm employed (Nikou et al., 2019). However, neural networks have emerged as robust and effective machine learning algorithms that can efficiently handle noisy and nonlinear data to forecast time series (Yu and Yan, 2020). Among neural networks, deep learning has become a popular approach for analyzing stock market data due to its superior performance in prediction tasks. Deep learning enables the creation of computational models that consist of multiple layers of processing, which can learn to represent data with varying
* Corresponding author.
E-mail address: mohitbeniwal@dtu.ac.in (M. Beniwal).
https://doi.org/10.1016/j.engappai.2023.107617
Received 4 June 2023; Received in revised form 25 September 2023; Accepted 24 November 2023
Available online 4 December 2023
0952-1976/© 2023 Elsevier Ltd. All rights reserved.
degrees of abstraction (Lecun et al., 2015). There is a growing trend among asset management companies and investment banks to allocate more research funds toward the development of artificial intelligence techniques, particularly in deep learning (Jianga). Deep learning models have shown better performance than linear and machine learning models in stock market prediction tasks, owing to their ability to effectively handle large volumes of data and identify complex nonlinear relationships between input features and prediction targets (Jiangb).

The application of deep learning models has significantly impacted the field of finance, particularly in predicting stock prices. One advantage of deep learning is its ability to automatically extract features from raw data (Wu et al., 2022), (Liang et al., 2017), eliminating the need for feature engineering and improving forecasting accuracy. Deep learning models comprise multiple layers of interconnected neurons, each layer responsible for extracting higher-level features from the input data. Numerous deep-learning architectures have been created to address diverse problems and the inherent structure of datasets (Bhandari et al., 2022). In this paper, we explore some frequently used deep learning models such as Deep Neural Networks (DNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (Bi-LSTM), Gated Recurrent Units (GRU), and Convolutional Neural Networks (CNN). DNN includes dense hidden layers with a hierarchical topology (Thakkar and Chaudhari). DNNs can learn multiple levels of features from raw input data using multiple layers of nonlinear transformations. An RNN is a specialized type of Artificial Neural Network (ANN) that can handle sequential inputs by incorporating internal feedback connections between neurons (Kumar and Haider, 2021). LSTM, a specific type of RNN, is a neural network architecture that can retain memory. It is well-suited for processing and predicting significant events with longer intervals and time delays within time series data (Lin et al., 2021). However, LSTM can only learn from past information (Alkhatib et al.). Bi-LSTM, a variant of LSTM, can learn from past and future information because it has two hidden layers with opposite directions connected to the same output (Alkhatib et al.)–(Houssein et al., 2022). Like LSTM, GRU can handle sequence data while simplifying the complex computations involved in LSTM (Zhang et al., 2023). While CNN is inspired by computer vision, it can be designed for financial data (Shah et al.), (Hoseinzade and Haratizadeh, 2019).

In stock market prediction, most of the research has focused on predicting the next day's price (Rouf et al., 2021), using iterative methods to predict prices for the entire test data. This approach has occasionally achieved high accuracy but is not always useful for traders seeking long-term predictions. For long-term prediction, machine learning algorithms need to provide multi-output predictions. However, structuring machine learning algorithms for multi-output long-term predictions can be tedious and sometimes impractical (Beniwal et al., 2023b). To address this limitation, in this study, we propose a novel approach for stock market prediction that leverages the time dependency of stock prices. Instead of predicting only the next day's price, we train machine learning algorithms to learn the patterns of price fluctuations in relation to time. By exploiting the time dependency of prices, we aim to predict long-term prices with higher accuracy and precision. To test the robustness of the approach, we experiment with the ML algorithms on stock indices of the top five economies in terms of Gross Domestic Product (GDP) (“Countries by GDP”), namely the Nifty, the Dow Jones Industrial Average (DJIA), the DAX performance index (DAX), the Nikkei 225 (NI225), and the Shanghai Stock Exchange composite index (SSE).

Predicting long-term daily stock prices using deep learning models presents multiple challenges compared to short-term predictions. Here are some insights into these challenges:

1. Temporal Dependency: Long-term predictions require understanding and modeling more extended sequences of historical data. Deep learning models must capture temporal dependencies that span extended periods, which can be more challenging than short-term predictions, where the focus is on immediate past patterns.
2. Data Complexity: Long-term predictions involve managing and processing complex data, which can be computationally intensive.
3. Volatility and Trends: Long-term stock prices are influenced by both short-term fluctuations and long-term trends. Deep learning models may not capture day-to-day volatility influenced by short-term events.
4. Overfitting and Underfitting: With a more extended training period, there is a risk of overfitting or underfitting. Preventing overfitting and underfitting is crucial for accurate long-term predictions.
5. Feature Engineering: While deep learning models excel at feature extraction, features such as the P/E ratio and other fundamental indicators are not available on a daily frequency. Hence, they are difficult to include in this study.

The study contributes to bridging the gap between short-term prediction models and the practical needs of investors looking for robust and informed long-term predictions. The proposed approach adds important value to predicting long-term stock prices using deep learning. Further, it can provide traders and investors with more accurate and reliable long-term predictions. This can help them make informed investment decisions and reduce risk. The proposed approach can also be used to develop advanced trading strategies, such as trend-following or mean-reverting. Overall, this study contributes to the ongoing research efforts to improve the accuracy and reliability of stock market predictions using advanced deep learning algorithms. Hence, this paper contributes in the following ways:

1. Temporal Dependency: Most predictive studies in the stock market emphasize next-day prices. In contrast, this study presents an approach to exploit price patterns in relation to time dependency to forecast long-term stock prices of global indices.
2. Models Comparison: Extensive experiments are conducted on the indices of the top five economies by GDP to evaluate the predictive robustness of LSTM and other deep learning algorithms in the long term.
3. Evaluation of Deep Learning Models: The research addresses multiple questions regarding the ability to forecast long-term stock prices of global indices, such as whether RNN performs better than DNN, whether it is better to use Bi-LSTM instead of LSTM, whether GRU outperforms LSTM, and which of DNN, RNN, LSTM, Bi-LSTM, GRU, and CNN is the least suitable.
4. Practical Utility: The approach helps investors gauge the market outlook for the long term and make informed decisions.
5. Inspiration for Future Research: The study may inspire other researchers to develop long-term trading and risk management systems.

The remainder of the paper is organized as follows: Section 2 presents the literature on stock prediction using deep learning. Section 3 describes the deep learning models and an overview of the prediction algorithm. Section 4 discusses the methodologies to explain the experimental design, and Section 5 presents the results of experiments conducted on global indices, followed by a discussion of the findings. Finally, Section 6 concludes the study, discusses its limitations, and suggests directions for future research.

2. Literature review

2.1. Deep learning in stock prediction

DNNs are feedforward neural networks that process input data in one direction, from the input layer through one or more hidden layers to the output layer. Singh and Srivastava (2017) aimed to demonstrate that deep learning can enhance the accuracy of stock
market forecasting. The study compared the performance of (2D)²PCA + Deep Neural Network (DNN) with 2-Directional 2-Dimensional Principal Component Analysis (2D)²PCA + Radial Basis Function Neural Network (RBFNN) and found that the proposed method outperformed the RBFNN method, with an improved accuracy of 4.8% for Hit Rate with a window size of 20. The proposed model was also compared with Recurrent Neural Network (RNN) and showed an improved accuracy of 15.6% for Hit Rate. Additionally, the correlation coefficient between actual and predicted return for DNN was 17.1% more than RBFNN and 43.4% better than RNN. Kanwal et al. (2022) proposed a hybrid deep learning (DL) model for timely and efficient prediction of stock prices. The proposed model, BiCuDNNLSTM-1dCNN, combined Bidirectional Cuda Deep Neural Network Long Short-Term Memory and a one-dimensional Convolutional Neural Network. The model was compared with other hybrid DL-based and state-of-the-art models using five stock price datasets. The results indicated that the proposed hybrid model was accurate and reliable for supporting informed investment decisions.

RNNs are designed to handle sequential data with temporal dependencies, such as time series data or natural language processing tasks. Liu et al. (2020) proposed two attention-based RNN models, DSTP-RNN and DSTP-RNN-II, for long-term and multivariate time series prediction. These models outperformed state-of-the-art methods and provided insights for further exploration of attention-based methods in time series prediction. Ranjan and Mahadani (Ranjan et al., 2022) compared LSTM, bi-directional LSTM, and RNN models with univariate and multivariate features to predict stock prices. The study found that the recurrent neural network approach had the highest accuracy with univariate and multivariate features. The performance was evaluated using the root mean square error (RMSE) and mean square error (MSE) criteria. Naik and Mohan (2019) proposed an RNN with recurrent dropout to avoid overfitting and used stock returns based on closing prices as input to the model. Data was collected from NSE India, and the proposed RNN with LSTM outperformed a feed-forward artificial neural network regarding error minimization.

RNNs suffer from vanishing and exploding gradients, making it difficult to train the model effectively. LSTMs overcome the vanishing gradient problem in RNNs. Gülmez (2023) developed a deep LSTM network with the ARO model (LSTM-ARO) to predict stock prices using DJIA index stocks. Four other models, one ANN and three LSTM, including one optimized by Genetic Algorithm (GA), were compared with LSTM-ARO using MSE, MAE, MAPE, and R² evaluation criteria. The results indicate that LSTM-ARO outperformed the other models. Budiharto (Budiharto) experimented with LSTM and found it to be a reliable predictor for short-term data with an accuracy of 94.57%. Using a shorter training period of 1 year with high epochs produced better results than using three-year training data. Rather (2021) implemented a new regression scheme on an LSTM-based deep neural network to construct a predicted portfolio. The author conducted a large set of experiments using stock data of NIFTY-50 obtained from the National Stock Exchange of India. The results indicated that the proposed model outperformed various standard predictive and portfolio optimization models.

Bi-LSTM has an additional LSTM layer that processes the input data in the reverse order. Lee et al. (2022) proposed and applied an attention-based BiLSTM (AttBiLSTM) model to trading strategy design. The model is evaluated with various technical indicators (TIs), including a stochastic oscillator, RSI, BIAS, W%R, and MACD. Two trading strategies suitable for deep neural networks (DNNs) are also proposed and verified for effectiveness. The study introduces five well-known TIs and demonstrates the highest accuracy of 68.83% in predicting stock trends. Additionally, exporting the probability of the deep model to the trading strategy is introduced, resulting in the highest return on investment of 42.74% on the back test of TPE0050. GRU has fewer parameters to train compared to LSTM. Hamayel and Owda (2021) proposed and compared three recurrent neural network (RNN) models for predicting cryptocurrency prices: LSTM, bi-LSTM, and GRU. The models were evaluated using mean absolute percentage error (MAPE). Results showed that the GRU model outperformed LSTM and bi-LSTM for all three cryptocurrencies (Bitcoin, Ethereum, and Litecoin), with the lowest MAPE percentages. The bi-LSTM model had the highest prediction error. Overall, the proposed models showed accurate predictions of cryptocurrency prices.

CNN has been given less emphasis in forecasting stock prices. Generally, CNN is used for computer vision, but recently, it has been applied to stock time series forecasting. Khodaee et al. (2022) developed a hybrid model consisting of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) to forecast Turning Points (TPs) in stock prices. The model first classified each day in the time series as a TP or Ordinary Point (OP) and used a balancing approach to have a balanced number of TPs and OPs. Technical indicators were then converted into 2D images to consider their relationship, and the Fuzzy C-Means algorithm was applied to segment the inputs and aid training efficiency. A classification hybrid CNN-LSTM-ResNet model was proposed to forecast TPs and OPs, and augmentation techniques, including Residual Networks (ResNet), were employed. The proposed model outperformed other benchmarks with an average accuracy of 60.19% in Dow-30 and 63.62% in ETFs, achieving a profit of up to three times in Dow-30 and up to four times more than the Buy and Hold strategy in ETFs.

2.2. Research gaps

Most research in time series stock forecasting has primarily focused on predicting prices for the next day, leaving a gap in the literature regarding the prediction of long-term prices (Nazareth and Reddy). In this study, the authors addressed this gap by predicting long-term prices. Additionally, most studies have experimented with a single index. However, this study used the daily historical price of the top five global economies' indices and predicted the daily price for the next year at once, providing investors and traders with a long-term market outlook to make informed investment decisions and improve risk management. Furthermore, unlike other studies, this study exhaustively experimented with six deep learning models, namely DNN, RNN, LSTM, Bi-LSTM, GRU, and CNN, to forecast long-term stock prices. This extensive research differentiates this study from other studies and addresses all these gaps in the literature.

3. Deep learning models

3.1. Deep Neural Network (DNN)

ANNs were proposed in the 1940s as the simplest model to mimic how human brains process information and learn from it. However, ANN learning becomes challenging if data increases in size (Awad and Khanna, 2015). Multi-layer Perceptron (MLP) is a variant of feedforward ANN and is the foundation of DNN (Sarker, 2021). This study calls a fully connected MLP with two or more hidden layers a DNN. DNNs are highly scalable and can handle large and complex datasets with ease. Hinton et al. (2006) first proposed DNNs in 2006. In a DNN, there are three distinct layer types, starting with the input layer, then two or more hidden layers, and concluding with an output layer. The input layer receives and processes the input data, which is then transformed through the subsequent hidden layers via activation functions. Each hidden layer comprises multiple neurons, where the output of each neuron is fully connected to all neurons in the next layer, creating a dense layer. Lastly, the output layer produces the network's final output, which can either be a single value or a vector of values. Fig. 1 shows the architecture of a typical DNN.
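The layer structure described above can be illustrated with a minimal pure-Python sketch of a forward pass through a small fully connected network. The layer sizes, the random weights, and the helper names (`dense`, `init_layer`, `dnn_forward`) are hypothetical choices made for illustration only; they are not the configuration or code used in the study.

```python
import random

def relu(values):
    """ReLU activation applied element-wise."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: every output neuron sees every input value."""
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return activation(out) if activation else out

def init_layer(n_in, n_out, rng):
    """Hypothetical small random initialisation, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dnn_forward(x, layers):
    """Pass x through the hidden layers (ReLU), then a linear output layer."""
    *hidden, output = layers
    for weights, biases in hidden:
        x = dense(x, weights, biases, activation=relu)
    weights, biases = output
    return dense(x, weights, biases)

rng = random.Random(0)
# Input of size 3, two hidden layers of 4 neurons each, one output value.
layers = [init_layer(3, 4, rng), init_layer(4, 4, rng), init_layer(4, 1, rng)]
prediction = dnn_forward([0.2, 0.5, 0.1], layers)
```

With two hidden layers, this toy network meets the study's working definition of a DNN; a real model would learn the weights by training rather than draw them at random.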
4. Methodologies
In this study, all deep learning models share the same configuration so that the comparison is unbiased. Fig. 7 shows the deep learning layers architecture. The first layer of each model is the input layer. Generally, data is normalized before being fed into the model (Ghaderzadeh et al., 2022)–(Gheisari et al., 2023). The dates are converted into integers from 1 to n. Along with the integer dates, this layer also takes as input a transformed array of close prices scaled with the min-max formula in Eq. 1:
$$X_{scaled} = \frac{X - X_{min}}{X_{max} - X_{min}} \qquad (1)$$
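As a small worked example of Eq. 1 and of the inverse transform that is later applied to model outputs, the following Python sketch scales a short price series and maps it back. The price values and the function names (`fit_min_max`, `scale`, `inverse_scale`) are assumptions for illustration and are not taken from the study's code.

```python
def fit_min_max(values):
    """Return (X_min, X_max) of the series used to fit the scaler."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a constant series")
    return lo, hi

def scale(values, lo, hi):
    """Eq. 1: map each value into [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]

def inverse_scale(scaled, lo, hi):
    """Invert Eq. 1 to recover values on the original price scale."""
    return [s * (hi - lo) + lo for s in scaled]

closes = [100.0, 110.0, 105.0, 120.0]  # hypothetical closing prices
lo, hi = fit_min_max(closes)
scaled = scale(closes, lo, hi)            # [0.0, 0.5, 0.25, 1.0]
restored = inverse_scale(scaled, lo, hi)  # back to the original prices
```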
In Eq. 1, X is the feature matrix, and X_min and X_max are its minimum and maximum values, respectively. The input layer passes its output to the model's first layer. The first layer of the model is the corresponding deep learning layer, that is, DNN, RNN, LSTM, Bi-LSTM, GRU, or CNN. There are 256 fully connected neuron units. This layer has a ReLU activation function. Deep learning models can sometimes overfit the training data. Hence, it is important to design a prediction method that tackles the problem of overfitting the training data. A dropout layer is added after the first layer with 256 units to handle overfitting of the training data. The dropout rate is kept at 20% in the dropout layer for all models.

A second layer of deep learning units is added to make the learning deep. The second layer also has 256 units and a ReLU activation function. A second dropout layer with the same 20% dropout rate is stacked on this 256-unit layer. Finally, a dense layer with a single neuron is added with a sigmoid activation function. The data is input into the models with a batch size of 32. The models are compiled and trained using the Adam optimizer and mean squared error as the loss function. For all models, 100 epochs are used to train them. The output from the model is inverse-transformed and compared with the original prices. The final evaluation of the models is done based on RMSE and MAPE. The parameters of the algorithm are given in Table 1.

Fig. 7. Deep learning model layers architecture.

Table 1
Parameters.
Parameter     Value
Loss          mse
Optimizer     adam
Metrics       MeanSquaredError
Epochs        100
Batch Size    32

4.2. The experimental setup

In this research, an investigation is conducted using five prominent global indices, which are the Nifty, the Dow Jones Industrial Average (DJIA), the DAX performance index (DAX), the Nikkei 225 (NI225), and the Shanghai Stock Exchange composite index (SSE). These indices belong to India, the USA, Germany, Japan, and China. The countries are
the top five economies of the world. Because the stock market reflects a country's economy, these indices from diverse economies can help test the robustness of the deep learning models. The selection of the top five economies by GDP was based on their economic significance, diversity, and global impact. Using multiple indices tests the robustness of the deep learning models across different markets. This comparative approach enables the study to identify trends, patterns, and variations in the effectiveness of the deep learning models across diverse economic landscapes. The data is collected from Yahoo Finance for around ten years, from January 1st, 2013, to February 28th, 2023. Such a long period helps the models train on different market phases, such as bull, bear, and stagnant. The last year's data is kept for testing purposes, while the rest is used for training. Upon completion of training, the trained models make predictions of future prices using dates from the testing data. The predicted prices obtained in the testing phase are inverse-transformed and compared with the actual prices. The efficacy of the models is evaluated using RMSE on both training and testing data. MAPE is used to compare the results obtained from different indices because the value of indices differs from market to market. Fig. 8 shows the overall experimental design.
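The setup above can be sketched in a few lines of Python: dates become integers from 1 to n, the most recent observations are held out for testing, and predictions are scored with RMSE and MAPE. The formulas below are the standard definitions of these metrics, which the paper is assumed to follow; the prices, the predictions, and the helper names are hypothetical.

```python
import math

def integer_dates(n):
    """Replace calendar dates with the integers 1..n."""
    return list(range(1, n + 1))

def holdout_split(dates, prices, test_size):
    """Keep the last `test_size` observations for testing, the rest for training."""
    if not 0 < test_size < len(prices):
        raise ValueError("test_size must be between 1 and len(prices) - 1")
    cut = len(prices) - test_size
    return (dates[:cut], prices[:cut]), (dates[cut:], prices[cut:])

def rmse(actual, predicted):
    """Root mean squared error, in the units of the index."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error; scale-free, so comparable across indices."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)

prices = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0]  # hypothetical closes
dates = integer_dates(len(prices))
(train_d, train_p), (test_d, test_p) = holdout_split(dates, prices, test_size=2)

predicted = [99.0, 108.0]  # hypothetical model output for the test dates
test_rmse = rmse(test_p, predicted)
test_mape = mape(test_p, predicted)
```

In the study the held-out block is the last year of daily data rather than two points; the split logic is the same.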
Table 3
DJIA.
Model Train RMSE Test RMSE Train MAPE Test MAPE
5.1. Nifty
Table 2 shows the results of all models on the Nifty index. In the training phase, CNN performed the best, and LSTM performed the worst. Further, DNN is the second best in terms of training RMSE and MAPE (Table 2). However, in the testing phase, LSTM performed the best, and CNN was the second worst in performance. This indicates that fitting well in the training phase does not guarantee better performance in the testing phase. Bi-LSTM also performed better than LSTM in the training phase but not in the testing phase. Fig. 9 shows the price pattern and predictions in the training and testing phases. The gray line separates the training and testing periods.
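The point that a good training fit does not guarantee a good test result can be checked directly from the MAPE columns of Table 2. The short Python sketch below copies those values and computes the test-minus-train gap per model; the variable names are illustrative only.

```python
# model: (train MAPE, test MAPE), copied from Table 2 (Nifty)
nifty_mape = {
    "DNN": (3.52, 6.2),
    "RNN": (3.56, 6.8),
    "LSTM": (4.51, 3.95),
    "Bi-LSTM": (4.46, 4.96),
    "GRU": (3.6, 5.98),
    "CNN": (3.49, 6.64),
}

# A large positive gap means the model fits training data far better than test data.
gap = {model: round(test - train, 2)
       for model, (train, test) in nifty_mape.items()}

best_train = min(nifty_mape, key=lambda m: nifty_mape[m][0])
best_test = min(nifty_mape, key=lambda m: nifty_mape[m][1])
```

On these numbers the best training fit (CNN) has a gap of 3.15 percentage points, while the best test performer (LSTM) has a negative gap of 0.56, meaning its test MAPE is lower than its training MAPE.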
Fig. 10. DJIA predictions.

5.2. The Dow Jones Industrial Average (DJIA)

Table 3 shows the results of all models on the DJIA index. The performance of CNN in the training phase is the best in the DJIA index, similar to that of CNN on Nifty. DNN is the second best in terms of training RMSE and training MAPE. GRU performed worst in the training period, and LSTM was the second worst. However, the scenario changes in the testing phase. The LSTM model again performed the best, while the CNN performed the worst. Fig. 10 shows the performance of deep learning models during the training and testing period. The MAPE values are higher in DJIA compared to the Nifty index.

Table 2
Nifty.
Model     Train RMSE   Test RMSE   Train MAPE   Test MAPE
DNN       503.34       1255.32     3.52         6.2
RNN       510.74       1353.29     3.56         6.8
LSTM      581.37       881.38      4.51         3.95
Bi-LSTM   576.12       1064.63     4.46         4.96
GRU       506.26       1224.16     3.6          5.98
CNN       499.28       1326.14     3.49         6.64

Table 4
DAX.
Model     Train RMSE   Test RMSE   Train MAPE   Test MAPE
DNN       596.34       2453.73     3.71         17.06
RNN       601.07       2479.58     3.67         17.26
LSTM      684.68       2246.89     4.53         15.48
Bi-LSTM   681.42       2473.32     4.52         17.21
GRU       679.04       2414.38     4.45         16.76
CNN       598.2        2421.44     3.74         16.82

5.3. DAX performance index (DAX)

Table 4 shows the performance of deep learning models on the DAX performance index. Notably, predictions show big RMSE and MAPE values. This could be attributed to a sharp fall during the predicted period. In contrast to the performance of CNN on the Nifty and DJIA, the
performance of CNN is not the best during the training phase. The DNN model performed slightly better than the CNN model. Following a similar pattern in the previous two indices, the LSTM model performed the worst regarding the training RMSE and MAPE. However, the LSTM is the best-performing model during testing in DAX. The MAPE values are higher in DAX compared to both Nifty and DJIA. Fig. 11 visualizes the pattern of prices both during the training and testing phase.

Table 5
Nikkei 225.
Model     Train RMSE   Test RMSE   Train MAPE   Test MAPE
DNN       1016.41      3013.85     3.91         10.8
RNN       996.18       2932.33     3.89         10.49
LSTM      1421.02      3080.97     5.88         11.06
Bi-LSTM   1259.34      2908.46     5.14         10.4
GRU       1058.77      2610.49     4.23         9.25
CNN       990.39       2989.09     3.8          10.71
5.4. Nikkei 225 (NI225)
Table 5 reports the results of the six models on Nikkei 225. The re
sults are similar to previous experiments. The CNN model shows supe
riority during the training phase, and the GRU model outperforms other
models during the testing period. The MAPE values are smaller than
DJIA and DAX but bigger than Nifty. Fig. 12 shows the original close prices together with the fitted and predicted prices during the training and testing periods, respectively.
Fig. 12. NI225 predictions.

5.5. Shanghai Stock Exchange composite index (SSE)

Table 6 shows the performance of the models on the SSE index. RNN fits the best on training data of the SSE index, and LSTM fits the worst. The Bi-LSTM model performed the best on testing data. The LSTM model is in second place. The CNN model again performed poorly on the testing data. Fig. 13 visualizes the results of all models. It can be seen that the CNN predicted prices are farthest from the testing close prices.
5.6. Discussion
Across the indices, the CNN model tends to overfit the training prices and underfit the testing prices. In contrast, the LSTM model tends to underfit the training prices but performs the best on the testing data. Table 7 shows the consolidated result. RMSE cannot be used to compare the models across the five indices because the scales of close prices are different for each index. Hence, MAPE is appropriate for comparison. Except for LSTM, the Bi-LSTM is superior to all other models. This can be associated with the fact that LSTM and Bi-LSTM have a lot of similarities in architecture. The performance of GRU and DNN is the same on testing data. However, DNN fits better on the training data compared to the GRU. GRU also has a gated mechanism like LSTM and Bi-LSTM, but it cannot outperform these models in predicting long-term prices (see Table 8).
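The ordering discussed above follows directly from the consolidated MAPE values. The Python sketch below copies the figures from Table 7 and ranks the models on test MAPE, breaking the DNN and GRU tie on training MAPE; the tie-breaking rule and variable names are choices made for this illustration.

```python
# model: (train MAPE, test MAPE), copied from Table 7 (consolidated)
consolidated = {
    "DNN": (3.74, 11.05),
    "RNN": (3.77, 11.93),
    "LSTM": (4.76, 10.38),
    "Bi-LSTM": (4.37, 10.90),
    "GRU": (4.04, 11.05),
    "CNN": (3.69, 12.60),
}

# Rank by test MAPE; ties are broken by the lower training MAPE.
by_test = sorted(consolidated,
                 key=lambda m: (consolidated[m][1], consolidated[m][0]))
by_train = sorted(consolidated, key=lambda m: consolidated[m][0])
```

The two rankings are close to mirror images: CNN has the lowest training MAPE and the highest test MAPE, while LSTM has the highest training MAPE and the lowest test MAPE.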
Table 8 shows the best and worst performers during the training and testing phases. RNN is the second-worst performer, possibly due to the vanishing gradient problem. CNN performed the poorest compared to other models, and this might be because CNN is designed to work with computer vision and images. It can be used for time series analysis, but with a similar deep learning architecture, it does not predict long-term prices well.

Fig. 13. SSE predictions.

Table 6
SSE.
Model Train RMSE Test RMSE Train MAPE Test MAPE
Table 7
Consolidated.
Model      Train MAPE   Test MAPE
DNN        3.74         11.05
RNN        3.77         11.93
LSTM       4.76         10.38
Bi-LSTM    4.37         10.90
GRU        4.04         11.05
CNN        3.69         12.60

Table 8
Overview of models’ results.
Index    Training: The Best   Training: The Worst   Testing: The Best   Testing: The Worst
Nifty    CNN                  LSTM                  LSTM                RNN
DJIA     CNN                  GRU                   LSTM                CNN
DAX      CNN                  LSTM                  LSTM                RNN
NI225    CNN                  LSTM                  GRU                 LSTM
SSE      RNN                  LSTM                  Bi-LSTM             CNN

6. Conclusion

Based on the analysis, the study found that the LSTM model is superior to other models in predicting long-term stock prices for the five indices considered. The LSTM model may perform poorly on training data compared to other models. Still, it performed well on the testing data, indicating that it is an appropriate model to predict long-term stock prices of global indices. In contrast, the CNN model tends to overfit the training data and forecasts poorly on the testing data, making it unsuitable for predicting long-term stock prices. Furthermore, the study found that the performance of the models varies across the different indices, indicating that the stock market trends in different regions are not identical. Additionally, the study highlights the importance of selecting appropriate evaluation metrics when comparing the performance of the models. In this case, MAPE was used because it accounts for the difference in scale among the indices. Overall, the findings of this study could be useful for investors and traders who rely on stock price predictions to make informed investment and risk mitigation decisions. The LSTM model, in particular, could be a valuable tool for predicting long-term stock prices, providing a competitive edge in the stock market.
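The role of scale in choosing between RMSE and MAPE can be illustrated with a small Python sketch. It uses the standard textbook definitions of the two metrics, which are assumed here to match those used in the study. The price series are hypothetical values chosen for illustration and are not data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error, expressed in the units of the series."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical index levels: the second series is the first one times 100,
# and every prediction is 2% above the actual value in both cases.
small_actual = [100.0, 110.0, 120.0]
small_pred = [a * 1.02 for a in small_actual]
large_actual = [a * 100 for a in small_actual]
large_pred = [p * 100 for p in small_pred]

print("RMSE:", rmse(small_actual, small_pred), rmse(large_actual, large_pred))
print("MAPE:", mape(small_actual, small_pred), mape(large_actual, large_pred))
```

Both series carry the same relative error, so MAPE is about 2 percent in each case, while RMSE is 100 times larger for the higher-valued series. This is why a scale-free metric is the more natural choice when consolidating results across indices that trade at very different levels.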
Compliance with ethical standards

This article does not contain any studies with human participants or animals performed by any of the authors.

CRediT authorship contribution statement

Mohit Beniwal: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft. Archana Singh: Writing – review & editing, Supervision. Nand Kumar: Writing – review & editing, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.