


Deep Neural Network for Multivariate
Time-Series Forecasting

Samit Bhanja and Abhishek Das

Abstract Recently, the Deep Neural Network (DNN) architecture with a deep learning approach has become one of the most robust techniques for time-series forecasting. Although DNNs provide fair forecasting results for time-series prediction, they still face several challenges, because most time-series data, especially financial time-series data, are multidimensional, dynamic, and nonlinear. Hence, to address these challenges, we propose a new deep learning model, the Stacked Long Short-Term Memory (S-LSTM) model, to forecast multivariate time-series data. The proposed S-LSTM model is constructed by stacking multiple Long Short-Term Memory (LSTM) units. In this research work, we have used six different data normalization techniques to normalize the dataset as the preprocessing step of the deep learning methods. To evaluate and analyze the performance of the proposed S-LSTM model, we have used multivariate financial time-series data, namely, stock market data collected from two stock exchanges, the Bombay Stock Exchange (BSE) and the New York Stock Exchange (NYSE). The experimental results show that the prediction performance of the S-LSTM model can be improved with the appropriate selection of the data normalization technique. The results also show that the prediction accuracy of the S-LSTM model is higher than that of other well-known methods.

Keywords LSTM · RNN · DNN · Stock market prediction · Data normalization technique

S. Bhanja
Government General Degree College, Singur 712409, Hooghly, India
e-mail: samitbhanja@gmail.com
A. Das (B)
Aliah University, New Town, Kolkata 700160, India
e-mail: adas@aliah.ac.in

© Springer Nature Singapore Pte Ltd. 2021


D. Bhattacharjee et al. (eds.), Proceedings of International Conference on Frontiers
in Computing and Systems, Advances in Intelligent Systems and Computing 1255,
https://doi.org/10.1007/978-981-15-7834-2_25

1 Introduction

Multivariate time-series data is a set of multidimensional data collected at fixed time intervals. Prediction of multivariate time-series is an important Time-Series Forecasting (TSF) problem with a wide range of application areas, viz., weather, pollution, finance, business, energy, etc. Successful prediction of multivariate time-series data would greatly benefit many aspects of modern life.
In recent times, the prediction of financial time-series data, especially stock market prediction, has become one of the most important application areas for researchers. Successful prediction of the stock market has a great influence on the socio-economic environment of a country. It also helps investors decide early whether to sell or buy stock shares or bonds, which reduces the risk of investment. Many complex factors influence the stock market, and stock market data are high dimensional, volatile, and nonlinear; for these reasons, forecasting the stock market is a highly challenging task.
Data normalization is the primary data preprocessing step for any DNN model that processes time-series data. Time-series data, especially stock market data, vary over a wide range; hence, to produce good quality data and to accelerate the learning process, data normalization is essential. The efficiency of any DNN model depends highly on the selection of a proper data normalization technique.
The main focus of this research work is to develop a new DNN model to forecast multivariate time-series data with a high degree of accuracy, and also to find the most effective data normalization method for the DNN model.
In this work, we have proposed a DNN model, S-LSTM, to forecast the stock market as a multivariate time-series forecasting problem. We have also analyzed its performance under different data normalization techniques to identify the most suitable data normalization method for deep learning algorithms forecasting multivariate time-series data.
We have organized our paper as follows: the literature review is represented in
Sect. 2. In Sect. 3, we have provided the basic concepts. Data normalization meth-
ods are presented in Sect. 4. The proposed framework and dataset description are
represented in Sect. 5. Results and discussion are described in Sect. 6, and finally in
Sect. 7, the conclusion is drawn.

2 Literature Review

In the last few years, many sincere efforts have been made to predict the stock market successfully. These efforts are broadly classified into two categories, viz., the statistical approach and the soft-computing approach. Support Vector Machine (SVM) [1] and Autoregressive Integrated Moving Average (ARIMA) [2] are

the two well-known statistical methods for time-series forecasting. These statistical models can handle nonlinear time-series data and exhibit a high degree of success in predicting univariate time-series data.
Artificial Neural Network (ANN)-based models are the most popular soft-computing approach for time-series prediction. They can perform a large variety of computational tasks faster than the traditional approach [3]. The Multilayer Perceptron (MLP) neural network and the Back Propagation (BP) [4, 5] neural network are popular ANN models. These models have been successfully applied to solve various problems, viz., classification problems, time-series forecasting problems, etc. However, these ANN models are not suitable for large volumes of highly dynamic, nonlinear, and complex data.
Nowadays, Deep Neural Networks (DNNs) [6–9] exhibit great success in a wide range of application areas, including multivariate time-series forecasting. The basic difference between shallow neural networks and deep neural networks is that shallow networks have only one hidden layer, whereas deep neural networks have many hidden layers. These multiple hidden layers allow DNNs to extract complex features from large volumes of highly dynamic and nonlinear time-series data.
In recent times, Recurrent Neural Networks (RNNs) [10, 11] have become one of the most popular DNN architectures for time-series classification and forecasting problems. In an RNN, the output of one time stamp is considered as the input of the next time stamp, and this time-stamp concept makes it most suitable for processing time-series data. However, RNNs suffer from the vanishing gradient and exploding gradient problems, and therefore cannot represent the long-term dependencies of historical time-series data. The Long Short-Term Memory (LSTM) [12, 13] is a specialized RNN that overcomes the shortfalls of traditional RNNs.

3 Basic Concepts

When a neural network has two or more hidden layers, it becomes a Deep Neural Network (DNN). The most common neural networks, viz., Multilayer Perceptron (MLP) or feedforward neural networks with two or more hidden layers, are representative DNN models. DNN models are the basis of deep learning algorithms. Their multiple hidden layers allow DNN models to capture complex features from large volumes of data, and also to process nonlinear and highly dynamic information. In recent times, a number of DNN models have been proposed. Among them, the Recurrent Neural Network (RNN) is one of the most popular DNN models for processing time-series data.

3.1 Recurrent Neural Networks (RNNs)

The RNN [10] is one of the most powerful Deep Neural Network (DNN) models for processing sequential data. It was first developed in 1986. Since it performs the same set of operations on every element of the sequential data, it is called a recurrent neural network. In theory, it can process very long sequences of time-series data, but in practice, it can look only a limited number of steps behind. Figure 1 shows the typical architecture of an RNN and its expanded form.
In an RNN, the following equations are used for computation:

h_t = f(U x_t + W h_(t−1))    (1)

O_t = softmax(V h_t)    (2)

where h_t and x_t are, respectively, the hidden state and the input at time stamp t, O_t is the output at time stamp t, and f is a nonlinear function, viz., tanh or ReLU.
The basic difference between traditional DNNs and the RNN is that the RNN uses the same set of parameters (U, V, W as above) at every step. This parameter sharing drastically reduces the total number of learnable parameters of the model.
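Equations (1) and (2) can be sketched in a few lines of NumPy. This is only an illustrative sketch (the paper's experiments used MATLAB); the dimensions, weight initialization, and function names are assumptions for the example. Note that the same U, W, and V are reused at every time step, which is exactly the parameter sharing described above.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def rnn_step(x_t, h_prev, U, W, V):
    """One RNN time step, following Eqs. (1) and (2)."""
    h_t = np.tanh(U @ x_t + W @ h_prev)  # hidden state, Eq. (1), f = tanh
    o_t = softmax(V @ h_t)               # output, Eq. (2)
    return h_t, o_t

# Toy dimensions: 4 input features, 8 hidden units, 3 outputs (all assumed).
rng = np.random.default_rng(0)
U = rng.normal(size=(8, 4))
W = rng.normal(size=(8, 8))
V = rng.normal(size=(3, 8))

h = np.zeros(8)                          # initial hidden state
for x in rng.normal(size=(5, 4)):        # same U, W, V reused at every step
    h, o = rnn_step(x, h, U, W, V)
```

After the loop, `h` carries a summary of the whole 5-step sequence, and `o` is a probability vector over the 3 outputs.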

3.2 Time-Series Data

If a series of data is collected at fixed time intervals, the dataset is called time-series data. If every data point of a time-series dataset is a set of values instead of a single value, that type of time-series data is called multivariate time-series data. There are numerous application areas where multivariate time-series data are present, viz., weather, pollution, sales, stocks, etc., and these data can be analyzed for forecasting purposes [14, 15]. The general format of time-series data is as follows:

Fig. 1 A typical RNN and its expanded architecture



X = {x(1), x(2), ..., x(t)}    (3)

where x(t) is the current value and x(1) is the oldest value. If X is multivariate time-series data, then every data point x(i) is a vector of fixed length k, i.e., x(i) = {x_(i,1), x_(i,2), ..., x_(i,k)}.
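In array terms, such a series is simply a (t, k) matrix: one row per time stamp, one column per attribute. A minimal NumPy sketch (the values and k = 4 are illustrative; k = 4 anticipates the four stock attributes used later):

```python
import numpy as np

# A multivariate series X = {x(1), ..., x(t)} with k = 4 features per step.
t, k = 6, 4
X = np.arange(t * k, dtype=float).reshape(t, k)  # row i holds x(i+1)

x_oldest = X[0]    # x(1), the oldest observation
x_latest = X[-1]   # x(t), the current observation
```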

4 Data Normalization Methods

The efficiency of any DNN model is heavily dependent on the normalization method [16]. The main objective of data normalization is to generate quality data for the DNN model. Nonlinear time-series data, especially stock market data, fluctuate over a large scale. So, data normalization is essential to scale the data down to a smaller range and thereby accelerate the learning process of DNN models. Although a number of data normalization techniques are available, in all of them, each input value a of each attribute A of the multivariate time-series data is converted to a_norm in the range [low, high]. Some of the well-known data normalization techniques are described below.

4.1 Min-Max Normalization

Here, the data are scaled down to a range of [0, 1] or [−1, 1]. The formula for this method is as follows:

a_norm = low + (high − low) · (a − min_A) / (max_A − min_A)    (4)

where min_A and max_A are, respectively, the smallest and the largest values of the attribute A.
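Eq. (4) can be sketched directly in NumPy (an illustrative sketch; the function name and example values are assumptions):

```python
import numpy as np

def min_max_normalize(a, low=0.0, high=1.0):
    """Eq. (4): scale attribute values into the range [low, high]."""
    a = np.asarray(a, dtype=float)
    return low + (high - low) * (a - a.min()) / (a.max() - a.min())

prices = [100.0, 150.0, 200.0]
scaled = min_max_normalize(prices)                    # maps into [0, 1]
scaled_sym = min_max_normalize(prices, -1.0, 1.0)     # maps into [-1, 1]
```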

4.2 Decimal Scaling Normalization

In this method, all the values of each attribute are converted to pure fractions by moving the decimal point of each value, where the decimal point movement is determined by the maximum value of the attribute:

a_norm = a / 10^d    (5)

where d is the number of digits in the integer part of the largest value of attribute A.
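A short sketch of Eq. (5) (illustrative; the function name and sample values are assumptions). With a maximum of 980, d = 3, so every value is divided by 1000:

```python
import numpy as np

def decimal_scaling_normalize(a):
    """Eq. (5): divide by 10^d, where d is the digit count of the largest value."""
    a = np.asarray(a, dtype=float)
    d = len(str(int(np.max(np.abs(a)))))  # digits in the integer part
    return a / 10 ** d

vals = [15.0, 250.0, 980.0]
scaled = decimal_scaling_normalize(vals)
```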

4.3 Z-Score Normalization

In this normalization method, all the values of each attribute A are rescaled so that the attribute has a mean of 0 and a standard deviation of 1. The formula is as follows:

a_norm = (a − μ(A)) / δ(A)    (6)

where μ(A) and δ(A) are, respectively, the mean value and the standard deviation of the attribute A.
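Eq. (6) in NumPy (an illustrative sketch; names and values are assumptions). The output always has zero mean and unit standard deviation:

```python
import numpy as np

def z_score_normalize(a):
    """Eq. (6): subtract the attribute mean, divide by its standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std()

scaled = z_score_normalize([2.0, 4.0, 6.0])
```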

4.4 Median Normalization

In this method, all the values of each attribute A are normalized by the following formula:

a_norm = a / median(A)    (7)
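Eq. (7) as a one-line NumPy sketch (function name and values are illustrative assumptions); values below the median map below 1, values above it map above 1:

```python
import numpy as np

def median_normalize(a):
    """Eq. (7): divide every value by the attribute's median."""
    a = np.asarray(a, dtype=float)
    return a / np.median(a)

scaled = median_normalize([10.0, 20.0, 40.0])  # median is 20
```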

4.5 Sigmoid Normalization

In this technique, the sigmoid function is used to normalize all the values of each attribute A. The formula is as follows:

a_norm = 1 / (1 + e^(−a))    (8)
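Eq. (8) maps any real value into the open interval (0, 1), with 0 mapping to exactly 0.5. A minimal sketch (function name and sample values are assumptions):

```python
import numpy as np

def sigmoid_normalize(a):
    """Eq. (8): logistic sigmoid, maps any real value into (0, 1)."""
    a = np.asarray(a, dtype=float)
    return 1.0 / (1.0 + np.exp(-a))

scaled = sigmoid_normalize([-2.0, 0.0, 2.0])
```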

4.6 Tanh Estimators

This method was developed by Hampel. Here, the data normalization is done by the following formula:

a_norm = 0.5 · [tanh(0.01 · (a − μ) / δ) + 1]    (9)

where μ is the mean value of the attribute A and δ is the standard deviation of the attribute A.
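Eq. (9) in NumPy (an illustrative sketch; names and values are assumptions). Like the sigmoid, it maps into (0, 1), and a value equal to the mean maps to exactly 0.5; the small 0.01 factor makes the mapping nearly linear around the mean while still clipping extreme outliers:

```python
import numpy as np

def tanh_estimator_normalize(a):
    """Eq. (9): Hampel's tanh estimator normalization."""
    a = np.asarray(a, dtype=float)
    mu, sigma = a.mean(), a.std()
    return 0.5 * (np.tanh(0.01 * (a - mu) / sigma) + 1.0)

scaled = tanh_estimator_normalize([100.0, 150.0, 200.0])  # mean is 150
```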

5 Proposed Framework and Dataset Description

In this section, we describe the overall architecture of our proposed DNN model, named the Stacked Long Short-Term Memory (S-LSTM) model. Figure 2 shows the detailed architecture of the proposed S-LSTM model. The basic building block of the S-LSTM model is the LSTM unit. The main reason for selecting the LSTM unit over the RNN is that the RNN suffers from the vanishing gradient and exploding gradient problems, and due to these problems, the RNN is not capable of learning features from long sequences of historical time-series data. By contrast, the LSTM unit has a gated structure, and due to this gated structure, it can extract features from long sequences of historical data. The key part of the LSTM unit is its memory cell (cell state). This memory cell comprises three gates, viz., the input gate, the forget gate, and the output gate. The basic gated structure of the LSTM unit is shown in Fig. 3 [9].

Fig. 2 Proposed forecasting model

Fig. 3 Gated structure of LSTM



We develop our proposed S-LSTM model by stacking N LSTM layers [13], as shown in Fig. 2, and each layer can be expanded to t LSTM units, where t is the number of time stamps. For example, if we want to forecast the fifth-day closing price of the stock market based on the previous 4 days' data, then each LSTM layer must be expanded to four LSTM units.
In this research work, we have taken historical stock market data as an example of multivariate time-series data. We have used these data to evaluate and analyze the performance of our proposed S-LSTM model and to test the effectiveness of each data normalization method on it. Although stock market data have a large number of parameters, in this work, we have considered only four important parameters, viz., the opening price, low price, high price, and closing price. We have taken the historical stock market data from two stock exchanges, viz., the Bombay Stock Exchange (BSE) [17] and the New York Stock Exchange (NYSE) [18]. We have collected a total of 1717 days of BSE data and 1759 days of NYSE data, from January 1, 2012 to December 31, 2018. We have used the first 70% of the data (1202 days of BSE data and 1232 days of NYSE data) as labeled data for training. The next 15% of the data are used as labeled data for validation, and the remaining 15% are used as unlabeled data for testing our proposed model. We have set the number of stacked layers of the proposed S-LSTM model to 3. We forecast the seventh day's closing price based on the previous 6 days' opening price, high price, low price, and closing price.
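The data preparation just described — slicing the (days × 4 attributes) series into 6-day input windows with the seventh day's closing price as the target, then splitting 70/15/15 in time order — can be sketched as follows. This is a Python/NumPy sketch only (the paper's experiments used MATLAB); the function name, random stand-in data, and column order (close in column 3) are assumptions:

```python
import numpy as np

def make_windows(series, lookback=6, target_col=3):
    """Slice a (days, features) series into (6-day input, 7th-day close) pairs."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])            # 6 days x 4 features
        y.append(series[i + lookback, target_col])  # next day's closing price
    return np.array(X), np.array(y)

rng = np.random.default_rng(1)
data = rng.random((100, 4))   # stand-in for open/low/high/close rows

X, y = make_windows(data)

# Chronological 70/15/15 split, as in the paper.
n_train = int(0.70 * len(X))
n_val = int(0.15 * len(X))
X_train, X_val, X_test = X[:n_train], X[n_train:n_train + n_val], X[n_train + n_val:]
```

Each `X[i]` is then fed through the stacked LSTM layers, one LSTM unit per time stamp.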

6 Results and Discussion

In this research work, we have performed all the experiments in MATLAB R2016b with the Neural Network Toolbox.
Here, as the performance metrics, we have used the Mean Absolute Error (MAE) and the Mean Squared Error (MSE). The formulae for calculating these errors are as follows:

MAE = (1/k) · Σ_(i=1)^(k) |o_i − p_i|    (10)

MSE = (1/k) · Σ_(i=1)^(k) (o_i − p_i)^2    (11)

where k is the number of observations, and o_i and p_i are, respectively, the actual value and the predicted value.
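Eqs. (10) and (11) in NumPy (an illustrative sketch; function names and the toy values are assumptions):

```python
import numpy as np

def mae(o, p):
    """Eq. (10): Mean Absolute Error."""
    o, p = np.asarray(o, dtype=float), np.asarray(p, dtype=float)
    return float(np.mean(np.abs(o - p)))

def mse(o, p):
    """Eq. (11): Mean Squared Error."""
    o, p = np.asarray(o, dtype=float), np.asarray(p, dtype=float)
    return float(np.mean((o - p) ** 2))

actual = [1.0, 2.0, 3.0]
pred = [1.0, 2.5, 2.0]
# mae -> (0 + 0.5 + 1) / 3 = 0.5 ; mse -> (0 + 0.25 + 1) / 3
```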
In Tables 1 and 2, we present the different prediction errors (MSE and MAE) of the proposed model for each data normalization method on the BSE and NYSE data, respectively. Figures 4 and 5 graphically show the forecasted closing price of BSE

Table 1 Forecasting errors of BSE dataset


Normalization method MSE MAE
Min-Max 3.1579e–05 0.0046
Decimal scaling 1.3143e–07 2.7651e–04
Z-Score 3.0571e–04 0.0161
Sigmoid 3.1234e–08 1.3581e–04
Tanh Estimator 1.2439e–08 8.3422e–05
Median 2.3169e–06 0.0017

Table 2 Forecasting errors of NYSE dataset


Normalization method MSE MAE
Min-Max 2.1031e–05 0.0041
Decimal scaling 1.7521e–06 9.8705e–04
Z-Score 2.3471e–04 0.0215
Sigmoid 7.8731e–08 2.0161e–04
Tanh Estimator 1.6359e–08 9.3741e–05
Median 1.5125e–06 8.8129e–04

Fig. 4 Forecasting results of BSE

and NYSE for each data normalization technique. In Table 3, we have compared our proposed S-LSTM model with other popular models with respect to their prediction errors (MSE and MAE).
From Tables 1 and 2, we can observe that the prediction errors vary with the different normalization methods, and that the Tanh estimator produces lower prediction errors for the prediction of both the BSE and NYSE indices compared to the other

Fig. 5 Forecasting results of NYSE

Table 3 Forecasting errors of different models for BSE dataset


Model MSE MAE
SVM 3.9652e–02 6.3119e–01
ARIMA 2.6943e–05 9.1638e–03
S-LSTM 1.2439e–08 8.3422e–05
RNN 2.8139e–07 5.7621e–04
LSTM 1.0429e–07 3.0538e–04

normalization methods. Figures 4 and 5 also show that the Tanh estimator data
normalization method produces better forecasting results. It is quite clear from Table 3
that our proposed model (S-LSTM) exhibits the smallest forecasting errors (MSE
and MAE) compared to other well-known models.

7 Conclusion

In this work, we have proposed a deep neural network model S-LSTM for forecasting
the multivariate time-series data. Moreover, we have also tried to find out the most
suitable data normalization method for the deep neural network models. Here, as a
case study, we have used BSE and NYSE historical time-series data for multivariate
time-series forecasting purposes. From Tables 1 and 2 and also from Figs. 4 and
5, we can conclude that the Tanh estimator data normalization method is the best
normalization method for deep neural network models. From all these observations,
we can draw the conclusion that our proposed deep neural network model S-LSTM
has outperformed all other well-known models for the forecasting of the BSE and
NYSE data.
In the future, we also want to analyze our proposed model for the forecasting of
different multivariate time-series data, such as weather, pollution, etc.

References

1. Meesad, P., Rasel, R.I.: Predicting stock market price using support vector regression. In: 2013
International Conference on Informatics, Electronics and Vision (ICIEV), pp. 1–6. IEEE (2013)
2. Rodriguez, G.: Time series forecasting in turning processes using ARIMA model. Intell. Distrib. Comput. XII 798, 157 (2018)
3. Sulaiman, J., Wahab, S.H.: Heavy rainfall forecasting model using artificial neural network for
flood prone area. In: IT Convergence and Security 2017, pp. 68–76. Springer (2018)
4. Werbos, P.J., et al.: Backpropagation through time: what it does and how to do it. Proc. IEEE
78(10), 1550–1560 (1990)
5. Lee, T.S., Chen, N.J.: Investigating the information content of non-cash-trading index futures
using neural networks. Expert Syst. Appl. 22(3), 225–234 (2002)
6. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117
(2015)
7. Du, S., Li, T., Horng, S.J.: Time series forecasting using sequence-to-sequence deep learning
framework. In: 2018 9th International Symposium on Parallel Architectures, Algorithms and
Programming (PAAP), pp. 171–176. IEEE (2018)
8. Cirstea, R.G., Micu, D.V., Muresan, G.M., Guo, C., Yang, B.: Correlated time series fore-
casting using multi-task deep neural networks. In: Proceedings of the 27th ACM International
Conference on Information and Knowledge Management, pp. 1527–1530. ACM (2018)
9. Bhanja, S., Das, A.: Deep learning-based integrated stacked model for the stock market prediction. Int. J. Eng. Adv. Technol. 9(1), 5167–5174 (2019)
10. Williams, R.J., Zipser, D.: A learning algorithm for continually running fully recurrent neural
networks. Neural Comput. 1(2), 270–280 (1989)
11. Shih, S.Y., Sun, F.K., Lee, H.Y.: Temporal pattern attention for multivariate time series fore-
casting. Mach. Learn. 108(8–9), 1421–1441 (2019)
12. Bengio, Y., Simard, P., Frasconi, P., et al.: Learning long-term dependencies with gradient
descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)
13. Sagheer, A., Kotb, M.: Time series forecasting of petroleum production using deep LSTM recurrent networks. Neurocomputing 323, 203–213 (2019)
14. Hsu, C.M.: Forecasting stock/futures prices by using neural networks with feature selection.
In: 2011 6th IEEE Joint International Information Technology and Artificial Intelligence Con-
ference, vol. 1, pp. 1–7. IEEE (2011)
15. Tang, Q., Gu, D.: Day-ahead electricity prices forecasting using artificial neural networks. In:
2009 International Conference on Artificial Intelligence and Computational Intelligence, vol. 2,
pp. 511–514. IEEE (2009)
16. Nayak, S., Misra, B., Behera, H.: Impact of data normalization on stock index forecasting. Int.
J. Comput. Inform. Syst. Ind. Manag. Appl. 6(2014), 257–269 (2014)
17. Yahoo! finance (June 2019). https://in.finance.yahoo.com/quote/%5EBSESN/history?p=
%5EBSESN
18. Yahoo! finance (June 2019). https://finance.yahoo.com/quote/%5ENYA/history/
