Addressing Big Data Issues Using RNN Based Techniques
To cite this article: Tanuja Das & Goutam Saha (2019) Addressing big data issues using RNN
based techniques, Journal of Information and Optimization Sciences, 40:8, 1773-1785, DOI:
10.1080/02522667.2019.1703268
Tanuja Das *
Department of Information Technology
Gauhati University Institute of Science and Technology
Guwahati 781014
Assam
India
Goutam Saha †
Department of Information Technology
North-Eastern Hill University
Shillong 793022
Meghalaya
India
Abstract
Most real-world prediction problems are naturally associated with a time component, which requires time series data as input. Presently, machine learning approaches are used for forecasting from time series data. Although the time parameter provides more information, it also brings problems such as temporal dependence and temporal structure. Neural network based approaches have the ability to address these problems because they can automatically learn and extract features from raw Big Data. In this paper, Recurrent Neural Networks and their variants, namely LSTM and GRU, were applied to forecasting from time series data. The experiments were performed on Yahoo Finance data under different conditions to assess prediction accuracy based on hourly historical data. The resulting prediction accuracy was analysed. The results confirm that, among RNN, LSTM and GRU, GRU has the best predictive ability for temporal problems.
I. Introduction
In real-world scenarios, most of the events that generate Big Data are discrete in nature and produce time series and temporal sequences. This makes their comprehension very difficult. Common examples of such data include stock market data and the number of hits on a web page. The difficulty is to extract the dynamics from this time series data while taking into consideration past events and long-term correlations [1]. Because of the unstructured nature of this enormous volume of data, it is often termed Big Data.
Many research initiatives have addressed the modelling and prediction of the stock market. In simple terms, the stock market is the market where shares of a company are bought and sold on an exchange [2]. Nowadays, predicting the trend or price of stocks using machine learning techniques and artificial neural networks has been gaining momentum in the stock market paradigm.
Neural networks have the potential to predict new objects after being trained with previously known samples. To distinguish patterns in sequences of time series data, a special type of artificial neural network called the Recurrent Neural Network (RNN) was developed [3]. It mainly follows the concepts of David Rumelhart's work of 1986 [4]. These networks have a temporal dimension and as such take time and sequence into consideration. Recently it has been established that recurrent neural networks can also be applied to images [5]. Since recurrent networks possess a certain type of memory, and memory is also part of the human condition, it can be treated as analogous to memory in the brain [3]. However, one of the difficulties in training a Recurrent Neural Network is the vanishing gradient problem, which occurs mainly because information flows through many cascaded networks and hence through numerous stages of multiplication [6].
As a possible solution to the vanishing gradient problem, the RNN was modified into another form called Long Short-Term Memory (LSTM), proposed by Sepp Hochreiter and Juergen Schmidhuber in the mid-90s [7]. LSTM has shown encouraging outcomes for certain classes of time series forecasting problems [8][9]. LSTMs can unveil the underlying dynamics of a time series owing to their capacity to learn hidden long-term sequential dependencies [10]. On the other hand, since real-world data usually contains probable outliers, the learning of the representation of the time series data may be affected. This can produce many false predictions and thus reduce performance [11].
To overcome the problem of outliers and the shortcomings of LSTM, the Gated Recurrent Unit (GRU) [12] was proposed, which lets each recurrent unit adaptively capture dependencies at different time scales. GRU resembles LSTM in that it has gating units that modulate the flow of information inside the unit, but it differs from LSTM in having no independent memory cells [13]. GRUs are architecturally quite simple and prove to be more efficient than LSTMs [13]. GRUs are therefore extensively used in tasks involving time series data analysis and prediction.
In this work, we have deployed the Recurrent Neural Network and its variants, namely LSTM and GRU, on the time series classification of data from Yahoo Finance in order to monitor and detect trends in the global financial markets.
Section II of this paper discusses the basics of Recurrent Neural Networks, LSTM and GRU. Section III describes how this methodology was adopted for our work on stock market data prediction. Section IV compares the results obtained with the various models. Section V concludes the whole work.
II. Background
In this section, we give a brief overview of the different methodologies adopted in the present work. Section A gives an overview of Recurrent Neural Networks (RNN), Section B gives a concise idea of Long Short-Term Memory (LSTM), and Section C discusses the basics of the Gated Recurrent Unit (GRU).
A. RNN

Figure 1: Recurrent Neural Network with loop
Figure 2: Basic architecture of a Recurrent Neural Network
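As an illustration, the following is a minimal sketch of the standard RNN recurrence h_t = tanh(W [h_{t−1}, x_t] + b) applied over a sequence; the weight W, bias b and the toy dimensions are illustrative, not tuned values from our experiments.

import numpy as np

def rnn_step(x_t, h_prev, W, b):
    # h_t = tanh(W [h_{t-1}, x_t] + b), the vanilla RNN recurrence
    concat = np.concatenate([h_prev, x_t])   # [h_{t-1}, x_t]
    return np.tanh(W @ concat + b)           # new hidden state h_t

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((8, 12))       # 8 hidden units, 4 input features
b = np.zeros(8)
h = np.zeros(8)
for x_t in rng.standard_normal((5, 4)):      # a toy sequence of 5 time steps
    h = rnn_step(x_t, h, W, b)

At each step the previous hidden state is fed back in alongside the current input, which is the loop shown in Figure 1.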
B. LSTM
LSTM differs from the RNN in that the repeating module of an LSTM has a different architecture. LSTM networks are adapted for learning long-term dependencies by appending a memory cell, which is controlled by three main elements, namely an input gate, a forget gate and an output gate. These gates make it possible for the cells to accumulate and utilize information over a long duration [7]. An LSTM also has a state space, which is basically the number of hidden units. The basic architecture of an LSTM is shown in Figure 3.
The equations for the three basic gates of an LSTM can be written as:

i_t = σ(w_i [h_{t−1}, x_t] + b_i)        (3)

f_t = σ(w_f [h_{t−1}, x_t] + b_f)        (4)

o_t = σ(w_o [h_{t−1}, x_t] + b_o)        (5)

where
i_t represents the input gate
f_t represents the forget gate
o_t represents the output gate
σ represents the sigmoid function
w_x represents the weights for the respective gate (x) neurons
h_{t−1} represents the output of the previous LSTM block (at timestamp t−1)
x_t represents the input at the current timestamp
b_x represents the biases for the respective gate (x)

Following the equations for the input, forget and output gates, the equations for the candidate cell state, the cell state and the final output are:

c̃_t = tanh(w_c [h_{t−1}, x_t] + b_c)        (6)

c_t = f_t ∗ c_{t−1} + i_t ∗ c̃_t        (7)

h_t = o_t ∗ tanh(c_t)        (8)

where
c̃_t represents the candidate cell state at timestamp t
c_t represents the cell state (memory) at timestamp t
Figure 3: Basic architecture of an LSTM
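One step of equations (3)-(8) can be sketched in a few lines. The dictionary keys w["i"], w["f"], w["o"], w["c"] are our own naming for the per-gate weights w_i, w_f, w_o, w_c; as in the equations above, each weight multiplies the concatenation [h_{t−1}, x_t].

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w, b):
    v = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    i_t = sigmoid(w["i"] @ v + b["i"])       # input gate, eq. (3)
    f_t = sigmoid(w["f"] @ v + b["f"])       # forget gate, eq. (4)
    o_t = sigmoid(w["o"] @ v + b["o"])       # output gate, eq. (5)
    c_cand = np.tanh(w["c"] @ v + b["c"])    # candidate cell state, eq. (6)
    c_t = f_t * c_prev + i_t * c_cand        # new cell state, eq. (7)
    h_t = o_t * np.tanh(c_t)                 # block output, eq. (8)
    return h_t, c_t

The forget gate scales the old memory c_{t−1} while the input gate scales the candidate, which is how the cell accumulates information over long durations.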
C. GRU

Figure 4: Basic architecture of a GRU

The hidden state of the GRU, shown in Figure 4, is computed as:

h_t = (1 − z_t) h_{t−1} + z_t H_t        (13)

where
z_t represents the update gate
r_t represents the reset gate
H_t represents the candidate activation
h_t represents the hidden state.

Finally, the output y of the GRU is given as:

y = softmax(h_t)        (14)
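A sketch of one GRU step follows. Only equations (13) and (14) are reproduced above, so the computations of the update gate z_t, the reset gate r_t and the candidate activation H_t below assume the standard GRU formulation of [12]; the weight names are our own.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def gru_step(x_t, h_prev, w, b):
    v = np.concatenate([h_prev, x_t])
    z_t = sigmoid(w["z"] @ v + b["z"])            # update gate
    r_t = sigmoid(w["r"] @ v + b["r"])            # reset gate
    v_r = np.concatenate([r_t * h_prev, x_t])     # reset gate masks the old state
    H_t = np.tanh(w["H"] @ v_r + b["H"])          # candidate activation
    h_t = (1.0 - z_t) * h_prev + z_t * H_t        # hidden state, eq. (13)
    y = softmax(w["y"] @ h_t + b["y"])            # output, eq. (14)
    return h_t, y

Unlike the LSTM step, there is no separate cell state; the single hidden state h_t plays both roles, which is why the GRU is architecturally simpler.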
III. Methodology

Figure 5: Basic model of the experiments

Figure 6: Model Structure
For all three models, viz. RNN, LSTM and GRU, we apply one respective basic block. We also apply one hidden layer after the input and before the output layer (Figure 6). In order to find the best-performing setting, we evaluate various numbers of neurons in the two hidden layers.
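A minimal sketch of this model structure follows, assuming the Keras API (not necessarily the framework used in our experiments); the layer sizes and window length are illustrative rather than the tuned values.

import tensorflow as tf

def build_model(cell=tf.keras.layers.GRU, n_units=64, window=30, n_features=1):
    # one recurrent basic block between the input and the output layer, as in
    # Figure 6; swap `cell` for tf.keras.layers.SimpleRNN or tf.keras.layers.LSTM
    return tf.keras.Sequential([
        tf.keras.Input(shape=(window, n_features)),
        cell(n_units),                                      # RNN / LSTM / GRU block
        tf.keras.layers.Dense(n_units, activation="relu"),  # hidden layer
        tf.keras.layers.Dense(1),                           # predicted value
    ])

model = build_model()
model.compile(optimizer="adam", loss="mse")

The same skeleton is reused for all three models, with only the recurrent block swapped out, so that any performance difference is attributable to the block itself.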
A. Dataset Used
The stock market produces a huge amount of valuable trading data. Here we used the S&P 500 market index, which is freely available on the Yahoo Finance site [16]. The daily stock quote data comprises properties that can be exploited to gain trading information. The structure of the stock market data is given in Table 1.

On a typical trading day, stock information from previous days, such as the open, high, low and close prices of a stock along with the volume traded, is analysed, since it carries a lot of hidden information for predicting stock trends. To quickly visualize this information, OHLC (open, high, low, and close) charts or candlestick charts are commonly used.
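As an illustration, a few lines suffice to load such OHLCV quotes for analysis; the file name GSPC.csv is a hypothetical CSV export from the Yahoo Finance page for ^GSPC, and the column names follow Yahoo's usual export format.

import pandas as pd

# load daily quotes exported from Yahoo Finance, indexed by trading date
df = pd.read_csv("GSPC.csv", parse_dates=["Date"], index_col="Date")
ohlcv = df[["Open", "High", "Low", "Close", "Volume"]]
print(ohlcv.head())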
For training and validation purposes, hourly historical stock data from 08/06/2017 to 06/09/2017 is used. The size of each stock window is 5000 samples, and the first 400 samples are used as past data.
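A sketch of building such sliding-window samples is shown below; the synthetic series stands in for the hourly ^GSPC closes, and the 400-sample look-back mirrors the setting just described.

import numpy as np

def sliding_windows(series, past=400):
    # each window of `past` consecutive values predicts the value that follows it
    X = np.stack([series[i:i + past] for i in range(len(series) - past)])
    y = series[past:]
    return X[..., None], y      # trailing axis = one feature per time step

prices = np.cumsum(np.random.randn(5000))   # stand-in for 5000 hourly closes
X, y = sliding_windows(prices)              # X: (4600, 400, 1), y: (4600,)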
Table 1: Structure of the stock market data

Figure 7: GSPC test accuracy for RNN, LSTM and GRU
IV. Results

During tuning, the number of neurons in the hidden layer was varied from 410 to 900, using the rule that the number of neurons should capture 70-90% of the variance of the dataset.
To test the three methods, we selected the S&P 500 (^GSPC) during the tuning process. In order to consider the most recent data in the model, a Sliding Time Window is used [18]. The average test accuracy of RNN, LSTM and GRU was found to be 51.95, 53.64 and 59.74 respectively. Figure 7 compares the three methods in terms of test accuracy versus the number of hidden units.
As can also be seen from Figure 7, LSTM is initially at par with GRU, but as the size of the dataset increases GRU becomes computationally more efficient. This is because GRU is structurally simpler than LSTM, so its performance improves as the size of the dataset grows.
V. Conclusion
In this work, we carried out a comparative analysis of stock market prediction using RNN, LSTM and GRU. For testing purposes we used the same stock across all three methods when building the models. It is evident from the results that RNN and its variants can be trained to bridge long time lags. As stock market data is non-stationary and non-linear, the feature correlations can be totally different from time to time. However, as the number of hidden layer units approaches its limit, the accuracy remains roughly constant. Thus, RNN and its variants are very promising for deciding whether past behaviours are important enough to forecast the future. In future work, we will explore whether these methods can be further optimized for predicting stock market trends.
Acknowledgement
I wish to acknowledge the travel and registration funding from TEQIP III, which provided the opportunity to attend this international conference at Silchar.
References
[2] N. M. Masoud, "The impact of stock market performance upon economic growth," International Journal of Economics and Financial Issues, vol. 3, no. 4, pp. 788-798, 2013.
[3] J. T. Connor, R. D. Martin, and L. E. Atlas, "Recurrent neural networks and robust time series prediction," IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 240-254, 1994.
[4] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533-536, 1986.
[5] P. Kumar, S. K. Sahu, and A. P. Singh, "Generating image descriptions using capsule network," Journal of Information and Optimization Sciences, vol. 40, no. 2, pp. 479-492, 2019.
[6] Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult," IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157-166, 1994.
[7] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[8] D. M. Nelson, A. C. Pereira, and R. A. de Oliveira, "Stock market's price movement prediction with LSTM neural networks," IEEE International Joint Conference on Neural Networks (IJCNN), pp. 1419-1426, 2017.
[9] L. Finsveen, "Time-series predictions with Recurrent Neural Networks - Studying Recurrent Neural Networks predictions and comparing to state-of-the-art Sequential Monte Carlo methods," Master's thesis, NTNU, 2018.
[10] C. Olah, "Understanding LSTM networks," colah.github.io, 2015.
[11] G. Williams, R. Baxter, H. He, S. Hawkins, and L. Gu, "A comparative study of RNN for outlier detection in data mining," Proceedings of the IEEE International Conference on Data Mining (ICDM 2002), pp. 709-712, 2002.
[12] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Gated feedback recurrent neural networks," International Conference on Machine Learning, pp. 2067-2075, 2015.
[13] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio, "An empirical evaluation of deep architectures on problems with many factors of variation," Proceedings of the 24th International Conference on Machine Learning, ACM, pp. 473-480, 2007.