IJNAA - Volume 14 - Issue 1 - Pages 1601-1610
14 (2023) 1, 1601–1610
ISSN: 2008-6822 (electronic)
http://dx.doi.org/10.22075/ijnaa.2022.27971.3772
a Department of Financial Management, Babol Branch, Islamic Azad University, Babol, Iran
b Department of Computer Engineering, Babol Branch, Islamic Azad University, Babol, Iran
c Department of Economics, Babol Branch, Islamic Azad University, Babol, Iran
Abstract
Bitcoin has recently attracted considerable attention in the fields of economics, cryptography, and computer science
because it combines encryption technology, monetary units, and blockchain. This paper examines the effectiveness of
neural networks (NNs) by analyzing the time series of the Bitcoin price process. We also select the most relevant
features from Blockchain information that is deeply involved in Bitcoin’s supply and demand and use them to train
models to improve the predictive performance for the latest Bitcoin pricing process. The purpose of this paper is to
predict the Bitcoin price using a combined method of signal decomposition into intrinsic components (empirical mode
decomposition, EMD) and support vector regression (SVR). The proposed method uses the intrinsic component
decomposition as a denoising step on the training data. We conduct an empirical study that compares the proposed
method with other linear and non-linear benchmark models for modeling and predicting the Bitcoin process. Our
empirical studies show that the NN performs well in predicting the Bitcoin price time series and in explaining the high
volatility of the recent Bitcoin price. The Mean Square Error (MSE) of the proposed method is also calculated and
compared with previous works.
Keywords: Blockchain, Bitcoin Prices, Financial market prediction, Machine learning
2020 MSC: 91G15, 91G45
1 Introduction
Bitcoin is a decentralized, anonymous, exclusively owned, and nation-free currency [19]. Fry and Cheah [7]
found that bitcoin has attracted extensive attention from the media and investors because of its innovative
characteristics of decentralization and traceability. After the rise and fall of cryptocurrency prices in recent years,
bitcoin is increasingly seen as an investment asset. Investors see bitcoin as a speculative investment, similar to the
Internet stocks of the last century [12]. As a cryptocurrency, bitcoin has existed for only a short time compared with
sovereign currencies [2]. Unlike a sovereign currency, bitcoin is a decentralized digital currency without any
government credit support, so its price is far more volatile than that of sovereign currencies. Its price rose from zero
when it was established in 2009 to about $13 per bitcoin in January 2013, and then soared to about $20000 per bitcoin
in December 2017. Since bitcoin started trading, its highly unstable nature has plagued investors, and it may be a
bubble that threatens the stability of the financial system. Therefore, it is necessary to predict the price of this
particular currency well. Predicting the price trend of bitcoin is a practical problem: it not only affects a country’s
economic policy at the macro level but also strongly affects investors’ decisions to buy and sell investment
instruments at the micro level.
∗ Corresponding author
Email addresses: h.naghipor@gmail.com (Hossein Naghipour), anabavichashmi2003@gmail.com (Seyed Ali Nabavi Chashmi),
barzegar@iauns.ac.ir (Behnam Barzegar), memarian_er@yahoo.com (Erfan Memarian)
Matkovskyy and Jalan [5] found that accurate prediction of the bitcoin price can not only provide decision support
for investors but also provide a reference for governments formulating regulatory policies. Equally noteworthy are the
factors that influence bitcoin prices. In addition to internal factors such as block size, hash rate, mining difficulty,
trading volume, and market value of bitcoin, this study holds that the set of factors should be more comprehensive.
First, the Google and Baidu search indices are important factors affecting bitcoin because they measure investors’
attention and media hype and reflect the sentiment of the highly speculative cryptocurrency market [23]. Second,
irrational factors such as major events and investor sentiment caused by economic policies also affect the price of
bitcoin [13]. Papadopoulos [8] shows that there is a strong interaction between the bitcoin price and the gold price.
Dyhrberg [6] demonstrated the similarity among bitcoin, gold, and the US dollar using the GARCH model. Therefore,
this study takes the gold price and the dollar index as influencing factors of the bitcoin price. Selecting the above
external factors avoids oversimplifying the bitcoin price prediction problem.
and currently has a market value of US$ 16 billion. Because Bitcoin is used by ordinary people and shows little
correlation with other assets, it has become an attractive option for investors. Therefore, the ability to predict
prices would be a great help to investors.
Considering the importance of the topic, many researchers have recently studied Bitcoin price prediction. Almeida
et al. [1] examined an artificial neural network (ANN) model that predicts the Bitcoin price using the previous day’s
price and turnover volumes. The main problem with their method is that it requires a large amount of data for the
prediction. McNally’s [17] research concerns predicting Bitcoin prices using machine learning, which was achieved
with RNN, ARIMA, and LSTM models. The error percentages of the RNN, ARIMA, and LSTM models were 5.45%,
53.47%, and 6.87%, respectively [11, 17]. Greaves and Au [10] used characteristics of the blockchain network to
predict Bitcoin’s future price with an ANN. The results showed an average accuracy of approximately 55%.
Shah and Zhang [21] used the nonparametric classification technique developed by Chen et al. [4] to predict price
trends, reporting an accuracy of 89% for a Bitcoin trading strategy based on Bayesian regression.
Madan et al. [14] used Bitcoin blockchain network properties to predict Bitcoin prices. Using SVM algorithms,
binomial logistic regression classifiers, and random forests, they predicted the Bitcoin price with an accuracy of 55%.
Georgoula et al. [9] investigated the determinants of the Bitcoin rate along with a sentiment analysis using SVM.
The results showed that the number of Wikipedia hits and the hash rate of the network had a positive relationship
with the Bitcoin price.
In another study, Matta et al. [15] aimed to predict Bitcoin trading volumes. They examined whether the
aggregate sentiment in a set of Twitter posts could be used to predict changes in the Bitcoin market. The
results showed a significant association between Bitcoin’s upcoming price and the volume of tweets
during a day. Similarly, the volume of Google searches for the term “bitcoin” affects the Bitcoin price [16]. In the
proposed method, the goal is to predict the price of Bitcoin using a combined method of signal decomposition into
intrinsic components (EMD) and support vector regression (SVR). The proposed method uses the intrinsic
component decomposition as a denoising step on the training data.
2.2 Blockchain
Decentralization is the value pursued by all cryptocurrencies, as opposed to general fiat currencies, whose value is
backed by central banks. Decentralization can be specified by the following goals:
The blockchain is the only available technology that can simultaneously achieve these three goals. The generation of
blocks in the Blockchain, which is directly involved in the creation and trading of Bitcoins, directly influences the
supply and demand of Bitcoins. The combination of Blockchain technologies and the Bitcoin market is a real-world
example of a combination of high-level cryptography and market economies. We now describe in detail how the
Blockchain can achieve the abovementioned goals in the Bitcoin environment [20]. A participant in a Bitcoin network
acts as a part of a network system by providing the hardware resources of their own computer; this arrangement is
called a “distributed system”. All issuance and transactions of money are conducted through P2P networks. All trading
history is recorded in the Blockchain and shared by the network, and all past transaction history is verified by all
network participants. The unit called a “block”, which includes recent transactions and a hash value of the previous
“block”, is made irreversible by a hash function and is in turn referenced by the next block. Generating a block takes
more than a certain amount of time, which makes it impossible to forge all or part of the Blockchain. This algorithm is
called proof of work (PoW), and the difficulty is set automatically so that the problem can be solved within
approximately 10 minutes.
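The hash linking and proof of work described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not Bitcoin's actual block format: the helper names (`block_hash`, `mine_block`, `chain_is_valid`), the JSON serialization, and the "leading zero hex digits" difficulty rule are all hypothetical simplifications chosen for illustration.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 digest of a block serialized as sorted JSON (toy format, an assumption)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine_block(transactions, prev_hash, difficulty=3):
    """Try nonces until the digest starts with `difficulty` zero hex digits (toy PoW)."""
    nonce = 0
    while True:
        block = {"transactions": transactions, "prev_hash": prev_hash, "nonce": nonce}
        digest = block_hash(block)
        if digest.startswith("0" * difficulty):
            return block, digest
        nonce += 1


def chain_is_valid(chain, difficulty=3):
    """Check that every block points to its predecessor and satisfies the PoW rule."""
    prev = "0" * 64  # placeholder "previous hash" for the first block
    for block, digest in chain:
        if block["prev_hash"] != prev:
            return False
        if block_hash(block) != digest or not digest.startswith("0" * difficulty):
            return False
        prev = digest
    return True


genesis = mine_block(["coinbase -> alice: 50"], "0" * 64)
second = mine_block(["alice -> bob: 10"], genesis[1])
chain = [genesis, second]
valid_before_tampering = chain_is_valid(chain)

# Changing a recorded transaction invalidates the stored digest of that block.
chain[0][0]["transactions"][0] = "coinbase -> mallory: 50"
valid_after_tampering = chain_is_valid(chain)
```

Because each block stores the hash of its predecessor, altering an old transaction would require redoing the proof of work for that block and every later one, which is the irreversibility property the text refers to.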
PoW also provides an incentive for participants to maintain the value of Bitcoin by paying Bitcoin to the
participant who created the block. The PoW agreement algorithm comes with several inherent risks. First, the validity
of blocks can be compromised when the majority of the total participants is controlled by a group with a specific
purpose, which is called the 51% problem. Second, when the Blockchain is forked, a considerable amount of time is
consumed in forming the agreed Blockchain, because the longest chain is selected only after several blocks have been
generated. This causes a transaction delay because the transaction cannot be completed during that time. Lastly,
there may be a capacity limit of the Blockchain or a performance limit of each node. The safety of the current
Blockchain can be monitored by observing measurable variables in the Blockchain from https://blockchain.info/.
Considering that the supply and demand of Bitcoin are affected directly or indirectly by measurable variables involved
in the formation of a Blockchain, the current study evaluates several variables related to Blockchain formation as
features of the Bitcoin pricing process.
Figure 1: Details of the proposed method.
As shown in Figure 1, the Bitcoin price time series is first decomposed into intrinsic mode functions by the algorithm
of signal decomposition into intrinsic components, and its residual, which contains noise, is discarded. In the next
step, the intrinsic components of the signal are converted into subsequences by a multi-scale decomposition method,
and the price is predicted from these subsequences with the support vector regression algorithm.
$$\mathrm{SMA}_n = \sum_{i=n-T}^{n-1} \frac{x_i}{T} \tag{3.1}$$
In Eq. 3.1, $\mathrm{SMA}_n$ is the simple moving average for the n-th day, $x_i$ is the closing price on the i-th day, and $T$ is
the window length of the simple moving average, which is typically 5, 10, 20, 50, 100, or 200 days.
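As a small illustration of Eq. 3.1, the sketch below computes the simple moving average over the T closing prices that precede day n. The function name `simple_moving_average`, the zero-based day indexing, and the sample prices are assumptions made for the example only.

```python
def simple_moving_average(prices, n, T):
    """SMA_n of Eq. 3.1: mean of the closing prices x_{n-T}, ..., x_{n-1} (days are 0-indexed here)."""
    if T <= 0 or n - T < 0 or n > len(prices):
        raise ValueError("the window [n-T, n-1] must lie inside the price series")
    window = prices[n - T:n]
    return sum(window) / T


# Made-up closing prices, used only to exercise the formula.
closing = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
sma_5 = simple_moving_average(closing, n=5, T=5)  # averages days 0..4
sma_3 = simple_moving_average(closing, n=6, T=3)  # averages days 3..5
```

Note that the window ends at day n-1, so the average for day n uses only prices that are already known on that day.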
Fifth step: the stopping condition for the IMF is checked. This condition is given in Eq. 3.3.
$$D_k(t) = \frac{\sum_{t=0}^{T} \left| h_t^{k-1}(t) - h_t^{k}(t) \right|^2}{\sum_{t=0}^{T} \left| h_t^{k-1}(t) \right|^2} \tag{3.3}$$
Sixth step: If the condition of the fifth step is not met, we replace the original signal with the signal from step
four and continue the process from the first step.
Seventh step: If the stopping condition is met, the process is finished and $c_1 = h_1^k$ is considered the first
intrinsic mode component, which is the high-frequency component of the signal $x(t)$.
Eighth step: The remainder is defined as $r_1 = x(t) - c_1$. If it fulfills the condition of being an intrinsic mode
component, it is considered an intrinsic mode component. Otherwise, if condition A (the number of maxima and
minima is equal to or more than the number of zero crossings) is true, it is taken as the initial signal and steps one
to four are repeated until the next intrinsic mode component is obtained; if condition A does not hold, it is
considered the remainder $r$. The remainder can be defined by Eq. 3.4.
Therefore, the main signal is the sum of the intrinsic mode components plus the remainder, as given by Eq. 3.5.
$$x(t) = \sum_{i=1}^{n} h_i(t) + r(t) \tag{3.5}$$
In Eq. 3.5, $x(t)$ is the original data value, $h_i(t)$ is the i-th intrinsic mode component,
$r(t)$ is the remaining component, and $n$ is the number of intrinsic mode components.
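The sifting loop, the stopping condition of Eq. 3.3, and the reconstruction identity of Eq. 3.5 can be sketched as follows. This is a simplified illustration, not the implementation used in the paper: the first four steps are not reproduced in the text above, so the sketch follows the usual EMD sifting procedure, and it replaces the cubic-spline envelopes commonly used in EMD with piecewise-linear ones held flat at the ends. The threshold 0.2, the iteration caps, and the synthetic test signal are assumptions.

```python
import math


def local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through x[idx], held flat at both ends (a simplification)."""
    n = len(x)
    pts = [(0, x[idx[0]])] + [(i, x[i]) for i in idx] + [(n - 1, x[idx[-1]])]
    env, seg = [], 0
    for t in range(n):
        while t > pts[seg + 1][0]:
            seg += 1
        (t0, v0), (t1, v1) = pts[seg], pts[seg + 1]
        env.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return env


def sift(x, threshold=0.2, max_iter=50):
    """Subtract the envelope mean repeatedly until the ratio of Eq. 3.3 drops below the threshold."""
    h_prev = list(x)
    for _ in range(max_iter):
        maxima, minima = local_extrema(h_prev)
        if not maxima or not minima:
            break
        upper, lower = envelope(h_prev, maxima), envelope(h_prev, minima)
        h_next = [h - (u + l) / 2 for h, u, l in zip(h_prev, upper, lower)]
        num = sum((a - b) ** 2 for a, b in zip(h_prev, h_next))
        den = sum(a ** 2 for a in h_prev)
        h_prev = h_next
        if den == 0 or num / den < threshold:
            break
    return h_prev


def emd(x, max_imfs=6):
    """Extract intrinsic mode components one by one; what is left is the remainder r(t)."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if not maxima or not minima:
            break  # no oscillation left to extract
        imf = sift(residual)
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]
    return imfs, residual


# Synthetic signal (not Bitcoin data): two oscillations plus a slow trend.
signal = [math.sin(0.3 * t) + 0.5 * math.sin(2.1 * t) + 0.01 * t for t in range(200)]
imfs, residual = emd(signal)
rebuilt = [sum(c[t] for c in imfs) + residual[t] for t in range(len(signal))]
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

By construction the components and the remainder add back to the original series, so `max_error` is at the level of floating-point rounding, which is the content of Eq. 3.5.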
$$t_i \approx y_i = W^T x_i + b \quad \forall i = 1, 2, \cdots, N. \tag{3.7}$$
where $L_\epsilon$ is the penalty function, defined so that the desired output lies within the range $\pm\epsilon$
around the target according to Eq. 3.9.
$$\aleph_i = |t_i - y_i| - \epsilon \tag{3.9}$$
In Eq. 3.9, $y_i$ is the output of the network, $t_i$ is the target, and $\aleph_i$ is the amount by which the error
between the target and the output exceeds $\epsilon$; no penalty applies when the error is less than $\epsilon$.
Finally, the penalty is calculated according to Eq. 3.10.
$$L_\epsilon(t_i, y_i) = \begin{cases} 0 & |t_i - y_i| \le \epsilon, \\ |t_i - y_i| - \epsilon & \text{otherwise} \end{cases} \tag{3.10}$$
Eq. 3.10 should be minimized over all data, for which the empirical risk is defined by Eq. 3.11.
$$R_{emp} = \frac{1}{N} \sum_{i=1}^{N} L_\epsilon(t_i, y_i) \tag{3.11}$$
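The penalty of Eq. 3.10 and the empirical risk of Eq. 3.11 translate directly into code. The sketch below is only an illustration; the function names and the sample targets and outputs are made up for the example.

```python
def eps_insensitive_loss(t, y, eps):
    """Eq. 3.10: zero inside the epsilon tube, linear in the excess error outside it."""
    return max(0.0, abs(t - y) - eps)


def empirical_risk(targets, outputs, eps):
    """Eq. 3.11: mean epsilon-insensitive loss over all N samples."""
    losses = [eps_insensitive_loss(t, y, eps) for t, y in zip(targets, outputs)]
    return sum(losses) / len(losses)


# Made-up targets t_i and model outputs y_i.
targets = [1.0, 2.0, 3.0, 4.0]
outputs = [1.05, 2.5, 2.0, 4.0]
risk = empirical_risk(targets, outputs, eps=0.1)
```

With these values the first and last samples fall inside the tube and contribute nothing, while the other two contribute 0.4 and 0.9, giving a risk of about 0.325.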
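$$
\begin{aligned}
\min\ & \frac{1}{2} W^T W + C \sum_{i=1}^{N} (\aleph_i^+ + \aleph_i^-) \\
\text{s.t.}\ & -t_i + y_i + \epsilon + \aleph_i^+ \ge 0 \quad \forall i \\
& t_i - y_i + \epsilon + \aleph_i^- \ge 0 \quad \forall i \\
& \aleph_i^+ \ge 0 \quad \forall i \\
& \aleph_i^- \ge 0 \quad \forall i
\end{aligned}
\tag{3.12}
$$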
In the above equation, the value of $C$ is a fixed number. Now consider the dual form of the above equations: the
constraint $-t_i + y_i + \epsilon + \aleph_i^+ \ge 0$ is assigned the coefficient $a_i^+$, the constraint
$t_i - y_i + \epsilon + \aleph_i^- \ge 0$ the coefficient $a_i^-$, the constraint $\aleph_i^+ \ge 0$ the coefficient
$\mu_i^+$, and the constraint $\aleph_i^- \ge 0$ the coefficient $\mu_i^-$.
The objective function is then the sum of Eq. 3.12 and its constraints weighted by these coefficients. After taking
the partial derivatives with respect to the weights, the dual objective function is obtained as Eqs. 3.13 and 3.14.
$$\min\ \frac{1}{2} \sum_i \sum_j (a_i^+ - a_i^-)(a_j^+ - a_j^-)\, x_i^T x_j - \sum_i (a_i^+ - a_i^-)\, t_i + \sum_i (a_i^+ + a_i^-)\, \epsilon \tag{3.13}$$
Modeling and prediction of Bitcoin prices based on blockchain information 1607
$$\text{s.t.}\quad \sum_i (a_i^+ - a_i^-) = 0, \qquad 0 \le a_i^+ \le C, \qquad 0 \le a_i^- \le C \tag{3.14}$$
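The step from the primal problem of Eq. 3.12 to the dual constraints can be made explicit with the standard Lagrangian argument. The block below is a sketch under stated assumptions rather than a derivation taken from the paper: the symbol $\mathcal{L}$ is introduced here, and the second constraint is assumed to carry the slack variable $\aleph_i^-$ with multiplier $a_i^-$.

```latex
\begin{aligned}
\mathcal{L} ={}& \tfrac{1}{2} W^T W + C \sum_{i=1}^{N} (\aleph_i^+ + \aleph_i^-)
   - \sum_{i=1}^{N} a_i^+ \left( -t_i + y_i + \epsilon + \aleph_i^+ \right) \\
 & - \sum_{i=1}^{N} a_i^- \left( t_i - y_i + \epsilon + \aleph_i^- \right)
   - \sum_{i=1}^{N} \left( \mu_i^+ \aleph_i^+ + \mu_i^- \aleph_i^- \right),
   \qquad y_i = W^T x_i + b, \\[4pt]
\frac{\partial \mathcal{L}}{\partial W} = 0
   \ &\Rightarrow\ W = \sum_{i=1}^{N} (a_i^+ - a_i^-)\, x_i, \\
\frac{\partial \mathcal{L}}{\partial b} = 0
   \ &\Rightarrow\ \sum_{i=1}^{N} (a_i^+ - a_i^-) = 0, \\
\frac{\partial \mathcal{L}}{\partial \aleph_i^{\pm}} = 0
   \ &\Rightarrow\ C - a_i^{\pm} - \mu_i^{\pm} = 0
   \ \Rightarrow\ 0 \le a_i^{\pm} \le C \quad (\text{since } a_i^{\pm} \ge 0,\ \mu_i^{\pm} \ge 0).
\end{aligned}
```

The second and third conditions are the constraints of Eq. 3.14, and the first shows that the fitted function depends on the data only through inner products, $y(x) = \sum_i (a_i^+ - a_i^-)\, x_i^T x + b$.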
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (P_i - A_i)^2 \tag{4.1}$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (P_i - A_i)^2} \tag{4.2}$$
In Eqs. 4.1 and 4.2, $A_i$ is the actual value, $P_i$ is the predicted value, and $n$ is the number of samples. Three
criteria are used for classification: sensitivity, specificity, and accuracy. When the data can be divided into positive
and negative groups, the quality of a test that divides the information into these two categories can be
measured and described using the sensitivity and specificity indices. Sensitivity (True Positive Rate) is the proportion
of positive cases that the test correctly marks as positive. Specificity (True Negative Rate) is the proportion of
negative cases that the test correctly marks as negative. In mathematical terms, the sensitivity is the result of dividing
the true positives by the sum of the true positives and false negatives. In the same way, the specificity is the result of
dividing the true negatives by the sum of the true negatives and false positives. Sensitivity, specificity, and
accuracy are obtained using Eqs. 4.3, 4.4, and 4.5.
$$Sensitivity = \frac{TP}{TP + FN} \tag{4.3}$$
$$Specificity = \frac{TN}{TN + FP} \tag{4.4}$$
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{4.5}$$
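The evaluation criteria of Eqs. 4.1 to 4.5 can be computed as in the following sketch. It is an illustration only: the function names are hypothetical, the numbers are made up, and treating "price goes up" as the positive class is an assumption, since the text does not state how the positive and negative groups are defined.

```python
import math


def mse(predicted, actual):
    """Eq. 4.1: mean of squared differences between predictions P_i and actual values A_i."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Eq. 4.2: square root of the MSE."""
    return math.sqrt(mse(predicted, actual))


def classification_metrics(predicted, actual):
    """Eqs. 4.3-4.5 from boolean labels (True = positive class)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Made-up regression values.
error_mse = mse([2.0, 4.0, 6.0], [1.0, 4.0, 8.0])
error_rmse = rmse([2.0, 4.0, 6.0], [1.0, 4.0, 8.0])

# Made-up direction labels: True means "price went up" (assumed positive class).
predicted_up = [True, True, False, False, True, False]
actual_up = [True, False, False, True, True, False]
metrics = classification_metrics(predicted_up, actual_up)
```

In this toy example there are two true positives, two true negatives, one false positive, and one false negative, so sensitivity and specificity are both 2/3 and accuracy is 4/6.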
$$x(t) = \sum_{i=1}^{n} h_i(t) + r(t) \tag{4.6}$$
To train the proposed SVR models, we first divide the dataset into two parts, training and testing. Then we use
EMD to extract the features, which also removes noise from the training data set.
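[Figure 2: bar chart of RMSE by method; the plot is not recoverable from the extracted text, and only the bar label 53.42% survives.]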
The prediction results of the proposed method are based on the combination of support vector regression and signal
decomposition into intrinsic components on 260 time series samples. After calculation and prediction, the RMSE and
DA values were obtained; the results are shown in Figures 2 to 5.
[Bar chart of Sensitivity; x-axis labels as extracted: ARIMA, LSTM, RNN, SVR, LINEAR, BAYESIAN, Proposed Method (plus a second-row label "NN"); bar values as extracted, not matched to methods: 79.52%, 61.45%, 42.50%, 41.20%, 35.67%, 33.56%, 14.23%.]
Figure 3: Comparison of the Sensitivity criterion of the proposed method with previous methods.
Figures 2 to 5 compare the RMSE, Sensitivity, Specificity, and Accuracy criteria.
As shown in Figure 2, the RMSE of the proposed method is equal to 2.41, which is lower than that of the previous
methods. The reason is the use of the EMD method to decompose the signal into its intrinsic components. Because the
signal noise is eliminated in this method, the efficiency is increased and the mean square error is reduced.
[Bar chart of Specificity; x-axis labels as extracted: ARIMA, LSTM, RNN, SVR, LINEAR, BAYESIAN, Proposed Method (plus a second-row label "NN"); bar values as extracted, not matched to methods: 100.00%, 100.00%, 89.98%, 76.85%, 78.12%, 62.87%, 59.75%.]
Figure 4: Comparison of the Specificity criterion of the proposed method with previous methods.
[Bar chart of Accuracy; x-axis labels as extracted: ARIMA, LSTM, RNN, SVR, LINEAR, BAYESIAN, Proposed Method (plus a second-row label "NN"); bar values as extracted, not matched to methods: 91.45%, 93.65%, 87.52%, 75.85%, 51.75%, 53.65%, 51.35%.]
Figure 5: Comparison of the Accuracy criterion of the proposed method with previous methods.
5 Conclusion
Bitcoin is a successful encryption project and has been widely explored in the fields of economics and computer
science. In this article, we analyze the bitcoin price series using support vector regression combined with signal
decomposition into intrinsic components. Investigating nonlinear relationships between input features based on
network analysis can help explain the behavior of the bitcoin price series. Bitcoin's variability should be modeled
more appropriately. This goal can be achieved by adopting other machine learning methods or considering new input
features related to bitcoin variability. The decomposition of a signal into its intrinsic components is an efficient
analytical method for the time-frequency analysis of non-linear and non-stationary data. This method is based on the
assumption that each signal consists of different subsets. Non-linear and non-stationary time series can be divided
into a group of quasi-periodic signals, each of which is called an intrinsic mode component. This article uses support
vector regression together with signal decomposition into intrinsic components to predict the bitcoin price.
References
[1] J. Almeida, Sh. Tata, A. Moser, and V. Smit, Bitcoin prediction using ann, Neural Networks 7 (2015), 1–12.
[2] A.F. Bariviera, M.J. Basgall, W. Hasperué, and M. Naiouf, Some stylized facts of the bitcoin market, Phys. A:
Statist. Mech. Appl. 484 (2017), 82–90.
[3] J. Brito and A. Castillo, Bitcoin: A primer for policymakers, Mercatus Center at George Mason University, 2013.
[4] H. Chen, P. De, Y.J. Hu, and B.-H. Hwang, Wisdom of crowds: The value of stock opinions transmitted through
social media, Rev. Financ. Stud. 27 (2014), no. 5, 1367–1403.
[5] J. Chu, S. Nadarajah, and S. Chan, Statistical analysis of the exchange rate of bitcoin, PloS one 10 (2015), no. 7,
e0133678.
[6] R.P. Dos Santos, On the philosophy of bitcoin/blockchain technology: is it a chaotic, complex system?,
Metaphilosophy 48 (2017), no. 5, 620–633.
[7] A.H. Dyhrberg, Bitcoin, gold and the dollar–a GARCH volatility analysis, Finance Res. Lett. 16 (2016), 85–92.
[8] P. Franco, Understanding bitcoin: Cryptography, engineering and economics, John Wiley & Sons, 2014.
[9] I. Georgoula, D. Pournarakis, Ch. Bilanakos, D. Sotiropoulos, and G.M. Giaglis, Using time-series and sentiment
analysis to detect the determinants of bitcoin prices, Available at SSRN 2607167 (2015).
[10] A. Greaves and B. Au, Using the bitcoin transaction graph to predict the price of bitcoin, No Data 8 (2015),
416–443.
[11] G. James, D. Witten, T. Hastie, and R. Tibshirani, An introduction to statistical learning, vol. 112, Springer,
2013.
[12] P. Katsiampa, Volatility estimation for bitcoin: A comparison of GARCH models, Econ. Lett. 158 (2017), 3–6.
[13] R. Kaushal, Bitcoin: first decentralized payment system, Int. J. Eng. Comput. Sci. 5 (2016), no. 5, 16514–16517.
[14] I. Madan, Sh. Saluja, and A. Zhao, Automated bitcoin trading via machine learning algorithms, URL: http://cs229.
stanford. edu/proj2014/Isaac% 20Madan 20 (2015).
[15] M. Matta, I. Lunesu, and M. Marchesi, Bitcoin spread prediction using social and web search media., UMAP
Workshops, 2015, pp. 1–10.
[16] M. Matta, I. Lunesu, and M. Marchesi, The predictor impact of web search media on bitcoin trading volumes, 7th Int. Joint Conf. Knowledge
Discovery, Knowledge Engin. Knowledge Manag. (IC3K), vol. 1, IEEE, 2015, pp. 620–626.
[17] S. McNally, Predicting the price of bitcoin using machine learning, Ph.D. thesis, Dublin, National College of
Ireland, 2016.
[18] E.V. Murphy, M.M. Murphy, and M.V. Seitzinger, Bitcoin: Questions, answers, and analysis of legal issues,
Library of Congress, Congressional Research Service, 2015.
[19] S. Nakamoto, Bitcoin: A peer-to-peer electronic cash system, Decentr. Bus. Rev. (2008), 21260.
[20] A. Narayanan, J. Bonneau, E. Felten, A. Miller, and S. Goldfeder, Bitcoin and cryptocurrency technologies: A
comprehensive introduction, Princeton University Press, 2016.
[21] D. Shah and K. Zhang, Bayesian regression and bitcoin, 52nd Ann. Allerton Conf. Commun. Control Comput.
(Allerton), IEEE, 2014, pp. 409–414.
[22] N. Shi, A new proof-of-work mechanism for bitcoin, Financ. Innov. 2 (2016), no. 1, 1–8.
[23] A. Urquhart, The inefficiency of bitcoin, Econ. Lett. 148 (2016), 80–82.