Jang 2017
fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2017.2779181, IEEE Access
Abstract—Bitcoin has recently attracted considerable attention in the fields of economics, cryptography, and computer science due to its inherent nature of combining encryption technology and monetary units. This study reveals the effect of Bayesian neural networks (BNNs) by analyzing the time series of the Bitcoin process. We also select the most relevant features from Blockchain information that is deeply involved in Bitcoin’s supply and demand and use them to train models to improve the predictive performance of the latest Bitcoin pricing process. We conduct an empirical study that compares the Bayesian neural network with other linear and non-linear benchmark models on modeling and predicting the Bitcoin process. Our empirical studies show that the BNN performs well in predicting the Bitcoin price time series and explaining the high volatility of the recent Bitcoin price.
Index Terms—Bitcoin, Blockchain, Bayesian neural network, Time-series analysis, Predictive model

I. INTRODUCTION
relationship between Bitcoin and search information, such as Google Trends and Wikipedia [11], and wavelet analysis of Bitcoin [12].
Relatively few studies have thus far been conducted on estimation or prediction of Bitcoin prices. [13] evaluates Bitcoin price formation based on a linear model by considering related information that is categorized into several factors of market forces, attractiveness for investors, and global macro-financial factors. They assume that the first and second factors mentioned above significantly influence Bitcoin prices but with variation over time. The same researchers limit the number of regressors to facilitate linear model analysis. [14] predicts the Bitcoin pricing process using machine learning techniques, such as recurrent neural networks (RNNs) and long short-term memory (LSTM), and compares the results with those obtained using autoregressive integrated moving average (ARIMA) models. A machine trained only with Bitcoin price index and transformed
2169-3536 (c) 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2017.2779181, IEEE Access
work. When the machine considers a lot of input variables, a trained machine can be complex and suffer from the overfitting problem. BNN models have proven effective in the analysis of financial derivative securities [16]. Formation of the Blockchain, a core technology of Bitcoin, distinguishes Bitcoin from fiat currencies and is directly related to Bitcoin’s supply and demand. To the best of our knowledge, in addition to macroeconomic variables, direct use of Blockchain information, such as hash rate, difficulty, and block generation rate, has not been investigated to describe the process of Bitcoin price. To fill this gap, the current study systematically evaluates and characterizes the process of Bitcoin price by modeling and predicting Bitcoin prices using Blockchain information and macroeconomic factors. We also try to account for the remarkable recent fluctuation, which is shown in Figure 1 and has not been considered in previous studies.
The rest of this article is structured as follows: Section II describes Bitcoin and the Blockchain technique, which is a distinctive feature of Bitcoin not found in general currencies. Section III briefly reviews the BNNs employed to model the process of Bitcoin prices. Section IV presents the experimental design and data specifications. Section V outlines empirical results. Section VI concludes the paper.

II. BITCOIN AND BLOCKCHAIN
A. Economics of Bitcoin
Barro’s model [17] provides a simple Bitcoin pricing model under perfect market conditions as in [13]. In this model, Bitcoin is assumed to possess currency value and is exchangeable with traditional currencies, which are under central bank control and can be used for purchasing goods and services. The total Bitcoin supply, $S_B$, is represented by
$$S_B = P_B B \qquad (1)$$
determined indirectly from the global macroeconomic indexes in actual markets. The exchange rate between several fiat currencies and Bitcoin price describes the relationship between actual markets and the Bitcoin market. The main difference between the Bitcoin market and general currency markets originates from the fact that Bitcoin is a "virtual currency based on Blockchain technologies". Therefore, the economic size, $E$; the velocity, $V$; and the capacity of the Bitcoin market, $B$, are closely related to several measurable market variables extracted from the Blockchain platform, which will be reviewed in the next subsection.

B. Blockchain
Decentralization is the value pursued by all cryptocurrencies, as opposed to general fiat currencies, which are valued by central banks. Decentralization can be specified by the following goals: (i) Who will maintain and manage the transaction ledger? (ii) Who will have the right to validate transactions? (iii) Who will create new Bitcoins? The blockchain is the only available technology that can simultaneously achieve these three goals. Generation of blocks in the Blockchain, which is directly involved in the creation and trading of Bitcoins, directly influences the supply and demand of Bitcoins. The combination of Blockchain technologies and the Bitcoin market is a real-world example of a combination of high-level cryptography and market economies.
maintain the value of Bitcoin by paying Bitcoin to the participant who created the block.
The PoW agreement algorithm comes with several inherent risks. First, the validity of a block can be compromised when the majority of total participants is occupied by a group with a specific purpose, which is called the 51% problem. Second, when the Blockchain is forked, a considerable amount of time is consumed to form the agreed Blockchain until the longest chain is selected after the generation of several blocks. This condition causes a transaction delay because the transaction cannot be completed during that time. Lastly, there may be a capacity limit of the Blockchain or a performance limit of each node. The safety of the current Blockchain can be monitored by observing measurable variables in the Blockchain from https://blockchain.info/.
Considering that supply and demand of Bitcoin are affected directly or indirectly by measurable variables involved in the formation of a Blockchain, the current study evaluates several variables related to Blockchain formation as features of the Bitcoin pricing process. Section IV describes in detail the variables exploited in empirical experiments.

III. TIME SERIES MODELING
For time series analysis, nonlinear methods, such as kernel regression models, exponential autoregressive models, artificial neural networks (ANNs), BNNs, and support vector regression, have attracted research interest and exhibited improved predictive performance for various time series data [16], [19]–[26].
[22] demonstrated that Nikkei 225 index future options in 1995 were better predicted by neural networks using the back-propagation algorithm than by the traditional Black-Scholes models. [16] showed that generalization for pricing and hedging derivatives can be improved by Bayesian regularization techniques and verified this empirically for S&P 500 index daily call options from January 1988 to December 1993. [23] reported that support vector regression (SVR) improved the forecast accuracy for the daily currency market data of AUD/USD, EUR/USD, USD/JPY, and GBP/USD options from January to July in 2009. [24] showed that support vector regression methods optimized by a chaotic firefly algorithm outperform several SVR methods for NASDAQ quotes, Intel (from 9/12/2007 to 11/11/2010), National Bank shares (from 6/27/2008 to 8/29/2011) and Microsoft (from 9/12/2007 to 11/11/2011) daily closing stock prices. [26] tuned the parameters of multi-output support vector regression using a firefly algorithm and compared the proposed SVR methods with other existing methods for forecasting the S&P 500, Nikkei 225, and FTSE 100 market indexes.
[25] showed that least squares support vector machines have better prediction performance for the time series of electrical energy consumption of Turkey compared to traditional regression models and artificial neural networks. [21] proposed time series prediction methods combining backpropagation neural networks and least squares support vector machines for predicting depth-averaged current velocities of underwater gliders.
Unlike other widely studied time series, there are few related papers analyzing the Bitcoin process in terms of prediction performance. In this work, we have employed Bayesian neural networks because the weights of a prediction model with a large number of input variables need to be regularized. We have compared the prediction performance of BNN methods with linear regression methods and SVRs, which are representative prediction methods using various input variables. In this section, we describe the Bayesian neural network model employed for our experiments.

A. Bayesian neural networks
A Bayesian neural network (BNN) is a transformed multilayer perceptron (MLP), which is a general term for ANNs in the field of machine learning. Such networks have been successful in many applications, such as image recognition, pattern recognition, natural language processing, and financial time series [27]. They are known to represent complex time series much more effectively than conventional linear models, e.g., autoregressive and moving average models. The structure of a BNN is constructed with a number of processing units classified into three categories: an input layer, an output layer, and one or more hidden layers. Specifically, neural networks containing at least one hidden layer can solve the exclusive OR (XOR) problem, which cannot be solved by a single layer perceptron [28]. Unlike a single layer perceptron, which can only handle linearly separable data, they solve XOR problems by introducing backpropagation algorithms and hidden layers. The hidden layer maps the original data to a new space, transforming data that cannot be linearly separated into linearly separable data.
Weights of a BNN must be learned between the input and hidden layers and between the hidden and output layers. Backpropagation refers to the process in which the weights of hidden layers are adjusted by the error of the hidden layers, which is propagated from the error of the output layer. An optimization method called the delta rule is used to minimize the difference between a target value and an output value when deriving the backpropagation algorithm. In general, BNNs minimize the sum of the following errors, $E_B$, using the backpropagation algorithm and the delta rule.
$$E_B = \frac{\alpha}{2} \sum_{n=1}^{N} \sum_{k=1}^{K} (t_{nk} - o_{nk})^2 + \frac{\beta}{2} B^{T} B \qquad (4)$$
where $E_B$ is the sum of the errors, $N$ is the number of training vectors, $K$ is the size of the output layer, $t_{nk}$ is the $k$-th variable of the $n$-th target vector, $o_{nk}$ is the $k$-th output variable of the $n$-th training vector, $\alpha$ and $\beta$ are the hyper-parameters, and $B$ is the weights vector of the Bayesian neural network.
A BNN is a non-linear version of ridge regression, which is largely based on the Bayesian theory for neural networks. Unlike conventional neural networks that maximize
marginal likelihood, a BNN is a machine that maximizes the posterior through an application of Bayes’ theorem. The term added to the error function causes the machine to learn by concentrating on a small number of weights with high importance rather than distributing importance over a large number of weights.

B. Resampling methods
In this section, we discuss two representative resampling methods: cross-validation and bootstrap. We identify advantages and disadvantages of each method and select the appropriate method for the empirical analysis of this study.
A bootstrap method is a sampling technique in which a new data set is sampled from the original data set with replacement. A typical bootstrap works as follows [29]:
1. We have the original data set $D$ of size $N$.
2. The following step is repeated $B$ times, for a suitably large number $B$, to produce $B$ different bootstrap data sets, $Z_1, Z_2, \cdots, Z_B$:
   • Data set $Z_i$ of size $N$ is generated by sampling from the original data set $D$ with replacement.
3. The machine is trained on each bootstrap data set.
4. Accuracy of the machine is calculated by averaging over the bootstrap data sets:
$$Accuracy = \frac{1}{B} \sum_{j=1}^{B} \frac{1}{N} \sum_{i=1}^{N} \left(1 - Loss(\hat{y}_i^{j}, y_i)\right) \qquad (5)$$
where $y_i$ is the $i$-th true training output, $\hat{y}_i^{j}$ is the $i$-th estimated output from the bootstrap data $Z_j$, and $Loss(\cdot, \cdot)$ is a loss function.
Cross-validation randomly divides the original data set into $K$ equal-sized parts without replacement. We fit the machine learning model to $K - 1$ parts, leaving out a particular part $k$, and acquire a prediction error for the left-out $k$-th part. Total prediction accuracy is combined after the procedure is repeated for each left-out part [30], [31]. A general procedure is as follows:
1. We divide the original data set into $K$ equal-sized partial data sets, $C_1, C_2, \cdots, C_K$, without replacement.
validation:
$$\widehat{SE}(CV_K) = \sqrt{\frac{\sum_{k=1}^{K} (Err_k - \overline{Err}_k)^2}{N - 1}} \qquad (7)$$
where $Err_k$ is the $k$-th loss, $\sum_{i=1}^{n_k} Loss(\hat{y}_i^{k}, y_i)$.
Bootstrap can be used to validate the performance of a predictive model, to build ensemble methods, and to estimate the bias and variance of the trained model. However, bootstrap, which creates multiple cloned samples with replacement, was not originally developed for model validation and can give more biased results. Therefore, we employ the cross-validation technique for our model validation. Cross-validation can create high-variance problems when the data size is small, but our data size is sufficient to overcome this problem. We employ the 10-fold cross-validation method generally used for model validation.

IV. BLOCKCHAIN DATA DESCRIPTION
This section describes the Blockchain data and macroeconomic variables used in our empirical analysis and their summary statistics.

A. Data specification
Figure 1 shows the time series of Bitcoin price obtained from https://bitcoincharts.com/markets/, where the value of 1 Bitcoin, which was about $5 in September 2011, approximates $4,000 in August 2017. During this period, the volatility of the Bitcoin market, with its enormous price changes, is exceptional compared with that of traditional currency markets. It is evident that standard economic theories are insufficient to account for the impressive price development and volatility of Bitcoin [11]. Bitcoin markets exhibit neither purchasing power parity nor interest rate parity. In particular, Bitcoin is an actual implementation of decentralization, issued under the consent of participants and not by a central bank. This fact suggests the need for completely new determinants of Bitcoin price: the Blockchain information, which includes relevant features as main determinants for pricing Bitcoin. Blockchain data used for empirical analysis can be collected from https://blockchain.info/. Table I presents the Blockchain data and macroeconomic variables to be used in predicting the evolution of Bitcoin prices.

TABLE I
DATA FOR THE EMPIRICAL STUDY
Data category                    Data
Global currency ratio (/USD)     GBP, JPY, CHF, CNY, EUR

Several blockchain variables are considered as follows:
• Average block size (MB): the size of a block verified by all participants.
• Transactions per block: average number of transactions per block.
• Median confirmation time: the median time for each transaction to be accepted into a mined block and recorded to the ledger.
• Hash rate: estimated number of tera (trillion) hashes per second that all miners (market participants who solve a hash problem to create a block) are performing.
• Difficulty: next difficulty = (previous difficulty × 2016 × 10 minutes)/(time to mine last 2016 blocks)
• Cost % of a transaction: miners’ revenue as the percentage of the transaction volume.
• Miners revenue: total value of coinbase block rewards and transaction fees paid to miners.
• Confirmed transaction: the number of confirmed valid transactions per day.
• Total number of a unique Bitcoin: market capitalization of Bitcoin.
By employing ordinary least squares (OLS) estimation, [32] demonstrates that the Dow Jones index, the euro-dollar exchange rate, and the WTI oil price influence the value of Bitcoin price in the long run. We also consider several variables, such as S&P500, Eurostoxx, DOW30, NASDAQ, Crude oil, SSE, Gold, VIX, Nikkei225, and FTSE100, which are associated with global macroeconomic development.
Given that Bitcoin is related to traditional currency markets in addition to the cryptocurrency market itself based on digital cryptography, we take into account the exchange rates between global monetary markets; exchange rates are basic factors in the analysis of traditional currency markets. We specifically use exchange rates between major fiat currencies (GBP, JPY, CHF, CNY, EUR) and the dollar because these rates are most likely to affect the Bitcoin price.
In summary, we cover the daily data from Sep 11, 2011, to Aug 22, 2017 in the empirical analysis by employing both the traditional determinants of currency markets, such as global macro-economic development, and the features endowed from the cryptocurrency. This experiment, which has not been performed in previous studies, primarily aims to discover the main features that can explain the recent highly volatile Bitcoin process.

B. Summary data statistics
Table II shows summary statistics of response variables, Blockchain-related variables, global macroeconomic indexes, and international exchange rates used in empirical analysis from September 13, 2011, to July 21, 2017. Several notable points are considered in the empirical analysis. As shown in Table II, response variables and Blockchain-related variables in the last two years are considerably more variable than other categories such as global macroeconomic indexes and international exchange rates. Bitcoin prices and volatilities have nearly doubled over the past two years. In addition, Blockchain data exhibit a significant increase in trading volume and size per block and a huge reduction in miners’ profit and the hash rate.
On the other hand, the volatility of the global exchange rate market shows little difference between the most recent two years and the overall range, and the growth of the global macroeconomic market over the past two years is much smaller than that of Bitcoin. These results provide empirical evidence that the recent volatility in Bitcoin prices stems mostly from the Blockchain information directly involved in supply and demand of Bitcoin and not from other macro-financial markets.

V. EXPERIMENTAL RESULTS
A. Structure of the experiment
Most of the previous studies have focused on either modeling Bitcoin price without considering its relationship to Blockchain information or identifying only its “linear” relationship to macroeconomic factors. The present study attempts to overcome these limitations by employing a Bayesian NN model that can investigate the nonlinear influences of each relevant feature of the input variables, namely the Blockchain information and macroeconomic factors, on Bitcoin price formation. To this end, we first train a Bayesian NN to model Bitcoin price formation using the above-mentioned relevant features of the process. We evaluate the Bayesian NN in terms of training and test errors, using a representative non-linear methodology, SVR, and the linear regression model as the benchmark methods.
Next, we develop a prediction model of the near-future price of Bitcoin after modeling the entire process. We configure forecasting models by the rollover framework, which is generally applied in portfolio theory. A rollover strategy is known as rolling a position forward, that is, closing out an old position and establishing a new position in a contract of the portfolio with a longer time to maturity. In our experiments, the trained machine closes out old information and acquires new data according to the rollover framework over time. Figure 3 shows a schematic of the rollover strategy employed in our empirical studies. At the initial training step, the machine is trained with $N_{train}$ training data, and the prediction performance is measured using $N_{test}$ test data. Next, after $t' - t$ time from time $t$, the machine is trained again using the $N_{train}$ data from time $t'$ to update the old learning data, and the performance on $N_{test}$ test data is thereafter measured. The machine is trained through the entire range in this way, and the average of the prediction errors measured over these repetitions is evaluated.

Fig. 3. The formation of the Blockchain
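The rollover procedure described above can be sketched as a rolling-window evaluation loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `rollover_windows` and `rollover_rmse`, the `step` parameter (standing in for the shift $t' - t$), and the toy model that forecasts the training-window mean are all hypothetical.

```python
from math import sqrt
from statistics import mean

def rollover_windows(n_samples, n_train, n_test, step):
    """Yield (train, test) index ranges for a rolling window.

    Each window trains on n_train consecutive points and tests on the
    following n_test points; the window then advances by `step` points.
    """
    start = 0
    while start + n_train + n_test <= n_samples:
        train = range(start, start + n_train)
        test = range(start + n_train, start + n_train + n_test)
        yield train, test
        start += step

def rollover_rmse(y, n_train, n_test, step):
    """Average test RMSE over all windows for a toy mean-forecast model.

    The "model" only predicts the mean of the training window; it is a
    placeholder for the trained machine (assumption for illustration).
    """
    errors = []
    for train, test in rollover_windows(len(y), n_train, n_test, step):
        forecast = mean(y[i] for i in train)              # "train" the toy model
        mse = mean((y[i] - forecast) ** 2 for i in test)  # evaluate on next block
        errors.append(sqrt(mse))
    return mean(errors)

# Usage on a toy linear series of 20 points.
series = [float(v) for v in range(20)]
for train, test in rollover_windows(len(series), n_train=10, n_test=5, step=5):
    print(list(train), list(test))
print(rollover_rmse(series, n_train=10, n_test=5, step=5))
```

Replacing the mean forecast with any model that is refit on each training window gives the same evaluation pattern: old observations drop out of the window as new ones enter, and the reported error is the average over all windows.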
TABLE II
SUMMARY STATISTICS OF THE DATA
                               Whole range                 Recent 2 years
Data category                  mean         stdev.         mean         stdev.
Bitcoin price (USD)            458.32       606.2          901.96       804.0
log of Bitcoin price (USD)     5.04         1.92           6.52         0.71
Volatility                     10.75        25.06          21.83        38.88
Trading volatility (BTC)       6.66×10^4    5.82×10^4      7.15×10^4    5.21×10^4
Transactions per block         751.81       625.03         1507.61      389.58
Median confirmation time       9.15         3.59           10.21        3.44
Hash rate                      8.14×10^6    1.41×10^6      2.18×10^6    1.68×10^6
Difficulty                     1.08×10^11   1.86×10^11     2.9×10^11    2.21×10^11
Confirmed transac. per day     1.14×10^5    9.29×10^4      2.26×10^5    5.83×10^4
S&P 500                        1851.29      346.26         2169.8       166.84
Eurostoxx                      2977.97      413.73         3208.97      235.1
Dow Jones 30                   1.64×10^4    2.59×10^3      1.87×10^4    1.71×10^3
Nasdaq                         4279.47      1029.08        5289.29      543.53
Crude oil                      73.53        25.21          45.23        5.98
SSE                            2706.43      633.03         3140.39      223.83
Gold                           1356.01      201.03         1218.72      78.67
VIX                            16.02        5.09           15.12        4.65
Nikkei225                      1.50×10^4    3.87×10^3      1.82×10^4    1.46×10^3
FTSE100                        6444.86      549.54         6704.23      531.92
USD/CNY                        6.37         0.25           6.65         0.2
USD/GBP                        0.67         0.06           0.74         0.06
USD/JPY                        102.36       14.67          112.4        6.44
USD/EUR                        0.82         0.08           0.91         0.03
USD/CHF                        0.95         0.04           0.99         0.02
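The two column groups of Table II (whole range and recent two years) correspond to computing the mean and standard deviation over the full series and over its most recent part. A minimal sketch follows, assuming daily observations and approximating "recent 2 years" by the last 730 observations. The function name, the window length, and the toy data are illustrative assumptions, and the text does not state whether the sample or the population standard deviation was used.

```python
from statistics import mean, stdev

def summary_stats(values, recent_window=730):
    """Return (mean, stdev) for the whole series and for its most recent part.

    recent_window=730 approximates two years of daily data (an assumption).
    statistics.stdev is the sample standard deviation (n - 1 denominator).
    """
    recent = values[-recent_window:]
    return {
        "whole": (mean(values), stdev(values)),
        "recent": (mean(recent), stdev(recent)),
    }

# Toy series: flat early on, larger and more variable at the end.
toy = [1.0, 1.0, 1.0, 1.0, 2.0, 4.0]
stats = summary_stats(toy, recent_window=2)
print(stats["whole"])
print(stats["recent"])
```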
Training the machine through the rollover framework aims to validate the method of forecasting the next $N_{test}$ test data from $N_{train}$ training data. Given that the model employs time series in batch format, it is faster and easier to train than sequential neural network models such as LSTM or RNN, and it can reflect the flow of information that changes with time. The rollover framework can be used to implement semi-online prediction models that incorporate new information or shocks with a short learning time.

B. Linear regression analysis
We first construct a linear model for the analysis of Bitcoin price and address several critical issues in the assumptions of the linear regression model. A basic assumption required for linear regression is the model assumption that linear relationships exist between response variables and independent variables [33]. Table III shows (linear) correlations between explanatory variables and response variables. Each column represents the linear correlation coefficients of the regressors for each response variable, and the value in parentheses represents the result of a t-test for the null hypothesis that there is no linear relationship between the two variables. We denote the variables that reject the null hypothesis in bold, based on a p-value of 0.05, and present the t-value because the p-value was as small as zero. We exclude the return as a response variable because almost none of the correlation coefficients of the explanatory variables are significant for the return value of Bitcoin.
Next, we discuss the multicollinearity problem, which is often encountered in linear regression analysis. Several statistical problems are caused by multicollinearity, which is the situation in which some regressors have a linear relationship with other regressors. It can lead to undesirable regression results: a very high $R^2$ even though some coefficients are not statistically significant, and t-statistics that are sensitive to data variation [33]. One of the prescriptions for dealing with multicollinearity is to perform linear regression after excluding variables with large VIF values, where the VIF is a measure of the linear relationship between variables [33]. To remove redundant variables and prevent collinearity problems, we eliminate several explanatory variables with large VIF values. Table IV shows the VIF values of each explanatory variable. In this study, we have determined that the set of variables excluding linear relationships is suitable for linear regression analysis to avoid the multicollinearity problem. We select 16 suitable discriminators after eliminating variables with large VIFs and perform linear regression analysis on Bitcoin log prices and log volatilities with these 16 discriminators. Removed variables include the following: transactions per block, difficulty of the hash function, Nikkei225 index, S&P 500 index, Eurostoxx index, DOW30 index, NASDAQ, and exchange rates of EUR and GBP. From these 16 regressors, we construct two linear models, one for the log price and one for the volatility of the Bitcoin process. We then evaluate assumption fitness, that is, the residual assumption that residual terms are independently and identically distributed.
Finally, we generate histograms of the residuals of each model to verify the residual assumption by confirming that they follow a normal distribution.
Figure 4 (a) & (b) show that the Bitcoin log price satisfies the residual assumption for linear regression: the histogram is bell-shaped and symmetric and the QQ-plot shows a

TABLE III
CORRELATION COEFFICIENTS AND (T-VALUES) BETWEEN THE RESPONSE AND INDEPENDENT VARIABLES.
Data category               return             price              log(price)         log(vol.)
Trading vol. (BTC)          0.064 (2.987)      0.071 (3.315)      0.123 (5.772)      0.245 (11.769)
Trading vol. (USD)          0.016 (0.745)      0.777 (57.485)     0.474 (25.071)     0.683 (43.549)
Difficulty                  0.024 (1.118)      0.906 (99.686)     0.58 (33.159)      0.588 (33.856)
Miners revenue (%)          -0.034 (-1.584)    -0.34 (-16.838)    -0.51 (-27.613)    -0.24 (-11.514)
Miners revenue (USD)        -0.015 (-0.699)    0.92 (109.326)     0.76 (54.46)       0.625 (37.288)
Confirmed trans. per day    0.008 (0.373)      0.66 (40.915)      0.731 (49.891)     0.402 (20.447)

TABLE IV
VIF VALUES OF EACH EXPLANATORY VARIABLE FOR DETECTING THE COLLINEARITY PROBLEM
Data category    VIF
USD/CHF          7.7059
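The VIF screening used in the multicollinearity discussion can be illustrated with a small, self-contained sketch. It relies on the standard definition $VIF_j = 1/(1 - R_j^2)$, which equals the $j$-th diagonal element of the inverse of the regressors' correlation matrix. This is not the authors' code: the helper names and the toy data are assumptions, and the text does not state the cutoff that was used to call a VIF "large".

```python
from math import sqrt

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        m = sum(col) / n
        centered.append([v - m for v in col])
    norms = [sqrt(sum(v * v for v in c)) for c in centered]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: regressors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF of each regressor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Toy data from a 2^3 factorial design: a, b, c are mutually orthogonal.
a = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
b = [-1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
c = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
x1 = a
x2 = [ai + 0.1 * bi for ai, bi in zip(a, b)]  # nearly collinear with x1
x3 = c                                        # orthogonal to x1 and x2
print(vif([x1, x2, x3]))
```

By construction, the first two toy regressors have a VIF of about 101 and the third a VIF of about 1, so a screening step would flag one of the first two as redundant. In practice a rule-of-thumb cutoff (often 10) is applied, but that value is an assumption here rather than something stated in the text.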
TABLE V
TRAINING ERROR FOR THE BITCOIN PRICE FORMATION
Response var.                      Log price             Log volatility
Support vec. Regression    RMSE    0.1453    0.1434      0.3810    0.3939
                           MAPE    0.0325    0.0322      0.5411    0.6293

Fig. 5. Prediction results of (a) the Bitcoin log price and (b) the Bitcoin log volatility
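The error tables report RMSE and MAPE. For reference, a minimal sketch of these two measures under their standard definitions is given below. The text does not state the exact formulas or whether MAPE is reported as a fraction or a percentage, so the fraction form used here is an assumption, as are the function names and toy values.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%).

    Undefined when any actual value is zero.
    """
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Toy values only; these are not data from the paper.
actual = [2.0, 4.0, 5.0]
predicted = [1.0, 4.0, 7.0]
print(rmse(actual, predicted))
print(mape(actual, predicted))
```

RMSE penalizes large deviations more strongly and is expressed in the units of the response variable, whereas MAPE is scale-free, which is why the two can rank models differently on log price and log volatility.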
TABLE VI
TEST ERROR FOR THE BITCOIN PRICE FORMATION
Response var.    Log price    Log volatility
Fig. 7. Prediction results of (a) the log value of the Bitcoin price and (b) the log volatility of the Bitcoin price.

ACKNOWLEDGMENT
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MEST) (No. 2016R1A2B3014030).

REFERENCES
[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008.
[2] A. H. Dyhrberg, “Bitcoin, gold and the dollar–a garch volatility analysis,” Finance Research Letters, vol. 16, pp. 85–92, 2016.
[3] P. Katsiampa, “Volatility estimation for bitcoin: A comparison of garch models,” Economics Letters, vol. 158, pp. 3–6, 2017.
[4] A. F. Bariviera, M. J. Basgall, W. Hasperué, and M. Naiouf, “Some stylized facts of the bitcoin market,” Physica A: Statistical Mechanics and its Applications, vol. 484, pp. 82–90, 2017.
[5] J. Chu, S. Nadarajah, and S. Chan, “Statistical analysis of the exchange rate of bitcoin,” PloS one, vol. 10, no. 7, p. e0133678, 2015.
[6] A. Urquhart, “The inefficiency of bitcoin,” Economics Letters, vol. 148, pp. 80–82, 2016.
[7] S. Nadarajah and J. Chu, “On the inefficiency of bitcoin,” Economics Letters, vol. 150, pp. 6–9, 2017.
[8] A. H. Dyhrberg, “Hedging capabilities of bitcoin. is it the virtual gold?” Finance Research Letters, vol. 16, pp. 139–144, 2016.
[9] E. Bouri, P. Molnár, G. Azzi, D. Roubaud, and L. I. Hagfors, “On the hedge and safe haven properties of bitcoin: Is it really more than a diversifier?” Finance Research Letters, vol. 20, pp. 192–198, 2017.
[10] E.-T. Cheah and J. Fry, “Speculative bubbles in bitcoin markets? an empirical investigation into the fundamental value of bitcoin,” Economics Letters, vol. 130, pp. 32–36, 2015.
[11] L. Kristoufek, “Bitcoin meets google trends and wikipedia: Quantifying
the relationship between phenomena of the internet era,”
Scientific reports, vol. 3, p. 3415, 2013.
[12] ——, “What are the main drivers of the bitcoin price? evidence from
wavelet coherence analysis,” PloS one, vol. 10, no. 4, p. e0123923,
2015.
[13] P. Ciaian, M. Rajcaniova, and d'A. Kancs, “The economics of bitcoin
price formation,” Applied Economics, vol. 48, no. 19, pp. 1799–1815,
2016.
[14] S. McNally, “Predicting the price of bitcoin using machine learning,”
Ph.D. dissertation, Dublin, National College of Ireland, 2016.
[15] I. Madan, S. Saluja, and A. Zhao, “Automated bitcoin trading via
machine learning algorithms,” 2015.
[16] R. Gençay and M. Qi, “Pricing and hedging derivative securities
with neural networks: Bayesian regularization, early stopping, and
bagging,” IEEE Transactions on Neural Networks, vol. 12, no. 4, pp.
726–734, 2001.
[17] R. J. Barro, “Money and the price level under the gold standard,” The
Economic Journal, vol. 89, no. 353, pp. 13–33, 1979.
[18] A. Narayanan, J. Bonneau, E. Felten, A. Miller, and S. Goldfeder,
Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction.
Princeton University Press, 2016.
[19] J. Pati, B. Kumar, D. Manjhi, and K. Shukla, “A comparison among
arima, bp-nn and moga-nn for software clone evolution prediction,”
IEEE Access, 2017.
[20] A. R. Subhani, W. Mumtaz, M. N. B. M. Saad, N. Kamel, and A. S.
Malik, “Machine learning framework for the detection of mental stress
at multiple levels,” IEEE Access, vol. 5, pp. 13 545–13 556, 2017.
[21] Y. Zhou, J. Yu, and X. Wang, “Time series prediction methods for
depth-averaged current velocities of underwater gliders,” IEEE Access,
vol. 5, pp. 5773–5784, 2017.
[22] J. Yao and C. L. Tan, “Time dependent directional profit model
for financial time series forecasting,” in Neural Networks, 2000.
IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint
Conference on, vol. 5. IEEE, 2000, pp. 291–296.
[23] P. Wang, “Pricing currency options with support vector regression
and stochastic volatility model with jumps,” Expert Systems with
Applications, vol. 38, no. 1, pp. 1–7, 2011.
[24] A. Kazem, E. Sharifi, F. K. Hussain, M. Saberi, and O. K. Hussain,
“Support vector regression with chaos-based firefly algorithm for
stock market price forecasting,” Applied soft computing, vol. 13, no. 2,
pp. 947–958, 2013.
[25] F. Kaytez, M. C. Taplamacioglu, E. Cam, and F. Hardalac, “Forecasting
electricity consumption: a comparison of regression analysis, neural
networks and least squares support vector machines,” International
Journal of Electrical Power & Energy Systems, vol. 67, pp. 431–438,
2015.
[26] T. Xiong, Y. Bao, and Z. Hu, “Multiple-output support vector regression
with a firefly algorithm for interval-valued stock price index
forecasting,” Knowledge-Based Systems, vol. 55, pp. 87–100, 2014.
[27] K. P. Murphy, Machine learning: a probabilistic perspective. MIT
press, 2012.
[28] M. Minsky and S. Papert, “Perceptrons.” 1969.
[29] B. Efron and R. Tibshirani, “Improvements on cross-validation: the
.632+ bootstrap method,” Journal of the American Statistical Association,
vol. 92, no. 438, pp. 548–560, 1997.
[30] B. Efron, “Estimating the error rate of a prediction rule: improvement
on cross-validation,” Journal of the American statistical association,
vol. 78, no. 382, pp. 316–331, 1983.
[31] P. Jonathan, W. J. Krzanowski, and W. McCarthy, “On the use of
cross-validation to assess performance in multivariate prediction,” Statistics
and Computing, vol. 10, no. 3, pp. 209–229, 2000.
[32] D. van Wijk, “What can be expected from the bitcoin,” Erasmus
Universiteit Rotterdam, 2013.
[33] D. N. Gujarati and D. C. Porter, “Essentials of econometrics,” 1999.

Huisu Jang received the B.Sc. and M.S. degrees in industrial engineering
from Seoul National University (SNU), South Korea, in 2013 and 2015,
respectively, where she is currently pursuing the Ph.D. degree with the
Department of Industrial Engineering. Her research interests include
statistical machine learning, time series analysis, and computational
finance.

Jaewook Lee is a professor and chair in the Department of Industrial
Engineering at Seoul National University, Seoul, Korea. He received the
B.S. degree in mathematics from Seoul National University and the Ph.D.
degree in applied mathematics from Cornell University in 1993 and 1999,
respectively. His research interests include machine learning, neural
networks, global optimization, and their applications to data mining and
financial engineering.