Foreign Exchange Forecasting via Machine Learning: Christian González Rojas, Molly Herman

This document proposes using machine learning to generate directional foreign exchange (FX) forecasts to develop profitable trading strategies. It will assess machine learning models' statistical and economic performance on FX prediction. The authors plan to use market and fundamental datasets spanning 2003-2018 for the USDMXN currency pair, and test logistic/linear regression, regularized regression, support vector machines/regression, and gradient boosting models. The goal is to classify long/short signals and forecast FX levels to design a profitable strategy based on machine learning-generated outputs.

Foreign Exchange Forecasting via Machine Learning

Christian González Rojas Molly Herman


cgrojas@stanford.edu mrherman@stanford.edu

I. INTRODUCTION

The finance industry has been revolutionized by the increased availability of data, the rise in computing power and the popularization of machine learning algorithms. According to The Wall Street Journal (2017b), quantitative hedge funds represented 27% of total trading activity in 2017, rivaling the 29% that represents all individual investors. Most of these institutions are applying a machine learning approach to investing.

Despite this boom in data-driven strategies, the literature that analyzes machine learning methods in financial forecasting is very limited, with most papers focusing on stock return prediction. Gu, Kelly, and Xiu (2018) provide the first comprehensive approach to quantifying the effect of applying machine learning (ML) to the prediction of monthly stock returns. Our intention is to implement machine learning methods in a relatively unexplored asset class: foreign exchange (FX).

The objective of this paper is to produce directional FX forecasts that are able to yield profitable investment strategies. Hence, we approach the problem from two perspectives:

1) Classification of long/short signals.
2) Point forecasts of FX levels that translate into long/short signals.

These frameworks allow us to exploit different machine learning methodologies to solve a single problem: designing a profitable FX strategy based on ML-generated forecasts.

II. RELATED WORK

Machine learning methods have long been used in stock return prediction. For instance, variations of Principal Component Analysis, an unsupervised learning technique, have been applied by Connor and Korajczyk (1988), Fan, Liao, and Wang (2016), Kelly, Pruitt, and Su (2018) and Lettau and Pelger (2018) to identify latent risk factors that can explain the dynamics of stock returns. Moreover, Gu et al. (2018) have found that regularization, dimension reduction and the introduction of nonlinearities significantly improve stock return predictions.

Nevertheless, despite the wide adoption of machine learning in stock return forecasting, ML applications in FX prediction have been largely ignored by the literature. Few exceptions are available. Ramakrishnan, Butt, Chohan, and Ahmad (2017) find that, when trained with commodities prices, Random Forests outperform Support Vector Machines and Neural Networks in forecasting the Malaysian FX. Furthermore, Amat, Michalski, and Stoltz (2018) conclude that economic fundamentals gain power to forecast exchange rates even at short horizons if ML methods are applied. Finally, Hryshko and Downs (2004) apply Reinforcement Learning to create FX trading strategies based on technical analysis.

The main contribution of this paper is the assessment of the statistical and economic performance of ML-generated directional forecasts.

III. DATASETS

We make use of two different datasets to explore the forecasting power of two types of variables: market and fundamentals. We define a market variable as an indicator with daily to weekly frequency that has a close relationship with traded securities. On the other hand, we define a fundamental variable as an indicator with monthly frequency that is closely related to the macroeconomy.

Finally, we limit the scope of our project to forecasting the USDMXN, which is the exchange rate between the US Dollar (USD) and the Mexican Peso (MXN), expressed in MXN per USD. However, the exercise can be generalized to other currencies. All data was retrieved either from Bloomberg, the Global Financial Dataset or the Federal Reserve Bank.

A. Market Variables Dataset

We obtained the weekly closing price of the USDMXN currency pair, which we use as our target variable. In addition, we consider 25 features across both Mexico and the United States. A summary is shown in Table I.

TABLE I
MARKET FEATURES: WEEKLY DATASET

Type          Country  Variables
Fixed Income  Mexico   Bond yields (3m, 6m, 1Y and 10Y)
                       Debt holdings
              US       Bond yields (3m, 6m, 1Y and 10Y)*
                       Bond Index
                       Federal Funds Rate*
              Global   Global High-Yield Indices
                       Emerging Market Bond Index
Stock Market  Mexico   Mexican Stock Exchange Index*
              US       S&P 500 Index*
              Global   Volatility Index*
Currency               Dollar Index*
                       Trader positions on USDMXN
Other         Global   Economic Surprise Indices
                       Commodities Index*
* Also considered in the monthly dataset

The dataset spans from the first week of January 2003 to the second week of November 2018.

B. Fundamental Variables Dataset

The fundamental variables data uses the monthly closing price of the USDMXN currency pair as our target variable. We use 27 features that describe the macroeconomic conditions of both the US and Mexico between March 1990 and October 2018. The additional features that are considered in this dataset are detailed in Table II.

TABLE II
FUNDAMENTAL FEATURES: MONTHLY DATASET

Type               Country  Variables
Economic Activity  Mexico   IP, Industrial Production
                            Trade Balance (Exports - Imports)
                   US       IP, Industrial Production
                            Trade Balance (Exports - Imports)
Labor Market       US       Unemployment
                            Non-farm Payroll
Prices             Mexico   CPI, Consumer Price Index
                            PPI, Producer Price Index
                   US       CPI, Consumer Price Index
                            PPI, Producer Price Index
Debt               Mexico   National Debt
                   US       National Debt
Sentiment          US       PMI, Purchasing Managers Index
                            Investor Sentiment
Other              Mexico   M2 Money Supply
                   US       M2 Money Supply

C. Data Processing

Almost all data processing is identical in both datasets. We first split the data into 60% train set, 20% validation set, and 20% test set. These subsets are taken sequentially in order to keep the time-series nature of the data and to guarantee our algorithms train exclusively on past data.

To translate our problem into a classification problem, we introduce the Signal_t variable, which we set to 1 if the USDMXN is higher in the next period than in the current one. That is:

\[
\mathrm{Signal}_t =
\begin{cases}
1 & \text{if } \mathrm{USDMXN}_{t+1} - \mathrm{USDMXN}_t \geq 0 \\
0 & \text{otherwise}
\end{cases}
\]

We also perform data processing on the features. In particular, we standardize every covariate using the mean and standard deviation of the training set.

For the fundamentals dataset, covariates are lagged by an additional period. This is done to approximate the fact that it is extremely rare to obtain real-time macroeconomic data. By lagging the features by one month we ensure we are not peeking into the future by including unpublished data.

IV. FRAMEWORKS AND MODELS

A. Frameworks

First, we perform binary classification on the Signal_t variable we constructed in the data processing step. This essentially transforms what initially is a continuous variable problem into a classification task.

In a second exercise, we use ML algorithms to construct point forecasts for our raw continuous target variable, USDMXN_t. We then construct an estimated long/short signal by computing:

\[
\widehat{\mathrm{Signal}}_t =
\begin{cases}
1 & \text{if } \widehat{\mathrm{USDMXN}}_{t+1} - \widehat{\mathrm{USDMXN}}_t \geq 0 \\
0 & \text{otherwise}
\end{cases}
\]

Both frameworks yield a binary signal output that we can execute as a trading strategy.

B. Models

The performance of different machine learning algorithms is tested for each framework. In particular, we considered:

1) Logistic/Linear Regression: We use logistic and linear regression as our benchmark models.

2) Regularized Logistic/Linear Regression: We consider L1 and L2 regularization applied to logistic and linear regression. This allows us to reduce overfitting. The hyperparameter λ, which penalizes large coefficients, is tuned using the validation set accuracy.

3) Support Vector Machines/Regression (SVM/SVR): It is highly likely that fitting FX dynamics requires a non-linear boundary. SVM/SVR with a Gaussian kernel provide the flexibility to generate a non-linear boundary as a result of the infinite-dimensional feature vector generated by the kernel.

4) Gradient Boosting Classifier/Regression (GBC/GBR): Tree-based models allow us to capture complex interactions between the variables. Unlike Random Forests, which require bootstrapping, GBC allows us to keep the time-series structure of the data while considering non-linearities. It is important to note that GBC and GBR are only considered for the market variables dataset, due to the division of work between the authors (see Section IX).

5) Neural Networks (NN): Neural networks can model complex relationships between input features, which could improve the forecasting performance. We consider fully-connected networks. The architecture is shown in Fig. 1.

Fig. 1. NN architecture: input layer (I_1, ..., I_n), first hidden layer (H_11, ..., H_m1), second hidden layer* (H_12, ..., H_p2) and output (O_1).
* Second hidden layer only for the market variables model.

Gu et al. (2018) show that shallow learning outperforms deeper learning in asset pricing applications. We follow this result and only consider shallow architectures. In particular, we use a network with two hidden layers for the market
variables dataset and a neural net with one hidden layer for the fundamentals dataset.

Our choice for loss depends on the framework. We select logistic loss for classification and mean squared error for the continuous target variable problem. We choose the proper activations in the same fashion: sigmoid is used for classification, while ReLU is used for the continuous target variable. Finally, we use dropout or activation regularization to avoid overfitting.

V. HYPERPARAMETER TUNING

All model hyperparameters are tuned using the validation set. We use accuracy as our performance metric in the binary classification model and mean squared error in the continuous target variable model. The resulting parameters are detailed in Table III.

TABLE III
SELECTED PARAMETERS

Model        Framework   Market            Fundamentals
Regularized  Binary      λLASSO = 0.39     λLASSO = 0.0785
Regression               λRidge = 0.14     λRidge = 1.13
             Continuous  λLASSO = 0.0002   λLASSO = 0.75
                         λRidge = 0.0071   λRidge = 0.29
SVM/SVR      Binary      C = 1000          C = 11.5
                         γ = 0.0001        γ = 0.001
             Continuous  C = 100           C = 14.5
                         γ = 0.00001       γ = 0.0014
NN           Binary      Neuron = 250      Neuron = 100
                         Epoch = 1000      Epoch = 5000
                         Batch = 64        Batch = Full
                         Dropout = 0.2     λ = 5, α = 0.03
             Continuous  Neuron = 500      Neuron = 50
                         Epoch = 2000      Epoch = 7000
                         Batch = 32        Batch = 32
                         Dropout = 0.2     Dropout = 0.2
GBC/GBR      Binary      Trees = 100
                         Depth = 7
                         α = 0.0005
             Continuous  Trees = 500
                         Depth = 3
                         α = 0.01

VI. STATISTICAL PERFORMANCE

A. Binary Experiments

Table IV shows the statistical performance of every model for the binary classification framework applied to the market variables dataset and the fundamentals dataset.

TABLE IV
BINARY CLASSIFICATION: ACCURACY (%)

          Market                    Fundamentals
Model     Train  Validate  Test     Train  Validate  Test
Logistic  62.5   55.2      53.0     67.8   39.1      44.9*
Lasso     59.1   58.8      53.6     58.5   53.6      34.8
Ridge     60.1   61.8      54.2     59.0   53.6      37.7
SVM       59.1   60.0      56.0*    65.4   53.6      40.6
NN        69.7   56.4      54.2     65.5   55.1      40.6
GBC       81.9   52.1      48.2
Note: * marks the highest test-set accuracy in each dataset.

The results provide evidence that market variables have stronger forecasting power than fundamentals when it comes to classifying long/short signals. The largest test accuracy (56.0%) for the market variables was obtained by the SVM, while the maximum test accuracy (44.9%) is achieved by logistic regression for the fundamentals data.

There is, however, an important caveat when interpreting the results. Because accuracy measures only the fraction of predictions that are correct, it does not differentiate between true positives and true negatives. A successful trading strategy should exploit true positives and true negatives, while minimizing false positives and false negatives.

To discern between these cases, Fig. 2 shows the confusion matrix for the SVM model in the market variables dataset. The plot suggests poor performance on the classification of short signals, as well as a prevalence of long predictions.

Fig. 2. Confusion matrix of the SVM model on the market variables dataset

We further explored why this would be the case, even after significant efforts were made to reduce overfitting via regularization. Fig. 3 shows the density of the standardized 3-month yield of Mexican Treasury Bills computed using kernel density estimation, conditional on the binary target variable. The plot provides evidence that both conditional densities are very similar, a pattern that we observed was recurrent across all features. This complicates the classification task and likely induces underperformance in short signals.

Fig. 3. Conditional density of 3-month Mexican T-Bills
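
The two diagnostics used in this section, confusion-matrix rates and class-conditional kernel densities, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy labels and the bandwidth rule (a normal reference rule of thumb) are assumptions made here for illustration, since the paper does not state how its densities were estimated.

```python
import math
from statistics import stdev


def confusion_rates(y_true, y_pred):
    """Accuracy, true positive rate and true negative rate for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "tpr": tp / (tp + fn) if tp + fn else float("nan"),
        "tnr": tn / (tn + fp) if tn + fp else float("nan"),
    }


def kde_gaussian(sample, bandwidth=None):
    """Return a function x -> Gaussian kernel density estimate at x."""
    n = len(sample)
    if bandwidth is None:
        # Assumed default: normal reference rule of thumb for the bandwidth.
        bandwidth = 1.06 * stdev(sample) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def conditional_densities(feature, signal, bandwidth=None):
    """Densities of one feature conditional on Signal = 1 and Signal = 0."""
    longs = [x for x, s in zip(feature, signal) if s == 1]
    shorts = [x for x, s in zip(feature, signal) if s == 0]
    return kde_gaussian(longs, bandwidth), kde_gaussian(shorts, bandwidth)


# Toy example: a classifier that always predicts "long" can reach a decent
# accuracy while never identifying a short signal (true negative rate of 0).
rates = confusion_rates([1, 0, 1, 0, 1, 1], [1, 1, 1, 1, 1, 1])
```

When the two functions returned by `conditional_densities` take nearly the same values over the range of a feature, that feature carries little information for separating long from short signals, which is the pattern described for Fig. 3.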

B. Continuous Experiments

Table V presents the statistical performance of every model for the continuous target framework applied to the market variables and the fundamentals datasets.

TABLE V
CONTINUOUS TARGET: ACCURACY (%)

          Market                    Fundamentals
Model     Train  Validate  Test     Train  Validate  Test
Linear    65.3   65.9      58.8     54.5   55.9      50.0
Lasso     63.2   67.1      57.0     50.5   63.2      52.9
Ridge     63.6   67.1      60.0*    52.0   52.9      50.0
SVR       67.3   56.7      58.2     55.9   45.6      54.5*
NN        79.2   54.9      60.0*    65.2   45.6      54.4
GBR       73.9   50.6      56.4
Note: * marks the highest test-set accuracy in each dataset.

The outperformance of the continuous variable target with respect to the binary classification models is significant. The relative improvement in accuracy between the best performing models is around 7% in the market variables test set (60.0% versus 56.0%) and around 21% in the fundamentals test set (54.5% versus 44.9%). All continuous target models outperform their binary classification counterparts in terms of accuracy and all market-variables models outperform fundamentals models.

Given the poor results of the confusion matrix for the binary classification problem, we explore the results of the continuous experiments. Fig. 4 shows the confusion matrix of the best performing model in terms of accuracy on the market variables data for the continuous variable framework, Ridge regression.

Fig. 4. Confusion matrix of the Ridge model on the market variables data

It is easy to observe that the change with respect to the binary classification model is dramatic. From a 4% true negative rate obtained in the best model for binary classification, this new continuous target framework yields a 59% rate. This is obtained at the expense of a lower true positive rate. However, the true positive rate still yields a reasonable performance of 61%.

VII. ECONOMIC PERFORMANCE

Strong statistical performance in predicting long/short signals does not imply positive economic performance. This is an inherent problem in directional forecasts. A profitable investment strategy requires algorithms that correctly predict the direction of very large movements in the price of the asset. In our case, if an algorithm correctly predicts most small changes but misses large jumps in the exchange rate, it is very likely that it will produce negative economic performance upon execution. This issue has been previously assessed in the literature by Kim, Liao, and Tornell (2014).

Therefore, to assess the economic performance of our models, we compute the cumulative profits generated by the execution of the ML-generated strategy in the test set. The implemented strategy is simple: we start with enough cash in MXN to buy a unit of USD. We then execute the following for every time t:

\[
\mathrm{Strategy}_t =
\begin{cases}
\text{Long USD} & \text{if } \widehat{\mathrm{Signal}}_t = 1 \\
\text{Short USD} & \text{if } \widehat{\mathrm{Signal}}_t = 0
\end{cases}
\]

At the end of every period, the position is closed, profits are cashed in and the strategy is repeated. Finally, we use a long-only strategy as our benchmark for economic performance.

A. Binary Classification

Fig. 5 plots the cumulative profits of executing the binary classification algorithms on the market variables dataset as a trading strategy.

Fig. 5. USD cumulative profits of the market variables dataset

The statistically best performing model corresponds to the economically most profitable specification. However, it is important to note that this positive result is mostly driven by a single correct bet made between weeks 725 and 750. All other strategies produce profits that are equal to or worse than the long-only benchmark.

These results can be explained by the poor performance of the models in terms of the confusion matrix. Due to the very low true negative rate of most models, all specifications are close to the long-only benchmark and the departures are a consequence of a few correct or incorrect short bets.
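
The trading rule and the long-only benchmark described in this section can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each trade is assumed to be one unit of USD, profits are measured in MXN per unit traded (the unit used in the paper's figures may differ), and transaction costs are ignored.

```python
from itertools import accumulate


def signals_from_forecasts(forecasts):
    """Turn point forecasts into 0/1 signals.

    The signal for period t is 1 when the forecast for t+1 is at least the
    forecast for t, mirroring the estimated-signal rule of Section IV.
    """
    return [
        1 if forecasts[t + 1] - forecasts[t] >= 0 else 0
        for t in range(len(forecasts) - 1)
    ]


def cumulative_profits(prices, signals):
    """Cumulative profit of a one-unit long/short rule.

    prices:  USDMXN levels p_0, ..., p_T (MXN per USD).
    signals: T entries in {0, 1}; 1 means long USD over [t, t+1], 0 short.
    The position is closed at the end of every period (assumption: one unit
    of USD per trade and no transaction costs).
    """
    if len(signals) != len(prices) - 1:
        raise ValueError("need exactly one signal per price change")
    per_period = []
    for t, signal in enumerate(signals):
        change = prices[t + 1] - prices[t]
        per_period.append(change if signal == 1 else -change)
    return list(accumulate(per_period))


def long_only_benchmark(prices):
    """Benchmark that is long USD in every period."""
    return cumulative_profits(prices, [1] * (len(prices) - 1))


# Toy example with made-up prices: one correct short bet in the second
# period is enough to separate the strategy from the benchmark.
prices = [19.0, 19.5, 19.25, 20.0]
strategy = cumulative_profits(prices, [1, 0, 1])
benchmark = long_only_benchmark(prices)
```

In this toy example the strategy and the benchmark differ only in the period where a short position is taken, which mirrors the observation that departures from the long-only benchmark are driven by a few short bets.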

B. Continuous Variable Target

Fig. 6 plots the cumulative profits of executing the continuous variable target algorithms on the market variables dataset as a trading strategy.

Fig. 6. USD cumulative profits of the market variables dataset

The differences with respect to the binary classification results are, once again, significant. The final cumulative return in the continuous target variable framework is around 15% higher than under the binary classification framework. Furthermore, all strategies outperform the long-only benchmark with the best strategy being Ridge regression.

In addition, the economic effect of an improved true negative rate is considerable. Unlike the binary classification case, the outperformance of all strategies with respect to the benchmark is not driven by a few correct short positions. Moreover, the reduction in the true positive rate observed for the continuous target variable framework does not significantly penalize cumulative profits. The gains of a high specificity outweigh any losses derived from the reduction in sensitivity.

A natural question to address is which variables explain exchange rate forecasts the most. Fig. 7 shows the relative importance of the features in explaining FX dynamics.

Fig. 7. Variable importance for ridge regression on the market variables dataset under the continuous target framework

It is no surprise that fixed income variables are the most relevant features. The result is consistent with the idea that the exchange rate is closely related to interest rates, as explained by the Uncovered Interest Rate Parity condition widely studied in economics.

Finally, another interesting insight is that the USDMXN reacts strongly to global and emerging-market (EM) fixed income indicators. In theory, the bilateral exchange rate should react strongly to the interest rate differential between the two countries. We believe the observed result provides evidence of investor behavior. As documented in recent years by Bloomberg (2015), The Wall Street Journal (2017a) and The Financial Times (2018), the high liquidity of the Mexican Peso has allowed it to serve as a hedge for long EM positions. Our results are consistent with these findings.

VIII. CONCLUSION AND FUTURE WORK

This paper makes use of machine learning methods to forecast the US Dollar against Mexican Peso exchange rate. We use an innovative framework to find the best possible performance. First, we consider a market variables dataset and a fundamentals dataset on which we train ML algorithms. Second, we conduct binary classification experiments and continuous target experiments to produce the same output: a binary long/short signal on which we are able to execute a simple trading strategy.

Our results suggest that continuous target prediction outperforms binary classification not only in terms of accuracy, but also in terms of specificity and sensitivity. The economic results are in line with this finding, with all algorithms outperforming a long-only benchmark. The best results are produced by SVM in the binary classification case and Ridge regression in the continuous target case, both in terms of accuracy and cumulative profits. Last, we find that the fundamentals dataset yields poor results.

Future work could focus on several areas. First, the recursive validation procedure proposed in Gu et al. (2018) for time-series data could be implemented. This would allow us to obtain classifiers and models that perform better out-of-sample. Second, a major improvement in model performance could be achieved through model ensembling. Finally, using more complex neural network models, such as LSTMs, could increase the forecasting power of our features.

IX. CONTRIBUTIONS

The team worked on the same problem but used different datasets. The contribution to this work was as follows:

Christian González Rojas was in charge of data collection, data processing, algorithm selection and algorithm implementation on the market variables dataset for both the continuous and the binary framework. He decided to consider GBC/GBR as an additional model to further test the value of nonlinear relationships. He was also responsible for writing the CS229 poster and the CS229 final report. His data and code can be found at this link.

Molly Herman worked on data collection, data processing and algorithms for the fundamentals dataset. She was responsible for modifying the CS229 poster to create an alternative version for the CS229A presentation and was in charge of writing her own final report for CS229A.

The division of work for the poster and the final report was done to provide deeper insight on the results to which each author contributed the most.

REFERENCES

Amat, C., Michalski, T., & Stoltz, G. (2018). Fundamentals and exchange rate forecastability with simple machine learning methods. Journal of International Money and Finance, 88, 1-24.

Bloomberg. (2015). Why Traders Love to Short the Mexican Peso.

Connor, G., & Korajczyk, R. A. (1988). Risk and return in an equilibrium APT: Application of a new test methodology. Journal of Financial Economics, 21(2), 255-289.

Fan, J., Liao, Y., & Wang, W. (2016). Projected principal component analysis in factor models. Ann. Statist., 44(1), 219-254.

Gu, S., Kelly, B. T., & Xiu, D. (2018). Empirical Asset Pricing via Machine Learning. Chicago Booth Research Paper, No. 18-04.

Hryshko, A., & Downs, T. (2004). System for foreign exchange trading using genetic algorithms and reinforcement learning. International Journal of Systems Science, 35(13-14), 763-774.

Kelly, B., Pruitt, S., & Su, Y. (2018). Characteristics are covariances: A unified model of risk and return. Journal of Financial Economics, Forthcoming.

Kim, Y. J., Liao, Z., & Tornell, A. (2014). Speculators' Positions and Exchange Rate Forecasts: Beating Random Walk Models. Working Paper.

Lettau, M., & Pelger, M. (2018). Factors that fit the time series and cross-section of stock returns. Working Paper.

Ramakrishnan, S., Butt, S., Chohan, M. A., & Ahmad, H. (2017). Forecasting Malaysian exchange rate using machine learning techniques based on commodities prices. In 2017 International Conference on Research and Innovation in Information Systems (ICRIIS) (pp. 1-5).

The Financial Times. (2018). Mexico's Peso remains the bellwether for Emerging Markets.

The Wall Street Journal. (2017a). The Mexican Peso: A Currency in Turmoil.

The Wall Street Journal. (2017b). The Quants Run Wall Street Now.
