Quantum Computing For Finance
Oswaldo Zapata, PhD
An Introductory Guide
The purpose of this short book is to illustrate how quantum computing is transform-
ing the future of finance and to outline the actions that professionals and financial
institutions can—and must—take to prepare for the upcoming revolution. It is
specifically intended for finance students and professionals who are interested in
learning how quantum computing is transforming their industry. Quantum comput-
ing experts with an interest in finance may also find it valuable.
Contents
1 Introduction
2 Review of Quantum Computing
3 Non-Quantum Approaches
3.1 Optimization Theory
3.1.1 Portfolio Optimization
3.2 Monte Carlo Simulation
3.2.1 Stochastic Processes
3.2.2 Monte Carlo and the Greeks
3.3 Machine Learning
3.3.1 ML Algorithms
3.3.2 ML in Finance
3.4 Computational Finance
3.4.1 Historical Background
3.4.2 Python's Dominance
4 Quantum-Enhanced Solutions
4.1 Quantum Portfolio Optimization
4.1.1 The VQE Algorithm
4.1.2 The QAO Algorithm
4.2 Quantum Machine Learning
4.2.1 QML Algorithms
4.2.2 QML in Finance
4.3 Programming Quantum Computers
6 Conclusion
Chapter 1
Introduction
The purpose of this short book is to illustrate how quantum computing is transform-
ing the future of finance and to outline the actions that professionals and financial
institutions can—and must—take to prepare for the upcoming revolution.
It is specifically intended for finance students and professionals who are interested
in learning how quantum computing is transforming their industry. Quantum com-
puting experts with an interest in finance may also find it valuable.
Chapter 2 provides a brief overview of quantum computing. Since I have already
explained the basics of quantum computing in detail in previous notes,1 this chapter
focuses only on the key concepts most relevant to the upcoming chapters.
In Chapter 3, I discuss some of the most pressing challenges currently faced by the
financial industry and how experts are addressing them. Specifically, I highlight ar-
eas where quantum computing shows promise in offering significant advantages. Key
topics include portfolio optimization, Monte Carlo simulations, and various prob-
lems involving artificial intelligence models, particularly machine learning. Without
delving too deeply into technical details, I also touch on the programming skills
necessary to tackle these problems.
Chapter 4 discusses how quantum computing can help address the challenges men-
tioned in the previous chapter, particularly in portfolio optimization and the inte-
gration of quantum computing with machine learning. It concludes with a survey of
the most popular frameworks used to program the various quantum hardware platforms available on the market today.
Chapter 5 provides an overview of the quantum computing landscape specifically
for finance. I discuss the maturity of quantum hardware, concrete examples of major
firms and startups driving the adoption of quantum technology, as well as practical
advice on what professionals and financial institutions can begin doing now to avoid
falling behind and gain a competitive advantage.
A final note on how to approach this guide: I suggest focusing on the chapters and
sections that align with your interests and background. For instance, if you are a
chief innovation officer at a bank and find the chapter on quantum computing too
challenging, feel free to skip ahead to the parts most relevant to you. Conversely,
if you are a quantum computing physicist and find this chapter too basic, move
on to the material that captures your interest. That said, for a comprehensive
understanding of the subject, I recommend reading and digesting the entire book.
1 See "An Introduction to Quantum Computing for Physicists" and "A Second Course on Quantum Computing for Physicists." You can find them on my LinkedIn. In the following, I will refer to them as QC1 and QC2.
Feel free to connect with me on LinkedIn and let me know if you have any feedback:
https://www.linkedin.com/in/oswaldo-zapata-phd-quantum-finance.
Chapter 2
Review of Quantum Computing
Quantum computers are expected to solve some problems significantly faster and
more accurately than classical devices. This chapter provides a brief summary of
their key features.
Quantum computation is a quantum mechanical approach to solving computa-
tional problems. It uses the principles and mathematical formalism of quantum
mechanics—such as state vectors, unitary operators, and measurements—to arrive
at logical solutions to given computational problems. A quantum computer, on the
other hand, is the physical device that implements the quantum computational pro-
cesses of interest. For decades, it was believed that quantum computations could
only be performed on quantum computers built entirely from quantum components.
In this chapter, we will see that this is no longer the case. Today, experts are con-
vinced that the first machines to surpass classical supercomputers will be hybrid
devices, composed of both quantum and classical components working together in
tandem.
This chapter is organized as follows. Sections 2.1 and 2.2 review fundamental
concepts of quantum computing, with a focus on the circuit model of quantum
computation and error correction. Section 2.3 offers a preliminary introduction to
the hybrid quantum-classical computational models just mentioned.
Let us begin with the most fundamental concepts. A model of computation, broadly
speaking, is a logical framework that defines how to proceed given a set of basic
elements and instructions. More precisely, a model of computation is characterized
by a set of abstract objects and a collection of elementary operations on these
objects. An algorithm is a sequence of precise instructions defined within a model
of computation, designed specifically to solve a computational problem.
For example, the Boolean or binary model of computation is based on the so-called
Boolean algebra. In Boolean algebra, there are only two elements, conventionally
denoted by 0 and 1, and three elementary operations, referred to as NOT, AND,
and OR. If we denote any two arbitrary elements as i, j = 0, 1, and use the standard
notation of the arithmetic system, the elementary operations are defined as follows:

NOT $i = 1 - i$ , $\qquad$ $i$ AND $j = i\, j$ , $\qquad$ $i$ OR $j = i + j - i\, j$ .
for the machine to do its job. It was believed that, thanks to digital computers, the
solution to any solvable problem was merely a matter of time. The goal of com-
puter science was then to discover more efficient algorithms and build more powerful
computers.
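To make the arithmetic form of the Boolean operations above concrete, here is a minimal Python sketch (my own illustration, not part of the original text) that prints the full truth table:

```python
# A minimal sketch of the three elementary Boolean operations in
# arithmetic form, with i, j taking values in {0, 1}.

def NOT(i: int) -> int:
    return 1 - i

def AND(i: int, j: int) -> int:
    return i * j

def OR(i: int, j: int) -> int:
    return i + j - i * j

# Print the full truth table.
for i in (0, 1):
    for j in (0, 1):
        print(f"i={i} j={j}  NOT i={NOT(i)}  i AND j={AND(i, j)}  i OR j={OR(i, j)}")
```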
2.1 Quantum Circuits

Let us examine in more detail some of the basic components of the quantum circuit model of computation.
A single qubit is the simplest element of this model. If we denote by |0⟩ and |1⟩
the two measurable states, any single qubit |q⟩ will be a linear superposition of these
states,
$|q\rangle = \alpha_0\,|0\rangle + \alpha_1\,|1\rangle = \sum_{i=0}^{1} \alpha_i\,|i\rangle$ , $\qquad$ (2.1.4)
where α0 and α1 are complex numbers. The states |0⟩ and |1⟩ are called the compu-
tational basis states. Quantum mechanics affirms that the probability of measuring
the single qubit $|q\rangle$ in state $|i\rangle$ is $|\alpha_i|^2$. Instead of using complex numbers to specify
the single qubit, two real numbers associated with angles in spherical coordinates
can be used. The general expression for the single-qubit state vector in these new
variables is:
$|q\rangle = \cos(\vartheta/2)\,|0\rangle + e^{i\varphi} \sin(\vartheta/2)\,|1\rangle$ .

[Figure: the Bloch sphere. The state $|q\rangle$ is a point on the unit sphere, with $|0\rangle$ at the north pole along the $z$ axis, $|1\rangle$ at the south pole, polar angle $\vartheta$, and azimuthal angle $\varphi$.]
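As a quick numerical illustration (a sketch of my own, using NumPy), we can build this state from the two angles and verify the Born-rule probabilities:

```python
import numpy as np

# Build the single-qubit state |q> = cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>
# and check that the measurement probabilities |alpha_i|^2 sum to 1.

theta, phi = np.pi / 3, np.pi / 4          # example Bloch-sphere angles
alpha0 = np.cos(theta / 2)
alpha1 = np.exp(1j * phi) * np.sin(theta / 2)
q = np.array([alpha0, alpha1])             # amplitudes in the computational basis

probs = np.abs(q) ** 2                     # Born rule: P(i) = |alpha_i|^2
print("P(0), P(1) =", probs, " sum =", probs.sum())
```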
In general, an $n$-qubit state is a linear superposition of the $2^n$ computational basis states,

$|q\rangle = \sum_{i_1 \ldots i_n} \alpha_{i_1 \ldots i_n}\, |i_1 \ldots i_n\rangle$ ,

where each bit $i_1, \ldots, i_n$ in the string $i_1 \ldots i_n$ can take the values 0 and 1, and the coefficients $\alpha_{i_1 \ldots i_n}$ are complex numbers. Note that some states of the $n$-qubit system are not entangled. For example, the following product states are non-entangled states:

$|i \ldots i\rangle = |i\rangle^{\otimes n}$ . $\qquad$ (2.1.7)
A quantum gate is a unitary transformation applied to an input qubit. For instance,
a single-qubit gate is a unitary transformation applied to a single qubit,
$|q\rangle \longmapsto U|q\rangle = \sum_{i=0}^{1} \alpha_i\, U|i\rangle$ , $\qquad$ (2.1.8)

where, by the definition of a unitary operator, $\langle q'|\,U^{\dagger} U\,|q\rangle = \langle q'|q\rangle$. The circuit element corresponding to the gate $U$ is drawn as a box labeled $U$ on a wire, with $|q\rangle$ entering and $U|q\rangle$ exiting.
The Pauli gate Z is a special case, when ϕ = π, of the relative phase gate (phase
shift gate),

$P(\phi)\,|0\rangle = |0\rangle$ , $\qquad P(\phi)\,|1\rangle = e^{i\phi}\,|1\rangle$ . $\qquad$ (2.1.12)
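A short NumPy sketch of these gates (my own illustration; the matrices are the standard representations of $P(\phi)$ and $Z$ in the computational basis):

```python
import numpy as np

# Single-qubit gates as unitary matrices: the phase-shift gate P(phi)
# and its special case Z = P(pi).

def P(phi: float) -> np.ndarray:
    return np.array([[1, 0], [0, np.exp(1j * phi)]])

Z = P(np.pi)                               # Pauli Z as a special case
q = np.array([1, 1]) / np.sqrt(2)          # the state (|0> + |1>)/sqrt(2)

print("Z|q> =", Z @ q)                     # relative phase flips the |1> amplitude
# Unitarity check: U^dagger U = identity, so inner products are preserved.
print("P(0.7) unitary:", np.allclose(P(0.7).conj().T @ P(0.7), np.eye(2)))
```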
After this brief survey of both the binary and quantum models of computation,
you may be asking yourself: why do we need quantum computers if we already have
classical (digital) computers, which have proven to be highly reliable for solving most
practical computational problems? There are two main reasons to believe that, in
some cases, quantum computers will surpass classical computers. The first reason is
that quantum computers may solve certain problems much faster than classical com-
puters. This is what is meant when it is said that quantum computers will be more
“efficient” or “powerful” than classical computers. In practical terms, this means
that, in the future, quantum computers could be much smaller (and less complex)
than classical computers designed for the same computational tasks. The second
reason for the growing interest in quantum computing is that quantum computers
will likely (though there is no formal proof yet) be able to solve computational prob-
lems that classical computers cannot solve in any feasible amount of time. In other words, scientists expect them
to tackle problems that even the most powerful digital computers imaginable may
never be able to solve.
Note that this method would not work if several errors could occur simultaneously. In fact, the majority voting strategy we just used will only correct a single wrong bit. By repeating the initial bit many more times and assuming that the probability of an error occurring is very small, this correction procedure becomes arbitrarily reliable.
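The following Python sketch (mine, with an illustrative error probability p) simulates this classical repetition code and shows how majority voting suppresses the error rate from p to roughly 3p²:

```python
import numpy as np

# Simulation of the classical 3-bit repetition code: each copy of the
# bit is flipped independently with probability p, and majority voting
# recovers the original bit unless two or more flips occur.

rng = np.random.default_rng(0)
p, trials = 0.05, 100_000
bit = 1

copies = np.full((trials, 3), bit)
flips = (rng.random((trials, 3)) < p).astype(int)   # independent bit-flip errors
received = copies ^ flips                           # apply the errors
decoded = (received.sum(axis=1) >= 2).astype(int)   # majority vote

print("raw error rate    :", p)
print("decoded error rate:", np.mean(decoded != bit))  # ~ 3p^2 - 2p^3
```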
The bit-flip quantum repetition code is similar. However, there are some funda-
mental differences. Like in the classical case, instead of sending a single qubit |q⟩,
we send a three-qubit state |q⟩L ,
$|q\rangle = \sum_{i=0}^{1} \alpha_i\,|i\rangle \;\xrightarrow{\;\text{encode}\;}\; |q\rangle_L = \sum_{i=0}^{1} \alpha_i\,|i\,i\,i\rangle \;\xrightarrow{\;\text{send}\;}$ $\qquad$ (2.2.3)
To distinguish $|q\rangle_L$ from the initial qubit $|q\rangle$, the former is called the logical qubit, and each of the state vectors in $|i\rangle\,|i\rangle\,|i\rangle$ is referred to as a physical qubit. Now,
during the process of transmitting the qubit, errors may occur. In our example,
suppose a bit flip occurs. For the initial single qubit |q⟩, this would have meant
$|q\rangle = \sum_{i=0}^{1} \alpha_i\,|i\rangle \;\longrightarrow\; \sum_{i=0}^{1} \alpha_i\,|\bar{i}\rangle$ ; $\qquad$ (2.2.4)
however, since we are sending a logical three-qubit state, the error can now occur in
any of the three physical qubits. Suppose that it is the second physical qubit that
gets flipped,
$|q\rangle = \sum_{i=0}^{1} \alpha_i\,|i\rangle \;\xrightarrow{\;\text{encode}\;}\; |q\rangle_L = \sum_{i=0}^{1} \alpha_i\,|i\,i\,i\rangle \;\xrightarrow{\;\text{send (error occurs)}\;}\; \sum_{i=0}^{1} \alpha_i\,|i\,\bar{i}\,i\rangle$ $\qquad$ (2.2.5)
The detection and correction of an error in a qubit is not as simple as in the clas-
sical case. In fact, we can measure a classical bit and leave it almost undisturbed.
However, according to the principles of quantum mechanics, the measurement of a
qubit will project it onto one of the observable states. To avoid this, we perform
a parity check (see QC1, Subsection 5.2). After identifying the error, we correct it
and recover the initial qubit. The entire process can be summed up as follows,
$|q\rangle = \sum_{i=0}^{1} \alpha_i\,|i\rangle \;\xrightarrow{\;\text{encode}\;}\; |q\rangle_L = \sum_{i=0}^{1} \alpha_i\,|i\,i\,i\rangle \;\xrightarrow{\;\text{send (error occurs)}\;}\; \sum_{i=0}^{1} \alpha_i\,|i\,\bar{i}\,i\rangle$

$\;\xrightarrow{\;\text{parity check}\;}\; \xrightarrow{\;\text{correct}\;}\; |q\rangle_L = \sum_{i=0}^{1} \alpha_i\,|i\,i\,i\rangle \;\xrightarrow{\;\text{decode}\;}\; \sum_{i=0}^{1} \alpha_i\,|i\rangle = |q\rangle$ .
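For readers who want to experiment, here is a sketch of the bit-flip code in Qiskit (assuming Qiskit is installed). The decode-and-correct construction with two CNOTs and a Toffoli is one standard textbook variant, not necessarily the exact circuit behind the parity-check description above:

```python
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector, partial_trace

# Three-qubit bit-flip code: encode, inject one X error, then decode
# and correct with a Toffoli gate controlled by the syndrome qubits.

qc = QuantumCircuit(3)
qc.ry(2 * np.pi / 5, 0)      # prepare an arbitrary |q> = a|0> + b|1> on qubit 0
qc.cx(0, 1); qc.cx(0, 2)     # encode: a|000> + b|111>
qc.x(1)                      # a single bit-flip error on the middle qubit
qc.cx(0, 1); qc.cx(0, 2)     # map the error onto the two syndrome qubits
qc.ccx(1, 2, 0)              # flip qubit 0 back iff both syndrome bits are 1

state = Statevector.from_instruction(qc)
# The reduced state of qubit 0 should match the original a|0> + b|1>.
print(partial_trace(state, [1, 2]))
```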
Other types of errors can occur, and similar procedures have been developed to
detect and correct them. In addition to errors related to the transmission of qubits,
gates can also produce errors. For example, imagine you expect a qubit |i⟩ to exit
a gate, but instead, the qubit |ī⟩ exits. Worse yet, a qubit transferred through a
noisy channel might enter a faulty gate. If the two errors combine, the final qubit
could become unrecognizable. Therefore, if we are not careful, errors can propagate
throughout the circuit, making the final computation unreliable.
Finally, since the error correction process—namely, the encoding, detection, and
correction of errors—is performed by quantum devices, these devices will inevitably
introduce additional errors. As a result, error correction requires larger quantum
computers, which increases the probability that the final computational result will be
incorrect. The good news is that it has been proven that, under certain conditions,
error correction codes can reduce computational errors to arbitrarily small levels.
A fault-tolerant quantum computer is a quantum computer designed such that the
errors occurring in the logical qubits at each stage of the process are corrected along
the way, ensuring the final computational result is reliable. However, this technology
remains far from being realized.
Chapter 3
Non-Quantum Approaches
With the basic understanding of quantum computing gained in the previous chapter,
we are now ready to explore some of the most complex computational problems
in finance. In the following chapter, we will examine how quantum computers can
enhance these methods. We begin with a brief overview of some elementary financial
concepts.
In finance, a portfolio refers to a collection of assets that an investor owns, such
as stocks, bonds, or real estate.1 For simplicity, let us assume that these assets are
held for a fixed period. At the end of this period, some assets will have increased in
value, while others will have decreased. This means that the investor will profit from
some assets and experience losses on others. The objective of a portfolio manager
is to maximize overall returns while managing risk. Balancing assets to achieve
the optimal tradeoff between risk and return is a mathematical challenge known as
portfolio optimization. This is far from trivial, as it requires analyzing large volumes
of data, including historical performance, correlations among assets, and risk factors.
When investments span multiple successive periods, the problem becomes even more
complex, as it requires predicting future market conditions and adjusting strategies
accordingly.
Derivatives are financial instruments whose values are derived from the price of
one or more underlying assets. These underlying assets can be stocks, bonds, com-
modities, currencies, or interest rates. For example, the value of a derivative tied
to a car manufacturing company’s stock could depend on factors such as the prices
of raw materials, the cost of production components, market demand for vehicles,
and broader economic, geopolitical, and environmental influences. The pricing of
derivatives is particularly complex because it involves predicting the value of the
underlying assets, which are influenced by multiple interconnected variables. This
makes derivative pricing one of the most intricate problems in finance. Sophisti-
cated mathematical models and powerful computational techniques, such as numer-
ical methods and simulations, have been developed to estimate prices and manage
the risks associated with derivatives.
An option is a special type of derivative. In simple terms, an option gives its
owner the right, but not the obligation, to buy or sell an underlying asset at a
predetermined price within a specified time frame. A call option gives the holder
the right to buy the asset at a fixed price, known as the strike price, before the
option’s expiration date. If the market price of the asset rises above the strike price,
1 For most of the financial concepts, I have relied on J. C. Hull's classic book, Options, Futures, and Other Derivatives.
the option holder can exercise the option, purchase the asset at the lower fixed price,
and then sell it at the higher market price to make a profit. A put option, on the
other hand, gives the holder the right to sell the asset at a fixed price before the
expiration date. If the market price of the asset falls below the strike price, the
option holder can sell the asset at the higher fixed price, thereby avoiding losses
that would occur by selling it at the lower market price. While the basic concept
of options is relatively straightforward, predicting their future prices is extremely
challenging. The price of an option is influenced by several factors, such as the volatility of the underlying asset, the time remaining until expiration, and prevailing interest rates. These
interdependencies make option pricing a complex mathematical and computational
problem that requires advanced modeling techniques.
Financial services refer to the various offerings provided by financial institutions
to their clients. It is essential for these institutions to thoroughly assess the potential
consequences and risks associated with delivering these services. For example, when
individuals or businesses apply for loans from a bank, the bank must evaluate their
creditworthiness and ability to repay. This process involves analyzing factors such
as income, outstanding debts, repayment history, and credit behavior. The outcome
of this evaluation is a numerical score, known as the credit score, which quantifies
the risk that the borrower might default. This area of financial theory is referred
to as credit risk assessment. Optimizing the loan evaluation process is critical for
banks, as they need to minimize financial losses while providing fair and timely
services to customers. The goal is to identify and reject high-risk loan applicants,
while avoiding the denial of loans to eligible borrowers who genuinely need financial
support.
Another area of concern for financial institutions is credit card fraud. Prevent-
ing fraudulent transactions is essential, but unnecessary interruptions in legitimate
transactions can frustrate customers. For example, if someone takes a flight and uses
their credit card in two geographically distant locations within a short period, the
bank may flag the activity as suspicious. To reduce fraud while ensuring a smooth
customer experience, banks rely on advanced techniques to identify suspicious be-
haviors and anomalies.
In addition to satisfying the ever-growing demands imposed by customers, banks
must also comply with governments and regulatory bodies. One of the critical con-
cerns in this context is anti-money laundering. Financial institutions are responsible
for ensuring that the money they manage does not originate from illegal activities
such as tax evasion, corruption, or trafficking. To meet this requirement, banks
employ systems to monitor and flag suspicious transactions. Alongside anti-money
laundering, risk management is another crucial responsibility for banks. Financial
organizations are subject to regulations that limit the level of risk they can assume
when investing customer funds. To ensure compliance, they use sophisticated models
to evaluate and mitigate risks, ensuring that investments remain within acceptable
thresholds.
To sum up, financial institutions can enhance both their internal operations and
client services by adopting innovative technologies. As we will explore in the next
chapter, quantum computers have the potential to significantly advance classical
methods—an ambitious pursuit in today’s increasingly complex and competitive
environment.
3.1 Optimization Theory
The simplest case is a linear function of $n$ binary variables,

$f(b_1, \ldots, b_n) = \sum_{i=1}^{n} c_i\, b_i$ ,

where the $c_i$'s are real constants and $b_i \in \{0, 1\}$, for $i = 1, 2, \ldots, n$. For a quadratic function $f$, quadratic terms must also be included:

$f(b_1, \ldots, b_n) = \sum_{i=1}^{n} c_i\, b_i + \sum_{i,j=1}^{n} Q_{ij}\, b_i\, b_j = \sum_{i=1}^{n} c_i\, b_i^2 + \sum_{i,j=1}^{n} Q_{ij}\, b_i\, b_j$ ,

where the $Q_{ij}$'s are constants. In the second equality, we have used that $b_i^2 = b_i$. Using index notation, the quadratic sum can be written as

$\sum_{i,j=1}^{n} b_i\, Q_{ij}\, b_j$ . $\qquad$ (3.1.3)
Combining the linear and quadratic pieces,

$f(b_1, \ldots, b_n) = \alpha \sum_{i,j=1}^{n} b_i\, Q_{ij}\, b_j + \beta \sum_{i=1}^{n} c_i\, b_i$ ,

where the constants $\alpha$ and $\beta$ have been included for greater generality. We will assume that, as is the case in most practical situations, $Q_{ij} = Q_{ji}$.
It is common practice to arrange the $n$ independent variables $b_1, b_2, \ldots, b_n$ into a column vector $\mathbf{b} = [b_1\; b_2\; \ldots\; b_n]^T$. Similarly, we define the column vector $\mathbf{c} = [c_1\; c_2\; \ldots\; c_n]^T$ and the $n \times n$ symmetric matrix $Q = [Q_{ij}]$, allowing the quadratic function to be expressed as

$f(\mathbf{b}) = \alpha\, \mathbf{b}^T Q\, \mathbf{b} + \beta\, \mathbf{c}^T \mathbf{b}$ . $\qquad$ (3.1.5)
2 For more details, see my notes, "An Introduction to Portfolio Optimization with Quantum Computers," hereafter referred to as QC3. You can find it on my LinkedIn.
Let us now briefly explore how this relates to the optimization of financial portfo-
lios.
Suppose the initial capital is distributed among $S$ stocks, so that

$p_P(0) = \sum_{s=1}^{S} p_s(0)$ , $\qquad$ (3.1.7)

where $s = 1, 2, \ldots, S$, and $p_s(0)$ is the initial amount of money invested in the $s$th stock. After a time $T$, the value of the investment in the $s$th stock becomes $p_s(T)$, and the total value of your portfolio at that time is given by

$p_P(T) = \sum_{s=1}^{S} p_s(T)$ . $\qquad$ (3.1.8)
The portfolio return rate over the period is then

$R_P(T) = \dfrac{p_P(T) - p_P(0)}{p_P(0)} = \sum_{s=1}^{S} w_s(0)\, R_s(T)$ , $\qquad$ (3.1.9)

where

$R_s(T) = \dfrac{p_s(T) - p_s(0)}{p_s(0)}$ $\qquad$ (3.1.10)

is the $s$th stock return rate, and

$w_s(0) = \dfrac{p_s(0)}{p_P(0)}$ . $\qquad$ (3.1.11)

Note that $w_s(0)$ represents the proportion of the initial capital invested in the $s$th stock; it is known as the $s$th stock weight.
The portfolio return rate (3.1.9) can be written in vector notation as follows:

$R_P(T) = \mathbf{w}^T(0)\, \mathbf{R}(T)$ . $\qquad$ (3.1.12)
The expected portfolio return rate is the weighted mean of the expected stock return rates,

$\mu_P(T) = \sum_{s=1}^{S} w_s(0)\, \mu\big(R_s(T)\big)$ ,

where the Greek letter $\mu$ stands for mean in probability theory. In vector notation,

$\mu_P(T) = \mathbf{w}^T(0)\, \boldsymbol{\mu}\big(\mathbf{R}(T)\big)$ .
Associated with the uncertainty of a portfolio’s return rate is the concept of risk.
Risk, simply put, is the possibility of losing money on an investment. In mod-
ern portfolio theory, also known as the mean-variance model (developed by Harry
Markowitz in the 1950s), risk is assumed to be directly proportional to the uncer-
tainty of the portfolio’s return rate. In other words, the higher the uncertainty of
the portfolio return rate, the greater the risk. Note that, according to this defini-
tion, risk also encompasses the possibility of earning more than expected. However,
risk is generally understood with a negative connotation, often interpreted as the
likelihood of earning less than expected or even incurring a loss.
Since the price movements of two or more stocks in a portfolio can be corre-
lated, the risk of an investment portfolio must account for these correlations. The
mathematical object that incorporates these correlations is the covariance matrix $\Sigma\big(\mathbf{R}(T)\big)$. Modern portfolio theory defines the variance of the portfolio return rate as

$\sigma_P^2(T) = \mathbf{w}^T(0)\, \Sigma\big(\mathbf{R}(T)\big)\, \mathbf{w}(0)$ , $\qquad$ (3.1.16)
and the risk is directly proportional to it.
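A minimal NumPy sketch of these two quantities, with made-up numbers for three stocks (the returns, covariances, and weights below are purely illustrative):

```python
import numpy as np

# Mean-variance quantities: expected portfolio return w^T mu and
# portfolio variance w^T Sigma w.

mu = np.array([0.08, 0.12, 0.10])          # expected return rates of 3 stocks
Sigma = np.array([[0.040, 0.006, 0.004],   # covariance matrix of the returns
                  [0.006, 0.090, 0.010],
                  [0.004, 0.010, 0.060]])
w = np.array([0.5, 0.2, 0.3])              # portfolio weights, sum to 1

print("expected return :", w @ mu)
print("variance (risk) :", w @ Sigma @ w)
```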
Given an expected portfolio return rate $M$, the goal of a portfolio manager is to minimize the risk. That is, schematically,

$\min_{\mathbf{w}(0)} \Big[\, \alpha\, \mathbf{w}^T(0)\, \Sigma\big(\mathbf{R}(T)\big)\, \mathbf{w}(0) - \mathbf{w}^T(0)\, \boldsymbol{\mu}\big(\mathbf{R}(T)\big) \,\Big]$ subject to $\mu_P(T) = M$ ,

and, of course,

$\mathbf{1}^T \mathbf{w}(0) = 1$ . $\qquad$ (3.1.19)
The positive number α is called the risk aversion coefficient and measures the in-
vestor’s tolerance for risk.
Suppose that, for instance, we are interested in constructing a portfolio where
each stock is either included or excluded. In this case, we are dealing with a binary
portfolio optimization problem. The goal is to minimize the risk,

$\min_{\mathbf{b}} \Big[\, \alpha\, \mathbf{b}^T\, \Sigma\big(\mathbf{R}(T)\big)\, \mathbf{b} - \beta\, \boldsymbol{\mu}^T\big(\mathbf{R}(T)\big)\, \mathbf{b} \,\Big]$ ,

or, equivalently, to minimize

$f(\mathbf{b}) = \alpha\, \mathbf{b}^T Q\, \mathbf{b} + \beta\, \mathbf{c}^T \mathbf{b}$ . $\qquad$ (3.1.24)
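To see why this formulation matters computationally, the following sketch (mine; it sets $\alpha = \beta = 1$, $Q = \Sigma$ and $\mathbf{c} = -\boldsymbol{\mu}$ for illustration) solves a tiny binary problem by brute force. The exhaustive scan over all $2^n$ selections is precisely what becomes intractable as $n$ grows, which is the opening for the quantum methods of the next chapter:

```python
import numpy as np
from itertools import product

# Binary portfolio selection by brute force: minimize
# f(b) = b^T Q b + c^T b over all inclusion/exclusion vectors b.

mu = np.array([0.08, 0.12, 0.10])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.010],
                  [0.004, 0.010, 0.060]])
Q, c = Sigma, -mu

best = min((np.array(b) for b in product((0, 1), repeat=3)),
           key=lambda b: b @ Q @ b + c @ b)
print("optimal selection b =", best, " f(b) =", best @ Q @ best + c @ best)
```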
3.2 Monte Carlo Simulation

Conceptually, the application of Monte Carlo simulation (MCS) in finance stems from its origins in nuclear
physics. Suppose you have a portfolio of options. For simplicity, assume that the
options depend on a single stock.4 Let us use the nuclear fission analogy to under-
stand how MCS naturally applies to this case. The goal is to follow the history of the
stock. Initially, the stock has several possibilities: it can either remain unchanged,
go up in price, or go down, each with a certain probability. After the first outcome,
for each of the previous possibilities, the stock can again remain unchanged, go up,
or go down, with probabilities depending on the new market conditions. As time
passes, the stock price has more and more possible histories. At every moment, the
MCS estimates the probability associated with the different prices of the underlying
stock. Note that this is similar to the diffusion of neutrons in fissile material. MCS
does not attempt to solve a differential equation, such as the Black-Scholes equation
(1973) for options, but instead provides a probabilistic description of the option’s
behavior, considering the multiple price paths of a statistical sample.
These are examples of Monte Carlo simulation applied to stochastic processes.
The canonical example of a stochastic process is the random walk. Suppose you
are standing on a straight line, which we associate with the axis x, and you can
only step in the positive (+) or negative (−) direction with a fixed length L. For
mathematical simplicity, let L be the unit length, that is, L = 1. To decide whether
you move to the right (+) or to the left (−), you toss a fair coin. If the result is
heads (H), you move to the right, and if it is tails (T ), you move to the left. Since
the coin is fair, the probability of moving to the right or to the left is pH = 1/2
and pT = 1/2, respectively. The probability distribution is thus given by {H, pH =
1/2; T, pT = 1/2}. After tossing the coin K times, your distance from the origin will
be
$D_K = (\delta_{H1} - \delta_{T1}) + (\delta_{H2} - \delta_{T2}) + \ldots + (\delta_{HK} - \delta_{TK}) = \sum_{k=1}^{K} \delta_{Hk} - \sum_{k=1}^{K} \delta_{Tk}$ . $\qquad$ (3.2.1)
In other words, it is the difference between the number of steps taken to the right
and the number of steps taken to the left.
In probability language, if {H, T } is the sample space, that is, the set of possible
outcomes of a coin toss, then the step taken is a random variable,
X : {H, T } → {1, −1} , (3.2.2)
with X(H) = 1 and X(T ) = −1. A random walk is defined as a sequence of random
steps,
{X1 , . . . , Xk , . . . , XK } , (3.2.3)
where Xk is the value of the kth step, taking the value 1 if the coin shows H and
−1 if it shows T . A set of random variables like this is called a stochastic process.
In general, the distance from the origin after $K$ steps is given by

$D_K = \sum_{k=1}^{K} X_k = \sum_{k=1}^{K-1} X_k + X_K = D_{K-1} + X_K$ . $\qquad$ (3.2.4)
4 Refer to the seminal paper by P. Boyle, "Options: A Monte Carlo Approach" (1977). The following textbooks provide a technical introduction to Monte Carlo methods in finance: P. Glasserman, Monte Carlo Methods in Financial Engineering, and P. Jäckel, Monte Carlo Methods in Finance.
This simple random walk can be generalized in several ways, such as by extending
it to two or more dimensions.
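Here is a short Monte Carlo sketch of the fair-coin walk (my own illustration; the number of steps and paths are arbitrary choices):

```python
import numpy as np

# Simulate many paths of K steps X_k in {+1, -1} and look at the
# distribution of the distance D_K from the origin.

rng = np.random.default_rng(42)
K, paths = 100, 10_000

steps = rng.choice([1, -1], size=(paths, K))   # X_k for every path
D = steps.cumsum(axis=1)                       # D_k = D_{k-1} + X_k

print("mean of D_K:", D[:, -1].mean())         # ~ 0 for a fair coin
print("std of D_K :", D[:, -1].std())          # ~ sqrt(K) = 10
```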
Brownian motion, named after the botanist Robert Brown, who observed it in the early 19th century while studying pollen grains suspended in a liquid, can be modeled as a two-dimensional random walk. In this case, we can roll a tetrahedral (four-sided) die to decide whether to move right, left, up, or down. The resulting position of the suspended particle evolves as a combination of these independent motions, creating a trajectory that reflects its random nature.
We can also generalize the simple random walk by rolling a die instead of tossing a
coin. For instance, we could roll a fair die and define the following random variable:
The index $i$, ranging from 1 to $N$, represents the $i$th possible price. Each of the prices ${}^{i}p_s(T)$ occurs with some probability. More generally, at any time $kT$, where $k = 1, \ldots, K$, we have ${}^{1}p_s(kT), \ldots, {}^{i}p_s(kT), \ldots, {}^{N}p_s(kT)$, where ${}^{i}p_s(kT)$ represents the $i$th possible price of stock $s$ at time $kT$. The stochastic movement of the stock price will be described by a set of random variables representing the price at different time points,

$\{\, {}^{i}p_s(kT) \,\}$ , $\qquad$ (3.2.8)

where $k = 1, \ldots, K$ and $i = 1, \ldots, N$. This set of potential prices can be conveniently represented by an $N \times K$ matrix,

$\begin{bmatrix} {}^{1}p_s(T) & {}^{1}p_s(2T) & \cdots & {}^{1}p_s(KT) \\ \vdots & \vdots & \ddots & \vdots \\ {}^{N}p_s(T) & {}^{N}p_s(2T) & \cdots & {}^{N}p_s(KT) \end{bmatrix}$ . $\qquad$ (3.2.9)
Suppose that ${}^{i}p_s((k-1)T)$ represents the price of the stock at time step $(k-1)T$, and ${}^{j}p_s(kT)$ denotes the price at time $kT$, where $i, j = 1, \ldots, N$. With a slight modification to definition (3.1.10), the stock return rate during this time period is given by

$R_s^{ij}\big((k-1)T, kT\big) = \dfrac{{}^{j}p_s(kT) - {}^{i}p_s((k-1)T)}{{}^{i}p_s((k-1)T)} = \dfrac{{}^{j}p_s(kT)}{{}^{i}p_s((k-1)T)} - 1$ . $\qquad$ (3.2.10)
More generally, given two moments in time, $mT$ and $nT$, where $m < n$, the stock return rate during this period is

$R_s^{i_m i_n}(mT, nT) = \dfrac{{}^{i_n}p_s(nT)}{{}^{i_m}p_s(mT)} - 1 = \dfrac{{}^{i_n}p_s(nT)}{{}^{i_{n-1}}p_s((n-1)T)}\, \dfrac{{}^{i_{n-1}}p_s((n-1)T)}{{}^{i_{n-2}}p_s((n-2)T)} \cdots \dfrac{{}^{i_{m+1}}p_s((m+1)T)}{{}^{i_m}p_s(mT)} - 1$

$\qquad\qquad\;\; = \prod_{k=1}^{n-m} \Big[ R_s^{i_{n-k}\, i_{n-k+1}}(\Delta T_{n-k+1}) + 1 \Big] - 1$ . $\qquad$ (3.2.12)
In order to make this equation more manageable, the log stock return rate is defined:

$\log\Big( R_s^{i_m i_n}(mT, nT) + 1 \Big) = \sum_{k=1}^{n-m} \log\Big( R_s^{i_{n-k}\, i_{n-k+1}}(\Delta T_{n-k+1}) + 1 \Big)$ . $\qquad$ (3.2.13)
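A two-line numerical check of this identity (a sketch of mine, with an arbitrary sample price path):

```python
import numpy as np

# Equation (3.2.13): the log return over the whole period equals the
# sum of the one-step log returns along the path.

prices = np.array([100.0, 102.0, 101.0, 105.0, 104.0])   # sample price path

step_returns = prices[1:] / prices[:-1] - 1               # R for each step
total_return = prices[-1] / prices[0] - 1                 # R over the period

lhs = np.log(total_return + 1)
rhs = np.log(step_returns + 1).sum()
print(lhs, rhs)    # identical up to floating-point error
```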
Let us pause the mathematical presentation here and shift focus to how Monte
Carlo simulation extracts useful statistical information from the random walk. The
application to stock prices is analogous.
Suppose you are playing the random walk game, but you are unaware that the
coin is biased. After tossing the coin K times, suppose you have followed a certain
stochastic path,
$\{D_1, \ldots, D_k, \ldots, D_K\}$ , $\qquad$ (3.2.14)

where $D_k = \sum_{l=1}^{k} \delta_{Hl} - \sum_{l=1}^{k} \delta_{Tl}$, for every $k = 1, \ldots, K$. The question is: by analyzing this stochastic path, how much can you learn about the probability distribution $\{H, p_H \neq 1/2;\; T, p_T = 1 - p_H \neq 1/2\}$ underlying it? Once you have discovered
the probability distribution, you can make statistical predictions about the future
behavior of the path.
MCS proceeds as follows: It selects a probability distribution, say {H, pH ; T, pT }1 ,
and generates many possible stochastic paths. At each time step, the simulation
computes the corresponding value of the random walk and averages these values
across all generated paths to obtain an estimate of the expected behavior for the
given probability distribution. This procedure is repeated for different probability
distributions, {H, pH ; T, pT }2 , . . ., {H, pH ; T, pT }G . All these statistical models are
compared with the observed behavior (as described in (3.2.14)). The one that most
closely replicates the observed behavior is chosen as the underlying probability dis-
tribution that governs the random walk. Future predictions are then based on this
selected probability distribution.
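The calibration loop just described can be sketched in a few lines of Python (my own illustration; the true bias of 0.6, the candidate grid, and the least-squares matching criterion are all assumptions made for the example):

```python
import numpy as np

# Generate walks for a grid of candidate probabilities p_H and keep the
# one whose average path best matches the observed path (least squares).

rng = np.random.default_rng(7)
K = 200

# "Observed" path produced by an unknown biased coin (here p_H = 0.6).
observed = np.cumsum(rng.choice([1, -1], p=[0.6, 0.4], size=K))

best_p, best_err = None, np.inf
for p in np.linspace(0.1, 0.9, 17):            # candidate distributions
    paths = np.cumsum(rng.choice([1, -1], p=[p, 1 - p], size=(2000, K)), axis=1)
    err = np.sum((paths.mean(axis=0) - observed) ** 2)
    if err < best_err:
        best_p, best_err = p, err

print("estimated p_H =", best_p)               # close to the true 0.6
```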
The delta of a portfolio of options is the weighted sum of the deltas of its components,

$\Delta_P = \sum_{i=1}^{n} w_i\, \Delta_i$ .

Here, $\Delta_i$ represents the delta of the $i$th option. If the goal of your portfolio is to hedge the investment by minimizing exposure to price movements of the underlying asset, you need an appropriate combination of put and call options, with some $\Delta_i$'s being positive and others negative. This creates a delta-neutral portfolio, where the total delta is close to zero.
The next Greek is gamma ($\Gamma$). Using a physical analogy, if $\Delta$ represents velocity, then $\Gamma$ represents acceleration,

$\Gamma_P = \sum_{i=1}^{n} w_i\, \Gamma_i$ ,

where $\Gamma_i$ denotes the gamma of the $i$th option. Note that when the gammas of
all the options in a portfolio are approximately zero, the gamma of the portfolio
is also nearly zero. This implies that the portfolio’s delta remains nearly constant.
Consequently, the portfolio’s price changes at a constant rate with respect to the
price of the underlying asset. In contrast, when the gammas of the individual options
are significantly greater than zero, the portfolio’s delta becomes highly sensitive
to changes in the underlying asset’s price. This increased sensitivity makes the
portfolio’s price less predictable, introducing greater risk when forecasting its value.
Suppose now that the value of the portfolio of options, pP , depends not only on
the price of the underlying asset, s, but also explicitly on time, t. We thus have
that,
$p_P(t, s) = \sum_{i=1}^{n} w_i\, C_i(t, s)$ . $\qquad$ (3.2.20)
The Black-Scholes equation states that the value of the portfolio satisfies the following partial differential equation:

$\dfrac{\partial p_P(t,s)}{\partial t} + r s\, \Delta_P(t,s) + \dfrac{1}{2}\, \sigma^2 s^2\, \Gamma_P(t,s) = r\, p_P(t,s)$ . $\qquad$ (3.2.22)
The first term defines the Greek theta ($\Theta$),

$\Theta(t,s) = \dfrac{\partial p_P(t,s)}{\partial t}$ . $\qquad$ (3.2.23)
Using this definition and simplifying the notation by omitting the dependence on
time and the asset price, the Black-Scholes equation simplifies to:
$\Theta + r s\, \Delta_P + \dfrac{1}{2}\, \sigma^2 s^2\, \Gamma_P = r\, p_P$ . $\qquad$ (3.2.24)
For a delta-neutral portfolio (∆P = 0), that is, a portfolio constructed to be in-
sensitive to small changes in the price of the underlying stock, the Black-Scholes
equation reduces to:
$\Theta + \dfrac{1}{2}\, \sigma^2 s^2\, \Gamma_P = r\, p_P$ . $\qquad$ (3.2.25)
There are other Greeks, and all of them are necessary in one way or another
to appropriately manage a portfolio of options. The discussion above, though, is
enough to highlight the complex mathematical nature of option pricing.
Let us finally see how Monte Carlo simulation can be applied to the evaluation
of Greeks, focusing specifically on delta. Each MCS generates a possible trajectory
for the asset price over time, and this process is repeated many times to form a
statistical sample. At each point in time, from the purchase of the option to its
expiration, the average asset price across all simulations is calculated. To do this,
assumptions are typically made about market conditions, such as constant volatility,
a fixed risk-free rate, and using models like geometric Brownian motion to describe
price movements.
The resulting information can be visualized on a two-dimensional graph with the
asset price on the x-axis and the option price on the y-axis. Since delta represents
the rate of change of the option price with respect to the underlying asset price, we
can approximate it by evaluating the price change over small increments in the asset
price. Assuming a linear relationship for small changes, we can express the option
price as:

$\hat{C}(s_1) = \Delta_C\, (s_1 - s_0) + C(s_0)$ , $\qquad$ (3.2.26)
where $C(s_0)$ is the historical price of the option at the initial asset price $s_0$, $\Delta_C$ is the delta of the option, and $\hat{C}(s_1)$ is the estimated option price at the new asset price $s_1$. For every asset price $s_i$, with $i = 1, \ldots, n$, the estimated value of the option is given by:

$\hat{C}(s_i) = \Delta_C\, (s_i - s_{i-1}) + C(s_{i-1})$ , $\qquad$ (3.2.27)
where $C(s_{i-1})$ is the option price at the previous asset price $s_{i-1}$. Using the least squares error (LSE) method, the best fit is determined by minimizing the error, $E$, between the observed option prices and the predicted values:

$\min_{\Delta_C,\, C(s_0)} E = \min_{\Delta_C,\, C(s_0)} \sum_{i=1}^{n} E_i$ , $\qquad$ (3.2.28)

where

$E_i = \big( \hat{C}(s_i) - C(s_i) \big)^2$ . $\qquad$ (3.2.29)
The slope of the best fit line is the delta ∆C we are seeking.
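A compact version of this fitting procedure (my own sketch; the synthetic prices, "true" delta of 0.55, and noise level are illustrative, and np.polyfit performs the least-squares minimization of (3.2.28)):

```python
import numpy as np

# Fit a line through simulated (asset price, option price) pairs;
# the slope of the best-fit line is the delta estimate.

rng = np.random.default_rng(1)
s = np.linspace(95.0, 105.0, 50)                             # asset prices
C = 0.55 * (s - 100.0) + 4.0 + rng.normal(0, 0.05, s.size)   # noisy option prices

delta, intercept = np.polyfit(s, C, deg=1)                   # least-squares fit
print("estimated delta:", delta)                             # ~ 0.55
```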
algorithm until the best fit to the data is achieved. It is important to note, however, that the goal of ML is not to derive an exact function, but to extract practical and actionable insights from the data. In this section, we will review in some detail several ML algorithms currently used in finance that have been identified as potential candidates for enhancement by quantum computational methods.6
If, as stated above, ML aims at discovering patterns in data, it is crucial to begin
clarifying what we mean by data. Data refers to information associated with physical
objects or abstract concepts. The data collected comes in various forms: text, audio,
video, images, and more. It is generally categorized into two main types: structured
and unstructured.
Structured data is highly organized and typically stored in formats such as matri-
ces for numerical data or tables for more general datasets. For instance, numerical
data can be represented in matrix format, where rows correspond to samples (e.g.,
individual records), the first I columns represent features (also referred to as in-
puts), and the last O columns correspond to labels (outputs). For N samples, the
corresponding matrix is:
$\begin{bmatrix} x_1^{(1)} & \cdots & x_I^{(1)} & y_1^{(1)} & \cdots & y_O^{(1)} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ x_1^{(N)} & \cdots & x_I^{(N)} & y_1^{(N)} & \cdots & y_O^{(N)} \end{bmatrix}$ . $\qquad$ (3.3.1)
The inputs of the $n$th sample are collected in the column vector $\mathbf{x}^{(n)} = [x_1^{(n)} \ldots x_I^{(n)}]^T$, known as the $n$th sample feature vector. It is straightforward to verify that the submatrix of inputs can be written as:

$\begin{bmatrix} (\mathbf{x}^{(1)})^T \\ \vdots \\ (\mathbf{x}^{(N)})^T \end{bmatrix} = \begin{bmatrix} \mathbf{x}_1 & \ldots & \mathbf{x}_I \end{bmatrix}$ , $\qquad$ (3.3.4)

where $\mathbf{x}_i$ denotes the column of values of the $i$th feature across all samples.
Unstructured data, on the other hand, refers to information that lacks a prede-
fined structure, such as raw text, audio, or video files. This type of data is often the
most common form of collected data. Before it can be used in ML algorithms, un-
structured data must be cleaned, organized, and converted into a structured format.
Data scientists, or more specifically, data engineers, are responsible for addressing
issues such as duplicates, missing values, and inconsistent formatting. These errors
must be corrected or removed to ensure the integrity of the dataset. Given the
vast amounts of data involved—often millions or even billions of data points—this
6 Andrew Ng's CS229 course on machine learning at Stanford is a classic. The videos and lecture notes are available online. A. Burkov's The Hundred-Page Machine Learning Book is an excellent complement to Ng's course. The well-known quant Paul Wilmott also offers a concise and instructive introduction to the subject in his book Machine Learning.
process requires powerful programming tools such as Python, along with specialized
libraries like Pandas, NumPy, or TensorFlow (see below). Techniques like Natural
Language Processing (NLP) for text or feature extraction for images play a crucial
role in transforming unstructured data into a usable form.
Data science is a vast and varied field. For now, the key takeaway is that the
data used in ML algorithms is not the raw, noisy, unprocessed data that is initially
collected. Instead, it must be carefully preprocessed to ensure it is suitable for
analysis. Poor data selection or inadequate preprocessing can lead to inaccurate
descriptions of the data or faulty predictions and decisions based on it.
3.3.1 ML Algorithms
Supervised Learning
Supervised Learning (SL) is by far the most widely used ML method. It assumes
that labeled data—comprising features and labels—has already been organized into
a tabular format. For instance, when there is a single feature and a single label, the
data can be visualized as points on a two-dimensional plane. For two samples, this
data is represented in the matrix:
" (1) (1) #
x1 y 1
(2) (2)
, (3.3.5)
x2 y 2
(1) (1) (2) (2)
which simply represents the points (x1 , y1 ) and (x2 , y2 ) in a more manageable
form. This structured approach enables supervised learning algorithms to effectively
identify patterns and relationships between features and their corresponding labels.
Given a labeled dataset with feature vectors x1 , . . ., xI and label vectors y1 , . . .,
yO , here is how these algorithms work: the complete dataset is first divided into
two groups; the first group, known as the training data, is used to train the model,
$\begin{bmatrix} x_1^{(1)} & \cdots & x_I^{(1)} & y_1^{(1)} & \cdots & y_O^{(1)} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ x_1^{(n)} & \cdots & x_I^{(n)} & y_1^{(n)} & \cdots & y_O^{(n)} \end{bmatrix}$ , $\qquad$ (3.3.6)

and the other, the test data, is used to evaluate the model's performance after training,

$\begin{bmatrix} x_1^{(n+1)} & \cdots & x_I^{(n+1)} & y_1^{(n+1)} & \cdots & y_O^{(n+1)} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ x_1^{(N)} & \cdots & x_I^{(N)} & y_1^{(N)} & \cdots & y_O^{(N)} \end{bmatrix}$ . $\qquad$ (3.3.7)
During the training phase, the algorithm learns to predict the labels $y_j^{(k)}$ based on the inputs $x_i^{(k)}$ by minimizing a loss function that measures the difference between the predicted labels $(\hat{y}_1^{(k)}, \ldots, \hat{y}_O^{(k)})$ and the actual labels $(y_1^{(k)}, \ldots, y_O^{(k)})$ for every $k = 1, \ldots, n$. The model is iteratively refined by adjusting its parameters to minimize this error, thereby improving accuracy over time. After training, whenever a new set of features $(x_1^{(n+1)}, \ldots, x_I^{(n+1)})$ is introduced, the algorithm uses the learned model to predict the corresponding labels $(\hat{y}_1^{(n+1)}, \ldots, \hat{y}_O^{(n+1)})$. The test phase evaluates the model's ability to generalize to unseen data. This is done by comparing its
predictions to the true labels, without making further adjustments to the already
trained model. It is important to note that the training and testing phases serve
distinct purposes: the training phase focuses on learning and optimizing the model,
while the testing phase assesses its generalization performance on new, unseen data.
What we have just seen is an example of regression. The simplest form of regres-
sion you may be familiar with is linear regression. While it shares similarities with
the classical statistical approach (refer to equation (3.2.28)), there are key differ-
ences. In brief, the standard statistical approach is more theoretical and focused
on understanding the relationship between variables, whereas linear regression in
ML is more data-driven and aimed at minimizing prediction error. Furthermore,
in ML, regularization techniques such as Lasso and Ridge regression are commonly
applied to linear regression models to prevent overfitting and enhance their ability
to generalize to unseen data.
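A minimal supervised-learning sketch of this workflow (my own illustration with synthetic data, not a financial dataset; it uses scikit-learn's train/test split and Ridge, the L2-regularized linear regression just mentioned):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Split into training and test sets, fit a regularized linear
# regression, and evaluate generalization on the unseen test samples.

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                                 # N=500 samples, I=3 features
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(0, 0.1, 500)  # single label

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = Ridge(alpha=1.0).fit(X_train, y_train)     # training phase
print("test R^2:", model.score(X_test, y_test))    # test phase: generalization
```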
Let us now consider the second major area of supervised learning: classification.
In classification, unlike regression, the labels are discrete rather than continuous.
The goal of a classification algorithm is to assign one of a finite number of possible labels, $(y_1^{(n)}, \ldots, y_O^{(n)})$, to a given set of features, $(x_1^{(n)}, \ldots, x_I^{(n)})$. For simplicity, let us assume that we are dealing with a single-label dataset. This means that each sample, characterized by its feature set, $(x_1^{(n)}, \ldots, x_I^{(n)})$, is associated with exactly one label, $y^{(n)}$. In matrix form,

$\begin{bmatrix} x_1^{(1)} & \cdots & x_I^{(1)} & y^{(1)} \\ \vdots & \ddots & \vdots & \vdots \\ x_1^{(N)} & \cdots & x_I^{(N)} & y^{(N)} \end{bmatrix}$ , $\qquad$ (3.3.8)

where $y^{(1)}, \ldots, y^{(N)}$ are elements of a discrete set. If there are more than two classes, we refer to this as multiclass classification. This means that each feature point $(x_1^{(n)}, \ldots, x_I^{(n)})$, where $n = 1, \ldots, N$, is associated with one of the $C$ categories, $y_1, y_2, \ldots, y_C$, where $C > 2$. In other words, $y^{(1)}, \ldots, y^{(N)} \in \{y_1, y_2, \ldots, y_C\}$. If there are only two categories, $C = 2$, it is a binary classification. In this case, each point $(x_1^{(n)}, \ldots, x_I^{(n)})$ is associated with one of the two labels, $y_1$ or $y_2$, which are often expressed as $-$ and $+$, or 0 and 1. In symbols, $y^{(1)}, \ldots, y^{(N)} \in \{+, -\}$.
k-Nearest Neighbors
The first classification algorithm we want to introduce is the k-Nearest Neighbors
(kNN) algorithm. Due to its simplicity and effectiveness, it is one of the most
widely used algorithms in classification tasks. The kNN algorithm groups feature
data points based on their proximity and uses this grouping to make predictions for new data points. Intuitively, if a point $(x_1^{(n)}, \ldots, x_I^{(n)})$ corresponds to a certain class, such as $-$ or $+$, it is because it is surrounded by points in the same class. However, ambiguity arises when the point lies in a region where the surrounding points belong to multiple classes. The kNN algorithm resolves this ambiguity by assigning the label of a point based on the majority class among its $k$ nearest neighbors. Specifically, if there are $N_+$ neighboring points with the label $+$ and $N_-$ points with the label $-$, with $N_+ + N_- = k$, the classification is determined by the sign of $N_+ - N_-$. If $N_+ > N_-$, the new point is classified as $+$. If $N_- > N_+$, it is classified as $-$. This process captures the essence of the kNN algorithm. To train and test the kNN algorithm for data classification, we follow the same general steps outlined above.
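A small sketch of binary kNN classification (mine, on synthetic two-feature data; the cluster centers and k = 5 are arbitrary choices):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two clusters labeled 0 and 1; a new point receives the majority label
# among its k = 5 nearest neighbors.

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-1, 0.5, (50, 2)),   # class 0 around (-1, -1)
               rng.normal(+1, 0.5, (50, 2))])  # class 1 around (+1, +1)
y = np.array([0] * 50 + [1] * 50)

knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print(knn.predict([[0.8, 0.9]]))               # -> [1], majority of neighbors
```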
Neural Networks
Let us examine in detail Neural Networks, one of the most powerful methods in
machine learning. While they can be applied to both supervised and unsupervised
learning, we will focus on the supervised case here.
For over a century, biologists have understood that neurons in the human brain
communicate with each other to process information. Inspired by this biological
mechanism, computer scientists in the mid-20th century proposed the concept of
artificial neural networks—now commonly referred to as Neural Networks. These
computational models mimic certain aspects of how biological neurons function and
are used not only to simulate brain activity, as was originally intended, but also for
a wide range of applications in science and industry.
In these models, biological neurons are replaced by interconnected nodes, typically
organized into layers. Each layer of nodes serves a specific purpose: some nodes
receive input data (the input layer, analogous to sensory neurons), others process this data through intermediate layers (the hidden layers), and some nodes produce the final output or prediction (the output layer, akin to the response of a biological
organism). The network’s behavior is determined by a set of parameters, the number
of which depends on the number of layers and the nodes within each layer. If
the network’s prediction does not match the actual outcome, the model learns by
adjusting these parameters. This adjustment is guided by a cost or loss function,
which quantifies the discrepancy between the predicted and actual outcomes. To
minimize this loss during training, the network iteratively updates its parameters
using optimization techniques such as backpropagation and gradient descent. These
are the fundamental ideas behind neural networks.
Let us begin with the most elementary model. If there is a single input and a
unique output, linear regression proposes a linear equation to fit the data: ŷ =
wx + b, where the parameters w and b are adjusted through minimization of the
loss function, ensuring the line is the best fit for the data in the two-dimensional
x-y plane. When there are multiple inputs, x1 , . . . , xI , the geometry of the model
becomes a hyperplane: ŷ = wT x + b, where the vector x = [x1 . . . xI ]T collects
the input features and w = [w1 . . . wI ]T represents the vector of weights. The
parameters w1 , . . . , wI , which determine the contribution of each input to the next
node, and b, the bias term, are adjusted to minimize the loss function, ensuring the
hyperplane best fits the data. If the relationship between the inputs and the output
is non-linear (i.e., not a hyperplane), the model can be enhanced with the inclusion of
an activation function f . This introduces non-linearity into the predictions, enabling
the model to capture more complex patterns in the data. Thus, instead of
$\hat{y} = \sum_{i=1}^{I} w_i\, x_i + b$ , $\qquad$ (3.3.9)

we use

$\hat{y} = f\Big( \sum_{i=1}^{I} w_i\, x_i + b \Big)$ . $\qquad$ (3.3.10)
This simple model, known as the perceptron, consists of only two layers: the input
layer and the output layer (with only one node). We now introduce a hidden layer
with $N_1$ nodes. The value at the $n_1$th hidden node is

$h_{n_1} = f^1\Big( \sum_{i=1}^{I} w_{n_1,i}^{1}\, x_i + b_{n_1} \Big)$ , $\qquad n_1 = 1, \ldots, N_1$ .

Considering the contribution from all the nodes in the hidden layer,

$\hat{y} = f^o\Big( \sum_{n_1=1}^{N_1} w_{o,n_1}\, h_{n_1} + b \Big)$ . $\qquad$ (3.3.12)
If there are two hidden layers, the first with $N_1$ nodes and the second with $N_2$ nodes, a straightforward extension of what we just did gives,

$\hat{y} = f^o\bigg( \sum_{n_2=1}^{N_2} w_{o,n_2}\, f^2\Big( \sum_{n_1=1}^{N_1} w_{n_2,n_1}^{2}\, f^1\big( \sum_{i=1}^{I} w_{n_1,i}^{1}\, x_i + b_{n_1} \big) + b_{n_2} \Big) + b \bigg)$ . $\qquad$ (3.3.14)
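The nested structure of (3.3.14) becomes transparent in code. Here is a forward-pass sketch of mine (the layer sizes, random parameters, and sigmoid activation are illustrative):

```python
import numpy as np

# Forward pass: input -> hidden layer 1 -> hidden layer 2 -> output,
# with a sigmoid activation at every layer.

def f(z):
    return 1.0 / (1.0 + np.exp(-z))            # sigmoid activation

rng = np.random.default_rng(0)
I, N1, N2 = 4, 8, 6                            # input and hidden-layer sizes

x = rng.normal(size=I)
W1, b1 = rng.normal(size=(N1, I)), rng.normal(size=N1)
W2, b2 = rng.normal(size=(N2, N1)), rng.normal(size=N2)
wo, b = rng.normal(size=N2), rng.normal()

h1 = f(W1 @ x + b1)                            # first hidden layer
h2 = f(W2 @ h1 + b2)                           # second hidden layer
y_hat = f(wo @ h2 + b)                         # scalar prediction
print("y_hat =", y_hat)
```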
The error made by the neural network for each data point is the difference $\hat{y}_j - y_j$. The total error across the dataset can be quantified using a loss function, such as the Mean Squared Error (MSE) formula,

$\mathrm{MSE} = \dfrac{1}{O} \sum_{j=1}^{O} (\hat{y}_j - y_j)^2$ . $\qquad$ (3.3.17)
Minimizing this error is the goal during training, as it ensures the network’s pre-
dictions ŷj closely match the true values yj . The key to minimizing this error—and
thus improving the predictions of our neural network—lies in the dependence of
each ŷj on the weights and biases. If the neural network has H hidden layers
with N1 , N2 , . . . , NH nodes in each respective layer, the total number of parameters
(weights and biases) can be calculated as

$N_I N_1 + N_1 + N_1 N_2 + N_2 + \ldots + N_H N_o + N_o$ ,

where $N_I$ and $N_o$ denote the numbers of input and output nodes. With no hidden layers, the count reduces to $N_I N_o + N_o$: the weights connecting the input nodes to the output nodes, as well as the biases for the output nodes. Specifically, for the perceptron, where $N_o = 1$, there are $N_I + 1$ parameters. For a single input feature and a single output label, there are $1 + 1 = 2$ parameters to adjust: the weight $w$ and the bias $b$. We can use

$\dfrac{d}{dw}\, \mathrm{MSE} = \dfrac{1}{O}\, \dfrac{d}{dw} \sum_{j=1}^{O} \big( \hat{y}_j(w, b) - y_j \big)^2$ . $\qquad$ (3.3.20)
Given the initial values of the parameters, we optimize them using methods such
as the gradient descent algorithm. The process is similar when there are more pa-
rameters. However, in this case, the change in the error function depends on all the
parameters, and the partial derivatives with respect to each parameter capture the
effect of small variations in each one. Backpropagation is the procedure that guides
the algorithm in optimizing the parameters iteratively until the best fit is achieved.
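For the two-parameter case, the whole procedure fits in a few lines. The sketch below (my own; the data-generating values w = 3, b = 1, the learning rate, and the iteration count are arbitrary choices) follows the partial derivatives of the MSE, the one-dimensional analogue of backpropagation:

```python
import numpy as np

# Gradient descent on the model y_hat = w*x + b, minimizing the MSE.

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.05, 100)   # data from w=3, b=1 plus noise

w, b, lr = 0.0, 0.0, 0.5
for _ in range(500):
    y_hat = w * x + b
    dw = 2 * np.mean((y_hat - y) * x)          # d(MSE)/dw
    db = 2 * np.mean(y_hat - y)                # d(MSE)/db
    w, b = w - lr * dw, b - lr * db

print("learned w, b:", w, b)                   # close to 3.0 and 1.0
```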
At time $t_0$, the predicted output is

$\hat{y}_0 = f^0(W_0\, x_0 + b_0)$ , $\qquad$ (3.3.21)

where we have simplified the transformation of the input $x_0$ into the output $\hat{y}_0$ through the action of the weight matrix $W_0$. The vector $b_0$ is a bias vector. At a later time $t_1$, the input vector is $x_1$, with the corresponding weight matrix $W_1$ and bias vector $b_1$. If we ignore the effect of what happened at $t_0$, the expected value at $t_1$ would be
$\hat{y}_1 = f^1(W_1\, x_1 + b_1)$ . $\qquad$ (3.3.22)

However, recurrent neural networks aim to include the contribution from previous time steps. In this context, the predicted value $\hat{y}_0$ becomes the hidden state at $t_0$,

$h_0 = f^0(W_0\, x_0 + b_0)$ , $\qquad$ (3.3.23)

and the prediction at $t_1$ becomes

$\hat{y}_1 = f^1\big( V_0\, h_0 + (W_1\, x_1 + b_1) \big)$ , $\qquad$ (3.3.24)

where $V_0$ is the weight matrix for the contributions coming from time $t_0$. For time $t_2$, we proceed in the same way. We define the hidden state at $t_1$ as

$h_1 = f^1\big( V_0\, h_0 + (W_1\, x_1 + b_1) \big)$ , $\qquad$ (3.3.25)

and

$\hat{y}_2 = f^2\big( V_1\, h_1 + (W_2\, x_2 + b_2) \big)$ . $\qquad$ (3.3.26)
RNNs assume that there is nothing inherently unique about any specific time step.
As a result, the weight matrices—both those associated with the input features and
the hidden states—and the activation functions remain the same across all time
steps. In other words, if there are $n$ time steps,

$V_0 = V_1 = \ldots = V_{n-1} \equiv V$ , $\qquad$ (3.3.27)

$W_0 = W_1 = \ldots = W_{n-1} = W_n \equiv W$ , $\qquad$ (3.3.28)

and

$f^0 = f^1 = \ldots = f^{n-1} = f^n \equiv f$ . $\qquad$ (3.3.29)

This, moreover, reduces the number of parameters and makes training computationally more efficient. In conclusion, the predicted output vector at $t_n$ is

$\hat{y}_n = f\big( V\, h_{n-1} + (W\, x_n + b_n) \big)$ . $\qquad$ (3.3.30)
There is, however, a potential problem with sharing weights across all hidden
states. If the elements of the weight matrix V are much smaller than 1 at each
step, during the backpropagation process, the gradients for parameter adjustment
become progressively smaller. As we backpropagate through the sequence backward
in time, the product of gradients across time steps diminishes exponentially, leading
to the vanishing gradient problem. Consequently, the parameters are only slightly
adjusted—or not adjusted at all—leading to slow learning and effectively neglecting
the contributions from earlier time steps in the sequence of temporal data. In or-
der to solve the vanishing gradient problem, we can introduce the Long Short-Term
Memory (LSTM) cell. Its purpose is to guide the network at every step on how
much information from previous time steps should be retained. The LSTM cell
prioritizes recent information while gradually forgetting or neglecting contributions
from steps far back in time, ensuring that the model focuses on the most relevant
context for its predictions.
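The shared-weight recurrence (3.3.30) is easy to see in code. A minimal sketch of mine (random inputs and weights, tanh as the activation, all sizes illustrative):

```python
import numpy as np

# The same V, W and activation f are applied at every time step:
# h_t = f(V h_{t-1} + W x_t + b).

rng = np.random.default_rng(0)
d_in, d_hid, steps = 3, 5, 10

W = rng.normal(size=(d_hid, d_in))   # input weights, shared across time
V = rng.normal(size=(d_hid, d_hid))  # hidden-state weights, shared across time
b = rng.normal(size=d_hid)

h = np.zeros(d_hid)
for t in range(steps):
    x_t = rng.normal(size=d_in)      # input vector at time t
    h = np.tanh(V @ h + W @ x_t + b) # new hidden state carries the past forward

print("final hidden state:", h)
```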
Unsupervised Learning
Unlike supervised learning, Unsupervised Learning (UL) algorithms are provided
with datasets that do not include labeled data. Given a set of input data, their goal
is to independently identify hidden patterns within the data.
In many situations, the input feature space, $\mathbb{R}^I$, can be reduced to a lower-dimensional space, $\mathbb{R}^{\hat{I}}$, where ideally $\hat{I} \ll I$, without losing significant information
that could affect the algorithm’s predictions. For example, consider a simple case
where the data is three-dimensional, (x(n) , y (n) , z (n) ), where n = 1, . . . , N labels the
sample. If it turns out that the z feature exhibits minimal variation across all sam-
ples compared to the variations in the x and y features, we can safely ignore z.
The data can then be simplified to (x(n) , y (n) ). This reduction eliminates the need
to process the superfluous z feature, which neither contributes meaningful informa-
tion nor affects the model’s predictive quality. When dealing with high-dimensional
data, where the input space is enormous, removing irrelevant or redundant features
becomes a crucial preprocessing step. This dimensionality reduction not only ac-
celerates the algorithm but also minimizes computational costs without sacrificing
predictive performance.
Similarly for the $j$th feature. The $x_i$-$x_j$ plane can be effectively reduced to a single
dimension by identifying the line that best fits the data, a process guided by the
MSE minimization method discussed earlier. This line corresponds to the direction
along which the data exhibits the greatest variance. In other words, the line aligns
with the principal eigenvector (the eigenvector with the largest eigenvalue), eij , of
the covariance matrix Σij of the data. This covariance matrix is computed after the
data has been shifted to its mean and rescaled,

$\Sigma_{ij} = \dfrac{1}{N} \sum_{n=1}^{N} \bar{x}_i^{(n)}\, \bar{x}_j^{(n)}$ . $\qquad$ (3.3.34)

If we define the two-dimensional feature vector $\bar{\mathbf{x}}^{(n)} = [\bar{x}_i^{(n)}\; \bar{x}_j^{(n)}]^T$, its component along the principal eigenvector $\mathbf{e}_{ij}$ is given by the dot product

$y_{ij}^{(n)} = \mathbf{e}_{ij} \cdot \bar{\mathbf{x}}^{(n)}$ .

By comparing $y_{ij}$ with respect to $\bar{\mathbf{x}}_{ij}$, we can quantify the amount of information
lost when substituting x̄ij with yij and ignoring the transverse component. This
comparison provides a measure of the variance retained in the reduced representa-
tion versus the variance discarded. This process can be systematically applied to as
many pairs of features as possible, iteratively reducing the dataset’s dimensionality.
The reduction continues until the number of features is brought to a manageable
size, balancing computational efficiency with minimal information loss. A common
best practice is to reduce the original feature space to a subspace that captures
90-95% of the total variance, ensuring that the reduced dataset retains most of the
critical information while discarding noise and less relevant variability.
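The 90-95% rule of thumb is directly supported by scikit-learn's PCA, which accepts a variance fraction as the number of components. A sketch of mine (the synthetic correlated features are an assumption for the example):

```python
import numpy as np
from sklearn.decomposition import PCA

# Keep the smallest number of principal components that retains
# 95% of the total variance.

rng = np.random.default_rng(0)
base = rng.normal(size=(300, 3))
X = np.hstack([base, base @ rng.normal(size=(3, 7))])   # 10 correlated features

pca = PCA(n_components=0.95).fit(X)   # fraction -> keep 95% of the variance
print("components kept:", pca.n_components_)
print("variance ratios:", pca.explained_variance_ratio_.round(3))
```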
k-Means Clustering
Given a dataset $\{\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(n)}, \ldots, \mathbf{x}^{(N)}\}$, where $N$ is the number of samples and $\mathbf{x}^{(n)}$ is the feature vector of the $n$th record, i.e., $\mathbf{x}^{(n)} = [x_1^{(n)} \ldots x_i^{(n)} \ldots x_I^{(n)}]^T$, the k-
Means Clustering algorithm aims to group the data into K distinct clusters identified
by the algorithm. The number of clusters, K, is a hyperparameter chosen by the
data analyst, and the algorithm automatically groups the data points. The first step
is to select $K$ random points in the feature space $\mathbb{R}^I$,

$\boldsymbol{\mu}_k^0 = [\mu_{k,1}^0\; \ldots\; \mu_{k,i}^0\; \ldots\; \mu_{k,I}^0]^T$ , $\qquad$ (3.3.36)

where $k = 1, \ldots, K$. Here, $\mu_{k,i}^0$ is the $i$th component of the $k$th vector. These vectors are referred to as the clustering centroids, for reasons we will discuss shortly. For each feature vector $\mathbf{x}^{(n)}$ and each centroid $\boldsymbol{\mu}_k^0$, the algorithm computes the Euclidean distance between them,

$d\big(\mathbf{x}^{(n)}, \boldsymbol{\mu}_k^0\big) = \sqrt{\, \sum_{i=1}^{I} \big( x_i^{(n)} - \mu_{k,i}^0 \big)^2 \,}$ . $\qquad$ (3.3.37)
Every feature vector is then assigned to the cluster whose centroid is the closest, that is, the one for which $d\big(\mathbf{x}^{(n)}, \boldsymbol{\mu}_k^0\big)$ is minimal. At this stage, the dataset is partitioned into $K$ clusters, denoted as $C_1^0, \ldots, C_K^0$, each containing $N_1^0, \ldots, N_K^0$ points, respectively. Next, the algorithm updates the centroids $\boldsymbol{\mu}_k^1$ of these clusters by computing

$\mu_{k,i}^1 = \dfrac{1}{N_k^0} \sum_{\mathbf{x}^{(n)} \in C_k^0} x_i^{(n)}$ , $\qquad$ (3.3.38)
for each component $i = 1, \ldots, I$. This update ensures that each centroid represents the mean position of all points in its respective cluster. The distances $d\big(\mathbf{x}^{(n)}, \boldsymbol{\mu}_k^1\big)$ are then recomputed for every feature vector and the updated centroids. Each data point is reassigned to the cluster with the nearest centroid, forming new clusters $C_1^1, \ldots, C_K^1$. This process of recalculating centroids and reassigning data points is
repeated iteratively. The algorithm terminates when subsequent iterations result in
minimal or no change in the centroids, indicating convergence.
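The whole loop is only a few lines of NumPy. A sketch of mine (three well-separated synthetic clusters; centroids initialized from random data points, which also avoids empty clusters in this toy setting):

```python
import numpy as np

# k-means: assign each point to its nearest centroid, recompute the
# centroids as cluster means, repeat until the centroids stop moving.

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, (50, 2)) for c in (-2.0, 0.0, 2.0)])
K = 3
centroids = X[rng.choice(len(X), K, replace=False)]     # random initial centroids

for _ in range(100):
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)                           # nearest-centroid assignment
    new = np.array([X[labels == k].mean(axis=0) for k in range(K)])
    if np.allclose(new, centroids):                     # convergence test
        break
    centroids = new

print("centroids:\n", centroids)    # near (-2,-2), (0,0), (2,2)
```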
3.3.2 ML in Finance
The applications of ML to finance are many, and our intention here is not to present
a comprehensive account of the subject.7 Instead, let us focus on a few applica-
tions of each of the algorithms presented above. For regression, we will consider
credit scoring and risk assessment. For the classification algorithms, we will con-
sider: credit risk assessment, transaction fraud detection, credit card fraud detec-
tion, money laundering detection, market behavior and sentiment analysis for the
kNN algorithm; credit scoring and risk assessment for the SVM algorithm; fraud
detection for NNs; and algorithmic trading and sentiment analysis for RNNs. For
unsupervised learning algorithms, we will see the application of PCA in portfolio
management, credit risk analysis, and algorithmic trading. Finally, we will examine
how the k-means algorithm is used in fraud detection and anti-money laundering.
Regression models are used in credit scoring to predict the probability of a bor-
rower repaying their loan (output close to 1) or defaulting (output close to 0). The
raw dataset typically contains information about borrowers, including demographic
details (such as age and gender) and financial data (such as loan amount, credit
history, and repayment records). This data is often used to derive additional fea-
tures, such as the length of credit history or the debt-to-income ratio, which can
enhance the model’s predictive power. Using this enriched dataset, the model is
trained to discover patterns between the input features and the target variable (e.g.,
the probability of repayment or default). The model’s performance is then evaluated
on a test dataset to ensure it generalizes well. Once trained and tested, the regres-
sion model can predict the probability of repayment for new applicants. This helps
financial institutions decide whether to approve a loan or determine credit limits
based on the applicant’s predicted creditworthiness. We have referred to probability
as a measure of the borrower’s creditworthiness. Banks, however, may instead use a
different metric known as the credit score. Credit scores have a defined range with
a minimum (indicating a higher risk of default) and a maximum (indicating a lower
risk of default). While credit scores and probabilities are expressed differently, they
are indirectly related, as both reflect the borrower’s credit risk.
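As a rough sketch of this workflow, here is a toy example using logistic regression (one natural choice of regression model when the output is a probability); the borrower features and repayment labels below are entirely synthetic:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical features: [age, loan_amount, credit_history_years, debt_to_income].
rng = np.random.default_rng(42)
N = 1000
X = np.column_stack([
    rng.integers(21, 70, N),
    rng.uniform(1e3, 5e4, N),
    rng.uniform(0, 30, N),
    rng.uniform(0, 1, N),
])
# Synthetic labels: 1 = repaid, 0 = defaulted (a stand-in for real repayment records).
y = (X[:, 2] / 30 - X[:, 3] + rng.normal(0, 0.3, N) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))

# Probability of repayment for a new applicant (closer to 1 = more creditworthy).
applicant = [[35, 12000.0, 8.0, 0.25]]
print("P(repay):", model.predict_proba(applicant)[0, 1])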
Risk assessment has a broader meaning than simply evaluating a customer’s credit
risk. In finance, risk assessment aims to predict both the likelihood and potential
cost of adverse events that may impact a company's financial stability. The specific
7 See, for example, the book by M. F. Dixon et al., Machine Learning in Finance.
dataset and features used depend on the type of risk being assessed, which could
include market risk, credit risk, operational risk, or country risk. For instance, when
assessing risks to a multinational company, factors such as financial performance,
political stability, social trends, and global economic conditions may be considered.
Suppose we are interested in evaluating the risk an event poses to the company’s
market valuation. After collecting all relevant data, input features are defined, and
the target variable (e.g., changes in the company’s valuation or earnings volatility)
is established. The regression model is trained on this data to discover patterns
and then tested to evaluate its predictive accuracy. This insight enables financial
institutions or companies to make informed decisions to mitigate potential risks. In
the context of a large financial institution or a multinational corporation, country
risk assessment can be seen as an extension of credit risk analysis, incorporating a
broader set of variables, such as political instability, regulatory changes, and cur-
rency fluctuations.
That the kNN algorithm can be used for credit risk assessment is quite obvious.
Suppose we are interested in the binary classification version, namely, predicting
whether a potential borrower will repay the loan (1 if they will repay, and 0 if they
will default). After collecting and cleaning the data, which includes personal and
financial information from previous borrowers along with their repayment history,
the algorithm is trained to recognize the customer profiles associated with suc-
cessful loan repayment. The data of a new applicant is then compared to the k
nearest neighbors in the training dataset and, using majority voting, the algorithm
predicts whether the applicant will repay the loan or default. Based on this pre-
diction, the bank can decide whether to approve or reject the loan application. It
is easy to see how kNN can be used to evaluate the legitimacy of a transaction by
comparing it to previous transactions, a process known as transaction fraud detec-
tion. In this case, the dataset consists of bank transactions, labeled as either valid
or fraudulent. When a new transaction is classified as fraudulent, the bank can
issue a warning to the account holder or block the transaction. Credit card fraud
detection and money laundering detection using machine learning work in a simi-
lar manner, although money laundering detection typically involves more complex
feature engineering and domain-specific knowledge.
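A minimal sketch of the classification step, using scikit-learn's kNN implementation (the transaction features and labels here are synthetic placeholders):

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical features: [amount, hour_of_day, distance_from_home_km].
rng = np.random.default_rng(7)
legit = np.column_stack([rng.exponential(50, 500),
                         rng.normal(14, 4, 500),
                         rng.exponential(5, 500)])
fraud = np.column_stack([rng.exponential(400, 50),
                         rng.normal(3, 2, 50),
                         rng.exponential(500, 50)])
X = np.vstack([legit, fraud])
y = np.array([0] * 500 + [1] * 50)  # 0 = valid, 1 = fraudulent

# Feature scaling matters for kNN, which relies on raw Euclidean distances.
scaler = StandardScaler().fit(X)
knn = KNeighborsClassifier(n_neighbors=5).fit(scaler.transform(X), y)

# A large night-time transaction far from home is flagged by majority voting.
new_tx = [[620.0, 2.5, 800.0]]
print("fraudulent?", bool(knn.predict(scaler.transform(new_tx))[0]))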
Market movements are highly complex, and sophisticated techniques are often
required to attempt predictions of their future behavior. Nevertheless, stock movement
prediction can serve as a pedagogical example to illustrate the kNN algorithm. Suppose we wish to predict
the movement of a stock—whether it will go up, down, or remain stable—at some
future time tn+1 . To do this, we can collect historical financial data over a certain
period. For instance, we might gather the trading volume and the stock’s open,
high, low, and close prices at times t0 , t1 , . . . , ti , . . . , tn . Additionally, incorporating
macroeconomic indicators (such as interest rates, inflation data, or GDP growth)
and other relevant market data can enhance the model’s predictive capabilities. At
every time step ti , for i = 1, . . . , n, we define the input features as the financial
and economic data collected up to that point. The target variable can be specified
based on the price difference between ti and ti+1 , categorizing it as “up,” “down,”
or “stable.” The kNN algorithm is then trained to learn the relation between the
input features at each time ti and the corresponding movement label for ti+1 . When
applied to new data, the algorithm identifies the k nearest data points in the feature
space and uses majority voting to predict the stock’s movement at tn+1 . Another
variant of the kNN algorithm can be used to forecast future asset prices based
most common libraries, including machine learning libraries. (If you are already
well-versed in Python, you may skip the rest of this chapter.)
NumPy
NumPy (Numerical Python) is a library for numerical computing in Python. It
supports not only one-dimensional arrays of data (a simple list of values, like a
row of numbers) but also multi-dimensional arrays (for example, a two-dimensional
array is a table or matrix with rows and columns). In NumPy, these arrays are called
ndarray. A wide range of mathematical functions can be applied to them. We will
explore some examples below. NumPy’s advantages include improved computation
speed, reduced memory usage, and seamless integration with other powerful libraries
like Pandas and TensorFlow, which we will examine below. Today, NumPy is an
essential library used by data scientists. In the financial industry, as well as in any
sector that works with data, proficiency in Python implies expertise in NumPy.
To use NumPy, it must first be installed (if it has not been already) using the following
command:
pip install numpy
Then, it must be imported into the environment where Python code is written and
executed (for example, a Jupyter notebook). The conventional way is:
import numpy as np
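For example, we can define a two-dimensional ndarray X, here with n = 2 rows and m = 3 columns (any nested list of equal-length rows works the same way):

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])  # a 2x3 array of floats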
To see the shape of the array we have defined, we use its shape attribute:
X.shape
which, in our case, gives (n,m), that is, (2, 3) for the X above. Be aware that all the
elements in an ndarray must be of the same type. This means that they must all be
integers, floats, complex numbers, booleans, etc. To see the type of the elements in
the ndarray X defined above, use its dtype attribute:
X.dtype
Regarding the mathematical operations we can perform with ndarrays, addition and
multiplication work as expected. For example, we can compute cX+dX, where c and
d are two numbers, by typing the following line of code in the Jupyter notebook:
c*X+d*X
Given another ndarray Y with the same shape and type of elements as X, we can
compute the sum cX+dY by writing:
c*X+d*Y
If the shape of the matrix X is (n,p) and that of the matrix Y is (p,m), we can
multiply the matrices by using:
X.dot(Y)
or, equivalently,
np.dot(X,Y)
Python's dedicated matrix multiplication operator, X @ Y, achieves the same result.
Many more operations can be performed on NumPy’s ndarrays, for example, ma-
trix transposition and computing the inverses and determinants of matrices. One
of the limitations of NumPy is that it cannot handle non-numerical data. This is
certainly a limitation because much of the data collected does not come in numerical
form. For example, dates and text data cannot be directly processed using NumPy
arrays. In such cases, libraries like Pandas are often used, as they provide better
support for handling mixed data types.
Pandas
Pandas is a data manipulation and analysis library in Python. Like NumPy, it is
widely used for handling structured data in tables, particularly large datasets. One
of the advantages of Pandas is its seamless integration with other Python libraries,
such as NumPy and machine learning libraries like Scikit-learn. It also enables data
visualizations by integrating with Matplotlib. Pandas is more powerful than NumPy
and has become an essential tool in financial data analysis, commonly expected to
be known by anyone applying for a data analyst role. In finance, Pandas enables
financial analysts to import historical stock market data, clean and preprocess it,
compute moving averages, measure volatility, and analyze correlations between dif-
ferent assets.
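A short sketch of such a workflow (the price data below is simulated; in practice it would be imported from a CSV file or a market-data provider):

import numpy as np
import pandas as pd

# Simulated daily closing prices for two assets over one trading year.
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=252, freq="B")
prices = pd.DataFrame({
    "AAA": 100 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, 252))),
    "BBB": 50 * np.exp(np.cumsum(rng.normal(0.0003, 0.02, 252))),
}, index=dates)

returns = prices.pct_change().dropna()   # daily returns
ma20 = prices.rolling(window=20).mean()  # 20-day moving averages
vol = returns.std() * np.sqrt(252)       # annualized volatility per asset
print(vol)
print(returns.corr())                    # correlation between the two assets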
SciPy
While Pandas is primarily designed for data manipulation and analysis, SciPy is
focused on scientific computing. It provides advanced mathematical functions for
tasks like optimization, integration, and solving differential equations. While Pan-
das is great for preparing and analyzing data, SciPy extends these capabilities with
specialized tools for more complex mathematical operations. The two libraries can
be used together, with Pandas handling data manipulation and SciPy offering ad-
vanced computations.
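For instance, here is a sketch of a small mean-variance portfolio optimization with scipy.optimize.minimize (the expected returns, covariance matrix, and risk-aversion parameter are made-up numbers):

import numpy as np
from scipy.optimize import minimize

mu = np.array([0.08, 0.12, 0.10])           # hypothetical expected returns
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.06]])      # hypothetical covariance matrix
alpha = 3.0                                 # risk-aversion parameter

objective = lambda w: alpha * w @ Sigma @ w - mu @ w
constraints = [{"type": "eq", "fun": lambda w: w.sum() - 1}]  # fully invested
bounds = [(0, 1)] * 3                                         # long-only weights

result = minimize(objective, x0=np.ones(3) / 3, bounds=bounds,
                  constraints=constraints)
print("optimal weights:", result.x.round(3))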
Matplotlib
The preferred tool for graphing in Python is the library Matplotlib. The advantages
of Matplotlib include its flexibility in creating a wide range of static, animated,
and interactive plots. It offers extensive customization of plot elements, such as
labels, titles, and axes, enabling users to create highly specific visualizations. Ad-
ditionally, it integrates seamlessly with other libraries such as NumPy and Pandas
for data analysis. How Matplotlib compares to Excel is a natural question since
Excel is known for its user-friendly interface and powerful graphical tools. The
main difference between the two is that Excel is more accessible for beginners but
also more limited in terms of customization. Matplotlib, on the other hand, being
a Python library, provides much greater flexibility and programmability, allowing
developers to create more complex plots and handle large datasets with greater ease.
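A minimal example of such a plot, again with simulated price data:

import numpy as np
import matplotlib.pyplot as plt

# A simulated price path together with its 20-day moving average.
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, 252)))
ma20 = np.convolve(prices, np.ones(20) / 20, mode="valid")

plt.plot(prices, label="price")
plt.plot(range(19, 252), ma20, label="20-day moving average")
plt.xlabel("trading day")
plt.ylabel("price")
plt.title("Simulated stock price")
plt.legend()
plt.show()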
Before concluding this section, let us mention the most commonly used Python
library in machine learning.8
Scikit-learn
Scikit-learn is a widely used Python library for machine learning. It provides a
complete and well-rounded set of tools for data preprocessing, model training, and
evaluation. It supports both supervised and unsupervised learning algorithms. Since
it is built on NumPy, SciPy, and Matplotlib, it ensures efficient performance and
seamless integration with other data science libraries. Scikit-learn is optimized for
speed and handles moderate-sized datasets efficiently. However, it lacks support for
deep learning and may not be ideal for very large datasets compared to frameworks
like TensorFlow or PyTorch. Despite these limitations, its ease of use, strong com-
munity support, and extensive documentation make it a go-to library for traditional
machine learning tasks.
8 An introductory ML book with a strong emphasis on programming with Python is A. C. Müller & S. Guido, Introduction to Machine Learning with Python. When it comes to finance, refer to Y. Hilpisch, Python for Finance, and A. Nag, Stochastic Finance with Python.
Chapter 4
Quantum-Enhanced Solutions
Like any other business, the goal of a financial institution is to provide the best
possible service to its customers while maintaining profitability. In a competitive
free-market environment, achieving this is a significant challenge, due to factors such
as intense competition, evolving local and global regulations, and complex ethical
considerations. Meeting customer expectations under these conditions is no easy
task. This is where quantum computing could make a substantial difference in the
future.
The rationale for adopting quantum computing is clear: modern industries rely
heavily on digital technologies, and since quantum computers are expected to sur-
pass classical systems in both efficiency and security, these industries—and mod-
ern society as a whole—will inevitably be impacted by this emerging technology.
However, this reasoning both overestimates and oversimplifies the true potential of
quantum technologies. While it is certain that quantum computers will impact some
industries sooner than others, some sectors may experience little to no change at all.
Moreover, the impact of quantum technologies is believed to extend beyond mere
computational efficiency. Only future research will reveal the full extent of their
influence.
What seems undeniable is that quantum computers will profoundly impact every
industry that relies on machine learning (ML). This is largely because two funda-
mental mathematical pillars of ML—linear algebra and probability theory—are also
central to quantum mechanics. As a result, industries that use ML in their produc-
tion or operations will likely be disrupted by the current generation of NISQ (Noisy
Intermediate-Scale Quantum) computers, as well as by more advanced versions on
the horizon. Finance, which heavily depends on ML for portfolio optimization, mar-
ket prediction, and other financial challenges, is no exception. Some experts predict
that finance will be the first industry to be revolutionized by quantum computing.1
Before exploring how quantum computing can be applied in finance, it is important
to recall that the current NISQ era is characterized by noisy quantum devices and a
relatively small number of coherent qubits. In fact, there is broad consensus among
experts that fully reliable, fault-tolerant quantum computers will only become avail-
able in the long term. Because of this, researchers have developed a new class of
1 Some important papers on the subject are: A. Bouland et al., “Prospects and Challenges of Quantum Finance” (2020); D. J. Egger et al., “Quantum Computing for Finance” (2020); D. A. Herman et al., “A Survey of Quantum Computing for Finance” (2022); D. A. Herman et al., “Quantum Computing for Finance” (2023); and R. Orús et al., “Quantum Computing for Finance” (2018).
In the VQE, a trial (ansatz) state is prepared by applying a parameterized unitary
$U(\tilde{\theta})$ to the reference state $|0\rangle$,
\[ |\tilde{Q}\rangle = |Q(\tilde{\theta})\rangle = U(\tilde{\theta})\, |0\rangle \,, \tag{4.1.1} \]
that is,
\[ |0\rangle \longmapsto U(\tilde{\theta})\, |0\rangle = |Q(\tilde{\theta})\rangle \,. \tag{4.1.2} \]
2 For additional explanations, refer to QC2, Subsection 4.2.
Suppose the Hamiltonian whose lowest expectation value we want to estimate is
decomposed into a sum of tensor products of Pauli operators,
\[ \hat{H} = \sum_{A_1, \ldots, A_n} h_{A_1 \ldots A_n}\, \sigma_{A_1} \otimes \ldots \otimes \sigma_{A_n} \,, \tag{4.1.3} \]
where the coefficients $h_{A_1 \ldots A_n}$ are real numbers, and the $\sigma_A$'s are the Pauli gates $X$, $Y$,
$Z$ (for $A = X, Y, Z$) or the identity operator (for $A = I$). That is, we measure
\[ E(\tilde{\theta}) = \sum_{A_1, \ldots, A_n} h_{A_1 \ldots A_n}\, \langle 0 |\, U^{\dagger}(\tilde{\theta})\, \sigma_{A_1} \otimes \ldots \otimes \sigma_{A_n}\, U(\tilde{\theta})\, | 0 \rangle \,. \tag{4.1.4} \]
The VQE tells us that, from a computational point of view, it is more convenient
to estimate each of the terms with a quantum device,
\[ \langle 0 |\, U^{\dagger}(\tilde{\theta})\, \sigma_{A_1} \otimes \ldots \otimes \sigma_{A_n}\, U(\tilde{\theta})\, | 0 \rangle \,, \tag{4.1.5} \]
and let the sum be computed by a classical computer. This process is then repeated
for other parameters in the neighborhood of $\tilde{\theta}$,
\[ \langle 0 |\, U^{\dagger}(\Delta\tilde{\theta})\, \sigma_{A_1} \otimes \ldots \otimes \sigma_{A_n}\, U(\Delta\tilde{\theta})\, | 0 \rangle \,. \tag{4.1.6} \]
All these results (obtained from the quantum device) are then sent to a classical
optimizer. In summary,
\[ \min_{\tilde{\theta}} \sum_{A_1, \ldots, A_n} h_{A_1 \ldots A_n}\, \langle 0 |\, U^{\dagger}(\tilde{\theta})\, \sigma_{A_1} \otimes \ldots \otimes \sigma_{A_n}\, U(\tilde{\theta})\, | 0 \rangle = E_{\mathrm{VQE}} \gtrsim E_0 \,. \tag{4.1.7} \]
Let us illustrate how the VQE could be applied to the optimization of portfolios. As
we discussed above (see equation (3.1.23)), a binary portfolio optimization problem
aims at
\[ \min_{b}\; \alpha \sum_{s,s'=1}^{S} b_s\, \Sigma_{ss'}\, b_{s'} - \sum_{s=1}^{S} \mu_s b_s \,, \tag{4.1.8} \]
where $b_s$ is a binary variable, $b_s \in \{0, 1\}$, and $\Sigma_{ss'}$ is a symmetric matrix, $\Sigma_{ss'} = \Sigma_{s's}$.
For completeness, let us write the objective function,
\[ f(b_1, \ldots, b_s, \ldots, b_S) = \alpha \sum_{s,s'=1}^{S} b_s\, \Sigma_{ss'}\, b_{s'} - \sum_{s=1}^{S} \mu_s b_s \,. \tag{4.1.9} \]
The variational quantum method establishes that we can minimize this function by
constructing a cost Hamiltonian, ĤC , and finding its minimum expectation value.
Minimizing the objective function is equivalent to minimizing the expectation value
of the cost Hamiltonian:
\[ \min_{b_1, \ldots, b_S} f(b_1, \ldots, b_S) \;=\; \min_{|Q\rangle} \langle Q |\, \hat{H}_C\, | Q \rangle \,, \qquad \hat{H}_C = \alpha \sum_{s,s'=1}^{S} \Sigma_{ss'}\, \hat{b}_s \hat{b}_{s'} - \sum_{s=1}^{S} \mu_s \hat{b}_s \,, \tag{4.1.10} \]
where $|Q\rangle \in \mathcal{H}_2^{\otimes S}$ and $\hat{b}_s = \tfrac{1}{2}(I - Z_s)$ is the binary-variable operator for stock $s$. If
we choose the computational basis $\{ |b_1 \cdots b_s \cdots b_S\rangle \}$ for $\mathcal{H}_2^{\otimes S}$, where the entries
of each vector $|b_1 \cdots b_s \cdots b_S\rangle$ are associated with their corresponding stocks ($b_s = 1$
if the stock $s$ is in the portfolio and $b_s = 0$ if it is not), the vector $|Q\rangle$ can be
expressed as a linear superposition of these basis vectors,
\[ |Q\rangle = \sum_{b_1, \ldots, b_S} \alpha_{\,\cdots b_s \cdots}\, | \cdots b_s \cdots \rangle \,. \tag{4.1.11} \]
\[ = \alpha \sum_{s,s'=1}^{S} \frac{\Sigma_{ss'}}{4} \big( I - Z_{s'} - Z_s + Z_s Z_{s'} \big) \,. \tag{4.1.18} \]
Suppose the case of only two stocks, $s$ and $s'$. The corresponding cost Hamiltonian
is
\begin{align*}
\hat{H}_C^{ss'} &= \alpha \frac{\Sigma_{ss'}}{2} \big( I - Z_{s'} - Z_s + Z_s Z_{s'} \big) - \frac{\mu_s}{2} \big( I - Z_s \big) - \frac{\mu_{s'}}{2} \big( I - Z_{s'} \big) \\
&= \alpha \frac{\Sigma_{ss'}}{2}\, Z_s Z_{s'} + \Big( \frac{\mu_s}{2} - \alpha \frac{\Sigma_{ss'}}{2} \Big) Z_s + \Big( \frac{\mu_{s'}}{2} - \alpha \frac{\Sigma_{ss'}}{2} \Big) Z_{s'} \\
&\quad + \Big( \alpha \frac{\Sigma_{ss'}}{2} - \frac{\mu_s}{2} - \frac{\mu_{s'}}{2} \Big) I \,. \tag{4.1.20}
\end{align*}
Since we can always shift the minimum of the Hamiltonian expectation value, we
simply consider
\[ \hat{H}_C^{ss'} = \alpha \frac{\Sigma_{ss'}}{2}\, Z_s Z_{s'} + \frac{1}{2} \big( \mu_s - \alpha \Sigma_{ss'} \big) Z_s + \frac{1}{2} \big( \mu_{s'} - \alpha \Sigma_{ss'} \big) Z_{s'} \,, \tag{4.1.21} \]
where—for the same reason discussed above—we have omitted the term proportional
to the identity operator. To make this expression more manageable, we define
\[ \hat{H}_{ZZ} = \alpha \sum_{s,s'=1}^{S} \frac{\Sigma_{ss'}}{4}\, Z_s Z_{s'} \,, \qquad \hat{H}_Z = \sum_{s=1}^{S} \frac{1}{2} \Big( \mu_s - \alpha \sum_{s'=1}^{S} \Sigma_{ss'} \Big) Z_s \,. \tag{4.1.23} \]
Let us pause here to understand the quantum circuit corresponding to this unitary.
First, recall that the rotation of a single qubit around the a-axis, where a = x, y, z, is
given in equation (2.1.13). The circuit corresponding to the exponential containing
ĤZ is obtained in the following manner. We begin by rewriting the exponential as
follows:
\begin{align*}
e^{-i\gamma \hat{H}_Z} &= \exp\Big[\, i \sum_{s=1}^{S} \frac{\gamma}{2} \Big( \alpha \sum_{s'=1}^{S} \Sigma_{ss'} - \mu_s \Big) Z_s \Big] \\
&= \bigotimes_{s=1}^{S} \exp\Big[\, i\, \frac{\gamma}{2} \Big( \alpha \sum_{s'=1}^{S} \Sigma_{ss'} - \mu_s \Big) Z_s \Big] \,. \tag{4.1.27}
\end{align*}
Defining $a_s = \frac{1}{2}\big( \mu_s - \alpha \sum_{s'=1}^{S} \Sigma_{ss'} \big)$, we have
\[ e^{-i\gamma \hat{H}_Z} = \bigotimes_{s=1}^{S} e^{-i\gamma a_s Z_s} \,. \tag{4.1.29} \]
This unitary is equivalent to5
\[ e^{-i\gamma \hat{H}_Z} = \bigotimes_{s=1}^{S} e^{-i\gamma a_s Z_s} = \bigotimes_{s=1}^{S} R_z(2\gamma a_s) \,. \tag{4.1.30} \]
This result tells us that the exponential involving ĤZ is equivalent to rotating each
qubit s = 1, 2, . . . , S by an angle of 2γas about the z-axis. The circuit corresponding
to the exponential containing ĤZZ is a bit more complicated to figure out. Since,
as can easily be shown,
\[ e^{-i\delta Z_s Z_{s'}}\, |b_s\rangle |b_{s'}\rangle = \mathrm{CNOT}\, \big( I \otimes R_z(2\delta) \big)\, \mathrm{CNOT}\, |b_s\rangle |b_{s'}\rangle \,, \tag{4.1.31} \]
we conclude that
\begin{align*}
e^{-i\gamma \hat{H}_{ZZ}} &= \exp\Big[ -i\gamma \sum_{s,s'=1}^{S} a_{ss'} Z_s Z_{s'} \Big] = \bigotimes_{s,s'=1}^{S} e^{-i\gamma a_{ss'} Z_s Z_{s'}} \\
&= \bigotimes_{\substack{s=1 \\ s' < s}}^{S} \mathrm{CNOT}_{ss'}\, \big( I \otimes R_z(2\gamma a_{ss'}) \big)\, \mathrm{CNOT}_{ss'} \,, \tag{4.1.32}
\end{align*}
where, to simplify the notation, we have introduced $a_{ss'} = \alpha \Sigma_{ss'} / 4$. The evolution
operator in equation (4.1.25) is thus realized by the circuit in (4.1.32), followed by
the circuit in (4.1.30).
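As a quick numerical sanity check of equation (4.1.30), the following sketch compares the exact matrix exponential of $\hat{H}_Z$ with the tensor product of $R_z$ rotations for a toy two-stock problem (the returns and covariances are made-up numbers):

import numpy as np
from functools import reduce
from scipy.linalg import expm

S, alpha, gamma = 2, 3.0, 0.7
mu = np.array([0.08, 0.12])                     # made-up expected returns
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])  # made-up covariance matrix
a = 0.5 * (mu - alpha * Sigma.sum(axis=1))      # the coefficients a_s

Z, I2 = np.diag([1.0, -1.0]), np.eye(2)
kron_all = lambda ops: reduce(np.kron, ops)

# H_Z = sum_s a_s Z_s, with Z_s acting on qubit s of the S-qubit register.
H_Z = sum(a[s] * kron_all([Z if q == s else I2 for q in range(S)])
          for s in range(S))

U_exact = expm(-1j * gamma * H_Z)  # left-hand side of (4.1.30)

Rz = lambda t: np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])
U_rot = kron_all([Rz(2 * gamma * a[s]) for s in range(S)])  # right-hand side

print(np.allclose(U_exact, U_rot))  # True: the decomposition holds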
Now that we know how to realize the evolution operator corresponding to the
cost Hamiltonian, we are interested in the qubit that enters the circuit, how to
prepare it, and the qubit that exits it. It is common to choose the input qubit as
|Q⟩in = |+⟩⊗S . This is done using Hadamard gates. Recall that a Hadamard gate
acts on a computational basis vector as follows:
\[ H\,|b\rangle = \frac{1}{\sqrt{2}} \sum_{b'} (-1)^{b b'}\, |b'\rangle \,. \tag{4.1.33} \]
More generally,
\[ H^{\otimes S}\, |b_1 \ldots b_S\rangle = \frac{1}{\sqrt{2^S}} \sum_{b'_s} (-1)^{b_1 b'_1 + \ldots + b_S b'_S}\, |b'_1 \ldots b'_S\rangle \,. \tag{4.1.34} \]
s
The initial qubit state $|Q\rangle_{\mathrm{in}} = |+\rangle^{\otimes S}$ is prepared by applying a Hadamard gate to each
individual qubit $|b_s\rangle = |0\rangle$:
\[ |Q\rangle_{\mathrm{in}} = |+\rangle^{\otimes S} = H^{\otimes S}\, |0 \ldots 0\rangle = \frac{1}{\sqrt{2^S}} \sum_{b'_s} |b'_1 \ldots b'_S\rangle \,. \tag{4.1.35} \]
5 See equation (4.72) of QC1.
\[ |0 \ldots 0\rangle \overset{H^{\otimes S}}{\longmapsto} |Q\rangle_{\mathrm{in}} \overset{U_C(\gamma)}{\longmapsto} U_C(\gamma)\, \frac{1}{\sqrt{2^S}} \sum_{b'_s} |b'_1 \ldots b'_S\rangle \,, \tag{4.1.36} \]
where $U_C(\gamma) = e^{-i\gamma \hat{H}_C}$ is the cost unitary from equation (4.1.25).
Regardless of the input qubit state, the QAOA stipulates that we follow the unitary
UC (γ) with a second unitary, associated with the so-called mixer Hamiltonian,
\[ \hat{H}_M = \sum_{s=1}^{S} X_s \,, \tag{4.1.37} \]
with corresponding unitary $U_M(\beta) = e^{-i\beta \hat{H}_M}$.
Note that, since UM (β) does not commute with UC (γ), the strict order is as follows:
first, UC (γ) acts on the input qubit, and then UM (β). It can easily be shown that
\[ U_M(\beta) = \bigotimes_{s=1}^{S} R_x(2\beta) \,. \tag{4.1.39} \]
The qubit state that exits the quantum circuit, formed by the unitaries UC (γ)
followed by UM (β), is now parameterized by the angles γ and β,
\[ |Q\rangle_{\mathrm{in}} \overset{U_M(\beta)\, U_C(\gamma)}{\longmapsto} |Q(\gamma, \beta)\rangle = U_M(\beta)\, U_C(\gamma)\, |Q\rangle_{\mathrm{in}} \,. \tag{4.1.40} \]
This is the ansatz state that we will use to measure the expectation value of the
cost Hamiltonian in equation (4.1.22),
\[ \langle Q(\gamma, \beta) |\, \hat{H}_C\, | Q(\gamma, \beta) \rangle \,. \tag{4.1.41} \]
Since the expectation value of the cost Hamiltonian is a real-valued function of both
parameters $\gamma$ and $\beta$, we can simply write
\[ F(\gamma, \beta) = \langle Q(\gamma, \beta) |\, \hat{H}_C\, | Q(\gamma, \beta) \rangle \,. \tag{4.1.42} \]
We could then send the measurements of the expectation value of the cost Hamilto-
nian F (γ, β) to a classical optimizer to propose better values for γ and β, repeating
this process as many times as necessary until we reach a good approximation (γ∗ , β∗ ).
However, the QAOA suggests that, rather than optimizing a single pair of parameters
$(\gamma, \beta)$, a better approximation can be found by creating a sequence of unitaries
$U_C(\gamma)$ and $U_M(\beta)$, each with its own parameter. In other words, the QAOA
prescribes that we allow the initial state $|Q\rangle_{\mathrm{in}}$ to enter the following parameterized
sequence of gates:
\[ U_M(\beta_p) U_C(\gamma_p) \cdots U_M(\beta_1) U_C(\gamma_1)\, |Q\rangle_{\mathrm{in}} = |Q(\gamma_1, \ldots, \gamma_k, \ldots, \gamma_p, \beta_1, \ldots, \beta_k, \ldots, \beta_p)\rangle \,. \tag{4.1.43} \]
Each pair of unitaries $U_M(\beta_k) U_C(\gamma_k)$, for $k = 1, 2, \ldots, p$, is called a layer. Specifically,
$U_M(\beta_k) U_C(\gamma_k)$ is the $k$th layer, with $U_C(\gamma_k)$ referred to as the $k$th cost layer
and $U_M(\beta_k)$ as the $k$th mixer layer. It has been shown that the larger the number
of layers, the better the approximation of the optimization problem. Needless to
say, though, the number of layers must be kept within a reasonable limit to ensure
that the calculation is not adversely affected by excessive noise.
In total, there are 2p parameters to be optimized variationally: p angles γk and
p angles βk . We can collect all these variational parameters in a more compact
notation: $\gamma = (\gamma_1, \ldots, \gamma_p)$ and $\beta = (\beta_1, \ldots, \beta_p)$. The ansatz state is thus
\[ |Q_p(\gamma, \beta)\rangle = \prod_{k=1}^{p} U_M(\beta_k)\, U_C(\gamma_k)\, |Q\rangle_{\mathrm{in}} \,. \tag{4.1.44} \]
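To make the whole pipeline concrete, here is a small statevector simulation of a single-layer QAOA for two stocks, written in plain NumPy under the conventions above ($U_C(\gamma) = e^{-i\gamma \hat{H}_C}$ acts as diagonal phases in the computational basis, and the mixer applies $R_x(2\beta)$ to each qubit); the problem data are illustrative, and a crude grid search stands in for the classical optimizer:

import numpy as np
from itertools import product

alpha = 3.0
mu = np.array([0.08, 0.12])                     # made-up expected returns
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])  # made-up covariance matrix
S = 2

def f(b):
    """The classical objective (4.1.9) evaluated on a bitstring b."""
    b = np.asarray(b)
    return alpha * b @ Sigma @ b - mu @ b

# H_C is diagonal in the computational basis, with f(b) on the diagonal.
bitstrings = list(product([0, 1], repeat=S))
diag = np.array([f(b) for b in bitstrings])

def Rx(t):
    return np.array([[np.cos(t / 2), -1j * np.sin(t / 2)],
                     [-1j * np.sin(t / 2), np.cos(t / 2)]])

def F(gamma, beta):
    """F(gamma, beta) = <Q(gamma, beta)| H_C |Q(gamma, beta)>."""
    state = np.full(2 ** S, 1 / np.sqrt(2 ** S), dtype=complex)  # |+>^S
    state = np.exp(-1j * gamma * diag) * state                   # U_C(gamma)
    state = np.kron(Rx(2 * beta), Rx(2 * beta)) @ state          # U_M(beta)
    return float(np.real(state.conj() @ (diag * state)))

grid = np.linspace(0, np.pi, 60)
best = min((F(g, b), g, b) for g in grid for b in grid)
print("best F(gamma, beta):", round(best[0], 4))
print("exact minimum of f :", min(f(b) for b in bitstrings))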
Data Encoding
Suppose we have a set of samples, n = 1, . . . , N , characterized by two features, x1
and x2 . For simplicity, assume that each feature can take one of the following four
values: 0, 1, 2, or 3. Using simple binary notation, we can represent these values
as 00, 01, 10, and 11, respectively. The feature vector of the n-th sample can be
expressed in binary notation as follows:
\begin{align*}
x^{(n)} = \big[ x_1^{(n)} \ x_2^{(n)} \big]^T \ \longleftrightarrow \ x_b^{(n)} &= \big[ x_{1b}^{(n)} \ x_{2b}^{(n)} \big]^T \\
&= \big[ (b_1^{(n)})_1 \ (b_1^{(n)})_2 \ (b_2^{(n)})_1 \ (b_2^{(n)})_2 \big]^T \\
&= \big[ b_{11}^{(n)} \ b_{12}^{(n)} \ b_{21}^{(n)} \ b_{22}^{(n)} \big]^T \,. \tag{4.2.1}
\end{align*}
6 For an introduction, see D. Pastorello, Concise Guide to Quantum Machine Learning, and M. Schuld & F. Petruccione, Machine Learning with Quantum Computers.
Basis encoding associates the binary feature vector $x_b^{(n)}$ with a computational basis
vector of the $2^4$-dimensional qubit space:
\[ x_b^{(n)} \longmapsto |x_b^{(n)}\rangle = |b_{11}^{(n)} b_{12}^{(n)} b_{21}^{(n)} b_{22}^{(n)}\rangle \,. \tag{4.2.2} \]
If the samples $n$ and $n'$ are characterized by the two features $x_1$ and $x_2$, the entire
dataset, $X$, is associated with the following state vector:
\[ X = \begin{bmatrix} x_1^{(n)} & x_2^{(n)} \\ x_1^{(n')} & x_2^{(n')} \end{bmatrix} \longmapsto |X_b\rangle = \frac{1}{\sqrt{2}}\, |x_b^{(n)}\rangle + \frac{1}{\sqrt{2}}\, |x_b^{(n')}\rangle \,. \tag{4.2.4} \]
where $b_j \in \{0, 1\}$, and $n$ depends on the value of $x$. (Note that the subscript $n$
here should not be confused with the superscript $(n)$ in $x^{(n)}$, which denotes the $n$th
sample.) If there are two features, say $i$ and $i'$, the feature vector for the $n$th sample
is given by
\[ x^{(n)} = \big[ x_i^{(n)} \ x_{i'}^{(n)} \big]^T \longleftrightarrow x_b^{(n)} = \big[ b_{i1}^{(n)} \ldots b_{in}^{(n)} \ b_{i'1}^{(n)} \ldots b_{i'n'}^{(n)} \big]^T \,, \tag{4.2.6} \]
where,
\[ \| x^{(n)} \|^2 = \sum_{i=1}^{I} \big( x_i^{(n)} \big)^2 \,, \qquad \| \bar{x}^{(n)} \|^2 = \sum_{i=1}^{I} \big( \bar{x}_i^{(n)} \big)^2 = 1 \,. \tag{4.2.11} \]
We then associate the normalized feature vector $\bar{x}^{(n)}$ with a quantum state as follows:
\[ \bar{x}^{(n)} \longmapsto |x^{(n)}\rangle = \sum_{i=1}^{I} \bar{x}_i^{(n)}\, |i\rangle_n \,. \tag{4.2.12} \]
Note that for each feature value $\bar{x}_i^{(n)}$, there is an associated basis vector $|i\rangle_n$. It
follows that the dimension of the qubit space must satisfy $2^Q \geq I$, where $Q$ is the
number of qubits in $|i\rangle_n$. More specifically,
\[ |x^{(n)}\rangle = \sum_{i=1}^{2^Q} \bar{x}_i^{(n)}\, |i\rangle_n = \sum_{i=1}^{I} \bar{x}_i^{(n)}\, |i\rangle_n + \sum_{i=I+1}^{2^Q} \bar{x}_i^{(n)}\, |i\rangle_n \,, \tag{4.2.13} \]
where $\bar{x}_i^{(n)} = 0$ for $i = I+1, \ldots, 2^Q$. The same idea can be extended to the entire
dataset $X = \big( x_i^{(n)} \big)$, where $i = 1, \ldots, I$ and $n = 1, \ldots, N$. In this case, we normalize
the elements of the matrix $X$ by setting $\bar{x}_i^{(n)} = x_i^{(n)} / \|X\|$, where
\[ \|X\|^2 = \sum_{i=1}^{I} \sum_{n=1}^{N} \big( x_i^{(n)} \big)^2 \,. \tag{4.2.14} \]
Consider now all the feature values of the $n$th sample, $x_1^{(n)}, \ldots, x_I^{(n)}$. To use a
similar angle parametrization, we can associate the smallest feature value, $x_{\min}^{(n)}$,
with 0 radians and the largest value, $x_{\max}^{(n)}$, with $\pi$ radians. The remaining angles
will lie between these two extremes,
\[ \theta_i^{(n)} = \frac{x_i^{(n)} - x_{\min}^{(n)}}{x_{\max}^{(n)} - x_{\min}^{(n)}}\, \pi \,. \tag{4.2.20} \]
After applying this transformation to all features and replacing $x^{(n)} \mapsto \theta^{(n)} = \big[ \theta_1^{(n)} \ldots \theta_I^{(n)} \big]^T$, we associate the following ket with this column vector:
\[ x^{(n)} \longmapsto \theta^{(n)} \longmapsto |\theta^{(n)}\rangle = \bigotimes_{i=1}^{I} R_y(\theta_i^{(n)})\, |0\rangle \,. \tag{4.2.21} \]
The rotation of the single-qubit state vector |0⟩ about the y-axis allows us to use
the angle θ to encode one feature value. If there are I features, we can encode them
in I angles, with one angle per qubit |0⟩. In general, though, a qubit requires two
real values (angles in the Bloch sphere) to be fully described. We can therefore use
the angle ϕ, associated with a rotation about the z-axis, to encode another piece of
classical data. Using the general formula above for a rotation operator, a z-rotation
acts as:
\[ R_z(\phi) = \cos(\phi/2)\, I - i \sin(\phi/2)\, Z \,. \tag{4.2.22} \]
When applied to the qubit state $R_y(\theta)\,|0\rangle$, given in (4.2.17), it produces
\[ R_z(\phi)\, R_y(\theta)\, |0\rangle = e^{-i\phi/2} \big( \cos(\theta/2)\, |0\rangle + e^{i\phi} \sin(\theta/2)\, |1\rangle \big) \,, \tag{4.2.23} \]
where we have used that $Z\,|b\rangle = (-1)^b\,|b\rangle$. Thus, we can encode $2I$ features in
the angles of rotation of $I$ qubits. For example, we can associate the odd-indexed
features with rotations about the $y$-axis,
\[ x_{2i-1}^{(n)} \mapsto \bar{x}_{2i-1}^{(n)} \mapsto \theta_{2i-1}^{(n)} \,, \tag{4.2.24} \]
and the even-indexed features with rotations about the $z$-axis,
\[ x_{2i}^{(n)} \mapsto \bar{x}_{2i}^{(n)} \mapsto \phi_{2i}^{(n)} \,. \tag{4.2.25} \]
Ignoring the global phase, we can encode two classical data points as follows:
\[ \big[ x_{2i-1}^{(n)} \ x_{2i}^{(n)} \big]^T \longmapsto R_z(\phi_{2i}^{(n)})\, R_y(\theta_{2i-1}^{(n)})\, |0\rangle = \cos(\theta_{2i-1}^{(n)}/2)\, |0\rangle + e^{i\phi_{2i}^{(n)}} \sin(\theta_{2i-1}^{(n)}/2)\, |1\rangle \,. \tag{4.2.26} \]
Finally, the entire set of features corresponding to the $n$th sample is encoded in the
following ket:
\begin{align*}
x^{(n)} &\longmapsto \bigotimes_{i=1}^{I} R_z(\phi_{2i}^{(n)})\, R_y(\theta_{2i-1}^{(n)})\, |0\rangle \\
&= \bigotimes_{i=1}^{I} \Big( \cos(\theta_{2i-1}^{(n)}/2)\, |0\rangle + e^{i\phi_{2i}^{(n)}} \sin(\theta_{2i-1}^{(n)}/2)\, |1\rangle \Big) \,. \tag{4.2.27}
\end{align*}
Now that we understand how classical data can be encoded into quantum informa-
tion, let us examine how this quantum information can be processed into the analog
of classical neural networks.
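As a small NumPy illustration of angle encoding, the following sketch rescales two arbitrary feature values to angles via equation (4.2.20) and loads them into two qubits via equation (4.2.21):

import numpy as np
from functools import reduce

def Ry(theta):
    return np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                     [np.sin(theta / 2),  np.cos(theta / 2)]])

x = np.array([3.2, 7.5])  # two arbitrary feature values for one sample

# Equation (4.2.20): rescale the features linearly to angles in [0, pi].
theta = (x - x.min()) / (x.max() - x.min()) * np.pi

# Equation (4.2.21): one Ry-rotated qubit per feature, joined by tensor product.
ket0 = np.array([1.0, 0.0])
state = reduce(np.kron, [Ry(t) @ ket0 for t in theta])

print("encoded 2-qubit state:", np.round(state, 3))
print("norm:", np.linalg.norm(state))  # 1.0, as required of a quantum state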
For example, in certain tasks, QML techniques may provide a better approximation
than purely classical ML techniques. Even if the computation takes longer, this
advantage can be valuable. This contrasts with the traditional perspective, where
quantum algorithms are primarily expected to outperform their classical counter-
parts in terms of speed. The relevance of this shift is clear: in some cases, accuracy
is more desirable than faster computation, such as in privacy-sensitive applications
or scenarios requiring higher precision in modeling complex systems. For instance,
in financial modeling, precise risk estimation, market trend prediction, or portfolio
optimization may be more valuable than rapid execution. Another example outside
of finance is drug discovery, where accurately modeling quantum interactions at the
molecular level may be more crucial for identifying promising compounds, even at
the expense of a longer computational time.
As seen in the previous paragraph, evaluating the advantage of quantum machine
learning techniques in real-world applications can be quite complex. In addition,
there are technical challenges such as the efficiency and robustness of
encoding classical data into quantum states, the impact of qubit and quantum gate
fidelity on the training process, the use of quantum optimizers instead of the classical
ones usually considered, among others.
GANs are used to generate synthetic financial data, such as stock prices or market
conditions, helping to test the resilience of financial strategies and create more ro-
bust models. They can also be employed to generate realistic fraudulent transaction
data, thereby improving fraud detection systems.
Qiskit
Qiskit is the most popular quantum software framework at the moment. Its success
can be attributed to several factors. First, it was developed by IBM, one of the
leading quantum hardware companies in the world today, and the company pro-
vides free access to their quantum computers through the cloud. Second, Qiskit has
a Python-based interface that integrates seamlessly with popular data science tools
like NumPy and Pandas, making it easy for students and professionals from various
fields who are familiar with Python to get started. As a result, Qiskit is a promi-
nent choice for both educational purposes and professional research. Third, its success
is also attributed to the fact that it is open-source software, benefiting from an active
and collaborative community, much like Python, that contributes to the framework's
continuous development. Fourth, Qiskit enables users to develop quantum algorithms,
simulate quantum circuits, and run experiments on quantum computers and simulators
(classical computers that simulate the behavior of quantum computers). Finally, Qiskit
supports a growing quantum ecosystem, including integration with application-oriented
modules for areas such as machine learning, optimization, and finance.9
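As a first taste of the interface, here is a minimal Qiskit sketch that builds and draws the standard two-qubit Bell-state circuit, the kind of building block the variational algorithms of this chapter are assembled from:

from qiskit import QuantumCircuit

# Hadamard on qubit 0, CNOT from qubit 0 to qubit 1, then measurement.
qc = QuantumCircuit(2, 2)
qc.h(0)          # put qubit 0 into the superposition (|0> + |1>)/sqrt(2)
qc.cx(0, 1)      # entangle qubit 1 with qubit 0
qc.measure([0, 1], [0, 1])

print(qc.draw())  # ASCII drawing of the circuit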
Cirq
Similar to IBM’s Qiskit, Google has developed its own quantum computing frame-
work called Cirq. Cirq shares many of the advantages found in Qiskit.10 First, it is
backed by one of the largest tech companies in the world, which has a proven track
record of breakthroughs in quantum computing. Second, it offers a Python-based
interface that integrates seamlessly with Google’s quantum hardware, making it—at
least in principle—appealing to students and professionals from diverse fields. Third,
Cirq is open-source, allowing users to contribute to its development and utilize it
for creating and running quantum circuits, whether on quantum computers or sim-
ulators. Like Qiskit, Cirq also integrates with other quantum computing platforms
and tools. However, compared to Qiskit, Cirq lacks a large, established, and col-
laborative community of researchers and developers. While it is difficult to quantify
the exact reasons for Qiskit’s broader popularity, it is likely that Qiskit’s expansive
ecosystem, strong educational resources, and greater community support have made
it a more widely adopted choice in the quantum computing developer community.11
PyQuil
The PyQuil quantum software framework was developed by Rigetti, a quantum
hardware company specializing in superconducting qubits. While IBM and Google
also focus on superconducting quantum processors, Rigetti is a smaller company
with a distinct emphasis on hybrid quantum-classical computing. PyQuil serves
as Rigetti’s counterpart to Qiskit and Cirq, designed for seamless integration with
its quantum hardware. Like the two frameworks discussed above, PyQuil allows
users to design and deploy quantum applications on both simulators and real quan-
tum processors. While Qiskit provides a broader ecosystem covering areas such as
quantum circuits, machine learning, and quantum chemistry, PyQuil is specifically
designed to facilitate hybrid quantum-classical computing. Although Qiskit and
Cirq also support hybrid execution, PyQuil is the most explicitly hybrid-focused
framework among the three, making it particularly well-suited for variational quan-
tum algorithms and real-time classical processing. That said, PyQuil lacks some of
Qiskit’s advantages, such as a larger user base and extensive community resources.
IBM’s strong investment in educational materials, cloud-based quantum access, and
an active global community gives Qiskit a more extensive ecosystem. Nonetheless,
PyQuil remains a powerful tool for users working within Rigetti’s quantum comput-
ing stack, particularly those interested in hybrid computing paradigms.12
Amazon Braket
Although Amazon is developing its own quantum hardware, publicly available infor-
mation suggests that it is less advanced than that of IBM, Google, and Rigetti. In
fact, Amazon is best known for its quantum software framework: Amazon Braket.
9 https://www.ibm.com/quantum/qiskit
10 For an insightful discussion on the differences between Qiskit and Cirq, read my LinkedIn post and the comments at https://bit.ly/QiskitvsCirq.
11 https://quantumai.google/cirq
12 https://www.rigetti.com
Amazon Braket does not use Amazon’s own quantum hardware but instead provides
access to quantum processors from multiple external providers, including Rigetti
(superconducting qubits), IonQ (trapped-ion qubits), QuEra (neutral-atom qubits),
and D-Wave (a quantum annealer). In addition to providing a Python Software De-
velopment Kit (SDK) to build and run quantum algorithms on quantum computers,
it also offers classical quantum simulators to test algorithms before running them
on quantum hardware. Finally, Amazon Braket’s integration with other classical
computing resources from the Amazon ecosystem enables users to create seamless
hybrid quantum-classical workflows, which are ideal for solving complex problems,
such as those involving variational quantum algorithms.13
PennyLane
PennyLane is a hardware-agnostic quantum machine learning framework designed
to integrate seamlessly with classical machine learning libraries. It connects to quan-
tum computers through APIs from platforms such as Qiskit (IBM), Cirq (Google),
Rigetti’s Quantum Cloud Services (QCS), and Braket (Amazon). One of its key
strengths is quantum differentiation, which allows for efficient gradient computation
in quantum circuits, making it particularly valuable for quantum machine learn-
ing and AI research. Additionally, PennyLane works seamlessly with classical ma-
chine learning frameworks like TensorFlow and PyTorch, enabling users to develop
quantum-enhanced machine learning models and optimize complex tasks by incor-
porating quantum circuits into hybrid workflows.
13 https://aws.amazon.com/braket
Chapter 5
How to Get Quantum-Ready
In this Chapter, we focus on the broader context of the adoption of quantum comput-
ing by financial institutions. Before delving into the details, however, it is essential
to understand the current status of the quantum computing ecosystem at large.
Recognizing the near-future impact of quantum technologies on society, the gov-
ernments of the most developed countries have begun funding research centers and
educational programs. Given the nascent and evolving status of the field, where
cutting-edge research is crucial to securing a competitive position, the primary fo-
cus of governments has been on supporting research institutions. Research centers
have been established in the United States, China, Europe, Singapore, and the Gulf
countries, to name just a few.
To address the present and future needs of research centers and industries already
engaged in quantum computing, advanced educational programs tailored specifically
to these requirements, such as MSc and PhD programs, have been created. These
programs cover diverse topics, ranging from quantum optics and machine learning
to the funding of startups. They are available in most countries with significant
quantum computing initiatives. Governments have played an active role in funding
these programs.
Due to the relatively low cost of educational initiatives compared to advanced
research—especially if it involves experimental research—quantum computing ed-
ucational programs are being proposed globally, including in developing countries.
Online certifications are also offered by leading educational institutions such as MIT
and by private companies.
The private sector, represented by major technology companies such as IBM and
Google, has also developed a strong presence in quantum research and education.
Companies not only focus on their internal needs and strategies for the future but
also contribute to the collective advancement of the field. In addition to this en-
gagement by major corporations, particularly in hardware and software research and
development, dozens of startups have emerged to address specific needs, including
those related to financial services, as discussed below.
Collaborations between these three sectors—governments, educational institutions,
and the private sector—are frequent.1 For instance, research partnerships often
1 The seminal book The Triple Helix: University-Industry-Government Innovation in Action, by H. Etzkowitz & L. Leydesdorff, remains the basic reference for explaining how the interactions between governments, universities, and private companies contribute to innovation, economic development, and technological progress. Etzkowitz's most recent book, published under the same title, adopts a more practical approach and is well worth reading.
Artificial intelligence, and more specifically machine learning, is among the tools
banks use to analyze these staggering amounts of data and make sound decisions,
enabling better risk management, personalized services, and improved operational
efficiency.
In the previous chapters, we explored how machine learning has transformed the fi-
nancial industry, with applications in portfolio optimization, fraud detection, credit
risk analysis, anti-money laundering, and enhanced security. Beyond these, there
are other notable innovations, such as the delivery of more personalized services and
the use of AI-driven chatbots for instant customer support. However, it is impor-
tant to remember that the shift toward adopting machine learning was not always
universally accepted. Just a few years ago, many professionals and financial insti-
tutions were hesitant to embrace it, often due to a lack of understanding, concerns
about data privacy, or skepticism regarding its effectiveness. Today, however, there
is broader recognition of its potential to revolutionize the industry. It is now widely
accepted that banks that fail to adopt machine learning, or artificial intelligence
more broadly, risk becoming obsolete, much like institutions such as Lehman Broth-
ers and Barings Bank, which failed due to their inability to adapt. There is now a
consensus that AI is essential for enhancing customer experience, reducing opera-
tional costs, managing risks, and staying competitive in a rapidly evolving financial
landscape.4
A final word on quantum technologies: If the historical relationship between finance
and technology has been defined by the collection, storage, and transmission of
information, it is highly likely that the physical nature of that information will play
a crucial role in shaping future developments. Just as the transition from analog
to digital information revolutionized the financial services industry in the late 20th
century, the shift from digital to quantum information promises to be equally, if not
more, transformative.5
and it remains uncertain which approach will ultimately prove the most transfor-
mative. Let us explore some of these technologies and the key players shaping the
field.
Superconducting Qubits
Superconducting qubits use superconducting circuits with Josephson junctions to
create and manipulate quantum states. While they perform well in gate operations,
they require extremely low temperatures, which demand complex cooling systems. Leading compa-
nies in this field include IBM, Google, Rigetti, and Alibaba. Since this is currently
the most mature quantum computing technology, let us take some time to discuss
it.
IBM is one of the leading companies in superconducting qubits. Its technology
provides qubits with long coherence times and incorporates advanced quantum er-
ror mitigation techniques. The company has developed quantum processors such
as Eagle in 2021, with 127 qubits; Osprey in 2022, with 433 qubits; and Condor in
2023, with 1,121 qubits. Despite the tenfold increase in qubit count, maintaining
high qubit quality as systems scale remains a significant challenge. Recognizing this
challenge, in December 2023, IBM announced the Heron processor, featuring 133
qubits. It was the first in IBM’s new generation of error-mitigated quantum proces-
sors, focusing on improved qubit connectivity and reduced noise rather than simply
increasing the number of qubits. In 2024, the company unveiled the IBM Quantum
Heron R2, a 156-qubit processor. This chip builds upon the Heron architecture,
increasing the qubit count from 133 to 156 and introducing two-level system mit-
igation to further reduce noise. By the end of the decade, IBM plans to deliver a
fully error-corrected system with 200 qubits capable of running 100 million gates.6
Google, one of the major tech companies heavily investing in quantum computing
developments, employs a similar type of superconducting qubits but with a different
architecture. Google’s quantum circuits use a planar layout with nearest-neighbor
coupling, enabling efficient two-qubit gate operations. They focus on quantum error
correction, aiming for logical qubits that can sustain long computations. Google’s
Sycamore processor, which achieved quantum supremacy in 2019, used a 54-qubit
chip to outperform classical supercomputers in a random circuit sampling task (ac-
tually, only 53 qubits were used because one was faulty). In 2024, Google unveiled
its latest quantum processor, Willow, featuring 105 qubits. It achieved in less than
five minutes a task that would take today’s fastest classical supercomputers millions
and millions of years to complete. This was the second time Google achieved quan-
tum supremacy—the only Western company to have reached this goal. Google’s
roadmap includes building a 1-million-physical-qubit system with error correction
by the 2030s.7
Rigetti is a much smaller quantum hardware company, with approximately 150
employees, that also focuses on developing superconducting qubit processors. Its
Aspen-9 processor was announced in 2021 with 32 qubits, and the Aspen-M was
announced in 2023 with up to 80 qubits. Their high gate fidelity for single-qubit
and two-qubit gates has the potential for scalability and improved performance.
Rigetti plans to build quantum processors with up to 1,000 qubits by the end of
the decade, with fidelity above 99% and increasing coherence times. Due to their
6 https://www.ibm.com/quantum/technology
7 https://quantumai.google/quantumcomputer
Trapped-Ion Qubits
In this approach, individual qubits are encoded in the internal energy states, hy-
perfine states, or other quantum states of the ions, and precision lasers are used to
manipulate these states. The most commonly used ions for trapped ion quantum
computing are barium (Ba), calcium (Ca), magnesium (Mg), and ytterbium (Yb).
The concept of using ions trapped in electromagnetic fields for quantum computing
began to take shape by the mid-1990s. However, it wasn’t until the 2010s that
trapped-ion qubits were scaled up to handle multiple qubits, with advancements in
fidelity, coherence, and error correction. Today, trapped-ion quantum technology is
regarded as one of the most promising approaches in quantum computing.
IonQ9 and Quantinuum10 are two prominent companies in the field of trapped-
ion quantum computing, each employing different hardware strategies. IonQ has
developed quantum computers with up to 32 physical qubits, while Quantinuum
has achieved up to 20. IonQ’s roadmap aims to reach 64 algorithmic qubits (error-
corrected qubits) by the end of 2025, with a target logical two-qubit gate fidelity
of 99.999%. In contrast, Quantinuum plans to develop a system with 96 physical
qubits and a physical error rate below 5 × 10−4 , and by 2029, they intend to in-
troduce a processor with thousands of physical qubits and hundreds of algorithmic
qubits, targeting a logical error rate between 1×10−5 and 1×10−10 . Both companies
face the challenge of scaling their systems while maintaining error correction, qubit
coherence, and performance as they increase qubit numbers.
Photonic Qubits
In addition to the three cases of quantum supremacy achieved using superconducting
technologies, photonic qubits (also called optical qubits) have also achieved quantum
supremacy.
Photonic qubits encode quantum information using the intrinsic properties of in-
dividual photons, such as polarization. These qubits are generated by single-photon
8 https://www.rigetti.com
9 https://ionq.com
10 https://www.quantinuum.com
sources, manipulated using beam splitters and phase shifters, and measured with
single-photon detectors. The key advantages of this technology are its low decoher-
ence, fast transmission, and room-temperature operation, making photonic qubits
particularly suited for quantum communication and networking. However, chal-
lenges such as weak photon-photon interactions, photon loss, and reliance on prob-
abilistic gates hinder scalability. Despite these obstacles, companies like PsiQuan-
tum 11 and Xanadu 12 are making significant strides toward large-scale, fault-tolerant
photonic quantum computing systems.
As mentioned above, quantum supremacy using photonic qubits was first demon-
strated in 2020 with the Jiuzhang experiment at the University of Science and Tech-
nology of China. The group was able to solve a problem in minutes that would take
millions of years on the most powerful classical computers. The follow-up Jiuzhang
2.0 experiment in 2021 further validated these results. While groundbreaking, these
experiments are task-specific and, as previously indicated, face ongoing challenges
for potential future applications.
Topological Qubits
In topological qubits, quantum information is stored in the global properties of quan-
tum states within certain materials. The key advantage of topological qubits is their
enhanced robustness to errors, as the information is encoded in the global properties
of the quantum states—specifically in the braiding of exotic particles called anyons,
which are neither fermions nor bosons. These qubits possess properties that enable
them to be isolated and manipulated in a way that makes them resistant to lo-
cal disturbances. This error resilience has the potential to significantly enhance the
scalability and reliability of quantum computers. The leading company in the devel-
opment of topological qubits is Microsoft, which focuses on a specific type of anyon
called Majorana fermions. The main challenges involve not only the theoretical un-
11 https://www.psiquantum.com
12 https://xanadu.ai
13 https://www.pasqal.com
derstanding of these objects but also experimental difficulties, such as isolating the
anyons and performing the necessary braiding operations. Despite these challenges,
Microsoft is fully committed to the development of topological qubits, investing
heavily in research to make topological quantum computing a practical technology.14
Quantum Annealers
This is the first time we mention quantum annealers. We have not discussed them
earlier because quantum annealers are not gate-based quantum computers (they are
not structured as quantum circuits). In quantum annealing, a problem is encoded
into the system’s Hamiltonian and the system evolves towards its lowest energy
state, which represents the optimal solution. The underlying physics involves a
quantum phenomenon called the adiabatic process. Quantum annealers are con-
sidered non-universal because they are specifically designed to solve optimization
problems. They excel at solving optimization tasks in fields like logistics, material
science, and finance. D-Wave is a leading company in the development of quantum
annealers, with their latest processor containing 5,000 qubits. However, the fidelity
remains much lower than that of gate-based quantum systems due to noise and de-
coherence. While quantum annealers face the same challenges as other quantum
computing approaches, such as noise, decoherence, and error correction, achieving
reliable results for complex problems remains difficult. Some quantum computing
experts are skeptical about the results obtained by quantum annealers and the po-
tential of the technology for the future.15
JPMorgan Chase
JPMorgan is perhaps the most prominent financial institution actively promoting
quantum computing. Its Managing Director and Head of Global Technology Ap-
plied Research has helped establish the firm as a leader in quantum computing for
financial services. He has not only built a strong team of researchers and published
14 https://quantum.microsoft.com
15 https://www.dwavesys.com
16 For an insightful discussion on how small and large banks are adopting quantum computing, read my LinkedIn post and the comments at https://bit.ly/QCBigvsSmallBanks.
papers and surveys reviewing the state of the art for the expert community but
is also actively engaged in promoting quantum computing to non-experts in the
financial sector, such as quants and executives. The group’s research has been par-
ticularly focused on quantum computing for portfolio optimization, option pricing,
risk analysis, and machine learning applications such as fraud detection and natural
language processing.17
HSBC
HSBC is also a well-known bank heavily investing in quantum computing for finance.
However, according to available information, the bank focuses more on cybersecu-
rity, particularly on using quantum technologies to ensure the secure storage and
communication of sensitive information, such as customer transactions. Another
area of active research at the bank is fraud detection. Thus, although JPMorgan
and HSBC are both leaders in quantum computing for finance and have
areas of overlap, JPMorgan focuses more on quantum computing solutions for finan-
cial services, such as portfolio optimization and quantum machine learning, while
HSBC appears to prioritize cybersecurity and fraud detection.18
Wells Fargo
Wells Fargo is another leader in quantum computing for financial services and se-
curity. Its areas of research are similar to those of JPMorgan and HSBC, with its
own unique strengths. However, the reason we want to mention it here is different.
The Managing Director and Chief Technology Officer of Advanced Technology at
Wells Fargo has been particularly vocal about the hype surrounding quantum com-
puting and its impact on how financial institutions, from small to large, approach
this technology. He believes that the hardware is still not developed enough, and
that making large investments in quantum computing for financial services at this
stage is not the right decision. He advocates that banks should refrain from making
excessive investments, limit their spending to stay informed about the technology
and new developments, and only make the leap into quantum once there are prac-
tical demonstrations of quantum technology in finance. In his view, banks should
remain frugal in their investments and avoid falling into the trap of hype around the
supposed practical advantages of quantum computing, which, as of today, remain
largely speculative.19
among the largest collectors of information in the world and leaders in quantum
computing research and development. Companies like Google, Amazon, and Meta
(formerly Facebook) have already begun entering the financial services industry, and
the possibility of them expanding their presence more decisively is a real concern
for traditional financial institutions.
The Chinese multinational company Alibaba, actively involved in quantum com-
puting research, is particularly noteworthy. The company has seamlessly integrated
financial services into its ecosystem, offering payment solutions, lending, and invest-
ment products, thereby transforming how millions of users manage their finances.
Such developments illustrate the disruptive potential Big Tech companies could
bring to the global financial industry.
To conclude, it is worth noting that some major banks have yet to start—or at
least have not made it publicly known—that they are adopting quantum comput-
ing in their research or hiring personnel to work on it. Perhaps these institutions
believe, as historical precedent has sometimes proven, that it is more strategic to
wait until the technology has matured before making the substantial intellectual
and financial investments required to catch up with competitors. However, this
strategy is not without its risks. Delaying adoption could mean falling significantly
behind early adopters who have already established expertise and infrastructure.
On the other hand, investing heavily in a technology that is still several years away
from widespread practical implementation also carries inherent risks. The decision
requires a careful balance between foresight and pragmatism.
Multiverse Computing
Multiverse Computing was founded in 2019 by a team of financial experts and physi-
cists. Originally established in San Sebastián, Spain, the company has expanded its
presence with offices in Paris, Munich, and London, among others. Initially, Multiverse
Computing focused on providing quantum software solutions for the finance industry.
Recognizing that many finance professionals are not well-versed in quantum com-
puting, the company developed Singularity, a software platform that allows users to
implement quantum algorithms through familiar tools like Microsoft Excel, without
requiring prior quantum computing knowledge. Multiverse Computing has collabo-
rated with major financial institutions, including BBVA and Crédit Agricole CIB,
to explore the application of quantum computing in finance. These collaborations
have led to advancements in areas such as portfolio optimization and risk analy-
sis. The company actively publishes its findings in research papers, contributing to
the broader scientific and financial communities. More recently, Multiverse Com-
puting has emphasized the development of quantum-inspired algorithms to enhance
artificial intelligence capabilities. This strategic shift aims to leverage quantum tech-
nologies to improve AI processes, making them more efficient and effective across
various applications. Over time, Multiverse Computing has broadened its scope to
offer quantum-inspired solutions across various sectors, including energy, manufac-
turing, logistics, biotech, pharmaceuticals, and aerospace.20
Quantum Signals
Much smaller than Multiverse Computing, Quantum Signals focuses exclusively on
quantum computing solutions for the financial sector. Founded in 2024 by two quan-
tum computing experts in finance, the company has, from the beginning, focused
on quantum-inspired solutions aimed at speeding up and improving the accuracy of
artificial intelligence applications.21
AbaQus
AbaQus was founded in 2021 in Vancouver. While detailed information about the
company remains limited, online sources suggest that the company focuses on the
practical implementation of quantum algorithms, providing access to cloud-based
quantum services, experimenting with quantum-inspired optimization, and integrat-
ing hybrid quantum-classical approaches into existing financial workflows.22
QuantFi
Founded in 2019 by American and French business partners, QuantFi is a startup
that offers quantum computing solutions for financial institutions. They engage in
training initiatives to help financial professionals understand quantum computing
and its potential impact—similar to what we saw with AbaQus. Additionally, they
conduct joint research projects with institutions to explore quantum computing ap-
plications in finance.23
Chapter 6
Conclusion
In this guide, I have sought to strike a balance between technical and practical as-
pects, emphasizing the complexities of applying quantum computation techniques
to the financial services industry. An inherently interdisciplinary subject, it encom-
passes quantum mechanics, computer science, mathematical finance, programming,
technological research and business innovation.
Before concluding with actionable steps you can take to prepare for the next major
technological revolution in the financial industry, let us recap some of the main points
covered in this short book.
Quantum computation, by leveraging the principles of quantum mechanics that
govern nature, aims to build quantum computers whose efficiency and precision
will surpass those of current digital computers by many orders of magnitude.
Among the many applications this groundbreaking technology will have in society,
the financial industry is expected to be one of the first to benefit from it. By acceler-
ating and improving conventional computational techniques—such as optimization,
Monte Carlo simulations, and machine learning—quantum computers promise to
transform the future of finance. They will enhance the way financial organizations
provide services, refine predictive strategies, and optimize investment methods, all
while ensuring compliance with increasingly stringent regulations imposed by gov-
ernments and regulatory bodies.
To become “quantum ready,” small and mid-sized financial institutions must act
proactively before it is too late. Although workable quantum computers are likely
several years away, this time should be used to prepare staff and familiarize them
with the most pressing advancements in the field. This preparation will not only en-
sure a smoother transition but also provide a competitive edge in a rapidly evolving
technological landscape.
If you are interested in learning how to implement quantum technologies in your
own organization, please do not hesitate to contact me. I would be glad to help
you navigate this exciting frontier and position your institution for success in the
quantum era.