
arXiv:1204.5242v2 [quant-ph] 3 Jul 2012
Quantum Data-Fitting

Nathan Wiebe¹, Daniel Braun²,³, and Seth Lloyd⁴

¹Institute for Quantum Computing, Waterloo, ON, Canada
²Université de Toulouse, UPS, Laboratoire de Physique Théorique (IRSAMC), F-31062 Toulouse, France
³CNRS, LPT (IRSAMC), F-31062 Toulouse, France and
⁴MIT - Research Laboratory for Electronics and Department of Mechanical Engineering, Cambridge, MA 02139, USA
We provide a new quantum algorithm that efficiently determines the quality of a least-squares fit over an exponentially large data set by building upon an algorithm for solving systems of linear equations efficiently (Harrow et al., Phys. Rev. Lett. 103, 150502 (2009)). In many cases, our algorithm can also efficiently find a concise function that approximates the data to be fitted and bound the approximation error. In cases where the input data is a pure quantum state, the algorithm can be used to provide an efficient parametric estimation of the quantum state and therefore can be applied as an alternative to full quantum state tomography given a fault-tolerant quantum computer.

PACS numbers: 03.67.-a, 03.67.Ac, 42.50.Dv
Invented as early as 1794 by Carl Friedrich Gauss, fitting data to theoretical models has become over the centuries one of the most important tools in all of quantitative science [1]. Typically, a theoretical model depends on a number of parameters and leads to functional relations between data that depend on those parameters. Fitting a large amount of experimental data to the functional relations allows one to obtain reliable estimates of the parameters. If the amount of data becomes very large, fitting can become very costly. Examples include inversion problems of X-ray or neutron scattering data for structure analysis, or high-energy physics with gigabytes of data produced per second at the LHC. Typically, structure analysis starts from a first guess of the structure, and then iteratively tries to improve the fit to the experimental data by testing variations of the structure. It is therefore often desirable to test many different models, and to compare the best possible fits they provide, before committing to one from which the parameters are then extracted. Obtaining a good fit with a number of parameters that is small relative to the amount of data can be considered a form of data compression. Also for numerically calculated data, such as many-body wave-functions in molecular engineering, efficient fitting of the wave-functions to simpler models would be highly desirable.
With the rise of quantum information theory, one might wonder if a quantum algorithm can be found that solves these problems efficiently. The discovery that exploiting quantum mechanical effects might lead to enhanced computational power compared to classical information processing has triggered large-scale research aimed at finding quantum algorithms that are more efficient than the best classical counterparts [2-7]. Although fault-tolerant quantum computation remains out of reach at present, quantum simulation is already now on the verge of providing answers to questions concerning the states of complex systems that are beyond classical computability [8, 9]. Recently, a quantum algorithm (called HHL in the following) was introduced that efficiently solves a linear equation, Fx = b, with given vector b of dimension N and sparse Hermitian matrix F [10]. Efficient solution means that the expectation value ⟨x|M|x⟩ of an arbitrary poly-size Hermitian operator M can be found in roughly O(s⁴κ² log(N)/ε) steps [11], where κ is the condition number of F, i.e. the ratio between the largest and smallest eigenvalue of F, s denotes the sparseness (i.e. the maximum number of non-zero matrix elements of F in any given row or column), and ε is the maximum allowed distance between the |x⟩ found by the computer and the exact solution. In contrast, Harrow et al. show that it is unlikely that classical computers can efficiently solve similar problems, because an efficient classical solution would imply that quantum computers are no more powerful than classical computers.
While it has so far remained unclear whether expectation values of the form ⟨x|M|x⟩ provide answers to computationally important questions, we provide here an adaptation of the algorithm to the problem of data fitting that allows one to efficiently obtain the quality of a fit without having to learn the fit parameters. Our algorithm is particularly useful for fitting data efficiently computed by a quantum computer or quantum simulator, especially if an evolution can be efficiently simulated but no known method exists to efficiently learn the resultant state. For example, our algorithm could be used to efficiently find a concise matrix-product state approximation to a ground state yielded by a quantum many-body simulator and to assess the approximation error. More complicated states can be used in the fit if the quantum computer can efficiently prepare them. Fitting quantum states to a set of known functions is an interesting alternative to performing full quantum-state tomography [12].
Least-squares fitting. The goal in least-squares fitting is to find a simple continuous function that well approximates a discrete set of N points {x_i, y_i}. The function is constrained to be linear in the fit parameters λ ∈ C^M, but it can be non-linear in x. For simplicity we consider x ∈ C, but the generalization to higher dimensional x is straightforward. Our fit function is then of the form

    f(x, λ) := Σ_{j=1}^{M} f_j(x) λ_j,

where λ_j is a component of λ and f(x, λ) : C^{M+1} → C.
The optimal fit parameters λ can be found by minimizing

    E = Σ_{i=1}^{N} |f(x_i, λ) − y_i|² = |Fλ − y|²   (1)

over all λ, where we have defined the N × M matrix F through F_ij = f_j(x_i), F^t is its transpose, and y denotes the column vector (y_1, . . . , y_N)^t. Also, following HHL, we assume without loss of generality that 1/κ² ≤ ‖F†F‖ ≤ 1 and 1/κ² ≤ ‖FF†‖ ≤ 1 [10]. Throughout this Letter we use ‖·‖ to denote the spectral norm.

Given that F†F is invertible, the fit parameters λ that give the least-squares error are found by applying the Moore-Penrose pseudoinverse [13] of F, F⁺, to y:

    λ = F⁺y = (F†F)⁻¹F†y.   (2)

A proof that (2) gives an optimal λ for a least-squares fit is given in the appendix.
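As a purely classical point of reference, the following NumPy sketch evaluates Eqs. (1) and (2) directly; the quadratic fit functions and noisy data are illustrative choices of ours, not part of the paper's algorithm.

import numpy as np

# Illustrative fit functions f_j(x); any functions linear in the parameters work.
fit_functions = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 50)
y = 0.5 - 1.0 * x + 2.0 * x**2 + 0.05 * rng.standard_normal(50)

# Design matrix F with F_ij = f_j(x_i).
F = np.column_stack([f(x) for f in fit_functions])

# Eq. (2): lambda = F^+ y = (F^dag F)^{-1} F^dag y.
lam = np.linalg.pinv(F) @ y

# Eq. (1): the least-squares error E = |F lambda - y|^2.
E = np.linalg.norm(F @ lam - y) ** 2
print(lam, E)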
The algorithm consists of three subroutines: a quantum algorithm for performing the pseudoinverse, an algorithm for estimating the fit quality, and an algorithm for learning the fit parameters λ.

1. Fitting Algorithm. Our algorithm uses a quantum computer and oracles that output quantum states that encode the matrix elements of F to approximately prepare F⁺y. The matrix multiplications, and inversions, are implemented using an improved version of the HHL algorithm [10] that utilizes recent developments in quantum simulation algorithms.
Input: A quantum state |y⟩ = Σ_{p=M+1}^{M+N} y_p|p⟩/|y| that stores the data y, an upper bound (denoted κ) for the square roots of the condition numbers of FF† and F†F, the sparseness of F (denoted s), and an error tolerance ε.

Output: A quantum state |λ⟩ that is approximately proportional to the optimal fit parameters λ/|λ| up to error ε as measured by the Euclidean norm.

Computational Model: We have a universal quantum computer equipped with oracles that, when queried about a non-zero matrix element in a given row, yield a quantum state that encodes a requested bit of a binary encoding of the column number or value of a non-zero matrix element of F, in a manner similar to those in [14]. We also assume a quantum black box is provided that yields copies of the input state |y⟩ on demand.

Query Complexity: The number of oracle queries used is

    Õ(log(N)(s³κ⁶)/ε),   (3)

where the Õ notation implies an upper bound on the scaling of a function, suppressing all sub-polynomial functions. Alternatively, the simulation method of [15, 16] can be used to achieve a query complexity of

    Õ(log(N)(sκ⁶)/ε²).   (4)
Analysis of Algorithm. The operators F and F† are implemented using an isometry super-operator I to represent them as Hermitian operators on C^{N+M}. The isometry has the following action on a matrix X:

    I : X ↦ ( 0    X )
            ( X†   0 ).   (5)

These choices are convenient because I(F†)|y⟩ contains F†y/|y| in its first M entries. We also assume for simplicity that |I(F†)|y⟩| = 1. This can easily be relaxed by dividing I(F†)|y⟩ by |F†y|.
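A minimal numerical check of the embedding (5), using a random F and y of our own choosing; the helper name embed is ours, not the paper's.

import numpy as np

def embed(X):
    # Isometry (5): map X to the Hermitian block matrix [[0, X], [X^dag, 0]].
    r, c = X.shape
    return np.block([[np.zeros((r, r)), X],
                     [X.conj().T, np.zeros((c, c))]])

rng = np.random.default_rng(1)
N, M = 8, 3
F = rng.standard_normal((N, M))
y = rng.standard_normal(N)

# Store y in the last N components, as in the input state |y>.
y_embedded = np.concatenate([np.zeros(M), y])

# I(F^dag) applied to the embedded y carries F^dag y in its first M entries.
out = embed(F.conj().T) @ y_embedded
assert np.allclose(out[:M], F.conj().T @ y)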
Preparing I(F†)|y⟩. The next step is to prepare the state I(F†)|y⟩. This is not straightforward because I(F†) is a Hermitian, rather than unitary, operator. We implement the Hermitian operator using the same phase-estimation trick that HHL use to enact the inverse of a Hermitian operator, but instead of dividing each eigenstate by its eigenvalue we multiply each eigenstate by its eigenvalue. We describe the relevant steps below. For more details, see [10].

The algorithm first prepares an ancilla state, for a large integer T that is of order N,

    |Ψ₀⟩ = √(2/T) Σ_{τ=0}^{T−1} sin(π(τ + 1/2)/T) |τ⟩|y⟩.   (6)
It then maps |Ψ₀⟩ to

    √(2/T) Σ_{τ=0}^{T−1} sin(π(τ + 1/2)/T) |τ⟩ e^{iI(F†)τt₀/T}|y⟩,   (7)
for t₀ ∈ O(κ/ε). We know from work on quantum simulation that exp(iI(F†)τt₀/T) can be implemented within error O(ε) in the 2-norm using Õ(log(N)s³τt₀/T) quantum operations, if F has sparseness s [17]. Alternatively, the method of [15, 16] gives query complexity Õ(log(N)sτt₀/(Tε)). If we write |y⟩ = Σ_{j=1}^{N} β_j|φ_j⟩, where the |φ_j⟩ are the eigenvectors of I(F†) with eigenvalues E_j, we obtain

    √(2/T) Σ_{j=1}^{N} Σ_{τ=0}^{T−1} β_j sin(π(τ + 1/2)/T) e^{iE_jτt₀/T} |τ⟩|φ_j⟩.   (8)
The quantum Fourier transform is then applied to the first register and, after labeling the Fourier coefficients α_{k|j}, the state becomes

    Σ_{j=1}^{N} Σ_{k=0}^{T−1} α_{k|j} β_j |k⟩|φ_j⟩.   (9)
HHL show that the Fourier coefficients α_{k|j} are small unless the eigenvalue E_j ≈ Ẽ_k := 2πk/t₀, and t₀ ∈ O(κ/ε) is needed to ensure that the error from approximating the eigenvalue is at most ε. It can be seen using the analysis in [10] that, after re-labeling |k⟩ as |Ẽ_k⟩ and taking T ∈ O(N), (9) is exponentially close to Σ_{j=1}^{N} β_j |Ẽ_j⟩|φ_j⟩.
The final step is to introduce an ancilla system and perform a controlled unitary on it that rotates the ancilla state from |0⟩ to √(1 − C²E_j²)|0⟩ + CE_j|1⟩, where C ∈ O(max_j |E_j|)⁻¹ because the state would not be properly normalized if C were larger. The probability of measuring the ancilla to be 1 is O(1/κ²) since CE_j is at least O(1/κ). O(κ²) repetitions are therefore needed to guarantee success with high probability, and amplitude amplification can be used to reduce the number of repetitions to O(κ) [10]. HHL show that either O(κ²) or O(κ) attempts are similarly needed to successfully perform I(F)⁻¹, depending on whether amplitude amplification is used.
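The non-unitary map |φ_j⟩ ↦ E_j|φ_j⟩ and its success probability can be emulated classically. The sketch below, with illustrative random matrices of our own, attaches the ancilla amplitude C·E_j to each eigencomponent and confirms that post-selection reproduces I(F†)|y⟩; it is a numerical illustration, not a quantum implementation.

import numpy as np

rng = np.random.default_rng(2)
N, M = 8, 3
F = rng.standard_normal((N, M))
y = rng.standard_normal(N)
y /= np.linalg.norm(y)

# Hermitian embedding I(F^dag) of Eq. (5) and its spectral decomposition.
H = np.block([[np.zeros((M, M)), F.conj().T],
              [F, np.zeros((N, N))]])
E, phi = np.linalg.eigh(H)

# Amplitudes beta_j of the embedded |y> in the eigenbasis {|phi_j>}.
y_emb = np.concatenate([np.zeros(M), y])
beta = phi.conj().T @ y_emb

# Ancilla rotation |0> -> sqrt(1 - C^2 E_j^2)|0> + C E_j|1>, with C = 1/max|E_j|.
C = 1.0 / np.max(np.abs(E))
p_success = np.sum(np.abs(C * E * beta) ** 2)  # Pr(ancilla = 1), an O(1/kappa^2) quantity

# Post-selecting outcome 1 leaves amplitudes ~ E_j beta_j, i.e. the
# (unnormalized) target state I(F^dag)|y>.
post = phi @ (E * beta)
assert np.allclose(post, H @ y_emb)
print("success probability:", p_success)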
The cost of implementing I(F†) is the product of the cost of simulating I(F†) for time κ/ε and the number of repetitions required to obtain a successful result, which scales as O(κ). The improved simulation method of Childs and Kothari [17] allows the simulation to be performed in time Õ(log(N)s³κ/ε), where s is the sparseness of F; therefore, I(F†)|y⟩ can be prepared using Õ(log(N)s³κ²/ε) oracle calls. The cost of performing the inversion using the simulation method of [15, 16] is found by substituting s → s^{1/3}/ε^{1/3} into this or any of our subsequent results.
Inverting F†F. We then finish the algorithm by applying (F†F)⁻¹ using the method of HHL [10]. Note that the existence of (F†F)⁻¹ is implied by a well-defined fitting problem, in the sense that a zero eigenvalue of F†F would result in a degenerate direction of the quadratic form (1). The operator F†F ∈ C^{M×M} is Hermitian and hence amenable to the linear-systems algorithm. We do, however, need to extend the domain of the operator to make it compatible with |y⟩, which is in a Hilbert space of dimension N + M. We introduce A to denote the corresponding operator,

    A := ( F†F   0  )
         (  0   FF† ) = I(F)².   (10)

If we define |λ⟩ ∈ C^{N+M} to be a state of the form |λ⟩ = Σ_{j=1}^{M} λ_j|j⟩ up to a normalizing constant, then F†Fλ is proportional to A|λ⟩ up to a normalizing constant. This means that we can find a vector that is proportional to the least-squares fit parameters by inversion via

    |λ⟩ = A⁻¹I(F†)|y⟩.   (11)

This can be further simplified by noting that

    A⁻¹ = I(F)⁻².   (12)
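Equations (10)-(12) can be verified numerically in a few lines. The sketch below (again with illustrative matrices of our own) checks that I(F)² is block diagonal with blocks F†F and FF†, and that inverting it on I(F†)|y⟩ recovers the least-squares parameters of Eq. (2); a pseudoinverse stands in for A⁻¹, since FF† is strictly singular when N > M.

import numpy as np

rng = np.random.default_rng(3)
N, M = 8, 3
F = rng.standard_normal((N, M))
y = rng.standard_normal(N)

# Hermitian embedding, ordered so the fit parameters occupy the first M entries.
I_F = np.block([[np.zeros((M, M)), F.conj().T],
                [F, np.zeros((N, N))]])

# Eq. (10): A = I(F)^2 is block diagonal with blocks F^dag F and F F^dag.
A = I_F @ I_F
assert np.allclose(A[:M, :M], F.conj().T @ F)
assert np.allclose(A[M:, M:], F @ F.conj().T)

# Eqs. (11)-(12): |lambda> = A^{-1} I(F^dag)|y>, realized via a pseudoinverse.
y_emb = np.concatenate([np.zeros(M), y])
lam_emb = np.linalg.pinv(A) @ (I_F @ y_emb)
assert np.allclose(lam_emb[:M], np.linalg.pinv(F) @ y)  # matches Eq. (2)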
Amplitude amplification does not decrease the number of attempts needed to implement A⁻¹ in (11), because the algorithm requires reflections about I(F†)|y⟩, a state that requires O(κ) repetitions to prepare.

Since amplitude amplification provides no benefit for implementing A⁻¹, O(κ⁵) repetitions are needed to implement A⁻¹I(F†). This is a consequence of the fact that the probability of successfully performing each I(F)⁻¹ is O(1/κ²) and the probability of performing I(F†) is O(1/κ) (if amplitude amplification is used). The cost of performing the simulations involved in each attempt is Õ(log(N)s³κ/ε), and hence the required number of oracle calls scales as

    Õ(log(N)(s³κ⁶/ε)).   (13)
Although the algorithm yields |λ⟩ efficiently, it may be exponentially expensive to learn |λ⟩ via tomography; however, we show below that a quantum computer can assess the quality of the fit efficiently.

2. Estimating Fit Quality. We will now show that we can efficiently estimate the fit quality E, even if M is exponentially large, and without having to determine the fit parameters. For this problem, note that due to the isometry (5), E = ‖|y⟩ − I(F)|λ⟩‖². We assume the prior computational model. We are also provided a desired error tolerance, δ, and wish to determine the quality of the fit within error δ.
Input: A constant δ > 0 and all inputs required by algorithm 1.

Output: An estimate of |⟨y|I(F)|λ⟩|² accurate within error δ.

Query Complexity:

    Õ(log(N)s³κ⁶/(εδ²)).   (14)
Algorithm. We begin by preparing the state |y⟩ ⊗ |y⟩ using the provided state-preparation black box. We then use the prior algorithm to construct the state

    I(F)A⁻¹I(F†)|y⟩ ⊗ |y⟩ = I(F)⁻¹I(F†)|y⟩ ⊗ |y⟩,   (15)

within error O(ε). The cost of implementing I(F)⁻¹I(F†) (with high probability) within error ε is

    Õ(log(N)s³κ⁶/ε).   (16)

The swap test [18] is then used to determine the accuracy of the fit. The swap test is a method that can be used to distinguish |y⟩ and I(F)|λ⟩ by performing a swap operation on the two quantum states, controlled by a qubit in the state (|0⟩ + |1⟩)/√2. The Hadamard operation is then applied to the control qubit, and the control qubit is then measured in the computational basis. The test concludes that the states are different if the outcome is 1. The probability of observing an outcome of 1 is (1 − |⟨y|I(F)|λ⟩|²)/2 for our problem.
The overlap between the two quantum states can be learned by statistically sampling the outcomes from many instances of the swap test. The value of |⟨y|I(F)|λ⟩|² can be approximated using the sample mean of this distribution. It follows from estimates of the standard deviation of the mean that O(1/δ²) samples are required to estimate the mean within error O(δ). The cost of algorithm 2 is then found by multiplying (16) by 1/δ².
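A Monte Carlo caricature of this estimation step: each swap test is a Bernoulli trial with success probability (1 − |⟨y|I(F)|λ⟩|²)/2, so averaging O(1/δ²) outcomes recovers the overlap to within roughly δ. The overlap value below is an arbitrary illustration, not a quantity from the paper.

import numpy as np

rng = np.random.default_rng(4)
overlap_sq = 0.9   # illustrative stand-in for |<y|I(F)|lambda>|^2
delta = 0.01

# O(1/delta^2) swap tests; each reports 1 with probability (1 - overlap_sq)/2.
n_samples = int(1 / delta**2)
ones = rng.random(n_samples) < (1 - overlap_sq) / 2

# Invert the relation: estimated overlap = 1 - 2 * Pr(outcome = 1).
estimate = 1 - 2 * ones.mean()
print(abs(estimate - overlap_sq))  # typically O(delta)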
The quantity E can be estimated from the output of algorithm 2 via E ≤ 2(1 − |⟨y|I(F)|λ⟩|). Taylor series analysis shows that the error in this upper bound for E is also O(δ).
There are several important limitations to this technique. First, if F is not sparse (meaning s ∈ O(poly(N))), then the algorithm may not be efficient, because the quantum simulation step used in the algorithm may not be efficient. As noted in previous results [14-16], we can generalize our results to systems where F is non-sparse if there exists a set of efficient unitary transformations U_j such that I(F) = Σ_j U_j H_j U_j†, where each H_j is sparse and Hermitian. Also, in many important cases (such as fitting to experimental data) it may not be possible to prepare the initial state |y⟩ efficiently. For this reason, our algorithm is better suited to approximating the output of quantum devices than the classical outputs of experiments. Finally, algorithm 2 only provides an efficient estimate of the fit quality and does not provide λ; however, we can use it to determine whether a quantum state has a concise representation within a family of states. If algorithm 2 can be used to find such a representation, then the parameters |λ⟩ can be learned via state tomography. We discuss this approach below.
3. Learning λ. This method can also be used to find a concise fit function that approximates y. Specifically, we use statistical sampling and quantum state tomography to find a concise representation for the quantum state using M′ parameters. The resulting algorithm is efficient if M′ ∈ O(polylog(N)).
Input: As algorithm 2, but in addition an integer M′ ∈ O(polylog(M)) that gives the maximum number of fit functions allowed in the fit.

Output: A classical bit string approximating |λ⟩ to precision δ, a list of the M′ fit functions that comprise |λ⟩, and |⟨y|I(F)|λ⟩|² calculated to precision δ.

Computational Model: As algorithm 1, but the oracles can be controlled to either fit the state to all M fit functions or to any subset consisting of M′ fit functions.

Query Complexity:

    Õ(log(N)s³κ⁶(1 + M′²)/(εδ²)).
Algorithm. The first step of the algorithm is to prepare the state |λ⟩ using algorithm 1. The state is then measured O(M′) times and a histogram of the measurement outcomes is constructed. Since the probability of measuring each of these outcomes is proportional to their relevance to the fit, we are likely to find the M′ most likely outcomes by sampling the state O(M′) times.
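The histogram step has a simple classical analogue, sketched here with made-up amplitudes of our own: sample indices with probability |λ_j|²/|λ|² and keep the M′ most frequent.

import numpy as np
from collections import Counter

rng = np.random.default_rng(5)
lam = rng.standard_normal(16)   # illustrative fit-parameter vector
probs = np.abs(lam) ** 2 / np.sum(np.abs(lam) ** 2)

M_prime = 4
shots = rng.choice(len(lam), size=10 * M_prime, p=probs)  # O(M') measurements

# Keep the M' most frequently observed fit functions.
kept = sorted(j for j, _ in Counter(shots).most_common(M_prime))
print(kept)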
After choosing the M′ most significant fit functions, we remove all other fit functions from the fit and prepare the state |λ⟩ using the reduced set of fit functions. Compressed sensing [19-21] is then used to reconstruct |λ⟩ within O(δ) error. The idea of compressed sensing is that a low-rank density matrix can be uniquely determined (with high probability) by a small number of randomly chosen measurements. A convex optimization routine is then used to reconstruct the density matrix from the expectation values found for each of the measurements.

Compressed sensing requires O(M′ log(M′)²) measurement settings to reconstruct pure states, and observation 1 of [19] implies that O(M′/δ²) measurements are needed for each setting to ensure that the reconstruction error is O(δ); therefore, O(M′² log(M′)²/δ²) measurements are needed to approximate the state within error O(δ). The total cost of learning |λ⟩ is the number of measurements needed for tomography multiplied by the cost of preparing the state, and thus scales as

    Õ(log(N)s³κ⁶M′²/(εδ²)),   (17)

which subsumes the cost of measuring |λ⟩ to find the most significant M′ fit functions.

Finally, we measure the quality of the fit using algorithm 2. The total cost of estimating |λ⟩ and the fit quality is thus the sum of (17) and (16), as claimed.
Remark: The quality of the fit yielded by this algorithm depends strongly on the set of fit functions that are used. If the fit functions are chosen well, fewer than M′ fit functions are needed to estimate |y⟩ with high fidelity. Conversely, O(N) fit functions may be needed to achieve the desired error tolerance if the fit functions are chosen poorly. Fortunately, the efficiency of algorithm 2 allows the user to search many sets of possible fit functions for a concise and accurate model within a large set of potential models.
Acknowledgements: DB would like to thank the Joint Quantum Institute (NIST and University of Maryland) and the Institute for Quantum Computing (University of Waterloo) for hospitality, and Aram Harrow and Avinatan Hassidim for useful correspondence. NW would like to thank Andrew Childs and Dominic Berry for useful feedback and acknowledges support from USARO/DTO. SL is supported by NSF, DARPA, ENI and ISI.
Appendix A: Moore-Penrose Pseudoinverse

Here we review an elementary proof [23] of why applying the Moore-Penrose pseudoinverse to the complex-valued vector y yields parameters that minimize the least-squares fit of the initial state. To begin, we need to prove some properties of the pseudoinverse. First,

    (FF⁺)† = FF⁺.   (A1)

The proof of this property is

    (FF⁺)† = (F(F†F)⁻¹F†)† = F((F†F)⁻¹)†F†.   (A2)

The result of (A1) then follows by noting that F†F is self-adjoint.
Next, we need the property that

    FF⁺F = F.   (A3)

This property follows directly from substituting the definition of F⁺ into the expression.
The final property we need is

    F†(FF⁺y − y) = 0.   (A4)

Using property (A1) we find that

    F†(FF⁺y − y) = (FF⁺F)†y − F†y.   (A5)

Property (A3) then implies that

    (FF⁺F)†y − F†y = F†y − F†y = 0.   (A6)
For simplicity, we will express z = F⁺y and then find

    |Fλ − y|² = |Fz − y + (Fλ − Fz)|².   (A7)

Expanding this relation yields

    |Fλ − y|² = |Fz − y|² + |F(λ − z)|² + (Fz − y)†F(λ − z) + (λ − z)†F†(Fz − y).   (A8)

Property (A4) then implies that F†(Fz − y) = 0 and hence

    |Fλ − y|² = |Fz − y|² + |F(λ − z)|² ≥ |FF⁺y − y|²,   (A9)

which holds with equality if λ = z = F⁺y. Therefore, applying the Moore-Penrose pseudoinverse to y provides fit parameters that minimize the least-squares error.
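As a numerical sanity check of (A9), the following snippet (with random illustrative data) confirms that no parameter vector achieves a smaller residual than z = F⁺y.

import numpy as np

rng = np.random.default_rng(6)
F = rng.standard_normal((20, 5))
y = rng.standard_normal(20)

z = np.linalg.pinv(F) @ y              # z = F^+ y, the least-squares optimum
best = np.linalg.norm(F @ z - y) ** 2  # ||F F^+ y - y||^2

# (A9): every other choice of lambda does at least as badly.
for _ in range(1000):
    lam = z + rng.standard_normal(5)
    assert np.linalg.norm(F @ lam - y) ** 2 >= best - 1e-9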
[1] O. Bretscher, Linear Algebra with Applications, 3rd ed. (Prentice Hall, Upper Saddle River, NJ, 1995).
[2] P. W. Shor, in Proc. 35th Annu. Symp. Foundations of Computer Science, edited by S. Goldwasser, pp. 124-134 (IEEE Computer Society, Los Alamitos, CA, 1994).
[3] D. Simon, in Proc. 35th Annu. Symp. Foundations of Computer Science (IEEE Computer Society, Los Alamitos, CA, 1994).
[4] L. K. Grover, Phys. Rev. Lett. 79, 325 (1997).
[5] W. van Dam and S. Hallgren, Efficient quantum algorithms for shifted quadratic character problems, arXiv:quant-ph/0011067v2.
[6] D. Aharonov, Z. Landau, and J. Makowsky, arXiv:quant-ph/0611156.
[7] A. M. Childs, R. Cleve, E. Deotto, E. Farhi, S. Gutmann, and D. A. Spielman, in Proc. 35th ACM Symposium on Theory of Computing (STOC 2003), pp. 59-68 (2003).
[8] J. T. Barreiro, M. Müller, P. Schindler, D. Nigg, T. Monz, M. Chwalla, M. Hennrich, C. F. Roos, P. Zoller, and R. Blatt, Nature 470, 486 (2011).
[9] J. Simon, W. S. Bakr, R. Ma, M. E. Tai, P. M. Preiss, and M. Greiner, Nature 472, 307 (2011).
[10] A. W. Harrow, A. Hassidim, and S. Lloyd, Phys. Rev. Lett. 103, 150502 (2009).
[11] The result in [10] incorrectly cites the results of Theorem 2 of [22], leading to the conclusion that the cost of linear inversion scales with the sparseness as Õ(s²) rather than Õ(s⁴).
[12] Z. Hradil, Phys. Rev. A 55, R1561 (1997).
[13] A. Ben-Israel, T. N. E. Greville, J. H. Ahlberg, E. N. Nilson, and J. L. Walsh, pp. 104-106 (1974).
[14] N. Wiebe, D. W. Berry, P. Høyer, and B. C. Sanders, J. Phys. A 43, 065203 (2011).
[15] A. M. Childs, Commun. Math. Phys. 294, 581 (2009).
[16] D. W. Berry and A. M. Childs, Quantum Information and Computation 12, 29 (2012).
[17] A. Childs and R. Kothari, in Theory of Quantum Computation, Communication, and Cryptography, edited by W. van Dam, V. Kendon, and S. Severini (Springer Berlin / Heidelberg, 2011), vol. 6519 of Lecture Notes in Computer Science, pp. 94-103, ISBN 978-3-642-18072-9.
[18] H. Buhrman, R. Cleve, J. Watrous, and R. de Wolf, Phys. Rev. Lett. 87, 167902 (2001).
[19] D. Gross, Y.-K. Liu, S. T. Flammia, S. Becker, and J. Eisert, Phys. Rev. Lett. 105, 150401 (2010).
[20] A. Shabani, R. L. Kosut, M. Mohseni, H. Rabitz, M. A. Broome, M. P. Almeida, A. Fedrizzi, and A. G. White, Phys. Rev. Lett. 106, 100401 (2011).
[21] A. Shabani, M. Mohseni, S. Lloyd, R. L. Kosut, and H. Rabitz, arXiv:1002.1330 (2010).
[22] D. W. Berry, G. Ahokas, R. Cleve, and B. C. Sanders, Commun. Math. Phys. 270, 359 (2007).
[23] R. Penrose, Proceedings of the Cambridge Philosophical Society 52, 17 (1956).
