Quantum Communication
Principles of
Quantum Communication Theory:
A Modern Approach
Acknowledgements
[IN PROGRESS]
We dedicate this book to the memory of Jonathan P. Dowling. Jon was generous
and kind-hearted, and he always gave all of his students his full, unwavering support.
His tremendous impact on the lives of everyone who met him will ensure that his
memory lives on and that he will not be forgotten. We will especially remember
Jon’s humour and his sharp wit. We are sure that, as he had promised, this book
would have made the perfect doorstop for his office.
Sumeet Khatri acknowledges support from the National Science Foundation
under Grant No. 1714215 and the Natural Sciences and Engineering Research
Council of Canada postgraduate scholarship. Mark M. Wilde acknowledges support
from the National Science Foundation over the past decade (specifically from Grant
Nos. 1350397, 1714215, 1907615, 2014010), and is indebted and grateful to Patrick
Hayden for hosting him for a sabbatical at Stanford University during calendar year
2020, with support from Stanford QFARM and AFOSR (FA9550-19-1-0369).
Table of Contents
1 Introduction
I Preliminaries
2 Mathematical Tools
2.1 Finite-Dimensional Hilbert Spaces
2.2 Linear Operators
2.2.1 Tensor Product
2.2.2 Image, Kernel, and Support
2.2.3 Trace
2.2.4 Transpose and Conjugate Transpose
2.2.5 Hilbert–Schmidt Inner Product, Vectorization, and Transpose Trick
2.2.6 Notable Classes of Linear Operators
2.2.7 Singular Value, Schmidt, and Polar Decompositions
2.2.8 Spectral Theorem
2.2.9 Norms
2.2.10 Operator Inequalities
2.2.11 Superoperators
2.3 Analysis and Probability
2.3.1 Limits, Infimum, Supremum, and Continuity
2.3.2 Compact Sets
2.3.3 Convex Sets and Functions
2.3.4 Fenchel–Eggleston–Carathéodory Theorem
2.3.5 Minimax Theorems
2.3.6 Probability Distributions
2.4 Semi-Definite Programming
2.4.1 SDPs for Spectral and Trace Norm, Maximum and Minimum Eigenvalue
2.5 Symmetric Subspace
2.6 Bibliographic Notes
2.7 Problems

4.4.6 Entanglement-Breaking Channels
4.4.7 Hadamard Channels
4.4.8 Covariant Channels
4.4.9 Bipartite and Multipartite Channels
4.5 Examples of Communication Channels
4.5.1 (Generalized) Amplitude Damping Channel
4.5.2 Erasure Channel
4.5.3 Pauli Channels
4.5.4 Generalized Pauli Channels
4.6 Special Types of Channels
4.6.1 Petz Recovery Map
4.6.2 LOCC Channels
4.6.3 Completely PPT-Preserving Channels
4.6.4 Non-Signaling Channels
4.7 Summary
4.8 Bibliographic Notes
4.9 Problems

6.3 Diamond Distance
6.4 Fidelity Measures for Channels
6.5 Bibliographic Notes
Appendix 6.A SDP for Normalized Diamond Distance
Appendix 6.B SDPs for Fidelity of States and Channels
6.B.1 Proof of Proposition 6.6
6.B.2 Proof of Proposition 6.24

9.1.1 Examples
9.2 Generalized Divergence of Entanglement
9.2.1 Cone Program Formulations
9.3 Generalized Rains Divergence
9.3.1 Semi-Definite Program Formulations
9.4 Squashed Entanglement
9.5 Summary
9.6 Bibliographic Notes
Appendix 9.A Semi-Definite Programs for Negativity

11.2.1 Proof of Achievability
11.2.2 Additivity of the Sandwiched Rényi Mutual Information of a Channel
11.2.3 Proof of the Strong Converse
11.2.4 Proof of the Weak Converse
11.3 Examples
11.3.1 Covariant Channels
11.3.2 Generalized Amplitude Damping Channel
11.4 Summary
11.5 Bibliographic Notes
Appendix 11.A Proof of Theorem 11.7
Appendix 11.B The $\alpha \to 1$ Limit of the Sandwiched Rényi Mutual Information of a Channel
Appendix 11.C Achievability from a Different Point of View
Appendix 11.D Proof of Lemma 11.20
Appendix 11.E Alternate Expression for the $1 \to \alpha$ CB Norm
Appendix 11.F Proof of the Multiplicativity of the $1 \to \alpha$ CB Norm
Appendix 11.G The Strong Converse from a Different Point of View

Appendix 12.A The $\alpha \to 1$ Limit of the Sandwiched Rényi $\Upsilon$-Information of a Channel
Appendix 12.B Proof of the Additivity of $C_\beta(\mathcal{N})$

14.5 Bibliographic Notes
Appendix 14.A Alternative Notions of Quantum Communication

16.2.4 Squashed Entanglement Weak Converse Bound
16.3 Examples
16.3.1 Degradable Channels
16.3.2 Anti-Degradable Channels
16.4 Summary
16.5 Bibliographic Notes

19.1.3 Squashed Entanglement Upper Bound on the Number of Transmitted Qubits
19.2 $n$-Shot PPT-Assisted Quantum Communication Protocol
19.2.1 Rényi–Rains Information Upper Bounds on the Number of Transmitted Qubits
19.3 LOCC- and PPT-Assisted Quantum Capacities of Quantum Channels
19.4 Examples
19.5 Bibliographic Notes
Summary
Chapter 1
Introduction
[IN PROGRESS]
Part I
Preliminaries
Chapter 2
Mathematical Tools
In this chapter, we learn about the various mathematical concepts required for the
analysis of quantum communication protocols. We mostly provide a summary of
the main definitions and results needed in later chapters, and we omit several of the
proofs. For further details on the concepts presented here, as well as for proofs not
explicitly given here, please consult the Bibliographic Notes (Section 2.6) at the
end of the chapter.
Linear algebra forms the core mathematical foundation of quantum information
theory for finite-dimensional quantum systems, and thus it is worthwhile for us
to start by reviewing the basics of linear algebra, with an emphasis on linear
operators. We then proceed to give a summary of several relevant definitions
and results in real and convex analysis, probability theory, and semi-definite
programming. Concepts from real analysis play an important role in quantum
information theory. Indeed, as we discover later, the capacity of a quantum channel
is defined as a limit, which is a core notion in real analysis. Convexity plays a
prominent role as well. Not only is the set of quantum states a convex set, but
also the operator Jensen inequality, a foundational statement about operator convex
functions, is a fundamental inequality that leads to various quantum data-processing
inequalities. The latter data-processing principle is one of the central tenets of
quantum information that allows for placing limitations on the communication
capacities of quantum channels. Probability theory is essential as well, due to the
probabilistic nature of quantum mechanics and the inevitable and unpredictable
errors that occur when communicating information over quantum channels. Finally,
semi-definite programming is remarkably useful, both as an analytical tool
and for numerically calculating quantities of interest. Semi-definite
programming has also played a pivotal role in many of the substantive advances that
have taken place in quantum information theory during the past several decades,
and so it has become one of the standard tools in the quantum information theorist’s
toolkit.
for all |𝜑⟩, |𝜓⟩ ∈ H, and 𝑈 is called an isomorphism. For the finite-dimensional case
of interest for us, 𝑈 is a unitary operator (discussed in more detail in Section 2.2.6).
Note that $\mathbb{C}^d$ is the vector space of $d$-dimensional column vectors with elements
in $\mathbb{C}$. We let $\{|i\rangle\}_{i=0}^{d-1}$ denote an orthonormal basis, called the standard basis or
computational basis, for the Hilbert space with respect to the Euclidean inner
product. The vector $|i\rangle$ is defined to be a column vector with its $(i+1)$th entry equal
to one and all others equal to zero, so that
\[
|0\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad
|1\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad
|d-1\rangle = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}. \tag{2.1.3}
\]
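The inner product $\langle i|j\rangle$ evaluates to $\langle i|j\rangle = \delta_{i,j}$ for all $i, j \in \{0, \ldots, d-1\}$, where
the Kronecker delta function is defined as
\[
\delta_{i,j} \coloneqq \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j. \end{cases} \tag{2.1.4}
\]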
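More generally, for two vectors $|\psi\rangle = \sum_{i=0}^{d-1} \alpha_i |i\rangle$ and $|\phi\rangle = \sum_{i=0}^{d-1} \beta_i |i\rangle$, with
$\alpha_i = \langle i|\psi\rangle \in \mathbb{C}$ and $\beta_i = \langle i|\phi\rangle \in \mathbb{C}$ being the respective components of $|\psi\rangle$ and $|\phi\rangle$
in the standard basis, the inner product $\langle\psi|\phi\rangle$ is defined as
\[
\langle\psi|\phi\rangle \coloneqq \sum_{i=0}^{d-1} \overline{\alpha_i}\, \beta_i, \tag{2.1.5}
\]
where $\overline{\alpha_i}$ denotes the complex conjugate of $\alpha_i$.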
The Euclidean norm, denoted by $\| |\psi\rangle \|_2$, of a vector $|\psi\rangle \in \mathcal{H}$ is the norm induced
by the inner product, i.e.,
\[
\| |\psi\rangle \|_2 \coloneqq \sqrt{\langle\psi|\psi\rangle}. \tag{2.1.6}
\]
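As a small numerical illustration of (2.1.4)–(2.1.6), the following Python sketch computes inner products and Euclidean norms of vectors given by their components in the standard basis. The function names `inner`, `norm2`, and `basis` are illustrative choices rather than notation from the text, and the sketch assumes the usual convention that the inner product is conjugate-linear in its first argument.

```python
import math

def basis(i, d):
    """Standard basis vector |i> of C^d as a list (entries indexed from 0)."""
    return [1 if k == i else 0 for k in range(d)]

def inner(psi, phi):
    """Inner product <psi|phi>; the components of psi are complex-conjugated."""
    if len(psi) != len(phi):
        raise ValueError("vectors must have the same dimension")
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def norm2(psi):
    """Euclidean norm induced by the inner product."""
    return math.sqrt(inner(psi, psi).real)

# Orthonormality of the standard basis: <i|j> = delta_{i,j}.
print(inner(basis(0, 3), basis(1, 3)), inner(basis(2, 3), basis(2, 3)))
# A vector with a complex component.
print(norm2([3, 4j]))
```

Without the conjugation in `inner`, the quantity `inner(psi, psi)` would not be real and non-negative in general, so it could not induce a norm.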
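\[
|i\rangle \otimes |j\rangle = |0\rangle \otimes |2\rangle
= \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 \cdot \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \\[1ex] 0 \cdot \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}. \tag{2.1.8}
\]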
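More generally, for vectors $|\psi\rangle_A = \sum_{i=0}^{d_A-1} \alpha_i |i\rangle_A$ and $|\phi\rangle_B = \sum_{j=0}^{d_B-1} \beta_j |j\rangle_B$, the
tensor-product vector $|\psi\rangle_A \otimes |\phi\rangle_B$ is given by
\begin{align}
|\psi\rangle_A \otimes |\phi\rangle_B &= \sum_{i=0}^{d_A-1} \alpha_i |i\rangle_A \otimes |\phi\rangle_B \tag{2.1.9} \\
&= \sum_{i=0}^{d_A-1} \sum_{j=0}^{d_B-1} \alpha_i \beta_j\, |i\rangle_A \otimes |j\rangle_B.
\end{align}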
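\[
|\psi\rangle_A \otimes |\phi\rangle_B
= \begin{pmatrix} \alpha_0 \\ \alpha_1 \end{pmatrix} \otimes \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}
= \begin{pmatrix} \alpha_0 \cdot \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} \\[1ex] \alpha_1 \cdot \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} \end{pmatrix}
= \begin{pmatrix} \alpha_0 \beta_0 \\ \alpha_0 \beta_1 \\ \alpha_0 \beta_2 \\ \alpha_1 \beta_0 \\ \alpha_1 \beta_1 \\ \alpha_1 \beta_2 \end{pmatrix}. \tag{2.1.11}
\]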
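The block structure in (2.1.8) and (2.1.11) can be reproduced with a short Python sketch of the tensor (Kronecker) product of two column vectors. The helper names `kron` and `basis` are hypothetical and are used here only for illustration.

```python
def basis(i, d):
    """Standard basis vector |i> of C^d as a list."""
    return [1 if k == i else 0 for k in range(d)]

def kron(psi, phi):
    """Tensor product |psi> (x) |phi>: each entry of psi scales a full copy of phi."""
    return [a * b for a in psi for b in phi]

# The example in (2.1.8): |0> (x) |2> with d_A = 2 and d_B = 3.
print(kron(basis(0, 2), basis(2, 3)))

# The pattern in (2.1.11) with concrete numbers standing in for alpha_i and beta_j.
print(kron([2, 3], [5, 7, 11]))
```

In this ordering, the component of $|i\rangle \otimes |j\rangle$ equal to one sits at zero-based position $i \cdot d_B + j$, which is why the first factor labels the block and the second labels the position within the block.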
where {|0⟩, |1⟩} is the standard basis for a two-dimensional Hilbert space.
Exercise 2.1
Verify (2.1.15).
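If $\{|i\rangle_A\}_{i=0}^{d_A-1}$ and $\{|j\rangle_B\}_{j=0}^{d_B-1}$ are orthonormal bases for $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively,
then
\[
\left\{ \begin{pmatrix} |i\rangle_A \\ 0 \end{pmatrix} : 0 \leq i \leq d_A - 1 \right\} \cup \left\{ \begin{pmatrix} 0 \\ |j\rangle_B \end{pmatrix} : 0 \leq j \leq d_B - 1 \right\} \tag{2.1.16}
\]
is an orthonormal basis for $\mathcal{H}_A \oplus \mathcal{H}_B$ under the inner product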
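\[
|i\rangle_X \otimes |j\rangle_A \leftrightarrow \begin{pmatrix} 0 \\ \vdots \\ 0 \\ |j\rangle_A \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \tag{2.1.18}
\]
holding for all $0 \leq i \leq k-1$ and all $0 \leq j \leq d-1$, where on the right-hand side
there is a one in the $(i \cdot d + j + 1)$th entry of the column vector and zeros elsewhere.
Then, for an element $|\psi_0\rangle_A \oplus |\psi_1\rangle_A \oplus \cdots \oplus |\psi_{k-1}\rangle_A \in \mathcal{H}_A^{\oplus k}$, we have
\[
\begin{pmatrix} |\psi_0\rangle_A \\ |\psi_1\rangle_A \\ \vdots \\ |\psi_{k-1}\rangle_A \end{pmatrix} \leftrightarrow \sum_{i=0}^{k-1} |i\rangle_X \otimes |\psi_i\rangle_A. \tag{2.1.19}
\]
The isomorphism between $\mathcal{H}^{\oplus k}$ and $\mathbb{C}^k \otimes \mathcal{H}$ given by (2.1.18) and (2.1.19) is
relevant in the context of superpositions of quantum states and entanglement.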
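The correspondence in (2.1.18) and (2.1.19) can be checked on concrete vectors. The Python sketch below stacks $k$ vectors of dimension $d$ into one column (the direct sum) and compares the result with $\sum_i |i\rangle_X \otimes |\psi_i\rangle_A$. The function names are hypothetical, and the sketch assumes the standard-basis column-vector representation used above.

```python
def basis(i, d):
    """Standard basis vector |i> of C^d as a list."""
    return [1 if k == i else 0 for k in range(d)]

def kron(psi, phi):
    """Tensor product of two column vectors."""
    return [a * b for a in psi for b in phi]

def direct_sum(blocks):
    """Direct sum |psi_0> (+) ... (+) |psi_{k-1}>: stack the blocks in one column."""
    return [x for block in blocks for x in block]

def tensor_form(blocks):
    """The vector sum_i |i>_X (x) |psi_i>_A from the right-hand side of (2.1.19)."""
    k, d = len(blocks), len(blocks[0])
    total = [0] * (k * d)
    for i, block in enumerate(blocks):
        term = kron(basis(i, k), block)
        total = [t + x for t, x in zip(total, term)]
    return total

blocks = [[1, 2], [3, 4], [5, 6]]  # k = 3 vectors in C^2
print(direct_sum(blocks))
print(tensor_form(blocks))
```

Both calls print the same list, which is the content of the isomorphism: the label $i$ of the register $X$ selects which block of the stacked column is occupied.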
for all 𝛼, 𝛽 ∈ C and |𝜓⟩ 𝐴 , |𝜙⟩ 𝐴 ∈ H 𝐴 . For clarity, we sometimes write 𝑋 𝐴→𝐵 to
explicitly indicate the input and output Hilbert spaces of the linear operator 𝑋.
We use 1 to denote the identity operator, which is defined as the unique linear
operator such that 1 |𝜓⟩ = |𝜓⟩ for every vector |𝜓⟩. For clarity, when needed, we
write 1𝑑 to indicate the identity operator acting on a 𝑑-dimensional Hilbert space.
Exercise 2.2
Given an orthonormal basis $\{|e_k\rangle\}_{k=1}^{d}$ for a $d$-dimensional Hilbert space, prove
that
\[
\mathbb{1}_d = \sum_{k=1}^{d} |e_k\rangle\langle e_k|. \tag{2.2.2}
\]
By applying (2.1.3), we see that the operator $|i\rangle_B \langle j|_A$ has a matrix representation
as a $d_B \times d_A$ matrix with the $(i+1, j+1)$th entry equal to one and all other entries
equal to zero:
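\[
|0\rangle_B \langle 0|_A = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}, \quad
|0\rangle_B \langle 1|_A = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}, \quad \ldots, \quad
|d_B - 1\rangle_B \langle d_A - 1|_A = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. \tag{2.2.4}
\]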
Using this basis, we can write a linear operator $X \in \mathrm{L}(\mathcal{H}_A, \mathcal{H}_B)$ as
\[
X_{A \to B} = \sum_{i=0}^{d_B-1} \sum_{j=0}^{d_A-1} X_{i,j}\, |i\rangle_B \langle j|_A, \tag{2.2.5}
\]
\[
X = \begin{pmatrix} X_{0,0} & X_{0,1} \\ X_{1,0} & X_{1,1} \\ X_{2,0} & X_{2,1} \end{pmatrix}. \tag{2.2.10}
\]
Exercise 2.3
Show that every linear operator $X \in \mathrm{L}(\mathcal{H}_A, \mathcal{H}_B)$, expressed as in (2.2.5), can
be written as
\[
X_{A \to B} = \sum_{i=0}^{d_B-1} |i\rangle_B \langle \psi_i|_A = \sum_{j=0}^{d_A-1} |\phi_j\rangle_B \langle j|_A, \tag{2.2.11}
\]
where $\{\langle \psi_i|_A\}_{i=0}^{d_B-1}$ and $\{|\phi_j\rangle_B\}_{j=0}^{d_A-1}$ are the rows and columns, respectively,
of $X$.
Given two linear operators 𝑋 ∈ L(H 𝐴 , H𝐵 ) and 𝑌 ∈ L(H 𝐴′ , H𝐵′ ), their tensor
product 𝑋 ⊗ 𝑌 is a linear operator in L(H 𝐴 ⊗ H 𝐴′ , H𝐵 ⊗ H𝐵′ ) such that
and it is known as the rank-nullity theorem (the quantity dim(ker(𝑋)) is called the
nullity of 𝑋).
The support of a linear operator 𝑋 ∈ L(H 𝐴 , H𝐵 ), denoted by supp(𝑋), is
defined to be the orthogonal complement of its kernel:
Figure 2.1: Visual representation of the subspaces $\mathrm{im}(X)$, $\ker(X)$, and $\mathrm{supp}(X)$
corresponding to a linear operator $X \in \mathrm{L}(\mathcal{H}_A, \mathcal{H}_B)$. Note that only the zero
vector is contained in both $\ker(X)$ and $\mathrm{supp}(X)$.
the column vector in which all of the elements are equal to zero), which implies
that dim(ker(𝑋)) = 0.
A linear operator 𝑋 ∈ L(H 𝐴 , H𝐵 ) is called surjective (or onto) if, for all
|𝜙⟩ ∈ H𝐵 , there exists |𝜓⟩ ∈ H 𝐴 such that 𝑋 |𝜓⟩ = |𝜙⟩. A necessary and sufficient
condition for 𝑋 to be surjective is that rank(𝑋) = 𝑑 𝐵 .
Exercise 2.4
Prove that a linear operator 𝑋 ∈ L(H) with the same, finite-dimensional input
and output Hilbert space H is injective if and only if it is surjective. (Hint: use
the rank-nullity theorem in (2.2.18).)
2.2.3 Trace
Exercise 2.5
Prove that the trace of a linear operator is independent of the choice of basis
used in (2.2.20). In other words, prove that $\sum_{i=0}^{d-1} \langle i|X|i\rangle = \sum_{k=1}^{d} \langle e_k|X|e_k\rangle$ for
every orthonormal basis $\{|e_k\rangle\}_{k=1}^{d}$. (Hint: use (2.2.2).)
More generally, the cyclicity property holds for linear operators with different input
and output Hilbert spaces: for 𝑍 𝐴→𝐵 ∈ L(H 𝐴 , H𝐵 ), 𝑌𝐵→𝐶 ∈ L(H𝐵 , H𝐶 ), and
𝑋𝐶→𝐴 ∈ L(H𝐶 , H 𝐴 ),
Exercise 2.6
1. Prove the equalities in (2.2.22) and (2.2.23).
2. Prove that Tr[𝑋 ⊗ 𝑌 ] = Tr[𝑋]Tr[𝑌 ] for all 𝑋 ∈ L(H 𝐴 ) and 𝑌 ∈ L(H𝐵 ).
Note that the transpose is basis dependent, in the sense that it is defined with
respect to a particular basis (in the case above, we have defined it with respect to
the standard bases of H 𝐴 and H𝐵 ). Furthermore, taking the transpose with respect
to one orthonormal basis can lead to an operator different from that found by taking
the transpose with respect to a different orthonormal basis. In this sense, we could
more precisely refer to the operation in (2.2.24) as the “standard transpose.” The
standard transpose can also be understood as a linear superoperator (an operator on
operators) with the following representation:
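\[
\mathrm{T}(X) = \sum_{i=0}^{d_B-1} \sum_{j=0}^{d_A-1} \left(|j\rangle_A \langle i|_B\right) X \left(|j\rangle_A \langle i|_B\right) = \sum_{i=0}^{d_B-1} \sum_{j=0}^{d_A-1} X_{i,j}\, |j\rangle_A \langle i|_B. \tag{2.2.25}
\]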
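The operator-sum form of the standard transpose in (2.2.25) can be checked directly on a small matrix. The following Python sketch builds $\mathrm{T}(X)$ term by term from the operators $|j\rangle_A \langle i|_B$; the helper names `matmul`, `outer`, and `transpose_via_sum` are hypothetical, and matrices are represented as lists of rows.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def outer(i, j, rows, cols):
    """Matrix of |i><j| with the given shape: a single one at position (i, j)."""
    return [[1 if (r == i and c == j) else 0 for c in range(cols)]
            for r in range(rows)]

def transpose_via_sum(X):
    """T(X) = sum_{i,j} (|j><i|) X (|j><i|) for a d_B x d_A matrix X."""
    d_B, d_A = len(X), len(X[0])
    T = [[0] * d_B for _ in range(d_A)]
    for i in range(d_B):
        for j in range(d_A):
            E = outer(j, i, d_A, d_B)        # |j>_A <i|_B, a d_A x d_B matrix
            term = matmul(matmul(E, X), E)   # equals X[i][j] * |j><i|
            T = [[T[r][c] + term[r][c] for c in range(d_B)] for r in range(d_A)]
    return T

X = [[1, 2], [3, 4], [5, 6]]  # d_B = 3 rows, d_A = 2 columns
print(transpose_via_sum(X))
```

Each term of the sum picks out one matrix element $X_{i,j}$ and moves it to position $(j, i)$, which is exactly the second expression in (2.2.25).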
Exercise 2.7
Prove that the conjugate transpose is a basis-independent operation, i.e., that
one does not need to specify a basis in order to take the conjugate transpose of
a linear operator.
for all 𝑖 ∈ {0, . . . , 𝑑 𝐵 − 1} and 𝑗 ∈ {0, . . . , 𝑑 𝐴 − 1}. The operation vec on the right
in (2.2.31) is the “vectorize” operation, which transposes the rows of a 𝑑 𝐵 × 𝑑 𝐴
matrix with respect to the standard basis and then stacks the resulting columns
The following are useful identities involving the vec operation that we call upon
repeatedly throughout this book.
1. For every linear operator 𝑋 ∈ L(H 𝐴 , H𝐵 ),
where
\[
|\Gamma\rangle_{AA} \coloneqq \sum_{i=0}^{d_A-1} |i, i\rangle_{AA}. \tag{2.2.34}
\]
For reasons that become clear later, we refer to |Γ⟩ 𝐴𝐴 as the “maximally
entangled vector.” Note that |Γ⟩ 𝐴𝐴 = vec( 1 𝐴 ). For clarity, when needed, we
write |Γ𝑑 ⟩ to refer to the vector defined in (2.2.34) when each Hilbert space
has dimension 𝑑.
We also note that, for two vectors |𝜓⟩𝐵 ∈ H𝐵 and |𝜙⟩ 𝐴 ∈ H 𝐴 ,
Exercise 2.8
1. Prove (2.2.33).
2. Prove the equality in (2.2.35) by writing both $|\psi\rangle_B$ and $|\phi\rangle_A$ in terms
of the orthonormal bases $\{|i\rangle_B\}_{i=0}^{d_B-1}$ and $\{|j\rangle_A\}_{j=0}^{d_A-1}$, respectively, and
using (2.2.31).
2. For every vector $|\psi\rangle_{AB} \in \mathcal{H}_A \otimes \mathcal{H}_B$, there exists a linear operator $X^{\psi}_{A \to B} \in
\mathrm{L}(\mathcal{H}_A, \mathcal{H}_B)$ such that
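In particular, if $|\psi\rangle_{AB} = \sum_{i=0}^{d_A-1} \sum_{j=0}^{d_B-1} \alpha_{i,j} |i, j\rangle_{AB}$, then we can set
\[
X^{\psi}_{A \to B} = \sum_{j=0}^{d_B-1} \sum_{i=0}^{d_A-1} \alpha_{i,j}\, |j\rangle_B \langle i|_A. \tag{2.2.37}
\]
Alternatively, there exists a linear operator $Y^{\psi}_{B \to A} \in \mathrm{L}(\mathcal{H}_B, \mathcal{H}_A)$ such that
and subsequently find by inspection of (2.2.37)–(2.2.39) that $X^{\psi}_{A \to B} = \mathrm{T}(Y^{\psi}_{B \to A})$.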
3. Transpose trick: For every linear operator 𝑋 ∈ L(H 𝐴 , H𝐵 ), the following
equality holds:
Exercise 2.9
1. Prove (2.2.40).
2. Prove (2.2.41).
3. Let 𝑋 ∈ L(H 𝐴 , H𝐵 ), 𝑌 ∈ L(H𝐶 , H 𝐴 ), and 𝑍 ∈ L(H𝐶 , H𝐷 ). Prove that
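\begin{align}
\sigma^{(d)}_{0,0} &\coloneqq \frac{\mathbb{1}_d}{\sqrt{d}}, \tag{2.2.44} \\
\sigma^{(d;+)}_{k,\ell} &\coloneqq \frac{1}{\sqrt{2}} \left( |k\rangle\langle \ell| + |\ell\rangle\langle k| \right), \quad 0 \leq k < \ell \leq d-1, \tag{2.2.45} \\
\sigma^{(d;\mathrm{i})}_{k,\ell} &\coloneqq \frac{1}{\sqrt{2}} \left( -\mathrm{i} |k\rangle\langle \ell| + \mathrm{i} |\ell\rangle\langle k| \right), \quad 0 \leq k < \ell \leq d-1, \tag{2.2.46} \\
\sigma^{(d)}_{k,k} &\coloneqq \frac{1}{\sqrt{k(k+1)}} \left( \left( \sum_{j=0}^{k-1} |j\rangle\langle j| \right) - k |k\rangle\langle k| \right), \quad 1 \leq k \leq d-1. \tag{2.2.47}
\end{align}
Observe that there are $\frac{d(d-1)}{2}$ operators labeled as $\sigma^{(d;+)}_{k,\ell}$, all of which are
traceless, and $\frac{d(d-1)}{2}$ operators labeled as $\sigma^{(d;\mathrm{i})}_{k,\ell}$, which are also all traceless.
The $d-1$ operators $\sigma^{(d)}_{k,k}$ are also traceless. If we scale each of the above
operators by $\sqrt{d}$, then they are called the generalized Gell-Mann matrices.
When $d = 2$, the generalized Gell-Mann matrices reduce to the Pauli matrices:
\begin{align}
\mathbb{1} &\coloneqq \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \sqrt{2}\, \sigma^{(2)}_{0,0}, &
X &\coloneqq \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \sqrt{2}\, \sigma^{(2;+)}_{0,1}, \tag{2.2.48} \\
Y &\coloneqq \begin{pmatrix} 0 & -\mathrm{i} \\ \mathrm{i} & 0 \end{pmatrix} = \sqrt{2}\, \sigma^{(2;\mathrm{i})}_{0,1}, &
Z &\coloneqq \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \sqrt{2}\, \sigma^{(2)}_{1,1}. \tag{2.2.49}
\end{align}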
The Pauli matrices are important in the context of quantum mechanics, and
quantum information more generally, as they can be used to describe the
quantum states of two-dimensional quantum systems, as well as their evolu-
tion. They are also involved in fundamental quantum information processing
protocols such as quantum teleportation. We elaborate upon these points in
Chapters 3–5.
Exercise 2.10
Prove that the operators in (2.2.44)–(2.2.47) do indeed form an orthonormal
basis for the vector space of Hermitian operators. More generally, prove
that they form an orthonormal basis for the vector space L(C𝑑 ) of all linear
operators.
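A numerical check is no substitute for the proof requested in Exercise 2.10, but it can help catch indexing mistakes. The Python sketch below constructs the $d^2$ operators in (2.2.44)–(2.2.47) and evaluates their pairwise Hilbert–Schmidt inner products $\mathrm{Tr}[A^\dagger B]$. The function names and the ordering of the returned list are choices made for this illustration only.

```python
import math

def ket_bra(k, l, d):
    """Matrix of |k><l| on C^d as a list of rows."""
    return [[1.0 if (r == k and c == l) else 0.0 for c in range(d)] for r in range(d)]

def combine(A, B, a, b):
    """Entrywise linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def hs_inner(A, B):
    """Hilbert-Schmidt inner product Tr[A^dagger B]."""
    return sum(A[r][c].conjugate() * B[r][c]
               for r in range(len(A)) for c in range(len(A[0])))

def gell_mann_basis(d):
    """Operators (2.2.44)-(2.2.47): identity part, then (+, i) pairs, then diagonals."""
    s = 1 / math.sqrt(2)
    ops = [[[1 / math.sqrt(d) if r == c else 0.0 for c in range(d)] for r in range(d)]]
    for k in range(d):
        for l in range(k + 1, d):
            ops.append(combine(ket_bra(k, l, d), ket_bra(l, k, d), s, s))
            ops.append(combine(ket_bra(k, l, d), ket_bra(l, k, d), -1j * s, 1j * s))
    for k in range(1, d):
        diag = [1.0] * k + [-float(k)] + [0.0] * (d - k - 1)
        norm = 1 / math.sqrt(k * (k + 1))
        ops.append([[norm * diag[r] if r == c else 0.0 for c in range(d)] for r in range(d)])
    return ops

ops3 = gell_mann_basis(3)
gram_ok = all(abs(hs_inner(ops3[a], ops3[b]) - (1 if a == b else 0)) < 1e-12
              for a in range(len(ops3)) for b in range(len(ops3)))
print(len(ops3), gram_ok)

# For d = 2, scaling by sqrt(2) should give the Pauli matrices 1, X, Y, Z.
paulis = [[[math.sqrt(2) * x for x in row] for row in op] for op in gell_mann_basis(2)]
```

For $d = 3$ the list has nine elements, matching the dimension of $\mathrm{L}(\mathbb{C}^3)$, so an orthonormal set of this size is automatically a basis.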
Exercise 2.11
Let 𝑋 ∈ L(H 𝐴 , H𝐵 ) be a linear operator, with H 𝐴 and H𝐵 arbitrary. Prove
that 𝑋 † 𝑋 is positive semi-definite.
• Density operators: These are Hermitian operators that are positive semi-definite
and have unit trace. Density operators are generalizations of probability
distributions from classical information theory and describe the states of a
quantum system, as detailed in Chapter 3.
• Unitary operators: These are linear operators whose inverses are equal to their
adjoints. That is, 𝑈 ∈ L(H) is unitary if 𝑈 †𝑈 = 𝑈𝑈 † = 1. Unitary operators
Exercise 2.12
Let 𝑈 ∈ L(H) be a unitary operator acting on a 𝑑-dimensional Hilbert
space H.
1. Given an orthonormal basis $\{|e_k\rangle\}_{k=1}^{d}$ for $\mathcal{H}$, prove that the set
$\{|f_k\rangle\}_{k=1}^{d}$, with $|f_k\rangle \coloneqq U|e_k\rangle$ for all $1 \leq k \leq d$, is another orthonormal
basis for $\mathcal{H}$.
2. Using the transpose trick identity in (2.2.40), prove that
3. Using 1. and 2., conclude that the following identity holds for every
orthonormal basis $\{|e_k\rangle\}_{k=1}^{d}$ for $\mathcal{H}$:
\[
|\Gamma\rangle = \sum_{k=1}^{d} |e_k\rangle \otimes \overline{|e_k\rangle}. \tag{2.2.51}
\]
Exercise 2.13
Let 𝑉 ∈ L(H 𝐴 , H𝐵 ) be an isometry.
1. Prove that ⟨𝑉𝜓|𝑉 𝜙⟩ = ⟨𝜓|𝜙⟩ for all |𝜓⟩, |𝜙⟩ ∈ H 𝐴 .
2. Using 1., prove that 𝑉 is injective.
3. Using 2., prove that 𝑑 𝐵 ≥ 𝑑 𝐴 . (Hint: use the rank-nullity theorem in
(2.2.18).)
An important fact that we make use of throughout this book is the singular value
decomposition theorem.
Remark: From (2.2.53), we see that the rank of a linear operator 𝑋, which we defined earlier
as the dimension of the image of 𝑋, is equal to the number of singular values of 𝑋.
The singular value decomposition theorem can be written in the following matrix
form that is familiar from elementary linear algebra. We first extend the orthonormal
vectors {|𝑒 𝑘 ⟩𝐵 : 1 ≤ 𝑘 ≤ 𝑟} in H𝐵 to an orthonormal basis {|𝑒 𝑘 ⟩𝐵 : 1 ≤ 𝑘 ≤ 𝑑 𝐵 }
Exercise 2.14
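Using arguments similar to the proof of Theorem 2.2, prove that every linear
operator $X_{AB} \in \mathrm{L}(\mathcal{H}_A \otimes \mathcal{H}_B)$ can be written as
\[
X_{AB} = \sum_{k=1}^{r} \sqrt{\lambda_k}\, E_A^k \otimes F_B^k, \tag{2.2.60}
\]
where the set $\{\lambda_k\}_{k=1}^{r}$ consists of strictly positive reals, $\{E_A^k\}_{k=1}^{r}$ and $\{F_B^k\}_{k=1}^{r}$ are
orthonormal sets of linear operators acting on $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively, and $r =
\mathrm{rank}(M)$, where $M \in \mathrm{L}(\mathcal{H}_A \otimes \mathcal{H}_A, \mathcal{H}_B \otimes \mathcal{H}_B)$ is defined by $\langle j, \ell|_{BB} M |i, k\rangle_{AA} =
\langle i, j|_{AB} X |k, \ell\rangle_{AB}$ for all $0 \leq i, k \leq d_A - 1$ and $0 \leq j, \ell \leq d_B - 1$.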
Given a linear operator 𝑋 ∈ L(H) acting on some Hilbert space H, if there exists a
non-zero vector |𝜓⟩ ∈ H and a scalar 𝜆 ∈ C such that
𝑋 |𝜓⟩ = 𝜆|𝜓⟩, (2.2.61)
then |𝜓⟩ is said to be an eigenvector of 𝑋 with associated eigenvalue 𝜆. The set
of all eigenvectors associated with an eigenvalue 𝜆 is a subspace of H called the
eigenspace of 𝑋 associated with 𝜆, and the multiplicity of 𝜆 is the number of linearly
independent eigenvectors of 𝑋 that are associated with 𝜆 (in other words, it is
the dimension of the eigenspace of 𝑋 associated with 𝜆). The eigenspace of 𝑋
associated with 𝜆 is equal to ker(𝑋 − 𝜆𝐼).
The spectral theorem, which we state below, allows us to decompose every
normal operator 𝑋, i.e., an operator that commutes with its adjoint, so that
𝑋 𝑋 † = 𝑋 † 𝑋, in terms of its eigenvalues and projections onto its corresponding
eigenspaces. We employ it most often when analyzing quantum states and
observables.
where $d = \dim(\mathcal{H})$ and $\{|\psi_k\rangle\}_{k=1}^{d}$ is a set of orthonormal vectors such that
Note that for the decomposition in (2.2.63) the numbers $\lambda_k \in \mathbb{C}$ are not necessarily all
distinct because the eigenspace associated with each eigenvalue can have dimension
greater than one. Also, note that the decomposition in (2.2.63) is generally not
unique because the decomposition of the spectral projections into orthonormal
vectors is not unique.
From (2.2.63), it is evident that the rank of a normal operator 𝑋 (recall the
discussion around (2.2.16)) is equal to the number of non-zero eigenvalues of 𝑋
(including their multiplicities). Furthermore, the support of 𝑋 (recall (2.2.19)) is
equal to the span of all eigenvectors of 𝑋 associated with the non-zero eigenvalues
of $X$. In particular,
\[
\Pi_X \coloneqq \sum_{k : \lambda_k \neq 0} |\psi_k\rangle\langle\psi_k| \tag{2.2.65}
\]
is the projection onto the support of $X$. It is also evident that the trace of $X$ is equal
to the sum of its eigenvalues, i.e., $\mathrm{Tr}[X] = \sum_{k=1}^{d} \lambda_k = \sum_{k : \lambda_k \neq 0} \lambda_k$.
Exercise 2.15
Let 𝑃 be a projection operator.
1. Prove that the eigenvalues of 𝑃 are either 0 or 1. Prove that Tr[𝑃] = rank(𝑃).
2. Using 1., conclude that rank(𝑋) = Tr[Π 𝑋 ] for every linear operator 𝑋,
where Π 𝑋 is the projection onto the support of 𝑋, as defined in (2.2.65).
The singular values of a linear operator $X$ (not necessarily normal) are related
to the eigenvalues of $X^\dagger X$ and $X X^\dagger$ in the following way. Let $\{s_k\}_{k=1}^{\mathrm{rank}(X)}$ be the set
of singular values of $X$, and let $\{\lambda_k\}_{k=1}^{\mathrm{rank}(X)}$ be the non-zero eigenvalues of $X^\dagger X$,
which are the same as the non-zero eigenvalues of $X X^\dagger$. (Note that both $X^\dagger X$ and $X X^\dagger$ are
normal operators, so that the spectral theorem applies to them.) Then, $s_k = \sqrt{\lambda_k}$
for all $1 \leq k \leq \mathrm{rank}(X)$. In particular, if $X$ is a Hermitian operator with non-zero
eigenvalues $\{\omega_k\}_{k=1}^{\mathrm{rank}(X)}$, then $s_k = \sqrt{\omega_k^2} = |\omega_k|$ for all $1 \leq k \leq \mathrm{rank}(X)$.
Exercise 2.16
1. Prove that every Hermitian operator has real eigenvalues.
2. Prove that every unitary operator has eigenvalues with unit modulus; i.e., if
Exercise 2.17
Using the Jordan–Hahn decomposition, prove that there exists a basis for L(C𝑑 )
consisting entirely of positive semi-definite operators, for all 𝑑 ≥ 2.
Exercise 2.18
For every positive semi-definite operator 𝑋, prove that the operator 𝑋 + 𝜀 1 is
positive definite for all 𝜀 > 0.
𝑓 (𝑈 𝑋𝑈 † ) = 𝑈 𝑓 (𝑋)𝑈 † . (2.2.70)
This is due to the fact that 𝑋 and 𝑈 𝑋𝑈 † have the same eigenvalues.
Functions that arise frequently throughout this book are as follows:
• Power functions: for every $\alpha \in \mathbb{N}$, the function $f(x) = x^\alpha$, with $x \in \mathbb{R}$, extends
to Hermitian operators via (2.2.69) as
\[
X^\alpha \coloneqq \sum_{k=1}^{d} \lambda_k^\alpha\, |\psi_k\rangle\langle\psi_k|, \quad \alpha \in \mathbb{N}. \tag{2.2.71}
\]
If $X$ is a positive semi-definite operator (so that $\lambda_k \geq 0$ for
all $k$), then in the case $\alpha = \frac{1}{2}$ we typically use the notation $\sqrt{X}$ to refer to $X^{\frac{1}{2}}$. In particular, $\sqrt{X}$
is the unique positive semi-definite operator such that $\sqrt{X} \sqrt{X} = X$.
• Logarithm functions: For the function $\log_b \colon (0, \infty) \to \mathbb{R}$ with base $b > 0$, $b \neq 1$, we
define
\[
\log_b(X) \coloneqq \sum_{k : \lambda_k > 0} \log_b(\lambda_k)\, |\psi_k\rangle\langle\psi_k|. \tag{2.2.76}
\]
We deal throughout this book exclusively with the base-2 logarithm log2 and
the base-e logarithm loge ≡ ln.
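To make the recipe "apply the function to the eigenvalues, keep the eigenvectors" concrete, the Python sketch below builds $f(X)$ from a given spectral decomposition and checks the defining property of the square root. The function `apply_function` and the particular eigenvalues and eigenvectors are hypothetical choices for illustration; the sketch assumes that an orthonormal eigenbasis is already known and does not compute one.

```python
import math

def apply_function(f, eigenvalues, eigenvectors):
    """f(X) = sum_k f(lambda_k) |psi_k><psi_k| for orthonormal eigenvectors psi_k."""
    d = len(eigenvectors[0])
    result = [[0j] * d for _ in range(d)]
    for lam, psi in zip(eigenvalues, eigenvectors):
        for r in range(d):
            for c in range(d):
                result[r][c] += f(lam) * psi[r] * psi[c].conjugate()
    return result

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

s = 1 / math.sqrt(2)
eigenvalues = [4.0, 9.0]            # strictly positive, so sqrt and log are defined
eigenvectors = [[s, s], [s, -s]]    # an orthonormal basis of C^2

X = apply_function(lambda x: x, eigenvalues, eigenvectors)
sqrt_X = apply_function(math.sqrt, eigenvalues, eigenvectors)
log2_X = apply_function(math.log2, eigenvalues, eigenvectors)

product = matmul(sqrt_X, sqrt_X)
print(X)
print(product)
```

Up to rounding, `product` agrees with `X`, and the trace of `log2_X` equals $\log_2 4 + \log_2 9$, the sum of the logarithms of the eigenvalues.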
We end this section with a lemma, which is used several times in Chapter 7.
Lemma 2.5
Let 𝑋 ∈ L(H), and let 𝑓 be a function such that the squares of the singular
values of 𝑋 are in the domain of 𝑓 . Then
𝑋 𝑓 (𝑋 † 𝑋) = 𝑓 (𝑋 𝑋 † ) 𝑋. (2.2.77)
\begin{align}
&= W S V^\dagger f(V S W^\dagger W S V^\dagger) \tag{2.2.79} \\
&= W S V^\dagger f(V S^2 V^\dagger). \tag{2.2.80}
\end{align}
Now, we use the fact that $f(V S^2 V^\dagger) = V f(S^2) V^\dagger$, which holds because the function
$(\cdot) \mapsto V (\cdot) V^\dagger$, with $V$ unitary, preserves the eigenvalues. Using as well the fact
that $S f(S^2) = f(S^2) S$, we obtain
\begin{align}
X f(X^\dagger X) &= W S f(S^2) V^\dagger \tag{2.2.81} \\
&= W f(S^2) S V^\dagger \tag{2.2.82} \\
&= W f(S V^\dagger V S) W^\dagger W S V^\dagger \tag{2.2.83} \\
&= f(W S V^\dagger V S W^\dagger) W S V^\dagger \tag{2.2.84} \\
&= f(X X^\dagger) X. \tag{2.2.85}
\end{align}
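As a sanity check of (2.2.77), consider a small example chosen here for illustration (it is not taken from the text): a $2 \times 2$ operator for which $X^\dagger X$ and $X X^\dagger$ are both diagonal but different, and an arbitrary function $f$ whose domain contains $1$ and $4$.

```latex
\begin{align*}
X &= \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}, \qquad
X^\dagger X = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}, \qquad
X X^\dagger = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \\
X f(X^\dagger X) &= \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} f(1) & 0 \\ 0 & f(4) \end{pmatrix}
= \begin{pmatrix} 0 & 2 f(4) \\ f(1) & 0 \end{pmatrix}, \\
f(X X^\dagger) X &= \begin{pmatrix} f(4) & 0 \\ 0 & f(1) \end{pmatrix}
\begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 2 f(4) \\ f(1) & 0 \end{pmatrix}.
\end{align*}
```

The two products agree even though $f(X^\dagger X) \neq f(X X^\dagger)$ in general, which is the point of the lemma: moving $X$ through $f$ swaps $X^\dagger X$ for $X X^\dagger$.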
2.2.9 Norms
for all vectors |𝜓⟩ and |𝜙⟩ and all 𝜆 ∈ [0, 1].
In this section, we are primarily interested in the Hilbert space L(H) of linear
operators 𝑋 : H → H for some Hilbert space H. The following norm for linear
operators is used extensively in this book.
\[
\|X\|_\infty \coloneqq \lim_{\alpha \to \infty} \|X\|_\alpha. \tag{2.2.88}
\]
Throughout this book, we extend the function ∥·∥ 𝛼 to include 𝛼 ∈ (0, 1) (with
the definition exactly as in (2.2.87)), although in this case it is not a norm because
it does not satisfy the triangle inequality.
Norms are typically employed in pure mathematics to measure the lengths of
vectors or operators, and different norms give different ways of measuring length.
In quantum information, we employ norms to measure entropy and information of
quantum states and channels (see Chapter 7). The parameter 𝛼 for the Schatten
norm then becomes the Rényi parameter for the Rényi entropy.
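For numerical experiments, the Schatten norm can be evaluated from the singular values alone. The Python sketch below assumes the singular-value expression $\|X\|_\alpha = \left(\sum_k s_k^\alpha\right)^{1/\alpha}$ and takes the list of singular values as input rather than computing them from a matrix; the function name `schatten_norm` is a hypothetical choice.

```python
import math

def schatten_norm(singular_values, alpha):
    """Schatten alpha-norm computed from singular values; alpha may be math.inf."""
    s = [x for x in singular_values if x > 0]  # zero singular values do not contribute
    if not s:
        return 0.0
    if math.isinf(alpha):
        return max(s)  # the alpha -> infinity limit is the largest singular value
    return sum(x ** alpha for x in s) ** (1 / alpha)

singular_values = [3, 4]
for alpha in (1, 2, 4, 8, 64, math.inf):
    print(alpha, schatten_norm(singular_values, alpha))
```

The printed values decrease from $7$ (trace norm) through $5$ (Hilbert–Schmidt norm) toward $4$, the largest singular value, in line with the monotonicity in $\alpha$ and the definition of $\|X\|_\infty$ in (2.2.88).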
Exercise 2.19
Let $X$ be a linear operator, and let $\{s_k\}_{k=1}^{r}$ be the set of singular values of $X$,
where $r \coloneqq \mathrm{rank}(X)$. Prove that
\[
\|X\|_\alpha = \left( \sum_{k=1}^{r} s_k^\alpha \right)^{\frac{1}{\alpha}} \tag{2.2.89}
\]
Exercise 2.20
Let $X$ be a linear operator and $\alpha \in (0, \infty)$. Prove that
\[
\left\| X^\dagger X \right\|_\alpha = \left\| X X^\dagger \right\|_\alpha = \|X\|_{2\alpha}^2. \tag{2.2.91}
\]
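\[
\left| \mathrm{Tr}[Z^\dagger X] \right| \leq \|X\|_\alpha \|Z\|_\beta, \tag{2.2.99}
\]
which holds for all linear operators $X$ and $Z$, where $\alpha, \beta \in [1, \infty]$ satisfy
$\frac{1}{\alpha} + \frac{1}{\beta} = 1$. In this sense, the norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$, with $\frac{1}{\alpha} + \frac{1}{\beta} = 1$, are said
to be dual to each other.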
Proof:
1. In the case that 𝑋 = 0, the statement is trivial. So we focus on the case 𝑋 ≠ 0.
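Let $\{s_k\}_{k=1}^{r}$ denote the singular values of $X$, where $r \coloneqq \mathrm{rank}(X)$. If we can
show that $\frac{\mathrm{d}}{\mathrm{d}\alpha} \|X\|_\alpha \leq 0$ for all $\alpha \geq 1$, then it follows that $\|X\|_\alpha$ is monotone
non-increasing with $\alpha$. To this end, starting with (2.2.89), consider that
\begin{align}
\frac{\mathrm{d}}{\mathrm{d}\alpha} \|X\|_\alpha
&= \frac{\mathrm{d}}{\mathrm{d}\alpha} \left( \sum_{k=1}^{r} s_k^\alpha \right)^{\frac{1}{\alpha}}
= \frac{\mathrm{d}}{\mathrm{d}\alpha} \mathrm{e}^{\frac{1}{\alpha} \ln \sum_{k=1}^{r} s_k^\alpha} \tag{2.2.100} \\
&= \mathrm{e}^{\frac{1}{\alpha} \ln \sum_{k=1}^{r} s_k^\alpha}\, \frac{\mathrm{d}}{\mathrm{d}\alpha} \left( \frac{1}{\alpha} \ln \sum_{k=1}^{r} s_k^\alpha \right) \tag{2.2.101} \\
&= \left( \sum_{k=1}^{r} s_k^\alpha \right)^{\frac{1}{\alpha}} \left( -\frac{1}{\alpha^2} \ln \sum_{k=1}^{r} s_k^\alpha + \frac{1}{\alpha} \frac{\mathrm{d}}{\mathrm{d}\alpha} \left[ \ln \sum_{k=1}^{r} s_k^\alpha \right] \right) \tag{2.2.102} \\
&= \left( \sum_{k=1}^{r} s_k^\alpha \right)^{\frac{1}{\alpha}} \left( -\frac{1}{\alpha^2} \ln \sum_{k=1}^{r} s_k^\alpha + \frac{1}{\alpha \sum_{k=1}^{r} s_k^\alpha} \frac{\mathrm{d}}{\mathrm{d}\alpha} \left[ \sum_{k=1}^{r} s_k^\alpha \right] \right) \tag{2.2.103} \\
&= \left( \sum_{k=1}^{r} s_k^\alpha \right)^{\frac{1}{\alpha} - 1} \left( -\frac{1}{\alpha^2} \left[ \sum_{k=1}^{r} s_k^\alpha \right] \ln \left[ \sum_{k=1}^{r} s_k^\alpha \right] + \frac{1}{\alpha} \frac{\mathrm{d}}{\mathrm{d}\alpha} \left[ \sum_{k=1}^{r} s_k^\alpha \right] \right) \tag{2.2.104} \\
&= \left( \sum_{k=1}^{r} s_k^\alpha \right)^{\frac{1}{\alpha} - 1} \left( \frac{\alpha \frac{\mathrm{d}}{\mathrm{d}\alpha} \left[ \sum_{k=1}^{r} s_k^\alpha \right] - \left[ \sum_{k=1}^{r} s_k^\alpha \right] \ln \left[ \sum_{k=1}^{r} s_k^\alpha \right]}{\alpha^2} \right) \tag{2.2.105} \\
&= \left( \sum_{k=1}^{r} s_k^\alpha \right)^{\frac{1}{\alpha} - 1} \left( \frac{\alpha \sum_{k=1}^{r} s_k^\alpha \ln s_k - \left[ \sum_{k=1}^{r} s_k^\alpha \right] \ln \left[ \sum_{k=1}^{r} s_k^\alpha \right]}{\alpha^2} \right) \tag{2.2.106} \\
&= \left( \sum_{k=1}^{r} s_k^\alpha \right)^{\frac{1}{\alpha} - 1} \left( \frac{\sum_{k=1}^{r} s_k^\alpha \ln s_k^\alpha - \left[ \sum_{k=1}^{r} s_k^\alpha \right] \ln \left[ \sum_{k=1}^{r} s_k^\alpha \right]}{\alpha^2} \right). \tag{2.2.107}
\end{align}
The term on the very left in the last line is non-negative and so is the denominator
with $\alpha^2$. The inequality
\[
\sum_{k=1}^{r} s_k^\alpha \ln s_k^\alpha - \left( \sum_{k=1}^{r} s_k^\alpha \right) \ln \left( \sum_{k=1}^{r} s_k^\alpha \right) \leq 0 \tag{2.2.108}
\]
so that
\[
\sum_{k=1}^{r} \frac{s_k^\alpha}{\sum_{k'=1}^{r} s_{k'}^\alpha} \ln s_k^\alpha = \sum_{k=1}^{r} p_k \ln s_k^\alpha \leq \ln \left( \sum_{k=1}^{r} p_k s_k^\alpha \right) \leq \ln \left( \sum_{k=1}^{r} s_k^\alpha \right), \tag{2.2.111}
\]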
4.-5. Multiplicativity with respect to the tensor product and the direct sum property
follow immediately from the definition of ∥·∥ 𝛼 .
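6. We provide a proof of (2.2.98) in the special case $\alpha = 1$, $\beta = \infty$ in Proposition 2.10
below. For all other values of $\alpha$ and $\beta$, please consult the Bibliographic
Notes (Section 2.6). Given (2.2.98), for all linear operators $X$ and $Z$,
let $Y = \frac{Z}{\|Z\|_\beta}$. Then, $\|Y\|_\beta \leq 1$, which means that
\[
\frac{1}{\|Z\|_\beta} \left| \mathrm{Tr}[Z^\dagger X] \right| = \left| \mathrm{Tr}[Y^\dagger X] \right| \leq \|X\|_\alpha \quad \Rightarrow \quad \left| \mathrm{Tr}[Z^\dagger X] \right| \leq \|X\|_\alpha \|Z\|_\beta, \tag{2.2.112}
\]
which is the inequality in (2.2.99). ■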
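A quick worked instance of the Hölder inequality (2.2.99) may help fix ideas. The operators below are chosen here purely for illustration; since both are diagonal and positive semi-definite, their singular values are simply their diagonal entries.

```latex
\begin{align*}
X &= \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \qquad
Z = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\mathrm{Tr}[Z^\dagger X] = 1 \cdot 2 + 2 \cdot 1 = 4, \\
\alpha = \beta = 2: \quad & \|X\|_2 \|Z\|_2 = \sqrt{1^2 + 2^2}\, \sqrt{2^2 + 1^2} = 5 \geq 4, \\
\alpha = 1,\ \beta = \infty: \quad & \|X\|_1 \|Z\|_\infty = (1 + 2) \cdot 2 = 6 \geq 4.
\end{align*}
```

Both choices of dual exponents give valid, though not tight, upper bounds on the same trace.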
Proposition 2.8
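Let $\alpha \in (0, 1) \cup (1, \infty]$. Then, for every $\beta$ such that $\frac{1}{\alpha} + \frac{1}{\beta} = 1$, and every
positive semi-definite operator $X$,
\[
\|X\|_\alpha = \begin{cases}
\inf\{\mathrm{Tr}[X Y^{\frac{1}{\beta}}] : Y > 0,\ \mathrm{Tr}[Y] = 1\} & \text{if } \alpha \in [0, 1), \\
\sup\{\mathrm{Tr}[X Y^{\frac{1}{\beta}}] : Y \geq 0,\ \mathrm{Tr}[Y] = 1\} & \text{if } \alpha \in [1, \infty).
\end{cases} \tag{2.2.113}
\]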
An important case of the Schatten norms is the Schatten ∞-norm, which we recall
from (2.2.88) is defined as
\[
\|X\|_\infty := \lim_{\alpha \to \infty} \|X\|_\alpha, \tag{2.2.114}
\]
∥ 𝑋 ∥ ∞ = 𝑠max . (2.2.115)
We now prove the opposite inequality. Consider for 𝛼 > 𝛽 > 1 that
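\begin{align}
\|X\|_\alpha = \|\vec{s}\|_\alpha
&= \left(\sum_{k=1}^r s_k^{\alpha-\beta} s_k^\beta\right)^{\frac{1}{\alpha}}
\leq \left(s_{\max}^{\alpha-\beta} \sum_{k=1}^r s_k^\beta\right)^{\frac{1}{\alpha}} \tag{2.2.117} \\
&= s_{\max}^{1-\frac{\beta}{\alpha}} \left(\sum_{k=1}^r s_k^\beta\right)^{\frac{1}{\alpha}}
= s_{\max}^{1-\frac{\beta}{\alpha}} \|\vec{s}\|_\beta^{\frac{\beta}{\alpha}}
= s_{\max}^{1-\frac{\beta}{\alpha}} \|X\|_\beta^{\frac{\beta}{\alpha}}. \tag{2.2.118}
\end{align}
We thus have
\[
\|X\|_\alpha \leq s_{\max}^{1-\frac{\beta}{\alpha}} \|X\|_\beta^{\frac{\beta}{\alpha}}. \tag{2.2.119}
\]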
For every fixed 𝛽, we find that the limit 𝛼 → ∞ of the right-hand side of the
above inequality is equal to 𝑠max . Therefore, ∥ 𝑋 ∥ ∞ ≤ 𝑠max , which concludes the
proof. ■
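The bound in (2.2.119) and the resulting convergence of ∥𝑋∥_𝛼 to 𝑠max can be observed numerically. The sketch below assumes NumPy; the random matrix, the fixed value 𝛽 = 2, and the grid of 𝛼 values are illustrative choices. The norm is evaluated after factoring out 𝑠max, which is only a numerical-stability device for large 𝛼.

```python
import numpy as np

def schatten_norm(X, alpha):
    """Schatten alpha-norm, computed as s_max * (sum_k (s_k/s_max)^alpha)^(1/alpha)."""
    s = np.linalg.svd(X, compute_uv=False)
    return s.max() * ((s / s.max()) ** alpha).sum() ** (1.0 / alpha)

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 5))
s_max = np.linalg.svd(X, compute_uv=False).max()

beta = 2.0
bound_holds = True
gaps = []  # ||X||_alpha - s_max for increasing alpha
for alpha in [4.0, 8.0, 16.0, 64.0, 256.0]:
    lhs = schatten_norm(X, alpha)
    # Right-hand side of the bound: s_max^(1 - beta/alpha) * ||X||_beta^(beta/alpha).
    rhs = s_max ** (1 - beta / alpha) * schatten_norm(X, beta) ** (beta / alpha)
    bound_holds = bound_holds and (s_max - 1e-9 <= lhs <= rhs + 1e-9)
    gaps.append(lhs - s_max)
```

The list `gaps` shrinks towards zero as 𝛼 grows, in line with the limit 𝛼 → ∞ taken in the proof.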
Due to Proposition 2.9, the term spectral norm is often used to refer to the
Schatten ∞-norm. It is also referred to as the operator norm, because it is the
norm induced by the Euclidean norm on the underlying Hilbert space on which the
operator 𝑋 acts, i.e.,
\[
\|X\|_\infty = \sup_{|\psi\rangle \neq 0} \frac{\|X|\psi\rangle\|_2}{\||\psi\rangle\|_2} = \sup_{|\psi\rangle : \||\psi\rangle\|_2 = 1} \|X|\psi\rangle\|_2. \tag{2.2.120}
\]
In the equation above, we have employed the shorthand sup, which stands for
supremum. We also often employ inf for infimum. These concepts are reviewed in
Section 2.3.1.
Exercise 2.21
Using the fact that 𝑋 has a singular value decomposition of the form
$X = \sum_{k=1}^{\mathrm{rank}(X)} s_k |e_k\rangle\langle f_k|$ (see Theorem 2.1), prove (2.2.120). Similarly, prove that
Exercise 2.22
Let 𝑈 be a unitary operator. Prove that ∥𝑈 ∥ ∞ = 1. More generally, prove that
∥𝑉 ∥ ∞ = 1 for every isometry 𝑉.
Exercise 2.23
Consider two vectors |𝜓⟩, |𝜙⟩ ∈ C𝑑 , with 𝑑 ≥ 2. Show that
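Proof: Let $X = \sum_{k=1}^r s_k |e_k\rangle_B \langle f_k|_A$ be the singular value decomposition of 𝑋,
where 𝑟 ≔ rank(𝑋). Let 𝑌 ∈ L(H_A, H_B) be such that ∥𝑌∥∞ ≤ 1. Then,
\begin{align}
\left|\mathrm{Tr}[Y^\dagger X]\right|
&= \left|\mathrm{Tr}\!\left[Y^\dagger \left(\sum_{k=1}^r s_k |e_k\rangle_B \langle f_k|_A\right)\right]\right| \tag{2.2.131} \\
&= \left|\sum_{k=1}^r s_k \langle e_k|_B Y |f_k\rangle_A\right| \tag{2.2.132} \\
&\leq \sum_{k=1}^r s_k \left|\langle e_k|_B Y |f_k\rangle_A\right|, \tag{2.2.133}
\end{align}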
where the last line is due to the triangle inequality. Now, using (2.2.121), we have
|⟨𝑒 𝑘 | 𝐵𝑌 | 𝑓 𝑘 ⟩ 𝐴 | ≤ ∥𝑌 ∥ ∞ ≤ 1, (2.2.134)
holds. The opposite inequality holds by making a particular choice for 𝑌 . We pick
𝑌 to be the following linear operator defined from the singular value decomposition
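of 𝑋: $Y = \sum_{k=1}^r |e_k\rangle_B \langle f_k|_A$. Observe that ∥𝑌∥∞ = 1. Thus,
\begin{align}
\sup_{Y \neq 0 : \|Y\|_\infty \leq 1} \left|\mathrm{Tr}[Y^\dagger X]\right|
&\geq \mathrm{Tr}\!\left[\left(\sum_{k'=1}^r |f_{k'}\rangle_A \langle e_{k'}|_B\right) \left(\sum_{k=1}^r s_k |e_k\rangle_B \langle f_k|_A\right)\right] \tag{2.2.137} \\
&= \sum_{k=1}^r s_k \tag{2.2.138} \\
&= \|X\|_1. \tag{2.2.139}
\end{align}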
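The optimal choice of 𝑌 used in this proof can be built directly from a numerical singular value decomposition. The following is a minimal sketch, assuming NumPy; the dimensions, the random test operators, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
d_A, d_B = 3, 4
X = rng.normal(size=(d_B, d_A)) + 1j * rng.normal(size=(d_B, d_A))

# Reduced SVD: X = E diag(s) Fh; the columns of E play the role of |e_k>,
# and the rows of Fh play the role of <f_k|.
E, s, Fh = np.linalg.svd(X, full_matrices=False)
trace_norm = s.sum()

# Choice from the proof: Y = sum_k |e_k><f_k|.
Y = E @ Fh
value = np.trace(Y.conj().T @ X)       # should equal the trace norm (and be real)
spec_norm_Y = np.linalg.norm(Y, 2)     # largest singular value of Y

# Any other operator rescaled to spectral norm one cannot do better.
W = rng.normal(size=(d_B, d_A)) + 1j * rng.normal(size=(d_B, d_A))
W /= np.linalg.norm(W, 2)
other = abs(np.trace(W.conj().T @ X))
```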
Remark: Observe that Proposition 2.10 can be generalized as follows for every linear operator
𝑋 𝐴→𝐵 ∈ L(H 𝐴, H 𝐵 ):
\[
\|X\|_1 = \sup_{Y \neq 0 : \|Y\|_\infty \leq 1} \mathrm{Re}\left(\mathrm{Tr}[Y^\dagger X]\right), \tag{2.2.140}
\]
where, as before, the optimization is with respect to every non-zero operator 𝑌 ∈ L(H 𝐴, H 𝐵 )
with spectral norm bounded from above by one. Indeed, for every complex number 𝑧 ∈ C, the
inequality Re(𝑧) ≤ |Re(𝑧)| ≤ |𝑧| holds, which means that
\[
\sup_{Y \neq 0 : \|Y\|_\infty \leq 1} \mathrm{Re}\left(\mathrm{Tr}[Y^\dagger X]\right) \leq \sup_{Y \neq 0 : \|Y\|_\infty \leq 1} \left|\mathrm{Tr}[Y^\dagger X]\right| = \|X\|_1. \tag{2.2.141}
\]
Then, to obtain the opposite inequality, the same choice for 𝑌 as in the proof of Proposition 2.10
can be made, because for that choice of 𝑌 we have Tr[𝑌†𝑋] = ∥𝑋∥₁, which is real, so that
Re(Tr[𝑌†𝑋]) = ∥𝑋∥₁. We can thus conclude (2.2.140).
We also remark that in both (2.2.130) and (2.2.140), it suffices to optimize with respect to
isometries. In particular, because ∥𝑈 ∥ ∞ = 1 for every isometry 𝑈 (see Exercise 2.22), using
similar techniques as in the proof of Proposition 2.10, it is straightforward to prove that for all
𝑋 ∈ L(H 𝐴, H 𝐵 ),
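\begin{align}
\|X\|_1 &= \sup_{U_{B\to A}\ \text{isometry}} \left|\mathrm{Tr}[U_{B\to A} X_{A\to B}]\right| = \sup_{U_{B\to A}\ \text{isometry}} \mathrm{Re}\left(\mathrm{Tr}[U_{B\to A} X_{A\to B}]\right), \quad d_A \geq d_B, \tag{2.2.142} \\
\|X\|_1 &= \sup_{V_{A\to B}\ \text{isometry}} \left|\mathrm{Tr}[V_{A\to B} (X_{A\to B})^\dagger]\right| = \sup_{V_{A\to B}\ \text{isometry}} \mathrm{Re}\left(\mathrm{Tr}[V_{A\to B} (X_{A\to B})^\dagger]\right), \quad d_A \leq d_B. \tag{2.2.143}
\end{align}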
Lemma 2.11
Let 𝑋 be a Hermitian operator satisfying Tr[𝑋] = 0. Then,
\[
\|X\|_\infty \leq \frac{1}{2} \|X\|_1. \tag{2.2.146}
\]
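The inequality of Lemma 2.11 is easy to probe numerically. The sketch below assumes NumPy; the random traceless Hermitian matrix and the saturating example diag(1, −1) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (G + G.conj().T) / 2
X = H - (np.trace(H).real / d) * np.eye(d)   # Hermitian and traceless

# For Hermitian X the singular values are the absolute values of the eigenvalues.
eigs = np.linalg.eigvalsh(X)
spectral = np.abs(eigs).max()     # ||X||_infty
trace_norm = np.abs(eigs).sum()   # ||X||_1

# The traceless operator diag(1, -1) saturates the bound: 1 = (1/2) * 2.
sat = np.diag([1.0, -1.0])
sat_eigs = np.linalg.eigvalsh(sat)
```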
Throughout this book, we make use of the Löwner partial order for Hermitian
operators. It is useful as a way of comparing two Hermitian operators in L(H),
generalizing the way in which we compare two real numbers.
The relations “≥” and “≤” satisfy the following expected properties: 𝑋 ≤ 𝑌
and 𝑋 ≥ 𝑌 imply that 𝑋 = 𝑌 , and 𝑋 ≤ 𝑌 and 𝑌 ≤ 𝑍 imply that 𝑋 ≤ 𝑍. The term
“partial order” is used because not every pair (𝑋, 𝑌 ) of Hermitian operators satisfies
either 𝑋 ≥ 𝑌 or 𝑋 ≤ 𝑌 .
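A small numerical illustration of why the order is only partial: the sketch below, assuming NumPy, tests 𝑋 ≥ 𝑌 by checking that 𝑋 − 𝑌 has no negative eigenvalues. The helper name `loewner_geq` and the diagonal examples are hypothetical and chosen purely for illustration.

```python
import numpy as np

def loewner_geq(X, Y, tol=1e-12):
    """Return True if X >= Y in the Loewner order, i.e., X - Y is positive semi-definite."""
    return bool(np.linalg.eigvalsh(X - Y).min() >= -tol)

X = np.diag([1.0, 0.0])
Y = np.diag([0.0, 1.0])
Z = np.diag([2.0, 3.0])

# X and Y are incomparable: X - Y = diag(1, -1) has eigenvalues of both signs.
incomparable = (not loewner_geq(X, Y)) and (not loewner_geq(Y, X))

# Z dominates both X and Y.
dominates = loewner_geq(Z, X) and loewner_geq(Z, Y)
```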
The functions considered in Section 2.2.8.1 have the following properties with
respect to Definition 2.13:
• The function 𝑥 ↦ 𝑥^𝛼 is operator monotone for 𝛼 ∈ [0, 1] and 𝑥 ∈ [0, ∞),
operator anti-monotone for 𝛼 ∈ [−1, 0) and 𝑥 ∈ (0, ∞), operator convex for
𝛼 ∈ [−1, 0) and 𝑥 ∈ (0, ∞), operator convex for 𝛼 ∈ [1, 2] and 𝑥 ∈ [0, ∞), and
operator concave for 𝛼 ∈ (0, 1] and 𝑥 ∈ [0, ∞). Note that the function 𝑥 ↦ 𝑥^𝛼
is neither operator monotone, operator convex, nor operator concave for 𝛼 < −1
or 𝛼 > 2.
• The function 𝑥 ↦ log_𝑏(𝑥), for every base 𝑏 > 1 and 𝑥 ∈ (0, ∞), is operator
monotone and operator concave.
• The function 𝑥 ↦ 𝑥 log_𝑏(𝑥), for every base 𝑏 > 1 and 𝑥 ∈ [0, ∞), is operator
convex.³
For proofs of these properties, please see the Bibliographic Notes (Section 2.6).
We note here that these properties are critical for understanding quantum entropies,
as detailed in Chapter 7. Especially, the data-processing inequality for quantum
relative entropy, which is at the heart of understanding quantum communication
limits, is intimately related to operator convexity.
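The distinction between ordinary and operator monotonicity can be seen on a standard 2 × 2 example. The sketch below assumes NumPy; the matrices 𝐴 and 𝐵 and the helper names `apply_fn` and `is_psd` are illustrative choices. The square root (𝛼 = 1/2) preserves the order 𝐴 ≤ 𝐵, whereas the square (𝛼 = 2) does not, even though both functions are monotone on [0, ∞) as scalar functions.

```python
import numpy as np

def apply_fn(f, H):
    """Apply a scalar function to a Hermitian matrix via its spectral decomposition."""
    w, V = np.linalg.eigh(H)
    return (V * f(w)) @ V.conj().T

def is_psd(H, tol=1e-10):
    return bool(np.linalg.eigvalsh(H).min() >= -tol)

A = np.array([[1.0, 1.0], [1.0, 1.0]])
B = np.array([[2.0, 1.0], [1.0, 1.0]])

# Clip tiny negative rounding errors before taking the square root.
sqrt = lambda w: np.sqrt(np.clip(w, 0.0, None))

order_ok = is_psd(B - A)                                 # A <= B holds
sqrt_ok = is_psd(apply_fn(sqrt, B) - apply_fn(sqrt, A))  # sqrt(A) <= sqrt(B)
square_ok = is_psd(B @ B - A @ A)                        # A^2 <= B^2 fails
```

Here 𝐵² − 𝐴² has determinant −1, so it has a negative eigenvalue and `square_ok` is `False`.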
We now state some basic operator inequalities that we use repeatedly throughout
the book.
Proof:
1. 𝑋 ≥ 0 implies that ⟨𝜓|𝑋 |𝜓⟩ ≥ 0 for all |𝜓⟩ ∈ H. Then, for every vector
|𝜙⟩ ∈ H′, we have ⟨𝜙|𝑍 𝑋 𝑍 † |𝜙⟩ ≥ 0 because 𝑍 † |𝜙⟩ ≡ |𝜓⟩ is some vector in H.
Therefore, 𝑍 𝑋 𝑍 † ≥ 0.
Now, 𝑋 ≥ 𝑌 is equivalent to 𝑋 − 𝑌 ≥ 0. Let 𝑊 = 𝑋 − 𝑌 . Then, from the
arguments in the previous paragraph, we have 𝑍𝑊 𝑍 † ≥ 0 for all 𝑍, which
³ Note that, because lim_{𝑥→0} 𝑥 log_𝑏(𝑥) = 0, we take the convention that 0 log_𝑏(0) = 0 throughout
this book.
2. $\mathrm{Tr}\!\left[\left(Y^{\frac{1}{2}} X Y^{\frac{1}{2}}\right)^{rq}\right] \leq \mathrm{Tr}\!\left[\left(Y^{\frac{r}{2}} X^r Y^{\frac{r}{2}}\right)^{q}\right]$ for all 𝑟 ≥ 1.
The operator Jensen inequality below is the linchpin of several quantum data-
processing inequalities presented later on in Chapter 7. These in turn are repeatedly
used in Parts II and III to place fundamental limits on quantum communication
protocols. As such, the operator Jensen inequality is a significant bridge that
connects convexity to information processing.
Now we prove that 3. ⇒ 2. Fix 𝑛 ∈ N and the sets $\{A_k\}_{k=1}^n$ and $\{X_k\}_{k=1}^n$ of
operators such that they satisfy the conditions specified in 2. Define the following
Hermitian operator:
\[
X := \sum_{k=1}^n X_k \otimes |k\rangle\langle k|, \tag{2.2.162}
\]
as well as the isometry
\[
V := \sum_{k=1}^n A_k \otimes |k\rangle, \tag{2.2.163}
\]
Then the desired inequality in (2.2.159) follows from (2.2.164), (2.2.166), and
(2.2.160).
We finally prove that 1. ⇒ 3. Fix the operator 𝑋 and isometry 𝑉, as specified in 3.
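Let 𝑀 be a Hermitian operator in L(H′) with spectrum in 𝐼. Let 𝑃 ≔ 𝟙_H − 𝑉𝑉†,
and observe that 𝑃 is a projection (i.e., 𝑃² = 𝑃), 𝑉†𝑃 = 0, and 𝑃𝑉 = 0. Set
\[
Z := \begin{pmatrix} X & 0 \\ 0 & M \end{pmatrix}, \qquad
U := \begin{pmatrix} V & P \\ 0 & -V^\dagger \end{pmatrix}, \qquad
W := \begin{pmatrix} V & -P \\ 0 & V^\dagger \end{pmatrix}. \tag{2.2.167}
\]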
Observe that 𝑈 and 𝑊 are unitary operators (these are called unitary dilations of
the isometry 𝑉). By direct calculation, we then find that
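\begin{align}
U^\dagger Z U &= \begin{pmatrix} V^\dagger X V & V^\dagger X P \\ P X V & P X P + V M V^\dagger \end{pmatrix}, \tag{2.2.168} \\
W^\dagger Z W &= \begin{pmatrix} V^\dagger X V & -V^\dagger X P \\ -P X V & P X P + V M V^\dagger \end{pmatrix}, \tag{2.2.169}
\end{align}
so that
\[
\frac{1}{2}\left(U^\dagger Z U + W^\dagger Z W\right) = \begin{pmatrix} V^\dagger X V & 0 \\ 0 & P X P + V M V^\dagger \end{pmatrix}. \tag{2.2.170}
\]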
From the same reasoning that leads to (2.2.165), and using (2.2.170), we find that
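\begin{align}
\begin{pmatrix} f(V^\dagger X V) & 0 \\ 0 & f(P X P + V M V^\dagger) \end{pmatrix}
&= f\!\left(\begin{pmatrix} V^\dagger X V & 0 \\ 0 & P X P + V M V^\dagger \end{pmatrix}\right) \tag{2.2.171} \\
&= f\!\left(\frac{1}{2}\left(U^\dagger Z U + W^\dagger Z W\right)\right) \tag{2.2.172} \\
&\leq \frac{1}{2} f\!\left(U^\dagger Z U\right) + \frac{1}{2} f\!\left(W^\dagger Z W\right) \tag{2.2.173} \\
&= \frac{1}{2} U^\dagger f(Z)\, U + \frac{1}{2} W^\dagger f(Z)\, W \tag{2.2.174} \\
&= \begin{pmatrix} V^\dagger f(X) V & 0 \\ 0 & P f(X) P + V f(M) V^\dagger \end{pmatrix}. \tag{2.2.175}
\end{align}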
The inequality follows from the assumption that 𝑓 is operator convex. The third
equality follows from (2.2.69). The final equality follows because
𝑓 (𝑋) 0
𝑓 (𝑍) = , (2.2.176)
0 𝑓 (𝑀)
and by applying (2.2.170) again, with the substitutions 𝑍 → 𝑓 (𝑍), 𝑋 → 𝑓 (𝑋),
and 𝑀 → 𝑓 (𝑀). It follows that
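\[
\begin{pmatrix} f(V^\dagger X V) & 0 \\ 0 & f(P X P + V M V^\dagger) \end{pmatrix}
\leq
\begin{pmatrix} V^\dagger f(X) V & 0 \\ 0 & P f(X) P + V f(M) V^\dagger \end{pmatrix}, \tag{2.2.177}
\]
and we finally conclude that 𝑓(𝑉†𝑋𝑉) ≤ 𝑉†𝑓(𝑋)𝑉 by examining the upper left
blocks in the operator inequality in (2.2.177). ■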
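The conclusion 𝑓(𝑉†𝑋𝑉) ≤ 𝑉†𝑓(𝑋)𝑉 can be checked numerically for the operator convex function 𝑓(𝑥) = 𝑥². The sketch below assumes NumPy; the dimensions, the random Hermitian 𝑋, and the isometry obtained from a QR decomposition are illustrative choices. For this particular 𝑓, the gap between the two sides equals 𝑉†𝑋𝑃𝑋𝑉 with 𝑃 = 𝟙 − 𝑉𝑉†, which is manifestly positive semi-definite.

```python
import numpy as np

rng = np.random.default_rng(4)
d_small, d_big = 3, 5   # V embeds C^3 isometrically into C^5

G = rng.normal(size=(d_big, d_big)) + 1j * rng.normal(size=(d_big, d_big))
X = (G + G.conj().T) / 2                      # Hermitian operator on C^5

# Reduced QR gives a 5x3 matrix with orthonormal columns, i.e., an isometry.
V, _ = np.linalg.qr(rng.normal(size=(d_big, d_small))
                    + 1j * rng.normal(size=(d_big, d_small)))
Vd = V.conj().T

lhs = (Vd @ X @ V) @ (Vd @ X @ V)             # f(V^dagger X V)
rhs = Vd @ (X @ X) @ V                        # V^dagger f(X) V
gap = rhs - lhs
min_eig = np.linalg.eigvalsh((gap + gap.conj().T) / 2).min()

P = np.eye(d_big) - V @ Vd                    # projection onto the complement of im(V)
gap_formula = Vd @ X @ P @ X @ V
```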
2.2.11 Superoperators
input Hilbert space L(H 𝐴 ) and output Hilbert space L(H𝐵 ). We use the term
superoperator to refer to a linear operator acting on the Hilbert space of linear
operators. Specifically, a superoperator is a function N : L(H 𝐴 ) → L(H𝐵 ) such
that
N(𝛼𝑋 + 𝛽𝑌 ) = 𝛼N(𝑋) + 𝛽N(𝑌 ) (2.2.178)
for all 𝛼, 𝛽 ∈ C and 𝑋, 𝑌 ∈ L(H 𝐴 ). It is often helpful to indicate explicitly the input
and output Hilbert spaces of a superoperator N : L(H 𝐴 ) → L(H𝐵 ) by writing
N 𝐴→𝐵 . We make use of this notation throughout the book.
For every superoperator N_{𝐴→𝐵}, there exists 𝑛 ∈ N, and sets $\{K_i\}_{i=1}^n$ and $\{L_i\}_{i=1}^n$
of operators in L(H_A, H_B) such that
\[
\mathcal{N}_{A\to B}(X_A) = \sum_{i=1}^n K_i X_A L_i^\dagger, \tag{2.2.179}
\]
for all 𝑋 𝐴 ∈ L(H 𝐴 ). This follows as a consequence of the requirement that N 𝐴→𝐵
has a linear action on 𝑋 𝐴 and the isomorphism in (2.2.31). The transpose operation
discussed previously in (2.2.25) is an example of a superoperator. In Chapter 4,
we see that quantum physical evolutions of quantum states, known as quantum
channels, are other examples of superoperators with additional constraints on the
sets $\{K_i\}_{i=1}^n$ and $\{L_i\}_{i=1}^n$.
throughout this book whenever a superoperator acts only on one of the tensor
factors of the underlying Hilbert space of linear operators.
Exercise 2.24
Using (2.2.52), prove that a superoperator N is Hermiticity preserving if and
only if N(𝑋 † ) = N(𝑋) † for every linear operator 𝑋.
for all 𝑋 ∈ L(H 𝐴 ) and 𝑌 ∈ L(H 𝐴′ ), where we recall that ⟨·, ·⟩ is the Hilbert–
Schmidt inner product defined in (2.2.28).
Exercise 2.25
Let N 𝐴→𝐵 be a superoperator represented as in (2.2.179).
1. Prove that the adjoint N† is given by $\mathcal{N}^\dagger(Y) = \sum_{i=1}^n K_i^\dagger Y L_i$ for every linear
operator 𝑌.
2. If N is Hermiticity preserving, then prove that an alternate operator-sum
representation of N is $\mathcal{N}(X) = \sum_{i=1}^n L_i X K_i^\dagger$ for all 𝑋 ∈ L(H_A).
3. Using 1. and 2., prove that if N is Hermiticity preserving, then so is its
adjoint N† .
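Before attempting the proof, the adjoint formula in part 1 can be sanity-checked numerically. The sketch below assumes NumPy and takes the defining property of the adjoint to be ⟨𝑌, N(𝑋)⟩ = ⟨N†(𝑌), 𝑋⟩ with respect to the Hilbert–Schmidt inner product; the random operators and the function names `N`, `N_adj`, and `hs` are illustrative. A numerical check on one instance is of course not a proof.

```python
import numpy as np

rng = np.random.default_rng(5)
d_A, d_B, n = 2, 3, 4

def rand(shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

# Operators in L(H_A, H_B) are d_B x d_A matrices.
K = [rand((d_B, d_A)) for _ in range(n)]
L = [rand((d_B, d_A)) for _ in range(n)]

def N(X):
    """Operator-sum form: N(X) = sum_i K_i X L_i^dagger."""
    return sum(Ki @ X @ Li.conj().T for Ki, Li in zip(K, L))

def N_adj(Y):
    """Candidate adjoint: N^dagger(Y) = sum_i K_i^dagger Y L_i."""
    return sum(Ki.conj().T @ Y @ Li for Ki, Li in zip(K, L))

def hs(A, B):
    """Hilbert-Schmidt inner product <A, B> = Tr[A^dagger B]."""
    return np.trace(A.conj().T @ B)

X, X2, Y = rand((d_A, d_A)), rand((d_A, d_A)), rand((d_B, d_B))
lhs = hs(Y, N(X))
rhs = hs(N_adj(Y), X)

# Linearity of N, as in the definition of a superoperator.
a, b = 0.3 - 1.2j, 2.0 + 0.5j
linear_ok = np.allclose(N(a * X + b * X2), a * N(X) + b * N(X2))
```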
Remark: Observe that if N is trace preserving and unital, and if H 𝐴 and H 𝐵 have finite
dimensions, then we find that 𝑑 𝐴 = 𝑑 𝐵 , by taking the trace on both sides of N( 1 𝐴) = 1 𝐵 . This
means that, in finite dimensions, it is necessary for trace-preserving and unital superoperators to
have the same input and output dimensions.
Exercise 2.26
Let N 𝐴→𝐵 be a trace-preserving superoperator represented as in (2.2.179).
1. Prove that $\sum_{i=1}^n K_i^\dagger L_i = \mathbb{1}_A$.
2. Using 1., show that the adjoint N† is unital. Thus, the adjoint of every
trace-preserving superoperator is unital.
For every superoperator N 𝐴→𝐵 : L(H 𝐴 ) → L(H𝐵 ), we define its induced trace
norm ∥N∥ 1 as
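\begin{align}
\|\mathcal{N}\|_1 &:= \sup\left\{\frac{\|\mathcal{N}(X)\|_1}{\|X\|_1} : X \in \mathrm{L}(\mathcal{H}_A),\ X \neq 0\right\} \tag{2.2.183} \\
&= \sup\{\|\mathcal{N}(X)\|_1 : X \in \mathrm{L}(\mathcal{H}_A),\ \|X\|_1 \leq 1\}. \tag{2.2.184}
\end{align}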
Exercise 2.27
Prove that
\[
\|\mathcal{N}\|_1 = \sup_{U \in \mathrm{L}(\mathcal{H}_B)\ \text{unitary}} \left\|\mathcal{N}^\dagger(U)\right\|_\infty \tag{2.2.186}
\]
for every superoperator N 𝐴→𝐵 , where the optimization is with respect to every
unitary operator 𝑈 acting on H𝐵 .
Theorem 2.21
For every superoperator N 𝐴→𝐵 ,
Limit of a sequence
We start with the definition of the limit of a sequence of real numbers. A sequence
{𝑠𝑛 }𝑛∈N ⊂ R of real numbers is said to have the limit ℓ, written lim𝑛→∞ 𝑠𝑛 = ℓ, if
for all 𝜀 > 0 there exists 𝑛𝜀 ∈ N such that, for all 𝑛 ≥ 𝑛𝜀 , the inequality |𝑠𝑛 − ℓ| < 𝜀
holds.
One can think of the concept of a limit intuitively as a game between two
players, an antagonist and a protagonist. The antagonist goes first, and gets to pick
an arbitrary 𝜀 > 0. The protagonist wins if he reports back an entry in the sequence
{𝑠𝑛 }𝑛 such that |𝑠𝑛 − ℓ| < 𝜀. If the protagonist reports back an entry 𝑠𝑛 such that
|𝑠𝑛 − ℓ| ≥ 𝜀, then the antagonist wins. If the limit exists and is equal to ℓ, then the
protagonist always wins by taking 𝑛 sufficiently large (i.e., larger than 𝑛𝜀 ) and then
reporting back 𝑠𝑛 . If the limit does not exist or if the limit is not equal to ℓ, then the
protagonist cannot necessarily win with the strategy of taking 𝑛 sufficiently large;
in this case, there exists a choice of 𝜀 > 0, such that for all 𝑛𝜀 ∈ N, there exists
𝑛 ≥ 𝑛𝜀 such that |𝑠𝑛 − ℓ| ≥ 𝜀. In this latter case, the choice of 𝜀 > 0 can again be
understood as a strategy of the antagonist.
Let us now recall the concepts of the infimum and supremum of subsets of the
real numbers. Roughly speaking, they are generalizations of the concepts of the
minimum and maximum, respectively, of a set. Formally, let 𝐸 ⊂ R.
• A point 𝑥 ∈ R is a lower bound of 𝐸 if 𝑦 ≥ 𝑥 for all 𝑦 ∈ 𝐸. If 𝑥 is the greatest
such lower bound, then 𝑥 is called the infimum of 𝐸, and we write 𝑥 = inf 𝐸.
• A point 𝑥 ∈ R is an upper bound of 𝐸 if 𝑦 ≤ 𝑥 for all 𝑦 ∈ 𝐸. If 𝑥 is the least
such upper bound, then 𝑥 is called the supremum of 𝐸, and we write 𝑥 = sup 𝐸.
The supremum and infimum may or may not be contained in the subset 𝐸. For
example, let 𝐸 = {1/𝑛}_{𝑛∈N}. Then, sup 𝐸 = 1 ∈ 𝐸, but inf 𝐸 = 0 ∉ 𝐸. As another
example, let 𝐸 = [0, 1). Then sup 𝐸 = 1 ∉ 𝐸 and inf 𝐸 = 0 ∈ 𝐸. If the supremum
is contained in 𝐸, then it is equal to the maximum element of 𝐸. Similarly, if the
infimum is contained in 𝐸, then it is equal to the minimum element of 𝐸.
and
\[
\sup_{X \in S} F(X) := \sup\{F(X) : X \in S\}. \tag{2.3.2}
\]
Turning back to limits, the limit of a sequence need not always exist. A particularly
illuminating example is the sequence $\{r^n\}_{n \in \mathbb{N}}$ for 𝑟 ∈ R. If −1 < 𝑟 < 1, then the
limit exists and is equal to zero. If 𝑟 = 1, then the sequence is constant and the limit
is equal to one. If 𝑟 > 1, then the sequence never converges to a
finite value and so the limit does not exist. We say that the sequence diverges to +∞
in this case. If 𝑟 < −1, then the sequence oscillates and diverges (but it does not
specifically diverge to either +∞ or −∞). If 𝑟 = −1, then the sequence oscillates
back and forth between −1 and +1 and so the limit does not exist.
Given that the limit of a sequence need not always exist, it can be helpful to have
a reasonable substitute for this asymptotic concept that does always exist. Such a
substitute is provided by two quantities: the limit inferior and limit superior of a
sequence. We now define the limit inferior and limit superior, noting that they can
be understood as asymptotic versions of the infimum and supremum just discussed.
• We say that 𝑠 is an asymptotic lower bound on the sequence {𝑠𝑛 }𝑛∈N if for all
𝜀 > 0 there exists 𝑛𝜀 ∈ N such that, for all 𝑛 ≥ 𝑛𝜀 , the inequality 𝑠𝑛 > 𝑠 − 𝜀
holds. The limit inferior is the greatest asymptotic lower bound and is denoted
by
\[
\liminf_{n \to \infty} s_n. \tag{2.3.3}
\]
• The definition of the limit superior is essentially opposite to that of the limit
inferior. We say that 𝑠 is an asymptotic upper bound on the sequence {𝑠𝑛 }𝑛∈N
if for all 𝜀 > 0, there exists 𝑛𝜀 ∈ N, such that for all 𝑛 ≥ 𝑛𝜀 , the inequality
𝑠𝑛 < 𝑠 + 𝜀 holds. The limit superior is the least asymptotic upper bound and is
denoted by
\[
\limsup_{n \to \infty} s_n. \tag{2.3.4}
\]
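As an illustration of the two definitions above, consider the oscillating sequence 𝑠ₙ = (−1)ⁿ(1 + 1/𝑛), which has no limit. The sketch below assumes NumPy and uses the standard equivalent formulation of the limit inferior and limit superior as limits of tail infima and tail suprema (an assumption made here for computational convenience; the text defines them via asymptotic bounds). The sequence and the truncation length are illustrative.

```python
import numpy as np

n = np.arange(1, 2001)
s = (-1.0) ** n * (1.0 + 1.0 / n)   # oscillates between values near -1 and +1

# tail_inf[k] = min(s[k:]) and tail_sup[k] = max(s[k:]) on the truncated sequence.
tail_inf = np.minimum.accumulate(s[::-1])[::-1]
tail_sup = np.maximum.accumulate(s[::-1])[::-1]

# Tail infima increase towards lim inf = -1; tail suprema decrease towards lim sup = +1.
approx_liminf = tail_inf[999]
approx_limsup = tail_sup[999]
```

Since the approximate limit inferior and limit superior differ, the sequence does not converge, even though both quantities are well defined.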
The limit inferior and limit superior always exist, provided that we extend the real line R to
include −∞ and +∞. Furthermore, no asymptotic lower bound on the sequence
can exceed an asymptotic upper bound, implying that the following inequality
holds for every sequence {𝑠𝑛}_{𝑛∈N}:
\[
\liminf_{n \to \infty} s_n \leq \limsup_{n \to \infty} s_n.
\]
If the opposite inequality holds for a sequence {𝑠𝑛}_{𝑛∈N}, then the limit of the
sequence exists and we can write
\[
\lim_{n \to \infty} s_n = \liminf_{n \to \infty} s_n = \limsup_{n \to \infty} s_n.
\]
This collapse is a direct consequence of the definitions of limit, limit inferior, and
limit superior.
\[
\inf_{X \in S} F(X) = \min_{X \in S} F(X) \quad \text{and} \quad \sup_{X \in S} F(X) = \max_{X \in S} F(X). \tag{2.3.7}
\]
A subset 𝐶 of a vector space is called convex if, for all elements 𝑢, 𝑣 ∈ 𝐶 and for
all 𝜆 ∈ [0, 1], we have 𝜆𝑢 + (1 − 𝜆)𝑣 ∈ 𝐶. We often call 𝜆𝑢 + (1 − 𝜆)𝑣 a convex
combination of 𝑢 and 𝑣. More generally, for every set 𝑆 = {𝑣 𝑥 }𝑥∈X of elements of
a real vector space indexed by an alphabet X, and every function 𝑝 : X → [0, 1]
with 𝑝(𝑥) ≥ 0 for all 𝑥 ∈ X and $\sum_{x \in \mathcal{X}} p(x) = 1$, the sum $\sum_{x \in \mathcal{X}} p(x) v_x$ is called a
convex combination of the vectors in 𝑆. The convex hull of 𝑆 is the convex set of
all possible convex combinations of the vectors in 𝑆.
Throughout this book, in the context of convex sets and functions, we consider
the real vector space of Hermitian operators acting on some Hilbert space. Then,
an important example of a convex subset is the set of all positive semi-definite
operators. Indeed, if 𝑋 and 𝑌 are positive semi-definite operators, then 𝜆𝑋 + (1−𝜆)𝑌
is a positive semi-definite operator for all 𝜆 ∈ [0, 1]. From now on, we assume 𝐶
to be a convex subset of the set of Hermitian operators, and we use 𝑋, 𝑌 , and 𝑍 to
denote arbitrary elements of 𝐶.
An element 𝑍 ∈ 𝐶 is called an extreme point of 𝐶 if 𝑍 cannot be written as a non-
trivial convex combination of other vectors in 𝐶. Formally, 𝑍 is an extreme point if
every decomposition of 𝑍 as the convex combination 𝑍 = 𝜆𝑋 + (1 − 𝜆)𝑌 , such that
𝜆 ∈ (0, 1) (so that the decomposition is non-trivial), implies that 𝑋 = 𝑌 = 𝑍. An
important fact is that every compact convex set in a finite-dimensional vector space
is equal to the convex hull of its extreme points.
We now define convex and concave functions.
We mentioned above that the convex hull of a subset 𝑆 of a real vector space is the
convex set consisting of all convex combinations of the vectors in 𝑆. A fundamental
result is that if the underlying vector space has dimension 𝑑, then, in order to obtain
an element in the convex hull of 𝑆, it suffices to take a convex combination of no
more than 𝑑 + 1 elements of 𝑆. If 𝑆 is connected and compact, then no more than 𝑑
elements are required. We state this formally as follows.
The expression above contains both an infimum and a supremum over subsets
𝑆, 𝑆′ ⊆ L(H) of some real-valued function 𝐹 : 𝑆 × 𝑆′ → R.
Expressions such as the one in (2.3.12) arise in the context of two-player
zero-sum games. In such a game, the function 𝐹 (𝑋, 𝑌 ) represents the reward of a
protagonist, who chooses elements 𝑌 ∈ 𝑆′ in order to maximize 𝐹. The antagonist
chooses elements 𝑋 ∈ 𝑆 in order to minimize 𝐹, i.e., to minimize the reward to the
protagonist.⁴ The worst-case scenario for the antagonist is, no matter what element
𝑋 ∈ 𝑆 it chooses, the protagonist chooses the “best” possible element in 𝑆′ for its
benefit, so that the reward is 𝐺 (𝑋) B sup𝑌 ∈𝑆′ 𝐹 (𝑋, 𝑌 ). The optimal reward of
the antagonist in this scenario is thus given by inf 𝑋∈𝑆 𝐺 (𝑋), which is the quantity
in (2.3.12).
On the other hand, the worst-case scenario for the protagonist is, no matter what
element 𝑌 ∈ 𝑆′ it chooses, the antagonist chooses the “best” possible element in 𝑆
for its benefit, so that the reward is $\widetilde{G}(Y) := \inf_{X \in S} F(X, Y)$. The optimal reward
of the protagonist in this scenario is thus given by $\sup_{Y \in S'} \widetilde{G}(Y)$, i.e.,
which always holds. We now prove this formally. Observe that for all 𝑋 ∈ 𝑆, 𝑌 ∈ 𝑆′,
we have that $\widetilde{G}(Y) \leq F(X, Y)$. It then follows that $\sup_{Y \in S'} \widetilde{G}(Y) \leq \sup_{Y \in S'} F(X, Y)$
by applying the definition of supremum. Since this latter inequality holds for all 𝑋 ∈ 𝑆,
the definition of infimum implies that $\sup_{Y \in S'} \widetilde{G}(Y) \leq \inf_{X \in S} \sup_{Y \in S'} F(X, Y)$,
which is precisely the inequality in (2.3.14).
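The inequality just proved is easy to observe on a finite game, where the sets 𝑆 and 𝑆′ are finite and 𝐹 is a payoff matrix. The sketch below assumes NumPy; the random payoff matrix and the 2 × 2 identity example (for which the inequality is strict) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(6)
F = rng.normal(size=(6, 7))      # F[x, y]: rows = antagonist (X), columns = protagonist (Y)

inf_sup = F.max(axis=1).min()    # inf_X sup_Y F(X, Y)
sup_inf = F.min(axis=0).max()    # sup_Y inf_X F(X, Y)

# A game with a strict gap: every row has maximum 1, every column has minimum 0.
F_gap = np.eye(2)
gap_inf_sup = F_gap.max(axis=1).min()
gap_sup_inf = F_gap.min(axis=0).max()
```

The second example shows that the reverse inequality can fail when the players are restricted to a finite, non-convex set of strategies.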
Many proofs that we present in this book require determining when the inequality
opposite to the one in (2.3.14) holds, i.e.,
\[
\inf_{X \in S} \sup_{Y \in S'} F(X, Y) \overset{?}{\leq} \sup_{Y \in S'} \inf_{X \in S} F(X, Y), \tag{2.3.15}
\]
optimal strategies. It is thus important to know under what conditions this reverse
inequality holds.
We now present theorems for two classes of functions that tell us when the
inequality in (2.3.15) holds.
Throughout this book, we are concerned for the most part with discrete probability
distributions, and the following definitions suffice for our needs. A discrete
probability distribution is a function 𝑝 : X → [0, 1] defined on a finite alphabet X
such that 𝑝(𝑥) ≥ 0 for all 𝑥 ∈ X and $\sum_{x \in \mathcal{X}} p(x) = 1$. Formally, we can consider the
alphabet X to be the set of realizations of a discrete random variable 𝑋 : Ω → X
from the space Ω of experimental outcomes, called the sample space, to the set X.
We then write 𝑝 𝑋 to denote the probability distribution of the random variable 𝑋,
i.e., 𝑝 𝑋 (𝑥) ≡ Pr[𝑋 = 𝑥].
The expected value or mean E[𝑋] of a random variable 𝑋 taking values in
X ⊂ R is defined as
\[
\mathbb{E}[X] = \sum_{x \in \mathcal{X}} x\, p_X(x). \tag{2.3.18}
\]
This is a very special case of the more elaborate operator Jensen inequality presented
previously in Theorem 2.16.
As we explain in Chapter 3, observables 𝑂 in quantum mechanics (which
are merely Hermitian operators) generalize random variables, such that their
minimize Tr[𝐵𝑌 ]
subject to Φ† (𝑌 ) ≥ 𝐴, (2.4.2)
𝑌 ≥ 0.
We let
denote the optimal values of the primal and dual SDPs, respectively.
Proof: Let 𝑋 ≥ 0 be primal feasible, and let 𝑌 ≥ 0 be dual feasible. Then the
following holds
Tr[ 𝐴𝑋] ≤ Tr[Φ† (𝑌 ) 𝑋] = Tr[𝑌 Φ(𝑋)] ≤ Tr[𝑌 𝐵]. (2.4.6)
The first inequality follows from the assumption that 𝑌 is dual feasible, so that
we have 𝐴 ≤ Φ† (𝑌 ), and by applying 2. of Lemma 2.14. The equality holds by
definition of the adjoint map Φ† ; see (2.2.182). The last inequality follows from the
assumption that 𝑋 is primal feasible, so that we have Φ(𝑋) ≤ 𝐵, and by applying
2. of Lemma 2.14. Since the inequality holds for all primal feasible 𝑋 and for all
dual feasible 𝑌 , we can take a supremum over the left-hand side of (2.4.6) and an
infimum over the right-hand side of (2.4.6), and we thus arrive at the weak duality
inequality in (2.4.5). ■
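The chain of inequalities in (2.4.6) can be traced numerically for explicit feasible points, without any SDP solver. The sketch below assumes NumPy and a hypothetical instance chosen only so that feasible points are easy to write down: Φ(𝑋) = 𝐾𝑋𝐾† with 𝐾 invertible, 𝐵 = 𝐾𝐾† + 𝟙, primal point 𝑋 = 𝟙/2, and dual point 𝑌 = 𝑡(𝐾𝐾†)⁻¹ with 𝑡 = max{𝜆max(𝐴), 0}, for which Φ†(𝑌) = 𝑡𝟙 ≥ 𝐴.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 3

def rand_c(shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

K = rand_c((d, d))
G = rand_c((d, d))
A = (G + G.conj().T) / 2                   # Hermitian objective operator
B = K @ K.conj().T + np.eye(d)             # positive definite

def Phi(X):
    return K @ X @ K.conj().T

def Phi_adj(Y):
    return K.conj().T @ Y @ K

X = 0.5 * np.eye(d)                        # primal feasible: Phi(X) = KK^dagger / 2 <= B
t = max(np.linalg.eigvalsh(A).max(), 0.0)
Y = t * np.linalg.inv(K @ K.conj().T)      # dual feasible: Phi_adj(Y) = t * identity >= A

v1 = np.trace(A @ X).real                  # Tr[A X]
v2 = np.trace(Phi_adj(Y) @ X).real         # Tr[Phi^dagger(Y) X]
v3 = np.trace(Y @ Phi(X)).real             # Tr[Y Phi(X)]
v4 = np.trace(Y @ B).real                  # Tr[Y B]
```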
Note that the following equalities hold, which are helpful in the discussion below:
The inner infimum with respect to 𝑌 ≥ 0 forces the outer optimization to be with
respect to every feasible point 𝑋 for the primal SDP in (2.4.3). In this sense,
the variable 𝑌 can be thought of as a “Lagrange multiplier”, analogous to Lagrange
multipliers that are used in constrained optimization problems in elementary
calculus. Indeed, suppose that an infeasible 𝑋 ≥ 0 is chosen, meaning that the
constraint Φ(𝑋) ≤ 𝐵 is violated. This means that there exists a non-trivial negative
eigenspace of 𝐵 − Φ(𝑋). Let |𝜑⟩ be a unit vector in this negative eigenspace.
We can then pick 𝑌 = 𝑐|𝜑⟩⟨𝜑| for 𝑐 > 0 and take the limit 𝑐 → ∞, so that
inf𝑌 ≥0 Tr[(𝐵 − Φ(𝑋))𝑌 ] = −∞. So a violation of the constraint Φ(𝑋) ≤ 𝐵 incurs
an infinite cost for the outer optimization with respect to 𝑋 ≥ 0, which is suboptimal.
The constraint Φ(𝑋) ≤ 𝐵 is therefore forced to be satisfied, leading to the equality
in (2.4.10).
If we instead take a supremum over 𝑋 ≥ 0 first and then take an infimum over
𝑌 ≥ 0, it follows that
\[
\inf_{Y \geq 0} \sup_{X \geq 0} \mathcal{L}(\Phi, A, B, X, Y) = \widehat{S}(\Phi, A, B). \tag{2.4.12}
\]
Remark: The nomenclature Slater’s “condition” (rather than “conditions”) is commonly used,
but note that one can check either one of the two sufficient conditions above to determine if
strong duality holds.
𝑌 𝐵 = 𝑌 Φ(𝑋), (2.4.14)
Φ† (𝑌 ) 𝑋 = 𝐴𝑋. (2.4.15)
Proof: On the one hand, suppose that 𝑋 is primal feasible, that 𝑌 is dual feasible,
and that they satisfy (2.4.14)–(2.4.15). Then it is clear by inspecting (2.4.6) that
the inequalities are saturated, thus implying that 𝑋 is primal optimal and 𝑌 is dual
optimal.
On the other hand, suppose that 𝑋 is primal optimal and that 𝑌 is dual optimal.
Then, by this assumption, it follows that Tr[ 𝐴𝑋] = Tr[𝐵𝑌 ] so that the inequalities
in (2.4.6) are saturated. This means that
Tr[(Φ† (𝑌 ) − 𝐴) 𝑋] = 0 (2.4.16)
Tr[𝑌 (𝐵 − Φ(𝑋))] = 0. (2.4.17)
Since Φ†(𝑌) − 𝐴 and 𝑋 are positive semi-definite, the equality in (2.4.16) implies that
(Φ†(𝑌) − 𝐴)𝑋 = 0, which is equivalent to (2.4.15). Similarly, since 𝐵 − Φ(𝑋) and
𝑌 are positive semi-definite, the equality in (2.4.17) implies that 𝑌(𝐵 − Φ(𝑋)) = 0,
which is equivalent to (2.4.14). ■
If the matrices 𝐴 and 𝐵 and the map Φ involved in an SDP are of reasonable size,
then the SDP can be computed efficiently using numerical solvers (specifically, the
time required is polynomial in the size of these objects and polynomial in the inverse
of the numerical accuracy desired). As mentioned earlier, SDPs arise frequently in
quantum information, with some examples appearing in Chapter 6. Furthermore,
SDPs appear in some of the upper bounds for rates of quantum communication
protocols that we consider in Parts II and III.
Exercise 2.28
Consider the following pair of primal and dual optimization problems:
Reasoning analogous to that in Exercise 2.28 can be used to show that the
following pair of optimization problems are also SDPs, equivalent to the ones in
(2.4.1) and (2.4.2):
Exercise 2.29
1. Consider the following SDP in primal form:
2.4.1 SDPs for Spectral and Trace Norm, Maximum and Minimum Eigenvalue
In this section, we provide semi-definite programs for calculating the spectral and
trace norms of Hermitian operators, as well as their largest and smallest eigenvalues.
\[
\widehat{f}(H) := \inf_{t \geq 0} \{t : -t\mathbb{1} \leq H \leq t\mathbb{1}\}. \tag{2.4.25}
\]
The quantities above can be computed via SDPs, and in fact, the following
equality holds
\[
f(H) = \widehat{f}(H) = \|H\|_\infty. \tag{2.4.26}
\]
That is, 𝑓 (𝐻) is equal to the largest singular value of the Hermitian operator 𝐻.
Proof: Given that the optimization in (2.4.24) is a maximization, let us first show
that (2.4.24) can be written in the form of 𝑆(Φ, 𝐴, 𝐵) in (2.4.3). Indeed if we let
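\begin{align}
X &= \begin{pmatrix} X_1 & Z^\dagger \\ Z & X_2 \end{pmatrix}, \qquad A = \begin{pmatrix} H & 0 \\ 0 & -H \end{pmatrix}, \tag{2.4.27} \\
\Phi(X) &= \mathrm{Tr}[X_1 + X_2], \qquad B = 1, \tag{2.4.28}
\end{align}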
then we have that
\[
f(H) = \sup_{X \geq 0} \{\mathrm{Tr}[AX] : \Phi(X) \leq B\}. \tag{2.4.29}
\]
The constraint 𝑋 ≥ 0 implies that 𝑋1 , 𝑋2 ≥ 0. Furthermore, notice that the operator
𝑍 appears neither in the objective function Tr[𝐻 (𝑋1 − 𝑋2 )] nor in the constraint
Tr[𝑋1 + 𝑋2 ] ≤ 1. Thus, the operator 𝑍 plays no role in the optimization, and so we
can simply set 𝑍 = 0, so that
\[
X = \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix}. \tag{2.4.30}
\]
Thus, (2.4.24) is indeed an SDP in primal form.
Now, recall from (2.2.124) that the spectral norm of 𝐻 is given by the maximum
of the absolute values of the eigenvalues of 𝐻. In particular, we can write
∥𝐻 ∥ ∞ = max {|𝜆 max | , |𝜆 min |} , (2.4.31)
where 𝜆 max and 𝜆 min are the maximum and minimum eigenvalues, respectively,
of 𝐻. Note that we always have 𝜆 max ≥ 𝜆 min . Let |𝜙max ⟩ be an eigenvector of 𝐻
It now remains to prove the reverse inequality, namely, the inequality 𝑓(𝐻) ≤
∥𝐻∥∞. To prove this, let us show that $\widehat{f}(H)$, as defined in (2.4.25), is given by the
SDP dual to the one that defines 𝑓(𝐻). In order to do this, we should determine
the map Φ†, which is the adjoint of Φ. Since 𝐵 = 1 and Φ(𝑋) = Tr[𝑋1 + 𝑋2] are
scalars, we take 𝑌 = 𝑡 to be a scalar also. Then, we find that
Plugging this into the standard form of the dual in (2.4.4), we find that
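\begin{align}
\inf_{Y \geq 0} \left\{\mathrm{Tr}[BY] : \Phi^\dagger(Y) \geq A\right\}
&= \inf_{t \geq 0} \left\{t : \begin{pmatrix} t\mathbb{1} & 0 \\ 0 & t\mathbb{1} \end{pmatrix} \geq \begin{pmatrix} H & 0 \\ 0 & -H \end{pmatrix}\right\} \tag{2.4.38} \\
&= \inf_{t \geq 0} \{t : t\mathbb{1} \geq H,\ t\mathbb{1} \geq -H\} \tag{2.4.39} \\
&= \inf_{t \geq 0} \{t : -t\mathbb{1} \leq H \leq t\mathbb{1}\} \tag{2.4.40} \\
&= \widehat{f}(H). \tag{2.4.41}
\end{align}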
Let us now recall property 3. of Lemma 2.14, which states that 𝜆 min 1 ≤
𝐻 ≤ 𝜆max 1. By combining with (2.4.33), we find that 𝜆 max 1 ≤ ∥𝐻 ∥ ∞ 1 and
𝜆 min 1 ≥ − ∥𝐻 ∥ ∞ 1, which implies that
− ∥𝐻 ∥ ∞ 1 ≤ 𝐻 ≤ ∥𝐻 ∥ ∞ 1. (2.4.42)
Thus, we see that ∥𝐻 ∥ ∞ is a feasible choice for 𝑡 in (2.4.40), which implies that
\[
\widehat{f}(H) \leq \|H\|_\infty. \tag{2.4.43}
\]
We proved (2.4.26) by employing clever guesses for primal feasible and dual
feasible points. Doing so is possible in this case because the problem is simple
enough to begin with, and we could apply knowledge from linear algebra to make
these clever guesses. Although it is sometimes possible to make clever guesses and
arrive at analytical solutions like we did above, in many cases it is not possible. In
such cases, it can be helpful to check Slater’s condition in Theorem 2.28 explicitly
in order to see if strong duality holds. So let us do so for the SDPs corresponding to
𝑓(𝐻) and $\widehat{f}(H)$. For the primal SDP in (2.4.24), a strictly feasible point consists
of the choice 𝑋1 = 𝛼𝟙/𝑑 and 𝑋2 = 𝛽𝟙/𝑑 such that 𝛼, 𝛽 > 0 and 𝛼 + 𝛽 < 1, where 𝑑 is
the dimension of 𝟙. Then we clearly have 𝑋1 > 0, 𝑋2 > 0, and Tr[𝑋1 + 𝑋2] < 1,
so that 𝑋1 and 𝑋2 are strictly feasible, as claimed. A feasible point for the dual
consists of the choice 𝑡 = 𝛾 for any 𝛾 ≥ ∥𝐻∥∞. Thus, strong duality holds, further confirming
that 𝑓(𝐻) = $\widehat{f}(H)$, as shown above.
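We now remark about the complementary slackness conditions from Proposition 2.29
for the SDPs corresponding to 𝑓(𝐻) and $\widehat{f}(H)$, which apply to optimal
primal 𝑋 and optimal dual 𝑌. In this case, the conditions reduce to
\begin{align}
t &= t\, \mathrm{Tr}[X_1 + X_2], \tag{2.4.44} \\
\begin{pmatrix} t\mathbb{1} & 0 \\ 0 & t\mathbb{1} \end{pmatrix} \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix}
&= \begin{pmatrix} H & 0 \\ 0 & -H \end{pmatrix} \begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix}, \tag{2.4.45}
\end{align}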
and the latter is the same as the following two separate conditions:
𝑡 𝑋1 = 𝐻 𝑋1 , −𝑡 𝑋2 = 𝐻 𝑋2 . (2.4.46)
If we have prior knowledge about the operator 𝐻 (e.g., that one of its eigenvalues
is non-zero), then we conclude that the optimal 𝑡 ≠ 0, and the condition in (2.4.44)
implies that Tr[𝑋1 + 𝑋2] = 1. In this case, the inequality
constraint in (2.4.24) is saturated at the optimum, and it suffices to optimize over 𝑋1 and 𝑋2 satisfying
the constraint with equality. The conditions in (2.4.46) indicate that the image of
the optimal 𝑋1 should be in the eigenspace of 𝐻 with optimal eigenvalue 𝑡, and the
image of the optimal 𝑋2 should be in the eigenspace of 𝐻 with optimal eigenvalue
−𝑡. Observe that these complementary slackness conditions are consistent with the
choices that we made above.
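The primal and dual certificates discussed above can be written down explicitly from an eigendecomposition. The sketch below assumes NumPy; the random Hermitian 𝐻 is illustrative. It places a rank-one projector on the eigenvector whose eigenvalue has the largest absolute value (in 𝑋1 if that eigenvalue is 𝜆max, in 𝑋2 if it is 𝜆min), takes 𝑡 = ∥𝐻∥∞ as the dual point, and then checks that the primal and dual values coincide and that the conditions (2.4.44) and (2.4.46) hold.

```python
import numpy as np

rng = np.random.default_rng(8)
d = 4
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (G + G.conj().T) / 2

w, V = np.linalg.eigh(H)            # eigenvalues in ascending order
spec = np.abs(w).max()              # spectral norm of the Hermitian operator H

X1 = np.zeros((d, d), dtype=complex)
X2 = np.zeros((d, d), dtype=complex)
if abs(w[-1]) >= abs(w[0]):
    v = V[:, [-1]]
    X1 = v @ v.conj().T             # projector onto the eigenvector of lambda_max
else:
    v = V[:, [0]]
    X2 = v @ v.conj().T             # projector onto the eigenvector of lambda_min

primal_value = np.trace(H @ (X1 - X2)).real
t = spec                            # dual feasible point
dual_feasible = (np.linalg.eigvalsh(t * np.eye(d) - H).min() >= -1e-9
                 and np.linalg.eigvalsh(t * np.eye(d) + H).min() >= -1e-9)

# Complementary slackness: t = t Tr[X1 + X2], t X1 = H X1, and -t X2 = H X2.
slack_trace = np.isclose(t, t * np.trace(X1 + X2).real)
slack_ops = np.allclose(t * X1, H @ X1) and np.allclose(-t * X2, H @ X2)
```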
As a final remark, if 𝐻 is actually positive semi-definite, then the lower bound
constraint in (2.4.25) is unnecessary. Letting 𝑃 be a positive semi-definite operator,
we thus find that
\[
f(P) = \|P\|_\infty = \inf_{t \geq 0} \{t : P \leq t\mathbb{1}\}. \tag{2.4.47}
\]
(Hint: Show that the SDP in (2.4.49) is dual to the one in (2.4.48), and
then prove strong duality.)
Exercise 2.31 SDPs for the Maximum and Minimum Eigenvalue of Hermitian Operators
Let 𝐻 be a Hermitian operator. Prove that the maximum and minimum
eigenvalues of 𝐻, denoted by 𝜆 max (𝐻) and 𝜆 min (𝐻), respectively, have the
following SDP characterizations:
and
(Hint: Use the spectral theorem (Theorem 2.4) and the duality of SDPs.)
is called the anti-symmetric subspace of H^{⊗𝑛}, where sgn(𝜋) is the sign of the permutation 𝜋,
defined as sgn(𝜋) = (−1)^{𝑇(𝜋)}, where 𝑇(𝜋) is the number of transpositions
into which 𝜋 can be decomposed.⁵
The operator
\[
\Pi_{\mathrm{Sym}_n(\mathcal{H})} := \frac{1}{n!} \sum_{\pi \in \mathcal{S}_n} W^\pi \tag{2.5.5}
\]
⁵ A transposition is a permutation that permutes only two elements of the set {1, 2, . . . , 𝑛}.
Any permutation 𝜋 ∈ S𝑛 can be decomposed into a product of transpositions. Although this
decomposition is in general not unique, the parity of the number 𝑇(𝜋) of transpositions into which
𝜋 can be decomposed is unique, so that sgn(𝜋) is well defined.
Exercise 2.32
1. Prove that ΠSym𝑛 (H) and ΠASym𝑛 (H) are projections, as claimed above.
2. Prove that Sym𝑛 (H) and ASym𝑛 (H) are orthogonal subspaces of H ⊗𝑛 by
showing that
ΠSym𝑛 (H) ΠASym𝑛 (H) = 0. (2.5.7)
This implies that ⟨𝜓 𝑠 |𝜓𝑎 ⟩ = 0 for all |𝜓 𝑠 ⟩ ∈ Sym𝑛 (H) and |𝜓𝑎 ⟩ ∈
ASym𝑛 (H).
Exercise 2.33
Let H be a 𝑑-dimensional Hilbert space, 𝑑 ≥ 2. Show that, for 𝑛 = 2,
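\begin{align}
\Pi_{\mathrm{Sym}_2(\mathcal{H})} &= \frac{1}{2}(\mathbb{1}_d \otimes \mathbb{1}_d + F), \tag{2.5.8} \\
\Pi_{\mathrm{ASym}_2(\mathcal{H})} &= \frac{1}{2}(\mathbb{1}_d \otimes \mathbb{1}_d - F), \tag{2.5.9}
\end{align}
where 𝐹 ≔ 𝑊^𝜋 is the representation of the permutation 𝜋 = (1 2), i.e.,
\[
F = \sum_{k,k'=0}^{d-1} |k\rangle\langle k'| \otimes |k'\rangle\langle k|. \tag{2.5.10}
\]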
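The operators in (2.5.8)–(2.5.10) can be constructed explicitly to see what the exercise is asking for. The sketch below assumes NumPy and the illustrative choice 𝑑 = 3; it builds 𝐹 from its definition and checks numerically that the two operators are mutually orthogonal projections. This is a sanity check for one value of 𝑑, not a solution to the exercise.

```python
import numpy as np

d = 3
basis = np.eye(d)
I = np.eye(d * d)

# Swap operator F = sum_{k,k'} |k><k'| (x) |k'><k|.
F = np.zeros((d * d, d * d))
for k in range(d):
    for kp in range(d):
        ket_k, ket_kp = basis[:, [k]], basis[:, [kp]]
        F += np.kron(ket_k @ ket_kp.T, ket_kp @ ket_k.T)

P_sym = (I + F) / 2
P_asym = (I - F) / 2

# F exchanges the two tensor factors: F (|a> (x) |b>) = |b> (x) |a>.
a, b = basis[:, [0]], basis[:, [2]]
swapped = F @ np.kron(a, b)
```

The traces of the two projections, 𝑑(𝑑 + 1)/2 and 𝑑(𝑑 − 1)/2, give the dimensions of the symmetric and anti-symmetric subspaces for 𝑛 = 2.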
each of the 𝑛 Hilbert spaces H corresponds to a quantum system, and each 𝑛 𝑗 tells
us how many of the 𝑛 quantum systems are in the state given by | 𝑗 − 1⟩. (We
formally draw the correspondence between Hilbert spaces and quantum systems in
Chapter 3.) The number of elements in this basis is equal to the number of ways of selecting $n$ elements, with repetition, from a set of $d$ distinct elements. This number is equal to $\binom{d+n-1}{n}$. Consequently, the dimension of $\mathrm{Sym}_n(\mathcal{H})$ is
\[ \dim(\mathrm{Sym}_n(\mathcal{H})) = \binom{d+n-1}{n} = \binom{d+n-1}{d-1}. \tag{2.5.12} \]
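As a numerical sanity check of (2.5.5) and (2.5.12), the following minimal sketch builds the permutation operators explicitly and averages them. The function names `permutation_operator` and `symmetric_projector` are illustrative choices, not notation from the text, and the action of $W^{\pi}$ on basis vectors is assumed to be the one displayed in (2.5.23), with permutations written 0-based.

```python
import itertools
import math
import numpy as np

def permutation_operator(perm, d):
    """Matrix of W^pi on (C^d)^{(x) n}, assuming the convention
    W^pi |i_1, ..., i_n> = |i_{pi(1)}, ..., i_{pi(n)}> (perm is 0-based)."""
    n = len(perm)
    dim = d ** n
    W = np.zeros((dim, dim))
    for idx in itertools.product(range(d), repeat=n):
        out = tuple(idx[perm[k]] for k in range(n))
        row = np.ravel_multi_index(out, (d,) * n)
        col = np.ravel_multi_index(idx, (d,) * n)
        W[row, col] = 1.0
    return W

def symmetric_projector(n, d):
    """Average of W^pi over all n! permutations, as in (2.5.5)."""
    perms = list(itertools.permutations(range(n)))
    return sum(permutation_operator(p, d) for p in perms) / len(perms)

P = symmetric_projector(3, 2)
print(np.allclose(P @ P, P))                  # idempotent, as a projection must be
print(np.trace(P), math.comb(2 + 3 - 1, 3))   # trace equals the dimension in (2.5.12)
```

The construction costs $n!\, d^{n}$ operations, so it is only meant for very small $n$ and $d$.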
Exercise 2.34
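Let $d \geq 2$ and $n = 2$. Show that the basis elements $|n_1, n_2, \dotsc, n_d\rangle$ of $\mathrm{Sym}_2(\mathbb{C}^d)$ are given as follows:
\[ |n_1, n_2, \dotsc, n_d\rangle = |j-1, j-1\rangle \tag{2.5.13} \]
if $n_j = 2$, $n_\ell = 0\ \forall \ell \neq j$, and
\[ |n_1, n_2, \dotsc, n_d\rangle = \frac{1}{\sqrt{2}} ( |j-1, k-1\rangle + |k-1, j-1\rangle ), \tag{2.5.14} \]
if $n_j = n_k = 1$, $k \neq j$ and $n_\ell = 0\ \forall \ell \neq j, k$.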
is called the bosonic Fock space. (Note that Sym0 (H) is the set of complex scalars, i.e.,
Sym0 (H) = C.) It is an infinite-dimensional Hilbert space that is relevant for the study of
quantum optical and other continuous-variable quantum systems.
An important fact that we state without proof (please consult the Bibliographic
Notes in Section 2.6) is that for every 𝑑-dimensional Hilbert space H,
\[ \Pi_{\mathrm{Sym}_n(\mathcal{H})} = \binom{d+n-1}{n} \int \psi^{\otimes n} \, \mathrm{d}\psi, \tag{2.5.16} \]
where the integral on the right-hand side is taken with respect to the Haar measure
over all unit vectors.
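The identity (2.5.16) can be checked approximately by Monte Carlo sampling for $n = 2$. The sketch below assumes the standard fact that normalizing a vector of independent complex Gaussian entries yields a Haar-distributed unit vector; the function names, the sample size, and the seed are arbitrary illustrative choices, and the agreement is only up to statistical error.

```python
import math
import numpy as np

def haar_unit_vectors(num, d, rng):
    """Sample `num` Haar-random unit vectors in C^d (normalized complex Gaussians)."""
    v = rng.normal(size=(num, d)) + 1j * rng.normal(size=(num, d))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def estimate_sym2_projector(d, num=200_000, seed=0):
    """Monte Carlo estimate of binom(d+1, 2) * integral of psi^{(x) 2} d(psi)."""
    rng = np.random.default_rng(seed)
    psi = haar_unit_vectors(num, d, rng)
    # Each row of `pair` holds the components of |psi> (x) |psi>.
    pair = np.einsum("si,sj->sij", psi, psi).reshape(num, d * d)
    avg = pair.T @ pair.conj() / num      # sample mean of the rank-one projections
    return math.comb(d + 1, 2) * avg

d = 2
F = np.zeros((d * d, d * d))
for k in range(d):
    for kp in range(d):
        F[k * d + kp, kp * d + k] = 1.0   # swap operator from (2.5.10)
exact = (np.eye(d * d) + F) / 2           # projection from (2.5.8)
print(np.max(np.abs(estimate_sym2_projector(d) - exact)))  # small, shrinking with `num`
```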
Remark: The measure d𝜓 is also called the Fubini-Study measure. A concrete coordinate
representation of the measure can be obtained by using the following parameterization of every
unit vector |𝜓⟩ in a 𝑑-dimensional Hilbert space H:
\[ |\psi\rangle = \sum_{k=0}^{d-1} r_k \mathrm{e}^{\mathrm{i}\varphi_k} |k\rangle, \tag{2.5.17} \]
(Please consult the Bibliographic Notes in Section 2.6 for details.) In the case 𝑑 = 2, we have that
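\[ \mathrm{d}\psi = \frac{1}{2\pi} \cos\frac{\theta_1}{2} \sin\frac{\theta_1}{2} \, \mathrm{d}\theta_1 \, \mathrm{d}\varphi_1 \quad (d = 2). \tag{2.5.22} \]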
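We often consider the case that the Hilbert space $\mathcal{H}$ is a tensor product of two Hilbert spaces, i.e., $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B \equiv \mathcal{H}_{AB}$, with $\mathcal{H}_A$ a $d_A$-dimensional Hilbert space and $\mathcal{H}_B$ a $d_B$-dimensional Hilbert space. As we have seen above, if $\{|i\rangle_A\}_{i=0}^{d_A-1}$ is an orthonormal basis for $\mathcal{H}_A$ and $\{|j\rangle_B\}_{j=0}^{d_B-1}$ is an orthonormal basis for $\mathcal{H}_B$, then $\{|i,j\rangle_{AB} \equiv |i\rangle_A \otimes |j\rangle_B : 0 \leq i \leq d_A - 1,\ 0 \leq j \leq d_B - 1\}$ is an orthonormal basis for $\mathcal{H}_{AB}$. In this case, if we consider the $n$-fold tensor product $\mathcal{H}_{AB}^{\otimes n}$, then the unitary representation $\{W^{\pi}_{(AB)^n}\}_{\pi \in S_n}$ defined in (2.5.1) acts as follows:
\begin{align}
& W^{\pi}_{(AB)^n} ( |i_1, j_1\rangle_{A_1B_1} \otimes |i_2, j_2\rangle_{A_2B_2} \otimes \cdots \otimes |i_n, j_n\rangle_{A_nB_n} ) \\
& \quad = |i_{\pi(1)}, j_{\pi(1)}\rangle_{A_1B_1} \otimes |i_{\pi(2)}, j_{\pi(2)}\rangle_{A_2B_2} \otimes \cdots \otimes |i_{\pi(n)}, j_{\pi(n)}\rangle_{A_nB_n}, \tag{2.5.23}
\end{align}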
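where $\{W^{\pi}_{A^n}\}_{\pi \in S_n}$ and $\{W^{\pi}_{B^n}\}_{\pi \in S_n}$ are the unitary representations of $S_n$ acting on $\mathcal{H}_A^{\otimes n}$ and $\mathcal{H}_B^{\otimes n}$, respectively. We can thus write the projection onto $\mathrm{Sym}_n(\mathcal{H}_A \otimes \mathcal{H}_B)$ as
\[ \Pi_{\mathrm{Sym}_n(\mathcal{H}_A \otimes \mathcal{H}_B)} = \frac{1}{n!} \sum_{\pi \in S_n} W^{\pi}_{(AB)^n} \equiv \frac{1}{n!} \sum_{\pi \in S_n} W^{\pi}_{A^n} \otimes W^{\pi}_{B^n}. \tag{2.5.25} \]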
A proof of the operator Jensen inequality (Theorem 2.16) was given by Hansen
and Pedersen (2003). In presenting the implication 1. ⇒ 3. of Theorem 2.16, we
followed the proof given by Fujii et al. (2004, Theorem 3).
The notation ∥·∥⋄ for the quantity on the right-hand side of (2.2.187) was
introduced by Kitaev (1997), and it is known as a completely bounded trace norm
in the mathematics literature; see, for example, (Paulsen, 2003). The result in
(2.2.188) is due to Smith (1983) (see Theorem 2.10 therein), but it can also be
found in (Kitaev, 1997; Aharonov et al., 1998). For a proof of (2.2.190), see
Theorem 3.51 in (Watrous, 2018), which also contains several more properties of
the diamond norm.
For an introduction to real analysis, see (Rudin, 1976).
For an introduction to convex analysis, see (Rockafellar, 1970; Boyd and
Vandenberghe, 2004), and for a proof of the Fenchel–Eggleston–Carathéodory
theorem, see (Eggleston, 1958; Rockafellar, 1970).
Sion’s minimax theorem (Theorem 2.24) is due to Sion (1958), and it is a
generalization of a minimax theorem of von Neumann (1928). A short proof of
Sion’s minimax theorem can be found in (Komiya, 1988). The minimax theorem in
Theorem 2.25 was presented by Mosonyi and Hiai (2011).
For an introduction to probability theory, see (Feller, 1968; Ross, 2019). Proofs
of Markov’s inequality (2.3.20) and Jensen’s inequality (2.3.21) can be found in,
e.g., (Fristedt and Gray, 1997).
For further details on semi-definite programming, see Vandenberghe and Boyd
(1996); Watrous (2018). Various polynomial-time algorithms for solving semi-
definite programs were developed by Khachiyan (1980); Arora et al. (2005); Arora
and Kale (2007); Arora et al. (2012); Lee et al. (2015). A proof of Slater’s Theorem
(Theorem 2.28) can be found in (Boyd and Vandenberghe, 2004, Section 5.3.2).
For further details about the symmetric subspace of a tensor product of finite-
dimensional Hilbert spaces, as well as for a proof of (2.5.16), see (Harrow, 2013)
(see also Bengtsson and Zyczkowski (2017, Section 12.7)). Further details about
the Fubini-Study measure d𝜓 introduced in (2.5.16) and elaborated upon in the
remark immediately below it may be found in (Bengtsson and Zyczkowski, 2017,
Chapter 4).
2.7 Problems
1. Prove that a linear operator 𝑋 ∈ L(H) is positive semi-definite if and only if it can be
written as 𝑋 = 𝑌 †𝑌 for some 𝑌 ∈ L(H, H′).
2. Prove that the columns of every isometry form an orthonormal set of vectors. Similarly,
prove that the rows and columns of every unitary operator form orthonormal sets of
vectors. (Hint: Consider using the expressions in (2.2.11).)
3. Let 𝑋 ∈ L(H 𝐴 ) and 𝑌 ∈ L(H𝐵 ) be normal operators, and consider their so-called
Kronecker sum:
𝑋 ⊕K 𝑌 B 𝑋 ⊗ 1 𝐵 + 1 𝐴 ⊗ 𝑌 . (2.7.1)
Prove that spec(𝑋 ⊕K 𝑌 ) = {𝜆 + 𝜇 : 𝜆 ∈ spec(𝑋), 𝜇 ∈ spec(𝑌 )}. Also prove that the
associated eigenvectors are of the form |𝜓⟩ ⊗ |𝜙⟩, where |𝜓⟩ is an eigenvector of 𝑋 and
|𝜙⟩ is an eigenvector of 𝑌 .
4. The Hadamard product, also known as the Schur product, of two linear operators
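$X, Y \in \mathrm{L}(\mathbb{C}^d)$, with $d \geq 2$, is defined to be the element-wise product of $X$ and $Y$: if $X = \sum_{i,j=0}^{d-1} X_{i,j} |i\rangle\langle j|$ and $Y = \sum_{i,j=0}^{d-1} Y_{i,j} |i\rangle\langle j|$, then
\[ X * Y := \sum_{i,j=0}^{d-1} X_{i,j} Y_{i,j} |i\rangle\langle j|. \tag{2.7.2} \]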
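\[ \operatorname{diag}(\langle\psi|) := \sum_{i=0}^{d-1} \alpha_i |i\rangle\langle i|, \qquad \operatorname{diag}(|\phi\rangle) := \sum_{j=0}^{d-1} \beta_j |j\rangle\langle j|. \tag{2.7.4} \]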
(c) Prove that the Hadamard product of two positive semi-definite operators is positive
semi-definite.
(b) Using (a), prove that $\{|\psi_j\rangle\}_{j=1}^{d}$ is a basis for $\mathbb{C}^d$. In other words, prove that every vector $|\phi\rangle \in \mathbb{C}^d$ can be written as a unique linear combination of the vectors $|\psi_j\rangle$. We thus have that every set of $d$ linearly independent vectors in $\mathbb{C}^d$ is a basis for $\mathbb{C}^d$.
(c) Prove that if $\sum_{j=1}^{d} |\psi_j\rangle\langle\psi_j| = \mathbb{1}_d$, then $\{|\psi_j\rangle\}_{j=1}^{d}$ is an orthonormal basis for $\mathbb{C}^d$. By combining this result with the result of Exercise 2.2, we have that a linearly independent set $\{|\psi_j\rangle\}_{j=1}^{d}$ of vectors in $\mathbb{C}^d$ is an orthonormal basis if and only if $\sum_{j=1}^{d} |\psi_j\rangle\langle\psi_j| = \mathbb{1}_d$.
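6. Let $\{B_j\}_{j=1}^{d^2}$ be an orthonormal basis for $\mathrm{L}(\mathbb{C}^d)$, with $d \geq 2$.
(a) Prove that
\[ \sum_{j=1}^{d^2} B_j \otimes \overline{B_j} = \Gamma_d, \tag{2.7.7} \]
where $\overline{B_j}$ denotes the entrywise complex conjugate of $B_j$ in the standard basis, and we recall that $\Gamma_d = |\Gamma_d\rangle\langle\Gamma_d| = \sum_{i,j=0}^{d-1} |i,i\rangle\langle j,j|$; see (2.2.34). Similarly, prove that
\[ \sum_{j=1}^{d^2} B_j^{\dagger} \otimes B_j = F, \tag{2.7.8} \]
where we recall that $F = \sum_{i,j=0}^{d-1} |i,j\rangle\langle j,i|$; see (2.5.10).
(Hint: Start by verifying that $\{\overline{B_j}\}_{j=1}^{d^2}$ is an orthonormal basis for $\mathrm{L}(\mathbb{C}^d)$. Then, use the fact that every linear operator $Z \in \mathrm{L}(\mathbb{C}^d \otimes \mathbb{C}^d)$ can be written as $Z = \sum_{j,k=1}^{d^2} c_{j,k} B_j \otimes \overline{B_k}$ for some coefficients $c_{j,k} \in \mathbb{C}$.)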
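\[ \sum_{j=1}^{d^2} B_j X B_j^{\dagger} = \mathrm{Tr}[X] \mathbb{1}_d. \tag{2.7.9} \]
(c) Prove that $\{\operatorname{vec}(B_j)\}_{j=1}^{d^2}$ and $\{(B_j \otimes \mathbb{1}_d)|\Gamma_d\rangle\}_{j=1}^{d^2}$ are orthonormal bases for $\mathbb{C}^d \otimes \mathbb{C}^d$.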
7. For all 𝑑 ≥ 2, construct a basis for L(C𝑑 ) that consists entirely of density operators.
(Hint: Consider using the eigenvectors of the orthonormal basis of Hermitian operators
defined in (2.2.45)–(2.2.47).)
8. Let $\{|\psi_j\rangle\}_{j=1}^{d}$ be a set of linearly independent, normalized, but non-orthogonal vectors in $\mathbb{C}^d$, with $d \geq 2$. We would like to transform these vectors into a new set $\{|\phi_j\rangle\}_{j=1}^{d}$ of orthonormal vectors via an invertible linear operator $X$, such that $|\phi_j\rangle = X|\psi_j\rangle$ for all $j \in \{1, 2, \dotsc, d\}$.
(a) Prove that the operator $S$ defined as
\[ S := \sum_{j=1}^{d} |\psi_j\rangle\langle\psi_j| \tag{2.7.10} \]
is invertible and positive definite. (Hint: Write $S$ in terms of the operator $T$ defined in (2.7.6).)
(b) Let
\[ |\phi_j\rangle := S^{-\frac{1}{2}} |\psi_j\rangle \tag{2.7.11} \]
for all $j \in \{1, 2, \dotsc, d\}$. Prove that $\{|\phi_j\rangle\}_{j=1}^{d}$ is an orthonormal basis for $\mathbb{C}^d$. (Hint: See problem 5.(c).) Also, prove that $\langle\phi_i|\psi_j\rangle = \langle i-1|G^{\frac{1}{2}}|j-1\rangle$ for all $i, j \in \{1, 2, \dotsc, d\}$, where $G := T^{\dagger}T$ and $T := \sum_{j=1}^{d} |\psi_j\rangle\langle j-1|$.
(c) Let us now show that the vectors defined in (2.7.11) are optimal with respect to the Euclidean norm, in the following sense:
\begin{align}
& \inf_{X} \left\{ \sum_{j=1}^{d} \left\| |\psi_j\rangle - |\phi_j\rangle \right\|_2^2 : |\phi_j\rangle = X|\psi_j\rangle,\ \langle\phi_i|\phi_j\rangle = \delta_{i,j}\ \forall\, 1 \leq i, j \leq d \right\} \tag{2.7.12} \\
& \quad = \sum_{j=1}^{d} \left\| |\psi_j\rangle - S^{-\frac{1}{2}} |\psi_j\rangle \right\|_2^2, \tag{2.7.13}
\end{align}
iii. Prove that the solution to the optimization problem given by (2.7.15) is $U = \mathbb{1}_d$, implying that the optimal $X$ in (2.7.12) is indeed $S^{-\frac{1}{2}}$. (Hint: Use Proposition 2.10.)
(Bibliographic Note: The vectors |𝜙 𝑗 ⟩ defined in (2.7.11) are known as the symmetric
orthogonalization of the original vectors |𝜓 𝑗 ⟩, and this construction is attributed to
Löwdin (1950); see also (Löwdin, 1970). An alternate proof of the optimality of this
construction, as worked out in part (c) of this problem, can be found in (Mayer, 2002).)
9. For the case 𝑑 = 2 and 𝑛 = 2, verify the equalities given by (2.5.5) and (2.5.16) by
making use of (2.5.22).
10. Prove that the right-hand side of (2.5.5) is indeed the projection onto $\mathrm{Sym}_n(\mathcal{H})$ by showing that
\[ \sum_{\substack{n_1, n_2, \dotsc, n_d \geq 0, \\ \sum_{j=1}^{d} n_j = n}} |n_1, n_2, \dotsc, n_d\rangle\langle n_1, n_2, \dotsc, n_d| = \Pi_{\mathrm{Sym}_n(\mathcal{H})}. \tag{2.7.16} \]
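The symmetric orthogonalization in Problem 8 is easy to try numerically. The following is a minimal sketch, not a solution to the problem: it stores the vectors $|\psi_j\rangle$ as the columns of a matrix (which is then the operator $T$), forms $S = TT^{\dagger}$, and applies $S^{-1/2}$ computed from an eigendecomposition. The function name `lowdin_orthogonalize` and the random test vectors are illustrative assumptions.

```python
import numpy as np

def lowdin_orthogonalize(psi):
    """Columns of `psi` are linearly independent vectors |psi_j>.
    Returns the matrix whose columns are |phi_j> = S^{-1/2} |psi_j>, as in (2.7.11)."""
    S = psi @ psi.conj().T                       # S = sum_j |psi_j><psi_j|, see (2.7.10)
    evals, evecs = np.linalg.eigh(S)             # S is positive definite here
    S_inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.conj().T
    return S_inv_sqrt @ psi

rng = np.random.default_rng(3)
psi = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
psi /= np.linalg.norm(psi, axis=0)               # normalize each column
phi = lowdin_orthogonalize(psi)
print(np.allclose(phi.conj().T @ phi, np.eye(3)))   # the |phi_j> are orthonormal
overlap = phi.conj().T @ psi                        # matrix of <phi_i|psi_j>
print(np.allclose(overlap, overlap.conj().T))       # Hermitian, consistent with G^{1/2}
```

Unlike Gram–Schmidt, this procedure treats all input vectors on an equal footing, which is what the optimality statement in part (c) formalizes.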
Chapter 3: Quantum States and Measurements
1. Qubit systems: The qubit is perhaps the most fundamental quantum system and
is the quantum analogue of the (classical) bit. Every physical system with two
distinct degrees of freedom obeying the laws of quantum mechanics can be
considered a qubit system. The Hilbert space associated with a qubit system is
$\mathbb{C}^2$, whose standard orthonormal basis is denoted by {|0⟩, |1⟩}. Three common
ways of physically realizing qubit systems are as follows:
(a) The two spin states of a spin-1/2 particle.
(b) Two distinct energy levels of an atom, such as the ground state and one of
the excited states.
(c) Clockwise and counter-clockwise directions of current flow in a supercon-
ducting electronic circuit.
2. Qutrit systems: A qutrit system is a quantum system consisting of three distinct
physical degrees of freedom. The Hilbert space of a qutrit is C3 , with the
standard orthonormal basis denoted by {|0⟩, |1⟩, |2⟩}. Qutrit systems are less
commonly considered than qubit systems for implementations, although one
important example of an implementation of a qutrit system occurs in quantum
optical systems, which we discuss below. Like qubit systems, qutrit systems
can also be physically realized using, for example, the spin states of a spin-1
atom or three distinct energy levels of an atom.
3. Qudit systems: A qudit system is a quantum system with 𝑑 distinct degrees
of freedom and is described by the Hilbert space C𝑑 , with the standard
orthonormal basis denoted by {|0⟩, |1⟩, . . . , |𝑑 − 1⟩}. The spin states of every
spin- 𝑗 atom can be used to realize a qudit system with 𝑑 = 2 𝑗 + 1. Another
physical realization of a qudit system is with the 𝑑 distinct energy levels of an
atom.
4. Quantum optical systems: An important quantum system, particularly for the
implementation of many quantum communication protocols, is a quantum
Exercise 3.1
Prove that the set of quantum states is a convex set. (Recall the definition of a
convex set from Section 2.3.3.) In other words, prove that for every alphabet $\mathcal{X}$ and set $\{\rho_x\}_{x \in \mathcal{X}}$ of quantum states, along with every probability distribution $p : \mathcal{X} \to [0, 1]$, the following convex combination is a quantum state:
\[ \rho = \sum_{x \in \mathcal{X}} p(x) \rho_x. \tag{3.2.1} \]
The extremal points in the convex set of quantum states are called pure states. A
pure state is a rank-one projection onto a unit vector in the Hilbert space. Concretely,
pure states are of the form |𝜓⟩⟨𝜓| where |𝜓⟩ ∈ H is a normalized vector. For
convenience, we sometimes denote |𝜓⟩⟨𝜓| simply as 𝜓, and refer to the unit vector
$|\psi\rangle$ as a state vector. Since every element of a compact convex set in a finite-dimensional space can be written as a convex combination of the extremal points in the set, every quantum state $\rho$ that is not a pure state can be written as
\[ \rho = \sum_{x \in \mathcal{X}} p(x) |\psi_x\rangle\langle\psi_x| \tag{3.2.2} \]
for some set {|𝜓𝑥 ⟩}𝑥∈X of state vectors defined with respect to a finite alphabet X,
where 𝑝 : X → [0, 1] is a probability distribution.
Exercise 3.2
Prove that a quantum state $\rho$ is pure if and only if $\rho^2 = \rho$. More generally, prove that $\rho$ is pure if and only if $\mathrm{Tr}[\rho^2] = 1$. The quantity $\mathrm{Tr}[\rho^2]$ is known as the purity of $\rho$.
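To make the purity in Exercise 3.2 concrete, here is a small sketch that evaluates $\mathrm{Tr}[\rho^2]$ for a pure qubit state and for the maximally mixed qubit state. The helper name `purity` and the two example states are illustrative choices.

```python
import numpy as np

def purity(rho):
    """Return Tr[rho^2] for a density operator given as a square array."""
    return float(np.real(np.trace(rho @ rho)))

psi = np.array([1, 1j]) / np.sqrt(2)       # a unit vector in C^2
pure = np.outer(psi, psi.conj())           # the pure state |psi><psi|
mixed = np.eye(2) / 2                      # the maximally mixed qubit state

print(purity(pure))    # equals 1 up to rounding
print(purity(mixed))   # equals 1/2, the smallest possible value for d = 2
```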
A state 𝜌 that is not pure is called a mixed state, because it can be thought of as
arising from the lack of knowledge of which pure state from the set {|𝜓𝑥 ⟩}𝑥∈X in
(3.2.2) the system has been prepared. Note that the decomposition in (3.2.2), of a
quantum state into pure states, is generally not unique.
A state $\rho$ is called maximally mixed if the set $\{|\psi_x\rangle\}_{x \in \mathcal{X}}$ in (3.2.2) consists of $d$ orthonormal state vectors and the probability distribution $\{p(x)\}_{x \in \mathcal{X}}$ is uniform (i.e., $p(x) = \frac{1}{d}$ for all $x \in \mathcal{X}$). In this case, it follows that
\[ \rho = \frac{\mathbb{1}_d}{d} =: \pi_d, \tag{3.2.3} \]
as a consequence of Exercise 2.2. The state $\pi_d$ is called maximally mixed because it corresponds to having the most uncertainty about which state from the set $\{|\psi_k\rangle\}_{k=1}^{d}$
the system is in. This uncertainty can be quantified by using quantum entropy, and
in Chapter 7, we find that the maximally mixed state 𝜋 𝑑 has the largest entropy
among all states of a finite-dimensional system of dimension 𝑑, thus justifying the
term “maximally mixed.”
Now, let us recall the orthonormal basis of Hermitian operators defined in (2.2.44)–(2.2.47). In quantum information, it is common to scale these operators by $\sqrt{d}$, where $d$ is the dimension, so that we have an orthogonal basis $\{S_k^{(d)}\}_{k=0}^{d^2-1}$
coherence vector, of 𝜌; please see the Bibliographic Notes (Section 3.4) for more
information on this terminology.
Exercise 3.3
Let 𝜌 be the quantum state represented as in (3.2.4).
1. Verify that Tr[𝜌] = 1.
2. Prove that $\rho$ is pure if and only if $\sum_{k=1}^{d^2-1} r_k^2 = d - 1$.
The fact that Tr[𝜌] = 1 follows from the fact that 𝑋, 𝑌 , and 𝑍 are traceless operators,
while Tr[1] = 2. The condition for 𝜌 to be positive semi-definite is left to the
following exercise.
Exercise 3.4
Show that the positive semi-definiteness of every qubit state $\rho$, as represented in (3.2.5), is equivalent to $r_1^2 + r_2^2 + r_3^2 \leq 1$.
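The condition in Exercise 3.4 can be explored numerically. The sketch below assumes that (3.2.5) is the usual Bloch representation $\rho = \frac{1}{2}(\mathbb{1} + r_1 X + r_2 Y + r_3 Z)$ with the standard Pauli matrices; that equation is not reproduced here, so this form and the helper names `bloch_to_operator` and `is_psd` are assumptions made for illustration.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_to_operator(r):
    """Assumed Bloch form: (1/2)(1 + r1 X + r2 Y + r3 Z)."""
    r1, r2, r3 = r
    return 0.5 * (np.eye(2) + r1 * X + r2 * Y + r3 * Z)

def is_psd(op, tol=1e-12):
    """Check positive semi-definiteness of a Hermitian operator via its eigenvalues."""
    return bool(np.all(np.linalg.eigvalsh(op) >= -tol))

inside = bloch_to_operator((0.3, 0.4, 0.5))    # |r|^2 = 0.50 <= 1
surface = bloch_to_operator((0.6, 0.0, 0.8))   # |r|^2 = 1, a pure state
outside = bloch_to_operator((0.9, 0.9, 0.0))   # |r|^2 = 1.62 > 1
print(is_psd(inside), is_psd(surface), is_psd(outside))
```

The eigenvalues of the assumed form are $\frac{1}{2}(1 \pm \|r\|_2)$, which is the observation the exercise asks the reader to establish.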
Figure 3.1: The quantum states in $\mathrm{D}(\mathbb{C}^2)$ of every qubit system can be represented as a point in the so-called Bloch ball. All pure states lie on the surface of the Bloch ball, which is known as the Bloch sphere. Shown are the basis state vectors $|0\rangle$ and $|1\rangle$, corresponding to the Bloch vectors $(0, 0, 1)$ and $(0, 0, -1)$, respectively. The superposition state vectors $|\pm\rangle := \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$ correspond to the Bloch vectors $(\pm 1, 0, 0)$, and the superposition state vectors $|\pm \mathrm{i}\rangle := \frac{1}{\sqrt{2}}(|0\rangle \pm \mathrm{i}|1\rangle)$ correspond to the Bloch vectors $(0, \pm 1, 0)$.
The joint state of two distinct quantum systems 𝐴 and 𝐵 is described by a bipartite
quantum state 𝜌 𝐴𝐵 ∈ D(H 𝐴 ⊗ H𝐵 ). For brevity, the joint Hilbert space H 𝐴 ⊗ H𝐵
of the composite system 𝐴𝐵 is denoted by H 𝐴𝐵 .
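Let $\{|i\rangle_A\}_{i=0}^{d_A-1}$ and $\{|j\rangle_B\}_{j=0}^{d_B-1}$ be orthonormal bases for $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively. Then,
\[ \{ |i\rangle_A \otimes |j\rangle_B : 0 \leq i \leq d_A - 1,\ 0 \leq j \leq d_B - 1 \} \tag{3.2.7} \]
is an orthonormal basis for $\mathcal{H}_{AB}$. For brevity, we often write $|i, j\rangle_{AB}$ instead of $|i\rangle_A \otimes |j\rangle_B$. Every state vector $|\psi\rangle_{AB} \in \mathcal{H}_{AB}$ can thus be written as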
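\[ |\psi\rangle_{AB} = \sum_{i=0}^{d_A-1} \sum_{j=0}^{d_B-1} \langle i, j|\psi\rangle_{AB} \, |i, j\rangle_{AB}. \tag{3.2.8} \]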
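where each Schmidt coefficient $\lambda_k$ is strictly positive and they all satisfy $\sum_{k=1}^{r} \lambda_k = 1$, $\{|e_k\rangle_A\}_{k=1}^{r}$ and $\{|f_k\rangle_B\}_{k=1}^{r}$ are orthonormal sets of vectors in $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively, and $r = \operatorname{rank}(X)$, where $X \in \mathrm{L}(\mathcal{H}_A, \mathcal{H}_B)$ is defined as $\langle j|_B X |i\rangle_A = \langle i, j|\psi\rangle_{AB}$ for all $0 \leq i \leq d_A - 1$ and $0 \leq j \leq d_B - 1$.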
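In practice, a Schmidt decomposition can be obtained from a singular value decomposition of the coefficient matrix. The sketch below assumes the form $|\psi\rangle_{AB} = \sum_{k=1}^{r} \sqrt{\lambda_k}\, |e_k\rangle_A \otimes |f_k\rangle_B$, consistent with the use of (3.2.9) later in this section, and works with the matrix $M$ with entries $M_{i,j} = \langle i, j|\psi\rangle$ (the transpose of the operator $X$ above, which has the same rank). The function name and the tolerance are illustrative.

```python
import numpy as np

def schmidt_decomposition(psi, dA, dB, tol=1e-12):
    """Return (lam, e, f) with psi = sum_k sqrt(lam[k]) * kron(e[k], f[k])."""
    M = np.asarray(psi).reshape(dA, dB)          # M[i, j] = <i, j|psi>
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    keep = s > tol                               # discard numerically zero singular values
    lam = s[keep] ** 2                           # Schmidt coefficients lambda_k
    e = U[:, keep].T                             # rows are the vectors |e_k>_A
    f = Vh[keep, :]                              # rows are the vectors |f_k>_B
    return lam, e, f

rng = np.random.default_rng(1)
psi = rng.normal(size=6) + 1j * rng.normal(size=6)
psi /= np.linalg.norm(psi)                       # a random state vector in C^2 (x) C^3
lam, e, f = schmidt_decomposition(psi, 2, 3)
rebuilt = sum(np.sqrt(l) * np.kron(a, b) for l, a, b in zip(lam, e, f))
print(np.allclose(rebuilt, psi), lam.sum())      # reconstruction holds; coefficients sum to 1
```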
More generally, recall from Chapter 2 that we can define the orthonormal bases
{|𝑖⟩⟨𝑖′ | 𝐴 : 0 ≤ 𝑖, 𝑖′ ≤ 𝑑 𝐴 − 1}, {| 𝑗⟩⟨ 𝑗 ′ | 𝐵 : 0 ≤ 𝑗, 𝑗 ′ ≤ 𝑑 𝐵 − 1}, (3.2.10)
for L(H 𝐴 ) and L(H𝐵 ), respectively. Then, the set
{|𝑖, 𝑗⟩⟨𝑖′, 𝑗 ′ | 𝐴𝐵 ≡ |𝑖⟩⟨𝑖′ | 𝐴 ⊗| 𝑗⟩⟨ 𝑗 ′ | 𝐵 : 0 ≤ 𝑖, 𝑖′ ≤ 𝑑 𝐴 −1, 0 ≤ 𝑗, 𝑗 ′ ≤ 𝑑 𝐵 −1} (3.2.11)
is an orthonormal basis for L(H 𝐴𝐵 ). It follows that every mixed state 𝜌 𝐴𝐵 ∈
D(H 𝐴𝐵 ) can be written as
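\[ \rho_{AB} = \sum_{i,i'=0}^{d_A-1} \sum_{j,j'=0}^{d_B-1} \langle i, j|\rho_{AB}|i', j'\rangle \, |i, j\rangle\langle i', j'|_{AB}. \tag{3.2.12} \]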
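where each coefficient $\lambda_k$ is strictly positive, $\{E_A^k\}_{k=1}^{r}$ and $\{F_B^k\}_{k=1}^{r}$ are orthonormal sets of linear operators acting on $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively, and $r = \operatorname{rank}(M)$, where $M \in \mathrm{L}(\mathcal{H}_A \otimes \mathcal{H}_A, \mathcal{H}_B \otimes \mathcal{H}_B)$ is defined by $\langle j, \ell|_{BB} M |i, k\rangle_{AA} = \langle i, j|\rho_{AB}|k, \ell\rangle$ for all $0 \leq i, k \leq d_A - 1$ and $0 \leq j, \ell \leq d_B - 1$.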
Figure 3.2: The partial trace superoperator (see Definition 3.2) is the mathematical representation of physically discarding a subsystem of a composite quantum system. In other words, given a bipartite state $\rho_{AB}$ for the two quantum systems $A$ and $B$, the partial trace $\mathrm{Tr}_B$ allows us to determine the quantum state $\rho_A \equiv \mathrm{Tr}_B[\rho_{AB}]$ of system $A$ when we do not have access to system $B$ (left), and $\mathrm{Tr}_A$ allows us to determine the quantum state $\rho_B \equiv \mathrm{Tr}_A[\rho_{AB}]$ of system $B$ when we do not have access to system $A$ (right).
Recall from Section 2.2 that the trace of a linear operator 𝑋 acting on a 𝑑-dimensional
Hilbert space can be written as
\[ \mathrm{Tr}[X] = \sum_{i=0}^{d-1} \langle i|X|i\rangle, \tag{3.2.16} \]
where $\{|i\rangle\}_{i=0}^{d-1}$ is the standard orthonormal basis. We can interpret the trace as
the sum of the diagonal elements of the matrix corresponding to 𝑋 written in the
standard basis. From Exercise 2.5, however, we have that the trace is independent
of the choice of basis used to evaluate it.
The trace is physically relevant, especially when it acts on one part of a bipartite
quantum state, in which case we call it the partial trace. To be specific, given a
state 𝜌 𝐴𝐵 for the bipartite system 𝐴𝐵, we are often interested in determining the
state of only one of its subsystems. The partial trace Tr 𝐵 , which we define formally
below, takes a state 𝜌 𝐴𝐵 acting on the space H 𝐴𝐵 and returns a state 𝜌 𝐴 ≡ Tr 𝐵 [𝜌 𝐴𝐵 ]
acting on the space H 𝐴 . The partial trace is therefore the mathematical operation
used to determine the state of one of the subsystems given the state of a composite
system comprising two or more subsystems, and it can be thought of as the action
of “discarding” one of the subsystems; see Figure 3.2. The partial trace generalizes
the notion of marginalizing a joint probability distribution. Later, in Chapter 4, we
see that partial trace is a particular kind of quantum channel corresponding to this
discarding action.
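\begin{align}
\mathrm{Tr}_B[X_{AB}] &= (\mathrm{id}_A \otimes \mathrm{Tr})(X_{AB}) \\
&= \sum_{j=0}^{d_B-1} ( \mathbb{1}_A \otimes \langle j|_B ) X_{AB} ( \mathbb{1}_A \otimes |j\rangle_B ) \tag{3.2.17}
\end{align}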
for every linear operator 𝑋 𝐴𝐵 ∈ L(H 𝐴 ⊗ H𝐵 ). Similarly, the partial trace over
𝐴 is denoted by Tr 𝐴 ≡ Tr 𝐴 ⊗ id𝐵 and is defined as
Remark: For every linear operator 𝑋 𝐴𝐵 acting on H 𝐴𝐵 , we can define the partial trace
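$\mathrm{Tr}_B[X_{AB}]$ more abstractly as the unique linear operator $X_A$ acting on $\mathcal{H}_A$ such that
\[ \mathrm{Tr}[X_A M_A] = \mathrm{Tr}[X_{AB} (M_A \otimes \mathbb{1}_B)] \tag{3.2.19} \]
for every operator $M_A \in \mathrm{L}(\mathcal{H}_A)$. If we let $X_{AB}$ be the state $\rho_{AB}$ and $M_A$ be a Hermitian operator,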
then we can interpret this equation physically in the following way: in order to determine the
expectation value of an observable 𝑀 𝐴 acting on only one of the subsystems (in this case, the 𝐴
subsystem), it suffices to know the reduced state 𝜌 𝐴 of the subsystem 𝐴 rather than the joint state
𝜌 𝐴𝐵 of the total system.
It is clear from Definition 3.2 that the partial trace is a linear superoperator.
In particular, the expressions in (3.2.17) and (3.2.18) define the partial trace in
precisely the operator-sum form for superoperators shown in (2.2.179).
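For numerical work, the partial trace is conveniently implemented by viewing $X_{AB}$ as a four-index array and contracting the two indices of the discarded system. The following sketch assumes the basis ordering $|i, j\rangle_{AB} \leftrightarrow$ index $i\, d_B + j$ (the ordering used by `numpy.kron`); the function names are illustrative.

```python
import numpy as np

def partial_trace_B(X_AB, dA, dB):
    """Tr_B: contract the two B indices of X_AB viewed as T[i, j, i', j']."""
    T = np.asarray(X_AB).reshape(dA, dB, dA, dB)
    return np.einsum("ijkj->ik", T)

def partial_trace_A(X_AB, dA, dB):
    """Tr_A: contract the two A indices of X_AB viewed as T[i, j, i', j']."""
    T = np.asarray(X_AB).reshape(dA, dB, dA, dB)
    return np.einsum("ijil->jl", T)

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))
B = rng.normal(size=(3, 3))
X_AB = np.kron(A, B)
print(np.allclose(partial_trace_B(X_AB, 2, 3), A * np.trace(B)))   # Tr_B[A (x) B] = Tr[B] A
print(np.allclose(partial_trace_A(X_AB, 2, 3), np.trace(A) * B))   # Tr_A[A (x) B] = Tr[A] B

# Operator-sum form (3.2.17), evaluated directly for comparison.
kraus_form = sum(
    np.kron(np.eye(2), np.eye(3)[j][None, :]) @ X_AB @ np.kron(np.eye(2), np.eye(3)[j][:, None])
    for j in range(3)
)
print(np.allclose(kraus_form, partial_trace_B(X_AB, 2, 3)))
```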
Now, in order to explicitly determine the partial trace of a given linear operator
𝑋 𝐴𝐵 ∈ L(H 𝐴𝐵 ), it suffices to know the action of the partial trace on basis elements
of L(H 𝐴𝐵 ) because the action of every linear superoperator is completely defined
by its action on basis elements. Using the orthonormal basis for L(H 𝐴𝐵 ) given in
(3.2.11), it is straightforward to see that the action of the partial trace $\mathrm{Tr}_B$ on this basis is
\[ \mathrm{Tr}_B[ |i\rangle\langle i'|_A \otimes |j\rangle\langle j'|_B ] = |i\rangle\langle i'|_A \, \delta_{j,j'} \tag{3.2.20} \]
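for all $0 \leq i, i' \leq d_A - 1$, $0 \leq j, j' \leq d_B - 1$. Similarly, for $\mathrm{Tr}_A$, we obtain
\[ \mathrm{Tr}_A[ |i\rangle\langle i'|_A \otimes |j\rangle\langle j'|_B ] = \delta_{i,i'} \, |j\rangle\langle j'|_B \tag{3.2.21} \]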
𝑋 𝐴 ≡ Tr 𝐵 [𝑋 𝐴𝐵 ] and 𝑋𝐵 ≡ Tr 𝐴 [𝑋 𝐴𝐵 ] (3.2.25)
denote its partial traces. For states, we also use the terms marginal states or reduced
states to refer to their partial traces.
An immediate consequence of the Schmidt decomposition theorem is that the
marginal states 𝜌 𝐴 B Tr 𝐵 [|𝜓⟩⟨𝜓| 𝐴𝐵 ] and 𝜌 𝐵 B Tr 𝐴 [|𝜓⟩⟨𝜓| 𝐴𝐵 ] of every pure state
|𝜓⟩⟨𝜓| 𝐴𝐵 have the same non-zero eigenvalues. Indeed, using (3.2.9), we find that
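\begin{align}
\rho_A &= \sum_{k,k'=1}^{r} \sqrt{\lambda_k \lambda_{k'}} \, \mathrm{Tr}_B[ |e_k\rangle\langle e_{k'}|_A \otimes |f_k\rangle\langle f_{k'}|_B ] \tag{3.2.26} \\
&= \sum_{k,k'=1}^{r} \sqrt{\lambda_k \lambda_{k'}} \, |e_k\rangle\langle e_{k'}|_A \, \delta_{k,k'} \tag{3.2.27} \\
&= \sum_{k=1}^{r} \lambda_k |e_k\rangle\langle e_k|_A, \tag{3.2.28}
\end{align}
and
\begin{align}
\rho_B &= \sum_{k,k'=1}^{r} \sqrt{\lambda_k \lambda_{k'}} \, \mathrm{Tr}_A[ |e_k\rangle\langle e_{k'}|_A \otimes |f_k\rangle\langle f_{k'}|_B ] \tag{3.2.29} \\
&= \sum_{k,k'=1}^{r} \sqrt{\lambda_k \lambda_{k'}} \, \delta_{k,k'} \, |f_k\rangle\langle f_{k'}|_B \tag{3.2.30} \\
&= \sum_{k=1}^{r} \lambda_k |f_k\rangle\langle f_k|_B, \tag{3.2.31}
\end{align}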
Exercise 3.5
Consider two quantum systems 𝐴 and 𝐵, with 𝑑 𝐴 = 𝑑 𝐵 = 𝑑.
1. Calculate $\mathrm{Tr}_A[|\Gamma\rangle\langle\Gamma|_{AB}]$ and $\mathrm{Tr}_B[|\Gamma\rangle\langle\Gamma|_{AB}]$, where we recall from (2.2.34) that $|\Gamma\rangle_{AB} = \sum_{j=0}^{d-1} |j, j\rangle_{AB}$.
Below are two useful lemmas about how the support of a bipartite linear
operator (recall the definition of support from Section 2.2) relates to the support of
its partial traces. Their proofs are somewhat technical, and so we provide them in
Appendices 3.A and 3.B.
Lemma 3.3
Let 𝑋 𝐴𝐵 ∈ L(H 𝐴 ⊗ H𝐵 ) be positive semi-definite, and let 𝑋 𝐴 B Tr 𝐵 [𝑋 𝐴𝐵 ]
and 𝑋𝐵 B Tr 𝐴 [𝑋 𝐴𝐵 ]. Then supp(𝑋 𝐴𝐵 ) ⊆ supp(𝑋 𝐴 ) ⊗ supp(𝑋𝐵 ).
Lemma 3.4
Let 𝑋 𝐴𝐵 , 𝑌 𝐴𝐵 ∈ L(H 𝐴 ⊗ H𝐵 ) be positive semi-definite, and suppose that
supp(𝑋 𝐴𝐵 ) ⊆ supp(𝑌 𝐴𝐵 ). Then supp(𝑋 𝐴 ) ⊆ supp(𝑌 𝐴 ), where 𝑋 𝐴 B
Tr 𝐵 [𝑋 𝐴𝐵 ] and 𝑌 𝐴 B Tr 𝐵 [𝑌 𝐴𝐵 ].
The concepts of separable and entangled states are at the heart of virtually all
of the communication protocols that we consider in this book. More generally,
entanglement is a key distinction between the classical and quantum theories
of information; it simply is not present and therefore does not play a role in
classical information theory. Entanglement, in particular, is a key element of
private communication and secure key distillation, and the successful distribution
of entangled states among several spatially separated parties is a crucial ingredient
in the implementation of such protocols over the future quantum internet. If the
parties share only separable, unentangled states, then it is not possible for them to
distill a key that is secure against a general quantum adversary.
We begin this section by defining separable and entangled states.
Remark: Note that a separable state can always be written in the form
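\[ \sigma_{AB} = \sum_{x' \in \mathcal{X}'} q(x') \, |\psi_{x'}\rangle\langle\psi_{x'}|_A \otimes |\phi_{x'}\rangle\langle\phi_{x'}|_B \tag{3.2.33} \]
for some probability distribution $q : \mathcal{X}' \to [0, 1]$ on a finite alphabet $\mathcal{X}'$ and sets of pure states $\{|\psi_{x'}\rangle\langle\psi_{x'}|_A : x' \in \mathcal{X}'\}$, $\{|\phi_{x'}\rangle\langle\phi_{x'}|_B : x' \in \mathcal{X}'\}$. In other words, separable states can always be written as a convex combination of pure product states. Indeed, from (3.2.32), we can take spectral decompositions of $\omega_A^x$ and $\tau_B^x$,
\[ \omega_A^x = \sum_{k=1}^{r_A^x} t_k^x \, |e_k^x\rangle\langle e_k^x|_A, \qquad \tau_B^x = \sum_{\ell=1}^{r_B^x} s_\ell^x \, |f_\ell^x\rangle\langle f_\ell^x|_B, \tag{3.2.34} \]
so that
\[ \sigma_{AB} = \sum_{x \in \mathcal{X}} \sum_{k=1}^{r_A^x} \sum_{\ell=1}^{r_B^x} p(x) \, t_k^x s_\ell^x \, |e_k^x\rangle\langle e_k^x|_A \otimes |f_\ell^x\rangle\langle f_\ell^x|_B. \tag{3.2.35} \]
The form in (3.2.33) then follows by letting $x' = (x, k, \ell)$, $q(x') = p(x) \, t_k^x s_\ell^x$, and
\[ |\psi_{x'}\rangle_A := |e_k^x\rangle_A, \qquad |\phi_{x'}\rangle_B := |f_\ell^x\rangle_B. \tag{3.2.36} \]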
From the development above, it follows that the set of separable states is the convex hull of the
set of pure product states. By an application of the Fenchel–Eggleston–Carathéodory Theorem
(Theorem 2.23), it follows that $\sigma_{AB}$ can be written as a convex combination of no more than $\operatorname{rank}(\sigma_{AB})^2$ pure product states.
In the sense that follows, bipartite separable states can be thought of as exhibiting purely classical correlations between the two parties, Alice and Bob. Suppose that Alice draws $x$ with probability $p(x)$, prepares her system in the state $\omega_A^x$, sends $x$ to Bob over a classical channel, who then prepares his system in the state $\tau_B^x$, where $x \in \mathcal{X}$ and $\mathcal{X}$ is a finite alphabet. This procedure corresponds to preparing the ensemble $\{(p(x), \omega_A^x \otimes \tau_B^x)\}_{x \in \mathcal{X}}$, and if Alice and Bob discard the label $x$, then their shared joint state is the separable state $\sigma_{AB} = \sum_{x \in \mathcal{X}} p(x) \, \omega_A^x \otimes \tau_B^x$.
Exercise 3.6
Exercise 3.7
Prove that every state vector of the form $(\mathbb{1}_d \otimes U)|\Phi_d\rangle = \frac{1}{\sqrt{d}} \operatorname{vec}(U)$ and $(U \otimes \mathbb{1}_d)|\Phi_d\rangle$, with $d \geq 2$ and $U$ a unitary operator, corresponds to a maximally
entangled pure state. Conversely, given a maximally entangled pure state
|𝜓⟩⟨𝜓| 𝐴𝐵 , prove that there exists a unitary 𝑈 𝐴 such that |𝜓⟩ 𝐴𝐵 = (𝑈 𝐴 ⊗ 1𝐵 )|Φ⟩ 𝐴𝐵 .
In Exercise 3.7, we learned that every state vector of the form ( 1 ⊗ 𝑈)|Φ⟩ and
(𝑈 ⊗ 1)|Φ⟩, with 𝑈 a unitary operator, is a maximally entangled state. We now
provide an important example of a class of maximally entangled states, known
as Bell states, for every dimension 𝑑 ≥ 2. These states are defined by particular
choices for the unitary 𝑈. The Bell states are an important element of many
quantum information processing tasks, most prominently quantum teleportation
and super-dense coding, which we discuss in Chapter 5.
We start with dimension 𝑑 = 2. Recall the Pauli operators 𝑋 and 𝑍 from (3.2.6):
\[ X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.2.42} \]
Observe that, in addition to being Hermitian, these operators are unitary, which is
due to the fact that 𝑋 2 = 𝑍 2 = 1. The operator 𝑌 defined in (3.2.6) is also unitary,
since 𝑌 2 = 1, from which it follows that the operator 𝑍 𝑋 = i𝑌 is also unitary.
Using the operators 𝑋, 𝑍, and 𝑍 𝑋, we define the following set of four entangled,
two-qubit state vectors:
for $z, x \in \{0, 1\}$, where we recall from (3.2.39) that $|\Phi\rangle := \frac{1}{\sqrt{2}}(|0, 0\rangle + |1, 1\rangle)$. The
corresponding density operators Φ𝑧,𝑥 are known as the two-qubit Bell states. The
following notation is commonly used:
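\begin{align}
|\Phi^{+}\rangle \equiv |\Phi_{0,0}\rangle &= \frac{1}{\sqrt{2}} ( |0, 0\rangle + |1, 1\rangle ), \tag{3.2.44} \\
|\Phi^{-}\rangle \equiv |\Phi_{1,0}\rangle &= \frac{1}{\sqrt{2}} ( |0, 0\rangle - |1, 1\rangle ), \tag{3.2.45} \\
|\Psi^{+}\rangle \equiv |\Phi_{0,1}\rangle &= \frac{1}{\sqrt{2}} ( |0, 1\rangle + |1, 0\rangle ), \tag{3.2.46} \\
|\Psi^{-}\rangle \equiv |\Phi_{1,1}\rangle &= \frac{1}{\sqrt{2}} ( |0, 1\rangle - |1, 0\rangle ). \tag{3.2.47}
\end{align}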
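The four vectors in (3.2.44)–(3.2.47) can be generated numerically from $|\Phi\rangle$ and the operators $X$, $Z$, and $ZX$. The sketch below assumes the convention $|\Phi_{z,x}\rangle = (Z^z X^x \otimes \mathbb{1})|\Phi\rangle$; the defining equation (3.2.43) is not reproduced here, so this convention is an assumption, chosen because it reproduces the four listed vectors including their signs.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
Phi = np.array([1, 0, 0, 1]) / np.sqrt(2)       # (|0,0> + |1,1>)/sqrt(2)

def bell_vector(z, x):
    """Assumed convention: |Phi_{z,x}> = (Z^z X^x (x) 1)|Phi>."""
    local = np.linalg.matrix_power(Z, z) @ np.linalg.matrix_power(X, x)
    return np.kron(local, np.eye(2)) @ Phi

vectors = np.array([bell_vector(z, x) for z in (0, 1) for x in (0, 1)])
gram = vectors @ vectors.T                       # real vectors, so no conjugation is needed
print(np.allclose(gram, np.eye(4)))              # the four vectors are orthonormal
print(bell_vector(1, 1) * np.sqrt(2))            # proportional to |0,1> - |1,0>
```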
Exercise 3.8
1. Prove that the two-qubit Bell state vectors defined in (3.2.43) form an
orthonormal basis for C2 ⊗ C2 .
2. Prove that the state vectors |Φ+ ⟩, |Φ− ⟩, and |Ψ+ ⟩ form an orthonormal basis
for Sym2 (C2 ). (Hint: See (2.5.11) and Exercise 2.34.) For this reason, the
subspace Sym2 (C2 ) is called the triplet subspace.
3. Prove that ASym2 (C2 ) = span{|Ψ− ⟩}. For this reason, the subspace
ASym2 (C2 ) is called the singlet subspace and the state |Ψ− ⟩ is called the
singlet state vector.
We can generalize the Bell state vectors in (3.2.43) to systems with dimension $d > 2$. Doing so requires a generalization of the qubit Pauli operators $X$ and $Z$ to unitary operators for qudits.²
² The qudit operators defined in (2.2.44)–(2.2.47) represent one generalization of the qubit Pauli operators. Although they are Hermitian, they are not generally unitary. What we require here is a generalization to qudit operators that are unitary.
Exercise 3.9
1. Verify that when 𝑑 = 2, the Heisenberg–Weyl operators reduce to the qubit
Pauli operators 𝑍, 𝑋, and 𝑍 𝑋.
2. Prove that the operators 𝑍 (𝑧) and 𝑋 (𝑥) defined in (3.2.49) and (3.2.50)
satisfy the commutation relation
\[ Z(z) X(x) = \mathrm{e}^{\frac{2\pi \mathrm{i} x z}{d}} X(x) Z(z), \tag{3.2.51} \]
The Heisenberg–Weyl operators are unitary, just like the Pauli operators;
however, unlike the Pauli operators, they are not Hermitian. In particular,
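\[ W_{z,x}^{\dagger} = \mathrm{e}^{-\frac{2\pi \mathrm{i} x z}{d}} W_{-z,-x}. \tag{3.2.52} \]
\[ \langle W_{z_1,x_1}, W_{z_2,x_2} \rangle = \mathrm{Tr}[ W_{z_1,x_1}^{\dagger} W_{z_2,x_2} ] = d \, \delta_{z_1,z_2} \delta_{x_1,x_2} \tag{3.2.54} \]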
Exercise 3.10
Prove (3.2.52), (3.2.53), and (3.2.54).
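The relations (3.2.51), (3.2.52), and (3.2.54) are easy to check numerically before proving them. The sketch below assumes the standard clock and shift definitions $Z(z) = \sum_{k} \mathrm{e}^{2\pi \mathrm{i} k z / d} |k\rangle\langle k|$, $X(x) = \sum_{k} |k + x \bmod d\rangle\langle k|$, and $W_{z,x} = Z(z) X(x)$; the defining equations (3.2.49) and (3.2.50) are not reproduced here, so these forms and the function names are assumptions, chosen to be consistent with the stated relations.

```python
import numpy as np

def Z_op(z, d):
    """Assumed clock operator: Z(z)|k> = exp(2*pi*i*k*z/d)|k>."""
    return np.diag(np.exp(2j * np.pi * z * np.arange(d) / d))

def X_op(x, d):
    """Assumed shift operator: X(x)|k> = |k + x mod d>."""
    return np.roll(np.eye(d), x, axis=0)

def W_op(z, x, d):
    """Assumed Heisenberg-Weyl operator W_{z,x} = Z(z) X(x)."""
    return Z_op(z, d) @ X_op(x, d)

d, z, x = 3, 1, 2
phase = np.exp(2j * np.pi * x * z / d)
print(np.allclose(Z_op(z, d) @ X_op(x, d), phase * X_op(x, d) @ Z_op(z, d)))   # (3.2.51)
print(np.allclose(W_op(z, x, d).conj().T, np.conj(phase) * W_op(-z, -x, d)))   # (3.2.52)

labels = [(a, b) for a in range(d) for b in range(d)]
gram = np.array([[np.trace(W_op(*p, d).conj().T @ W_op(*q, d)) for q in labels] for p in labels])
print(np.allclose(gram, d * np.eye(d * d)))                                     # (3.2.54)
```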
Exercise 3.11
Let 𝑑 ≥ 2, and consider the operator 𝑄 𝑑 defined as
\[ Q_d := \frac{1}{\sqrt{d}} \sum_{k,\ell=0}^{d-1} \mathrm{e}^{\frac{2\pi \mathrm{i} k \ell}{d}} |k\rangle\langle \ell|. \tag{3.2.55} \]
Using the Heisenberg–Weyl operators, we now define the set of qudit Bell states
in a manner analogous to (3.2.43).
Exercise 3.12
Prove that the two-qudit Bell state vectors defined in (3.2.58) form an orthonor-
mal basis for C𝑑 ⊗ C𝑑 for all 𝑑 ≥ 2.
The fact that the two-qubit Bell state vectors form an orthonormal basis for
for some probability distribution $p : \{0, 1, \dotsc, d-1\}^2 \to [0, 1]$, meaning that $0 \leq p(z, x) \leq 1$ for all $z, x \in \{0, 1, \dotsc, d-1\}$ and $\sum_{z,x=0}^{d-1} p(z, x) = 1$.
One of the most useful concepts in quantum information is the notion of purification.
There is no strong classical analogue of this concept, and thus this notion represents
another distinction between the classical and quantum theories of information.
Tr 𝑅 [|𝜓⟩⟨𝜓| 𝑅 𝐴 ] = 𝜌 𝐴 . (3.2.61)
then satisfies
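\begin{align}
\mathrm{Tr}_R[ |\psi\rangle\langle\psi|_{RA} ] &= \sum_{k,k'=1}^{r} \sqrt{\lambda_k \lambda_{k'}} \, \underbrace{\mathrm{Tr}[ |\varphi_k\rangle\langle\varphi_{k'}|_R ]}_{\delta_{k,k'}} \, |\phi_k\rangle\langle\phi_{k'}|_A \tag{3.2.64} \\
&= \sum_{k=1}^{r} \lambda_k |\phi_k\rangle\langle\phi_k|_A \tag{3.2.65} \\
&= \rho_A, \tag{3.2.66}
\end{align}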
so that |𝜓⟩⟨𝜓| 𝑅 𝐴 is a purification of 𝜌 𝐴 , as required. ■
Remark: The theorem above states that the condition 𝑑 𝑅 ≥ rank(𝜌 𝐴) on the dimension of the
purifying system 𝑅 is sufficient to guarantee the existence of a purification. This condition is
also necessary, meaning that it is not possible to have a purifying system whose dimension is less
than the rank of 𝜌 𝐴.
The proof of the theorem above not only tells us that every state has a purification,
but it also tells us explicitly how to construct one such purification. We can also
construct a purification of every state 𝜌 𝐴 as follows:
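\[ |\psi\rangle_{RA} = ( \mathbb{1}_R \otimes \sqrt{\rho_A} ) |\Gamma\rangle_{RA} = \operatorname{vec}(\sqrt{\rho_A}), \tag{3.2.67} \]
where $|\Gamma\rangle_{RA} = \sum_{i=0}^{d_A-1} |i, i\rangle_{RA}$ and where the operation vec is defined in (2.2.31).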
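The construction in (3.2.67) translates directly into a few lines of code. The sketch below takes $d_R = d_A$, computes $\sqrt{\rho_A}$ from an eigendecomposition, and then checks that tracing out $R$ returns $\rho_A$. The function names and the random test state are illustrative, and the ordering of the tensor factors ($R$ first, then $A$) follows (3.2.67).

```python
import numpy as np

def psd_sqrt(rho):
    """Square root of a positive semi-definite operator via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(rho)
    evals = np.clip(evals, 0, None)              # remove tiny negative rounding errors
    return (evecs * np.sqrt(evals)) @ evecs.conj().T

def canonical_purification(rho):
    """|psi>_{RA} = (1_R (x) sqrt(rho_A)) |Gamma>_{RA}, with d_R = d_A."""
    d = rho.shape[0]
    gamma = np.eye(d).reshape(d * d)             # components of sum_i |i, i>
    return np.kron(np.eye(d), psd_sqrt(rho)) @ gamma

def trace_out_R(psi, d):
    """Tr_R[|psi><psi|_{RA}] for a vector whose components are ordered as psi[r * d + a]."""
    M = psi.reshape(d, d)                        # M[r, a]
    return M.T @ M.conj()

rng = np.random.default_rng(2)
G = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = G @ G.conj().T
rho /= np.trace(rho).real                        # a random density operator on C^3
psi = canonical_purification(rho)
print(np.isclose(np.linalg.norm(psi), 1.0))      # unit norm, since Tr[rho] = 1
print(np.allclose(trace_out_R(psi, 3), rho))     # marginal on A is rho
```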
We often call the state |𝜓⟩⟨𝜓| 𝑅 𝐴 the canonical purification of 𝜌 𝐴 . Note that the
canonical purification is very closely related to the purification used in the proof of
Theorem 3.10. Indeed, if
\[ \rho_A = \sum_{k=1}^{r} \lambda_k |\phi_k\rangle\langle\phi_k| \tag{3.2.68} \]
with |𝜑⟩⟨𝜑| 𝑅 a pure state of the system 𝑅. In other words, purifications of pure states
can only be pure product states. Somewhat technically, according to Theorem 3.10,
the dimension of system 𝑅 need only satisfy 𝑑 𝑅 ≥ rank(𝜌 𝐴 ). In the case of a pure
state, the rank is equal to one, so that the reference system can be a trivial system of
dimension one. Thus, in this technical sense, pure states already purify themselves.
If we take the reference system to satisfy 𝑑 𝑅 ≥ 2, then indeed the purification has
the form given in (3.2.70).
Purifications of states are not unique. In fact, given a state 𝜌 𝐴 and a purification
|𝜓⟩⟨𝜓| 𝑅 𝐴 of 𝜌 𝐴 as in (3.2.63), let 𝑉𝑅→𝑅′ be an isometry (i.e., a linear operator
satisfying 𝑉 †𝑉 = 1 𝑅 ) acting on the 𝑅 system. Defining
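\[ |\psi'\rangle_{R'A} := ( V_{R \to R'} \otimes \mathbb{1}_A ) |\psi\rangle_{RA}, \tag{3.2.71} \]
we find that
\begin{align}
\mathrm{Tr}_{R'}[ |\psi'\rangle\langle\psi'|_{R'A} ] &= \sum_{k,k'=1}^{r} \sqrt{\lambda_k \lambda_{k'}} \, \mathrm{Tr}[ V |\varphi_k\rangle\langle\varphi_{k'}|_R V^{\dagger} ] \, |\phi_k\rangle\langle\phi_{k'}|_A \tag{3.2.72} \\
&= \sum_{k=1}^{r} \lambda_k |\phi_k\rangle\langle\phi_k|_A \tag{3.2.73} \\
&= \rho_A, \tag{3.2.74}
\end{align}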
where we conclude that Tr[𝑉 |𝜑 𝑘 ⟩⟨𝜑 𝑘 ′ |𝑉 † ] = 𝛿 𝑘,𝑘 ′ from cyclicity of the trace and
𝑉 †𝑉 = 1 𝑅 . So |𝜓 ′⟩⟨𝜓 ′ | 𝑅′ 𝐴 is also a purification of 𝜌 𝐴 .
A converse statement holds as well by employing the Schmidt decomposition
(Theorem 2.2): if |𝜓⟩⟨𝜓| 𝑅 𝐴 and |𝜓 ′⟩⟨𝜓 ′ | 𝑅′ 𝐴 are two purifications of the state 𝜌 𝐴 ,
then they are related by an isometry taking one reference system to the other. By
combining this statement and the previous one, we can thus say that purifications
are unique “up to isometries acting on the reference system.”
A purification is an example of an “extension” of a quantum state.
Remark: For a purification, it is required that 𝑑 𝑅 ≥ rank(𝜌 𝐴). However, there is no such
requirement for an extension.
Note that if the state 𝜌 𝐴 is pure, i.e., if 𝜌 𝐴 = |𝜙⟩⟨𝜙| 𝐴 , then every extension 𝜔 𝑅 𝐴
of 𝜌 𝐴 must be a product state, i.e., we must have 𝜔 𝑅 𝐴 = 𝜎𝑅 ⊗ |𝜙⟩⟨𝜙| 𝐴 for some
state 𝜎𝑅 .
A multipartite quantum state is a quantum state of more than two quantum systems.
Let 𝐴1 , . . . , 𝐴𝑛 denote 𝑛 ≥ 2 quantum systems. Then, every quantum state 𝜌 𝐴1 ···𝐴𝑛
can be represented as
\[ \rho_{A_1\cdots A_n} = \sum_{i_1,i_1'=0}^{d_{A_1}-1} \cdots \sum_{i_n,i_n'=0}^{d_{A_n}-1} \beta_{i_1,\dots,i_n;i_1',\dots,i_n'}\, |i_1,\dots,i_n\rangle\langle i_1',\dots,i_n'|_{A_1\cdots A_n}, \tag{3.2.75} \]
where 𝛽𝑖1 ,...,𝑖 𝑛 ;𝑖1′ ,...,𝑖 ′𝑛 = ⟨𝑖 1 , . . . , 𝑖 𝑛 |𝜌 𝐴1 ···𝐴𝑛 |𝑖′1 , . . . , 𝑖′𝑛 ⟩. This representation is simply
the generalization of the representation in (3.2.12) to 𝑛 ≥ 2 quantum systems.
Similarly, the generalization of the representation in (3.2.13) to 𝑛 ≥ 2 quantum
systems is
\[ \rho_{A_1\cdots A_n} = \frac{1}{d_{A_1}\cdots d_{A_n}} \sum_{k_1=0}^{d_{A_1}^2-1} \cdots \sum_{k_n=0}^{d_{A_n}^2-1} r_{k_1,\dots,k_n}\, S_{A_1}^{k_1} \otimes \cdots \otimes S_{A_n}^{k_n}. \tag{3.2.76} \]
Figure 3.3: Depiction of the permutation operator 𝑊 𝜋 defined in (3.2.77) for
𝑛 = 4 quantum systems and the permutation 𝜋 defined by 𝜋(1) = 4, 𝜋(2) = 1,
𝜋(3) = 2, and 𝜋(4) = 3.
which follows straightforwardly from the definition in (3.2.77). See Figure 3.3 for
a visual depiction of the action of 𝑊 𝐴𝜋 𝑛 .
Exercise 3.13
1. Verify (3.2.78).
2. Let 𝑛 ≥ 2, and consider the cyclic permutation 𝜋 = (1 2 . . . 𝑛), which
satisfies 𝜋(𝑖) = 𝑖 + 1 for all 𝑖 ∈ {1, 2, . . . , 𝑛 − 1} and 𝜋(𝑛) = 1. Prove
that for all quantum states 𝜌 1𝐴1 , 𝜌 2𝐴2 , . . . , 𝜌 𝑛𝐴𝑛 , with 𝐴1 , 𝐴2 , . . . , 𝐴𝑛 being
identical quantum systems,
defined in (3.2.77).
Remark: Note that the permutation-invariance condition in (3.2.81) does not imply that the
state 𝜌 is supported on the symmetric subspace Sym𝑛 (H) of H ⊗𝑛 . In other words, the condition
in (3.2.81) does not imply that
As a simple example, suppose that $\mathcal{H} = \mathbb{C}^2$, and let $\rho = |\Psi^-\rangle\langle\Psi^-|$, where $|\Psi^-\rangle = \frac{1}{\sqrt{2}}(|0,1\rangle - |1,0\rangle)$
is the two-qubit Bell state defined in (3.2.47). Then, it is easy to see that $W^{\pi}\rho W^{\pi\dagger} = \rho$ for all
𝜋 ∈ S2 , while ΠSym2 (H) 𝜌ΠSym2 (H) = 0. The latter is true because |Ψ − ⟩ is an anti-symmetric
state, i.e., |Ψ − ⟩ ∈ ASym2 (H). The state 𝜌 is thus supported on the anti-symmetric subspace,
even though it is permutation invariant.
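The example in the remark can be verified directly. This is a minimal sketch, assuming NumPy and using the fact that the projection onto the symmetric subspace of two systems is $(\mathbb{1} + F)/2$, with $F$ the swap operator; the variable names are illustrative.

```python
import numpy as np

d = 2
# Swap operator F = sum_{k,k'} |k,k'><k',k|, the only non-trivial element of S_2.
F = np.zeros((d * d, d * d))
for k in range(d):
    for kp in range(d):
        F[k * d + kp, kp * d + k] = 1.0

psi_minus = np.array([0, 1, -1, 0]) / np.sqrt(2)   # (|0,1> - |1,0>)/sqrt(2)
rho = np.outer(psi_minus, psi_minus)
proj_sym = (np.eye(d * d) + F) / 2                  # projector onto Sym_2(C^2)

is_permutation_invariant = np.allclose(F @ rho @ F.T, rho)
symmetric_part = proj_sym @ rho @ proj_sym          # expected to vanish
```

The state is invariant under the swap, yet its compression onto the symmetric subspace is the zero operator.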
Exercise 3.14
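Let $\rho_{AB} = \sum_{x\in\mathcal{X}} p(x)\,\sigma_A^x \otimes \tau_B^x$, where $\mathcal{X}$ is a finite alphabet, $p : \mathcal{X} \to [0,1]$ is a
probability distribution, and $\{\sigma_A^x\}_{x\in\mathcal{X}}$, $\{\tau_B^x\}_{x\in\mathcal{X}}$ are sets of quantum states.
1. Prove that $\tilde{\rho}_{ABB'} := \sum_{x\in\mathcal{X}} p(x)\,\sigma_A^x \otimes \tau_B^x \otimes \tau_{B'}^x$ is an extension of $\rho_{AB}$, with
$B'$ being the reference system, in accordance with Definition 3.11, such
that $d_{B'} = d_B$. Prove also that $\tilde{\rho}_{AB'} := \mathrm{Tr}_B[\tilde{\rho}_{ABB'}] = \rho_{AB}$.
2. Now, let $\tilde{\rho}_{AB_1B_2\cdots B_k} := \sum_{x\in\mathcal{X}} p(x)\,\sigma_A^x \otimes \tau_{B_1}^x \otimes \tau_{B_2}^x \otimes \cdots \otimes \tau_{B_k}^x$, where $k \in \mathbb{N}$.
Prove that $\tilde{\rho}_{AB_\ell} := \mathrm{Tr}_{B_j : j\neq\ell}[\tilde{\rho}_{AB_1B_2\cdots B_k}] = \rho_{AB}$ for all $\ell \in \{1, 2, \dots, k\}$,
and that $W^{\pi}_{B_1\cdots B_k}\,\tilde{\rho}_{AB_1\cdots B_k}\,W^{\pi\dagger}_{B_1\cdots B_k} = \tilde{\rho}_{AB_1\cdots B_k}$ for all $\pi \in S_k$. The notation
$\mathrm{Tr}_{B_j : j\neq\ell}$ indicates to take the partial trace over all of the $B$ systems except
for $B_\ell$.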
In the case 𝑛 = 2, meaning that there are only two quantum systems under
consideration, there is only one non-trivial permutation, 𝜋 = (1 2), which swaps
the two elements of the set {1, 2}. Recall from Exercise 2.33 that
\[ W^{(1\,2)} = F := \sum_{k,k'=0}^{d-1} |k,k'\rangle\langle k',k|. \tag{3.2.83} \]
We call 𝐹 the swap operator, because 𝐹 (𝜌 ⊗ 𝜎)𝐹 † = 𝜎 ⊗ 𝜌 for all quantum states
𝜌 and 𝜎, which is a simple consequence of (3.2.78). In other words, the two states
𝜌 and 𝜎 become “swapped” with respect to the quantum systems after the action of
the operator 𝐹. The swap operator is Hermitian and satisfies 𝐹 2 = 1, meaning that
it is also unitary and self-inverse. Also, as a consequence of (3.2.79), we have
Tr[𝐹 (𝜌 ⊗ 𝜎)] = Tr[𝜌𝜎] (3.2.84)
for all quantum states 𝜌 and 𝜎.
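The properties of the swap operator stated above, including (3.2.84), are easy to test numerically. The sketch below assumes NumPy; `swap_operator` and `random_state` are illustrative helpers, and the random states are generated from a fixed seed purely for reproducibility.

```python
import numpy as np

def swap_operator(d):
    """F = sum_{k,k'} |k,k'><k',k| on C^d ⊗ C^d."""
    F = np.zeros((d * d, d * d))
    for k in range(d):
        for kp in range(d):
            F[k * d + kp, kp * d + k] = 1.0
    return F

def random_state(d, rng):
    """A random density operator (positive semi-definite, unit trace)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

rng = np.random.default_rng(0)
d = 3
rho, sigma = random_state(d, rng), random_state(d, rng)
F = swap_operator(d)

swapped = F @ np.kron(rho, sigma) @ F.conj().T   # should equal sigma ⊗ rho
lhs = np.trace(F @ np.kron(rho, sigma))          # Tr[F (rho ⊗ sigma)]
rhs = np.trace(rho @ sigma)                      # Tr[rho sigma]
```

The identity $\mathrm{Tr}[F(\rho\otimes\sigma)] = \mathrm{Tr}[\rho\sigma]$ is what makes the swap operator useful for estimating overlaps of states.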
Exercise 3.15
1. Verify that 𝐹 |Φ𝑑 ⟩ = |Φ𝑑 ⟩ for all 𝑑 ≥ 2.
2. For the two-qubit Bell state vectors defined in (3.2.43), prove that 𝐹 |Φ𝑧,𝑥 ⟩ =
(−1) 𝑧𝑥 |Φ𝑧,𝑥 ⟩ for all 𝑧, 𝑥 ∈ {0, 1}.
3. More generally, for the two-qudit Bell state vectors defined in (3.2.58),
prove that
\[ F|\Phi_{z,x}\rangle = e^{\frac{2\pi i z x}{d}} |\Phi_{z,-x}\rangle \tag{3.2.85} \]
for all 𝑧, 𝑥 ∈ {0, 1, . . . , 𝑑 − 1}.
Exercise 3.16
Let 𝜌 𝐴𝑛 ∈ D(H ⊗𝑛 ) be an arbitrary quantum state, and consider the state
\[ \sigma_{A^n} := \frac{1}{n!} \sum_{\pi\in S_n} W^{\pi}_{A^n}\, \rho_{A^n}\, W^{\pi\dagger}_{A^n}. \tag{3.2.86} \]
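\[ \rho_{A^n} = W^{\pi}_{A^n}\, \rho_{A^n}\, W^{\pi\dagger}_{A^n} \tag{3.2.88} \]
for all $\pi \in S_n$, where the unitary operators in the set $\{W^{\pi}_{A^n}\}_{\pi\in S_n}$ are defined in
(2.5.1). Then, there exists a permutation-invariant purification $|\psi^{\rho}\rangle\langle\psi^{\rho}|_{\hat{A}^n A^n}$ of
$\rho_{A^n}$, such that $|\psi^{\rho}\rangle_{\hat{A}^n A^n} \in \mathrm{Sym}_n(\mathcal{H}_{\hat{A}A})$. This means that
\[ |\psi^{\rho}\rangle_{\hat{A}^n A^n} = (W^{\pi}_{\hat{A}^n} \otimes W^{\pi}_{A^n}) |\psi^{\rho}\rangle_{\hat{A}^n A^n} \tag{3.2.89} \]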
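Then, because the operators $W^{\pi}_{A^n}$ are real in the standard basis, meaning that
$\overline{W}^{\pi}_{\hat{A}^n} = W^{\pi}_{\hat{A}^n}$ for all $\pi \in S_n$, and using the transpose trick in (2.2.42), we obtain
\begin{align}
(W^{\pi}_{\hat{A}^n} \otimes W^{\pi}_{A^n}) |\psi^{\rho}\rangle_{\hat{A}^n A^n} &= (W^{\pi}_{\hat{A}^n} \otimes W^{\pi}_{A^n})(\mathbb{1}_{\hat{A}^n} \otimes \sqrt{\rho_{A^n}}) |\Gamma\rangle_{\hat{A}^n A^n} \tag{3.2.91} \\
&= (\mathbb{1}_{\hat{A}^n} \otimes W^{\pi}_{A^n} \sqrt{\rho_{A^n}}\, W^{\pi\dagger}_{A^n}) |\Gamma\rangle_{\hat{A}^n A^n} \tag{3.2.92} \\
&= \left(\mathbb{1}_{\hat{A}^n} \otimes \sqrt{W^{\pi}_{A^n} \rho_{A^n} W^{\pi\dagger}_{A^n}}\right) |\Gamma\rangle_{\hat{A}^n A^n} \tag{3.2.93} \\
&= (\mathbb{1}_{\hat{A}^n} \otimes \sqrt{\rho_{A^n}}) |\Gamma\rangle_{\hat{A}^n A^n} \tag{3.2.94} \\
&= |\psi^{\rho}\rangle_{\hat{A}^n A^n} \tag{3.2.95}
\end{align}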
for all 𝜋 ∈ S𝑛 , where the third equality follows from (2.2.70) and the fourth equality
follows from the permutation invariance of 𝜌 𝐴𝑛 . ■
So far, we have seen two special types of unitary operators: the Heisenberg–Weyl
operators $\{W_{z,x}\}_{z,x=0}^{d-1}$, introduced in Definition 3.7, and the permutation operators
Exercise 3.17
Let $\rho_A$ be a quantum state for a $d$-dimensional quantum system $A$, let $G$ be a
group, and let $\{U_A^g\}_{g\in G}$ be a $d$-dimensional unitary representation of $G$.
1. Prove that the state
\[ \mathcal{T}^G(\rho_A) := \frac{1}{|G|} \sum_{g\in G} U_A^g\, \rho_A\, U_A^{g\dagger} \tag{3.2.96} \]
is a purification of T 𝐺 (𝜌 𝐴 ).
\[ \frac{1}{d^2} \sum_{z,x=0}^{d-1} W_{z,x}\, M\, W_{z,x}^{\dagger} = \mathrm{Tr}[M]\, \frac{\mathbb{1}_d}{d}. \tag{3.2.98} \]
Exercise 3.18
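Provide a direct proof of Lemma 3.15. Do this by first showing that
$\frac{1}{d^2} \sum_{z,x=0}^{d-1} W_{z,x} M W_{z,x}^{\dagger} = (\mathcal{D}_X \circ \mathcal{D}_Z)(M)$, where
\[ \mathcal{D}_X(M) := \frac{1}{d} \sum_{x=0}^{d-1} X(x)\, M\, X(x)^{\dagger}, \tag{3.2.99} \]
\[ \mathcal{D}_Z(M) := \frac{1}{d} \sum_{z=0}^{d-1} Z(z)\, M\, Z(z)^{\dagger}. \tag{3.2.100} \]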
Recall that the Heisenberg–Weyl operators reduce to the Pauli operators for
𝑑 = 2. In this case, we obtain
\[ \frac{1}{4} M + \frac{1}{4} X M X + \frac{1}{4} Y M Y + \frac{1}{4} Z M Z = \mathrm{Tr}[M]\, \frac{\mathbb{1}_2}{2}, \tag{3.2.101} \]
for every 𝑀 ∈ L(C2 ).
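The twirling identities (3.2.98) and (3.2.101) can be checked numerically. The sketch below assumes NumPy and the common convention $W_{z,x} = Z(z)X(x)$ with $X(x)|k\rangle = |k + x \bmod d\rangle$ and $Z(z)|k\rangle = e^{2\pi i z k/d}|k\rangle$; this convention is an assumption here (Definition 3.7 is not reproduced in this passage), but the twirl does not depend on the ordering or phase of each $W_{z,x}$.

```python
import numpy as np

def heisenberg_weyl(z, x, d):
    """Assumed convention: W_{z,x} = Z(z) X(x)."""
    omega = np.exp(2j * np.pi / d)
    shift = np.roll(np.eye(d), x, axis=0)            # X(x)|k> = |k + x mod d>
    clock = np.diag(omega ** (z * np.arange(d)))     # Z(z)|k> = omega^{zk}|k>
    return clock @ shift

def hw_twirl(M):
    """(1/d^2) sum_{z,x} W_{z,x} M W_{z,x}^†."""
    d = M.shape[0]
    total = np.zeros((d, d), dtype=complex)
    for z in range(d):
        for x in range(d):
            W = heisenberg_weyl(z, x, d)
            total += W @ M @ W.conj().T
    return total / d**2

rng = np.random.default_rng(0)
M2 = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
M3 = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
twirled2, twirled3 = hw_twirl(M2), hw_twirl(M3)
```

For $d = 2$ the four operators are the Pauli operators up to phases, so `twirled2` is the left-hand side of (3.2.101).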
Exercise 3.19
Let 𝐺 be a group, and let {𝑈 𝑔 }𝑔∈𝐺 be a 𝑑-dimensional unitary representation
of 𝐺, with 𝑑 ≥ 2. If 𝜌 ∈ D(C𝑑 ) is a group-invariant state, then prove that there
exists a purification $|\psi^{\rho}\rangle$ of $\rho$ such that $(\overline{U^g} \otimes U^g)|\psi^{\rho}\rangle = |\psi^{\rho}\rangle$ for all $g \in G$.
specified as
\[ \rho = \sum_{x\in\mathcal{X}} p(x)\, \rho^x. \tag{3.2.102} \]
On the other hand, if Alice sends Bob classical information about which state
she has prepared, then from Bob’s perspective the state of the system can be
described by the following classical–quantum state:
\[ \rho_{XB} = \sum_{x\in\mathcal{X}} p(x)\, |x\rangle\langle x|_X \otimes \rho_B^x, \tag{3.2.103} \]
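\[ \rho_{XB} = \begin{pmatrix} p(x_1)\rho^{x_1} & & & \\ & p(x_2)\rho^{x_2} & & \\ & & \ddots & \\ & & & p(x_{|\mathcal{X}|})\rho^{x_{|\mathcal{X}|}} \end{pmatrix}. \tag{3.2.105} \]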
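To make the block-diagonal structure of a classical–quantum state concrete, here is a small sketch assuming NumPy, with an illustrative three-letter alphabet and qubit states chosen only for the example.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])                        # probability distribution p(x)
states = [
    np.array([[1, 0], [0, 0]], dtype=complex),       # rho^{x_1} = |0><0|
    np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex),  # rho^{x_2} = |+><+|
    np.eye(2, dtype=complex) / 2,                    # rho^{x_3} = maximally mixed
]
n, d = len(p), 2

# rho_XB = sum_x p(x) |x><x|_X ⊗ rho_B^x
rho_xb = np.zeros((n * d, n * d), dtype=complex)
for x in range(n):
    ket_x = np.zeros((n, 1))
    ket_x[x] = 1
    rho_xb += p[x] * np.kron(ket_x @ ket_x.T, states[x])

# Tracing out the classical register gives the expected state sum_x p(x) rho^x.
rho_b = rho_xb.reshape(n, d, n, d).trace(axis1=0, axis2=2)
```

The diagonal $d \times d$ blocks of `rho_xb` are $p(x)\rho^x$, and all off-diagonal blocks vanish, matching (3.2.105).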
Exercise 3.20
Construct a purification of the classical–quantum state 𝜌 𝑋 𝐵 in (3.2.103). (Hint:
Consider a purification analogous to the one in (3.2.87).)
where we have defined the transpose with respect to the orthonormal bases
{|𝑖⟩ 𝐴 : 0 ≤ 𝑖 ≤ 𝑑 𝐴 − 1} and {| 𝑗⟩ 𝐴′ : 0 ≤ 𝑗 ≤ 𝑑 𝐴′ − 1}. This is consistent with the
familiar definition of the transpose of a matrix 𝑋 as being the matrix 𝑋 T with its
rows and columns flipped relative to 𝑋. Indeed, if 𝑋 has the matrix representation
$X = \sum_{j=0}^{d_{A'}-1} \sum_{i=0}^{d_A-1} X_{j,i}\, |j\rangle_{A'}\langle i|_A$, then it follows that the transpose of $X$ is
\[ X^{\mathsf{T}} = \sum_{j=0}^{d_{A'}-1} \sum_{i=0}^{d_A-1} X_{j,i}\, |i\rangle_A \langle j|_{A'} = \mathrm{T}(X). \tag{3.2.108} \]
Note that, unlike the trace or the conjugate transpose, the transpose depends on the
choice of orthonormal bases used to evaluate it. Throughout the rest of this book,
we use both T(𝑋) and 𝑋 T to refer to the transpose of 𝑋 with respect to the standard
orthonormal basis.
Exercise 3.21
For all 𝑑 ≥ 2, prove that the transpose map can be realized as follows, in terms
of the Heisenberg–Weyl operators from Definition 3.7:
\[ \mathrm{T}(X) = \frac{1}{d} \sum_{z,x=0}^{d-1} e^{\frac{2\pi i z x}{d}}\, W_{z,x}\, X\, W_{z,-x}^{\dagger}, \tag{3.2.109} \]
The transpose map is known as the partial transpose when it acts on one
subsystem of a bipartite linear operator 𝑋 𝐴𝐵 .
Given an expansion of 𝑋 𝐴𝐵 as
\[ X_{AB} = \sum_{i,j=0}^{d_B-1} X_A^{i,j} \otimes |i\rangle\langle j|_B, \tag{3.2.112} \]
where each $X_A^{i,j} := \langle i|_B X_{AB} |j\rangle_B$ is a linear operator acting on system $A$, the partial
transpose map $\mathrm{T}_B$ has the action
\[ \mathrm{T}_B(X_{AB}) = \sum_{i,j=0}^{d_B-1} X_A^{i,j} \otimes |j\rangle\langle i|_B = \sum_{i,j=0}^{d_B-1} X_A^{j,i} \otimes |i\rangle\langle j|_B. \tag{3.2.113} \]
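In matrix terms, (3.2.113) amounts to exchanging the row and column indices of system $B$ while leaving those of $A$ alone. A minimal sketch, assuming NumPy and the ordering $A \otimes B$ (the function name `partial_transpose_b` is illustrative):

```python
import numpy as np

def partial_transpose_b(X, d_a, d_b):
    """T_B(X_AB): swap the row and column indices belonging to system B."""
    t = X.reshape(d_a, d_b, d_a, d_b)        # t[a, i, a', j] = <a,i| X |a',j>
    return t.transpose(0, 3, 2, 1).reshape(d_a * d_b, d_a * d_b)

rng = np.random.default_rng(0)
d_a, d_b = 2, 3
A = rng.normal(size=(d_a, d_a)) + 1j * rng.normal(size=(d_a, d_a))
B = rng.normal(size=(d_b, d_b)) + 1j * rng.normal(size=(d_b, d_b))
X = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
Y = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))

on_product = partial_transpose_b(np.kron(A, B), d_a, d_b)   # equals A ⊗ B^T
twice = partial_transpose_b(partial_transpose_b(X, d_a, d_b), d_a, d_b)
# Hilbert–Schmidt inner products <X, T_B(Y)> and <T_B(X), Y>
ip_left = np.trace(X.conj().T @ partial_transpose_b(Y, d_a, d_b))
ip_right = np.trace(partial_transpose_b(X, d_a, d_b).conj().T @ Y)
```

The checks illustrate that the map acts as the ordinary transpose on the $B$ factor of a product operator, that it is its own inverse, and that it is self-adjoint with respect to the Hilbert–Schmidt inner product.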
and it is self-adjoint with respect to the Hilbert–Schmidt inner product, in the sense
that
⟨𝑋 𝐴𝐵 , T𝐵 (𝑌 𝐴𝐵 )⟩ = ⟨T𝐵 (𝑋 𝐴𝐵 ), 𝑌 𝐴𝐵 ⟩, (3.2.115)
where the Hilbert spaces corresponding to the systems 𝐴 and 𝐵 are isomorphic and
𝑋 𝑅 𝐴 ∈ L(H 𝑅 ⊗ H 𝐴 ).
Exercise 3.22
Verify (3.2.114), (3.2.115), and (3.2.116).
Lemma 3.18
Given quantum systems 𝐴 and 𝐵, the set PPT( 𝐴 : 𝐵) does not depend on which
system the transpose is taken, nor does it depend on which orthonormal basis is
used to define the transpose map.
Proof: To see the first statement, suppose that 𝜌 𝐴𝐵 ∈ PPT( 𝐴 : 𝐵). This means
that T𝐵 (𝜌 𝐴𝐵 ) ≥ 0. But since the eigenvalues are invariant under a full transpose
T 𝐴 ⊗ T𝐵 , this means that (T 𝐴 ⊗ T𝐵 )(T𝐵 (𝜌 𝐴𝐵 )) ≥ 0, the latter being the same
as T 𝐴 (𝜌 𝐴𝐵 ) ≥ 0 due to the self-inverse property of the partial transpose. So
T𝐵 (𝜌 𝐴𝐵 ) ≥ 0 implies T 𝐴 (𝜌 𝐴𝐵 ) ≥ 0, and vice versa.
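To see the second statement, let $\mathrm{T}_B(\rho_{AB}) \geq 0$, and let $\{|\phi_\ell\rangle_B\}_{\ell=0}^{d_B-1}$ be some
other orthonormal basis for $B$. The partial transpose with respect to this basis is
given by
\[ \sum_{\ell,\ell'=0}^{d_B-1} (\mathbb{1}_A \otimes |\phi_\ell\rangle\langle\phi_{\ell'}|_B)\, \rho_{AB}\, (\mathbb{1}_A \otimes |\phi_\ell\rangle\langle\phi_{\ell'}|_B). \tag{3.2.118} \]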
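\begin{align}
\mathrm{T}_B(\rho_{AB})
&= \sum_{j,j'=0}^{d_B-1} (\mathbb{1}_A \otimes |j\rangle\langle j'|_B)\, \rho_{AB}\, (\mathbb{1}_A \otimes |j\rangle\langle j'|_B) \notag \\
&= \sum_{j,j',\ell,\ell'=0}^{d_B-1} (\mathbb{1}_A \otimes |j\rangle\langle j'|\phi_\ell\rangle\langle\phi_\ell|_B)\, \rho_{AB}\, (\mathbb{1}_A \otimes |\phi_{\ell'}\rangle\langle\phi_{\ell'}|j\rangle\langle j'|_B) \notag \\
&= \sum_{j,j',\ell,\ell'=0}^{d_B-1} (\mathbb{1}_A \otimes \langle\phi_{\ell'}|j\rangle |j\rangle\langle\phi_\ell|_B)\, \rho_{AB}\, (\mathbb{1}_A \otimes |\phi_{\ell'}\rangle\langle j'|_B \langle j'|\phi_\ell\rangle) \notag \\
&= \sum_{\ell,\ell'=0}^{d_B-1} \left(\mathbb{1}_A \otimes \left(\sum_{j=0}^{d_B-1} \langle\phi_{\ell'}|j\rangle |j\rangle\right) \langle\phi_\ell|_B\right) \rho_{AB} \left(\mathbb{1}_A \otimes |\phi_{\ell'}\rangle \left(\sum_{j'=0}^{d_B-1} \langle j'|_B \langle j'|\phi_\ell\rangle\right)\right) \notag \\
&= \sum_{\ell,\ell'=0}^{d_B-1} \left(\mathbb{1}_A \otimes |\overline{\phi}_{\ell'}\rangle\langle\phi_\ell|_B\right) \rho_{AB} \left(\mathbb{1}_A \otimes |\phi_{\ell'}\rangle\langle\overline{\phi}_\ell|\right), \tag{3.2.119}
\end{align}
where in the last line we defined $|\overline{\phi}_\ell\rangle := \sum_{j=0}^{d_B-1} \langle\phi_\ell|j\rangle |j\rangle$ for $0 \leq \ell \leq d_B - 1$. Note
that the set $\{|\overline{\phi}_\ell\rangle\}_{\ell=0}^{d_B-1}$ is orthonormal, so that $U_B := \sum_{\ell=0}^{d_B-1} |\phi_\ell\rangle\langle\overline{\phi}_\ell|_B$ is a unitary
operator. Then we find that
\[ \sum_{\ell,\ell'=0}^{d_B-1} (\mathbb{1}_A \otimes |\phi_{\ell'}\rangle\langle\phi_\ell|_B)\, \rho_{AB}\, (\mathbb{1}_A \otimes |\phi_{\ell'}\rangle\langle\phi_\ell|_B) = U_B\, \mathrm{T}_B(\rho_{AB})\, U_B^{\dagger} \geq 0, \tag{3.2.120} \]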
where the last inequality follows from the condition T𝐵 (𝜌 𝐴𝐵 ) ≥ 0 and property 1
of Lemma 2.14. We thus conclude that the PPT property does not depend on which
orthonormal basis is used to define the transpose map. ■
We mentioned at the beginning of this section that the partial transpose is useful
in quantum information because it leads to a sufficient condition for a bipartite state
to be entangled. We now derive the sufficient condition.
Suppose that a state 𝜎𝐴𝐵 is a separable, unentangled state of the following form:
\[ \sigma_{AB} = \sum_{x\in\mathcal{X}} p(x)\, \omega_A^x \otimes \tau_B^x, \tag{3.2.121} \]
which is a separable quantum state, being the expected state of the ensemble
{( 𝑝(𝑥), 𝜔𝑥𝐴 ⊗ T(𝜏𝐵𝑥 ))}𝑥∈X . Each element of the ensemble is indeed a quantum state
because the transpose is a positive map, i.e., T(𝜏𝐵𝑥 ) ≥ 0 if 𝜏𝐵𝑥 ≥ 0. Due to this fact,
we conclude that T𝐵 (𝜎𝐴𝐵 ) ≥ 0, so that 𝜎𝐴𝐵 is a PPT state. Thus, we conclude the
following:
This is called the PPT criterion. Equivalently, by taking the contrapositive of this
statement, we obtain the following:
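From (2.5.8) and (2.5.9), the swap operator has the following spectral decomposition:
\[ F_{AB} = \Pi_{AB}^{\mathrm{Sym}} - \Pi_{AB}^{\mathrm{ASym}}, \tag{3.2.127} \]
where $\Pi_{AB}^{\mathrm{Sym}} \equiv \Pi_{\mathrm{Sym}_2(\mathbb{C}^d)}$ is the projection onto the symmetric subspace of $\mathbb{C}^d \otimes \mathbb{C}^d$
and $\Pi_{AB}^{\mathrm{ASym}} \equiv \Pi_{\mathrm{ASym}_2(\mathbb{C}^d)}$ is the projection onto the anti-symmetric subspace of
$\mathbb{C}^d \otimes \mathbb{C}^d$. Indeed, we have that $\Pi_{AB}^{\mathrm{Sym}} + \Pi_{AB}^{\mathrm{ASym}} = \mathbb{1}_{AB}$ and $\Pi_{AB}^{\mathrm{Sym}} \Pi_{AB}^{\mathrm{ASym}} = 0$. Thus,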
the swap operator has negative eigenvalues, which by the PPT criterion means that
Φ 𝐴𝐵 is an entangled state, as expected.
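The PPT test described here is straightforward to carry out numerically. The following sketch, assuming NumPy, takes $\Phi_{AB}$ to be the two-qubit maximally entangled state and compares it with an illustrative separable state; the helper `partial_transpose_b` and the chosen states are assumptions made for the example.

```python
import numpy as np

def partial_transpose_b(X, d_a, d_b):
    """T_B(X_AB): swap the row and column indices belonging to system B."""
    t = X.reshape(d_a, d_b, d_a, d_b)
    return t.transpose(0, 3, 2, 1).reshape(d_a * d_b, d_a * d_b)

d = 2
phi = np.eye(d).reshape(d * d) / np.sqrt(d)          # |Phi> = (1/sqrt(d)) sum_j |j,j>
Phi = np.outer(phi, phi)

# Swap operator F = sum_{i,j} |i,j><j,i|
F = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        F[i * d + j, j * d + i] = 1.0

pt_entangled = partial_transpose_b(Phi, d, d)
eig_entangled = np.linalg.eigvalsh(pt_entangled)     # contains a negative eigenvalue

# A separable state: an equal mixture of |0,0><0,0| and |+,+><+,+|.
zero = np.array([[1.0, 0.0], [0.0, 0.0]])
plus = np.array([[0.5, 0.5], [0.5, 0.5]])
sep = 0.5 * np.kron(zero, zero) + 0.5 * np.kron(plus, plus)
eig_separable = np.linalg.eigvalsh(partial_transpose_b(sep, d, d))
```

The partial transpose of $\Phi_{AB}$ equals $F/d$, whose smallest eigenvalue is $-1/d$, whereas the partial transpose of the separable state remains positive semi-definite.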
Although the PPT criterion is generally only a necessary condition for separability
of a bipartite state, it is known to be both necessary and sufficient
for every quantum state 𝜌 𝐴𝐵 for which both 𝐴 and 𝐵 are qubits, or 𝐴 is a qubit and
𝐵 is a qutrit; please consult the Bibliographic Notes in Section 3.4. In higher
dimensions, however, there exist PPT states that are entangled. These PPT
entangled states turn out to be useless for the task of entanglement distillation (see
Chapter 13), and thus they are called bound entangled (although they are entangled,
they cannot be used to extract pure maximally entangled states at a non-zero rate).
Exercise 3.23
Prove that the swap operator 𝐹𝐴𝐵 possesses the following symmetry:
In Section 3.2.6, we defined permutation-invariant states, which are states that are
invariant under the action of the unitary operator 𝑊 𝜋 for every permutation 𝜋 ∈ S𝑛 .
Another important class of quantum states in quantum information theory consists
of bipartite states that are invariant under certain kinds of unitaries. There are two
distinct such classes of states that we define in this section.
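\[ \rho_{AB} = (U \otimes \overline{U})\, \rho_{AB}\, (U \otimes \overline{U})^{\dagger} \tag{3.2.129} \]
for every unitary $U$. For every isotropic state $\rho_{AB}$, there exists $p \in [0, 1]$ such
that $\rho_{AB} = \rho_{AB}^{\mathrm{iso};p}$, where
\[ \rho_{AB}^{\mathrm{iso};p} := p\, |\Phi\rangle\langle\Phi|_{AB} + \frac{1-p}{d^2-1} \left(\mathbb{1}_{AB} - |\Phi\rangle\langle\Phi|_{AB}\right), \tag{3.2.130} \]
where $|\Phi\rangle_{AB} = \frac{1}{\sqrt{d}} \sum_{j=0}^{d-1} |j,j\rangle_{AB}$.
(3.2.60)), that it has full rank for $p \in (0, 1)$, and that its eigenvalues are $p$ and $\frac{1-p}{d^2-1}$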
Exercise 3.24
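1. Verify that the isotropic state in (3.2.130) is invariant under $U \otimes \overline{U}$ for every
unitary $U$, i.e., verify that
\[ (U_A \otimes \overline{U}_B)\, \rho_{AB}^{\mathrm{iso};p}\, (U_A \otimes \overline{U}_B)^{\dagger} = \rho_{AB}^{\mathrm{iso};p} \tag{3.2.132} \]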
$a|\Phi\rangle\langle\Phi|_{AB} + (1-a)\frac{\mathbb{1}_{AB}}{d^2}$, where $a = \frac{pd^2-1}{d^2-1}$. Conclude that $a \in \left[\frac{-1}{d^2-1}, 1\right]$.
Just as every multipartite quantum state can be made permutation invariant via
the construction in (3.2.86), every bipartite quantum state $\rho_{AB}$, with $d_A = d_B$, can
be made invariant under $U \otimes \overline{U}$, i.e., isotropic, via the following construction:
\[ \int_U (U \otimes \overline{U})\, \rho_{AB}\, (U \otimes \overline{U})^{\dagger}\, \mathrm{d}U. \tag{3.2.133} \]
Please see the Bibliographic Notes (Section 3.4) for more information about this
result, as well as about integrals of functions of unitaries with respect to the Haar
measure.
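The defining properties of the isotropic state in (3.2.130) can be spot-checked numerically. The sketch below assumes NumPy; `isotropic_state` and `random_unitary` are illustrative helpers, and the unitary is simply some unitary obtained from a QR decomposition (it is not claimed to be Haar distributed).

```python
import numpy as np

def isotropic_state(p, d):
    """rho^{iso;p} = p Phi + (1 - p)/(d^2 - 1) (1 - Phi)."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)      # |Phi> = (1/sqrt(d)) sum_j |j,j>
    Phi = np.outer(phi, phi)
    return p * Phi + (1 - p) / (d**2 - 1) * (np.eye(d * d) - Phi)

def random_unitary(d, rng):
    """Some unitary matrix (the Q factor of a random complex matrix)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(g)
    return q

rng = np.random.default_rng(0)
d, p = 3, 0.4
rho_iso = isotropic_state(p, d)
U = random_unitary(d, rng)
UUbar = np.kron(U, U.conj())                          # U ⊗ conj(U)
rotated = UUbar @ rho_iso @ UUbar.conj().T
eigenvalues = np.linalg.eigvalsh(rho_iso)             # sorted in ascending order
```

With `p = 0.4` and `d = 3`, the spectrum consists of the eigenvalue $p$ once and $(1-p)/(d^2-1)$ with multiplicity $d^2 - 1$.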
The isotropic states constitute one class of bipartite quantum states in which
every state is invariant under the action of a unitary acting on the individual
subsystems. We now define a second class of such states.
𝜌 𝐴𝐵 = (𝑈 ⊗ 𝑈) 𝜌 𝐴𝐵 (𝑈 ⊗ 𝑈) † (3.2.135)
for every unitary $U$. For every Werner state $\rho_{AB}$, there exists $p \in [0, 1]$ such
that $\rho_{AB} = \rho_{AB}^{\mathrm{W};p}$, where
\[ \rho_{AB}^{\mathrm{W};p} := p\, \zeta_{AB} + (1 - p)\, \zeta_{AB}^{\perp}. \tag{3.2.136} \]
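\[ \zeta_{AB} := \frac{1}{d(d-1)} \left(\mathbb{1}_{AB} - F_{AB}\right), \tag{3.2.137} \]
\[ \zeta_{AB}^{\perp} := \frac{1}{d(d+1)} \left(\mathbb{1}_{AB} + F_{AB}\right), \tag{3.2.138} \]
and $F_{AB} = \sum_{i,j=0}^{d-1} |i,j\rangle\langle j,i|_{AB}$ is the swap operator.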
Exercise 3.25
1. Verify that the Werner state in (3.2.136) is invariant under 𝑈 ⊗ 𝑈 for every
unitary 𝑈, i.e., verify that
\[ (U_A \otimes U_B)\, \rho_{AB}^{\mathrm{W};p}\, (U_A \otimes U_B)^{\dagger} = \rho_{AB}^{\mathrm{W};p} \tag{3.2.140} \]
As before, the integral represents the uniform average over all unitaries, which can
be evaluated using a unitary two-design. In particular, for every state 𝜌 𝐴𝐵 ,
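\[ \int_U (U \otimes U)\, \rho_{AB}\, (U \otimes U)^{\dagger}\, \mathrm{d}U = \rho_{AB}^{\mathrm{W};p}, \qquad p = \mathrm{Tr}\!\left[\Pi_{AB}^{\mathrm{ASym}} \rho_{AB}\right]. \tag{3.2.142} \]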
Please see the Bibliographic Notes (Section 3.4) for more information.
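A short numerical check of the Werner-state definitions in (3.2.136) to (3.2.138) is sketched below, assuming NumPy. The helpers are illustrative, and the projection onto the anti-symmetric subspace is taken to be $(\mathbb{1} - F)/2$, which follows from (3.2.127).

```python
import numpy as np

def swap_operator(d):
    """F = sum_{i,j} |i,j><j,i| on C^d ⊗ C^d."""
    F = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            F[i * d + j, j * d + i] = 1.0
    return F

def werner_state(p, d):
    """rho^{W;p} = p zeta + (1 - p) zeta_perp."""
    F, one = swap_operator(d), np.eye(d * d)
    zeta = (one - F) / (d * (d - 1))
    zeta_perp = (one + F) / (d * (d + 1))
    return p * zeta + (1 - p) * zeta_perp

rng = np.random.default_rng(0)
d, p = 3, 0.3
rho_w = werner_state(p, d)

g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
U, _ = np.linalg.qr(g)                                # some unitary matrix
UU = np.kron(U, U)                                    # U ⊗ U (no complex conjugate)
rotated = UU @ rho_w @ UU.conj().T

proj_asym = (np.eye(d * d) - swap_operator(d)) / 2    # projector onto ASym_2(C^d)
p_recovered = np.trace(proj_asym @ rho_w).real
```

The weight of the state on the anti-symmetric subspace reproduces the parameter $p$, consistent with the formula for $p$ in (3.2.142).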
3.3 Measurements
Measurements in quantum mechanics are described by positive operator-valued
measures (POVMs).
\[ M_x \geq 0 \quad \forall\, x \in \mathcal{X}, \qquad \sum_{x\in\mathcal{X}} M_x = \mathbb{1}. \tag{3.3.1} \]
For our purposes, it suffices to consider finite sets of such operators. The
elements of the finite alphabet X are used to label the outcomes of the
measurement.
𝑀𝑥 = 𝑉 † ( 1 ⊗ |𝑥⟩⟨𝑥|)𝑉 ∀ 𝑥 ∈ X, (3.3.3)
where the last equality holds because {𝑀𝑥 }𝑥∈X is a POVM. It is straightforward to
check that (3.3.3) is satisfied for the choice of 𝑉 in (3.3.4). ■
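The relation (3.3.3) can be illustrated numerically. Since the explicit isometry of (3.3.4) is not reproduced in this passage, the sketch below assumes the standard choice $V = \sum_{x} \sqrt{M_x} \otimes |x\rangle_P$, and it uses an illustrative three-outcome qubit POVM; NumPy is assumed, and all names are hypothetical.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.conj().T

# Illustrative POVM: M_x = (2/3)|v_x><v_x| with three real unit vectors.
povm = []
for t in [0, np.pi / 3, 2 * np.pi / 3]:
    v = np.array([np.cos(t), np.sin(t)])
    povm.append(2 / 3 * np.outer(v, v))

n, d = len(povm), 2
# Assumed form of the isometry: V = sum_x sqrt(M_x) ⊗ |x>_P, mapping A to A'P.
V = sum(np.kron(psd_sqrt(M), np.eye(n)[:, [x]]) for x, M in enumerate(povm))

recovered = []
for x in range(n):
    proj_x = np.zeros((n, n))
    proj_x[x, x] = 1                                   # |x><x| on the probe P
    recovered.append(V.conj().T @ np.kron(np.eye(d), proj_x) @ V)
```

Each `recovered[x]` equals $V^{\dagger}(\mathbb{1} \otimes |x\rangle\langle x|)V$, and $V^{\dagger}V = \mathbb{1}$ holds because the POVM elements sum to the identity.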
isometry $V_{A\to A'P}$, and if $\rho_{A'P} = V \rho_A V^{\dagger}$ is the joint state of the system and the probe
after the interaction, then the measurement outcome probabilities are
𝑝 𝑋 (𝑥) = Tr[( 1 𝐴′ ⊗ |𝑥⟩⟨𝑥|) 𝜌 𝐴′ 𝑃 ] (3.3.9)
= Tr[( 1 𝐴′ ⊗ |𝑥⟩⟨𝑥|)𝑉 𝜌 𝐴𝑉 † ] (3.3.10)
= Tr[𝑉 † ( 1 𝐴′ ⊗ |𝑥⟩⟨𝑥|)𝑉 𝜌 𝐴 ] (3.3.11)
= Tr[𝑀𝑥 𝜌 𝐴 ], (3.3.12)
Exercise 3.26
Consider the qubit state vectors
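\[ |\psi_k\rangle := \cos\!\left(\frac{2\pi k}{5}\right)|0\rangle + \sin\!\left(\frac{2\pi k}{5}\right)|1\rangle, \quad k \in \{0, 1, 2, 3, 4\}. \tag{3.3.13} \]
Verify that the set $\left\{\frac{2}{5}|\psi_k\rangle\langle\psi_k|\right\}_{k=0}^{4}$ is a POVM. Note that this POVM gives us an
example of a non-projective measurement.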
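A numerical sanity check of the claim in this exercise (not a substitute for the proof) can be sketched as follows, assuming NumPy; the variable names are illustrative.

```python
import numpy as np

elements = []
for k in range(5):
    t = 2 * np.pi * k / 5
    psi = np.array([np.cos(t), np.sin(t)])            # |psi_k> from (3.3.13)
    elements.append(2 / 5 * np.outer(psi, psi))       # M_k = (2/5)|psi_k><psi_k|

total = sum(elements)                                 # should be the identity
min_eigs = [np.linalg.eigvalsh(M).min() for M in elements]
is_projective = all(np.allclose(M @ M, M) for M in elements)
```

Each element is positive semi-definite and the elements sum to the identity, but none of them squares to itself, so the measurement is not projective.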
we are interested in knowing the state of the system after we have measured it and
observed the outcome.
Since every POVM element $M_x$ is positive semi-definite, there exists an operator
$K_x$ such that $M_x = K_x^{\dagger} K_x$ for all $x \in \mathcal{X}$. For example, we could let $K_x$ be the square
root of $M_x$, so that $K_x = \sqrt{M_x}$. Then, the Born rule in (3.3.2) for the probability of
the measurement outcome $x \in \mathcal{X}$ can be written as $p_X(x) = \mathrm{Tr}[K_x \rho K_x^{\dagger}]$. In this
case, the post-measurement state corresponding to the outcome $x \in \mathcal{X}$ is as follows:
\[ \rho^x := \frac{K_x \rho K_x^{\dagger}}{\mathrm{Tr}[K_x \rho K_x^{\dagger}]}. \tag{3.3.14} \]
The state 𝜌 𝑥 can be understood to capture the experimenter’s description of the
state of the system given that the measurement outcome was observed to be 𝑥.
The post-measurement states 𝜌 𝑥 give rise to the ensemble {( 𝑝 𝑋 (𝑥), 𝜌 𝑥 )}𝑥∈X .
The expected density operator of the ensemble is
\[ \rho_{\mathcal{M}} := \sum_{x\in\mathcal{X}} p_X(x)\, \rho^x = \sum_{x\in\mathcal{X}} K_x \rho K_x^{\dagger}. \tag{3.3.15} \]
This expected density operator can be interpreted as the state of the system after
the measurement when the experimenter does not have access to the
measurement outcome.
Due to the unitary freedom in the decomposition $M_x = K_x^{\dagger} K_x$, other choices
of $K_x$ are given by $K_x = U_x \sqrt{M_x}$ for some unitary $U_x$, so that there is not a unique
way to determine the post-measurement state when starting from a POVM.
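The following sketch, assuming NumPy and an illustrative two-outcome qubit POVM, shows the Born rule, the post-measurement states of (3.3.14), the expected state of (3.3.15), and the effect of the unitary freedom: a different choice $K_x = U_x\sqrt{M_x}$ leaves the outcome probabilities unchanged but alters the post-measurement state.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.conj().T

def measure(kraus, rho):
    """Return outcome probabilities and normalized post-measurement states."""
    unnormalized = [K @ rho @ K.conj().T for K in kraus]
    probs = [np.trace(u).real for u in unnormalized]
    posts = [u / p for u, p in zip(unnormalized, probs)]
    return probs, posts

rho = np.array([[0.7, 0.2], [0.2, 0.3]])
M0 = np.array([[0.8, 0.1], [0.1, 0.3]])
povm = [M0, np.eye(2) - M0]                        # illustrative POVM {M_0, M_1}

kraus = [psd_sqrt(M) for M in povm]                # K_x = sqrt(M_x)
pauli_x = np.array([[0.0, 1.0], [1.0, 0.0]])
kraus_alt = [pauli_x @ kraus[0], kraus[1]]         # K_0 = U_0 sqrt(M_0), with U_0 = X

probs, posts = measure(kraus, rho)
probs_alt, posts_alt = measure(kraus_alt, rho)
rho_expected = sum(p * s for p, s in zip(probs, posts))
```

Both choices of Kraus operators give the same probabilities, $\mathrm{Tr}[M_x\rho]$, while the post-measurement state for outcome 0 differs by conjugation with the unitary.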
Suppose now that we perform a measurement on a subsystem of a composite
system. Specifically, consider measuring a system 𝐴 that is in a joint state 𝜌 𝑅 𝐴 with
a reference system 𝑅, and let the measurement be described by the POVM {𝑀 𝐴𝑥 }𝑥∈X
for some finite alphabet X. If we let 𝑀 𝐴𝑥 = 𝐾 𝐴𝑥† 𝐾 𝐴𝑥 , then according to (3.3.14), the
measurement probabilities are given by 𝑝 𝑋 (𝑥) = Tr[( 1 𝑅 ⊗ 𝐾 𝐴𝑥 ) 𝜌 𝑅 𝐴 ( 1 𝑅 ⊗ 𝐾 𝐴𝑥† )]
and the post-measurement states are as follows:
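\begin{align}
\rho_{RA}^x &= \frac{(\mathbb{1}_R \otimes K_A^x)\, \rho_{RA}\, (\mathbb{1}_R \otimes K_A^{x\dagger})}{\mathrm{Tr}[(\mathbb{1}_R \otimes K_A^x)\, \rho_{RA}\, (\mathbb{1}_R \otimes K_A^{x\dagger})]} \tag{3.3.16} \\
&= \frac{1}{p_X(x)} (\mathbb{1}_R \otimes K_A^x)\, \rho_{RA}\, (\mathbb{1}_R \otimes K_A^{x\dagger}) \tag{3.3.17}
\end{align}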
for all 𝑥 ∈ X. The state of the system 𝑅 conditioned on the measurement outcome
𝑥 is then
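\begin{align}
\rho_R^x &:= \mathrm{Tr}_A[\rho_{RA}^x] \tag{3.3.18} \\
&= \frac{1}{p_X(x)} \mathrm{Tr}_A[(\mathbb{1}_R \otimes K_A^x)\, \rho_{RA}\, (\mathbb{1}_R \otimes K_A^{x\dagger})] \tag{3.3.19} \\
&= \frac{1}{p_X(x)} \mathrm{Tr}_A[(\mathbb{1}_R \otimes K_A^{x\dagger} K_A^x)\, \rho_{RA}] \tag{3.3.20} \\
&= \frac{1}{p_X(x)} \mathrm{Tr}_A[(\mathbb{1}_R \otimes M_A^x)\, \rho_{RA}]. \tag{3.3.21}
\end{align}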
We thus see that, although the post-measurement state on the system 𝐴 being
measured is not uniquely defined due to the unitary freedom in the decomposition
𝑀 𝐴𝑥 = 𝐾 𝐴𝑥† 𝐾 𝐴𝑥 , as described earlier, the post-measurement state on the reference
system 𝑅 not being measured is uniquely defined because it depends directly on
each POVM element 𝑀 𝐴𝑥 . If the system 𝐴 undergoes a measurement for which
𝑀 𝐴𝑥 = |𝜓 𝑥 ⟩⟨𝜓 𝑥 | 𝐴 , then (3.3.21) can be written as
\[ \rho_R^x = \frac{1}{p_X(x)} \langle\psi^x|_A\, \rho_{RA}\, |\psi^x\rangle_A. \tag{3.3.22} \]
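The uniqueness of the conditional state on the reference system can be seen numerically. In this sketch (assuming NumPy, with an illustrative entangled state and POVM element; all helper names are hypothetical), two different Kraus operators for the same POVM element give the same state on $R$, which also agrees with the expression in (3.3.21).

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.conj().T

def trace_out_a(X, d_r, d_a):
    """Partial trace over the second system A of an operator on R ⊗ A."""
    return X.reshape(d_r, d_a, d_r, d_a).trace(axis1=1, axis2=3)

def conditional_state_r(K, rho_ra, d_r, d_a):
    """Normalized Tr_A[(1 ⊗ K) rho (1 ⊗ K)^†], as in (3.3.19)."""
    op = np.kron(np.eye(d_r), K)
    unnormalized = trace_out_a(op @ rho_ra @ op.conj().T, d_r, d_a)
    return unnormalized / np.trace(unnormalized).real

# Illustrative entangled state sqrt(0.7)|0,0> + sqrt(0.3)|1,1> on R ⊗ A.
psi = np.zeros(4)
psi[0], psi[3] = np.sqrt(0.7), np.sqrt(0.3)
rho_ra = np.outer(psi, psi)

M = np.array([[0.8, 0.1], [0.1, 0.3]])              # one POVM element on A
K = psd_sqrt(M)
U = np.array([[0.0, 1.0], [1.0, 0.0]])              # unitary freedom: K' = U K

from_k = conditional_state_r(K, rho_ra, 2, 2)
from_uk = conditional_state_r(U @ K, rho_ra, 2, 2)
direct = trace_out_a(np.kron(np.eye(2), M) @ rho_ra, 2, 2)
direct = direct / np.trace(direct).real              # right-hand side of (3.3.21)
```

The three results coincide, reflecting the fact that the partial trace over $A$ is cyclic with respect to operators acting on $A$ alone.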
Exercise 3.27
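Consider a finite set $\{\rho_x\}_{x\in\mathcal{X}}$ of quantum states, and let $R := \sum_{x\in\mathcal{X}} \rho_x$.
1. If $R$ is invertible, let
\[ M_x := R^{-\frac{1}{2}} \rho_x R^{-\frac{1}{2}} \quad \forall\, x \in \mathcal{X}. \tag{3.3.23} \]
\[ M_x := R^{-\frac{1}{2}} \rho_x R^{-\frac{1}{2}} + \mathbb{1} - \Pi_R \quad \forall\, x \in \mathcal{X}, \tag{3.3.24} \]
where $\Pi_R$ is the projection onto the support of $R$. Prove that $\{M_x\}_{x\in\mathcal{X}}$ is a
POVM. (Hint: Recall the definitions from Section 2.2.8.1.)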
unitary operators with respect to the Haar measure, including proofs of (3.2.134)
and (3.2.142), can be found in (Collins, 2003; Collins and Śniady, 2006; Roy and
Scott, 2009).
Proof: First suppose that 𝑋 𝐴𝐵 is rank one, so that 𝑋 𝐴𝐵 = |Ψ⟩⟨Ψ| 𝐴𝐵 for some vector
|Ψ⟩ 𝐴𝐵 ∈ H 𝐴 ⊗ H𝐵 . Due to the Schmidt decomposition theorem (Theorem 2.2),
we have that
\[ |\Psi\rangle_{AB} = \sum_{z\in\mathcal{Z}} \gamma_z\, |\theta_z\rangle_A \otimes |\xi_z\rangle_B, \tag{3.A.1} \]
where |Z| ≤ min{dim(H 𝐴 ), dim(H𝐵 )}, the set {𝛾𝑧 } 𝑧 is a set of strictly positive
numbers, and {|𝜃 𝑧 ⟩ 𝐴 } 𝑧 and {|𝜉 𝑧 ⟩𝐵 } 𝑧 are orthonormal bases. Then
The statement then follows for this case because supp(𝑋 𝐴 ) = span{|𝜃 𝑧 ⟩ 𝐴 : 𝑧 ∈ Z}
and supp(𝑋𝐵 ) = span{|𝜉 𝑧 ⟩𝐵 : 𝑧 ∈ Z}.
Now suppose that 𝑋 𝐴𝐵 is not rank one. It admits a decomposition into rank-one
vectors of the following form:
\[ X_{AB} = \sum_{x\in\mathcal{X}} |\Psi^x\rangle\langle\Psi^x|_{AB}, \tag{3.A.4} \]
where |Ψ𝑥 ⟩ 𝐴𝐵 ∈ H 𝐴 ⊗ H𝐵 for all 𝑥 ∈ X. Set Ψ𝑥𝐴𝐵 = |Ψ𝑥 ⟩⟨Ψ𝑥 | 𝐴𝐵 , and let
Ψ𝑥𝐴 B Tr 𝐵 [Ψ𝑥𝐴𝐵 ] and Ψ𝐵𝑥 B Tr 𝐴 [Ψ𝑥𝐴𝐵 ]. Then
Chapter 4
Quantum Channels
In the previous chapter, we studied quantum states and measurements. These two
topics constitute the first three axioms of quantum mechanics, as presented in
Section 3.1. The fourth and final axiom is about the evolution of quantum systems,
which is the subject of this chapter. Mathematically, the evolution is described by a
quantum channel. As quantum communication necessarily involves the evolution
of quantum systems (such as the evolution of photons when travelling through
an optical fiber), quantum channels are the primary objects of study in this book.
This chapter is devoted to a detailed study of quantum channels, including their
properties, representations, and various examples that are relevant for quantum
communication and quantum information more broadly.
The fourth axiom in Section 3.1 states that a quantum channel is a “linear,
completely positive, and trace-preserving map acting on the state of the system.” At
first glance, this appears to be a purely mathematical statement (which we elaborate
upon in Section 4.1), with seemingly little connection to physics. However, we can
connect this statement to the axiom of evolution of quantum systems as taught in a
basic quantum physics course. In such a course, one learns that the evolution of a
(non-relativistic) quantum system is governed by the Schrödinger equation:
\[ i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = H(t)|\psi(t)\rangle, \tag{4.0.1} \]
where |𝜓(𝑡)⟩ is the state vector of the system at time 𝑡 ≥ 0 and 𝐻 (𝑡) is the
Hamiltonian operator of the system at time 𝑡. The Hamiltonian operator is a
Hermitian operator that describes the energy of the system. Now, we know from
Chapter 3 that the state of a quantum system is described more generally by a
density operator. The analogue of (4.0.1) for density operators is known as the von
Neumann equation:
\[ i\hbar \frac{\partial \rho(t)}{\partial t} = [H(t), \rho(t)], \tag{4.0.2} \]
where 𝜌(𝑡) is the density operator describing the state of the system at time 𝑡 ≥ 0,
and [𝐻 (𝑡), 𝜌(𝑡)] = 𝐻 (𝑡) 𝜌(𝑡) − 𝜌(𝑡)𝐻 (𝑡) is the commutator of the Hamiltonian
𝐻 (𝑡) and the state 𝜌(𝑡).
Both (4.0.1) and (4.0.2) describe the evolution of so-called closed quantum
systems, and this evolution is given by unitary maps. In other words, the solution
to (4.0.1) is |𝜓(𝑡)⟩ = 𝑈 (𝑡)|𝜓0 ⟩ for all 𝑡 ≥ 0, where |𝜓0 ⟩ is an initial state vector of
the system (at time 𝑡 = 0) and 𝑈 (𝑡) is a unitary operator. Similarly, the solution
to (4.0.2) is 𝜌(𝑡) = 𝑈 (𝑡) 𝜌0𝑈 (𝑡) † for all 𝑡 ≥ 0, where 𝜌0 is an initial quantum
state of the system (at time 𝑡 = 0) and 𝑈 (𝑡) is a unitary operator. We refer to the
Bibliographic Notes in Section 4.8 for references on explicit forms for the unitary
𝑈 (𝑡). We show in this chapter that unitary maps are quantum channels. This fact
provides a connection between the mathematical statement of the evolution axiom
in Section 3.1 and the statement of the evolution axiom typically taught in quantum
physics courses.
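As an illustration of the closed-system picture, the sketch below treats the special case of a time-independent Hamiltonian, for which $U(t) = e^{-iHt/\hbar}$. This restriction, the choice $\hbar = 1$, and the particular $H$ and $\rho_0$ are assumptions made for the example. It checks that $\rho(t) = U(t)\rho_0 U(t)^{\dagger}$ satisfies the von Neumann equation (4.0.2) up to finite-difference error; NumPy is assumed.

```python
import numpy as np

def evolution_unitary(H, t, hbar=1.0):
    """U(t) = exp(-i H t / hbar) for a time-independent Hermitian H."""
    vals, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * vals * t / hbar)) @ vecs.conj().T

H = np.array([[1.0, 0.5], [0.5, -1.0]])                  # illustrative Hamiltonian
rho0 = np.array([[0.75, 0.25j], [-0.25j, 0.25]])         # illustrative initial state

def rho_at(t):
    U = evolution_unitary(H, t)
    return U @ rho0 @ U.conj().T

t, dt = 0.8, 1e-6
U_t = evolution_unitary(H, t)
lhs = 1j * (rho_at(t + dt) - rho_at(t - dt)) / (2 * dt)  # i * d(rho)/dt, with hbar = 1
rhs = H @ rho_at(t) - rho_at(t) @ H                      # commutator [H, rho(t)]
```

The evolved operator remains a density operator with the same spectrum as `rho0`, since conjugation by a unitary preserves eigenvalues.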
More generally, we are interested in the evolution of open quantum systems,
i.e., quantum systems that interact with an external environment that is out of
our control. For such systems, the same connection as before holds. In fact, the
evolution is given by a joint unitary evolution of the system and environment
followed by discarding the state of the environment, and as we show in Section 4.3,
every completely positive trace-preserving map (i.e., every quantum channel) can
be viewed in terms of a joint unitary evolution with an environment followed by
discarding the state of the environment. (Please see the Bibliographic Notes in
Section 4.8 for references on open quantum systems.) Thus, from an abstract,
information-theoretic perspective, the evolution of a quantum system is given
simply by a quantum channel, and the details of the actual physical system of
interest (which would be given by the Hamiltonian operator) are unimportant. This
viewpoint is powerful: with it, we realize that virtually every operation on quantum
states, including measurements, is a quantum channel.
4.1 Definition
We can motivate the definition of a quantum channel by using the following basic
mathematical facts that should be satisfied by a map N : L(H) → L(H′) that
represents the evolution of a quantum system:
1. If N acts on a mixture of quantum states, then the output state should be equal
to the mixture of the individual outputs. That is,
N(𝜆𝜌 + (1 − 𝜆)𝜎) = 𝜆N(𝜌) + (1 − 𝜆)N(𝜎) (4.1.1)
for all states 𝜌 and 𝜎 and 𝜆 ∈ [0, 1]. This requirement is called convex linearity
on density operators, and for each convex linear map acting on the convex set
of density operators, it is possible to define a unique linear map acting on the
space of all linear operators. The latter is the mathematical object that we
employ, and so we require that N be a linear map, i.e., a superoperator. (Recall
the definition of a superoperator from Section 2.2.11.)
2. The map N should accept a quantum state (or a mixture of quantum states)
as input and output a legitimate quantum state. This means that N should
be trace preserving and positive. However, it is furthermore reasonable to
demand that if the channel acts on one share 𝐴 of a bipartite quantum state
𝜌 𝑅 𝐴 , then the output should be a legitimate bipartite quantum state. So we
demand additionally that a quantum channel should be not just positive, but
additionally completely positive. Let us now define these terms.
(a) N is called trace preserving if Tr[N(𝑋)] = Tr[𝑋] for every linear oper-
ator 𝑋. More generally, N is called trace non-increasing if Tr[N(𝑋)] ≤
Tr[𝑋] for every positive semi-definite operator 𝑋.
(b) N is called positive if it maps positive semi-definite operators to positive
semi-definite operators, i.e., N(𝑋) ≥ 0 for all 𝑋 ≥ 0. It is called 𝑘-positive,
with 𝑘 ≥ 1, if the map id𝑘 ⊗ N is positive. Note that if N is a map
acting on linear operators in L(C𝑑 ), then the map id𝑘 ⊗ N acts on linear
operators in L(C𝑘 𝑑 ). In other words, for every linear operator 𝑋 acting on
a 𝑘 𝑑-dimensional Hilbert space, which we can decompose as the block
matrix
\[ X = \begin{pmatrix} X_{0,0} & \cdots & X_{0,k-1} \\ \vdots & \ddots & \vdots \\ X_{k-1,0} & \cdots & X_{k-1,k-1} \end{pmatrix}, \tag{4.1.2} \]
book, with time increasing horizontally towards the right, and spatial separations
indicated vertically. In (a), the input and output systems 𝐴 and 𝐵, respectively,
of the quantum channel N are temporally separated but not spatially separated.
In (b), 𝐴 and 𝐵 are both spatially and temporally separated. We often draw a
dashed line to indicate the spatial separation explicitly.
Exercise 4.1
Let N be a 𝑘-positive superoperator, for an integer 𝑘 ≥ 1. Prove that N is
Hermiticity preserving (recall Definition 2.17). (Hint: See Exercise 2.17 and
use the Jordan–Hahn decomposition.)
Exercise 4.2
Prove that a superoperator N 𝐴→𝐵 is trace-non-increasing if and only if its adjoint
is subunital, meaning that N† ( 1𝐵 ) ≤ 1 𝐴 . Prove that the inequality is saturated,
i.e., that N† ( 1𝐵 ) = 1 𝐴 (N† is unital), if and only if N is trace preserving.
Exercise 4.3
1. Let N 𝐴→𝐵 be a positive superoperator. Starting with (2.2.186), prove that
\[ \|\mathcal{N}\|_1 = \left\|\mathcal{N}^{\dagger}(\mathbb{1}_B)\right\|_{\infty}. \tag{4.1.6} \]
Combining the result of Exercise 4.3 and (2.2.185), we conclude that for every
positive, trace-non-increasing superoperator N 𝐴→𝐵 , and every linear operator
𝑋 ∈ L(H 𝐴 ),
∥N(𝑋) ∥ 1 ≤ ∥ 𝑋 ∥ 1 . (4.1.7)
An inequality of this type is called a data-processing inequality, for which we
provide an interpretation later on in Section 6.1. We encounter numerous such
inequalities with respect to various different quantities throughout the rest of this
book, and they turn out to be of central importance in the analysis of quantum
communication protocols, and in quantum information theory more generally.
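The inequality (4.1.7) can be observed numerically for a concrete channel. The sketch below assumes NumPy and uses the qubit amplitude damping channel purely as an illustrative example of a completely positive, trace-preserving map (it is not introduced in this passage); the trace norm is computed with NumPy's nuclear norm.

```python
import numpy as np

def amplitude_damping_kraus(gamma):
    """Kraus operators of the qubit amplitude damping channel (illustrative example)."""
    k0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]])
    k1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
    return [k0, k1]

def apply_kraus(kraus, X):
    """N(X) = sum_i K_i X K_i^†."""
    return sum(K @ X @ K.conj().T for K in kraus)

def trace_norm(X):
    """||X||_1, the sum of the singular values of X."""
    return np.linalg.norm(X, ord="nuc")

rng = np.random.default_rng(1)
kraus = amplitude_damping_kraus(0.3)
completeness = sum(K.conj().T @ K for K in kraus)      # equals the identity (trace preserving)

norm_pairs = []
for _ in range(5):
    X = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    norm_pairs.append((trace_norm(apply_kraus(kraus, X)), trace_norm(X)))
```

For every sampled operator, the trace norm of the output does not exceed that of the input.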
Exercise 4.4
Let N 𝐴→𝐵 be a superoperator.
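1. Prove that $\mathrm{Tr}_A[\Gamma_{AB}^{\mathcal{N}}] = \mathcal{N}_{A\to B}(\mathbb{1}_A)$.
2. Prove that $\mathcal{N}_{A\to B}$ is trace preserving if and only if $\mathrm{Tr}_B[\Gamma_{AB}^{\mathcal{N}}] = \mathbb{1}_A$. Conclude
that $\mathrm{Tr}[\Phi_{AB}^{\mathcal{N}}] = 1$.
3. Prove that $\left\langle X_A \otimes Y_B, \Gamma_{AB}^{\mathcal{N}} \right\rangle = \langle Y_B, \mathcal{N}_{A\to B}(\overline{X}_A)\rangle$ for all $X_A \in \mathrm{L}(\mathcal{H}_A)$ and
$Y_B \in \mathrm{L}(\mathcal{H}_B)$.
4. Using 3., prove that the Choi representation of $\mathcal{N}$ can be expressed using
the adjoint $\mathcal{N}^{\dagger}$ as follows:
\[ \Gamma_{AB}^{\mathcal{N}} = \sum_{k,\ell=0}^{d_B-1} \overline{\mathcal{N}^{\dagger}(|k\rangle\langle\ell|_B)} \otimes |k\rangle\langle\ell|_B. \tag{4.2.3} \]
Conclude that $\mathrm{Tr}_B[\Gamma_{AB}^{\mathcal{N}}] = \overline{\mathcal{N}^{\dagger}(\mathbb{1}_B)}$.
\[ \Gamma_{AB}^{\mathcal{N}} = (\overline{\mathcal{U}}_A \otimes (\mathcal{N}_{A'\to B} \circ \mathcal{U}_{A'}))(\Gamma_{AA'}), \tag{4.2.4} \]
where $\mathcal{U}_A(\cdot) := U_A (\cdot) U_A^{\dagger}$ and $\overline{\mathcal{U}}_A(\cdot) := \overline{U}_A (\cdot) \overline{U}_A^{\dagger}$.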
\[ = \mathrm{Tr}_A[(X_A^{\mathsf{T}} \otimes \mathbb{1}_B)\, \Gamma_{AB}^{\mathcal{N}}]. \tag{4.2.11} \]
Now employing (2.2.41) and (2.2.40), we conclude that the action of N 𝐴→𝐵 on
every linear operator 𝑋 𝐴 can be expressed alternatively as
\[ \mathcal{N}_{A\to B}(X_A) = \langle\Gamma|_{A'A} (X_{A'} \otimes \Gamma_{AB}^{\mathcal{N}}) |\Gamma\rangle_{A'A}. \tag{4.2.13} \]
The identities in (4.2.5) and (4.2.13) extend more generally to the case of the
superoperator $\mathrm{id}_R \otimes \mathcal{N}_{A\to B}$ acting on a bipartite operator $X_{RA}$ by expanding $X_{RA}$
as $X_{RA} = \sum_{i,j=0}^{d_R-1} |i\rangle\langle j|_R \otimes X_A^{i,j}$ and using linearity. We thus conclude (4.2.5) and
(4.2.6). ■
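The role of the Choi representation in reproducing the action of a map can be sketched numerically. The code below assumes NumPy and the standard definition $\Gamma_{AB}^{\mathcal{N}} = \sum_{i,j} |i\rangle\langle j|_A \otimes \mathcal{N}(|i\rangle\langle j|)$; the amplitude damping channel is only an illustrative example, and the helper names are hypothetical.

```python
import numpy as np

def apply_kraus(kraus, X):
    """N(X) = sum_i K_i X K_i^†."""
    return sum(K @ X @ K.conj().T for K in kraus)

def choi_operator(kraus, d_a):
    """Gamma^N_{AB} = sum_{i,j} |i><j|_A ⊗ N(|i><j|) (assumed standard definition)."""
    d_b = kraus[0].shape[0]
    gamma = np.zeros((d_a * d_b, d_a * d_b), dtype=complex)
    for i in range(d_a):
        for j in range(d_a):
            E = np.zeros((d_a, d_a))
            E[i, j] = 1
            gamma += np.kron(E, apply_kraus(kraus, E))
    return gamma

def apply_via_choi(gamma, X, d_a, d_b):
    """N(X) = Tr_A[(X^T ⊗ 1_B) Gamma^N_{AB}], as in (4.2.11)."""
    prod = np.kron(X.T, np.eye(d_b)) @ gamma
    return prod.reshape(d_a, d_b, d_a, d_b).trace(axis1=0, axis2=2)

g = 0.3                                               # damping parameter (illustrative)
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]]),
         np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]
gamma_n = choi_operator(kraus, 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
direct = apply_kraus(kraus, X)
via_choi = apply_via_choi(gamma_n, X, 2, 2)
marginal_a = gamma_n.reshape(2, 2, 2, 2).trace(axis1=1, axis2=3)   # Tr_B[Gamma^N]
```

For this trace-preserving example, the marginal of the Choi operator on $A$ is the identity, and the Choi operator itself is positive semi-definite.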
Exercise 4.5
Prove that a superoperator $\mathcal{N}_{A\to B}$ is Hermiticity preserving (recall the definition
in Section 2.2.11) if and only if its Choi representation $\Gamma_{AB}^{\mathcal{N}}$ is Hermitian.
Using the definition of the Choi state and the maximally entangled state Φ 𝐴′ 𝐴 ,
we can write (4.2.6) as
\[ \langle\Phi|_{A'A} (X_{RA'} \otimes \Phi_{AB}^{\mathcal{N}}) |\Phi\rangle_{A'A} = \frac{1}{d_A^2}\, \mathcal{N}_{A'\to B}(X_{RA'}). \tag{4.2.14} \]
Comparing this equation with (3.3.22), we see that it has the following physical
interpretation: if we start with the systems $R$, $A'$, $A$, and $B$ in the state $\rho_{RA'} \otimes \Phi_{AB}^{\mathcal{N}}$
and we measure $A'$ and $A$ according to the POVM $\{|\Phi\rangle\langle\Phi|_{A'A}, \mathbb{1}_{A'A} - |\Phi\rangle\langle\Phi|_{A'A}\}$,
then the outcome corresponding to $|\Phi\rangle\langle\Phi|_{A'A}$ occurs with probability $\frac{1}{d_A^2}$ and the
post-measurement state on systems $R$ and $B$ is $\mathcal{N}_{A'\to B}(\rho_{RA'})$. The Choi state $\Phi_{AB}^{\mathcal{N}}$
can thus be viewed as a resource state for the probabilistic implementation of the
channel $\mathcal{N}$. We return to this point in Section 5.1 when we discuss post-selected
quantum teleportation.
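The identity (4.2.14) can be checked in the simplest setting of a trivial reference system $R$. The sketch below assumes NumPy, the system ordering $A' \otimes A \otimes B$, and the amplitude damping channel as an illustrative example; the Choi state is taken to be $\Phi_{AB}^{\mathcal{N}} = \Gamma_{AB}^{\mathcal{N}}/d_A$.

```python
import numpy as np

def apply_kraus(kraus, X):
    """N(X) = sum_i K_i X K_i^†."""
    return sum(K @ X @ K.conj().T for K in kraus)

d = 2
g = 0.3                                               # damping parameter (illustrative)
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]]),
         np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]

# Choi state Phi^N_{AB} = (1/d) sum_{i,j} |i><j|_A ⊗ N(|i><j|)
choi_state = np.zeros((d * d, d * d), dtype=complex)
for i in range(d):
    for j in range(d):
        E = np.zeros((d, d))
        E[i, j] = 1
        choi_state += np.kron(E, apply_kraus(kraus, E)) / d

phi = np.eye(d).reshape(d * d) / np.sqrt(d)           # |Phi>_{A'A}
bra = np.kron(phi.conj().reshape(1, -1), np.eye(d))   # <Phi|_{A'A} ⊗ 1_B

rho = np.array([[0.7, 0.2j], [-0.2j, 0.3]])           # input state on A' (trivial R)
unnormalized = bra @ np.kron(rho, choi_state) @ bra.conj().T
success_probability = np.trace(unnormalized).real     # expected to be 1/d^2
post_selected = unnormalized / success_probability
```

Conditioned on the outcome $|\Phi\rangle\langle\Phi|_{A'A}$, the state left on $B$ is $\mathcal{N}(\rho)$, and this outcome occurs with probability $1/d^2$ regardless of the input state.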
The concept of the Choi state allows us to associate to each quantum channel
N 𝐴→𝐵 a bipartite quantum state. Conversely, given a bipartite state 𝜌 𝐴𝐵 , we can
associate a map given by
𝑋 𝐴 ↦→ 𝑑 𝐴 Tr 𝐴 [(𝑋 𝐴T ⊗ 1𝐵 ) 𝜌 𝐴𝐵 ]. (4.2.15)
Exercise 4.6
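1. Given two superoperators $(\mathcal{N}_1)_{A_1\to B_1}$ and $(\mathcal{N}_2)_{A_2\to B_2}$, prove that the Choi representation of $\mathcal{N}_1 \otimes \mathcal{N}_2$ is given by
$$\Gamma^{\mathcal{N}_1\otimes\mathcal{N}_2}_{A_1A_2B_1B_2} = \Gamma^{\mathcal{N}_1}_{A_1B_1} \otimes \Gamma^{\mathcal{N}_2}_{A_2B_2}. \tag{4.2.17}$$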
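2. Given two superoperators $\mathcal{N}_{A\to B}$ and $\mathcal{M}_{B\to C}$, prove that the Choi representation of the composition $(\mathcal{M}\circ\mathcal{N})_{A\to C}$ is given by
$$\begin{aligned}
\Gamma^{\mathcal{M}\circ\mathcal{N}}_{AC} &= \mathcal{M}_{B\to C}(\Gamma^{\mathcal{N}}_{AB}) && (4.2.18)\\
&= \operatorname{Tr}_B[\mathrm{T}_B(\Gamma^{\mathcal{N}}_{AB})\,\Gamma^{\mathcal{M}}_{BC}] && (4.2.19)\\
&= \langle\Gamma|_{BB'}\,\Gamma^{\mathcal{N}}_{AB} \otimes \Gamma^{\mathcal{M}}_{B'C}\,|\Gamma\rangle_{BB'}. && (4.2.20)
\end{aligned}$$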
Exercise 4.7
Let N 𝐴→𝐵 be a superoperator. Prove that
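$$\frac{1}{d_A}\left\|\Gamma^{\mathcal{N}}_{AB}\right\|_1 \le \|\mathcal{N}\|_\diamond \le \left\|\Gamma^{\mathcal{N}}_{AB}\right\|_1. \tag{4.2.21}$$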
(Hint: Start with Theorem 2.21. Then, for the right-most inequality, start with
the discussion around (2.2.38) and then use (2.2.94).)
$\operatorname{Tr}_B[\Gamma^{\mathcal{N}}_{AB}] = \mathbb{1}_A$.
3. Kraus: There exists a set $\{K_i\}_{i=1}^r$ of operators, called Kraus operators, such that
$$\mathcal{N}(X_A) = \sum_{i=1}^r K_i X_A K_i^\dagger \tag{4.3.1}$$
Please consult the Bibliographic Notes in Section 3.4 for references containing
a proof of this theorem.
Remark: Theorem 4.3 holds for quantum channels, i.e., completely positive trace-preserving
maps. More generally, if N 𝐴→𝐵 is completely positive and trace non-increasing, then the
trace-preserving condition for the Choi, Kraus, and Stinespring representations of N 𝐴→𝐵 changes
as follows.
• The isometric property of the operator 𝑉 in (4.3.2), which corresponds to the trace-
preserving property of every quantum channel, changes to 𝑉 †𝑉 ≤ 1 𝐴 when N 𝐴→𝐵 is trace
non-increasing.
$$K_i' = \sum_{j=1}^r V_{i,j} K_j \tag{4.3.3}$$
are also Kraus operators for N. Indeed, for every linear operator 𝑋, the following
equality holds
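$$\begin{aligned}
\sum_{i=1}^s K_i' X (K_i')^\dagger &= \sum_{i=1}^s \sum_{j,j'=1}^r V_{i,j}\overline{V_{i,j'}}\, K_j X K_{j'}^\dagger && (4.3.4)\\
&= \sum_{j,j'=1}^r \underbrace{\left(\sum_{i=1}^s (V^\dagger)_{j',i}V_{i,j}\right)}_{(V^\dagger V)_{j',j}=\delta_{j',j}} K_j X K_{j'}^\dagger && (4.3.5)\\
&= \sum_{j=1}^r K_j X K_j^\dagger && (4.3.6)\\
&= \mathcal{N}(X), && (4.3.7)
\end{aligned}$$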
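The calculation in (4.3.4)–(4.3.7) can be illustrated numerically. The sketch below assumes NumPy, uses the amplitude damping Kraus operators with an illustrative decay parameter, and draws a random $3\times 2$ isometry from a QR decomposition; these specific choices are assumptions made only for the example.

```python
import numpy as np

def apply_channel(kraus, X):
    """Apply N(X) = sum_i K_i X K_i^dagger."""
    return sum(K @ X @ K.conj().T for K in kraus)

rng = np.random.default_rng(1)
g = 0.3  # illustrative decay parameter
K = [np.sqrt(g) * np.array([[0, 1], [0, 0]], dtype=complex),
     np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)]
r, s = len(K), 3

# Random s x r isometry: the reduced QR factor has orthonormal columns.
G = rng.normal(size=(s, r)) + 1j * rng.normal(size=(s, r))
V, _ = np.linalg.qr(G)

# K'_i = sum_j V_{i,j} K_j, as in (4.3.3).
K_prime = [sum(V[i, j] * K[j] for j in range(r)) for i in range(s)]

X = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
same_action = np.allclose(apply_channel(K, X), apply_channel(K_prime, X))
```

Both Kraus sets should give the same output on every input, even though the second set has three operators instead of two.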
Exercise 4.8
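Show that the Choi representation of a quantum channel $\mathcal{N}_{A\to B}$ can be expressed using a set $\{K_i\}_{i=1}^r$ of its Kraus operators, with $r \ge \operatorname{rank}(\Gamma^{\mathcal{N}}_{AB})$, as
$$\Gamma^{\mathcal{N}}_{AB} = \sum_{i=1}^r \operatorname{vec}(K_i)\operatorname{vec}(K_i)^\dagger. \tag{4.3.8}$$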
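[Figure 4.3: circuit diagrams showing the output $\mathcal{N}(\rho_A)$ obtained from the input $\rho_A$ via an isometric extension $V_{A\to BE}$, and via a unitary $U_{AE'\to BE''}$ with the environment input prepared in the state $|0\rangle_{E'}$.]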
anything external to the quantum system that is not under our control. Stinespring’s
theorem then tells us that the evolution of every quantum system can be thought of
as first an interaction of the system with its environment, followed by discarding
the environment; see Figure 4.3. Given a set $\{K_i\}_{i=1}^r$ of Kraus operators for $\mathcal{N}$, we can let the environment $E$ correspond to a space of dimension $r$ and define the isometry $V_{A\to BE}$ as
$$V_{A\to BE} = \sum_{j=1}^r K_j \otimes |j-1\rangle_E. \tag{4.3.9}$$
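The construction in (4.3.9) can be sketched numerically as follows. The example assumes NumPy, orders the output as $B \otimes E$, and uses amplitude damping Kraus operators and an input state chosen only for illustration.

```python
import numpy as np

g = 0.3  # illustrative decay parameter
K = [np.sqrt(g) * np.array([[0, 1], [0, 0]], dtype=complex),
     np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)]
r = len(K)
d_B, d_A = K[0].shape

# V = sum_j K_j (x) |j-1>_E; Python indices already start at 0.
V = sum(np.kron(K[j], np.eye(r)[:, [j]]) for j in range(r))

rho = np.array([[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]])

# Tr_E[V rho V^dagger]: reshape to indices (b, e, b', e') and sum over e = e'.
joint = (V @ rho @ V.conj().T).reshape(d_B, r, d_B, r)
via_isometry = np.einsum('aebe->ab', joint)
via_kraus = sum(Kj @ rho @ Kj.conj().T for Kj in K)
```

The operator `V` should satisfy $V^\dagger V = \mathbb{1}_A$, and discarding the environment should reproduce the Kraus form of the channel.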
Exercise 4.9
Show that the Choi representation of a quantum channel N 𝐴→𝐵 can be expressed
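using an isometric extension $V_{A\to BE}$ of $\mathcal{N}$, with $d_E \ge \operatorname{rank}(\Gamma^{\mathcal{N}}_{AB})$, as
$$\Gamma^{\mathcal{N}}_{AB} = \operatorname{Tr}_E[\operatorname{vec}(V)\operatorname{vec}(V)^\dagger]. \tag{4.3.10}$$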
Exercise 4.10
Let N 𝐴→𝐵 be a quantum channel with the following Kraus and Stinespring
representations:
$$\mathcal{N}(X) = \sum_{i=1}^r K_i X K_i^\dagger = \operatorname{Tr}_E[V X V^\dagger]. \tag{4.3.11}$$
1. Verify using (2.2.182) that the adjoint map N† can be represented in the
following two ways:
$$\mathcal{N}^\dagger(Y) = \sum_{i=1}^r K_i^\dagger Y K_i = V^\dagger (Y \otimes \mathbb{1}_E) V. \tag{4.3.12}$$
Exercise 4.11
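1. Let $\mathcal{N}_{A\to B}$ be a positive trace-preserving map. Prove that the set $\{\mathcal{N}^\dagger(|i\rangle\langle i|)\}_{i=0}^{d_B-1}$ is a POVM. More generally, prove that the set $\{\mathcal{N}^\dagger(E_j \rho E_j^\dagger)\}_{j=1}^{d_B^2}$ is a POVM for every orthonormal basis $\{E_j\}_{j=1}^{d_B^2}$ for $\mathrm{L}(\mathcal{H}_B)$ and every quantum state $\rho \in \mathrm{D}(\mathcal{H}_B)$.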
2. Conversely, let {𝑀𝑥 }𝑥∈X be a POVM, where X is a finite set. Prove that
there exists a quantum channel N such that 𝑀𝑥 = N† (|𝑥⟩⟨𝑥|) for all 𝑥 ∈ X,
where {|𝑥⟩}𝑥∈X is an orthonormal set. (Hint: Recall Naimark’s Theorem
(Theorem 3.22).)
Proposition 4.4
Let 𝜌 𝐴 be a quantum state with purification 𝜓 𝑅 𝐴 . For every extension 𝜔 𝑅′ 𝐴
of 𝜌 𝐴 , there exists a quantum channel N 𝑅→𝑅′ such that
N 𝑅→𝑅′ (𝜓 𝑅 𝐴 ) = 𝜔 𝑅′ 𝐴 . (4.3.13)
as required. ■
Proposition 4.4 tells us that an extension of a quantum state can be “reached” via
a quantum channel acting on a purification of the state. In this sense, a purification
can be viewed as the “strongest” extension of a state.
The complementary channel for N 𝐴→𝐵 associated with the isometric extension
𝑉𝐴→𝐵𝐸 is denoted by N𝑐𝐴→𝐸 and is defined as
N𝑐𝐴→𝐸 (𝑋 𝐴 ) B Tr 𝐵 [𝑉 𝑋𝑉 † ]. (4.3.19)
Related to the above, the channel M𝑐𝐴→𝐸 is a complementary channel for the
channel M 𝐴→𝐵 if there exists an isometric channel W 𝐴→𝐵𝐸 such that
Given a channel N 𝐴→𝐵 , it does not have a unique complementary channel, just
as it does not have a unique Kraus representation, nor does a given quantum state 𝜌 𝐴
have a unique purification. Similar to the latter scenarios, however, it is possible
to show that all complementary channels for N 𝐴→𝐵 are related by an isometric
channel acting on their output. That is, let us suppose that (N1 ) 𝑐𝐴→𝐸 and (N2 ) 𝑐𝐴→𝐸 ′
are complementary channels for N 𝐴→𝐵 . Then there exists an isometric channel
S𝐸→𝐸 ′ such that
(N2 ) 𝑐𝐴→𝐸 ′ = S𝐸→𝐸 ′ ◦ (N1 ) 𝑐𝐴→𝐸 . (4.3.22)
Exercise 4.12
Let $\mathcal{N}_{A\to B}$ be a quantum channel with a set $\{K_i\}_{i=1}^r$ of Kraus operators, where $r \ge \operatorname{rank}(\Gamma^{\mathcal{N}}_{AB})$.
$$\Gamma^{\mathcal{N}^c}_{AE} = (W W^\dagger)^{\mathsf{T}}. \tag{4.3.24}$$
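[Figure: diagrams of a degradable channel, in which a channel $\mathcal{D}$ acting on the output $B$ of $\mathcal{N}$ simulates $\mathcal{N}^c$, and of an anti-degradable channel, in which a channel $\mathcal{A}$ acting on the output $E$ of $\mathcal{N}^c$ simulates $\mathcal{N}$.]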
the other hand, an anti-degradable channel is such that N can be simulated (via A)
using the output of N𝑐 , which means that Eve can simulate Bob’s received signal.
Exercise 4.13
Prove that a quantum channel $\mathcal{N}_{A\to B}$ is anti-degradable if and only if its Choi state $\Phi^{\mathcal{N}}_{AB}$ is two-extendible, meaning that there exists a state $\sigma_{ABB'}$, with $d_{B'} = d_B$, such that $\operatorname{Tr}_{B'}[\sigma_{ABB'}] = \operatorname{Tr}_B[\sigma_{ABB'}] = \Phi^{\mathcal{N}}_{AB}$. (Hint: Use Proposition 4.4.)
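$$U = \begin{pmatrix} V & \mathbb{1} - VV^\dagger & 0_{d_Bd_E\times d'}\\ 0_{d_A\times d_A} & V^\dagger & 0_{d_A\times d'}\\ 0_{d'\times d_A} & 0_{d'\times d_Bd_E} & \mathbb{1}_{d'} \end{pmatrix}, \tag{4.3.27}$$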
𝑑 𝐵 𝑑 𝐸 + 𝑑 𝐴 + 𝑑 ′ = 𝑑 𝐴 𝑑 𝐵 (𝑑 𝐵 + 1), (4.3.29)
and we conclude that 𝑈 is a (𝑑 𝐴 𝑑 𝐵 (𝑑 𝐵 + 1)) × (𝑑 𝐴 𝑑 𝐵 (𝑑 𝐵 + 1)) matrix. It is also
indeed a unitary because
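$$\begin{aligned}
UU^\dagger &= \begin{pmatrix} V & \mathbb{1}-VV^\dagger & 0\\ 0 & V^\dagger & 0\\ 0 & 0 & \mathbb{1} \end{pmatrix} \begin{pmatrix} V^\dagger & 0 & 0\\ \mathbb{1}-VV^\dagger & V & 0\\ 0 & 0 & \mathbb{1} \end{pmatrix} && (4.3.30)\\
&= \begin{pmatrix} VV^\dagger + (\mathbb{1}-VV^\dagger)(\mathbb{1}-VV^\dagger) & (\mathbb{1}-VV^\dagger)V & 0\\ V^\dagger(\mathbb{1}-VV^\dagger) & V^\dagger V & 0\\ 0 & 0 & \mathbb{1} \end{pmatrix} && (4.3.31)\\
&= \begin{pmatrix} \mathbb{1} & 0 & 0\\ 0 & \mathbb{1} & 0\\ 0 & 0 & \mathbb{1} \end{pmatrix}. && (4.3.32)
\end{aligned}$$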
Similarly, it can be shown that 𝑈 †𝑈 = 1. By defining the system 𝐸 ′ with dimension
𝑑 𝐸 ′ = 𝑑 𝐵 (𝑑 𝐵 + 1), we can think of 𝑈 as acting on the input tensor-product space
H𝐸 ′ ⊗ H 𝐴 . Then, we can embed the state 𝜌 𝐴 into this larger space as
$$|0\rangle\langle 0|_{E'} \otimes \rho_A = \begin{pmatrix} \rho_A & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}, \tag{4.3.33}$$
so that
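$$U \begin{pmatrix} \rho_A & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix} U^\dagger = \begin{pmatrix} V\rho V^\dagger & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}. \tag{4.3.34}$$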
By defining the system 𝐸 ′′ with dimension 𝑑 𝐸 ′′ = 𝑑 𝐴 (𝑑 𝐵 + 1), we can think of the
output space of 𝑈 as the tensor-product space H𝐵 ⊗ H𝐸 ′′ , so that
$$\mathcal{N}(\rho_A) = \operatorname{Tr}_E[V\rho V^\dagger] = \operatorname{Tr}_{E''}[U(|0\rangle\langle 0|_{E'} \otimes \rho_A)U^\dagger]. \tag{4.3.35}$$
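The block construction of the unitary in (4.3.27) can be checked numerically. The sketch below assumes NumPy, a randomly generated isometry $V$ of size $d_Bd_E \times d_A$, and an arbitrary padding dimension standing in for $d'$; the dimensions are chosen only for illustration and are not the ones fixed by (4.3.29).

```python
import numpy as np

rng = np.random.default_rng(2)
d_A, d_BE, d_pad = 2, 4, 3  # d_pad stands in for d' (illustrative value)

# Random isometry V: C^{d_A} -> C^{d_BE} from a reduced QR decomposition.
G = rng.normal(size=(d_BE, d_A)) + 1j * rng.normal(size=(d_BE, d_A))
V, _ = np.linalg.qr(G)
P = np.eye(d_BE) - V @ V.conj().T  # projection onto the complement of range(V)

U = np.block([
    [V,                      P,                       np.zeros((d_BE, d_pad))],
    [np.zeros((d_A, d_A)),   V.conj().T,              np.zeros((d_A, d_pad))],
    [np.zeros((d_pad, d_A)), np.zeros((d_pad, d_BE)), np.eye(d_pad)],
])

n = d_BE + d_A + d_pad
rho = np.array([[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]])
embedded = np.zeros((n, n), dtype=complex)
embedded[:d_A, :d_A] = rho  # rho placed in the upper-left block, as in (4.3.33)

out = U @ embedded @ U.conj().T
```

The upper-left block of `out` should equal $V\rho V^\dagger$ and all other blocks should vanish, matching (4.3.34).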
We can thus complete the action of the unitary 𝑈 𝐴𝐸 ′ →𝐵𝐸 on the remaining vectors
as follows:
𝑈 𝐴𝐸 ′ →𝐵𝐸 |𝑘⟩ 𝐴 ⊗ |ℓ⟩𝐸 ′ = |𝜙 𝑘,ℓ ⟩𝐵𝐸 , (4.3.41)
for 0 ≤ 𝑘 ≤ 𝑑 𝐴 − 1 and 1 ≤ ℓ ≤ 𝑑 2𝐵 − 1. Thus, the full unitary 𝑈 𝐴𝐸 ′ →𝐵𝐸 is specified
as
2 −1
𝐴−1 𝑑∑︁
𝑑∑︁ 𝐵
The preparation of a quantum system in a given (fixed) state, as well as taking the
tensor product of a state with a given (fixed) state, can both be viewed as quantum
channels.
In other words, the appending channel P 𝜌 𝐴 ⊗ id𝐵 takes the tensor product of its
argument with 𝜌 𝐴 .
Exercise 4.14
Determine the Choi representation and a Stinespring representation of the
preparation channel P 𝜌 𝐴 corresponding to the quantum state 𝜌 𝐴 .
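$$\mathcal{R}^{\sigma_B}_{A\to B}(X_A) = \operatorname{Tr}[X_A]\,\sigma_B \tag{4.4.4}$$
for every linear operator $X_A$. When acting on one share of a bipartite state $\rho_{RA}$, the replacement channel $\mathcal{R}^{\sigma_B}_{A\to B}$ has the following action:
$$\mathcal{R}^{\sigma_B}_{A\to B}(\rho_{RA}) = \operatorname{Tr}_A[\rho_{RA}] \otimes \sigma_B = \rho_R \otimes \sigma_B. \tag{4.4.5}$$
$$\mathcal{R}^{\sigma_B}_{A\to B} = \mathcal{P}_{\sigma_B} \circ \operatorname{Tr}_A. \tag{4.4.6}$$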
Exercise 4.15
Given a quantum state $\sigma_B$, determine the Choi representation, as well as Kraus and Stinespring representations, of the replacement channel $\mathcal{R}^{\sigma_B}_{A\to B}$.
Recall the trace and partial trace of a linear operator from Definition 3.2. As a
map acting on linear operators, one can ask whether the partial trace is a channel.
The answer, perhaps not surprisingly, is “yes.” In fact, observe that the definition
in (3.2.17) of the partial trace Tr 𝐵 over 𝐵 is already in Kraus form, with Kraus
operators 𝐾 𝑗 = 1 𝐴 ⊗ ⟨ 𝑗 | 𝐵 . This means that Tr 𝐵 is completely positive. It is also
trace preserving because
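$$\sum_{j=0}^{d_B-1} K_j^\dagger K_j = \sum_{j=0}^{d_B-1} (\mathbb{1}_A \otimes |j\rangle_B)(\mathbb{1}_A \otimes \langle j|_B) = \mathbb{1}_A \otimes \sum_{j=0}^{d_B-1} |j\rangle\langle j|_B = \mathbb{1}_{AB}, \tag{4.4.7}$$
where we used the fact that $\sum_{j=0}^{d_B-1} |j\rangle\langle j|_B = \mathbb{1}_B$.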
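As a numerical illustration of the Kraus form of the partial trace, the following sketch (assuming NumPy and small, arbitrarily chosen dimensions) builds the operators $K_j = \mathbb{1}_A \otimes \langle j|_B$ and compares them against a direct index contraction.

```python
import numpy as np

d_A, d_B = 2, 3  # illustrative dimensions

# K_j = 1_A (x) <j|_B is a d_A x (d_A d_B) matrix.
kraus = [np.kron(np.eye(d_A), np.eye(d_B)[[j], :]) for j in range(d_B)]

rng = np.random.default_rng(3)
X = rng.normal(size=(d_A * d_B, d_A * d_B)) + 1j * rng.normal(size=(d_A * d_B, d_A * d_B))

via_kraus = sum(K @ X @ K.conj().T for K in kraus)
# Direct partial trace: indices (a, b, a', b'), summed over b = b'.
via_einsum = np.einsum('ajbj->ab', X.reshape(d_A, d_B, d_A, d_B))
completeness = sum(K.conj().T @ K for K in kraus)
```

The two evaluations should coincide, and `completeness` should equal $\mathbb{1}_{AB}$, in line with (4.4.7).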
Exercise 4.16
1. Determine the Choi representation, as well as a Stinespring representation,
of the partial trace channel Tr 𝐵 .
2. Prove that the adjoint of the partial trace channel Tr 𝐵 is
Tr†𝐵 (𝑋 𝐴 ) = 𝑋 𝐴 ⊗ 1𝐵 . (4.4.8)
Unlike the trace and partial trace, the transpose and partial transpose are trace-preserving maps but not completely positive. Indeed, for the latter, recall from (3.2.83) that $\mathrm{T}_B(\Phi_{AB}) = \frac{1}{d}F_{AB}$, so that its Choi representation is $\Gamma^{\mathrm{T}_B}_{AB} = F_{AB}$, which we know has negative eigenvalues, as shown in (3.2.127). So by Theorem 4.3, the transpose map $\mathrm{T}_B$ is not completely positive.
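The failure of complete positivity can be seen directly in a small numerical example. The sketch below assumes NumPy and $d = 3$ (an arbitrary choice); it builds the Choi representation of the transpose map and inspects its spectrum.

```python
import numpy as np

d = 3  # illustrative dimension

# Choi representation of the transpose map: sum_{i,j} |i><j| (x) T(|i><j|).
choi_T = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        e_ij = np.zeros((d, d))
        e_ij[i, j] = 1.0
        choi_T += np.kron(e_ij, e_ij.T)

eigs = np.linalg.eigvalsh(choi_T)

# The same operator acts as the swap: F (u (x) v) = v (x) u.
u, v = np.array([1.0, 2.0, 3.0]), np.array([0.5, -1.0, 4.0])
swapped = choi_T @ np.kron(u, v)
```

The eigenvalues are $\pm 1$, so the Choi representation is not positive semi-definite, which is the numerical counterpart of the statement above.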
Two more simple examples of quantum channels are isometric and unitary channels.
An isometric channel conjugates the channel input by an isometry, and a unitary
channel conjugates the channel input by a unitary. Specifically, the isometric
channel V corresponding to an isometry 𝑉 is
V(𝑋) B 𝑉 𝑋𝑉 † . (4.4.9)
U(𝑋) B 𝑈 𝑋𝑈 † . (4.4.10)
Since every unitary is also an isometry, it follows that every unitary channel is an
isometric channel. Isometric channels are completely positive because they can
be described using only one Kraus operator, the isometry 𝑉. In fact, a quantum
channel is isometric if and only if it has a single Kraus operator.
Observe that by the unitarity of 𝑈, the map
U† (𝑌 ) B 𝑈 †𝑌𝑈, (4.4.11)
U† ◦ U = U ◦ U† = id, (4.4.12)
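$$\begin{aligned}
\operatorname{Tr}[\mathcal{R}_V(Y)] &= \operatorname{Tr}\!\left[\mathcal{V}^\dagger(Y) + \operatorname{Tr}[(\mathbb{1} - VV^\dagger)Y]\,\sigma\right] && (4.4.14)\\
&= \operatorname{Tr}[V^\dagger Y V] + \operatorname{Tr}[Y] - \operatorname{Tr}[VV^\dagger Y] && (4.4.15)\\
&= \operatorname{Tr}[Y]. && (4.4.16)
\end{aligned}$$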
Since it is also completely positive, being the sum of completely positive maps, the
map R𝑉 is indeed a quantum channel. Like U† , the reversal channel R𝑉 reverses
the action of V:
Exercise 4.17
Determine the Choi representation, as well as a Kraus and Stinespring represen-
tation, of the reversal channel R𝑉 corresponding to an isometry 𝑉.
States that are diagonal in a preferred basis, as in the equation above, are typically
called classical states. In addition to being in one-to-one correspondence with
classical probability distributions, classical states do not exhibit the quantum
properties of coherence and entanglement.
In Chapter 12, however, we are interested in classical communication over
quantum channels. We are then interested in so-called classical–quantum channels,
which we discuss in this section, that take a classical state as input and output a
quantum state.
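More generally, for every $|\mathcal{X}|$-dimensional system $A$ and every state $\rho_A$ that is not necessarily classical and is expressed in the computational basis as $\rho_A = \sum_{x,x'\in\mathcal{X}} \langle x|\rho_A|x'\rangle\, |x\rangle\langle x'|$, we find that
$$\mathcal{N}^{\mathrm{cq}}(\rho_A) = \sum_{x,x'\in\mathcal{X}} \langle x|\rho_A|x'\rangle\, \delta_{x,x'}\, \sigma_A^x = \sum_{x\in\mathcal{X}} \langle x|\rho_A|x\rangle\, \sigma_A^x. \tag{4.4.24}$$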
Therefore, for every quantum state, the classical–quantum channel Ncq takes
the input state, measures it in the computational basis {|𝑥⟩}𝑥∈X , and with the
corresponding outcome probability ⟨𝑥|𝜌 𝐴 |𝑥⟩, outputs the state 𝜎𝐴𝑥 .
Exercise 4.18
Show that the Choi state of a classical–quantum channel Ncq is
$$\Phi^{\mathcal{N}^{\mathrm{cq}}}_{XA} = \frac{1}{|\mathcal{X}|} \sum_{x\in\mathcal{X}} |x\rangle\langle x|_X \otimes \sigma_A^x. \tag{4.4.25}$$
and
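$$\begin{aligned}
\sum_{x\in\mathcal{X}} \sum_{j=1}^{r_x} (K_j^x)^\dagger K_j^x &= \sum_{x\in\mathcal{X}} \sum_{j=1}^{r_x} \lambda_j^x\, |x\rangle \underbrace{\langle\varphi_j^x|\varphi_j^x\rangle}_{=1} \langle x| && (4.4.31)\\
&= \sum_{x\in\mathcal{X}} \underbrace{\sum_{j=1}^{r_x} \lambda_j^x}_{=1\ \forall x}\, |x\rangle\langle x| && (4.4.32)\\
&= \mathbb{1}_X. && (4.4.33)
\end{aligned}$$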
Also, observe from the construction above that every classical–quantum channel
has a Kraus representation with unit-rank Kraus operators.
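The unit-rank Kraus construction for a classical–quantum channel can be sketched as follows. The example assumes NumPy, a two-letter alphabet, and two output states $\sigma^x$ picked only for illustration; the Kraus operators are taken to be $\sqrt{\lambda_j^x}\,|\varphi_j^x\rangle\langle x|$, built from the spectral decomposition of each $\sigma^x$.

```python
import numpy as np

# Hypothetical output states sigma^x for the alphabet {0, 1}.
sigma = [np.array([[1, 0], [0, 0]], dtype=complex),
         np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)]
n = len(sigma)

kraus = []
for x, s in enumerate(sigma):
    lam, phi = np.linalg.eigh(s)  # spectral decomposition of sigma^x
    for j in range(len(lam)):
        if lam[j] > 1e-12:
            bra_x = np.eye(n)[[x], :]  # row vector <x|
            kraus.append(np.sqrt(lam[j]) * phi[:, [j]] @ bra_x)

rho = np.array([[0.7, 0.3j], [-0.3j, 0.3]])  # input state with coherence
out = sum(K @ rho @ K.conj().T for K in kraus)
expected = sum(rho[x, x] * sigma[x] for x in range(n))
completeness = sum(K.conj().T @ K for K in kraus)
ranks = [np.linalg.matrix_rank(K) for K in kraus]
```

The output depends only on the diagonal entries $\langle x|\rho|x\rangle$ of the input, as in (4.4.24), and every Kraus operator has rank one.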
Having described classical–quantum channels, let us now describe channels for
which the situation is opposite, such that they accept quantum inputs and provide
classical outputs.
Exercise 4.19
Prove that the Choi state of a measurement channel M 𝐴→𝑋 , as defined in
(4.4.34), is
$$\Phi^{\mathcal{M}}_{AX} = \frac{1}{d_A} \sum_{x\in\mathcal{X}} (M_x^{\mathsf{T}})_A \otimes |x\rangle\langle x|_X. \tag{4.4.35}$$
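where
$$K_j^x := \sqrt{\mu_j^x}\, |x\rangle\langle\phi_j^x| \qquad \forall\, x \in \mathcal{X},\ 1 \le j \le r_x. \tag{4.4.42}$$
Since
$$\begin{aligned}
\sum_{x\in\mathcal{X}} \sum_{j=1}^{r_x} (K_j^x)^\dagger K_j^x &= \sum_{x\in\mathcal{X}} \sum_{j=1}^{r_x} \mu_j^x\, |\phi_j^x\rangle\langle x|x\rangle\langle\phi_j^x| && (4.4.43)\\
&= \sum_{x\in\mathcal{X}} \underbrace{\sum_{j=1}^{r_x} \mu_j^x\, |\phi_j^x\rangle\langle\phi_j^x|}_{M_x} && (4.4.44)\\
&= \sum_{x\in\mathcal{X}} M_x && (4.4.45)\\
&= \mathbb{1}_A, && (4.4.46)
\end{aligned}$$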
it holds that {𝐾 𝑥𝑗 : 𝑥 ∈ X, 1 ≤ 𝑗 ≤ 𝑟 𝑥 } is a set of Kraus operators for Nqc .
This means that all quantum–classical channels have a Kraus representation with
unit-rank Kraus operators.
Recall from Section 3.3 that, for an arbitrary POVM $\{M_x\}_{x\in\mathcal{X}}$, a set of post-measurement states corresponding to an initial state $\rho$ can be given by (3.3.14),
$$\rho^x = \frac{K_x \rho K_x^\dagger}{\operatorname{Tr}[K_x \rho K_x^\dagger]} \qquad \forall\, x \in \mathcal{X}, \tag{4.4.47}$$
where $\{K_x\}_{x\in\mathcal{X}}$ is a set of operators such that $M_x = K_x^\dagger K_x$ for all $x \in \mathcal{X}$. Also recall that the expected density operator of the measurement is given by (3.3.15),
$$\sum_{x\in\mathcal{X}} K_x \rho K_x^\dagger. \tag{4.4.48}$$
This expected state can be seen as arising from a quantum channel with Kraus operators $\{K_x\}_{x\in\mathcal{X}}$. Note that this map is indeed a channel since $\sum_{x\in\mathcal{X}} K_x^\dagger K_x = \sum_{x\in\mathcal{X}} M_x = \mathbb{1}$. We can view the channel as being the sum of the completely positive and trace-non-increasing maps $\mathcal{M}_x$ defined as
M𝑥 (𝜌) = 𝐾𝑥 𝜌𝐾𝑥† . (4.4.49)
That is, the output of a quantum instrument is a classical–quantum state such that
the classical register 𝑋 stores the outcome of the measurement. This is unlike
the expected state in (4.4.52), which represents a lack of knowledge of which
measurement outcome occurred.
Note that the channel in (4.4.53) corresponding to a quantum instrument reduces to the measurement channel defined in (4.4.34) if we consider a measurement with POVM $\{M_x\}_{x\in\mathcal{X}}$ and we define the maps $\mathcal{M}_x$ as $\mathcal{M}_x(\rho) = \operatorname{Tr}[M_x\rho]$ for all $x \in \mathcal{X}$. In this case, the channel in (4.4.53) becomes
$$\mathcal{M}(\rho) = \sum_{x\in\mathcal{X}} \operatorname{Tr}[M_x\rho]\, |x\rangle\langle x|_X, \tag{4.4.54}$$
which is precisely the measurement channel in (4.4.34).
Proposition 4.13
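A channel $\mathcal{N}$ is entanglement breaking if and only if its Choi state $\Phi^{\mathcal{N}}_{AB}$ is separable.
Proof: Observe that if $\mathcal{N}_{A\to B}$ is entanglement breaking, then its Choi state $\Phi^{\mathcal{N}}_{AB}$ is separable. On the other hand, if the Choi state $\Phi^{\mathcal{N}}_{AB}$ of a given channel $\mathcal{N}$ is separable, then it is of the form
$$\Phi^{\mathcal{N}}_{AB} = \sum_{x\in\mathcal{X}} p(x)\, \sigma_A^x \otimes \tau_B^x \tag{4.4.55}$$
for some probability distribution $p : \mathcal{X} \to [0, 1]$ on a finite alphabet $\mathcal{X}$ and sets $\{\sigma_A^x : x \in \mathcal{X}\}$ and $\{\tau_B^x : x \in \mathcal{X}\}$ of states. We note that the property $\operatorname{Tr}_B[\Phi^{\mathcal{N}}_{AB}] = \pi_A$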
Then, for every reference system 𝑅 and state 𝜉 𝑅 𝐴 acting on H 𝑅 𝐴 , we find, by using
(4.2.5), that
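where
$$\omega_R^x := \frac{\operatorname{Tr}_A[\mathrm{T}_A(\xi_{RA})(\mathbb{1}_R \otimes d_A\, p(x)\, \sigma_A^x)]}{q(x)}, \tag{4.4.59}$$
$$q(x) := p(x)\, d_A \operatorname{Tr}[\xi_A^{\mathsf{T}} \sigma_A^x]. \tag{4.4.60}$$
Now, the map $x \mapsto q(x)$ is a probability distribution on $\mathcal{X}$ since $q(x) \ge 0$ for all $x \in \mathcal{X}$ and
$$\begin{aligned}
\sum_{x\in\mathcal{X}} q(x) &= d_A \operatorname{Tr}\!\left[\xi_A^{\mathsf{T}} \left(\sum_{x\in\mathcal{X}} p(x)\, \sigma_A^x\right)\right] && (4.4.61)\\
&= d_A \operatorname{Tr}[\xi_A^{\mathsf{T}} \pi_A] && (4.4.62)\\
&= \operatorname{Tr}[\xi_A] && (4.4.63)\\
&= 1, && (4.4.64)
\end{aligned}$$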
Proposition 4.14
A channel N is entanglement breaking if and only if there exists a set of Kraus
operators for N, with each Kraus operator having unit rank.
Proof: First suppose that the Kraus operators of $\mathcal{N}$ have unit rank. They are therefore of the form $|\phi_j\rangle_B\langle\psi_j|_A =: K_j$ for $1 \le j \le r$. Without loss of generality, we can let each vector in the set $\{|\phi_j\rangle\}_{j=1}^r$ be normalized. Then, since $\mathcal{N}$ is trace preserving, it holds that
$$\mathbb{1}_A = \sum_{j=1}^r K_j^\dagger K_j = \sum_{j=1}^r |\psi_j\rangle_A \langle\phi_j|\phi_j\rangle \langle\psi_j|_A = \sum_{j=1}^r |\psi_j\rangle\langle\psi_j|_A. \tag{4.4.65}$$
Now, for every reference system 𝑅 of arbitrary dimension and every state 𝜌 𝑅 𝐴 ,
we find that
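$$\begin{aligned}
(\mathrm{id}_R \otimes \mathcal{N})(\rho_{RA}) &= \sum_{j=1}^r (\mathbb{1}_R \otimes K_j)\, \rho_{RA}\, (\mathbb{1}_R \otimes K_j^\dagger) && (4.4.66)\\
&= \sum_{j=1}^r (\mathbb{1}_R \otimes |\phi_j\rangle_B\langle\psi_j|_A)\, \rho_{RA}\, (\mathbb{1}_R \otimes |\psi_j\rangle_A\langle\phi_j|_B) && (4.4.67)\\
&= \sum_{j=1}^r (\mathbb{1}_R \otimes \langle\psi_j|_A)(\rho_{RA})(\mathbb{1}_R \otimes |\psi_j\rangle_A) \otimes |\phi_j\rangle\langle\phi_j|_B && (4.4.68)\\
&= \sum_{j=1}^r p(j)\, \sigma_R^j \otimes |\phi_j\rangle\langle\phi_j|_B, && (4.4.69)
\end{aligned}$$
where
$$\sigma_R^j := \frac{(\mathbb{1}_R \otimes \langle\psi_j|_A)(\rho_{RA})(\mathbb{1}_R \otimes |\psi_j\rangle_A)}{p(j)}, \qquad p(j) := \langle\psi_j|_A\, \rho_A\, |\psi_j\rangle_A. \tag{4.4.70}$$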
Note that 𝑗 ↦→ 𝑝( 𝑗) is a probability distribution since 𝑝( 𝑗) ≥ 0 for all 𝑗, and
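$$\begin{aligned}
\sum_{j=1}^r p(j) &= \sum_{j=1}^r \langle\psi_j|_A\, \rho_A\, |\psi_j\rangle_A && (4.4.71)\\
&= \sum_{j=1}^r \operatorname{Tr}[|\psi_j\rangle\langle\psi_j|\, \rho_A] && (4.4.72)\\
&= \operatorname{Tr}\!\left[\left(\sum_{j=1}^r |\psi_j\rangle\langle\psi_j|_A\right) \rho_A\right] && (4.4.73)\\
&= \operatorname{Tr}[\rho_A] && (4.4.74)\\
&= 1, && (4.4.75)
\end{aligned}$$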
This holds for all 0 ≤ 𝑖, 𝑖′ ≤ 𝑑 𝐴 − 1, which means that {𝐾𝑥 }𝑥∈X is a set of Kraus
operators for N, each of which has unit rank. ■
where X is a finite alphabet, {𝜎 𝑥 }𝑥∈X is a set of quantum states, and {𝑀𝑥 }𝑥∈X is a
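POVM. Indeed, if each POVM element $M_x$ has a spectral decomposition of the form
$$M_x = \sum_{k=1}^{r_x} \lambda_k^x\, |\psi_k^x\rangle\langle\psi_k^x|, \tag{4.4.83}$$
where $r_x = \operatorname{rank}(M_x)$, and each state $\sigma^x$ has a spectral decomposition of the form
$$\sigma^x = \sum_{\ell=1}^{s_x} \alpha_\ell^x\, |\phi_\ell^x\rangle\langle\phi_\ell^x|, \tag{4.4.84}$$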
Theorem 4.15
For every entanglement-breaking channel N, there exists a finite alphabet X, a
set {𝜎 𝑥 }𝑥∈X of states, and a POVM {𝑀𝑥 }𝑥∈X such that the action of N can be
written as
$$\mathcal{N}(\rho) = \sum_{x\in\mathcal{X}} \operatorname{Tr}[M_x\rho]\, \sigma^x \tag{4.4.88}$$
for every state 𝜌.
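where $r = \operatorname{rank}(\Gamma^{\mathcal{N}})$ and $\{K_j\}_{j=1}^r$ is a set of Kraus operators for $\mathcal{N}$, with each Kraus operator having unit rank. Since each Kraus operator has unit rank, it holds that $K_j = |\phi_j\rangle\langle\psi_j|$ for all $1 \le j \le r$, where $\{|\phi_j\rangle\}_{j=1}^r$ and $\{|\psi_j\rangle\}_{j=1}^r$ are sets of vectors (without loss of generality, we can take $\{|\phi_j\rangle\}_{j=1}^r$ to be a set of pure states). Since $\mathcal{N}$ is trace preserving, it holds that
$$\sum_{j=1}^r K_j^\dagger K_j = \sum_{j=1}^r |\psi_j\rangle\langle\phi_j|\phi_j\rangle\langle\psi_j| = \sum_{j=1}^r |\psi_j\rangle\langle\psi_j| = \mathbb{1}. \tag{4.4.90}$$
This implies that $\{|\psi_j\rangle\langle\psi_j|\}_{j=1}^r$ is a POVM. Therefore, defining the alphabet $\mathcal{X} = \{1, 2, \ldots, r\}$, the POVM elements $M_x := |\psi_x\rangle\langle\psi_x|$ and states $\sigma^x := |\phi_x\rangle\langle\phi_x|$, we have that
$$\mathcal{N}(\rho) = \sum_{x\in\mathcal{X}} \operatorname{Tr}[M_x\rho]\, \sigma^x, \tag{4.4.91}$$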
as required. ■
where the second equality follows because $\sum_{x\in\mathcal{X}} M_x = \mathbb{1}$, and the last equality from the definition of the replacement channel for $\sigma$ in Definition 4.8. This means that
every replacement channel is a measure-and-prepare channel, and in particular an
entanglement-breaking channel.
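A measure-and-prepare channel and the separability of its Choi representation can be illustrated numerically. The sketch below assumes NumPy and a hypothetical four-outcome qubit POVM with hypothetical preparation states; it uses the positive-partial-transpose test, which for two qubits is equivalent to separability, as a numerical stand-in for the separability claim.

```python
import numpy as np

def proj(v):
    """Rank-one projector onto the normalized vector v."""
    v = np.asarray(v, dtype=complex).reshape(-1, 1)
    return v @ v.conj().T / np.vdot(v, v)

# Hypothetical POVM {M_x} and preparation states {sigma^x}.
povm = [0.5 * proj([1, 0]), 0.5 * proj([0, 1]),
        0.5 * proj([1, 1]), 0.5 * proj([1, -1])]
states = [proj([1, 0]), proj([0, 1]), proj([1, 1]), np.eye(2) / 2]

def channel(X):
    """N(X) = sum_x Tr[M_x X] sigma^x."""
    return sum(np.trace(Mx @ X) * sx for Mx, sx in zip(povm, states))

d = 2
choi = np.zeros((d * d, d * d), dtype=complex)
for i in range(d):
    for j in range(d):
        e_ij = np.zeros((d, d), dtype=complex)
        e_ij[i, j] = 1.0
        choi += np.kron(e_ij, channel(e_ij))

# Partial transpose on B: swap the indices b and b' in (a, b, a', b').
choi_pt = choi.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)

# Replacement channel as a special case: every outcome prepares the same state.
rho = np.array([[0.7, 0.3j], [-0.3j, 0.3]])
sigma = proj([1, 1j])
replaced = sum(np.trace(Mx @ rho) * sigma for Mx in povm)
```

Both `choi` and `choi_pt` should be positive semi-definite, and `replaced` should equal `sigma` for every unit-trace input.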
The development above Proposition 4.14 tells us that to every entanglement
breaking channel there is associated a separable state, namely, the Choi state. The
converse statement also holds; see Exercise 4.20.
Exercise 4.20
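Given a separable state $\rho_{AB} = \sum_{x\in\mathcal{X}} p(x)\, \omega_A^x \otimes \tau_B^x$, show that the channel $\mathcal{N}^{\rho}_{A\to B}$ defined in (4.2.16) has the form
$$\mathcal{N}^{\rho}_{A\to B}(X_A) = \sum_{x\in\mathcal{X}} \operatorname{Tr}[X_A M_A^x]\, \tau_B^x, \tag{4.4.93}$$
where
$$M_A^x = p(x) \left(\rho_A^{-\frac{1}{2}}\, \omega_A^x\, \rho_A^{-\frac{1}{2}}\right)^{\mathsf{T}} \tag{4.4.94}$$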
for all 𝑥 ∈ X. In other words, by the discussion after (4.4.82), every separable
state can be associated with an entanglement-breaking channel.
N(𝑋) = 𝑁 ∗ 𝑉 𝑋𝑉 † (4.4.95)
Note that the Hadamard product can be defined as the element-wise product with
respect to an arbitrary orthonormal basis; however, in this book, we only consider
the Hadamard product with respect to the standard basis.
The positive semi-definiteness of 𝑁 in the definition of a Hadamard channel is
necessary and sufficient for the map N defined in (4.4.95) to be completely positive,
while the fact that 𝑁 has unit diagonal elements in the standard basis and that 𝑉 is
an isometry ensures that N is trace preserving.
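These two conditions can be checked on a small example. The sketch below assumes NumPy, a randomly generated isometry $V$ from a qubit into a three-dimensional space, and a matrix $N$ built as the Gram matrix of random unit vectors (so that it is positive semi-definite with unit diagonal); all of these choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
d_in, d = 2, 3

# Gram matrix N[i, j] = <psi_i|psi_j> of unit vectors: PSD with unit diagonal.
vecs = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
N = vecs.conj() @ vecs.T

# Random isometry V: C^{d_in} -> C^d.
G = rng.normal(size=(d, d_in)) + 1j * rng.normal(size=(d, d_in))
V, _ = np.linalg.qr(G)

def hadamard_channel(X):
    """N(X) = N * (V X V^dagger), with * the entrywise product in the standard basis."""
    return N * (V @ X @ V.conj().T)

choi = np.zeros((d_in * d, d_in * d), dtype=complex)
for i in range(d_in):
    for j in range(d_in):
        e_ij = np.zeros((d_in, d_in), dtype=complex)
        e_ij[i, j] = 1.0
        choi += np.kron(e_ij, hadamard_channel(e_ij))

X = rng.normal(size=(d_in, d_in)) + 1j * rng.normal(size=(d_in, d_in))
trace_in, trace_out = np.trace(X), np.trace(hadamard_channel(X))
```

In NumPy, `*` between arrays is the entrywise (Hadamard) product, which is why it appears directly in `hadamard_channel`. The Choi representation should be positive semi-definite and the trace should be preserved.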
Exercise 4.21
1. Show that a dephasing channel, as defined in (4.5.35), is a Hadamard
channel.
2. Let {𝐾 𝑗 }𝑟𝑗=1 be a set of Kraus operators. Prove that the channel, defined by
the set {𝐾 𝑗 ⊗ | 𝑗⟩}𝑟𝑗=1 of Kraus operators, is a Hadamard channel.
Proposition 4.17
Any complement of a Hadamard channel is entanglement breaking.
Therefore, using (4.4.97) and Definition 4.16, the action of N can be written as
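$$\mathcal{N}(X) = \sum_{i,j=0}^{d-1} \langle\psi_i|\psi_j\rangle\, \langle i|VXV^\dagger|j\rangle\, |i\rangle\langle j|_B. \tag{4.4.99}$$
Now, set
$$|\phi_i\rangle := V^\dagger|i\rangle \quad\Rightarrow\quad \langle\phi_i| = \langle i|V, \tag{4.4.100}$$
so that
$$\mathcal{N}(X) = \sum_{i,j=0}^{d-1} \langle\psi_i|\psi_j\rangle\, \langle\phi_i|X|\phi_j\rangle\, |i\rangle\langle j|_B. \tag{4.4.101}$$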
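$$\begin{aligned}
\mathcal{N}^c(X) &= \operatorname{Tr}_B[U^{\mathcal{N}} X (U^{\mathcal{N}})^\dagger] && (4.4.105)\\
&= \sum_{i=0}^{d-1} \langle\phi_i|X|\phi_i\rangle\, |\psi_i\rangle\langle\psi_i|_E && (4.4.106)\\
&= \sum_{i=0}^{d-1} |\psi_i\rangle\langle\phi_i|\, X\, |\phi_i\rangle\langle\psi_i| && (4.4.107)\\
&= \sum_{i=0}^{d-1} K_i X K_i^\dagger, && (4.4.108)
\end{aligned}$$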
where 𝐾𝑖 B |𝜓𝑖 ⟩⟨𝜙𝑖 |. So N𝑐 has a Kraus representation with unit-rank Kraus
operators, which means, by Proposition 4.14, that N𝑐 is entanglement breaking.
Every complement of a channel is related to another complement by an isometric
channel acting on the output of the complement, and this does not change the
entanglement-breaking property. ■
In Section 3.2.7, we defined states that are invariant under the action of a unitary
representation of a group. We now define an analogous notion of invariance for
quantum channels.
for all 𝑔 ∈ 𝐺.
Exercise 4.22
Let N 𝐴→𝐵 be a group-covariant channel, as per Definition 4.18.
1. Show that the condition in (4.4.109) can be written more compactly as
follows:
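$$\mathcal{N}_{A\to B} = \mathcal{V}_B^{g\dagger} \circ \mathcal{N}_{A\to B} \circ \mathcal{U}_A^g \tag{4.4.110}$$
for all $g \in G$, where $\mathcal{V}_B^{g\dagger}(\cdot) := V_B^{g\dagger}(\cdot)V_B^g$ and $\mathcal{U}_A^g(\cdot) := U_A^g(\cdot)U_A^{g\dagger}$.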
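2. Show that the Choi representation of $\mathcal{N}_{A\to B}$ is invariant under the action of $U_A^{g\mathsf{T}} \otimes V_B^{g\dagger}$ for all $g \in G$; i.e., show that
$$\Gamma^{\mathcal{N}}_{AB} = (U_A^{g\mathsf{T}} \otimes V_B^{g\dagger})\, \Gamma^{\mathcal{N}}_{AB}\, (U_A^{g\mathsf{T}} \otimes V_B^{g\dagger})^\dagger \tag{4.4.111}$$
for all $g \in G$.
3. For every set $\{K_i\}_{i=1}^r$ of Kraus operators for $\mathcal{N}$, with $r \ge \operatorname{rank}(\Gamma^{\mathcal{N}}_{AB})$, show that $\{K_i^g\}_{i=1}^r$, with $K_i^g := V_B^{g\dagger} K_i U_A^g$, is another set of Kraus operators for $\mathcal{N}$ for all $g \in G$.
4. For every isometric extension $W_{A\to BE}$ of $\mathcal{N}$, with $d_E \ge \operatorname{rank}(\Gamma^{\mathcal{N}}_{AB})$, show that $W^g_{A\to BE} := V_B^{g\dagger} W U_A^g$ is another isometric extension of $\mathcal{N}$ for all $g \in G$.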
Proof: Given is a group 𝐺 and a quantum channel N 𝐴→𝐵 that is covariant in the
following sense:
$$\mathcal{N}_{A\to B}(U_A^g \rho_A U_A^{g\dagger}) = V_B^g\, \mathcal{N}_{A\to B}(\rho_A)\, V_B^{g\dagger}, \tag{4.4.113}$$
for a set of unitaries $\{U_A^g\}_{g\in G}$ and $\{V_B^g\}_{g\in G}$.
Let a Kraus representation of $\mathcal{N}_{A\to B}$ be given as
$$\mathcal{N}_{A\to B}(\rho_A) = \sum_j L^j \rho_A L^{j\dagger}. \tag{4.4.114}$$
Thus, the channel has two different Kraus representations $\{L^j\}_j$ and $\{V_B^{g\dagger} L^j U_A^g\}_j$, and these are necessarily related by a unitary with matrix elements $w^g_{jk}$ (see the discussion after (4.3.7)):
$$V_B^{g\dagger} L^j U_A^g = \sum_k w^g_{jk} L^k. \tag{4.4.117}$$
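A canonical isometric extension $U^{\mathcal{N}}_{A\to BE}$ of $\mathcal{N}_{A\to B}$ is given as
$$U^{\mathcal{N}}_{A\to BE} = \sum_j L^j \otimes |j\rangle_E, \tag{4.4.118}$$
where $\{|j\rangle_E\}_j$ is an orthonormal basis. Defining $W_E^g$ as the following unitary
$$W_E^g |k\rangle_E = \sum_j w^g_{jk}\, |j\rangle_E, \tag{4.4.119}$$
where the states $|k\rangle_E$ are chosen from $\{|j\rangle_E\}_j$, consider that
$$\begin{aligned}
U^{\mathcal{N}}_{A\to BE}\, U_A^g &= \sum_j L^j U_A^g \otimes |j\rangle_E && (4.4.120)\\
&= \sum_j V_B^g V_B^{g\dagger} L^j U_A^g \otimes |j\rangle_E && (4.4.121)\\
&= \sum_j V_B^g \left[\sum_k w^g_{jk} L^k\right] \otimes |j\rangle_E && (4.4.122)\\
&= V_B^g \sum_k L^k \otimes \sum_j w^g_{jk}\, |j\rangle_E && (4.4.123)\\
&= V_B^g \sum_k L^k \otimes W_E^g |k\rangle_E && (4.4.124)\\
&= (V_B^g \otimes W_E^g)\, U^{\mathcal{N}}_{A\to BE}. && (4.4.125)
\end{aligned}$$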
Exercise 4.23
Given a quantum channel N 𝐴→𝐵 , prove that the twirled channel N𝐺𝐴→𝐵 , as
defined in (4.4.127), is group covariant.
Proposition 4.20
Let N 𝐴→𝐵 be a Hermiticity-preserving superoperator that is covariant with
respect to a group 𝐺, as defined in Definition 4.18.
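1. For every pure state $\psi_{A'A}$, with $d_{A'} = d_A$,
$$\|\mathcal{N}_{A\to B}(\psi_{A'A})\|_1 \le \|\mathcal{N}_{A\to B}(\phi^{\overline{\rho}}_{A'A})\|_1, \tag{4.4.128}$$
where $\rho_A := \psi_A = \operatorname{Tr}_{A'}[\psi_{A'A}]$, $\overline{\rho}_A := \mathcal{T}^G_A(\rho_A) = \frac{1}{|G|}\sum_{g\in G} U_A^g \rho_A U_A^{g\dagger}$, and $\phi^{\overline{\rho}}_{A'A}$ is a purification of $\overline{\rho}_A$.
2. The diamond norm of $\mathcal{N}$ is given by
$$\|\mathcal{N}\|_\diamond = \sup_{\psi_{A'A}} \left\{ \|(\mathrm{id}_{A'} \otimes \mathcal{N}_{A\to B})(\psi_{A'A})\|_1 : \psi_A = \mathcal{T}^G_A(\psi_A) \right\}, \tag{4.4.129}$$
3. If the representation $\{U_A^g\}_{g\in G}$ is such that $\mathcal{T}^G_A(\cdot) = \operatorname{Tr}[\cdot]\frac{\mathbb{1}_A}{d_A}$, then
$$\|\mathcal{N}\|_\diamond = \left\|\Phi^{\mathcal{N}}_{AB}\right\|_1. \tag{4.4.130}$$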
Proof:
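1. Let $\psi_{A'A}$ be an arbitrary pure state, $\rho_A = \psi_A$, $\overline{\rho}_A = \mathcal{T}^G_A(\rho_A)$, and let $\phi^{\overline{\rho}}_{A'A}$ be a purification of $\overline{\rho}_A$. Let us also consider the following purification of $\overline{\rho}_A$:
$$|\psi^{\overline{\rho}}\rangle_{RA'A} := \frac{1}{\sqrt{|G|}} \sum_{g\in G} |g\rangle_R \otimes U_A^g |\psi\rangle_{A'A}, \tag{4.4.131}$$
$$\begin{aligned}
\left\|\mathcal{N}_{A\to B}(\psi^{\overline{\rho}}_{RA'A})\right\|_1 &= \left\|W_{A'\to RA'}\, \mathcal{N}_{A\to B}(\phi^{\overline{\rho}}_{A'A})\, (W_{A'\to RA'})^\dagger\right\|_1 && (4.4.132)\\
&= \left\|\mathcal{N}_{A\to B}(\phi^{\overline{\rho}}_{A'A})\right\|_1, && (4.4.133)
\end{aligned}$$
where the last line follows from isometric invariance of the trace norm (see (2.2.93)).
Now, let us apply the quantum channel $X \mapsto \sum_{g\in G} |g\rangle\langle g| X |g\rangle\langle g|$ to the system $R$. By the data-processing inequality in (4.1.7), we find that
$$\begin{aligned}
\left\|\mathcal{N}_{A\to B}(\psi^{\overline{\rho}}_{RA'A})\right\|_1 &\ge \left\|\frac{1}{|G|} \sum_{g\in G} |g\rangle\langle g|_R \otimes (\mathcal{N}_{A\to B} \circ \mathcal{U}_A^g)(\psi_{A'A})\right\|_1 && (4.4.134)\\
&= \left\|\frac{1}{|G|} \sum_{g\in G} |g\rangle\langle g|_R \otimes (\mathcal{V}_B^{g\dagger} \circ \mathcal{N}_{A\to B} \circ \mathcal{U}_A^g)(\psi_{A'A})\right\|_1, && (4.4.135)
\end{aligned}$$
where the last line follows from applying the unitary channel defined by the unitary $\sum_{g\in G} |g\rangle\langle g|_R \otimes V_B^{g\dagger}$ and from unitary invariance of the trace norm.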
$$\begin{aligned}
&\ge \left\|\frac{1}{|G|} \sum_{g\in G} |g\rangle\langle g|_R \otimes \mathcal{N}_{A\to B}(\psi_{A'A})\right\|_1 && (4.4.137)\\
&= \|\mathcal{N}_{A\to B}(\psi_{A'A})\|_1, && (4.4.138)
\end{aligned}$$
where the last line follows from (2.2.96). The derived inequality is precisely
(4.4.128),
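2. Note that, by definition, every purification $\phi^{\overline{\rho}}_{A'A}$ of $\overline{\rho}_A$ is such that its reduced state on $A$ is invariant under the channel $\mathcal{T}^G_A$. Therefore, using (4.4.128), for every pure state $\psi_{A'A}$, we obtain
$$\begin{aligned}
\|\mathcal{N}_{A\to B}(\psi_{A'A})\|_1 &\le \left\|\mathcal{N}_{A\to B}(\phi^{\overline{\rho}}_{A'A})\right\|_1 && (4.4.139)\\
&\le \sup_{\phi_{A'A}} \left\{ \|\mathcal{N}_{A\to B}(\phi_{A'A})\|_1 : \phi_A = \mathcal{T}^G_A(\phi_A) \right\}. && (4.4.140)
\end{aligned}$$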
Since this inequality holds for every pure state 𝜓 𝐴′ 𝐴 , and because N is
Hermiticity preserving, we can use (2.2.190) to conclude that
The amplitude damping channel with decay parameter 𝛾 ∈ [0, 1] is the channel
A𝛾 given by A𝛾 (𝜌) = 𝐴1 𝜌 𝐴1† + 𝐴2 𝜌 𝐴2† , with the two Kraus operators 𝐴1 and 𝐴2
defined as
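$$A_1 = \sqrt{\gamma}\, |0\rangle\langle 1|, \qquad A_2 = |0\rangle\langle 0| + \sqrt{1-\gamma}\, |1\rangle\langle 1|. \tag{4.5.1}$$
It is straightforward to verify that $A_1^\dagger A_1 + A_2^\dagger A_2 = \mathbb{1}$, so that $\mathcal{A}_\gamma$ is indeed trace preserving.
Let $\rho$ be a state with a matrix representation in the standard basis $\{|0\rangle, |1\rangle\}$ as
$$\rho = \begin{pmatrix} 1-\lambda & \alpha\\ \overline{\alpha} & \lambda \end{pmatrix}. \tag{4.5.2}$$
In order for $\rho$ to be a state (positive semi-definite with unit trace), the conditions $0 \le \lambda \le 1$ and $\lambda(1-\lambda) \ge |\alpha|^2$ should hold, where $\alpha \in \mathbb{C}$. The output state $\mathcal{A}_\gamma(\rho)$ has the matrix representation
$$\mathcal{A}_\gamma(\rho) = \begin{pmatrix} 1-(1-\gamma)\lambda & \sqrt{1-\gamma}\,\alpha\\ \sqrt{1-\gamma}\,\overline{\alpha} & (1-\gamma)\lambda \end{pmatrix}. \tag{4.5.3}$$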
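The output matrix in (4.5.3) can be reproduced directly from the Kraus operators in (4.5.1). The sketch below assumes NumPy and illustrative values of $\gamma$, $\lambda$, and $\alpha$ satisfying $\lambda(1-\lambda) \ge |\alpha|^2$.

```python
import numpy as np

g, lam, alpha = 0.3, 0.4, 0.2 + 0.1j  # illustrative parameter values

A1 = np.sqrt(g) * np.array([[0, 1], [0, 0]], dtype=complex)
A2 = np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)

rho = np.array([[1 - lam, alpha], [np.conj(alpha), lam]])
out = A1 @ rho @ A1.conj().T + A2 @ rho @ A2.conj().T

expected = np.array([
    [1 - (1 - g) * lam,               np.sqrt(1 - g) * alpha],
    [np.sqrt(1 - g) * np.conj(alpha), (1 - g) * lam],
])
completeness = A1.conj().T @ A1 + A2.conj().T @ A2
```

The population of $|1\rangle$ is scaled by $1-\gamma$ and the coherences by $\sqrt{1-\gamma}$, while the total trace stays equal to one.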
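$$U^\eta = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & \sqrt{\eta} & \mathrm{i}\sqrt{1-\eta} & 0\\ 0 & \mathrm{i}\sqrt{1-\eta} & \sqrt{\eta} & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{4.5.5}$$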
is unitary. Note that the action of A𝛾 on the pure states |0⟩ and |1⟩ is, respectively,
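$$\begin{aligned}
\mathcal{A}_{1-\eta}(|0\rangle\langle 0|_A) &= |0\rangle\langle 0|_B,\\
\mathcal{A}_{1-\eta}(|1\rangle\langle 1|_A) &= (1-\eta)\, |0\rangle\langle 0|_B + \eta\, |1\rangle\langle 1|_B.
\end{aligned} \tag{4.5.6}$$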
Exercise 4.24
Prove that A𝑐𝛾 = A1−𝛾 for all 𝛾 ∈ [0, 1].
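Using the result of Exercise 4.24, we have that the action of the complementary channel $\mathcal{A}^c_{1-\eta}$ on these states is
$$\begin{aligned}
\mathcal{A}^c_{1-\eta}(|0\rangle\langle 0|_A) &= |0\rangle\langle 0|_E,\\
\mathcal{A}^c_{1-\eta}(|1\rangle\langle 1|_A) &= \eta\, |0\rangle\langle 0|_E + (1-\eta)\, |1\rangle\langle 1|_E.
\end{aligned} \tag{4.5.8}$$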
We see that whenever the state |0⟩⟨0| is input to the channel, the output systems
𝐵 and 𝐸 are both in the state |0⟩⟨0|. On the other hand, if the input state is |1⟩⟨1|,
then 𝐵 receives a mixed state: with probability 1 − 𝜂, the state is |0⟩⟨0|, and with
probability 𝜂, the state is |1⟩⟨1|. The situation for 𝐸 is reversed, receiving |0⟩⟨0|
with probability 𝜂 and |1⟩⟨1| with probability 1 − 𝜂. The unitary 𝑈 𝜂 can thus be
viewed as a qubit analogue of a beamsplitter, and the amplitude damping channel
A1−𝜂 can be viewed as a qubit analogue of the pure-loss bosonic channel; see
Figure 4.5.
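The beamsplitter picture can be checked numerically. The sketch below assumes NumPy, the matrix $U^\eta$ from (4.5.5) written in the basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$ with the system as the first tensor factor and the environment as the second, and an environment prepared in $|0\rangle\langle 0|$; the tensor ordering is an assumption of this example.

```python
import numpy as np

eta = 0.7  # illustrative transmissivity
s, c = np.sqrt(eta), 1j * np.sqrt(1 - eta)
U = np.array([[1, 0, 0, 0],
              [0, s, c, 0],
              [0, c, s, 0],
              [0, 0, 0, 1]], dtype=complex)
vac = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0| on the environment

def outputs(rho):
    """Return (Tr_E, Tr_B) of U (rho (x) |0><0|_E) U^dagger."""
    joint = (U @ np.kron(rho, vac) @ U.conj().T).reshape(2, 2, 2, 2)
    return np.einsum('aebe->ab', joint), np.einsum('bebf->ef', joint)

# Amplitude damping channel with decay parameter gamma = 1 - eta.
g = 1 - eta
A1 = np.sqrt(g) * np.array([[0, 1], [0, 0]], dtype=complex)
A2 = np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)

rho = np.array([[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]])
out_B, _ = outputs(rho)
expected_B = A1 @ rho @ A1.conj().T + A2 @ rho @ A2.conj().T

one = np.array([[0, 0], [0, 1]], dtype=complex)  # input |1><1|
one_B, one_E = outputs(one)
```

For the input $|1\rangle\langle 1|$, the receiver output `one_B` should be $\mathrm{diag}(1-\eta, \eta)$ and the environment output `one_E` should be $\mathrm{diag}(\eta, 1-\eta)$, matching (4.5.6) and (4.5.8).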
A beamsplitter is an optical device that takes two beams of light as input and
splits them into two separate output beams, with one of the output beams containing
a fraction 𝜂 of the intensity of the incoming beam and the other output beam
containing the remaining fraction 1 − 𝜂 of the incoming intensity. When one of the
input ports of the beamsplitter is empty, i.e., is in the vacuum state, the output to
the receiver is by definition the pure-loss bosonic channel. In the case of a single
incoming photon, the pure-loss channel either transmits the photon with probability
𝜂 (allowing it to go to the receiver) or reflects it with probability 1 − 𝜂 (sending it
to the environment).
To draw the correspondence between the qubit amplitude damping channel and
the pure-loss bosonic channel described above, we can think of the qubit state |0⟩
Figure 4.5: The amplitude damping channel A1−𝜂 can be interpreted, using
(4.5.4), as an interaction with a qubit analogue of a bosonic beamsplitter
unitary 𝑈 𝜂 , followed by discarding the output state of the environment. The
channel from the sender to the receiver is from the left to the right, while the
input and output environment systems are at the top and the bottom, respectively.
In the bosonic case, the state |0⟩⟨0| of the environmental input arm of the
beamsplitter corresponds to the vacuum state, which contains no photons, and
the channel to the receiver’s end is called the pure-loss bosonic channel.
as the vacuum state and the qubit state |1⟩ as the state of a single photon. It is
possible to show that the output states of the amplitude damping channel in (4.5.6)
and (4.5.8) for the receiver and environment, respectively, then exactly match the
action of the bosonic pure-loss channel on the subspace spanned by |0⟩ and |1⟩
(please consult the Bibliographic Notes in Section 3.4 for further references on this
connection). The amplitude damping channel can indeed, therefore, be viewed as
the qubit analogue of the bosonic pure-loss channel.
By replacing the initial state |0⟩⟨0| of the environment in (4.5.4) with the state
where we again use the relation 𝛾 = 1 − 𝜂. This channel has the following four
Kraus operators:
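$$A_1 = \sqrt{1-N_B} \begin{pmatrix} 1 & 0\\ 0 & \sqrt{1-\gamma} \end{pmatrix}, \qquad A_2 = \sqrt{1-N_B} \begin{pmatrix} 0 & \sqrt{\gamma}\\ 0 & 0 \end{pmatrix}, \tag{4.5.11}$$
$$A_3 = \sqrt{N_B} \begin{pmatrix} \sqrt{1-\gamma} & 0\\ 0 & 1 \end{pmatrix}, \qquad A_4 = \sqrt{N_B} \begin{pmatrix} 0 & 0\\ \sqrt{\gamma} & 0 \end{pmatrix}. \tag{4.5.12}$$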
Note that the amplitude damping channel is a special case of the generalized
amplitude damping channel in which the thermal noise 𝑁 𝐵 = 0, so that A1−𝜂 =
A1−𝜂,0 .
Exercise 4.25
1. Prove that A𝛾,𝑁 = (1 − 𝑁)A𝛾,0 + 𝑁A𝛾,1 for all 𝛾, 𝑁 ∈ [0, 1].
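2. Prove that $\mathcal{A}_{\gamma,N} = \mathcal{A}_{\gamma_2,N_2} \circ \mathcal{A}_{\gamma_1,N_1}$, where $\gamma = \gamma_1 + \gamma_2 - \gamma_1\gamma_2$ and $N = \frac{\gamma_1(1-\gamma_2)N_1 + \gamma_2 N_2}{\gamma_1 + \gamma_2 - \gamma_1\gamma_2}$.
3. Using 2., along with the result of Exercise 4.24, prove that the amplitude damping channel $\mathcal{A}_{\gamma,0}$ is degradable for all $\gamma \in \left[0, \frac{1}{2}\right]$, with degrading channel $\mathcal{A}_{\frac{1-2\gamma}{1-\gamma},0}$.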
The state 𝜃 𝑁 𝐵 in (4.5.10) is a qubit thermal state and can be regarded as a qubit
analogue of the bosonic thermal state. The latter is a state corresponding to a heat
bath with an average number of photons equal to 𝑁 𝐵 . The generalized amplitude
damping channel can then be seen as a qubit analogue of the bosonic thermal noise
channel.
Exercise 4.26
Recall the Pauli operators 𝑋, 𝑌 , 𝑍 from (3.2.6), and consider the generalized
amplitude-damping channel A𝛾,𝑁 , with 𝛾, 𝑁 ∈ [0, 1]. Show that
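$$\begin{aligned}
\mathcal{A}_{\gamma,N}(X) &= \sqrt{1-\gamma}\, X, && (4.5.13)\\
\mathcal{A}_{\gamma,N}(Y) &= \sqrt{1-\gamma}\, Y, && (4.5.14)\\
\mathcal{A}_{\gamma,N}(Z) &= (1-\gamma)\, Z, && (4.5.15)\\
\mathcal{A}_{\gamma,N}(\mathbb{1}) &= \mathbb{1} + \gamma(1-2N)\, Z. && (4.5.16)
\end{aligned}$$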
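Before attempting a proof, the identities above can be sanity-checked numerically. The sketch below assumes NumPy, the four Kraus operators of the generalized amplitude damping channel in (4.5.11) and (4.5.12), and illustrative parameter values; it is a numerical check, not a substitute for the requested derivation.

```python
import numpy as np

def gadc_kraus(g, N):
    """Kraus operators of the generalized amplitude damping channel A_{g,N}."""
    return [np.sqrt(1 - N) * np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),
            np.sqrt(1 - N) * np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex),
            np.sqrt(N) * np.array([[np.sqrt(1 - g), 0], [0, 1]], dtype=complex),
            np.sqrt(N) * np.array([[0, 0], [np.sqrt(g), 0]], dtype=complex)]

def apply_channel(kraus, M):
    return sum(K @ M @ K.conj().T for K in kraus)

g, N = 0.3, 0.2  # illustrative parameter values
kraus = gadc_kraus(g, N)

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

completeness = sum(K.conj().T @ K for K in kraus)
images = {name: apply_channel(kraus, P)
          for name, P in [("X", X), ("Y", Y), ("Z", Z), ("1", I2)]}
```

The entries of `images` can then be compared against the right-hand sides of (4.5.13) through (4.5.16); note that the channel is trace preserving but not unital unless $\gamma(1-2N) = 0$.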
Exercise 4.27
Verify that a set of Kraus operators for the erasure channel E 𝑝 is
$$\left\{ \sqrt{1-p}\,(|0\rangle\langle 0| + \cdots + |d-1\rangle\langle d-1|),\ \sqrt{p}\,|e\rangle\langle 0|,\ \ldots,\ \sqrt{p}\,|e\rangle\langle d-1| \right\}. \tag{4.5.19}$$
Also, show that the Choi representation of E 𝑝 is
Figure 4.6: The qubit erasure channel E1−𝜂 , for 𝜂 ∈ [0, 1], can be physically
realized by using a photonic dual-rail qubit system and passing each of the two
modes through a beamsplitter, modeled by the unitary 𝑈 𝜂 in (4.5.21), such that
the input from the environment is the vacuum state.
space spanned by {|0, 0⟩, |0, 1⟩, |1, 0⟩}, it is equivalent to the upper left 3 × 3 matrix
in (4.5.5). We can thus make use of this fact because the environment state for the bosonic pure-loss channel is prepared in the state $|0\rangle\langle 0|$. Let $U^\eta_{A_iE_i\to B_iE_i}$ then denote the beamsplitter unitary, for $i \in \{1, 2\}$, with the following action on the basis $\{|0,0\rangle, |0,1\rangle, |1,0\rangle\}$ of the input and output modes (in that order):
$$U^\eta_{A_iE_i\to B_iE_i} = \begin{pmatrix} 1 & 0 & 0\\ 0 & \sqrt{\eta} & \mathrm{i}\sqrt{1-\eta}\\ 0 & \mathrm{i}\sqrt{1-\eta} & \sqrt{\eta} \end{pmatrix}. \tag{4.5.21}$$
Letting 𝜌 𝐴1 𝐴2 be an arbitrary qubit state on the two modes 𝐴1 and 𝐴2 defined by
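$$\rho_{A_1,A_2} = (1-\lambda)\, |0,1\rangle\langle 0,1|_{A_1,A_2} + \alpha\, |0,1\rangle\langle 1,0|_{A_1,A_2} + \overline{\alpha}\, |1,0\rangle\langle 0,1|_{A_1,A_2} + \lambda\, |1,0\rangle\langle 1,0|_{A_1,A_2}, \tag{4.5.22}$$
the pure-loss bosonic channel on the dual-rail qubit system $A_1A_2$ is given by
$$\operatorname{Tr}_{E_1,E_2}\!\left[ (U^\eta_{A_1E_1\to B_1E_1} \otimes U^\eta_{A_2E_2\to B_2E_2})(\rho_{A_1,A_2} \otimes |0,0\rangle\langle 0,0|_{E_1,E_2})(U^\eta_{A_1E_1\to B_1E_1} \otimes U^\eta_{A_2E_2\to B_2E_2})^\dagger \right]. \tag{4.5.23}$$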
Although this has a similar form to (4.5.4), which defines the amplitude damping
channel, our particular realization of the qubit system in terms of dual-rail single
photons results in a completely different output from that of the amplitude damping
channel. In particular, using (4.5.22) along with (4.5.21), it is straightforward to
show that
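\begin{align}
&\operatorname{Tr}_{E_1E_2}\!\Big[\big(U^{\eta}_{A_1E_1\to B_1E_1} \otimes U^{\eta}_{A_2E_2\to B_2E_2}\big)\big(\rho_{A_1A_2} \otimes |0,0\rangle\!\langle 0,0|_{E_1E_2}\big)\big(U^{\eta}_{A_1E_1\to B_1E_1} \otimes U^{\eta}_{A_2E_2\to B_2E_2}\big)^{\dagger}\Big] \notag\\
&\qquad = \eta\,\rho_{B_1B_2} + (1-\eta)|0,0\rangle\!\langle 0,0|_{B_1B_2} \notag\\
&\qquad = \mathcal{E}_{1-\eta}(\rho_{A_1A_2}). \tag{4.5.24}
\end{align}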
In other words, the pure-loss bosonic channel on a dual-rail qubit system is simply
the qubit erasure channel E1−𝜂 with erasure probability 1 − 𝜂 and erasure state
|0, 0⟩⟨0, 0| 𝐵1 ,𝐵2 . This means that a dual-rail single-photonic qubit sent through a
pure-loss bosonic channel is transmitted to the receiver unchanged with probability 𝜂,
or it is lost, and replaced by the vacuum state |0, 0⟩⟨0, 0|, with probability 1 − 𝜂.
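The calculation leading to (4.5.24) can be reproduced numerically. The following sketch builds the truncated beamsplitter unitary from (4.5.21), applies one copy to each rail with the environment in the vacuum, and traces out the environment modes. The helper names (`beamsplitter`, `dual_rail_loss`, `EMBED`) and the way the three-dimensional subspace is embedded into a two-qubit space are illustrative choices, not notation from the text.

```python
import numpy as np

def beamsplitter(eta):
    """3x3 beamsplitter unitary of (4.5.21) in the basis |0,0>, |0,1>, |1,0>
    of a (signal mode, environment mode) pair."""
    t, r = np.sqrt(eta), 1j * np.sqrt(1 - eta)
    return np.array([[1, 0, 0], [0, t, r], [0, r, t]])

# Embed the 3-dim subspace into a (mode, env) qubit pair with basis order
# |b,e> = |0,0>, |0,1>, |1,0>, |1,1>; the state |1,1> is never populated.
EMBED = np.zeros((4, 3))
EMBED[0, 0] = EMBED[1, 1] = EMBED[2, 2] = 1

def dual_rail_loss(rho, eta):
    """Send a dual-rail qubit through two pure-loss channels.

    rho is a 2x2 density matrix in the basis {|0,1>, |1,0>} of A1 A2.
    Returns the 4x4 output on B1 B2 in the basis |00>, |01>, |10>, |11>.
    """
    e = np.eye(3)
    v01 = np.kron(e[0], e[2])  # A1 empty, A2 holds the photon, E1 E2 vacuum
    v10 = np.kron(e[2], e[0])  # A1 holds the photon, A2 empty, E1 E2 vacuum
    encode = np.stack([v01, v10], axis=1)                 # 9 x 2
    unitary = np.kron(beamsplitter(eta), beamsplitter(eta))
    to_qubits = np.kron(EMBED, EMBED)                     # order (b1,e1,b2,e2)
    k = to_qubits @ unitary @ encode                      # 16 x 2
    out = (k @ rho @ k.conj().T).reshape((2,) * 8)
    # Trace over e1 and e2 (repeated indices e and f).
    return np.einsum("aebfcedf->abcd", out).reshape(4, 4)

rho = np.array([[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]])
eta = 0.3
output = dual_rail_loss(rho, eta)

expected = np.zeros((4, 4), dtype=complex)
expected[1:3, 1:3] = eta * rho      # qubit survives with probability eta
expected[0, 0] = 1 - eta            # otherwise both rails are in the vacuum
print(np.allclose(output, expected))
```

The output has the form $\eta\rho + (1-\eta)|0,0\rangle\!\langle 0,0|$, with the vacuum playing the role of the erasure flag.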
\begin{equation}
\rho \mapsto p_I\,\rho + p_X X\rho X + p_Y Y\rho Y + p_Z Z\rho Z, \tag{4.5.26}
\end{equation}
where 𝑝 𝐼 , 𝑝 𝑋 , 𝑝𝑌 , 𝑝 𝑍 ≥ 0, 𝑝 𝐼 + 𝑝 𝑋 + 𝑝𝑌 + 𝑝 𝑍 = 1.
Here we highlight two particular Pauli channels of interest.
Exercise 4.28
Show that the Choi state of a generalized Pauli channel is a Bell-diagonal state.
(Recall the definition of a Bell-diagonal state in (3.2.60).)
Exercise 4.29
For the 𝑑-dimensional dephasing channel defined in (4.5.35), prove that, in the
standard basis, only the off-diagonal elements of the input state 𝜌 are affected
by the channel.
Exercise 4.30
Using (3.2.98), prove that for all 𝑝 ∈ [0, 1], the action of the depolarizing
channel D 𝑝 can be written as
\begin{equation}
\mathcal{D}_p(\rho) = (1-q)\rho + q\operatorname{Tr}[\rho]\frac{\mathbb{1}_d}{d} \tag{4.5.37}
\end{equation}
for every linear operator $\rho$, where $q = \frac{pd^2}{d^2-1}$.
Exercise 4.31
Prove that the Choi state of the depolarizing channel D 𝑝 is the isotropic state
𝜌 iso;1−𝑝 (recall (3.2.131)). Conversely, using (4.2.16), prove that every isotropic
state is the Choi state of a depolarizing channel. In other words, prove that for
all 𝑝 ∈ [0, 1],
D 𝑝 (𝑋) = 𝑑Tr[(𝑋 T ⊗ 1) 𝜌 iso;1−𝑝 ] (4.5.38)
for all 𝑋 ∈ L(C𝑑 ).
Exercise 4.32
1. Using the result of Exercise 4.31, along with (3.2.132), show that the
depolarizing channel has the following covariance property:
D 𝑝 = U† ◦ D 𝑝 ◦ U, (4.5.39)
Remark: If the operator $\mathcal{N}(\sigma)$ is invertible, then the Petz recovery map $\mathcal{P}_{\sigma,\mathcal{N}}$ is a channel. If
the operator $\mathcal{N}(\sigma)$ is not invertible, then the inverse $\mathcal{N}(\sigma)^{-\frac{1}{2}}$ is taken on the support of $\mathcal{N}(\sigma)$,
following the convention from Section 2.2.8.1. In this latter case, the Petz recovery map $\mathcal{P}_{\sigma,\mathcal{N}}$ is
a trace non-increasing map.
The Petz recovery map is indeed completely positive because it is the composition
of the following completely positive maps:
1. Sandwiching by the positive semi-definite operator $\mathcal{N}(\sigma)^{-\frac{1}{2}}$.
2. The adjoint $\mathcal{N}^{\dagger}$ of $\mathcal{N}$.
3. Sandwiching by the positive semi-definite operator $\sigma^{\frac{1}{2}}$.
The Petz recovery map is also trace non-increasing, as we can readily verify. For
every positive semi-definite operator 𝑋, the following holds
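\begin{align}
\operatorname{Tr}[\mathcal{P}_{\sigma,\mathcal{N}}(X)] &= \operatorname{Tr}\!\left[\sigma^{\frac{1}{2}}\mathcal{N}^{\dagger}\!\left(\mathcal{N}(\sigma)^{-\frac{1}{2}} X \mathcal{N}(\sigma)^{-\frac{1}{2}}\right)\sigma^{\frac{1}{2}}\right] \tag{4.6.2}\\
&= \operatorname{Tr}\!\left[\sigma\,\mathcal{N}^{\dagger}\!\left(\mathcal{N}(\sigma)^{-\frac{1}{2}} X \mathcal{N}(\sigma)^{-\frac{1}{2}}\right)\right] \tag{4.6.3}\\
&= \operatorname{Tr}\!\left[\mathcal{N}(\sigma)\,\mathcal{N}(\sigma)^{-\frac{1}{2}} X \mathcal{N}(\sigma)^{-\frac{1}{2}}\right] \tag{4.6.4}\\
&= \operatorname{Tr}\!\left[\mathcal{N}(\sigma)^{-\frac{1}{2}}\mathcal{N}(\sigma)\mathcal{N}(\sigma)^{-\frac{1}{2}} X\right] \tag{4.6.5}\\
&= \operatorname{Tr}[\Pi_{\mathcal{N}(\sigma)} X] \tag{4.6.6}\\
&\leq \operatorname{Tr}[X], \tag{4.6.7}
\end{align}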
where ΠN(𝜎) is the projection onto the support of N(𝜎), which arises because N(𝜎)
need not be invertible. If 𝑋 is contained in the support of N(𝜎), then Tr[ΠN(𝜎) 𝑋] =
Tr[𝑋], which means that the Petz recovery channel is trace-preserving for all inputs
with support contained in the support of N(𝜎).
One of the important properties of the Petz recovery channel P𝜎,N is that it
reverses the action of N on 𝜎 whenever N(𝜎) is invertible. In particular,
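\begin{align}
\mathcal{P}_{\sigma,\mathcal{N}}(\mathcal{N}(\sigma)) &= \sigma^{\frac{1}{2}}\mathcal{N}^{\dagger}\!\left(\mathcal{N}(\sigma)^{-\frac{1}{2}}\mathcal{N}(\sigma)\mathcal{N}(\sigma)^{-\frac{1}{2}}\right)\sigma^{\frac{1}{2}} \tag{4.6.8}\\
&= \sigma^{\frac{1}{2}}\mathcal{N}^{\dagger}(\mathbb{1}_{\mathcal{H}'})\sigma^{\frac{1}{2}} \tag{4.6.9}\\
&= \sigma, \tag{4.6.10}
\end{align}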
Then, since N is a channel, its adjoint is unital, which leads to the final equality.
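The recovery property can also be checked numerically. The sketch below implements the Petz recovery map from a list of Kraus operators, taking inverse square roots on the support as in the remark above. The amplitude damping channel and the state `sigma` are hypothetical test inputs, and the helper names (`psd_power`, `channel`, `adjoint`, `petz`) are our own.

```python
import numpy as np

def psd_power(a, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh(a)
    keep = w > tol
    return (v[:, keep] * w[keep] ** p) @ v[:, keep].conj().T

def channel(kraus, x):
    """N(x) = sum_k K x K^dagger."""
    return sum(k @ x @ k.conj().T for k in kraus)

def adjoint(kraus, y):
    """N^dagger(y) = sum_k K^dagger y K."""
    return sum(k.conj().T @ y @ k for k in kraus)

def petz(kraus, sigma, x):
    """Petz recovery map P_{sigma,N} applied to x."""
    inv_root = psd_power(channel(kraus, sigma), -0.5)
    root = psd_power(sigma, 0.5)
    return root @ adjoint(kraus, inv_root @ x @ inv_root) @ root

# Hypothetical example: amplitude damping with damping parameter 0.4.
g = 0.4
kraus = [np.array([[1, 0], [0, np.sqrt(1 - g)]]),
         np.array([[0, np.sqrt(g)], [0, 0]])]
sigma = np.array([[0.7, 0.2], [0.2, 0.3]])

recovered = petz(kraus, sigma, channel(kraus, sigma))
print(np.allclose(recovered, sigma))
```

In this example $\mathcal{N}(\sigma)$ is invertible, so the recovery map is also trace preserving; for a rank-deficient $\mathcal{N}(\sigma)$ the same code applies, with `psd_power` restricting to the support.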
Remark: The equality in (4.6.10) holds more generally; i.e., it holds even when N(𝜎) is not
invertible. To see this, we use the fact that the projection ΠN( 𝜎) onto the support of N(𝜎)
satisfies ΠN( 𝜎) ≤ 1H′ . Then, we find that
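\begin{align}
\mathcal{P}_{\sigma,\mathcal{N}}(\mathcal{N}(\sigma)) &= \sigma^{\frac{1}{2}}\mathcal{N}^{\dagger}\!\left(\mathcal{N}(\sigma)^{-\frac{1}{2}}\mathcal{N}(\sigma)\mathcal{N}(\sigma)^{-\frac{1}{2}}\right)\sigma^{\frac{1}{2}} \tag{4.6.12}\\
&= \sigma^{\frac{1}{2}}\mathcal{N}^{\dagger}(\Pi_{\mathcal{N}(\sigma)})\sigma^{\frac{1}{2}} \tag{4.6.13}\\
&\leq \sigma^{\frac{1}{2}}\mathcal{N}^{\dagger}(\mathbb{1}_{\mathcal{H}'})\sigma^{\frac{1}{2}} \tag{4.6.14}\\
&= \sigma. \tag{4.6.15}
\end{align}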
Since |𝜓⟩ is arbitrary, it holds that Π 𝜎 ≤ N† (ΠN( 𝜎) ). Using this, we find that
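\begin{equation}
\mathcal{P}_{\sigma,\mathcal{N}}(\mathcal{N}(\sigma)) = \sigma^{\frac{1}{2}}\mathcal{N}^{\dagger}(\Pi_{\mathcal{N}(\sigma)})\sigma^{\frac{1}{2}} \geq \sigma^{\frac{1}{2}}\Pi_{\sigma}\sigma^{\frac{1}{2}} = \sigma. \tag{4.6.23}
\end{equation}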
Having shown that P 𝜎,N (N(𝜎)) ≤ 𝜎 and P 𝜎,N (N(𝜎)) ≥ 𝜎, we conclude that
Let the input Hilbert space to the channel N in the definition of the Petz recovery
map be H = H 𝐴𝐵 . Then, let N = Tr 𝐵 be the partial trace over 𝐵, and note that (see
Exercise 4.16)
\begin{equation}
\mathcal{N}^{\dagger}(\sigma_A) = \sigma_A \otimes \mathbb{1}_B. \tag{4.6.25}
\end{equation}
Indeed, using Definition 2.18 for the adjoint of a superoperator, we have
Therefore, the Petz recovery map corresponding to the partial trace over 𝐵 is
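\begin{equation}
\mathcal{P}_{\sigma_{AB},\operatorname{Tr}_B}(X_A) = \sigma_{AB}^{\frac{1}{2}}\left(\sigma_A^{-\frac{1}{2}} X_A \sigma_A^{-\frac{1}{2}} \otimes \mathbb{1}_B\right)\sigma_{AB}^{\frac{1}{2}}. \tag{4.6.30}
\end{equation}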
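\begin{align}
&= \sum_{j=0}^{d_B-1} \sigma_{AB}^{\frac{1}{2}}\left(\sigma_A^{-\frac{1}{2}} X_A \sigma_A^{-\frac{1}{2}} \otimes |j\rangle\!\langle j|_B\right)\sigma_{AB}^{\frac{1}{2}} \tag{4.6.32}\\
&= \sum_{j=0}^{d_B-1} \sigma_{AB}^{\frac{1}{2}}\left(\sigma_A^{-\frac{1}{2}} \otimes |j\rangle_B\right) X_A \left(\sigma_A^{-\frac{1}{2}} \otimes \langle j|_B\right)\sigma_{AB}^{\frac{1}{2}} \tag{4.6.33}\\
&= \sum_{j=0}^{d_B-1} K_j X_A K_j^{\dagger}, \tag{4.6.34}
\end{align}
where
\begin{equation}
K_j := \sigma_{AB}^{\frac{1}{2}}\left(\sigma_A^{-\frac{1}{2}} \otimes |j\rangle_B\right). \tag{4.6.35}
\end{equation}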
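The operators $K_j$, for $0 \leq j \leq d_B - 1$, are thus Kraus operators for the Petz recovery
map for the partial trace. Using (4.3.9), and letting the environment $E$ be denoted
by $\hat{B}$ (since the dimension of the environment in the construction (4.3.9) is the
same as the number of Kraus operators, which in this case is equal to the dimension
of $B$), we find that
\begin{align}
V_{A\to B\hat{B}} &= \sum_{j=0}^{d_B-1} \sigma_{AB}^{\frac{1}{2}}\left(\sigma_A^{-\frac{1}{2}} \otimes |j\rangle_B\right) \otimes |j\rangle_{\hat{B}} \tag{4.6.36}\\
&= \sum_{j=0}^{d_B-1} \sigma_{AB}^{\frac{1}{2}}\left(\sigma_A^{-\frac{1}{2}} \otimes |j\rangle_B \otimes |j\rangle_{\hat{B}}\right) \tag{4.6.37}\\
&= \left(\sigma_{AB}^{\frac{1}{2}} \otimes \mathbb{1}_{\hat{B}}\right)\left(\sigma_A^{-\frac{1}{2}} \otimes \mathbb{1}_B \otimes \mathbb{1}_{\hat{B}}\right)\left(\mathbb{1}_A \otimes \sum_{j=0}^{d_B-1} |j,j\rangle_{B\hat{B}}\right) \tag{4.6.38}\\
&= \left(\sigma_{AB}^{\frac{1}{2}} \otimes \mathbb{1}_{\hat{B}}\right)\left(\sigma_A^{-\frac{1}{2}} \otimes \mathbb{1}_{B\hat{B}}\right)\left(\mathbb{1}_A \otimes |\Gamma\rangle_{B\hat{B}}\right), \tag{4.6.39}
\end{align}
which is an isometric extension of the Petz recovery map $\mathcal{P}_{\sigma_{AB},\operatorname{Tr}_B}$. Omitting
identity operators, this can be written more simply as follows:
\begin{equation}
V_{A\to B\hat{B}} = \sigma_{AB}^{\frac{1}{2}}\,\sigma_A^{-\frac{1}{2}}\,|\Gamma\rangle_{B\hat{B}}. \tag{4.6.40}
\end{equation}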
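As a sanity check on (4.6.35), the Kraus operators $K_j$ can be constructed for a randomly chosen state. The sketch below assumes a full-rank $\sigma_{AB}$ on two qubits (generated from a fixed random seed purely for illustration) and verifies that the $K_j$ satisfy the completeness relation and that the resulting map sends $\sigma_A$ back to $\sigma_{AB}$. The helper names are our own.

```python
import numpy as np

def psd_power(a, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh(a)
    keep = w > tol
    return (v[:, keep] * w[keep] ** p) @ v[:, keep].conj().T

def random_state(dim, seed=1):
    """Random full-rank density matrix (illustrative test input)."""
    rng = np.random.default_rng(seed)
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

d_a, d_b = 2, 2
sigma_ab = random_state(d_a * d_b)
# Partial trace over B: sum over the repeated index b.
sigma_a = np.einsum("ibjb->ij", sigma_ab.reshape(d_a, d_b, d_a, d_b))

root_ab = psd_power(sigma_ab, 0.5)
inv_root_a = psd_power(sigma_a, -0.5)

# K_j = sigma_AB^{1/2} (sigma_A^{-1/2} (x) |j>_B), a (d_a*d_b) x d_a matrix.
kraus = [root_ab @ np.kron(inv_root_a, np.eye(d_b)[:, [j]]) for j in range(d_b)]

completeness = sum(k.conj().T @ k for k in kraus)
recovered = sum(k @ sigma_a @ k.conj().T for k in kraus)
print(np.allclose(completeness, np.eye(d_a)), np.allclose(recovered, sigma_ab))
```

Stacking the $K_j$ with an orthonormal environment basis, as in (4.6.36), then gives the isometric extension.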
Exercise 4.33
Recall the Bayes theorem from probability theory:
where 𝑋 and 𝑌 are random variables with joint probability distribution 𝑝 𝑋𝑌 (𝑥, 𝑦)
over the alphabets X and Y, and the distributions given above are derived from
this joint distribution as
\begin{align}
p_X(x) &= \sum_{y\in\mathcal{Y}} p_{XY}(x,y), & p_Y(y) &= \sum_{x\in\mathcal{X}} p_{XY}(x,y), \tag{4.6.42}\\
p_{Y|X}(y|x) &= \frac{p_{XY}(x,y)}{p_X(x)}, & p_{X|Y}(x|y) &= \frac{p_{XY}(x,y)}{p_Y(y)}. \tag{4.6.43}
\end{align}
We now develop a connection between Bayes theorem and the Petz recovery
map. Let {𝜌𝑥 }𝑥∈X be a set of states, let N be a channel, and let {𝑀𝑦 } 𝑦∈Y be a
POVM. Set
𝑝𝑌 |𝑋 (𝑦|𝑥) = Tr[𝑀𝑦 N(𝜌𝑥 )]. (4.6.44)
1. Show that (4.6.41) is satisfied with
for the set {𝜎𝑦 } 𝑦∈Y of states, channel P, and POVM {𝐿 𝑥 }𝑥∈X chosen as
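\begin{align}
\sigma_y &= \frac{1}{p_Y(y)}[\mathcal{N}(\rho)]^{\frac{1}{2}} M_y [\mathcal{N}(\rho)]^{\frac{1}{2}}, \tag{4.6.46}\\
\mathcal{P}(\cdot) &= \rho^{\frac{1}{2}}\mathcal{N}^{\dagger}\!\left([\mathcal{N}(\rho)]^{-\frac{1}{2}}(\cdot)[\mathcal{N}(\rho)]^{-\frac{1}{2}}\right)\rho^{\frac{1}{2}}, \tag{4.6.47}\\
L_x &= p_X(x)\,[\rho]^{-\frac{1}{2}}\rho_x[\rho]^{-\frac{1}{2}}, \tag{4.6.48}
\end{align}
where
\begin{equation}
\rho = \sum_{x\in\mathcal{X}} p_X(x)\rho_x. \tag{4.6.49}
\end{equation}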
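\begin{equation}
\mathcal{L}^{\to}_{AB\to A'B'} = \sum_{x\in\mathcal{X}} \mathcal{M}^{x}_{A\to A'} \otimes \mathcal{N}^{x}_{B\to B'}. \tag{4.6.50}
\end{equation}
for some finite alphabet $\mathcal{Y}$ and sets $\{\mathcal{S}_y\}_{y\in\mathcal{Y}}$, $\{\mathcal{W}_y\}_{y\in\mathcal{Y}}$ of completely positive
trace-non-increasing maps such that $\sum_{y\in\mathcal{Y}} \mathcal{S}_y \otimes \mathcal{W}_y$ is trace preserving.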
Consider the one-way LOCC channel L→ 𝐴𝐵→𝐴′ 𝐵′ from Alice to Bob defined
as in (4.6.50). The values in the set X form the possible messages that can be
communicated from Alice to Bob and constitute the “classical communication”
part of LOCC. The set {M𝑥 }𝑥∈X of completely positive trace-non-increasing maps
and the set {N𝑥 }𝑥∈X of quantum channels specify the actions of Alice and Bob
for each value 𝑥 ∈ X and constitute the “local operations” part of LOCC. The set
{M𝑥 }𝑥∈X of completely positive trace-non-increasing maps that sum to a channel
essentially specifies a quantum instrument. The operations corresponding to these
maps can thus be considered probabilistic since the maps are not trace preserving.
In general, the party that performs the quantum instrument determines the direction
of classical communication and thus the direction of the LOCC channel. In this
case, Alice performs the classical communication since she performs the quantum
instrument. The values in the set X specify the outcomes of the instrument, and
Alice communicates the outcome to Bob, who performs the corresponding channel
selected from his set {N𝑥 }𝑥∈X of channels.
In more detail, let 𝜌 𝐴𝐵 be the initial state shared by Alice and Bob. Using
the definition in (4.4.53) for the channel corresponding to the quantum instrument
{M𝑥 }𝑥∈X , the state after applying the quantum instrument is
\begin{equation}
\sum_{x\in\mathcal{X}} \mathcal{M}^{x}_{A\to A'}(\rho_{AB}) \otimes |x\rangle\!\langle x|_{X_A}, \tag{4.6.53}
\end{equation}
where the system 𝑋 𝐴 stores the classical information corresponding to the outcome
of the instrument. Alice then communicates the outcome of the instrument to Bob.
This classical communication can be understood via the noiseless classical channel
from 𝑋 𝐴 to 𝑋𝐵 defined by
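\begin{equation}
\theta_{X_A} \mapsto \sum_{x\in\mathcal{X}} \langle x|_{X_A}\theta_{X_A}|x\rangle_{X_A}\,|x\rangle\!\langle x|_{X_B}. \tag{4.6.54}
\end{equation}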
which corresponds to Bob measuring his system 𝑋𝐵 in the basis {|𝑥⟩ 𝑋𝐵 }𝑥∈X and
applying a quantum channel from the set {N𝑥𝐵→𝐵′ }𝑥∈X to the system 𝐵 based on the
outcome. The final state is then
\begin{equation}
\sum_{x\in\mathcal{X}} (\mathcal{M}^{x}_{A\to A'} \otimes \mathcal{N}^{x}_{B\to B'})(\rho_{AB}), \tag{4.6.57}
\end{equation}
which is precisely the output of the one-way LOCC channel L→ 𝐴𝐵→𝐴′ 𝐵′ defined in
(4.6.50). We can succinctly write the steps in (4.6.53)–(4.6.56) as
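\begin{equation}
\mathcal{L}^{\leftrightarrow}_{A_0B_0\to A_kB_k} = \mathcal{L}^{k,\to}_{A_{k-1}B_{k-1}\to A_kB_k} \circ \mathcal{L}^{k-1,\leftarrow}_{A_{k-2}B_{k-2}\to A_{k-1}B_{k-1}} \circ \cdots \circ \mathcal{L}^{2,\leftarrow}_{A_1B_1\to A_2B_2} \circ \mathcal{L}^{1,\to}_{A_0B_0\to A_1B_1}. \tag{4.6.61}
\end{equation}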
For each round 𝑖, there is a finite alphabet X𝑖 consisting of the messages commu-
nicated in the round, along with a set {M𝑖𝑥 }𝑥∈X𝑖 of completely positive trace-non-
increasing maps that sum to a quantum channel and a set {N𝑖𝑥 }𝑥∈X𝑖 of quantum
channels. In a multi-round LOCC channel such as this one, it is possible for the
operation sets {M𝑖𝑥 }𝑥∈X𝑖 and {N𝑖𝑥 }𝑥∈X𝑖 on the 𝑖th round to depend on the outcomes
and actions taken in previous rounds. The quantum teleportation protocol in
Section 5.1 provides a concrete example of a one-way LOCC protocol from Alice
to Bob.
for some finite alphabet X and sets of completely positive and trace-non-
increasing maps {C𝑥 }𝑥∈X and {D𝑥 }𝑥∈X such that S is trace preserving.
Every separable channel has a set of Kraus operators in product form; i.e., for
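every separable channel $\mathcal{S}_{AB\to A'B'}$ as in (4.6.62) there exists a finite alphabet $\mathcal{Y}$ and
sets $\{C^{y}_{A\to A'}\}_{y\in\mathcal{Y}}$ and $\{D^{y}_{B\to B'}\}_{y\in\mathcal{Y}}$ such that
\begin{equation}
\mathcal{S}_{AB\to A'B'}(\rho_{AB}) = \sum_{y\in\mathcal{Y}} (C^{y}_{A\to A'} \otimes D^{y}_{B\to B'})\,\rho_{AB}\,(C^{y}_{A\to A'} \otimes D^{y}_{B\to B'})^{\dagger} \tag{4.6.63}
\end{equation}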
for all 𝜌 𝐴𝐵 .
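A key property of a separable channel is that it outputs a separable state if the input
state is separable. To see this, consider the separable state $\sigma_{AB} = \sum_{z\in\mathcal{Z}} p(z)\,\tau^{z}_A \otimes \omega^{z}_B$.
Then the output state $\mathcal{S}_{AB\to A'B'}(\sigma_{AB})$ is given by
\begin{align}
\mathcal{S}_{AB\to A'B'}(\sigma_{AB}) &= \sum_{y\in\mathcal{Y}} (C^{y}_{A\to A'} \otimes D^{y}_{B\to B'})\left(\sum_{z\in\mathcal{Z}} p(z)\,\tau^{z}_A \otimes \omega^{z}_B\right)(C^{y}_{A\to A'} \otimes D^{y}_{B\to B'})^{\dagger} \tag{4.6.64}\\
&= \sum_{y\in\mathcal{Y},\,z\in\mathcal{Z}} p(z)\,C^{y}_{A\to A'}\tau^{z}_A(C^{y}_{A\to A'})^{\dagger} \otimes D^{y}_{B\to B'}\omega^{z}_B(D^{y}_{B\to B'})^{\dagger}, \tag{4.6.65}
\end{align}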
The converse of Proposition 4.24 is not true. For example, let us define the
following operators:
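\begin{equation}
K_1 := \frac{1}{\sqrt{2}}|0\rangle\!\langle 0| + |1\rangle\!\langle 1|, \qquad K_2 := |0\rangle\!\langle 0|, \qquad K_3 := |1\rangle\!\langle 1|. \tag{4.6.66}
\end{equation}
Then, following the notation in (4.6.62), let
\begin{align}
\mathcal{C}^{1}_{A\to A'}(\cdot) &= K_1(\cdot)K_1^{\dagger}, \tag{4.6.67}\\
\mathcal{C}^{2}_{A\to A'}(\cdot) &= K_2(\cdot)K_2^{\dagger}, \tag{4.6.68}\\
\mathcal{C}^{3}_{A\to A'}(\cdot) &= K_3(\cdot)K_3^{\dagger}, \tag{4.6.69}
\end{align}
and
\begin{align}
\mathcal{D}^{1}_{B\to B'}(\cdot) &= K_2(\cdot)K_2^{\dagger}, \tag{4.6.70}\\
\mathcal{D}^{2}_{B\to B'}(\cdot) &= K_1(\cdot)K_1^{\dagger}, \tag{4.6.71}\\
\mathcal{D}^{3}_{B\to B'}(\cdot) &= K_3(\cdot)K_3^{\dagger}. \tag{4.6.72}
\end{align}
Then, the map
\begin{align}
\mathcal{S}_{AB\to A'B'}(\cdot) &:= \sum_{x=1}^{3} (\mathcal{C}^{x}_{A\to A'} \otimes \mathcal{D}^{x}_{B\to B'})(\cdot) \tag{4.6.73}\\
&= (K_1 \otimes K_2)(\cdot)(K_1 \otimes K_2)^{\dagger} + (K_2 \otimes K_1)(\cdot)(K_2 \otimes K_1)^{\dagger} \tag{4.6.74}\\
&\quad + (K_3 \otimes K_3)(\cdot)(K_3 \otimes K_3)^{\dagger} \tag{4.6.75}
\end{align}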
is a separable channel, but it can be shown that it is not an LOCC channel; please
consult the Bibliographic Notes in Section 3.4.
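That the map in (4.6.73)–(4.6.75) is indeed a channel, even though none of the local maps is trace preserving on its own, can be confirmed numerically. The sketch below checks the completeness relation for the three product Kraus operators and, as an illustration, verifies that a product input state yields an output with positive partial transpose (for two qubits this is equivalent to separability). The input state $|{+}\rangle|{+}\rangle$ and the helper name `partial_transpose_b` are our own choices.

```python
import numpy as np

K1 = np.diag([1 / np.sqrt(2), 1.0])
K2 = np.diag([1.0, 0.0])
K3 = np.diag([0.0, 1.0])

# Product Kraus operators of the separable map in (4.6.74)-(4.6.75).
kraus = [np.kron(K1, K2), np.kron(K2, K1), np.kron(K3, K3)]

# Trace preservation: sum_x K_x^dagger K_x must be the identity.
completeness = sum(k.conj().T @ k for k in kraus)

def partial_transpose_b(rho):
    """Partial transpose on the second qubit of a two-qubit operator."""
    return rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)

# Illustrative separable input: |+>|+>.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
psi = np.kron(plus, plus)
rho_in = np.outer(psi, psi)
rho_out = sum(k @ rho_in @ k.conj().T for k in kraus)

min_pt_eig = np.linalg.eigvalsh(partial_transpose_b(rho_out)).min()
print(np.allclose(completeness, np.eye(4)), min_pt_eig >= -1e-12)
```

This check says nothing about whether the channel is LOCC; it only confirms the two properties that every separable channel must have.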
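\begin{equation}
\mathcal{N}(\rho_A) = \mathcal{L}^{\leftrightarrow}_{RAB'\to B}(\rho_A \otimes \omega_{RB'}). \tag{4.6.76}
\end{equation}
For the LOCC channel $\mathcal{L}^{\leftrightarrow}_{RAB'\to B}$, the input systems $RA$ are Alice’s, the input
system $B'$ is Bob’s, Alice’s output system is trivial, and Bob’s output system
is $B$.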
end is N(𝜌 𝐴 ). The resource state 𝜔 𝑅𝐵′ is fixed, being such that the same resource
state can be used for every input state 𝜌 𝐴 on Alice’s system. A concrete example of
an LOCC simulation of a channel is shown in the context of quantum teleportation
in Section 5.1 below.
Due to the fact that separable channels strictly contain LOCC channels, it is
sensible to generalize the notion of teleportation simulation even further to this
case:
Proposition 4.28
Completely PPT-preserving channels preserve the set of PPT states.
Proof: Suppose that 𝜌 𝐴𝐵 is a PPT state and that P 𝐴𝐵→𝐴′ 𝐵′ is a completely PPT-
preserving channel. If we take the partial transpose T𝐵′ on the output state
𝜎𝐴′ 𝐵′ = P 𝐴𝐵→𝐴′ 𝐵′ (𝜌 𝐴𝐵 ), then we find that
Proposition 4.29
Every separable channel is a completely PPT-preserving channel.
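where $\mathcal{X}$ is a finite alphabet and $\{\mathcal{R}^{x}\}_{x\in\mathcal{X}}$ and $\{\mathcal{W}^{x}\}_{x\in\mathcal{X}}$ are sets of completely
positive trace non-increasing maps such that $\sum_{x\in\mathcal{X}} \mathcal{R}^{x} \otimes \mathcal{W}^{x}$ is trace preserving.
Then,
\begin{equation}
\mathrm{T}_{B'} \circ \mathcal{S}_{AB\to A'B'} \circ \mathrm{T}_B = \sum_{x\in\mathcal{X}} \mathcal{R}^{x}_{A\to A'} \otimes (\mathrm{T}_{B'} \circ \mathcal{W}^{x}_{B\to B'} \circ \mathrm{T}_B). \tag{4.6.82}
\end{equation}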
By applying Lemma 4.30 below, we conclude that the maps T𝐵′ ◦ W𝑥𝐵→𝐵′ ◦ T𝐵
are completely positive for all 𝑥 ∈ X, which means that T𝐵′ ◦ S 𝐴𝐵→𝐴′ 𝐵′ ◦ T𝐵 is
completely positive. Therefore, S 𝐴𝐵→𝐴′ 𝐵′ is completely PPT-preserving. ■
Lemma 4.30
Let N𝐵→𝐵′ be a completely positive map. Then the map T𝐵′ ◦ N𝐵→𝐵′ ◦ T𝐵 is
completely positive, and its Choi operator is given by the full transpose of the
Choi operator for N𝐵→𝐵′ , i.e.,
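\begin{equation}
\Gamma^{\mathrm{T}_{B'}\circ\mathcal{N}_{B\to B'}\circ\mathrm{T}_B}_{BB'} = \mathrm{T}(\Gamma^{\mathcal{N}}_{BB'}). \tag{4.6.83}
\end{equation}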
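\begin{align}
\mathrm{T}_B(\Gamma_{\hat{B}B}) &= \sum_{i,j=0}^{d_B-1} |i\rangle\!\langle j|_{\hat{B}} \otimes (|i\rangle\!\langle j|_B)^{\mathrm{T}} \tag{4.6.84}\\
&= \sum_{i,j=0}^{d_B-1} |i\rangle\!\langle j|_{\hat{B}} \otimes |j\rangle\!\langle i|_B \tag{4.6.85}\\
&= \sum_{i,j=0}^{d_B-1} (|j\rangle\!\langle i|_{\hat{B}})^{\mathrm{T}} \otimes |j\rangle\!\langle i|_B \tag{4.6.86}\\
&= \mathrm{T}_{\hat{B}}(\Gamma_{\hat{B}B}). \tag{4.6.87}
\end{align}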
Then the following holds for the Choi representation of T𝐵′ ◦ N𝐵→𝐵′ ◦ T𝐵 :
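\begin{align}
\Gamma^{\mathrm{T}_{B'}\circ\mathcal{N}_{B\to B'}\circ\mathrm{T}_B}_{BB'} &= (\mathrm{T}_{B'} \circ \mathcal{N}_{B\to B'} \circ \mathrm{T}_B)(\Gamma_{\hat{B}B}) \tag{4.6.88}\\
&= ((\mathrm{T}_{\hat{B}} \otimes \mathrm{T}_{B'}) \circ \mathcal{N}_{B\to B'})(\Gamma_{\hat{B}B}) \tag{4.6.89}\\
&= \mathrm{T}(\Gamma^{\mathcal{N}}_{BB'}). \tag{4.6.90}
\end{align}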
Since the map $\mathcal{N}_{B\to B'}$ is completely positive, its Choi representation $\Gamma^{\mathcal{N}}_{BB'}$ is positive
semi-definite. Since positive semi-definiteness is preserved under transposition, we
find that $\mathrm{T}(\Gamma^{\mathcal{N}}_{BB'})$ is positive semi-definite, which means that the map $\mathrm{T}_{B'} \circ \mathcal{N}_{B\to B'} \circ \mathrm{T}_B$
is completely positive (by applying Theorem 4.3). ■
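The statement of Lemma 4.30 lends itself to a quick numerical test. The sketch below draws a random channel (from a random isometry, with an arbitrary fixed seed), computes the Choi operators of $\mathcal{N}$ and of $\mathrm{T}\circ\mathcal{N}\circ\mathrm{T}$, and checks that the latter is the full transpose of the former and is positive semi-definite. The helper names `random_kraus` and `choi` and the chosen dimensions are illustrative.

```python
import numpy as np

def random_kraus(d_in, d_out, n_kraus, seed=0):
    """Kraus operators obtained by slicing a random isometry."""
    rng = np.random.default_rng(seed)
    shape = (n_kraus * d_out, d_in)
    g = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    q, _ = np.linalg.qr(g)  # q has orthonormal columns, so q^dagger q = 1
    return [q[k * d_out:(k + 1) * d_out, :] for k in range(n_kraus)]

def choi(linear_map, d_in):
    """Choi operator sum_{i,j} |i><j| (x) map(|i><j|)."""
    total = 0
    for i in range(d_in):
        for j in range(d_in):
            e_ij = np.zeros((d_in, d_in), dtype=complex)
            e_ij[i, j] = 1
            total = total + np.kron(e_ij, linear_map(e_ij))
    return total

kraus = random_kraus(d_in=2, d_out=3, n_kraus=2)

def channel(x):
    return sum(k @ x @ k.conj().T for k in kraus)

def sandwiched(x):
    """The map T o N o T: transpose, apply N, transpose again."""
    return channel(x.T).T

gamma_n = choi(channel, 2)
gamma_tnt = choi(sandwiched, 2)

is_full_transpose = np.allclose(gamma_tnt, gamma_n.T)
min_eig = np.linalg.eigvalsh(gamma_tnt).min()
print(is_full_transpose, min_eig >= -1e-10)
```

By contrast, applying only one of the two transposes generally produces a Choi operator with negative eigenvalues, which is why both are needed in the lemma.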
Proof: We begin by proving the only-if part. Suppose that N 𝐴𝐵→𝐴′ 𝐵′ is com-
pletely PPT-preserving. By definition, this implies that T𝐵′ ◦ N 𝐴𝐵→𝐴′ 𝐵′ ◦ T𝐵 is
completely positive. By the definition of complete positivity, we conclude that
$(\mathrm{T}_{B'} \circ \mathcal{N}_{AB\to A'B'} \circ \mathrm{T}_B)(\Phi_{\bar{A}\bar{B}AB})$ is a positive semi-definite operator, where the
Hilbert spaces corresponding to systems $\bar{A}$ and $\bar{B}$ are isomorphic to the Hilbert
spaces corresponding to the input systems $A$ and $B$, respectively. By employing a
calculation similar to that in (4.6.84)–(4.6.90), we conclude that
For the C-PPT-P channel P 𝑅 𝐴𝐵′ →𝐵 , the input systems 𝑅 𝐴 are Alice’s, the input
system 𝐵′ is Bob’s, Alice’s output system is trivial, and Bob’s output system
is 𝐵.
One of the main applications considered in this book is communication and, more
specifically, when communication is possible or impossible. To this end, suppose
that Alice and Bob are connected by means of a bipartite channel N 𝐴𝐵→𝐴′ 𝐵′ . Such
a channel is said to be non-signaling from Alice to Bob if Alice cannot use it
to communicate a message to Bob. We give a precise definition as follows:
To interpret the condition in (4.6.93), consider the following. For Bob, the
reduced state of his output system 𝐵′ is obtained by tracing out Alice’s output
system 𝐴′. Note that the reduced state on 𝐵′ is all that Bob can access at the output
in this scenario. If the condition in (4.6.93) holds, then the reduced state on Bob’s
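\begin{equation}
\Gamma^{\mathcal{N}}_{ABA'B'} := \mathcal{N}_{\bar{A}\bar{B}\to A'B'}(\Gamma_{A\bar{A}} \otimes \Gamma_{B\bar{B}}) \tag{4.6.94}
\end{equation}
\begin{equation}
\operatorname{Tr}_{A'}[\Gamma^{\mathcal{N}}_{ABA'B'}] = \pi_A \otimes \operatorname{Tr}_{A'A}[\Gamma^{\mathcal{N}}_{ABA'B'}]. \tag{4.6.95}
\end{equation}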
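\begin{align}
&(\operatorname{Tr}_{A'} \circ \mathcal{N}_{\bar{A}\bar{B}\to A'B'} \circ \mathcal{R}^{\pi}_{\bar{A}})(\Gamma_{A\bar{A}} \otimes \Gamma_{B\bar{B}}) \notag\\
&\qquad = (\operatorname{Tr}_{A'} \circ \mathcal{N}_{\bar{A}\bar{B}\to A'B'})(\mathbb{1}_A \otimes \pi_{\bar{A}} \otimes \Gamma_{B\bar{B}}) \tag{4.6.97}\\
&\qquad = (\operatorname{Tr}_{A'} \circ \mathcal{N}_{\bar{A}\bar{B}\to A'B'})(\pi_A \otimes \mathbb{1}_{\bar{A}} \otimes \Gamma_{B\bar{B}}) \tag{4.6.98}\\
&\qquad = \pi_A \otimes (\operatorname{Tr}_{A'} \circ \mathcal{N}_{\bar{A}\bar{B}\to A'B'})(\mathbb{1}_{\bar{A}} \otimes \Gamma_{B\bar{B}}) \tag{4.6.99}\\
&\qquad = \pi_A \otimes (\operatorname{Tr}_{AA'} \circ \mathcal{N}_{\bar{A}\bar{B}\to A'B'})(\Gamma_{A\bar{A}} \otimes \Gamma_{B\bar{B}}) \tag{4.6.100}\\
&\qquad = \pi_A \otimes \operatorname{Tr}_{A'A}[\Gamma^{\mathcal{N}}_{ABA'B'}]. \tag{4.6.101}
\end{align}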
(4.6.93) holds for such a channel. By tracing over the output system 𝐴′, we find that
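\begin{align}
\operatorname{Tr}_{A'}[\mathcal{L}^{\leftarrow}_{AB\to A'B'}(\rho_{AB})] &= \sum_{x\in\mathcal{X}} \operatorname{Tr}_{A'}[(\mathcal{N}^{x}_{A\to A'} \otimes \mathcal{M}^{x}_{B\to B'})(\rho_{AB})] \tag{4.6.102}\\
&= \sum_{x\in\mathcal{X}} \mathcal{M}^{x}_{B\to B'}(\operatorname{Tr}_A[\rho_{AB}]) \tag{4.6.103}\\
&= \sum_{x\in\mathcal{X}} \mathcal{M}^{x}_{B\to B'}(\rho_B). \tag{4.6.104}
\end{align}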
The second equality follows because N𝑥𝐴→𝐴′ is trace preserving for all 𝑥. Also,
consider that
$(\operatorname{Tr}_{A'} \circ \mathcal{L}^{\leftarrow}_{AB\to A'B'} \circ \mathcal{R}^{\pi}_{A})(\rho_{AB})$
4.7 Summary
4.9 Problems
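1. Let $\mathcal{N}$ be a quantum channel, and let $\{K_i\}_{i=1}^{r}$ and $\{K'_i\}_{i=1}^{s}$ be two sets of Kraus operators
for $\mathcal{N}$. Prove that these sets of Kraus operators are related by an isometry as in (4.3.3).
2. Let $F_{AA'} = \sum_{i,j=0}^{d_A-1} |j,i\rangle\!\langle i,j|_{AA'}$ be the swap operator, as defined in (3.2.83), and let
$\mathcal{N}_{A\to B}$ be a superoperator. Consider the operator
\begin{equation}
F^{\mathcal{N}}_{AB} := \mathcal{N}_{A'\to B}(F_{AA'}) = \sum_{i,j=0}^{d_A-1} |j\rangle\!\langle i|_A \otimes \mathcal{N}(|i\rangle\!\langle j|_{A'}). \tag{4.9.1}
\end{equation}
(a) Prove that $F^{\mathcal{N}}_{AB} = \mathrm{T}_A(\Gamma^{\mathcal{N}}_{AB})$.
(b) Prove that $\operatorname{Tr}_A[F^{\mathcal{N}}_{AB}] = \mathcal{N}_{A\to B}(\mathbb{1}_A)$.
(c) If $\mathcal{N}_{A\to B}$ is trace preserving, then prove that $\operatorname{Tr}_B[F^{\mathcal{N}}_{AB}] = \mathbb{1}_A$.
(d) Prove that $\langle X_A^{\dagger} \otimes Y_B, F^{\mathcal{N}}_{AB}\rangle = \langle Y_B, \mathcal{N}_{A\to B}(X_A)\rangle$ for all $X_A \in \mathrm{L}(\mathcal{H}_A)$ and $Y_B \in
\mathrm{L}(\mathcal{H}_B)$.
\begin{equation}
F^{\mathcal{N}}_{AB} = \sum_{k,\ell=0}^{d_B-1} \mathcal{N}^{\dagger}(|k\rangle\!\langle\ell|_B)^{\dagger} \otimes |k\rangle\!\langle\ell|_B. \tag{4.9.3}
\end{equation}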
(h) Prove that $\mathcal{N}_{A\to B}$ is a positive map if and only if $(\langle\psi|_A \otimes \langle\phi|_B)F^{\mathcal{N}}_{AB}(|\psi\rangle_A \otimes |\phi\rangle_B) \geq 0$
for all $|\psi\rangle_A \in \mathcal{H}_A$ and $|\phi\rangle_B \in \mathcal{H}_B$.
(Bibliographic Note: The representation in (4.9.1) was defined by de Pillis (1967). It is
sometimes called the Jamiołkowski representation of $\mathcal{N}$ due to the work of Jamiołkowski
(1972), who proved the necessary and sufficient condition on $F^{\mathcal{N}}_{AB}$ in (h) such that $\mathcal{N}_{A\to B}$
is positive.)
Chapter 5
Fundamental Quantum
Information Processing Tasks
Having studied quantum states, measurements, and channels in detail in the
previous two chapters, we are now ready to study three fundamental tasks in
quantum information processing: quantum teleportation, quantum super-dense
coding, and quantum hypothesis testing. Quantum hypothesis testing has been
studied since the late 1960s, with the aim of generalizing (classical) statistical
hypothesis testing to the quantum setting. The discovery of quantum teleportation
and super-dense coding in the early 1990s demonstrated the practical advantages
that entanglement could allow for with respect to communication, and it contributed
to the rise of quantum information science as a prominent field of study in both
theoretical and experimental physics.
All of the tasks that we study in this chapter provide us with prototypes of some
of the quantum communication scenarios that we consider in Parts II and III of this
book. In particular, listed below are the tasks and protocols that we study in this
chapter and how they are connected to the communication tasks that we study later.
• Quantum teleportation (Section 5.1) is connected to the task of quantum
communication (Chapter 14), and in particular to LOCC-assisted quantum
communication (Chapter 19).
• Quantum super-dense coding (Section 5.2) is connected to the task of entangle-
ment-assisted classical communication (Chapter 11).
• Quantum hypothesis testing (Section 5.3), in particular state discrimination
Before stating the basic teleportation protocol, let us start by introducing a key
element of the protocol, the Bell measurement.
$\mathbb{1}_2 \otimes \mathbb{1}_2$, so that the set $\{|\Phi_{z,x}\rangle\!\langle\Phi_{z,x}| : z, x \in \{0,1\}\}$ is indeed a POVM. Furthermore,
the classical bits 𝑥 and 𝑧 can be viewed as being the outcomes of the measurement.
We can write the usual computational basis states for two qubits in terms of the
Bell states as
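\begin{align}
|0,0\rangle &= \frac{1}{\sqrt{2}}(|\Phi_{0,0}\rangle + |\Phi_{1,0}\rangle), \tag{5.1.6}\\
|0,1\rangle &= \frac{1}{\sqrt{2}}(|\Phi_{0,1}\rangle + |\Phi_{1,1}\rangle), \tag{5.1.7}\\
|1,0\rangle &= \frac{1}{\sqrt{2}}(|\Phi_{0,1}\rangle - |\Phi_{1,1}\rangle), \tag{5.1.8}\\
|1,1\rangle &= \frac{1}{\sqrt{2}}(|\Phi_{0,0}\rangle - |\Phi_{1,0}\rangle). \tag{5.1.9}
\end{align}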
We now detail the teleportation protocol; see Figure 5.1 for a circuit diagram
depicting the protocol. The protocol starts with Alice and Bob sharing two qubits
in the state |Φ⟩ 𝐴𝐵 . Alice has an additional qubit, which is in the state |𝜓⟩ 𝐴′ , that
she wishes to teleport to Bob, where
|𝜓⟩ 𝐴′ = 𝛼|0⟩ 𝐴′ + 𝛽|1⟩ 𝐴′ , 𝛼, 𝛽 ∈ C, |𝛼| 2 + |𝛽| 2 = 1. (5.1.10)
Figure 5.1: Circuit diagram for the quantum teleportation protocol. The
protocol accomplishes the task of sending a quantum state 𝜓 from Alice to
Bob using a shared entangled state and two bits of classical communication.
The outcomes (𝑥, 𝑧) of Alice’s Bell measurement on her qubits 𝐴 and 𝐴′ are
communicated to Bob, who applies the unitary 𝑍 𝑧 𝑋 𝑥 on his qubit to transform
it to the state 𝜓 that Alice wished to send.
The state |𝜓⟩ 𝐴′ is arbitrary and need not be known to either Alice or Bob. The
overall joint state between Alice and Bob at the start of the protocol is therefore
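\begin{equation}
|\psi\rangle_{A'} \otimes |\Phi\rangle_{AB} = \frac{1}{\sqrt{2}}\big(\alpha|0,0,0\rangle_{A'AB} + \alpha|0,1,1\rangle_{A'AB} + \beta|1,0,0\rangle_{A'AB} + \beta|1,1,1\rangle_{A'AB}\big). \tag{5.1.11}
\end{equation}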
Alice and Bob then proceed as follows.
1. Alice performs a Bell measurement on her two qubits 𝐴′ and 𝐴. To determine
the measurement outcomes and their probabilities, it is helpful to write down the
initial state (5.1.11) in the Bell basis on Alice’s systems. Using (5.1.6)–(5.1.9),
we find that
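\begin{align}
|\psi\rangle_{A'} \otimes |\Phi\rangle_{AB} &= \frac{1}{2}\Big(|\Phi_{0,0}\rangle_{A'A} \otimes (\alpha|0\rangle_B + \beta|1\rangle_B) + |\Phi_{1,0}\rangle_{A'A} \otimes (\alpha|0\rangle_B - \beta|1\rangle_B) \notag\\
&\qquad + |\Phi_{0,1}\rangle_{A'A} \otimes (\alpha|1\rangle_B + \beta|0\rangle_B) + |\Phi_{1,1}\rangle_{A'A} \otimes (\alpha|1\rangle_B - \beta|0\rangle_B)\Big) \tag{5.1.12}\\
&= \frac{1}{2}\Big(|\Phi_{0,0}\rangle_{A'A} \otimes |\psi\rangle_B + |\Phi_{1,0}\rangle_{A'A} \otimes Z_B|\psi\rangle_B \notag\\
&\qquad + |\Phi_{0,1}\rangle_{A'A} \otimes X_B|\psi\rangle_B + |\Phi_{1,1}\rangle_{A'A} \otimes X_B Z_B|\psi\rangle_B\Big). \tag{5.1.13}
\end{align}
From this, it is clear that each outcome $(x, z) \in \{0,1\}^2$ of the Bell measurement
occurs with equal probability $\frac{1}{4}$ and that the state of Bob’s qubit after the
measurement is $X_B^{x} Z_B^{z}|\psi\rangle_B$.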
Exercise 5.1
Verify (5.1.12) and (5.1.13).
2. Alice communicates to Bob the two classical bits 𝑥 and 𝑧 resulting from the
Bell measurement.
3. Upon receiving the measurement outcomes, Bob performs 𝑋 𝑥 and then 𝑍 𝑧 on
his qubit. The resulting state of Bob’s qubit is |𝜓⟩.
Although we have described the teleportation protocol using a pure state |𝜓⟩ 𝐴′
as the state being teleported, the protocol applies just as well if the state to be
teleported is a mixed state 𝜌 𝐴′ .
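The three steps of the protocol can be traced through in a small state-vector simulation. The sketch below assumes the convention $|\Phi_{z,x}\rangle = (Z^z X^x \otimes \mathbb{1})|\Phi\rangle$ for the Bell states, which is consistent with (5.1.6)–(5.1.9); the function names and the test state are illustrative.

```python
import numpy as np
from numpy.linalg import matrix_power as mpow

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0 + 0j, -1.0])
PHI = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)

def bell(z, x):
    """Bell state |Phi_{z,x}>; the convention (Z^z X^x (x) 1)|Phi> is an assumption."""
    return np.kron(mpow(Z, z) @ mpow(X, x), I2) @ PHI

def teleport(psi):
    """Return {(z, x): (probability, Bob's corrected state)} for input |psi>_{A'}."""
    joint = np.kron(psi, PHI)          # systems ordered as A', A, B
    results = {}
    for z in (0, 1):
        for x in (0, 1):
            # Step 1: project A'A onto |Phi_{z,x}>, leaving Bob's unnormalized state.
            bob = bell(z, x).conj() @ joint.reshape(4, 2)
            prob = np.vdot(bob, bob).real
            bob = bob / np.sqrt(prob)
            # Steps 2-3: Bob receives (z, x) and applies X^x, then Z^z.
            corrected = mpow(Z, z) @ mpow(X, x) @ bob
            results[(z, x)] = (prob, corrected)
    return results

psi = np.array([0.6, 0.8j])            # arbitrary normalized test state
outcomes = teleport(psi)
for (z, x), (prob, state) in outcomes.items():
    print((z, x), round(prob, 3), np.allclose(state, psi))
```

Each of the four outcomes should occur with probability $\frac{1}{4}$, and Bob's corrected state should equal $|\psi\rangle$ in every case, in line with (5.1.13).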
The teleportation protocol for qubits described above can be easily generalized to
qudits using the Heisenberg–Weyl operators {𝑊𝑧,𝑥 : 0 ≤ 𝑧, 𝑥 ≤ 𝑑 − 1} introduced
in Definition 3.7. Specifically, recall from (3.2.58) that we define the two-qudit
Bell states in terms of the Heisenberg–Weyl operators as follows:
The state |𝜓⟩ 𝐴′ is the one to be teleported to Bob’s system. The starting joint state
on the three qudits 𝐴′, 𝐴, and 𝐵 is
\begin{equation}
|\psi\rangle_{A'} \otimes |\Phi\rangle_{AB} = \frac{1}{\sqrt{d}}\sum_{i,j=0}^{d-1} c_i\,|i,j,j\rangle_{A'AB}. \tag{5.1.17}
\end{equation}
Alice then performs a Bell measurement on her two qudits. By writing the Bell
states |Φ𝑧,𝑥 ⟩ 𝐴′ 𝐴 as
\begin{equation}
|\Phi_{z,x}\rangle_{A'A} = \frac{1}{\sqrt{d}}\sum_{k=0}^{d-1} \mathrm{e}^{\frac{2\pi\mathrm{i}(k+x)z}{d}}\,|k+x,k\rangle_{A'A}, \tag{5.1.18}
\end{equation}
for each outcome (𝑧, 𝑥) ∈ {0, . . . , 𝑑 − 1}2 , we use (3.3.22) to find that the corre-
sponding (unnormalized) post-measurement state of Bob’s qudit is
Figure 5.2: The qudit teleportation protocol, depicted on the left, can be
regarded as an LOCC channel T 𝐴′ 𝐴𝐵→𝐵 with the input states 𝜌 𝐴′ and |Φ⟩⟨Φ| 𝐴𝐵
and the output state 𝜌 𝐵 , as shown on the right.
Exercise 5.2
Combine the quantum channels in (5.1.27)–(5.1.31) according to (5.1.26) and
conclude that the channel T 𝐴′ 𝐴𝐵→𝐵 can be written as
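\begin{equation}
\mathcal{T}_{A'AB\to B}(\sigma_{A'AB}) = \sum_{z,x=0}^{d-1} \operatorname{Tr}_{A'A}\!\left[\Phi^{z,x}_{A'A}\,W^{z,x}_B(\sigma_{A'AB})(W^{z,x}_B)^{\dagger}\right] \tag{5.1.32}
\end{equation}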
for every state 𝜎𝐴′ 𝐴𝐵 . Verify that, for the input state 𝜎𝐴′ 𝐴𝐵 = 𝜌 𝐴′ ⊗ Φ 𝐴𝐵 , we
get T 𝐴′ 𝐴𝐵→𝐵 (𝜌 𝐴′ ⊗ Φ 𝐴𝐵 ) = 𝜌 𝐵 , as expected.
We can also connect with the previously defined notion of LOCC simulation of
a quantum channel (Definition 4.25). That is, we can understand the teleportation
protocol and (5.1.24) as demonstrating that the identity channel is LOCC simulable
with associated resource state given by the maximally entangled state. Now, by
using the teleportation protocol in conjunction with a quantum channel N 𝐴→𝐵 ,
we find that every quantum channel N 𝐴→𝐵 is LOCC simulable with associated
resource state given by the maximally entangled state of an appropriate Schmidt
rank. To see this, observe that the channel N 𝐴→𝐵 can be trivially written as
N 𝐴→𝐵 = N𝐵′ →𝐵 ◦ id 𝐴→𝐵′ , where 𝐵′ is an auxiliary system with the same dimension
as 𝐴. Then, by (5.1.25), we can simulate the identity channel id 𝐴→𝐵′ using the usual
teleportation protocol, so that the overall LOCC channel L is N𝐵′ →𝐵 ◦ T 𝐴𝐴′ 𝐵′ →𝐵′
and the resource state is |Φ⟩⟨Φ| 𝐴′ 𝐵′ , with the dimension of 𝐴′ equal to the dimension
of 𝐴. This is illustrated in Figure 5.3. We can also simulate N via teleportation
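in a different manner, in which Alice locally applies the channel $\mathcal{N}$ to her input
state $\rho_A$, then teleports the resulting state to Bob. Mathematically, we write this
as $\mathcal{N}_{A\to B} = \operatorname{id}_{\hat{A}\to B} \circ \mathcal{N}_{A\to\hat{A}} = \mathcal{T}_{\hat{A}\tilde{A}B\to B} \circ \mathcal{N}_{A\to\hat{A}}$, where $\hat{A}$, $\tilde{A}$ are auxiliary systems
with the same dimension as $B$. We thus have the following two ways to
represent the action of the channel $\mathcal{N}$ using teleportation:
\begin{align}
\mathcal{N}_{A\to B}(\rho_A) &= \mathcal{N}_{B'\to B}\big(\mathcal{T}_{AA'B'\to B'}(\rho_A \otimes |\Phi\rangle\!\langle\Phi|_{A'B'})\big) \tag{5.1.33}\\
&= \mathcal{T}_{\hat{A}\tilde{A}B\to B}\big(\mathcal{N}_{A\to\hat{A}}(\rho_A) \otimes |\Phi\rangle\!\langle\Phi|_{\tilde{A}B}\big). \tag{5.1.34}
\end{align}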
Depending on whether the input dimension is smaller than the output dimension
of the channel, there can be a more economical way to perform the simulation.
If the channel’s output dimension is smaller than its input dimension, then the
more economical way to simulate the channel is for Alice to apply N 𝐴→𝐵 first and
then for Alice and Bob to perform the teleportation protocol. In this way, they
Now, let us take the maximally entangled state |Φ⟩ 𝐴𝐵 and define the states
\begin{equation}
|\Phi^{g}\rangle_{AB} := (U^{g}_A \otimes \mathbb{1}_B)|\Phi\rangle_{AB}. \tag{5.1.37}
\end{equation}
We call these states the generalized Bell states. We see that these states are a direct
generalization of the usual qudit Bell states in (3.2.58).
Exercise 5.3
Using (5.1.36), prove that
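\begin{equation}
\frac{1}{|G|}\sum_{g\in G} |\Phi^{g}\rangle\!\langle\Phi^{g}|_{AB} = \frac{\mathbb{1}_{AB}}{d^2}. \tag{5.1.38}
\end{equation}
\begin{equation}
M^{g}_{AB} := \frac{d^2}{|G|}|\Phi^{g}\rangle\!\langle\Phi^{g}|_{AB}, \tag{5.1.39}
\end{equation}
we find that
\begin{equation}
\sum_{g\in G} M^{g}_{AB} = \mathbb{1}_{AB}. \tag{5.1.40}
\end{equation}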
We see that each outcome occurs with probability $\frac{1}{|G|}$, and the post-measurement
state of Bob’s qudit is $U^{g\dagger}_B|\psi\rangle_B$.
2. Alice communicates the outcome 𝑔 resulting from the measurement to Bob.
3. Upon receiving the measurement outcome, Bob applies 𝑈 𝑔 on his qudit. The
resulting state of Bob’s qudit is |𝜓⟩𝐵 .
Observe that the original qudit teleportation protocol is a special case of the
generalized teleportation protocol outlined above, in which the group 𝐺 is Z𝑑 × Z𝑑
and its irreducible projective unitary representation {𝑈 𝑔 }𝑔∈𝐺 is taken to be the set
of Heisenberg–Weyl operators. Then, the generalized Bell states Φ𝑔 are precisely
the qudit Bell states $\Phi^{z,x}$ defined in (3.2.58). Furthermore, since $|G| = d^2$, the
POVM elements $M^{g} = \frac{d^2}{|G|}|\Phi^{g}\rangle\!\langle\Phi^{g}|$ are the projections onto the qudit Bell states,
Exercise 5.4
By following a development similar to that in (5.1.26)–(5.1.31) and Exercise 5.2,
verify that the one-way LOCC channel corresponding to the generalized
teleportation protocol presented above has the following form analogous to
(5.1.32):
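\begin{equation}
\mathcal{T}^{G}_{A'AB\to B}(\sigma_{A'AB}) = \sum_{g\in G} \operatorname{Tr}_{A'A}\!\left[M^{g}_{A'A}\,U^{g}_B(\sigma_{A'AB})U^{g\dagger}_B\right] \tag{5.1.46}
\end{equation}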
the |Φ⟩⟨Φ| 𝐴′ 𝐴 outcome of the Bell measurement as a “success” and the rest of the
outcomes as a “failure”, then we obtain post-selected teleportation. Post-selected
teleportation is probabilistic by definition. In particular, from (5.1.47), we see that
it succeeds with probability $\frac{1}{d^2}$.
Let us now consider the even more general protocol depicted in Figure 5.4. Let 𝐺 be
a finite group. As before, Alice and Bob start with a shared pair of qudits in the state
|Φ⟩⟨Φ| 𝐴𝐵 , while Alice holds an extra qudit in the state 𝜌 𝐴′ to be teleported to Bob.
Unlike the teleportation protocol above, however, Bob applies the channel N to his
qudit before he receives the results of the Bell measurement. Once he receives the
measurement results, he applies the unitary operation 𝑉 𝑔 from the set {𝑉 𝑔 : 𝑔 ∈ 𝐺}
of pre-determined unitary operators constituting a projective unitary representation
of 𝐺.
The initial tripartite joint state of the protocol is
𝜌 𝐴′ ⊗ ( 1 𝐴 ⊗ N𝐵 )(|Φ⟩⟨Φ| 𝐴𝐵 ). (5.1.48)
Alice performs the same generalized Bell measurement as before on $A$ and $A'$, which
we recall has the POVM $\{\Pi^{g}_{AA'}\}_{g\in G}$ with elements $\Pi^{g}_{AA'}$ defined in (5.1.39). Recall
that this POVM corresponds to an irreducible projective unitary representation of
𝐺 given by {𝑈 𝑔 }𝑔∈𝐺 . Since the Bell measurement operates only on the systems
𝐴′ and 𝐴, we can bring them inside the action of N on Bob’s share of the state
|Φ⟩⟨Φ| 𝐴𝐵 . This means that the analysis for the qudit teleportation protocol from
Section 5.1.2.1 carries over exactly in this case. In other words, each outcome
$g \in G$ occurs with an equal probability of $\frac{1}{|G|}$, and the post-measurement state on
Bob’s qudit corresponding to the outcome 𝑔 is
N(𝑈 𝑔† 𝜌𝑈 𝑔 ). (5.1.49)
After Bob applies the unitary 𝑉 𝑔 , the state of Bob’s qudit at the end of the protocol
is
𝑉 𝑔 N(𝑈 𝑔† 𝜌𝑈 𝑔 )𝑉 𝑔† . (5.1.50)
This occurs with probability $\frac{1}{|G|}$ for all $g \in G$.
Now, observe that the state (5.1.48) can be written as
\begin{equation}
\rho_{A'} \otimes \Phi^{\mathcal{N}}_{AB}, \tag{5.1.51}
\end{equation}
where we recall that $\Phi^{\mathcal{N}}_{AB} = (\operatorname{id}_A \otimes \mathcal{N}_B)(|\Phi\rangle\!\langle\Phi|_{AB})$ is the Choi state of the channel
$\mathcal{N}$. In other words, the protocol depicted in Figure 5.4 is mathematically equivalent
to the teleportation protocol over a group $G$ outlined above, except that instead of
starting with the shared maximally entangled state $|\Phi\rangle_{AB}$, Alice and Bob start with
the shared state $\Phi^{\mathcal{N}}_{AB}$. This equivalent protocol is depicted in Figure 5.5.
If Bob discards the classical message 𝑔 at the end of the protocol, then the state
of his system is given by
\begin{equation}
\frac{1}{|G|}\sum_{g\in G} V^{g}\mathcal{N}(U^{g\dagger}\rho U^{g})V^{g\dagger}. \tag{5.1.52}
\end{equation}
Recall from (4.4.127) that this state is simply the output state of the twirl of N
with respect to the unitary representations {𝑈 𝑔 }𝑔∈𝐺 and {𝑉 𝑔 }𝑔∈𝐺 , because the
twirled channel N is a symmetrized version of the original channel N. Thus, the
generalized teleportation protocol gives an explicit procedure for implementing a
channel twirl by implementing the teleportation protocol using the Choi state of the
channel as the resource state.
Suppose now that the channel N satisfies the group covariance property from
Definition 4.18 for all 𝑔 ∈ 𝐺. In this case, we see that N(𝑈 𝑔† 𝜌𝑈 𝑔 ) = 𝑉 𝑔† N(𝜌)𝑉 𝑔
for every outcome 𝑔 of Alice’s generalized Bell measurement. Therefore, after Bob
applies 𝑉 𝑔 , the state of his qudit is N(𝜌). This generalized teleportation protocol
therefore effectively applies the channel N to the state 𝜌 𝐴′ and transfers the resulting
state to Bob’s qudit; see Figure 5.6. We say that the teleportation protocol simulates
the action of the channel N on the input state 𝜌 𝐴′ . As stated earlier, in this sense,
the original teleportation protocol can be regarded as a way to simulate the identity
channel.
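The simulation argument above can be checked numerically for qubits. The following is a minimal sketch, assuming that 𝐺 is the Pauli group modulo phases, that 𝑈^𝑔 = 𝑉^𝑔 are the Pauli operators, and that the Bell vectors are (𝑈^𝑔 ⊗ 1)|Φ⁺⟩; these conventions, the function names, and the two example channels are illustrative choices and are not taken from the text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Z, Z @ X]  # assumed choice U^g = V^g, one operator per outcome g
PHI_PLUS = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)


def choi_state(channel):
    """Choi state (id_A ⊗ N_B)(|Φ+><Φ+|) of a qubit channel given as a linear map."""
    out = np.zeros((4, 4), dtype=complex)
    for i in range(2):
        for j in range(2):
            e_ij = np.zeros((2, 2), dtype=complex)
            e_ij[i, j] = 1.0
            out += 0.5 * np.kron(e_ij, channel(e_ij))
    return out


def teleport(rho, resource):
    """Return one (probability, corrected state on B) pair per Bell outcome g."""
    joint = np.kron(rho, resource)  # systems ordered A', A, B
    results = []
    for P in PAULIS:
        bell_g = np.kron(P, I2) @ PHI_PLUS            # Bell vector on A'A
        K = np.kron(bell_g.conj().reshape(1, 4), I2)  # <Φ_g|_{A'A} ⊗ 1_B
        post = K @ joint @ K.conj().T                 # unnormalized state on B
        prob = np.trace(post).real
        results.append((prob, P @ (post / prob) @ P.conj().T))  # Bob applies V^g
    return results


def depolarizing(r, p=0.3):
    """Pauli-covariant example channel: r -> (1 - p) r + p Tr[r] 1/2."""
    return (1 - p) * r + p * np.trace(r) * I2 / 2


def amplitude_damping(r, gamma=0.4):
    """Example channel that is not Pauli covariant."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
    return K0 @ r @ K0.conj().T + K1 @ r @ K1.conj().T


rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])

# Covariant case: every outcome leaves Bob with N(rho).
for prob, out in teleport(rho, choi_state(depolarizing)):
    print(round(prob, 6), np.allclose(out, depolarizing(rho)))

# General case: discarding g yields the twirled channel applied to rho, as in (5.1.52).
average = sum(prob * out for prob, out in teleport(rho, choi_state(amplitude_damping)))
twirl = sum(P @ amplitude_damping(P.conj().T @ rho @ P) @ P.conj().T for P in PAULIS) / 4
print(np.allclose(average, twirl))
```

With the maximally entangled state itself as the resource, the same routine should return 𝜌 for every outcome, which is the sense in which ordinary teleportation simulates the identity channel.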
The notion of simulation of a channel by a teleportation protocol can be extended
to a one-way LOCC channel L→ , as introduced in Definition 4.22, to obtain the
following definition.
input state 𝜌 𝐴 ,
\mathcal{N}(\rho_A) = \mathcal{L}^{\rightarrow}_{RAB' \to B}(\rho_A \otimes \omega_{RB'}). \quad (5.1.53)
Figure 5.8: Circuit diagram for the super-dense coding protocol. Using the bits
(𝑧, 𝑥) that she wishes to send, Alice applies the appropriate Pauli 𝑋 and/or 𝑍
operators to her share 𝐴 of the maximally entangled qubits that are in the state
|Φ+ ⟩ 𝐴𝐵 and sends it through a noiseless qubit channel to Bob. Bob then performs
a Bell measurement on the two qubits to recover the encoded bits (𝑧, 𝑥).
classical communication.
Let us now go through the quantum super-dense coding protocol. See Figure
5.8 for a depiction of the protocol. Alice wishes to send two classical bits
(𝑧, 𝑥) ∈ {0, 1}² to Bob by making use of a shared pair of qubits in the maximally
entangled state |Φ+ ⟩ 𝐴𝐵 and one use of a noiseless qubit channel. Depending on the
bits she wishes to send, she performs the following operations on her share of the
entangled qubits:
• To send the bits (0, 0), she does nothing.
• To send the bits (0, 1), she applies the Pauli 𝑋 operator to her qubit, transforming
the joint state |Φ+ ⟩ 𝐴𝐵 to |Ψ+ ⟩ 𝐴𝐵 .
• To send the bits (1, 0), she applies the Pauli 𝑍 operator to her qubit, transforming
the joint state |Φ+ ⟩ 𝐴𝐵 to |Φ− ⟩ 𝐴𝐵 .
• To send the bits (1, 1), she applies the 𝑋 operator followed by the 𝑍 operator,
so that the joint state becomes |Ψ− ⟩ 𝐴𝐵 .
After applying the appropriate operation, Alice sends her qubit to Bob with the one
allowed use of a noiseless qubit channel.
Bob now holds both qubits, and they are in one of the four Bell states
depending on the bits (𝑧, 𝑥) Alice sent. Bob then performs a Bell measurement on
his two qubits, and the outcome of this measurement consists precisely of the bits
(𝑧, 𝑥) that Alice wished to send.
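The protocol just described can be traced through in a few lines. This is a minimal sketch with state vectors, assuming the ordering |𝑎𝑏⟩ ↦ index 2𝑎 + 𝑏 with Alice's qubit first; the function names `alice_encode` and `bob_decode` are illustrative.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

# Bell basis, labelled by the bits (z, x) that each state encodes.
BELL_BASIS = {
    (0, 0): np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2),   # |Φ+>
    (0, 1): np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2),   # |Ψ+>
    (1, 0): np.array([1.0, 0.0, 0.0, -1.0]) / np.sqrt(2),  # |Φ->
    (1, 1): np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2),  # |Ψ->
}


def alice_encode(z, x):
    """Apply X^x and then Z^z to Alice's qubit of |Φ+>_AB."""
    op = np.linalg.matrix_power(Z, z) @ np.linalg.matrix_power(X, x)
    return np.kron(op, I2) @ BELL_BASIS[(0, 0)]


def bob_decode(state):
    """Bell measurement: return the most likely outcome and all outcome probabilities."""
    probs = {bits: abs(np.vdot(vec, state)) ** 2 for bits, vec in BELL_BASIS.items()}
    return max(probs, key=probs.get), probs


for z in (0, 1):
    for x in (0, 1):
        decoded, probs = bob_decode(alice_encode(z, x))
        print((z, x), "->", decoded, round(probs[decoded], 6))
```

Each encoded state coincides with one Bell vector, so the corresponding outcome occurs with probability one and the two bits are recovered without error.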
The super-dense coding protocol has a simple generalization to the qudit case.
In this case, Alice and Bob share the qudit Bell state |Φ⟩ 𝐴𝐵 before communication
begins, and by applying one of the 𝑑² Heisenberg–Weyl operators 𝑊𝑧,𝑥 from (3.2.48)
on her share of the state, Alice can rotate the global state to one of the 𝑑² qudit Bell
states in (3.2.58). After Alice sends her share of the encoded state over a noiseless
qudit channel to Bob, Bob can then perform the qudit Bell measurement to decode
which of the 𝑑² messages Alice transmitted.
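The qudit generalization can be sketched in the same way. The convention used below, 𝑊𝑧,𝑥 = 𝑍(𝑧)𝑋(𝑥) with 𝑋(𝑥)|𝑘⟩ = |𝑘 + 𝑥 mod 𝑑⟩ and 𝑍(𝑧)|𝑘⟩ = e^{2πi𝑧𝑘/𝑑}|𝑘⟩, is an assumption made for illustration and may differ from (3.2.48) by ordering or phases; the conclusion that the 𝑑² encoded states are orthonormal does not depend on that choice.

```python
import numpy as np


def heisenberg_weyl(d, z, x):
    """Assumed convention: W_{z,x} = Z(z) X(x)."""
    omega = np.exp(2j * np.pi / d)
    shift = np.roll(np.eye(d), x, axis=0)          # X(x)|k> = |k + x mod d>
    phase = np.diag(omega ** (z * np.arange(d)))   # Z(z)|k> = omega^{zk} |k>
    return phase @ shift


def bell_state(d, z, x):
    """(W_{z,x} ⊗ 1)|Φ>, with |Φ> = d^{-1/2} sum_k |k, k>."""
    phi = np.eye(d).reshape(-1) / np.sqrt(d)
    return np.kron(heisenberg_weyl(d, z, x), np.eye(d)) @ phi


def decode(d, state):
    """Qudit Bell measurement: return the label (z, x) of the most likely outcome."""
    labels = [(z, x) for z in range(d) for x in range(d)]
    probs = [abs(np.vdot(bell_state(d, z, x), state)) ** 2 for z, x in labels]
    return labels[int(np.argmax(probs))]


d = 3
states = np.array([bell_state(d, z, x) for z in range(d) for x in range(d)])
gram = states.conj() @ states.T
print(np.allclose(gram, np.eye(d * d)))  # the d^2 encoded states are orthonormal
print(decode(d, bell_state(d, 2, 1)))
```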
2. Type-II Error: Bob guesses “𝜌”, but the system is in the state 𝜎. The probability
of this occurring is Tr[𝑀 𝜌 𝜎].
In order to obtain an optimal strategy for Bob, there are two cases that are
typically considered.
• Symmetric Case: Also called quantum state discrimination, in this setting, Bob
has some prior knowledge about the state he is given. Specifically, he knows
that the state is 𝜌 with probability 𝜆 ∈ [0, 1] and 𝜎 with probability 1 − 𝜆. The
goal is then to minimize the average of the type-I and type-II error probabilities
with respect to this probability distribution. In other words, letting 𝑀 ≡ 𝑀 𝜌 ,
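the goal is to minimize the function
p_{\text{err}}(\lambda, \rho, \sigma, M) := \lambda \operatorname{Tr}[(\mathbb{1} - M)\rho] + (1 - \lambda) \operatorname{Tr}[M\sigma]. \quad (5.3.1)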
Exercise 5.5
Consider a very simple hypothesis testing strategy in which Bob discards the
state of the quantum system and simply guesses “𝜌” with some probability
𝑞 ∈ [0, 1] and “𝜎” with probability 1 − 𝑞.
1. What is the POVM corresponding to this strategy?
2. Evaluate the type-I and type-II error probabilities for this strategy.
3. If, in the symmetric setting, the prior probability for the state 𝜌 is 𝜆 ∈ [0, 1],
then evaluate the error probability in (5.3.1) for this strategy.
Now, suppose that Bob is given several copies, say 𝑛 ≥ 1, of a quantum system,
each one of which is either in the state 𝜌 or the state 𝜎. His strategy to determine the
state can now make use of these multiple copies in an adaptive manner, for example,
and could allow the error probabilities to go below the “single-shot” (𝑛 = 1) error
probabilities defined above. Since Bob ultimately has to make a decision between 𝜌
and 𝜎, his strategy is still described by a two-outcome POVM, which we denote by
{𝑀_𝜌^{(𝑛)}, 𝑀_𝜎^{(𝑛)}}. This setting of hypothesis testing with multiple copies is depicted in
Figure 5.10. The type-I and type-II error probabilities are defined in an analogous
manner as before. Specifically, the type-I error is Tr[𝑀_𝜎^{(𝑛)} 𝜌^{⊗𝑛}] and the type-II error
is Tr[𝑀_𝜌^{(𝑛)} 𝜎^{⊗𝑛}]. In the symmetric case, if 𝜆 ∈ [0, 1] is the probability that each
system is in the state 𝜌, then the error probability is
Exercise 5.6
Consider states 𝜌 and 𝜎 along with a POVM {𝑀0 , 𝑀1 } representing a strategy
for hypothesis testing of a single copy of the quantum system, where the outcome
“0” corresponds to 𝜌 and the outcome “1” corresponds to 𝜎. Let 𝜆 ∈ [0, 1] be
the prior probability for 𝜌, and let 𝑛 ≥ 2. Construct the POVM {𝑀 𝜌(𝑛) , 𝑀𝜎(𝑛) },
and evaluate the type-I and type-II error probabilities for the following two
decision strategies for hypothesis testing of 𝑛 copies of the quantum system.
1. The majority-vote decision strategy: (1) Measure each system according
to the POVM {𝑀0 , 𝑀1 }, and let 𝑁𝑥 be the number of times the outcome
𝑥 occurs. (2) If 𝑁0 > 𝑁1 , guess “𝜌”, and if 𝑁1 > 𝑁0 , guess “𝜎”. If 𝑛 is
even and 𝑁0 = 𝑁1 , then guess “𝜌” with probability 𝑞 ∈ [0, 1] and guess
“𝜎” with probability 1 − 𝑞.
2. The unanimous-vote decision strategy: (1) Measure each system according
to the POVM {𝑀0 , 𝑀1 }, and let 𝑁𝑥 be the number of times the outcome 𝑥
Given quantum states 𝜌 and 𝜎, the goal of symmetric hypothesis testing, also known
as quantum state discrimination, is to devise a measurement strategy that minimizes
the error probability defined in (5.3.1), where 𝜆 ∈ [0, 1] is the probability that
the state is 𝜌 and 1 − 𝜆 is the probability that the state is 𝜎. The value of the
corresponding optimization problem in (5.3.2) is
Exercise 5.7
Show that 𝑝 ∗err (𝜆, 𝜌, 𝜎) can be evaluated using a semi-definite program. Then,
using strong duality, prove that an alternate expression for 𝑝 ∗err (𝜆, 𝜌, 𝜎) is
Exercise 5.8
Prove that 𝑝 ∗err (𝜆, 𝜌, 𝜎) is isometrically invariant: for every isometry 𝑉,
𝑝 ∗err (𝜆, 𝜌, 𝜎) = 𝑝 ∗err (𝜆, 𝑉 𝜌𝑉 † , 𝑉 𝜎𝑉 † ).
It turns out that 𝑝∗err(𝜆, 𝜌, 𝜎) can be written in terms of the trace norm (Section 2.2.9.2) as
p^*_{\text{err}}(\lambda, \rho, \sigma) = \frac{1}{2}\left(1 - \left\|\lambda\rho - (1 - \lambda)\sigma\right\|_1\right), \quad (5.3.15)
which is an immediate consequence of the following theorem.
and the conditions for an optimal 𝑀 are the same as given above.
Remark: Letting 𝐴 = 𝜆𝜌 and 𝐵 = (1 − 𝜆)𝜎 in the statement of Theorem 5.3, we recognize that
the objective function on the left-hand side of (5.3.16) is equal to 𝑝 err (𝜆, 𝜌, 𝜎, 𝑀) as defined
in (5.3.1). We thus obtain (5.3.15). Note that Theorem 5.3 also gives us a measurement that
achieves the minimal error probability.
where the last inequality follows because 𝑀 ≤ 1. The equality in (5.3.18) and the
inequality in (5.3.19) imply that
Since
∥ 𝐴 − 𝐵∥ 1 = Tr[| 𝐴 − 𝐵|] = Tr[Δ+ ] + Tr[Δ− ] (5.3.21)
and
Δ− = Δ+ + 𝐵 − 𝐴, (5.3.22)
we can write Tr[Δ+ ] as
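\operatorname{Tr}[\Delta_+] = \frac{1}{2}\left(\|A - B\|_1 - \operatorname{Tr}[B - A]\right). \quad (5.3.23)
This means that the objective function on the left-hand side of (5.3.18) can be
bounded from below by ½(Tr[𝐴 + 𝐵] − ∥𝐴 − 𝐵∥₁). We have thus shown that
\operatorname{Tr}[(\mathbb{1} - M)A] + \operatorname{Tr}[MB] \geq \frac{1}{2}\left(\operatorname{Tr}[A + B] - \|A - B\|_1\right) \quad (5.3.24)
for all 𝑀 such that 0 ≤ 𝑀 ≤ 1, which implies that
\inf_{M : 0 \leq M \leq \mathbb{1}} \left\{\operatorname{Tr}[(\mathbb{1} - M)A] + \operatorname{Tr}[MB]\right\} \geq \frac{1}{2}\left(\operatorname{Tr}[A + B] - \|A - B\|_1\right). \quad (5.3.25)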
where the last equality follows because Tr[Π+ Δ+ ] = Tr[Δ+ ] and Tr[Π+ Δ− ] =
0, since Π+ and Δ− are by definition orthogonal. We also used Tr[Λ0 Δ+ ] =
Tr[Λ0 Δ− ] = 0, with these latter equalities following because 0 ≤ Tr[Λ0 Δ± ] ≤
Tr[Π0 Δ± ] = 0. Therefore, using (5.3.23), we find that
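\operatorname{Tr}[A] - \operatorname{Tr}[(\Pi_+ + \Lambda_0)(A - B)] = \frac{1}{2}\left(\operatorname{Tr}[A + B] - \|A - B\|_1\right). \quad (5.3.29)
The operator Π+ + Λ0 thus achieves the bound in (5.3.24), which means that
\inf_{M : 0 \leq M \leq \mathbb{1}} \left\{\operatorname{Tr}[(\mathbb{1} - M)A] + \operatorname{Tr}[MB]\right\} = \frac{1}{2}\left(\operatorname{Tr}[A + B] - \|A - B\|_1\right), \quad (5.3.30)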
so that (5.3.16) is established.
To see that Π+ + Λ0 is the only form for an optimal measurement operator,
suppose that 𝑀 is optimal, i.e., satisfies 0 ≤ 𝑀 ≤ 1 and saturates (5.3.16) with
equality. Then it follows that the two inequalities in (5.3.19) are saturated with
equality, so that
Tr[𝑀 (Δ+ − Δ− )] = Tr[𝑀Δ+ ] = Tr[Δ+ ] = Tr[Π+ Δ+ ]. (5.3.31)
The leftmost equality implies that Tr[𝑀Δ−] = 0, where Δ− is the negative
part of 𝐴 − 𝐵. Since both 𝑀 and Δ− are positive semi-definite and Π− is the
projection onto the support of Δ−, we conclude that 𝑀Π− = Π−𝑀 = 0.
This in turn implies that
𝑀 (Π+ + Π0 ) = (Π+ + Π0 ) 𝑀 = 𝑀, (5.3.32)
which, after sandwiching 𝑀 ≤ 1 on the left and right by Π+ + Π0 , implies that
𝑀 ≤ Π+ + Π0 . Since
0 = Tr[Δ+ (Π+ − 𝑀)] (5.3.33)
= Tr[Δ+ (Π+ − Π+ 𝑀Π+ )] (5.3.34)
= Tr[Δ+ Π+ ( 1 − 𝑀)Π+ ], (5.3.35)
we find that Π+ ( 1 − 𝑀)Π+ = 0. Now consider that Π− ( 1 − 𝑀)Π+ = 0 because
Π− Π+ = 0 and Π− 𝑀Π+ = Π− (Π+ + Π0 ) 𝑀Π+ = 0. So then Π+ ( 1 − 𝑀)Π+ = 0 and
Π− ( 1 − 𝑀)Π+ = 0 imply that ( 1 − 𝑀)Π+ = 0. From this equation, we conclude
that Π+ = Π+ 𝑀 = 𝑀Π+ . By sandwiching Π+ ≤ 1 by 𝑀 and applying operator
monotonicity of the square-root function (see Section 2.2.8.1), we conclude that
Π+ ≤ 𝑀. Combining this operator inequality with the previous one, we conclude
that an optimal 𝑀 satisfies Π+ ≤ 𝑀 ≤ Π+ + Π0 , which is equivalent to 𝑀
decomposing as 𝑀 = Π+ + Λ0 for 0 ≤ Λ0 ≤ Π0 .
The equality in (5.3.17) follows as a rewrite of (5.3.16):
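\frac{1}{2}\|A - B\|_1 = \frac{1}{2}\operatorname{Tr}[A + B] - \inf_{M : 0 \leq M \leq \mathbb{1}} \left\{\operatorname{Tr}[(\mathbb{1} - M)A] + \operatorname{Tr}[MB]\right\} \quad (5.3.36)
= \sup_{M : 0 \leq M \leq \mathbb{1}} \left\{\frac{1}{2}\operatorname{Tr}[A + B] - \left(\operatorname{Tr}[(\mathbb{1} - M)A] + \operatorname{Tr}[MB]\right)\right\} \quad (5.3.37)
= \sup_{M : 0 \leq M \leq \mathbb{1}} \left\{\frac{1}{2}\operatorname{Tr}[B - A] + \operatorname{Tr}[M(A - B)]\right\} \quad (5.3.38)
= \frac{1}{2}\operatorname{Tr}[B - A] + \sup_{M : 0 \leq M \leq \mathbb{1}} \operatorname{Tr}[M(A - B)]. \quad (5.3.39)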
Rearranging this equality, we arrive at (5.3.17). An optimal 𝑀 having the form
Π+ + Λ0 again follows from (5.3.19), (5.3.26)–(5.3.28), and the reasoning given
above. ■
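The content of the theorem is easy to test numerically. The following minimal sketch, with illustrative function names, builds the projection Π+ onto the strictly positive part of 𝐴 − 𝐵 for 𝐴 = 𝜆𝜌 and 𝐵 = (1 − 𝜆)𝜎 (that is, the choice Λ0 = 0) and compares its error probability with the trace-norm expression in (5.3.15).

```python
import numpy as np


def trace_norm(H):
    """Trace norm of a Hermitian matrix: sum of absolute eigenvalues."""
    return np.abs(np.linalg.eigvalsh(H)).sum()


def helstrom(lam, rho, sigma):
    """Return (optimal error probability, M = Π+) for A = λρ and B = (1 - λ)σ."""
    A, B = lam * rho, (1 - lam) * sigma
    evals, evecs = np.linalg.eigh(A - B)
    pos = evecs[:, evals > 1e-12]
    M = pos @ pos.conj().T  # projection onto the strictly positive part of A - B
    p_opt = 0.5 * (np.trace(A + B).real - trace_norm(A - B))
    return p_opt, M


def p_err(lam, rho, sigma, M):
    """Tr[(1 - M) λρ] + Tr[M (1 - λ)σ], where M is the operator for guessing ρ."""
    one = np.eye(len(rho))
    return (lam * np.trace((one - M) @ rho) + (1 - lam) * np.trace(M @ sigma)).real


rho = np.array([[1.0, 0.0], [0.0, 0.0]])   # |0><0|
sigma = np.array([[0.5, 0.5], [0.5, 0.5]])  # |+><+|
p_opt, M = helstrom(0.5, rho, sigma)
print(p_opt, p_err(0.5, rho, sigma, M))          # the two values should agree
print(p_err(0.5, rho, sigma, 0.5 * np.eye(2)))   # a random guess does worse
```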
Exercise 5.9
Let 𝜌 = |𝜓⟩⟨𝜓| ≡ 𝜓 and 𝜎 = |𝜙⟩⟨𝜙| ≡ 𝜙 be pure states, and let 𝜆 ∈ [0, 1].
Show that
p^*_{\text{err}}(\lambda, \psi, \phi) = \frac{1}{2}\left(1 - \sqrt{1 - 4\lambda(1 - \lambda)\,|\langle\psi|\phi\rangle|^2}\right). \quad (5.3.40)
What is a measurement that achieves this optimal error probability?
Observe from (5.3.40) that if |𝜓⟩ and |𝜙⟩ are orthogonal, then 𝑝 ∗err (𝜆, 𝜓, 𝜙) = 0.
Exercise 5.10
Let 𝜌 and 𝜎 be quantum states that are orthogonal, in the sense that Π 𝜌 Π𝜎 =
Π𝜎 Π 𝜌 = 0, where Π 𝜌 and Π𝜎 are the projections onto the support of 𝜌 and
𝜎, respectively (recall (2.2.65)). Prove that the optimal error probability
for discriminating 𝜌 and 𝜎 vanishes, i.e., that 𝑝 ∗err (𝜆, 𝜌, 𝜎) = 0. What is a
measurement achieving this optimal error probability?
Exercise 5.11
Evaluate the optimal error probability 𝑝∗err(𝜆, 𝜌^{iso;𝑝₁}_{𝐴𝐵}, 𝜌^{W;𝑝₂}_{𝐴𝐵}) for discriminating
between the isotropic state 𝜌^{iso;𝑝₁}_{𝐴𝐵}, 𝑝₁ ∈ [0, 1], and the Werner state 𝜌^{W;𝑝₂}_{𝐴𝐵},
𝑝₂ ∈ [0, 1], where 𝜆 ∈ [0, 1].
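p^*_{\text{err}}(\lambda, \rho^{\otimes n}, \sigma^{\otimes n}) = \frac{1}{2}\left(1 - \left\|\lambda\rho^{\otimes n} - (1 - \lambda)\sigma^{\otimes n}\right\|_1\right) \quad (5.3.41)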
is the lowest possible error probability. However, because Bob now has 𝑛 copies of
either 𝜌 or 𝜎, he can perform a discrimination strategy that involves a collective
measurement acting on the 𝑛 copies of the state. This means that the optimal error
probability can generally be lower with 𝑛 ≥ 2 copies than with just one copy.
Exercise 5.12
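Prove that the optimal error probability 𝑝∗err(𝜆, 𝜌^{⊗𝑛}, 𝜎^{⊗𝑛}) for quantum state discrimination
is monotonically non-increasing with 𝑛, i.e., prove that
p^*_{\text{err}}(\lambda, \rho^{\otimes (n+1)}, \sigma^{\otimes (n+1)}) \leq p^*_{\text{err}}(\lambda, \rho^{\otimes n}, \sigma^{\otimes n}) \quad (5.3.42)
for all 𝑛 ≥ 1.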
Given states 𝜌 and 𝜎 and 𝜆 ∈ (0, 1), how does the optimal error probability
𝑝∗err(𝜆, 𝜌^{⊗𝑛}, 𝜎^{⊗𝑛}) behave as the number 𝑛 of copies of the state increases? If 𝜌 ≡ 𝜓
and 𝜎 ≡ 𝜙 are pure states, then because 𝜓^{⊗𝑛} and 𝜙^{⊗𝑛} are both pure states, we can use
(5.3.40) and the expansion ½(1 − √(1 − 4𝑥)) = 𝑥 + 𝑂(𝑥²) to see that the following
approximation holds as 𝑛 becomes large:
p^*_{\text{err}}(\lambda, \psi^{\otimes n}, \phi^{\otimes n}) \approx \lambda(1 - \lambda)\,|\langle\psi|\phi\rangle|^{2n} = \lambda(1 - \lambda)\, 2^{-n\left(-\log_2 |\langle\psi|\phi\rangle|^2\right)}. \quad (5.3.43)
Now, because |⟨𝜓|𝜙⟩|² ∈ [0, 1], we have that − log₂ |⟨𝜙|𝜓⟩|² ≥ 0, which means
that, as 𝑛 becomes large, the optimal error probability decays exponentially to
zero. Does the exponential decay hold more generally? In other words, for
arbitrary states 𝜌 and 𝜎, is it true that 𝑝∗err(𝜆, 𝜌^{⊗𝑛}, 𝜎^{⊗𝑛}) ≈ 2^{−𝑛𝜉(𝜆,𝜌,𝜎)} as 𝑛 becomes
large, where 𝜉(𝜆, 𝜌, 𝜎) = −(1/𝑛) log₂ 𝑝∗err(𝜆, 𝜌^{⊗𝑛}, 𝜎^{⊗𝑛}) is a non-negative asymptotic
error exponent that is independent of 𝑛? To be more precise, does the limit
lim_{𝑛→∞} −(1/𝑛) log₂ 𝑝∗err(𝜆, 𝜌^{⊗𝑛}, 𝜎^{⊗𝑛}) exist, and if so, what is its value?
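For pure states the decay can be tabulated directly from (5.3.40), since |⟨𝜓^{⊗𝑛}|𝜙^{⊗𝑛}⟩|² = |⟨𝜓|𝜙⟩|^{2𝑛}. The sketch below uses illustrative values 𝜆 = 0.3 and |⟨𝜓|𝜙⟩|² = 0.5, and it evaluates the formula in an algebraically equivalent form that avoids floating-point cancellation for large 𝑛.

```python
import numpy as np


def p_err_pure(lam, overlap_sq, n):
    """Optimal error probability (5.3.40) for n copies of two pure states."""
    x = lam * (1 - lam) * overlap_sq ** n
    # 2x / (1 + sqrt(1 - 4x)) equals (1 - sqrt(1 - 4x)) / 2 but is numerically stable.
    return 2 * x / (1 + np.sqrt(1 - 4 * x))


lam, overlap_sq = 0.3, 0.5
for n in (1, 5, 20, 100):
    exact = p_err_pure(lam, overlap_sq, n)
    approx = lam * (1 - lam) * overlap_sq ** n       # right-hand side of (5.3.43)
    print(n, exact / approx, -np.log2(exact) / n)    # ratio and finite-n exponent
```

The ratio in the second column should approach one, and the third column should approach −log₂|⟨𝜓|𝜙⟩|² (equal to one for these values), with a correction of order 1/𝑛 coming from the prefactor 𝜆(1 − 𝜆).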
The following theorem provides positive answers to both questions. The
characterization given below is useful because the quantity 𝑝 ∗err (𝜆, 𝜌 ⊗𝑛 , 𝜎 ⊗𝑛 )
becomes more and more difficult to calculate as 𝑛 increases, so that the asymptotic
error exponent is a helpful characterization of 𝑝 ∗err (𝜆, 𝜌 ⊗𝑛 , 𝜎 ⊗𝑛 ).
where
C(\rho\|\sigma) := \sup_{s \in (0,1)} \left(-\log_2 \operatorname{Tr}[\rho^{s} \sigma^{1-s}]\right). \quad (5.3.45)
That is, 𝐶 (𝜌∥𝜎) is the optimal asymptotic error exponent for symmetric
hypothesis testing of 𝜌 and 𝜎.
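Remark: Theorem 5.4 tells us that, as 𝑛 becomes large, the following approximation holds
p^*_{\text{err}}(\lambda, \rho^{\otimes n}, \sigma^{\otimes n}) \approx 2^{-n C(\rho\|\sigma)}, \quad (5.3.46)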
so that the optimal error probability does indeed decay exponentially with the number 𝑛 of copies
of the state. In particular, the quantum Chernoff divergence is the optimal asymptotic error
exponent for the exponential decay of the error probability as the number 𝑛 of copies increases.
Note that the optimal error exponent in (5.3.44) is independent of the prior probability
distribution. This means that, in the asymptotic limit (i.e., in the limit 𝑛 → ∞), the prior
probability distribution is irrelevant for the optimal error exponent.
We call Theorem 5.4 the quantum Chernoff bound because it is analogous to the Chernoff
bound from (classical) symmetric hypothesis testing, which is the task of discriminating between
two hypotheses, each of which has a corresponding probability distribution (please consult the
Bibliographic Notes in Section 3.4 for details).
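The quantity in (5.3.45) is straightforward to approximate numerically. The following is a minimal sketch that replaces the supremum over 𝑠 ∈ (0, 1) by a maximum over a uniform grid, so it returns a lower estimate of 𝐶(𝜌∥𝜎); the grid size and the function names are illustrative choices, and matrix powers are taken on the support.

```python
import numpy as np


def psd_power(P, s, tol=1e-12):
    """P^s for a positive semi-definite matrix P, with zero eigenvalues kept at zero."""
    w, v = np.linalg.eigh(P)
    ws = np.zeros_like(w)
    ws[w > tol] = w[w > tol] ** s
    return (v * ws) @ v.conj().T


def chernoff_divergence(rho, sigma, num_points=999):
    """Grid approximation of sup_{s in (0,1)} -log2 Tr[rho^s sigma^(1-s)]."""
    best = -np.inf
    for s in np.linspace(0, 1, num_points + 2)[1:-1]:
        q = np.trace(psd_power(rho, s) @ psd_power(sigma, 1 - s)).real
        best = max(best, -np.log2(q) if q > 0 else np.inf)
    return best


ket0 = np.array([[1.0, 0.0], [0.0, 0.0]])
plus = np.array([[0.5, 0.5], [0.5, 0.5]])
print(chernoff_divergence(ket0, plus))  # pure states: -log2 |<0|+>|^2

rho = np.diag([0.9, 0.1])
sigma = np.diag([0.5, 0.5])
print(chernoff_divergence(rho, sigma), chernoff_divergence(sigma, rho))
```

For pure states 𝜌^𝑠 = 𝜌, so the objective does not depend on 𝑠 and the result agrees with the exponent in (5.3.43). The last line illustrates that the quantity is symmetric in its arguments, because exchanging 𝜌 and 𝜎 amounts to the substitution 𝑠 → 1 − 𝑠.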
An important element of the proof of the quantum Chernoff bound is Lemma 5.5
below.
Lemma 5.5
Let 𝐴 and 𝐵 be positive semi-definite operators. Then, for all 𝑠 ∈ (0, 1),
\frac{1}{2}\left(\operatorname{Tr}[A + B] - \|A - B\|_1\right) \leq \operatorname{Tr}[A^{s} B^{1-s}]. \quad (5.3.47)
Proof: Let Δ ≔ 𝐴 − 𝐵, and let Δ+ and Δ− be the positive and negative parts,
respectively, of Δ, so that Δ = Δ+ − Δ− (recall (2.2.66)). Then,
|\Delta| = \Delta_+ + \Delta_-. \quad (5.3.48)
Therefore,
∥ 𝐴 − 𝐵∥ 1 = ∥Δ∥ 1 = Tr[|Δ|] = Tr[Δ+ ] + Tr[Δ− ]. (5.3.49)
In addition, we write
𝐴 + 𝐵 = 𝐴 − 𝐵 + 2𝐵 = Δ+ − Δ− + 2𝐵, (5.3.50)
so that
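\frac{1}{2}\left(\operatorname{Tr}[A + B] - \|A - B\|_1\right)
= \frac{1}{2}\left(\operatorname{Tr}[\Delta_+] - \operatorname{Tr}[\Delta_-] + 2\operatorname{Tr}[B] - \operatorname{Tr}[\Delta_+] - \operatorname{Tr}[\Delta_-]\right) \quad (5.3.51)
= \operatorname{Tr}[B] - \operatorname{Tr}[\Delta_-]. \quad (5.3.52)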
So it suffices to prove that the following inequality holds for all 𝑠 ∈ (0, 1):
Tr[𝐵] − Tr[Δ− ] ≤ Tr[ 𝐴 𝑠 𝐵1−𝑠 ]. (5.3.53)
Using the fact that Δ+ ≥ 0, by definition of the positive part of Δ, we obtain
𝐵 + Δ+ ≥ 𝐵. (5.3.54)
Similarly, using 𝐴 − 𝐵 = Δ+ − Δ− , we obtain
𝐴 + Δ− = 𝐵 + Δ+ ≥ 𝐵. (5.3.55)
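By the operator monotonicity of 𝑥 ↦ 𝑥^𝑠 for 𝑠 ∈ (0, 1) (recall Definition 2.13 and
thereafter), the inequality in (5.3.55) implies that
B^{s} \leq (A + \Delta_-)^{s}. \quad (5.3.56)
Using this, along with the fact that Tr[𝐵] = Tr[𝐵^𝑠 𝐵^{1−𝑠}], we find that
\operatorname{Tr}[B] - \operatorname{Tr}[A^{s} B^{1-s}] = \operatorname{Tr}[(B^{s} - A^{s}) B^{1-s}] \quad (5.3.57)
\leq \operatorname{Tr}[((A + \Delta_-)^{s} - A^{s}) B^{1-s}] \quad (5.3.58)
\leq \operatorname{Tr}[((A + \Delta_-)^{s} - A^{s}) (A + \Delta_-)^{1-s}] \quad (5.3.59)
= \operatorname{Tr}[A] + \operatorname{Tr}[\Delta_-] - \operatorname{Tr}[A^{s} (A + \Delta_-)^{1-s}] \quad (5.3.60)
\leq \operatorname{Tr}[A] + \operatorname{Tr}[\Delta_-] - \operatorname{Tr}[A] \quad (5.3.61)
= \operatorname{Tr}[\Delta_-]. \quad (5.3.62)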
The first inequality follows because 𝐵1−𝑠 ≥ 0 and from (5.3.56). The second
inequality follows because ( 𝐴 + Δ− ) 𝑠 ≥ 𝐴 𝑠 and from (5.3.56) with the substitution
𝑠 → 1 − 𝑠. The last inequality follows because
Tr[ 𝐴 𝑠 ( 𝐴 + Δ− ) 1−𝑠 ] ≥ Tr[ 𝐴 𝑠 𝐴1−𝑠 ] = Tr[ 𝐴], (5.3.63)
Exercise 5.13
Let 𝐴 and 𝐵 be positive semi-definite operators and 𝑠 ∈ (0, 1). Starting with
Lemma 5.5, prove that
\frac{1}{2}\|A - B\|_1 \geq \frac{1}{2}\operatorname{Tr}[A + B] - \left\|A^{s} B^{1-s}\right\|_1. \quad (5.3.65)
(Hint: Recall Theorem 2.10.)
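\frac{1}{2}\|A - B\|_1 \geq \frac{1}{2}\operatorname{Tr}[A + B] - \left\|\sqrt{A}\sqrt{B}\right\|_1 \quad (5.3.66)
= \frac{1}{2}\operatorname{Tr}[A + B] - \sqrt{F(A, B)}, \quad (5.3.67)
where in the second line we let
F(A, B) := \left\|\sqrt{A}\sqrt{B}\right\|_1^2. \quad (5.3.68)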
The quantity 𝐹 ( 𝐴, 𝐵) is called the fidelity between 𝐴 and 𝐵, and it plays a critical role in
the analysis of quantum communication protocols. We study the fidelity function in detail in
Chapter 6.
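Both Lemma 5.5 and the fidelity bound in (5.3.66) can be spot-checked on random positive semi-definite operators. The sketch below is only a numerical sanity check under illustrative choices (dimension 4, a fixed random seed, a grid of values of 𝑠); it is not a substitute for the proofs.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_psd(d):
    """Random positive semi-definite matrix G G^dagger (not normalized)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T


def psd_power(P, s, tol=1e-12):
    w, v = np.linalg.eigh(P)
    ws = np.zeros_like(w)
    ws[w > tol] = w[w > tol] ** s
    return (v * ws) @ v.conj().T


def trace_norm(M):
    """Sum of singular values; valid for non-Hermitian matrices as well."""
    return np.linalg.svd(M, compute_uv=False).sum()


def lemma_gap(A, B, s):
    """Right side minus left side of (5.3.47); non-negative if the lemma holds."""
    lhs = 0.5 * (np.trace(A + B).real - trace_norm(A - B))
    rhs = np.trace(psd_power(A, s) @ psd_power(B, 1 - s)).real
    return rhs - lhs


def fidelity_gap(A, B):
    """Left side minus right side of (5.3.66); non-negative if the bound holds."""
    root_fidelity = trace_norm(psd_power(A, 0.5) @ psd_power(B, 0.5))
    return 0.5 * trace_norm(A - B) - (0.5 * np.trace(A + B).real - root_fidelity)


pairs = [(random_psd(4), random_psd(4)) for _ in range(20)]
min_lemma = min(lemma_gap(A, B, s) for A, B in pairs for s in np.linspace(0.05, 0.95, 19))
min_fidelity = min(fidelity_gap(A, B) for A, B in pairs)
print(min_lemma, min_fidelity)
```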
Proof of the Quantum Chernoff Bound (Theorem 5.4): Since the limit on
the left-hand side of (5.3.44) need not exist a priori, let us define the following
quantities based on the discussion in Section 2.3.1:
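\underline{\xi}(\rho, \sigma) := \liminf_{n \to \infty} -\frac{1}{n} \log_2 p^*_{\text{err}}(\lambda, \rho^{\otimes n}, \sigma^{\otimes n}), \quad (5.3.69)
\overline{\xi}(\rho, \sigma) := \limsup_{n \to \infty} -\frac{1}{n} \log_2 p^*_{\text{err}}(\lambda, \rho^{\otimes n}, \sigma^{\otimes n}). \quad (5.3.70)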
Note that, by definition, we always have \underline{\xi}(𝜌, 𝜎) ≤ \overline{\xi}(𝜌, 𝜎); see (2.3.5). Our goal
is to prove that \underline{\xi}(𝜌, 𝜎) = \overline{\xi}(𝜌, 𝜎) = lim_{𝑛→∞} −(1/𝑛) log₂ 𝑝∗err(𝜆, 𝜌^{⊗𝑛}, 𝜎^{⊗𝑛}) = 𝐶(𝜌∥𝜎).
Now, if 𝜆 is the probability with which 𝜌 is chosen, and 1 − 𝜆 the probability
with which 𝜎 is chosen, then an application of Lemma 5.5, with 𝐴 = 𝜆𝜌 ⊗𝑛 ,
𝐵 = (1 − 𝜆)𝜎 ⊗𝑛 , and 𝑠 ∈ (0, 1), gives the following:
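p^*_{\text{err}}(\lambda, \rho^{\otimes n}, \sigma^{\otimes n}) = \frac{1}{2}\left(1 - \left\|\lambda\rho^{\otimes n} - (1 - \lambda)\sigma^{\otimes n}\right\|_1\right) \quad (5.3.71)
\leq \operatorname{Tr}\left[\lambda^{s} (\rho^{\otimes n})^{s} (1 - \lambda)^{1-s} (\sigma^{\otimes n})^{1-s}\right] \quad (5.3.72)
= \lambda^{s} (1 - \lambda)^{1-s} \operatorname{Tr}[(\rho^{s})^{\otimes n} (\sigma^{1-s})^{\otimes n}] \quad (5.3.73)
= \lambda^{s} (1 - \lambda)^{1-s} \left(\operatorname{Tr}[\rho^{s} \sigma^{1-s}]\right)^{n} \quad (5.3.74)
\leq \left(\operatorname{Tr}[\rho^{s} \sigma^{1-s}]\right)^{n}, \quad (5.3.75)
where the last line follows from the fact that 𝜆^𝑠 (1 − 𝜆)^{1−𝑠} ≤ 1. By taking a negative
logarithm and dividing by 𝑛, this bound becomes
-\frac{1}{n} \log_2 p^*_{\text{err}}(\lambda, \rho^{\otimes n}, \sigma^{\otimes n}) \geq -\log_2 \operatorname{Tr}[\rho^{s} \sigma^{1-s}]. \quad (5.3.76)
Since the bound above holds for all 𝑠 ∈ (0, 1), we obtain the following bound:
\underline{\xi}(\rho, \sigma) = \liminf_{n \to \infty} -\frac{1}{n} \log_2 p^*_{\text{err}}(\lambda, \rho^{\otimes n}, \sigma^{\otimes n}) \quad (5.3.77)
\geq \sup_{s \in (0,1)} \left(-\log_2 \operatorname{Tr}[\rho^{s} \sigma^{1-s}]\right) \quad (5.3.78)
= C(\rho\|\sigma). \quad (5.3.79)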
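The chain of inequalities (5.3.71)–(5.3.75) can be observed at small 𝑛 by computing the optimal error probability exactly from the trace norm. The following is a minimal sketch for two illustrative qubit states and 𝜆 = 0.4; the bound is minimized over a grid of values of 𝑠, which is valid because (5.3.74) holds for each 𝑠 separately.

```python
import numpy as np
from functools import reduce


def psd_power(P, s, tol=1e-12):
    w, v = np.linalg.eigh(P)
    ws = np.zeros_like(w)
    ws[w > tol] = w[w > tol] ** s
    return (v * ws) @ v.conj().T


def tensor_power(P, n):
    return reduce(np.kron, [P] * n)


def p_err_opt(lam, rho, sigma, n):
    """Exact optimal error probability for n copies, via the trace norm as in (5.3.71)."""
    diff = lam * tensor_power(rho, n) - (1 - lam) * tensor_power(sigma, n)
    return 0.5 * (1 - np.abs(np.linalg.eigvalsh(diff)).sum())


def chernoff_upper_bound(lam, rho, sigma, n, num_points=99):
    """min over a grid of s of lam^s (1-lam)^(1-s) (Tr[rho^s sigma^(1-s)])^n, see (5.3.74)."""
    s_grid = np.linspace(0, 1, num_points + 2)[1:-1]
    return min(
        lam ** s * (1 - lam) ** (1 - s)
        * np.trace(psd_power(rho, s) @ psd_power(sigma, 1 - s)).real ** n
        for s in s_grid
    )


rho = np.array([[0.9, 0.1], [0.1, 0.1]])
sigma = np.array([[0.4, -0.2], [-0.2, 0.6]])
lam = 0.4
for n in range(1, 6):
    p = p_err_opt(lam, rho, sigma, n)
    print(n, p, chernoff_upper_bound(lam, rho, sigma, n), -np.log2(p) / n)
```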
For the opposite inequality, we start by observing that it suffices to optimize over
projectors when determining the optimal error probability 𝑝 ∗err (𝜆, 𝜌 ⊗𝑛 , 𝜎 ⊗𝑛 ). (This
is true because one choice of an optimal measurement is a projective measurement,
as shown in the proof of Theorem 5.3, where we can set Λ0 = 0.) Next, suppose
that 𝜌 and 𝜎 have the following spectral decompositions:
\rho = \sum_{i=0}^{d-1} \lambda_i |\psi_i\rangle\langle\psi_i|, \qquad \sigma = \sum_{j=0}^{d-1} \mu_j |\phi_j\rangle\langle\phi_j|, \quad (5.3.80)
where 𝑑 is the dimension of the underlying Hilbert space. Then, for every
projection Π,
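\operatorname{Tr}[(\mathbb{1} - \Pi)\rho] = \sum_{i=0}^{d-1} \lambda_i \operatorname{Tr}[(\mathbb{1} - \Pi)|\psi_i\rangle\langle\psi_i|] \quad (5.3.81)
= \sum_{i=0}^{d-1} \lambda_i \operatorname{Tr}[(\mathbb{1} - \Pi)|\psi_i\rangle\langle\psi_i|(\mathbb{1} - \Pi)] \quad (5.3.82)
= \sum_{i,j=0}^{d-1} \lambda_i \operatorname{Tr}[|\phi_j\rangle\langle\phi_j|(\mathbb{1} - \Pi)|\psi_i\rangle\langle\psi_i|(\mathbb{1} - \Pi)] \quad (5.3.83)
= \sum_{i,j=0}^{d-1} \lambda_i \left|\langle\psi_i|(\mathbb{1} - \Pi)|\phi_j\rangle\right|^2, \quad (5.3.84)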
Then, using the fact that 𝜆𝑖 ≥ min{𝜆𝑖 , 𝜇 𝑗 } and 𝜇 𝑗 ≥ min{𝜆𝑖 , 𝜇 𝑗 } for all 0 ≤ 𝑖, 𝑗 ≤
𝑑 − 1, the error probability under the measurement given by Π is
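p_{\text{err}}(\lambda, \rho, \sigma, \Pi)
= \operatorname{Tr}[(\mathbb{1} - \Pi)(\lambda\rho)] + \operatorname{Tr}[\Pi(1 - \lambda)\sigma] \quad (5.3.89)
= \sum_{i,j=0}^{d-1} \left(\lambda\lambda_i \left|\langle\psi_i|(\mathbb{1} - \Pi)|\phi_j\rangle\right|^2 + (1 - \lambda)\mu_j \left|\langle\psi_i|\Pi|\phi_j\rangle\right|^2\right) \quad (5.3.90)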
\geq \sum_{i,j=0}^{d-1} \min\{\lambda\lambda_i, (1 - \lambda)\mu_j\} \left(|x_{i,j}|^2 + |y_{i,j}|^2\right), \quad (5.3.91)
where we have defined 𝑥_{𝑖,𝑗} ≔ ⟨𝜓𝑖|(1 − Π)|𝜙𝑗⟩ and 𝑦_{𝑖,𝑗} ≔ ⟨𝜓𝑖|Π|𝜙𝑗⟩ in the last line.
Using the inequality |𝑎|² + |𝑏|² ≥ ½|𝑎 + 𝑏|², which holds for all 𝑎, 𝑏 ∈ ℂ, we obtain
p_{\text{err}}(\lambda, \rho, \sigma, \Pi) \geq \frac{1}{2} \sum_{i,j=0}^{d-1} \min\{\lambda\lambda_i, (1 - \lambda)\mu_j\} \left|\langle\psi_i|\phi_j\rangle\right|^2. \quad (5.3.92)
The expression on the right-hand side of the above inequality is precisely half the
optimal error probability for discriminating the two probability distributions 𝑝 and 𝑞,
with a prior probability of 𝜆. Indeed, we can see this by an application of Theorem 5.3
to the case of commuting 𝐴 and 𝐵. To this end, letting 𝐴 = \sum_{i=0}^{d-1} a_i |i\rangle\langle i| and
𝐵 = \sum_{i=0}^{d-1} b_i |i\rangle\langle i|, where 𝑎𝑖, 𝑏𝑖 ≥ 0 for all 𝑖 ∈ {0, . . . , 𝑑−1}, it follows that an optimal
measurement operator is Π+ = \sum_{i : a_i \geq b_i} |i\rangle\langle i|, with complement Π− = \sum_{i : a_i < b_i} |i\rangle\langle i|,
so that
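where the factor of ½ in (5.3.99) vanishes in the asymptotic limit. Finally, observe
that
\sum_{i,j=0}^{d-1} p(i,j)^{s} q(i,j)^{1-s} = \sum_{i,j=0}^{d-1} \left(\lambda_i \left|\langle\psi_i|\phi_j\rangle\right|^2\right)^{s} \left(\mu_j \left|\langle\psi_i|\phi_j\rangle\right|^2\right)^{1-s} \quad (5.3.103)
= \sum_{i,j=0}^{d-1} \lambda_i^{s} \mu_j^{1-s} \left|\langle\psi_i|\phi_j\rangle\right|^2 \quad (5.3.104)
= \sum_{i,j=0}^{d-1} \lambda_i^{s} \mu_j^{1-s} \operatorname{Tr}[|\psi_i\rangle\langle\psi_i|\phi_j\rangle\langle\phi_j|] \quad (5.3.105)
= \operatorname{Tr}\left[\left(\sum_{i=0}^{d-1} \lambda_i^{s} |\psi_i\rangle\langle\psi_i|\right)\left(\sum_{j=0}^{d-1} \mu_j^{1-s} |\phi_j\rangle\langle\phi_j|\right)\right] \quad (5.3.106)
= \operatorname{Tr}[\rho^{s} \sigma^{1-s}]. \quad (5.3.107)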
Therefore,
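\overline{\xi}(\rho, \sigma) \leq \sup_{s \in (0,1)} \left(-\log_2 \operatorname{Tr}[\rho^{s} \sigma^{1-s}]\right) = C(\rho\|\sigma), \quad (5.3.108)
which, combined with (5.3.79) and \underline{\xi}(𝜌, 𝜎) ≤ \overline{\xi}(𝜌, 𝜎), implies that
\underline{\xi}(\rho, \sigma) = \overline{\xi}(\rho, \sigma) = \sup_{s \in (0,1)} \left(-\log_2 \operatorname{Tr}[\rho^{s} \sigma^{1-s}]\right) = C(\rho\|\sigma), \quad (5.3.109)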
We now briefly consider state discrimination when there are more than two states.
Suppose that Alice prepares a quantum system in a state chosen randomly from a
set {𝜌 𝑥 }𝑥∈X of states. We assume that X is some finite alphabet with size |X| ≥ 2
and that the state 𝜌 𝑥 is chosen with probability 𝑝(𝑥), where 𝑝 : X → [0, 1] is a
probability distribution. Alice sends her chosen state to Bob, whose task is to guess
the value of 𝑥, i.e., which state Alice sent. Bob’s knowledge of the system can be
described by the ensemble {( 𝑝(𝑥), 𝜌 𝑥𝐵 )}𝑥∈X , which has the following associated
classical–quantum state:
\rho_{X_A B} := \sum_{x \in \mathcal{X}} p(x) |x\rangle\langle x|_{X_A} \otimes \rho^{x}_{B}, \quad (5.3.110)
where 𝑋 𝐴 is a classical |X|-dimensional register that contains Alice’s choice of
state. Note that while Bob knows both the prior probability distribution 𝑝 and
the association 𝑥 ↔ 𝜌 𝑥 between the labels 𝑥 and the states 𝜌 𝑥 , he does not have
access to the register 𝑋 𝐴 . (If Bob did have access to the classical register 𝑋 𝐴 , he
could simply measure it in the basis {|𝑥⟩}𝑥∈X and figure out what state Alice sent.)
Therefore, as before, Bob must make a measurement. His strategy is to choose a
POVM {𝑀𝐵𝑥 }𝑥∈X with elements indexed by the elements of X. If he obtains the
outcome corresponding to 𝑥 ∈ X, then he guesses that the state sent was 𝜌 𝑥 .
The scenario of multiple state discrimination is very similar to the task of
classical communication over a quantum channel N that we consider in Chapter 12.
The classical messages to be sent correspond to the labels 𝑥 ∈ X, while the states 𝜌 𝑥
correspond to an encoding of the messages into quantum states, which are then sent
through the quantum channel, and 𝑝 corresponds to the prior probability over the set
of messages. The measurement performed, in order to guess the state, corresponds
to a decoding channel that is applied at the receiving end of the quantum channel in
order to determine the message that was sent. The quantity in (5.3.119), evaluated
for the ensemble {( 𝑝(𝑥), N(𝜌 𝑥 ))}𝑥∈X , is then the optimal average probability of
correctly guessing the message sent, where the optimization is over all POVMs
indexed by the messages.
Let \mathcal{M}_{B \to X_B}(\cdot) := \sum_{x \in \mathcal{X}} \operatorname{Tr}[M^{x}_{B}(\cdot)] |x\rangle\langle x|_{X_B} be the measurement channel
corresponding to the POVM {𝑀^𝑥_𝐵}𝑥∈X, where 𝑋𝐵 is an |X|-dimensional classical register.
(Recall the definition of a measurement channel from Definition 4.10.) After the
measurement, the classical–quantum state in (5.3.110) transforms to
\omega_{X_A X_B} := \mathcal{M}_{B \to X_B}(\rho_{X_A B}) \quad (5.3.111)
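= \sum_{x, x' \in \mathcal{X}} p(x) \operatorname{Tr}[M^{x'}_{B} \rho^{x}_{B}] \, |x\rangle\langle x|_{X_A} \otimes |x'\rangle\langle x'|_{X_B}. \quad (5.3.112)
Let
\Pi^{\text{succ}}_{X_A X_B} := \sum_{x \in \mathcal{X}} |x\rangle\langle x|_{X_A} \otimes |x\rangle\langle x|_{X_B}, \quad (5.3.113)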
which is a projector corresponding to the registers 𝑋 𝐴 and 𝑋𝐵 having the same
value; this is what we want for state discrimination to be successful. The expected
success probability of the strategy given by the POVM {𝑀 𝑥 }𝑥∈X is thus
Exercise 5.14
Show that the error probability for discriminating the states in the ensemble
{( 𝑝(𝑥), 𝜌 𝑥 )}𝑥∈X using the POVM {𝑀 𝑥 }𝑥∈X is
The optimal success probability for discriminating the states in the ensemble
{( 𝑝(𝑥), 𝜌 𝑥 )}𝑥∈X is
The optimal error probability is then 𝑝∗err({(𝑝(𝑥), 𝜌^𝑥)}𝑥) ≔ 1 − 𝑝∗succ({(𝑝(𝑥), 𝜌^𝑥)}𝑥).
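Evaluating the projector in (5.3.113) on the state in (5.3.112) gives the success probability Σ𝑥 𝑝(𝑥) Tr[𝑀^𝑥 𝜌^𝑥] for a fixed POVM, which is simple to compute. The sketch below uses an illustrative three-state qubit ensemble and a deliberately suboptimal POVM that never guesses the third state; the function names are not from the text.

```python
import numpy as np


def is_povm(povm, tol=1e-10):
    """Check that all elements are positive semi-definite and sum to the identity."""
    dim = povm[0].shape[0]
    psd = all(np.linalg.eigvalsh(M).min() >= -tol for M in povm)
    return psd and np.allclose(sum(povm), np.eye(dim))


def success_probability(priors, states, povm):
    """sum_x p(x) Tr[M^x rho^x] for the strategy 'guess x on outcome x'."""
    return sum(p * np.trace(M @ rho).real for p, rho, M in zip(priors, states, povm))


ket0 = np.array([[1.0, 0.0], [0.0, 0.0]])
ket1 = np.array([[0.0, 0.0], [0.0, 1.0]])
plus = np.array([[0.5, 0.5], [0.5, 0.5]])

priors = [0.4, 0.4, 0.2]
states = [ket0, ket1, plus]
povm = [ket0, ket1, np.zeros((2, 2))]  # computational-basis measurement

p_succ = success_probability(priors, states, povm)
print(is_povm(povm), p_succ, 1 - p_succ)
```

The value obtained is a lower bound on the optimal success probability, since the optimum is a maximum over all POVMs indexed by X.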
Exercise 5.15
Given an ensemble {(𝑝(𝑥), 𝜌^𝑥)}𝑥∈X of quantum states with associated classical–quantum
state \rho_{XB} = \sum_{x \in \mathcal{X}} p(x) |x\rangle\langle x|_{X} \otimes \rho^{x}_{B}, show that the optimal success
probability 𝑝∗succ({(𝑝(𝑥), 𝜌^𝑥)}𝑥) can be evaluated using the following semi-definite
program:
maximize Tr[𝑀 𝑋 𝐵 𝜌 𝑋 𝐵 ]
subject to Tr 𝑋 [𝑀 𝑋 𝐵 ] = 1𝐵 , (5.3.120)
𝑀 𝑋 𝐵 ≥ 0.
In other words, show that
Exercise 5.16
Let {|𝜓 𝑥 ⟩}𝑥∈X be a finite set of mutually orthogonal vectors, let 𝜌 𝑥 = |𝜓 𝑥 ⟩⟨𝜓 𝑥 |
for all 𝑥 ∈ X, and consider the ensemble {( 𝑝(𝑥), 𝜌 𝑥 )}𝑥∈X , where 𝑝 : X → [0, 1]
is a probability distribution. Prove that 𝑝 ∗succ ({( 𝑝(𝑥), 𝜌 𝑥 )}𝑥 ) = 1 and construct
the corresponding optimal measurement.
Exercise 5.17
Generalize Proposition 5.2 to the case of multiple state discrimination. Specif-
ically, for every finite ensemble {( 𝑝(𝑥), 𝜌 𝑥 )}𝑥∈X of states and every positive,
trace preserving map N, prove that
𝑝 ∗succ ({( 𝑝(𝑥), 𝜌 𝑥 )}𝑥 ) ≥ 𝑝 ∗succ ({( 𝑝(𝑥), N(𝜌 𝑥 ))}𝑥 ). (5.3.124)
Given quantum states 𝜌 and 𝜎, the goal of asymmetric quantum hypothesis testing
is to minimize the type-II error probability given an upper bound on the type-I
error probability, as per the optimization problem in (5.3.3). The value of that
optimization problem is given by
with 𝜀 ∈ [0, 1] being an upper bound on the type-I error probability. Intuitively,
we might expect a trade-off between the type-I and type-II error probabilities. In
particular, we might expect that we can achieve a lower minimum type-II error
probability by increasing our tolerance on the type-I error probability. This is indeed
true. Observe that every measurement operator 𝑀 that satisfies Tr[𝑀 𝜌] ≥ 1 − 𝜀
also satisfies Tr[𝑀 𝜌] ≥ 1 − 𝜀′ for all 𝜀′ greater than 𝜀. All such measurement
operators are thus feasible points in the optimization for 𝛽𝜀′ (𝜌∥𝜎). Therefore,
Exercise 5.18
Show that 𝛽𝜀 (𝜌∥𝜎) can be evaluated using a semi-definite program. Then,
using strong duality, prove that an alternate expression for 𝛽𝜀 (𝜌∥𝜎) is
Exercise 5.19
Prove that the minimum type-II error probability for asymmetric hypothesis
testing of states 𝜌 and 𝜎, with 𝜀 ∈ [0, 1], is isometrically invariant: for every
isometry 𝑉, we have that 𝛽𝜀 (𝜌∥𝜎) = 𝛽𝜀 (𝑉 𝜌𝑉 † ∥𝑉 𝜎𝑉 † ).
As with the minimum error probability for symmetric hypothesis testing, the
minimum type-II error probability for asymmetric hypothesis testing obeys the
following data-processing inequality.
Proof: The proof is analogous to the proof of Proposition 5.2, and the intuition
behind it is as well. Let 𝑀 ′ be an operator satisfying 0 ≤ 𝑀 ′ ≤ 1 and Tr[𝑀 ′N(𝜌)] ≥
1 − 𝜀. Then, due to the positivity of N, and thus of N† , we have that N† (𝑀 ′) ≥ 0
and N† ( 1 − 𝑀 ′) ≥ 0 ⇒ N† (𝑀 ′) ≤ N† ( 1). Since N is trace preserving, the
adjoint N† is unital (see Exercise 4.10), which means that N† ( 1) = 1, so that
0 ≤ N† (𝑀 ′) ≤ 1. Furthermore, by definition of the adjoint of a superoperator, the
inequality Tr[N† (𝑀 ′) 𝜌] ≥ 1 − 𝜀 holds. Therefore, N† (𝑀 ′) is a feasible point in
the optimization for 𝛽𝜀 (𝜌∥𝜎), so that
where the last line follows by definition of the adjoint of a superoperator. Finally,
because the inequality 𝛽𝜀 (𝜌∥𝜎) ≤ Tr[𝑀 ′N(𝜎)] holds for every operator 𝑀 ′
satisfying 0 ≤ 𝑀 ′ ≤ 1 and Tr[𝑀 ′N(𝜌)] ≥ 1 − 𝜀, we conclude that
as required. ■
operator:
M(\mu^*, p^*) := \Pi_{\mu^*\rho > \sigma} + p^* \, \Pi_{\mu^*\rho = \sigma}, \quad (5.3.135)
where Π 𝜇∗ 𝜌>𝜎 is the projection onto the strictly positive part of 𝜇∗ 𝜌 − 𝜎, the
projection Π 𝜇∗ 𝜌=𝜎 projects onto the zero eigenspace of 𝜇∗ 𝜌 − 𝜎, and 𝜇∗ ≥ 0
and 𝑝 ∗ ∈ [0, 1] are chosen as follows:
Proof: First, observe that it suffices to optimize with respect to every measurement
operator 𝑀 that meets the constraint Tr[𝑀 𝜌] ≥ 1 − 𝜀 with equality. This follows
because for every measurement operator 𝑀 such that Tr[𝑀 𝜌] > 1 − 𝜀, there exists
a positive number 𝜆 ∈ [0, 1) such that Tr[(𝜆𝑀) 𝜌] = 1 − 𝜀. Note that 0 ≤ 𝜆𝑀 ≤ 1,
so that 𝜆𝑀 is a measurement operator, and because Tr[(𝜆𝑀)𝜎] < Tr[𝑀𝜎], we
conclude that
𝛽𝜀 (𝜌∥𝜎) = inf{Tr[𝑀𝜎] : 0 ≤ 𝑀 ≤ 1, Tr[𝑀 𝜌] = 1 − 𝜀}. (5.3.138)
Based on this, let 𝑀 be a measurement operator satisfying Tr[𝑀 𝜌] = 1 − 𝜀 and let
𝜇 ≥ 0. Then,
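\operatorname{Tr}[M\sigma] = \operatorname{Tr}[M\sigma] + \mu\left(1 - \varepsilon - \operatorname{Tr}[M\rho]\right) \quad (5.3.139)
= -\mu\varepsilon + \operatorname{Tr}[(\mathbb{1} - M)\mu\rho] + \operatorname{Tr}[M\sigma] \quad (5.3.140)
\geq -\mu\varepsilon + \frac{1}{2}\left(\operatorname{Tr}[\mu\rho + \sigma] - \|\mu\rho - \sigma\|_1\right) \quad (5.3.141)
= -\mu\varepsilon + \frac{1}{2}\left(\mu + \operatorname{Tr}[\sigma] - \|\mu\rho - \sigma\|_1\right). \quad (5.3.142)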
The sole inequality follows as an application of Theorem 5.3, with 𝐵 = 𝜎 and
𝐴 = 𝜇𝜌. Observe that the final expression is a universal bound independent of 𝑀.
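The bound in (5.3.142) can be spot-checked numerically. The sketch below draws random states 𝜌 and 𝜎 and a random measurement operator 𝑀, defines 𝜀 so that the constraint Tr[𝑀𝜌] = 1 − 𝜀 holds with equality, and compares Tr[𝑀𝜎] with the right-hand side over a grid of values of 𝜇; the dimension, seed, and grid are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)


def random_state(d):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real


def random_measurement_operator(d):
    """Random M with 0 <= M <= 1: random eigenbasis, eigenvalues uniform in [0, 1]."""
    h = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    _, v = np.linalg.eigh(h + h.conj().T)
    return (v * rng.uniform(size=d)) @ v.conj().T


def lower_bound(mu, eps, rho, sigma):
    """Right-hand side of (5.3.142)."""
    tn = np.abs(np.linalg.eigvalsh(mu * rho - sigma)).sum()
    return -mu * eps + 0.5 * (mu + np.trace(sigma).real - tn)


rho, sigma, M = random_state(3), random_state(3), random_measurement_operator(3)
eps = 1 - np.trace(M @ rho).real   # M meets Tr[M rho] = 1 - eps with equality
type_two = np.trace(M @ sigma).real
slack = [type_two - lower_bound(mu, eps, rho, sigma) for mu in np.linspace(0, 10, 101)]
print(min(slack))  # should be non-negative up to rounding
```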
To determine an optimal measurement operator, we can look to Theorem 5.3.
There, it was established that the following measurement operator is an optimal
one for inf 𝑀:0≤𝑀 ≤1 {Tr[( 1 − 𝑀)𝜇𝜌] + Tr[𝑀𝜎]}:
M(\mu, p) := \Pi_{\mu\rho > \sigma} + p \, \Pi_{\mu\rho = \sigma}, \quad (5.3.143)
where Π 𝜇𝜌>𝜎 is the projection onto the strictly positive part of 𝜇𝜌 − 𝜎, the
projection Π 𝜇𝜌=𝜎 projects onto the zero eigenspace of 𝜇𝜌 − 𝜎, and 𝑝 ∈ [0, 1]. The
measurement operator 𝑀 (𝜇, 𝑝) is called a quantum Neyman–Pearson test. We still
need to choose the parameters 𝜇 ≥ 0 and 𝑝 ∈ [0, 1]. Let us pick 𝜇 according to the
following optimization:
Now, suppose that multiple copies, say 𝑛, of the state (𝜌 or 𝜎) are available.
Based on the discussion at the beginning of Section 5.3, the optimal type-II
error probability, given an upper bound of 𝜀 on the type-I error probability, is
𝛽𝜀 (𝜌 ⊗𝑛 ∥𝜎 ⊗𝑛 ) for all 𝑛 ≥ 1. Then, as in the symmetric case, we are interested in
the behaviour of this type-II error probability as 𝑛 becomes large. Furthermore,
based on the earlier discussion of the trade-off between the type-I and type-II error
probabilities, we might imagine that as 𝑛 becomes large it is possible to bring the
type-I error probability all the way down to zero, because the states become more
distinguishable as 𝑛 increases. In order to investigate this possibility, we consider
the optimal type-II error exponent, by analogy with the error exponent for state
E(\rho, \sigma) \leq \widetilde{E}(\rho, \sigma). \quad (5.3.151)
The following result, known as the quantum Stein's lemma, provides us with a
tractable expression for 𝐸(𝜌, 𝜎) and Ẽ(𝜌, 𝜎) in terms of the quantum relative
entropy, an important quantity in quantum information theory that we introduce
formally in Chapter 7. We also delay the proof of the result to Chapter 7, when all
of the required elements of the proof become available to us.
E(\rho, \sigma) = \widetilde{E}(\rho, \sigma) = D(\rho\|\sigma), \quad (5.3.152)
where
D(\rho\|\sigma) := \begin{cases} \operatorname{Tr}[\rho(\log_2 \rho - \log_2 \sigma)] & \text{if } \operatorname{supp}(\rho) \subseteq \operatorname{supp}(\sigma), \\ +\infty & \text{otherwise.} \end{cases} \quad (5.3.153)
Remark: See Section 2.2.8.1 for the definition of the logarithm of a Hermitian operator. See
Section 7.2 for a more detailed explanation of the support conditions in the definition of the
quantum relative entropy.
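As a numerical illustration of (5.3.153), the following is a minimal sketch that evaluates the quantum relative entropy from the spectral decompositions of 𝜌 and 𝜎. The function name `relative_entropy` and the tolerance `tol` are assumptions made for this example; the support condition is checked by testing whether 𝜌 places any weight on the kernel of 𝜎.

```python
import numpy as np

def relative_entropy(rho, sigma, tol=1e-12):
    """Quantum relative entropy D(rho||sigma) in bits, following (5.3.153)."""
    p, u = np.linalg.eigh(rho)      # rho   = sum_i p[i] |u_i><u_i|
    q, v = np.linalg.eigh(sigma)    # sigma = sum_j q[j] |v_j><v_j|
    p = np.where(p > tol, p, 0.0)   # discard numerically zero eigenvalues of rho
    kernel = q <= tol               # eigenvectors spanning the kernel of sigma
    overlap = np.abs(u.conj().T @ v) ** 2   # overlap[i, j] = |<u_i|v_j>|^2
    # Support condition: rho must put no weight on the kernel of sigma.
    if (p @ overlap[:, kernel]).sum() > tol:
        return np.inf
    log_p = np.log2(p, where=p > 0, out=np.zeros_like(p))
    log_q = np.log2(q, where=~kernel, out=np.zeros_like(q))
    # Tr[rho log2 rho] - Tr[rho log2 sigma], restricted to the supports
    return float(p @ log_p - p @ (overlap @ log_q))

rho = np.diag([0.5, 0.5])
sigma = np.diag([0.75, 0.25])
plus = np.full((2, 2), 0.5)     # |+><+|
zero = np.diag([1.0, 0.0])      # |0><0|

d_mixed = relative_entropy(rho, sigma)        # finite: sigma has full support
d_unsupported = relative_entropy(plus, zero)  # infinite: supp(|+><+|) is not inside supp(|0><0|)
```

For commuting states, as in the first pair, the value reduces to the classical relative entropy of the two eigenvalue distributions.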
Figure 5.11: The most general strategy for discriminating two quantum channels
N and M is to prepare a bipartite state $\rho_{RA}$, with the reference system 𝑅 having
arbitrary dimension, send the system 𝐴 through the unknown quantum
channel, and then measure both systems 𝑅 and 𝐴 according to a two-outcome
POVM $\{M_{\mathcal{N}}, M_{\mathcal{M}}\}$. If the outcome corresponding to $M_{\mathcal{N}}$ occurs, then we guess
that the unknown channel is N; otherwise, we guess that it is M. The minimum
error probability among all such strategies is given by Theorem 5.9.
error probability. The measurement acts on both the system 𝑅 and the system
𝐴 after system 𝐴 has passed through the unknown channel. The expected error
probability of this strategy is analogous to the expected error probability of a
strategy for quantum state discrimination: it is the expectation, with respect to
the prior probability distribution given by 𝜆, of the probabilities of the two types
of errors that can occur: guessing “M” when the channel is N, and guessing “N”
when the channel is M. In other words, the expected error probability is
where the last line follows by letting 𝑀N ≡ 𝑀 and from the definition of 𝑝 err in
(5.3.1). We see that, given a state 𝜌 𝑅 𝐴 , the task of discriminating N and M reduces
to the task of discriminating the states N 𝐴→𝐵 (𝜌 𝑅 𝐴 ) and M 𝐴→𝐵 (𝜌 𝑅 𝐴 ). The optimal
error probability is obtained by optimizing with respect to every state 𝜌 𝑅 𝐴 and
measurement operator 𝑀, so that
where the optimization is with respect to every quantum state 𝜌 𝑅 𝐴 , and there is
an implicit optimization with respect to the dimension of the system 𝑅. This
gives us a first look into how a quantity defined initially for quantum states can be
“lifted” to a quantity defined for quantum channels. In particular, the quantity 𝑝 ∗err ,
initially defined for quantum states as in (5.3.6), has been extended to quantum
channels by evaluating the state quantity with respect to the states N 𝐴→𝐵 (𝜌 𝑅 𝐴 ) and
M 𝐴→𝐵 (𝜌 𝑅 𝐴 ) and then optimizing with respect to both the state 𝜌 𝑅 𝐴 and the
dimension of 𝑅. Such constructions of channel quantities from state quantities
arise throughout the rest of the book.
Just as the optimal error probability for discriminating two states can be
expressed using the trace norm (recall (5.3.15)), we now show that, analogously, the
optimal error probability for discriminating two quantum channels can be expressed
in terms of the diamond norm.
where the inequality follows from (4.1.7), with respect to the partial trace channel
$\operatorname{Tr}_{R'}$. Now, without loss of generality, we can let $d_{R'} \ge d_R d_A$; see Section 3.2.5.
Then, by the Schmidt decomposition theorem (Theorem 2.2), in particular (2.2.59),
the state vector $|\psi\rangle_{R'RA}$ can be expressed according to the $R'R|A$ bipartition as
\[ |\psi\rangle_{R'RA} = \sum_{k=1}^{d_A} \sqrt{p_k}\, |u_k\rangle_{R'R} \otimes |v_k\rangle_A, \]
where each $p_k$ is a probability and the vectors
$|u_k\rangle_{R'R}$ and $|v_k\rangle_A$ form orthonormal bases for a $d_A$-dimensional vector space. In
other words, only a $d_A$-dimensional subspace of $\mathcal{H}_{R'R}$, call it $\mathcal{H}_{A'}$, is relevant for
calculating the trace norm in (5.4.8), and there exists an isometry $V_{A' \to R'R}$ such that
$V_{A' \to R'R}|\psi\rangle_{A'A} = |\psi\rangle_{R'RA}$. Adopting the shorthand $V \equiv V_{A' \to R'R}$, it follows that
Exercise 5.20
Consider quantum channels N 𝐴→𝐵 and M 𝐴→𝐵 , and let 𝜆 ∈ [0, 1]. Using
(5.4.4) and (5.4.1), show that the optimal error probability 𝑝 ∗err (𝜆, N, M) can
be evaluated using a semi-definite program. Then, using strong duality, prove
that an alternate expression for 𝑝 ∗err (𝜆, N, M) is
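\[ p^*_{\text{err}}(\lambda, \mathcal{N}, \mathcal{M}) = \sup_{W_{AB}\ \text{Hermitian}} \left\{ \lambda_{\min}(\operatorname{Tr}_B[W_{AB}]) : W_{AB} \le \lambda \Gamma^{\mathcal{N}}_{AB},\ W_{AB} \le (1 - \lambda)\Gamma^{\mathcal{M}}_{AB} \right\}, \tag{5.4.14} \]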
Exercise 5.21
Consider quantum channels N 𝐴→𝐵 and M 𝐴→𝐵 , and let 𝜆 ∈ [0, 1]. Prove the
following bounds on the optimal error probability for discriminating N and M
in terms of the optimal error probability for discriminating the Choi states of N
and M:
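\[ d_A\, p^*_{\text{err}}(\lambda, \Phi^{\mathcal{N}}_{AB}, \Phi^{\mathcal{M}}_{AB}) \le p^*_{\text{err}}(\lambda, \mathcal{N}, \mathcal{M}) \le p^*_{\text{err}}(\lambda, \Phi^{\mathcal{N}}_{AB}, \Phi^{\mathcal{M}}_{AB}). \tag{5.4.15} \]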
The upper bound in (5.4.15) corresponds to the strategy that consists of letting
the state 𝜌 𝑅 𝐴 in Figure 5.11 be the maximally-entangled state Φ 𝐴′ 𝐴 = |Φ⟩⟨Φ| 𝐴′ 𝐴 ,
with $|\Phi\rangle_{A'A} = \frac{1}{\sqrt{d_A}} \sum_{i=0}^{d_A - 1} |i, i\rangle_{A'A}$. The following exercise tells us when this strategy
is optimal, i.e., when the upper bound in (5.4.15) is achieved.
Exercise 5.22
Let the quantum channels N 𝐴→𝐵 and M 𝐴→𝐵 be jointly covariant with respect
to a group 𝐺, so that
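\begin{align}
\mathcal{N}_{A \to B}(U_A^g \rho_A U_A^{g\dagger}) &= V_B^g \mathcal{N}_{A \to B}(\rho_A) V_B^{g\dagger}, \tag{5.4.16} \\
\mathcal{M}_{A \to B}(U_A^g \rho_A U_A^{g\dagger}) &= V_B^g \mathcal{M}_{A \to B}(\rho_A) V_B^{g\dagger}, \tag{5.4.17}
\end{align}
for every $g \in G$ and every state $\rho_A$, where $\{U_A^g\}_{g \in G}$ and $\{V_B^g\}_{g \in G}$ are unitary
representations of $G$ acting on $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively. Furthermore, let
$\{U_A^g\}_{g \in G}$ be such that
\[ \mathcal{T}^G_A(\cdot) \coloneqq \frac{1}{|G|} \sum_{g \in G} U_A^g (\cdot) U_A^{g\dagger} = \operatorname{Tr}[\cdot]\, \frac{\mathbb{1}_A}{d_A}. \tag{5.4.18} \]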
5.5 Summary
5.7 Problems
1. Let 𝜌 𝐴𝐵 be a quantum state with 𝑑 𝐴 = 𝑑 𝐵 = 𝑑 ≥ 2, and consider the quantity
In other words, for classical–quantum states, the quantity 𝑞 corr reduces to the optimal
success probability for multiple state discrimination of the set {𝜌 𝑥𝐵 }𝑥∈X . (Hint: See
Exercise 4.11.)
(Bibliographic Note: The function 𝑞 corr was defined by Koenig et al. (2009) within the
context of the min-entropy (a quantity that we encounter in Chapter 7) and its operational
meaning.)
Chapter 6
Chapter 6: Distinguishability Measures for Quantum States and Channels
because there are always choices for 𝑀 such that Tr[𝑀 (𝜌 − 𝜎)] ≥ 0. Then the
equality follows as a direct application of (5.3.17) and the optimality statement
following it, with 𝐴 = 𝜌 and 𝐵 = 𝜎. ■
From Exercise 2.30, we know that, for Hermitian operators, the trace norm
can be evaluated using semi-definite programming. The normalized trace distance
$\frac{1}{2}\|\rho - \sigma\|_1$ can therefore be evaluated using semi-definite programming, because
𝜌 − 𝜎 is Hermitian. Since 𝜌 and 𝜎 are positive semi-definite, we obtain the
following simpler semi-definite programs for their normalized trace distance.
Exercise 6.1
Prove (6.1.8).
Proof: This is immediate from (4.1.7), which tells us that the trace norm is
monotone non-increasing under the action of every positive trace-non-increasing
superoperator for every linear operator. It is also possible to provide a direct proof
using the expression in (6.1.7); see Exercise 6.2. ■
Exercise 6.2
Provide a direct proof of (6.1.9) using the expression in (6.1.7). (Hint: See the
proof of Proposition 5.2.)
By combining the results of Theorems 6.1 and 6.3, we find that the trace distance
is achieved by a measurement channel:
where the optimization is performed over POVMs {Λ𝑥 }𝑥∈X defined with respect
to a finite alphabet X, and an optimal POVM is given by {Λ∗ , 1 − Λ∗ }, where
Λ∗ = Π+ + Λ0 , the projection Π+ is the projection onto the strictly positive part
of 𝜌 − 𝜎, and Λ0 satisfies 0 ≤ Λ0 ≤ Π0 , with Π0 the projection onto the zero
eigenspace of 𝜌 − 𝜎.
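The statement above can be checked numerically. The sketch below computes the normalized trace distance from the eigenvalues of 𝜌 − 𝜎 and compares it with the value Tr[Λ∗(𝜌 − 𝜎)] obtained from the projection onto the strictly positive part (taking Λ0 = 0). The function names and the example states are assumptions chosen for illustration.

```python
import numpy as np

def trace_distance(rho, sigma):
    """Normalized trace distance (1/2)||rho - sigma||_1 via eigenvalues of rho - sigma."""
    eigenvalues = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * np.abs(eigenvalues).sum()

def positive_part_projection(rho, sigma, tol=1e-12):
    """Projection onto the strictly positive part of rho - sigma (choice Lambda_0 = 0)."""
    w, v = np.linalg.eigh(rho - sigma)
    positive = v[:, w > tol]            # eigenvectors with strictly positive eigenvalue
    return positive @ positive.conj().T

rho = np.array([[0.75, 0.25], [0.25, 0.25]])
sigma = np.eye(2) / 2

distance = trace_distance(rho, sigma)
projection = positive_part_projection(rho, sigma)
# Difference of the probabilities of the first outcome under rho and under sigma
achieved = np.trace(projection @ (rho - sigma)).real
```

For this pair of states, 𝜌 − 𝜎 has eigenvalues ±√2/4, so both `distance` and `achieved` equal √2/4.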
6.2 Fidelity
In addition to the trace distance, another distinguishability measure for states that
we consider in this book is the fidelity (also called Uhlmann fidelity).
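\[ F(\rho, \sigma) \coloneqq \left\| \sqrt{\rho}\sqrt{\sigma} \right\|_1^2 = \left( \operatorname{Tr}\!\left[ \sqrt{\sqrt{\sigma}\rho\sqrt{\sigma}} \right] \right)^2. \tag{6.2.1} \]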
Observe that the fidelity is symmetric in its arguments. We also have that
𝐹 (𝜌, 𝜎) ∈ [0, 1] for all states 𝜌 and 𝜎, a fact that we prove below.
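Both expressions in (6.2.1) are straightforward to evaluate numerically. The following is a minimal sketch, with the helper names `psd_sqrt`, `fidelity`, and `fidelity_alt` and the example states chosen here for illustration: the first form sums the singular values of √𝜌√𝜎, and the second takes the trace of the square root of √𝜎𝜌√𝜎.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def fidelity(rho, sigma):
    """First form in (6.2.1): squared trace norm of sqrt(rho) sqrt(sigma)."""
    singular_values = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    return float(singular_values.sum() ** 2)

def fidelity_alt(rho, sigma):
    """Second form in (6.2.1): (Tr sqrt(sqrt(sigma) rho sqrt(sigma)))^2."""
    s = psd_sqrt(sigma)
    return float(np.trace(psd_sqrt(s @ rho @ s)).real ** 2)

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
sigma = np.array([[0.4, 0.1j], [-0.1j, 0.6]])

f_value = fidelity(rho, sigma)
f_swapped = fidelity(sigma, rho)      # symmetry in the two arguments
f_second_form = fidelity_alt(rho, sigma)
```

The eigenvalue clipping in `psd_sqrt` only guards against tiny negative eigenvalues produced by rounding; it is not part of the definition.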
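For a pure state |𝜓⟩ and mixed state 𝜌, the fidelity between them is equal to
\[ F(\rho, |\psi\rangle\langle\psi|) = \langle\psi|\rho|\psi\rangle. \tag{6.2.2} \]
Also, for two pure states |𝜓⟩ and |𝜙⟩, the fidelity is simply
\[ F(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|) = |\langle\psi|\phi\rangle|^2. \tag{6.2.3} \]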
The formula in (6.2.2) gives the fidelity an operational meaning that we employ
in later chapters. Suppose that the goal of a quantum information processing
protocol is to produce the pure state |𝜓⟩⟨𝜓|, but it instead produces a mixed state
𝜌. Then the fidelity 𝐹 (𝜌, |𝜓⟩⟨𝜓|) is equal to the probability that the actual state 𝜌
passes a test for being the ideal state |𝜓⟩⟨𝜓|, with the test being given by the POVM
{|𝜓⟩⟨𝜓|, 1 − |𝜓⟩⟨𝜓|}. That is, the probability of obtaining the first outcome of the
measurement (i.e., “success”) is equal to 𝐹 (𝜌, |𝜓⟩⟨𝜓|). In this way, the fidelity
provides another natural way for assessing the performance of quantum information
processing protocols.
Like the trace distance, the fidelity can be computed via a semi-definite program,
as stated in the following proposition:
𝐹 (𝜌, 𝜎) = 𝐹 (𝑉 𝜌𝑉 † , 𝑉 𝜎𝑉 † ). (6.2.6)
Proof:
1. The fact that $F(\rho, \sigma) \ge 0$ for all states 𝜌 and 𝜎 follows from the definition
of the fidelity as the squared trace norm and the fact that the trace norm is
always non-negative. If 𝜌 and 𝜎 are supported on orthogonal subspaces, then
$\sqrt{\rho}\sqrt{\sigma} = 0$, which means that $F(\rho, \sigma) = 0$. Conversely, if $F(\rho, \sigma) = 0$, then
$\|\sqrt{\rho}\sqrt{\sigma}\|_1 = 0$, which implies (by definition of a norm) that $\sqrt{\rho}\sqrt{\sigma} = 0$, which
in turn implies that 𝜌 and 𝜎 are supported on orthogonal subspaces.
Now, using (2.2.130), there exists a unitary 𝑈 such that
\[ F(\rho, \sigma) = \left\| \sqrt{\rho}\sqrt{\sigma} \right\|_1^2 = \left| \operatorname{Tr}[U\sqrt{\rho}\sqrt{\sigma}] \right|^2. \tag{6.2.8} \]
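\begin{align}
F(V\rho V^\dagger, V\sigma V^\dagger) &= \left\| \sqrt{V\rho V^\dagger}\sqrt{V\sigma V^\dagger} \right\|_1^2 \tag{6.2.14} \\
&= \left\| V\sqrt{\rho}V^\dagger V\sqrt{\sigma}V^\dagger \right\|_1^2 \tag{6.2.15} \\
&= \left\| V\sqrt{\rho}\sqrt{\sigma}V^\dagger \right\|_1^2 \tag{6.2.16} \\
&= \left\| \sqrt{\rho}\sqrt{\sigma} \right\|_1^2, \tag{6.2.17}
\end{align}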
as required, where the last line is due to the isometric invariance of the Schatten
norms, as stated in (2.2.93).
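3. Proof of multiplicativity: Using the fact that $\sqrt{\rho_1 \otimes \rho_2} = \sqrt{\rho_1} \otimes \sqrt{\rho_2}$, and
similarly for $\sigma_1 \otimes \sigma_2$, and using the multiplicativity of the trace norm with
respect to the tensor product (see (2.2.96)), we find that
\begin{align}
F(\rho_1 \otimes \rho_2, \sigma_1 \otimes \sigma_2) &= \left\| \sqrt{\rho_1 \otimes \rho_2}\sqrt{\sigma_1 \otimes \sigma_2} \right\|_1^2 \tag{6.2.18} \\
&= \left\| (\sqrt{\rho_1} \otimes \sqrt{\rho_2})(\sqrt{\sigma_1} \otimes \sqrt{\sigma_2}) \right\|_1^2 \tag{6.2.19} \\
&= \left\| \sqrt{\rho_1}\sqrt{\sigma_1} \otimes \sqrt{\rho_2}\sqrt{\sigma_2} \right\|_1^2 \tag{6.2.20} \\
&= \left\| \sqrt{\rho_1}\sqrt{\sigma_1} \right\|_1^2 \left\| \sqrt{\rho_2}\sqrt{\sigma_2} \right\|_1^2 \tag{6.2.21} \\
&= F(\rho_1, \sigma_1) F(\rho_2, \sigma_2), \tag{6.2.22}
\end{align}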
as required. ■
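The isometric invariance and multiplicativity just proved can be sanity-checked on small examples. The sketch below is illustrative only: the helper functions, the example states, and the particular isometry `v` from a qubit into a qutrit are all assumptions made here.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = ||sqrt(rho) sqrt(sigma)||_1^2."""
    singular_values = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    return float(singular_values.sum() ** 2)

rho1 = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
sigma1 = np.array([[0.4, 0.1j], [-0.1j, 0.6]])
rho2 = np.diag([0.9, 0.1])
sigma2 = np.array([[0.5, 0.3], [0.3, 0.5]])

# Multiplicativity: F(rho1 ⊗ rho2, sigma1 ⊗ sigma2) = F(rho1, sigma1) F(rho2, sigma2)
f_product = fidelity(np.kron(rho1, rho2), np.kron(sigma1, sigma2))
f_factors = fidelity(rho1, sigma1) * fidelity(rho2, sigma2)

# Isometric invariance: v maps C^2 into C^3 and satisfies v^dagger v = identity
r = 1 / np.sqrt(2)
v = np.array([[1, 0], [0, r], [0, 1j * r]])
f_embedded = fidelity(v @ rho1 @ v.conj().T, v @ sigma1 @ v.conj().T)
```

Up to rounding error, `f_product` agrees with `f_factors` and `f_embedded` agrees with `fidelity(rho1, sigma1)`.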
Remark: Since all purifications are related to each other by isometries on the purifying system
(which is the system 𝑅 as in the statement of the theorem), Uhlmann’s theorem tells us that the
fidelity between two quantum states is equal to the maximum overlap between their purifications.
Furthermore, it is straightforward to show that it suffices to take the dimension of 𝑅 the
same as the dimension of 𝐴, as we have done in the statement of the theorem. In other words,
performing an optimization over the dimension of 𝑅 leads to the same result as in (6.2.23).
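\begin{align}
& \left| \langle\psi^\rho|_{RA} (U_R \otimes \mathbb{1}_A) |\psi^\sigma\rangle_{RA} \right|^2 \notag \\
&\quad = \left| \langle\Gamma|_{RA} (\mathbb{1}_R \otimes \sqrt{\rho_A})(U_R \otimes \mathbb{1}_A)(\mathbb{1}_R \otimes \sqrt{\sigma_A}) |\Gamma\rangle_{RA} \right|^2 \tag{6.2.24} \\
&\quad = \left| \langle\Gamma|_{RA} (U_R \otimes \sqrt{\rho_A}\sqrt{\sigma_A}) |\Gamma\rangle_{RA} \right|^2 \tag{6.2.25} \\
&\quad = \left| \langle\Gamma|_{RA} (\mathbb{1}_R \otimes \sqrt{\rho_A}\sqrt{\sigma_A} U_A^{\mathsf{T}}) |\Gamma\rangle_{RA} \right|^2, \tag{6.2.26}
\end{align}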
where the last line follows from the transpose trick in (2.2.40). Then, using (2.2.41),
we find that
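\[ \left| \langle\psi^\rho|_{RA} (U_R \otimes \mathbb{1}_A) |\psi^\sigma\rangle_{RA} \right|^2 = \left| \operatorname{Tr}[\sqrt{\rho_A}\sqrt{\sigma_A} U_A^{\mathsf{T}}] \right|^2. \tag{6.2.27} \]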
Since 𝑈 𝐴 is arbitrary, and 𝑈 𝐴T is also a unitary, we use (2.2.130) to obtain
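\begin{align}
\max_U \left| \langle\psi^\rho|_{RA} (U_R \otimes \mathbb{1}_A) |\psi^\sigma\rangle_{RA} \right|^2 &= \max_U \left| \operatorname{Tr}[\sqrt{\rho_A}\sqrt{\sigma_A} U_A^{\mathsf{T}}] \right|^2 \tag{6.2.28} \\
&= \max_U \left| \operatorname{Tr}[\sqrt{\rho_A}\sqrt{\sigma_A} U_A] \right|^2 \tag{6.2.29} \\
&= \left\| \sqrt{\rho_A}\sqrt{\sigma_A} \right\|_1^2 \tag{6.2.30}
\end{align}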
as required. ■
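The argument above is constructive, and it can be followed numerically. In the sketch below (an illustration under stated assumptions, not the book's code), the canonical purification (𝟙 ⊗ √𝜌)|Γ⟩ is stored as a matrix `m[r, a]` of coefficients, so that 𝑈_𝑅 ⊗ 𝟙_𝐴 acts by left multiplication. The optimal unitary is read off from a singular value decomposition; all names are hypothetical.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def canonical_purification(rho):
    """Coefficient matrix m[r, a] of (1_R ⊗ sqrt(rho)_A)|Gamma>_RA."""
    return psd_sqrt(rho).T

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
sigma = np.array([[0.4, 0.1j], [-0.1j, 0.6]])
m_rho = canonical_purification(rho)
m_sigma = canonical_purification(sigma)

# <psi_rho|(U ⊗ 1)|psi_sigma> = Tr[U x] with x = m_sigma m_rho^dagger.
x = m_sigma @ m_rho.conj().T
w, s, vh = np.linalg.svd(x)
u_opt = vh.conj().T @ w.conj().T        # unitary achieving Tr[U x] = sum of singular values

psi_rho = m_rho.reshape(-1)
psi_sigma_rotated = (u_opt @ m_sigma).reshape(-1)
overlap_optimal = abs(np.vdot(psi_rho, psi_sigma_rotated)) ** 2
overlap_identity = abs(np.vdot(psi_rho, m_sigma.reshape(-1))) ** 2   # U = identity

fidelity_value = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False).sum() ** 2
```

Here `overlap_optimal` reproduces `fidelity_value`, while the unrotated overlap `overlap_identity` can only be smaller or equal.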
Proof: Recall that every quantum channel N 𝐴→𝐵 can be written in the Stinespring
form as N 𝐴→𝐵 (𝜌 𝐴 ) = Tr𝐸 [𝑉 𝜌 𝐴𝑉 † ], where 𝑉 ≡ 𝑉𝐴→𝐵𝐸 is some isometric extension
of N and $d_E \le \operatorname{rank}(\Gamma^{\mathcal{N}}_{AB})$. Since we have shown that the fidelity is invariant
under isometric channels, it remains to show that the fidelity is non-decreasing
under the action of the partial trace. To this end, consider bipartite states 𝜌 𝐴𝐵
and 𝜎𝐴𝐵 , and let |𝜓⟩ 𝑅 𝐴𝐵 be an arbitrary purification of 𝜌 𝐴𝐵 and let |𝜙⟩ 𝑅 𝐴𝐵 be an
arbitrary purification of 𝜎𝐴𝐵 , where 𝑑 𝑅 = 𝑑 𝐴 𝑑 𝐵 . Observe that |𝜓⟩ 𝑅 𝐴𝐵 and |𝜙⟩ 𝑅 𝐴𝐵
are also purifications of 𝜌 𝐴 = Tr 𝐵 [𝜌 𝐴𝐵 ] and 𝜎𝐴 = Tr 𝐵 [𝜎𝐴𝐵 ], respectively. Then,
by Uhlmann’s theorem, we have that
\[ F(\rho_A, \sigma_A) = \max_{U_{RB}} \left| \langle\psi|_{RAB} (U_{RB} \otimes \mathbb{1}_A) |\phi\rangle_{RAB} \right|^2. \tag{6.2.32} \]
Therefore,
The fidelity thus satisfies the data-processing inequality with respect to the partial
trace.
Using the data-processing inequality for the fidelity with respect to the partial
trace, along with its invariance under isometries, we conclude that
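\begin{align}
F(\mathcal{N}(\rho), \mathcal{N}(\sigma)) &= F(\operatorname{Tr}_E[V\rho V^\dagger], \operatorname{Tr}_E[V\sigma V^\dagger]) \tag{6.2.37} \\
&\ge F(V\rho V^\dagger, V\sigma V^\dagger) \tag{6.2.38} \\
&= F(\rho, \sigma), \tag{6.2.39}
\end{align}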
as required. ■
With Uhlmann’s theorem and the data-processing inequality for the fidelity in
hand, we can now establish two more properties of the fidelity.
Proof: By Uhlmann’s theorem, we know that the fidelity is given by the maximum
overlap between the purifications of the two states under consideration. Based on
where the last line follows from the formula in (6.2.2) for the fidelity between a
pure state and a mixed state. Then, using the data-processing inequality for the
fidelity with respect to the partial trace, we obtain
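\begin{align}
\sum_{x \in \mathcal{X}} p(x) F(\rho^x_A, \sigma_A) &= F\!\left( \sum_{x \in \mathcal{X}} p(x) |\phi_x\rangle\langle\phi_x|_{RA},\ |\psi^\sigma\rangle\langle\psi^\sigma|_{RA} \right) \tag{6.2.44} \\
&\le F\!\left( \sum_{x \in \mathcal{X}} p(x) \operatorname{Tr}_R[|\phi_x\rangle\langle\phi_x|_{RA}],\ \operatorname{Tr}_R[|\psi^\sigma\rangle\langle\psi^\sigma|_{RA}] \right) \tag{6.2.45} \\
&= F\!\left( \sum_{x \in \mathcal{X}} p(x) \rho^x_A,\ \sigma_A \right), \tag{6.2.46}
\end{align}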
A more general result than the concavity result proved above, namely joint
concavity, can be obtained if we consider instead the square root of the fidelity,
which we call the “root fidelity” and denote by
\[ \sqrt{F}(\rho, \sigma) \coloneqq \sqrt{F(\rho, \sigma)} = \left\| \sqrt{\rho}\sqrt{\sigma} \right\|_1. \tag{6.2.47} \]
From the above and the data-processing inequality for the fidelity under partial
trace (Theorem 6.9), we conclude that
\[ \sum_{x \in \mathcal{X}} p(x) \sqrt{F}(\rho^x_A, \sigma^x_A) = \sqrt{F}(\rho_{XA}, \sigma_{XA}) \le \sqrt{F}(\rho_A, \sigma_A), \tag{6.2.57} \]
The steps in (6.2.51)–(6.2.56) demonstrate that the root fidelity satisfies the
direct-sum property: for every finite alphabet X, probability distributions 𝑝, 𝑞 :
X → [0, 1], and sets {𝜌 𝑥𝐴 }𝑥∈X , {𝜎𝐴𝑥 }𝑥∈X of states, we have
\[ \sqrt{F}\!\left( \sum_{x \in \mathcal{X}} p(x) |x\rangle\langle x|_X \otimes \rho^x_A,\ \sum_{x \in \mathcal{X}} q(x) |x\rangle\langle x|_X \otimes \sigma^x_A \right) = \sum_{x \in \mathcal{X}} \sqrt{p(x) q(x)}\, \sqrt{F}(\rho^x_A, \sigma^x_A). \tag{6.2.58} \]
Just as the trace distance can be achieved with a measurement, so it holds that
the fidelity can also be achieved with a measurement, as we now show.
where the optimization is with respect to all POVMs {Λ𝑥 }𝑥∈X defined with
respect to a finite alphabet X.
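\[ F(\rho, \sigma) = \left\| \sqrt{\rho}\sqrt{\sigma} \right\|_1^2 = \left( \operatorname{Tr}\!\left[ \sqrt{\sqrt{\sigma}\rho\sqrt{\sigma}} \right] \right)^2 = \operatorname{Tr}[A\sigma]^2, \tag{6.2.68} \]
where
\[ A \coloneqq \sigma^{-\frac{1}{2}} \left( \sigma^{\frac{1}{2}} \rho\, \sigma^{\frac{1}{2}} \right)^{\frac{1}{2}} \sigma^{-\frac{1}{2}}. \tag{6.2.69} \]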
If 𝜎 is not invertible, then the inverse is understood to be on the support of 𝜎, in
which case 𝐴 is supported on the support of 𝜎. So the fidelity is simply equal to
the squared expectation value of the Hermitian operator 𝐴 with respect to the state
𝜎. Observe that we can also write
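\begin{align}
F(\rho, \sigma) &= \left( \operatorname{Tr}\!\left[ \sqrt{\sqrt{\sigma}\rho\sqrt{\sigma}} \right] \right)^2 \tag{6.2.70} \\
&= \left( \operatorname{Tr}\!\left[ \sqrt{\sqrt{\sigma}\,\Pi_\sigma \rho\, \Pi_\sigma \sqrt{\sigma}} \right] \right)^2 \tag{6.2.71}
\end{align}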
eigenvectors of 𝐴, where 𝑟 = rank( 𝐴). If 𝜎 is not invertible, then we can always add
a set $\{|\psi_i\rangle\}_{i=r}^{d-1}$ of linearly independent pure states orthogonal to the eigenbasis of
where the last line follows because 𝐴|𝜓𝑖 ⟩ = 𝜆𝑖 |𝜓𝑖 ⟩ for all 0 ≤ 𝑖 ≤ 𝑟 − 1, and we
have used the fact that 𝜆𝑖 = 0 for all 𝑟 ≤ 𝑖 ≤ 𝑑 − 1. Now, it is straightforward to
show that 𝐴𝜎 𝐴 = Π𝜎 𝜌Π𝜎 . Therefore,
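\begin{align}
F(\rho, \sigma) = \operatorname{Tr}[A\sigma]^2 &= \left( \sum_{i=0}^{r-1} \sqrt{\langle\psi_i|\Pi_\sigma \rho\, \Pi_\sigma|\psi_i\rangle} \sqrt{\langle\psi_i|\sigma|\psi_i\rangle} \right)^2 \tag{6.2.78} \\
&= \left( \sum_{i=0}^{r-1} \sqrt{\langle\psi_i|\rho|\psi_i\rangle} \sqrt{\langle\psi_i|\sigma|\psi_i\rangle} \right)^2, \tag{6.2.79}
\end{align}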
where the last line follows because 𝐴 is defined on the support of 𝜎, which means
that Π𝜎 |𝜓𝑖 ⟩ = |𝜓𝑖 ⟩ for all 0 ≤ 𝑖 ≤ 𝑟 − 1. We thus have
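\begin{align}
\min_{\{\Lambda_x\}_x} \left( \sum_{x \in \mathcal{X}} \sqrt{\operatorname{Tr}[\Lambda_x \rho]} \sqrt{\operatorname{Tr}[\Lambda_x \sigma]} \right)^2 &\le \left( \sum_{i=0}^{r-1} \sqrt{\langle\psi_i|\rho|\psi_i\rangle} \sqrt{\langle\psi_i|\sigma|\psi_i\rangle} \right)^2 \tag{6.2.80} \\
&= F(\rho, \sigma), \tag{6.2.81}
\end{align}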
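The construction in (6.2.68)–(6.2.79) can be traced numerically. The following sketch assumes that 𝜎 is invertible, so that the inverse square root is well defined without restricting to the support; the helper `psd_power` and the example states are assumptions for illustration. It builds the operator 𝐴 from (6.2.69), measures both states in the eigenbasis of 𝐴, and compares the resulting classical quantity with the fidelity.

```python
import numpy as np

def psd_power(a, power):
    """Power of a positive semi-definite matrix (assumed invertible if power < 0)."""
    w, v = np.linalg.eigh(a)
    return (v * np.clip(w, 0.0, None) ** power) @ v.conj().T

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
sigma = np.array([[0.4, 0.1j], [-0.1j, 0.6]])    # invertible by assumption

s_half = psd_power(sigma, 0.5)
s_inv_half = psd_power(sigma, -0.5)
a = s_inv_half @ psd_power(s_half @ rho @ s_half, 0.5) @ s_inv_half   # operator A in (6.2.69)
a = (a + a.conj().T) / 2                     # remove rounding asymmetry before eigh
_, basis = np.linalg.eigh(a)                 # columns are the eigenvectors |psi_i>

# Outcome probabilities <psi_i|rho|psi_i> and <psi_i|sigma|psi_i>
p = np.einsum('ji,jk,ki->i', basis.conj(), rho, basis).real
q = np.einsum('ji,jk,ki->i', basis.conj(), sigma, basis).real
measured = np.sqrt(np.clip(p, 0, None) * np.clip(q, 0, None)).sum() ** 2   # as in (6.2.79)

expectation = np.trace(a @ sigma).real ** 2                                # Tr[A sigma]^2
direct = np.linalg.svd(psd_power(rho, 0.5) @ s_half, compute_uv=False).sum() ** 2
```

All three numbers `measured`, `expectation`, and `direct` coincide up to rounding, which is the content of (6.2.68) and (6.2.79) for this example.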
Proof: The reasoning here follows the reasoning of the proof of Theorem 6.3
closely. Let {Λ′𝑥 }𝑥∈X be a POVM. Then consider that
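\begin{align}
\sum_{x \in \mathcal{X}} \sqrt{\operatorname{Tr}[\Lambda'_x \mathcal{N}(\rho)] \operatorname{Tr}[\Lambda'_x \mathcal{N}(\sigma)]} &= \sum_{x \in \mathcal{X}} \sqrt{\operatorname{Tr}[\mathcal{N}^\dagger(\Lambda'_x) \rho] \operatorname{Tr}[\mathcal{N}^\dagger(\Lambda'_x) \sigma]} \tag{6.2.83} \\
&\ge \min_{\{\Lambda_x\}_{x \in \mathcal{X}}} \sum_{x \in \mathcal{X}} \sqrt{\operatorname{Tr}[\Lambda_x \rho] \operatorname{Tr}[\Lambda_x \sigma]} \tag{6.2.84} \\
&= \sqrt{F}(\rho, \sigma). \tag{6.2.85}
\end{align}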
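The inequality follows because $\{\mathcal{N}^\dagger(\Lambda'_x)\}_{x \in \mathcal{X}}$ is a POVM since $\{\Lambda'_x\}_{x \in \mathcal{X}}$ is and
N is a positive, trace-preserving map, so that $\mathcal{N}^\dagger(\Lambda'_x) \ge 0$ for all $x \in \mathcal{X}$ and
$\sum_{x \in \mathcal{X}} \mathcal{N}^\dagger(\Lambda'_x) = \mathcal{N}^\dagger\!\left( \sum_{x \in \mathcal{X}} \Lambda'_x \right) = \mathcal{N}^\dagger(\mathbb{1}) = \mathbb{1}$. The last equality follows from
Theorem 6.12. Since the inequality holds for all POVMs $\{\Lambda'_x\}_{x \in \mathcal{X}}$, we conclude
that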
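\begin{align}
\sqrt{F}(\mathcal{N}(\rho), \mathcal{N}(\sigma)) &= \min_{\{\Lambda'_x\}_{x \in \mathcal{X}}} \sum_{x \in \mathcal{X}} \sqrt{\operatorname{Tr}[\Lambda'_x \mathcal{N}(\rho)] \operatorname{Tr}[\Lambda'_x \mathcal{N}(\sigma)]} \tag{6.2.86} \\
&\ge \sqrt{F}(\rho, \sigma), \tag{6.2.87}
\end{align}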
Proof: We first prove the upper bound. To do so, recall the formula in (6.1.1)
for the trace distance between two pure states. If we let |𝜓 𝜌 ⟩ 𝑅 𝐴 and |𝜓 𝜎 ⟩ 𝑅 𝐴 be
purifications of 𝜌 𝐴 and 𝜎𝐴 , respectively, such that 𝐹 (𝜌 𝐴 , 𝜎𝐴 ) = |⟨𝜓 𝜌 |𝜓 𝜎 ⟩| 2 , and
if we use the data-processing inequality for the trace distance with respect to the
partial trace channel Tr 𝑅 , then we obtain
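\begin{align}
\frac{1}{2} \left\| \rho_A - \sigma_A \right\|_1 &= \frac{1}{2} \left\| \operatorname{Tr}_R[|\psi^\rho\rangle\langle\psi^\rho|_{RA} - |\psi^\sigma\rangle\langle\psi^\sigma|_{RA}] \right\|_1 \tag{6.2.89} \\
&\le \frac{1}{2} \left\| |\psi^\rho\rangle\langle\psi^\rho|_{RA} - |\psi^\sigma\rangle\langle\psi^\sigma|_{RA} \right\|_1 \tag{6.2.90} \\
&= \sqrt{1 - |\langle\psi^\rho|\psi^\sigma\rangle|^2} \tag{6.2.91} \\
&= \sqrt{1 - F(\rho_A, \sigma_A)}, \tag{6.2.92}
\end{align}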
as required.
For the lower bound, we use the results of Theorems 6.12 and 6.4. Theorem
6.12 tells us that there exists a POVM {Λ𝑥 }𝑥∈X such that
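\begin{align}
F(\rho, \sigma) &= \left( \sum_{x \in \mathcal{X}} \sqrt{\operatorname{Tr}[\Lambda_x \rho]} \sqrt{\operatorname{Tr}[\Lambda_x \sigma]} \right)^2 \tag{6.2.93} \\
&\equiv \left( \sum_{x \in \mathcal{X}} \sqrt{p(x) q(x)} \right)^2, \tag{6.2.94}
\end{align}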
where we have let 𝑝(𝑥) B Tr[Λ𝑥 𝜌] and 𝑞(𝑥) B Tr[Λ𝑥 𝜎]. Using this, observe that
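\begin{align}
\sum_{x \in \mathcal{X}} \left( \sqrt{p(x)} - \sqrt{q(x)} \right)^2 &= \sum_{x \in \mathcal{X}} \left( p(x) - 2\sqrt{p(x) q(x)} + q(x) \right) \tag{6.2.95} \\
&= 2 - 2 \sum_{x \in \mathcal{X}} \sqrt{p(x) q(x)} \tag{6.2.96} \\
&= 2 - 2\sqrt{F(\rho, \sigma)}. \tag{6.2.97}
\end{align}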
Using this, and the fact that $|\sqrt{p(x)} - \sqrt{q(x)}| \le |\sqrt{p(x)} + \sqrt{q(x)}|$, we obtain
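\begin{align}
\sum_{x \in \mathcal{X}} \left( \sqrt{p(x)} - \sqrt{q(x)} \right)^2 &\le \sum_{x \in \mathcal{X}} \left| \sqrt{p(x)} - \sqrt{q(x)} \right| \left| \sqrt{p(x)} + \sqrt{q(x)} \right| \tag{6.2.100} \\
&= \sum_{x \in \mathcal{X}} |p(x) - q(x)| \tag{6.2.101} \\
&\le \left\| \rho - \sigma \right\|_1. \tag{6.2.102}
\end{align}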
So we have
\[ 2 - 2\sqrt{F(\rho, \sigma)} = \sum_{x \in \mathcal{X}} \left( \sqrt{p(x)} - \sqrt{q(x)} \right)^2 \le \left\| \rho - \sigma \right\|_1, \tag{6.2.103} \]
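The two bounds derived in this proof, 1 − √𝐹(𝜌, 𝜎) ≤ ½‖𝜌 − 𝜎‖₁ from (6.2.103) and ½‖𝜌 − 𝜎‖₁ ≤ √(1 − 𝐹(𝜌, 𝜎)) from (6.2.92), can be checked on examples. The sketch below uses hypothetical helper names and example states chosen for illustration; the second pair consists of two pure states, for which the upper bound is attained.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = ||sqrt(rho) sqrt(sigma)||_1^2."""
    singular_values = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    return float(singular_values.sum() ** 2)

def trace_distance(rho, sigma):
    """Normalized trace distance (1/2)||rho - sigma||_1."""
    return 0.5 * np.abs(np.linalg.eigvalsh(rho - sigma)).sum()

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
sigma = np.array([[0.4, 0.1j], [-0.1j, 0.6]])
zero = np.diag([1.0, 0.0])        # |0><0|
plus = np.full((2, 2), 0.5)       # |+><+|

bounds = []
for r, s in [(rho, sigma), (zero, plus)]:
    f = fidelity(r, s)
    lower = 1.0 - np.sqrt(f)                 # left-hand side of (6.2.103), divided by 2
    upper = np.sqrt(max(0.0, 1.0 - f))       # right-hand side of (6.2.92)
    bounds.append((lower, trace_distance(r, s), upper))
```

Each entry of `bounds` is a triple (lower bound, normalized trace distance, upper bound) that should be non-decreasing from left to right.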
Proof: Suppose first that 𝜌 is a pure state |𝜓⟩⟨𝜓|. The post-measurement state is
then
\[ \frac{\sqrt{\Lambda}|\psi\rangle\langle\psi|\sqrt{\Lambda}}{\langle\psi|\Lambda|\psi\rangle}. \tag{6.2.107} \]
The fidelity between the original state |𝜓⟩ and the post-measurement state above is
as follows:
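\begin{align}
\langle\psi| \left( \frac{\sqrt{\Lambda}|\psi\rangle\langle\psi|\sqrt{\Lambda}}{\langle\psi|\Lambda|\psi\rangle} \right) |\psi\rangle = \frac{|\langle\psi|\sqrt{\Lambda}|\psi\rangle|^2}{\langle\psi|\Lambda|\psi\rangle} &\ge \frac{|\langle\psi|\Lambda|\psi\rangle|^2}{\langle\psi|\Lambda|\psi\rangle} \tag{6.2.108} \\
&= \langle\psi|\Lambda|\psi\rangle \ge 1 - \varepsilon. \tag{6.2.109}
\end{align}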
The first inequality follows because $\sqrt{\Lambda} \ge \Lambda$ when $\Lambda \le I$. The second inequality
follows from the hypothesis of the lemma. Now let us consider when we have
mixed states 𝜌 𝐴 and 𝜌′𝐴 . Suppose |𝜓⟩ 𝑅 𝐴 and |𝜓 ′⟩ 𝑅 𝐴 are respective purifications of
𝜌 𝐴 and 𝜌′𝐴 , where
\[ |\psi'\rangle_{RA} \equiv \frac{(I_R \otimes \sqrt{\Lambda_A}) |\psi\rangle_{RA}}{\sqrt{\langle\psi| I_R \otimes \Lambda_A |\psi\rangle_{RA}}}. \tag{6.2.110} \]
Then we can apply the data-processing inequality for fidelity (Proposition 6.13)
and the result above for pure states to conclude that
𝐹 (𝜌 𝐴 , 𝜌′𝐴 ) ≥ 𝐹 (𝜓 𝑅 𝐴 , 𝜓 ′𝑅 𝐴 ) ≥ 1 − 𝜀. (6.2.111)
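The bound (6.2.111) can be illustrated with a small example. The sketch below, with an example state and measurement operator chosen here as assumptions, forms the post-measurement state √Λ𝜌√Λ / Tr[Λ𝜌] and compares its fidelity with 𝜌 against the success probability Tr[Λ𝜌], which plays the role of 1 − 𝜀.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = ||sqrt(rho) sqrt(sigma)||_1^2."""
    singular_values = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    return float(singular_values.sum() ** 2)

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
measurement_operator = np.diag([0.99, 0.9])       # satisfies 0 <= Lambda <= 1

success = np.trace(measurement_operator @ rho).real   # Tr[Lambda rho] = 1 - epsilon
root = psd_sqrt(measurement_operator)
post_state = root @ rho @ root / success              # post-measurement state
f_post = fidelity(rho, post_state)
```

For these values `success` equals 0.963, and `f_post` is at least as large, so the state is disturbed only slightly by an outcome that was nearly certain.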
Unlike the trace distance, the fidelity is not a distance measure in the mathematical
sense because it does not satisfy the triangle inequality. The following distance
measure based on the fidelity, however, does satisfy the triangle inequality, along
with the other properties that define a distance measure.
The measure 𝑃(𝜌, 𝜎) is known as the sine distance due to the fact that 𝐹(𝜌, 𝜎)
has the interpretation as the largest value of the squared cosine of the angle
between two arbitrary purifications of 𝜌 and 𝜎 (see Theorem 6.8), which means
that $\sqrt{1 - F(\rho, \sigma)}$ has the interpretation as the sine of the same angle. Related to
this interpretation, the measure 𝑃(𝜌, 𝜎) is equal to the minimum trace distance
between purifications of 𝜌 and 𝜎:
\[ \inf_{|\psi^\rho\rangle_{RA}, |\psi^\sigma\rangle_{RA}} \frac{1}{2} \left\| |\psi^\rho\rangle\langle\psi^\rho|_{RA} - |\psi^\sigma\rangle\langle\psi^\sigma|_{RA} \right\|_1 = \inf_{|\psi^\rho\rangle_{RA}, |\psi^\sigma\rangle_{RA}} \sqrt{1 - |\langle\psi^\rho|\psi^\sigma\rangle_{RA}|^2} = P(\rho, \sigma), \tag{6.2.113} \]
where |Γ⟩ 𝑅 𝐴 is the maximally entangled vector from (2.2.34). Recalling (6.1.1),
for pure states |𝜙⟩ and |𝜑⟩, we have that
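\[ \frac{1}{2} \left\| |\phi\rangle\langle\phi| - |\varphi\rangle\langle\varphi| \right\|_1 = \sqrt{1 - F(\phi, \varphi)} = \sqrt{1 - |\langle\phi|\varphi\rangle|^2}. \tag{6.2.119} \]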
Let 𝑈 𝑅 and 𝑉𝑅 be arbitrary unitaries acting on the reference system 𝑅. From the
fact that trace distance obeys the triangle inequality and the equality given above,
we find that
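\begin{multline}
\sqrt{1 - |\langle\psi^\sigma|_{RA} (W_R \otimes \mathbb{1}_A) |\psi^\rho\rangle_{RA}|^2} \le \sqrt{1 - |\langle\psi^\sigma|_{RA} (U_R^\dagger \otimes \mathbb{1}_A) |\psi^\omega\rangle_{RA}|^2} \\
+ \sqrt{1 - |\langle\psi^\omega|_{RA} (V_R \otimes \mathbb{1}_A) |\psi^\rho\rangle_{RA}|^2}, \tag{6.2.120}
\end{multline}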
Since the inequality holds for arbitrary unitaries 𝑈 and 𝑉, it holds for the minimum of
each term on the right, and so this, combined with Uhlmann’s theorem (Theorem 6.8),
implies the desired result:
\[ \sqrt{1 - F(\sigma_A, \rho_A)} \le \sqrt{1 - F(\sigma_A, \omega_A)} + \sqrt{1 - F(\omega_A, \rho_A)}, \tag{6.2.122} \]
where the optimization is over all states 𝜌 𝑅 𝐴 , with the dimension of 𝑅 unbounded.
Proposition 6.19
The diamond norm of a Hermiticity-preserving map P 𝐴→𝐵 can be calculated as
where the optimization is over all pure states 𝜓 𝑅 𝐴 , such that the dimension of 𝑅
is equal to the dimension of the system 𝐴.
From the Schmidt decomposition (Theorem 2.2), it follows that the rank of the
reduced state 𝜓 𝑅 is no larger than the dimension of system 𝐴. So then it suffices to
optimize with respect to all pure states 𝜓 𝑅 𝐴 , such that the dimension of 𝑅 is equal
to the dimension of the system 𝐴. ■
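is
\[ p^*_{\text{err}}(\rho_{RA}) = \frac{1}{2} \left( 1 - \left\| \lambda \mathcal{N}_{A \to B}(\rho_{RA}) - (1 - \lambda) \mathcal{M}_{A \to B}(\rho_{RA}) \right\|_1 \right). \tag{6.3.12} \]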
Then, optimizing over all input states 𝜌 𝑅 𝐴 in order to minimize the error probability,
we find that
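\begin{align}
\inf_{\rho_{RA}} p^*_{\text{err}}(\rho_{RA}) &= \frac{1}{2} \left( 1 - \sup_{\rho_{RA}} \left\| (\lambda \mathcal{N}_{A \to B} - (1 - \lambda) \mathcal{M}_{A \to B})(\rho_{RA}) \right\|_1 \right) \tag{6.3.13} \\
&= \frac{1}{2} \left( 1 - \left\| \lambda \mathcal{N} - (1 - \lambda) \mathcal{M} \right\|_\diamond \right), \tag{6.3.14}
\end{align}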
where the last line follows from the definition of the diamond norm. The optimal
error probability for the task of channel discrimination is thus a simple function of
the diamond norm.
where
\[ |\Phi\rangle_{RA} = \frac{1}{\sqrt{d}} \sum_{i=0}^{d-1} |i, i\rangle_{RA}. \tag{6.4.2} \]
Notice that the entanglement fidelity of a channel is the fidelity of the maximally
entangled state with the Choi state of the channel. Intuitively, then, the entanglement
fidelity quantifies how good a channel is at preserving the entanglement between
two systems when it acts on one of the two systems.
It turns out that the entanglement fidelity is very closely related to another
fidelity-based measure on quantum channels called the average fidelity:
\[ F(\mathcal{N}) \coloneqq \int_\psi \langle\psi|\mathcal{N}(\psi)|\psi\rangle \, \mathrm{d}\psi, \tag{6.4.3} \]
where we integrate over all pure states acting on the input Hilbert space of N with
respect to the Haar measure. The Haar measure is the uniform probability measure
on pure quantum states (see the remark after (2.5.16)). For a quantum channel N
with input system dimension 𝑑, the following identity holds
\[ F(\mathcal{N}) = \frac{d F_e(\mathcal{N}) + 1}{d + 1}. \tag{6.4.4} \]
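The identity (6.4.4) can be checked on a qubit example without sampling from the Haar measure. The sketch below relies on the standard fact that the six eigenstates of the Pauli operators form a 2-design for a qubit, so that their uniform average reproduces the Haar average of ⟨𝜓|N(𝜓)|𝜓⟩. The choice of the amplitude damping channel, the parameter `gamma`, and all function names are assumptions made for this illustration.

```python
import numpy as np

def amplitude_damping_kraus(gamma):
    """Kraus operators of the qubit amplitude damping channel with parameter gamma."""
    k0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1.0 - gamma)]])
    k1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
    return [k0, k1]

def apply_channel(kraus, rho):
    """Channel output sum_k K_k rho K_k^dagger."""
    return sum(k @ rho @ k.conj().T for k in kraus)

def entanglement_fidelity(kraus):
    """<Phi|(id ⊗ N)(Phi)|Phi> = (1/d^2) sum_k |Tr K_k|^2."""
    d = kraus[0].shape[0]
    return sum(abs(np.trace(k)) ** 2 for k in kraus) / d ** 2

def average_fidelity_qubit(kraus):
    """Average of <psi|N(psi)|psi> over the six Pauli eigenstates (a qubit 2-design)."""
    s = 1 / np.sqrt(2)
    vectors = ([1, 0], [0, 1], [s, s], [s, -s], [s, 1j * s], [s, -1j * s])
    values = []
    for vec in vectors:
        psi = np.array(vec, dtype=complex)
        output = apply_channel(kraus, np.outer(psi, psi.conj()))
        values.append(np.vdot(psi, output @ psi).real)
    return float(np.mean(values))

kraus = amplitude_damping_kraus(0.3)
f_e = entanglement_fidelity(kraus)
f_avg = average_fidelity_qubit(kraus)
f_from_identity = (2 * f_e + 1) / 3       # right-hand side of (6.4.4) with d = 2
```

For amplitude damping both sides evaluate to (4 − 𝛾 + 2√(1 − 𝛾))/6, which gives a closed form to compare against.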
Instead of taking the average as in (6.4.3), we can take the minimum over all
input states to obtain the minimum fidelity:
\[ F_{\min}(\mathcal{N}) \coloneqq \inf_\psi \langle\psi|\mathcal{N}(\psi)|\psi\rangle, \tag{6.4.5} \]
where the optimization is over all pure states 𝜓 acting on the input Hilbert space of
the channel N. By introducing a reference system 𝑅 and optimizing over all joint
states |𝜓⟩ 𝑅 𝐴 of 𝑅 and the input system 𝐴 of the channel N, we obtain a fidelity
measure that generalizes the entanglement fidelity.
where we take the infimum over all pure states |𝜓⟩ 𝑅 𝐴 , with the dimension of 𝑅
equal to the dimension of 𝐴.
Note that the state |𝜓⟩ 𝑅 𝐴 = |Φ⟩ 𝑅 𝐴 is a special case in the optimization in (6.4.6).
This implies that, for a channel N, 𝐹 (N) ≤ 𝐹𝑒 (N).
More generally, we define the fidelity between two quantum channels N 𝐴→𝐵
and M 𝐴→𝐵 as follows:
where the infimum is taken over all bipartite states 𝜌 𝑅 𝐴 , with the dimension of
𝑅 arbitrarily large.
Remark: Similar to the diamond distance, we define the channel fidelity as above in order
to indicate its operational meaning with an infimum over all possible input states, but it is not
necessary to take the infimum over all bipartite states. One can instead restrict the infimum to be
over pure bipartite states where the reference system 𝑅 is isomorphic to the channel input system
𝐴, so that
\[ F(\mathcal{N}, \mathcal{M}) = \inf_{\psi_{RA}} F(\mathcal{N}_{A \to B}(\psi_{RA}), \mathcal{M}_{A \to B}(\psi_{RA})), \tag{6.4.8} \]
where $\psi_{RA}$ is a pure bipartite state with system 𝑅 isomorphic to system 𝐴. The same statement
thus applies to (6.4.6). An argument for this is similar to that given in the proof of Proposition 6.19,
except using the joint concavity of root fidelity rather than convexity of the trace norm.
Here, we provide a different argument for this fact. First, we have that
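\[ \inf_{\rho_{RA}} F(\mathcal{N}_{A \to B}(\rho_{RA}), \mathcal{M}_{A \to B}(\rho_{RA})) \le \inf_{\psi_{RA}} F(\mathcal{N}_{A \to B}(\psi_{RA}), \mathcal{M}_{A \to B}(\psi_{RA})) \tag{6.4.9} \]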
which holds simply by restricting the optimization on the left-hand side to pure states.
Next, given a state 𝜌 𝑅 𝐴, with the dimension of 𝑅 not necessarily equal to the dimension of
𝐴, we can purify it to a state 𝜓 𝑅′ 𝑅 𝐴. Then, using the data-processing inequality for the fidelity
with respect to the partial trace channel Tr 𝑅′ (Proposition 6.13), we find that
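\begin{align}
F(\mathcal{N}_{A \to B}(\rho_{RA}), \mathcal{M}_{A \to B}(\rho_{RA})) &= F(\mathcal{N}_{A \to B}(\operatorname{Tr}_{R'}[\psi_{R'RA}]), \mathcal{M}_{A \to B}(\operatorname{Tr}_{R'}[\psi_{R'RA}])) \tag{6.4.10} \\
&= F(\operatorname{Tr}_{R'}[\mathcal{N}_{A \to B}(\psi_{R'RA})], \operatorname{Tr}_{R'}[\mathcal{M}_{A \to B}(\psi_{R'RA})]) \tag{6.4.11} \\
&\ge F(\mathcal{N}_{A \to B}(\psi_{R'RA}), \mathcal{M}_{A \to B}(\psi_{R'RA})) \tag{6.4.12} \\
&\ge \inf_{\psi_{R'RA}} F(\mathcal{N}_{A \to B}(\psi_{R'RA}), \mathcal{M}_{A \to B}(\psi_{R'RA})). \tag{6.4.13}
\end{align}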
Finally, by the Schmidt decomposition theorem (Theorem 2.2), for every pure state 𝜓 𝑅 𝐴, the
rank of the reduced state 𝜓 𝑅 need not exceed the dimension of 𝐴, implying that it suffices to
optimize over pure states for which the system 𝑅 has the same dimension as the system 𝐴. We
thus obtain
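\[ \inf_{\rho_{RA}} F(\mathcal{N}_{A \to B}(\rho_{RA}), \mathcal{M}_{A \to B}(\rho_{RA})) = \inf_{\psi_{RA}} F(\mathcal{N}_{A \to B}(\psi_{RA}), \mathcal{M}_{A \to B}(\psi_{RA})). \tag{6.4.15} \]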
We then have that 𝐹 (N) = 𝐹 (N, id). In other words, the fidelity 𝐹 (N) of
a quantum channel N can be viewed as the fidelity between N and the identity
channel id.
Similar to the diamond distance, the fidelity of quantum channels can be
computed by means of primal and dual semi-definite programs:
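\[ \sup_{Q_{RB}} \left\{ \lambda_{\min}(\operatorname{Re}[\operatorname{Tr}_B[Q_{RB}]]) : \begin{pmatrix} \Gamma^{\mathcal{N}}_{RB} & Q^\dagger_{RB} \\ Q_{RB} & \Gamma^{\mathcal{M}}_{RB} \end{pmatrix} \ge 0 \right\}, \tag{6.4.20} \]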
The inequality in (6.2.88) relating the fidelity between two states 𝜌 and 𝜎
and their trace distance can be used to relate the fidelity-based distance measure
𝐹(N, M) on channels and the diamond distance $\frac{1}{2}\|\mathcal{N} - \mathcal{M}\|_\diamond$. It is straightforward
to show that
\[ 1 - \sqrt{F(\mathcal{N}, \mathcal{M})} \le \frac{1}{2} \left\| \mathcal{N} - \mathcal{M} \right\|_\diamond \le \sqrt{1 - F(\mathcal{N}, \mathcal{M})}. \tag{6.4.21} \]
Proposition 6.25
Let N be a quantum channel. For all 𝜀 ∈ [0, 1], if $F_{\min}(\mathcal{N}) \ge 1 - \varepsilon$, then
$F(\mathcal{N}) \ge 1 - 2\sqrt{\varepsilon}$.
Proof: The inequality 𝐹min (N) ≥ 1 − 𝜀 implies that the following inequality
holds for all state vectors |𝜙⟩ ∈ H:
\[ |w_k\rangle \coloneqq \frac{|\phi\rangle + i^k |\phi^\perp\rangle}{\sqrt{2}}, \tag{6.4.25} \]
for 𝑘 ∈ {0, 1, 2, 3}. Then, it follows that
\[ |\phi\rangle\langle\phi^\perp| = \frac{1}{2} \sum_{k=0}^{3} i^k |w_k\rangle\langle w_k|. \tag{6.4.26} \]
The first inequality follows from the characterization of the operator norm in
(2.2.121) as ∥ 𝑋 ∥ ∞ = sup|𝜙⟩,|𝜓⟩ |⟨𝜓|𝑋 |𝜙⟩|, where the optimization is with respect to
pure states. The second inequality follows from substituting (6.4.26) and applying
the triangle inequality and homogeneity of the ∞-norm. The third inequality follows
because the ∞-norm of a traceless Hermitian operator is bounded from above by
half of its trace norm (see Lemma 2.11 below). The final inequality follows from
applying (6.4.23). Let |𝜓⟩ ∈ H′ ⊗ H be an arbitrary state. All such states have a
Schmidt decomposition of the following form:
\[ |\psi\rangle = \sum_x \sqrt{p(x)}\, |\zeta_x\rangle \otimes |\varphi_x\rangle, \tag{6.4.31} \]
where {𝑝(𝑥)}𝑥 is a probability distribution and {|𝜁𝑥 ⟩}𝑥 and {|𝜑𝑥 ⟩}𝑥 are sets of
states. Then, consider that
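\begin{align}
& 1 - \langle\psi| (\operatorname{id}_{\mathcal{H}'} \otimes \mathcal{N})(|\psi\rangle\langle\psi|) |\psi\rangle \notag \\
&\quad = \langle\psi| (\operatorname{id}_{\mathcal{H}'} \otimes \operatorname{id}_{\mathcal{H}} - \operatorname{id}_{\mathcal{H}'} \otimes \mathcal{N})(|\psi\rangle\langle\psi|) |\psi\rangle \tag{6.4.32} \\
&\quad = \langle\psi| (\operatorname{id}_{\mathcal{H}'} \otimes [\operatorname{id}_{\mathcal{H}} - \mathcal{N}])(|\psi\rangle\langle\psi|) |\psi\rangle \tag{6.4.33} \\
&\quad = \sum_{x,y} p(x) p(y) \langle\varphi_x| \left( |\varphi_x\rangle\langle\varphi_y| - \mathcal{N}(|\varphi_x\rangle\langle\varphi_y|) \right) |\varphi_y\rangle. \tag{6.4.34}
\end{align}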
Now, applying the triangle inequality and (6.4.24), we find that the following holds
for all |𝜓⟩ ∈ H′ ⊗ H:
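\begin{align}
& 1 - \langle\psi| (\operatorname{id}_{\mathcal{H}'} \otimes \mathcal{N})(|\psi\rangle\langle\psi|) |\psi\rangle \notag \\
&\quad = \sum_{x,y} p(x) p(y) \langle\varphi_x| \left( |\varphi_x\rangle\langle\varphi_y| - \mathcal{N}(|\varphi_x\rangle\langle\varphi_y|) \right) |\varphi_y\rangle \tag{6.4.35} \\
&\quad \le \sum_{x,y} p(x) p(y) \left| \langle\varphi_x| \left( |\varphi_x\rangle\langle\varphi_y| - \mathcal{N}(|\varphi_x\rangle\langle\varphi_y|) \right) |\varphi_y\rangle \right| \tag{6.4.36} \\
&\quad \le 2\sqrt{\varepsilon}. \tag{6.4.37}
\end{align}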
and Wilde (2021). Proposition 6.25 was established by Barnum et al. (2000) and
reviewed by Kretschmann and Werner (2004). Here we followed the proof given
by Watrous (2018, Theorem 3.56), which establishes a relation between the
trace distance and the diamond distance between an arbitrary channel and the identity
channel.
due to the fact that the set of pure states with reduced state 𝜓 𝑅 positive definite is
dense in the set of all pure states. Now, recall from (2.2.38) that any such pure
state can be written as 𝜓 𝑅 𝐴 = 𝑋 𝑅 Γ𝑅 𝐴 𝑋 𝑅† for some linear operator 𝑋 𝑅 such that
Tr[𝑋 𝑅† 𝑋 𝑅 ] = 1 and |𝑋 𝑅 | > 0, where Γ𝑅 𝐴 is defined in (2.2.34). Using this, we find
that the objective function can be rewritten as
0 ≤ Λ 𝑅𝐵 ≤ 1 𝑅𝐵 ⇔ 0 ≤ 𝑋 𝑅† Λ 𝑅𝐵 𝑋 𝑅 ≤ 𝑋 𝑅† 𝑋 𝑅 ⊗ 1𝐵 . (6.A.6)
Now, to establish the dual SDP in (6.3.10), we first determine the adjoint Φ† of
Φ using
Tr[Φ(𝑌 )𝑍] = Tr[𝑌 Φ† (𝑍)], (6.A.13)
where without loss of generality we can take 𝑍 to be
\[ Z \coloneqq \begin{pmatrix} Z_{RB} & 0 & 0 \\ 0 & \mu_1 & 0 \\ 0 & 0 & \mu_2 \end{pmatrix}. \tag{6.A.14} \]
Then, we find that
then becomes
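\[ \inf_{\mu_1 \ge 0,\ \mu_2 \ge 0,\ Z_{RB} \ge 0} \left\{ \mu_1 - \mu_2 : Z_{RB} \ge \Gamma^{\mathcal{N}}_{RB} - \Gamma^{\mathcal{M}}_{RB},\ (\mu_1 - \mu_2) \mathbb{1}_R \ge \operatorname{Tr}_B[Z_{RB}] \right\}. \tag{6.A.19} \]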
Now, observe that the variables 𝜇1 and 𝜇2 always appear together in the above
optimization as 𝜇1 − 𝜇2, and so can be reduced to a single real variable 𝜇 ∈ R.
Then, the condition $\mu \mathbb{1}_R \ge \operatorname{Tr}_B[Z_{RB}]$ implies that 𝜇 ≥ 0. Thus, the optimization
in (6.A.19) can be simplified to
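\[ \inf_{\mu \ge 0,\ Z_{RB} \ge 0} \left\{ \mu : Z_{RB} \ge \Gamma^{\mathcal{N}}_{RB} - \Gamma^{\mathcal{M}}_{RB},\ \mu \mathbb{1}_R \ge \operatorname{Tr}_B[Z_{RB}] \right\}, \tag{6.A.20} \]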
which is precisely (6.3.10). Equality of the primal and dual SDPs is due to strong
duality, which holds for the SDP in (6.3.10) because $Z_{RB} = \Gamma^{\mathcal{N}}_{RB} - \Gamma^{\mathcal{M}}_{RB} + \delta \mathbb{1}_{RB}$
and $\mu = \operatorname{Tr}_B[Z_{RB}] + \delta \mathbb{1}_R$ together form a strictly feasible point for all 𝛿 > 0 and a
feasible point for the primal is 𝜌 𝑅 = 𝜋 𝑅 and Ω 𝑅𝐵 = 𝜋 𝑅 ⊗ 1𝐵 .
Finally, the equality
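\begin{align}
& \inf_{\mu \ge 0,\ Z_{RB} \ge 0} \left\{ \mu : Z_{RB} \ge \Gamma^{\mathcal{N}}_{RB} - \Gamma^{\mathcal{M}}_{RB},\ \mu \mathbb{1}_R \ge \operatorname{Tr}_B[Z_{RB}] \right\} \notag \\
&\quad = \inf_{Z_{RB} \ge 0} \left\{ \left\| \operatorname{Tr}_B[Z_{RB}] \right\|_\infty : Z_{RB} \ge \Gamma^{\mathcal{N}}_{RB} - \Gamma^{\mathcal{M}}_{RB} \right\} \tag{6.A.21}
\end{align}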
holds by the expression in (2.4.47) for the Schatten ∞-norm for positive semi-definite
operators.
First, let us verify that strong duality holds for the primal and dual semi-definite
programs in (6.2.4) and (6.2.5), respectively. Consider that 𝑋 = 0 is a feasible
choice for the primal program, while $Y = Z = 2\mathbb{1}$ is strictly feasible for the dual
program. Thus, strong duality holds according to Theorem 2.28.
In order to prove the equality in (6.2.4), we start with the following lemma:
Lemma 6.26
Let 𝑃 and 𝑄 be positive semi-definite operators in L(H), and let 𝑋 ∈ L(H).
Then the operator
\[
\begin{pmatrix} P & X \\ X^\dagger & Q \end{pmatrix} \tag{6.B.1}
\]
\begin{align}
&= \left\| \sqrt{\rho} \sqrt{\sigma} \right\|_1 \notag \\
&= \sqrt{F}(\rho, \sigma). \tag{6.B.5}
\end{align}
The first equality follows from Lemma 6.26. The third equality follows because we
can use the optimization over 𝐾 to adjust a global phase such that the real part is
equal to the absolute value (here, one should think of the fact that Re[𝑧] = 𝑟 cos(𝜃)
for 𝑧 = 𝑟𝑒𝑖𝜃 , and then one can optimize the value of 𝜃 so that Re[𝑧] = 𝑟). The
final equality follows by a generalization of Proposition 2.10 (in fact the same
proof given there implies that the optimization can be with respect to 𝑈 satisfying
∥𝑈 ∥ ∞ ≤ 1, rather than just with respect to isometries).
We now prove that (6.2.5) is the dual program of (6.2.4). We can rewrite the
primal SDP as
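\[
\frac{1}{2} \sup \left( \operatorname{Tr}[X] + \operatorname{Tr}[X^\dagger] \right) \tag{6.B.6}
\]
subject to
\[
\begin{pmatrix} \rho & 0 \\ 0 & \sigma \end{pmatrix} \geq \begin{pmatrix} 0 & X \\ X^\dagger & 0 \end{pmatrix}, \qquad \begin{pmatrix} R & X \\ X^\dagger & S \end{pmatrix} \geq 0 \tag{6.B.7}
\]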
because 𝑅 and 𝑆 are not involved in the objective function and can always be chosen
so that the second operator is PSD. Also, the following equivalences hold
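\begin{align}
\begin{pmatrix} \rho & X \\ X^\dagger & \sigma \end{pmatrix} \geq 0 \quad & \Longleftrightarrow \quad \begin{pmatrix} \rho & -X \\ -X^\dagger & \sigma \end{pmatrix} \geq 0 \tag{6.B.8} \\
& \Longleftrightarrow \quad \begin{pmatrix} \rho & 0 \\ 0 & \sigma \end{pmatrix} \geq \begin{pmatrix} 0 & X \\ X^\dagger & 0 \end{pmatrix}. \tag{6.B.9}
\end{align}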
As given in (2.4.3) and (2.4.4), the standard forms of primal and dual SDPs for
Hermitian 𝐴 and 𝐵 and Hermiticity-preserving map Φ are as follows:
Setting
\[
Y = \begin{pmatrix} W & V \\ V^\dagger & Z \end{pmatrix}, \tag{6.B.14}
\]
the adjoint of Φ is given by
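\begin{align}
\operatorname{Tr}[Y \Phi(X)] &= \operatorname{Tr}\!\left[ \begin{pmatrix} W & V \\ V^\dagger & Z \end{pmatrix} \begin{pmatrix} 0 & X \\ X^\dagger & 0 \end{pmatrix} \right] \tag{6.B.15} \\
&= \operatorname{Tr}\!\left[ \begin{pmatrix} V X^\dagger & W X \\ Z X^\dagger & V^\dagger X \end{pmatrix} \right] \tag{6.B.16} \\
&= \operatorname{Tr}[V X^\dagger] + \operatorname{Tr}[X V^\dagger] \tag{6.B.17} \\
&= \operatorname{Tr}\!\left[ \begin{pmatrix} 0 & V \\ V^\dagger & 0 \end{pmatrix} \begin{pmatrix} R & X \\ X^\dagger & S \end{pmatrix} \right], \tag{6.B.18}
\end{align}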
so that
\[
\Phi^\dagger(Y) = \begin{pmatrix} 0 & V \\ V^\dagger & 0 \end{pmatrix}. \tag{6.B.19}
\]
Then the dual is given by
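\[
\frac{1}{2} \inf \operatorname{Tr}\!\left[ \begin{pmatrix} \rho & 0 \\ 0 & \sigma \end{pmatrix} \begin{pmatrix} W & V \\ V^\dagger & Z \end{pmatrix} \right] \tag{6.B.20}
\]
subject to
\[
\begin{pmatrix} 0 & V \\ V^\dagger & 0 \end{pmatrix} \geq \begin{pmatrix} 0 & \mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}, \qquad \begin{pmatrix} W & V \\ V^\dagger & Z \end{pmatrix} \geq 0. \tag{6.B.21}
\]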
This simplifies to
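\[
\frac{1}{2} \inf \left( \operatorname{Tr}[\rho W] + \operatorname{Tr}[\sigma Z] \right), \tag{6.B.22}
\]
subject to
\[
\begin{pmatrix} 0 & V \\ V^\dagger & 0 \end{pmatrix} \geq \begin{pmatrix} 0 & \mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}, \qquad \begin{pmatrix} W & V \\ V^\dagger & Z \end{pmatrix} \geq 0. \tag{6.B.23}
\]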
Since
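\begin{align}
\begin{pmatrix} W & V \\ V^\dagger & Z \end{pmatrix} \geq 0 \quad & \Longleftrightarrow \quad \begin{pmatrix} W & -V \\ -V^\dagger & Z \end{pmatrix} \geq 0 \tag{6.B.24} \\
& \Longleftrightarrow \quad \begin{pmatrix} W & 0 \\ 0 & Z \end{pmatrix} \geq \begin{pmatrix} 0 & V \\ V^\dagger & 0 \end{pmatrix}, \tag{6.B.25}
\end{align}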
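\[
\begin{pmatrix} W & 0 \\ 0 & Z \end{pmatrix} \geq \begin{pmatrix} 0 & V \\ V^\dagger & 0 \end{pmatrix} \geq \begin{pmatrix} 0 & \mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}, \tag{6.B.26}
\]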
and since 𝑉 plays no role in the objective function, we can set 𝑉 = 1. So the final
SDP simplifies as follows:
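\[
\frac{1}{2} \inf_{W, Z} \left( \operatorname{Tr}[\rho W] + \operatorname{Tr}[\sigma Z] \right) \tag{6.B.27}
\]
subject to
\[
\begin{pmatrix} W & -\mathbb{1} \\ -\mathbb{1} & Z \end{pmatrix} \geq 0. \tag{6.B.28}
\]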
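\[
\begin{pmatrix} W & -\mathbb{1} \\ -\mathbb{1} & Z \end{pmatrix} \geq 0 \quad \Longleftrightarrow \quad \begin{pmatrix} W & \mathbb{1} \\ \mathbb{1} & Z \end{pmatrix} \geq 0 \tag{6.B.29}
\]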
𝜓 𝑅 𝐴 = 𝑋 𝑅 Γ𝑅 𝐴 𝑋 𝑅† , (6.B.32)
with
\[
W'_{RB} := X_R^\dagger W_{RB} X_R, \qquad Z'_{RB} := X_R^\dagger Z_{RB} X_R \tag{6.B.37}
\]
Now consider that the inequality in (6.B.35) is equivalent to
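\[
\begin{pmatrix} X_R & 0 \\ 0 & X_R \end{pmatrix}^{\dagger} \begin{pmatrix} W_{RB} & \mathbb{1}_{RB} \\ \mathbb{1}_{RB} & Z_{RB} \end{pmatrix} \begin{pmatrix} X_R & 0 \\ 0 & X_R \end{pmatrix} \geq 0. \tag{6.B.38}
\]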
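\begin{align}
& \begin{pmatrix} X_R & 0 \\ 0 & X_R \end{pmatrix}^{\dagger} \begin{pmatrix} W_{RB} & \mathbb{1}_{RB} \\ \mathbb{1}_{RB} & Z_{RB} \end{pmatrix} \begin{pmatrix} X_R & 0 \\ 0 & X_R \end{pmatrix} \notag \\
& \quad = \begin{pmatrix} X_R^\dagger W_{RB} X_R & X_R^\dagger X_R \otimes \mathbb{1}_B \\ X_R^\dagger X_R \otimes \mathbb{1}_B & X_R^\dagger Z_{RB} X_R \end{pmatrix} \tag{6.B.39} \\
& \quad = \begin{pmatrix} W'_{RB} & \rho_R \otimes \mathbb{1}_B \\ \rho_R \otimes \mathbb{1}_B & Z'_{RB} \end{pmatrix}, \tag{6.B.40}
\end{align}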
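\[
\rho_R \geq 0, \qquad \operatorname{Tr}[\rho_R] = 1, \qquad \begin{pmatrix} W_{RB} & \rho_R \otimes \mathbb{1}_B \\ \rho_R \otimes \mathbb{1}_B & Z_{RB} \end{pmatrix} \geq 0. \tag{6.B.42}
\]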
Now let us calculate the dual SDP to this, using the following standard forms
for primal and dual SDPs, with Hermitian operators 𝐴 and 𝐵 and a Hermiticity-
preserving map Φ (as given in (2.4.3) and (2.4.4)):
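\[
\sup_{X \geq 0} \left\{ \operatorname{Tr}[A X] : \Phi(X) \leq B \right\}, \qquad \inf_{Y \geq 0} \left\{ \operatorname{Tr}[B Y] : \Phi^\dagger(Y) \geq A \right\}. \tag{6.B.43}
\]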
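\begin{align}
& \operatorname{Tr}[X \Phi^\dagger(Y)] \notag \\
& = \operatorname{Tr}\!\left[ \begin{pmatrix} P_{RB} & Q_{RB}^\dagger & 0 & 0 \\ Q_{RB} & S_{RB} & 0 & 0 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \mu \end{pmatrix} \begin{pmatrix} W_{RB} & \rho_R \otimes \mathbb{1}_B & 0 & 0 \\ \rho_R \otimes \mathbb{1}_B & Z_{RB} & 0 & 0 \\ 0 & 0 & \operatorname{Tr}[\rho_R] & 0 \\ 0 & 0 & 0 & -\operatorname{Tr}[\rho_R] \end{pmatrix} \right] \notag \\
& = \operatorname{Tr}[P_{RB} W_{RB}] + \operatorname{Tr}[Q_{RB}^\dagger (\rho_R \otimes \mathbb{1}_B)] + \operatorname{Tr}[Q_{RB} (\rho_R \otimes \mathbb{1}_B)] \notag \\
& \qquad + \operatorname{Tr}[S_{RB} Z_{RB}] + (\lambda - \mu) \operatorname{Tr}[\rho_R] \notag \\
& = \operatorname{Tr}[P_{RB} W_{RB}] + \operatorname{Tr}[S_{RB} Z_{RB}] + \operatorname{Tr}\!\left[ \left( \operatorname{Tr}_B[Q_{RB} + Q_{RB}^\dagger] + (\lambda - \mu) \mathbb{1}_R \right) \rho_R \right] \notag
\end{align}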
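\[
= \operatorname{Tr}\!\left[ \begin{pmatrix} P_{RB} & 0 & 0 \\ 0 & S_{RB} & 0 \\ 0 & 0 & \operatorname{Tr}_B[Q_{RB} + Q_{RB}^\dagger] + (\lambda - \mu) \mathbb{1}_R \end{pmatrix} \begin{pmatrix} W_{RB} & 0 & 0 \\ 0 & Z_{RB} & 0 \\ 0 & 0 & \rho_R \end{pmatrix} \right]. \tag{6.B.48}
\]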
So then
\[
\Phi(X) = \begin{pmatrix} P_{RB} & 0 & 0 \\ 0 & S_{RB} & 0 \\ 0 & 0 & \operatorname{Tr}_B[Q_{RB} + Q_{RB}^\dagger] + (\lambda - \mu) \mathbb{1}_R \end{pmatrix}. \tag{6.B.49}
\]
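\[
\begin{pmatrix} P_{RB} & 0 & 0 \\ 0 & S_{RB} & 0 \\ 0 & 0 & \operatorname{Tr}_B[Q_{RB} + Q_{RB}^\dagger] + (\lambda - \mu) \mathbb{1}_R \end{pmatrix} \leq \begin{pmatrix} \Gamma^{\mathcal{N}}_{RB} & 0 & 0 \\ 0 & \Gamma^{\mathcal{M}}_{RB} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{6.B.51}
\]
\[
\begin{pmatrix} P_{RB} & Q_{RB}^\dagger & 0 & 0 \\ Q_{RB} & S_{RB} & 0 & 0 \\ 0 & 0 & \lambda & 0 \\ 0 & 0 & 0 & \mu \end{pmatrix} \geq 0, \tag{6.B.52}
\]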
which simplifies to
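\[
\frac{1}{2} \sup (\lambda - \mu) \tag{6.B.53}
\]
subject to
\begin{align}
P_{RB} & \leq \Gamma^{\mathcal{N}}_{RB}, \tag{6.B.54} \\
S_{RB} & \leq \Gamma^{\mathcal{M}}_{RB}, \tag{6.B.55} \\
\operatorname{Tr}_B[Q_{RB} + Q_{RB}^\dagger] + (\lambda - \mu) \mathbb{1}_R & \leq 0, \tag{6.B.56} \\
\begin{pmatrix} P_{RB} & Q_{RB}^\dagger \\ Q_{RB} & S_{RB} \end{pmatrix} & \geq 0, \tag{6.B.57} \\
\lambda, \mu & \geq 0. \tag{6.B.58}
\end{align}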
We can simplify this even more. We can set 𝜆′ = 𝜆 − 𝜇 ∈ R, and we can substitute
𝑄 𝑅𝐵 with −𝑄 𝑅𝐵 without changing the value, so then it becomes
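\[
\frac{1}{2} \sup \lambda' \tag{6.B.59}
\]
subject to
\begin{align}
P_{RB} & \leq \Gamma^{\mathcal{N}}_{RB}, \tag{6.B.60} \\
S_{RB} & \leq \Gamma^{\mathcal{M}}_{RB}, \tag{6.B.61} \\
\lambda' \mathbb{1}_R & \leq \operatorname{Tr}_B[Q_{RB} + Q_{RB}^\dagger], \tag{6.B.62} \\
\begin{pmatrix} P_{RB} & -Q_{RB}^\dagger \\ -Q_{RB} & S_{RB} \end{pmatrix} & \geq 0, \tag{6.B.63} \\
\lambda' & \in \mathbb{R}. \tag{6.B.64}
\end{align}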
We can rewrite
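\begin{align}
\begin{pmatrix} P_{RB} & -Q_{RB}^\dagger \\ -Q_{RB} & S_{RB} \end{pmatrix} \geq 0 \quad & \Longleftrightarrow \quad \begin{pmatrix} P_{RB} & Q_{RB}^\dagger \\ Q_{RB} & S_{RB} \end{pmatrix} \geq 0 \tag{6.B.65} \\
& \Longleftrightarrow \quad \begin{pmatrix} P_{RB} & 0 \\ 0 & S_{RB} \end{pmatrix} \geq \begin{pmatrix} 0 & -Q_{RB}^\dagger \\ -Q_{RB} & 0 \end{pmatrix} \tag{6.B.66}
\end{align}
\[
\begin{pmatrix} 0 & -Q_{RB}^\dagger \\ -Q_{RB} & 0 \end{pmatrix} \leq \begin{pmatrix} P_{RB} & 0 \\ 0 & S_{RB} \end{pmatrix} \leq \begin{pmatrix} \Gamma^{\mathcal{N}}_{RB} & 0 \\ 0 & \Gamma^{\mathcal{M}}_{RB} \end{pmatrix}. \tag{6.B.67}
\]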
Since 𝑃 𝑅𝐵 and 𝑆 𝑅𝐵 do not appear in the objective function, we can set them to their
largest value and obtain the following simplification
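\[
\frac{1}{2} \sup \lambda' \tag{6.B.68}
\]
subject to
\[
\lambda' \mathbb{1}_R \leq \operatorname{Tr}_B[Q_{RB} + Q_{RB}^\dagger], \qquad \begin{pmatrix} \Gamma^{\mathcal{N}}_{RB} & Q_{RB}^\dagger \\ Q_{RB} & \Gamma^{\mathcal{M}}_{RB} \end{pmatrix} \geq 0, \qquad \lambda' \in \mathbb{R} \tag{6.B.69}
\]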
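\begin{align}
& \frac{1}{2} \sup_{\lambda \geq 0,\, Q_{RB}} \left\{ \lambda : \lambda \mathbb{1}_R \leq \operatorname{Tr}_B[Q_{RB} + Q_{RB}^\dagger],\ \begin{pmatrix} \Gamma^{\mathcal{N}}_{RB} & Q_{RB}^\dagger \\ Q_{RB} & \Gamma^{\mathcal{M}}_{RB} \end{pmatrix} \geq 0 \right\} \notag \\
& \quad = \sup_{\lambda \geq 0,\, Q_{RB}} \left\{ \lambda : \lambda \mathbb{1}_R \leq \operatorname{Re}[\operatorname{Tr}_B[Q_{RB}]],\ \begin{pmatrix} \Gamma^{\mathcal{N}}_{RB} & Q_{RB}^\dagger \\ Q_{RB} & \Gamma^{\mathcal{M}}_{RB} \end{pmatrix} \geq 0 \right\}. \tag{6.B.70}
\end{align}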
This is equivalent to
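\[
\sup_{Q_{RB}} \left\{ \lambda_{\min}\!\left( \operatorname{Re}[\operatorname{Tr}_B[Q_{RB}]] \right) : \begin{pmatrix} \Gamma^{\mathcal{N}}_{RB} & Q_{RB}^\dagger \\ Q_{RB} & \Gamma^{\mathcal{M}}_{RB} \end{pmatrix} \geq 0 \right\}. \tag{6.B.71}
\]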
Chapter 7
7.1 Preview
Arguably one of the most important quantities in quantum information theory is the
von Neumann entropy, which is the quantum generalization of the Shannon entropy.
We also refer to it as the quantum entropy and do so from now on. For a quantum
system 𝐴 in the state 𝜌 𝐴 ∈ D(H 𝐴 ), the von Neumann entropy is defined as
Chapter 7: Quantum Entropies and Information
the quantum entropy quantifies the uncertainty about which of these pure states the
system 𝐴 is in. In particular, 𝐻 (𝜌 𝐴 ) is, in a rough sense, the expected information
gain upon performing an experiment to determine the state of the system.
Note that the right-hand side of (7.1.3) is the formula for the Shannon entropy
of the probability distribution {𝜆𝑖 }𝑖 corresponding to the eigenvalues of 𝜌 𝐴 . The
Shannon entropy of the probability distribution {𝑝, 1 − 𝑝}, for 𝑝 ∈ [0, 1], shows
up frequently and is denoted by ℎ2 ( 𝑝), i.e.,
\[
h_2(p) := -p \log_2 p - (1 - p) \log_2 (1 - p). \tag{7.1.4}
\]
It is called the binary entropy function.
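As a small illustration of (7.1.4), the following Python sketch evaluates the binary entropy function with the convention 0 log2 0 = 0. The function name `binary_entropy` is a hypothetical choice made for this example only.

```python
import math

def binary_entropy(p: float) -> float:
    """h2(p) = -p log2 p - (1 - p) log2 (1 - p), with 0 log2 0 = 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    total = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # skip zero terms, following the 0 log2 0 = 0 convention
            total -= q * math.log2(q)
    return total

print(binary_entropy(0.5))   # maximal uncertainty: one bit
print(binary_entropy(0.11))  # a biased coin carries less than one bit
```

The function is symmetric about p = 1/2 and vanishes at the endpoints, which matches the interpretation of the entropy as uncertainty.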
Just as the Shannon entropy has an operational meaning as the optimal rate of
(classical) data compression, the quantum entropy has an operational interpretation
as the optimal rate of quantum data compression. More precisely, given the state
$\rho_A^{\otimes n}$, the quantum entropy 𝐻 (𝜌 𝐴 ) is the minimum number of qubits per copy of the
state 𝜌 𝐴 that are needed to faithfully represent $\rho_A^{\otimes n}$, when 𝑛 becomes large. This
task is also called Schumacher compression.
Other fundamental information-theoretic quantities, which are functions of
the Shannon entropy, have straightforward generalizations to the quantum setting.
Let 𝜌 𝐴𝐵 be a bipartite state, and let 𝜎𝐴𝐵𝐶 be a tripartite state.
1. The quantum conditional entropy is defined as
\[
H(A|B)_\rho := H(AB)_\rho - H(B)_\rho. \tag{7.1.5}
\]
The quantum conditional entropy quantifies the uncertainty about the state of
the system 𝐴 in the presence of additional quantum side information in the
form of the quantum system 𝐵.
2. The coherent information is defined as
\[
I(A\rangle B)_\rho := H(B)_\rho - H(AB)_\rho = -H(A|B)_\rho, \tag{7.1.6}
\]
and it arises in the context of communication of quantum information over
quantum channels (see Chapter 14). The coherent information is asymmetric
and can be interpreted as having a directionality. We obtain a quantity called
the reverse coherent information by swapping the systems 𝐴 and 𝐵 in (7.1.6):
\[
I(B\rangle A)_\rho := H(A)_\rho - H(AB)_\rho = -H(B|A)_\rho. \tag{7.1.7}
\]
This quantity arises when studying feedback-assisted quantum communication.
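The definitions in (7.1.5) and (7.1.6) can be checked numerically on a simple example. The sketch below assumes NumPy and uses hypothetical helper names (`entropy`, `reduced_state_B`); it evaluates the conditional entropy and coherent information of a maximally entangled two-qubit state, for which the conditional entropy is negative.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits, with the convention 0 log2 0 = 0."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def reduced_state_B(rho_ab, d_a, d_b):
    """Partial trace over A of an operator on C^{d_a} (x) C^{d_b}."""
    return np.trace(rho_ab.reshape(d_a, d_b, d_a, d_b), axis1=0, axis2=2)

# Maximally entangled two-qubit state (|00> + |11>)/sqrt(2).
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
rho_ab = np.outer(psi, psi.conj())
rho_b = reduced_state_B(rho_ab, 2, 2)

cond_entropy = entropy(rho_ab) - entropy(rho_b)   # H(A|B), as in (7.1.5)
coherent_info = entropy(rho_b) - entropy(rho_ab)  # I(A>B), as in (7.1.6)
print(cond_entropy, coherent_info)
```

Here the joint state is pure while the marginal on 𝐵 is maximally mixed, so the conditional entropy is close to −1 and the coherent information is close to +1.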
Remark: More generally, we could define the quantum relative entropy exactly as above, but
with both arguments being positive semi-definite operators. For our purposes in this book,
however, it suffices to restrict the first argument to be a state.
The quantum relative entropy has an operational meaning in terms of the task of
quantum hypothesis testing, as is shown in Section 7.10 on the quantum Stein’s
lemma (Theorem 7.78). The quantum relative entropy 𝐷 (𝜌∥𝜎) can also be
interpreted as a distinguishability measure for the quantum states 𝜌 and 𝜎, in part
due to the facts that 𝐷 (𝜌∥𝜎) ≥ 0 and 𝐷 (𝜌∥𝜎) = 0 if and only if 𝜌 = 𝜎, which is
shown in Proposition 7.3 below.
The support condition supp(𝜌) ⊆ supp(𝜎) in the definition of the quantum
relative entropy essentially has to do with the term Tr[𝜌 log2 𝜎] and the fact that the
logarithm of an operator is really only well defined for positive definite operators,
while we allow for 𝜎 to be positive semi-definite, which means that it could have
some eigenvalues equal to zero. Recall that the expression 𝜌 log2 𝜌 is well defined
even for states with eigenvalues equal to zero since we set 0 log2 0 = 0. We justified
this with the fact that lim𝑥→0+ 𝑥 log2 𝑥 = 0. We can similarly make sense of the
support condition in the definition of the quantum relative entropy by using the
following fact.
Proposition 7.2
For every state 𝜌 and positive semi-definite operator 𝜎,
Consequently, whenever 𝜎 does not have full support (i.e., it is positive semi-
definite as opposed to positive definite), we can write 𝐷 (𝜌∥𝜎) as the following
limit:
\[
D(\rho \| \sigma) = \lim_{\varepsilon \to 0^+} D(\rho \| \sigma + \varepsilon \mathbb{1}). \tag{7.2.4}
\]
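The limit in (7.2.4) can be observed numerically. The following sketch assumes NumPy; the helper names and the two qubit examples are hypothetical choices for illustration. In the first example the support condition holds and the regularized quantity converges; in the second it is violated and the quantity grows without bound as 𝜀 shrinks.

```python
import numpy as np

def log2_pd(m):
    """Matrix log2 of a positive definite Hermitian matrix."""
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.log2(evals)) @ evecs.conj().T

def regularized_relative_entropy(rho, sigma, eps):
    """D(rho || sigma + eps * 1) for a state rho and PSD sigma, eps > 0."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    tr_rho_log_rho = float(np.sum(p * np.log2(p)))
    sigma_eps = sigma + eps * np.eye(sigma.shape[0])
    return tr_rho_log_rho - float(np.real(np.trace(rho @ log2_pd(sigma_eps))))

epsilons = [1e-2, 1e-4, 1e-6, 1e-8]

# Support condition satisfied: rho = |0><0|, sigma = (1/2)|0><0|.
rho0 = np.diag([1.0, 0.0])
sigma0 = np.diag([0.5, 0.0])
supported = [regularized_relative_entropy(rho0, sigma0, e) for e in epsilons]

# Support condition violated: rho = |+><+|, sigma = |0><0|.
rho_plus = 0.5 * np.ones((2, 2))
sigma1 = np.diag([1.0, 0.0])
unsupported = [regularized_relative_entropy(rho_plus, sigma1, e) for e in epsilons]

print(supported)    # approaches log2(1 / 0.5) = 1
print(unsupported)  # keeps increasing as eps decreases
```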
Proof: Observe that, for all 𝜀 > 0, the operator 𝜎 + 𝜀 1 has full support; i.e.,
supp(𝜎 + 𝜀 1) = H for all 𝜀 > 0, where H is the underlying Hilbert space. This
means that the quantity
Tr[𝜌 log2 (𝜎 + 𝜀 1)] (7.2.5)
is finite for all 𝜀 > 0. Now, let us decompose the Hilbert space H into the direct sum
of the orthogonal subspaces supp(𝜎) and ker(𝜎), so that H = supp(𝜎) ⊕ ker(𝜎).
Let Π𝜎 be the projection onto supp(𝜎) and Π𝜎⊥ the projection onto ker(𝜎). Then
with respect to this decomposition, the operators 𝜌 and 𝜎 can be written as the
block matrices
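\begin{align}
\rho &= \begin{pmatrix} \Pi_\sigma \rho \Pi_\sigma & \Pi_\sigma \rho \Pi_\sigma^\perp \\ \Pi_\sigma^\perp \rho \Pi_\sigma & \Pi_\sigma^\perp \rho \Pi_\sigma^\perp \end{pmatrix} \equiv \begin{pmatrix} \rho_{0,0} & \rho_{0,1} \\ \rho_{0,1}^\dagger & \rho_{1,1} \end{pmatrix}, \notag \\
\sigma &= \begin{pmatrix} \Pi_\sigma \sigma \Pi_\sigma & \Pi_\sigma \sigma \Pi_\sigma^\perp \\ \Pi_\sigma^\perp \sigma \Pi_\sigma & \Pi_\sigma^\perp \sigma \Pi_\sigma^\perp \end{pmatrix} = \begin{pmatrix} \sigma & 0 \\ 0 & 0 \end{pmatrix}. \tag{7.2.6}
\end{align}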
Then, using the fact that lim𝜀→0+ (− log2 𝜀) = +∞, we find that
as required. ■
The proposition above, in particular the fact that we can write the quantum
relative entropy as 𝐷 (𝜌∥𝜎) = lim𝜀→0+ 𝐷 (𝜌∥𝜎+𝜀 1), allows us to take the logarithm
log2 (𝜎) of 𝜎 on only the support of 𝜎 when determining the quantum relative
entropy. We can use this fact to write another formula for the quantum relative
entropy. We start by writing a spectral decomposition of the state 𝜌 and positive
semi-definite operator 𝜎. In particular, setting 𝑟 𝜌 ≡ rank(𝜌) and 𝑟 𝜎 ≡ rank(𝜎), let
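\begin{align}
\rho &= \sum_{j=1}^{d} p_j |\psi_j\rangle\langle\psi_j| = \sum_{j=1}^{r_\rho} p_j |\psi_j\rangle\langle\psi_j|, \tag{7.2.17} \\
\sigma &= \sum_{k=1}^{d} q_k |\phi_k\rangle\langle\phi_k| = \sum_{k=1}^{r_\sigma} q_k |\phi_k\rangle\langle\phi_k|, \tag{7.2.18}
\end{align}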
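\begin{align}
& - \operatorname{Tr}\!\left[ \left( \sum_{j=1}^{r_\rho} p_j |\psi_j\rangle\langle\psi_j| \right) \log_2\!\left( \sum_{k=1}^{r_\sigma} q_k |\phi_k\rangle\langle\phi_k| \right) \right] \tag{7.2.19} \\
& = \sum_{j=1}^{r_\rho} p_j \log_2 p_j - \sum_{j=1}^{r_\rho} \sum_{k=1}^{r_\sigma} \left| \langle\psi_j|\phi_k\rangle \right|^2 p_j \log_2 q_k \tag{7.2.20} \\
& = \sum_{j=1}^{r_\rho} \left[ p_j \log_2 p_j - \sum_{k=1}^{r_\sigma} \left| \langle\psi_j|\phi_k\rangle \right|^2 p_j \log_2 q_k \right]. \tag{7.2.21}
\end{align}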
Now, using the fact that the eigenvectors {|𝜙 𝑘 ⟩ : 1 ≤ 𝑘 ≤ 𝑑} form a complete
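orthonormal basis for H, so that $\mathbb{1}_{\mathcal{H}} = \sum_{k=1}^{d} |\phi_k\rangle\langle\phi_k|$, we conclude that
\[
1 = \langle\psi_j|\psi_j\rangle = \sum_{k=1}^{d} \langle\psi_j|\phi_k\rangle\langle\phi_k|\psi_j\rangle = \sum_{k=1}^{r_\sigma} \left| \langle\psi_j|\phi_k\rangle \right|^2, \tag{7.2.22}
\]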
for 1 ≤ 𝑗 ≤ 𝑟 𝜌 , where the last equality follows from the assumption that supp(𝜌) ⊆
supp(𝜎), which implies that the eigenvectors {|𝜓 𝑗 ⟩ : 1 ≤ 𝑗 ≤ 𝑟 𝜌 } of 𝜌 can be
expressed as a linear combination of the eigenvectors {|𝜙 𝑘 ⟩ : 1 ≤ 𝑘 ≤ 𝑟 𝜎 } of 𝜎.
Therefore,
\[
D(\rho \| \sigma) = \sum_{j=1}^{r_\rho} \sum_{k=1}^{r_\sigma} \left| \langle\psi_j|\phi_k\rangle \right|^2 p_j \log_2\!\left( \frac{p_j}{q_k} \right), \tag{7.2.23}
\]
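The spectral formula in (7.2.23) can be compared against a direct evaluation of Tr[𝜌(log2 𝜌 − log2 𝜎)]. The sketch below assumes NumPy and random full-rank states, so that the support condition holds automatically; `random_state` and `log2m` are hypothetical helpers for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(d):
    """Random full-rank density matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def log2m(m):
    """Matrix log2 of a positive definite Hermitian matrix."""
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.log2(evals)) @ evecs.conj().T

d = 3
rho, sigma = random_state(d), random_state(d)

# Direct evaluation of Tr[rho (log2 rho - log2 sigma)].
direct = float(np.trace(rho @ (log2m(rho) - log2m(sigma))).real)

# Evaluation via the eigen-decompositions, as in (7.2.23).
p, psi = np.linalg.eigh(rho)     # columns of psi are the eigenvectors |psi_j>
q, phi = np.linalg.eigh(sigma)   # columns of phi are the eigenvectors |phi_k>
overlaps = np.abs(psi.conj().T @ phi) ** 2   # entry [j, k] = |<psi_j|phi_k>|^2
terms = overlaps * p[:, None] * (np.log2(p)[:, None] - np.log2(q)[None, :])
formula = float(np.sum(terms))

print(direct, formula)
```

The matrix of squared overlaps is doubly stochastic, which is the property used in (7.2.22).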
𝐷 (𝑉 𝜌𝑉 † ∥𝑉 𝜎𝑉 † ) = 𝐷 (𝜌∥𝜎). (7.2.24)
where
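\begin{align}
\rho_{XA} &:= \sum_{x \in \mathcal{X}} p(x) |x\rangle\langle x|_X \otimes \rho_A^x, \tag{7.2.28} \\
\sigma_{XA} &:= \sum_{x \in \mathcal{X}} q(x) |x\rangle\langle x|_X \otimes \sigma_A^x. \tag{7.2.29}
\end{align}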
Remark: If we let the first argument of the relative entropy be a general positive semi-definite
operator instead of just a state, then (7.2.26) can be generalized for every 𝛼, 𝛽 ∈ (0, ∞) as
\[
D(\alpha \rho \| \beta \sigma) = \alpha D(\rho \| \sigma) + \alpha \log_2\!\left( \frac{\alpha}{\beta} \right). \tag{7.2.30}
\]
Proof:
1. Proof of isometric invariance: When supp(𝜌) ⊈ supp(𝜎), there is noth-
ing to prove because supp(𝑉 𝜌𝑉 † ) ⊈ supp(𝑉 𝜎𝑉 † ), which means that both
we obtain
\[
\sum_{j=1}^{d} r_j = \sum_{k=1}^{d} q_k = \operatorname{Tr}(\sigma). \tag{7.2.47}
\]
Therefore,
\[
D(\rho \| \sigma) \geq \frac{1}{\ln(2)} \left( 1 - \operatorname{Tr}(\sigma) \right) \geq 0, \tag{7.2.48}
\]
as required, where the last inequality holds by the assumption that Tr(𝜎) ≤
1.
(b) We are now interested in the case of equality in the statement 𝐷 (𝜌∥𝜎) ≥ 0
that we just proved in (a). In that proof, we made use of two inequalities.
The first was in (7.2.38), where we made use of the concavity of the
logarithm. Equality holds in (7.2.38) if and only if for each 𝑗 there exists 𝑘
such that 𝑐 𝑗,𝑘 = 1. The second inequality we used was in (7.2.43), where
equality holds if and only if 𝑥 = 1. Therefore, equality holds in (7.2.44)
if and only if 𝑝 𝑗 = 𝑟 𝑗 for all 𝑗, and equality in (7.2.38) is true if and only
if the eigenvectors of 𝜌 and 𝜎 are, up to relabeling, the same. Therefore
𝐷 (𝜌∥𝜎) = 0 if and only if 𝑝 𝑗 = 𝑞 𝑗 for all 𝑗 and the corresponding
eigenvectors are (up to relabeling) equal, which is true if and only if 𝜌 = 𝜎.
(c) Suppose that both 𝜌 and 𝜎 are positive definite. Since the logarithm is
operator monotone (see Section 2.2.8.1), the operator inequality 𝜌 ≤ 𝜎 implies
that log2 (𝜌) ≤ log2 (𝜎). This implies the inequality
$\rho^{1/2} \log_2(\rho)\, \rho^{1/2} \leq \rho^{1/2} \log_2(\sigma)\, \rho^{1/2}$,
which implies that Tr[𝜌 log2 (𝜌)] ≤ Tr[𝜌 log2 (𝜎)], proving
the result. In the case that 𝜌 and/or 𝜎 are not positive definite, we first apply
the result to the positive definite state (1 − 𝛿) 𝜌 + 𝛿𝜋 and the positive definite
operator 𝜎 + 𝜀 1, with 𝛿, 𝜀 > 0, so that 𝐷 ((1 − 𝛿) 𝜌 + 𝛿𝜋∥𝜎 + 𝜀 1) ≤ 0.
Then, using
as required. In the general case that the operators are not positive definite,
as in (c) we apply the result to the positive definite operators (1 − 𝛿) 𝜌 + 𝛿𝜋,
𝜎 + 𝜀 1, and 𝜎′ + 𝜀′1, for 𝛿, 𝜀, 𝜀′ > 0, and then use (7.2.49) to obtain the
result.
3. Proof of additivity: Since supp(𝜌1 ⊗ 𝜌2 ) = supp(𝜌1 ) ⊗ supp(𝜌2 ) and supp(𝜎1 ⊗
𝜎2 ) = supp(𝜎1 ) ⊗ supp(𝜎2 ), the condition supp(𝜌1 ⊗ 𝜌2 ) ⊈ supp(𝜎1 ⊗ 𝜎2 )
is equivalent to the condition supp(𝜌1 ) ⊈ supp(𝜎1 ) or supp(𝜌2 ) ⊈ supp(𝜎2 ).
Therefore, 𝐷 (𝜌1 ⊗ 𝜌2 ∥𝜎1 ⊗𝜎2 ) = +∞ and 𝐷 (𝜌1 ∥𝜎1 ) = +∞ or 𝐷 (𝜌2 ∥𝜎2 ) = +∞
if one of the support conditions is violated. Now suppose that supp(𝜌1 ⊗ 𝜌2 ) ⊆
supp(𝜎1 ⊗ 𝜎2 ). Letting 𝜌1 and 𝜌2 have spectral decompositions
\[
\rho_1 = \sum_{j=1}^{d} p_j^1 |\psi_j^1\rangle\langle\psi_j^1|, \qquad \rho_2 = \sum_{k=1}^{d} p_k^2 |\psi_k^2\rangle\langle\psi_k^2|, \tag{7.2.53}
\]
we find that
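\begin{align}
\log_2(\rho_1 \otimes \rho_2) &= \log_2\!\left( \sum_{j,k=1}^{d} p_j^1 |\psi_j^1\rangle\langle\psi_j^1| \otimes p_k^2 |\psi_k^2\rangle\langle\psi_k^2| \right) \tag{7.2.54} \\
&= \sum_{j,k=1}^{d} \log_2(p_j^1 p_k^2)\, |\psi_j^1\rangle\langle\psi_j^1| \otimes |\psi_k^2\rangle\langle\psi_k^2| \tag{7.2.55} \\
&= \sum_{j,k=1}^{d} \log_2(p_j^1)\, |\psi_j^1\rangle\langle\psi_j^1| \otimes |\psi_k^2\rangle\langle\psi_k^2| \tag{7.2.56} \\
&\quad + \sum_{j,k=1}^{d} \log_2(p_k^2)\, |\psi_j^1\rangle\langle\psi_j^1| \otimes |\psi_k^2\rangle\langle\psi_k^2| \tag{7.2.57} \\
&= \log_2(\rho_1) \otimes \mathbb{1} + \mathbb{1} \otimes \log_2(\rho_2). \tag{7.2.58}
\end{align}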
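The identity in (7.2.58) is easy to confirm numerically for positive definite operators. The following sketch assumes NumPy; `random_state` and `log2m` are hypothetical helpers, and the dimensions 2 and 3 are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_state(d):
    """Random full-rank density matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def log2m(m):
    """Matrix log2 of a positive definite Hermitian matrix."""
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.log2(evals)) @ evecs.conj().T

rho1, rho2 = random_state(2), random_state(3)

lhs = log2m(np.kron(rho1, rho2))
rhs = np.kron(log2m(rho1), np.eye(3)) + np.kron(np.eye(2), log2m(rho2))

print(np.max(np.abs(lhs - rhs)))  # numerically negligible
```

This is the operator analogue of log(ab) = log(a) + log(b), and it is the step that makes the relative entropy additive on tensor-product arguments.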
𝐷 (𝜌1 ⊗ 𝜌2 ∥𝜎1 ⊗ 𝜎2 )
= Tr (𝜌1 ⊗ 𝜌2 )(log2 (𝜌1 ) ⊗ 1 + 1 ⊗ log2 (𝜌2 )
Observe that
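\begin{align}
\log_2 \rho_{XA} &= \sum_{x \in \mathcal{X}} |x\rangle\langle x|_X \otimes \log_2(p(x) \rho_A^x) \tag{7.2.64} \\
&= \sum_{x \in \mathcal{X}} |x\rangle\langle x|_X \otimes \log_2 p(x)\, \mathbb{1}_A + \sum_{x \in \mathcal{X}} |x\rangle\langle x|_X \otimes \log_2 \rho_A^x, \tag{7.2.65} \\
\log_2 \sigma_{XA} &= \sum_{x \in \mathcal{X}} |x\rangle\langle x|_X \otimes \log_2(q(x) \sigma_A^x) \tag{7.2.66} \\
&= \sum_{x \in \mathcal{X}} |x\rangle\langle x|_X \otimes \log_2 q(x)\, \mathbb{1}_A + \sum_{x \in \mathcal{X}} |x\rangle\langle x|_X \otimes \log_2 \sigma_A^x. \tag{7.2.67}
\end{align}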
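\begin{align}
& \qquad \left. + \sum_{x \in \mathcal{X}} p(x) |x\rangle\langle x|_X \otimes \rho_A^x \log_2 \sigma_A^x \right] \tag{7.2.69} \\
& = \sum_{x \in \mathcal{X}} \left[ p(x) \log_2 p(x) - p(x) \log_2 q(x) \right] \notag \\
& \qquad + \sum_{x \in \mathcal{X}} p(x) \operatorname{Tr}\!\left[ \rho_A^x \log_2 \rho_A^x - \rho_A^x \log_2 \sigma_A^x \right] \tag{7.2.70} \\
& = D(p \| q) + \sum_{x \in \mathcal{X}} p(x) D(\rho_A^x \| \sigma_A^x), \tag{7.2.71}
\end{align}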
as required. ■
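The direct-sum decomposition in (7.2.71) can be verified numerically for classical–quantum states. The sketch below assumes NumPy and full-rank component states; the distributions `p` and `q` and all helper names are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_state(d):
    """Random full-rank density matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def log2m(m):
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.log2(evals)) @ evecs.conj().T

def rel_entropy(rho, sigma):
    """D(rho || sigma) for positive definite rho and sigma."""
    return float(np.trace(rho @ (log2m(rho) - log2m(sigma))).real)

def cq_state(probs, states):
    """sum_x probs[x] |x><x| (x) states[x], as in (7.2.28)."""
    n = len(probs)
    total = 0
    for x, (w, s) in enumerate(zip(probs, states)):
        proj = np.zeros((n, n))
        proj[x, x] = 1.0
        total = total + w * np.kron(proj, s)
    return total

p = np.array([0.3, 0.7])
q = np.array([0.6, 0.4])
rhos = [random_state(2) for _ in p]
sigmas = [random_state(2) for _ in q]

lhs = rel_entropy(cq_state(p, rhos), cq_state(q, sigmas))
classical = float(np.sum(p * np.log2(p / q)))
rhs = classical + sum(p[x] * rel_entropy(rhos[x], sigmas[x]) for x in range(2))

print(lhs, rhs)
```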
In other words, the quantum relative entropy 𝐷 (𝜌∥𝜎) can only decrease or
stay the same if we apply the same quantum channel N to the states 𝜌 and 𝜎.
When the quantum relative entropy is interpreted as a distinguishability measure on
quantum states, the data-processing inequality tells us that the distinguishability of
two quantum states cannot increase when we act on them with the same quantum
channel; see Figure 7.1. We postpone the proof of the data-processing inequality
[Figure 7.1: The same channel N acts on the states 𝜌 and 𝜎, producing N(𝜌) and N(𝜎).]
On the other hand, suppose that 𝜌, 𝜎, and N are such that the inequality in
Theorem 7.4 is saturated:
Then it is a non-trivial result that there exists a recovery channel R such that the
equality in (7.2.74) holds. In fact, this channel can be taken as the Petz recovery
channel from Definition 4.21. We do not provide a proof here and instead point to
the Bibliographic Notes in Section 7.13 for more details.
One of the remarkable aspects of the data-processing inequality for the quantum
relative entropy is that it alone can be used to prove many of the properties of the
quantum relative entropy stated in Proposition 7.3. For example, Klein’s inequality
follows by considering the trace channel Tr, so that for every state 𝜌 and positive
semi-definite operator 𝜎 such that Tr[𝜎] ≤ 1, we find that
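\begin{align}
D(\rho \| \sigma) \geq D(\operatorname{Tr}(\rho) \| \operatorname{Tr}(\sigma)) &= \operatorname{Tr}(\rho) \log_2\!\left( \frac{\operatorname{Tr}(\rho)}{\operatorname{Tr}(\sigma)} \right) \tag{7.2.79} \\
&= \log_2\!\left( \frac{1}{\operatorname{Tr}(\sigma)} \right) \geq 0. \tag{7.2.80}
\end{align}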
Isometric invariance also follows from the data-processing inequality. The inequality
𝐷 (𝜌∥𝜎) ≥ 𝐷 (𝑉 𝜌𝑉 † ∥𝑉 𝜎𝑉 † ) follows from data processing because (·) → 𝑉 (·)𝑉 †
is a channel. The reverse inequality also follows from data processing because
𝐷 (𝑉 𝜌𝑉 † ∥𝑉 𝜎𝑉 † ) ≥ 𝐷 (R𝑉 (𝑉 𝜌𝑉 † )∥R𝑉 (𝑉 𝜎𝑉 † )) = 𝐷 (𝜌∥𝜎), where R𝑉 is the
reversal channel defined in (4.4.13) and we used (4.4.17)–(4.4.20).
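Two of the statements above lend themselves to a quick numerical check: data processing under the partial trace, which is one particular channel, and the bound in (7.2.79)–(7.2.80) obtained from the trace channel. The sketch assumes NumPy, random full-rank two-qubit states, and a subnormalization factor of 0.7; all of these, and the helper names, are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_state(d):
    """Random full-rank density matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def log2m(m):
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.log2(evals)) @ evecs.conj().T

def rel_entropy(rho, sigma):
    """D(rho || sigma) for a positive definite state rho and operator sigma."""
    return float(np.trace(rho @ (log2m(rho) - log2m(sigma))).real)

def trace_out_B(m, d_a, d_b):
    """Partial trace over the second system."""
    return np.trace(m.reshape(d_a, d_b, d_a, d_b), axis1=1, axis2=3)

rho_ab, sigma_ab = random_state(4), random_state(4)

# Data processing under the partial-trace channel.
d_full = rel_entropy(rho_ab, sigma_ab)
d_reduced = rel_entropy(trace_out_B(rho_ab, 2, 2), trace_out_B(sigma_ab, 2, 2))

# Klein-type bound from the trace channel, with Tr[sigma] = 0.7 <= 1.
scale = 0.7
klein_lhs = rel_entropy(rho_ab, scale * sigma_ab)
klein_rhs = float(np.log2(1.0 / scale))

print(d_full, d_reduced)
print(klein_lhs, klein_rhs)
```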
Another important fact that follows from the data-processing inequality for
quantum relative entropy is its joint convexity.
Exercise 7.1
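Let $\rho_{XA} := \sum_{x \in \mathcal{X}} p(x) |x\rangle\langle x|_X \otimes \rho_A^x$ and $\sigma_{XA} := \sum_{x \in \mathcal{X}} q(x) |x\rangle\langle x|_X \otimes \sigma_A^x$. Prove
that
\[
D(\rho_{XA} \| \sigma_{XA}) \geq D(p \| q) + D\!\left( \sum_{x \in \mathcal{X}} p(x) \rho_A^x \,\middle\|\, \sum_{x \in \mathcal{X}} p(x) \sigma_A^x \right). \tag{7.2.87}
\]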
As stated at the beginning of this chapter, the quantum relative entropy acts, as
in the classical case, as a parent quantity for all of the fundamental information-
theoretic quantities based on the quantum entropy. Indeed, using the properties of
the quantum relative entropy stated previously, it is straightforward to verify the
following:
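\begin{align}
H(A|B)_\rho &= -D(\rho_{AB} \| \mathbb{1}_A \otimes \rho_B) \tag{7.2.89} \\
&= -\inf_{\sigma_B \in \mathrm{D}(\mathcal{H}_B)} D(\rho_{AB} \| \mathbb{1}_A \otimes \sigma_B), \tag{7.2.90} \\
I(A\rangle B)_\rho &= D(\rho_{AB} \| \mathbb{1}_A \otimes \rho_B) \tag{7.2.91} \\
&= \inf_{\sigma_B \in \mathrm{D}(\mathcal{H}_B)} D(\rho_{AB} \| \mathbb{1}_A \otimes \sigma_B). \tag{7.2.92}
\end{align}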
Observe that
𝐻 ( 𝐴|𝐵) 𝜌 = −𝐼 ( 𝐴⟩𝐵) 𝜌 (7.2.93)
for every bipartite state 𝜌 𝐴𝐵 . Similarly, we can write the reverse coherent
information as
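\begin{align}
I(B\rangle A)_\rho &= D(\rho_{AB} \| \rho_A \otimes \mathbb{1}_B) \tag{7.2.94} \\
&= \inf_{\sigma_A \in \mathrm{D}(\mathcal{H}_A)} D(\rho_{AB} \| \sigma_A \otimes \mathbb{1}_B) \tag{7.2.95}
\end{align}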
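\begin{align}
I(A;B)_\rho &= D(\rho_{AB} \| \rho_A \otimes \rho_B) \tag{7.2.96} \\
&= \inf_{\sigma_B \in \mathrm{D}(\mathcal{H}_B)} D(\rho_{AB} \| \rho_A \otimes \sigma_B) \tag{7.2.97} \\
&= \inf_{\tau_A \in \mathrm{D}(\mathcal{H}_A)} D(\rho_{AB} \| \tau_A \otimes \rho_B) \tag{7.2.98} \\
&= \inf_{\substack{\tau_A \in \mathrm{D}(\mathcal{H}_A), \\ \sigma_B \in \mathrm{D}(\mathcal{H}_B)}} D(\rho_{AB} \| \tau_A \otimes \sigma_B). \tag{7.2.99}
\end{align}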
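The expression in (7.2.96) can be compared numerically with the entropic form of the mutual information, 𝐻(𝐴) + 𝐻(𝐵) − 𝐻(𝐴𝐵), which is assumed here as the standard definition. The sketch assumes NumPy and a random full-rank two-qubit state; helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

def random_state(d):
    """Random full-rank density matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def log2m(m):
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.log2(evals)) @ evecs.conj().T

def entropy(rho):
    return float(-np.trace(rho @ log2m(rho)).real)

def rel_entropy(rho, sigma):
    return float(np.trace(rho @ (log2m(rho) - log2m(sigma))).real)

d_a, d_b = 2, 2
rho_ab = random_state(d_a * d_b)
t = rho_ab.reshape(d_a, d_b, d_a, d_b)
rho_a = np.trace(t, axis1=1, axis2=3)  # trace out B
rho_b = np.trace(t, axis1=0, axis2=2)  # trace out A

via_entropies = entropy(rho_a) + entropy(rho_b) - entropy(rho_ab)
via_rel_entropy = rel_entropy(rho_ab, np.kron(rho_a, rho_b))  # as in (7.2.96)

print(via_entropies, via_rel_entropy)
```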
Exercise 7.2
Verify the equalities in (7.2.90), (7.2.92), and (7.2.97). Hint: First prove that
which holds for all states 𝜏𝐴 and 𝜎𝐵 , by Klein’s inequality as stated in (7.2.72).
Set 𝜏𝐴 = 𝜋 𝐴 to prove (7.2.90) and (7.2.92), and set 𝜏𝐴 = 𝜌 𝐴 to prove (7.2.97).
The properties of the quantum relative entropy, such as the ones stated in
Propositions 7.3 and 7.5, can be directly translated to properties of the derived
information measures stated above. Some of these properties are used frequently
throughout the book, and so we state them here for convenience. They are
straightforward to verify using definitions and properties of the quantum relative
entropy.
• Additivity of the quantum entropy for product states 𝜌 and 𝜏:
𝐻 (𝜌 ⊗ 𝜏) = 𝐻 (𝜌) + 𝐻 (𝜏). (7.2.104)
• Concavity of the quantum entropy: The joint convexity of the quantum relative
entropy, as stated in Proposition 7.5, and the identity in (7.2.88) imply that
the quantum entropy is concave in its input: if 𝑝 : X → [0, 1] is a probability
distribution over a finite alphabet X and {𝜌 𝑥𝐴 }𝑥∈X is a set of states on a system 𝐴,
then
\[
H\!\left( \sum_{x \in \mathcal{X}} p(x) \rho_A^x \right) \geq \sum_{x \in \mathcal{X}} p(x) H(\rho_A^x). \tag{7.2.106}
\]
• Chain rule for quantum mutual information: For every state 𝜌 𝐴𝐵𝐶 , the following
equality holds
𝐼 ( 𝐴; 𝐵𝐶) 𝜌 = 𝐼 ( 𝐴; 𝐵) 𝜌 + 𝐼 ( 𝐴; 𝐶 |𝐵) 𝜌 . (7.2.112)
We call this the chain rule because it can be interpreted as saying that the
correlations between 𝐴 and 𝐵𝐶 can be built up by first establishing correlations
between 𝐴 and 𝐵 (signified by 𝐼 ( 𝐴; 𝐵) 𝜌 ), then establishing correlations between
𝐴 and 𝐶, given the correlations with 𝐵 (signified by 𝐼 ( 𝐴; 𝐶 |𝐵) 𝜌 ).
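The chain rule in (7.2.112) can be illustrated on a random three-qubit state. The sketch below assumes NumPy and the standard entropic expressions 𝐼(𝐴;𝐵) = 𝐻(𝐴) + 𝐻(𝐵) − 𝐻(𝐴𝐵) and 𝐼(𝐴;𝐶|𝐵) = 𝐻(𝐴𝐵) + 𝐻(𝐵𝐶) − 𝐻(𝐴𝐵𝐶) − 𝐻(𝐵); the helper `marginal` is a hypothetical name for a general partial trace.

```python
import numpy as np

rng = np.random.default_rng(5)

def random_state(d):
    """Random full-rank density matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def marginal(rho, dims, keep):
    """Reduced state on the subsystems whose indices are listed in `keep`."""
    t = rho.reshape(dims + dims)
    # Trace out from the highest index down so axis positions stay valid.
    for i in reversed(range(len(dims))):
        if i not in keep:
            t = np.trace(t, axis1=i, axis2=i + t.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

dims = [2, 2, 2]  # systems A, B, C
rho_abc = random_state(8)
H = lambda keep: entropy(marginal(rho_abc, dims, keep))

mi_a_bc = H([0]) + H([1, 2]) - H([0, 1, 2])              # I(A;BC)
mi_a_b = H([0]) + H([1]) - H([0, 1])                     # I(A;B)
cmi_a_c_b = H([0, 1]) + H([1, 2]) - H([0, 1, 2]) - H([1])  # I(A;C|B)

print(mi_a_bc, mi_a_b + cmi_a_c_b)
print(cmi_a_c_b)  # non-negative, consistent with strong subadditivity
```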
Exercise 7.3
Provide explicit proofs of the properties in (7.2.104)–(7.2.109), by following
what is stated above.
Exercise 7.4
Verify the chain rules stated in (7.2.110) and (7.2.112). More generally, prove
that 𝐼 ( 𝐴; 𝐵𝐶 |𝐷) 𝜌 = 𝐼 ( 𝐴; 𝐵|𝐷) 𝜌 + 𝐼 ( 𝐴; 𝐶 |𝐵𝐷) 𝜌 for a four-party state 𝜌 𝐴𝐵𝐶𝐷 .
𝐼 ( 𝐴; 𝐵|𝐶) 𝜌 ≥ 0. (7.2.114)
Proof: One way to prove this result is by means of the data-processing inequality for
the quantum relative entropy, along with the expression for the quantum conditional
entropy in (7.2.89). We start by using the definition of the quantum conditional
entropy to rewrite the quantum conditional mutual information defined in (7.1.11)
as
Then, using the expression in (7.2.89) for the quantum conditional entropy in terms
Two direct consequences of strong subadditivity are that the conditional entropy
is concave and non-negative for every separable state. We detail these properties
below.
Proof: Recall from Definition 3.5 that 𝜎𝐴𝐵 is separable if it can be written as
\[
\sigma_{AB} = \sum_{x \in \mathcal{X}} p(x)\, \tau_A^x \otimes \omega_B^x, \tag{7.2.124}
\]
4. Additivity: For states 𝜌 𝐴1 𝐵1𝐶1 and 𝜏𝐴2 𝐵2𝐶2 , the following equality holds for
the product state 𝜌 𝐴1 𝐵1𝐶1 ⊗ 𝜏𝐴2 𝐵2𝐶2 :
7. Data-processing inequality for local channels: For every state 𝜌 𝐴𝐵𝐶 and
all local channels N 𝐴→𝐴′ and M𝐵→𝐵′ , the following inequality holds
Now, let |𝜓⟩ 𝐴𝐵𝐶𝐸 be a purification of 𝜌 𝐴𝐵𝐶 . Then, since 𝜌 𝐴𝐵𝐶 and 𝜓 𝐸 have
the same spectrum, and since 𝜌 𝐵𝐶 and 𝜓 𝐴𝐸 have the same spectrum, we obtain
where the first inequality follows from (7.2.140), and the second inequality, as
before, from the fact that 𝐻 ( 𝐴) 𝜌 ≤ log2 𝑑 𝐴 for every state 𝜌. Therefore,
which follows from (7.2.104). This means that 𝐻 (𝐵|𝐶)𝜎⊗𝜏 = 𝐻 (𝐵)𝜎 and
𝐻 ( 𝐴𝐵|𝐶)𝜎⊗𝜏 = 𝐻 ( 𝐴𝐵)𝜎 , and we find that
4. By writing
and recognizing that each conditional entropy on the right-hand side is evaluated
on a product state, we can use (7.2.104) to obtain the desired result. As an
example, we evaluate the first term on the right-hand side:
𝐻 ( 𝐴1 𝐴2 |𝐶1𝐶2 ) 𝜌⊗𝜏
= 𝐻 ( 𝐴1 𝐴2𝐶1𝐶2 ) 𝜌⊗𝜏 − 𝐻 (𝐶1𝐶2 ) 𝜌⊗𝜏 (7.2.151)
= 𝐻 ( 𝐴1𝐶1 ) 𝜌 + 𝐻 ( 𝐴2𝐶2 )𝜏 − 𝐻 (𝐶1 ) 𝜌 − 𝐻 (𝐶2 )𝜏 (7.2.152)
= 𝐻 ( 𝐴1 |𝐶1 ) 𝜌 + 𝐻 ( 𝐴2 |𝐶2 )𝜏 . (7.2.153)
Let us consider the first term on the right-hand side. Using the definition of the
quantum conditional entropy and using the direct-sum property of the quantum
entropy, as stated in (7.2.109), we obtain
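\begin{align}
H(A|CX)_\sigma &= H(ACX)_\sigma - H(CX)_\sigma \tag{7.2.155} \\
&= H(p) + \sum_{x \in \mathcal{X}} p(x) H(AC)_{\rho^x} - H(p) - \sum_{x \in \mathcal{X}} p(x) H(C)_{\rho^x} \tag{7.2.156} \\
&= \sum_{x \in \mathcal{X}} p(x) \left[ H(AC)_{\rho^x} - H(C)_{\rho^x} \right] \tag{7.2.157} \\
&= \sum_{x \in \mathcal{X}} p(x) H(A|C)_{\rho^x}. \tag{7.2.158}
\end{align}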
The other terms on the right-hand side of (7.2.154) are evaluated similarly, and
we ultimately arrive at
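\begin{align}
I(A;B|CX)_\sigma &= \sum_{x \in \mathcal{X}} p(x) \left[ H(A|C)_{\rho^x} + H(B|C)_{\rho^x} - H(AB|C)_{\rho^x} \right] \notag \\
&= \sum_{x \in \mathcal{X}} p(x) I(A;B|C)_{\rho^x}. \tag{7.2.159}
\end{align}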
Proof: Suppose without loss of generality that 𝜀 > 0 (otherwise the statement
trivially holds). Let $\omega^{\lambda}_{ABC} := \lambda \rho_{ABC} + (1 - \lambda) \sigma_{ABC}$ for 𝜆 ∈ [0, 1]. Then the
following inequality holds
𝜆𝐼 ( 𝐴; 𝐵|𝐶) 𝜌 + (1 − 𝜆) 𝐼 ( 𝐴; 𝐵|𝐶)𝜎 ≤ 𝐼 ( 𝐴; 𝐵|𝐶)𝜔𝜆 + ℎ2 (𝜆), (7.2.170)
because for the classical–quantum state
\[
\omega^{\lambda}_{ABCX} := \lambda \rho_{ABC} \otimes |0\rangle\langle 0|_X + (1 - \lambda) \sigma_{ABC} \otimes |1\rangle\langle 1|_X, \tag{7.2.171}
\]
we have that
𝜆𝐼 ( 𝐴; 𝐵|𝐶) 𝜌 + (1 − 𝜆) 𝐼 ( 𝐴; 𝐵|𝐶)𝜎 = 𝐼 ( 𝐴; 𝐵|𝐶 𝑋)𝜔𝜆 (7.2.172)
≤ 𝐼 ( 𝐴𝑋; 𝐵|𝐶)𝜔𝜆 (7.2.173)
= 𝐼 ( 𝐴; 𝐵|𝐶)𝜔𝜆 + 𝐼 (𝑋; 𝐵|𝐶 𝐴)𝜔𝜆 (7.2.174)
≤ 𝐼 ( 𝐴; 𝐵|𝐶)𝜔𝜆 + 𝐻 (𝑋)𝜔𝜆 (7.2.175)
= 𝐼 ( 𝐴; 𝐵|𝐶)𝜔𝜆 + ℎ2 (𝜆). (7.2.176)
The first equality follows from (7.2.135). The first inequality follows from the chain
rule and strong subadditivity. The second equality follows from the chain rule. The
second inequality follows from the local entropy bound in (7.2.131). We also have
that
𝐼 ( 𝐴; 𝐵|𝐶)𝜔𝜆 ≤ 𝐼 ( 𝐴𝑋; 𝐵|𝐶)𝜔𝜆 (7.2.177)
= 𝐼 (𝑋; 𝐵|𝐶)𝜔𝜆 + 𝐼 ( 𝐴; 𝐵|𝐶 𝑋)𝜔𝜆 (7.2.178)
≤ ℎ2 (𝜆) + 𝜆𝐼 ( 𝐴; 𝐵|𝐶) 𝜌 + (1 − 𝜆) 𝐼 ( 𝐴; 𝐵|𝐶)𝜎 , (7.2.179)
which together imply that
\[
\left| \lambda I(A;B|C)_\rho + (1 - \lambda) I(A;B|C)_\sigma - I(A;B|C)_{\omega^\lambda} \right| \leq h_2(\lambda). \tag{7.2.180}
\]
Then consider the state
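\[
\zeta_{ABC} := \frac{1}{1 + \varepsilon} \left( \rho_{ABC} + [\sigma_{ABC} - \rho_{ABC}]_+ \right), \tag{7.2.181}
\]
where $[\cdot]_+$ denotes the positive part of an operator, and for this choice, we have that
\begin{align}
\frac{1}{1 + \varepsilon} \rho_{ABC} + \frac{\varepsilon}{1 + \varepsilon} \xi^1_{ABC} &= \zeta_{ABC} \tag{7.2.182} \\
&= \frac{1}{1 + \varepsilon} \sigma_{ABC} + \frac{\varepsilon}{1 + \varepsilon} \xi^2_{ABC}, \tag{7.2.183}
\end{align}
analogous to the approach of Thales of Miletus, where the states $\xi^1_{ABC}$ and $\xi^2_{ABC}$ are
defined as
\begin{align}
\xi^1_{ABC} &:= \frac{1}{\varepsilon} [\sigma_{ABC} - \rho_{ABC}]_+, \tag{7.2.184} \\
\xi^2_{ABC} &:= \frac{1}{\varepsilon} \left( (1 + \varepsilon) \zeta_{ABC} - \sigma_{ABC} \right). \tag{7.2.185}
\end{align}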
Applying (7.2.180) to the convex decompositions above, we find that
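\begin{align}
& \frac{1}{1 + \varepsilon} \left[ I(A;B|C)_\rho - I(A;B|C)_\sigma \right] \notag \\
& \quad \leq \frac{\varepsilon}{1 + \varepsilon} \left[ I(A;B|C)_{\xi^2} - I(A;B|C)_{\xi^1} \right] + 2 h_2\!\left( \frac{\varepsilon}{1 + \varepsilon} \right) \tag{7.2.186} \\
& \quad \leq \frac{\varepsilon}{1 + \varepsilon} I(A;B|C)_{\xi^2} + 2 h_2\!\left( \frac{\varepsilon}{1 + \varepsilon} \right) \tag{7.2.187} \\
& \quad \leq \frac{\varepsilon}{1 + \varepsilon}\, 2 \log_2(\min\{d_A, d_B\}) + 2 h_2\!\left( \frac{\varepsilon}{1 + \varepsilon} \right), \tag{7.2.188}
\end{align}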
where the last line follows from the dimension bound in (7.2.129). Multiplying
through by 1 + 𝜀 and using the fact that $g_2(\varepsilon) = (1 + \varepsilon)\, h_2\!\left( \frac{\varepsilon}{1 + \varepsilon} \right)$, we conclude that
𝐼 ( 𝐴; 𝐵|𝐶) 𝜌 − 𝐼 ( 𝐴; 𝐵|𝐶)𝜎 ≤ 2𝜀 log2 (min{𝑑 𝐴 , 𝑑 𝐵 }) + 2𝑔2 (𝜀). (7.2.189)
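The identity relating 𝑔2 and ℎ2 that is used in the last step can be checked numerically. The sketch below assumes the definition 𝑔2(𝜀) = (𝜀 + 1) log2(𝜀 + 1) − 𝜀 log2 𝜀, which is not restated in this passage and is labeled here as an assumption; the function names are hypothetical.

```python
import math

def h2(p: float) -> float:
    """Binary entropy with the convention 0 log2 0 = 0."""
    return -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)

def g2(eps: float) -> float:
    """Assumed definition: g2(eps) = (eps + 1) log2(eps + 1) - eps log2(eps)."""
    return (eps + 1.0) * math.log2(eps + 1.0) - eps * math.log2(eps)

samples = (0.01, 0.1, 0.5, 1.0)
gaps = [abs(g2(e) - (1.0 + e) * h2(e / (1.0 + e))) for e in samples]
print(gaps)  # all numerically negligible
```

Expanding (1 + 𝜀) ℎ2(𝜀/(1 + 𝜀)) term by term reproduces the assumed formula for 𝑔2, so the agreement is exact up to floating-point error.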
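To arrive at the other inequality, we again apply (7.2.180) to the convex decompositions
above and find that
\begin{align}
& \frac{1}{1 + \varepsilon} \left[ I(A;B|C)_\sigma - I(A;B|C)_\rho \right] \notag \\
& \quad \leq \frac{\varepsilon}{1 + \varepsilon} \left[ I(A;B|C)_{\xi^1} - I(A;B|C)_{\xi^2} \right] + 2 h_2\!\left( \frac{\varepsilon}{1 + \varepsilon} \right). \tag{7.2.190}
\end{align}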
Then we apply the same reasoning as above to find that
𝐼 ( 𝐴; 𝐵|𝐶)𝜎 − 𝐼 ( 𝐴; 𝐵|𝐶) 𝜌 ≤ 2𝜀 log2 (min{𝑑 𝐴 , 𝑑 𝐵 }) + 2𝑔2 (𝜀). (7.2.191)
The inequality in (7.2.169) follows from the same proof, but applying the observation
that the 𝐴 systems of the states 𝜉 1𝐴𝐵𝐶 and 𝜉 2𝐴𝐵𝐶 in (7.2.184)–(7.2.185) are classical
when 𝜌 𝐴𝐵𝐶 and 𝜎𝐴𝐵𝐶 are classical on 𝐴. Here, we also apply the dimension bound
in (7.2.130) in (7.2.188) above. ■
𝐼 ( 𝐴; 𝐵) 𝜌 ≥ 0. (7.2.193)
3. Additivity: For states 𝜌 𝐴1 𝐵1 and 𝜏𝐴2 𝐵2 , the following equality holds for the
product state 𝜌 𝐴1 𝐵1 ⊗ 𝜏𝐴2 𝐵2 :
𝐼 ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 ) 𝜌⊗𝜏 = 𝐼 ( 𝐴1 ; 𝐵1 ) 𝜌 + 𝐼 ( 𝐴2 ; 𝐵2 )𝜏 . (7.2.199)
Remark: The mutual information 𝐼 (𝑋; 𝐴)𝜌 of a classical–quantum state 𝜌 𝑋 𝐴 is often called
Holevo information, a term we use throughout this book.
Many of the strong converse theorems in this book rely heavily on the data-
processing inequality, and so we employ the generalized divergence to emphasize
this point.
We have already mentioned the quantum relative entropy as an example of a
generalized divergence. Other examples discussed later in this chapter, which are
relevant in the context of channel capacity theorems, are the Petz–, sandwiched,
and geometric Rényi relative entropies.
From the fact that generalized divergences satisfy the data-processing inequality
by definition, we immediately obtain two properties of interest.
Proof:
1. We follow the same approach discussed after (7.2.80). Since the map 𝜌 ↦→
𝑉 𝜌𝑉 † is a channel, we immediately obtain 𝑫 (𝜌∥𝜎) ≥ 𝑫 (𝑉 𝜌𝑉 † ∥𝑉 𝜎𝑉 † ). To
prove that 𝑫 (𝜌∥𝜎) ≤ 𝑫 (𝑉 𝜌𝑉 † ∥𝑉 𝜎𝑉 † ), consider the channel R𝑉 , which was
defined in (4.4.13) as
and so 𝑫 (𝜌∥𝜎) = 𝑫 (𝑉 𝜌𝑉 † ∥𝑉 𝜎𝑉 † ).
2. Since taking the tensor product with a fixed state is a channel (recall Def-
inition 4.7), by definition of generalized divergence we obtain 𝑫 (𝜌∥𝜎) ≥
𝑫 (𝜌 ⊗ 𝜏∥𝜎 ⊗ 𝜏). On the other hand, the partial trace is also a channel, and so
by discarding the second system in the operators 𝜌 ⊗ 𝜏 and 𝜎 ⊗ 𝜏, we obtain
Proposition 7.17
Suppose that the generalized divergence obeys the following direct-sum property:
\[
\boldsymbol{D}(\rho_{XA} \| \sigma_{XA}) = \sum_{x \in \mathcal{X}} p(x)\, \boldsymbol{D}(\rho_A^x \| \sigma_A^x), \tag{7.3.7}
\]
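Then the generalized divergence is jointly convex; i.e., the following inequality
holds
\[
\sum_{x \in \mathcal{X}} p(x)\, \boldsymbol{D}(\rho_A^x \| \sigma_A^x) \geq \boldsymbol{D}(\overline{\rho}_A \| \overline{\sigma}_A), \tag{7.3.10}
\]
where $\overline{\rho}_A := \sum_{x \in \mathcal{X}} p(x) \rho_A^x$ and $\overline{\sigma}_A := \sum_{x \in \mathcal{X}} p(x) \sigma_A^x$.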
Proof: The proof is the same as the proof of Proposition 7.5 with 𝐷 replaced by
𝑫. ■
Now, just as we defined entropic quantities like the entropy, conditional entropy,
and mutual information using the quantum relative entropy, we can define their
generalized counterparts using the generalized divergence.
\[
\boldsymbol{I}(A;B)_\rho := \inf_{\sigma_B \in \mathrm{D}(\mathcal{H}_B)} \boldsymbol{D}(\rho_{AB} \| \rho_A \otimes \sigma_B). \tag{7.3.14}
\]
where 𝜌′𝐴′ 𝐵′ = (N 𝐴→𝐴′ ⊗ M𝐵→𝐵′ )(𝜌 𝐴𝐵 ), with N 𝐴→𝐴′ and M𝐵→𝐵′ arbitrary
channels.
Proof:
as required.
2. Let 𝜌 𝐴𝐵 be an arbitrary bipartite state, let N 𝐴→𝐴′ be an arbitrary unital channel,
and let M𝐵→𝐵′ be an arbitrary channel. Also, let 𝜎𝐵 be an arbitrary state. Then,
using the data-processing inequality of the generalized divergence 𝑫 and the
unitality of N, we obtain
as required. ■
𝑫 (𝜌∥𝜎) ≥ 0 (7.3.35)
𝑫 (𝜌∥ 𝜌) = 0 (7.3.36)
for every state 𝜌. We should clarify that this assumption is quite minimal.
The reason is that it is essentially a direct consequence of (7.3.1) up to an
inessential additive factor. That is, (7.3.1) implies that there exists a constant 𝑐
such that
𝑫 (𝜌∥ 𝜌) = 𝑐 (7.3.37)
for every state 𝜌. To see this, consider that one can get from the state 𝜌 to
another state 𝜔 by means of a trace and replace channel Tr[·]𝜔, so that (7.3.1)
implies that
𝑫 (𝜌∥ 𝜌) ≥ 𝑫 (𝜔∥𝜔). (7.3.38)
However, by the same argument, 𝑫 (𝜔∥𝜔) ≥ 𝑫 (𝜌∥ 𝜌), so that the claim holds.
So the assumption in (7.3.36) amounts to a redefinition of the generalized
divergence as
𝑫 ′ (𝜌∥𝜎) := 𝑫 (𝜌∥𝜎) − 𝑐. (7.3.39)
The Rényi entropy is defined for all 𝛼 ∈ (0, 1) ∪ (1, ∞), and one evaluates its value
at 𝛼 ∈ {0, 1, ∞} by taking limits:
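\begin{align}
H_0(\rho) &:= \lim_{\alpha \to 0} H_\alpha(\rho) = \log_2 \operatorname{rank}(\rho), \tag{7.4.5} \\
H_1(\rho) &:= \lim_{\alpha \to 1} H_\alpha(\rho) = -\operatorname{Tr}[\rho \log_2 \rho] = H(\rho), \tag{7.4.6} \\
H_\infty(\rho) &:= \lim_{\alpha \to \infty} H_\alpha(\rho) = -\log_2 \lambda_{\max}(\rho). \tag{7.4.7}
\end{align}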
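The three limits in (7.4.5)–(7.4.7) can be observed numerically from the spectrum of a state. The sketch below assumes the standard formula 𝐻𝛼(𝜌) = (1/(1 − 𝛼)) log2 Tr[𝜌^𝛼], which is not restated in this passage and is labeled here as an assumption; the spectrum used is an arbitrary example.

```python
import math

def renyi_entropy(eigs, alpha):
    """Assumed formula: H_alpha = log2(sum_i eig_i^alpha) / (1 - alpha)."""
    if alpha <= 0.0 or alpha == 1.0:
        raise ValueError("alpha must lie in (0, 1) or (1, infinity)")
    support = [x for x in eigs if x > 0.0]
    return math.log2(sum(x ** alpha for x in support)) / (1.0 - alpha)

eigs = [0.5, 0.25, 0.25, 0.0]  # spectrum of a rank-3 state

near_zero = renyi_entropy(eigs, 1e-6)      # compare with log2 rank(rho)
near_one = renyi_entropy(eigs, 1.0 + 1e-6)  # compare with the von Neumann entropy
large = renyi_entropy(eigs, 500.0)          # compare with -log2 lambda_max(rho)

shannon = -sum(x * math.log2(x) for x in eigs if x > 0.0)
print(near_zero, math.log2(3))
print(near_one, shannon)
print(large, -math.log2(max(eigs)))
```

The Rényi entropy is non-increasing in 𝛼, so the three values appear in decreasing order.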
Proposition 7.21
For every state 𝜌 and positive semi-definite operator 𝜎,
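\[
D_\alpha(\rho \| \sigma) = \lim_{\varepsilon \to 0^+} \frac{1}{\alpha - 1} \log_2 \operatorname{Tr}\!\left[ \rho^\alpha (\sigma + \varepsilon \mathbb{1})^{1 - \alpha} \right]. \tag{7.4.8}
\]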
Proof: For 𝛼 ∈ (0, 1), this is immediate from the fact that the logarithm, trace,
and power functions are continuous, so that the limit can be brought inside the trace
and inside the power (𝜎 + 𝜀 1) 1−𝛼 .
For 𝛼 ∈ (1, ∞), since 1 − 𝛼 is negative and 𝜎 is not necessarily invertible, we
first decompose the underlying Hilbert space H as H = supp(𝜎) ⊕ ker(𝜎), just as
we did in (7.2.6), in order to write
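\[
\rho = \begin{pmatrix} \rho_{0,0} & \rho_{0,1} \\ \rho_{0,1}^\dagger & \rho_{1,1} \end{pmatrix}, \qquad \sigma = \begin{pmatrix} \sigma & 0 \\ 0 & 0 \end{pmatrix}. \tag{7.4.9}
\]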
Then, writing 1 = Π𝜎 + Π𝜎⊥ , where Π𝜎 is the projection onto the support of 𝜎 and
Π𝜎⊥ is the projection onto the orthogonal complement of supp(𝜎), we find that
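\[
\sigma + \varepsilon \mathbb{1} = \begin{pmatrix} \sigma + \varepsilon \Pi_\sigma & 0 \\ 0 & \varepsilon \Pi_\sigma^\perp \end{pmatrix}, \tag{7.4.10}
\]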
which implies that
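\[
(\sigma + \varepsilon \mathbb{1})^{1 - \alpha} = \begin{pmatrix} (\sigma + \varepsilon \Pi_\sigma)^{1 - \alpha} & 0 \\ 0 & (\varepsilon \Pi_\sigma^\perp)^{1 - \alpha} \end{pmatrix}. \tag{7.4.11}
\]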
If supp(𝜌) ⊆ supp(𝜎), then 𝜌 = 𝜌0,0 , 𝜌0,1 = 0, and 𝜌1,1 = 0, which means that
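\[
\rho^\alpha (\sigma + \varepsilon \mathbb{1})^{1 - \alpha} = \begin{pmatrix} \rho^\alpha (\sigma + \varepsilon \Pi_\sigma)^{1 - \alpha} & 0 \\ 0 & 0 \end{pmatrix}, \tag{7.4.12}
\]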
so that
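\begin{align}
& \lim_{\varepsilon \to 0^+} \frac{1}{\alpha - 1} \log_2 \operatorname{Tr}\!\left[ \rho^\alpha (\sigma + \varepsilon \mathbb{1})^{1 - \alpha} \right] \notag \\
& \quad = \lim_{\varepsilon \to 0^+} \frac{1}{\alpha - 1} \log_2 \operatorname{Tr}\!\left[ \rho^\alpha (\sigma + \varepsilon \Pi_\sigma)^{1 - \alpha} \right] \tag{7.4.13} \\
& \quad = \frac{1}{\alpha - 1} \log_2 \operatorname{Tr}\!\left[ \rho^\alpha \sigma^{1 - \alpha} \right] \tag{7.4.14} \\
& \quad = D_\alpha(\rho \| \sigma), \tag{7.4.15}
\end{align}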
as required.
If supp(𝜌) ⊈ supp(𝜎), then the blocks 𝜌0,1 and 𝜌1,1 are generally non-zero,
and we obtain
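\begin{align}
& \rho^\alpha (\sigma + \varepsilon \mathbb{1})^{1 - \alpha} \notag \\
& \quad = \begin{pmatrix} \rho_{0,0} & \rho_{0,1} \\ \rho_{0,1}^\dagger & \rho_{1,1} \end{pmatrix}^{\alpha} \begin{pmatrix} (\sigma + \varepsilon \Pi_\sigma)^{1 - \alpha} & 0 \\ 0 & (\varepsilon \Pi_\sigma^\perp)^{1 - \alpha} \end{pmatrix} \tag{7.4.16} \\
& \quad = \varepsilon^{1 - \alpha} \left[ \begin{pmatrix} \rho_{0,0} & \rho_{0,1} \\ \rho_{0,1}^\dagger & \rho_{1,1} \end{pmatrix}^{\alpha} \begin{pmatrix} \varepsilon^{\alpha - 1} (\sigma + \varepsilon \Pi_\sigma)^{1 - \alpha} & 0 \\ 0 & (\Pi_\sigma^\perp)^{1 - \alpha} \end{pmatrix} \right]. \tag{7.4.17}
\end{align}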
Due to the fact that 𝛼 ∈ (1, ∞) it holds that lim𝜀→0+ 𝜀 1−𝛼 = +∞, and since the limit
𝜀 → 0+ of the matrix in square brackets in (7.4.17) is finite, we find that
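\[ \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\rho^{\alpha}(\sigma+\varepsilon\mathbb{1})^{1-\alpha}\right] = +\infty = D_\alpha(\rho\|\sigma) \tag{7.4.18} \]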
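The two cases of the proof can be illustrated numerically. The following sketch evaluates the regularized expression in (7.4.8) for α = 2 with a non-invertible σ: once with supp(ρ) ⊆ supp(σ), where the values converge to the support-restricted formula, and once with supp(ρ) ⊈ supp(σ), where they grow without bound as ε shrinks. The helper names and the example matrices are illustrative assumptions.

```python
import numpy as np

def psd_power(A, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh(A)
    wp = np.zeros_like(w)
    keep = w > tol
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def petz_regularized(rho, sigma, alpha, eps):
    """(1/(alpha-1)) log2 Tr[rho^alpha (sigma + eps*1)^(1-alpha)], cf. (7.4.8)."""
    d = rho.shape[0]
    shifted = sigma + eps * np.eye(d)
    q = np.trace(psd_power(rho, alpha) @ psd_power(shifted, 1 - alpha)).real
    return np.log2(q) / (alpha - 1)

sigma = np.diag([0.5, 0.5, 0.0])          # not invertible
rho_inside = np.diag([0.7, 0.3, 0.0])     # supp(rho) inside supp(sigma)
rho_outside = np.diag([0.6, 0.3, 0.1])    # supp(rho) not inside supp(sigma)
alpha = 2.0

# Support-restricted value log2 Tr[rho^2 sigma^(-1)] for the first case.
exact_inside = np.log2(0.7**2 / 0.5 + 0.3**2 / 0.5)
approx_inside = petz_regularized(rho_inside, sigma, alpha, 1e-8)

# Second case: values for decreasing eps keep increasing (divergence to +infinity).
growing = [petz_regularized(rho_outside, sigma, alpha, eps)
           for eps in (1e-2, 1e-4, 1e-6)]
```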
for 𝛼 ∈ (0, 1). For 𝛼 ∈ (1, ∞), the same expression defines 𝐷 𝛼 ( 𝑝∥𝑞) whenever
𝑞(𝑥) = 0 implies 𝑝(𝑥) = 0 for all 𝑥 ∈ X; otherwise, 𝐷 𝛼 ( 𝑝∥𝑞) = +∞.
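Recall the quantum Chernoff bound from Theorem 5.4 in Section 5.3.1, which
states that the optimal error exponent for the task of discriminating between the states
𝜌 and 𝜎 is
\[ \underline{\xi}(\rho,\sigma) = \overline{\xi}(\rho,\sigma) = C(\rho\|\sigma) := \sup_{\alpha\in(0,1)} -\log_2\operatorname{Tr}[\rho^{\alpha}\sigma^{1-\alpha}]. \tag{7.4.20} \]
Using the definition of the Petz–Rényi relative entropy, which gives
$-\log_2\operatorname{Tr}[\rho^{\alpha}\sigma^{1-\alpha}] = (1-\alpha)D_\alpha(\rho\|\sigma)$, we find that
\[ C(\rho\|\sigma) = \sup_{\alpha\in(0,1)} (1-\alpha)\,D_\alpha(\rho\|\sigma). \tag{7.4.21} \]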
The Petz–Rényi relative entropy thus plays a role in providing the optimal error ex-
ponent for the task of discriminating between two states (i.e., symmetric hypothesis
testing).
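As a rough illustration of how the Chernoff divergence relates to the Petz–Rényi quantities, the sketch below approximates the optimization over α ∈ (0, 1) on a finite grid. The grid resolution, the helper names, and the example states are assumptions made for this example; a grid search only approximates the supremum.

```python
import numpy as np

def psd_power(A, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh(A)
    wp = np.zeros_like(w)
    keep = w > tol
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def petz_quasi(rho, sigma, alpha):
    """Petz-Renyi relative quasi-entropy Tr[rho^alpha sigma^(1-alpha)]."""
    return np.trace(psd_power(rho, alpha) @ psd_power(sigma, 1 - alpha)).real

GRID = np.linspace(0.01, 0.99, 99)  # illustrative grid over (0, 1)

def chernoff_divergence(rho, sigma):
    """Grid approximation of the optimization over alpha in (7.4.20)."""
    return max(-np.log2(petz_quasi(rho, sigma, a)) for a in GRID)

def chernoff_via_petz(rho, sigma):
    """Same quantity written with (1 - alpha) * D_alpha."""
    return max((1 - a) * np.log2(petz_quasi(rho, sigma, a)) / (a - 1) for a in GRID)

rho = np.diag([0.9, 0.1])
sigma = np.diag([0.1, 0.9])
c_val = chernoff_divergence(rho, sigma)
c_petz = chernoff_via_petz(rho, sigma)
c_same = chernoff_divergence(rho, rho)
```

For this commuting pair the optimum sits at α = 1/2, where Tr[ρ^α σ^(1−α)] = 2·sqrt(0.09) = 0.6, so `c_val` should be close to −log2(0.6).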
Like the quantum relative entropy, the Petz–Rényi relative entropy is faithful,
meaning that, for 𝛼 ∈ (0, 1) ∪ (1, ∞) and states 𝜌, 𝜎,
𝐷 𝛼 (𝜌∥𝜎) = 0 ⇐⇒ 𝜌 = 𝜎. (7.4.22)
We prove this in Proposition 7.36 in the next section, as it requires results from
both this section and the next one. Also, like the quantum relative entropy, the
Petz–Rényi relative entropy is a generalized divergence for certain values of 𝛼,
which is shown in Theorem 7.24 below.
Before getting to Theorem 7.24, we first discuss several important properties of
the Petz–Rényi relative entropy. Let us note that, if 𝜌 and 𝜎 act on a 𝑑-dimensional
Hilbert space and are invertible, then the Petz–Rényi relative quasi-entropy can be
written as
𝑄 𝛼 (𝜌∥𝜎) = ⟨𝜑 𝜌 |(𝜌 −1 ⊗ 𝜎 T ) 1−𝛼 |𝜑 𝜌 ⟩, (7.4.23)
where $|\varphi^{\rho}\rangle := (\rho^{1/2}\otimes\mathbb{1})|\Gamma\rangle$ is a purification of 𝜌 and $|\Gamma\rangle = \sum_{i=1}^{d}|i,i\rangle$. This is due
to the transpose trick in (2.2.40) and the identity in (2.2.41). We can extend the
applicability of this expression to states 𝜌 and positive semi-definite operators 𝜎
that are not invertible by noting that
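\[ D_\alpha(\rho\|\sigma) = \lim_{\varepsilon\to 0^+}\lim_{\delta\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[((1-\delta)\rho+\delta\pi)^{\alpha}(\sigma+\varepsilon\mathbb{1})^{1-\alpha}\right]. \tag{7.4.24} \]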
We start by establishing the important fact that the quantum relative entropy is
a special case of the Petz–Rényi relative entropy in the limit 𝛼 → 1.
Proposition 7.22
Let 𝜌 be a state and 𝜎 a positive semi-definite operator. Then, in the limit
𝛼 → 1, the Petz–Rényi relative entropy converges to the quantum relative
entropy:
\[ \lim_{\alpha\to 1} D_\alpha(\rho\|\sigma) = D(\rho\|\sigma). \tag{7.4.25} \]
Proof: Let us first consider the case 𝛼 ∈ (1, ∞). If supp(𝜌) ⊈ supp(𝜎), then
𝐷 𝛼 (𝜌∥𝜎) = +∞, so that lim𝛼→1+ 𝐷 𝛼 (𝜌∥𝜎) = +∞, consistent with the definition of
the quantum relative entropy in this case (see Definition 7.1). If supp(𝜌) ⊆ supp(𝜎),
then 𝐷 𝛼 (𝜌∥𝜎) is finite and we can write
\[ D_\alpha(\rho\|\sigma) = \frac{1}{\alpha-1}\log_2 Q_\alpha(\rho\|\sigma). \tag{7.4.26} \]
Now, let us define the function
\[ Q_{\alpha,\beta}(\rho\|\sigma) := \operatorname{Tr}[\rho^{\alpha}\sigma^{1-\beta}], \tag{7.4.27} \]
so that 𝑄 𝛼 (𝜌∥𝜎) = 𝑄 𝛼,𝛼 (𝜌∥𝜎). By noting that supp(𝜌) ⊆ supp(𝜎) implies
𝑄 1 (𝜌∥𝜎) = Tr[𝜌Π𝜎 ] = 1 (where Π𝜎 is the projection onto the support of 𝜎), and
since log2 1 = 0, we can write 𝐷 𝛼 (𝜌∥𝜎) as
\[ D_\alpha(\rho\|\sigma) = \frac{\log_2 Q_\alpha(\rho\|\sigma) - \log_2 Q_1(\rho\|\sigma)}{\alpha-1}, \tag{7.4.28} \]
so that
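\begin{align*}
\lim_{\alpha\to 1} D_\alpha(\rho\|\sigma) &= \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\log_2 Q_\alpha(\rho\|\sigma)\right|_{\alpha=1} \tag{7.4.29}\\
&= \frac{1}{\ln(2)}\,\frac{\left.\frac{\mathrm{d}}{\mathrm{d}\alpha}Q_\alpha(\rho\|\sigma)\right|_{\alpha=1}}{Q_1(\rho\|\sigma)} \tag{7.4.30}\\
&= \frac{1}{\ln(2)}\left.\frac{\mathrm{d}}{\mathrm{d}\alpha}Q_\alpha(\rho\|\sigma)\right|_{\alpha=1}, \tag{7.4.31}
\end{align*}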
where the first equality follows from the definition of the derivative and the second
equality from the derivative of the natural logarithm, along with the chain rule.
Using the function 𝑄 𝛼,𝛽 and the chain rule, we write
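\[ \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}Q_\alpha(\rho\|\sigma)\right|_{\alpha=1} = \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}Q_{\alpha,1}(\rho\|\sigma)\right|_{\alpha=1} + \left.\frac{\mathrm{d}}{\mathrm{d}\beta}Q_{1,\beta}(\rho\|\sigma)\right|_{\beta=1}. \tag{7.4.32} \]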
Then,
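\[ \frac{\mathrm{d}}{\mathrm{d}\alpha}Q_{\alpha,1}(\rho\|\sigma) = \frac{\mathrm{d}}{\mathrm{d}\alpha}\operatorname{Tr}[\rho^{\alpha}\Pi_\sigma] = \frac{\mathrm{d}}{\mathrm{d}\alpha}\operatorname{Tr}[\rho^{\alpha}] = \operatorname{Tr}[\rho^{\alpha}\ln\rho], \tag{7.4.33} \]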
where we used the fact that 𝜌 𝛼 Π𝜎 = 𝜌 𝛼 since supp(𝜌) ⊆ supp(𝜎). Therefore,
\[ \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}Q_{\alpha,1}(\rho\|\sigma)\right|_{\alpha=1} = \operatorname{Tr}[\rho\ln\rho]. \tag{7.4.34} \]
Similarly,
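\[ \frac{\mathrm{d}}{\mathrm{d}\beta}Q_{1,\beta}(\rho\|\sigma) = \frac{\mathrm{d}}{\mathrm{d}\beta}\operatorname{Tr}[\rho\,\sigma^{1-\beta}] = -\operatorname{Tr}[\rho\,\sigma^{1-\beta}\ln\sigma], \tag{7.4.35} \]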
so that
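\[ \left.\frac{\mathrm{d}}{\mathrm{d}\beta}Q_{1,\beta}(\rho\|\sigma)\right|_{\beta=1} = -\operatorname{Tr}[\rho\,\Pi_\sigma\ln\sigma] = -\operatorname{Tr}[\rho\ln\sigma], \tag{7.4.36} \]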
where the last equality follows from the fact that the support condition supp(𝜌) ⊆
supp(𝜎) holds. So we find that
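\[ \lim_{\alpha\to 1} D_\alpha(\rho\|\sigma) = \frac{1}{\ln(2)}\left.\frac{\mathrm{d}}{\mathrm{d}\alpha}Q_\alpha(\rho\|\sigma)\right|_{\alpha=1} = \operatorname{Tr}[\rho\log_2\rho] - \operatorname{Tr}[\rho\log_2\sigma]. \tag{7.4.37} \]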
If supp(𝜌) ⊈ supp(𝜎) (and Tr[𝜌𝜎] ≠ 0), then observe that we can write 𝐷 𝛼 as
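\[ D_\alpha(\rho\|\sigma) = \frac{\log_2 Q_\alpha(\rho\|\sigma) - \log_2 Q_1(\rho\|\sigma)}{\alpha-1} + \frac{\log_2 Q_1(\rho\|\sigma)}{\alpha-1}, \tag{7.4.40} \]
so that
\[ \lim_{\alpha\to 1^-} D_\alpha(\rho\|\sigma) = \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\log_2 Q_\alpha(\rho\|\sigma)\right|_{\alpha=1} + \lim_{\alpha\to 1^-}\frac{\log_2\operatorname{Tr}[\rho\,\Pi_\sigma]}{\alpha-1}, \tag{7.4.41} \]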
where we have used 𝑄 1 (𝜌∥𝜎) = Tr[𝜌Π𝜎 ]. Now, since Tr[𝜌𝜎] ≠ 0 and supp(𝜌)
⊈ supp(𝜎), we have that 0 < Tr[𝜌Π𝜎 ] < 1, which means that log2 Tr[𝜌Π𝜎 ] < 0.
Since $\lim_{\alpha\to 1^-}\frac{1}{\alpha-1} = -\infty$, we get that the second term in (7.4.41) is equal to +∞,
which means that lim𝛼→1− 𝐷 𝛼 (𝜌∥𝜎) = +∞. Therefore,
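\begin{align*}
\lim_{\alpha\to 1^-} D_\alpha(\rho\|\sigma)
&= \begin{cases}\operatorname{Tr}[\rho\log_2\rho] - \operatorname{Tr}[\rho\log_2\sigma] & \text{if } \operatorname{supp}(\rho)\subseteq\operatorname{supp}(\sigma),\\ +\infty & \text{otherwise}\end{cases} \tag{7.4.42}\\
&= D(\rho\|\sigma).
\end{align*}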
To conclude, we have that lim𝛼→1+ 𝐷 𝛼 (𝜌∥𝜎) = lim𝛼→1− 𝐷 𝛼 (𝜌∥𝜎) = 𝐷 (𝜌∥𝜎),
which means that
\[ \lim_{\alpha\to 1} D_\alpha(\rho\|\sigma) = D(\rho\|\sigma), \tag{7.4.43} \]
as required. ■
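The convergence in Proposition 7.22 can be observed numerically for invertible states. The sketch below, with illustrative helper names and randomly generated full-rank states (both assumptions of this example), evaluates D_α slightly below and slightly above α = 1 and compares the values with Tr[ρ log2 ρ] − Tr[ρ log2 σ].

```python
import numpy as np

def full_rank_state(d, rng):
    """Random density matrix mixed with the maximally mixed state (full rank)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    rho = rho / np.trace(rho).real
    return 0.9 * rho + 0.1 * np.eye(d) / d

def herm_fun(A, f):
    """Apply a scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(A)
    return (v * f(w)) @ v.conj().T

def petz_renyi(rho, sigma, alpha):
    """D_alpha = (1/(alpha-1)) log2 Tr[rho^alpha sigma^(1-alpha)], invertible inputs."""
    ra = herm_fun(rho, lambda w: w ** alpha)
    sb = herm_fun(sigma, lambda w: w ** (1 - alpha))
    return np.log2(np.trace(ra @ sb).real) / (alpha - 1)

def relative_entropy(rho, sigma):
    """D(rho||sigma) = Tr[rho log2 rho] - Tr[rho log2 sigma], invertible inputs."""
    return np.trace(rho @ (herm_fun(rho, np.log2) - herm_fun(sigma, np.log2))).real

rng = np.random.default_rng(0)
rho, sigma = full_rank_state(3, rng), full_rank_state(3, rng)

d_exact = relative_entropy(rho, sigma)
d_below = petz_renyi(rho, sigma, 1 - 1e-5)   # alpha -> 1 from below
d_above = petz_renyi(rho, sigma, 1 + 1e-5)   # alpha -> 1 from above
```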
𝐷 𝛼 (𝜌∥𝜎) = 𝐷 𝛼 (𝑉 𝜌𝑉 † ∥𝑉 𝜎𝑉 † ). (7.4.44)
where
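\begin{align*}
\rho_{XA} &:= \sum_{x\in\mathcal{X}} p(x)\,|x\rangle\langle x|_X\otimes\rho_A^{x}, \tag{7.4.47}\\
\sigma_{XA} &:= \sum_{x\in\mathcal{X}} q(x)\,|x\rangle\langle x|_X\otimes\sigma_A^{x}. \tag{7.4.48}
\end{align*}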
Remark: Observe that the direct-sum property analogous to that for the quantum relative entropy
(see Proposition 7.3) does not hold for the Petz–Rényi relative entropy for every 𝛼 ∈ (0, 1) ∪ (1, ∞).
We can instead only make a statement for the Petz–Rényi relative quasi-entropy.
Proof:
1. Proof of isometric invariance: Let us start by writing 𝐷 𝛼 (𝜌∥𝜎) as in (7.4.8):
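\[ D_\alpha(\rho\|\sigma) = \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\rho^{\alpha}(\sigma+\varepsilon\mathbb{1})^{1-\alpha}\right]. \tag{7.4.49} \]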
Then, using the fact that (𝑉 𝜌𝑉 † ) 𝛼 = 𝑉 𝜌 𝛼𝑉 † , we find that
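\begin{align*}
D_\alpha(V\rho V^{\dagger}\|V\sigma V^{\dagger})
&= \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[(V\rho V^{\dagger})^{\alpha}(V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{1-\alpha}\right] \tag{7.4.50}\\
&= \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[V\rho^{\alpha}V^{\dagger}(V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{1-\alpha}\right]. \tag{7.4.51}
\end{align*}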
Therefore,
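\begin{align*}
\operatorname{Tr}\!\left[V\rho^{\alpha}V^{\dagger}(V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{1-\alpha}\right]
&= \operatorname{Tr}\!\left[V\rho^{\alpha}V^{\dagger}V(\sigma+\varepsilon\mathbb{1})^{1-\alpha}V^{\dagger} + \varepsilon^{1-\alpha}V\rho^{\alpha}V^{\dagger}\hat{\Pi}\right] \tag{7.4.54}\\
&= \operatorname{Tr}\!\left[V\rho^{\alpha}(\sigma+\varepsilon\mathbb{1})^{1-\alpha}V^{\dagger}\right] \tag{7.4.55}\\
&= \operatorname{Tr}\!\left[\rho^{\alpha}(\sigma+\varepsilon\mathbb{1})^{1-\alpha}\right], \tag{7.4.56}
\end{align*}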
where the second equality follows from the fact that 𝑉 † Π̂𝑉 = 𝑉 †𝑉 −𝑉 †𝑉𝑉 †𝑉 =
1 − 1 = 0, and the last equality from cyclicity of the trace. Therefore,
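\begin{align*}
D_\alpha(V\rho V^{\dagger}\|V\sigma V^{\dagger}) &= \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\rho^{\alpha}(\sigma+\varepsilon\mathbb{1})^{1-\alpha}\right] \tag{7.4.57}\\
&= D_\alpha(\rho\|\sigma), \tag{7.4.58}
\end{align*}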
as required.
2. Proof of monotonicity in 𝛼: Using the expression in (7.4.2) for 𝐷 𝛼 along with
the form in (7.4.23) for the quasi-entropy 𝑄 𝛼 , let us write 𝐷 𝛼 (𝜌∥𝜎) as
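\[ D_\alpha(\rho\|\sigma) = \frac{1}{\alpha-1}\,\frac{\ln\langle\varphi^{\rho}|X^{1-\alpha}|\varphi^{\rho}\rangle}{\ln(2)} = -\frac{1}{\gamma}\,\frac{\ln\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle}{\ln(2)}, \tag{7.4.59} \]
where $X = \rho^{-1}\otimes\sigma^{\mathsf{T}}$, $\gamma := 1-\alpha$, and $|\varphi^{\rho}\rangle = (\rho^{1/2}\otimes\mathbb{1})|\Gamma\rangle$ is a purification of
𝜌. We first prove the result for 𝜌 invertible, and the proof for non-invertible
states 𝜌 follows by (7.4.24). Now, since $\frac{\mathrm{d}}{\mathrm{d}\alpha} = \frac{\mathrm{d}\gamma}{\mathrm{d}\alpha}\frac{\mathrm{d}}{\mathrm{d}\gamma} = -\frac{\mathrm{d}}{\mathrm{d}\gamma}$, we find that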
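\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}\alpha}D_\alpha(\rho\|\sigma)
&= \frac{1}{\ln(2)}\frac{\mathrm{d}}{\mathrm{d}\gamma}\left(\frac{1}{\gamma}\ln\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle\right) \tag{7.4.60}\\
&= \frac{1}{\ln(2)}\left(-\frac{1}{\gamma^{2}}\ln\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle + \frac{1}{\gamma}\,\frac{\langle\varphi^{\rho}|X^{\gamma}\ln X|\varphi^{\rho}\rangle}{\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle}\right) \tag{7.4.61}\\
&= \frac{1}{\ln(2)}\,\frac{-\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle\ln\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle + \gamma\,\langle\varphi^{\rho}|X^{\gamma}\ln X|\varphi^{\rho}\rangle}{\gamma^{2}\,\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle} \tag{7.4.62}\\
&= \frac{1}{\ln(2)}\,\frac{-\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle\ln\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle + \langle\varphi^{\rho}|X^{\gamma}\ln X^{\gamma}|\varphi^{\rho}\rangle}{\gamma^{2}\,\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle}. \tag{7.4.63}
\end{align*}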
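Letting $g(x) := x\log_2 x$, it follows that
\[ \frac{\mathrm{d}}{\mathrm{d}\alpha}D_\alpha(\rho\|\sigma) = \frac{\langle\varphi^{\rho}|g(X^{\gamma})|\varphi^{\rho}\rangle - g(\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle)}{\gamma^{2}\,\langle\varphi^{\rho}|X^{\gamma}|\varphi^{\rho}\rangle}. \tag{7.4.64} \]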
Then,
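\begin{align*}
\rho_{XA}^{\alpha} &= \sum_{x\in\mathcal{X}} |x\rangle\langle x|_X\otimes(p(x)\rho_A^{x})^{\alpha}, \tag{7.4.73}\\
\sigma_{XA}^{1-\alpha} &= \sum_{x\in\mathcal{X}} |x\rangle\langle x|_X\otimes(q(x)\sigma_A^{x})^{1-\alpha}, \tag{7.4.74}
\end{align*}
so that
\begin{align*}
\rho_{XA}^{\alpha}\sigma_{XA}^{1-\alpha} &= \sum_{x\in\mathcal{X}} |x\rangle\langle x|_X\otimes(p(x)\rho_A^{x})^{\alpha}(q(x)\sigma_A^{x})^{1-\alpha} \tag{7.4.75}\\
&= \sum_{x\in\mathcal{X}} p(x)^{\alpha}q(x)^{1-\alpha}\,|x\rangle\langle x|_X\otimes(\rho_A^{x})^{\alpha}(\sigma_A^{x})^{1-\alpha}, \tag{7.4.76}
\end{align*}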
and
as required. ■
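Two of the properties proved above, isometric invariance and monotonicity in α, lend themselves to a quick numerical check. The sketch below uses randomly generated full-rank qubit states and a random isometry into a four-dimensional space; powers of non-invertible operators are taken on the support, which (7.4.8) shows is equivalent to the ε-limit. All helper names and the specific α values are assumptions of this example.

```python
import numpy as np

def psd_power(A, p, tol=1e-10):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh((A + A.conj().T) / 2)
    wp = np.zeros_like(w)
    keep = w > tol
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def petz_renyi(rho, sigma, alpha):
    """D_alpha(rho||sigma) for supp(rho) contained in supp(sigma)."""
    q = np.trace(psd_power(rho, alpha) @ psd_power(sigma, 1 - alpha)).real
    return np.log2(q) / (alpha - 1)

def full_rank_state(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(d) / d

rng = np.random.default_rng(1)
rho, sigma = full_rank_state(2, rng), full_rank_state(2, rng)

# Random isometry V from C^2 into C^4 (orthonormal columns via reduced QR).
g = rng.normal(size=(4, 2)) + 1j * rng.normal(size=(4, 2))
V, _ = np.linalg.qr(g)
rho_v = V @ rho @ V.conj().T
sigma_v = V @ sigma @ V.conj().T

# Isometric invariance: largest deviation over a few values of alpha.
invariance_gap = max(abs(petz_renyi(rho, sigma, a) - petz_renyi(rho_v, sigma_v, a))
                     for a in (0.3, 0.7, 1.5, 3.0))

# Monotonicity in alpha: values on an increasing grid that skips alpha = 1.
alphas = np.concatenate([np.linspace(0.1, 0.9, 9), np.linspace(1.1, 3.0, 20)])
values = np.array([petz_renyi(rho, sigma, a) for a in alphas])
```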
We now prove the data-processing inequality for the Petz–Rényi relative entropy
for 𝛼 ∈ [0, 1) ∪ (1, 2].
Proof: We prove the statement for 𝛼 ∈ (0, 1) ∪ (1, 2]. The case of 𝛼 = 0 then
follows by taking the limit 𝛼 → 0. From Stinespring’s theorem (Theorem 4.3), we
know that the action of every channel N on a linear operator 𝑋 can be written as
which can be verified in a similar manner to the proof of (7.4.8) in Proposition 7.21.
Using the quasi-entropy 𝑄 𝛼 , we can equivalently write (7.4.82) as
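\begin{align*}
Q_\alpha(\rho_{AB}\|\sigma_{AB}) &= \langle\varphi^{\rho_{AB}}|\,f(\rho_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})\,|\varphi^{\rho_{AB}}\rangle, \tag{7.4.86}\\
Q_\alpha(\rho_{A}\|\sigma_{A}) &= \langle\varphi^{\rho_{A}}|\,f(\rho_{A}^{-1}\otimes\sigma_{\hat{A}}^{\mathsf{T}})\,|\varphi^{\rho_{A}}\rangle, \tag{7.4.87}
\end{align*}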
= |𝜑 𝜌 𝐴𝐵 ⟩. (7.4.94)
We thus obtain, using the operator Jensen inequality (Theorem 2.16),
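\begin{align*}
Q_\alpha(\rho_{AB}\|\sigma_{AB}) &= \langle\varphi^{\rho_{A}}|\,V^{\dagger}f(\rho_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})V\,|\varphi^{\rho_{A}}\rangle \tag{7.4.95}\\
&\geq \langle\varphi^{\rho_{A}}|\,f(V^{\dagger}(\rho_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})V)\,|\varphi^{\rho_{A}}\rangle, \quad \text{for } \alpha\in(1,2], \tag{7.4.96}
\end{align*}
and
\begin{align*}
Q_\alpha(\rho_{AB}\|\sigma_{AB}) &= \langle\varphi^{\rho_{A}}|\,V^{\dagger}f(\rho_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})V\,|\varphi^{\rho_{A}}\rangle \tag{7.4.97}\\
&\leq \langle\varphi^{\rho_{A}}|\,f(V^{\dagger}(\rho_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})V)\,|\varphi^{\rho_{A}}\rangle, \quad \text{for } \alpha\in(0,1). \tag{7.4.98}
\end{align*}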
Note that the operator Jensen inequality is applicable because for 𝛼 ∈ (1, 2] the
function 𝑓 in (7.4.88) is operator convex and for 𝛼 ∈ (0, 1) it is operator concave.2
Now, consider that
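\begin{align*}
V^{\dagger}(\rho_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})V
&= \langle\Gamma|_{B\hat{B}}\,(\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{\hat{A}})\,\rho_{AB}^{\frac{1}{2}}\,(\rho_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})\,\rho_{AB}^{\frac{1}{2}}\,(\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{\hat{A}})\,|\Gamma\rangle_{B\hat{B}} \tag{7.4.99}\\
&= \langle\Gamma|_{B\hat{B}}\,(\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{\hat{A}})(\rho_{AB}^{0}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})(\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{\hat{A}})\,|\Gamma\rangle_{B\hat{B}} \tag{7.4.100}\\
&= \langle\Gamma|_{B\hat{B}}\,(\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{\hat{A}})(\mathbb{1}_{AB}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})(\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{\hat{A}})\,|\Gamma\rangle_{B\hat{B}} \tag{7.4.101}\\
&= \rho_{A}^{-1}\otimes\langle\Gamma|_{B\hat{B}}\,\sigma_{\hat{A}\hat{B}}^{\mathsf{T}}\,|\Gamma\rangle_{B\hat{B}} \tag{7.4.102}\\
&= \rho_{A}^{-1}\otimes\sigma_{\hat{A}}^{\mathsf{T}}, \tag{7.4.103}
\end{align*}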
where the last equality follows from the fact that
⟨Γ| 𝐵 𝐵ˆ 𝜎𝐴Tˆ 𝐵ˆ |Γ⟩𝐵 𝐵ˆ = Tr 𝐵ˆ [𝜎𝐴Tˆ 𝐵ˆ ] = 𝜎𝐴Tˆ , (7.4.104)
the last equality due to the fact that the transpose is taken on a product basis for
H 𝐴ˆ ⊗ H𝐵ˆ . Therefore, we find that
𝑄 𝛼 (𝜌 𝐴𝐵 ∥𝜎𝐴𝐵 ) ≥ 𝑄 𝛼 (𝜌 𝐴 ∥𝜎𝐴 ), for 𝛼 ∈ (1, 2], (7.4.105)
𝑄 𝛼 (𝜌 𝐴𝐵 ∥𝜎𝐴𝐵 ) ≤ 𝑄 𝛼 (𝜌 𝐴 ∥𝜎𝐴 ), for 𝛼 ∈ (0, 1), (7.4.106)
as required. This establishes the data-processing inequality for 𝐷 𝛼 under partial
trace. Combining this with the isometric invariance of 𝐷 𝛼 and Stinespring’s
theorem, we conclude that
𝐷 𝛼 (𝜌∥𝜎) ≥ 𝐷 𝛼 (N(𝜌)∥N(𝜎)), 𝛼 ∈ (0, 1) ∪ (1, 2] (7.4.107)
for every state 𝜌, positive semi-definite operator 𝜎, and channel N. ■
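The partial-trace form of the data-processing inequality can be spot-checked numerically. The sketch below draws random full-rank two-qubit states and compares D_α before and after tracing out system B for several α ∈ (0, 1) ∪ (1, 2]. It is a sanity check on one random instance, not a proof; the helper names and the sampled α values are assumptions of this example.

```python
import numpy as np

def psd_power(A, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh((A + A.conj().T) / 2)
    wp = np.zeros_like(w)
    keep = w > tol
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def petz_renyi(rho, sigma, alpha):
    q = np.trace(psd_power(rho, alpha) @ psd_power(sigma, 1 - alpha)).real
    return np.log2(q) / (alpha - 1)

def full_rank_state(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(d) / d

def trace_out_B(op, dA, dB):
    """Partial trace over the second tensor factor of an operator on A (x) B."""
    return np.trace(op.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

rng = np.random.default_rng(2)
dA = dB = 2
rho_AB = full_rank_state(dA * dB, rng)
sigma_AB = full_rank_state(dA * dB, rng)
rho_A = trace_out_B(rho_AB, dA, dB)
sigma_A = trace_out_B(sigma_AB, dA, dB)

# Differences D_alpha(rho_AB||sigma_AB) - D_alpha(rho_A||sigma_A); each should be >= 0.
gaps = {a: petz_renyi(rho_AB, sigma_AB, a) - petz_renyi(rho_A, sigma_A, a)
        for a in (0.25, 0.5, 0.75, 1.5, 2.0)}
```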
2 Indeed, the function 𝑥 𝛽 is operator convex for 𝛽 ∈ [−1, 0) ∪ [1, 2] and operator concave for
𝛽 ∈ (0, 1], where here 𝛽 = 1 − 𝛼.
The data-processing inequality for the Petz–Rényi relative entropy can be written
using the Petz–Rényi relative quasi-entropy 𝑄 𝛼 as
\[ \frac{1}{\alpha-1}\log_2 Q_\alpha(\rho\|\sigma) \geq \frac{1}{\alpha-1}\log_2 Q_\alpha(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)) \tag{7.4.109} \]
for all 𝛼 ∈ [0, 1) ∪ (1, 2]. Then, since 𝛼 − 1 is negative for 𝛼 ∈ [0, 1), we can use
the monotonicity of the function log2 to conclude that
With the data-processing inequality for the Petz–Rényi relative entropy in hand,
it is now straightforward to prove some of the following additional properties.
Proof:
1. By the data-processing inequality for 𝐷 𝛼 with respect to the trace channel Tr,
and letting 𝑥 = Tr(𝜌) = 1 and 𝑦 = Tr(𝜎), we find that
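\begin{align*}
D_\alpha(\rho\|\sigma) \geq D_\alpha(x\|y) &= \frac{1}{\alpha-1}\log_2\operatorname{Tr}[x^{\alpha}y^{1-\alpha}] \tag{7.4.112}\\
&= \frac{1}{\alpha-1}\log_2(y^{1-\alpha}) \tag{7.4.113}\\
&= \frac{1-\alpha}{\alpha-1}\log_2 y \tag{7.4.114}\\
&= -\log_2 y \tag{7.4.115}\\
&\geq 0, \tag{7.4.116}
\end{align*}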
where the last line follows from the assumption that 𝑦 = Tr(𝜎) ≤ 1.
2. Proof of faithfulness: If 𝜌 = 𝜎, then the following equalities hold for all
𝛼 ∈ (0, 1) ∪ (1, 2):
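\begin{align*}
D_\alpha(\rho\|\rho) &= \frac{1}{\alpha-1}\log_2\operatorname{Tr}[\rho^{\alpha}\rho^{1-\alpha}] \tag{7.4.117}\\
&= \frac{1}{\alpha-1}\log_2\operatorname{Tr}(\rho) \tag{7.4.118}\\
&= 0. \tag{7.4.119}
\end{align*}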
Next, suppose that 𝛼 ∈ (0, 1) ∪ (1, 2) and 𝐷 𝛼 (𝜌∥𝜎) = 0. From the above, we
conclude that 𝐷 𝛼 (Tr(𝜌)∥Tr(𝜎)) = − log2 𝑦 ≥ 0. From the fact that log2 𝑦 = 0
if and only if 𝑦 = 1, we conclude that 𝐷 𝛼 (𝜌∥𝜎) = 0 implies Tr(𝜎) = Tr(𝜌) = 1,
so that 𝜎 is a density operator. Then, for every measurement channel M,
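\begin{align*}
\hat{\rho} &:= |0\rangle\langle 0|\otimes\rho, \tag{7.4.124}\\
\hat{\sigma} &:= |0\rangle\langle 0|\otimes\rho + |1\rangle\langle 1|\otimes(\sigma-\rho). \tag{7.4.125}
\end{align*}
\[ 0 = D_\alpha(\rho\|\rho) = D_\alpha(\hat{\rho}\|\hat{\sigma}) \geq D_\alpha(\rho\|\sigma), \tag{7.4.126} \]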
where the inequality follows from data processing with respect to partial trace
over the classical register.
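4. Consider the state $\hat{\rho} := |0\rangle\langle 0|\otimes\rho$ and the operator $\hat{\sigma} := |0\rangle\langle 0|\otimes\sigma + |1\rangle\langle 1|\otimes(\sigma'-\sigma)$, which is positive semi-definite because 𝜎′ ≥ 𝜎 by assumption.
Then
\[ \hat{\rho}^{\alpha}\hat{\sigma}^{1-\alpha} = |0\rangle\langle 0|\otimes\rho^{\alpha}\sigma^{1-\alpha}, \tag{7.4.127} \]
which implies that
\[ D_\alpha(\hat{\rho}\|\hat{\sigma}) = D_\alpha(\rho\|\sigma). \tag{7.4.128} \]
Then, observing that $\operatorname{Tr}_1[\hat{\sigma}] = \sigma'$, and using the data-processing inequality
for 𝐷 𝛼 with respect to the partial trace channel Tr1 , we conclude that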
as required. ■
Furthermore, the Petz–Rényi relative entropy 𝐷 𝛼 is jointly convex for 𝛼 ∈ [0, 1):
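\[ D_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho_A^{x}\,\middle\|\,\sum_{x\in\mathcal{X}} p(x)\sigma_A^{x}\right) \leq \sum_{x\in\mathcal{X}} p(x)\,D_\alpha(\rho_A^{x}\|\sigma_A^{x}), \qquad \alpha\in[0,1). \tag{7.4.132} \]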
Then, since − log2 is a convex function, and using the definition of 𝐷 𝛼 in terms of
𝑄 𝛼 , we find that
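\begin{align*}
D_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho_A^{x}\,\middle\|\,\sum_{x\in\mathcal{X}} p(x)\sigma_A^{x}\right)
&\leq \sum_{x\in\mathcal{X}} p(x)\,\frac{1}{\alpha-1}\log_2 Q_\alpha(\rho_A^{x}\|\sigma_A^{x}) \tag{7.4.134}\\
&= \sum_{x\in\mathcal{X}} p(x)\,D_\alpha(\rho_A^{x}\|\sigma_A^{x}), \tag{7.4.135}
\end{align*}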
as required. ■
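\[ \widetilde{Q}_\alpha(\rho\|\sigma) := \begin{cases}\operatorname{Tr}\!\left[\left(\sigma^{\frac{1-\alpha}{2\alpha}}\rho\,\sigma^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}\right] & \text{if } \alpha\in(0,1), \text{ or } \alpha\in(1,\infty),\ \operatorname{supp}(\rho)\subseteq\operatorname{supp}(\sigma),\\ +\infty & \text{otherwise.}\end{cases} \tag{7.5.1} \]
The sandwiched Rényi relative entropy is then defined as
\[ \widetilde{D}_\alpha(\rho\|\sigma) := \frac{1}{\alpha-1}\log_2\widetilde{Q}_\alpha(\rho\|\sigma). \tag{7.5.2} \]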
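Observe that we can use the definition of the Schatten norm from (2.2.87) to
write the sandwiched Rényi relative entropy $\widetilde{D}_\alpha$ in the following different ways:
\begin{align*}
\widetilde{D}_\alpha(\rho\|\sigma) &= \frac{\alpha}{\alpha-1}\log_2\left\|\sigma^{\frac{1-\alpha}{2\alpha}}\rho\,\sigma^{\frac{1-\alpha}{2\alpha}}\right\|_{\alpha} \tag{7.5.3}\\
&= \frac{\alpha}{\alpha-1}\log_2\left\|\rho^{\frac{1}{2}}\sigma^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right\|_{\alpha} \tag{7.5.4}\\
&= \frac{2\alpha}{\alpha-1}\log_2\left\|\rho^{\frac{1}{2}}\sigma^{\frac{1-\alpha}{2\alpha}}\right\|_{2\alpha}. \tag{7.5.5}
\end{align*}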
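The equivalence of the definition (7.5.2) with the three Schatten-norm forms (7.5.3)–(7.5.5) can be confirmed numerically for invertible states. The sketch below computes all four expressions; `schatten_norm` uses the singular-value formula (Σ s_i^p)^(1/p), and the helper names and random test states are assumptions of this example.

```python
import numpy as np

def psd_power(A, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh((A + A.conj().T) / 2)
    wp = np.zeros_like(w)
    keep = w > tol
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def schatten_norm(X, p):
    """(sum of singular values to the power p)^(1/p)."""
    s = np.linalg.svd(X, compute_uv=False)
    return np.sum(s ** p) ** (1.0 / p)

def sandwiched_forms(rho, sigma, alpha):
    """Return the value from (7.5.2) and the three forms (7.5.3)-(7.5.5)."""
    g = (1 - alpha) / (2 * alpha)
    s_g = psd_power(sigma, g)
    r_half = psd_power(rho, 0.5)
    c = alpha / (alpha - 1)
    by_def = np.log2(np.trace(psd_power(s_g @ rho @ s_g, alpha)).real) / (alpha - 1)
    form_3 = c * np.log2(schatten_norm(s_g @ rho @ s_g, alpha))
    form_4 = c * np.log2(schatten_norm(r_half @ psd_power(sigma, 2 * g) @ r_half, alpha))
    form_5 = 2 * c * np.log2(schatten_norm(r_half @ s_g, 2 * alpha))
    return np.array([by_def, form_3, form_4, form_5])

def full_rank_state(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(d) / d

rng = np.random.default_rng(3)
rho, sigma = full_rank_state(3, rng), full_rank_state(3, rng)

# Spread (max minus min) of the four expressions for alpha below and above 1.
spreads = [np.ptp(sandwiched_forms(rho, sigma, a)) for a in (0.6, 2.5)]
```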
The expression in (7.5.4) and Proposition 2.8 then lead us to the following variational
where
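\[ \widetilde{D}_\alpha(\rho\|\sigma;\tau) := \begin{cases}+\infty & \text{if } \alpha>1,\ \operatorname{supp}(\rho)\nsubseteq\operatorname{supp}(\sigma),\\ \dfrac{\alpha}{\alpha-1}\log_2\operatorname{Tr}\!\left[\rho^{\frac{1}{2}}\sigma^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\tau^{\frac{\alpha-1}{\alpha}}\right] & \text{otherwise.}\end{cases} \tag{7.5.7} \]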
where we recall the definition of the fidelity 𝐹 (𝜌, 𝜎) from Definition 6.5.
In the case 𝛼 ∈ (1, ∞), since 1 − 𝛼 is negative, we take the inverse of 𝜎. In case
𝜎 is not invertible, we take the inverse of 𝜎 on its support. An alternative to this
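convention is to define $\widetilde{D}_\alpha(\rho\|\sigma)$ for 𝛼 > 1 using only positive definite 𝜎, and for
positive semi-definite 𝜎, define
\[ \widetilde{D}_\alpha(\rho\|\sigma) = \lim_{\varepsilon\to 0^+}\widetilde{D}_\alpha(\rho\|\sigma+\varepsilon\mathbb{1}). \tag{7.5.9} \]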
Both alternatives are equivalent, as we now show (similar to what we did in the
proofs of Propositions 7.2 and 7.21).
Proposition 7.29
For every state 𝜌 and positive semi-definite operator 𝜎,
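\[ \widetilde{D}_\alpha(\rho\|\sigma) = \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(\rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right)^{\alpha}\right]. \tag{7.5.10} \]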
Proof: For 𝛼 ∈ (0, 1), this is immediate from the fact that the logarithm, trace,
and power functions are continuous, so that the limit can be brought inside the trace
and inside the power $(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}$.
For 𝛼 ∈ (1, ∞), since 1 − 𝛼 is negative and 𝜎 is not necessarily invertible, let
us start by decomposing the underlying Hilbert space H as H = supp(𝜎) ⊕ ker(𝜎),
as in (7.2.6), so that
\[ \rho^{\frac{1}{2}} = \begin{pmatrix}(\rho^{\frac{1}{2}})_{0,0} & (\rho^{\frac{1}{2}})_{0,1}\\ (\rho^{\frac{1}{2}})_{0,1}^{\dagger} & (\rho^{\frac{1}{2}})_{1,1}\end{pmatrix}, \qquad \sigma = \begin{pmatrix}\sigma & 0\\ 0 & 0\end{pmatrix}. \tag{7.5.11} \]
Then, writing 1 = Π𝜎 + Π𝜎⊥ , where Π𝜎 is the projection onto the support of 𝜎 and
Π𝜎⊥ is the projection onto the orthogonal complement of supp(𝜎), we find that
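\[ \sigma+\varepsilon\mathbb{1} = \begin{pmatrix}\sigma+\varepsilon\Pi_\sigma & 0\\ 0 & \varepsilon\Pi_\sigma^{\perp}\end{pmatrix}, \tag{7.5.12} \]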
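If supp(𝜌) ⊆ supp(𝜎), then $(\rho^{\frac{1}{2}})_{0,1} = 0$, $(\rho^{\frac{1}{2}})_{1,1} = 0$, and $(\rho^{\frac{1}{2}})_{0,0} = \rho^{\frac{1}{2}}$, which
means that
\[ \rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}} = \begin{pmatrix}(\rho^{\frac{1}{2}})_{0,0}(\sigma+\varepsilon\Pi_\sigma)^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{0,0} & 0\\ 0 & 0\end{pmatrix}, \tag{7.5.14} \]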
so that
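\[ \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(\rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right)^{\alpha}\right] = \widetilde{D}_\alpha(\rho\|\sigma), \qquad \alpha\in(1,\infty), \tag{7.5.15} \]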
as required.
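If supp(𝜌) ⊈ supp(𝜎), then $(\rho^{\frac{1}{2}})_{1,1}$ is non-zero. In this case, we use the fact
that
\[ (\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}} = \begin{pmatrix}(\sigma+\varepsilon\Pi_\sigma)^{\frac{1-\alpha}{\alpha}} & 0\\ 0 & (\varepsilon\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}\end{pmatrix} \geq \begin{pmatrix}0 & 0\\ 0 & (\varepsilon\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}\end{pmatrix} \tag{7.5.16} \]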
to conclude that
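\begin{align*}
\rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}
&\geq \begin{pmatrix}(\rho^{\frac{1}{2}})_{0,1}(\varepsilon\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{0,1}^{\dagger} & (\rho^{\frac{1}{2}})_{0,1}(\varepsilon\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{1,1}\\ (\rho^{\frac{1}{2}})_{1,1}(\varepsilon\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{0,1}^{\dagger} & (\rho^{\frac{1}{2}})_{1,1}(\varepsilon\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{1,1}\end{pmatrix} \tag{7.5.17}\\
&= \varepsilon^{\frac{1-\alpha}{\alpha}}\begin{pmatrix}(\rho^{\frac{1}{2}})_{0,1}(\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{0,1}^{\dagger} & (\rho^{\frac{1}{2}})_{0,1}(\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{1,1}\\ (\rho^{\frac{1}{2}})_{1,1}(\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{0,1}^{\dagger} & (\rho^{\frac{1}{2}})_{1,1}(\Pi_\sigma^{\perp})^{\frac{1-\alpha}{\alpha}}(\rho^{\frac{1}{2}})_{1,1}\end{pmatrix}. \tag{7.5.18}
\end{align*}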
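Now, since 𝛼 ∈ (1, ∞), we have that $\lim_{\varepsilon\to 0^+}\varepsilon^{\frac{1-\alpha}{\alpha}} = +\infty$; therefore, by continuity
arguments similar to those given above, we conclude that
\[ \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(\rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right)^{\alpha}\right] \geq +\infty, \tag{7.5.19} \]
for the case 𝛼 ∈ (1, ∞) and supp(𝜌) ⊈ supp(𝜎). This implies that
\[ \lim_{\varepsilon\to 0^+}\frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(\rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right)^{\alpha}\right] = \widetilde{D}_\alpha(\rho\|\sigma), \tag{7.5.20} \]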
The Petz–Rényi and sandwiched Rényi relative entropies are two ways of
defining a quantum generalization of the classical Rényi relative entropy in (7.4.19).
Indeed, if 𝜌 and 𝜎 are both classical, commuting states (i.e., both are diagonal in
the same basis), then both $D_\alpha(\rho\|\sigma)$ and $\widetilde{D}_\alpha(\rho\|\sigma)$ reduce to the classical Rényi
relative entropy in (7.4.19). In general, there are often many (in fact, typically
infinitely many) ways to generalize classical quantities to the quantum (i.e., non-
commutative) case such that we recover the original classical quantity in the special
case of commuting operators. What distinguishes one generalization from another
is the role that they play in characterizing operational tasks in quantum information
theory, which is a theme explored throughout this book.
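The reduction to the classical quantity for commuting states is easy to verify numerically. In the sketch below, two states are built to be diagonal in the same (randomly rotated) basis, and both quantum quantities are compared with the classical Rényi relative entropy, taken here to be (1/(α−1)) log2 Σ_x p(x)^α q(x)^(1−α); this formula, the helper names, and the example distributions are assumptions of this example.

```python
import numpy as np

def psd_power(A, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh((A + A.conj().T) / 2)
    wp = np.zeros_like(w)
    keep = w > tol
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def petz_renyi(rho, sigma, alpha):
    q = np.trace(psd_power(rho, alpha) @ psd_power(sigma, 1 - alpha)).real
    return np.log2(q) / (alpha - 1)

def sandwiched_renyi(rho, sigma, alpha):
    s = psd_power(sigma, (1 - alpha) / (2 * alpha))
    q = np.trace(psd_power(s @ rho @ s, alpha)).real
    return np.log2(q) / (alpha - 1)

def classical_renyi(p, q, alpha):
    """Assumed classical formula: log2(sum_x p^alpha q^(1-alpha)) / (alpha - 1)."""
    return np.log2(np.sum(p ** alpha * q ** (1 - alpha))) / (alpha - 1)

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.1, 0.3, 0.6])

# Common eigenbasis: a random unitary from the QR decomposition.
rng = np.random.default_rng(4)
u, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
rho = u @ np.diag(p) @ u.conj().T
sigma = u @ np.diag(q) @ u.conj().T

# Largest deviation of either quantum quantity from the classical value.
deviation = max(
    max(abs(petz_renyi(rho, sigma, a) - classical_renyi(p, q, a)),
        abs(sandwiched_renyi(rho, sigma, a) - classical_renyi(p, q, a)))
    for a in (0.3, 0.8, 1.5, 4.0)
)
```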
We now establish the important fact that the quantum relative entropy is a
special case of the sandwiched Rényi relative entropy in the limit 𝛼 → 1. The
proof proceeds very similarly to the proof of the same property for the Petz–Rényi
relative entropy.
Proposition 7.30
Let 𝜌 be a state and 𝜎 a positive semi-definite operator. Then, in the limit
𝛼 → 1, the sandwiched Rényi relative entropy converges to the quantum relative
entropy:
\[ \lim_{\alpha\to 1}\widetilde{D}_\alpha(\rho\|\sigma) = D(\rho\|\sigma). \tag{7.5.21} \]
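Proof: Let us first consider the case 𝛼 ∈ (1, ∞). If supp(𝜌) ⊈ supp(𝜎), then
$\widetilde{D}_\alpha(\rho\|\sigma) = +\infty$ for all 𝛼 ∈ (1, ∞), so that $\lim_{\alpha\to 1^+}\widetilde{D}_\alpha(\rho\|\sigma) = +\infty$. If supp(𝜌) ⊆
supp(𝜎), then $\widetilde{D}_\alpha(\rho\|\sigma)$ is finite and using (7.5.4) we write
\[ \widetilde{D}_\alpha(\rho\|\sigma) = \frac{1}{\alpha-1}\log_2\widetilde{Q}_\alpha(\rho\|\sigma) = \frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(\rho^{\frac{1}{2}}\sigma^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right)^{\alpha}\right]. \tag{7.5.22} \]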
Let us define the function
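\[ \widetilde{Q}_{\alpha,\beta}(\rho\|\sigma) := \operatorname{Tr}\!\left[\left(\rho^{\frac{1}{2}}\sigma^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right)^{\beta}\right], \tag{7.5.23} \]
so that $\widetilde{Q}_\alpha(\rho\|\sigma) = \widetilde{Q}_{\alpha,\alpha}(\rho\|\sigma)$. By noting that supp(𝜌) ⊆ supp(𝜎) implies
$\widetilde{Q}_1(\rho\|\sigma) = \operatorname{Tr}[\rho\,\Pi_\sigma] = 1$ (where Π𝜎 is the projection onto the support of 𝜎), and
since log2 1 = 0, we can write $\widetilde{D}_\alpha(\rho\|\sigma)$ as
\[ \widetilde{D}_\alpha(\rho\|\sigma) = \frac{\log_2\widetilde{Q}_\alpha(\rho\|\sigma) - \log_2\widetilde{Q}_1(\rho\|\sigma)}{\alpha-1}, \tag{7.5.24} \]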
so that
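\begin{align*}
\lim_{\alpha\to 1}\widetilde{D}_\alpha(\rho\|\sigma) &= \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\log_2\widetilde{Q}_\alpha(\rho\|\sigma)\right|_{\alpha=1} \tag{7.5.25}\\
&= \frac{1}{\ln(2)}\,\frac{\left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\widetilde{Q}_\alpha(\rho\|\sigma)\right|_{\alpha=1}}{\widetilde{Q}_1(\rho\|\sigma)} \tag{7.5.26}\\
&= \frac{1}{\ln(2)}\left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\widetilde{Q}_\alpha(\rho\|\sigma)\right|_{\alpha=1}, \tag{7.5.27}
\end{align*}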
where the first equality follows from the definition of the derivative and the second
equality from the derivative of the natural logarithm, along with the chain rule.
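Using the function $\widetilde{Q}_{\alpha,\beta}$ and the chain rule, we write
\[ \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\widetilde{Q}_\alpha(\rho\|\sigma)\right|_{\alpha=1} = \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\widetilde{Q}_{\alpha,1}(\rho\|\sigma)\right|_{\alpha=1} + \left.\frac{\mathrm{d}}{\mathrm{d}\beta}\widetilde{Q}_{1,\beta}(\rho\|\sigma)\right|_{\beta=1}. \tag{7.5.28} \]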
Then,
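\[ \frac{\mathrm{d}}{\mathrm{d}\alpha}\widetilde{Q}_{\alpha,1}(\rho\|\sigma) = \frac{\mathrm{d}}{\mathrm{d}\alpha}\operatorname{Tr}\!\left[\rho\,\sigma^{\frac{1-\alpha}{\alpha}}\right] = -\frac{1}{\alpha^{2}}\operatorname{Tr}\!\left[\rho\,\sigma^{\frac{1-\alpha}{\alpha}}\ln\sigma\right], \tag{7.5.29} \]
so that
\[ \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\widetilde{Q}_{\alpha,1}(\rho\|\sigma)\right|_{\alpha=1} = -\operatorname{Tr}[\rho\,\Pi_\sigma\ln\sigma] = -\operatorname{Tr}[\rho\ln\sigma], \tag{7.5.30} \]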
where we used the fact that supp(𝜌) ⊆ supp(𝜎) to obtain the last equality. Similarly,
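\[ \frac{\mathrm{d}}{\mathrm{d}\beta}\widetilde{Q}_{1,\beta}(\rho\|\sigma) = \frac{\mathrm{d}}{\mathrm{d}\beta}\operatorname{Tr}\!\left[\left(\rho^{\frac{1}{2}}\Pi_\sigma\rho^{\frac{1}{2}}\right)^{\beta}\right] = \frac{\mathrm{d}}{\mathrm{d}\beta}\operatorname{Tr}[\rho^{\beta}] = \operatorname{Tr}[\rho^{\beta}\ln\rho], \tag{7.5.31} \]
where we again used the fact that supp(𝜌) ⊆ supp(𝜎) in order to conclude that
$\rho^{\frac{1}{2}}\Pi_\sigma\rho^{\frac{1}{2}} = \rho$. Therefore,
\[ \left.\frac{\mathrm{d}}{\mathrm{d}\beta}\widetilde{Q}_{1,\beta}(\rho\|\sigma)\right|_{\beta=1} = \operatorname{Tr}[\rho\ln\rho]. \tag{7.5.32} \]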
So we find that
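\[ \lim_{\alpha\to 1}\widetilde{D}_\alpha(\rho\|\sigma) = \frac{1}{\ln(2)}\left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\widetilde{Q}_\alpha(\rho\|\sigma)\right|_{\alpha=1} = \operatorname{Tr}[\rho\log_2\rho] - \operatorname{Tr}[\rho\log_2\sigma]. \tag{7.5.33} \]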
Let us now consider the case 𝛼 ∈ (0, 1). If supp(𝜌) ⊆ supp(𝜎), then since the
limit in (7.5.33) holds from both sides, we find that
\[ \lim_{\alpha\to 1^-}\widetilde{D}_\alpha(\rho\|\sigma) = \operatorname{Tr}[\rho\log_2\rho] - \operatorname{Tr}[\rho\log_2\sigma]. \tag{7.5.35} \]
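If supp(𝜌) ⊈ supp(𝜎) (and Tr[𝜌𝜎] ≠ 0), then observe that we can write $\widetilde{D}_\alpha$ as
\[ \widetilde{D}_\alpha(\rho\|\sigma) = \frac{\log_2\widetilde{Q}_\alpha(\rho\|\sigma) - \log_2\widetilde{Q}_1(\rho\|\sigma)}{\alpha-1} + \frac{\log_2\widetilde{Q}_1(\rho\|\sigma)}{\alpha-1}, \tag{7.5.36} \]
so that
\[ \lim_{\alpha\to 1^-}\widetilde{D}_\alpha(\rho\|\sigma) = \left.\frac{\mathrm{d}}{\mathrm{d}\alpha}\log_2\widetilde{Q}_\alpha(\rho\|\sigma)\right|_{\alpha=1} + \lim_{\alpha\to 1^-}\frac{\log_2\operatorname{Tr}[\rho\,\Pi_\sigma]}{\alpha-1}, \tag{7.5.37} \]
where we have used $\widetilde{Q}_1(\rho\|\sigma) = \operatorname{Tr}[\rho\,\Pi_\sigma]$. Now, since supp(𝜌) ⊈ supp(𝜎) and
Tr[𝜌𝜎] ≠ 0, we have that 0 < Tr[𝜌Π𝜎 ] < 1, which means that log2 Tr[𝜌Π𝜎 ] < 0.
Since $\lim_{\alpha\to 1^-}\frac{1}{\alpha-1} = -\infty$, we find that the second term in (7.5.37) is equal to +∞,
which means that $\lim_{\alpha\to 1^-}\widetilde{D}_\alpha(\rho\|\sigma) = +\infty$. Therefore,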
\[ \widetilde{D}_\alpha(\rho\|\sigma) = \widetilde{D}_\alpha(V\rho V^{\dagger}\|V\sigma V^{\dagger}). \tag{7.5.39} \]
where
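\begin{align*}
\rho_{XA} &:= \sum_{x\in\mathcal{X}} p(x)\,|x\rangle\langle x|_X\otimes\rho_A^{x}, \tag{7.5.42}\\
\sigma_{XA} &:= \sum_{x\in\mathcal{X}} q(x)\,|x\rangle\langle x|_X\otimes\sigma_A^{x}. \tag{7.5.43}
\end{align*}
\[ \widetilde{D}_\alpha(\rho_1\|\sigma) \leq \frac{\alpha}{\alpha-1}\log_2\gamma + \widetilde{D}_\alpha(\rho_2\|\sigma), \qquad \alpha>1. \tag{7.5.44} \]
6. For all 𝛼 ∈ (0, 1) ∪ (1, ∞), the sandwiched Rényi relative entropy $\widetilde{D}_\alpha$ is
always less than or equal to the Petz–Rényi relative entropy $D_\alpha$, i.e.,
\[ \widetilde{D}_\alpha(\rho\|\sigma) \leq D_\alpha(\rho\|\sigma). \tag{7.5.45} \]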
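The ordering in (7.5.45) between the sandwiched and Petz–Rényi relative entropies can be spot-checked on random non-commuting states. The following sketch is a sanity check on one random instance under the stated assumptions (invertible states, illustrative helper names and α values), not a substitute for the proof.

```python
import numpy as np

def psd_power(A, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh((A + A.conj().T) / 2)
    wp = np.zeros_like(w)
    keep = w > tol
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def petz_renyi(rho, sigma, alpha):
    q = np.trace(psd_power(rho, alpha) @ psd_power(sigma, 1 - alpha)).real
    return np.log2(q) / (alpha - 1)

def sandwiched_renyi(rho, sigma, alpha):
    s = psd_power(sigma, (1 - alpha) / (2 * alpha))
    q = np.trace(psd_power(s @ rho @ s, alpha)).real
    return np.log2(q) / (alpha - 1)

def full_rank_state(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(d) / d

rng = np.random.default_rng(5)
rho, sigma = full_rank_state(3, rng), full_rank_state(3, rng)

# Differences D_alpha - sandwiched D_alpha on a grid; each should be non-negative.
alphas = (0.2, 0.5, 0.8, 1.2, 2.0, 5.0)
ordering_gaps = [petz_renyi(rho, sigma, a) - sandwiched_renyi(rho, sigma, a)
                 for a in alphas]
```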
Proof:
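1. Proof of isometric invariance: Let us start by writing $\widetilde{D}_\alpha(\rho\|\sigma)$ using the
function ∥·∥ 𝛼 as in (7.5.4):
\[ \widetilde{D}_\alpha(\rho\|\sigma) = \lim_{\varepsilon\to 0^+}\frac{\alpha}{\alpha-1}\log_2\left\|\rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right\|_{\alpha}, \tag{7.5.47} \]
where we have also made use of the fact that for positive semi-definite operators,
$\widetilde{D}_\alpha(\rho\|\sigma)$ can be defined as in (7.5.10). Now,
\[ \widetilde{D}_\alpha(V\rho V^{\dagger}\|V\sigma V^{\dagger}) = \lim_{\varepsilon\to 0^+}\frac{\alpha}{\alpha-1}\log_2\left\|(V\rho V^{\dagger})^{\frac{1}{2}}(V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}(V\rho V^{\dagger})^{\frac{1}{2}}\right\|_{\alpha}. \tag{7.5.48} \]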
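Since $(V\rho V^{\dagger})^{\frac{1}{2}} = V\rho^{\frac{1}{2}}V^{\dagger}$, we find that
\begin{align*}
\left\|(V\rho V^{\dagger})^{\frac{1}{2}}(V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}(V\rho V^{\dagger})^{\frac{1}{2}}\right\|_{\alpha}
&= \left\|V\rho^{\frac{1}{2}}V^{\dagger}(V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}V\rho^{\frac{1}{2}}V^{\dagger}\right\|_{\alpha} \tag{7.5.49}\\
&= \left\|\rho^{\frac{1}{2}}V^{\dagger}(V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}V\rho^{\frac{1}{2}}\right\|_{\alpha}, \tag{7.5.50}
\end{align*}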
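where the last equality follows from the isometric invariance of the function
∥·∥ 𝛼 . Now, let $\Pi := VV^{\dagger}$ be the projection onto the image of 𝑉, and let
$\hat{\Pi} := \mathbb{1} - \Pi$. Then, we write
\[ (V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}} = V(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}V^{\dagger} + \varepsilon^{\frac{1-\alpha}{\alpha}}\hat{\Pi}. \tag{7.5.52} \]
\begin{align*}
\left\|\rho^{\frac{1}{2}}V^{\dagger}(V\sigma V^{\dagger}+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}V\rho^{\frac{1}{2}}\right\|_{\alpha}
&= \left\|\rho^{\frac{1}{2}}V^{\dagger}\left(V(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}V^{\dagger} + \varepsilon^{\frac{1-\alpha}{\alpha}}\hat{\Pi}\right)V\rho^{\frac{1}{2}}\right\|_{\alpha} \tag{7.5.53}\\
&= \left\|\rho^{\frac{1}{2}}V^{\dagger}V(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}V^{\dagger}V\rho^{\frac{1}{2}} + \varepsilon^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}V^{\dagger}\hat{\Pi}V\rho^{\frac{1}{2}}\right\|_{\alpha} \tag{7.5.54}\\
&= \left\|\rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right\|_{\alpha}, \tag{7.5.55}
\end{align*}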
where the last equality follows from the fact that 𝑉 † Π̂𝑉 = 𝑉 †𝑉 − 𝑉 †𝑉𝑉 †𝑉 =
1 − 1 = 0. Therefore,
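\begin{align*}
\widetilde{D}_\alpha(V\rho V^{\dagger}\|V\sigma V^{\dagger}) &= \lim_{\varepsilon\to 0^+}\frac{\alpha}{\alpha-1}\log_2\left\|\rho^{\frac{1}{2}}(\sigma+\varepsilon\mathbb{1})^{\frac{1-\alpha}{\alpha}}\rho^{\frac{1}{2}}\right\|_{\alpha} \tag{7.5.56}\\
&= \widetilde{D}_\alpha(\rho\|\sigma),
\end{align*}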
as required.
2. Proof of monotonicity in 𝛼: We make use of the function $\widetilde{D}_\alpha(\rho\|\sigma;\tau)$ defined
in (7.5.7), which we can write as
Taking the trace on both sides of this equation, and using the definition of $\widetilde{Q}_\alpha$,
we conclude that
\[ \widetilde{Q}_\alpha(\rho_{XA}\|\sigma_{XA}) = \sum_{x\in\mathcal{X}} p(x)^{\alpha}q(x)^{1-\alpha}\,\widetilde{Q}_\alpha(\rho_A^{x}\|\sigma_A^{x}), \tag{7.5.76} \]
as required.
5. From the assumption that 𝜌1 ≤ 𝛾𝜌2 , we have that
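\[ \sigma^{\frac{1-\alpha}{2\alpha}}\rho_1\,\sigma^{\frac{1-\alpha}{2\alpha}} \leq \gamma\,\sigma^{\frac{1-\alpha}{2\alpha}}\rho_2\,\sigma^{\frac{1-\alpha}{2\alpha}}. \tag{7.5.77} \]
Then, using (2.2.158), we obtain
\[ \operatorname{Tr}\!\left[\left(\sigma^{\frac{1-\alpha}{2\alpha}}\rho_1\,\sigma^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}\right] \leq \gamma^{\alpha}\operatorname{Tr}\!\left[\left(\sigma^{\frac{1-\alpha}{2\alpha}}\rho_2\,\sigma^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}\right]. \tag{7.5.78} \]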
The result follows after applying the logarithm and dividing by 𝛼 − 1 on both
sides of this inequality.
6. This follows from the Araki–Lieb–Thirring inequalities, which we state here
without proof (see the Bibliographic Notes in Section 7.13): for positive
semi-definite operators 𝐴 and 𝐵 acting on a finite-dimensional Hilbert space,
and for 𝑞 ≥ 0, the following inequalities hold
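(a) $\operatorname{Tr}\!\left[\left(B^{\frac{1}{2}}AB^{\frac{1}{2}}\right)^{rq}\right] \geq \operatorname{Tr}\!\left[\left(B^{\frac{r}{2}}A^{r}B^{\frac{r}{2}}\right)^{q}\right]$ for all 𝑟 ∈ [0, 1].
(b) $\operatorname{Tr}\!\left[\left(B^{\frac{1}{2}}AB^{\frac{1}{2}}\right)^{rq}\right] \leq \operatorname{Tr}\!\left[\left(B^{\frac{r}{2}}A^{r}B^{\frac{r}{2}}\right)^{q}\right]$ for all 𝑟 ≥ 1.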
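For 𝛼 ∈ (0, 1), we make use of the first of these inequalities. In particular, we
set 𝑞 = 1, 𝑟 = 𝛼, 𝐴 = 𝜌 and $B = \sigma^{\frac{1-\alpha}{\alpha}}$. Then, letting $\gamma := \frac{1-\alpha}{2\alpha}$, we obtain
\[ \operatorname{Tr}\!\left[(\sigma^{\gamma}\rho\,\sigma^{\gamma})^{\alpha}\right] \geq \operatorname{Tr}\!\left[\sigma^{\alpha\gamma}\rho^{\alpha}\sigma^{\alpha\gamma}\right] = \operatorname{Tr}\!\left[\sigma^{\frac{1-\alpha}{2}}\rho^{\alpha}\sigma^{\frac{1-\alpha}{2}}\right] = \operatorname{Tr}[\rho^{\alpha}\sigma^{1-\alpha}], \tag{7.5.79} \]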
where the last equality holds by cyclicity of the trace. Since the logarithm
function is monotonically increasing, this inequality implies that
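\[ \log_2\operatorname{Tr}\!\left[\left(\sigma^{\frac{1-\alpha}{2\alpha}}\rho\,\sigma^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}\right] \geq \log_2\operatorname{Tr}[\rho^{\alpha}\sigma^{1-\alpha}]. \tag{7.5.80} \]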
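This inequality holds for all positive semi-definite operators 𝐴 and 𝐵, as well
as for 𝑞 > 0, 𝑟 ∈ (0, 1], 𝑎, 𝑏 ∈ (0, ∞], and for $\frac{1}{2rq} = \frac{1}{2q} + \frac{1}{a} + \frac{1}{b}$. Taking 𝑞 = 1,
𝑟 = 𝛼, 𝐴 = 𝜌, $B = \sigma^{\frac{1-\alpha}{\alpha}}$, $a = \frac{2}{1-\alpha}$, and $b = \frac{2\alpha}{(1-\alpha)^{2}}$, we obtain
\[ \operatorname{Tr}\!\left[\left(\sigma^{\frac{1-\alpha}{2\alpha}}\rho\,\sigma^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}\right] \leq \left(\operatorname{Tr}\!\left[\sigma^{\frac{1-\alpha}{2}}\rho^{\alpha}\sigma^{\frac{1-\alpha}{2}}\right]\right)^{\alpha}\left\|\rho^{\frac{1-\alpha}{2}}\right\|_{\frac{2}{1-\alpha}}^{2\alpha}\left\|\sigma^{\frac{(1-\alpha)^{2}}{2\alpha}}\right\|_{\frac{2\alpha}{(1-\alpha)^{2}}}^{2\alpha}. \tag{7.5.85} \]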
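For 𝜎, we obtain
\[ \left\|\sigma^{\frac{(1-\alpha)^{2}}{2\alpha}}\right\|_{\frac{2\alpha}{(1-\alpha)^{2}}}^{2\alpha} = (\operatorname{Tr}[\sigma])^{(1-\alpha)^{2}}. \tag{7.5.90} \]
Therefore,
\[ \operatorname{Tr}\!\left[\left(\sigma^{\frac{1-\alpha}{2\alpha}}\rho\,\sigma^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}\right] \leq \left(\operatorname{Tr}[\rho^{\alpha}\sigma^{1-\alpha}]\right)^{\alpha}\cdot(\operatorname{Tr}[\sigma])^{(1-\alpha)^{2}}. \tag{7.5.91} \]
Taking the logarithm of both sides and multiplying by $\frac{1}{\alpha-1}$, which is negative
for 𝛼 ∈ (0, 1), we obtain the inequality in (7.5.46). ■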
Like the quantum relative entropy and the Petz–Rényi relative entropy, the
sandwiched Rényi relative entropy is faithful, meaning that for all states 𝜌, 𝜎 and
all 𝛼 ∈ (0, 1) ∪ (1, ∞),
\[ \widetilde{D}_\alpha(\rho\|\sigma) = 0 \iff \rho = \sigma. \tag{7.5.98} \]
Proof: This proof follows steps very similar to those in the proof of the data-
processing inequality for the Petz–Rényi relative entropy (Theorem 7.24), with the
key difference being that in this case we make use of the fact that the sandwiched
Rényi relative entropy can be written as the optimization in (7.5.6).
From Stinespring’s theorem (Theorem 4.3), we know that the action of a channel
N on a linear operator 𝑋 can be written as
N(𝑋) = Tr𝐸 [𝑉 𝑋𝑉 † ], (7.5.100)
for some isometry 𝑉, where 𝐸 is an auxiliary system with dimension
$d_E \geq \operatorname{rank}(\Gamma^{\mathcal{N}})$. As stated in (7.5.39), $\widetilde{D}_\alpha$ is isometrically invariant. Therefore, it
suffices to prove the data-processing inequality for $\widetilde{D}_\alpha$ under partial trace; i.e., it
suffices to show that for every state 𝜌 𝐴𝐵 , every positive semi-definite operator 𝜎𝐴𝐵 ,
and all 𝛼 ∈ [1/2, 1) ∪ (1, ∞):
\[ \widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB}) \geq \widetilde{D}_\alpha(\rho_{A}\|\sigma_{A}). \tag{7.5.101} \]
We now proceed to prove this inequality. We prove it for 𝜌 𝐴𝐵 , and hence 𝜌 𝐴 ,
invertible, as well as for 𝜎𝐴𝐵 and 𝜎𝐴 invertible. The result follows in the general
case of 𝜌 𝐴𝐵 and/or 𝜌 𝐴 non-invertible, as well as 𝜎𝐴𝐵 and/or 𝜎𝐴 non-invertible, by
applying the result to the invertible operators (1 − 𝛿) 𝜌 𝐴𝐵 + 𝛿𝜋 𝐴𝐵 and 𝜎𝐴𝐵 + 𝜀 1 𝐴𝐵 ,
with 𝛿, 𝜀 > 0, and taking the limits 𝜀 → 0+ and 𝛿 → 0+ , since
\[ \widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB}) = \lim_{\varepsilon\to 0^+}\lim_{\delta\to 0^+}\widetilde{D}_\alpha((1-\delta)\rho_{AB}+\delta\pi_{AB}\,\|\,\sigma_{AB}+\varepsilon\mathbb{1}_{AB}), \tag{7.5.102} \]
which can be verified in a similar manner to the proof of (7.5.10) in Proposition 7.29.
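Let us start by defining the quantity $\widetilde{Q}_\alpha(\rho\|\sigma;\tau)$ as
\[ |\varphi^{\rho}\rangle := (\rho^{\frac{1}{2}}\otimes\mathbb{1})|\Gamma\rangle \tag{7.5.105} \]
so that
\[ \widetilde{D}_\alpha(\rho\|\sigma;\tau) = \frac{\alpha}{\alpha-1}\log_2\widetilde{Q}_\alpha(\rho\|\sigma;\tau), \tag{7.5.107} \]
where we recall the quantity $\widetilde{D}_\alpha(\rho\|\sigma;\tau)$ defined in (7.5.7). Now, to prove (7.5.101),
we show that for every positive definite state 𝜔 𝐴 , there exists a positive definite
state 𝜏𝐴𝐵 such that
\begin{align*}
\widetilde{Q}_\alpha(\rho_{AB}\|\sigma_{AB};\tau_{AB}) &\geq \widetilde{Q}_\alpha(\rho_{A}\|\sigma_{A};\omega_{A}), \quad \text{for } \alpha\in(1,\infty),\\
\widetilde{Q}_\alpha(\rho_{AB}\|\sigma_{AB};\tau_{AB}) &\leq \widetilde{Q}_\alpha(\rho_{A}\|\sigma_{A};\omega_{A}), \quad \text{for } \alpha\in[1/2,1). \tag{7.5.108}
\end{align*}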
With these two inequalities, along with (7.5.107) and (7.5.6), the result follows.
Consider that
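\begin{align*}
\widetilde{Q}_\alpha(\rho_{AB}\|\sigma_{AB};\tau_{AB}) &= \langle\varphi^{\rho_{AB}}|\,f(\tau_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})\,|\varphi^{\rho_{AB}}\rangle, \tag{7.5.109}\\
\widetilde{Q}_\alpha(\rho_{A}\|\sigma_{A};\omega_{A}) &= \langle\varphi^{\rho_{A}}|\,f(\omega_{A}^{-1}\otimes\sigma_{\hat{A}}^{\mathsf{T}})\,|\varphi^{\rho_{A}}\rangle, \tag{7.5.110}
\end{align*}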
Recall that
1
−1 1
for 𝛼 ∈ [1/2, 1), where to obtain the last inequality in each case we used the operator
Jensen inequality (Theorem 2.16), which is applicable since for 𝛼 ∈ (1, ∞) the
function 𝑓 in (7.5.111) is operator convex and for 𝛼 ∈ [1/2, 1) it is operator concave.
Now, recall that to conclude (7.5.101), we should perform an optimization of
$\widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB};\tau_{AB})$ over invertible states 𝜏𝐴𝐵 , as per the definition in (7.5.6), in order
to obtain $\widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB})$. Since we only require a lower bound on $\widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB})$,
we can obtain the lower bound in (7.5.101) on $\widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB})$ by simply picking a
particular state 𝜏𝐴𝐵 in the optimization in (7.5.6). Let us therefore take
\[ \tau_{AB} = \xi_{AB}(\omega_{A}) := \rho_{AB}^{\frac{1}{2}}\left(\rho_{A}^{-\frac{1}{2}}\omega_{A}\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{B}\right)\rho_{AB}^{\frac{1}{2}}, \tag{7.5.120} \]
where 𝜔 𝐴 is an arbitrary invertible state. Note that this choice of 𝜏𝐴𝐵 is indeed a
state because it is the result of applying the Petz recovery channel P 𝜌 𝐴𝐵 ,Tr𝐵 defined
in (4.6.30) to 𝜔 𝐴 . It is also invertible; in particular,
\[ \tau_{AB}^{-1} = [\xi_{AB}(\omega_{A})]^{-1} = \rho_{AB}^{-\frac{1}{2}}\left(\rho_{A}^{\frac{1}{2}}\omega_{A}^{-1}\rho_{A}^{\frac{1}{2}}\otimes\mathbb{1}_{B}\right)\rho_{AB}^{-\frac{1}{2}}. \tag{7.5.121} \]
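\begin{align*}
V^{\dagger}(\tau_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})V
&= \langle\Gamma|_{B\hat{B}}\,(\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{\hat{A}})\,\rho_{AB}^{\frac{1}{2}}\,(\tau_{AB}^{-1}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}})\,\rho_{AB}^{\frac{1}{2}}\,(\rho_{A}^{-\frac{1}{2}}\otimes\mathbb{1}_{\hat{A}})\,|\Gamma\rangle_{B\hat{B}} \tag{7.5.122}\\
&= \langle\Gamma|_{B\hat{B}}\left(\rho_{A}^{-\frac{1}{2}}\rho_{AB}^{\frac{1}{2}}\rho_{AB}^{-\frac{1}{2}}\left(\rho_{A}^{\frac{1}{2}}\omega_{A}^{-1}\rho_{A}^{\frac{1}{2}}\otimes\mathbb{1}_{B}\right)\rho_{AB}^{-\frac{1}{2}}\rho_{AB}^{\frac{1}{2}}\rho_{A}^{-\frac{1}{2}}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}}\right)|\Gamma\rangle_{B\hat{B}} \tag{7.5.123}\\
&= \langle\Gamma|_{B\hat{B}}\left(\omega_{A}^{-1}\otimes\mathbb{1}_{B}\otimes\sigma_{\hat{A}\hat{B}}^{\mathsf{T}}\right)|\Gamma\rangle_{B\hat{B}} \tag{7.5.124}\\
&= \omega_{A}^{-1}\otimes\langle\Gamma|_{B\hat{B}}\,\sigma_{\hat{A}\hat{B}}^{\mathsf{T}}\,|\Gamma\rangle_{B\hat{B}} \tag{7.5.125}\\
&= \omega_{A}^{-1}\otimes\sigma_{\hat{A}}^{\mathsf{T}}, \tag{7.5.126}
\end{align*}
where we have used the fact that $\langle\Gamma|_{B\hat{B}}\,\sigma_{\hat{A}\hat{B}}^{\mathsf{T}}\,|\Gamma\rangle_{B\hat{B}} = \operatorname{Tr}_{\hat{B}}[\sigma_{\hat{A}\hat{B}}^{\mathsf{T}}] = \sigma_{\hat{A}}^{\mathsf{T}}$, the last
equality due to the fact that the transpose is taken on a product basis for H 𝐴ˆ ⊗ H𝐵ˆ .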
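Therefore, for 𝛼 ∈ (1, ∞), taking the logarithm on both sides of (7.5.118) and
using the state in (7.5.120), we find that
\begin{align*}
\log_2\widetilde{Q}_\alpha(\rho_{AB}\|\sigma_{AB};\xi_{AB}(\omega_{A})) &\geq \log_2\langle\varphi^{\rho_{A}}|\,f(\omega_{A}^{-1}\otimes\sigma_{\hat{A}}^{\mathsf{T}})\,|\varphi^{\rho_{A}}\rangle \tag{7.5.127}\\
&= \log_2\widetilde{Q}_\alpha(\rho_{A}\|\sigma_{A};\omega_{A}). \tag{7.5.128}
\end{align*}
Multiplying both sides of this inequality by $\frac{\alpha}{\alpha-1}$, we obtain
\begin{align*}
\widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB}) &= \sup_{\substack{\tau_{AB}>0,\\ \operatorname{Tr}[\tau_{AB}]=1}}\widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB};\tau_{AB}) \tag{7.5.129}\\
&\geq \widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB};\xi_{AB}(\omega_{A})) \tag{7.5.130}\\
&\geq \widetilde{D}_\alpha(\rho_{A}\|\sigma_{A};\omega_{A}) \tag{7.5.131}
\end{align*}
for all invertible states 𝜔 𝐴 . Finally, taking the supremum over the set {𝜔 𝐴 : 𝜔 𝐴 >
0, Tr[𝜔 𝐴 ] = 1}, we conclude that
\[ \widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB}) \geq \widetilde{D}_\alpha(\rho_{A}\|\sigma_{A}), \quad \text{for } \alpha\in(1,\infty). \tag{7.5.132} \]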
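For 𝛼 ∈ [1/2, 1), taking the logarithm on both sides of (7.5.119) and using the
state in (7.5.120), we conclude that
\begin{align*}
\log_2\widetilde{Q}_\alpha(\rho_{AB}\|\sigma_{AB};\xi_{AB}(\omega_{A})) &\leq \log_2\langle\varphi^{\rho_{A}}|\,f(\omega_{A}^{-1}\otimes\sigma_{\hat{A}}^{\mathsf{T}})\,|\varphi^{\rho_{A}}\rangle \tag{7.5.133}\\
&= \log_2\widetilde{Q}_\alpha(\rho_{A}\|\sigma_{A};\omega_{A}). \tag{7.5.134}
\end{align*}
Multiplying both sides of this inequality by $\frac{\alpha}{\alpha-1}$, which is negative in this case, reverses the inequality, so
that
\[ \frac{\alpha}{\alpha-1}\log_2\widetilde{Q}_\alpha(\rho_{AB}\|\sigma_{AB};\xi_{AB}(\omega_{A})) \geq \frac{\alpha}{\alpha-1}\log_2\widetilde{Q}_\alpha(\rho_{A}\|\sigma_{A};\omega_{A}). \tag{7.5.135} \]
We thus obtain
\begin{align*}
\widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB}) &= \sup_{\substack{\tau_{AB}>0,\\ \operatorname{Tr}[\tau_{AB}]=1}}\widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB};\tau_{AB}) \tag{7.5.136}\\
&\geq \widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB};\xi_{AB}(\omega_{A})) \tag{7.5.137}\\
&\geq \widetilde{D}_\alpha(\rho_{A}\|\sigma_{A};\omega_{A}) \tag{7.5.138}
\end{align*}
for all invertible states 𝜔 𝐴 . Finally, taking the supremum over the set {𝜔 𝐴 : 𝜔 𝐴 >
0, Tr[𝜔 𝐴 ] = 1}, we conclude that
\[ \widetilde{D}_\alpha(\rho_{AB}\|\sigma_{AB}) \geq \widetilde{D}_\alpha(\rho_{A}\|\sigma_{A}), \quad \text{for } \alpha\in[1/2,1). \tag{7.5.139} \]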
for 𝛼 ∈ [1/2, 1) ∪ (1, ∞), all states 𝜌, positive semi-definite operators 𝜎, and all
channels N. ■
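As with the Petz–Rényi case, the partial-trace form (7.5.101) of the data-processing inequality can be spot-checked numerically for α ∈ [1/2, 1) ∪ (1, ∞). The sketch below uses one pair of random full-rank two-qubit states; the helper names and the sampled α values are assumptions of this example, and a single instance is only a sanity check.

```python
import numpy as np

def psd_power(A, p, tol=1e-12):
    """Power of a positive semi-definite matrix, taken on its support."""
    w, v = np.linalg.eigh((A + A.conj().T) / 2)
    wp = np.zeros_like(w)
    keep = w > tol
    wp[keep] = w[keep] ** p
    return (v * wp) @ v.conj().T

def sandwiched_renyi(rho, sigma, alpha):
    """(1/(alpha-1)) log2 Tr[(sigma^g rho sigma^g)^alpha] with g = (1-alpha)/(2 alpha)."""
    s = psd_power(sigma, (1 - alpha) / (2 * alpha))
    q = np.trace(psd_power(s @ rho @ s, alpha)).real
    return np.log2(q) / (alpha - 1)

def full_rank_state(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return 0.9 * rho / np.trace(rho).real + 0.1 * np.eye(d) / d

def trace_out_B(op, dA, dB):
    """Partial trace over the second tensor factor of an operator on A (x) B."""
    return np.trace(op.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

rng = np.random.default_rng(6)
dA = dB = 2
rho_AB = full_rank_state(dA * dB, rng)
sigma_AB = full_rank_state(dA * dB, rng)
rho_A = trace_out_B(rho_AB, dA, dB)
sigma_A = trace_out_B(sigma_AB, dA, dB)

# Differences between the joint and the reduced values; each should be >= 0.
dpi_gaps = {a: sandwiched_renyi(rho_AB, sigma_AB, a) - sandwiched_renyi(rho_A, sigma_A, a)
            for a in (0.5, 0.75, 1.5, 3.0, 10.0)}
```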
With the data-processing inequality for the sandwiched Rényi relative entropy in
hand, it is now straightforward to prove some of the following additional properties.
Proof:
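1. By the data-processing inequality for $\widetilde{D}_\alpha$ with respect to the trace channel Tr,
and letting 𝑥 = Tr(𝜌) = 1 and 𝑦 = Tr(𝜎), we find that
\begin{align*}
\widetilde{D}_\alpha(\rho\|\sigma) \geq \widetilde{D}_\alpha(x\|y) &= \frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(y^{\frac{1-\alpha}{2\alpha}}\,x\,y^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}\right] \tag{7.5.142}\\
&= \frac{1}{\alpha-1}\log_2(y^{1-\alpha}) \tag{7.5.143}\\
&= \frac{1-\alpha}{\alpha-1}\log_2 y \tag{7.5.144}\\
&= -\log_2 y \tag{7.5.145}\\
&\geq 0, \tag{7.5.146}
\end{align*}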
where the last line follows from the assumption that 𝑦 = Tr(𝜎) ≤ 1.
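2. Proof of faithfulness: If 𝜌 = 𝜎, then the following equalities hold for all
𝛼 ∈ [1/2, 1) ∪ (1, ∞):
\begin{align*}
\widetilde{D}_\alpha(\rho\|\rho) &= \frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(\rho^{\frac{1-\alpha}{2\alpha}}\rho\,\rho^{\frac{1-\alpha}{2\alpha}}\right)^{\alpha}\right] \tag{7.5.147}\\
&= \frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\rho^{\frac{1-\alpha}{2}}\rho^{\alpha}\rho^{\frac{1-\alpha}{2}}\right] \tag{7.5.148}\\
&= \frac{1}{\alpha-1}\log_2\operatorname{Tr}[\rho^{1-\alpha}\rho^{\alpha}] \tag{7.5.149}\\
&= \frac{1}{\alpha-1}\log_2\operatorname{Tr}(\rho) \tag{7.5.150}\\
&= 0. \tag{7.5.151}
\end{align*}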
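Next, suppose that 𝛼 ∈ [1/2, 1) ∪ (1, ∞) and $\widetilde{D}_\alpha(\rho\|\sigma) = 0$. From the above, we
conclude that $\widetilde{D}_\alpha(\operatorname{Tr}(\rho)\|\operatorname{Tr}(\sigma)) = -\log_2 y \geq 0$. From the fact that log2 𝑦 = 0
if and only if 𝑦 = 1, we conclude that $\widetilde{D}_\alpha(\rho\|\sigma) = 0$ implies Tr(𝜎) = Tr(𝜌) = 1,
so that 𝜎 is a density operator. Then, for every measurement channel M,
\[ \widetilde{D}_\alpha(\mathcal{M}(\rho)\|\mathcal{M}(\sigma)) \leq \widetilde{D}_\alpha(\rho\|\sigma) = 0. \tag{7.5.152} \]
\begin{align*}
\widetilde{D}_\alpha(\mathcal{M}(\rho)\|\mathcal{M}(\sigma)) &\geq \widetilde{D}_\alpha(\operatorname{Tr}(\mathcal{M}(\rho))\|\operatorname{Tr}(\mathcal{M}(\sigma))) \tag{7.5.153}\\
&= \widetilde{D}_\alpha(\operatorname{Tr}(\rho)\|\operatorname{Tr}(\sigma)) \tag{7.5.154}\\
&= 0, \tag{7.5.155}
\end{align*}
\begin{align*}
\hat{\rho} &:= |0\rangle\langle 0|\otimes\rho, \tag{7.5.156}\\
\hat{\sigma} &:= |0\rangle\langle 0|\otimes\rho + |1\rangle\langle 1|\otimes(\sigma-\rho). \tag{7.5.157}
\end{align*}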
where the inequality follows from data processing with respect to partial trace
over the classical register.
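\[ \widetilde{D}_\alpha(\rho\|\sigma') = \widetilde{D}_\alpha(\operatorname{Tr}_1(\hat{\rho})\|\operatorname{Tr}_1(\hat{\sigma})) \leq \widetilde{D}_\alpha(\hat{\rho}\|\hat{\sigma}) = \widetilde{D}_\alpha(\rho\|\sigma), \tag{7.5.161} \]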
as required. ■
Let us now prove the faithfulness of both the Petz–Rényi and sandwiched Rényi
relative entropies for the full range of parameters for which they are defined.
Proof: Note that the equality $\widetilde{D}_\alpha(\rho\|\rho) = 0$ for all 𝛼 ∈ (0, 1) ∪ (1, ∞) is immediate
from the definition (see also (7.5.147)–(7.5.151)). The converse statement has
already been established in property 2. of Proposition 7.35 for 𝛼 ∈ [1/2, 1) ∪ (1, ∞).
Before getting to the range 𝛼 ∈ (0, 1/2), let us consider the Petz–Rényi relative
entropy.
It is immediately clear from the definition that 𝐷 𝛼 (𝜌∥ 𝜌) = 0 for all 𝛼 ∈
(0, 1)∪(1, ∞). For 𝛼 ∈ [0, 1)∪(1, 2], the converse follows from the data-processing
inequality, which holds for this parameter range as shown in Theorem 7.24, as well
as from arguments analogous to those in the proof of property 2. in Proposition 7.35.
For 𝛼 ∈ (2, ∞), we use the fact that $D_\alpha(\rho\Vert\sigma) \geq \widetilde{D}_\alpha(\rho\Vert\sigma)$ for all 𝜌, 𝜎, as shown
in Proposition 7.31. In particular, if $D_\alpha(\rho\Vert\sigma) = 0$, then $\widetilde{D}_\alpha(\rho\Vert\sigma) \leq 0$. However,
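The data-processing inequality for the sandwiched Rényi relative entropy can
be written using the sandwiched Rényi relative quasi-entropy $\widetilde{Q}_\alpha$ as
\begin{equation}
\frac{1}{\alpha-1}\log_2\widetilde{Q}_\alpha(\rho\Vert\sigma) \geq \frac{1}{\alpha-1}\log_2\widetilde{Q}_\alpha(\mathcal{N}(\rho)\Vert\mathcal{N}(\sigma)). \tag{7.5.165}
\end{equation}
Then, since 𝛼 − 1 is positive for 𝛼 ∈ (1, ∞) and negative for 𝛼 ∈ [1/2, 1), we can use the monotonicity of the
function log2 to obtain
\begin{align}
\widetilde{Q}_\alpha(\rho\Vert\sigma) &\geq \widetilde{Q}_\alpha(\mathcal{N}(\rho)\Vert\mathcal{N}(\sigma)), \quad \text{for } \alpha \in (1, \infty), \tag{7.5.166}\\
\widetilde{Q}_\alpha(\rho\Vert\sigma) &\leq \widetilde{Q}_\alpha(\mathcal{N}(\rho)\Vert\mathcal{N}(\sigma)), \quad \text{for } \alpha \in [1/2, 1). \tag{7.5.167}
\end{align}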
Just as with the Petz–Rényi relative entropy, we can use this to prove the joint
convexity of the sandwiched Rényi relative entropy.
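Then, since − log2 is a convex function, and using the definition of $\widetilde{D}_\alpha$ in terms of
$\widetilde{Q}_\alpha$, we conclude that
\begin{align}
\widetilde{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho_A^x \,\middle\Vert\, \sum_{x\in\mathcal{X}} p(x)\sigma_A^x\right) &\leq \sum_{x\in\mathcal{X}} p(x)\,\frac{1}{\alpha-1}\log_2\widetilde{Q}_\alpha(\rho_A^x\Vert\sigma_A^x) \tag{7.5.172}\\
&= \sum_{x\in\mathcal{X}} p(x)\,\widetilde{D}_\alpha(\rho_A^x\Vert\sigma_A^x), \tag{7.5.173}
\end{align}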
as required. ■
Although the sandwiched Rényi relative entropy is not jointly convex for
𝛼 ∈ (1, ∞), it is jointly quasi-convex, in the sense that
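\begin{equation}
\widetilde{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho_A^x \,\middle\Vert\, \sum_{x\in\mathcal{X}} p(x)\sigma_A^x\right) \leq \max_{x\in\mathcal{X}} \widetilde{D}_\alpha(\rho_A^x\Vert\sigma_A^x), \tag{7.5.174}
\end{equation}
for every finite alphabet X, probability distribution 𝑝 : X → [0, 1], set $\{\rho_A^x\}_{x\in\mathcal{X}}$ of
states, and set $\{\sigma_A^x\}_{x\in\mathcal{X}}$ of positive semi-definite operators. Indeed, from (7.5.168),
we immediately obtain
\begin{equation}
\widetilde{Q}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho_A^x \,\middle\Vert\, \sum_{x\in\mathcal{X}} p(x)\sigma_A^x\right) \leq \max_{x\in\mathcal{X}} \widetilde{Q}_\alpha(\rho_A^x\Vert\sigma_A^x). \tag{7.5.175}
\end{equation}
Taking the logarithm and multiplying by $\frac{1}{\alpha-1}$ on both sides of this inequality leads
to (7.5.174).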
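The convexity statements for the sandwiched Rényi relative entropy can be probed on a small random ensemble. The sketch below assumes NumPy and positive definite operators; the ensemble, the seed, and the function names (`random_pd`, `compare`) are illustrative choices rather than anything fixed by the text. It compares the value on the mixture with the average over components (relevant for 𝛼 ∈ [1/2, 1)) and with the maximum over components (relevant for 𝛼 ∈ (1, ∞)).

```python
import numpy as np

def mpow(a, p):
    """Real power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def random_pd(d, rng):
    """Random, well-conditioned positive definite matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T / d + np.eye(d)

def sandwiched_renyi(rho, sigma, alpha):
    s = mpow(sigma, (1 - alpha) / (2 * alpha))
    inner = s @ rho @ s
    inner = (inner + inner.conj().T) / 2
    return np.log2(np.trace(mpow(inner, alpha)).real) / (alpha - 1)

rng = np.random.default_rng(1)
p = np.array([0.2, 0.5, 0.3])
rhos = []
for _ in p:
    r = random_pd(2, rng)
    rhos.append(r / np.trace(r).real)          # normalize to a state
sigmas = [random_pd(2, rng) for _ in p]        # positive definite, not normalized
rho_mix = sum(px * r for px, r in zip(p, rhos))
sigma_mix = sum(px * s for px, s in zip(p, sigmas))

def compare(alpha):
    """Return (value on mixture, p-average of components, maximum over components)."""
    mixed = sandwiched_renyi(rho_mix, sigma_mix, alpha)
    parts = [sandwiched_renyi(r, s, alpha) for r, s in zip(rhos, sigmas)]
    return mixed, float(np.dot(p, parts)), max(parts)

for alpha in (0.5, 0.8, 1.5, 3.0):
    print(alpha, compare(alpha))
```

For 𝛼 > 1 only the comparison with the maximum is guaranteed by (7.5.174); the average can be smaller than the mixture value in general.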
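where $\sigma_\varepsilon \coloneqq \sigma + \varepsilon\mathbb{1}$ and
\begin{equation}
G_\alpha(\sigma_\varepsilon, \rho) \coloneqq \sigma_\varepsilon^{\frac{1}{2}}\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\sigma_\varepsilon^{\frac{1}{2}} \tag{7.6.2}
\end{equation}
\begin{equation}
\widehat{D}_\alpha(\rho\Vert\sigma) \coloneqq \frac{1}{\alpha-1}\log_2\widehat{Q}_\alpha(\rho\Vert\sigma). \tag{7.6.3}
\end{equation}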
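Remark: In general, the weighted operator geometric mean of two positive definite operators
𝑋 and 𝑌 is defined as
\begin{equation}
G_\beta(X, Y) \coloneqq X^{\frac{1}{2}}\left(X^{-\frac{1}{2}} Y X^{-\frac{1}{2}}\right)^{\beta} X^{\frac{1}{2}}, \tag{7.6.4}
\end{equation}
where 𝛽 ∈ R is the weight parameter. We recover the standard operator geometric mean for
𝛽 = 1/2.
An important property of the weighted operator geometric mean is that
\begin{equation}
G_\beta(X, Y) = G_{1-\beta}(Y, X) \tag{7.6.5}
\end{equation}
for all positive definite 𝑋, 𝑌, and all 𝛽 ∈ R. To see this, observe that
\begin{align}
G_{1-\beta}(Y, X) &= Y^{\frac{1}{2}}\left(Y^{-\frac{1}{2}} X Y^{-\frac{1}{2}}\right)^{1-\beta} Y^{\frac{1}{2}} \tag{7.6.6}\\
&= Y^{\frac{1}{2}}\left(Y^{-\frac{1}{2}} X Y^{-\frac{1}{2}}\right)\left(Y^{-\frac{1}{2}} X Y^{-\frac{1}{2}}\right)^{-\beta} Y^{\frac{1}{2}} \tag{7.6.7}\\
&= X^{\frac{1}{2}}\left(X^{\frac{1}{2}} Y^{-\frac{1}{2}}\right)\left(Y^{-\frac{1}{2}} X^{\frac{1}{2}} X^{\frac{1}{2}} Y^{-\frac{1}{2}}\right)^{-\beta} Y^{\frac{1}{2}}. \tag{7.6.8}
\end{align}
Now we apply Lemma 2.5. Specifically, we set $L = X^{\frac{1}{2}} Y^{-\frac{1}{2}}$ and $f(x) = x^{-\beta}$ therein to conclude
that
\begin{align}
G_{1-\beta}(Y, X) &= X^{\frac{1}{2}}\left(X^{\frac{1}{2}} Y^{-\frac{1}{2}} Y^{-\frac{1}{2}} X^{\frac{1}{2}}\right)^{-\beta} X^{\frac{1}{2}} Y^{-\frac{1}{2}} Y^{\frac{1}{2}} \tag{7.6.9}\\
&= X^{\frac{1}{2}}\left(X^{-\frac{1}{2}} Y X^{-\frac{1}{2}}\right)^{\beta} X^{\frac{1}{2}} \tag{7.6.10}\\
&= G_\beta(X, Y). \tag{7.6.11}
\end{align}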
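The symmetry property derived in (7.6.6)–(7.6.11) is easy to test numerically. The following sketch assumes NumPy; `geometric_mean` implements (7.6.4) for positive definite inputs, and the random test matrices and the chosen values of 𝛽 are illustrative only.

```python
import numpy as np

def mpow(a, p):
    """Real power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def random_pd(d, rng):
    """Random, well-conditioned positive definite matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T / d + np.eye(d)

def geometric_mean(x, y, beta):
    """Weighted operator geometric mean X^{1/2} (X^{-1/2} Y X^{-1/2})^beta X^{1/2}."""
    xh, xmh = mpow(x, 0.5), mpow(x, -0.5)
    inner = xmh @ y @ xmh
    inner = (inner + inner.conj().T) / 2  # remove rounding asymmetry
    return xh @ mpow(inner, beta) @ xh

rng = np.random.default_rng(2)
X, Y = random_pd(3, rng), random_pd(3, rng)

for beta in (-0.4, 0.3, 0.5, 1.7):
    gap = np.linalg.norm(geometric_mean(X, Y, beta) - geometric_mean(Y, X, 1 - beta))
    print(beta, gap)  # gap should be at the level of rounding error
```

At 𝛽 = 1/2 the check reduces to the familiar symmetry of the standard operator geometric mean in its two arguments.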
Definition 7.38 of the geometric Rényi relative entropy involves a limit, which
has to do with the possibility that 𝜎 might not be invertible (i.e., it might not be
positive definite). Recall that the same situation arises for the Petz– and sandwiched
Rényi relative entropies, which leads to expressions for them in terms of a limit
in Propositions 7.21 and 7.29, respectively. For these two quantities, the limits
evaluate to a finite value with an explicit expression under the condition 𝛼 ∈ (0, 1)
and Tr[𝜌𝜎] ≠ 0, or 𝛼 ∈ (1, ∞) and supp(𝜌) ⊆ supp(𝜎). For the geometric Rényi
relative entropy, however, there are several cases for which the limit in (7.6.1) is
finite and has an explicit expression. The following proposition outlines some of
the simpler cases in which 𝜎 is positive definite:
Proposition 7.39
Let 𝜌 be a state, and let 𝜎 be a positive definite operator. Then,
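\begin{equation}
\widehat{Q}_\alpha(\rho\Vert\sigma) = \operatorname{Tr}\!\left[\sigma\left(\sigma^{-\frac{1}{2}}\rho\,\sigma^{-\frac{1}{2}}\right)^{\alpha}\right] = \operatorname{Tr}[G_\alpha(\sigma, \rho)] \tag{7.6.12}
\end{equation}
for all 𝛼 ∈ (0, 1) ∪ (1, ∞). If 𝜌 is a positive definite state and 𝜎 a positive
definite operator, then
\begin{align}
\widehat{Q}_\alpha(\rho\Vert\sigma) &= \operatorname{Tr}\!\left[\rho\left(\rho^{-\frac{1}{2}}\sigma\rho^{-\frac{1}{2}}\right)^{1-\alpha}\right] = \operatorname{Tr}[G_{1-\alpha}(\rho, \sigma)] \tag{7.6.13}\\
&= \operatorname{Tr}\!\left[\rho\left(\rho^{\frac{1}{2}}\sigma^{-1}\rho^{\frac{1}{2}}\right)^{\alpha-1}\right]. \tag{7.6.14}
\end{align}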
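Proof. If 𝜎 is positive definite, then the support of 𝜎 is the entire Hilbert space,
and so the limit 𝜀 → 0+ in (7.6.1) simply evaluates to $\widehat{Q}_\alpha(\rho\Vert\sigma) = \operatorname{Tr}[G_\alpha(\sigma, \rho)]$
for all 𝛼 ∈ (0, 1) ∪ (1, ∞).
If 𝜌 is also positive definite, then by invoking the equality in (7.6.5), we conclude
that $\operatorname{Tr}[G_\alpha(\sigma, \rho)] = \operatorname{Tr}[G_{1-\alpha}(\rho, \sigma)]$ for all 𝛼 ∈ (0, 1) ∪ (1, ∞). Furthermore,
since both 𝜌 and 𝜎 are positive definite, the following equality holds:
\begin{equation}
\left(\rho^{-\frac{1}{2}}\sigma\rho^{-\frac{1}{2}}\right)^{1-\alpha} = \left(\rho^{\frac{1}{2}}\sigma^{-1}\rho^{\frac{1}{2}}\right)^{\alpha-1}. \tag{7.6.15}
\end{equation}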
We now provide explicit expressions for the geometric Rényi relative quasi-entropy
$\widehat{Q}_\alpha(\rho\Vert\sigma)$ that are consistent with the limit-based definition in (7.6.1)
whenever 𝜌 and/or 𝜎 are not positive definite. The expressions given in (7.6.16)
below cover all possible values of 𝛼 ∈ (0, 1) ∪ (1, ∞) and support conditions.
Additional expressions are given in (7.6.19).
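where
\begin{align}
\tilde{\rho} &\coloneqq \rho_{0,0} - \rho_{0,1}\rho_{1,1}^{-1}\rho_{0,1}^{\dagger}, \qquad \rho = \begin{pmatrix} \rho_{0,0} & \rho_{0,1} \\ \rho_{0,1}^{\dagger} & \rho_{1,1} \end{pmatrix}, \tag{7.6.17}\\
\rho_{0,0} &\coloneqq \Pi_\sigma \rho\, \Pi_\sigma, \qquad \rho_{0,1} \coloneqq \Pi_\sigma \rho\, \Pi_\sigma^{\perp}, \qquad \rho_{1,1} \coloneqq \Pi_\sigma^{\perp} \rho\, \Pi_\sigma^{\perp}, \tag{7.6.18}
\end{align}
$\Pi_\sigma$ is the projection onto the support of 𝜎, $\Pi_\sigma^{\perp}$ is the projection onto the kernel
of 𝜎, and the inverses $\sigma^{-\frac{1}{2}}$ and $\rho_{1,1}^{-1}$ are taken on the supports of 𝜎 and $\rho_{1,1}$,
respectively. We also have the alternative expressions below for certain cases:
\begin{equation}
\widehat{Q}_\alpha(\rho\Vert\sigma) =
\begin{cases}
\operatorname{Tr}\!\left[\rho\left(\rho^{-\frac{1}{2}}\sigma\rho^{-\frac{1}{2}}\right)^{1-\alpha}\right] & \text{if } \alpha \in (0, 1) \text{ and } \operatorname{supp}(\sigma) \subseteq \operatorname{supp}(\rho), \\[1ex]
\operatorname{Tr}\!\left[\rho\left(\rho^{\frac{1}{2}}\sigma^{-1}\rho^{\frac{1}{2}}\right)^{\alpha-1}\right] & \text{if } \alpha \in (1, \infty) \text{ and } \operatorname{supp}(\rho) \subseteq \operatorname{supp}(\sigma),
\end{cases} \tag{7.6.19}
\end{equation}
where the inverses $\rho^{-\frac{1}{2}}$ and $\sigma^{-1}$ are taken on the supports of 𝜌 and 𝜎, respectively.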
Proof: The proof is similar in spirit to the proofs of Propositions 7.21 and 7.29, but
it is more complicated than these previous proofs. We provide it in Section 7.6.2. ■
Observe that when supp(𝜌) ⊆ supp(𝜎) and 𝛼 ∈ (0, 1), the expression
$\operatorname{Tr}[\sigma(\sigma^{-1/2}\rho\,\sigma^{-1/2})^{\alpha}]$ is actually a special case of $\operatorname{Tr}[\sigma(\sigma^{-1/2}\tilde{\rho}\,\sigma^{-1/2})^{\alpha}]$, because
the operators $\rho_{0,1}$ and $\rho_{1,1}$ are both equal to zero in this case, so that $\Pi_\sigma\rho = \rho\,\Pi_\sigma = \rho$
and $\tilde{\rho} = \rho_{0,0}$.
The main intuition behind the first expression in (7.6.16) and those in (7.6.19)
is as follows. If 𝜌 and 𝜎 are positive definite, then the following equalities hold
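\begin{align}
\operatorname{Tr}\!\left[\sigma\left(\sigma^{-\frac{1}{2}}\rho\,\sigma^{-\frac{1}{2}}\right)^{\alpha}\right] &= \operatorname{Tr}\!\left[\rho\left(\rho^{-\frac{1}{2}}\sigma\rho^{-\frac{1}{2}}\right)^{1-\alpha}\right] \tag{7.6.20}\\
&= \operatorname{Tr}\!\left[\rho\left(\rho^{\frac{1}{2}}\sigma^{-1}\rho^{\frac{1}{2}}\right)^{\alpha-1}\right], \tag{7.6.21}
\end{align}
where
\begin{equation}
\rho_\delta \coloneqq (1 - \delta)\rho + \delta\pi, \tag{7.6.24}
\end{equation}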
and 𝜋 is the maximally mixed state. This holds because the expression for the
geometric Rényi relative quasi-entropy in Definition 7.38 does not involve an
inverse of the state 𝜌.
As it turns out, the order of the limits in (7.6.22) does not matter for 𝛼 ∈ (0, 1):
Now, because both 𝜎𝜀 and 𝜌 𝛿 are positive definite for 𝜀, 𝛿 > 0, we can use the
property in (7.6.5), along with Lemma 7.41, to obtain the following for 𝛼 ∈ (0, 1):
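\begin{align}
\widehat{Q}_\alpha(\rho\Vert\sigma) &= \lim_{\delta\to 0^+}\lim_{\varepsilon\to 0^+} \operatorname{Tr}[G_{1-\alpha}(\rho_\delta, \sigma_\varepsilon)] \tag{7.6.28}\\
&= \lim_{\delta\to 0^+}\lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\rho_\delta\left(\rho_\delta^{-\frac{1}{2}}\sigma_\varepsilon\rho_\delta^{-\frac{1}{2}}\right)^{1-\alpha}\right] \tag{7.6.29}\\
&= \lim_{\delta\to 0^+} \operatorname{Tr}\!\left[\rho_\delta\left(\rho_\delta^{-\frac{1}{2}}\sigma\rho_\delta^{-\frac{1}{2}}\right)^{1-\alpha}\right], \tag{7.6.30}
\end{align}
where the last equality holds for the analogous reason that (7.6.22) holds, namely,
that the inverse of 𝜎 is not involved. We are now in a situation that looks like
the expression in (7.6.1), except that the roles of 𝜌 and 𝜎 are reversed and 𝛼
is substituted with 1 − 𝛼. Then, in the limit 𝛿 → 0+, if the support condition
supp(𝜎) ⊆ supp(𝜌) holds, the expression converges to $\operatorname{Tr}[\rho(\rho^{-1/2}\sigma\rho^{-1/2})^{1-\alpha}]$.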
It is worthwhile to consider the special case of 𝛼 = 2. In this case, the geometric
Rényi relative quasi-entropy collapses to the Petz–Rényi relative quasi-entropy
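when supp(𝜌) ⊆ supp(𝜎):
\begin{align}
\widehat{Q}_2(\rho\Vert\sigma) &= \lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{2}\right] \tag{7.6.31}\\
&= \lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\,\sigma_\varepsilon^{-\frac{1}{2}}\rho\,\sigma_\varepsilon^{-\frac{1}{2}}\sigma_\varepsilon^{-\frac{1}{2}}\rho\,\sigma_\varepsilon^{-\frac{1}{2}}\right] \tag{7.6.32}
\end{align}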
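The 𝛼 = 2 collapse can be checked directly in the positive definite case, where no limit is needed. The sketch below assumes NumPy and uses the Petz–Rényi relative quasi-entropy at 𝛼 = 2 in the form Tr[𝜌²𝜎⁻¹]; the helper names and the random inputs are illustrative.

```python
import numpy as np

def mpow(a, p):
    """Real power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def random_pd(d, rng):
    """Random, well-conditioned positive definite matrix (illustrative helper)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T / d + np.eye(d)

def geometric_quasi(rho, sigma, alpha):
    """Tr[sigma (sigma^{-1/2} rho sigma^{-1/2})^alpha] for positive definite sigma."""
    smh = mpow(sigma, -0.5)
    inner = smh @ rho @ smh
    inner = (inner + inner.conj().T) / 2
    return np.trace(sigma @ mpow(inner, alpha)).real

def petz_quasi_2(rho, sigma):
    """Petz-Renyi relative quasi-entropy at alpha = 2: Tr[rho^2 sigma^{-1}]."""
    return np.trace(rho @ rho @ np.linalg.inv(sigma)).real

rng = np.random.default_rng(3)
rho = random_pd(3, rng)
rho = rho / np.trace(rho).real
sigma = random_pd(3, rng)

print(geometric_quasi(rho, sigma, 2.0), petz_quasi_2(rho, sigma))
```

The agreement is specific to 𝛼 = 2; for other values of 𝛼 the two quasi-entropies differ in general when 𝜌 and 𝜎 do not commute.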
If the state 𝜌 is pure, then the geometric Rényi relative entropy simplifies as
follows, such that it is independent of 𝛼:
where 𝜇 𝑦 are the non-negative eigenvalues and 𝑄 𝑦 are the eigenprojections. In this
decomposition, we are including values of 𝜇 𝑦 for which 𝜇 𝑦 = 0. Then it follows
that
\begin{equation}
\sigma_\varepsilon = \sigma + \varepsilon\mathbb{1} = \sum_{y} (\mu_y + \varepsilon)\, Q_y, \tag{7.6.58}
\end{equation}
where 𝑦 0 is the value of 𝑦 for which 𝜇 𝑦 = 0 (if no such value of 𝑦 exists, then 𝑄 𝑦0
is equal to the zero operator). Thus, if ⟨𝜓|𝑄 𝑦0 |𝜓⟩ ≠ 0 (equivalent to |𝜓⟩ being
outside the support of 𝜎), then it follows that
𝜎𝜀 = |𝜙⟩⟨𝜙| + 𝜀 1 (7.6.65)
so that
and then
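\begin{align}
\langle\psi|\sigma_\varepsilon^{-1}|\psi\rangle &= \langle\psi|\left[\left((N + \varepsilon)^{-1} - \varepsilon^{-1}\right)|\phi'\rangle\!\langle\phi'| + \varepsilon^{-1}\mathbb{1}\right]|\psi\rangle \tag{7.6.70}\\
&= \left((N + \varepsilon)^{-1} - \varepsilon^{-1}\right)|\langle\psi|\phi'\rangle|^2 + \varepsilon^{-1} \tag{7.6.71}\\
&= \frac{|\langle\psi|\phi'\rangle|^2}{N + \varepsilon} + \frac{1 - |\langle\psi|\phi'\rangle|^2}{\varepsilon}. \tag{7.6.72}
\end{align}
Note that we always have $|\langle\psi|\phi'\rangle|^2 \in [0, 1]$ because |𝜓⟩ and |𝜙′⟩ are unit vectors.
In the case that $|\langle\psi|\phi'\rangle|^2 \in [0, 1)$, we find that
\begin{align}
\lim_{\varepsilon\to 0^+} \log_2\langle\psi|\sigma_\varepsilon^{-1}|\psi\rangle &= \lim_{\varepsilon\to 0^+} \log_2\!\left(\frac{|\langle\psi|\phi'\rangle|^2}{N + \varepsilon} + \frac{1 - |\langle\psi|\phi'\rangle|^2}{\varepsilon}\right) \tag{7.6.73}\\
&= +\infty. \tag{7.6.74}
\end{align}
In the case that $|\langle\psi|\phi'\rangle|^2 = 1$, the second term vanishes and we find that
\begin{align}
\lim_{\varepsilon\to 0^+} \log_2\langle\psi|\sigma_\varepsilon^{-1}|\psi\rangle &= \lim_{\varepsilon\to 0^+} \log_2\!\left(\frac{|\langle\psi|\phi'\rangle|^2}{N + \varepsilon} + \frac{1 - |\langle\psi|\phi'\rangle|^2}{\varepsilon}\right) \tag{7.6.75}\\
&= \lim_{\varepsilon\to 0^+} \log_2\!\left(\frac{1}{N + \varepsilon}\right) \tag{7.6.76}\\
&= -\log_2 N, \tag{7.6.77}
\end{align}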
We note here that, for pure states 𝜌 and 𝜎 and as indicated by (7.6.48), the
geometric Rényi relative entropy is either equal to zero or +∞, depending on
whether 𝜌 = 𝜎. This behavior of the geometric Rényi relative entropy for pure
states 𝜌 and 𝜎 is very different from that of the Petz– and sandwiched Rényi relative
entropies. The latter quantities always evaluate to a finite value if the pure states
are non-orthogonal.
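The closed form in (7.6.72) and the two limiting behaviours can be reproduced numerically. This sketch assumes NumPy and takes 𝜎 = |𝜙⟩⟨𝜙| for an unnormalized vector |𝜙⟩ with squared norm 𝑁 and normalized version |𝜙′⟩; the dimension, seed, and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 4
phi = rng.normal(size=d) + 1j * rng.normal(size=d)  # unnormalized, sigma = |phi><phi|
N = np.vdot(phi, phi).real                          # squared norm of phi
phi_unit = phi / np.sqrt(N)                         # the unit vector phi'
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi = psi / np.linalg.norm(psi)                     # generic pure state

def direct(vec, eps):
    """<vec| (sigma + eps*1)^{-1} |vec> computed by solving a linear system."""
    sigma_eps = np.outer(phi, phi.conj()) + eps * np.eye(d)
    return np.vdot(vec, np.linalg.solve(sigma_eps, vec)).real

def closed_form(vec, eps):
    """Right-hand side of (7.6.72)."""
    overlap = abs(np.vdot(vec, phi_unit)) ** 2
    return overlap / (N + eps) + (1 - overlap) / eps

for eps in (1e-2, 1e-4, 1e-6):
    # generic psi: log2 grows without bound; psi = phi': log2 approaches -log2 N
    print(eps, np.log2(direct(psi, eps)), np.log2(direct(phi_unit, eps)), -np.log2(N))
```

The second column grows roughly like log2(1/𝜀), matching (7.6.74), while the third column settles at −log2 𝑁 as in (7.6.77).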
The geometric Rényi relative entropy possesses a number of useful properties,
similar to those for the Petz– and sandwiched Rényi relative entropies, which we
delineate now.
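\begin{equation}
\widehat{D}_\alpha(\rho\Vert\sigma) = \widehat{D}_\alpha(V\rho V^{\dagger}\Vert V\sigma V^{\dagger}). \tag{7.6.78}
\end{equation}
where
\begin{align}
\rho_{XA} &\coloneqq \sum_{x\in\mathcal{X}} p(x)|x\rangle\!\langle x|_X \otimes \rho_A^x, \tag{7.6.81}\\
\sigma_{XA} &\coloneqq \sum_{x\in\mathcal{X}} q(x)|x\rangle\!\langle x|_X \otimes \sigma_A^x. \tag{7.6.82}
\end{align}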
Proof:
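1. Proof of isometric invariance: Let us start by writing $\widehat{D}_\alpha(\rho\Vert\sigma)$ as in (7.6.1)–(7.6.3):
\begin{equation}
\widehat{D}_\alpha(\rho\Vert\sigma) = \lim_{\varepsilon\to 0^+} \frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right], \tag{7.6.83}
\end{equation}
where
\begin{equation}
\sigma_\varepsilon \coloneqq \sigma + \varepsilon\mathbb{1}. \tag{7.6.84}
\end{equation}
Let 𝑉 be an isometry. Then, defining
\begin{equation}
\omega_\varepsilon \coloneqq V\sigma V^{\dagger} + \varepsilon\mathbb{1}, \tag{7.6.85}
\end{equation}
we find that
\begin{equation}
\widehat{D}_\alpha(V\rho V^{\dagger}\Vert V\sigma V^{\dagger}) = \lim_{\varepsilon\to 0^+} \frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\omega_\varepsilon\left(\omega_\varepsilon^{-\frac{1}{2}} V\rho V^{\dagger}\omega_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right]. \tag{7.6.86}
\end{equation}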
holds for all 𝜀 > 0, we conclude the proof of isometric invariance by taking
the limit 𝜀 → 0+ .
2. Proof of monotonicity in 𝛼: We prove this by showing that the derivative is
non-negative for all 𝛼 > 0. By applying (7.6.22), we can consider 𝜌 and 𝜎 to
be positive definite without loss of generality. By applying (7.6.14), consider
that
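\begin{align}
\widehat{Q}_\alpha(\rho\Vert\sigma) &= \operatorname{Tr}\!\left[\rho\left(\rho^{-\frac{1}{2}}\sigma\rho^{-\frac{1}{2}}\right)^{1-\alpha}\right] \tag{7.6.96}\\
&= \operatorname{Tr}\!\left[\rho\left(\rho^{\frac{1}{2}}\sigma^{-1}\rho^{\frac{1}{2}}\right)^{\alpha-1}\right]. \tag{7.6.97}
\end{align}
\begin{align}
\gamma &\coloneqq \alpha - 1, \tag{7.6.98}\\
X &\coloneqq \rho^{\frac{1}{2}}\sigma^{-1}\rho^{\frac{1}{2}}, \tag{7.6.99}
\end{align}
\begin{align}
&\ln(2)\,\frac{\mathrm{d}}{\mathrm{d}\alpha}\widehat{D}_\alpha(\rho\Vert\sigma) \notag\\
&= \frac{\mathrm{d}}{\mathrm{d}\gamma}\left[\frac{1}{\gamma}\ln\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle\right] \tag{7.6.101}\\
&= -\frac{1}{\gamma^2}\ln\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle + \frac{1}{\gamma}\frac{\mathrm{d}}{\mathrm{d}\gamma}\ln\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle \tag{7.6.102}\\
&= -\frac{1}{\gamma^2}\ln\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle + \frac{1}{\gamma}\frac{\langle\varphi^{\rho}|X^{\gamma}\ln X\otimes\mathbb{1}|\varphi^{\rho}\rangle}{\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle} \tag{7.6.103}\\
&= \frac{-\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle\ln\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle + \gamma\langle\varphi^{\rho}|X^{\gamma}\ln X\otimes\mathbb{1}|\varphi^{\rho}\rangle}{\gamma^2\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle} \tag{7.6.104}\\
&= \frac{-\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle\ln\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle + \langle\varphi^{\rho}|X^{\gamma}\ln X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle}{\gamma^2\langle\varphi^{\rho}|X^{\gamma}\otimes\mathbb{1}|\varphi^{\rho}\rangle}. \tag{7.6.105}
\end{align}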
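\begin{align}
&\widehat{Q}_\alpha(\rho_1\Vert\sigma_{1,\varepsilon_1})\cdot\widehat{Q}_\alpha(\rho_2\Vert\sigma_{2,\varepsilon_2}) \notag\\
&= \operatorname{Tr}\!\left[\sigma_{1,\varepsilon_1}\left(\sigma_{1,\varepsilon_1}^{-\frac{1}{2}}\rho_1\sigma_{1,\varepsilon_1}^{-\frac{1}{2}}\right)^{\alpha}\right]\operatorname{Tr}\!\left[\sigma_{2,\varepsilon_2}\left(\sigma_{2,\varepsilon_2}^{-\frac{1}{2}}\rho_2\sigma_{2,\varepsilon_2}^{-\frac{1}{2}}\right)^{\alpha}\right] \tag{7.6.109}\\
&= \operatorname{Tr}\!\left[\sigma_{1,\varepsilon_1}\left(\sigma_{1,\varepsilon_1}^{-\frac{1}{2}}\rho_1\sigma_{1,\varepsilon_1}^{-\frac{1}{2}}\right)^{\alpha}\otimes\sigma_{2,\varepsilon_2}\left(\sigma_{2,\varepsilon_2}^{-\frac{1}{2}}\rho_2\sigma_{2,\varepsilon_2}^{-\frac{1}{2}}\right)^{\alpha}\right] \tag{7.6.110}\\
&= \operatorname{Tr}\!\left[\left(\sigma_{1,\varepsilon_1}\otimes\sigma_{2,\varepsilon_2}\right)\left(\left(\sigma_{1,\varepsilon_1}^{-\frac{1}{2}}\rho_1\sigma_{1,\varepsilon_1}^{-\frac{1}{2}}\right)^{\alpha}\otimes\left(\sigma_{2,\varepsilon_2}^{-\frac{1}{2}}\rho_2\sigma_{2,\varepsilon_2}^{-\frac{1}{2}}\right)^{\alpha}\right)\right] \tag{7.6.111}\\
&= \operatorname{Tr}\!\left[\left(\sigma_{1,\varepsilon_1}\otimes\sigma_{2,\varepsilon_2}\right)\left(\sigma_{1,\varepsilon_1}^{-\frac{1}{2}}\rho_1\sigma_{1,\varepsilon_1}^{-\frac{1}{2}}\otimes\sigma_{2,\varepsilon_2}^{-\frac{1}{2}}\rho_2\sigma_{2,\varepsilon_2}^{-\frac{1}{2}}\right)^{\alpha}\right] \tag{7.6.112}\\
&= \operatorname{Tr}\!\left[\left(\sigma_{1,\varepsilon_1}\otimes\sigma_{2,\varepsilon_2}\right)\left(\left(\sigma_{1,\varepsilon_1}\otimes\sigma_{2,\varepsilon_2}\right)^{-\frac{1}{2}}(\rho_1\otimes\rho_2)\left(\sigma_{1,\varepsilon_1}\otimes\sigma_{2,\varepsilon_2}\right)^{-\frac{1}{2}}\right)^{\alpha}\right] \tag{7.6.113}\\
&= \widehat{Q}_\alpha(\rho_1\otimes\rho_2\Vert\sigma_{1,\varepsilon_1}\otimes\sigma_{2,\varepsilon_2}). \tag{7.6.114}
\end{align}
By considering that
Finally, by applying the continuous function $\frac{1}{\alpha-1}\log_2(\cdot)$ to all sides of the
equalities established, we conclude that additivity holds.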
4. Proof of direct-sum property: Define the classical–quantum state 𝜌 𝑋 𝐴 and
operator 𝜎𝑋 𝐴 , respectively, as
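\begin{equation}
\rho_{XA} \coloneqq \sum_{x\in\mathcal{X}} p(x)|x\rangle\!\langle x|_X \otimes \rho_A^x, \qquad \sigma_{XA} \coloneqq \sum_{x\in\mathcal{X}} q(x)|x\rangle\!\langle x|_X \otimes \sigma_A^x. \tag{7.6.117}
\end{equation}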
Define
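\begin{align}
\sigma_{XA}^{\varepsilon} &\coloneqq \sum_{x\in\mathcal{X}} q(x)|x\rangle\!\langle x|_X \otimes \sigma_A^x + \varepsilon\,\mathbb{1}_X\otimes\mathbb{1}_A \tag{7.6.118}\\
&= \sum_{x\in\mathcal{X},\, q(x)\neq 0} q(x)|x\rangle\!\langle x|_X \otimes \sigma_A^x + \varepsilon\sum_{x\in\mathcal{X}} |x\rangle\!\langle x|_X \otimes \mathbb{1}_A \tag{7.6.119}\\
&= \sum_{x\in\mathcal{X},\, q(x)\neq 0} |x\rangle\!\langle x|_X \otimes q(x)\sigma_{A,\varepsilon}^x + \sum_{x\in\mathcal{X},\, q(x)= 0} |x\rangle\!\langle x|_X \otimes \varepsilon\mathbb{1}_A, \tag{7.6.120}
\end{align}
where
\begin{equation}
\sigma_{A,\varepsilon}^x \coloneqq \sigma_A^x + \varepsilon\mathbb{1}_A. \tag{7.6.121}
\end{equation}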
Then we find that
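\begin{equation}
\left(\sigma_{XA}^{\varepsilon}\right)^{-\frac{1}{2}} = \sum_{x\in\mathcal{X},\, q(x)\neq 0} |x\rangle\!\langle x|_X \otimes \left(q(x)\sigma_{A,\varepsilon}^x\right)^{-\frac{1}{2}} + \sum_{x\in\mathcal{X},\, q(x)= 0} |x\rangle\!\langle x|_X \otimes \varepsilon^{-\frac{1}{2}}\mathbb{1}_A, \tag{7.6.122}
\end{equation}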
Defining
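\begin{equation}
\omega_A^x \coloneqq \left(\sigma_{A,\varepsilon}^x\right)^{-\frac{1}{2}}\rho_A^x\left(\sigma_{A,\varepsilon}^x\right)^{-\frac{1}{2}}, \tag{7.6.124}
\end{equation}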
it then follows that
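\begin{align}
&\widehat{Q}_\alpha(\rho_{XA}\Vert\sigma_{XA}^{\varepsilon}) \notag\\
&= \operatorname{Tr}\!\left[\sigma_{XA}^{\varepsilon}\left(\left(\sigma_{XA}^{\varepsilon}\right)^{-\frac{1}{2}}\rho_{XA}\left(\sigma_{XA}^{\varepsilon}\right)^{-\frac{1}{2}}\right)^{\alpha}\right] \tag{7.6.125}\\
&= \operatorname{Tr}\!\left[\left(\sum_{\substack{x\in\mathcal{X},\\ q(x)\neq 0}} q(x)|x\rangle\!\langle x|_X\otimes\sigma_{A,\varepsilon}^x\right)\left(\sum_{\substack{x'\in\mathcal{X},\\ q(x')\neq 0,\\ p(x')\neq 0}} |x'\rangle\!\langle x'|_X\otimes\left(\frac{p(x')}{q(x')}\right)^{\alpha}\left(\omega_A^{x'}\right)^{\alpha}\right)\right] \notag\\
&\quad + \operatorname{Tr}\!\left[\left(\sum_{\substack{x\in\mathcal{X},\\ q(x)\neq 0}} q(x)|x\rangle\!\langle x|_X\otimes\sigma_{A,\varepsilon}^x\right)\left(\sum_{\substack{x'\in\mathcal{X},\\ q(x')= 0,\\ p(x')\neq 0}} |x'\rangle\!\langle x'|_X\otimes\varepsilon^{-\alpha}\left(\rho_A^{x'}\right)^{\alpha}\right)\right] \notag\\
&\quad + \operatorname{Tr}\!\left[\left(\sum_{\substack{x\in\mathcal{X},\\ q(x)= 0}} |x\rangle\!\langle x|_X\otimes\varepsilon\mathbb{1}_A\right)\left(\sum_{\substack{x'\in\mathcal{X},\\ q(x')\neq 0,\\ p(x')\neq 0}} |x'\rangle\!\langle x'|_X\otimes\left(\frac{p(x')}{q(x')}\right)^{\alpha}\left(\omega_A^{x'}\right)^{\alpha}\right)\right] \notag\\
&\quad + \operatorname{Tr}\!\left[\left(\sum_{\substack{x\in\mathcal{X},\\ q(x)= 0}} |x\rangle\!\langle x|_X\otimes\varepsilon\mathbb{1}_A\right)\left(\sum_{\substack{x'\in\mathcal{X},\\ q(x')= 0,\\ p(x')\neq 0}} |x'\rangle\!\langle x'|_X\otimes\varepsilon^{-\alpha}\left(\rho_A^{x'}\right)^{\alpha}\right)\right] \tag{7.6.126}\\
&= \sum_{\substack{x\in\mathcal{X},\, q(x)\neq 0,\\ p(x)\neq 0}} p(x)^{\alpha} q(x)^{1-\alpha}\operatorname{Tr}\!\left[\sigma_{A,\varepsilon}^x\left(\omega_A^x\right)^{\alpha}\right] + \sum_{\substack{x\in\mathcal{X},\, q(x)= 0,\\ p(x)\neq 0}} \varepsilon^{1-\alpha}\operatorname{Tr}\!\left[\left(\rho_A^x\right)^{\alpha}\right]. \tag{7.6.127}
\end{align}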
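Now observing that $\operatorname{Tr}\!\left[\sigma_{A,\varepsilon}^x\left(\omega_A^x\right)^{\alpha}\right] = \widehat{Q}_\alpha(\rho_A^x\Vert\sigma_{A,\varepsilon}^x)$ and taking the limit
𝜀 → 0+ in the last line above, we find that
\begin{align}
&\lim_{\varepsilon\to 0^+}\left(\sum_{\substack{x\in\mathcal{X},\, q(x)\neq 0,\\ p(x)\neq 0}} p(x)^{\alpha} q(x)^{1-\alpha}\,\widehat{Q}_\alpha(\rho_A^x\Vert\sigma_{A,\varepsilon}^x) + \sum_{\substack{x\in\mathcal{X},\, q(x)= 0,\\ p(x)\neq 0}} \varepsilon^{1-\alpha}\operatorname{Tr}\!\left[\left(\rho_A^x\right)^{\alpha}\right]\right) \notag\\
&= \sum_{x\in\mathcal{X}} p(x)^{\alpha} q(x)^{1-\alpha}\,\widehat{Q}_\alpha(\rho_A^x\Vert\sigma_A^x) \tag{7.6.128}
\end{align}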
if 𝛼 ∈ (0, 1) or if 𝛼 ∈ (1, ∞), supp(𝜌 𝑥𝐴 ) ⊆ supp(𝜎𝐴𝑥 ), and there does not exist
a value of 𝑥 for which 𝑝(𝑥) ≠ 0 and 𝑞(𝑥) = 0. The latter support conditions
are precisely the same as supp(𝜌 𝑋 𝐴 ) ⊆ supp(𝜎𝑋 𝐴 ). If 𝛼 ∈ (1, ∞) and the
support conditions do not hold, then the limit evaluates to +∞, consistent
with the right-hand side above. This concludes the proof of the direct-sum
property. ■
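Both the multiplicativity of the quasi-entropy under tensor products, (7.6.109)–(7.6.114), and the direct-sum formula in (7.6.128) can be checked numerically when every operator is positive definite, so that no 𝜀-regularization is required. The sketch below assumes NumPy; the ensemble, the weights `p` and `q`, and all helper names are illustrative.

```python
import numpy as np

def mpow(a, p):
    """Real power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def random_pd(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T / d + np.eye(d)

def geometric_quasi(rho, sigma, alpha):
    """Tr[sigma (sigma^{-1/2} rho sigma^{-1/2})^alpha] for positive definite sigma."""
    smh = mpow(sigma, -0.5)
    inner = smh @ rho @ smh
    inner = (inner + inner.conj().T) / 2
    return np.trace(sigma @ mpow(inner, alpha)).real

def block_diag_cq(weights, ops):
    """sum_x w(x) |x><x| (x) op_x, written as a block-diagonal matrix."""
    n, d = len(ops), ops[0].shape[0]
    out = np.zeros((n * d, n * d), dtype=complex)
    for x, (w, op) in enumerate(zip(weights, ops)):
        out[x * d:(x + 1) * d, x * d:(x + 1) * d] = w * op
    return out

rng = np.random.default_rng(5)
p = [0.5, 0.3, 0.2]          # probability distribution
q = [0.7, 1.2, 0.4]          # positive function, not normalized
rhos = []
for _ in p:
    r = random_pd(2, rng)
    rhos.append(r / np.trace(r).real)
sigmas = [random_pd(2, rng) for _ in q]
rho_xa, sigma_xa = block_diag_cq(p, rhos), block_diag_cq(q, sigmas)

def direct_sum_sides(alpha):
    lhs = geometric_quasi(rho_xa, sigma_xa, alpha)
    rhs = sum(px**alpha * qx**(1 - alpha) * geometric_quasi(r, s, alpha)
              for px, qx, r, s in zip(p, q, rhos, sigmas))
    return lhs, rhs

def product_sides(alpha):
    lhs = geometric_quasi(np.kron(rhos[0], rhos[1]), np.kron(sigmas[0], sigmas[1]), alpha)
    rhs = geometric_quasi(rhos[0], sigmas[0], alpha) * geometric_quasi(rhos[1], sigmas[1], alpha)
    return lhs, rhs

for alpha in (0.5, 1.5, 3.0):
    print(alpha, direct_sum_sides(alpha), product_sides(alpha))
```

Restricting to 𝑞(𝑥) > 0 and positive definite blocks sidesteps the case analysis in (7.6.126); the support conditions discussed after (7.6.128) matter only when some of these assumptions fail.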
We now establish the data-processing inequality for the geometric Rényi relative
entropy for 𝛼 ∈ (0, 1) ∪ (1, 2].
Proof: From Stinespring’s dilation theorem (Theorem 4.3), we know that the
action of a quantum channel N on every linear operator 𝑋 can be written as
N(𝑋) = Tr𝐸 [𝑉 𝑋𝑉 † ], (7.6.130)
where 𝑉 is an isometry and 𝐸 is an auxiliary system with dimension $d_E \geq \operatorname{rank}(\Gamma_{AB}^{\mathcal{N}})$,
with $\Gamma_{AB}^{\mathcal{N}}$ the Choi operator for the channel N. As stated in Proposition 7.44, the
geometric Rényi relative entropy $\widehat{D}_\alpha$ is isometrically invariant. Therefore, it suffices
to establish the data-processing inequality for $\widehat{D}_\alpha$ under partial trace; i.e., it suffices
to show that for every state $\rho_{AB}$, positive semi-definite operator $\sigma_{AB}$, and for all
𝛼 ∈ (0, 1) ∪ (1, 2]:
\begin{equation}
\widehat{D}_\alpha(\rho_{AB}\Vert\sigma_{AB}) \geq \widehat{D}_\alpha(\rho_A\Vert\sigma_A). \tag{7.6.131}
\end{equation}
We now proceed to prove this inequality. We prove it for 𝜌 𝐴𝐵 , and hence 𝜌 𝐴 ,
invertible, as well as for 𝜎𝐴𝐵 and 𝜎𝐴 invertible. The result follows in the general
case of 𝜌 𝐴𝐵 and/or 𝜌 𝐴 non-invertible, as well as 𝜎𝐴𝐵 and/or 𝜎𝐴 non-invertible, by
applying the result to the invertible operators (1 − 𝛿) 𝜌 𝐴𝐵 + 𝛿𝜋 𝐴𝐵 and 𝜎𝐴𝐵 + 𝜀 1 𝐴𝐵 ,
with 𝛿 ∈ (0, 1) and 𝜀 > 0, and taking the limit 𝛿 → 0+ followed by 𝜀 → 0+ ,
because
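\begin{align}
\widehat{D}_\alpha(\rho_{AB}\Vert\sigma_{AB}) &= \lim_{\varepsilon\to 0^+}\lim_{\delta\to 0^+} \widehat{D}_\alpha((1-\delta)\rho_{AB} + \delta\pi_{AB}\Vert\sigma_{AB} + \varepsilon\mathbb{1}_{AB}), \tag{7.6.132}\\
\widehat{D}_\alpha(\rho_A\Vert\sigma_A) &= \lim_{\varepsilon\to 0^+}\lim_{\delta\to 0^+} \widehat{D}_\alpha((1-\delta)\rho_A + \delta\pi_A\Vert\sigma_A + d_B\,\varepsilon\mathbb{1}_A), \tag{7.6.133}
\end{align}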
which follows from (7.6.22) and the fact that the dimensional factor 𝑑 𝐵 does not
affect the limit in the second quantity above.
To establish the data-processing inequality, we make use of the Petz recovery
channel for partial trace (see Section 4.6.1.1), as well as the operator Jensen
inequality (Theorem 2.16). Recall that the Petz recovery channel $\mathcal{P}_{\sigma_{AB},\operatorname{Tr}_B}$ for
partial trace is defined as
\begin{equation}
\mathcal{P}_{\sigma_{AB},\operatorname{Tr}_B}(X_A) \equiv \mathcal{P}(X_A) \coloneqq \sigma_{AB}^{\frac{1}{2}}\left(\sigma_A^{-\frac{1}{2}} X_A\,\sigma_A^{-\frac{1}{2}}\otimes\mathbb{1}_B\right)\sigma_{AB}^{\frac{1}{2}}. \tag{7.6.134}
\end{equation}
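which can be verified by inspection. Since $\mathcal{P}_{\sigma_{AB},\operatorname{Tr}_B}$ is completely positive and trace
preserving, it follows that its adjoint
\begin{equation}
\mathcal{P}^{\dagger}(Y_{AB}) \coloneqq \sigma_A^{-\frac{1}{2}}\operatorname{Tr}_B\!\left[\sigma_{AB}^{\frac{1}{2}}\, Y_{AB}\,\sigma_{AB}^{\frac{1}{2}}\right]\sigma_A^{-\frac{1}{2}}, \tag{7.6.136}
\end{equation}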
The second equality follows from (7.6.135). The sole inequality is a consequence of
the operator Jensen inequality and the fact that 𝑥 𝛼 is operator convex for 𝛼 ∈ (1, 2].
Indeed, for M a completely positive unital map, item 2. of Theorem 2.16 implies
that
𝑓 (M(𝑋)) ≤ M( 𝑓 (𝑋)) (7.6.144)
for Hermitian 𝑋 and an operator convex function 𝑓 . The second-to-last equality
follows from (7.6.137).
Applying the same reasoning as above, but using the fact that 𝑥 𝛼 is operator
concave for 𝛼 ∈ (0, 1), we find for 𝛼 ∈ (0, 1) that
\begin{equation}
\widehat{Q}_\alpha(\rho_A\Vert\sigma_A) \geq \widehat{Q}_\alpha(\rho_{AB}\Vert\sigma_{AB}). \tag{7.6.145}
\end{equation}
Putting together the above and employing definitions, we find that the following
inequality holds for 𝛼 ∈ (0, 1) ∪ (1, 2]:
\begin{equation}
\widehat{D}_\alpha(\rho_{AB}\Vert\sigma_{AB}) \geq \widehat{D}_\alpha(\rho_A\Vert\sigma_A), \tag{7.6.146}
\end{equation}
With the data-processing inequality for the geometric Rényi relative entropy in
hand, we can establish some additional properties.
2. Faithfulness: Suppose that Tr[𝜎] ≤ Tr[𝜌] = 1 and let 𝛼 ∈ (0, 1) ∪ (1, ∞).
Then $\widehat{D}_\alpha(\rho\Vert\sigma) = 0$ if and only if 𝜌 = 𝜎.
3. If 𝜌 ≤ 𝜎, then $\widehat{D}_\alpha(\rho\Vert\sigma) \leq 0$.
Proof:
1. Apply the data-processing inequality with the channel being the full trace-out
channel:
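\begin{align}
\widehat{D}_\alpha(\rho\Vert\sigma) &\geq \widehat{D}_\alpha(\operatorname{Tr}[\rho]\Vert\operatorname{Tr}[\sigma]) \tag{7.6.147}\\
&= \frac{1}{\alpha-1}\log_2\!\left[(\operatorname{Tr}[\rho])^{\alpha}(\operatorname{Tr}[\sigma])^{1-\alpha}\right] \tag{7.6.148}\\
&= -\log_2\operatorname{Tr}[\sigma] \tag{7.6.149}\\
&\geq 0. \tag{7.6.150}
\end{align}
2. If 𝜌 = 𝜎, then it follows by direct evaluation that $\widehat{D}_\alpha(\rho\Vert\sigma) = 0$ for 𝛼 ∈
(0, 1) ∪ (1, ∞).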
To see the other implication, suppose first that 𝛼 ∈ (0, 1) ∪ (1, 2]. Then $\widehat{D}_\alpha(\rho\Vert\sigma) = 0$
implies that equality is achieved in the two inequalities in item 1. above.
So then Tr[𝜎] = 1. Furthermore, we conclude from data processing that
$\widehat{D}_\alpha(\mathcal{M}(\rho)\Vert\mathcal{M}(\sigma)) = 0$ for all measurement channels M. This includes the
measurement that achieves the trace distance. By applying the faithfulness of
the classical Rényi relative entropy on the distributions that result from the
optimal trace-distance measurement, we conclude that 𝜌 = 𝜎. To get the range
outside the data-processing interval of (0, 1) ∪ (1, 2], note that $\widehat{D}_\alpha(\rho\Vert\sigma) = 0$
for 𝛼 > 2 implies, by monotonicity in 𝛼 (Property 2 of Proposition 7.44) together with the
non-negativity established in item 1., that $\widehat{D}_{\alpha'}(\rho\Vert\sigma) = 0$ for all 𝛼′ ∈ (0, 1) ∪ (1, 2]. Then it follows that 𝜌 = 𝜎.
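\begin{align}
\hat{\rho} &\coloneqq |0\rangle\!\langle 0| \otimes \rho, \tag{7.6.151}\\
\hat{\sigma} &\coloneqq |0\rangle\!\langle 0| \otimes \rho + |1\rangle\!\langle 1| \otimes (\sigma - \rho). \tag{7.6.152}
\end{align}
\begin{align}
\hat{\rho} &\coloneqq |0\rangle\!\langle 0| \otimes \rho, \tag{7.6.154}\\
\hat{\sigma} &\coloneqq |0\rangle\!\langle 0| \otimes \sigma + |1\rangle\!\langle 1| \otimes (\sigma' - \sigma). \tag{7.6.155}
\end{align}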
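The data-processing inequality for the geometric Rényi relative entropy can be
written using the geometric Rényi relative quasi-entropy $\widehat{Q}_\alpha(\rho\Vert\sigma)$ as
\begin{equation}
\frac{1}{\alpha-1}\log_2\widehat{Q}_\alpha(\rho\Vert\sigma) \geq \frac{1}{\alpha-1}\log_2\widehat{Q}_\alpha(\mathcal{N}(\rho)\Vert\mathcal{N}(\sigma)). \tag{7.6.157}
\end{equation}
Since 𝛼 − 1 is positive for 𝛼 ∈ (1, 2] and negative for 𝛼 ∈ (0, 1), we can use the monotonicity of the function
log2 to obtain
\begin{align}
\widehat{Q}_\alpha(\rho\Vert\sigma) &\geq \widehat{Q}_\alpha(\mathcal{N}(\rho)\Vert\mathcal{N}(\sigma)), \quad \text{for } \alpha \in (1, 2], \tag{7.6.158}\\
\widehat{Q}_\alpha(\rho\Vert\sigma) &\leq \widehat{Q}_\alpha(\mathcal{N}(\rho)\Vert\mathcal{N}(\sigma)), \quad \text{for } \alpha \in (0, 1). \tag{7.6.159}
\end{align}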
We can use this to establish some convexity/concavity statements for the geometric
Rényi relative entropy.
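The inequalities (7.6.158) and (7.6.159) can be illustrated for the partial-trace channel, which is the case treated in the proof of the data-processing theorem. The sketch below assumes NumPy and positive definite operators on a bipartite system; the dimensions, seed, and helper names are illustrative choices.

```python
import numpy as np

def mpow(a, p):
    """Real power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def random_pd(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T / d + np.eye(d)

def geometric_quasi(rho, sigma, alpha):
    """Tr[sigma (sigma^{-1/2} rho sigma^{-1/2})^alpha] for positive definite sigma."""
    smh = mpow(sigma, -0.5)
    inner = smh @ rho @ smh
    inner = (inner + inner.conj().T) / 2
    return np.trace(sigma @ mpow(inner, alpha)).real

def geometric_renyi(rho, sigma, alpha):
    return np.log2(geometric_quasi(rho, sigma, alpha)) / (alpha - 1)

def partial_trace_B(m, dA, dB):
    """Trace out the second tensor factor of an operator on C^dA (x) C^dB."""
    return np.trace(m.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

rng = np.random.default_rng(6)
dA, dB = 2, 3
rho_AB = random_pd(dA * dB, rng)
rho_AB = rho_AB / np.trace(rho_AB).real
sigma_AB = random_pd(dA * dB, rng)
rho_A = partial_trace_B(rho_AB, dA, dB)
sigma_A = partial_trace_B(sigma_AB, dA, dB)

for alpha in (0.3, 0.7, 1.5, 2.0):
    q_ab, q_a = geometric_quasi(rho_AB, sigma_AB, alpha), geometric_quasi(rho_A, sigma_A, alpha)
    d_ab, d_a = geometric_renyi(rho_AB, sigma_AB, alpha), geometric_renyi(rho_A, sigma_A, alpha)
    print(alpha, q_ab, q_a, d_ab, d_a)
```

The quasi-entropy moves in opposite directions on the two sides of 𝛼 = 1, while the relative entropy itself never increases under the partial trace on the tested range.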
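Consequently, the geometric Rényi relative entropy $\widehat{D}_\alpha$ is jointly convex for
𝛼 ∈ (0, 1):
\begin{equation}
\widehat{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho_A^x \,\middle\Vert\, \sum_{x\in\mathcal{X}} p(x)\sigma_A^x\right) \leq \sum_{x\in\mathcal{X}} p(x)\,\widehat{D}_\alpha(\rho_A^x\Vert\sigma_A^x). \tag{7.6.162}
\end{equation}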
Proof: The first two inequalities follow directly from the direct-sum property of the
geometric Rényi relative entropy (Proposition 7.44), the data-processing inequality
(Theorem 7.45), and Proposition 7.17. The last inequality follows from the first by
applying the logarithm, scaling by 1/(𝛼 − 1), and taking a maximum. ■
Although the geometric Rényi relative entropy is not jointly convex for 𝛼 ∈
(1, 2] , it is jointly quasi-convex, in the sense that
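\begin{equation}
\widehat{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho_A^x \,\middle\Vert\, \sum_{x\in\mathcal{X}} p(x)\sigma_A^x\right) \leq \max_{x\in\mathcal{X}} \widehat{D}_\alpha(\rho_A^x\Vert\sigma_A^x), \tag{7.6.163}
\end{equation}
for every finite alphabet X, probability distribution 𝑝 : X → [0, 1], set $\{\rho_A^x\}_{x\in\mathcal{X}}$ of
states, and set $\{\sigma_A^x\}_{x\in\mathcal{X}}$ of positive semi-definite operators. Indeed, from (7.6.160),
we immediately obtain
\begin{equation}
\widehat{Q}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho_A^x \,\middle\Vert\, \sum_{x\in\mathcal{X}} p(x)\sigma_A^x\right) \leq \max_{x\in\mathcal{X}} \widehat{Q}_\alpha(\rho_A^x\Vert\sigma_A^x). \tag{7.6.164}
\end{equation}
Taking the logarithm and multiplying by $\frac{1}{\alpha-1}$ on both sides of this inequality leads
to (7.6.163).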
The geometric Rényi relative entropy has another interpretation, which is
worthwhile to mention.
where the classical Rényi relative entropy is defined in (7.4.19), the channel
P is a classical–quantum channel, 𝑝 : X → [0, 1] is a probability distribution
over a finite alphabet X, 𝑞 : X → [0, ∞) is a positive function on X,
$\omega(p) \coloneqq \sum_{x\in\mathcal{X}} p(x)|x\rangle\!\langle x|$, and $\omega(q) \coloneqq \sum_{x\in\mathcal{X}} q(x)|x\rangle\!\langle x|$.
Proof: First, suppose that there exists a quantum channel P such that
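\begin{align}
D_\alpha(p\Vert q) &= \widehat{D}_\alpha(\omega(p)\Vert\omega(q)) \tag{7.6.167}\\
&\geq \widehat{D}_\alpha(\mathcal{P}(\omega(p))\Vert\mathcal{P}(\omega(q))) \tag{7.6.168}\\
&= \widehat{D}_\alpha(\rho\Vert\sigma). \tag{7.6.169}
\end{align}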
The first equality follows because the geometric Rényi relative entropy reduces to
the classical Rényi relative entropy for commuting operators. The inequality is
a consequence of the data-processing inequality for the geometric Rényi relative
entropy (Theorem 7.45). The final equality follows from the constraint in (7.6.166).
Since the inequality holds for arbitrary 𝑝, 𝑞, and P satisfying (7.6.166), we conclude
that
\begin{equation}
\inf_{\{p, q, \mathcal{P}\}} \left\{ D_\alpha(p\Vert q) : \mathcal{P}(p) = \rho,\ \mathcal{P}(q) = \sigma \right\} \geq \widehat{D}_\alpha(\rho\Vert\sigma). \tag{7.6.170}
\end{equation}
and
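\begin{equation}
\mathcal{P}(\omega(q)) = \sum_{x} \frac{q(x)}{q(x)}\,\sigma^{\frac{1}{2}}\Pi_x\sigma^{\frac{1}{2}} = \sigma^{\frac{1}{2}}\left(\sum_{x}\Pi_x\right)\sigma^{\frac{1}{2}} = \sigma. \tag{7.6.177}
\end{equation}
and the fact that these choices of 𝑝, 𝑞, and P satisfy the constraints P( 𝑝) = 𝜌 and
P(𝑞) = 𝜎, we conclude that
\begin{equation}
D_\alpha(p\Vert q) = \widehat{D}_\alpha(\rho\Vert\sigma). \tag{7.6.180}
\end{equation}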
Proposition 7.49
Let 𝜌 be a state and 𝜎 a positive semi-definite operator. For 𝛼 ∈ (0, 1) ∪ (1, 2],
the following inequalities hold
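\begin{equation}
\widetilde{D}_\alpha(\rho\Vert\sigma) \leq D_\alpha(\rho\Vert\sigma) \leq \widehat{D}_\alpha(\rho\Vert\sigma), \tag{7.6.181}
\end{equation}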
Proof: The first inequality was stated as the last property of Proposition 7.31.
So we establish the proof of the second inequality here. Suppose that P is a
classical–quantum channel, 𝑝 : X → [0, 1] is a probability distribution over a finite
alphabet X, and 𝑞 : X → (0, ∞) is a positive function on X satisfying
where
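\begin{equation}
\omega(p) \coloneqq \sum_{x\in\mathcal{X}} p(x)|x\rangle\!\langle x|, \qquad \omega(q) \coloneqq \sum_{x\in\mathcal{X}} q(x)|x\rangle\!\langle x|. \tag{7.6.183}
\end{equation}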
The first equality follows because the Petz–Rényi relative entropy reduces to
the classical Rényi relative entropy for commuting operators. The inequality
follows from the data-processing inequality for the Petz–Rényi relative entropy for
𝛼 ∈ (0, 1) ∪ (1, 2] (Theorem 7.24). The final equality follows from the constraint in
(7.6.182). Since the inequality above holds for all 𝑝, 𝑞, and P satisfying (7.6.182),
we conclude that
The prefactors in these bounds to the left of the trace expressions are uniform and
independent of 𝜀, and so it follows that
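\begin{align}
\lim_{\varepsilon\to 0^+}\lim_{\delta\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] &= \lim_{\varepsilon\to 0^+}\lim_{\delta\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho'_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right], \tag{7.6.192}\\
\lim_{\delta\to 0^+}\lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] &= \lim_{\delta\to 0^+}\lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho'_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right]. \tag{7.6.193}
\end{align}
Again from the operator monotonicity of $x^{\alpha}$ for 𝛼 ∈ (0, 1), we conclude for fixed
𝜀 > 0 that
\begin{equation}
\delta_1 \leq \delta_2 \quad\Rightarrow\quad \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho'_{\delta_1}\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] \leq \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho'_{\delta_2}\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right], \tag{7.6.194}
\end{equation}
where $\delta_1 > 0$. By exploiting the identity
\begin{equation}
\operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho'_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] = \operatorname{Tr}\!\left[\rho'_\delta\left((\rho'_\delta)^{-\frac{1}{2}}\sigma_\varepsilon(\rho'_\delta)^{-\frac{1}{2}}\right)^{1-\alpha}\right] \tag{7.6.195}
\end{equation}
from Proposition 7.39 and operator monotonicity of $x^{1-\alpha}$ for 𝛼 ∈ (0, 1), we
conclude for fixed 𝛿 > 0 that
\begin{equation}
\varepsilon_1 \leq \varepsilon_2 \quad\Rightarrow\quad \operatorname{Tr}\!\left[\sigma_{\varepsilon_1}\left(\sigma_{\varepsilon_1}^{-\frac{1}{2}}\rho'_\delta\,\sigma_{\varepsilon_1}^{-\frac{1}{2}}\right)^{\alpha}\right] \leq \operatorname{Tr}\!\left[\sigma_{\varepsilon_2}\left(\sigma_{\varepsilon_2}^{-\frac{1}{2}}\rho'_\delta\,\sigma_{\varepsilon_2}^{-\frac{1}{2}}\right)^{\alpha}\right], \tag{7.6.196}
\end{equation}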
First suppose that 𝛼 ∈ (1, ∞) and supp(𝜌) ⊈ supp(𝜎). Then from Propositions 7.29
and 7.42 and the fact that the sandwiched Rényi relative quasi-entropy $\widetilde{Q}_\alpha(\rho\Vert\sigma) = +\infty$
in this case, it follows that $\widehat{Q}_\alpha(\rho\Vert\sigma) = +\infty$, thus establishing the third
expression in (7.6.16).
Now suppose that 𝛼 ∈ (0, 1) ∪ (1, ∞) and supp(𝜌) ⊆ supp(𝜎). Let us employ
the decomposition of the Hilbert space H as H = supp(𝜎) ⊕ ker(𝜎). Then we can
write 𝜌 as
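\begin{equation}
\rho = \begin{pmatrix} \rho_{0,0} & \rho_{0,1} \\ \rho_{0,1}^{\dagger} & \rho_{1,1} \end{pmatrix}, \qquad \sigma = \begin{pmatrix} \sigma & 0 \\ 0 & 0 \end{pmatrix}. \tag{7.6.199}
\end{equation}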
Writing 1 = Π𝜎 + Π𝜎⊥ , where Π𝜎 is the projection onto the support of 𝜎 and Π𝜎⊥ is
the projection onto the orthogonal complement of supp(𝜎), we find that
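\begin{equation}
\sigma_\varepsilon = \begin{pmatrix} \sigma + \varepsilon\Pi_\sigma & 0 \\ 0 & \varepsilon\Pi_\sigma^{\perp} \end{pmatrix}, \tag{7.6.200}
\end{equation}
which implies that
\begin{equation}
\sigma_\varepsilon^{-\frac{1}{2}} = \begin{pmatrix} (\sigma + \varepsilon\Pi_\sigma)^{-\frac{1}{2}} & 0 \\ 0 & \varepsilon^{-\frac{1}{2}}\Pi_\sigma^{\perp} \end{pmatrix}. \tag{7.6.201}
\end{equation}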
The condition supp(𝜌) ⊆ supp(𝜎) implies that 𝜌0,1 = 0 and 𝜌1,1 = 0. Then
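\begin{equation}
\sigma_\varepsilon^{-\frac{1}{2}}\rho\,\sigma_\varepsilon^{-\frac{1}{2}} = \begin{pmatrix} (\sigma + \varepsilon\Pi_\sigma)^{-\frac{1}{2}}\rho_{0,0}(\sigma + \varepsilon\Pi_\sigma)^{-\frac{1}{2}} & 0 \\ 0 & 0 \end{pmatrix}, \tag{7.6.202}
\end{equation}
so that
\begin{align}
&\operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] \notag\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \sigma + \varepsilon\Pi_\sigma & 0 \\ 0 & \varepsilon\Pi_\sigma^{\perp} \end{pmatrix}\begin{pmatrix} \left[(\sigma + \varepsilon\Pi_\sigma)^{-\frac{1}{2}}\rho_{0,0}(\sigma + \varepsilon\Pi_\sigma)^{-\frac{1}{2}}\right]^{\alpha} & 0 \\ 0 & 0 \end{pmatrix}\right] \tag{7.6.203}\\
&= \operatorname{Tr}\!\left[(\sigma + \varepsilon\Pi_\sigma)\left[(\sigma + \varepsilon\Pi_\sigma)^{-\frac{1}{2}}\rho_{0,0}(\sigma + \varepsilon\Pi_\sigma)^{-\frac{1}{2}}\right]^{\alpha}\right]. \tag{7.6.204}
\end{align}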
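where
\begin{equation}
\hat{\sigma}_\varepsilon \coloneqq \sigma + \varepsilon\Pi_\sigma. \tag{7.6.208}
\end{equation}
Since
\begin{equation}
\left(\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}\,\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\right)^{\alpha} = \left(\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}\,\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\right)\left(\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}\,\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\right)^{\alpha-1} \tag{7.6.209}
\end{equation}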
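where we applied Lemma 2.5 with $f(x) = x^{\alpha-1}$ and $L = \rho_{0,0}^{\frac{1}{2}}\hat{\sigma}_\varepsilon^{-\frac{1}{2}}$. Now taking the
limit 𝜀 → 0+, we conclude that
\begin{align}
\lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] &= \lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\rho_{0,0}\left(\rho_{0,0}^{\frac{1}{2}}\hat{\sigma}_\varepsilon^{-1}\rho_{0,0}^{\frac{1}{2}}\right)^{\alpha-1}\right] \tag{7.6.214}\\
&= \operatorname{Tr}\!\left[\rho_{0,0}\left(\rho_{0,0}^{\frac{1}{2}}\sigma^{-1}\rho_{0,0}^{\frac{1}{2}}\right)^{\alpha-1}\right] \tag{7.6.215}\\
&= \operatorname{Tr}\!\left[\rho\left(\rho^{\frac{1}{2}}\sigma^{-1}\rho^{\frac{1}{2}}\right)^{\alpha-1}\right], \tag{7.6.216}
\end{align}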
for the case 𝛼 ∈ (1, ∞) and supp(𝜌) ⊆ supp(𝜎), thus establishing (7.6.19).
For the case that 𝛼 ∈ (0, 1) and supp(𝜎) ⊆ supp(𝜌), we can employ the limit
exchange from Lemma 7.41 and a similar argument as in (7.6.199)–(7.6.206), but
with respect to the decomposition H = supp(𝜌) ⊕ ker(𝜌), to conclude that
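\begin{equation}
\widehat{Q}_\alpha(\rho\Vert\sigma) = \operatorname{Tr}\!\left[\rho\left(\rho^{-\frac{1}{2}}\sigma\rho^{-\frac{1}{2}}\right)^{1-\alpha}\right], \tag{7.6.217}
\end{equation}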
thus establishing the second expression in (7.6.16). This case amounts to the
exchange 𝜌 ↔ 𝜎 and 𝛼 ↔ 1 − 𝛼.
We finally consider the case 𝛼 ∈ (0, 1) and supp(𝜌) ⊈ supp(𝜎), which is the
most involved case. Consider that
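\begin{equation}
\sigma_\varepsilon \coloneqq \sigma + \varepsilon\mathbb{1} = \begin{pmatrix} \hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon\Pi_\sigma^{\perp} \end{pmatrix}, \tag{7.6.218}
\end{equation}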
𝜌 𝛿 B (1 − 𝛿) 𝜌 + 𝛿𝜋, (7.6.219)
with 𝛿 ∈ (0, 1) and 𝜋 the maximally mixed state. By invoking Lemma 7.41, we
conclude that the following exchange of limits is possible for 𝛼 ∈ (0, 1):
Now define
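\begin{equation}
\rho_{0,0}^{\delta} \coloneqq \Pi_\sigma\rho_\delta\Pi_\sigma, \qquad \rho_{0,1}^{\delta} \coloneqq \Pi_\sigma\rho_\delta\Pi_\sigma^{\perp}, \qquad \rho_{1,1}^{\delta} \coloneqq \Pi_\sigma^{\perp}\rho_\delta\Pi_\sigma^{\perp}, \tag{7.6.221}
\end{equation}
so that
\begin{equation}
\rho_\delta = \begin{pmatrix} \rho_{0,0}^{\delta} & \rho_{0,1}^{\delta} \\ (\rho_{0,1}^{\delta})^{\dagger} & \rho_{1,1}^{\delta} \end{pmatrix}. \tag{7.6.222}
\end{equation}
Then
\begin{equation}
\widehat{D}_\alpha(\rho_\delta\Vert\sigma_\varepsilon) = \frac{1}{\alpha-1}\log_2\operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right]. \tag{7.6.223}
\end{equation}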
Consider that
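\begin{align}
\sigma_\varepsilon^{-\frac{1}{2}}\rho_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}
&= \begin{pmatrix} \hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon\Pi_\sigma^{\perp} \end{pmatrix}^{-\frac{1}{2}}\begin{pmatrix} \rho_{0,0}^{\delta} & \rho_{0,1}^{\delta} \\ (\rho_{0,1}^{\delta})^{\dagger} & \rho_{1,1}^{\delta} \end{pmatrix}\begin{pmatrix} \hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon\Pi_\sigma^{\perp} \end{pmatrix}^{-\frac{1}{2}} \tag{7.6.224}\\
&= \begin{pmatrix} \hat{\sigma}_\varepsilon^{-\frac{1}{2}} & 0 \\ 0 & \varepsilon^{-\frac{1}{2}}\Pi_\sigma^{\perp} \end{pmatrix}\begin{pmatrix} \rho_{0,0}^{\delta} & \rho_{0,1}^{\delta} \\ (\rho_{0,1}^{\delta})^{\dagger} & \rho_{1,1}^{\delta} \end{pmatrix}\begin{pmatrix} \hat{\sigma}_\varepsilon^{-\frac{1}{2}} & 0 \\ 0 & \varepsilon^{-\frac{1}{2}}\Pi_\sigma^{\perp} \end{pmatrix} \tag{7.6.225}\\
&= \begin{pmatrix} \hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}^{\delta}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{-\frac{1}{2}}\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,1}^{\delta}\Pi_\sigma^{\perp} \\ \varepsilon^{-\frac{1}{2}}\Pi_\sigma^{\perp}(\rho_{0,1}^{\delta})^{\dagger}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{-1}\Pi_\sigma^{\perp}\rho_{1,1}^{\delta}\Pi_\sigma^{\perp} \end{pmatrix} \tag{7.6.226}\\
&= \begin{pmatrix} \hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}^{\delta}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{-\frac{1}{2}}\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,1}^{\delta} \\ \varepsilon^{-\frac{1}{2}}(\rho_{0,1}^{\delta})^{\dagger}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{-1}\rho_{1,1}^{\delta} \end{pmatrix}. \tag{7.6.227}
\end{align}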
So then
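\begin{align}
&\operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] \notag\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon\Pi_\sigma^{\perp} \end{pmatrix}\begin{pmatrix} \hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}^{\delta}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{-\frac{1}{2}}\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,1}^{\delta} \\ \varepsilon^{-\frac{1}{2}}(\rho_{0,1}^{\delta})^{\dagger}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{-1}\rho_{1,1}^{\delta} \end{pmatrix}^{\alpha}\right] \tag{7.6.228}\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon\Pi_\sigma^{\perp} \end{pmatrix}\left(\varepsilon^{-1}\begin{pmatrix} \varepsilon\,\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}^{\delta}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{\frac{1}{2}}\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,1}^{\delta} \\ \varepsilon^{\frac{1}{2}}(\rho_{0,1}^{\delta})^{\dagger}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \rho_{1,1}^{\delta} \end{pmatrix}\right)^{\alpha}\right] \tag{7.6.229}\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}\begin{pmatrix} \varepsilon\,\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}^{\delta}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{\frac{1}{2}}\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,1}^{\delta} \\ \varepsilon^{\frac{1}{2}}(\rho_{0,1}^{\delta})^{\dagger}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \rho_{1,1}^{\delta} \end{pmatrix}^{\alpha}\right]. \tag{7.6.230}
\end{align}
Let us define
\begin{equation}
K(\varepsilon) \coloneqq \begin{pmatrix} \varepsilon\,\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,0}^{\delta}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \varepsilon^{\frac{1}{2}}\hat{\sigma}_\varepsilon^{-\frac{1}{2}}\rho_{0,1}^{\delta} \\ \varepsilon^{\frac{1}{2}}(\rho_{0,1}^{\delta})^{\dagger}\hat{\sigma}_\varepsilon^{-\frac{1}{2}} & \rho_{1,1}^{\delta} \end{pmatrix}, \tag{7.6.231}
\end{equation}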
Defining
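\begin{align}
L(\varepsilon) &\coloneqq \begin{pmatrix} \varepsilon S(\rho_\delta, \hat{\sigma}_\varepsilon) & 0 \\ 0 & \rho_{1,1}^{\delta} + \varepsilon R \end{pmatrix}, \tag{7.6.237}\\
S(\rho_\delta, \hat{\sigma}_\varepsilon) &\coloneqq \hat{\sigma}_\varepsilon^{-\frac{1}{2}}\left(\rho_{0,0}^{\delta} - \rho_{0,1}^{\delta}(\rho_{1,1}^{\delta})^{-1}(\rho_{0,1}^{\delta})^{\dagger}\right)\hat{\sigma}_\varepsilon^{-\frac{1}{2}}, \tag{7.6.238}\\
R &\coloneqq \operatorname{Re}\!\left[(\rho_{1,1}^{\delta})^{-1}(\rho_{0,1}^{\delta})^{\dagger}(\hat{\sigma}_\varepsilon)^{-1}(\rho_{0,1}^{\delta})\right], \tag{7.6.239}
\end{align}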
where 𝐺 in Lemma 7.50 is defined from 𝐴 and 𝐵 above. The inequality in (7.6.240)
in turn implies the following operator inequalities:
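\begin{equation}
e^{-i\sqrt{\varepsilon}G}L(\varepsilon)e^{i\sqrt{\varepsilon}G} - o(\varepsilon)\mathbb{1} \leq K(\varepsilon) \leq e^{-i\sqrt{\varepsilon}G}L(\varepsilon)e^{i\sqrt{\varepsilon}G} + o(\varepsilon)\mathbb{1}. \tag{7.6.241}
\end{equation}
Observe that
\begin{equation}
e^{-i\sqrt{\varepsilon}G}L(\varepsilon)e^{i\sqrt{\varepsilon}G} + o(\varepsilon)\mathbb{1} = e^{-i\sqrt{\varepsilon}G}\left[L(\varepsilon) + o(\varepsilon)\mathbb{1}\right]e^{i\sqrt{\varepsilon}G}. \tag{7.6.242}
\end{equation}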
Now invoking these and the operator monotonicity of the function 𝑥 𝛼 for 𝛼 ∈ (0, 1),
we find that
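\begin{align}
&\operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] \tag{7.6.243}\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}\left(K(\varepsilon)\right)^{\alpha}\right] \tag{7.6.244}\\
&\leq \operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}\left(e^{-i\sqrt{\varepsilon}G}\left[L(\varepsilon) + o(\varepsilon)\mathbb{1}\right]e^{i\sqrt{\varepsilon}G}\right)^{\alpha}\right] \tag{7.6.245}\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}e^{-i\sqrt{\varepsilon}G}\left(L(\varepsilon) + o(\varepsilon)\mathbb{1}\right)^{\alpha}e^{i\sqrt{\varepsilon}G}\right]. \tag{7.6.246}
\end{align}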
Consider that
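\begin{align}
&\left(L(\varepsilon) + o(\varepsilon)\mathbb{1}\right)^{\alpha} \notag\\
&= \begin{pmatrix} \varepsilon S(\rho_\delta, \hat{\sigma}_\varepsilon) + o(\varepsilon)\mathbb{1} & 0 \\ 0 & \rho_{1,1}^{\delta} + \varepsilon R + o(\varepsilon)\mathbb{1} \end{pmatrix}^{\alpha} \tag{7.6.247}\\
&= \begin{pmatrix} \left(\varepsilon S(\rho_\delta, \hat{\sigma}_\varepsilon) + o(\varepsilon)\mathbb{1}\right)^{\alpha} & 0 \\ 0 & \left(\rho_{1,1}^{\delta} + \varepsilon R + o(\varepsilon)\mathbb{1}\right)^{\alpha} \end{pmatrix} \tag{7.6.248}\\
&= \begin{pmatrix} \varepsilon^{\alpha}\left(S(\rho_\delta, \hat{\sigma}_\varepsilon) + o(1)\mathbb{1}\right)^{\alpha} & 0 \\ 0 & \left(\rho_{1,1}^{\delta} + \varepsilon R + o(\varepsilon)\mathbb{1}\right)^{\alpha} \end{pmatrix}. \tag{7.6.249}
\end{align}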
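Now expanding $e^{i\sqrt{\varepsilon}G}$ to first order in order to evaluate (7.6.246) (higher order
terms will end up being irrelevant), we find that
\begin{align}
&\operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}e^{-i\sqrt{\varepsilon}G}\left(L(\varepsilon) + o(\varepsilon)\mathbb{1}\right)^{\alpha}e^{i\sqrt{\varepsilon}G}\right] \notag\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}\left(L(\varepsilon) + o(\varepsilon)\mathbb{1}\right)^{\alpha}\right] \notag\\
&\quad + \operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}\left(-i\sqrt{\varepsilon}G\right)\left(L(\varepsilon) + o(\varepsilon)\mathbb{1}\right)^{\alpha}\right] \notag\\
&\quad + \operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}\left(L(\varepsilon) + o(\varepsilon)\mathbb{1}\right)^{\alpha}\left(i\sqrt{\varepsilon}G\right)\right] + o(1) \tag{7.6.250}\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}\begin{pmatrix} \left(S(\rho_\delta, \hat{\sigma}_\varepsilon) + o(1)\mathbb{1}\right)^{\alpha} & 0 \\ 0 & \left(\rho_{1,1}^{\delta} + \varepsilon R + o(\varepsilon)\mathbb{1}\right)^{\alpha} \end{pmatrix}\right] \notag\\
&\quad - i\sqrt{\varepsilon}\operatorname{Tr}\!\left[\begin{pmatrix} \left(S(\rho_\delta, \hat{\sigma}_\varepsilon) + o(1)\mathbb{1}\right)^{\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \left(\rho_{1,1}^{\delta} + \varepsilon R + o(\varepsilon)\mathbb{1}\right)^{\alpha}\varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}G\right] \notag\\
&\quad + i\sqrt{\varepsilon}\operatorname{Tr}\!\left[\begin{pmatrix} \hat{\sigma}_\varepsilon\left(S(\rho_\delta, \hat{\sigma}_\varepsilon) + o(1)\mathbb{1}\right)^{\alpha} & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp}\left(\rho_{1,1}^{\delta} + \varepsilon R + o(\varepsilon)\mathbb{1}\right)^{\alpha} \end{pmatrix}G\right] + o(1) \tag{7.6.251}\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \hat{\sigma}_\varepsilon\left(S(\rho_\delta, \hat{\sigma}_\varepsilon) + o(1)\mathbb{1}\right)^{\alpha} & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp}\left(\rho_{1,1}^{\delta} + \varepsilon R + o(\varepsilon)\mathbb{1}\right)^{\alpha} \end{pmatrix}\right] + o(1) \tag{7.6.252}\\
&= \operatorname{Tr}\!\left[\hat{\sigma}_\varepsilon\left(S(\rho_\delta, \hat{\sigma}_\varepsilon) + o(1)\mathbb{1}\right)^{\alpha}\right] + \varepsilon^{1-\alpha}\operatorname{Tr}\!\left[\Pi_\sigma^{\perp}\left(\rho_{1,1}^{\delta} + \varepsilon R + o(\varepsilon)\mathbb{1}\right)^{\alpha}\right] + o(1). \tag{7.6.253}
\end{align}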
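By observing the last line, we see that higher order terms for $e^{i\sqrt{\varepsilon}G}$ include prefactors
of 𝜀 (or higher powers), which vanish in the 𝜀 → 0+ limit. Now taking the limit
𝜀 → 0+, we find that
\begin{align}
&\lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}e^{-i\sqrt{\varepsilon}G}\left(L(\varepsilon) + o(\varepsilon)\mathbb{1}\right)^{\alpha}e^{i\sqrt{\varepsilon}G}\right] \notag\\
&= \operatorname{Tr}\!\left[\sigma\left(\sigma^{-\frac{1}{2}}\left(\rho_{0,0}^{\delta} - \rho_{0,1}^{\delta}(\rho_{1,1}^{\delta})^{-1}(\rho_{0,1}^{\delta})^{\dagger}\right)\sigma^{-\frac{1}{2}}\right)^{\alpha}\right], \tag{7.6.254}
\end{align}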
where the inverses are taken on the support of 𝜎. By proceeding in a similar way,
but using the lower bound in (7.6.241), we find the following lower bound on
(7.6.243):
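\begin{equation}
\operatorname{Tr}\!\left[\begin{pmatrix} \varepsilon^{-\alpha}\hat{\sigma}_\varepsilon & 0 \\ 0 & \varepsilon^{1-\alpha}\Pi_\sigma^{\perp} \end{pmatrix}e^{-i\sqrt{\varepsilon}G}\left(L(\varepsilon) - o(\varepsilon)\mathbb{1}\right)^{\alpha}e^{i\sqrt{\varepsilon}G}\right]. \tag{7.6.255}
\end{equation}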
Then by the same argument above, the lower bound on (7.6.243) after taking the
limit 𝜀 → 0+ is the same as in (7.6.254). So we conclude that
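\begin{equation}
\lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] = \operatorname{Tr}\!\left[\sigma\left(\sigma^{-\frac{1}{2}}\left(\rho_{0,0}^{\delta} - \rho_{0,1}^{\delta}(\rho_{1,1}^{\delta})^{-1}(\rho_{0,1}^{\delta})^{\dagger}\right)\sigma^{-\frac{1}{2}}\right)^{\alpha}\right]. \tag{7.6.256}
\end{equation}
Now consider that
\begin{equation}
\lim_{\delta\to 0^+}\left(\rho_{0,0}^{\delta} - \rho_{0,1}^{\delta}(\rho_{1,1}^{\delta})^{-1}(\rho_{0,1}^{\delta})^{\dagger}\right) = \rho_{0,0} - \rho_{0,1}\rho_{1,1}^{-1}\rho_{0,1}^{\dagger}, \tag{7.6.257}
\end{equation}
where the inverse on the right is taken on the support of $\rho_{1,1}$. This follows because
the image of $\rho_{0,1}^{\dagger}$ is contained in the support of $\rho_{1,1}$. Thus, we take the limit 𝛿 → 0+,
and find that
\begin{equation}
\lim_{\delta\to 0^+}\lim_{\varepsilon\to 0^+} \operatorname{Tr}\!\left[\sigma_\varepsilon\left(\sigma_\varepsilon^{-\frac{1}{2}}\rho_\delta\,\sigma_\varepsilon^{-\frac{1}{2}}\right)^{\alpha}\right] = \operatorname{Tr}\!\left[\sigma\left(\sigma^{-\frac{1}{2}}\left(\rho_{0,0} - \rho_{0,1}\rho_{1,1}^{-1}\rho_{0,1}^{\dagger}\right)\sigma^{-\frac{1}{2}}\right)^{\alpha}\right], \tag{7.6.258}
\end{equation}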
where all inverses are taken on the support. This concludes the proof.
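The Schur-complement expression in (7.6.258) can be compared against a direct evaluation of the regularized quantity at a small value of 𝜀. The sketch below assumes NumPy, takes a fixed full-rank qutrit state 𝜌 and a rank-two 𝜎 supported on the first two basis vectors (both chosen only for illustration), and restricts to 𝛼 ≤ 1/2 so that the 𝜀^{1−𝛼} correction is small at 𝜀 = 10⁻⁹.

```python
import numpy as np

def mpow(a, p):
    """Real power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def regularized(rho, sigma, alpha, eps):
    """Tr[sigma_eps (sigma_eps^{-1/2} rho sigma_eps^{-1/2})^alpha], sigma_eps = sigma + eps*1."""
    s_eps = sigma + eps * np.eye(rho.shape[0])
    smh = mpow(s_eps, -0.5)
    inner = smh @ rho @ smh
    inner = (inner + inner.conj().T) / 2
    return np.trace(s_eps @ mpow(inner, alpha)).real

def schur_formula(rho, sigma_block, k, alpha):
    """Tr[sigma (sigma^{-1/2} rho_tilde sigma^{-1/2})^alpha], supp(sigma) = first k coordinates."""
    r00, r01, r11 = rho[:k, :k], rho[:k, k:], rho[k:, k:]
    rho_tilde = r00 - r01 @ np.linalg.inv(r11) @ r01.conj().T  # Schur complement
    smh = mpow(sigma_block, -0.5)
    inner = smh @ rho_tilde @ smh
    inner = (inner + inner.conj().T) / 2
    return np.trace(sigma_block @ mpow(inner, alpha)).real

# Full-rank state with nonzero off-diagonal block, so supp(rho) is not inside supp(sigma).
rho = np.array([[0.40, 0.10, 0.10],
                [0.10, 0.30, 0.05],
                [0.10, 0.05, 0.30]])
sigma_block = np.array([[0.5, 0.1],
                        [0.1, 0.3]])
sigma = np.zeros((3, 3))
sigma[:2, :2] = sigma_block

for alpha in (0.3, 0.5):
    print(alpha, regularized(rho, sigma, alpha, 1e-9), schur_formula(rho, sigma_block, 2, alpha))
```

Values of 𝛼 closer to 1 converge more slowly in 𝜀, since the leading correction scales like 𝜀^{1−𝛼}, so a much smaller 𝜀 (and correspondingly more careful arithmetic) would be needed to see agreement there.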
Lemma 7.50
Let 𝐴 be an invertible Hermitian operator, 𝐵 a linear operator, 𝐶 a Hermitian
operator, and let 𝜀 > 0. Then with
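\begin{align}
M(\varepsilon) &\coloneqq \begin{pmatrix} A & \varepsilon B \\ \varepsilon B^{\dagger} & \varepsilon^2 C \end{pmatrix}, \tag{7.6.259}\\
D(\varepsilon) &\coloneqq \begin{pmatrix} A + \varepsilon^2\operatorname{Re}[A^{-1}BB^{\dagger}] & 0 \\ 0 & \varepsilon^2\left(C - B^{\dagger}A^{-1}B\right) \end{pmatrix}, \tag{7.6.260}\\
G &\coloneqq \begin{pmatrix} 0 & -iA^{-1}B \\ iB^{\dagger}A^{-1} & 0 \end{pmatrix}, \tag{7.6.261}
\end{align}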
𝑖𝜀 [𝐺 𝑀 (𝜀) − 𝑀 (𝜀)𝐺]
−𝑖𝜀 𝐴−1 𝐵𝐵† 𝑖𝜀𝐵𝐵† 𝐴−1
𝑜(𝜀) −𝑖𝐵
= 𝑖𝜀 − (7.6.269)
𝑖𝐵† 𝑖𝜀𝐵† 𝐴−1 𝐵 𝑜(𝜀) −𝑖𝜀𝐵† 𝐴−1 𝐵
2𝜀 Re[ 𝐴−1 𝐵𝐵† ] −𝜀𝐵 + 𝑜(𝜀 2 )
2
= . (7.6.270)
−𝜀𝐵† + 𝑜(𝜀 2 ) −2𝜀 2 𝐵† 𝐴−1 𝐵
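\begin{align}
G M(\varepsilon) G &= \begin{pmatrix} o(1) & o(\varepsilon) \\ i B^\dagger & o(1) \end{pmatrix} \begin{pmatrix} 0 & -i A^{-1} B \\ i B^\dagger A^{-1} & 0 \end{pmatrix} \tag{7.6.271} \\
&= \begin{pmatrix} o(\varepsilon) & o(1) \\ o(1) & B^\dagger A^{-1} B \end{pmatrix}, \tag{7.6.272}
\end{align}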
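So then
\begin{align}
& \left( I + i\varepsilon G - \frac{\varepsilon^2}{2} G^2 \right) M(\varepsilon) \left( I - i\varepsilon G - \frac{\varepsilon^2}{2} G^2 \right) \notag \\
&= M(\varepsilon) + i\varepsilon \left[ G M(\varepsilon) - M(\varepsilon) G \right] + \varepsilon^2 \left( G M(\varepsilon) G - \frac{1}{2} G^2 M(\varepsilon) - \frac{1}{2} M(\varepsilon) G^2 \right) + o(\varepsilon^2) \tag{7.6.280} \\
&= \begin{pmatrix} A & \varepsilon B \\ \varepsilon B^\dagger & \varepsilon^2 C \end{pmatrix} + \begin{pmatrix} 2\varepsilon^2 \operatorname{Re}[A^{-1} B B^\dagger] & -\varepsilon B + o(\varepsilon^2) \\ -\varepsilon B^\dagger + o(\varepsilon^2) & -2\varepsilon^2 B^\dagger A^{-1} B \end{pmatrix} \notag \\
&\qquad + \begin{pmatrix} -\varepsilon^2 \operatorname{Re}[A^{-1} B B^\dagger] + o(\varepsilon^3) & o(\varepsilon^2) \\ o(\varepsilon^2) & \varepsilon^2 B^\dagger A^{-1} B + o(\varepsilon^3) \end{pmatrix} + o(\varepsilon^2) \tag{7.6.281} \\
&= \begin{pmatrix} A + \varepsilon^2 \operatorname{Re}[A^{-1} B B^\dagger] & 0 \\ 0 & \varepsilon^2 \left( C - B^\dagger A^{-1} B \right) \end{pmatrix} + o(\varepsilon^2) \tag{7.6.282}
\end{align}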
So we conclude that
where the inverse 𝜎 −1 is taken on the support of 𝜎 and the logarithm is evaluated
on the support of 𝜌.
3The name Staszewski is pronounced Stah·shev·ski, with emphasis on the second syllable.
Proposition 7.52
Let $\rho$ be a state and $\sigma$ a positive semi-definite operator. Then, in the limit $\alpha \to 1$, the geometric Rényi relative entropy converges to the Belavkin–Staszewski relative entropy:
\[
\lim_{\alpha \to 1} \widehat{D}_\alpha(\rho\|\sigma) = \widehat{D}(\rho\|\sigma). \tag{7.7.2}
\]
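Proof: Suppose at first that $\operatorname{supp}(\rho) \subseteq \operatorname{supp}(\sigma)$. Then $\widehat{D}_\alpha(\rho\|\sigma)$ is finite for all $\alpha \in (0,1) \cup (1,\infty)$, and we can write the following explicit formula for the geometric Rényi relative entropy by employing Proposition 7.40:
\begin{align}
\widehat{D}_\alpha(\rho\|\sigma) &= \frac{1}{\alpha-1} \log_2 \widehat{Q}_\alpha(\rho\|\sigma) \tag{7.7.3} \\
&= \frac{1}{\alpha-1} \log_2 \operatorname{Tr}\!\left[ \sigma \left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right)^{\alpha} \right]. \tag{7.7.4}
\end{align}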
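Our assumption implies that $\operatorname{Tr}[\Pi_\sigma \rho] = 1$, and we find that
\begin{align}
\widehat{Q}_1(\rho\|\sigma) &= \operatorname{Tr}\!\left[ \sigma \left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right) \right] \tag{7.7.5} \\
&= \operatorname{Tr}[\Pi_\sigma \rho] \tag{7.7.6} \\
&= 1. \tag{7.7.7}
\end{align}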
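\[
\widehat{D}_\alpha(\rho\|\sigma) = \frac{\log_2 \widehat{Q}_\alpha(\rho\|\sigma) - \log_2 \widehat{Q}_1(\rho\|\sigma)}{\alpha - 1}, \tag{7.7.8}
\]
so that
\begin{align}
\lim_{\alpha \to 1} \widehat{D}_\alpha(\rho\|\sigma) &= \lim_{\alpha \to 1} \frac{\log_2 \widehat{Q}_\alpha(\rho\|\sigma) - \log_2 \widehat{Q}_1(\rho\|\sigma)}{\alpha - 1} \tag{7.7.9} \\
&= \left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \log_2 \widehat{Q}_\alpha(\rho\|\sigma) \right|_{\alpha=1} \tag{7.7.10} \\
&= \frac{1}{\ln(2)} \frac{\left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \widehat{Q}_\alpha(\rho\|\sigma) \right|_{\alpha=1}}{\widehat{Q}_1(\rho\|\sigma)} \tag{7.7.11} \\
&= \frac{1}{\ln(2)} \left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \widehat{Q}_\alpha(\rho\|\sigma) \right|_{\alpha=1}. \tag{7.7.12}
\end{align}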
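Then
\begin{align}
\left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \widehat{Q}_\alpha(\rho\|\sigma) \right|_{\alpha=1} &= \left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \operatorname{Tr}\!\left[ \sigma \left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right)^{\alpha} \right] \right|_{\alpha=1} \notag \\
&= \operatorname{Tr}\!\left[ \sigma \left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right)^{\alpha} \right|_{\alpha=1} \right].
\end{align}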
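it follows that
\begin{align}
\left. \frac{\mathrm{d}}{\mathrm{d}\alpha} X^\alpha \right|_{\alpha=1} &= \left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \sum_z \nu_z^\alpha \Pi_z \right|_{\alpha=1} \tag{7.7.14} \\
&= \sum_z \left( \left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \nu_z^\alpha \right|_{\alpha=1} \right) \Pi_z \tag{7.7.15} \\
&= \sum_z \left( \left. \nu_z^\alpha \ln \nu_z \right|_{\alpha=1} \right) \Pi_z \tag{7.7.16} \\
&= \sum_z \left( \nu_z \ln \nu_z \right) \Pi_z \tag{7.7.17} \\
&= X \ln_* X, \tag{7.7.18}
\end{align}
where
\[
\ln_*(x) := \begin{cases} \ln(x) & x > 0 \\ 0 & x = 0 \end{cases}. \tag{7.7.19}
\]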
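Thus we find that
\begin{align}
& \left. \frac{\mathrm{d}}{\mathrm{d}\alpha} \operatorname{Tr}\!\left[ \sigma \left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right)^{\alpha} \right] \right|_{\alpha=1} \notag \\
&= \operatorname{Tr}\!\left[ \sigma\, \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \ln_*\!\left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right) \right] \tag{7.7.20} \\
&= \operatorname{Tr}\!\left[ \sigma^{\frac{1}{2}} \rho^{\frac{1}{2}} \rho^{\frac{1}{2}} \sigma^{-\frac{1}{2}} \ln_*\!\left( \sigma^{-\frac{1}{2}} \rho^{\frac{1}{2}} \rho^{\frac{1}{2}} \sigma^{-\frac{1}{2}} \right) \right] \tag{7.7.21} \\
&= \operatorname{Tr}\!\left[ \sigma^{\frac{1}{2}} \rho^{\frac{1}{2}} \ln_*\!\left( \rho^{\frac{1}{2}} \sigma^{-\frac{1}{2}} \sigma^{-\frac{1}{2}} \rho^{\frac{1}{2}} \right) \rho^{\frac{1}{2}} \sigma^{-\frac{1}{2}} \right] \tag{7.7.22} \\
&= \operatorname{Tr}\!\left[ \rho^{\frac{1}{2}} \Pi_\sigma \rho^{\frac{1}{2}} \ln_*\!\left( \rho^{\frac{1}{2}} \sigma^{-\frac{1}{2}} \sigma^{-\frac{1}{2}} \rho^{\frac{1}{2}} \right) \right] \tag{7.7.23} \\
&= \operatorname{Tr}\!\left[ \rho \ln\!\left( \rho^{\frac{1}{2}} \sigma^{-1} \rho^{\frac{1}{2}} \right) \right]. \tag{7.7.24}
\end{align}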
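The third equality follows from Lemma 2.5. The final equality follows from the assumption $\operatorname{supp}(\rho) \subseteq \operatorname{supp}(\sigma)$ and by applying the interpretation of the logarithm exactly as stated in Definition 7.51. Then we find that
\[
\lim_{\alpha \to 1} \widehat{D}_\alpha(\rho\|\sigma) = \operatorname{Tr}\!\left[ \rho \log_2\!\left( \rho^{\frac{1}{2}} \sigma^{-1} \rho^{\frac{1}{2}} \right) \right], \tag{7.7.25}
\]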
Therefore,
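To conclude, we have established that $\lim_{\alpha \to 1^+} \widehat{D}_\alpha(\rho\|\sigma) = \lim_{\alpha \to 1^-} \widehat{D}_\alpha(\rho\|\sigma) = \widehat{D}(\rho\|\sigma)$, which means that
\[
\lim_{\alpha \to 1} \widehat{D}_\alpha(\rho\|\sigma) = \widehat{D}(\rho\|\sigma), \tag{7.7.28}
\]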
as required. ■
The following inequality relates the quantum relative entropy to the Belavkin–
Staszewski relative entropy:
Proposition 7.53
Let 𝜌 be a state and 𝜎 a positive semi-definite operator. Then the quantum
relative entropy is never larger than the Belavkin–Staszewski relative entropy:
\[
D(\rho\|\sigma) \leq \widehat{D}(\rho\|\sigma). \tag{7.7.29}
\]
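Proof: If $\operatorname{supp}(\rho) \not\subseteq \operatorname{supp}(\sigma)$, then there is nothing to prove because
\[
D(\rho\|\sigma) = \widehat{D}(\rho\|\sigma) = +\infty, \tag{7.7.30}
\]
and so the inequality in (7.7.29) holds trivially. So let us suppose instead that $\operatorname{supp}(\rho) \subseteq \operatorname{supp}(\sigma)$. From Propositions 7.42 and 7.40, we conclude for all $\alpha \in (0,1) \cup (1,\infty)$ that
\[
\widetilde{D}_\alpha(\rho\|\sigma) \leq \widehat{D}_\alpha(\rho\|\sigma). \tag{7.7.31}
\]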
Thus, applying the limit 𝛼 → 1 to (7.7.31) and the two equalities above, we
conclude (7.7.29). ■
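The convergence in Proposition 7.52 and the inequality in Proposition 7.53 can be checked numerically on a small example. The following is only a sketch under stated assumptions: the qubit operators `rho` and `sigma` are illustrative choices (real symmetric and full rank, so all inverses exist), and the helpers `fun2`, `mul`, and `tr` are hypothetical names introduced here, not notation from the text.

```python
import math

def fun2(m, f):
    """Apply scalar function f to a real symmetric 2x2 matrix via its eigendecomposition."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    mean, rad = (a + d) / 2, math.hypot((a - d) / 2, b)
    hi, lo = f(mean + rad), f(mean - rad)  # f evaluated on the two eigenvalues
    off = (hi - lo) * c * s
    return [[hi * c * c + lo * s * s, off], [off, hi * s * s + lo * c * c]]

def mul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def tr(x):
    return x[0][0] + x[1][1]

def geometric_renyi(rho, sigma, alpha):
    """(1/(alpha-1)) log2 Tr[sigma (sigma^{-1/2} rho sigma^{-1/2})^alpha], as in (7.7.4)."""
    s = fun2(sigma, lambda x: x ** -0.5)
    core = mul(mul(s, rho), s)
    return math.log2(tr(mul(sigma, fun2(core, lambda x: x ** alpha)))) / (alpha - 1)

def belavkin_staszewski(rho, sigma):
    """Tr[rho log2(rho^{1/2} sigma^{-1} rho^{1/2})], as in (7.7.25)."""
    r = fun2(rho, math.sqrt)
    core = mul(mul(r, fun2(sigma, lambda x: 1 / x)), r)
    return tr(mul(rho, fun2(core, math.log2)))

def umegaki(rho, sigma):
    """Quantum relative entropy Tr[rho (log2 rho - log2 sigma)]."""
    return tr(mul(rho, fun2(rho, math.log2))) - tr(mul(rho, fun2(sigma, math.log2)))

# Illustrative (assumed) inputs: a full-rank qubit state and a full-rank diagonal state.
rho = [[0.75, 0.25], [0.25, 0.25]]
sigma = [[0.6, 0.0], [0.0, 0.4]]

d_hat = belavkin_staszewski(rho, sigma)
for alpha in (0.9, 0.999, 1.001, 1.1):
    print(alpha, geometric_renyi(rho, sigma, alpha))
print("Belavkin-Staszewski:", d_hat, " quantum relative entropy:", umegaki(rho, sigma))
```

For these inputs the printed geometric Rényi values should approach the Belavkin–Staszewski value as $\alpha$ approaches 1 from either side, and the quantum relative entropy should not exceed it.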
Similar to what was shown in Proposition 7.2, Definition 7.51 is consistent with
the following limit:
Proposition 7.54
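For every state $\rho$ and positive semi-definite operator $\sigma$, the following limit holds
\[
\widehat{D}(\rho\|\sigma) = \lim_{\varepsilon \to 0^+} \lim_{\delta \to 0^+} \operatorname{Tr}\!\left[ \rho_\delta \log_2\!\left( \rho_\delta^{\frac{1}{2}} \sigma_\varepsilon^{-1} \rho_\delta^{\frac{1}{2}} \right) \right], \tag{7.7.34}
\]
\[
\rho_\delta := (1-\delta)\rho + \delta\pi, \qquad \sigma_\varepsilon := \sigma + \varepsilon \mathbb{1}, \tag{7.7.35}
\]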
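where the second equality follows from applying Lemma 2.5 with $f = \log_2$ and $L = \rho_\delta^{\frac{1}{2}} \sigma_\varepsilon^{-\frac{1}{2}}$. The second-to-last equality follows because $\sigma_\varepsilon^{-\frac{1}{2}} \rho_\delta \sigma_\varepsilon^{-\frac{1}{2}}$ commutes with $\log_2(\sigma_\varepsilon^{-\frac{1}{2}} \rho_\delta \sigma_\varepsilon^{-\frac{1}{2}})$, and by employing cyclicity of trace. In the last line, we made use of the following function: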
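defined for all $x \in [0,\infty)$ with $\eta(0) = 0$. By appealing to the continuity of the function $\eta(x)$ on $x \in [0,\infty)$ and the fact that $\lim_{\delta \to 0^+} \rho_\delta = \rho_{0,0}$, we find that
\[
\lim_{\delta \to 0^+} \operatorname{Tr}\!\left[ \sigma_\varepsilon\, \eta\!\left( \sigma_\varepsilon^{-\frac{1}{2}} \rho_\delta \sigma_\varepsilon^{-\frac{1}{2}} \right) \right] = \operatorname{Tr}\!\left[ \sigma_\varepsilon\, \eta\!\left( \sigma_\varepsilon^{-\frac{1}{2}} \rho_{0,0} \sigma_\varepsilon^{-\frac{1}{2}} \right) \right]. \tag{7.7.43}
\]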
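Now recall the function $\log_{2,*}$ defined in (7.7.19). Using it, we can write
\begin{align}
& \operatorname{Tr}\!\left[ \sigma_\varepsilon\, \eta\!\left( \sigma_\varepsilon^{-\frac{1}{2}} \rho_{0,0} \sigma_\varepsilon^{-\frac{1}{2}} \right) \right] \notag \\
&= \operatorname{Tr}\!\left[ \sigma_\varepsilon\, \sigma_\varepsilon^{-\frac{1}{2}} \rho_{0,0} \sigma_\varepsilon^{-\frac{1}{2}} \log_{2,*}\!\left( \sigma_\varepsilon^{-\frac{1}{2}} \rho_{0,0} \sigma_\varepsilon^{-\frac{1}{2}} \right) \right] \tag{7.7.44} \\
&= \operatorname{Tr}\!\left[ \sigma_\varepsilon^{\frac{1}{2}} \rho_{0,0}^{\frac{1}{2}} \rho_{0,0}^{\frac{1}{2}} \sigma_\varepsilon^{-\frac{1}{2}} \log_{2,*}\!\left( \sigma_\varepsilon^{-\frac{1}{2}} \rho_{0,0}^{\frac{1}{2}} \rho_{0,0}^{\frac{1}{2}} \sigma_\varepsilon^{-\frac{1}{2}} \right) \right] \tag{7.7.45} \\
&= \operatorname{Tr}\!\left[ \sigma_\varepsilon^{\frac{1}{2}} \rho_{0,0}^{\frac{1}{2}} \log_{2,*}\!\left( \rho_{0,0}^{\frac{1}{2}} \sigma_\varepsilon^{-\frac{1}{2}} \sigma_\varepsilon^{-\frac{1}{2}} \rho_{0,0}^{\frac{1}{2}} \right) \rho_{0,0}^{\frac{1}{2}} \sigma_\varepsilon^{-\frac{1}{2}} \right] \tag{7.7.46} \\
&= \operatorname{Tr}\!\left[ \rho_{0,0} \log_{2,*}\!\left( \rho_{0,0}^{\frac{1}{2}} \sigma_\varepsilon^{-1} \rho_{0,0}^{\frac{1}{2}} \right) \right] \tag{7.7.47} \\
&= \operatorname{Tr}\!\left[ \rho_{0,0} \log_{2,*}\!\left( \rho_{0,0}^{\frac{1}{2}} (\sigma + \varepsilon\Pi_\sigma)^{-1} \rho_{0,0}^{\frac{1}{2}} \right) \right], \tag{7.7.48}
\end{align}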
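\begin{align}
&= \begin{pmatrix} \rho_{0,0}^{\frac{1}{2}} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} (\sigma + \varepsilon\Pi_\sigma)^{-1} & 0 \\ 0 & \varepsilon^{-1} \Pi_\sigma^\perp \end{pmatrix} \begin{pmatrix} \rho_{0,0}^{\frac{1}{2}} & 0 \\ 0 & 0 \end{pmatrix} \tag{7.7.49} \\
&= \begin{pmatrix} \rho_{0,0}^{\frac{1}{2}} (\sigma + \varepsilon\Pi_\sigma)^{-1} \rho_{0,0}^{\frac{1}{2}} & 0 \\ 0 & 0 \end{pmatrix}. \tag{7.7.50}
\end{align}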
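Now taking the limit as $\varepsilon \to 0^+$, and appealing to continuity of $\log_{2,*}(x)$ and $x^{-1}$ for $x > 0$, we find that
\begin{align}
& \lim_{\varepsilon \to 0^+} \operatorname{Tr}\!\left[ \rho_{0,0} \log_{2,*}\!\left( \rho_{0,0}^{\frac{1}{2}} (\sigma + \varepsilon\Pi_\sigma)^{-1} \rho_{0,0}^{\frac{1}{2}} \right) \right] \notag \\
&= \operatorname{Tr}\!\left[ \rho_{0,0} \log_{2,*}\!\left( \rho_{0,0}^{\frac{1}{2}} \sigma^{-1} \rho_{0,0}^{\frac{1}{2}} \right) \right] \tag{7.7.51} \\
&= \operatorname{Tr}\!\left[ \rho \log_2\!\left( \rho^{\frac{1}{2}} \sigma^{-1} \rho^{\frac{1}{2}} \right) \right] \tag{7.7.52}
\end{align}
where the formula in the last line is interpreted exactly as stated in Definition 7.51.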
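Thus, we conclude that
\[
\lim_{\varepsilon \to 0^+} \lim_{\delta \to 0^+} \operatorname{Tr}\!\left[ \rho_\delta \log_2\!\left( \rho_\delta^{\frac{1}{2}} \sigma_\varepsilon^{-1} \rho_\delta^{\frac{1}{2}} \right) \right] = \operatorname{Tr}\!\left[ \rho \log_2\!\left( \rho^{\frac{1}{2}} \sigma^{-1} \rho^{\frac{1}{2}} \right) \right]. \tag{7.7.53}
\]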
3. Additivity:
\[
\widehat{D}(\rho_1 \otimes \rho_2 \| \sigma_1 \otimes \sigma_2) = \widehat{D}(\rho_1\|\sigma_1) + \widehat{D}(\rho_2\|\sigma_2). \tag{7.7.60}
\]
where
\begin{align}
\rho_{XA} &:= \sum_{x \in \mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \rho_A^x, \tag{7.7.63} \\
\sigma_{XA} &:= \sum_{x \in \mathcal{X}} q(x) |x\rangle\!\langle x|_X \otimes \sigma_A^x. \tag{7.7.64}
\end{align}
Proof:
1. Isometric invariance is a direct consequence of Propositions 7.44 and 7.52.
2. All of the properties in the second item follow from data processing (Corol-
lary 7.55).
(a) Applying the trace-out channel, we find that
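\begin{align}
\widehat{D}(\rho\|\sigma) &\geq \widehat{D}(\operatorname{Tr}[\rho] \,\|\, \operatorname{Tr}[\sigma]) \tag{7.7.65} \\
&= \operatorname{Tr}[\rho] \log_2(\operatorname{Tr}[\rho]/\operatorname{Tr}[\sigma]) \tag{7.7.66} \\
&= -\log_2 \operatorname{Tr}[\sigma] \tag{7.7.67} \\
&\geq 0. \tag{7.7.68}
\end{align}
\[
\hat{\sigma} := |0\rangle\!\langle 0| \otimes \rho + |1\rangle\!\langle 1| \otimes (\sigma - \rho). \tag{7.7.69}
\]
where the inequality follows from data processing by tracing out the first classical register of $\hat{\rho}$ and $\hat{\sigma}$.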
(d) If 𝜎 ≤ 𝜎′, then the operator 𝜎′ − 𝜎 is positive semi-definite and so is the
following one:
where the inequality follows from data processing by tracing out the first classical register of $\hat{\rho}$ and $\hat{\sigma}$.
A statement similar to that made by Proposition 7.48 holds for the Belavkin–
Staszewski relative entropy:
Proof: The proof is very similar to the proof of Proposition 7.48, and so we use
the same notation to provide a brief proof. By following the same reasoning that
leads to (7.6.170), it follows that
The optimal choices of 𝑝, 𝑞, and P saturating the inequality in (7.7.74) are again
given by (7.6.171)–(7.6.173). Consider for those choices that
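\begin{align}
\sum_x p(x) \log_2\!\left( \frac{p(x)}{q(x)} \right) &= \sum_x p(x) \log_2(\lambda_x) \tag{7.7.75} \\
&= \sum_x \lambda_x q(x) \log_2(\lambda_x) \tag{7.7.76} \\
&= \sum_x \lambda_x \operatorname{Tr}[\Pi_x \sigma] \log_2(\lambda_x) \tag{7.7.77} \\
&= \operatorname{Tr}\!\left[ \sigma \left( \sum_x \lambda_x \log_2(\lambda_x) \Pi_x \right) \right] \tag{7.7.78} \\
&= \operatorname{Tr}\!\left[ \sigma\, \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \log_2\!\left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right) \right] \tag{7.7.79} \\
&= \operatorname{Tr}\!\left[ \rho \log_2\!\left( \rho^{\frac{1}{2}} \sigma^{-1} \rho^{\frac{1}{2}} \right) \right], \tag{7.7.80}
\end{align}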
where the last equality follows from reasoning similar to that used to justify
(7.7.20)–(7.7.24). Then by following the reasoning at the end of the proof of
Proposition 7.48, we conclude (7.7.73). ■
The second-to-last equality demonstrates that 𝐷 max (𝜌∥𝜎) can be calculated using a
semi-definite program (SDP) (see Section 2.4). Indeed, the optimization in (7.8.4)
can be cast in the standard form in (2.4.4), i.e.,
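\[
\inf\{\lambda : \rho \leq \lambda\sigma\} = \begin{array}{ll} \text{infimum} & \operatorname{Tr}[BY] \\ \text{subject to} & \Phi^\dagger(Y) \geq A, \\ & Y \geq 0, \end{array} \tag{7.8.7}
\]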
with 𝑌 ≡ 𝜆, 𝐴 ≡ 𝜌, 𝐵 ≡ 1, and Φ† (𝑌 ) = 𝑌 𝜎. (Note that taking the trace on both
sides of the constraint 𝜌 ≤ 𝜆𝜎 in (7.8.4) results in 𝜆 ≥ 1/Tr[𝜎], so that 𝜆 ≥ 0.)
The final equality in (7.8.6) results from calculating the SDP dual to that in (7.8.4).
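For small examples the optimization $\inf\{\lambda : \rho \leq \lambda\sigma\}$ can be evaluated without an SDP solver: when $\sigma$ is invertible, the smallest feasible $\lambda$ is the largest eigenvalue of $\sigma^{-1/2}\rho\sigma^{-1/2}$. The sketch below assumes a real symmetric qubit $\rho$ and a diagonal, full-rank $\sigma$; the helper names `eigs2` and `d_max` and the numerical inputs are illustrative assumptions.

```python
import math

def eigs2(m):
    """Eigenvalues (smallest, largest) of a real symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean, rad = (a + d) / 2, math.hypot((a - d) / 2, b)
    return mean - rad, mean + rad

def d_max(rho, s0, s1):
    """D_max(rho || diag(s0, s1)) for a real symmetric qubit rho and s0, s1 > 0.

    Returns (log2 of the optimal lambda, the optimal lambda itself).
    """
    root = math.sqrt(s0 * s1)
    # sigma^{-1/2} rho sigma^{-1/2} for diagonal sigma
    core = [[rho[0][0] / s0, rho[0][1] / root],
            [rho[1][0] / root, rho[1][1] / s1]]
    lam = eigs2(core)[1]
    return math.log2(lam), lam

# Illustrative (assumed) inputs.
rho = [[0.75, 0.25], [0.25, 0.25]]
s0, s1 = 0.6, 0.4

value, lam = d_max(rho, s0, s1)
# Feasibility check: lam * sigma - rho should be positive semi-definite,
# with smallest eigenvalue (numerically) equal to zero at the optimum.
slack = [[lam * s0 - rho[0][0], -rho[0][1]],
         [-rho[1][0], lam * s1 - rho[1][1]]]
print("D_max =", value, " smallest eigenvalue of slack =", eigs2(slack)[0])
```

The slack operator $\lambda\sigma - \rho$ touching zero at the optimum reflects that the constraint $\rho \leq \lambda\sigma$ is tight; any smaller $\lambda$ makes the slack operator indefinite.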
The max-relative entropy has the following alternative representation, which,
when 𝜌 and 𝜎 are states, allows for thinking of it as being related to the largest
weight that one can place on 𝜌 to realize 𝜎 as a probabilistic mixture of 𝜌 and some
other state.
Lemma 7.59
The max-relative entropy 𝐷 max (𝜌∥𝜎) of a state 𝜌 and a positive semi-definite
operator 𝜎 can be written as follows:
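\[
D_{\max}(\rho\|\sigma) = \inf_{\lambda \in \mathbb{R},\, \omega \geq 0} \left\{ \lambda : \sigma = 2^{-\lambda}\rho + \left( 1 - 2^{-\lambda} \right) \omega,\ \operatorname{Tr}[\omega] = 1 \right\}. \tag{7.8.8}
\]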
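\begin{align}
2^{-\mu}\rho + (1 - 2^{-\mu})\,\omega &= 2^{-\mu}\rho + (1 - 2^{-\mu})\, \frac{2^{\mu}\sigma - \rho}{2^{\mu} - 1} \tag{7.8.9} \\
&= 2^{-\mu}\rho + (1 - 2^{-\mu})\, \frac{2^{\mu}(\sigma - 2^{-\mu}\rho)}{2^{\mu} - 1} \tag{7.8.10} \\
&= 2^{-\mu}\rho + \sigma - 2^{-\mu}\rho \tag{7.8.11} \\
&= \sigma. \tag{7.8.12}
\end{align}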
Thus, 𝜇 and 𝜔 satisfy the constraints for the optimization problem in (7.8.8), and
we conclude that
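\[
\mu \geq \inf_{\lambda \in \mathbb{R},\, \omega \geq 0} \left\{ \lambda : \sigma = 2^{-\lambda}\rho + \left( 1 - 2^{-\lambda} \right) \omega,\ \operatorname{Tr}[\omega] = 1 \right\}. \tag{7.8.13}
\]
Now we prove the opposite inequality. Let $\lambda \in \mathbb{R}$ and let $\omega$ be an arbitrary state satisfying $\sigma = 2^{-\lambda}\rho + (1 - 2^{-\lambda})\,\omega$. Then it follows that
\[
\sigma = 2^{-\lambda}\rho + \left( 1 - 2^{-\lambda} \right) \omega \geq 2^{-\lambda}\rho, \tag{7.8.15}
\]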
Remark: This result holds more generally for positive maps that are not necessarily trace
preserving.
Proof: To see this, let 𝜆 ∈ R be such that the operator inequality 𝜌 ≤ 2𝜆 𝜎 holds.
Then the operator inequality N(𝜌) ≤ 2𝜆 N(𝜎) holds because the quantum channel
N is a positive map. Then
Since the inequality holds for all choices of 𝜆 such that 𝜌 ≤ 2𝜆 𝜎 holds, we conclude
that
It turns out that the max-relative entropy is a limiting case of the sandwiched
and geometric Rényi relative entropies, as we now show.
Proposition 7.61
The sandwiched and geometric Rényi relative entropies converge to the max-
relative entropy in the limit 𝛼 → ∞:
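\[
\lim_{\alpha \to \infty} \widetilde{D}_\alpha(\rho\|\sigma) = \lim_{\alpha \to \infty} \widehat{D}_\alpha(\rho\|\sigma) = D_{\max}(\rho\|\sigma). \tag{7.8.20}
\]
Proof: We begin with the case in which $\operatorname{supp}(\rho) \not\subseteq \operatorname{supp}(\sigma)$. We trivially have $\widetilde{D}_\alpha(\rho\|\sigma) = \widehat{D}_\alpha(\rho\|\sigma) = D_{\max}(\rho\|\sigma) = +\infty$ for all $\alpha > 1$, which implies the equality in (7.8.20) in this case.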
In the case that supp(𝜌) ⊆ supp(𝜎), we can consider, without loss of generality,
that supp(𝜎) = H, so that 𝜆 min (𝜎) > 0.
We begin with the sandwiched Rényi relative entropy. Consider that
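\begin{align}
\widetilde{D}_\alpha(\rho\|\sigma) &= \frac{1}{\alpha-1} \log_2 \operatorname{Tr}\!\left[ \left( \rho^{\frac{1}{2}} \sigma^{\frac{1-\alpha}{\alpha}} \rho^{\frac{1}{2}} \right)^{\alpha} \right] \tag{7.8.21} \\
&= \frac{1}{\alpha-1} \log_2 \operatorname{Tr}\!\left[ \left( \rho^{\frac{1}{2}} \sigma^{-\frac{1}{2}} \sigma^{\frac{1}{\alpha}} \sigma^{-\frac{1}{2}} \rho^{\frac{1}{2}} \right)^{\alpha} \right]. \tag{7.8.22}
\end{align}
By the operator inequalities $[\lambda_{\min}(\sigma)]^{\frac{1}{\alpha}} \mathbb{1} \leq \sigma^{\frac{1}{\alpha}} \leq [\lambda_{\max}(\sigma)]^{\frac{1}{\alpha}} \mathbb{1}$ and the mono-
\begin{align}
\frac{1}{\alpha-1} \log_2 \lambda_{\min}(\sigma) + \frac{\alpha}{\alpha-1} \log_2 \left\| \rho^{\frac{1}{2}} \sigma^{-1} \rho^{\frac{1}{2}} \right\|_\alpha &\leq \widetilde{D}_\alpha(\rho\|\sigma) \notag \\
&\leq \frac{1}{\alpha-1} \log_2 \lambda_{\max}(\sigma) + \frac{\alpha}{\alpha-1} \log_2 \left\| \rho^{\frac{1}{2}} \sigma^{-1} \rho^{\frac{1}{2}} \right\|_\alpha. \tag{7.8.26}
\end{align}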
Now taking the limit $\alpha \to \infty$ and applying the fact that $\lim_{\alpha \to \infty} \|X\|_\alpha = \|X\|_\infty$ (Proposition 2.9), we conclude the equality $\lim_{\alpha \to \infty} \widetilde{D}_\alpha(\rho\|\sigma) = D_{\max}(\rho\|\sigma)$.
We now consider the geometric Rényi relative entropy. Since we have that
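\[
\lambda_{\min}(\sigma)\, \mathbb{1} \leq \sigma \leq \lambda_{\max}(\sigma)\, \mathbb{1}, \tag{7.8.27}
\]
it follows that
\begin{align}
\lambda_{\min}(\sigma) \operatorname{Tr}\!\left[ \left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right)^{\alpha} \right] &\leq \operatorname{Tr}\!\left[ \sigma \left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right)^{\alpha} \right] \tag{7.8.28} \\
&\leq \lambda_{\max}(\sigma) \operatorname{Tr}\!\left[ \left( \sigma^{-\frac{1}{2}} \rho \sigma^{-\frac{1}{2}} \right)^{\alpha} \right]. \tag{7.8.29}
\end{align}
Combining this limit with the inequalities in (7.8.30) and (7.8.31), we arrive at the equality $\lim_{\alpha \to \infty} \widehat{D}_\alpha(\rho\|\sigma) = D_{\max}(\rho\|\sigma)$. ■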
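As a sanity check on the limit in Proposition 7.61, consider the commuting special case, in which $\rho$ and $\sigma$ are diagonal in the same basis. There the sandwiched and geometric Rényi relative entropies both reduce to the classical Rényi divergence of the eigenvalue distributions, and $D_{\max}$ reduces to $\log_2 \max_i p_i/q_i$. The sketch below assumes this special case; the distributions `p` and `q` and the function names are illustrative, and the sum is evaluated in the log domain to avoid overflow for large $\alpha$.

```python
import math

def renyi_classical(p, q, alpha):
    """Classical Renyi divergence (1/(alpha-1)) log2 sum_i p_i^alpha q_i^(1-alpha), alpha > 1.

    Computed with a log-sum-exp style shift so that large alpha does not overflow.
    Assumes all entries of p and q are strictly positive.
    """
    terms = [alpha * math.log2(pi) + (1 - alpha) * math.log2(qi) for pi, qi in zip(p, q)]
    top = max(terms)
    return (top + math.log2(sum(2 ** (t - top) for t in terms))) / (alpha - 1)

def d_max_classical(p, q):
    """Max-relative entropy of commuting operators: log2 of the largest ratio p_i / q_i."""
    return math.log2(max(pi / qi for pi, qi in zip(p, q)))

# Illustrative (assumed) eigenvalue distributions of commuting rho and sigma.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

target = d_max_classical(p, q)
for alpha in (2, 20, 200, 2000):
    print(alpha, renyi_classical(p, q, alpha))
print("D_max =", target)
```

The printed values should increase with $\alpha$ and approach $D_{\max}$ from below, in line with the monotonicity of the Rényi relative entropies in $\alpha$.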
for all bipartite states 𝜌 𝐴𝐵 , where the optimization is over states 𝜎𝐵 . Since the
max-relative entropy has the largest value among all the sandwiched Rényi relative
entropies, the quantity in (7.8.36) has the smallest value among all conditional
sandwiched Rényi entropies, which is why it is called the conditional min-entropy.
Note that the conditional sandwiched Rényi entropy is defined through (7.3.12), with the generalized divergence $\boldsymbol{D}$ therein replaced by the sandwiched Rényi relative entropy $\widetilde{D}_\alpha$, the latter defined in (7.5.2). On the other hand, the quantity
\[
H_{\max}(A|B)_\rho := \widetilde{H}_{\frac{1}{2}}(A|B)_\rho = -\inf_{\sigma_B} \widetilde{D}_{\frac{1}{2}}(\rho_{AB} \| \mathbb{1}_A \otimes \sigma_B) \tag{7.8.37}
\]
For the analysis of lower bounds on quantum and private communication rates,
we require the smooth max-relative entropy, which is an example of a smooth
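generalized divergence. A smooth generalized divergence, denoted by $\boldsymbol{D}^\varepsilon(\rho\|\sigma)$, is defined by taking a generalized divergence $\boldsymbol{D}(\rho\|\sigma)$, for a given state $\rho$ and positive semi-definite operator $\sigma$, and optimizing the quantity $\boldsymbol{D}(\tilde{\rho}\|\sigma)$ over states $\tilde{\rho}$ that are within a distance $\varepsilon$ from the given state $\rho$. Specifically, it is defined as follows:
\[
\boldsymbol{D}^\varepsilon(\rho\|\sigma) := \inf_{\tilde{\rho} \in \mathcal{B}^\varepsilon(\rho)} \boldsymbol{D}(\tilde{\rho}\|\sigma), \tag{7.8.40}
\]
where
\[
\mathcal{B}^\varepsilon(\rho) := \{\tau : \tau \geq 0,\ \operatorname{Tr}[\tau] = 1,\ P(\rho,\tau) \leq \varepsilon\} \tag{7.8.41}
\]
is the set of states $\tau$ that are $\varepsilon$-close to $\rho$ in terms of the sine distance (Definition 6.16).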
Just like the max-relative entropy, the smooth max-relative entropy is a general-
ized divergence, satisfying the data-processing inequality:
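\[
P(\tilde{\rho}, \rho) \leq \varepsilon. \tag{7.8.44}
\]
Then from the data-processing inequality for the sine distance under positive trace-preserving maps (see (6.2.114)), it follows that
\[
P(\mathcal{N}(\tilde{\rho}), \mathcal{N}(\rho)) \leq \varepsilon. \tag{7.8.45}
\]
So it follows that
\begin{align}
D_{\max}(\tilde{\rho}\|\sigma) &\geq D_{\max}(\mathcal{N}(\tilde{\rho})\|\mathcal{N}(\sigma)) \tag{7.8.46} \\
&\geq D^\varepsilon_{\max}(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)). \tag{7.8.47}
\end{align}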
Remark: The proof given above holds more generally when N is a positive, trace-preserving
map, so that (7.8.43) holds in this more general case.
The smooth max-relative entropy can be related to the sandwiched Rényi relative
entropy as follows:
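implying that
\[
D^\varepsilon_{\max}(\rho\|\sigma) = \log_2 \inf_{\tilde{\rho} : P(\tilde{\rho},\rho) \leq \varepsilon}\ \sup_{\Lambda \geq 0,\, \operatorname{Tr}[\Lambda\sigma] \leq 1} \operatorname{Tr}[\Lambda\tilde{\rho}]. \tag{7.8.50}
\]
Since the objective function $\operatorname{Tr}[\Lambda\tilde{\rho}]$ is linear in $\Lambda$ and $\tilde{\rho}$, the set $\{\Lambda : \Lambda \geq 0,\ \operatorname{Tr}[\Lambda\sigma] \leq 1\}$ is compact and convex, and the set
\[
\{\tilde{\rho} : P(\tilde{\rho},\rho) \leq \varepsilon,\ \tilde{\rho} \geq 0,\ \operatorname{Tr}[\tilde{\rho}] = 1\} \tag{7.8.51}
\]
is compact and convex (due to convexity of sine distance), the minimax theorem (Theorem 2.24) applies and we find that
\[
D^\varepsilon_{\max}(\rho\|\sigma) = \log_2 \sup_{\Lambda \geq 0,\, \operatorname{Tr}[\Lambda\sigma] \leq 1}\ \inf_{\tilde{\rho} : P(\tilde{\rho},\rho) \leq \varepsilon} \operatorname{Tr}[\Lambda\tilde{\rho}]. \tag{7.8.52}
\]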
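let us define the following set, for a choice of $\lambda > 0$ to be specified later:
\[
\mathcal{S} := \left\{ i : \langle\phi_i|\rho|\phi_i\rangle > 2^{\lambda} \langle\phi_i|\sigma|\phi_i\rangle \right\}. \tag{7.8.54}
\]
Let
\[
\Pi := \sum_{i \in \mathcal{S}} |\phi_i\rangle\!\langle\phi_i|. \tag{7.8.55}
\]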
Then from the definition, we find that
that
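\begin{align}
\widetilde{D}_\alpha(\rho\|\sigma) &\geq \widetilde{D}_\alpha(\Delta(\rho)\|\Delta(\sigma)) \tag{7.8.59} \\
&= \frac{1}{\alpha-1} \log_2\!\left( (\operatorname{Tr}[\Pi\rho])^{\alpha} (\operatorname{Tr}[\Pi\sigma])^{1-\alpha} + (\operatorname{Tr}[\hat{\Pi}\rho])^{\alpha} (\operatorname{Tr}[\hat{\Pi}\sigma])^{1-\alpha} \right) \tag{7.8.60} \\
&\geq \frac{1}{\alpha-1} \log_2\!\left( (\operatorname{Tr}[\Pi\rho])^{\alpha} (\operatorname{Tr}[\Pi\sigma])^{1-\alpha} \right) \tag{7.8.61} \\
&= \frac{1}{\alpha-1} \log_2\!\left( \operatorname{Tr}[\Pi\rho] \left( \frac{\operatorname{Tr}[\Pi\rho]}{\operatorname{Tr}[\Pi\sigma]} \right)^{\alpha-1} \right) \tag{7.8.62} \\
&= \frac{1}{\alpha-1} \log_2(\operatorname{Tr}[\Pi\rho]) + \log_2\!\left( \frac{\operatorname{Tr}[\Pi\rho]}{\operatorname{Tr}[\Pi\sigma]} \right) \tag{7.8.63} \\
&\geq \frac{1}{\alpha-1} \log_2(\operatorname{Tr}[\Pi\rho]) + \lambda. \tag{7.8.64}
\end{align}
Now picking
\[
\lambda = \widetilde{D}_\alpha(\rho\|\sigma) + \frac{1}{\alpha-1} \log_2\!\left( \frac{1}{\varepsilon^2} \right), \tag{7.8.65}
\]
we conclude that
\[
\operatorname{Tr}[\Pi\rho] \leq \varepsilon^2. \tag{7.8.66}
\]
Defining $\hat{\Pi} := \mathbb{1} - \Pi$, this means that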
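\begin{align}
&\leq 2^{\lambda} \sum_{i \notin \mathcal{S}} \lambda_i \langle\phi_i|\sigma|\phi_i\rangle \tag{7.8.75} \\
&\leq 2^{\lambda} \operatorname{Tr}[\Lambda\sigma] \tag{7.8.76} \\
&\leq 2^{\lambda}. \tag{7.8.77}
\end{align}
Thus, we have found the following uniform bound for any operator $\Lambda$ satisfying $\Lambda \geq 0$ and $\operatorname{Tr}[\Lambda\sigma] \leq 1$, with $\rho'$ the state in (7.8.68) depending on $\Lambda$ and satisfying $P(\rho,\rho') \leq \varepsilon$:
\[
\operatorname{Tr}[\Lambda\rho'] \leq 2^{\lambda + \log_2\left( \frac{1}{1-\varepsilon^2} \right)}. \tag{7.8.78}
\]
Then it follows that
\begin{align}
D^\varepsilon_{\max}(\rho\|\sigma) &= \log_2 \sup_{\Lambda \geq 0,\, \operatorname{Tr}[\Lambda\sigma] \leq 1}\ \inf_{\tilde{\rho} : P(\tilde{\rho},\rho) \leq \varepsilon} \operatorname{Tr}[\Lambda\tilde{\rho}] \tag{7.8.79} \\
&\leq \log_2 \sup_{\Lambda \geq 0,\, \operatorname{Tr}[\Lambda\sigma] \leq 1} \operatorname{Tr}[\Lambda\rho'] \tag{7.8.80} \\
&\leq \lambda + \log_2\!\left( \frac{1}{1-\varepsilon^2} \right). \tag{7.8.81}
\end{align}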
This concludes the proof. ■
for all $\alpha > 1$ and $\varepsilon \in (0,1)$. Note that the conditional sandwiched Rényi entropy $\widetilde{H}_\alpha(A|B)_\rho$ is defined through (7.3.12), with the generalized divergence $\boldsymbol{D}$ therein replaced by the sandwiched Rényi relative entropy $\widetilde{D}_\alpha$, the latter defined in (7.5.2).
where
That is, the monotonicity of the log2 function allows us to bring − log2 inside
the minimization in the definition of 𝛽𝜀 (𝜌∥𝜎), and it suffices to optimize over
measurement operators that meet the constraint Tr[Λ𝜌] ≥ 1 − 𝜀 with equality. This
follows because for every measurement operator Λ such that Tr[Λ𝜌] > 1−𝜀, we can
modify it by scaling it by a positive number 𝜆 ∈ [0, 1) such that Tr[(𝜆Λ) 𝜌] = 1 − 𝜀.
The new operator 𝜆Λ is a legitimate measurement operator and the error probability
Tr[(𝜆Λ)𝜎] only decreases under this scaling (i.e., Tr[(𝜆Λ)𝜎] < Tr[Λ𝜎]), which
allows us to conclude (7.9.3).
The hypothesis testing relative entropy can be computed using a semi-definite
Complementary slackness implies that the following equalities hold for optimal
Λ, 𝜇, and 𝑍:
To figure out the dual and having already identified 𝐴, 𝐵, and Φ† , we need to
determine the map Φ and plug into the standard form in (2.4.3). Letting
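\[
X := \begin{pmatrix} \mu & 0 \\ 0 & Z \end{pmatrix}, \tag{7.9.8}
\]
we find that
\begin{align}
\operatorname{Tr}[\Phi^\dagger(Y) X] &= \operatorname{Tr}\!\left[ \begin{pmatrix} \operatorname{Tr}[\Lambda\rho] & 0 \\ 0 & -\Lambda \end{pmatrix} \begin{pmatrix} \mu & 0 \\ 0 & Z \end{pmatrix} \right] \tag{7.9.9} \\
&= \mu \operatorname{Tr}[\Lambda\rho] - \operatorname{Tr}[\Lambda Z] \tag{7.9.10} \\
&= \operatorname{Tr}[\Lambda(\mu\rho - Z)], \tag{7.9.11}
\end{align}
To show that this is equal to $D^\varepsilon_H(\rho\|\sigma)$, we should demonstrate that the primal and dual SDPs satisfy the strong duality property. It is clear that $\Lambda = \mathbb{1}$ is a feasible point for the primal SDP. Furthermore, the choices $\mu = 1$ and $Z = \mathbb{1} + [\sigma - \rho]_+$, where $[\sigma - \rho]_+$ is the positive part of $\sigma - \rho$, are strictly feasible for the dual. Thus, we conclude (7.9.5) by applying Theorem 2.28.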
The complementary slackness conditions in (7.9.6) follow directly from Propo-
sition 2.29. ■
where $\Pi_{\mu^* \rho > \sigma}$ is the projection onto the strictly positive part of $\mu^*\rho - \sigma$, the projection $\Pi_{\mu^* \rho = \sigma}$ projects onto the zero eigenspace of $\mu^*\rho - \sigma$, and $\mu^* \geq 0$ and $p^* \in [0,1]$ are chosen as follows:
Proof: To find the form of an optimal measurement operator for the hypothesis
testing relative entropy, let Λ be a measurement operator satisfying Tr[Λ𝜌] = 1 − 𝜀
and let 𝜇 ≥ 0. Then
where $\Pi_{\mu\rho > \sigma}$ is the projection onto the strictly positive part of $\mu\rho - \sigma$, the projection $\Pi_{\mu\rho = \sigma}$ projects onto the zero eigenspace of $\mu\rho - \sigma$, and $p \in [0,1]$. The measurement operator $\Lambda(\mu, p)$ is called a quantum Neyman–Pearson test. We still need to choose the parameters $\mu \geq 0$ and $p \in [0,1]$. Let us pick $\mu$ according to the following optimization:
\[
\operatorname{Tr}[\Lambda(\mu^*, p^*)\rho] = 1 - \varepsilon. \tag{7.9.24}
\]
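In the commuting special case, where $\rho$ and $\sigma$ are diagonal in a common basis with eigenvalue vectors $p$ and $q$, a Neyman–Pearson test reduces to a randomized likelihood-ratio test: accept outcomes in decreasing order of $p_i/q_i$ and randomize on the boundary outcome so that the acceptance probability under $p$ is exactly $1-\varepsilon$. The following sketch illustrates this under that assumption; the function name and the example distributions are hypothetical.

```python
import math

def neyman_pearson_beta(p, q, eps):
    """Smallest type-II error sum_i L_i q_i subject to sum_i L_i p_i >= 1 - eps, 0 <= L_i <= 1.

    Classical (commuting) case only: outcomes are accepted greedily in order of
    decreasing likelihood ratio p_i / q_i, with a fractional weight on the last one.
    """
    def ratio(i):
        if p[i] == 0:
            return 0.0
        return math.inf if q[i] == 0 else p[i] / q[i]

    order = sorted(range(len(p)), key=ratio, reverse=True)
    need, beta = 1.0 - eps, 0.0
    for i in order:
        if need <= 0:
            break
        if p[i] == 0:
            continue
        take = min(1.0, need / p[i])   # fraction of outcome i that is accepted
        beta += take * q[i]
        need -= take * p[i]
    return beta

# Illustrative (assumed) distributions.
p = [0.5, 0.5]
q = [0.9, 0.1]
eps = 0.1

beta = neyman_pearson_beta(p, q, eps)
print("beta_eps =", beta, " D_H^eps =", -math.log2(beta))
```

When both arguments coincide, the routine returns $1-\varepsilon$, so the hypothesis testing relative entropy of a state with itself is $-\log_2(1-\varepsilon)$ rather than zero.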
Note that the other generalized divergences we have considered so far satisfy $\boldsymbol{D}(\rho\|\rho) = 0$ for all states $\rho$. The $\varepsilon$-hypothesis testing relative entropy, however, does not have this property unless $\varepsilon = 0$. In fact, it is clear from the definition, along with (7.9.3), that
Like the quantum relative entropy, the Petz–Rényi relative entropy, the sandwiched Rényi relative entropy, and the max-relative entropy, the $\varepsilon$-hypothesis testing relative entropy is also a generalized divergence, meaning that it satisfies the data-processing inequality.
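Now, by enlarging the optimization set from measurement operators $\mathcal{N}^\dagger(\Lambda)$ satisfying $0 \leq \mathcal{N}^\dagger(\Lambda) \leq \mathbb{1}$ and $\operatorname{Tr}[\mathcal{N}^\dagger(\Lambda)\rho] \geq 1 - \varepsilon$ to all measurement operators, say $\Lambda'$, satisfying $0 \leq \Lambda' \leq \mathbb{1}$ and $\operatorname{Tr}[\Lambda'\rho] \geq 1 - \varepsilon$, we obtain
\begin{align}
& D^\varepsilon_H(\mathcal{N}(\rho)\|\mathcal{N}(\sigma)) \notag \\
&\leq \sup_{\Lambda} \left\{ -\log_2 \operatorname{Tr}[\mathcal{N}^\dagger(\Lambda)\sigma] : 0 \leq \mathcal{N}^\dagger(\Lambda) \leq \mathbb{1},\ \operatorname{Tr}[\mathcal{N}^\dagger(\Lambda)\rho] \geq 1 - \varepsilon \right\} \tag{7.9.32} \\
&\leq \sup_{\Lambda'} \left\{ -\log_2 \operatorname{Tr}[\Lambda'\sigma] : 0 \leq \Lambda' \leq \mathbb{1},\ \operatorname{Tr}[\Lambda'\rho] \geq 1 - \varepsilon \right\} \tag{7.9.33} \\
&= D^\varepsilon_H(\rho\|\sigma), \tag{7.9.34}
\end{align}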
as required. ■
Remark: Inspection of the proof above reveals that it holds more generally for N a positive,
trace-non-increasing map.
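\begin{multline}
D^\varepsilon_H\!\left( \sum_{x \in \mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \rho_A^x \,\middle\|\, \sum_{x \in \mathcal{X}} q(x) |x\rangle\!\langle x|_X \otimes \sigma_A^x \right) \\
\geq \min_{x \in \mathcal{X}} D^\varepsilon_H(\rho_A^x \| \sigma_A^x). \tag{7.9.37}
\end{multline}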
Proof:
1. Eq. (7.9.35) follows from Definition 7.65: increasing $\varepsilon$ increases the set of measurement operators $\Lambda$ over which we can optimize, and $D^\varepsilon_H(\rho\|\sigma)$ does not decrease under such a change.
2. Consider that the following inequality holds for all $\varepsilon \in (0,1)$:
\[
D^\varepsilon_H(\rho\|\sigma) \geq D_0(\rho\|\sigma), \tag{7.9.38}
\]
because the measurement operator $\Pi_\rho$ (projection onto support of $\rho$) satisfies $\operatorname{Tr}[\Pi_\rho \rho] \geq 1 - \varepsilon$ for all $\varepsilon \in (0,1)$. So we conclude that
\[
\liminf_{\varepsilon \to 0} D^\varepsilon_H(\rho\|\sigma) \geq D_0(\rho\|\sigma). \tag{7.9.39}
\]
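\[
D_\alpha(\rho\|\sigma) \geq \frac{1}{\alpha-1} \log_2\!\left( (1-\varepsilon)^{\alpha} \left( 2^{-D^\varepsilon_H(\rho\|\sigma)} \right)^{1-\alpha} + \varepsilon^{\alpha} \left( 1 - 2^{-D^\varepsilon_H(\rho\|\sigma)} \right)^{1-\alpha} \right). \tag{7.9.41}
\]
Now taking the limit of the right-hand side as $\varepsilon \to 0$, we find that the following bound holds for all $\alpha \in (0,1)$:
\[
D_\alpha(\rho\|\sigma) \geq \limsup_{\varepsilon \to 0} D^\varepsilon_H(\rho\|\sigma). \tag{7.9.42}
\]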
Since the bound holds for all 𝛼 ∈ (0, 1), we can take the limit on the left-hand
side to arrive at
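Now, in the definition of $\beta_\varepsilon$ on the left-hand side of the inequality above, let us restrict the infimum over all measurement operators to those of the form $\Lambda_{XA} = \sum_{x \in \mathcal{X}} |x\rangle\!\langle x|_X \otimes M_A^x$ such that $0 \leq M_A^x \leq \mathbb{1}_A$ and $\operatorname{Tr}[M_A^x \rho_A^x] \geq 1 - \varepsilon$ for all $x \in \mathcal{X}$. Doing this leads to
\begin{align}
& \beta_\varepsilon\!\left( \sum_{x \in \mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \rho_A^x \,\middle\|\, \sum_{x \in \mathcal{X}} q(x) |x\rangle\!\langle x|_X \otimes \sigma_A^x \right) \tag{7.9.47} \\
&\leq \inf_{\{M_A^x\}_{x \in \mathcal{X}}} \left\{ \sum_{x \in \mathcal{X}} q(x) \operatorname{Tr}[M_A^x \sigma_A^x] : 0 \leq M_A^x \leq \mathbb{1}_A,\ \operatorname{Tr}[M_A^x \rho_A^x] \geq 1 - \varepsilon\ \ \forall x \in \mathcal{X} \right\} \tag{7.9.48}
\end{align}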
The last inequality follows because $\beta_\varepsilon(\rho_A^x \| \sigma_A^x) \leq \max_{x \in \mathcal{X}} \beta_\varepsilon(\rho_A^x \| \sigma_A^x)$ for all $x \in \mathcal{X}$. So we have shown that the inequality (7.9.46) holds, which completes the proof. ■
We now prove a bound relating the 𝜀-hypothesis testing relative entropy to the
quantum relative entropy.
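Proof: To see this, we use the fact that the optimization in the definition of $D^\varepsilon_H(\rho\|\sigma)$ can be restricted as in (7.9.3), i.e.,
\[
D^\varepsilon_H(\rho\|\sigma) = \sup_{\Lambda} \left\{ -\log_2 \operatorname{Tr}[\Lambda\sigma] : 0 \leq \Lambda \leq \mathbb{1},\ \operatorname{Tr}[\Lambda\rho] = 1 - \varepsilon \right\}. \tag{7.9.53}
\]
\[
-\log_2 \operatorname{Tr}[\Lambda\sigma] \leq \frac{1}{1-\varepsilon} \left( D(\rho\|\sigma) + h_2(\varepsilon) + \varepsilon \log_2 \operatorname{Tr}[\sigma] \right). \tag{7.9.58}
\]
Since this bound holds for all measurement operators $\Lambda$ satisfying $\operatorname{Tr}[\Lambda\rho] = 1 - \varepsilon$, we conclude (7.9.52). ■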
We now show a connection between the hypothesis testing relative entropy and the
Petz– and sandwiched Rényi relative entropies.
Proof: If the support condition supp(𝜌) ⊆ supp(𝜎) does not hold, then the
right-hand side of (7.9.59) is equal to +∞, and so the result is trivially true. Thus,
in what follows, we suppose that the support condition supp(𝜌) ⊆ supp(𝜎) holds.
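Let $\Lambda$ denote a measurement operator such that $\operatorname{Tr}[\Lambda\rho] = 1 - \varepsilon$. Let $q := \operatorname{Tr}[\Lambda\sigma]$. By the data-processing inequality for the sandwiched Rényi relative entropy for $\alpha > 1$ (Theorem 7.33), under the measurement channel
\[
\omega \mapsto \operatorname{Tr}[\Lambda\omega]\, |0\rangle\!\langle 0| + \operatorname{Tr}[(\mathbb{1} - \Lambda)\omega]\, |1\rangle\!\langle 1|, \tag{7.9.61}
\]
we find that
\begin{align}
\widetilde{D}_\alpha(\rho\|\sigma) &\geq \widetilde{D}_\alpha(\{1-\varepsilon, \varepsilon\} \| \{q, \operatorname{Tr}[\sigma] - q\}) \tag{7.9.62} \\
&= \frac{1}{\alpha-1} \log_2\!\left[ (1-\varepsilon)^{\alpha} q^{1-\alpha} + \varepsilon^{\alpha} (\operatorname{Tr}[\sigma] - q)^{1-\alpha} \right] \tag{7.9.63} \\
&\geq \frac{1}{\alpha-1} \log_2\!\left[ (1-\varepsilon)^{\alpha} q^{1-\alpha} \right] \tag{7.9.64} \\
&= \frac{\alpha}{\alpha-1} \log_2(1-\varepsilon) - \log_2 q. \tag{7.9.65}
\end{align}
The statement in (7.9.59) follows by taking the supremum over all $\Lambda$ such that $\operatorname{Tr}[\Lambda\rho] = 1 - \varepsilon$. Furthermore, (7.9.60) follows because $\lim_{\alpha \to \infty} \widetilde{D}_\alpha(\rho\|\sigma) = D_{\max}(\rho\|\sigma)$ (as shown in Proposition 7.61) and the fact that $\lim_{\alpha \to \infty} \frac{\alpha}{\alpha-1} = 1$. ■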
where the first equality is the result of Theorem 5.3. For 𝑝 ∈ (0, 1), pick 𝐴 = 𝑝𝜌
and 𝐵 = (1 − 𝑝) 𝜎. Plugging in to the inequality above, we find that there exists a
measurement operator Λ∗ = Λ( 𝑝, 𝜌, 𝜎) such that
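implies
\[
\operatorname{Tr}[\Lambda^* \sigma] \leq \left( \frac{p}{1-p} \right)^{\alpha} \operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}]. \tag{7.9.79}
\]
Considering that
\[
\varepsilon = \left( \frac{1-p}{p} \right)^{1-\alpha} \operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}] = \left( \frac{p}{1-p} \right)^{\alpha-1} \operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}] \tag{7.9.80}
\]
implies that
\[
\left( \frac{\varepsilon}{\operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}]} \right)^{\frac{1}{\alpha-1}} = \frac{p}{1-p}, \tag{7.9.81}
\]
we get that
\begin{align}
\operatorname{Tr}[\Lambda^* \sigma] &\leq \left( \frac{p}{1-p} \right)^{\alpha} \operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}] \tag{7.9.82} \\
&= \left( \left( \frac{\varepsilon}{\operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}]} \right)^{\frac{1}{\alpha-1}} \right)^{\alpha} \operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}] \tag{7.9.83} \\
&= \varepsilon^{\frac{\alpha}{\alpha-1}} \left( \operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}] \right)^{\frac{\alpha}{1-\alpha}} \operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}] \tag{7.9.84} \\
&= \varepsilon^{\frac{\alpha}{\alpha-1}} \left( \operatorname{Tr}[\rho^{\alpha} \sigma^{1-\alpha}] \right)^{\frac{1}{1-\alpha}}. \tag{7.9.85}
\end{align}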
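The chain (7.9.80)–(7.9.85) is pure algebra in the three scalars $\alpha$, $\varepsilon$, and $\operatorname{Tr}[\rho^{\alpha}\sigma^{1-\alpha}]$, so it can be spot-checked numerically. In the sketch below, `t` stands in for $\operatorname{Tr}[\rho^{\alpha}\sigma^{1-\alpha}]$; the numerical values and the function name are illustrative assumptions only.

```python
def check_chain(alpha, eps, t):
    """Evaluate both sides of the chain (7.9.80)-(7.9.85) for scalar inputs.

    t plays the role of Tr[rho^alpha sigma^(1-alpha)]; alpha is in (0, 1).
    Returns (p, type-I bound, type-II bound via (7.9.82), closed form (7.9.85)).
    """
    r = (eps / t) ** (1 / (alpha - 1))   # r = p / (1 - p), from (7.9.81)
    p = r / (1 + r)
    type_one = r ** (alpha - 1) * t      # right-hand side of (7.9.80), should equal eps
    type_two = r ** alpha * t            # right-hand side of (7.9.82)
    closed = eps ** (alpha / (alpha - 1)) * t ** (1 / (1 - alpha))  # (7.9.85)
    return p, type_one, type_two, closed

# Illustrative (assumed) values.
p, type_one, type_two, closed = check_chain(alpha=0.5, eps=0.5, t=0.3)
print(p, type_one, type_two, closed)
```

Here the second returned value reproduces $\varepsilon$, and the last two agree, which is the content of the equalities (7.9.83)–(7.9.85).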
Then, by taking the negative logarithm and optimizing over all Λ∗ satisfying
(7.9.77), we find that
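[Figure: an unknown state, $\rho$ or $\sigma$, is measured with the POVM $\{\Lambda^{(n)}, \mathbb{1}^{\otimes n} - \Lambda^{(n)}\}$, yielding the guess "$\rho$" or "$\sigma$".]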
where we recall the definition of the error probability $p_{\text{err}}(\{\Lambda^{(n)}, \mathbb{1}^{\otimes n} - \Lambda^{(n)}\}; \{\rho, \sigma\})$ of state discrimination from (5.3.1). Due to the fact that we combine both the type-I and type-II error probabilities, the task of state discrimination is often referred to as symmetric hypothesis testing, while the task of hypothesis testing that we are considering in this section is referred to as asymmetric hypothesis testing.
More formally, a hypothesis testing protocol is defined by the four elements $(n, \rho, \sigma, \Lambda^{(n)})$, where $n$ is the number of copies of the system, each of which is either in the state $\rho$ or $\sigma$, and $0 \leq \Lambda^{(n)} \leq \mathbb{1}^{\otimes n}$ is the operator defining the POVM $\{\Lambda^{(n)}, \mathbb{1}^{\otimes n} - \Lambda^{(n)}\}$ used to decide the state of the system.
Remark: When we say that there exists an $(n, 2^{-n(R-\delta)}, \varepsilon)$ hypothesis testing protocol for $\rho$ and $\sigma$ for all $\varepsilon \in (0,1]$, $\delta > 0$, and "sufficiently large $n$", we mean that for all $\varepsilon \in (0,1]$ and $\delta > 0$, there exists a number $N_{\varepsilon,\delta} \in \mathbb{N}$ such that for all $n \geq N_{\varepsilon,\delta}$, there exists an $(n, 2^{-n(R-\delta)}, \varepsilon)$ hypothesis testing protocol for $\rho$ and $\sigma$. This convention with the nomenclature "sufficiently large $n$" is taken throughout the rest of the book.
Note that, by definition, for every achievable rate 𝑅 there exists a value of 𝑛
such that the type-I error probability 𝜀 becomes arbitrarily close to zero.
Note that, by definition, for every strong converse rate 𝑅 there exists a value of
𝑛 such that the type-I error probability 𝜀 is arbitrarily close to one.
\[
E(\rho,\sigma) \leq \widetilde{E}(\rho,\sigma). \tag{7.10.8}
\]
Indeed, suppose for a contradiction that this is not true, i.e., that $E(\rho,\sigma) > \widetilde{E}(\rho,\sigma)$ holds. This means that there exists an achievable rate $R$ such that $\widetilde{E}(\rho,\sigma) < R < E(\rho,\sigma)$, so that for all $\varepsilon \in (0,1]$, all $\delta > 0$, and all sufficiently large $n$ there exists
We now state and prove the quantum Stein’s lemma: both the optimal achievable
and strong converse rates are equal to the quantum relative entropy. As alluded
to at the beginning of Section 7.2 on the quantum relative entropy, the quantum
Stein’s lemma gives the quantum relative entropy its most fundamental operational
meaning as the optimal rate in asymmetric quantum hypothesis testing.
\[
E(\rho,\sigma) = \widetilde{E}(\rho,\sigma) = D(\rho\|\sigma). \tag{7.10.9}
\]
Therefore, for all sufficiently large 𝑛 such that (Tr[Π𝜎 𝜌]) 𝑛 ≤ 𝜀 holds, the elements
(𝑛, 𝜌, 𝜎, Λ (𝑛) ) constitute an (𝑛, 0, 𝜀) hypothesis testing protocol for 𝜌 and 𝜎, the
rate of which is +∞ = 𝐷 (𝜌∥𝜎). Since 𝜀 is arbitrary, we conclude that, for all
𝜀 ∈ (0, 1] and sufficiently large 𝑛, there exists a hypothesis testing protocol for 𝜌
and 𝜎 with rate 𝑅 = +∞. This implies that 𝐸 (𝜌, 𝜎) = +∞ in the singular case of
supp(𝜌) ⊈ supp(𝜎).
For the remainder of the proof, we assume that the support condition supp(𝜌) ⊆
supp(𝜎) holds, so that 𝐷 (𝜌∥𝜎) is finite.
Let us first show that 𝐷 (𝜌∥𝜎) is an achievable rate, which establishes that
𝐸 (𝜌, 𝜎) ≥ 𝐷 (𝜌∥𝜎). To this end, fix 𝜀 ∈ (0, 1] and 𝛿 > 0. Let 𝛿1 , 𝛿2 > 0 be such
that
𝛿1 + 𝛿2 = 𝛿. (7.10.12)
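Set $\alpha \in (0,1)$ such that
\[
\delta_1 \geq D(\rho\|\sigma) - D_\alpha(\rho\|\sigma), \tag{7.10.13}
\]
which is possible because $\lim_{\alpha \to 1} D_\alpha(\rho\|\sigma) = D(\rho\|\sigma)$ by Proposition 7.22 and $D_\alpha$ is monotonically increasing in $\alpha$, as established in Proposition 7.23. Then, with this choice of $\alpha$, take $n$ large enough so that
\[
\delta_2 \geq \frac{\alpha}{n(1-\alpha)} \log_2\!\left( \frac{1}{\varepsilon} \right). \tag{7.10.14}
\]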
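The condition (7.10.14) can be solved for $n$ directly: it holds exactly when $n \geq \frac{\alpha}{(1-\alpha)\delta_2} \log_2(1/\varepsilon)$. The sketch below computes the smallest such integer; the function name `min_copies` and the parameter values are illustrative assumptions rather than values from the text.

```python
import math

def min_copies(alpha, eps, delta2):
    """Smallest integer n with alpha / (n (1 - alpha)) * log2(1 / eps) <= delta2.

    Assumes alpha in (0, 1), eps in (0, 1], and delta2 > 0.
    """
    return math.ceil(alpha * math.log2(1 / eps) / ((1 - alpha) * delta2))

def slack_term(alpha, eps, n):
    """Right-hand side of (7.10.14) for a given number of copies n."""
    return alpha / (n * (1 - alpha)) * math.log2(1 / eps)

# Illustrative (assumed) parameters.
alpha, eps, delta2 = 0.9, 0.01, 0.05
n = min_copies(alpha, eps, delta2)
print(n, slack_term(alpha, eps, n))
```

The required $n$ grows as $\alpha$ approaches 1, which is the price for making $\delta_1$ in (7.10.13) small.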
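testing protocol for $\rho$ and $\sigma$. We now apply Proposition 7.72 and the additivity of the Petz–Rényi relative entropy from Proposition 7.23 to find that
\begin{align}
\frac{1}{n} D^\varepsilon_H(\rho^{\otimes n} \| \sigma^{\otimes n}) &\geq \frac{1}{n} D_\alpha(\rho^{\otimes n} \| \sigma^{\otimes n}) + \frac{\alpha}{n(\alpha-1)} \log_2\!\left( \frac{1}{\varepsilon} \right) \tag{7.10.17} \\
&= D_\alpha(\rho\|\sigma) + \frac{\alpha}{n(\alpha-1)} \log_2\!\left( \frac{1}{\varepsilon} \right). \tag{7.10.18}
\end{align}
Rearranging the right-hand side of this inequality and using (7.10.12)–(7.10.14), we conclude that
\begin{align}
& \frac{1}{n} D^\varepsilon_H(\rho^{\otimes n} \| \sigma^{\otimes n}) \notag \\
&\geq D(\rho\|\sigma) - \left( D(\rho\|\sigma) - D_\alpha(\rho\|\sigma) + \frac{\alpha}{n(1-\alpha)} \log_2\!\left( \frac{1}{\varepsilon} \right) \right) \tag{7.10.19} \\
&\geq D(\rho\|\sigma) - (\delta_1 + \delta_2) \tag{7.10.20} \\
&\geq D(\rho\|\sigma) - \delta. \tag{7.10.21}
\end{align}
We thus have
\[
D(\rho\|\sigma) - \delta \leq \frac{1}{n} D^\varepsilon_H(\rho^{\otimes n} \| \sigma^{\otimes n}). \tag{7.10.22}
\]
The error $2^{-n(D(\rho\|\sigma) - \delta)}$ is then greater than or equal to $2^{-n \left( \frac{1}{n} D^\varepsilon_H(\rho^{\otimes n} \| \sigma^{\otimes n}) \right)}$, which means, by the fact stated in the paragraph immediately after Definition 7.73, that there exists an $(n, 2^{-n(R-\delta)}, \varepsilon)$ hypothesis testing protocol with $R = D(\rho\|\sigma)$ for all sufficiently large $n$ such that (7.10.14) holds. Since $\varepsilon$ and $\delta$ are arbitrary, we conclude that for all $\varepsilon \in (0,1]$, $\delta > 0$, and sufficiently large $n$, there exists an $(n, 2^{-n(R-\delta)}, \varepsilon)$ hypothesis testing protocol with $R = D(\rho\|\sigma)$. Then $D(\rho\|\sigma)$ is an achievable rate, so that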
Let us now show that the quantum relative entropy 𝐷 (𝜌∥𝜎) is a strong converse
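rate, which establishes that $\widetilde{E}(\rho,\sigma) \leq D(\rho\|\sigma)$. Fix $\varepsilon \in [0,1)$ and $\delta > 0$. Let $\delta_1, \delta_2 > 0$ be such that
\[
\delta > \delta_1 + \delta_2 =: \delta'. \tag{7.10.24}
\]
Set $\alpha \in (1,\infty)$ such that
\[
\delta_1 \geq \widetilde{D}_\alpha(\rho\|\sigma) - D(\rho\|\sigma), \tag{7.10.25}
\]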
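Now, consider an arbitrary measurement operator $\Lambda^{(n)}$ such that the hypothesis testing protocol given by $(n, \rho, \sigma, \Lambda^{(n)})$ satisfies $\varepsilon \geq \operatorname{Tr}[(\mathbb{1}^{\otimes n} - \Lambda^{(n)}) \rho^{\otimes n}]$ and $\varepsilon_{\text{II}} \geq \operatorname{Tr}[\Lambda^{(n)} \sigma^{\otimes n}]$. By definition of the hypothesis testing relative entropy, we
We thus have that $\operatorname{Tr}[\Lambda^{(n)} \sigma^{\otimes n}] > 2^{-n(D(\rho\|\sigma) + \delta)}$. Since $\Lambda^{(n)}$ is an arbitrary measurement operator satisfying $\varepsilon \geq \operatorname{Tr}[(\mathbb{1}^{\otimes n} - \Lambda^{(n)}) \rho^{\otimes n}]$, we see that, for all sufficiently large $n$ such that (7.10.26) holds, an $(n, 2^{-n(D(\rho\|\sigma) + \delta)}, \varepsilon)$ hypothesis testing protocol cannot exist, for if it did we would have $\operatorname{Tr}[\Lambda^{(n)} \sigma^{\otimes n}] \leq 2^{-n(D(\rho\|\sigma) + \delta)}$ for some $\Lambda^{(n)}$. Since $\varepsilon$ and $\delta$ are arbitrary, we have that for all $\varepsilon \in [0,1)$, $\delta > 0$, and sufficiently large $n$, there does not exist an $(n, 2^{-n(D(\rho\|\sigma) + \delta)}, \varepsilon)$ hypothesis testing protocol for $\rho$ and $\sigma$, which means that $D(\rho\|\sigma)$ is a strong converse rate, so that
\[
\widetilde{E}(\rho,\sigma) \leq D(\rho\|\sigma). \tag{7.10.33}
\]
\[
E(\rho,\sigma) \leq \widetilde{E}(\rho,\sigma) \leq D(\rho\|\sigma) \leq E(\rho,\sigma), \tag{7.10.34}
\]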
We can conclude the main result of Theorem 7.78 in a different yet related way.
Recall that an alternate definition of the optimal type-II error exponent is given in
(7.10.1), i.e.,
\[
E(\rho,\sigma) = \inf_{\varepsilon \in (0,1)} \liminf_{n \to \infty} \frac{D^\varepsilon_H(\rho^{\otimes n} \| \sigma^{\otimes n})}{n}. \tag{7.10.35}
\]
It is straightforward to show that
\[
\inf_{\varepsilon \in (0,1)} \liminf_{n \to \infty} \frac{D^\varepsilon_H(\rho^{\otimes n} \| \sigma^{\otimes n})}{n} = D(\rho\|\sigma). \tag{7.10.36}
\]
Indeed, using (7.10.18), we find that
\[
\inf_{\varepsilon \in (0,1)} \liminf_{n \to \infty} \frac{1}{n} D^\varepsilon_H(\rho^{\otimes n} \| \sigma^{\otimes n}) \geq D_\alpha(\rho\|\sigma) \tag{7.10.37}
\]
for all $\alpha \in (0,1)$, so that taking the supremum over $\alpha \in (0,1)$ on the right-hand side leads to
\[
E(\rho,\sigma) \geq D(\rho\|\sigma). \tag{7.10.38}
\]
We can also write the optimal strong converse type-II error exponent as
for all $\alpha > 1$, so that taking the infimum over $\alpha \in (1,\infty)$ on the right-hand side leads to
\[
\widetilde{E}(\rho,\sigma) \leq D(\rho\|\sigma). \tag{7.10.41}
\]
We therefore conclude that $E(\rho,\sigma) = \widetilde{E}(\rho,\sigma) = D(\rho\|\sigma)$. Note that in the arguments above for the lower bound on $E(\rho,\sigma)$ and the upper bound on $\widetilde{E}(\rho,\sigma)$, we did not have to explicitly take the infimum or supremum over $\varepsilon \in (0,1)$, respectively.
Given states 𝜌 and 𝜎, as well as the number 𝑛 of copies of the states, we can change
our perspective a bit from that given in the previous section and instead determine
bounds on the type-I error probability 𝜀. In particular, we can change our focus a bit,
such that we are now interested in how fast the type-I error probability converges to
zero if the type-II error exponent is equal to a constant smaller than the quantum
relative entropy, and we are also interested in how fast the type-I error probability
converges to one if the type-II error exponent is equal to a constant larger than the
quantum relative entropy. To assist with this analysis, we establish the following
propositions, whose proofs are closely related to the proofs of Propositions 7.71
and 7.72.
Proposition 7.79
Let 𝜌 be a state, and let 𝜎 be a positive semi-definite operator. Let 𝛼 > 1 and
𝑅 ≥ 0. Then, for Λ a measurement operator satisfying
Remark: Note that the second bound is nontrivial only in the case that $R > D(\rho\|\sigma)$ because $\widetilde{D}_\alpha(\rho\|\sigma) > D(\rho\|\sigma)$ for $\alpha > 1$.
Proof: The proof is similar to the proof of Proposition 7.71. Let 𝑝 B Tr[Λ𝜌] and
𝑞 B Tr[Λ𝜎]. By applying the data-processing inequality for the sandwiched Rényi
relative entropy along with the measurement channel from (7.9.61), we conclude
that
Proposition 7.80
Let 𝜌 be a state, and let 𝜎 be a positive semi-definite operator. Let 𝛼 ∈ (0, 1)
and 𝑅 ≥ 0. Then, there exists a measurement operator Λ such that
Remark: Note that the second bound above is nontrivial only in the case that 𝑅 < 𝐷 (𝜌∥𝜎)
because 𝐷 𝛼 (𝜌∥𝜎) < 𝐷 (𝜌∥𝜎) for 𝛼 ∈ (0, 1).
Proof: The proof is similar to the proof of Proposition 7.72. Employing the same
measurement operator Λ∗ therein, we conclude from the same reasoning in that
proof that
1−𝛼
∗ 1− 𝑝
Tr[(𝐼 − Λ ) 𝜌] ≤ Tr[𝜌 𝛼 𝜎 1−𝛼 ], (7.10.53)
𝑝
𝛼
𝑝
Tr[Λ∗ 𝜎] ≤ Tr[𝜌 𝛼 𝜎 1−𝛼 ]. (7.10.54)
1− 𝑝
468
Chapter 7: Quantum Entropies and Information
We see that picking 𝑝 in such a way is always possible because one more step of
the development above leads to the conclusion that
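$$p = \frac{1}{1 + 2^{R/\alpha}\, 2^{(\alpha-1) D_\alpha(\rho\|\sigma)/\alpha}} \in (0, 1). \tag{7.10.59}$$
Substituting into (7.10.53), we find that
\begin{align}
\operatorname{Tr}[(I - \Lambda^*)\rho] &\le \left(2^{R/\alpha}\, 2^{(\alpha-1) D_\alpha(\rho\|\sigma)/\alpha}\right)^{1-\alpha} 2^{(\alpha-1) D_\alpha(\rho\|\sigma)} \tag{7.10.60}\\
&= 2^{\left(\frac{1-\alpha}{\alpha}\right) R}\, 2^{-\frac{(1-\alpha)^2}{\alpha} D_\alpha(\rho\|\sigma)}\, 2^{(\alpha-1) D_\alpha(\rho\|\sigma)} \tag{7.10.61}\\
&= 2^{\left(\frac{1-\alpha}{\alpha}\right) R}\, 2^{\left(\frac{\alpha-1}{\alpha}\right) D_\alpha(\rho\|\sigma)} \tag{7.10.62}\\
&= 2^{-\left(\frac{1-\alpha}{\alpha}\right)(D_\alpha(\rho\|\sigma) - R)}. \tag{7.10.63}
\end{align}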
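The algebra in (7.10.59)–(7.10.63) can be checked numerically. The following is a minimal sketch in Python, assuming scalar inputs only: `alpha` in (0, 1), a rate `R`, and a number `D_alpha` standing in for 𝐷𝛼(𝜌∥𝜎). The function names `choose_p` and `error_bounds` are illustrative and do not come from the text.

```python
def choose_p(alpha, R, D_alpha):
    """Value of p from (7.10.59)."""
    return 1.0 / (1.0 + 2 ** (R / alpha) * 2 ** ((alpha - 1) * D_alpha / alpha))


def error_bounds(alpha, R, D_alpha):
    """Right-hand sides of (7.10.53) and (7.10.54) at the chosen p."""
    p = choose_p(alpha, R, D_alpha)
    # Tr[rho^alpha sigma^(1-alpha)] expressed through the Petz-Renyi relative entropy.
    Q = 2 ** ((alpha - 1) * D_alpha)
    type_one = ((1 - p) / p) ** (1 - alpha) * Q
    type_two = (p / (1 - p)) ** alpha * Q
    return p, type_one, type_two


# Illustrative numbers (assumptions, not taken from the text).
alpha, R, D_alpha = 0.5, 0.3, 0.8
p, type_one, type_two = error_bounds(alpha, R, D_alpha)
closed_form = 2 ** (-((1 - alpha) / alpha) * (D_alpha - R))  # (7.10.63)
```

With this choice of 𝑝, the type-II bound evaluates to 2^(−𝑅) and the type-I bound agrees with the closed form in (7.10.63), up to floating-point error.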
The inequalities from Propositions 7.79 and 7.80 lead to the following bounds
on the type-I error probability 𝜀 for quantum hypothesis testing, when the type-II
error probability has a fixed rate 𝑅:
The left inequality holds for all 𝛼 > 1, while the right inequality holds for all
𝛼 ∈ (0, 1). Let us now examine the behavior of 𝜀 above and below the optimal rate
𝐷 (𝜌∥𝜎).
$$\varepsilon_n \le 2^{-n\left(\frac{1-\alpha}{\alpha}\right)(D_\alpha(\rho\|\sigma) - R)}. \tag{7.10.65}$$
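[Plot for Figure 7.3: vertical axis, type-I error probability 𝜀𝑛 (from 0 to 1); horizontal axis, type-II error exponent 𝑅, with the value 𝐷(𝜌∥𝜎) marked; shown in the limit 𝑛 → ∞.]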
Figure 7.3: The type-I error probability 𝜀 𝑛 as a function of the rate 𝑅, i.e., the
type-II error exponent, as the number 𝑛 of copies of the system approaches
infinity for the task of asymmetric hypothesis testing for the states 𝜌 and 𝜎.
The optimal rate of 𝐷 (𝜌∥𝜎) for this task, as established by the quantum Stein’s
lemma in Theorem 7.78, has what is called the strong converse property, which
means that it is the optimal strong converse rate. Therefore, for every rate above
it, the type-I error probability converges to one in the limit of arbitrarily many
copies of the system.
Since 𝑅 < 𝐷 𝛼∗ (𝜌∥𝜎), taking the limit 𝑛 → ∞ on both sides of this inequality
gives us lim𝑛→∞ 𝜀 𝑛 ≤ 0. However, 𝜀 𝑛 ≥ 0 for all 𝑛 because 𝜀 𝑛 is by definition
a probability. So we find that
𝜀 𝑛 → 0 as 𝑛 → ∞ if 𝑅 < 𝐷 (𝜌∥𝜎). (7.10.67)
The optimal rate 𝐷 (𝜌∥𝜎) is therefore a sharp dividing point below which the
type-I error probability 𝜀 𝑛 exponentially drops to zero as 𝑛 → ∞ and above which
it exponentially increases to one as 𝑛 → ∞. This behavior is illustrated in Figure
7.3.
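The exponential decay below the threshold can be illustrated for commuting states, where 𝜌 and 𝜎 reduce to probability vectors. The sketch below is a classical special case under that assumption; the distributions, the rate, and the function names are illustrative choices, not part of the text.

```python
import math


def petz_renyi(p, q, alpha):
    """Petz-Renyi relative entropy (base 2) of two probability vectors."""
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log2(s) / (alpha - 1)


def relative_entropy(p, q):
    """Relative entropy D(p||q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def type_one_upper_bound(p, q, R, alpha, n):
    """Right-hand side of (7.10.65), valid for alpha in (0, 1)."""
    return 2 ** (-n * ((1 - alpha) / alpha) * (petz_renyi(p, q, alpha) - R))


p = [0.9, 0.1]
q = [0.5, 0.5]
D = relative_entropy(p, q)
R = 0.5 * D      # a rate strictly below the optimal rate D(p||q)
alpha = 0.9      # chosen so that D_alpha(p||q) still exceeds R
bounds = [type_one_upper_bound(p, q, R, alpha, n) for n in (10, 100, 1000)]
```

For these numbers the bound shrinks as 𝑛 grows, consistent with (7.10.67).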
where the supremum is over all mixed states 𝜌 𝑅 𝐴 with an arbitrary reference
system 𝑅.
Proposition 7.82
Let 𝑫, N 𝐴→𝐵 , and M 𝐴→𝐵 be as given in Definition 7.81. It suffices to optimize
the generalized channel divergence 𝑫 (N∥M) with respect to pure states 𝜓 𝑅 𝐴
where we have used the data-processing inequality for the generalized divergence
in the last line. This means that for every state 𝜌 𝑅 𝐴 , the generalized divergence in
(7.11.4) is never larger than the corresponding generalized divergence evaluated on
a purification of 𝜌 𝑅 𝐴 . This means that it suffices in (7.11.1) to optimize over only
pure states. Furthermore, by the Schmidt decomposition theorem (Theorem 2.2),
the purifying space H 𝑅′ 𝑅 need not have dimension exceeding that of the dimension
of H 𝐴 . Therefore, the generalized channel divergence can be written as in (7.11.2).
To see (7.11.3), we first use the fact in (2.2.38), which implies that for every
pure state 𝜓 𝑅 𝐴 , there exists a state 𝜌 𝑅 and a unitary 𝑈 𝑅 such that
$$|\psi\rangle_{RA} = (U_R \sqrt{\rho_R} \otimes \mathbb{1}_A)|\Gamma\rangle_{RA}. \tag{7.11.8}$$
Proposition 7.83
Let 𝑫, N 𝐴→𝐵 , and M 𝐴→𝐵 be as given in Definition 7.81, and suppose that 𝑫
obeys the direct-sum property in (7.3.7). Then the function
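$$f(\rho_A, \mathcal{M}_{A\to B}) := \boldsymbol{D}\!\left(\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A}\,\middle\|\,\sqrt{\rho_A}\,\Gamma^{\mathcal{M}}_{AB}\sqrt{\rho_A}\right) \tag{7.11.13}$$
is concave in the first argument and convex in the second. If 𝔐 is a convex set
of completely positive maps, then
\begin{align}
&\inf_{\mathcal{M}\in\mathfrak{M}} \sup_{\rho_A} \boldsymbol{D}\!\left(\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A}\,\middle\|\,\sqrt{\rho_A}\,\Gamma^{\mathcal{M}}_{AB}\sqrt{\rho_A}\right) \notag\\
&\qquad = \sup_{\rho_A} \inf_{\mathcal{M}\in\mathfrak{M}} \boldsymbol{D}\!\left(\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A}\,\middle\|\,\sqrt{\rho_A}\,\Gamma^{\mathcal{M}}_{AB}\sqrt{\rho_A}\right). \tag{7.11.14}
\end{align}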
Equivalently,
Proof: To see the concavity, let $\psi^0_{RA}$ and $\psi^1_{RA}$ be pure states with reduced states
$\psi^0_A$ and $\psi^1_A$. Let $\psi^\lambda_{SRA}$ denote the following pure state:
$$|\psi^\lambda\rangle_{SRA} := \sqrt{1-\lambda}\,|0\rangle_S |\psi^0\rangle_{RA} + \sqrt{\lambda}\,|1\rangle_S |\psi^1\rangle_{RA}. \tag{7.11.16}$$
Observe that
$$\psi^\lambda_A = (1-\lambda)\,\psi^0_A + \lambda\,\psi^1_A, \tag{7.11.17}$$
so that the reduced state $\psi^\lambda_A$ is a convex combination of the reduced states $\psi^0_A$ and
$\psi^1_A$. Define
$$\overline{\psi}^{\lambda}_{SRA} := (1-\lambda)\,|0\rangle\langle 0|_S \otimes \psi^0_{RA} + \lambda\,|1\rangle\langle 1|_S \otimes \psi^1_{RA}, \tag{7.11.18}$$
which is the state resulting from the action of a completely dephasing qubit channel
on system 𝑆. Let $\phi^\lambda_{RA}$ be an arbitrary pure state with reduced state equal to $\psi^\lambda_A$.
Then we find that
The first equality follows because every two purifications of the same state are
related by an isometric channel acting on the reference system, as well as the
isometric invariance of the generalized divergence. The inequality follows from
quantum data processing, by the action of a completely dephasing qubit channel
on the system 𝑆. The final equality follows from the direct-sum property and
because the generalized divergence is invariant under tensoring in the same state
(Proposition 7.16). Finally, we have the following equalities, by employing the
isometric invariance of the generalized divergence (Proposition 7.16), the equality
in (7.11.12), and the definition in (7.11.13):
To see the convexity in M, consider that for every 𝜆 ∈ [0, 1] and completely
positive maps M1 , M2 ∈ 𝔐, the joint convexity of the generalized divergence
(Proposition 7.17) gives that
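\begin{align}
&f(\rho_A, \lambda\mathcal{M}_1 + (1-\lambda)\mathcal{M}_2) \notag\\
&= \boldsymbol{D}\!\left(\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A}\,\middle\|\,\sqrt{\rho_A}\,\Gamma^{\lambda\mathcal{M}_1+(1-\lambda)\mathcal{M}_2}_{AB}\sqrt{\rho_A}\right) \tag{7.11.25}\\
&= \boldsymbol{D}\!\left(\lambda\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A} + (1-\lambda)\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A}\,\middle\|\,\lambda\sqrt{\rho_A}\,\Gamma^{\mathcal{M}_1}_{AB}\sqrt{\rho_A} + (1-\lambda)\sqrt{\rho_A}\,\Gamma^{\mathcal{M}_2}_{AB}\sqrt{\rho_A}\right) \tag{7.11.26}\\
&\le \lambda\,\boldsymbol{D}\!\left(\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A}\,\middle\|\,\sqrt{\rho_A}\,\Gamma^{\mathcal{M}_1}_{AB}\sqrt{\rho_A}\right) + (1-\lambda)\,\boldsymbol{D}\!\left(\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A}\,\middle\|\,\sqrt{\rho_A}\,\Gamma^{\mathcal{M}_2}_{AB}\sqrt{\rho_A}\right) \tag{7.11.27}\\
&= \lambda f(\rho_A, \mathcal{M}_1) + (1-\lambda) f(\rho_A, \mathcal{M}_2). \tag{7.11.28}
\end{align}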
The equality in (7.11.14) follows from what was just shown and Sion’s minimax
theorem (Theorem 2.24). The equality in (7.11.15) follows from (7.11.12) and
Proposition 7.82. ■
The generalized channel divergence takes a simple form if the channel N and
the completely positive map M both happen to be jointly covariant with respect to
a group, as shown in the following proposition.
where $\rho^{\mathcal{N}}_{AB}$ and $\rho^{\mathcal{M}}_{AB}$ are the Choi states of N and M, respectively.
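where the last equality follows from the fact that every generalized divergence is
isometrically invariant (recall Proposition 7.16). Now, let us apply the dephasing
map $X \mapsto \sum_{g\in G} |g\rangle\langle g| X |g\rangle\langle g|$ to the 𝑅′ system. Since this map is a channel, by the
data-processing inequality for the generalized divergence, we obtain
\begin{align}
&\boldsymbol{D}\!\left(\mathcal{N}_{A\to B}(\psi^{\overline{\rho}}_{R'A'A})\,\middle\|\,\mathcal{M}_{A\to B}(\psi^{\overline{\rho}}_{R'A'A})\right) \notag\\
&\ge \boldsymbol{D}\!\left(\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes (\mathcal{N}_{A\to B}\circ\mathcal{U}^g_A)(\psi_{A'A})\,\middle\|\,\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes (\mathcal{M}_{A\to B}\circ\mathcal{U}^g_A)(\psi_{A'A})\right). \tag{7.11.40}
\end{align}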
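Then, because generalized divergences are invariant under unitaries, we can apply
the unitary channel given by the unitary $\sum_{g\in G} |g\rangle\langle g| \otimes V^{g\dagger}_B$ at the output of N and
M to obtain
\begin{align}
&\boldsymbol{D}\!\left(\mathcal{N}_{A\to B}(\psi^{\overline{\rho}}_{R'A'A})\,\middle\|\,\mathcal{M}_{A\to B}(\psi^{\overline{\rho}}_{R'A'A})\right) \notag\\
&\ge \boldsymbol{D}\!\left(\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes ((\mathcal{V}^g_B)^\dagger\circ\mathcal{N}_{A\to B}\circ\mathcal{U}^g_A)(\psi_{A'A})\,\middle\|\,\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes ((\mathcal{V}^g_B)^\dagger\circ\mathcal{M}_{A\to B}\circ\mathcal{U}^g_A)(\psi_{A'A})\right). \tag{7.11.41}
\end{align}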
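Finally, since the group-covariance of N and M with respect to the representations
$\{U^g_A\}_{g\in G}$ and $\{V^g_B\}_{g\in G}$ implies that
$$(\mathcal{V}^g_B)^\dagger\circ\mathcal{N}\circ\mathcal{U}^g_A = \mathcal{N}, \qquad (\mathcal{V}^g_B)^\dagger\circ\mathcal{M}\circ\mathcal{U}^g_A = \mathcal{M} \tag{7.11.42}$$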
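for all 𝑔 ∈ 𝐺, and since from Proposition 7.16 generalized divergences are invariant
under tensoring with the same state in both arguments, we obtain
\begin{align}
&\boldsymbol{D}\!\left(\mathcal{N}_{A\to B}(\phi^{\overline{\rho}}_{RA})\,\middle\|\,\mathcal{M}_{A\to B}(\phi^{\overline{\rho}}_{RA})\right) \tag{7.11.43}\\
&= \boldsymbol{D}\!\left(\mathcal{N}_{A\to B}(\psi^{\overline{\rho}}_{R'A'A})\,\middle\|\,\mathcal{M}_{A\to B}(\psi^{\overline{\rho}}_{R'A'A})\right) \tag{7.11.44}\\
&\ge \boldsymbol{D}\!\left(\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes \mathcal{N}_{A\to B}(\psi_{A'A})\,\middle\|\,\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes \mathcal{M}_{A\to B}(\psi_{A'A})\right) \tag{7.11.45}\\
&= \boldsymbol{D}\!\left(\mathcal{N}_{A\to B}(\psi_{A'A})\,\middle\|\,\mathcal{M}_{A\to B}(\psi_{A'A})\right), \tag{7.11.46}
\end{align}
which is precisely (7.11.30). By definition, the pure state $\phi^{\overline{\rho}}_{RA}$ is such that its
reduced state on 𝐴 is invariant under the channel T𝐺 . Therefore, optimizing over
all such pure states, we obtain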
$$\boldsymbol{D}(\rho^{\mathcal{N}}_{RB}\,\|\,\rho^{\mathcal{M}}_{RB}) \ge \boldsymbol{D}(\mathcal{N}_{A\to B}(\psi_{A'A})\,\|\,\mathcal{M}_{A\to B}(\psi_{A'A})) \tag{7.11.53}$$
The generalized mutual information 𝑰(N) and the generalized coherent information
𝑰 𝑐 (N) are both defined via an optimization over pure states. It is straightforward to
show that this restriction is without loss of generality, i.e., that it suffices to
optimize over pure states for both quantities. The argument
is similar to that in (7.11.4)–(7.11.7) above.
For the generalized mutual information of a covariant channel, we can prove a
$$\overline{\rho}_A = \frac{1}{|G|}\sum_{g\in G} U^g_A\,\rho_A\,U^{g\dagger}_A =: \mathcal{T}_G(\rho_A), \tag{7.11.58}$$
$\rho_A = \psi_A = \operatorname{Tr}_R[\psi_{RA}]$, and $\phi^{\overline{\rho}}_{RA}$ is a purification of $\overline{\rho}_A$. Consequently,
Proof: The proof is similar to the proof of Proposition 7.84 and uses some steps
therein.
The inequality
holds simply by restricting the optimization in the definition of 𝑰(N) to pure states
𝜙 𝑅 𝐴 whose reduced states 𝜙 𝐴 are invariant under the channel T𝐺 . The remainder
of the proof is devoted to showing that the reverse inequality holds as well.
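\begin{align}
&\ge \boldsymbol{D}\!\left(\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes \mathcal{N}_{A\to B}(\psi_{RA})\,\middle\|\,\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes \psi_R \otimes V^{g\dagger}_B\sigma_B V^g_B\right) \tag{7.11.63}\\
&\ge \boldsymbol{D}(\mathcal{N}_{A\to B}(\psi_{RA})\,\|\,\psi_R\otimes\tau_B), \tag{7.11.64}
\end{align}
where the last line follows from the data-processing inequality under the partial-trace
channel $\operatorname{Tr}_{R'}$ and we let $\tau_B := \frac{1}{|G|}\sum_{g\in G} V^{g\dagger}_B\sigma_B V^g_B$. By taking the infimum over all
states 𝜏𝐵 on the right-hand side of the inequality above, we find that
\begin{align}
\boldsymbol{D}(\mathcal{N}_{A\to B}(\phi^{\overline{\rho}}_{RA})\,\|\,\phi^{\overline{\rho}}_R\otimes\sigma_B) &\ge \boldsymbol{D}(\mathcal{N}_{A\to B}(\psi_{RA})\,\|\,\psi_R\otimes\tau_B) \tag{7.11.65}\\
&\ge \inf_{\tau_B} \boldsymbol{D}(\mathcal{N}_{A\to B}(\psi_{RA})\,\|\,\psi_R\otimes\tau_B) \tag{7.11.66}\\
&= \boldsymbol{I}(R;B)_\omega, \tag{7.11.67}
\end{align}
where $\omega_{RB} = \mathcal{N}_{A\to B}(\psi_{RA})$. The inequality above holds for all states 𝜓𝑅𝐴 and all
states 𝜎𝐵 . Therefore, optimizing over all states 𝜎𝐵 on the left-hand side of the
above inequality leads to
$$\inf_{\sigma_B} \boldsymbol{D}(\mathcal{N}_{A\to B}(\phi^{\overline{\rho}}_{RA})\,\|\,\phi^{\overline{\rho}}_R\otimes\sigma_B) = \boldsymbol{I}(R;B)_{\overline{\omega}} \ge \boldsymbol{I}(R;B)_\omega, \tag{7.11.68}$$
where $\overline{\omega}_{RB} = \mathcal{N}_{A\to B}(\phi^{\overline{\rho}}_{RA})$. Thus, we conclude (7.11.57).
Next, by construction, the state $\phi^{\overline{\rho}}_{RA}$ is such that its reduced state on 𝐴 is invariant
under the channel T𝐺 . Optimizing over all such states leads to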
Since this inequality holds for all pure states 𝜓 𝑅 𝐴 , we finally obtain
as required.
To prove (7.11.60), note that if $\{U^g_A\}_{g\in G}$ is irreducible, then for every state
𝜓𝑅𝐴 , the state 𝜌𝐴 = 𝜓𝐴 satisfies $\overline{\rho}_A = \mathcal{T}_G(\rho_A) = \mathbb{1}_A/d_A$. Then, since the maximally
entangled state is a purification of the maximally mixed state, we let $\phi^{\overline{\rho}}_{RA} = \Phi_{RA}$,
which implies via (7.11.68) that
The reverse inequality holds simply by restricting the optimization in the definition
of 𝑰(N) to the maximally entangled state Φ 𝑅 𝐴 . We thus have (7.11.60), as
required. ■
Proposition 7.87
Let N be a quantum channel. To compute its generalized Holevo information
𝝌(N), as defined in (7.11.56), it suffices to optimize over ensembles consisting
of pure states. If the underlying generalized divergence is continuous, then
no more than 𝑑² pure states are needed for the optimization, where 𝑑 is the
dimension of the input space of N.
$$\boldsymbol{\chi}(\mathcal{N}) = \sup_{\rho_{XA}} \boldsymbol{I}(X;B)_{\mathcal{N}_{A\to B}(\rho_{XA})} \ge \sup_{\tau_{ZA}} \boldsymbol{I}(Z;B)_{\mathcal{N}_{A\to B}(\tau_{ZA})}, \tag{7.11.74}$$
Therefore,
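$$\boldsymbol{\chi}(\mathcal{N}) = \sup_{\rho_{XA}} \boldsymbol{I}(X;B)_{\mathcal{N}_{A\to B}(\rho_{XA})} \le \sup_{\tau_{ZA}} \boldsymbol{I}(Z;B)_{\mathcal{N}_{A\to B}(\tau_{ZA})} \tag{7.11.85}$$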
as required.
When the underlying generalized divergence is continuous, the fact that the
alphabet X of the classical–quantum states 𝜏𝑍𝐴 need not have more than 𝑑² elements is due
to the Fenchel–Eggleston–Carathéodory Theorem (Theorem 2.23) and the fact that the
dimension of the space of density operators on a 𝑑-dimensional space is 𝑑². ■
The information measures for channels on which we primarily focus in this book
are those based on the following generalized divergences: the quantum relative
entropy, the Petz–Rényi relative entropy, the sandwiched Rényi relative entropy,
and the hypothesis testing relative entropy. Specifically, given a channel N 𝐴→𝐵 , we
are interested in the following mutual information quantities. In each case, 𝜓 𝑅 𝐴
is a pure state with the dimension of the system 𝑅 the same as that of 𝐴, the state
𝜔 𝑅𝐵 = N 𝐴→𝐵 (𝜓 𝑅 𝐴 ), and 𝜎𝐵 is a state.
1. 𝜀-hypothesis testing mutual information of N, defined for 𝜀 ∈ [0, 1] as
where
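$$I^\varepsilon_H(A;B)_\rho := \inf_{\sigma_B} D^\varepsilon_H(\rho_{AB}\,\|\,\rho_A\otimes\sigma_B) \tag{7.11.88}$$
where
$$I_\alpha(A;B)_\rho := \inf_{\sigma_B} D_\alpha(\rho_{AB}\,\|\,\rho_A\otimes\sigma_B) \tag{7.11.90}$$
$$\widetilde{I}_\alpha(\mathcal{N}) := \sup_{\psi_{RA}} \widetilde{I}_\alpha(R;B)_\omega \quad \forall\,\alpha\in[1/2,1)\cup(1,\infty), \tag{7.11.91}$$
where
$$\widetilde{I}_\alpha(A;B)_\rho := \inf_{\sigma_B} \widetilde{D}_\alpha(\rho_{AB}\,\|\,\rho_A\otimes\sigma_B) \tag{7.11.92}$$
$$\widetilde{\chi}_\alpha(\mathcal{N}) := \sup_{\rho_{XA}} \widetilde{I}_\alpha(X;B)_\omega \quad \forall\,\alpha\in[1/2,1)\cup(1,\infty). \tag{7.11.95}$$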
where
$$I^\varepsilon_H(A\rangle B)_\rho := \inf_{\sigma_B} D^\varepsilon_H(\rho_{AB}\,\|\,\mathbb{1}_A\otimes\sigma_B) \tag{7.11.97}$$
is the 𝜀-hypothesis testing coherent information of the bipartite state 𝜌𝐴𝐵 .
2. Petz–Rényi coherent information of N:
$$I^c_\alpha(\mathcal{N}) := \sup_{\psi_{RA}} I_\alpha(R\rangle B)_\omega \quad \forall\,\alpha\in[0,1)\cup(1,2], \tag{7.11.98}$$
where
$$I_\alpha(A\rangle B)_\rho := \inf_{\sigma_B} D_\alpha(\rho_{AB}\,\|\,\mathbb{1}_A\otimes\sigma_B) \tag{7.11.99}$$
is the Petz–Rényi coherent information of the bipartite state 𝜌𝐴𝐵 .
3. Sandwiched Rényi coherent information of N:
$$\widetilde{I}^c_\alpha(\mathcal{N}) := \sup_{\psi_{RA}} \widetilde{I}_\alpha(R\rangle B)_\omega \quad \forall\,\alpha\in[1/2,1)\cup(1,\infty), \tag{7.11.100}$$
where
$$\widetilde{I}_\alpha(A\rangle B)_\rho := \inf_{\sigma_B} \widetilde{D}_\alpha(\rho_{AB}\,\|\,\mathbb{1}_A\otimes\sigma_B) \tag{7.11.101}$$
is the sandwiched Rényi coherent information of the bipartite state 𝜌𝐴𝐵 .
For all of the quantities defined above, we define the corresponding quantities
based on the quantum relative entropy by taking the limit 𝛼 → 1. The key such
quantities of interest in this book are the following:
1. Mutual information of N, denoted by 𝐼 (N) and defined as
$$I(\mathcal{N}) := \sup_{\psi_{RA}} I(R;B)_\omega, \tag{7.11.102}$$
In this section, we provide some simplified formulas for the Petz–Rényi information
quantities for general bipartite states and for all Rényi information quantities for
pure bipartite states.
so that
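$$\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}} = \operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}\right]\cdot\omega_B(\alpha) \tag{7.11.117}$$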
where the inequality follows because 𝐷 𝛼 (𝜔 𝐵 (𝛼)∥𝜎𝐵 ) ≥ 0 for all states. Consider
that
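\begin{align}
Q_\alpha(\rho_{AB}\,\|\,\tau_A\otimes\sigma_B) &= \operatorname{Tr}[\rho^\alpha_{AB}(\tau_A\otimes\sigma_B)^{1-\alpha}] \tag{7.11.120}\\
&= \operatorname{Tr}[\rho^\alpha_{AB}(\tau_A^{1-\alpha}\otimes\sigma_B^{1-\alpha})] \tag{7.11.121}
\end{align}
Applying the function $(\cdot)\mapsto\frac{1}{\alpha-1}\log_2(\cdot)$ to both sides, we conclude that
$$D_\alpha(\rho_{AB}\,\|\,\tau_A\otimes\sigma_B) = \frac{\alpha}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}\right] + D_\alpha(\omega_B(\alpha)\,\|\,\sigma_B). \tag{7.11.126}$$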
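Now consider that
\begin{align}
&Q_\alpha(\rho_{AB}\,\|\,\tau_A\otimes\omega_B(\alpha)) \notag\\
&= \operatorname{Tr}[\rho^\alpha_{AB}(\tau_A^{1-\alpha}\otimes\omega_B(\alpha)^{1-\alpha})] \tag{7.11.127}\\
&= \operatorname{Tr}[\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\,\omega_B(\alpha)^{1-\alpha}] \tag{7.11.128}\\
&= \operatorname{Tr}\!\left[\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\left(\frac{\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}}{\operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}\right]}\right)^{1-\alpha}\right] \tag{7.11.129}\\
&= \left(\operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}\right]\right)^{\alpha-1}\operatorname{Tr}\!\left[\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1-\alpha}{\alpha}}\right] \tag{7.11.130}\\
&= \left(\operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}\right]\right)^{\alpha-1}\operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}\right] \tag{7.11.131}\\
&= \left(\operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}\right]\right)^{\alpha}. \tag{7.11.132}
\end{align}
Now applying the function $(\cdot)\mapsto\frac{1}{\alpha-1}\log_2(\cdot)$ to both sides, we conclude that
$$D_\alpha(\rho_{AB}\,\|\,\tau_A\otimes\omega_B(\alpha)) = \frac{\alpha}{\alpha-1}\log_2\operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\rho^\alpha_{AB}\tau_A^{1-\alpha}]\right)^{\frac{1}{\alpha}}\right]. \tag{7.11.133}$$
So this establishes (7.11.118). We then conclude from (7.11.119) that
$$\inf_{\sigma_B} D_\alpha(\rho_{AB}\,\|\,\tau_A\otimes\sigma_B) = D_\alpha(\rho_{AB}\,\|\,\tau_A\otimes\omega_B(\alpha)), \tag{7.11.134}$$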
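As a sanity check of (7.11.133) and (7.11.134), one can specialize to commuting states, where 𝜌𝐴𝐵 becomes a joint probability table and 𝜏𝐴 , 𝜎𝐵 become probability vectors. The sketch below covers only that classical special case; the numbers and the function names are illustrative assumptions.

```python
import math


def petz_renyi(p, q, alpha):
    """Petz-Renyi relative entropy (base 2) of two probability vectors."""
    s = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log2(s) / (alpha - 1)


def optimal_marginal(joint, tau, alpha):
    """Classical analogue of omega_B(alpha), plus its normalization constant."""
    weights = [
        sum(joint[a][b] ** alpha * tau[a] ** (1 - alpha) for a in range(len(tau))) ** (1 / alpha)
        for b in range(len(joint[0]))
    ]
    total = sum(weights)
    return [w / total for w in weights], total


def divergence_to_product(joint, tau, sigma, alpha):
    """D_alpha(joint || tau x sigma) for a classical joint distribution."""
    flat_p = [joint[a][b] for a in range(len(tau)) for b in range(len(sigma))]
    flat_q = [tau[a] * sigma[b] for a in range(len(tau)) for b in range(len(sigma))]
    return petz_renyi(flat_p, flat_q, alpha)


joint = [[0.4, 0.1], [0.2, 0.3]]   # joint[a][b]
tau = [0.5, 0.5]
alpha = 1.5
omega, total = optimal_marginal(joint, tau, alpha)
closed_form = alpha / (alpha - 1) * math.log2(total)           # analogue of (7.11.133)
value_at_omega = divergence_to_product(joint, tau, omega, alpha)
others = [divergence_to_product(joint, tau, s, alpha)
          for s in ([0.5, 0.5], [0.3, 0.7], [0.9, 0.1])]
```

In this example the divergence evaluated at the candidate optimizer matches the closed form, and no other tested 𝜎𝐵 gives a smaller value.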
$$\widetilde{I}_\alpha(A;B)_\psi = 2H_{\frac{1}{2\alpha-1}}(A)_\psi, \tag{7.11.137}$$
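\begin{align}
&= \operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\Gamma_{AB}\,\psi_A^{\frac{1}{2}}\psi_A^{1-\alpha}\psi_A^{\frac{1}{2}}]\right)^{\frac{1}{\alpha}}\right] \tag{7.11.144}\\
&= \operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\Gamma_{AB}\,\psi_A^{2-\alpha}]\right)^{\frac{1}{\alpha}}\right] \tag{7.11.145}\\
&= \operatorname{Tr}\!\left[\left(\operatorname{Tr}_A[\Gamma_{AB}\,(\psi_B^{T})^{2-\alpha}]\right)^{\frac{1}{\alpha}}\right] = \operatorname{Tr}\!\left[\left((\psi_B^{T})^{2-\alpha}\right)^{\frac{1}{\alpha}}\right] \tag{7.11.146}\\
&= \operatorname{Tr}\!\left[\psi_B^{\frac{2-\alpha}{\alpha}}\right] = \operatorname{Tr}\!\left[\psi_A^{\frac{2-\alpha}{\alpha}}\right]. \tag{7.11.147}
\end{align}
The fourth equality follows from cyclicity of partial trace and the sixth follows from
the transpose trick in (2.2.40). Rearranging the first and last lines gives
\begin{align}
I_\alpha(A;B)_\psi &= \frac{\alpha}{\alpha-1}\log_2\operatorname{Tr}\!\left[\psi_A^{\frac{2-\alpha}{\alpha}}\right] \tag{7.11.148}\\
&= 2\left(\frac{1}{1-\frac{2-\alpha}{\alpha}}\log_2\operatorname{Tr}\!\left[\psi_A^{\frac{2-\alpha}{\alpha}}\right]\right) \tag{7.11.149}\\
&= 2H_{\frac{2-\alpha}{\alpha}}(A)_\psi. \tag{7.11.150}
\end{align}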
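The step from (7.11.148) to (7.11.150) is a purely scalar identity in the eigenvalues of 𝜓𝐴 , so it can be confirmed with a few lines of Python. The sketch below assumes the eigenvalues are supplied directly as a probability vector; the helper names are hypothetical.

```python
import math


def renyi_entropy(probs, order):
    """Renyi entropy H_order (base 2) of a probability vector, order != 1."""
    return math.log2(sum(x ** order for x in probs if x > 0)) / (1 - order)


def petz_mutual_information_pure(probs, alpha):
    """Right-hand side of (7.11.148) for a pure state with reduced-state eigenvalues probs."""
    beta = (2 - alpha) / alpha
    return alpha / (alpha - 1) * math.log2(sum(x ** beta for x in probs if x > 0))


eigs = [0.5, 0.3, 0.2]   # illustrative eigenvalues of psi_A
checks = []
for alpha in (0.5, 1.5):
    lhs = petz_mutual_information_pure(eigs, alpha)
    rhs = 2 * renyi_entropy(eigs, (2 - alpha) / alpha)
    checks.append((lhs, rhs))

# For a maximally entangled two-qubit state every Renyi entropy equals one bit.
bell_value = petz_mutual_information_pure([0.5, 0.5], 1.5)
```

The value 𝛼 = 1 is excluded because the Rényi order (2 − 𝛼)/𝛼 then equals one, where the formula is defined only as a limit.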
$$= \left(\langle\psi|_{AB}(\psi_A\otimes\sigma_B)^{\frac{1-\alpha}{\alpha}}|\psi\rangle_{AB}\right)^{\alpha}\operatorname{Tr}\!\left[|\psi\rangle\langle\psi|^{\alpha}_{AB}\right] \tag{7.11.156}$$
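\begin{align}
&= \left(\langle\psi|_{AB}(\psi_A\otimes\sigma_B)^{\frac{1-\alpha}{\alpha}}|\psi\rangle_{AB}\right)^{\alpha} \tag{7.11.157}\\
&= \left(\langle\Gamma|_{AB}\,\psi_A^{\frac{1}{2}}\left(\psi_A^{\frac{1-\alpha}{\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{\alpha}}\right)\psi_A^{\frac{1}{2}}|\Gamma\rangle_{AB}\right)^{\alpha} \tag{7.11.158}\\
&= \left(\langle\Gamma|_{AB}\,\psi_A^{\frac{1}{\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{\alpha}}|\Gamma\rangle_{AB}\right)^{\alpha} \tag{7.11.159}\\
&= \left(\langle\Gamma|_{AB}\,\psi_A^{\frac{1}{\alpha}}[\mathrm{T}_A(\sigma_A)]^{\frac{1-\alpha}{\alpha}}\otimes\mathbb{1}_B|\Gamma\rangle_{AB}\right)^{\alpha} \tag{7.11.160}\\
&= \left(\operatorname{Tr}\!\left[\psi_A^{\frac{1}{\alpha}}[\mathrm{T}_A(\sigma_A)]^{\frac{1-\alpha}{\alpha}}\right]\right)^{\alpha}. \tag{7.11.161}
\end{align}
Now applying the function $(\cdot)\mapsto\frac{1}{\alpha-1}\log_2(\cdot)$ to both sides, we conclude that
$$\widetilde{D}_\alpha(\psi_{AB}\,\|\,\psi_A\otimes\sigma_B) = \frac{\alpha}{\alpha-1}\log_2\operatorname{Tr}\!\left[\psi_A^{\frac{1}{\alpha}}[\mathrm{T}_A(\sigma_A)]^{\frac{1-\alpha}{\alpha}}\right], \tag{7.11.162}$$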
and applying Proposition 2.8, we conclude that
We prove (7.11.138). Let 𝜎𝐵 be a state with the same support as 𝜓 𝐵 . Recall the
formula in Proposition 7.43 for the geometric Rényi relative entropy when the state
𝜌 is pure. We use this to conclude that
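\begin{align}
\widehat{D}_\alpha(\psi_{AB}\,\|\,\psi_A\otimes\sigma_B) &= \log_2\langle\psi|_{AB}(\psi_A\otimes\sigma_B)^{-1}|\psi\rangle_{AB} \tag{7.11.186}\\
&= \log_2\langle\psi|_{AB}\,\psi_A^{-1}\otimes\sigma_B^{-1}|\psi\rangle_{AB} \tag{7.11.187}\\
&= \log_2\langle\Gamma|_{AB}\,\psi_A^{\frac{1}{2}}\left(\psi_A^{-1}\otimes\sigma_B^{-1}\right)\psi_A^{\frac{1}{2}}|\Gamma\rangle_{AB} \tag{7.11.188}\\
&= \log_2\langle\Gamma|_{AB}\,\mathbb{1}_A\otimes\sigma_B^{-1}|\Gamma\rangle_{AB} \tag{7.11.189}\\
&= \log_2\operatorname{Tr}[\sigma_B^{-1}]. \tag{7.11.190}
\end{align}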
Now consider that the infimum $\inf_{\sigma_B}\operatorname{Tr}[\sigma_B^{-1}]$ is attained when 𝜎𝐵 is the
maximally mixed state 𝜋𝐵 . This follows from using the Lagrange multiplier method
(or alternatively $\inf_{\sigma_B}\operatorname{Tr}[\sigma_B^{-1}]$ can be evaluated as $d_B^2$ by applying Proposition 2.8
again, with an implicit identity operator acting on the support of 𝜓𝐵 ). We then
conclude that
\begin{align}
\widehat{I}_\alpha(A;B)_\psi &= \inf_{\sigma_B}\log_2\operatorname{Tr}[\sigma_B^{-1}] = \log_2\operatorname{Tr}[\pi_B^{-1}] \tag{7.11.191}\\
&= 2\log_2\operatorname{rank}(\psi_A) = 2H_0(A)_\psi. \tag{7.11.192}
\end{align}
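The claim that the maximally mixed state minimizes Tr[𝜎𝐵⁻¹] can be probed numerically. Since Tr[𝜎𝐵⁻¹] depends only on the eigenvalues of 𝜎𝐵 , the sketch below samples random eigenvalue vectors; the dimension, the seed, and the helper names are arbitrary illustrative choices.

```python
import math
import random


def trace_inverse(eigenvalues):
    """Tr[sigma^{-1}] for a full-rank state with the given eigenvalues."""
    return sum(1.0 / s for s in eigenvalues)


def random_distribution(d, rng):
    """Random strictly positive probability vector of length d."""
    weights = [rng.random() + 1e-3 for _ in range(d)]
    total = sum(weights)
    return [w / total for w in weights]


d = 3
rng = random.Random(0)
value_at_uniform = trace_inverse([1.0 / d] * d)   # equals d**2
samples = [trace_inverse(random_distribution(d, rng)) for _ in range(1000)]
mutual_information = math.log2(value_at_uniform)  # equals 2 * log2(d)
```

None of the sampled states falls below 𝑑², in line with the inequality between arithmetic and harmonic means.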
Observe that all of the generalized information measures for quantum channels given
in Definition 7.85, as well as all of the channel information measures given above
for specific generalized divergences, are defined in a common manner. Specifically,
given a function 𝑓 : D(H 𝐴𝐵 ) → R for bipartite states, the corresponding function
𝑓 for quantum channels⁴ is defined as
where N 𝐴→𝐵 and M 𝐴→𝐵 are quantum channels and 𝜌 𝑅 𝐴 is a quantum state,
with the dimension of 𝑅 unbounded. In other words, we define the channel
quantity by evaluating the state quantity on the states N 𝐴→𝐵 (𝜌 𝑅 𝐴 ) and M 𝐴→𝐵 (𝜌 𝑅 𝐴 )
and optimizing over all states 𝜌 𝑅 𝐴 . We have already seen this principle being
used in Chapter 6 to define the diamond distance (Definition 6.18) and fidelity
(Definition 6.23) of two quantum channels, in which case the state quantity 𝑓 is the
trace distance or fidelity.
In both (7.11.201) and (7.11.202), properties of the underlying state quantity
(namely, the data-processing inequality), as well as the Schmidt decomposition
4 In a slight abuse of notation, we use the same letter to denote the channel quantity as the state
quantity.
5 We allow for the second argument to be a positive semi-definite operator more generally.
where the optimization is now only over states 𝜌 𝐴 for the input system 𝐴 for the
channel N, and
$$f(\rho_A, \mathcal{N}_{A\to B}) := f(A;B)_\omega, \qquad \omega_{AB} = \sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A}. \tag{7.11.204}$$
This holds due to (2.2.38), which states that for every purification $|\psi^\rho\rangle_{AA'}$ of 𝜌𝐴
there exists an operator 𝑌𝐴 such that $(Y_A\otimes\mathbb{1}_{A'})|\Gamma\rangle_{AA'} = |\psi^\rho\rangle_{AA'}$. Then, by the
polar decomposition (Theorem 2.3), and the fact that $Y_A Y_A^\dagger = \rho_A$, it holds that
$Y_A = U_A\sqrt{\rho_A}$ for some unitary 𝑈𝐴 . Finally, using the definition of the Choi
representation $\Gamma^{\mathcal{N}}_{AB}$ and the unitary invariance of 𝑓 , we obtain (7.11.203). This
equivalent formulation of the channel quantity 𝑓 (N) has been used in (7.11.113)
for the coherent information of a channel.
If the underlying state quantity in (7.11.202) is unitarily invariant, then by
using the same reasoning as above we can write 𝑓 (N, M) in a form analogous to
(7.11.203):
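$$f(\mathcal{N}, \mathcal{M}) = \sup_{\rho_A} f\!\left(\sqrt{\rho_A}\,\Gamma^{\mathcal{N}}_{AB}\sqrt{\rho_A},\ \sqrt{\rho_A}\,\Gamma^{\mathcal{M}}_{AB}\sqrt{\rho_A}\right). \tag{7.11.205}$$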
7.12 Summary
In this chapter, we studied various entropic quantities, starting with quantum
relative entropy. We proved many of its most important properties, and we saw
that it acts as a parent quantity for well-known quantities such as von Neumann
entropy, quantum conditional entropy, quantum mutual information and conditional
mutual information, and coherent information. We then studied the Petz–Rényi,
sandwiched Rényi, geometric Rényi, and hypothesis testing relative entropies, and
we proved many of their most important properties.
The unifying concept of this chapter is that of generalized divergence. A
generalized divergence is a function 𝑫 : D(H) × L+ (H) → R that satisfies the
Matthews and Wehner (2014). Proposition 7.71 was established by Cooney et al.
(2016). Proposition 7.72 is essentially due to Hayashi (2007), with a refinement by
Audenaert et al. (2012) and a later rediscovery of it, formulated in a different way,
by Qi et al. (2018b). Proposition 7.80 is essentially due to Hayashi (2007).
The generalized channel divergence of Definition 7.81 was proposed by Leditzky
et al. (2018), and Proposition 7.84 was established as well by Leditzky et al. (2018).
The various generalized channel information measures can be found in the papers
of Wilde et al. (2014); Gupta and Wilde (2015), and the related channel information
measures based on hypothesis testing, Petz–Rényi, and sandwiched Rényi relative
entropy are from Koenig and Wehner (2009); Sharma and Warsi (2013); Wilde
et al. (2014); Gupta and Wilde (2015); Datta et al. (2016). The channel mutual
information was defined by Adami and Cerf (1997), the channel Holevo information
by Schumacher and Westmoreland (1997) (based on the Holevo quantity for
ensembles Holevo (1973)), and the channel coherent information by Lloyd (1997).
These papers together thus developed a general concept of promoting a measure
of correlations in a quantum state to a measure of a channel’s ability to create the
same correlations, by optimizing the state measure with respect to all (or a subset)
of the states that can be generated by means of the channel.
The review by Ruskai (2002) is helpful not only for understanding entropy
inequalities in quantum information, but also for understanding the history of
developments with respect to quantum entropy and information. The book of
Tomamichel (2015) provides an exposition of Rényi relative entropies and their
properties (see also Leditzky (2016)). The book of Wilde (2017a) provides an
overview of entropies in the von Neumann family, their properties, and the derived
channel information measures.
Chapter 8
Chapter 9
Entanglement Measures
In the previous chapter, we laid the foundation for analyzing quantum communication
protocols by defining entropic quantities, such as the Petz– and sandwiched Rényi
relative entropies, as well as information measures for quantum states and channels
derived from these relative entropies. We now use these information measures to
define entanglement measures for quantum states and channels. Given quantum
systems 𝐴 and 𝐵, an entanglement measure is a function 𝐸 : D(H 𝐴𝐵 ) → R that
quantifies the amount of entanglement present in a state 𝜌 𝐴𝐵 of these systems.
The notion of “quantifying entanglement” is explained in Section 9.1 below,
with the defining requirement of an entanglement measure being that it does not
increase under channels realized by local operations and classical communication
(Definition 4.22). We can think of this requirement of “LOCC monotonicity” as
a restricted form of the data-processing inequality, but now applied to a single
bipartite state rather than to a pair of states. The data-processing inequality indicates
that the distinguishability of two states does not increase under the action of the
same quantum channel (Definition 7.15), whereas LOCC monotonicity indicates
that entanglement does not increase under the action of an LOCC channel on a
bipartite state.
Given an entanglement measure 𝐸 for states, the corresponding entanglement
measure for channels is defined using the general principle in Section 7.11.2, which
is to optimize the state measure with respect to all bipartite states that can be shared
between the sender and receiver of the channel by making use of the channel. We
develop entanglement measures for channels in Chapter 10, and these naturally
quantify how much entanglement can be generated by a channel connecting a
sender to a receiver.
for some finite alphabet X, probability distribution 𝑝 : X → [0, 1], and sets $\{\tau^x_A\}_{x\in\mathcal{X}}$
and $\{\omega^x_B\}_{x\in\mathcal{X}}$ of states. Determining whether a given quantum state 𝜌𝐴𝐵 is entangled
is a fundamental problem in quantum information theory. In Section 3.2.1, in the
discussion after Definition 3.5, we listed the following criteria for the entanglement
of pure and mixed states:
• Schmidt rank criterion: A pure bipartite state 𝜓 𝐴𝐵 is entangled if and only if
its Schmidt rank is strictly greater than one.
• PPT criterion: If a bipartite mixed state 𝜌𝐴𝐵 has negative partial transpose
(i.e., the partial transpose $\rho_{AB}^{\mathrm{T}_B}$ has at least one negative eigenvalue), then it is
entangled. If both systems 𝐴 and 𝐵 are qubit systems or if one of the systems
is a qubit and the other a qutrit, then 𝜌𝐴𝐵 is entangled if and only if $\rho_{AB}^{\mathrm{T}_B}$ has
at least one negative eigenvalue.
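For two qubits, the PPT criterion above can be turned into a small numerical test. The following is a minimal sketch, restricted by assumption to real symmetric 4×4 density matrices written in the basis order |𝑎𝑏⟩, with eigenvalues obtained from a textbook cyclic Jacobi iteration so that only the standard library is needed. All function names are illustrative.

```python
import math


def partial_transpose_B(rho):
    """Partial transpose on the second qubit of a 4x4 matrix (basis order |a b>)."""
    pt = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for a2 in range(2):
                for b2 in range(2):
                    pt[2 * a + b][2 * a2 + b2] = rho[2 * a + b2][2 * a2 + b]
    return pt


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def is_entangled_two_qubits(rho, tol=1e-12):
    """PPT test: True if the partial transpose has a negative eigenvalue."""
    return symmetric_eigenvalues(partial_transpose_B(rho))[0] < -tol


bell = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
maximally_mixed = [[0.25 if i == j else 0.0 for j in range(4)] for i in range(4)]


def noisy_bell(p):
    """Mixture p * bell + (1 - p) * maximally_mixed (an illustrative family)."""
    return [[p * bell[i][j] + (1 - p) * maximally_mixed[i][j] for j in range(4)]
            for i in range(4)]
```

For the mixture `noisy_bell(p)`, the partial transpose has smallest eigenvalue (1 − 3𝑝)/4, so the test reports entanglement exactly when 𝑝 > 1/3.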
In the case of mixed states, there is generally not a simple necessary and sufficient
criterion to determine whether a given bipartite state is entangled, and in fact it is
known that it is computationally difficult, in a precise sense, to decide if a state is
entangled (please consult the Bibliographic Notes in Section 9.6).
In addition to determining whether or not a given quantum state is entangled, we
are interested in quantifying the amount of entanglement present in a quantum state.
Doing so allows us to compare quantum states based on the amount of entanglement
present in them. An entanglement measure is a function 𝐸 : D(H 𝐴𝐵 ) → R from
the set of density operators acting on the Hilbert space of a bipartite system to
the set of real numbers, and it quantifies the entanglement of a quantum state
𝜌 𝐴𝐵 ∈ D(H 𝐴𝐵 ). (The formal definition of an entanglement measure is given in
Definition 9.1.) To indicate the partitioning of the subsystems explicitly, we often
write 𝐸 ( 𝐴; 𝐵) 𝜌 instead of 𝐸 (𝜌 𝐴𝐵 ).
How exactly do we quantify entanglement? Suppose that we have a bipartite
state 𝜌 𝐴𝐵 and we would like to quantify the entanglement between the systems
where $\omega_{A'B'} := \mathcal{L}_{AB\to A'B'}(\rho_{AB})$.
for every separable state $\sigma_{A'B'}$. Now, given another separable state $\sigma'_{A''B''}$, it
is possible to transform between $\sigma_{A'B'}$ and $\sigma'_{A''B''}$ using LOCC, meaning that
$E(A';B')_\sigma \ge E(A'';B'')_{\sigma'}$ and $E(A';B')_\sigma \le E(A'';B'')_{\sigma'}$. Therefore,
$$E(A';B')_\sigma = E(A'';B'')_{\sigma'}, \tag{9.1.4}$$
for all separable states $\sigma_{AB}$ and $\sigma'_{A'B'}$. As a consequence, an entanglement measure
𝐸 takes on its minimum value and is equal to a constant 𝑐 ∈ R for all separable
states. It is often convenient and simpler if an entanglement measure 𝐸 is equal to
zero for all separable states. If this is not the case, then we can simply redefine
the entanglement measure as 𝐸 ′ ( 𝐴; 𝐵) 𝜌 = 𝐸 ( 𝐴; 𝐵) 𝜌 − 𝑐. By this reasoning and
adjustment (if needed), every entanglement measure (as per Definition 9.1) satisfies
the following two properties of non-negativity on all states and vanishing on
separable states:
1. Non-negativity: 𝐸 (𝜌 𝐴𝐵 ) ≥ 0 for every state 𝜌 𝐴𝐵 .
2. Vanishing for separable states: 𝐸 (𝜎𝐴𝐵 ) = 0 for every separable state 𝜎𝐴𝐵 .
Other properties that are desirable for an entanglement measure 𝐸 are as follows:
1. Faithfulness: 𝐸 (𝜎𝐴𝐵 ) = 0 if and only if 𝜎𝐴𝐵 is separable, so that 𝐸 (𝜌 𝐴𝐵 ) > 0
if and only if 𝜌 𝐴𝐵 is entangled.
2. Invariance under classical communication: For every finite alphabet X,
probability distribution 𝑝 : X → [0, 1] on X, and set $\{\rho^x_{AB}\}_{x\in\mathcal{X}}$ of states,
define the following classical–quantum state:
$$\rho_{XAB} := \sum_{x\in\mathcal{X}} p(x)\,|x\rangle\langle x|_X\otimes\rho^x_{AB}. \tag{9.1.5}$$
𝐸 ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 )𝜏⊗𝜔 ≤ 𝐸 ( 𝐴1 ; 𝐵1 )𝜏 + 𝐸 ( 𝐴2 ; 𝐵2 )𝜔 (9.1.9)
for all states 𝜏𝐴1 𝐵1 and 𝜔 𝐴2 𝐵2 , then the entanglement measure 𝐸 is subadditive.
5. Selective LOCC monotonicity: A property stronger than LOCC monotonicity
is that 𝐸 is non-increasing on average under an LOCC instrument. In more
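detail, let 𝜌𝐴𝐵 be a bipartite state, and let $\{\mathcal{L}^x_{AB\to A'B'}\}_{x\in\mathcal{X}}$ be a collection of
maps, such that $\mathcal{L}^{\leftrightarrow}_{AB\to A'B'}$ is an LOCC channel of the form:
$$\mathcal{L}^{\leftrightarrow}_{AB\to A'B'} = \sum_{x\in\mathcal{X}} \mathcal{L}^x_{AB\to A'B'}, \tag{9.1.10}$$
for some finite alphabet X and where each map $\mathcal{L}^x_{AB\to A'B'}$ is completely positive
such that the sum map $\mathcal{L}^{\leftrightarrow}_{AB\to A'B'}$ is trace preserving (i.e., a quantum channel).
Furthermore, each map $\mathcal{L}^x_{AB\to A'B'}$ can be written in the form of (4.6.52), as
follows:
$$\mathcal{L}^x_{AB\to A'B'} = \sum_{y\in\mathcal{Y}} \mathcal{E}^{x,y}_{A\to A'}\otimes\mathcal{F}^{x,y}_{B\to B'}, \tag{9.1.11}$$
where $\{\mathcal{E}^{x,y}_{A\to A'}\}_{x\in\mathcal{X}}$ and $\{\mathcal{F}^{x,y}_{B\to B'}\}_{x\in\mathcal{X}}$ are sets of completely positive maps. Set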
for every ensemble {( 𝑝(𝑥), 𝜔𝑥𝐴𝐵 )}𝑥∈X that arises from 𝜌 𝐴𝐵 via LOCC as
specified above. Selective LOCC monotonicity indicates that entanglement
does not increase on average under the action of LOCC. Many entanglement
measures satisfy this stronger property.
Observe that selective LOCC monotonicity in (9.1.14) implies LOCC mono-
tonicity in (9.1.2), simply because the alphabet X in (9.1.10) can consist of
only one letter.
The entanglement measures that we consider in this chapter satisfy many of the
properties listed above.
Given that we would like to quantify entanglement, it makes sense to ask what
the basic unit of entanglement should be. We take as our unit of entanglement the
two-qubit maximally entangled Bell state $|\Phi\rangle = \frac{1}{\sqrt{2}}(|0,0\rangle + |1,1\rangle)$, and we thus say
that the state |Φ⟩ represents “one ebit.” A maximally entangled state of Schmidt rank
𝑑 is then referred to as having “log2 𝑑 ebits.” All of the entanglement measures that
we consider in this chapter are equal to one for a two-qubit maximally entangled state,
which is another justification for using it as the unit of entanglement.1 Similarly, for
a maximally entangled state of Schmidt rank 𝑑, all of the entanglement measures
that we consider in this chapter are equal to log2 𝑑.
To close out this introductory section, we prove a lemma that helps to reduce
the difficulty in determining whether a given function is an entanglement measure.
Lemma 9.2
Let 𝐸 : 𝐷 (H 𝐴𝐵 ) → R be a function that, for every bipartite state 𝜌 𝐴𝐵 , is
1. invariant under classical communication, as defined in (9.1.6), and
2. obeys data processing under local channels, in the sense that
(9.1.11). In more detail, each such one-way LOCC channel (from Alice to Bob in
the case discussed below) has the following form:
$$\sum_{k,\ell} \mathcal{E}^{k,\ell}_{A\to A'}\otimes\mathcal{F}^{k,\ell}_{B\to B'}, \tag{9.1.20}$$
where $\{\mathcal{E}^{k,\ell}_A\}_{k,\ell}$ is a collection of completely positive maps such that the sum map
$\sum_{k,\ell}\mathcal{E}^{k,\ell}_A$ is trace preserving, and $\{\mathcal{F}^{k,\ell}_B\}_{k,\ell}$ is a collection of quantum channels. For
now and for simplicity, let us use the superindex 𝑚 := (𝑘, ℓ). We should think of the
classical information 𝑘 as that which is being kept and that in ℓ as that which is
being lost or let go. The one-way LOCC channel in (9.1.20) can be implemented
using the following steps:
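where
$$p(k) := \operatorname{Tr}\!\left[\sum_{\ell} (\mathcal{E}^{k,\ell}_{A\to A'}\otimes\mathcal{F}^{k,\ell}_{B\to B'})(\tau_{AB})\right], \tag{9.1.29}$$
$$\omega^k_{A'B'} := \frac{1}{p(k)}\sum_{\ell} (\mathcal{E}^{k,\ell}_{A\to A'}\otimes\mathcal{F}^{k,\ell}_{B\to B'})(\tau_{AB}). \tag{9.1.30}$$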
Bob could, if desired, finally discard the classical register 𝐾 𝐵 to implement the
one-way LOCC channel in (9.1.20). However, it is helpful to hold on to it for
our analysis below.
Now we analyze how the entanglement changes under each of these steps,
omitting the state subscripts at each step except for the first and last lines, because
those not shown are clear from the context:
\begin{align}
E(A;B)_\tau &\ge E(A'M_A;B) \tag{9.1.31}\\
&= E(A';BM_B) \tag{9.1.32}\\
&\ge E(A';B'M_B) \tag{9.1.33}\\
&= E(A';B'K_BL_B) \tag{9.1.34}\\
&\ge E(A';B'K_B) \tag{9.1.35}\\
&= \sum_k p(k)\,E(A';B')_{\omega^k}. \tag{9.1.36}
\end{align}
The first inequality follows from data processing under the local channel in
(9.1.21). The first equality follows from the assumption of invariance under classical
communication, i.e., invariance under the action of the classical channel in (9.1.22).
The second inequality follows from data processing under the local channel in
where the ensemble {( 𝑝(𝑘), 𝜔 𝑘𝐴′ 𝐵′ )} 𝑘 arises from the state 𝜏𝐴𝐵 by means of one-way
LOCC from Alice to Bob. By the same argument, but flipping the role of Alice and
Bob, it follows that selective one-way LOCC monotonicity from Bob to Alice holds
for the function 𝐸. Since every LOCC channel is built up as a serial concatenation
of one-way LOCC channels and since we have proven that selective monotonicity
holds for the function 𝐸 for each of them, it follows that 𝐸 obeys selective LOCC
monotonicity. ■
9.1.1 Examples
Let us now consider some examples of entanglement measures. The first two
entanglement measures that we consider are related to the Schmidt rank criterion
and the PPT criterion, respectively, stated in Section 3.2.1 and reiterated at the
beginning of this chapter. They are known as the entanglement of formation and the
log-negativity, respectively, and are some of the simplest and earliest entanglement
measures defined. They are also conceptually linked to more complex entanglement
measures like squashed entanglement and the Rains relative entropy, the latter of
which are the best known upper bounds on distillable entanglement (studied in
Chapter 13).
where $r$ is the Schmidt rank, $\lambda_k > 0$ are the Schmidt coefficients, and $\{|e_k\rangle_A\}_{k=1}^{r}$,
$\{|f_k\rangle_B\}_{k=1}^{r}$ are orthonormal sets. Observe that the reduced states $\psi_A \coloneqq \mathrm{Tr}_B[\psi_{AB}]$
and $\psi_B \coloneqq \mathrm{Tr}_A[\psi_{AB}]$ have the same non-zero eigenvalues, which means that their
entropies are equal, i.e., 𝐻 (𝜓 𝐴 ) = 𝐻 (𝜓 𝐵 ). Furthermore, 𝐻 (𝜓 𝐴 ) = 0 if and only
if 𝑟 = 1, and 𝑟 = 1 if and only if 𝜓 𝐴𝐵 is separable, by the Schmidt rank criterion.
Therefore, the entropy of the reduced state of a pure bipartite state provides us with
a signature of entanglement for pure bipartite states:
𝜓 𝐴𝐵 entangled ⇐⇒ 𝐻 (𝜓 𝐴 ) > 0. (9.1.39)
We let
\[ E_F(\psi_{AB}) \coloneqq H(\mathrm{Tr}_B[\psi_{AB}]) = -\sum_{k=1}^{r} \lambda_k \log_2 \lambda_k \tag{9.1.40} \]
for every pure state $\psi_{AB}$.
The function $E_F$ is an entanglement measure, as proven in Proposition 9.3
below. When evaluated on pure bipartite states as above, it is known as the entropy of
entanglement or entanglement entropy, and it is often simply denoted by $E(\psi_{AB})$
in the research literature. By (9.1.39), it is also a faithful entanglement measure
on pure states, i.e., $E_F(\psi_{AB}) = 0$ if and only if $\psi_{AB}$ is a separable state. Recall
from Section 3.2.3 that a maximally entangled state is defined by having $\lambda_k = \frac{1}{r}$
for all $1 \leq k \leq r$. For such states, $E_F(\psi_{AB}) = \log_2 r$, which justifies calling them
maximally entangled because $\log_2 r$ is the largest value of the quantum entropy for
states supported on an $r$-dimensional space.
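As a numerical illustration of (9.1.40), the following minimal sketch computes the entropy of entanglement of a pure state from its Schmidt coefficients. It assumes NumPy and that the state is given as a vector in the standard product basis; the function name `entropy_of_entanglement` and the cutoff `1e-12` are illustrative choices, not part of the text.

```python
import numpy as np

def entropy_of_entanglement(psi, d_a, d_b):
    """Entropy of entanglement (in bits) of a pure state vector on A ⊗ B."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    # Reshaping the vector into a d_a x d_b matrix and taking its singular
    # values gives the square roots of the Schmidt coefficients lambda_k.
    singular_values = np.linalg.svd(psi.reshape(d_a, d_b), compute_uv=False)
    lam = singular_values**2
    lam = lam[lam > 1e-12]  # drop zeros, using the convention 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

# Two-qubit examples: a maximally entangled state and a product state.
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
product = np.array([1, 0, 0, 0])
print(entropy_of_entanglement(bell, 2, 2))
print(entropy_of_entanglement(product, 2, 2))
```

For the maximally entangled two-qubit state the value should agree with $\log_2 r$ for $r = 2$, and for the product state it should vanish, up to floating-point error.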
So far, the entanglement measure $E_F$ in (9.1.40) has been defined only for pure
states. To extend the definition to mixed states, we use the
fact that a mixed state $\rho_{AB}$ can be decomposed into a convex combination of pure
states as follows:
\[ \rho_{AB} = \sum_{x\in\mathcal{X}} p(x)\,\psi^{x}_{AB}, \tag{9.1.41} \]
where $\mathcal{X}$ is a finite alphabet, $p : \mathcal{X} \to [0,1]$ is a probability distribution, and
$\{\psi^{x}_{AB}\}_{x\in\mathcal{X}}$ is a set of pure states. We can then measure the entanglement of
$\rho_{AB}$ by taking the expected entanglement of the pure states involved in the
decomposition of $\rho_{AB}$, i.e., by $\sum_{x\in\mathcal{X}} p(x) H(\psi^{x}_{A})$, where $\psi^{x}_{A} \coloneqq \mathrm{Tr}_B[\psi^{x}_{AB}]$. However,
this strategy can lead to different values for the entanglement of $\rho_{AB}$, depending on
the chosen decomposition, because the decomposition of mixed states into a convex
combination of pure states is generally not unique. We can address this issue by
minimizing over all possible decompositions, leading to the following definition:
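Let
\[ |\psi^{\rho}\rangle_{RAB} \coloneqq \sum_{x} \sqrt{p(x)}\, |x\rangle_R |\psi^{x}\rangle_{AB} \tag{9.1.48} \]
be a purification of $\rho_{AB}$, with $\rho_{AB} = \sum_x p(x)\psi^{x}_{AB}$ a pure-state decomposition of $\rho_{AB}$.
By applying Uhlmann’s theorem (Theorem 6.8), there exists a purification $\psi^{\sigma}_{RAB}$
of $\sigma_{AB}$ such that $F(\psi^{\rho}_{RAB}, \psi^{\sigma}_{RAB}) = F(\rho_{AB}, \sigma_{AB})$. By combining this observation
with (9.1.47), and the fact that the sine distance of two pure states is equal to the
normalized trace distance (see (6.1.1)), we conclude that
\[ \frac{1}{2} \left\| \psi^{\rho}_{RAB} - \psi^{\sigma}_{RAB} \right\|_1 \leq \sqrt{\varepsilon(2-\varepsilon)}. \tag{9.1.49} \]
We now apply the measurement channel $\mathcal{M}_{R\to X}(\cdot) \coloneqq \sum_x |x\rangle_X\langle x|_R (\cdot) |x\rangle_R\langle x|_X$ to
the $R$ systems, as well as the data-processing inequality for the trace distance, to
conclude that
\[ \frac{1}{2} \left\| \mathcal{M}_{R\to X}(\psi^{\rho}_{RAB}) - \mathcal{M}_{R\to X}(\psi^{\sigma}_{RAB}) \right\|_1 \leq \sqrt{\varepsilon(2-\varepsilon)}. \tag{9.1.50} \]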
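Consider that
\[ \rho_{XAB} \coloneqq \mathcal{M}_{R\to X}(\psi^{\rho}_{RAB}) = \sum_{x} p(x) |x\rangle\langle x|_X \otimes \psi^{x}_{AB}, \tag{9.1.51} \]
and there exists a probability distribution $q(x)$ and a set $\{\varphi^{x}_{AB}\}_x$, satisfying
\[ \sigma_{AB} = \sum_{x} q(x)\,\varphi^{x}_{AB}, \tag{9.1.52} \]
such that
\[ \sigma_{XAB} \coloneqq \mathcal{M}_{R\to X}(\psi^{\sigma}_{RAB}) = \sum_{x} q(x) |x\rangle\langle x|_X \otimes \varphi^{x}_{AB}. \tag{9.1.53} \]
Now applying the uniform continuity of conditional mutual information (Proposition 7.10),
we conclude that
\[ \frac{1}{2} I(A;B|X)_{\sigma} \leq \frac{1}{2} I(A;B|X)_{\rho} + \delta \log_2 \min\{d_A, d_B\} + g_2(\delta), \tag{9.1.54} \]
with $\delta = \sqrt{\varepsilon(2-\varepsilon)}$. Since the states of systems $AB$ are pure when conditioned on
the classical system $X$, for both $\rho_{XAB}$ and $\sigma_{XAB}$, consider that
\[ \frac{1}{2} I(A;B|X)_{\sigma} = H(A|X)_{\sigma}, \qquad \frac{1}{2} I(A;B|X)_{\rho} = H(A|X)_{\rho}. \tag{9.1.55} \]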
So we conclude that
where the first inequality follows from Definition 9.3 and (9.1.52). Since the
pure-state decomposition of 𝜌 𝐴𝐵 is arbitrary, we conclude that
Running the argument again, but starting from an arbitrary pure-state decomposition
of 𝜎𝐴𝐵 , we conclude the inequality
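then
\[ E_F(\rho_{AB}) \leq \delta \log_2 \min\{d_A, d_B\} + g_2(\delta), \tag{9.1.60} \]
where $\delta \coloneqq \sqrt{\varepsilon(2-\varepsilon)}$. Conversely, for $\varepsilon \geq 0$, if
\[ E_F(\rho_{AB}) \leq \varepsilon, \tag{9.1.61} \]
then
\[ \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \frac{1}{2} \left\| \rho_{AB} - \sigma_{AB} \right\|_1 \leq \sqrt{\varepsilon \ln 2}. \tag{9.1.62} \]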
Proof: We begin by proving the first statement. Suppose that the state $\sigma_{AB}$
is separable. Then by the remark after Definition 3.5, there exists a pure-state
decomposition of $\sigma_{AB}$ as
\[ \sigma_{AB} = \sum_{x} p(x)\, \phi^{x}_{A} \otimes \varphi^{x}_{B}. \tag{9.1.63} \]
For this decomposition, we have that $\sum_x p(x) H(\phi^{x}_{A}) = 0$ because the quantum
entropy is equal to zero for a pure state. This implies by definition that $E_F(\sigma_{AB}) = 0$.
The statement in (9.1.59)–(9.1.60) then follows by combining this observation with
(9.1.44)–(9.1.45), as well as the fact that the function on the right-hand side of
(9.1.60) is monotone in 𝜀.
To see the second statement, let
\[ \rho_{AB} = \sum_{x} p(x)\, \psi^{x}_{AB} \tag{9.1.64} \]
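\begin{align}
&= \frac{1}{2} \sum_{x} p(x)\, D(\psi^{x}_{AB} \| \psi^{x}_{A} \otimes \psi^{x}_{B}) \tag{9.1.66} \\
&\geq \frac{1}{4 \ln 2} \sum_{x} p(x) \left\| \psi^{x}_{AB} - \psi^{x}_{A} \otimes \psi^{x}_{B} \right\|_1^2 \tag{9.1.67} \\
&\geq \frac{1}{4 \ln 2} \left\| \sum_{x} p(x)\psi^{x}_{AB} - \sum_{x} p(x)\, \psi^{x}_{A} \otimes \psi^{x}_{B} \right\|_1^2 \tag{9.1.68} \\
&= \frac{1}{4 \ln 2} \left\| \rho_{AB} - \sum_{x} p(x)\, \psi^{x}_{A} \otimes \psi^{x}_{B} \right\|_1^2 \tag{9.1.69} \\
&\geq \frac{1}{\ln 2} \left( \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \frac{1}{2} \left\| \rho_{AB} - \sigma_{AB} \right\|_1 \right)^{2}. \tag{9.1.70}
\end{align}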
The second equality follows from rewriting the mutual information in terms of
relative entropy (see (7.2.96)). The first inequality follows from the quantum Pinsker
inequality (Corollary 7.32 and the remark thereafter). The second inequality follows
from convexity of the square function and the trace norm. Since the inequality
holds for an arbitrary pure-state decomposition of $\rho_{AB}$, we conclude that
\[ E_F(\rho_{AB}) \geq \frac{1}{\ln 2} \left( \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \frac{1}{2} \left\| \rho_{AB} - \sigma_{AB} \right\|_1 \right)^{2}. \tag{9.1.71} \]
Proposition 9.6
The entanglement of formation is convex, so that (9.1.7) holds with 𝐸 set to 𝐸 𝐹 ,
and it is a selective LOCC monotone, so that (9.1.14) holds with 𝐸 set to 𝐸 𝐹 .
Proof: By Lemma 9.2, we only need to show that the entanglement of formation
does not increase under the action of a local channel and is invariant under
classical communication. We begin with the first one. Let 𝜌 𝐴𝐵 be a bipartite state,
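and let $\mathcal{N}_{A\to A'}$ be a local quantum channel. Let $\{(p(x), \psi^{x}_{AB})\}_x$ be a pure-state
decomposition of $\rho_{AB}$, i.e., satisfying $\sum_x p(x)\psi^{x}_{AB} = \rho_{AB}$. Let $\mathcal{N}_{A\to A'}$ have the
Kraus representation $\{N^{y}_{A\to A'}\}_y$. Then
\[ \omega_{A'B} \coloneqq \mathcal{N}_{A\to A'}(\rho_{AB}) = \sum_{x} p(x)\, \mathcal{N}_{A\to A'}(\psi^{x}_{AB}), \tag{9.1.72} \]
and
\[ \mathcal{N}_{A\to A'}(\psi^{x}_{AB}) = \sum_{y} N^{y}_{A\to A'} \psi^{x}_{AB} (N^{y}_{A\to A'})^{\dagger} = \sum_{y} p(y|x)\, \varphi^{x,y}_{A'B}, \tag{9.1.73} \]
where
\begin{align}
p(y|x) &\coloneqq \mathrm{Tr}[N^{y}_{A\to A'} \psi^{x}_{AB} (N^{y}_{A\to A'})^{\dagger}], \tag{9.1.74} \\
\varphi^{x,y}_{A'B} &\coloneqq \frac{1}{p(y|x)} N^{y}_{A\to A'} \psi^{x}_{AB} (N^{y}_{A\to A'})^{\dagger}. \tag{9.1.75}
\end{align}
Thus, $\{(p(x)p(y|x), \varphi^{x,y}_{A'B})\}_{x,y}$ is a pure-state decomposition of $\omega_{A'B}$. Also, observe
that
\begin{align}
\psi^{x}_{B} &= \mathrm{Tr}_{A}[\psi^{x}_{AB}] \tag{9.1.76} \\
&= \mathrm{Tr}_{A'}[\mathcal{N}_{A\to A'}(\psi^{x}_{AB})] \tag{9.1.77} \\
&= \sum_{y} p(y|x)\, \mathrm{Tr}_{A'}[\varphi^{x,y}_{A'B}] \tag{9.1.78} \\
&= \sum_{y} p(y|x)\, \varphi^{x,y}_{B}. \tag{9.1.79}
\end{align}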
By flipping the role of Alice and Bob in the analysis above, we conclude that the
entanglement of formation does not increase under the action of a local channel on
Bob’s system.
Now we prove that 𝐸 𝐹 is invariant under classical communication. Let 𝜌 𝑋 𝐴𝐵
be the classical–quantum state defined in (9.1.5). A pure-state decomposition of
$\rho_{XAB}$ has the form $\{(p(x)p(y|x), |x\rangle\langle x|_X \otimes \psi^{x,y}_{AB})\}_{x,y}$, where
\[ \rho^{x}_{AB} = \sum_{y} p(y|x)\, \psi^{x,y}_{AB}. \tag{9.1.83} \]
By the same argument, but exchanging the roles of Alice and Bob, we conclude that
\[ \sum_{x} p(x) E_F(A;B)_{\rho^{x}} = E_F(A;BX)_{\rho}. \tag{9.1.89} \]
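Proof: Let $\sum_x p(x)\psi^{x}_{A_1B_1}$ and $\sum_y q(y)\phi^{y}_{A_2B_2}$ be respective pure-state decompositions
of $\tau_{A_1B_1}$ and $\omega_{A_2B_2}$. Then $\sum_{x,y} p(x)q(y)\, \psi^{x}_{A_1B_1} \otimes \phi^{y}_{A_2B_2}$ is a pure-state
decomposition of $\tau_{A_1B_1} \otimes \omega_{A_2B_2}$. It follows that
\begin{align}
E_F(A_1A_2;B_1B_2)_{\tau\otimes\omega} &\leq \sum_{x,y} p(x)q(y)\, H(\psi^{x}_{A_1} \otimes \phi^{y}_{A_2}) \tag{9.1.90} \\
&= \sum_{x,y} p(x)q(y) \left[ H(\psi^{x}_{A_1}) + H(\phi^{y}_{A_2}) \right] \tag{9.1.91} \\
&= \sum_{x} p(x) H(\psi^{x}_{A_1}) + \sum_{y} q(y) H(\phi^{y}_{A_2}). \tag{9.1.92}
\end{align}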
Since the inequality holds for arbitrary pure-state decompositions of 𝜏𝐴1 𝐵1 and
𝜔 𝐴2 𝐵2 , we conclude that subadditivity holds. ■
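The step from (9.1.90) to (9.1.91) uses additivity of the quantum entropy on tensor-product states. The sketch below checks this numerically for randomly generated states; it assumes NumPy, and the helper names `entropy` and `random_state`, the seed, and the dimensions are illustrative only.

```python
import numpy as np

def entropy(rho):
    """Quantum entropy H(rho) in bits, computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # convention 0 log 0 = 0
    return float(-np.sum(evals * np.log2(evals)))

rng = np.random.default_rng(0)

def random_state(d):
    """A random density operator of dimension d (illustrative sampling)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

# Stand-ins for the reduced states psi^x_{A1} and phi^y_{A2}.
psi_a1 = random_state(2)
phi_a2 = random_state(3)

lhs = entropy(np.kron(psi_a1, phi_a2))
rhs = entropy(psi_a1) + entropy(phi_a2)
print(lhs, rhs)
```

The two printed values should agree up to floating-point error, which is the identity applied termwise inside the sum in (9.1.91).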
\[ C(\rho_{AB}) = \max\{0, \lambda_1 - \lambda_2 - \lambda_3 - \lambda_4\}, \tag{9.1.94} \]
where $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are the eigenvalues of $\sqrt{\sqrt{\rho_{AB}}\, \widetilde{\rho}_{AB} \sqrt{\rho_{AB}}}$ in decreasing order.
The operator $\widetilde{\rho}_{AB}$ is defined as $\widetilde{\rho}_{AB} \coloneqq (Y \otimes Y)\, \overline{\rho}_{AB}\, (Y \otimes Y)$, with $Y$ being the Pauli-$Y$
operator (see (4.5.25)) and $\overline{\rho}_{AB}$ being the complex conjugate of $\rho_{AB}$ in the standard
basis.
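The formula in (9.1.94) can be evaluated directly for a two-qubit density matrix. The following is a minimal sketch, assuming NumPy and a state given in the standard basis; the helper `psd_sqrt` and the clipping of tiny negative eigenvalues are implementation choices made here for numerical stability, not part of the text.

```python
import numpy as np

Y = np.array([[0, -1j], [1j, 0]])
YY = np.kron(Y, Y)

def psd_sqrt(m):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    evals, vecs = np.linalg.eigh(m)
    evals = np.clip(evals, 0.0, None)  # remove tiny negative rounding errors
    return (vecs * np.sqrt(evals)) @ vecs.conj().T

def concurrence(rho):
    """Concurrence of a two-qubit state, following (9.1.94)."""
    rho = np.asarray(rho, dtype=complex)
    rho_tilde = YY @ rho.conj() @ YY          # (Y⊗Y) conj(rho) (Y⊗Y)
    root = psd_sqrt(rho)
    m = root @ rho_tilde @ root
    m = (m + m.conj().T) / 2                   # enforce Hermiticity numerically
    lam = np.sqrt(np.clip(np.linalg.eigvalsh(m), 0.0, None))
    lam = np.sort(lam)[::-1]                   # decreasing order
    return float(max(0.0, lam[0] - lam[1] - lam[2] - lam[3]))

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())
rho_product = np.diag([1.0, 0.0, 0.0, 0.0])
print(concurrence(rho_bell), concurrence(rho_product))
```

For the maximally entangled state the concurrence should be close to one, and for the product state close to zero.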
Similar to how we motivated the entanglement of formation from the Schmidt rank
criterion for pure states, we can also motivate an entanglement measure from the
PPT criterion. The PPT criterion states that if the partial transpose T𝐵 (𝜌 𝐴𝐵 ) of a
given state 𝜌 𝐴𝐵 has a negative eigenvalue, then 𝜌 𝐴𝐵 is entangled.2 We use this fact
to define the negativity of 𝜌 𝐴𝐵 as
\[ N(\rho_{AB}) \coloneqq \frac{\left\| \mathrm{T}_B(\rho_{AB}) \right\|_1 - 1}{2}, \tag{9.1.95} \]
and the logarithmic negativity (often written simply as log-negativity) of 𝜌 𝐴𝐵 as
Both the negativity and the log-negativity quantify the extent to which the
partial transpose T𝐵 (𝜌 𝐴𝐵 ) has negative eigenvalues. In particular, suppose that
T𝐵 (𝜌 𝐴𝐵 ) has the following Jordan–Hahn decomposition:
T𝐵 (𝜌 𝐴𝐵 ) = 𝑃 − 𝑁, (9.1.97)
Therefore,
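\[ N(\rho_{AB}) = \frac{\left\| \mathrm{T}_B(\rho_{AB}) \right\|_1 - 1}{2} = \frac{\left\| \mathrm{T}_B(\rho_{AB}) \right\|_1 - \mathrm{Tr}[\mathrm{T}_B(\rho_{AB})]}{2} = \mathrm{Tr}[N]. \tag{9.1.100} \]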
2 Note that it does not matter in which basis the transpose is defined.
So, according to (2.2.67), the negativity is the sum of the absolute values of the
negative eigenvalues of $\rho_{AB}^{\mathrm{T}_B}$.
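The characterization of the negativity as the sum of the absolute values of the negative eigenvalues of the partial transpose translates directly into a few lines of code. The sketch below assumes NumPy and takes the partial transpose in the standard basis. For the log-negativity it assumes the standard definition $E_N(\rho_{AB}) = \log_2 \| \mathrm{T}_B(\rho_{AB}) \|_1$, which is consistent with how $E_N$ is used later in this section; the function names are illustrative.

```python
import numpy as np

def partial_transpose_b(rho, d_a, d_b):
    """Partial transpose on system B of an operator on A ⊗ B (standard basis)."""
    r = np.asarray(rho).reshape(d_a, d_b, d_a, d_b)
    # Swap the row and column indices of system B only.
    return r.transpose(0, 3, 2, 1).reshape(d_a * d_b, d_a * d_b)

def negativity(rho, d_a, d_b):
    """Sum of absolute values of the negative eigenvalues of T_B(rho)."""
    evals = np.linalg.eigvalsh(partial_transpose_b(rho, d_a, d_b))
    return float(-np.sum(evals[evals < 0]))

def log_negativity(rho, d_a, d_b):
    """log2 of the trace norm of T_B(rho) (assumed standard definition)."""
    evals = np.linalg.eigvalsh(partial_transpose_b(rho, d_a, d_b))
    return float(np.log2(np.sum(np.abs(evals))))

bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_bell = np.outer(bell, bell)
rho_product = np.diag([1.0, 0.0, 0.0, 0.0])
print(negativity(rho_bell, 2, 2), log_negativity(rho_bell, 2, 2))
print(negativity(rho_product, 2, 2), log_negativity(rho_product, 2, 2))
```

Because the partial transpose of a Hermitian operator is Hermitian, its trace norm is the sum of the absolute values of its eigenvalues, which is what `log_negativity` uses.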
By utilizing Hölder duality and semi-definite programming duality, it is possible
to write ∥T𝐵 (𝜌 𝐴𝐵 ) ∥ 1 as the following primal and dual semi-definite programs:
where the optimization in the first line is with respect to Hermitian $R_{AB}$. We give a
proof of (9.1.101)–(9.1.102) in Appendix 9.A.
Proposition 9.8
The log-negativity is non-negative for all bipartite states, and it is faithful on
the set of PPT states (i.e., it is equal to zero if and only if a state is PPT).
Proof: To see the first statement, we note that the choice 𝑅 𝐴𝐵 = 1 𝐴𝐵 is feasible
for the primal SDP in (9.1.101), so that ∥T𝐵 (𝜌 𝐴𝐵 ) ∥ 1 ≥ 1, and hence 𝐸 𝑁 (𝜌 𝐴𝐵 ) ≥ 0,
for every bipartite state 𝜌 𝐴𝐵 .
Suppose that 𝜌 𝐴𝐵 is a PPT state. Then ∥T𝐵 (𝜌 𝐴𝐵 )∥ 1 = Tr[T𝐵 (𝜌 𝐴𝐵 )] =
Tr[𝜌 𝐴𝐵 ] = 1 due to the assumption that T𝐵 (𝜌 𝐴𝐵 ) ≥ 0, implying that 𝐸 𝑁 (𝜌 𝐴𝐵 ) = 0
for every PPT state.
Finally, suppose that 𝐸 𝑁 (𝜌 𝐴𝐵 ) = 0. Then ∥T𝐵 (𝜌 𝐴𝐵 ) ∥ 1 = 1, and employing the
notation of (9.1.98)–(9.1.99), we conclude that 1 = Tr[𝑃 + 𝑁] = Tr[𝑃 − 𝑁], which
implies that Tr[𝑁] = 0. Since 𝑁 ≥ 0, this implies that 𝑁 = 0. Thus, T𝐵 (𝜌 𝐴𝐵 ) has
no negative part and 𝜌 𝐴𝐵 is thus a PPT state. ■
for every bipartite state 𝜌 𝐴𝐵 and C-PPT-P instrument {P𝑥𝐴𝐵→𝐴′ 𝐵′ }𝑥∈X , with
Proposition 9.10
The negativity and the log-negativity are selective PPT monotones, satisfying
(9.1.103). The negativity is convex, satisfying (9.1.7), but the log-negativity is
not.
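Let us define
\begin{align}
K^{x}_{A'B'} &\coloneqq \frac{1}{p(x)} (\mathrm{T}_{B'} \circ \mathcal{P}^{x}_{AB\to A'B'} \circ \mathrm{T}_B)(K_{AB}), \tag{9.1.110} \\
L^{x}_{A'B'} &\coloneqq \frac{1}{p(x)} (\mathrm{T}_{B'} \circ \mathcal{P}^{x}_{AB\to A'B'} \circ \mathrm{T}_B)(L_{AB}), \tag{9.1.111}
\end{align}
so that
\[ \omega^{x}_{A'B'} = \mathrm{T}_{B'}(K^{x}_{A'B'} - L^{x}_{A'B'}). \tag{9.1.112} \]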
Furthermore, since $\mathrm{T}_{B'} \circ \mathcal{P}^{x}_{AB\to A'B'} \circ \mathrm{T}_B$ is completely positive, it follows that
$K^{x}_{A'B'}, L^{x}_{A'B'} \geq 0$. Thus, $K^{x}_{A'B'}$ and $L^{x}_{A'B'}$ are feasible for the SDP in (9.1.102) for
$\| \mathrm{T}_B(\omega^{x}_{A'B'}) \|_1$, and we conclude that
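\begin{align}
\mathrm{Tr}[K_{AB} + L_{AB}] &= \mathrm{Tr}[\mathrm{T}_B(K_{AB} + L_{AB})] \tag{9.1.114} \\
&= \sum_{x\in\mathcal{X}:\, p(x)>0} \mathrm{Tr}[\mathcal{P}^{x}_{AB\to A'B'}(\mathrm{T}_B(K_{AB} + L_{AB}))] \tag{9.1.115} \\
&= \sum_{x\in\mathcal{X}:\, p(x)>0} \mathrm{Tr}[(\mathrm{T}_{B'} \circ \mathcal{P}^{x}_{AB\to A'B'} \circ \mathrm{T}_B)(K_{AB} + L_{AB})] \tag{9.1.116} \\
&= \sum_{x\in\mathcal{X}:\, p(x)>0} p(x)\, \mathrm{Tr}[K^{x}_{A'B'} + L^{x}_{A'B'}] \tag{9.1.117} \\
&\geq \sum_{x\in\mathcal{X}:\, p(x)>0} p(x) \left\| \mathrm{T}_B(\omega^{x}_{A'B'}) \right\|_1. \tag{9.1.118}
\end{align}
The first and third equalities hold because the trace is invariant under a partial
transpose. The second equality follows because the sum map $\sum_x \mathcal{P}^{x}_{AB\to A'B'}$ is
trace preserving. The fourth equality follows from the definitions in (9.1.110)–(9.1.111).
The inequality follows from (9.1.113). Since the inequality holds for all
$K_{AB}, L_{AB} \geq 0$ satisfying (9.1.106), we conclude that
\[ \left\| \mathrm{T}_B(\rho_{AB}) \right\|_1 \geq \sum_{x\in\mathcal{X}:\, p(x)>0} p(x) \left\| \mathrm{T}_B(\omega^{x}_{A'B'}) \right\|_1. \tag{9.1.119} \]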
That the negativity is convex follows directly from the definition, convexity of
the trace norm, and linearity of the partial transpose.
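The lack of convexity of log-negativity follows from direct evaluation for
the states $\Phi_{AB} \coloneqq \frac{1}{2} \sum_{i,j\in\{0,1\}} |ii\rangle\langle jj|_{AB}$, $\sigma_{AB} \coloneqq \frac{1}{2} \sum_{i\in\{0,1\}} |ii\rangle\langle ii|_{AB}$, and
$\rho_{AB} \coloneqq \frac{1}{2}(\Phi_{AB} + \sigma_{AB})$, for which we have
\[ E_N(\Phi_{AB}) = 1, \qquad E_N(\sigma_{AB}) = 0, \qquad E_N(\rho_{AB}) = \log_2 \frac{3}{2}, \tag{9.1.124} \]
so that
\[ E_N(\rho_{AB}) > \frac{1}{2} \left( E_N(\Phi_{AB}) + E_N(\sigma_{AB}) \right). \tag{9.1.125} \]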
This concludes the proof. ■
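The values in (9.1.124) can be reproduced numerically. The following sketch assumes NumPy and the standard definition of the log-negativity as $\log_2$ of the trace norm of the partial transpose; the helper `log_negativity` is written here for illustration and hard-codes the two-qubit dimensions as defaults.

```python
import numpy as np

def log_negativity(rho, d_a=2, d_b=2):
    """log2 of the trace norm of the partial transpose on B (standard basis)."""
    r = rho.reshape(d_a, d_b, d_a, d_b).transpose(0, 3, 2, 1)
    pt = r.reshape(d_a * d_b, d_a * d_b)
    return float(np.log2(np.sum(np.abs(np.linalg.eigvalsh(pt)))))

ket = np.eye(4)  # ket[3 * i] is the vector |ii> for i in {0, 1}
phi = 0.5 * sum(np.outer(ket[3 * i], ket[3 * j]) for i in range(2) for j in range(2))
sigma = 0.5 * sum(np.outer(ket[3 * i], ket[3 * i]) for i in range(2))
rho = 0.5 * (phi + sigma)

e_phi, e_sigma, e_rho = (log_negativity(s) for s in (phi, sigma, rho))
print(e_phi, e_sigma, e_rho, np.log2(1.5))
print(e_rho > 0.5 * (e_phi + e_sigma))  # strict violation of convexity
```

The mixture has a strictly larger log-negativity than the average of the two components, which is the violation of convexity stated in (9.1.125).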
\[ E_N(\psi_{AB}) = H_{\frac{1}{2}}(\psi_A). \tag{9.1.131} \]
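where we have taken the partial transpose with respect to the orthonormal set
$\{|f_k\rangle_B\}_{k=1}^{r}$. Observe that
\[ |\psi\rangle\langle\psi|_{AB}^{\mathrm{T}_B} = F_{AB} \left( \sum_{k'=1}^{r} \sqrt{\lambda_{k'}}\, |e_{k'}\rangle\langle e_{k'}|_A \otimes \sum_{k=1}^{r} \sqrt{\lambda_k}\, |f_k\rangle\langle f_k|_B \right), \tag{9.1.134} \]
where $F_{AB} = \sum_{k,k'=1}^{r} |e_{k'}\rangle\langle e_k|_A \otimes |f_k\rangle\langle f_{k'}|_B$ is a unitary swap operator. Thus, by
unitary invariance of the trace norm, we obtain
\[ E_N(\psi_{AB}) = \log_2 \left( \sum_{k=1}^{r} \sqrt{\lambda_k} \right)^{2} = 2 \log_2 \left( \sum_{k=1}^{r} \sqrt{\lambda_k} \right). \tag{9.1.135} \]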
[Figure 9.1 (image not recovered). Labels: D(H_AB); ρ_AB; σ*_AB; SEP(A : B); D(ρ_AB ∥ σ*_AB).]
The two entanglement measures considered above are based on specific mathematical
properties of entanglement. However, using the fact that entangled states are, by
definition, not separable, we can construct a broad class of entanglement measures
by computing the divergence between a given state $\rho_{AB}$ and the set of separable states. This
idea is illustrated in Figure 9.1. We primarily consider such divergence-based
entanglement measures in this book (in the research literature, these are also called
“distance-based” entanglement measures, even though divergences that are not
distances, such as relative entropy, are used in this approach).
As an example of a divergence-based entanglement measure, let us consider a
concrete divergence, the normalized trace distance, which we defined in Section 6.1
as $\frac{1}{2} \| \rho - \sigma \|_1$ for every two states $\rho$ and $\sigma$. Mathematically, the distance of a point
to a set is defined by finding the element of that set that is closest to the given point.
With this idea, we define the trace distance of entanglement of a state $\rho_{AB}$ as the
normalized trace distance from $\rho_{AB}$ to the closest state $\sigma_{AB} \in \mathrm{SEP}(A:B)$:
\[ E_T(A;B)_{\rho} \coloneqq \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \frac{1}{2} \left\| \rho_{AB} - \sigma_{AB} \right\|_1. \tag{9.1.136} \]
Note that the infimum is indeed achieved, because SEP( 𝐴 : 𝐵) is a compact set and
the trace norm is continuous in 𝜎𝐴𝐵 , so that there always exists a closest separable
state to the given state 𝜌 𝐴𝐵 . Recall that we implicitly introduced the trace distance
of entanglement in Proposition 9.5, when considering approximate faithfulness of
the entanglement of formation.
The quantity 𝐸𝑇 is indeed an entanglement measure. To see this, we use the
data-processing inequality for the trace distance (Theorem 6.3), and the fact that
separable states are preserved under LOCC channels (which follows immediately
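from the definition of LOCC channels). Then, for every state $\rho_{AB}$, every LOCC
channel $\mathcal{L}_{AB\to A'B'}$, and letting $\omega_{A'B'} = \mathcal{L}_{AB\to A'B'}(\rho_{AB})$, we obtain
\begin{align}
E_T(A;B)_{\rho} &= \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \frac{1}{2} \left\| \rho_{AB} - \sigma_{AB} \right\|_1 \tag{9.1.137} \\
&\geq \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \frac{1}{2} \left\| \mathcal{L}_{AB\to A'B'}(\rho_{AB}) - \mathcal{L}_{AB\to A'B'}(\sigma_{AB}) \right\|_1 \tag{9.1.138} \\
&\geq \inf_{\tau_{A'B'}\in\mathrm{SEP}(A':B')} \frac{1}{2} \left\| \mathcal{L}_{AB\to A'B'}(\rho_{AB}) - \tau_{A'B'} \right\|_1 \tag{9.1.139} \\
&= E_T(A';B')_{\omega}. \tag{9.1.140}
\end{align}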
Although the simple proof above makes it clear that the trace distance of entangle-
ment is an LOCC monotone, it is known that the trace distance of entanglement
is not a selective LOCC monotone, as defined in (9.1.14) (please consult the
Bibliographic Notes in Section 9.6).
The trace distance of entanglement is also faithful, which is due to the fact that
the trace distance is a metric in the mathematical sense: $\frac{1}{2} \| \rho_{AB} - \sigma_{AB} \|_1 \geq 0$ for
all states $\rho_{AB}$, $\sigma_{AB}$, and $\frac{1}{2} \| \rho_{AB} - \sigma_{AB} \|_1 = 0$ if and only if $\rho_{AB} = \sigma_{AB}$.
Beyond the trace distance, we can take any distinguishability measure and
define an entanglement measure analogous to the one in (9.1.136). That is, we can
take any generalized divergence 𝑫 as our divergence. Recall from Definition 7.15
that a generalized divergence is a function 𝑫 : D(H) × L+ (H) → R ∪ {+∞} that
obeys the data-processing inequality. We then define the generalized divergence of
entanglement of 𝜌 𝐴𝐵 as follows:
See Figure 9.1 for a visual depiction of the idea behind this quantity. If the
generalized divergence 𝑫 is continuous in its second argument, then the infimum
in (9.1.141) is achieved. We study this entanglement measure in much more detail
Figure 9.2: The set SEP( 𝐴 : 𝐵) of separable states acting on the Hilbert space
H 𝐴𝐵 is contained in the set PPT( 𝐴 : 𝐵) of positive partial transpose (PPT)
states, which in turn is contained in the set PPT′ ( 𝐴 : 𝐵) of operators defined in
(9.1.144). The sets PPT and PPT′ are relaxations of the set of separable states
that can be easily characterized in terms of semi-definite constraints.
Convexity of the set PPT′ ( 𝐴 : 𝐵) follows from convexity of the trace norm.
Furthermore, the set PPT′ ( 𝐴 : 𝐵) contains the set of PPT states because every PPT
state 𝜎𝐴𝐵 satisfies ∥T𝐵 (𝜎𝐴𝐵 )∥ 1 = 1. Furthermore, every operator 𝜎𝐴𝐵 ∈ PPT′ ( 𝐴 :
𝐵) is subnormalized, satisfying Tr[𝜎𝐴𝐵 ] ≤ 1, which follows because
Tr[𝜎𝐴𝐵 ] = Tr[T𝐵 (𝜎𝐴𝐵 )] ≤ ∥T𝐵 (𝜎𝐴𝐵 )∥ 1 ≤ 1. (9.1.145)
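Membership in PPT′ is straightforward to test numerically. The sketch below assumes NumPy and takes PPT′(A : B) to be the set of positive semi-definite operators $\sigma_{AB}$ with $\| \mathrm{T}_B(\sigma_{AB}) \|_1 \leq 1$, as written out in (9.3.27); the function names and the tolerance `tol` are illustrative choices.

```python
import numpy as np

def trace_norm_pt(sigma, d_a, d_b):
    """Trace norm of the partial transpose on B (standard basis)."""
    r = sigma.reshape(d_a, d_b, d_a, d_b).transpose(0, 3, 2, 1)
    pt = r.reshape(d_a * d_b, d_a * d_b)
    return float(np.sum(np.abs(np.linalg.eigvalsh(pt))))

def in_ppt_prime(sigma, d_a, d_b, tol=1e-9):
    """Check sigma >= 0 and ||T_B(sigma)||_1 <= 1, up to a numerical tolerance."""
    psd = np.linalg.eigvalsh(sigma).min() >= -tol
    return bool(psd and trace_norm_pt(sigma, d_a, d_b) <= 1 + tol)

# Maximally entangled two-qubit state: ||T_B(Phi)||_1 = 2, so Phi is not in PPT'.
d = 2
phi = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        phi[i * d + i, j * d + j] = 1 / d

print(in_ppt_prime(phi, d, d))        # the state itself
print(in_ppt_prime(phi / d, d, d))    # a subnormalized multiple of it
print(in_ppt_prime(np.eye(4) / 4, d, d))
```

The rescaled operator `phi / d` has trace 1/2, which illustrates that PPT′ contains subnormalized operators in addition to PPT states such as the maximally mixed state.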
𝑬 ( 𝐴; 𝐵) 𝜌 ≥ 𝑬 PPT ( 𝐴; 𝐵) 𝜌 ≥ 𝑹( 𝐴; 𝐵) 𝜌 , (9.1.149)
for every bipartite state 𝜌 𝐴𝐵 . Thus, as we show later in the book, the relaxation from
SEP to PPT′ via the generalized Rains divergence not only allows for the possibility
of efficiently computable entanglement measures, but due to the inequality in
(9.1.149), it also allows for the possibility of obtaining a tighter upper bound on
communication rates in certain scenarios. We investigate the properties of the
generalized Rains divergence in detail in Section 9.3.
Before proceeding, let us state some properties of the set PPT′.
Remark: We emphasize that not all operators in the set PPT′ are quantum states, meaning that
not all operators 𝜎𝐴𝐵 ∈ PPT′ ( 𝐴 : 𝐵) satisfy Tr[𝜎𝐴𝐵 ] = 1.
Proof:
1. Let 𝜎𝐴′ 𝐵′ = P 𝐴𝐵→𝐴′ 𝐵′ (𝜌 𝐴𝐵 ). Since P 𝐴𝐵→𝐴′ 𝐵′ is a channel, and 𝜌 𝐴𝐵 is a state,
we have that 𝜎𝐴′ 𝐵′ ≥ 0. Then,
Now, consider that the induced trace norm of T𝐵′ ◦ P 𝐴𝐵→𝐴′ 𝐵′ ◦ T𝐵 satisfies
∥T𝐵′ ◦ P 𝐴𝐵→𝐴′ 𝐵′ ◦ T𝐵 ∥ 1 = 1, which follows from (2.2.184)–(2.2.185) and
the fact that T𝐵′ ◦ P 𝐴𝐵→𝐴′ 𝐵′ ◦ T𝐵 is a channel by definition of P 𝐴𝐵→𝐴′ 𝐵′ .
Furthermore, we have that ∥T𝐵 (𝜌 𝐴𝐵 ) ∥ 1 ≤ 1 because 𝜌 𝐴𝐵 ∈ PPT′ ( 𝐴 : 𝐵).
Putting these observations together, we find that
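for a finite alphabet $\mathcal{X}$, probability distribution $p : \mathcal{X} \to [0,1]$, and sets $\{\rho^{x}_{A}\}_{x\in\mathcal{X}}$,
$\{\tau^{x}_{B}\}_{x\in\mathcal{X}}$ of states. Let us form the following extension of $\sigma_{AB}$ to a state $\omega_{ABX}$, with
$X$ a classical register:
\[ \omega_{ABX} = \sum_{x\in\mathcal{X}} p(x)\, \rho^{x}_{A} \otimes \tau^{x}_{B} \otimes |x\rangle\langle x|_X. \tag{9.1.158} \]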
This is indeed an extension because Tr 𝑋 [𝜔 𝐴𝐵𝑋 ] = 𝜎𝐴𝐵 . Let us now consider the
conditional mutual information 𝐼 ( 𝐴; 𝐵|𝑋)𝜔 of this extension (recall the definition
for some set $\{\rho^{x}_{AB}\}_{x\in\mathcal{X}}$ of states and a probability distribution $p(x)$ satisfying
$\rho_{AB} = \sum_{x\in\mathcal{X}} p(x) \rho^{x}_{AB}$. The normalization factor of $\frac{1}{2}$ is there for reasons that
become apparent later. If we require that every state 𝜌 𝑥𝐴𝐵 in the extension 𝜔 𝐴𝐵𝑋
should be pure, then the measure in (9.1.160) reduces to the entanglement of
formation (this was actually used in (9.1.55) in the proof of Proposition 9.4).
The quantity proposed in (9.1.160) is non-negative for every state 𝜌 𝐴𝐵 , due
to the non-negativity of mutual information and the fact that conditional mutual
information with a classical conditioning system is equal to a convex combination
of mutual informations. It is already clear that the quantity proposed in (9.1.160) is
equal to zero for every separable state—if a state is separable, then the optimization
in (9.1.160) finds the separable decomposition and the value of the quantity is zero,
as discussed just after (9.1.159). The converse is also true, which follows from the
same proof given for (9.1.61)–(9.1.62). It is actually also possible to show that
the quantity in (9.1.160) is an entanglement measure. However, we do not make
further use of this quantity in this book, because there is an entanglement measure
more suitable for our purposes, as introduced below.
We are particularly interested throughout the rest of this book in the following
generalized divergences of entanglement for every state 𝜌 𝐴𝐵 :
1. The relative entropy of entanglement of 𝜌 𝐴𝐵 ,
\[ \lim_{\alpha\to 1} \widetilde{E}_{\alpha}(A;B)_{\rho} = E_R(A;B)_{\rho} \tag{9.2.5} \]
for every state 𝜌 𝐴𝐵 . See Appendix 10.A for details of the proof.
4. The max-relative entropy of entanglement of 𝜌 𝐴𝐵 ,
for every state $\rho_{AB}$. See Appendix 10.A for details of the proof. As a
consequence of this fact, and the fact that $\widetilde{E}_{\alpha}(A;B)_{\rho}$ is monotonically increasing
in $\alpha$ for all $\rho_{AB}$, we have that
\[ E_{\max}(A;B)_{\rho} \geq \widetilde{E}_{\alpha}(A;B)_{\rho} \tag{9.2.8} \]
𝑬 ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 ) 𝜌⊗𝜔 ≤ 𝑬 ( 𝐴1 ; 𝐵1 ) 𝜌 + 𝑬 ( 𝐴2 ; 𝐵2 )𝜔 . (9.2.10)
Proof:
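1. For $\omega_{A'B'} = \mathcal{S}_{AB\to A'B'}(\rho_{AB})$, we have by definition,
\begin{align}
\boldsymbol{E}(A';B')_{\omega} &= \inf_{\tau_{A'B'}\in\mathrm{SEP}(A':B')} \boldsymbol{D}(\omega_{A'B'} \| \tau_{A'B'}) \tag{9.2.13} \\
&= \inf_{\tau_{A'B'}\in\mathrm{SEP}(A':B')} \boldsymbol{D}(\mathcal{S}_{AB\to A'B'}(\rho_{AB}) \| \tau_{A'B'}). \tag{9.2.14}
\end{align}
Now, recall that every separable channel $\mathcal{S}_{AB\to A'B'}$ takes $\sigma_{AB} \in \mathrm{SEP}(A:B)$
to a state in $\mathrm{SEP}(A':B')$, as shown already in (4.6.64)–(4.6.65). Therefore,
restricting the optimization in (9.2.14) leads to
\begin{align}
\boldsymbol{E}(A';B')_{\omega} &\leq \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \boldsymbol{D}(\mathcal{S}_{AB\to A'B'}(\rho_{AB}) \| \mathcal{S}_{AB\to A'B'}(\sigma_{AB})) \tag{9.2.15} \\
&\leq \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \boldsymbol{D}(\rho_{AB} \| \sigma_{AB}) \tag{9.2.16} \\
&= \boldsymbol{E}(A;B)_{\rho}, \tag{9.2.17}
\end{align}
as required, where we used the data-processing inequality for the generalized
divergence to obtain the second inequality.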
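2. We have
\[ \boldsymbol{E}(A;B)_{\rho} = \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \boldsymbol{D}(\rho_{AB} \| \sigma_{AB}). \tag{9.2.18} \]
If $\rho_{AB} \in \mathrm{SEP}(A:B)$, then the state $\rho_{AB}$ itself achieves the minimum in
(9.2.18) because $\boldsymbol{D}(\rho_{AB} \| \rho_{AB}) = 0$. We thus have $\boldsymbol{E}(A;B)_{\rho} = 0$. On the
other hand, if $\boldsymbol{E}(A;B)_{\rho} = 0$, then there exists a separable state $\sigma^{*}_{AB}$ such that
$\boldsymbol{D}(\rho_{AB} \| \sigma^{*}_{AB}) = 0$, which by assumption implies that $\rho_{AB} = \sigma^{*}_{AB}$, i.e., that $\rho_{AB}$
is separable.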
3. By definition, the optimization in the definition of 𝑬 ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 ) 𝜌⊗𝜔 is
over the set SEP( 𝐴1 𝐴2 : 𝐵1 𝐵2 ). It is straightforward to see that this set
contains states of the form 𝜉 𝐴1 𝐵1 ⊗ 𝜏𝐴2 𝐵2 , where 𝜉 𝐴1 𝐵1 ∈ SEP( 𝐴1 : 𝐵1 ) and
𝜏𝐴2 𝐵2 ∈ SEP( 𝐴2 : 𝐵2 ). By restricting the optimization to such states, and by
using additivity of the generalized divergence 𝑫, we obtain
𝑬 ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 ) 𝜌⊗𝜔 ≤ 𝑫 (𝜌 𝐴1 𝐵1 ⊗ 𝜔 𝐴2 𝐵2 ∥𝜉 𝐴1 𝐵1 ⊗ 𝜏𝐴2 𝐵2 ) (9.2.19)
= 𝑫 (𝜌 𝐴1 𝐵1 ∥𝜉 𝐴1 𝐵1 ) + 𝑫 (𝜔 𝐴2 𝐵2 ∥𝜏𝐴2 𝐵2 ). (9.2.20)
Since 𝜉 𝐴1 𝐵1 ∈ SEP( 𝐴1 : 𝐵1 ) and 𝜏𝐴2 𝐵2 ∈ SEP( 𝐴2 : 𝐵2 ) are arbitrary, the
inequality in (9.2.10) follows.
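4. We have
\[ \boldsymbol{E}(A;B)_{\rho} = \inf_{\sigma_{AB}\in\mathrm{SEP}(A:B)} \boldsymbol{D}\!\left( \sum_{x\in\mathcal{X}} p(x) \rho^{x}_{AB} \,\Big\|\, \sigma_{AB} \right). \tag{9.2.21} \]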
as required. ■
We now delve a bit more into particular examples of the generalized divergence
of entanglement, which are based on the relative entropy and the Petz–, sandwiched,
and geometric Rényi relative entropies.
Proposition 9.17
The relative entropy of entanglement is invariant under classical communication;
i.e., (9.1.6) holds with 𝐸 set to 𝐸 𝑅 .
Consider that
where the last equality follows from the direct-sum property of relative entropy in
(7.2.27). Since the inequality holds for every set $\{\sigma^{x}_{AB}\}_{x\in\mathcal{X}}$ of separable states, we
conclude that
\[ E_R(XA;B)_{\rho} \leq \sum_{x\in\mathcal{X}} p(x) E_R(A;B)_{\rho^{x}}. \tag{9.2.30} \]
the chain of inequalities holds for every separable state $\sigma_{XAB}$, we conclude that
\[ E_R(XA;B)_{\rho} \geq \sum_{x\in\mathcal{X}} p(x) E_R(A;B)_{\rho^{x}}. \tag{9.2.36} \]
Putting together (9.2.30) and (9.2.36), and noting that the same argument applies
when exchanging the roles of Alice and Bob, we conclude the statement of the
proposition. ■
for every bipartite state 𝜌 𝐴𝐵 and separable instrument {S𝑥𝐴𝐵→𝐴′ 𝐵′ }𝑥∈X , with
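\begin{align}
p(x) &\coloneqq \mathrm{Tr}[\mathcal{S}^{x}_{AB\to A'B'}(\rho_{AB})], \tag{9.2.38} \\
\omega^{x}_{A'B'} &\coloneqq \frac{1}{p(x)} \mathcal{S}^{x}_{AB\to A'B'}(\rho_{AB}). \tag{9.2.39}
\end{align}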
for some probability distribution 𝑝(𝑥) and set {𝜏𝐴𝑥 ′ 𝐵′ }𝑥∈X of states. Then consider
that
$\tau_{XA'B'}$ defined above. Let $\sigma_{AB}$ be an arbitrary separable state, and consider that
\[ \mathcal{S}_{AB\to XA'B'}(\sigma_{AB}) = \sum_{x\in\mathcal{X}} q(x) |x\rangle\langle x|_X \otimes \sigma^{x}_{A'B'}, \tag{9.2.44} \]
for some probability distribution $q(x)$ and set $\{\sigma^{x}_{A'B'}\}_{x\in\mathcal{X}}$ of separable states. Then
we find that
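The first inequality follows from the data-processing inequality for the Petz–Rényi
relative quasi-entropy (Theorem 7.24), and the first equality follows from its
direct-sum property (see (7.4.46)). Now applying the monotonicity and concavity of the
function $(\cdot) \to \frac{1}{\alpha-1} \log_2(\cdot)$ for $\alpha \in (1,2]$, we find that
\begin{align}
D_{\alpha}(\rho_{AB} \| \sigma_{AB})
&= \frac{1}{\alpha-1} \log_2 Q_{\alpha}(\rho_{AB} \| \sigma_{AB}) \tag{9.2.48} \\
&\geq \frac{1}{\alpha-1} \log_2 \sum_{x\in\mathcal{X}:\, p(x)>0} p(x) \left( \frac{p(x)}{q(x)} \right)^{\alpha-1} Q_{\alpha}(\tau^{x}_{A'B'} \| \sigma^{x}_{A'B'}) \tag{9.2.49} \\
&\geq \sum_{x\in\mathcal{X}:\, p(x)>0} p(x) \left[ \frac{1}{\alpha-1} \log_2 \left( \left( \frac{p(x)}{q(x)} \right)^{\alpha-1} Q_{\alpha}(\tau^{x}_{A'B'} \| \sigma^{x}_{A'B'}) \right) \right] \tag{9.2.50} \\
&= \sum_{x\in\mathcal{X}:\, p(x)>0} p(x) \left[ \log_2 \frac{p(x)}{q(x)} + D_{\alpha}(\tau^{x}_{A'B'} \| \sigma^{x}_{A'B'}) \right] \tag{9.2.51} \\
&= D(p \| q) + \sum_{x\in\mathcal{X}:\, p(x)>0} p(x) D_{\alpha}(\tau^{x}_{A'B'} \| \sigma^{x}_{A'B'}) \tag{9.2.52} \\
&\geq \sum_{x\in\mathcal{X}:\, p(x)>0} p(x) E_{\alpha}(A';B')_{\tau^{x}}. \tag{9.2.53}
\end{align}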
The final two equalities follow by direct evaluation and applying definitions. The
final inequality follows because 𝐷 ( 𝑝∥𝑞) ≥ 0 for probability distributions 𝑝 and
𝑞, and it also follows from the definition of the Petz–Rényi relative entropy of
entanglement and the fact that the state 𝜎𝐴𝑥 ′ 𝐵′ is separable. Since the inequality
holds for every separable state $\sigma_{AB}$, we conclude the desired inequality:
\[ E_{\alpha}(A;B)_{\rho} \geq \sum_{x\in\mathcal{X}:\, p(x)>0} p(x) E_{\alpha}(A';B')_{\tau^{x}}. \tag{9.2.54} \]
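The passage from (9.2.48) to (9.2.51) rests on how the quasi-entropy behaves on each classical branch. The worked equations below spell this out as a sketch, assuming the standard definition $Q_{\alpha}(\rho\|\sigma) = \mathrm{Tr}[\rho^{\alpha}\sigma^{1-\alpha}]$ of the Petz–Rényi relative quasi-entropy (not restated in this section) and writing $p(x)\tau^{x}_{A'B'}$ and $q(x)\sigma^{x}_{A'B'}$ for the branches of the two classical–quantum output operators.

```latex
\begin{align}
Q_{\alpha}\big( p(x)\tau^{x}_{A'B'} \,\big\|\, q(x)\sigma^{x}_{A'B'} \big)
  &= p(x)^{\alpha} q(x)^{1-\alpha}\, Q_{\alpha}(\tau^{x}_{A'B'} \| \sigma^{x}_{A'B'}) \\
  &= p(x) \left( \frac{p(x)}{q(x)} \right)^{\alpha-1} Q_{\alpha}(\tau^{x}_{A'B'} \| \sigma^{x}_{A'B'}), \\
\frac{1}{\alpha-1} \log_2 \left[ \left( \frac{p(x)}{q(x)} \right)^{\alpha-1}
  Q_{\alpha}(\tau^{x}_{A'B'} \| \sigma^{x}_{A'B'}) \right]
  &= \log_2 \frac{p(x)}{q(x)} + D_{\alpha}(\tau^{x}_{A'B'} \| \sigma^{x}_{A'B'}).
\end{align}
```

Summing the first identity over $x$ gives the argument of the logarithm in (9.2.49), and the last identity is the termwise step from (9.2.50) to (9.2.51).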
By applying the same method of proof for the sandwiched and geometric Rényi
relative entropies for the range of 𝛼 > 1 for which data processing holds, along with
their data processing and direct-sum properties, we conclude the same inequality
for the sandwiched and geometric Rényi relative entropies of entanglement. ■
The following additional facts are known specifically about the relative entropy
of entanglement and the Petz–, sandwiched, and geometric Rényi relative entropies
of entanglement.
Proposition 9.20
where the last three inequalities hold for the range of 𝛼 for which data
processing holds.
2. For every pure bipartite state 𝜓 𝐴𝐵 ,
where the last three equalities hold for the range of 𝛼 for which data
processing holds.
Remark: Observe that for pure states, the relative entropy of entanglement is equal to the
entanglement of formation (see (9.1.40)).
Proof:
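1. Let $\sigma_{AB}$ be an arbitrary separable state, which can be written as
\[ \sigma_{AB} = \sum_{x\in\mathcal{X}} p(x)\, \omega^{x}_{A} \otimes \tau^{x}_{B}, \tag{9.2.63} \]
\[ \mathbb{1}_A \otimes \sigma_B = \sum_{x\in\mathcal{X}} p(x)\, \mathbb{1}_A \otimes \tau^{x}_{B} \geq \sum_{x\in\mathcal{X}} p(x)\, \omega^{x}_{A} \otimes \tau^{x}_{B} = \sigma_{AB}. \tag{9.2.64} \]
\begin{align}
D(\rho_{AB} \| \sigma_{AB}) &\geq D(\rho_{AB} \| \mathbb{1}_A \otimes \sigma_B) \tag{9.2.65} \\
&\geq \inf_{\sigma_B} D(\rho_{AB} \| \mathbb{1}_A \otimes \sigma_B) \tag{9.2.66} \\
&= I(A\rangle B)_{\rho} \tag{9.2.67}
\end{align}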
for all separable states, where the optimization is with respect to every state
𝜎𝐵 on the right-hand side, and where we have used the expression in (7.2.92)
for coherent information. We thus have
By the same argument, but flipping the roles of Alice and Bob, we conclude
that
𝐸 𝑅 ( 𝐴; 𝐵) 𝜌 ≥ 𝐼 (𝐵⟩ 𝐴) 𝜌 . (9.2.69)
Combining this inequality with the one in (9.2.68) leads to the desired result.
The same proof, but using (9.2.64) and Property 4 of Propositions 7.26, 7.35,
and 7.46, leads to the inequalities in (9.2.56)–(9.2.58).
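The operator inequality in (9.2.64), which drives this part of the proof, can be sanity-checked numerically for a randomly generated separable state. The sketch below assumes NumPy; the sampling of the ensemble (helper `random_state`, seed, dimensions, and number of terms) is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_state(d):
    """A random density operator of dimension d (illustrative sampling)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

d_a, d_b, n = 2, 3, 5
p = rng.dirichlet(np.ones(n))                 # probability distribution p(x)
omegas = [random_state(d_a) for _ in range(n)]  # states omega^x_A
taus = [random_state(d_b) for _ in range(n)]    # states tau^x_B

# Separable state sigma_AB and its marginal sigma_B, as in (9.2.63).
sigma_ab = sum(px * np.kron(w, t) for px, w, t in zip(p, omegas, taus))
sigma_b = sum(px * t for px, t in zip(p, taus))

# (9.2.64): 1_A ⊗ sigma_B - sigma_AB should be positive semi-definite.
gap = np.kron(np.eye(d_a), sigma_b) - sigma_ab
min_eig = float(np.linalg.eigvalsh(gap).min())
print(min_eig)
```

The smallest eigenvalue of the difference should be non-negative up to rounding, reflecting that each $\omega^{x}_{A} \leq \mathbb{1}_A$.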
2. Let
\[ |\psi\rangle_{AB} = \sum_{k=1}^{r} \sqrt{\lambda_k}\, |e_k\rangle_A \otimes |f_k\rangle_B \tag{9.2.70} \]
be the Schmidt decomposition of $|\psi\rangle_{AB}$, where $r$ is the Schmidt rank, $\lambda_k > 0$
for all $1 \leq k \leq r$, and $\{|e_k\rangle_A\}_{k=1}^{r}$, $\{|f_k\rangle_B\}_{k=1}^{r}$ are orthonormal sets of vectors.
Since the entropy 𝐻 ( 𝐴𝐵)𝜓 vanishes for all pure states, we immediately have
where the equality $H(B)_{\psi} = H(A)_{\psi}$ follows from the Schmidt decomposition
in (9.2.70), which tells us that the reduced states $\psi_A$ and $\psi_B$ have the same
non-zero eigenvalues. Based on the fact that $E_R(A;B)_{\psi} \geq I(A\rangle B)_{\psi}$, which we
just proved above, we thus have the lower bound
The same reasoning, but using the lower bounds in (9.2.56)–(9.2.58), as well
as (7.11.139)–(7.11.141), leads to the inequalities
As discussed earlier, both of these generalized divergences can be cast into the
SDP standard forms in Definition 2.26, and thus their corresponding generalized
divergence of entanglement can be formulated as a cone program. A cone program
is an optimization problem over a convex cone3 with a convex objective function.
An SDP is a special case of a cone program in which the convex cone is the set of
positive semi-definite operators.
The convex cone of interest here is the set $\widehat{\mathrm{SEP}}(A:B)$ of all separable operators,
which we define as follows: $X_{AB} \in \widehat{\mathrm{SEP}}(A:B)$ if there exists a positive integer $\ell$
and positive semi-definite operators $\{P^{x}_{A}\}_{x=1}^{\ell}$ and $\{Q^{x}_{B}\}_{x=1}^{\ell}$ such that
\[ X_{AB} = \sum_{x=1}^{\ell} P^{x}_{A} \otimes Q^{x}_{B}. \tag{9.2.92} \]
3 A subset $C$ of a vector space is called a cone if $\alpha x \in C$ for every $x \in C$ and $\alpha > 0$. A convex
cone is one for which $\alpha x + \beta y \in C$ for all $\alpha, \beta > 0$ and $x, y \in C$.
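where
\[ G_{\max}(A;B)_{\rho} \coloneqq \inf \left\{ \mathrm{Tr}[X_{AB}] : \rho_{AB} \leq X_{AB},\ X_{AB} \in \widehat{\mathrm{SEP}} \right\}. \tag{9.2.94} \]
as required, where in the last step we made the change of variable $\mu\sigma_{AB} \equiv X_{AB}$.
Since $\sigma_{AB} \in \mathrm{SEP}(A:B)$ and $\mu \geq 0$, we have that $X_{AB} \in \widehat{\mathrm{SEP}}(A:B)$. ■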
Next, we show that the hypothesis testing relative entropy of entanglement can
be written as a cone program.
Proof: This follows from the definition in (9.2.3) and the dual formulation of the
hypothesis testing relative entropy stated in (7.9.5). ■
Recall that SEP = PPT in the case of qubit-qubit and qubit-qutrit states, which
means that the optimizations in (9.2.94) and (9.2.97) are SDPs when 𝜌 𝐴𝐵 is either
a two-qubit state or a qubit-qutrit state.
Since SEP ⊆ PPT′, optimizing over states in PPT′ can never lead to a value that
is greater than the value obtained by optimizing over separable states. Therefore,
as stated in (9.1.149),
𝑹( 𝐴; 𝐵) 𝜌 ≤ 𝑬 ( 𝐴; 𝐵) 𝜌 (9.3.3)
for every state 𝜌 𝐴𝐵 .
We are particularly interested throughout the rest of this book in the following
generalized Rains divergences for every state 𝜌 𝐴𝐵 :
1. The Rains relative entropy of 𝜌 𝐴𝐵 ,
\[ R(A;B)_{\rho} \coloneqq \inf_{\sigma_{AB}\in\mathrm{PPT}'(A:B)} D(\rho_{AB} \| \sigma_{AB}), \tag{9.3.4} \]
𝑅( 𝐴; 𝐵) 𝜌 ≤ 𝑅max ( 𝐴; 𝐵) 𝜌 ≤ 𝐸 𝑁 ( 𝐴; 𝐵) 𝜌 . (9.3.11)
Furthermore, for all $\alpha, \beta \in [1/2, 1) \cup (1, \infty)$ such that $\alpha < \beta$, we have that
\[ \widetilde{R}_{\alpha}(A;B)_{\rho} \leq \widetilde{R}_{\beta}(A;B)_{\rho}. \tag{9.3.12} \]
𝑹( 𝐴1 𝐴2 ; 𝐵1 𝐵2 ) 𝜌⊗𝜔 ≤ 𝑹( 𝐴1 ; 𝐵1 ) 𝜌 + 𝑹( 𝐴2 ; 𝐵2 )𝜔 . (9.3.18)
Remark: Note that the generalized Rains divergence is generally not a faithful entanglement
measure. Although 𝑹( 𝐴; 𝐵)𝜌 = 0 for all separable states 𝜌 𝐴𝐵 due to the containment SEP( 𝐴 :
𝐵) ⊆ PPT′ ( 𝐴 : 𝐵), the converse statement is not generally true because the infimum in the
definition of 𝑹( 𝐴; 𝐵)𝜌 is not generally achieved by a separable state.
Proof:
Now, recall from Lemma 9.14 that the set PPT′ is closed under completely
PPT-preserving channels. Based on this, it follows that the output operators of
the completely PPT-preserving channel P 𝐴𝐵→𝐴′ 𝐵′ are in the set PPT′ ( 𝐴′ : 𝐵′).
In other words, we have
$$\mathrm{PPT}'(A_1A_2:B_1B_2) = \left\{\sigma_{A_1A_2B_1B_2} : \sigma_{A_1A_2B_1B_2} \geq 0,\ \left\|\mathrm{T}_{B_1B_2}(\sigma_{A_1A_2B_1B_2})\right\|_1 \leq 1\right\}, \tag{9.3.27}$$
3. We have
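$$\boldsymbol{R}(A;B)_\rho = \inf_{\sigma_{AB} \in \mathrm{PPT}'(A:B)} \boldsymbol{D}\!\left(\sum_{x\in\mathcal{X}} p(x)\rho^x_{AB}\,\middle\|\,\sigma_{AB}\right). \tag{9.3.30}$$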
Let us restrict the optimization over all PPT′ operators to an optimization over sets $\{\sigma^x_{AB}\}_{x\in\mathcal{X}}$ of PPT′ operators indexed by the alphabet $\mathcal{X}$. Then, because $\mathrm{PPT}'(A:B)$ is a convex set, we have that $\sum_{x\in\mathcal{X}} p(x)\sigma^x_{AB} \in \mathrm{PPT}'(A:B)$. Therefore, using the joint convexity of $\boldsymbol{D}$, we obtain
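\begin{align}
\boldsymbol{R}(A;B)_\rho &\leq \inf_{\{\sigma^x_{AB}\}_x \subset \mathrm{PPT}'(A:B)} \boldsymbol{D}\!\left(\sum_{x\in\mathcal{X}} p(x)\rho^x_{AB}\,\middle\|\,\sum_{x\in\mathcal{X}} p(x)\sigma^x_{AB}\right) \tag{9.3.31}\\
&\leq \inf_{\{\sigma^x_{AB}\}_x \subset \mathrm{PPT}'(A:B)} \sum_{x\in\mathcal{X}} p(x)\,\boldsymbol{D}(\rho^x_{AB}\|\sigma^x_{AB}) \tag{9.3.32}\\
&\leq \sum_{x\in\mathcal{X}} p(x) \inf_{\sigma^x_{AB} \in \mathrm{PPT}'(A:B)} \boldsymbol{D}(\rho^x_{AB}\|\sigma^x_{AB}) \tag{9.3.33}\\
&= \sum_{x\in\mathcal{X}} p(x)\,\boldsymbol{R}(A;B)_{\rho^x}, \tag{9.3.34}
\end{align}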
as required. ■
$$\widetilde{R}_\alpha(A;B)_\rho \leq \max_{x\in\mathcal{X}} \widetilde{R}_\alpha(A;B)_{\rho^x}, \tag{9.3.35}$$
where $\rho_{AB} = \sum_{x\in\mathcal{X}} p(x)\rho^x_{AB}$.
Proof: We have
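$$\widetilde{R}_\alpha(A;B)_\rho = \inf_{\sigma_{AB} \in \mathrm{PPT}'(A:B)} \widetilde{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho^x_{AB}\,\middle\|\,\sigma_{AB}\right). \tag{9.3.36}$$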
Let us restrict the optimization over all PPT′ operators to an optimization over sets $\{\sigma^x_{AB}\}_{x\in\mathcal{X}}$ of PPT′ operators indexed by the alphabet $\mathcal{X}$. Then, because $\mathrm{PPT}'(A:B)$ is a convex set, we have that $\sum_{x\in\mathcal{X}} p(x)\sigma^x_{AB} \in \mathrm{PPT}'(A:B)$. Let us also recall from (7.5.174) that the sandwiched Rényi relative entropy is jointly quasi-convex, meaning that
$$\widetilde{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho^x_{AB}\,\middle\|\,\sum_{x\in\mathcal{X}} p(x)\sigma^x_{AB}\right) \leq \max_{x\in\mathcal{X}} \widetilde{D}_\alpha(\rho^x_{AB}\|\sigma^x_{AB}). \tag{9.3.37}$$
We thus obtain
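\begin{align}
\widetilde{R}_\alpha(A;B)_\rho &\leq \inf_{\{\sigma^x_{AB}\}_x \subset \mathrm{PPT}'(A:B)} \widetilde{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x)\rho^x_{AB}\,\middle\|\,\sum_{x\in\mathcal{X}} p(x)\sigma^x_{AB}\right) \tag{9.3.38}\\
&\leq \inf_{\{\sigma^x_{AB}\}_x \subset \mathrm{PPT}'(A:B)} \max_{x\in\mathcal{X}} \widetilde{D}_\alpha(\rho^x_{AB}\|\sigma^x_{AB}) \tag{9.3.39}\\
&\leq \max_{x\in\mathcal{X}} \inf_{\sigma^x_{AB} \in \mathrm{PPT}'(A:B)} \widetilde{D}_\alpha(\rho^x_{AB}\|\sigma^x_{AB}) \tag{9.3.40}\\
&= \max_{x\in\mathcal{X}} \widetilde{R}_\alpha(A;B)_{\rho^x}, \tag{9.3.41}
\end{align}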
as required. ■
Using the expression in (9.1.102) for ∥T𝐵 (𝜎𝐴𝐵 )∥ 1 , this set can equivalently be
written as follows:
PPT′ ( 𝐴 : 𝐵) = {T𝐵 (𝐾 𝐴𝐵 − 𝐿 𝐴𝐵 ) : T𝐵 (𝐾 𝐴𝐵 − 𝐿 𝐴𝐵 ) ≥ 0,
𝐾 𝐴𝐵 , 𝐿 𝐴𝐵 ≥ 0, Tr[𝐾 𝐴𝐵 + 𝐿 𝐴𝐵 ] ≤ 1}, (9.3.44)
In this section, we show how these characterizations of the set PPT′ allow us to
compute both the max-Rains relative entropy and the hypothesis testing Rains
relative entropy via semi-definite programs (Section 2.4).
We first consider the max-Rains relative entropy, which we recall is defined as
Let us also recall from (7.8.4) that 𝐷 max (𝜌 𝐴𝐵 ∥𝜎𝐴𝐵 ) can be written as follows:
As shown in the discussion after (7.8.4), the optimization in the equation above is a
semi-definite program (SDP). This, along with the definition in (9.3.42) of the set
PPT′ ( 𝐴 : 𝐵), leads to the following SDP formulation for 𝑅max .
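\begin{align}
W_{\max}(A;B)_\rho &= \inf_{K_{AB},L_{AB}\geq 0} \left\{\operatorname{Tr}[K_{AB}+L_{AB}] : \mathrm{T}_B[K_{AB}-L_{AB}] \geq \rho_{AB}\right\} \tag{9.3.48}\\
&= \sup_{Y_{AB}\geq 0} \left\{\operatorname{Tr}[Y_{AB}\rho_{AB}] : \left\|\mathrm{T}_B[Y_{AB}]\right\|_\infty \leq 1\right\}. \tag{9.3.49}
\end{align}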
Proof: First, we establish the equality for 𝑊max ( 𝐴; 𝐵) 𝜌 in (9.3.48). Due to the fact
that the infimum over PPT′ operators in the definition of 𝑅max can be achieved, we
have that
inf{Tr[𝐾 𝐴𝐵 + 𝐿 𝐴𝐵 ] : 𝜌 𝐴𝐵 ≤ T𝐵 [𝐾 𝐴𝐵 − 𝐿 𝐴𝐵 ], 𝐾 𝐴𝐵 , 𝐿 𝐴𝐵 ≥ 0}
≤ inf{∥T𝐵 (𝑆 𝐴𝐵 )∥ 1 : 𝜌 𝐴𝐵 ≤ 𝑆 𝐴𝐵 }. (9.3.58)
To see the opposite inequality, let 𝐾 𝐴𝐵 and 𝐿 𝐴𝐵 be arbitrary operators such that
𝐾 𝐴𝐵 , 𝐿 𝐴𝐵 ≥ 0 and 𝜌 𝐴𝐵 ≤ T𝐵 [𝐾 𝐴𝐵 − 𝐿 𝐴𝐵 ]. Then, setting 𝑆 𝐴𝐵 = T𝐵 [𝐾 𝐴𝐵 − 𝐿 𝐴𝐵 ],
we find that 𝜌 𝐴𝐵 ≤ 𝑆 𝐴𝐵 and
∥T𝐵 (𝑆 𝐴𝐵 ) ∥ 1 = ∥𝐾 𝐴𝐵 − 𝐿 𝐴𝐵 ∥ 1 (9.3.59)
≤ ∥𝐾 𝐴𝐵 ∥ 1 + ∥𝐿 𝐴𝐵 ∥ 1 (9.3.60)
= Tr[𝐾 𝐴𝐵 + 𝐿 𝐴𝐵 ], (9.3.61)
inf{∥T𝐵 (𝑆 𝐴𝐵 )∥ 1 : 𝜌 𝐴𝐵 ≤ 𝑆 𝐴𝐵 } ≤
inf{Tr[𝐾 𝐴𝐵 + 𝐿 𝐴𝐵 ] : 𝜌 𝐴𝐵 ≤ T𝐵 (𝐾 𝐴𝐵 − 𝐿 𝐴𝐵 ), 𝐾 𝐴𝐵 , 𝐿 𝐴𝐵 ≥ 0}. (9.3.62)
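with
$$X = \begin{pmatrix} K_{AB} & 0 \\ 0 & L_{AB} \end{pmatrix}, \qquad C = \begin{pmatrix} \mathbb{1}_{AB} & 0 \\ 0 & \mathbb{1}_{AB} \end{pmatrix}, \tag{9.3.64}$$
$$\Phi(X) = \mathrm{T}_B[K_{AB}-L_{AB}], \qquad D = \rho_{AB}. \tag{9.3.65}$$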
Therefore,
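$$\Phi^\dagger(Y) = \begin{pmatrix} \mathrm{T}_B[Y_{AB}] & 0 \\ 0 & -\mathrm{T}_B[Y_{AB}] \end{pmatrix} \tag{9.3.70}$$
so that $\Phi^\dagger(Y) \leq C$ is equivalent to
$$\begin{pmatrix} \mathrm{T}_B[Y_{AB}] & 0 \\ 0 & -\mathrm{T}_B[Y_{AB}] \end{pmatrix} \leq \begin{pmatrix} \mathbb{1}_{AB} & 0 \\ 0 & \mathbb{1}_{AB} \end{pmatrix}. \tag{9.3.71}$$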
Due to the additivity of 𝐷 max , Proposition 9.25 implies that 𝑅max is subadditive:
𝑅max ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 ) 𝜌⊗𝜔 ≤ 𝑅max ( 𝐴1 ; 𝐵1 ) 𝜌 + 𝑅max ( 𝐴2 ; 𝐵2 )𝜔 , (9.3.73)
where the inequality holds for all states 𝜌 𝐴1 𝐵1 and 𝜔 𝐴2 𝐵2 . Using the dual formulation
in (9.3.48) for 𝑅max , we find that the reverse inequality also holds, implying that
𝑅max is an additive entanglement measure.
Proof: We prove the inequality reverse to the one in (9.3.73). To this end, we
employ the dual formulation of 𝑅max in (9.3.49). Let 𝑌 𝐴1 𝐵1 and 𝑆 𝐴2 𝐵2 be arbitrary
operators satisfying
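\begin{align}
\left\|\mathrm{T}_{B_1}[Y_{A_1B_1}]\right\|_\infty &\leq 1, \qquad Y_{A_1B_1} \geq 0, \tag{9.3.75}\\
\left\|\mathrm{T}_{B_2}[S_{A_2B_2}]\right\|_\infty &\leq 1, \qquad S_{A_2B_2} \geq 0. \tag{9.3.76}
\end{align}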
Then it follows from multiplicativity of the Schatten ∞-norm under tensor products
(see (2.2.96)) that
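\begin{align}
\left\|\mathrm{T}_{B_1B_2}[Y_{A_1B_1}\otimes S_{A_2B_2}]\right\|_\infty &= \left\|\mathrm{T}_{B_1}[Y_{A_1B_1}]\otimes \mathrm{T}_{B_2}[S_{A_2B_2}]\right\|_\infty \tag{9.3.77}\\
&= \left\|\mathrm{T}_{B_1}[Y_{A_1B_1}]\right\|_\infty \left\|\mathrm{T}_{B_2}[S_{A_2B_2}]\right\|_\infty \tag{9.3.78}\\
&\leq 1. \tag{9.3.79}
\end{align}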
follows. ■
We now consider the hypothesis testing Rains relative entropy, which is defined
as
$$R_H^\varepsilon(A;B)_\rho = \inf_{\sigma_{AB} \in \mathrm{PPT}'(A:B)} D_H^\varepsilon(\rho_{AB}\|\sigma_{AB}), \tag{9.3.84}$$
for 𝜀 ∈ [0, 1]. Recall the primal and dual formulations of 𝐷 𝜀𝐻 from Proposition 7.66.
Using the dual formulation, we obtain the following:
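\begin{align}
W_H^\varepsilon(A;B)_\rho &:= \sup_{\mu\geq 0,\ Z_{AB},K_{AB},L_{AB}\geq 0} \big\{\mu(1-\varepsilon) - \operatorname{Tr}[Z_{AB}] : \mu\rho_{AB} \leq \mathrm{T}_B(K_{AB}-L_{AB}) + Z_{AB}, \notag\\
&\qquad\qquad \mathrm{T}_B(K_{AB}-L_{AB}) \geq 0,\ \operatorname{Tr}[K_{AB}+L_{AB}] \leq 1\big\} \tag{9.3.86}\\
&= \inf_{M_{AB},N_{AB}\geq 0} \left\{\left\|\mathrm{T}_B(M_{AB}+N_{AB})\right\|_\infty : \operatorname{Tr}[M_{AB}\rho_{AB}] \geq 1-\varepsilon,\ M_{AB} \leq \mathbb{1}_{AB}\right\}. \tag{9.3.87}
\end{align}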
Proof: Using the dual SDP formulation of the hypothesis testing relative entropy
from Proposition 7.66, we obtain
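$$W_H^\varepsilon(A;B)_\rho = \sup_{\mu\geq 0,\ Z_{AB}\geq 0,\ \sigma_{AB}\in\mathrm{PPT}'(A:B)} \left\{\mu(1-\varepsilon) - \operatorname{Tr}[Z_{AB}] : \mu\rho_{AB} \leq \sigma_{AB} + Z_{AB}\right\} \tag{9.3.88}$$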
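$$B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{9.3.92}$$
Setting
$$Y = \begin{pmatrix} M_{AB} & 0 & 0 \\ 0 & N_{AB} & 0 \\ 0 & 0 & \lambda \end{pmatrix}, \tag{9.3.93}$$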
we compute the dual map Φ† as follows:
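\begin{align}
\operatorname{Tr}[Y\Phi(X)] &= \operatorname{Tr}[M_{AB}(\mu\rho_{AB} - \mathrm{T}_B(K_{AB}-L_{AB}) - Z_{AB})] - \operatorname{Tr}[N_{AB}\mathrm{T}_B(K_{AB}-L_{AB})] \notag\\
&\qquad + \lambda\operatorname{Tr}[K_{AB}+L_{AB}] \tag{9.3.94}\\
&= \mu\operatorname{Tr}[M_{AB}\rho_{AB}] - \operatorname{Tr}[M_{AB}Z_{AB}] + \operatorname{Tr}[(\lambda\mathbb{1}_{AB} - \mathrm{T}_B(M_{AB}+N_{AB}))K_{AB}] \notag\\
&\qquad + \operatorname{Tr}[(\lambda\mathbb{1}_{AB} + \mathrm{T}_B(M_{AB}+N_{AB}))L_{AB}], \tag{9.3.95}
\end{align}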
which implies that
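$$\Phi^\dagger(Y) = \begin{pmatrix} \operatorname{Tr}[M_{AB}\rho_{AB}] & 0 & 0 & 0 \\ 0 & -M_{AB} & 0 & 0 \\ 0 & 0 & \lambda\mathbb{1}_{AB} - \mathrm{T}_B(M_{AB}+N_{AB}) & 0 \\ 0 & 0 & 0 & \lambda\mathbb{1}_{AB} + \mathrm{T}_B(M_{AB}+N_{AB}) \end{pmatrix}. \tag{9.3.96}$$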
Then using the standard form of the dual program in (2.4.4), i.e.,
$$\inf_{Y\geq 0} \left\{\operatorname{Tr}[BY] : \Phi^\dagger(Y) \geq A\right\}, \tag{9.3.97}$$
we find that the dual SDP is given by
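$$\inf_{\lambda,M_{AB},N_{AB}\geq 0} \left\{\lambda : \operatorname{Tr}[M_{AB}\rho_{AB}] \geq 1-\varepsilon,\ M_{AB} \leq \mathbb{1}_{AB},\ \lambda\mathbb{1}_{AB} \pm \mathrm{T}_B(M_{AB}+N_{AB}) \geq 0\right\}. \tag{9.3.98}$$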
This can alternatively be written as
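$$\inf_{M_{AB},N_{AB}\geq 0} \left\{\left\|\mathrm{T}_B(M_{AB}+N_{AB})\right\|_\infty : \operatorname{Tr}[M_{AB}\rho_{AB}] \geq 1-\varepsilon,\ M_{AB} \leq \mathbb{1}_{AB}\right\}. \tag{9.3.99}$$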
Tr𝐸 [𝜔 𝐴𝐵𝐸 ] = 𝜌 𝐴𝐵 , and this optimization corresponds to the worst possible scenario
in which the eavesdropper attempts to “squash down” the correlations of Alice and
Bob, i.e., to reduce the value of 𝐼 ( 𝐴; 𝐵|𝐸)𝜔 as much as possible. This cryptographic
perspective actually allows us to write the squashed entanglement in an alternative
way, which we do in Proposition 9.37 below.
We begin by establishing some basic properties of squashed entanglement. As
we will see, the squashed entanglement possesses all of the desired properties of an
entanglement measure stated at the beginning of Section 9.1.
𝐸 sq ( 𝐴; 𝐵) 𝜌 ≥ 0. (9.4.5)
𝐸 sq ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 ) 𝜌 ≥ 𝐸 sq ( 𝐴1 ; 𝐵1 ) 𝜌 + 𝐸 sq ( 𝐴1 ; 𝐵2 ) 𝜌 + 𝐸 sq ( 𝐴2 ; 𝐵1 ) 𝜌
+ 𝐸 sq ( 𝐴2 ; 𝐵2 ) 𝜌 . (9.4.7)
𝐸 sq ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 )𝜎 = 𝐸 sq ( 𝐴1 ; 𝐵1 )𝜔 + 𝐸 sq ( 𝐴2 ; 𝐵2 )𝜏 . (9.4.8)
Proof:
1. This follows immediately from the fact that the conditional mutual information
of an arbitrary state is non-negative (Theorem 7.6).
2. The statement “if 𝜎𝐴𝐵 is a separable state, then 𝐸 sq ( 𝐴; 𝐵)𝜎 = 0” follows from
the line of reasoning in (9.1.157)–(9.1.159) used to motivate the definition of
squashed entanglement. For a proof of the converse statement, please consult
the Bibliographic Notes in Section 9.6.
3. Let 𝜔𝑥𝐴𝐵𝐸 denote an arbitrary extension of 𝜌 𝑥𝐴𝐵 . Then
$$\omega_{ABEX} := \sum_{x\in\mathcal{X}} p(x)\,\omega^x_{ABE} \otimes |x\rangle\langle x|_X \tag{9.4.9}$$
5. To see the equality in (9.4.8), first consider that for a tensor-product state
𝜔 𝐴1 𝐵1 ⊗ 𝜏𝐴2 𝐵2 , the reduced state on systems 𝐴1 𝐵2 is the product state 𝜔 𝐴1 ⊗ 𝜏𝐵2 ,
and the reduced state on systems 𝐴2 𝐵1 is the product state 𝜏𝐴2 ⊗ 𝜔 𝐵1 . Thus,
faithfulness implies that 𝐸 sq ( 𝐴1 ; 𝐵2 )𝜎 = 𝐸 sq ( 𝐴2 ; 𝐵1 )𝜎 = 0, and then the
monogamy inequality in (9.4.7) implies that
𝐸 sq ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 )𝜎 ≥ 𝐸 sq ( 𝐴1 ; 𝐵1 )𝜔 + 𝐸 sq ( 𝐴2 ; 𝐵2 )𝜏 . (9.4.15)
2 · 𝐸 sq ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 )𝜎 ≤ 𝐼 ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 |𝐸 1 𝐸 2 )𝜔⊗𝜏 (9.4.16)
= 𝐼 ( 𝐴1 ; 𝐵1 |𝐸 1 )𝜔 + 𝐼 ( 𝐴2 ; 𝐵2 |𝐸 2 )𝜏 , (9.4.17)
where the equality follows from the additivity of conditional mutual information
with respect to tensor-product states (Proposition 7.9). Since the extensions
𝜔 𝐴1 𝐵1 𝐸1 and 𝜏𝐴2 𝐵2 𝐸2 are arbitrary, the following inequality holds:
𝐸 sq ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 )𝜎 ≤ 𝐸 sq ( 𝐴1 ; 𝐵1 )𝜔 + 𝐸 sq ( 𝐴2 ; 𝐵2 )𝜏 . (9.4.18)
Proof: We prove that the conditions of Lemma 9.2 hold and then we apply it. The
first part of the proof follows from the fact that conditional mutual information does
not increase under the action of local channels (see Proposition 7.9): for a tripartite
state 𝜉 𝐴𝐵𝐸 and local channels N 𝐴→𝐴′ and M𝐵→𝐵′ ,
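To this end, let $\omega_{XABE}$ be an arbitrary extension of $\rho_{XAB}$. After the action of a local completely dephasing channel $\Delta_X(\cdot) := \sum_{x\in\mathcal{X}} |x\rangle\langle x|_X (\cdot) |x\rangle\langle x|_X$, it follows that the state $\theta_{XABE} := \Delta_X(\omega_{XABE})$ has the following form:
$$\theta_{XABE} = \sum_{x\in\mathcal{X}} p(x)\,|x\rangle\langle x|_X \otimes \theta^x_{ABE}, \tag{9.4.26}$$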
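where $\theta^x_{ABE}$ is an extension of $\rho^x_{AB}$. To see this, let $|\phi^x\rangle_{ABR}$ purify $\rho^x_{AB}$ for each $x \in \mathcal{X}$, and consider that
$$|\varphi\rangle_{XX_EABR} := \sum_{x\in\mathcal{X}} \sqrt{p(x)}\,|x\rangle_X |x\rangle_{X_E} |\phi^x\rangle_{ABR} \tag{9.4.27}$$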
The last equality follows from the chain rule for conditional mutual information and
the second-to-last inequality from non-negativity of conditional mutual information:
𝐼 (𝑋; 𝐵|𝐸) 𝜌 ≥ 0. The final inequality holds because the conditional mutual
information does not increase under the action of a local channel on system 𝑋 (see
Proposition 7.9). Since the inequality holds for an arbitrary extension of $\rho_{XAB}$, we conclude that
$$\sum_{x\in\mathcal{X}} p(x)\,E_{\mathrm{sq}}(A;B)_{\rho^x} \leq E_{\mathrm{sq}}(XA;B)_\rho. \tag{9.4.35}$$
follows similarly. ■
𝐸 sq ( 𝐴; 𝐵) 𝜌 ≤ 𝐸 𝐹 ( 𝐴; 𝐵) 𝜌 , (9.4.37)
where $p(x)$ is a probability distribution and $\{\phi^x_{AB}\}_x$ is a set of pure states satisfying $\rho_{AB} = \sum_x p(x)\phi^x_{AB}$, then it follows that
$$E_{\mathrm{sq}}(A;B)_\rho \leq \frac{1}{2} I(A;B|X)_\omega = H(A|X)_\omega. \tag{9.4.39}$$
Since the inequality holds for all such extensions, the inequality in (9.4.37) follows
by applying the definition of entanglement of formation in (9.1.42). ■
For pure bipartite states, the entanglement of formation is simply the entropy
of the reduced state of one of the subsystems. It turns out that the squashed
entanglement reduces to the same quantity for pure bipartite states.
Proof: Every extension 𝜔 𝐴𝐵𝐸 of the pure bipartite state 𝜓 𝐴𝐵 is a product state of
the form 𝜔 𝐴𝐵𝐸 = 𝜓 𝐴𝐵 ⊗ 𝜌 𝐸 for some state 𝜌 𝐸 . By Proposition 7.9, it follows that
where the second equality holds because 𝐻 ( 𝐴𝐵)𝜓 = 0 and 𝐻 ( 𝐴)𝜓 = 𝐻 (𝐵)𝜓 for a
pure bipartite state. We thus have 𝐸 sq ( 𝐴; 𝐵)𝜓 = 𝐻 ( 𝐴)𝜓 . ■
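For a pure two-qubit state, the quantity 𝐻(𝐴)𝜓 can be evaluated directly from the amplitudes. The following is a minimal sketch, assuming the state is given as a 2×2 table of amplitudes `c[a][b]` for the basis vectors |a⟩|b⟩; the function name `reduced_entropy` and the numerical cutoff are illustrative choices, not taken from the text.

```python
import math


def reduced_entropy(c):
    """Entropy H(A) in bits for the pure state sum_{a,b} c[a][b] |a>|b>.

    c is a 2x2 nested list of (possibly complex) amplitudes with unit norm.
    """
    # The reduced state is rho_A = C C^dagger; for a 2x2 matrix its
    # eigenvalues follow from the trace and the determinant.
    r00 = sum(abs(c[0][b]) ** 2 for b in range(2))
    r11 = sum(abs(c[1][b]) ** 2 for b in range(2))
    r01 = sum(c[0][b] * complex(c[1][b]).conjugate() for b in range(2))
    tr = r00 + r11
    det = r00 * r11 - abs(r01) ** 2
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    eigs = [(tr + disc) / 2, (tr - disc) / 2]
    # Convention 0 * log2(0) = 0: skip numerically vanishing eigenvalues.
    return -sum(p * math.log2(p) for p in eigs if p > 1e-15)


s = 1 / math.sqrt(2)
h_bell = reduced_entropy([[s, 0], [0, s]])   # (|00> + |11>)/sqrt(2)
h_prod = reduced_entropy([[1, 0], [0, 0]])   # product state |00>
print(h_bell, h_prod)
```

Under the stated assumptions, the maximally entangled state gives one bit and the product state gives zero, matching the statement that the squashed entanglement of a pure state equals the entropy of its reduced state.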
Let us now return to the discussion after Definition 9.31 on the cryptographic
interpretation of squashed entanglement. We stated that the quantum conditional
mutual information 𝐼 ( 𝐴; 𝐵|𝐸)𝜔 can be interpreted as the amount of correlations
between Alice (𝐴) and Bob (𝐵) from the point of view of an eavesdropper (𝐸),
where 𝜔 𝐴𝐵𝐸 is the joint state shared by all three parties. If the eavesdropper wants
to reduce, or “squash” these correlations as much as possible, then we optimize
with respect to every state 𝜔 𝐴𝐵𝐸 that is consistent with the state 𝜌 𝐴𝐵 shared by
Alice and Bob, leading to the squashed entanglement 𝐸 sq ( 𝐴; 𝐵) 𝜌 . Now, recall
Proposition 4.4, which states that for every extension $\omega_{ABE}$ of a given state $\rho_{AB}$ there exists a quantum channel $\mathcal{S}_{E'\to E}$ such that $\mathcal{S}_{E'\to E}(\psi^\rho_{ABE'}) = \omega_{ABE}$, where $\psi^\rho_{ABE'}$ is a purification of $\rho_{AB}$. We therefore immediately have the following.
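$$E_{\mathrm{sq}}(A;B)_\rho = \frac{1}{2}\inf_{\mathcal{S}_{E'\to E}} \left\{I(A;B|E)_\omega : \omega_{ABE} = \mathcal{S}_{E'\to E}(\psi^\rho_{ABE'})\right\}, \tag{9.4.43}$$
$$E_{\mathrm{sq}}(A;B)_\rho = \frac{1}{2}\inf_{\mathcal{V}_{E'\to EF}} \left\{H(B|E)_\theta + H(B|F)_\theta : \theta_{BEF} := \mathcal{V}_{E'\to EF}(\psi^\rho_{ABE'})\right\}, \tag{9.4.44}$$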
The act of “squashing” the correlations between Alice and Bob can thus be
thought of explicitly in terms of an eavesdropper applying the channel S𝐸 ′ →𝐸 to
their purifying system 𝐸 ′ of 𝜌 𝐴𝐵 . For this reason, we call S𝐸 ′ →𝐸 a squashing
channel.
Now, since 𝜃 𝐴𝐵𝐸 𝐹 is a pure state, we have from duality of conditional entropy that
Therefore,
𝐼 ( 𝐴; 𝐵|𝐸)𝜔 = 𝐻 (𝐵|𝐸)𝜃 + 𝐻 (𝐵|𝐹)𝜃 . (9.4.48)
We conclude that (9.4.44) holds because the squashing channel S𝐸 ′ →𝐸 is arbitrary
in the development above. ■
𝐹 (𝜌 𝐴𝐵 , 𝜎𝐴𝐵 ) ≥ 1 − 𝜀, (9.4.49)
for 𝜀 ∈ [0, 1]. Then the following bound applies to their squashed entanglements:
$$\left|E_{\mathrm{sq}}(A;B)_\rho - E_{\mathrm{sq}}(A;B)_\sigma\right| \leq \sqrt{\varepsilon}\,\log_2 \min\{d_A, d_B\} + g_2(\sqrt{\varepsilon}), \tag{9.4.50}$$
where
𝑔2 (𝛿) B (𝛿 + 1) log2 (𝛿 + 1) − 𝛿 log2 𝛿. (9.4.51)
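To get a feel for how the right-hand side of (9.4.50) scales with 𝜀, it can be evaluated numerically. The following is a minimal sketch; the function names `g2` and `continuity_bound` are illustrative, and the convention 𝑔2(0) = 0 (the limit 𝛿 log2 𝛿 → 0 as 𝛿 → 0) is an assumption made here so that the code is defined at 𝛿 = 0.

```python
import math


def g2(delta):
    """g2(delta) = (delta + 1) log2(delta + 1) - delta log2(delta)."""
    if delta == 0:
        # Assumed convention: 0 * log2(0) = 0, so g2(0) = 0.
        return 0.0
    return (delta + 1) * math.log2(delta + 1) - delta * math.log2(delta)


def continuity_bound(eps, d_a, d_b):
    """Right-hand side of (9.4.50) for states with fidelity at least 1 - eps."""
    root = math.sqrt(eps)
    return root * math.log2(min(d_a, d_b)) + g2(root)


for eps in (0.0, 1e-4, 1e-2, 1e-1):
    print(eps, continuity_bound(eps, 2, 2))
```

The printed values increase with 𝜀 and vanish at 𝜀 = 0, which is the qualitative content of the continuity statement.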
Proof: Due to Uhlmann’s theorem (Theorem 6.8) and Proposition 4.4, for an
arbitrary extension 𝜌 𝐴𝐵𝐸 of 𝜌 𝐴𝐵 , there exists an extension 𝜎𝐴𝐵𝐸 of 𝜎𝐴𝐵 such that
By the relation between trace distance and fidelity (Theorem 6.14), it follows that
$$\frac{1}{2}\left\|\rho_{ABE} - \sigma_{ABE}\right\|_1 \leq \sqrt{\varepsilon}. \tag{9.4.53}$$
Then, applying the uniform continuity of conditional mutual information (Proposition 7.10), we find that
9.5 Summary
In this chapter, we studied entanglement measures for quantum states and quantum
channels. The defining property of an entanglement measure for states is mono-
tonicity under local operations and classical communication (LOCC): a function
𝐸 : H 𝐴𝐵 → R is an entanglement measure if 𝐸 (𝜌 𝐴𝐵 ) ≥ 𝐸 (L(𝜌 𝐴𝐵 )) for every
bipartite state 𝜌 𝐴𝐵 and every LOCC channel L. LOCC monotonicity can be
thought of as a special kind of data-processing inequality, and it is a core concept
in entanglement theory in the same way that the data-processing inequality is the
core concept behind generalized divergence.
An important type of state entanglement measure for our purposes in this book
is a divergence-based measure, in which the entanglement in a given bipartite
quantum state is quantified by its divergence with the set of separable states. As
our divergence, we take a generalized divergence 𝑫 : D(H) × L+ (H) → R, and
we call the resulting quantity generalized divergence of entanglement. Due to the
data-processing inequality (which holds for a generalized divergence by definition),
we immediately obtain LOCC monotonicity for the generalized divergence of
entanglement, thus making it an entanglement measure. We also consider the
divergence with the larger set of PPT′ operators that contains all separable states,
and call the resulting quantity generalized Rains divergence.
(2004), and the explicit bound given here is due to Shirokov (2017).
Here we prove that the quantity ∥T𝐵 (𝜌 𝐴𝐵 ) ∥ 1 has the following primal and dual
SDP formulations:
where the optimization in the first line is with respect to Hermitian 𝑅 𝐴𝐵 . Since this
is the core quantity underlying both the negativity and the log-negativity, it follows
that these entanglement measures can be computed by means of semi-definite
programs. To see the first equality, consider that
The second equality follows from Hölder duality (see (2.2.98)), and since T𝐵 (𝜌 𝐴𝐵 )
is Hermitian, it suffices to optimize over Hermitian 𝑅 𝐴𝐵 . The third equality follows
because the partial transpose is its own Hilbert–Schmidt adjoint. The fourth equality
follows from the substitution 𝑅 𝐴𝐵 → T𝐵 (𝑅 𝐴𝐵 ). The final equality follows because
the inequality ∥T𝐵 (𝑅 𝐴𝐵 )∥ ∞ ≤ 1 is equivalent to −1 𝐴𝐵 ≤ T𝐵 (𝑅 𝐴𝐵 ) ≤ 1 𝐴𝐵 for a
Hermitian operator T𝐵 (𝑅 𝐴𝐵 ).
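As a sanity check of the Hölder-duality characterization described above, consider the two-qubit maximally entangled state $\Phi_{AB} = |\Phi\rangle\langle\Phi|_{AB}$ with $|\Phi\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$. This state, and the swap operator $F_{AB}$ introduced below, are illustrative choices and are not taken from the surrounding text.

```latex
\begin{align}
\mathrm{T}_B(\Phi_{AB}) &= \tfrac{1}{2} F_{AB},
  \qquad F_{AB} := \sum_{i,j\in\{0,1\}} |i\rangle\langle j|_A \otimes |j\rangle\langle i|_B, \\
\left\|\mathrm{T}_B(\Phi_{AB})\right\|_1 &= \tfrac{1}{2}\left\|F_{AB}\right\|_1 = \tfrac{1}{2}\cdot 4 = 2, \\
\operatorname{Tr}[R_{AB}\Phi_{AB}] &= 2 \quad \text{for } R_{AB} = 2\Phi_{AB},
  \text{ where } \mathrm{T}_B(R_{AB}) = F_{AB} \text{ and } -\mathbb{1}_{AB} \leq F_{AB} \leq \mathbb{1}_{AB}.
\end{align}
```

The second line uses that $F_{AB}$ is Hermitian and unitary on a four-dimensional space, so its four singular values all equal one. The third line exhibits a feasible Hermitian operator $R_{AB}$ attaining the value 2, so the supremum over Hermitian $R_{AB}$ with $\|\mathrm{T}_B(R_{AB})\|_\infty \leq 1$ is achieved in this example.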
Now consider that the set of Hermitian operators is equivalent to the set of
operators formed as differences of positive semi-definite operators. So this implies
that
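$$\left\|\mathrm{T}_B(\rho_{AB})\right\|_1 = \sup_{P_{AB},Q_{AB}\geq 0} \left\{\operatorname{Tr}[(P_{AB}-Q_{AB})\rho_{AB}] : -\mathbb{1}_{AB} \leq \mathrm{T}_B(P_{AB}-Q_{AB}) \leq \mathbb{1}_{AB}\right\}. \tag{9.A.8}$$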
Then by setting
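$$X = \begin{pmatrix} P_{AB} & 0 \\ 0 & Q_{AB} \end{pmatrix}, \qquad A = \begin{pmatrix} \rho_{AB} & 0 \\ 0 & -\rho_{AB} \end{pmatrix}, \qquad B = \begin{pmatrix} \mathbb{1}_{AB} & 0 \\ 0 & \mathbb{1}_{AB} \end{pmatrix}, \tag{9.A.9}$$
$$\Phi(X) = \begin{pmatrix} \mathrm{T}_B(P_{AB}-Q_{AB}) & 0 \\ 0 & -\mathrm{T}_B(P_{AB}-Q_{AB}) \end{pmatrix}, \tag{9.A.10}$$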
this primal SDP is now in the standard form of (2.4.3). Then, setting
$$Y = \begin{pmatrix} K_{AB} & 0 \\ 0 & L_{AB} \end{pmatrix}, \tag{9.A.11}$$
we can calculate the Hilbert–Schmidt adjoint of Φ as
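\begin{align}
\operatorname{Tr}[Y\Phi(X)] &= \operatorname{Tr}\!\left[\begin{pmatrix} K_{AB} & 0 \\ 0 & L_{AB} \end{pmatrix}\begin{pmatrix} \mathrm{T}_B(P_{AB}-Q_{AB}) & 0 \\ 0 & -\mathrm{T}_B(P_{AB}-Q_{AB}) \end{pmatrix}\right] \tag{9.A.12}\\
&= \operatorname{Tr}[K_{AB}\,\mathrm{T}_B(P_{AB}-Q_{AB})] - \operatorname{Tr}[L_{AB}\,\mathrm{T}_B(P_{AB}-Q_{AB})] \tag{9.A.13}\\
&= \operatorname{Tr}[\mathrm{T}_B(K_{AB})(P_{AB}-Q_{AB})] - \operatorname{Tr}[\mathrm{T}_B(L_{AB})(P_{AB}-Q_{AB})] \tag{9.A.14}\\
&= \operatorname{Tr}[\mathrm{T}_B(K_{AB}-L_{AB})P_{AB}] + \operatorname{Tr}[\mathrm{T}_B(L_{AB}-K_{AB})Q_{AB}] \tag{9.A.15}\\
&= \operatorname{Tr}\!\left[\begin{pmatrix} \mathrm{T}_B(K_{AB}-L_{AB}) & 0 \\ 0 & -\mathrm{T}_B(K_{AB}-L_{AB}) \end{pmatrix}\begin{pmatrix} P_{AB} & 0 \\ 0 & Q_{AB} \end{pmatrix}\right], \tag{9.A.16}
\end{align}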
so that
$$\Phi^\dagger(Y) = \begin{pmatrix} \mathrm{T}_B(K_{AB}-L_{AB}) & 0 \\ 0 & -\mathrm{T}_B(K_{AB}-L_{AB}) \end{pmatrix}. \tag{9.A.17}$$
Then, plugging into the standard form for the dual SDP in (2.4.4) and simplifying a
bit, we find that it is given by
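\begin{align}
&\inf_{K_{AB},L_{AB}\geq 0} \left\{\operatorname{Tr}[K_{AB}+L_{AB}] : \mathrm{T}_B(K_{AB}-L_{AB}) \geq \rho_{AB},\ -\mathrm{T}_B(K_{AB}-L_{AB}) \geq -\rho_{AB}\right\} \notag\\
&\quad = \inf_{K_{AB},L_{AB}\geq 0} \left\{\operatorname{Tr}[K_{AB}+L_{AB}] : \mathrm{T}_B(K_{AB}-L_{AB}) = \rho_{AB}\right\}. \tag{9.A.18}
\end{align}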
Chapter 10: Entanglement Measures for Quantum Channels
Remark: Note that it suffices to optimize (10.1.1) with respect to pure states 𝜓 𝑅 𝐴, with the
dimension of 𝑅 equal to the dimension of 𝐴, when calculating the entanglement of a channel, so
that
$$E(\mathcal{N}) := \sup_{\psi_{RA}} E(R;B)_\omega, \tag{10.1.2}$$
where 𝜔 𝑅𝐵 B N 𝐴→𝐵 (𝜓 𝑅 𝐴). This follows from the fact that an entanglement measure for states
is, by definition, monotone under LOCC channels. It is therefore monotone under a local partial
trace channel. In particular, consider a mixed state 𝜌 𝑅 𝐴, with the dimension of 𝑅 not necessarily
equal to the dimension of 𝐴. Let 𝜔 𝑅𝐵 = N 𝐴→𝐵 (𝜌 𝑅 𝐴). Then, if we take a purification 𝜙 𝑅′ 𝑅 𝐴 of
𝜌 𝑅 𝐴, we obtain
where 𝜏𝑅′ 𝑅𝐵 = N 𝐴→𝐵 (𝜙 𝑅′ 𝑅 𝐴) and to obtain the inequality we used the fact that 𝐸 is monotone
under the partial trace channel Tr 𝑅′ . This demonstrates that it suffices to optimize with respect
to pure states when calculating the entanglement of a channel. Furthermore, by the Schmidt
decomposition theorem (Theorem 2.2), the dimension of the purifying system 𝑅 ′ 𝑅 need not
exceed the dimension of 𝐴.
Note that in the definition above the channel N 𝐴→𝐵 acts locally on the state
𝜓 𝑅 𝐴 to produce the state 𝜔 𝑅𝐵 = N 𝐴→𝐵 (𝜓 𝑅 𝐴 ). We can thus view N 𝐴→𝐵 as an
LOCC channel, which means that 𝐸 (𝑅; 𝐵)𝜔 ≤ 𝐸 (𝑅; 𝐴)𝜓 , by the definition of an
entanglement measure for states. In other words, by sending one share of a bipartite
state through the channel N, the entanglement can only stay the same or go down.
The quantity 𝐸 (N) thus indicates how well entanglement is preserved when one
share of it is sent through the channel N.
Let us consider three examples of entanglement measures for quantum channels,
defined using entanglement measures for bipartite quantum states.
1. The generalized divergence of entanglement of a channel N 𝐴→𝐵 , defined for
every generalized divergence 𝑫 as
where EB( 𝐴 → 𝐵) denotes the set of entanglement-breaking channels taking system 𝐴 to system
𝐵. Now, using the expression for the generalized channel divergence in (7.11.2), we obtain
where the optimization is with respect to pure states 𝜓 𝑅 𝐴 such that the dimension of 𝑅 is equal
to the dimension of 𝐴.
Now, because entanglement-breaking channels and separable states (with maximally mixed
reduced state) are in one-to-one correspondence (see Section 4.4.6), we find that the generalized
divergence of entanglement of N is bounded from above as follows:
The right-hand side of the above inequality and the quantity 𝑬 ′ (N) differ in the order of the
infimum and supremum. From the discussion in Section 2.3, in particular (2.3.14), we conclude
that 𝑬 (N) ≤ 𝑬 ′ (N) for all quantum channels N. For the rest of this chapter, and throughout the
rest of this book, we thus stick with the definition of a channel entanglement measure given in
Definition 10.1.
𝐸 ( 𝐴1 𝐴2 ; 𝐵1 𝐵2 ) 𝜌⊗𝜏 ≥ 𝐸 ( 𝐴1 ; 𝐵1 ) 𝜌 + 𝐸 ( 𝐴2 ; 𝐵2 )𝜏 (10.1.17)
for all states 𝜌 𝐴1 𝐵1 and 𝜏𝐴2 𝐵2 , then the channel entanglement measure is also
superadditive: for every two channels N 𝐴1 →𝐵1 and M 𝐴2 →𝐵2 ,
Proof:
1. Let N be an entanglement breaking channel. This means that 𝜔 𝑅𝐵 =
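[Diagram: Alice holds systems 𝐴′ and 𝐴 and Bob holds system 𝐵′ of the state 𝜌 𝐴′ 𝐴𝐵′ ; the channel N maps 𝐴 to 𝐵, producing the state 𝜔 𝐴′ 𝐵𝐵′ .]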
Figure 10.1: Starting from a state 𝜌 𝐴′ 𝐴𝐵′ , Alice sends the system 𝐴 through
the channel N 𝐴→𝐵 to Bob, resulting in the state 𝜔 𝐴′ 𝐵𝐵′ = N 𝐴→𝐵 (𝜌 𝐴′ 𝐴𝐵′ ).
The difference between the final and initial entanglement (as quantified by an
entanglement measure for states), optimized over all initial states 𝜌 𝐴′ 𝐴𝐵′ , is
equal to the amortized entanglement of N; see Definition 10.3.
where 𝜔 𝐴′ 𝐵𝐵′ B N 𝐴→𝐵 (𝜌 𝐴′ 𝐴𝐵′ ) and the optimization is with respect to states
𝜌 𝐴′ 𝐴𝐵′ . The systems 𝐴′ and 𝐵′ have arbitrarily large, yet finite dimensions.
Due to the fact that the systems 𝐴′ and 𝐵′ can be arbitrarily large, it is not
necessarily the case that the supremum above can be achieved. Thus, in general, it
might be difficult to compute a channel’s amortized entanglement.
For every entanglement measure 𝐸 that is equal to zero for all separable states,
we always have that the entanglement of the channel never exceeds the amortized
entanglement of the channel:
Lemma 10.4
For a given quantum channel N and entanglement measure 𝐸 that is equal to
zero for all separable states, the channel’s entanglement does not exceed its
amortized entanglement:
𝐸 (N) ≤ 𝐸 A (N). (10.2.2)
Proof: By choosing the input state 𝜌 𝐴′ 𝐴𝐵′ in the optimization for amortized
entanglement to have a trivial (one-dimensional) system 𝐵′ (so that 𝜌 𝐴′ 𝐴𝐵′ is trivially
a separable state between Alice and Bob), we find that 𝐸 ( 𝐴′; 𝐵𝐵′)𝜔 = 𝐸 ( 𝐴′; 𝐵)𝜔
and 𝐸 ( 𝐴𝐴′; 𝐵′) 𝜌 = 0. Since such a state is an arbitrary state to consider for
optimizing the channel’s entanglement, the inequality follows. ■
Whether the inequality reverse to the one in (10.2.2) holds, which would imply
that 𝐸 (N) = 𝐸 A (N), depends on the entanglement measure 𝐸. In Section 10.6
below, we show that this so-called “amortization collapse” occurs for some
entanglement measures.
The amortized entanglement of a channel has several interesting properties,
which we list in some detail in this section. These include convexity, faithfulness,
and (sub)additivity.
Proof:
1. To prove (10.2.3), we use the fact that N can be simulated via teleportation.
Specifically, from (5.1.33) and (5.1.34), we can represent the action of N 𝐴→𝐵
on every state 𝜌 𝐴′ 𝐴𝐵′ in the following two ways:
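\begin{align}
\mathcal{N}_{A\to B}(\rho_{A'AB'}) &= \mathcal{N}_{B_i'\to B}\!\left(\mathcal{T}_{AA_iB_i\to B_i'}(\rho_{A'AB'} \otimes \Phi_{A_iB_i})\right), \tag{10.2.7}\\
\mathcal{N}_{A\to B}(\rho_{A'AB'}) &= \mathcal{T}_{A_o'A_oB_o\to B}\!\left(\mathcal{N}_{A\to A_o'}(\rho_{A'AB'}) \otimes \Phi^+_{A_oB_o}\right), \tag{10.2.8}
\end{align}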
as required.
2. Let N be an entanglement breaking channel. For every state 𝜌 𝐴′ 𝐴𝐵′ , let
𝜔 𝐴′ 𝐵𝐵′ = N 𝐴→𝐵 (𝜌 𝐴′ 𝐴𝐵′ ). Recall from Section 4.4.6, specifically Theorem 4.15,
for every state 𝜌 𝐴′ 𝐴𝐵′ . Therefore, 𝐸 A (N) ≤ 0. On the other hand, because
𝐸 vanishes for all separable states, it holds that 𝐸 (N) ≥ 0. Therefore, by
Lemma 10.4, 𝐸 A (N) ≥ 0, and we conclude that 𝐸 A (N) = 0.
Now, let 𝐸 be a faithful entanglement measure, meaning that it vanishes if
and only if the input state is separable, and suppose that 𝐸 A (N) = 0. By
Lemma 10.4, we have that 0 = 𝐸 A (N) ≥ 𝐸 (N), which in turn implies that
𝐸 (N) = 0 because 𝐸 (N) ≥ 0 for all channels N. Therefore, by Proposition 10.2,
we conclude that N is entanglement breaking.
3. Let 𝜌 𝐴′ 𝐴𝐵′ be an arbitrary state, and let
$$\omega_{A'BB'} = \left(\sum_{x\in\mathcal{X}} p(x)\,\mathcal{N}^x_{A\to B}\right)(\rho_{A'AB'}) = \sum_{x\in\mathcal{X}} p(x)\,\omega^x_{A'BB'}, \tag{10.2.18}$$
where 𝜔𝑥𝐴′ 𝐵𝐵′ = N𝑥𝐴→𝐵 (𝜌 𝐴′ 𝐴𝐵′ ) for all 𝑥 ∈ X. Then, by convexity of the
entanglement measure 𝐸, we obtain
$$E(A';BB')_\omega \leq \sum_{x\in\mathcal{X}} p(x)\,E(A';BB')_{\omega^x}. \tag{10.2.19}$$
Since the state 𝜌 𝐴′ 𝐴𝐵′ is arbitrary, by optimizing over every state 𝜌 𝐴′ 𝐴𝐵′ on the
left-hand side of the inequality above, we obtain
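$$E^{\mathcal{A}}\!\left(\sum_{x\in\mathcal{X}} p(x)\,\mathcal{N}^x\right) \leq \sum_{x\in\mathcal{X}} p(x)\,E^{\mathcal{A}}(\mathcal{N}^x), \tag{10.2.23}$$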
as required.
4. Let 𝐴1 and 𝐵1 denote the respective input and output systems for the quantum
channel N, and let 𝐴2 and 𝐵2 denote the respective input and output quantum
systems for the quantum channel M. Let 𝜌 𝐴′ 𝐴1 𝐴2 𝐵′ be an arbitrary state. Let
where
𝜏𝐴′ 𝐴1 𝐵2 𝐵′ B M 𝐴2 →𝐵2 (𝜌 𝐴′ 𝐴1 𝐴2 𝐵′ ). (10.2.26)
Observe that the state 𝜔 𝐴′ 𝐵1 𝐵2 𝐵′ is both an example of an output state in the
optimization defining 𝐸 A (N ⊗ M) and in the optimization defining 𝐸 A (N)
(with an appropriate identification of the 𝐴′, 𝐵, and 𝐵′ systems for the latter).
Observe also that 𝜏𝐴′ 𝐴1 𝐵2 𝐵′ is an example of an output state in the optimization
defining 𝐸 A (M). Therefore,
Since the state 𝜌 𝐴′ 𝐴1 𝐴2 𝐵′ is arbitrary, we can optimize over all such states on
the left-hand side of the inequality above to obtain
capacities of certain quantum channels when they are assisted by LOCC. Recalling
Section 5.1.4, the basic idea behind this tool is that a quantum channel can be
simulated by the action of a teleportation protocol, with a maximally entangled
resource state shared between the sender 𝐴 and receiver 𝐵. More generally, recalling
Definition 4.25, a channel N 𝐴→𝐵 with input system 𝐴 and output system 𝐵 is
defined to be LOCC-simulable with associated resource state 𝜔 𝑅𝐵′ if the following
equality holds for all input states 𝜌 𝐴 :
Proposition 10.6
Let 𝐸 be a subadditive state entanglement measure (recall (9.1.9)). If a quantum
channel N 𝐴→𝐵 is LOCC-simulable with associated resource state 𝜔 𝑅𝐵′ , i.e.,
Proof: For every state 𝜌 𝐴′ 𝐴𝐵′′ , we use monotonicity of the state entanglement
measure under LOCC, as well as subadditivity of the measure, to obtain
where for the first inequality we made use of LOCC monotonicity and for the
second inequality we made use of the assumption of subadditivity. Since the state
𝜌 𝐴′ 𝐴𝐵′′ was arbitrary, we conclude (10.2.42). ■
Proposition 10.7
Let 𝐸 be an entanglement measure that is subadditive with respect to states and
zero on separable states, and let 𝐸 A denote its amortized version. If a channel
N 𝐴→𝐵 is LOCC-simulable with associated resource state 𝜔 𝑅𝐵 = N 𝐴→𝐵 (𝜌 𝑅 𝐴 )
for some input state 𝜌 𝑅 𝐴 , then the following equality holds
Proof: From Proposition 10.6, we have that 𝐸 A (N) ≤ 𝐸 (𝑅; 𝐵)𝜔 . For the reverse
inequality, we take 𝜌 𝐴′ 𝐴𝐵′′ = 𝜌 𝑅 𝐴 in the optimization that defines 𝐸 A (N), where
we identify 𝐴′ ≡ 𝑅 and 𝐵′′ ≡ ∅ (i.e., 𝐵′′ is a trivial one-dimensional system). Then,
N 𝐴→𝐵 (𝜌 𝐴′ 𝐴𝐵′′ ) = N 𝐴→𝐵 (𝜌 𝑅 𝐴 ), which is the resource state. Furthermore, since
𝐵′′ is a one-dimensional system, the state 𝜌 𝐴′ 𝐴𝐵′′ is trivially separable, so that
𝐸 ( 𝐴′ 𝐴; 𝐵′′) 𝜌 = 0. Therefore, 𝐸 A (N) ≥ 𝐸 ( 𝐴′; 𝐵)𝜔 ≡ 𝐸 (𝑅; 𝐵)𝜔 . ■
where 𝜔 𝑆𝐵 = N 𝐴→𝐵 (𝜓 𝑆 𝐴 ).
where 𝜔 𝑅𝐵 = N 𝐴→𝐵 (𝜓 𝑅 𝐴 ).
3. The sandwiched Rényi relative entropy of entanglement of N,
$$\widetilde{E}_\alpha(\mathcal{N}) := \sup_{\psi_{RA}} \widetilde{E}_\alpha(R;B)_\omega \tag{10.3.9}$$
where $\omega_{A'BB'} = \mathcal{N}_{A\to B}(\rho_{A'AB'})$, satisfies all of the properties stated in Proposition 10.5. In particular, it is subadditive. We show in Section 10.6 below that $E_{\max}(\mathcal{N}) = E^{\mathcal{A}}_{\max}(\mathcal{N})$ for every quantum channel $\mathcal{N}$, and this is what leads to the subadditivity statement in (10.3.15).
For covariant channels, the optimization over pure input states in the generalized
divergence of entanglement can be simplified, as we now show. This simplification
is similar to the simplification that occurs for the generalized channel divergence
for jointly covariant channels (see Proposition 7.84).
$$\rho_A = \frac{1}{|G|}\sum_{g\in G} U^g_A \psi_A U^{g\dagger}_A =: \mathcal{T}_G(\psi_A), \tag{10.3.18}$$
and $\phi^\rho_{RA}$ is a purification of $\rho_A$. Consequently,
which holds for every state 𝜌 acting on the input space of the channel N. We can write (10.3.19)
as
$$\boldsymbol{E}(\mathcal{N}) = \sup_{\rho}\left\{\boldsymbol{E}(\mathcal{N},\rho) : \rho = \mathcal{T}_G(\rho)\right\}. \tag{10.3.21}$$
where, to obtain the last equality, we used the fact that any generalized divergence is isometrically invariant (recall Proposition 7.16). Now, if we apply the dephasing channel $X \mapsto \sum_{g\in G} |g\rangle\langle g| X |g\rangle\langle g|$ to the $R'$ system, then by the data-processing inequality for the generalized divergence $\boldsymbol{D}$, we obtain
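\begin{align}
&\boldsymbol{D}(\mathcal{N}_{A\to B}(\psi^\rho_{R'RA})\|\tau_{R'RB}) \notag\\
&\quad \geq \boldsymbol{D}\!\left(\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes (\mathcal{N}_{A\to B}\circ\mathcal{U}^g_A)(\psi_{RA})\,\middle\|\,\sum_{g\in G} p(g)|g\rangle\langle g|_{R'} \otimes \tau^g_{RB}\right) \tag{10.3.27}\\
&\quad = \boldsymbol{D}\!\left(\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes ((\mathcal{V}^g_B)^\dagger\circ\mathcal{N}_{A\to B}\circ\mathcal{U}^g_A)(\psi_{RA})\,\middle\|\,\sum_{g\in G} p(g)|g\rangle\langle g|_{R'} \otimes V^{g\dagger}_B\tau^g_{RB}V^g_B\right), \tag{10.3.28}
\end{align}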
where to obtain the last line we applied the unitary channel given by the unitary $\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes V^{g\dagger}_B$ and we used the fact that generalized divergences are invariant under unitaries. Furthermore, we wrote the action of the dephasing channel on $\tau_{R'RB}$ as $\sum_{g\in G} p(g)|g\rangle\langle g|_{R'} \otimes \tau^g_{RB}$, where $p : G \to [0,1]$ is a probability distribution and $\{\tau^g_{RB}\}_{g\in G}$ is a set of states. This operator is in the set $\mathrm{SEP}(R'R:B)$ because the set SEP is closed under local channels. Next, due to the covariance of $\mathcal{N}$, we have that $(\mathcal{V}^g_B)^\dagger \circ \mathcal{N} \circ \mathcal{U}^g_A = \mathcal{N}$, so that
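\begin{align}
&\boldsymbol{D}(\mathcal{N}_{A\to B}(\psi^\rho_{R'RA})\|\tau_{R'RB}) \tag{10.3.29}\\
&\quad \geq \boldsymbol{D}\!\left(\frac{1}{|G|}\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes \mathcal{N}_{A\to B}(\psi_{RA})\,\middle\|\,\sum_{g\in G} p(g)|g\rangle\langle g|_{R'} \otimes V^{g\dagger}_B\tau^g_{RB}V^g_B\right) \tag{10.3.30}\\
&\quad \geq \boldsymbol{D}\!\left(\mathcal{N}_{A\to B}(\psi_{RA})\,\middle\|\,\sum_{g\in G} p(g)\,V^{g\dagger}_B\tau^g_{RB}V^g_B\right), \tag{10.3.31}
\end{align}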
where to obtain the last inequality we used the data-processing inequality for $\boldsymbol{D}$ under the channel $\operatorname{Tr}_{R'}$. Now, observe that the state $\sum_{g\in G} p(g)|g\rangle\langle g|_{R'} \otimes V^{g\dagger}_B\tau^g_{RB}V^g_B$ is in the set $\mathrm{SEP}(R'R:B)$. This is due to the fact that $\sum_{g\in G} |g\rangle\langle g|_{R'} \otimes V^{g\dagger}_B$ is a controlled unitary, and since register $R'$ is classical, this controlled unitary can be implemented as an LOCC channel. Also, the set SEP is closed under LOCC channels. It follows then that $\sum_{g\in G} p(g)\,V^{g\dagger}_B\tau^g_{RB}V^g_B \in \mathrm{SEP}(R:B)$ because we obtain it from the previous separable state by applying a local partial trace over $R'$. By taking the infimum over every state $\tau_{RB} \in \mathrm{SEP}(R:B)$ in (10.3.31), we have that
$$\boldsymbol{D}(\mathcal{N}_{A\to B}(\phi^\rho_{RA})\|\sigma_{RB}) = \boldsymbol{D}(\mathcal{N}_{A\to B}(\psi^\rho_{R'RA})\|\tau_{R'RB}) \tag{10.3.32}$$
where 𝜔 𝑅𝐵 = N 𝐴→𝐵 (𝜓 𝑅 𝐴 ). This inequality holds for every state 𝜓 𝑅 𝐴 and every
state 𝜎𝑅𝐵 ∈ SEP(𝑅 : 𝐵). Therefore, optimizing over all 𝜎𝑅𝐵 ∈ SEP(𝑅 : 𝐵) leads to
𝜌
inf 𝑫 (N 𝐴→𝐵 (𝜙 𝑅 𝐴 )∥𝜎𝑅𝐵 ) = 𝑬 (𝑅; 𝐵)𝜔 ≥ 𝑬 (𝑅; 𝐵)𝜔 , (10.3.35)
𝜎𝑅𝐵 ∈SEP(𝑅:𝐵)
𝜌
where 𝜔 𝑅𝐵 = N 𝐴→𝐵 (𝜙 𝑅 𝐴 ). This is precisely the inequality in (10.3.17).
Next, by construction, the state $\phi^\rho_{RA}$ is such that its reduced state on $A$ is invariant under the channel $\mathcal{T}_G$. Optimizing over all such states leads to
Since this inequality holds for every pure state 𝜓 𝑅 𝐴 , we finally obtain
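where $\omega_{SB} = \mathcal{N}_{A\to B}(\psi_{SA})$. Now, recall from (2.2.38) that an arbitrary pure bipartite state $\psi_{SA}$ can be written as $Z_S\Gamma_{SA}Z_S^\dagger$, where $\Gamma_{SA} = |\Gamma\rangle\langle\Gamma|$, $|\Gamma\rangle_{SA} = \sum_{i=0}^{d_A-1} |i,i\rangle_{SA}$, and $Z_S$ is an operator satisfying $\operatorname{Tr}[Z_S^\dagger Z_S] = 1$. Then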
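$$E_{\max}(\mathcal{N}) = \log_2 \sup_{Z_S} \inf_{X_{SB}\in\widehat{\mathrm{SEP}}} \left\{\operatorname{Tr}[X_{SB}] : Z_S\Gamma^{\mathcal{N}}_{SB}Z_S^\dagger \leq X_{SB},\ Z_S^\dagger Z_S > 0,\ \operatorname{Tr}[Z_S^\dagger Z_S] = 1\right\}. \tag{10.3.45}$$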
Let us now make a change of variable, defining the variable 𝑌𝑆𝐵 according to the
relation 𝑋𝑆𝐵 = 𝑍 𝑆𝑌𝑆𝐵 𝑍 𝑆† . Then, since
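\begin{align}
Z_S\Gamma^{\mathcal{N}}_{SB}Z_S^\dagger \leq X_{SB} = Z_S Y_{SB} Z_S^\dagger \quad&\Longleftrightarrow\quad \Gamma^{\mathcal{N}}_{SB} \leq Y_{SB}, \tag{10.3.46}\\
X_{SB}\in\widehat{\operatorname{SEP}} \quad&\Longleftrightarrow\quad Y_{SB}\in\widehat{\operatorname{SEP}}, \tag{10.3.47}
\end{align}
we find that
\begin{align}
&\text{Eq. (10.3.45)} \notag\\
&= \sup_{Z_S}\inf_{Y_{SB}\in\widehat{\operatorname{SEP}}}\left\{\operatorname{Tr}[Z_S Y_{SB} Z_S^\dagger] : \Gamma^{\mathcal{N}}_{SB}\leq Y_{SB},\ Z_S^\dagger Z_S>0,\ \operatorname{Tr}[Z_S^\dagger Z_S]=1\right\} \notag\\
&= \sup_{Z_S}\inf_{Y_{SB}\in\widehat{\operatorname{SEP}}}\left\{\operatorname{Tr}[Z_S^\dagger Z_S Y_{SB}] : \Gamma^{\mathcal{N}}_{SB}\leq Y_{SB},\ Z_S^\dagger Z_S>0,\ \operatorname{Tr}[Z_S^\dagger Z_S]=1\right\} \notag\\
&= \sup_{\rho_S}\inf_{Y_{SB}\in\widehat{\operatorname{SEP}}}\left\{\operatorname{Tr}[\rho_S Y_{SB}] : \Gamma^{\mathcal{N}}_{SB}\leq Y_{SB}\right\}, \tag{10.3.48}
\end{align}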
where in the last line we made the substitution 𝜌 𝑆 = 𝑍 𝑆† 𝑍 𝑆 , so that the optimization
is with respect to density operators. Furthermore, we have employed the fact that
the set of density operators satisfying 𝜌 𝑆 > 0 is dense in the set of all density
operators. Now observing that the objective function is linear in 𝜌 𝑆 and 𝑌𝑆𝐵 , the
set of density operators is compact and convex, and the set of separable operators
is convex, the Sion minimax theorem (Theorem 2.24) applies, such that we can
exchange the optimizations to find that
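\begin{align}
&\sup_{\rho_S}\inf_{Y_{SB}\in\widehat{\operatorname{SEP}}}\left\{\operatorname{Tr}[\rho_S Y_{SB}] : \Gamma^{\mathcal{N}}_{SB}\leq Y_{SB}\right\} \notag\\
&\quad= \inf_{Y_{SB}\in\widehat{\operatorname{SEP}}}\sup_{\rho_S}\left\{\operatorname{Tr}[\rho_S Y_{SB}] : \Gamma^{\mathcal{N}}_{SB}\leq Y_{SB}\right\} \tag{10.3.49}\\
&\quad= \inf_{Y_{SB}\in\widehat{\operatorname{SEP}}}\sup_{\rho_S}\left\{\operatorname{Tr}[\rho_S \operatorname{Tr}_B[Y_{SB}]] : \Gamma^{\mathcal{N}}_{SB}\leq Y_{SB}\right\} \tag{10.3.50}\\
&\quad= \inf_{Y_{SB}\in\widehat{\operatorname{SEP}}}\left\{\left\Vert\operatorname{Tr}_B[Y_{SB}]\right\Vert_\infty : \Gamma^{\mathcal{N}}_{SB}\leq Y_{SB}\right\} \tag{10.3.51}
\end{align}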
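The last step of this chain, passing from a supremum over density operators to the spectral norm, can be checked numerically. The following is a minimal sketch, assuming a randomly generated positive semi-definite operator stands in for $\operatorname{Tr}_B[Y_{SB}]$; the helper names `random_psd` and `random_density` are illustrative and not taken from the text.

```python
import numpy as np

def random_psd(d, rng):
    """Random positive semi-definite d x d matrix (illustrative helper)."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return a @ a.conj().T

def random_density(d, rng):
    """Random density operator: positive semi-definite with unit trace."""
    m = random_psd(d, rng)
    return m / np.trace(m).real

rng = np.random.default_rng(0)
d = 4
Y = random_psd(d, rng)  # stands in for Tr_B[Y_SB]

# For positive semi-definite Y, the spectral norm is the largest eigenvalue.
evals, evecs = np.linalg.eigh(Y)
spectral_norm = evals[-1]

# The supremum is attained by the projector onto a top eigenvector.
v = evecs[:, -1]
rho_opt = np.outer(v, v.conj())
attained = np.trace(rho_opt @ Y).real

# No randomly sampled density operator should exceed the spectral norm.
sampled = max(np.trace(random_density(d, rng) @ Y).real for _ in range(2000))
```

Here `attained` coincides with `spectral_norm` up to rounding, while `sampled` stays below it, which is the content of the equality between (10.3.50) and (10.3.51).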
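\begin{equation}
= \sup_{\psi_{SA}}\inf_{\sigma_{SB}\in\operatorname{PPT}'(S:B)}\boldsymbol{D}(\mathcal{N}_{A\to B}(\psi_{SA})\Vert\sigma_{SB}), \tag{10.4.2}
\end{equation}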
for all quantum channels N and M. We defer a proof of this to Section 10.6 below.
The amortized generalized Rains divergence 𝑹 A (N), defined according to
Definition 10.3 as
where $\omega_{A'BB'} = \mathcal{N}_{A\to B}(\rho_{A'AB'})$, satisfies all of the properties stated in
Proposition 10.5 except for faithfulness, because the generalized Rains divergence of
a bipartite quantum state is not faithful. In particular, due to additivity of the
max-Rains relative entropy (Proposition 9.29), we immediately obtain additivity of
the amortized max-Rains information of a quantum channel, i.e.,
\begin{equation}
R^{\mathcal{A}}_{\max}(\mathcal{N}\otimes\mathcal{M}) = R^{\mathcal{A}}_{\max}(\mathcal{N}) + R^{\mathcal{A}}_{\max}(\mathcal{M}) \tag{10.4.17}
\end{equation}
for all quantum channels $\mathcal{N}$ and $\mathcal{M}$. We show in Section 10.6 below that
\begin{equation}
R^{\mathcal{A}}_{\max}(\mathcal{N}) = R_{\max}(\mathcal{N}) \tag{10.4.18}
\end{equation}
for every quantum channel $\mathcal{N}$, and it is this fact that leads to the additivity statement
in (10.4.15).
For covariant channels, the optimization over pure input states in the generalized
Rains divergence simplifies in the same way as it does for the generalized divergence
of entanglement.
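\begin{equation}
\overline{\rho}_A = \frac{1}{|G|}\sum_{g\in G}U_A^{g}\rho_A U_A^{g\dagger} =: \mathcal{T}_G(\rho_A), \tag{10.4.20}
\end{equation}
$\rho_A = \psi_A = \operatorname{Tr}_S[\psi_{SA}]$, and $\phi^{\overline{\rho}}_{SA}$ a purification of $\overline{\rho}_A$. Consequently,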
which holds for every state 𝜌 acting on the input space of the channel N. We can write (10.4.21)
as
𝑹(N) = sup{𝑹(N, 𝜌) : 𝜌 = T𝐺 (𝜌)}. (10.4.23)
𝑅
Proof: The proof is identical to the proof of Proposition 10.9, with the exception
that the set PPT’ is involved rather than the set SEP. The LOCC channels discussed
there preserve the set PPT’, and this is the main reason why the same proof
applies. ■
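where
\begin{align}
\Gamma_{\max}(\mathcal{N}) &= \inf_{Y_{SB},V_{SB}\geq 0}\left\{\left\Vert\operatorname{Tr}_B[V_{SB}+Y_{SB}]\right\Vert_\infty : \mathrm{T}_B(V_{SB}-Y_{SB})\geq\Gamma^{\mathcal{N}}_{SB}\right\} \tag{10.4.25}\\
&= \sup_{\rho_S\geq 0}\left\{\operatorname{Tr}[\Gamma^{\mathcal{N}}_{SB}X_{SB}] : \operatorname{Tr}[\rho_S]\leq 1,\ -\rho_S\otimes 1_B\leq\mathrm{T}_B(X_{SB})\leq\rho_S\otimes 1_B\right\}. \tag{10.4.26}
\end{align}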
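where the last equality follows from (9.3.47)–(9.3.49). Recall from (2.2.38) that an
arbitrary pure bipartite state $\psi_{SA}$ can be written as $Z_S\Gamma_{SA}Z_S^\dagger$, where
$\Gamma_{SA} = |\Gamma\rangle\langle\Gamma|_{SA}$, $|\Gamma\rangle_{SA} = \sum_{i=0}^{d_A-1}|i,i\rangle_{SA}$,
and $Z_S$ is an operator satisfying $\operatorname{Tr}[Z_S^\dagger Z_S] = 1$. Then
\begin{equation}
2^{R_{\max}(\mathcal{N})} = \sup\left\{\operatorname{Tr}[\Gamma^{\mathcal{N}}_{SB}Z_S^\dagger X_{SB}Z_S] : \left\Vert\mathrm{T}_B(X_{SB})\right\Vert_\infty\leq 1,\ X_{SB}\geq 0,\ Z_S^\dagger Z_S>0,\ \operatorname{Tr}[Z_S^\dagger Z_S]=1\right\}. \tag{10.4.32}
\end{equation}
\begin{align}
&\left\Vert\mathrm{T}_B(X_{SB})\right\Vert_\infty\leq 1 \notag\\
&\Longleftrightarrow\quad -1_{SB}\leq\mathrm{T}_B(X_{SB})\leq 1_{SB} \tag{10.4.33}\\
&\Longleftrightarrow\quad -Z_S^\dagger Z_S\otimes 1_B\leq Z_S^\dagger\,\mathrm{T}_B(X_{SB})\,Z_S\leq Z_S^\dagger Z_S\otimes 1_B \tag{10.4.34}\\
&\Longleftrightarrow\quad -Z_S^\dagger Z_S\otimes 1_B\leq\mathrm{T}_B(Z_S^\dagger X_{SB}Z_S)\leq Z_S^\dagger Z_S\otimes 1_B. \tag{10.4.35}
\end{align}
\begin{equation}
2^{R_{\max}(\mathcal{N})} = \sup\left\{\operatorname{Tr}[\Gamma^{\mathcal{N}}_{SB}X'_{SB}] : -\rho_S\otimes 1_B\leq\mathrm{T}_B(X'_{SB})\leq\rho_S\otimes 1_B,\ X'_{SB}\geq 0,\ \rho_S>0,\ \operatorname{Tr}[\rho_S]=1\right\}, \tag{10.4.36}
\end{equation}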
which is the equality in (10.4.24) and (10.4.26), after observing that the set
{𝜌 𝑆 : 𝜌 𝑆 > 0, Tr[𝜌 𝑆 ] = 1} is dense in the set {𝜌 𝑆 : 𝜌 𝑆 ≥ 0, Tr[𝜌 𝑆 ] = 1}.
To arrive at the equality in (10.4.25), we employ the dual formulation of the
max-Rains relative entropy in (9.3.49). Consider that
\begin{equation}
2^{R_{\max}(\mathcal{N})} = \sup_{\psi_{SA}} 2^{R_{\max}(S;B)_\omega} \tag{10.4.37}
\end{equation}
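\begin{equation}
\mathrm{T}_B(K_{SB}-L_{SB})\geq\mathcal{N}_{A\to B}(\psi_{SA}) \quad\Longleftrightarrow\quad \mathrm{T}_B(K'_{SB}-L'_{SB})\geq\Gamma^{\mathcal{N}}_{SB}, \tag{10.4.39}
\end{equation}
where $K'_{SB}$ and $L'_{SB}$ are such that $K_{SB} = Z_S K'_{SB} Z_S^\dagger$ and $L_{SB} = Z_S L'_{SB} Z_S^\dagger$, respectively.
Then $K_{SB}, L_{SB}\geq 0 \Longleftrightarrow K'_{SB}, L'_{SB}\geq 0$, and we find that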
Employing cyclicity of trace, setting 𝜌 𝑆 = 𝑍 𝑆† 𝑍 𝑆 , and exploiting the fact that the set
{𝜌 𝑆 : 𝜌 𝑆 > 0, Tr[𝜌 𝑆 ] = 1} is dense in the set {𝜌 𝑆 : 𝜌 𝑆 ≥ 0, Tr[𝜌 𝑆 ] = 1}, we find
that
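\begin{align}
&\sup_{\rho_S}\left\{\operatorname{Tr}[\rho_S(K'_{SB}+L'_{SB})] : \rho_S\geq 0,\ \operatorname{Tr}[\rho_S]=1\right\} \notag\\
&\quad= \sup_{\rho_S}\left\{\operatorname{Tr}[\rho_S\operatorname{Tr}_B[K'_{SB}+L'_{SB}]] : \rho_S\geq 0,\ \operatorname{Tr}[\rho_S]=1\right\} \tag{10.4.43}\\
&\quad= \left\Vert\operatorname{Tr}_B[K'_{SB}+L'_{SB}]\right\Vert_\infty, \tag{10.4.44}
\end{align}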
where for the last line we used (2.2.123). Substituting back in, we find that
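\begin{equation}
2^{R_{\max}(\mathcal{N})} = \inf_{K'_{SB},L'_{SB}}\left\{\left\Vert\operatorname{Tr}_B[K'_{SB}+L'_{SB}]\right\Vert_\infty : K'_{SB},L'_{SB}\geq 0,\ \mathrm{T}_B(K'_{SB}-L'_{SB})\geq\Gamma^{\mathcal{N}}_{SB}\right\}, \tag{10.4.45}
\end{equation}
as claimed in (10.4.25).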
According to Theorem 2.28, strong duality holds by picking $V_{SB}$ and $Y_{SB}$ equal
to the positive and negative parts of $\mathrm{T}_B(\Gamma^{\mathcal{N}}_{SB})$, respectively, which are feasible for
the infimum in (10.4.25). Furthermore, we can pick $\rho_S = 1_S/(2d_S)$ and $X_{SB} = 1_{SB}/(3d_S)$, which
are strictly feasible for the supremum in (10.4.26). ■
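The two feasibility claims in this argument can be verified numerically for a concrete channel. The sketch below is an illustration under stated assumptions: the channel is supplied through Kraus operators, the input and output dimensions are equal, and all function names (`choi_operator`, `partial_transpose_b`, `primal_feasible_value`, `strictly_feasible_dual`) are hypothetical helpers rather than anything defined in the text. It checks that the positive and negative parts of $\mathrm{T}_B(\Gamma^{\mathcal{N}}_{SB})$ are feasible for the infimum program, and that the scaled identities are strictly feasible for the supremum program.

```python
import numpy as np

def choi_operator(kraus, d_in):
    """Choi operator Gamma^N_{SB} = sum_{i,j} |i><j|_S (x) N(|i><j|)."""
    d_out = kraus[0].shape[0]
    gamma = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            e_ij = np.zeros((d_in, d_in), dtype=complex)
            e_ij[i, j] = 1.0
            out = sum(k @ e_ij @ k.conj().T for k in kraus)
            gamma += np.kron(e_ij, out)
    return gamma

def partial_transpose_b(x, d_s, d_b):
    """Transpose only the B factor of an operator on S (x) B."""
    return x.reshape(d_s, d_b, d_s, d_b).transpose(0, 3, 2, 1).reshape(d_s * d_b, -1)

def partial_trace_b(x, d_s, d_b):
    """Trace out the B factor of an operator on S (x) B."""
    return np.einsum("ibjb->ij", x.reshape(d_s, d_b, d_s, d_b))

def positive_negative_parts(h):
    """Split a Hermitian h as pos - neg with pos, neg >= 0."""
    w, u = np.linalg.eigh(h)
    pos = (u * np.clip(w, 0, None)) @ u.conj().T
    neg = (u * np.clip(-w, 0, None)) @ u.conj().T
    return pos, neg

def min_eig(h):
    return np.linalg.eigvalsh((h + h.conj().T) / 2).min()

def primal_feasible_value(kraus, d):
    """Objective value of the infimum program at V, Y = parts of T_B(Gamma)."""
    gamma = choi_operator(kraus, d)
    v, y = positive_negative_parts(partial_transpose_b(gamma, d, d))
    gap = partial_transpose_b(v - y, d, d) - gamma  # should be >= 0
    assert min_eig(v) > -1e-9 and min_eig(y) > -1e-9 and min_eig(gap) > -1e-9
    return np.linalg.norm(partial_trace_b(v + y, d, d), 2)

def strictly_feasible_dual(d):
    """Check rho = 1/(2d), X = 1/(3d) satisfy all constraints strictly."""
    rho = np.eye(d) / (2 * d)
    x = np.eye(d * d) / (3 * d)
    t = partial_transpose_b(x, d, d)
    bound = np.kron(rho, np.eye(d))
    return bool(np.trace(rho).real < 1 and min_eig(x) > 0
                and min_eig(bound - t) > 0 and min_eig(bound + t) > 0)

# Example: the noiseless qubit channel, with the identity as its only Kraus operator.
value_identity = primal_feasible_value([np.eye(2, dtype=complex)], 2)
dual_ok = strictly_feasible_dual(2)
```

For the noiseless qubit channel the feasible point gives the value 2, so this choice already certifies $\Gamma_{\max} \leq 2$ for that channel; for other channels the returned number is only an upper bound obtained from one feasible point, not the optimum.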
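\begin{equation}
E_{\mathrm{sq}}(\mathcal{N}) = \frac{1}{2}\sup_{\rho_A}\inf_{\mathcal{V}_{E'\to EF}}\left\{H(B|E)_\theta + H(B|F)_\theta : \theta_{BEF} = (\mathcal{V}_{E'\to EF}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE'})(\rho_A)\right\}. \tag{10.5.9}
\end{equation}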
with $\omega_{A'BB'} = \mathcal{N}_{A\to B}(\rho_{A'AB'})$, satisfy all of the properties stated in Proposition 10.2
and Proposition 10.5, respectively. In particular, because the squashed entanglement
for states is additive (see (9.4.8)), we immediately have that the amortized squashed
entanglement of a channel is additive, i.e.,
\begin{equation}
E^{\mathcal{A}}_{\mathrm{sq}}(\mathcal{N}\otimes\mathcal{M}) = E^{\mathcal{A}}_{\mathrm{sq}}(\mathcal{N}) + E^{\mathcal{A}}_{\mathrm{sq}}(\mathcal{M}) \tag{10.5.11}
\end{equation}
for all quantum channels $\mathcal{N}$ and $\mathcal{M}$. In Section 10.6 below, we prove that
$E_{\mathrm{sq}}(\mathcal{N}) = E^{\mathcal{A}}_{\mathrm{sq}}(\mathcal{N})$ for every quantum channel $\mathcal{N}$, which then implies the additivity of the
squashed entanglement of a channel, i.e.,
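Proposition 10.15
Let $\mathcal{N}_{A\to B}$ be a quantum channel. The function $\rho\mapsto E_{\mathrm{sq}}(\mathcal{N},\rho)$, where
$E_{\mathrm{sq}}(\mathcal{N},\rho)$ is defined in (10.5.5), is concave: for $\mathcal{X}$ a finite alphabet,
$p:\mathcal{X}\to[0,1]$ a probability distribution on $\mathcal{X}$, and $\{\rho^x_A\}_{x\in\mathcal{X}}$ a set of states, the following
inequality holds
\begin{equation}
E_{\mathrm{sq}}\!\left(\mathcal{N},\sum_{x\in\mathcal{X}}p(x)\rho^x_A\right) \geq \sum_{x\in\mathcal{X}}p(x)E_{\mathrm{sq}}(\mathcal{N},\rho^x_A). \tag{10.5.13}
\end{equation}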
Proof: In order to prove this, we make use of the expression for 𝐸 sq (N, 𝜌 𝐴 ) in
(10.5.8).
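For every state $\rho^x_A$, with $x\in\mathcal{X}$, define the state
\begin{equation}
\theta^x_{BEF} := (\mathcal{V}_{E'\to EF}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE'})(\rho^x_A), \tag{10.5.14}
\end{equation}
Now, let
\begin{align}
\rho_A &:= \sum_{x\in\mathcal{X}}p(x)\rho^x_A, \tag{10.5.16}\\
\theta_{BEF} &:= \sum_{x\in\mathcal{X}}p(x)\theta^x_{BEF} \tag{10.5.17}\\
&= \sum_{x\in\mathcal{X}}p(x)(\mathcal{V}_{E'\to EF}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE'})(\rho^x_A) \tag{10.5.18}\\
&= (\mathcal{V}_{E'\to EF}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE'})(\rho_A). \tag{10.5.19}
\end{align}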
In Lemma 10.4, we proved the following relation between the entanglement 𝐸 (N)
of a channel N and its amortized entanglement 𝐸 A (N):
In general, therefore, amortization can yield a larger value for the entanglement of
a channel than the usual channel entanglement measure.
For which entanglement measures does the reverse inequality hold? In this
section, we investigate this question, and we prove that three of the channel
entanglement measures that we have considered in this chapter — max-relative
entropy of entanglement, max-Rains information, and squashed entanglement
— satisfy the reverse inequality. Thus, for these three entanglement measures,
amortization does not yield a higher entanglement value than the usual channel
entanglement measure. This so-called “amortization collapse” is important because
it immediately implies additivity of the usual channel entanglement measure.
We start by proving that the amortization collapse occurs for max-relative entropy
of entanglement. The key tools in the proof are Propositions 9.21 and 10.10,
which provide cone programs for both the max-relative entropy of entanglement for
bipartite states and the max-relative entropy of entanglement for quantum channels.
Let us recall these now:
where $\rho_{AB}$ is a bipartite state and $\mathcal{N}$ is a quantum channel with Choi operator $\Gamma^{\mathcal{N}}_{SB}$.
Proof: Using the cone program formulations in (10.6.2)–(10.6.5), we find that the
inequality in (10.6.6) is equivalent to
and
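\begin{equation}
G_{\max}(A';BB')_\omega = \inf\operatorname{Tr}[D_{A'BB'}], \tag{10.6.13}
\end{equation}
subject to the constraints
\begin{align}
D_{A'BB'} &\in \widehat{\operatorname{SEP}}(A':BB'), \tag{10.6.14}\\
D_{A'BB'} &\geq \mathcal{N}_{A\to B}(\rho_{A'AB'}). \tag{10.6.15}
\end{align}
\begin{align}
Y_{SB} &\in \widehat{\operatorname{SEP}}(S:B), \tag{10.6.17}\\
Y_{SB} &\geq \Gamma^{\mathcal{N}}_{SB}. \tag{10.6.18}
\end{align}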
With these optimizations in place, we can now establish the inequality in (10.6.9)
by making a judicious choice for $D_{A'BB'}$. Let $C_{A'AB'}$ be an arbitrary operator
to consider in the optimization for $G_{\max}(A'A;B')_\rho$ (i.e., satisfying (10.6.11)–(10.6.12)),
and let $Y_{SB}$ be an arbitrary operator to consider in the optimization for
$\Sigma_{\max}(\mathcal{N})$ (i.e., satisfying (10.6.17)–(10.6.18)). Let $|\Gamma\rangle_{SA} = \sum_{i=0}^{d_A-1}|i,i\rangle_{SA}$. Pick
We need to prove that 𝐷 𝐴′ 𝐵𝐵′ is feasible for 𝐺 max ( 𝐴′; 𝐵𝐵′)𝜔 . To this end, consider
that
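\begin{align}
\langle\Gamma|_{SA}\,C_{A'AB'}\otimes Y_{SB}\,|\Gamma\rangle_{SA} &\geq \langle\Gamma|_{SA}\,\rho_{A'AB'}\otimes\Gamma^{\mathcal{N}}_{SB}\,|\Gamma\rangle_{SA} \tag{10.6.20}\\
&= \mathcal{N}_{A\to B}(\rho_{A'AB'}), \tag{10.6.21}
\end{align}
which follows from (10.6.12), (10.6.18), and (4.2.6). Now, since
$C_{A'AB'}\in\widehat{\operatorname{SEP}}(A'A:B')$, it can be written as $\sum_{x\in\mathcal{X}}P^x_{A'A}\otimes Q^x_{B'}$ for some finite alphabet
$\mathcal{X}$ and for sets $\{P^x_{A'A}\}_{x\in\mathcal{X}}$, $\{Q^x_{B'}\}_{x\in\mathcal{X}}$ of positive semi-definite operators. Similarly,
since $Y_{SB}\in\widehat{\operatorname{SEP}}(S:B)$, it can be written as $\sum_{y\in\mathcal{Y}}L^y_S\otimes M^y_B$, for some finite alphabet
$\mathcal{Y}$ and sets $\{L^y_S\}_{y\in\mathcal{Y}}$, $\{M^y_B\}_{y\in\mathcal{Y}}$ of positive semi-definite operators. Then, using
(2.2.40) and (2.2.41), we have that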
For the second equality, we used the transpose trick from (2.2.40), and for the third,
we used (2.2.41). The last statement follows because
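\begin{equation}
\operatorname{Tr}_A[P^x_{A'A}\mathrm{T}_A(L^y_A)] = \operatorname{Tr}_A\!\left[\sqrt{\mathrm{T}_A(L^y_A)}\,P^x_{A'A}\sqrt{\mathrm{T}_A(L^y_A)}\right] \tag{10.6.26}
\end{equation}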
For the second equality, we used the transpose trick from (2.2.40). Since 𝐶 𝐴′ 𝐴𝐵′
and 𝑌𝑆𝐵 are positive semi-definite (this follows from (10.6.12) and (10.6.18),
respectively), using (2.2.98) we have that
where for the last equality we used the fact that the spectrum of an operator is
invariant under the action of a full transpose (note, in this case, that T 𝐴 is a full
transpose because the operator Tr 𝐵 [𝑌 𝐴𝐵 ] acts only on 𝐴). Therefore,
Since this inequality holds for all 𝐶 𝐴′ 𝐴𝐵′ satisfying (10.6.11)–(10.6.12) and for all
𝑌𝑆𝐵 satisfying (10.6.17)–(10.6.18), we conclude (10.6.9) after taking an infimum
with respect to all such operators.
Having shown that
With the equality $E^{\mathcal{A}}_{\max}(\mathcal{N}) = E_{\max}(\mathcal{N})$ in hand, the subadditivity of max-relative
entropy of entanglement of quantum channels immediately follows.
Let us now prove that the amortization collapse also occurs for the max-Rains
information of a quantum channel. The key tools needed for the proof are
Propositions 9.27 and 10.13, which provide semi-definite programs for both the
max-Rains relative entropy for bipartite states and the max-Rains information for
quantum channels. Let us recall these now:
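\begin{align}
2^{R_{\max}(A;B)_\rho} &= W_{\max}(A;B)_\rho \tag{10.6.39}\\
&= \inf_{K_{AB},L_{AB}\geq 0}\left\{\operatorname{Tr}[K_{AB}+L_{AB}] : \mathrm{T}_B[K_{AB}-L_{AB}]\geq\rho_{AB}\right\}, \tag{10.6.40}\\
2^{R_{\max}(\mathcal{N})} &= \Gamma_{\max}(\mathcal{N}) \tag{10.6.41}\\
&= \inf_{Y_{SB},V_{SB}\geq 0}\left\{\left\Vert\operatorname{Tr}_B[V_{SB}+Y_{SB}]\right\Vert_\infty : \mathrm{T}_B[V_{SB}-Y_{SB}]\geq\Gamma^{\mathcal{N}}_{SB}\right\}, \tag{10.6.42}
\end{align}
where $\rho_{AB}$ is a bipartite state and $\mathcal{N}$ is a quantum channel with Choi operator $\Gamma^{\mathcal{N}}_{SB}$.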
Proof: The proof given below is conceptually similar to the proof of Theorem 10.16,
but it has some key differences.
Using the semi-definite program formulations in (10.6.39)–(10.6.42), we find
that the inequality in (10.6.43) is equivalent to
With these SDP formulations in place, we can now establish the inequality in
(10.6.46) by making judicious choices for 𝐺 𝐴′ 𝐵𝐵′ and 𝐹𝐴′ 𝐵𝐵′ . Let 𝐶 𝐴′ 𝐴𝐵′ and 𝐷 𝐴′ 𝐴𝐵′
be arbitrary operators in the optimization for 𝑊max ( 𝐴′ 𝐴; 𝐵′) 𝜌 , and let 𝑌𝑆𝐵 and 𝑉𝑆𝐵
be arbitrary operators in the optimization for $\Gamma_{\max}(\mathcal{N})$. Let $|\Gamma\rangle_{SA} = \sum_{i=0}^{d_A-1}|i,i\rangle_{SA}$.
Pick
\begin{align}
G_{A'BB'} &= \langle\Gamma|_{SA}\left(C_{A'AB'}\otimes V_{SB} + D_{A'AB'}\otimes Y_{SB}\right)|\Gamma\rangle_{SA}, \tag{10.6.56}\\
F_{A'BB'} &= \langle\Gamma|_{SA}\left(C_{A'AB'}\otimes Y_{SB} + D_{A'AB'}\otimes V_{SB}\right)|\Gamma\rangle_{SA}. \tag{10.6.57}
\end{align}
Note that 𝐺 𝐴′ 𝐵𝐵′ , 𝐹𝐴′ 𝐵𝐵′ ≥ 0 because 𝐶 𝐴′ 𝐴𝐵′ , 𝐷 𝐴′ 𝐴𝐵′ , 𝑌𝑆𝐵 , 𝑉𝑆𝐵 ≥ 0. Using
(10.6.49) and (10.6.55), consider that
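\begin{align}
\mathrm{T}_{BB'}[G_{A'BB'}-F_{A'BB'}]
&= \mathrm{T}_{BB'}\!\left[\langle\Gamma|_{SA}(C_{A'AB'}-D_{A'AB'})\otimes(V_{SB}-Y_{SB})|\Gamma\rangle_{SA}\right] \tag{10.6.58}\\
&= \langle\Gamma|_{SA}\,\mathrm{T}_{B'}[C_{A'AB'}-D_{A'AB'}]\otimes\mathrm{T}_B[V_{SB}-Y_{SB}]\,|\Gamma\rangle_{SA} \tag{10.6.59}\\
&\geq \langle\Gamma|_{SA}\,\rho_{A'AB'}\otimes\Gamma^{\mathcal{N}}_{SB}\,|\Gamma\rangle_{SA} \tag{10.6.60}\\
&= \mathcal{N}_{A\to B}(\rho_{A'AB'}), \tag{10.6.61}
\end{align}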
where the last equality follows from (4.2.6). Our choices of 𝐺 𝐴′ 𝐵𝐵′ and 𝐹𝐴′ 𝐵𝐵′
are thus feasible points for 𝑊max ( 𝐴′; 𝐵𝐵′)𝜔 . Using this, along with (2.2.40) and
(2.2.41), we obtain
𝑊max ( 𝐴′; 𝐵𝐵′)𝜔
≤ Tr[𝐺 𝐴′ 𝐵𝐵′ + 𝐹𝐴′ 𝐵𝐵′ ] (10.6.62)
= Tr[⟨Γ| 𝑆 𝐴 (𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ ) ⊗ (𝑉𝑆𝐵 + 𝑌𝑆𝐵 )|Γ⟩𝑆 𝐴 ] (10.6.63)
= Tr[(𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ )T 𝐴 (𝑉𝐴𝐵 + 𝑌 𝐴𝐵 )] (10.6.64)
= Tr[(𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ )T 𝐴 (Tr 𝐵 [𝑉𝐴𝐵 + 𝑌 𝐴𝐵 ])]. (10.6.65)
The second equality follows from the transpose trick from (2.2.40). Now, since
𝐶 𝐴′ 𝐴𝐵′ , 𝐷 𝐴′ 𝐴𝐵′ ≥ 0 (recall (10.6.48)), and 𝑉𝐴𝐵 , 𝑌 𝐴𝐵 ≥ 0 (recall (10.6.54)), we can
use (2.2.98) to conclude that
Tr[(𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ )T 𝐴 (Tr 𝐵 [𝑉𝐴𝐵 + 𝑌 𝐴𝐵 ])] (10.6.66)
= |Tr[(𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ )T 𝐴 (Tr 𝐵 [𝑉𝐴𝐵 + 𝑌 𝐴𝐵 ])] | (10.6.67)
≤ ∥𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ ∥ 1 ∥T 𝐴 (Tr 𝐵 [𝑉𝐴𝐵 + 𝑌 𝐴𝐵 ])∥ ∞ (10.6.68)
= Tr[𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ ] ∥T 𝐴 (Tr 𝐵 [𝑉𝐴𝐵 + 𝑌 𝐴𝐵 ])∥ ∞ (10.6.69)
= Tr[𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ ] ∥Tr 𝐵 [𝑉𝐴𝐵 + 𝑌 𝐴𝐵 ] ∥ ∞ , (10.6.70)
where the final equality follows because the spectrum of an operator is invariant
under the action of a (full) transpose (note, in this case, that T 𝐴 is a full transpose
because the operator Tr 𝐵 [𝑉𝐴𝐵 + 𝑌 𝐴𝐵 ] acts only on system 𝐴). We thus have
𝑊max ( 𝐴′; 𝐵𝐵′)𝜔 ≤ Tr[𝐶 𝐴′ 𝐴𝐵′ + 𝐷 𝐴′ 𝐴𝐵′ ] ∥T 𝐴 [Tr 𝐵 [𝑉𝐴𝐵 + 𝑌 𝐴𝐵 ]] ∥ ∞ (10.6.71)
Since this inequality holds for all 𝐶 𝐴′ 𝐴𝐵′ and 𝐷 𝐴′ 𝐴𝐵′ satisfying (10.6.49) and for
all 𝑉𝐴𝐵 and 𝑌 𝐴𝐵 satisfying (10.6.55), we conclude the inequality in (10.6.46).
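The feasibility argument above relies on the identity from (4.2.6), which recovers the channel output by contracting the input state and the Choi operator against $|\Gamma\rangle_{SA}$. The following sketch checks this identity numerically in the simplest setting, with the systems $A'$ and $B'$ omitted. The amplitude damping channel, its parameter value, and the helper names are assumptions chosen purely for illustration.

```python
import numpy as np

def apply_channel(kraus, rho):
    """Apply a channel given by Kraus operators to an operator rho."""
    return sum(k @ rho @ k.conj().T for k in kraus)

def choi_operator(kraus, d_in):
    """Choi operator Gamma^N_{SB} = sum_{i,j} |i><j|_S (x) N(|i><j|)."""
    d_out = kraus[0].shape[0]
    gamma = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            e_ij = np.zeros((d_in, d_in), dtype=complex)
            e_ij[i, j] = 1.0
            gamma += np.kron(e_ij, apply_channel(kraus, e_ij))
    return gamma

# Amplitude damping channel with damping parameter g (illustrative choice).
g = 0.3
kraus = [np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),
         np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)]
d_a = d_s = d_b = 2

rho_a = np.array([[0.6, 0.2 - 0.1j], [0.2 + 0.1j, 0.4]], dtype=complex)

# Unnormalized vector |Gamma> = sum_i |i>|i> on A (x) S, in that tensor order.
gamma_vec = np.eye(d_a).reshape(d_a * d_s, 1)
bra = np.kron(gamma_vec.conj().T, np.eye(d_b))  # <Gamma|_{AS} (x) 1_B

# <Gamma| (rho_A (x) Gamma^N_{SB}) |Gamma> on the ordered systems A, S, B.
lhs = bra @ np.kron(rho_a, choi_operator(kraus, d_a)) @ bra.conj().T
rhs = apply_channel(kraus, rho_a)
```

The matrices `lhs` and `rhs` agree, which is the step used to pass from (10.6.60) to (10.6.61).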
Having shown that
for every state 𝜌 𝐴′ 𝐴𝐵′ , it immediately follows from the definition of amortized
entanglement of a channel that
\begin{equation}
R^{\mathcal{A}}_{\max}(\mathcal{N}) \leq R_{\max}(\mathcal{N}). \tag{10.6.73}
\end{equation}
With the equality $R^{\mathcal{A}}_{\max}(\mathcal{N}) = R_{\max}(\mathcal{N})$ in hand, along with additivity of
max-Rains relative entropy for bipartite states, additivity of max-Rains information of a
quantum channel immediately follows.
Proof: The additivity of max-Rains relative entropy for bipartite states (see
Proposition 9.29), along with Proposition 10.5, implies that the amortized max-
Rains information of a quantum channel is additive, meaning that
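\begin{equation}
R^{\mathcal{A}}_{\max}(\mathcal{N}\otimes\mathcal{M}) = R^{\mathcal{A}}_{\max}(\mathcal{N}) + R^{\mathcal{A}}_{\max}(\mathcal{M}) \tag{10.6.76}
\end{equation}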
for all quantum channels N and M. Then, from (10.6.45), we obtain the desired
result. ■
Finally, let us prove that the amortization collapse occurs for the squashed entangle-
ment of a quantum channel.
for every state 𝜌 𝐴′ 𝐴𝐵′ , which means by definition of amortized entanglement that
\begin{equation}
E^{\mathcal{A}}_{\mathrm{sq}}(\mathcal{N}) \leq E_{\mathrm{sq}}(\mathcal{N}), \tag{10.6.81}
\end{equation}
Proof: The additivity of the squashed entanglement for bipartite states (see
Proposition 9.32), along with Proposition 10.5, implies that the amortized squashed
entanglement of a quantum channel is additive, meaning that
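\begin{equation}
E^{\mathcal{A}}_{\mathrm{sq}}(\mathcal{N}\otimes\mathcal{M}) = E^{\mathcal{A}}_{\mathrm{sq}}(\mathcal{N}) + E^{\mathcal{A}}_{\mathrm{sq}}(\mathcal{M}) \tag{10.6.83}
\end{equation}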
for all quantum channels N and M. Then, from (10.6.77), we obtain the desired
result. ■
Lemma 10.22
Let 𝜓 𝐾 𝐿 1 𝐿 2 𝑀1 𝑀2 be a pure state. Then
𝐸 sq (𝐾; 𝐿 1 𝐿 2 )𝜓 ≤ 𝐸 sq (𝐾 𝐿 2 𝑀2 ; 𝐿 1 )𝜓 + 𝐸 sq (𝐾 𝐿 1 𝑀1 ; 𝐿 2 )𝜓 . (10.6.84)
Now, let 𝜙 𝐾 𝐿 1 𝐿 2 𝑀1′ 𝑀2′ 𝑅 be a purification of 𝜔 𝐾 𝐿 1 𝐿 2 𝑀1′ 𝑀2′ with purifying system 𝑅.
Then, by definition of conditional entropy, and using the fact that 𝜙 𝐾 𝐿 1 𝐿 2 𝑀1′ 𝑀2′ 𝑅 is
pure, we obtain1
𝐻 (𝐿 1 𝐿 2 |𝑀1′ 𝑀2′ 𝐾)𝜔 = 𝐻 (𝐿 1 𝐿 2 𝑀1′ 𝑀2′ 𝐾)𝜔 − 𝐻 (𝑀1′ 𝑀2′ 𝐾)𝜔 (10.6.93)
= 𝐻 (𝑅) 𝜙 − 𝐻 (𝐿 1 𝐿 2 𝑅) 𝜙 (10.6.94)
= −𝐻 (𝐿 1 𝐿 2 |𝑅) 𝜙 . (10.6.95)
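The duality used in (10.6.93)–(10.6.95) holds for any pure state on three systems: the conditional entropy of the first system given the second is the negative of its conditional entropy given the third. A minimal numerical sketch follows, assuming generic labels $A$, $B$, $C$ in place of $L_1L_2$, $M_1'M_2'K$, and $R$, a randomly drawn pure state, and hypothetical helper names.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log2(w)))

def reduced_state(psi, dims, keep):
    """Reduced density operator of the pure state psi on the subsystems in keep."""
    tensor = psi.reshape(dims)
    drop = [k for k in range(len(dims)) if k not in keep]
    d_keep = int(np.prod([dims[k] for k in keep]))
    m = np.transpose(tensor, list(keep) + drop).reshape(d_keep, -1)
    return m @ m.conj().T

rng = np.random.default_rng(1)
dims = (2, 3, 2)  # dimensions of A, B, C (illustrative)
n = int(np.prod(dims))
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)

# H(A|B) = H(AB) - H(B) and H(A|C) = H(AC) - H(C)
h_a_given_b = entropy(reduced_state(psi, dims, (0, 1))) - entropy(reduced_state(psi, dims, (1,)))
h_a_given_c = entropy(reduced_state(psi, dims, (0, 2))) - entropy(reduced_state(psi, dims, (2,)))
```

The two quantities sum to zero up to rounding, because for a pure state the entropy of any subset of systems equals the entropy of its complement.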
Therefore,
10.7 Summary
... We considered two types of channel entanglement measures. The first type
quantifies the entanglement of a bipartite state after one share of it is sent through
the given quantum channel, in a manner analogous to the channel information
measures defined in Chapter 7. The second type of channel entanglement measure
is called amortized entanglement, which essentially quantifies the difference in
entanglement between a bipartite state and the state obtained after sending one
share of it through the given channel. The concept of amortized entanglement
turns out to play an important role in feedback-assisted communication scenarios
(as considered in Part III), as it can be used to prove important properties of
entanglement measures of the first kind....
to Wang and Duan (2016b); Wang et al. (2019b). Proposition 10.12 is due
to Tomamichel et al. (2017). Concavity of a channel’s unoptimized squashed
entanglement (Proposition 10.15) is due to Takeoka et al. (2014). The amortization
collapse of max-relative entropy of entanglement and of max-Rains information
(Theorems 10.16 and 10.18, respectively) were shown by Berta and Wilde (2018),
with the amortization collapse of max-relative entropy implicitly considered by
Christandl and Müller-Hermes (2017). Lemma 10.22 is due to Takeoka et al. (2014),
and the explicit observation that Lemma 10.22 implies that amortization does not
increase the squashed entanglement of a quantum channel (Theorem 10.20) was
realized by Kaur and Wilde (2017). Corollary 10.21 was established by Takeoka
et al. (2014). ...
10.9 Problems
the quantum relative entropy (see Proposition 7.30). We also use the fact that
$\lim_{\alpha\to\infty}\widetilde{D}_\alpha = D_{\max}$ (see Proposition 7.61).
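The convergence of the sandwiched Rényi relative entropy to the max-relative entropy can be observed numerically. The sketch below assumes the standard definitions $\widetilde{D}_\alpha(\rho\Vert\sigma) = \frac{1}{\alpha-1}\log_2\operatorname{Tr}[(\sigma^{(1-\alpha)/2\alpha}\rho\,\sigma^{(1-\alpha)/2\alpha})^\alpha]$ and $D_{\max}(\rho\Vert\sigma) = \log_2\lambda_{\max}(\sigma^{-1/2}\rho\,\sigma^{-1/2})$ for full-rank states; the two qubit states and the function names are illustrative choices, not taken from the text.

```python
import numpy as np

def mpow(h, p):
    """Real power of a positive definite Hermitian matrix."""
    w, u = np.linalg.eigh(h)
    return (u * w**p) @ u.conj().T

def sandwiched_renyi(rho, sigma, alpha):
    """Sandwiched Renyi relative entropy in bits, for full-rank sigma."""
    s = mpow(sigma, (1 - alpha) / (2 * alpha))
    lam = np.linalg.eigvalsh(s @ rho @ s)
    lam = lam[lam > 0]
    top = lam.max()
    # log2 of sum(lam**alpha), factored to avoid overflow at large alpha
    log_q = alpha * np.log2(top) + np.log2(np.sum((lam / top) ** alpha))
    return log_q / (alpha - 1)

def d_max(rho, sigma):
    """Max-relative entropy in bits, for full-rank sigma."""
    s = mpow(sigma, -0.5)
    return np.log2(np.linalg.eigvalsh(s @ rho @ s).max())

rho = np.array([[0.7, 0.2], [0.2, 0.3]])
sigma = np.array([[0.4, -0.1], [-0.1, 0.6]])

alphas = [1.5, 2.0, 5.0, 20.0, 200.0]
values = [sandwiched_renyi(rho, sigma, a) for a in alphas]
limit = d_max(rho, sigma)
```

The entries of `values` increase with $\alpha$ and approach `limit` from below, consistent with the limit being a supremum over $\alpha\in(1,\infty)$.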
10.A.0.1 𝜶 → 1 Limits
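The proof of
\begin{equation}
\lim_{\alpha\to 1}\widetilde{R}_\alpha(A;B)_\rho = R(A;B)_\rho = \inf_{\sigma_{AB}\in\operatorname{PPT}'(A:B)}D(\rho_{AB}\Vert\sigma_{AB}) \tag{10.A.6}
\end{equation}
Therefore,
\begin{equation}
\lim_{\alpha\to 1^-}\widetilde{E}_\alpha(A;B)_\rho = \sup_{\alpha\in(1/2,1)}\ \inf_{\sigma_{AB}\in\operatorname{SEP}(A:B)}\widetilde{D}_\alpha(\rho_{AB}\Vert\sigma_{AB}). \tag{10.A.13}
\end{equation}
Now, we apply Theorem 2.25. Specifically, we can apply the theorem in order to
exchange the order of the infimum and supremum because the function
\begin{equation}
(\sigma_{AB},\alpha)\mapsto\widetilde{D}_\alpha(\rho_{AB}\Vert\sigma_{AB}) \tag{10.A.14}
\end{equation}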
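The proof of
\begin{equation}
\lim_{\alpha\to 1}\widetilde{R}_\alpha(\mathcal{N}) = R(\mathcal{N}) = \sup_{\psi_{RA}}\ \inf_{\sigma_{RB}\in\operatorname{PPT}'(R:B)}D(\mathcal{N}_{A\to B}(\psi_{RA})\Vert\sigma_{RB}) \tag{10.A.20}
\end{equation}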
Next, when approaching from above, we again use Theorem 2.25. This time, since
the function
10.A.0.2 𝜶 → ∞ Limits
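The proof of
\begin{equation}
\lim_{\alpha\to\infty}\widetilde{R}_\alpha(A;B)_\rho = R_{\max}(A;B)_\rho = \inf_{\sigma_{AB}\in\operatorname{PPT}'(A:B)}D_{\max}(\rho_{AB}\Vert\sigma_{AB}) \tag{10.A.33}
\end{equation}
\begin{equation}
\lim_{\alpha\to\infty}\widetilde{D}_\alpha = \sup_{\alpha\in(1,\infty)}\widetilde{D}_\alpha = D_{\max}. \tag{10.A.34}
\end{equation}
Therefore,
\begin{equation}
\lim_{\alpha\to\infty}\widetilde{E}_\alpha(A;B)_\rho = \sup_{\alpha\in(1,\infty)}\ \inf_{\sigma_{AB}\in\operatorname{SEP}(A:B)}\widetilde{D}_\alpha(\rho_{AB}\Vert\sigma_{AB}). \tag{10.A.35}
\end{equation}
\begin{equation}
(\sigma_{AB},\alpha)\mapsto\widetilde{D}_\alpha(\rho_{AB}\Vert\sigma_{AB}) \tag{10.A.36}
\end{equation}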
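The proof of
\begin{equation}
\lim_{\alpha\to\infty}\widetilde{R}_\alpha(\mathcal{N}) = R_{\max}(\mathcal{N}) = \sup_{\psi_{RA}}\ \inf_{\sigma_{RB}\in\operatorname{PPT}'(R:B)}D_{\max}(\mathcal{N}_{A\to B}(\psi_{RA})\Vert\sigma_{RB}) \tag{10.A.42}
\end{equation}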
where in the third line we used Theorem 2.25 in order to exchange the infimum
and supremum based on exactly the same arguments used in the proof of (10.A.32)
above. This concludes the proof of (10.A.41).
Part II
Quantum Communication Protocols

Entanglement-Assisted Classical Communication
The first communication task that we consider is entanglement-assisted classical
communication. In this scenario, Alice and Bob are allowed to share an unlimited
amount of entanglement prior to communication, and the goal is for Alice to transmit
the maximum possible amount of classical information over a given channel N, by
using this prior shared entanglement as a resource. We consider this particular
setting before all other communication settings because, perhaps unexpectedly, the
main information-theoretic results in this setting are much simpler than those in all
other communication settings that we consider in this book.
Entanglement is a uniquely quantum phenomenon, and it is natural to ask,
when communicating over quantum channels, whether it can be used to provide
an advantage for sending classical information. The super-dense coding protocol,
described in Section 5.2, is an example of such an advantage. Recall that in this
protocol, Alice and Bob share a pair of quantum systems in the maximally entangled
state $|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|0,0\rangle + |1,1\rangle)$, and they are connected by a noiseless qubit channel.
With this shared entanglement, along with only one use of the channel, Alice can
communicate two bits of classical information to Bob. In the case of qudits, using
the maximally entangled state $\frac{1}{\sqrt{d}}\sum_{i=0}^{d-1}|i,i\rangle$, Alice can communicate $2\log_2 d$
bits to Bob with only one use of a noiseless qudit quantum channel. Does this
kind of advantage exist in general? Specifically, supposing that we allow Alice
and Bob unlimited shared entanglement, what is the maximum amount of classical
information that can be communicated over a given quantum channel N?
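The qubit case of super-dense coding recalled above can be simulated directly. The sketch below assumes the usual encoding in which Alice applies $X^aZ^b$ to her half of $|\Phi^+\rangle$ to send the two bits $(a,b)$, and Bob measures in the Bell basis; the function names and the labeling of Bell states by bit pairs are illustrative conventions.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

# Shared state |Phi+> = (|00> + |11>)/sqrt(2); Alice holds the first qubit.
phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

def encode(a, b):
    """Alice applies X^a Z^b to her qubit, then sends it over the noiseless channel."""
    u = np.linalg.matrix_power(X, a) @ np.linalg.matrix_power(Z, b)
    return np.kron(u, I2) @ phi_plus

# Bell basis, labeled by the two-bit message each state corresponds to.
bell_basis = {
    (0, 0): np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2),   # Phi+
    (0, 1): np.array([1.0, 0.0, 0.0, -1.0]) / np.sqrt(2),  # Phi-
    (1, 0): np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2),   # Psi+
    (1, 1): np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2),  # Psi-
}

def decode(state):
    """Bob's Bell measurement: return the outcome with the largest probability."""
    probs = {m: abs(np.vdot(v, state)) ** 2 for m, v in bell_basis.items()}
    return max(probs, key=probs.get)

messages = [(a, b) for a in (0, 1) for b in (0, 1)]
decoded = [decode(encode(a, b)) for a, b in messages]
success_probs = [abs(np.vdot(bell_basis[m], encode(*m))) ** 2 for m in messages]
```

Each of the four messages is decoded correctly with probability one, so a single use of the noiseless qubit channel conveys $\log_2 4 = 2$ bits.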
Chapter 11: Entanglement-Assisted Classical Communication
The answer to this question is provided by Theorem 11.16, which tells us that
the entanglement-assisted classical capacity of a channel N is equal to the mutual
information 𝐼 (N) of the channel (see (7.11.102)). The strength of this result is that
it holds for all channels. Entanglement-assisted classical communication is one of
the few scenarios in which such a profoundly simple statement—applying to all
channels—can be made. Furthermore, the fact that the mutual information 𝐼 (N) is
the optimal rate for entanglement-assisted classical communication for all channels
N makes this communication scenario formally analogous to communication over
classical channels. Indeed, the famous result of Shannon from 1948 is that the
capacity of a classical channel described by a conditional probability distribution
𝑝𝑌 |𝑋 (𝑦|𝑥) with input and output random variables 𝑋 and 𝑌 , respectively, is equal
to max 𝑝 𝑋 𝐼 (𝑋; 𝑌 ), where 𝐼 (𝑋; 𝑌 ) is the mutual information between the random
variables 𝑋 and 𝑌 and the optimization is performed over all probability distributions
𝑝 𝑋 corresponding to the input 𝑋. Entanglement-assisted classical communication
can thus be viewed as a “natural” analogue of classical communication in the
quantum setting.
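To make the classical side of this analogy concrete, the following sketch evaluates Shannon's formula $\max_{p_X} I(X;Y)$ for a binary symmetric channel by a simple grid search over input distributions. The choice of channel, the flip probability 0.1, and the grid resolution are assumptions made for illustration; the closed-form value $1 - h_2(f)$ for this channel is a standard fact used only as a cross-check.

```python
import numpy as np

def h2(x):
    """Binary entropy in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

def mutual_information(p_x, channel):
    """I(X;Y) in bits, where channel[x, y] = p(y|x)."""
    p_xy = p_x[:, None] * channel
    p_y = p_xy.sum(axis=0)
    denom = p_x[:, None] * p_y[None, :]
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / denom[mask])))

flip = 0.1  # crossover probability of the binary symmetric channel
bsc = np.array([[1 - flip, flip], [flip, 1 - flip]])

# Grid search over input distributions p_X = (t, 1 - t).
grid = np.linspace(0.0, 1.0, 201)
values = [mutual_information(np.array([t, 1 - t]), bsc) for t in grid]
capacity = max(values)
best_t = grid[int(np.argmax(values))]
```

The maximum is attained at the uniform input distribution, and `capacity` matches $1 - h_2(0.1)$.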
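[Figure: schematic of an entanglement-assisted classical communication protocol. Alice encodes the message $m\in\mathcal{M}$ with the encoding channel $\mathcal{E}$ acting on her share $A'$ of $\Psi_{A'B'}$, sends system $A$ through $\mathcal{N}$, and Bob decodes $\hat{m}$ from $B$ and his share $B'$.]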
log2 |M| thus represents the number of bits that are communicated in the protocol.
One of the goals of this section is to obtain upper and lower bounds, in terms
of information measures for channels, on the maximum number log2 |M| of bits
that can be communicated in an entanglement-assisted classical communication
protocol.
The protocol proceeds as follows: let 𝑝 : M → [0, 1] be a probability
distribution over the message set. Alice starts by preparing two systems 𝑀 and 𝑀 ′
in the following classically correlated state:
\begin{equation}
\Phi^p_{MM'} := \sum_{m\in\mathcal{M}}p(m)|m\rangle\langle m|_M\otimes|m\rangle\langle m|_{M'}. \tag{11.1.2}
\end{equation}
Note that if Alice wishes to send a particular message 𝑚 deterministically, then
she can choose the distribution 𝑝 to be the degenerate distribution, equal to one
for 𝑚 and zero for all other messages. Alice and Bob share the state Ψ𝐴′ 𝐵′ before
communication begins, so that the global state shared between them is
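\begin{equation}
\Phi^p_{MM'}\otimes\Psi_{A'B'}. \tag{11.1.3}
\end{equation}
Alice then sends the $M'$ and $A'$ registers through the encoding channel $\mathcal{E}_{M'A'\to A}$.
Due to the fact that the system $M'$ is classical, this encoding channel realizes a set
$\{\mathcal{E}^m_{A'\to A}\}_{m\in\mathcal{M}}$ of quantum channels as follows:
\begin{equation}
\mathcal{E}^m_{A'\to A}(\tau_{A'}) := \mathcal{E}_{M'A'\to A}(|m\rangle\langle m|_{M'}\otimes\tau_{A'}) \tag{11.1.4}
\end{equation}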
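for every state $\tau_{A'}$. The global state after the encoding channel is therefore
\begin{equation}
\mathcal{E}_{M'A'\to A}(\Phi^p_{MM'}\otimes\Psi_{A'B'}) = \sum_{m\in\mathcal{M}}p(m)|m\rangle\langle m|_M\otimes\mathcal{E}^m_{A'\to A}(\Psi_{A'B'}). \tag{11.1.5}
\end{equation}
Alice then transmits the $A$ system through the channel $\mathcal{N}_{A\to B}$, leading to the
state
\begin{align}
(\mathcal{N}_{A\to B}\circ\mathcal{E}_{M'A'\to A})(\Phi^p_{MM'}\otimes\Psi_{A'B'})
&= \sum_{m\in\mathcal{M}}p(m)|m\rangle\langle m|_M\otimes(\mathcal{N}_{A\to B}\circ\mathcal{E}^m_{A'\to A})(\Psi_{A'B'}) \notag\\
&= \sum_{m\in\mathcal{M}}p(m)|m\rangle\langle m|_M\otimes\tau^m_{BB'}, \tag{11.1.6}
\end{align}
where
\begin{equation}
\tau^m_{BB'} := (\mathcal{N}_{A\to B}\circ\mathcal{E}^m_{A'\to A})(\Psi_{A'B'}) \quad\forall\, m\in\mathcal{M}. \tag{11.1.7}
\end{equation}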
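Bob, whose task is to determine which message Alice sent, applies a decoding
channel $\mathcal{D}_{BB'\to\hat{M}}$ on his system $B'$ and the system $B$ received through the channel $\mathcal{N}$.
The decoding channel is a quantum-classical channel (Definition 4.10) associated
with a POVM $\{\Lambda^m_{BB'}\}_{m\in\mathcal{M}}$, so that
\begin{equation}
\mathcal{D}_{BB'\to\hat{M}}(\tau^m_{BB'}) := \sum_{\hat{m}\in\mathcal{M}}\operatorname{Tr}[\Lambda^{\hat{m}}_{BB'}\tau^m_{BB'}]\,|\hat{m}\rangle\langle\hat{m}|_{\hat{M}}, \tag{11.1.8}
\end{equation}
\begin{align}
q(\hat{m}|m) &:= \Pr[\hat{M}=\hat{m}\,|\,M=m] \tag{11.1.11}\\
&= \operatorname{Tr}[\Lambda^{\hat{m}}_{BB'}\mathcal{N}_{A\to B}(\mathcal{E}^m_{A'\to A}(\Psi_{A'B'}))]. \tag{11.1.12}
\end{align}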
The probability that Bob correctly identifies a given message 𝑚 is then given by
𝑞(𝑚|𝑚). The message error probability of the code P ≡ (Ψ, E, D) and message 𝑚
is then given by
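\begin{align}
p_{\mathrm{err}}(m,\mathcal{P};\mathcal{N}) &:= 1 - q(m|m) \notag\\
&= \operatorname{Tr}[(1_{BB'}-\Lambda^m_{BB'})\mathcal{N}_{A\to B}(\mathcal{E}^m_{A'\to A}(\Psi_{A'B'}))] \tag{11.1.13}\\
&= \sum_{\hat{m}\in\mathcal{M}\setminus\{m\}}q(\hat{m}|m). \notag
\end{align}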
Each of these three error probabilities can be used to assess the reliability of the
protocol, i.e., how well the encoding and decoding allow Alice to transmit her
message to Bob.
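The error probabilities can be computed directly from the conditional distribution $q(\hat{m}|m) = \operatorname{Tr}[\Lambda^{\hat{m}}\tau^m]$. The sketch below is a toy illustration under stated assumptions: the two output states and the two-outcome measurement are hypothetical stand-ins for $\tau^m_{BB'}$ and $\Lambda^{\hat{m}}_{BB'}$, the average error is taken as $\sum_m p(m)(1-q(m|m))$, and the maximal error as the largest message error.

```python
import numpy as np

def transition_probabilities(povm, states):
    """q[m_hat, m] = Tr[Lambda^{m_hat} tau^m]."""
    return np.array([[np.trace(lam @ tau).real for tau in states] for lam in povm])

def error_probabilities(q, p):
    """Return (per-message errors, average error under prior p, maximal error)."""
    message_err = 1.0 - np.diag(q)
    return message_err, float(p @ message_err), float(message_err.max())

# Hypothetical channel outputs for two messages: |0><0| and |+><+|.
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
states = [np.outer(ket0, ket0), np.outer(ket_plus, ket_plus)]

# Hypothetical decoder: measurement in the computational basis.
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

q = transition_probabilities(povm, states)
per_message, average, maximal = error_probabilities(q, np.array([0.5, 0.5]))
```

In this toy example the first message is always decoded correctly, the second is misidentified half of the time, the average error under the uniform prior is 0.25, and the maximal error is 0.5.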
Lemma 11.2
The following equalities hold
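\begin{align}
p_{\mathrm{err}}(\mathcal{P};p,\mathcal{N}) &= \frac{1}{2}\left\Vert\Phi^p_{MM'}-\omega^p_{M\hat{M}}\right\Vert_1, \tag{11.1.16}\\
p^*_{\mathrm{err}}(\mathcal{P};\mathcal{N}) &= \max_{p:\mathcal{M}\to[0,1]}\frac{1}{2}\left\Vert\Phi^p_{MM'}-\omega^p_{M\hat{M}}\right\Vert_1, \tag{11.1.17}
\end{align}
where $\Phi^p_{MM'}$ and $\omega^p_{M\hat{M}}$ are defined in (11.1.2) and (11.1.9), respectively.
Thus, the error criterion $p^*_{\mathrm{err}}(\mathcal{P};\mathcal{N})\leq\varepsilon$ is equivalent to
$\max_{p:\mathcal{M}\to[0,1]}\frac{1}{2}\left\Vert\Phi^p_{MM'}-\omega^p_{M\hat{M}}\right\Vert_1\leq\varepsilon$.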
Remark: The final criterion above states that the normalized trace distance between the initial
and final states of the protocol, maximized over all possible prior probability distributions, does
not exceed 𝜀.
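The lemma can be checked numerically, since both states involved are diagonal in the message basis. The sketch below assumes a randomly generated conditional distribution $q(\hat{m}|m)$ and prior $p$; the function name `half_trace_distance` and the random instance are illustrative, not part of the text.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4  # number of messages (illustrative)

# Random conditional distribution q[m_hat, m] and random prior p.
q = rng.random((n, n))
q /= q.sum(axis=0, keepdims=True)
p = rng.random(n)
p /= p.sum()

def half_trace_distance(p, q):
    """(1/2) || Phi^p - omega^p ||_1 for the classical states of the protocol."""
    n = len(p)
    phi = np.zeros((n * n, n * n))
    omega = np.zeros((n * n, n * n))
    for m in range(n):
        phi[m * n + m, m * n + m] = p[m]  # |m, m><m, m| with weight p(m)
        for m_hat in range(n):
            omega[m * n + m_hat, m * n + m_hat] = p[m] * q[m_hat, m]
    return 0.5 * np.linalg.norm(phi - omega, "nuc")

message_err = 1.0 - np.diag(q)
average_err = float(p @ message_err)
distance = half_trace_distance(p, q)

# A point mass on the worst message attains the maximal error probability.
worst = int(np.argmax(message_err))
point_mass = np.eye(n)[worst]
distance_worst = half_trace_distance(point_mass, q)
```

Here `distance` equals the average error probability under the prior `p`, and `distance_worst` equals the maximal message error, in line with the two equalities of the lemma.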
Proof: To see this, let us first note that the normalized trace distance in (11.1.17)
is equal to the average error probability of the code. Indeed,
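\begin{align}
&\frac{1}{2}\left\Vert\Phi^p_{MM'}-\omega^p_{M\hat{M}}\right\Vert_1 \notag\\
&= \frac{1}{2}\left\Vert\sum_{m\in\mathcal{M}}p(m)|m,m\rangle\langle m,m|_{MM'} - \sum_{m,\hat{m}\in\mathcal{M}}p(m)q(\hat{m}|m)|m,\hat{m}\rangle\langle m,\hat{m}|_{M\hat{M}}\right\Vert_1 \tag{11.1.18}\\
&= \frac{1}{2}\left\Vert\sum_{m\in\mathcal{M}}p(m)|m\rangle\langle m|_M\otimes\left(|m\rangle\langle m|_{M'} - \sum_{\hat{m}\in\mathcal{M}}q(\hat{m}|m)|\hat{m}\rangle\langle\hat{m}|_{\hat{M}}\right)\right\Vert_1 \tag{11.1.19}\\
&= \frac{1}{2}\sum_{m\in\mathcal{M}}p(m)\left\Vert|m\rangle\langle m|_{M'} - \sum_{\hat{m}\in\mathcal{M}}q(\hat{m}|m)|\hat{m}\rangle\langle\hat{m}|_{\hat{M}}\right\Vert_1 \tag{11.1.20}\\
&= \frac{1}{2}\sum_{m\in\mathcal{M}}p(m)\left\Vert(1-q(m|m))|m\rangle\langle m| - \sum_{\hat{m}\in\mathcal{M}\setminus\{m\}}q(\hat{m}|m)|\hat{m}\rangle\langle\hat{m}|_{\hat{M}}\right\Vert_1 \tag{11.1.21}\\
&= \frac{1}{2}\sum_{m\in\mathcal{M}}p(m)\left((1-q(m|m)) + \sum_{\hat{m}\in\mathcal{M}\setminus\{m\}}q(\hat{m}|m)\right) \tag{11.1.22}\\
&= \frac{1}{2}\underbrace{\sum_{m\in\mathcal{M}}p(m)(1-q(m|m))}_{p_{\mathrm{err}}(\mathcal{P};p,\mathcal{N})} + \frac{1}{2}\underbrace{\sum_{m\in\mathcal{M}}\sum_{\hat{m}\in\mathcal{M}\setminus\{m\}}p(m)q(\hat{m}|m)}_{p_{\mathrm{err}}(\mathcal{P};p,\mathcal{N})} \tag{11.1.23}\\
&= p_{\mathrm{err}}(\mathcal{P};p,\mathcal{N}), \tag{11.1.24}
\end{align}
where the third and fifth equalities follow from (2.2.97) with $\alpha = 1$. Then, if $m^*\in\mathcal{M}$
is the message attaining the maximum error probability $p^*_{\mathrm{err}}$, let $\tilde{p}:\mathcal{M}\to[0,1]$
be the probability distribution such that $\tilde{p}(m^*) = 1$ and $\tilde{p}(m) = 0$ for all $m\neq m^*$.
Using this probability distribution, we obtain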
fined to be the maximum number log2 |M| of transmitted bits among all
(|M|, 𝜀) entanglement-assisted classical communication protocols over N. In
other words,
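\begin{equation}
C^{\varepsilon}_{\mathrm{EA}}(\mathcal{N}) := \sup_{(\mathcal{M},\Psi,\mathcal{E},\mathcal{D})}\left\{\log_2|\mathcal{M}| : p^*_{\mathrm{err}}((\Psi,\mathcal{E},\mathcal{D});\mathcal{N})\leq\varepsilon\right\}, \tag{11.1.43}
\end{equation}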
In addition to finding, for a given 𝜀 ∈ [0, 1], the maximum number of transmitted
bits among all (|M|, 𝜀) classical communication protocols over N 𝐴→𝐵 , we can
consider the following complementary problem: for a given number of messages
|M|, find the smallest possible error among all (|M|, 𝜀) entanglement-assisted
classical communication protocols, which we denote by $\varepsilon^*_{\mathrm{EA}}(|\mathcal{M}|;\mathcal{N})$. In other
where the optimization is over every state $\Psi_{A'B'}$, encoding channel $\mathcal{E}_{M'A'\to A}$, and
decoding channel $\mathcal{D}_{BB'\to\hat{M}}$, such that $d_{M'} = d_{\hat{M}} = |\mathcal{M}|$. In this chapter, we focus
primarily on the problem of optimizing the number of transmitted bits rather than
the error, and so our primary quantity of interest is the one-shot capacity $C^{\varepsilon}_{\mathrm{EA}}(\mathcal{N})$.
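[Figure 11.2: the entanglement-assisted classical communication protocol with the channel replaced by a useless replacement channel that outputs a fixed state $\sigma_B$ on system $B$; Bob decodes $\hat{m}$ from $B$ and his share $B'$ of $\Psi_{A'B'}$.]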
in Figure 11.2. This useless channel discards the quantum state encoded with the
message and replaces it with some arbitrary (but fixed) state 𝜎𝐵 . This replacement
channel is useless for communication because the state 𝜎𝐵 does not contain any
information about the message 𝑚. Intuitively, we can say that such a channel
corresponds to “cutting the communication line.” As we show in Lemma 11.4,
comparing this protocol over the useless channel with the actual protocol allows us
to obtain an upper bound on the quantity log2 |M|, which we recall represents the
number of bits that are transmitted over the channel.
The definition of the useless channel implies that, for every message 𝑚 ∈ M,
where $\Psi_{B'} := \operatorname{Tr}_{A'}[\Psi_{A'B'}]$. Making use of the definition of the replacement channel
$\mathcal{R}^{\sigma_B}_{A\to B}$ in Definition 4.8, we can write this as
The state at the end of the protocol over the useless channel is then
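\begin{align}
\tau^p_{M\hat{M}} &:= \sum_{m,\hat{m}\in\mathcal{M}}p(m)\operatorname{Tr}[\Lambda^{\hat{m}}_{BB'}(\sigma_B\otimes\Psi_{B'})]\,|m\rangle\langle m|_M\otimes|\hat{m}\rangle\langle\hat{m}|_{\hat{M}} \notag\\
&= \pi^p_M\otimes\sum_{\hat{m}\in\mathcal{M}}\operatorname{Tr}[\Lambda^{\hat{m}}_{BB'}(\sigma_B\otimes\Psi_{B'})]\,|\hat{m}\rangle\langle\hat{m}|_{\hat{M}}, \tag{11.1.47}
\end{align}
where
\begin{equation}
\pi^p_M := \sum_{m\in\mathcal{M}}p(m)|m\rangle\langle m|_M. \tag{11.1.48}
\end{equation}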
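Now, recall from (11.1.10) that the state $\omega^p_{M\hat{M}}$ at the end of the actual protocol over
the channel $\mathcal{N}$ is given by
\begin{equation}
\omega^p_{M\hat{M}} = \sum_{m,\hat{m}\in\mathcal{M}}p(m)|m\rangle\langle m|_M\otimes\operatorname{Tr}[\Lambda^{\hat{m}}_{BB'}\mathcal{N}_{A\to B}(\mathcal{E}^m_{A'\to A}(\Psi_{A'B'}))]\,|\hat{m}\rangle\langle\hat{m}|_{\hat{M}}. \tag{11.1.49}
\end{equation}
It is helpful in what follows to let
\begin{equation}
\Phi_{MM'} := \frac{1}{|\mathcal{M}|}\sum_{m\in\mathcal{M}}|m\rangle\langle m|_M\otimes|m\rangle\langle m|_{M'} \tag{11.1.50}
\end{equation}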
Lemma 11.4
Let $\Phi_{MM'}$ be the state defined in (11.1.50), and let $\omega_{MM'}$ be a state on the
two classical registers $M$ and $M'$ such that $\omega_M = \operatorname{Tr}_{M'}[\omega_{MM'}] = \pi_M = \frac{1_M}{|\mathcal{M}|}$.
If the probability Tr[Π 𝑀 𝑀 ′ 𝜔 𝑀 𝑀 ′ ] that the state 𝜔 𝑀 𝑀 ′ passes the comparator
test defined by the POVM {Π 𝑀 𝑀 ′ , 1 − Π 𝑀 𝑀 ′ }, where Π 𝑀 𝑀 ′ is the projection
defined in (11.1.37), satisfies
Tr[Π 𝑀 𝑀 ′ 𝜔 𝑀 𝑀 ′ ] ≥ 1 − 𝜀, (11.1.51)
Tr[Π 𝑀 𝑀 ′ 𝜔 𝑀 𝑀 ′ ] ≥ 1 − 𝜀, (11.1.53)
where the inequality follows from the definition of the hypothesis testing relative
entropy in (7.9.1) (i.e., Π 𝑀 𝑀 ′ is a particular measurement operator satisfying
(11.1.53), but $D^{\varepsilon}_H(\omega_{MM'}\Vert\tau_{MM'})$ involves an optimization over all such operators).
Since the state 𝜎𝑀 ′ is arbitrary, we conclude that
The right-hand side of (11.1.52) is an upper bound on the number log2 |M| of bits
communicated using an (|M|, 𝜀) entanglement-assisted classical communication
protocol over the channel N. Indeed, since the error criterion 𝑝 ∗err (P) ≤ 𝜀 holds by
definition of an (|M|, 𝜀) protocol, using (11.1.36) and (11.1.24) we obtain
1 Note that the state in (11.1.47) at the end of the entanglement-assisted classical communication
protocol over the useless channel has precisely this form (when 𝑝 is taken to be the uniform
distribution over the message set M).
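\begin{align}
&= \max_{p:\mathcal{M}\to[0,1]}\frac{1}{2}\left\Vert\Phi^p_{MM'}-\omega^p_{M\hat{M}}\right\Vert_1 \tag{11.1.64}\\
&= p^*_{\mathrm{err}}(\mathcal{P};\mathcal{N}) \tag{11.1.65}\\
&\leq \varepsilon \tag{11.1.66}
\end{align}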
which is the state 𝜔 with 𝑝 being the uniform distribution over M. Observe that
𝑀𝑀b
" ! #
1 ∑︁ ∑︁
𝐵𝐵′ N 𝐴→𝐵 (E 𝐴′ →𝐴 (Ψ 𝐴′ 𝐵′ )) |𝑚⟩⟨𝑚| 𝑀
Λ𝑚 𝑚
Tr 𝑀b [𝜔 𝑀 𝑀b ] = Tr b
|M|
𝑚∈M b∈M
𝑚
(11.1.68)
1 ∑︁
= Tr[N 𝐴→𝐵 (E𝑚𝐴′ →𝐴 (Ψ𝐴′ 𝐵′ ))]|𝑚⟩⟨𝑚| 𝑀 (11.1.69)
|M|
𝑚∈M
= 𝜋𝑀 , (11.1.70)
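where the last equality follows because the channels $\mathcal{N}_{A\to B}$ and $\mathcal{E}^m_{A'\to A}$ are trace
preserving. Finally, since the probability of passing the comparator test is given by
(11.1.41), i.e.,
\begin{equation}
\operatorname{Tr}\!\left[\Pi_{M\hat{M}}\,\omega^p_{M\hat{M}}\right] = 1 - p_{\mathrm{err}}(\mathcal{P};p,\mathcal{N}) \tag{11.1.71}
\end{equation}
for every probability distribution $p$, we find that $\operatorname{Tr}[\Pi_{M\hat{M}}\,\omega_{M\hat{M}}]\geq 1-\varepsilon$. The state
$\omega_{M\hat{M}}$ thus satisfies the condition of Lemma 11.4. We conclude that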
We now give a general upper bound on the number of transmitted bits possible for
an arbitrary one-shot entanglement-assisted classical communication protocol for
a channel N. This result is stated in Theorem 11.6. The upper bound obtained
therein holds independently of the encoding and decoding channels used in the
protocol and depends only on the given communication channel N.
Let us start with an arbitrary (|M|, 𝜀) entanglement-assisted classical com-
munication protocol over N corresponding to, as described at the beginning of
this chapter, a message set M, a prior shared entangled state Ψ𝐴′ 𝐵′ , an encoding
channel E, and a decoding channel D. The error criterion 𝑝 ∗err (P; N) ≤ 𝜀 holds
by the definition of an $(|\mathcal{M}|, \varepsilon)$ protocol. Then, by the arguments at the end of the
previous section, Lemma 11.4 implies that the inequality $\log_2|\mathcal{M}| \leq I_H^{\varepsilon}(M;\hat{M})_{\omega}$
holds. Using this bound on the number log2 |M| of transmitted bits, we obtain the
following result:
where the first equality follows from the observation in (11.1.81) and the second
equality follows by definition. Now, observe that the state 𝜃 𝑀 𝐵𝐵′ has the form
N 𝐴→𝐵 (𝜌 𝑆 𝐴 ), where 𝑆 ≡ 𝑀 𝐵′ and 𝜌 𝑆 𝐴 ≡ E 𝑀 ′ 𝐴′ →𝐴 (Φ 𝑀 𝑀 ′ ⊗ Ψ𝐴′ 𝐵′ ). This means
that we can optimize over every state 𝜌 𝑆 𝐴 to obtain
where 𝜓 𝑆 𝐴 is a pure state, with the dimension of 𝑆 the same as that of 𝐴, and
𝜁 𝑆𝐵 = N 𝐴→𝐵 (𝜓 𝑆 𝐴 ). So we have
\[ \log_2|\mathcal{M}| \leq I_H^{\varepsilon}(M;\hat{M})_{\omega} \leq I_H^{\varepsilon}(MB';B)_{\theta} \leq I_H^{\varepsilon}(\mathcal{N}), \tag{11.1.89} \]
as required. ■
in which we explicitly see the comparison, via the hypothesis testing relative entropy,
between the actual entanglement-assisted classical communication protocol and
Since the bounds in (11.1.92) and (11.1.93) hold for every (|M|, 𝜀) entanglement-
assisted classical communication protocol over N, we have that
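\[ C_{\mathrm{EA}}^{\varepsilon}(\mathcal{N}) \leq \frac{1}{1-\varepsilon}\left(I(\mathcal{N}) + h_2(\varepsilon)\right), \tag{11.1.94} \]
\[ C_{\mathrm{EA}}^{\varepsilon}(\mathcal{N}) \leq \widetilde{I}_{\alpha}(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\frac{1}{1-\varepsilon} \quad \forall\, \alpha > 1, \tag{11.1.95} \]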
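As a rough numerical aid, the right-hand sides of (11.1.94) and (11.1.95) can be evaluated once values for the channel quantities are available. The following is a minimal sketch under that assumption: the function names are illustrative (not from the text), and the arguments `mutual_info` and `renyi_info` are placeholders for $I(\mathcal{N})$ and the sandwiched Rényi mutual information of $\mathcal{N}$, which must be computed separately.

```python
import math

def binary_entropy(eps: float) -> float:
    """Binary entropy h_2(eps) in bits, with h_2(0) = h_2(1) = 0."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)

def weak_converse_bound(mutual_info: float, eps: float) -> float:
    """Right-hand side of (11.1.94); mutual_info stands in for I(N)."""
    return (mutual_info + binary_entropy(eps)) / (1 - eps)

def renyi_converse_bound(renyi_info: float, alpha: float, eps: float) -> float:
    """Right-hand side of (11.1.95); renyi_info stands in for the
    sandwiched Renyi mutual information of N at order alpha > 1."""
    if alpha <= 1:
        raise ValueError("alpha must exceed 1")
    return renyi_info + alpha / (alpha - 1) * math.log2(1 / (1 - eps))
```

For example, with a hypothetical value $I(\mathcal{N}) = 2$ and $\varepsilon = 0.5$, the first bound evaluates to $(2 + 1)/0.5 = 6$ bits.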
testing relative entropy. Lemma 11.4 plays a role in bounding the maximum
number of transmitted bits for a particular protocol.
2. We then used the data-processing inequality for the hypothesis testing relative
entropy to obtain a quantity that is independent of the decoding channel, as
well as being minimized over all useless protocols when compared to the actual
protocol. This is done in (11.1.76) and (11.1.82)–(11.1.85) in the proof of
Proposition 11.5.
3. Finally, we optimized over all encoding channels (and, effectively, over all
shared states Ψ𝐴′ 𝐵′ ) in (11.1.86)–(11.1.88) to obtain Proposition 11.5, in which
the bound is a function solely of the channel and the error probability.
4. Using Propositions 7.70 and 7.71, which relate the hypothesis testing relative
entropy to the quantum relative entropy and the sandwiched Rényi relative
entropy, respectively, we arrived at Theorem 11.6.
The bounds in (11.1.92) and (11.1.93) are fundamental upper bounds on the
number of transmitted bits for every entanglement-assisted classical communication
protocol. A natural question to ask now is whether the upper bounds in (11.1.92)
and (11.1.93) can be achieved. In other words, is it possible to devise protocols
such that the number of transmitted bits is equal to the right-hand side of either
(11.1.92) or (11.1.93)? We do not know how to, especially if we demand that we
exactly attain the right-hand side of either (11.1.92) or (11.1.93). However, when
given many uses of a channel (in the asymptotic setting), we can come close to
achieving these upper bounds. This motivates finding lower bounds on the number
of transmitted bits.
where for 𝑚 ∈ M the message error probability 𝑝 err (𝑚, P; N) is defined in (11.1.13)
as
𝑝 err (𝑚, P; N) = 1 − 𝑞(𝑚|𝑚), (11.1.97)
with $q(\hat{m}|m)$ being the probability of identifying the message sent as $\hat{m}$, given that
the message $m$ was sent.
We make use of a technique called position-based coding along with sequential
decoding to establish the lower bound (11.1.108) in Proposition 11.8 below, which
is analogous to the upper bound (11.1.73) in Proposition 11.5. We now give a brief
description of position-based coding and sequential decoding, while leaving the
details to the proof of Proposition 11.8.
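Let us consider an entanglement-assisted classical communication protocol
defined by the four elements $(\mathcal{M}, \rho_{A'B'}^{\otimes|\mathcal{M}|}, \mathcal{E}, \mathcal{D})$ and depicted in Figure 11.3. The
state shared by Alice and Bob prior to communication is $|\mathcal{M}|$ copies of a state $\rho_{A'B'}$.
The encoding $\mathcal{E}$ is defined such that if Alice wishes to send a message $m \in \mathcal{M}$,
then she sends her $m$th $A'$ system through the channel. Specifically, the encoding
channels $\mathcal{E}^{m}_{(A')^{|\mathcal{M}|}\to A}$ are defined as
\[ \mathcal{E}^{m}_{(A')^{|\mathcal{M}|}\to A}\!\left(\rho_{A'B'}^{\otimes|\mathcal{M}|}\right) = \rho_{B'_1}\otimes\cdots\otimes\rho_{AB'_m}\otimes\cdots\otimes\rho_{B'_{|\mathcal{M}|}} = \operatorname{Tr}_{\bar{A}_m}\!\left[\rho_{A'B'}^{\otimes|\mathcal{M}|}\right], \tag{11.1.98} \]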
Bob, whose task is to determine the message 𝑚 sent to him, should apply a de-
coding channel that ideally succeeds with high probability. The sequential decoding
strategy consists of Bob performing a sequence of measurements on systems 𝐵𝑖′ and
2 In practice, it would be wasteful for the sender to discard so much entanglement by explicitly
using the encoding procedure in (11.1.98). The explicit encoding given should thus be considered a
conceptual tool for understanding that the 𝑚th system is sent through the channel, and in practice it
can be realized simply by sending the 𝑚th system through the channel.
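[Figure 11.3: Position-based coding. Alice and Bob share the states $\rho_{A'_1B'_1}, \ldots, \rho_{A'_{|\mathcal{M}|}B'_{|\mathcal{M}|}}$; to send the message $m \in \mathcal{M}$, Alice sends the system $A'_m$ through the channel $\mathcal{N}$, while Bob holds the systems $B'_1, \ldots, B'_{|\mathcal{M}|}$.]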
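\[ P_i := \mathbb{1}_{B'_1R_1}\otimes\cdots\otimes\mathbb{1}_{B'_{i-1}R_{i-1}}\otimes\Pi_{B'_iBR_i}\otimes\mathbb{1}_{B'_{i+1}R_{i+1}}\otimes\cdots\otimes\mathbb{1}_{B'_{|\mathcal{M}|}R_{|\mathcal{M}|}} \tag{11.1.100} \]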
for all 1 ≤ 𝑖 ≤ |M|, and they correspond to measuring systems 𝐵𝑖′ 𝐵𝑅𝑖 with the
POVM {Π𝐵′ 𝐵𝑅 , 1𝐵′ 𝐵𝑅 − Π𝐵′ 𝐵𝑅 }. This measurement can be thought of intuitively as
asking the question “Was the 𝑖th message sent?”, with the outcome corresponding
to $P_i$ being “yes” and the outcome corresponding to
\[ \hat{P}_i := \mathbb{1} - P_i \tag{11.1.101} \]
being “no.” Bob performs a measurement on the systems $B'_1BR_1$, followed by
a measurement on $B'_2BR_2$, followed by a measurement on $B'_3BR_3$, etc., until he
obtains the outcome corresponding to “yes.” The system number corresponding to
this outcome is then his guess for the message. The probability $q(\hat{m}|m)$ of guessing
$\hat{m}$ given that the message $m$ was sent is, therefore,
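where
\[ \omega^{m}_{B'_1\cdots B'_{|\mathcal{M}|}BR_1\cdots R_{|\mathcal{M}|}} := \tau^{m}_{B'_1\cdots B'_{|\mathcal{M}|}B}\otimes|0,\ldots,0\rangle\langle 0,\ldots,0|_{R_1\cdots R_{|\mathcal{M}|}} \tag{11.1.103} \]
for all $m \in \mathcal{M}$.
Recall that our goal is to place an upper bound on the maximal error probability
$p^{*}_{\mathrm{err}}(\mathcal{P};\mathcal{N})$ of this position-based coding and sequential decoding protocol. We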
obtain an upper bound on 𝑝 err (𝑚, P; N) for each message 𝑚 from applying the
following theorem, called the quantum union bound, whose proof can be found in
Appendix 11.A. This theorem can be thought of as a quantum generalization of
the union bound from probability theory. Indeed, if 𝐴1 , . . . , 𝐴 𝑁 is a sequence of
events, then the union bound is as follows:
\[ \Pr[(A_1\cap\cdots\cap A_N)^{c}] = \Pr[A_1^{c}\cup\cdots\cup A_N^{c}] \leq \sum_{i=1}^{N}\Pr[A_i^{c}], \tag{11.1.105} \]
where the superscript $c$ denotes the complement of an event.
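The classical union bound in (11.1.105) is easy to check numerically. The sketch below does so in the special case of independent events, which is an assumption made only so that the left-hand side has a simple product form (the bound itself holds without independence); the function name is illustrative.

```python
def union_bound_sides(success_probs):
    """For independent events A_i with Pr[A_i] = success_probs[i], return
    (Pr[(A_1 and ... and A_N)^c], sum_i Pr[A_i^c])."""
    prob_all = 1.0
    for p in success_probs:
        prob_all *= p                      # independence: Pr[A_1 ∩ ... ∩ A_N]
    lhs = 1.0 - prob_all                   # probability that at least one event fails
    rhs = sum(1.0 - p for p in success_probs)
    return lhs, rhs

lhs, rhs = union_bound_sides([0.9, 0.95, 0.99])
```

Here `lhs` is $1 - 0.9\cdot 0.95\cdot 0.99 = 0.15355$ and `rhs` is $0.1 + 0.05 + 0.01 = 0.16$, consistent with the inequality.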
Using this theorem, we place the following upper bound on the message error
probability 𝑝 err (𝑚, P; N):
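\begin{align}
p_{\mathrm{err}}(m,\mathcal{P};\mathcal{N}) &\leq (1+c)\operatorname{Tr}\!\left[\hat{P}_m\,\omega^{m}_{B'_1\cdots B'_{|\mathcal{M}|}BR_1\cdots R_{|\mathcal{M}|}}\right] \notag\\
&\quad + (2+c+c^{-1})\sum_{i=1}^{m-1}\operatorname{Tr}\!\left[P_i\,\omega^{m}_{B'_1\cdots B'_{|\mathcal{M}|}BR_1\cdots R_{|\mathcal{M}|}}\right], \tag{11.1.107}
\end{align}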
which holds for all 𝑐 > 0. By making a particular choice for the projectors
𝑃1 , . . . , 𝑃 |M| , and a particular choice for the constant 𝑐, we obtain the following.
Remark: The quantity $\overline{I}_H^{\varepsilon}(A;B)_{\rho}$ defined in the statement of Proposition 11.8 above is similar
to the quantity $I_H^{\varepsilon}(A;B)_{\rho}$ defined in (7.11.88), except that we do not perform an optimization
over states $\sigma_B$. The resulting channel quantity $\overline{I}_H^{\varepsilon}(\mathcal{N})$ is then similar to the quantity $I_H^{\varepsilon}(\mathcal{N})$
defined in (7.11.87). The fact that it suffices to optimize over pure states $\psi_{RA}$ in $\overline{I}_H^{\varepsilon}(\mathcal{N})$, with the
dimension of $R$ the same as that of $A$, follows from arguments analogous to those presented in
Section 7.11.
Proof: Fix 𝜀 ∈ (0, 1) and 𝜂 ∈ (0, 𝜀). Starting with the encoded state on Bob’s
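systems as defined in (11.1.99), observe that the state of the systems $B$ and $B'_{\hat{m}}$ is
given by
\[ \tau^{m}_{BB'_{\hat{m}}} = \begin{cases} \mathcal{N}_{A'\to B}(\rho_{A'B'}) & \text{if } \hat{m} = m,\\ \mathcal{N}_{A'\to B}(\rho_{A'})\otimes\rho_{B'} & \text{if } \hat{m} \neq m. \end{cases} \tag{11.1.112} \]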
Recall that the system 𝐵 results from the system 𝐴𝑚 being sent through the channel
by Alice. If, along with system 𝐵, he measures the system 𝐵′𝑚 , then Bob is
meaning that
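\begin{align}
-\log_2\operatorname{Tr}[\Lambda_{B'B}(\rho_{B'}\otimes\mathcal{N}_{A'\to B}(\rho_{A'}))] &= D_H^{\varepsilon-\eta}(\mathcal{N}_{A'\to B}(\rho_{A'B'})\,\|\,\rho_{B'}\otimes\mathcal{N}_{A'\to B}(\rho_{A'})) \notag\\
&= \overline{I}_H^{\varepsilon-\eta}(B';B)_{\xi}, \tag{11.1.115}
\end{align}
where $\xi_{B'B} := \mathcal{N}_{A'\to B}(\rho_{A'B'})$.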
The measurement with POVM {Λ𝐵′ 𝐵 , 1𝐵′ 𝐵 − Λ𝐵′ 𝐵 } forms one part of Bob’s
decoding strategy. The other part of the decoding strategy is based on the fact
that Bob does not know the position corresponding to the system 𝐵 he receives
through the channel from Alice. He therefore does not know which of the systems
𝐵′1 , . . . , 𝐵′|M| to measure along with 𝐵. As described before the statement of the
proposition, the sequential decoding strategy consists of Bob performing a sequence
of projective measurements on the systems 𝐵𝑖′ 𝐵𝑅 corresponding to the question
“Was the 𝑖th message sent?”. Let us define the projectors {Π𝐵′ 𝐵𝑅 , 1𝐵′ 𝐵𝑅 − Π𝐵′ 𝐵𝑅 }
on which this measurement is based as follows:
Tr[Π𝐵′ 𝐵𝑅 (N 𝐴′ →𝐵 (𝜌 𝐴′ 𝐵′ ) ⊗ |0⟩⟨0| 𝑅 )]
Now, recall that the message error probability 𝑝 err (𝑚, P; N) is defined as in
(11.1.104), i.e.,
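\[ p_{\mathrm{err}}(m,\mathcal{P};\mathcal{N}) = 1 - \operatorname{Tr}\!\left[P_m\hat{P}_{m-1}\cdots\hat{P}_1\,\omega^{m}_{B'_1\cdots B'_{|\mathcal{M}|}BR_1\cdots R_{|\mathcal{M}|}}\,\hat{P}_1\cdots\hat{P}_{m-1}P_m\right], \tag{11.1.123} \]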
and that we can use the quantum union bound (Theorem 11.7) to place an upper
bound on this quantity as in (11.1.107), i.e.,
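\begin{align}
p_{\mathrm{err}}(m,\mathcal{P};\mathcal{N}) &\leq (1+c)\operatorname{Tr}\!\left[\hat{P}_m\,\omega^{m}_{B'_1\cdots B'_{|\mathcal{M}|}BR_1\cdots R_{|\mathcal{M}|}}\right] \notag\\
&\quad + (2+c+c^{-1})\sum_{i=1}^{m-1}\operatorname{Tr}\!\left[P_i\,\omega^{m}_{B'_1\cdots B'_{|\mathcal{M}|}BR_1\cdots R_{|\mathcal{M}|}}\right] \tag{11.1.124}
\end{align}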
for all 𝑐 > 0. Using (11.1.121) and (11.1.122), the inequality in (11.1.113), and
the equality in (11.1.115), the upper bound can be simplified so that
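\begin{align}
p_{\mathrm{err}}(m,\mathcal{P};\mathcal{N}) &\leq (1+c)\operatorname{Tr}[(\mathbb{1}_{B'B} - \Lambda_{B'B})\,\mathcal{N}_{A'\to B}(\rho_{B'A'})] \notag\\
&\quad + (2+c+c^{-1})(m-1)\operatorname{Tr}[\Lambda_{B'B}(\rho_{B'}\otimes\mathcal{N}_{A'\to B}(\rho_{A'}))] \tag{11.1.125}\\
&\leq (1+c)(\varepsilon-\eta) + (2+c+c^{-1})\,|\mathcal{M}|\,2^{-\overline{I}_H^{\varepsilon-\eta}(B';B)_{\xi}} \tag{11.1.126}
\end{align}
for all $c > 0$, where the second inequality follows because $m - 1 \leq |\mathcal{M}|$. The
inequality in (11.1.126) holds for all $m \in \mathcal{M}$, which means that for all $c > 0$,
\[ p^{*}_{\mathrm{err}}(\mathcal{P};\mathcal{N}) \leq (1+c)(\varepsilon-\eta) + (2+c+c^{-1})\,|\mathcal{M}|\,2^{-\overline{I}_H^{\varepsilon-\eta}(B';B)_{\xi}}. \tag{11.1.127} \]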
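Let us set $\gamma \equiv \overline{I}_H^{\varepsilon-\eta}(B';B)_{\xi}$ and solve for the value of $|\mathcal{M}|$ such that
\[ (1+c)(\varepsilon-\eta) + (2+c+c^{-1})\,|\mathcal{M}|\,2^{-\gamma} = \varepsilon. \tag{11.1.128} \]
We find that
\[ |\mathcal{M}| = 2^{\gamma}\,b(\eta - b\varepsilon), \tag{11.1.129} \]
where $b \equiv \frac{c}{1+c}$. Since $b$ is a variable and our goal is to make $|\mathcal{M}|$ as large as possible
for fixed $\varepsilon$ and $\eta$, let us maximize $|\mathcal{M}|$ with respect to $b$. Solving $\frac{\partial|\mathcal{M}|}{\partial b} = 0$, we find
that $b = \frac{\eta}{2\varepsilon}$. This is a permissible value of $b$ since it is required that $b > 0$ and
$\eta - b\varepsilon \geq 0$. Plugging back into (11.1.129), we find that
\[ |\mathcal{M}| = 2^{\gamma}\,\frac{\eta^{2}}{4\varepsilon} = 2^{\overline{I}_H^{\varepsilon-\eta}(B';B)_{\xi} - \log_2\frac{4\varepsilon}{\eta^{2}}}. \tag{11.1.130} \]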
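The optimization over $b$ can be sanity-checked numerically. The sketch below evaluates $|\mathcal{M}| = 2^{\gamma} b(\eta - b\varepsilon)$ from (11.1.129), compares a brute-force grid search against the stationary point $b = \eta/(2\varepsilon)$, and confirms that the corresponding $c = b/(1-b)$ makes the right-hand side of (11.1.127) equal to $\varepsilon$. The function names and the sample values of $\gamma$, $\varepsilon$, $\eta$ are illustrative assumptions.

```python
def message_count(gamma: float, eps: float, eta: float, b: float) -> float:
    """|M| as a function of b, following (11.1.129)."""
    return 2.0 ** gamma * b * (eta - b * eps)

def optimal_message_count(gamma: float, eps: float, eta: float) -> float:
    """Closed form (11.1.130), obtained at b = eta / (2 * eps)."""
    return 2.0 ** gamma * eta ** 2 / (4 * eps)

def error_bound(gamma: float, eps: float, eta: float, c: float, size: float) -> float:
    """Right-hand side of (11.1.127) for a message set of the given size."""
    return (1 + c) * (eps - eta) + (2 + c + 1 / c) * size * 2.0 ** (-gamma)

gamma, eps, eta = 10.0, 0.1, 0.05           # illustrative values only
b_star = eta / (2 * eps)                    # stationary point of b * (eta - b * eps)
grid = [k / 10000 for k in range(1, 10000)]  # b ranges over (0, 1)
b_best = max(grid, key=lambda b: message_count(gamma, eps, eta, b))
c_star = b_star / (1 - b_star)              # invert b = c / (1 + c)
```

With these sample values, `b_star` is 0.25, the grid search returns the same maximizer, and both expressions for $|\mathcal{M}|$ give 6.4.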
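Thus, with $|\mathcal{M}|$ given by (11.1.130), we conclude that
\[ p^{*}_{\mathrm{err}}(\mathcal{P};\mathcal{N}) \leq \varepsilon, \tag{11.1.131} \]
and this proves the existence of an $(|\mathcal{M}|, \varepsilon)$ protocol with $|\mathcal{M}|$ given by (11.1.130).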
However, (11.1.130) holds for every state 𝜌 𝐴′ 𝐵′ , which means that we can take
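\begin{align}
\log_2|\mathcal{M}| &= \sup_{\rho_{A'B'}} \overline{I}_H^{\varepsilon-\eta}(B';B)_{\xi} - \log_2\frac{4\varepsilon}{\eta^{2}} \notag\\
&= \overline{I}_H^{\varepsilon-\eta}(\mathcal{N}) - \log_2\frac{4\varepsilon}{\eta^{2}}, \tag{11.1.132}
\end{align}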
and have (11.1.131) hold. This is precisely (11.1.108), and since 𝜀 ∈ (0, 1) and
𝜂 ∈ (0, 𝜀) are arbitrary, the proof is complete. ■
Let us take note of the following two facts from the proof of Proposition 11.8
given above:
1. Given a particular 𝜀 ∈ (0, 1) and an 𝜂 ∈ (0, 𝜀), we can construct a position-
based coding and sequential decoding protocol achieving a maximal error
probability of 𝑝 ∗err (P) ≤ 𝜀 by taking
\[ \log_2|\mathcal{M}| = \overline{I}_H^{\varepsilon-\eta}(B';B)_{\xi} - \log_2\frac{4\varepsilon}{\eta^{2}}, \tag{11.1.133} \]
Remark: The quantity $\overline{I}_{\alpha}(A;B)_{\rho}$ defined in the statement of Theorem 11.9 above is similar to
the quantity $I_{\alpha}(A;B)_{\rho}$ defined in (7.11.90), except that we do not perform an optimization over
states $\sigma_B$. The resulting channel quantity $\overline{I}_{\alpha}(\mathcal{N})$ is then similar to the quantity $I_{\alpha}(\mathcal{N})$ defined in
(7.11.89). The fact that it suffices to optimize over pure states $\psi_{RA}$ in $\overline{I}_{\alpha}(\mathcal{N})$, with the dimension
of $R$ the same as that of $A$, follows from arguments analogous to those presented in Section 7.11.
Proof: From Proposition 11.8, we know that for all 𝜀 ∈ (0, 1) and 𝜂 ∈ (0, 𝜀) there
exists an (|M|, 𝜀) entanglement-assisted classical communication protocol such
that
\[ \log_2|\mathcal{M}| = \overline{I}_H^{\varepsilon-\eta}(\mathcal{N}) - \log_2\frac{4\varepsilon}{\eta^{2}}. \tag{11.1.137} \]
Proposition 7.72 relates the hypothesis testing relative entropy to the Petz–Rényi
relative entropy according to
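\[ D_H^{\varepsilon}(\rho\|\sigma) \geq D_{\alpha}(\rho\|\sigma) + \frac{\alpha}{\alpha-1}\log_2\frac{1}{\varepsilon} \tag{11.1.138} \]
for all $\alpha \in (0,1)$, which implies that
\[ \overline{I}_H^{\varepsilon}(\mathcal{N}) \geq \overline{I}_{\alpha}(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\frac{1}{\varepsilon}. \tag{11.1.139} \]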
Combining this inequality with (11.1.137), we obtain the desired result. ■
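[Figure: entanglement-assisted classical communication over $n$ uses of the channel $\mathcal{N}$. Alice encodes the message $m \in \mathcal{M}$ with $\mathcal{E}$ acting on her share $A'$ of the state $\Psi_{A'B'}$, the systems $A_1, \ldots, A_n$ each pass through $\mathcal{N}$, and Bob decodes $B_1, \ldots, B_n$ together with $B'$ to obtain the estimate $\hat{m}$.]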
POVM elements as acting on 𝑛 systems instead of just one. In particular, the state
at the end of the protocol is
\[ \omega^{p}_{M\hat{M}} = (\mathcal{D}_{B^{n}B'\to\hat{M}}\circ\mathcal{N}^{\otimes n}_{A\to B}\circ\mathcal{E}_{M'A'\to A^{n}})(\Phi^{p}_{MM'}\otimes\Psi_{A'B'}), \tag{11.2.1} \]
where 𝑝 is the prior probability distribution over the set of messages M, the
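encoding channel $\mathcal{E}_{M'A'\to A^{n}}$ defines a set $\{\mathcal{E}^{m}_{A'\to A^{n}}\}_{m\in\mathcal{M}}$ of channels so that
\[ \mathcal{E}_{M'A'\to A^{n}}(\Phi^{p}_{MM'}\otimes\Psi_{A'B'}) = \sum_{m\in\mathcal{M}} p(m)\,|m\rangle\langle m|_{M}\otimes\mathcal{E}^{m}_{A'\to A^{n}}(\Psi_{A'B'}), \tag{11.2.2} \]
and the decoding channel $\mathcal{D}_{B^{n}B'\to\hat{M}}$, with associated POVM $\{\Lambda^{m}_{B^{n}B'}\}_{m\in\mathcal{M}}$, is defined
as
\[ \mathcal{D}_{B^{n}B'\to\hat{M}}(\tau_{B^{n}B'}) = \sum_{m\in\mathcal{M}} \operatorname{Tr}[\Lambda^{m}_{B^{n}B'}\,\tau_{B^{n}B'}]\,|m\rangle\langle m|_{\hat{M}}. \tag{11.2.3} \]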
Then, for a given code specified by the encoding and decoding channels, the
definitions of the message error probability of the code, the average error probability
of the code, and the maximal error probability of the code all follow analogously
from their definitions in (11.1.13), (11.1.14), and (11.1.15), respectively, from the
one-shot setting.
As we prove in Appendix A,
\[ R \text{ achievable rate} \iff \lim_{n\to\infty} \varepsilon^{*}_{\mathrm{EA}}(2^{n(R-\delta)}; \mathcal{N}^{\otimes n}) = 0 \quad \forall\, \delta > 0. \tag{11.2.11} \]
In other words, a rate 𝑅 is achievable if the optimal error probability for a sequence
of protocols with rate 𝑅 − 𝛿, 𝛿 > 0, vanishes as the number 𝑛 of uses of N increases.
In other words, a weak converse rate is a rate above which the optimal error
probability cannot be made to vanish in the limit of a large number of channel uses.
In other words, unlike the weak converse, in which the optimal error is required
to simply be bounded away from zero as the number 𝑛 of channel uses increases,
in order to have a strong converse rate the optimal error has to converge to one
as 𝑛 increases. By comparing (11.2.14) and (11.2.15), it is clear that every strong
converse rate is a weak converse rate.
\[ C_{\mathrm{EA}}(\mathcal{N}) \leq \widetilde{C}_{\mathrm{EA}}(\mathcal{N}) \tag{11.2.17} \]
for every quantum channel N. We can also write the strong converse entanglement-
assisted classical capacity as
\[ C_{\mathrm{EA}}(\mathcal{N}) = \widetilde{C}_{\mathrm{EA}}(\mathcal{N}) = I(\mathcal{N}), \tag{11.2.19} \]
encoding and decoding channels such that for all 𝜀 ∈ (0, 1] and sufficiently
large 𝑛, the encoding and decoding channels correspond to (𝑛, 2𝑛𝑟 , 𝜀) protocols,
as per Definition 11.10, with rates 𝑟 < 𝑅. Thus, if 𝑅 is an achievable rate, then,
for every error probability 𝜀, it is possible to find an 𝑛 large enough, along
with encoding and decoding channels, such that the resulting protocol has rate
arbitrarily close to 𝑅 and maximal error probability bounded from above by 𝜀.
The achievability part of the proof establishes that 𝐶EA (N) ≥ 𝐼 (N).
2. Strong Converse: We show that 𝐼 (N) is a strong converse rate, from which
it follows that $\widetilde{C}_{\mathrm{EA}}(\mathcal{N}) \leq I(\mathcal{N})$. In general, to show that $R \in \mathbb{R}^{+}$ is a strong
converse rate, we show that, given any shared entangled state Ψ𝐴′ 𝐵′ and
any encoding and decoding channels, for every rate 𝑟 > 𝑅, 𝜀 ∈ [0, 1), and
sufficiently large 𝑛, the communication protocol defined by the encoding and
decoding channels is not an (𝑛, 2𝑛𝑟 , 𝜀) protocol.
After showing the achievability and strong converse parts, we can use the
inequality in (11.2.17) to conclude that
We first establish in Section 11.2.1 that the rate 𝐼 (N) is achievable for entan-
glement-assisted classical communication over N. We then address the additivity of
the mutual information of a channel, in particular of the sandwiched Rényi mutual
information of a channel, in Section 11.2.2. Finally, we prove that 𝐼 (N) is a strong
converse rate in Section 11.2.3. This implies that 𝐼 (N) is a weak converse rate;
however, in Section 11.2.4, we provide an independent proof of this fact, as the
technique used in the proof is useful for alternate communication scenarios (besides
entanglement-assisted communication) for which a strong converse theorem is not
known to hold.
Proof: Fix 𝜀 ∈ (0, 1]. The inequality (11.2.21) holds for every channel N, which
means that it holds for N ⊗𝑛 . Applying the inequality in (11.2.21) to N ⊗𝑛 and
dividing both sides by 𝑛, we obtain
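\[ \frac{\log_2|\mathcal{M}|}{n} \geq \frac{1}{n}\overline{I}_{\alpha}(\mathcal{N}^{\otimes n}) + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{\varepsilon-\eta} - \frac{1}{n}\log_2\frac{4\varepsilon}{\eta^{2}} \tag{11.2.23} \]
for all $\alpha \in (0,1)$. By restricting the optimization in the definition of $\overline{I}_{\alpha}(\mathcal{N}^{\otimes n})$
to tensor-power states, we conclude that $\overline{I}_{\alpha}(\mathcal{N}^{\otimes n}) \geq n\,\overline{I}_{\alpha}(\mathcal{N})$. This follows from
the additivity of the Petz–Rényi relative entropy under tensor-product states (see
Proposition 7.23). So we obtain
\[ \frac{\log_2|\mathcal{M}|}{n} \geq \overline{I}_{\alpha}(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{\varepsilon-\eta} - \frac{1}{n}\log_2\frac{4\varepsilon}{\eta^{2}} \tag{11.2.24} \]
for all $\alpha \in (0,1)$. Letting $\eta = \frac{\varepsilon}{2}$, and using the fact that $\alpha - 1$ is negative for
$\alpha \in (0,1)$, this inequality becomes
\[ \frac{\log_2|\mathcal{M}|}{n} \geq \overline{I}_{\alpha}(\mathcal{N}) - \frac{1}{n(1-\alpha)}\log_2\frac{2}{\varepsilon} - \frac{3}{n} \tag{11.2.25} \]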
for all 𝛼 ∈ (0, 1). Since 𝜀 is arbitrary, we find that for all 𝜀 ∈ (0, 1], there exists an
(𝑛, |M|, 𝜀) protocol such that (11.2.22) is satisfied, as required. ■
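For readers who want the intermediate algebra, the step from (11.2.24) to (11.2.25) can be spelled out as follows. This is only a worked check of the substitution $\eta = \varepsilon/2$; it introduces no quantities beyond those in the proof.

```latex
\begin{align*}
\log_2\frac{1}{\varepsilon-\eta} &= \log_2\frac{2}{\varepsilon},
\qquad
\log_2\frac{4\varepsilon}{\eta^{2}} = \log_2\frac{16}{\varepsilon} = 3 + \log_2\frac{2}{\varepsilon},\\
\frac{\alpha}{n(\alpha-1)}\log_2\frac{2}{\varepsilon} - \frac{1}{n}\left(3 + \log_2\frac{2}{\varepsilon}\right)
&= -\frac{1}{n}\left(\frac{\alpha}{1-\alpha} + 1\right)\log_2\frac{2}{\varepsilon} - \frac{3}{n}\\
&= -\frac{1}{n(1-\alpha)}\log_2\frac{2}{\varepsilon} - \frac{3}{n}.
\end{align*}
```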
The inequality in (11.2.22) gives us, for every 𝜀 ∈ (0, 1] and 𝑛 ∈ N, a lower
bound on the size |M| of the message set that we can take for a corresponding
(𝑛, |M|, 𝜀) entanglement-assisted classical communication protocol defined by
position-based coding and sequential decoding. If instead we fix a particular
communication rate 𝑅 by letting |M| = 2𝑛𝑅 , then we can rearrange the inequality
in (11.2.22) to obtain an upper bound on the maximal error probability of the
corresponding (𝑛, 2𝑛𝑅 , 𝜀) entanglement-assisted classical communication protocol.
Specifically, we conclude that
\[ \varepsilon \leq 2\cdot 2^{-n(1-\alpha)\left(\overline{I}_{\alpha}(\mathcal{N}) - R - \frac{3}{n}\right)} \tag{11.2.26} \]
for all 𝛼 ∈ (0, 1).
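To get a feel for how quickly the bound in (11.2.26) decays with $n$, it can be evaluated directly. In the sketch below, `i_alpha` is a placeholder for the channel quantity appearing in (11.2.26) and must be supplied by the caller; the function name and the cap at 1 (a probability cannot exceed 1) are additions made here for convenience.

```python
def achievability_error_bound(n: int, rate: float, i_alpha: float, alpha: float) -> float:
    """Right-hand side of (11.2.26), capped at 1.

    i_alpha is an assumed, externally computed value of the Renyi channel
    quantity of order alpha in (0, 1); rate is the communication rate R.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    exponent = -n * (1 - alpha) * (i_alpha - rate - 3.0 / n)
    return min(1.0, 2.0 * 2.0 ** exponent)
```

For instance, with the hypothetical values `i_alpha = 1.0`, `rate = 0.5`, and `alpha = 0.5`, the bound is trivial (equal to 1) at `n = 10` but is already below $10^{-6}$ at `n = 100`.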
The inequality in (11.2.22) implies that
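\[ C_{\mathrm{EA}}^{n,\varepsilon}(\mathcal{N}) \geq \overline{I}_{\alpha}(\mathcal{N}) - \frac{1}{n(1-\alpha)}\log_2\frac{2}{\varepsilon} - \frac{3}{n} \tag{11.2.27} \]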
for all 𝜀 ∈ (0, 1] and 𝛼 ∈ (0, 1).
We can now use (11.2.22) to prove that the mutual information 𝐼 (N) is an
achievable rate for entanglement-assisted classical communication over N.
Now, making use of the inequality in (11.2.22) of Corollary 11.17, there exists
an (𝑛, |M|, 𝜀) protocol, with 𝑛 and 𝜀 chosen as above, such that
\[ \frac{\log_2|\mathcal{M}|}{n} \geq \overline{I}_{\alpha}(\mathcal{N}) - \frac{1}{n(1-\alpha)}\log_2\frac{2}{\varepsilon} - \frac{3}{n}. \tag{11.2.31} \]
We thus have $I(\mathcal{N}) - \delta \leq \frac{1}{n}\log_2|\mathcal{M}|$. By the fact stated immediately after
Definition 11.10, we conclude that there exists an (𝑛, 2𝑛(𝑅−𝛿) , 𝜀) entanglement-
assisted classical communication protocol with 𝑅 = 𝐼 (N) for all sufficiently large 𝑛
such that (11.2.30) holds. Since 𝜀 and 𝛿 are arbitrary, we have that for all 𝜀 ∈ (0, 1],
𝛿 > 0, and sufficiently large 𝑛, there exists an (𝑛, 2𝑛(𝐼 (N)−𝛿) , 𝜀) entanglement-
assisted classical communication protocol. This means that 𝐼 (N) is an achievable
rate, and thus that 𝐶EA (N) ≥ 𝐼 (N). See Appendix 11.C for a discussion of a
different way of seeing the achievability proof.
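[Figure: $n$-use protocol diagram in which the channel uses are replaced by a single box labeled $\mathcal{P}_{\sigma_{B^{n}}}$. Labels: Alice, Bob, message $m \in \mathcal{M}$, encoding $\mathcal{E}$, shared state $\Psi_{A'B'}$, input systems $A_1, \ldots, A_n$, output systems $B_1, \ldots, B_n$ and $B'$, estimate $\hat{m}$.]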
Proof: Since the inequalities (11.2.35) and (11.2.36) of Theorem 11.6 hold for
every channel N, they hold for the channel N ⊗𝑛 . Therefore, applying (11.2.35) and
(11.2.36) to N ⊗𝑛 and dividing both sides by 𝑛, we immediately obtain the desired
result. ■
The inequalities in the corollary above give us, for all 𝜀 ∈ [0, 1) and 𝑛 ∈ N, an
upper bound on the size |M| of the message set we can take for every corresponding
(𝑛, |M|, 𝜀) entanglement-assisted classical communication protocol. If instead
we fix a particular communication rate 𝑅 by letting |M| = 2𝑛𝑅 , then we can
Proof: We first recall from (7.11.102) that the mutual information 𝐼 (N) of the
channel N is defined as
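\begin{align}
I(\mathcal{N}) &= \sup_{\psi_{RA}} I(R;B)_{\omega} \notag\\
&= \sup_{\psi_{RA}} D(\mathcal{N}_{A\to B}(\psi_{RA})\,\|\,\psi_{R}\otimes\mathcal{N}_{A\to B}(\psi_{A})), \tag{11.2.41}
\end{align}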
To prove the reverse inequality, let $\rho_{RB_1B_2} := ((\mathcal{N}_1)_{A_1\to B_1}\otimes(\mathcal{N}_2)_{A_2\to B_2})(\psi_{RA_1A_2})$.
Then, using the formula in (7.1.8) for the mutual information in
terms of the quantum entropy, it is straightforward to verify that
Now, Klein’s inequality in Proposition 7.3 implies that the mutual information is
non-negative. Using this fact on the last term in (11.2.53), we find that
Therefore,
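\[ I(R;B_1)_{\rho} \leq \sup_{\rho_{RA_1}} I(R;B_1)_{\tau} = I(\mathcal{N}_1), \tag{11.2.56} \]
we get that
\[ I(RB_1;B_2)_{\rho} \leq I(\mathcal{N}_2). \tag{11.2.58} \]
Therefore,
\[ I(R;B_1B_2)_{\rho} \leq I(\mathcal{N}_1) + I(\mathcal{N}_2). \tag{11.2.59} \]
Since the state $\psi_{RA_1A_2}$ that we started with is arbitrary, we obtain
\[ I(\mathcal{N}_1\otimes\mathcal{N}_2) = \sup_{\psi_{RA_1A_2}} I(R;B_1B_2)_{\rho} \leq I(\mathcal{N}_1) + I(\mathcal{N}_2), \tag{11.2.60} \]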
as required. Combining this inequality with that in (11.2.44), we have the required
equality, 𝐼 (N1 ⊗ N2 ) = 𝐼 (N1 ) + 𝐼 (N2 ). ■
Lemma 11.20
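For every bipartite state $\rho_{AB}$ and $\alpha > 1$, the sandwiched Rényi mutual
information $\widetilde{I}_{\alpha}(A;B)_{\rho}$ can be written as
\[ \widetilde{I}_{\alpha}(A;B)_{\rho} = \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_C}\left\|\operatorname{Tr}_{AC}\!\left[\left(\rho_A^{\frac{1-\alpha}{\alpha}}\otimes\tau_C^{\frac{\alpha-1}{\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\right]\right\|_{\frac{\alpha}{2\alpha-1}}, \tag{11.2.66} \]
where $\mathcal{S}^{(\alpha)}_{\sigma_B}(\cdot) := \sigma_B^{\frac{1-\alpha}{2\alpha}}(\cdot)\,\sigma_B^{\frac{1-\alpha}{2\alpha}}$ and
\[ \|\mathcal{M}\|_{\mathrm{CB},\,1\to\alpha} := \sup_{\substack{Y_R > 0,\\ \operatorname{Tr}[Y_R]\leq 1}}\left\|\mathcal{M}_{A\to B}\!\left(Y_R^{\frac{1}{2\alpha}}\,|\Gamma\rangle\langle\Gamma|_{RA}\,Y_R^{\frac{1}{2\alpha}}\right)\right\|_{\alpha}, \tag{11.2.68} \]
\[ \widetilde{I}_{\alpha}(A_1A_2;B_1B_2)_{\xi\otimes\omega} = \widetilde{I}_{\alpha}(A_1;B_1)_{\xi} + \widetilde{I}_{\alpha}(A_2;B_2)_{\omega}. \tag{11.2.69} \]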
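\[ \widetilde{I}_{\alpha}(A_1A_2;B_1B_2)_{\xi\otimes\omega} = \inf_{\sigma_{B_1B_2}} \widetilde{D}_{\alpha}(\xi_{A_1B_1}\otimes\omega_{A_2B_2}\,\|\,\xi_{A_1}\otimes\omega_{A_2}\otimes\sigma_{B_1B_2}). \tag{11.2.70} \]
If we restrict the optimization to product states $\sigma^{1}_{B_1}\otimes\sigma^{2}_{B_2}$, then we find that
\begin{align}
\widetilde{I}_{\alpha}(A_1A_2;B_1B_2)_{\xi\otimes\omega}
&\leq \inf_{\sigma^{1}_{B_1}\otimes\sigma^{2}_{B_2}} \widetilde{D}_{\alpha}(\xi_{A_1B_1}\otimes\omega_{A_2B_2}\,\|\,\xi_{A_1}\otimes\omega_{A_2}\otimes\sigma^{1}_{B_1}\otimes\sigma^{2}_{B_2}) \tag{11.2.71}\\
&= \inf_{\sigma^{1}_{B_1},\,\sigma^{2}_{B_2}} \left\{\widetilde{D}_{\alpha}(\xi_{A_1B_1}\,\|\,\xi_{A_1}\otimes\sigma^{1}_{B_1}) + \widetilde{D}_{\alpha}(\omega_{A_2B_2}\,\|\,\omega_{A_2}\otimes\sigma^{2}_{B_2})\right\} \tag{11.2.72}\\
&= \widetilde{I}_{\alpha}(A_1;B_1)_{\xi} + \widetilde{I}_{\alpha}(A_2;B_2)_{\omega}. \tag{11.2.73}
\end{align}
So
\[ \widetilde{I}_{\alpha}(A_1A_2;B_1B_2)_{\xi\otimes\omega} \leq \widetilde{I}_{\alpha}(A_1;B_1)_{\xi} + \widetilde{I}_{\alpha}(A_2;B_2)_{\omega}. \tag{11.2.74} \]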
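\begin{align}
&\widetilde{I}_{\alpha}(A_1A_2;B_1B_2)_{\xi\otimes\omega} \notag\\
&= \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_{C_1C_2}}\left\|\operatorname{Tr}_{A_1A_2C_1C_2}\!\left[\left((\xi_{A_1}\otimes\omega_{A_2})^{\frac{1-\alpha}{\alpha}}\otimes\tau_{C_1C_2}^{\frac{\alpha-1}{\alpha}}\right)\left(|\psi_1\rangle\langle\psi_1|_{A_1B_1C_1}\otimes|\psi_2\rangle\langle\psi_2|_{A_2B_2C_2}\right)\right]\right\|_{\frac{\alpha}{2\alpha-1}} \tag{11.2.75}\\
&\geq \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_{C_1}\otimes\tau_{C_2}}\left\|\operatorname{Tr}_{A_1A_2C_1C_2}\!\left[\left(\xi_{A_1}^{\frac{1-\alpha}{\alpha}}\otimes\omega_{A_2}^{\frac{1-\alpha}{\alpha}}\otimes\tau_{C_1}^{\frac{\alpha-1}{\alpha}}\otimes\tau_{C_2}^{\frac{\alpha-1}{\alpha}}\right)\left(|\psi_1\rangle\langle\psi_1|_{A_1B_1C_1}\otimes|\psi_2\rangle\langle\psi_2|_{A_2B_2C_2}\right)\right]\right\|_{\frac{\alpha}{2\alpha-1}} \tag{11.2.76}\\
&= \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_{C_1},\,\tau_{C_2}}\left\|\operatorname{Tr}_{A_1C_1}\!\left[\left(\xi_{A_1}^{\frac{1-\alpha}{\alpha}}\otimes\tau_{C_1}^{\frac{\alpha-1}{\alpha}}\right)|\psi_1\rangle\langle\psi_1|_{A_1B_1C_1}\right]\otimes\operatorname{Tr}_{A_2C_2}\!\left[\left(\omega_{A_2}^{\frac{1-\alpha}{\alpha}}\otimes\tau_{C_2}^{\frac{\alpha-1}{\alpha}}\right)|\psi_2\rangle\langle\psi_2|_{A_2B_2C_2}\right]\right\|_{\frac{\alpha}{2\alpha-1}} \tag{11.2.77}\\
&= \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_{C_1},\,\tau_{C_2}}\left\{\left\|\operatorname{Tr}_{A_1C_1}\!\left[\left(\xi_{A_1}^{\frac{1-\alpha}{\alpha}}\otimes\tau_{C_1}^{\frac{\alpha-1}{\alpha}}\right)|\psi_1\rangle\langle\psi_1|_{A_1B_1C_1}\right]\right\|_{\frac{\alpha}{2\alpha-1}}\times\left\|\operatorname{Tr}_{A_2C_2}\!\left[\left(\omega_{A_2}^{\frac{1-\alpha}{\alpha}}\otimes\tau_{C_2}^{\frac{\alpha-1}{\alpha}}\right)|\psi_2\rangle\langle\psi_2|_{A_2B_2C_2}\right]\right\|_{\frac{\alpha}{2\alpha-1}}\right\} \tag{11.2.78}\\
&= \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_{C_1}}\left\|\operatorname{Tr}_{A_1C_1}\!\left[\left(\xi_{A_1}^{\frac{1-\alpha}{\alpha}}\otimes\tau_{C_1}^{\frac{\alpha-1}{\alpha}}\right)|\psi_1\rangle\langle\psi_1|_{A_1B_1C_1}\right]\right\|_{\frac{\alpha}{2\alpha-1}} \notag\\
&\quad + \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_{C_2}}\left\|\operatorname{Tr}_{A_2C_2}\!\left[\left(\omega_{A_2}^{\frac{1-\alpha}{\alpha}}\otimes\tau_{C_2}^{\frac{\alpha-1}{\alpha}}\right)|\psi_2\rangle\langle\psi_2|_{A_2B_2C_2}\right]\right\|_{\frac{\alpha}{2\alpha-1}} \tag{11.2.79}\\
&= \widetilde{I}_{\alpha}(A_1;B_1)_{\xi} + \widetilde{I}_{\alpha}(A_2;B_2)_{\omega}. \tag{11.2.80}
\end{align}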
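\[ \widetilde{I}_{\alpha}(\mathcal{N}_1\otimes\mathcal{N}_2) = \widetilde{I}_{\alpha}(\mathcal{N}_1) + \widetilde{I}_{\alpha}(\mathcal{N}_2). \tag{11.2.81} \]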
where 𝜔 𝑅𝐵 B N 𝐴→𝐵 (𝜓 𝑅 𝐴 ), and the supremum is taken over every pure state 𝜓 𝑅 𝐴 ,
with 𝑅 having the same dimension as 𝐴. The superadditivity of the sandwiched
Rényi mutual information of a channel, namely,
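\[ \widetilde{I}_{\alpha}(\mathcal{N}_1\otimes\mathcal{N}_2) \geq \widetilde{I}_{\alpha}(\mathcal{N}_1) + \widetilde{I}_{\alpha}(\mathcal{N}_2) \tag{11.2.83} \]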
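\begin{align}
\widetilde{I}_{\alpha}(\mathcal{N}_1\otimes\mathcal{N}_2)
&= \frac{\alpha}{\alpha-1}\inf_{\sigma_{B_1B_2}}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_{B_1B_2}}\circ(\mathcal{N}_1\otimes\mathcal{N}_2)\right\|_{\mathrm{CB},\,1\to\alpha} \tag{11.2.84}\\
&\leq \frac{\alpha}{\alpha-1}\inf_{\sigma^{1}_{B_1}\otimes\sigma^{2}_{B_2}}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma^{1}_{B_1}\otimes\sigma^{2}_{B_2}}\circ(\mathcal{N}_1\otimes\mathcal{N}_2)\right\|_{\mathrm{CB},\,1\to\alpha} \tag{11.2.85}\\
&= \frac{\alpha}{\alpha-1}\inf_{\sigma^{1}_{B_1},\,\sigma^{2}_{B_2}}\log_2\left\|\left(\mathcal{S}^{(\alpha)}_{\sigma^{1}_{B_1}}\circ\mathcal{N}_1\right)\otimes\left(\mathcal{S}^{(\alpha)}_{\sigma^{2}_{B_2}}\circ\mathcal{N}_2\right)\right\|_{\mathrm{CB},\,1\to\alpha}, \tag{11.2.86}
\end{align}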
Now, consider that the norm ∥·∥ CB, 1→𝛼 is multiplicative with respect to tensor
products of completely positive maps, i.e.,
∥M1 ⊗ M2 ∥ CB, 1→𝛼 = ∥M1 ∥ CB, 1→𝛼 ∥M2 ∥ CB, 1→𝛼 (11.2.88)
for every two completely positive maps M1 , M2 and all 𝛼 > 1 (see Appendix 11.F
for a proof). Using this, we find that
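\begin{align}
\widetilde{I}_{\alpha}(\mathcal{N}_1\otimes\mathcal{N}_2)
&\leq \frac{\alpha}{\alpha-1}\inf_{\sigma^{1}_{B_1},\,\sigma^{2}_{B_2}}\log_2\left\|\left(\mathcal{S}^{(\alpha)}_{\sigma^{1}_{B_1}}\circ\mathcal{N}_1\right)\otimes\left(\mathcal{S}^{(\alpha)}_{\sigma^{2}_{B_2}}\circ\mathcal{N}_2\right)\right\|_{\mathrm{CB},\,1\to\alpha} \tag{11.2.89}\\
&= \frac{\alpha}{\alpha-1}\inf_{\sigma^{1}_{B_1}}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma^{1}_{B_1}}\circ\mathcal{N}_1\right\|_{\mathrm{CB},\,1\to\alpha} + \frac{\alpha}{\alpha-1}\inf_{\sigma^{2}_{B_2}}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma^{2}_{B_2}}\circ\mathcal{N}_2\right\|_{\mathrm{CB},\,1\to\alpha} \tag{11.2.90}\\
&= \widetilde{I}_{\alpha}(\mathcal{N}_1) + \widetilde{I}_{\alpha}(\mathcal{N}_2). \tag{11.2.91}
\end{align}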
Note that the additivity of the mutual information of a channel, i.e., Theo-
rem 11.19, follows straightforwardly from the theorem above by taking the limit
𝛼 → 1 (see Appendix 11.B for a proof).
Using the additivity of the sandwiched Rényi mutual information of a channel
(Theorem 11.21), the inequality in (11.2.38) can be written as
\[ \frac{1}{n}\log_2|\mathcal{M}| \leq \widetilde{I}_{\alpha}(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{1-\varepsilon} \tag{11.2.92} \]
With the inequality in (11.2.93) in hand, we can now prove that the mutual
information 𝐼 (N) is a strong converse rate for entanglement-assisted classical
communication over $\mathcal{N}$ and establish that $\widetilde{C}_{\mathrm{EA}}(\mathcal{N}) = I(\mathcal{N})$.
\[ \delta > \delta_1 + \delta_2 =: \delta'. \tag{11.2.94} \]
Now, with the values of 𝑛 and 𝜀 as above, every (𝑛, |M|, 𝜀) entanglement-
assisted classical communication protocol satisfies (11.2.92). Rearranging the
right-hand side of this inequality, and using (11.2.94)–(11.2.96), we obtain
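\begin{align}
\frac{\log_2|\mathcal{M}|}{n} &\leq I(\mathcal{N}) + \widetilde{I}_{\alpha}(\mathcal{N}) - I(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{1-\varepsilon} \tag{11.2.97}\\
&\leq I(\mathcal{N}) + \delta_1 + \delta_2 \tag{11.2.98}\\
&= I(\mathcal{N}) + \delta' \tag{11.2.99}\\
&< I(\mathcal{N}) + \delta. \tag{11.2.100}
\end{align}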
So we have that $I(\mathcal{N}) + \delta > \frac{\log_2|\mathcal{M}|}{n}$ for all $(n, |\mathcal{M}|, \varepsilon)$ entanglement-assisted classical
communication protocols and sufficiently large 𝑛. Due to this strict inequality, it
follows that there cannot exist an (𝑛, 2𝑛(𝐼 (N)+𝛿) , 𝜀) entanglement-assisted classical
communication protocol for all sufficiently large 𝑛 such that (11.2.96) holds, for
if it did there would exist a set M such that |M| = 2𝑛(𝐼 (N)+𝛿) , which we have
just seen is not possible. Since 𝜀 and 𝛿 are arbitrary, we conclude that for all
𝜀 ∈ [0, 1), 𝛿 > 0, and sufficiently large 𝑛, there does not exist an (𝑛, 2𝑛(𝐼 (N)+𝛿) , 𝜀)
entanglement-assisted classical communication protocol. This means that 𝐼 (N) is
a strong converse rate, and thus that $\widetilde{C}_{\mathrm{EA}}(\mathcal{N}) \leq I(\mathcal{N})$. See Appendix 11.G for a
different way of understanding the strong converse.
We now conclude Section 11.2 by providing an independent proof of the fact that
the mutual information 𝐼 (N) of a channel N is a weak converse rate.3
Proof: Suppose that 𝑅 is an achievable rate. Then, by definition, for all 𝜀 ∈ (0, 1],
𝛿 > 0, and sufficiently large 𝑛, there exists an (𝑛, 2𝑛(𝑅−𝛿) , 𝜀) entanglement-assisted
classical communication protocol over N. For all such protocols, the inequality
(11.2.61) holds by Corollary 11.18 and the additivity of the mutual information of
a channel, i.e.,
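\[ R - \delta \leq \frac{1}{1-\varepsilon}\left(I(\mathcal{N}) + \frac{1}{n}h_2(\varepsilon)\right). \tag{11.2.101} \]
Since this bound holds for all sufficiently large $n$, it holds in the limit $n \to \infty$, so
that
\[ R \leq \frac{1}{1-\varepsilon}I(\mathcal{N}) + \delta. \tag{11.2.102} \]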
3 Recall that any strong converse rate is also a weak converse rate, so that by the proof of the
strong converse part of Theorem 11.16 we can immediately conclude that 𝐼 (N) is a weak converse
rate.
Then, since this inequality holds for all 𝜀 ∈ (0, 1] and 𝛿 > 0, we obtain
\[ R \leq \lim_{\varepsilon,\delta\to 0}\left\{\frac{1}{1-\varepsilon}I(\mathcal{N}) + \delta\right\} = I(\mathcal{N}). \tag{11.2.103} \]
11.3 Examples
In this section, we determine the entanglement-assisted classical capacity of some
of the channels that we introduced in Chapter 4. Recall that Theorem 11.16 states
that the entanglement-assisted classical capacity 𝐶EA (N) is given by the mutual
information of the channel N, i.e.,
where 𝜔 𝑅𝐵 = N 𝐴→𝐵 (𝜓 𝑅 𝐴 ) and the optimization is over every pure state 𝜓 𝑅 𝐴 , with
the dimension of 𝑅 the same as the dimension of the input system 𝐴 of the channel.
The mutual information 𝐼 (𝑅; 𝐵)𝜔 can be calculated using either the quantum relative
entropy or the quantum entropy via
Let us start with covariant channels. Recall from Definition 4.18 that a channel N is
covariant with respect to a group 𝐺 if there exist projective unitary representations
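[Figure 11.6: plot of the capacity $C_E$ (bits, vertical axis from 0 to 2) against $p \in [0, 1]$ (horizontal axis), with curves labeled $\mathcal{D}_p$, $\mathcal{E}_p$, and $\mathcal{A}_p$.]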
$\{U_A^{g}\}_{g\in G}$ and $\{V_B^{g}\}_{g\in G}$ such that
for all 𝜌 𝐴 , where 𝑑 is the dimension of the input space of the channel N. Such
channels are called irreducibly covariant.
Let us now recall Proposition 7.86, which tells us that the generalized mutual
information for every covariant channel is given as follows:
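\[ \boldsymbol{I}(\mathcal{N}) = \sup_{\phi_{RA}}\{\boldsymbol{I}(R;B)_{\omega} : \phi_A = \mathcal{T}_G(\phi_A)\}, \tag{11.3.6} \]
where $\omega_{RB} = \mathcal{N}_{A\to B}(\phi_{RA})$ and $\mathcal{T}_G(\cdot) := \frac{1}{|G|}\sum_{g\in G}U^{g}(\cdot)U^{g\dagger}$. In other words, for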
covariant channels, it suffices to optimize over pure states 𝜙 𝑅 𝐴 for which the reduced
state 𝜙 𝐴 is invariant under the channel T𝐺 . For irreducibly covariant channels, the
expression in (11.3.6) simplifies to
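\[ \boldsymbol{I}(\mathcal{N}) = \boldsymbol{I}(A;B)_{\rho^{\mathcal{N}}}, \tag{11.3.7} \]
where $\rho^{\mathcal{N}}_{AB} = \mathcal{N}_{A'\to B}(\Phi_{AA'})$ is the Choi state of $\mathcal{N}$.
where $\omega_{RB} = \mathcal{N}_{A\to B}(\phi_{RA})$. If the representation $\{U_A^{g}\}_{g\in G}$ is irreducible,
then the entanglement-assisted classical capacity of $\mathcal{N}$ is given by the mutual
information of its Choi state $\rho^{\mathcal{N}}_{AB}$, i.e.,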
This means that the operators {1, 𝑋, 𝑌 , 𝑍 } satisfy the property in (11.3.5).4 By
Theorem 11.24, we thus have that the entanglement-assisted classical capacity of
the depolarizing channel is simply the mutual information of its Choi state.
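The property invoked here can be checked numerically for a qubit. The sketch below assumes that (11.3.5) is the usual irreducibility condition, namely that the uniform average over the representation maps every input $\rho$ to $\operatorname{Tr}[\rho]\,\mathbb{1}/d$; the function name `pauli_twirl` and the test matrix are illustrative.

```python
import numpy as np

# The four operators {1, X, Y, Z} acting on a single qubit.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_twirl(rho: np.ndarray) -> np.ndarray:
    """Uniform average of U rho U^dagger over U in {1, X, Y, Z}."""
    return sum(U @ rho @ U.conj().T for U in (I2, X, Y, Z)) / 4

# An arbitrary (assumed) qubit density matrix used only as a test input.
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
twirled = pauli_twirl(rho)
```

For this input, `twirled` equals the maximally mixed state $\mathbb{1}/2$ up to floating-point error, since the off-diagonal terms cancel and the diagonal terms average out.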
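It is straightforward to see that the Choi state $\rho^{\mathcal{D}_p}_{AB}$ of the depolarizing channel is
\[ \rho^{\mathcal{D}_p}_{AB} = (1-p)|\Phi^{+}\rangle\langle\Phi^{+}|_{AB} + \frac{p}{3}\left(|\Psi^{+}\rangle\langle\Psi^{+}|_{AB} + |\Psi^{-}\rangle\langle\Psi^{-}|_{AB} + |\Phi^{-}\rangle\langle\Phi^{-}|_{AB}\right). \tag{11.3.15} \]
Since $H(A)_{\rho^{\mathcal{D}_p}} = H(B)_{\rho^{\mathcal{D}_p}} = \log_2(2) = 1$ and
\[ H(AB)_{\rho^{\mathcal{D}_p}} = -(1-p)\log_2(1-p) - p\log_2\frac{p}{3}, \tag{11.3.16} \]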
we find that
for all 𝑝 ∈ [0, 1]. See Figure 11.6 above for a plot of the capacity.
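The entropies above can be reproduced numerically from the Choi state in (11.3.15). The sketch below builds that state, computes $H(A) + H(B) - H(AB)$ by diagonalization, and compares the result with the expression $2 + (1-p)\log_2(1-p) + p\log_2\frac{p}{3}$ that follows from combining $H(A) = H(B) = 1$ with (11.3.16). The helper names are illustrative and not part of the text.

```python
import numpy as np

def depolarizing_choi(p: float) -> np.ndarray:
    """Bell-diagonal Choi state of the qubit depolarizing channel, as in (11.3.15)."""
    s = 1 / np.sqrt(2)
    phi_plus = s * np.array([1, 0, 0, 1.0])
    phi_minus = s * np.array([1, 0, 0, -1.0])
    psi_plus = s * np.array([0, 1, 1, 0.0])
    psi_minus = s * np.array([0, 1, -1, 0.0])
    proj = lambda v: np.outer(v, v)
    return (1 - p) * proj(phi_plus) + (p / 3) * (
        proj(psi_plus) + proj(psi_minus) + proj(phi_minus))

def entropy(rho: np.ndarray) -> float:
    """Von Neumann entropy in bits; numerically zero eigenvalues are dropped."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def mutual_information(rho_ab: np.ndarray) -> float:
    """I(A;B) = H(A) + H(B) - H(AB) for a two-qubit state."""
    r = rho_ab.reshape(2, 2, 2, 2)           # indices: a, b, a', b'
    rho_a = np.einsum("abcb->ac", r)         # trace out B
    rho_b = np.einsum("abad->bd", r)         # trace out A
    return entropy(rho_a) + entropy(rho_b) - entropy(rho_ab)

def closed_form(p: float) -> float:
    """2 + (1-p) log2(1-p) + p log2(p/3), with 0 log 0 = 0."""
    total = 2.0
    if p < 1:
        total += (1 - p) * np.log2(1 - p)
    if p > 0:
        total += p * np.log2(p / 3)
    return float(total)
```

As a check on the two endpoints of interest: at $p = 0$ both expressions give 2 bits, and at $p = 3/4$ (where the Choi state is maximally mixed) both give 0.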
Let us also briefly analyze the lower and upper bounds obtained in Corollar-
ies 11.17 and 11.18, respectively. Specifically, let us consider the following bounds
on the maximal error probability that results from these bounds, i.e.,
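\[ \varepsilon \leq 2\cdot 2^{-n(1-\alpha)\left(\overline{I}_{\alpha}(\mathcal{D}_p) - R - \frac{3}{n}\right)}, \tag{11.3.20} \]
\[ \varepsilon \geq 1 - 2^{-n\left(\frac{\alpha-1}{\alpha}\right)\left(R - \widetilde{I}_{\alpha}(\mathcal{D}_p)\right)}, \tag{11.3.21} \]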
which are rearrangements of (11.2.22) and (11.2.38) and are discussed further in
Appendices 11.C and 11.G, respectively. Now, by (11.3.7), we have
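\[ \widetilde{I}_{\alpha}(\mathcal{D}_p) = \widetilde{I}_{\alpha}(R;B)_{\rho^{\mathcal{D}_p}} \leq \overline{\widetilde{I}}_{\alpha}(R;B)_{\rho^{\mathcal{D}_p}}, \tag{11.3.22} \]
where
\[ \overline{\widetilde{I}}_{\alpha}(A;B)_{\rho} := \widetilde{D}_{\alpha}(\rho_{AB}\,\|\,\rho_A\otimes\rho_B). \tag{11.3.23} \]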
4 The operators $\{\mathbb{1}, X, Y, Z\}$ form a projective unitary representation of the group $\mathbb{Z}_2\times\mathbb{Z}_2 =
\{(0,0), (0,1), (1,0), (1,1)\}$, where $\mathbb{Z}_2$ is the group consisting of the set $\{0,1\}$ with addition modulo
two. Specifically, we have $U^{(0,0)} = \mathbb{1}$, $U^{(0,1)} = X$, $U^{(1,0)} = Z$, and $U^{(1,1)} = Y$.
[Two-panel plot; vertical axis of each panel: Error, ε.]
Figure 11.7: Plot of the error bounds in (11.3.24) and (11.3.25) for the
depolarizing channel D 𝑝 with 𝑝 = 0.4. By increasing the number 𝑛 of channel
uses, it is possible to communicate at rates closer to the capacity (indicated by
the black vertical line) with vanishing error probability. Furthermore, for every
rate above the capacity, as 𝑛 increases, the error probability approaches one at
an exponential rate, consistent with the fact that the mutual information 𝐼 (D 𝑝 )
is a strong converse rate.
For simplicity, let us use the quantity in (11.3.22), which does not involve an
optimization over states 𝜎𝐵 , to place a lower bound on 𝜀, so that
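\[ \varepsilon \ge 1 - 2^{-n\left(\frac{\alpha-1}{\alpha}\right)\left(R - \overline{\widetilde{I}}_\alpha(R;B)_\omega\right)}, \tag{11.3.24} \]
where $\omega_{RB} = \rho^{\mathcal{D}_p}_{RB}$ is the Choi state of $\mathcal{D}_p$. Similarly, for simplicity, let us take the
quantity $\overline{I}_\alpha(\mathcal{N})$, which by definition involves an optimization over all pure states
$\psi_{RA}$, and let $\psi_{RA}$ be the maximally entangled state $\Phi^+_{RA}$. So we take the upper
bound on the error probability to be
\[ \varepsilon \le 2\cdot 2^{-n(1-\alpha)\left(\overline{I}_\alpha(R;B)_\omega - R - \frac{3}{n}\right)}, \tag{11.3.25} \]
where $\omega_{RB} = \rho^{\mathcal{D}_p}_{RB}$ is the Choi state of $\mathcal{D}_p$. Then, for $p = 0.4$, we plot in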
Figure 11.7 the bounds in (11.3.24) and (11.3.25) (with 𝛼 = 1.0001 and 𝛼 = 0.9999,
respectively) to obtain plots that are analogous to the generic plot in Figure 11.8 in
Appendix 11.G. As portrayed in Figure 11.8, we indeed see that, as the number
𝑛 of channel uses increases, the capacity 𝐶EA (D 𝑝 ) becomes a clearer dividing
point between reliable communication—with nearly-vanishing error probability—
and unreliable communication—with error probability approaching one at an
exponential rate.
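The bounds in (11.3.24) and (11.3.25) can be evaluated with a few lines of code. The sketch below rests on one observation that should be checked against the definitions in the text: both marginals of the Choi state are maximally mixed, so $\rho_R\otimes\rho_B$ is proportional to the identity and commutes with the Choi state. In that commuting case the Petz and sandwiched Rényi relative entropies both reduce to the classical Rényi divergence between the eigenvalues $(1-p, p/3, p/3, p/3)$ and the uniform distribution. The function names, and the clipping of the bounds to $[0, 1]$, are illustrative additions.

```python
import numpy as np

def renyi_mi_depolarizing(p, alpha):
    """Renyi divergence (in bits) of the Choi-state spectrum from the uniform
    distribution; assumed to equal both unoptimized Renyi mutual informations."""
    lam = np.array([1 - p, p / 3, p / 3, p / 3])
    lam = lam[lam > 0]
    return float(np.log2(np.sum(lam**alpha * 0.25**(1 - alpha))) / (alpha - 1))

def capacity(p):
    """Mutual information of the Choi state (the alpha -> 1 limit), for 0 < p < 1."""
    return 2 + (1 - p) * np.log2(1 - p) + p * np.log2(p / 3)

def error_upper_bound(p, rate, n, alpha=0.9999):
    """Right-hand side of (11.3.25), clipped to at most 1."""
    exponent = -n * (1 - alpha) * (renyi_mi_depolarizing(p, alpha) - rate - 3 / n)
    return min(1.0, float(2 * np.exp2(exponent)))

def error_lower_bound(p, rate, n, alpha=1.0001):
    """Right-hand side of (11.3.24), clipped to at least 0."""
    exponent = -n * ((alpha - 1) / alpha) * (rate - renyi_mi_depolarizing(p, alpha))
    return max(0.0, float(1 - np.exp2(exponent)))

p = 0.4
print("capacity:", capacity(p))
for n in (10**4, 10**5, 10**6):
    print(n, error_upper_bound(p, 0.2, n), error_lower_bound(p, 0.6, n))
```

Because $\alpha$ is taken very close to 1, the exponents scale like $n\,|1-\alpha|$, so the bounds only become informative for large $n$. This is consistent with the sharpening of the transition described above as $n$ grows.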
\[ C_{\mathrm{EA}}(\mathcal{D}^{(d)}_p) = 2\log_2 d - h_2(p) - p\log_2(d^2 - 1). \tag{11.3.28} \]
for 𝑝 ∈ [0, 1], where |𝑒⟩ is a state that is orthogonal to all states in the input
qubit system. Recall that if we let the output space simply be a qutrit system with
the orthonormal basis {|0⟩, |1⟩, |2⟩}, then the input qubit space can be naturally
embedded into the subspace of the qutrit system spanned by |0⟩ and |1⟩, so that we
can let the erasure state simply be |2⟩. This means that we can write the action of
the erasure channel as
Observe that, like the depolarizing channel, the erasure channel is covariant
with respect to the group Z2 × Z2 , with the representation {1, 𝑋, 𝑌 , 𝑍 } on the input
qubit space and the representation {1 ⊕ |2⟩⟨2|, 𝑋 ⊕ |2⟩⟨2|, 𝑌 ⊕ |2⟩⟨2|, 𝑍 ⊕ |2⟩⟨2|}
on the output space. Then, by Theorem 11.24, the entanglement-assisted classical
capacity of the erasure channel is equal to the mutual information of its Choi state.
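The Choi state $\rho^{\mathcal{E}_p}_{AB}$ of the erasure channel is
\[ \rho^{\mathcal{E}_p}_{AB} = (1-p)|\Phi^+\rangle\langle\Phi^+|_{AB} + p\,\frac{\mathbb{1}_A}{2}\otimes|2\rangle\langle 2|. \tag{11.3.31} \]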
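It is straightforward to verify that
\begin{align*}
H(A)_{\rho^{\mathcal{E}_p}} &= 1, \tag{11.3.32}\\
H(B)_{\rho^{\mathcal{E}_p}} &= -(1-p)\log_2\frac{1-p}{2} - p\log_2 p, \tag{11.3.33}\\
H(AB)_{\rho^{\mathcal{E}_p}} &= -(1-p)\log_2(1-p) - p\log_2\frac{p}{2}, \tag{11.3.34}
\end{align*}
so that
\[ C_{\mathrm{EA}}(\mathcal{E}_p) = I(A;B)_{\rho^{\mathcal{E}_p}} = 2(1-p). \tag{11.3.35} \]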
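The entropies in (11.3.32)–(11.3.34) can be cross-checked numerically. The sketch below builds the qubit-qutrit Choi state of (11.3.31), with the erasure flag taken to be $|2\rangle$ as in the embedding described above, and computes the mutual information from partial traces. The helper names are illustrative and not part of the text.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def erasure_choi(p):
    """Choi state (11.3.31) on a qubit A and a qutrit B; index = 3*a + b."""
    phi = np.zeros(6)
    phi[3 * 0 + 0] = phi[3 * 1 + 1] = 1 / np.sqrt(2)   # (|00> + |11>)/sqrt(2)
    erased = np.kron(np.eye(2) / 2, np.diag([0.0, 0.0, 1.0]))
    return (1 - p) * np.outer(phi, phi) + p * erased

def mutual_information(rho_ab, d_a, d_b):
    """I(A;B) = H(A) + H(B) - H(AB)."""
    r = rho_ab.reshape(d_a, d_b, d_a, d_b)
    rho_a = np.einsum('ibjb->ij', r)   # trace out B
    rho_b = np.einsum('aiaj->ij', r)   # trace out A
    return entropy(rho_a) + entropy(rho_b) - entropy(rho_ab)

for p in (0.0, 0.3, 0.5, 1.0):
    print(p, mutual_information(erasure_choi(p), 2, 3), 2 * (1 - p))
```

The two printed columns should agree, reflecting the cancellation $1 + H(B) - H(AB) = 1 + (1-p) - p = 2(1-p)$.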
\[ C_{\mathrm{EA}}(\mathcal{E}^{(d)}_p) = 2(1-p)\log_2 d. \tag{11.3.37} \]
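\[ A_3 = \sqrt{N}\begin{pmatrix}\sqrt{1-\gamma} & 0\\ 0 & 1\end{pmatrix}, \qquad A_4 = \sqrt{N}\begin{pmatrix}0 & 0\\ \sqrt{\gamma} & 0\end{pmatrix}. \tag{11.3.40} \]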
Therefore,
In the case 𝑁 = 0, the channel A𝛾,0 is the amplitude damping channel A𝛾 (see
(4.5.1)). Using (11.3.47), we find that
11.4 Summary
In this chapter, we formally defined and studied entanglement-assisted classical
communication. We began with the fundamental one-shot setting, in which a quan-
tum channel is used just once for entanglement-assisted classical communication,
and we defined the one-shot entanglement-assisted classical capacity in (11.1.43).
We then derived upper and lower bounds on the one-shot capacity (Propositions 11.5
and 11.8), making a fundamental link between communication and hypothesis
testing for both bounds. To derive the upper bound, the main conceptual point was
to compare an actual protocol for entanglement-assisted communication with a
useless one. This approach led to the hypothesis testing mutual information as an
upper bound. To derive the lower bound, we employed the combined approach of
sequential decoding and position-based coding, which, at its core, is about how well
a correlated state can be distinguished from a product state. Stepping back a bit,
this is conceptually similar to the idea behind the converse upper bound, which
ultimately features a comparison between a correlated state and a product state. We
can consider the one-shot setting to contain the fundamental information-theoretic
argument for entanglement-assisted communication.
With the one-shot setting in hand, we moved on to the asymptotic setting, in which
the channel is allowed to be used multiple times (as a model of how communication
channels would be used in practice). We defined various notions of communication
rates, including achievable rates, capacity, weak converse rates, strong converse
rates, and strong converse capacity. With the fundamental one-shot bounds in
place, we then substituted one use of the channel N with 𝑛 uses (the tensor-product
channel N ⊗𝑛 ) and applied various technical arguments to prove that the mutual
information of a channel is equal to both its capacity and strong converse capacity
for entanglement-assisted communication. As a main step to establish the capacity,
we proved that the mutual information of a channel is additive, and as a main step
to establish the strong converse capacity, we proved that the sandwiched Rényi
mutual information of a channel is additive.
Finally, we calculated the entanglement-assisted classical capacity for several key
channels, including the depolarizing, erasure, and generalized amplitude damping
channels, in order to illustrate the theory on some concrete examples.
As it turns out, the strongest results known in quantum information theory
are for the entanglement-assisted capacity. The results stated above hold for all
quantum channels, and thus can be viewed from the physics perspective as universal
physical laws delineating the ultimate limits of entanglement-assisted classical
communication for any physical process (i.e., described by a quantum channel). In
this sense, shared entanglement simplifies quantum information theory immensely.
Going forward from here, the same concepts, such as capacity and achievable rate,
can be defined for different communication tasks (e.g., unassisted classical
communication, quantum communication, and private communication). What
changes is that the known results are not as strong as they are for the entanglement-
assisted setting. We know the capacity of these other communication tasks only
for certain subclasses of channels. This might be considered unfortunate, but a
different perspective is that it is exciting, because rather exotic phenomena such as
superadditivity and superactivation can occur.
considered by Datta and Hsieh (2013); Matthews and Wehner (2014); Datta et al.
(2016); Anshu et al. (2019); Qi et al. (2018b); Anshu et al. (2019). Proposition 11.5
is due to Matthews and Wehner (2014), and the proof given here was found
independently by Qi et al. (2018b) and Anshu et al. (2019).
The position-based coding method was introduced by Anshu et al. (2019).
It can be understood as a quantum generalization of pulse position modulation
(Verdu, 1990; Cariolaro and Erseghe, 2003). Sequential decoding was considered
by Giovannetti et al. (2012); Sen (2012); Wilde (2013); Gao (2015); Oskouei et al.
(2019), and Theorem 11.7 is due to Oskouei et al. (2019). Proposition 11.8 is due
to Qi et al. (2018b), and the proof given here was found by Oskouei et al. (2019).
The strong converse for entanglement-assisted classical capacity was established
by Bennett et al. (2014) and Gupta and Wilde (2015), with the latter paper employing
the Rényi entropic method used in this chapter. Eq. (11.2.66) and the additivity of
sandwiched Rényi mutual information of bipartite states (Proposition 11.21) were
established by Beigi (2013). Eq. (11.2.67) was established by Gupta and Wilde
(2015), and the completely-bound 1 → 𝛼 norm was studied in depth by Devetak
et al. (2006). Theorem 11.22 was proven by Gupta and Wilde (2015), by employing
the multiplicativity of completely bounded norms (Eq. (11.2.88)) found by Devetak
et al. (2006).
The entanglement-assisted classical capacity of the depolarizing and erasure
channels was evaluated by Bennett et al. (1999b), the same capacity for the amplitude
damping channel was evaluated by Giovannetti and Fazio (2005), and the same
capacity for the generalized amplitude damping channel was evaluated by Li-Zhen
and Mao-Fa (2007a).
The proofs in Appendix 11.B are due to Cooney et al. (2016), and the proofs in
Appendices 11.E and 11.F are due to Jencova (2006) (with the proofs in this book
containing some slight variations). The Lieb concavity theorem (Theorem 11.30)
is due to Lieb (1973).
Theorem 11.25
Let $\{P_i\}_{i=1}^{N}$ be a finite set of projectors. For every vector $|\psi\rangle$ and $c > 0$,
Indeed, recall from Theorem 2.4 that every density operator 𝜌 has a spectral
decomposition of the following form:
\[ \rho = \sum_{j\in\mathcal{J}} p(j)|\psi_j\rangle\langle\psi_j|, \tag{11.A.2} \]
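\[ + (2 + c^{-1})\left\|(\mathbb{1} - P_1)|\psi_j\rangle\right\|_2^2 \tag{11.A.4} \]
\[ = (1+c)\operatorname{Tr}[(\mathbb{1} - P_N)|\psi_j\rangle\langle\psi_j|] + (2 + c + c^{-1})\sum_{i=2}^{N-1}\operatorname{Tr}[(\mathbb{1} - P_i)|\psi_j\rangle\langle\psi_j|] + (2 + c^{-1})\operatorname{Tr}[(\mathbb{1} - P_1)|\psi_j\rangle\langle\psi_j|]. \tag{11.A.5} \]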
The reduction from Theorem 11.25 to Theorem 11.7 follows by averaging the above
inequality over the probability distribution 𝑝 : J → [0, 1] and from the fact that
the right-hand side of the resulting inequality can be bounded from above by
\[ (1+c)\operatorname{Tr}[(\mathbb{1} - P_N)\rho] + (2 + c + c^{-1})\sum_{i=1}^{N-1}\operatorname{Tr}[(\mathbb{1} - P_i)\rho], \tag{11.A.6} \]
so that
\[ 1 - \operatorname{Tr}[P_N P_{N-1}\cdots P_1\,\rho\,P_1\cdots P_{N-1}P_N] \le (1+c)\operatorname{Tr}[(\mathbb{1} - P_N)\rho] + (2 + c + c^{-1})\sum_{i=1}^{N-1}\operatorname{Tr}[(\mathbb{1} - P_i)\rho]. \tag{11.A.7} \]
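The inequality in (11.A.7) is easy to probe numerically. The following sketch draws random projectors and a random density operator, then evaluates both sides for a given $c > 0$. The sampling scheme and function names are illustrative assumptions; a random test of this kind is a plausibility check, not a substitute for the proof that follows.

```python
import numpy as np

def random_projector(d, r, rng):
    """Projector onto the span of r random complex vectors in dimension d."""
    g = rng.normal(size=(d, r)) + 1j * rng.normal(size=(d, r))
    q, _ = np.linalg.qr(g)            # q has r orthonormal columns
    return q @ q.conj().T

def random_state(d, rng):
    """Random density operator (positive semi-definite, unit trace)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def both_sides(projs, rho, c):
    """Return (lhs, rhs) of (11.A.7) for projs = [P_1, ..., P_N]."""
    d = rho.shape[0]
    prod = np.eye(d, dtype=complex)
    for P in projs:                   # builds P_N ... P_1
        prod = P @ prod
    lhs = 1 - np.trace(prod @ rho @ prod.conj().T).real
    identity = np.eye(d)
    err = [np.trace((identity - P) @ rho).real for P in projs]
    rhs = (1 + c) * err[-1] + (2 + c + 1 / c) * sum(err[:-1])
    return lhs, rhs

rng = np.random.default_rng(0)
projs = [random_projector(6, 4, rng) for _ in range(3)]
print(both_sides(projs, random_state(6, rng), c=1.0))
```

The left-hand side is the failure probability of measuring $P_1, \ldots, P_N$ in sequence, and the right-hand side is a weighted sum of the individual failure probabilities, with the last projector weighted differently from the others.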
We therefore shift our focus to proving Theorem 11.25, and we do so with the aid
of several lemmas. To simplify the notation, we hereafter employ the following
shorthands:
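\begin{align*}
\|\cdots\|_\psi &\equiv \|\cdots|\psi\rangle\|_2, \tag{11.A.8}\\
\langle\cdots\rangle_\psi &\equiv \langle\psi|\cdots|\psi\rangle, \tag{11.A.9}\\
\hat{P}_i &\equiv \mathbb{1} - P_i, \tag{11.A.10}
\end{align*}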
where for a given operator 𝐴 the expression ⟨𝐴⟩𝜓 = ⟨𝜓| 𝐴|𝜓⟩ is assumed to mean
⟨𝜓|( 𝐴|𝜓⟩). Furthermore, we also assume without loss of generality that the vector
|𝜓⟩ in Theorem 11.25 is a unit vector. This assumption can be dropped by scaling
the resulting inequality by an arbitrary positive number corresponding to the norm
of |𝜓⟩.
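First recall that, due to the fact that $P^2 = P$ for every projector $P$, we have the
following identities holding for all $i \in \{1, 2, \ldots, N\}$:
\begin{align*}
\langle\hat{P}_i P_{i-1}\cdots P_1\rangle_\psi &= \langle\hat{P}_i\hat{P}_i P_{i-1}\cdots P_1\rangle_\psi,\\
\langle P_1\cdots P_i\rangle_\psi &= \langle P_1\cdots P_i P_i\rangle_\psi, \tag{11.A.11}
\end{align*}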
Lemma 11.26
For a set $\{P_i\}_{i=1}^{N}$, a unit vector $|\psi\rangle$, and employing the shorthand in (11.A.8)–(11.A.10), we have the following identities and inequality:
\begin{align*}
\sum_{i=1}^{N}\langle\hat{P}_i P_{i-1}\cdots P_1\rangle_\psi &= 1 - \langle P_N\cdots P_1\rangle_\psi, \tag{11.A.12}\\
\sum_{i=1}^{N}\langle P_1\cdots P_{i-1}\hat{P}_i\rangle_\psi &= 1 - \langle P_1\cdots P_N\rangle_\psi, \tag{11.A.13}\\
\sum_{i=1}^{N}\langle P_1\cdots P_{i-1}\hat{P}_i P_{i-1}\cdots P_1\rangle_\psi &= 1 - \langle P_1\cdots P_N\cdots P_1\rangle_\psi, \tag{11.A.14}\\
1 - \sqrt{\langle P_N\rangle_\psi}\sqrt{\langle P_1\cdots P_N\cdots P_1\rangle_\psi} &\le \sum_{i=1}^{N}\sqrt{\langle\hat{P}_i\rangle_\psi}\sqrt{\langle P_1\cdots P_{i-1}\hat{P}_i P_{i-1}\cdots P_1\rangle_\psi}, \tag{11.A.15}
\end{align*}
Lemma 11.27
For a set $\{P_i\}_{i=1}^{N}$ of projectors, a unit vector $|\psi\rangle$, and employing the shorthand
in (11.A.8)–(11.A.10), the following inequality holds for $N \ge 2$:
\[ \sum_{i=1}^{N}\left\|\hat{P}_i(\mathbb{1} - P_{i-1}\cdots P_1)\right\|_\psi^2 \le \sum_{i=1}^{N-1}\left\|\hat{P}_i\right\|_\psi^2, \tag{11.A.21} \]
Consider that
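\begin{align*}
1 - \|P_N\cdots P_1\|_\psi^2 &= 1 - \langle P_1\cdots P_N\cdots P_1\rangle_\psi\\
&\quad + 2\left(1 - \sqrt{\langle P_N\rangle_\psi}\sqrt{\langle P_1\cdots P_N\cdots P_1\rangle_\psi}\right)\\
&\quad - 2\left(1 - \sqrt{\langle P_N\rangle_\psi}\sqrt{\langle P_1\cdots P_N\cdots P_1\rangle_\psi}\right) \tag{11.A.30}\\
&= 2\left(1 - \sqrt{\langle P_N\rangle_\psi}\sqrt{\langle P_1\cdots P_N\cdots P_1\rangle_\psi}\right)\\
&\quad - \left(\sqrt{\langle P_N\rangle_\psi} - \sqrt{\langle P_1\cdots P_N\cdots P_1\rangle_\psi}\right)^2\\
&\quad - 1 + \langle P_N\rangle_\psi. \tag{11.A.31}
\end{align*}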
Continuing, we have that
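\begin{align*}
\text{Eq. (11.A.31)} &\le -\|\hat{P}_N\|_\psi^2 + 2\left(1 - \sqrt{\langle P_N\rangle_\psi}\sqrt{\langle P_1\cdots P_N\cdots P_1\rangle_\psi}\right) \tag{11.A.32}\\
&\le -\|\hat{P}_N\|_\psi^2 + 2\sum_{i=1}^{N}\sqrt{\langle\hat{P}_i\rangle_\psi}\sqrt{\langle P_1\cdots P_{i-1}\hat{P}_i P_{i-1}\cdots P_1\rangle_\psi} \tag{11.A.33}\\
&\le -\|\hat{P}_N\|_\psi^2 + 2\sum_{i=1}^{N}\sqrt{\langle\hat{P}_i\rangle_\psi}\left(\|\hat{P}_i\|_\psi + \|\hat{P}_i(\mathbb{1} - P_{i-1}\cdots P_1)\|_\psi\right). \tag{11.A.34}
\end{align*}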
\begin{align*}
\text{Eq. (11.A.34)} &= -\|\hat{P}_N\|_\psi^2 + 2\sum_{i=1}^{N}\|\hat{P}_i\|_\psi^2 + 2\sum_{i=1}^{N}\|\hat{P}_i\|_\psi\,\|\hat{P}_i(\mathbb{1} - P_{i-1}\cdots P_1)\|_\psi \tag{11.A.41}\\
&= -\|\hat{P}_N\|_\psi^2 + 2\sum_{i=1}^{N}\|\hat{P}_i\|_\psi^2 + 2\sum_{i=2}^{N}\|\hat{P}_i\|_\psi\,\|\hat{P}_i(\mathbb{1} - P_{i-1}\cdots P_1)\|_\psi \tag{11.A.42}\\
&\le -\|\hat{P}_N\|_\psi^2 + 2\sum_{i=1}^{N}\|\hat{P}_i\|_\psi^2 + \sum_{i=2}^{N}\left(c\,\|\hat{P}_i\|_\psi^2 + c^{-1}\|\hat{P}_i(\mathbb{1} - P_{i-1}\cdots P_1)\|_\psi^2\right) \tag{11.A.43}\\
&\le -\|\hat{P}_N\|_\psi^2 + 2\sum_{i=1}^{N}\|\hat{P}_i\|_\psi^2 + c\sum_{i=2}^{N}\|\hat{P}_i\|_\psi^2 + c^{-1}\sum_{i=1}^{N-1}\|\hat{P}_i\|_\psi^2 \tag{11.A.44}\\
&\le (1+c)\|\hat{P}_N\|_\psi^2 + (2 + c^{-1})\|\hat{P}_1\|_\psi^2 + (2 + c + c^{-1})\sum_{i=2}^{N-1}\|\hat{P}_i\|_\psi^2. \tag{11.A.45}
\end{align*}
Eq. (11.A.42) follows from the convention that $P_{i-1}\cdots P_1 = \mathbb{1}$ for $i = 1$. Eq.
(11.A.43) is a consequence of the inequality $2xy \le cx^2 + c^{-1}y^2$, holding for $x, y \in \mathbb{R}$
and $c > 0$. Finally, (11.A.44) is obtained by using Lemma 11.27.
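For completeness, the scalar inequality used in (11.A.43) follows from expanding a square. The short derivation below is standard and is included only as a reminder; the identification of $x$ and $y$ with the two norms is our reading of how it is applied term by term.

```latex
\[
0 \le \left(\sqrt{c}\,x - \frac{y}{\sqrt{c}}\right)^{2}
  = c\,x^{2} - 2xy + c^{-1}y^{2}
\quad\Longrightarrow\quad
2xy \le c\,x^{2} + c^{-1}y^{2},
\qquad x, y \in \mathbb{R},\ c > 0.
\]
% Applied for each i in {2, ..., N} with
%   x = \|\hat{P}_i\|_\psi,
%   y = \|\hat{P}_i(\mathbb{1} - P_{i-1}\cdots P_1)\|_\psi.
```

Equality holds when $y = cx$, which is why $c$ remains a free parameter in the final bound.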
as required.
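Similarly, for the sandwiched Rényi mutual information, we use the fact that
it increases monotonically with $\alpha$ (see Proposition 7.31), along with the fact that
$\lim_{\alpha\to 1}\widetilde{D}_\alpha(\rho\|\sigma) = D(\rho\|\sigma)$ (see Proposition 7.30), to obtain
\begin{align*}
\lim_{\alpha\to 1^+}\widetilde{I}_\alpha(\mathcal{N}) &= \inf_{\alpha\in(1,\infty)}\sup_{\psi_{RA}}\inf_{\sigma_B}\widetilde{D}_\alpha(\mathcal{N}_{A\to B}(\psi_{RA})\|\psi_R\otimes\sigma_B) \tag{11.B.8}\\
&= \sup_{\psi_{RA}}\inf_{\alpha\in(1,\infty)}\inf_{\sigma_B}\widetilde{D}_\alpha(\mathcal{N}_{A\to B}(\psi_{RA})\|\psi_R\otimes\sigma_B) \tag{11.B.9}\\
&= \sup_{\psi_{RA}}\inf_{\sigma_B}\inf_{\alpha\in(1,\infty)}\widetilde{D}_\alpha(\mathcal{N}_{A\to B}(\psi_{RA})\|\psi_R\otimes\sigma_B) \tag{11.B.10}\\
&= \sup_{\psi_{RA}}\inf_{\sigma_B}D(\mathcal{N}_{A\to B}(\psi_{RA})\|\psi_R\otimes\sigma_B) \tag{11.B.11}\\
&= \sup_{\psi_{RA}}D(\mathcal{N}_{A\to B}(\psi_{RA})\|\psi_R\otimes\mathcal{N}_{A\to B}(\psi_A)) \tag{11.B.12}\\
&= I(\mathcal{N}). \tag{11.B.13}
\end{align*}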
To obtain the second equality, we made use of the minimax theorem in Theorem 2.25
to exchange $\inf_{\alpha\in(1,\infty)}$ and $\sup_{\psi_{RA}}$. Specifically, we applied that theorem to the
function
\[ (\alpha, \psi_{RA}) \mapsto \inf_{\sigma_B}\widetilde{D}_\alpha(\mathcal{N}_{A\to B}(\psi_{RA})\|\psi_R\otimes\sigma_B), \tag{11.B.14} \]
Here we show that the mutual information 𝐼 (N) is an achievable rate based on the
alternate definition given in Appendix A. According to that definition, a rate 𝑅 ∈ R+
is an achievable rate for entanglement-assisted classical communication over N
if there exists a sequence {(𝑛, |M𝑛 |, 𝜀 𝑛 )}𝑛∈N of (𝑛, |M|, 𝜀) entanglement-assisted
To start, let us recall Corollary 11.17, which states that for all 𝜀 ∈ (0, 1], 𝑛 ∈ N,
and 𝛼 ∈ (0, 1), there exists an (𝑛, |M|, 𝜀) protocol satisfying
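\[ \frac{1}{n}\log_2|\mathcal{M}| \ge \overline{I}_\alpha(\mathcal{N}) - \frac{1}{n(1-\alpha)}\log_2\frac{2}{\varepsilon} - \frac{3}{n}. \tag{11.C.2} \]
Fix constants $\delta_1, \delta_2$ satisfying $0 < \delta_2 < \delta_1 < 1$. Pick $\alpha_n \in (0,1)$ and $\varepsilon_n \in (0,1]$
as follows:
\[ \alpha_n := 1 - n^{-(1-\delta_1)}, \qquad \varepsilon_n := 2^{-n^{\delta_2}}. \tag{11.C.3} \]
Plugging these into (11.C.2), we find that there exists a sequence $\{(n, |\mathcal{M}_n|, \varepsilon_n)\}_{n\in\mathbb{N}}$
of protocols satisfying
\begin{align*}
\frac{1}{n}\log_2|\mathcal{M}_n| &\ge \overline{I}_{\alpha_n}(\mathcal{N}) - \frac{1}{n(1-\alpha_n)}\log_2\frac{2}{\varepsilon_n} - \frac{3}{n} \tag{11.C.4}\\
&= \overline{I}_{\alpha_n}(\mathcal{N}) - \frac{1 + n^{\delta_2}}{n^{\delta_1}} - \frac{3}{n} \tag{11.C.5}\\
&= \overline{I}_{\alpha_n}(\mathcal{N}) - \frac{1}{n^{\delta_1}} - \frac{1}{n^{\delta_1-\delta_2}} - \frac{3}{n}. \tag{11.C.6}
\end{align*}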
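To see how the choices in (11.C.3) behave, the sketch below evaluates the penalty terms of (11.C.4) directly and compares them with the simplified form in (11.C.6). The values $\delta_1 = 0.6$ and $\delta_2 = 0.3$ are arbitrary illustrative choices satisfying $0 < \delta_2 < \delta_1 < 1$, and the function name is hypothetical.

```python
def rate_penalty(n, delta1=0.6, delta2=0.3):
    """Return (alpha_n, penalty from (11.C.4), penalty from (11.C.6))."""
    alpha_n = 1 - n ** (-(1 - delta1))
    # log2(2 / eps_n) with eps_n = 2^{-n^{delta2}} equals 1 + n^{delta2};
    # working with the logarithm avoids underflow of eps_n for large n.
    log2_two_over_eps = 1 + n ** delta2
    direct = log2_two_over_eps / (n * (1 - alpha_n)) + 3 / n
    simplified = n ** (-delta1) + n ** (-(delta1 - delta2)) + 3 / n
    return alpha_n, direct, simplified

for n in (10, 10**3, 10**6, 10**9):
    print(n, rate_penalty(n))
```

Both penalty expressions agree and decay to zero, while $\alpha_n \to 1$ and $\varepsilon_n \to 0$ simultaneously. The slowest term is $n^{-(\delta_1-\delta_2)}$, so the gap between $\delta_1$ and $\delta_2$ controls the convergence speed.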
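Now taking the limit $n \to \infty$, we find that
\begin{align*}
\liminf_{n\to\infty}\frac{1}{n}\log_2|\mathcal{M}_n| &\ge \liminf_{n\to\infty}\left(\overline{I}_{\alpha_n}(\mathcal{N}) - \frac{1}{n^{\delta_1}} - \frac{1}{n^{\delta_1-\delta_2}} - \frac{3}{n}\right) \tag{11.C.7}\\
&= I(\mathcal{N}), \tag{11.C.8}
\end{align*}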
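$\varepsilon_n$, we find that for all $\alpha \in (0,1)$, there exists a sequence $\{(n, |\mathcal{M}_n|, \varepsilon_n)\}_{n\in\mathbb{N}}$ of
protocols satisfying
\[ \varepsilon_n \le 2\cdot 2^{-n(1-\alpha)\left(\overline{I}_\alpha(\mathcal{N}) - R - \frac{3}{n}\right)}. \tag{11.C.9} \]
Since $R < I(\mathcal{N})$, $\lim_{\alpha\to 1}\overline{I}_\alpha(\mathcal{N}) = I(\mathcal{N})$, and since $\overline{I}_\alpha(\mathcal{N})$ is monotonically
increasing in $\alpha$ (this follows from Proposition 7.31), there exists an $\alpha^* < 1$ such
that $\overline{I}_{\alpha^*}(\mathcal{N}) > R$. Applying the bound in (11.C.9) to this value of $\alpha$, we find that
\[ \varepsilon_n \le 2\cdot 2^{-n(1-\alpha^*)\left(\overline{I}_{\alpha^*}(\mathcal{N}) - R - \frac{3}{n}\right)}. \tag{11.C.10} \]
Then, taking the limit $n \to \infty$ on both sides of this inequality, we conclude that
$\lim_{n\to\infty}\varepsilon_n = 0$ exponentially fast. Thus, by choosing $R$ as a constant satisfying
$R < I(\mathcal{N})$ it follows that there exists a sequence $\{(n, 2^{nR}, \varepsilon_n)\}_{n\in\mathbb{N}}$ of protocols such
that the error probability $\varepsilon_n$ decays exponentially fast to zero.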
where 𝛼 > 1 and the optimization is over states 𝜎𝐵 . Then, for every purification
|𝜓⟩ 𝐴𝐵𝐶 of 𝜌 𝐴𝐵 , we have
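\[ \left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\rho_{AB}\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right) = \operatorname{Tr}_C\!\left[\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\right]. \tag{11.D.3} \]
Now, the operator inside $\operatorname{Tr}_C$ on the last line in the equation above is rank one,
which means that
\[ \operatorname{Tr}_C\!\left[\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\right] \quad\text{and}\quad \operatorname{Tr}_{AB}\!\left[\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\right] \]
have the same non-zero eigenvalues. This means that their Schatten norms are
equal, so that
\begin{align*}
\widetilde{I}_\alpha(A;B)_\rho &= \frac{\alpha}{\alpha-1}\log_2\inf_{\sigma_B}\left\|\operatorname{Tr}_C\!\left[\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\right]\right\|_\alpha \tag{11.D.4}\\
&= \frac{\alpha}{\alpha-1}\log_2\inf_{\sigma_B}\left\|\operatorname{Tr}_{AB}\!\left[\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\left(\rho_A^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\right]\right\|_\alpha. \tag{11.D.5}
\end{align*}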
Now, we use the variational characterization of the Schatten norm in (2.2.98), which
states that for every operator 𝑋,
\[ \|X\|_p = \sup_{\|Y\|_{p'} = 1}\operatorname{Tr}[Y^\dagger X] \tag{11.D.6} \]
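where the last line follows by applying Sion’s minimax theorem (Theorem 2.24) to
the function
\[ (\tau_C, \sigma_B) \mapsto \operatorname{Tr}\!\left[\left(\rho_A^{\frac{1-\alpha}{\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{\alpha}}\otimes\tau_C^{\frac{\alpha-1}{\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\right], \tag{11.D.11} \]
which is convex in the first argument because $\sigma_B \mapsto \sigma_B^{\frac{1-\alpha}{\alpha}}$ is operator convex and
concave in the second argument because $\tau_C \mapsto \tau_C^{\frac{\alpha-1}{\alpha}}$ is operator concave.
Finally, we use Proposition 2.8, which is that
for all $0 < p < 1$, where $\frac{1}{p} + \frac{1}{p'} = 1$. Applying this to (11.D.10) with $p' = \frac{\alpha}{1-\alpha}$, so
that $p = \frac{\alpha}{2\alpha-1}$, we conclude that
\begin{align*}
\widetilde{I}_\alpha(A;B)_\rho &= \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_C}\inf_{\sigma_B}\operatorname{Tr}\!\left[\left(\rho_A^{\frac{1-\alpha}{\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{\alpha}}\otimes\tau_C^{\frac{\alpha-1}{\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\right] \tag{11.D.13}\\
&= \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_C}\inf_{\sigma_B}\operatorname{Tr}\!\left[\sigma_B^{\frac{1-\alpha}{\alpha}}\operatorname{Tr}_{AC}\!\left[\left(\rho_A^{\frac{1-\alpha}{\alpha}}\otimes\tau_C^{\frac{\alpha-1}{\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\right]\right] \tag{11.D.14}\\
&= \frac{\alpha}{\alpha-1}\log_2\sup_{\tau_C}\left\|\operatorname{Tr}_{AC}\!\left[\left(\rho_A^{\frac{1-\alpha}{\alpha}}\otimes\tau_C^{\frac{\alpha-1}{\alpha}}\right)|\psi\rangle\langle\psi|_{ABC}\right]\right\|_{\frac{\alpha}{2\alpha-1}}, \tag{11.D.15}
\end{align*}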
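\[ \widetilde{I}_\alpha(\mathcal{N}) = \frac{\alpha}{\alpha-1}\sup_{\psi_{RA}}\log_2\inf_{\sigma_B}\left\|\left(\psi_R^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\mathcal{N}_{A\to B}(\psi_{RA})\left(\psi_R^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\right\|_\alpha. \tag{11.D.17} \]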
Now, we use the fact mentioned in (2.2.36), which is that for every pure state 𝜓 𝑅 𝐴 ,
with the systems 𝑅 and 𝐴 having the same dimensions, there exists an operator 𝑋 𝑅
such that
|𝜓⟩ 𝑅 𝐴 = (𝑋 𝑅 ⊗ 1 𝐴 )|Γ⟩ 𝑅 𝐴 , (11.D.18)
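\begin{align*}
\psi_R &= \operatorname{Tr}_B[\mathcal{N}_{A\to B}(\psi_{RA})] \tag{11.D.20}\\
&= \operatorname{Tr}_B\!\left[(U_R\sqrt{\tau_R}\otimes\mathbb{1}_B)\,\mathcal{N}_{A\to B}(\Gamma_{RA})\,(\sqrt{\tau_R}\,U_R^\dagger\otimes\mathbb{1}_B)\right] \tag{11.D.21}\\
&= U_R\tau_R U_R^\dagger, \tag{11.D.22}
\end{align*}
where the last equality follows because $\mathcal{N}$ is trace preserving and $\operatorname{Tr}_A[|\Gamma\rangle\langle\Gamma|_{RA}] = \mathbb{1}_R$. Using this, we find that
\begin{align*}
&\left(\psi_R^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\mathcal{N}_{A\to B}(\psi_{RA})\left(\psi_R^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\\
&\quad= \left(\left(U_R\tau_R U_R^\dagger\right)^{\frac{1-\alpha}{2\alpha}}U_R\sqrt{\tau_R}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\mathcal{N}_{A\to B}(\Gamma_{RA})\left(\sqrt{\tau_R}\,U_R^\dagger\left(U_R\tau_R U_R^\dagger\right)^{\frac{1-\alpha}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\\
&\quad= \left(U_R\tau_R^{\frac{1}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\mathcal{N}_{A\to B}(\Gamma_{RA})\left(\tau_R^{\frac{1}{2\alpha}}U_R^\dagger\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right). \tag{11.D.23}
\end{align*}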
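\begin{align*}
\widetilde{I}_\alpha(\mathcal{N}) &= \frac{\alpha}{\alpha-1}\sup_{\rho_R}\log_2\inf_{\sigma_B}\left\|\left(\rho_R^{\frac{1}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\mathcal{N}_{A\to B}(\Gamma_{RA})\left(\rho_R^{\frac{1}{2\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{2\alpha}}\right)\right\|_\alpha \tag{11.D.24}\\
&= \frac{\alpha}{\alpha-1}\sup_{\rho_R}\log_2\inf_{\sigma_B}\left\|\mathcal{N}_{A\to B}(\Gamma_{RA})^{\frac{1}{2}}\left(\rho_R^{\frac{1}{\alpha}}\otimes\sigma_B^{\frac{1-\alpha}{\alpha}}\right)\mathcal{N}_{A\to B}(\Gamma_{RA})^{\frac{1}{2}}\right\|_\alpha. \tag{11.D.25}
\end{align*}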
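is concave in the first argument (this follows from Lemma 11.29 in Appendix 11.F
below) and convex in the second argument (this follows from the operator convexity
of $\sigma_B \mapsto \sigma_B^{\frac{1-\alpha}{\alpha}}$ for $\alpha > 1$ and convexity of the Schatten norm). Thus, by the Sion
minimax theorem (Theorem 2.24), we can exchange $\sup_{\rho_R}$ and $\inf_{\sigma_B}$. Also, we
define the completely positive map $\mathcal{S}^{(\alpha)}_{\sigma_B}$ by $\mathcal{S}^{(\alpha)}_{\sigma_B}(\cdot) := \sigma_B^{\frac{1-\alpha}{2\alpha}}(\cdot)\,\sigma_B^{\frac{1-\alpha}{2\alpha}}$. We can then
further rewrite $\widetilde{I}_\alpha(\mathcal{N})$ as
\begin{align*}
\widetilde{I}_\alpha(\mathcal{N}) &= \frac{\alpha}{\alpha-1}\inf_{\sigma_B}\log_2\sup_{\rho_R}\left\|\left(\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A\to B}\right)\!\left(\rho_R^{\frac{1}{2\alpha}}|\Gamma\rangle\langle\Gamma|_{RA}\,\rho_R^{\frac{1}{2\alpha}}\right)\right\|_\alpha \tag{11.D.27}\\
&= \frac{\alpha}{\alpha-1}\inf_{\sigma_B}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}\right\|_{\mathrm{CB},1\to\alpha}, \tag{11.D.28}
\end{align*}
where, to arrive at the last line, we used the definition in (11.2.68). Also, consider
that the optimum in (11.2.68) is achieved when $\operatorname{Tr}[Y_R] = 1$. Therefore,
\[ \widetilde{I}_\alpha(\mathcal{N}) = \frac{\alpha}{\alpha-1}\inf_{\sigma_B}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}\right\|_{\mathrm{CB},1\to\alpha}, \tag{11.D.29} \]
as required.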
for every completely positive map M. We start with the expression in (11.2.68)
and write it alternatively as follows:
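\begin{align*}
\|\mathcal{M}\|_{\mathrm{CB},1\to\alpha} &= \sup_{Y_R > 0,\ \operatorname{Tr}[Y_R]\le 1}\left\|\mathcal{M}_{A\to B}\!\left(Y_R^{\frac{1}{2\alpha}}|\Gamma\rangle\langle\Gamma|_{RA}\,Y_R^{\frac{1}{2\alpha}}\right)\right\|_\alpha \tag{11.E.2}\\
&= \sup_{Y_R > 0,\ \|Y_R\|_\alpha\le 1}\left\|\mathcal{M}_{A\to B}\!\left(Y_R^{\frac{1}{2}}|\Gamma\rangle\langle\Gamma|_{RA}\,Y_R^{\frac{1}{2}}\right)\right\|_\alpha \tag{11.E.3}\\
&= \sup_{Y_R > 0}\frac{\left\|\mathcal{M}_{A\to B}\!\left(Y_R^{\frac{1}{2}}|\Gamma\rangle\langle\Gamma|_{RA}\,Y_R^{\frac{1}{2}}\right)\right\|_\alpha}{\|Y_R\|_\alpha}. \tag{11.E.4}
\end{align*}
Now, we use the fact that there is a one-to-one correspondence between the operators
$Y_R$ and the vectors
\[ |\Gamma^Y\rangle_{RA} := (Y_R^{\frac{1}{2}}\otimes\mathbb{1}_A)|\Gamma\rangle_{RA}. \tag{11.E.5} \]
This allows us to rewrite the optimization in (11.E.4) in terms of such vectors. Then,
by employing isometric invariance of the norms with respect to an isometry acting
on the reference system $R$, we can restrict the optimization to arbitrary vectors
$|\psi\rangle_{RA}$. Therefore, we have that
\[ \|\mathcal{M}\|_{\mathrm{CB},1\to\alpha} = \sup_{\psi_{RA}}\frac{\|\mathcal{M}_{A\to B}(\psi_{RA})\|_\alpha}{\|\operatorname{Tr}_A[\psi_{RA}]\|_\alpha}, \tag{11.E.6} \]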
It remains to show the opposite inequality. Consider a vector |𝜙⟩𝑆𝑅 𝐴 that purifies
𝑌𝑅 𝐴 > 0, in the sense that Tr𝑆 [𝜙 𝑆𝑅 𝐴 ] = 𝑌𝑅 𝐴 . Then we have that
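\begin{align*}
\frac{\|\mathcal{M}_{A\to B}(Y_{RA})\|_\alpha}{\|\operatorname{Tr}_A[Y_{RA}]\|_\alpha} &= \frac{\|(\mathcal{M}_{A\to B}\otimes\operatorname{Tr}_S)(\phi_{SRA})\|_\alpha}{\|\operatorname{Tr}_{SA}[\phi_{SRA}]\|_\alpha} \tag{11.E.8}\\
&\le \sup_{|\phi\rangle_{SRA}}\frac{\|(\mathcal{M}_{A\to B}\otimes\operatorname{Tr}_S)(\phi_{SRA})\|_\alpha}{\|\operatorname{Tr}_{SA}[\phi_{SRA}]\|_\alpha} \tag{11.E.9}\\
&= \|\mathcal{M}\otimes\operatorname{Tr}\|_{\mathrm{CB},1\to\alpha} \tag{11.E.10}\\
&= \|\mathcal{M}\|_{\mathrm{CB},1\to\alpha}\,\|\operatorname{Tr}\|_{\mathrm{CB},1\to\alpha} \tag{11.E.11}\\
&= \|\mathcal{M}\|_{\mathrm{CB},1\to\alpha}. \tag{11.E.12}
\end{align*}
The third-to-last equality follows from (11.E.6). The second-to-last equality follows
from (11.2.88), as shown in Appendix 11.F. The final equality follows because
$\|\operatorname{Tr}\|_{\mathrm{CB},1\to\alpha} = 1$, as can be readily verified.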
for every two completely positive maps M1 and M2 and all 𝛼 > 1. Recall from
(11.2.68) that
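\[ \|\mathcal{M}\|_{\mathrm{CB},1\to\alpha} = \sup_{Y_R > 0,\ \operatorname{Tr}[Y_R]\le 1}\left\|Y_R^{\frac{1}{2\alpha}}\,\Gamma^{\mathcal{M}}_{RB}\,Y_R^{\frac{1}{2\alpha}}\right\|_\alpha, \tag{11.F.2} \]
where $\Gamma^{\mathcal{M}}_{RB} := \mathcal{M}_{A\to B}(\Gamma_{RA})$ is the Choi representation of $\mathcal{M}$, and the dimension of
$R$ is the same as the dimension of $A$. For a completely positive map $\mathcal{P}_{C\to D}$, let us
also define
\[ \|\mathcal{P}\|_{\alpha\to\alpha} := \sup_{Z_C > 0}\frac{\|\mathcal{P}_{C\to D}(Z_C)\|_\alpha}{\|Z_C\|_\alpha}. \tag{11.F.3} \]
Note that
\begin{align*}
\|\mathcal{P}\|_{\alpha\to\alpha} &= \sup_{Z_C > 0}\frac{\|\mathcal{P}_{C\to D}(Z_C)\|_\alpha}{\|Z_C\|_\alpha} \tag{11.F.4}\\
&= \sup_{Z_C > 0,\ \|Z_C\|_\alpha\le 1}\|\mathcal{P}_{C\to D}(Z_C)\|_\alpha \tag{11.F.5}\\
&= \sup_{Y_C > 0,\ \operatorname{Tr}[Y_C]\le 1}\left\|\mathcal{P}_{C\to D}(Y_C^{\frac{1}{\alpha}})\right\|_\alpha, \tag{11.F.6}
\end{align*}
where the last equality follows from the substitution $Y_C = Z_C^\alpha$ so that $\operatorname{Tr}[Y_C] = \operatorname{Tr}[Z_C^\alpha] = \|Z_C\|_\alpha^\alpha$.
Now, it immediately follows that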
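Indeed, due to the fact that the Choi representation of $\mathcal{M}_1\otimes\mathcal{M}_2$ has a tensor-product
form (see (4.2.17)), we can restrict the optimization in the definition of the norm
$\|\mathcal{M}_1\otimes\mathcal{M}_2\|_{\mathrm{CB},1\to\alpha}$ to tensor-product operators $Y_{R_1}\otimes Y_{R_2}$ to obtain
\begin{align*}
\|\mathcal{M}_1\otimes\mathcal{M}_2\|_{\mathrm{CB},1\to\alpha} &= \sup_{Y_{R_1R_2} > 0,\ \operatorname{Tr}[Y_{R_1R_2}]\le 1}\left\|Y_{R_1R_2}^{\frac{1}{2\alpha}}\left(\Gamma^{\mathcal{M}_1}_{R_1B_1}\otimes\Gamma^{\mathcal{M}_2}_{R_2B_2}\right)Y_{R_1R_2}^{\frac{1}{2\alpha}}\right\|_\alpha \tag{11.F.8}\\
&\ge \sup_{\substack{Y_{R_1} > 0,\ Y_{R_2} > 0,\\ \operatorname{Tr}[Y_{R_1}]\le 1,\ \operatorname{Tr}[Y_{R_2}]\le 1}}\left\|\left(Y_{R_1}^{\frac{1}{2\alpha}}\otimes Y_{R_2}^{\frac{1}{2\alpha}}\right)\left(\Gamma^{\mathcal{M}_1}_{R_1B_1}\otimes\Gamma^{\mathcal{M}_2}_{R_2B_2}\right)\left(Y_{R_1}^{\frac{1}{2\alpha}}\otimes Y_{R_2}^{\frac{1}{2\alpha}}\right)\right\|_\alpha \tag{11.F.9}\\
&= \sup_{\substack{Y_{R_1} > 0,\ Y_{R_2} > 0,\\ \operatorname{Tr}[Y_{R_1}]\le 1,\ \operatorname{Tr}[Y_{R_2}]\le 1}}\left\|Y_{R_1}^{\frac{1}{2\alpha}}\Gamma^{\mathcal{M}_1}_{R_1B_1}Y_{R_1}^{\frac{1}{2\alpha}}\otimes Y_{R_2}^{\frac{1}{2\alpha}}\Gamma^{\mathcal{M}_2}_{R_2B_2}Y_{R_2}^{\frac{1}{2\alpha}}\right\|_\alpha \tag{11.F.10}\\
&= \sup_{\substack{Y_{R_1} > 0,\ Y_{R_2} > 0,\\ \operatorname{Tr}[Y_{R_1}]\le 1,\ \operatorname{Tr}[Y_{R_2}]\le 1}}\left\|Y_{R_1}^{\frac{1}{2\alpha}}\Gamma^{\mathcal{M}_1}_{R_1B_1}Y_{R_1}^{\frac{1}{2\alpha}}\right\|_\alpha\left\|Y_{R_2}^{\frac{1}{2\alpha}}\Gamma^{\mathcal{M}_2}_{R_2B_2}Y_{R_2}^{\frac{1}{2\alpha}}\right\|_\alpha \tag{11.F.11}\\
&= \sup_{Y_{R_1} > 0,\ \operatorname{Tr}[Y_{R_1}]\le 1}\left\|Y_{R_1}^{\frac{1}{2\alpha}}\Gamma^{\mathcal{M}_1}_{R_1B_1}Y_{R_1}^{\frac{1}{2\alpha}}\right\|_\alpha\ \sup_{Y_{R_2} > 0,\ \operatorname{Tr}[Y_{R_2}]\le 1}\left\|Y_{R_2}^{\frac{1}{2\alpha}}\Gamma^{\mathcal{M}_2}_{R_2B_2}Y_{R_2}^{\frac{1}{2\alpha}}\right\|_\alpha \tag{11.F.12}
\end{align*}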
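Now we establish the opposite inequality. Let $\mathcal{U}^{\mathcal{M}}_{A\to BE}$ be a linear map that
extends $\mathcal{M}_{A\to B}$, in the sense that there is a linear operator $U^{\mathcal{M}}_{A\to BE}$ such that
\begin{align*}
\mathcal{U}^{\mathcal{M}}_{A\to BE}(Y_A) &= U^{\mathcal{M}}_{A\to BE}\,Y_A\,(U^{\mathcal{M}}_{A\to BE})^\dagger, \tag{11.F.14}\\
\operatorname{Tr}_E[\mathcal{U}^{\mathcal{M}}_{A\to BE}(Y_A)] &= \mathcal{M}_{A\to B}(Y_A). \tag{11.F.15}
\end{align*}
Due to the fact that $Y_R^{\frac{1}{2\alpha}}\,\mathcal{U}^{\mathcal{M}}_{A\to BE}(\Gamma_{RA})\,Y_R^{\frac{1}{2\alpha}}$ is a rank-one operator, and from an
and
\begin{align*}
\operatorname{Tr}_{RB}\!\left[Y_R^{\frac{1}{2\alpha}}\,\mathcal{U}^{\mathcal{M}}_{A\to BE}(\Gamma_{RA})\,Y_R^{\frac{1}{2\alpha}}\right] &= \operatorname{Tr}_R\!\left[Y_R^{\frac{1}{2\alpha}}\,\mathcal{M}^c_{A\to E}(\Gamma_{RA})\,Y_R^{\frac{1}{2\alpha}}\right] \tag{11.F.17}\\
&= \mathcal{M}^c_{A\to E}\!\left((Y_A^{T})^{\frac{1}{\alpha}}\right), \tag{11.F.18}
\end{align*}
where $\mathcal{M}^c_{A\to E} = \operatorname{Tr}_B\circ\,\mathcal{U}^{\mathcal{M}}_{A\to BE}$ denotes the complementary map and the last equality
follows by applying the transpose trick in (2.2.40) and the fact that $\operatorname{Tr}_R[\Gamma_{RA}] = \mathbb{1}_A$.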
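\begin{align*}
\left((\mathcal{M}_1^c)_{A_1\to E_1}\otimes(\mathcal{M}_2^c)_{A_2\to E_2}\right)(Y_{A_1A_2}) &= (\mathcal{M}_1^c)_{A_1\to E_1}\!\left((\mathcal{M}_2^c)_{A_2\to E_2}(Y_{A_1A_2})\right) \tag{11.F.24}\\
&= (\mathcal{M}_1^c)_{A_1\to E_1}(X_{A_1E_2}). \tag{11.F.25}
\end{align*}
\begin{align*}
&= \left\|(\mathcal{M}_1^c)_{A_1\to E_1}\otimes\mathrm{id}_{E_2}\right\|_{\alpha\to\alpha}\left\|\mathrm{id}_{A_1}\otimes(\mathcal{M}_2^c)_{A_2\to E_2}\right\|_{\alpha\to\alpha} \tag{11.F.28}\\
&= \left\|(\mathcal{M}_1^c)_{A_1\to E_1}\right\|_{\alpha\to\alpha}\left\|(\mathcal{M}_2^c)_{A_2\to E_2}\right\|_{\alpha\to\alpha} \tag{11.F.29}\\
&= \|\mathcal{M}_1\|_{\mathrm{CB},1\to\alpha}\,\|\mathcal{M}_2\|_{\mathrm{CB},1\to\alpha}. \tag{11.F.30}
\end{align*}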
The third equality follows from Lemma 11.28 below. The final equality holds by
(11.F.23). Since 𝑌 𝐴1 𝐴2 is arbitrary, we find that
Lemma 11.28
Let M be a completely positive map. Then, for id an arbitrary identity map, the
following equality holds,
Proof: The inequality ∥id ⊗ M∥ 𝛼→𝛼 ≥ ∥M∥ 𝛼→𝛼 immediately follows by restrict-
ing the optimization on the left-hand side of the inequality. So we now establish
the non-trivial inequality ∥id ⊗ M∥ 𝛼→𝛼 ≤ ∥M∥ 𝛼→𝛼 . Letting the identity map act
on a reference system 𝑅, consider from (11.F.6) that
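\[ \|\mathrm{id}\otimes\mathcal{M}\|_{\alpha\to\alpha} = \sup_{Y_{RA} > 0,\ \operatorname{Tr}[Y_{RA}]\le 1}\left\|\mathcal{M}_{A\to B}(Y_{RA}^{\frac{1}{\alpha}})\right\|_\alpha. \tag{11.F.35} \]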
Let $\{V_R^i\}_{i=1}^{d_R^2}$ denote a set of Heisenberg–Weyl operators acting on the reference
system $R$ (see (3.2.48)), so that
\[ \frac{1}{d_R^2}\sum_{i=1}^{d_R^2}V_R^i(\cdot)(V_R^i)^\dagger = \operatorname{Tr}[\cdot]\,\frac{\mathbb{1}_R}{d_R}. \tag{11.F.36} \]
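The twirling identity (11.F.36) can be confirmed numerically. The sketch below constructs Heisenberg–Weyl operators as products $Z^z X^x$ of the clock and shift matrices; this ordering and labeling is an assumption and may differ from the convention in (3.2.48) by phases, which do not affect the twirl. It also checks the bipartite version used in the proof, where twirling only the reference system $R$ of an operator $Y_{RA}$ yields $\pi_R\otimes\operatorname{Tr}_R[Y_{RA}]$.

```python
import numpy as np

def heisenberg_weyl(d):
    """The d^2 operators Z^z X^x built from the clock (Z) and shift (X) matrices."""
    omega = np.exp(2j * np.pi / d)
    X = np.roll(np.eye(d), 1, axis=0)            # X|k> = |k+1 mod d>
    Z = np.diag(omega ** np.arange(d))           # Z|k> = omega^k |k>
    return [np.linalg.matrix_power(Z, z) @ np.linalg.matrix_power(X, x)
            for z in range(d) for x in range(d)]

def twirl_reference(op, d_r, d_a=1):
    """Average (V tensor 1) op (V tensor 1)^dagger over all Heisenberg-Weyl V on R."""
    identity_a = np.eye(d_a)
    total = np.zeros_like(op, dtype=complex)
    for v in heisenberg_weyl(d_r):
        w = np.kron(v, identity_a)
        total += w @ op @ w.conj().T
    return total / d_r**2

def trace_out_reference(op, d_r, d_a):
    """Partial trace over the first (reference) system."""
    return np.einsum('iaib->ab', op.reshape(d_r, d_a, d_r, d_a))

rng = np.random.default_rng(0)
d_r, d_a = 3, 2
Y = rng.normal(size=(d_r * d_a,) * 2) + 1j * rng.normal(size=(d_r * d_a,) * 2)
expected = np.kron(np.eye(d_r) / d_r, trace_out_reference(Y, d_r, d_a))
print(np.allclose(twirl_reference(Y, d_r, d_a), expected))
```

The identity holds for arbitrary operators, not only positive ones, which is why the test matrix above is a generic complex matrix.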
Then, for an arbitrary 𝑌𝑅 𝐴 > 0 satisfying Tr[𝑌𝑅 𝐴 ] ≤ 1, we use the unitary invariance
of the Schatten norm to obtain
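\begin{align*}
\left\|\mathcal{M}_{A\to B}(Y_{RA}^{\frac{1}{\alpha}})\right\|_\alpha &= \frac{1}{d_R^2}\sum_{i=1}^{d_R^2}\left\|V_R^i\,\mathcal{M}_{A\to B}(Y_{RA}^{\frac{1}{\alpha}})\,(V_R^i)^\dagger\right\|_\alpha \tag{11.F.37}\\
&= \frac{1}{d_R^2}\sum_{i=1}^{d_R^2}\left\|\mathcal{M}_{A\to B}\!\left(\left(V_R^i\,Y_{RA}\,(V_R^i)^\dagger\right)^{\frac{1}{\alpha}}\right)\right\|_\alpha \tag{11.F.38}\\
&\le \left\|\mathcal{M}_{A\to B}\!\left(\left(\frac{1}{d_R^2}\sum_{i=1}^{d_R^2}V_R^i\,Y_{RA}\,(V_R^i)^\dagger\right)^{\frac{1}{\alpha}}\right)\right\|_\alpha \tag{11.F.39}\\
&= \left\|\mathcal{M}_{A\to B}\!\left([\pi_R\otimes Y_A]^{\frac{1}{\alpha}}\right)\right\|_\alpha, \tag{11.F.40}
\end{align*}
where the inequality follows from Lemma 11.29 below, which states that the
function $X \mapsto \|\mathcal{M}(X^{\frac{1}{\alpha}})\|_\alpha$ is concave for all $\alpha > 1$. The last equality follows from
(3.2.98), with $\pi_R = \frac{\mathbb{1}_R}{|R|}$ the maximally mixed state and $Y_A = \operatorname{Tr}_R[Y_{RA}]$. Continuing,
we find that
\begin{align*}
\left\|\mathcal{M}_{A\to B}\!\left([\pi_R\otimes Y_A]^{\frac{1}{\alpha}}\right)\right\|_\alpha &= \left\|\mathcal{M}_{A\to B}\!\left(\pi_R^{\frac{1}{\alpha}}\otimes Y_A^{\frac{1}{\alpha}}\right)\right\|_\alpha \tag{11.F.41}\\
&= \left\|\pi_R^{\frac{1}{\alpha}}\otimes\mathcal{M}_{A\to B}(Y_A^{\frac{1}{\alpha}})\right\|_\alpha \tag{11.F.42}\\
&= \left\|\pi_R^{\frac{1}{\alpha}}\right\|_\alpha\left\|\mathcal{M}_{A\to B}(Y_A^{\frac{1}{\alpha}})\right\|_\alpha \tag{11.F.43}\\
&= \left\|\mathcal{M}_{A\to B}(Y_A^{\frac{1}{\alpha}})\right\|_\alpha \tag{11.F.44}\\
&\le \sup_{Y_A > 0,\ \operatorname{Tr}[Y_A]\le 1}\left\|\mathcal{M}_{A\to B}(Y_A^{\frac{1}{\alpha}})\right\|_\alpha \tag{11.F.45}\\
&= \|\mathcal{M}\|_{\alpha\to\alpha}. \tag{11.F.46}
\end{align*}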
Since the inequality holds for arbitrary 𝑌𝑅 𝐴 > 0 satisfying Tr[𝑌𝑅 𝐴 ] ≤ 1, we find
that
∥id ⊗ M∥ 𝛼→𝛼 ≤ ∥M∥ 𝛼→𝛼 , (11.F.47)
concluding the proof. ■
Lemma 11.29
Let 𝑋 be a positive semi-definite operator, and let M be a completely positive
map. For 𝛼 > 1, the following function is concave:
\[ X \mapsto \left\|\mathcal{M}(X^{\frac{1}{\alpha}})\right\|_\alpha. \tag{11.F.48} \]
The Lieb concavity theorem (see Theorem 11.30 below) is the statement that the
following function is jointly concave with respect to positive semi-definite 𝑅 and 𝑆
for arbitrary 𝑡 ∈ (0, 1) and an arbitrary operator 𝐾:
Let 𝑋0 , 𝑋1 ≥ 0 and let 𝑌0 , 𝑌1 > 0 be such that Tr[𝑌0 ], Tr[𝑌1 ] ≤ 1. Then for
𝜆 ∈ (0, 1], and defining
we find that
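\begin{align*}
&\lambda\operatorname{Tr}\!\left[\mathcal{M}(X_0^{\frac{1}{\alpha}})\,Y_0^{\frac{\alpha-1}{\alpha}}\right] + (1-\lambda)\operatorname{Tr}\!\left[\mathcal{M}(X_1^{\frac{1}{\alpha}})\,Y_1^{\frac{\alpha-1}{\alpha}}\right]\\
&\quad= \lambda\operatorname{Tr}\!\left[\sum_i M_i X_0^{\frac{1}{\alpha}} M_i^\dagger\,Y_0^{\frac{\alpha-1}{\alpha}}\right] + (1-\lambda)\operatorname{Tr}\!\left[\sum_i M_i X_1^{\frac{1}{\alpha}} M_i^\dagger\,Y_1^{\frac{\alpha-1}{\alpha}}\right] \tag{11.F.55}\\
&\quad= \sum_i\left(\lambda\operatorname{Tr}\!\left[M_i X_0^{\frac{1}{\alpha}} M_i^\dagger\,Y_0^{\frac{\alpha-1}{\alpha}}\right] + (1-\lambda)\operatorname{Tr}\!\left[M_i X_1^{\frac{1}{\alpha}} M_i^\dagger\,Y_1^{\frac{\alpha-1}{\alpha}}\right]\right) \tag{11.F.56}\\
&\quad\le \sum_i\operatorname{Tr}\!\left[M_i X_\lambda^{\frac{1}{\alpha}} M_i^\dagger\,Y_\lambda^{\frac{\alpha-1}{\alpha}}\right] \tag{11.F.57}\\
&\quad= \operatorname{Tr}\!\left[\mathcal{M}(X_\lambda^{\frac{1}{\alpha}})\,Y_\lambda^{\frac{\alpha-1}{\alpha}}\right] \tag{11.F.58}\\
&\quad\le \sup_{Y_A > 0,\ \operatorname{Tr}[Y_A]\le 1}\operatorname{Tr}\!\left[\mathcal{M}(X_\lambda^{\frac{1}{\alpha}})\,Y_A^{\frac{\alpha-1}{\alpha}}\right] = \left\|\mathcal{M}(X_\lambda^{\frac{1}{\alpha}})\right\|_\alpha, \tag{11.F.59}
\end{align*}
where the first inequality follows from an application of the Lieb concavity theorem,
and the second inequality follows from applying (11.F.52) and because $Y_\lambda$ is a
particular operator satisfying $Y_\lambda > 0$ and $\operatorname{Tr}[Y_\lambda] \le 1$. Since the chain of inequalities
holds for arbitrary $Y_0, Y_1 > 0$ such that $\operatorname{Tr}[Y_0], \operatorname{Tr}[Y_1] \le 1$, we conclude that
\[ \lambda\left\|\mathcal{M}(X_0^{\frac{1}{\alpha}})\right\|_\alpha + (1-\lambda)\left\|\mathcal{M}(X_1^{\frac{1}{\alpha}})\right\|_\alpha \le \left\|\mathcal{M}(X_\lambda^{\frac{1}{\alpha}})\right\|_\alpha, \tag{11.F.60} \]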
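\[ = \langle K|_{RA}\,R_A^{\frac{1}{2}}\,g\!\left(S_R^{T}\otimes R_A^{-1}\right)R_A^{\frac{1}{2}}\,|K\rangle_{RA}, \tag{11.F.67} \]
where the fourth equality holds by the positive definiteness of $R$, and where
$g(x) := x^{1-t}$ is an operator concave function. For $\lambda \in [0,1]$, let
where $R_0$ and $R_1$ are positive definite and $S_0$ and $S_1$ are positive semi-definite. Also,
let
\begin{align*}
G_0 &:= \mathbb{1}\otimes\sqrt{\lambda R_0}\,(R_\lambda)^{-\frac{1}{2}}, \tag{11.F.69}\\
G_1 &:= \mathbb{1}\otimes\sqrt{(1-\lambda)R_1}\,(R_\lambda)^{-\frac{1}{2}}. \tag{11.F.70}
\end{align*}
Then
\begin{align*}
&\quad + \mathbb{1}\otimes(R_\lambda)^{-\frac{1}{2}}(1-\lambda)R_1(R_\lambda)^{-\frac{1}{2}} \tag{11.F.71}\\
&= \mathbb{1}\otimes(R_\lambda)^{-\frac{1}{2}}R_\lambda(R_\lambda)^{-\frac{1}{2}} \tag{11.F.72}\\
&= \mathbb{1}\otimes\mathbb{1}. \tag{11.F.73}
\end{align*}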
A variation of the operator Jensen inequality (Theorem 2.16) is that the following
inequality holds for an operator concave function $f$, a finite set $\{X_i\}_i$ of Hermitian
operators, and a finite set $\{A_i\}_i$ of operators satisfying $\sum_i A_i^\dagger A_i = \mathbb{1}$:
\[ \sum_i A_i^\dagger f(X_i)A_i \le f\!\left(\sum_i A_i^\dagger X_i A_i\right). \tag{11.F.74} \]
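The operator inequality (11.F.74) can be tested numerically for the function $g(x) = x^{1-t}$ used in this proof. The sketch below draws operators $A_i$ as the blocks of a random isometry, so that $\sum_i A_i^\dagger A_i = \mathbb{1}$ holds by construction, and reports the smallest eigenvalue of the difference between the two sides. Restricting to positive semi-definite $X_i$ and the specific helper names are illustrative assumptions.

```python
import numpy as np

def psd_power(X, s):
    """X^s for a positive semi-definite Hermitian matrix X."""
    w, v = np.linalg.eigh(X)
    w = np.clip(w, 0, None)
    return (v * w**s) @ v.conj().T

def random_psd(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T

def random_kraus_family(d, k, rng):
    """Operators A_1..A_k with sum_i A_i^dagger A_i = 1 (blocks of a random isometry)."""
    g = rng.normal(size=(k * d, d)) + 1j * rng.normal(size=(k * d, d))
    q, _ = np.linalg.qr(g)                       # q^dagger q = identity
    return [q[i * d:(i + 1) * d, :] for i in range(k)]

def jensen_gap(As, Xs, t):
    """Smallest eigenvalue of f(sum A^dag X A) - sum A^dag f(X) A with f(x) = x^(1-t)."""
    f = lambda X: psd_power(X, 1 - t)
    lhs = sum(A.conj().T @ f(X) @ A for A, X in zip(As, Xs))
    rhs = f(sum(A.conj().T @ X @ A for A, X in zip(As, Xs)))
    return float(np.linalg.eigvalsh(rhs - lhs).min())

rng = np.random.default_rng(0)
As = random_kraus_family(4, 3, rng)
Xs = [random_psd(4, rng) for _ in range(3)]
print(jensen_gap(As, Xs, t=0.5))   # should be non-negative up to rounding
```

A non-negative gap means the difference of the two sides is positive semi-definite, which is the operator ordering asserted in (11.F.74).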
Then from the operator Jensen inequality and (11.F.67), we conclude that
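\begin{align*}
&= \langle K|_{RA}\,(R_\lambda)_A^{\frac{1}{2}}\,G_0^\dagger\,g\!\left(\lambda S_0^{T}\otimes(\lambda R_0)^{-1}\right)G_0\,(R_\lambda)_A^{\frac{1}{2}}\,|K\rangle_{RA}\\
&\quad + \langle K|_{RA}\,(R_\lambda)_A^{\frac{1}{2}}\,G_1^\dagger\,g\!\left((1-\lambda)S_1^{T}\otimes((1-\lambda)R_1)^{-1}\right)G_1\,(R_\lambda)_A^{\frac{1}{2}}\,|K\rangle_{RA} \tag{11.F.77}\\
&\le \langle K|_{RA}\,(R_\lambda)_A^{\frac{1}{2}}\,g(L)\,(R_\lambda)_A^{\frac{1}{2}}\,|K\rangle_{RA}, \tag{11.F.78}
\end{align*}
where the third equality follows because $\mathbb{1}_R\otimes\sqrt{\lambda}\,R_0^{\frac{1}{2}} = \left(\mathbb{1}_R\otimes(R_\lambda)^{\frac{1}{2}}\right)G_0^\dagger$. In the last
Consider that
\begin{align*}
L &= G_0^\dagger\left(\lambda S_0^{T}\otimes(\lambda R_0)^{-1}\right)G_0 + G_1^\dagger\left((1-\lambda)S_1^{T}\otimes((1-\lambda)R_1)^{-1}\right)G_1\\
&= \left(\mathbb{1}\otimes(R_\lambda)^{-\frac{1}{2}}\sqrt{\lambda R_0}\right)\left(\lambda S_0^{T}\otimes(\lambda R_0)^{-1}\right)\left(\mathbb{1}\otimes\sqrt{\lambda R_0}\,(R_\lambda)^{-\frac{1}{2}}\right)\\
&\quad + \left(\mathbb{1}\otimes(R_\lambda)^{-\frac{1}{2}}\sqrt{(1-\lambda)R_1}\right)\left((1-\lambda)S_1^{T}\otimes((1-\lambda)R_1)^{-1}\right)\left(\mathbb{1}\otimes\sqrt{(1-\lambda)R_1}\,(R_\lambda)^{-\frac{1}{2}}\right) \tag{11.F.80}\\
&= \lambda S_0^{T}\otimes(R_\lambda)^{-1} + (1-\lambda)S_1^{T}\otimes(R_\lambda)^{-1} \tag{11.F.81}\\
&= S_\lambda^{T}\otimes(R_\lambda)^{-1}. \tag{11.F.82}
\end{align*}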
So the function $(R, S) \mapsto \operatorname{Tr}[K R^{t} K^\dagger S^{1-t}]$ is jointly concave when the first argument
is restricted to be a positive definite operator. The more general case of positive
semi-definite operators in the first argument can be established by adding $\varepsilon\mathbb{1}$ to any
positive semi-definite operator to ensure that it is positive definite, applying the
above inequality, and then taking the limit $\varepsilon \to 0$ at the end. This concludes the
proof. ■
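Joint concavity of $(R, S) \mapsto \operatorname{Tr}[K R^{t} K^\dagger S^{1-t}]$ can be spot-checked at midpoints. The sketch below compares the function value at the average of two random pairs with the average of the function values. The random sampling and the helper names are illustrative assumptions, and a midpoint test only probes the claim rather than proving it.

```python
import numpy as np

def psd_power(X, s):
    """X^s for a positive semi-definite Hermitian matrix X."""
    w, v = np.linalg.eigh(X)
    w = np.clip(w, 0, None)
    return (v * w**s) @ v.conj().T

def random_psd(d, rng):
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T

def lieb_function(R, S, K, t):
    """Tr[K R^t K^dagger S^(1-t)]; real because both factors are positive semi-definite."""
    return float(np.trace(K @ psd_power(R, t) @ K.conj().T @ psd_power(S, 1 - t)).real)

def midpoint_gap(R0, S0, R1, S1, K, t):
    """f(midpoint) minus the average of f at the endpoints; concavity predicts >= 0."""
    mid = lieb_function((R0 + R1) / 2, (S0 + S1) / 2, K, t)
    avg = (lieb_function(R0, S0, K, t) + lieb_function(R1, S1, K, t)) / 2
    return mid - avg

rng = np.random.default_rng(0)
d = 4
K = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
args = [random_psd(d, rng) for _ in range(4)]
print(midpoint_gap(args[0], args[1], args[2], args[3], K, t=0.3))
```

Choosing $R_0 = R_1$ and $S_0 = S_1$ gives a gap of zero, which is a quick consistency check on the implementation.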
Here we show that the mutual information $I(\mathcal{N})$ is a strong converse rate based on
the alternate definition given in Appendix A. According to that definition, a
rate $R \in \mathbb{R}^+$ is a strong converse rate for entanglement-assisted classical
communication over a channel $\mathcal{N}$ if for every sequence $\{(n, |\mathcal{M}_n|, \varepsilon_n)\}_{n\in\mathbb{N}}$ of $(n, |\mathcal{M}|, \varepsilon)$
entanglement-assisted classical communication protocols over $n$ uses of $\mathcal{N}$, we have
that $\liminf_{n\to\infty}\frac{1}{n}\log_2|\mathcal{M}_n| > R \Rightarrow \lim_{n\to\infty}\varepsilon_n = 1$.
Let us show that the mutual information 𝐼 (N) of the channel N is a strong
converse rate under this alternate definition. Let {(𝑛, |M𝑛 | , 𝜀 𝑛 )}𝑛∈N be a sequence
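of protocols satisfying $\liminf_{n\to\infty}\frac{1}{n}\log_2|\mathcal{M}_n| > I(\mathcal{N})$. Due to this strict inequality,
the fact that $\lim_{\alpha\to 1}\widetilde{I}_\alpha(\mathcal{N}) = I(\mathcal{N})$, and since the sandwiched Rényi mutual
information $\widetilde{I}_\alpha(\mathcal{N})$ is monotonically increasing in $\alpha$ (this follows from Proposition 7.31),
there exists a value $\alpha^* > 1$ such that
\[ \liminf_{n\to\infty}\frac{1}{n}\log_2|\mathcal{M}_n| > \widetilde{I}_{\alpha^*}(\mathcal{N}). \tag{11.G.1} \]
Now recall the following bound from (11.2.92), which holds for all $\alpha > 1$ and for
every $(n, |\mathcal{M}|, \varepsilon)$ protocol:
\[ \frac{1}{n}\log_2|\mathcal{M}| \le \widetilde{I}_\alpha(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{1-\varepsilon}. \tag{11.G.2} \]
We can apply it in our case to conclude that
\[ \frac{1}{n}\log_2|\mathcal{M}_n| \le \widetilde{I}_{\alpha^*}(\mathcal{N}) + \frac{\alpha^*}{n(\alpha^*-1)}\log_2\frac{1}{1-\varepsilon_n}. \tag{11.G.3} \]
Now suppose that
\[ \liminf_{n\to\infty}\varepsilon_n = c \in [0,1). \tag{11.G.4} \]
Then it follows that
\begin{align*}
\liminf_{n\to\infty}\frac{1}{n}\log_2|\mathcal{M}_n| &\le \liminf_{n\to\infty}\left(\widetilde{I}_{\alpha^*}(\mathcal{N}) + \frac{\alpha^*}{n(\alpha^*-1)}\log_2\frac{1}{1-\varepsilon_n}\right) \tag{11.G.5}\\
&= \widetilde{I}_{\alpha^*}(\mathcal{N}) + \liminf_{n\to\infty}\frac{\alpha^*}{n(\alpha^*-1)}\log_2\frac{1}{1-\varepsilon_n} \tag{11.G.6}\\
&= \widetilde{I}_{\alpha^*}(\mathcal{N}), \tag{11.G.7}
\end{align*}
where the last equality follows because $\alpha^* > 1$ is a constant and the sequence
$\{\varepsilon_n\}_{n\in\mathbb{N}}$ converges to a constant $c \in [0,1)$. However, this contradicts (11.G.1).
Thus, (11.G.4) cannot hold, and so we conclude that $\liminf_{n\to\infty}\varepsilon_n = 1$.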
The argument given above makes no statement about how fast the error probabil-
ity converges to one in the large 𝑛 limit. If we fix the rate 𝑅 of communication to be a
constant satisfying 𝑅 > 𝐼 (N), then we can argue that the error probability converges
exponentially fast to one. To this end, consider a sequence {(𝑛, 2𝑛𝑅 , 𝜀 𝑛 )}𝑛∈N of
(𝑛, |M|, 𝜀) protocols, with each element of the sequence having an arbitrary (but
fixed) rate 𝑅 > 𝐼 (N). For each element of the sequence, the inequality in (11.2.92)
holds, which means that
𝛼 1
𝑅≤e 𝐼𝛼 (N) + log2 (11.G.8)
𝑛(𝛼 − 1) 1 − 𝜀𝑛
for all 𝛼 > 1. Rearranging this inequality leads to the following lower bound on the
error probabilities 𝜀 𝑛 :
𝜀 𝑛 ≥ 1 − 2−𝑛 ( 𝛼 )( 𝑅−𝐼 𝛼 (N) )
𝛼−1 e
(11.G.9)
for all 𝛼 > 1. Now, since 𝑅 > 𝐼 (N), lim𝛼→1 e 𝐼𝛼 (N) = 𝐼 (N), and since the
sandwiched Rényi mutual information 𝐼𝛼 (N) is monotonically increasing in 𝛼
e
(this follows from Proposition 7.31), there exists an 𝛼∗ > 1 such that 𝑅 > e 𝐼𝛼∗ (N).
Applying the inequality in (11.G.9) to this value of 𝛼, we find that
𝛼∗ −1
−𝑛 𝛼∗ ( 𝑅−e𝐼 𝛼∗ (N) )
𝜀𝑛 ≥ 1 − 2 . (11.G.10)
Then, taking the limit 𝑛 → ∞ on both sides of this inequality, we conclude that
lim𝑛→∞ 𝜀 𝑛 = 1 and the convergence to one is exponentially fast.
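To get a feel for how quickly the bound in (11.G.10) pushes the error probability toward one, the following minimal Python sketch evaluates its right-hand side. The function name `error_lower_bound` and the numerical values (rate 1.2, Rényi quantity 1.0 at $\alpha^* = 2$) are illustrative assumptions and are not taken from the text.

```python
def error_lower_bound(n, alpha, rate, renyi_info):
    """Right-hand side of (11.G.9): 1 - 2^{-n ((alpha-1)/alpha) (R - I_alpha)}.

    `renyi_info` stands for the sandwiched Renyi mutual information of the
    channel at order `alpha`; here it is just a number supplied by the caller.
    """
    if alpha <= 1:
        raise ValueError("the bound is stated for alpha > 1")
    exponent = n * ((alpha - 1) / alpha) * (rate - renyi_info)
    return 1.0 - 2.0 ** (-exponent)


# Hypothetical values: R = 1.2 exceeds the Renyi quantity 1.0 at alpha* = 2.
bounds = [error_lower_bound(n, 2.0, 1.2, 1.0) for n in (10, 100, 1000)]
for n, bound in zip((10, 100, 1000), bounds):
    print(f"n = {n:4d}: error probability is at least {bound:.6f}")
```

The bound is only informative when the rate exceeds the Rényi quantity; otherwise the exponent is non-positive and the right-hand side is at most zero.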
From the arguments above, we find not only that 𝐼 (N) is a strong converse
rate according to the alternate definition provided in Appendix A, but also that
the maximal error probability of every sequence of (𝑛, |M|, 𝜀) entanglement-
assisted classical communication protocols with fixed rate strictly above the mutual
information 𝐼 (N) approaches one at an exponential rate.
In Section 11.C, we showed that the error probability vanishes in the limit 𝑛 → ∞
for every fixed rate 𝑅 < 𝐼 (N). We thus see that, as 𝑛 → ∞, the mutual information
𝐼 (N) is a sharp dividing point between reliable, error-free communication and
communication with error probability approaching one exponentially fast. This
situation is depicted in Figure 11.8.
Chapter 11: Entanglement-Assisted Classical Communication
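[Figure 11.8: plot of the error probability $\varepsilon_n$ against the rate $R_n$ in the limit $n\to\infty$, with the threshold $I(\mathcal{N})$ marked on the rate axis.]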
Chapter 12
Classical Communication
We now move on to classical communication over quantum channels. Unlike
the previous chapter, here we suppose that Alice and Bob do not have access
to shared entanglement prior to communication. Thus, the scenario considered
in this chapter is more practical than the entanglement-assisted setting—in the
previous chapter, we made the simplifying assumption that shared entanglement
is available for free to the sender and receiver. However, without widespread
entanglement-sharing networks available, this assumption is not really practical,
and so the entanglement-assisted capacity is mostly of academic interest at the
moment.
Without shared entanglement available to the sender and receiver, is it still
advantageous to use a quantum strategy to send classical information over a quantum
channel? At first glance, it may seem that, without prior shared entanglement, there
might not be any point in using a quantum strategy to send classical information
over a quantum channel. However, when using a channel multiple times, there is
still the possibility of encoding a message into a state at the encoder that is entangled
across multiple channel uses and then performing a collective measurement at the
decoder. For many examples of channels, it is known that collective measurements
can enhance communication capacity, and it is known that in principle, there exists
a channel for which entangled states at the encoder provide a further enhancement
to communication capacity.
Although it may seem that determining the maximum amount of classical
information that can be communicated using a given quantum channel, i.e., de-
termining the classical capacity of a quantum channel, might be easier than its
chapter is that the maximum amount of classical information that can be commu-
nicated over a noiseless quantum channel without error is log2 𝑑, where 𝑑 is the
dimension of the channel. Let us describe a simple protocol that achieves this
number of communicated bits. Consider a discrete set M of messages, and suppose
that Alice encodes each message 𝑚 ∈ M into a quantum state |𝑚⟩, such that the set
$\{|m\rangle\}_{m\in\mathcal{M}}$ is orthonormal, i.e., $\langle m|m'\rangle = \delta_{m,m'}$ for all $m, m' \in \mathcal{M}$. Bob, knowing
Alice’s encoding of the messages, devises a measurement to extract the message
described by the POVM {|𝑚⟩⟨𝑚|} 𝑚∈M . His strategy is to guess that the message
sent was “𝑚” if the outcome of his measurement is 𝑚 ∈ M. If Alice sends the state
|𝑚⟩⟨𝑚| through a noiseless quantum channel, then Bob is guaranteed to receive the
state |𝑚⟩⟨𝑚| unaltered, so that his guess will always be correct. Alice can thus send
log2 |M| bits of classical information to Bob without error.
Now, if the channel is noisy, the initially orthogonal states in general become
non-orthogonal, so that if Alice sends the state |𝑚⟩⟨𝑚| through the channel then
Bob generally receives a mixed state $\rho^m$ instead. As a consequence of using a noisy
quantum channel, Bob’s decoding strategy will not always succeed, meaning that
there will be errors. In order to mitigate the effects of noise, Alice can choose
a more clever encoding of the message, and similarly Bob can devise a more
clever decoding strategy.1 Alice and Bob can also use the channel multiple times,
which can decrease the error in general, while also allowing for the messages to be
encoded into higher-dimensional entangled states.
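As a small illustration of how noise degrades the simple basis-state protocol described above, the following Python sketch sends $|m\rangle\langle m|$ through a qubit depolarizing channel and computes the probability that Bob's computational-basis measurement returns the correct message. The depolarizing channel, its parameter `p`, and all function names are assumptions chosen for illustration; the text does not single out this channel here.

```python
def depolarize(rho, p):
    """Qubit depolarizing channel: rho -> (1 - p) * rho + p * Tr[rho] * I / 2."""
    trace = rho[0][0] + rho[1][1]
    return [
        [(1 - p) * rho[i][j] + (p * trace / 2 if i == j else 0.0) for j in range(2)]
        for i in range(2)
    ]


def basis_projector(m):
    """Matrix of |m><m| for m in {0, 1}."""
    return [[1.0 if i == m and j == m else 0.0 for j in range(2)] for i in range(2)]


def trace_of_product(a, b):
    """Tr[a b] for 2x2 matrices given as nested lists."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def basis_decoding_success(m, p):
    """Probability that the POVM {|0><0|, |1><1|} returns m when |m><m| is sent."""
    received = depolarize(basis_projector(m), p)
    return trace_of_product(basis_projector(m), received)


noiseless = basis_decoding_success(0, 0.0)  # p = 0: Bob's guess is always correct
noisy = basis_decoding_success(0, 0.2)      # p > 0: success probability is 1 - p/2
print(noiseless, noisy)
```

For this particular channel the received state is diagonal, so the errors come from the mixing with the maximally mixed state rather than from non-orthogonality of the outputs.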
Observe that the task of classical communication over a quantum channel is
closely related to the task of state discrimination (see Section 5.3.1). Recall that
the goal of state discrimination is to minimize the error probability for a given set
$\{\rho^m\}_{m\in\mathcal{M}}$ of states corresponding to the message set $\mathcal{M}$ and a particular decoding
POVM $\{\Lambda^m_B\}_{m\in\mathcal{M}}$ indexed by the messages. In classical communication, we focus
primarily on maximizing the rate $\frac{1}{n}\log_2|\mathcal{M}|$ of communication for a given error
probability 𝜀, and we are interested in determining the maximum rate 𝑅 for which
𝜀 vanishes as the number 𝑛 of channel uses increases.
1We assume, as in all communication tasks considered in this book, that Alice and Bob know
the channel connecting them, so that they can use this knowledge to develop their encoding and
decoding.
Figure 12.1: Depiction of a protocol for classical communication over one use
of the quantum channel N. Alice, who wishes to send a message 𝑚 chosen
from a set M of messages, first encodes the message into a quantum state on a
quantum system 𝐴, using a classical–quantum encoding channel E. She then
sends the quantum system 𝐴 through the channel N 𝐴→𝐵 . After Bob receives
the system 𝐵, he performs a measurement on it, using the outcome of the
measurement to give an estimate $\widehat{m}$ of the message sent by Alice.
can choose the distribution 𝑝 to be the degenerate distribution, equal to one for 𝑚
and zero for all other messages.
She then uses an encoding channel E 𝑀→𝐴 to map the message to a quantum
state 𝜌 𝑚𝐴 . We can explicitly define the encoding channel E 𝑀 ′ →𝐴 as
Note that this channel has the form of a classical–quantum channel (recall Defini-
tion 4.9). The action of the encoding channel on the initial state in (12.1.1) is as
follows:
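\begin{align}
\mathcal{E}_{M'\to A}(\Phi^p_{MM'}) &= \sum_{m\in\mathcal{M}} p(m)\,|m\rangle\!\langle m|_M \otimes \mathcal{E}_{M'\to A}(|m\rangle\!\langle m|_{M'}) \tag{12.1.3}\\
&= \sum_{m\in\mathcal{M}} p(m)\,|m\rangle\!\langle m|_M \otimes \rho^m_A \tag{12.1.4}\\
&=: \rho^p_{MA}. \tag{12.1.5}
\end{align}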
Alice then sends the system 𝐴 through the channel N 𝐴→𝐵 , resulting in the state
\[
\mathcal{N}_{A\to B}(\rho^p_{MA}) = \sum_{m\in\mathcal{M}} p(m)\,|m\rangle\!\langle m|_M \otimes \mathcal{N}_{A\to B}(\rho^m_A). \tag{12.1.6}
\]
Bob, whose task is to determine which message Alice sent, performs a decoding
measurement on his received system $B$, which has the corresponding POVM
$\{\Lambda^m_B\}_{m\in\mathcal{M}}$. The measurement is associated with the decoding channel $\mathcal{D}_{B\to\widehat{M}}$,
which is simply a quantum–classical channel as given in Definition 4.10, i.e.,
\[
\mathcal{D}_{B\to\widehat{M}}(\tau_B) := \sum_{m\in\mathcal{M}} \operatorname{Tr}[\Lambda^m_B\tau_B]\,|m\rangle\!\langle m|_{\widehat{M}} \tag{12.1.7}
\]
\[
q(\widehat{m}|m) := \Pr[\widehat{M} = \widehat{m}\,|\,M = m] = \operatorname{Tr}[\Lambda^{\widehat{m}}_B\,\mathcal{N}_{A\to B}(\rho^m_A)]. \tag{12.1.10}
\]
and the steps to show this are the same as those shown in the proof of Lemma 11.2.
In particular, the following equality holds
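\[
\frac{1}{2}\left\|\Phi^p_{MM'} - \omega^p_{M\widehat{M}}\right\|_1 = p_{\text{err}}((\mathcal{E},\mathcal{D});p), \tag{12.1.17}
\]
which leads to
\[
p^*_{\text{err}}(\mathcal{E},\mathcal{D};\mathcal{N}) = \max_{p:\mathcal{M}\to[0,1]} p_{\text{err}}((\mathcal{E},\mathcal{D});p,\mathcal{N}) = \max_{p:\mathcal{M}\to[0,1]} \frac{1}{2}\left\|\Phi^p_{MM'} - \omega^p_{M\widehat{M}}\right\|_1. \tag{12.1.18}
\]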
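Also, as in Chapter 11, another way to define the error criterion of the protocol is
through a comparator test. Recall that the comparator test is a measurement defined
by the two-element POVM $\{\Pi_{M\widehat{M}}, \mathbb{1}_{M\widehat{M}} - \Pi_{M\widehat{M}}\}$, where $\Pi_{M\widehat{M}}$ is the projection
defined as
\[
\Pi_{M\widehat{M}} := \sum_{m\in\mathcal{M}} |m\rangle\!\langle m|_M \otimes |m\rangle\!\langle m|_{\widehat{M}}. \tag{12.1.19}
\]
Note that $\operatorname{Tr}[\Pi_{M\widehat{M}}\,\omega^p_{M\widehat{M}}]$ is simply the probability that the classical registers $M$
and $\widehat{M}$ in the state $\omega^p_{M\widehat{M}}$ have the same values. In particular, following the same
steps as in (11.1.38)–(11.1.40), we have
\[
\operatorname{Tr}\!\left[\Pi_{M\widehat{M}}\,\omega^p_{M\widehat{M}}\right] = 1 - p_{\text{err}}((\mathcal{E},\mathcal{D});p,\mathcal{N}) =: p_{\text{succ}}((\mathcal{E},\mathcal{D});p,\mathcal{N}), \tag{12.1.20}
\]
where we have acknowledged that the expression on the left-hand side can be
interpreted as the average success probability of the code $(\mathcal{E},\mathcal{D})$ and denoted it by
$p_{\text{succ}}((\mathcal{E},\mathcal{D});p,\mathcal{N})$.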
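Because both registers in $\omega^p_{M\widehat{M}}$ are classical, the comparator test reduces to a sum over the diagonal of the conditional distribution $q(\widehat{m}|m)$ from (12.1.10). The sketch below makes this concrete; the function names and the three-message transition matrix are hypothetical and chosen only for illustration.

```python
def comparator_success(prior, q):
    """Tr[Pi omega^p] = sum_m p(m) q(m|m), with q[m][m_hat] = q(m_hat | m)."""
    return sum(prior[m] * q[m][m] for m in range(len(prior)))


def average_error(prior, q):
    """Average error probability 1 - Tr[Pi omega^p], as in (12.1.20)."""
    return 1.0 - comparator_success(prior, q)


def maximal_error(q):
    """Largest message error probability 1 - q(m|m).

    This equals the maximum of the average error over all priors, because the
    average is linear in the prior and is maximized by a degenerate distribution.
    """
    return max(1.0 - q[m][m] for m in range(len(q)))


# Hypothetical conditional distribution for three messages (rows sum to one).
q = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.00, 0.30, 0.70],
]
uniform = [1 / 3, 1 / 3, 1 / 3]
print(average_error(uniform, q), maximal_error(q))
```

Evaluating `average_error` on the degenerate prior concentrated on the worst message reproduces `maximal_error`, which is the content of the maximization in (12.1.18).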
As mentioned at the beginning of this chapter, our goal is to bound (from above
and below) the maximum number log2 |M| of transmitted bits for every classical
communication protocol over N. Given an error probability threshold of 𝜀, we call
the maximum number of transmitted bits the one-shot classical capacity of N.
N. In other words,
In addition to finding, for a given 𝜀 ∈ [0, 1], the maximum number of transmitted
bits among all (|M|, 𝜀) classical communication protocols over N 𝐴→𝐵 , we can
consider the following complementary problem: for a given number of messages
|M|, find the smallest possible error probability among all (|M|, 𝜀) classical
communication protocols, which we denote by $\varepsilon^*_C(|\mathcal{M}|;\mathcal{N})$. In other words, the
problem is to determine
where the optimization is over encoding channels E with input space dimension
|M| and decoding channels D with output space dimension |M|. In this book, we
focus primarily on the problem of optimizing the number of transmitted bits rather
than the error probability, and so our primary quantity of interest is the one-shot
capacity 𝐶 𝜀 (N).
We now turn to establishing an upper bound on the one-shot classical capacity, and
our approach is similar to the approach outlined in Section 11.1.1. With this goal
in mind, along with the actual classical communication protocol, we also consider
the same protocol but performed over a useless channel as depicted in Figure 12.2.
This useless channel discards the state encoded with the message and replaces it
with some arbitrary (but fixed) state 𝜎𝐵 . In other words,
where the encoded states 𝜌 𝑚𝐴 are defined in (12.1.2). This channel is useless
because the state 𝜎𝐵 does not contain any information about the message. As with
entanglement-assisted classical communication, comparing this protocol over the
useless channel with the actual protocol allows us to obtain an upper bound on the
quantity log2 |M|, which we recall represents the number of bits that are transmitted
over the channel.
The state at the end of the protocol over the useless channel is the following
tensor-product state:
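\[
\tau^p_{M\widehat{M}} := \sum_{m\in\mathcal{M}} p(m)\,|m\rangle\!\langle m|_M \otimes \sum_{\widehat{m}\in\mathcal{M}} \operatorname{Tr}[\Lambda^{\widehat{m}}_B\sigma_B]\,|\widehat{m}\rangle\!\langle\widehat{m}|_{\widehat{M}}, \tag{12.1.24}
\]
which indicates that the decoded message system $\widehat{M}$ is independent of the message
system $M$ in this case. Now, recall from (12.1.9) that the state $\omega^p_{M\widehat{M}}$ at the end of
the actual protocol over the channel $\mathcal{N}$ is given by
\[
\omega^p_{M\widehat{M}} = \sum_{m,\widehat{m}\in\mathcal{M}} p(m)\operatorname{Tr}[\Lambda^{\widehat{m}}_B\,\mathcal{N}_{A\to B}(\rho^m_A)]\,|m\rangle\!\langle m|_M \otimes |\widehat{m}\rangle\!\langle\widehat{m}|_{\widehat{M}}. \tag{12.1.25}
\]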
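\[
\Phi_{MM'} := \frac{1}{|\mathcal{M}|}\sum_{m\in\mathcal{M}} |m\rangle\!\langle m|_M \otimes |m\rangle\!\langle m|_{M'} \tag{12.1.27}
\]
\[
\operatorname{Tr}[\Pi_{M\widehat{M}}\,\omega_{M\widehat{M}}] \geq 1 - \varepsilon. \tag{12.1.28}
\]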
for every (|M|, 𝜀) classical communication protocol. This means that, given a
particular choice of the encoding and decoding channels, if 𝑝 ∗err (E, D; N) ≤ 𝜀, then
the upper bound in (12.1.29) is the maximum number of bits that can be transmitted
over the channel N. The optimal value of this upper bound is realized by finding
the state $\sigma_{\widehat{M}}$ defining the useless channel that optimizes the quantity $I^{\varepsilon}_H(M;\widehat{M})_\omega$ in
addition to the measurement that achieves the 𝜀-hypothesis testing relative entropy
in (11.1.61). Importantly, a different choice of encoding and decoding produces a
different value for this upper bound. We would thus like to find an upper bound
that applies regardless of which specific protocol is chosen. In other words, we
would like an upper bound that is a function of the channel N only.
We now give a general upper bound on the number of transmitted bits that can
be communicated in any classical communication protocol. This result is stated
in Theorem 12.4, and the upper bound obtained therein holds independently of
the encoding and decoding channels used in the protocol and depends only on the
given communication channel N.
Let us start with an arbitrary (|M|, 𝜀) classical communication protocol over
the channel N, corresponding to, as described at the beginning of this chapter, a
message set M, an encoding channel E, and a decoding channel D. The error
criterion 𝑝 ∗err (E, D; N) ≤ 𝜀 holds by definition of an (|M|, 𝜀) protocol, which
implies the upper bound in (12.1.29) for the number log2 |M| of transmitted bits in
any (|M|, 𝜀) classical communication protocol. Using this upper bound, we obtain
the following:
Therefore,
\[
C^{\varepsilon}(\mathcal{N}) \leq \chi^{\varepsilon}_H(\mathcal{N}). \tag{12.1.31}
\]
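where the state $\omega_{M\widehat{M}}$ is defined in (12.1.26). Recall that this bound follows from
Lemma 11.4. Note that the state $\omega_{M\widehat{M}}$ can be written as
\[
\omega_{M\widehat{M}} = \mathcal{D}_{B\to\widehat{M}}(\theta_{MB}), \tag{12.1.33}
\]
where
\[
\theta_{MB} := \frac{1}{|\mathcal{M}|}\sum_{m\in\mathcal{M}} |m\rangle\!\langle m|_M \otimes \mathcal{N}_{A\to B}(\rho^m_A). \tag{12.1.34}
\]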
Now, from the data-processing inequality for the hypothesis testing relative
entropy under the action of the decoding channel $\mathcal{D}_{B\to\widehat{M}}$, we find that
\[
\theta_{MB} = \mathcal{N}_{A\to B}(\rho_{MA}), \tag{12.1.36}
\]
where $\rho_{MA}$ is the classical–quantum state $\rho^p_{MA}$ defined in (12.1.5) with $p$ equal to
the uniform probability distribution. Optimizing over all classical–quantum states
𝜉 𝑀 𝐴 then leads to
where 𝜁 𝑀 𝐵 = N 𝐴→𝐵 (𝜉 𝑀 𝐴 ) and we have used the definition in (7.11.93) for the
𝜀-hypothesis testing Holevo information of a channel. Note that this optimization
over all classical–quantum states is effectively an optimization over all possible
encoding channels E 𝑀 ′ →𝐴 that define the (|M|, 𝜀) protocol. Putting everything
together, we obtain
as required. ■
Since the bounds in (12.1.41) and (12.1.42) hold for every (|M|, 𝜀) classical
communication protocol over N, we have that
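\begin{align}
C^{\varepsilon}(\mathcal{N}) &\leq \frac{1}{1-\varepsilon}\left(\chi(\mathcal{N}) + h_2(\varepsilon)\right), \tag{12.1.43}\\
C^{\varepsilon}(\mathcal{N}) &\leq \widetilde{\chi}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right) \quad \forall\,\alpha > 1, \tag{12.1.44}
\end{align}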
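The two upper bounds (12.1.43) and (12.1.44) are simple closed-form expressions once the relevant Holevo quantity is known. The following Python sketch evaluates them for user-supplied numbers; the function names are hypothetical, and the inputs `chi` and `chi_alpha` are treated as given values rather than being computed from a channel.

```python
import math


def binary_entropy(eps):
    """Binary entropy h2(eps) in bits, with the convention h2(0) = h2(1) = 0."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)


def weak_converse_bound(chi, eps):
    """Right-hand side of (12.1.43): (chi + h2(eps)) / (1 - eps)."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    return (chi + binary_entropy(eps)) / (1 - eps)


def renyi_bound(chi_alpha, alpha, eps):
    """Right-hand side of (12.1.44), valid for alpha > 1."""
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    return chi_alpha + alpha / (alpha - 1) * math.log2(1 / (1 - eps))


# Illustrative inputs: Holevo-type quantities equal to 1 bit, error threshold 1/2.
print(weak_converse_bound(1.0, 0.5))  # (1 + 1) / (1/2)
print(renyi_bound(1.0, 2.0, 0.5))     # 1 + 2 * log2(2)
```

Both bounds collapse to the supplied Holevo-type quantity when `eps` is zero, and both grow without limit as `eps` approaches one.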
Having obtained upper bounds on the number of transmitted bits in the previous
section, let us now determine lower bounds. The key result of this section is
Proposition 12.5, resulting in Theorem 12.6, which contains a lower bound on the
number of transmitted bits for every (|M|, 𝜀) classical communication protocol.
As we saw in the previous chapter on entanglement-assisted classical
communication, in order to obtain a lower bound on the number of transmitted bits,
we should devise an explicit classical communication protocol (M, E, D) such that
the maximal error probability satisfies 𝑝 ∗err (E, D; N) ≤ 𝜀 for 𝜀 ∈ [0, 1]. Recall
from (12.1.15) that the maximal error probability is defined as
where, for all $m \in \mathcal{M}$, the message error probability $p_{\text{err}}(m,(\mathcal{E},\mathcal{D});\mathcal{N})$ is defined in
(12.1.11) as
\[
p_{\text{err}}(m,(\mathcal{E},\mathcal{D});\mathcal{N}) = 1 - q(m|m), \tag{12.1.46}
\]
with $q(\widehat{m}|m)$ being the probability of identifying the message sent as $\widehat{m}$ given that
the message $m$ was sent.
The classical communication protocol discussed here is related to the
entanglement-assisted classical communication protocol in Section 11.1.3. We suppose
at first that Alice and Bob have some shared randomness prior to communication.
This shared randomness is strictly speaking not part of the classical communication
protocol as outlined at the beginning of Section 12.1, but the advantage of using
it is that we can directly employ all of the developments for the position-based
coding and sequential decoding strategy from Section 11.1.3. We then perform
what is called derandomization and expurgation (both of which we outline below)
ultimately to remove this shared randomness from the protocol and thus obtain
the desired lower bound on the number of transmitted bits for the true unassisted
classical communication protocol.
Remark: The quantity $\overline{\chi}^{\varepsilon}_H(\mathcal{N})$ defined in the statement of Proposition 12.5 above is similar
to the quantity $\chi^{\varepsilon}_H(\mathcal{N})$ defined in (7.11.93), except that it is defined with respect to the mutual
information $\overline{I}^{\varepsilon}_H(X;B)_\rho$ that we encountered in Proposition 11.8, which does not involve an
optimization over states $\sigma_B$.
Combining (12.1.57) and (12.1.59), we can already conclude that, when shared
randomness is available for free, a lower bound on the number of transmitted bits
is given by (12.1.59). The condition in (12.1.57) on the message error probability
implies that the average error probability 𝑝 err ((E′, D′); 𝑝), with 𝑝 the uniform
distribution over M′, satisfies
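\[
p_{\text{err}}((\mathcal{E}',\mathcal{D}');p,\mathcal{N}) = \frac{1}{|\mathcal{M}'|}\sum_{m\in\mathcal{M}'} p_{\text{err}}(m,(\mathcal{E}',\mathcal{D}'),\mathcal{N}) \leq \varepsilon. \tag{12.1.60}
\]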
Let us now use the expression in (11.1.104) to derive an exact expression for
the average error probability. The expression in (11.1.104) is
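This implies that the measurement operator $\Lambda^*_{B'B}$ that achieves the optimal value
for the quantity $D^{\varepsilon-\eta}_H(\mathcal{N}_{A'\to B}(\rho_{A'B'})\,\|\,\rho_{B'}\otimes\mathcal{N}_{A'\to B}(\rho_{A'}))$ can be taken to have the
form
\[
\Lambda^*_{B'B} = \sum_{x\in\mathcal{X}} |x\rangle\!\langle x|_{B'} \otimes M^x_B. \tag{12.1.68}
\]
Now using the fact that
\[
\sqrt{\Lambda^*_{B'B}} = \sum_{x\in\mathcal{X}} |x\rangle\!\langle x|_{B'} \otimes \sqrt{M^x_B}, \tag{12.1.69}
\]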
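where $R$ is a reference system held by Bob to help with the decoding, and the
projector $\Pi^x_{BR}$ is given by
\begin{align}
\Pi^x_{BR} &:= (U^x_{BR})^\dagger\left(\mathbb{1}_B \otimes |1\rangle\!\langle 1|_R\right)U^x_{BR}, \tag{12.1.71}\\
U^x_{BR} &:= \sqrt{\mathbb{1}_B - M^x_B} \otimes \left(|0\rangle\!\langle 0|_R + |1\rangle\!\langle 1|_R\right) + \sqrt{M^x_B} \otimes \left(|1\rangle\!\langle 0|_R - |0\rangle\!\langle 1|_R\right). \tag{12.1.72}
\end{align}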
This in turn implies that the measurement operators $P_i$, which are used for the
sequential decoding and are defined in (11.1.100), have the form
\[
P_i = \sum_{x_1,\dots,x_{|\mathcal{M}'|}\in\mathcal{X}} |x\rangle\!\langle x|_{B'_1\cdots B'_{|\mathcal{M}'|}} \otimes P_i^{x_i}, \tag{12.1.73}
\]
we find that
$\omega^{m}_{B'_1\cdots B'_{|\mathcal{M}'|} B R_1\cdots R_{|\mathcal{M}'|}}$
\[
\Omega^{x_m}_m := \widehat{P}_1^{x_1}\cdots\widehat{P}_{m-1}^{x_{m-1}}\,P_m^{x_m}\,\widehat{P}_{m-1}^{x_{m-1}}\cdots\widehat{P}_1^{x_1}. \tag{12.1.80}
\]
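\[
\sum_{x_1,\dots,x_{|\mathcal{M}'|}\in\mathcal{X}} r(x_1)\cdots r(x_{|\mathcal{M}'|})\,p_{\text{err}}(\mathcal{C};p) \leq \sum_{x_1,\dots,x_{|\mathcal{M}'|}\in\mathcal{X}} r(x_1)\cdots r(x_{|\mathcal{M}'|})\,u_{\text{err}}(\mathcal{C};p) \leq \varepsilon, \tag{12.1.84}
\]
where
\[
p_{\text{err}}(\mathcal{C};p,\mathcal{N}) := 1 - \sum_{m\in\mathcal{M}'}\frac{1}{|\mathcal{M}'|}\operatorname{Tr}\!\left[\Omega^{x_m}_m\left(\rho^{x_m}_B \otimes |0\rangle\!\langle 0|_{R_1\cdots R_{|\mathcal{M}'|}}\right)\right] \tag{12.1.85}
\]
is the average error probability under a code $\mathcal{C}$ in which each message $m$ is encoded
as $m \mapsto x_m \mapsto \rho^{x_m}_A$ and
\[
u_{\text{err}}(\mathcal{C};p,\mathcal{N}) := \sum_{m\in\mathcal{M}'}\frac{1}{|\mathcal{M}'|}\left(\gamma_{\mathrm{I}}\operatorname{Tr}[(I_B - M^{x_m}_B)\rho^{x_m}_B] + \gamma_{\mathrm{II}}\sum_{i=1}^{m-1}\operatorname{Tr}[M^{x_i}_B\rho^{x_m}_B]\right) \tag{12.1.86}
\]
is an upper bound on the average error probability $p_{\text{err}}(\mathcal{C};p,\mathcal{N})$. The decoding
is defined by the measurement operators $\{\Omega^{x_m}_m\}_{m\in\mathcal{M}'}$. Note that the code $\mathcal{C}$ is a
random variable, in the sense that the string $x_1,\dots,x_{|\mathcal{M}'|}$ of length $|\mathcal{M}'|$ is used for
the encoding and decoding with probability $r(x_1)\cdots r(x_{|\mathcal{M}'|})$.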
Since the minimum does not exceed the average, the inequality in (12.1.84)
implies that there exists a code $\mathcal{C}^*$, with corresponding string $x^*_1,\dots,x^*_{|\mathcal{M}'|}$, such
that
\[
u_{\text{err}}(\mathcal{C}^*;p,\mathcal{N}) \leq \varepsilon, \tag{12.1.87}
\]
and in turn, via Theorem 11.7, that
By choosing this particular code, we can now follow through the entire argument
above without the shared randomness (in the form of the state 𝜌 𝐴′ 𝐵′ ) in order to
conclude that with the code C∗ , the number of transmitted bits is given by (12.1.59),
and the average error probability of the code is bounded from above by 𝜀. This
completes the derandomization part of the proof.
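The derandomization step rests only on the observation that a minimum never exceeds a weighted average. The toy Python sketch below enumerates every codebook over a small alphabet, weights each by $r(x_1)\cdots r(x_{|\mathcal{M}'|})$, and confirms that the best codebook does at least as well as the average. The error function `collision_fraction` is a hypothetical stand-in for $u_{\text{err}}$ and is not the quantity in (12.1.86); the distribution `r` is likewise an assumption.

```python
import itertools


def expected_and_best(r, num_messages, error_of_code):
    """Return (average error over random codebooks, best codebook, its error).

    Each codebook is a tuple (x_1, ..., x_k) drawn i.i.d. from the distribution r.
    """
    average = 0.0
    best_code, best_error = None, float("inf")
    for code in itertools.product(range(len(r)), repeat=num_messages):
        weight = 1.0
        for x in code:
            weight *= r[x]
        err = error_of_code(code)
        average += weight * err
        # Only codebooks that occur with non-zero probability are candidates.
        if weight > 0 and err < best_error:
            best_code, best_error = code, err
    return average, best_code, best_error


def collision_fraction(code):
    """Hypothetical error proxy: fraction of message pairs sharing a codeword."""
    pairs = list(itertools.combinations(range(len(code)), 2))
    return sum(code[i] == code[j] for i, j in pairs) / len(pairs)


average, best_code, best_error = expected_and_best(
    [0.5, 0.25, 0.25], 3, collision_fraction
)
print(average, best_code, best_error)
```

In the proof the same reasoning is applied to $u_{\text{err}}$, so the selected code inherits the bound $\varepsilon$ without any enumeration being needed.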
Finally, we are interested in a code, call it (E, D), satisfying the maximal error
probability criterion 𝑝 ∗err (E, D; N) ≤ 𝜀 instead of the average error probability
criterion. To find such a code, we can apply expurgation to the code C∗ defined
above. Formally, this means the following: since we have a code satisfying
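\[
u_{\text{err}}(m,\mathcal{C}^*;\mathcal{N}) := \gamma_{\mathrm{I}}\operatorname{Tr}[(I_B - M^{x_m}_B)\rho^{x_m}_B] + \gamma_{\mathrm{II}}\sum_{i=1}^{m-1}\operatorname{Tr}[M^{x_i}_B\rho^{x_m}_B]. \tag{12.1.90}
\]
We thus define a new message set $\mathcal{M} \subset \mathcal{M}'$, with $|\mathcal{M}| = \frac{|\mathcal{M}'|}{2}$, by removing all but
those messages in $\mathcal{M}'$ whose encodings are given by $c_1,\dots,c_{|\mathcal{M}'|/2}$. Let $\mathcal{C}$ denote
the expurgated code. Due to the fact that all of the terms in $u_{\text{err}}(m,\mathcal{C}^*;\mathcal{N})$ are
non-negative, we find for all $m \in \mathcal{M}$ that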
where
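\[
u_{\text{err}}(m,\mathcal{C};\mathcal{N}) := \gamma_{\mathrm{I}}\operatorname{Tr}[(I_B - M^{c_m}_B)\rho^{c_m}_B] + \gamma_{\mathrm{II}}\sum_{i=1}^{m-1}\operatorname{Tr}[M^{c_i}_B\rho^{c_m}_B]. \tag{12.1.92}
\]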
Again applying the quantum union bound (Theorem 11.7), we then find that
where
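\[
p_{\text{err}}(m,\mathcal{C};\mathcal{N}) := 1 - \operatorname{Tr}\!\left[P_m^{c_m}\widehat{P}_{m-1}^{c_{m-1}}\cdots\widehat{P}_1^{c_1}\left(\rho^{c_m}_B \otimes |0\rangle\!\langle 0|_{R_1\cdots R_{|\mathcal{M}'|}}\right)\widehat{P}_1^{c_1}\cdots\widehat{P}_{m-1}^{c_{m-1}}P_m^{c_m}\right]. \tag{12.1.94}
\]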
Therefore, we can use (12.1.59) to obtain the following for the number log2 |M|
of transmitted bits with the reduced message set:
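\begin{align}
\log_2|\mathcal{M}| = \log_2\!\left(\frac{|\mathcal{M}'|}{2}\right) &= \log_2|\mathcal{M}'| - \log_2(2) \tag{12.1.96}\\
&= \overline{\chi}^{\varepsilon-\eta}_H(\mathcal{N}) - \log_2\!\left(\frac{4\varepsilon}{\eta^2}\right) - \log_2(2) \tag{12.1.97}\\
&= \overline{\chi}^{\varepsilon-\eta}_H(\mathcal{N}) - \log_2\!\left(\frac{8\varepsilon}{\eta^2}\right). \tag{12.1.98}
\end{align}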
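Since $\varepsilon$ and $\eta$ are arbitrary, we have shown that for all $\varepsilon \in (0,1)$ and $\eta \in (0,\varepsilon)$,
there exists an $(|\mathcal{M}|, 2\varepsilon)$ classical communication protocol satisfying
$\log_2|\mathcal{M}| = \overline{\chi}^{\varepsilon-\eta}_H(\mathcal{N}) - \log_2\!\left(\frac{8\varepsilon}{\eta^2}\right)$. By the substitution $2\varepsilon \to \varepsilon$, we can finally say that for all
$\varepsilon \in (0,1)$ and $\eta \in \left(0,\frac{\varepsilon}{2}\right)$, there exists an $(|\mathcal{M}|,\varepsilon)$ classical communication protocol
satisfying
\[
\log_2|\mathcal{M}| = \overline{\chi}^{\frac{\varepsilon}{2}-\eta}_H(\mathcal{N}) - \log_2\!\left(\frac{4\varepsilon}{\eta^2}\right). \tag{12.1.99}
\]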
This concludes the proof. ■
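The expurgation step in the proof above can be pictured with a few lines of Python: sort the messages by their individual error values and keep the better half. If the average over all messages is at most $\varepsilon$, then every retained message has error at most $2\varepsilon$, since otherwise more than half of the messages would each exceed twice the average. The function name and the list of error values are hypothetical, and the sketch ignores the re-indexing of the sequential decoder that the proof handles via the non-negativity of the terms in $u_{\text{err}}$.

```python
def expurgate(per_message_error):
    """Return the indices of the half of the messages with the smallest errors."""
    order = sorted(range(len(per_message_error)), key=lambda m: per_message_error[m])
    return sorted(order[: len(order) // 2])


# Hypothetical per-message error values for a code with 8 messages.
errors = [0.02, 0.30, 0.05, 0.01, 0.12, 0.10, 0.15, 0.05]
average = sum(errors) / len(errors)

kept = expurgate(errors)
worst_kept = max(errors[m] for m in kept)
print(kept, worst_kept, 2 * average)
```

Halving the message set costs exactly one bit, which is the $\log_2(2)$ term appearing in (12.1.96).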
Here,
\[
\overline{\chi}_\alpha(\mathcal{N}) := \sup_{\rho_{XA}} \overline{I}_\alpha(X;B)_\omega, \tag{12.1.101}
\]
Remark: The quantity $\overline{\chi}_\alpha(\mathcal{N})$ defined in the statement of Theorem 12.6 above is similar to
the quantity $\chi_\alpha(\mathcal{N})$ defined in (7.11.94), except that it is defined with respect to the mutual
information $\overline{I}_\alpha(X;B)_\omega$ that we encountered in Theorem 11.9, which does not involve an
optimization over states $\sigma_B$.
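Proof: From Proposition 12.5, we know that for all $\varepsilon \in (0,1)$ and $\eta \in \left(0,\frac{\varepsilon}{2}\right)$, there
exists an $(|\mathcal{M}|,\varepsilon)$ classical communication protocol such that
\[
\log_2|\mathcal{M}| = \overline{\chi}^{\frac{\varepsilon}{2}-\eta}_H(\mathcal{N}) - \log_2\!\left(\frac{4\varepsilon}{\eta^2}\right). \tag{12.1.103}
\]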
Proposition 7.72 relates the hypothesis testing relative entropy to the Petz–Rényi
relative entropy according to
\[
D^{\varepsilon}_H(\rho\|\sigma) \geq D_\alpha(\rho\|\sigma) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{\varepsilon}\right) \tag{12.1.104}
\]
Combining this inequality with (12.1.103), we immediately get the desired re-
sult. ■
Since the inequality in (12.1.100) holds for every (|M|, 𝜀) classical communi-
cation protocol, we have that
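\[
C^{\varepsilon}(\mathcal{N}) \geq \overline{\chi}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{\frac{\varepsilon}{2}-\eta}\right) - \log_2\!\left(\frac{4\varepsilon}{\eta^2}\right) \tag{12.1.106}
\]
for all $\alpha \in (0,1)$, $\varepsilon \in (0,1)$, and $\eta \in \left(0,\frac{\varepsilon}{2}\right)$.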
Figure 12.3: The most general classical communication protocol over multiple
uses ($n \geq 1$) of a quantum channel $\mathcal{N}$. Alice, who wishes to send a
message $m$ selected from a set $\mathcal{M}$, first encodes the message into a quantum
state on $n$ quantum systems using a classical–quantum encoding channel $\mathcal{E}$. She
then sends each quantum system through the channel $\mathcal{N}$. After Bob receives the
systems, he performs a collective measurement on them, using the outcome of
the measurement to give an estimate $\widehat{m}$ of the message $m$ sent to him by Alice.
channel N only once, Alice encodes the message into 𝑛 ≥ 1 quantum systems
𝐴1 , . . . , 𝐴𝑛 , all with the same dimension as 𝐴, and sends each one of these through
the channel N. We call this the asymptotic setting because the number 𝑛 of channel
uses can be arbitrarily large.
Recall that in the case of entanglement-assisted classical communication, we
showed that encoding channels that entangle the 𝑛 systems 𝐴1 , . . . , 𝐴𝑛 do not help
to achieve higher rates in the asymptotic setting. This is due to the additivity of the
mutual information and the additivity of the sandwiched Rényi mutual information
of a channel for all channels and $\alpha > 1$. In the case of classical communication that
we consider in this chapter, the analogous additivity statement is known to be
false in general for the Holevo information of a quantum channel (please consult the
Bibliographic Notes in Section 12.5). That is, in principle there exists a channel
for which the Holevo information is not additive. Therefore, unlike entanglement-
assisted classical communication, concrete expressions for the classical capacity
exist only for specific classes of channels.
The analysis of the classical communication protocol in the asymptotic setting
is almost exactly the same as in the one-shot setting. This is due to the fact that 𝑛
independent uses of the channel N can be regarded as a single use of the channel
N ⊗𝑛 . So the only change that needs to be made is to replace N with N ⊗𝑛 and to
define the states and POVM elements as acting on 𝑛 systems instead of just one. In
particular, the state at the end of the protocol presented in (12.1.8)–(12.1.9) at the
beginning of Section 12.1 is
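\[
\omega^p_{M\widehat{M}} = \left(\mathcal{D}_{B^n\to\widehat{M}} \circ \mathcal{N}^{\otimes n}_{A\to B} \circ \mathcal{E}_{M'\to A^n}\right)(\Phi^p_{MM'}), \tag{12.2.1}
\]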
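where $p$ is the prior probability distribution over the message set $\mathcal{M}$, the encoding
channel $\mathcal{E}_{M'\to A^n}$ is defined as
\[
\mathcal{E}_{M'\to A^n}(|m\rangle\!\langle m|_{M'}) = \rho^m_{A^n} \quad \forall\, m \in \mathcal{M}, \tag{12.2.2}
\]
and the decoding channel $\mathcal{D}_{B^n\to\widehat{M}}$, with associated POVM $\{\Lambda^m_{B^n}\}_{m\in\mathcal{M}}$, is defined as
\[
\mathcal{D}_{B^n\to\widehat{M}}(\tau_{B^n}) = \sum_{m\in\mathcal{M}} \operatorname{Tr}[\Lambda^m_{B^n}\tau_{B^n}]\,|m\rangle\!\langle m|_{\widehat{M}}. \tag{12.2.3}
\]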
Then, for every given code specified by the encoding and decoding channels, the
definitions of the message error probability of the code, the average error probability
of the code, and the maximal error probability of the code all follow analogously
from their definitions in (12.1.11), (12.1.13), and (12.1.15), respectively, in the
one-shot setting.
As we prove in Appendix A,
In other words, a rate 𝑅 is achievable if the optimal error probability for a sequence
of protocols with rate 𝑅 − 𝛿, 𝛿 > 0, vanishes as the number 𝑛 of uses of N increases.
for every quantum channel N. We can also write the strong converse classical
capacity as
\[
\widetilde{C}(\mathcal{N}) = \sup_{\varepsilon\in[0,1)}\limsup_{n\to\infty}\frac{1}{n}C^{\varepsilon}(\mathcal{N}^{\otimes n}). \tag{12.2.13}
\]
See Appendix A for a proof.
Having defined the classical capacity of a quantum channel, as well as the strong
converse capacity, we now state one of the main theorems of this chapter, which
gives us a formal expression for the classical capacity of every quantum channel.
\[
C(\mathcal{N}) = \chi_{\text{reg}}(\mathcal{N}) := \lim_{n\to\infty}\frac{1}{n}\chi(\mathcal{N}^{\otimes n}). \tag{12.2.14}
\]
Remark: The quantity $\chi_{\text{reg}}(\mathcal{N}) := \lim_{n\to\infty}\frac{1}{n}\chi(\mathcal{N}^{\otimes n})$ is called the regularization of the Holevo
information. It can be shown that the limit in the definition of $\chi_{\text{reg}}(\mathcal{N})$ does indeed exist (please
consult the Bibliographic Notes in Section 12.5).
The achievability and weak converse proofs establish that the classical capacity
is equal to the regularized Holevo information: 𝐶 (N) = 𝜒reg (N). Theorem 12.13
and the inequality in (12.2.12) allow us to conclude that
We first establish in Section 12.2.1 that the rate 𝜒reg (N) is achievable for
classical communication over N. Then, in Section 12.2.2, we prove that 𝜒reg (N)
is a weak converse rate. We prove that the sandwiched Rényi Holevo information
of an entanglement-breaking channel is additive in Section 12.2.3. With this
additivity result, we prove in Section 12.2.4 that $C(\mathcal{N}) = \widetilde{C}(\mathcal{N}) = \chi(\mathcal{N})$ for all
entanglement-breaking channels.
In this section, we prove that 𝜒reg (N) is an achievable rate for classical communi-
cation over N.
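First, recall from Theorem 12.6 that for all $\varepsilon \in (0,1)$ and $\eta \in \left(0,\frac{\varepsilon}{2}\right)$, there
exists an $(|\mathcal{M}|,\varepsilon)$ classical communication protocol over $\mathcal{N}$ such that
\[
\log_2|\mathcal{M}| \geq \overline{\chi}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{\frac{\varepsilon}{2}-\eta}\right) - \log_2\!\left(\frac{4\varepsilon}{\eta^2}\right) \tag{12.2.17}
\]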
Proof: The inequality (12.2.17) holds for every channel N, which means that it
holds for N ⊗𝑛 . Applying the inequality in (12.2.17) to N ⊗𝑛 and dividing both sides
by 𝑛, we obtain
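\[
\frac{1}{n}\log_2|\mathcal{M}| \geq \frac{1}{n}\overline{\chi}_\alpha(\mathcal{N}^{\otimes n}) + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{1}{\frac{\varepsilon}{2}-\eta}\right) - \frac{1}{n}\log_2\!\left(\frac{4\varepsilon}{\eta^2}\right) \tag{12.2.21}
\]
for all $\alpha \in (0,1)$. Now, letting $\eta = \frac{\varepsilon}{4}$ and using the fact that $\alpha - 1$ is negative for
$\alpha \in (0,1)$, the following inequality holds for all $\alpha \in (0,1)$:
\[
\frac{1}{n}\log_2|\mathcal{M}| \geq \overline{\chi}_\alpha(\mathcal{N}) - \frac{1}{n(1-\alpha)}\log_2\!\left(\frac{4}{\varepsilon}\right) - \frac{4}{n}. \tag{12.2.23}
\]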
In other words, for all 𝜀 ∈ (0, 1], there exists an (𝑛, |M|, 𝜀) classical communication
protocol such that (12.2.20) is satisfied. This concludes the proof. ■
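For readers who want to check the constants in (12.2.23), the following worked computation spells out the substitution $\eta = \frac{\varepsilon}{4}$ in the correction terms of (12.2.21). It is a sketch of the arithmetic only; replacing $\frac{1}{n}\overline{\chi}_\alpha(\mathcal{N}^{\otimes n})$ by $\overline{\chi}_\alpha(\mathcal{N})$ additionally requires $\overline{\chi}_\alpha(\mathcal{N}^{\otimes n}) \geq n\,\overline{\chi}_\alpha(\mathcal{N})$, which we assume is what the omitted intermediate step supplies.

```latex
\begin{align*}
\frac{\varepsilon}{2} - \eta &= \frac{\varepsilon}{4}
  &&\Longrightarrow&
  \log_2\!\left(\frac{1}{\frac{\varepsilon}{2}-\eta}\right) &= \log_2\!\left(\frac{4}{\varepsilon}\right),\\
\frac{4\varepsilon}{\eta^2} &= \frac{64}{\varepsilon}
  &&\Longrightarrow&
  \log_2\!\left(\frac{4\varepsilon}{\eta^2}\right) &= 4 + \log_2\!\left(\frac{4}{\varepsilon}\right).
\end{align*}
Hence, for $\alpha \in (0,1)$,
\begin{align*}
\frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{4}{\varepsilon}\right)
  - \frac{1}{n}\left(4 + \log_2\!\left(\frac{4}{\varepsilon}\right)\right)
&= -\frac{1}{n}\left(\frac{\alpha}{1-\alpha} + 1\right)\log_2\!\left(\frac{4}{\varepsilon}\right) - \frac{4}{n}\\
&= -\frac{1}{n(1-\alpha)}\log_2\!\left(\frac{4}{\varepsilon}\right) - \frac{4}{n}.
\end{align*}
```

The last line uses $\frac{\alpha}{1-\alpha} + 1 = \frac{1}{1-\alpha}$, which is where the factor $\frac{1}{1-\alpha}$ in (12.2.23) comes from.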
The inequality in (12.2.20) gives us, for every 𝜀 ∈ (0, 1] and 𝑛 ∈ N, a lower
bound on the rate of a corresponding (𝑛, |M|, 𝜀) classical communication protocol,
which is known to exist due to Proposition 12.5. If instead we fix a particular
communication rate $R$ by letting $|\mathcal{M}| = 2^{nR}$, then we can rearrange the inequality
in (12.2.20) to obtain an exponentially decaying upper bound on the maximal
Now, making use of the inequality in (12.2.20) of Corollary 12.14, there exists
an (𝑛, |M|, 𝜀) protocol, with 𝑛 and 𝜀 chosen as above, such that
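\[
\frac{1}{n}\log_2|\mathcal{M}| \geq \overline{\chi}_\alpha(\mathcal{N}) - \frac{1}{n(1-\alpha)}\log_2\!\left(\frac{4}{\varepsilon}\right) - \frac{4}{n}. \tag{12.2.29}
\]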
Rearranging the right-hand side of this inequality, and using (12.2.26)–(12.2.28),
we find that
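\[
\frac{1}{n}\log_2|\mathcal{M}| \geq \chi(\mathcal{N}) - \left(\chi(\mathcal{N}) - \overline{\chi}_\alpha(\mathcal{N}) + \frac{1}{n(1-\alpha)}\log_2\!\left(\frac{4}{\varepsilon}\right) + \frac{4}{n}\right) \tag{12.2.30}
\]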
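We thus have $\chi(\mathcal{N}) - \delta \leq \frac{1}{n}\log_2|\mathcal{M}|$. Recall that if an $(n,|\mathcal{M}|,\varepsilon)$ protocol exists,
then an $(n,|\mathcal{M}'|,\varepsilon)$ protocol also exists for all $\mathcal{M}'$ satisfying $|\mathcal{M}'| \leq |\mathcal{M}|$. We thus conclude
that there exists an $(n, 2^{n(R-\delta)}, \varepsilon)$ classical communication protocol with $R = \chi(\mathcal{N})$ for
all sufficiently large $n$ such that (12.2.28) holds. Since $\varepsilon$ and $\delta$ are arbitrary, we
conclude that, for all $\varepsilon \in (0,1]$, $\delta > 0$, and sufficiently large $n$, there exists an
$(n, 2^{n(\chi(\mathcal{N})-\delta)}, \varepsilon)$ classical communication protocol. This means that $\chi(\mathcal{N})$ is an
achievable rate, and thus that $C(\mathcal{N}) \geq \chi(\mathcal{N})$.
Now, we can repeat the arguments above for the tensor-power channel $\mathcal{N}^{\otimes k}$ with
$k \geq 1$, and we conclude that $\frac{1}{k}\chi(\mathcal{N}^{\otimes k})$ is an achievable rate. Since this holds for all
$k$, we conclude that $\lim_{k\to\infty}\frac{1}{k}\chi(\mathcal{N}^{\otimes k}) = \chi_{\text{reg}}(\mathcal{N})$ is an achievable rate. Therefore,
$C(\mathcal{N}) \geq \chi_{\text{reg}}(\mathcal{N})$.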
Using arguments similar to those given in Appendix 11.C, we can make the
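following statement: there exists a sequence $\{(n, 2^{nR_n}, \varepsilon_n)\}_{n\in\mathbb{N}}$ of $(n,|\mathcal{M}|,\varepsilon)$
classical communication protocols over $\mathcal{N}$, such that $\liminf_{n\to\infty} R_n \geq \chi(\mathcal{N})$ and
$\lim_{n\to\infty}\varepsilon_n = 0$. If we consider a sequence $\{(n, 2^{nR}, \varepsilon_n)\}_{n\in\mathbb{N}}$ of $(n,|\mathcal{M}|,\varepsilon)$ classical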
communication protocols, this time keeping the rate at an arbitrary (but fixed) value
𝑅 < 𝜒(N) and varying the error probability, we conclude that there exists a sequence
of protocols for which the error probabilities 𝜀 𝑛 approach zero exponentially fast as
𝑛 → ∞.
We now show that the regularized Holevo information 𝜒reg (N) is a weak converse
rate. The result is to establish that 𝐶 (N) ≤ 𝜒reg (N) and therefore that 𝐶 (N)
= 𝜒reg (N), completing the proof of Theorem 12.13.
Let us first recall from Theorem 12.4 that for every quantum channel N we have
the following: for all 𝜀 ∈ [0, 1) and (|M|, 𝜀) classical communication protocols
over N,
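\begin{align}
\log_2|\mathcal{M}| &\leq \frac{1}{1-\varepsilon}\left(\chi(\mathcal{N}) + h_2(\varepsilon)\right), \tag{12.2.33}\\
\log_2|\mathcal{M}| &\leq \widetilde{\chi}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right), \quad \forall\,\alpha > 1. \tag{12.2.34}
\end{align}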
To obtain these inequalities, we considered a classical communication protocol
over a useless channel and used the hypothesis testing relative entropy to compare
this protocol with the actual protocol over the channel N. The useless channel
in the asymptotic setting is analogous to the one in Figure 12.2 and is shown
in Figure 12.4. A simple corollary of Theorem 12.4, which is relevant for the
asymptotic setting, is the following.
Proof: Since the inequalities in (12.2.33) and (12.2.34) of Theorem 12.4 hold for
every channel N, they hold for the channel N ⊗𝑛 . Therefore, applying (12.2.33) and
(12.2.34) to N ⊗𝑛 and dividing both sides by 𝑛, we immediately obtain the desired
result. ■
The inequalities in the corollary above give us, for every 𝜀 ∈ [0, 1) and
𝑛 ∈ N, an upper bound on the size |M| of the message set we can take for an
arbitrary (𝑛, |M|, 𝜀) classical communication protocol. If instead we fix a particular
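communication rate $R$ by letting $|\mathcal{M}| = 2^{nR}$, then we can obtain a lower bound on
the maximal error probability of an arbitrary $(n, 2^{nR}, \varepsilon)$ classical communication
protocol. Specifically, using (12.2.36), we find that
\[
\varepsilon \geq 1 - 2^{-n\left(\frac{\alpha-1}{\alpha}\right)\left(R - \frac{1}{n}\widetilde{\chi}_\alpha(\mathcal{N}^{\otimes n})\right)} \tag{12.2.37}
\]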
Suppose that 𝑅 is an achievable rate. Then, by definition, for all 𝜀 ∈ (0, 1], 𝛿 > 0,
and sufficiently large $n$, there exists an $(n, 2^{n(R-\delta)}, \varepsilon)$ classical communication
protocol over N. For all such protocols, the inequality (12.2.35) in Corollary 12.15
holds, so that
\[
R - \delta \leq \frac{1}{1-\varepsilon}\left(\frac{1}{n}\chi(\mathcal{N}^{\otimes n}) + \frac{1}{n}h_2(\varepsilon)\right). \tag{12.2.40}
\]
Since this bound holds for all 𝑛, it holds in the limit 𝑛 → ∞, so that
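\begin{align}
R &\leq \lim_{n\to\infty}\frac{1}{1-\varepsilon}\left(\frac{1}{n}\chi(\mathcal{N}^{\otimes n}) + \frac{1}{n}h_2(\varepsilon)\right) + \delta \tag{12.2.41}\\
&= \frac{1}{1-\varepsilon}\lim_{n\to\infty}\frac{1}{n}\chi(\mathcal{N}^{\otimes n}) + \delta. \tag{12.2.42}
\end{align}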
Then, since this inequality holds for all 𝜀, 𝛿 > 0, we then conclude that
1 1 1
𝑅 ≤ lim lim 𝜒(N ) + 𝛿 = lim 𝜒(N ⊗𝑛 ).
⊗𝑛
(12.2.43)
𝜀,𝛿→0 1 − 𝜀 𝑛→∞ 𝑛 𝑛→∞ 𝑛
We have thus shown that if 𝑅 is an achievable rate, then 𝑅 ≤ 𝜒reg (N). The
contrapositive of this statement is that if 𝑅 > 𝜒reg (N), then 𝑅 is not an achievable
rate. By definition, therefore, 𝜒reg (N) is a weak converse rate.
Recall that Theorem 12.13 only gives an expression for the capacity $C(\mathcal{N})$,
and not for the strong converse capacity $\widetilde{C}(\mathcal{N})$. The sandwiched Rényi Holevo
information $\widetilde{\chi}_\alpha(\mathcal{N})$ of a channel $\mathcal{N}$ can be used to obtain the upper bound in
(12.2.36), holding for every $(n, |\mathcal{M}|, \varepsilon)$ protocol. This inequality then leads to an
expression for the strong converse capacity in the case that $\widetilde{\chi}_\alpha(\mathcal{N})$ happens to be
additive for $\mathcal{N}$. We now, therefore, address this question regarding the additivity of
the sandwiched Rényi Holevo information.
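Although we have shown that the classical capacity $C(\mathcal{N})$ of a channel $\mathcal{N}$ is given
by the regularized Holevo information $\chi_{\text{reg}}(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n}\chi(\mathcal{N}^{\otimes n})$, as mentioned
earlier, without the additivity of $\chi(\mathcal{N})$ this result is not particularly helpful since it
is not known how to compute the regularized Holevo information in general.
Note, however, that for all channels $\mathcal{N}_1$ and $\mathcal{N}_2$ we always have the superadditivity
of the Holevo information, i.e.,
\[ \chi(\mathcal{N}_1 \otimes \mathcal{N}_2) \geq \chi(\mathcal{N}_1) + \chi(\mathcal{N}_2). \tag{12.2.44} \]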
This follows by performing exactly the same steps in (11.2.44)–(11.2.52), but with
the systems 𝑅1 and 𝑅2 therein taken to be classical systems. Therefore, to prove the
additivity of 𝜒 for a channel N, it suffices to show that 𝜒(N ⊗ M) ≤ 𝜒(N) + 𝜒(M).
Similarly, the sandwiched Rényi Holevo information is superadditive; i.e., for
all 𝛼 ≥ 1 and all channels N1 and N2 , it holds that
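\[ \widetilde{\chi}_\alpha(\mathcal{N}_1 \otimes \mathcal{N}_2) \geq \widetilde{\chi}_\alpha(\mathcal{N}_1) + \widetilde{\chi}_\alpha(\mathcal{N}_2). \tag{12.2.45} \]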
Now, the proof of (12.2.45) proceeds similarly to the proof of the corresponding
inequality (12.2.44) for the Holevo information. By restricting the optimization in
the definition of $\widetilde{\chi}_\alpha(\mathcal{N}_1 \otimes \mathcal{N}_2)$ to product states, and letting $\rho'_{X_1 X_2 B_1 B_2}$ be defined as
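\begin{align}
\widetilde{\chi}_\alpha(\mathcal{N}_1 \otimes \mathcal{N}_2) &= \sup_{\rho} \widetilde{I}_\alpha(X_1 X_2; B_1 B_2)_{\rho'} \tag{12.2.49} \\
&\geq \sup_{\tau \otimes \omega} \widetilde{I}_\alpha(X_1 X_2; B_1 B_2)_{\xi' \otimes \omega'}, \tag{12.2.50}
\end{align}
\[ \widetilde{I}_\alpha(A_1 A_2; B_1 B_2)_{\tau \otimes \omega} = \widetilde{I}_\alpha(A_1; B_1)_\tau + \widetilde{I}_\alpha(A_2; B_2)_\omega \tag{12.2.51} \]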
i.e.,
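\[ \widetilde{\chi}_\alpha(\mathcal{N}_1 \otimes \mathcal{N}_2) \geq \widetilde{\chi}_\alpha(\mathcal{N}_1) + \widetilde{\chi}_\alpha(\mathcal{N}_2), \tag{12.2.55} \]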
as required.
We see that in order to show the additivity of the sandwiched Rényi Holevo
information for N, it suffices to show subadditivity for N, i.e.,
\[ \widetilde{\chi}_\alpha(\mathcal{N}^{\otimes n}) \leq n\, \widetilde{\chi}_\alpha(\mathcal{N}) \quad \forall\, n \geq 1. \tag{12.2.56} \]
We now show that subadditivity, and thus additivity, of the sandwiched Rényi
Holevo information holds for all entanglement-breaking channels.
In this section, we prove that the sandwiched Rényi Holevo information is additive
for all entanglement-breaking channels.
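\[ \widetilde{\chi}_\alpha(\mathcal{N} \otimes \mathcal{M}) = \widetilde{\chi}_\alpha(\mathcal{N}) + \widetilde{\chi}_\alpha(\mathcal{M}). \tag{12.2.57} \]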
The proof of this theorem relies on two lemmas, the first of which states that the
sandwiched Rényi Holevo information $\widetilde{\chi}_\alpha(\mathcal{N})$ of a channel $\mathcal{N}$ is equal to a quantity
$\widetilde{K}_\alpha(\mathcal{N})$, called the sandwiched Rényi information radius of $\mathcal{N}$.
Lemma 12.17
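For every quantum channel $\mathcal{N}$ and $\alpha > 1$, the following equality holds:
\[ \widetilde{\chi}_\alpha(\mathcal{N}) = \widetilde{K}_\alpha(\mathcal{N}) := \inf_{\sigma_B} \sup_{\rho_A} \widetilde{D}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B), \tag{12.2.58} \]
where the optimizations are over states $\rho_A$ and $\sigma_B$. The quantity $\widetilde{K}_\alpha(\mathcal{N})$ is called
the sandwiched Rényi information radius of $\mathcal{N}$.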
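Proof: To prove this lemma, we show that $\widetilde{\chi}_\alpha(\mathcal{N}) \leq \widetilde{K}_\alpha(\mathcal{N})$ and $\widetilde{\chi}_\alpha(\mathcal{N}) \geq \widetilde{K}_\alpha(\mathcal{N})$.
For the first inequality, let $\tau_B$ be an arbitrary state. Then
\begin{align}
\widetilde{\chi}_\alpha(\mathcal{N}) &= \sup_{\rho_{XA}} \widetilde{I}_\alpha(X; B)_\omega \tag{12.2.59} \\
&= \sup_{\rho_{XA}} \inf_{\sigma_B} \widetilde{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \mathcal{N}(\rho_A^x) \,\Big\|\, \sum_{x\in\mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \sigma_B\right) \tag{12.2.60} \\
&\leq \sup_{\rho_{XA}} \widetilde{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \mathcal{N}(\rho_A^x) \,\Big\|\, \sum_{x\in\mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \tau_B\right), \tag{12.2.61}
\end{align}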
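where $\omega_{XB} = \mathcal{N}_{A\to B}(\rho_{XA})$ and the supremum is over all classical–quantum states
$\rho_{XA}$ of the form $\rho_{XA} = \sum_{x\in\mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \rho_A^x$, with $\mathcal{X}$ a finite alphabet with
associated $|\mathcal{X}|$-dimensional quantum system $X$ and $\{\rho_A^x\}_{x\in\mathcal{X}}$ a set of states.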
Now, recall from (7.5.174) that the sandwiched Rényi relative entropy is jointly
quasi-convex for 𝛼 > 1 and invariant under tensoring in the same state |𝑥⟩⟨𝑥|, which
implies that
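\begin{align}
\widetilde{\chi}_\alpha(\mathcal{N}) &\leq \sup_{\rho_{XA}} \widetilde{D}_\alpha\!\left(\sum_{x\in\mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \mathcal{N}(\rho_A^x) \,\Big\|\, \sum_{x\in\mathcal{X}} p(x) |x\rangle\!\langle x|_X \otimes \tau_B\right) \tag{12.2.62} \\
&\leq \sup_{\rho_{XA}} \max_{x\in\mathcal{X}} \widetilde{D}_\alpha(\mathcal{N}(\rho_A^x) \| \tau_B) \tag{12.2.63} \\
&\leq \sup_{\rho_A} \widetilde{D}_\alpha(\mathcal{N}(\rho_A) \| \tau_B). \tag{12.2.64}
\end{align}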
The final inequality above holds for every state 𝜏𝐵 , which implies that
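\[ \widetilde{\chi}_\alpha(\mathcal{N}) \leq \inf_{\tau_B} \sup_{\rho_A} \widetilde{D}_\alpha(\mathcal{N}(\rho_A) \| \tau_B) = \widetilde{K}_\alpha(\mathcal{N}), \tag{12.2.65} \]
i.e., $\widetilde{\chi}_\alpha(\mathcal{N}) \leq \widetilde{K}_\alpha(\mathcal{N})$.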
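We now show that $\widetilde{K}_\alpha(\mathcal{N}) \leq \widetilde{\chi}_\alpha(\mathcal{N})$. First, consider that
\begin{align}
\widetilde{K}_\alpha(\mathcal{N}) &= \inf_{\sigma_B} \sup_{\rho_A} \widetilde{D}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B) \tag{12.2.66} \\
&= \frac{1}{\alpha-1} \inf_{\sigma_B} \sup_{\rho_A} \log_2 \widetilde{Q}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B) \tag{12.2.67} \\
&= \frac{1}{\alpha-1} \log_2 \inf_{\sigma_B} \sup_{\rho_A} \widetilde{Q}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B). \tag{12.2.68}
\end{align}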
Now, by taking a supremum over all probability measures 𝜇 on the set of all states
𝜌 𝐴 , we find that
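\[ \sup_{\rho_A} \widetilde{Q}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B) \leq \sup_{\mu} \int \widetilde{Q}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B)\, \mathrm{d}\mu(\rho_A). \tag{12.2.69} \]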
So we have that
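\[ \widetilde{K}_\alpha(\mathcal{N}) \leq \frac{1}{\alpha-1} \log_2 \inf_{\sigma_B} \sup_{\mu} \int \widetilde{Q}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B)\, \mathrm{d}\mu(\rho_A). \tag{12.2.70} \]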
We now apply the Sion minimax theorem (Theorem 2.24) to exchange inf 𝜎𝐵 and
sup 𝜇 . This theorem is applicable because the function
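\[ (\mu, \sigma_B) \mapsto \int \widetilde{Q}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B)\, \mathrm{d}\mu(\rho_A) \tag{12.2.71} \]
is linear in the measure $\mu$ and convex in the states $\sigma_B$. The latter is indeed true
because
\[ \widetilde{Q}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B) = \left\| \mathcal{N}(\rho_A)^{\frac{1}{2}}\, \sigma_B^{\frac{1-\alpha}{\alpha}}\, \mathcal{N}(\rho_A)^{\frac{1}{2}} \right\|_\alpha^\alpha, \tag{12.2.72} \]
for all $\alpha > 1$ the function $\sigma_B \mapsto \sigma_B^{\frac{1-\alpha}{\alpha}}$ is operator convex, the Schatten norm $\|\cdot\|_\alpha$
is convex and monotone on positive semi-definite operators, and $x \mapsto x^\alpha$ is convex and non-decreasing for $x \geq 0$.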
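Therefore,
\begin{align}
\widetilde{K}_\alpha(\mathcal{N}) &\leq \frac{1}{\alpha-1} \log_2 \sup_{\{(p(x), \rho_A^x)\}_x} \inf_{\sigma_B} \sum_{x\in\mathcal{X}} p(x)\, \widetilde{Q}_\alpha(\mathcal{N}(\rho_A^x) \| \sigma_B) \tag{12.2.75} \\
&= \frac{1}{\alpha-1} \log_2 \sup_{\{(p(x), \rho_A^x)\}_x} \inf_{\sigma_B} \widetilde{Q}_\alpha(\mathcal{N}_{A\to B}(\rho_{XA}) \| \rho_X \otimes \sigma_B) \tag{12.2.76} \\
&= \sup_{\rho_{XA}} \inf_{\sigma_B} \widetilde{D}_\alpha(\mathcal{N}_{A\to B}(\rho_{XA}) \| \rho_X \otimes \sigma_B) \tag{12.2.77} \\
&= \sup_{\rho_{XA}} \widetilde{I}_\alpha(X; B)_\omega \tag{12.2.78} \\
&= \widetilde{\chi}_\alpha(\mathcal{N}), \tag{12.2.79}
\end{align}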
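where $\omega_{XB} = \mathcal{N}_{A\to B}(\rho_{XA})$, and to obtain the first equality we used the direct-sum
property of $\widetilde{Q}_\alpha$ in (7.5.41). So we have $\widetilde{K}_\alpha(\mathcal{N}) \leq \widetilde{\chi}_\alpha(\mathcal{N})$ in addition to
$\widetilde{K}_\alpha(\mathcal{N}) \geq \widetilde{\chi}_\alpha(\mathcal{N})$, which means that $\widetilde{K}_\alpha(\mathcal{N}) = \widetilde{\chi}_\alpha(\mathcal{N})$, as required. ■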
Lemma 12.18
Let $\mathcal{M}_{A\to B}$ be a completely positive map, and let $P_{RA}$ be a positive semi-definite
separable operator, i.e., such that it can be written in the following form:
\[ P_{RA} = \sum_{x\in\mathcal{X}} C_R^x \otimes D_A^x, \tag{12.2.81} \]
where
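\[ \nu_\alpha(\mathcal{P}) := \sup_{\rho} \|\mathcal{P}(\rho)\|_\alpha, \tag{12.2.83} \]
$\mathcal{P}$ is a completely positive map, and the supremum is taken over every density
operator in the domain of $\mathcal{P}$. As a consequence, if $\mathcal{N}_{A'\to B'}$ is an entanglement-
breaking map, then the following equality holds for all $\alpha \geq 1$:
\[ \nu_\alpha(\mathcal{M}_{A\to B} \otimes \mathcal{N}_{A'\to B'}) = \nu_\alpha(\mathcal{M}_{A\to B}) \cdot \nu_\alpha(\mathcal{N}_{A'\to B'}). \tag{12.2.84} \]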
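where
\[ V := \sum_{x\in\mathcal{X}} \langle x| \otimes \sqrt{C_R^x} \otimes \mathbb{1}_B, \tag{12.2.87} \]
\[ T := \sum_{x\in\mathcal{X}} |x\rangle\!\langle x| \otimes \mathbb{1}_R \otimes \mathcal{M}_{A\to B}(D_A^x). \tag{12.2.88} \]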
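Now, for every operator $Z$, note that $Z X Z^\dagger$ has the same non-zero eigenvalues as
$(Z^\dagger Z)^{\frac{1}{2}} X (Z^\dagger Z)^{\frac{1}{2}}$ (this follows by considering the polar decomposition of $Z$). In
addition, since $Z^\dagger Z$ is positive semi-definite, applying (12.2.91) with $Y = Z^\dagger Z$
gives us
\begin{align}
\operatorname{Tr}\!\left[\left(Z X Z^\dagger\right)^r\right] &= \operatorname{Tr}\!\left[\left((Z^\dagger Z)^{\frac{1}{2}} X (Z^\dagger Z)^{\frac{1}{2}}\right)^r\right] \tag{12.2.92} \\
&\leq \operatorname{Tr}\!\left[X^r (Z^\dagger Z)^r\right]. \tag{12.2.93}
\end{align}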
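Letting
\[ S := \sum_{x\in\mathcal{X}} \langle x| \otimes \sqrt{C_R^x}, \tag{12.2.96} \]
observe that
\[ V^\dagger V = S^\dagger S \otimes \mathbb{1}_B, \tag{12.2.97} \]
which implies that
\[ (V^\dagger V)^\alpha = (S^\dagger S)^\alpha \otimes \mathbb{1}_B. \tag{12.2.98} \]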
Therefore, since 𝑇, and thus 𝑇 𝛼 , is block diagonal, we find that
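\begin{align}
\operatorname{Tr}[(V^\dagger V)^\alpha T^\alpha] &= \operatorname{Tr}\!\left[\left((S^\dagger S)^\alpha \otimes \mathbb{1}_B\right)\left(\sum_{x\in\mathcal{X}} |x\rangle\!\langle x| \otimes \mathbb{1}_R \otimes (\mathcal{M}_{A\to B}(D_A^x))^\alpha\right)\right] \tag{12.2.99} \\
&= \sum_{x\in\mathcal{X}} \operatorname{Tr}\!\left[\left[(S^\dagger S)^\alpha\right]_x\right] \operatorname{Tr}[(\mathcal{M}_{A\to B}(D_A^x))^\alpha], \tag{12.2.100}
\end{align}
where
\[ \left[(S^\dagger S)^\alpha\right]_x := (\langle x| \otimes \mathbb{1}_R)(S^\dagger S)^\alpha(|x\rangle \otimes \mathbb{1}_R). \tag{12.2.101} \]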
Now,
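\begin{align}
& \left(\operatorname{Tr}[(\mathcal{M}_{A\to B}(D_A^x))^\alpha]\right)^{\frac{1}{\alpha}} = \left\|\mathcal{M}_{A\to B}(D_A^x)\right\|_\alpha \leq \nu_\alpha(\mathcal{M}_{A\to B}) \tag{12.2.102} \\
\Rightarrow\quad & \operatorname{Tr}[(\mathcal{M}_{A\to B}(D_A^x))^\alpha] \leq \nu_\alpha(\mathcal{M}_{A\to B})^\alpha. \tag{12.2.103}
\end{align}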
To see the equality in (12.2.84), we prove it in two steps. First, consider that the
following inequality holds for all completely positive maps M 𝐴→𝐵 and N 𝐴′ →𝐵′ :
𝜈𝛼 (M 𝐴→𝐵 ⊗ N 𝐴′ →𝐵′ ) ≥ 𝜈𝛼 (M 𝐴→𝐵 ) · 𝜈𝛼 (N 𝐴′ →𝐵′ ). (12.2.113)
This follows simply by restricting the optimization in the definition of 𝜈𝛼 (M 𝐴→𝐵 ⊗
N 𝐴′ →𝐵′ ) to tensor-product states. Specifically,
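\begin{align}
\nu_\alpha(\mathcal{M}_{A\to B} \otimes \mathcal{N}_{A'\to B'}) &= \sup_{\rho_{AA'}} \|(\mathcal{M}_{A\to B} \otimes \mathcal{N}_{A'\to B'})(\rho_{AA'})\|_\alpha \tag{12.2.114} \\
&\geq \sup_{\sigma_A, \omega_{A'}} \|(\mathcal{M}_{A\to B} \otimes \mathcal{N}_{A'\to B'})(\sigma_A \otimes \omega_{A'})\|_\alpha \tag{12.2.115} \\
&= \sup_{\sigma_A, \omega_{A'}} \|\mathcal{M}_{A\to B}(\sigma_A) \otimes \mathcal{N}_{A'\to B'}(\omega_{A'})\|_\alpha \tag{12.2.116} \\
&= \sup_{\sigma_A} \|\mathcal{M}_{A\to B}(\sigma_A)\|_\alpha \cdot \sup_{\omega_{A'}} \|\mathcal{N}_{A'\to B'}(\omega_{A'})\|_\alpha \tag{12.2.117} \\
&= \nu_\alpha(\mathcal{M}_{A\to B}) \cdot \nu_\alpha(\mathcal{N}_{A'\to B'}). \tag{12.2.118}
\end{align}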
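The step from (12.2.116) to (12.2.117) relies on the Schatten norm being multiplicative under tensor products. For positive semi-definite operators this can be seen directly from their spectra, since the eigenvalues of a tensor product are the pairwise products of the eigenvalues of the factors. The following Python sketch checks this numerically; it works with lists of eigenvalues rather than matrices, and the function names are hypothetical.

```python
def schatten_norm(eigenvalues, alpha):
    """Schatten alpha-norm of a Hermitian operator given its eigenvalues."""
    return sum(abs(e) ** alpha for e in eigenvalues) ** (1.0 / alpha)


def tensor_spectrum(spec_a, spec_b):
    """Eigenvalues of A (x) B, given the eigenvalues of A and of B."""
    return [a * b for a in spec_a for b in spec_b]


def multiplicativity_gap(spec_a, spec_b, alpha):
    """Difference ||A (x) B||_alpha - ||A||_alpha * ||B||_alpha (should be ~0)."""
    lhs = schatten_norm(tensor_spectrum(spec_a, spec_b), alpha)
    rhs = schatten_norm(spec_a, alpha) * schatten_norm(spec_b, alpha)
    return lhs - rhs
```

The gap vanishes for every alpha >= 1, which is why the supremum over product inputs factorizes into the product of the two suprema.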
The following reverse inequality
𝜈𝛼 (M 𝐴→𝐵 ⊗ N 𝐴′ →𝐵′ ) ≤ 𝜈𝛼 (M 𝐴→𝐵 ) · 𝜈𝛼 (N 𝐴′ →𝐵′ ) (12.2.119)
holds when N 𝐴′ →𝐵′ is an entanglement-breaking map. Indeed, considering an
arbitrary input state 𝜌 𝐴𝐴′ , the output state 𝜔 𝐴𝐵′ := N 𝐴′ →𝐵′ (𝜌 𝐴𝐴′ ) is a separable
operator. Applying (12.2.82) to the separable operator 𝜔 𝐴𝐵′ and identifying system
𝐵′ with 𝑅 in (12.2.82), we conclude that
∥ (M 𝐴→𝐵 ⊗ N 𝐴′ →𝐵′ )(𝜌 𝐴𝐴′ )∥ 𝛼 = ∥M 𝐴→𝐵 (𝜔 𝐴𝐵′ ) ∥ 𝛼 (12.2.120)
≤ 𝜈𝛼 (M 𝐴→𝐵 ) · ∥𝜔 𝐵′ ∥ 𝛼 (12.2.121)
= 𝜈𝛼 (M 𝐴→𝐵 ) · ∥N 𝐴′ →𝐵′ (𝜌 𝐴′ )∥ 𝛼 (12.2.122)
≤ 𝜈𝛼 (M 𝐴→𝐵 ) · 𝜈𝛼 (N 𝐴′ →𝐵′ ). (12.2.123)
Since the inequality holds for every input state 𝜌 𝐴𝐴′ , we conclude the inequality in
(12.2.119). ■
With Lemmas 12.17 and 12.18 in hand, we can now prove Theorem 12.16.
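We start by using (7.5.3) to write the definition of $\widetilde{K}_\alpha(\mathcal{N})$ as
\begin{align}
\widetilde{K}_\alpha(\mathcal{N}) &= \inf_{\sigma_B} \sup_{\rho_A} \widetilde{D}_\alpha(\mathcal{N}(\rho_A) \| \sigma_B) \tag{12.2.124} \\
&= \inf_{\sigma_B} \sup_{\rho_A} \frac{\alpha}{\alpha-1} \log_2 \left\| \sigma_B^{\frac{1-\alpha}{2\alpha}}\, \mathcal{N}(\rho_A)\, \sigma_B^{\frac{1-\alpha}{2\alpha}} \right\|_\alpha. \tag{12.2.125}
\end{align}
For the tensor-product channel $\mathcal{N} \otimes \mathcal{M}$, with output systems $A'$ and $B'$, we then have
\begin{align}
\widetilde{K}_\alpha(\mathcal{N} \otimes \mathcal{M}) &= \frac{\alpha}{\alpha-1} \inf_{\sigma_{A'B'}} \sup_{\rho_{AB}} \log_2 \left\| \sigma_{A'B'}^{\frac{1-\alpha}{2\alpha}} \left((\mathcal{N} \otimes \mathcal{M})(\rho_{AB})\right) \sigma_{A'B'}^{\frac{1-\alpha}{2\alpha}} \right\|_\alpha \tag{12.2.127} \\
&= \frac{\alpha}{\alpha-1} \inf_{\sigma_{A'B'}} \log_2 \sup_{\rho_{AB}} \left\| \sigma_{A'B'}^{\frac{1-\alpha}{2\alpha}} \left((\mathcal{N} \otimes \mathcal{M})(\rho_{AB})\right) \sigma_{A'B'}^{\frac{1-\alpha}{2\alpha}} \right\|_\alpha \tag{12.2.128} \\
&\leq \frac{\alpha}{\alpha-1} \inf_{\sigma_{A'}, \tau_{B'}} \log_2 \sup_{\rho_{AB}} \left\| \left(\sigma_{A'}^{\frac{1-\alpha}{2\alpha}} \otimes \tau_{B'}^{\frac{1-\alpha}{2\alpha}}\right) \left((\mathcal{N} \otimes \mathcal{M})(\rho_{AB})\right) \left(\sigma_{A'}^{\frac{1-\alpha}{2\alpha}} \otimes \tau_{B'}^{\frac{1-\alpha}{2\alpha}}\right) \right\|_\alpha, \tag{12.2.129}
\end{align}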
where to obtain the inequality we have restricted the infimum to tensor product
states. Now, observe that since $\mathcal{N}$ is entanglement-breaking, sandwiching
the output of the channel by the positive semi-definite operator $\sigma_{A'}^{\frac{1-\alpha}{2\alpha}}$ leads to a
new map $\mathcal{N}'$ that is a completely positive entanglement-breaking map (though
not necessarily trace preserving). Similarly, sandwiching the output of $\mathcal{M}$ by the
positive semi-definite operator $\tau_{B'}^{\frac{1-\alpha}{2\alpha}}$ leads to a new completely positive map $\mathcal{M}'$.
Therefore, using Lemma 12.18, we obtain
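\begin{align}
\widetilde{K}_\alpha(\mathcal{N} \otimes \mathcal{M}) &\leq \frac{\alpha}{\alpha-1} \inf_{\sigma_{A'}, \tau_{B'}} \log_2 \sup_{\rho_{AB}} \|(\mathcal{N}' \otimes \mathcal{M}')(\rho_{AB})\|_\alpha \tag{12.2.130} \\
&= \frac{\alpha}{\alpha-1} \inf_{\sigma_{A'}, \tau_{B'}} \log_2 \nu_\alpha(\mathcal{N}' \otimes \mathcal{M}') \tag{12.2.131} \\
&= \frac{\alpha}{\alpha-1} \inf_{\sigma_{A'}, \tau_{B'}} \log_2 \left(\nu_\alpha(\mathcal{N}')\, \nu_\alpha(\mathcal{M}')\right) \tag{12.2.132} \\
&= \frac{\alpha}{\alpha-1} \inf_{\sigma_{A'}, \tau_{B'}} \left[\log_2 \nu_\alpha(\mathcal{N}') + \log_2 \nu_\alpha(\mathcal{M}')\right] \tag{12.2.133} \\
&= \inf_{\sigma_{A'}} \frac{\alpha}{\alpha-1} \log_2 \sup_{\rho_A} \|\mathcal{N}'(\rho_A)\|_\alpha + \inf_{\tau_{B'}} \frac{\alpha}{\alpha-1} \log_2 \sup_{\omega_B} \|\mathcal{M}'(\omega_B)\|_\alpha \tag{12.2.134} \\
&= \inf_{\sigma_{A'}} \sup_{\rho_A} \frac{\alpha}{\alpha-1} \log_2 \|\mathcal{N}'(\rho_A)\|_\alpha + \inf_{\tau_{B'}} \sup_{\omega_B} \frac{\alpha}{\alpha-1} \log_2 \|\mathcal{M}'(\omega_B)\|_\alpha \tag{12.2.135} \\
&= \widetilde{K}_\alpha(\mathcal{N}) + \widetilde{K}_\alpha(\mathcal{M}). \tag{12.2.136}
\end{align}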
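So we have that $\widetilde{K}_\alpha(\mathcal{N} \otimes \mathcal{M}) \leq \widetilde{K}_\alpha(\mathcal{N}) + \widetilde{K}_\alpha(\mathcal{M})$ for every channel $\mathcal{M}$. By
Lemma 12.17, this is subadditivity of the sandwiched Rényi Holevo information, which together
with the superadditivity in (12.2.45) gives the desired result.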
Note that the additivity of the Holevo information of every entanglement-
breaking channel follows from the additivity of the sandwiched Rényi Holevo
information of such channels by taking the limit 𝛼 → 1+ (the proof is analogous to
the one presented in Appendix 11.B).
Having shown that the sandwiched Rényi Holevo information is additive for all
entanglement-breaking channels, we can now proceed further from (12.2.36) to
prove a strong converse theorem for all entanglement-breaking channels. Moreover,
since the sandwiched Rényi Holevo information $\widetilde{\chi}_\alpha(\mathcal{N})$ satisfies $\lim_{\alpha\to 1^+} \widetilde{\chi}_\alpha(\mathcal{N}) = \chi(\mathcal{N})$
(the proof of this is analogous to the one presented in Appendix 11.B), we
can go beyond the statement of Theorem 12.13 and say that $C(\mathcal{N}) = \chi(\mathcal{N})$ for all
entanglement-breaking channels $\mathcal{N}$.
\[ C(\mathcal{N}) = \widetilde{C}(\mathcal{N}) = \chi(\mathcal{N}). \tag{12.2.137} \]
Remark: Note that this theorem holds more generally for every channel $\mathcal{N}$ for which the
sandwiched Rényi Holevo information $\widetilde{\chi}_\alpha(\mathcal{N})$ is additive.
Proof: Since $\lim_{\alpha\to 1^+} \widetilde{\chi}_\alpha(\mathcal{N}) = \chi(\mathcal{N})$ (the proof of this is analogous to the one
presented in Appendix 11.B), we find that the Holevo information is additive for all
entanglement-breaking channels. The equality $C(\mathcal{N}) = \chi(\mathcal{N})$ then follows from
Theorem 12.13.
The remainder of the proof is devoted to establishing that $\chi(\mathcal{N})$ is a strong
converse rate for classical communication over $\mathcal{N}$, from which it follows that
$\widetilde{C}(\mathcal{N}) \leq \chi(\mathcal{N})$, which in turn implies, via (12.2.12), that $\widetilde{C}(\mathcal{N}) = \chi(\mathcal{N})$.
\[ \delta > \delta_1 + \delta_2 =: \delta'. \tag{12.2.138} \]
Now, with the values of 𝑛 and 𝜀 chosen as above, every (𝑛, |M|, 𝜀) classical
communication protocol satisfies (12.2.36) in Corollary 12.15. In particular, using
the additivity of the sandwiched Rényi Holevo information for all 𝛼 > 1, we can
write (12.2.36) as
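\[ \frac{1}{n} \log_2 |\mathcal{M}| \leq \widetilde{\chi}_\alpha(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)} \log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{12.2.141} \]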
Rearranging the right-hand side of this inequality, and using the assumptions in
(12.2.138)–(12.2.140), we obtain
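\begin{align}
\frac{1}{n} \log_2 |\mathcal{M}| &\leq \chi(\mathcal{N}) + \widetilde{\chi}_\alpha(\mathcal{N}) - \chi(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)} \log_2\!\left(\frac{1}{1-\varepsilon}\right) \tag{12.2.142} \\
&\leq \chi(\mathcal{N}) + \delta_1 + \delta_2 \tag{12.2.143} \\
&= \chi(\mathcal{N}) + \delta' \tag{12.2.144} \\
&< \chi(\mathcal{N}) + \delta. \tag{12.2.145}
\end{align}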
So we have that $\chi(\mathcal{N}) + \delta > \frac{1}{n} \log_2 |\mathcal{M}|$ for all $(n, |\mathcal{M}|, \varepsilon)$ classical communication
protocols with $n$ sufficiently large. Due to this strict inequality, it follows that
there cannot exist an $(n, 2^{n(\chi(\mathcal{N})+\delta)}, \varepsilon)$ classical communication protocol for all
sufficiently large $n$ such that (12.2.140) holds, for if it did there would exist some
message set $\mathcal{M}$ such that $\frac{1}{n} \log_2 |\mathcal{M}| = \chi(\mathcal{N}) + \delta$, which we have just seen is
not possible. Since $\varepsilon$ and $\delta$ are arbitrary, we conclude that for all $\varepsilon \in [0, 1)$,
$\delta > 0$, and sufficiently large $n$, there does not exist an $(n, 2^{n(\chi(\mathcal{N})+\delta)}, \varepsilon)$ classical
communication protocol. This means that $\chi(\mathcal{N})$ is a strong converse rate, which
completes the proof. ■
The difficulty in proving the additivity of the Holevo information for a general
channel, and thus obtaining an upper bound on its classical capacity, has motivated
the study of other, more tractable upper bounds on the classical capacity of a
quantum channel. In this section, we present two upper bounds on the strong
converse classical capacity of a quantum channel.
Figure 12.5: The error probability $\varepsilon_n$ as a function of the rate $R_n$ for classical
communication over a quantum channel $\mathcal{N}$ for which the sandwiched Rényi
Holevo information $\widetilde{\chi}_\alpha(\mathcal{N})$ is additive. As $n \to \infty$, for every rate below the
Holevo information $\chi(\mathcal{N})$, there exists a sequence of protocols with error
probability converging to zero. For every rate above the Holevo information
$\chi(\mathcal{N})$, the error probability converges to one for all possible protocols.
Recall from Proposition 12.3 that the following upper bound on the number of
transmitted bits holds for every (|M|, 𝜀) classical communication protocol:
where the 𝜀-hypothesis testing Holevo information 𝜒𝐻𝜀 (N) of the quantum channel
N is defined in (7.11.93) as
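comparison between the channel $\mathcal{N}$ and the channel $\mathcal{R}$ for classical communication
more explicit by writing the quantity $\chi_H^\varepsilon(\mathcal{N})$ as
\[ \chi_H^\varepsilon(\mathcal{N}) = \sup_{\rho_{XA}} \inf_{\mathcal{R}_{A\to B}} D_H^\varepsilon(\mathcal{N}_{A\to B}(\rho_{XA}) \| \mathcal{R}_{A\to B}(\rho_{XA})), \tag{12.2.149} \]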
where the supremum is over pure states 𝜓 𝑅 𝐴 , with the dimension of 𝑅 the same
as the dimension of 𝐴.
Remark: Note that it suffices to optimize over pure states 𝜓 𝑅 𝐴, with the dimension of 𝑅 equal
to the dimension of 𝐴, when calculating the generalized Υ-information of a channel, i.e., for
general states 𝜌 𝑅 𝐴 (with the dimension of 𝑅 not necessarily equal to the dimension of 𝐴),
The proof of this proceeds analogously to the steps in (7.11.4)–(7.11.2) for proving that it suffices
to optimize over pure states when calculating the generalized channel divergence.
1. The Υ-information of N,
In the last line we used the transpose trick (see (2.2.40)). Then, we applied the
monotonicity of the quantum relative entropy with respect to the partial trace Tr 𝐴 .
Finally, we used the fact that 𝜌 T𝐴 is a state for every 𝜌 𝐴 , so that the optimization
over states remains unchanged.
Continuing, we have that
= 𝜒(N). (12.2.166)
To obtain the first inequality, we first used the fact that for every map F ∈ 𝔉 there
exists a state, which we call 𝜎F , such that F 𝐴→𝐵 (𝜌 𝐴 ) ≤ 𝜎F for all input states 𝜌 𝐴 .
We then used 2.(d) in Proposition 7.3. To obtain the last inequality, we simply
enlarged the set over which the infimum is performed to include all states. Then,
to obtain the equality on the last line, we used the expression in (12.2.80) for the
Holevo information. ■
Proposition 12.22
Let N be a quantum channel. For every (|M|, 𝜀) classical communication
protocol over N, the number of bits transmitted over N is bounded from above
by the 𝜀-hypothesis testing Υ-information of N, i.e.,
Proof: For every (|M|, 𝜀) classical communication protocol, with encoding and
decoding channel given by E and D, respectively, the maximal error probability
criterion 𝑝 ∗err (E, D; N) ≤ 𝜀 holds. This implies 𝑝 err ((E, D); 𝑝, N) ≤ 𝜀 for the
average probability, where 𝑝 : M → [0, 1] is the uniform prior probability
distribution over the messages in M. If the encoding channel E is defined such
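that we obtain the set $\{\rho_A^m\}_{m\in\mathcal{M}}$ of states associated to each message $m \in \mathcal{M}$ (see
(12.1.2)), and the decoding channel $\mathcal{D}$ is defined by the POVM $\{\Lambda_B^{\widehat{m}}\}_{\widehat{m}\in\mathcal{M}}$, then we
can write the average success probability $p_{\text{succ}}((\mathcal{E}, \mathcal{D}); p, \mathcal{N})$ of the code $(\mathcal{E}, \mathcal{D})$ as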
and we have that 𝑝 succ ((E, D); 𝑝, N) ≥ 1 − 𝜀. Now, recall from (4.2.5) that we can
write the action of N 𝐴→𝐵 in terms of its Choi representation ΓN 𝐴𝐵 as
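\[ \mathcal{N}_{A\to B}(\rho_A^m) = \operatorname{Tr}_A\!\left[\left((\rho_A^m)^{\mathsf{T}} \otimes \mathbb{1}_B\right) \Gamma_{AB}^{\mathcal{N}}\right] \tag{12.2.170} \]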
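and a purification of it
\[ |\phi\rangle_{AA'} := \left(\mathbb{1}_A \otimes \sqrt{\rho_{A'}}\right)|\Gamma\rangle_{AA'} = \left(\sqrt{\rho_A^{\mathsf{T}}} \otimes \mathbb{1}_{A'}\right)|\Gamma\rangle_{AA'}, \tag{12.2.172} \]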
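where we used the transpose trick in (2.2.40) to obtain the last equality. Then,
observe that
\begin{align}
\mathcal{N}_{A'\to B}(\phi_{AA'}) &= \sqrt{\rho_A^{\mathsf{T}}}\, \mathcal{N}_{A'\to B}(\Gamma_{AA'})\, \sqrt{\rho_A^{\mathsf{T}}} \tag{12.2.173} \\
&= \sqrt{\rho_A^{\mathsf{T}}}\, \Gamma_{AB}^{\mathcal{N}}\, \sqrt{\rho_A^{\mathsf{T}}}, \tag{12.2.174}
\end{align}
which implies that
\[ \Gamma_{AB}^{\mathcal{N}} = (\rho_A^{\mathsf{T}})^{-\frac{1}{2}}\, \mathcal{N}_{A'\to B}(\phi_{AA'})\, (\rho_A^{\mathsf{T}})^{-\frac{1}{2}}. \tag{12.2.175} \]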
Therefore,
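\begin{align}
p_{\text{succ}}((\mathcal{E}, \mathcal{D}); p, \mathcal{N}) &= \frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} \operatorname{Tr}\!\left[\left((\rho_A^m)^{\mathsf{T}} \otimes \Lambda_B^m\right) \Gamma_{AB}^{\mathcal{N}}\right] \tag{12.2.177} \\
&= \frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} \operatorname{Tr}\!\left[\left((\rho_A^m)^{\mathsf{T}} \otimes \Lambda_B^m\right) (\rho_A^{\mathsf{T}})^{-\frac{1}{2}}\, \mathcal{N}_{A'\to B}(\phi_{AA'})\, (\rho_A^{\mathsf{T}})^{-\frac{1}{2}}\right] \tag{12.2.178} \\
&= \operatorname{Tr}\!\left[(\rho_A^{\mathsf{T}})^{-\frac{1}{2}} \left(\frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} (\rho_A^m)^{\mathsf{T}} \otimes \Lambda_B^m\right) (\rho_A^{\mathsf{T}})^{-\frac{1}{2}}\, \mathcal{N}_{A'\to B}(\phi_{AA'})\right] \tag{12.2.179} \\
&= \operatorname{Tr}[\Omega_{AB}\, \mathcal{N}_{A'\to B}(\phi_{AA'})], \tag{12.2.180}
\end{align}
where
\[ \Omega_{AB} := (\rho_A^{\mathsf{T}})^{-\frac{1}{2}} \left(\frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} (\rho_A^m)^{\mathsf{T}} \otimes \Lambda_B^m\right) (\rho_A^{\mathsf{T}})^{-\frac{1}{2}}. \tag{12.2.181} \]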
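Note that $\Omega_{AB}$ is positive semi-definite, i.e., $\Omega_{AB} \geq 0$. Also, observe that since
$\Lambda_B^m \leq \mathbb{1}_B$ for all $m \in \mathcal{M}$, we have that
\[ \Omega_{AB} \leq (\rho_A^{\mathsf{T}})^{-\frac{1}{2}} \left(\frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} (\rho_A^m)^{\mathsf{T}} \otimes \mathbb{1}_B\right) (\rho_A^{\mathsf{T}})^{-\frac{1}{2}} = \mathbb{1}_{AB}. \tag{12.2.182} \]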
Now, let F ∈ 𝔉. This means that there exists a state, call it 𝜎𝐵 , such that
F(𝜌 𝐴 ) ≤ 𝜎𝐵 for all states 𝜌 𝐴 . We find that
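\begin{align}
\operatorname{Tr}[\Omega_{AB}\, \mathcal{F}_{A'\to B}(\phi_{AA'})] &= \operatorname{Tr}\!\left[(\rho_A^{\mathsf{T}})^{-\frac{1}{2}} \left(\frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} (\rho_A^m)^{\mathsf{T}} \otimes \Lambda_B^m\right) (\rho_A^{\mathsf{T}})^{-\frac{1}{2}}\, \mathcal{F}_{A'\to B}(\phi_{AA'})\right] \tag{12.2.184} \\
&= \frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} \operatorname{Tr}\!\left[\left((\rho_A^m)^{\mathsf{T}} \otimes \Lambda_B^m\right) \Gamma_{AB}^{\mathcal{F}}\right] \tag{12.2.185} \\
&= \frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} \operatorname{Tr}[\Lambda_B^m\, \mathcal{F}_{A\to B}(\rho_A^m)] \tag{12.2.186} \\
&\leq \frac{1}{|\mathcal{M}|} \sum_{m\in\mathcal{M}} \operatorname{Tr}[\Lambda_B^m\, \sigma_B] \tag{12.2.187} \\
&\leq \frac{1}{|\mathcal{M}|}, \tag{12.2.188}
\end{align}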
where we used (12.2.175) to obtain the second equality and we used the fact that
F(𝜌 𝐴 ) ≤ 𝜎𝐵 for every input state 𝜌 𝐴 to obtain the second-to-last inequality.
Now, by optimizing the quantity Tr[Ω 𝐴𝐵 F 𝐴′ →𝐵 (𝜙 𝐴𝐴′ )] over all measurement
operators, subject to the constraint Tr[Ω 𝐴𝐵 N 𝐴′ →𝐵 (𝜙 𝐴𝐴′ )] ≥ 1 − 𝜀, we get that
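\[ \log_2 |\mathcal{M}| \leq D_H^\varepsilon(\mathcal{N}_{A'\to B}(\phi_{AA'}) \| \mathcal{F}_{A'\to B}(\phi_{AA'})). \tag{12.2.189} \]
Since this holds for every $\mathcal{F} \in \mathfrak{F}$, we have that
\[ \log_2 |\mathcal{M}| \leq \inf_{\mathcal{F}\in\mathfrak{F}} D_H^\varepsilon(\mathcal{N}_{A'\to B}(\phi_{AA'}) \| \mathcal{F}_{A'\to B}(\phi_{AA'})), \tag{12.2.190} \]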
as required. ■
Proposition 12.23
Let N be a quantum channel, let 𝜀 ∈ [0, 1), and let 𝛼 > 1. For every (|M|, 𝜀)
classical communication protocol over N, the following bounds hold
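\[ \log_2 |\mathcal{M}| \leq \frac{1}{1-\varepsilon}\left(\Upsilon(\mathcal{N}) + h_2(\varepsilon)\right), \tag{12.2.192} \]
\[ \log_2 |\mathcal{M}| \leq \widetilde{\Upsilon}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1} \log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{12.2.193} \]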
These upper bounds hold for every (𝑛, |M|, 𝜀) classical communication protocol
over a quantum channel N, where 𝑛 ∈ N and 𝜀 ∈ [0, 1).
Now, as with the Holevo information and the sandwiched Rényi Holevo information,
we are faced with the question of the additivity of the Υ-information and the sandwiched Rényi
Υ-information. Our primary focus is on the latter, since we would like to make a
statement about the strong converse for channels more general than entanglement-
breaking channels. It turns out that the sandwiched Rényi Υ-information is additive
for irreducibly-covariant channels.
Recall from Definition 4.18 that a channel $\mathcal{N}_{A\to B}$ is covariant with respect to
a group $G$ if there exist projective unitary representations $\{U_A^g\}_{g\in G}$ and $\{V_B^g\}_{g\in G}$
such that
\[ \mathcal{N}(U_A^g \rho_A (U_A^g)^\dagger) = V_B^g\, \mathcal{N}(\rho_A)\, (V_B^g)^\dagger \tag{12.2.196} \]
for all states $\rho_A$ and all $g \in G$. The channel $\mathcal{N}$ is called irreducibly covariant if
the representation $\{U_A^g\}_{g\in G}$ acting on the input space of the channel is irreducible,
which means that it satisfies
\[ \frac{1}{|G|} \sum_{g\in G} U_A^g \rho_A (U_A^g)^\dagger = \frac{\mathbb{1}_A}{d_A} \tag{12.2.197} \]
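To make the irreducibility condition concrete, the Python sketch below averages a single-qubit state over the four Pauli operators and checks that the result is the maximally mixed state. The choice of the Pauli operators as the representation is an assumption made for illustration (the text does not fix a particular group here), and the helper names are hypothetical.

```python
# Single-qubit Pauli operators, used as an illustrative irreducible
# projective unitary representation (an assumption of this sketch).
PAULIS = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def twirl(rho):
    """Group average (1/|G|) * sum_g U_g rho U_g^dagger over the Paulis."""
    out = [[0j, 0j], [0j, 0j]]
    for u in PAULIS:
        term = matmul(matmul(u, rho), dagger(u))
        for i in range(2):
            for j in range(2):
                out[i][j] += term[i][j] / len(PAULIS)
    return out
```

Applying `twirl` to any density matrix should return the matrix with 1/2 on the diagonal and zeros elsewhere, matching the right-hand side of the irreducibility condition with d_A = 2.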
To prove the reverse inequality, let us recall Proposition 7.84, specifically its proof.
Let $G$ be the group with respect to which $\mathcal{N}$ is irreducibly covariant, let $\{U_A^g\}_{g\in G}$
be the irreducible representation of $G$ acting on the input space of $\mathcal{N}_{A\to B}$, and let
$\{V_B^g\}_{g\in G}$ be the representation of $G$ acting on the output space of $\mathcal{N}$. Since the maps
$\mathcal{F}_{A\to B}$ in $\mathfrak{F}$, in particular the map achieving the infimum in the definition of $\Upsilon(\mathcal{N})$,
need not be irreducibly covariant, we cannot use Proposition 7.84 directly. Instead,
we consider (7.11.41) in its proof. For every $\mathcal{F}_{A\to B} \in \mathfrak{F}$, by using (7.11.52), the
inequality in (7.11.41) becomes
where $\mathfrak{F}_n$ is the set of completely positive maps in (12.2.150) acting on the space
of the system $A^n$. Now, the maximally entangled state $\Phi_{R^n A^n}$ on $n$ identical copies
$R_1 \cdots R_n$ and $A_1 \cdots A_n$ of the systems $R$ and $A$ splits into a tensor product in the
following way:
\[ \Phi_{R^n A^n} = \Phi_{R_1 A_1} \otimes \cdots \otimes \Phi_{R_n A_n}. \tag{12.2.210} \]
Furthermore, if we restrict the optimization over maps $\mathcal{F}_{A^n\to B^n} \in \mathfrak{F}_n$ to a tensor
product of identical maps $\mathcal{G}$ in the set $\mathfrak{F}$ such that
Now, with the values of 𝑛 and 𝜀 chosen as above, every (𝑛, |M|, 𝜀) classical
communication protocol satisfies (12.2.195). In particular, using the subadditivity
of the sandwiched Rényi Υ-information for all 𝛼 > 1, we can write (12.2.195) as
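\[ \frac{1}{n} \log_2 |\mathcal{M}| \leq \widetilde{\Upsilon}_\alpha(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)} \log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{12.2.219} \]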
Rearranging the right-hand side of this inequality, and using the assumptions in
(12.2.216)–(12.2.218), we obtain
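\begin{align}
\frac{1}{n} \log_2 |\mathcal{M}| &\leq \Upsilon(\mathcal{N}) + \widetilde{\Upsilon}_\alpha(\mathcal{N}) - \Upsilon(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)} \log_2\!\left(\frac{1}{1-\varepsilon}\right) \tag{12.2.220} \\
&\leq \Upsilon(\mathcal{N}) + \delta_1 + \delta_2 \tag{12.2.221} \\
&= \Upsilon(\mathcal{N}) + \delta' \tag{12.2.222} \\
&< \Upsilon(\mathcal{N}) + \delta. \tag{12.2.223}
\end{align}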
So we have that $\Upsilon(\mathcal{N}) + \delta > \frac{1}{n} \log_2 |\mathcal{M}|$ for all $(n, |\mathcal{M}|, \varepsilon)$ classical communication
protocols with $n$ sufficiently large. Due to this strict inequality, it follows that
there cannot exist an $(n, 2^{n(\Upsilon(\mathcal{N})+\delta)}, \varepsilon)$ classical communication protocol for all
sufficiently large $n$ such that (12.2.218) holds, for if it did there would exist some
message set $\mathcal{M}$ such that $\frac{1}{n} \log_2 |\mathcal{M}| = \Upsilon(\mathcal{N}) + \delta$, which we have just seen is not
possible. Since $\varepsilon$ and $\delta$ are arbitrary, we have that for all $\varepsilon \in [0, 1)$, $\delta > 0$, and
sufficiently large $n$, there does not exist an $(n, 2^{n(\Upsilon(\mathcal{N})+\delta)}, \varepsilon)$ classical communication
protocol. This means that $\Upsilon(\mathcal{N})$ is a strong converse rate, which completes the
proof. ■
Theorem 12.26 thus gives us an upper bound on the strong converse classical
capacity $\widetilde{C}(\mathcal{N})$ of any irreducibly-covariant channel $\mathcal{N}$, namely,
\[ \widetilde{C}(\mathcal{N}) \leq \Upsilon(\mathcal{N}), \tag{12.2.224} \]
channel is irreducibly covariant. This fact, along with other reasoning, allows us to
conclude that
for all dimensions 𝑑 ≥ 2 and all 𝑝 ∈ [0, 1]. We provide a proof of this chain of
equalities in Section 12.3.1.2 below.
While the Υ-information gives us an upper bound on the strong converse classical
capacity of any irreducibly-covariant channel, computing it is relatively challenging
due to the minimization over the set 𝔉. In this section, we define a subset of 𝔉,
denoted by 𝔉𝛽 , that allows us to obtain a quantity that can be computed using a
semi-definite program (SDP). Furthermore, this quantity turns out to be additive for
all channels, which means that it is an upper bound on the strong converse classical
capacity for all channels.
The set 𝔉𝛽 is defined as the following set of completely positive maps:
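\[ \beta(\mathcal{F}) = \left\{ \begin{array}{ll} \text{infimum} & \operatorname{Tr}[CX] \\ \text{subject to} & \Phi(X) \geq D, \\ & X \geq 0, \end{array} \right. \tag{12.2.229} \]
where
\[ X = \begin{pmatrix} S_B & 0 \\ 0 & R_{AB} \end{pmatrix}, \qquad C = \begin{pmatrix} \mathbb{1}_B & 0 \\ 0 & 0_{AB} \end{pmatrix}, \tag{12.2.230} \]
\[ \Phi(X) = \begin{pmatrix} R_{AB} & 0 & 0 & 0 \\ 0 & R_{AB} & 0 & 0 \\ 0 & 0 & \mathbb{1}_R \otimes S_B - R_{AB}^{\mathsf{T}_B} & 0 \\ 0 & 0 & 0 & \mathbb{1}_R \otimes S_B + R_{AB}^{\mathsf{T}_B} \end{pmatrix}, \tag{12.2.231} \]
\[ D = \begin{pmatrix} (\Gamma_{AB}^{\mathcal{F}})^{\mathsf{T}_B} & 0 & 0 & 0 \\ 0 & -(\Gamma_{AB}^{\mathcal{F}})^{\mathsf{T}_B} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{12.2.232} \]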
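Note that the constraints in (12.2.228) imply that $S_B$ and $R_{AB}$ are positive semi-
definite, since the constraint $-R_{AB} \leq (\Gamma_{AB}^{\mathcal{F}})^{\mathsf{T}_B} \leq R_{AB}$ implies that
\[ R_{AB} - (\Gamma_{AB}^{\mathcal{F}})^{\mathsf{T}_B} \geq 0, \tag{12.2.233} \]
\[ R_{AB} + (\Gamma_{AB}^{\mathcal{F}})^{\mathsf{T}_B} \geq 0. \tag{12.2.234} \]
\begin{align}
\mathcal{F}_{A\to B}(\rho_A) &= \operatorname{Tr}_A\!\left[(\rho_A^{\mathsf{T}} \otimes \mathbb{1}_B)\, \Gamma_{AB}^{\mathcal{F}}\right] \tag{12.2.235} \\
&= \left(\operatorname{Tr}_A\!\left[(\rho_A^{\mathsf{T}} \otimes \mathbb{1}_B)\, (\Gamma_{AB}^{\mathcal{F}})^{\mathsf{T}_B}\right]\right)^{\mathsf{T}} \tag{12.2.236} \\
&= \left(\operatorname{Tr}_A\!\left[\left(\sqrt{\rho_A^{\mathsf{T}}} \otimes \mathbb{1}_B\right) (\Gamma_{AB}^{\mathcal{F}})^{\mathsf{T}_B} \left(\sqrt{\rho_A^{\mathsf{T}}} \otimes \mathbb{1}_B\right)\right]\right)^{\mathsf{T}} \tag{12.2.237} \\
&\leq \left(\operatorname{Tr}_A\!\left[\left(\sqrt{\rho_A^{\mathsf{T}}} \otimes \mathbb{1}_B\right) R_{AB}^{*} \left(\sqrt{\rho_A^{\mathsf{T}}} \otimes \mathbb{1}_B\right)\right]\right)^{\mathsf{T}} \tag{12.2.238} \\
&= \operatorname{Tr}_A\!\left[\left(\sqrt{\rho_A^{\mathsf{T}}} \otimes \mathbb{1}_B\right) (R_{AB}^{*})^{\mathsf{T}_B} \left(\sqrt{\rho_A^{\mathsf{T}}} \otimes \mathbb{1}_B\right)\right] \tag{12.2.239} \\
&\leq \operatorname{Tr}_A\!\left[(\rho_A^{\mathsf{T}} \otimes \mathbb{1}_B)(\mathbb{1}_A \otimes \sigma_B)\right] \tag{12.2.240} \\
&= \sigma_B, \tag{12.2.241}
\end{align}
where to obtain the first inequality we used $(\Gamma_{AB}^{\mathcal{F}})^{\mathsf{T}_B} \leq R_{AB}^{*}$ and to obtain the second
inequality we used $(R_{AB}^{*})^{\mathsf{T}_B} \leq \mathbb{1}_A \otimes \sigma_B$. Therefore, $\mathcal{F}_{A\to B}(\rho_A) \leq \sigma_B$ for all $\rho_A$,
which means that $\mathcal{F}_{A\to B} \in \mathfrak{F}$. Since $\mathcal{F} \in \mathfrak{F}_\beta$ is arbitrary, we conclude that $\mathfrak{F}_\beta \subset \mathfrak{F}$.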
Since the set 𝔉𝛽 is a subset of 𝔉, minimizing over 𝔉𝛽 can never lead to a smaller
value compared to minimizing over 𝔉, which means that
This bound is looser than the one in Proposition 12.22, but it has the advantage
that it can be computed using an SDP. This is due to the fact that the hypothesis
testing relative entropy can itself be computed via an SDP.
Although we get an efficiently computable upper bound in the one-shot setting
via the Υ 𝛽 -information, in the asymptotic setting this bound is not known to
be additive, making its evaluation computationally prohibitive as the number 𝑛
of channel uses increases. Instead, for the purpose of obtaining an efficiently
computable upper bound in the asymptotic setting, we define the following quantity
for every quantum channel N:
Since 𝛽(N) can be computed using an SDP (in particular, via the optimization
problem in (12.2.228)), we have that 𝐶 𝛽 (N) can also be computed using an SDP.
A useful property of the quantity $C_\beta(\mathcal{N})$ is that it is additive, i.e.,
Proposition 12.27
For a quantum channel N, the following inequalities hold for all 𝛼 > 1:
By examining (12.2.258) in the above proof, we see that the following bound
holds for an arbitrary (𝑛, |M|, 𝜀) classical communication protocol:
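\[ \frac{1}{n} \log_2 |\mathcal{M}| \leq C_\beta(\mathcal{N}) + \frac{1}{n} \log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{12.2.261} \]
If we fix the rate $R = \frac{1}{n} \log_2 |\mathcal{M}|$, then this bound can be rewritten as follows:
\[ 1 - \varepsilon \leq 2^{-n(R - C_\beta(\mathcal{N}))}, \tag{12.2.262} \]
which indicates that communicating at a rate $R > C_\beta(\mathcal{N})$ implies that the success
probability $1 - \varepsilon$ of every sequence of such protocols decays exponentially fast to
zero.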
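Rearranging (12.2.261) with the rate fixed as R = (1/n) log2 |M| gives the success-probability bound 1 − ε ≤ 2^{−n(R − C_β(N))}. The Python sketch below evaluates this bound and the number of channel uses after which it drops below a target; the function names are hypothetical, and the value of C_β(N) is assumed to be supplied by the caller (for example, from an SDP solver).

```python
import math


def success_upper_bound(rate, c_beta, n):
    """Upper bound 2^{-n (R - C_beta)} on the success probability, capped at 1."""
    if rate <= c_beta:
        # Below C_beta the bound is trivial.
        return 1.0
    return 2.0 ** (-n * (rate - c_beta))


def uses_until_below(rate, c_beta, target):
    """Smallest n >= 1 for which the bound is at most `target`.

    Requires rate > c_beta and 0 < target <= 1.
    """
    if rate <= c_beta:
        raise ValueError("the bound only decays for rate > c_beta")
    if not 0.0 < target <= 1.0:
        raise ValueError("target must lie in (0, 1]")
    return max(1, math.ceil(math.log2(1.0 / target) / (rate - c_beta)))
```

The decay exponent is the gap R − C_β(N), so halving the gap doubles the number of channel uses needed to reach a given target.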
12.3 Examples
In this section, we present various examples of channels with known formulas for the
Holevo information and/or known results on additivity of the Holevo information.
Let us start by making some observations about the Holevo information 𝜒(N)
of a channel N. First, by expanding the definition of the Holevo information using
the expression for the mutual information in terms of the relative entropy, we arrive
at the following:
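where $\{\psi_A^x\}_{x\in\mathcal{X}}$ is a set of pure states, and defining the
classical–quantum state $\rho'_{XB} = \mathcal{N}_{A\to B}(\rho_{XA})$, it follows from Proposition 7.14 that
\[ I(X; B)_{\rho'} = H\!\left(\sum_{x\in\mathcal{X}} p(x)\, \mathcal{N}(\psi_A^x)\right) - \sum_{x\in\mathcal{X}} p(x)\, H(\mathcal{N}(\psi_A^x)) \tag{12.3.5} \]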
where $\pi_A = \frac{\mathbb{1}_A}{d_A}$ and
\[ H_{\min}(\mathcal{N}) := \min_{\rho_A} H(\mathcal{N}(\rho_A)) \tag{12.3.9} \]
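Proof: We have
\begin{align}
\chi(\mathcal{N}) &= \sup_{\{(p(x), \rho_A^x)\}_x} \left[ H\!\left(\sum_{x\in\mathcal{X}} p(x)\, \mathcal{N}(\rho_A^x)\right) - \sum_{x\in\mathcal{X}} p(x)\, H(\mathcal{N}(\rho_A^x)) \right] \tag{12.3.10} \\
&\leq \sup_{\{(p(x), \rho_A^x)\}_x} \left\{ H\!\left(\sum_{x\in\mathcal{X}} p(x)\, \mathcal{N}(\rho_A^x)\right) \right\} + \sup_{\{(p(x), \rho_A^x)\}_x} \left\{ -\sum_{x\in\mathcal{X}} p(x)\, H(\mathcal{N}(\rho_A^x)) \right\} \tag{12.3.11} \\
&\leq \sup_{\rho_A} H(\mathcal{N}(\rho_A)) + \sup_{\rho_A} \{-H(\mathcal{N}(\rho_A))\} \tag{12.3.12} \\
&= \sup_{\rho_A} H(\mathcal{N}(\rho_A)) - \inf_{\rho_A} H(\mathcal{N}(\rho_A)). \tag{12.3.13}
\end{align}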
Now, by the unitary invariance of the quantum entropy, for every state $\rho_A$ we obtain
\[ H(\mathcal{N}(\rho_A)) = H(V_B^g\, \mathcal{N}(\rho_A)\, (V_B^g)^\dagger) = H(\mathcal{N}(U_A^g \rho_A (U_A^g)^\dagger)) \tag{12.3.14} \]
for all $g \in G$. This implies that
\begin{align}
H(\mathcal{N}(\rho_A)) &= \frac{1}{|G|} \sum_{g\in G} H(\mathcal{N}(U_A^g \rho_A (U_A^g)^\dagger)) \tag{12.3.15} \\
&\leq H\!\left(\sum_{g\in G} \frac{1}{|G|}\, \mathcal{N}(U_A^g \rho_A (U_A^g)^\dagger)\right) \tag{12.3.16} \\
&= H\!\left(\mathcal{N}\!\left(\frac{1}{|G|} \sum_{g\in G} U_A^g \rho_A (U_A^g)^\dagger\right)\right) \tag{12.3.17} \\
&= H(\mathcal{N}(\pi_A)), \tag{12.3.18}
\end{align}
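where the inequality follows from concavity of the quantum entropy and the last
equality follows because $\{U_A^g\}_{g\in G}$ is an irreducible representation, which implies
that
\[ \frac{1}{|G|} \sum_{g\in G} U_A^g \rho_A (U_A^g)^\dagger = \frac{\mathbb{1}_A}{d_A} \tag{12.3.19} \]
for every state $\rho_A$. Then, since we are optimizing a continuous function over a
compact and convex set, the infimum in (12.3.13) can be achieved, meaning that
we can replace the infimum in (12.3.13) with a minimum, which means that
\[ \chi(\mathcal{N}) \leq H(\mathcal{N}(\pi_A)) - H_{\min}(\mathcal{N}). \tag{12.3.20} \]
For the reverse inequality, let $\rho_A^*$ be a state achieving the minimum output entropy, and
consider the ensemble $\{(\frac{1}{|G|}, U_A^g \rho_A^* (U_A^g)^\dagger)\}_{g\in G}$. Then
\begin{align}
\chi(\mathcal{N}) &\geq H\!\left(\sum_{g\in G} \frac{1}{|G|}\, \mathcal{N}(U_A^g \rho_A^* (U_A^g)^\dagger)\right) - \sum_{g\in G} \frac{1}{|G|}\, H(\mathcal{N}(U_A^g \rho_A^* (U_A^g)^\dagger)) \tag{12.3.21} \\
&= H(\mathcal{N}(\pi_A)) - \sum_{g\in G} \frac{1}{|G|}\, H(V_B^g\, \mathcal{N}(\rho_A^*)\, (V_B^g)^\dagger) \tag{12.3.22} \\
&= H(\mathcal{N}(\pi_A)) - \sum_{g\in G} \frac{1}{|G|}\, H(\mathcal{N}(\rho_A^*)) \tag{12.3.23} \\
&= H(\mathcal{N}(\pi_A)) - H(\mathcal{N}(\rho_A^*)) \tag{12.3.24} \\
&= H(\mathcal{N}(\pi_A)) - H_{\min}(\mathcal{N}). \tag{12.3.25}
\end{align}
Therefore,
\[ \chi(\mathcal{N}) \geq H(\mathcal{N}(\pi_A)) - H_{\min}(\mathcal{N}), \tag{12.3.26} \]
and the proof is complete. ■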
where the first inequality follows from concavity of the quantum entropy. So we
have
\[ H_{\min}(\mathcal{N}) = \min_{\psi} H(\mathcal{N}(\psi)). \tag{12.3.33} \]
[Figure: the Holevo information $\chi(\mathcal{N})$ plotted against $p \in [0, 1]$ for the channels $\mathcal{D}_p$ and $\mathcal{E}_p$.]
where we used the fact that the depolarizing channel is unital, i.e., $\mathcal{D}_p(\mathbb{1}) = \mathbb{1}$, and
that $H(\pi) = \log_2 2 = 1$. To compute the minimum output entropy, we use the fact
that it suffices to minimize over pure states. It is straightforward to show that the
two eigenvalues of $\mathcal{D}_p(\psi)$ are $\left(\frac{2p}{3}, 1 - \frac{2p}{3}\right)$ for every pure state $\psi$. Therefore,
\[ H_{\min}(\mathcal{D}_p) = h_2\!\left(\frac{2p}{3}\right) \tag{12.3.36} \]
so that
\[ \chi(\mathcal{D}_p) = 1 - h_2\!\left(\frac{2p}{3}\right) \tag{12.3.37} \]
for 𝑝 ∈ [0, 1]. Since the Holevo information is known to be additive for the
depolarizing channel (please consult the Bibliographic Notes in Section 12.5), it
follows that 𝜒(D 𝑝 ) is equal to the classical capacity of the depolarizing channel.
See Figure 12.6 for a plot of the Holevo information 𝜒(D 𝑝 ) of the depolarizing
channel.
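The eigenvalue claim behind (12.3.36) can be checked numerically. The Python sketch below assumes the parameterization D_p(ρ) = (1 − p)ρ + (p/3)(XρX + YρY + ZρZ), which reproduces the eigenvalues (2p/3, 1 − 2p/3) quoted above; the definition of the channel itself lies outside this passage, so this form and all helper names are assumptions of the sketch.

```python
import cmath
import math

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def depolarize(rho, p):
    """Assumed form: (1-p) rho + (p/3)(X rho X + Y rho Y + Z rho Z)."""
    out = [[(1 - p) * rho[i][j] for j in range(2)] for i in range(2)]
    for u in (X, Y, Z):  # each Pauli is Hermitian, so u^dagger = u
        term = matmul(matmul(u, rho), u)
        for i in range(2):
            for j in range(2):
                out[i][j] += (p / 3) * term[i][j]
    return out


def eigenvalues_2x2(h):
    """Eigenvalues (ascending) of a 2x2 Hermitian matrix, in closed form."""
    a, d = complex(h[0][0]).real, complex(h[1][1]).real
    radius = math.hypot((a - d) / 2, abs(h[0][1]))
    return [(a + d) / 2 - radius, (a + d) / 2 + radius]


def binary_entropy(x):
    """Binary entropy h_2(x) in bits, with h_2(0) = h_2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def pure_state(theta, phi):
    """Density matrix of cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    amp = [math.cos(theta / 2), math.sin(theta / 2) * cmath.exp(1j * phi)]
    return [[amp[i] * amp[j].conjugate() for j in range(2)] for i in range(2)]


def holevo_depolarizing(p):
    """Holevo information 1 - h_2(2p/3) from (12.3.37)."""
    return 1.0 - binary_entropy(2 * p / 3)
```

For any Bloch-sphere angles, `eigenvalues_2x2(depolarize(pure_state(theta, phi), p))` should return approximately `[2p/3, 1 - 2p/3]`, independent of the input state.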
For the qudit depolarizing channel $\mathcal{D}_p^{(d)}$, recall from the discussion around
(11.3.26) that it is irreducibly covariant. Therefore, by Theorem 12.30, we obtain
\[ \chi(\mathcal{D}_p^{(d)}) = \log_2 d - H_{\min}(\mathcal{D}_p^{(d)}). \tag{12.3.38} \]
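The calculation of the minimum output entropy for the qudit depolarizing channel is
analogous to the calculation of the minimum output entropy of the qubit depolarizing
channel. In particular, for every pure state $\psi$, the eigenvalues of $\mathcal{D}_p^{(d)}(\psi)$ are $1 - \frac{d}{d+1} p$
(with multiplicity one) and $\frac{d}{d^2-1} p$ (with multiplicity $d - 1$). Indeed, by using the
parameterization in (4.5.37) with $q = \frac{p d^2}{d^2 - 1}$, consider that
\begin{align}
\mathcal{D}_p^{(d)}(\psi) &= (1 - q)\, \psi + q\, \frac{I}{d} \tag{12.3.39} \\
&= \left(1 - q + \frac{q}{d}\right) \psi + \frac{q}{d} (I - \psi) \tag{12.3.40} \\
&= \left(1 - \frac{pd}{d+1}\right) \psi + \frac{pd}{d^2 - 1} (I - \psi). \tag{12.3.41}
\end{align}
Therefore,
\[ H_{\min}(\mathcal{D}_p^{(d)}) = -\left(1 - \frac{dp}{d+1}\right) \log_2\!\left(1 - \frac{dp}{d+1}\right) - \frac{dp}{d+1} \log_2\!\left(\frac{dp}{d^2 - 1}\right), \tag{12.3.42} \]
so that
\[ \chi(\mathcal{D}_p^{(d)}) = \log_2 d + \left(1 - \frac{dp}{d+1}\right) \log_2\!\left(1 - \frac{dp}{d+1}\right) + \frac{dp}{d+1} \log_2\!\left(\frac{dp}{d^2 - 1}\right) \tag{12.3.43} \]
for $d \geq 2$ and $p \in [0, 1]$.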
The Holevo information is also known to be additive for the qudit depolarizing
channel, which means that the expression in (12.3.43) is equal to its classical
capacity.
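As a quick consistency check, the closed-form expression in (12.3.43) can be evaluated directly and compared against the qubit formula 1 − h_2(2p/3) at d = 2. The Python sketch below does this; the function names are hypothetical, and the convention 0 · log 0 = 0 is used at p = 0.

```python
import math


def binary_entropy(x):
    """Binary entropy h_2(x) in bits, with h_2(0) = h_2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def _x_log2_y(x, y):
    """x * log2(y) with the convention that the result is 0 when x = 0."""
    return 0.0 if x == 0 else x * math.log2(y)


def qudit_depolarizing_holevo(d, p):
    """Holevo information of the qudit depolarizing channel, per (12.3.43)."""
    if d < 2 or not 0.0 <= p <= 1.0:
        raise ValueError("need d >= 2 and p in [0, 1]")
    weight = d * p / (d + 1)       # total weight of the degenerate eigenvalue
    small = d * p / (d * d - 1)    # the (d-1)-fold degenerate eigenvalue
    return (math.log2(d)
            + _x_log2_y(1 - weight, 1 - weight)
            + _x_log2_y(weight, small))
```

At p = 0 the function returns log2(d), the capacity of the noiseless qudit channel, and for d = 2 it agrees with the qubit expression.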
\[ C(\mathcal{D}_p^{(d)}) = \chi(\mathcal{D}_p^{(d)}). \tag{12.3.45} \]
It also holds that the Holevo information is the strong converse classical capacity
of the qudit depolarizing channel, i.e.,
\[ \widetilde{C}(\mathcal{D}_p^{(d)}) = \chi(\mathcal{D}_p^{(d)}) \tag{12.3.46} \]
for all $d \geq 2$ and all $p \in [0, 1]$. Please consult the Bibliographic Notes in
Section 12.5 for a reference to the proof.
Let us now consider the erasure channel. Recall from (4.5.18) that the erasure
channel E 𝑝 , with 𝑝 ∈ [0, 1], is defined as
where |𝑒⟩ is called the erasure state and is not in the Hilbert space of the input
system 𝐴. In other words, the state |𝑒⟩⟨𝑒| is supported on the space orthogonal to
the input space. As argued in Section 11.3.1.2, we can consider the output space of
the channel to be a qutrit system with the orthonormal basis {|0⟩, |1⟩, |2⟩}, and we
can let the state |2⟩ be the erasure state. Then,
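$$\mathcal{E}_p(\pi) = \frac{1-p}{2}\,|0\rangle\langle 0| + \frac{1-p}{2}\,|1\rangle\langle 1| + p\,|2\rangle\langle 2|, \tag{12.3.50}$$
which means that
\begin{align}
H(\mathcal{E}_p(\pi)) &= -(1-p)\log_2\!\left(\frac{1-p}{2}\right) - p\log_2 p \tag{12.3.51}\\
&= 1 - p + h_2(p). \tag{12.3.52}
\end{align}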
where the second equality follows because the state |2⟩⟨2| is orthogonal to 𝜓.
Therefore,
𝐻min (E 𝑝 ) = ℎ2 ( 𝑝), (12.3.56)
which means that the Holevo information of the erasure channel is
𝜒(E 𝑝 ) = 1 − 𝑝. (12.3.57)
This is consistent with what one might expect intuitively because communication
over the erasure channel is only possible with probability 1 − 𝑝, when no erasure
occurs, and conditioned on this outcome, the erasure channel is simply the identity
channel.
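A small numerical sketch of this computation follows. It assumes, consistent with (12.3.50) and (12.3.56), that the output spectrum for the maximally mixed input is $((1-p)/2, (1-p)/2, p)$ and that the output spectrum for a pure input is $(1-p, p)$. The helper names are illustrative only.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits of a list of eigenvalues (zeros are skipped)."""
    return -sum(q * math.log2(q) for q in probs if q > 0.0)


def holevo_qubit_erasure(p: float) -> float:
    """Evaluate H(E_p(pi)) - H_min(E_p) for the qubit erasure channel."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    h_mixed_input = entropy_bits([(1 - p) / 2, (1 - p) / 2, p])  # as in (12.3.50)
    h_min = entropy_bits([1 - p, p])                              # h2(p), as in (12.3.56)
    return h_mixed_input - h_min


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 1.0):
        print(f"p = {p:.2f}  chi = {holevo_qubit_erasure(p):.6f}")
```

The difference evaluates to $1 - p$, in agreement with (12.3.57).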
In general, for the qudit erasure channel $\mathcal{E}^{(d)}_p$, whose action can be defined
on the $d$-dimensional space with orthonormal basis $\{|1\rangle, \dots, |d\rangle\}$ such that the
state $|d+1\rangle$ is the erasure state, we have that it is irreducibly covariant (see
Section 11.3.1.2). Using this fact, which implies that
$$\chi(\mathcal{E}^{(d)}_p) = H(\mathcal{E}^{(d)}_p(\pi)) - H_{\min}(\mathcal{E}^{(d)}_p), \tag{12.3.58}$$
we obtain
$$\chi(\mathcal{E}^{(d)}_p) = (1-p)\log_2 d. \tag{12.3.59}$$
Proof: By combining (12.3.59) and Proposition 12.21, we conclude that
$\Upsilon(\mathcal{E}^{(d)}_p) \geq \chi(\mathcal{E}^{(d)}_p) = (1-p)\log_2 d$. So we now establish the opposite inequality.
The classical capacity of the quantum erasure channel and its strong converse
now follow as a direct corollary of (12.3.59), (12.2.15), Proposition 12.32, the
irreducible covariance of the erasure channel, and Theorem 12.26.
$$C(\mathcal{E}^{(d)}_p) = \widetilde{C}(\mathcal{E}^{(d)}_p) = (1-p)\log_2 d. \tag{12.3.69}$$
where
$$A_1 = \sqrt{\gamma}\,|0\rangle\langle 1|, \qquad A_2 = |0\rangle\langle 0| + \sqrt{1-\gamma}\,|1\rangle\langle 1|. \tag{12.3.71}$$
(Please consult the Bibliographic Notes in Section 12.5.) It is worth noting that
neither the additivity of the Holevo information for the amplitude damping channel
nor its classical capacity is known.
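To make the operators in (12.3.71) concrete, here is a minimal sketch that treats $A_1$ and $A_2$ as Kraus operators of the amplitude damping channel $\mathcal{A}_\gamma$ (an assumption based on the surrounding discussion), checks the completeness relation $A_1^\dagger A_1 + A_2^\dagger A_2 = \mathbb{1}$, and applies the channel to a state. The matrices are real, so the adjoint is a plain transpose. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def amplitude_damping_kraus(gamma: float):
    """Return (A1, A2) from (12.3.71) in the basis {|0>, |1>}."""
    a1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]         # sqrt(gamma) |0><1|
    a2 = [[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]]   # |0><0| + sqrt(1-gamma) |1><1|
    return a1, a2


def completeness(gamma: float):
    """Compute A1^T A1 + A2^T A2, which should equal the identity."""
    a1, a2 = amplitude_damping_kraus(gamma)
    return add(matmul(transpose(a1), a1), matmul(transpose(a2), a2))


def apply_channel(rho, gamma: float):
    """Compute sum_i A_i rho A_i^T for a real 2x2 density matrix rho."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in amplitude_damping_kraus(gamma):
        out = add(out, matmul(matmul(k, rho), transpose(k)))
    return out


if __name__ == "__main__":
    print(completeness(0.3))
    print(apply_channel([[0.0, 0.0], [0.0, 1.0]], 0.3))
```

Applied to $|1\rangle\langle 1|$, the sketch returns $\gamma|0\rangle\langle 0| + (1-\gamma)|1\rangle\langle 1|$, the expected decay toward $|0\rangle$.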
[Figure: plot of $C_\beta(\mathcal{A}_\gamma)$ and $\chi(\mathcal{A}_\gamma)$ as functions of $\gamma \in [0, 1]$.]
In this section, we prove that the Holevo information is additive for all Hadamard
channels.
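\begin{multline}
\chi(\mathcal{N} \otimes \mathcal{M}) = \sup_{\{(p(x), \psi^x)\}_{x \in \mathcal{X}}} \Bigg[ H\!\left(\sum_{x \in \mathcal{X}} p(x)\,(\mathcal{N} \otimes \mathcal{M})(\psi^x_{A_1 A_2})\right) \\
- \sum_{x \in \mathcal{X}} p(x)\, H\big((\mathcal{N}^c \otimes \mathcal{M}^c)(\psi^x_{A_1 A_2})\big) \Bigg]. \tag{12.3.79}
\end{multline}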
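Now, for every bipartite state 𝜌 𝐴𝐵 , it follows from strong subadditivity that
𝐻 (𝜌 𝐴𝐵 ) ≤ 𝐻 (𝜌 𝐴 ) + 𝐻 (𝜌 𝐵 ), a fact known as the subadditivity of the quantum
entropy (consider (10.6.88) with system 𝐶 trivial). Using this for the first term in
(12.3.79), we find that
\begin{multline}
H\!\left(\sum_{x \in \mathcal{X}} p(x)\,(\mathcal{N} \otimes \mathcal{M})(\psi^x_{A_1 A_2})\right) \\
\leq H\!\left(\sum_{x \in \mathcal{X}} p(x)\,\mathcal{N}(\psi^x_{A_1})\right) + H\!\left(\sum_{x \in \mathcal{X}} p(x)\,\mathcal{M}(\psi^x_{A_2})\right). \tag{12.3.80}
\end{multline}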
We now make use of the following identity, which is straightforward to verify: for
every finite alphabet X and ensemble {( 𝑝(𝑥), 𝜌 𝑥𝐴𝐵 )},
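\begin{multline}
\sum_{x \in \mathcal{X}} p(x)\, D(\rho^x_{AB} \,\|\, \rho^x_A \otimes \rho^x_B) = \sum_{x \in \mathcal{X}} p(x)\, H(\rho^x_A) + \sum_{x \in \mathcal{X}} p(x)\, H(\rho^x_B) \\
- \sum_{x \in \mathcal{X}} p(x)\, H(\rho^x_{AB}). \tag{12.3.81}
\end{multline}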
Now, let us focus on the relative entropy term in the expression above. Since N is a
Hadamard channel, by Proposition 4.17 we know that the complementary channel
N𝑐 is entanglement-breaking. Then, from Theorem 4.15, we know that every
entanglement-breaking channel can be written as the composition of a measurement
channel followed by a preparation channel. This means that we can write N𝑐 as
N𝑐 = P ◦ Mqc , where Mqc is the measurement (or quantum–classical) channel, and
P is the preparation channel. Using the data-processing inequality for the quantum
relative entropy, for all 𝑥 ∈ X we obtain
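\begin{align}
\sum_{y \in \mathcal{Y}} q(y|x)\, \rho^{x,y}_{A_2} &= \sum_{y \in \mathcal{Y}} \operatorname{Tr}_{A_1}\!\big[(M^y_{A_1} \otimes \mathbb{1}_{A_2})\, \psi^x_{A_1 A_2}\big] \notag\\
&= \operatorname{Tr}_{A_1}[\psi^x_{A_1 A_2}] \tag{12.3.88}\\
&= \psi^x_{A_2} \notag
\end{align}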
for all 𝑥 ∈ X. Therefore, for all 𝑥 ∈ X,
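\begin{align}
H\big((\mathcal{M}^{\mathrm{qc}} \otimes \mathcal{M}^c)(\psi^x_{A_1 A_2})\big) &= H\!\left(\sum_{y \in \mathcal{Y}} q(y|x)\, |y\rangle\langle y|_Y \otimes \mathcal{M}^c(\rho^{x,y}_{A_2})\right) \tag{12.3.89}\\
&= H(Y|X = x) + \sum_{y \in \mathcal{Y}} q(y|x)\, H(\mathcal{M}^c(\rho^{x,y}_{A_2})), \tag{12.3.90}
\end{align}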
where the last equality follows from the direct-sum property of the quantum entropy.
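The direct-sum property used here can be illustrated with a short sketch. It assumes a classical-quantum state $\sum_y q(y)\,|y\rangle\langle y| \otimes \sigma_y$ in which each $\sigma_y$ is specified only through its eigenvalues, so the eigenvalues of the full state are the products $q(y)\lambda_{y,k}$. The function names are hypothetical.

```python
import math


def entropy_bits(probs):
    """Entropy in bits of a list of eigenvalues (zeros are skipped)."""
    return -sum(v * math.log2(v) for v in probs if v > 0.0)


def cq_state_entropy(q, spectra):
    """Entropy of sum_y q[y] |y><y| (x) sigma_y, computed from its block spectrum."""
    eigenvalues = [qy * lam for qy, spectrum in zip(q, spectra) for lam in spectrum]
    return entropy_bits(eigenvalues)


def direct_sum_formula(q, spectra):
    """H(q) + sum_y q[y] H(sigma_y), the right-hand side of the direct-sum property."""
    return entropy_bits(q) + sum(qy * entropy_bits(s) for qy, s in zip(q, spectra))


if __name__ == "__main__":
    q = [0.25, 0.75]
    spectra = [[0.5, 0.5], [0.9, 0.1]]
    print(cq_state_entropy(q, spectra), direct_sum_formula(q, spectra))
```

Both printed values agree, which is the statement that the entropy of a block-diagonal state splits into the entropy of the block weights plus the average block entropy.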
Putting everything together, we obtain
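\begin{align}
& H\!\left(\sum_{x \in \mathcal{X}} p(x)\,(\mathcal{N} \otimes \mathcal{M})(\psi^x_{A_1 A_2})\right) - \sum_{x \in \mathcal{X}} p(x)\, H\big((\mathcal{N}^c \otimes \mathcal{M}^c)(\psi^x_{A_1 A_2})\big) \notag\\
&\quad \leq H\!\left(\sum_{x \in \mathcal{X}} p(x)\,\mathcal{N}(\psi^x_{A_1})\right) - \sum_{x \in \mathcal{X}} p(x)\, H(\mathcal{N}^c(\psi^x_{A_1})) + H\!\left(\sum_{x \in \mathcal{X}} p(x)\,\mathcal{M}(\psi^x_{A_2})\right) \notag\\
&\qquad - \sum_{x \in \mathcal{X}} p(x)\, H(\mathcal{M}^c(\psi^x_{A_2})) + \sum_{x \in \mathcal{X}} p(x)\, H(Y|X = x) \notag\\
&\qquad + \sum_{x \in \mathcal{X}} p(x)\, H(\mathcal{M}^c(\psi^x_{A_2})) - \sum_{x \in \mathcal{X}} p(x)\, H(Y|X = x) \notag\\
&\qquad - \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x)\, q(y|x)\, H(\mathcal{M}^c(\rho^{x,y}_{A_2})) \tag{12.3.91}\\
&\quad = H\!\left(\sum_{x \in \mathcal{X}} p(x)\,\mathcal{N}(\psi^x_{A_1})\right) - \sum_{x \in \mathcal{X}} p(x)\, H(\mathcal{N}^c(\psi^x_{A_1})) \tag{12.3.92}\\
&\qquad + H\!\left(\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x)\, q(y|x)\, \mathcal{M}(\rho^{x,y}_{A_2})\right) - \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x)\, q(y|x)\, H(\mathcal{M}^c(\rho^{x,y}_{A_2})) \notag
\end{align}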
where we have used (12.3.88) and, to obtain the last inequality, the fact that the first
two terms in (12.3.93) are of the form of the objective function in the expression in
(12.3.1) for the Holevo information 𝜒(N), and similarly for the last two terms, in
which the ensemble is $\{(p(x)\,q(y|x), \rho^{x,y}_{A_2}) : x \in \mathcal{X}, y \in \mathcal{Y}\}$.
Since the ensemble {( 𝑝(𝑥), 𝜓 𝑥𝐴1 𝐴2 )}𝑥∈X used to obtain (12.3.93) is arbitrary, we
conclude that
𝜒(N ⊗ M) ≤ 𝜒(M) + 𝜒(N), (12.3.94)
which implies, via the superadditivity in (12.2.44), that 𝜒(N ⊗ M) = 𝜒(N) + 𝜒(M),
as required. ■
Exercise 12.1
Prove that the classical capacity of the 𝑑-dimensional dephasing channel, 𝑑 ≥ 2,
is log2 𝑑.
12.4 Summary
In this chapter, we developed the theory of classical communication over a quantum
channel, adopting a similar structure to that of the previous chapter. We began
with the one-shot setting of classical communication, and we defined the one-shot
classical capacity of a quantum channel in Definition 12.2. We then derived upper
(Proposition 12.3) and lower (Proposition 12.5) bounds on the one-shot classical
capacity in terms of the hypothesis testing Holevo information of a quantum channel.
The approaches to doing so are conceptually similar to those from the previous
chapter. However, there are extra steps involved in deriving the lower bound,
called derandomization and expurgation, that establish the existence of a code with
maximum error probability no larger than a given threshold and number of bits
transmitted roughly equal to the one-shot Holevo information.
With the fundamental information-theoretic arguments established in the one-
shot setting, we then moved on to the asymptotic setting of classical communication.
One of the main results is that the regularized Holevo information of a channel
is equal to its classical capacity (Theorem 12.13). We then considered some
special cases: for entanglement-breaking, Hadamard, depolarizing, and erasure
channels, the Holevo information is not only equal to the classical capacity but also
equal to the strong converse classical capacity (we showed the proofs in full for
entanglement-breaking and erasure channels, but deferred to the literature for the
others). We discussed general upper bounds on the classical capacity, including the
Υ-information and 𝐶 𝛽 semi-definite programming bound.
Going forward from here, the methods of position-based coding and sequential
decoding are useful for the tasks of secret key distillation (Chapter 15) and private
communication (Chapter 16), and the concept of derandomization appears again in
the context of private communication. The Holevo information will also play a role
in achievable rates for these tasks.
established by Shor (2002a), for Hadamard channels by King et al. (2007), for
the depolarizing channel by King (2003b), and for the erasure channel by Bennett
et al. (1997). The fact that the Holevo information is the strong converse classical
capacity of the depolarizing channel was proven by Koenig and Wehner (2009).
Additivity of the sandwiched Rényi-Holevo information for entanglement-breaking
channels (Theorem 12.16) was established by Wilde et al. (2014), by building upon
earlier seminal results of King (2003a) subsequently generalized by Holevo (2006).
That is, Lemma 12.18 is due to King (2003a) and Holevo (2006). Lemma 12.17 is due
to Wilde et al. (2014).
The Υ-information of a quantum channel and its variants were defined by Wang
et al. (2019c). The same authors established bounds on classical capacity involving
Υ-information. The strong converse for the classical capacity of the quantum
erasure channel is due to Wilde and Winter (2014), but here we have followed the
approach of Wang et al. (2019c). The semi-definite programming upper bound
𝐶 𝛽 (N) for the classical capacity of a quantum channel N was established by Wang
et al. (2018).
The Holevo information of covariant channels was studied by Holevo (2002b).
A proof of the fact that the limit in the definition of the regularized Holevo
information of a channel exists was given by Barnum et al. (1998).
The formula in (12.3.72) for the Holevo information of the amplitude damping
channel was derived by Li-Zhen and Mao-Fa (2007b), using the techniques of
Cortese (2002) and Berry (2005). The formula in (12.3.77) for the quantity 𝐶 𝛽
for the same channel was determined by Wang et al. (2018) (see also Khatri et al.
(2020)).
Here, 𝜓 𝑅 𝐴 is a pure state, with the dimension of 𝑅 equal to the dimension of 𝐴, and
the infimum is over the set 𝔉 of completely positive maps defined as
𝔉 = {F 𝐴→𝐵 : ∃ 𝜎𝐵 ≥ 0, Tr[𝜎𝐵 ] ≤ 1, F 𝐴→𝐵 (𝜌 𝐴 ) ≤ 𝜎𝐵 ∀ 𝜌 𝐴 ∈ D(H 𝐴 )}.
(12.A.4)
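Now, since the sandwiched Rényi relative entropy increases monotonically with
$\alpha$ (see Proposition 7.31), and since $\lim_{\alpha \to 1} \widetilde{D}_\alpha(\rho \| \sigma) = D(\rho \| \sigma)$ (see Proposition 7.30), we obtain
$$\lim_{\alpha \to 1^+} \widetilde{\Upsilon}_\alpha(\mathcal{N}) = \inf_{\alpha \in (1, \infty)} \sup_{\psi_{RA}} \inf_{\mathcal{F} \in \mathfrak{F}} \widetilde{D}_\alpha\big(\mathcal{N}_{A \to B}(\psi_{RA}) \,\|\, \mathcal{F}_{A \to B}(\psi_{RA})\big) \tag{12.A.5}$$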
We now prove that this equality holds. We start with the following lemma.
Lemma 12.35
Let 𝐴 and 𝐵 be Hermitian operators such that −𝐴 ≤ 𝐵 ≤ 𝐴, and let 𝐶 and 𝐷
be Hermitian operators such that −𝐶 ≤ 𝐷 ≤ 𝐶. Then,
−𝐴 ⊗ 𝐶 ≤ 𝐵 ⊗ 𝐷 ≤ 𝐴 ⊗ 𝐶. (12.B.3)
( 𝐴 − 𝐵) ⊗ (𝐶 − 𝐷) ≥ 0, (12.B.4)
( 𝐴 + 𝐵) ⊗ (𝐶 + 𝐷) ≥ 0, (12.B.5)
( 𝐴 − 𝐵) ⊗ (𝐶 + 𝐷) ≥ 0, (12.B.6)
( 𝐴 + 𝐵) ⊗ (𝐶 − 𝐷) ≥ 0. (12.B.7)
𝐴⊗𝐶−𝐵⊗𝐶−𝐴⊗𝐷+𝐵⊗𝐷 ≥ 0, (12.B.8)
𝐴⊗𝐶+𝐵⊗𝐶+𝐴⊗𝐷+𝐵⊗𝐷 ≥ 0, (12.B.9)
𝐴⊗𝐶−𝐵⊗𝐶+𝐴⊗𝐷−𝐵⊗𝐷 ≥ 0, (12.B.10)
𝐴⊗𝐶+𝐵⊗𝐶−𝐴⊗𝐷−𝐵⊗𝐷 ≥ 0. (12.B.11)
Now, adding the first two of these inequalities implies that 𝐴 ⊗ 𝐶 + 𝐵 ⊗ 𝐷 ≥ 0, which
is equivalent to the left-hand side of (12.B.3). Adding the last two inequalities
implies that 𝐴 ⊗ 𝐶 − 𝐵 ⊗ 𝐷 ≥ 0, which is equivalent to the right-hand side of
(12.B.3). ■
An immediate corollary of the lemma above is the following: for all Hermitian
operators 𝐴, 𝐵, 𝐶, 𝐷 such that 0 ≤ 𝐵 ≤ 𝐴 and 0 ≤ 𝐷 ≤ 𝐶, it holds that
0 ≤ 𝐵 ⊗ 𝐷 ≤ 𝐴 ⊗ 𝐶. (12.B.12)
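As a sanity check of Lemma 12.35, the sketch below tests the conclusion in the special case where all four operators are diagonal in a common basis, so that operator inequalities reduce to entrywise inequalities and the tensor product has diagonal entries given by pairwise products. This is only an illustration under that commuting assumption, not a substitute for the proof; `sandwich_holds` is a hypothetical name.

```python
def sandwich_holds(a, b, c, d, tol=1e-12):
    """Check -A(x)C <= B(x)D <= A(x)C for diagonal A, B, C, D.

    a, b, c, d are lists of diagonal entries. The hypotheses -A <= B <= A and
    -C <= D <= C are verified first; a ValueError is raised if they fail.
    """
    if not all(-x <= y <= x for x, y in zip(a, b)):
        raise ValueError("hypothesis -A <= B <= A is violated")
    if not all(-x <= y <= x for x, y in zip(c, d)):
        raise ValueError("hypothesis -C <= D <= C is violated")
    return all(
        -ai * cj - tol <= bi * dj <= ai * cj + tol
        for ai, bi in zip(a, b)
        for cj, dj in zip(c, d)
    )


if __name__ == "__main__":
    print(sandwich_holds([2.0, 1.0], [-1.5, 1.0], [3.0, 0.5], [-3.0, 0.2]))
```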
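$$\beta(\mathcal{N}) = \left\{ \begin{array}{l l} \text{infimum} & \operatorname{Tr}[S_B] \\ \text{subject to} & -R_{AB} \leq \mathrm{T}_B[\Gamma^{\mathcal{N}}_{AB}] \leq R_{AB}, \\ & -\mathbb{1}_A \otimes S_B \leq \mathrm{T}_B[R_{AB}] \leq \mathbb{1}_A \otimes S_B. \end{array} \right. \tag{12.B.14}$$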
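Now, let $(R^1_{AB}, S^1_B)$ be a feasible point in the SDP for $\beta(\mathcal{N}_1)$, and let $(R^2_{AB}, S^2_B)$ be
a feasible point in the SDP for $\beta(\mathcal{N}_2)$. Each pair thus satisfies the constraints in
(12.B.14). Using Lemma 12.35, the first of these constraints implies that
$$-R^1_{A_1 B_1} \otimes R^2_{A_2 B_2} \leq \mathrm{T}_{B_1}[\Gamma^{\mathcal{N}_1}_{A_1 B_1}] \otimes \mathrm{T}_{B_2}[\Gamma^{\mathcal{N}_2}_{A_2 B_2}] \leq R^1_{A_1 B_1} \otimes R^2_{A_2 B_2}. \tag{12.B.15}$$
Furthermore, observe that
\begin{align}
\mathrm{T}_{B_1}[\Gamma^{\mathcal{N}_1}_{A_1 B_1}] \otimes \mathrm{T}_{B_2}[\Gamma^{\mathcal{N}_2}_{A_2 B_2}] &= \mathrm{T}_{B_1 B_2}[\Gamma^{\mathcal{N}_1}_{A_1 B_1} \otimes \Gamma^{\mathcal{N}_2}_{A_2 B_2}] \tag{12.B.16}\\
&= \mathrm{T}_{B_1 B_2}[\Gamma^{\mathcal{N}_1 \otimes \mathcal{N}_2}_{A_1 A_2 B_1 B_2}]. \tag{12.B.17}
\end{align}
Using this, along with Lemma 12.35, the second constraint in (12.B.14) implies
that
$$-\mathbb{1}_{A_1 A_2} \otimes S^1_{B_1} \otimes S^2_{B_2} \leq \mathrm{T}_{B_1 B_2}[R^1_{A_1 B_1} \otimes R^2_{A_2 B_2}] \leq \mathbb{1}_{A_1 A_2} \otimes S^1_{B_1} \otimes S^2_{B_2}. \tag{12.B.18}$$
Now, the inequalities in (12.B.17) and (12.B.18) imply that $(R^1_{A_1 B_1} \otimes R^2_{A_2 B_2}, S^1_{B_1} \otimes S^2_{B_2})$
is a feasible point in the SDP for $\beta(\mathcal{N}_1 \otimes \mathcal{N}_2)$. This means that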
Lemma 12.36
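For every quantum channel $\mathcal{N}$, the SDP dual to the SDP in (12.B.14) for $\beta(\mathcal{N})$
is given by
$$\widehat{\beta}(\mathcal{N}) := \left\{ \begin{array}{l l} \text{supremum} & \operatorname{Tr}\!\big[\mathrm{T}_B[\Gamma^{\mathcal{N}}_{AB}]\,(K_{AB} - M_{AB})\big] \\ \text{subject to} & K_{AB} + M_{AB} \leq \mathrm{T}_B[E_{AB} - F_{AB}], \\ & E_B + F_B \leq \mathbb{1}_B, \\ & K_{AB}, M_{AB}, E_{AB}, F_{AB} \geq 0. \end{array} \right. \tag{12.B.22}$$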
Proof: Using the formulation of the SDP for 𝛽(N) as in (12.2.229), the dual to
the SDP for 𝛽(N) is simply
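$$\widehat{\beta}(\mathcal{N}) = \left\{ \begin{array}{l l} \text{supremum} & \operatorname{Tr}[DY] \\ \text{subject to} & \Phi^\dagger(Y) \leq C, \\ & Y \geq 0, \end{array} \right. \tag{12.B.23}$$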
where
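$$C = \begin{pmatrix} \mathbb{1}_B & 0 \\ 0 & 0_{AB} \end{pmatrix}, \qquad D = \begin{pmatrix} \mathrm{T}_B[\Gamma^{\mathcal{N}}_{AB}] & 0 & 0 & 0 \\ 0 & -\mathrm{T}_B[\Gamma^{\mathcal{N}}_{AB}] & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \tag{12.B.24}$$
$$\Phi(X) = \begin{pmatrix} R_{AB} & 0 & 0 & 0 \\ 0 & R_{AB} & 0 & 0 \\ 0 & 0 & \mathbb{1}_R \otimes S_B - \mathrm{T}_B[R_{AB}] & 0 \\ 0 & 0 & 0 & \mathbb{1}_R \otimes S_B + \mathrm{T}_B[R_{AB}] \end{pmatrix}, \tag{12.B.25}$$
$$X = \begin{pmatrix} S_B & 0 \\ 0 & R_{AB} \end{pmatrix}. \tag{12.B.26}$$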
To determine the adjoint Φ† , we first observe that, since the operators 𝐶 and 𝐷
are block diagonal, the objective function Tr[𝐷𝑌 ] of the dual problem involves
only the diagonal blocks of 𝑌 . Furthermore, the fact that Φ(𝑋) and 𝑋 are block
diagonal means that the condition Tr[Φ(𝑋)𝑌 ] = Tr[𝑋Φ† (𝑌 )] defining the adjoint
map Φ† involves only the diagonal blocks of 𝑌 . Therefore, if the dual problem is
feasible, then there is always a feasible point 𝑌 that is block diagonal. This means
that, without loss of generality, we can let
$$Y = \begin{pmatrix} Y^1_{AB} & 0 & 0 & 0 \\ 0 & Y^2_{AB} & 0 & 0 \\ 0 & 0 & Y^3_{AB} & 0 \\ 0 & 0 & 0 & Y^4_{AB} \end{pmatrix}, \tag{12.B.27}$$
with $Y^1_{AB}, Y^2_{AB}, Y^3_{AB}, Y^4_{AB} \geq 0$. Then,
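\begin{align}
& \operatorname{Tr}[\Phi(X) Y] \notag\\
&= \operatorname{Tr}\!\left[ \begin{pmatrix} R_{AB} & 0 & 0 & 0 \\ 0 & R_{AB} & 0 & 0 \\ 0 & 0 & \mathbb{1}_R \otimes S_B - \mathrm{T}_B[R_{AB}] & 0 \\ 0 & 0 & 0 & \mathbb{1}_R \otimes S_B + \mathrm{T}_B[R_{AB}] \end{pmatrix} \begin{pmatrix} Y^1_{AB} & 0 & 0 & 0 \\ 0 & Y^2_{AB} & 0 & 0 \\ 0 & 0 & Y^3_{AB} & 0 \\ 0 & 0 & 0 & Y^4_{AB} \end{pmatrix} \right] \tag{12.B.28}\\
&= \operatorname{Tr}\!\big[ R_{AB} Y^1_{AB} + R_{AB} Y^2_{AB} + (\mathbb{1}_R \otimes S_B - \mathrm{T}_B[R_{AB}])\, Y^3_{AB} + (\mathbb{1}_R \otimes S_B + \mathrm{T}_B[R_{AB}])\, Y^4_{AB} \big] \tag{12.B.29}\\
&= \operatorname{Tr}[S_B (Y^3_B + Y^4_B)] + \operatorname{Tr}\!\big[R_{AB} (Y^1_{AB} + Y^2_{AB} + \mathrm{T}_B[Y^4_{AB} - Y^3_{AB}])\big] \tag{12.B.30}\\
&= \operatorname{Tr}\!\left[ \begin{pmatrix} S_B & 0 \\ 0 & R_{AB} \end{pmatrix} \begin{pmatrix} Y^3_B + Y^4_B & 0 \\ 0 & Y^1_{AB} + Y^2_{AB} + \mathrm{T}_B[Y^4_{AB} - Y^3_{AB}] \end{pmatrix} \right]. \tag{12.B.31}
\end{align}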
$$Y^1_{AB} + Y^2_{AB} \leq \mathrm{T}_B[Y^3_{AB} - Y^4_{AB}]. \tag{12.B.35}$$
Then,
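$$\widehat{\beta}(\mathcal{N}) = \left\{ \begin{array}{l l} \text{supremum} & \operatorname{Tr}\!\big[\mathrm{T}_B[\Gamma^{\mathcal{N}}_{AB}]\,(K_{AB} - M_{AB})\big] \\ \text{subject to} & K_{AB} + M_{AB} \leq \mathrm{T}_B[E_{AB} - F_{AB}], \\ & E_B + F_B \leq \mathbb{1}_B, \\ & K_{AB}, M_{AB}, E_{AB}, F_{AB} \geq 0, \end{array} \right. \tag{12.B.38}$$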
as required.
To show that $\widehat{\beta}(\mathcal{N}) = \beta(\mathcal{N})$, we need to check that Slater’s condition holds
(Theorem 2.28). We can pick $E_{AB} = \frac{\mathbb{1}_{AB}}{3 d_A}$, $F_{AB} = \frac{\mathbb{1}_{AB}}{6 d_A}$, and $K_{AB} = M_{AB} = \frac{\mathbb{1}_{AB}}{24 d_A}$,
where $d_A$ is the dimension of the space of the system $A$. Then we have strict
inequalities for all of the constraints of the dual problem, which means that Slater’s
condition holds. The primal $\beta(\mathcal{N})$ and dual $\widehat{\beta}(\mathcal{N})$ are thus equal. ■
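The strictness claimed for this point can be checked with exact rational arithmetic. The sketch below reads the point as $E_{AB} = \mathbb{1}_{AB}/(3 d_A)$, $F_{AB} = \mathbb{1}_{AB}/(6 d_A)$, and $K_{AB} = M_{AB} = \mathbb{1}_{AB}/(24 d_A)$. Since every operator is a multiple of the identity, the partial transpose acts trivially and the partial trace over $A$ multiplies the coefficient by $d_A$, so each constraint reduces to a comparison of scalars. The function name `slater_margins` is hypothetical.

```python
from fractions import Fraction


def slater_margins(d_a: int):
    """Return the slack in the two dual constraints at the proposed strictly feasible point.

    margin_1 is the coefficient of T_B[E - F] - (K + M) (a multiple of 1_AB).
    margin_2 is the coefficient of 1_B - (E_B + F_B) (a multiple of 1_B).
    """
    e = Fraction(1, 3 * d_a)
    f = Fraction(1, 6 * d_a)
    k = m = Fraction(1, 24 * d_a)
    margin_1 = (e - f) - (k + m)     # partial transpose of the identity is the identity
    margin_2 = 1 - d_a * (e + f)     # tracing out A multiplies the coefficient by d_A
    return margin_1, margin_2


if __name__ == "__main__":
    for d_a in (2, 3, 5):
        print(d_a, slater_margins(d_a))
```

Both margins are strictly positive for every $d_A \geq 1$ (they equal $1/(12 d_A)$ and $1/2$), and $K$, $M$, $E$, $F$ are themselves strictly positive.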
With the dual problem in hand, we can now prove (12.B.21). Let $(K^1_{A_1 B_1}, M^1_{A_1 B_1}, E^1_{A_1 B_1}, F^1_{A_1 B_1})$
be a feasible point for the dual SDP for $\mathcal{N}_1$, and let $(K^2_{A_2 B_2}, M^2_{A_2 B_2}, E^2_{A_2 B_2}, F^2_{A_2 B_2})$
be a feasible point for the dual SDP for $\mathcal{N}_2$. Then, pick
\begin{align}
K_{A_1 B_1 A_2 B_2} &= K^1_{A_1 B_1} \otimes K^2_{A_2 B_2} + M^1_{A_1 B_1} \otimes M^2_{A_2 B_2}, \tag{12.B.39}\\
M_{A_1 B_1 A_2 B_2} &= K^1_{A_1 B_1} \otimes M^2_{A_2 B_2} + M^1_{A_1 B_1} \otimes K^2_{A_2 B_2}, \tag{12.B.40}\\
E_{A_1 B_1 A_2 B_2} &= E^1_{A_1 B_1} \otimes E^2_{A_2 B_2} + F^1_{A_1 B_1} \otimes F^2_{A_2 B_2}, \tag{12.B.41}\\
F_{A_1 B_1 A_2 B_2} &= E^1_{A_1 B_1} \otimes F^2_{A_2 B_2} + F^1_{A_1 B_1} \otimes E^2_{A_2 B_2}. \tag{12.B.42}
\end{align}
\begin{align}
K_{A_1 B_1 A_2 B_2} - M_{A_1 B_1 A_2 B_2} &= (K^1_{A_1 B_1} - M^1_{A_1 B_1}) \otimes (K^2_{A_2 B_2} - M^2_{A_2 B_2}), \tag{12.B.43}\\
K_{A_1 B_1 A_2 B_2} + M_{A_1 B_1 A_2 B_2} &= (K^1_{A_1 B_1} + M^1_{A_1 B_1}) \otimes (K^2_{A_2 B_2} + M^2_{A_2 B_2}), \tag{12.B.44}\\
E_{A_1 B_1 A_2 B_2} - F_{A_1 B_1 A_2 B_2} &= (E^1_{A_1 B_1} - F^1_{A_1 B_1}) \otimes (E^2_{A_2 B_2} - F^2_{A_2 B_2}), \tag{12.B.45}\\
E_{A_1 B_1 A_2 B_2} + F_{A_1 B_1 A_2 B_2} &= (E^1_{A_1 B_1} + F^1_{A_1 B_1}) \otimes (E^2_{A_2 B_2} + F^2_{A_2 B_2}). \tag{12.B.46}
\end{align}
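The factorizations in (12.B.43) and (12.B.44) are purely algebraic consequences of the choices in (12.B.39) and (12.B.40), and they can be confirmed on small examples. The sketch below uses a hand-written Kronecker product on integer matrices so that equality is exact; the matrices are arbitrary and are not required to be feasible points. All helper names are illustrative.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def combined_point(k1, m1, k2, m2):
    """Build K and M on the joint system as in (12.B.39) and (12.B.40)."""
    k = add(kron(k1, k2), kron(m1, m2))
    m = add(kron(k1, m2), kron(m1, k2))
    return k, m


if __name__ == "__main__":
    k1, m1 = [[2, 1], [1, 3]], [[1, 0], [0, 1]]
    k2, m2 = [[4, 0], [0, 1]], [[1, 2], [2, 5]]
    k, m = combined_point(k1, m1, k2, m2)
    print(sub(k, m) == kron(sub(k1, m1), sub(k2, m2)))   # (12.B.43)
    print(add(k, m) == kron(add(k1, m1), add(k2, m2)))   # (12.B.44)
```

The same check applies verbatim to $E$ and $F$ in (12.B.45) and (12.B.46), since they are built from the same pattern.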
Consider that
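\begin{align}
K_{A_1 B_1 A_2 B_2} + M_{A_1 B_1 A_2 B_2} &= (K^1_{A_1 B_1} + M^1_{A_1 B_1}) \otimes (K^2_{A_2 B_2} + M^2_{A_2 B_2}) \tag{12.B.47}\\
&\leq \mathrm{T}_{B_1}[E^1_{A_1 B_1} - F^1_{A_1 B_1}] \otimes \mathrm{T}_{B_2}[E^2_{A_2 B_2} - F^2_{A_2 B_2}] \tag{12.B.48}\\
&= \mathrm{T}_{B_1 B_2}[(E^1_{A_1 B_1} - F^1_{A_1 B_1}) \otimes (E^2_{A_2 B_2} - F^2_{A_2 B_2})] \tag{12.B.49}\\
&= \mathrm{T}_{B_1 B_2}[E_{A_1 B_1 A_2 B_2} - F_{A_1 B_1 A_2 B_2}], \tag{12.B.50}
\end{align}
where the inequality follows from the constraints $K^i_{A_i B_i}, M^i_{A_i B_i} \geq 0$ and
$K^i_{A_i B_i} + M^i_{A_i B_i} \leq \mathrm{T}_{B_i}[E^i_{A_i B_i} - F^i_{A_i B_i}]$ for $i \in \{1, 2\}$ and from an application of (12.B.12).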
Furthermore, we have that
where the inequality follows from the constraints $E^i_{B_i}, F^i_{B_i} \geq 0$ and $E^i_{B_i} + F^i_{B_i} \leq \mathbb{1}_{B_i}$
for 𝑖 ∈ {1, 2} and from an application of (12.B.12). The collection
(𝐾 𝐴1 𝐵1 𝐴2 𝐵2 , 𝑀 𝐴1 𝐵1 𝐴2 𝐵2 , 𝐸 𝐴1 𝐵1 𝐴2 𝐵2 , 𝐹𝐴1 𝐵1 𝐴2 𝐵2 ) (12.B.54)
thus constitutes a feasible point for the SDP in (12.B.38). By restricting the
optimization in the SDP to this point, we find that
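\begin{align}
& \beta(\mathcal{N}_1 \otimes \mathcal{N}_2) \tag{12.B.55}\\
&\geq \operatorname{Tr}\!\big[ \mathrm{T}_{B_1 B_2}[\Gamma^{\mathcal{N}_1 \otimes \mathcal{N}_2}_{A_1 A_2 B_1 B_2}]\, (K_{A_1 B_1 A_2 B_2} - M_{A_1 B_1 A_2 B_2}) \big] \tag{12.B.56}\\
&= \operatorname{Tr}\!\big[ \big(\mathrm{T}_{B_1}[\Gamma^{\mathcal{N}_1}_{A_1 B_1}] \otimes \mathrm{T}_{B_2}[\Gamma^{\mathcal{N}_2}_{A_2 B_2}]\big)\, (K_{A_1 B_1 A_2 B_2} - M_{A_1 B_1 A_2 B_2}) \big] \tag{12.B.57}\\
&= \operatorname{Tr}\!\big[ \big(\mathrm{T}_{B_1}[\Gamma^{\mathcal{N}_1}_{A_1 B_1}] \otimes \mathrm{T}_{B_2}[\Gamma^{\mathcal{N}_2}_{A_2 B_2}]\big) \big((K^1_{A_1 B_1} - M^1_{A_1 B_1}) \otimes (K^2_{A_2 B_2} - M^2_{A_2 B_2})\big) \big] \tag{12.B.58}\\
&= \operatorname{Tr}\!\big[ \mathrm{T}_{B_1}[\Gamma^{\mathcal{N}_1}_{A_1 B_1}]\, (K^1_{A_1 B_1} - M^1_{A_1 B_1}) \big] \times \operatorname{Tr}\!\big[ \mathrm{T}_{B_2}[\Gamma^{\mathcal{N}_2}_{A_2 B_2}]\, (K^2_{A_2 B_2} - M^2_{A_2 B_2}) \big]. \tag{12.B.59}
\end{align}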
Now, since $(K^1_{AB}, M^1_{AB}, E^1_{AB}, F^1_{AB})$ and $(K^2_{AB}, M^2_{AB}, E^2_{AB}, F^2_{AB})$ were arbitrary feasible
points in the SDPs for $\widehat{\beta}(\mathcal{N}_1) = \beta(\mathcal{N}_1)$ and $\widehat{\beta}(\mathcal{N}_2) = \beta(\mathcal{N}_2)$, respectively, the
inequality in (12.B.59) holds for the feasible points achieving 𝛽(N1 ) and 𝛽(N2 ).
Therefore,
𝛽(N1 ⊗ N2 ) ≥ 𝛽(N1 ) · 𝛽(N2 ). (12.B.60)
We have thus shown that 𝛽(N1 ⊗ N2 ) = 𝛽(N1 ) · 𝛽(N2 ).
Chapter 13
Entanglement Distillation
In the last two chapters, we explored classical communication over quantum channels,
in which classical information is encoded into a quantum state, transmitted over
a quantum channel, and decoded at the receiving end. In this chapter, we begin
our exploration of quantum communication. The goal here is to send quantum
information between two spatially separated parties. By “quantum information,”
we mean that a particular quantum state is transmitted, which is carried physically
by some quantum system. As was the case in previous chapters, the particular
information carrier is unimportant to us when developing the theoretical results;
however, the most common physical manifestation is a photonic encoding, which is
useful for long-distance quantum communication.
A basic quantum communication protocol is teleportation, which we developed
in Section 5.1. In this protocol, the sender, Alice, initially shares a maximally
entangled state with the receiver, Bob. This shared entanglement, along with
classical communication, can be used to transmit an arbitrary quantum state
perfectly from Alice to Bob. Specifically, if Alice and Bob share a maximally
entangled state of Schmidt rank 𝑑 ≥ 2, then using this entanglement along
with 2 log2 𝑑 bits of classical communication, Alice can perfectly transmit an
arbitrary state of log2 𝑑 qubits to Bob. Thus, the quantum teleportation protocol
realizes a noiseless quantum channel between Alice and Bob without having to
physically transport the particles carrying the quantum information. Of course, this
achievement comes at the cost of having a pre-shared maximally entangled state.
Figure 13.1: Given a bipartite state $\rho_{AB}$ shared by Alice and Bob, the task of
entanglement distillation is to find the largest $d$ for which a maximally entangled
state $|\Phi\rangle_{\hat{A}\hat{B}}$ of Schmidt rank $d$ can be extracted from $n$ copies of $\rho_{AB}$ with the
smallest possible error, given a two-way LOCC channel $\mathcal{L}^{\leftrightarrow}_{A^n B^n \to \hat{A}\hat{B}}$ between
Alice and Bob.

How do we obtain maximally entangled states in the first place? In practice,
due to noise and other device imperfections, physical sources of entanglement
often only produce mixed entangled states, not the pure, maximally entangled
states that are needed for quantum teleportation. The purpose of this chapter is
to show that many copies of a mixed entangled state can be used to extract, or
distill, some smaller number of pure maximally entangled states. These distilled
maximally entangled states can then be used for quantum communication via the
teleportation protocol. This is a basic strategy for quantum communication that
we consider in more detail in Chapter 19, in order to obtain achievable rates for
quantum communication over a quantum channel.
Similar to quantum teleportation, in which the allowed resources are local
operations by Alice and Bob and one-way classical communication from Alice to
Bob, in entanglement distillation we allow Alice and Bob local operations with
two-way classical communication (that is, communication from Alice to Bob and
from Bob to Alice); see Figure 13.1. The goal is to determine, given many copies
of a quantum state 𝜌 𝐴𝐵 , the maximum rate at which maximally entangled states
(i.e., ebits) can be distilled approximately from 𝜌 𝐴𝐵 , where the rate is defined as
the ratio $\frac{1}{n} \log_2 d$ between the number log2 𝑑 of approximate ebits extracted and the
initial number 𝑛 of copies of 𝜌 𝐴𝐵 . In the asymptotic setting, this maximum rate
of entanglement distillation is called the distillable entanglement of 𝜌 𝐴𝐵 , and we
denote it by 𝐸 𝐷 (𝜌 𝐴𝐵 ). We often write 𝐸 𝐷 (𝜌 𝐴𝐵 ) as 𝐸 𝐷 ( 𝐴; 𝐵) 𝜌 in order to explicitly
indicate the bipartition between the subsystems.
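The rate just defined is a simple ratio, and a tiny sketch makes the bookkeeping explicit. The function name `distillation_rate` and the example numbers are illustrative and do not correspond to any particular protocol.

```python
import math


def distillation_rate(n: int, d: int) -> float:
    """Rate (1/n) * log2(d): approximate ebits extracted per input copy of rho_AB."""
    if n < 1 or d < 1:
        raise ValueError("need n >= 1 copies and Schmidt rank d >= 1")
    return math.log2(d) / n


if __name__ == "__main__":
    # Example: extracting a rank-32 maximally entangled state (5 ebits) from 10 copies.
    print(distillation_rate(10, 32))
```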
The shared resource state 𝜌 𝐴𝐵 for entanglement distillation has to be entangled
to begin with in order for entanglement distillation to be successful. If 𝜌 𝐴𝐵 is
separable to begin with, then it stays separable after the application of an LOCC
channel, and it is not possible to distill high fidelity maximally entangled states
from a separable state. This intuitive reasoning becomes formalized in this chapter:
some of the entanglement measures from Chapter 9 serve as upper bounds on
the distillable entanglement, in both the one-shot (Section 13.1) and asymptotic
(Section 13.2) settings. In particular, the Rains relative entropy and squashed
entanglement are upper bounds on distillable entanglement. These entanglement
measures are currently the best known upper bounds on distillable entanglement,
and so we focus exclusively on them for this purpose in this chapter. It is then
a trivial consequence of Proposition 9.24, (9.1.149), and Proposition 9.35 that
log-negativity, relative entropy of entanglement, and entanglement of formation
are upper bounds on distillable entanglement, and so we do not focus on these
entanglement measures in this chapter.
We also consider lower bounds on distillable entanglement in this chapter: the
lower bound on distillable entanglement in the one-shot setting in Section 13.1.2 is
based on the concept of decoupling, which is an important concept that we discuss
later. This lower bound, when applied in the asymptotic setting, leads to the coherent
information lower bound 𝐸 𝐷 (𝜌 𝐴𝐵 ) ≥ 𝐼 ( 𝐴⟩𝐵) 𝜌 on distillable entanglement.
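\begin{align}
p_{\mathrm{err}}(\mathcal{L}^{\leftrightarrow}; \rho_{AB}) &:= 1 - F\big(\Phi_{\hat{A}\hat{B}}, \mathcal{L}^{\leftrightarrow}_{AB \to \hat{A}\hat{B}}(\rho_{AB})\big) \tag{13.1.1}\\
&= 1 - \langle\Phi|_{\hat{A}\hat{B}}\, \mathcal{L}^{\leftrightarrow}_{AB \to \hat{A}\hat{B}}(\rho_{AB})\, |\Phi\rangle_{\hat{A}\hat{B}}, \tag{13.1.2}
\end{align}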
and 𝐹 is the fidelity (see Section 6.2). To obtain (13.1.2), we used the formula in
(6.2.2) for the fidelity between a pure state and a mixed state.
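The figure of merit in (13.1.2) is sensible: the error probability $p_{\mathrm{err}}(\mathcal{L}^{\leftrightarrow}; \rho_{AB})$ is
equal to the probability that the state $\omega_{\hat{A}\hat{B}} := \mathcal{L}^{\leftrightarrow}_{AB \to \hat{A}\hat{B}}(\rho_{AB})$ fails an “entanglement
test,” which is a measurement defined by the POVM
$$\{\Phi_{\hat{A}\hat{B}},\ \mathbb{1}_{\hat{A}\hat{B}} - \Phi_{\hat{A}\hat{B}}\}. \tag{13.1.4}$$
Passing the test corresponds to the measurement operator $\Phi_{\hat{A}\hat{B}}$ and failing corresponds
to $\mathbb{1}_{\hat{A}\hat{B}} - \Phi_{\hat{A}\hat{B}}$. If $1 - \operatorname{Tr}[\omega_{\hat{A}\hat{B}} \Phi_{\hat{A}\hat{B}}] \leq \varepsilon \in [0, 1]$, and $d_{\hat{A}} = d_{\hat{B}} = d \geq 1$,
then we say that the final state $\omega_{\hat{A}\hat{B}}$ contains $\log_2 d$ $\varepsilon$-approximate ebits.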
Given 𝜀 ∈ [0, 1], the largest number log2 𝑑 of 𝜀-approximate ebits that can be
extracted from a state 𝜌 𝐴𝐵 among all (𝑑, 𝜀) entanglement distillation protocols is
called the one-shot 𝜀-distillable entanglement of 𝜌 𝐴𝐵 .
In addition to finding the largest number log2 𝑑 of 𝜀-approximate ebits that can
be extracted from all (𝑑, 𝜀) entanglement distillation protocols for a given 𝜀 ∈ [0, 1],
we can consider the following complementary question: for a given 𝑑 ≥ 1, what is
the lowest value of 𝜀 that can be attained among all (𝑑, 𝜀) entanglement distillation
protocols? In other words, what is the value of
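$$\varepsilon^*_D(d; \rho_{AB}) := \inf_{\mathcal{L}^{\leftrightarrow}_{AB \to \hat{A}\hat{B}} \in \mathrm{LOCC}} \{ p_{\mathrm{err}}(\mathcal{L}^{\leftrightarrow}; \rho_{AB}) : d_{\hat{A}} = d_{\hat{B}} = d \}, \tag{13.1.6}$$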
extracted (approximate) ebits rather than the error, and so our primary quantity of
interest is the one-shot distillable entanglement $E^{\varepsilon}_D(\rho_{AB})$.
Calculating the one-shot distillable entanglement is generally a difficult task,
because it involves optimizing over every Schmidt rank 𝑑 ≥ 1 of the maximally
entangled state Φ 𝐴ˆ 𝐵ˆ and over every LOCC channel L 𝐴𝐵→ 𝐴ˆ 𝐵ˆ , with 𝑑 𝐴ˆ = 𝑑 𝐵ˆ = 𝑑.
We therefore try to estimate the one-shot distillable entanglement by devising upper
and lower bounds. We begin in the next section with upper bounds.
Lemma 13.3
Let 𝐴 and 𝐵 be quantum systems with the same dimension 𝑑 ≥ 1. Let Φ 𝐴𝐵
be a maximally entangled state of Schmidt rank 𝑑, and let 𝜔 𝐴𝐵 be an arbitrary
bipartite state. If the probability Tr[Φ 𝐴𝐵 𝜔 𝐴𝐵 ] that the state 𝜔 𝐴𝐵 passes the
entanglement test defined by the POVM {Φ 𝐴𝐵 , 1 𝐴𝐵 − Φ 𝐴𝐵 } satisfies
Tr[Φ 𝐴𝐵 𝜔 𝐴𝐵 ] ≥ 1 − 𝜀 (13.1.7)
𝐹 (Φ 𝐴𝐵 , 𝜌 𝐴𝐵 ) = Tr[Φ 𝐴𝐵 𝜔 𝐴𝐵 ] ≥ 1 − 𝜀. (13.1.10)
where
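$$\frac{1}{d} = \operatorname{Tr}[\Phi_{AB} (\mathbb{1}_A \otimes \sigma_B)] \geq 2^{-D^{\varepsilon}_H(\omega_{AB} \| \mathbb{1}_A \otimes \sigma_B)}, \tag{13.1.16}$$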
which implies that
log2 𝑑 ≤ 𝐷 𝜀𝐻 (𝜔 𝐴𝐵 ∥ 1 𝐴 ⊗ 𝜎𝐵 ) (13.1.17)
for every state 𝜎𝐵 . Optimizing over 𝜎𝐵 leads to
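\begin{align}
&= \frac{1}{d^2} \langle\Gamma|_{AB} (\mathbb{1}_A \otimes \sigma_B) |\Gamma\rangle_{AB} \tag{13.1.20}\\
&= \frac{1}{d^2}, \tag{13.1.21}
\end{align}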
where the last line follows from the same reasoning for (13.1.11)–(13.1.13). Next,
recall that
which is precisely (13.1.9). To obtain the last equality, we made use of the
assumption Tr 𝐵 [𝜔 𝐴𝐵 ] = 𝜋 𝐴 . ■
Note that the result of Lemma 13.3 is general and applies to every bipartite
state that is close in fidelity to a maximally entangled state. Applying it to the state
𝜔 𝐴ˆ 𝐵ˆ = L 𝐴𝐵→ 𝐴ˆ 𝐵ˆ (𝜌 𝐴𝐵 ) at the output of a (𝑑, 𝜀) entanglement distillation protocol
for a state 𝜌 𝐴𝐵 , we obtain the following result:
We now consider an upper bound based on the Rains relative entropy. In order
to place an upper bound on the one-shot distillable entanglement 𝐸 𝐷𝜀 (𝜌 𝐴𝐵 ) for a
given state 𝜌 𝐴𝐵 and 𝜀 ∈ [0, 1], we consider states that are useless for entanglement
distillation. This is entirely analogous conceptually to what is done for classical
and entanglement-assisted classical communication in the previous two chapters,
in which we used the set of replacement channels (which are useless for both of
these communication tasks) to place an upper bound on the number of transmitted
bits in an (|M|, 𝜀) protocol, such that the upper bound depends only on the channel
N being used for communication.
What states are useless for entanglement distillation? Note that an intuitive
necessary condition for successful entanglement distillation is that the initial state
𝜌 𝐴𝐵 should be entangled: if 𝜌 𝐴𝐵 is separable, then the output state L 𝐴𝐵→ 𝐴ˆ 𝐵ˆ (𝜌 𝐴𝐵 )
of an arbitrary entanglement distillation protocol is still a separable state. This
suggests that separable states are useless for entanglement distillation. To be
more precise, separable states are useless for entanglement distillation because they
have a very small probability of passing the entanglement test. As we show in
Lemma 13.5 below, the following bound holds for every separable state 𝜎𝐴𝐵 :
$$\operatorname{Tr}[\Phi_{AB} \sigma_{AB}] \leq \frac{1}{d}, \tag{13.1.29}$$
where 𝑑 is the Schmidt rank of Φ 𝐴𝐵 . More generally, operators in the set PPT′,
defined as
PPT′ ( 𝐴 : 𝐵) = {𝜎𝐴𝐵 : 𝜎𝐴𝐵 ≥ 0, ∥T𝐵 (𝜎𝐴𝐵 )∥ 1 ≤ 1}, (13.1.30)
are also useless for entanglement distillation, in the sense that a statement analogous
to (13.1.29) can be made for them. We now prove this statement.
Lemma 13.5
Let 𝐴 and 𝐵 be quantum systems with the same dimension 𝑑 ≥ 1. Let Φ 𝐴𝐵
be a maximally entangled state of Schmidt rank 𝑑. If 𝜎𝐴𝐵 ∈ PPT′ ( 𝐴 : 𝐵), then
$\operatorname{Tr}[\Phi_{AB} \sigma_{AB}] \leq \frac{1}{d}$.
Remark: Note that Lemma 13.5 implies that $\operatorname{Tr}[\Phi_{AB} \sigma_{AB}] \leq \frac{1}{d}$ for every separable state $\sigma_{AB}$
because SEP( 𝐴 : 𝐵) ⊆ PPT′ ( 𝐴 : 𝐵) (recall Figure 9.2).
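The bound for separable states can be illustrated on pure product states, for which the overlap has a closed form. Assuming $|\Phi\rangle = d^{-1/2} \sum_i |i\rangle|i\rangle$ and a product state $|a\rangle \otimes |b\rangle$ with unit vectors $a$ and $b$, one has $\operatorname{Tr}[\Phi\, |a,b\rangle\langle a,b|] = |\sum_i a_i b_i|^2 / d$, which is at most $1/d$ by the Cauchy–Schwarz inequality. General separable states are convex mixtures of such states, so the bound extends by linearity. The sketch below evaluates this overlap; the function name is hypothetical.

```python
def overlap_with_phi(a, b):
    """Return Tr[Phi |a,b><a,b|] for unit vectors a, b of equal length d.

    Uses <Phi|a,b> = d**-0.5 * sum_i a_i * b_i for |Phi> = d**-0.5 * sum_i |i,i>.
    """
    d = len(a)
    if len(b) != d:
        raise ValueError("a and b must have the same dimension")
    amplitude = sum(x * y for x, y in zip(a, b)) / d ** 0.5
    return abs(amplitude) ** 2


if __name__ == "__main__":
    print(overlap_with_phi([1, 0], [1, 0]))             # saturates the bound 1/d
    print(overlap_with_phi([1, 0], [0, 1]))             # orthogonal choice
    print(overlap_with_phi([0.6, 0.8j], [0.8, -0.6j]))  # complex amplitudes
```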
Proof: Using the fact that the partial transpose is self-inverse and self-adjoint, as
discussed in (3.2.114) and (3.2.115), respectively, we find that
Due to the fact that SEP ⊆ PPT ⊆ PPT′ (see Figure 9.2), Lemma 13.5 tells
us that both separable and PPT states are useless for entanglement distillation.
However, due to the fact that separable states are strictly contained in the set of PPT
states for all bipartite states except for qubit-qubit and qubit-qutrit states, it follows
that there are PPT entangled states that are useless for entanglement distillation.
We elaborate upon this point further in Section 13.2.0.1 below, and we show that
the distillable entanglement (in the asymptotic setting) vanishes for all PPT states.
The steps followed in the proof of Lemma 13.5 above are completely analogous
to the steps in (13.1.11)–(13.1.13) and in (13.1.19)–(13.1.21) of the proof of
Lemma 13.3. Therefore, just as Lemma 13.3 was used to establish Proposition 13.4,
we can use Lemma 13.5 to place an upper bound on the number log2 𝑑 of approximate
ebits in a bipartite state 𝜔 𝐴𝐵 .
Proposition 13.6
Fix 𝜀 ∈ [0, 1], and let 𝐴 and 𝐵 be quantum systems with the same dimension
𝑑 ≥ 1. Fix a maximally entangled state Φ 𝐴𝐵 of Schmidt rank 𝑑. Let 𝜔 𝐴𝐵 be an
𝜀-approximate maximally entangled state, in the sense that
𝐹 (Φ 𝐴𝐵 , 𝜔 𝐴𝐵 ) = Tr[Φ 𝐴𝐵 𝜔 𝐴𝐵 ] ≥ 1 − 𝜀. (13.1.35)
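Proof: Let $\sigma_{AB}$ be an arbitrary operator in $\mathrm{PPT}'(A : B)$. The inequality $\operatorname{Tr}[\Phi_{AB} \omega_{AB}] \geq 1 - \varepsilon$
guarantees that $\omega_{AB}$ passes the entanglement test with probability at least
$1 - \varepsilon$. Thus, we conclude that $\Phi_{AB}$ is a particular measurement operator satisfying
the constraints for $2^{-D^{\varepsilon}_H(\omega_{AB} \| \sigma_{AB})}$. Applying Lemma 13.5 and the definition of
$D^{\varepsilon}_H$, we obtain
$$2^{-D^{\varepsilon}_H(\omega_{AB} \| \sigma_{AB})} \leq \operatorname{Tr}[\Phi_{AB} \sigma_{AB}] \leq \frac{1}{d}. \tag{13.1.37}$$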
Rearranging this leads to
Since this inequality holds for every operator 𝜎𝐴𝐵 ∈ PPT′ ( 𝐴 : 𝐵), we conclude that
$$E^{\varepsilon}_D(A; B)_\rho \leq R^{\varepsilon}_H(A; B)_\rho \tag{13.1.41}$$
The main step that allows us to conclude the bound in (13.1.40) in terms of
the state $\rho_{AB}$ alone is the fact that $R^{\varepsilon}_H$ is an entanglement measure, meaning that
it is monotone non-increasing under LOCC channels. In other words, the set of
PPT′ operators is preserved under LOCC channels. This fact is not true for the set
{1 𝐴 ⊗ 𝜎𝐵 : 𝜎𝐵 ∈ D(H)} appearing in the optimization that defines the 𝜀-hypothesis
Corollary 13.8
Let 𝜌 𝐴𝐵 be a bipartite state, and let 𝜀 ∈ [0, 1/2). For every (𝑑, 𝜀) entanglement
distillation protocol (𝑑, L 𝐴𝐵→ 𝐴ˆ 𝐵ˆ ), with 𝑑 𝐴ˆ = 𝑑 𝐵ˆ = 𝑑, we have that
$$\log_2 d \leq \frac{1}{1 - 2\varepsilon} \left( \sup_{\mathcal{L}} I(A' \rangle B')_{\mathcal{L}(\rho)} + h_2(\varepsilon) \right), \tag{13.1.44}$$
where
$$\widetilde{R}_\alpha(A; B)_\rho = \inf_{\sigma_{AB} \in \mathrm{PPT}'(A : B)} \widetilde{D}_\alpha(\rho_{AB} \| \sigma_{AB}) \tag{13.1.46}$$
Proof: Combining the upper bound in (13.1.26) from Theorem 13.4 with the
upper bound in (7.9.52) from Proposition 7.70, we obtain
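\begin{align}
\log_2 d &\leq \sup_{\mathcal{L}} I^{\varepsilon}_H(A' \rangle B')_{\mathcal{L}(\rho)} \tag{13.1.47}\\
&\leq \frac{1}{1 - \varepsilon} \left( \sup_{\mathcal{L}} I(A' \rangle B')_{\mathcal{L}(\rho)} + h_2(\varepsilon) \right) + \frac{\varepsilon}{1 - \varepsilon} \log_2 d. \tag{13.1.48}
\end{align}
Rearranging this and simplifying leads to
$$\log_2 d \leq \frac{1}{1 - 2\varepsilon} \left( \sup_{\mathcal{L}} I(A' \rangle B')_{\mathcal{L}(\rho)} + h_2(\varepsilon) \right), \tag{13.1.49}$$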
which is the inequality in (13.1.44). The inequality in (13.1.45) follows from
Theorem 13.7 and (7.9.59) in Proposition 7.71. ■
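For readers who want the rearrangement from (13.1.48) to (13.1.49) spelled out, the following worked steps are one way to do it. The shorthand $I^* := \sup_{\mathcal{L}} I(A' \rangle B')_{\mathcal{L}(\rho)}$ is introduced here only for brevity, and the last step uses the assumption $\varepsilon < 1/2$ from the statement of the corollary.

```latex
\begin{align}
\log_2 d
  &\leq \frac{1}{1-\varepsilon}\bigl(I^* + h_2(\varepsilon)\bigr)
        + \frac{\varepsilon}{1-\varepsilon}\log_2 d \\
\Longrightarrow\quad
\Bigl(1 - \frac{\varepsilon}{1-\varepsilon}\Bigr)\log_2 d
  &\leq \frac{1}{1-\varepsilon}\bigl(I^* + h_2(\varepsilon)\bigr) \\
\Longrightarrow\quad
\frac{1-2\varepsilon}{1-\varepsilon}\,\log_2 d
  &\leq \frac{1}{1-\varepsilon}\bigl(I^* + h_2(\varepsilon)\bigr) \\
\Longrightarrow\quad
\log_2 d
  &\leq \frac{1}{1-2\varepsilon}\bigl(I^* + h_2(\varepsilon)\bigr)
  \qquad \text{(since } 1-2\varepsilon > 0\text{)}.
\end{align}
```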
Since the upper bounds in (13.1.44) and (13.1.45) hold for all (𝑑, 𝜀) entangle-
ment distillation protocols, we conclude the following upper bounds on distillable
entanglement:
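\begin{align}
E^{\varepsilon}_D(A; B)_\rho &\leq \frac{1}{1 - 2\varepsilon} \left( \sup_{\mathcal{L}} I(A' \rangle B')_{\mathcal{L}(\rho)} + h_2(\varepsilon) \right), \tag{13.1.50}\\
E^{\varepsilon}_D(A; B)_\rho &\leq \widetilde{R}_\alpha(A; B)_\rho + \frac{\alpha}{\alpha - 1} \log_2\!\left(\frac{1}{1 - \varepsilon}\right) \quad \forall\, \alpha > 1. \tag{13.1.51}
\end{align}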
$$E_{\mathrm{sq}}(\hat{A}; \hat{B})_\omega \leq E_{\mathrm{sq}}(A; B)_\rho, \tag{13.1.54}$$
𝐹 (Φ 𝐴ˆ 𝐵ˆ , 𝜔 𝐴ˆ 𝐵ˆ ) ≥ 1 − 𝜀. (13.1.56)
The first equality follows from Proposition 9.36. We can finally rearrange the
established inequality $E_{\mathrm{sq}}(A; B)_\rho \geq (1 - \sqrt{\varepsilon}) \log_2 d - g_2(\sqrt{\varepsilon})$ to be in the form
stated in the theorem. ■
Figure 13.2: The task of entanglement distillation can be understood from the
perspective of decoupling: given a bipartite state $\rho_{AB}$ with purification $\psi_{ABE}$,
the entanglement distillation protocol given by the LOCC channel $\mathcal{L}$ should
result in the pure maximally entangled state $\Phi_{\hat{A}\hat{B}}$, which by definition is in
tensor product with the environment, so that the joint state is $\Phi_{\hat{A}\hat{B}} \otimes \rho_E$, with
$\rho_E = \operatorname{Tr}_{AB}[\psi_{ABE}]$.
for all 𝜀 ∈ (0, 1), where the optimization in the first line is with respect to states 𝜎𝐵 .
We also need the smooth conditional max-entropy of 𝜌 𝐴𝐵 , which is defined as
\[
H_{\max}^{\varepsilon}(A|B)_\rho := \inf_{\widetilde{\rho}\in\mathcal{B}^{\varepsilon}(\rho)} H_{\max}(A|B)_{\widetilde{\rho}} \tag{13.1.67}
\]
To obtain the last equality, we made use of (7.5.8), and the optimization therein is
with respect to states 𝜎𝐵 .
For every state 𝜌 𝐴𝐵 , the conditional min- and max-entropies are related as
follows:
\begin{align}
H_{\max}(A|B)_\rho &= -H_{\min}(A|E)_\psi, \tag{13.1.71}\\
H_{\max}^{\varepsilon}(A|B)_\rho &= -H_{\min}^{\varepsilon}(A|E)_\psi, \tag{13.1.72}
\end{align}
for all 𝜀 ∈ (0, 1), where 𝜓 𝐴𝐵𝐸 is a purification of 𝜌 𝐴𝐵 .
Both the conditional min-entropy and the smooth conditional min-entropy can
be formulated as semi-definite programs. The same is true for the conditional max-
entropy and the smooth conditional max-entropy. Please consult the Bibliographic
Notes in Section 13.5 for details.
Finally, we need the quantity
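\begin{align}
\widetilde{H}_2(A|B)_\rho &:= -\inf_{\sigma_B}\widetilde{D}_2(\rho_{AB}\Vert\mathbb{1}_A\otimes\sigma_B) \tag{13.1.73}\\
&= -\inf_{\sigma_B}\log_2\operatorname{Tr}\!\left[\left(\sigma_B^{-\frac{1}{4}}\rho_{AB}\,\sigma_B^{-\frac{1}{4}}\right)^{2}\right], \tag{13.1.74}
\end{align}
\[
H_{\min}(A|B)_\rho \leq \widetilde{H}_2(A|B)_\rho \tag{13.1.75}
\]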
where $\mathcal{X}$ is a finite alphabet, $\{\mathcal{E}^x_{A\to\hat{A}}\}_{x\in\mathcal{X}}$ is a set of completely positive maps such
that $\sum_{x\in\mathcal{X}}\mathcal{E}^x_{A\to\hat{A}}$ is trace preserving, and $\{\mathcal{D}^x_{B\to\hat{B}}\}_{x\in\mathcal{X}}$ is a set of channels. Recall
from Section 4.6.2 that every one-way Alice-to-Bob LOCC channel can be written
as
\[
\mathcal{L}^{\to}_{AB\to\hat{A}\hat{B}} = \mathcal{D}_{BX_B\to\hat{B}}\circ\mathcal{C}_{X_A\to X_B}\circ\mathcal{E}_{A\to\hat{A}X_A}, \tag{13.1.79}
\]
where $\mathcal{E}_{A\to\hat{A}X_A}$ is a local channel for Alice that corresponds to the quantum
instrument given by the maps $\{\mathcal{E}^x_{A\to\hat{A}}\}_{x\in\mathcal{X}}$, i.e., (see also (4.4.53))
\[
\mathcal{E}_{A\to\hat{A}X_A}(\rho_A) = \sum_{x\in\mathcal{X}}\mathcal{E}^x_{A\to\hat{A}}(\rho_A)\otimes|x\rangle\!\langle x|_{X_A}. \tag{13.1.80}
\]
The map $\mathcal{C}_{X_A\to X_B}$ is a noiseless classical channel that transforms the classical
register $X_A$, held by Alice, to the classical register $X_B$ (which is simply a copy of
$X_A$), held by Bob. The final channel $\mathcal{D}_{BX_B\to\hat{B}}$ is a local channel for Bob defined as
\[
\mathcal{D}_{BX_B\to\hat{B}}(\rho_B\otimes|x\rangle\!\langle x|_{X_B}) = \mathcal{D}^x_{B\to\hat{B}}(\rho_B) \tag{13.1.81}
\]
for all $x\in\mathcal{X}$. In the proof below, we explicitly construct the CP maps $\{\mathcal{E}^x_{A\to\hat{A}}\}_{x\in\mathcal{X}}$
and the channels $\{\mathcal{D}^x_{B\to\hat{B}}\}_{x\in\mathcal{X}}$.
In addition to providing explicit forms for the channels $\mathcal{E}_{A\to\hat{A}X_A}$ and $\mathcal{D}_{BX_B\to\hat{B}}$
involved in the LOCC channel $\mathcal{L}^{\to}_{AB\to\hat{A}\hat{B}}$ in (13.1.78), we prove that $p_{\mathrm{err}}(\mathcal{L};\rho_{AB})\leq\varepsilon$.
To do this, we make use of the following general decoupling result, which we
explain and prove in Appendix 13.A.
Theorem 13.11
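Given a subnormalized state $\rho_{AE}$ (i.e., $\operatorname{Tr}[\rho_{AE}]\leq 1$), and a completely positive
map $\mathcal{N}_{A\to A'}$, the following bound holds
\[
\int_{U_A}\left\Vert\mathcal{N}_{A\to A'}(U_A\rho_{AE}U_A^\dagger) - \Phi^{\mathcal{N}}_{A'}\otimes\rho_E\right\Vert_1 \mathrm{d}U_A \leq 2^{-\frac{1}{2}\widetilde{H}_2(A|E)_\rho - \frac{1}{2}\widetilde{H}_2(A|A')_{\Phi^{\mathcal{N}}}}, \tag{13.1.82}
\]
where $\Phi^{\mathcal{N}}_{A'} := \operatorname{Tr}_A[\Phi^{\mathcal{N}}_{AA'}]$, $\Phi^{\mathcal{N}}_{AA'}$ is given by $\mathcal{N}_{A\to A'}(\Phi_{AA})$, $\rho_E := \operatorname{Tr}_A[\rho_{AE}]$,
and the integral is over unitaries $U_A$ acting on system $A$, taken with respect to
the Haar measure.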
Remark: The integral in (13.1.82) with respect to the Haar measure should be thought of as
a uniform average over the continuous set of all unitaries 𝑈 𝐴 acting on the system 𝐴. In other
words, the integral is analogous to a uniform average over a discrete set of unitaries. In fact, for
every dimension 𝑑 ≥ 1, there exists a set {𝑈 𝑥 } 𝑥 ∈X of unitaries, called a unitary one-design, such
that
\[
\int_U UXU^\dagger\,\mathrm{d}U = \frac{1}{|\mathcal{X}|}\sum_{x\in\mathcal{X}}U_xXU_x^\dagger = \operatorname{Tr}[X]\,\frac{\mathbb{1}}{d} \tag{13.1.83}
\]
for every operator 𝑋. An example of a unitary one-design is the Heisenberg-Weyl operators
{𝑊𝑧,𝑥 : 0 ≤ 𝑧, 𝑥 ≤ 𝑑−1}, which are defined in (3.2.48)–(3.2.50). Please consult the Bibliographic
Notes in Section 13.5 for more information about integration over unitaries with respect to the
Haar measure and about unitary designs. A simple argument for the right-most equality in
(13.1.83) goes as follows. First, it follows for a unitary 𝑉 that
\[
V\left(\int_U UXU^\dagger\,\mathrm{d}U\right)V^\dagger = \int_U VUX(VU)^\dagger\,\mathrm{d}U = \int_U UXU^\dagger\,\mathrm{d}U, \tag{13.1.84}
\]
Tr[𝑋]/𝑑 follows by taking a trace of the left-hand side, using its cyclicity, and the fact that d𝑈 is
a probability measure.
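The one-design identity in (13.1.83) can be verified directly in the smallest case. The sketch below is an illustration under stated assumptions: it takes 𝑑 = 2, where the Heisenberg-Weyl operators coincide with the Pauli operators up to phases, and it uses a hypothetical test operator and hypothetical helper names.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]

def twirl(op):
    """Uniform average of U op U^dagger over the four Pauli operators."""
    n = len(op)
    acc = [[0j] * n for _ in range(n)]
    for u in PAULIS:
        term = matmul(matmul(u, op), dagger(u))
        for i in range(n):
            for j in range(n):
                acc[i][j] += term[i][j] / len(PAULIS)
    return acc

# Hypothetical test operator (not Hermitian, to stress "every operator X").
op = [[1 + 2j, 3], [4j, 5]]
result = twirl(op)
trace = op[0][0] + op[1][1]
```

The output `result` equals Tr[𝑋] 𝟙/2: the diagonal entries are `trace / 2` and the off-diagonal entries vanish, as (13.1.83) predicts.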
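projectors such that $\sum_{x\in\mathcal{X}}\Pi_A^x = \mathbb{1}_A$, and $\{V^x_{A\to\hat{A}}\}_{x\in\mathcal{X}}$ is a set of isometries. So we
have that
\[
\mathcal{E}^x_{A\to\hat{A}}(\cdot) := V^x_{A\to\hat{A}}\,\Pi_A^x(\cdot)\Pi_A^x\,V^{x\dagger}_{A\to\hat{A}} \tag{13.1.92}
\]
for all $x\in\mathcal{X}$. Each isometry $V^x_{A\to\hat{A}}$ takes the subspace of $\mathcal{H}_A$ onto which $\Pi_A^x$
projects and embeds it into the fixed $d$-dimensional space $\mathcal{H}_{\hat{A}}$, i.e., $\operatorname{im}(V^x_{A\to\hat{A}}) = \mathcal{H}_{\hat{A}}$
for all $x\in\mathcal{X}$. The projectors $\{\Pi_A^x\}_{x\in\mathcal{X}}$ correspond to a measurement of the input
state, with $\Pi_A^x(\cdot)\Pi_A^x$ the (unnormalized) post-measurement state, and the isometries
$\{V^x_{A\to\hat{A}}\}_{x\in\mathcal{X}}$ can be thought of as encodings of the initial system $A$ into the system $\hat{A}$
on which one share of the desired maximally entangled state $\Phi_{\hat{A}\hat{B}}$ is to be generated.
We have
\begin{align}
\mathcal{E}_{A\to\hat{A}X_A}(\rho_{AB}) &= \sum_{x\in\mathcal{X}}\mathcal{E}^x_{A\to\hat{A}}(\rho_{AB})\otimes|x\rangle\!\langle x|_{X_A} \tag{13.1.93}\\
&= \sum_{x\in\mathcal{X}}\mathcal{V}^x_{A\to\hat{A}}(\Pi_A^x\rho_{AB}\Pi_A^x)\otimes|x\rangle\!\langle x|_{X_A} \tag{13.1.94}\\
&= \sum_{x\in\mathcal{X}}p(x)\,\omega^x_{\hat{A}B}\otimes|x\rangle\!\langle x|_{X_A}, \tag{13.1.95}
\end{align}
where
\begin{align}
p(x) &:= \operatorname{Tr}[\Pi_A^x\rho_A], \tag{13.1.96}\\
\omega^x_{\hat{A}B} &:= \frac{1}{p(x)}\mathcal{V}^x_{A\to\hat{A}}(\Pi_A^x\rho_{AB}\Pi_A^x). \tag{13.1.97}
\end{align}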
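\[
{}\geq -\log_2 d. \tag{13.1.101}
\]
\begin{align}
\sigma_{\hat{A}X_A} &= \frac{1}{d_{X_A}}\sum_{x\in\mathcal{X}}\pi_{\hat{A}}\otimes|x\rangle\!\langle x|_{X_A} \tag{13.1.102}\\
&= \pi_{\hat{A}}\otimes\pi_{X_A} \tag{13.1.103}\\
&= \frac{1}{d_A}\mathbb{1}_{\hat{A}}\otimes\mathbb{1}_{X_A}. \tag{13.1.104}
\end{align}
\[
{}= d, \tag{13.1.113}
\]
where we recall that $d_{X_A} = |\mathcal{X}| = \frac{d_A}{d}$. We thus have
\[
2^{-\frac{1}{2}\widetilde{H}_2(A|\hat{A}X_A)_{\Phi^{\mathcal{E}}}} \leq \sqrt{d}, \tag{13.1.114}
\]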
Note that
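\begin{align}
\Phi^{\mathcal{E}}_{\hat{A}X_A} &= \frac{1}{d_A}\sum_{x\in\mathcal{X}}V^x_{A\to\hat{A}}\Pi_A^x\mathbb{1}_A\Pi_A^xV^{x\dagger}_{A\to\hat{A}}\otimes|x\rangle\!\langle x|_{X_A} \tag{13.1.116}\\
&= \frac{1}{d_A}\sum_{x\in\mathcal{X}}V^x_{A\to\hat{A}}\Pi_A^xV^{x\dagger}_{A\to\hat{A}}\otimes|x\rangle\!\langle x|_{X_A} \tag{13.1.117}\\
&= \frac{1}{d_A}\mathbb{1}_{\hat{A}}\otimes\mathbb{1}_{X_A} \tag{13.1.118}\\
&= \pi_{\hat{A}}\otimes\pi_{X_A}, \tag{13.1.119}
\end{align}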
Now, since the average over a set of elements is never less than the minimum over
the same set, we have that
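\begin{align}
&\int_{U_A}\left\Vert\mathcal{E}_{A\to\hat{A}X_A}(U_A\widetilde{\rho}_{AE}U_A^\dagger) - \pi_{\hat{A}}\otimes\pi_{X_A}\otimes\widetilde{\rho}_E\right\Vert_1\mathrm{d}U_A \notag\\
&\qquad\geq \min_{U_A}\left\Vert\mathcal{E}_{A\to\hat{A}X_A}(U_A\widetilde{\rho}_{AE}U_A^\dagger) - \pi_{\hat{A}}\otimes\pi_{X_A}\otimes\widetilde{\rho}_E\right\Vert_1 \tag{13.1.121}
\end{align}
This implies that there exists a unitary $U_A$ (in particular, one that achieves the
minimum on the right-hand side of the above inequality) such that
\begin{align}
\eta^2 &\geq \int_{U_A}\left\Vert\mathcal{E}_{A\to\hat{A}X_A}(U_A\widetilde{\rho}_{AE}U_A^\dagger) - \pi_{\hat{A}}\otimes\pi_{X_A}\otimes\widetilde{\rho}_E\right\Vert_1\mathrm{d}U_A \notag\\
&\geq \left\Vert\mathcal{E}_{A\to\hat{A}X_A}(U_A\widetilde{\rho}_{AE}U_A^\dagger) - \pi_{\hat{A}}\otimes\pi_{X_A}\otimes\widetilde{\rho}_E\right\Vert_1 \tag{13.1.122}
\end{align}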
Now, let
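\begin{align}
\widetilde{\omega}_{\hat{A}X_AE} &= \mathcal{E}_{A\to\hat{A}X_A}(U_A\widetilde{\rho}_{AE}U_A^\dagger), & \widetilde{\tau}_{\hat{A}X_AE} &= \pi_{\hat{A}}\otimes\pi_{X_A}\otimes\widetilde{\rho}_E, \tag{13.1.123}\\
\omega_{\hat{A}X_AE} &= \mathcal{E}_{A\to\hat{A}X_A}(U_A\rho_{AE}U_A^\dagger), & \tau_{\hat{A}X_AE} &= \pi_{\hat{A}}\otimes\pi_{X_A}\otimes\rho_E. \tag{13.1.124}
\end{align}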
Then, by the Fuchs–van de Graaf inequality (see (6.2.88)), and by the definition of
the sine distance (see Definition 6.16), we have that
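\begin{align}
\eta^2 &\geq \left\Vert\widetilde{\omega}_{\hat{A}X_AE} - \widetilde{\tau}_{\hat{A}X_AE}\right\Vert_1 \tag{13.1.125}\\
&\geq 2 - 2\sqrt{F(\widetilde{\omega}_{\hat{A}X_AE},\widetilde{\tau}_{\hat{A}X_AE})} \tag{13.1.126}\\
&\geq 2 - 2\sqrt{1 - P(\widetilde{\omega}_{\hat{A}X_AE},\widetilde{\tau}_{\hat{A}X_AE})^2}, \tag{13.1.127}
\end{align}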
Then, by the triangle inequality for sine distance (Lemma 6.17), we have
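\begin{align}
P(\omega_{\hat{A}X_AE},\widetilde{\tau}_{\hat{A}X_AE}) &\leq P(\omega_{\hat{A}X_AE},\widetilde{\omega}_{\hat{A}X_AE}) + P(\widetilde{\omega}_{\hat{A}X_AE},\widetilde{\tau}_{\hat{A}X_AE}) \tag{13.1.131}\\
&\leq P(\rho_{AE},\widetilde{\rho}_{AE}) + \eta \tag{13.1.132}\\
&\leq \sqrt{\varepsilon} - \eta + \eta \tag{13.1.133}\\
&= \sqrt{\varepsilon}, \tag{13.1.134}
\end{align}
where the second inequality follows from the data-processing inequality for the sine
distance, unitary invariance of the sine distance, and the inequality in (13.1.130).
To obtain the last inequality, we used the definition of the state $\widetilde{\rho}_{AE}$ as one that is
$(\sqrt{\varepsilon}-\eta)$-close to $\rho_{AE}$ in sine distance. We can write the inequality in (13.1.134)
in terms of fidelity as
\[
F(\omega_{\hat{A}X_AE},\widetilde{\tau}_{\hat{A}X_AE}) \geq 1-\varepsilon. \tag{13.1.135}
\]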
Now, let
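\[
\psi^x_{\hat{A}BE} := \frac{1}{p(x)}\mathcal{V}^x_{A\to\hat{A}}(\Pi_A^xU_A\psi_{ABE}U_A^\dagger\Pi_A^x) \tag{13.1.142}
\]
be a purification of $\omega^x_{\hat{A}E}$ for all $x\in\mathcal{X}$, and let $\Phi_{\hat{A}\hat{B}}\otimes\widetilde{\phi}_{EB'}$ be a purification of
$\pi_{\hat{A}}\otimes\widetilde{\rho}_E$. Then, by Uhlmann’s theorem (Theorem 6.8), for every $x\in\mathcal{X}$ there exists
an isometric channel $\mathcal{W}^x_{B\to\hat{B}B'}$ such that
\[
\sqrt{F}(\omega^x_{\hat{A}E},\pi_{\hat{A}}\otimes\widetilde{\rho}_E) = \sqrt{F}(\mathcal{W}^x_{B\to\hat{B}B'}(\psi^x_{\hat{A}BE}),\Phi_{\hat{A}\hat{B}}\otimes\widetilde{\phi}_{EB'}) \tag{13.1.143}
\]
for all $x\in\mathcal{X}$. Using the set $\{\mathcal{W}^x_{B\to\hat{B}B'}\}_{x\in\mathcal{X}}$, we define the quantum channels
$\{\mathcal{D}^x_{B\to\hat{B}}\}_{x\in\mathcal{X}}$ as follows:
\[
\mathcal{D}^x_{B\to\hat{B}} := \operatorname{Tr}_{B'}\circ\,\mathcal{W}^x_{B\to\hat{B}B'}. \tag{13.1.144}
\]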
By the data-processing inequality for fidelity (see Theorem 6.9) under the partial
trace channel Tr𝐸 𝐵′ , we obtain
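\begin{align}
\sqrt{F}(\omega^x_{\hat{A}E},\pi_{\hat{A}}\otimes\widetilde{\rho}_E)
&= \sqrt{F}(\mathcal{W}^x_{B\to\hat{B}B'}(\psi^x_{\hat{A}BE}),\Phi_{\hat{A}\hat{B}}\otimes\widetilde{\phi}_{EB'}) \tag{13.1.145}\\
&\leq \sqrt{F}(\operatorname{Tr}_{EB'}[\mathcal{W}^x_{B\to\hat{B}B'}(\psi^x_{\hat{A}BE})],\operatorname{Tr}_{EB'}[\Phi_{\hat{A}\hat{B}}\otimes\widetilde{\phi}_{EB'}]) \tag{13.1.146}\\
&= \sqrt{F}(\mathcal{D}^x_{B\to\hat{B}}(\omega^x_{\hat{A}B}),\Phi_{\hat{A}\hat{B}}), \tag{13.1.147}
\end{align}
for all $x\in\mathcal{X}$, where we recall the definition of $\omega^x_{\hat{A}B}$ from (13.1.97). Since the
inequality in (13.1.147) holds for all $x\in\mathcal{X}$, we have that
\begin{align}
&\left(\sum_{x\in\mathcal{X}}\sqrt{\frac{p(x)}{d_{X_A}}}\sqrt{F}(\omega^x_{\hat{A}E},\pi_{\hat{A}}\otimes\widetilde{\rho}_E)\right)^{2} \notag\\
&\qquad\leq \left(\sum_{x\in\mathcal{X}}\sqrt{\frac{p(x)}{d_{X_A}}}\sqrt{F}(\mathcal{D}^x_{B\to\hat{B}}(\omega^x_{\hat{A}B}),\Phi_{\hat{A}\hat{B}})\right)^{2}. \tag{13.1.148}
\end{align}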
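Now, the final state of Alice and Bob after executing the LOCC channel defined
by (13.1.79), with $\mathcal{E}_{A\to\hat{A}X_A}$ defined by (13.1.91) and $\mathcal{D}_{BX_B\to\hat{B}}$ defined by (13.1.81)
and (13.1.144), is
\begin{align}
\omega_{\hat{A}\hat{B}} &= (\mathcal{D}_{X_BB\to\hat{B}}\circ\mathcal{C}_{X_A\to X_B}\circ\mathcal{E}_{A\to\hat{A}X_A})(\rho_{AB}) \tag{13.1.149}\\
&= \sum_{x\in\mathcal{X}}\mathcal{D}_{X_BB\to\hat{B}}(\mathcal{E}^x_{A\to\hat{A}}(\rho_{AB})\otimes|x\rangle\!\langle x|_{X_B}) \tag{13.1.150}\\
&= \sum_{x\in\mathcal{X}}(\mathcal{E}^x_{A\to\hat{A}}\otimes\mathcal{D}^x_{B\to\hat{B}})(\rho_{AB}) \tag{13.1.151}\\
&= \operatorname{Tr}_{X_B}\!\left[\sum_{x\in\mathcal{X}}p(x)|x\rangle\!\langle x|_{X_B}\otimes\mathcal{D}^x_{B\to\hat{B}}(\omega^x_{\hat{A}B})\right], \tag{13.1.152}
\end{align}
where in the third equality we recognize the required form in (13.1.78) for a one-way
Alice-to-Bob LOCC channel, and in the last equality we made use of (13.1.97).
Using the form of $\omega_{\hat{A}\hat{B}}$ in the last equality, along with all of the developments above,
we finally obtain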
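\begin{align}
F(\omega_{\hat{A}\hat{B}},\Phi_{\hat{A}\hat{B}})
&= F\!\left(\operatorname{Tr}_{X_B}\!\left[\sum_{x\in\mathcal{X}}p(x)|x\rangle\!\langle x|_{X_B}\otimes\mathcal{D}^x_{B\to\hat{B}}(\omega^x_{\hat{A}B})\right],\operatorname{Tr}_{X_B}[\pi_{X_B}\otimes\Phi_{\hat{A}\hat{B}}]\right) \tag{13.1.153}\\
&\geq F\!\left(\sum_{x\in\mathcal{X}}p(x)|x\rangle\!\langle x|_{X_B}\otimes\mathcal{D}^x_{B\to\hat{B}}(\omega^x_{\hat{A}B}),\frac{1}{d_{X_A}}\sum_{x\in\mathcal{X}}|x\rangle\!\langle x|_{X_A}\otimes\Phi_{\hat{A}\hat{B}}\right) \tag{13.1.154}\\
&= \left(\sum_{x\in\mathcal{X}}\sqrt{\frac{p(x)}{d_{X_A}}}\sqrt{F}(\mathcal{D}^x_{B\to\hat{B}}(\omega^x_{\hat{A}B}),\Phi_{\hat{A}\hat{B}})\right)^{2} \tag{13.1.155}\\
&= \left(\sum_{x\in\mathcal{X}}\sqrt{\frac{p(x)}{d_{X_A}}}\sqrt{F}((\operatorname{Tr}_{EB'}\circ\,\mathcal{W}^x_{B\to\hat{B}B'})(\psi^x_{\hat{A}BE}),\operatorname{Tr}_{EB'}[\Phi_{\hat{A}\hat{B}}\otimes\widetilde{\phi}_{EB'}])\right)^{2} \tag{13.1.156}\\
&\geq \left(\sum_{x\in\mathcal{X}}\sqrt{\frac{p(x)}{d_{X_A}}}\sqrt{F}(\mathcal{W}^x_{B\to\hat{B}B'}(\psi^x_{\hat{A}BE}),\Phi_{\hat{A}\hat{B}}\otimes\widetilde{\phi}_{EB'})\right)^{2} \tag{13.1.157}\\
&= \left(\sum_{x\in\mathcal{X}}\sqrt{\frac{p(x)}{d_{X_A}}}\sqrt{F}(\omega^x_{\hat{A}E},\pi_{\hat{A}}\otimes\widetilde{\rho}_E)\right)^{2} \tag{13.1.158}\\
&\geq 1-\varepsilon. \tag{13.1.159}
\end{align}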
Therefore,
𝑝 err (L; 𝜌 𝐴𝐵 ) = 1 − 𝐹 (𝜔 𝐴ˆ 𝐵ˆ , Φ 𝐴ˆ 𝐵ˆ ) ≤ 𝜀. (13.1.160)
To summarize, we have shown that, given a state $\rho_{AB}$ and $\varepsilon\in(0,1)$, there
exists a $(d,\varepsilon)$ one-way entanglement distillation protocol $\mathcal{L}_{AB\to\hat{A}\hat{B}}$ if the dimension
$d = d_{\hat{A}} = d_{\hat{B}}$ satisfies $\log_2 d = -H_{\max}^{\sqrt{\varepsilon}-\eta}(A|B)_\rho + 4\log_2\eta$, where $\eta\in[0,\sqrt{\varepsilon})$.
Although we explicitly constructed the encoding channels $\{\mathcal{E}^x_{A\to\hat{A}}\}_{x\in\mathcal{X}}$ on Alice’s
side, on Bob’s side we relied on Uhlmann’s theorem to guarantee the existence of a
set of decoding channels $\{\mathcal{D}^x_{B\to\hat{B}}\}_{x\in\mathcal{X}}$ such that the overall LOCC channel $\mathcal{L}_{AB\to\hat{A}\hat{B}}$
satisfies $p_{\mathrm{err}}(\mathcal{L};\rho_{AB})\leq\varepsilon$. ■
Combining Theorem 13.10 with (7.8.83), and using (13.1.72), leads to the
following lower bound on the one-shot distillable entanglement:
Corollary 13.12
Let $\rho_{AB}$ be a bipartite quantum state with purification $\psi_{ABE}$. For all $\varepsilon\in(0,1)$,
$\eta\in[0,\sqrt{\varepsilon})$, and $\alpha>1$, there exists a $(d,\varepsilon)$ one-way entanglement distillation
protocol for $\rho_{AB}$ satisfying
\[
\log_2 d \geq \widetilde{H}_\alpha(A|E)_\psi - \frac{1}{\alpha-1}\log_2\!\left(\frac{1}{(\sqrt{\varepsilon}-\eta)^2}\right) - \log_2\!\left(\frac{1}{1-(\sqrt{\varepsilon}-\eta)^2}\right) + 4\log_2\eta. \tag{13.1.161}
\]
Proof: The inequality follows from taking the results of Theorem 13.10, using
(13.1.72), and applying the inequality in (7.8.83). ■
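To get a feel for the size of the correction terms in (13.1.161), the right-hand side can be evaluated directly. The following Python sketch assumes that a value of the sandwiched Rényi conditional entropy is already available (here a made-up number); the function name and the sample parameters are hypothetical. The code requires 𝜂 > 0, since the term 4 log2 𝜂 diverges at 𝜂 = 0.

```python
import math

def one_shot_lower_bound(h_alpha, alpha, eps, eta):
    """Right-hand side of (13.1.161), in bits.

    h_alpha is an assumed, precomputed value of the sandwiched Renyi
    conditional entropy of order alpha.
    """
    if not (0 < eps < 1 and alpha > 1 and 0 < eta < math.sqrt(eps)):
        raise ValueError("need 0 < eps < 1, alpha > 1, 0 < eta < sqrt(eps)")
    s = (math.sqrt(eps) - eta) ** 2
    return (h_alpha
            - math.log2(1 / s) / (alpha - 1)
            - math.log2(1 / (1 - s))
            + 4 * math.log2(eta))

# Hypothetical inputs: 50 bits of conditional entropy, alpha = 2, eps = 0.01.
h_alpha, alpha, eps = 50.0, 2.0, 0.01
value = one_shot_lower_bound(h_alpha, alpha, eps, math.sqrt(eps) / 2)

# With eta = sqrt(eps)/2 the bound collapses to a closed form in 4/eps.
closed_form = (h_alpha
               - (2 * alpha - 1) / (alpha - 1) * math.log2(4 / eps)
               - math.log2(1 / (1 - eps / 4)))
```

With the choice 𝜂 = √𝜀/2, one has (√𝜀 − 𝜂)² = 𝜀/4 and 4 log2 𝜂 = −2 log2(4/𝜀), so the three correction terms combine into the closed form computed in `closed_form`.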
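Since the inequality in (13.1.161) holds for all $(d,\varepsilon)$ entanglement distillation
protocols, we obtain the following bound for all $\varepsilon\in(0,1)$, $\eta\in[0,\sqrt{\varepsilon})$, and $\alpha>1$:
\[
E_D^{\varepsilon}(A;B)_\rho \geq \sup_{\mathcal{L}}\widetilde{H}_\alpha(A'|E')_\phi - \frac{1}{\alpha-1}\log_2\!\left(\frac{1}{(\sqrt{\varepsilon}-\eta)^2}\right) - \log_2\!\left(\frac{1}{1-(\sqrt{\varepsilon}-\eta)^2}\right) + 4\log_2\eta, \tag{13.1.162}
\]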
where the optimization is with respect to every LOCC channel L 𝐴𝐵→𝐴′ 𝐵′ , such that
𝜙 𝐴′ 𝐵′ 𝐸 ′ is a purification of L 𝐴𝐵→𝐴′ 𝐵′ (𝜌 𝐴𝐵 ). This comes about by first applying the
LOCC channel L 𝐴𝐵→𝐴′ 𝐵′ to 𝜌 𝐴𝐵 for free, applying Corollary 13.12 to the state
L 𝐴𝐵→𝐴′ 𝐵′ (𝜌 𝐴𝐵 ), and finally optimizing over every LOCC channel L 𝐴𝐵→𝐴′ 𝐵′ .
As we prove in Appendix A,
\[
R \text{ achievable rate} \iff \lim_{n\to\infty}\varepsilon_D(2^{n(R-\delta)};\rho_{AB}^{\otimes n}) = 0 \quad \forall\,\delta>0. \tag{13.2.5}
\]
In other words, a rate 𝑅 is achievable if the optimal error probability for a sequence
of protocols with rate 𝑅 − 𝛿, 𝛿 > 0, vanishes as the number 𝑛 of copies of 𝜌 𝐴𝐵
increases.
As we show in Appendix A,
Note that
\[
E_D(A;B)_\rho \leq \widetilde{E}_D(A;B)_\rho \tag{13.2.11}
\]
for all bipartite states 𝜌 𝐴𝐵 . We can also write the strong converse distillable
entanglement as
and the squashed entanglement from (9.4.1) is a weak converse rate, in the
sense that
𝐸 𝐷 ( 𝐴; 𝐵) 𝜌 ≤ 𝐸 sq ( 𝐴; 𝐵) 𝜌 . (13.2.15)
If we define
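\[
D^{\leftrightarrow}(\rho_{AB}) \equiv D^{\leftrightarrow}(A;B)_\rho := \sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho)}, \tag{13.2.16}
\]
then we can write (13.2.13) as
\[
E_D(A;B)_\rho = \lim_{n\to\infty}\frac{1}{n}D^{\leftrightarrow}(\rho_{AB}^{\otimes n}) =: D^{\leftrightarrow}_{\mathrm{reg}}(\rho_{AB}), \tag{13.2.17}
\]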
[Figure 13.3: two-panel diagram with regions labeled SEP = PPT and Entangled in panel (a), and SEP, PPT (Bound Entangled), and Entangled in panel (b); the legend distinguishes Distillable from Non-Distillable states.]
Figure 13.3: The set of all bipartite states can be split into distillable and
non-distillable sets. (a) For qubit-qubit and qubit-qutrit states entanglement
and distillability are in one-to-one correspondence because all PPT states are
separable. (b) In higher dimensions, there are entangled states belonging to the
set PPT, which we call bound entangled states. Distillability and entanglement
are thus not synonymous in general for high-dimensional quantum systems.
Recall from the discussion after Lemma 13.5 (see also Section 3.2.9) that there exist
PPT entangled states in higher dimensional bipartite systems because SEP( 𝐴 : 𝐵) ≠
PPT( 𝐴 : 𝐵) except for when 𝐴 and 𝐵 are both qubits or when one is a qubit and the
other is a qutrit. All of these entangled states have zero distillable entanglement, and
thus we refer to them as bound entangled. Remarkably, therefore, except for qubit-
qubit and qubit-qutrit states, prior entanglement is only necessary, but not sufficient,
for distilling pure maximally entangled states. Please consult the Bibliographic
Notes in Section 13.5 for more information about bound entanglement.
As shown in Figure 13.3, we can use entanglement distillation to split up the
set of all bipartite states into distillable and non-distillable states. For two-qubit
states and qubit-qutrit states, non-distillable states are exactly equal to the set of
separable states by the PPT criterion. For higher dimensions, as stated above, this
is not the case. Also in higher dimensions, there could in principle be states
with negative partial transpose (NPT) that are nonetheless non-distillable. Such
NPT bound entangled states would occupy the region in Figure 13.3(b) between the
PPT bound entangled states and the distillable entangled states. It is not known
whether NPT bound entangled states exist, but since they have not been ruled out,
we nevertheless depict them in the figure.
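A negative partial transpose can be certified with a single vector, which gives a quick numerical illustration of the qubit-qubit case in Figure 13.3(a). The sketch below is an example under stated assumptions rather than part of the development above: it uses the two-qubit family 𝑝 Φ + (1 − 𝑝) 𝟙/4, chosen here purely for illustration, and evaluates the expectation value of the partial transpose in the singlet vector. All function names are hypothetical.

```python
import math

def isotropic_state(p):
    """p * Phi + (1 - p) * I/4 on two qubits, basis |00>, |01>, |10>, |11>."""
    rho = [[(1 - p) / 4 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for i in (0, 3):
        for j in (0, 3):
            rho[i][j] += p / 2  # Phi = (|00> + |11>)(<00| + <11|) / 2
    return rho

def partial_transpose_b(rho):
    """Transpose on the second qubit: <a b|rho|a' b'> -> <a b'|rho|a' b>."""
    out = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for a2 in range(2):
                for b2 in range(2):
                    out[2 * a + b2][2 * a2 + b] = rho[2 * a + b][2 * a2 + b2]
    return out

def expectation(vec, op):
    """<vec|op|vec> for a real vector and a real matrix."""
    return sum(vec[i] * op[i][j] * vec[j] for i in range(4) for j in range(4))

# Singlet vector (|01> - |10>) / sqrt(2).
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]

def singlet_witness(p):
    """Expectation of the partially transposed state in the singlet vector."""
    return expectation(SINGLET, partial_transpose_b(isotropic_state(p)))
```

For this family the witness evaluates to (1 − 3𝑝)/4. A negative value shows that the partial transpose is not positive semi-definite, so the state is NPT and hence, for two qubits, distillable. A non-negative value for this one vector does not by itself establish that the state is PPT.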
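As the first step in proving the achievability part of Theorem 13.19, let us recall
Corollary 13.12: given a bipartite state $\rho_{AB}$ with purification $\psi_{ABE}$, for all $\varepsilon\in(0,1)$,
$\eta\in[0,\sqrt{\varepsilon})$, and $\alpha>1$, there exists a $(d,\varepsilon)$ entanglement distillation protocol for
$\rho_{AB}$ such that
\[
\log_2 d \geq \widetilde{H}_\alpha(A|E)_\psi - \frac{1}{\alpha-1}\log_2\!\left(\frac{1}{(\sqrt{\varepsilon}-\eta)^2}\right) - \log_2\!\left(\frac{1}{1-(\sqrt{\varepsilon}-\eta)^2}\right) + 4\log_2\eta, \tag{13.2.21}
\]
where
\[
\widetilde{H}_\alpha(A|E)_\psi = -\inf_{\sigma_E}\widetilde{D}_\alpha(\psi_{AE}\Vert\mathbb{1}_A\otimes\sigma_E) \tag{13.2.22}
\]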
is the sandwiched Rényi conditional entropy. Applying this inequality to the state
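$\rho_{AB}^{\otimes n}$ for all $n\geq 1$ leads to the following:
Proposition 13.20
For every state $\rho_{AB}$ and $\varepsilon\in(0,1)$, there exists an $(n,d,\varepsilon)$ entanglement
distillation protocol for $\rho_{AB}$ such that the rate $\frac{\log_2 d}{n}$ satisfies
\[
\frac{\log_2 d}{n} \geq \widetilde{H}_\alpha(A|E)_\psi - \frac{2\alpha-1}{n(\alpha-1)}\log_2\!\left(\frac{4}{\varepsilon}\right) - \frac{1}{n}\log_2\!\left(\frac{1}{1-\frac{\varepsilon}{4}}\right), \tag{13.2.23}
\]
for all $n\geq 1$ and $\alpha>1$, where the optimization is over every LOCC channel
$\mathcal{L}_{A^nB^n\to A'B'}$, such that $\phi_{A'B'E'}$ is a purification of $\mathcal{L}_{A^nB^n\to A'B'}(\rho_{AB}^{\otimes n})$.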
Then, optimizing over every LOCC channel $\mathcal{L}_{A^nB^n\to A'B'}$, and using the definition
of $E_D^{n,\varepsilon}(A;B)_\rho$ in (13.2.4), we obtain (13.2.24).
By employing additivity of the sandwiched Rényi conditional entropy for all
$\alpha>1$, we have that
\[
\widetilde{H}_\alpha(A^n|E^n)_{\psi^{\otimes n}} = n\widetilde{H}_\alpha(A|E)_\psi. \tag{13.2.26}
\]
Note that the proof of additivity follows similarly to the proof of Proposition 11.21.
This leads to (13.2.23). ■
With the inequality in (13.2.23), we can prove that the coherent information is
an achievable rate for entanglement distillation.
Proof: Let 𝜓 𝐴𝐵𝐸 be a purification of 𝜌 𝐴𝐵 . Fix 𝜀 ∈ (0, 1] and 𝛿 > 0. Let 𝛿1 , 𝛿2 > 0
be such that
𝛿 = 𝛿1 + 𝛿2 . (13.2.27)
Set 𝛼 ∈ (1, ∞) such that
\[
\delta_1 \geq I(A\rangle B)_\rho - \widetilde{H}_\alpha(A|E)_\psi. \tag{13.2.28}
\]
Also,
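\begin{align}
\lim_{\alpha\to 1^+}\widetilde{H}_\alpha(A|E)_\psi &= \sup_{\alpha\in(1,\infty)}\widetilde{H}_\alpha(A|E)_\psi \tag{13.2.30}\\
&= \sup_{\alpha\in(1,\infty)}\left(-\inf_{\sigma_E}\widetilde{D}_\alpha(\psi_{AE}\Vert\mathbb{1}_A\otimes\sigma_E)\right) \tag{13.2.31}\\
&= -\inf_{\alpha\in(1,\infty)}\inf_{\sigma_E}\widetilde{D}_\alpha(\psi_{AE}\Vert\mathbb{1}_A\otimes\sigma_E) \tag{13.2.32}\\
&= -\inf_{\sigma_E}\inf_{\alpha\in(1,\infty)}\widetilde{D}_\alpha(\psi_{AE}\Vert\mathbb{1}_A\otimes\sigma_E) \tag{13.2.33}\\
&= -\inf_{\sigma_E}D(\psi_{AE}\Vert\mathbb{1}_A\otimes\sigma_E) \tag{13.2.34}\\
&= H(A|E)_\psi, \tag{13.2.35}
\end{align}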
where the fifth equality follows from Proposition 7.30. Then, by definition of
conditional entropy, and the fact that 𝜓 𝐴𝐵𝐸 is a pure state, we find that
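With $\alpha\in(1,\infty)$ chosen such that (13.2.28) holds, take $n$ large enough so that
\[
\delta_2 \geq \frac{2\alpha-1}{n(\alpha-1)}\log_2\!\left(\frac{4}{\varepsilon}\right) + \frac{1}{n}\log_2\!\left(\frac{1}{1-\frac{\varepsilon}{4}}\right). \tag{13.2.37}
\]
Now, we use the fact that for the $n$ and $\varepsilon$ chosen above there exists an $(n,d,\varepsilon)$
protocol such that
\[
\frac{\log_2 d}{n} \geq \widetilde{H}_\alpha(A|E)_\psi - \frac{2\alpha-1}{n(\alpha-1)}\log_2\!\left(\frac{4}{\varepsilon}\right) - \frac{1}{n}\log_2\!\left(\frac{1}{1-\frac{\varepsilon}{4}}\right). \tag{13.2.38}
\]
(This follows from Proposition 13.20 above.) Rearranging the right-hand side of
this inequality, and using (13.2.27), (13.2.28), and (13.2.37), we find that
\begin{align}
\frac{\log_2 d}{n} &\geq I(A\rangle B)_\rho - \left(I(A\rangle B)_\rho - \widetilde{H}_\alpha(A|E)_\psi + \frac{2\alpha-1}{n(\alpha-1)}\log_2\!\left(\frac{4}{\varepsilon}\right) + \frac{1}{n}\log_2\!\left(\frac{1}{1-\frac{\varepsilon}{4}}\right)\right) \tag{13.2.39}\\
&\geq I(A\rangle B)_\rho - (\delta_1+\delta_2) \tag{13.2.40}\\
&= I(A\rangle B)_\rho - \delta. \tag{13.2.41}
\end{align}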
We thus have that there exists an $(n,d,\varepsilon)$ entanglement distillation protocol with
rate $\frac{\log_2 d}{n}\geq I(A\rangle B)_\rho-\delta$. Therefore, there exists an $(n,2^{n(R-\delta)},\varepsilon)$ entanglement
distillation protocol with $R = I(A\rangle B)_\rho$ for all sufficiently large $n$ such that (13.2.37)
holds. Since $\varepsilon$ and $\delta$ are arbitrary, we conclude that for all $\varepsilon\in(0,1]$, $\delta>0$, and
sufficiently large $n$, there exists an $(n,2^{n(I(A\rangle B)_\rho-\delta)},\varepsilon)$ entanglement distillation
protocol. This means that, by definition, $I(A\rangle B)_\rho$ is an achievable rate. ■
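The phrase "take 𝑛 large enough" in the proof can be made concrete, because both correction terms in (13.2.37) scale as 1/𝑛. The Python sketch below computes the smallest admissible 𝑛 for given 𝛿2, 𝛼, and 𝜀; the function names and the sample values are hypothetical.

```python
import math

def penalty(n, alpha, eps):
    """Right-hand side of (13.2.37) for n copies, in bits per copy."""
    return ((2 * alpha - 1) / (n * (alpha - 1)) * math.log2(4 / eps)
            + math.log2(1 / (1 - eps / 4)) / n)

def min_copies(delta2, alpha, eps):
    """Smallest n >= 1 for which (13.2.37) holds.

    The penalty equals penalty(1, alpha, eps) / n, so the condition
    penalty(n) <= delta2 is equivalent to n >= penalty(1) / delta2.
    """
    if not (delta2 > 0 and alpha > 1 and 0 < eps < 1):
        raise ValueError("need delta2 > 0, alpha > 1, 0 < eps < 1")
    return max(1, math.ceil(penalty(1, alpha, eps) / delta2))

# Hypothetical parameters.
delta2, alpha, eps = 0.05, 1.5, 0.01
n_min = min_copies(delta2, alpha, eps)
```

Note the trade-off visible in `penalty`: taking 𝛼 closer to 1 makes 𝛿1 in (13.2.28) smaller but inflates the factor (2𝛼 − 1)/(𝛼 − 1), so more copies are needed.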
is an achievable rate.
In order to prove the weak converse part of Theorem 13.19, we make use of
Corollary 13.8, specifically (13.1.44): given a bipartite state 𝜌 𝐴𝐵 , for every (𝑑, 𝜀)
entanglement distillation protocol for 𝜌 𝐴𝐵 , with 𝜀 ∈ [0, 1/2), it holds that
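\[
\log_2 d \leq \frac{1}{1-2\varepsilon}\left(\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho)} + h_2(\varepsilon)\right), \tag{13.2.53}
\]
where the optimization is with respect to every LOCC channel $\mathcal{L}_{AB\to A'B'}$. Applying
this inequality to the state $\rho_{AB}^{\otimes n}$ immediately leads to the following.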
Proposition 13.22
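Let $\rho_{AB}$ be a bipartite state, and let $n\geq 1$ and $\varepsilon\in[0,1/2)$. For an $(n,d,\varepsilon)$
entanglement distillation protocol for $\rho_{AB}$ with corresponding LOCC channel
$\mathcal{L}_{A^nB^n\to A'B'}$, the rate $\frac{\log_2 d}{n}$ satisfies
\[
\frac{\log_2 d}{n} \leq \frac{1}{1-2\varepsilon}\left(\frac{1}{n}\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho^{\otimes n})} + \frac{1}{n}h_2(\varepsilon)\right). \tag{13.2.54}
\]
Consequently,
\[
E_D^{n,\varepsilon}(A;B)_\rho \leq \frac{1}{1-2\varepsilon}\left(\frac{1}{n}\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho^{\otimes n})} + \frac{1}{n}h_2(\varepsilon)\right), \tag{13.2.55}
\]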
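Suppose that $R$ is an achievable rate for entanglement distillation for the bipartite
state $\rho_{AB}$. Then, by definition, for all $\varepsilon\in(0,1]$, $\delta>0$, and sufficiently large $n$,
there exists an $(n,2^{n(R-\delta)},\varepsilon)$ entanglement distillation protocol for $\rho_{AB}$. For all
such protocols for which $\varepsilon\in(0,1/2)$, the inequality in (13.2.54) holds, so that
\[
R-\delta \leq \frac{1}{1-2\varepsilon}\left(\frac{1}{n}\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho^{\otimes n})} + \frac{1}{n}h_2(\varepsilon)\right). \tag{13.2.56}
\]
Since the inequality holds for all sufficiently large $n$, it holds in the limit $n\to\infty$,
so that
\begin{align}
R &\leq \lim_{n\to\infty}\frac{1}{1-2\varepsilon}\left(\frac{1}{n}\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho^{\otimes n})} + \frac{1}{n}h_2(\varepsilon)\right) + \delta \tag{13.2.57}\\
&= \frac{1}{1-2\varepsilon}\lim_{n\to\infty}\frac{1}{n}\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho^{\otimes n})} + \delta. \tag{13.2.58}
\end{align}
Then, since this inequality holds for all $\varepsilon\in(0,1/2)$, $\delta>0$, we conclude that
\begin{align}
R &\leq \lim_{\varepsilon,\delta\to 0}\left\{\frac{1}{1-2\varepsilon}\lim_{n\to\infty}\frac{1}{n}\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho^{\otimes n})} + \delta\right\} \tag{13.2.59}\\
&= \lim_{n\to\infty}\frac{1}{n}\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho^{\otimes n})}. \tag{13.2.60}
\end{align}
We have thus shown that the quantity $\lim_{n\to\infty}\frac{1}{n}\sup_{\mathcal{L}} I(A'\rangle B')_{\mathcal{L}(\rho^{\otimes n})}$ is a weak
converse rate for entanglement distillation for $\rho_{AB}$.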
for an arbitrary (𝑑, 𝜀) entanglement distillation protocol, where 𝜀 ∈ (0, 1). Recall
that
\[
\widetilde{R}_\alpha(A;B)_\rho = \inf_{\sigma_{AB}\in\operatorname{PPT}'(A:B)}\widetilde{D}_\alpha(\rho_{AB}\Vert\sigma_{AB}). \tag{13.2.62}
\]
The upper bound above is a consequence of the fact that PPT′ operators are useless
for entanglement distillation, in the sense that for every $\sigma_{AB}\in\operatorname{PPT}'(A:B)$, the
bound $\operatorname{Tr}[\Phi_{AB}\sigma_{AB}]\leq\frac{1}{d}$ holds.
Applying the upper bound in (13.2.61) to the state $\rho_{AB}^{\otimes n}$ leads to the following
result:
Corollary 13.23
Let 𝜌 𝐴𝐵 be a bipartite state, let 𝑛 ≥ 1, 𝜀 ∈ [0, 1), and 𝛼 > 1. For an (𝑛, 𝑑, 𝜀)
entanglement distillation protocol, the following bound holds
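\[
\frac{\log_2 d}{n} \leq \widetilde{R}_\alpha(A;B)_\rho + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{13.2.63}
\]
Consequently,
\[
E_D^{n,\varepsilon}(A;B)_\rho \leq \widetilde{R}_\alpha(A;B)_\rho + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{13.2.64}
\]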
Given an $\varepsilon\in(0,1)$, the inequality in (13.2.63) gives us a bound on the rate of
an arbitrary $(n,d,\varepsilon)$ entanglement distillation protocol for a state $\rho_{AB}$. If instead
we fix the rate to be $r$, so that $d = 2^{nr}$, then the inequality in (13.2.63) is as follows:
\[
r \leq \widetilde{R}_\alpha(A;B)_\rho + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{1}{1-\varepsilon}\right) \tag{13.2.68}
\]
for all $\alpha>1$. Rearranging this inequality gives us the following lower bound on $\varepsilon$:
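\[
\delta > \delta_1+\delta_2 =: \delta'. \tag{13.2.72}
\]
\[
\delta_1 \geq \widetilde{R}_\alpha(A;B)_\rho - R(A;B)_\rho, \tag{13.2.73}
\]
\begin{align}
&\leq R(A;B)_\rho + \delta_1 + \delta_2 \tag{13.2.77}\\
&= R(A;B)_\rho + \delta' \tag{13.2.78}\\
&< R(A;B)_\rho + \delta. \tag{13.2.79}
\end{align}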
So we have that $\frac{\log_2 d}{n} < R(A;B)_\rho+\delta$ for all $(n,d,\varepsilon)$ entanglement distillation
protocols for $\rho_{AB}$ with sufficiently large $n$ such that (13.2.74) holds. Due to this strict
inequality, it follows that there cannot exist an $(n,2^{n(R(A;B)_\rho+\delta)},\varepsilon)$ entanglement
distillation protocol for $\rho_{AB}$ for all sufficiently large $n$ such that (13.2.74) holds.
For if it were to exist, there would be a $d\geq 1$ such that $\log_2 d = n(R(A;B)_\rho+\delta)$,
which we have just seen is not possible. Since $\varepsilon$ and $\delta$ are arbitrary, we conclude
that for all $\varepsilon\in[0,1)$, $\delta>0$, and sufficiently large $n$, there does not exist an
$(n,2^{n(R(A;B)_\rho+\delta)},\varepsilon)$ entanglement distillation protocol for $\rho_{AB}$. This means that
$R(A;B)_\rho$ is a strong converse rate, so that $\widetilde{E}_D(A;B)_\rho\leq R(A;B)_\rho$. ■
Given that the Rains relative entropy is a strong converse rate for distillable
entanglement, by following arguments analogous to those in the proof above, we
can conclude that $\frac{1}{k}R(A^k;B^k)_{\rho^{\otimes k}}$ is a strong converse rate for all $k\geq 2$. Therefore,
the regularized quantity
\[
R^{\mathrm{reg}}(A;B)_\rho := \lim_{n\to\infty}\frac{1}{n}R(A^n;B^n)_{\rho^{\otimes n}}, \tag{13.2.80}
\]
\[
R^{\mathrm{reg}}(A;B)_\rho \leq R(A;B)_\rho, \tag{13.2.82}
\]
so that the regularized quantity in general gives a tighter upper bound on distillable
entanglement.
We now show that the Rains relative entropy is a strong converse rate using the
equivalent characterization of a strong converse rate in (13.2.9). In other words,
given a bipartite state $\rho_{AB}$, we show that for an arbitrary sequence $\{(n,2^{nr},\varepsilon_n)\}_{n\in\mathbb{N}}$
Then, taking the limit 𝑛 → ∞ on both sides of this inequality, we conclude that
lim𝑛→∞ 𝜀 𝑛 ≥ 1. But 𝜀 𝑛 ≤ 1 for all 𝑛 because 𝜀 𝑛 is a probability by definition. So
we obtain lim𝑛→∞ 𝜀 𝑛 = 1. Since the rate 𝑟 > 𝑅( 𝐴; 𝐵) 𝜌 is arbitrary, we conclude
that 𝑅( 𝐴; 𝐵) 𝜌 is a strong converse rate for entanglement distillation for 𝜌 𝐴𝐵 . We
also see from (13.2.85) that the sequence {𝜀 𝑛 }𝑛∈N approaches one at an exponential
rate.
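The exponential approach of the error to one can be made explicit by solving (13.2.68) for 𝜀, which gives 𝜀 ≥ 1 − 2^(−𝑛 ((𝛼 − 1)/𝛼) (𝑟 − 𝑅̃𝛼)) for every 𝛼 > 1. The Python sketch below evaluates this bound; the function name and the numerical values of the rate and of the sandwiched Rényi-Rains quantity are hypothetical.

```python
import math

def error_lower_bound(n, rate, r_alpha, alpha):
    """Lower bound on eps obtained by solving (13.2.68) for eps.

    rate is r, r_alpha is an assumed value of the sandwiched
    Renyi-Rains relative entropy of order alpha > 1.
    The bound is trivial (zero) when rate <= r_alpha.
    """
    if alpha <= 1 or n < 1:
        raise ValueError("need alpha > 1 and n >= 1")
    exponent = n * (alpha - 1) / alpha * (rate - r_alpha)
    return max(0.0, 1 - 2 ** (-exponent))

# Hypothetical values: rate 0.6 bits per copy, R_alpha = 0.5, alpha = 2.
rate, r_alpha, alpha = 0.6, 0.5, 2.0
bounds = [error_lower_bound(n, rate, r_alpha, alpha) for n in (1, 10, 100, 1000)]

# Consistency check: at eps equal to the bound, (13.2.68) holds with equality.
n = 100
eps = error_lower_bound(n, rate, r_alpha, alpha)
rhs = r_alpha + alpha / (n * (alpha - 1)) * math.log2(1 / (1 - eps))
```

For any fixed rate above the bound `r_alpha`, the values in `bounds` increase monotonically towards one, with an exponent proportional to the gap between the rate and `r_alpha`.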
Corollary 13.25
Let 𝜌 𝐴𝐵 be a bipartite state, let 𝑛 ≥ 1, and let 𝜀 ∈ [0, 1). For an (𝑛, 𝑑, 𝜀)
entanglement distillation protocol, the following bound holds
\[
\frac{1}{n}\log_2 d \leq \frac{1}{1-\sqrt{\varepsilon}}\left(E_{\mathrm{sq}}(A;B)_\rho + \frac{g_2(\sqrt{\varepsilon})}{n}\right). \tag{13.2.86}
\]
\[
E_{\mathrm{sq}}(A^n;B^n)_{\rho^{\otimes n}} = nE_{\mathrm{sq}}(A;B)_\rho. \tag{13.2.88}
\]
We now provide a proof of (13.2.15), the statement that the squashed entan-
glement is a weak converse rate for entanglement distillation. Suppose that 𝑅 is
an achievable rate for entanglement distillation for the bipartite state 𝜌 𝐴𝐵 . Then,
by definition, for all 𝜀 ∈ (0, 1], 𝛿 > 0, and sufficiently large 𝑛, there exists an
(𝑛, 2𝑛(𝑅−𝛿) , 𝜀) entanglement distillation protocol for 𝜌 𝐴𝐵 . For all such protocols,
the inequality in (13.2.86) holds, so that
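\[
R-\delta \leq \frac{1}{1-\sqrt{\varepsilon}}\left(E_{\mathrm{sq}}(A;B)_\rho + \frac{1}{n}g_2(\sqrt{\varepsilon})\right). \tag{13.2.89}
\]
Since the inequality holds for all sufficiently large $n$, it holds in the limit $n\to\infty$,
so that
\begin{align}
R &\leq \lim_{n\to\infty}\frac{1}{1-\sqrt{\varepsilon}}\left(E_{\mathrm{sq}}(A;B)_\rho + \frac{1}{n}g_2(\sqrt{\varepsilon})\right) + \delta \tag{13.2.90}\\
&= \frac{1}{1-\sqrt{\varepsilon}}E_{\mathrm{sq}}(A;B)_\rho + \delta. \tag{13.2.91}
\end{align}
Then, since this inequality holds for all $\varepsilon\in(0,1]$ and $\delta>0$, we conclude that
\begin{align}
R &\leq \lim_{\varepsilon,\delta\to 0}\left\{\frac{1}{1-\sqrt{\varepsilon}}E_{\mathrm{sq}}(A;B)_\rho + \delta\right\} \tag{13.2.92}\\
&= E_{\mathrm{sq}}(A;B)_\rho. \tag{13.2.93}
\end{align}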
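The limits taken in this argument can be illustrated numerically by evaluating the right-hand side of (13.2.86). The sketch below is hedged in two ways: the value of the squashed entanglement is a made-up number, and the function 𝑔2 is assumed here to be 𝑔2(𝛿) = (𝛿 + 1) log2(𝛿 + 1) − 𝛿 log2 𝛿, which should be checked against its definition earlier in the book. Function names are hypothetical.

```python
import math

def g2(delta):
    """Assumed form: g_2(delta) = (delta + 1) log2(delta + 1) - delta log2(delta)."""
    if delta == 0:
        return 0.0
    return (delta + 1) * math.log2(delta + 1) - delta * math.log2(delta)

def squashed_rate_bound(e_sq, eps, n):
    """Right-hand side of (13.2.86) for 0 <= eps < 1 and n >= 1."""
    if not (0 <= eps < 1 and n >= 1):
        raise ValueError("need 0 <= eps < 1 and n >= 1")
    root = math.sqrt(eps)
    return (e_sq + g2(root) / n) / (1 - root)

# Hypothetical squashed entanglement of 0.7 bits, eps = 0.01.
e_sq, eps = 0.7, 0.01
finite_n = [squashed_rate_bound(e_sq, eps, n) for n in (1, 10, 1000)]
n_limit = e_sq / (1 - math.sqrt(eps))  # n -> infinity, as in (13.2.91) without delta
```

As 𝑛 grows the bound decreases to `n_limit`, and letting 𝜀 tend to zero afterwards recovers the squashed entanglement itself, mirroring (13.2.90)–(13.2.93).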
where
\[
D^{\to}(\rho_{AB}) := \sup_{\mathcal{L}^{\to}} I(A'\rangle B')_{\mathcal{L}^{\to}(\rho)}. \tag{13.2.99}
\]
\[
E_D^{\to}(A;B)_\rho = \lim_{n\to\infty}\frac{1}{n}\sup_{V} I(A'\rangle XB^n)_{V\rho^{\otimes n}V^\dagger}, \tag{13.2.100}
\]
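Proof: We start by recalling from Section 4.6.2 (see also the beginning of
Section 13.1.2) that every one-way LOCC channel $\mathcal{L}^{\to}_{A^nB^n\to A'B'}$ can be expressed as
\[
\omega_{A'B'} := \mathcal{L}^{\to}_{A^nB^n\to A'B'}(\rho_{A^nB^n}) \tag{13.2.103}
\]
For every $n\geq 1$, if we restrict the optimization in (13.2.98) such that $|\mathcal{X}| = d_{A^n}^2 = d_A^{2n}$,
$d_{A'} = d_{A^n} = d_A^n$, $\mathcal{D}^x_{B^n\to B'} = \operatorname{id}_{B^n}$ for all $x\in\mathcal{X}$, and $\mathcal{E}^x_{A^n\to A'}(\cdot) = K^x_{A^n}(\cdot)(K^x_{A^n})^\dagger$ for
all $x\in\mathcal{X}$ such that $\sum_{x\in\mathcal{X}}(K^x_{A^n})^\dagger K^x_{A^n} = \mathbb{1}_{A^n}$, then the LOCC channel $\mathcal{L}^{\to}_{A^nB^n\to A'B'}$
reduces to
\[
\mathcal{L}^{\to}_{A^nB^n\to A'B'}(\rho_{A^nB^n}) = \sum_{x\in\mathcal{X}}K^x_{A^n}\rho_{A^nB^n}(K^x_{A^n})^\dagger\otimes|x\rangle\!\langle x|_X \tag{13.2.108}
\]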
We thus obtain
\[
E_D^{\to}(A;B)_\rho \geq \lim_{n\to\infty}\frac{1}{n}\sup_{V} I(A^n\rangle XB^n)_\omega \tag{13.2.110}
\]
The rest of the proof is devoted to proving the reverse inequality. Let $\mathcal{L}^{\to}_{A^nB^n\to A'B'}$
be an arbitrary one-way LOCC channel of the form in (13.2.103)–(13.2.105). For
every state $\rho_{A^nB^n}$, let
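\[
{}= \sum_{x\in\mathcal{X}}p_x\,I(A'\rangle B^n)_{\rho^x}. \tag{13.2.116}
\]
Now,
\begin{align}
\mathcal{E}_{A^n\to A'X_A}(\rho_{A^nB^n}) &= \sum_{x\in\mathcal{X}}\mathcal{E}^x_{A^n\to A'}(\rho_{AB})\otimes|x\rangle\!\langle x|_{X_A} \tag{13.2.117}\\
&= \sum_{x\in\mathcal{X}}p_x\,\rho^x_{A'B}\otimes|x\rangle\!\langle x|_{X_A}. \tag{13.2.118}
\end{align}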
for all 𝑥 ∈ X, where the inequality follows from convexity of coherent information
(see (7.2.121)). Without loss of generality, we can take 𝐴′ ≡ 𝐴𝑛 : if 𝐴′ has a
dimension smaller than that of 𝐴𝑛 , we can always first isometrically embed 𝐴′ into
𝐴𝑛 . The coherent information remains unchanged under this isometric embedding.
Combining the last inequality above with the one in (13.2.116), we conclude
that
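\[
I(A'\rangle B')_\omega \leq \sum_{x\in\mathcal{X}}p_x\sum_{y\in\mathcal{Y}}r_{y|x}\,I(A'\rangle B^n)_{\rho^{x,y}} = \sum_{x\in\mathcal{X},y\in\mathcal{Y}}q_{x,y}\,I(A'\rangle B^n)_{\rho^{x,y}}. \tag{13.2.124}
\]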
where $d_E = d_{Z_B} = |\mathcal{Z}|$. Since the number of Kraus operators need not exceed
$d_{A^n}^2 = d_A^{2n}$ (see Theorem 4.3), we can take $d_Z = d_A^{2n}$ without loss of generality. We
can thus optimize over all isometries of the form in (13.2.127). Altogether, we have
that
\[
I(A'\rangle B')_\omega \leq \sup_{V} I(A'\rangle ZB^n)_{V\rho^{\otimes n}V^\dagger} \tag{13.2.128}
\]
for every one-way LOCC channel $\mathcal{L}^{\to}_{A^nB^n\to A'B'}$ and all $n\geq 1$. Optimizing over all
one-way LOCC channels on the left-hand side of the inequality above, and taking
the limit $n\to\infty$ leads us to conclude that
\[
E_D^{\to}(A;B)_\rho \leq \lim_{n\to\infty}\frac{1}{n}\sup_{V} I(A'\rangle ZB^n)_{V\rho^{\otimes n}V^\dagger}. \tag{13.2.129}
\]
Lemma 13.27
For every bipartite state 𝜌 𝐴𝐵 , the optimized coherent information lower bound
on distillable entanglement is non-negative, i.e., 𝐷 → (𝜌 𝐴𝐵 ) ≥ 0.
Proof: Let 𝜓 𝐴𝐵𝑅 = |𝜓⟩⟨𝜓| 𝐴𝐵𝑅 be a purification of 𝜌 𝐴𝐵 , and consider the following
Schmidt decomposition of |𝜓⟩ 𝐴𝐵𝑅 :
\[
|\psi\rangle_{ABR} = \sum_{k=0}^{r-1}\sqrt{\lambda_k}\,|\phi_k\rangle_A\otimes|\varphi_k\rangle_{BR}. \tag{13.2.130}
\]
Then, let
\[
V_{A\to A'XE} := \sum_{k=0}^{r-1}|k\rangle_{A'}\langle\phi_k|_A\otimes|k\rangle_X\otimes|k\rangle_E. \tag{13.2.131}
\]
13.3 Examples
We now consider classes of bipartite states and evaluate the upper and lower bounds
on their distillable entanglement that we have established in this chapter. In some
cases, the distillable entanglement can be determined exactly because the upper
and lower bounds coincide.
The simplest example for which distillable entanglement can be determined exactly
is the class of pure bipartite states. In this case, the coherent information lower
bound and the Rains relative entropy upper bound coincide and are equal to the
entropy of the reduced state. Indeed, for the coherent information, the joint entropy
𝐻 ( 𝐴𝐵)𝜓 = 0 for every pure state 𝜓 𝐴𝐵 , so that
where the last equality follows from the Schmidt decomposition theorem (Theo-
rem 2.2) to see that the reduced states 𝜓 𝐴 and 𝜓 𝐵 have the same non-zero eigenvalues,
and thus the same value for the entropy. On the other hand, Proposition 9.20 states
that the relative entropy of entanglement 𝐸 𝑅 ( 𝐴; 𝐵)𝜓 = 𝐻 ( 𝐴)𝜓 , and we also know
that 𝐸 𝑅 ( 𝐴; 𝐵)𝜓 ≥ 𝑅( 𝐴; 𝐵)𝜓 (see (9.1.149)). We thus have the following:
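As a numerical illustration of the pure-state case discussed above, the entropy of the reduced state can be computed directly from the Schmidt coefficients, since the non-zero eigenvalues of 𝜓 𝐴 are the squares of those coefficients. The following Python sketch uses hypothetical function names and example states.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a probability vector, with 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def reduced_state_entropy(schmidt_coeffs):
    """H(A) for a pure bipartite state with the given Schmidt coefficients.

    The Schmidt coefficients are the non-negative amplitudes sqrt(lambda_k);
    their squares are the eigenvalues of the reduced state.
    """
    eigenvalues = [c * c for c in schmidt_coeffs]
    total = sum(eigenvalues)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("squared Schmidt coefficients must sum to one")
    return entropy_bits(eigenvalues)

# Hypothetical examples.
product_state = reduced_state_entropy([1.0])              # no entanglement
bell_state = reduced_state_entropy([math.sqrt(0.5)] * 2)  # one ebit
max_ent_d4 = reduced_state_entropy([0.5] * 4)             # log2(4) ebits
partial = reduced_state_entropy([math.sqrt(0.9), math.sqrt(0.1)])
```

Because the lower and upper bounds coincide for pure states, the value returned is the distillable entanglement in ebits per copy; the partially entangled example lies strictly between zero and one.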
In this section, we define two classes of states for which the one-way distillable
entanglement takes on a simple form.
𝜔 𝐴𝐵′ = 𝜔 𝐴𝐵 = 𝜌 𝐴𝐵 . (13.3.4)
Remark: Degradable and anti-degradable states are the state counterparts of degradable
and anti-degradable channels; see Definition 4.6. In fact, observe that the Choi state of a
degradable channel is a degradable state, and the Choi state of an anti-degradable channel is an
anti-degradable state.
Anti-degradable states are also sometimes called symmetrically extendible states or two-
extendible states (please consult the Bibliographic Notes in Section 13.5).
Intuitively, a degradable state is one for which the system 𝐵 can be used
to simulate (via a quantum channel D𝐵→𝐸 ′ ) the correlations between 𝐴 and 𝐸.
Analogously, an anti-degradable state is one for which the system 𝐸 can be used to
simulate (via a quantum channel A𝐸→𝐵′ ) the correlations between 𝐴 and 𝐵.
An anti-degradable state 𝜌 𝐴𝐵 is one for which the environment 𝐸 (corresponding
to the purifying system of 𝜌 𝐴𝐵 ) cannot be decoupled from 𝐴 and 𝐵 through LOCC
from 𝐴 to 𝐵 alone. Indeed, recall the task of decoupling from Section 13.1.2 (in
particular, see Figure 13.2). Since a channel can always be applied to 𝐸 in order to
simulate the correlations between 𝐴 and 𝐵, from the point of view of 𝐴, the systems
𝐵 and 𝐸 become indistinguishable, so that 𝐴 and 𝐵 cannot be (perfectly) decoupled
from 𝐸. Given that decoupling is not possible for anti-degradable states, we might
expect that anti-degradable states have zero one-way distillable entanglement. This
is indeed true, as we now show.
𝜌 𝐴𝐵 = A 𝑅→𝐵 (𝜓 𝐴𝑅 ). (13.3.5)
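Now, let
\[
\omega_{A'XEBR} = V_{A\to A'XE}\,\psi_{ABR}\,V_{A\to A'XE}^\dagger, \tag{13.3.6}
\]
which is a pure state. Then, using the fact that
\begin{align}
\omega_{A'XB} &= \operatorname{Tr}_E[V_{A\to A'XE}\,\rho_{AB}\,V_{A\to A'XE}^\dagger] \tag{13.3.7}\\
&= (\mathcal{A}_{R\to B}\circ\operatorname{Tr}_E)(V_{A\to A'XE}\,\psi_{AR}\,V_{A\to A'XE}^\dagger) \tag{13.3.8}\\
&= \mathcal{A}_{R\to B}(\omega_{A'XR}), \tag{13.3.9}
\end{align}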
where we used the data-processing inequality in (7.3.17), and for the subsequent
equalities we used the fact that 𝜔 𝐴′ 𝑋 𝐸 𝐵𝑅 is a pure state that is symmetric in 𝑋 and
$E$. We thus have $I(A'\rangle XB)_{V\rho V^\dagger}\leq 0$ for every isometry $V$ used in the optimization
for $D^{\to}(\rho_{AB})$, implying that $D^{\to}(\rho_{AB})\leq 0$. However, since $D^{\to}(\rho_{AB})\geq 0$ by
Lemma 13.27, we obtain $D^{\to}(\rho_{AB}) = 0$. The statement that $E_D^{\to}(A;B)_\rho = 0$
follows by repeating the same argument for $n$ copies of $\rho_{AB}$ and using the fact that
$\rho_{AB}^{\otimes n}$ is an anti-degradable state if $\rho_{AB}$ is. ■
𝐷 → (𝜌 𝐴𝐵 ) = 𝐼 ( 𝐴⟩𝐵) 𝜌 . (13.3.15)
Consequently, $D^{\to}(\rho_{AB}^{\otimes n}) = nD^{\to}(\rho_{AB})$, and thus the one-way distillable
entanglement of a degradable state $\rho_{AB}$ is equal to its coherent information:
\[
E_D^{\to}(A;B)_\rho = I(A\rangle B)_\rho. \tag{13.3.16}
\]
= 𝐻 (𝑋 𝑅′ 𝐹) 𝜙 − 𝐻 ( 𝐴′ 𝑋 𝑅′ 𝐹) 𝜙 (13.3.22)
= 𝐻 (𝑋 𝑅′ 𝐹) 𝜙 − 𝐻 (𝐸 𝑅) 𝜙 (13.3.23)
= 𝐻 (𝑋 𝑅′ 𝐹) 𝜙 − 𝐻 (𝑋 𝑅) 𝜙 , (13.3.24)
where the inequality follows from the data-processing inequality with the partial
trace channel Tr 𝑋 , and the last equality follows because 𝜙 𝑅 = 𝜙 𝑅′ , due to the
degradability of 𝜌 𝐴𝐵 . Finally, observe that
\[
\phi_{R'F} = W_{B\to R'F}\,\rho_B\,W_{B\to R'F}^\dagger, \tag{13.3.30}
\]
𝐷 → (𝜌 𝐴𝐵 ) = 𝐼 ( 𝐴⟩𝐵) 𝜌 . (13.3.32)
13.4 Summary
In this chapter, we studied the task of entanglement distillation, in which the goal is
for Alice and Bob to convert many copies of a shared entangled state 𝜌 𝐴𝐵 to some
Berta (2008); Buscemi and Datta (2010b); Brandao and Datta (2011); Wilde
et al. (2017) have considered lower bounds on distillable entanglement in the
one-shot setting. The lower bound that we present in Proposition 13.10 is the one
given by Wilde et al. (2017, Proposition 21), which makes use of the one-shot
decoupling results obtained by Dupuis et al. (2014). In particular, the proof of
Theorem 13.11 provided in Appendix 13.A comes directly from the proof of (Dupuis
et al., 2014, Theorem 3.3). The notion of decoupling has played an important role
in the development of quantum information theory. It was originally proposed by
Schumacher and Westmoreland (2002) in the context of understanding approximate
quantum error correction and quantum communication. It was then developed in
much more detail by Horodecki et al. (2005b, 2007) in the context of state merging
and by Hayden et al. (2008a) for understanding the coherent information lower
bound on quantum capacity. Dupuis (2010) developed the method in more detail in
his PhD thesis for a variety of information-processing tasks, and this culminated in
the general decoupling theorem presented as (Dupuis et al., 2014, Theorem 3.3).
For the SDP formulations of conditional min- and max-entropy, as well as their
smoothed variants, see (Tomamichel, 2015, Chapter 6). For more information
about unitary designs and about Haar measure integration over unitaries, we refer
to (Collins and Śniady, 2006; Roy and Scott, 2009). The one-shot upper bound
that we present in Proposition 13.6 and Theorem 13.7 based on the fact that PPT’
operators are useless for entanglement distillation was determined by Tomamichel
et al. (2016).
In the asymptotic setting, Devetak and Winter (2005) used random coding
arguments to establish the coherent information lower bound (also called the
“hashing inequality”) on the distillable entanglement of a bipartite quantum state.
The corresponding hashing protocol was presented by Bennett et al. (1996c) for
two-qubit Bell-diagonal states. Devetak and Winter (2005) also determined that
the general expression in Theorem 13.19 is an achievable rate for entanglement
distillation from a bipartite state, and they also proved the converse. Horodecki
et al. (2000) conjectured this formula earlier, conditioned on the hashing inequality
being true.
Theorem 13.26 is due to Devetak and Winter (2005). The other results in
Sections 13.2.5 and 13.3.2 on one-way entanglement distillation were obtained by
Leditzky et al. (2018), who in the same work used the concepts of approximate
degradability and approximate anti-degradability of bipartite states to derive upper
bounds on distillable entanglement. We note that anti-degradable quantum states,
Figure 13.4: Given a bipartite state 𝜌 𝐴𝐸 and a quantum channel N 𝐴→𝐵 , the
goal of decoupling is to obtain a state 𝜏𝐵 ⊗ 𝜌 𝐸 that is in tensor product with the
environment 𝐸, where 𝜌 𝐸 = Tr 𝐴 [𝜌 𝐴𝐸 ]. To assist with the task, Alice is allowed
to apply an arbitrary unitary 𝑈 to her system 𝐴.
N 𝐴→𝐵 (𝑈 𝐴 𝜌 𝐴𝐸 𝑈 𝐴† ) ≈𝜀 𝜏𝐵 ⊗ 𝜌 𝐸 , (13.A.1)
$$\min_{U_A}\left\|\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right\|_1 \tag{13.A.2}$$
Note that the minimum over all unitaries never exceeds the average, meaning that
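$$\min_{U_A}\left\|\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right\|_1
\leq \int_{U_A}\left\|\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right\|_1 \mathrm{d}U_A, \tag{13.A.3}$$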
where the integral over all unitaries 𝑈 𝐴 is with respect to the Haar measure, and it
can be thought of as a uniform average over the continuous set of all unitaries 𝑈 𝐴
acting on the system 𝐴. Theorem 13.11 provides an upper bound on the right-hand
side of the inequality above, and we restate the result here for convenience:
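$$\int_{U_A}\left\|\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right\|_1 \mathrm{d}U_A
\leq 2^{-\frac{1}{2}\widetilde{H}_2(A|E)_{\rho} - \frac{1}{2}\widetilde{H}_2(A|B)_{\Phi^{\mathcal{N}}}}, \tag{13.A.4}$$
where we recall that the sandwiched Rényi conditional entropy of order two of a
bipartite state 𝜔𝐶𝐷 is defined as
$$\begin{aligned}
\widetilde{H}_2(C|D)_{\omega} &= -\inf_{\sigma_D}\widetilde{D}_2(\omega_{CD}\,\Vert\,\mathbb{1}_C\otimes\sigma_D) && (13.A.5)\\
&= -\inf_{\sigma_D}\log_2\operatorname{Tr}\!\left[\left(\sigma_D^{-\frac{1}{4}}\,\omega_{CD}\,\sigma_D^{-\frac{1}{4}}\right)^{2}\right], && (13.A.6)
\end{aligned}$$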
Let
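$$\begin{aligned}
M_{BE} &:= \mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E, && (13.A.7)\\
\sigma_{BE} &:= \tau_B\otimes\zeta_E, && (13.A.8)
\end{aligned}$$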
with 𝜏𝐵 and 𝜁 𝐸 arbitrary positive definite states. By the variational characterization
of the trace norm in (2.2.130), we have that
$$\|M_{BE}\|_1 = \max_{U_{BE}}\left|\operatorname{Tr}[U_{BE}M_{BE}]\right|, \tag{13.A.9}$$
where the optimization is over every unitary 𝑈𝐵𝐸 . Using the Cauchy–Schwarz
inequality (see (2.2.30)), and suppressing system labels for brevity, we obtain
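$$\begin{aligned}
\|M\|_1 &= \max_{U}\left|\operatorname{Tr}[UM]\right| && (13.A.10)\\
&= \max_{U}\left|\operatorname{Tr}\!\left[\sigma^{\frac{1}{4}}U\sigma^{\frac{1}{4}}\,\sigma^{-\frac{1}{4}}M\sigma^{-\frac{1}{4}}\right]\right| && (13.A.11)\\
&\leq \max_{U}\sqrt{\operatorname{Tr}\!\left[\sigma^{\frac{1}{4}}U\sigma^{\frac{1}{4}}\sigma^{\frac{1}{4}}U^{\dagger}\sigma^{\frac{1}{4}}\right]\operatorname{Tr}\!\left[\sigma^{-\frac{1}{4}}M\sigma^{-\frac{1}{2}}M^{\dagger}\sigma^{-\frac{1}{4}}\right]} && (13.A.12)\\
&= \max_{U}\sqrt{\operatorname{Tr}\!\left[\sigma^{\frac{1}{2}}U\sigma^{\frac{1}{2}}U^{\dagger}\right]\operatorname{Tr}\!\left[\sigma^{-\frac{1}{4}}M\sigma^{-\frac{1}{2}}M^{\dagger}\sigma^{-\frac{1}{4}}\right]}. && (13.A.13)
\end{aligned}$$
Since $\sigma^{\frac{1}{2}}$ and $U\sigma^{\frac{1}{2}}U^{\dagger}$ are positive definite for every unitary $U$, by the Cauchy–
Schwarz inequality, we conclude that
$$\begin{aligned}
\operatorname{Tr}\!\left[\sigma^{\frac{1}{2}}U\sigma^{\frac{1}{2}}U^{\dagger}\right] &= \left|\operatorname{Tr}\!\left[\sigma^{\frac{1}{2}}U\sigma^{\frac{1}{2}}U^{\dagger}\right]\right| && (13.A.14)\\
&\leq \sqrt{\operatorname{Tr}\!\left[\sigma^{\frac{1}{2}}\sigma^{\frac{1}{2}}\right]\operatorname{Tr}\!\left[U\sigma^{\frac{1}{2}}U^{\dagger}U\sigma^{\frac{1}{2}}U^{\dagger}\right]} && (13.A.15)\\
&= \operatorname{Tr}[\sigma] && (13.A.16)\\
&= 1 && (13.A.17)
\end{aligned}$$
for every unitary 𝑈, which implies that
$$\max_{U}\operatorname{Tr}\!\left[\sigma^{\frac{1}{2}}U\sigma^{\frac{1}{2}}U^{\dagger}\right] \leq \operatorname{Tr}[\sigma] = 1. \tag{13.A.18}$$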
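$$\left\|\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right\|_1
\leq \sqrt{\operatorname{Tr}\!\left[\left((\tau_B\otimes\zeta_E)^{-\frac{1}{4}}\left(\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right)(\tau_B\otimes\zeta_E)^{-\frac{1}{4}}\right)^{2}\right]}. \tag{13.A.23}$$
Now, define
$$\begin{aligned}
\widetilde{\mathcal{N}}_{A\to B}(\cdot) &:= \tau_B^{-\frac{1}{4}}\,\mathcal{N}_{A\to B}(\cdot)\,\tau_B^{-\frac{1}{4}}, && (13.A.24)\\
\widetilde{\rho}_{AE} &:= \zeta_E^{-\frac{1}{4}}\,\rho_{AE}\,\zeta_E^{-\frac{1}{4}}. && (13.A.25)
\end{aligned}$$
$$\left\|\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right\|_1
\leq \sqrt{\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger}) - \Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E\right)^{2}\right]}. \tag{13.A.26}$$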
Taking the integral over unitaries 𝑈 𝐴 on both sides of this inequality, and using
Jensen’s inequality (see (2.3.21)), which applies because the square root function is
concave, we obtain
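$$\begin{aligned}
&\int_{U_A}\left\|\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right\|_1 \mathrm{d}U_A\\
&\quad\leq \int_{U_A}\sqrt{\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger}) - \Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E\right)^{2}\right]}\,\mathrm{d}U_A && (13.A.27)\\
&\quad\leq \sqrt{\int_{U_A}\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger}) - \Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E\right)^{2}\right]\mathrm{d}U_A}. && (13.A.28)
\end{aligned}$$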
Expanding the integral on the right-hand side of the inequality above leads to
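$$\begin{aligned}
&\int_{U_A}\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger}) - \Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E\right)^{2}\right]\mathrm{d}U_A\\
&\quad= \int_{U_A}\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger})\right)^{2}\right]\mathrm{d}U_A && (13.A.29)\\
&\qquad - 2\int_{U_A}\operatorname{Tr}\!\left[\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger})\,(\Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E)\right]\mathrm{d}U_A + \operatorname{Tr}\!\left[(\Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E)^{2}\right] && (13.A.30)\\
&\quad= \int_{U_A}\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger})\right)^{2}\right]\mathrm{d}U_A\\
&\qquad - 2\operatorname{Tr}\!\left[\widetilde{\mathcal{N}}_{A\to B}\!\left(\int_{U_A}U_A\widetilde{\rho}_{AE}U_A^{\dagger}\,\mathrm{d}U_A\right)(\Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E)\right] + \operatorname{Tr}\!\left[(\Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E)^{2}\right]. && (13.A.31)
\end{aligned}$$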
Now, using (13.1.83), we have that
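$$\widetilde{\mathcal{N}}_{A\to B}\!\left(\int_{U_A}U_A\widetilde{\rho}_{AE}U_A^{\dagger}\,\mathrm{d}U_A\right)
= \widetilde{\mathcal{N}}_{A\to B}(\pi_A\otimes\operatorname{Tr}_A[\widetilde{\rho}_{AE}])
= \Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E. \tag{13.A.32}$$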
Therefore,
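$$\begin{aligned}
&\int_{U_A}\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger}) - \Phi^{\widetilde{\mathcal{N}}}_B\otimes\widetilde{\rho}_E\right)^{2}\right]\mathrm{d}U_A\\
&\quad= \int_{U_A}\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger})\right)^{2}\right]\mathrm{d}U_A - \operatorname{Tr}\!\left[(\Phi^{\widetilde{\mathcal{N}}}_B)^{2}\right]\operatorname{Tr}\!\left[\widetilde{\rho}_E^{\,2}\right]. && (13.A.33)
\end{aligned}$$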
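$$\begin{aligned}
&= \operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger})\right)^{\otimes 2}(F_B\otimes F_E)\right] && (13.A.35)\\
&= \operatorname{Tr}\!\left[\widetilde{\mathcal{N}}_{A\to B}^{\otimes 2}\!\left(U_A^{\otimes 2}\,\widetilde{\rho}_{AE}^{\,\otimes 2}\,(U_A^{\dagger})^{\otimes 2}\right)(F_B\otimes F_E)\right] && (13.A.36)\\
&= \operatorname{Tr}\!\left[\widetilde{\rho}_{AE}^{\,\otimes 2}\left((U_A^{\dagger})^{\otimes 2}\,(\widetilde{\mathcal{N}}_{A\to B}^{\otimes 2})^{\dagger}(F_B)\,U_A^{\otimes 2}\otimes F_E\right)\right], && (13.A.37)
\end{aligned}$$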
where the last line follows from the definition of the adjoint of a channel. We thus
have
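$$\begin{aligned}
&\int_{U_A}\operatorname{Tr}\!\left[\left(\widetilde{\mathcal{N}}_{A\to B}(U_A\widetilde{\rho}_{AE}U_A^{\dagger})\right)^{2}\right]\mathrm{d}U_A\\
&\quad= \operatorname{Tr}\!\left[\widetilde{\rho}_{AE}^{\,\otimes 2}\left(\int_{U_A}(U_A^{\dagger})^{\otimes 2}\,(\widetilde{\mathcal{N}}_{A\to B}^{\otimes 2})^{\dagger}(F_B)\,U_A^{\otimes 2}\,\mathrm{d}U_A\otimes F_E\right)\right]. && (13.A.38)
\end{aligned}$$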
Now, we use the following known fact (a standard result in Schur–Weyl duality):
for every operator 𝑋 acting on C𝑑 ⊗ C𝑑 , with 𝑑 ≥ 1,
$$\int_{U}(U^{\dagger})^{\otimes 2}\,X\,U^{\otimes 2}\,\mathrm{d}U = \alpha\mathbb{1} + \beta F, \tag{13.A.39}$$
where the last line follows from (13.A.34). By similar reasoning, we obtain
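$$\begin{aligned}
\operatorname{Tr}\!\left[F_A\,(\widetilde{\mathcal{N}}_{A\to B}^{\otimes 2})^{\dagger}(F_B)\right] &= \operatorname{Tr}\!\left[\widetilde{\mathcal{N}}_{A\to B}^{\otimes 2}(F_A)\,F_B\right] && (13.A.45)\\
&= d_A^{2}\operatorname{Tr}\!\left[(F_A\otimes F_B)(\Phi^{\widetilde{\mathcal{N}}}_{AB})^{\otimes 2}\right] && (13.A.46)\\
&= d_A^{2}\operatorname{Tr}\!\left[(\Phi^{\widetilde{\mathcal{N}}}_{AB})^{2}\right], && (13.A.47)
\end{aligned}$$
where the second equality follows by expressing the action of $\widetilde{\mathcal{N}}_{A\to B}^{\otimes 2}$ on $F_A$ with the
Choi state $\Phi^{\widetilde{\mathcal{N}}}_{AB}$, using (4.2.5). To obtain the last equality, we again used (13.A.34).
We thus have
$$\begin{aligned}
\alpha &= \frac{\operatorname{Tr}[(\Phi^{\widetilde{\mathcal{N}}}_{B})^{2}]}{d_A^{2}-1}\left(d_A^{2} - d_A\frac{\operatorname{Tr}[(\Phi^{\widetilde{\mathcal{N}}}_{AB})^{2}]}{\operatorname{Tr}[(\Phi^{\widetilde{\mathcal{N}}}_{B})^{2}]}\right), && (13.A.48)\\
\beta &= \frac{\operatorname{Tr}[(\Phi^{\widetilde{\mathcal{N}}}_{AB})^{2}]}{d_A^{2}-1}\left(d_A^{2} - d_A\frac{\operatorname{Tr}[(\Phi^{\widetilde{\mathcal{N}}}_{B})^{2}]}{\operatorname{Tr}[(\Phi^{\widetilde{\mathcal{N}}}_{AB})^{2}]}\right). && (13.A.49)
\end{aligned}$$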
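As a quick numerical sanity check of the expressions for 𝛼 and 𝛽, the following Python sketch solves the two trace conditions that fix the coefficients and compares the result with (13.A.48) and (13.A.49). It assumes the standard argument that the twirl in (13.A.39) preserves both Tr[𝑋] and Tr[𝑋𝐹], together with Tr[1] = 𝑑², Tr[𝐹] = 𝑑, and Tr[𝐹²] = 𝑑². The function name and the sample purities are illustrative assumptions, and the sketch requires 𝑑 ≥ 2 so that 𝑑² − 1 is non-zero.

```python
def twirl_coefficients(d, tr_x, tr_xf):
    """Coefficients (alpha, beta) such that alpha*1 + beta*F matches Tr[X] and Tr[XF].

    Solves the linear system
        Tr[X]  = alpha * d**2 + beta * d,
        Tr[XF] = alpha * d    + beta * d**2.
    """
    if d < 2:
        raise ValueError("this sketch assumes d >= 2")
    denom = d * (d * d - 1)
    alpha = (d * tr_x - tr_xf) / denom
    beta = (d * tr_xf - tr_x) / denom
    return alpha, beta


# Illustrative values (assumptions, not taken from the text).
d_a = 2
purity_b = 0.5    # stands in for Tr[(Phi_B)^2]
purity_ab = 0.4   # stands in for Tr[(Phi_AB)^2]

# Assumed inputs: Tr[X] = d_A^2 * purity_b and Tr[XF] = d_A^2 * purity_ab.
alpha, beta = twirl_coefficients(d_a, d_a**2 * purity_b, d_a**2 * purity_ab)

# Closed forms written as in (13.A.48) and (13.A.49).
alpha_closed = purity_b * (d_a**2 - d_a * purity_ab / purity_b) / (d_a**2 - 1)
beta_closed = purity_ab * (d_a**2 - d_a * purity_b / purity_ab) / (d_a**2 - 1)

print(alpha, alpha_closed)
print(beta, beta_closed)
```

For these sample values the two routes agree, and the coefficients respect 𝛼 ≤ Tr[(Φ_B)²] and 𝛽 ≤ Tr[(Φ_AB)²], as the purity-ratio bounds require.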
We now make use of the following general fact, whose proof we provide below
in Lemma 13.32: for every non-zero positive semi-definite operator 𝑃 𝐴𝐵 with
𝑃 𝐵 B Tr 𝐴 [𝑃 𝐴𝐵 ], the following inequalities hold
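$$\frac{1}{d_A} \leq \frac{\operatorname{Tr}[P_{AB}^{2}]}{\operatorname{Tr}[P_{B}^{2}]} \leq d_A. \tag{13.A.50}$$
$$\alpha \leq \operatorname{Tr}\!\left[(\Phi^{\widetilde{\mathcal{N}}}_{B})^{2}\right], \qquad \beta \leq \operatorname{Tr}\!\left[(\Phi^{\widetilde{\mathcal{N}}}_{AB})^{2}\right]. \tag{13.A.51}$$
$$\leq \operatorname{Tr}\!\left[(\Phi^{\widetilde{\mathcal{N}}}_{AB})^{2}\right]\operatorname{Tr}\!\left[\widetilde{\rho}_{AE}^{\,2}\right]. \tag{13.A.56}$$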
This inequality holds for all states 𝜏𝐵 and 𝜁 𝐸 , which means that
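$$\begin{aligned}
&\int_{U_A}\left\|\mathcal{N}_{A\to B}(U_A\rho_{AE}U_A^{\dagger}) - \Phi^{\mathcal{N}}_B\otimes\rho_E\right\|_1 \mathrm{d}U_A\\
&\quad\leq \left(\inf_{\tau_B}\operatorname{Tr}\!\left[\left(\tau_B^{-\frac{1}{4}}\,\Phi^{\mathcal{N}}_{AB}\,\tau_B^{-\frac{1}{4}}\right)^{2}\right]\right)^{\frac{1}{2}}\left(\inf_{\zeta_E}\operatorname{Tr}\!\left[\left(\zeta_E^{-\frac{1}{4}}\,\rho_{AE}\,\zeta_E^{-\frac{1}{4}}\right)^{2}\right]\right)^{\frac{1}{2}} && (13.A.59)\\
&\quad= 2^{-\frac{1}{2}\widetilde{H}_2(A|B)_{\Phi^{\mathcal{N}}} - \frac{1}{2}\widetilde{H}_2(A|E)_{\rho}} && (13.A.60)\\
&\quad= 2^{-\frac{1}{2}\widetilde{H}_2(A|E)_{\rho} - \frac{1}{2}\widetilde{H}_2(A|B)_{\Phi^{\mathcal{N}}}}, && (13.A.61)
\end{aligned}$$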
Lemma 13.32
For every non-zero positive semi-definite operator $P_{AB}$, with $P_B = \operatorname{Tr}_A[P_{AB}]$,
it holds that
$$\frac{1}{d_A} \leq \frac{\operatorname{Tr}[P_{AB}^{2}]}{\operatorname{Tr}[P_{B}^{2}]} \leq d_A. \tag{13.A.62}$$
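The bounds in Lemma 13.32 can be probed numerically. The sketch below samples random positive semi-definite operators and evaluates the ratio Tr[𝑃²_AB]/Tr[𝑃²_B]. As simplifying assumptions, it restricts to real symmetric matrices of the form 𝐺𝐺ᵀ, uses nested Python lists, and orders the tensor factors so that the row index is 𝑎·𝑑_B + 𝑏; all helper names are hypothetical.

```python
import random


def random_psd(dim, rng):
    """Random real positive semi-definite matrix G G^T as nested lists."""
    g = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]
    return [[sum(g[i][k] * g[j][k] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]


def partial_trace_a(p_ab, d_a, d_b):
    """Trace out the first tensor factor A of an operator on A (x) B."""
    return [[sum(p_ab[a * d_b + i][a * d_b + j] for a in range(d_a))
             for j in range(d_b)] for i in range(d_b)]


def trace_of_square(m):
    """Tr[M^2] for a real symmetric matrix: the sum of its squared entries."""
    return sum(x * x for row in m for x in row)


def purity_ratio(p_ab, d_a, d_b):
    """The ratio Tr[P_AB^2] / Tr[P_B^2] appearing in Lemma 13.32."""
    p_b = partial_trace_a(p_ab, d_a, d_b)
    return trace_of_square(p_ab) / trace_of_square(p_b)


rng = random.Random(0)
d_a, d_b = 2, 3
ratios = [purity_ratio(random_psd(d_a * d_b, rng), d_a, d_b) for _ in range(200)]
print(min(ratios), max(ratios))  # both should lie in [1/d_a, d_a]
```

Random sampling of this kind only illustrates the statement; it does not replace the proof.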
Chapter 14
Quantum Communication
In the previous chapter, we considered entanglement distillation, which is the task
of taking many copies of a mixed entangled state 𝜌 𝐴𝐵 shared by Alice and Bob
and transforming them to a maximally entangled state $\Phi_{\hat{A}\hat{B}}$ of Schmidt rank 𝑑 ≥ 2.
Using the quantum teleportation protocol, the maximally entangled state resulting
from entanglement distillation can be used for quantum communication, in the
sense that Alice can transfer an arbitrary state of log2 𝑑 qubits to Bob.
Now, if Alice and Bob are distantly separated, then how do they obtain many
copies of the shared entangled state 𝜌 𝐴𝐵 in the first place? Typically, one of the
parties, say Alice, prepares two quantum systems in an entangled state and sends
one of them through a quantum channel N 𝐴→𝐵 to Bob, thereby establishing the
shared entangled state. Rather than use the shared entangled state as the resource
for communication, it is more natural to use the quantum channel itself as the
resource, as it could in principle lead to better strategies and higher rates. This is
the scenario that we consider in this chapter.
Recall that in the case of classical communication from Chapter 12, we
considered messages from a set M, and the goal was to find upper and lower bounds
on the maximum number log2 |M| of transmitted bits over a quantum channel for a
given error 𝜀. Now, in the case of quantum communication, the goal is to transmit
a given number of qubits, rather than bits, for a given error 𝜀. Formally, suppose
that the sender, Alice, holds a quantum system 𝐴′ with dimension 𝑑 ≥ 1 that she
would like to transmit over the channel N to Bob, the receiver. In general, the
state of this system could be entangled with the state of some other system 𝑅 (of
arbitrary dimension) to which Alice does not have access, and so we suppose that
the joint state is a pure state Ψ𝑅 𝐴′ with Schmidt rank 𝑑. Note that, by the Schmidt
decomposition theorem (Theorem 2.2), the dimension of 𝑅 need not exceed the
dimension of 𝐴′, which is 𝑑. The goal is to determine the largest value of log2 𝑑
(which can be thought of as the number of qubits in the system 𝐴′) for which the
𝐴′ part of an arbitrary entangled state Ψ𝑅 𝐴′ can be transmitted with error at most
𝜀. This general quantum communication scenario is known as strong subspace
transmission. As usual, Alice and Bob are allowed local encoding and decoding
channels, respectively, to help with this task; see Figure 14.1 for a depiction of a
one-shot protocol for quantum communication. In the asymptotic setting, they are
also allowed as many uses of the channel N as desired. The quantum capacity of N,
denoted by 𝑄(N), is then the largest value of $\frac{1}{n}\log_2 d$ such that the 𝐴′ part of an
arbitrary pure state Ψ𝑅 𝐴′ can be transmitted to Bob with error that vanishes as the
number 𝑛 of channel uses increases.
Note that the notion of quantum communication presented above (strong
subspace transmission) is completely general and includes as special cases the
following information-processing tasks:
1. Entanglement transmission: Here, Alice’s system 𝐴′ is in the maximally
entangled state Φ 𝑅 𝐴′ with the reference system 𝑅, and the goal is to transmit
compute in general. We then find tractable upper bounds on the quantum capacity,
and for this purpose, the channel entanglement measures defined in Chapter 10
play an important role.
Remark: In a strong subspace transmission protocol, the goal is to transmit one share of a
pure state Ψ𝑅 𝐴′ , with corresponding state vector |Ψ⟩𝑅 𝐴′ . Note that the state vector |Ψ⟩𝑅 𝐴′ has a
Schmidt decomposition of the form
$$|\Psi\rangle_{RA'} = \sum_{x=1}^{d}\sqrt{p(x)}\,|\xi_x\rangle_R\otimes|\zeta_x\rangle_{A'}, \tag{14.1.1}$$
where $\{p(x)\}_{x=1}^{d}$ are the Schmidt coefficients and $\{|\xi_x\rangle_R\}_{x=1}^{d}$, $\{|\zeta_x\rangle_{A'}\}_{x=1}^{d}$ are orthonormal sets
of vectors for 𝑅 and 𝐴′ , respectively. When written in this form, the state vector |Ψ⟩𝑅 𝐴′ can be
understood as a coherent version of the initial state $\Phi^{p}_{MM'}$ for classical and entanglement-assisted
classical communication (see, e.g., (11.1.2)). The key difference in the classical-communication
case is that there is a fixed orthonormal basis {|𝑚⟩} 𝑚∈M corresponding to the messages 𝑚 in
the message set M. In quantum communication, the goal is to transmit a state of a quantum
system, which means that there is no particular basis used for communication. The encoding and
decoding channels should thus be defined so that they can reliably transmit states of the system
in an arbitrary basis.
The protocol proceeds as follows: we start with the entangled state Ψ𝑅 𝐴′ , where
the system 𝐴′ belongs to Alice and the system 𝑅 is an arbitrary reference system
inaccessible to Alice. Alice then sends the system 𝐴′ through the encoding channel
E 𝐴′ →𝐴 and sends 𝐴 through the channel N 𝐴→𝐵 . Once Bob receives the system
𝐵, he applies the decoding channel D𝐵→𝐵′ to it. The final state of the protocol is
¹ The quantum communication codes that we consider in this chapter are essentially equivalent
to codes for performing approximate quantum error correction. Please consult the Bibliographic
Notes in Section 14.5 for more information.
therefore
$$\omega_{RB'} := (\mathcal{D}_{B\to B'}\circ\mathcal{N}_{A\to B}\circ\mathcal{E}_{A'\to A})(\Psi_{RA'}). \tag{14.1.2}$$
Let us now quantify the reliability of the protocol described above, i.e., how
close the final state 𝜔 𝑅𝐵′ is to the initial state Ψ𝑅 𝐴′ . In Chapter 6, we discussed two
measures of closeness for states:
• Normalized trace distance, using which the distance between the initial and
final states is $\frac{1}{2}\|\Psi_{RA'} - \omega_{RB'}\|_1$. The lower the normalized trace distance, the
more reliable the protocol is.
• Fidelity, in which case we have
$$F(\Psi_{RA'}, \omega_{RB'}) = \left\|\sqrt{\Psi_{RA'}}\sqrt{\omega_{RB'}}\right\|_1^{2} = \langle\Psi|_{RA'}\,\omega_{RB'}\,|\Psi\rangle_{RA'} \tag{14.1.3}$$
as the closeness measure between the initial and final states of the protocol.
The higher the fidelity, the more reliable the protocol is.
These two measures of closeness are arguably equivalent to each other, in the
sense that one can be used to bound the other via the inequality (6.2.88) shown in
Theorem 6.14, which we restate here: for all states 𝜌 and 𝜎,
$$1 - \sqrt{F(\rho,\sigma)} \leq \frac{1}{2}\|\rho-\sigma\|_1 \leq \sqrt{1 - F(\rho,\sigma)}. \tag{14.1.4}$$
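To make the relation between the two measures concrete, the following Python sketch checks the inequality in (14.1.4) on randomly sampled qubit states. It relies on two standard single-qubit formulas in terms of Bloch vectors 𝑟 and 𝑠, which are assumptions of this sketch rather than results stated above: the normalized trace distance is |𝑟 − 𝑠|/2, and the fidelity is (1 + 𝑟·𝑠 + √((1 − |𝑟|²)(1 − |𝑠|²)))/2. The helper names are hypothetical.

```python
import math
import random


def random_bloch_vector(rng):
    """Sample a point of the unit ball, i.e., a valid qubit Bloch vector."""
    while True:
        v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if sum(x * x for x in v) <= 1.0:
            return v


def trace_distance(r, s):
    """Normalized trace distance (1/2)||rho - sigma||_1 of two qubit states."""
    return 0.5 * math.sqrt(sum((a - b) ** 2 for a, b in zip(r, s)))


def fidelity(r, s):
    """Fidelity ||sqrt(rho) sqrt(sigma)||_1^2 of two qubit states."""
    dot = sum(a * b for a, b in zip(r, s))
    nr = sum(a * a for a in r)
    ns = sum(b * b for b in s)
    return 0.5 * (1.0 + dot + math.sqrt(max(0.0, (1.0 - nr) * (1.0 - ns))))


def satisfies_bounds(r, s, tol=1e-9):
    """Check 1 - sqrt(F) <= trace distance <= sqrt(1 - F), up to rounding."""
    f = min(1.0, fidelity(r, s))
    t = trace_distance(r, s)
    return 1.0 - math.sqrt(f) <= t + tol and t <= math.sqrt(1.0 - f) + tol


rng = random.Random(7)
pairs = [(random_bloch_vector(rng), random_bloch_vector(rng)) for _ in range(1000)]
print(all(satisfies_bounds(r, s) for r, s in pairs))
```

For two orthogonal pure states, such as Bloch vectors (0, 0, 1) and (0, 0, −1), the fidelity is 0 and the trace distance is 1, so both inequalities hold with equality.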
Now, our figure of merit for the quantum communication protocol should not
be based on just one particular initial state, in this case Ψ𝑅 𝐴′ . Recall that the task of
quantum communication is to reliably transmit one share of an arbitrary pure state
through the channel N 𝐴→𝐵 . Intuitively, therefore, the closer the overall channel
D𝐵→𝐵′ ◦ N 𝐴→𝐵 ◦ E 𝐴′ →𝐴 is to the identity channel id 𝐴′ →𝐵′ , the better the code
(E, D) is at the quantum communication task, and so our figure of merit should
quantify this distance. One method to determine this distance is to calculate how
well a given code can transmit one share of a state in the worst case, i.e., by either
the highest value of the trace distance or by the lowest value of the fidelity. If a
code can be designed such that, in the worst case, the fidelity (trace distance) is
high (low), then by definition any other state will do just as well or better. We are
thus led to define the following two figures of merit:
1. Worst-case trace distance: We define this as
$$\sup_{\Psi_{RA'}}\frac{1}{2}\left\|\Psi_{RA'} - (\mathcal{D}_{B\to B'}\circ\mathcal{N}_{A\to B}\circ\mathcal{E}_{A'\to A})(\Psi_{RA'})\right\|_1. \tag{14.1.5}$$
Now, since this state is a particular state in the optimization in (14.1.9) for the error
probability 𝑝 ∗err (E, D; N), we conclude that
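$$\begin{aligned}
p^{*}_{\mathrm{err}}(\mathcal{E},\mathcal{D};\mathcal{N}) &\geq 1 - \langle\Phi|_{RA'}\,(\mathcal{D}_{B\to B'}\circ\mathcal{N}_{A\to B}\circ\mathcal{E}_{A'\to A})(\Phi_{RA'})\,|\Phi\rangle_{RA'} && (14.1.11)\\
&= 1 - F_e(\mathcal{D}_{B\to B'}\circ\mathcal{N}_{A\to B}\circ\mathcal{E}_{A'\to A}) && (14.1.12)\\
&=: p_{\mathrm{err}}(\mathcal{E},\mathcal{D};\mathcal{N}), && (14.1.13)
\end{aligned}$$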
where in the second line we have identified the entanglement fidelity of the channel
D𝐵→𝐵′ ◦ N 𝐴→𝐵 ◦ E 𝐴′ →𝐴 , as stated in Definition 6.21. In the last line, we have
defined the quantity 𝑝 err (E, D; N). As the notation suggests, this quantity is a
quantum analogue of the average error probability for classical and entanglement-
assisted classical communication. In classical and entanglement-assisted classical
communication, the average error probability corresponds to taking a uniform
distribution over the messages being sent. Similarly, in quantum communication,
the average error probability can be thought of as taking a uniform distribution
for the Schmidt coefficients in (14.1.1), which by definition gives a maximally
entangled state.
Another way of writing the average error probability for a quantum communica-
tion code is via what is known as the entanglement test, which we introduced in the
previous chapter. It is analogous to the comparator test that we defined in Chapters
11 and 12 in the context of classical communication. The entanglement test is
defined by the POVM {Φ 𝑅𝐵′ , 1 𝑅𝐵′ − Φ 𝑅𝐵′ }. The outcomes of the entanglement
test tell us whether the state being measured is the maximally entangled state Φ 𝑅𝐵′ .
Since the state Φ 𝑅𝐵′ is pure, using (6.2.2), the probability that the state 𝜔 𝑅𝐵′ at the
end of the protocol is in the maximally entangled state, i.e., the probability that the
state “passes the entanglement test,” is
Tr[Φ 𝑅𝐵′ 𝜔 𝑅𝐵′ ] = ⟨Φ| 𝑅𝐵′ 𝜔 𝑅𝐵′ |Φ⟩ 𝑅𝐵′ (14.1.14)
= 𝐹 (Φ 𝑅𝐵′ , 𝜔 𝑅𝐵′ ) (14.1.15)
= 1 − 𝑝 err (E, D; N). (14.1.16)
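As an illustration of the entanglement test, the Python sketch below computes the pass probability Tr[Φ𝜔] for a simple output state. It assumes, purely for the sake of example, that the overall channel D ◦ N ◦ E acts as a depolarizing channel 𝜌 ↦ (1 − 𝑝)𝜌 + 𝑝 Tr[𝜌] 𝜋, so that the output state is 𝜔 = (1 − 𝑝)Φ + 𝑝 1/𝑑²; the helper names and the parameter 𝑝 are hypothetical.

```python
def maximally_entangled_projector(d):
    """Matrix of Phi = |Phi><Phi| with |Phi> = d^(-1/2) * sum_i |i>|i>."""
    dim = d * d
    phi = [[0.0] * dim for _ in range(dim)]
    for i in range(d):
        for j in range(d):
            phi[i * d + i][j * d + j] = 1.0 / d
    return phi


def depolarized_output(d, p):
    """omega = (1 - p) * Phi + p * identity / d^2 (assumed composite channel)."""
    phi = maximally_entangled_projector(d)
    dim = d * d
    return [[(1.0 - p) * phi[r][c] + (p / dim if r == c else 0.0)
             for c in range(dim)] for r in range(dim)]


def pass_probability(phi, omega):
    """Probability Tr[Phi omega] of passing the entanglement test."""
    n = len(phi)
    return sum(phi[r][c] * omega[c][r] for r in range(n) for c in range(n))


d, p = 2, 0.2
phi = maximally_entangled_projector(d)
omega = depolarized_output(d, p)
p_pass = pass_probability(phi, omega)
p_err = 1.0 - p_pass
print(p_pass, p_err)
```

Under this assumption the pass probability is (1 − 𝑝) + 𝑝/𝑑², so the average error probability is 𝑝(1 − 1/𝑑²).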
In addition to finding, for a given 𝜀 ∈ (0, 1], the maximum number of transmitted
qubits among all (𝑑, 𝜀) quantum communication protocols over N 𝐴→𝐵 , we can
consider the following complementary problem: for a given dimension 𝑑 ≥ 1, find
the smallest possible error among all (𝑑, 𝜀) quantum communication protocols for
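N 𝐴→𝐵 , which we denote by $\varepsilon_Q^{*}(d;\mathcal{N})$. In other words, the complementary problem
is to determine
$$\varepsilon_Q^{*}(d;\mathcal{N}) := \inf_{\mathcal{E},\mathcal{D}}\left\{p^{*}_{\mathrm{err}}(\mathcal{E},\mathcal{D};\mathcal{N}) : d_{A'} = d_{B'} = d\right\}, \tag{14.1.18}$$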
where the optimization is with respect to all encoding channels E 𝐴′ →𝐴 and decoding
channels D𝐵→𝐵′ such that 𝑑 𝐴′ = 𝑑 𝐵′ = 𝑑. In this book, we focus primarily on the
problem of optimizing the number of transmitted qubits rather than the error, and
so our primary quantity of interest is the one-shot quantum capacity 𝑄 𝜀 (N).
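[Figure 14.2: The quantum communication protocol with the channel replaced by the replacement channel for the state 𝜎𝐵 . Alice applies the encoding E to 𝐴′, and Bob applies the decoding D to 𝐵 to obtain 𝐵′.]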
Consider now the same protocol but over the useless channel depicted in Figure
14.2. This useless channel is exactly the same as the one considered in Chapters 11
and 12; namely, it is the replacement channel for some state 𝜎𝐵 . For the initial state
Φ 𝑅 𝐴′ , the state at the end of the protocol for the replacement channel is
This latter bound is the one that we employ in this chapter because it leads to a
formula for the quantum capacity of some channels of interest in applications. This
inequality tells us that, given an arbitrary (𝑑, 𝜀) quantum communication protocol
with corresponding code (E, D), the 𝜀-hypothesis testing coherent information
$I_H^{\varepsilon}(R\rangle B')_{\rho}$, with 𝜌 𝑅𝐵′ given by (14.1.20), is an upper bound on the maximum
number of qubits that can be transmitted over the channel with error at most 𝜀.
Note that a different choice for the encoding and decoding generally produces a
different value for the upper bound. We would like an upper bound that applies
regardless of the specific protocol. In other words, we would like an upper bound
that is a function of the channel N 𝐴→𝐵 only.
Proof: Let E and D be the encoding and decoding channels, respectively, for a
(𝑑, 𝜀) quantum communication protocol for N. Then, by (14.1.23), we have that
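$$\begin{aligned}
I_H^{\varepsilon}(R\rangle B')_{\rho} &= \inf_{\sigma_{B'}} D_H^{\varepsilon}(\rho_{RB'}\,\Vert\,\mathbb{1}_R\otimes\sigma_{B'}) && (14.1.27)\\
&\leq \inf_{\tau_B} D_H^{\varepsilon}\big((\mathcal{D}_{B\to B'}\circ\mathcal{N}_{A\to B}\circ\mathcal{E}_{A'\to A})(\Phi_{RA'})\,\Vert\,\mathbb{1}_R\otimes\mathcal{D}_{B\to B'}(\tau_B)\big) && (14.1.28)\\
&\leq \inf_{\tau_B} D_H^{\varepsilon}(\mathcal{N}_{A\to B}(\rho_{RA})\,\Vert\,\mathbb{1}_R\otimes\tau_B) && (14.1.29)
\end{aligned}$$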
where the second inequality follows from the data-processing inequality for hy-
pothesis testing relative entropy and we let 𝜌 𝑅 𝐴 B E 𝐴′ →𝐴 (Φ 𝑅 𝐴′ ). We now take
the supremum over every state 𝜌 𝑅 𝐴 , which effectively corresponds to taking the
supremum over all encoding channels, and since it suffices to consider only pure
states when optimizing the coherent information (see the arguments after Definition
7.85), we conclude that
as required. ■
Corollary 14.4
Let N 𝐴→𝐵 be a quantum channel, and let 𝜀 ∈ [0, 1). For all (𝑑, 𝜀) quantum
communication protocols for N, the following bounds hold:
Having derived upper bounds on the number of transmitted qubits for an arbitrary
quantum communication protocol, let us now determine a lower bound on the
number of transmitted qubits. As with the other communication scenarios that we
have considered so far, in order to obtain a lower bound on the number of qubits that
can be transmitted, we need to devise an explicit (𝑑, 𝜀) quantum communication
protocol for all 𝜀 ∈ (0, 1). The protocol we consider is based on the one-shot,
one-way entanglement distillation protocol from Proposition 13.10 in Chapter 13,
which establishes that, for an arbitrary bipartite state 𝜌 𝐴𝐵 and for all 𝜀 ∈ (0, 1] and
$\eta\in[0,\sqrt{\varepsilon})$, there exists a (𝑑, 𝜀) one-way entanglement distillation protocol for
𝜌 𝐴𝐵 such that
$$\log_2 d = -H_{\max}^{\sqrt{\varepsilon}-\eta}(A|B)_{\rho} + 4\log_2\eta. \tag{14.1.37}$$
The goal in this section is to show that entanglement distillation can be used
to develop a quantum communication strategy. Specifically, we show that the
existence of a (𝑑, 𝜀) one-way entanglement distillation protocol for the bipartite state
𝜔 𝐴𝐵 = N 𝐴′ →𝐵 (𝜓 𝐴𝐴′ ) implies the existence of a (𝑑 ′, 𝜀′) quantum communication
protocol, with 𝑑 ′ and 𝜀′ being functions of 𝑑 and 𝜀. The claim is as follows:
The first step in the proof of Theorem 14.5 is to observe that one-way entan-
glement distillation is an example of entanglement generation, albeit with forward
(i.e., sender to receiver) classical communication, which we introduced at the
beginning of this chapter and formally define below. We then show that forward
classical communication does not help for entanglement generation, even in the
non-asymptotic setting. One-way entanglement distillation thus implies entangle-
ment generation. We then show that entanglement generation implies entanglement
transmission, which we defined at the beginning of this chapter. Finally, we show
that entanglement transmission implies quantum communication.
Before proceeding with the proof of Theorem 14.5, let us formally define entan-
glement generation (with and without one-way LOCC assistance) and entanglement
transmission.
• Entanglement generation: An entanglement generation protocol for N 𝐴→𝐵 is
defined by the three elements (𝑑, Ψ𝐴′ 𝐴 , D𝐵→𝐵′ ), where Ψ𝐴′ 𝐴 is a pure state
with 𝑑 𝐴′ = 𝑑, and D𝐵→𝐵′ is a decoding channel with 𝑑 𝐵′ = 𝑑. The goal of the
protocol is to transmit the system 𝐴 such that the final state
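$$\begin{aligned}
p^{(\mathrm{EG})}_{\mathrm{err}}(\Psi_{A'A},\mathcal{D};\mathcal{N}) &:= 1 - \langle\Phi|_{A'B'}\,\sigma_{A'B'}\,|\Phi\rangle_{A'B'} && (14.1.41)\\
&= 1 - F(\Phi_{A'B'},\sigma_{A'B'}). && (14.1.42)
\end{aligned}$$
We call the protocol (𝑑, Ψ𝐴′ 𝐴 , D𝐵→𝐵′ ) a (𝑑, 𝜀) protocol, with 𝜀 ∈ [0, 1], if
$p^{(\mathrm{EG})}_{\mathrm{err}}(\Psi_{A'A},\mathcal{D};\mathcal{N}) \leq \varepsilon$.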
Note that an entanglement generation protocol (𝑑, Ψ𝐴′ 𝐴 , D𝐵→𝐵′ ) over N 𝐴→𝐵
is an example of an entanglement distillation protocol $(d, \mathcal{L}_{AB\to\hat{A}\hat{B}})$ for the
state 𝜌 𝐴′ 𝐵 = N 𝐴→𝐵 (Ψ𝐴′ 𝐴 ), with $\hat{A}\equiv A'$, $\hat{B}\equiv B'$, and $\mathcal{L}_{AB\to\hat{A}\hat{B}}\equiv\mathcal{D}_{B\to B'}$.
• Entanglement generation assisted by one-way LOCC: An entanglement gener-
ation protocol for N 𝐴→𝐵 assisted by one-way LOCC from 𝐴 to 𝐵 is defined
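by $(d, \Psi_{A'A}, \{\mathcal{E}^{x}_{A'A\to A'A}\}_{x}, \{\mathcal{D}^{x}_{B\to B'}\}_{x})$, where 𝑑 ≥ 1, Ψ𝐴′ 𝐴 is a pure state with
𝑑 𝐴′ = 𝑑, $\{\mathcal{E}^{x}_{A'A\to A'A}\}_{x\in\mathcal{X}}$ is a set of completely positive maps indexed by a finite
alphabet X such that $\sum_{x\in\mathcal{X}}\mathcal{E}^{x}_{A'A\to A'A}$ is trace preserving, and $\{\mathcal{D}^{x}_{B\to B'}\}_{x\in\mathcal{X}}$ is a
set of quantum channels indexed by X, with 𝑑 𝐵′ = 𝑑. The goal of the protocol
is to transmit the system 𝐴 such that the final state
$$\sigma^{\to}_{A'B'} := \sum_{x\in\mathcal{X}}(\mathcal{D}^{x}_{B\to B'}\circ\mathcal{N}_{A\to B}\circ\mathcal{E}^{x}_{A'A\to A'A})(\Psi_{A'A}) \tag{14.1.43}$$
We call the protocol $(d, \Psi_{A'A}, \{\mathcal{E}^{x}_{A'A\to A'A}\}_{x}, \{\mathcal{D}^{x}_{B\to B'}\}_{x})$ a (𝑑, 𝜀) protocol, with
𝜀 ∈ [0, 1], if $p^{(\mathrm{EG}),\to}_{\mathrm{err}}(\Psi_{A'A},\{\mathcal{E}^{x}\}_{x},\{\mathcal{D}^{x}\}_{x};\mathcal{N}) \leq \varepsilon$.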
We start by showing that an arbitrary entanglement distillation protocol for the state
N 𝐴→𝐵 (𝜓 𝐴′ 𝐴 ), with 𝜓 𝐴′ 𝐴 a pure state, has the same performance parameters as an
entanglement generation protocol with one-way LOCC assistance.
Consider an arbitrary (𝑑, 𝜀) entanglement distillation protocol for N 𝐴→𝐵 (𝜓 𝐴′ 𝐴 )
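given by a one-way LOCC channel $\mathcal{L}_{A'B\to\hat{A}\hat{B}}$, with $d_{\hat{A}} = d_{\hat{B}} = d$. In general,
this LOCC channel has the form $\mathcal{L}_{A'B\to\hat{A}\hat{B}} = \sum_{x\in\mathcal{X}}\mathcal{E}^{x}_{A'\to\hat{A}}\otimes\mathcal{D}^{x}_{B\to\hat{B}}$, where X is
some finite alphabet, $\{\mathcal{E}^{x}_{A'\to\hat{A}}\}_{x\in\mathcal{X}}$ is a set of completely positive maps such that
$\sum_{x\in\mathcal{X}}\mathcal{E}^{x}_{A'\to\hat{A}}$ is trace preserving, and $\{\mathcal{D}^{x}_{B\to\hat{B}}\}_{x\in\mathcal{X}}$ is a set of channels. The output
$$\sum_{x\in\mathcal{X}}(\mathcal{E}^{x}_{A'\to\hat{A}}\otimes\mathcal{D}^{x}_{B\to\hat{B}})(\mathcal{N}_{A\to B}(\psi_{A'A}))
= \sum_{x\in\mathcal{X}}(\mathcal{D}^{x}_{B\to\hat{B}}\circ\mathcal{N}_{A\to B}\circ\mathcal{E}^{x}_{A'\to\hat{A}})(\psi_{A'A}), \tag{14.1.48}$$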
which has the form of a state at the output of an entanglement generation protocol
with one-way LOCC assistance. We thus have that a (𝑑, 𝜀) entanglement distillation
protocol for N 𝐴→𝐵 (𝜓 𝐴′ 𝐴 ) is equivalent to a (𝑑, 𝜀) entanglement generation protocol
for N with one-way LOCC assistance. We now show that one-way LOCC assistance
does not help for entanglement generation.
Lemma 14.6
Given a (𝑑, 𝜀) entanglement generation protocol for a channel N, assisted by
one-way LOCC, with 𝑑 ≥ 1 and 𝜀 ∈ [0, 1], there exists a (𝑑, 𝜀) entanglement
generation protocol for N (without one-way LOCC assistance).
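$$\begin{aligned}
\operatorname{Tr}[\Phi_{A'B'}\,\sigma^{\to}_{A'B'}] &= \sum_{x\in\mathcal{X}}\sum_{k=1}^{r_x}p(x)\,q(k|x)\operatorname{Tr}[\Phi_{A'B'}\,\sigma^{x,k}_{A'B'}] && (14.1.56)\\
&\leq \max_{x\in\mathcal{X},\,1\leq k\leq r_x}\operatorname{Tr}[\Phi_{A'B'}\,\sigma^{x,k}_{A'B'}] && (14.1.57)\\
&= \max_{x\in\mathcal{X},\,1\leq k\leq r_x}\operatorname{Tr}[\Phi_{A'B'}\,(\mathcal{D}^{x}_{B\to B'}\circ\mathcal{N}_{A\to B})(\phi^{x,k}_{A'A})]. && (14.1.58)
\end{aligned}$$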
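The step from (14.1.56) to (14.1.57) is the elementary fact that a convex combination never exceeds its largest term, so the best single branch (𝑥, 𝑘) performs at least as well as the mixture. The following Python sketch illustrates this with made-up numbers; the branch labels, weights 𝑝(𝑥)𝑞(𝑘|𝑥), and fidelities are hypothetical values chosen only for the example.

```python
def mixture_fidelity(branches):
    """Sum of weight * fidelity over all branches, as in (14.1.56)."""
    return sum(weight * fid for weight, fid in branches.values())


def best_branch(branches):
    """Label and fidelity of the branch attaining the maximum in (14.1.57)."""
    label = max(branches, key=lambda key: branches[key][1])
    return label, branches[label][1]


# Each entry maps a label (x, k) to (p(x) * q(k|x), Tr[Phi sigma^{x,k}]).
branches = {
    ("x0", 1): (0.5 * 0.6, 0.91),
    ("x0", 2): (0.5 * 0.4, 0.80),
    ("x1", 1): (0.5 * 1.0, 0.95),
}

average = mixture_fidelity(branches)
label, best = best_branch(branches)
print(average, label, best)
```

Selecting the maximizing branch corresponds to fixing one input state and one decoding channel, which is why no classical communication is needed in the resulting protocol.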
Proof: Let (𝑑, Ψ𝐴′ 𝐴 , D𝐵→𝐵′ ) be the elements of a (𝑑, 𝜀) entanglement generation
protocol for N 𝐴→𝐵 , with 𝑑 𝐴′ = 𝑑 𝐵′ = 𝑑. This implies that the output state
satisfies
𝐹 (Φ 𝐴′ 𝐵′ , 𝜎𝐴′ 𝐵′ ) ≥ 1 − 𝜀. (14.1.61)
We now construct an entanglement transmission protocol. To this end, let 𝐴′ ≡ 𝑅
be a reference system inaccessible to both Alice and Bob. By the data-processing
inequality for fidelity (Theorem 6.9) with respect to the partial trace channel Tr 𝐵′ ,
we have that
𝐹 (Φ 𝑅 , Ψ𝑅 ) = 𝐹 (U 𝐴′ →𝐴 (Φ 𝑅 𝐴′ ), Ψ𝑅 𝐴′ ) ≥ 1 − 𝜀. (14.1.63)
We let this isometric channel U 𝐴′ →𝐴 be the encoding channel for the entanglement
transmission protocol, and we let
Next, using the sine distance (Definition 6.16), by definition of the (𝑑, 𝜀) entan-
glement generation protocol, we have that $P(\Phi_{RB'},\sigma_{RB'}) \leq \sqrt{\varepsilon}$. Similarly, from
(14.1.63) we have that
$$P(\Psi_{RA},\,\mathcal{U}_{A'\to A}(\Phi_{RA'})) \leq \sqrt{\varepsilon}. \tag{14.1.65}$$
Therefore, by the triangle inequality for the sine distance (Lemma 6.17), we
conclude that
where the second inequality follows from the data-processing inequality for sine
distance (see (6.2.114)) and (14.1.65) to see that
Proof: Suppose that a (𝑑, 𝜀) entanglement transmission code for N 𝐴→𝐵 exists,
and let E 𝐴′ →𝐴 and D𝐵→𝐵′ be the corresponding encoding and decoding channels,
respectively, with 𝑑 𝐴′ = 𝑑 𝐵′ = 𝑑. The condition 𝑝 err (E, D; N) ≤ 𝜀 then holds,
namely,
1 − Tr[Φ 𝑅𝐵′ 𝜔 𝑅𝐵′ ] ≤ 𝜀, (14.1.74)
where
𝜔 𝑅𝐵′ = (D𝐵→𝐵′ ◦ N 𝐴→𝐵 ◦ E 𝐴′ →𝐴 )(Φ 𝑅 𝐴′ ). (14.1.75)
Let
C 𝐴′ →𝐵′ B D𝐵→𝐵′ ◦ N 𝐴→𝐵 ◦ E 𝐴′ →𝐴 . (14.1.76)
We proceed with the following algorithm:
1. Set 𝑘 = 𝑑 and H𝑑 = H 𝐴′ . Suppose for now that (1 − 𝛿)𝑑 is a positive integer.
2. Set |𝜙 𝑘 ⟩ ∈ H𝑘 to be a state vector that achieves the minimum fidelity of C 𝐴′ →𝐵′ :
3. Set
H𝑘−1 B span{|𝜓⟩ ∈ H𝑘 : |⟨𝜓|𝜙 𝑘 ⟩| = 0}. (14.1.80)
That is, H𝑘−1 is set to the orthogonal complement of |𝜙 𝑘 ⟩ in H𝑘 , so that
H𝑘 = H𝑘−1 ⊕ span{|𝜙 𝑘 ⟩}. Set 𝑘 → 𝑘 − 1.
4. Repeat steps 2-3 until 𝑘 = (1 − 𝛿)𝑑 after step 3.
The idea behind this algorithm is to successively remove minimum fidelity
states from H 𝐴′ until 𝑘 = (1 − 𝛿)𝑑. By the structure of the algorithm and some
analysis given below, we are then guaranteed, for this 𝑘 and lower, that
That is, the subspace H𝑘 is good for quantum communication of states at the channel
input with fidelity at least 1 − 𝜀/𝛿 (to be precise, the subspace H𝑘 is good for
subspace transmission as defined in the introduction of this chapter). Furthermore,
the algorithm implies that
Also, $\{|\phi_k\rangle\}_{k=1}^{\ell}$ is an orthonormal basis for Hℓ , where ℓ ∈ {1, . . . , 𝑑}. Note that
the unit vectors |𝜙 𝑘 ⟩, 𝑘 ∈ {(1 − 𝛿)𝑑 − 1, . . . , 1} can be generated by repeating the
algorithm above exhaustively.
We now analyze the claims above by employing Markov’s inequality and some
other tools. From (14.1.74), we have that
$$|\Phi\rangle_{RA'} = \frac{1}{\sqrt{d}}\sum_{k=1}^{d}|\overline{\phi_k}\rangle_R\otimes|\phi_k\rangle_{A'}, \tag{14.1.85}$$
where complex conjugation is taken with respect to the basis $\{|i\rangle\}_{i=0}^{d-1}$ used in
So this implies that (1 − 𝛿)𝑑 of the 𝐹𝑘 values are such that 𝐹𝑘 ≥ 1 − 𝜀/𝛿. Since they
are ordered as given in (14.1.82), we conclude that H (1−𝛿)𝑑 , which by definition
has dimension (1 − 𝛿)𝑑, is a good subspace for quantum communication in the
following sense (subspace transmission):
i.e., $p^{*}_{\mathrm{err}}(\mathcal{E},\mathcal{D};\mathcal{N}) \leq 2\sqrt{\varepsilon/\delta}$, which is the criterion for strong subspace transmission
(the strongest notion of quantum communication).
To finish off the proof, suppose that (1 − 𝛿)𝑑 is not an integer. Then there exists a
𝛿′ > 𝛿 such that (1 − 𝛿′)𝑑 = ⌊(1 − 𝛿)𝑑⌋ is a positive integer. By the above reasoning,
there exists a code satisfying (14.1.91), except with 𝛿 replaced by 𝛿′, and with the
code dimension equal to ⌊(1 − 𝛿)𝑑⌋. We also have that $1 - 2\sqrt{\varepsilon/\delta'} > 1 - 2\sqrt{\varepsilon/\delta}$.
This concludes the proof. ■
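The counting step in the proof above rests on Markov's inequality: if the values 𝐹𝑘 average to 1 − 𝜀, then fewer than a fraction 𝛿 of them can fall below 1 − 𝜀/𝛿, so at least (1 − 𝛿)𝑑 of them satisfy 𝐹𝑘 ≥ 1 − 𝜀/𝛿. The Python sketch below illustrates this with randomly generated numbers that merely stand in for the fidelities 𝐹𝑘; the sampling distribution, the seed, and the function name are assumptions of the example.

```python
import random


def good_count(fidelities, eps, delta):
    """Number of values F_k that satisfy F_k >= 1 - eps / delta."""
    threshold = 1.0 - eps / delta
    return sum(1 for f in fidelities if f >= threshold)


rng = random.Random(1)
d = 64
# Stand-in values in [0.8, 1]; in the proof these are minimum fidelities.
fidelities = [1.0 - 0.2 * rng.random() ** 3 for _ in range(d)]

# Choose eps so that the average of the F_k equals 1 - eps.
eps = 1.0 - sum(fidelities) / d
delta = 0.25

good = good_count(fidelities, eps, delta)
print(good, ">=", (1.0 - delta) * d)
```

Decreasing 𝛿 keeps a larger subspace but weakens the guaranteed fidelity 1 − 𝜀/𝛿, which is the trade-off visible in the final parameters of the code.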
We now return to the proof of Theorem 14.5. To finish it off, we combine the
results of Lemmas 14.6, 14.7, and 14.8 to conclude that the existence of a (𝑑, 𝜀)
entanglement distillation protocol for 𝜔 𝐴𝐵 = N 𝐴′ →𝐵 (𝜓 𝐴𝐴′ ) implies the existence
of a (𝑑 ′, 𝜀′) quantum communication protocol, where
$$d' = (1-\delta)d, \qquad \varepsilon' = 4\sqrt{\frac{\varepsilon}{\delta}}, \qquad \delta\in(0,1). \tag{14.1.92}$$
Recalling that 𝑑 is given by (14.1.37), we conclude that
$$\log_2 d' = -H_{\max}^{\frac{\varepsilon'\sqrt{\delta}}{4}-\eta}(A|B)_{\omega} + \log_2(\eta^{4}(1-\delta)). \tag{14.1.93}$$
Then, since the pure state 𝜓 𝐴𝐴′ used in (14.1.93) is arbitrary, we conclude that there
exists a (𝑑 ′, 𝜀′) quantum communication protocol satisfying
$$\log_2 d' = \sup_{\psi_{AA'}} -H_{\max}^{\frac{\varepsilon'\sqrt{\delta}}{4}-\eta}(A|B)_{\omega} + \log_2(\eta^{4}(1-\delta)) \tag{14.1.94}$$
for all $\eta\in[0,\varepsilon'\sqrt{\delta}/4)$ and 𝛿 ∈ (0, 1). This is precisely the statement in (14.1.38),
and so the proof of Theorem 14.5 is complete.
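The parameter bookkeeping in (14.1.92) can be summarized in a few lines of Python. The sketch below maps the parameters (𝑑, 𝜀) of a distillation protocol to the parameters (𝑑′, 𝜀′) of the resulting communication protocol, and inverts the relation to find the distillation error needed for a target 𝜀′. The function names are hypothetical, and the floor reflects the integer code dimension ⌊(1 − 𝛿)𝑑⌋ from the proof of Lemma 14.8 rather than the expression (1 − 𝛿)𝑑 itself.

```python
import math


def communication_parameters(d, eps, delta):
    """Map a (d, eps) distillation protocol to (d', eps') as in (14.1.92)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    d_prime = math.floor((1.0 - delta) * d)
    eps_prime = 4.0 * math.sqrt(eps / delta)
    return d_prime, eps_prime


def required_distillation_error(eps_prime, delta):
    """Invert eps' = 4 * sqrt(eps / delta) to obtain eps."""
    return delta * eps_prime ** 2 / 16.0


d_prime, eps_prime = communication_parameters(d=1024, eps=1e-4, delta=0.25)
eps_back = required_distillation_error(eps_prime, 0.25)
print(d_prime, eps_prime, eps_back)
```

The inversion gives √𝜀 = 𝜀′√𝛿/4, which is the smoothing radius that appears in (14.1.93) before 𝜂 is subtracted.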
Applying the relation between smooth conditional min- and max-entropy in
(13.1.72) to the result of Theorem 14.5, and combining it with (7.8.83), we obtain
the following.
Corollary 14.9
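Let N 𝐴→𝐵 be a quantum channel, and let $\mathcal{V}^{\mathcal{N}}_{A\to BE}$ be an isometric channel
extending N 𝐴→𝐵 . For all 𝜀 ∈ (0, 1), 𝛿 ∈ (0, 1), $\eta\in[0,\varepsilon\sqrt{\delta}/4)$, and 𝛼 > 1, there
exists a (𝑑, 𝜀) quantum communication protocol for N with
$$\log_2 d \geq \sup_{\psi_{AA'}}\widetilde{H}_{\alpha}(A|E)_{\phi} - \frac{1}{\alpha-1}\log_2\!\left(\frac{1}{f(\varepsilon,\delta,\eta)}\right) - \log_2\!\left(\frac{1}{1-f(\varepsilon,\delta,\eta)}\right) + \log_2(\eta^{4}(1-\delta)), \tag{14.1.95}$$
where $f(\varepsilon,\delta,\eta) := \left(\frac{\varepsilon\sqrt{\delta}}{4}-\eta\right)^{2}$, $\phi_{AE} = \operatorname{Tr}_B[\mathcal{V}^{\mathcal{N}}_{A'\to BE}(\psi_{AA'})] = \mathcal{N}^{c}_{A'\to E}(\psi_{AA'})$,
and 𝜓 𝐴𝐴′ is a pure state with the dimension of 𝐴′ equal to the dimension of 𝐴.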
Since the inequality in (14.1.95) holds for all (𝑑, 𝜀) quantum communication
protocols, we have that
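\[ Q^{\varepsilon}(\mathcal{N}) \ge \sup_{\psi_{AA'}} \widetilde{H}_\alpha(A|E)_\phi - \frac{1}{\alpha-1}\log_2\!\left(\frac{1}{f(\varepsilon,\delta,\eta)}\right) - \log_2\!\left(\frac{1}{1-f(\varepsilon,\delta,\eta)}\right) + \log_2(\eta^4(1-\delta)), \tag{14.1.96} \]
where
\[ \phi_{AE} = \mathrm{Tr}_B[\mathcal{V}^{\mathcal{N}}_{A'\to BE}(\psi_{AA'})] = \mathcal{N}^c_{A'\to E}(\psi_{AA'}), \tag{14.1.97} \]
$f(\varepsilon,\delta,\eta)$ is defined just above, $\eta \in [0, \varepsilon\sqrt{\delta}/4)$, $\delta \in (0,1)$, and $\alpha > 1$.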
To summarize what we did in this section, we used the result from Proposition 13.10
on one-shot entanglement distillation to prove the existence of a quantum commu-
nication protocol in the one-shot setting. Note that the entanglement distillation
protocol of Proposition 13.10 involves one-way classical communication, while
quantum communication (as we defined it at the beginning of this chapter) does
not. In other words, in this section we managed to remove the one-way classical
communication from the entanglement distillation protocol and thereby argue for
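[Figure: quantum communication protocol over $n$ uses of the channel $\mathcal{N}$. Alice's encoding $\mathcal{E}$ maps $A_0$, initially in the state $\Psi_{RA_0}$, to $A_1, \dots, A_n$; each $A_i$ is sent through $\mathcal{N}$ to $B_i$; Bob's decoding $\mathcal{D}$ maps $B_1, \dots, B_n$ to $B_0$, producing the state $\omega_{RB_0}$.]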
Analysis of the asymptotic setting is almost exactly the same as that of the
one-shot setting. This is due to the fact that 𝑛 independent uses of the channel
N can be regarded as a single use of the channel N ⊗𝑛 . So the only change that
needs to be made is to replace N with N ⊗𝑛 and to define the encoding and decoding
channels as acting on 𝑛 systems instead of just one. In particular, the state at the
end of the protocol becomes
Then, just as in the one-shot setting, we define the error probability of the code
(E, D) for 𝑛 independent uses of N as
where in the second equality we use the definition of the one-shot quantum capacity
𝑄 𝜀 given in (14.1.17), and the supremum is over all 𝑑 ≥ 1, encoding channels
E with input system dimension 𝑑, and decoding channels D with output system
dimension 𝑑.
As we prove in Appendix A,
\[ R \text{ achievable rate} \iff \lim_{n\to\infty} \varepsilon^*_Q(2^{n(R-\delta)}; \mathcal{N}^{\otimes n}) = 0 \quad \forall\, \delta > 0. \tag{14.2.5} \]
In other words, a rate 𝑅 is achievable if for all 𝛿 > 0, the optimal error probability
for a sequence of protocols with rate 𝑅 − 𝛿 vanishes as the number 𝑛 of uses of N
increases.
In other words, a weak converse rate is a rate for which the optimal error probability
cannot be made to vanish, even in the limit of a large number of channel uses.
Unlike the weak converse, in which the optimal error is required to simply be
bounded away from zero as the number 𝑛 of channel uses increases, in order to
have a strong converse rate, the optimal error has to converge to one as 𝑛 increases.
By comparing (14.2.8) and (14.2.9), we conclude that every strong converse rate is
a weak converse rate.
\[ \widetilde{Q}(\mathcal{N}) := \inf\{R : R \text{ is a strong converse rate for } \mathcal{N}\} \tag{14.2.10} \]
\[ Q(\mathcal{N}) \le \widetilde{Q}(\mathcal{N}) \tag{14.2.12} \]
\[ Q(\mathcal{N}) = I^c_{\mathrm{reg}}(\mathcal{N}) := \lim_{n\to\infty} \frac{1}{n} I^c(\mathcal{N}^{\otimes n}). \tag{14.2.13} \]
quantum capacity. One way of attempting to prove that the coherent information of
a channel is equal to its strong converse quantum capacity involves proving that the
sandwiched Rényi coherent information is additive for the channel. Unfortunately,
this quantity has not been shown to be additive for any quantum channel thus far,
which means that this approach is not known to be useful for obtaining strong
converse quantum capacities. We consider another approach to strong converse
quantum capacities in Section 14.2.4, which leads to a strong converse theorem
for dephasing channels (see (4.5.35)). In terms of a general statement about the
converse, the best we can say generally is that the regularized coherent information
is a weak converse rate for all quantum channels.
There are two ingredients to the proof of Theorem 14.16:
1. Achievability: We show that $I^c_{\mathrm{reg}}(\mathcal{N})$ is an achievable rate, which involves
2. Weak Converse: We show that $I^c_{\mathrm{reg}}(\mathcal{N})$ is a weak converse rate, from which it
follows that $Q(\mathcal{N}) \le I^c_{\mathrm{reg}}(\mathcal{N})$. To show that $I^c_{\mathrm{reg}}(\mathcal{N})$ is a weak converse rate,
we use the one-shot upper bounds from Section 14.1.2 to conclude that every
achievable rate $R$ satisfies $R \le I^c_{\mathrm{reg}}(\mathcal{N})$.
We first establish in Section 14.2.1 that the quantity $I^c_{\mathrm{reg}}(\mathcal{N})$ is an achievable
rate for quantum communication over $\mathcal{N}$. Then, in Section 14.2.2, we prove that
$I^c_{\mathrm{reg}}(\mathcal{N})$ is a weak converse rate.
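\[ \widetilde{H}_\alpha(A|E)_\phi = -\inf_{\sigma_E} \widetilde{D}_\alpha(\phi_{AE} \,\|\, \mathbb{1}_A \otimes \sigma_E), \tag{14.2.17} \]
where $\phi_{AE} = \mathcal{N}^c_{A'\to E}(\psi_{AA'})$ and $\psi_{AA'}$ is a pure state with the dimension of $A'$
equal to the dimension of $A$.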
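\[ \widetilde{H}_\alpha(A^n|E^n)_{\phi^{\otimes n}} \ge n\, \widetilde{H}_\alpha(A|E)_\phi. \tag{14.2.20} \]
In other words,
\[ \sup_{\Psi_{A^n A'^n}} \widetilde{H}_\alpha(A^n|E^n)_\Phi \ge \widetilde{H}_\alpha(A^n|E^n)_{\phi^{\otimes n}} \ge n\, \widetilde{H}_\alpha(A|E)_\phi \tag{14.2.21} \]
for all $n \ge 1$, $\varepsilon \in (0,1)$, and $\alpha > 1$, where $d_{A'} = d_A$ and $\phi_{AE} = \mathcal{N}^c_{A'\to E}(\psi_{AA'})$.
We can now use (14.2.18) to prove that the regularized coherent information
$I^c_{\mathrm{reg}}(\mathcal{N})$ is an achievable rate for quantum communication over $\mathcal{N}$.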
𝛿 = 𝛿1 + 𝛿2 . (14.2.24)
\[ \delta_1 \ge I^c(\mathcal{N}) - \sup_{\psi_{AA'}} \widetilde{H}_\alpha(A|E)_\phi, \tag{14.2.25} \]
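where $\psi_{AA'}$ is a pure state with the dimension of $A'$ equal to the dimension of $A$
and $\phi_{AE} = \mathcal{N}^c_{A'\to E}(\psi_{AA'})$. Note that this is possible because $\widetilde{H}_\alpha(A|E)_\phi$ increases
monotonically with decreasing $\alpha$ (this follows from Proposition 7.23), so that
\begin{align}
\lim_{\alpha\to 1^+} \sup_{\psi_{AA'}} \widetilde{H}_\alpha(A|E)_\phi &= \sup_{\alpha\in(1,\infty)} \sup_{\psi_{AA'}} \widetilde{H}_\alpha(A|E)_\phi \tag{14.2.26} \\
&= \sup_{\alpha\in(1,\infty)} \sup_{\psi_{AA'}} \left( -\inf_{\sigma_E} \widetilde{D}_\alpha(\phi_{AE} \,\|\, \mathbb{1}_A \otimes \sigma_E) \right) \tag{14.2.27} \\
&= -\inf_{\alpha\in(1,\infty)} \inf_{\psi_{AA'}} \inf_{\sigma_E} \widetilde{D}_\alpha(\phi_{AE} \,\|\, \mathbb{1}_A \otimes \sigma_E) \tag{14.2.28} \\
&= -\inf_{\psi_{AA'}} \inf_{\sigma_E} \inf_{\alpha\in(1,\infty)} \widetilde{D}_\alpha(\phi_{AE} \,\|\, \mathbb{1}_A \otimes \sigma_E) \tag{14.2.29} \\
&= -\inf_{\psi_{AA'}} \inf_{\sigma_E} D(\phi_{AE} \,\|\, \mathbb{1}_A \otimes \sigma_E) \tag{14.2.30} \\
&= \sup_{\psi_{AA'}} \left( -\inf_{\sigma_E} D(\phi_{AE} \,\|\, \mathbb{1}_A \otimes \sigma_E) \right) \tag{14.2.31}
\end{align}
where the fifth equality follows from Proposition 7.22. Now, let $\mathcal{V}^{\mathcal{N}}_{A'\to BE}$ be an isometric
channel extending $\mathcal{N}_{A'\to B}$ such that $\phi_{AE} = \mathcal{N}^c_{A'\to E}(\psi_{AA'}) = \mathrm{Tr}_B[\mathcal{V}^{\mathcal{N}}_{A'\to BE}(\psi_{AA'})]$.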
𝜔 𝐴𝐵 = N 𝐴′ →𝐵 (𝜓 𝐴𝐴′ ), so that
𝐻 ( 𝐴|𝐸) 𝜙 = 𝐻 ( 𝐴𝐸) 𝜙 − 𝐻 (𝐸) 𝜙 (14.2.33)
= 𝐻 (𝐵)𝜔 − 𝐻 ( 𝐴𝐵)𝜔 (14.2.34)
= 𝐼 ( 𝐴⟩𝐵)𝜔 (14.2.35)
for every pure state 𝜓 𝐴𝐴′ . Therefore,
sup 𝐻 ( 𝐴|𝐸) 𝜙 = sup 𝐼 ( 𝐴⟩𝐵)𝜔 = 𝐼 𝑐 (N). (14.2.36)
𝜓 𝐴𝐴′ 𝜓 𝐴𝐴′
With 𝛼 ∈ (1, ∞) chosen such that (14.2.25) holds, take 𝑛 large enough so that
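\[ \delta_2 \ge \frac{1}{n(\alpha-1)}\log_2\!\left(\frac{128}{\varepsilon^2}\right) + \frac{1}{n}\log_2\!\left(\frac{1}{1-\frac{\varepsilon^2}{128}}\right) + \frac{4}{n}\log_2\!\left(\frac{1}{\varepsilon}\right) + \frac{15}{n}. \tag{14.2.37} \]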
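The size of the right-hand side of (14.2.37) can be checked numerically for concrete parameters. The following Python sketch is an illustration only: the function names `rate_overhead` and `smallest_n` are hypothetical, and the observation that the constants $\varepsilon^2/128$ and $15$ match the choice $\delta = 1/2$, $\eta = \varepsilon/(8\sqrt{2})$ in (14.1.95) is an inference from the formulas rather than a statement quoted from the text.

```python
import math


def rate_overhead(n: int, alpha: float, eps: float) -> float:
    """Right-hand side of (14.2.37) for n channel uses."""
    if n < 1 or alpha <= 1.0 or not 0.0 < eps < 1.0:
        raise ValueError("need n >= 1, alpha > 1 and 0 < eps < 1")
    f = eps**2 / 128.0  # the quantity eps^2/128 that appears in (14.2.37)
    renyi_term = math.log2(1.0 / f) / (n * (alpha - 1.0))
    smoothing_term = math.log2(1.0 / (1.0 - f)) / n
    constant_term = (4.0 * math.log2(1.0 / eps) + 15.0) / n
    return renyi_term + smoothing_term + constant_term


def smallest_n(delta2: float, alpha: float, eps: float) -> int:
    """Smallest n for which rate_overhead(n, alpha, eps) <= delta2."""
    if delta2 <= 0.0:
        raise ValueError("delta2 must be positive")
    # Every term scales as 1/n, so overhead(n) = overhead(1) / n.
    n = max(1, math.ceil(rate_overhead(1, alpha, eps) / delta2))
    # Correct for floating-point rounding in either direction.
    while rate_overhead(n, alpha, eps) > delta2:
        n += 1
    while n > 1 and rate_overhead(n - 1, alpha, eps) <= delta2:
        n -= 1
    return n
```

For instance, `smallest_n(0.01, 1.5, 0.1)` gives the first block length at which (14.2.37) holds for $\delta_2 = 0.01$, $\alpha = 1.5$, and $\varepsilon = 0.1$. Because every term decays as $1/n$, the overhead can be made smaller than any fixed $\delta_2 > 0$.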
Now, we use the fact that for the 𝑛 and 𝜀 chosen above there exists an (𝑛, 𝑑, 𝜀)
protocol such that
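\[ \frac{\log_2 d}{n} \ge \sup_{\psi_{AA'}} \widetilde{H}_\alpha(A|E)_\phi - \frac{1}{n(\alpha-1)}\log_2\!\left(\frac{128}{\varepsilon^2}\right) - \frac{1}{n}\log_2\!\left(\frac{1}{1-\frac{\varepsilon^2}{128}}\right) - \frac{4}{n}\log_2\!\left(\frac{1}{\varepsilon}\right) - \frac{15}{n}, \tag{14.2.38} \]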
which holds due to Corollary 14.17. Rearranging the right-hand side of this
inequality, and using (14.2.24), (14.2.25), and (14.2.37), we find that
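\begin{align}
\frac{\log_2 d}{n} &\ge I^c(\mathcal{N}) - \left( I^c(\mathcal{N}) - \sup_{\psi_{AA'}} \widetilde{H}_\alpha(A|E)_\phi + \frac{1}{n(\alpha-1)}\log_2\!\left(\frac{128}{\varepsilon^2}\right) + \frac{1}{n}\log_2\!\left(\frac{1}{1-\frac{\varepsilon^2}{128}}\right) + \frac{4}{n}\log_2\!\left(\frac{1}{\varepsilon}\right) + \frac{15}{n} \right) \tag{14.2.39} \\
&\ge I^c(\mathcal{N}) - (\delta_1 + \delta_2) \tag{14.2.40} \\
&= I^c(\mathcal{N}) - \delta. \tag{14.2.41}
\end{align}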
Thus, there exists an $(n, d, \varepsilon)$ quantum communication protocol with rate $\frac{\log_2 d}{n} \ge
I^c(\mathcal{N}) - \delta$. Therefore, there exists an $(n, 2^{n(R-\delta)}, \varepsilon)$ quantum communication
protocol with $R = I^c(\mathcal{N})$ for all sufficiently large $n$ such that (14.2.37) holds. Since
$\varepsilon$ and $\delta$ are arbitrary, we conclude that for all $\varepsilon \in (0,1]$, $\delta > 0$, and sufficiently
large $n$, there exists an $(n, 2^{n(I^c(\mathcal{N})-\delta)}, \varepsilon)$ quantum communication protocol. This
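[Figure 14.4: quantum communication protocol in which the $n$ uses of the channel are replaced by the useless channel $\mathcal{P}_{\sigma_{B^n}}$. Alice's encoding $\mathcal{E}$ maps $A_0$, initially in the state $\Psi_{RA_0}$, to $A_1, \dots, A_n$; Bob's decoding $\mathcal{D}$ maps $B_1, \dots, B_n$ to $B_0$, producing the state $\omega_{RB_0}$.]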
for N,
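\[ (1 - 2\varepsilon)\log_2 d \le I^c(\mathcal{N}) + h_2(\varepsilon), \tag{14.2.42} \]
\[ \log_2 d \le \widetilde{I}^c_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right) \quad \forall\, \alpha > 1. \tag{14.2.43} \]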
To obtain these inequalities, we considered a quantum communication protocol for
a useless channel. The useless channel in the asymptotic setting is analogous to the
one in Figure 14.2 and is shown in Figure 14.4. Applying (14.2.42) and (14.2.43)
to the channel N ⊗𝑛 leads to the following.
Proof: Since the inequalities in (14.2.42) and (14.2.43) of Theorem 14.4 hold for
every channel N, they hold for the channel N ⊗𝑛 . Therefore, applying (14.2.42) and
(14.2.43) to N ⊗𝑛 and dividing both sides by 𝑛, we obtain the desired result. ■
The inequalities in the corollary above give us, for all 𝜀 ∈ [0, 1) and 𝑛 ∈ N, an
upper bound on the rate of an arbitrary (𝑛, 𝑑, 𝜀) quantum communication protocol.
If instead we fix a particular rate $R$ by letting $d = 2^{nR}$, then we can obtain a lower
bound on the error probability of an $(n, 2^{nR}, \varepsilon)$ quantum communication protocol.
Specifically, using (14.2.45), we find that
\[ \varepsilon \ge 1 - 2^{-n\left(\frac{\alpha-1}{\alpha}\right)\left(R - \frac{1}{n}\widetilde{I}^c_\alpha(\mathcal{N}^{\otimes n})\right)} \tag{14.2.46} \]
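The rearrangement behind (14.2.46) is short enough to spell out. The derivation below is a sketch that assumes (14.2.45) is the inequality (14.2.43) applied to $\mathcal{N}^{\otimes n}$ and divided by $n$, as the proof of the corollary describes, with $\log_2 d = nR$ substituted.

```latex
\begin{align*}
R &\le \frac{1}{n}\widetilde{I}^c_\alpha(\mathcal{N}^{\otimes n})
      + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{1}{1-\varepsilon}\right)
   && \text{(assumed form of (14.2.45) with } d = 2^{nR}) \\
\Longleftrightarrow\quad
\log_2\!\left(\frac{1}{1-\varepsilon}\right)
  &\ge n\left(\frac{\alpha-1}{\alpha}\right)
       \left(R - \frac{1}{n}\widetilde{I}^c_\alpha(\mathcal{N}^{\otimes n})\right)
   && \text{(since } \alpha > 1 \text{, so } \tfrac{\alpha-1}{\alpha} > 0) \\
\Longleftrightarrow\quad
1-\varepsilon
  &\le 2^{-n\left(\frac{\alpha-1}{\alpha}\right)
          \left(R - \frac{1}{n}\widetilde{I}^c_\alpha(\mathcal{N}^{\otimes n})\right)} \\
\Longleftrightarrow\quad
\varepsilon
  &\ge 1 - 2^{-n\left(\frac{\alpha-1}{\alpha}\right)
              \left(R - \frac{1}{n}\widetilde{I}^c_\alpha(\mathcal{N}^{\otimes n})\right)}.
\end{align*}
```

The bound is informative only when $R > \frac{1}{n}\widetilde{I}^c_\alpha(\mathcal{N}^{\otimes n})$; otherwise the right-hand side is non-positive.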
Suppose that 𝑅 is an achievable rate. Then, by definition, for all 𝜀 ∈ (0, 1], 𝛿 > 0,
and sufficiently large $n$, there exists an $(n, 2^{n(R-\delta)}, \varepsilon)$ quantum communication
protocol for $\mathcal{N}$. For all such protocols, the inequality (14.2.44) in Corollary 14.18
holds, so that
\[ (1 - 2\varepsilon)(R - \delta) \le \frac{1}{n} I^c(\mathcal{N}^{\otimes n}) + \frac{1}{n} h_2(\varepsilon). \tag{14.2.49} \]
Since this bound holds for all sufficiently large 𝑛, it holds in the limit 𝑛 → ∞, so
that
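\begin{align}
(1 - 2\varepsilon)R &\le \lim_{n\to\infty} \left( \frac{1}{n} I^c(\mathcal{N}^{\otimes n}) + \frac{1}{n} h_2(\varepsilon) \right) + (1 - 2\varepsilon)\delta \tag{14.2.50} \\
&= \lim_{n\to\infty} \frac{1}{n} I^c(\mathcal{N}^{\otimes n}) + (1 - 2\varepsilon)\delta. \tag{14.2.51}
\end{align}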
Then, since this inequality holds for all 𝜀 ∈ (0, 1/2) and 𝛿 > 0, we obtain
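\[ R \le \lim_{\varepsilon,\delta\to 0} \left( \frac{1}{1-2\varepsilon} \lim_{n\to\infty} \frac{1}{n} I^c(\mathcal{N}^{\otimes n}) + \delta \right) = \lim_{n\to\infty} \frac{1}{n} I^c(\mathcal{N}^{\otimes n}) = I^c_{\mathrm{reg}}(\mathcal{N}). \tag{14.2.52} \]
We have thus shown that if $R$ is an achievable rate, then $R \le I^c_{\mathrm{reg}}(\mathcal{N})$. The
contrapositive of this statement is that if $R > I^c_{\mathrm{reg}}(\mathcal{N})$, then $R$ is not an achievable
rate. By definition, therefore, $I^c_{\mathrm{reg}}(\mathcal{N})$ is a weak converse rate.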
Although we have shown that the quantum capacity $Q(\mathcal{N})$ of a channel $\mathcal{N}$ is given
by its regularized coherent information $I^c_{\mathrm{reg}}(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n} I^c(\mathcal{N}^{\otimes n})$, without the
additivity of $I^c(\mathcal{N})$, this result is not particularly helpful since it is not clear whether
the regularized coherent information can be computed in general.
The coherent information is always superadditive, meaning that for two arbitrary
quantum channels N1 and N2 ,
𝐼 𝑐 (N1 ⊗ N2 ) ≥ 𝐼 𝑐 (N1 ) + 𝐼 𝑐 (N2 ). (14.2.53)
This follows from the fact that coherent information is additive for product states
𝜏𝐴1 𝐵1 ⊗ 𝜔 𝐴2 𝐵2 :
𝐼 ( 𝐴1 𝐴2 ⟩𝐵1 𝐵2 )𝜏⊗𝜔 = 𝐼 ( 𝐴1 ⟩𝐵1 )𝜏 + 𝐼 ( 𝐴2 ⟩𝐵2 )𝜔 , (14.2.54)
which is a consequence of (7.1.6) and the additivity of entropy for product states
(see (7.2.104)).
Now, let 𝜓 𝑅1 𝑅2 𝐴1 𝐴2 , 𝜙 𝑅1 𝐴1 , 𝜑 𝑅2 𝐴2 be arbitrary pure states, where 𝐴1 and 𝐴2 are
input systems to the channels N1 and N2 , respectively, and 𝑑 𝑅1 = 𝑑 𝐴1 and 𝑑 𝑅2 = 𝑑 𝐴2 .
Then, letting
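\begin{align}
\rho_{R_1R_2B_1B_2} &:= ((\mathcal{N}_1)_{A_1\to B_1} \otimes (\mathcal{N}_2)_{A_2\to B_2})(\psi_{R_1R_2A_1A_2}), \tag{14.2.55} \\
\tau_{R_1B_1} &:= (\mathcal{N}_1)_{A_1\to B_1}(\phi_{R_1A_1}), \tag{14.2.56} \\
\omega_{R_2B_2} &:= (\mathcal{N}_2)_{A_2\to B_2}(\varphi_{R_2A_2}), \tag{14.2.57}
\end{align}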
and restricting the optimization in the definition of coherent information of a
channel to pure product states, we find that
\[ I^c(\mathcal{N}_1 \otimes \mathcal{N}_2) = \sup_{\psi_{R_1R_2A_1A_2}} I(R_1R_2\rangle B_1B_2)_\rho \tag{14.2.58} \]
which is precisely (14.2.53). The reverse inequality does not hold in general, but it
does for degradable channels (see Section 14.3.1 below).
For the sandwiched Rényi coherent information of a bipartite state 𝜌 𝐴𝐵 , which
is defined as
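\[ \widetilde{I}_\alpha(A\rangle B)_\rho = \inf_{\sigma_B} \widetilde{D}_\alpha(\rho_{AB} \,\|\, \mathbb{1}_A \otimes \sigma_B), \tag{14.2.63} \]
where the optimization is over states $\sigma_B$, the following additivity equality holds for
all product states $\tau_{A_1B_1} \otimes \omega_{A_2B_2}$ and $\alpha \in (1,\infty)$:
\[ \widetilde{I}_\alpha(A_1A_2\rangle B_1B_2)_{\tau\otimes\omega} = \widetilde{I}_\alpha(A_1\rangle B_1)_\tau + \widetilde{I}_\alpha(A_2\rangle B_2)_\omega. \tag{14.2.64} \]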
This equality follows by reasoning similar to that given for the proof of Propo-
sition 11.21. By the same reasoning given in (14.2.58)–(14.2.62), we conclude
that
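\[ \widetilde{I}^c_\alpha(\mathcal{N}_1 \otimes \mathcal{N}_2) \ge \widetilde{I}^c_\alpha(\mathcal{N}_1) + \widetilde{I}^c_\alpha(\mathcal{N}_2) \tag{14.2.65} \]
for all $\alpha \in (1,\infty)$, where $\widetilde{I}^c_\alpha(\mathcal{N})$ is the sandwiched Rényi coherent information of
the channel $\mathcal{N}$. Whether the reverse inequality holds, even for particular classes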
of channels, is an open question.
Except for channels for which the coherent information is known to be additive
(such as the class of degradable channels; see Section 14.3.1 below), the quantum
capacity of a channel is difficult to compute. This prompts us to find tractable upper
bounds on quantum capacity. This search for tractable upper bounds is entirely
analogous to what was done in Section 12.2.5 for classical communication in order
to obtain tractable strong converse upper bounds on classical capacity.
Recall that in the previous chapter on entanglement distillation, our approach to
obtaining strong converse upper bounds on distillable entanglement consisted of
comparing the state at the output of an entanglement distillation protocol with one
that is useless for entanglement distillation. We considered the set of PPT′ operators
as the useless set, and we obtained state entanglement measures as upper bounds in
the one-shot and asymptotic settings. Now, observe that entanglement transmission
is similar to entanglement distillation in the sense that, like entanglement distillation,
the error criterion for entanglement transmission involves comparing the output
state of the protocol to the maximally entangled state. This suggests that the state
entanglement measures defined in Section 9.3, and in particular the results of
Proposition 13.6 and Corollary 13.7, are relevant. However, the main resource
that we are considering in this chapter is a quantum channel and not a quantum
state, and so we have an extra degree of freedom in the input state to the channel,
which we can optimize. This suggests that the channel entanglement measures
from Chapter 10 are relevant, and this is indeed what we find.
Proposition 14.19
Let N 𝐴→𝐵 be a quantum channel. For an arbitrary (𝑑, 𝜀) quantum commu-
nication protocol for N 𝐴→𝐵 , the number log2 𝑑 of qubits transmitted over N
satisfies
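\[ \log_2 d \le R^{\varepsilon}_H(\mathcal{N}), \tag{14.2.66} \]
where
\begin{align}
R^{\varepsilon}_H(\mathcal{N}) &= \sup_{\psi_{SA}} R^{\varepsilon}_H(S;B)_\omega \tag{14.2.67} \\
&= \sup_{\psi_{SA}} \inf_{\sigma_{SB}\in\mathrm{PPT}'(S:B)} D^{\varepsilon}_H(\mathcal{N}_{A\to B}(\psi_{SA}) \,\|\, \sigma_{SB}) \tag{14.2.68}
\end{align}
\[ Q^{\varepsilon}(\mathcal{N}) \le R^{\varepsilon}_H(\mathcal{N}). \tag{14.2.69} \]
Remark: Note that in the expression for $R^{\varepsilon}_H(\mathcal{N})$ above it suffices to optimize over pure states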
Now, since every local channel is completely PPT preserving (this follows imme-
diately from Proposition 4.29 and Lemma 4.30), we conclude that the channel
D𝐵→𝐵′ ≡ id𝑆 ⊗ D𝐵→𝐵′ is completely PPT preserving, so that the set
{D𝐵→𝐵′ (𝜏𝑆𝐵 ) : 𝜏𝑆𝐵 ∈ PPT′ (𝑆 : 𝐵)} (14.2.73)
is a subset of PPT′ (𝑆 : 𝐵′). Thus, by restricting the optimization over all operators
𝜎𝑆𝐵′ ∈ PPT′ (𝑆 : 𝐵′) to the outputs D𝐵→𝐵′ (𝜏𝑆𝐵 ) of the decoding channel D𝐵→𝐵′
acting on operators 𝜏𝑆𝐵 ∈ PPT′ (𝑆 : 𝐵), we obtain
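\begin{align}
R^{\varepsilon}_H(S;B')_\omega &\le \inf_{\tau_{SB}\in\mathrm{PPT}'(S:B)} D^{\varepsilon}_H((\mathcal{D}_{B\to B'} \circ \mathcal{N}_{A\to B} \circ \mathcal{E}_{A'\to A})(\Phi_{SA'}) \,\|\, \mathcal{D}_{B\to B'}(\tau_{SB})) \tag{14.2.74} \\
&\le \inf_{\tau_{SB}\in\mathrm{PPT}'(S:B)} D^{\varepsilon}_H((\mathcal{N}_{A\to B} \circ \mathcal{E}_{A'\to A})(\Phi_{SA'}) \,\|\, \tau_{SB}) \tag{14.2.75} \\
&= \inf_{\tau_{SB}\in\mathrm{PPT}'(S:B)} D^{\varepsilon}_H(\mathcal{N}_{A\to B}(\rho_{SA}) \,\|\, \tau_{SB}), \tag{14.2.76}
\end{align}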
where the second inequality follows from the data-processing inequality for hypoth-
esis testing relative entropy, and the equality follows by letting 𝜌 𝑆 𝐴 = E 𝐴′ →𝐴 (Φ𝑆 𝐴′ ).
Finally, after optimizing over all states 𝜌 𝑆 𝐴 , we obtain
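\begin{align}
R^{\varepsilon}_H(S;B')_\omega &\le \sup_{\rho_{SA}} \inf_{\tau_{SB}\in\mathrm{PPT}'(S:B)} D^{\varepsilon}_H(\mathcal{N}_{A\to B}(\rho_{SA}) \,\|\, \tau_{SB}) \tag{14.2.77} \\
&= R^{\varepsilon}_H(\mathcal{N}), \tag{14.2.78}
\end{align}
so that, by (14.2.70), we conclude that
\[ \log_2 d \le R^{\varepsilon}_H(\mathcal{N}), \tag{14.2.79} \]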
as required. ■
where
\[ \widetilde{R}_\alpha(\mathcal{N}) = \sup_{\psi_{SA}} \widetilde{R}_\alpha(S;B)_\omega \tag{14.2.81} \]
Remark: Note that in the expression for $\widetilde{R}_\alpha(\mathcal{N})$ above it suffices to optimize over pure states
𝜓 𝑆 𝐴, with the dimension of 𝑆 equal to the dimension of 𝐴. We showed this in (10.1.3)–(10.1.6)
immediately after Definition 10.1.
Since the inequality in (14.2.80) holds for all (𝑑, 𝜀) quantum communication
protocols, we conclude the following upper bound on the one-shot quantum capacity:
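\[ Q^{\varepsilon}(\mathcal{N}) \le \widetilde{R}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right) \tag{14.2.83} \]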
for all 𝛼 > 1.
For 𝑛 channel uses, the bound in (14.2.80) becomes
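\[ \frac{\log_2 d}{n} \le \frac{1}{n}\widetilde{R}_\alpha(\mathcal{N}^{\otimes n}) + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{1}{1-\varepsilon}\right) \quad \forall\, \alpha > 1, \tag{14.2.84} \]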
which holds for an arbitrary (𝑛, 𝑑, 𝜀) quantum communication protocol that employs
𝑛 uses of the channel N, where 𝑛 ≥ 1 and 𝜀 ∈ [0, 1). We can simplify this inequality
by making use of the following fact.
\[ \widetilde{R}_\alpha(\mathcal{N}^{\otimes n}) \le n\,\widetilde{R}_\alpha(\mathcal{N}) + \frac{\alpha(d_A^2 - 1)}{\alpha-1}\log_2(n+1). \tag{14.2.85} \]
Proof: Throughout this proof, for convenience we make use of the alternate
notation
\[ \widetilde{R}_\alpha(\rho_{AB}) \equiv \widetilde{R}_\alpha(A;B)_\rho \tag{14.2.86} \]
where 𝜌 𝐴𝐵 is a bipartite state.
Let $\psi_{SA^n}$ be an arbitrary pure state, with the dimension of $S$ equal to the
dimension of $A^n$, and let $\rho_{A^n} := \mathrm{Tr}_S[\psi_{SA^n}]$. We start by observing that the channel
$\mathcal{N}^{\otimes n}$ is covariant with respect to the symmetric group $S_n$. In particular, if we let
$\{W^{\pi}_{A^n}\}_{\pi\in S_n}$ and $\{W^{\pi}_{B^n}\}_{\pi\in S_n}$ be the unitary representations of $S_n$, defined in (2.5.1),
acting on $\mathcal{H}_A^{\otimes n}$ and $\mathcal{H}_B^{\otimes n}$, respectively, then for every state $\rho_{A^n}$, we have that
and $\psi^{\rho}_{SA^n}$ is a purification of $\rho_{A^n}$.
Since the state $\rho_{A^n}$ is permutation invariant by definition, by Lemma 3.13, it
has a purification $|\phi^{\rho}\rangle_{\hat{A}^n A^n} \in \mathrm{Sym}_n(\mathcal{H}_{\hat{A}A})$, where the dimension of $\hat{A}$ is equal to
the dimension of $A$. Consequently, there exists an isometry $V_{S\to\hat{A}^n}$ such that
\[ V_{S\to\hat{A}^n} |\psi^{\rho}\rangle_{SA^n} = |\phi^{\rho}\rangle_{\hat{A}^n A^n}. \tag{14.2.90} \]
But $|\phi^{\rho}\rangle_{\hat{A}^n A^n} \in \mathrm{Sym}_n(\mathcal{H}_{\hat{A}A})$, which means that
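Therefore,
\[ \phi^{\rho}_{\hat{A}^n A^n} \le \Pi_{\hat{A}^n A^n} = \binom{d_A^2 + n - 1}{n} \int \phi^{\otimes n}_{\hat{A}A} \,\mathrm{d}\phi, \tag{14.2.94} \]
\begin{align}
\binom{n + d_A^2 - 1}{n} &= \frac{(n + d_A^2 - 1)(n + d_A^2 - 2)\cdots(n+2)(n+1)}{(d_A^2 - 1)(d_A^2 - 2)\cdots 2\cdot 1} \tag{14.2.95} \\
&= \frac{n + d_A^2 - 1}{d_A^2 - 1}\cdot\frac{n + d_A^2 - 2}{d_A^2 - 2}\cdots\frac{n+2}{2}\cdot\frac{n+1}{1}. \tag{14.2.96}
\end{align}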
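The factorization in (14.2.96) yields a polynomial bound on the binomial coefficient: each factor $(n+k)/k$ with $k \ge 1$ is at most $n+1$, so the product of the $d_A^2 - 1$ factors is at most $(n+1)^{d_A^2-1}$, which is the quantity that appears inside the logarithm in (14.2.99). The following Python sketch checks this numerically; the function names are hypothetical and the check is an illustration, not part of the proof.

```python
import math
from fractions import Fraction


def symmetric_subspace_factor(n: int, d_A: int) -> int:
    """Binomial coefficient C(n + d_A^2 - 1, n) from (14.2.94)."""
    return math.comb(n + d_A**2 - 1, n)


def product_form(n: int, d_A: int) -> Fraction:
    """Product of the factors (n + k) / k for k = 1, ..., d_A^2 - 1, as in (14.2.96)."""
    result = Fraction(1)
    for k in range(1, d_A**2):
        result *= Fraction(n + k, k)
    return result


def polynomial_bound(n: int, d_A: int) -> int:
    """Upper bound (n + 1)^(d_A^2 - 1), obtained from (n + k) / k <= n + 1."""
    return (n + 1) ** (d_A**2 - 1)


if __name__ == "__main__":
    for d_A in (2, 3):
        for n in (1, 10, 100):
            exact = symmetric_subspace_factor(n, d_A)
            print(d_A, n, exact, polynomial_bound(n, d_A))
```

Exact rational arithmetic (`fractions.Fraction`) is used for the product so that the comparison with `math.comb` is free of rounding error.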
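\[ \widetilde{R}_\alpha(\mathcal{N}^{\otimes n}_{A\to B}(\phi^{\rho}_{\hat{A}^n A^n})) \le \widetilde{R}_\alpha(\mathcal{N}^{\otimes n}_{A\to B}(\xi_{\hat{A}^n A^n})) + \frac{\alpha}{\alpha-1}\log_2 (n+1)^{d_A^2-1}. \tag{14.2.99} \]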
Then, using subadditivity of the sandwiched Rényi Rains relative entropy for
tensor-product states, as given by (9.3.18), we find that
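\[ \widetilde{R}_\alpha(\mathcal{N}^{\otimes n}_{A\to B}(\phi^{\otimes n}_{\hat{A}A})) \le n\,\widetilde{R}_\alpha(\mathcal{N}_{A\to B}(\phi_{\hat{A}A})). \tag{14.2.101} \]
\[ \widetilde{R}_\alpha(\mathcal{N}^{\otimes n}) = \sup_{\psi_{SA^n}} \widetilde{R}_\alpha(\mathcal{N}^{\otimes n}_{A\to B}(\psi_{SA^n})) \le n\,\widetilde{R}_\alpha(\mathcal{N}) + \frac{\alpha(d_A^2 - 1)}{\alpha-1}\log_2(n+1), \tag{14.2.104} \]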
as required. ■
for all 𝛼 > 1. Consequently, the following bound holds for the 𝑛-shot quantum
capacity:
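\[ Q^{n,\varepsilon}(\mathcal{N}) \le \widetilde{R}_\alpha(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{(n+1)^{d_A^2-1}}{1-\varepsilon}\right) \tag{14.2.106} \]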
With this bound, we are now ready to state the main result of this section, which
is that the Rains information of a channel is an upper bound on the strong converse
capacity of an arbitrary quantum channel N.
≤ 𝑅(N) + 𝛿1 + 𝛿2 (14.2.113)
= 𝑅(N) + 𝛿′ (14.2.114)
< 𝑅(N) + 𝛿. (14.2.115)
So we have that $R(\mathcal{N}) + \delta > \frac{\log_2 d}{n}$ for all $(n, d, \varepsilon)$ quantum communication protocols
with sufficiently large $n$. Due to this strict inequality, it follows that there cannot
exist an $(n, 2^{n(R(\mathcal{N})+\delta)}, \varepsilon)$ quantum communication protocol for all sufficiently
large $n$ such that (14.2.110) holds, for if it did there would exist a $d$ such that
$\log_2 d = n(R(\mathcal{N}) + \delta)$, which we have just seen is not possible. Since $\varepsilon$ and $\delta$ are
arbitrary, we conclude that for all $\varepsilon \in [0,1)$, $\delta > 0$, and sufficiently large $n$, there
does not exist an $(n, 2^{n(R(\mathcal{N})+\delta)}, \varepsilon)$ quantum communication protocol. This means
that $R(\mathcal{N})$ is a strong converse rate, and thus that $\widetilde{Q}(\mathcal{N}) \le R(\mathcal{N})$. ■
Let us now show that the Rains relative entropy of a quantum channel N is a strong
converse rate according to the definition of a strong converse rate in Appendix A. To
this end, consider a sequence $\{(n, 2^{nr}, \varepsilon_n)\}_{n\in\mathbb{N}}$ of $(n, d, \varepsilon)$ quantum communication
protocols, with each element of the sequence having an arbitrary (but fixed) rate
$r > R(\mathcal{N})$. For each element of the sequence, the inequality in (14.2.105) holds,
which means that
\[ r \le \widetilde{R}_\alpha(\mathcal{N}) + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{(n+1)^{d_A^2-1}}{1-\varepsilon_n}\right) \tag{14.2.116} \]
for all 𝛼 > 1. Rearranging this inequality leads to the following lower bound on the
error probabilities 𝜀 𝑛 :
we conclude that lim𝑛→∞ 𝜀 𝑛 = 1. Since the rate 𝑟 > 𝑅(N) is arbitrary, we conclude
that 𝑅(N) is a strong converse rate. We also see from (14.2.118) that the sequence
{𝜀 𝑛 }𝑛∈N approaches one at an exponential rate.
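The exponential convergence can be illustrated numerically. Solving (14.2.116) for $\varepsilon_n$ gives $\varepsilon_n \ge 1 - (n+1)^{d_A^2-1}\, 2^{-n\left(\frac{\alpha-1}{\alpha}\right)(r - \widetilde{R}_\alpha(\mathcal{N}))}$. The Python sketch below evaluates this expression; the function name is hypothetical, and the value passed as `renyi_rains` is a placeholder for $\widetilde{R}_\alpha(\mathcal{N})$, which is assumed to satisfy $\widetilde{R}_\alpha(\mathcal{N}) < r$ for the chosen $\alpha$.

```python
import math


def strong_converse_error_bound(
    n: int, r: float, renyi_rains: float, alpha: float, d_A: int
) -> float:
    """Lower bound on eps_n obtained by rearranging (14.2.116).

    renyi_rains is a caller-supplied value standing in for the sandwiched
    Renyi Rains information of the channel (an assumption of this sketch).
    """
    if n < 1 or alpha <= 1.0 or d_A < 1:
        raise ValueError("need n >= 1, alpha > 1 and d_A >= 1")
    # Work with log2 of (n+1)^(d_A^2-1) * 2^(-n (alpha-1)/alpha (r - R_alpha))
    # to avoid overflow of the polynomial prefactor.
    log2_term = (d_A**2 - 1) * math.log2(n + 1) - n * (alpha - 1.0) / alpha * (
        r - renyi_rains
    )
    if log2_term >= 0.0:
        return 0.0  # the bound is trivial for this n
    return 1.0 - 2.0**log2_term


if __name__ == "__main__":
    # Hypothetical numbers: qubit input, rate 0.1 above the Renyi Rains value.
    for n in (10, 100, 1000, 10000):
        print(n, strong_converse_error_bound(n, 0.6, 0.5, 2.0, 2))
```

For small $n$ the polynomial prefactor dominates and the bound is trivial; once $n$ is large enough, the exponential term takes over and the bound approaches one.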
14.3 Examples
We now consider the quantum capacity for particular classes of quantum channels.
As remarked earlier, computing the quantum capacity of an arbitrary channel is a
difficult task. This task is made more difficult by the fact that, in some cases, the
coherent information is known to be strictly superadditive, meaning that
This fact confirms that regularization of the coherent information really is needed
in general in order to compute the quantum capacity, and that additivity of coherent
information is simply not true for all channels. Another interesting phenomenon
related to quantum capacity is superactivation, which is when two channels N1
and N2 , each with zero quantum capacity, i.e., 𝑄(N1 ) = 𝑄(N2 ) = 0, can combine
to have non-zero quantum capacity, i.e., 𝑄(N1 ⊗ N2 ) > 0. Please consult the
Bibliographic Notes in Section 14.5 for more information about strict superadditivity
and superactivation.
In this section, we show that coherent information is additive for all degradable
channels, which means that regularization is not needed in order to compute their
capacities. The same turns out to be true for generalized dephasing channels, and
we prove this by showing that the Rains relative entropy of those channels coincides
with their coherent information. We also show that anti-degradable channels have
zero quantum capacity. Finally, we evaluate the upper and lower bounds established
in this chapter for the generalized amplitude damping channel.
Before starting, let us first recall the definition of coherent information of a
channel:
\[ I^c(\mathcal{N}) = \sup_{\psi_{RA}} I(R\rangle B)_\omega = \sup_{\rho} \{H(\mathcal{N}(\rho)) - H(\mathcal{N}^c(\rho))\}, \tag{14.3.2} \]
Recall from Definition 4.6 that a channel N 𝐴→𝐵 is degradable if there exists a
channel D𝐵→𝐸 such that
N𝑐 = D ◦ N, (14.3.4)
where $\mathcal{N}^c$ is a channel complementary to $\mathcal{N}$ (see Definition 4.5) and $d_E \ge \mathrm{rank}(\Gamma^{\mathcal{N}}_{AB})$.
In particular, if 𝑉𝐴→𝐵𝐸 is an isometric extension of N, so that
N(𝜌) = Tr𝐸 [𝑉 𝜌𝑉 † ] (14.3.5)
for every state 𝜌, then
N𝑐 (𝜌) = Tr 𝐵 [𝑉 𝜌𝑉 † ]. (14.3.6)
We now show that the coherent information is additive for degradable quantum
channels, meaning that
𝐼 𝑐 (N ⊗ M) = 𝐼 𝑐 (N) + 𝐼 𝑐 (M) (14.3.7)
for all degradable quantum channels N and M. Consequently, regularization is
unnecessary, and we conclude that the quantum capacity of a degradable channel is
equal to its coherent information:
𝑄(N) = 𝐼 𝑐 (N) for every degradable channel N. (14.3.8)
where the third equality follows from (14.3.10), the inequality follows from the
data-processing inequality for quantum relative entropy, and the last equality from
(7.11.103) and (7.11.104). Rearranging this inequality and applying subadditivity
of the entropy 𝐻 ((N ⊗ M)(𝜌 𝐴1 𝐴2 )) gives
as required. ■
Another useful fact about a degradable channel $\mathcal{N}$ is that the coherent information
$I^c(\rho, \mathcal{N})$ defined in (14.3.3) is concave in the input state $\rho$.
Lemma 14.27
For a degradable channel N, the function 𝜌 ↦→ 𝐼 𝑐 (𝜌, N) is concave in the input
state 𝜌. In other words, for every finite alphabet X, probability distribution
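$p : \mathcal{X} \to [0,1]$, and set $\{\rho^x_A\}_{x\in\mathcal{X}}$ of states,
\[ I^c\!\left( \sum_{x\in\mathcal{X}} p(x)\rho^x_A, \mathcal{N} \right) \ge \sum_{x\in\mathcal{X}} p(x) I^c(\rho^x_A, \mathcal{N}). \tag{14.3.23} \]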
which is the desired inequality in (14.3.23). Indeed, the left-hand side of the
inequality above is simply $I^c\!\left( \sum_{x\in\mathcal{X}} p(x)\rho^x_A, \mathcal{N} \right)$. For the right-hand side, we find
that
\begin{align}
H(B|X)_\omega &= \sum_{x\in\mathcal{X}} p(x) H(\mathcal{N}(\rho^x_A)), \tag{14.3.28} \\
H(E|X)_\tau &= \sum_{x\in\mathcal{X}} p(x) H(\mathcal{N}^c(\rho^x_A)), \tag{14.3.29}
\end{align}
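where $d \ge 1$ and where the state vectors $\{|\psi_i\rangle\}_{i=0}^{d-1}$ are arbitrary (not necessarily
and
\[ \mathcal{N}^c_{A\to E}(\rho_A) = \mathrm{Tr}_B[V^{\mathcal{N}}_{A\to BE}\, \rho_A\, V^{\mathcal{N}\dagger}_{A\to BE}] = \sum_{i=0}^{d-1} \langle i|\rho_A|i\rangle\, |\psi_i\rangle\langle\psi_i|_E. \tag{14.3.32} \]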
for every state 𝜌. This implies that generalized dephasing channels N are degradable,
with N𝑐 being the degrading channel.
We now show that $\widetilde{Q}(\mathcal{N}) = I^c(\mathcal{N})$ for every generalized dephasing channel $\mathcal{N}$.
We do this by showing that the Rains information 𝑅(N) of a generalized dephasing
channel is equal to its coherent information.
\[ Q(\mathcal{N}) = \widetilde{Q}(\mathcal{N}) = R(\mathcal{N}) = I^c(\mathcal{N}), \tag{14.3.34} \]
which establish the coherent information as the quantum capacity and strong
converse quantum capacity.
Proof: It suffices to show that 𝑅(N) = 𝐼 𝑐 (N). Note that the inequality 𝐼 𝑐 (N) ≤
𝑅(N) holds for every quantum channel N by combining the result of Theorem 14.16
with the result of Theorem 14.22. We now show that the reverse inequality holds
for all generalized dephasing channels.
We start by observing that every generalized dephasing channel $\mathcal{N}$ is covariant
with respect to the operators $\{Z(j)\}_{j=0}^{d-1}$ defined in (3.2.49):
conclude that for every state $\rho$, its corresponding average state $\overline{\rho}$ has a purification
of the following form:
\[ |\phi^{\overline{\rho}}\rangle_{RA} = \sum_{i=0}^{d-1} \sqrt{p(i)}\, |i\rangle_R \otimes |i\rangle_A =: |\psi^p\rangle_{RA}, \tag{14.3.38} \]
where the last line follows from the data-processing inequality for quantum relative
entropy, and we introduced the following channel:
\[ \Delta(\rho) := \Pi\rho\Pi + (\mathbb{1} - \Pi)\rho(\mathbb{1} - \Pi), \qquad \Pi = \sum_{i=0}^{d-1} |i\rangle\langle i|_R \otimes |i\rangle\langle i|_B. \tag{14.3.43} \]
with the probability distribution $q(i) := \langle i|\sigma_B|i\rangle$. Note that the right-hand side of
the equation above is a state in $\mathrm{PPT}'(R : B)$. Therefore, we have
= 𝑅(N), (14.3.53)
Let us now consider anti-degradable channels. Recall from Definition 4.6 that a
channel N 𝐴→𝐵 is anti-degradable if there exists an anti-degrading channel A𝐸→𝐵
such that
N = A ◦ N𝑐 , (14.3.54)
where $\mathcal{N}^c$ is a channel complementary to $\mathcal{N}$ and $d_E \ge \mathrm{rank}(\Gamma^{\mathcal{N}}_{AB})$.
as required. ■
Let us recall the definition of the generalized amplitude damping channel (GADC)
from (4.5.10):
where
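\begin{align}
A_1 &= \sqrt{1-N}\begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\gamma} \end{pmatrix}, & A_2 &= \sqrt{1-N}\begin{pmatrix} 0 & \sqrt{\gamma} \\ 0 & 0 \end{pmatrix}, \tag{14.3.65} \\
A_3 &= \sqrt{N}\begin{pmatrix} \sqrt{1-\gamma} & 0 \\ 0 & 1 \end{pmatrix}, & A_4 &= \sqrt{N}\begin{pmatrix} 0 & 0 \\ \sqrt{\gamma} & 0 \end{pmatrix}, \tag{14.3.66}
\end{align}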
for every state 𝜌 and all 𝛾, 𝑁 ∈ [0, 1]. In other words, the GADC A𝛾,𝑁 is related
to the GADC A𝛾,1−𝑁 via a simple pre- and post-processing by the Pauli unitary
𝑋 = |0⟩⟨1| + |1⟩⟨0|. The information-theoretic aspects of the GADC are thus
invariant under the interchange 𝑁 ↔ 1 − 𝑁, which means that we can, without loss
of generality, restrict the parameter 𝑁 to the interval [0, 1/2].
For 𝑁 = 0, the GADC reduces to the amplitude damping channel A𝛾 defined in
(4.5.1), which is degradable. Indeed, we first note that
where the complementary channel $\mathcal{A}^c_{\gamma,0}$ (recall Definition 4.5) is defined via the
following isometric extension:
is satisfied by the quantum channel $\mathcal{D}_{\gamma,0} := \mathcal{A}_{\frac{1-2\gamma}{1-\gamma},0}$. It can be shown that for
$N > 0$, the GADC $\mathcal{A}_{\gamma,N}$ is not degradable for any $\gamma \in (0,1]$ (please consult the
Bibliographic Notes in Section 14.5).
Since A𝛾,0 is degradable, its coherent information is additive, which means that
its quantum capacity is equal to its coherent information, i.e.,
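\begin{align}
Q(\mathcal{A}_{\gamma,0}) = I^c(\mathcal{A}_{\gamma,0}) &= \sup_{\rho} \left\{ H(\mathcal{A}_{\gamma,0}(\rho)) - H(\mathcal{A}^c_{\gamma,0}(\rho)) \right\} \tag{14.3.71} \\
&= \sup_{\rho} I^c(\rho, \mathcal{A}_{\gamma,0}), \tag{14.3.72}
\end{align}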
where we have used the expression in (14.3.2). Now, as explained in Section 11.3.2,
the GADC is covariant with respect to the Pauli operator 𝑍. Furthermore, by
Lemma 14.27, the function 𝜌 ↦→ 𝐼 𝑐 (𝜌, A𝛾,0 ) is concave. Therefore, for every state
𝜌,
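\[ I^c\!\left( \tfrac{1}{2}\rho + \tfrac{1}{2}Z\rho Z, \mathcal{A}_{\gamma,0} \right) \ge \tfrac{1}{2} I^c(\rho, \mathcal{A}_{\gamma,0}) + \tfrac{1}{2} I^c(Z\rho Z, \mathcal{A}_{\gamma,0}). \tag{14.3.73} \]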
Now, using the fact that A𝛾,0 is covariant with respect to 𝑍, and the fact that
$\mathcal{A}^c_{\gamma,0} = \mathcal{A}_{1-\gamma,0}$, we obtain
[Figure 14.5: plot of the quantum capacity $Q(\mathcal{A}_{\gamma,0})$ of the amplitude damping channel as a function of $\gamma \in [0,1]$; the vertical axis ranges from 0 to 1.]
where in the last line we have evaluated 𝐼 𝑐 ((1 − 𝑝)|0⟩⟨0| + 𝑝|1⟩⟨1|, A𝛾,0 ). See
Figure 14.5 for a plot of the quantum capacity of the amplitude damping channel
$\mathcal{A}_{\gamma,0}$. Note that the capacity vanishes at $\gamma = \frac{1}{2}$, which is due to the fact that for
$\gamma \ge \frac{1}{2}$ the amplitude damping channel $\mathcal{A}_{\gamma,0}$ (and more generally the GADC $\mathcal{A}_{\gamma,N}$
for $N \in [0,1]$) is anti-degradable. From Proposition 14.29, we thus have that
$Q(\mathcal{A}_{\gamma,N}) = 0$ for all $N \in [0,1]$ and $\gamma \ge \frac{1}{2}$.
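The capacity curve of the amplitude damping channel can be reproduced with a few lines of code. The Python sketch below rests on two stated assumptions: that it suffices to optimize over diagonal inputs $(1-p)|0\rangle\langle 0| + p|1\rangle\langle 1|$, as the discussion above indicates, and that the coherent information of such an input equals $h_2((1-\gamma)p) - h_2(\gamma p)$. The latter follows from the output populations $(1 - (1-\gamma)p, (1-\gamma)p)$ together with $\mathcal{A}^c_{\gamma,0} = \mathcal{A}_{1-\gamma,0}$. The function names and the grid search are illustrative choices.

```python
import math


def h2(x: float) -> float:
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def coherent_information_diagonal(p: float, gamma: float) -> float:
    """Coherent information of the input diag(1 - p, p) for the amplitude damping channel."""
    return h2((1.0 - gamma) * p) - h2(gamma * p)


def amplitude_damping_capacity(gamma: float, grid: int = 10001) -> float:
    """Grid-search approximation of the maximum over p in [0, 1]."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return max(
        coherent_information_diagonal(k / (grid - 1), gamma) for k in range(grid)
    )


if __name__ == "__main__":
    for gamma in (0.0, 0.1, 0.25, 0.4, 0.5, 0.75):
        print(f"gamma = {gamma:.2f}: Q is approximately {amplitude_damping_capacity(gamma):.4f}")
```

A grid search is used only for simplicity. Since the function is concave in $p$ for degradable parameters (Lemma 14.27), a one-dimensional concave maximizer would serve equally well.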
Let us now consider the coherent information of the GADC A𝛾,𝑁 for 𝑁 > 0.
In this case, the coherent information 𝐼 𝑐 (A𝛾,𝑁 ) is a lower bound on the quantum
capacity of the GADC. As with the amplitude damping channel, it can be shown
that for the GADC A𝛾,𝑁 with 𝑁 > 0 it suffices to optimize over states diagonal in
the standard basis in order to compute the coherent information:
\[ I^c(\mathcal{A}_{\gamma,N}) = \max_{p\in[0,1]} I^c((1-p)|0\rangle\langle 0| + p|1\rangle\langle 1|, \mathcal{A}_{\gamma,N}), \tag{14.3.83} \]
for all 𝛾 ∈ (0, 1) and all 𝑁 > 0. The proof of this is more involved, since for 𝑁 > 0
the GADC is not degradable, meaning that we cannot use Lemma 14.27. Please
consult the Bibliographic Notes in Section 14.5 for a source of the proof.
In Figure 14.6, we plot the coherent information lower bound given by (14.3.83).
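[Plot legend: $I^c(\mathcal{A}_{\gamma,N})$, $Q^{\mathrm{UB}}_{\mathrm{DP},1}$, $Q^{\mathrm{UB}}_{\mathrm{DP},2}$, $Q^{\mathrm{UB}}_{\mathrm{DP},3}$, $Q^{\mathrm{UB}}_{\mathrm{DP},4}$, $R(\mathcal{A}_{\gamma,N})$]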
Figure 14.6: The coherent information lower bound 𝐼 𝑐 (A𝛾,𝑁 ) and four upper
bounds on the quantum capacity of the generalized amplitude damping channel
A𝛾,𝑁 . The quantum capacity lies within the shaded region.
We also plot the Rains information upper bound 𝑅(A𝛾,𝑁 ) as well as four other upper
bounds that are based on the following identities, which follow from (14.3.70):
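\[ \mathcal{A}_{\gamma,N} = \mathcal{A}_{\gamma N,1} \circ \mathcal{A}_{\frac{\gamma(1-N)}{1-\gamma N},0}, \tag{14.3.84} \]
\begin{align}
Q(\mathcal{A}_{\gamma,N}) &\le Q(\mathcal{A}_{\gamma(1-N),0}) =: Q^{\mathrm{UB}}_{\mathrm{DP},2}(\gamma,N), \tag{14.3.87} \\
Q(\mathcal{A}_{\gamma,N}) &\le Q(\mathcal{A}_{\gamma N,0}) =: Q^{\mathrm{UB}}_{\mathrm{DP},3}(\gamma,N), \tag{14.3.88} \\
Q(\mathcal{A}_{\gamma,N}) &\le Q(\mathcal{A}_{\frac{\gamma N}{1-\gamma(1-N)},0}) =: Q^{\mathrm{UB}}_{\mathrm{DP},4}(\gamma,N). \tag{14.3.89}
\end{align}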
Note that the right-hand side of each inequality can be calculated using (14.3.82).
We have also made use of (14.3.67), which implies that 𝑄(A𝛾,1 ) = 𝑄(A𝛾,0 ). These
inequalities hold due to the fact that, for the composition of two quantum channels
N and M,
𝑄(N ◦ M) ≤ 𝑄(M) and 𝑄(N ◦ M) ≤ 𝑄(N). (14.3.90)
The first inequality holds by the data-processing inequality. The second inequality
can be viewed as a lower bound on the quantum capacity of the channel N that
arises from a coding strategy consisting of some encoding followed by many uses
of the channel M.
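The data-processing upper bounds reduce to evaluating the amplitude damping capacity at modified damping parameters. The Python sketch below computes them under stated assumptions: the amplitude damping capacity is taken to be $\max_p [h_2((1-\gamma)p) - h_2(\gamma p)]$, the parameters for DP,2 through DP,4 are read off from (14.3.87)–(14.3.89), and the parameter used for DP,1 is inferred from the second factor of the decomposition (14.3.84) rather than taken from a displayed inequality. All function names are hypothetical.

```python
import math


def h2(x: float) -> float:
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def amplitude_damping_capacity(gamma: float, grid: int = 2001) -> float:
    """Grid-search value of max over p of h2((1 - gamma) p) - h2(gamma p)."""
    best = 0.0
    for k in range(grid):
        p = k / (grid - 1)
        best = max(best, h2((1.0 - gamma) * p) - h2(gamma * p))
    return best


def _ratio(numerator: float, denominator: float) -> float:
    """Damping parameter num/den, clipped to [0, 1]; 1.0 if the denominator vanishes."""
    if denominator <= 0.0:
        return 1.0  # convention for the degenerate case: fully damping channel
    return min(1.0, max(0.0, numerator / denominator))


def gadc_dp_upper_bounds(gamma: float, N: float) -> dict:
    """Data-processing upper bounds on Q(A_{gamma,N}), keyed by label."""
    if not (0.0 <= gamma <= 1.0 and 0.0 <= N <= 1.0):
        raise ValueError("gamma and N must lie in [0, 1]")
    damping = {
        "DP,1": _ratio(gamma * (1.0 - N), 1.0 - gamma * N),  # inferred from (14.3.84)
        "DP,2": gamma * (1.0 - N),                           # (14.3.87)
        "DP,3": gamma * N,                                   # (14.3.88)
        "DP,4": _ratio(gamma * N, 1.0 - gamma * (1.0 - N)),  # (14.3.89)
    }
    return {label: amplitude_damping_capacity(g) for label, g in damping.items()}


if __name__ == "__main__":
    bounds = gadc_dp_upper_bounds(0.2, 0.3)
    for label, value in bounds.items():
        print(label, round(value, 4))
    print("tightest:", round(min(bounds.values()), 4))
```

At $N = 0$ the bounds labelled DP,1 and DP,2 reduce to $Q(\mathcal{A}_{\gamma,0})$ itself, which serves as a quick consistency check on the parameter formulas.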
14.4 Summary
In this chapter, we studied quantum communication. Given a quantum channel
N 𝐴→𝐵 connecting Alice and Bob, the goal in quantum communication is to
determine the highest rate, called the quantum capacity and denoted by 𝑄(N), at
which the 𝐴′ part of an arbitrary pure state Ψ𝑅 𝐴′ can be transmitted to Bob without
error. At the disposal of Alice and Bob are local encoding and decoding channels,
as well as an arbitrary number of (unassisted) uses of the channel N 𝐴→𝐵 . By
unassisted, we mean that Alice and Bob are not allowed to communicate with each
other between channel uses. We found that the coherent information 𝐼 𝑐 (N) of N is
always a lower bound on its quantum capacity, and that, in general, computing the
exact value of the capacity involves a regularization, so that $Q(\mathcal{N}) = I^c_{\mathrm{reg}}(\mathcal{N})$.
Starting with the one-shot setting, in which only one use of the channel is
allowed and there is some tolerable non-zero error, we determined both upper and
lower bounds on the number of qubits that can be transmitted. The one-shot upper
bound involves the hypothesis testing relative entropy in a way similar to how it is
involved in classical communication and entanglement distillation. Specifically,
we establish the hypothesis testing coherent information as an upper bound. This
leads to the coherent information (hence regularized coherent information) weak
converse upper bound in the asymptotic setting. To obtain a lower bound, we
used the results of Chapter 13 on entanglement distillation. We found that we
could take the entanglement distillation protocol developed in that chapter and
convert it to a suitable quantum communication protocol. We proved that this
lower bound is optimal when applied to the asymptotic setting, in the sense that
it leads to the coherent information (hence regularized coherent information) as
an achievable rate, which matches the upper bound. For degradable channels, we
showed that the coherent information is additive, meaning that 𝑄(N) = 𝐼 𝑐 (N) for
all degradable channels. We also showed that anti-degradable channels have zero
quantum capacity.
With the goal of obtaining tractable estimates of quantum capacity for general
channels, we found that the Rains information 𝑅(N) of N is a strong converse
upper bound on the quantum capacity of N. This allowed us to conclude that the
quantum capacity of the generalized dephasing channel is equal to its coherent
information, because its Rains information and coherent information coincide.
We also looked ahead to Chapter 19 and concluded from the results there that
the squashed entanglement of a quantum channel is an upper bound on quantum
capacity.
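\begin{align}
p^{(\text{ET})}_{\text{err}}(\mathcal{E}, \mathcal{D}; \mathcal{N}) &:= 1 - \langle\Phi|_{RB'}\, \omega_{RB'}\, |\Phi\rangle_{RB'} \tag{14.A.2} \\
&= 1 - F_e(\mathcal{D} \circ \mathcal{N} \circ \mathcal{E}), \tag{14.A.3}
\end{align}
\begin{align}
p^{(\text{EG})}_{\text{err}}(\Psi_{A'A}, \mathcal{D}; \mathcal{N}) &:= 1 - \langle\Phi|_{A'B'}\, \sigma_{A'B'}\, |\Phi\rangle_{A'B'} \tag{14.A.9} \\
&= 1 - F(\Phi_{A'B'}, \sigma_{A'B'}). \tag{14.A.10}
\end{align}
We call the protocol $(d, \Psi_{A'A}, \mathcal{D}_{B\to B'})$ a $(d, \varepsilon)$ protocol, with $\varepsilon \in [0, 1]$, if
$p^{(\text{EG})}_{\text{err}}(\Psi_{A'A}, \mathcal{D}; \mathcal{N}) \leq \varepsilon$.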
Now, let the system $R \equiv \tilde{A}$ belong to Alice, and let $\Psi_{\tilde{A}A} := \mathcal{E}_{A'\to A}(\Phi_{\tilde{A}A'})$. Then,
is close in fidelity to the initial state. The state transmission error of the
protocol is
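\begin{equation}
p^{(\text{ST})}_{\text{err}}(\mathcal{E}, \mathcal{D}; \mathcal{N}) := 1 - \min_{\psi} \langle\psi|\mathcal{N}(\psi)|\psi\rangle = 1 - F_{\min}(\mathcal{D} \circ \mathcal{N} \circ \mathcal{E}), \tag{14.A.14}
\end{equation}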
Remark: An alternative way to define the error criterion for a subspace transmission code
would be to use the average fidelity, defined in (6.4.3); please consult the Bibliographic Notes in
Section 14.5.
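Given a $(d, \varepsilon)$ quantum communication protocol for the channel $\mathcal{N}$ with the
elements $(d, \mathcal{E}, \mathcal{D})$, the equality $p^{*}_{\text{err}}(\mathcal{E}, \mathcal{D}; \mathcal{N}) = 1 - F(\mathcal{D} \circ \mathcal{N} \circ \mathcal{E})$ holds, where
$F(\cdot)$ is the channel fidelity defined in (6.4.6). Then, restricting the optimization
in $F(\mathcal{D} \circ \mathcal{N} \circ \mathcal{E})$ to pure states $\Psi_{RA'} = |\Psi\rangle\langle\Psi|_{RA'}$ such that $|\Psi\rangle_{RA'} = |\phi\rangle_R \otimes |\psi\rangle_{A'}$,
we obtain
\begin{align}
p^{(\text{ST})}_{\text{err}}(\mathcal{E}, \mathcal{D}; \mathcal{N})
&= 1 - F_{\min}(\mathcal{D} \circ \mathcal{N} \circ \mathcal{E}) \tag{14.A.15} \\
&= 1 - \min_{|\psi\rangle} \langle\psi|\mathcal{N}(\psi)|\psi\rangle \tag{14.A.16} \\
&= 1 - \min_{|\phi\rangle, |\psi\rangle} (\langle\phi|_R \otimes \langle\psi|_{A'}) (\phi_R \otimes \mathcal{N}(\psi_{A'})) (|\phi\rangle_R \otimes |\psi\rangle_{A'}) \tag{14.A.17} \\
&\leq 1 - F(\mathcal{D} \circ \mathcal{N} \circ \mathcal{E}) \tag{14.A.18} \\
&\leq \varepsilon. \tag{14.A.19}
\end{align}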
Chapter 15
Chapter 15: Secret Key Distillation
The goal of a secret-key distillation protocol is for Alice and Bob to transform the
initial state 𝜓 𝐴𝐵𝐸 , by means of local operations and public classical communication,
to a state that approximates an ideal key state of the form in (15.0.1).
A secret key is useful in a communication task called the one-time pad protocol
(also known as the Vernam cipher). In this protocol, we suppose that Alice has a
message $m \in \{0, \ldots, K-1\}$ that she would like to send to Bob. By making use of
the key, Alice can calculate $\tilde{m} := m \oplus i$, where $i$ is the key value and the addition is
modulo $K$, and then send the encrypted message $\tilde{m}$ over a public classical channel.
Since the key is ideal, no one else besides Alice and Bob knows the precise key
value $i$, and the encrypted message $\tilde{m}$ is uniformly random, which means that it is
hard to guess (i.e., there is a $1/K$ chance that an eavesdropper could guess it, which
becomes small as $K$ becomes large). When Bob receives the encrypted message
$\tilde{m}$, he can calculate $m = \tilde{m} \ominus i$ and decrypt the message $m$ because he knows the
key value $i$. This is one of the main uses of a secret key and in turn why we are
interested in secret key distillation.
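
The one-time pad described above can be sketched in a few lines of Python. This is only an illustration: the function names `otp_encrypt` and `otp_decrypt`, the key size `K = 256`, and the message value are assumptions made for the example, and the key is drawn locally with the standard-library `secrets` module rather than distilled from a quantum state.

```python
import secrets


def otp_encrypt(message: int, key: int, K: int) -> int:
    """Encrypt m as (m + i) mod K, the modular addition used in the text."""
    return (message + key) % K


def otp_decrypt(ciphertext: int, key: int, K: int) -> int:
    """Decrypt by subtracting the key value modulo K."""
    return (ciphertext - key) % K


K = 256                        # number of key values (illustrative choice)
key = secrets.randbelow(K)     # uniformly random key value i, known to Alice and Bob only
message = 42                   # Alice's message m in {0, ..., K-1}

ciphertext = otp_encrypt(message, key, K)    # sent over the public channel
recovered = otp_decrypt(ciphertext, key, K)  # Bob's decryption
```

For a fixed message, running over all `K` key values produces every ciphertext in `{0, ..., K-1}` exactly once, which is why a uniformly random key makes the ciphertext uniformly random.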
It turns out that there are strong connections between entanglement distillation
from Chapter 13 and secret key distillation. They are not precisely the same tasks
but there are strong links, and the structure of this chapter follows the structure of
Chapter 13 quite closely. The main reason for the strong connection is that the
maximally entangled state $\Phi_{AB} = \frac{1}{K} \sum_{i,j=0}^{K-1} |i\rangle\langle j|_A \otimes |i\rangle\langle j|_B$ can be used to generate
an ideal key state. To see this, consider that the state Φ 𝐴𝐵 is unextendible, so that
the only possible extension of it is a tensor-product extension of the form Φ 𝐴𝐵 ⊗ 𝜎𝐸 .
Then, if Alice and Bob perform local measurement channels on their systems 𝐴 and
𝐵, with respect to the computational basis, they can realize the ideal tripartite key
state of the form in (15.0.1). Thus, if one can generate maximally entangled states,
then one can generate key states. However, the converse is not true in general, and
this is what distinguishes secret key distillation from entanglement distillation.
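
As a small numerical sanity check of the claim that measuring a maximally entangled state yields perfectly correlated, uniformly random key values, the following sketch dephases both systems in the computational basis. It assumes NumPy, uses an illustrative key size `K = 3`, and only treats the marginal state on $AB$ (Eve's product system $\sigma_E$ is left out, since it is unaffected by the local measurements).

```python
import numpy as np

K = 3  # illustrative key size

# State vector |Phi> = (1/sqrt(K)) sum_i |i>_A |i>_B, with index i*K + j for |i>_A |j>_B.
phi_vec = np.eye(K).reshape(-1) / np.sqrt(K)
Phi = np.outer(phi_vec, phi_vec)


def measure_both(rho: np.ndarray) -> np.ndarray:
    """Computational-basis measurement channels on A and B.

    Dephasing both systems in the product basis keeps exactly the
    diagonal entries of the density matrix and removes the rest.
    """
    return np.diag(np.diag(rho))


measured = measure_both(Phi)

# Maximally classically correlated state (1/K) sum_i |i><i|_A (x) |i><i|_B.
target = np.diag(np.eye(K).reshape(-1)) / K
```

Here `measured` coincides with `target` up to floating-point error, which is the $AB$ part of the ideal key state.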
Similar to what we have done in previous chapters, here we establish lower and
upper bounds on the number of secret key bits that can be distilled from a bipartite
state 𝜌 𝐴𝐵 . The lower bounds are given in terms of the private information of the
state, and the upper bounds are given in terms of not only the private information but
also the squashed entanglement and the relative entropy of entanglement. The fact
that we can use entanglement measures as bounds further highlights the connection
between secret key distillation and entanglement.
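for some state $\sigma_E$ and where $\Phi_{AB}$ is the maximally classically correlated state
\begin{equation}
\Phi_{AB} := \frac{1}{K} \sum_{i=0}^{K-1} |i\rangle\langle i|_A \otimes |i\rangle\langle i|_B. \tag{15.1.5}
\end{equation}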
As stated in Definition 15.1, the defining aspect of an ideal tripartite key state is
that the systems 𝐴 and 𝐵 of Alice and Bob are perfectly correlated and uniformly
random. This property makes the actual key value, which ends up being observed
by both Alice and Bob, hard to guess if there are many key values. Furthermore,
the fact that the overall state is such that it is tensor product between 𝐴𝐵 and 𝐸
implies that Eve’s system cannot provide any help at all in guessing the key value.
With the notions above in place, we can now formally define a secret-key
distillation protocol. Such a protocol for the state 𝜌 𝐴𝐵 is defined by the pair
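$(K, \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z})$, where $K \in \mathbb{N}$ and $\mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}$ is an LOPC channel as defined
in (15.1.2), with $d_{K_A} = d_{K_B} = K$. The key distillation error $p_{\text{err}}(\mathcal{L}^{\leftrightarrow}; \rho_{AB})$ of the
protocol is given by the infidelity, defined as
\begin{equation}
p_{\text{err}}(\mathcal{L}^{\leftrightarrow}; \rho_{AB}) := \inf_{\gamma_{K_A K_B E Z}} \left[ 1 - F\!\left(\gamma_{K_A K_B E Z}, \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}(\psi_{ABE})\right) \right], \tag{15.1.6}
\end{equation}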
Given 𝜀 ∈ [0, 1], the largest number log2 𝐾 of 𝜀-approximate secret-key bits
that can be extracted from a state 𝜌 𝐴𝐵 among all (𝐾, 𝜀) secret-key distillation
protocols is called the one-shot 𝜀-distillable key of 𝜌 𝐴𝐵 .
An important insight for secret key distillation is that there is a way to describe
the whole theory exclusively in terms of a bipartite scenario. This is related to
the assumption that the eavesdropper Eve possesses a full purification 𝜓 𝐴𝐵𝐸 of the
original state 𝜌 𝐴𝐵 , along with the structure of quantum mechanics.
To motivate this concept, consider that an approximate tripartite state 𝛾 𝐴𝐵𝐸 (as
described in Definition 15.1) is generated at the end of a key distillation protocol,
and it is such that all that the eavesdropper possesses is only available in the system
𝐸 (in this context, let us make the same identifications 𝐾 𝐴 ↔ 𝐴, 𝐾 𝐵 ↔ 𝐵, and
𝐸 𝑍 ↔ 𝐸 discussed around (15.1.6)). As such, we can consider a purification of
the state 𝛾 𝐴𝐵𝐸 of the form 𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 , in which the joint system 𝐴′ 𝐵′ constitutes
the purifying system. Since a secret-key distillation protocol involves only three
parties, and we already argued that the system 𝐸 is all that Eve possesses, it follows
that Alice and Bob jointly possess the purifying system, which can be split among
them as 𝐴′ 𝐵′. The reduced state 𝛾 𝐴𝐴′ 𝐵𝐵′ = Tr𝐸 [𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 ] is then a bipartite state
because all systems involved are in possession of Alice and Bob. If the original
state 𝛾 𝐴𝐵𝐸 is a tripartite key state according to Definition 15.1, then by constructing
𝛾 𝐴𝐴′ 𝐵𝐵′ according to this procedure, the resulting state is called a bipartite private
state, and it has a particular structure. Conversely, if 𝛾 𝐴𝐴′ 𝐵𝐵′ is a state with the
structure of a bipartite private state, then it follows that by purifying this state
to 𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 with an 𝐸 system and tracing over systems 𝐴′ and 𝐵′, we arrive at
a tripartite key state. So there is an equivalence between these two viewpoints
(tripartite picture of key distillation and bipartite picture of private state distillation).
We develop this correspondence in detail in what follows.
Before starting, we briefly mention that the equivalence between the tripartite
and bipartite pictures of key distillation implies that we can bring the tools of
entanglement theory (Chapter 9) to bear on the problem of establishing upper
bounds on the number of approximate secret-key bits that can be generated in a key
distillation protocol. This is one of the main applications of this correspondence,
and we note here that it has led to other insights in quantum information theory.
Theorem 15.5
A state 𝛾 𝐴𝐵𝐴′ 𝐵′ is a bipartite private state if and only if it has the following
form:
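\begin{equation}
\gamma_{ABA'B'} = U_{ABA'B'} (\Phi_{AB} \otimes \theta_{A'B'}) U_{ABA'B'}^{\dagger}, \tag{15.1.10}
\end{equation}
where $\Phi_{AB}$ is a maximally entangled state of Schmidt rank $K$:
\begin{equation}
\Phi_{AB} := \frac{1}{K} \sum_{i,j=0}^{K-1} |i\rangle\langle j|_A \otimes |i\rangle\langle j|_B, \tag{15.1.11}
\end{equation}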
In the above, $U^{ij}_{A'B'}$ is a unitary operator for all $i, j \in \{0, \ldots, K-1\}$.
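
The following NumPy sketch illustrates why a state of this form leaves an eavesdropper uninformed. It assumes a twisting unitary of the controlled form $\sum_{i,j} |i\rangle\langle i|_A \otimes |j\rangle\langle j|_B \otimes U^{ij}_{A'B'}$ and a purification $|\psi^{\theta}\rangle_{A'B'E}$ of the shield state; the dimensions, the random seed, and the helper `random_unitary` are illustrative assumptions. Since only the blocks $U^{kk}$ act on $\frac{1}{\sqrt{K}}\sum_k |k\rangle_A |k\rangle_B$, the code computes Eve's state conditioned on each key value $k$ and compares it with her state in the absence of twisting.

```python
import numpy as np

rng = np.random.default_rng(7)
K, s, e = 3, 4, 4  # key size, joint shield dimension of A'B', dimension of E (illustrative)


def random_unitary(n: int) -> np.ndarray:
    """Unitary obtained from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q


# |psi^theta>_{A'B'E} stored as an s x e coefficient matrix (rows: shield, columns: E).
psi = rng.normal(size=(s, e)) + 1j * rng.normal(size=(s, e))
psi /= np.linalg.norm(psi)

# Eve's reduced state without twisting: sigma_E = Tr_{A'B'}[psi^theta].
sigma_E = psi.T @ psi.conj()

# For key value k, the shield is rotated by U^{kk}; trace out the shield afterwards.
eve_states = []
for k in range(K):
    twisted = random_unitary(s) @ psi               # U^{kk} applied to the shield
    eve_states.append(twisted.T @ twisted.conj())   # Tr_{A'B'} of the twisted state
```

Every entry of `eve_states` equals `sigma_E`, because a unitary on the shield cancels under the partial trace over $A'B'$; Eve's system therefore carries no information about the key value.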
Proof: Suppose that 𝛾 𝐴𝐵𝐴′ 𝐵′ has the form in (15.1.10). A particular purification
of 𝛾 𝐴𝐵𝐴′ 𝐵′ is
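\begin{align}
|\phi^{\gamma}\rangle_{ABA'B'E}
&= U_{ABA'B'} \left( |\Phi\rangle_{AB} \otimes |\psi^{\theta}\rangle_{A'B'E} \right) \tag{15.1.13} \\
&= \left( \sum_{i,j=0}^{K-1} |i\rangle\langle i|_A \otimes |j\rangle\langle j|_B \otimes U^{ij}_{A'B'} \right) \left( \frac{1}{\sqrt{K}} \sum_{k=0}^{K-1} |k\rangle_A |k\rangle_B \otimes |\psi^{\theta}\rangle_{A'B'E} \right) \tag{15.1.14} \\
&= \frac{1}{\sqrt{K}} \sum_{k=0}^{K-1} |k\rangle_A |k\rangle_B \otimes U^{kk}_{A'B'} |\psi^{\theta}\rangle_{A'B'E}, \tag{15.1.15}
\end{align}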
Thus, the particular purification |𝜙 𝛾 ⟩ 𝐴𝐵𝐴′ 𝐵′ 𝐸 leads to a tripartite key state on systems
𝐴𝐵𝐸. Now, in the development above, we chose a particular purification of 𝛾 𝐴𝐵𝐴′ 𝐵′ .
However, given that all purifications are related by isometries acting on the purifying
system, every purification can be written as 𝑉𝐸→𝐸 ′ |𝜙 𝛾 ⟩ 𝐴𝐵𝐴′ 𝐵′ 𝐸 for some isometry
𝑉𝐸→𝐸 ′ . Then repeating the calculation above gives that the reduced state on 𝐴𝐵𝐸 ′
after local dephasing channels on 𝐴 and 𝐵 is
for some states |𝜙𝑖, 𝑗 ⟩ 𝐴′ 𝐵′ 𝐸 and probability amplitudes {𝛼𝑖, 𝑗 }𝑖, 𝑗 . However, in order
for the measurement outcomes of Alice and Bob to be perfectly correlated and
uniformly random, it is necessary that
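\begin{equation}
|\alpha_{i,j}|^2 = \begin{cases} \frac{1}{K} & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}. \tag{15.1.22}
\end{equation}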
(Any other values for the amplitudes 𝛼𝑖, 𝑗 would lead to a different distribution
upon measurement of the 𝐴 and 𝐵 systems.) So the global state should have the
following form:
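\begin{equation}
|\phi^{\gamma}\rangle_{ABA'B'E} = \sum_{i=0}^{K-1} \frac{1}{\sqrt{K}} |i\rangle_A |i\rangle_B\, e^{i\varphi_i} |\phi_{i,i}\rangle_{A'B'E}. \tag{15.1.23}
\end{equation}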
In order for the reduced density operator on 𝐸 to be independent of the measurement
outcomes of Alice and Bob, it is necessary for it to be a fixed state with no dependence
on 𝑖:
Tr 𝐴′ 𝐵′ [|𝜙𝑖,𝑖 ⟩⟨𝜙𝑖,𝑖 | 𝐴′ 𝐵′ 𝐸 ] = 𝜎𝐸 . (15.1.24)
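In such a case, all of the states $|\phi_{i,i}\rangle_{A'B'E}$ are purifications of the same state $\sigma_E$,
so that there exists a unitary $U^{i}_{A'B'}$ relating each $|\phi_{i,i}\rangle_{A'B'E}$ to a fixed purification
$|\phi^{\sigma}\rangle_{A'B'E}$ of $\sigma_E$:
\begin{equation}
e^{i\varphi_i} |\phi_{i,i}\rangle_{A'B'E} = U^{i}_{A'B'} |\phi^{\sigma}\rangle_{A'B'E}. \tag{15.1.25}
\end{equation}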
Proposition 15.7
If 𝜌 𝐴𝐵𝐴′ 𝐵′ is an 𝜀-approximate bipartite key state with 𝐾 key values, then the
state $\rho_{ABE}$ is an $\varepsilon$-approximate tripartite key state with $K$ key values, where
$\rho_{ABE} = \operatorname{Tr}_{A'B'}[\psi^{\rho}_{ABA'B'E}]$ and $\psi^{\rho}_{ABA'B'E}$ is an arbitrary purification of $\rho_{ABA'B'}$.
The converse statement is true as well.
Proof: Suppose that the inequality in (15.1.28) is satisfied. Let $\psi^{\rho}_{ABA'B'E}$ be a
purification of $\rho_{ABE}$. Then by applying Uhlmann’s theorem (Theorem 6.8), there
The equivalence between ideal and approximate tripartite key states and bipartite
private states extends further, and it is a correspondence that allows us to consider
secret key distillation in the bipartite picture. To this end, we define a bipartite
private-state distillation protocol, and then we prove the equivalence.
A bipartite private-state distillation protocol for the state $\rho_{AB}$ is defined by the
pair $(K, \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B A'B'})$, where $K \in \mathbb{N}$ and $\mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B A'B'}$ is an LOCC channel
We now establish the main result of this section, which is the equivalence of
tripartite key distillation and bipartite private-state distillation:
Theorem 15.9
Let 𝐾 ∈ N and 𝜀 ∈ [0, 1]. Let 𝜌 𝐴𝐵 be a bipartite state. There exists a (𝐾, 𝜀)
tripartite key distillation protocol for 𝜌 𝐴𝐵 if and only if there exists a (𝐾, 𝜀)
bipartite private-state distillation protocol for 𝜌 𝐴𝐵 .
Proof: We start by proving that there exists a (𝐾, 𝜀) bipartite private-state distilla-
tion protocol if there exists a (𝐾, 𝜀) tripartite key distillation protocol. Let 𝜓 𝐴𝐵𝐸
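be a purification of $\rho_{AB}$, let $\mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}$ be the LOPC channel realizing the key
distillation, and let $\gamma_{K_A K_B E Z}$ be a tripartite key state such that
\begin{equation}
1 - F\!\left(\gamma_{K_A K_B E Z}, \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}(\psi_{ABE})\right) \leq \varepsilon. \tag{15.1.35}
\end{equation}
An isometric extension $U^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z}$ of this LOPC channel is as follows:
\begin{equation}
U^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z} := \sum_{z\in\mathcal{Z}} V^{\mathcal{E}^z}_{A\to K_A A'} \otimes V^{\mathcal{F}^z}_{B\to K_B B'} \otimes |z\rangle_Z, \tag{15.1.37}
\end{equation}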
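where $\{V^{\mathcal{E}^z}_{A\to K_A A'}\}_{z\in\mathcal{Z}}$ and $\{V^{\mathcal{F}^z}_{B\to K_B B'}\}_{z\in\mathcal{Z}}$ are sets of linear operators such that
$U^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z}$ is an isometry and
\begin{equation}
\operatorname{Tr}_{A'B'} \circ\, \mathcal{U}^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z} = \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}, \tag{15.1.38}
\end{equation}
with
\begin{equation}
\mathcal{U}^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z}(\cdot) := U^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z} (\cdot) \left(U^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z}\right)^{\dagger}. \tag{15.1.39}
\end{equation}
To meet these requirements, note that it is necessary for each $V^{\mathcal{E}^z}_{A\to K_A A'}$ and $V^{\mathcal{F}^z}_{B\to K_B B'}$
to be a contraction, i.e., satisfying
\begin{equation}
\left\| V^{\mathcal{E}^z}_{A\to K_A A'} \right\|_{\infty},\ \left\| V^{\mathcal{F}^z}_{B\to K_B B'} \right\|_{\infty} \leq 1. \tag{15.1.40}
\end{equation}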
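It then follows that the state $\mathcal{U}^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z}(\psi_{ABE})$ purifies $\mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}(\psi_{ABE})$,
and by applying Uhlmann’s theorem (Theorem 6.8), there exists a pure state
$\gamma_{K_A K_B A'B'EZ}$ satisfying
\begin{equation}
F\!\left(\gamma_{K_A K_B E Z}, \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}(\psi_{ABE})\right)
= F\!\left(\gamma_{K_A K_B A'B'EZ}, \mathcal{U}^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z}(\psi_{ABE})\right). \tag{15.1.41}
\end{equation}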
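Now applying the same reasoning given in Proposition 15.7, we conclude that the
following inequality holds
\begin{equation}
1 - F\!\left(\gamma_{K_A K_B A'B'}, \left(\operatorname{Tr}_Z \circ\, \mathcal{U}^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z}\right)(\rho_{AB})\right) \leq \varepsilon, \tag{15.1.42}
\end{equation}
where $\gamma_{K_A K_B A'B'}$ is an ideal bipartite private state of size $K$. Note that the channel
$\operatorname{Tr}_Z \circ\, \mathcal{U}^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z}$ is an LOCC channel, because it has the following form:
\begin{equation}
\operatorname{Tr}_Z \circ\, \mathcal{U}^{\mathcal{L}^{\leftrightarrow}}_{AB\to K_A K_B A'B'Z} = \sum_{z\in\mathcal{Z}} \mathcal{V}^{\mathcal{E}^z}_{A\to K_A A'} \otimes \mathcal{V}^{\mathcal{F}^z}_{B\to K_B B'}. \tag{15.1.43}
\end{equation}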
Thus, there exists a (𝐾, 𝜀) bipartite private-state distillation protocol if there exists
a (𝐾, 𝜀) tripartite key distillation protocol.
We now prove the opposite implication. Suppose that there exists a (𝐾, 𝜀)
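bipartite private-state distillation protocol. Let $\mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B A'B'}$ be the LOCC channel
realizing the private-state distillation, and let $\gamma_{K_A K_B A'B'}$ be an ideal bipartite private
state satisfying
\begin{equation}
1 - F\!\left(\gamma_{K_A K_B A'B'}, \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B A'B'}(\rho_{AB})\right) \leq \varepsilon. \tag{15.1.44}
\end{equation}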
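where
\begin{align}
\mathcal{E}^z_{A\to K_A}(\cdot) &:= \operatorname{Tr}_{A'}\!\left[E^z_{A\to K_A A'} (\cdot) (E^z_{A\to K_A A'})^{\dagger}\right], \tag{15.1.53} \\
\mathcal{F}^z_{B\to K_B}(\cdot) &:= \operatorname{Tr}_{B'}\!\left[F^z_{B\to K_B B'} (\cdot) (F^z_{B\to K_B B'})^{\dagger}\right]. \tag{15.1.54}
\end{align}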
Thus, we have proven that the existence of a (𝐾, 𝜀) bipartite private-state distil-
lation protocol for 𝜌 𝐴𝐵 implies the existence of a (𝐾, 𝜀) tripartite key distillation
protocol. ■
\begin{align}
E_D(A;B)_{\rho} &\leq K_D(A;B)_{\rho}, \tag{15.1.56} \\
\widetilde{E}_D(A;B)_{\rho} &\leq \widetilde{K}_D(A;B)_{\rho}, \tag{15.1.57}
\end{align}
Our study of upper bounds on one-shot distillable key begins with the private
information and the following lemma.
Lemma 15.10
Let 𝐴 and 𝐵 be quantum systems with the same dimension 𝐾 ∈ N, let 𝐸 be
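another quantum system of arbitrary dimension, and let $\varepsilon \in (0, 1)$. Let $\omega_{ABE}$
be an $\varepsilon$-approximate tripartite key state of size $K$, as specified in Definition 15.6,
and let $\omega^{\mathcal{M}}_{ABE} := (\mathcal{M}_A \otimes \mathcal{M}_B)(\omega_{ABE})$, where $\mathcal{M}_A$ and $\mathcal{M}_B$ are the measurement
channels in Definition 15.6. Then the following inequality holds
\begin{equation}
\log_2 K \leq I_H^{\sqrt{\varepsilon}+\delta}(A;B)_{\omega^{\mathcal{M}}} - I_{\max}^{\sqrt{\varepsilon}}(A;E)_{\omega^{\mathcal{M}}} + \log_2\!\left(\frac{1}{\delta}\right), \tag{15.1.58}
\end{equation}
where $\delta \in (0, 1 - \sqrt{\varepsilon})$ and
\begin{equation}
I_{\max}^{\sqrt{\varepsilon}}(A;E)_{\omega^{\mathcal{M}}} := \inf_{\widetilde{\omega}_{AE} :\, P(\widetilde{\omega}_{AE}, \omega^{\mathcal{M}}_{AE}) \leq \sqrt{\varepsilon}}\ \inf_{\tau_E} D_{\max}(\widetilde{\omega}_{AE} \| \widetilde{\omega}_A \otimes \tau_E). \tag{15.1.59}
\end{equation}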
Proof: Consider that the following condition holds from Definition 15.6:
𝐹 (𝛾 𝐴𝐵𝐸 , 𝜔 𝐴𝐵𝐸 ) ≥ 1 − 𝜀, (15.1.60)
where 𝛾 𝐴𝐵𝐸 is an ideal tripartite key state. Applying the measurement channels
M 𝐴 and M𝐵 from Definition 15.1 and the data-processing inequality for fidelity,
we conclude that
𝐹 (Φ 𝐴𝐵 ⊗ 𝜎𝐸 , (M 𝐴 ⊗ M𝐵 )(𝜔 𝐴𝐵𝐸 )) ≥ 1 − 𝜀. (15.1.61)
Now tracing over system 𝐸 and again applying the data-processing inequality for
fidelity, we conclude that
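\begin{equation}
F(\Phi_{AB}, \omega^{\mathcal{M}}_{AB}) = F(\Phi_{AB}, (\mathcal{M}_A \otimes \mathcal{M}_B)(\omega_{AB})) \geq 1 - \varepsilon. \tag{15.1.62}
\end{equation}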
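\begin{equation}
\omega^{\mathcal{M}}_{AB} = \sum_{i,j=0}^{K-1} p(i) q(j|i)\, |i\rangle\langle i|_A \otimes |j\rangle\langle j|_B, \tag{15.1.63}
\end{equation}
\begin{equation}
\Pi_{AB} := \sum_{i=0}^{K-1} |i\rangle\langle i|_A \otimes |i\rangle\langle i|_B, \tag{15.1.64}
\end{equation}
\begin{align}
\operatorname{Tr}[\Pi_{AB}\, \omega^{\mathcal{M}}_{AB}]
&= \operatorname{Tr}\!\left[ \left( \sum_{i'=0}^{K-1} |i'\rangle\langle i'|_A \otimes |i'\rangle\langle i'|_B \right) \left( \sum_{i,j=0}^{K-1} p(i) q(j|i)\, |i\rangle\langle i|_A \otimes |j\rangle\langle j|_B \right) \right] \tag{15.1.65} \\
&= \sum_{i=0}^{K-1} p(i) q(i|i) \tag{15.1.66}
\end{align}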
which outputs a classical flag register indicating if the comparator test is successful
or not. Consider that
T 𝐴𝐵 (Φ 𝐴𝐵 ) = |1⟩⟨1|, (15.1.68)
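\begin{equation}
\mathcal{T}_{AB}(\omega^{\mathcal{M}}_{AB}) = \left( \sum_{i=0}^{K-1} p(i) q(i|i) \right) |1\rangle\langle 1| + \left( 1 - \sum_{i=0}^{K-1} p(i) q(i|i) \right) |0\rangle\langle 0|. \tag{15.1.69}
\end{equation}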
Employing the data-processing inequality for the fidelity and the findings above,
we conclude that
\begin{align}
1 - \varepsilon &\leq F(\Phi_{AB}, \omega^{\mathcal{M}}_{AB}) \tag{15.1.70} \\
&\leq F(\mathcal{T}_{AB}(\Phi_{AB}), \mathcal{T}_{AB}(\omega^{\mathcal{M}}_{AB})) \tag{15.1.71} \\
&= \sum_{i=0}^{K-1} p(i) q(i|i). \tag{15.1.72}
\end{align}
Thus, we conclude that the probability of passing the comparator test satisfies
\begin{equation}
\operatorname{Tr}[\Pi_{AB}\, \omega^{\mathcal{M}}_{AB}] \geq 1 - \varepsilon. \tag{15.1.73}
\end{equation}
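
The comparator-test computation can be checked numerically. The sketch below assumes NumPy and an arbitrary randomly generated joint distribution $p(i)q(j|i)$; the variable names and the choice `K = 4` are illustrative. It builds the measured state as a diagonal matrix in the product basis, builds the comparator projector, and compares the trace with the closed form $\sum_i p(i) q(i|i)$.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4  # illustrative key size

p = rng.dirichlet(np.ones(K))            # p[i] = p(i)
q = rng.dirichlet(np.ones(K), size=K)    # q[i, j] = q(j|i), each row sums to one

# Measured state: diagonal in the product basis, entry p(i) q(j|i) at index i*K + j.
omega = np.diag((p[:, None] * q).reshape(-1))

# Comparator projector sum_i |i><i|_A (x) |i><i|_B: ones exactly where i == j.
Pi = np.diag(np.eye(K).reshape(-1))

pass_prob = np.trace(Pi @ omega)                          # Tr[Pi omega]
closed_form = sum(p[i] * q[i, i] for i in range(K))       # sum_i p(i) q(i|i)
```

The two numbers agree up to floating-point error; for an $\varepsilon$-approximate key state the text shows that this passing probability is at least $1 - \varepsilon$.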
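\begin{equation}
\operatorname{Tr}[\Pi^{\delta}_A \Phi_A] \geq 1 - \delta. \tag{15.1.77}
\end{equation}
\begin{align}
\operatorname{Tr}[\Pi^{\delta}_A \Pi_{AB} \Pi^{\delta}_A (\omega^{\mathcal{M}}_A \otimes \sigma_B)]
&= \operatorname{Tr}[\Pi_{AB} (\Pi^{\delta}_A \omega^{\mathcal{M}}_A \Pi^{\delta}_A \otimes \sigma_B)] \tag{15.1.78} \\
&\leq \frac{1}{\delta} \operatorname{Tr}[\Pi_{AB} (\Pi^{\delta}_A \Phi_A \Pi^{\delta}_A \otimes \sigma_B)] \tag{15.1.79} \\
&\leq \frac{1}{\delta} \operatorname{Tr}[\Pi_{AB} (\Phi_A \otimes \sigma_B)] \tag{15.1.80} \\
&= \frac{1}{\delta K} \operatorname{Tr}[\Pi_{AB} (I_A \otimes \sigma_B)] \tag{15.1.81} \\
&= \frac{1}{\delta K}, \tag{15.1.82}
\end{align}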
where the second inequality follows because Π 𝛿𝐴 and Φ 𝐴 commute. Then consider
that
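\begin{align}
&\operatorname{Tr}[(I_{AB} - \Pi^{\delta}_A \Pi_{AB} \Pi^{\delta}_A)\, \omega^{\mathcal{M}}_{AB}] \notag \\
&\leq \operatorname{Tr}[(I_{AB} - \Pi^{\delta}_A \Pi_{AB} \Pi^{\delta}_A)\, \Phi_{AB}] + \frac{1}{2} \left\| \Phi_{AB} - \omega^{\mathcal{M}}_{AB} \right\|_1 \tag{15.1.83} \\
&\leq \operatorname{Tr}[(I_{AB} - \Pi_{AB}) \Phi_{AB}] + \operatorname{Tr}[(I_{AB} - \Pi^{\delta}_A \otimes I_B) \Phi_{AB}] + \frac{1}{2} \left\| \Phi_{AB} - \omega^{\mathcal{M}}_{AB} \right\|_1 \tag{15.1.84} \\
&= \operatorname{Tr}[(I_A - \Pi^{\delta}_A) \Phi_A] + \frac{1}{2} \left\| \Phi_{AB} - \omega^{\mathcal{M}}_{AB} \right\|_1 \tag{15.1.85} \\
&\leq \delta + \sqrt{1 - F(\Phi_{AB}, \omega^{\mathcal{M}}_{AB})} \tag{15.1.86} \\
&\leq \delta + \sqrt{\varepsilon}. \tag{15.1.87}
\end{align}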
The first inequality is a consequence of the variational characterization of the
normalized trace distance from Theorem 6.1. The second inequality is a consequence
of the following union bound for commuting projectors 𝑃 and 𝑄:
𝐼 − 𝑃𝑄𝑃 ≤ 𝐼 − 𝑃 + 𝐼 − 𝑄, (15.1.88)
which in turn follows from (𝐼 − 𝑃) (𝐼 − 𝑄) ≥ 0. The third inequality follows from
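Theorem 6.14, and the last from (15.1.62). As such, the measurement operator
$\Pi^{\delta}_A \Pi_{AB} \Pi^{\delta}_A$ is a particular measurement operator satisfying the constraints given in
the optimization for the hypothesis testing relative entropy $D_H^{\sqrt{\varepsilon}+\delta}(\omega^{\mathcal{M}}_{AB} \| \omega^{\mathcal{M}}_A \otimes \sigma_B)$,
and we thus conclude that
\begin{align}
\log_2 \delta + \log_2 K &= \log_2 \delta K \tag{15.1.89} \\
&\leq -\log_2 \operatorname{Tr}[\Pi^{\delta}_A \Pi_{AB} \Pi^{\delta}_A (\omega^{\mathcal{M}}_A \otimes \sigma_B)] \tag{15.1.90} \\
&\leq D_H^{\sqrt{\varepsilon}+\delta}(\omega^{\mathcal{M}}_{AB} \| \omega^{\mathcal{M}}_A \otimes \sigma_B). \tag{15.1.91}
\end{align}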
Since the bound holds for every state 𝜎𝐵 , we conclude that
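\begin{equation}
\log_2 K \leq I_H^{\sqrt{\varepsilon}+\delta}(A;B)_{\omega^{\mathcal{M}}} + \log_2\!\left(\frac{1}{\delta}\right). \tag{15.1.92}
\end{equation}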
Then
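\begin{align}
I_{\max}^{\sqrt{\varepsilon}}(A;E)_{\omega^{\mathcal{M}}}
&= \inf_{\widetilde{\omega}_{AE} :\, P(\widetilde{\omega}_{AE}, \omega^{\mathcal{M}}_{AE}) \leq \sqrt{\varepsilon}}\ \inf_{\tau_E} D_{\max}(\widetilde{\omega}_{AE} \| \widetilde{\omega}_A \otimes \tau_E) \tag{15.1.96} \\
&\leq D_{\max}(\Phi_A \otimes \sigma_E \| \Phi_A \otimes \sigma_E) \tag{15.1.97} \\
&= 0, \tag{15.1.98}
\end{align}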
Note that the result of Lemma 15.10 is general and applies to every tripartite
state that is close in fidelity to an ideal tripartite key state. Applying it to the state
$\omega_{K_A K_B E Z} = \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}(\psi_{ABE})$ that is the final output of a $(K, \varepsilon)$ tripartite key
distillation protocol for a state 𝜌 𝐴𝐵 with purification 𝜓 𝐴𝐵𝐸 , we obtain the following
result:
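Proof: For a $(K, \varepsilon)$ tripartite key distillation protocol $(K, \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z})$ for $\psi_{ABE}$,
by definition the state $\omega_{K_A K_B E Z} = \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}(\psi_{ABE})$ satisfies
\begin{equation}
F(\gamma_{K_A K_B E Z}, \omega_{K_A K_B E Z}) \geq 1 - \varepsilon, \tag{15.1.101}
\end{equation}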
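where $\gamma_{K_A K_B E Z}$ is an ideal tripartite key state. Upon performing the local measurements
$\mathcal{M}_{K_A}$ and $\mathcal{M}_{K_B}$ mentioned in Definition 15.1, we conclude that
Set
\begin{equation}
\omega^{\mathcal{M}}_{K_A K_B E Z} := (\mathcal{M}_{K_A} \otimes \mathcal{M}_{K_B})(\omega_{K_A K_B E Z}). \tag{15.1.103}
\end{equation}
Therefore, using (15.1.58), we conclude that
\begin{equation}
\log_2 K \leq I_H^{\sqrt{\varepsilon}+\delta}(K_A;K_B)_{\omega^{\mathcal{M}}} - I_{\max}^{\sqrt{\varepsilon}}(K_A;EZ)_{\omega^{\mathcal{M}}} + \log_2\!\left(\frac{1}{\delta}\right), \tag{15.1.104}
\end{equation}
where $\delta \in (0, 1 - \sqrt{\varepsilon})$. Since $(\mathcal{M}_{K_A} \otimes \mathcal{M}_{K_B}) \circ \mathcal{L}^{\leftrightarrow}_{AB\to K_A K_B Z}$ is a particular LOPC
channel of the form $\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}$, with $X$ and $Z$ classical systems, we conclude that
\begin{multline}
I_H^{\sqrt{\varepsilon}+\delta}(K_A;K_B)_{\omega^{\mathcal{M}}} - I_{\max}^{\sqrt{\varepsilon}}(K_A;EZ)_{\omega^{\mathcal{M}}} \\
\leq \sup_{\mathcal{L}} \left[ I_H^{\sqrt{\varepsilon}+\delta}(X;B')_{\mathcal{L}(\psi)} - I_{\max}^{\sqrt{\varepsilon}}(X;EZ)_{\mathcal{L}(\psi)} \right]. \tag{15.1.105}
\end{multline}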
access to the purifying system 𝐸 of a purification of 𝜎𝐴𝐵 , which in this case can be
chosen as follows:
Then Eve can measure the system 𝐸 and obtain an outcome 𝑥 ∈ X with probability
$p(x)$, and the resulting state of Alice and Bob is the product state $\psi^x_A \otimes \varphi^x_B$. Being
a product state, the resulting state $\psi^x_A \otimes \varphi^x_B$ of Alice and Bob has no correlation
whatsoever and so cannot be used to generate a secret key. If Alice and Bob attempt
to process this state using LOCC, the same problem arises. In the model of key
distillation that we assume, Eve gets a copy of all classical data exchanged between
Alice and Bob, and so the resulting state is still a product state and is useless for
generating a secret key. As such, all separable states are useless for key distillation.
The intuition above is useful for reasoning about key distillation, but there
is a way to make it precise by means of a construct called the “privacy test.” In
doing so, we exploit the equivalence between tripartite key distillation and bipartite
private-state distillation discussed in Section 15.1.2 and identified in Theorem 15.9.
The privacy test is analogous to the entanglement test used in Chapter 13, which
we used to establish upper bounds on the number of approximate ebits that can
be generated in an entanglement distillation protocol. Here we define a “privacy
test” as a method for testing whether a given bipartite state is private. It forms an
essential component in Proposition 15.15, which states that the 𝜀-relative entropy of
entanglement is an upper bound on the number of private bits in an 𝜀-approximate
bipartite private state.
If one has access to the systems 𝐴𝐵𝐴′ 𝐵′ of a bipartite state 𝜌 𝐴𝐵𝐴′ 𝐵′ and has a
description of 𝛾 𝐴𝐵𝐴′ 𝐵′ satisfying (15.1.29), then the 𝛾-privacy test decides whether
𝜌 𝐴𝐵𝐴′ 𝐵′ is a private state with respect to 𝛾 𝐴𝐵𝐴′ 𝐵′ . The first outcome corresponds to
the decision “yes, it is a 𝛾-private state,” and the second outcome corresponds to
“no.” Physically, this test is just untwisting the purported private state and projecting
onto a maximally entangled state. The following lemma states that the probability
for an 𝜀-approximate bipartite private state to pass the 𝛾-privacy test is not smaller
than 1 − 𝜀:
Lemma 15.13
Let 𝜀 ∈ [0, 1] and let 𝜌 𝐴𝐵𝐴′ 𝐵′ be an 𝜀-approximate private state as given in
Definition 15.6, with 𝛾 𝐴𝐵𝐴′ 𝐵′ satisfying (15.1.29). The probability for 𝜌 𝐴𝐵𝐴′ 𝐵′
to pass the 𝛾-privacy test is never smaller than 1 − 𝜀:
Proof: One can see this bound explicitly by inspecting the following steps:
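\begin{align}
\operatorname{Tr}[\Pi_{ABA'B'}\, \rho_{ABA'B'}]
&= \operatorname{Tr}[U_{ABA'B'} (\Phi_{AB} \otimes I_{A'B'}) U_{ABA'B'}^{\dagger}\, \rho_{ABA'B'}] \tag{15.1.112} \\
&= \operatorname{Tr}[(\Phi_{AB} \otimes I_{A'B'})\, U_{ABA'B'}^{\dagger}\, \rho_{ABA'B'}\, U_{ABA'B'}] \tag{15.1.113} \\
&= \langle\Phi|_{AB} \operatorname{Tr}_{A'B'}[U_{ABA'B'}^{\dagger}\, \rho_{ABA'B'}\, U_{ABA'B'}] |\Phi\rangle_{AB} \tag{15.1.114} \\
&= F(\Phi_{AB}, \operatorname{Tr}_{A'B'}[U_{ABA'B'}^{\dagger}\, \rho_{ABA'B'}\, U_{ABA'B'}]) \tag{15.1.115} \\
&\geq F(\Phi_{AB} \otimes \theta_{A'B'}, U_{ABA'B'}^{\dagger}\, \rho_{ABA'B'}\, U_{ABA'B'}) \tag{15.1.116} \\
&= F(U_{ABA'B'} (\Phi_{AB} \otimes \theta_{A'B'}) U_{ABA'B'}^{\dagger}, \rho_{ABA'B'}) \tag{15.1.117} \\
&= F(\gamma_{ABA'B'}, \rho_{ABA'B'}) \tag{15.1.118} \\
&\geq 1 - \varepsilon. \tag{15.1.119}
\end{align}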
The third equality follows because $\Phi_{AB}$ is pure and by applying the definition
of partial trace (over $A'B'$). The fourth equality follows from the expression
in (6.2.2), for the fidelity between a pure state and a mixed state. The first
inequality follows from the data-processing inequality for fidelity. The
second-to-last equality follows from the unitary invariance of the fidelity, and the last
equality follows because $\gamma_{ABA'B'}$ is an ideal private state, written as
$\gamma_{ABA'B'} = U_{ABA'B'} (\Phi_{AB} \otimes \theta_{A'B'}) U_{ABA'B'}^{\dagger}$. ■
On the other hand, a separable state 𝜎𝐴𝐵𝐴′ 𝐵′ ∈ SEP( 𝐴𝐴′ : 𝐵𝐵′) of the key and
shield systems has a small chance of passing an arbitrary 𝛾-privacy test:
Lemma 15.14
For a separable state 𝜎𝐴𝐵𝐴′ 𝐵′ ∈ SEP( 𝐴𝐴′ : 𝐵𝐵′), the probability of passing an
arbitrary $\gamma$-privacy test is not larger than $\frac{1}{K}$:
\begin{equation}
\operatorname{Tr}[\Pi_{ABA'B'}\, \sigma_{ABA'B'}] \leq \frac{1}{K}, \tag{15.1.120}
\end{equation}
where 𝐾 is the number of values that the secret key can take (i.e., 𝐾 = 𝑑 𝐴 = 𝑑 𝐵 ).
Proof: The idea is to begin by establishing the bound for an arbitrary pure product
state |𝜙⟩ 𝐴𝐴′ ⊗ |𝜑⟩𝐵𝐵′ , i.e., to show that
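\begin{equation}
\operatorname{Tr}[\Pi_{ABA'B'} (|\phi\rangle\langle\phi|_{AA'} \otimes |\varphi\rangle\langle\varphi|_{BB'})] \leq \frac{1}{K}. \tag{15.1.121}
\end{equation}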
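We can expand these states with respect to the standard bases of $A$ and $B$ as follows:
\begin{equation}
|\phi\rangle_{AA'} \otimes |\varphi\rangle_{BB'} = \left[ \sum_{i=1}^{K} \alpha_i |i\rangle_A \otimes |\phi_i\rangle_{A'} \right] \otimes \left[ \sum_{j=1}^{K} \beta_j |j\rangle_B \otimes |\varphi_j\rangle_{B'} \right], \tag{15.1.122}
\end{equation}
where $\sum_{i=1}^{K} |\alpha_i|^2 = \sum_{j=1}^{K} |\beta_j|^2 = 1$. We then find that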
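\begin{align}
&= \frac{1}{K} \left\| \sum_{i=1}^{K} \alpha_i \beta_i\, U^{ii\dagger}_{A'B'} |\phi_i\rangle_{A'} |\varphi_i\rangle_{B'} \right\|_2^2 \tag{15.1.127} \\
&= \frac{1}{K} \left\| \sum_{i=1}^{K} \alpha_i \beta_i |\xi_i\rangle_{A'B'} \right\|_2^2 \tag{15.1.128} \\
&= \frac{1}{K} \sum_{i,j=1}^{K} \alpha_i \beta_i \alpha_j^{*} \beta_j^{*} \langle\xi_j|\xi_i\rangle_{A'B'}, \tag{15.1.129}
\end{align}
where $|\xi_i\rangle_{A'B'} := (U^{ii}_{A'B'})^{\dagger} |\phi_i\rangle_{A'} |\varphi_i\rangle_{B'}$ is a quantum state. The desired bound in
(15.1.121) is then equivalent to
\begin{equation}
\sum_{i,j=1}^{K} \alpha_i \beta_i \alpha_j^{*} \beta_j^{*} \langle\xi_j|\xi_i\rangle_{A'B'} \leq 1. \tag{15.1.130}
\end{equation}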
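Setting $\alpha_i = \sqrt{p_i}\, e^{i\theta_i}$ and $\beta_i = \sqrt{q_i}\, e^{i\eta_i}$, we find that
\begin{align}
\sum_{i,j=1}^{K} \alpha_i \beta_i \alpha_j^{*} \beta_j^{*} \langle\xi_j|\xi_i\rangle_{A'B'}
&= \sum_{i,j=1}^{K} \sqrt{p_i q_i p_j q_j}\, e^{i(\theta_i + \eta_i - \theta_j - \eta_j)} \langle\xi_j|\xi_i\rangle_{A'B'} \tag{15.1.131} \\
&\leq \sum_{i,j=1}^{K} \sqrt{p_i q_i p_j q_j}\, \left| \langle\xi_j|\xi_i\rangle_{A'B'} \right| \tag{15.1.132} \\
&\leq \sum_{i,j=1}^{K} \sqrt{p_i q_i p_j q_j} \tag{15.1.133} \\
&= \left[ \sum_{i=1}^{K} \sqrt{p_i q_i} \right]^2 \leq 1, \tag{15.1.134}
\end{align}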
where the last inequality holds for all probability distributions (this is just the
statement that the classical fidelity cannot exceed one). The above reasoning thus
establishes (15.1.120) for pure product states, and the bound for general separable
states follows because every such state can be written as a convex combination of
pure product states. ■
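
The two lemmas can be probed numerically with a short NumPy sketch. It assumes, as in the proof of Lemma 15.13, that the privacy-test projector is $U_{ABA'B'}(\Phi_{AB} \otimes I_{A'B'})U_{ABA'B'}^{\dagger}$ with a twisting unitary of the controlled form $\sum_{i,j} |i\rangle\langle i|_A \otimes |j\rangle\langle j|_B \otimes U^{ij}_{A'B'}$. The dimensions, the random seed, and the helper functions are illustrative choices, and sampling random product states only probes the bound rather than proving it.

```python
import numpy as np

rng = np.random.default_rng(1)
K, d = 2, 2  # key dimension and dimension of each shield system A', B' (illustrative)


def random_unitary(n: int) -> np.ndarray:
    """Unitary obtained from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q


def random_state_vector(n: int) -> np.ndarray:
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)


# Twisting unitary with system order (A, B, A'B').
dim = K * K * d * d
U = np.zeros((dim, dim), dtype=complex)
for i in range(K):
    for j in range(K):
        P_i = np.zeros((K, K)); P_i[i, i] = 1.0
        P_j = np.zeros((K, K)); P_j[j, j] = 1.0
        U += np.kron(np.kron(P_i, P_j), random_unitary(d * d))

phi_vec = np.eye(K).reshape(-1) / np.sqrt(K)       # |Phi>_{AB}
Phi = np.outer(phi_vec, phi_vec.conj())
Pi_test = U @ np.kron(Phi, np.eye(d * d)) @ U.conj().T


def pass_probability(vec: np.ndarray) -> float:
    """Probability <vec| Pi |vec> that the pure state passes the privacy test."""
    return float(np.real(vec.conj() @ Pi_test @ vec))


# An ideal private state (pure shield for simplicity) passes with certainty.
gamma_vec = U @ np.kron(phi_vec, random_state_vector(d * d))
ideal_pass = pass_probability(gamma_vec)

# Random pure product states across the cut AA' : BB' never exceed 1/K.
max_product_pass = 0.0
for _ in range(200):
    a = random_state_vector(K * d).reshape(K, d)   # coefficients of |phi>_{AA'}
    b = random_state_vector(K * d).reshape(K, d)   # coefficients of |varphi>_{BB'}
    psi = np.einsum('ax,by->abxy', a, b).reshape(-1)  # reorder to (A, B, A', B')
    max_product_pass = max(max_product_pass, pass_probability(psi))
```

In this sketch `ideal_pass` equals one up to rounding, while `max_product_pass` stays at or below `1 / K`, in line with (15.1.120).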
Proposition 15.15
Fix 𝜀 ∈ [0, 1]. Let 𝜌 𝐴𝐵𝐴′ 𝐵′ be an 𝜀-approximate bipartite private state, as given
in Definition 15.6. Then the number log2 𝐾 of private bits in such a state is
bounded from above by the 𝜀-relative entropy of entanglement of 𝜌 𝐴𝐵𝐴′ 𝐵′ :
Proof: Let 𝜎𝐴𝐵𝐴′ 𝐵′ be an arbitrary separable state in SEP( 𝐴𝐴′ : 𝐵𝐵′). From
Definition 15.6 and Lemma 15.13, we conclude that the 𝛾-privacy test Π 𝐴𝐵𝐴′ 𝐵′
from (15.1.110) is a particular measurement operator satisfying the constraint
Tr[Π 𝐴𝐵𝐴′ 𝐵′ 𝜌 𝐴𝐵𝐴′ 𝐵′ ] ≥ 1 − 𝜀 for 𝛽𝜀 (𝜌 𝐴𝐵𝐴′ 𝐵′ ∥𝜎𝐴𝐵𝐴′ 𝐵′ ). Applying Lemma 15.14
and the definition of 𝛽𝜀 , we conclude that
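\begin{equation}
\beta_{\varepsilon}(\rho_{ABA'B'} \| \sigma_{ABA'B'}) \leq \operatorname{Tr}[\Pi_{ABA'B'}\, \sigma_{ABA'B'}] \leq \frac{1}{K}. \tag{15.1.138}
\end{equation}
Since the inequality holds for all separable states $\sigma_{ABA'B'} \in \operatorname{SEP}(AA' : BB')$, we
conclude that
\begin{equation}
\sup_{\sigma_{ABA'B'} \in \operatorname{SEP}(AA' : BB')} \beta_{\varepsilon}(\rho_{ABA'B'} \| \sigma_{ABA'B'}) \leq \frac{1}{K}. \tag{15.1.139}
\end{equation}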
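\begin{equation}
\log_2 K \leq E_R^{\varepsilon}(A;B)_{\rho}. \tag{15.1.140}
\end{equation}
\begin{equation}
K_D^{\varepsilon}(A;B)_{\rho} \leq E_R^{\varepsilon}(A;B)_{\rho}, \tag{15.1.141}
\end{equation}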
We then find the following upper bounds on the distillable key available in
(𝐾, 𝜀) key distillation protocols:
Corollary 15.17
Let 𝜌 𝐴𝐵 be a bipartite state, and let 𝜀 ∈ [0, 1). For every (𝐾, 𝜀) secret-key
distillation protocol for 𝜌 𝐴𝐵 , we have that
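\begin{multline}
(1 - 2\sqrt{\varepsilon} - \delta) \log_2 K \leq \sup_{\mathcal{L}^{\leftrightarrow}} \left[ I(X;B')_{\mathcal{L}^{\leftrightarrow}(\psi)} - I(X;EZ)_{\mathcal{L}^{\leftrightarrow}(\psi)} \right] \\
+ h_2(\sqrt{\varepsilon} + \delta) + (1 - \sqrt{\varepsilon} - \delta) \log_2\!\left(\frac{1}{\delta}\right) + 2 g_2(\sqrt{\varepsilon}), \tag{15.1.144}
\end{multline}
where $\delta \in (0, 1 - \sqrt{\varepsilon})$, $\psi_{ABE}$ is a purification of $\rho_{AB}$, the information quantities
are evaluated on the state $\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}(\psi_{ABE})$, and the optimization is over every
LOPC channel $\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}$ with classical systems $X$ and $Z$. The following
bound holds for all $\alpha > 1$:
\begin{equation}
\log_2 K \leq \widetilde{E}_{\alpha}(A;B)_{\rho} + \frac{\alpha}{\alpha - 1} \log_2\!\left(\frac{1}{1 - \varepsilon}\right), \tag{15.1.145}
\end{equation}
where
\begin{equation}
\widetilde{E}_{\alpha}(A;B)_{\rho} = \inf_{\sigma_{AB} \in \operatorname{SEP}(A:B)} \widetilde{D}_{\alpha}(\rho_{AB} \| \sigma_{AB}) \tag{15.1.146}
\end{equation}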
Proof: Employing the same reasoning that led to (15.1.92) and (15.1.98), consider
that the following bounds hold for a given (𝐾, 𝜀) secret-key distillation protocol:
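\begin{align}
\log_2 K &\leq I_H^{\sqrt{\varepsilon}+\delta}(K_A;K_B)_{\omega^{\mathcal{M}}} + \log_2\!\left(\frac{1}{\delta}\right), \tag{15.1.147} \\
I_{\max}^{\sqrt{\varepsilon}}(K_A;EZ)_{\omega^{\mathcal{M}}} &\leq 0, \tag{15.1.148}
\end{align}
where $\delta \in (0, 1 - \sqrt{\varepsilon})$. Consider from Proposition 7.70 that
\begin{equation}
I_H^{\sqrt{\varepsilon}+\delta}(K_A;K_B)_{\omega^{\mathcal{M}}} \leq \frac{1}{1 - \sqrt{\varepsilon} - \delta} \left( I(K_A;K_B)_{\omega^{\mathcal{M}}} + h_2(\sqrt{\varepsilon} + \delta) \right). \tag{15.1.149}
\end{equation}
Combining (15.1.147) and (15.1.149), we obtain
\begin{multline}
(1 - \sqrt{\varepsilon} - \delta) \log_2 K \leq I(K_A;K_B)_{\omega^{\mathcal{M}}} \\
+ h_2(\sqrt{\varepsilon} + \delta) + (1 - \sqrt{\varepsilon} - \delta) \log_2\!\left(\frac{1}{\delta}\right). \tag{15.1.150}
\end{multline}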
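\begin{align}
&\geq \inf_{\widetilde{\omega}_{K_A E Z} :\, P(\widetilde{\omega}_{K_A E Z}, \omega^{\mathcal{M}}_{K_A E Z}) \leq \sqrt{\varepsilon}}\ \inf_{\tau_{EZ}} D(\widetilde{\omega}_{K_A E Z} \| \widetilde{\omega}_{K_A} \otimes \tau_{EZ}) \tag{15.1.152}
\end{align}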
Since the upper bounds in (15.1.144) and (15.1.145) hold for all (𝐾, 𝜀) secret-
key distillation protocols, we conclude the following upper bounds on one-shot
𝜀-distillable key:
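\begin{multline}
(1 - 2\sqrt{\varepsilon} - \delta)\, K_D^{\varepsilon}(A;B)_{\rho} \leq \sup_{\mathcal{L}} \left[ I(X;B')_{\mathcal{L}(\psi)} - I(X;EZ)_{\mathcal{L}(\psi)} \right] \\
+ h_2(\sqrt{\varepsilon} + \delta) + (1 - \sqrt{\varepsilon} - \delta) \log_2\!\left(\frac{1}{\delta}\right) + 2 g_2(\sqrt{\varepsilon}), \tag{15.1.159}
\end{multline}
\begin{equation}
\log_2 K \leq \widetilde{E}_{\alpha}(A;B)_{\rho} + \frac{\alpha}{\alpha - 1} \log_2\!\left(\frac{1}{1 - \varepsilon}\right), \quad \forall \alpha > 1, \tag{15.1.160}
\end{equation}
where $\delta \in (0, 1 - \sqrt{\varepsilon})$ and the optimization in (15.1.159) is over every LOPC
channel $\mathcal{L}_{AB\to XB'Z}$.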
and
\begin{equation}
U_{ABA'B'} = \sum_{i,j} |i\rangle\langle i|_A \otimes |j\rangle\langle j|_B \otimes U^{ij}_{A'B'} \tag{15.1.163}
\end{equation}
is a controlled unitary known as a “twisting unitary,” with each $U^{ij}_{A'B'}$ a unitary
operator. Due to the fact that the maximally entangled state $\Phi_{AB}$ is unextendible, an
arbitrary extension $\gamma_{AA'BB'E}$ of a private state $\gamma_{AA'BB'}$ necessarily has the following
form:
\begin{equation}
\gamma_{AA'BB'E} = U_{AA'BB'} (\Phi_{AB} \otimes \sigma_{A'B'E}) U_{AA'BB'}^{\dagger}, \tag{15.1.164}
\end{equation}
where 𝜎𝐴′ 𝐵′ 𝐸 is an extension of 𝜎𝐴′ 𝐵′ . We start with the following lemma, which
applies to an arbitrary extension of a bipartite private state:
Lemma 15.18
Let 𝛾 𝐴𝐴′ 𝐵𝐵′ be a bipartite private state, and let 𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 be an extension of it,
as given above. Then the following identity holds for every such extension:
Proof: First consider that the following identity holds as a consequence of two
applications of the chain rule for conditional quantum mutual information (see
(7.2.136)):
Combined with the following identity, which holds for an arbitrary extension
𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 of a private state 𝛾 𝐴𝐴′ 𝐵𝐵′ ,
where
\begin{equation}
\gamma^{i}_{A'B'E} := U^{ii}_{A'B'}\, \sigma_{A'B'E}\, (U^{ii}_{A'B'})^{\dagger}. \tag{15.1.171}
\end{equation}
Proposition 15.19
Let 𝛾 𝐴𝐴′ 𝐵𝐵′ be a private state, with key systems 𝐴𝐵 and shield systems 𝐴′ 𝐵′,
and let 𝜔 𝐴𝐴′ 𝐵𝐵′ be an 𝜀-approximate private state, in the sense that
where
\begin{equation}
g_2(\delta) := (\delta + 1) \log_2(\delta + 1) - \delta \log_2 \delta. \tag{15.1.183}
\end{equation}
Proof: By applying Uhlmann’s theorem for fidelity (Theorem 6.8) and the inequal-
ities relating trace distance and fidelity from Theorem 6.14, for a given extension
𝜔 𝐴𝐴′ 𝐵𝐵′ 𝐸 of 𝜔 𝐴𝐴′ 𝐵𝐵′ , there exists an extension 𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 of 𝛾 𝐴𝐴′ 𝐵𝐵′ such that
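\begin{equation}
\frac{1}{2} \left\| \gamma_{AA'BB'E} - \omega_{AA'BB'E} \right\|_1 \leq \sqrt{\varepsilon}. \tag{15.1.184}
\end{equation}
Defining $f_1(\delta, K) := 2\delta \log_2 K + 2 g_2(\delta)$, we then find that
\begin{align}
2 \log_2 K &= I(A;BB'|E)_{\gamma} + I(A';B|AB'E)_{\gamma} \tag{15.1.185} \\
&\leq I(A;BB'|E)_{\omega} + I(A';B|AB'E)_{\omega} + 2 f_1(\sqrt{\varepsilon}, K) \tag{15.1.186} \\
&\leq I(A;BB'|E)_{\omega} + I(A';B|AB'E)_{\omega} + I(A';B'|AE)_{\omega} + 2 f_1(\sqrt{\varepsilon}, K) \tag{15.1.187} \\
&= I(AA';BB'|E)_{\omega} + 2 f_1(\sqrt{\varepsilon}, K). \tag{15.1.188}
\end{align}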
The first equality follows from Lemma 15.18. The first inequality follows from
two applications of Proposition 7.10 (uniform continuity of conditional mutual
information). The second inequality follows because 𝐼 ( 𝐴′; 𝐵′ | 𝐴𝐸)𝜔 ≥ 0 (this is
strong subadditivity from Theorem 7.6). The last equality is a consequence of the
chain rule for conditional mutual information, as used in (15.1.166). Since the
inequality
\begin{equation}
2 \log_2 K \leq I(AA';BB'|E)_{\omega} + 2 f_1(\sqrt{\varepsilon}, K) \tag{15.1.189}
\end{equation}
holds for an arbitrary extension of 𝜔 𝐴𝐴′ 𝐵𝐵′ , the statement of the proposition
follows. ■
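
To get a feeling for the size of the continuity correction in (15.1.189), the functions $g_2$ and $f_1$ defined above can be evaluated directly. The following is a minimal sketch using only the Python standard library; the helper name `continuity_correction` is an assumption made for illustration, and the value $g_2(0) = 0$ is taken as the limiting convention $\delta \log_2 \delta \to 0$.

```python
import math


def g2(delta: float) -> float:
    """g2(delta) = (delta + 1) log2(delta + 1) - delta log2(delta), with g2(0) = 0 by continuity."""
    if delta == 0:
        return 0.0
    return (delta + 1) * math.log2(delta + 1) - delta * math.log2(delta)


def f1(delta: float, K: int) -> float:
    """f1(delta, K) = 2 delta log2(K) + 2 g2(delta)."""
    return 2 * delta * math.log2(K) + 2 * g2(delta)


def continuity_correction(eps: float, K: int) -> float:
    """Additive term 2 f1(sqrt(eps), K) appearing on the right-hand side of (15.1.189)."""
    return 2 * f1(math.sqrt(eps), K)


# Example: how the correction shrinks as the error eps decreases, for K = 4 key values.
corrections = [continuity_correction(eps, 4) for eps in (1e-2, 1e-4, 1e-6)]
```

The correction vanishes as $\varepsilon \to 0$, so in that limit the bound reduces to $2 \log_2 K \leq I(AA';BB'|E)_{\omega}$.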
We now put these statements together and arrive at the following squashed-
entanglement upper bound on one-shot distillable key:
Proof: We exploit Theorem 15.9 and work in the bipartite picture of private-
state distillation, instead of the tripartite picture of key distillation. With this in
mind, consider a (𝐾, 𝜀) bipartite private-state distillation protocol for 𝜌 𝐴𝐵 with the
corresponding LOCC channel L 𝐴𝐵→𝐾 𝐴𝐾 𝐵 𝐴′ 𝐵′ . From the LOCC monotonicity of
squashed entanglement (Theorem 9.33), we have that
Having found upper bounds on one-shot distillable key, we now turn to establishing
a lower bound. In order to establish a lower bound on distillable key, we have to
find an explicit secret-key distillation protocol that works for an arbitrary bipartite
state 𝜌 𝐴𝐵 and an arbitrary error 𝜀 ∈ (0, 1). Recall that the goal of secret key
distillation is for two parties, Alice and Bob, to make use of LOPC to transform
a purification 𝜓 𝐴𝐵𝐸 of their shared state 𝜌 𝐴𝐵 to an ideal key state of the form in
Definition 15.1, with the key size 𝐾 as large as possible, subject to the constraint
that the error not exceed 𝜀. Furthermore, we allow them to make use of public
classical communication for free.
Before we get into the details, let us first slightly modify the model of secret key
distillation, and we discuss later how the model we have already discussed can fit
together with this alternative model. The alternative model consists of supposing
that the state shared by Alice, Bob, and Eve, is a classical–quantum–quantum state
$\rho_{XBE}$ of the following form:
\begin{equation}
\rho_{XBE} \coloneqq \sum_{x\in\mathcal{X}} p(x)\,|x\rangle\!\langle x|_X \otimes \rho^x_{BE}, \tag{15.1.195}
\end{equation}
where the error 𝜀 ∈ [0, 1], the infimum is with respect to every state 𝜎𝐸 𝑍 , and
Φ𝐾 𝐴𝐾 𝐵 is a maximally classically correlated state of size 𝐾
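\begin{equation}
\Phi_{K_AK_B} \coloneqq \frac{1}{K}\sum_{i=0}^{K-1} |i\rangle\!\langle i|_{K_A} \otimes |i\rangle\!\langle i|_{K_B}. \tag{15.1.197}
\end{equation}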
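As a small illustration of (15.1.197), the maximally classically correlated state is diagonal in the product basis, so it can be represented by its diagonal alone. The sketch below uses exact fractions; the function name and the flat indexing convention `i * K + j` are assumptions made for the example.

```python
from fractions import Fraction

def classically_correlated_diagonal(K: int) -> list:
    """Diagonal of the state in (15.1.197) in the basis |i>|j>, flat index i*K + j.

    Only the entries with i == j are nonzero, each equal to 1/K.
    """
    return [Fraction(1, K) if i == j else Fraction(0)
            for i in range(K) for j in range(K)]

diag = classically_correlated_diagonal(3)
print(diag)
```

The nonzero entries sit at indices 0, 4, and 8 for `K = 3`, reflecting that Alice's and Bob's key registers always agree.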
and labels the resulting systems as $X_{1,1}, \ldots, X_{k,r-1}, X_{k,r+1}, \ldots, X_{K,R}$. Alice then
sends the classical registers $X_{1,1}, \ldots, X_{K,R}$ in lexicographic order over a public
classical communication channel, so that both Bob and Eve receive copies of them.
At this point, for fixed values of 𝑘 and 𝑟, the global shared state of Alice, Bob, and
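Eve is as follows:
\begin{multline}
\rho_{X_{1,1}X'_{1,1}X''_{1,1}} \otimes \cdots \otimes \rho_{X_{k,r-1}X'_{k,r-1}X''_{k,r-1}} \otimes \rho_{X_{k,r}X'_{k,r}X''_{k,r}BE} \\
\otimes \rho_{X_{k,r+1}X'_{k,r+1}X''_{k,r+1}} \otimes \cdots \otimes \rho_{X_{K,R}X'_{K,R}X''_{K,R}}, \tag{15.1.200}
\end{multline}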
where Bob possesses all systems labeled as 𝑋 ′ (in addition to his 𝐵 system) and Eve
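possesses all systems labeled as $X''$ (in addition to her $E$ system). Furthermore,
\begin{align}
\rho_{X_{1,1}X'_{1,1}X''_{1,1}} = \cdots &= \rho_{X_{k,r-1}X'_{k,r-1}X''_{k,r-1}} \tag{15.1.201}\\
&= \rho_{X_{k,r+1}X'_{k,r+1}X''_{k,r+1}} = \cdots = \rho_{X_{K,R}X'_{K,R}X''_{K,R}} \tag{15.1.202}\\
&= \sum_{x\in\mathcal{X}} p(x)\,|xxx\rangle\!\langle xxx|, \tag{15.1.203}
\end{align}
and
\begin{equation}
\rho_{X_{k,r}X'_{k,r}X''_{k,r}BE} = \sum_{x\in\mathcal{X}} p(x)\,|xxx\rangle\!\langle xxx| \otimes \rho^x_{BE}. \tag{15.1.204}
\end{equation}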
Thus, it is only the 𝑋𝑘,𝑟 classical system that has correlation with Bob and Eve’s
systems 𝐵𝐸 and all others have no correlation whatsoever. The objective of the
key distillation protocol is for Bob to identify the 𝑋𝑘,𝑟 system that has correlation
with his (and in this way, identify the key value), while the randomness variable
𝑟 should have sufficient size 𝑅 to severely reduce the chance that Eve can guess
which 𝑋 ′′ system is correlated with hers. The reduced state of Bob, for fixed 𝑘 and
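$r$, is as follows:
\begin{equation}
\rho^{k,r}_{{X'}^{KR}B} = \rho_{X'_{1,1}} \otimes \cdots \otimes \rho_{X'_{k,r-1}} \otimes \rho_{X'_{k,r}B} \otimes \rho_{X'_{k,r+1}} \otimes \cdots \otimes \rho_{X'_{K,R}}, \tag{15.1.205}
\end{equation}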
The idea behind confusing Eve is that if 𝑅 is large enough, then it becomes difficult
for Eve to determine which 𝑋 ′′ system is correlated with her system 𝐸. What we
show later is that if 𝑅 is large enough, then her reduced state, for all key values 𝑘,
is essentially indistinguishable from the following product state:
\begin{equation}
\rho_{X''_{1,1}} \otimes \cdots \otimes \rho_{X''_{K,R}} \otimes \rho_E, \tag{15.1.207}
\end{equation}
where 𝜌 𝐸 = Tr 𝑋 𝐵 [𝜌 𝑋 𝐵𝐸 ]. If that is the case, then she can figure out essentially
nothing about the key value 𝑘, leaving her no strategy other than to try and randomly
guess it.
To analyze this protocol in detail, we employ two methods: position-based cod-
ing, as used previously in Section 11.1.3 in the context of classical communication,
and another idea known as convex splitting. Looking at Bob’s state in (15.1.205)
and comparing it with that in (11.1.99), it is natural to employ position-based
coding to figure out the value of 𝑘 and 𝑟. Indeed, invoking Proposition 11.8 (in
particular, (11.1.130)–(11.1.131)), if the following condition holds
\begin{equation}
\log_2 KR = I_H^{\varepsilon-\eta}(X;B)_\rho - \log_2\frac{4\varepsilon}{\eta^2}, \tag{15.1.208}
\end{equation}
for $\eta \in (0, \varepsilon)$, and where $I_H^{\varepsilon-\eta}(X;B)_\rho$ is the hypothesis testing mutual information
defined in (7.11.88), then Bob can decode $k$ and $r$ with error probability no larger
than $\varepsilon$. We would also like to guarantee that Eve’s state in (15.1.206) is close to the
product state in (15.1.207). This is where the convex-split lemma is useful, which
states the following: If
\begin{equation}
\log_2 R = I_{\max}^{\sqrt{\varepsilon}-\eta}(E;X)_\rho + \log_2\frac{2}{\eta^2}, \tag{15.1.209}
\end{equation}
then there exists a state $\widetilde{\rho}_E$ satisfying
\begin{equation}
1 - F\!\left(\rho^k_{{X''}^{KR}E},\, \rho_{X''_{1,1}} \otimes \cdots \otimes \rho_{X''_{K,R}} \otimes \widetilde{\rho}_E\right) \leq \varepsilon, \tag{15.1.210}
\end{equation}
and $P(\widetilde{\rho}_E, \rho_E) \leq \sqrt{\varepsilon} - \eta$. Observe that the inequality above holds for all key values
$k$. In the above, $I_{\max}^{\delta}(E;X)_\rho$ is a smooth max-mutual information quantity defined
for $\delta \in (0, 1)$ as
\begin{equation}
I_{\max}^{\delta}(E;X)_\rho \coloneqq \inf_{\widetilde{\rho}_{XE}\,:\,P(\widetilde{\rho}_{XE},\rho_{XE})\leq\delta} D_{\max}(\widetilde{\rho}_{XE} \,\|\, \rho_X \otimes \widetilde{\rho}_E). \tag{15.1.211}
\end{equation}
Observe that $I_{\max}^{\delta}(E;X)_\rho$ is different from the smooth max-mutual information
quantity defined previously in (15.1.59). By suitably combining position-based
coding with convex splitting and subtracting (15.1.209) from (15.1.208), we thus
arrive at the conclusion that Alice and Bob can distill a key $K$ of size
\begin{equation}
\log_2 K = I_H^{\varepsilon-\eta}(X;B)_\rho - I_{\max}^{\sqrt{\varepsilon}-\eta}(E;X)_\rho - \log_2\frac{4\varepsilon}{\eta^2} - \log_2\frac{2}{\eta^2}, \tag{15.1.212}
\end{equation}
and be guaranteed that
1. Bob can decode the key value 𝑘 with error probability no larger than 𝜀 and
2. the key value is secure from Eve with security parameter 𝜀 (as given in
(15.1.210)).
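The bookkeeping in (15.1.208), (15.1.209), and (15.1.212) is simple arithmetic once the two information quantities are known. The following sketch, under the assumption that the caller already has numerical values `i_h` for $I_H^{\varepsilon-\eta}(X;B)_\rho$ and `i_max` for $I_{\max}^{\sqrt{\varepsilon}-\eta}(E;X)_\rho$ (computing them is a separate optimization not shown here), returns the three sizes in bits. The function and argument names are hypothetical.

```python
import math

def position_based_sizes(i_h: float, i_max: float, eps: float, eta: float):
    """Return (log2 KR, log2 R, log2 K) following (15.1.208), (15.1.209), (15.1.212).

    i_h and i_max are assumed, externally computed values of the hypothesis
    testing mutual information and the smooth max-mutual information.
    """
    if not 0.0 < eta < eps:
        raise ValueError("eta must lie in (0, eps)")
    log2_KR = i_h - math.log2(4 * eps / eta**2)   # codebook size Bob can decode
    log2_R = i_max + math.log2(2 / eta**2)        # randomness needed to confuse Eve
    return log2_KR, log2_R, log2_KR - log2_R      # key size is the difference

# Hypothetical values: 20 bits decodable, 3 bits leaked, eps = 1/2, eta = 1/4.
print(position_based_sizes(20.0, 3.0, 0.5, 0.25))
```

The difference structure makes the trade-off visible: every bit spent on the randomization index $r$ to hide the key from Eve is a bit removed from the decodable codebook.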
Having discussed the protocol for key distillation and some intuition justifying
why the scheme works, we now formally state a lower bound on the one-shot
distillable key of a state 𝜌 𝑋 𝐵𝐸 :
Theorem 15.21
Let $\rho_{XBE}$ be a classical–quantum–quantum state, with system $X$ held by Alice,
$B$ by Bob, and $E$ by Eve. For all $\varepsilon \in (0, 1]$, $\varepsilon' = 1 - \sqrt{1-\varepsilon}$, $\delta \in (0, \varepsilon')$,
$\eta \in (0, \varepsilon' - \delta)$, and $\zeta \in (0, \delta)$, there exists a $(K, \varepsilon)$ one-way key distillation
protocol for $\rho_{XBE}$ with
\begin{equation}
\log_2 K = I_H^{\varepsilon'-\delta-\eta}(X;B)_\rho - I_{\max}^{\delta-\zeta}(E;X)_\rho - \log_2\frac{4(\varepsilon'-\delta)}{\eta^2} - \log_2\frac{2}{\zeta^2}, \tag{15.1.213}
\end{equation}
where the hypothesis testing mutual information $I_H^{\varepsilon'-\delta-\eta}(X;B)_\rho$ is defined in
(7.11.88) and the smooth max-mutual information $I_{\max}^{\delta-\zeta}(E;X)_\rho$ is defined in
(15.1.211).
As discussed above, one of the main tools that we employ to prove this theorem
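is the smooth convex-split lemma, which we state here and prove in Appendix 15.A.
\begin{equation}
P(\tau_{A_1\cdots A_RE},\, \rho_{A_1} \otimes \cdots \otimes \rho_{A_R} \otimes \widetilde{\rho}_E) \leq \varepsilon, \tag{15.1.216}
\end{equation}
and $P(\widetilde{\rho}_E, \rho_E) \leq \varepsilon - \eta$.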
Proof (Proof of Theorem 15.21): Fix 𝜀 ∈ (0, 1], 𝛿 ∈ (0, 𝜀), 𝜂 ∈ (0, 𝜀 − 𝛿), and
𝜁 ∈ (0, 𝛿). Alice performs the key distillation protocol discussed in the paragraph
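\begin{equation}
\rho_{K_AR_AX^{KR}{X'}^{KR}{X''}^{KR}BE} \coloneqq \frac{1}{KR}\sum_{k=1}^{K}\sum_{r=1}^{R} |k\rangle\!\langle k|_{K_A} \otimes |r\rangle\!\langle r|_{R_A} \otimes \rho^{k,r}_{X^{KR}{X'}^{KR}{X''}^{KR}BE}, \tag{15.1.217}
\end{equation}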
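\begin{equation}
\operatorname{Tr}\!\left[\Lambda^{k,r}_{{X'}^{KR}B}\,\rho^{k,r}_{{X'}^{KR}B}\right] \geq 1 - (\varepsilon - \delta) \qquad \forall k \in [K],\ r \in [R]. \tag{15.1.220}
\end{equation}
\begin{equation}
\mathcal{M}'_{{X'}^{KR}B\to K_BR_B}(\tau_{{X'}^{KR}B}) \coloneqq \sum_{k=1}^{K}\sum_{r=1}^{R} \operatorname{Tr}\!\left[\Lambda^{k,r}_{{X'}^{KR}B}\,\tau_{{X'}^{KR}B}\right] |k\rangle\!\langle k|_{K_B} \otimes |r\rangle\!\langle r|_{R_B}, \tag{15.1.221}
\end{equation}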
and the reduced measurement channel M 𝑋 ′𝐾 𝑅 𝐵→𝐾 𝐵 as
Observe that
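\begin{equation}
\frac{1}{2}\left\|\mathcal{M}'_{{X'}^{KR}B\to K_BR_B}(\rho^{k,r}_{{X'}^{KR}B}) - |k\rangle\!\langle k|_{K_B} \otimes |r\rangle\!\langle r|_{R_B}\right\|_1
= 1 - \operatorname{Tr}\!\left[\Lambda^{k,r}_{{X'}^{KR}B}\,\rho^{k,r}_{{X'}^{KR}B}\right] \leq \varepsilon - \delta, \tag{15.1.224}
\end{equation}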
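where $\widetilde{\rho}_{{X''}^{KR}E}$ is some state of the eavesdropper Eve’s systems ${X''}^{KR}E$. Thus, our
goal is to find an upper bound on the following quantity
\begin{equation}
\frac{1}{2}\left\|\mathcal{M}_{{X'}^{KR}B\to K_B}(\rho_{K_A{X'}^{KR}{X''}^{KR}BE}) - \Phi_{K_AK_B} \otimes \widetilde{\rho}_{{X''}^{KR}E}\right\|_1, \tag{15.1.229}
\end{equation}
\begin{equation}
1 - F\!\left(\mathcal{M}_{{X'}^{KR}B\to K_B}(\rho_{K_A{X'}^{KR}{X''}^{KR}BE}),\, \Phi_{K_AK_B} \otimes \widetilde{\rho}_{{X''}^{KR}E}\right). \tag{15.1.230}
\end{equation}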
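To this end, let us first consider bounding the following intermediate quantity:
\begin{equation}
\frac{1}{2}\left\|\mathcal{M}_{{X'}^{KR}B\to K_B}(\rho_{K_A{X'}^{KR}{X''}^{KR}BE}) - \frac{1}{K}\sum_{k=1}^{K} |k\rangle\!\langle k|_{K_A} \otimes |k\rangle\!\langle k|_{K_B} \otimes \frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X''}^{KR}E}\right\|_1. \tag{15.1.231}
\end{equation}
We find that
\begin{align}
&\frac{1}{2}\left\|\mathcal{M}_{{X'}^{KR}B\to K_B}(\rho_{K_A{X'}^{KR}{X''}^{KR}BE}) - \frac{1}{K}\sum_{k=1}^{K} |k\rangle\!\langle k|_{K_A} \otimes |k\rangle\!\langle k|_{K_B} \otimes \frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X''}^{KR}E}\right\|_1 \notag\\
&\quad= \frac{1}{2}\left\|\begin{array}{l}
\frac{1}{K}\sum_{k=1}^{K} |k\rangle\!\langle k|_{K_A} \otimes \mathcal{M}_{{X'}^{KR}B\to K_B}\!\left(\frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X'}^{KR}{X''}^{KR}BE}\right)\\[2pt]
\quad - \frac{1}{K}\sum_{k=1}^{K} |k\rangle\!\langle k|_{K_A} \otimes |k\rangle\!\langle k|_{K_B} \otimes \frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X''}^{KR}E}
\end{array}\right\|_1 \tag{15.1.232}\\
&\quad= \frac{1}{K}\sum_{k=1}^{K} \frac{1}{2}\left\|\mathcal{M}_{{X'}^{KR}B\to K_B}\!\left(\frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X'}^{KR}{X''}^{KR}BE}\right) - |k\rangle\!\langle k|_{K_B} \otimes \frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X''}^{KR}E}\right\|_1. \tag{15.1.233}
\end{align}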
Consider that
\begin{equation}
\sum_{k'=1}^{K} q(k'|k)\,\omega^{k',k}_{{X''}^{KR}E} = \frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X''}^{KR}E}. \tag{15.1.236}
\end{equation}
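Then we can write
\begin{equation}
\mathcal{M}_{{X'}^{KR}B\to K_B}\!\left(\frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X'}^{KR}{X''}^{KR}BE}\right) = \sum_{k'=1}^{K} q(k'|k)\,|k'\rangle\!\langle k'|_{K_B} \otimes \omega^{k',k}_{{X''}^{KR}E}, \tag{15.1.237}
\end{equation}
so that
\begin{equation}
\mathcal{M}_{{X'}^{KR}B\to K_B}\!\left(\frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X'}^{KR}B}\right) = \sum_{k'=1}^{K} q(k'|k)\,|k'\rangle\!\langle k'|_{K_B}. \tag{15.1.238}
\end{equation}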
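\begin{align}
&= \frac{1}{K}\sum_{k=1}^{K} \frac{1}{2}\left\|\sum_{k'=1}^{K} q(k'|k)\,|k'\rangle\!\langle k'|_{K_B} \otimes \omega^{k',k}_{{X''}^{KR}E} - |k\rangle\!\langle k|_{K_B} \otimes \sum_{k'=1}^{K} q(k'|k)\,\omega^{k',k}_{{X''}^{KR}E}\right\|_1 \tag{15.1.239}\\
&\leq \frac{1}{K}\sum_{k=1}^{K}\sum_{k'=1}^{K} q(k'|k)\,\frac{1}{2}\left\||k'\rangle\!\langle k'|_{K_B} \otimes \omega^{k',k}_{{X''}^{KR}E} - |k\rangle\!\langle k|_{K_B} \otimes \omega^{k',k}_{{X''}^{KR}E}\right\|_1 \tag{15.1.240}\\
&= \frac{1}{K}\sum_{k=1}^{K}\sum_{k'=1}^{K} q(k'|k)\,\frac{1}{2}\left\||k'\rangle\!\langle k'|_{K_B} - |k\rangle\!\langle k|_{K_B}\right\|_1 \tag{15.1.241}\\
&= \frac{1}{K}\sum_{k=1}^{K}\sum_{\substack{k'=1,\\ k'\neq k}}^{K} q(k'|k) \tag{15.1.242}\\
&= \frac{1}{K}\sum_{k=1}^{K} \frac{1}{2}\left\|\mathcal{M}_{{X'}^{KR}B\to K_B}\!\left(\frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X'}^{KR}B}\right) - |k\rangle\!\langle k|_{K_B}\right\|_1 \tag{15.1.243}\\
&\leq \varepsilon - \delta. \tag{15.1.244}
\end{align}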
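The steps (15.1.241)–(15.1.243) rest on a purely classical fact: for a probability distribution $q(\cdot|k)$ over key values, the normalized trace distance between $\sum_{k'} q(k'|k)|k'\rangle\!\langle k'|$ and the point mass $|k\rangle\!\langle k|$ equals the total probability of decoding a wrong key value. A quick numerical check of that identity is sketched below; the distribution `q` is made-up example data.

```python
def tv_from_point_mass(q, k):
    """(1/2) * || sum_k' q[k'] |k'><k'| - |k><k| ||_1 for a classical distribution q.

    Both operators are diagonal, so the trace norm is the l1 norm of the
    difference of the diagonals.
    """
    return 0.5 * sum(abs(p - (1.0 if kp == k else 0.0)) for kp, p in enumerate(q))

# Hypothetical decoding distribution q(k'|k) for the true key value k = 0.
q = [0.9, 0.06, 0.04]
lhs = tv_from_point_mass(q, 0)                       # quantity in (15.1.243), fixed k
rhs = sum(p for kp, p in enumerate(q) if kp != 0)    # quantity in (15.1.242), fixed k
print(lhs, rhs)
```

Both numbers agree (up to floating-point rounding) with the decoding error probability $1 - q(k|k)$.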
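We thus conclude that
\begin{equation}
\frac{1}{2}\left\|\mathcal{M}_{{X'}^{KR}B\to K_B}(\rho_{K_A{X'}^{KR}{X''}^{KR}BE}) - \frac{1}{K}\sum_{k=1}^{K} |k\rangle\!\langle k|_{K_A} \otimes |k\rangle\!\langle k|_{K_B} \otimes \frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X''}^{KR}E}\right\|_1 \leq \varepsilon - \delta. \tag{15.1.245}
\end{equation}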
We now turn to the analysis of privacy. Starting from the overall global state
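(15.1.217), and fixing a value of $k$, the reduced state of Eve’s systems is as follows:
\begin{align}
\rho^{k}_{{X''}^{KR}E} &= \frac{1}{R}\sum_{r=1}^{R} \rho^{k,r}_{{X''}^{KR}E} \notag\\
&= \rho_{X''_{1,1}} \otimes \cdots \otimes \rho_{X''_{k-1,R}} \notag\\
&\quad \otimes \left(\frac{1}{R}\sum_{r=1}^{R} \rho_{X''_{k,1}} \otimes \cdots \otimes \rho_{X''_{k,r-1}} \otimes \rho_{X''_{k,r}E} \otimes \rho_{X''_{k,r+1}} \otimes \cdots \otimes \rho_{X''_{k,R}}\right) \notag\\
&\quad \otimes \rho_{X''_{k+1,1}} \otimes \cdots \otimes \rho_{X''_{K,R}}. \tag{15.1.246}
\end{align}
Our goal is to show that
\begin{equation}
\frac{1}{2}\left\|\rho^{k}_{{X''}^{KR}E} - \rho_{{X''}^{KR}} \otimes \widetilde{\rho}_E\right\|_1 \leq \delta, \tag{15.1.247}
\end{equation}
for some state $\widetilde{\rho}_E$. By the invariance of the trace distance with respect to tensor-product
states, i.e., $\|\sigma \otimes \tau - \omega \otimes \tau\|_1 = \|\sigma - \omega\|_1$, we find that
\begin{align}
&\frac{1}{2}\left\|\rho^{k}_{{X''}^{KR}E} - \rho_{{X''}^{KR}} \otimes \widetilde{\rho}_E\right\|_1 \tag{15.1.248}\\
&\quad= \frac{1}{2}\left\|\rho^{k}_{X''_{k,1}\cdots X''_{k,R}E} - \rho_{X''_{k,1}\cdots X''_{k,R}} \otimes \widetilde{\rho}_E\right\|_1 \tag{15.1.249}\\
&\quad= \frac{1}{2}\left\|\frac{1}{R}\sum_{r=1}^{R} \rho_{X''_{k,1}} \otimes \cdots \otimes \rho_{X''_{k,r-1}} \otimes \left(\rho_{X''_{k,r}E} - \rho_{X''_{k,r}} \otimes \widetilde{\rho}_E\right) \otimes \rho_{X''_{k,r+1}} \otimes \cdots \otimes \rho_{X''_{k,R}}\right\|_1. \tag{15.1.250}
\end{align}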
By invoking the smooth convex-split lemma (Lemma 15.22) and the inequality
relating normalized trace distance and sine distance (see (6.2.88)), we find that if
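we pick $R$ such that
\begin{equation}
\log_2 R = I_{\max}^{\delta-\zeta}(E;X)_\rho + \log_2\frac{2}{\zeta^2}, \tag{15.1.251}
\end{equation}
then we are guaranteed that
\begin{equation}
\frac{1}{2}\left\|\rho^{k}_{{X''}^{KR}E} - \rho_{{X''}^{KR}} \otimes \widetilde{\rho}_E\right\|_1 \leq \delta, \tag{15.1.252}
\end{equation}
where $\widetilde{\rho}_E$ is some state such that $P(\widetilde{\rho}_E, \rho_E) \leq \delta - \zeta$. Now combining (15.1.245)
and (15.1.252) with the triangle inequality, we conclude the desired statement:
\begin{equation}
\frac{1}{2}\left\|\mathcal{M}_{{X'}^{KR}B\to K_B}(\rho_{K_A{X'}^{KR}{X''}^{KR}BE}) - \Phi_{K_AK_B} \otimes \rho_{{X''}^{KR}} \otimes \widetilde{\rho}_E\right\|_1 \leq \varepsilon. \tag{15.1.253}
\end{equation}
\begin{equation}
1 - F\!\left(\mathcal{M}_{{X'}^{KR}B\to K_B}(\rho_{K_A{X'}^{KR}{X''}^{KR}BE}),\, \Phi_{K_AK_B} \otimes \rho_{{X''}^{KR}} \otimes \widetilde{\rho}_E\right) \leq \varepsilon(2 - \varepsilon) \tag{15.1.254}
\end{equation}
by exploiting the inequality in (6.2.88) relating fidelity and trace distance. Now
using the fact that the inverse function of $\varepsilon(2 - \varepsilon)$, with domain and range given
by $[0, 1]$, is $1 - \sqrt{1 - \varepsilon}$ and reassigning $\varepsilon(2 - \varepsilon)$ as $\varepsilon$, we conclude the desired
statement in (15.1.213). ■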
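The reparametrization in the last step of the proof can be checked numerically: a trace-distance error $t$ becomes a fidelity error $t(2-t)$, and $\varepsilon' = 1 - \sqrt{1-\varepsilon}$ inverts this map on $[0,1]$. The sketch below also shows how the budget $\varepsilon'$ splits into a decoding part $\varepsilon' - \delta$ and a privacy part $\delta$ via the triangle inequality. The function names and the numbers are illustrative assumptions.

```python
import math

def eps_prime(eps: float) -> float:
    """Inverse of t -> t * (2 - t) on [0, 1]: the epsilon' of Theorem 15.21."""
    return 1 - math.sqrt(1 - eps)

def fidelity_error(t: float) -> float:
    """Right-hand side of (15.1.254) for a trace-distance error t."""
    return t * (2 - t)

eps = 0.19
t = eps_prime(eps)            # total trace-distance budget
delta = 0.04                  # privacy part, as in (15.1.252)
decoding = t - delta          # decoding part, as in (15.1.245)
print(t, decoding + delta, fidelity_error(t))
```

For `eps = 0.19` the trace-distance budget is `0.1` up to rounding, and feeding it back through `fidelity_error` recovers `0.19`.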
The result of Theorem 15.21 applies to the model of secret key distillation
outlined in the paragraph containing (15.1.195)–(15.1.198). To extend it to the
main model considered in this chapter (and outlined in Section 15.1), we can allow
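Alice and Bob to perform an LOPC channel $\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}$ to obtain the following
state:
\begin{equation}
\rho_{XB'EZ} \coloneqq \mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}(\psi_{ABE}), \tag{15.1.255}
\end{equation}
where $\psi_{ABE}$ is a purification of the state $\rho_{AB}$ of interest and $\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}$ is an LOPC
channel with classical output system $X$ and quantum output system $B'$. Then we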
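Corollary 15.23
Let $\rho_{AB}$ be a bipartite state, with system $X$ held by Alice, $B$ by Bob, and $E$
by Eve. For all $\varepsilon \in (0, 1]$, $\varepsilon' \coloneqq 1 - \sqrt{1-\varepsilon}$, $\delta \in (0, \varepsilon')$, $\eta \in (0, \varepsilon' - \delta)$, and
$\zeta \in (0, \delta)$, there exists a $(K, \varepsilon)$ one-way key distillation protocol for $\rho_{AB}$ with
\begin{equation}
\log_2 K = I_H^{\varepsilon'-\delta-\eta}(X;B')_\rho - I_{\max}^{\delta-\zeta}(EZ;X)_\rho - \log_2\frac{4(\varepsilon'-\delta)}{\eta^2} - \log_2\frac{2}{\zeta^2}, \tag{15.1.256}
\end{equation}
where the hypothesis testing mutual information $I_H^{\varepsilon'-\delta-\eta}(X;B')_\rho$ is defined
in (7.11.88), the smooth max-mutual information $I_{\max}^{\delta-\zeta}(EZ;X)_\rho$ is defined
in (15.1.211), and these quantities are evaluated with respect to the state in
(15.1.255), with $\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}$ an LOPC channel with classical output system $X$
and quantum output system $B'$. Consequently, for the one-shot distillable key
of $\rho_{AB}$, we have
\begin{equation}
K_D^{\varepsilon}(A;B)_\rho \geq \sup_{\substack{\mathcal{L}^{\leftrightarrow},\,\delta\in(0,\varepsilon'),\\ \eta\in(0,\varepsilon'-\delta),\,\zeta\in(0,\delta)}} \left( I_H^{\varepsilon'-\delta-\eta}(X;B')_\rho - I_{\max}^{\delta-\zeta}(EZ;X)_\rho - \log_2\frac{4(\varepsilon'-\delta)}{\eta^2} - \log_2\frac{2}{\zeta^2} \right), \tag{15.1.257}
\end{equation}
where $\varepsilon' \coloneqq 1 - \sqrt{1-\varepsilon}$ and the optimization is over every LOPC channel
$\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}$.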
Now combining Corollary 15.23 with Propositions 7.64 and 7.72, we conclude
the following lower bound on one-shot distillable key:
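Corollary 15.24
Let $\rho_{AB}$ be a bipartite state with purification $\psi_{ABE}$. For all $\varepsilon \in (0, 1)$,
$\varepsilon' = 1 - \sqrt{1-\varepsilon}$, $\delta \in (0, \varepsilon')$, $\eta \in (0, \varepsilon' - \delta)$, $\zeta \in (0, \delta)$, $\nu \in (0, \delta - \zeta)$,
$\alpha \in (0, 1)$, and $\beta > 1$, there exists a $(K, \varepsilon)$ one-way key distillation protocol
for $\rho_{AB}$ satisfying
where
\begin{equation}
\rho_{XB'EZ} \coloneqq \mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}(\psi_{ABE}), \tag{15.1.258}
\end{equation}
$\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}$ is an LOPC channel with classical output system $X$ and quantum
output system $B'$,
\begin{equation}
\widetilde{I}'_\beta(X;EZ)_\rho \coloneqq \widetilde{D}_\beta(\rho_{XEZ} \,\|\, \rho_X \otimes \rho_{EZ}), \tag{15.1.259}
\end{equation}
and
\begin{align}
f(\varepsilon', \delta, \eta, \nu, \zeta, \alpha, \beta) \coloneqq{}& \frac{\alpha}{1-\alpha}\log_2\frac{1}{\varepsilon'-\delta-\eta} + \log_2\frac{8}{\nu^2} \notag\\
&+ \frac{1}{\beta-1}\log_2\frac{1}{(\delta-\zeta-\nu)^2} + \log_2\frac{1}{1-(\delta-\zeta-\nu)^2} \notag\\
&+ \log_2\frac{4(\varepsilon'-\delta)}{\eta^2} + \log_2\frac{2}{\zeta^2}. \tag{15.1.260}
\end{align}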
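To get a feel for the size of the correction term in (15.1.260), it can be evaluated directly. The following is a minimal sketch that transcribes the six logarithmic terms and enforces the parameter ranges stated in the corollary; the function name `f_correction` and the example parameter values are assumptions chosen for illustration.

```python
import math

def f_correction(eps_p, delta, eta, nu, zeta, alpha, beta):
    """Evaluate f(eps', delta, eta, nu, zeta, alpha, beta) from (15.1.260), in bits."""
    # Parameter ranges as stated in Corollary 15.24.
    assert 0 < delta < eps_p and 0 < eta < eps_p - delta
    assert 0 < zeta < delta and 0 < nu < delta - zeta
    assert 0 < alpha < 1 and beta > 1
    s = delta - zeta - nu  # smoothing parameter left after both conversions
    return (alpha / (1 - alpha) * math.log2(1 / (eps_p - delta - eta))
            + math.log2(8 / nu**2)
            + 1 / (beta - 1) * math.log2(1 / s**2)
            + math.log2(1 / (1 - s**2))
            + math.log2(4 * (eps_p - delta) / eta**2)
            + math.log2(2 / zeta**2))

# Hypothetical parameters, all dyadic so most terms are whole numbers of bits.
val = f_correction(0.5, 0.25, 0.125, 0.0625, 0.125, 0.5, 2.0)
print(val, val / 1000)  # total correction, and its per-copy share for n = 1000
```

Because the correction does not depend on the number of copies, dividing it by $n$ makes it vanish in the asymptotic setting, which is how it is used later in the chapter.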
Proof: The main idea here is to convert the smooth mutual information quantities
$I_H^{\varepsilon'-\delta-\eta}(X;B')_\rho$ and $I_{\max}^{\delta-\zeta}(EZ;X)_\rho$ from Corollary 15.23 to Rényi mutual information
quantities with correction terms related to the smoothing parameters. Let us
first invoke Proposition 7.72 to conclude the following lower bound:
\begin{equation}
I_H^{\varepsilon'-\delta-\eta}(X;B')_\rho \geq I_\alpha(X;B')_\rho - \frac{\alpha}{1-\alpha}\log_2\frac{1}{\varepsilon'-\delta-\eta}. \tag{15.1.261}
\end{equation}
Next, we invoke Lemma 15.25 below to conclude that
\begin{equation}
I_{\max}^{\delta-\zeta}(EZ;X)_\rho \leq D_{\max}^{\delta-\zeta-\nu}(\rho_{XEZ} \,\|\, \rho_X \otimes \rho_{EZ}) + \log_2\frac{8}{\nu^2}, \tag{15.1.262}
\end{equation}
where $\nu \in (0, \delta - \zeta)$. Then we invoke Proposition 7.64 to conclude that
\begin{equation}
D_{\max}^{\delta-\zeta-\nu}(\rho_{XEZ} \,\|\, \rho_X \otimes \rho_{EZ}) \leq \widetilde{D}_\beta(\rho_{XEZ} \,\|\, \rho_X \otimes \rho_{EZ})
+ \frac{1}{\beta-1}\log_2\frac{1}{(\delta-\zeta-\nu)^2} + \log_2\frac{1}{1-(\delta-\zeta-\nu)^2}. \tag{15.1.263}
\end{equation}
Considering that
\begin{equation}
\widetilde{D}_\beta(\rho_{XEZ} \,\|\, \rho_X \otimes \rho_{EZ}) = \widetilde{I}'_\beta(X;EZ)_\rho. \tag{15.1.264}
\end{equation}
Putting all of the above together with Corollary 15.23, we conclude the proof. ■
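Lemma 15.25
Let $\rho_{AE}$ be a bipartite state, and let $\varepsilon, \delta > 0$ be such that $\varepsilon + \delta < 1$. Then
\begin{equation}
I_{\max}^{\varepsilon+\delta}(E;A)_\rho \leq D_{\max}^{\varepsilon}(\rho_{AE} \,\|\, \rho_A \otimes \rho_E) + \log_2\frac{8}{\delta^2}, \tag{15.1.265}
\end{equation}
where $I_{\max}^{\varepsilon+\delta}(E;A)_\rho$ is defined in (15.1.211) and $D_{\max}^{\varepsilon}(\rho_{AE} \,\|\, \rho_A \otimes \rho_E)$
in (7.8.42).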
Having found upper and lower bounds on the one-shot distillable key 𝐾 𝐷𝜀 ( 𝐴; 𝐵) 𝜌 of
a bipartite state 𝜌 𝐴𝐵 , let us now move on to the asymptotic setting. In this setting,
we allow Alice and Bob to make use of an arbitrarily large number 𝑛 of copies of
the state 𝜌 𝐴𝐵 in order to obtain a secret-key state. A secret key distillation protocol
for $n$ copies of $\rho_{AB}$ is defined by the triple $(n, K, \mathcal{L}^{\leftrightarrow}_{A^nB^n\to K_AK_BZ})$, consisting of the
number $n$ of copies of $\rho_{AB}$, an integer $K \in \mathbb{N}$, and an LOPC channel $\mathcal{L}^{\leftrightarrow}_{A^nB^n\to K_AK_BZ}$
with $d_{K_A} = d_{K_B} = K$. Observe that a secret-key distillation protocol for $n$ copies
of $\rho_{AB}$ is equivalent to a one-shot secret-key distillation protocol for the state $\rho_{AB}^{\otimes n}$.
All of the results of Section 15.1 thus carry over to the asymptotic setting simply
by replacing $\rho_{AB}$ with $\rho_{AB}^{\otimes n}$. In particular, the error probability for a secret-key
distillation protocol for $\rho_{AB}$ defined by $(n, K, \mathcal{L}^{\leftrightarrow}_{A^nB^n\to K_AK_BZ})$ is equal to
\begin{equation}
p_{\text{err}}(\mathcal{L}^{\leftrightarrow}; \rho_{AB}^{\otimes n}) = \inf_{\gamma_{K_AK_BEZ}} \left( 1 - F\!\left(\gamma_{K_AK_BEZ},\, \mathcal{L}^{\leftrightarrow}_{A^nB^n\to K_AK_BZ}(\psi_{ABE}^{\otimes n})\right) \right), \tag{15.2.1}
\end{equation}
where the infimum is with respect to every ideal tripartite key state $\gamma_{K_AK_BEZ}$ and
$\psi_{ABE}$ is a purification of $\rho_{AB}$. The definition in (15.2.1) is thus the same as that in
(15.1.6), but for the tensor-power state $\rho_{AB}^{\otimes n}$.
As we prove in Appendix A,
\begin{equation}
R \text{ achievable rate} \iff \lim_{n\to\infty} \varepsilon_D(2^{n(R-\delta)}; \rho_{AB}^{\otimes n}) = 0 \quad \forall \delta > 0. \tag{15.2.5}
\end{equation}
In other words, a rate 𝑅 is achievable if the optimal error probability for a sequence
of protocols with rate 𝑅 − 𝛿 vanishes as the number 𝑛 of copies of 𝜌 𝐴𝐵 increases.
As we show in Appendix A,
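Note that
\begin{equation}
K_D(A;B)_\rho \leq \widetilde{K}_D(A;B)_\rho \tag{15.2.11}
\end{equation}
for every bipartite state $\rho_{AB}$. We can also write the strong converse distillable key
as
\begin{equation}
\widetilde{K}_D(A;B)_\rho = \sup_{\varepsilon\in[0,1)} \limsup_{n\to\infty} \frac{1}{n} K_D^{\varepsilon}(\rho_{AB}^{\otimes n}). \tag{15.2.12}
\end{equation}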
See Appendix A for a proof.
We are now ready to present a general expression for the distillable key of a
bipartite state, as well as two upper bounds on it.
and the squashed entanglement from (9.4.1) is a weak converse rate, in the
sense that
𝐾 𝐷 ( 𝐴; 𝐵) 𝜌 ≤ 𝐸 sq ( 𝐴; 𝐵) 𝜌 . (15.2.15)
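If we define
\begin{equation}
D_K^{\leftrightarrow}(\rho_{AB}) \equiv D_K^{\leftrightarrow}(A;B)_\rho \coloneqq \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi)} - I(X;EZ)_{\mathcal{L}(\psi)} \right), \tag{15.2.16}
\end{equation}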
• The private information is an achievable rate for secret key distillation, i.e.,
𝐾 𝐷 ( 𝐴; 𝐵) 𝜌 ≥ max{𝐼 (𝑋; 𝐵)𝜏 − 𝐼 (𝑋; 𝐸)𝜏 , 𝐼 ( 𝐴; 𝑌 )𝜔 − 𝐼 (𝑌 ; 𝐸)𝜔 }, (15.2.18)
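where
\begin{align}
\tau_{XBE} &\coloneqq \sum_{x} |x\rangle\!\langle x|_X \otimes \operatorname{Tr}_A[\Lambda^x_A\,\psi_{ABE}], \tag{15.2.19}\\
\omega_{YAE} &\coloneqq \sum_{y} |y\rangle\!\langle y|_Y \otimes \operatorname{Tr}_B[\Gamma^y_B\,\psi_{ABE}], \tag{15.2.20}
\end{align}
$\psi_{ABE}$ is a purification of the bipartite state $\rho_{AB}$, and $\{\Lambda^x_A\}_x$ and $\{\Gamma^y_B\}_y$ are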
POVMs. The idea behind the first achievable rate 𝐼 (𝑋; 𝐵)𝜏 − 𝐼 (𝑋; 𝐸)𝜏 is that
Alice performs the measurement {Λ𝑥𝐴 }𝑥 on her system 𝐴, and this produces
the classical–quantum–quantum state 𝜏𝑋 𝐵𝐸 . Alice and Bob then execute the
protocol from Theorem 15.21 on many copies of the state 𝜏𝑋 𝐵𝐸 . Alternatively,
the idea behind the second achievable rate 𝐼 ( 𝐴; 𝑌 )𝜔 − 𝐼 (𝑌 ; 𝐸)𝜔 is similar, but
with the roles of Alice and Bob swapped and distilling a key from many copies
of the state 𝜔𝑌 𝐴𝐸 .
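For intuition about the private information $I(X;B)_\tau - I(X;E)_\tau$, it helps to evaluate it in the special case where all three systems are classical, so that the state is just a joint probability distribution $p(x,b,e)$. The sketch below handles only this fully classical case, which is an assumption made for illustration; the quantum case requires von Neumann entropies of the states in (15.2.19)–(15.2.20). The function names and the example distribution are hypothetical.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """Shannon mutual information (bits) of a joint pmf given as {(u, v): prob}."""
    pu, pv = defaultdict(float), defaultdict(float)
    for (u, v), p in joint.items():
        pu[u] += p
        pv[v] += p
    return sum(p * math.log2(p / (pu[u] * pv[v]))
               for (u, v), p in joint.items() if p > 0)

def private_information(pxbe):
    """I(X;B) - I(X;E) for a classical joint pmf given as {(x, b, e): prob}."""
    pxb, pxe = defaultdict(float), defaultdict(float)
    for (x, b, e), p in pxbe.items():
        pxb[(x, b)] += p
        pxe[(x, e)] += p
    return mutual_information(pxb) - mutual_information(pxe)

# Example: Bob holds a perfect copy of Alice's uniform bit, Eve holds an
# independent uniform bit, so one full bit of key per copy is available.
pxbe = {(x, x, e): 0.25 for x in (0, 1) for e in (0, 1)}
result = private_information(pxbe)
print(result)
```

If Eve instead held a perfect copy of Alice's bit, the two mutual informations would cancel and the sketch would return zero, matching the intuition that correlations shared with the eavesdropper cannot be turned into key.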
As with other previous capacity theorem proofs in this book, we prove Theo-
rem 15.32 in two steps:
In order to show this, we use the one-shot upper bounds from Section 15.1.3 to
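prove that every achievable rate $R$ satisfies
\begin{equation}
R \leq \lim_{n\to\infty} \frac{1}{n} \sup_{\mathcal{L}^{(n)}} \left( I(X;B')_{\mathcal{L}^{(n)}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}^{(n)}(\psi^{\otimes n})} \right). \tag{15.2.24}
\end{equation}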
The expression in (15.2.13) for the distillable key involves both a limit over an
unbounded number of copies of the state 𝜌 𝐴𝐵 , as well as an optimization over all
two-way LOPC channels. Computing the distillable key is therefore intractable
in general. After establishing a proof of (15.2.13), we proceed to establish upper
bounds on distillable key that depend only on the given state 𝜌 𝐴𝐵 .
Specifically, in Section 15.2.3, we use the one-shot results in Section 15.1.3.2 to
show that the relative entropy of entanglement is a strong converse rate for secret
key distillation. We also show that the squashed entanglement is a weak converse
rate for secret key distillation.
As the first step in proving the achievability part of Theorem 15.32, let us recall
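Corollary 15.24: given a bipartite state $\rho_{AB}$ with purification $\psi_{ABE}$, for all $\varepsilon \in (0, 1)$,
$\varepsilon' = 1 - \sqrt{1-\varepsilon}$, $\delta \in (0, \varepsilon')$, $\eta \in (0, \varepsilon' - \delta)$, $\zeta \in (0, \delta)$, $\nu \in (0, \delta - \zeta)$, $\alpha \in (0, 1)$,
and $\beta > 1$, there exists a $(K, \varepsilon)$ one-way key distillation protocol for $\rho_{AB}$ satisfying
\begin{equation}
\log_2 K \geq I_\alpha(X;B')_\rho - \widetilde{I}'_\beta(X;EZ)_\rho - f(\varepsilon', \delta, \eta, \nu, \zeta, \alpha, \beta) \tag{15.2.25}
\end{equation}
where
\begin{equation}
\rho_{XB'EZ} \coloneqq \mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}(\psi_{ABE}), \tag{15.2.26}
\end{equation}
$\mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}$ is an LOPC channel with classical output system $X$ and quantum output
system $B'$, the Rényi mutual information $\widetilde{I}'_\beta(X;EZ)_\rho$ is defined in (15.1.259), and
the function $f(\varepsilon', \delta, \eta, \nu, \zeta, \alpha, \beta)$ in (15.1.260). Applying this inequality to the
state $\rho_{AB}^{\otimes n}$ for all $n \in \mathbb{N}$ leads to the following: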
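Proposition 15.33
For every state $\rho_{AB}$ and $\varepsilon \in (0, 1)$, there exists an $(n, K, \varepsilon)$ key distillation
protocol for $\rho_{AB}$ such that the rate $\frac{\log_2 K}{n}$ satisfies
\begin{equation}
\frac{\log_2 K}{n} \geq I_\alpha(X;B')_\rho - \widetilde{I}'_\beta(X;EZ)_\rho - \frac{1}{n} f\!\left(\varepsilon', \frac{\varepsilon'}{2}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{2}, \alpha, \beta\right), \tag{15.2.27}
\end{equation}
for all $n \in \mathbb{N}$, $\alpha \in (0, 1)$, $\beta > 1$, where the information quantities are with
respect to the state in (15.2.26). More generally, we have the following lower
bound on the finite-length distillable key:
\begin{equation}
K_D^{n,\varepsilon}(A;B)_\rho \geq \frac{1}{n} \sup_{\mathcal{L}^{\leftrightarrow}} \left( I_\alpha(X;B')_\tau - \widetilde{I}'_\beta(X;E^nZ)_\tau \right)
- \frac{1}{n} f\!\left(\varepsilon', \frac{\varepsilon'}{2}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{2}, \alpha, \beta\right), \tag{15.2.28}
\end{equation}
for all $n \in \mathbb{N}$, $\alpha \in (0, 1)$, $\beta > 1$, where the optimization is over every LOPC
channel $\mathcal{L}^{\leftrightarrow}_{A^nB^n\to XB'Z}$ and the information quantities are with respect to the
following state:
\begin{equation}
\tau_{XB'E^nZ} \coloneqq \mathcal{L}^{\leftrightarrow}_{A^nB^n\to XB'Z}(\psi_{ABE}^{\otimes n}). \tag{15.2.29}
\end{equation}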
Using the inequality in (15.2.27), we can prove the following lower bound on
distillable key:
\begin{equation}
\tau_{XB'EZ} \coloneqq \mathcal{L}^{\leftrightarrow}_{AB\to XB'Z}(\psi_{ABE}), \tag{15.2.32}
\end{equation}
Proof: Let 𝜓 𝐴𝐵𝐸 be a purification of 𝜌 𝐴𝐵 . Fix 𝜀 ∈ (0, 1] and 𝛿 > 0. Let 𝛿1 , 𝛿2 > 0
be such that $\delta = \delta_1 + \delta_2$. Set $\alpha \in (0, 1)$ and $\beta > 1$ such that
\begin{equation}
\delta_1 \geq \left( I(X;B')_\tau - I(X;EZ)_\tau \right) - \left( I_\alpha(X;B')_\tau - \widetilde{I}'_\beta(X;EZ)_\tau \right). \tag{15.2.34}
\end{equation}
Note that this is possible because $I_\alpha(X;B')_\tau$ increases monotonically with increasing
$\alpha \in (0, 1)$ (see Proposition 7.23) and $\widetilde{I}'_\beta(X;EZ)_\tau$ decreases monotonically with
decreasing $\beta$ (see Proposition 7.31), so that
Also,
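With $\alpha$ and $\beta$ chosen such that (15.2.34) holds, take $n$ large enough so that
\begin{equation}
\delta_2 \geq \frac{1}{n} f\!\left(\varepsilon', \frac{\varepsilon'}{2}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{2}, \alpha, \beta\right). \tag{15.2.39}
\end{equation}
Now, we use the fact that for the $n$ and $\varepsilon$ chosen above, there exists an $(n, K, \varepsilon)$
protocol such that
\begin{equation}
\frac{\log_2 K}{n} \geq I_\alpha(X;B')_\rho - \widetilde{I}'_\beta(X;EZ)_\rho - \frac{1}{n} f\!\left(\varepsilon', \frac{\varepsilon'}{2}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{2}, \alpha, \beta\right). \tag{15.2.40}
\end{equation}
(This follows from Proposition 15.33 above.) Rearranging the right-hand side of
this inequality, and using (15.2.34), (15.2.39), and (15.2.40), we find that
\begin{align}
\frac{\log_2 K}{n} &\geq I(X;B')_\tau - I(X;EZ)_\tau \notag\\
&\quad - \left( \begin{array}{l}
\left( I(X;B')_\tau - I(X;EZ)_\tau \right) - \left( I_\alpha(X;B')_\tau - \widetilde{I}'_\beta(X;EZ)_\tau \right)\\[2pt]
\quad + \frac{1}{n} f\!\left(\varepsilon', \frac{\varepsilon'}{2}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{2}, \alpha, \beta\right)
\end{array} \right) \tag{15.2.41}\\
&\geq I(X;B')_\tau - I(X;EZ)_\tau - (\delta_1 + \delta_2) \tag{15.2.42}\\
&= I(X;B')_\tau - I(X;EZ)_\tau - \delta. \tag{15.2.43}
\end{align}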
We thus have shown that there exists an $(n, K, \varepsilon)$ secret key distillation protocol
with rate $\frac{\log_2 K}{n} \geq I(X;B')_\tau - I(X;EZ)_\tau - \delta$. Therefore, there exists an
$(n, 2^{n(R-\delta)}, \varepsilon)$ secret-key distillation protocol with $R = I(X;B')_\tau - I(X;EZ)_\tau$ for
all sufficiently large $n$ such that (15.2.39) holds. Since $\varepsilon$ and $\delta$ are arbitrary, we
conclude that for all $\varepsilon \in (0, 1]$, $\delta > 0$, and sufficiently large $n$, there exists an
$(n, 2^{n(I(X;B')_\tau - I(X;EZ)_\tau - \delta)}, \varepsilon)$ secret key distillation protocol. This means that, by
definition, $I(X;B')_\tau - I(X;EZ)_\tau$ is an achievable rate. ■
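Let $\mathcal{L}^{\leftrightarrow}_{A^kB^k\to XB'Z}$ be an arbitrary LOPC channel with $k \in \mathbb{N}$, let
\begin{equation}
\tau_{XB'E^kZ} \coloneqq \mathcal{L}^{\leftrightarrow}_{A^kB^k\to XB'Z}(\psi_{ABE}^{\otimes k}), \tag{15.2.44}
\end{equation}
where $\psi_{ABE}$ is a purification of $\rho_{AB}$. Fix $\varepsilon \in (0, 1]$ and $\delta > 0$. Let $\delta_1, \delta_2 > 0$ be
such that $\delta = \delta_1 + \delta_2$. Set $\alpha \in (0, 1)$ and $\beta \in (1, \infty)$ such that
\begin{equation}
\delta_1 \geq \frac{1}{k}\left( I(X;B')_\tau - I(X;E^kZ)_\tau \right) - \frac{1}{k}\left( I_\alpha(X;B')_\tau - \widetilde{I}'_\beta(X;E^kZ)_\tau \right), \tag{15.2.45}
\end{equation}
which is possible based on the arguments given in the proof of Theorem 15.34
above. Then, with this choice of $\alpha$ and $\beta$, take $n$ large enough so that
\begin{equation}
\delta_2 \geq \frac{1}{kn} f\!\left(\varepsilon', \frac{\varepsilon'}{2}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{2}, \alpha, \beta\right). \tag{15.2.46}
\end{equation}
Now, we use the fact that, for the chosen $n$ and $\varepsilon$, there exists an $(n, K, \varepsilon)$ secret-key
distillation protocol such that (15.2.27) holds, i.e.,
\begin{equation}
\frac{\log_2 K}{n} \geq I_\alpha(X;B')_\tau - \widetilde{I}'_\beta(X;E^kZ)_\tau - \frac{1}{n} f\!\left(\varepsilon', \frac{\varepsilon'}{2}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{4}, \frac{\varepsilon'}{2}, \alpha, \beta\right). \tag{15.2.47}
\end{equation}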
, $\varepsilon$) secret key distillation protocol. This means that
$\frac{1}{k}\left( I(X;B')_\tau - I(X;E^kZ)_\tau \right)$ is an achievable rate.
is an achievable rate.
In order to prove the weak converse part of Theorem 15.32, we make use of
Corollary 15.17, specifically (15.1.144): given a bipartite state 𝜌 𝐴𝐵 , for every
(𝐾, 𝜀) secret key distillation protocol for 𝜌 𝐴𝐵 , with 𝜀 ∈ [0, 1), the following bound
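holds
\begin{multline}
\left(1 - 2\sqrt{\varepsilon} - \delta\right) \log_2 K \leq \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi)} - I(X;EZ)_{\mathcal{L}(\psi)} \right) \\
+ h_2(\sqrt{\varepsilon} + \delta) + \left(1 - \sqrt{\varepsilon} - \delta\right) \log_2\frac{1}{\delta} + 2g_2(\sqrt{\varepsilon}), \tag{15.2.55}
\end{multline}
where $\delta \in (0, 1 - \sqrt{\varepsilon})$, $\psi_{ABE}$ is a purification of $\rho_{AB}$, and the information quantities
are evaluated on the state $\mathcal{L}_{AB\to XB'Z}(\psi_{ABE})$. Applying this inequality to the state
$\rho_{AB}^{\otimes n}$ leads to the following.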
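Proposition 15.35
Let $\rho_{AB}$ be a bipartite state, let $n \in \mathbb{N}$, $\varepsilon \in [0, 1)$, and $\delta \in (0, 1 - \sqrt{\varepsilon})$. For
an $(n, K, \varepsilon)$ secret-key distillation protocol for $\rho_{AB}$ with corresponding LOPC
channel $\mathcal{L}_{A^nB^n\to XB'Z}$, with classical systems $X$ and $Z$, the rate $\frac{\log_2 K}{n}$ satisfies
\begin{multline}
\left(1 - 2\sqrt{\varepsilon} - \delta\right) \frac{\log_2 K}{n} \leq \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right) \\
+ \frac{1}{n}\left( h_2(\sqrt{\varepsilon} + \delta) + \left(1 - \sqrt{\varepsilon} - \delta\right) \log_2\frac{1}{\delta} + 2g_2(\sqrt{\varepsilon}) \right). \tag{15.2.56}
\end{multline}
Consequently,
\begin{multline}
\left(1 - 2\sqrt{\varepsilon} - \delta\right) K_D^{n,\varepsilon}(A;B)_\rho \leq \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right) \\
+ \frac{1}{n}\left( h_2(\sqrt{\varepsilon} + \delta) + \left(1 - \sqrt{\varepsilon} - \delta\right) \log_2\frac{1}{\delta} + 2g_2(\sqrt{\varepsilon}) \right), \tag{15.2.57}
\end{multline}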
Suppose that 𝑅 is an achievable rate for secret key distillation for the bipartite
state 𝜌 𝐴𝐵 . Then, by definition, for all 𝜀 ∈ (0, 1], 𝛿 > 0, and sufficiently large 𝑛,
there exists an (𝑛, 2𝑛(𝑅−𝛿) , 𝜀) secret-key distillation protocol for 𝜌 𝐴𝐵 . For all such
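protocols, the inequality in (15.2.56) holds, so that
\begin{multline}
\left(1 - 2\sqrt{\varepsilon} - \delta'\right)(R - \delta) \leq \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right) \\
+ \frac{1}{n}\left( h_2(\sqrt{\varepsilon} + \delta') + \left(1 - \sqrt{\varepsilon} - \delta'\right) \log_2\frac{1}{\delta'} + 2g_2(\sqrt{\varepsilon}) \right). \tag{15.2.58}
\end{multline}
Since the inequality holds for all sufficiently large $n$, it holds in the limit $n \to \infty$,
so that
\begin{align}
\left(1 - 2\sqrt{\varepsilon} - \delta'\right)(R - \delta)
&\leq \lim_{n\to\infty} \Bigg( \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right) \notag\\
&\qquad\qquad + \frac{1}{n}\left( h_2(\sqrt{\varepsilon} + \delta') + \left(1 - \sqrt{\varepsilon} - \delta'\right) \log_2\frac{1}{\delta'} + 2g_2(\sqrt{\varepsilon}) \right) \Bigg) \tag{15.2.59}\\
&= \lim_{n\to\infty} \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right). \tag{15.2.60}
\end{align}
Then since this inequality holds for all $\varepsilon \in (0, 1)$, $\delta > 0$, it holds in particular for
$\delta' = \sqrt{\varepsilon}$, $\varepsilon \in (0, \frac{1}{9})$, which gives
\begin{equation}
R \leq \frac{1}{1 - 3\sqrt{\varepsilon}} \lim_{n\to\infty} \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right) + \delta, \tag{15.2.61}
\end{equation}
and we thus conclude that
\begin{align}
R &\leq \lim_{\varepsilon,\delta\to 0} \left( \frac{1}{1 - 3\sqrt{\varepsilon}} \lim_{n\to\infty} \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right) + \delta \right) \tag{15.2.62}\\
&= \lim_{n\to\infty} \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right). \tag{15.2.63}
\end{align}
We have thus shown that the quantity $\lim_{n\to\infty} \frac{1}{n} \sup_{\mathcal{L}} \left( I(X;B')_{\mathcal{L}(\psi^{\otimes n})} - I(X;EZ)_{\mathcal{L}(\psi^{\otimes n})} \right)$
is a weak converse rate for secret key distillation for $\rho_{AB}$.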
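The way the correction terms in (15.2.58) disappear can be made concrete numerically. The sketch below takes $\delta' = \sqrt{\varepsilon}$ as in the proof, treats the regularized private-information term as a given number `private_rate` (an assumed input, since the supremum over LOPC channels is not computed here), and returns the resulting upper bound on $R - \delta$ for finite $n$. All function names are illustrative.

```python
import math

def h2(p: float) -> float:
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def g2(d: float) -> float:
    """g_2(d) = (d + 1) log2(d + 1) - d log2 d, with g2(0) = 0."""
    if d == 0.0:
        return 0.0
    return (d + 1) * math.log2(d + 1) - d * math.log2(d)

def weak_converse_bound(private_rate: float, eps: float, n: int) -> float:
    """Upper bound on R - delta from (15.2.58) with delta' = sqrt(eps).

    private_rate stands in for (1/n) sup_L [I(X;B') - I(X;EZ)], assumed given.
    """
    s = math.sqrt(eps)
    if not 0.0 < s < 1.0 / 3.0:
        raise ValueError("requires eps in (0, 1/9)")
    correction = (h2(2 * s) + (1 - 2 * s) * math.log2(1 / s) + 2 * g2(s)) / n
    return (private_rate + correction) / (1 - 3 * s)

for n in (10, 1000, 10**6):
    print(n, weak_converse_bound(0.7, 0.01, n))
```

For fixed $\varepsilon$ the bound decreases toward `private_rate / (1 - 3 * sqrt(eps))` as $n$ grows, and only the subsequent limit $\varepsilon \to 0$ removes the remaining prefactor. That residual dependence on $\varepsilon$ is what distinguishes a weak converse from a strong one.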
As indicated previously, the expression in (15.2.13) for distillable key involves both
a limit over an unbounded number of copies of the initial state 𝜌 𝐴𝐵 , as well as an
optimization over all two-way LOPC channels. Computing the distillable key is
therefore intractable in general. In this section, we use the one-shot upper bound
established in Section 15.1.3.2 to show that the relative entropy of entanglement is
a strong converse upper bound on the distillable key of a bipartite state 𝜌 𝐴𝐵 .
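We start by recalling the upper bound in (15.1.145), which tells us that
\begin{equation}
\log_2 K \leq \widetilde{E}_\alpha(A;B)_\rho + \frac{\alpha}{\alpha-1}\log_2\frac{1}{1-\varepsilon} \qquad \forall \alpha > 1, \tag{15.2.64}
\end{equation}
for an arbitrary $(K, \varepsilon)$ secret-key distillation protocol, where $\varepsilon \in (0, 1)$. Recall that
\begin{equation}
\widetilde{E}_\alpha(A;B)_\rho = \inf_{\sigma_{AB}\in\operatorname{SEP}(A:B)} \widetilde{D}_\alpha(\rho_{AB} \,\|\, \sigma_{AB}). \tag{15.2.65}
\end{equation}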
Recall that the upper bound above is a consequence of the fact that separable states
are useless for secret key distillation.
Applying the upper bound in (15.2.64) to the state $\rho_{AB}^{\otimes n}$ leads to the following
result:
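Corollary 15.36
Let $\rho_{AB}$ be a bipartite state, let $n \in \mathbb{N}$, $\varepsilon \in [0, 1)$, and $\alpha > 1$. For an $(n, K, \varepsilon)$
secret-key distillation protocol, the following bound holds
\begin{equation}
\frac{\log_2 K}{n} \leq \widetilde{E}_\alpha(A;B)_\rho + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{1-\varepsilon}. \tag{15.2.66}
\end{equation}
Consequently,
\begin{equation}
K_D^{n,\varepsilon}(A;B)_\rho \leq \widetilde{E}_\alpha(A;B)_\rho + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{1-\varepsilon}. \tag{15.2.67}
\end{equation}
state $\rho_{AB}^{\otimes n}$ and dividing both sides by $n$ leads to
\begin{equation}
\frac{\log_2 K}{n} \leq \frac{1}{n}\widetilde{E}_\alpha(A^n;B^n)_{\rho^{\otimes n}} + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{1-\varepsilon}. \tag{15.2.68}
\end{equation}
Now, by subadditivity of the sandwiched Rényi relative entropy of entanglement
(see (9.2.10)), we have that
\begin{equation}
\widetilde{E}_\alpha(A^n;B^n)_{\rho^{\otimes n}} \leq n\,\widetilde{E}_\alpha(A;B)_\rho. \tag{15.2.69}
\end{equation}
Therefore,
\begin{equation}
\frac{\log_2 K}{n} \leq \widetilde{E}_\alpha(A;B)_\rho + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{1-\varepsilon}, \tag{15.2.70}
\end{equation}
as required. Since this inequality holds for all $(n, K, \varepsilon)$ protocols, we obtain
(15.2.67) by optimizing over all key distillation protocols. ■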
Given an 𝜀 ∈ (0, 1), the inequality in (15.2.66) gives us a bound on the rate of
an arbitrary (𝑛, 𝐾, 𝜀) secret-key distillation protocol for a state 𝜌 𝐴𝐵 . If we instead
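fix the rate to be $r$, so that $K = 2^{nr}$, then the inequality in (15.2.66) is as follows:
\begin{equation}
r \leq \widetilde{E}_\alpha(A;B)_\rho + \frac{\alpha}{n(\alpha-1)}\log_2\frac{1}{1-\varepsilon} \tag{15.2.71}
\end{equation}
for all $\alpha > 1$. Rearranging this inequality gives us the following lower bound on $\varepsilon$: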
Proof: The proof is identical to that given for Theorem 13.24, except we make use
of (15.2.66). ■
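Rearranging (15.2.71) for $\varepsilon$ gives $\varepsilon \geq 1 - 2^{-n\left(\frac{\alpha-1}{\alpha}\right)\left(r - \widetilde{E}_\alpha(A;B)_\rho\right)}$ for every $\alpha > 1$. A short numerical sketch of this rearrangement follows; `e_alpha` is an assumed, externally supplied value of the sandwiched Rényi relative entropy of entanglement, and the function name is hypothetical.

```python
def error_lower_bound(r: float, e_alpha: float, alpha: float, n: int) -> float:
    """Lower bound on eps obtained by solving (15.2.71) for eps.

    r       : key rate of the protocol, so that K = 2**(n * r)
    e_alpha : assumed value of the Renyi relative entropy of entanglement
    alpha   : Renyi parameter, must exceed 1
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    return 1 - 2 ** (-n * ((alpha - 1) / alpha) * (r - e_alpha))

# Hypothetical numbers: rate 1.5 against e_alpha = 1 with alpha = 2.
for n in (4, 40, 400):
    print(n, error_lower_bound(1.5, 1.0, 2.0, n))
```

Whenever the rate exceeds `e_alpha`, the bound approaches one exponentially fast in $n$, which is the defining behaviour of a strong converse; for rates below `e_alpha` the expression is non-positive and the bound is trivial.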
Given that the relative entropy of entanglement is a strong converse rate for
distillable key, by following arguments analogous to those in the referenced proof,
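we conclude that $\frac{1}{k} E_R(A^k;B^k)_{\rho^{\otimes k}}$ is a strong converse rate for all $k \in \mathbb{N}$. Therefore,
the regularized quantity
\begin{equation}
E_R^{\text{reg}}(A;B)_\rho \coloneqq \lim_{n\to\infty} \frac{1}{n} E_R(A^n;B^n)_{\rho^{\otimes n}} \tag{15.2.75}
\end{equation}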
so that the regularized quantity in general gives a tighter upper bound on distillable
key.
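Corollary 15.38
Let $\rho_{AB}$ be a bipartite state, let $n \in \mathbb{N}$, and let $\varepsilon \in [0, 1)$. For an $(n, K, \varepsilon)$
secret-key distillation protocol, the following bound holds
\begin{equation}
\left(1 - 2\sqrt{\varepsilon}\right) \frac{1}{n}\log_2 K \leq E_{\text{sq}}(A;B)_\rho + \frac{2}{n} g_2(\sqrt{\varepsilon}). \tag{15.2.78}
\end{equation}
the state $\rho_{AB}^{\otimes n}$ and dividing both sides by $n$ leads to
\begin{equation}
\left(1 - 2\sqrt{\varepsilon}\right) \frac{1}{n}\log_2 K \leq \frac{1}{n} E_{\text{sq}}(A^n;B^n)_{\rho^{\otimes n}} + \frac{2}{n} g_2(\sqrt{\varepsilon}). \tag{15.2.79}
\end{equation}
Now, by additivity of the squashed entanglement (Proposition 9.32), we have that
\begin{equation}
E_{\text{sq}}(A^n;B^n)_{\rho^{\otimes n}} = n E_{\text{sq}}(A;B)_\rho. \tag{15.2.80}
\end{equation}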
We now provide a proof of (15.2.15), the statement that the squashed entan-
glement is a weak converse rate for secret key distillation. Suppose that 𝑅 is
an achievable rate for secret key distillation for the bipartite state 𝜌 𝐴𝐵 . Then,
by definition, for all 𝜀 ∈ (0, 1], 𝛿 > 0, and sufficiently large 𝑛, there exists an
(𝑛, 2𝑛(𝑅−𝛿) , 𝜀) secret-key distillation protocol for 𝜌 𝐴𝐵 . For all such protocols, the
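inequality in (15.2.78) holds, so that
\begin{equation}
\left(1 - 2\sqrt{\varepsilon}\right)(R - \delta) \leq E_{\text{sq}}(A;B)_\rho + \frac{2}{n} g_2(\sqrt{\varepsilon}). \tag{15.2.81}
\end{equation}
Since the inequality holds for all sufficiently large $n$, it holds in the limit $n \to \infty$,
so that
\begin{align}
\left(1 - 2\sqrt{\varepsilon}\right)(R - \delta) &\leq \lim_{n\to\infty} \left( E_{\text{sq}}(A;B)_\rho + \frac{2}{n} g_2(\sqrt{\varepsilon}) \right) \tag{15.2.82}\\
&= E_{\text{sq}}(A;B)_\rho. \tag{15.2.83}
\end{align}
Then, since this inequality holds for all $\varepsilon \in (0, 1]$ and $\delta > 0$, it holds in particular
for all $\varepsilon \in (0, \frac{1}{4})$ and $\delta > 0$, implying that
\begin{equation}
R \leq \frac{1}{1 - 2\sqrt{\varepsilon}} E_{\text{sq}}(A;B)_\rho + \delta, \tag{15.2.84}
\end{equation}
and furthermore, that
\begin{align}
R &\leq \lim_{\varepsilon,\delta\to 0} \left( \frac{1}{1 - 2\sqrt{\varepsilon}} E_{\text{sq}}(A;B)_\rho + \delta \right) \tag{15.2.85}\\
&= E_{\text{sq}}(A;B)_\rho. \tag{15.2.86}
\end{align}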
We have thus shown that the squashed entanglement is a weak converse rate for
secret key distillation.
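where
\begin{equation}
\tau_{XBE} \coloneqq \sum_{x} |x\rangle\!\langle x|_X \otimes \operatorname{Tr}_A[\Lambda^x_A\,\psi_{ABE}], \tag{15.3.2}
\end{equation}
$\psi_{ABE}$ is a purification of $\rho_{AB}$, and $\{\Lambda^x_A\}_x$ is a POVM. By reversing the roles of
Alice and Bob in the protocol, we find that
\begin{equation}
K_D(A;B)_\rho \geq I(A;Y)_\omega - I(Y;E)_\omega, \tag{15.3.3}
\end{equation}
where
\begin{equation}
\omega_{YAE} \coloneqq \sum_{y} |y\rangle\!\langle y|_Y \otimes \operatorname{Tr}_B[\Gamma^y_B\,\psi_{ABE}], \tag{15.3.4}
\end{equation}
and $\{\Gamma^y_B\}_y$ is a POVM. Then, in general, we have the following lower bound on
distillable key: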
where the information quantities are evaluated on the state $\mathcal{L}^{\leftrightarrow}_{A^nB^n\to XB'Z}(\psi_{ABE}^{\otimes n})$,
$\mathcal{L}^{\leftrightarrow}_{A^nB^n\to XB'Z}$ is an LOPC channel with classical systems $X$ and $Z$, and $\psi_{ABE}$ is a
purification of $\rho_{AB}$.
If we restrict the optimization in (15.3.6) above to one-way LOPC channels
of the form $\mathcal{L}^{\rightarrow}_{A^nB^n\to XB'Z}$, then we obtain what is called the one-way distillable
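where
\begin{equation}
D_K^{\rightarrow}(\rho_{AB}) \coloneqq \sup_{\mathcal{L}^{\rightarrow}} \left( I(X;B')_{\mathcal{L}^{\rightarrow}(\psi)} - I(X;EZ)_{\mathcal{L}^{\rightarrow}(\psi)} \right). \tag{15.3.9}
\end{equation}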
Like the distillable key, the one-way distillable key is an operational quantity of
interest. Furthermore, the equality in (15.3.7) can be proved similarly to how we
proved (15.2.13).
In what follows, we show that this expression for one-way distillable key can be
simplified.
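where
\begin{equation}
\tau_{XZB^nE^n} \coloneqq \sum_{x\in\mathcal{X},\,z\in\mathcal{Z}} |x\rangle\!\langle x|_X \otimes |z\rangle\!\langle z|_Z \otimes \operatorname{Tr}_{A^n}[\Lambda^{x,z}_{A^n}\,\psi_{ABE}^{\otimes n}], \tag{15.3.11}
\end{equation}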
This theorem tells us that, to determine the one-way distillable key of a bipartite
state, it suffices to optimize over one-way LOPC channels that consist of a POVM
conducted on Alice’s systems.
Proof: Let us start by recalling from Definition 4.22 and the discussion around
$$ \omega_{XB'Z} := \mathcal{L}^{\to}_{A^nB^n\to XB'Z}(\xi_{A^nB^n}) \tag{15.3.12} $$
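where $\mathcal{Z}$ is some finite alphabet, $\{\mathcal{E}^z_{A^n\to X}\}_{z\in\mathcal{Z}}$ is a set of completely positive maps such that $\sum_{z\in\mathcal{Z}} \mathcal{E}^z_{A^n\to X}$ is trace preserving, and $\{\mathcal{D}^z_{B^n\to B'}\}_{z\in\mathcal{Z}}$ is a set of channels.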
Furthermore,
∑︁
E 𝐴𝑛 →𝑋 𝑍 𝐴 (𝜉 𝐴𝑛 𝐵𝑛 ) = E𝑧𝐴𝑛 →𝑋 (𝜉 𝐴𝑛 𝐵𝑛 ) ⊗ |𝑧⟩⟨𝑧| 𝑍 , (15.3.15)
𝑧∈Z
D𝑍 𝐵 𝐵𝑛 →𝐵′ (|𝑧⟩⟨𝑧| 𝑍 𝐵 ⊗ 𝜉 𝐴𝑛 𝐵𝑛 ) = D𝑧𝐵𝑛 →𝐵′ (𝜉 𝐴𝑛 𝐵𝑛 ), (15.3.16)
and since the map E𝑧𝐴𝑛 →𝑋 has a classical output 𝑋, it can be written as
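$$ \mathcal{E}^z_{A^n\to X}(\xi_{A^nB^n}) = \sum_{x\in\mathcal{X}} \operatorname{Tr}_{A^n}[\Lambda^{x,z}_{A^n}\, \xi_{A^nB^n}]\, |x\rangle\langle x|_X, \tag{15.3.17} $$
where $\{\Lambda^{x,z}_{A^n}\}_{x\in\mathcal{X},\,z\in\mathcal{Z}}$ is a POVM.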
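For every $n\in\mathbb{N}$, if we restrict the optimization in (15.3.7) to $\mathcal{D}^z_{B^n\to B'} = \mathrm{id}_{B^n}$ for all $z\in\mathcal{Z}$ and $\mathcal{E}^z_{A^n\to X}(\cdot) = \sum_{x\in\mathcal{X}} \operatorname{Tr}_{A^n}[\Lambda^{x,z}_{A^n}(\cdot)]\,|x\rangle\langle x|_X$ for all $z\in\mathcal{Z}$, then the LOPC channel $\mathcal{L}^{\to}_{A^nB^n\to XB'Z}$ reduces to
$$ \mathcal{L}^{\to}_{A^nB^n\to XB^nZ}(\xi_{A^nB^n}) = \sum_{x\in\mathcal{X},\,z\in\mathcal{Z}} \operatorname{Tr}_{A^n}[\Lambda^{x,z}_{A^n}\, \xi_{A^nB^n}] \otimes |x\rangle\langle x|_X \otimes |z\rangle\langle z|_Z \tag{15.3.18} $$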
The rest of the proof is devoted to proving the reverse inequality. Let $\mathcal{L}^{\to}_{A^nB^n\to XB'Z}$ be an arbitrary LOPC channel of the form in (15.3.12)–(15.3.17). Consider that
where
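$$ \mathcal{L}'_{A^nB^n\to XB^nZ}(\xi_{A^nB^n}) := \sum_{x\in\mathcal{X},\,z\in\mathcal{Z}} \operatorname{Tr}_{A^n}[\Lambda^{x,z}_{A^n}\, \xi_{A^nB^n}] \otimes |x\rangle\langle x|_X \otimes |z\rangle\langle z|_Z. \tag{15.3.23} $$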
The inequality follows from data-processing with respect to the decoding channel
D𝑍 𝐵 𝐵𝑛 →𝐵′ of Bob. This concludes the proof. ■
Lemma 15.40
For every bipartite state $\rho_{AB}$, the optimized private information lower bound on distillable key is non-negative, i.e., $D_K^{\to}(\rho_{AB}) \geq 0$.
Lemma 15.41
For every bipartite state 𝜌 𝐴𝐵 , the optimized private information lower bound
on distillable key is not smaller than the coherent information of 𝜌 𝐴𝐵 , i.e.,
𝐷→𝐾 (𝜌 𝐴𝐵 ) ≥ 𝐼 ( 𝐴⟩𝐵) 𝜌 . Thus, the coherent information is a lower bound for
one-way distillable key:
𝐾 𝐷→ (𝜌 𝐴𝐵 ) ≥ 𝐼 ( 𝐴⟩𝐵) 𝜌 . (15.3.25)
Proof: Let Λ𝑥,𝑧 𝐴 = |𝜑𝑥 ⟩⟨𝜑𝑥 | 𝐴 be a rank-one POVM for which there is no output 𝑧.
Let the state after the measurement be as follows:
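\begin{align}
\tau_{XBE} &:= \sum_{x\in\mathcal{X}} |x\rangle\langle x|_X \otimes \operatorname{Tr}_A[|\varphi_x\rangle\langle\varphi_x|_A\, \psi_{ABE}] \tag{15.3.26}\\
&= \sum_{x\in\mathcal{X}} p(x)\,|x\rangle\langle x|_X \otimes \psi^x_{BE}, \tag{15.3.27}
\end{align}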
where
The first equality follows because the 𝑍 system is trivial. The third equality follows
because 𝐻 (𝐵|𝑋)𝜏 = 𝐻 (𝐸 |𝑋)𝜏 , which in turn follows because each state 𝜓 𝑥𝐵𝐸 is
pure. ■
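The key step of this proof, that a rank-one measurement on Alice's share of a pure state gives 𝐻(𝐵|𝑋) = 𝐻(𝐸|𝑋) and hence a private information equal to the coherent information, can be checked numerically. The following is a minimal sketch under stated assumptions: the state is a randomly drawn pure state on 𝐴𝐵𝐸, the POVM is the computational-basis measurement on 𝐴, and all function names are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def entropy(rho: np.ndarray) -> float:
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def private_and_coherent_information(d_a=2, d_b=2, d_e=4, seed=0):
    """Return (I(X;B) - I(X;E), I(A>B)) for a random pure state psi_ABE
    when A is measured in the computational basis (a rank-one POVM)."""
    rng = np.random.default_rng(seed)
    psi = rng.normal(size=(d_a, d_b, d_e)) + 1j * rng.normal(size=(d_a, d_b, d_e))
    psi /= np.linalg.norm(psi)

    # Reduced states of B and E (the measurement on A does not change them).
    rho_b = np.einsum("abe,ace->bc", psi, psi.conj())
    rho_e = np.einsum("abe,abf->ef", psi, psi.conj())

    cond_b = 0.0  # H(B|X)
    cond_e = 0.0  # H(E|X)
    for x in range(d_a):
        v = psi[x]                          # unnormalized pure state on BE given outcome x
        p_x = float(np.linalg.norm(v) ** 2)
        cond_b += p_x * entropy(v @ v.conj().T / p_x)
        cond_e += p_x * entropy(v.T @ v.conj() / p_x)

    private = (entropy(rho_b) - cond_b) - (entropy(rho_e) - cond_e)
    coherent = entropy(rho_b) - entropy(rho_e)  # H(B) - H(AB), with H(AB) = H(E)
    return private, coherent
```

Calling `private_and_coherent_information()` for several seeds should return two numbers that agree up to floating-point error, which is the content of the third equality in the proof.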
15.4 Examples
We now consider classes of bipartite states and evaluate the upper and lower bounds
on their distillable key that we have established in this chapter. In some cases,
the distillable key can be determined exactly because the upper and lower bounds
coincide.
The simplest example for which distillable key can be determined exactly is the
class of pure bipartite states. In this case, the coherent information lower bound
from Lemma 15.41 and the relative entropy of entanglement upper bound from
Theorem 15.32 coincide and are equal to the entropy of the reduced state. Thus,
applying this same reasoning from Section 13.3.1, we conclude the following:
Applying this fact and the data-processing inequality to the expression 𝐼 (𝑋; 𝐵𝑛 |𝑍)𝜏 −
𝐼 (𝑋; 𝐸 𝑛 |𝑍)𝜏 from Theorem 15.39, we conclude that
$$ D_K^{\to}(\rho_{AB}) = I(A\rangle B)_{\rho}. \tag{15.4.3} $$
Consequently, $D_K^{\to}(\rho_{AB}^{\otimes n}) = n D_K^{\to}(\rho_{AB})$, and thus the one-way distillable key of a degradable state $\rho_{AB}$ is equal to its coherent information:
$$ K_D^{\to}(A;B)_{\rho} = I(A\rangle B)_{\rho}. \tag{15.4.4} $$
where
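$$ \tau_{XZB^nE^n} = \sum_{x\in\mathcal{X},\,z\in\mathcal{Z}} |x\rangle\langle x|_X \otimes |z\rangle\langle z|_Z \otimes \operatorname{Tr}_{A^n}[\Lambda^{x,z}_{A^n}\, \psi_{ABE}^{\otimes n}]. \tag{15.4.8} $$
The sole inequality above follows from the data-processing inequality for mutual information and the fact that there is a degrading channel from $B^n$ to $E^n$. Now let $\Lambda^{x,z}_{A^n} = \sum_{y} |\varphi^{x,y,z}\rangle\langle\varphi^{x,y,z}|_{A^n}$ be a rank-one decomposition of the POVM $\{\Lambda^{x,z}_{A^n}\}_{x,z}$ and define the following extension of the state $\tau_{XZB^nE^n}$:
$$ \tau_{XZYB^nE^n} = \sum_{x\in\mathcal{X},\,y,\,z\in\mathcal{Z}} |x\rangle\langle x|_X \otimes |z\rangle\langle z|_Z \otimes |y\rangle\langle y|_Y \otimes \operatorname{Tr}_{A^n}[|\varphi^{x,y,z}\rangle\langle\varphi^{x,y,z}|_{A^n}\, \psi_{ABE}^{\otimes n}]. \tag{15.4.9} $$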
Then consider that
𝐼 (𝑋 𝑍; 𝐵𝑛 )𝜏 − 𝐼 (𝑋 𝑍; 𝐸 𝑛 )𝜏
The sole inequality above follows from the data-processing inequality for conditional
mutual information and the fact that there is a degrading channel from 𝐵𝑛 to 𝐸 𝑛 .
This concludes the proof. ■
15.5 Summary
In this chapter, we considered the task of secret key distillation, in which the goal is
for Alice and Bob to convert a bipartite state to an approximate tripartite key state
with as many secret key bits as possible. In doing so, they are allowed to perform
local operations and public classical communication, in which an eavesdropper
obtains a copy of all of the classical communication exchanged. The highest rate at
which this can be accomplished is called the distillable key of the state. We began
with the one-shot setting, in which we allow some error in the distillation protocol,
and we determined lower and upper bounds on the number of approximate secret
key bits that can be distilled. In the asymptotic setting, we proved that the private
information of the state is an achievable rate, and we proved that the squashed
entanglement and the relative entropy of entanglement are upper bounds. These
latter quantities are the best known upper bounds on distillable key.
By performing secret key distillation and then the one-time pad protocol
(described in the introduction of this chapter), Alice can transmit a classical
message privately to Bob. This process thus induces an ideal private classical
channel from Alice to Bob. If Alice and Bob are connected by a quantum
channel, then they can use it to share a bipartite state, from which they can induce
a private classical channel in the aforementioned manner. This is one way to
communicate privately over a quantum channel. In the next chapter, we discuss
other, more direct approaches for private communication, which give an optimal
$$ \rho_A \otimes \widetilde{\rho}_E = p\,\widetilde{\rho}_{AE} + (1-p)\,\omega_{AE}, \tag{15.A.1} $$
for some 𝑝 ∈ (0, 1) and 𝜔 𝐴𝐸 some state. We define the following state, which we
think of as an approximation to 𝜏𝐴1 ···𝐴𝑅 𝐸 :
$$ \widetilde{\tau}_{A_1\cdots A_RE} := \frac{1}{R}\sum_{r=1}^{R} \rho_{A_1}\otimes\cdots\otimes\rho_{A_{r-1}}\otimes\widetilde{\rho}_{A_rE}\otimes\rho_{A_{r+1}}\otimes\cdots\otimes\rho_{A_R}. \tag{15.A.2} $$
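It is a good approximation if $\sqrt{\varepsilon}-\eta$ is small, because
\begin{align}
\sqrt{F}(\tau_{A_1\cdots A_RE}, \widetilde{\tau}_{A_1\cdots A_RE})
&\geq \frac{1}{R}\sum_{r=1}^{R} \sqrt{F}\big(\rho_A^{\otimes r-1}\otimes\rho_{A_rE}\otimes\rho_A^{\otimes R-r},\ \rho_A^{\otimes r-1}\otimes\widetilde{\rho}_{A_rE}\otimes\rho_A^{\otimes R-r}\big) \tag{15.A.3}\\
&= \frac{1}{R}\sum_{r=1}^{R} \sqrt{F}(\rho_{A_rE}, \widetilde{\rho}_{A_rE}) \tag{15.A.4}\\
&= \sqrt{F}(\rho_{AE}, \widetilde{\rho}_{AE}), \tag{15.A.5}
\end{align}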
where the inequality follows from the concavity of the root fidelity (Theorem 6.11).
This in turn implies that
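$$ \sqrt{F}(\tau_{A_1\cdots A_RE}, \widetilde{\tau}_{A_1\cdots A_RE}) \geq \sqrt{F}(\rho_{AE}, \widetilde{\rho}_{AE}). \tag{15.A.6} $$
So the inequality in (15.A.6), the definition of the sine distance (Definition 6.16), and the fact that $P(\widetilde{\rho}_{AE}, \rho_{AE}) \leq \sqrt{\varepsilon}-\eta$, imply that
$$ P(\tau_{A_1\cdots A_RE}, \widetilde{\tau}_{A_1\cdots A_RE}) \leq \sqrt{\varepsilon}-\eta. \tag{15.A.7} $$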
Now, let us define the following states:
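\begin{align}
\beta_{AE} &:= \rho_A\otimes\widetilde{\rho}_E, \tag{15.A.8}\\
\alpha_{AE} &:= \widetilde{\rho}_{AE}, \tag{15.A.9}\\
\widetilde{\tau}_{A^RE^R} &:= \frac{1}{R}\sum_{r=1}^{R} \beta_{A_1E_1}\otimes\cdots\otimes\beta_{A_{r-1}E_{r-1}}\otimes\alpha_{A_rE_r}\otimes\beta_{A_{r+1}E_{r+1}}\otimes\cdots\otimes\beta_{A_RE_R}, \tag{15.A.10}
\end{align}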
and observe that
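\begin{align}
\operatorname{Tr}_{E_2^R}[(\beta_{AE})^{\otimes R}] &= \rho_{A_1}\otimes\cdots\otimes\rho_{A_R}\otimes\widetilde{\rho}_E, \tag{15.A.11}\\
\operatorname{Tr}_{E_2^R}[\widetilde{\tau}_{A^RE^R}] &= \widetilde{\tau}_{A_1\cdots A_RE}. \tag{15.A.12}
\end{align}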
Thus, it follows from the data-processing inequality for the sine distance that
$$ P(\widetilde{\tau}_{A_1\cdots A_RE},\ \rho_{A_1}\otimes\cdots\otimes\rho_{A_R}\otimes\widetilde{\rho}_E) \leq P(\widetilde{\tau}_{A^RE^R},\ (\beta_{AE})^{\otimes R}). \tag{15.A.13} $$
Now consider that
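\begin{align}
(\beta_{AE})^{\otimes R} &= \big(p\,\widetilde{\rho}_{AE} + (1-p)\,\omega_{AE}\big)^{\otimes R} \tag{15.A.14}\\
&= \sum_{S\subseteq[R]} p^{|S|}(1-p)^{R-|S|}\, \widetilde{\rho}_{AE}^{\otimes S}\otimes\omega_{AE}^{\otimes [R]\setminus S} \tag{15.A.15}\\
&= \sum_{k=0}^{R} \binom{R}{k} p^{k}(1-p)^{R-k}\,\theta_k \tag{15.A.16}
\end{align}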
𝑅−1 𝑘 𝑅
In the last line, we used the identity 𝑘−1 = 𝑅𝑝 𝑘 . Defining the following
classical–quantum states:
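\begin{align}
\beta_{A^RE^RK} &:= \sum_{k=0}^{R} \binom{R}{k} p^{k}(1-p)^{R-k}\,\theta_k\otimes|k\rangle\langle k|_K, \tag{15.A.22}\\
\widetilde{\tau}_{A^RE^RK} &:= \sum_{k=0}^{R} \frac{k}{Rp}\binom{R}{k} p^{k}(1-p)^{R-k}\,\theta_k\otimes|k\rangle\langle k|_K, \tag{15.A.23}
\end{align}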
consider that
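\begin{align}
\sqrt{F}((\beta_{AE})^{\otimes R}, \widetilde{\tau}_{A^RE^R})
&\geq \sqrt{F}(\beta_{A^RE^RK}, \widetilde{\tau}_{A^RE^RK}) \tag{15.A.24}\\
&= \sum_{k=0}^{R} \sqrt{\binom{R}{k} p^{k}(1-p)^{R-k}}\ \sqrt{\frac{k}{Rp}\binom{R}{k} p^{k}(1-p)^{R-k}}\ \sqrt{F}(\theta_k,\theta_k) \tag{15.A.25}\\
&= \sum_{k=0}^{R} \binom{R}{k} p^{k}(1-p)^{R-k}\sqrt{\frac{k}{Rp}} \tag{15.A.26}\\
&= \sqrt{\frac{1}{Rp}}\ \sum_{k=0}^{R} \binom{R}{k} p^{k}(1-p)^{R-k}\sqrt{k} \tag{15.A.27}\\
&= \sqrt{\frac{1}{Rp}}\ \mathbb{E}_K\big[\sqrt{K}\big], \tag{15.A.28}
\end{align}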
where E𝐾 denotes the expectation with respect to the binomial random variable 𝐾.
The first inequality follows from the data-processing inequality for fidelity with
respect to partial trace. The other steps follow by direct evaluation. Let 𝜇 = 𝑅 𝑝 (i.e.,
the mean of a binomial random variable). Consider that the following inequality
holds for all 𝑘 ≥ 0 and 𝜇 > 0:
$$ \sqrt{k} \geq \sqrt{\mu} + \frac{k-\mu}{2\sqrt{\mu}} - \frac{(k-\mu)^2}{2\mu^{3/2}}. \tag{15.A.29} $$
Then we find that
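\begin{align}
\sqrt{\frac{1}{Rp}}\ \mathbb{E}_K\big[\sqrt{K}\big]
&\geq \sqrt{\frac{1}{Rp}}\ \mathbb{E}_K\left[\sqrt{\mu} + \frac{K-\mu}{2\sqrt{\mu}} - \frac{(K-\mu)^2}{2\mu^{3/2}}\right] \tag{15.A.30}\\
&= \sqrt{\frac{1}{Rp}}\left(\sqrt{\mu} - \frac{\operatorname{Var}(K)}{2\mu^{3/2}}\right) \tag{15.A.31}\\
&= \sqrt{\frac{1}{Rp}}\left(\sqrt{Rp} - \frac{\operatorname{Var}(K)}{2(Rp)^{3/2}}\right) \tag{15.A.32}\\
&= 1 - \frac{Rp(1-p)}{2(Rp)^2} \tag{15.A.33}\\
&= 1 - \frac{1-p}{2Rp} \tag{15.A.34}\\
&\geq 1 - \frac{1}{Rp}. \tag{15.A.35}
\end{align}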
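As a sanity check of the two facts used in this chain, the pointwise bound (15.A.29) and the resulting bound on the normalized expectation of √𝐾 for a binomial random variable 𝐾, here is a minimal numerical sketch. The function names are hypothetical helpers chosen for this illustration, and the parameter values in the usage note are arbitrary.

```python
import math

def sqrt_lower_bound(k: float, mu: float) -> float:
    """Right-hand side of (15.A.29): a quadratic lower bound on sqrt(k)."""
    return (math.sqrt(mu)
            + (k - mu) / (2 * math.sqrt(mu))
            - (k - mu) ** 2 / (2 * mu ** 1.5))

def normalized_mean_sqrt(R: int, p: float) -> float:
    """E[sqrt(K)] / sqrt(R p) for K ~ Binomial(R, p), as in (15.A.28)."""
    total = sum(
        math.comb(R, k) * p ** k * (1 - p) ** (R - k) * math.sqrt(k)
        for k in range(R + 1)
    )
    return total / math.sqrt(R * p)
```

For example, with `R = 20` and `p = 0.3`, the value `normalized_mean_sqrt(20, 0.3)` should be at least `1 - 1 / (20 * 0.3)`, in line with the final inequality of the chain.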
Thus it follows that
$$ \sqrt{F}((\beta_{AE})^{\otimes R}, \widetilde{\tau}_{A^RE^R}) \geq 1 - \frac{\eta^2}{2} \tag{15.A.36} $$
if
$$ \log_2 R \geq \log_2(1/p) + \log_2\!\left(\frac{2}{\eta^2}\right). \tag{15.A.37} $$
This implies that
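$$ P((\beta_{AE})^{\otimes R}, \widetilde{\tau}_{A^RE^R}) \leq \eta. \tag{15.A.38} $$
For the same choice of $R$, it follows from (15.A.13) that
$$ P(\widetilde{\tau}_{A_1\cdots A_RE},\ \rho_{A_1}\otimes\cdots\otimes\rho_{A_R}\otimes\widetilde{\rho}_E) \leq \eta. \tag{15.A.39} $$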
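The whole argument above holds for an arbitrary state $\widetilde{\rho}_{AE}$ satisfying $P(\widetilde{\rho}_{AE}, \rho_{AE}) \leq \sqrt{\varepsilon}-\eta$ and (15.A.1), and so taking an infimum of $\log_2(1/p)$ over $p$ and all states satisfying these conditions, and applying the definition in (15.1.211), as well as Lemma 7.59, we find that
$$ P(\tau_{A_1\cdots A_RE},\ \rho_{A_1}\otimes\cdots\otimes\rho_{A_R}\otimes\widetilde{\rho}_E) \leq \sqrt{\varepsilon} \tag{15.A.41} $$
if
$$ \log_2 R \geq I_{\max}^{\sqrt{\varepsilon}-\eta}(E;A)_{\rho} + \log_2\!\left(\frac{2}{\eta^2}\right). \tag{15.A.42} $$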
This concludes the proof.
$$ \operatorname{Tr}[\Pi^{\gamma}_E\, \widetilde{\rho}_E] \geq 1 - \frac{\delta^2}{8}. \tag{15.B.4} $$
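\begin{align}
\widehat{\rho}_{AE} &= \operatorname{Tr}_X[\widehat{\rho}_{AEX}] \tag{15.B.7}\\
&= \Pi^{\gamma}_E\, \widetilde{\rho}_{AE}\, \Pi^{\gamma}_E + \rho_A\otimes\widetilde{\rho}_E^{1/2}\big(I - \Pi^{\gamma}_E\big)\widetilde{\rho}_E^{1/2}. \tag{15.B.8}
\end{align}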
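Then, using the inequality $\widetilde{\rho}_{AE} \leq \mu\,\rho_A\otimes\rho_E$, with
and the fact that $\mu\frac{8}{\delta^2} \geq 1$ (which holds because $D_{\max}(\widetilde{\rho}_{AE}\Vert\rho_A\otimes\rho_E) \geq 0$ and $8 \geq \delta^2$), we find that
\begin{align}
\widehat{\rho}_{AE} &\leq \mu\,\rho_A\otimes\Pi^{\gamma}_E\,\rho_E\,\Pi^{\gamma}_E + \rho_A\otimes\widetilde{\rho}_E^{1/2}\big(I-\Pi^{\gamma}_E\big)\widetilde{\rho}_E^{1/2} \tag{15.B.10}\\
&\leq \mu\frac{8}{\delta^2}\,\rho_A\otimes\Pi^{\gamma}_E\,\widetilde{\rho}_E\,\Pi^{\gamma}_E + \rho_A\otimes\widetilde{\rho}_E^{1/2}\big(I-\Pi^{\gamma}_E\big)\widetilde{\rho}_E^{1/2} \tag{15.B.11}\\
&\leq \mu\frac{8}{\delta^2}\Big[\rho_A\otimes\Pi^{\gamma}_E\,\widetilde{\rho}_E\,\Pi^{\gamma}_E + \rho_A\otimes\widetilde{\rho}_E^{1/2}\big(I-\Pi^{\gamma}_E\big)\widetilde{\rho}_E^{1/2}\Big] \tag{15.B.12}\\
&= \mu\frac{8}{\delta^2}\,\rho_A\otimes\Big[\Pi^{\gamma}_E\,\widetilde{\rho}_E\,\Pi^{\gamma}_E + \widetilde{\rho}_E^{1/2}\big(I-\Pi^{\gamma}_E\big)\widetilde{\rho}_E^{1/2}\Big] \tag{15.B.13}\\
&= \mu\frac{8}{\delta^2}\,\rho_A\otimes\widehat{\rho}_E. \tag{15.B.14}
\end{align}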
The second inequality above follows from (15.B.2). Applying the definition of
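$D_{\max}(\widehat{\rho}_{AE}\Vert\rho_A\otimes\widehat{\rho}_E)$, we conclude that
$$ D_{\max}(\widehat{\rho}_{AE}\Vert\rho_A\otimes\widehat{\rho}_E) \leq D_{\max}(\widetilde{\rho}_{AE}\Vert\rho_A\otimes\rho_E) + \log_2\!\left(\frac{8}{\delta^2}\right). \tag{15.B.15} $$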
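\begin{align}
\widehat{\rho}_{AE} &= \Pi^{\gamma}_E\,\widetilde{\rho}_{AE}\,\Pi^{\gamma}_E + \rho_A\otimes\widetilde{\rho}_E^{1/2}\big(I-\Pi^{\gamma}_E\big)\widetilde{\rho}_E^{1/2} \tag{15.B.22}\\
&\geq \Pi^{\gamma}_E\,\widetilde{\rho}_{AE}\,\Pi^{\gamma}_E \tag{15.B.23}
\end{align}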
From the above and (15.B.4), we conclude that $F(\widehat{\rho}_{AEX}, \rho_{AEX}) \geq 1 - \frac{\delta^2}{4}$, which implies that
$$ P(\widehat{\rho}_{AEX}, \rho_{AEX}) \leq \frac{\delta}{2}. \tag{15.B.24} $$
Now consider that
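\begin{align}
P(\rho_{AEX},\ \rho_{AE}\otimes|0\rangle\langle 0|_X)
&\leq P(\rho_{AEX},\ \widetilde{\rho}_{AE}\otimes|0\rangle\langle 0|_X) + P(\widetilde{\rho}_{AE}\otimes|0\rangle\langle 0|_X,\ \rho_{AE}\otimes|0\rangle\langle 0|_X) \tag{15.B.25}\\
&= \sqrt{1 - F(\rho_{AEX},\ \widetilde{\rho}_{AE}\otimes|0\rangle\langle 0|_X)} + P(\widetilde{\rho}_{AE}, \rho_{AE}) \tag{15.B.26}\\
&= \sqrt{1 - \left\Vert\sqrt{\Pi^{\gamma}_E\,\widetilde{\rho}_{AE}\,\Pi^{\gamma}_E}\ \sqrt{\widetilde{\rho}_{AE}}\right\Vert_1^2} + P(\widetilde{\rho}_{AE}, \rho_{AE}) \tag{15.B.27}\\
&= \sqrt{1 - \left\Vert\sqrt{\Pi^{\gamma}_E\,\widetilde{\rho}_{AE}\,\Pi^{\gamma}_E}\ \sqrt{\Pi^{\gamma}_E\,\widetilde{\rho}_{AE}\,\Pi^{\gamma}_E}\right\Vert_1^2} + P(\widetilde{\rho}_{AE}, \rho_{AE}) \tag{15.B.28}\\
&= \sqrt{1 - \operatorname{Tr}[\Pi^{\gamma}_E\,\widetilde{\rho}_{AE}]^2} + P(\widetilde{\rho}_{AE}, \rho_{AE}) \tag{15.B.29}\\
&\leq \frac{\delta}{2} + \varepsilon, \tag{15.B.30}
\end{align}
where we applied the triangle inequality of the sine distance (Lemma 6.17) for the first inequality and the fact that $\Vert\sqrt{\Pi\omega\Pi}\sqrt{\tau}\Vert_1 = \Vert\sqrt{\Pi\omega\Pi}\sqrt{\Pi\tau\Pi}\Vert_1$ for a projector $\Pi$ and states $\omega$ and $\tau$. Combining this with (15.B.24), we find that
\begin{align}
P(\widehat{\rho}_{AE}, \rho_{AE}) &= P(\widehat{\rho}_{AEX},\ \rho_{AE}\otimes|0\rangle\langle 0|_X) \tag{15.B.31}\\
&\leq P(\widehat{\rho}_{AEX}, \rho_{AEX}) + P(\rho_{AEX},\ \rho_{AE}\otimes|0\rangle\langle 0|_X) \tag{15.B.32}\\
&\leq \varepsilon + \delta. \tag{15.B.33}
\end{align}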
Chapter 16
Private Communication
This chapter focuses on the task of private communication, in which the goal is for
a sender to communicate classical information privately over a quantum channel to
a receiver, such that the environment of the channel gains essentially no information
about the message transmitted. There are connections between this task and secret
key distillation from Chapter 15, as well as with quantum communication from
Chapter 14. Private communication can be considered a dynamic version of the
general problem of establishing secret correlations between two parties, whereas
secret key distillation is a static version of the same problem. Indeed, the resource
shared between the two parties in the former task is a quantum channel (a dynamic
resource), whereas the resource shared in the latter is a bipartite quantum state (a
static resource). The cryptographic models are similar as well: in key distillation,
we assumed that an eavesdropper possesses the purifying system of a purification of
the shared state, whereas, in this chapter, we assume that an eavesdropper possesses
the purifying system of a purification of the channel connecting the sender to
the receiver (i.e., the eavesdropper possesses the environment of the channel). The
connection of private communication to quantum communication is as follows: if
two parties can communicate some amount of quantum information with some
error, then the amount of private information that they can communicate is related
to this amount by an inequality. This inequality in turn implies that the private
capacity of a quantum channel is not smaller than its quantum capacity.
As with other communication tasks that we have considered in previous chapters,
there are multiple ways to define how communication can be private, based on
various error criteria. In this chapter, we define two such criteria that lead to two
different but related communication tasks, one that we call secret-key transmission
and another that we call private communication. The criterion for the former task
is most similar to an average error criterion, in which the goal is for the sender
to use the channel to transmit one share of a secret key to the receiver, and the
criterion for the latter task is a maximal infidelity criterion, in which all messages
transmitted over the channel are required to meet a particular error criterion, which
captures both the decoding error probability of the receiver, as well as the security
of the message transmitted.
As usual by now, we begin our development in the one-shot setting, with the
goal of establishing lower and upper bounds on the one-shot private capacity.
We find several upper bounds on the one-shot private capacity, in terms of the
one-shot private information of the channel, the hypothesis testing relative entropy
of entanglement, and the squashed entanglement. The lower bound that we establish
is related to a different variation of the one-shot private information of the channel
(not the same quantity as in the upper bound), and we juxtapose the methods
of position-based coding and convex splitting to prove the achievability of this
one-shot private information. Some of the mathematical steps in the proof of the
lower bound are similar to those that we used in the previous chapter, in which
we established a lower bound on the one-shot distillable key. Moving on to the
asymptotic setting, we prove that the private capacity of a quantum channel is
equal to its regularized private information. This quantity is difficult to compute in
general, and so we then establish some upper bounds on it in terms of the relative
entropy of entanglement and squashed entanglement.
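Let $\mathcal{N}_{A\to B}$ be a quantum channel connecting a sender Alice to a receiver Bob, and let $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ be an isometric channel extending $\mathcal{N}_{A\to B}$, in the sense that $\mathcal{N}_{A\to B} = \operatorname{Tr}_E\circ\,\mathcal{U}^{\mathcal{N}}_{A\to BE}$. The goal of a private communication protocol is for Alice to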
communicate a classical message to Bob reliably, in the sense that Bob can decode
it with high probability, and such that it is secure from anyone who possesses the
environment system 𝐸 (we personify the environment as the eavesdropper Eve). A
private communication protocol in the one-shot setting is illustrated in Figure [REF].
It is defined by the three elements $(\mathcal{M}, \mathcal{E}_{M'\to A}, \mathcal{D}_{B\to\hat{M}})$, in which $\mathcal{M}$ is a message set, $\mathcal{E}_{M'\to A}$ is an encoding channel, and $\mathcal{D}_{B\to\hat{M}}$ is a decoding channel. The pair $(\mathcal{E}_{M'\to A}, \mathcal{D}_{B\to\hat{M}})$, consisting of the encoding and decoding channels, is called a private communication code or, more simply, a code. The encoding channel is a
UN
𝑝
𝐴→𝐵𝐸 (𝜌 𝑀 𝐴 ), (16.1.5)
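where we have used the fact that the decoding channel is a measurement channel and thus can be written in terms of a POVM $\{\Lambda^{m}_B\}_{m\in\mathcal{M}}$ as
$$ \mathcal{D}_{B\to\hat{M}}(\tau_B) := \sum_{\hat{m}\in\mathcal{M}} \operatorname{Tr}[\Lambda^{\hat{m}}_B\, \tau_B]\, |\hat{m}\rangle\langle\hat{m}|_{\hat{M}}. \tag{16.1.8} $$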
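$$ q(\hat{m}|m) := \operatorname{Tr}[\Lambda^{\hat{m}}_B\, \mathcal{U}^{\mathcal{N}}_{A\to BE}(\rho^{m}_A)] = \operatorname{Tr}[\Lambda^{\hat{m}}_B\, \mathcal{N}_{A\to B}(\rho^{m}_A)], \tag{16.1.10} $$
then we can write the final state of the protocol alternatively as follows:
$$ \omega^{p}_{M\hat{M}E} = \sum_{m,\hat{m}\in\mathcal{M}} p(m)\,q(\hat{m}|m)\,|m\rangle\langle m|_M\otimes|\hat{m}\rangle\langle\hat{m}|_{\hat{M}}\otimes\omega^{m,\hat{m}}_E. \tag{16.1.11} $$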
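\begin{align}
p_{\mathrm{err}}(\mathcal{E}, \mathcal{D}; p, \mathcal{N})
&:= \inf_{\sigma_E}\Big(1 - F\big(\Phi^{p}_{M\hat{M}}\otimes\sigma_E,\ \mathcal{P}_{M'\to\hat{M}E}(\Phi^{p}_{MM'})\big)\Big) \tag{16.1.12}\\
&= \inf_{\sigma_E}\left(1 - \left(\sum_{m\in\mathcal{M}} p(m)\,\sqrt{F}\big(|m\rangle\langle m|_{\hat{M}}\otimes\sigma_E,\ \mathcal{P}_{M'\to\hat{M}E}(|m\rangle\langle m|_{M'})\big)\right)^{2}\right), \tag{16.1.13}
\end{align}
where
$$ \mathcal{P}_{M'\to\hat{M}E} := \mathcal{D}_{B\to\hat{M}}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE}\circ\mathcal{E}_{M'\to A} \tag{16.1.14} $$
and the infimum is taken over every state $\sigma_E$ of the eavesdropper’s system $E$. Also, we employed Proposition 7.31 with $\alpha = \frac{1}{2}$ in the last line above. If the prior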
probability distribution 𝑝(𝑚) is the uniform distribution (i.e., 𝑝(𝑚) = 1/|M|),
then the communication task is called secret-key transmission, because the
goal is for Alice to transmit one share of a secret key to the receiver Bob.
2. An alternative error criterion is the maximal infidelity of the code, defined as
is that there exists a state 𝜎𝐸 of the eavesdropper’s system 𝐸 such that the state of
systems 𝑀ˆ and 𝐸 is close to the product state |𝑚⟩⟨𝑚| 𝑀ˆ ⊗ 𝜎𝐸 , on average. This
means that not only can Bob can decode well, but also, that the state of Eve’s system
is close to the constant state 𝜎𝐸 , such that her system is not useful for figuring out
the message transmitted (on average). Indeed, by applying the data-processing
inequality with respect to partial trace of system 𝐸 and letting 𝜎𝐸 be the state that
achieves 𝑝 err (E, D; 𝑝, N), we conclude that
The second inequality follows from convexity of the square function and the third
from the data-processing inequality for fidelity. The latter expression is the same as
the average error probability from (12.1.13). Now applying the data-processing inequality with respect to partial trace over $\hat{M}$, we conclude that
$$ = 1 - \left(\sum_{m\in\mathcal{M}} p(m)\,\sqrt{F}\big(\sigma_E,\ \mathcal{N}^{c}_{A\to E}(\rho^{m}_A)\big)\right)^{2}, \tag{16.1.27} $$
which indicates that the state of Eve’s system $E$ is close to the constant state $\sigma_E$ on average. In the above, $\mathcal{N}^{c}_{A\to E}$ is a complementary channel of $\mathcal{N}_{A\to B}$, as defined in Section 4.3.2, and is given by $\mathcal{N}^{c}_{A\to E} = \operatorname{Tr}_B\circ\,\mathcal{U}^{\mathcal{N}}_{A\to BE}$. Also, in the last line above, we employed Proposition 7.31 with $\alpha = \frac{1}{2}$.
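The direct-sum property of the root fidelity invoked here (Proposition 7.31 with 𝛼 = 1/2) says that, for classical–quantum states sharing the same classical distribution, the root fidelity is the average of the root fidelities of the conditional states. The following minimal sketch checks this numerically for randomly drawn states; the helper names and the choice of three qubit states are assumptions made only for this illustration.

```python
import numpy as np

def psd_sqrt(rho: np.ndarray) -> np.ndarray:
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(rho)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def root_fidelity(rho: np.ndarray, sigma: np.ndarray) -> float:
    """sqrt(F)(rho, sigma) = trace norm of sqrt(rho) sqrt(sigma)."""
    return float(np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False).sum())

def random_state(dim: int, rng: np.random.Generator) -> np.ndarray:
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def cq_state(probs, states) -> np.ndarray:
    """Block-diagonal matrix sum_m p(m) |m><m| (x) states[m]."""
    n, d = len(probs), states[0].shape[0]
    out = np.zeros((n * d, n * d), dtype=complex)
    for m, (p, s) in enumerate(zip(probs, states)):
        out[m * d:(m + 1) * d, m * d:(m + 1) * d] = p * s
    return out

def direct_sum_gap(seed: int = 0, n: int = 3, dim: int = 2) -> float:
    """|sqrt(F) of the cq states - sum_m p(m) sqrt(F)(rho_m, sigma)|."""
    rng = np.random.default_rng(seed)
    probs = rng.dirichlet(np.ones(n))
    rhos = [random_state(dim, rng) for _ in range(n)]
    sigma = random_state(dim, rng)  # plays the role of the constant state sigma_E
    lhs = root_fidelity(cq_state(probs, rhos), cq_state(probs, [sigma] * n))
    rhs = sum(p * root_fidelity(r, sigma) for p, r in zip(probs, rhos))
    return abs(lhs - rhs)
```

The returned gap should be zero up to floating-point error, which is why the error criteria above can be written either with a single fidelity of classical–quantum states or with a sum over messages.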
The interpretation of the maximum infidelity obeying the constraint
is similar. If this condition holds, then there exists a state 𝜎𝐸 of the eavesdropper’s
system 𝐸 such that the state of systems 𝑀ˆ and 𝐸 is close to the product state
|𝑚⟩⟨𝑚| 𝑀ˆ ⊗ 𝜎𝐸 , for every message 𝑚 ∈ M. So this is a much stronger constraint in
general and the one we aim to achieve for private communication. By applying
the data-processing inequality to 𝑝 ∗err (E, D; N) ≤ 𝜀 and letting 𝜎𝐸 be the state that
achieves 𝑝 ∗err (E, D; N), we conclude by similar reasoning as given above that
and
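$$ \varepsilon \geq p^{*}_{\mathrm{err}}(\mathcal{E}, \mathcal{D}; \mathcal{N}) \geq \max_{m\in\mathcal{M}}\Big(1 - F\big(\sigma_E,\ \mathcal{N}^{c}_{A\to E}(\rho^{m}_A)\big)\Big). \tag{16.1.30} $$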
Thus, if 𝑝 ∗err (E, D; N) ≤ 𝜀 holds, then Bob can reliably decode every message
𝑚 ∈ M, in the sense that
$$ \operatorname{Tr}[\Lambda^{m}_B\, \mathcal{N}_{A\to B}(\rho^{m}_A)] \geq 1 - \varepsilon \quad \forall m\in\mathcal{M}, \tag{16.1.31} $$
and Eve’s system 𝐸 is not useful for determining any of the messages, in the sense
that
𝐹 (𝜎𝐸 , N𝑐𝐴→𝐸 (𝜌 𝑚𝐴 )) ≥ 1 − 𝜀 ∀𝑚 ∈ M. (16.1.32)
These two different infidelity criteria can be used to assess the performance of a
protocol, i.e., how well Bob can decode the message and how secure it is from Eve.
Proposition 16.3
The existence of an (𝑀, 𝜀) quantum communication protocol for a quan-
tum channel N 𝐴→𝐵 implies the existence of an (⌊𝑀/2⌋ , min{1, 2𝜀}) private
communication protocol for N 𝐴→𝐵 .
$$ \sigma_{RSE} := (\mathcal{D}_{B\to S}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE}\circ\mathcal{E}_{S'\to A})(\Phi_{RS'}) \tag{16.1.38} $$
extends the state output from the actual protocol. By Uhlmann’s theorem (The-
orem 6.8), there exists an extension of Φ 𝑅𝑆 such that the fidelity between this
extension and the state 𝜎𝑅𝑆𝐸 is equal to the fidelity in (16.1.37). However, the
maximally entangled state Φ 𝑅𝑆 is unextendible in the sense that the only possible
extension is a tensor-product state Φ 𝑅𝑆 ⊗ 𝜔 𝐸 for some state 𝜔 𝐸 . So, putting these
statements together, we find that
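$$ F\big(\Phi_{RS}\otimes\omega_E,\ (\mathcal{D}_{B\to S}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE}\circ\mathcal{E}_{S'\to A})(\Phi_{RS'})\big) \geq 1 - \varepsilon. \tag{16.1.39} $$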
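$$ F\big(\Phi_{RS}\otimes\omega_{E^n},\ (\overline{\mathcal{D}}_{B\to S}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE}\circ\mathcal{E}_{S'\to A})(\Phi_{RS})\big) \geq 1 - \varepsilon, \tag{16.1.40} $$
where $\overline{\mathcal{D}}_{B\to S}$ denotes the concatenation of the original decoder $\mathcal{D}_{B\to S}$ followed by the local measurement:
\begin{align}
\overline{\mathcal{D}}_{B\to S}(\cdot) &:= \sum_{m} |m\rangle\langle m|\,\mathcal{D}_{B\to S}(\cdot)\,|m\rangle\langle m| \tag{16.1.41}\\
&= \sum_{m} \operatorname{Tr}\big[(\mathcal{D}_{B\to S})^{\dagger}[|m\rangle\langle m|]\,(\cdot)\big]\,|m\rangle\langle m|_S. \tag{16.1.42}
\end{align}
Observe that $\{(\mathcal{D}_{B\to S})^{\dagger}[|m\rangle\langle m|]\}_m$ is a valid POVM. Using the direct-sum property of the fidelity (Proposition 7.31 with $\alpha = \frac{1}{2}$) and defining $\rho^{m}_A := \mathcal{E}_{S'\to A}(|m\rangle\langle m|_{S'})$, we can then rewrite this as
$$ \left(\frac{1}{M}\sum_{m=1}^{M} \sqrt{F}\big(|m\rangle\langle m|_S\otimes\omega_E,\ (\overline{\mathcal{D}}_{B\to S}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE})(\rho^{m}_A)\big)\right)^{2} \geq 1 - \varepsilon. \tag{16.1.43} $$
We can rewrite this as
$$ \frac{1}{M}\sum_{m=1}^{M} \sqrt{F}\big(|m\rangle\langle m|_S\otimes\omega_E,\ (\overline{\mathcal{D}}_{B\to S}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE})(\rho^{m}_A)\big) \geq \sqrt{1-\varepsilon} \tag{16.1.44} $$
and again as
$$ \frac{1}{M}\sum_{m=1}^{M} \Big(1 - \sqrt{F}\big(|m\rangle\langle m|_S\otimes\omega_E,\ (\overline{\mathcal{D}}_{B\to S}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE})(\rho^{m}_A)\big)\Big) \leq 1 - \sqrt{1-\varepsilon}. \tag{16.1.45} $$
Markov’s inequality then guarantees that there exists a subset $\mathcal{M}'$ of the set $\{1,\ldots,M\}$ of size $\lfloor M/2\rfloor$ such that the following condition holds for all $m\in\mathcal{M}'$:
$$ 1 - \sqrt{F}\big(|m\rangle\langle m|_S\otimes\omega_E,\ (\overline{\mathcal{D}}_{B\to S}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE})(\rho^{m}_A)\big) \leq 2\big(1 - \sqrt{1-\varepsilon}\big). \tag{16.1.46} $$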
Thus, we have shown that from an (𝑀, 𝜀) quantum communication protocol, one
can realize an ( ⌊𝑀/2⌋ , 2𝜀) protocol for private communication. ■
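The expurgation step based on Markov's inequality can be illustrated with a short sketch: given per-message error values whose average is at most some value, keeping the ⌊𝑀/2⌋ best messages guarantees that each kept message has error at most twice that average. The function name and the example numbers below are hypothetical and serve only as an illustration; in the proof, the per-message quantity is one minus the root fidelity for message 𝑚.

```python
from typing import List, Sequence

def expurgate(errors: Sequence[float]) -> List[int]:
    """Return the indices of the floor(M/2) messages with the smallest error.

    By Markov's inequality, fewer than M/2 messages can have an error
    strictly larger than twice the average, so every index returned here
    has error at most 2 * mean(errors).
    """
    M = len(errors)
    ranked = sorted(range(M), key=lambda m: errors[m])
    return ranked[: M // 2]

# Hypothetical per-message errors for a code with M = 6 messages.
example_errors = [0.0, 0.1, 0.5, 0.02, 0.3, 0.05]
kept = expurgate(example_errors)
average = sum(example_errors) / len(example_errors)
worst_kept = max(example_errors[m] for m in kept)
```

Here `worst_kept` does not exceed `2 * average`, mirroring how the average criterion (16.1.45) is converted into the per-message criterion (16.1.46) at the cost of half of the messages.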
Proposition 16.3 then implies the following for the one-shot capacities:
Theorem 16.4
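For a quantum channel $\mathcal{N}_{A\to B}$ and $\varepsilon\in(0,1)$, the following inequality relates the one-shot quantum capacity $Q^{\varepsilon/2}(\mathcal{N})$ to the one-shot private capacity $P^{\varepsilon}(\mathcal{N})$:
$$ Q^{\varepsilon/2}(\mathcal{N}) \leq P^{\varepsilon}(\mathcal{N}) + 1. \tag{16.1.51} $$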
which follows from the definition of the one-shot private capacity $P^{\varepsilon}(\mathcal{N})$. We finally use the fact that $\log_2(M/2) = Q^{\varepsilon/2}(\mathcal{N}) - 1$. ■
trace out the system possessed by the eavesdropper Eve. In this case, it is only the
environment 𝐸 of the isometric channel UN 𝐴→𝐵𝐸 that belongs to the eavesdropper,
and tracing it out leads to the original channel N 𝐴→𝐵 .
A bipartite private-state transmission protocol is defined by the triple
Φ 𝑀 ′′ 𝑀 𝑀 ′ B |Φ⟩⟨Φ| 𝑀 ′′ 𝑀 𝑀 ′ , (16.1.54)
where
$$ |\Phi\rangle_{M''MM'} := \frac{1}{\sqrt{|\mathcal{M}|}}\sum_{m\in\mathcal{M}} |m\rangle_{M''}|m\rangle_{M}|m\rangle_{M'}. \tag{16.1.55} $$
She transmits the $M'$ system through the isometric encoding channel $\mathcal{U}^{\mathcal{E}}_{M'\to AA'}$, leading to the state $\mathcal{U}^{\mathcal{E}}_{M'\to AA'}(\Phi_{M''MM'})$. She transmits the $A$ system through the channel $\mathcal{N}_{A\to B}$, leading to the state
Bob finally performs the isometric decoding channel $\mathcal{U}^{\mathcal{D}}_{B\to\hat{M}B'}$. The final state of the protocol is then as follows:
$$ \omega_{M''MA'\hat{M}B'} := (\mathcal{U}^{\mathcal{D}}_{B\to\hat{M}B'}\circ\mathcal{N}_{A\to B}\circ\mathcal{U}^{\mathcal{E}}_{M'\to AA'})(\Phi_{M''MM'}), \tag{16.1.57} $$
$$ p^{b}_{\mathrm{err}}(\mathcal{U}^{\mathcal{E}}, \mathcal{U}^{\mathcal{D}}; \mathcal{N}) := \inf_{\gamma_{M''MA'\hat{M}B'}}\Big(1 - F\big(\gamma_{M''MA'\hat{M}B'},\ \omega_{M''MA'\hat{M}B'}\big)\Big), \tag{16.1.58} $$
where the optimization is with respect to every ideal bipartite private state
𝛾 𝑀 ′′ 𝑀 𝐴′ 𝑀ˆ 𝐵′ , with key system 𝑀 held by Alice, shield systems 𝑀 ′′ 𝐴′ by Alice, key
system 𝑀ˆ by Bob, and shield system 𝐵′ by Bob (see Section 15.1.1).
We now establish the main result of this section, which is the equivalence of
secret-key transmission and bipartite private-state transmission:
Theorem 16.6
Let M be a message set, and let 𝜀 ∈ [0, 1]. Let N 𝐴→𝐵 be a quantum channel.
There exists an (|M| , 𝜀) secret-key transmission protocol for N 𝐴→𝐵 if and
only if there exists an (|M| , 𝜀) bipartite private-state transmission protocol for
N 𝐴→𝐵 .
channel, and let D𝐵→ 𝑀ˆ be the decoding channel. The final state of the protocol is
as follows:
𝜔 𝑀 𝑀ˆ 𝐸 B (D𝐵→ 𝑀ˆ ◦ N 𝐴→𝐵 ◦ E 𝑀 ′ →𝐴 )(Φ 𝑀 𝑀 ′ ) (16.1.59)
and satisfies the inequality
1 − 𝐹 (Φ 𝑀 𝑀ˆ ⊗ 𝜎𝐸 , 𝜔 𝑀 𝑀ˆ 𝐸 ) ≤ 𝜀. (16.1.60)
$$ \omega_{MM''A'\hat{M}B'E} := (\mathcal{U}^{\mathcal{D}}_{B\to\hat{M}B'}\circ\mathcal{U}^{\mathcal{N}}_{A\to BE}\circ\mathcal{U}^{\mathcal{E}}_{M'\to AA'})(\Phi_{M''MM'}). \tag{16.1.61} $$
𝐹 (Φ 𝑀 𝑀ˆ ⊗ 𝜎𝐸 , 𝜔 𝑀 𝑀ˆ 𝐸 ) = 𝐹 (𝛾 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ 𝐸 , 𝜔 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ 𝐸 ). (16.1.62)
Tracing over the 𝐸 system, we conclude from the data-processing inequality for
fidelity that
1 − 𝐹 (𝛾 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ , 𝜔 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ ) ≤ 𝜀, (16.1.63)
where
$$ \omega_{MM''A'\hat{M}B'} = (\mathcal{U}^{\mathcal{D}}_{B\to\hat{M}B'}\circ\mathcal{N}_{A\to B}\circ\mathcal{U}^{\mathcal{E}}_{M'\to AA'})(\Phi_{M''MM'}). \tag{16.1.64} $$
1 − 𝐹 (𝛾 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ , 𝜔 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ ) ≤ 𝜀, (16.1.65)
𝐹 (𝛾 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ , 𝜔 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ ) = 𝐹 (𝛾 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ 𝐸 , 𝜔 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ 𝐸 ). (16.1.66)
Tracing over the systems 𝑀 ′′, 𝐴′, and 𝐵′, the following inequality holds
1 − 𝐹 (𝛾 𝑀 𝑀ˆ 𝐸 , 𝜔 𝑀 𝑀ˆ 𝐸 ) ≤ 𝜀. (16.1.67)
By the definition of an ideal private state (see Definition 15.4) and since the state
𝛾 𝑀 𝑀 ′′ 𝐴′ 𝑀ˆ 𝐵′ is an ideal bipartite private state, it follows that 𝛾 𝑀 𝑀ˆ 𝐸 is an ideal
tripartite key state. Thus, we have proven the second claim. ■
We now establish some general upper bounds on the number of private bits
that can be communicated in an arbitrary private communication protocol. The
results are stated in Proposition 16.7 and Theorems 16.9 and 16.11, and, like the
upper bounds established in previous chapters, they hold independently of the
encoding and decoding channels used in the protocol and depend only on the
given communication channel N. The first upper bound is in terms of the one-shot
private information of the channel, and the others are in terms of the channel’s
𝜀-relative entropy of entanglement and squashed entanglement.
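where the optimization is over every ensemble $\{p(x), \rho^{x}_A\}_{x\in\mathcal{X}}$ and the state $\rho_{XBE}$ is given by
$$ \rho_{XBE} := \sum_{x\in\mathcal{X}} p(x)\,|x\rangle\langle x|_X\otimes\mathcal{U}^{\mathcal{N}}_{A\to BE}(\rho^{x}_A), \tag{16.1.69} $$
with $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ an isometric channel extending $\mathcal{N}_{A\to B}$. The hypothesis testing mutual information $I^{\varepsilon}_H(X;B)_{\rho}$ is defined in (7.11.88) and the smooth max-mutual information $I^{\sqrt{\varepsilon}}_{\max}(X;E)_{\rho}$ in (15.1.59). Therefore,
$$ P^{\varepsilon}(\mathcal{N}) \leq \sup_{\{p(x),\rho^{x}_A\}_{x\in\mathcal{X}}} \Big[ I^{\varepsilon}_H(X;B)_{\rho} - I^{\sqrt{\varepsilon}}_{\max}(X;E)_{\rho} \Big]. \tag{16.1.70} $$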
Proof: The proof has some similarities with the proof of Lemma 15.10. Since
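$p^{*}_{\mathrm{err}}(\mathcal{E}, \mathcal{D}; \mathcal{N}) \geq p_{\mathrm{err}}(\mathcal{E}, \mathcal{D}; p, \mathcal{N})$ for every probability distribution $p(m)$ over the messages, it follows by definition that
$$ P^{\varepsilon}(\mathcal{N}) \leq \sup_{(\mathcal{M},\mathcal{E},\mathcal{D})} \big\{ \log_2|\mathcal{M}| : p_{\mathrm{err}}(\mathcal{E}, \mathcal{D}; p, \mathcal{N}) \leq \varepsilon \big\}, \tag{16.1.71} $$
with $p$ set to the uniform distribution over messages. So we bound the right-hand side instead (note that it is equal to the one-shot secret-key transmission capacity). Let $(\mathcal{M}, \mathcal{E}_{M'\to A}, \mathcal{D}_{B\to\hat{M}})$ be an arbitrary private communication protocol. By the reasoning in (16.1.17)–(16.1.22), it follows that
$$ \frac{1}{|\mathcal{M}|}\sum_{m\in\mathcal{M}} \operatorname{Tr}[\Lambda^{m}_B\, \mathcal{N}_{A\to B}(\rho^{m}_A)] \geq 1 - \varepsilon. \tag{16.1.72} $$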
By the same reasoning given in the proof of Proposition 12.3, we conclude that
Observe that
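\begin{align}
\tau_{MBE} &= \frac{1}{|\mathcal{M}|}\sum_{m\in\mathcal{M}} |m\rangle\langle m|_M\otimes(\mathcal{U}^{\mathcal{N}}_{A\to BE}\circ\mathcal{E}_{M'\to A})(|m\rangle\langle m|_{M'}) \tag{16.1.75}\\
&= (\mathcal{U}^{\mathcal{N}}_{A\to BE}\circ\mathcal{E}_{M'\to A})(\Phi_{MM'}). \tag{16.1.76}
\end{align}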
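\begin{align}
\varepsilon &\geq 1 - F\big(\pi_M\otimes\sigma_E,\ (\widehat{\mathcal{N}}_{A\to E}\circ\mathcal{E}_{M'\to A})(\Phi_{MM'})\big)\\
&= 1 - F(\tau_M\otimes\sigma_E,\ \tau_{ME}),
\end{align}
where the final inequality follows by noting that $\big\{\frac{1}{|\mathcal{M}|}, \rho^{m}_A\big\}_{m\in\mathcal{M}}$ is a particular input ensemble and the one-shot private information in the last line involves an optimization over all input ensembles. ■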
Corollary 16.8
Let N 𝐴→𝐵 be a quantum channel, and let 𝜀 ∈ [0, 1). For all (|M| , 𝜀) private
communication protocols for N, the following bound holds
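$$ \big(1 - \varepsilon - \sqrt{\varepsilon}\big)\log_2|\mathcal{M}| \leq \sup_{\{p(x),\rho^{x}_A\}_{x\in\mathcal{X}}} \big[ I(X;B)_{\rho} - I(X;E)_{\rho} \big] + h_2(\varepsilon) + 2g(\sqrt{\varepsilon}). \tag{16.1.80} $$
Consequently, the following bound holds for the one-shot private capacity of a channel $\mathcal{N}$:
$$ \big(1 - \varepsilon - \sqrt{\varepsilon}\big)P^{\varepsilon}(\mathcal{N}) \leq \sup_{\{p(x),\rho^{x}_A\}_{x\in\mathcal{X}}} \big[ I(X;B)_{\rho} - I(X;E)_{\rho} \big] + h_2(\varepsilon) + 2g(\sqrt{\varepsilon}). \tag{16.1.81} $$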
Proof: Employing the same reasoning that led to (16.1.73) and (16.1.77), consider
that the following bounds hold for a given (|M| , 𝜀) private communication protocol:
where the state 𝜏𝑀 𝐵𝐸 is defined in (16.1.74). Now we apply (7.2.96) and Proposi-
tion 7.70 to conclude that
$$ I^{\varepsilon}_H(M;B)_{\tau} \leq \frac{1}{1-\varepsilon}\big(I(M;B)_{\tau} + h_2(\varepsilon)\big), \tag{16.1.84} $$
where the last inequality follows by optimizing over all input ensembles. ■
≤ 𝐸 𝑅𝜀 (N). (16.1.95)
The second inequality follows from the data-processing inequality for $D^{\varepsilon}_H$ under the action of the isometric decoding channel $\mathcal{U}^{\mathcal{D}}_{B\to\hat{M}B'}$ and where the state $\rho_{M''MA'B}$ is defined as
$$ \rho_{M''MA'B} := (\mathcal{N}_{A\to B}\circ\mathcal{U}^{\mathcal{E}}_{M'\to AA'})(\Phi_{M''MM'}). \tag{16.1.96} $$
The systems 𝑀 ′′ 𝑀 𝐴′ extend the system 𝐴 of the state UE𝑀 ′ →𝐴𝐴′ (Φ 𝑀 ′′ 𝑀 𝑀 ′ ), with
𝐴 being the input to the channel N 𝐴→𝐵 . As such, we can optimize over all such
input states, and then conclude the final inequality above (here, we need to apply
the remark after Definition 10.1 as well). ■
Corollary 16.10
Let N 𝐴→𝐵 be a quantum channel, and let 𝜀 ∈ [0, 1). For all (|M| , 𝜀) private
communication protocols for N, the following bound holds for all 𝛼 > 1:
$$ \log_2|\mathcal{M}| \leq \widetilde{E}_{\alpha}(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right), \tag{16.1.97} $$
Proof: As indicated above, the argument is precisely the same as in the proof of
Proposition 16.9, except that we apply the following bound from Proposition 15.19
instead:
$$ \big(1 - 2\sqrt{\varepsilon}\big)\log_2|\mathcal{M}| \leq E_{\mathrm{sq}}(M''MA';\hat{M}B')_{\omega} + 2g_2(\sqrt{\varepsilon}). \tag{16.1.102} $$
After this step, we apply the data-processing inequality for 𝐸 sq and optimize over
channel input states. ■
Having derived upper bounds on the number of private bits that can be transmitted
in an arbitrary private communication protocol, let us now determine a lower bound.
Here we use the methods of position-based coding and convex splitting to derive
an explicit (|M| , 𝜀) protocol for all 𝜀 ∈ (0, 1).
To derive this lower bound, let us consider a slightly different model of
communication, in which there is a one-input, two-output classical–quantum channel
connecting the sender Alice to the legitimate receiver Bob and the eavesdropper
Eve:
𝑥 → 𝜌 𝑥𝐵𝐸 , (16.1.103)
where 𝑥 ∈ X is the classical input symbol and 𝜌 𝑥𝐵𝐸 is the bipartite quantum state
that appears at the output when 𝑥 is input. Bob has access to the system 𝐵 of the
output and Eve to 𝐸. The channel can alternatively be written as a quantum channel
as follows:
$$ \mathcal{N}_{X\to BE}(\omega) := \sum_{x\in\mathcal{X}} \langle x|_X\,\omega\,|x\rangle_X\ \rho^{x}_{BE}, \tag{16.1.104} $$
where {|𝑥⟩ 𝑋 }𝑥∈X is an orthonormal basis. In this way, a private communication
protocol for N 𝑋→𝐵𝐸 is defined exactly as we did in Section 16.1, with N 𝑋→𝐵𝐸
replacing the isometric channel U 𝐴→𝐵𝐸 therein. Furthermore, the notions of
code infidelity, an (|M| , 𝜀) private communication protocol, and one-shot private
capacity are defined in the same way, but with N 𝑋→𝐵𝐸 replacing the isometric
channel U 𝐴→𝐵𝐸 .
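To make the quantum-channel form in (16.1.104) concrete, the following minimal sketch applies a classical–quantum wiretap channel to an input operator by reading off its diagonal in the basis {|𝑥⟩} and mixing the output states accordingly. The function name `cq_wiretap_channel` and the example output states are assumptions introduced only for this illustration.

```python
import numpy as np

def cq_wiretap_channel(omega, outputs):
    """Apply N_{X->BE}(omega) = sum_x <x|omega|x> rho^x_{BE}.

    omega   : (|X|, |X|) input density matrix on the system X
    outputs : sequence of |X| density matrices rho^x_{BE} on the systems BE
    Only the diagonal of omega matters, because the channel first
    measures X in the orthonormal basis {|x>}.
    """
    omega = np.asarray(omega)
    return sum(omega[x, x].real * np.asarray(rho_x) for x, rho_x in enumerate(outputs))

# Hypothetical two-symbol example with two-qubit outputs on BE.
rho_0 = np.diag([1.0, 0.0, 0.0, 0.0])   # output state for x = 0
rho_1 = np.eye(4) / 4                   # output state for x = 1
omega_in = np.array([[0.25, 0.10],
                     [0.10, 0.75]])     # coherences are discarded by the channel
out_state = cq_wiretap_channel(omega_in, [rho_0, rho_1])
```

In this example `out_state` equals `0.25 * rho_0 + 0.75 * rho_1`, so the off-diagonal entries of the input play no role, as expected for a channel with a classical input.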
The main result of this section is the following lower bound on the one-shot
private capacity 𝑃 𝜀 (N) of a classical–quantum wiretap channel N 𝑋→𝐵𝐸 :
𝜀 ′ −𝛿−𝜂 𝛿−𝜁
𝑃 𝜀 (N) ≥ 𝐼 𝐻 (𝑋; 𝐵) 𝜌 − 𝐼 max (𝐸; 𝑋) 𝜌
8 (𝜀′ − 𝛿)
2
− log2 − log2 , (16.1.105)
𝜂2 𝜁2
where $\varepsilon' = 1 - \sqrt{1 - \varepsilon/2}$, $\delta\in(0,\varepsilon')$, $\eta\in(0,\varepsilon'-\delta)$, and $\zeta\in(0,\delta)$, and the
information measures are evaluated with respect to the state
∑︁
𝜌 𝑋 𝐵𝐸 B 𝑝(𝑥)|𝑥⟩⟨𝑥| 𝑋 ⊗ 𝜌 𝑥𝐵𝐸 . (16.1.106)
𝑥∈X
In the above, 𝑃 𝜀 (N) represents the maximum number of bits that can be sent from
Alice to Bob, using a classical–quantum wiretap channel once, such that the infidelity
does not exceed 𝜀 ∈ (0, 1). The quantities on the right-hand side of the inequality
in (16.1.105) are particular one-shot generalizations of the Holevo information to
Bob and Eve, which are defined in (7.11.88) and (15.1.211), respectively.
To prove the one-shot bound in (16.1.105), we employ position-based coding
(Section 11.1.3) and convex splitting (Section 15.1.4). The main idea of position-
based coding is conceptually simple and we review it briefly here. To communicate
a classical message from Alice to Bob, we allow them to share a quantum state $\rho_{RA}^{\otimes M}$ before communication begins, where $M$ is the number of messages, Bob possesses
the 𝑅 systems, and Alice the 𝐴 systems. If Alice wishes to communicate message
𝑚, then she sends the 𝑚th 𝐴 system through the channel. The reduced state of
Bob’s systems is then
The convex-split lemma (Lemma 15.22) guarantees that as long as log2 𝑀 is roughly
equal to the smooth max-mutual information in (15.1.211), then the state above is
nearly indistinguishable from the product state 𝜌 𝑅⊗𝑀 ⊗ 𝜌 𝐵 .
Here we use the approaches of position-based coding and convex splitting in
conjunction to construct codes for the classical–quantum wiretap channel. The
main underlying idea is to have a message variable 𝑚 ∈ {1, . . . , 𝑀 } and a local
randomness variable 𝑟 ∈ {1, . . . , 𝑅}, the latter of which is selected uniformly at
random and used to confuse the eavesdropper Eve. Before communication begins,
Alice, Bob, and Eve are allowed to share $MR$ copies of the common randomness
state
\[ \rho_{XX'X''} := \sum_{x\in\mathcal{X}} p_X(x)|xxx\rangle\langle xxx|_{XX'X''}. \tag{16.1.109} \]
Theorem 16.12
Let $\mathcal{N}_{X\to BE} : x \to \rho^x_{BE}$ be a classical–quantum wiretap channel, in which Alice
has access to the input, Bob to the output system $B$, and Eve to the output
system $E$. For all $\varepsilon\in(0,1]$, $\varepsilon' = 1-\sqrt{1-\varepsilon/2}$, $\delta\in(0,\varepsilon')$, $\eta\in(0,\varepsilon'-\delta)$,
and $\zeta\in(0,\delta)$, there exists an $(|\mathcal{M}|,\varepsilon)$ private communication protocol for
$\mathcal{N}_{X\to BE}$, such that
\[ \log_2|\mathcal{M}| = I_H^{\varepsilon'-\delta-\eta}(X;B)_\rho - I_{\max}^{\delta-\zeta}(E;X)_\rho - \log_2\!\left(\frac{8(\varepsilon'-\delta)}{\eta^2}\right) - \log_2\!\left(\frac{2}{\zeta^2}\right), \]
where the hypothesis testing mutual information $I_H^{\varepsilon'-\delta-\eta}(X;B)_\rho$ is defined in
(7.11.88) and the smooth max-mutual information $I_{\max}^{\delta-\zeta}(E;X)_\rho$ in (15.1.211),
and they are evaluated with respect to the state $\rho_{XBE}$ in (16.1.106).
Proof: We first exhibit a public shared randomness assisted protocol for private
communication and then show later how to derandomize it. The protocol proceeds
exactly as discussed above. We suppose that Alice, Bob, and Eve share the state
$\rho_{XX'X''}^{\otimes MR}$ before communication begins, where $M = |\mathcal{M}|$. If Alice wants to send
the message $m$, she picks $r$ uniformly at random from $\{1,\ldots,R\}$ and transmits a
classical copy $X'''$ of the $X$ system labeled by $(m,r)$ through the channel $\mathcal{N}_{X\to BE}$.
The resulting state of Alice, Bob, and Eve, for fixed $m$ and $r$, is then as follows:
\[ \rho^{m,r}_{X^{MR}X'^{MR}X''^{MR}BE} := \rho_{X_{1,1}X'_{1,1}X''_{1,1}}\otimes\cdots\otimes\rho_{X_{m,r-1}X'_{m,r-1}X''_{m,r-1}}\otimes\rho_{X_{m,r}X'_{m,r}X''_{m,r}BE}\otimes\rho_{X_{m,r+1}X'_{m,r+1}X''_{m,r+1}}\otimes\cdots\otimes\rho_{X_{M,R}X'_{M,R}X''_{M,R}}, \tag{16.1.110} \]
where
\begin{align}
\rho_{X_{1,1}X'_{1,1}X''_{1,1}} = \cdots &= \rho_{X_{m,r-1}X'_{m,r-1}X''_{m,r-1}} \tag{16.1.111}\\
&= \rho_{X_{m,r+1}X'_{m,r+1}X''_{m,r+1}} = \cdots = \rho_{X_{M,R}X'_{M,R}X''_{M,R}} \tag{16.1.112}\\
&= \sum_{x\in\mathcal{X}} p_X(x)|xxx\rangle\langle xxx|_{XX'X''}, \tag{16.1.113}
\end{align}
and
\begin{align}
\rho_{X_{m,r}X'_{m,r}X''_{m,r}BE} &= \sum_{x\in\mathcal{X}} p_X(x)|xxx\rangle\langle xxx|_{XX'X''}\otimes\mathcal{N}_{X'''\to BE}(|x\rangle\langle x|_{X'''}) \tag{16.1.114}\\
&= \sum_{x\in\mathcal{X}} p_X(x)|xxx\rangle\langle xxx|_{XX'X''}\otimes\rho^x_{BE}. \tag{16.1.115}
\end{align}
At this point, the state here is precisely the same as that given in (15.1.200),
and the goal from here is the same as well. Thus, we can apply the same reasoning
given there to conclude that the following infidelity condition holds
if
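\[ \log_2|\mathcal{M}| = I_H^{\varepsilon'-\delta-\eta}(X;B)_\tau - I_{\max}^{\delta-\zeta}(E;X)_\tau - \log_2\!\left(\frac{4(\varepsilon'-\delta)}{\eta^2}\right) - \log_2\!\left(\frac{2}{\zeta^2}\right), \tag{16.1.117} \]
where
\[ \tau_{XBE} := \sum_{x\in\mathcal{X}} p(x)|x\rangle\langle x|_X\otimes\rho^x_{BE} \tag{16.1.118} \]
and $\rho_{M_AX'^{MR}X''^{MR}BE}$ is the reduction of the following state:
\[ \rho_{M_AR_AX^{MR}X'^{MR}X''^{MR}BE} := \frac{1}{MR}\sum_{m=1}^{M}\sum_{r=1}^{R}|m\rangle\langle m|_{M_A}\otimes|r\rangle\langle r|_{R_A}\otimes\rho^{m,r}_{X^{MR}X'^{MR}X''^{MR}BE}. \tag{16.1.119} \]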
That is,
\[ P(\rho_E,\tilde{\rho}_E) \leq \delta-\zeta. \tag{16.1.121} \]
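\[ \sum_{x_{1,1},\ldots,x_{M,R}} p(x_{1,1})\cdots p(x_{M,R})\,|x_{1,1},\ldots,x_{M,R}\rangle\langle x_{1,1},\ldots,x_{M,R}|_{X'^{MR}}\otimes|x_{1,1},\ldots,x_{M,R}\rangle\langle x_{1,1},\ldots,x_{M,R}|_{X''^{MR}}\otimes\frac{1}{R}\sum_{r=1}^{R}\rho^{x_{m,r}}_{BE}. \tag{16.1.123} \]
\[ \mathcal{M}_{X'^{MR}B\to M_B}(\rho_{M_AX'^{MR}X''^{MR}BE}) = \frac{1}{M}\sum_{m=1}^{M}\sum_{x_{1,1},\ldots,x_{M,R}} p(x_{1,1})\cdots p(x_{M,R})\,|m\rangle\langle m|_{M_A}\otimes|x_{1,1},\ldots,x_{M,R}\rangle\langle x_{1,1},\ldots,x_{M,R}|_{X''^{MR}}\otimes\mathcal{M}^{x_{1,1},\ldots,x_{M,R}}_{B\to M_B}\!\left(\frac{1}{R}\sum_{r=1}^{R}\rho^{x_{m,r}}_{BE}\right), \tag{16.1.124} \]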
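\begin{align}
1-\varepsilon &\leq F\big(\mathcal{M}_{X'^{MR}B\to M_B}(\rho_{M_AX'^{MR}X''^{MR}BE}),\ \Phi_{M_AM_B}\otimes\rho_{X''^{MR}}\otimes\tilde{\rho}_E\big) \tag{16.1.125}\\
&= \left[\frac{1}{M}\sum_{m=1}^{M}\sum_{x_{1,1},\ldots,x_{M,R}} p(x_{1,1})\cdots p(x_{M,R})\,\sqrt{F}\!\left(\mathcal{M}^{x_{1,1},\ldots,x_{M,R}}_{B\to M_B}\!\left(\frac{1}{R}\sum_{r=1}^{R}\rho^{x_{m,r}}_{BE}\right),\ |m\rangle\langle m|_{M_B}\otimes\tilde{\rho}_E\right)\right]^2, \tag{16.1.126}
\end{align}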
We can now exploit the “Shannon trick” of exchanging the sum over the messages
𝑚 and the sum over the codewords to rewrite this inequality as
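\[ \sum_{x_{1,1},\ldots,x_{M,R}} p(x_{1,1})\cdots p(x_{M,R})\left[\frac{1}{M}\sum_{m=1}^{M}\sqrt{F}\!\left(\mathcal{M}^{x_{1,1},\ldots,x_{M,R}}_{B\to M_B}\!\left(\frac{1}{R}\sum_{r=1}^{R}\rho^{x_{m,r}}_{BE}\right),\ |m\rangle\langle m|_{M_B}\otimes\tilde{\rho}_E\right)\right] \geq \sqrt{1-\varepsilon}. \tag{16.1.128} \]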
Since the average does not exceed the maximum, we conclude that there exists
some choice of codewords 𝑥1,1 , . . . , 𝑥 𝑀,𝑅 such that the following inequality holds
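\[ \frac{1}{M}\sum_{m=1}^{M}\sqrt{F}\!\left(\mathcal{M}^{x_{1,1},\ldots,x_{M,R}}_{B\to M_B}\!\left(\frac{1}{R}\sum_{r=1}^{R}\rho^{x_{m,r}}_{BE}\right),\ |m\rangle\langle m|_{M_B}\otimes\tilde{\rho}_E\right) \geq \sqrt{1-\varepsilon}. \tag{16.1.129} \]
Let us then use the shorthand $\mathcal{M}_{B\to M_B}\equiv\mathcal{M}^{x_{1,1},\ldots,x_{M,R}}_{B\to M_B}$, so that we can rewrite the
above as
\[ \frac{1}{M}\sum_{m=1}^{M}\sqrt{F}\!\left(\mathcal{M}_{B\to M_B}\!\left(\frac{1}{R}\sum_{r=1}^{R}\rho^{x_{m,r}}_{BE}\right),\ |m\rangle\langle m|_{M_B}\otimes\tilde{\rho}_E\right) \geq \sqrt{1-\varepsilon}. \tag{16.1.130} \]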
Now applying Markov’s inequality, we conclude that at least half of the messages
are such that the following inequality holds
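\[ 1-F\!\left(\mathcal{M}_{B\to M_B}\!\left(\frac{1}{R}\sum_{r=1}^{R}\rho^{x_{m,r}}_{BE}\right),\ |m\rangle\langle m|_{M_B}\otimes\tilde{\rho}_E\right) \leq 2\varepsilon. \tag{16.1.133} \]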
Thus, these messages and the corresponding codewords are retained as the final code.
To be clear, suppose without loss of generality, that messages 1, . . . , ⌊𝑀/2⌋ are
retained and messages ⌊𝑀/2⌋ + 1, . . . , 𝑀 are expurgated. Then this means that the
corresponding codewords retained are 𝑥1,1 , . . . , 𝑥1,𝑅 , 𝑥2,1 , . . . , 𝑥 2,𝑅 , . . . , 𝑥 ⌊𝑀/2⌋,1 ,
. . . , 𝑥 ⌊𝑀/2⌋,𝑅 , and the ones discarded are 𝑥 ⌊𝑀/2⌋+1,1 , . . . , 𝑥 ⌊𝑀/2⌋+1,𝑅 , 𝑥 ⌊𝑀/2⌋+2,1 , . . . ,
𝑥 ⌊𝑀/2⌋+2,𝑅 , . . . , 𝑥 𝑀,1 , . . . , 𝑥 𝑀,𝑅 . After the expurgation, the rate of the code is given
by
\[ x \to \rho^x_A \to \mathcal{U}^{\mathcal{N}}_{A\to BE}(\rho^x_A). \tag{16.1.137} \]
That is, based on the value of a letter 𝑥, Alice inputs the state 𝜌 𝑥𝐴 into the isometric
channel UN 𝐴→𝐵𝐸 . Optimizing over all such preprocessings and applying Theo-
rem 16.12, we arrive at the following lower bound on the one-shot private capacity
of a quantum channel N 𝐴→𝐵 (according to the definition given in Section 16.1):
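Corollary 16.13
Let $\mathcal{N}_{A\to B}$ be a quantum channel that is extended by the isometric channel
$\mathcal{U}^{\mathcal{N}}_{A\to BE}$. For all $\varepsilon\in(0,1]$, $\varepsilon' = 1-\sqrt{1-\varepsilon/2}$, $\delta\in(0,\varepsilon')$, $\eta\in(0,\varepsilon'-\delta)$, and
$\zeta\in(0,\delta)$, there exists an $(|\mathcal{M}|,\varepsilon)$ private communication protocol for $\mathcal{N}_{A\to B}$,
such that
\[ \log_2|\mathcal{M}| = \sup_{\{p(x),\rho^x_A\}_{x\in\mathcal{X}}}\left[I_H^{\varepsilon'-\delta-\eta}(X;B)_\rho - I_{\max}^{\delta-\zeta}(E;X)_\rho\right] - \log_2\!\left(\frac{8(\varepsilon'-\delta)}{\eta^2}\right) - \log_2\!\left(\frac{2}{\zeta^2}\right), \]
where the hypothesis testing mutual information $I_H^{\varepsilon'-\delta-\eta}(X;B)_\rho$ is defined in
(7.11.88) and the smooth max-mutual information $I_{\max}^{\delta-\zeta}(E;X)_\rho$ in (15.1.211),
and the information quantities are evaluated with respect to the following state:
\[ \rho_{XBE} := \sum_{x\in\mathcal{X}} p(x)|x\rangle\langle x|_X\otimes\mathcal{U}^{\mathcal{N}}_{A\to BE}(\rho^x_A). \tag{16.1.138} \]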
Now applying Propositions 7.72 and 7.64, we conclude the following bound:
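Corollary 16.14
Let $\mathcal{N}_{A\to B}$ be a quantum channel that is extended by the isometric channel
$\mathcal{U}^{\mathcal{N}}_{A\to BE}$. For all $\varepsilon\in(0,1]$, $\varepsilon' = 1-\sqrt{1-\varepsilon/2}$, $\delta\in(0,\varepsilon')$, $\eta\in(0,\varepsilon'-\delta)$,
$\zeta\in(0,\delta)$, $\nu\in(0,\delta-\zeta)$, $\alpha\in(0,1)$, and $\beta>1$, there exists an $(|\mathcal{M}|,\varepsilon)$
private communication protocol for $\mathcal{N}_{A\to B}$, such that
\[ \log_2|\mathcal{M}| \geq \sup_{\{p(x),\rho^x_A\}_{x\in\mathcal{X}}}\left[I_\alpha(X;B)_\rho - \widetilde{I}'_\beta(X;E)_\rho\right] - f(\varepsilon',\delta,\eta,\nu,\zeta,\alpha,\beta), \tag{16.1.139} \]
where the Petz–Rényi mutual information $I_\alpha(X;B)_\rho$ is defined in (11.1.136),
the sandwiched Rényi mutual information $\widetilde{I}'_\beta(X;E)_\rho$ as
\[ \widetilde{I}'_\beta(X;E)_\rho := \widetilde{D}_\beta(\rho_{XE}\,\|\,\rho_X\otimes\rho_E), \tag{16.1.140} \]
and the information quantities are evaluated with respect to the following state:
\[ \rho_{XBE} := \sum_{x\in\mathcal{X}} p(x)|x\rangle\langle x|_X\otimes\mathcal{U}^{\mathcal{N}}_{A\to BE}(\rho^x_A). \tag{16.1.141} \]
Furthermore,
\begin{align}
f(\varepsilon',\delta,\eta,\nu,\zeta,\alpha,\beta) := {}& \frac{\alpha}{1-\alpha}\log_2\!\left(\frac{1}{\varepsilon'-\delta-\eta}\right) + \log_2\!\left(\frac{8}{\nu^2}\right) \notag\\
& + \frac{1}{\beta-1}\log_2\!\left(\frac{1}{(\delta-\zeta-\nu)^2}\right) + \log_2\!\left(\frac{1}{1-(\delta-\zeta-\nu)^2}\right) \notag\\
& + \log_2\!\left(\frac{8(\varepsilon'-\delta)}{\eta^2}\right) + \log_2\!\left(\frac{2}{\zeta^2}\right). \tag{16.1.142}
\end{align}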
Proof: The reasoning here is precisely the same as that given in the proof of
Corollary 15.24. The only difference is that we optimize over every ensemble
$\{p(x),\rho^x_A\}_{x\in\mathcal{X}}$. ■
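The correction term $f$ in (16.1.142) can be evaluated numerically. The sketch below is a direct transcription under the assumption that the six terms read as in that equation; the parameter values are illustrative only and were chosen to satisfy the stated ranges.

```python
import math

def f(eps_p, delta, eta, nu, zeta, alpha, beta):
    """Correction term of (16.1.142), in bits (assumed term-by-term form)."""
    s = delta - zeta - nu  # smoothing parameter left for the max-relative entropy
    return (
        alpha / (1 - alpha) * math.log2(1 / (eps_p - delta - eta))
        + math.log2(8 / nu**2)
        + 1 / (beta - 1) * math.log2(1 / s**2)
        + math.log2(1 / (1 - s**2))
        + math.log2(8 * (eps_p - delta) / eta**2)
        + math.log2(2 / zeta**2)
    )

# Illustrative values with delta < eps', eta < eps' - delta, zeta < delta, nu < delta - zeta.
params = dict(eps_p=0.2, delta=0.1, eta=0.05, nu=0.025, zeta=0.05)
moderate = f(alpha=0.9, beta=1.1, **params)
aggressive = f(alpha=0.99, beta=1.01, **params)
```

The prefactors $\alpha/(1-\alpha)$ and $1/(\beta-1)$ diverge as $\alpha\to 1$ and $\beta\to 1$, so pushing the Rényi parameters toward the von Neumann limit makes the correction term larger.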
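\[ p^*_{\mathrm{err}}(\mathcal{E},\mathcal{D};\mathcal{N}^{\otimes n}) = \inf_{\sigma_{E^n}}\max_{m\in\mathcal{M}}\Big(1-F\big(|m\rangle\langle m|_{\hat{M}}\otimes\sigma_{E^n},\ (\mathcal{D}_{B^n\to\hat{M}}\circ(\mathcal{U}^{\mathcal{N}}_{A\to BE})^{\otimes n}\circ\mathcal{E}_{M'\to A^n})(|m\rangle\langle m|_{M'})\big)\Big), \tag{16.2.2} \]
where the infimum is with respect to every state $\sigma_{E^n}$ of the eavesdropper’s system 𝐸.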
where, in the second equality, we use the definition of the one-shot private capacity
𝑃 𝜀 given in (16.1.35), and the supremum is over every message set M, encoding
channel E with input dimension |M|, and decoding channel D with output dimension
|M|.
We now provide several definitions related to private capacity and its associated
concepts.
\[ \widetilde{P}(\mathcal{N}) := \inf\{R : R \text{ is a strong converse rate for } \mathcal{N}\}. \tag{16.2.7} \]
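\[ P(\mathcal{N}) = I^p_{\mathrm{reg}}(\mathcal{N}) := \lim_{n\to\infty}\frac{1}{n}I^p(\mathcal{N}^{\otimes n}), \tag{16.2.10} \]
and the information quantities are evaluated with respect to the state
\[ \rho_{XBE} := \sum_{x\in\mathcal{X}} p(x)|x\rangle\langle x|_X\otimes\mathcal{U}^{\mathcal{N}}_{A\to BE}(\rho^x_A), \tag{16.2.12} \]
with $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ an isometric channel extending $\mathcal{N}_{A\to B}$.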
Observe that the expression in (16.2.10) for the private capacity involves a
regularization of the private information. Thus, in general, it is difficult to compute
because the optimization is over an arbitrarily large number of channel uses.
By following an argument similar to that given in Section 14.2.3, it follows that
the private information is always superadditive, meaning that 𝐼 𝑝 (N ⊗𝑛 ) ≥ 𝑛𝐼 𝑝 (N)
for every 𝑛 ∈ N and channel N. This means that the private information is always a
lower bound on the private capacity of a channel N:
𝑃(N) ≥ 𝐼 𝑝 (N) for every channel N. (16.2.13)
If the private information happens to be additive for a particular channel, then the
regularization in (16.2.10) is not required. For example, the private information
to arrive at the conclusion that the private capacity of a quantum channel is not
smaller than its quantum capacity.
Theorem 16.22
For a quantum channel N 𝐴→𝐵 , its private information is not smaller than its
coherent information:
𝐼 𝑐 (N) ≤ 𝐼 𝑝 (N), (16.2.15)
where the coherent information is defined in (7.11.107) and the private infor-
mation in (16.2.11). As a consequence, the private capacity is not smaller than
the quantum capacity:
𝑄(N) ≤ 𝑃(N). (16.2.16)
Proof: Picking a pure-state ensemble in (16.2.11), i.e., {𝑝(𝑥), 𝜓 𝑥𝐴 }𝑥∈X , and setting
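\[ \rho_{XBE} := \sum_{x\in\mathcal{X}} p(x)|x\rangle\langle x|_X\otimes\mathcal{U}^{\mathcal{N}}_{A\to BE}(\psi^x_A), \tag{16.2.17} \]
with $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ an isometric channel extending $\mathcal{N}_{A\to B}$, we find that
\begin{align}
I^p(\mathcal{N}) &\geq I(X;B)_\rho - I(X;E)_\rho \tag{16.2.18}\\
&= H(B)_\rho - H(B|X)_\rho - \big(H(E)_\rho - H(E|X)_\rho\big) \tag{16.2.19}\\
&= H(B)_\rho - H(E)_\rho. \tag{16.2.20}
\end{align}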
The first equality follows from rewriting the mutual information, and the second
follows because the conditional entropies can be written as
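\begin{align}
H(B|X)_\rho &= \sum_{x\in\mathcal{X}} p(x)H(\mathrm{Tr}_E[\mathcal{U}^{\mathcal{N}}_{A\to BE}(\psi^x_A)]), \tag{16.2.21}\\
H(E|X)_\rho &= \sum_{x\in\mathcal{X}} p(x)H(\mathrm{Tr}_B[\mathcal{U}^{\mathcal{N}}_{A\to BE}(\psi^x_A)]). \tag{16.2.22}
\end{align}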
They are equal because the entropies of the marginal states of a pure bipartite state
are equal. Now consider that the reduced state of the 𝐵𝐸 systems is
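\[ \rho_{BE} = \sum_{x\in\mathcal{X}} p(x)\,\mathcal{U}^{\mathcal{N}}_{A\to BE}(\psi^x_A) = \mathcal{U}^{\mathcal{N}}_{A\to BE}\!\left(\sum_{x\in\mathcal{X}} p(x)\psi^x_A\right). \tag{16.2.23} \]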
Since we can realize an arbitrary input density operator by taking convex com-
binations of pure states, and by applying (7.11.113), we conclude the claim in
(16.2.15).
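The key step in the proof, that the two marginals of a pure bipartite state have equal entropy, can be checked numerically in the smallest case. The sketch below handles two qubits only, using the closed-form eigenvalues of a 2×2 density matrix; the amplitude matrices are illustrative choices and the function names are not from the text.

```python
import math

def reduced_states(c):
    """Marginals of |psi> = sum_{a,b} c[a][b] |a>|b> on two qubits.

    Returns (rho_A, rho_B) as 2x2 nested lists.
    """
    rho_a = [[sum(c[i][k] * c[j][k].conjugate() for k in range(2))
              for j in range(2)] for i in range(2)]
    rho_b = [[sum(c[k][i] * c[k][j].conjugate() for k in range(2))
              for j in range(2)] for i in range(2)]
    return rho_a, rho_b

def qubit_entropy(rho):
    """Von Neumann entropy (in bits) of a 2x2 density matrix."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    eigenvalues = (tr / 2 + disc, tr / 2 - disc)
    return -sum(p * math.log2(p) for p in eigenvalues if p > 1e-15)

# Partially entangled state sqrt(0.8)|00> + sqrt(0.2)|11>.
c1 = [[math.sqrt(0.8), 0.0], [0.0, math.sqrt(0.2)]]
# A state with complex amplitudes whose marginals are maximally mixed.
c2 = [[0.5, 0.5j], [0.5, -0.5j]]

entropies = []
for c in (c1, c2):
    rho_a, rho_b = reduced_states(c)
    entropies.append((qubit_entropy(rho_a), qubit_entropy(rho_b)))
```

In the proof, the pure state is $\mathcal{U}^{\mathcal{N}}_{A\to BE}(\psi^x_A)$ with marginals on $B$ and $E$, which is why $H(B|X)_\rho$ and $H(E|X)_\rho$ cancel.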
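\[ \frac{\log_2|\mathcal{M}|}{n} \geq \sup_{\{p(x),\rho^x_A\}_{x\in\mathcal{X}}}\left[I_\alpha(X;B)_\rho - \widetilde{I}'_\beta(X;E)_\rho\right] - \frac{1}{n}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right), \tag{16.2.24} \]
where the information quantities are evaluated with respect to the state
\[ \rho_{XBE} := \sum_{x\in\mathcal{X}} p(x)|x\rangle\langle x|_X\otimes\mathcal{U}^{\mathcal{N}}_{A\to BE}(\rho^x_A), \tag{16.2.25} \]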
Proof: We evaluate the quantities in Corollary 16.14 with respect to the tensor-
power isometric channel $(\mathcal{U}^{\mathcal{N}}_{A\to BE})^{\otimes n}$ and choose the input ensemble to be a
tensor-power ensemble $\{p(x_1)\cdots p(x_n),\ \rho^{x_1}_{A_1}\otimes\cdots\otimes\rho^{x_n}_{A_n}\}_{x_1,\ldots,x_n\in\mathcal{X}^{\times n}}$. This implies
that the state being evaluated for the Rényi information quantities is the tensor-power
state $\rho_{XBE}^{\otimes n}$, where $\rho_{XBE}$ is defined in (16.2.25). Let $\delta = \frac{\varepsilon'}{2}$, $\eta = \frac{\varepsilon'}{4}$, $\nu = \frac{\varepsilon'}{4}$, and
$\zeta = \frac{\varepsilon'}{2}$. Exploiting the additivity of $I_\alpha$ and $\widetilde{I}'_\beta$, substituting into the inequality in
\[ -\frac{1}{n}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right). \tag{16.2.26} \]
This concludes the proof. ■
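Proof: Let $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ be an isometric channel extending the channel $\mathcal{N}_{A\to B}$ of interest.
Fix $\varepsilon\in(0,1]$ and $\delta>0$. Let $\delta_1,\delta_2>0$ be such that $\delta=\delta_1+\delta_2$. Set $\alpha\in(0,1)$
and $\beta>1$ such that
\[ \delta_1 \geq I(X;B)_\rho - I(X;E)_\rho - \big(I_\alpha(X;B)_\rho - \widetilde{I}'_\beta(X;E)_\rho\big), \tag{16.2.28} \]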
where the information quantities are evaluated with respect to the state 𝜌 𝑋 𝐵𝐸 in
(16.2.25). Note that this is possible because 𝐼 𝛼 (𝑋; 𝐵) 𝜌 increases monotonically with
increasing 𝛼 ∈ (0, 1) (see Proposition 7.23) and e 𝐼 𝛽′ (𝑋; 𝐸) 𝜌 decreases monotonically
with decreasing 𝛽 (see Proposition 7.31), so that
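\begin{align}
\lim_{\alpha\to 1^-} I_\alpha(X;B)_\rho &= \sup_{\alpha\in(0,1)} I_\alpha(X;B)_\rho, \tag{16.2.29}\\
\lim_{\beta\to 1^+} \widetilde{I}'_\beta(X;E)_\rho &= \inf_{\beta\in(1,\infty)} \widetilde{I}'_\beta(X;E)_\rho. \tag{16.2.30}
\end{align}
Also,
\begin{align}
I(X;B)_\rho &= \lim_{\alpha\to 1^-} I_\alpha(X;B)_\rho, \tag{16.2.31}\\
I(X;E)_\rho &= \lim_{\beta\to 1^+} \widetilde{I}'_\beta(X;E)_\rho. \tag{16.2.32}
\end{align}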
With 𝛼 and 𝛽 chosen such that (16.2.28) holds, take 𝑛 large enough so that
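\[ \delta_2 \geq \frac{1}{n}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right). \tag{16.2.33} \]
Now, we use the fact that for the $n$ and $\varepsilon$ chosen above, there exists an $(n,|\mathcal{M}|,\varepsilon)$
protocol such that
\[ \frac{\log_2|\mathcal{M}|}{n} \geq I_\alpha(X;B)_\rho - \widetilde{I}'_\beta(X;E)_\rho - \frac{1}{n}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right), \tag{16.2.34} \]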
which follows from Corollary 16.23 above. Rearranging the right-hand side of this
inequality, and using (16.2.28), (16.2.33), and (16.2.34), we find that
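\begin{align}
\frac{\log_2|\mathcal{M}|}{n} &\geq I(X;B)_\rho - I(X;E)_\rho \notag\\
&\quad - \left(I(X;B)_\rho - I(X;E)_\rho - \big(I_\alpha(X;B)_\rho - \widetilde{I}'_\beta(X;E)_\rho\big) + \frac{1}{n}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right)\right) \tag{16.2.35}\\
&\geq I(X;B)_\rho - I(X;E)_\rho - (\delta_1+\delta_2) \tag{16.2.36}\\
&= I(X;B)_\rho - I(X;E)_\rho - \delta. \tag{16.2.37}
\end{align}
We thus have shown that there exists an $(n,|\mathcal{M}|,\varepsilon)$ private communication protocol
with rate $\frac{\log_2|\mathcal{M}|}{n} \geq I(X;B)_\rho - I(X;E)_\rho - \delta$. Therefore, there exists an $(n, 2^{n(R-\delta)}, \varepsilon)$
private communication protocol with $R = I(X;B)_\rho - I(X;E)_\rho$ for all sufficiently
large $n$ such that (16.2.33) holds. Since $\varepsilon$ and $\delta$ are arbitrary, we conclude that for
all $\varepsilon\in(0,1]$, $\delta>0$, and sufficiently large $n$, there exists an $(n, 2^{n(R-\delta)}, \varepsilon)$ private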
communication protocol. This means that, by definition, 𝐼 (𝑋; 𝐵) 𝜌 − 𝐼 (𝑋; 𝐸) 𝜌 is
an achievable rate. Since this is true for all input ensembles, we can finally take a
supremum over all input ensembles to arrive at the conclusion in (16.2.27). ■
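The role of $\delta_2$ in the argument above is simply to fix how many channel uses are needed before the finite-size correction becomes negligible. As a minimal sketch, assuming the value of $f$ has already been computed for the chosen $\alpha$ and $\beta$ (the number used below is made up), the smallest admissible $n$ in (16.2.33) is:

```python
import math

def min_block_length(f_value, delta2):
    """Smallest n >= 1 with f_value / n <= delta2.

    f_value is the (n-independent) correction term for fixed alpha, beta.
    """
    if delta2 <= 0:
        raise ValueError("delta2 must be positive")
    return max(1, math.ceil(f_value / delta2))

def total_rate_loss(delta1, f_value, n):
    """Upper bound delta1 + f/n on the gap to I(X;B) - I(X;E)."""
    return delta1 + f_value / n

# Hypothetical numbers: f = 100 bits, delta split evenly into 0.03 + 0.03.
n_min = min_block_length(100.0, 0.03)
loss = total_rate_loss(0.03, 100.0, n_min)
```

Because $f$ does not depend on $n$, any target $\delta_2>0$ can be met by taking $n$ large enough, which is what makes the rate $I(X;B)_\rho - I(X;E)_\rho$ achievable.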
Let {𝑝(𝑥), 𝜌 𝑥𝐴 𝑘 }𝑥∈X be an arbitrary ensemble over 𝑘 channel input systems, with
𝑘 ∈ N. Let
\[ \tau_{XB^kE^k} := \sum_{x\in\mathcal{X}} p(x)|x\rangle\langle x|_X\otimes(\mathcal{U}^{\mathcal{N}}_{A\to BE})^{\otimes k}(\rho^x_{A^k}). \tag{16.2.38} \]
Fix 𝜀 ∈ (0, 1] and 𝛿 > 0. Let 𝛿1 , 𝛿2 > 0 be such that 𝛿 = 𝛿1 + 𝛿2 . Set 𝛼 ∈ (0, 1)
and 𝛽 > 1 such that
\[ \delta_1 \geq \frac{1}{k}\big(I(X;B^k)_\tau - I(X;E^k)_\tau\big) - \frac{1}{k}\big(I_\alpha(X;B^k)_\tau - \widetilde{I}'_\beta(X;E^k)_\tau\big), \tag{16.2.39} \]
which is possible based on the arguments given in the proof of Theorem 16.24
above. Then, with this choice of 𝛼 and 𝛽, take 𝑛 large enough so that
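\[ \delta_2 \geq \frac{1}{kn}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right). \tag{16.2.40} \]
Now, we use the fact that, for the chosen $n$ and $\varepsilon$, there exists an $(n,|\mathcal{M}|,\varepsilon)$ private
communication protocol such that (16.2.24) holds, i.e.,
\[ \frac{\log_2|\mathcal{M}|}{n} \geq I_\alpha(X;B^k)_\tau - \widetilde{I}'_\beta(X;E^k)_\tau - \frac{1}{n}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right). \tag{16.2.41} \]
Dividing both sides by $k$ gives
\[ \frac{\log_2|\mathcal{M}|}{kn} \geq \frac{1}{k}\big(I_\alpha(X;B^k)_\tau - \widetilde{I}'_\beta(X;E^k)_\tau\big) - \frac{1}{kn}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right). \tag{16.2.42} \]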
Rearranging the right-hand side of this inequality, and using (16.2.39)–(16.2.42),
we find that
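\begin{align}
\frac{\log_2|\mathcal{M}|}{kn} &\geq \frac{1}{k}\big(I(X;B^k)_\tau - I(X;E^k)_\tau\big) \notag\\
&\quad - \frac{1}{k}\Big[I(X;B^k)_\tau - I(X;E^k)_\tau - \big(I_\alpha(X;B^k)_\tau - \widetilde{I}'_\beta(X;E^k)_\tau\big)\Big] \notag\\
&\quad - \frac{1}{kn}f\!\left(\varepsilon',\frac{\varepsilon'}{2},\frac{\varepsilon'}{4},\frac{\varepsilon'}{4},\frac{\varepsilon'}{2},\alpha,\beta\right) \tag{16.2.43}\\
&\geq \frac{1}{k}\big(I(X;B^k)_\tau - I(X;E^k)_\tau\big) - (\delta_1+\delta_2) \tag{16.2.44}\\
&= \frac{1}{k}\big(I(X;B^k)_\tau - I(X;E^k)_\tau\big) - \delta. \tag{16.2.45}
\end{align}
Thus, there exists a $(kn,|\mathcal{M}|,\varepsilon)$ private communication protocol with rate
$\frac{\log_2|\mathcal{M}|}{kn} \geq \frac{1}{k}\big(I(X;B^k)_\tau - I(X;E^k)_\tau\big) - \delta$. Therefore, letting $n'\equiv kn$, we conclude that there
exists an $(n', 2^{n'(R-\delta)}, \varepsilon)$ private communication protocol with $R = \frac{1}{k}\big(I(X;B^k)_\tau - I(X;E^k)_\tau\big)$
for all sufficiently large $n$ such that (16.2.40) holds. Since $\varepsilon$ and $\delta$ are arbitrary,
we conclude that for all $\varepsilon\in(0,1]$, $\delta>0$, and sufficiently large $n$, there exists an
$\big(n,\ 2^{n\left(\frac{1}{k}(I(X;B^k)_\tau - I(X;E^k)_\tau)-\delta\right)},\ \varepsilon\big)$ private communication protocol. This means that
$\frac{1}{k}\big(I(X;B^k)_\tau - I(X;E^k)_\tau\big)$ is an achievable rate.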
Now, since the input ensemble is arbitrary in the arguments above, we conclude
that
\[ \frac{1}{k}I^p(\mathcal{N}^{\otimes k}) = \sup_{\{p(x),\rho^x_{A^k}\}_{x\in\mathcal{X}}}\frac{1}{k}\big(I(X;B^k)_\tau - I(X;E^k)_\tau\big) \tag{16.2.47} \]
In order to prove the weak converse part of Theorem 16.21, we make use of
Corollary 16.8, specifically (16.1.80). Applying this inequality to the tensor-power
channel $\mathcal{N}^{\otimes n}_{A\to B}$ leads to the following:
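Proposition 16.25
Let $\mathcal{N}_{A\to B}$ be a quantum channel, and let $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ be an isometric channel ex-
tending it. Let $n\in\mathbb{N}$ and $\varepsilon\in[0,1)$. For an $(n,|\mathcal{M}|,\varepsilon)$ private communication
protocol for $\mathcal{N}_{A\to B}$, the rate $\frac{\log_2|\mathcal{M}|}{n}$ satisfies
\[ \left(1-\varepsilon-\sqrt{\varepsilon}\right)\frac{\log_2|\mathcal{M}|}{n} \leq \sup_{\{p(x),\rho^x_{A^n}\}_{x\in\mathcal{X}}}\frac{1}{n}\big(I(X;B^n)_\rho - I(X;E^n)_\rho\big) + \frac{1}{n}\left(h_2(\varepsilon)+2g(\sqrt{\varepsilon})\right), \tag{16.2.48} \]
where the information quantities are evaluated with respect to the state
\[ \rho_{XB^nE^n} := \sum_{x\in\mathcal{X}} p(x)|x\rangle\langle x|_X\otimes(\mathcal{U}^{\mathcal{N}}_{A\to BE})^{\otimes n}(\rho^x_{A^n}), \tag{16.2.49} \]
with $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ an isometric channel extending $\mathcal{N}_{A\to B}$. Consequently,
\[ \left(1-\varepsilon-\sqrt{\varepsilon}\right)P^{n,\varepsilon}(\mathcal{N}) \leq \sup_{\{p(x),\rho^x_{A^n}\}_{x\in\mathcal{X}}}\frac{1}{n}\big(I(X;B^n)_\rho - I(X;E^n)_\rho\big) + \frac{1}{n}\left(h_2(\varepsilon)+2g(\sqrt{\varepsilon})\right). \tag{16.2.50} \]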
Suppose that 𝑅 is an achievable rate for private communication over the channel
N 𝐴→𝐵 . Then, by definition, for all 𝜀 ∈ (0, 1], 𝛿 > 0, and sufficiently large 𝑛, there
exists an (𝑛, 2𝑛(𝑅−𝛿) , 𝜀) private communication protocol for N 𝐴→𝐵 . For all such
protocols, the inequality in (16.2.48) holds, so that
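\[ \left(1-\varepsilon-\sqrt{\varepsilon}\right)(R-\delta) \leq \sup_{\{p(x),\rho^x_{A^n}\}_{x\in\mathcal{X}}}\frac{1}{n}\big(I(X;B^n)_\rho - I(X;E^n)_\rho\big) + \frac{1}{n}\left(h_2(\varepsilon)+2g(\sqrt{\varepsilon})\right). \tag{16.2.51} \]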
Since the inequality holds for all sufficiently large 𝑛, it holds in the limit 𝑛 → ∞,
so that
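\begin{align}
\left(1-\varepsilon-\sqrt{\varepsilon}\right)(R-\delta) &\leq \lim_{n\to\infty}\left(\sup_{\{p(x),\rho^x_{A^n}\}_{x\in\mathcal{X}}}\frac{1}{n}\big(I(X;B^n)_\rho - I(X;E^n)_\rho\big) + \frac{1}{n}\left(h_2(\varepsilon)+2g(\sqrt{\varepsilon})\right)\right) \tag{16.2.52}\\
&= \lim_{n\to\infty}\sup_{\{p(x),\rho^x_{A^n}\}_{x\in\mathcal{X}}}\frac{1}{n}\big(I(X;B^n)_\rho - I(X;E^n)_\rho\big). \tag{16.2.53}
\end{align}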
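Then since this inequality holds for all $\varepsilon\in(0,1)$, $\delta>0$, it holds in particular for $\varepsilon$
satisfying $\varepsilon+\sqrt{\varepsilon}<1$, which gives
\[ R \leq \frac{1}{1-\varepsilon-\sqrt{\varepsilon}}\lim_{n\to\infty}\sup_{\{p(x),\rho^x_{A^n}\}_{x\in\mathcal{X}}}\frac{1}{n}\big(I(X;B^n)_\rho - I(X;E^n)_\rho\big) + \delta, \tag{16.2.54} \]
\begin{align}
&= \lim_{n\to\infty}\sup_{\{p(x),\rho^x_{A^n}\}_{x\in\mathcal{X}}}\frac{1}{n}\big(I(X;B^n)_\rho - I(X;E^n)_\rho\big) \tag{16.2.56}\\
&= I^p_{\mathrm{reg}}(\mathcal{N}). \tag{16.2.57}
\end{align}
We have thus shown that the quantity $I^p_{\mathrm{reg}}(\mathcal{N})$ is a weak converse rate for private
communication over $\mathcal{N}$.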
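The way the finite-size terms in (16.2.48) disappear can be made concrete with a small numerical sketch. It assumes that $g$ is the function $g(x) = (x+1)\log_2(x+1) - x\log_2 x$ (this definition is not restated in this section, so treat it as an assumption) and takes the per-channel-use information difference as a given number rather than computing it from a channel.

```python
import math

def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def g(x):
    """Assumed form g(x) = (x+1) log2(x+1) - x log2(x), with g(0) = 0."""
    if x == 0.0:
        return 0.0
    return (x + 1) * math.log2(x + 1) - x * math.log2(x)

def weak_converse_rate_bound(info_per_use, eps, n):
    """Rate bound obtained by rearranging (16.2.48).

    info_per_use stands for sup (1/n)(I(X;B^n) - I(X;E^n)), supplied as a number.
    """
    prefactor = 1 - eps - math.sqrt(eps)
    if prefactor <= 0:
        raise ValueError("need eps + sqrt(eps) < 1")
    return (info_per_use + (h2(eps) + 2 * g(math.sqrt(eps))) / n) / prefactor

# Hypothetical information difference of 1 bit per channel use.
loose = weak_converse_rate_bound(1.0, eps=0.01, n=10)
tight = weak_converse_rate_bound(1.0, eps=1e-6, n=10**9)
```

As $n\to\infty$ the additive term vanishes, and as $\varepsilon\to 0$ the prefactor tends to one, so the bound approaches the information difference itself. This is a weak converse because the bound degrades for any fixed $\varepsilon>0$.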
Except for channels for which the private information is known to be additive (such
as the class of degradable channels; see Section 16.3.1 below), the private capacity
of a channel is difficult to compute. This prompts us to find upper bounds on the
private capacity. In this section, we do so in terms of the channel’s relative entropy
of entanglement, and in terms of the channel’s squashed entanglement in the next
section.
We begin by recalling the bound from (16.1.97), which holds for all (|M| , 𝜀)
private communication protocols and for all 𝛼 > 1:
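\[ P^{\varepsilon}(\mathcal{N}) \leq \widetilde{E}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{16.2.58} \]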
For 𝑛 channel uses, the bound in (16.1.97) becomes
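\[ \frac{\log_2|\mathcal{M}|}{n} \leq \frac{1}{n}\widetilde{E}_\alpha(\mathcal{N}^{\otimes n}) + \frac{\alpha}{n(\alpha-1)}\log_2\!\left(\frac{1}{1-\varepsilon}\right), \tag{16.2.59} \]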
which holds for all 𝛼 > 1 and for all (𝑛, |M| , 𝜀) private communication protocols,
with 𝑛 ∈ N and 𝜀 ∈ [0, 1). We can simplify this inequality by making use of the
following fact:
Proof: The proof is identical to the proof of Proposition 14.21, but making use of
Proposition 10.9 at the beginning instead of Proposition 10.12. ■
Proof: The proof here is identical to that given for Theorem 14.22, but using the
relative entropy of entanglement 𝐸 𝑅 (N) instead of the Rains information 𝑅(N). ■
the 𝑛-shot setting, establishing a bound on the 𝑛-shot private capacity and we
conclude from it that the squashed entanglement of a quantum channel is a weak
converse rate for private communication over it. Later on in the book, in Chapter 20,
we prove that the squashed entanglement of a channel is an upper bound on its
secret-key-agreement capacity, which generally can be much larger than its private
capacity. Thus, the squashed entanglement bound is generally a loose upper bound
on its (unassisted) private capacity.
Proof: Plugging the tensor-power channel N ⊗𝑛 into the bound from Theorem 16.11,
we conclude the following bound
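\[ \left(1-2\sqrt{\varepsilon}\right)\frac{\log_2|\mathcal{M}|}{n} \leq \frac{1}{n}E_{\mathrm{sq}}(\mathcal{N}^{\otimes n}) + \frac{2}{n}g_2(\sqrt{\varepsilon}). \tag{16.2.65} \]
The desired statement then follows from the additivity of squashed entanglement of
a channel (Corollary 10.21), which implies that $\frac{1}{n}E_{\mathrm{sq}}(\mathcal{N}^{\otimes n}) = E_{\mathrm{sq}}(\mathcal{N})$. ■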
Proof: We exploit the bound from Theorem 16.28 and an argument similar to that
from Section 15.2.4 to conclude the desired statement. ■
16.3 Examples
We now consider the private capacity for particular classes of quantum channels.
As we indicated earlier, computing the private capacity of an arbitrary channel is a
difficult task. This task is made more difficult by the fact that, in some cases, the
private information is known to be strictly superadditive in the following sense:
\[ I^p(\mathcal{N}^{\otimes n}) > nI^p(\mathcal{N}). \tag{16.3.1} \]
This fact confirms that regularization of the private information is really needed
in general in order to compute the private capacity, and that additivity of private
information does not hold for all channels. Please consult the Bibliographic Notes in
Section 16.5 for more information about strict superadditivity of private information
for certain quantum channels.
Before starting the development below, recall that the private information of a
channel N 𝐴→𝐵 is defined as
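\[ I^p(\mathcal{N}) = \sup_{\{p(x),\rho^x_A\}_{x\in\mathcal{X}}}\big(I(X;B)_\rho - I(X;E)_\rho\big), \tag{16.3.2} \]
with $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ an isometric channel extending $\mathcal{N}_{A\to B}$ and the optimization over every
ensemble $\{p(x),\rho^x_A\}_{x\in\mathcal{X}}$.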
Recall from Definition 4.6 that a channel N 𝐴→𝐵 is degradable if there exists a
degrading channel D𝐵→𝐸 such that
N𝑐 = D ◦ N, (16.3.4)
We now show that the private information is equal to the coherent information
for every degradable channel, i.e.,
Proof: By Theorem 16.22, we only need to prove the inequality 𝐼 𝑝 (N) ≤ 𝐼 𝑐 (N)
for the case of a degradable channel. Let $\mathcal{U}^{\mathcal{N}}_{A\to BE}$ be an isometric channel extending
$\mathcal{N}_{A\to B}$. Let $\rho^x_A = \sum_y p(y|x)\psi^{x,y}_A$ be a spectral decomposition of the input state $\rho^x_A$,
Consider that
\begin{align}
I(X;B)_\rho - I(X;E)_\rho &= I(XY;B)_\rho - I(Y;B|X)_\rho - \big[I(XY;E)_\rho - I(Y;E|X)_\rho\big] \tag{16.3.11}\\
&= I(XY;B)_\rho - I(XY;E)_\rho - \big[I(Y;B|X)_\rho - I(Y;E|X)_\rho\big] \tag{16.3.12}\\
&\leq I(XY;B)_\rho - I(XY;E)_\rho \tag{16.3.13}\\
&= H(B)_\rho - H(B|XY)_\rho - \big[H(E)_\rho - H(E|XY)_\rho\big] \tag{16.3.14}\\
&= H(B)_\rho - H(E)_\rho \tag{16.3.15}\\
&\leq I^c(\mathcal{N}). \tag{16.3.16}
\end{align}
The first equality follows by applying the chain rule for conditional mutual infor-
mation. The first inequality follows by applying the data-processing inequality for
conditional mutual information and the fact that there is a degrading channel D𝐵→𝐸
such that 𝜌 𝑋𝑌 𝐸 = D𝐵→𝐸 (𝜌 𝑋𝑌 𝐵 ). The last few steps follow the same reasoning
given in the proof of Theorem 16.22. ■
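\begin{align}
P(\mathcal{N}) = \widetilde{P}(\mathcal{N}) &= E_R(\mathcal{N}) = I^p(\mathcal{N}) \tag{16.3.17}\\
&= Q(\mathcal{N}) = \widetilde{Q}(\mathcal{N}) = R(\mathcal{N}) = I^c(\mathcal{N}). \tag{16.3.18}
\end{align}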
Let us consider the private capacity for anti-degradable channels. Recall from
Definition 4.6 that a channel N 𝐴→𝐵 is anti-degradable if there exists an anti-
degrading channel A𝐸→𝐵 such that
Proof: The first claim is a direct consequence of the definition of the private
information in (16.3.2), the fact that there is an anti-degrading channel A𝐸→𝐵
such that 𝜌 𝑋 𝐵 = A𝐸→𝐵 (𝜌 𝑋 𝐸 ), where the state 𝜌 𝑋 𝐵𝐸 is defined in (16.3.3), and the
data-processing inequality for mutual information. The second claim follows from
the regularized expression for private capacity from Theorem 16.21 and the fact
that a tensor product of anti-degradable channels is anti-degradable. ■
16.4 Summary
transmitted message). The private capacity is defined as the largest rate at which
private communication is possible, such that the decoding error probability tends to
zero and the eavesdropper’s system becomes decoupled from the message system.
In our definitions, we combined these requirements into a single constraint. We
found that the private information $I^p(\mathcal{N})$ of a quantum channel $\mathcal{N}$ is a lower bound
on its private capacity, and that, in general, computing the exact value of the private
capacity involves a regularization, i.e., $P(\mathcal{N}) = I^p_{\mathrm{reg}}(\mathcal{N})$.
Following the same course as in previous chapters, we began with the one-shot
setting for private communication, in which only one use of the channel is allowed,
along with some non-zero error. We then determined upper and lower bounds on
the number of private bits that can be transmitted. We established three upper
bounds on the one-shot private capacity, involving the one-shot private information,
the hypothesis testing relative entropy of entanglement, as well as the squashed
entanglement. These in turn led to upper bounds on the asymptotic private capacity.
To obtain a lower bound on the one-shot private capacity, we employed the methods
of position-based coding and convex splitting, similar to how we did in the previous
chapter on secret key distillation (Chapter 15). This lower bound is optimal when
employed in the asymptotic setting because it leads to the regularized private
information as an achievable rate for private communication, and this matches the
upper bound. For degradable channels, there is no difference between the private
information and the coherent information, and this implies that there is no difference
between the private capacity and quantum capacity for these channels. We also
proved that the private capacity of anti-degradable channels is equal to zero.
Since the regularized private information is difficult to compute, we established
other upper bounds on private capacity, in terms of relative entropy of entanglement
(strong converse upper bound) and squashed entanglement (weak converse upper
bound). We then concluded that the strong converse property holds for all
generalized dephasing channels and their private capacity is equal to their coherent
information.
by Csiszár and Körner (1978), who established a general formula for the private
capacity of a classical channel.
Bennett and Brassard (1984) devised the first protocol for sending private
classical information over a quantum channel, which is known as quantum key
distribution. The private capacity of a quantum channel was studied by Devetak
(2005); Cai et al. (2004), who independently established the regularized expression
for it in Theorem 16.21.
Private communication was studied from the one-shot perspective by Renes
and Renner (2011); Wilde et al. (2017); Wilde (2017b); Radhakrishnan et al.
(2017). Proposition 16.3 was established by Wilde and Qi (2018). The connection
between secret-key transmission and bipartite private-state transmission is a direct
consequence of the insights of Horodecki et al. (2005a, 2009a) and was discussed
by Wilde et al. (2017). The upper bound in Proposition 16.7 is similar to that
established by Qi et al. (2018a). The upper bound in Theorem 16.9 is due to Wilde
et al. (2017) and the upper bound in Theorem 16.11 to Takeoka et al. (2014). The
lower bound in Section 16.1.4 is due to Wilde (2017b).
As mentioned above, the asymptotic theory of private communication was
developed by Devetak (2005); Cai et al. (2004). Devetak (2005) proved Theorem 16.22,
relating the coherent information to the private information and the quantum
capacity to the private capacity. Strict superadditivity of the private information of a quantum channel
was established by Smith et al. (2008), and this result was strengthened by Elkouss
and Strelchuk (2015). The relative entropy of entanglement strong converse bound
on private capacity in Theorem 16.27 was proven by Wilde et al. (2017). The
squashed entanglement weak converse bound on private capacity in Theorem 16.29
was proven by Takeoka et al. (2014). The private capacity of degradable channels
(i.e., Theorem 16.30 and (16.3.8)) was established by Smith (2008). The strong
converse property for the private capacity of generalized dephasing channels was
established by Wilde et al. (2017).
Part III
Quantum Communication
Protocols With Feedback
Assistance
Quantum-Feedback-Assisted
Communication
In this chapter, we begin our foray into interactive quantum communication
by analyzing communication protocols in which the goal is for the sender to
communicate a classical message to the receiver, with the assistance of a free
noiseless quantum feedback channel. By a quantum feedback channel, we mean a
quantum channel from the receiver to the sender that is separate from the channel
from the sender to the receiver being used to communicate the message. We thus
call this communication scenario “quantum-feedback-assisted communication.”
One simple (yet effective) way to make use of this free noiseless quantum
feedback channel is for the receiver to transmit one share of a bipartite quantum
state to the sender. By doing so, they can establish shared entanglement, and the
rates of classical communication that are achievable with such a strategy are given
by the limits on entanglement-assisted communication that we studied previously
in Chapter 11.
Perhaps surprisingly, we show here that the same non-asymptotic converse
bounds established in (11.2.61) and (11.2.92) apply to protocols assisted by noiseless
quantum feedback. These non-asymptotic converse bounds imply that the quantum-
feedback-assisted classical capacity of a channel is no larger than its entanglement-
assisted capacity. Furthermore, the strong converse property holds for the quantum-
feedback-assisted capacity, so that the strong converse capacity is equal to the
mutual information of a quantum channel.
Chapter 17: Quantum-Feedback-Assisted Communication
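[Figure (image not recoverable): circuit diagram of a quantum-feedback-assisted communication protocol between Alice and Bob. Recoverable labels: initial shared state Ψ on systems F0 and B′0; channel uses N with outputs B1, B2, …, Bn; feedback systems F1, F2; Bob's scratch systems B′0, B′1, B′2, …, B′n−1; decoding channels D1, D2, …; decoded message m̂.]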
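\[ \left(\mathcal{M},\ \Psi_{F_0B'_0},\ \mathcal{E}^0_{M'F_0\to A'_1A_1},\ \{\mathcal{E}^i_{A'_iF_i\to A'_{i+1}A_{i+1}},\ \mathcal{D}^i_{B_iB'_{i-1}\to F_iB'_i}\}_{i=1}^{n-1},\ \mathcal{D}^n_{B_nB'_{n-1}\to\hat{M}}\right), \tag{17.1.1} \]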
where M is the message set, Ψ𝐹0 𝐵0′ denotes a bipartite quantum state, the objects
denoted by E are encoding channels, and the objects denoted by D are decoding
channels. Let C denote all of these elements, which together constitute the quantum-
Furthermore, Alice and Bob also initially share a quantum state Ψ𝐹0 𝐵0′ on Alice’s
system 𝐹0 and Bob’s system 𝐵′0 . This state is prepared by Bob locally, and then he
transmits the system 𝐹0 to Alice via the noiseless quantum feedback channel. The
initial global state shared between them is
\[ \Phi^p_{MM'}\otimes\Psi_{F_0B'_0}. \tag{17.1.3} \]
Alice then sends the $M'$ and $F_0$ registers through the first encoding channel
$\mathcal{E}^0_{M'F_0\to A'_1A_1}$. This encoding channel realizes a set $\{\mathcal{E}^{0,m}_{F_0\to A'_1A_1}\}_{m\in\mathcal{M}}$ of quantum
channels as follows:
\[ \mathcal{E}^{0,m}_{F_0\to A'_1A_1}(\tau_{F_0}) := \mathcal{E}^0_{M'F_0\to A'_1A_1}(|m\rangle\langle m|_{M'}\otimes\tau_{F_0}), \tag{17.1.4} \]
for all input states $\tau_{F_0}$. The global state after the first encoding channel is then as
follows:
\[ \mathcal{E}^0_{M'F_0\to A'_1A_1}(\Phi^p_{MM'}\otimes\Psi_{F_0B'_0}). \tag{17.1.5} \]
Note that the scratch system 𝐴′1can contain a classical copy of the particular
message 𝑚 that is being communicated, and the same is true for all of the later
scratch systems 𝐴𝑖′, for 𝑖 ∈ {2, . . . , 𝑛}. In fact, this is necessary in order for the
communication protocol to be effective. Alice then transmits the 𝐴1 system through
the channel N 𝐴1 →𝐵1 , leading to the state
\[
\rho^1_{MA_1'B_1B_0'} := (\mathcal{N}_{A_1\to B_1}\circ\mathcal{E}^0_{M'F_0\to A_1'A_1})(\Phi^p_{MM'}\otimes\Psi_{F_0B_0'}). \tag{17.1.6}
\]
After receiving the 𝐵1 system, Bob performs the decoding channel $\mathcal{D}^1_{B_1B_0'\to F_1B_1'}$,
such that the state is then
\[
\mathcal{D}^1_{B_1B_0'\to F_1B_1'}(\rho^1_{MA_1'B_1B_0'}), \tag{17.1.7}
\]
with it being understood that the system 𝐵′1 is Bob’s new scratch register and the
feedback system 𝐹1 gets sent over the noiseless quantum feedback channel back to
Alice.
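In the next round, Alice processes the 𝐴′1 𝐹1 systems with the encoding channel
$\mathcal{E}^1_{A_1'F_1\to A_2'A_2}$, and she sends system 𝐴2 over the channel $\mathcal{N}_{A_2\to B_2}$, leading to the state
\[
\rho^2_{MA_2'B_2B_1'} := (\mathcal{N}_{A_2\to B_2}\circ\mathcal{E}^1_{A_1'F_1\to A_2'A_2}\circ\mathcal{D}^1_{B_1B_0'\to F_1B_1'})(\rho^1_{MA_1'B_1B_0'}). \tag{17.1.8}
\]
Bob then applies the second decoding channel $\mathcal{D}^2_{B_2B_1'\to F_2B_2'}$. This process then
iterates 𝑛 − 2 more times, and the state after each use of the channel is as follows:
\[
\rho^i_{MA_i'B_iB_{i-1}'} := (\mathcal{N}_{A_i\to B_i}\circ\mathcal{E}^{i-1}_{A_{i-1}'F_{i-1}\to A_i'A_i}\circ\mathcal{D}^{i-1}_{B_{i-1}B_{i-2}'\to F_{i-1}B_{i-1}'})(\rho^{i-1}_{MA_{i-1}'B_{i-1}B_{i-2}'}), \tag{17.1.9}
\]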
Using the alternative expression in (11.1.36) for the maximal error probability, we
have that the maximal error probability of the quantum-feedback-assisted code C is
given by
\[
p^*_{\mathrm{err}}(\mathcal{C}) = \max_{p:\mathcal{M}\to[0,1]}\frac{1}{2}\left\|\Phi^p_{MM'}-\omega^p_{M\hat{M}}\right\|_1. \tag{17.1.12}
\]
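As a hedged illustration of (17.1.12), note that the final state of the protocol is classical–classical, so the normalized trace distance reduces to a total variation distance between probability distributions. The sketch below assumes a hypothetical end-to-end conditional distribution `q[m][mhat]` (the probability that Bob outputs `mhat` when Alice sends `m`); the function names and the numbers are illustrative assumptions and do not come from the text.

```python
# Sketch: average and maximal error probability for a classical-classical
# final state omega^p = sum_m p(m) |m><m| (x) sum_mhat q(mhat|m) |mhat><mhat|.

def average_error(prior, q):
    """(1/2) * ||Phi^p - omega^p||_1, computed as a total variation distance.

    prior[m] is p(m); q[m][mhat] is a hypothetical end-to-end conditional
    probability of decoding mhat given that m was sent.
    """
    tv = 0.0
    for m, p_m in enumerate(prior):
        for mhat, q_val in enumerate(q[m]):
            ideal = p_m if mhat == m else 0.0  # Phi^p is perfectly correlated
            tv += abs(ideal - p_m * q_val)
    return 0.5 * tv

def maximal_error(q):
    """Maximum of average_error over all priors.

    The error is linear in the prior, so the maximum is attained at a
    point mass on the worst message.
    """
    return max(1.0 - q[m][m] for m in range(len(q)))

# Hypothetical three-message code.
q = [[0.90, 0.10, 0.00],
     [0.20, 0.70, 0.10],
     [0.00, 0.05, 0.95]]
uniform = [1 / 3] * 3

avg = average_error(uniform, q)      # average error under the uniform prior
worst = maximal_error(q)             # maximal error over all priors
```

The maximal error is never smaller than the average error for any fixed prior, which is why a bound on the former is the stronger requirement.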
[Figure 17.2: the protocol of the previous section with every use of the channel N replaced by the preparation channel P𝜎𝐵. Diagram not reproduced.]
where P𝜎𝐵 denotes a preparation channel that prepares the arbitrary (but fixed) state
𝜎𝐵 at the output.
We can modify the 𝑖th step of the protocol discussed in the previous section,
such that instead of the actual channel N 𝐴𝑖 →𝐵𝑖 being applied, the replacement
channel R 𝐴𝑖 →𝐵𝑖 is applied; see Figure 17.2.
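To make the action of a replacement channel concrete, the following pure-Python sketch applies it to one half of a two-qubit maximally entangled state and compares the output with the product of the reduced state and 𝜎. The helper names (`kron`, `partial_trace_second`, `replacement_channel`) and the particular choice of 𝜎 are assumptions made for this example only.

```python
# Sketch: a replacement channel discards its input and prepares a fixed state,
# R(rho_{RA}) = Tr_A[rho_{RA}] (x) sigma_B, so correlations with R are destroyed.

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def partial_trace_second(rho, d_first, d_second):
    """Trace out the second tensor factor of a (d_first * d_second)-dim matrix."""
    return [[sum(rho[i * d_second + k][j * d_second + k] for k in range(d_second))
             for j in range(d_first)] for i in range(d_first)]

def replacement_channel(rho_ra, d_r, d_a, sigma_b):
    """Discard system A and prepare the fixed state sigma_b on system B."""
    return kron(partial_trace_second(rho_ra, d_r, d_a), sigma_b)

# Density matrix of the maximally entangled state (|00> + |11>)/sqrt(2).
bell = [[0.0] * 4 for _ in range(4)]
for i in (0, 3):
    for j in (0, 3):
        bell[i][j] = 0.5

sigma = [[0.75, 0.0], [0.0, 0.25]]          # arbitrary (but fixed) output state
out = replacement_channel(bell, 2, 2, sigma)
expected = kron([[0.5, 0.0], [0.0, 0.5]], sigma)  # maximally mixed (x) sigma
```

Even though the input is maximally entangled, the output is a product state, which is the sense in which the "communication line has been cut".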
The state after the first round in this protocol over the useless channel is
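\begin{align}
\tau^1_{MA_1'B_1B_0'} &:= (\mathcal{R}_{A_1\to B_1}\circ\mathcal{E}^0_{M'F_0\to A_1'A_1})(\Phi^p_{MM'}\otimes\Psi_{F_0B_0'}) \tag{17.1.14}\\
&= \operatorname{Tr}_{A_1}[\mathcal{E}^0_{M'F_0\to A_1'A_1}(\Phi^p_{MM'}\otimes\Psi_{F_0B_0'})]\otimes\sigma_{B_1}, \tag{17.1.15}
\end{align}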
where the second equality holds due to the fact that the first encoding channel
$\mathcal{E}^0_{M'F_0\to A_1'A_1}$ is trace preserving, and where
\[
\pi^p_M := \sum_{m\in\mathcal{M}} p(m)|m\rangle\!\langle m|_M. \tag{17.1.20}
\]
where we used (17.1.16) to obtain the last line. If we take the partial trace over
system 𝐴′2, then the fact that the encoding channel $\mathcal{E}^1_{A_1'F_1\to A_2'A_2}$ is trace preserving
implies that
$\tau^2_{MB_2B_1'}$
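Then, using (17.1.19), which implies that $\tau^1_{MB_0'} = \pi^p_M\otimes\Psi_{B_0'}$, we find that
\begin{align}
\tau^2_{MB_2B_1'} &= \operatorname{Tr}_{F_1}[\mathcal{D}^1_{B_1B_0'\to F_1B_1'}(\pi^p_M\otimes\Psi_{B_0'}\otimes\sigma_{B_1})]\otimes\sigma_{B_2} \tag{17.1.27}\\
&= \pi^p_M\otimes\operatorname{Tr}_{F_1}[\mathcal{D}^1_{B_1B_0'\to F_1B_1'}(\Psi_{B_0'}\otimes\sigma_{B_1})]\otimes\sigma_{B_2} \tag{17.1.28}\\
&= \pi^p_M\otimes\tau^2_{B_1'}\otimes\sigma_{B_2}. \tag{17.1.29}
\end{align}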
Thus, we find again that there is no correlation whatsoever between the message
system 𝑀 and Bob’s systems 𝐵2 𝐵′1 after tracing over all of Alice’s systems.
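The states for the other rounds 𝑖 ∈ {3, . . . , 𝑛} are given by
\begin{align}
\tau^i_{MA_i'B_iB_{i-1}'} &:= (\mathcal{R}_{A_i\to B_i}\circ\mathcal{E}^{i-1}_{A_{i-1}'F_{i-1}\to A_i'A_i}\circ\mathcal{D}^{i-1}_{B_{i-1}B_{i-2}'\to F_{i-1}B_{i-1}'})(\tau^{i-1}_{MA_{i-1}'B_{i-1}B_{i-2}'}) \tag{17.1.30}\\
&= \operatorname{Tr}_{A_i}[(\mathcal{E}^{i-1}_{A_{i-1}'F_{i-1}\to A_i'A_i}\circ\mathcal{D}^{i-1}_{B_{i-1}B_{i-2}'\to F_{i-1}B_{i-1}'})(\tau^{i-1}_{MA_{i-1}'B_{i-1}B_{i-2}'})]\otimes\sigma_{B_i} \tag{17.1.31}\\
&= \operatorname{Tr}_{A_i}[(\mathcal{E}^{i-1}_{A_{i-1}'F_{i-1}\to A_i'A_i}\circ\mathcal{D}^{i-1}_{B_{i-1}B_{i-2}'\to F_{i-1}B_{i-1}'})(\tau^{i-1}_{MA_{i-1}'B_{i-2}'}\otimes\sigma_{B_{i-1}})]\otimes\sigma_{B_i}. \tag{17.1.32}
\end{align}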
Repeating a calculation similar to the above leads to a similar conclusion as above:
\[
\tau^i_{MB_iB_{i-1}'} = \pi^p_M\otimes\tau^i_{B_{i-1}'}\otimes\sigma_{B_i}, \tag{17.1.33}
\]
for all 𝑖 ∈ {3, . . . , 𝑛}. That is, there is no correlation whatsoever between the
message system 𝑀 and Bob’s systems 𝐵𝑖 𝐵′𝑖−1 after tracing over all of Alice’s systems.
Again, this is intuitively a consequence of the fact that the “communication line has
been cut” when employing the replacement channel.
Bob’s final decoding channel $\mathcal{D}^n_{B_nB_{n-1}'\to\hat{M}}$ therefore leads to the following
classical–classical state:
\[
\tau_{M\hat{M}} := \pi^p_M\otimes\mathcal{D}^n_{B_nB_{n-1}'\to\hat{M}}(\tau^n_{B_{n-1}'}\otimes\sigma_{B_n}) = \pi^p_M\otimes\tau_{\hat{M}}, \tag{17.1.34}
\]
where $\tau_{\hat{M}} := \sum_{\hat{m}\in\mathcal{M}} t(\hat{m})|\hat{m}\rangle\!\langle\hat{m}|_{\hat{M}}$ for some probability distribution 𝑡 : M → [0, 1],
which corresponds to Bob’s measurement.
We now give a general upper bound on the number of transmitted bits in any quantum-
feedback-assisted classical communication protocol. This result is stated in Theorem
17.2, and it holds independently of the encoding and decoding channels used in the
protocol and depends only on the given communication channel N. Recall from
the previous section that log2 |M| represents the number of bits that are transmitted
over the channel N.
Now, let $\Phi_{M\hat{M}}$ be the state defined in (17.1.2) with 𝑝 the uniform distribution, and
similarly let $\omega_{M\hat{M}}$, defined in (17.1.10), be the state at the end of the protocol when
𝑝 is the uniform prior probability distribution. Observe that $\operatorname{Tr}_{\hat{M}}[\omega_{M\hat{M}}] = \pi_M$.
Also, letting
\[
\Pi_{M\hat{M}} = \sum_{m\in\mathcal{M}} |m\rangle\!\langle m|_M\otimes|m\rangle\!\langle m|_{\hat{M}} \tag{17.1.38}
\]
be the projection defining the comparator test, as in (11.1.37), observe that
\[
1-\operatorname{Tr}[\Pi_{M\hat{M}}\,\omega_{M\hat{M}}] = \frac{1}{2}\left\|\Phi_{M\hat{M}}-\omega_{M\hat{M}}\right\|_1 \le \varepsilon, \tag{17.1.39}
\]
where the equality follows by combining (11.1.24) with (11.1.41). This means
that
\[
\operatorname{Tr}[\Pi_{M\hat{M}}\,\omega_{M\hat{M}}] \ge 1-\varepsilon. \tag{17.1.40}
\]
We thus have all of the ingredients to apply Lemma 11.4. Doing so gives the
following critical first bound:
\[
I_H^{\varepsilon}(M;\hat{M})_\omega \le \frac{1}{1-\varepsilon}\left(I(M;\hat{M})_\omega + h_2(\varepsilon)\right). \tag{17.1.42}
\]
Now, using the data-processing inequality for the mutual information (see Proposition 7.19)
with respect to the last decoding channel, $\mathcal{D}^n_{B_nB_{n-1}'\to\hat{M}}$, we find that
\[
I(M;\hat{M})_\omega \le I(M;B_nB_{n-1}')_{\rho^n}. \tag{17.1.43}
\]
Then, using the chain rule for mutual information in (7.2.112), we obtain
where the second line is a consequence of the chain rule, as well as non-negativity
of mutual information:
≤ 𝐼 (𝑀 𝐵′𝑛−1 ; 𝐵𝑛 ) 𝜌 𝑛 . (17.1.47)
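The chain-rule step used here can be sanity-checked numerically in the classical special case. The sketch below is an illustration only: it draws a random joint distribution over three binary variables standing in for 𝑀, 𝐵𝑛, and 𝐵′𝑛−1 (the sizes and the random seed are assumptions), and evaluates both the chain rule and the bound 𝐼(𝑀; 𝐵𝑛 |𝐵′𝑛−1) ≤ 𝐼(𝑀𝐵′𝑛−1; 𝐵𝑛), which follows from the chain rule together with non-negativity of mutual information.

```python
import math
import random

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal of a joint distribution over tuples, keeping the given positions."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def mutual_information(joint, a, b):
    """I(A;B) = H(A) + H(B) - H(AB); a and b are tuples of positions."""
    return (entropy(marginal(joint, a)) + entropy(marginal(joint, b))
            - entropy(marginal(joint, a + b)))

def conditional_mutual_information(joint, a, b, c):
    """I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C)."""
    return (entropy(marginal(joint, a + c)) + entropy(marginal(joint, b + c))
            - entropy(marginal(joint, a + b + c)) - entropy(marginal(joint, c)))

# Random joint distribution over (M, B_n, B'_{n-1}); all three are toy bits.
random.seed(0)
weights = {(m, b, bp): random.random()
           for m in range(2) for b in range(2) for bp in range(2)}
total = sum(weights.values())
joint = {k: w / total for k, w in weights.items()}

M, B, BP = (0,), (1,), (2,)
lhs = mutual_information(joint, M, B + BP)                 # I(M; B_n B'_{n-1})
chain = (mutual_information(joint, M, BP)
         + conditional_mutual_information(joint, M, B, BP))  # chain rule
upper = (mutual_information(joint, M, BP)
         + mutual_information(joint, M + BP, B))           # after dropping I(B'; B_n)
```

In this check `lhs` and `chain` should agree up to rounding, while `upper` should be at least as large; the quantum statements in the text are of course more general than this classical special case.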
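Finally, observe that the state $\rho^n_{MB_nB_{n-1}'}$ has the following form:
\[
\rho^n_{MB_nB_{n-1}'} = \mathcal{N}_{A_n\to B_n}(\zeta^n_{MB_{n-1}'A_n}), \tag{17.1.48}
\]
where
\[
\zeta^n_{MB_{n-1}'A_n} := \operatorname{Tr}_{A_n'}[(\mathcal{E}^{n-1}_{A_{n-1}'F_{n-1}\to A_n'A_n}\circ\mathcal{D}^{n-1}_{B_{n-1}B_{n-2}'\to F_{n-1}B_{n-1}'})(\rho^{n-1}_{MA_{n-1}'B_{n-1}B_{n-2}'})]. \tag{17.1.49}
\]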
That is, the state $\zeta^n_{MB_{n-1}'A_n}$ is a particular state to consider in the optimization of the
mutual information of a channel (with the channel input system being 𝐴𝑛 and the
external correlated systems being 𝑀 𝐵′𝑛−1 ), whereas the definition of the mutual
information of a channel involves an optimization over all such states. This means
that
𝐼 (𝑀 𝐵′𝑛−1 ; 𝐵𝑛 ) 𝜌 𝑛 ≤ 𝐼 (N). (17.1.50)
Putting together (17.1.43), (17.1.45), and (17.1.50), we find that
\[
I(M;\hat{M})_\omega \le I(\mathcal{N}) + I(M;B_{n-1}')_{\rho^n}. \tag{17.1.51}
\]
The quantity 𝐼 (𝑀; 𝐵′𝑛−1 ) 𝜌 𝑛 can be bounded using steps analogous to the above.
In particular, using the data-processing inequality for the mutual information with
respect to the second-to-last decoding channel $\mathcal{D}^{n-1}_{B_{n-1}B_{n-2}'\to F_{n-1}B_{n-1}'}$, then employing
the same steps as above, we conclude that
𝐼 (𝑀; 𝐵′𝑛−1 ) 𝜌 𝑛 ≤ 𝐼 (𝑀; 𝐵𝑛−1 𝐵′𝑛−2 ) 𝜌 𝑛−1 (17.1.52)
= 𝐼 (𝑀; 𝐵𝑛−1 |𝐵′𝑛−2 ) 𝜌 𝑛−1 + 𝐼 (𝑀; 𝐵′𝑛−2 ) 𝜌 𝑛−1 (17.1.53)
≤ 𝐼 (𝑀 𝐵′𝑛−2 ; 𝐵𝑛−1 ) 𝜌 𝑛−1 + 𝐼 (𝑀; 𝐵′𝑛−2 ) 𝜌 𝑛−1 (17.1.54)
≤ 𝐼 (N) + 𝐼 (𝑀; 𝐵′𝑛−2 ) 𝜌 𝑛−1 , (17.1.55)
Overall, this leads to
\[
I(M;\hat{M})_\omega \le 2I(\mathcal{N}) + I(M;B_{n-2}')_{\rho^{n-1}}. \tag{17.1.56}
\]
Then, bounding $I(M;B_{n-2}')_{\rho^{n-1}}$ in the same manner as above, and continuing this
process 𝑛 − 3 more times such that we completely “unwind” the protocol, we obtain
\[
I(M;\hat{M})_\omega \le nI(\mathcal{N}) + I(M;B_0')_{\rho^1}. \tag{17.1.57}
\]
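Recall that the sandwiched Rényi mutual information $\widetilde{I}_\alpha(M;\hat{M})_\omega$ is defined as
\begin{align}
\widetilde{I}_\alpha(M;\hat{M})_\omega &= \inf_{\xi_{\hat{M}}}\widetilde{D}_\alpha(\omega_{M\hat{M}}\|\omega_M\otimes\xi_{\hat{M}}) \tag{17.1.65}\\
&= \inf_{\xi_{\hat{M}}}\widetilde{D}_\alpha(\omega_{M\hat{M}}\|\pi_M\otimes\xi_{\hat{M}}). \tag{17.1.66}
\end{align}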
Our goal now is to compare the actual protocol with one that results from employing
a useless, replacement channel. To this end, let $\mathcal{R}^{\sigma_B}_{A\to B}$ be the replacement channel
defined in (17.1.13), with 𝜎𝐵 an arbitrary (but fixed) state. Then as discussed in
Section 17.1.1 (in particular, in (17.1.34)), the final state of the protocol conducted
with the replacement channel is given by $\tau_{M\hat{M}} = \pi_M\otimes\tau_{\hat{M}}$. Then, we find that
\begin{align}
\widetilde{I}_\alpha(M;\hat{M})_\omega &= \inf_{\xi_{\hat{M}}}\widetilde{D}_\alpha(\omega_{M\hat{M}}\|\pi_M\otimes\xi_{\hat{M}}) \tag{17.1.67}\\
&\le \widetilde{D}_\alpha(\omega_{M\hat{M}}\|\pi_M\otimes\tau_{\hat{M}}) \tag{17.1.68}\\
&= \widetilde{D}_\alpha(\omega_{M\hat{M}}\|\tau_{M\hat{M}}). \tag{17.1.69}
\end{align}
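For classical–classical states the step from (17.1.67) to (17.1.68) can be checked directly, since for commuting states the sandwiched Rényi relative entropy reduces to the classical Rényi divergence and the infimum over 𝜉 has a closed form (Sibson's identity). The sketch below relies on that classical reduction; the conditional distribution, the choice 𝛼 = 2, and the candidate distribution playing the role of 𝜏 are hypothetical.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence D_alpha(p || q) in bits, for alpha > 1.

    p and q are dicts over the same outcomes; q is assumed strictly positive
    wherever p is positive.
    """
    total = sum(p_x ** alpha * q[x] ** (1 - alpha) for x, p_x in p.items() if p_x > 0)
    return math.log2(total) / (alpha - 1)

def sibson_optimizer(prior, cond, alpha):
    """Distribution xi achieving inf_xi D_alpha(omega || pi (x) xi) classically."""
    g = [sum(prior[m] * cond[m][y] ** alpha for m in range(len(prior))) ** (1 / alpha)
         for y in range(len(cond[0]))]
    z = sum(g)
    return [g_y / z for g_y in g], z

def renyi_mutual_information(prior, cond, alpha):
    """Closed form alpha/(alpha-1) * log2(Z) of the infimum (Sibson's identity)."""
    _, z = sibson_optimizer(prior, cond, alpha)
    return alpha / (alpha - 1) * math.log2(z)

prior = [0.5, 0.5]                       # uniform prior pi_M
cond = [[0.9, 0.1], [0.2, 0.8]]          # hypothetical p(mhat | m)
alpha = 2.0
omega = {(m, y): prior[m] * cond[m][y] for m in range(2) for y in range(2)}

def product_with(dist_mhat):
    """The product distribution pi_M (x) dist_mhat."""
    return {(m, y): prior[m] * dist_mhat[y] for m in range(2) for y in range(2)}

i_alpha = renyi_mutual_information(prior, cond, alpha)
d_fixed = renyi_divergence(omega, product_with([0.3, 0.7]), alpha)  # some fixed tau
xi_star, _ = sibson_optimizer(prior, cond, alpha)
d_star = renyi_divergence(omega, product_with(xi_star), alpha)
```

Any fixed second argument, such as the distribution produced by the replacement-channel protocol, gives a value at least as large as the infimum, which is all that (17.1.68) uses.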
We now proceed with a method similar to that used in the proof of the bound in
(17.1.35), but using the sandwiched Rényi relative entropy as our main tool for
analysis. By applying the data-processing inequality for the sandwiched Rényi
relative entropy with respect to the last decoding channel, and using (17.1.33), we
find that
\begin{align}
\widetilde{D}_\alpha(\omega_{M\hat{M}}\|\tau_{M\hat{M}}) &\le \widetilde{D}_\alpha(\rho^n_{MB_nB_{n-1}'}\|\tau^n_{MB_nB_{n-1}'}) \tag{17.1.70}\\
&= \widetilde{D}_\alpha(\rho^n_{MB_nB_{n-1}'}\|\pi_M\otimes\tau^n_{B_{n-1}'}\otimes\sigma_{B_n}) \tag{17.1.71}\\
&= \frac{\alpha}{\alpha-1}\log_2\widetilde{Q}_\alpha(\rho^n_{MB_nB_{n-1}'}\|\pi_M\otimes\tau^n_{B_{n-1}'}\otimes\sigma_{B_n})^{\frac{1}{\alpha}}, \tag{17.1.72}
\end{align}
where in the last line we used the definition in (7.5.2) of the sandwiched Rényi
relative entropy. Now, recalling that $\rho^n_{MB_nB_{n-1}'} = \mathcal{N}_{A_n\to B_n}(\zeta^n_{MB_{n-1}'A_n})$ with the state
$\zeta^n_{MB_{n-1}'A_n}$ defined in (17.1.49), and defining the positive semi-definite operator
\[
X^{(\alpha)}_{MB_{n-1}'A_n} := \left(\pi_M\otimes\tau^n_{B_{n-1}'}\right)^{\frac{1-\alpha}{2\alpha}}\zeta^n_{MB_{n-1}'A_n}\left(\pi_M\otimes\tau^n_{B_{n-1}'}\right)^{\frac{1-\alpha}{2\alpha}}, \tag{17.1.73}
\]
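\begin{align}
&\left\|(\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A_n\to B_n})(X^{(\alpha)}_{MB_{n-1}'A_n})\right\|_\alpha \notag\\
&\quad= \frac{\left\|(\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A_n\to B_n})(X^{(\alpha)}_{MB_{n-1}'A_n})\right\|_\alpha}{\left\|X^{(\alpha)}_{MB_{n-1}'}\right\|_\alpha}\left\|X^{(\alpha)}_{MB_{n-1}'}\right\|_\alpha \tag{17.1.78}\\
&\quad= \frac{\left\|(\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A_n\to B_n})(X^{(\alpha)}_{MB_{n-1}'A_n})\right\|_\alpha}{\left\|X^{(\alpha)}_{MB_{n-1}'}\right\|_\alpha}\times\left\|\left(\pi_M\otimes\tau^n_{B_{n-1}'}\right)^{\frac{1-\alpha}{2\alpha}}\zeta^n_{MB_{n-1}'}\left(\pi_M\otimes\tau^n_{B_{n-1}'}\right)^{\frac{1-\alpha}{2\alpha}}\right\|_\alpha \tag{17.1.79}\\
&\quad= \frac{\left\|(\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A_n\to B_n})(X^{(\alpha)}_{MB_{n-1}'A_n})\right\|_\alpha}{\left\|X^{(\alpha)}_{MB_{n-1}'}\right\|_\alpha}\cdot\widetilde{Q}_\alpha(\zeta^n_{MB_{n-1}'}\|\pi_M\otimes\tau^n_{B_{n-1}'})^{\frac{1}{\alpha}} \tag{17.1.80}\\
&\quad\le \sup_{Y_{MB_{n-1}'A_n}\ge 0}\frac{\left\|(\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A_n\to B_n})(Y_{MB_{n-1}'A_n})\right\|_\alpha}{\left\|Y_{MB_{n-1}'}\right\|_\alpha}\cdot\widetilde{Q}_\alpha(\zeta^n_{MB_{n-1}'}\|\pi_M\otimes\tau^n_{B_{n-1}'})^{\frac{1}{\alpha}} \tag{17.1.81}\\
&\quad= \left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A_n\to B_n}\right\|_{\mathrm{CB},1\to\alpha}\cdot\widetilde{Q}_\alpha(\zeta^n_{MB_{n-1}'}\|\pi_M\otimes\tau^n_{B_{n-1}'})^{\frac{1}{\alpha}}, \tag{17.1.82}
\end{align}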
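\[
\widetilde{D}_\alpha(\rho^n_{MB_nB_{n-1}'}\|\tau^n_{MB_nB_{n-1}'}) \le \frac{\alpha}{\alpha-1}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A_n\to B_n}\right\|_{\mathrm{CB},1\to\alpha} + \widetilde{D}_\alpha(\zeta^n_{MB_{n-1}'}\|\pi_M\otimes\tau^n_{B_{n-1}'}). \tag{17.1.84}
\]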
As in the proof of (17.1.35), we now iterate the above by successively bounding the
sandwiched Rényi relative entropy terms $\widetilde{D}_\alpha(\zeta^i_{MB_{i-1}'}\|\pi_M\otimes\tau^i_{B_{i-1}'})$ for 𝑖 ∈ {1, . . . , 𝑛}.
Starting with the term $\widetilde{D}_\alpha(\zeta^n_{MB_{n-1}'}\|\pi_M\otimes\tau^n_{B_{n-1}'})$, we use the data-processing inequality
for the sandwiched Rényi relative entropy under the second-to-last decoding channel
$\mathcal{D}^{n-1}_{B_{n-1}B_{n-2}'\to F_{n-1}B_{n-1}'}$, then apply the same reasoning as in (17.1.75)–(17.1.82) to
obtain
\begin{align}
\widetilde{D}_\alpha(\zeta^n_{MB_{n-1}'}\|\pi_M\otimes\tau^n_{B_{n-1}'}) &\le \widetilde{D}_\alpha(\rho^{n-1}_{MB_{n-1}B_{n-2}'}\|\pi_M\otimes\tau^{n-1}_{B_{n-1}B_{n-2}'}) \tag{17.1.85}\\
&= \widetilde{D}_\alpha(\rho^{n-1}_{MB_{n-1}B_{n-2}'}\|\pi_M\otimes\tau^{n-1}_{B_{n-2}'}\otimes\sigma_{B_{n-1}}) \tag{17.1.86}\\
&\le \frac{\alpha}{\alpha-1}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A_{n-1}\to B_{n-1}}\right\|_{\mathrm{CB},1\to\alpha} + \widetilde{D}_\alpha(\zeta^{n-1}_{MB_{n-2}'}\|\pi_M\otimes\tau^{n-1}_{B_{n-2}'}). \tag{17.1.87}
\end{align}
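Iterating this reasoning 𝑛 − 2 more times, we end up with the following bound:
\begin{align}
\widetilde{D}_\alpha(\omega_{M\hat{M}}\|\tau_{M\hat{M}}) &\le n\frac{\alpha}{\alpha-1}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A\to B}\right\|_{\mathrm{CB},1\to\alpha} + \widetilde{D}_\alpha(\rho^1_{MB_0'}\|\pi_M\otimes\Psi_{B_0'}) \notag\\
&= n\frac{\alpha}{\alpha-1}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A\to B}\right\|_{\mathrm{CB},1\to\alpha}, \tag{17.1.88}
\end{align}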
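\[
\log_2|\mathcal{M}| \le n\frac{\alpha}{\alpha-1}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A\to B}\right\|_{\mathrm{CB},1\to\alpha} + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{17.1.89}
\]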
Since we proved that this bound holds for any choice of the state 𝜎𝐵 , we conclude
that
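\begin{align}
\log_2|\mathcal{M}| &\le n\frac{\alpha}{\alpha-1}\inf_{\sigma_B}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A\to B}\right\|_{\mathrm{CB},1\to\alpha} + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right) \tag{17.1.90}\\
&= n\widetilde{I}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right), \tag{17.1.91}
\end{align}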
where the last equality follows from Lemma 11.20, and it implies (17.1.36). ■
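To see what (17.1.91) says about the error probability, one can set log2 |M| = 𝑛𝑅 for a rate 𝑅 and rearrange, which gives 𝜀 ≥ 1 − 2^{−𝑛 ((𝛼−1)/𝛼) (𝑅 − Ĩ𝛼(N))} for 𝛼 > 1. The following sketch evaluates both forms; the function names and the numerical value used for the sandwiched Rényi mutual information of the channel are placeholders, not values taken from the text.

```python
import math

def strong_converse_log_size_bound(n, renyi_info, alpha, eps):
    """Right-hand side of (17.1.91): n * I_alpha(N) + alpha/(alpha-1) * log2(1/(1-eps))."""
    if alpha <= 1 or not 0 <= eps < 1:
        raise ValueError("requires alpha > 1 and 0 <= eps < 1")
    return n * renyi_info + alpha / (alpha - 1) * math.log2(1 / (1 - eps))

def error_lower_bound(n, rate, renyi_info, alpha):
    """Lower bound on eps obtained by rearranging (17.1.91) with log2|M| = n * rate."""
    if alpha <= 1:
        raise ValueError("requires alpha > 1")
    exponent = n * (alpha - 1) / alpha * (rate - renyi_info)
    return max(0.0, 1.0 - 2.0 ** (-exponent))

# Hypothetical numbers: the Renyi quantity equals 1.0 bit at alpha = 1.5,
# and the code operates at a rate of 1.2 bits per channel use.
RENYI_INFO, ALPHA, RATE = 1.0, 1.5, 1.2
bounds = [error_lower_bound(n, RATE, RENYI_INFO, ALPHA) for n in (10, 100, 1000)]
```

When the rate exceeds the Rényi quantity, the lower bound on the error approaches one exponentially fast in 𝑛, which is the strong converse behaviour; below it, the bound is trivial.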
In this section, we revisit the proofs above for Theorem 17.2 that establish bounds
on the non-asymptotic quantum-feedback-assisted capacity. In particular, we adopt a
different perspective, which we call the amortized perspective and which turns out
to be useful in establishing bounds for all kinds of feedback-assisted protocols other
than the ones considered in this chapter.
We begin by defining the following key concept, the amortized mutual information
of a quantum channel:
\[
I^{\mathcal{A}}(\mathcal{N}) := \sup_{\rho_{A'AB'}}\left[I(A';BB')_\omega - I(A'A;B')_\rho\right], \tag{17.1.92}
\]
where
\[
\omega_{A'BB'} := \mathcal{N}_{A\to B}(\rho_{A'AB'}) \tag{17.1.93}
\]
and the optimization is over states 𝜌 𝐴′ 𝐴𝐵′ .
Intuitively, the amortized mutual information is equal to the largest net mutual
information that can be realized by the channel, if we allow Alice and Bob to share
an arbitrary state before communication begins. As mentioned above, this concept
turns out to be useful for understanding the feedback-assisted protocols presented
previously.
We have the following simple relationship between mutual information and
amortized mutual information:
Lemma 17.4
The mutual information of any channel N 𝐴→𝐵 does not exceed its amortized
mutual information:
𝐼 (N) ≤ 𝐼 A (N). (17.1.94)
Proof: Let us restrict the optimization in the definition of the amortized mutual
information to states 𝜌 𝐴′ 𝐴𝐵′ that have a trivial 𝐵′ system. This means that
𝜌 𝐴′ 𝐴𝐵′ is of the form 𝜌 𝐴′ 𝐴𝐵′ = 𝜌 𝐴′ 𝐴 ⊗ |0⟩⟨0| 𝐵′ . Therefore, 𝐼 ( 𝐴′ 𝐴; 𝐵′) 𝜌 = 0 and
𝐼 ( 𝐴′; 𝐵𝐵′)𝜔 = 𝐼 ( 𝐴′; 𝐵)𝜔 , where 𝜔 𝐴′ 𝐵𝐵′ = N 𝐴→𝐵 (𝜌 𝐴′ 𝐴𝐵′ ), so that
Proposition 17.5
Given an arbitrary quantum channel N, amortization does not increase its
mutual information:
𝐼 (N) = 𝐼 A (N). (17.1.98)
Proof: To see this, consider that for an arbitrary input state 𝜌 𝐴′ 𝐴𝐵′ , we can use the
chain rule for mutual information in (7.2.112) twice to obtain
where the last line follows because the state 𝜔 𝐴′ 𝐵 = N 𝐴→𝐵 (𝜌 𝐴′ 𝐴 ) has the form of
states that we consider when performing the optimization in the definition of the
mutual information of a channel. Since the inequality
holds for an arbitrary input state 𝜌 𝐴′ 𝐴𝐵′ , we conclude the bound in (17.1.98). ■
We note here that the equality in (17.1.98) is stronger than the additivity of mutual
information shown in Chapter 11 (in particular, that shown in Theorem 11.19).
Indeed, the equality in (17.1.98) actually implies the additivity relation discussed
previously. To see this, consider that the equality in (17.1.98) implies that
for an arbitrary input state 𝜌 𝐴′ 𝐴𝐵′ , where 𝜔 𝐴′ 𝐵𝐵′ = N 𝐴→𝐵 (𝜌 𝐴′ 𝐴𝐵′ ). Now let
𝜌 𝐴′ 𝐴𝐵′ = M 𝐴′′ →𝐵′ (𝜎𝐴′ 𝐴𝐴′′ ) for some channel M 𝐴′′ →𝐵′ and some state 𝜎𝐴′ 𝐴𝐴′′ .
Then it follows that
where the inequality follows because the state 𝜎𝐴′ 𝐴𝐴′′ is a particular state to consider
for the optimization in the definition of the mutual information of the channel
M 𝐴′′ →𝐵′ . Since the inequality holds for all input states 𝜎𝐴′ 𝐴𝐴′′ to N 𝐴→𝐵 ⊗ M 𝐴′′ →𝐵′ ,
we conclude that
𝐼 (N ⊗ M) ≤ 𝐼 (N) + 𝐼 (M), (17.1.112)
which is the non-trivial inequality needed in the proof of the additivity of the mutual
information of a channel (see the proof of Theorem 11.19).
How is the amortized mutual information relevant for analyzing a feedback-
assisted protocol? Consider that the bound in (17.1.43) involves the mutual
information 𝐼 (𝑀; 𝐵𝑛 𝐵′𝑛−1 ) 𝜌 𝑛 , so that
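\begin{align}
I(M;B_nB_{n-1}')_{\rho^n} &= I(M;B_nB_{n-1}')_{\rho^n} - I(M;B_0')_{\rho^1} \tag{17.1.113}\\
&= I(M;B_nB_{n-1}')_{\rho^n} - I(M;B_0')_{\rho^1} + \sum_{i=1}^{n-1}\left[I(M;B_i')_{\rho^{i+1}} - I(M;B_i')_{\rho^{i+1}}\right] \tag{17.1.114}\\
&\le I(M;B_nB_{n-1}')_{\rho^n} - I(M;B_0')_{\rho^1} + \sum_{i=1}^{n-1}\left[I(M;B_iB_{i-1}')_{\rho^i} - I(M;B_i')_{\rho^{i+1}}\right] \tag{17.1.115}\\
&= \sum_{i=1}^{n}\left[I(M;B_iB_{i-1}')_{\rho^i} - I(M;B_{i-1}')_{\rho^i}\right] \tag{17.1.116}\\
&\le n\cdot\sup_{\rho_{A'AB'}}\left[I(A';BB')_\omega - I(A'A;B')_\rho\right] \tag{17.1.117}\\
&= n\cdot I^{\mathcal{A}}(\mathcal{N}) = n\cdot I(\mathcal{N}). \tag{17.1.118}
\end{align}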
The first equality follows because the state $\rho^1_{MB_0'}$ is a product state. The second
equality follows by adding and subtracting the mutual information of the state
of the message system 𝑀 and Bob’s memory system 𝐵𝑖′. The inequality is a
consequence of data processing under the action of the decoding channels. The
third equality follows from collecting terms. The final inequality follows because the
state $\rho^i_{MB_iB_{i-1}'}$ is a particular state to consider in the optimization of the amortized
mutual information, and the final equality follows from the amortization collapse
in Proposition 17.5.
Thus, we observe that the bound in (17.1.35), at a fundamental level, is a
consequence of the amortization collapse from Proposition 17.5.
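The bookkeeping behind the telescoping argument can be illustrated with plain numbers. In the sketch below, `a[i-1]` stands for the mutual information between 𝑀 and Bob's systems right after the 𝑖th channel use, and `b[i-1]` for the mutual information between 𝑀 and Bob's memory just before that channel use; the numerical values are invented for illustration and are only required to respect data processing and the product-state initial condition.

```python
def telescoping_bound(a, b):
    """Sum of per-round increments a[i] - b[i], which upper-bounds a[-1].

    a[i-1] plays the role of I(M; B_i B'_{i-1}) and b[i-1] the role of
    I(M; B'_{i-1}) for rounds i = 1..n (hypothetical values).
    """
    if len(a) != len(b):
        raise ValueError("need one (a, b) pair per channel use")
    if b[0] != 0:
        raise ValueError("M and B'_0 start in a product state, so b[0] must be 0")
    for i in range(1, len(a)):
        # Data processing under the decoding channel: I(M; B'_i) <= I(M; B_i B'_{i-1}).
        if b[i] > a[i - 1]:
            raise ValueError("data processing violated before round %d" % (i + 1))
    return sum(a_i - b_i for a_i, b_i in zip(a, b))

# Invented values for a three-round protocol.
a = [0.4, 0.7, 1.1]
b = [0.0, 0.3, 0.6]
increments = [a_i - b_i for a_i, b_i in zip(a, b)]
bound = telescoping_bound(a, b)
```

Each increment is a "net gain" of the kind optimized in the amortized mutual information, so bounding every increment by the same channel quantity bounds the final mutual information by 𝑛 times that quantity.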
We can also consider the concept of amortization for the sandwiched Rényi mutual
information, and in this subsection, we revisit the bound in (17.1.36) to understand
it from this perspective.
where $\omega_{A'BB'} := \mathcal{N}_{A\to B}(\rho_{A'AB'})$ and the optimization is over states 𝜌 𝐴′ 𝐴𝐵′ .
Just as with the mutual information of a channel, we find that for all 𝛼 ∈
(0, 1) ∪ (1, ∞),
\[
\widetilde{I}_\alpha(\mathcal{N}) \le \widetilde{I}^{\mathcal{A}}_\alpha(\mathcal{N}), \tag{17.1.120}
\]
and the proof of this is analogous to the proof of Lemma 17.4, which establishes
the corresponding inequality for the mutual information. So the question is to
determine whether the opposite inequality holds. Indeed, we find again that it is
the case, at least for 𝛼 > 1.
Proposition 17.7
Amortization does not increase the sandwiched Rényi mutual information of a
quantum channel N for all 𝛼 > 1:
\[
\widetilde{I}_\alpha(\mathcal{N}) = \widetilde{I}^{\mathcal{A}}_\alpha(\mathcal{N}). \tag{17.1.121}
\]
Proof: Let 𝜌 𝐴′ 𝐴𝐵′ be an arbitrary input state, and let 𝜎𝐵 and 𝜏𝐵′ be arbitrary states.
Then, letting 𝜔 𝐴′ 𝐵𝐵′ = N 𝐴→𝐵 (𝜌 𝐴′ 𝐴𝐵′ ), we find that
\begin{align}
\widetilde{I}_\alpha(A';BB')_\omega &= \inf_{\xi_{BB'}}\widetilde{D}_\alpha(\omega_{A'BB'}\|\omega_{A'}\otimes\xi_{BB'}) \tag{17.1.122}\\
&\le \widetilde{D}_\alpha(\omega_{A'BB'}\|\omega_{A'}\otimes\sigma_B\otimes\tau_{B'}) \tag{17.1.123}
\end{align}
where to obtain the last equality we used the alternate expression in (7.5.3) for the
sandwiched Rényi relative entropy. Defining
\[
X^{(\alpha)}_{A'AB'} := (\rho_{A'}\otimes\tau_{B'})^{\frac{1-\alpha}{2\alpha}}\rho_{A'AB'}(\rho_{A'}\otimes\tau_{B'})^{\frac{1-\alpha}{2\alpha}}, \tag{17.1.126}
\]
where in the last line we have used the expression in (11.E.1) for the norm ∥·∥ CB,1→𝛼 .
Plugging (17.1.130) back into (17.1.125), we find that
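\[
\widetilde{I}_\alpha(A';BB')_\omega \le \frac{\alpha}{\alpha-1}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A\to B}\right\|_{\mathrm{CB},1\to\alpha} + \widetilde{D}_\alpha(\rho_{A'B'}\|\rho_{A'}\otimes\tau_{B'}). \tag{17.1.131}
\]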
Since the inequality holds for arbitrary states 𝜎𝐵 and 𝜏𝐵′ , we conclude that
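\begin{align}
\widetilde{I}_\alpha(A';BB')_\omega &\le \frac{\alpha}{\alpha-1}\inf_{\sigma_B}\log_2\left\|\mathcal{S}^{(\alpha)}_{\sigma_B}\circ\mathcal{N}_{A\to B}\right\|_{\mathrm{CB},1\to\alpha} + \inf_{\tau_{B'}}\widetilde{D}_\alpha(\rho_{A'B'}\|\rho_{A'}\otimes\tau_{B'}) \tag{17.1.132}\\
&= \widetilde{I}_\alpha(\mathcal{N}) + \widetilde{I}_\alpha(A';B')_\rho \tag{17.1.133}\\
&\le \widetilde{I}_\alpha(\mathcal{N}) + \widetilde{I}_\alpha(A'A;B')_\rho, \tag{17.1.134}
\end{align}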
where the equality follows from Lemma 11.20 and the final inequality from the
data-processing inequality for the mutual information under the partial trace Tr 𝐴 .
Since we have shown that the following inequality holds for an arbitrary input state
𝜌 𝐴′ 𝐴𝐵′ :
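\[
\widetilde{I}_\alpha(A';BB')_\omega - \widetilde{I}_\alpha(A'A;B')_\rho \le \widetilde{I}_\alpha(\mathcal{N}), \tag{17.1.135}
\]
we conclude that $\widetilde{I}^{\mathcal{A}}_\alpha(\mathcal{N}) \le \widetilde{I}_\alpha(\mathcal{N})$, which leads to $\widetilde{I}^{\mathcal{A}}_\alpha(\mathcal{N}) = \widetilde{I}_\alpha(\mathcal{N})$ after combining
with (17.1.120). ■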
\[
\widetilde{I}_\alpha(M;B_nB_{n-1}')_{\rho^n} \le n\cdot\widetilde{I}_\alpha(\mathcal{N}), \tag{17.1.137}
\]
which in turn implies the bound in (17.1.36). Thus, we can alternatively analyze
feedback-assisted protocols and arrive at the bound in (17.1.36) by utilizing the
concept of amortization.
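\[
C_{\mathrm{QFB}}(\mathcal{N}) = \widetilde{C}_{\mathrm{QFB}}(\mathcal{N}) = I(\mathcal{N}), \tag{17.2.3}
\]
\[
C_{\mathrm{QFB}}(\mathcal{N}) \le \widetilde{C}_{\mathrm{QFB}}(\mathcal{N}), \tag{17.2.4}
\]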
and by Theorem 11.16 and the fact that any entanglement-assisted classical
communication protocol is a particular kind of quantum-feedback-assisted classical
communication protocol, we have that
The upper bound $\widetilde{C}_{\mathrm{QFB}}(\mathcal{N}) \le I(\mathcal{N})$ follows from (17.1.36) and the same reasoning
given in the proof detailed in Section 11.2.3. ■
Chapter 18
Classical-Feedback-Assisted
Communication
In this chapter, we continue with our study of feedback-assisted capacities. The
class of protocols that we consider in this chapter is very similar to those from
the previous chapter (Chapter 17), with the exception that the feedback channel is
a classical channel instead of a quantum channel. The resulting communication
task is then called classical communication assisted by a classical channel (or
classical-feedback-assisted communication for short).
Interestingly, this slight change has the effect of complicating the theory quite
a bit: a general expression for the capacity is not known. It is only known for
certain channels such as entanglement-breaking channels and erasure channels.
Additionally, there are examples of channels for which classical feedback can
increase the classical capacity significantly, due to the interplay between classical
feedback and entanglement that can be generated by the channel. We do not discuss
these examples in this chapter and instead point to the Bibliographic Notes for details
(Section 18.7). All of the above implies that the increase of capacity due to classical
feedback is a truly quantum-mechanical phenomenon that separates the classical
and quantum theories of communication. Indeed, it is necessary for a channel to
have the ability to generate entanglement in order for classical feedback to give a
boost to capacity.
Our main focus in this chapter is on establishing upper bounds on the classical-
feedback-assisted capacity. First, we prove that classical feedback does not increase
the capacity of entanglement-breaking channels. The main tools here are similar
to those employed in Section 12.2.3.1. Next, we establish that the average output
entropy of a channel is an upper bound on the feedback-assisted capacity. Finally,
we establish that the Υ-information of a channel, introduced in Section 12.2.5.1,
is actually an upper bound on the feedback-assisted capacity. We close out the
chapter by discussing some example channels and summarizing the main concepts
presented.
Section 17.1, with the exception that every state with an 𝐹 label is replaced by the
same state succeeded by the completely dephasing channel Δ𝐹𝑖 . That is, the initial
state is
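\[
\Phi^p_{MM'}\otimes\Delta_{F_0}(\Psi_{F_0B_0'}), \tag{18.1.3}
\]
and the other states are
\[
\rho^1_{MA_1'B_1B_0'} := (\mathcal{N}_{A_1\to B_1}\circ\mathcal{E}^0_{M'F_0\to A_1'A_1})(\Phi^p_{MM'}\otimes\Delta_{F_0}(\Psi_{F_0B_0'})), \tag{18.1.4}
\]
\[
\rho^i_{MA_i'B_iB_{i-1}'} := (\mathcal{N}_{A_i\to B_i}\circ\mathcal{E}^{i-1}_{A_{i-1}'F_{i-1}\to A_i'A_i}\circ\Delta_{F_{i-1}}\circ\mathcal{D}^{i-1}_{B_{i-1}B_{i-2}'\to F_{i-1}B_{i-1}'})(\rho^{i-1}_{MA_{i-1}'B_{i-1}B_{i-2}'}), \tag{18.1.6}
\]
where 𝑖 ∈ {3, . . . , 𝑛}. The final state of the protocol is then as follows:
\[
\omega^p_{M\hat{M}} := \mathcal{D}^n_{B_nB_{n-1}'\to\hat{M}}(\operatorname{Tr}_{A_n'}[\rho^n_{MA_n'B_nB_{n-1}'}]). \tag{18.1.7}
\]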
Consider that the initial state of the protocol, as given in (18.1.3), has the
following form:
𝑝
Φ 𝑀 𝑀 ′ ⊗ Δ𝐹0 (Ψ𝐹0 𝐵0′ ) =
∑︁ ∑︁
𝑓
𝑝(𝑚)|𝑚⟩⟨𝑚| 𝑀 ⊗ |𝑚⟩⟨𝑚| 𝑀 ⊗
′ 𝑝( 𝑓0 )| 𝑓0 ⟩⟨ 𝑓0 | 𝐹0 ⊗ Ψ𝐵0′ , (18.1.8)
0
𝑚∈M 𝑓0
𝑝
E0𝑀 ′ 𝐹0 →𝐴′ 𝐴1 (Φ 𝑀 𝑀 ′ ⊗ Δ𝐹0 (Ψ𝐹0 𝐵0′ )) =
1
∑︁ ∑︁
0,𝑚, 𝑓 𝑓
𝑝(𝑚) 𝑝( 𝑓0 )|𝑚⟩⟨𝑚| 𝑀 ⊗ 𝜍 𝐴′ 𝐴1 0 ⊗ Ψ𝐵0′ , (18.1.9)
1 0
𝑚∈M 𝑓0
0,𝑚, 𝑓
where the state 𝜍 𝐴′ 𝐴1 0 is defined as
1
0,𝑚, 𝑓
𝜍 𝐴′ 𝐴1 0 B E0𝑀 ′ 𝐹0 →𝐴′ 𝐴1 (|𝑚⟩⟨𝑚| 𝑀 ′ ⊗ | 𝑓0 ⟩⟨ 𝑓0 | 𝐹0 ). (18.1.10)
1 1
Then one can proceed from here, defining states of the protocol conditioned on the
value of the message and the classical feedback.
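The effect of the completely dephasing channel on the feedback register can be sketched in a few lines. The example below dephases the first qubit of a two-qubit maximally entangled state, which stands in for Ψ on 𝐹0 𝐵′0, and reads off the resulting classical–quantum decomposition into probabilities 𝑝(𝑓0) and conditional states. The helper names and the choice of input state are assumptions made for illustration.

```python
def dephase_first(rho, d_f, d_b):
    """Completely dephase the first tensor factor F of a state on F (x) B.

    Matrix blocks with f != f' are set to zero, i.e. the map
    X -> sum_f (|f><f| (x) I) X (|f><f| (x) I).
    """
    dim = d_f * d_b
    return [[rho[i][j] if i // d_b == j // d_b else 0.0 for j in range(dim)]
            for i in range(dim)]

def conditional_states(rho_cq, d_f, d_b):
    """Return [(p(f), Psi^f)] read off from the diagonal blocks of a dephased state."""
    out = []
    for f in range(d_f):
        block = [[rho_cq[f * d_b + i][f * d_b + j] for j in range(d_b)]
                 for i in range(d_b)]
        prob = sum(block[i][i] for i in range(d_b))
        state = [[x / prob for x in row] for row in block] if prob > 0 else None
        out.append((prob, state))
    return out

# Density matrix of (|00> + |11>)/sqrt(2), standing in for Psi_{F_0 B'_0}.
bell = [[0.0] * 4 for _ in range(4)]
for i in (0, 3):
    for j in (0, 3):
        bell[i][j] = 0.5

dephased = dephase_first(bell, 2, 2)
decomposition = conditional_states(dephased, 2, 2)
```

After dephasing, the state is a mixture of product states labelled by the classical value 𝑓0, which is the structure displayed in (18.1.8): the entanglement between the feedback system and Bob's memory is gone.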
Just as we did in Chapter 11, we define the message error probability, average
error probability, and maximal error probability, as in (11.1.13), (11.1.14), and
(11.1.15), respectively. Using the expression in (11.1.24), the average error
probability for the classical-feedback-assisted code C is given by
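\[
p_{\mathrm{err}}(\mathcal{C};p) := \frac{1}{2}\left\|\Phi^p_{M\hat{M}}-\omega^p_{M\hat{M}}\right\|_1, \tag{18.1.11}
\]
and using the expression in (11.1.36), the maximal error probability for the
classical-feedback-assisted code C is given by
\[
p^*_{\mathrm{err}}(\mathcal{C}) := \max_{p:\mathcal{M}\to[0,1]}\frac{1}{2}\left\|\Phi^p_{M\hat{M}}-\omega^p_{M\hat{M}}\right\|_1, \tag{18.1.12}
\]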
where P𝜎𝐵 denotes a preparation channel that prepares the state 𝜎𝐵 at the output.
The initial state of this protocol is
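\[
\Phi^p_{MM'}\otimes\Delta_{F_0}(\Psi_{F_0B_0'}), \tag{18.2.2}
\]
and the others are as follows:
\[
\tau^1_{MA_1'B_1B_0'} := (\mathcal{R}_{A_1\to B_1}\circ\mathcal{E}^0_{M'F_0\to A_1'A_1})(\Phi^p_{MM'}\otimes\Delta_{F_0}(\Psi_{F_0B_0'})), \tag{18.2.3}
\]
\[
\tau^2_{MA_2'B_2B_1'} := (\mathcal{R}_{A_2\to B_2}\circ\mathcal{E}^1_{A_1'F_1\to A_2'A_2}\circ\Delta_{F_1}\circ\mathcal{D}^1_{B_1B_0'\to F_1B_1'})(\tau^1_{MA_1'B_1B_0'}), \tag{18.2.4}
\]
\[
\tau^i_{MA_i'B_iB_{i-1}'} := (\mathcal{R}_{A_i\to B_i}\circ\mathcal{E}^{i-1}_{A_{i-1}'F_{i-1}\to A_i'A_i}\circ\Delta_{F_{i-1}}\circ\mathcal{D}^{i-1}_{B_{i-1}B_{i-2}'\to F_{i-1}B_{i-1}'})(\tau^{i-1}_{MA_{i-1}'B_{i-1}B_{i-2}'}), \tag{18.2.5}
\]
where 𝑖 ∈ {3, . . . , 𝑛}. The final state of the protocol is then as follows:
\[
\tau^p_{M\hat{M}} := \mathcal{D}^n_{B_nB_{n-1}'\to\hat{M}}(\operatorname{Tr}_{A_n'}[\tau^n_{MA_n'B_nB_{n-1}'}]). \tag{18.2.6}
\]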
Before stating the main theorem of this section, we discuss particular aspects
of a classical-feedback-assisted protocol for classical communication over an
entanglement-breaking channel. Indeed, suppose that N 𝐴→𝐵 is an entanglement-
breaking channel. We begin our analysis by inspecting the state in (18.1.9). This
state is fully separable with respect to the cut 𝑀 : 𝐴′1 𝐴1 : 𝐵′0 . That is, it can be
written as follows:
\[
\sum_z q(z)\,\tau^z_M\otimes\sigma^z_{A_1'A_1}\otimes\omega^z_{B_0'}, \tag{18.3.1}
\]
for 𝑞 a probability distribution and $\{\tau^z_M\}_z$, $\{\sigma^z_{A_1'A_1}\}_z$, and $\{\omega^z_{B_0'}\}_z$ sets of states.
Since the channel N 𝐴→𝐵 is entanglement breaking, when it acts on system 𝐴1 of
the state $\varsigma^{0,m,f_0}_{A_1'A_1}$ in (18.1.9), the resulting state is a separable state of the following
form:
\[
\mathcal{N}_{A_1\to B_1}(\varsigma^{0,m,f_0}_{A_1'A_1}) = \sum_y p(y|m,f_0)\,\varsigma^{y,m,f_0}_{A_1'}\otimes\varsigma^{y,m,f_0}_{B_1}. \tag{18.3.2}
\]
So this implies that the state $\rho^1_{MA_1'B_1B_0'}$, as defined in (18.1.4) and with N 𝐴1 →𝐵1
entanglement breaking, is fully separable across all systems (i.e., with respect to
the cut 𝑀 : 𝐴′1 : 𝐵1 : 𝐵′0 ).
Bob then applies the decoding channel $\Delta_{F_1}\circ\mathcal{D}^1_{B_1B_0'\to F_1B_1'}$ and the state at this
point is as follows:
\[
\sum_{m\in\mathcal{M}}\sum_{f_0,y} p(y|m,f_0)p(m)p(f_0)|m\rangle\!\langle m|_M\otimes\varsigma^{y,m,f_0}_{A_1'}\otimes(\Delta_{F_1}\circ\mathcal{D}^1_{B_1B_0'\to F_1B_1'})(\varsigma^{y,m,f_0}_{B_1}\otimes\Psi^{f_0}_{B_0'}). \tag{18.3.3}
\]
Since the 𝐹1 system is classical, the state $(\Delta_{F_1}\circ\mathcal{D}^1_{B_1B_0'\to F_1B_1'})(\varsigma^{y,m,f_0}_{B_1}\otimes\Psi^{f_0}_{B_0'})$ can
be written as
\[
(\Delta_{F_1}\circ\mathcal{D}^1_{B_1B_0'\to F_1B_1'})(\varsigma^{y,m,f_0}_{B_1}\otimes\Psi^{f_0}_{B_0'}) = \sum_{f_1} p(f_1|y,m,f_0)|f_1\rangle\!\langle f_1|_{F_1}\otimes\varsigma^{f_1,y,m,f_0}_{B_1'}. \tag{18.3.4}
\]
This means that the state $(\Delta_{F_1}\circ\mathcal{D}^1_{B_1B_0'\to F_1B_1'})(\rho^1_{MA_1'B_1B_0'})$ is fully separable with
respect to the cut 𝑀 : 𝐴′1 : 𝐹1 : 𝐵′1 .
This process continues, and since the channel N 𝐴→𝐵 is entanglement breaking,
by following an analysis similar to that given above, we observe that the joint state of
the message system 𝑀, Alice’s systems, and Bob’s systems is always fully separable
throughout the protocol. This is the key reason that we obtain the bounds given in
the following theorem:
Proof: Applying precisely the same reasoning as in the beginning of the proof of
where 𝜔 𝑀 𝑀ˆ is the final state of the protocol when the distribution 𝑝 is set to the
uniform distribution.
Invoking Proposition 7.70, the definition of $I_H^{\varepsilon}(M;\hat{M})$ from (7.11.88), and the
expression for the mutual information from (7.2.97), we find that
\[
I_H^{\varepsilon}(M;\hat{M})_\omega \le \frac{1}{1-\varepsilon}\left(I(M;\hat{M})_\omega + h_2(\varepsilon)\right). \tag{18.3.8}
\]
Now employing the data-processing inequality for the mutual information with
respect to the last decoding channel $\mathcal{D}^n_{B_nB_{n-1}'\to\hat{M}}$, we find that
\[
I(M;\hat{M})_\omega \le I(M;B_nB_{n-1}')_{\rho^n}. \tag{18.3.9}
\]
Then using the chain rule for the mutual information in (7.2.112), we obtain
As mentioned above, the state shared between Alice and Bob, at any point during
the protocol, is a separable state. Thus, the global state before the 𝑛th channel use
can be written as follows:
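\[
\rho^n_{MA_n'A_nB_{n-1}'} = \frac{1}{|\mathcal{M}|}\sum_{m\in\mathcal{M}}|m\rangle\!\langle m|_M\otimes\sum_y p(y|m)\,\varsigma^{m,y}_{A_n'A_n}\otimes\varsigma^{m,y}_{B_{n-1}'}. \tag{18.3.12}
\]
\[
\rho^n_{MYA_n'B_nB_{n-1}'} = \frac{1}{|\mathcal{M}|}\sum_{m\in\mathcal{M}}\sum_y p(y|m)|m\rangle\!\langle m|_M\otimes|y\rangle\!\langle y|_Y\otimes\mathcal{N}_{A_n\to B_n}(\varsigma^{m,y}_{A_n'A_n})\otimes\varsigma^{m,y}_{B_{n-1}'}, \tag{18.3.15}
\]
and tracing over the system 𝐴′𝑛 leads to the following state:
$\rho^n_{MYB_nB_{n-1}'}$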
The first inequality follows from the data-processing inequality for mutual information.
The first equality follows from the chain rule. The second equality follows
because the state in (18.3.17) is product when conditioning on the systems 𝑀 and
𝑌 . The last inequality follows because the state $\rho^n_{MYB_n}$ is a classical–quantum state
of the following form:
Thus, the definition of the Holevo information in (7.11.106) implies the last in-
equality in (18.3.21). Putting together (18.3.9), (18.3.10)–(18.3.11), and (18.3.18)–
(18.3.21), we conclude that
\begin{align}
I(M;\hat{M})_\omega &\le \chi(\mathcal{N}) + I(M;B_{n-1}')_{\rho^n} \tag{18.3.24}\\
&\le \chi(\mathcal{N}) + I(M;B_{n-1}B_{n-2}')_{\rho^{n-1}}, \tag{18.3.25}
\end{align}
where the last inequality follows from the data-processing inequality for mutual
information.
Now, we recognize the term 𝐼 (𝑀; 𝐵𝑛−1 𝐵′𝑛−2 ) 𝜌 𝑛−1 as being of the same form
as 𝐼 (𝑀; 𝐵𝑛 𝐵′𝑛−1 ) 𝜌 𝑛 in (18.3.10). Thus, we iterate through the same sequence of
arguments to conclude that
𝐼 (𝑀; 𝐵𝑛−1 𝐵′𝑛−2 ) 𝜌 𝑛−1 ≤ 𝜒(N) + 𝐼 (𝑀; 𝐵𝑛−2 𝐵′𝑛−3 ) 𝜌 𝑛−2 , (18.3.26)
Continuing all the way back to the first channel use, we find that
\[
I(M;\hat{M})_\omega \le n\chi(\mathcal{N}) \tag{18.3.28}
\]
because 𝐼 (𝑀; 𝐵′0 ) = 0 (the systems 𝑀 and 𝐵′0 are in a product state at the start of
the protocol). Putting together (18.3.7), (18.3.8), and (18.3.28), we conclude that
\[
\log_2|\mathcal{M}| \le \frac{1}{1-\varepsilon}\left(n\chi(\mathcal{N}) + h_2(\varepsilon)\right), \tag{18.3.29}
\]
which implies the claim in (18.3.5).
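The bound in (18.3.29) is easy to evaluate numerically. The sketch below implements the binary entropy ℎ2 and the right-hand side of (18.3.29); the value used for 𝜒(N) is a placeholder, since computing the Holevo information of a specific channel is outside the scope of this illustration.

```python
import math

def binary_entropy(eps):
    """Binary entropy h2(eps) in bits, with h2(0) = h2(1) = 0."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)

def weak_converse_bound(n, chi, eps):
    """Right-hand side of (18.3.29): (n * chi(N) + h2(eps)) / (1 - eps)."""
    if not 0 <= eps < 1:
        raise ValueError("requires 0 <= eps < 1")
    return (n * chi + binary_entropy(eps)) / (1 - eps)

CHI = 1.0  # placeholder value for the Holevo information, in bits per channel use
bits_zero_error = weak_converse_bound(10, CHI, 0.0)
bits_half_error = weak_converse_bound(10, CHI, 0.5)
rate_bound_large_n = weak_converse_bound(10_000, CHI, 0.1) / 10_000
```

Dividing by 𝑛 shows that the rate bound tends to 𝜒(N)/(1 − 𝜀) as 𝑛 grows, so it is tight only in the limit 𝜀 → 0; this is why the Rényi-based bound in (18.3.6) is needed for a strong converse.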
We now prove the inequality in (18.3.6). Our starting point is again (18.3.7),
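but from there, we instead apply Proposition 7.71, the definition of $I_H^{\varepsilon}(M;\hat{M})_\omega$
from (7.11.88), and the expression for the sandwiched Rényi mutual information
from (7.11.92) to find that the following holds for all 𝛼 > 1:
\[
I_H^{\varepsilon}(M;\hat{M})_\omega \le \widetilde{I}_\alpha(M;\hat{M})_\omega + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{18.3.30}
\]
Recall that the sandwiched Rényi mutual information $\widetilde{I}_\alpha(M;\hat{M})_\omega$ is defined as
\begin{align}
\widetilde{I}_\alpha(M;\hat{M})_\omega &= \inf_{\xi_{\hat{M}}}\widetilde{D}_\alpha(\omega_{M\hat{M}}\|\omega_M\otimes\xi_{\hat{M}}) \tag{18.3.31}\\
&= \inf_{\xi_{\hat{M}}}\widetilde{D}_\alpha(\omega_{M\hat{M}}\|\pi_M\otimes\xi_{\hat{M}}). \tag{18.3.32}
\end{align}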
We adopt a similar approach to that given for the proof of (17.1.36). Our goal is
thus to compare the actual protocol with one that results from employing a useless,
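replacement channel (of the form discussed in Section 18.2). To this end, let $\mathcal{R}^{\sigma_B}_{A\to B}$
\begin{align}
\widetilde{I}_\alpha(M;\hat{M})_\omega &= \inf_{\xi_{\hat{M}}}\widetilde{D}_\alpha(\omega_{M\hat{M}}\|\pi_M\otimes\xi_{\hat{M}}) \tag{18.3.33}\\
&\le \widetilde{D}_\alpha(\omega_{M\hat{M}}\|\pi_M\otimes\tau_{\hat{M}}) \tag{18.3.34}\\
&= \widetilde{D}_\alpha(\omega_{M\hat{M}}\|\tau_{M\hat{M}}). \tag{18.3.35}
\end{align}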
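\[
\widetilde{D}_\alpha(\rho^n_{MB_nB_{n-1}'}\|\tau^n_{MB_{n-1}'}\otimes\sigma_{B_n}) = \frac{\alpha}{\alpha-1}\log_2\left\|\left(\Theta_{\sigma_{B_n}^{\frac{1-\alpha}{\alpha}}}\circ\mathcal{N}_{A_n\to B_n}\right)\!\left(\left(\tau^n_{MB_{n-1}'}\right)^{\frac{1-\alpha}{2\alpha}}\rho^n_{MA_nB_{n-1}'}\left(\tau^n_{MB_{n-1}'}\right)^{\frac{1-\alpha}{2\alpha}}\right)\right\|_\alpha, \tag{18.3.38}
\]
where we define the completely positive map Θ 𝑋 by
\[
\Theta_X(\rho) := X^{\frac{1}{2}}\rho X^{\frac{1}{2}}. \tag{18.3.39}
\]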
We now employ the key observation from before: if the channel N is entanglement
breaking, then Alice and Bob’s systems are always separable throughout the protocol.
Thus, the state $\rho^n_{MA_nB_{n-1}'}$ is fully separable with respect to the cut 𝑀 : 𝐴𝑛 : 𝐵′𝑛−1 . It
is in turn separable with respect to the bipartite cut 𝐴𝑛 : 𝑀 𝐵′𝑛−1 and can be written
as
\[
\rho^n_{MA_nB_{n-1}'} = \sum_j p(j)\,\rho^j_{A_n}\otimes\rho^j_{MB_{n-1}'}, \tag{18.3.40}
\]
\[
= \sum_j p(j)\,\rho^j_{A_n}\otimes\left(\tau^n_{MB_{n-1}'}\right)^{\frac{1-\alpha}{2\alpha}}\rho^j_{MB_{n-1}'}\left(\tau^n_{MB_{n-1}'}\right)^{\frac{1-\alpha}{2\alpha}}. \tag{18.3.42}
\]
is the sandwiched Rényi relative entropy at round 𝑛 − 1 of the protocol, which allows
us to apply the argument inductively. The first equality follows because $\rho^1_{MB_0'} = \tau^1_{MB_0'}$
because no channels have been applied at this point in the protocol. Putting together
(18.3.7), (18.3.30), (18.3.33)–(18.3.35), (18.3.36), and (18.3.44)–(18.3.47), we
conclude that
\[
\log_2|\mathcal{M}| \le n\frac{\alpha}{\alpha-1}\log_2\nu_\alpha\!\left(\Theta_{\sigma_B^{\frac{1-\alpha}{\alpha}}}\circ\mathcal{N}_{A\to B}\right) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{18.3.49}
\]
Since this upper bound holds for every state 𝜎𝐵 , we can take an infimum over all
such states and conclude that
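\begin{align}
\log_2|\mathcal{M}| &\le n\frac{\alpha}{\alpha-1}\inf_{\sigma_B}\log_2\nu_\alpha\!\left(\Theta_{\sigma_B^{\frac{1-\alpha}{\alpha}}}\circ\mathcal{N}_{A\to B}\right) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right) \tag{18.3.50}\\
&= n\widetilde{K}_\alpha(\mathcal{N}_{A\to B}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right) \tag{18.3.51}\\
&= n\widetilde{\chi}_\alpha(\mathcal{N}_{A\to B}) + \frac{\alpha}{\alpha-1}\log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{18.3.52}
\end{align}
The first equality follows from the definition of 𝜈𝛼 in (12.2.82) and the definition
of $\widetilde{K}_\alpha$ in (12.2.58). The last equality follows from Lemma 12.17. ■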
We now establish an upper bound that holds for an arbitrary quantum channel. It is
equal to the maximum output entropy of the channel (Theorem 18.5). A refinement
of this upper bound leads to an upper bound equal to the maximum expected output
entropy of the channel (Theorem 18.6), by writing it as a convex combination of
other channels.
We begin by establishing the first upper bound. The main idea for doing so is
to consider a protocol that simulates the general protocol detailed in Section 18.1.
The simulation is a purified protocol, in which every step of the original protocol
is purified. Each state of the purified protocol, when conditioned on the message
being transmitted and the values of the classical feedback, is in a pure state. We
now detail the form of this purified protocol. In order to simplify notation, we let
$\hat{A}$ denote a joint system throughout, referring to both the original system $A'$ and
a purifying reference system, and we take the same convention when using the
notation $\hat{B}$. By inspecting (18.1.8), the initial state of Bob in the purified protocol
is as follows:
\[
\sigma_{F_0 F_0' \hat{B}_0} := \sum_{f_0} p(f_0) |f_0\rangle\!\langle f_0|_{F_0} \otimes |f_0\rangle\!\langle f_0|_{F_0'} \otimes \psi^{f_0}_{\hat{B}_0}, \tag{18.3.53}
\]
where the state $\psi^{f_0}_{\hat{B}_0}$ purifies Bob’s state $\Psi^{f_0}_{B_0'}$, such that tracing over a subsystem
of $\psi^{f_0}_{\hat{B}_0}$ gives $\Psi^{f_0}_{B_0'}$. Additionally, Bob keeps an extra copy $F_0'$ of the classical data
transmitted over the classical feedback channel. Let $\mathcal{U}^0_{M' F_0 \to \hat{A}_1 A_1}$ denote an isometric
channel extending the encoding channel $\mathcal{E}^0_{M' F_0 \to A_1' A_1}$. After $\mathcal{U}^0_{M' F_0 \to \hat{A}_1 A_1}$ acts, the
global state is as follows:
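\[
\omega^1_{M \hat{A}_1 A_1 F_0' \hat{B}_0} := \sum_{m \in \mathcal{M}} \sum_{f_0} p(m) p(f_0) |m\rangle\!\langle m|_M \otimes \varphi^{0,m,f_0}_{\hat{A}_1 A_1} \otimes |f_0\rangle\!\langle f_0|_{F_0'} \otimes \psi^{f_0}_{\hat{B}_0}, \tag{18.3.54}
\]
where
\[
\varphi^{0,m,f_0}_{\hat{A}_1 A_1} := \mathcal{U}^0_{M' F_0 \to \hat{A}_1 A_1}(|m\rangle\!\langle m|_{M'} \otimes |f_0\rangle\!\langle f_0|_{F_0}). \tag{18.3.55}
\]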
\[
\mathcal{V}^{i,f_i}_{B_i B_{i-1}' \to \hat{B}_i}(\tau_{B_i B_{i-1}'}) := V^{i,f_i}_{B_i B_{i-1}' \to \hat{B}_i}\, \tau_{B_i B_{i-1}'}\, (V^{i,f_i}_{B_i B_{i-1}' \to \hat{B}_i})^\dagger. \tag{18.3.57}
\]
This extended decoding channel keeps an extra copy of the classical feedback value
𝑓𝑖 for Bob in the classical register 𝐹𝑖′. The final decoding channel in the original
protocol is a measurement channel and thus can be written as
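\[
\mathcal{D}_{B_n B_{n-1}' \to \hat{M}}(\tau_{B_n B_{n-1}'}) = \sum_{m \in \mathcal{M}} \operatorname{Tr}[\Lambda^m_{B_n B_{n-1}'} \tau_{B_n B_{n-1}'}]\, |m\rangle\!\langle m|_{\hat{M}}, \tag{18.3.59}
\]
where $\{\Lambda^m_{B_n B_{n-1}'}\}_{m \in \mathcal{M}}$ is a POVM. We enlarge it as follows in the simulation protocol:
\[
\mathcal{V}^n_{B_n B_{n-1}' \to \hat{B}_n \hat{M}}(\tau_{B_n B_{n-1}'}) := \sum_{m \in \mathcal{M}} \sqrt{\Lambda^m_{B_n B_{n-1}'}}\, \tau_{B_n B_{n-1}'} \sqrt{\Lambda^m_{B_n B_{n-1}'}} \otimes |m\rangle\!\langle m|_{\hat{M}}, \tag{18.3.60}
\]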
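Alice transmits system $A_1$ through the first use of the extended channel $\mathcal{U}^{\mathcal{N}}_{A_1 \to B_1 E_1}$,
resulting in the following state:
\[
\rho^1_{M \hat{A}_1 B_1 E_1 F_0' \hat{B}_0} := \mathcal{U}^{\mathcal{N}}_{A_1 \to B_1 E_1}(\omega^1_{M \hat{A}_1 A_1 F_0' \hat{B}_0}). \tag{18.3.62}
\]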
Bob processes his systems $B_1$ and $B_0'$ with the extended decoding channel
$\mathcal{V}^1_{B_1 B_0' \to F_1 \hat{B}_1 F_1'}$, and Alice acts with the extended encoding channel $\mathcal{U}^1_{\hat{A}_1 F_1 \to \hat{A}_2 A_2}$,
resulting in the state
\[
\rho^i_{M \hat{A}_i B_i \hat{B}_{i-1} E_1^{i} [F_0^{i-1}]'} := \mathcal{U}^{\mathcal{N}}_{A_i \to B_i E_i}(\omega^i_{M \hat{A}_i A_i \hat{B}_{i-1} E_1^{i-1} [F_0^{i-1}]'}), \tag{18.3.64}
\]
for 𝑖 ∈ {2, . . . , 𝑛 − 1}. The final extended decoding channel results in the following
state:
where the 𝐵˜ system encompasses all systems in Bob’s possession at the end. Note
that we recover each state of the original protocol described in Section 18.1 by
performing particular partial traces.
Before stating the main theorem of this section, we prove two lemmas that play
an important role in its proof. Both lemmas involve the following information
measure:
𝐼 (𝑋; 𝐶𝑌 )𝜏 + 𝐻 (𝐶 |𝑋𝑌 )𝜏 , (18.3.67)
where the information quantities are evaluated with respect to the following
classical–quantum state:
\[
\tau_{XYC} := \sum_{x,y} p(x,y) |x\rangle\!\langle x|_X \otimes |y\rangle\!\langle y|_Y \otimes \tau^{x,y}_C. \tag{18.3.68}
\]
In the above, $p(x,y)$ is a probability distribution and $\tau^{x,y}_C$ is a quantum state for all
$x$ and $y$.
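As a reading aid that is not part of the original derivation, the quantity in (18.3.67) can be rewritten using only the standard definitions of mutual information and conditional entropy. The following worked step is a sketch under that assumption; it shows that the measure equals an entropy of Bob's systems minus a purely classical term:

```latex
\begin{align}
I(X;CY)_\tau + H(C|XY)_\tau
&= \left[ H(X)_\tau + H(CY)_\tau - H(XYC)_\tau \right]
 + \left[ H(XYC)_\tau - H(XY)_\tau \right] \\
&= H(CY)_\tau - H(Y|X)_\tau .
\end{align}
```

Because $Y$ is classical, $H(Y|X)_\tau \geq 0$, so the measure never exceeds $H(CY)_\tau$. This is consistent with the entropic character of the bounds that follow.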
Lemma 18.3
Let 𝜏𝑋𝑌 𝐴𝐵 be a classical–quantum state, with classical systems 𝑋𝑌 and quantum
systems 𝐴𝐵 pure when conditioned on 𝑋𝑌 . Let L 𝐴𝐵→𝐴′ 𝐵′ 𝑍 be a one-way
LOCC channel of the following form:
\[
\mathcal{L}_{AB \to A'B'Z} := \sum_z \mathcal{U}^z_{A \to A'} \otimes \mathcal{V}^z_{B \to B'} \otimes |z\rangle\!\langle z|_Z, \tag{18.3.69}
\]
Proof: The inequality 𝐼 (𝑋; 𝐵𝑌 )𝜏 ≥ 𝐼 (𝑋; 𝐵′𝑌 𝑍)𝜔 follows from the data-processing
inequality for mutual information. In more detail, consider that 𝜔 𝑋𝑌 𝑍 𝐵′ is equal to
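\begin{align}
\omega_{XYZB'} &= \operatorname{Tr}_{A'}[\omega_{XYZA'B'}] \tag{18.3.71}\\
&= \operatorname{Tr}_{A'}\!\left[ \sum_z (\mathcal{U}^z_{A \to A'} \otimes \mathcal{V}^z_{B \to B'})(\tau_{XYAB}) \otimes |z\rangle\!\langle z|_Z \right] \tag{18.3.72}\\
&= \sum_z ((\operatorname{Tr}_{A'} \circ\, \mathcal{U}^z_{A \to A'}) \otimes \mathcal{V}^z_{B \to B'})(\tau_{XYAB}) \otimes |z\rangle\!\langle z|_Z \tag{18.3.73}\\
&= \sum_z \mathcal{V}^z_{B \to B'}(\operatorname{Tr}_A[\tau_{XYAB}]) \otimes |z\rangle\!\langle z|_Z \tag{18.3.74}\\
&= \sum_z \mathcal{V}^z_{B \to B'}(\tau_{XYB}) \otimes |z\rangle\!\langle z|_Z. \tag{18.3.75}
\end{align}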
The fourth equality follows because $\mathcal{U}^z_{A \to A'}$ is an isometric channel, and hence trace
preserving, for all $z$. Thus, the state $\omega_{XYZB'}$ can be understood as arising from the
action of the quantum instrument $\sum_z \mathcal{V}^z_{B \to B'} \otimes |z\rangle\!\langle z|_Z$ on the state $\tau_{XYB}$, and since this is
a channel taking system $B$ to $B'Z$, the data-processing inequality for mutual information applies. The
inequality 𝐻 (𝐵|𝑋𝑌 )𝜏 ≥ 𝐻 (𝐵′ |𝑋𝑌 𝑍)𝜔 is a consequence of the LOCC monotonicity
of the entanglement of formation (see Proposition 9.6). Indeed, consider that
𝐻 (𝐵|𝑋𝑌 )𝜏 = 𝐸 𝐹 ( 𝐴; 𝐵𝑋𝑌 )𝜏 , (18.3.76)
𝐻 (𝐵′ |𝑋𝑌 𝑍)𝜔 = 𝐸 𝐹 ( 𝐴′; 𝐵′ 𝑋𝑌 𝑍)𝜔 , (18.3.77)
which follows from the direct-sum property of the entanglement of formation (see
the proof of Proposition 9.6) and its reduction to entropy of entanglement for pure
states (see (9.1.40)). Thus, we apply these equalities and the LOCC monotonicity
of entanglement of formation (i.e., 𝐸 𝐹 ( 𝐴; 𝐵𝑋𝑌 )𝜏 ≥ 𝐸 𝐹 ( 𝐴′; 𝐵′ 𝑋𝑌 𝑍)𝜔 ). ■
The following lemma places an entropic upper bound on the amount by which
the information quantity in (18.3.67) can increase by the action of a channel N 𝐴→𝐵 :
Lemma 18.4
Let N 𝐴→𝐵 be a quantum channel, and let 𝜏𝑋𝑌 𝐴𝐵′ be a classical–quantum state
of the following form:
\[
\tau_{XYAB'} := \sum_{x,y} p(x,y) |x\rangle\!\langle x|_X \otimes |y\rangle\!\langle y|_Y \otimes \tau^{x,y}_{AB'}. \tag{18.3.78}
\]
Then
The key properties of the information quantity in (18.3.67) are that it does not
increase under the action of a one-way LOCC channel from Bob to Alice (i.e., the
decoding channel of Bob, the classical feedback channel, and the encoding channel
of Alice) and it cannot increase by more than the output entropy of a channel under
its action. We can use these properties to establish the following entropy bound on
the number of bits that can be transmitted by a feedback-assisted communication
protocol:
Theorem 18.5
Let N 𝐴→𝐵 be a quantum channel, and let 𝜀 ∈ [0, 1). For an (𝑛, |M| , 𝜀) protocol
for classical communication over a quantum channel N 𝐴→𝐵 assisted by classical
feedback, as described in Section 18.1, the following bound holds
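\[
\frac{\log_2 |\mathcal{M}|}{n} \leq \frac{1}{1-\varepsilon} \left[ \sup_{\rho_A} H(\mathcal{N}_{A \to B}(\rho_A)) + \frac{h_2(\varepsilon)}{n} \right]. \tag{18.3.85}
\]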
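To get a feel for the bound in (18.3.85), the following minimal sketch evaluates its right-hand side for a given maximum output entropy. The function names `binary_entropy` and `rate_upper_bound` are illustrative assumptions, not notation from the text, and the maximum output entropy is supplied by the caller rather than computed from a channel.

```python
import math


def binary_entropy(eps: float) -> float:
    """Binary entropy h_2(eps) in bits, with h_2(0) = h_2(1) = 0."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * math.log2(eps) - (1.0 - eps) * math.log2(1.0 - eps)


def rate_upper_bound(max_output_entropy: float, n: int, eps: float) -> float:
    """Right-hand side of (18.3.85): a bound on log2|M| / n in bits per channel use."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must lie in [0, 1)")
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (max_output_entropy + binary_entropy(eps) / n) / (1.0 - eps)


# Illustration: a channel with a qubit output has maximum output entropy at most 1 bit.
for n in (10, 100, 1000):
    print(n, rate_upper_bound(1.0, n, eps=0.1))
```

As $n$ grows, the $h_2(\varepsilon)/n$ term vanishes but the prefactor $1/(1-\varepsilon)$ remains, which is why this is a weak converse bound.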
Proof: Our starting point is the general bounds in (18.3.7)–(18.3.8), which imply
that
\[
\log_2 |\mathcal{M}| \leq \frac{1}{1-\varepsilon} \left[ I(M;\hat{M})_\omega + h_2(\varepsilon) \right], \tag{18.3.86}
\]
where 𝜔 𝑀 𝑀ˆ is the final state of the protocol, as given in (18.1.7), with 𝑝 therein set
to the uniform distribution over the set M of messages. Continuing, and considering
the purified protocol outlined above, we find that
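\begin{align}
I(M;\hat{M})_\omega
&\leq I(M; B_n \hat{B}_{n-1} [F_0^{n-1}]')_{\rho^n} + H(B_n \hat{B}_{n-1} | [F_0^{n-1}]' M)_{\rho^n} \tag{18.3.87}\\
&= I(M; B_n \hat{B}_{n-1} [F_0^{n-1}]')_{\rho^n} + H(B_n \hat{B}_{n-1} | [F_0^{n-1}]' M)_{\rho^n} \notag\\
&\quad - \left[ I(M; \hat{B}_0 F_0')_{\omega^1} + H(\hat{B}_0 | F_0' M)_{\omega^1} \right] \tag{18.3.88}\\
&= I(M; B_n \hat{B}_{n-1} [F_0^{n-1}]')_{\rho^n} + H(B_n \hat{B}_{n-1} | [F_0^{n-1}]' M)_{\rho^n} \notag\\
&\quad - \left[ I(M; \hat{B}_0 F_0')_{\omega^1} + H(\hat{B}_0 | F_0' M)_{\omega^1} \right] \notag\\
&\quad + \sum_{i=2}^{n} \Big( \left[ I(M; \hat{B}_{i-1} [F_0^{i-1}]')_{\omega^i} + H(\hat{B}_{i-1} | [F_0^{i-1}]' M)_{\omega^i} \right] \notag\\
&\qquad\qquad - \left[ I(M; \hat{B}_{i-1} [F_0^{i-1}]')_{\omega^i} + H(\hat{B}_{i-1} | [F_0^{i-1}]' M)_{\omega^i} \right] \Big). \tag{18.3.89}
\end{align}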
The first inequality follows from non-negativity of quantum entropy and data
processing under the action of the final decoding channel. The first equality follows
because $I(M; \hat{B}_0 F_0')_{\omega^1} + H(\hat{B}_0 | F_0' M)_{\omega^1} = 0$ for the initial state $\omega^1_{M \hat{A}_1 A_1 F_0' \hat{B}_0}$ (indeed,
the systems $M$ and $F_0' \hat{B}_0$ of the reduced state $\omega^1_{M F_0' \hat{B}_0}$ are product, and the state on
system $\hat{B}_0$ is pure when conditioned on $F_0' M$). The last equality follows by adding
and subtracting the same term. Continuing, we find that the quantity in the last line
above is bounded as
\begin{align}
&\leq \sum_{i=1}^{n} H(B_i)_{\rho^i} \tag{18.3.92}\\
&\leq n \sup_{\rho_A} H(\mathcal{N}_{A \to B}(\rho_A)). \tag{18.3.93}
\end{align}
The first inequality follows from Lemma 18.3 and the second from Lemma 18.4.
So we conclude that
\[
I(M;\hat{M})_\omega \leq n \sup_{\rho_A} H(\mathcal{N}_{A \to B}(\rho_A)). \tag{18.3.94}
\]
In this section, we provide a brief proof of the following theorem, which generalizes
Theorem 18.5 to the maximum average output entropy of a quantum channel:
Theorem 18.6
Let $\mathcal{N}_{A \to B} = \sum_x p_X(x) \mathcal{N}^x_{A \to B}$, where $p_X$ is a probability distribution and
$\{\mathcal{N}^x_{A \to B}\}_x$ is a set of channels. For an $(n, |\mathcal{M}|, \varepsilon)$ protocol for classical
communication over the channel $\mathcal{N}_{A \to B}$ assisted by classical feedback, of the form
described in Section 18.1, the following bound applies:
\[
(1-\varepsilon) \log_2 |\mathcal{M}| \leq n \cdot \sup_{\rho_A} \sum_x p_X(x) H(\mathcal{N}^x_{A \to B}(\rho_A)) + h_2(\varepsilon).
\]
Proof: The main idea behind the proof is to observe that an arbitrary feedback-
assisted protocol of the form discussed in Section 18.1, which is for communication
over a probabilistic mixture channel $\mathcal{N}_{A \to B} = \sum_z p_Z(z) \mathcal{N}^z_{A \to B}$, has a simulation of
the following form:
1. Before the 𝑖th use of the channel N 𝐴→𝐵 in the feedback-assisted protocol, Bob
selects a random variable 𝑍𝑖 independently according to the distribution 𝑝 𝑍 .
He transmits 𝑍𝑖 over the classical feedback channel to Alice.
2. Each channel use N 𝐴→𝐵 from the original protocol is replaced by a simulation
in terms of another channel M 𝐴𝑍 ′ →𝐵 , which accepts a quantum input on system
𝐴 and a classical input on system 𝑍 ′. Conditioned on the value 𝑧 in system
𝑍 ′, the channel M 𝐴𝑍 ′ →𝐵 applies N 𝑧𝐴→𝐵 to the quantum system 𝐴. Thus, if the
random variable 𝑍 ∼ 𝑝 𝑍 is fed into the input system 𝑍 ′ of M 𝐴𝑍 ′ →𝐵 , then the
channel M 𝐴𝑍 ′ →𝐵 is indistinguishable from the original channel N 𝐴→𝐵 .
3. Alice feeds a copy of the classical random variable 𝑍𝑖 into the 𝑖th use of the
channel M 𝐴𝑍 ′ →𝐵 .
4. All other aspects of the protocol are executed in the same way as before. Namely,
even though it would be an advantage to Alice to modify her encodings and
Bob to modify later decodings based on the realizations of 𝑍𝑖 , they do not do
so, and they instead blindly operate all other aspects of the simulation protocol
as they are in the original protocol.
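The simulation steps above can be illustrated with a small sketch. It assumes a toy qubit mixture of two hypothetical branches (an identity channel and a channel that resets the input to $|0\rangle\langle 0|$), neither of which is taken from the text, and checks that feeding $Z \sim p_Z$ into the controlled channel $\mathcal{M}_{AZ' \to B}$ and averaging reproduces $\mathcal{N}_{A \to B}$. Matrices are plain nested lists so that the example needs no third-party packages.

```python
from typing import Callable, List, Sequence

Matrix = List[List[complex]]
Channel = Callable[[Matrix], Matrix]


def identity_channel(rho: Matrix) -> Matrix:
    """Hypothetical branch N^0: leave the input unchanged."""
    return [row[:] for row in rho]


def reset_channel(rho: Matrix) -> Matrix:
    """Hypothetical branch N^1: discard the input and prepare |0><0|."""
    d = len(rho)
    out = [[0j] * d for _ in range(d)]
    out[0][0] = sum(rho[i][i] for i in range(d))  # Tr[rho] |0><0|
    return out


def mixture_channel(p_z: Sequence[float], branches: Sequence[Channel]) -> Channel:
    """The original channel N = sum_z p_Z(z) N^z."""
    def apply(rho: Matrix) -> Matrix:
        d = len(rho)
        out = [[0j] * d for _ in range(d)]
        for p, branch in zip(p_z, branches):
            image = branch(rho)
            for i in range(d):
                for j in range(d):
                    out[i][j] += p * image[i][j]
        return out
    return apply


def controlled_channel(branches: Sequence[Channel]) -> Callable[[Matrix, int], Matrix]:
    """The simulating channel M_{AZ'->B}: apply N^z when register Z' holds z."""
    def apply(rho: Matrix, z: int) -> Matrix:
        return branches[z](rho)
    return apply


p_z = [0.7, 0.3]
branches = [identity_channel, reset_channel]
N = mixture_channel(p_z, branches)
M = controlled_channel(branches)

rho = [[0.5, 0.5], [0.5, 0.5]]  # the state |+><+|

# Averaging M over Z ~ p_Z reproduces the action of N on rho.
averaged = [[sum(p * M(rho, z)[i][j] for z, p in enumerate(p_z))
             for j in range(2)] for i in range(2)]
direct = N(rho)
print(averaged)
print(direct)
```

The point of the construction is that the value of $Z_i$ becomes available as classical side information, which is what allows the output entropy in the bound to be conditioned on it.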
Our goal now is to establish the inequality in Theorem 18.6, relating the 𝑛, |M|,
and 𝜀 parameters of the original (𝑛, |M| , 𝜀) protocol by using the above simulation.
The main observation to make from here is that the same proof from Lemma 18.4
gives the following bound:
This follows by grouping 𝑍 with 𝑌 , but then discarding only 𝑌 and 𝐵′ at the end
of the proof. We then apply this bound, and the same reasoning in the proof
of Theorem 18.5, except that the variables 𝑍0 , . . . , 𝑍𝑖 are grouped together with
the feedback variables [𝐹0𝑖−1 ] ′ and then the same reasoning in (18.3.87)–(18.3.93)
applies. At this point, we invoke (18.3.95) and find that
\[
(1-\varepsilon) \log_2 |\mathcal{M}| \leq \sum_{i=1}^{n} H(B_i | Z_i)_{\rho^{(i)}} + h_2(\varepsilon). \tag{18.3.98}
\]
The second equality follows from the definition of conditional entropy. The third
inequality follows from optimizing over all states. ■
In this section, we prove that the Υ-information bound from Section 12.2.5.1 is
actually an upper bound on the classical capacity assisted by classical feedback. The
main idea behind the approach detailed in this section is to establish a correlation
measure for bipartite channels, which is non-increasing under the action of one-way
LOCC channels and measures the forward classical communication that can be
generated by the bipartite channel for which it is evaluated. Such a measure is
relevant in the context of a feedback-assisted protocol because, in such a protocol,
Alice and Bob employ a one-way LOCC channel from Bob to Alice. In particular,
local channels are allowed for free, as well as the use of a classical feedback channel.
Both of these actions can be considered as particular kinds of bipartite channels
and both of them fall into the class of bipartite channels that are non-signaling from
Alice to Bob and C-PPT-P (call this class NS 𝐴↛𝐵 ∩ PPT). Recall the definition of
non-signaling channels from Section 4.6.4 and C-PPT-P channels from Section 4.6.3.
As such, if we employ a measure of bipartite channels that involves a comparison
\[
\Gamma^{\mathcal{M}}_{AA'BB'} := \mathcal{M}_{\hat{A}\hat{B} \to A'B'}(\Gamma_{A\hat{A}} \otimes \Gamma_{B\hat{B}}). \tag{18.3.106}
\]
and 𝜋 𝐴 B 𝐼 𝐴 /𝑑 𝐴 .
Since 𝑆 𝐴𝐴′ 𝐵𝐵′ ± 𝑉𝐴𝐴′ 𝐵𝐵′ ≥ 0 implies that 𝑆 𝐴𝐴′ 𝐵𝐵′ ≥ 0, we can also express
All of the properties above hold for bipartite channels, while the second and
fifth through eighth hold more generally for completely positive bipartite maps.
𝐶 𝛽 (N 𝐴𝐵→𝐴′ 𝐵′ ) ≥ 0. (18.3.115)
Proof: We prove the equivalent statement 𝛽(N 𝐴𝐵→𝐴′ 𝐵′ ) ≥ 1. Let 𝜆, 𝑆 𝐴𝐴′ 𝐵𝐵′ , and
𝑉𝐴𝐴′ 𝐵𝐵′ be arbitrary Hermitian operators satisfying the constraints in (18.3.114).
Then consider that
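\begin{align}
\lambda d_B &= \lambda \operatorname{Tr}_B[I_B] \tag{18.3.116}\\
&\geq \frac{1}{d_A} \operatorname{Tr}_{AA'BB'}[S_{AA'BB'}] \tag{18.3.117}\\
&\geq \frac{1}{d_A} \operatorname{Tr}_{AA'BB'}[V_{AA'BB'}] \tag{18.3.118}\\
&= \frac{1}{d_A} \operatorname{Tr}_{AA'BB'}[T_{BB'}(V_{AA'BB'})] \tag{18.3.119}\\
&\geq \frac{1}{d_A} \operatorname{Tr}_{AA'BB'}[\Gamma^{\mathcal{N}}_{AA'BB'}] \tag{18.3.120}\\
&= \frac{1}{d_A} \operatorname{Tr}_{AB}[I_{AB}] \tag{18.3.121}\\
&= d_B. \tag{18.3.122}
\end{align}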
This implies that 𝜆 ≥ 1. Since the inequality holds for all 𝜆, 𝑆 𝐴𝐴′ 𝐵𝐵′ , and 𝑉𝐴𝐴′ 𝐵𝐵′
satisfying the constraints in (18.3.114), we conclude the statement above. ■
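\[
C_\beta(\mathrm{id}_{\bar{A} \to \tilde{A}} \otimes \mathcal{M}_{AB \to A'B'} \otimes \mathrm{id}_{\bar{B} \to \tilde{B}}) = C_\beta(\mathcal{M}_{AB \to A'B'}). \tag{18.3.123}
\]
Proof: Let $S_{AA'BB'}$ and $V_{AA'BB'}$ be arbitrary Hermitian operators satisfying the
constraints in (18.3.105) for $\mathcal{M}_{AB \to A'B'}$. The Choi operator of $\mathrm{id}_{\bar{A} \to \tilde{A}} \otimes \mathcal{M}_{AB \to A'B'} \otimes \mathrm{id}_{\bar{B} \to \tilde{B}}$
is given by
\[
\Gamma_{\bar{A}\tilde{A}} \otimes \Gamma^{\mathcal{M}}_{AA'BB'} \otimes \Gamma_{\bar{B}\tilde{B}}. \tag{18.3.124}
\]
Let us show that $\Gamma_{\bar{A}\tilde{A}} \otimes S_{AA'BB'} \otimes \Gamma_{\bar{B}\tilde{B}}$ and $\Gamma_{\bar{A}\tilde{A}} \otimes V_{AA'BB'} \otimes \Gamma_{\bar{B}\tilde{B}}$ satisfy the constraints
in (18.3.105) for $\mathrm{id}_{\bar{A} \to \tilde{A}} \otimes \mathcal{M}_{AB \to A'B'} \otimes \mathrm{id}_{\bar{B} \to \tilde{B}}$. Consider that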
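\begin{align}
&= \frac{1}{d_A d_{\bar{A}}}\, d_{\bar{A}} \left\| \operatorname{Tr}_{AA'B'}[S_{AA'BB'} \otimes I_{\bar{B}}] \right\|_\infty \tag{18.3.132}\\
&= \frac{1}{d_A} \left\| \operatorname{Tr}_{AA'B'}[S_{AA'BB'}] \otimes I_{\bar{B}} \right\|_\infty \tag{18.3.133}\\
&= \frac{1}{d_A} \left\| \operatorname{Tr}_{AA'B'}[S_{AA'BB'}] \right\|_\infty. \tag{18.3.134}
\end{align}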
Thus, it follows that
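\begin{align}
S'_{AA'BB'} &:= \frac{1}{d_{\bar{A}} d_{\bar{B}}} \operatorname{Tr}_{\bar{A}\tilde{A}\bar{B}\tilde{B}}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}], \tag{18.3.136}\\
V'_{AA'BB'} &:= \frac{1}{d_{\bar{A}} d_{\bar{B}}} \operatorname{Tr}_{\bar{A}\tilde{A}\bar{B}\tilde{B}}[V_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}]. \tag{18.3.137}
\end{align}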
Consider that
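\[
\Gamma^{\mathrm{id} \otimes \mathcal{M} \otimes \mathrm{id}}_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}} = \Gamma_{\bar{A}\tilde{A}} \otimes \Gamma^{\mathcal{M}}_{AA'BB'} \otimes \Gamma_{\bar{B}\tilde{B}}. \tag{18.3.138}
\]
Then
\begin{align}
& T_{BB'\bar{B}\tilde{B}}(V_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}} \pm \Gamma_{\bar{A}\tilde{A}} \otimes \Gamma^{\mathcal{M}}_{AA'BB'} \otimes \Gamma_{\bar{B}\tilde{B}}) \geq 0 \tag{18.3.139}\\
\implies\quad & \operatorname{Tr}_{\bar{A}\tilde{A}\bar{B}\tilde{B}}[T_{BB'\bar{B}\tilde{B}}(V_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}} \pm \Gamma_{\bar{A}\tilde{A}} \otimes \Gamma^{\mathcal{M}}_{AA'BB'} \otimes \Gamma_{\bar{B}\tilde{B}})] \geq 0 \tag{18.3.140}\\
\iff\quad & T_{BB'}(V_{AA'BB'} \pm d_{\bar{A}} d_{\bar{B}} \Gamma^{\mathcal{M}}_{AA'BB'}) \geq 0 \tag{18.3.141}\\
\iff\quad & T_{BB'}(V'_{AA'BB'} \pm \Gamma^{\mathcal{M}}_{AA'BB'}) \geq 0. \tag{18.3.142}
\end{align}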
Also
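\begin{align}
& S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}} \pm V_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}} \geq 0 \tag{18.3.143}\\
\implies\quad & \operatorname{Tr}_{\bar{A}\tilde{A}\bar{B}\tilde{B}}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}} \pm V_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}] \geq 0 \tag{18.3.144}\\
\iff\quad & S'_{AA'BB'} \pm V'_{AA'BB'} \geq 0, \tag{18.3.145}
\end{align}
and
\begin{align}
& \operatorname{Tr}_{\tilde{A}A'}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}] = \pi_{\bar{A}A} \otimes \operatorname{Tr}_{\bar{A}\tilde{A}AA'}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}] \tag{18.3.146}\\
\implies\quad & \operatorname{Tr}_{\bar{A}\tilde{A}A'\bar{B}\tilde{B}}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}] = \operatorname{Tr}_{\bar{A}\bar{B}\tilde{B}}[\pi_{\bar{A}A} \otimes \operatorname{Tr}_{\bar{A}\tilde{A}AA'}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}]] \tag{18.3.147}\\
& \phantom{\operatorname{Tr}_{\bar{A}\tilde{A}A'\bar{B}\tilde{B}}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}]} = \pi_A \otimes \operatorname{Tr}_{\bar{A}\tilde{A}AA'\bar{B}\tilde{B}}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}] \tag{18.3.148}\\
\iff\quad & \operatorname{Tr}_{A'}[S'_{AA'BB'}] = \pi_A \otimes \operatorname{Tr}_{AA'}[S'_{AA'BB'}]. \tag{18.3.149}
\end{align}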
Finally, let 𝜆 be such that
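\[
\frac{1}{d_A d_{\bar{A}}} \operatorname{Tr}_{\bar{A}\tilde{A}AA'B'\tilde{B}}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}] \leq \lambda I_{B\bar{B}}. \tag{18.3.150}
\]
Then it follows that
\begin{align}
& \operatorname{Tr}_{\bar{B}}\!\left[ \frac{1}{d_A d_{\bar{A}}} \operatorname{Tr}_{\bar{A}\tilde{A}AA'B'\tilde{B}}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}] \right] \leq \operatorname{Tr}_{\bar{B}}[\lambda I_{B\bar{B}}] \tag{18.3.151}\\
\iff\quad & \frac{1}{d_A d_{\bar{A}}} \operatorname{Tr}_{\bar{A}\tilde{A}AA'B'\bar{B}\tilde{B}}[S_{\bar{A}\tilde{A}AA'BB'\bar{B}\tilde{B}}] \leq d_{\bar{B}} \lambda I_B \tag{18.3.152}\\
\iff\quad & \frac{1}{d_A} \operatorname{Tr}_{AA'B'}[S'_{AA'BB'}] \leq \lambda I_B. \tag{18.3.153}
\end{align}
Thus, we conclude that
\[
\beta(\mathcal{M}_{AB \to A'B'}) \leq \beta(\mathrm{id}_{\bar{A} \to \tilde{A}} \otimes \mathcal{M}_{AB \to A'B'} \otimes \mathrm{id}_{\bar{B} \to \tilde{B}}). \tag{18.3.154}
\]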
This concludes the proof. ■
𝐶 𝛽 (Δ𝐵→𝐴′ ) = 0. (18.3.156)
Proof: We prove the equivalent statement that 𝛽(Δ𝐵→𝐴′ ) = 1. In this case, the 𝐴
and 𝐵′ systems are trivial, so that 𝑑 𝐴 = 1, and the Choi operator of Δ𝐵→𝐴′ is given
by
\[
\Gamma^{\Delta}_{BA'} = \Gamma_{BA'}, \tag{18.3.157}
\]
where
\[
\Gamma_{BA'} := \sum_{i=0}^{d_B - 1} |i\rangle\!\langle i|_B \otimes |i\rangle\!\langle i|_{A'}. \tag{18.3.158}
\]
Pick 𝑆 𝐵𝐴′ = 𝑉𝐵𝐴′ = Γ 𝐵𝐴′ . Then we need to check that the constraints in (18.3.105)
are satisfied for these choices. Consider that
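\begin{align}
& T_B(V_{BA'} \pm \Gamma^{\Delta}_{BA'}) \geq 0 \tag{18.3.159}\\
\iff\quad & T_B(\Gamma_{BA'} \pm \Gamma_{BA'}) \geq 0 \tag{18.3.160}\\
\iff\quad & \Gamma_{BA'} \pm \Gamma_{BA'} \geq 0, \tag{18.3.161}
\end{align}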
and the no-signaling condition Tr 𝐴′ [𝑆 𝐴𝐴′ 𝐵𝐵′ ] = 𝜋 𝐴 ⊗ Tr 𝐴𝐴′ [𝑆 𝐴𝐴′ 𝐵𝐵′ ] is trivially
satisfied because the 𝐴 system is trivial, having dimension equal to one. Finally, let
us evaluate the objective function for these choices:
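\begin{align}
\frac{1}{d_A} \left\| \operatorname{Tr}_{AA'B'}[S_{AA'BB'}] \right\|_\infty &= \left\| \operatorname{Tr}_{A'}[S_{A'B}] \right\|_\infty \tag{18.3.164}\\
&= \left\| \operatorname{Tr}_{A'}[\Gamma_{BA'}] \right\|_\infty \tag{18.3.165}\\
&= \left\| I_B \right\|_\infty \tag{18.3.166}\\
&= 1. \tag{18.3.167}
\end{align}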
Combined with the general lower bound from Proposition 18.8, we conclude
(18.3.156). ■
Proof: We prove the equivalent statement that $\beta(\mathcal{E}_{A \to A'} \otimes \mathcal{F}_{B \to B'}) = 1$. Set
$S_{AA'BB'} = V_{AA'BB'} = \Gamma^{\mathcal{E}}_{AA'} \otimes \Gamma^{\mathcal{F}}_{BB'}$, where $\Gamma^{\mathcal{E}}_{AA'}$ and $\Gamma^{\mathcal{F}}_{BB'}$ are the Choi operators of
$\mathcal{E}_{A \to A'}$ and $\mathcal{F}_{B \to B'}$, respectively. We need to check that the constraints in (18.3.105)
are satisfied for these choices. Consider that
and
Combined with the general lower bound from Proposition 18.8, we conclude
(18.3.168). ■
\begin{align}
T_{BB'}(V^1_{AA'BB'} \pm \Gamma^{\mathcal{M}^1}_{AA'BB'}) &\geq 0, \tag{18.3.184}\\
S^1_{AA'BB'} \pm V^1_{AA'BB'} &\geq 0, \tag{18.3.185}\\
\operatorname{Tr}_{A'}[S^1_{AA'BB'}] &= \pi_A \otimes \operatorname{Tr}_{AA'}[S^1_{AA'BB'}], \tag{18.3.186}
\end{align}
and let $S^2_{A'A''B'B''}$ and $V^2_{A'A''B'B''}$ satisfy
This latter statement is a consequence of the general fact that if 𝐴, 𝐵, 𝐶, and 𝐷 are
Hermitian operators satisfying 𝐴 ± 𝐵 ≥ 0 and 𝐶 ± 𝐷 ≥ 0, then 𝐴 ⊗ 𝐶 ± 𝐵 ⊗ 𝐷 ≥ 0.
To see this, consider that the original four operator inequalities imply the four
operator inequalities ( 𝐴 ± 𝐵) ⊗ (𝐶 ± 𝐷) ≥ 0, and then summing these four different
operator inequalities in various ways leads to 𝐴 ⊗ 𝐶 ± 𝐵 ⊗ 𝐷 ≥ 0.
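The summation step can be written out explicitly. The following worked equations are a sketch of one way to combine the four inequalities $(A \pm B) \otimes (C \pm D) \geq 0$; they use only bilinearity of the tensor product and the fact that a sum of positive semi-definite operators is positive semi-definite:

```latex
\begin{align}
(A+B)\otimes(C+D) + (A-B)\otimes(C-D) &= 2\,(A\otimes C + B\otimes D) \geq 0,\\
(A+B)\otimes(C-D) + (A-B)\otimes(C+D) &= 2\,(A\otimes C - B\otimes D) \geq 0.
\end{align}
```

In each line the cross terms $A \otimes D$ and $B \otimes C$ cancel, which leaves exactly the two desired inequalities $A \otimes C \pm B \otimes D \geq 0$.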
Now apply the following positive map to (18.3.190)–(18.3.191):
where
∑︁
|Γ⟩ 𝐴′ 𝐴′ B |𝑖⟩ 𝐴′ |𝑖⟩ 𝐴′ , (18.3.193)
𝑖
∑︁
|Γ⟩𝐵′ 𝐵′ B |𝑖⟩𝐵′ |𝑖⟩𝐵′ . (18.3.194)
𝑖
This gives
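\[
T_{BB''}(V^3_{AA''BB''} \pm \Gamma^{\mathcal{M}^2 \circ \mathcal{M}^1}_{AA''BB''}) \geq 0, \tag{18.3.195}
\]
where
\begin{align}
V^3_{AA''BB''} &:= (\langle\Gamma|_{A'A'} \otimes \langle\Gamma|_{B'B'})(V^1_{AA'BB'} \otimes V^2_{A'A''B'B''})(|\Gamma\rangle_{A'A'} \otimes |\Gamma\rangle_{B'B'}), \tag{18.3.197}\\
\Gamma^{\mathcal{M}^2 \circ \mathcal{M}^1}_{AA''BB''} &:= (\langle\Gamma|_{A'A'} \otimes \langle\Gamma|_{B'B'})(\Gamma^{\mathcal{M}^1}_{AA'BB'} \otimes \Gamma^{\mathcal{M}^2}_{A'A''B'B''})(|\Gamma\rangle_{A'A'} \otimes |\Gamma\rangle_{B'B'}), \tag{18.3.198}\\
S^3_{AA''BB''} &:= (\langle\Gamma|_{A'A'} \otimes \langle\Gamma|_{B'B'})(S^1_{AA'BB'} \otimes S^2_{A'A''B'B''})(|\Gamma\rangle_{A'A'} \otimes |\Gamma\rangle_{B'B'}), \tag{18.3.199}
\end{align}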
and we applied (4.2.20) to conclude that
(18.3.204)
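\begin{align}
&= \frac{1}{d_{A'}} \langle\Gamma|_{B'B'} (\operatorname{Tr}_{A'}[S^1_{AA'BB'}] \otimes \operatorname{Tr}_{A'A''}[S^2_{A'A''B'B''}]) |\Gamma\rangle_{B'B'} \tag{18.3.205}\\
&= \frac{1}{d_{A'}} \langle\Gamma|_{B'B'} (\pi_A \otimes \operatorname{Tr}_{AA'}[S^1_{AA'BB'}] \otimes \operatorname{Tr}_{A'A''}[S^2_{A'A''B'B''}]) |\Gamma\rangle_{B'B'} \tag{18.3.206}\\
&= \pi_A \otimes \frac{1}{d_{A'}} \langle\Gamma|_{B'B'} (\operatorname{Tr}_{AA'}[S^1_{AA'BB'}] \otimes \operatorname{Tr}_{A'A''}[S^2_{A'A''B'B''}]) |\Gamma\rangle_{B'B'}. \tag{18.3.207}
\end{align}
Now consider that
\[
\operatorname{Tr}_{AA''}[S^3_{AA''BB''}] = \frac{1}{d_{A'}} \langle\Gamma|_{B'B'} (\operatorname{Tr}_{AA'}[S^1_{AA'BB'}] \otimes \operatorname{Tr}_{A'A''}[S^2_{A'A''B'B''}]) |\Gamma\rangle_{B'B'}. \tag{18.3.208}
\]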
So we conclude that
straints in (18.3.184)–(18.3.186) and 𝑆 2𝐴′ 𝐴′′ 𝐵′ 𝐵′′ and 𝑉𝐴2′ 𝐴′′ 𝐵′ 𝐵′′ are arbitrary Her-
mitian operators satisfying the constraints in (18.3.187)–(18.3.189), we conclude
(18.3.182). ■
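\[
\mathcal{F}_{\hat{A}\hat{B} \to A''B''} := (\mathcal{N}_{A' \to A''} \otimes \mathcal{P}_{B' \to B''}) \mathcal{M}_{AB \to A'B'} (\mathcal{K}_{\hat{A} \to A} \otimes \mathcal{L}_{\hat{B} \to B}). \tag{18.3.215}
\]
Then
\[
C_\beta(\mathcal{F}_{\hat{A}\hat{B} \to A''B''}) \leq C_\beta(\mathcal{M}_{AB \to A'B'}). \tag{18.3.216}
\]
$C_\beta(\mathcal{F}_{\hat{A}\hat{B} \to A''B''})$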
Then
𝐶 𝛽 (F 𝐴𝐵→𝐴′ 𝐵′ ) = 𝐶 𝛽 (M 𝐴𝐵→𝐴′ 𝐵′ ). (18.3.220)
Proof: Let $S^x_{AA'BB'}$ and $V^x_{AA'BB'}$ satisfy the constraints in (18.3.105) for $\mathcal{M}^x_{AB \to A'B'}$
where the second inequality follows from convexity of the ∞-norm. Since the
inequality holds for all $S^x_{AA'BB'}$ and $V^x_{AA'BB'}$ satisfying the constraints in (18.3.105)
for $\mathcal{M}^x_{AB \to A'B'}$ for $x \in \{0, 1\}$, we conclude (18.3.221). ■
Using the quantum relative entropy, the sandwiched Rényi relative entropy,
the Belavkin–Staszewski relative entropy, and the geometric Rényi relative entropy,
we then obtain the following respective channel measures: $\Upsilon(\mathcal{N}_{AB \to A'B'})$,
$\widetilde{\Upsilon}_\alpha(\mathcal{N}_{AB \to A'B'})$, $\widehat{\Upsilon}(\mathcal{N}_{AB \to A'B'})$, and $\widehat{\Upsilon}_\alpha(\mathcal{N}_{AB \to A'B'})$, defined by substituting 𝑫 with
$D$, $\widetilde{D}_\alpha$, $\widehat{D}$, and $\widehat{D}_\alpha$.
Proof: We prove the first inequality and the proof of the second inequality is
similar. Consider that
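\begin{align}
\boldsymbol{\Upsilon}(\mathcal{N}_{AB \to A'B'})
&= \inf_{\substack{\mathcal{M}_{AB \to A'B'} : \\ \beta(\mathcal{M}_{AB \to A'B'}) \leq 1}} \boldsymbol{D}(\mathcal{N}_{AB \to A'B'} \| \mathcal{M}_{AB \to A'B'}) \notag\\
&\geq \inf_{\substack{\mathcal{M}_{AB \to A'B'} : \\ \beta(\mathcal{M}_{AB \to A'B'}) \leq 1}} \boldsymbol{D}(\mathcal{N}_{AB \to A'B'}(\Phi_{RA} \otimes \Phi_{BS}) \| \mathcal{M}_{AB \to A'B'}(\Phi_{RA} \otimes \Phi_{BS})) \notag\\
&\geq \inf_{\substack{\mathcal{M}_{AB \to A'B'} : \\ \beta(\mathcal{M}_{AB \to A'B'}) \leq 1}} \boldsymbol{D}(\operatorname{Tr}[\mathcal{N}_{AB \to A'B'}(\Phi_{RA} \otimes \Phi_{BS})] \| \operatorname{Tr}[\mathcal{M}_{AB \to A'B'}(\Phi_{RA} \otimes \Phi_{BS})]) \notag\\
&= \inf_{\substack{\mathcal{M}_{AB \to A'B'} : \\ \beta(\mathcal{M}_{AB \to A'B'}) \leq 1}} \boldsymbol{D}(1 \| \operatorname{Tr}[\mathcal{M}_{AB \to A'B'}(\Phi_{RA} \otimes \Phi_{BS})]) \tag{18.3.229}
\end{align}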
\[
= \operatorname{Tr}[\Gamma^{\mathcal{M}}_{AA'BB'}], \tag{18.3.237}
\]
which is equivalent to
𝜆 ≥ Tr[M 𝐴𝐵→𝐴′ 𝐵′ (Φ 𝑅 𝐴 ⊗ Φ𝐵𝑆 )]. (18.3.238)
Taking an infimum over 𝜆, 𝑆 𝐴𝐴′ 𝐵𝐵′ , and 𝑉𝐴𝐴′ 𝐵𝐵′ satisfying the constraints in
(18.3.114) for M 𝐴𝐵→𝐴′ 𝐵′ and applying the assumption 𝛽(M 𝐴𝐵→𝐴′ 𝐵′ ) ≤ 1, we
conclude (18.3.230). ■
for every channel N 𝐴𝐵→𝐴′ 𝐵′ and completely positive map M 𝐴𝐵→𝐴′ 𝐵′ . Combining
with Proposition 18.9 and the definition in (18.3.227), we conclude (18.3.239). ■
𝚼(Δ𝐵→𝐴′ ) = 0. (18.3.242)
Proof: This follows from Proposition 18.10. Since 𝛽(Δ𝐵→𝐴′ ) = 1, we can pick
M𝐵→𝐴′ = Δ𝐵→𝐴′ , and then
𝑫 (Δ𝐵→𝐴′ ∥M𝐵→𝐴′ ) = 𝑫 (Δ𝐵→𝐴′ ∥Δ𝐵→𝐴′ ) = 0. (18.3.243)
Proof: Same argument as given for Proposition 18.19, but use Proposition 18.11
instead. ■
We now establish some properties that are more specific to the Belavkin–
Staszewski and geometric Rényi relative entropies (however the first actually holds
also for the quantum relative entropy and other quantum Rényi relative entropies).
Proposition 18.21
Let N 𝐴𝐵→𝐴′ 𝐵′ be a bipartite channel. Then for all 𝛼 ∈ (1, 2],
\[
\widehat{\Upsilon}(\mathcal{N}_{AB \to A'B'}) \leq \widehat{\Upsilon}_\alpha(\mathcal{N}_{AB \to A'B'}) \leq C_\beta(\mathcal{N}_{AB \to A'B'}). \tag{18.3.245}
\]
Proposition 18.23
Let M 𝐴→𝐵′ be a point-to-point completely positive map. Then
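\[
\beta(\mathcal{M}_{A \to B'}) := \inf_{S_{B'}, V_{AB'} \in \mathrm{Herm}} \left\{ \begin{array}{c} \operatorname{Tr}[S_{B'}] : \\ T_{BB'}(V_{AB'} \pm \Gamma^{\mathcal{M}}_{AB'}) \geq 0, \\ I_A \otimes S_{B'} \pm V_{AB'} \geq 0 \end{array} \right\}. \tag{18.3.247}
\]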
Proof: In this case, the systems 𝐴′ and 𝐵 are trivial. So then the definition in
(18.3.105) reduces to
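\[
\beta(\mathcal{M}_{A \to B'}) = \inf_{S_{AB'}, V_{AB'} \in \mathrm{Herm}} \left\{ \begin{array}{c} \left\| \operatorname{Tr}_{B'}[S_{AB'}] \right\|_\infty : \\ T_{BB'}(V_{AB'} \pm \Gamma^{\mathcal{M}}_{AB'}) \geq 0, \\ S_{AB'} \pm V_{AB'} \geq 0, \\ S_{AB'} = \pi_A \otimes \operatorname{Tr}_A[S_{AB'}] \end{array} \right\}. \tag{18.3.248}
\]
The last constraint implies that the optimization simplifies to
\begin{align}
\beta(\mathcal{M}_{A \to B'})
&= \inf_{S_{AB'}, V_{AB'} \in \mathrm{Herm}} \left\{ \begin{array}{c} \left\| \operatorname{Tr}_{B'}[\pi_A \otimes \operatorname{Tr}_A[S_{AB'}]] \right\|_\infty : \\ T_{BB'}(V_{AB'} \pm \Gamma^{\mathcal{M}}_{AB'}) \geq 0, \\ \pi_A \otimes \operatorname{Tr}_A[S_{AB'}] \pm V_{AB'} \geq 0 \end{array} \right\} \tag{18.3.249}\\
&= \inf_{S'_{B'}, V_{AB'} \in \mathrm{Herm}} \left\{ \begin{array}{c} \left\| \operatorname{Tr}_{B'}[\pi_A \otimes S'_{B'}] \right\|_\infty : \\ T_{BB'}(V_{AB'} \pm \Gamma^{\mathcal{M}}_{AB'}) \geq 0, \\ \pi_A \otimes S'_{B'} \pm V_{AB'} \geq 0 \end{array} \right\} \tag{18.3.250}\\
&= \inf_{S'_{B'}, V_{AB'} \in \mathrm{Herm}} \left\{ \begin{array}{c} \operatorname{Tr}[S'_{B'}] \left\| \pi_A \right\|_\infty : \\ T_{BB'}(V_{AB'} \pm \Gamma^{\mathcal{M}}_{AB'}) \geq 0, \\ \pi_A \otimes S'_{B'} \pm V_{AB'} \geq 0 \end{array} \right\} \tag{18.3.251}\\
&= \inf_{S'_{B'}, V_{AB'} \in \mathrm{Herm}} \left\{ \begin{array}{c} \frac{1}{d_A} \operatorname{Tr}[S'_{B'}] : \\ T_{BB'}(V_{AB'} \pm \Gamma^{\mathcal{M}}_{AB'}) \geq 0, \\ \pi_A \otimes S'_{B'} \pm V_{AB'} \geq 0 \end{array} \right\} \tag{18.3.252}\\
&= \inf_{S_{B'}, V_{AB'} \in \mathrm{Herm}} \left\{ \begin{array}{c} \operatorname{Tr}[S_{B'}] : \\ T_{BB'}(V_{AB'} \pm \Gamma^{\mathcal{M}}_{AB'}) \geq 0, \\ I_A \otimes S_{B'} \pm V_{AB'} \geq 0 \end{array} \right\}, \tag{18.3.253}
\end{align}
where the final equality follows from the substitution $S_{B'} = S'_{B'} / d_A$, so that $\pi_A \otimes S'_{B'} = I_A \otimes S_{B'}$.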
This concludes the proof. ■
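which leads to the quantities $\widehat{\Upsilon}(\mathcal{N}_{A \to B'})$ and $\widehat{\Upsilon}_\alpha(\mathcal{N}_{A \to B'})$, for which we have the
following bounds for $\alpha \in (1, 2]$:
\[
\widehat{\Upsilon}(\mathcal{N}_{A \to B'}) \leq \widehat{\Upsilon}_\alpha(\mathcal{N}_{A \to B'}) \leq C_\beta(\mathcal{N}_{A \to B'}). \tag{18.3.255}
\]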
The next proposition is critical for establishing our upper bound proofs in
Section 18.3.3.4. It states that if one share of a maximally classically correlated
state passes through a completely positive map M 𝐴→𝐵′ for which 𝛽(M 𝐴→𝐵′ ) ≤ 1,
then the resulting operator has a very small chance of passing the comparator test,
as defined in (18.3.258). (Recall that we previously used the comparator test in
(11.1.37) and (12.1.19).)
\[
\Pi_{\hat{A}B'} := \sum_{i=0}^{d-1} |i\rangle\!\langle i|_{\hat{A}} \otimes |i\rangle\!\langle i|_{B'}, \tag{18.3.258}
\]
and the following systems are isomorphic: $\hat{A}$, $A$, and $B'$.
Proof: Recall the expression for 𝛽(M 𝐴→𝐵′ ) in (18.3.247). Let 𝑆 𝐵′ and 𝑉𝐴𝐵′
be arbitrary Hermitian operators satisfying the constraints for 𝛽(M 𝐴→𝐵′ ). An
application of (4.2.6) implies that
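\[
\mathcal{M}_{A \to B'}(\Phi_{\hat{A}A}) = \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes \Gamma^{\mathcal{M}}_{\tilde{A}B'} |\Gamma\rangle_{A\tilde{A}}, \tag{18.3.259}
\]
where $\tilde{A} \simeq A$. This means that
\begin{align}
\operatorname{Tr}[\Pi_{\hat{A}B'} \mathcal{M}_{A \to B'}(\Phi_{\hat{A}A})]
&= \operatorname{Tr}[\Pi_{\hat{A}B'} \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes \Gamma^{\mathcal{M}}_{\tilde{A}B'} |\Gamma\rangle_{A\tilde{A}}] \tag{18.3.260}\\
&= \operatorname{Tr}[T_{B'}(\Pi_{\hat{A}B'}) \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes \Gamma^{\mathcal{M}}_{\tilde{A}B'} |\Gamma\rangle_{A\tilde{A}}] \tag{18.3.261}\\
&= \operatorname{Tr}[\Pi_{\hat{A}B'} \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes T_{B'}(\Gamma^{\mathcal{M}}_{\tilde{A}B'}) |\Gamma\rangle_{A\tilde{A}}] \tag{18.3.262}\\
&\leq \operatorname{Tr}[\Pi_{\hat{A}B'} \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes T_{B'}(V_{\tilde{A}B'}) |\Gamma\rangle_{A\tilde{A}}] \tag{18.3.263}\\
&= \operatorname{Tr}[T_{B'}(\Pi_{\hat{A}B'}) \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes V_{\tilde{A}B'} |\Gamma\rangle_{A\tilde{A}}] \tag{18.3.264}\\
&= \operatorname{Tr}[\Pi_{\hat{A}B'} \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes V_{\tilde{A}B'} |\Gamma\rangle_{A\tilde{A}}] \tag{18.3.265}\\
&\leq \operatorname{Tr}[\Pi_{\hat{A}B'} \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes I_{\tilde{A}} \otimes S_{B'} |\Gamma\rangle_{A\tilde{A}}] \tag{18.3.266}\\
&= \operatorname{Tr}[\Pi_{\hat{A}B'} \langle\Gamma|_{A\tilde{A}} \Phi_{\hat{A}A} \otimes I_{\tilde{A}} |\Gamma\rangle_{A\tilde{A}} \otimes S_{B'}] \tag{18.3.267}\\
&= \operatorname{Tr}[\Pi_{\hat{A}B'} \operatorname{Tr}_A[\Phi_{\hat{A}A}] \otimes S_{B'}] \tag{18.3.268}\\
&= \frac{1}{d} \operatorname{Tr}[\Pi_{\hat{A}B'} I_{\hat{A}} \otimes S_{B'}] \tag{18.3.269}\\
&= \frac{1}{d} \operatorname{Tr}[S_{B'}]. \tag{18.3.270}
\end{align}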
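Since this holds for all $S_{B'}$ and $V_{AB'}$ satisfying the constraints for $\beta(\mathcal{M}_{A \to B'})$, taking
the infimum over them bounds the left-hand side by $\beta(\mathcal{M}_{A \to B'})/d$. Applying the
assumption $\beta(\mathcal{M}_{A \to B'}) \leq 1$, we conclude that
\[
\operatorname{Tr}[\Pi_{\hat{A}B'} \mathcal{M}_{A \to B'}(\Phi_{\hat{A}A})] \leq \frac{1}{d}. \tag{18.3.271}
\]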
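As an illustration of the $1/d$ bound on the comparator test, the sketch below evaluates the pass probability for a classically correlated input, using only the diagonal output statistics of a channel. The helper name `comparator_pass_probability` and the example replacement channel $X \mapsto \operatorname{Tr}[X]\sigma$ are assumptions made for this example. For that channel, the choice $S_{B'} = \sigma$ and $V_{AB'} = I_A \otimes \sigma$ is feasible in (18.3.247), so it satisfies $\beta \leq 1$.

```python
from fractions import Fraction
from typing import List


def comparator_pass_probability(transition: List[List[Fraction]]) -> Fraction:
    """Probability of passing the comparator test Pi on a classically correlated input.

    transition[i][j] = <j| N(|i><i|) |j>, i.e. the probability that basis
    input i produces outcome j when the output is measured in the same basis.
    The input state is (1/d) * sum_i |i><i| (x) |i><i|.
    """
    d = len(transition)
    return sum(transition[i][i] for i in range(d)) / d


d = 3

# Hypothetical replacement channel: every input is mapped to a fixed state sigma
# with diagonal entries (1/2, 1/3, 1/6).
sigma_diag = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
replacement = [sigma_diag[:] for _ in range(d)]

# Identity channel, for contrast: basis inputs are preserved exactly.
identity = [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]

print(comparator_pass_probability(replacement))  # saturates the 1/d bound
print(comparator_pass_probability(identity))     # a perfect channel always passes
```

The contrast between the two cases is the content of the proposition: a map with $\beta \leq 1$ can do no better than random guessing on the comparator test, while a noiseless channel passes with certainty.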
We finally state another proposition that plays an essential role in our upper
bound proofs in Section 18.3.3.4.
Proposition 18.25
Suppose that N 𝐴→𝐵 is a channel with 𝐴 isomorphic to 𝐵 that satisfies
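\[
\frac{1}{2} \left\| \mathcal{N}_{A \to B}(\Phi_{RA}) - \Phi_{RB} \right\|_1 \leq \varepsilon, \tag{18.3.272}
\]
for $\varepsilon \in [0, 1)$ and where $\Phi_{RB} := \frac{1}{d} \sum_i |i\rangle\!\langle i|_R \otimes |i\rangle\!\langle i|_B$ and $d = d_R = d_A = d_B$.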
Then
implies that
Tr[Π 𝑅𝐵 N 𝐴→𝐵 (Φ 𝑅 𝐴 )] ≥ 1 − 𝜀, (18.3.276)
where $\Pi_{RB} := \sum_i |i\rangle\!\langle i|_R \otimes |i\rangle\!\langle i|_B$ is the comparator test. Indeed, applying a
completely dephasing channel $\Delta_B(\cdot) := \sum_i |i\rangle\!\langle i| (\cdot) |i\rangle\!\langle i|$ to the output of the channel
$\mathcal{N}_{A \to B}$ and applying the data-processing inequality for trace distance, we conclude
that
\begin{align}
\varepsilon &\geq \frac{1}{2} \left\| \mathcal{N}_{A \to B}(\Phi_{RA}) - \Phi_{RB} \right\|_1 \tag{18.3.277}\\
&\geq \frac{1}{2} \left\| (\Delta_B \circ \mathcal{N}_{A \to B})(\Phi_{RA}) - \Delta_B(\Phi_{RB}) \right\|_1 \tag{18.3.278}\\
&= \frac{1}{2} \left\| (\Delta_B \circ \mathcal{N}_{A \to B})(\Phi_{RA}) - \Phi_{RB} \right\|_1. \tag{18.3.279}
\end{align}
So we conclude that
Tr[Π 𝑅𝐵 N 𝐴→𝐵 (Φ 𝑅 𝐴 )] ≥ 1 − 𝜀. (18.3.291)
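The inequality in (18.3.291) can be checked numerically in a simple case. The sketch below assumes a depolarizing channel of the form $\rho \mapsto (1-p)\rho + p I/d$ and the classically correlated state $\Phi$ defined in the proposition. For this channel the output state is already diagonal, so the trace distance reduces to a classical total variation distance. The function names are illustrative and not from the text.

```python
from typing import List


def depolarizing_transition(d: int, p: float) -> List[List[float]]:
    """transition[i][j] = <j| D_p(|i><i|) |j> for D_p(rho) = (1 - p) rho + p I / d."""
    return [[(1.0 - p) * (1.0 if i == j else 0.0) + p / d for j in range(d)]
            for i in range(d)]


def comparator_pass_probability(transition: List[List[float]]) -> float:
    """Tr[Pi N(Phi)] for the input Phi = (1/d) sum_i |i><i| (x) |i><i|."""
    d = len(transition)
    return sum(transition[i][i] for i in range(d)) / d


def diagonal_trace_distance(transition: List[List[float]]) -> float:
    """(1/2) || N(Phi_RA) - Phi_RB ||_1, valid when N(Phi_RA) is diagonal."""
    d = len(transition)
    total = 0.0
    for i in range(d):
        for j in range(d):
            target = 1.0 / d if i == j else 0.0
            total += abs(transition[i][j] / d - target)
    return total / 2.0


d, p = 2, 0.3
T = depolarizing_transition(d, p)
eps = diagonal_trace_distance(T)
passing = comparator_pass_probability(T)
print(eps, passing)
```

In this example the bound holds with equality up to rounding, since both the trace distance and the failure probability of the comparator test equal $p(1 - 1/d)$.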
Applying the definition of the hypothesis testing relative entropy from Defini-
tion 7.65, we conclude that
as well as the previous proposition. The proof of (18.3.296) follows the same proof
given for Proposition 7.71. ■
We now have everything that we need to establish that the geometric Υ-information
is an upper bound on the number of bits that can be transmitted by means of
a quantum channel assisted by a classical feedback channel. By examining the
protocol in Section 18.1, consider that the final state $\omega^p_{M\hat{M}}$ of the protocol can be
written as follows:
\[
\omega^p_{M\hat{M}} = \mathcal{P}_{M' \to \hat{M}}(\Phi^p_{MM'}), \tag{18.3.297}
\]
where
and $\mathcal{A}$ is an appending channel that appends the state $\Delta_{F_0}(\Psi_{F_0 B_0'})$ to the input state
$\Phi^p_{MM'}$. In (18.3.298), we have omitted all system labels for simplicity.
We now state the main result of this section:
Theorem 18.26
Fix 𝑛 ∈ N, 𝜀 ∈ [0, 1), and 𝛼 ∈ (1, 2], and let N 𝐴→𝐵 be a quantum channel. For
all (𝑛, |M| , 𝜀) classical-feedback-assisted classical communication protocols
over the channel N 𝐴→𝐵 , the following bound holds
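\[
\frac{\log_2 |\mathcal{M}|}{n} \leq \widehat{\Upsilon}_\alpha(\mathcal{N}_{A \to B}) + \frac{\alpha}{n(\alpha - 1)} \log_2\!\left(\frac{1}{1-\varepsilon}\right), \tag{18.3.299}
\]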
Proof: Consider an arbitrary (𝑛, |M| , 𝜀) protocol of the form described in Sec-
tion 18.1, with final state as given in (18.3.297). Let the distribution 𝑝 over the
messages be the uniform distribution. Since the condition 𝑝 ∗err (C) ≤ 𝜀 holds, with
𝑝 ∗err defined in (18.1.12), we can apply (18.3.274) of Proposition 18.25 to conclude
that
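\begin{align}
&\quad + \frac{\alpha}{\alpha - 1} \log_2\!\left(\frac{1}{1-\varepsilon}\right) \tag{18.3.300}\\
&\leq \widehat{\Upsilon}_\alpha(\mathcal{P}_{M' \to \hat{M}}) + \frac{\alpha}{\alpha - 1} \log_2\!\left(\frac{1}{1-\varepsilon}\right), \tag{18.3.301}
\end{align}
where the second inequality follows from the definition in with 𝑫 set to $\widehat{D}_\alpha$.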
Eq. (18.3.298) indicates that the whole protocol is a serial composition of bipartite
channels. Then we find that
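\begin{align}
\widehat{\Upsilon}_\alpha(\mathcal{P}_{M' \to \hat{M}})
&= \widehat{\Upsilon}_\alpha(\mathcal{D}^n \circ \mathcal{N} \circ \mathcal{E}^{n-1} \circ \Delta \circ \mathcal{D}^{n-1} \circ \mathcal{N} \circ \mathcal{E}^{n-2} \circ \Delta \circ \mathcal{D}^{n-2} \circ \tag{18.3.302}\\
&\qquad \cdots \circ \mathcal{D}^2 \circ \mathcal{N} \circ \mathcal{E}^1 \circ \Delta \circ \mathcal{D}^1 \circ \mathcal{N} \circ \mathcal{E}^0 \circ \mathcal{A}) \tag{18.3.303}\\
&\leq n \widehat{\Upsilon}_\alpha(\mathcal{N}) + n \widehat{\Upsilon}_\alpha(\Delta) + \sum_{i=1}^{n} \widehat{\Upsilon}_\alpha(\mathcal{D}^i) + \sum_{i=0}^{n-1} \widehat{\Upsilon}_\alpha(\mathcal{E}^i) + \widehat{\Upsilon}_\alpha(\mathcal{A}) \tag{18.3.304}\\
&= n \widehat{\Upsilon}_\alpha(\mathcal{N}). \tag{18.3.305}
\end{align}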
The inequality follows from Proposition 18.22. The last equality follows from
Propositions 18.18, 18.20, and 18.19 because each encoding channel E𝑖 and
decoding channel D𝑖 is a local channel and Δ is a classical feedback channel. We
also implicitly used the stability property in Proposition 18.18. Putting everything
together, we conclude that
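\[
\log_2 |\mathcal{M}| \leq n \widehat{\Upsilon}_\alpha(\mathcal{N}) + \frac{\alpha}{\alpha - 1} \log_2\!\left(\frac{1}{1-\varepsilon}\right), \tag{18.3.306}
\]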
which is equivalent to the desired bound in (18.3.299). ■
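Rearranging (18.3.306) for a protocol with rate $R = \log_2 |\mathcal{M}| / n$ gives $1 - \varepsilon \leq 2^{-n \frac{\alpha - 1}{\alpha} (R - \widehat{\Upsilon}_\alpha(\mathcal{N}))}$, which makes the strong converse behaviour visible. The sketch below evaluates this rearranged bound. The function name and the numerical value used for $\widehat{\Upsilon}_\alpha$ are placeholders chosen for illustration, not values computed from any channel.

```python
def success_probability_bound(rate: float, upsilon_alpha: float,
                              alpha: float, n: int) -> float:
    """Upper bound on 1 - eps implied by (18.3.306) for a rate-`rate` protocol.

    `upsilon_alpha` stands for the geometric Renyi Upsilon-information of the
    channel at order `alpha`; it must be supplied by the caller.
    """
    if not 1.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (1, 2]")
    if n < 1:
        raise ValueError("n must be a positive integer")
    exponent = n * (alpha - 1.0) / alpha * (rate - upsilon_alpha)
    if exponent <= 0.0:
        return 1.0  # the bound is trivial at or below the threshold rate
    return 2.0 ** (-exponent)


# Placeholder numbers: a threshold of 1.0 bit and a rate of 1.2 bits per use.
for n in (10, 100, 1000):
    print(n, success_probability_bound(1.2, 1.0, alpha=2.0, n=n))
```

For any fixed rate above the threshold, the success probability bound decays exponentially in $n$, whereas rates at or below the threshold are not constrained by this inequality.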
\[
C_{\mathrm{CFB}}(\mathcal{N}) = \widetilde{C}_{\mathrm{CFB}}(\mathcal{N}) = \chi(\mathcal{N}), \tag{18.4.3}
\]
Proof: The lower bound $\chi(\mathcal{N}) \leq C_{\mathrm{CFB}}(\mathcal{N})$ follows from Theorem 12.13 (i.e.,
not making use of the classical feedback channel at all). The upper bound
$\widetilde{C}_{\mathrm{CFB}}(\mathcal{N}) \leq \chi(\mathcal{N})$ follows from (18.3.6) of Theorem 18.2, and by reasoning similar
to that given in the proof of Theorem 12.19. ■
Proof: This is a direct consequence of Theorem 18.6 and reasoning similar to that
given for the proof around (12.2.43). ■
Proof: This is a direct consequence of the upper bound in Theorem 18.26 and
reasoning similar to that given in the proof of Theorem . We also require the fact
that the geometric Rényi relative entropy converges to the Belavkin–Staszewski
relative entropy in the limit as 𝛼 → 1 (see Proposition 7.52). ■
18.5 Examples
In this section, we briefly provide some examples of channels for which we evaluate
the capacity upper bounds in Section 18.4. We begin with the quantum erasure
channel (see Section 4.5.2). Recall that a quantum erasure channel acts as follows
on an input density operator 𝜌:
where 𝑝 ∈ [0, 1] is the erasure probability and |𝑒⟩⟨𝑒| is an erasure state orthogonal
to every possible input. Let 𝑑 be the dimension of the input to the channel. By
inspection, we see that the erasure channel is a probabilistic mixture of an identity
channel and a channel that traces out the input and replaces it with the erasure state.
Thus, we apply Theorem 18.32 to conclude that
Since this upper bound is achievable for classical communication over the erasure
channel without feedback (see Theorem 12.33), we then conclude that
That is, classical feedback does not increase the classical capacity of the erasure
channel.
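The erasure-channel calculation can be sketched numerically. The example below assumes the decomposition described above (identity with probability $1-p$, replacement by the pure erasure state with probability $p$) and evaluates the average output entropy for an input with a given spectrum. The function names are illustrative. Since the erasure branch always outputs a pure state, only the identity branch contributes, and the supremum is attained by the maximally mixed input.

```python
import math
from typing import Sequence


def entropy_bits(spectrum: Sequence[float]) -> float:
    """Von Neumann entropy in bits of a state with the given eigenvalues."""
    return -sum(x * math.log2(x) for x in spectrum if x > 0.0)


def average_output_entropy(spectrum: Sequence[float], p: float) -> float:
    """(1 - p) H(rho) + p H(|e><e|), where the erasure state |e><e| is pure."""
    erasure_branch_entropy = 0.0
    return (1.0 - p) * entropy_bits(spectrum) + p * erasure_branch_entropy


def erasure_feedback_bound(d: int, p: float) -> float:
    """Supremum of the average output entropy over inputs: (1 - p) log2 d."""
    return (1.0 - p) * math.log2(d)


d, p = 2, 0.25
maximally_mixed = [1.0 / d] * d
print(average_output_entropy(maximally_mixed, p), erasure_feedback_bound(d, p))
```

The value $(1-p)\log_2 d$ matches the unassisted classical capacity of the erasure channel, which is the sense in which feedback does not help here.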
Finally, we evaluate the bound in Theorem 18.33 for the qubit depolarizing
channel. Recall from Section 16.32 that it is defined as
It was already established in Section 16.32 that Υ(D 𝑝 ) is an upper bound on its
(unassisted) classical capacity, and we discussed in Section 16.32 how the Holevo
information is equal to its classical capacity. What we find now is that Υ(D 𝑝 ) is
an upper bound on its classical capacity assisted by a classical feedback channel.
Figure 18.1 plots this upper bound and also plots the Holevo information lower
bound when 𝑑 = 2. The latter is given by 1 − ℎ2 ( 𝑝/2), where ℎ2 is the binary
entropy function. Note that the depolarizing channel is entanglement breaking for
$p \geq \frac{d}{d+1}$. As such, the bounds from Theorem 18.31 apply, so that, for $p \geq \frac{d}{d+1}$,
the Holevo information 1 − ℎ2 ( 𝑝/2) is equal to the classical capacity assisted by
classical feedback.
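The two closed-form facts quoted above for the qubit depolarizing channel, the Holevo information $1 - h_2(p/2)$ and the entanglement-breaking threshold $p \geq d/(d+1)$, can be tabulated with a short sketch. The function names are assumptions made for this example, and the Υ-information curve of Figure 18.1 is not reproduced here because it requires solving an optimization problem.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy h_2(x) in bits, with h_2(0) = h_2(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def holevo_information_qubit_depolarizing(p: float) -> float:
    """Holevo information 1 - h_2(p / 2) of the qubit depolarizing channel."""
    return 1.0 - binary_entropy(p / 2.0)


def is_entanglement_breaking(p: float, d: int = 2) -> bool:
    """Threshold quoted in the text: entanglement breaking for p >= d / (d + 1)."""
    return p >= d / (d + 1)


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, holevo_information_qubit_depolarizing(p), is_entanglement_breaking(p))
```

For $p$ at or above $2/3$, the printed Holevo information is the exact feedback-assisted capacity according to the statement above. Below that threshold it is only a lower bound.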
18.6 Summary
In this chapter, we developed the general theory of classical communication over a
quantum channel assisted by classical feedback from receiver to sender. Our main
focus was on establishing upper bounds on this capacity. The main findings of this
chapter are as follows:
1. We first proved that classical feedback does not enhance the classical capacity
of an entanglement-breaking channel.
2. Next, we established that the average output entropy of a channel is a weak
converse upper bound on the feedback-assisted capacity. The method for
establishing this average entropy bound involves identifying an information
measure that has two key properties: 1) it does not increase under a one-way
local operations and classical communication channel from the receiver to the
sender and 2) a quantum channel from sender to receiver cannot increase the
information measure by more than the maximum average output entropy of the
channel. This information measure can be understood as the sum of two terms,
with one corresponding to classical correlation and the other to entanglement.
3. We finally established a general strong converse upper bound on the feedback-
assisted capacity, in terms of the geometric Υ-information of a quantum
channel. The main method for doing so was to devise an information measure
for bipartite channels that is equal to zero for classical feedback channels and
products of local channels.
[Figure 18.1: plot of rate versus p for the qubit depolarizing channel; legend: Holevo information, Upsilon information, entanglement-breaking.]
does not increase the capacity of entanglement-breaking channels. This result was
strengthened to a strong converse statement by Ding and Wilde (2018). Smith
and Smolin (2009) provided an example of a channel for which classical feedback
can significantly enhance the classical capacity. Bennett et al. (2006) related the
feedback-assisted capacity to other capacities in quantum Shannon theory, and
García-Patrón et al. (2018) related it to other notions of feedback-assisted capacity.
Ding et al. (2019) established the entropy upper bound on the feedback-assisted
capacity, and Ding et al. (2023) established the geometric Υ-information upper
bound on the strong converse feedback-assisted capacity.
Chapter 19
LOCC-Assisted Quantum
Communication
This chapter develops an important variation of quantum communication, in which
we allow the sender and receiver the free use of classical communication. That is,
in between every use of a given quantum communication channel N 𝐴→𝐵 , the sender
and receiver are allowed to perform local operations and classical communication
(LOCC). For this reason, the capacity considered in this chapter is called the
LOCC-assisted quantum capacity.
The practical motivation for this kind of feedback-assisted quantum capacity
comes from the fact that, these days, classical communication is rather cheap
and plentiful. Thus, from a resource-theoretic perspective, it can be sensible to
simply allow classical communication as a free resource (similar to how we did for
entanglement-assisted communication in Chapter 11). Then our goal is to place
informative bounds on the rate at which quantum information can be communicated
from the sender to the receiver in this setting. Furthermore, these bounds are
relevant for understanding and placing limitations on the speed at which distributed
quantum computation can be carried out.
In order to establish upper bounds on LOCC-assisted quantum capacity, we
revisit the concept of amortization introduced in Section 17.1.3. However, in
this context, we proceed somewhat differently, instead employing entanglement
measures to quantify how much entanglement can be generated by multiple uses of a
quantum channel. Then we define the amortized entanglement of a quantum channel
as the largest difference between the output and input entanglement of the channel.
is achievable for 𝑛 copies of $\omega_{\tilde{A}B}$, as 𝑛 → ∞.
3. Using the distilled maximally entangled state, along with 2 log2 𝑑 bits of
classical communication, Alice and Bob perform the quantum teleportation
protocol to transmit the 𝐴′ part of an arbitrary pure state Ψ𝑅 𝐴′ from Alice to
Bob, with 𝑑 𝐴′ = 𝑑.
Since there are 𝑛 uses of the channel in this strategy, we see that as 𝑛 → ∞, the
rate of this strategy (the number of qubits per channel use) is
$\frac{\log_2 d}{n} = I(\tilde{A}\rangle B)_{\omega}$.
By optimizing over all initial pure states $\psi_{A\tilde{A}}$ prepared by Alice, we find that, in
the asymptotic setting, the rate $\sup_{\psi_{A\tilde{A}}} I(\tilde{A}\rangle B)_{\omega} = I^{c}(\mathcal{N})$ is achievable, where we
[Figure: a reference system and Alice initially share the state $\Psi_{RA'}$. Alice and Bob apply one-way entanglement distillation to the output of $\mathcal{N}^{\otimes n}$ and then perform teleportation, after which the reference and Bob share $\Psi_{RB'}$.]
Let T 𝜔𝐴→𝐵 denote the channel realized by teleportation over the unideal state 𝜔 𝐴′ 𝐵′ :
We would then like to determine the deviation of the ideal channel from T 𝜔𝐴→𝐵 , and
to do so, we can employ the normalized diamond distance. Then consider that,
from the data-processing inequality for trace distance,
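\begin{align}
\left\| \mathcal{T}^{\omega}_{A\to B} - \mathrm{id}_{A\to B} \right\|_{\diamond}
&= \sup_{\psi_{RA}} \left\| \mathcal{T}^{\omega}_{A\to B}(\psi_{RA}) - \mathrm{id}_{A\to B}(\psi_{RA}) \right\|_{1} \tag{19.1.4} \\
&= \sup_{\psi_{RA}} \left\| \mathcal{T}_{AA'B'\to B}(\psi_{RA}\otimes\omega_{A'B'}) - \mathcal{T}_{AA'B'\to B}(\psi_{RA}\otimes\Phi_{A'B'}) \right\|_{1} \tag{19.1.5} \\
&\leq \sup_{\psi_{RA}} \left\| \psi_{RA}\otimes\omega_{A'B'} - \psi_{RA}\otimes\Phi_{A'B'} \right\|_{1} \tag{19.1.6} \\
&= \left\| \omega_{A'B'} - \Phi_{A'B'} \right\|_{1} \leq 2\varepsilon, \tag{19.1.7}
\end{align}
so that we arrive at the desired statement mentioned above:
\[
\frac{1}{2}\left\| \omega_{A'B'} - \Phi_{A'B'} \right\|_{1} \leq \varepsilon
\quad\Rightarrow\quad
\frac{1}{2}\left\| \mathcal{T}^{\omega}_{A\to B} - \mathrm{id}_{A\to B} \right\|_{\diamond} \leq \varepsilon. \tag{19.1.8}
\]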
Thus, for the above reason, we focus exclusively on LOCC-assisted protocols
whose aim is to generate an approximate maximally entangled state. In what follows,
all bipartite cuts for separable states or LOCC channels should be understood as
being between Alice’s and Bob’s systems.
A protocol for LOCC-assisted quantum communication is depicted in Figure [REF],
and it is defined by the following elements:
\[
\left( \rho^{(1)}_{A_1' A_1 B_1'},\ \{ \mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i'} \}_{i=2}^{n},\ \mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to M_A M_B} \right), \tag{19.1.9}
\]
where $\rho^{(1)}_{A_1' A_1 B_1'}$ is a separable state, $\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i'}$ is an LOCC channel for
𝑖 ∈ {2, . . . , 𝑛}, and $\mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to M_A M_B}$ is a final LOCC channel that generates the
approximate maximally entangled state in systems $M_A$ and $M_B$. Let C denote
all of these elements, which together constitute the LOCC-assisted quantum
communication code. All systems with primed labels should be understood as
local quantum memory or scratch registers that Alice or Bob can employ in this
information-processing task. They are also assumed to be finite-dimensional, but
could be arbitrarily large. The unprimed systems are the ones that are either input
to or output from the quantum communication channel N 𝐴→𝐵 .
The LOCC-assisted quantum communication protocol begins with Alice and
Bob performing an LOCC channel $\mathcal{L}^{(1)}_{\emptyset \to A_1' A_1 B_1'}$, which leads to the separable state
$\rho^{(1)}_{A_1' A_1 B_1'}$ mentioned above, where $A_1'$ and $B_1'$ are systems that are finite-dimensional
but arbitrarily large. The system $A_1$ is such that it can be fed into the first channel
use. Alice sends system $A_1$ through the first channel use, leading to a state
\[
\omega^{(1)}_{A_1' B_1 B_1'} := \mathcal{N}_{A_1 \to B_1}(\rho^{(1)}_{A_1' A_1 B_1'}). \tag{19.1.10}
\]
Alice and Bob then perform the LOCC channel $\mathcal{L}^{(2)}_{A_1' B_1 B_1' \to A_2' A_2 B_2'}$, which leads to
the state
\[
\rho^{(2)}_{A_2' A_2 B_2'} := \mathcal{L}^{(2)}_{A_1' B_1 B_1' \to A_2' A_2 B_2'}(\omega^{(1)}_{A_1' B_1 B_1'}). \tag{19.1.11}
\]
Alice sends system $A_2$ through the second channel use $\mathcal{N}_{A_2 \to B_2}$, leading to the state
\[
\omega^{(2)}_{A_2' B_2 B_2'} := \mathcal{N}_{A_2 \to B_2}(\rho^{(2)}_{A_2' A_2 B_2'}). \tag{19.1.12}
\]
This process iterates: the protocol uses the channel 𝑛 times. In general, we have the
following states for all 𝑖 ∈ {2, . . . , 𝑛}:
\begin{align}
\rho^{(i)}_{A_i' A_i B_i'} &:= \mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i'}(\omega^{(i-1)}_{A_{i-1}' B_{i-1} B_{i-1}'}), \tag{19.1.13} \\
\omega^{(i)}_{A_i' B_i B_i'} &:= \mathcal{N}_{A_i \to B_i}(\rho^{(i)}_{A_i' A_i B_i'}), \tag{19.1.14}
\end{align}
where $\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i'}$ is an LOCC channel. The final step of the protocol
consists of an LOCC channel $\mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to M_A M_B}$, which generates the systems $M_A$
and $M_B$ for Alice and Bob, respectively. The protocol’s final state is as follows:
\[
\omega_{M_A M_B} := \mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to M_A M_B}(\omega^{(n)}_{A_n' B_n B_n'}). \tag{19.1.15}
\]
The goal of the protocol is for the final state 𝜔 𝑀 𝐴 𝑀𝐵 to be close to a maximally
entangled state, and we define the quantum error probability of the code as follows:
where 𝐹 denotes the quantum fidelity (Definition 6.5) and the maximally entangled
state $\Phi_{M_A M_B} = |\Phi\rangle\langle\Phi|_{M_A M_B}$ is defined from
\[
|\Phi\rangle_{M_A M_B} := \frac{1}{\sqrt{M}} \sum_{m=1}^{M} |m\rangle_{M_A} \otimes |m\rangle_{M_B}, \tag{19.1.18}
\]
such that it has Schmidt rank 𝑀. Intuitively, the quantum error probability 𝑞 err (C)
is equal to the probability that one obtains the outcome “not maximally entangled
state Φ 𝑀 𝐴 𝑀𝐵 ” when performing the test or measurement {Φ 𝑀 𝐴 𝑀𝐵 , 𝐼 𝑀 𝐴 𝑀𝐵 −Φ 𝑀 𝐴 𝑀𝐵 }
on the final state 𝜔 𝑀 𝐴 𝑀𝐵 of the protocol.
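To make the measurement interpretation concrete, here is a minimal Python sketch. It assumes that the quantum error probability takes the form 1 − ⟨Φ|𝜔|Φ⟩ (the probability of the outcome "not Φ" in the test described above), and it uses an isotropic state as a hypothetical final state; the function names and the noise parameter are illustrative only.

```python
import math


def maximally_entangled_vector(M):
    """Amplitudes of |Phi> = (1/sqrt(M)) * sum_m |m>|m> in the product basis."""
    amplitude = 1.0 / math.sqrt(M)
    vector = [0.0] * (M * M)
    for m in range(M):
        vector[m * M + m] = amplitude  # component on |m>|m>
    return vector


def isotropic_state(M, noise):
    """Hypothetical final state (1 - noise) * Phi + noise * I / M^2."""
    phi = maximally_entangled_vector(M)
    dim = M * M
    return [[(1.0 - noise) * phi[r] * phi[c] + (noise / dim if r == c else 0.0)
             for c in range(dim)] for r in range(dim)]


def error_probability(omega, M):
    """Assumed form 1 - <Phi|omega|Phi>: probability of the 'not Phi' outcome."""
    phi = maximally_entangled_vector(M)
    dim = M * M
    overlap = sum(phi[r] * omega[r][c] * phi[c]
                  for r in range(dim) for c in range(dim))
    return 1.0 - overlap


q_err = error_probability(isotropic_state(2, 0.1), 2)
print(f"q_err = {q_err:.4f}")
```

For this family the overlap with Φ is (1 − noise) + noise/𝑀², so the error probability grows linearly with the noise parameter.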
[IN PROGRESS]
one-shot lower bound in terms of coherent information of a state. This achieves
coherent information of a channel, as well as reverse coherent information of a
channel.
Proposition 19.2
Let N 𝐴→𝐵 be a quantum channel, let 𝜀 ∈ [0, 1], and let 𝐸 be an entanglement
measure that is equal to zero for all separable states. For an (𝑛, 𝑀, 𝜀) LOCC-
assisted quantum communication protocol with final state 𝜔 𝑀 𝐴 𝑀𝐵 , the following
bound holds
𝐸 (𝑀 𝐴 ; 𝑀𝐵 )𝜔 ≤ 𝑛 · 𝐸 A (N). (19.1.19)
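\begin{align}
E(M_A; M_B)_{\omega}
&\leq E(A_n'; B_n B_n')_{\omega^{(n)}} \tag{19.1.20} \\
&= E(A_n'; B_n B_n')_{\omega^{(n)}} - E(A_1' A_1; B_1')_{\rho^{(1)}} \tag{19.1.21} \\
&= E(A_n'; B_n B_n')_{\omega^{(n)}} + \left[ \sum_{i=2}^{n} E(A_i' A_i; B_i')_{\rho^{(i)}} - E(A_i' A_i; B_i')_{\rho^{(i)}} \right] - E(A_1' A_1; B_1')_{\rho^{(1)}} \tag{19.1.22} \\
&\leq \sum_{i=1}^{n} \left[ E(A_i'; B_i B_i')_{\omega^{(i)}} - E(A_i' A_i; B_i')_{\rho^{(i)}} \right] \tag{19.1.23} \\
&\leq n \cdot E^{\mathcal{A}}(\mathcal{N}). \tag{19.1.24}
\end{align}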
The first inequality follows from monotonicity of 𝐸 with respect to the final LOCC
channel $\mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to M_A M_B}$. The first equality follows because the state
$\rho^{(1)}_{A_1' A_1 B_1'}$ is a separable state, and by
assumption, the entanglement measure 𝐸 vanishes for all such states. The second
equality follows by adding and subtracting equal terms. The second inequality
follows because $E(A_i' A_i; B_i')_{\rho^{(i)}} \leq E(A_{i-1}'; B_{i-1} B_{i-1}')_{\omega^{(i-1)}}$ for all 𝑖 ∈ {2, . . . , 𝑛},
due to monotonicity of the entanglement measure 𝐸 with respect to LOCC channels.
The final inequality follows from the definition of amortized entanglement and
the fact that the states $\omega^{(i)}_{A_i' B_i B_i'}$ and $\rho^{(i)}_{A_i' A_i B_i'}$ are particular states to consider in its
optimization. ■
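The telescoping argument in (19.1.20)–(19.1.24) can be checked on toy numbers. The Python sketch below is only an illustration under stated assumptions: the lists `e_rho` and `e_omega` are made-up entanglement values for the states 𝜌^(𝑖) and 𝜔^(𝑖), chosen to satisfy the two facts used in the proof (𝜌^(1) has zero entanglement, and LOCC cannot increase entanglement between rounds). The function name is hypothetical.

```python
def total_entanglement_gain(e_rho, e_omega):
    """Sum over rounds of E(omega^(i)) - E(rho^(i)), as on line (19.1.23).

    e_rho[i] and e_omega[i] hold the entanglement before and after
    channel use i + 1. The constraints used in the proof are validated first.
    """
    if len(e_rho) != len(e_omega) or not e_rho:
        raise ValueError("need one (rho, omega) pair per channel use")
    if e_rho[0] != 0.0:
        raise ValueError("the initial state must have zero entanglement")
    for i in range(1, len(e_rho)):
        if e_rho[i] > e_omega[i - 1]:
            raise ValueError("LOCC monotonicity violated between rounds")
    return sum(w - r for r, w in zip(e_rho, e_omega))


# Toy values for n = 3 channel uses (illustrative numbers only).
e_rho = [0.0, 0.3, 0.9]
e_omega = [0.5, 1.0, 1.4]

gain = total_entanglement_gain(e_rho, e_omega)
final_entanglement = e_omega[-1]
largest_single_gain = max(w - r for r, w in zip(e_rho, e_omega))

print(f"final entanglement   = {final_entanglement}")
print(f"sum of per-use gains = {gain:.3f}")
print(f"n * largest gain     = {len(e_rho) * largest_single_gain:.3f}")
```

The largest single-use gain plays the role of the amortized entanglement here, although in the actual definition the optimization is over all input states rather than over the states of one protocol.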
The inequality in (19.1.19) states that the entanglement of the final output state
𝜔 𝑀 𝐴 𝑀𝐵 , as quantified by 𝐸, cannot exceed 𝑛 times the amortized entanglement of
the channel N 𝐴→𝐵 . Intuitively, the only resource allowed in the protocol, which
has the potential to generate entanglement, is the quantum communication channel
N 𝐴→𝐵 . All of the LOCC channels allowed for free have no ability to generate
entanglement on their own. Thus, the entanglement of the final state should not
exceed the largest possible amount of entanglement that could ever be generated
with 𝑛 calls to the channel, and this largest entanglement is exactly the amortized
entanglement of the channel.
Observe that the bound in Proposition 19.2 depends on the final state 𝜔 𝑀 𝐴 𝑀𝐵 ,
and thus it is not a universal bound (that is, one depending only on the parameters
𝑛, 𝑀, and 𝜀), because this state in turn depends on the entire protocol. Similar to the upper
bounds established in previous chapters, it is desirable to refine this bound such that
it depends only on 𝑛, 𝑀, and 𝜀, which are the parameters characterizing any generic
LOCC-assisted quantum communication protocol. In the forthcoming sections, we
consider particular entanglement measures, such as squashed entanglement and
Rains relative entropy, which allow us to relate the parameters 𝑀 and 𝜀 to the final
state 𝜔 𝑀 𝐴 𝑀𝐵 .
To end this section, we note here that the bound in Proposition 19.2 simplifies for
teleportation-simulable channels and for entanglement measures that are subadditive
with respect to states and equal to zero for all separable states. This conclusion is a
consequence of Propositions 10.6 and 19.2:
We now establish the squashed entanglement upper bound on the number of qubits
that a sender can transmit to a receiver by employing an LOCC-assisted quantum
communication protocol:
where the equality follows from Theorem 10.20. Applying Definition 19.1 leads to
𝐹 (Φ 𝑀 𝐴 𝑀𝐵 , 𝜔 𝑀 𝐴 𝑀𝐵 ) ≥ 1 − 𝜀. (19.1.28)
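\begin{align}
E_{\mathrm{sq}}(M_A; M_B)_{\omega}
&\geq E_{\mathrm{sq}}(M_A; M_B)_{\Phi} - \left( \sqrt{\varepsilon}\, \log_2 \min\{ |M_A|, |M_B| \} + g_2(\sqrt{\varepsilon}) \right) \tag{19.1.29} \\
&= \log_2 M - \left( \sqrt{\varepsilon}\, \log_2 M + g_2(\sqrt{\varepsilon}) \right) \tag{19.1.30} \\
&= (1 - \sqrt{\varepsilon}) \log_2 M - g_2(\sqrt{\varepsilon}). \tag{19.1.31}
\end{align}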
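As a rough numerical companion to (19.1.31), the Python sketch below evaluates the lower bound and, by combining it with the bound of Proposition 19.2 applied to squashed entanglement, the resulting upper limit on log2 𝑀. Two assumptions are made and should be checked against the definitions used elsewhere in the text: that 𝑔2(𝛿) = (𝛿 + 1) log2(𝛿 + 1) − 𝛿 log2 𝛿, and that the amortized squashed entanglement of the channel is supplied as a plain number `e_sq_amortized`. All function names are hypothetical.

```python
import math


def g2(delta: float) -> float:
    """Assumed form g2(d) = (d + 1) log2(d + 1) - d log2(d), with g2(0) = 0."""
    if delta <= 0.0:
        return 0.0
    return (delta + 1.0) * math.log2(delta + 1.0) - delta * math.log2(delta)


def squashed_entanglement_lower_bound(M: int, eps: float) -> float:
    """Right-hand side of (19.1.31): (1 - sqrt(eps)) log2 M - g2(sqrt(eps))."""
    s = math.sqrt(eps)
    return (1.0 - s) * math.log2(M) - g2(s)


def max_log2_M(n: int, e_sq_amortized: float, eps: float) -> float:
    """Rearrangement of (1 - sqrt(eps)) log2 M - g2(sqrt(eps)) <= n * E."""
    s = math.sqrt(eps)
    if s >= 1.0:
        raise ValueError("eps must be strictly less than 1")
    return (n * e_sq_amortized + g2(s)) / (1.0 - s)


for eps in (0.0, 0.01, 0.1):
    print(f"eps = {eps}: log2 M <= {max_log2_M(10, 0.5, eps):.3f}")
```

With 𝜀 = 0 the limit reduces to 𝑛 times the amortized quantity, and it loosens as 𝜀 grows.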
\[
\left( \rho^{(1)}_{A_1' A_1 B_1'},\ \{ \mathcal{P}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i'} \}_{i=2}^{n},\ \mathcal{P}^{(n+1)}_{A_n' B_n B_n' \to M_A M_B} \right), \tag{19.2.1}
\]
where $\rho^{(1)}_{A_1' A_1 B_1'}$ is a PPT state, $\mathcal{P}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i'}$ is a C-PPT-P channel for 𝑖 ∈
{2, . . . , 𝑛}, and $\mathcal{P}^{(n+1)}_{A_n' B_n B_n' \to M_A M_B}$ is a final C-PPT-P channel that generates an
Since every LOCC channel is a C-PPT-P channel, we can make the following
observation immediately:
Proposition 19.6
Let N 𝐴→𝐵 be a quantum channel, let 𝜀 ∈ [0, 1], and let 𝐸 be an entanglement
measure that is monotone under completely PPT-preserving channels and
is equal to zero for all PPT states. For an (𝑛, 𝑀, 𝜀) PPT-assisted quantum
communication protocol with final state 𝜔 𝑀 𝐴 𝑀𝐵 , the following bound holds
𝐸 (𝑀 𝐴 ; 𝑀𝐵 )𝜔 ≤ 𝑛 · 𝐸 A (N). (19.2.2)
Recalling Definition 4.32, a channel N 𝐴→𝐵 with input system 𝐴 and output
system 𝐵 is defined to be PPT-simulable with associated resource state 𝜔 𝑅𝐵′ if the
following equality holds for all input states 𝜌 𝐴 :
N 𝐴→𝐵 (𝜌 𝐴 ) = P 𝐴𝑅𝐵′ →𝐵 (𝜌 𝐴 ⊗ 𝜔 𝑅𝐵′ ), (19.2.3)
Corollary 19.7
Let 𝐸 𝑆 denote an entanglement measure that is monotone non-increasing with
respect to completely PPT-preserving channels, subadditive with respect to
states, and equal to zero for all PPT states. Let N 𝐴→𝐵 be a channel that
is PPT-simulable with associated resource state 𝜃 𝑅𝐵′ . Let 𝜀 ∈ [0, 1]. For
an (𝑛, 𝑀, 𝜀) PPT-assisted quantum communication protocol with final state
𝜔 𝑀 𝐴 𝑀𝐵 , the following bound holds
We now establish the max-Rains information upper bound on the number of qubits
that a sender can transmit to a receiver by employing a PPT-assisted quantum
communication protocol:
where the equality follows from Theorem 10.18. Applying Definition 19.5 leads to
𝐹 (Φ 𝑀 𝐴 𝑀𝐵 , 𝜔 𝑀 𝐴 𝑀𝐵 ) ≥ 1 − 𝜀. (19.2.7)
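\begin{align}
\log_2 M &\leq R^{\varepsilon}(M_A; M_B)_{\omega} \tag{19.2.8} \\
&\leq R_{\max}(M_A; M_B)_{\omega} + \log_2\!\left( \frac{1}{1-\varepsilon} \right). \tag{19.2.9}
\end{align}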
Combining (19.2.6) and (19.2.9), we conclude the proof. ■
𝛼 > 1:
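\begin{align}
\log_2 M &\leq n \cdot \widetilde{R}_{\alpha}(S; B')_{\theta} + \frac{\alpha}{\alpha-1} \log_2\!\left( \frac{1}{1-\varepsilon} \right), \tag{19.2.10} \\
\log_2 M &\leq \frac{1}{1-\varepsilon} \left[ n \cdot R(S; B')_{\theta} + h_2(\varepsilon) \right]. \tag{19.2.11}
\end{align}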
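The two bounds (19.2.10) and (19.2.11) are simple arithmetic once the Rains-type quantity of the resource state is known. The following Python sketch evaluates them, treating that quantity as a plain input number; the function names and the sample values are hypothetical and only meant to show how the error-dependent correction terms behave.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy h2(x) in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def renyi_bound(n: int, renyi_rains: float, alpha: float, eps: float) -> float:
    """Right-hand side of (19.2.10), valid for alpha > 1 and eps in [0, 1)."""
    if alpha <= 1.0 or not 0.0 <= eps < 1.0:
        raise ValueError("need alpha > 1 and 0 <= eps < 1")
    return n * renyi_rains + alpha / (alpha - 1.0) * math.log2(1.0 / (1.0 - eps))


def rains_bound(n: int, rains: float, eps: float) -> float:
    """Right-hand side of (19.2.11), valid for eps in [0, 1)."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("need 0 <= eps < 1")
    return (n * rains + binary_entropy(eps)) / (1.0 - eps)


# Sample evaluation with made-up values for the resource-state quantities.
print(renyi_bound(n=10, renyi_rains=0.5, alpha=2.0, eps=0.5))
print(rains_bound(n=10, rains=0.5, eps=0.5))
```

Note that the correction term in (19.2.10) is additive and independent of 𝑛, whereas the factor 1/(1 − 𝜀) in (19.2.11) multiplies the term that grows with 𝑛.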
19.4 Examples
[IN PROGRESS]
erasure channel - get Rains information as a strong converse rate - will match
lower bound in terms of reverse coherent information (Hayashi called this pseudo-
coherent information)
covariant dephasing channels - get Rains information as a strong converse rate
and then coherent information matches this (will evaluate Rains information bound
in unassisted quantum capacity chapter)
depolarizing channel - evaluate Rains information
use squashed entanglement to give upper bound for amplitude damping channel
(1999); Gottesman and Chuang (1999); Zhou et al. (2000); Bowen and Bose (2001);
Takeoka et al. (2002); Giedke and Ignacio Cirac (2002); Wolf et al. (2007); Niset
et al. (2009); Chiribella et al. (2009); Soeda et al. (2011); Leung and Matthews
(2015); Pirandola et al. (2017); Takeoka et al. (2016); Wilde et al. (2017); Takeoka
et al. (2017); Kaur and Wilde (2017).
A precise mathematical definition of an LOCC-assisted quantum communication
protocol conducted over a quantum channel was presented in (Müller-Hermes,
2012, Definition 12) and (Takeoka et al., 2014, Section IV).
That the entanglement of the final state of an 𝑛-round LOCC-assisted quantum
communication is bounded from above by 𝑛 times the channel’s amortized entan-
glement (Proposition 19.2) was anticipated by Bennett et al. (2003) and proven by
Kaur and Wilde (2017). Corollary 19.3 was anticipated by Bennett et al. (1996c)
and presented in more detail in (Müller-Hermes, 2012, Chapter 4), while the form
in which we have presented it here is closely related to the presentation by Kaur
and Wilde (2017).
The 𝑛-round PPT-assisted quantum communication protocols presented in
Section 19.2 were considered by Kaur and Wilde (2017), with PPT-assisted quantum
communication over a single or parallel use of a quantum channel considered by
Leung and Matthews (2015); Wang and Duan (2016b); Wang et al. (2019b). The
bound in Proposition 19.6 was established by Kaur and Wilde (2017).
Theorem 19.4 is due to Takeoka et al. (2014).
Wang and Duan (2016a) defined a semi-definite programming upper bound on
distillable entanglement of a bipartite state, and Wang and Duan (2016b) defined a
semi-definite programming upper bound on the quantum capacity of a quantum
channel. Wang et al. (2019b) observed that the quantity defined by Wang and
Duan (2016a) is equal to the max-Rains relative entropy, while also observing
that the quantity defined by Wang and Duan (2016b) is equal to the max-Rains
information of a quantum channel. Berta and Wilde (2018) established the max-
Rains information as an upper bound on the 𝑛-round non-asymptotic PPT-assisted
quantum capacity (Theorem 19.8). The upper bounds in Theorem 19.9 are due to
Kaur and Wilde (2017).
Chapter 20
Secret Key Agreement
Taking the same perspective as that in Chapter 16, with the idea that a powerful, fully
quantum eavesdropper could have access to every system to which the legitimate
parties do not have access, we suppose that the quantum eavesdropper has access
to the environment system 𝐸. Furthermore, in a secret-key-agreement protocol, the
legitimate parties are allowed to use a public, classical communication channel, in
addition to the quantum channel N 𝐴→𝐵 , in order to generate a secret key. Since
this channel is public, we suppose that the eavesdropper has access to all of the
classical data exchanged between the legitimate parties.
In more detail, an 𝑛-shot protocol for secret key agreement consists of 𝑛 calls
to the quantum channel N 𝐴→𝐵 , interleaved by LOPC channels. Since all of the
classical data exchanged between Alice and Bob is assumed to be public and
available to the eavesdropper as well, we call these channels “LOPC” channels,
which is an abbreviation of “local operations and public communication.” In fact,
a protocol for secret key agreement has essentially the same structure as a protocol
for LOCC-assisted quantum communication, as discussed in Section 19.1, with the
exception that the systems at the end should hold a secret key instead of a maximally
entangled state.
A protocol for secret key agreement is depicted in Figure [REF], and it consists
All systems labeled by 𝐴 belong to Alice, those labeled by 𝐵 belong to Bob, and
those labeled by 𝑌 are classical systems belonging to Eve, representing a copy
of the classical data exchanged by Alice and Bob in a round of LOPC. In the
above, $\rho^{(1)}_{A_1' A_1 B_1' Y_1}$ is a separable state, $\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i' Y_i}$ is an LOPC channel
for 𝑖 ∈ {2, . . . , 𝑛}, and $\mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to K_A K_B Y_{n+1}}$ is a final LOPC channel that generates
the approximate secret key in systems $K_A$ and $K_B$. Let C denote all of these
elements, which together constitute the secret-key-agreement protocol. As with
LOCC-assisted quantum communication, all systems with primed labels should
be understood as local quantum memory or scratch registers that Alice and Bob
can employ in this information-processing task. We also assume that they are
finite-dimensional, yet arbitrarily large. The unprimed systems are the ones that are
either input to or output from the quantum communication channel N 𝐴→𝐵 .
The secret-key-agreement protocol begins with Alice and Bob performing an
LOPC channel $\mathcal{L}^{(1)}_{\emptyset \to A_1' A_1 B_1' Y_1}$, which leads to the separable state $\rho^{(1)}_{A_1' A_1 B_1' Y_1}$ mentioned
above, where $A_1'$ and $B_1'$ are systems that are finite-dimensional yet arbitrarily large.
In particular, the state $\rho^{(1)}_{A_1' A_1 B_1' Y_1}$ has the following form:
\[
\rho^{(1)}_{A_1' A_1 B_1' Y_1} := \sum_{y_1} p_{Y_1}(y_1)\, \tau^{y_1}_{A_1' A_1} \otimes \zeta^{y_1}_{B_1'} \otimes |y_1\rangle\langle y_1|_{Y_1}, \tag{20.1.3}
\]
\[
\omega^{(1)}_{A_1' B_1 B_1' E_1 Y_1} := \mathcal{U}^{\mathcal{N}}_{A_1 \to B_1 E_1}(\rho^{(1)}_{A_1' A_1 B_1' Y_1}). \tag{20.1.5}
\]
Note that we write the channel use as the isometric channel $\mathcal{U}^{\mathcal{N}}_{A_1 \to B_1 E_1}$ that extends
$\mathcal{N}_{A_1 \to B_1}$, since we would like to incorporate the eavesdropper’s system $E_1$ explicitly
into the description of the protocol. Alice and Bob then perform the LOPC channel
$\mathcal{L}^{(2)}_{A_1' B_1 B_1' \to A_2' A_2 B_2' Y_2}$, which leads to the state
\[
\rho^{(2)}_{A_2' A_2 B_2' E_1 Y_1 Y_2} := \mathcal{L}^{(2)}_{A_1' B_1 B_1' \to A_2' A_2 B_2' Y_2}(\omega^{(1)}_{A_1' B_1 B_1' E_1 Y_1}). \tag{20.1.6}
\]
Alice sends system $A_2$ through the second channel use $\mathcal{U}^{\mathcal{N}}_{A_2 \to B_2 E_2}$, leading to the
state
\[
\omega^{(2)}_{A_2' B_2 B_2' E_1 E_2 Y_1 Y_2} := \mathcal{U}^{\mathcal{N}}_{A_2 \to B_2 E_2}(\rho^{(2)}_{A_2' A_2 B_2' E_1 Y_1 Y_2}). \tag{20.1.9}
\]
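This process iterates: the protocol uses the channel 𝑛 times. In general, we have the
following states for all 𝑖 ∈ {2, . . . , 𝑛}:
\begin{align}
\rho^{(i)}_{A_i' A_i B_i' E_1^{i-1} Y_1^{i}} &:= \mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i' Y_i}(\omega^{(i-1)}_{A_{i-1}' B_{i-1} B_{i-1}' E_1^{i-1} Y_1^{i-1}}), \tag{20.1.10} \\
\omega^{(i)}_{A_i' B_i B_i' E_1^{i} Y_1^{i}} &:= \mathcal{U}^{\mathcal{N}}_{A_i \to B_i E_i}(\rho^{(i)}_{A_i' A_i B_i' E_1^{i-1} Y_1^{i}}), \tag{20.1.11}
\end{align}
where $\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i' Y_i}$ is an LOPC channel that can be written as
\[
\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i' Y_i} := \sum_{y_i} \mathcal{E}^{y_i}_{A_{i-1}' \to A_i' A_i} \otimes \mathcal{F}^{y_i}_{B_{i-1} B_{i-1}' \to B_i'} \otimes |y_i\rangle\langle y_i|_{Y_i}. \tag{20.1.12}
\]
In the above, $\{\mathcal{E}^{y_i}_{A_{i-1}' \to A_i' A_i}\}_{y_i}$ and $\{\mathcal{F}^{y_i}_{B_{i-1} B_{i-1}' \to B_i'}\}_{y_i}$ are sets of completely positive
maps such that the sum map $\sum_{y_i} \mathcal{E}^{y_i}_{A_{i-1}' \to A_i' A_i} \otimes \mathcal{F}^{y_i}_{B_{i-1} B_{i-1}' \to B_i'}$ is trace preserving.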
The classical system 𝑌𝑖 represents the eavesdropper’s copy of the classical data
exchanged by Alice and Bob in this round of LOPC. Note that the reduced channel
acting on Alice and Bob’s systems is as follows:
\[
\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i'} = \sum_{y_i} \mathcal{E}^{y_i}_{A_{i-1}' \to A_i' A_i} \otimes \mathcal{F}^{y_i}_{B_{i-1} B_{i-1}' \to B_i'}. \tag{20.1.13}
\]
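The structure of such a channel (a sum over 𝑦 of product completely positive maps whose total is trace preserving) can be illustrated with a toy example. In the Python sketch below, which is an assumption-laden illustration rather than anything taken from the protocol itself, Alice measures a qubit in the computational basis, announces the outcome 𝑦, and Bob applies the Pauli 𝑋 gate when 𝑦 = 1. Each branch is described by a single Kraus operator 𝑃_𝑦 ⊗ 𝑋^𝑦, and trace preservation of the sum map is checked through the completeness relation. Input and output dimensions are kept equal for simplicity, unlike in the general case.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose (equal to the adjoint here, since all entries are real)."""
    return [list(row) for row in zip(*a)]


def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
projector = {0: [[1, 0], [0, 0]], 1: [[0, 0], [0, 1]]}  # Alice's outcome y
bob_correction = {0: I2, 1: X}                          # Bob's response to y

# One Kraus operator per announced outcome: K_y = P_y (Alice) tensor X^y (Bob).
kraus = {y: kron(projector[y], bob_correction[y]) for y in (0, 1)}

identity4 = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
completeness = [[0] * 4 for _ in range(4)]
for y in (0, 1):
    completeness = add(completeness, matmul(transpose(kraus[y]), kraus[y]))

is_trace_preserving = completeness == identity4
branch_is_trace_preserving = {
    y: matmul(transpose(kraus[y]), kraus[y]) == identity4 for y in (0, 1)
}
print(is_trace_preserving, branch_is_trace_preserving)
```

Only the sum over outcomes satisfies the completeness relation; each individual branch is completely positive but trace non-increasing, which matches the requirement stated for the maps in (20.1.12).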
\[
\omega_{K_A K_B E_1^n Y_1^{n+1}} := \mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to K_A K_B Y_{n+1}}(\omega^{(n)}_{A_n' B_n B_n' E_1^n Y_1^n}). \tag{20.1.14}
\]
and the reduced final channel acting on Alice and Bob’s systems is as follows:
\[
\mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to K_A K_B} = \sum_{y_{n+1}} \mathcal{E}^{y_{n+1}}_{A_n' \to K_A} \otimes \mathcal{F}^{y_{n+1}}_{B_n B_n' \to K_B}. \tag{20.1.16}
\]
The goal of the protocol is for the final state $\omega_{K_A K_B E_1^n Y_1^{n+1}}$ to be nearly
indistinguishable from a tripartite secret-key state, and we define the privacy error of the
code to be as follows:
where $\sigma_{E_1^n Y_1^{n+1}}$ is some state of the eavesdropper’s systems, 𝐹 denotes the quantum
fidelity (Definition 6.5), and the maximally classically correlated state $\Phi_{K_A K_B}$ is
defined as
\[
\Phi_{K_A K_B} := \frac{1}{K} \sum_{k=1}^{K} |k\rangle\langle k|_{K_A} \otimes |k\rangle\langle k|_{K_B}. \tag{20.1.18}
\]
Intuitively, the privacy error 𝑝 err (C) quantifies how distinguishable the final state
$\omega_{K_A K_B E_1^n Y_1^{n+1}}$ is from an ideal tripartite secret-key state $\Phi_{K_A K_B} \otimes \sigma_{E_1^n Y_1^{n+1}}$, in which
the key values in $K_A$ and $K_B$ are perfectly correlated and uniformly random and
in tensor product with the eavesdropper’s systems $E_1^n Y_1^{n+1}$. For an ideal tripartite
secret-key state, it is difficult for an eavesdropper to guess the value of the key by
observing the content of her quantum systems $E_1^n Y_1^{n+1}$. In fact, the chance for an
eavesdropper to guess the key value of an ideal secret-key state is equal to 1/𝐾,
which is no better than random guessing.
Due to the isometric invariance of the fidelity and the fact that all isometric
extensions of a channel are related by an isometry acting on the environment system,
the privacy error in (20.1.17) is invariant under any choice of an isometric channel
$\mathcal{U}^{\mathcal{N}}_{A \to BE}$ that extends the original channel N 𝐴→𝐵 . Thus, the relevant performance
parameters for a secret-key agreement protocol do not change with the particular
isometric extension chosen. This is to be expected since the actual information that
the eavesdropper gains in the protocol does not depend on the particular isometric
extension chosen.
Let $(\rho^{(1)}_{A_1' A_1 B_1' Y_1},\ \{\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i' Y_i}\}_{i=2}^{n},\ \mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to K_A K_B Y_{n+1}})$ be the elements
of an 𝑛-round LOPC-assisted secret-key-agreement protocol over the channel
N 𝐴→𝐵 . The protocol is called an (𝑛, 𝐾, 𝜀) protocol, with 𝜀 ∈ [0, 1], if the
privacy error 𝑝 err (C) ≤ 𝜀.
state at the end of the protocol and the ideal state is no smaller than 1 − 𝜀. In
more detail, let $\Phi^{p}_{M_A M_B}$ denote the following state in which there is an arbitrary
distribution 𝑝 over the message:
\[
\Phi^{p}_{M_A M_B} := \sum_{m=1}^{K} p(m)\, |m\rangle\langle m|_{M_A} \otimes |m\rangle\langle m|_{M_B}. \tag{20.1.19}
\]
Let $\omega^{p}_{M_A M_B E_1^n Y_1^{n+1}}$ denote the final state of the protocol, which is defined in the
same way as (20.1.14), with the exception that the message distribution 𝑝 is no
longer uniform. Then an (𝑛, 𝐾, 𝜀) LOPC-assisted private communication protocol
is defined similarly to an (𝑛, 𝐾, 𝜀) secret-key-agreement protocol as given above,
except that the following inequality holds:
\[
\max_{p : \mathcal{M} \to [0,1]} \left[ 1 - F(\omega^{p}_{M_A M_B E_1^n Y_1^{n+1}},\ \Phi^{p}_{M_A M_B} \otimes \sigma_{E_1^n Y_1^{n+1}}) \right] \leq \varepsilon, \tag{20.1.20}
\]
where the maximization is over all message distributions 𝑝 and $\sigma_{E_1^n Y_1^{n+1}}$ is some fixed
state of the eavesdropper’s systems that is independent of the message transmitted.
By the use of the one-time pad protocol, it follows that an (𝑛, 𝐾, 𝜀) secret-key-
agreement protocol leads to an (𝑛, 𝐾, 𝜀) LOPC-assisted private communication
protocol. To see how the one-time pad protocol works in conjunction with secret
key agreement, suppose that Alice and Bob have completed an (𝑛, 𝐾, 𝜀) secret-
key-agreement protocol as described in the previous section, with the final state
$\omega_{K_A K_B E_1^n Y_1^{n+1}}$ satisfying 𝑝 err (C) ≤ 𝜀. Alice then brings in her local message registers
$M_A$ and $M_{A'}$, so that the overall quantum state is
\[
\Phi^{p}_{M_A M_{A'}} \otimes \omega_{K_A K_B E_1^n Y_1^{n+1}}. \tag{20.1.21}
\]
The one-time pad protocol is an LOPC protocol in which Alice then performs the
following classical computation, represented as a quantum channel, on her message
register $M_{A'}$ and her key register $K_A$:
\[
\sum_{k,m} |m \oplus k\rangle_{C_A} \langle m|_{M_{A'}} \langle k|_{K_A} (\cdot) |m\rangle_{M_{A'}} |k\rangle_{K_A} \langle m \oplus k|_{C_A}, \tag{20.1.22}
\]
where the addition ⊕ is modulo 𝐾. She then transmits the classical register 𝐶 𝐴 over
a public classical channel to Bob. Eve can make a copy 𝐶 𝐴′ of this classical register
containing the value 𝑚 ⊕ 𝑘, but since Bob’s key register 𝐾 𝐵 is not available to her,
the register 𝐶 𝐴′ is nearly independent of Alice’s message register 𝑀 𝐴 (depending on
how small 𝜀 is). Bob then performs the following classical computation, represented
as a quantum channel, on his received register 𝐶 𝐴 and his key register 𝐾 𝐵 :
\[
\sum_{c,k} |c \ominus k\rangle_{M_B} \langle c|_{C_A} \langle k|_{K_B} (\cdot) |c\rangle_{C_A} |k\rangle_{K_B} \langle c \ominus k|_{M_B}, \tag{20.1.23}
\]
where the subtraction ⊖ is modulo 𝐾. Let $\omega^{p}_{M_A M_B E_1^n Y_1^{n+1} C_{A'}}$ denote the final state of
the protocol. By applying the data-processing inequality to (20.1.20), as well as
the fact mentioned above that $C_{A'}$ is independent of $M_A$ and $M_B$ in the ideal case,
the following inequality holds
\[
\max_{p : \mathcal{M} \to [0,1]} \left[ 1 - F(\omega^{p}_{M_A M_B E_1^n Y_1^{n+1} C_{A'}},\ \Phi^{p}_{M_A M_B} \otimes \sigma_{E_1^n Y_1^{n+1} C_{A'}}) \right] \leq \varepsilon, \tag{20.1.24}
\]
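The classical core of the one-time pad steps in (20.1.22) and (20.1.23) can be sketched in a few lines of Python. This is a minimal illustration for the ideal case of a perfectly uniform shared key; the function names are hypothetical, and the quantum registers are replaced by integers in {0, . . . , 𝐾 − 1}.

```python
from collections import Counter


def encrypt(message: int, key: int, K: int) -> int:
    """Alice's step: ciphertext c = m + k modulo K, as in (20.1.22)."""
    return (message + key) % K


def decrypt(ciphertext: int, key: int, K: int) -> int:
    """Bob's step: recovered message c - k modulo K, as in (20.1.23)."""
    return (ciphertext - key) % K


def ciphertext_distribution(message: int, K: int) -> Counter:
    """Counts of ciphertext values when the key is uniform over {0, ..., K-1}."""
    return Counter(encrypt(message, key, K) for key in range(K))


K = 5

# Bob always recovers the message when he holds the same key as Alice.
recovers_all = all(decrypt(encrypt(m, k, K), k, K) == m
                   for m in range(K) for k in range(K))

# For every message, each ciphertext value occurs exactly once over the K keys,
# so Eve's copy of the ciphertext is uniform and independent of the message.
distributions = [ciphertext_distribution(m, K) for m in range(K)]
ciphertext_is_uniform = all(
    dist == Counter({c: 1 for c in range(K)}) for dist in distributions
)

print(recovers_all, ciphertext_is_uniform)
```

When the key is only approximately ideal, the independence is approximate as well, which is why the privacy error 𝜀 carries over in (20.1.24).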
where the systems $S_{A_1}$ and $S_{B_1}$ are known as local “shield” systems. In principle,
the shield systems $S_{A_1}$ and $S_{B_1}$ could be held by Alice and Bob, respectively, and
the states $|\tau^{y_1}\rangle_{A_1' A_1 S_{A_1}}$ and $|\zeta^{y_1}\rangle_{B_1' S_{B_1}}$ purify $\tau^{y_1}_{A_1' A_1}$ and $\zeta^{y_1}_{B_1'}$ in (20.1.3), respectively.
We assume without loss of generality that the shield systems contain a coherent
classical copy of the classical random variable 𝑌1 , such that tracing over systems
𝑆 𝐴1 and 𝑆 𝐵1 recovers the original state in (20.1.3). As before, Eve possesses system
𝑌1 , which contains a coherent classical copy of the classical data exchanged.
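Each LOPC channel $\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i'}$ for 𝑖 ∈ {2, . . . , 𝑛} is of the form in
(20.1.12) and can be purified to an isometry in the following way:
\[
U^{\mathcal{L}^{(i)}}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i S_{A_i} B_i' S_{B_i} Y_i} := \sum_{y_i} U^{\mathcal{E}^{y_i}}_{A_{i-1}' \to A_i' A_i S_{A_i}} \otimes U^{\mathcal{F}^{y_i}}_{B_{i-1} B_{i-1}' \to B_i' S_{B_i}} \otimes |y_i\rangle_{Y_i}, \tag{20.2.2}
\]
\[
\left\| U^{\mathcal{E}^{y_i}}_{A_{i-1}' \to A_i' A_i S_{A_i}} \right\|_{\infty},\ \left\| U^{\mathcal{F}^{y_i}}_{B_{i-1} B_{i-1}' \to B_i' S_{B_i}} \right\|_{\infty} \leq 1, \tag{20.2.3}
\]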
The systems 𝑆 𝐴𝑖 and 𝑆 𝐵𝑖 in (20.2.2) are shield systems belonging to Alice and
Bob, respectively, and we assume without loss of generality that they contain a
coherent classical copy of the classical random variable 𝑌𝑖 , such that tracing over
the systems 𝑆 𝐴𝑖 and 𝑆 𝐵𝑖 recovers the original LOPC channel in (20.1.12). As before,
𝑌𝑖 is a system held by Eve, containing a coherent classical copy of the classical data
exchanged in this round.
Thus, a purification of the state $\rho^{(i)}_{A_i' A_i B_i'}$ after each LOPC channel is as follows:
\[
|\rho^{(i)}\rangle_{A_i' A_i S_{A_1^i} B_i' S_{B_1^i} E_1^{i-1} Y_1^{i}} := U^{\mathcal{L}^{(i)}}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i S_{A_i} B_i' S_{B_i} Y_i} |\omega^{(i-1)}\rangle_{A_{i-1}' B_{i-1} B_{i-1}' S_{A_1^{i-1}} S_{B_1^{i-1}} E_1^{i-1} Y_1^{i-1}}, \tag{20.2.4}
\]
where $U^{\mathcal{N}}_{A_i \to B_i E_i}$ is an isometric extension of the 𝑖th channel use $\mathcal{N}_{A_i \to B_i}$.
The final LOPC channel takes the form in (20.1.15), and it can be purified to an
isometry similarly as
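\[
U^{\mathcal{L}^{(n+1)}}_{A_n' B_n B_n' \to K_A S_{A_{n+1}} K_B S_{B_{n+1}} Y_{n+1}} := \sum_{y_{n+1}} U^{\mathcal{E}^{y_{n+1}}}_{A_n' \to K_A S_{A_{n+1}}} \otimes U^{\mathcal{F}^{y_{n+1}}}_{B_n B_n' \to K_B S_{B_{n+1}}} \otimes |y_{n+1}\rangle_{Y_{n+1}}. \tag{20.2.6}
\]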
The systems 𝑆 𝐴𝑛+1 and 𝑆 𝐵𝑛+1 are again shield systems belonging to Alice and Bob,
respectively, and we assume again that they contain a coherent classical copy of
the classical random variable 𝑌𝑛+1 , such that tracing over 𝑆 𝐴𝑛+1 and 𝑆 𝐵𝑛+1 recovers
the original LOPC channel in (20.1.15). As before, 𝑌𝑛+1 is a system held by Eve,
containing a coherent classical copy of the classical data exchanged in this round.
The final state at the end of the purified protocol is a pure state $|\omega\rangle_{K_A S_A K_B S_B E^n Y^{n+1}}$,
given by
\[
|\omega\rangle_{K_A S_A K_B S_B E^n Y^{n+1}} := U^{\mathcal{L}^{(n+1)}}_{A_n' B_n B_n' \to K_A S_{A_{n+1}} K_B S_{B_{n+1}} Y_{n+1}} |\omega^{(n)}\rangle_{A_n' B_n S_{A_1^n} B_n' S_{B_1^n} E_1^n Y_1^n}. \tag{20.2.7}
\]
Alice is in possession of the key system 𝐾 𝐴 and the shield systems 𝑆 𝐴 ≡ 𝑆 𝐴1 · · · 𝑆 𝐴𝑛+1 ,
Bob possesses the key system 𝐾 𝐵 and the shield systems 𝑆 𝐵 ≡ 𝑆 𝐵1 · · · 𝑆 𝐵𝑛+1 , and
Eve holds the environment systems 𝐸 𝑛 ≡ 𝐸 1 · · · 𝐸 𝑛 . Additionally, Eve has coherent
copies 𝑌 𝑛+1 ≡ 𝑌1 · · · 𝑌𝑛+1 of all the classical data exchanged.
where
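\begin{align}
\mathcal{U}^{\mathcal{E}^{y_i}}_{A_{i-1}' \to A_i' A_i S_{A_i}}(\cdot) &:= U^{\mathcal{E}^{y_i}}_{A_{i-1}' \to A_i' A_i S_{A_i}} (\cdot) \left[ U^{\mathcal{E}^{y_i}}_{A_{i-1}' \to A_i' A_i S_{A_i}} \right]^{\dagger}, \tag{20.2.10} \\
\mathcal{U}^{\mathcal{F}^{y_i}}_{B_{i-1} B_{i-1}' \to B_i' S_{B_i}}(\cdot) &:= U^{\mathcal{F}^{y_i}}_{B_{i-1} B_{i-1}' \to B_i' S_{B_i}} (\cdot) \left[ U^{\mathcal{F}^{y_i}}_{B_{i-1} B_{i-1}' \to B_i' S_{B_i}} \right]^{\dagger}. \tag{20.2.11}
\end{align}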
Tracing over Eve’s system 𝑌𝑛+1 of the final isometry leads to the following LOCC
channel:
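\[
\mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to K_A S_{A_{n+1}} K_B S_{B_{n+1}}} = \sum_{y_{n+1}} \mathcal{U}^{\mathcal{E}^{y_{n+1}}}_{A_n' \to K_A S_{A_{n+1}}} \otimes \mathcal{U}^{\mathcal{F}^{y_{n+1}}}_{B_n B_n' \to K_B S_{B_{n+1}}}, \tag{20.2.12}
\]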
The states at every step of the protocol are then given by the following for all
𝑖 ∈ {2, . . . , 𝑛}:
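\begin{align}
\rho^{(i)}_{A_i' A_i S_{A_1^i} B_i' S_{B_1^i}} &:= \mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i S_{A_i} B_i' S_{B_i}}(\omega^{(i-1)}_{A_{i-1}' S_{A_1^{i-1}} B_{i-1} B_{i-1}' S_{B_1^{i-1}}}), \tag{20.2.13} \\
\omega^{(i)}_{A_i' S_{A_1^i} B_i B_i' S_{B_1^i}} &:= \mathcal{N}_{A_i \to B_i}(\rho^{(i)}_{A_i' A_i S_{A_1^i} B_i' S_{B_1^i}}), \tag{20.2.14} \\
\omega_{K_A S_A K_B S_B} &:= \mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to K_A S_{A_{n+1}} K_B S_{B_{n+1}}}(\omega^{(n)}_{A_n' B_n S_{A_1^n} B_n' S_{B_1^n}}). \tag{20.2.15}
\end{align}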
One observation that we make here is that the shield systems in a bipartite private-
state distillation protocol are finite-dimensional, yet arbitrarily large. That is, there
is no bound that we can establish on their dimension for a generic private-state
distillation protocol, and this unboundedness is a consequence of the fact that the
shield systems result from purifying the local memory or scratch registers of Alice
and Bob, which in turn have no bound on their dimension. This unboundedness
poses a challenge when trying to establish upper bounds on the rate at which
secret key agreement, or equivalently, bipartite private-state distillation is possible.
However, there are methods for handling this unboundedness that we detail later.
Due to the fact that a maximally entangled state is a particular kind of bipartite
private state and due to the equivalence between secret key agreement and LOCC-
assisted bipartite private-state distillation, we arrive at the following conclusion,
which relates LOCC-assisted quantum communication to secret key agreement:
Proposition 20.2
Let N 𝐴→𝐵 be a quantum channel, let 𝑛, 𝐾 ∈ N, and let 𝜀 ∈ [0, 1]. Then an
(𝑛, 𝐾, 𝜀) LOCC-assisted quantum communication protocol is also an (𝑛, 𝐾, 𝜀)
protocol for secret key agreement.
where Alice and Bob have access to the 𝐴 and 𝐵 systems, respectively, and the
eavesdropper has access to the system 𝑌 . The only requirement for a public
separable channel is that $\{\mathcal{E}^{y}_{A \to A'}\}_{y}$ and $\{\mathcal{F}^{y}_{B \to B'}\}_{y}$ are sets of completely positive
maps such that the sum map $\sum_{y} \mathcal{E}^{y}_{A \to A'} \otimes \mathcal{F}^{y}_{B \to B'}$ is trace preserving. Similar to the
distinction between LOCC and separable channels, it is not possible in general
to implement a public separable channel via local operations and public classical
communication.
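The trace-preservation requirement can be checked numerically for small examples. The following is a minimal sketch, assuming that each completely positive map is given as a finite list of Kraus operators (nested Python lists); the helper names `gram` and `is_trace_preserving` are hypothetical and are not notation from this book. It uses the identity $(A \otimes B)^\dagger (A \otimes B) = A^\dagger A \otimes B^\dagger B$, so the sum map is trace preserving exactly when the tensor products of the Kraus Gram operators, summed over $y$, give the identity. The eavesdropper's register $Y$ is not modelled here.

```python
def dagger(m):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[m[r][c].conjugate() for r in range(len(m))] for c in range(len(m[0]))]

def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two nested-list matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def gram(kraus_ops):
    """Sum of K^dagger K over the Kraus operators of one CP map."""
    dim = len(kraus_ops[0][0])
    total = [[0] * dim for _ in range(dim)]
    for k in kraus_ops:
        kk = matmul(dagger(k), k)
        total = [[total[r][c] + kk[r][c] for c in range(dim)] for r in range(dim)]
    return total

def is_trace_preserving(alice_maps, bob_maps, tol=1e-12):
    """Check that sum_y E^y (x) F^y is trace preserving.

    alice_maps[y] and bob_maps[y] are the Kraus lists of E^y and F^y.
    """
    terms = [kron(gram(e), gram(f)) for e, f in zip(alice_maps, bob_maps)]
    dim = len(terms[0])
    for r in range(dim):
        for c in range(dim):
            total = sum(t[r][c] for t in terms)
            if abs(total - (1 if r == c else 0)) > tol:
                return False
    return True

# Illustrative example: Alice projects onto |y><y|, Bob applies X^y.
P0, P1 = [[1, 0], [0, 0]], [[0, 0], [0, 1]]
I2, X = [[1, 0], [0, 1]], [[0, 1], [1, 0]]
valid = is_trace_preserving([[P0], [P1]], [[I2], [X]])
invalid = is_trace_preserving([[P0], [P0]], [[I2], [X]])
```

The example pair of maps happens to be implementable by LOCC; it was chosen only because the condition is easy to verify by hand, not because it separates the two classes of channels.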
The main point that we make in this section is that we can generalize a secret-
key-agreement protocol to be assisted by public separable channels rather than
just LOPC channels. For fixed privacy error, the resulting protocol achieves a
rate of communication that is either the same or higher than that achieved by
an LOPC-assisted protocol, due to the fact that every LOPC channel is a public
separable channel. Such a protocol is defined in the same way as in
Section 20.1, and we then arrive at the following definition:
Let $\mathcal{C} := \big(\rho^{(1)}_{A_1' A_1 B_1' Y_1}, \{\mathcal{L}^{(i)}_{A_{i-1}' B_{i-1} B_{i-1}' \to A_i' A_i B_i' Y_i}\}_{i=2}^{n}, \mathcal{L}^{(n+1)}_{A_n' B_n B_n' \to K_A K_B Y_{n+1}}\big)$
be the elements of an $n$-round public-separable-assisted secret-key-agreement
protocol over the channel $\mathcal{N}_{A\to B}$. The protocol is called an $(n, K, \varepsilon)$
protocol, with $\varepsilon \in [0, 1]$, if the privacy error $p_{\mathrm{err}}(\mathcal{C}) \le \varepsilon$.
Proposition 20.4
Let N 𝐴→𝐵 be a quantum channel, let 𝜀 ∈ [0, 1], and let 𝐸 be an entanglement
measure that is equal to zero for all separable states. For an (𝑛, 𝐾, 𝜀) secret-
key-agreement protocol, the following bound holds
𝐸 (𝐾 𝐴 𝑆 𝐴 ; 𝐾 𝐵 𝑆 𝐵 )𝜔 ≤ 𝑛 · 𝐸 A (N), (20.2.20)
Just as the bound from Proposition 19.2 depends on the final state of the
LOCC-assisted quantum communication protocol, the same is true for the bound
in (20.2.20). The bound is thus not a universal bound (a universal bound would
depend only on the protocol parameters 𝑛, 𝐾, and 𝜀). Thus, one of the main goals
of the forthcoming sections is to employ particular entanglement measures in order
to arrive at universal bounds for secret-key-agreement protocols.
We should also observe that the quantity 𝐸 (𝐾 𝐴 𝑆 𝐴 ; 𝐾 𝐵 𝑆 𝐵 )𝜔 in the bound in
Proposition 20.6
Let N 𝐴→𝐵 be a quantum channel, let 𝜀 ∈ [0, 1], and let 𝐸 be an entanglement
measure that is monotone non-increasing with respect to separable channels
and equal to zero for all separable states. For an (𝑛, 𝐾, 𝜀) secret-key-agreement
protocol assisted by public separable channels, the following bound holds
𝐸 (𝐾 𝐴 𝑆 𝐴 ; 𝐾 𝐵 𝑆 𝐵 )𝜔 ≤ 𝑛 · 𝐸 A (N), (20.2.22)
Finally, this bound again simplifies for channels that are simulable by the action
of a separable channel on a resource state 𝜃 𝑅𝐵′ (separable-simulable channels):
Corollary 20.7
Let 𝐸 𝑆 denote an entanglement measure that is monotone non-increasing
with respect to separable channels, subadditive with respect to states (Defini-
tion 9.1.9), and equal to zero for separable states. Let N 𝐴→𝐵 be a channel that is
separable-simulable with associated resource state 𝜃 𝑅𝐵′ (Definition 4.26). Let
𝜀 ∈ [0, 1]. For an (𝑛, 𝐾, 𝜀) secret-key-agreement protocol assisted by public
separable channels, the following bound holds
shown in Section 9.4 that the squashed entanglement satisfies all of the requirements
needed to apply it in Proposition 20.4. Namely, it is equal to zero for separable
states, it is an entanglement measure (non-increasing under the action of an LOCC
channel), and the squashed entanglement of a channel does not increase under
amortization (Theorem 10.20). Putting all of these items together, we can already
conclude the following bound for an (𝑛, 𝐾, 𝜀) secret-key-agreement protocol:
𝐸 sq (𝐾 𝐴 𝑆 𝐴 ; 𝐾 𝐵 𝑆 𝐵 )𝜔 ≤ 𝑛 · 𝐸 sq (N), (20.3.1)
and
\begin{equation}
U_{ABA'B'} = \sum_{i,j} |i\rangle\!\langle i|_A \otimes |j\rangle\!\langle j|_B \otimes U^{ij}_{A'B'} \tag{20.3.4}
\end{equation}
is a controlled unitary known as a “twisting unitary,” with each $U^{ij}_{A'B'}$ a unitary
operator. Due to the fact that the maximally entangled state $\Phi_{AB}$ is unextendible,
any extension $\gamma_{AA'BB'E}$ of a private state $\gamma_{AA'BB'}$ necessarily has the following
form:
\begin{equation}
\gamma_{AA'BB'E} = U_{AA'BB'} (\Phi_{AB} \otimes \sigma_{A'B'E}) U^{\dagger}_{AA'BB'}, \tag{20.3.5}
\end{equation}
where $\sigma_{A'B'E}$ is an extension of $\sigma_{A'B'}$.
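A small numerical illustration of the twisting structure may help. The sketch below makes several simplifying assumptions that are not part of the text: the shield state is pure, the two shield systems are merged into a single register, and the function names are hypothetical. Because the maximally entangled state is supported only on key values with $i = j$, only the diagonal blocks $U^{ii}$ of the twisting unitary act on it, so the code takes just those blocks as input. It then checks that measuring the key systems yields uniformly distributed, perfectly correlated outcomes regardless of the twist.

```python
import math

def mat_vec(m, v):
    """Apply a matrix (nested lists) to a vector (list)."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def private_state_amplitudes(diag_unitaries, shield_vec):
    """Amplitudes of U (Phi (x) |s>) keyed by (i, j, shield index).

    diag_unitaries[i] plays the role of the block U^{ii}; off-diagonal
    blocks never act because Phi has no weight on i != j.
    """
    K = len(diag_unitaries)
    amps = {}
    for i, u in enumerate(diag_unitaries):
        twisted = mat_vec(u, shield_vec)
        for s, a in enumerate(twisted):
            amps[(i, i, s)] = a / math.sqrt(K)
    return amps

def key_distribution(amps, K):
    """Joint distribution p(i, j) of the key measurement outcomes."""
    p = [[0.0] * K for _ in range(K)]
    for (i, j, _s), a in amps.items():
        p[i][j] += abs(a) ** 2
    return p

# Illustrative choice: twist by a Hadamard when i = 0 and by Pauli X when i = 1.
r = 1 / math.sqrt(2)
HADAMARD = [[r, r], [r, -r]]
PAULI_X = [[0, 1], [1, 0]]
amps = private_state_amplitudes([HADAMARD, PAULI_X], [1.0, 0.0])
p = key_distribution(amps, 2)
```

The twist changes the shield amplitudes for each key value but leaves the key statistics untouched, which is the sense in which the key is protected by the shield.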
We start with the following lemma, which applies to any extension of a bipartite
private state:
Lemma 20.8
Let 𝛾 𝐴𝐴′ 𝐵𝐵′ be a bipartite private state, and let 𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 be an extension of it,
as given above. Then the following identity holds for any such extension:
Proof: First consider that the following identity holds as a consequence of two
applications of the chain rule for conditional quantum mutual information:
Combined with the following identity, which holds for an extension 𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 of a
private state 𝛾 𝐴𝐴′ 𝐵𝐵′ ,
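where
\begin{equation}
\gamma^{i}_{A'B'E} := U^{ii}_{A'B'} \sigma_{A'B'E} (U^{ii}_{A'B'})^{\dagger}. \tag{20.3.12}
\end{equation}
Similarly, tracing over system $A$ of $\gamma_{AA'BB'E}$ leads to
\begin{equation}
\gamma_{BA'B'E} = \frac{1}{K} \sum_i |i\rangle\!\langle i|_B \otimes \gamma^{i}_{A'B'E}. \tag{20.3.13}
\end{equation}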
So these and the chain rule for conditional entropy imply that
𝐻 ( 𝐴𝐴′ 𝐸) 𝛾 = 𝐻 ( 𝐴) 𝛾 + 𝐻 ( 𝐴′ 𝐸 | 𝐴) 𝛾 = log2 𝐾 + 𝐻 ( 𝐴′ 𝐸 | 𝐴) 𝛾 . (20.3.14)
Similarly, we have that
𝐻 (𝐵𝐵′ 𝐸) 𝛾 = log2 𝐾 + 𝐻 (𝐵′ 𝐸 |𝐵) 𝛾 = log2 𝐾 + 𝐻 (𝐵′ 𝐸 | 𝐴) 𝛾 , (20.3.15)
where we have used the symmetries in (20.3.11)–(20.3.13). Since 𝛾 𝐸 = 𝛾 𝑖𝐸 for all 𝑖
(this is a consequence of 𝛾 𝐴𝐵𝐴′ 𝐵′ being an ideal private state), we find that
\begin{equation}
H(E)_\gamma = \frac{1}{K} \sum_i H(E)_{\gamma^i} = H(E|A)_\gamma. \tag{20.3.16}
\end{equation}
Finally, we have that
\begin{align}
H(AA'BB'E)_\gamma &= H(ABA'B'E)_{\Phi\otimes\sigma} \tag{20.3.17}\\
&= H(AB)_\Phi + H(A'B'E)_\sigma \tag{20.3.18}\\
&= \frac{1}{K} \sum_i H(A'B'E)_{\gamma^i} \tag{20.3.19}\\
&= H(A'B'E|A)_\gamma. \tag{20.3.20}
\end{align}
The first equality follows from unitary invariance of quantum entropy. The second
equality follows because the entropy is additive for tensor-product states. The
third equality follows because 𝐻 ( 𝐴𝐵)Φ = 0 since Φ 𝐴𝐵 is a pure state, and 𝜎𝐴′ 𝐵′ 𝐸
is related to 𝛾 𝑖𝐴′ 𝐵′ 𝐸 by the unitary 𝑈 𝑖𝑖𝐴′ 𝐵′ . The final equality follows by applying
(20.3.11), and the fact that conditional entropy is a convex combination of entropies
for a classical-quantum state where the conditioning system is classical.
Combining (20.3.9), (20.3.14), (20.3.15), (20.3.16), (20.3.20), and the fact that
𝐼 ( 𝐴′; 𝐵′ | 𝐴𝐸) 𝛾 = 𝐻 ( 𝐴′ 𝐸 | 𝐴) 𝛾 +𝐻 (𝐵′ 𝐸 | 𝐴) 𝛾 −𝐻 (𝐸 | 𝐴) 𝛾 −𝐻 ( 𝐴′ 𝐵′ 𝐸 | 𝐴) 𝛾 , (20.3.21)
we recover (20.3.8). ■
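The combination step at the end of the proof can be written out explicitly. The following is a sketch, under the assumption that (20.3.9) is the standard expansion of conditional mutual information into four entropies (that equation is not reproduced here, so its exact form is assumed):

```latex
\begin{align}
I(AA';BB'|E)_\gamma
&= H(AA'E)_\gamma + H(BB'E)_\gamma - H(E)_\gamma - H(AA'BB'E)_\gamma \\
&= \left[\log_2 K + H(A'E|A)_\gamma\right] + \left[\log_2 K + H(B'E|A)_\gamma\right]
   - H(E|A)_\gamma - H(A'B'E|A)_\gamma \\
&= 2\log_2 K + I(A';B'|AE)_\gamma .
\end{align}
```

The second line substitutes (20.3.14), (20.3.15), (20.3.16), and (20.3.20) term by term, and the third line recognizes the remaining four conditional entropies as the right-hand side of (20.3.21).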
Proposition 20.9
Let 𝛾 𝐴𝐴′ 𝐵𝐵′ be a private state, with key systems 𝐴𝐵 and shield systems 𝐴′ 𝐵′,
and let 𝜔 𝐴𝐴′ 𝐵𝐵′ be an 𝜀-approximate private state, in the sense that
where
$g_2(\delta) := (\delta + 1) \log_2(\delta + 1) - \delta \log_2 \delta$. (20.3.24)
Proof: By applying Uhlmann’s theorem for fidelity (Theorem 6.8) and the inequal-
ities relating trace distance and fidelity from Theorem 6.14, for a given extension
𝜔 𝐴𝐴′ 𝐵𝐵′ 𝐸 of 𝜔 𝐴𝐴′ 𝐵𝐵′ , there exists an extension 𝛾 𝐴𝐴′ 𝐵𝐵′ 𝐸 of 𝛾 𝐴𝐴′ 𝐵𝐵′ such that
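\begin{equation}
\frac{1}{2} \left\| \gamma_{AA'BB'E} - \omega_{AA'BB'E} \right\|_1 \le \sqrt{\varepsilon}. \tag{20.3.25}
\end{equation}
Defining $f_1(\delta, K) := 2\delta \log_2 K + 2 g_2(\delta)$, we then find that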
The first equality follows from Lemma 20.8. The first inequality follows from
two applications of Proposition 7.10 (uniform continuity of conditional mutual
information). The second inequality follows because 𝐼 ( 𝐴′; 𝐵′ | 𝐴𝐸)𝜔 ≥ 0 (this is
strong subadditivity from Theorem 7.6). The last equality is a consequence of
the chain rule for conditional mutual information, as used in (20.3.7). Since the
inequality
\begin{equation}
2 \log_2 K \le I(AA';BB'|E)_\omega + 2 f_1(\sqrt{\varepsilon}, K) \tag{20.3.30}
\end{equation}
holds for any extension of 𝜔 𝐴𝐴′ 𝐵𝐵′ , the statement of the proposition follows. ■
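For readers who want the last step spelled out, the following sketch shows how (20.3.30) becomes a bound on squashed entanglement. It relies on the standard definition of the squashed entanglement of a bipartite state as one half of the infimum of the conditional mutual information over all extensions; the arrangement of the steps is ours.

```latex
\begin{align}
2\log_2 K
&\le \inf_{\omega_{AA'BB'E}} I(AA';BB'|E)_\omega + 2 f_1(\sqrt{\varepsilon}, K)
 = 2 E_{\mathrm{sq}}(AA';BB')_\omega + 2 f_1(\sqrt{\varepsilon}, K), \\
E_{\mathrm{sq}}(AA';BB')_\omega
&\ge \log_2 K - f_1(\sqrt{\varepsilon}, K)
 = (1 - 2\sqrt{\varepsilon}) \log_2 K - 2 g_2(\sqrt{\varepsilon}).
\end{align}
```

The infimum can be taken because the left-hand side and the correction term $f_1(\sqrt{\varepsilon}, K)$ do not depend on the choice of extension. The final expression has the same form as the lower bound in (20.3.34).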
We now establish the squashed entanglement upper bound on the number of private
bits that a sender can transmit to a receiver by employing a secret-key-agreement
protocol. The proof is similar to that of Theorem 19.4, but it instead invokes
Proposition 20.9.
where the equality follows from Theorem 10.20. Applying Definition 20.1 and
(20.2.17) leads to
𝐹 (𝛾𝐾 𝐴𝑆 𝐴𝐾 𝐵 𝑆 𝐵 , 𝜔 𝐾 𝐴𝑆 𝐴𝐾 𝐵 𝑆 𝐵 ) ≥ 1 − 𝜀, (20.3.33)
where 𝛾𝐾 𝐴𝑆 𝐴𝐾 𝐵 𝑆 𝐵 is an exact private state of log2 𝐾 private bits. As a consequence
of Proposition 20.9, we find that
\begin{equation}
E_{\mathrm{sq}}(K_A S_A; K_B S_B)_\omega \ge (1 - 2\sqrt{\varepsilon}) \log_2 K - 2 g_2(\sqrt{\varepsilon}). \tag{20.3.34}
\end{equation}
Putting together (20.3.32) and (20.3.34), we arrive at the statement of the theo-
rem. ■
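To get a feel for the numbers, the quantities in this argument are easy to evaluate. The sketch below is illustrative only: the function names are hypothetical, the convention $g_2(0) = 0$ is assumed, and `max_key_bits` is simply the rearrangement of (20.3.1) and (20.3.34) for $\log_2 K$, which requires $\varepsilon < 1/4$ so that the prefactor $1 - 2\sqrt{\varepsilon}$ is positive. The value passed for the squashed entanglement of the channel is an input supplied by the user, not something the code computes.

```python
import math

def g2(delta):
    """g2(delta) = (delta + 1) log2(delta + 1) - delta log2(delta), with g2(0) = 0."""
    if delta == 0:
        return 0.0
    return (delta + 1) * math.log2(delta + 1) - delta * math.log2(delta)

def f1(delta, K):
    """f1(delta, K) = 2 delta log2(K) + 2 g2(delta)."""
    return 2 * delta * math.log2(K) + 2 * g2(delta)

def squashed_lower_bound(K, eps):
    """Right-hand side of (20.3.34)."""
    root = math.sqrt(eps)
    return (1 - 2 * root) * math.log2(K) - 2 * g2(root)

def max_key_bits(n, e_sq_channel, eps):
    """Upper bound on log2(K) obtained by combining (20.3.1) with (20.3.34).

    n            -- number of channel uses
    e_sq_channel -- squashed entanglement of the channel (user-supplied value)
    eps          -- privacy error, must satisfy 0 <= eps < 1/4
    """
    if not 0 <= eps < 0.25:
        raise ValueError("the rearranged bound needs 0 <= eps < 1/4")
    root = math.sqrt(eps)
    return (n * e_sq_channel + 2 * g2(root)) / (1 - 2 * root)

# Example: 10 channel uses, channel value 0.5, privacy error 0.01.
bound = max_key_bits(10, 0.5, 0.01)
```

As the privacy error tends to zero, the bound tends to $n$ times the channel quantity, and it loosens as the error grows toward $1/4$.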
where the equality follows from Theorem 10.16. Applying Definition 20.1 and
(20.2.17) leads to
𝐹 (𝛾𝐾 𝐴𝑆 𝐴𝐾 𝐵 𝑆 𝐵 , 𝜔 𝐾 𝐴𝑆 𝐴𝐾 𝐵 𝑆 𝐵 ) ≥ 1 − 𝜀, (20.4.3)
where 𝛾𝐾 𝐴𝑆 𝐴𝐾 𝐵 𝑆 𝐵 is an exact private state of log2 𝐾 private bits. As a consequence
of Propositions 15.15 and 7.71, we find that
\begin{align}
\log_2 K &\le E_R^{\varepsilon}(S_A K_A; S_B K_B)_\omega \tag{20.4.4}\\
&\le E_{\max}(S_A K_A; S_B K_B)_\omega + \log_2\!\left(\frac{1}{1-\varepsilon}\right). \tag{20.4.5}
\end{align}
𝐹 (𝛾𝐾 𝐴𝑆 𝐴𝐾 𝐵 𝑆 𝐵 , 𝜔 𝐾 𝐴𝑆 𝐴𝐾 𝐵 𝑆 𝐵 ) ≥ 1 − 𝜀, (20.4.10)
log2 𝐾 ≤ 𝐸 𝑅𝜀 (𝑆 𝐴 𝐾 𝐴 ; 𝑆 𝐵 𝐾 𝐵 )𝜔 . (20.4.11)
\begin{equation}
\log_2 K \le \frac{1}{1-\varepsilon} \left[ E_R(S_A K_A; S_B K_B)_\omega + h_2(\varepsilon) \right]. \tag{20.4.13}
\end{equation}
Putting together (20.4.8), (20.4.9), (20.4.12), and (20.4.13) concludes the proof. ■
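The two state-level bounds (20.4.5) and (20.4.13) differ only in how the error enters, and a quick numerical comparison shows the size of each correction. This is a minimal sketch with hypothetical function names; the entanglement values `e_max` and `e_rel` are inputs that the reader would have to supply from a separate calculation, and both bounds require $\varepsilon < 1$.

```python
import math

def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p in (0, 1):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def key_bits_bound_max(e_max, eps):
    """Right-hand side of (20.4.5): E_max + log2(1 / (1 - eps))."""
    if not 0 <= eps < 1:
        raise ValueError("eps must lie in [0, 1)")
    return e_max + math.log2(1 / (1 - eps))

def key_bits_bound_rel(e_rel, eps):
    """Right-hand side of (20.4.13): (E_R + h2(eps)) / (1 - eps)."""
    if not 0 <= eps < 1:
        raise ValueError("eps must lie in [0, 1)")
    return (e_rel + h2(eps)) / (1 - eps)

# Example values chosen only for illustration.
additive = key_bits_bound_max(3.0, 0.5)
multiplicative = key_bits_bound_rel(1.0, 0.5)
```

In (20.4.5) the error contributes an additive term that does not scale with the entanglement value, whereas in (20.4.13) it multiplies the entanglement value by $1/(1-\varepsilon)$.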
We have the exact same definitions for secret key agreement assisted by
public separable channels, and we use the notation 𝑃SEP↔ to refer to the public-
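\begin{align}
P^{\leftrightarrow}(\mathcal{N}) &\le \widetilde{P}^{\leftrightarrow}(\mathcal{N}) \le \widetilde{P}^{\leftrightarrow}_{\mathrm{SEP}}(\mathcal{N}), \tag{20.5.3}\\
P^{\leftrightarrow}(\mathcal{N}) &\le P^{\leftrightarrow}_{\mathrm{SEP}}(\mathcal{N}) \le \widetilde{P}^{\leftrightarrow}_{\mathrm{SEP}}(\mathcal{N}). \tag{20.5.4}
\end{align}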
20.6 Examples
[IN PROGRESS]
Summary
[IN PROGRESS]
Appendix A
Analyzing General
Communication Scenarios
[IN PROGRESS]
Bibliography
A. Acín, E. Bagan, M. Baig, Ll. Masanes, and R. Muñoz Tapia. Multiple-copy two-state
discrimination with individual measurements. Physical Review A, 71:032338, March 2005. doi:
10.1103/PhysRevA.71.032338. URL https://link.aps.org/doi/10.1103/PhysRevA.71.
032338.
C. Adami and N. J. Cerf. von Neumann capacity of noisy quantum channels. Physical Review A,
56:3470–3483, November 1997. doi: 10.1103/PhysRevA.56.3470. URL https://link.aps.
org/doi/10.1103/PhysRevA.56.3470.
Dorit Aharonov, Alexei Kitaev, and Noam Nisan. Quantum Circuits with Mixed States. In
Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, STOC ’98, page
20–30, New York, NY, USA, 1998. Association for Computing Machinery. ISBN 0897919629.
doi: 10.1145/276698.276708. URL https://doi.org/10.1145/276698.276708.
Rudolf Ahlswede and Imre Csiszár. Common randomness in information theory and cryptography.
I. Secret sharing. IEEE Transactions on Information Theory, 39:1121–1132, July 1993. URL
https://ieeexplore.ieee.org/document/243431.
Robert Alicki and Mark Fannes. Continuity of quantum conditional information. Journal of Physics
A: Mathematical and General, 37:L55–L57, January 2004. URL https://doi.org/10.1088/
0305-4470/37/5/l01.
Anurag Anshu, Vamsi Krishna Devabathini, and Rahul Jain. Quantum communication using coherent
rejection sampling. Physical Review Letters, 119:120506, September 2017. doi: 10.1103/
PhysRevLett.119.120506. URL https://link.aps.org/doi/10.1103/PhysRevLett.119.
120506.
Anurag Anshu, Mario Berta, Rahul Jain, and Marco Tomamichel. A minimax approach to one-shot
entropy inequalities. Journal of Mathematical Physics, 60:122201, December 2019. doi:
10.1063/1.5126723. URL https://doi.org/10.1063/1.5126723.
Anurag Anshu, Rahul Jain, and Naqueeb A. Warsi. Building blocks for communication over noisy
quantum networks. IEEE Transactions on Information Theory, 65:1287–1306, February 2019. doi:
10.1109/TIT.2018.2851297. URL https://ieeexplore.ieee.org/document/8399830.
Anurag Anshu, Rahul Jain, and Naqueeb A. Warsi. On the near-optimality of one-shot classical
communication over quantum channels. Journal of Mathematical Physics, 60:012204, January
2019. doi: 10.1063/1.5039796. URL https://doi.org/10.1063/1.5039796.
S. Arora, E. Hazan, and S. Kale. Fast algorithms for approximate semidefinite programming using
the multiplicative weights update method. In 46th Annual IEEE Symposium on Foundations
of Computer Science (FOCS’05), pages 339–348, 2005. doi: 10.1109/SFCS.2005.35. URL
https://ieeexplore.ieee.org/document/1530726.
Sanjeev Arora and Satyen Kale. A combinatorial, primal-dual approach to semidefinite programs. In
Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pages 227–236,
New York, NY, USA, June 2007. Association for Computing Machinery. ISBN 9781595936318.
doi: 10.1145/1250790.1250823. URL https://doi.org/10.1145/1250790.1250823.
Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a
meta-algorithm and applications. Theory of Computing, 8:121–164, 2012. doi: 10.4086/toc.
2012.v008a006. URL http://www.theoryofcomputing.org/articles/v008a006.
Koenraad Audenaert, Bart De Moor, Karl Gerd H. Vollbrecht, and Reinhard F. Werner. Asymptotic
relative entropy of entanglement for orthogonally invariant states. Physical Review A, 66:032310,
September 2002. doi: 10.1103/PhysRevA.66.032310. URL http://link.aps.org/doi/10.
1103/PhysRevA.66.032310.
Koenraad M. R. Audenaert and Jens Eisert. Continuity bounds on the quantum relative entropy.
Journal of Mathematical Physics, 46:102104, October 2005. doi: 10.1063/1.2044667. URL
https://doi.org/10.1063/1.2044667.
Koenraad M. R. Audenaert, John Calsamiglia, Ramon Muñoz Tapia, Emilio Bagan, Lluis Masanes,
Antonio Acin, and Frank Verstraete. Discriminating states: The quantum Chernoff bound.
Physical Review Letters, 98:160501, April 2007. doi: 10.1103/PhysRevLett.98.160501. URL
http://link.aps.org/doi/10.1103/PhysRevLett.98.160501.
Koenraad M. R. Audenaert, Milàn Mosonyi, and Frank Verstraete. Quantum state discrimination
bounds for finite sample size. Journal of Mathematical Physics, 53:122205, December 2012.
doi: 10.1063/1.4768252. URL https://doi.org/10.1063/1.4768252.
Masashi Ban, Kouichi Yamazaki, and Osamu Hirota. Accessible information in combined and
sequential quantum measurements on a binary-state signal. Physical Review A, 55:22–26,
January 1997. doi: 10.1103/PhysRevA.55.22. URL https://link.aps.org/doi/10.1103/
PhysRevA.55.22.
Howard Barnum, Michael A. Nielsen, and Benjamin Schumacher. Information transmission through
a noisy quantum channel. Physical Review A, 57:4153–4175, June 1998. doi: 10.1103/PhysRevA.
57.4153. URL https://link.aps.org/doi/10.1103/PhysRevA.57.4153.
Howard Barnum, Emanuel Knill, and Michael A. Nielsen. On quantum fidelities and channel
capacities. IEEE Transactions on Information Theory, 46:1317–1329, July 2000. URL
https://ieeexplore.ieee.org/document/850671.
David Beckman, Daniel Gottesman, Michael A. Nielsen, and John Preskill. Causal and localizable
quantum operations. Physical Review A, 64:052309, October 2001. doi: 10.1103/PhysRevA.64.
052309. URL https://link.aps.org/doi/10.1103/PhysRevA.64.052309.
Salman Beigi. Sandwiched Rényi divergence satisfies data processing inequality. Journal of
Mathematical Physics, 54:122202, December 2013. URL https://aip.scitation.org/
doi/10.1063/1.4838855.
Salman Beigi, Nilanjana Datta, and Felix Leditzky. Decoding quantum information via the Petz
recovery map. Journal of Mathematical Physics, 57:082203, August 2016. URL https:
//doi.org/10.1063/1.4961515.
V. P. Belavkin and P. Staszewski. C*-algebraic generalization of relative entropy and entropy. Annales
de l’I.H.P. Physique théorique, 37:51–58, 1982. URL http://eudml.org/doc/76163.
Charles H. Bennett and Gilles Brassard. Quantum cryptography: Public key distribution and coin
tossing. In International Conference on Computer System and Signal Processing, IEEE, 1984,
pages 175–179, 1984.
Charles H. Bennett and Stephen J. Wiesner. Communication via one- and two-particle operators on
Einstein-Podolsky-Rosen states. Physical Review Letters, 69:2881–2884, November 1992. doi:
10.1103/PhysRevLett.69.2881. URL https://link.aps.org/doi/10.1103/PhysRevLett.
69.2881.
Charles H. Bennett, Gilles Brassard, Claude Crépeau, Richard Jozsa, Asher Peres, and William K.
Wootters. Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen
channels. Physical Review Letters, 70:1895–1899, March 1993. URL https://link.aps.
org/doi/10.1103/PhysRevLett.70.1895.
Charles H. Bennett, Gilles Brassard, Claude Crepeau, and Ueli M. Maurer. Generalized privacy
amplification. IEEE Transactions on Information Theory, 41:1915–1923, November 1995. doi:
10.1109/18.476316. URL https://ieeexplore.ieee.org/document/476316.
Charles H. Bennett, Herbert J. Bernstein, Sandu Popescu, and Benjamin Schumacher. Concentrating
partial entanglement by local operations. Physical Review A, 53:2046–2052, April 1996a. doi:
10.1103/PhysRevA.53.2046. URL https://link.aps.org/doi/10.1103/PhysRevA.53.
2046.
Charles H. Bennett, Gilles Brassard, Sandu Popescu, Benjamin Schumacher, John A. Smolin, and
William K. Wootters. Purification of noisy entanglement and faithful teleportation via noisy
channels. Physical Review Letters, 76:722–725, January 1996b. doi: 10.1103/PhysRevLett.76.
722. URL https://link.aps.org/doi/10.1103/PhysRevLett.76.722.
Charles H. Bennett, David P. DiVincenzo, John A. Smolin, and William K. Wootters. Mixed-state
entanglement and quantum error correction. Physical Review A, 54:3824–3851, November 1996c.
doi: 10.1103/PhysRevA.54.3824. URL https://link.aps.org/doi/10.1103/PhysRevA.
54.3824.
Charles H. Bennett, David P. DiVincenzo, and John A. Smolin. Capacities of quantum erasure
channels. Physical Review Letters, 78:3217–3220, April 1997. doi: 10.1103/PhysRevLett.78.
3217. URL https://link.aps.org/doi/10.1103/PhysRevLett.78.3217.
Charles H. Bennett, David P. DiVincenzo, Christopher A. Fuchs, Tal Mor, Eric Rains, Peter W.
Shor, John A. Smolin, and William K. Wootters. Quantum nonlocality without entanglement.
Physical Review A, 59:1070–1091, February 1999a. doi: 10.1103/PhysRevA.59.1070. URL
http://link.aps.org/doi/10.1103/PhysRevA.59.1070.
Charles H. Bennett, Peter W. Shor, John A. Smolin, and Ashish V. Thapliyal. Entanglement-assisted
classical capacity of noisy quantum channels. Physical Review Letters, 83:3081–3084, October
1999b. doi: 10.1103/PhysRevLett.83.3081. URL https://link.aps.org/doi/10.1103/
PhysRevLett.83.3081.
Charles H. Bennett, Peter W. Shor, John A. Smolin, and Ashish V. Thapliyal. Entanglement-
assisted capacity of a quantum channel and the reverse Shannon theorem. IEEE Transactions on
Information Theory, 48:2637–2655, October 2002. URL https://ieeexplore.ieee.org/
document/1035117.
Charles H. Bennett, Aram W. Harrow, Debbie W. Leung, and John A. Smolin. On the capacities
of bipartite Hamiltonians and unitary gates. IEEE Transactions on Information Theory, 49:
1895–1911, August 2003. ISSN 0018-9448. doi: 10.1109/TIT.2003.814935. URL https:
//ieeexplore.ieee.org/document/1214070.
Charles H. Bennett, Igor Devetak, Peter W. Shor, and John A. Smolin. Inequalities and separations
among assisted capacities of quantum channels. Physical Review Letters, 96:150502, April 2006.
URL https://link.aps.org/doi/10.1103/PhysRevLett.96.150502.
Charles H. Bennett, Igor Devetak, Aram W. Harrow, Peter W. Shor, and Andreas Winter. The
quantum reverse Shannon theorem and resource tradeoffs for simulating quantum channels. IEEE
Transactions on Information Theory, 60:2926–2959, May 2014. URL https://ieeexplore.
ieee.org/document/6757002.
Dominic W. Berry. Qubit channels that achieve capacity with two states. Physical Review A, 71:
032334, March 2005. URL https://link.aps.org/doi/10.1103/PhysRevA.71.032334.
Mario Berta. Single-shot quantum state merging. Diploma thesis, ETH Zurich, February 2008.
Mario Berta and Mark M. Wilde. Amortization does not enhance the max-Rains information of a
quantum channel. New Journal of Physics, 20:053044, May 2018. doi: 10.1088/1367-2630/
aac153. URL https://doi.org/10.1088/1367-2630/aac153.
Mario Berta, Omar Fawzi, and Marco Tomamichel. On variational expressions for quantum
relative entropies. Letters in Mathematical Physics, 107:2239–2265, December 2017. URL
https://doi.org/10.1007/s11005-017-0990-7.
Reinhold A. Bertlmann and Philipp Krammer. Bloch vectors for qudits. Journal of Physics A:
Mathematical and Theoretical, 41:235303, May 2008. doi: 10.1088/1751-8113/41/23/235303.
URL https://doi.org/10.1088/1751-8113/41/23/235303.
Rajendra Bhatia. Matrix Analysis. Springer New York, 1997. doi: 10.1007/978-1-4612-0653-8.
Igor Bjelakovic and Rainer Siegmund-Schultze. Quantum Stein’s lemma revisited, inequalities for
quantum entropies, and a concavity theorem of Lieb. July 2003.
Garry Bowen. Quantum feedback channels. IEEE Transactions on Information Theory, 50:
2429–2434, October 2004. URL https://ieeexplore.ieee.org/document/1337116.
Garry Bowen and Sougato Bose. Teleportation as a depolarizing quantum channel, relative entropy,
and classical capacity. Physical Review Letters, 87:267901, December 2001. doi: 10.1103/
PhysRevLett.87.267901. URL https://link.aps.org/doi/10.1103/PhysRevLett.87.
267901.
Garry Bowen and Rajagopal Nagarajan. On feedback and the classical capacity of a noisy
quantum channel. IEEE Transactions on Information Theory, 51:320–324, January 2005. URL
https://ieeexplore.ieee.org/document/1365361.
Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
Fernando G. S. L. Brandao and Nilanjana Datta. One-shot rates for entanglement manipulation
under non-entangling maps. IEEE Transactions on Information Theory, 57:1754–1760, March
2011. ISSN 0018-9448. doi: 10.1109/TIT.2011.2104531. URL https://ieeexplore.ieee.
org/document/5714245.
Fernando G.S.L. Brandao, Matthias Christandl, and Jon Yard. Faithful squashed entanglement.
Communications in Mathematical Physics, 306:805–830, September 2011. ISSN 0010-3616. doi:
10.1007/s00220-011-1302-1. URL http://dx.doi.org/10.1007/s00220-011-1302-1.
Sarah Brandsen, Mengke Lian, Kevin D. Stubbs, Narayanan Rengaswamy, and Henry D. Pfister.
Adaptive Procedures for Discriminating Between Arbitrary Tensor-Product Quantum States. In
2020 IEEE International Symposium on Information Theory (ISIT), pages 1933–1938, 2020.
doi: 10.1109/ISIT44484.2020.9174234. URL https://ieeexplore.ieee.org/abstract/
document/9174234.
Samuel L. Braunstein, Giacomo M. D’Ariano, Gerard J. Milburn, and Massimiliano F. Sacchi.
Universal teleportation with a twist. Physical Review Letters, 84:3486–3489, April 2000. doi:
10.1103/PhysRevLett.84.3486. URL https://link.aps.org/doi/10.1103/PhysRevLett.
84.3486.
Heinz-Peter Breuer and Francesco Petruccione. The Theory of Open Quantum Systems. Oxford
University Press, 2002.
Dorje Brody and Bernhard Meister. Minimum decision cost for quantum ensembles. Physical
Review Letters, 76:1–5, January 1996. doi: 10.1103/PhysRevLett.76.1. URL https://link.
aps.org/doi/10.1103/PhysRevLett.76.1.
Francesco Buscemi and Nilanjana Datta. The quantum capacity of channels with arbitrarily
correlated noise. IEEE Transactions on Information Theory, 56:1447–1460, March 2010a.
ISSN 0018-9448. doi: 10.1109/TIT.2009.2039166. URL https://ieeexplore.ieee.org/
document/5429118.
Francesco Buscemi and Nilanjana Datta. Distilling entanglement from arbitrary resources. Jour-
nal of Mathematical Physics, 51:102201, October 2010b. doi: http://dx.doi.org/10.1063/
1.3483717. URL http://scitation.aip.org/content/aip/journal/jmp/51/10/10.
1063/1.3483717.
Mark S. Byrd and Navin Khaneja. Characterization of the positivity of the density matrix in terms
of the coherence vector representation. Physical Review A, 68:062322, December 2003. doi:
10.1103/PhysRevA.68.062322. URL https://link.aps.org/doi/10.1103/PhysRevA.68.
062322.
Ning Cai, Andreas Winter, and Raymond W. Yeung. Quantum privacy and quantum wiretap
channels. Problems of Information Transmission, 40:318–336, October 2004. ISSN 0032-9460.
URL http://dx.doi.org/10.1007/s11122-005-0002-x.
Gianfranco Cariolaro and Tomaso Erseghe. Pulse Position Modulation. John Wiley & Sons, Inc.,
2003. ISBN 9780471219286. doi: 10.1002/0471219282.eot394. URL http://dx.doi.org/
10.1002/0471219282.eot394.
Eric A. Carlen. Trace inequalities and quantum entropy: An introductory course. Contempo-
rary Mathematics, 529:73–140, 2010. URL http://www.ueltschi.org/AZschool/notes/
EricCarlen.pdf.
Filippo Caruso and Vittorio Giovannetti. Degradability of bosonic Gaussian channels. Physical
Review A, 74:062307, December 2006. URL https://journals.aps.org/pra/abstract/
10.1103/PhysRevA.74.062307.
Nicholas J. Cerf and Christoph Adami. Negative entropy and information in quantum mechanics.
Physical Review Letters, 79:5194–5197, December 1997. doi: 10.1103/PhysRevLett.79.5194.
URL https://link.aps.org/doi/10.1103/PhysRevLett.79.5194.
Nicolas J. Cerf and Chris Adami. Information theory of quantum entanglement and measurement.
Physica D: Nonlinear Phenomena, 120:62–81, September 1998. ISSN 0167-2789. doi:
https://doi.org/10.1016/S0167-2789(98)00045-1. URL http://www.sciencedirect.com/
science/article/pii/S0167278998000451.
Herman Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the
sum of observations. The Annals of Mathematical Statistics, 23:493–507, 12 1952. doi:
10.1214/aoms/1177729330. URL https://doi.org/10.1214/aoms/1177729330.
Giulio Chiribella, Giacomo Mauro D’Ariano, and Paolo Perinotti. Realization schemes for quantum
instruments in finite dimensions. Journal of Mathematical Physics, 50:042101, April 2009. doi:
10.1063/1.3105923. URL http://dx.doi.org/10.1063/1.3105923.
Eric Chitambar, Debbie Leung, Laura Mančinska, Maris Ozols, and Andreas Winter. Everything you
always wanted to know about LOCC (but were afraid to ask). Communications in Mathematical
Physics, 328:303–326, May 2014. ISSN 1432-0916. doi: 10.1007/s00220-014-1953-9. URL
http://dx.doi.org/10.1007/s00220-014-1953-9.
Eric Chitambar, Julio I. de Vicente, Mark W. Girard, and Gilad Gour. Entanglement manipulation
and distillability beyond LOCC. Journal of Mathematical Physics, 61:042201, April 2020. URL
https://doi.org/10.1063/1.5124109.
Man-Duen Choi. Completely positive linear maps on complex matrices. Linear Algebra and
Its Applications, 10:285–290, 1975. URL https://www.sciencedirect.com/science/
article/pii/0024379575900750.
Matthias Christandl. The Structure of Bipartite Quantum States: Insights from Group Theory and
Cryptography. PhD thesis, University of Cambridge, April 2006.
Matthias Christandl and Alexander Müller-Hermes. Relative entropy bounds on quantum, private
and repeater capacities. Communications in Mathematical Physics, 353:821–852, July 2017. doi:
10.1007/s00220-017-2885-y. URL https://doi.org/10.1007/s00220-017-2885-y.
Matthias Christandl, Artur Ekert, Michal Horodecki, Pawel Horodecki, Jonathan Oppenheim, and
Renato Renner. Unifying classical and quantum key distillation. Proceedings of the 4th Theory
of Cryptography Conference, Lecture Notes in Computer Science, 4392:456–478, February 2007.
URL https://doi.org/10.1007/978-3-540-70936-7_25.
Matthias Christandl, Norbert Schuch, and Andreas Winter. Entanglement of the antisymmetric
state. Communications in Mathematical Physics, 311:397–422, March 2012. URL https:
//doi.org/10.1007/s00220-012-1446-7.
Benoît Collins. Moments and cumulants of polynomial random variables on unitary groups, the
Itzykson-Zuber integral, and free probability. International Mathematics Research Notices,
2003:953–982, 01 2003. ISSN 1073-7928. doi: 10.1155/S107379280320917X. URL https:
//doi.org/10.1155/S107379280320917X.
Benoît Collins and Piotr Śniady. Integration with Respect to the Haar Measure on Unitary,
Orthogonal and Symplectic Group. Communications in Mathematical Physics, 264:773–795,
June 2006. ISSN 1432-0916. doi: 10.1007/s00220-006-1554-3. URL https://doi.org/10.
1007/s00220-006-1554-3.
Tom Cooney, Milan Mosonyi, and Mark M. Wilde. Strong converse exponents for a quantum
channel discrimination problem and quantum-feedback-assisted communication. Communica-
tions in Mathematical Physics, 344:797–829, June 2016. URL https://doi.org/10.1007/
s00220-016-2645-4.
John Cortese. Relative entropy and single qubit Holevo-Schumacher-Westmoreland channel capacity.
July 2002.
Imre Csiszár and Janos Körner. Broadcast channels with confidential messages. IEEE Transactions
on Information Theory, 24:339–348, May 1978. URL https://ieeexplore.ieee.org/
document/1055892.
Toby Cubitt, David Elkouss, William Matthews, Maris Ozols, David Pérez-García, and Sergii
Strelchuk. Unbounded number of channel uses may be required to detect quantum capacity.
Nature Communications, 6:6739, 2015. URL https://doi.org/10.1038/ncomms7739.
Toby S. Cubitt, Mary Beth Ruskai, and Graeme Smith. The structure of degradable quantum
channels. Journal of Mathematical Physics, 49:102104, 2008. URL https://doi.org/10.
1063/1.2953685.
Marcos Curty, Maciej Lewenstein, and Norbert Lütkenhaus. Entanglement as a precondition for
secure quantum key distribution. Physical Review Letters, 92:217903, May 2004. doi: 10.
1103/PhysRevLett.92.217903. URL https://link.aps.org/doi/10.1103/PhysRevLett.
92.217903.
Nilanjana Datta. Max-relative entropy of entanglement, alias log robustness. International Journal
of Quantum Information, 7:475–491, January 2009a. URL https://www.worldscientific.
com/doi/abs/10.1142/S0219749909005298.
Nilanjana Datta. Min- and max-relative entropies and a new entanglement monotone. IEEE
Transactions on Information Theory, 55:2816–2826, June 2009b. URL https://ieeexplore.
ieee.org/document/4957651.
Nilanjana Datta and Min-Hsiu Hsieh. One-shot entanglement-assisted quantum and classical
communication. IEEE Transactions on Information Theory, 59:1929–1939, March 2013. URL
https://ieeexplore.ieee.org/document/6359930.
Nilanjana Datta and Felix Leditzky. A limit of the quantum Rényi divergence. Journal of Physics
A: Mathematical and Theoretical, 47:045304, January 2014. URL http://stacks.iop.org/
1751-8121/47/i=4/a=045304.
Nilanjana Datta, Milan Mosonyi, Min-Hsiu Hsieh, and Fernando G. S. L. Brandão. A smooth entropy
approach to quantum hypothesis testing and the classical capacity of quantum channels. IEEE
Transactions on Information Theory, 59:8014–8026, December 2013. ISSN 0018-9448. doi:
10.1109/TIT.2013.2282160. URL https://ieeexplore.ieee.org/document/6670246.
Nilanjana Datta, Marco Tomamichel, and Mark M. Wilde. On the second-order asymptotics for
entanglement-assisted communication. Quantum Information Processing, 15:2569–2591, June
2016. URL https://doi.org/10.1007/s11128-016-1272-5.
Igor Devetak. The private classical capacity and quantum capacity of a quantum channel. IEEE
Transactions on Information Theory, 51:44–55, January 2005. doi: 10.1109/TIT.2004.839515.
URL https://ieeexplore.ieee.org/document/1377491.
Igor Devetak and Peter W. Shor. The Capacity of a Quantum Channel for Simultaneous Transmission
of Classical and Quantum Information. Communications in Mathematical Physics, 256:287–303,
June 2005. URL https://doi.org/10.1007/s00220-005-1317-6.
Igor Devetak and Andreas Winter. Distillation of secret key and entanglement from quantum
states. Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering
Sciences, 461:207–235, January 2005. doi: 10.1098/rspa.2004.1372. URL https://doi.org/
10.1098/rspa.2004.1372.
Igor Devetak, Marius Junge, Christopher King, and Mary Beth Ruskai. Multiplicativity of completely
bounded 𝑝-norms implies a new additivity result. Communications in Mathematical Physics,
266:37–63, August 2006. URL https://doi.org/10.1007/s00220-006-0034-0.
Dawei Ding and Mark M. Wilde. Strong converse for the feedback-assisted classical capacity of
entanglement-breaking channels. Problems of Information Transmission, 54:1–19, 2018. URL
https://doi.org/10.1134/S0032946018010015.
Dawei Ding, Yihui Quek, Peter W. Shor, and Mark M. Wilde. Entropy bound for the classical
capacity of a quantum channel assisted by classical feedback. In Proceedings of the 2019 IEEE
International Symposium on Information Theory, pages 250–254, Paris, France, July 2019. URL
https://ieeexplore.ieee.org/document/8849604.
Dawei Ding, Sumeet Khatri, Yihui Quek, Peter W. Shor, Xin Wang, and Mark M. Wilde.
Bounding the forward classical capacity of bipartite quantum channels. IEEE Transactions
on Information Theory, 69(5):3034–3061, May 2023. doi: 10.1109/TIT.2022.3233924. URL
https://ieeexplore.ieee.org/document/10005080.
David P. DiVincenzo, Peter W. Shor, and John A. Smolin. Quantum-channel capacity of very noisy
channels. Physical Review A, 57:830–839, February 1998. doi: 10.1103/PhysRevA.57.830.
URL https://link.aps.org/doi/10.1103/PhysRevA.57.830.
David P. DiVincenzo, Peter W. Shor, John A. Smolin, Barbara M. Terhal, and Ashish V. Thapliyal.
Evidence for bound entangled states with negative partial transpose. Physical Review A, 61:062312,
May 2000. doi: 10.1103/PhysRevA.61.062312. URL https://link.aps.org/doi/10.1103/
PhysRevA.61.062312.
Andrew C. Doherty, Pablo A. Parrilo, and Federico M. Spedalieri. Complete family of separability
criteria. Physical Review A, 69:022308, February 2004. doi: 10.1103/PhysRevA.69.022308.
URL https://link.aps.org/doi/10.1103/PhysRevA.69.022308.
Frédéric Dupuis. The decoupling approach to quantum information theory. PhD thesis, University
of Montreal, April 2010.
Frédéric Dupuis and Mark M. Wilde. Swiveled Rényi entropies. Quantum Information Processing,
15:1309–1345, March 2016. ISSN 1573-1332. doi: 10.1007/s11128-015-1211-x. URL
http://dx.doi.org/10.1007/s11128-015-1211-x.
Frédéric Dupuis, Lea Kraemer, Philippe Faist, Joseph M. Renes, and Renato Renner. Generalized
entropies. XVIIth International Congress on Mathematical Physics, pages 134–153, 2013. URL
https://doi.org/10.1142/9789814449243_0008.
Frédéric Dupuis, Mario Berta, Jürg Wullschleger, and Renato Renner. One-shot decoupling.
Communications in Mathematical Physics, 328:251–284, May 2014. ISSN 1432-0916. doi:
10.1007/s00220-014-1990-4. URL http://dx.doi.org/10.1007/s00220-014-1990-4.
Wolfgang Dür, J. Ignacio Cirac, Maciej Lewenstein, and Dagmar Bruß. Distillability and partial
transposition in bipartite systems. Physical Review A, 61:062313, May 2000. doi:
10.1103/PhysRevA.61.062313. URL https://link.aps.org/doi/10.1103/PhysRevA.61.062313.
Artur K. Ekert. Quantum cryptography based on Bell’s theorem. Physical Review Letters, 67:661–
663, August 1991. URL https://link.aps.org/doi/10.1103/PhysRevLett.67.661.
David Elkouss and Sergii Strelchuk. Superadditivity of private information for any number of uses
of the channel. Physical Review Letters, 115:040501, July 2015. doi: 10.1103/PhysRevLett.115.
040501. URL http://link.aps.org/doi/10.1103/PhysRevLett.115.040501.
Kun Fang and Hamza Fawzi. Geometric Rényi divergence and its applications in quantum
channel capacities. Communications in Mathematical Physics, 384:1615–1677, June 2021. doi:
10.1007/s00220-021-04064-4. URL https://doi.org/10.1007/s00220-021-04064-4.
William Feller. An Introduction to Probability Theory and Its Applications, volume 1. Wiley, third
edition, 1968.
Rupert L. Frank and Elliott H. Lieb. Monotonicity of a relative Rényi entropy. Journal of
Mathematical Physics, 54:122201, December 2013. URL https://doi.org/10.1063/1.
4838835.
Bert Fristedt and Lawrence Gray. A Modern Approach to Probability Theory. Birkhäuser, Boston,
1997. ISBN 978-1-4899-2839-9.
Christopher A. Fuchs and Carlton M. Caves. Mathematical techniques for quantum communication
theory. Open Systems & Information Dynamics, 3:345–356, 1995. doi: 10.1007/BF02228997.
URL https://doi.org/10.1007/BF02228997.
Christopher A. Fuchs and Jeroen van de Graaf. Cryptographic distinguishability measures for
quantum mechanical states. IEEE Transactions on Information Theory, 45:1216–1227, May
1998. URL https://ieeexplore.ieee.org/document/761271.
Jun-Ichi Fujii, Masatoshi Fujii, and Ritsuo Nakamoto. Jensen’s operator inequality and its
application. Sûrikaisekikenkyûsho Kôkyûroku, 1396:85–93, March 2004. URL http://www.
kurims.kyoto-u.ac.jp/~kyodo/kokyuroku/contents/pdf/1396-10.pdf.
Jingliang Gao. Quantum union bounds for sequential projective measurements. Physical Review A,
92:052331, November 2015. doi: 10.1103/PhysRevA.92.052331. URL https://link.aps.
org/doi/10.1103/PhysRevA.92.052331.
Raúl García-Patrón, Stefano Pirandola, Seth Lloyd, and Jeffrey H. Shapiro. Reverse coherent
information. Physical Review Letters, 102:210501, May 2009. doi: 10.1103/PhysRevLett.102.
210501. URL https://link.aps.org/doi/10.1103/PhysRevLett.102.210501.
Raúl García-Patrón, William Matthews, and Andreas Winter. Quantum enhancement of randomness
distribution. IEEE Transactions on Information Theory, 64:4664–4673, June 2018. URL
https://ieeexplore.ieee.org/document/8328871.
I. Gelfand and Mark Aronovich Naimark. On the imbedding of normed rings into the
ring of operators in Hilbert space. Rec. Math. [Mat. Sbornik] N.S., 12(54):197–217,
1943. URL http://www.mathnet.ru/php/archive.phtml?wshow=paper&jrnid=sm&
paperid=6155&option_lang=eng.
Sevag Gharibian. Strong NP-hardness of the quantum separability problem. Quantum Information
and Computation, 10:343–360, March 2010. ISSN 1533-7146. URL https://doi.org/10.
26421/QIC10.3-4-11.
Géza Giedke and J. Ignacio Cirac. Characterization of Gaussian operations and distillation of
Gaussian states. Physical Review A, 66:032316, September 2002. doi: 10.1103/PhysRevA.66.
032316. URL https://link.aps.org/doi/10.1103/PhysRevA.66.032316.
Alexei Gilchrist, Nathan K. Langford, and Michael A. Nielsen. Distance measures to compare
real and ideal quantum processes. Physical Review A, 71:062310, June 2005. URL https:
//link.aps.org/doi/10.1103/PhysRevA.71.062310.
Vittorio Giovannetti, Seth Lloyd, and Lorenzo Maccone. Achieving the Holevo bound via sequential
measurements. Physical Review A, 85:012302, January 2012. URL https://link.aps.org/
doi/10.1103/PhysRevA.85.012302.
Daniel Gottesman and Isaac L. Chuang. Demonstrating the viability of universal quantum
computation using teleportation and single-qubit operations. Nature, 402:390–393, November
1999. doi: 10.1038/46503. URL https://doi.org/10.1038/46503.
Markus Grassl, Thomas Beth, and Thomas Pellizzari. Codes for the quantum erasure channel.
Physical Review A, 56:33–38, July 1997. doi: 10.1103/PhysRevA.56.33. URL https://link.
aps.org/doi/10.1103/PhysRevA.56.33.
Manish Gupta and Mark M. Wilde. Multiplicativity of completely bounded 𝑝-norms implies a
strong converse for entanglement-assisted capacity. Communications in Mathematical Physics,
334:867–887, March 2015. URL https://doi.org/10.1007/s00220-014-2212-9.
Leonid Gurvits. Classical complexity and quantum entanglement. Journal of Computer and System
Sciences, 69:448–484, 2004. ISSN 0022-0000. doi: 10.1016/j.jcss.2004.06.003.
URL http://www.sciencedirect.com/science/article/pii/S0022000004000893.
Brian C. Hall. Quantum Theory for Mathematicians. Graduate Texts in Mathematics. Springer New
York, 2013. ISBN 9781461471165.
Frank Hansen and Gert K. Pedersen. Jensen’s operator inequality. Bulletin of the London Mathe-
matical Society, 35:553–564, July 2003. ISSN 1469-2120. doi: 10.1112/S0024609303002200.
URL http://dx.doi.org/10.1112/S0024609303002200.
Paul Hausladen, Richard Jozsa, Benjamin Schumacher, Michael Westmoreland, and William K.
Wootters. Classical information capacity of a quantum channel. Physical Review A, 54:1869–
1876, September 1996. doi: 10.1103/PhysRevA.54.1869. URL https://link.aps.org/doi/
10.1103/PhysRevA.54.1869.
Masahito Hayashi. Error exponent in asymmetric quantum hypothesis testing and its application
to classical-quantum channel coding. Physical Review A, 76:062301, December 2007. doi:
10.1103/PhysRevA.76.062301. URL https://link.aps.org/doi/10.1103/PhysRevA.76.
062301.
Masahito Hayashi and Hiroshi Nagaoka. General formulas for capacity of classical-quantum
channels. IEEE Transactions on Information Theory, 49:1753–1768, July 2003. URL https:
//ieeexplore.ieee.org/document/1023343.
Patrick Hayden, Richard Jozsa, Dénes Petz, and Andreas Winter. Structure of states which satisfy
strong subadditivity of quantum entropy with equality. Communications in Mathematical Physics,
246:359–374, April 2004. URL https://doi.org/10.1007/s00220-004-1049-z.
Patrick Hayden, Michał Horodecki, Andreas Winter, and Jon Yard. A decoupling approach
to the quantum capacity. Open Systems & Information Dynamics, 15:7–19, 2008a. doi:
10.1142/S1230161208000043. URL https://doi.org/10.1142/S1230161208000043.
Patrick Hayden, Peter W. Shor, and Andreas Winter. Random quantum codes from Gaussian
ensembles and an uncertainty relation. Open Systems & Information Dynamics, 15:71–89, March
2008b. URL https://doi.org/10.1142/S1230161208000079.
Teiko Heinosaari and Mário Ziman. The Mathematical Language of Quantum Theory: From
Uncertainty to Entanglement. Cambridge University Press, 2012.
Carl W. Helstrom. Detection theory and quantum mechanics. Information and Control, 10:254–291,
1967. ISSN 0019-9958. URL https://doi.org/10.1016/S0019-9958(67)90302-6.
Carl W. Helstrom. Quantum detection and estimation theory. Journal of Statistical Physics, 1:
231–252, 1969. ISSN 0022-4715. doi: 10.1007/BF01007479. URL https://doi.org/10.
1007/BF01007479.
Carl W. Helstrom. Quantum Detection and Estimation Theory. Academic Press, 1976.
Fumio Hiai and Milán Mosonyi. Different quantum 𝑓 -divergences and the reversibility of quantum
operations. Reviews in Mathematical Physics, 29:1750023, August 2017. URL https:
//doi.org/10.1142/S0129055X17500234.
Fumio Hiai and Dénes Petz. The proper formula for relative entropy and its asymptotics in quantum
probability. Communications in Mathematical Physics, 143:99–114, December 1991. URL
https://doi.org/10.1007/BF02100287.
B. L. Higgins, A. C. Doherty, S. D. Bartlett, G. J. Pryde, and H. M. Wiseman. Multiple-copy state
discrimination: Thinking globally, acting locally. Physical Review A, 83:052314, May 2011. doi:
10.1103/PhysRevA.83.052314. URL https://link.aps.org/doi/10.1103/PhysRevA.83.
052314.
Foek T. Hioe and Joseph H. Eberly. 𝑛-level coherence vector and higher conservation laws in quantum
optics and quantum mechanics. Physical Review Letters, 47:838–841, September 1981. doi:
10.1103/PhysRevLett.47.838. URL https://link.aps.org/doi/10.1103/PhysRevLett.
47.838.
Alexander S. Holevo. The capacity of the quantum channel with general signal states. IEEE
Transactions on Information Theory, 44:269–273, January 1998. URL https://ieeexplore.
ieee.org/document/651037.
Alexander S. Holevo. Remarks on the classical capacity of quantum channel. December 2002b.
Alexander S. Holevo. Multiplicativity of 𝑝-norms of completely positive maps and the additivity
problem in quantum information theory. Russian Mathematical Surveys, 61:301–339, 2006.
URL https://doi.org/10.1070/rm2006v061n02abeh004313.
Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University
Press, 2013. ISBN 9780521839402.
Karol Horodecki, Michał Horodecki, Paweł Horodecki, and Jonathan Oppenheim. Secure key
from bound entanglement. Physical Review Letters, 94:160502, April 2005a. doi: 10.
1103/PhysRevLett.94.160502. URL http://link.aps.org/doi/10.1103/PhysRevLett.
94.160502.
Karol Horodecki, Michał Horodecki, Paweł Horodecki, Debbie Leung, and Jonathan Oppenheim.
Quantum key distribution based on private states: Unconditional security over untrusted channels
with zero quantum capacity. IEEE Transactions on Information Theory, 54:2604–2620, June
2008a. ISSN 0018-9448. doi: 10.1109/TIT.2008.921870. URL https://ieeexplore.ieee.
org/document/4529275.
Karol Horodecki, Michał Horodecki, Paweł Horodecki, Debbie Leung, and Jonathan Oppenheim.
Unconditional privacy over channels which cannot convey quantum information. Physical
Review Letters, 100:110502, March 2008b. doi: 10.1103/PhysRevLett.100.110502. URL
http://link.aps.org/doi/10.1103/PhysRevLett.100.110502.
Karol Horodecki, Michał Horodecki, Paweł Horodecki, and Jonathan Oppenheim. General paradigm
for distilling classical key from quantum states. IEEE Transactions on Information Theory, 55:
1898–1929, April 2009a. URL https://ieeexplore.ieee.org/document/4802308.
Michał Horodecki. Simplifying monotonicity conditions for entanglement measures. Open Systems
& Information Dynamics, 12:231–237, September 2005. URL https://doi.org/10.1007/
s11080-005-0920-5.
Michał Horodecki and Paweł Horodecki. Reduction criterion of separability and limits for a class of
distillation protocols. Physical Review A, 59:4206–4216, June 1999. doi: 10.1103/PhysRevA.59.
4206. URL http://link.aps.org/doi/10.1103/PhysRevA.59.4206.
Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki. Separability of mixed states: nec-
essary and sufficient conditions. Physics Letters A, 223:1–8, November 1996. doi: 10.1016/
s0375-9601(96)00706-2. URL https://www.sciencedirect.com/science/article/
pii/S0375960196007062.
Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki. Mixed-state entanglement and
distillation: Is there a ‘bound’ entanglement in nature? Physical Review Letters, 80:5239–5242,
June 1998. doi: 10.1103/physrevlett.80.5239. URL https://link.aps.org/doi/10.1103/
PhysRevLett.80.5239.
Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki. General teleportation channel, singlet
fraction, and quasidistillation. Physical Review A, 60:1888–1898, September 1999. doi: 10.1103/
PhysRevA.60.1888. URL https://link.aps.org/doi/10.1103/PhysRevA.60.1888.
Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki. Unified approach to quantum
capacities: Towards quantum noisy coding theorem. Physical Review Letters, 85:433–436,
July 2000. doi: 10.1103/PhysRevLett.85.433. URL https://link.aps.org/doi/10.1103/
PhysRevLett.85.433.
Michał Horodecki, Peter W. Shor, and Mary Beth Ruskai. Entanglement breaking channels.
Reviews in Mathematical Physics, 15:629–641, 2003. URL https://doi.org/10.1142/
S0129055X03001709.
Michał Horodecki, Jonathan Oppenheim, and Andreas Winter. Partial quantum information. Nature,
436:673–676, August 2005b. URL https://doi.org/10.1038/nature03909.
Michał Horodecki, Jonathan Oppenheim, and Andreas Winter. Quantum state merging and negative
information. Communications in Mathematical Physics, 269:107–136, January 2007. URL
https://doi.org/10.1007/s00220-006-0118-x.
Paweł Horodecki. Separability criterion and inseparable mixed states with positive partial
transposition. Physics Letters A, 232:333–339, August 1997. ISSN 0375-9601. doi:
10.1016/S0375-9601(97)00416-7. URL http://www.sciencedirect.com/
science/article/pii/S0375960197004167.
Ryszard Horodecki, Paweł Horodecki, Michał Horodecki, and Karol Horodecki. Quantum entangle-
ment. Reviews of Modern Physics, 81:865–942, June 2009b. doi: 10.1103/RevModPhys.81.865.
URL https://link.aps.org/doi/10.1103/RevModPhys.81.865.
Raban Iten, Joseph M. Renes, and David Sutter. Pretty good measures in quantum information
theory. IEEE Transactions on Information Theory, 63:1270–1279, February 2017. doi:
10.1109/TIT.2016.2639521. URL https://ieeexplore.ieee.org/document/7782776.
Vojkan Jakšić, Yoshiko Ogata, Claude-Alain Pillet, and Robert Seiringer. Quantum hypothesis
testing and non-equilibrium statistical mechanics. Reviews in Mathematical Physics, 24:1230002,
2012. URL https://doi.org/10.1142/S0129055X12300026.
Anna Jenčová. A relation between completely bounded norms and conjugate channels. Communications
in Mathematical Physics, 266:65–70, August 2006. URL https://doi.org/10.1007/
s00220-006-0035-z.
Anna Jenčová. Quantum hypothesis testing and sufficient subalgebras. Letters in Mathematical
Physics, 93:15–27, 2010. URL https://doi.org/10.1007/s11005-010-0398-0.
Vishal Katariya and Mark M. Wilde. Geometric distinguishability measures limit quantum channel
estimation and discrimination. Quantum Information Processing, 20:78, April 2021. URL
https://doi.org/10.1007/s11128-021-02992-7.
Eneet Kaur and Mark M. Wilde. Amortized entanglement of a quantum channel and approximately
teleportation-simulable channels. Journal of Physics A: Mathematical and Theoretical, July
2017. URL http://iopscience.iop.org/10.1088/1751-8121/aa9da7.
Johannes Henricus Bernardus Kemperman. Strong converses for a general memoryless channel
with feedback. In Transactions of the Sixth Prague Conference on Information Theory, Statistical
Decision Functions, and Random Processes, 1971.
Leonid G Khachiyan. Polynomial algorithms in linear programming. USSR Computational
Mathematics and Mathematical Physics, 20:53–72, 1980. ISSN 0041-5553. URL https:
//www.sciencedirect.com/science/article/pii/0041555380900610.
Sumeet Khatri, Eneet Kaur, Saikat Guha, and Mark M. Wilde. Second-order coding rates for key
distillation in quantum key distribution. October 2019.
Sumeet Khatri, Kunal Sharma, and Mark M. Wilde. Information-theoretic aspects of the generalized
amplitude-damping channel. Physical Review A, 102:012401, July 2020. URL https://link.
aps.org/doi/10.1103/PhysRevA.102.012401.
Gen Kimura. The Bloch vector for 𝑁-level systems. Physics Letters A, 314:339–349, 2003.
ISSN 0375-9601. doi: 10.1016/S0375-9601(03)00941-1. URL https://www.
sciencedirect.com/science/article/pii/S0375960103009411.
Christopher King. The capacity of the quantum depolarizing channel. IEEE Transactions
on Information Theory, 49:221–229, January 2003b. ISSN 0018-9448. URL https://
ieeexplore.ieee.org/document/1159773.
Christopher King, Keiji Matsumoto, Michael Nathanson, and Mary Beth Ruskai. Properties
of conjugate channels with applications to additivity and multiplicativity. Markov Processes
and Related Fields, 13:391–423, 2007. URL http://math-mprf.org/journal/articles/
id1123/. J. T. Lewis memorial issue.
Alexei Kitaev. Quantum computations: algorithms and error correction. Russian Mathematical Sur-
veys, 52:1191–1249, 1997. URL https://doi.org/10.1070/rm1997v052n06abeh002155.
Oskar Klein. Zur Quantenmechanischen Begründung des zweiten Hauptsatzes der Wärmelehre. Z.
Physik, 72:767–775, 1931.
Rochus Klesse. Approximate quantum error correction, random codes, and quantum channel
capacity. Physical Review A, 75:062315, June 2007. doi: 10.1103/PhysRevA.75.062315. URL
https://link.aps.org/doi/10.1103/PhysRevA.75.062315.
Rochus Klesse. A random coding based proof for the quantum coding theorem. Open Sys-
tems & Information Dynamics, 15:21–45, March 2008. URL https://doi.org/10.1142/
S1230161208000055.
Robert Koenig and Stephanie Wehner. A strong converse for classical channel coding using
entangled inputs. Physical Review Letters, 103:070504, August 2009. URL https://link.
aps.org/doi/10.1103/PhysRevLett.103.070504.
Robert Koenig, Renato Renner, and Christian Schaffner. The Operational Meaning of Min- and
Max-Entropy. IEEE Transactions on Information Theory, 55:4337–4347, September 2009. URL
https://ieeexplore.ieee.org/document/5208530.
Pieter Kok, W. J. Munro, Kae Nemoto, T. C. Ralph, Jonathan P. Dowling, and G. J. Milburn. Linear
optical quantum computing with photonic qubits. Reviews of Modern Physics, 79:135–174,
January 2007. doi: 10.1103/RevModPhys.79.135. URL https://link.aps.org/doi/10.
1103/RevModPhys.79.135.
Hidetoshi Komiya. Elementary proof for Sion’s minimax theorem. Kodai Mathematical Journal,
11:5–7, 1988. URL https://doi.org/10.2996/kmj/1138038812.
Karl Kraus. States, Effects and Operations: Fundamental Notions of Quantum Theory. Springer
Verlag, 1983.
Dennis Kretschmann and Reinhard F. Werner. Tema con variazioni: quantum channel capacity. New
Journal of Physics, 6:26, 2004. URL http://stacks.iop.org/1367-2630/6/i=1/a=026.
Erwin Kreyszig. Introductory Functional Analysis with Applications. Wiley Classics Library. Wiley,
1989. ISBN 9780471504597.
Lev Landau. Das Dämpfungsproblem in der Wellenmechanik. Zeitschrift für Physik, 45:430–441,
May 1927. ISSN 0044-3328. URL https://doi.org/10.1007/BF01343064.
Oscar E. Lanford, III and Derek W. Robinson. Mean entropy of states in quantum-statistical
mechanics. Journal of Mathematical Physics, 9:1120–1125, July 1968. doi: 10.1063/1.1664685.
URL https://doi.org/10.1063/1.1664685.
Jimmie D. Lawson and Yongdo Lim. The geometric mean, matrices, metrics, and more. The American
Mathematical Monthly, 108:797–812, November 2001. doi: 10.1080/00029890.2001.11919815.
URL https://doi.org/10.1080/00029890.2001.11919815.
Felix Leditzky. Relative entropies and their use in quantum information theory. PhD thesis,
University of Cambridge, November 2016.
Felix Leditzky. Distillable key of degradable states. unpublished, August 2019. private email
communication.
Felix Leditzky, Nilanjana Datta, and Graeme Smith. Useful states and entanglement distillation. IEEE
Transactions on Information Theory, 64:4689–4708, July 2018. doi: 10.1109/TIT.2017.2776907.
URL https://ieeexplore.ieee.org/document/8119865.
Felix Leditzky, Eneet Kaur, Nilanjana Datta, and Mark M. Wilde. Approaches for approximate
additivity of the Holevo information of quantum channels. Physical Review A, 97:012332,
January 2018. doi: 10.1103/PhysRevA.97.012332. URL https://link.aps.org/doi/10.
1103/PhysRevA.97.012332.
Yin Tat Lee, Aaron Sidford, and Sam Chiu Wai Wong. A faster cutting plane method and its
implications for combinatorial and convex optimization. In IEEE 56th Annual Symposium
on the Foundations of Computer Science, pages 1049–1065, October 2015. URL https:
//ieeexplore.ieee.org/document/7354442.
Matthew S. Leifer. Conditional density operators and the subjectivity of quantum operations.
AIP Conference Proceedings, 889:172–186, February 2007. doi: 10.1063/1.2713456. URL
https://aip.scitation.org/doi/abs/10.1063/1.2713456.
Matthew S. Leifer and Robert W. Spekkens. Towards a formulation of quantum theory as a causally
neutral theory of Bayesian inference. Physical Review A, 88:052130, November 2013. doi:
10.1103/PhysRevA.88.052130. URL http://link.aps.org/doi/10.1103/PhysRevA.88.
052130.
Matthew S. Leifer, Leah Henderson, and Noah Linden. Optimal entanglement generation from
quantum operations. Physical Review A, 67:012306, January 2003. doi: 10.1103/physreva.67.
012306. URL https://link.aps.org/doi/10.1103/PhysRevA.67.012306.
Debbie Leung and William Matthews. On the power of PPT-preserving and non-signalling codes.
IEEE Transactions on Information Theory, 61:4486–4499, August 2015. ISSN 0018-9448. doi:
10.1109/TIT.2015.2439953. URL https://ieeexplore.ieee.org/document/7115934.
Hou Li-Zhen and Fang Mao-Fa. Entanglement-assisted classical capacity of a generalized amplitude
damping channel. Chinese Physics Letters, 24:2482, 2007a. URL http://stacks.iop.org/
0256-307X/24/i=9/a=006.
Hou Li-Zhen and Fang Mao-Fa. The Holevo capacity of a generalized amplitude-damping channel.
Chinese Physics, 16:1843, 2007b. URL http://stacks.iop.org/1009-1963/16/i=7/a=
006.
Elliott H. Lieb. Convex trace functions and the Wigner-Yanase-Dyson conjecture. Advances in
Mathematics, 11:267–288, December 1973. URL https://doi.org/10.1016/0001-8708(73)90011-X.
Elliott H. Lieb and Mary Beth Ruskai. Proof of the strong subadditivity of quantum-mechanical
entropy. Journal of Mathematical Physics, 14:1938–1941, 1973a. URL https://doi.org/10.
1063/1.1666274.
Elliott H. Lieb and Mary Beth Ruskai. A fundamental property of quantum-mechanical entropy.
Physical Review Letters, 30:434–436, March 1973b. doi: 10.1103/PhysRevLett.30.434. URL
https://link.aps.org/doi/10.1103/PhysRevLett.30.434.
Elliott H. Lieb and Walter E. Thirring. Inequalities for the Moments of the Eigenvalues of the
Schrödinger Hamiltonian and Their Relation to Sobolev Inequalities, pages 269–304. Princeton
University Press, 1976. doi: 10.1515/9781400868940-014. URL
https://doi.org/10.1515/9781400868940-014.
Göran Lindblad. Completely positive maps and entropy inequalities. Communications in Mathe-
matical Physics, 40:147–151, June 1975. ISSN 0010-3616. doi: 10.1007/BF01609396. URL
http://dx.doi.org/10.1007/BF01609396.
Zi-Wen Liu and Andreas Winter. Resource theories of quantum channels and the universal role of
resource erasure. April 2019.
Seth Lloyd. Capacity of the noisy quantum channel. Physical Review A, 55:1613–1622, March 1997.
doi: 10.1103/PhysRevA.55.1613. URL https://link.aps.org/doi/10.1103/PhysRevA.
55.1613.
Per-Olov Löwdin. On the nonorthogonality problem. Advances in Quantum Chemistry, 5:185–199,
1970. ISSN 0065-3276. doi: 10.1016/S0065-3276(08)60339-1. URL https://www.sciencedirect.com/
science/article/pii/S0065327608603391.
Per-Olov Löwdin. On the non-orthogonality problem connected with the use of atomic wave
functions in the theory of molecules and crystals. The Journal of Chemical Physics, 18:365–375,
1950. doi: 10.1063/1.1747632. URL https://doi.org/10.1063/1.1747632.
Keiji Matsumoto. Quantum fidelities, their duals, and convex analysis. August 2014a.
Keiji Matsumoto. On the condition of conversion of classical probability distribution families into
quantum families. December 2014b.
Keiji Matsumoto. A new quantum version of 𝑓 -divergence. In Masanao Ozawa, Jeremy Butterfield,
Hans Halvorson, Miklós Rédei, Yuichiro Kitajima, and Francesco Buscemi, editors, Reality
and Measurement in Algebraic Quantum Theory, volume 261, pages 229–273, Singapore, 2018.
Springer Singapore. ISBN 9789811324864 9789811324871. doi: 10.1007/978-981-13-2487-1_
10. URL http://link.springer.com/10.1007/978-981-13-2487-1_10. Series Title:
Springer Proceedings in Mathematics & Statistics.
William Matthews and Stephanie Wehner. Finite blocklength converse bounds for quantum
channels. IEEE Transactions on Information Theory, 60:7317–7329, November 2014. URL
https://ieeexplore.ieee.org/document/6891222.
Ueli M. Maurer. Secret key agreement by public discussion from common information. IEEE
Transactions on Information Theory, 39:733–742, May 1993. URL https://ieeexplore.
ieee.org/document/256484.
Simon Milz and Kavan Modi. Quantum Stochastic Processes and Quantum non-Markovian
Phenomena. PRX Quantum, 2:030201, July 2021. doi: 10.1103/PRXQuantum.2.030201. URL
https://link.aps.org/doi/10.1103/PRXQuantum.2.030201.
Adam Miranowicz and Satoshi Ishizaka. Closed formula for the relative entropy of entanglement.
Physical Review A, 78:032310, September 2008. doi: 10.1103/PhysRevA.78.032310. URL
https://link.aps.org/doi/10.1103/PhysRevA.78.032310.
Gert Molière and Max Delbrück. Statistische Quantenmechanik und Thermodynamik. Berlin:
Verlag der Akademie der Wissenschaften, 1935.
Ciara Morgan and Andreas Winter. ‘Pretty strong’ converse for the quantum capacity of degradable
channels. IEEE Transactions on Information Theory, 60:317–333, January 2014. URL
https://ieeexplore.ieee.org/document/6663606.
Milán Mosonyi and Nilanjana Datta. Generalized relative entropies and the capacity of
classical-quantum channels. Journal of Mathematical Physics, 50:072104, July 2009. doi:
10.1063/1.3167288. URL http://dx.doi.org/10.1063/1.3167288.
Milán Mosonyi and Fumio Hiai. On the quantum Rényi relative entropies and related capacity
formulas. IEEE Transactions on Information Theory, 57:2474–2487, April 2011. URL
https://ieeexplore.ieee.org/document/5730573.
Milán Mosonyi and Tomohiro Ogawa. Quantum hypothesis testing and the operational interpretation
of the quantum Rényi relative entropies. Communications in Mathematical Physics, 334:
1617–1648, March 2015. URL https://doi.org/10.1007/s00220-014-2248-x.
Milán Mosonyi and Dénes Petz. Structure of sufficient quantum coarse-grainings. Letters in
Mathematical Physics, 68:19–30, April 2004. ISSN 1573-0530. URL https://doi.org/10.
1007/s11005-004-4072-2.
Martin Müller-Lennert, Frédéric Dupuis, Oleg Szehr, Serge Fehr, and Marco Tomamichel. On
quantum Rényi entropies: a new generalization and some properties. Journal of Mathematical
Physics, 54:122203, December 2013. URL https://doi.org/10.1063/1.4838856.
Hiroshi Nagaoka. The converse part of the theorem for quantum Hoeffding bound. November 2006.
Mark Aronovich Naimark. Spectral functions of a symmetric operator. Izv. Akad. Nauk SSSR
Ser. Mat., 4:277–318, 1940. URL http://www.mathnet.ru/php/archive.phtml?wshow=
paper&jrnid=im&paperid=3745&option_lang=eng.
Michael A. Nielsen. Continuity bounds for entanglement. Physical Review A, 61:064301, April
2000. URL https://link.aps.org/doi/10.1103/PhysRevA.61.064301.
Michael A. Nielsen. A simple formula for the average gate fidelity of a quantum dynamical
operation. Physics Letters A, 303:249–252, 2002. ISSN 0375-9601. doi:
10.1016/S0375-9601(02)01272-0. URL https://www.sciencedirect.com/science/article/
pii/S0375960102012720.
Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum Information.
Cambridge University Press, 2000.
Julien Niset, Jaromír Fiurášek, and Nicolas J. Cerf. No-go theorem for Gaussian quantum error
correction. Physical Review Letters, 102:120501, March 2009. doi: 10.1103/PhysRevLett.102.
120501. URL http://link.aps.org/doi/10.1103/PhysRevLett.102.120501.
Michael Nussbaum and Arleta Szkoła. The Chernoff lower bound for symmetric quantum hypothesis
testing. The Annals of Statistics, 37:1040–1057, 2009. doi: 10.1214/08-AOS593. URL
https://doi.org/10.1214/08-AOS593.
Tomohiro Ogawa and Hiroshi Nagaoka. Strong Converse and Stein’s Lemma in Quantum Hypothesis
Testing, pages 28–42. 2005. doi: 10.1142/9789812563071_0003. URL https://www.
worldscientific.com/doi/abs/10.1142/9789812563071_0003.
Samad Khabbazi Oskouei, Stefano Mancini, and Mark M. Wilde. Union bound for quantum
information processing. Proceedings of the Royal Society A, 475:20180612, January 2019.
doi: 10.1098/rspa.2018.0612. URL https://royalsocietypublishing.org/doi/abs/10.
1098/rspa.2018.0612.
Vern Paulsen. Completely Bounded Maps and Operator Algebras. Cambridge Studies in Advanced
Mathematics. Cambridge University Press, 2003. doi: 10.1017/CBO9780511546631.
Asher Peres. Separability criterion for density matrices. Physical Review Letters, 77:1413–1415,
August 1996. doi: 10.1103/PhysRevLett.77.1413. URL http://link.aps.org/doi/10.
1103/PhysRevLett.77.1413.
Dénes Petz. Quasi-entropies for States of a von Neumann Algebra. Publications of the Research
Institute for Mathematical Sciences, 21:787–800, 1985. doi: 10.2977/prims/1195178929. URL
https://doi.org/10.2977/prims/1195178929.
Dénes Petz. Quasi-entropies for finite quantum systems. Reports in Mathematical Physics, 23:
57–65, 1986a. URL https://doi.org/10.1016/0034-4877(86)90067-4.
Dénes Petz. Sufficient subalgebras and the relative entropy of states of a von Neumann algebra.
Communications in Mathematical Physics, 105:123–131, March 1986b. ISSN 1432-0916. URL
https://doi.org/10.1007/BF01212345.
Dénes Petz. Sufficiency of channels over von Neumann algebras. Quarterly Journal of Mathematics,
39:97–108, 1988. ISSN 1464-3847. URL https://doi.org/10.1093/qmath/39.1.97.
Dénes Petz. Monotonicity of quantum relative entropy revisited. Reviews in Mathematical Physics,
15:79, March 2003. URL https://doi.org/10.1142/S0129055X03001576.
Dénes Petz and Mary Beth Ruskai. Contraction of generalized relative entropy under stochastic
mappings on matrices. Infinite Dimensional Analysis, Quantum Probability and Related Topics,
1:83–89, January 1998. URL https://doi.org/10.1142/S0219025798000077.
Marco Piani, Michał Horodecki, Paweł Horodecki, and Ryszard Horodecki. Properties of quantum
nonsignaling boxes. Physical Review A, 74:012305, July 2006. doi: 10.1103/PhysRevA.74.
012305. URL https://link.aps.org/doi/10.1103/PhysRevA.74.012305.
Stefano Pirandola, Riccardo Laurenza, Carlo Ottaviani, and Leonardo Banchi. Fundamental limits
of repeaterless quantum communications. Nature Communications, 8:15043, 2017. URL
https://doi.org/10.1038/ncomms15043.
Martin B. Plenio. Logarithmic negativity: A full entanglement monotone that is not convex.
Physical Review Letters, 95:090503, August 2005. doi: 10.1103/PhysRevLett.95.090503. URL
https://link.aps.org/doi/10.1103/PhysRevLett.95.090503.
Martin B. Plenio, Shashank Virmani, and P. Papadopoulos. Operator monotones, the reduction
criterion and the relative entropy. Journal of Physics A: Mathematical and General, 33:L193,
June 2000. URL http://stacks.iop.org/0305-4470/33/i=22/a=101.
Yury Polyanskiy and Sergio Verdú. Arimoto channel coding converse and Rényi divergence.
In Proceedings of the 48th Annual Allerton Conference on Communication, Control, and
Computation, pages 1327–1333, September 2010. URL https://ieeexplore.ieee.org/
abstract/document/5707067.
Haoyu Qi, Kunal Sharma, and Mark M. Wilde. Entanglement-assisted private communication
over quantum broadcast channels. Journal of Physics A: Mathematical and Theoretical, 51:
374001, August 2018a. doi: 10.1088/1751-8121/aad5f3. URL https://doi.org/10.1088/
1751-8121/aad5f3.
Haoyu Qi, Qing-Le Wang, and Mark M. Wilde. Applications of position-based coding to classical
communication over quantum channels. Journal of Physics A, 51:444002, November 2018b.
URL https://doi.org/10.1088/1751-8121/aae290.
Lu-Feng Qiao, Alexander Streltsov, Jun Gao, Swapan Rana, Ruo-Jing Ren, Zhi-Qiang Jiao, Cheng-
Qiu Hu, Xiao-Yun Xu, Ci-Yu Wang, Hao Tang, Ai-Lin Yang, Zhi-Hao Ma, Maciej Lewenstein,
and Xian-Min Jin. Entanglement activation from quantum coherence and superposition.
Physical Review A, 98:052351, November 2018. doi: 10.1103/PhysRevA.98.052351. URL
https://link.aps.org/doi/10.1103/PhysRevA.98.052351.
Jaikumar Radhakrishnan, Pranab Sen, and Naqueeb Ahmad Warsi. One-shot private classical
capacity of quantum wiretap channel: Based on one-shot quantum covering lemma. March 2017.
Eric M. Rains. Bound on distillable entanglement. Physical Review A, 60:179–184, July 1999a.
doi: 10.1103/PhysRevA.60.179. URL http://link.aps.org/doi/10.1103/PhysRevA.60.
179.
Eric M. Rains. Rigorous treatment of distillable entanglement. Physical Review A, 60:173–178,
July 1999b. doi: 10.1103/PhysRevA.60.173. URL https://link.aps.org/doi/10.1103/
PhysRevA.60.173.
Alexey E. Rastegin. A lower bound on the relative error of mixed-state cloning and related
operations. Journal of Optics B: Quantum and Semiclassical Optics, 5:S647, December 2003.
URL http://stacks.iop.org/1464-4266/5/i=6/a=017.
Michael Reed and Barry Simon. Methods of Modern Mathematical Physics, volume I: Functional
Analysis. Academic Press, 1981. ISBN 9780080570488.
Joseph M. Renes and Renato Renner. Noisy channel coding via privacy amplification and information
reconciliation. IEEE Transactions on Information Theory, 57:7377–7385, November 2011.
ISSN 0018-9448. doi: 10.1109/TIT.2011.2162226. URL https://ieeexplore.ieee.org/
document/5967913.
Joseph M. Renes and Renato Renner. One-shot classical data compression with quantum side
information and the distillation of common randomness or secret keys. IEEE Transactions
on Information Theory, 58:1985–1991, March 2012. doi: 10.1109/TIT.2011.2177589. URL
https://ieeexplore.ieee.org/document/6157080.
Renato Renner. Security of Quantum Key Distribution. PhD thesis, ETH Zürich, December 2005.
Luca Rigovacca, Go Kato, Stefan Baeuml, Myungshik S. Kim, William J. Munro, and Koji Azuma.
Versatile relative entropy bounds for quantum networks. New Journal of Physics, 20:013033,
January 2018. URL https://doi.org/10.1088/1367-2630/aa9fcf.
Ralph Tyrrell Rockafellar. Convex Analysis. Princeton Landmarks in Mathematics and Physics.
Princeton University Press, 1970. ISBN 9780691015866.
Sheldon Ross. Introduction to Probability Models. Academic Press, 12th edition, 2019. ISBN
978-0-12-814346-9.
Aidan Roy and A. J. Scott. Unitary designs and codes. Designs, Codes and Cryptography, 53:
13–31, October 2009. doi: 10.1007/s10623-009-9290-2. URL https://doi.org/10.1007/
s10623-009-9290-2.
Walter Rudin. Principles of Mathematical Analysis. International Series in Pure and Applied
Mathematics. McGraw-Hill, 1976. ISBN 9780070856134.
Mary Beth Ruskai. Inequalities for quantum entropy: A review with conditions for equality. Journal
of Mathematical Physics, 43:4358–4375, September 2002. doi: 10.1063/1.1497701. URL
https://doi.org/10.1063/1.1497701.
Benjamin Schumacher. Sending entanglement through noisy quantum channels. Physical Review
A, 54:2614–2628, October 1996. doi: 10.1103/PhysRevA.54.2614. URL https://link.aps.
org/doi/10.1103/PhysRevA.54.2614.
Benjamin Schumacher and Michael A. Nielsen. Quantum data processing and error correction.
Physical Review A, 54:2629–2635, October 1996. doi: 10.1103/PhysRevA.54.2629. URL
https://link.aps.org/doi/10.1103/PhysRevA.54.2629.
Benjamin Schumacher and Michael D. Westmoreland. Sending classical information via noisy
quantum channels. Physical Review A, 56:131–138, July 1997. URL https://link.aps.org/
doi/10.1103/PhysRevA.56.131.
Pranab Sen. Achieving the Han–Kobayashi inner bound for the quantum interference channel
by sequential decoding. In 2012 IEEE International Symposium on Information Theory
Proceedings, pages 736–740, September 2012. doi: 10.1109/ISIT.2012.6284656. URL
https://ieeexplore.ieee.org/document/6284656.
Claude Shannon. The zero error capacity of a noisy channel. IRE Transactions on Information Theory,
IT-2:S8–S19, September 1956. URL https://ieeexplore.ieee.org/document/1056798.
Claude E. Shannon. Communication theory of secrecy systems. The Bell System Technical
Journal, 28:656–715, October 1949. doi: 10.1002/j.1538-7305.1949.tb00928.x. URL https:
//ieeexplore.ieee.org/document/6769090.
Naresh Sharma. Equality conditions for the quantum 𝑓 -relative entropy and generalized data
processing inequalities. Quantum Information Processing, 11:137–160, 2012. ISSN 2157-8095.
URL https://doi.org/10.1007/s11128-011-0238-x.
Naresh Sharma and Naqueeb Ahmad Warsi. Fundamental bound on the reliability of quantum
information transmission. Physical Review Letters, 110:080501, February 2013. doi: 10.1103/
PhysRevLett.110.080501. URL https://link.aps.org/doi/10.1103/PhysRevLett.110.
080501.
Yaoyun Shi and Xiaodi Wu. Epsilon-net method for optimizations over separable states. In Artur
Czumaj, Kurt Mehlhorn, Andrew Pitts, and Roger Wattenhofer, editors, Automata, Languages,
and Programming, pages 798–809, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg. ISBN
978-3-642-31594-7. URL https://doi.org/10.1007/978-3-642-31594-7_67.
Maksim E. Shirokov. Tight uniform continuity bounds for the quantum conditional mutual
information, for the Holevo quantity, and for capacities of quantum channels. Journal of
Mathematical Physics, 58:102202, October 2017. doi: 10.1063/1.4987135. URL https:
//doi.org/10.1063/1.4987135.
Peter W. Shor. Scheme for reducing decoherence in quantum computer memory. Physical
Review A, 52:R2493–R2496, October 1995. doi: 10.1103/PhysRevA.52.R2493. URL https:
//link.aps.org/doi/10.1103/PhysRevA.52.R2493.
Peter W. Shor. The quantum channel capacity and coherent information. In Lecture Notes, MSRI
Workshop on Quantum Computation, 2002b.
Maurice Sion. On general minimax theorems. Pacific Journal of Mathematics, 8:171–176, March
1958. URL https://msp.org/pjm/1958/8-1/p14.xhtml.
Graeme Smith. Private classical capacity with a symmetric side channel and its application to quantum
cryptography. Physical Review A, 78:022306, August 2008. doi: 10.1103/PhysRevA.78.022306.
URL https://link.aps.org/doi/10.1103/PhysRevA.78.022306.
Graeme Smith and John A. Smolin. Degenerate quantum codes for Pauli channels. Physical
Review Letters, 98:030501, January 2007. doi: 10.1103/PhysRevLett.98.030501. URL https:
//link.aps.org/doi/10.1103/PhysRevLett.98.030501.
Graeme Smith and John A. Smolin. Extensive nonadditivity of privacy. Physical Review Letters,
103:120503, September 2009. URL https://link.aps.org/doi/10.1103/PhysRevLett.
103.120503.
Graeme Smith and Jon Yard. Quantum communication with zero-capacity channels. Science, 321:
1812–1815, September 2008. URL https://science.sciencemag.org/content/321/
5897/1812.
Graeme Smith, Joseph M. Renes, and John A. Smolin. Structured codes improve the Bennett-
Brassard-84 quantum key rate. Physical Review Letters, 100:170502, April 2008. doi: 10.1103/
PhysRevLett.100.170502. URL https://link.aps.org/doi/10.1103/PhysRevLett.100.
170502.
Graeme Smith, John A. Smolin, and Jon Yard. Quantum communication with Gaussian channels of
zero quantum capacity. Nature Photonics, 5:624–627, August 2011. URL https://doi.org/
10.1038/nphoton.2011.203.
R. R. Smith. Completely Bounded Maps between C*-Algebras. Journal of the London Mathematical
Society, s2-27:157–166, February 1983. ISSN 0024-6107. doi: 10.1112/jlms/s2-27.1.157. URL
https://doi.org/10.1112/jlms/s2-27.1.157.
Akihito Soeda, Peter S. Turner, and Mio Murao. Entanglement cost of implementing controlled-unitary
operations. Physical Review Letters, 107:180501, October 2011. doi: 10.1103/PhysRevLett.107.180501.
URL https://link.aps.org/doi/10.1103/PhysRevLett.107.180501.
Gilbert Strang. Introduction to Linear Algebra. Wellesley-Cambridge Press and SIAM, fifth edition,
May 2016.
David Sutter, Volkher B. Scholz, Andreas Winter, and Renato Renner. Approximate degradable
quantum channels. IEEE Transactions on Information Theory, 63:7832–7844, December 2017.
URL https://ieeexplore.ieee.org/document/8046086.
Masahiro Takeoka, Masashi Ban, and Masahide Sasaki. Quantum channel of continuous variable
teleportation and nonclassicality of quantum states. Journal of Optics B: Quantum and
Semiclassical Optics, 4:114, April 2002. URL http://stacks.iop.org/1464-4266/4/i=
2/a=306.
Masahiro Takeoka, Saikat Guha, and Mark M. Wilde. The squashed entanglement of a quantum
channel. IEEE Transactions on Information Theory, 60:4987–4998, August 2014. ISSN
0018-9448. URL https://ieeexplore.ieee.org/document/6832533.
Masahiro Takeoka, Kaushik P. Seshadreesan, and Mark M. Wilde. Unconstrained capacities of
quantum key distribution and entanglement distillation for pure-loss bosonic broadcast channels.
Physical Review Letters, 119:150501, October 2017. URL https://link.aps.org/doi/10.
1103/PhysRevLett.119.150501.
Marco Tomamichel and Masahito Hayashi. A hierarchy of information quantities for finite
block length analysis of quantum tasks. IEEE Transactions on Information Theory, 59:7693–
7710, November 2013. ISSN 0018-9448. doi: 10.1109/TIT.2013.2276628. URL https:
//ieeexplore.ieee.org/document/6574274.
Marco Tomamichel, Roger Colbeck, and Renato Renner. A fully quantum asymptotic equipartition
property. IEEE Transactions on Information Theory, 55:5840–5847, December 2009. URL
https://ieeexplore.ieee.org/document/5319753.
Marco Tomamichel, Roger Colbeck, and Renato Renner. Duality Between Smooth Min- and
Max-Entropies. IEEE Transactions on Information Theory, 56:4674–4681, September 2010.
URL https://ieeexplore.ieee.org/document/5550419.
Marco Tomamichel, Mario Berta, and Joseph M. Renes. Quantum coding with finite resources.
Nature Communications, 7:11419, May 2016. URL https://doi.org/10.1038/ncomms11419.
Marco Tomamichel, Mark M. Wilde, and Andreas Winter. Strong converse rates for quantum
communication. IEEE Transactions on Information Theory, 63:715–727, January 2017. doi:
10.1109/tit.2016.2615847. URL https://ieeexplore.ieee.org/document/7586115.
Robert R. Tucci. Entanglement of distillation and conditional mutual information. February 2002.
Armin Uhlmann. The ‘Transition Probability’ in the State Space of a *-Algebra. Reports on
Mathematical Physics, 9:273–279, April 1976. URL https://www.sciencedirect.com/
science/article/pii/0034487776900604.
Michael L. Ulrey. Sequential coding for channels with feedback. Information and Control, 32:
93–100, October 1976. URL https://doi.org/10.1016/S0019-9958(76)90129-7.
Lieven Vandenberghe and Stephen Boyd. Semidefinite programming. SIAM Review, 38:49–95,
1996. doi: 10.1137/1038003. URL https://doi.org/10.1137/1038003.
Gonzalo Vazquez-Vilar. Multiple quantum hypothesis testing expressions and classical-quantum
channel converse bounds. In 2016 IEEE International Symposium on Information Theory, pages
2854–2857, Barcelona, Spain, 2016. URL https://ieeexplore.ieee.org/document/
7541820.
Vlatko Vedral and Martin B. Plenio. Entanglement measures and purification procedures. Physical
Review A, 57:1619–1633, March 1998. doi: 10.1103/PhysRevA.57.1619. URL http://link.
aps.org/doi/10.1103/PhysRevA.57.1619.
Vlatko Vedral, Martin B. Plenio, M. A. Rippin, and Peter L. Knight. Quantifying entanglement.
Physical Review Letters, 78:2275–2279, March 1997. doi: 10.1103/PhysRevLett.78.2275. URL
https://link.aps.org/doi/10.1103/PhysRevLett.78.2275.
Sergio Verdú. On channel capacity per unit cost. IEEE Transactions on Information Theory, 36:1019–
1030, 1990. doi: 10.1109/18.57201. URL https://ieeexplore.ieee.org/document/
57201.
Gilbert S. Vernam. Cipher printing telegraph systems for secret wire and radio telegraphic
communications. Transactions of the American Institute of Electrical Engineers, 45:295–301,
1926. URL https://ieeexplore.ieee.org/document/5061224.
Guifré Vidal. Entanglement monotones. Journal of Modern Optics, 47:355–376, 2000. doi:
10.1080/09500340008244048. URL https://www.tandfonline.com/doi/abs/10.1080/
09500340008244048.
Guifré Vidal and Reinhard F. Werner. Computable measure of entanglement. Physical Review A,
65:032314, February 2002. doi: 10.1103/PhysRevA.65.032314. URL https://link.aps.
org/doi/10.1103/PhysRevA.65.032314.
Johann von Neumann. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100:295–320,
December 1928. ISSN 1432-1807. URL https://doi.org/10.1007/BF01448847.
Johann von Neumann. Mathematische Grundlagen der Quantenmechanik. Verlag von Julius Springer
Berlin, 1932.
Michael Walter, David Gross, and Jens Eisert. Multi-partite entanglement. 2016.
Kun Wang, Xin Wang, and Mark M. Wilde. Quantifying the unextendibility of entanglement.
November 2019a.
Ligong Wang and Renato Renner. One-shot classical-quantum capacity and hypothesis testing.
Physical Review Letters, 108:200501, 2012. doi: 10.1103/PhysRevLett.108.200501. URL
https://link.aps.org/doi/10.1103/PhysRevLett.108.200501.
Xin Wang and Runyao Duan. Improved semidefinite programming upper bound on distillable
entanglement. Physical Review A, 94:050301, November 2016a. doi: 10.1103/PhysRevA.94.050301.
URL https://link.aps.org/doi/10.1103/PhysRevA.94.050301.
Xin Wang and Runyao Duan. A semidefinite programming upper bound of quantum capacity. In
2016 IEEE International Symposium on Information Theory (ISIT). IEEE, July 2016b. doi:
10.1109/isit.2016.7541587. URL https://ieeexplore.ieee.org/document/7541587.
Xin Wang and Mark M. Wilde. Resource theory of asymmetric distinguishability. Physical
Review Research, 1:033170, December 2019. doi: 10.1103/PhysRevResearch.1.033170. URL
http://arxiv.org/abs/1905.11629.
Xin Wang and Mark M. Wilde. 𝛼-logarithmic negativity. Physical Review A, 102:032416, September
2020. doi: 10.1103/PhysRevA.102.032416. URL https://link.aps.org/doi/10.1103/
PhysRevA.102.032416.
Xin Wang, Wei Xie, and Runyao Duan. Semidefinite programming strong converse bounds for
classical capacity. IEEE Transactions on Information Theory, 64:640–653, January 2018.
ISSN 0018-9448. doi: 10.1109/TIT.2017.2741101. URL https://ieeexplore.ieee.org/
document/8012535.
Xin Wang, Kun Fang, and Runyao Duan. Semidefinite programming converse bounds for quantum
communication. IEEE Transactions on Information Theory, 65:2583–2592, April 2019b. URL
https://ieeexplore.ieee.org/document/8482492.
Xin Wang, Kun Fang, and Marco Tomamichel. On converse bounds for classical communication over
quantum channels. IEEE Transactions on Information Theory, 65:4609–4619, July 2019c. doi:
10.1109/TIT.2019.2898656. URL https://ieeexplore.ieee.org/document/8638816.
John Watrous. Semidefinite programs for completely bounded norms. Theory of Computing,
5:217–238, November 2009. doi: 10.4086/toc.2009.v005a011. URL
http://www.theoryofcomputing.org/articles/v005a011.
John Watrous. Simpler semidefinite programs for completely bounded norms. Chicago Journal
of Theoretical Computer Science, July 2013. URL
http://cjtcs.cs.uchicago.edu/articles/2013/8/contents.html.
John Watrous. The Theory of Quantum Information. Cambridge University Press, 2018. doi:
10.1017/9781316848142.
Reinhard F. Werner. An application of Bell’s inequalities to a quantum state extension problem.
Letters in Mathematical Physics, 17:359–363, May 1989a. doi: 10.1007/BF00399761. URL
https://doi.org/10.1007/BF00399761.
Reinhard F. Werner. All teleportation and dense coding schemes. Journal of Physics A: Mathematical
and General, 34:7081, September 2001. URL http://stacks.iop.org/0305-4470/34/i=
35/a=332.
Mark M. Wilde. Squashed entanglement and approximate private states. Quantum Information
Processing, 15:4563–4580, November 2016. ISSN 1573-1332. doi: 10.1007/s11128-016-1432-7.
URL http://dx.doi.org/10.1007/s11128-016-1432-7.
Mark M. Wilde. Quantum Information Theory. Cambridge University Press, second edition, 2017a.
URL https://doi.org/10.1017/CBO9781139525343.
Mark M. Wilde. Position-based coding and convex splitting for private communication over
quantum channels. Quantum Information Processing, 16:264, October 2017b. URL https:
//doi.org/10.1007/s11128-017-1718-4.
Mark M. Wilde. Strong and uniform convergence in the teleportation simulation of bosonic Gaussian
channels. Physical Review A, 97:062305, June 2018a. doi: 10.1103/PhysRevA.97.062305. URL
https://link.aps.org/doi/10.1103/PhysRevA.97.062305.
Mark M. Wilde. Optimized quantum 𝑓 -divergences and data processing. Journal of Physics A, 51:
374002, September 2018b. URL https://doi.org/10.1088/1751-8121/aad5a1.
Mark M. Wilde and Haoyu Qi. Energy-constrained private and quantum capacities of quantum
channels. IEEE Transactions on Information Theory, 64:7802–7827, December 2018. URL
https://ieeexplore.ieee.org/document/8541091.
Mark M. Wilde and Andreas Winter. Strong Converse for the Quantum Capacity of the Erasure
Channel for Almost All Codes. Leibniz International Proceedings in Informatics (LIPIcs),
27:52–66, 2014. ISSN 1868-8969. doi: 10.4230/LIPIcs.TQC.2014.52. URL
http://drops.dagstuhl.de/opus/volltexte/2014/4806.
Mark M. Wilde, Andreas Winter, and Dong Yang. Strong converse for the classical capacity
of entanglement-breaking and Hadamard channels via a sandwiched Rényi relative entropy.
Communications in Mathematical Physics, 331:593–622, October 2014. URL https://doi.
org/10.1007/s00220-014-2122-x.
Mark M. Wilde, Marco Tomamichel, and Mario Berta. Converse bounds for private communication
over quantum channels. IEEE Transactions on Information Theory, 63:1792–1817, March 2017.
URL https://ieeexplore.ieee.org/document/7807212.
Andreas Winter. Tight uniform continuity bounds for quantum entropies: conditional entropy,
relative entropy distance and energy constraints. Communications in Mathematical Physics, 347:
291–313, October 2016. URL https://doi.org/10.1007/s00220-016-2609-8.
Michael M. Wolf, David Pérez-García, and Geza Giedke. Quantum capacities of bosonic channels.
Physical Review Letters, 98:130501, March 2007. doi: 10.1103/PhysRevLett.98.130501. URL
https://link.aps.org/doi/10.1103/PhysRevLett.98.130501.
Jacob Wolfowitz. Coding Theorems of Information Theory, volume 31 of Ergebnisse der Mathematik
und Ihrer Grenzgebiete. Springer, 1964.
William K. Wootters. Entanglement of formation of an arbitrary state of two qubits. Physical
Review Letters, 80:2245–2248, March 1998. doi: 10.1103/PhysRevLett.80.2245. URL https:
//link.aps.org/doi/10.1103/PhysRevLett.80.2245.
Aaron D. Wyner. The wire-tap channel. Bell System Technical Journal, 54:1355–1387, October
1975. URL https://ieeexplore.ieee.org/document/6772207.
Dong Yang. A simple proof of monogamy of entanglement. Physics Letters A, 360:249–250,
2006. ISSN 0375-9601. doi: 10.1016/j.physleta.2006.08.027. URL
http://www.sciencedirect.com/science/article/pii/S0375960106012801.
Jon Yard, Patrick Hayden, and Igor Devetak. Capacity theorems for quantum multiple-access
channels: Classical-quantum and quantum-quantum capacity regions. IEEE Transactions
on Information Theory, 54:3091–3113, July 2008. URL https://ieeexplore.ieee.org/
document/4545000.
Haidong Yuan and Chi-Hang Fred Fung. Fidelity and Fisher Information on Quantum Channels.
New Journal of Physics, 19:113039, November 2017. doi: 10.1088/1367-2630/aa874c. URL
http://stacks.iop.org/1367-2630/19/i=11/a=113039?key=crossref.c8abb94f653e6d572133885d9e0b86b0.
Horace Yuen, Robert Kennedy, and Melvin Lax. Optimum testing of multiple hypotheses in
quantum detection theory. IEEE Transactions on Information Theory, 21:125–134, March 1975.
URL https://ieeexplore.ieee.org/document/1055351.
Sisi Zhou and Liang Jiang. An Exact Correspondence between the Quantum Fisher Information
and the Bures Metric. October 2019. URL http://arxiv.org/abs/1910.08473.
Xinlan Zhou, Debbie W. Leung, and Isaac L. Chuang. Methodology for quantum logic gate
construction. Physical Review A, 62:052316, October 2000. doi: 10.1103/PhysRevA.62.052316.
URL https://link.aps.org/doi/10.1103/PhysRevA.62.052316.
Karol Życzkowski, Paweł Horodecki, Anna Sanpera, and Maciej Lewenstein. Volume of the set of
separable states. Physical Review A, 58:883–892, August 1998. doi: 10.1103/PhysRevA.58.883.
URL https://link.aps.org/doi/10.1103/PhysRevA.58.883.