Meir Hemmo · Orly Shenker (Editors)

Quantum, Probability, Logic
The Work and Influence of Itamar Pitowsky
Jerusalem Studies in Philosophy and History of Science

Series Editor
Orly Shenker, The Hebrew University of Jerusalem, The Sidney M. Edelstein Center for the History and Philosophy of Science, Technology and Medicine, Jerusalem, Israel
Jerusalem Studies in Philosophy and History of Science sets out to present state-of-the-art research on a variety of thematic issues related to the fields of Philosophy of Science, History of Science, and Philosophy of Language and Linguistics in their relation to science, stemming from research activities in Israel and the near region, and especially the fruits of collaborations between Israeli, regional, and visiting scholars.
This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Dedicated to the memory of Itamar Pitowsky
1 The essays in the volume are arranged in alphabetical order of the authors’ surnames.
2 Bub, J., Demopoulos, W.: Itamar Pitowsky 1950–2010. Studies in History and Philosophy of Modern Physics 41, 85 (2010).
Itamar Pitowsky: Academic Genealogy and the Arrow of Time
As you go backward in your family tree, the number of ancestors grows exponentially with the number of generations, which means that it is practically impossible
to list all of them. The solution is easy enough: from the large number of ancestors,
you choose one with some distinction. Then you list the people in the branch leading
from this person to you while ignoring the rest. The rest include a multitude of
anonymous people, a few crooks, and perhaps some outright criminals. (If you
go back far enough, you are likely to find them). This is how your yichus is
manufactured.
Academic genealogy, on the other hand, consists of a single branch, or at most a few. You simply list your PhD supervisor, then his/her PhD supervisor, and so on. It turns out that there is a website devoted to the academic genealogy of mathematicians, from which some of the information below is taken; it can be continued further down the generations to the seventeenth century. So here is part of it:
2. David Joseph Bohm, PhD from the University of California, Berkeley, 1943
3. J. Robert Oppenheimer, PhD from the University of Göttingen, 1927
5. Carl David Tolmé Runge, PhD from the University of Berlin, 1880
6b. Supervisor #2: Karl Theodor Wilhelm Weierstrass, PhD from the University of Königsberg, 1854
And so on . . .
Going forward in time changes one's perspective completely. The great Weierstrass has tens of thousands of academic descendants, so academic genealogy is not a very big deal either. However, academic trees, unlike family trees, grow at different rates backward and forward. As more PhDs graduate, the entropy increases.
Itamar Pitowsky: Selected Publications
Chapter 1
Classical Logic, Classical Probability, and Quantum Mechanics

Samson Abramsky
1.1 Introduction
S. Abramsky ()
Department of Computer Science, University of Oxford, Oxford, UK
e-mail: samson.abramsky@cs.ox.ac.uk
More formally, we are given some basic events E_1, . . . , E_n, and some Boolean functions ϕ_1, . . . , ϕ_m of these events. Such a function ϕ can be described by a propositional formula in the variables E_1, . . . , E_n.
Suppose further that we are given probabilities p(Ei ), p(ϕ j ) of these events.
Suppose that the formulae ϕ_1, . . . , ϕ_m are jointly contradictory, i.e. they cannot all be true. Then ϕ_m entails the negation of the conjunction of the others:

ϕ_m → ⋁_{i=1}^{m−1} ¬ϕ_i.

Hence

p(ϕ_m) ≤ p(⋁_{i=1}^{m−1} ¬ϕ_i) ≤ Σ_{i=1}^{m−1} p(¬ϕ_i) = Σ_{i=1}^{m−1} (1 − p(ϕ_i)) = (m − 1) − Σ_{i=1}^{m−1} p(ϕ_i).

Rearranging yields

Σ_{i=1}^{m} p(ϕ_i) ≤ m − 1.    (1.1)
We can directly apply the above inequality to deduce a version of Bell's theorem (Bell 1964). Consider the following table (rows are labelled by contexts, columns by the joint outcomes (Alice, Bob)):

           (0, 0)   (1, 0)   (0, 1)   (1, 1)
(a, b)      1/2      0        0        1/2
(a, b′)     3/8      1/8      1/8      3/8
(a′, b)     3/8      1/8      1/8      3/8
(a′, b′)    1/8      3/8      3/8      1/8
Here we have two agents, Alice and Bob. Alice can choose from the measurement settings a or a′, and Bob can choose from b or b′. These choices correspond to the rows of the table. The columns correspond to the joint outcomes for a given choice of settings by Alice and Bob, the two possible outcomes for each individual measurement being represented by 0 and 1. The numbers along each row specify a probability distribution on these joint outcomes. Thus, for example, the entry in row 2, column 2 of the table says that when Alice chooses setting a and Bob chooses setting b′, then with probability 1/8, Alice obtains a value of 1, and Bob obtains a value of 0.
A standard version of Bell's theorem uses the probability table given above. This table can be realized in quantum mechanics, e.g. by the Bell state

(|00⟩ + |11⟩)/√2,

with suitable choices of the local measurements a, a′, b, b′.
We now pick out a subset of the joint outcomes in each row of the table: the correlated outcomes (0, 0) and (1, 1) in the first three rows, and the anticorrelated outcomes (1, 0) and (0, 1) in the last row.
We can think of basic events E_a, E_a′, E_b, E_b′, where e.g. E_a is the event that the quantity measured by a has the value 0. Note that, by our assumption that each measurement has two possible outcomes,2 ¬E_a is the event that a has the value 1. Writing simply a for E_a etc., the positions picked out above are represented by the following propositions:
ϕ_1 = (a ∧ b) ∨ (¬a ∧ ¬b) = a ↔ b
ϕ_2 = (a ∧ b′) ∨ (¬a ∧ ¬b′) = a ↔ b′
ϕ_3 = (a′ ∧ b) ∨ (¬a′ ∧ ¬b) = a′ ↔ b
ϕ_4 = (¬a′ ∧ b′) ∨ (a′ ∧ ¬b′) = a′ ⊕ b′.
The first three propositions pick out the correlated outcomes for the variables they refer to; the fourth, the anticorrelated outcomes. These propositions are easily seen to be jointly contradictory. Indeed, starting with ϕ_4, we can replace a′ with b using ϕ_3, b with a using ϕ_1, and a with b′ using ϕ_2, to obtain b′ ⊕ b′, which is obviously unsatisfiable.
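The joint contradiction can also be checked mechanically by enumerating all truth assignments. The following Python sketch (with a1 and b1 standing for a′ and b′) confirms in addition that any three of the four propositions are simultaneously satisfiable, so no smaller bound than m − 1 = 3 applies in (1.1):

```python
from itertools import product

# Truth-table check of the four propositions; a1 and b1 stand for a' and b'.
def phi(i, a, a1, b, b1):
    if i == 1: return a == b      # phi_1 = a <-> b
    if i == 2: return a == b1     # phi_2 = a <-> b'
    if i == 3: return a1 == b     # phi_3 = a' <-> b
    return a1 != b1               # phi_4 = a' XOR b'

assignments = list(product([False, True], repeat=4))

# No assignment satisfies all four propositions jointly ...
all_four = [s for s in assignments if all(phi(i, *s) for i in (1, 2, 3, 4))]
assert all_four == []

# ... but any three of them are simultaneously satisfiable.
for dropped in (1, 2, 3, 4):
    rest = [i for i in (1, 2, 3, 4) if i != dropped]
    assert any(all(phi(i, *s) for i in rest) for s in assignments)

print("jointly contradictory; any three satisfiable")
```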
We see from the table that p(ϕ_1) = 1 and p(ϕ_i) = 6/8 for i = 2, 3, 4. Hence Σ_{i=1}^{4} p(ϕ_i) = 13/4, exceeding the bound m − 1 = 3: the violation of the Bell inequality (1.1) is 1/4.
We may note that the logical pattern shown by this jointly contradictory family
of propositions underlies the familiar CHSH correlation function.
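To make the connection concrete, here is a small Python check that recomputes the violation of (1.1) from the agreement probabilities quoted above (p(ϕ_1) = 1, p(ϕ_i) = 6/8), together with the corresponding CHSH correlation value:

```python
from fractions import Fraction as F

# Probability that the two outcomes agree, per context, as quoted in the text:
# p(phi_1) = 1, p(phi_2) = p(phi_3) = 6/8; for (a', b') the agreement
# probability is 1 - p(phi_4) = 2/8.
p_agree = {("a", "b"): F(1), ("a", "b'"): F(6, 8),
           ("a'", "b"): F(6, 8), ("a'", "b'"): F(2, 8)}

# p(phi_i): agreement for the first three contexts, disagreement for the last.
p_phi = [p_agree[("a", "b")], p_agree[("a", "b'")],
         p_agree[("a'", "b")], 1 - p_agree[("a'", "b'")]]

violation = sum(p_phi) - (len(p_phi) - 1)   # sum_i p(phi_i) - (m - 1)
print(violation)                            # 1/4

# The CHSH correlation function: E = p_agree - p_disagree = 2*p_agree - 1.
E = {ctx: 2 * p - 1 for ctx, p in p_agree.items()}
S = E[("a", "b")] + E[("a", "b'")] + E[("a'", "b")] - E[("a'", "b'")]
print(S)                                    # 5/2, above the classical bound of 2
```

Note that the CHSH violation 5/2 − 2 = 1/2 is exactly twice the logical Bell inequality violation 1/4, since each E is an affine function 2p − 1 of an agreement probability.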
2 And, implicitly, the assumption that each quantity has a definite value, whether we measure it or
not.
3 The size of a family {ϕ_j}_{j∈J} is the cardinality of J. Note that repetitions are allowed in the family.
More generally, suppose that at most K of the formulae ϕ_1, . . . , ϕ_m can be simultaneously satisfied. Then

Σ_{i=1}^{m} p(ϕ_i) ≤ K.    (1.2)
See Abramsky and Hardy (2012) for the (straightforward) proof. Note that the
basic inequality (1.1) is a simple consequence of this result.
We thus have a large class of inequalities arising directly from logical consistency
conditions. As shown in Abramsky and Hardy (2012), even the basic form (1.1) is
sufficient to derive the main no-go results in the literature, including the Hardy, GHZ
and Kochen-Specker “paradoxes”.
More remarkably, as we shall now go on to see, this set of inequalities
is complete. That is, every facet-defining inequality for the non-local, or more
generally non-contextual polytopes, is equivalent to one of the form (1.2). In this
sense, we have given a complete answer to Boole’s question.
The following quotation from Pitowsky suggests that he may have envisaged the
possibility of such a result (Pitowsky 1991, p. 413):
In fact, all facet inequalities for c(n) should follow from “Venn diagrams”, that is, the
possible relations among n events in a probability space.
Each truth assignment s ∈ 2^E to the basic events determines truth values for the formulae ϕ_j by the usual truth-table method, and hence determines a 0/1-vector v_s in R^{n+m}. The polytope c(E, ϕ) is defined to be the convex hull of the set of vectors v_s, for s ∈ 2^E.
For the case where ϕ is the set of all pairwise conjunctions ϕ ij of basic events,
Pitowsky shows that this problem is NP-complete.
We can see that the (quite standard) version of Bell's theorem described in Sect. 1.2.1 amounts to the statement that the vector of probabilities given by the Bell table lies outside the corresponding polytope.
For certain families of events the theory stipulates that they are commeasurable. This
means that, in every state, the relative frequencies of all these events can be measured on
one single sample. For such families of events, the rules of classical probability — Boole’s
conditions in particular — are valid. Other families of events are not commeasurable, so
their frequencies must be measured in more than one sample. The events in such families
nevertheless exhibit logical relations (given, usually, in terms of algebraic relations among
observables). But for some states, the probabilities assigned to the events violate one or
more of Boole’s conditions associated with those logical relations.
The point we would like to emphasize is that tables such as the Bell table in
Sect. 1.2.1 can—and do—arise from experimental data, without presupposing any
particular physical theory. This data does clearly involve concepts from classical
probability. In particular, for each set C of “commeasurable events”, there is a
sample space, namely the set of possible joint outcomes of measuring all the
observables in C. Moreover, there is a well-defined probability distribution on this
sample space.
Taking the Bell table for illustration, the “commeasurable sets”, or contexts, are
the sets of measurements labelling the rows of the table. The sample space for a
given row, with measurement α by Alice and β by Bob, is the set of joint outcomes
(α = x, β = y) with x, y ∈ {0, 1}. The probabilities assigned to these joint
outcomes in the table form a (perfectly classical) probability distribution on this
sample space.
Thus the structure of the table as a whole is a family of probability distributions,
each defined on a different sample space. However, these sample spaces are not
completely unrelated. They overlap in sharing common observables, and they “agree
on overlaps” in the sense that they have consistent marginals. This is exactly the
force of the no-signalling condition.
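This "agreement on overlaps" can be verified directly: each party's marginal for a given measurement is the same in both contexts containing that measurement. A Python sketch, using the standard table values realizing the probabilities quoted in Sect. 1.2.1 (an assumption of this sketch, since the printed table is not reproduced here):

```python
from fractions import Fraction as F

# Contexts -> distributions on joint outcomes (x, y) for (Alice, Bob).
table = {
    ("a", "b"):   {(0, 0): F(1, 2), (0, 1): F(0),    (1, 0): F(0),    (1, 1): F(1, 2)},
    ("a", "b'"):  {(0, 0): F(3, 8), (0, 1): F(1, 8), (1, 0): F(1, 8), (1, 1): F(3, 8)},
    ("a'", "b"):  {(0, 0): F(3, 8), (0, 1): F(1, 8), (1, 0): F(1, 8), (1, 1): F(3, 8)},
    ("a'", "b'"): {(0, 0): F(1, 8), (0, 1): F(3, 8), (1, 0): F(3, 8), (1, 1): F(1, 8)},
}

def marginal(ctx, which, value):
    """Probability that the given party's measurement yields `value` in `ctx`."""
    idx = 0 if which == "alice" else 1
    return sum(p for o, p in table[ctx].items() if o[idx] == value)

# No-signalling: Alice's marginal for a is the same whether Bob measures b
# or b', and similarly for every shared measurement.
for x in ("a", "a'"):
    for v in (0, 1):
        assert len({marginal(c, "alice", v) for c in table if c[0] == x}) == 1
for y in ("b", "b'"):
    for v in (0, 1):
        assert len({marginal(c, "bob", v) for c in table if c[1] == y}) == 1

print("marginals agree on overlaps (no-signalling holds)")
```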
The nature of the “non-classicality” of the data tables generated by quantum
mechanics (and also elsewhere, see e.g. Abramsky 2013), is that there is no
distribution over the global sample space of outcomes for all observables, which
recovers the empirically accessible data in the table by marginalization. This is
completely equivalent to a statement of the more traditional form: “there is no local
or non-contextual hidden variable model” (or, in currently fashionable terminology:
there is no “ontological model” of a certain form).
Our preferred slogan for this state of affairs is: local consistency, global inconsistency.
1.3.1 Formalization
The Bell scenario of Sect. 1.2.1 is described by a measurement scenario ⟨X, M, O⟩, where X is the set of measurement labels, O the set of outcomes, and M the set of measurement contexts, i.e. those subsets of X which can be measured jointly:

X = {a, a′, b, b′},  O = {0, 1},
M = {{a, b}, {a, b′}, {a′, b}, {a′, b′}}.
Given this description of the experimental setup, then either performing repeated
runs of such experiments with varying choices of measurement context and
recording the frequencies of the various outcome events, or calculating theoretical
predictions for the probabilities of these outcomes, results in a probability table like
that in Sect. 1.2.1.
Such data is formalised as an empirical model for the given measurement scenario ⟨X, M, O⟩. For each valid choice of measurement context, it specifies the probabilities of obtaining the corresponding joint outcomes. That is, it is a family {e_C}_{C∈M} where each e_C is a probability distribution on the set O^C of functions assigning an outcome in O to each measurement in C (the rows of the probability table).
We require that the marginals of these distributions agree whenever contexts overlap, i.e.

e_C |_{C∩C′} = e_{C′} |_{C∩C′}  for all C, C′ ∈ M.
Suppose we are given a measurement scenario ⟨X, M, O⟩. Each global assignment t ∈ O^X induces a deterministic empirical model δ^t:

δ^t_C(s) = 1 if t|_C = s, and 0 otherwise.

Note that t|_C is the function t restricted to C, which is a subset of its domain X.
We have the following result from (Abramsky and Brandenburger 2011, Theorem 8.1):

Theorem 2 An empirical model {e_C} is non-contextual if and only if it can be written as a convex combination Σ_{j∈J} μ_j δ^{t_j}, where t_j ∈ O^X for each j ∈ J. This means that for each C ∈ M,

e_C = Σ_j μ_j δ^{t_j}_C.
Given a scenario ⟨X, M, O⟩, define m := Σ_{C∈M} |O^C| = |{⟨C, s⟩ | C ∈ M, s ∈ O^C}| to be the number of joint outcomes as we range over contexts. For example, in the case of the Bell table there are four contexts, each with four possible outcomes, so m = 16. We can regard an empirical model {e_C} over ⟨X, M, O⟩ as a real vector v_e ∈ R^m, with v_e[⟨C, s⟩] = e_C(s).
We define the non-contextual polytope for the scenario ⟨X, M, O⟩ to be the convex hull of the set of deterministic models δ^t, where t ranges over O^X. By the preceding theorem, this is exactly the set of non-contextual models.
Thus we have captured the question as to whether an empirical model is
contextual in terms of membership of a polytope. The facet inequalities for this
family of polytopes, as we range over measurement scenarios, give a general notion
of Bell inequalities.
The incidence matrix M for the scenario has rows indexed by the pairs ⟨C, s⟩ and columns indexed by the global assignments g ∈ O^X, with entries

M[⟨C, s⟩, g] := 1 if g|_C = s, and 0 otherwise.

An empirical model e is then non-contextual if and only if there is a solution d to the linear system

M d = v_e and d ≥ 0.
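For the Bell scenario, the incidence matrix can be built directly from its definition. A minimal Python sketch, which also checks that each column of the matrix is precisely the vector of the corresponding deterministic model δ^g, placing one unit of probability mass in every context:

```python
from itertools import product

X = ["a", "a'", "b", "b'"]
O = [0, 1]
ctxs = [("a", "b"), ("a", "b'"), ("a'", "b"), ("a'", "b'")]

# Rows of the incidence matrix: pairs <C, s> of a context and a joint outcome.
rows = [(C, s) for C in ctxs for s in product(O, repeat=2)]

# Columns: global assignments g in O^X, here as dicts from labels to outcomes.
globals_ = [dict(zip(X, vals)) for vals in product(O, repeat=len(X))]

# M[<C,s>, g] = 1 iff g restricted to C equals s.
M = [[1 if tuple(g[x] for x in C) == s else 0 for g in globals_]
     for (C, s) in rows]

# Each column of M is the vector of the deterministic model delta^g: it puts
# probability 1 on g|C in every context, so each column sums to |ctxs| = 4.
for j in range(len(globals_)):
    assert sum(M[i][j] for i in range(len(rows))) == len(ctxs)

print(len(rows), "x", len(globals_))  # 16 x 16
```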
The maximum is attained, by the Heine–Borel and extreme value theorems, since the set of such λ is bounded and closed.
This generalises the notion of local fraction previously studied for Bell-type scenarios (Elitzur et al. 1992). We denote it by NCF(e), and the contextual fraction by CF(e) := 1 − NCF(e). The notion of contextual fraction in general scenarios was introduced in Abramsky and Brandenburger (2011), where it was proved that a model is non-contextual if and only if its non-contextual fraction is 1.
A global subprobability distribution is represented by a vector b ∈ R^n with non-negative components, its weight being given by the dot product 1 · b, where 1 ∈ R^n is the vector whose n components are each 1. The following LP thus calculates the non-contextual fraction of an empirical model e:

Find b ∈ R^n
maximising 1 · b
subject to M b ≤ v_e    (1.4)
and b ≥ 0.
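For the Bell table of Sect. 1.2.1, the optimal value of this LP can be certified by hand, without running a solver. The following Python sketch exhibits a feasible subprobability distribution b of weight 3/4 (its support is an illustrative construction, not taken from the text), and the logical Bell inequality supplies the matching upper bound: since the table gives Σ p(ϕ_i) = 13/4, while a mixture NCF·e_NC + (1 − NCF)·e′ can reach at most 3·NCF + 4·(1 − NCF) = 4 − NCF, we get NCF(e) ≤ 3/4, hence NCF(e) = 3/4 and CF(e) = 1/4:

```python
from fractions import Fraction as F
from itertools import product

X = ["a", "a'", "b", "b'"]
ctxs = [("a", "b"), ("a", "b'"), ("a'", "b"), ("a'", "b'")]
rows = [(C, s) for C in ctxs for s in product([0, 1], repeat=2)]

def incidence(C, s, g):                 # M[<C,s>, g] = 1 iff g|C = s
    return 1 if (g[C[0]], g[C[1]]) == s else 0

# The Bell table as a vector v_e (outcome order 00, 01, 10, 11 per context).
v_e = [F(n, 8) for n in (4, 0, 0, 4,  3, 1, 1, 3,  3, 1, 1, 3,  1, 3, 3, 1)]

# A feasible b for LP (1.4): weight 1/8 on each of six global assignments,
# written as (a, a', b, b') tuples.  (An illustrative construction.)
support = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 1, 0, 0),
           (1, 1, 1, 1), (1, 1, 1, 0), (1, 0, 1, 1)]
b = {t: F(1, 8) for t in support}

# Feasibility: (M b)[<C,s>] <= v_e[<C,s>] for every row, and b >= 0.
for (C, s), v in zip(rows, v_e):
    Mb = sum(incidence(C, s, dict(zip(X, t))) * w for t, w in b.items())
    assert Mb <= v

weight = sum(b.values())
print(weight)                   # 3/4, so NCF(e) >= 3/4

# Matching upper bound from the logical Bell inequality: NCF <= 4 - 13/4.
assert weight == 4 - F(13, 4)   # the bound is attained: NCF(e) = 3/4
```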
The violation of a Bell inequality ⟨a, R⟩ by a model e is max{0, a · v_e − R}. However, it is useful to normalise this value by the maximum possible violation in order to give a better idea of the extent to which the model violates the inequality. The normalised violation of the Bell inequality by the model e is

max{0, a · v_e − R} / (‖a‖ − R).

6 We will consider only inequalities satisfying R < ‖a‖, which excludes inequalities trivially satisfied by all models, and avoids cluttering the presentation with special caveats about division by 0.
We now return to the issue of the complexity of deciding whether a given empirical
model is contextual.
It will be useful to consider the class of (n, k, 2) Bell scenarios. In these scenarios
there are n agents, each of whom has a choice of k measurement settings, and all
measurements have 2 possible outcomes. This gives rise to a measurement scenario ⟨X, M, O⟩ where |X| = nk. Each C ∈ M consists of n measurements, one chosen by each agent. Thus |M| = k^n. For each context C, there are 2^n possible outcomes.
An empirical model for an (n, k, 2) Bell scenario is thus given by a vector of k^n 2^n probabilities. Thus the size of instances is exponential in the natural parameter n. This is the real obstacle to tractable computation as we increase the number of agents.
Given an empirical model v_e as an instance, we can use the linear program (1.4) given in the previous section to determine if it is contextual. The size of the linear program is determined by the incidence matrix, which has dimensions p × q, where p is the dimension of v_e, and q = 2^{nk}.
If we treat n as the complexity parameter, and keep k fixed, then q = O(s^k),
where s is the size of the instance. Thus the linear program has size polynomial
in the size of the instance, and membership of the non-contextual polytope can be
decided in time polynomial in the size of the instance.
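The parameter counting above is easy to reproduce. A small Python sketch of the instance size s = k^n · 2^n against the number q = 2^{nk} of global assignments, checking that q ≤ s^k:

```python
def instance_size(n, k):
    # k**n contexts, each with 2**n joint outcomes
    return k**n * 2**n

def num_globals(n, k):
    # columns of the incidence matrix: |O^X| = 2**(n*k)
    return 2**(n * k)

# The Bell table is the (2, 2, 2) case: 16 probabilities, 16 global assignments.
assert instance_size(2, 2) == 16
assert num_globals(2, 2) == 16

# For fixed k, q = (2**n)**k <= (k**n * 2**n)**k = s**k, so the LP has size
# polynomial in the instance size; for fixed n, q is exponential in k.
for n in range(1, 8):
    for k in range(1, 5):
        assert num_globals(n, k) <= instance_size(n, k)**k
```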
There is an interesting contrast with Pitowsky’s results on the NP-completeness
of deciding membership in the correlation polytope of all binary conjunctions of
basic events. In Pitowsky’s case, the size of instances for these special forms of
correlation polytope is polynomial in the natural parameter, which is the number
of basic events. If we consider the correlation polytopes which would correspond
directly to empirical models, the same argument as given above would apply: the
instances would have exponential size in the natural parameter, while membership
in the polytope could be decided by linear programming in time polynomial in the
instance size.
We can also consider the situation where we fix n, and treat k as the complexity parameter. In this case, note that the size of instances is polynomial in k, while the size of the incidence matrix is exponential in k. Thus linear programming does not help.
help. In fact, this seems to be the case that Pitowsky primarily had in mind. With
n = 2, the restriction to binary conjunctions makes good sense, and all the examples
he discusses are of this kind.
We also mention the results obtained in Abramsky and Hardy (2012); Abramsky
et al. (2013); Abramsky (2017); Abramsky et al. (2017), which study the analogous
problem with respect to possibilistic contextuality, that is, whether the supports
of the probability distributions in the empirical model can be obtained from a
set of global assignments. In that case, the complexity of deciding contextuality
is shown to be NP-complete in the complexity parameter k in general; a precise
delineation is given of the tractability boundary in terms of the values of the
parameters. Moreover, as Rui Soares Barbosa has pointed out,7 if we take n as
the complexity parameter, there is a simple algorithm for detecting possibilistic
contextuality which is polynomial in the size of the instance. Thus the complexity
of detecting possibilistic contextuality runs completely in parallel with the proba-
bilistic case.
In my view, this states the extent of the challenge posed to classical logic by quantum
mechanics too strongly. As we have discussed, the observational data predicted
by quantum mechanics and confirmed by actual experiments consists of families
of probability distributions, each defined on different sample spaces, correspond-
ing to the different contexts. Since the contexts overlap, there are relationships
between the sample spaces, which are reflected in coherent relationships between
the distributions, in the form of consistent marginals. But there is no “global”
distribution, defined on a sample space containing all the observable quantities,
which accounts for all the empirically observable data. This does pose a challenge
to the understanding of quantum mechanics as a physical theory, since it implies
7 Personal communication.
that we cannot ascribe definite values to the physical quantities being measured,
independent of whether or in what context they are measured. It does not, however,
challenge classical logic and probability, which can be used to describe exactly this
situation.
For this reason, I prefer to speak of contextuality as living “on the boundary of paradox” (Abramsky 2017), as a signature of non-classicality, and one for which
there is increasing evidence that it plays a fundamental rôle in quantum advantage
in information-processing tasks (Anders and Browne 2009; Raussendorf 2013;
Howard et al. 2014). So this boundary seems likely to prove a fruitful place to be.
But we never actually cross the boundary, for exactly the reasons vividly expressed
by Pitowsky in the opening two sentences of the above quotation.
There is much more to be said about the connections between logic and
contextuality. In particular:
• There are notions of possibilistic and strong contextuality, and of All-versus-
Nothing contextuality, which give a hierarchy of strengths of contextuality, and
which can be described in purely logical terms, without reference to probabilities
(Abramsky and Brandenburger 2011).
• These “possibilistic” forms of contextuality can be connected with logical para-
doxes in the traditional sense. For example, the set of contradictory propositions
used in Sect. 1.2.1 to derive Bell’s theorem form a “Liar cycle” (Abramsky et al.
2015).
• There is also a topological perspective on these ideas. The four propositions from
Sect. 1.2.1 form a discrete version of the Möbius strip. Sheaf cohomology can be
used to detect contextuality (Abramsky et al. 2015).
• The logical perspective on contextuality leads to the recognition that the same
structures arise in many non-quantum areas, including databases (Abramsky
2013), constraint satisfaction (Abramsky et al. 2013), and generic inference
(Abramsky and Carù 2019).
• In Abramsky et al. (2017), an inequality of the form p_F ≥ NCF(e) d is derived for several different settings involving information-processing tasks. Here p_F is the failure probability, NCF(e) is the non-contextual fraction of an empirical model viewed as a resource, and d is a parameter measuring the “difficulty” of the task. Thus the inequality shows the necessity of increasing the amount of contextuality in the resource in order to increase the success probability. In several cases, including certain games, communication complexity, and shallow circuits, the parameter d is (n − K)/n, where K is the K-consistency we encountered in formulating logical Bell inequalities (1.2).
We refer the reader to the papers cited above for additional information on these topics.
Acknowledgements My thanks to Meir Hemmo and Orly Shenker for giving me the opportunity
to contribute to this volume in honour of Itamar Pitowsky. I had the pleasure of meeting Itamar on
several occasions when he visited Oxford.
I would also like to thank my collaborators in the work I have described in this paper: Adam
Brandenburger, Rui Soares Barbosa, Shane Mansfield, Kohei Kishida, Ray Lal, Giovanni Carù,
Lucien Hardy, Phokion Kolaitis and Georg Gottlob. My thanks also to Ehtibar Dzhafarov, whose
Contextuality-by-Default theory (Dzhafarov et al. 2015) has much in common with my own
approach, for our ongoing discussions.
References
Abramsky, S. (2013). Relational databases and Bell’s theorem. In In search of elegance in the
theory and practice of computation (pp. 13–35). Berlin: Springer.
Abramsky, S. (2017). Contextuality: At the borders of paradox. In E. Landry (Ed.), Categories for
the working philosopher. Oxford: Oxford University Press.
Abramsky, S., & Brandenburger, A. (2011). The sheaf-theoretic structure of non-locality and
contextuality. New Journal of Physics, 13(11), 113036.
Abramsky, S., & Carù, G. (2019). Non-locality, contextuality and valuation algebras: a general
theory of disagreement. Philosophical Transactions of the Royal Society A, 377(2157),
20190036.
Abramsky, S., & Hardy, L. (2012). Logical Bell inequalities. Physical Review A, 85(6), 062114.
Abramsky, S., Gottlob, G., & Kolaitis, P. (2013). Robust constraint satisfaction and local hidden
variables in quantum mechanics. In Twenty-Third International Joint Conference on Artificial
Intelligence.
Abramsky, S., Barbosa, R. S., Kishida, K., Lal, R., & Mansfield, S. (2015). Contextuality,
cohomology and paradox. In S. Kreutzer (Ed.), 24th EACSL Annual Conference on Computer
Science Logic (CSL 2015), volume 41 of Leibniz International Proceedings in Informatics
(LIPIcs) (pp. 211–228). Dagstuhl, Schloss Dagstuhl–Leibniz-Zentrum für Informatik.
Abramsky, S., Barbosa, R. S., & Mansfield, S. (2017). Contextual fraction as a measure of
contextuality. Physical Review Letters, 119(5), 050504.
Anders, J., & Browne, D. E. (2009). Computational power of correlations. Physical Review Letters,
102(5), 050502.
Bell, J. S. (1964). On the Einstein-Podolsky-Rosen paradox. Physics, 1(3), 195–200.
Boole, G. (1862). On the theory of probabilities. Philosophical Transactions of the Royal Society
of London, 152, 225–252.
Christof, T., Löbel, A., & Stoer, M. (1997). PORTA-POlyhedron Representation Transformation
Algorithm. Publicly available via ftp://ftp.zib.de/pub/Packages/mathprog/polyth/porta.
Dantzig, G. B., & Thapa, M. N. (2003). Linear programming 2: Theory and extensions (Springer series in operations research and financial engineering). New York: Springer.
Dzhafarov, E. N., Kujala, J. V., & Cervantes, V. H. (2015). Contextuality-by-default: A brief
overview of ideas, concepts, and terminology. In International Symposium on Quantum
Interaction (pp. 12–23). Heidelberg: Springer.
Elitzur, A. C., Popescu, S., & Rohrlich, D. (1992). Quantum nonlocality for each pair in an
ensemble. Physics Letters A, 162(1), 25–28.
Ghirardi, G. C., Rimini, A., & Weber, T. (1980). A general argument against superluminal
transmission through the quantum mechanical measurement process. Lettere Al Nuovo Cimento
(1971–1985), 27(10), 293–298.
Howard, M., Wallman, J., Veitch, V., & Emerson, J. (2014). Contextuality supplies the ‘magic’ for
quantum computation. Nature, 510(7505), 351–355.
Jordan, T. F. (1983). Quantum correlations do not transmit signals. Physics Letters A, 94(6–7), 264.
Pitowsky, I. (1989). Quantum probability, quantum logic (Lecture notes in physics, Vol. 321).
Berlin: Springer.
Pitowsky, I. (1991). Correlation polytopes: Their geometry and complexity. Mathematical Pro-
gramming, 50(1–3), 395–414.
Pitowsky, I. (1994). George Boole’s “Conditions of possible experience” and the quantum puzzle.
The British Journal for the Philosophy of Science, 45(1), 95–125.
Raussendorf, R. (2013). Contextuality in measurement-based quantum computation. Physical
Review A, 88(2), 022322.
Strichman, O. (2002). On solving Presburger and linear arithmetic with SAT. In Formal methods
in computer-aided design (pp. 160–170). Heidelberg: Springer.
Weisstein, E. W. (2019). Bonferroni inequalities. http://mathworld.wolfram.com/
BonferroniInequalities.html. MathWorld–A Wolfram Web Resource.
Chapter 2
Why Scientific Realists Should Reject the Second Dogma of Quantum Mechanics

Valia Allori
2.1 Introduction
Bub and Pitowsky (2010) have proposed the so-called information-theoretic (IT)
approach to quantum mechanics. It is a realist approach to quantum theory which
rejects what Bub and Pitowsky call the ‘two dogmas’ of quantum mechanics.
V. Allori ()
Department of Philosophy, Northern Illinois University, DeKalb, IL, USA
e-mail: vallori@niu.edu
These dogmas are as follows: (D1) measurement results should not be taken
as unanalysable primitives; and (D2) the wavefunction represents some physical
reality. In contrast, in the IT approach measurements are not analysed in terms of
something more fundamental, and the wavefunction does not represent physical
entities. Bub and Pitowsky reject the first dogma arguing that their approach is
purely kinematic, and that dynamical explanations (used to account for experimental
outcomes) add no further insight. Their approach has been criticized by Brown
and Timpson (2006) and by Timpson (2010), who argue that kinematic theories
should not be a model for realist quantum theories. The second dogma is rejected
as a consequence of the rejection of the first. Bub and Pitowsky assume that their
rejection of the wavefunction as representing physical entities commits them to an
epistemic view in which the wavefunction is a reflection of our ignorance,
and as such they have been criticized by some authors, such as Leifer (2014) and
Gao (2017).
Little has been done to investigate whether it is possible to reject the second
dogma without appealing to the alleged superiority of dynamical explanations, and
whether one could think of the wavefunction as not representing matter without
falling prey to the objections to the epistemic view (or to the so-called nomological
view). The aim of this paper is to explore such issues. First, I argue that Bub and Pitowsky, as well as scientific realists in general, should reject the second dogma to start with, independently of whether they decide to accept or deny the first dogma. This argument supports the motivation of the IT account, but it also stands alone as a general argument against regarding the wavefunction as a material entity. Then, in connection with this, I propose a functional account of the wavefunction which avoids the objections to the alternative views. This fits nicely into the IT interpretation by dissolving the objections against it which rely on the wavefunction being epistemic, but it is also helpful in answering general questions about the nature of the wavefunction.
Here’s the scheme of the paper. In Sect. 2.2 I present the IT approach of Bub
and Pitowsky. Then in Sect. 2.3 I discuss the distinction between dynamical and
kinematical explanations, as well as the principle/constructive theory distinction,
and how they were used by Bub and Pitowsky to motivate their view. Section 2.4
concludes this part of the paper by presenting the objections to the view connected
with the explanatory superiority of kinematic theories. In Sect. 2.5 I move to the
most prominent realist view which accepts dogma 2, namely wavefunction realism.
Then in Sect. 2.6 I provide alternative reasons why a realist should reject the second
dogma by introducing the primitive ontology approach. I connect the rejection of
the first dogma to scientific realism in Sect. 2.7, showing how a realist may or may
not consistently reject dogma 1. Then I move to the second set of objections to
the IT account based on the epistemic wavefunction. First, in Sect. 2.8 I review the
objections to the ontic accounts, while in Sect. 2.9 I discuss the ones to the epistemic
account. Finally, in Sect. 2.10 I propose an account of the wavefunction based on functionalism, and I argue that it is better than the alternatives. In the last section
I summarize my conclusions.
1 See, respectively, Clifton et al. (2003), Bub (2004, 2005, 2007), Pitowsky (2007), Bub and Pitowsky (2010).
2 Here they are: the impossibility of superluminal information transfer between two physical systems by performing measurements on one of them, of perfect broadcasting of the information contained in an unknown state, and of unconditionally secure bit commitment. Given these information-theoretic constraints, such a story can, in principle, have “no excess empirical content over quantum mechanics” (Bub 2005, p. 542).
Bub and Pitowsky argue that the IT interpretation is a kinematic theory, and that theories of this kind are to be preferred, as dynamical theories add no further insight. Originally, the argument was presented by Bub in terms of the principle/constructive theory distinction, so let us start with that. Principle theories are formulated in terms of principles, which are used as constraints on physically possible processes, as in thermodynamics (‘no perpetual motion machines’). Instead, constructive theories involve the dynamical reduction of macroscopic objects in terms of their microscopic constituents, as in the kinetic theory (which reduces the behavior of gases to the motion of atoms).4 Einstein introduced this distinction when
discussing his 1905 theory of relativity, which he regarded as a principle theory, as
it was formulated in terms of the two principles of equivalence of inertial frames for
all physical laws, and constancy of the velocity of light (in vacuum for all inertial
frames). This theory explains relativistic effects (such as length contraction and
time dilation) as the physical phenomena compatible with the theory’s principles.
By contrast, Lorentz's theory (1909), since it derives the relativistic transformations and the relativistic effects from the electromagnetic properties of the ether and its interactions with matter, is a constructive theory.5
According to Bub and Pitowsky, Lorentz’s constructive theory came first, and
only later did Einstein formulate his (principle theory of) special relativity: Minkowski
provided kinematic constraints which relativistic dynamics has to obey, and
Einstein came up with an interpretation for special relativity. Bub and Pitowsky
also claim that we should take the fact that Einstein’s kinematic theory has been
preferred over Lorentz’s dynamical theory as evidence that this type of theory
is to be preferred in general. In addition, they argue that the fact that Lorentz’s
theory can constructively explain Lorentz invariance justified a realist interpretation
of special relativity as a principle theory. But after this initial use, the constructive
counterpart is no longer necessary. Similarly, Bub and Pitowsky argue that IT
is a kinematic theory in that it provides constraints on the phenomena without
4 Here’s how Balashov and Janssen put it: “In a theory of principle [ . . . ] one explains the
phenomena by showing that they necessarily occur in a world in accordance with the postulates.
Whereas theories of principle are about the phenomena, constructive theories aim to get at the
underlying reality. In a constructive theory one proposes a (set of) model(s) for some part of
physical reality [ . . . ]. One explains the phenomena by showing that the theory provides a model
that gives an empirically adequate description of the salient features of reality” (Balashov and
Janssen 2003).
5 Again in the words of Balashov and Janssen: “Consider the phenomenon of length contraction.
Understood purely as a theory of principle, SR explains this phenomenon if it can be shown that
the phenomenon necessarily occurs in any world that is in accordance with the relativity postulate
and the light postulate. By its very nature such a theory-of-principle explanation will have nothing
to say about the reality behind the phenomenon. A constructive version of the theory, by contrast,
explains length contraction if the theory provides an empirically adequate model of the relevant
features of a world in accordance with the two postulates. Such constructive-theory explanations
do tell us how to conceive of the reality behind the phenomenon” (Balashov and Janssen 2003).
24 V. Allori
The IT framework has been criticized mainly for two reasons: on the one hand,
objections have been raised against the alleged superiority of kinematic theories,
thereby undermining the rejection of dogma 1; on the other hand, people
have pointed out the difficulties of considering the wavefunction as epistemic, a
consequence of the rejection of dogma 2. In this section, we will review the first
6 “There is no deeper explanation for the quantum phenomena of interference and entanglement
than that provided by the structure of Hilbert space, just as there is no deeper explanation for the
relativistic phenomena of Lorentz contraction and time dilation than that provided by the structure
of Minkowski space-time” (Bub and Pitowsky 2010).
2 Why Scientific Realists Should Reject the Second Dogma of Quantum Mechanics 25
kind of criticism, while in Sect. 2.9 we will discuss the ones based on the nature of
the wavefunction.
First of all, Timpson (2010) argues that the dogmas Bub and Pitowsky consider
are not dogmas at all, as they can be derived from more general principles like
realism. In fact, regarding dogma 1, a realist recognizes measurement results and
apparatuses as merely complicated physical systems without any special physical
status. Regarding dogma 2, we just proceed as we did in classical theories: there is a
formalism, there is an evolution equation, and one is a realist about what the equation
is about, which, in the case of quantum mechanics, is the wavefunction. However, I
think Timpson misses the point. The dogmas are dogmas for the realist: the realist
is convinced that she has to accept them to solve the measurement problem. What
Bub and Pitowsky are arguing is that the realist does not have to accept these dogmas:
they in fact provide a realist alternative which does not rely on them.
Timpson is on target in that a realist will likely endorse dogma 1 (what’s
special about measurements?), but Bub and Pitowsky argue that that is not the only
option. Namely, one can recognize dogma 1 as being a dogma and therefore may
want to recognize the possibility of rejecting it. One advantage of doing so, as we
saw, is that in this way quantum theory becomes a kinematic theory, if one believes
in the explanatory superiority of this type of theory. This leads one to recognize
dogma 2 also as a dogma, and then to recognize the possibility of rejecting it as
well.
Moreover, let me note that Timpson also points out that Bub and Pitowsky
provide us with no positive reason why one should be unhappy with such dogmas. I
will present in Sect. 2.9 reasons for being extremely suspicious of dogma 2. Indeed,
as I will argue later, one can restate the situation as follows: a realist, based on
the reasons Timpson points out, at first is likely to accept both dogmas. However,
upon further reflection, after realizing that by accepting dogma 2 one needs to face
extremely hard problems, the realist should reject it.
Brown and Timpson (2006) point out that the first dogma was not a dogma at
all for Einstein.7 In this way, they criticize the historical reconstruction that Bub
and Pitowsky use in their argument for the superiority of kinematic theories. In the
view of Brown and Timpson, Einstein’s 1905 theory was seen by Einstein himself
as a temporary theory, to be discarded once a deeper theory would make itself
7 In fact, discussing his theory of special relativity he wrote: “One is struck that the theory [ . . . ]
introduces two kinds of physical things, i.e., (1) measuring rods and clocks, (2) all other things,
e.g., the electromagnetic field, the material point, etc. This, in a certain sense, is inconsistent;
strictly speaking measuring rods and clocks would have to be represented as solutions of the basic
equations (objects consisting of moving atomic configurations), not, as it were, as theoretically
self-sufficient entities. However, the procedure justifies itself because it was clear from the very
beginning that the postulates of the theory are not strong enough to deduce from them sufficiently
complete equations [ . . . ] in order to base upon such a foundation a theory of measuring rods and
clocks. [ . . . ] But one must not legalize the mentioned sin so far as to imagine that intervals are
physical entities of a special type, intrinsically different from other variables (‘reducing physics to
geometry’, etc.)” (Einstein 1949b).
8 “The methodology of Einstein’s 1905 theory represents a victory of pragmatism over explanatory
depth; and that its adoption only made sense in the context of the chaotic state of physics at the
start of the 20th century” (Brown and Timpson 2006).
9 For according to Einstein, “when we say we have succeeded in understanding a group of natural
processes, we invariably mean that a constructive theory has been found which covers the processes
in question” (Brown and Timpson 2006).
10 See also Brown (2005), and Brown and Pooley (2004).
11 See Flores (1999), who argues that principle theories, in setting out the kinematic structure, are
more properly viewed as explaining by unification, in the Friedman (1974)/Kitcher (1989) sense.
Moreover, see Felline (2011) for an argument against Brown’s view in the framework of special
relativity. See also van Camp (2011) who argues that it is not the kinematic feature of Einsteinian
relativity that makes it preferred over the Lorentzian dynamics but the constructive component, in
virtue of which the theory possesses a conceptual structure.
an advantage is given only when a dynamical theory can provide some ‘excess’
in empirical content. Since dynamical constructive quantum theories (such as the
pilot-wave theory) are empirically equivalent to a kinematic quantum theory,
(unlike the kinetic theory) no dynamical quantum theory can ever provide more insight
than a kinematic one. Critics, however, could immediately point out that there are
dynamical quantum theories, such as the spontaneous collapse theory, which are not
empirically equivalent to quantum mechanics. Moreover, it is not obvious at all that
the only reason to prefer dynamical theories is that they can point to new data.
Therefore, it seems safe to conclude that an important step in Bub and Pitowsky’s
argument for their view rests on controversial issues connected with whether a
kinematic explanation is better than a dynamical one. Since the issue has not
been settled, I think it constitutes a serious shortcoming of the IT interpretation
as it stands now, and it would be nice if instead it could be grounded on a less
controversial basis. Indeed, this is what I aim to do in Sect. 2.6. Before doing this,
however, I will present in the next section the most popular realist approach which
accepts the second dogma.
Part of Bub and Pitowsky’s reasoning is based on their willingness to dissolve the
‘big’ measurement problem. So let’s take a step back and reconsider this problem
again. As is widely known, quantum theory has always been regarded as puzzling, at
best. As recalled by Timpson (2010), within classical mechanics the scientific realist
can read out the ontology from the formalism of the theory: the theory is about the
temporal evolution of a point in three-dimensional space, and therefore the world
is made of point-like particles. Similarly, upon opening books on quantum theory,
a mathematical entity and an evolution equation for it immediately stand out: the
wavefunction and the Schrödinger equation. So, the natural thought is to take this
object as describing the fundamental ontology. Even though, as we have just seen,
this thought can be readily justified, Bub and Pitowsky think it is a dogma, in the
sense that it is something that a realist has the option of rejecting. Here I want to
reinforce the argument that this is indeed a dogma, and that this dogma should be
rejected.
As emphasized by Schrödinger when discussing his cat paradox to present
the measurement problem (1935b), assuming the second dogma makes the theory
empirically inadequate. In fact, superposition states, namely states in which ‘stuff’
is both here and there at the same time, are mathematically possible solutions of
the Schrödinger equation. This feature is typical of waves, which the wavefunction
is, in virtue of obeying the Schrödinger equation (a wave equation). However, if
the wavefunction provides a complete description of the world, superpositions may
exist for everything, including cats in boxes being alive and dead at the same time.
The problem is that we never observe such macroscopic superpositions, and this
is, schematically, the measurement problem: if we open the box and we measure
the state of the cat, we find the cat either dead or alive. The realist, holding the
second dogma true, traditionally identified as their next task finding theories to
solve this problem. That is, the realist concluded that, in Bell’s words, “either
the wavefunction, as given by the Schrödinger equation, is not everything, or it
is not right” (Bell 1987, 201). Maudlin (1995) has further elaborated the available
options by stating that the following three claims are mutually inconsistent: (A)
the wavefunction of a system is complete; (B) the wavefunction always evolves
according to the Schrödinger equation; and (C) measurements have determinate outcomes.
Therefore, the most popular realist quantum theories are traditionally seen as
follows: the pilot-wave theory (Bohm 1952) rejects A, the spontaneous collapse
theory (Ghirardi et al. 1986) denies B, while the many-worlds theory (Everett 1957)
denies C.
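The trilemma can be made vivid with a standard textbook sketch (the notation here is mine, not Maudlin’s):

```latex
% Linearity of the Schrödinger equation: if \psi_1 and \psi_2 are solutions of
%   i\hbar\,\partial_t \psi = \hat{H}\psi ,
% then so is any superposition \alpha\psi_1 + \beta\psi_2.
% Applied to a measurement interaction, an initial product state evolves unitarily as
\[
  \big(\alpha\,|\mathrm{live}\rangle + \beta\,|\mathrm{dead}\rangle\big)\otimes|\mathrm{ready}\rangle
  \;\longrightarrow\;
  \alpha\,|\mathrm{live}\rangle\otimes|\mathrm{sees\ live}\rangle
  \;+\; \beta\,|\mathrm{dead}\rangle\otimes|\mathrm{sees\ dead}\rangle ,
\]
% so if (A) the quantum state is complete and (B) it always evolves by the
% Schrödinger equation, the apparatus itself ends up in a macroscopic
% superposition, contradicting (C) the determinateness of measurement outcomes.
```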
Notice that it is surprising that, even though dogma 2 was challenged or resisted
in the 1920s by de Broglie, Heisenberg, Einstein and Lorentz (see next section),12
the physicists proposing the realist quantum theories mentioned above did not
challenge it. Indeed, the view which accepts the second dogma has been the accepted view
among realists for a very long time. In this sense it was a dogma, as Bub and
Pitowsky rightly call it, until it was recently recognized to be an assumption, and the
name ‘wavefunction realism’ was given to the view which accepts dogma 2.
Even if it is not often discussed, most wavefunction realists acknowledge that the
measurement problem leads to another problem, the so-called configuration space
problem. The wavefunction by construction is an object that lives on configuration
space. This is classically defined as the space of the configurations of all particles,
which, for N particles in three-dimensional space, has 3N dimensions, and thus, for
macroscopic systems, a very large number of dimensions. So, given
that by dogma 2 the ontology is given by the wavefunction, if physical space is
the space in which the ontology lives, then physical space is configuration space.
Accordingly, material objects are represented by a field in configuration space. That
is, the arena in which physical phenomena take place seems to be
three-dimensional, but fundamentally it is not; and physical objects seem to be
three-dimensional, while fundamentally they are not. Thus, the challenge for the
wavefunction realist is to account for the fact that the world appears so different
from what it actually is.13 According to this view, the realist theories mentioned
above are all taken to be theories about the wavefunction, aside from the pilot-wave
theory, which has both particles and waves. In contrast, since in all the other theories
everything is ‘made of’ wavefunctions, particles and particle-like behaviors are best
seen as emergent, one way or another.14
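To fix ideas about the dimensionality involved, the point can be put in standard notation (the order-of-magnitude figure for a cat is an illustrative estimate added here, not a number from the text):

```latex
% For N particles, a classical configuration is a point
%   q = (\mathbf{r}_1, \dots, \mathbf{r}_N) \in \mathbb{R}^{3N},
% and the wavefunction is a field on this space:
\[
  \psi : \mathbb{R}^{3N} \times \mathbb{R} \to \mathbb{C},
  \qquad (q, t) \mapsto \psi(q, t).
\]
% For a macroscopic object such as a cat, N is of the order of 10^{26},
% so the space on which \psi lives has roughly 3 \times 10^{26} dimensions,
% rather than the three dimensions of the manifest image.
```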
Be that as it may, the configuration space problem, ignored since the
1920s, started to be recognized and discussed only in the 1990s, and at the moment it
12 See Bacciagaluppi and Valentini (2009) for an interesting discussion of the various positions
about this issue and others at the 1927 Solvay Congress.
13 See most notably Albert (1996, 2013, 2015), Lewis (2004, 2005, 2006, 2013), Ney (2012, 2013,
In the last two decades or so, some authors have started to express resistance to the
characterization of quantum theories in terms of having a wavefunction ontology.17
They propose that in all quantum theories material entities are not represented by
the wavefunction but by some other mathematical entity in three-dimensional space
(or four-dimensional space-time) which they dubbed the primitive ontology (PO) of
the theory. The idea is that the ontology of a fundamental physical theory should
always be represented by a mathematical object in three-dimensional space, and
quantum mechanics is not an exception. Given that the wavefunction is a field in
configuration space, it cannot represent the ontology of quantum theory.
This view constitutes a realist approach, just like wavefunction realism. In
contrast with it, however, the configuration space problem never arises because
physical objects are not described by the wavefunction. In this way, not only the
pilot-wave theory but also the spontaneous collapse and many-worlds theories are ‘hidden
variable’ theories, in the sense that matter needs to be described by something else
(in three-dimensional space) and not by the wavefunction. For instance, primitive
ontologists talk about GRWm and GRWf as spontaneous localization theories with
a matter density m and an event (‘flash’) ontology f, as opposed to what they call
GRW0, namely the spontaneous localization theory with merely the wavefunction
(Allori et al. 2008).18 The proponents of this view therefore, contrary to Bub
and Pitowsky, do not deny the first dogma: measurement processes are ‘regular’
physical processes analyzable in terms of their microscopic constituents (the PO).
15 With this I do not mean that there are no proposals but simply that these proposals are work-in-
progress, rather than full-blown solutions.
16 One may think that the fact that there is more than one solution to the measurement problem
implies that it has not really been solved. That is, one may think that the measurement problem
would be solved only if we had a unique answer to it. While this is certainly true, the sense
in which it has been solved (albeit not uncontroversially so) is that all its possible solutions are
fully developed, mature accounts, in contrast with the proposals to solve the configuration space
problem, which are only in their infancy. This is therefore the sense in which the configuration
space problem is more serious than the measurement problem.
17 Dürr et al. (1992), Allori et al. (2008), Allori (2013a, b).
18 Notice, however, that they are hidden variable theories only in the sense that one cannot read
off their ontology from the formalism of quantum mechanics, which is, in this approach, fundamentally
incomplete.
Nevertheless, like the IT interpretation, they reject the second dogma: they deny that the
wavefunction represents physical objects. As we will see in Sect. 2.8, however,
they do not take the wavefunction as epistemic: this view rejects the second dogma
in the sense that the wavefunction is not material but not in the sense that it does not
represent some objective feature of the world.
In this framework, one of the motivations for the rejection of dogma 2 is the
conviction that the configuration space problem is a more serious problem than
the measurement problem. The measurement problem is the problem of making
sense of unobserved macroscopic superpositions which are consequences of the
formalism of the theory. I wish to argue now that one important reason why one
should drop the second dogma comes from the recognition that, even if the realist
solves the measurement problem, the corresponding realist theories would still face
the configuration space problem, which is arguably much harder to solve and which
can be entirely avoided if one rejects the second dogma.
First, take the wavefunction-only theories, like the spontaneous localization and
many-worlds theories. If one could keep the wavefunction in three-dimensional
space, then both these theories could identify a ‘particle’ as a localized three-
dimensional field given by the wavefunction (the superposition wavefunction
spontaneously localizes in the spontaneous localization theory, and decoherence
separates the different branches in Everett’s theory). This would solve the
measurement problem. However, these theories would need a wavefunction in configuration
space to be empirically adequate, and therefore one ends up facing the configuration
space problem. So, wavefunction realists solve one problem, but they have a new
one. In the case of the pilot-wave theory, moreover, the argument for the rejection
of the second dogma is even clearer, because the theory does not even solve the
measurement problem from the point of view of the wavefunction realist. In fact,
in this view matter would be made of both particles and waves. That is, the cat is
made both of waves and of particles, and because of its wave-like nature, it can
be in unobserved macroscopic superpositions. The measurement problem is solved
only when one considers the theory a theory of particles only, without the wave-
counterpart representing matter. This is indeed how the theory is often discussed:
there are particles (which make up the cat), which are pushed around by the wave.
This is, however, still misleading because it assumes that some wave physically
exists, and this becomes problematic as soon as one realizes that this wave is
in configuration space: What kind of physical entity is it? How does it interact with
the particles? A better way of formulating the theory that avoids all this is to say that
cats are made of particles and the wavefunction is not material, therefore rejecting
the second dogma.19
Regardless, what’s so bad about the configuration space problem? According
to the proponents of the PO approach, it is extremely hard to solve, if not
unsolvable.20 Let us briefly see some reasons. One difficulty was recognized already
by de Broglie, who noticed that, since for the wavefunction realist there are
no particles at the fundamental level, “it seems a little paradoxical to construct
a configuration space with the coordinates of points which do not exist” (de
Broglie 1927). Interestingly, this argument can also be traced back historically
to Heisenberg, who very vividly said to Bloch (1976), referring to configuration
space realism: “Nonsense, [ . . . ] space is blue and birds fly through it”, to express
the ultimate unacceptability of building a theory in which there was no fundamental
three-dimensional space.21 Also, as already noticed, in wavefunction realism the
traditional explanatory schema of the behavior of macroscopic entities in terms of
microscopic ones needs to be heavily revised: contrary to the classical case, tables
and chairs are not macroscopic three-dimensional objects composed of microscopic
three-dimensional particles; rather, they are macroscopic three-dimensional
objects ‘emerging from’ a high-dimensional wavefunction. And this
problem has to be solved in the absence of a truly compelling argument to do so. In
fact, what is the positive argument for dogma 2? The one we already saw is that
we should do what we did in Newtonian mechanics, namely extract the ontology
from the evolution equation of the theory. However here the situation is different.
Newton started off with an a priori metaphysical hypothesis and built his theory
around it, so it is no surprise that one has no trouble figuring out the ontology of
the theory. In contrast, quantum mechanics was not developed like that. Rather, as is
well known, different formulations were proposed to account for the experimental
data (Heisenberg’s matrix mechanics and Schrödinger’s wave mechanics) without
allowing ideas about ontology to take precedence. Just as Bub and Pitowsky claim,
before the solution of the measurement problem, the theory is a kinematic theory:
just like thermodynamics, it puts constraints on possible experimental outcomes
without investigating how and why these outcomes come about. The main difference
between the PO and the IT approaches is that the former provides a framework of
constructive quantum theories based on the microscopic PO, which can be seen as
the dynamical counterpart of the latter, in which measurements are fundamental.22
Recently, a more compelling argument for wavefunction realism has been
defended, namely that wavefunction realism is the only picture in which the world
is local and separable.23 The idea is that, since fundamentally all that exists is the
21 Similar concerns have been expressed by Lorentz, who in a 1926 letter to Schrödinger wrote:
“If I had to choose now between your wave mechanics and the matrix mechanics, I would give
the preference to the former, because of its greater intuitive clarity, so long as one only has to
deal with the three coordinates x, y, z. If, however, there are more degrees of freedom, then I
cannot interpret the waves and vibrations physically, and I must therefore decide in favor of matrix
mechanics” (Przibram 1967). Similarly, Schrödinger wrote: “The direct interpretation of this wave
function of six variables in three-dimensional space meets, at any rate initially, with difficulties
of an abstract nature” (1926, p. 39). Again: “I am long past the stage where I thought that one
can consider the ψ-function as somehow a direct description of reality” (1935a). This is also a
concern heartfelt by Einstein, who expressed this view in many letters, e.g.: “The field in a many-
dimensional coordinate space does not smell like something real” (Einstein 1926).
22 For more on the comparison between the PO approach and the IT framework, see Dunlap (2015).
23 See Loewer (1996) and Ney (forthcoming) and references therein.
high-dimensional wavefunction, there are no other facts that fail to be determined by
local facts about the wavefunction. First, let me note that this is true also for the IT
account, in which what is real are the (three-dimensional) experimental outcomes.
However, while a more thorough analysis of this should be done elsewhere,
let me point out that it is an open question whether the notions of locality and
separability used in this context are the ones which have been of interest in the
literature. Einstein first proposed the EPR experiment to discard quantum mechanics
because it would show three-dimensional nonlocality. Arguably,24 Bell instead
proved that the locality assumption in EPR was false and that nonlocality is a fact of
nature. In other words, showing that the world is separable and local in configuration
space seems to give little comfort to people like Einstein, who wanted a world which
is local and separable in three-dimensional space.
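For reference, the point about Bell can be stated in the CHSH form of the inequality (standard material, not drawn from the text under discussion):

```latex
% For measurement settings a, a' and b, b' on the two wings of an EPR-type
% experiment, any local hidden-variable model constrains the correlations
% E(\cdot,\cdot) by
\[
  \big|E(a,b) + E(a,b') + E(a',b) - E(a',b')\big| \;\le\; 2 ,
\]
% whereas quantum mechanics predicts (and experiment confirms) violations up to
% the Tsirelson bound 2\sqrt{2} for the singlet state: nonlocality in
% three-dimensional space, which is what troubled Einstein.
```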
Be that as it may, a further reason to reject the second dogma is that, if we
accept it and somehow manage to solve the configuration space problem, we would
still lose important symmetries.25 However, this is not the case if we drop
the second dogma. In fact, the law of evolution of the wavefunction should no
longer be regarded as playing a central role in determining the symmetries of the
theory. Instead, they are determined by the ontology, whatever it is (the PO in the
PO approaches, and measurement outcomes in the IT framework). If one assumes
that the wavefunction does not represent matter, and wants to keep the symmetry,
then one can allow the wavefunction to transform so as to allow for this.26
Another reason to reject the second dogma, even assuming the configuration
space problem is solved, is the so-called problem of empirical incoherence. A
theory is said to be empirically incoherent when its truth undermines our empirical
justification for believing it to be true. Arguably, any theory gets confirmation by
spatiotemporal observations. However, wavefunction realism rejects space-time as
fundamental, so that its fundamental entities are not spatiotemporal. Because of this,
our spatiotemporal observations are not observations of the theory’s fundamental
entities, and thus provide no evidence for the theory in the first place.27
Moreover, it seems to me that insisting on holding on to dogma 2 at all costs
would ultimately undermine the original realist motivation. In fact, the realist, in
24 See Bell and Gao (2016) and references therein for a discussion of Bell’s theorem and its
implications.
25 See Allori et al. (2008) and Allori (2018, 2019). For instance, quantum mechanics with a
wavefunction ontology turns out not to be Galilei invariant. In fact, the wavefunction is a scalar field
in configuration space, and as such would remain the same under a Galilean transformation:
$\psi(\mathbf{r}, t) \to \psi(\mathbf{r} - \mathbf{v}t,\, t)$. In contrast, invariance would require a more
complicated transformation:
$\psi(\mathbf{r}, t) \to e^{\frac{i}{\hbar}\left(m\mathbf{v}\cdot\mathbf{r} - \frac{1}{2}mv^2 t\right)}\,\psi(\mathbf{r} - \mathbf{v}t,\, t)$.
A similar argument can be drawn to argue that if one
accepts the second dogma then the theory is also not time reversal invariant.
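The need for the extra phase can be checked directly against the free-particle Schrödinger equation (a standard calculation, spelled out here for illustration; it is not part of the original footnote):

```latex
% Write \theta(\mathbf{r},t) = \frac{1}{\hbar}\big(m\mathbf{v}\cdot\mathbf{r}
%   - \tfrac{1}{2}mv^2 t\big)
% and \psi'(\mathbf{r},t) = e^{i\theta}\,\psi(\mathbf{r}-\mathbf{v}t,\,t). Then
\begin{align*}
  i\hbar\,\partial_t \psi'
    &= e^{i\theta}\Big[\tfrac{1}{2}mv^2\,\psi + i\hbar\,\partial_t\psi
       - i\hbar\,\mathbf{v}\cdot\nabla\psi\Big], \\
  -\frac{\hbar^2}{2m}\nabla^2 \psi'
    &= e^{i\theta}\Big[\tfrac{1}{2}mv^2\,\psi - i\hbar\,\mathbf{v}\cdot\nabla\psi
       - \frac{\hbar^2}{2m}\nabla^2\psi\Big],
\end{align*}
% with the arguments of \psi evaluated at (\mathbf{r}-\mathbf{v}t,\,t). The boost
% terms on the two sides match, so \psi' solves
% i\hbar\,\partial_t\psi' = -\frac{\hbar^2}{2m}\nabla^2\psi' exactly when \psi
% does; without the phase, the \tfrac{1}{2}mv^2 and \mathbf{v}\cdot\nabla terms
% would spoil the invariance.
```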
26 See respectively Allori (2018, 2019) for arguments based on Galilean symmetry and time reversal
symmetry that the wavefunction is best seen as a projective ray in Hilbert space rather than a
physical field.
27 Barrett (1999), Healey (2002), Maudlin (2007). For responses, see Huggett and Wüthrich (2013)
28 Notice that there is nothing strange in this: the wavefunction realist did not accept quantum
mechanics as it is because it suffered from the measurement problem, and in order to solve it, she
finds it acceptable to, for instance, change the theory by modifying the Schrödinger equation.
Let’s summarize some of the points made so far, and draw some partial conclusions.
A scientific realist thinks it is possible to explain the manifest image of our senses
in terms of the scientific image given to us by our best theories (Sellars 1962). We
have different possibilities:
1. Reject dogma 2, and thus assume a microscopic ontology in three dimensions, as
advised by the PO approach;
31 Interestingly, there is logically a fourth option, namely a theory which rejects dogma 1 without
rejecting dogma 2. This would be a theory in which measurement outcomes are left unanalyzed and the
wavefunction represents physical objects. I am not sure who would hold such a view, but it seems to
me that such a theory would presumably not appeal to many people: while it solves the measurement
problem (by rejecting dogma 1), it does not solve the configuration space problem (since it accepts
dogma 2). Moreover, this theory leaves it a mystery what the relation between the wavefunction and
measurement processes is. Regardless, it nicely shows the relation between the two dogmas, with
dogma 2 being in my opinion the one responsible for the problems for the quantum realist.
accounts, while we will leave the criticism of the epistemic accounts to the next
section.
An account is said to be ontic when the wavefunction (or the density matrix,
which may also define the quantum state) represents some sort of objective feature
of the world. In contrast, an account is epistemic if the wavefunction (or the
density matrix) has to do with our knowledge of the physical system. Each
view has its own sub-classification. As we have seen, wavefunction realism
and the wavefunction as multi-field are ontic approaches in that they regard the
wavefunction as a physical field. However, there are other approaches that could be
labelled as ontic which do not have this feature. One of these ontic interpretations
is the view according to which the wavefunction is a law of nature. This is a
peculiar account because it has been formulated in the PO framework, which denies
the second dogma. Roughly put, the idea is that since the wavefunction does not
represent matter but has the role of governing the behavior of matter, then the
wavefunction has a nomological character and thus is best understood as a law of
nature.32 Notice that this account of the wavefunction is still ontic (the wavefunction
expresses some nomic feature of the world) even if the second dogma is rejected
(the wavefunction does not represent material entities), and thus one can dub this
view as ‘ψ-non-material’ rather than ‘ψ-ontic’, which does not seem sufficiently
specific (another non-material view is the one according to which the wavefunction
is a property, see below). Perhaps the most serious problem33 for this approach is
that one usually thinks of laws as time-independent entities, while the wavefunction
evolves itself in time.34 Another challenge for the view is that while laws are taken
to be unique, the wavefunction is not. Replies have been provided, relying on a
future quantum cosmology in which the wave function would be static.35 However,
they have been received with skepticism, given that one would want to get a handle
on the nature of the wavefunction in the current situation, without having to wait for
a future theory of quantum cosmology.
Perhaps a better way of capturing this idea can be found in a Humean framework,
regarding the wavefunction as part of the Humean mosaic.36 Nonetheless, many
still find the account wanting, most prominently because they find Humeanism with
respect to laws misguided for other reasons. A different but connected approach that
can be thought of as nomological is to think of the wavefunction as a dispositional
property, or more generally as a property of the ontology.37 Not surprisingly, even if
these approaches do not have the time-dependence problem, they are nevertheless
not immune to criticism, given that dispositional properties are notoriously a
tough nut to crack, and moreover the wavefunction does not behave like a regular
property.38
Moving on to the epistemic approaches: they have in common the idea that the
wavefunction should be understood as describing our incomplete knowledge of the
physical state, rather than the physical state itself. One of the motivations for the
epistemic approaches is that they immediately eliminate the measurement problem
altogether. In fact, schematically, in a Schrödinger-cat type of experiment, the cat is
never in a superposition state; rather, it is the experimenter who does not know what
its state is. When I open the box and find the cat alive, I have simply updated my
knowledge about the state of the cat, which I now know to have been alive all along
during the experiment.
The epistemic approaches can be divided into the ‘purely epistemic,’ or
neo-Copenhagen, accounts, and those that invoke hidden variables, which one
can dub ‘realist-epistemic’ approaches. The prototype of the latter type is Einstein’s
original ‘ignorance’ or ‘statistical’ interpretation of the wavefunction (Einstein
1949a), according to which the current theory is fundamentally incomplete: there are
some hidden variables which describe the reality underlying the phenomena, and whose
behavior is statistically well described by the wavefunction.39 For a review of these
epistemic approaches, see Ballentine (1970).40 In contrast, the neo-Copenhagen
approaches deny that such hidden variables exist, or that they are needed.
The wavefunction thus describes our knowledge of measurement outcomes, rather
than of an underlying reality. This view has a long history that goes back to some
of the founding fathers of the Copenhagen school as one can see, for instance,
in Heisenberg (1958) and in Peierls (1991), who writes that the wavefunction
“represents our knowledge of the system we are trying to describe.” Similarly, Bub
and Pitowsky write: “quantum state is a credence function, a bookkeeping device
for keeping track of probabilities”.41
A crucial difference between the ontic and the epistemic approaches is that while
in the former the description provided by the wavefunction is objective, in the
latter it is not. This is due to the fact that different agents, or observers, may have
different information about the same physical system. Thus, many epistemic states
may correspond to the same ontic state.
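This many-to-one structure can be made precise in the ontological-models framework of Harrigan and Spekkens, which is standard in this literature even though the text does not invoke it explicitly. Preparing the quantum state ψ yields a probability distribution μ_ψ over ontic states λ, and a model is ψ-epistemic just in case two distinct quantum states can share an ontic state:

```latex
% Preparation of psi samples an ontic state lambda from a distribution:
\[
\lambda \sim \mu_{\psi}(\lambda), \qquad \int \mu_{\psi}(\lambda)\, d\lambda = 1 .
\]
% psi-epistemic: for some distinct psi and phi the distributions overlap,
\[
\int \min\bigl(\mu_{\psi}(\lambda), \mu_{\phi}(\lambda)\bigr)\, d\lambda \;>\; 0 ,
\]
% so a single ontic state lambda in the overlap is compatible with both
% epistemic states: the physical state does not fix the quantum state.
```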
2002; Fuchs and Schack 2009, 2010, Caves et al. 2002a, 2002b, 2007), pragmatist approaches
(Healey 2012, 2015, 2017; Friederich 2015), and relational ones (Rovelli 1996).
38 V. Allori
As we anticipated, since the IT interpretation rejects the first dogma and takes
measurements as primitive and fundamental, one is left with the question about
what the wavefunction is. According to Bub and Pitowsky, the wavefunction (or the
density matrix) is not fundamental. Thus, they do not spend much time discussing
the nature of the wavefunction, something Timpson (2010) finds troublesome.
However, it seems natural for them to consider the wavefunction as epistemic, given
their overall approach: the wavefunction adds no empirical content to the theory,
and thus is best seen not as a description of quantum systems but rather as reflecting
the assigning agents’ epistemic relations to the systems.
One epistemic approach which shows important similarities to Bub and
Pitowsky’s view is the one put forward by Healey (2012, 2015, 2017). According
to him, physical situations are not described by the wavefunction itself. Rather,
the wavefunction prescribes what our epistemic attitudes towards claims about
physical situations should be. For instance, in a particle interference experiment, an
experimenter uses the Born rule to generate the probability that the particle is located
in some region. The wavefunction therefore prescribes what the experimenter
should believe about particle location. This view is epistemic in the sense that
the wavefunction provides a guide to the experimenter’s beliefs. Moreover, it is
pragmatic in the sense that concerns about explanation are taken to be prior to
representational ones. In this sense, the wavefunction has an explanatory role.
This explanatory role is also shared by Bub and Pitowsky’s account, since in their
account the wavefunction allows the definition of the principles of the theory which
constrain the possible physical phenomena, including interference.
Because of its epistemic character, the IT interpretation has received
many criticisms, and the situation remains extremely controversial.42 Let me
review what the main criticisms are. First of all, it has been argued that the
epistemic approaches cannot explain the interference phenomena in the two-slit
experiment for particles: how can our ignorance about certain details make a
phenomenon such as interference happen? It seems that some objective feature of
the world needs to exist in order to explain the interference fringes, and the obvious
candidate is the wavefunction (Leifer 2014; Norsen 2017). Moreover, in particle
experiments with the Mach-Zehnder interferometer, outcomes change depending on
whether or not we know which path the particle has taken. This seems incompatible
with the epistemic view: if the wavefunction represents our ignorance about which
path has been taken, then coming to know that would change the wavefunction but
would not produce any physical effect (Bricmont 2016).
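The interference objection can be stated in one line. In the two-slit (or Mach-Zehnder) setup, the detection statistics come from the squared modulus of a sum of amplitudes:

```latex
% Amplitudes from the two paths add before being squared:
\[
|\psi_{1} + \psi_{2}|^{2}
\;=\; |\psi_{1}|^{2} + |\psi_{2}|^{2}
\;+\; 2\,\mathrm{Re}\bigl(\psi_{1}^{*}\psi_{2}\bigr).
\]
% A mere ignorance mixture over `went through slit 1' and `went through
% slit 2' would instead give only |psi_1|^2 + |psi_2|^2, with no fringes.
```

The cross term is what produces the fringes, and the objection is that ignorance alone cannot supply it: something objective must carry the relative phase.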
Moreover, the so-called no-go theorems for hidden variables (starting from the
ones of von Neumann 1932, and Bell 1987) have been claimed to show that the
realist-epistemic view is untenable, so that they can be considered no-go theorems
42 See Gao (2017) for a review of the objections, and Leifer (2014) for a defense.
2 Why Scientific Realists Should Reject the Second Dogma of Quantum Mechanics 39
for this type of approach. In fact, if the wavefunction represents our ignorance of
an underlying reality in which matter has some property A, say, spin in a given
direction, then the wavefunction provides the statistical distribution of v(A), the
values of A. Indeed, if this interpretation is true, v(A) exists for more than one
property, since it would be arbitrary to say that spin exists only in certain directions.
However, the no-hidden variable theorems show that mathematically no v(A) can
agree with the quantum predictions. That is, one cannot assign hidden variables
to all observables at once.43
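A compact illustration of why no such assignment v(A) can work, not given in the text above but standard in this literature, is the Mermin–Peres ‘magic square’ of nine two-qubit Pauli observables:

```latex
% Each row and each column consists of mutually commuting observables:
\[
\begin{array}{ccc}
X \otimes I & I \otimes X & X \otimes X \\
I \otimes Z & Z \otimes I & Z \otimes Z \\
X \otimes Z & Z \otimes X & Y \otimes Y
\end{array}
\]
% The product of the observables in each row and in the first two columns
% is +I, while the product of the third column is -I. A noncontextual
% assignment v(A) = +1 or -1 would therefore have to satisfy both
\[
\prod_{\text{rows}} v \;=\; +1
\qquad\text{and}\qquad
\prod_{\text{columns}} v \;=\; -1 ,
\]
% yet both products run over the same nine values -- a contradiction.
```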
Recently, a new result, the PBR theorem, named after the initials of its proponents
(Pusey et al. 2012), has been proposed as a no-go theorem for the epistemic theories
in general. This theorem is supposed to show that, in line with the strategy used by
the other no-go theorems, epistemic approaches are in conflict with the statistical
predictions of quantum mechanics by showing that if the wavefunction is epistemic,
then it requires the existence of certain relations which are mathematically impossi-
ble. Therefore, if quantum mechanics is empirically adequate, then the wavefunction
must be ontic (either material, or not).
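In barest outline, the PBR argument runs as follows (a simplified sketch of the published proof). Suppose the distributions for the nonorthogonal states |0⟩ and |+⟩ overlapped. Prepare two systems independently, each in |0⟩ or |+⟩; with nonzero probability both ontic states land in the overlap, so the joint ontic state is compatible with all four preparations. But PBR exhibit an entangled four-outcome measurement in which each outcome is forbidden for one of the four preparations:

```latex
% Four product preparations, with a, b ranging over {0, +}:
\[
|a\rangle \otimes |b\rangle ,
\]
% and a four-outcome entangled measurement \{|\xi_{ab}\rangle\} satisfying
\[
\langle \xi_{ab} | \, a\, b \rangle \;=\; 0
\quad\text{for each } a, b \in \{0, +\}.
\]
% For a joint ontic state compatible with all four preparations, every
% outcome would have to occur with probability zero -- impossible, since
% the outcome probabilities must sum to one.
```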
These objections have been subjected to extensive examination, and proponents of
the epistemic views have proposed several replies, which are however in the eyes of
many still tentative.44 Bub and Pitowsky respond that they can explain interference
as a phenomenon constrained by the structure of Hilbert space, more specifically by
the principle of ‘no universal broadcasting.’ Moreover, PBR’s conclusion, as well as
the conclusions of the other no-go theorems, applies only to theories that assume an
underlying reality, which they reject. However, these answers appear unsatisfactory
to someone who would want to know why these phenomena obtain, rather than
having a mere description. So, without entering too deeply into the merits of these
objections, in the view of many they pose a serious challenge to the epistemic views,
and, accordingly, to the IT interpretation. Therefore, one may wonder whether it is
possible to deny the second dogma while staying away from the epistemic views
without endorsing the non-material interpretations discussed in Sect. 2.8, given that
they suffer from serious objections too. This is the enterprise I will take on in the
next section.
We have reviewed in the previous sections the various non-material accounts of the
wavefunction (the nomological approach, which is non-material but ontic, and the
epistemic ones) and their criticisms. In this section, I propose a new account of the
wavefunction based on functionalism, which I argue is better than the alternatives
and could be adopted both by the proponents of the PO approach and by Bub and
Pitowsky.
43 Notice that the pilot-wave theory circumvents this theorem because some, but not all, experi-
ments reveal pre-existing properties (for more on this, see Bricmont 2016; Norsen 2017).
44 Again, see Leifer (2014), Gao (2017), and references therein.
Functionalism, broadly speaking, is the view that certain entities can be ‘reduced’
to their functional role. To use a powerful slogan, ‘a table is as a table does.’
Strategies with a functionalist core have been used in the philosophy of mind for
a long time (starting from Putnam 1960), and they have been recently used also in
philosophy of physics.
For instance, Knox (2013, 2014) argues that spacetime can be functionalized
in the classical theory of gravity. That is, spacetime can be thought of as non-
fundamental (emergent) in virtue of the fact that spacetime plays the role of
defining inertial frames. Interestingly, Albert (2015) defends wavefunction realism
using his own brand of functionalism. He in fact argues that ordinary three-
dimensional objects are first functionalized in terms of their causal roles, and
the wavefunction dynamically uses these relations in a way which gives rise to
the empirical evidence.45 Lam and Wüthrich (2017) use functionalist strategies in
quantum gravity. There are many proposals for how to articulate a quantum theory
of gravity, however it has been argued that in all these theories spacetime is not
fundamental, but rather it is emergent from some non-spatiotemporal structures
(Huggett and Wuthrich forthcoming). They argue that in order for spacetime to
emerge it suffices to recover only those features which are functionally relevant
in producing observations. However, not surprisingly, this approach faces the same
objections as wavefunction realism. An objection to this approach is that it is unclear
how space and time can indeed emerge from a fundamental non-spatiotemporal
ontology: first, it is unclear what a non-spatiotemporal fundamental entity could be;
second, it is doubtful whether the right notion of emergence is the one proposed
(Lam and Esfeld 2013). In addition, there is the already mentioned problem of
empirical incoherence, namely that the theory undermines its own confirmation.
Arguably, any theory gets confirmation by spatiotemporal observations. However, a
theory in which spacetime is functionalized is a theory whose fundamental entities
are not spatiotemporal. Because of this, our observations are not spatiotemporal
observations, and thus provide no evidence for the theory in the first place.46
As anticipated, my idea is that one should functionally define the wavefunction,
similarly to how people wish to functionalize space, spacetime, and material
objects, and define it in terms of the role it plays in the theory. That is, the
wavefunction is as the wavefunction does. I am going to show that this view captures
the appeal of the other accounts without falling prey to their objections. In other
words, by thinking of the wavefunction functionally one can capture the intuitions
that the wavefunction is law-like (as proposed by the nomological approach), and
that it is explanatory rather than representational (as proposed by the epistemic and
pragmatic approaches), while avoiding the problems of time-dependence,
non-uniqueness, and the no-go theorems.
The wavefunction plays different but interconnected roles in realist quantum
theories that reject the second dogma, roles which all contribute to the same
45 Ney (2017) provides a criticism and discusses her own alternative. See also Ney (forthcoming).
46 For responses, see Huggett and Wüthrich (2013), and Ney (2015, forthcoming).
goal: to reproduce the empirical data. First, the wavefunction plays a nomological
role. This is straightforward in the case of PO theories, where it is one of the
ingredients necessary to write the dynamics for the PO, which in turn constructively
accounts for the measurement outcomes.47 In the case of the IT interpretation, the
dynamics plays no role, but the wavefunction can be seen as nomological in the
sense that it allows the principles or the kinematic constraints of the theory to be
expressed, which in turn limit the physically possible phenomena and thus explain
the data.
Connected to this, the wavefunction does not represent matter but has an
explanatory role: it is a necessary ingredient to account for the experimental results.
As we have seen, the PO approach and the IT interpretation differ in what counts as
explanatory, the former favoring a dynamical account, and the latter a kinematic one.
Accordingly, the explanatory role of the wavefunction in these theories is different.
In the PO approach the wavefunction helps define the dynamics for the PO, and
in this way accounts for the empirical results constructively as macroscopically
produced by the ‘trajectories’ of the microscopic PO. In the IT interpretation the
wavefunction provides the statistics of the measurement outcomes via the Born rule,
therefore explaining the probabilistic data by imposing kinematical constraints.
Notice that this explanatory role allows one to be pragmatic about the wavefunction:
it is a useful object to adequately recover the empirical predictions. Pragmatically,
both in the PO approach and in the IT framework one uses the wavefunction to
generate the probability outcomes given by the Born rule, but one could have chosen
another object, like a density matrix, or the same object evolving according to
another evolution equation.48 The fact that quantum mechanics is formulated in
terms of the wavefunction is contingent on super-empirical considerations
such as simplicity and explanatory power: it is the simplest and most
explanatory choice one could have made.
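The multiple realizability at issue can be displayed directly with the Born rule, since the same outcome statistics can be generated by the wavefunction or by a density matrix:

```latex
% Born rule via the wavefunction:
\[
P(a) \;=\; |\langle a | \psi \rangle|^{2},
\]
% and equivalently via the density matrix \rho = |\psi\rangle\langle\psi|:
\[
P(a) \;=\; \mathrm{Tr}\bigl(\rho\, P_{a}\bigr),
\qquad P_{a} = |a\rangle\langle a| .
\]
% On the functionalist reading, anything that generates these statistics
% plays the wavefunction's role; the choice of psi over rho is a matter
% of simplicity, not of ontology.
```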
This pragmatic take is also compatible with the nomological role of the wave-
function. In fact, as we have seen, the wavefunction is not a law of nature,
strictly speaking; rather, it is merely an ingredient in the law. Because of this the
wavefunction does not have to be unique, just as potentials are not unique, and it is
dispensable: other entities, such as the density matrix, can play its role. In this way,
the wavefunction plays a nomological role while avoiding the usual objections to the
nomological view.
Moreover, notice that this approach, even if pragmatic, is not epistemic: the
wavefunction does not have to do with our knowledge of the system (contra Bub and
Pitowsky’s original proposal). As a consequence, it does not suffer from the usual
problems of the epistemic approaches. Moreover, in the case of the IT approach, this
47 In the context of the pilot-wave theory, the wavefunction can be taken as a force or a potential,
but one should not think of it as material (even if Belousek 2003 argues otherwise).
48 In the PO framework, for instance, see Allori et al. (2008) for unfamiliar formulations of the
pilot-wave theory with a ‘collapsing’ wave function and of the spontaneous collapse theory with
a non-collapsing one.
view remains true to the ideas which motivated the interpretation, namely to reject
both dogmas, and fits well with their ideas of quantum probability.
Also, I wish to remark that, since all the theories we are considering reject the
second dogma, this view also avoids the objection of empirical incoherence of other
functionalist approaches. In fact the problem originates only if the fundamental
entities of the theory are not spatiotemporal, and this is the case neither in the IT
framework (in which measurement outcomes are fundamental and are in three-
dimensional space) nor in the PO approach (in which the PO is a spatiotemporal
entity).
To summarize, the wavefunction plays distinctive but interconnected roles which
act together to allow one to recover the empirical data. First, the wavefunction is
an ingredient in the laws of nature, and as such contributes to determining the
trajectories of matter. Accordingly, it has an explanatory role in accounting for the
experimental results, even if it is not a representational one. Because of this, the
wavefunction can be thought of in pragmatic terms without considering it connected
to our knowledge of the system. Formulated in this way, the view has the advantage
of capturing the main motivations for the other views without falling prey to their
main objections:
1. The wavefunction has a nomological role, without being a law; so it does not
matter whether it is time-evolving or unique;
2. The wavefunction has an explanatory role without necessarily being uniquely
defined or connected to the notion of dynamical explanation;
3. The wavefunction has a pragmatic role in recovering the empirical data without
being epistemic, thereby avoiding the interference problem and the no-go
theorems linked with the epistemic view.
These roles combine to define the wavefunction functionally: the
wavefunction is whatever plays its role, namely recovering the experimental
results. This role can be accomplished either directly, in the IT framework, or
indirectly, through the ‘trajectories’ of the PO, but always remaining in spacetime,
therefore bypassing the configuration space problem and the worry of empirical
incoherence. This account of the wavefunction is thus arguably better than the others
taken individually, and accordingly is the one that the proponents of realist theories
which reject the second dogma should adopt. In particular, it can be adopted by the
proponents of the PO approach, as well as by Bub and Pitowsky to strengthen their
views.
2.11 Conclusion
Acknowledgements Thank you to Meir Hemmo and Orly Shenker for inviting me to contribute to
this volume. Also, I am very grateful to Laura Felline for her constructive comments on an earlier
version of this paper, and to the participants of the International Workshop on “The Meaning of the
Wave Function”, Shanxi University, Taiyuan, China, held in October 2018, for important feedback
on the second part of this paper.
References
Brown, H. R., & Wallace, D. (2005). Solving the measurement problem: de Broglie-Bohm loses
out to Everett. Foundations of Physics, 35, 517–540.
Bub, J. (2004). Why the quantum? Studies in the History and Philosophy of Modern Physics, 35,
241–266.
Bub, J. (2005). Quantum mechanics is about quantum information. Foundations of Physics, 35(4),
541–560.
Bub, J. (2007). Quantum probabilities as degrees of belief. Studies in History and Philosophy of
Modern Physics, 39, 232–254.
Bub, J., & Pitowsky, I. (2010). Two dogmas about quantum mechanics. In S. Saunders, J. Barrett,
A. Kent, & D. Wallace (Eds.), Many worlds?: Everett, quantum theory, and reality (pp. 433–
459). Oxford: Oxford University Press.
Callender, C. (2015). One world, one Beable. Synthese, 192, 3153–3177.
Caves, C. M., Fuchs, C. A., & Schack, R. (2002a). Quantum probabilities as Bayesian probabilities.
Physical Review A, 65, 022305.
Caves, C. M., Fuchs, C. A., & Schack, R. (2002b). Unknown quantum states: The quantum de
Finetti representation. Journal of Mathematical Physics, 44, 4537–4559.
Caves, C. M., Fuchs, C. A., & Schack, R. (2007). Subjective probability and quantum certainty.
Studies in History and Philosophy of Modern Physics, 38(2), 255–274.
Clifton, R., Bub, J., & Halvorson, H. (2003). Characterizing quantum theory in terms of
information theoretic constraints. Foundations of Physics, 33(11), 1561.
de Broglie, L. (1927). La Nouvelle Dynamique des Quanta. In: Solvay conference, Electrons
et Photons, translated in G. Bacciagaluppi and A. Valentini (2009) Quantum theory at the
crossroads: Reconsidering the 1927 Solvay conference (pp. 341–371). Cambridge: Cambridge
University Press.
Dunlap, L. (2015). On the common structure of the primitive ontology approach and information-
theoretic interpretation of quantum theory. Topoi, 34(2), 359–367.
Dürr, D., Goldstein, S., & Zanghì, N. (1992). Quantum equilibrium and the origin of absolute
uncertainty. Journal of Statistical Physics, 67, 843–907.
Dürr, D., Goldstein, S., & Zanghì, N. (1997). Bohmian mechanics and the meaning of the wave
function. In R. S. Cohen, M. Horne, & J. Stachel (Eds.), Experimental metaphysics — Quantum
mechanical studies for Abner Shimony (Vol. 1: Boston Studies in the Philosophy of Science)
(Vol. 193, pp. 25–38). Boston: Kluwer Academic Publishers.
Egg, M. (2018). Dissolving the measurement problem is not an option for realists. Studies in
History and Philosophy of Science Part B: Studies in History and Philosophy of Modern
Physics, 66, 62–68.
Einstein, A. (1919). What is the theory of relativity? The London Times. Reprinted in Einstein, A.
(1982). Ideas and opinions (pp. 227–232). New York: Crown Publishers, Inc.
Einstein, A. (1926). Einstein to Paul Ehrenfest, 18 June, 1926, EA 10–138. Translated in Howard
(1990), p. 83.
Einstein, A. (1949a). Autobiographical notes. In P. A. Schilpp (Ed.), Albert Einstein: Philosopher-
scientist. Evanston, IL: The Library of Living Philosophers.
Einstein, A. (1949b). Reply to criticisms. In P. A. Schilpp (Ed.), Albert Einstein: Philosopher-
scientist. Evanston, IL, The Library of Living Philosophers.
Esfeld, M. A. (2014). Quantum Humeanism, or: Physicalism without properties. The Philosophical
Quarterly, 64, 453–470.
Esfeld, M. A., Lazarovici, D., Hubert, M., & Dürr, D. (2014). The ontology of Bohmian mechanics.
The British Journal for the Philosophy of Science, 65, 773–796.
Everett, H., III. (1957). ‘Relative state’ formulation of quantum mechanics. Reviews of Modern
Physics, 29, 454–462.
Felline, L. (2011). Scientific explanation between principle and constructive theories. Philosophy
of Science, 78(5), 989–1000.
Flores, F. (1999). Einstein’s theory of theories and types of theoretical explanation. International
Studies in the Philosophy of Science, 13(2), 123–134.
Forrest, P. (1988). Quantum metaphysics. Oxford: Blackwell.
Leifer, M. S. (2014). Is the quantum state real? An extended review of ψ-ontology theorems.
Quanta, 3, 67–155.
Lewis, P. J. (2004). Life in configuration space. The British Journal for the Philosophy of Science,
55, 713–729.
Lewis, P. J. (2005). Interpreting spontaneous collapse theories. Studies in History and Philosophy
of Modern Physics, 36, 165–180.
Lewis, P. J. (2006). GRW: A case study in quantum ontology. Philosophy Compass, 1, 224–244.
Lewis, P. J. (2013). Dimension and illusion. In D. Albert & A. Ney (Eds.), The wave-function:
Essays in the metaphysics of quantum mechanics (pp. 110–125). New York: Oxford University
Press.
Loewer, B. (1996). Humean Supervenience. Philosophical Topics, 24, 101–127.
Lorentz, H. A. (1909). The theory of electrons. New York: Columbia University Press.
Maudlin, T. (1995). Three measurement problems. Topoi, 14, 7–15.
Maudlin, T. (2007). Completeness, supervenience and ontology. Journal of Physics A, 40, 3151–
3171.
Miller, E. (2014). Quantum entanglement, Bohmian mechanics, and Humean supervenience.
Australasian Journal of Philosophy, 92(3), 567–583.
Monton, B. (2013). Against 3N-dimensional space. In D. Z. Albert & A. Ney (Eds.), The Wave-
function: Essays in the metaphysics of quantum mechanics (pp. 154–167). New York: Oxford
University Press.
Ney, A. (2012). The status of our ordinary three-dimensions in a quantum universe. Noûs, 46,
525–560.
Ney, A. (2013). Ontological reduction and the wave-function ontology. In D. Albert & A. Ney
(Eds.), The wave-function: Essays in the metaphysics of quantum mechanics (pp. 168–183).
New York: Oxford University Press.
Ney, A. (2015). Fundamental physical ontologies and the constraint of empirical coherence.
Synthese, 192(10), 3105–3124.
Ney, A. (2017). Finding the world in the wave-function: Some strategies for solving the macro-
object problem. Synthese, 1–23.
Ney, A. (forthcoming). Finding the world in the wave function. Oxford: Oxford University Press.
Norsen, T. (2010). The theory of (exclusively) local Beables. Foundations of Physics, 40(12),
1858–1884.
Norsen, T. (2017). Foundations of quantum mechanics: An exploration of the physical meaning of
quantum theory. Cham: Springer.
North, J. (2013). The structure of the quantum world. In D. Z. Albert & A. Ney (Eds.), The wave-
function: Essays in the metaphysics of quantum mechanics (pp. 184–202). New York: Oxford
University Press.
Peierls, R. (1991). In defence of ‘measurement’. Physics World, January, pp. 19–20.
Pitowsky, I. (2007). Quantum mechanics as a theory of probability. In W. Demopoulos & I.
Pitowsky (Eds.), Physical theory and its interpretation: Essays in honor of Jeffrey Bub. Berlin:
Springer.
Przibram, K. (1967). Letters on wave mechanics (trans. Martin Klein). New York: Philosophical
Library.
Pusey, M., Barrett, J., & Rudolph, T. (2012). On the reality of the quantum state. Nature Physics,
8, 475–478.
Putnam, H. (1960). Minds and machines, reprinted in Putnam (1975b), pp. 362–385.
Putnam, H. (1975). Mind, language, and reality. Cambridge: Cambridge University Press.
Rovelli, C. (1996). Relational quantum mechanics. International Journal of Theoretical Physics,
35(8), 1637–1678.
Schrödinger, E. (1926). Quantisierung als Eigenwertproblem (Zweite Mitteilung). Annalen der
Physik, 79, 489–527. English translation: Quantisation as a Problem of Proper Values. Part II.
Schrödinger, E. (1935a). Schrödinger to Einstein, 19 August 1935. Translated in Fine, A. (1996).
The Shaky game: Einstein realism and the quantum theory (p. 82). Chicago: University of
Chicago Press.
Guido Bacciagaluppi
Abstract There are two notions in the philosophy of probability that are often used
interchangeably: that of subjective probabilities and that of epistemic probabilities.
This paper suggests they should be kept apart. Specifically, it suggests that the dis-
tinction between subjective and objective probabilities refers to what probabilities
are, while the distinction between epistemic and ontic probabilities refers to what
probabilities are about. After arguing that there are bona fide examples of subjective
ontic probabilities and of epistemic objective probabilities, I propose a systematic
way of drawing these distinctions in order to take this into account. In doing so, I
modify Lewis’s notion of chances, and extend his Principal Principle in what I argue
is a very natural way (which in fact makes chances fundamentally conditional). I
conclude with some remarks on time symmetry, on the quantum state, and with
some more general remarks about how this proposal fits into an overall Humean
(but not quite neo-Humean) framework.
I believe I first met Itamar Pitowsky in what I think was the spring of 1993, when he
came to spend a substantial sabbatical period at Wolfson College, Cambridge, while
writing his paper on George Boole’s ‘Conditions of Possible Experience’ (Pitowsky
1994). At that time, the Cambridge group was by far one of the largest research
groups in philosophy of physics worldwide, and Wolfson had the largest share of the
group. Among others, that included two further Israelis, my friends and fellow PhD-
students Meir Hemmo, who is co-editing this volume, and Jossi Berkovitz, who had
worked under Itamar’s supervision on de Finetti’s probabilistic subjectivism and its
G. Bacciagaluppi ()
Descartes Centre for the History and Philosophy of the Sciences and the Humanities, Utrecht
University, Utrecht, Netherlands
e-mail: g.bacciagaluppi@uu.nl
This paper is about the notion of subjective probabilities and that of epistemic
probabilities, and how to keep them apart. Since terminology varies, I should briefly
sketch which uses of the terms I intend. (A fuller explication will be given in
Sects. 3.2 and 3.3.)
I take the use of the term ‘subjective probability’ to be fairly uniform in the
literature, and to be fairly well captured by the idea of degrees of belief (and as such
‘assigned by agents’ and presumably ‘living in our heads’), as opposed to some
appropriate notion of objective probabilities (‘out there’), sometimes cashed out in
terms of frequencies or propensities. I shall take the standard explication of objective
probabilities as that given by David Lewis with his notion of chance (Lewis 1980).
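Lewis's explication links the two notions via his Principal Principle, which in its simplest form says that a reasonable initial credence function Cr defers to known chances:

```latex
% Principal Principle (simple form): if ch_t(A) = x is the chance of A
% at time t, and E is any admissible evidence, then
\[
Cr\bigl(A \,\big|\, ch_{t}(A) = x \;\wedge\; E\bigr) \;=\; x .
\]
% Subjective credences that track chances in this way are the ones that
% `correspond to objective probabilities out there'.
```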
Along these lines, all probability assignments can be taken as subjective in the
first place, but some such assignments may correspond to objective probabilities ‘out
there’ in the world, determined by actual matters of fact. Probability assignments
that do not correspond to objective probabilities shall be termed ‘purely’ or ‘merely’
subjective in the following. In this terminology, the subjective probabilities of a
subjectivist like de Finetti, for whom objective probabilities do not exist, are the
prime example of purely subjective probabilities, even though usually they will
be firmly if pragmatically rooted in matters of actual fact (as further discussed in
Sects. 3.4 and 3.10).
The term ‘epistemic probabilities’ has a wider variety of meanings, of which two
are probably dominant, depending on whether one emphasises the fact that we lack
knowledge of something or whether one emphasises the knowledge that we do have.
The one I shall focus on is the former, that of ignorance-interpretable probabilities
(i.e. of probabilities attaching to propositions expressing a matter of fact, but whose
truth is unknown), which is perhaps most common in the philosophy of physics. The
other refers to probability assignments whose values reflect our state of knowledge.
The classical notion of probability based on the principle of indifference or an
3 Unscrambling Subjective and Epistemic Probabilities 51
objective Bayesian view based on the principle of maximum entropy (to which we
shall presently return) can thus also be termed epistemic.
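For concreteness, the objective Bayesian recipe just mentioned can be written down in a line: among all distributions satisfying the known constraints, choose the one of maximum entropy. With no constraints beyond normalization this reduces to the principle of indifference.

```latex
% Maximize S(p) = -\sum_i p_i \ln p_i subject to \sum_i p_i = 1 and a
% known expectation value \sum_i p_i f_i = F. The solution is
\[
p_{i} \;=\; \frac{e^{-\lambda f_{i}}}{Z(\lambda)},
\qquad
Z(\lambda) \;=\; \sum_{j} e^{-\lambda f_{j}},
\]
% with lambda fixed by the constraint F. With only normalization imposed,
% the maximum-entropy distribution is uniform: p_i = 1/n (indifference).
```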
This second notion of ‘epistemic probabilities’, which is perhaps most common
in the philosophy of probability, belongs somewhere along the axis between subjec-
tive and objective probabilities. It is often contrasted with subjective probabilities
when these are understood as merely ‘prudential’ or ‘pragmatic’ or ‘instrumental’.
This distinction is challenged by Berkovitz (2019), who convincingly argues that de
Finetti’s instrumental probabilities are also epistemic in this sense. Indeed, when
talking about subjective probabilities in Sects. 3.4, 3.6 and 3.10, I shall always
implicitly take them to be so. This second sense of ‘epistemic probabilities’ is not
the topic of this paper, and will be discussed no further.
Epistemic probabilities in the sense of ignorance-interpretable ones, instead, are
liable to be assimilated to subjective probabilities, but, as I shall argue, they ought
not to be, because the distinction between probabilities that are epistemic in this sense
and those that are not (we shall call them ‘ontic’) is conceptually orthogonal to the
subjective-objective distinction.
Take for example Popper’s discussion in Vol. III of the Postscript (Popper
1982).1 After treating extensively of his propensity interpretation in Vol. II, Popper
starts Vol. III (on quantum mechanics) with a discussion of subjective vs objective
probabilities, complaining that physicists conflate the two (especially in quantum
mechanics). Popper’s contrast for the objective view of probabilities is what he
throughout calls the subjective view, which for him is the view that probabilistic
statements are about our (partial) ignorance. In the case of classical statistical
mechanics, he complains that ‘[t]his interpretation leads to the absurd result that the
molecules escape from our bottle because we do not know all about them’ (Popper
1982, p. 109).
True, we care about whether our coffee goes cold, and the coffee does not care
about our feelings. And, at least in the sense in which Popper took his own notion
of propensities to be explanatory of observed frequencies (cf. Bub and Pitowsky
1985, p. 541), it seems reasonable to say that while the behaviour of the coffee is
a major factor in explaining why we come to have certain subjective expectations,
it is only objective probabilities that can explain the behaviour of the coffee. But
there is also a sense in which the probabilities of (classical) statistical mechanics
are clearly epistemic: we do believe the coffee or the gas in the bottle to be in a
definite microstate, but we do not know which. Moreover, the statistical mechanical description
of the gas presupposes that there is a particular matter of fact about the state of
the molecules, which is however left out of the description. If we read Popper’s
complaint as being that epistemic probabilities must be subjective, or rather purely
1 Incidentally, reviewed by Itamar when it came out (Bub and Pitowsky 1985). Note that my
examples of possible assimilation or confusion between epistemic and subjective probabilities (in
the senses just sketched) may well prove to be straw-mannish under further scrutiny, but I just wish
to make a prima facie case that these two notions can indeed be confused. The rest of the paper
will make the extended case for why we should distinguish them and how.
52 G. Bacciagaluppi
2 It is unclear to what extent this ‘objective Bayesian’ strategy indeed yields objective probabilities.
Consider for instance situations in which information about the physical situation is in principle
readily available but is glibly ignored. In this case, even if we use principles such as maximum
entropy, one might complain that there is little or no connection between the physical situation
and the probabilities we assign. On the other hand, there may be situations in which the
physical situation itself appears to systematically lead to situations of ignorance, e.g. because of
something like mixing properties of the dynamics. In this case, there may arguably be a systematic
link between certain objective physical situations and certain ways of assigning probabilities,
which may go some way towards justifying the claim that these probabilities are objective. (Cf.
discussions about whether high-level probabilities are objective, e.g. Glynn (2009), Lyon (2011),
Emery (2013) and references therein.) If so, note that this will be the case whether or not there
is anyone to assign those probabilities. Many thanks to Jossi Berkovitz for discussion of related
points.
3 For an example of what Jaynes has in mind, see Heisenberg’s ‘The Copenhagen interpretation
of quantum theory’ (Heisenberg 1958, Chap. 3). In Sect. 3.9 below, I shall allude to what I think
are the origins of these aspects of Heisenberg’s views, but I agree with Jaynes that (at least in the
form they are usually presented) they are hard to make sense of. Note also that the ‘Copenhagen
interpretation’ as publicised there appears to have been largely a (very successful) public relations
exercise by Heisenberg (cf. Howard 2004). In reality, physicists like Bohr, Heisenberg, Born and
Pauli held related but distinct views about how quantum mechanics should be understood.
3 Unscrambling Subjective and Epistemic Probabilities 53
Indeed, as Jaynes sees it, in the case of quantum mechanics human knowledge is
limited by dogmatic adherence to indeterminism in the Copenhagen interpretation.
But in fact, even though the ‘Copenhagen interpretation’ may have been dogmatic,
indeterminism was the least worrying of its dogmas: a wide spectrum of non-
dogmatic approaches to the foundations of quantum mechanics today embrace
indeterminism in some form or other. Thus, if (as Jaynes seems to think) subjective
probabilities must be epistemic, then they seem to be inapplicable to many if not
most fundamental approaches to quantum mechanics.4
These problems with both Popper’s and Jaynes’ positions seem to me to stem
from a scrambling together of subjective and epistemic probabilities. I shall take it
as uncontroversial that we often form degrees of belief about facts we do not know
(so that such probabilities are both subjective and epistemic), but this in itself does
not show that all epistemic probabilities should be subjective, or that all subjective
probabilities should be epistemic. Indeed, being a degree of belief is conceptually
distinct from being a probability about something of which we are ignorant: the characterisation of
a probability as subjective is in terms of what that probability is, that of a probability
as epistemic is in terms of what that probability is about.
Admittedly, even if we can conceptually distinguish between subjective and
epistemic probabilities, it could be that the two categories are coextensive. And
if all epistemic probabilities are merely subjective, or all subjective probabilities
are also epistemic, then Popper’s or Jaynes’s worries become legitimate again. As
a matter of fact, typical cases of epistemic probabilities are arguably subjective (‘Is
Beethoven’s birthday the 16th of December?’5 ), and, say, probabilities in plausibly
indeterministic scenarios are typically thought to be objective (‘Will this atom decay
within the next minute?’). And, indeed, while there is an entrenched terminology
for probabilities that are not purely subjective (namely ‘objective probabilities’),
non-epistemic probabilities do not even seem to have a separate standard name (we
shall call them ‘ontic probabilities’). It is indeed tempting to think that all epistemic
probabilities are subjective and all ontic probabilities are objective.
I urge the reader to resist this temptation. I believe that we can make good
use of the logical space that lies between these distinctions. As the Popper and
Jaynes quotations above are meant to suggest, objectivism about probability will be
4 Spontaneous collapse approaches are invariably indeterministic (unless one counts among them
also non-linear modifications of the Schrödinger equation, which however have not proved
successful), as is Nelson’s stochastic mechanics. Everett’s theory vindicates indeterminism at
the emergent level of worlds. And even many pilot-wave theories are stochastic, namely those
modelled on Bell’s (1984) theory based on fermion number density. Thus, among the main
fundamental approaches to quantum mechanics, only theories modelled on de Broglie and Bohm’s
original pilot-wave theory provide fully deterministic underpinnings for it. (For a handy reference
to philosophical issues in quantum mechanics, see Myrvold (2018b) and references therein. For
more on Nelson see Footnote 40.)
5 16 December is the traditional date for Beethoven’s birthday, but we know from documentary
evidence only that he was baptised on 17 December 1770 – Beethoven was probably born on 16
December.
6 With the qualifications mentioned in Footnote 2. Thanks to both Jos Uffink and Ronnie Hermens
for making me include more about Jaynes.
7 These (especially the latter) can be skipped by readers who are not particularly familiar with
debates on the metaphysics of time or, respectively, on the nature of the quantum state and of
collapse in quantum mechanics.
whatever the form of our credence function at the time t. Of course, if we further
believe at t that the chance at t of A is in fact x, i. e. if crt(cht(A) = x) = 1, then
our credence at t in A is also x (by the probability calculus), or if we learn at t that
cht(A) = x, our credence in A becomes x (by Bayesian conditionalisation).
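The step labelled ‘by the probability calculus’ can be spelled out via total probability (a sketch in the notation of the surrounding text, with the chance–credence link supplying the value of the conditional credence):

```latex
\mathrm{cr}_t(A)
  = \mathrm{cr}_t\!\big(A \mid \mathrm{ch}_t(A)=x\big)\,
    \underbrace{\mathrm{cr}_t\!\big(\mathrm{ch}_t(A)=x\big)}_{=\,1}
  = x \cdot 1 = x ,
```

where the complementary term of the total-probability expansion vanishes, since full belief that cht(A) = x leaves credence 0 for the complement.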
The stringency of Lewis’s requirements on what is to count as rationally
compelling is such that there are few candidates in the literature for satisfying them,
none of which are uncontroversial. And because Lewis himself believes in ‘Humean
8 While in a Cambridge HPS mood, I cannot refrain from thanking Jeremy Butterfield for (among
so many other things!) introducing me to this wonderful paper and all issues around objective
probabilities.
9 Again, terminology varies somewhat, but I shall take ‘credences’ to be synonymous with
‘subjective probabilities’.
10 Bayesian conditionalisation is often taken as the unique rational way in which one can change
one’s credences. But there may be other considerations competing with it, in particular affecting
background theoretical knowledge. This will generally be implicit (but not needed) in much of this
paper. Thanks to Jossi Berkovitz for pointing out to me that de Finetti himself was not committed
to conditionalisation as the only way for updating probabilities, believing instead that agents may
have reasons for otherwise changing their minds (cf. Berkovitz 2012, p. 16).
11 Two proposals of note are the one by Deutsch and by Wallace in the context of Everettian
quantum mechanics, who derive the quantum mechanical Born rule from what they argue are
rationality constraints on decision making in the context of Everett (see e.g. Wallace 2010, or
Brown and Ben Porath (2020)), and the notion of ‘Humean objective chances’ along the lines of
the neo-Humean ‘best systems’ approach (Loewer 2004; Frigg and Hoefer 2010). Note that I am
explicitly distinguishing between Lewisian chances and Humean objective chances. The latter are
an attempt to find something that will satisfy the definition of Lewisian chances, but there is no
consensus as to whether it is a successful attempt. (See Sect. 3.10 below for some further comments
on neo-Humeanism.) The Deutsch-Wallace approach is often claimed to provide the only known
example of Lewisian chances, but again this claim is controversial, as it appears that some of the
constraints that are used cannot be thought of as rationality assumptions but merely as naturalness
assumptions. On this issue see Brown and Ben Porath (2020) as well as the further discussion in
Saunders et al. (2010). The difficulty in finding anything that satisfies the definition of Lewisian
chances is (of course) an argument in favour of subjectivism.
12 I shall assume throughout that degrees of belief attach to propositions about matters of fact (in
the actual world), which incidentally need not be restricted to Lewis’s ‘particular’ matters of fact
(e.g. I shall count as matters of fact also holistic facts about quantum states – if such facts there
be). I shall call these material or empirical propositions.
Of course, some of the background assumptions that we make in choosing our degrees of belief
may not just refer to matters of fact, but will be theoretical propositions (say, referring to possible
worlds). In that case, I would prefer not to talk about degrees of belief attaching to such theoretical
propositions, but about degrees of acceptance.
And the requirement that (3.1) should hold whatever the form of our credence
function at t (subject to the provisos) now translates into the requirement that (3.2)
should hold for any ‘reasonable initial credence function’ (which we shall keep
fixed) and for any admissible propositions Et and T , which for Lewis are, indeed,
ones that are compatible with the chance of A being x and in addition do not provide
information about the occurrence of A over and above what is provided by knowing
the chance of A (if chance is to guide credence, we must not allow information that
trumps the chances). This is Lewis’s ‘Principal Principle’ (PP) relating credence and
chance. It can be taken, is often taken, and I shall take it as providing the implicit
definition of objective probabilities: iff there is a proposition X such that, for all
choices of reasonable initial credence function and admissible propositions,
cr(A|X & Et & T) = x, (3.3)
The matter is further complicated by the fact that even full acceptance of a theory need not
imply belief in all propositions about matters of fact following from the theory. For instance, an
empiricist may believe only a subset of the propositions about actual matters of fact following
from a theory they accept (say, only propositions about observable events), thus drawing the
empirical-theoretical distinction differently and making a distinction between material propositions
and empirical propositions, which for the purposes of this paper I shall ignore. (For my own very
real empiricist leanings, see Bacciagaluppi 2019.)
For simplicity I shall also ignore the distinction between material propositions and actual
propositions that one makes in a many-worlds theory, where the material world consists of several
‘possible’ branches, even though I shall occasionally refer to Everett for illustration of some point.
One might reflect the distinction between material (or empirical) and theoretical propositions
by not conditionalising on the latter, but letting them parameterise our credence functions. Another
way out of these complications may be to assign subjective probabilities to all propositions alike,
but interpret the probabilities differently depending on what propositions they attach to: in general
they will be degrees of acceptance, but in some cases (all material propositions, or a more restricted
class of empirical propositions), one may think of them as straightforward degrees of belief. Such
an additional layer of interpretation of subjective probabilities will, however, not affect the role
they play in guiding our actions (degrees of acceptance will generally be degrees of ‘as-if-belief’).
In any case, the stroke notation makes clear that assuming a theory – even hypothetically – is meant
to affect our credences as if we were conditionalising on a proposition stating the theory (I believe
this is needed to derive some of Lewis’s results).
Note that for Lewis, T will typically include theoretical propositions about chances themselves
in the form of history-to-chance conditionals, and because in Lewis’s doctrine of Humean
supervenience chances in fact supervene on matters of fact in the actual world, such background
assumptions may well be propositions that degrees of belief can attach to (at least depending on
how one interprets the conditionals), even if one wishes to restrict the latter (as I have just sketched)
to propositions about matters of fact in the actual world.
We shall return to the material-theoretical distinction in Sect. 3.7.
then there is such a thing as the objective chance at t of A and its value is x. The
weakest such proposition X is then cht (A) = x.13
More precisely, Lewis restricts admissible propositions about matters of fact to
historical propositions up to time t (at least ‘as a rule’: no clairvoyance, time travel
and such, and we shall follow suit for now – although he appears to leave it open that
one may rethink the analysis if one should wish to consider such less-than-standard
situations (Lewis 1980, p. 274)). Admissible propositions about matters of fact can
thus be taken to be what can be reasonably thought to be knowable in principle at
t, and Lewis allows them to include the complete history Ht of the world up to that
time. He also allows hypothetical propositions about the theory of chance, i. e. about
how chances may depend on historical facts.
Even more propositions might be admissible at time t, but this partial char-
acterisation of admissibility already makes the PP a useful and precise enough
characterisation of objective chances. Indeed, taking the conjunction Ht & TC of
the complete history of the world up to t and a complete and correct theory TC of
how chances depend on history, the following will be an instance of the PP:
cr(A|(cht(A) = x) & Ht & TC) = x. (3.4)
Since TC is a complete and correct theory of chance, Ht & TC already implies that
cht (A) = x, and we can rewrite (3.4) as
13 Note that X is a proposition about (actual) matters of fact. Thus, chances in the sense of Lewis
indeed satisfy the intuition that objective probabilities are properties of objects or more generally
determined by actual facts. Thanks to Jossi Berkovitz for discussion.
yet – what is the probability of Heads? In this situation, there is a matter of fact
about whether the coin has landed Heads or Tails, and if I knew it, I would assign to
Heads probability 0 or 1 (depending on what that matter of fact was).
Another common example is modelling coin flipping as a deterministic Newto-
nian process: even before the coin has landed, the details of the flip (initial height,
velocity in the vertical direction, and angular momentum14 ) determine what the
outcome will be, but we generally do not know these details. In this case, we say
that the probabilities of Heads and Tails are epistemic, even if the event of the coin
landing Heads or Tails is in the future, because we can understand such events as
labels for sets of unknown initial conditions.
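As an illustration of how ‘Heads’ can label a set of initial conditions, here is a minimal sketch (a toy stand-in for the Newtonian analysis of Diaconis et al., not their actual model; the dynamics and the ranges of initial conditions are invented for the example):

```python
import math
import random

def flip_outcome(velocity, spin):
    """Toy deterministic flip: the outcome is entirely fixed by the
    initial conditions (vertical velocity in m/s, spin in rev/s).
    A stand-in for the Newtonian dynamics, not the real equations."""
    g = 9.81
    flight_time = 2 * velocity / g             # time until the coin lands
    half_turns = math.floor(2 * spin * flight_time)
    return "Heads" if half_turns % 2 == 0 else "Tails"

# The epistemic probability of Heads reflects our ignorance of the exact
# initial conditions, represented here by a credence distribution over them.
rng = random.Random(0)
n = 100_000
heads = sum(
    flip_outcome(rng.uniform(2.0, 3.0), rng.uniform(30.0, 40.0)) == "Heads"
    for _ in range(n)
)
print(heads / n)   # typically close to 0.5
```

Each individual outcome is determined; only our credence over the initial conditions makes the probability non-trivial.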
If we are looking for examples of ontic probabilities, we thus presumably
need to look at cases of indeterminism. The most plausible example is that of
quantum probabilities, say, the probability for a given atom to decay within the next
minute. (Of course, whether this is a genuine case of indeterminism will depend on
one’s views about quantum mechanics, in particular about the existence of ‘hidden
variables’. For the sake of the example we shall take quantum mechanics to be
genuinely indeterministic.)
Still, if all actual matters of fact past, present and future are taken to be knowable
in principle, then all probabilities become purely epistemic, and (at least for now) it
appears we cannot draw an interesting distinction.
One way of ensuring that the future is unknowable, of course, is to postulate that
the future is open, i. e. that at the present moment there is no (single) future yet.
This is a very thick metaphysical reading of indeterminism, however, which I wish
to avoid for the purpose of drawing the epistemic-ontic distinction.15 Indeterminism
can be defined in a non-metaphysical way in terms of the allowable models of
a theory (or nomologically possible worlds, if worlds are also understood in a
metaphysically thin way): a theory is (future-)deterministic iff any two models
agreeing up to (or at) time t agree also for all future times (and with suitable
modifications for relativistic theories).
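The model-based definition can be illustrated with a toy dynamical system (a sketch; the logistic map stands in for a deterministic theory, and independently seeded random draws for an indeterministic one — both are my choices for illustration):

```python
import random

def deterministic_step(x):
    # Logistic map: the next state is a function of the present state alone.
    return 3.7 * x * (1 - x)

def trajectory(x0, steps, step):
    xs = [x0]
    for _ in range(steps):
        xs.append(step(xs[-1]))
    return xs

# Determinism: any two models agreeing at time t agree at all later times.
assert trajectory(0.3, 20, deterministic_step) == trajectory(0.3, 20, deterministic_step)

# An indeterministic dynamics, by contrast: identical states at t do not
# fix the future (each run draws its own later states).
r1, r2 = random.Random(1), random.Random(2)
run1 = [0.3] + [r1.random() for _ in range(20)]
run2 = [0.3] + [r2.random() for _ in range(20)]
print(run1[0] == run2[0], run1[1:] == run2[1:])   # True False
```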
For the purpose of this first pass at drawing the epistemic-ontic distinction we
can then simply stipulate that the notion of ignorance only pertains to facts in the
actual world that are knowable in principle, and that at time t only facts up to t
are knowable in principle (which is in fact the stipulation we already made in the
context of Lewis’s analysis of chances – we may want again to make a mental note
to adapt the notion of ‘knowable in principle’ if we allow for clairvoyance, time
14 For the classic Newtonian analysis of coin flipping see Diaconis et al. 2007.
15 Or for any other purpose – as a matter of fact, I shall assume throughout a block universe picture.
Specifically, in the case of indeterminism one should think of each possible world as a block
universe. The actual world may be more real than the merely possible ones, or all possible worlds
may be equally real (whether they are or not is inessential to the following, although a number
of remarks below may make it clear that I am no modal realist, nor in fact believe in objective
modality). But within each world, all events will be equally real. (Note that at a fundamental level
the Everett theory is also deterministic and Everettians emphasise that the Everettian multiverse is
indeed a block universe.)
travel and such like). If one so wishes, one can of course talk about ‘ignorance’ of
which possible world is the actual world, or of which way events in our world will
be actualised. But if probabilities in cases of indeterminism are to count as ‘ontic’,
then it simply is the case that in practice we make a more restrictive use of the term
‘ignorance’ when we draw the epistemic-ontic distinction.
The general intuition is now that epistemic probabilities lie in the gap between
probability assignments given what is knowable in principle (call it the ontic
context), and probability assignments given what is actually known (the epistemic
context). By the same token, ontic probabilities are any probabilities that remain
non-trivial if we assume that we know everything that is knowable in principle.
In general, probabilities will be a mix of epistemic and ontic probabilities. For
instance, we ask for the probability that any atom in a given radioactive probe
may decay in the next minute, but the composition of the probe is unknown:
the probability of decay will be given by the ontic probabilities corresponding
to each possible radioactive component, weighted by the epistemic probabilities
corresponding to each component.
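A numerical sketch of such a mixed probability (the isotopes, half-lives and credences are invented for illustration; exponential decay supplies the ontic probabilities):

```python
import math

def decay_prob(half_life_minutes, t=1.0):
    """Ontic probability that a single atom decays within t minutes,
    from exponential decay: p = 1 - exp(-ln 2 * t / half-life)."""
    return 1 - math.exp(-math.log(2) * t / half_life_minutes)

# Hypothetical composition: epistemic credences over which isotope the
# probe contains (both the isotopes and the numbers are made up).
epistemic = {"isotope A": 0.7, "isotope B": 0.3}
half_life = {"isotope A": 10.0, "isotope B": 2.0}   # minutes

# Mixed probability: ontic probabilities weighted by epistemic ones.
p_decay = sum(epistemic[i] * decay_prob(half_life[i]) for i in epistemic)
print(round(p_decay, 4))   # → 0.1347
```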
The EPR setup in quantum mechanics (within a standard collapse picture)
provides another good example: before Bob performs a spin measurement, assuming
the global state is the singlet state, Alice has ontic probability 0.5 of getting spin up
in any direction. After Bob performs his measurement, Alice has ontic probabilities
cos2 (α/2) and sin2 (α/2) of getting up or down, where α is the angle between the
directions measured on the two sides. However, she is ignorant of Bob’s outcome,
and her epistemic probabilities for his having obtained up or down are both 0.5.
Thus, given her epistemic context, her (now mixed) probabilities for her own
outcomes are unchanged.16
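The arithmetic behind this example can be checked directly (a sketch; the function name and the pairing of conditional probabilities with Bob’s outcomes are my reconstruction, chosen so as to respect the singlet state’s anticorrelations):

```python
import math

def alice_prob_up(alpha):
    """Alice's mixed probability of 'up' after Bob's measurement,
    averaging over her ignorance of his outcome: each of his outcomes
    has epistemic probability 0.5, and her ontic (conditional)
    probabilities are sin^2(alpha/2) and cos^2(alpha/2)."""
    p_given_bob_up = math.sin(alpha / 2) ** 2
    p_given_bob_down = math.cos(alpha / 2) ** 2
    return 0.5 * p_given_bob_up + 0.5 * p_given_bob_down

# No signalling: the mixture is 0.5 for every angle between the settings.
for alpha in (0.0, math.pi / 3, math.pi / 2, math.pi):
    assert abs(alice_prob_up(alpha) - 0.5) < 1e-12
print("mixed probability is 0.5 for every angle")
```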
As a final example, assume we have some ontic probability that is also an
objective chance in the sense of Lewis. According to the PP in the form (3.5), this
ontic probability will be given by
cr(A|Ht & TC). (3.6)
Now compare it with cr(A|Et & TC ), our credence in A at t given the same
theoretical assumptions but our actual epistemic context Et . This now is a mixed
(partially epistemic) probability. Indeed, let Hti be the various epistemic possibilities
(compatible with our epistemic context Et ) for the actual history up to t (i. e. for the
actual ontic context). We then have
cr(A|Et & TC) = Σi cr(A|Hti & Et & TC) cr(Hti|Et & TC), (3.7)
16 This is of course the quantum mechanical no-signalling theorem, as seen in the standard collapse
picture. (It is instructive to think through what the theorem means in pilot-wave theory or in Everett.
Thanks to Harvey Brown for enjoyable discussion of this point.) And of course, mixed quantum
states are the classic example of the distinction in quantum mechanics between probabilities that
are ignorance-interpretable (‘proper mixtures’) and probabilities that are not (‘improper mixtures’).
which simplifies to
cr(A|Et & TC) = Σi cr(A|Hti & TC) cr(Hti|Et & TC), (3.8)
because the Hti are fuller specifications of the past history at t than the one
provided by Et . In (3.8), now, the credences cr(Hti |Et & TC ) are purely epistemic
probabilities, and the cr(A|Hti & TC ) are purely ontic probabilities corresponding to
each of the epistemically possible ontic contexts. The various Hti could e. g. fix the
composition of the radioactive probe in our informal example above.17
Judging by the examples so far, the subjective-objective distinction and the
epistemic-ontic distinction could well be co-extensive. Indeed, both subjective and
epistemic probabilities seem to have to do with us, while both objective and ontic
probabilities seem to have to do with the world. Epistemic probabilities depend
on epistemic contexts, as do subjective ones; and chances depend on past history,
which we have also taken as the context in which ontic probabilities are defined. But
appearances may deceive.
In order to argue that the subjective-objective distinction and the epistemic-ontic
distinction are to be drawn truly independently, I need to convince you that there
are bona fide examples of probabilities that are both subjective (in fact purely
subjective) and ontic, and of probabilities that are both epistemic and objective.
These will be the respective objectives of the next two sections.
Let us start with the case of subjective ontic probabilities. I have discussed this case
before, in a paper (Bacciagaluppi 2014) in which I comment on Chris Fuchs and co-
workers’ radical subjectivist approach to quantum mechanics, known as qBism (see
e.g. Fuchs 2002, 2014),18 and in which I spell out how I believe radical subjectivism
à la de Finetti (1970) can be applied to quantum mechanics not only along qBist
lines but also within any of the traditional foundational approaches (Bohm, GRW
and Everett). The idea that subjectivism is applicable to quantum mechanics had
17 In this example, it is of course possible to think of mixed epistemic and ontic probabilities as
arising not only through ignorance of matters of fact in the ontic context Ht , but also through
ignorance of the correct theory of chance TC . We might indeed know the exact composition of
the radioactive probe, but take ourselves to be ignorant of the correct decay laws for the various
isotopes, with different credences about which laws are correct. I shall neglect this possibility in
the following, because as noted already in Footnote 12, I prefer not to think of ignorance when
talking about theoretical assumptions. In any case, I shall not need this further possible case for the
arguments below.
18 I am deeply grateful to Chris Fuchs for many lovely discussions about qBism and subjectivism
already been worked out by Jossi Berkovitz in the years 1989–1991 (prompted
by Itamar), with specific reference to de Finetti’s idea that probabilities are only
definable for verifiable events.
For the purposes of the present paper, I need to make the case that purely
subjective probabilities can be ontic, thus radical subjectivism about quantum
probabilities (in a context in which quantum mechanics is taken to be irreducibly
indeterministic) provides the ideal example, and in the following I shall rely heavily
on my own previous discussion. (Alternatively, I could arguably have relied on
Berkovitz 2012.) Note that what I need to convince you of is not that such a position
is correct, but only that it is coherent, i. e. that there is an interesting sense in which
one can combine the notion of purely subjective probabilities with that of ontic
probabilities.
Itamar himself (Pitowsky 2003, 2007), Bub (2007, 2010) and Fuchs and co-
workers have all defended approaches to quantum mechanics in which probabilities
are subjective in the sense of being degrees of belief, and some modifications or
additions to Bayesian rationality are introduced in order to deal with situations,
respectively, in which one deals with ‘quantum bets’ (incompatible bets possibly
with some outcomes in common), in which one assumes no-cloning as a funda-
mental principle, or in which gathering evidence means irreducibly intervening into
the world. I take it that all of these authors (as well as de Finetti himself) agree on
the assumption that in general there are no present or past facts that will determine
the outcomes of quantum bets or any quantum measurements, and thus that the
probabilities in their approaches are ontic in the sense used in the previous section.
It is less clear that the probabilities involved are purely subjective. Indeed, both
Pitowsky and Bub see subjective probability assignments as severely constrained
by the structure of quantum bets or quantum observables, in particular through
Gleason’s theorem and (relatedly) through the mathematical structure of collapse.
These might be seen as rationality constraints, so that the probabilities in these
approaches might in fact be seen as at least partly objective.19 As Itamar puts it
(Pitowsky 2003, p. 408):
19 Whether or not they are will make little difference to my argument (although if they should be
purely subjective I would have further concrete examples of subjective ontic probabilities in the
sense I need). In Bub’s case, he explicitly states that the no-cloning principle is an assumption
and could turn out to be false. That means that the constraints hold only given some specific
theoretical assumptions, and are not rationally compelling in the strong sense used by Lewis. In
Pitowsky’s case, the constraints seem stronger, in the sense that the ‘quantum bets’ are not per se
theoretical assumptions about the world, but are just betting situations whose logical structure is
non-classical. Whether or not there are rational constraints on our probability assignments will be
a well-posed question only after you and the bookie have agreed on the logical structure of the bet.
On the other hand, whenever we want to evaluate credences in a particular situation, we need to
make a theoretical judgement as to what the logic of that situation is. That means, I believe, that
whether we can have rationally compelling reasons to set our credences in any particular situation
will still depend on theoretical assumptions, and fall short of Lewis’s very stringent criterion.
See Bacciagaluppi (2016) for further discussion of identifying events in the context of quantum
probability, and Sect. 1.2 of Pitowsky (2003) for Itamar’s own view of this issue. Cf. also Brown
and Ben Porath (2020).
For a given single physical system, Gleason’s theorem dictates that all agents share a
common prior probability distribution or, in the worst case, they start using the same
probability distribution after a single (maximal) measurement.
Indeed, since pure quantum states always assign probability 1 to results of some
particular measurement, the constraints may be so strong that even some dogmatic
credences are forced upon us.
Fuchs instead clearly wishes to think of his probabilities as radically subjective
(which I want to exploit). Precisely because of the strong constraints imposed by
collapse, he feels compelled to take not only probability assignments as subjective,
but also quantum states themselves and even Hamiltonians (Fuchs 2002, Sect. 7).
To a certain extent, however, I think this is throwing out the baby without getting
rid of the bath water. Indeed, I believe the (empirically motivated but normative)
additions to Bayesian rationality to which Fuchs is committed imply already by
themselves very strong constraints on qBist probability assignments (Bacciagaluppi
2014, Sect. 2.2). On the other hand, as I shall presently explain, I also believe that
radical subjectivism can be maintained even within a context in which quantum
states and Hamiltonians are taken as (theoretical) elements in our ontology. While
in some ways less radical than Fuchs’s own qBism, this is the position that I propose
in my paper, and which I take to prove by example that a coherent case can be made
for (purely) subjective ontic probabilities. It can be summarised as follows.
Subjectivism about probabilities (de Finetti 1970) holds that there are no
objective probabilities, only subjective ones. Subjective probability assignments are
strategies we use for navigating the world. Some of these strategies may turn out to
be more useful than others, but that makes them no less subjective. As an example,
take again (classical) coin flipping: we use subjective probabilities to guide our
betting behaviour, and these probabilities get updated as we go along (i.e. based on
past performance). But it is only a contingent matter that we should be successful at
all in predicting frequencies of outcomes. Indeed, any given sequence of outcomes
(with any given frequency of Heads and Tails) will result from appropriate initial
conditions.
Under certain assumptions the judgements of different agents will converge
with subjective probability 1 when conditionalised on sufficient common evidence,
thus establishing an intersubjective connection between degrees of belief and
frequencies. This is de Finetti’s theorem (see e.g. Fuchs 2002, Sect. 9). One
might be tempted to take it as a sign that subjective probabilities are tracking
something ‘out there’.20 This, however, does not follow – because the assumption
of ‘exchangeability’ under which the theorem can be proved is itself a subjective
feature of our probability assignments. The only objective aspect relevant to the
theorem is the common evidence, which is again just past performance.
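The merging of opinions can be illustrated with a standard Beta–Bernoulli model (a sketch, not de Finetti’s own formulation; exchangeability is what licenses modelling the common evidence this way, and all numerical choices are mine):

```python
import random

def posterior_mean(a, b, data):
    """Mean of the Beta(a, b) posterior over the Bernoulli parameter
    after conditionalising on a 0/1 data sequence."""
    heads = sum(data)
    return (a + heads) / (a + b + len(data))

rng = random.Random(0)
evidence = [rng.random() < 0.7 for _ in range(10_000)]   # common evidence

# Two agents with very different subjective priors:
opinion_1 = posterior_mean(1, 1, evidence)     # uniform prior
opinion_2 = posterior_mean(50, 5, evidence)    # strongly opinionated prior
print(abs(opinion_1 - opinion_2) < 0.01)       # True: the opinions have merged
```

The convergence happens with subjective probability 1, and nothing in it requires the agents’ probabilities to track anything beyond the shared evidence.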
For the subjectivist, there is no sense in which our probability judgements are
right or wrong. Or as de Finetti (1970, Preface, quoted in Fuchs 2014) famously
20 Such a reading is perhaps present e.g. in Greaves and Myrvold (2010), or even in Pitowsky
(2003).
64 G. Bacciagaluppi
the world and use it (like weight distribution) as the basis for choosing our subjective
probabilities.21 There is no need for a qBist like Fuchs to reject the putative onticity
of quantum states,22 precisely because that by itself does not make our probability
assignments objective (forgive me, Itamar!).
As further described in Bacciagaluppi (2014), this general perspective can be
applied to all the main approaches to quantum mechanics that take the quantum
state as ontic, so that one can be a de Finettian subjectivist about probabilities in
the context of Bohm, GRW or Everett. In the case of Bohm, this is unremarkable of
course, because the Bohm theory is deterministic and probabilities are epistemic
in the sense used above. In the case of GRW, it is an unfamiliar position. The
default intuition in GRW is presumably to take probabilities as some kind of
propensities, but subjectivism may prove more attractive.23 Finally, the Deutsch-
Wallace decision-theoretic framework for the Everett theory already analyses
probabilities as subjective. Deutsch and Wallace themselves (see e.g. Wallace 2010)
argue that the weights in the universal wavefunction satisfy the Principal
Principle (and are in fact the only concrete example of something that qualifies
as objective chances in Lewis’s sense!). Many authors, however, dispute that the
Deutsch-Wallace theorem rationally compels one to adopt the Born rule. I concur,
but see that as no reason to reject the decision-theoretic framework itself. One can
simply use the Deutsch-Wallace theorem as an ingredient in our subjective strategy
for assigning probabilities on the basis of quantum states. I take it that this is in fact
the subjectivist (or ‘pragmatist’) position on Everett proposed by Price (2010).
Thus, in all these cases (with the exception of Bohm of course), we have a view
in which the probabilities of quantum mechanics are coherently understood as both
purely subjective and ontic, which is what we set out to establish in this section.
21 The quantum state is thus taken to be a matter of fact, but presumably not a ‘particular
matter of fact’ in the sense of Lewis, because of the holistic aspects of the quantum state. (Cf.
Footnote 12 above.) Note also that historically the notion of quantum state preceded its ‘statistical
interpretation’, famously introduced by Max Born (for discussion see Bacciagaluppi 2008).
22 For other, even more forceful arguments to this conclusion, see Brown (2019) and Myrvold
(2020).
23 Frigg and Hoefer (2007) have argued that the propensity concept is inadequate to the setting of
GRW. Their own preference is to use ‘Humean objective chances’ instead (cf. Frigg and Hoefer
2010). Note that if the alternative subjectivist strategy includes taking the quantum state as a non-
probabilistic physical object, there are severe constraints on doing this for the case of relativistic
spontaneous collapse theories (see also below, Sects. 3.6 and 3.9).
24 Indeed, we defined these probabilities as ontic, so unless we accept subjective ontic probabilities,
• You are still going to bet at equal odds, and in fact this is just as rationally
compelling as betting at those odds before the measurement.
The first claim is uncontroversial. The second one – your epistemic probabilities
after the measurement are rationally compelling, granted the probabilities before
the measurement are – I take to be incontrovertible. From a Bayesian perspective,
for instance, you have no new evidence. So it would be irrational to change your
subjective probabilities after I have performed the experiment. But what is more,
I shall now argue that your non-trivial probabilities are objective for the same
reason as in Lewis’s discussion. That is, the rational justification for your probability
assignment is still completely in the spirit of the PP, even though it does not respect
its letter.
Indeed, recall what the PP (3.3) says, as an implicit definition of objective
probabilities: iff there is a proposition X such that, for all choices of reasonable
initial credence function and admissible propositions,
cr(A | X & Et & T) = x,    (3.10)
then A has an objective probability and its value is x. Crucially, Lewis sees chances
as functions of time. Thus he requires that our credence function at t should not
incorporate information that is inadmissible at the time t. This corresponds to a
betting situation in which you are allowed to know any matters of fact up to time t,
and the chances fix the objectively fair odds for betting in this situation.
But in the case we are considering now, the betting situation is different. Indeed,
you are not allowed to know what has happened after I have performed the
measurement (even if the measurement was at a time t′ < t). If in this betting
situation you knew the outcome of the measurement, that now would be cheating.
In this situation, the proposition that the chance at t is equal to x (in particular 0
or 1) is itself inadmissible information, and it is so for the same reason that Lewis
holds a proposition about the future chance of an event to be inadmissible.
Indeed, in his version of the PP, Lewis is considering a betting context restricted
to what we can in principle know at t (i.e. the past history Ht ), and he asks for the
best we can do in setting our rational credence in a future event A. The answer is
that the best we can do is to set our credence to the chance of A at that time t.
Information about how the chances evolve after t is inadmissible because it illicitly
imports knowledge exceeding what we are allowed to know at t. If we consider
restricting our betting context further so that what we are allowed to know is what
we could in principle have known at an earlier time t′, and now ask for the best
we can do in setting our rational credence in an event at t, then information about
the chance at t of that event has become inadmissible, again precisely because it
exceeds the assumed restriction on what we are allowed to know. The best we can
do is to set our credence to the chance of A at the earlier time t′.
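The structure of this argument can be made concrete with a toy calculation (all numbers hypothetical, chosen only for illustration): the chance of a future event A at the earlier time t′ is the coarse-graining of the possible chances at t over the intervening histories, weighted by their chances at t′ (cf. also Footnote 31).

```python
# Toy model (hypothetical numbers): between t' and t the world follows
# one of three intervening histories, and each history fixes a chance
# at t for the future event A.
p_history = {"h1": 0.5, "h2": 0.3, "h3": 0.2}  # chances at t' of each history
ch_t = {"h1": 0.9, "h2": 0.4, "h3": 0.1}       # chance at t of A, per history

# The chance of A at the earlier time t' is the coarse-graining
# (weighted average) of the chances at t over the intervening histories:
ch_tprime = sum(p_history[h] * ch_t[h] for h in p_history)

print(ch_tprime)  # 0.5*0.9 + 0.3*0.4 + 0.2*0.1 = 0.59 (up to rounding)
```

An agent restricted to what is knowable at t′ can do no better than this weighted average; information about which chance obtains at t is precisely what the restricted betting context excludes.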
The point is perhaps even clearer if we take Lewis’s reformulated version (3.5)
of the PP:
We have argued above that one can make good sense both of subjective (indeed,
purely subjective) ontic probabilities and of epistemic objective probabilities, thus
arguing in favour of two unusual conceptual possibilities. But in other respects
we had to be quite conservative. Indeed, the arguments above would have been
weakened had we not tried to keep as much as possible to the accepted use of
the terms ‘epistemic’, ‘ontic’, ‘subjective’ and ‘objective’. Once we have seen
the usefulness of admitting subjective ontic probabilities and epistemic objective
probabilities, however, we might be interested in trying to redraw the subjective-
objective distinction and the epistemic-ontic distinction in ways that are both more
systematic and cleaner in the light of our new insights.
Let us start with the subjective-objective distinction. What we have just argued
for in Sect. 3.5 is that there are more cases of rationally compelling credences than
Lewis in fact allows for.25
Lewis’s intuition is that chances are functions of time, and in the formulation
of the PP this effectively forces him to privilege the maximal epistemic context Ht
comprising all that is knowable in principle at t, or equivalently to quantify over all
epistemic contexts Et comprising what may be actually known at t. Thus, according
to Lewis, we need to ask whether propositions exist that would fix our credences
across all epistemic contexts admissible at time t.
But we have just argued that we can also ask whether propositions exist that
would fix our credences when we are in a particular one of these epistemic contexts.
Specifically, we considered the situation in which at t we only know all matters of
25 Many thanks to Ronnie Hermens for noting that this point was not spelled out clearly enough in
a previous draft.
3 Unscrambling Subjective and Epistemic Probabilities 69
fact up to t′ < t. In that case, if we assume Lewisian chances exist, we know in fact
that there are such propositions, namely the chances at t′.
Now we can ask that same question in general: do propositions exist that would
fix our credences when we are in an arbitrary epistemic context E (e.g. any ever so
partial specification of the history up to a time t)? If so, we shall talk of the chances
not at the time t, but the chances given the context E. Lewisian chances at t are then
just the special case when E has the form Ht .
Writing it down in formulas, we have a natural generalisation of the PP and of
objective probabilities: given a context E, iff there is a proposition X such that,
for all choices of reasonable initial credence function and any further theoretical
assumptions T (compatible with X),
cr(A | X & E & T) = x,    (3.12)

then A has an objective probability given the context E and its value is x. The
weakest such proposition will be denoted chE(A) = x.
Now, similarly as with Lewis’s own reformulation (3.5) of the PP, note that a
complete theory of chance will now include also history-to-chance conditionals for
any antecedent E for which chances given E exist. Therefore, not only will the
following be an instance of the generalised PP (3.12),
cr(A | (chE(A) = x) & E & TC) = x,    (3.13)
but so will, in analogy with (3.5), the reformulated version

cr(A | E & TC) = chE(A).    (3.14)

Thus, also our generalised chances will have the kind of properties that Lewis
derives for his own chances from his reformulation (3.5) of the original PP (in
particular, they will obey the probability calculus).
Further, (3.14) makes vivid the intuition that if we have a complete and correct
theory of (generalised) chance, the best we can rationally do if we are in the
epistemic context E, is just to plug the information we actually have into the
complete and correct theory of chance, and let that guide our expectations.
Finally, and very significantly, we see from (3.14) that chances in the more
general sense are not functions of propositions and times, but functions of pairs
of propositions, and they have the form of conditional probabilities. That is,
in this picture chances are fundamentally conditional. I believe this has many
advantages.26
26 Note that also epistemic and ontic probabilities as discussed here are naturally seen as conditional
(on epistemic and ontic contexts, respectively). That all probabilities should be fundamentally
conditional has been championed by a number of authors (see e.g. Hájek 2011 and references
therein). The idea in fact fits nicely with Lewis’s own emphasis that already the notion of possibility
From the point of view of the philosophy of physics, such a liberalisation of the
PP means that all kinds of other probabilities used in physics can now be included
in what could potentially be objective chances.
For instance, it allows one to consider the possibility of high-level probabilities
being objective. By high-level probabilities I mean probabilities featuring in theories
such as Boltzmannian statistical mechanics, which is a theory about macrostates,
identified as coarse-grained descriptions of physical systems. Our generalised PP
allows one among other things to apply a Lewisian-style analysis to probabilities
conditional on such a macrostate of a physical system, thus introducing the
possibility of objective chances even in case we are ignorant of the microstate of
a system such as a gas (pace Popper).27
Or take again the EPR setup: Bob performs a measurement, usually taken
to collapse the state also of Alice’s photon, so that the probabilities for her
own measurement results are now cos²(α/2) and sin²(α/2). Let us grant that
these probabilities are objective. In Sect. 3.3 we considered Alice’s probability
assignments in the epistemic context in which she does not know yet the result of the
measurement on Bob’s side, and treated them as partially epistemic. Now, however,
we can argue that in Alice’s epistemic context what is rationally compelling for her
is to keep her probabilities unchanged, because we compare them with the quantum
probabilities conditional on events only on her side.
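A quick numerical check of this point, using only the probabilities quoted above (and assuming Bob's two outcomes are equiprobable, as for a maximally entangled pair): conditional on Bob's outcome, Alice's probabilities are cos²(α/2) and sin²(α/2), but in her own epistemic context, averaging over his outcomes, they reduce to the same 1/2 she had before his measurement.

```python
import math

alpha = 1.0  # relative angle between the two measurement directions (radians)

# Alice's probabilities conditional on Bob's outcome, as quoted in the text:
p_given_plus = math.cos(alpha / 2) ** 2
p_given_minus = math.sin(alpha / 2) ** 2

# In Alice's own epistemic context Bob's outcome is unknown, so she
# averages over his two equiprobable outcomes:
p_marginal = 0.5 * p_given_plus + 0.5 * p_given_minus

print(p_marginal)  # ≈ 0.5 for any alpha: conditional on events only on
                   # her side, nothing has changed for Alice
```

That the marginal is independent of α is just the no-signalling feature of the quantum probabilities, which is what makes it rationally compelling for Alice to keep her probabilities unchanged.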
Indeed, the situation is very much analogous to the simple example in which
I perform a measurement but do not tell you the result. If anything, in the EPR
example the case is even more compelling, because there is no way even in principle
that Alice could know Bob’s result if the two measurements are performed at
spacelike separation. Thus, if we take probabilities to be encoded or determined by
the quantum state, we have now made space for the idea that collapse does not act
instantaneously at a distance, but along the future light cone (of Bob’s measurement
in this case).
Quite generally, it is clear that Lewis’s treatment of chances taken literally is
incompatible with relativity, for Lewis’s chances as functions of time presuppose a
notion of absolute simultaneity. In fact, the idea that chance could be relativised to
space-like hypersurfaces or more generally to circumstances which are independent
of absolute time, e.g. properties of space-time regions, was proposed by Berkovitz
(1998), who discusses this proposal with respect to causality and non-locality in the
quantum realm.
(1986), p. 118). Note that high-level deterministic chances relate to all kinds of interesting
questions (e.g. emergence, the trade-off between predictive accuracy and explanatory power, the
demarcation between laws and initial conditions, etc.). For recent literature on the debate about
deterministic chances, see e.g. Glynn (2009), Lyon (2011), Emery (2013), and references therein.
28 I actually prefer the 2000 archived version to the published paper (Myrvold 2002). More recently,
Myrvold (2017a) has shown that any such theory of foliation-dependent collapse must lead to
infinite energy production. I believe, however, that Myrvold’s idea that probabilities in special
relativity (and in quantum field theory) can be defined conditional on events to the past of an
arbitrary hypersurface can still be implemented (see Sect. 3.9 below).
29 I am indebted to both of these proposals and their authors in more ways than I can say. See also
my Bacciagaluppi (2010a).
30 Retrocausal theories (in particular in the context of quantum mechanics) have been vigorously
championed by Huw Price. All-at-once theories are ones in which probabilities for events are
specified given general boundary conditions, rather than just initial (or final) conditions, and have
been championed in particular by Ken Wharton. See e.g. Price (1997), Wharton (2014) and the
other references in Adlam (2018a). For further discussion see also Rijken (2018) (whom I thank
for discussion of Adlam’s work).
have information about some previous times t1, …, tn, the original PP has nothing
to recommend to us except to try and figure out the missing information, so as to be
able to determine the chances ch(A|Ht) at t. Our physical theory, instead, also
contains probabilities of the form p(A|Et1 & … & Etn), where Eti describes
the state of our physical system at time ti. In order to be able to say that
these probabilities featuring in our non-Markovian theory are also objective, we have
to generalise chances to allow for contexts of the ‘gappy’ form E = Et1 & … & Etn
in (3.12)–(3.14).
Note that p(A|Et1 & … & Etn) is just a coarse-graining of probabilities of the
form p(A|Ht^j), where Ht^j is a possible history of the world up to t compatible
with Et1 & … & Etn, i.e. a coarse-graining of probabilities which in turn do allow a
reading as Lewisian chances. In this sense, our generalisation (3.12) appears to be
required by sheer consistency, if we want objective probabilities to be closed under
coarse-grainings.31
Finally, even leaving aside motivations from philosophy of physics, what may
perhaps be decisive from the point of view of the philosophy of probability is that
such a generalisation of the PP completely eliminates the need to restrict epistemic
contexts to ‘admissible propositions’. The reason for worrying about admissibility
was that chances were tacitly or explicitly thought to always refer to the context
Ht , and the actual epistemic context had to be restricted to admissible propositions
in order not to bring in any information not contained in Ht . By requiring that
the (ontic) context of the chances exactly match the (epistemic) context of the
credences, no such possibility of mismatch can arise, and no additional conditions
of admissibility are needed.
I suspect Lewis himself might have welcomed such a simplification in the
formulation of the PP, especially given his own qualified formulations when treating
of admissible propositions in connection with the possibility of non-standard
epistemic contexts in the presence of time travel, clairvoyance, and other unusual
cases (as will in fact be the retrocausal and all-at-once theories considered by
Adlam).
31 Of course also in the case of E = Ht′, the chance given E is the coarse-graining over all
possible intervening histories between t′ and t. But this particular coarse-graining does yield again
a Lewisian chance, namely the chance at t′.
32 And, indeed, already in Sect. 3.3 we suggested that ontic probabilities tend to be associated with
indeterministic contexts, which we defined in terms of multiple possible futures in our theoretical
models compatible with the present (or present and past) state of the world.
33 As mentioned above, one could of course distinguish further between the actual world and the
material world (if one is a many-world theorist) or take some material propositions to be theoretical
(if one is an empiricist), but for simplicity I neglect these possibilities.
34 Material probabilities by definition support only indicative conditionals: ‘If Beethoven was not
born on 16th December, he was born on the 15th’ expresses our next best guess. Instead, when
talking about atomic decay we more commonly use subjunctive or counterfactual conditionals,
such as ‘Should this atom not decay in the next hour, it would have probability 0.5 of decaying
during the following hour’.
35 My thanks to Niels van Miltenburg and to Albert Visser for pressing me on this point.
36 We shall refine this point in our discussion of time-symmetric theoretical models in Sect. 3.8.
37 It may, however, contribute to muddying the waters in the Popper and Jaynes examples, or indeed
in modern discussions about ‘typicality’ in statistical mechanics: are we talking about the actual
world (of which there is only one), or about possible worlds (of which there are many, at least in
our models)? It is also not always clear which aspects of a model are indeed law-like and which
are contingent. We shall see a class of examples in Sect. 3.8, but a much-discussed one would be
whether or not the ‘Past Hypothesis’ can be thought of as a law – which it is in the neo-Humean
‘best systems’ approach (cf. Loewer 2007); thanks to Sean Gryb for raising this issue. For further
discussion of typicality and/or the past hypothesis, see e.g. Goldstein (2001), Frigg and Werndl
(2012), Wallace (2011), Pitowsky (2012), and Lazarovici and Reichert (2015).
Many more things could be said about applying the newly-drawn distinctions in
practice. I shall limit myself in this section to a few remarks on the issue of time
symmetry, and in the next one to some further remarks on the quantum state.
I mentioned in Sect. 3.3 that for the purpose of distinguishing epistemic and ontic
probabilities I did not want to commit myself to a metaphysically thick notion of
indeterminism, with a ‘fixed’ past and an ‘open’ future. Now we can see that such a
notion is in fact incompatible with the way I have suggested drawing the epistemic-
ontic distinction.
Indeed, as drawn in Sect. 3.7, the epistemic-ontic distinction allows for ontic
probabilities also in the case where we theoretically assume backwards indeterminism
(where models that coincide at a time t or for times s ≥ t need not coincide at times
earlier than t). We can take the entire history to the future of t as an ontic context
(call it H̃t ), and thus probabilities p(A|H̃t ) about events earlier than t as ontic. This
is plainly incompatible with tying ontic probabilities (whether in the material or the
theoretical sense) to the idea of a fixed past and open future.
There are other reasons for rejecting the idea of fixed past and open future.38
But my suggestion above seems to fly in the face of a number of arguments for a
different metaphysical status of past and future themselves based on probabilistic
considerations. It is these arguments I wish to address here, specifically by briefly
recalling and reframing what I have already written in my Bacciagaluppi (2010b)
against claims that past-to-future conditional probabilities will have a different
status (namely ontic) from that of future-to-past conditional probabilities (allegedly
epistemic). I refer to that paper for details and further references.
Of course we are all familiar with the notion that probabilistic systems can
display strongly time-asymmetric behaviour. To fix ideas, take atomic decay,
modelled simplistically as a two-level classical Markov process (we shall discuss
quantum mechanics proper in the next section), with an excited state and a de-
excited state, a (comparatively large) probability α for decay from the excited state
to the de-excited state in unit time, and a (comparatively small) probability ε for
spontaneous re-excitation in unit time. We shall take the process to be homogeneous,
i.e. the transition probabilities to be time-translation invariant.
Any initial probability distribution over the two states will converge in time to
an equilibrium distribution in which most of the atoms are de-excited (a fraction
α/(α + ε)), and only a few are excited (a fraction ε/(α + ε)), and the transition rate from the
38 In particular, I agree with Saunders (2002) that standard moves to save ‘relativistic becoming’
in fact undermine such metaphysically thick notions. Rather, I believe that the correct way of
understanding the ‘openness’ of the future, the ‘flow’ of time and similar intuitions is via the logic
of temporal perspectives as developed by Ismael (2017), which straightforwardly generalises to
relativistic contexts, since it exactly captures the perspective of an IGUS (‘Information Gathering
and Utilising System’) along a time-like worldline. For introducing me to the pleasures of time
symmetry I would like to thank in particular Huw Price.
excited to the de-excited state will asymptotically match the transition rate from the
de-excited to the excited state (so-called ‘detailed balance’ condition).
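This behaviour is easy to verify numerically. The following is a minimal sketch of the two-level model just described, with hypothetical values for α and ε: iterating the forward dynamics from any initial distribution converges to the equilibrium fractions ε/(α + ε) excited and α/(α + ε) de-excited, at which point detailed balance holds.

```python
# Two-level atom as a homogeneous two-state Markov chain (illustrative
# numbers): per unit time, decay probability alpha (excited -> de-excited)
# and re-excitation probability eps (de-excited -> excited).
alpha, eps = 0.3, 0.01

# Start far from equilibrium: all atoms excited.
p_exc, p_de = 1.0, 0.0
for _ in range(1000):
    p_exc, p_de = (p_exc * (1 - alpha) + p_de * eps,
                   p_exc * alpha + p_de * (1 - eps))

# Convergence to the equilibrium fractions eps/(alpha+eps) excited
# and alpha/(alpha+eps) de-excited:
print(p_exc, eps / (alpha + eps))   # both ~0.0323
print(p_de, alpha / (alpha + eps))  # both ~0.9677

# Detailed balance: in equilibrium the two transition rates match.
print(abs(p_exc * alpha - p_de * eps) < 1e-12)
```

The same fixed point is reached from any initial distribution, which is exactly the time-directed behaviour under discussion.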
This convergence to equilibrium is familiar time-directed behaviour. Of course,
given the joint distributions for two times s > t, transition probabilities can be
defined both from t to s and from s to t (by conditionalising on the state at t and
at s, respectively), but the example can be used in a number of ways to argue that
these transition probabilities forwards and backwards in time must have a different
status. In the language of this paper:
(A) Were backwards probabilities also ontic and equal to the forwards probabilities,
then the system would converge to equilibrium also towards the past, and the
system would have to be stationary (Sober 1993).
(B) While by assumption the forwards transition probabilities are time-translation
invariant, the backwards transition probabilities in general are not, suggesting
that they have a different status (Arntzenius 1995).
(C) If in addition to the forwards transition probabilities also the backwards
transition probabilities were law-like, the initial distribution would also inherit
this law-like status, while it is clearly contingent – indeed can be freely chosen
(Watanabe 1965).
The problem with all these arguments is that they take the observed backwards
frequencies as estimates for the putative ontic backwards probabilities. But fre-
quencies can be used as estimates for probabilities only when we have reasons to
think our samples are unbiased.39 For instance, we would never use post-selected
ensembles to estimate forwards transition probabilities. However, we routinely use
pre-selected ensembles, especially when we set up experiments ourselves and in fact
freely choose the initial conditions. Thus, while we normally use such experiments
to estimate forwards transition probabilities, we plainly cannot rely on them to
estimate the backwards ones.
It turns out that this time-asymmetric behaviour (and a large class of more general
examples) can indeed be modelled using stationary stochastic processes in which
the transition probabilities are time-symmetric, and in which the time asymmetry is
introduced by the choice of special initial (rather than final) frequencies. Thus, in
modelling such time-asymmetric behaviour there is no need to use time-asymmetric
ontic probabilities or to deny ontic status altogether to the backwards transition
probabilities of the process. The apparent asymmetry comes in because we generally
have asymmetric epistemic access to a system: we tend to know aspects of its past
history, and thus are able to pre-select, while its future history will generally be
unknowable.40
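The asymmetry diagnosed in (A)–(C) can likewise be made concrete in a sketch (same hypothetical α and ε as above). The backwards transition probabilities are obtained by conditionalising the joint distribution on the later state (Bayes): on a pre-selected, non-equilibrium ensemble they fail to be time-translation invariant, whereas at the stationary distribution detailed balance makes them coincide with the forwards ones.

```python
# Backwards transition probabilities for the two-state chain of the text,
# obtained by conditionalising on the later state. Illustrative numbers only.
alpha, eps = 0.3, 0.01

def step(p_exc):
    """Forward evolution of the probability of being excited."""
    return p_exc * (1 - alpha) + (1 - p_exc) * eps

def backward_exc_to_exc(p_exc):
    """P(excited at t | excited at t+1), given P(excited at t) = p_exc."""
    return p_exc * (1 - alpha) / step(p_exc)

# Pre-selected (non-equilibrium) ensemble: the backwards probabilities
# change from step to step, although the forwards ones do not.
p = 0.9
b1 = backward_exc_to_exc(p)
b2 = backward_exc_to_exc(step(p))
print(b1, b2)  # different: not time-translation invariant

# At the stationary distribution, detailed balance makes the backwards
# transition probabilities equal to the forwards ones.
p_eq = eps / (alpha + eps)
print(backward_exc_to_exc(p_eq), 1 - alpha)  # both ~0.7
```

The observed backwards frequencies on the pre-selected ensemble are thus biased estimates of the stationary backwards probabilities, which is the point made against arguments (A)–(C) above.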
39 That this is not always licensed is clear also from discussion of causal loops (Berkovitz 1998,
2001).
40 Of course, this strategy is analogous to that of modelling time-directed behaviour in the
deterministic case using time-symmetric laws and special initial conditions. The same point was
made independently and at the same time by Uffink (2007, Section 7), specifically in the context
of probabilistic modelling of classical statistical mechanics. A further explicit example of time-
symmetric ontic probabilities is provided by Nelson’s (1966, 1985) stochastic mechanics, where the
quantum mechanical wavefunction and the Schrödinger equation are derived from an underlying
time-symmetric diffusion process. See also Bacciagaluppi (2005) and references therein. The
notorious gap in Nelson’s derivation pointed out by Wallstrom (1989) has now been convincingly
filled through the modification of the theory recently proposed by Derakhshani (2017).
If one assumes that the transition probabilities are time-symmetric (or time-
symmetric except for some overall drift41), this places severe constraints on the
overall (multi-time) distributions that define the process. Not only is the process
stationary if it is homogeneous, but the process may well be uniquely determined by
its forwards and backwards transition probabilities.42 Thus, in particular, the entire
process inherits the ontic status of the transition probabilities.
This will seem puzzling, if we are used to thinking of transition probabilities
as ontic, and of initial distributions as epistemic, so that the probabilities of the
process should have a mixed character (as in the examples we discussed beginning
in Sect. 3.3). But as we saw in Sect. 3.7, we standardly align the distinction between
epistemic and ontic probabilities with the distinction between contingent and law-
like aspects of our theoretical probabilistic models. The theoretical probabilities
of the process may very well be uniquely determined by the forwards and
backwards transition probabilities, but our theoretical models will still provide scope
for contingent elements through the notion of samples of the process. We then
feed our epistemic probabilities into the theoretical model via constraints on the
frequencies in the samples, which will typically be initial constraints because of our
time-asymmetric access to the empirical world.43
We have just seen that, classically, a stochastic process may be nothing over and
above an encoding of its forwards and backwards transition probabilities. I now
wish to briefly hint at the possibility that the same may be true also in quantum
mechanics.44
41 In the case of Nelson’s mechanics, for instance, one has a systematic drift velocity that equals
the familiar de Broglie-Bohm velocity (which in fact also changes sign under time reversal). If one
subtracts the systematic component, transition probabilities are then exactly time-symmetric.
42 More precisely, it will be uniquely determined within each ‘ergodic class’. (Note that ergodicity
results are far easier to prove for stochastic processes than for deterministic dynamical systems.
See again the references in Bacciagaluppi (2010b) for details.)
43 Without going into details, these distinctions will be relevant to debates about the ‘neo-
Boltzmannian’ notion of typicality (see the references in Footnote 37), and about the notion of
‘sub-quantum disequilibrium’ in theories like pilot-wave theory or Nelsonian mechanics (for the
former see e.g. Valentini 1996, for the latter see Bacciagaluppi 2012).
In quantum mechanics, the role of determining transition probabilities belongs
to the quantum state (either the fixed Heisenberg-picture state, or the totality of
the time-evolving Schrödinger-picture or interaction-picture states45 ). The quantum
state, however, is traditionally seen as more than simply encoding transition
probabilities, rather as a physical object in itself, which in particular determines
these probabilities. One obvious exception is qBism, where however I have argued
in Sect. 3.4 above that the rejection of the quantum state as a physical object is not
sufficiently motivated.46
In this section I want to take a look at a different approach to quantum theory,
where I believe there are strong constraints to taking the quantum state as an
independently existing object, and reasons to take it indeed as just encoding (ontic,
law-like) transition probabilities, largely independently of the issue of whether these
probabilities should be subjective or objective.47
This approach is the foliation-dependent approach to relativistic collapse. As
is well-known, non-relativistic formulations of collapse will postulate that the
quantum state collapses instantaneously at a distance (whether the collapse is
spontaneous or the traditional collapse upon measurement). So, for instance, in an
EPR scenario the collapse will depend on whether Alice’s or Bob’s measurement
takes place first. However, predictions for the outcomes of the measurements are
independent of their order because the measurements commute: we obtain the
same probabilities for the same pairs of measurement results. This suggests that
one might be able to obtain a relativistically invariant formulation of collapse
after all. One of the main strategies for attempting this takes quantum states to
be defined relative to spacelike hyperplanes or more generally relative to arbitrary
spacelike hypersurfaces, and collapse accordingly to be defined relative to spacelike
foliations. This option has long been pioneered by Fleming (see e.g.
44 This section can be skipped without affecting the rest of the paper. The discussion is substantially
influenced by Myrvold (2000), but should be taken as an elaboration rather than exposition of
his views. See also Bacciagaluppi (2010a). For Myrvold’s own recent views on the subject, see
Myrvold (2017b, 2019).
45 The interaction picture is obtained from the Schrödinger picture by absorbing the free evolution
into the operators. In other words, one encodes the free terms of the evolution in the evolution of
the operators, and encodes the interaction in the evolution of the state.
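The picture change described in this footnote can be illustrated with a small numerical sketch. All matrices below (a free Hamiltonian H0, an interaction V, an observable A) are arbitrary Hermitian stand-ins chosen for illustration; the point is only that expectation values agree once the free evolution is absorbed into the operators.

```python
import numpy as np

rng = np.random.default_rng(1)

def herm(d):
    # random Hermitian matrix, a stand-in for a Hamiltonian or observable
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (M + M.conj().T) / 2

def evol(H, t):
    # exp(-i H t) for Hermitian H, via eigendecomposition
    w, v = np.linalg.eigh(H)
    return v @ np.diag(np.exp(-1j * w * t)) @ v.conj().T

d, t = 3, 0.7
H0, V, A = herm(d), herm(d), herm(d)
psi0 = np.eye(d)[0]

psi_S = evol(H0 + V, t) @ psi0        # Schroedinger-picture state
U0 = evol(H0, t)                      # free evolution
psi_I = U0.conj().T @ psi_S           # interaction-picture state
A_I = U0.conj().T @ A @ U0            # operator now carries the free evolution

lhs = psi_S.conj() @ A @ psi_S        # Schroedinger-picture expectation value
rhs = psi_I.conj() @ A_I @ psi_I      # interaction-picture expectation value
assert np.isclose(lhs.real, rhs.real) and np.isclose(lhs.imag, rhs.imag)
```

The equality holds identically, since inserting the free evolution and its inverse leaves every expectation value unchanged.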
46 Another obvious exception is Nelson’s stochastic mechanics (mentioned in Footnote 40), but
there it is not the quantum mechanical transition probabilities that are fundamental but the
transition probabilities of the underlying diffusion process. While a number of Bohmians (cf.
Goldstein and Zanghì 2013) also think of the wavefunction not as a material entity but as nomological,
as a codification of the law-like motion of the particles, there is no mathematical derivation in
pilot-wave theory of the wavefunction and Schrödinger equation from more fundamental entities
comparable to that in stochastic mechanics, and the wavefunction can be thought of as derivative
only if one takes a quite radical relationist view, like the one proposed by Esfeld (forthcoming).
47 For a sketch of how I think this relates to the PBR theorem (Pusey et al. 2012), see below.
80 G. Bacciagaluppi
48 The other main strategy is to consider collapse to be occurring along light cones (either along the
future light cone or along the past light cone, as in the proposal by Hellwig and Kraus (1970)), and
it arguably allows one to retain the picture of quantum states as physical objects (Bacciagaluppi
2010a). No such theory has been developed in detail, however.
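The order-independence of the EPR measurements invoked above can be checked directly. The following minimal sketch is my own illustration (the singlet state and the choice of σ_z on Alice's side and σ_x on Bob's side are assumptions, not taken from the text): because the two projectors commute, collapsing on Alice's outcome first or on Bob's first yields the same joint probabilities.

```python
import numpy as np

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def proj(v):
    # rank-1 projector |v><v|
    return np.outer(v, v.conj())

I2 = np.eye(2)
plus, minus = (up + down) / np.sqrt(2), (up - down) / np.sqrt(2)
P_alice = [np.kron(proj(up), I2), np.kron(proj(down), I2)]      # sigma_z on A
P_bob = [np.kron(I2, proj(plus)), np.kron(I2, proj(minus))]     # sigma_x on B

for PA in P_alice:
    for PB in P_bob:
        p_ab = np.linalg.norm(PB @ PA @ singlet) ** 2   # Alice collapses first
        p_ba = np.linalg.norm(PA @ PB @ singlet) ** 2   # Bob collapses first
        assert np.isclose(p_ab, p_ba)                   # order is irrelevant
        assert np.isclose(p_ab, 0.25)                   # each joint outcome: 1/4
```

For these (mutually unbiased) measurement choices on the singlet, all four joint outcomes are equiprobable either way, as the commutativity argument predicts.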
3 Unscrambling Subjective and Epistemic Probabilities 81
quantum state, to determine these probabilities. (As we saw in Sect. 3.6, Adlam
(2018a, b) is also critical of thinking of a separate physical object as determining the
quantum probabilities.)
What I wish to suggest is that, as in the classical case of Sect. 3.8, the use of
a separate quantum state to calculate transition probabilities may be an artefact of
our own limited epistemic perspective, and that one needs no more than forwards
and backwards transition probabilities corresponding to a ‘stationary process’. The
form of this process is easy to guess, reasoning by time symmetry from the fact that
the future boundary condition we standardly use is the maximally mixed state. If
we take that to be the case because we happen to lack information about the future,
and therefore do not post-select, that suggests that if we want to remove the pre-
selection that we normally do, and thus get to the correct theoretical probabilities,
we need to take the maximally mixed state also as the past boundary condition. The
appearance of a contingent quantum state would then follow from the fact that we
have epistemic access to past collapses, and conditionalise upon them when making
predictions, in general mixing our epistemic probabilities about such past collapses
(‘results of measurements’) and ontic transition probabilities provided by the theory.
Of course in general the maximally mixed state is unnormalisable, but condi-
tional probabilities will be well-defined. In such a ‘no-state’ quantum theory not
only are conditional probabilities fundamental, but unconditional probabilities in
general do not even exist.49
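A finite-dimensional toy model may help fix ideas here. This is my own illustrative sketch, not a construction from the text: in finite dimension the maximally mixed state ρ = 1/d is normalisable, and taking it as both past and future boundary condition yields forwards and backwards transition probabilities that are well defined and mirror each other, as for a stationary process. Only in the infinite-dimensional case does the unconditional 'state' itself become unnormalisable, while the conditional probabilities survive.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
U, _ = np.linalg.qr(M)            # a random unitary, standing in for the dynamics

# Forwards transition probabilities from the maximally mixed state:
# p(b at t+1 | a at t) = |<b|U|a>|^2  -- independent of rho = 1/d.
p_fwd = np.abs(U) ** 2            # p_fwd[b, a]

# Backwards transition probabilities, conditioning on the later event:
# p(a at t | b at t+1) = |<a|U^dagger|b>|^2 = |<b|U|a>|^2.
p_bwd = np.abs(U.conj().T) ** 2   # p_bwd[a, b]

assert np.allclose(p_fwd, p_bwd.T)          # time-symmetric, 'stationary' process
assert np.allclose(p_fwd.sum(axis=0), 1.0)  # each column is a distribution
```

The forwards and backwards conditionals coincide term by term, with no contingent quantum state needed beyond the transition probabilities themselves.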
No-state foliation-dependent flash-ontology quantum field theory would thus be
a way of directly implementing Myrvold’s idea that conditional probabilities should
be determined by the theory directly, and not by way of the quantum state. Better
predictions will be obtained the more events one conditionalises upon, stretching
back into the past. The theory is thus essentially non-Markovian (and hence temporally
non-local, in line with Adlam’s views). As mentioned in Footnote 28, Myrvold
(2017a), developing remarks by Pearle (1993), has shown that foliation-dependent
collapse theories always run into a problem of infinite energy production, but I
believe a no-state theory in fact escapes this conclusion.50
49 Rather related ideas are common in the decoherent histories or consistent histories literature.
Griffiths (1984) does not use an initial and a final state, but an initial and a final projection of the
same kind as the other projections in a history, so his consistent histories formalism is in fact a
no-state quantum theory. Hartle (1998) suggests that the fundamental formula for the decoherence
functional is the two-state one, but that we are well-shielded from the future boundary condition so
that two-state quantum theory is predictively equivalent to standard one-state quantum theory (i.e.
we can assume the maximally mixed state as future boundary condition). If in one-state quantum
theory we assume that we are similarly shielded from the initial boundary condition, then by
analogy one-state quantum theory is in fact predictively equivalent to no-state quantum theory
(i.e. we can assume the maximally mixed state also as past boundary condition). Thus, it could just
as well be that no-state quantum theory is fundamental. (Of course we need to include pre-selection
on the known preparation history.)
50 The idea behind Myrvold’s proof is extremely simple. Leaving technicalities aside, because of
the Lorentz-invariance of the vacuum state, infinite energy production would arise if localised
collapse operators did not leave the vacuum invariant on average. Thus, one should require that
Such a proposal has in fact been put forward in the non-relativistic case by
Bedingham and Maroney (2017), in a paper in which they discuss in general
the possibility of time-symmetric collapse theories. As with the requirement of
Lorentz invariance, the requirement of time symmetry makes it problematic to
think of the collapsing quantum state as a physical object, because the sequence
of collapsing states defined by the collapse equations applied in the past-to-future
direction is not the time reversal of the sequence of states defined by the collapse
equations applied in the future-to-past direction. That is, the quantum state is time-
direction-dependent. However, under very mild conditions, from the application of
the equations in the two directions of time one does obtain the same probabilities for
the same collapse events. This suggests, again as in the relativistic case, that a time-
symmetric theory of collapse requires a flash ontology. And, as I am suggesting in
the case of hypersurface-dependence, Bedingham and Maroney suggest one might
want to get rid altogether of the quantum state (since the collapse mechanism
suggests it is asymptotically maximally mixed).51
One can find an analogy for the approach discussed in this section by looking at
Heisenberg’s views on ‘quantum mechanics’ (i.e. matrix mechanics) as expressed in
the 1920s in opposition to Schrödinger’s views on wave mechanics.52 The evidence
suggests that Heisenberg did not believe in the existence of the quantum state as a
physical object, or of collapse as a physical process. Rather, the original physical
picture behind matrix mechanics was one in which quantum systems performed
local collapse operators leave the vacuum invariant. But since a local operator commutes with
local operators at spacelike separation, and since (by the Reeh-Schlieder theorem) applications
of the latter can approximate any global state, it follows that in order to avoid infinite energy
production collapse operators must leave every state invariant. Thus there is no collapse.
One way out that is already present in the literature (and which motivates Myrvold’s theorem)
is to tie the collapse to some ‘non-standard’ fields (which commute with themselves at spacelike
and timelike separation), as done by both Bedingham (2011) and Pearle (2015) himself. Another
possibility, favoured by Myrvold (2018a), is to interpret the result as supporting the idea that
collapse is tied to gravity: in the curvature-free case of Minkowski spacetime, the theorem shows
there is no collapse. As mentioned, however, I believe that a no-state proposal will also circumvent
the problem of infinite energy production, because collapse operators will automatically leave the
(Lorentz-invariant) maximally mixed state invariant on average.
51 I should briefly remark on how I think these suggestions relate to the work of Pusey et al. (2012),
who famously draw an epistemic-ontic distinction for the quantum state, and who conclude from
their PBR theorem that the quantum state is ontic. An ontic state for PBR is an independently
existing object that determines the ontic probabilities (in my sense) for results of measurements.
And an epistemic reading of the quantum state for PBR is a reading of it as an epistemic probability
distribution (in my sense) over the ontic states of a system. Thus, the PBR theorem seems to
push for a wavefunction ontology. In a no-state proposal (as here or in Bedingham and Maroney
2017), however, the probabilities for future (or past) collapse events are fully determined by the
set of past (or future) collapse events. The contingent quantum state (in fact whether defined along
hypersurfaces or along lightcones) can thus be identified with such a set of events, which is indeed
ontic, thus removing the apparent contradiction. Whether this is the correct way of thinking about
this issue of course deserves further scrutiny.
52 For more detailed accounts, see Bacciagaluppi and Valentini (2009), Bacciagaluppi (2008),
quantum jumps between stationary states. This then evolved into a picture of
quantum systems performing stochastic transitions between values of measured
quantities. Transition probabilities for Heisenberg were ‘out there’ in the world (i.e.
presumably objective as well as ontic). If a quantity had been measured, then it
would perform transitions according to the quantum probabilities. The observed
statistics would conform to the usual rules of the probability calculus, in particular
the law of total probability. But if a quantity had not been measured, there would be
nothing to perform the transitions, giving rise to the phenomenon of ‘interference of
probabilities’.
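Heisenberg's 'interference of probabilities' can be made concrete with a standard two-path toy calculation (the amplitudes and the relative phase below are invented for illustration): if the path is measured, the law of total probability applies and the path probabilities add; if it is not, the amplitudes are added before squaring, and the total-probability value is not recovered.

```python
import numpy as np

c1 = 0.5                       # amplitude to reach the detector via path 1
c2 = 0.5 * np.exp(1j * np.pi)  # via path 2, with a relative phase of pi

# Path measured: probabilities add (law of total probability).
p_total = abs(c1) ** 2 + abs(c2) ** 2

# Path not measured: amplitudes interfere before squaring.
p_interf = abs(c1 + c2) ** 2

assert np.isclose(p_total, 0.5)
assert np.isclose(p_interf, 0.0)   # complete destructive interference
```

With nothing performing the transitions along a definite path, the observed statistics violate the total-probability value, which is the phenomenon Heisenberg's picture had to accommodate.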
Schrödinger’s quantum states were for Heisenberg nothing but a convenient tool
for calculating (forwards) transition probabilities, and any state that led to the correct
probabilities could be used. (Note that, classically, forwards transition probabilities
are independent of the chosen initial conditions.) Hence in particular, the movability
of the ‘Heisenberg cut’ and Heisenberg’s cavalier attitude to collapse,53 and the
peculiar mix of ‘subjective’ and ‘objective’ elements in talking about quantum
probabilities. And now, finally, despite the shrouding in ‘Copenhagen mist’, we can
perhaps start to make sense of some of Heisenberg’s intuitions as expressed in ‘The
Copenhagen interpretation of quantum theory’ (Heisenberg 1958, Chap. 3).
This paper has suggested a framework for thinking of probabilities in a way that
distinguishes between subjective and epistemic probabilities. Two strategies have
been used: liberalising the contexts with respect to which one can define various
kinds of probabilities, and making a liberal use of probabilities over theoretical
possibilities. I wish to conclude by putting the latter into a wider perspective and
very briefly sketching the kind of subjectivist and anti-metaphysical position that I
wish to endorse.54
In forming theoretical models, whether they involve causal, probabilistic or any
other form of reasoning, we aim at describing the behaviour of our target systems in
a way that guides our expectations. To do so, our models need to have modal force,
to describe law-like rather than accidental behaviour. It is up for grabs what this
modal force is. We can have an anti-Humean view of causes, laws or probabilities,
as some kind of necessary connections, but Hume (setting aside the finer scruples
of Hume scholarship) famously denied the existence of such necessary connections.
This traditionally gives rise to the problem of induction. Hume himself proposed
53 For an example of (Born and) Heisenberg using a very unusual choice of ‘collapsed states’ to
calculate quantum probabilities, see (Bacciagaluppi and Valentini 2009, Sect. 6.1.2).
54 To this position, I would wish to add a measure of empiricism (Bacciagaluppi 2019). For a
similar view, see Brown and Ben Porath (2020). On these matters (and several others dealt with in
this paper) I am earnestly indebted to many discussions with Jenann Ismael over the years. Any
confusions are entirely my own, however.
Acknowledgements I am deeply grateful to Meir Hemmo and Orly Shenker for the invitation to
contribute to this wonderful volume, as well as for their patience. I have given thanks along the way,
both for matters of detail and for inspiration and stimuli stretching back many years, but I must add
my gratitude to students at Aberdeen, Utrecht and the 5th Tübingen Summer School in the History
and Philosophy of Science, where I used bits of this material for teaching, and to the audience at
two Philosophy of Science seminars at Utrecht, where I presented earlier versions of this paper,
in particular to Sean Gryb, Niels van Miltenburg, Wim Mol, Albert Visser and Nick Wiggershaus.
Special thanks go to Ronnie Hermens for precious comments on the first written draft, further
thanks to Harvey Brown, Alan Hájek and Jenann Ismael for comments and encouraging remarks
on the final draft, and very special thanks to Jossi Berkovitz for a close reading of the final draft
and many detailed suggestions and comments.
References
Bacciagaluppi, G., & Valentini, A. (2009). Quantum theory at the crossroads – Reconsidering the
1927 Solvay conference. Cambridge: Cambridge University Press.
Bacciagaluppi, G., Crull, E., & Maroney, O. (2017). Jordan’s derivation of blackbody fluctuations.
Studies in History and Philosophy of Modern Physics, 60, 23–34. http://philsci-archive.pitt.
edu/13021/.
Bedingham, D. (2011). Relativistic state reduction dynamics. Foundations of Physics, 41, 686–704.
Bedingham, D., & Maroney, O. (2017). Time reversal symmetry and collapse models. Foundations
of Physics, 47, 670–696.
Bell, J. S. (1984). Beables for quantum field theory. Preprint CERN-TH-4035. Reprinted in: (1987).
Speakable and unspeakable in quantum mechanics (pp. 173–180). Cambridge: Cambridge
University Press.
Berkovitz, J. (1998). Aspects of quantum non-locality II – Superluminal causation and relativity.
Studies in History and Philosophy of Modern Physics, 29(4), 509–545.
Berkovitz, J. (2001). On chance in causal loops. Mind, 110(437), 1–23.
Berkovitz, J. (2012). The world according to de Finetti – On de Finetti’s theory of probability and
its application to quantum mechanics. In Y. Ben-Menahem & M. Hemmo (Eds.), Probability
in physics (pp. 249–280). Heidelberg: Springer.
Berkovitz, J. (2019). On de Finetti’s instrumentalist philosophy of probability. European Journal
for Philosophy of Science, 9, article 25.
Brown, H. (2019). The reality of the wavefunction – Old arguments and new. In A. Cordero
(Ed.), Philosophers look at quantum mechanics (Synthese library, Vol. 406, pp. 63–86). Cham:
Springer.
Brown, H., & Ben Porath, G. (2020). Everettian probabilities, the Deutsch-Wallace theorem and
the principal principle. This volume, pp. 165–198.
Bub, J. (2007). Quantum probabilities as degrees of belief. Studies in History and Philosophy of
Modern Physics, 38, 232–254.
Bub, J. (2010). Quantum probabilities – An information-theoretic interpretation. In S. Hartmann
& C. Beisbart (Eds.), Probabilities in physics (pp. 231–262). Oxford: Oxford University Press.
Bub, J., & Pitowsky, I. (1985). Critical notice – Sir Karl R. Popper, Postscript to the logic of
scientific discovery. Canadian Journal of Philosophy, 15(3), 539–552.
de Finetti, B. (1970). Theory of probability. New York: Wiley.
Derakhshani, M. (2017). Stochastic mechanics without ad hoc quantization – Theory and
applications to semiclassical gravity. Doctoral dissertation, Utrecht University. https://arxiv.
org/pdf/1804.01394.pdf.
Diaconis, P., Holmes, S., & Montgomery, R. (2007). Dynamical bias in the coin toss. SIAM Review,
49(2), 211–235.
Emery, N. (2013). Chance, possibility, and explanation. The British Journal for the Philosophy of
Science, 66(1), 95–120.
Esfeld, M. (forthcoming). A proposal for a minimalist ontology. Forthcoming in Synthese, https://
doi.org/10.1007/s11229-017-14268.
Fleming, G. (1986). On a Lorentz invariant quantum theory of measurement. In D. M. Greenberger
(Ed.), New techniques and ideas in quantum measurement theory (Annals of the New York
Academy of Sciences, Vol. 480, pp. 574–575). New York: New York Academy of Sciences.
Fleming, G. (1989). Lorentz invariant state reduction, and localization. In A. Fine & M. Forbes
(Eds.), PSA 1988 (Vol. 2, pp. 112–126). East Lansing: Philosophy of Science Association.
Fleming, G. (1996). Just how radical is hyperplane dependence? In R. Clifton (Ed.), Perspectives
on quantum reality – Non-relativistic, relativistic, and field-theoretic (pp. 11–28). Dordrecht:
Kluwer Academic.
Frigg, R., & Hoefer, C. (2007). Probability in GRW theory. Studies in History and Philosophy of
Modern Physics, 38(2), 371–389.
Frigg, R., & Hoefer, C. (2010). Determinism and chance from a Humean perspective. In D. Dieks,
W. Gonzalez, S. Hartmann, M. Weber, F. Stadler, & T. Uebel (Eds.), The present situation in
the philosophy of science (pp. 351–371). Berlin: Springer.
Frigg, R., & Werndl, C. (2012). Demystifying typicality. Philosophy of Science, 79(5), 917–929.
Fuchs, C. A. (2002). Quantum mechanics as quantum information (and only a little more). arXiv
preprint quant-ph/0205039.
Fuchs, C. A. (2014). Introducing qBism. In M. C. Galavotti, D. Dieks, W. Gonzalez, S. Hartmann,
T. Uebel, & M. Weber (Eds.), New directions in the philosophy of science (pp. 385–402). Cham:
Springer.
Glynn, L. (2009). Deterministic chance. The British Journal for the Philosophy of Science, 61(1),
51–80.
Goldstein, S. (2001). Boltzmann’s approach to statistical mechanics. In J. Bricmont, D. Dürr,
M. C. Galavotti, G.C. Ghirardi, F. Petruccione, & N. Zanghì (Eds.), Chance in physics (Lecture
notes in physics, Vol. 574, pp. 39–54). Berlin: Springer.
Goldstein, S., & Zanghì, N. (2013). Reality and the role of the wave function in quantum theory.
In A. Ney, & D. Z. Albert (Eds.), The wave function – Essays on the metaphysics of quantum
mechanics (pp. 91–109). Oxford: Oxford University Press.
Greaves, H., & Myrvold, W. (2010). Everett and evidence. In S. Saunders, J. Barrett, A. Kent, &
D. Wallace (Eds.), Many worlds? Everett, quantum theory, and reality (pp. 264–304). Oxford:
Oxford University Press.
Griffiths, R. B. (1984). Consistent histories and the interpretation of quantum mechanics. Journal
of Statistical Physics, 36(1–2), 219–272.
Hájek, A. (2011). Conditional probability. In D. M. Gabbay, P. Thagard, J. Woods, P. S.
Bandyopadhyay, & M. R. Forster (Eds.), Philosophy of statistics, Handbook of the philosophy
of science (Vol. 7, pp. 99–135). Dordrecht: North-Holland.
Hartle, J. B. (1998). Quantum pasts and the utility of history. Physica Scripta T, 76, 67–77.
Heisenberg, W. (1958). Physics and philosophy – The revolution in modern science. New York:
Harper.
Hellwig, K.-E., & Kraus, K. (1970). Formal description of measurements in local quantum field
theory. Physical Review D, 1, 566–571.
Howard, D. (2004). Who invented the Copenhagen interpretation? A study in mythology.
Philosophy of Science, 71(5), 669–682.
Hume, D. (1748). An enquiry concerning human understanding. London: A. Millar.
Ismael, J. (2011). A modest proposal about chance. The Journal of Philosophy, 108(8), 416–442.
Ismael, J. (2017). Passage, flow, and the logic of temporal perspectives. In C. Bouton & P. Huneman
(Eds.), Time of nature and the nature of time (Boston studies in the philosophy and history of
science, Vol. 326, pp. 23–38). Cham: Springer.
Jaynes, E. T. (1985). Some random observations. Synthese, 63(1), 115–138.
Lazarovici, D., & Reichert, P. (2015). Typicality, irreversibility and the status of macroscopic laws.
Erkenntnis, 80(4), 689–716.
Lewis, D. (1976). The paradoxes of time travel. American Philosophical Quarterly, 13(2), 145–
152.
Lewis D. (1980). A subjectivist’s guide to objective chance. In R. C. Jeffrey (Ed.), Studies in
inductive logic and probability (Vol. 2, pp. 263–293). Reprinted in: (1986). Philosophical
papers (Vol. 2, pp. 83–113). Oxford: Oxford University Press.
Lewis, D. (1986). Postscripts to ‘A subjectivist’s guide to objective chance’. In Philosophical
papers (Vol. 2, pp. 114–132). Oxford: Oxford University Press.
Loewer, B. (2004). David Lewis’s Humean theory of objective chance. Philosophy of Science, 71,
1115–1125.
Loewer, B. (2007). Counterfactuals and the second law. In H. Price & R. Corry (Eds.), Causation,
physics, and the constitution of reality – Russell’s republic revisited (pp. 293–326). Oxford:
Oxford University Press.
Lyon, A. (2011). Deterministic probability – Neither chance nor credence. Synthese, 182(3), 413–
432.
Menzies, P., & Price, H. (1993). Causation as a secondary quality. British Journal for the
Philosophy of Science, 44, 187–203.
Myrvold, W. (2000). Einstein’s untimely burial. PhilSci preprint http://philsci-archive.pitt.edu/
222/.
Uffink, J. (2007). Compendium of the foundations of classical statistical physics. In J. Butterfield &
J. Earman (Eds.), Handbook of the philosophy of physics: Part B (pp. 923–1074). Amsterdam:
North-Holland.
Valentini, A. (1996). Pilot-wave theory of fields, gravitation and cosmology. In J. T. Cushing, A.
Fine, & S. Goldstein (Eds.), Bohmian mechanics and quantum theory – An appraisal (pp. 45–
66). Dordrecht: Springer.
Wallace, D. (2010). How to prove the Born rule. In S. Saunders, J. Barrett, A. Kent, & D. Wallace
(Eds.), Many worlds? Everett, quantum theory, and reality (pp. 227–263). Oxford: Oxford
University Press.
Wallace, D. (2011). The logic of the past hypothesis. PhilSci preprint http://philsci-archive.pitt.
edu/8894/.
Wallstrom, T. C. (1989). On the derivation of the Schrödinger equation from stochastic mechanics.
Foundations of Physics Letters, 2(2), 113–126.
Watanabe, S. (1965). Conditional probability in physics. Progress of Theoretical Physics Supple-
ment, E65, 135–167.
Wharton, K. (2014). Quantum states as ordinary information. Information, 5, 190–208.
Chapter 4
Wigner’s Friend as a Rational Agent
Abstract In a joint paper Jeff Bub and Itamar Pitowsky argued that the quantum
state represents “the credence function of a rational agent [. . . ] who is updating
probabilities on the basis of events that occur”. In the famous thought experiment
designed by Wigner, Wigner’s friend performs a measurement in an isolated
laboratory which in turn is measured by Wigner. Here we consider Wigner’s friend
as a rational agent and ask what her “credence function” is. We find experimental
situations in which the friend can convince herself that updating the probabilities
on the basis of events that happen solely inside her laboratory is not rational and
that conditioning needs to be extended to the information that is available outside of
her laboratory. Since the latter can be transmitted into her laboratory, we conclude
that the friend is entitled to employ Wigner’s perspective on quantum theory when
making predictions about the measurements performed on the entire laboratory, in
addition to her own perspective, when making predictions about the measurements
performed inside the laboratory.
V. Baumann
Vienna Center for Quantum Science and Technology (VCQ), Faculty of Physics, University of
Vienna, Vienna, Austria
Faculty of Informatics, Università della Svizzera italiana, Lugano, Switzerland
e-mail: baumav@usi.ch
Č. Brukner
Vienna Center for Quantum Science and Technology (VCQ), Faculty of Physics, University of
Vienna, Vienna, Austria
Institute of Quantum Optics and Quantum Information (IQOQI), Austrian Academy of Sciences,
Vienna, Austria
e-mail: caslav.brukner@univie.ac.at
4.1 Introduction
$$|\phi\rangle_S\,|0\rangle_F \;\rightarrow\; \frac{1}{\sqrt{2}}\big(|{\uparrow}\rangle_S|U\rangle_F + |{\downarrow}\rangle_S|D\rangle_F\big) \;=:\; |\Phi^+\rangle_{SF}, \qquad (4.1)$$
where $|Z\rangle_F$ with $Z \in \{U, D\}$ is the state of the friend having registered outcome
$z \in \{$“up”, “down”$\}$ respectively. Wigner can perform a highly degenerate projective
measurement $M_W: \{|\Phi^+\rangle\langle\Phi^+|_{SF},\ \mathbb{1} - |\Phi^+\rangle\langle\Phi^+|_{SF}\}$ to verify his state assignment.
The respective results of the measurement will occur with probabilities p(+) = 1
and p(−) = 0. We note that the specific relative phase between the two amplitudes
in Eq. (4.1) (there chosen to be zero) is determined by the interaction Hamiltonian
between the friend and the system, which models the measurement and is assumed
to be known to and in control of Wigner.
We next introduce a protocol through which Wigner’s friend will realise that her
predictions will be incorrect, if she bases them on the state-update rule conditioned
only on the outcomes observed inside her laboratory. The protocol consists of three
steps (see Fig. 4.1): (a) The friend performs measurement MF and registers an
outcome; (b) The friend makes a prediction about the outcome of measurement
Fig. 4.1 Three steps of the protocol: (a) Wigner’s friend (F) performs a measurement MF on the
system S in the laboratory and registers an outcome. (b) The friend makes a prediction about the
outcome of a future measurement MW on the laboratory (including the system and her memory),
for example, by writing it down on a piece of paper A. She communicates her prediction to Wigner
(W). (c) Wigner performs measurement MW . The three steps (a–c) are repeated until sufficient
statistics for measurement MW is collected. The statistics is noted, for example, on another piece
of paper B. Afterwards, the “lab is opened” and the friend can compare the two lists A and B. An
alternative protocol is given in the main text
94 V. Baumann and Č. Brukner
$$|\phi_d\rangle_S\,|0\rangle_F \;\rightarrow\; \frac{1}{\sqrt{d}}\sum_{j=1}^{d} |j\rangle_S\,|\alpha_j\rangle_F \;=:\; |\Phi_d^+\rangle_{SF}, \qquad (4.2)$$
where $|\alpha_j\rangle_F$ is the state of the friend's memory after having observed outcome
$j$. For $M_W$ we choose: $\{|\Phi_d^+\rangle\langle\Phi_d^+|_{SF},\ \mathbb{1} - |\Phi_d^+\rangle\langle\Phi_d^+|_{SF}\}$. Wigner records the “+”
result with unit probability, i.e. $p(+) = 1$ and $p(-) = 0$. Using the measurement-update rule the friend, however, predicts
$$p(+|j) = \big|\,\langle j|_S\langle\alpha_j|_F\,|\Phi_d^+\rangle_{SF}\big|^2 = \frac{1}{d} \;\xrightarrow{\;d\to\infty\;}\; 0, \qquad p(-|j) = 1 - \frac{1}{d} \;\xrightarrow{\;d\to\infty\;}\; 1, \qquad (4.3)$$
As the dimension of the measured system and the number of outcomes increases, the discrepancy
between her prediction and the actual statistics becomes maximal, giving rise to an
“all-versus-nothing” argument.
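The discrepancy can be computed explicitly in a small numerical sketch (the dimension and the particular outcome below are chosen arbitrarily): Wigner's state assignment gives $p(+) = 1$, while the friend's state-update prediction gives $p(+|j) = 1/d$, as in Eq. (4.3).

```python
import numpy as np

d = 8
# |Phi_d^+> = (1/sqrt(d)) sum_j |j>_S |alpha_j>_F, with both registers
# written in the computational basis.
phi_plus = sum(np.kron(np.eye(d)[j], np.eye(d)[j]) for j in range(d)) / np.sqrt(d)

# Wigner: the post-interaction state IS |Phi_d^+>, so he predicts p(+) = 1.
p_plus_wigner = abs(phi_plus @ phi_plus) ** 2

# The friend, having registered outcome j and applied the state-update rule,
# assigns |j>_S |alpha_j>_F and predicts p(+|j) = |<Phi_d^+|j, alpha_j>|^2.
j = 3
collapsed = np.kron(np.eye(d)[j], np.eye(d)[j])
p_plus_friend = abs(phi_plus @ collapsed) ** 2

assert np.isclose(p_plus_wigner, 1.0)
assert np.isclose(p_plus_friend, 1.0 / d)   # vanishes as d grows
```

For growing $d$ the friend's conditional prediction tends to zero while Wigner's remains one, which is the all-versus-nothing behaviour described above.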
4.3 Discussion
Alternatively, the friend can use only the quantum state Wigner assigns to the
laboratory, but apply a modification of the Born rule as introduced in Baumann
and Wolf (2018). The new rule makes it possible to evaluate conditional probabilities in a
sequence of measurements. In the case when the sequence of measurements is
performed on the same system, it restores the predictions of the standard state-
update rule. However, the rule enables making predictions in Wigner’s-friend
scenarios as well, where the first measurement is performed on the system by
Wigner’s friend, and the subsequent measurement is performed on the “system +
friend” by Wigner. In this case, the rule recovers the prediction as given by
Eq. (4.3). This modified Born rule gains an operational meaning in the present
setup where Wigner’s friend has access to outcomes of both measurements and can
evaluate the computed conditional probabilities. (Note that Wigner does not have
this status, since he has no access to the outcome of the friend's measurement.)
While Bub and Pitowsky (2010) claim that “Conditionalizing
on a measurement outcome leads to a nonclassical updating of the credence
function represented by the quantum state via the von Neumann-Lüders rule. . . ”, we
argue that in the case of Wigner's-friend experiments both the “outside” and “inside”
perspective should contribute to “the credence function” for making predictions.
We apply the modified Born rule of Baumann and Wolf (2018) to the present situa-
tion. The rule defines the conditional probabilities in a sequence of measurements.
According to it, every measurement is described as a unitary evolution analogous
to Eq. (4.1). Consider a sequence of two measurements, the first one described by
unitary U and the subsequent one by unitary V. The sequence of unitaries correlates
the results of the two measurements with the two memory registers 1 and 2. The overall
state of the system and the two registers evolves as
$$|\phi\rangle_S\,|0\rangle_1\,|0\rangle_2 \;\xrightarrow{U}\; \sum_j \langle j|\phi\rangle\; |j\rangle_S\,|\alpha_j\rangle_1\,|0\rangle_2 \qquad (4.4)$$
$$\;\xrightarrow{V}\; \sum_{jk} \langle j|\phi\rangle\,\langle\tilde{\beta}_{jk}|j,\alpha_j\rangle\; |\tilde{\beta}_{jk}\rangle_{S1}\,|\beta_k\rangle_2 \;=\; |\Phi_{\mathrm{tot}}\rangle.$$
The conditional probability for observing outcome $\bar{k}$ given that outcome $\bar{j}$ has been
observed in the previous measurement is given by
$$p(\bar{k}|\bar{j}) = \frac{p(\bar{k},\bar{j})}{\sum_{\bar{k}} p(\bar{k},\bar{j})}. \qquad (4.5)$$
Here $p(\bar{k},\bar{j})$ is the joint probability for observing the two outcomes and is given
by the projection of the overall state $|\Phi_{\mathrm{tot}}\rangle$ on the states $|\alpha_{\bar{j}}\rangle_1$ and $|\beta_{\bar{k}}\rangle_2$ of the two
registers:
$$p(\bar{k},\bar{j}) = \big\|\,\langle\alpha_{\bar{j}}|_1\,\langle\beta_{\bar{k}}|_2\,|\Phi_{\mathrm{tot}}\rangle\,\big\|^2. \qquad (4.6)$$
In the case when the sequence of measurements is performed on a single
system, the rule restores the prediction of the standard state-update rule, i.e.
$p(\bar{k}|\bar{j}) = |\langle\bar{k}|\bar{j}\rangle|^2$. This corresponds to the state
$$|\Phi_{\mathrm{tot}}\rangle = \sum_{jk} \langle j|\phi\rangle\,\langle k|j\rangle\; |k\rangle_S\,|\alpha_j\rangle_1\,|\beta_k\rangle_2. \qquad (4.7)$$
In the Wigner's-friend scenario, the overall state is
$$|\Phi_{\mathrm{tot}}\rangle = \sum_{j=1}^{d} \langle j|\phi\rangle\; |j\rangle_S\,|\alpha_j\rangle_1\,|+\rangle_2, \qquad (4.8)$$
where $|+\rangle_2$ is the state of the memory register corresponding to observing the result
of $|\Phi_d^+\rangle\langle\Phi_d^+|_{S1}$, and the joint probability is given by
$$p(+,\bar{j}) = \big\|\,\langle\alpha_{\bar{j}}|_1\,\langle +|_2\,|\Phi_{\mathrm{tot}}\rangle\,\big\|^2. \qquad (4.9)$$
This means that full unitary quantum theory together with the adapted Born rule
allows for consistent predictions in Wigner’s-friend-type scenarios, while giving the
same predictions as the standard Born and measurement-update rules for standard
observations.
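For the single-system case, the agreement with the standard state-update rule can be verified in a short sketch (the qubit state and the choice of z- and x-basis measurements are illustrative assumptions, not taken from the text): the overall state of Eq. (4.7) is built explicitly, joint probabilities are obtained by projecting onto the register states, and the resulting conditionals equal $|\langle\bar{k}|\bar{j}\rangle|^2$.

```python
import numpy as np

z = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]                 # first basis
x = [np.array([1.0, 1.0]) / np.sqrt(2),
     np.array([1.0, -1.0]) / np.sqrt(2)]                         # second basis
phi = np.array([0.6, 0.8])                                       # initial |phi>

# |Phi_tot> = sum_{jk} <j|phi> <k|j> |k>_S |alpha_j>_1 |beta_k>_2,
# with register states taken as computational basis vectors.
reg = z
Phi_tot = np.zeros(8, dtype=complex)
for j in range(2):
    for k in range(2):
        amp = (z[j] @ phi) * (x[k] @ z[j])
        Phi_tot += amp * np.kron(np.kron(x[k], reg[j]), reg[k])

def p_joint(j, k):
    # squared norm of the projection onto |alpha_j>_1 and |beta_k>_2
    P = np.kron(np.kron(np.eye(2), np.outer(reg[j], reg[j])),
                np.outer(reg[k], reg[k]))
    return np.linalg.norm(P @ Phi_tot) ** 2

for j in range(2):
    for k in range(2):
        p_cond = p_joint(j, k) / sum(p_joint(j, kk) for kk in range(2))
        assert np.isclose(p_cond, abs(x[k] @ z[j]) ** 2)   # = |<k|j>|^2
```

For these mutually unbiased bases each conditional comes out as 1/2, exactly the standard state-update prediction for a repeated measurement on a single system.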
Acknowledgements We thank Borivoje Dakic, Fabio Costa, Flavio del Santo, Renato Renner
and Ruediger Schack for useful discussions. We also thank Jeff Bub for the discussions and for
the idea of the alternative protocol. We acknowledge the support of the Austrian Science Fund
(FWF) through the Doctoral Programme CoQuS (Project no. W1210-N25) and the projects I-
2526-N27 and I-2906. This work was funded by a grant from the Foundational Questions Institute
(FQXi) Fund. The publication was made possible through the support of a grant from the John
Templeton Foundation. The opinions expressed in this publication are those of the authors and do
not necessarily reflect the views of the John Templeton Foundation.
4 Wigner’s Friend as a Rational Agent 99
References
Baumann, V., & Wolf, S. (2018). On formalisms and interpretations. Quantum, 2, 99.
Brukner, Č. (2015). On the quantum measurement problem. arXiv:1507.05255. https://arxiv.org/
abs/1507.05255.
Brukner, Č. (2018). A no-go theorem for observer-independent facts. Entropy, 20(5), 350.
Bub, J., & Pitowsky, I. (2010). Two dogmas about quantum mechanics. In Many worlds? Everett,
quantum theory, and reality (pp. 433–459). Oxford: Oxford University Press.
Diósi, L. (2014). Gravity-related wave function collapse. Foundations of Physics, 44(5), 483–491.
Finkelstein, J. (2005). “Quantum-states-as-information” meets Wigner's friend: A comment on
Hagar and Hemmo. arXiv preprint quant-ph/0512203.
Frauchiger, D., & Renner, R. (2018). Quantum theory cannot consistently describe the use of itself.
Nature Communications, 9(1), 3711.
Fuchs, C. A. (2010). QBism, the perimeter of quantum Bayesianism. arXiv:1003.5209. https://arxiv.
org/abs/1003.5209.
Ghirardi, G. C., Rimini, A., & Weber, T. (1986). Unified dynamics for microscopic and macro-
scopic systems. Physical Review D, 34(2), 470. https://doi.org/10.1103/PhysRevD.34.470.
Hagar, A., & Hemmo, M. (2006). Explaining the unobserved—Why quantum mechanics ain’t only
about information. Foundations of Physics, 36(9), 1295–1324.
Penrose, R. (2000). Wavefunction collapse as a real gravitational effect. In Mathematical physics
2000 (pp. 266–282). Singapore: World Scientific.
Rovelli, C. (1996). Relational quantum mechanics. International Journal of Theoretical Physics,
35(8), 1637–1678.
Wheeler, J. A. (1957). Assessment of Everett's “relative state” formulation of quantum theory.
Reviews of Modern Physics, 29(3), 463.
Wigner, E. P. (1963). The problem of measurement. American Journal of Physics, 31(1), 6–15.
https://doi.org/10.1119/1.1969254.
Chapter 5
Pitowsky’s Epistemic Interpretation
of Quantum Mechanics and the PBR
Theorem
Yemima Ben-Menahem
5.1 Introduction
This paper draws on an earlier paper on the PBR theorem (Ben-Menahem 2017).
Y. Ben-Menahem ()
Department of Philosophy, The Hebrew University of Jerusalem, Jerusalem, Israel
e-mail: yemima.ben-menahem@mail.huji.ac.il
point, the novelty of Born’s interpretation was that it construed quantum probabili-
ties as fundamental in the sense that they were believed to be irreducible, that is, they
were not alleged to arise from averaging over an underlying deterministic theory. In
other words, quantum probabilities were said to be different from those of other
probabilistic theories, statistical mechanics in particular, where the existence of an
underlying deterministic theory—classical mechanics—is taken for granted. The
proclaimed irreducibility gave rise to serious conceptual puzzles (more on these
below), but, from the pragmatic/empirical point of view, the notion that quantum
states (as represented and calculated by QM) stand for probability amplitudes of
measurement results was sufficient to make quantum mechanics a workable (and
highly successful) theory.
Itamar Pitowsky endorsed the irreducibly probabilistic nature of QM, adding to
it a major twist. On the basis of developments such as Bell’s inequalities, Gleason’s
theorem and other no-go theorems, he argued that QM is not merely a probabilistic
theory, but a non-classical probabilistic theory.2 Studying the structure of the
new theory in both algebraic and geometrical terms, he was able to demonstrate
that quantum probabilities deviate from classical probabilities in permitting more
correlations between events than classical probability theory allows. He saw
his interpretation as a response to the threat of non-locality and as a key to the
solution of other puzzles such as the notorious measurement problem. In addition,
his understanding of QM as a theory of non-classical probability also constitutes
a new take on the status of the theory of probability, turning it from an a priori
extension of logic into an empirical theory, which may or may not apply to particular
empirical domains.3
The meaning of the concept of probability, which has been debated among
scholars for a long time, is of particular concern in the context of QM. An option
that has attracted a great deal of attention in the last few decades is an epistemic
interpretation of probability (also known as information-theoretic, or Bayesian),
according to which probabilities in general, and quantum probabilities in particular,
represent states of knowledge or belief or information. How are we to complete
this sentence—knowledge (belief, information) about what? It turns out that the
as complementarity (which had actually been debated by members of the Copenhagen school),
but from the perspective of this paper and the problems about probability it discusses, the term
‘standard interpretation’ will do. I will not discuss the notion of probability in non-standard
approaches such as Bohm’s theory or the Many Worlds interpretation.
2 The non-classical character of quantum probability had been recognized, for example, by Richard
Feynman: “But far more fundamental was the discovery that in nature the laws of combining
probabilities were not those of the classical probability theory of Laplace” (Feynman 1951, p.
533). In actually developing and axiomatizing the non-classical calculus of quantum probability,
Pitowsky went far beyond that general recognition. I will not undertake a comparison of Pitowsky’s
work with that of his predecessors or contemporaries.
3 As a theory of rational belief, the theory of probability can be seen as an extension of logic,
and the idea that the theory of probability is empirical as an extension of the view that logic is
empirical. Pitowsky’s interpretation of QM is thus related to the earlier suggestion that QM obeys
a non-classical logic—quantum logic. The connection is explicit in the title of Pitowsky’s (1989)
Quantum Probability—Quantum Logic.
5 Pitowsky’s Epistemic Interpretation of Quantum Mechanics and the PBR Theorem 103
4 Prior to the publication of the PBR theorem, the difference between possible answers to this
question was rather obscure. One of the contributions of the theorem is that it sheds light on this
difference.
5 The 2011 version of the paper was published as arXiv:1111.3328 [quant-ph].
6 Born [1926] (1963), p. 234. “Die Bewegung der Partikeln folgt Wahrscheinlichkeitsgesetzen,
die Wahrscheinlichkeit selbst aber breitet sich im Einklang mit dem Kausalgesetz aus.” . . . “Die
Quantenmechanik allenfalls nur Antwort gibt auf richtig gestellte statistische Fragen, aber im
allgemeinen die Frage nach dem Ablauf eines Einzelprozesses unbeantwortet lässt.” (“The motion
of particles follows probability laws, but the probability itself propagates in accordance with the
law of causality.” . . . “Quantum mechanics at best gives an answer only to correctly posed
statistical questions, but in general leaves the question of the course of an individual process
unanswered.”)
104 Y. Ben-Menahem
mechanics and classical mechanics, but these pertain mainly to the issue of irreversibility and will
not concern us here. See Albert (2000), Hemmo and Shenker (2012), and the literature there cited.
Note, further, that despite the fact that the central laws of statistical mechanics are probabilistic,
there is no binding reason to construe these probabilities as subjective.
10 The epistemic interpretation already existed in 1930 but took some time to make an impact
outside of the community of probability theorists. Leading advocates of the subjective/epistemic
interpretation of probability as degree of belief are Ramsey (1931), de Finetti and Savage (1954).
The delay in impact is clearly reflected in the fact that de Finetti’s works of the 1930s had to wait
several decades to be collected and translated into English (de Finetti 1974).
11 I am not sure about the origin of this analogy; I may have heard it from David Finkelstein.
12 The meaning of the phase is examined by Guy Hetzroni in his Ph.D. dissertation The
Quantum Phase and Quantum Reality, submitted to the Hebrew University in July 2019 and
approved in October 2019. See also Hetzroni 2019a.
13 The Aharonov-Bohm paper was published in 1959, but the problem about the phases could have
been raised in 1926. There are different interpretations of the Aharonov-Bohm effect (see, for
example, Aharonov and Rohrlich 2005, Chaps. 4, 5 and 6, and Healey 2007, Chap. 2), but none of
them is intuitive from the perspective of an ensemble interpretation.
14 The letter is reproduced in Popper (1968) pp. 457–464. Incidentally, Einstein’s letter repeats
the main argument of the EPR paper in terms quite similar to those used in the published version
and makes it clear that the meaning of the uncertainty relations is at the center of the argument.
The letter thus disproves the widespread allegations that Einstein was unhappy with the published
argument and that it was not his aim to question the uncertainty relations.
15 It was only in the first few years of QM that Heisenberg and Bohr understood the uncertainty
relations as reflecting concrete disturbance by measurement, that is, as reflecting a causal process.
In his response to the EPR paper, Bohr realized that no concrete disturbance takes place in the EPR
situation. He construed the uncertainty relations, instead, as integral to the formalism, and as such,
holding for entangled states even when separated to spacelike distances so that they cannot exert
any causal influence on one another.
16 In contrast to the older ensemble interpretation, current epistemic interpretations of QM do not
context; the two positions actually represent different theories diverging in their
empirical implications.17
Thus far, I have concentrated on the distinction between the ensemble interpreta-
tion and standard QM rather than the distinction between epistemic and ontic models
that is the focus of the PBR theorem. The reason for using the former as backdrop
for the latter is that in general, ensemble and epistemic interpretations have not been
properly distinguished by their proponents and do indeed have much in common;
neither of them construes quantum states as a unique representation of physical
states of individual systems and both can dismiss the worry about the collapse of the
wavefunction. In principle, however, epistemic and ensemble interpretations could
diverge: The epistemic theorist could go radical and decline the assumption of an
underlying level of well-determined physical states that was the raison d’être of the
ensemble interpretation. This was in fact the route taken by Pitowsky.
17 Construing the diverging interpretations as different theories does not mean that the existing
empirical tests are conclusive, but that in principle, such tests are feasible. Even the evidence
accumulated so far, however, makes violation of the uncertainty relations in the individual case
highly unlikely.
18 There are of course other approaches to QM, some of which retain classical probability theory.
The ensemble interpretation discussed above and Bohm’s theory, which I do not discuss here, are
examples of such approaches.
probability of the union (E1 ∪ E2) is p1 + p2 − p12, and cannot exceed the sum of the
probabilities (p1 + p2):

0 ≤ p1 + p2 − p12 ≤ p1 + p2 ≤ 1
The violation of this constraint is already apparent in simple paradigm cases such
as the two-slit experiment, for there are areas on the screen that get more hits when
the two slits are both open for a certain time interval Δt, than when each slit is
open separately for the same interval Δt. In other words, contrary to the classical
principle, we get a higher probability for the union than the sum of the probabilities
of the individual events. (Since we get this violation in different experiments—
different samples—it does not constitute an outright logical contradiction.) This
quantum phenomenon is usually described in terms of interference, superposition,
the wave particle duality, the nonlocal influence of one open slit on a particle passing
through the other, and so on. Pitowsky’s point is that regardless of the explanation of
the pattern predicted by QM (and confirmed by experiment), we must acknowledge
the bizarre phenomenon that it displays—nothing less than a violation of a highly
intuitive principle of the classical theory of probability.
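The two-slit arithmetic can be made concrete with a toy calculation (my own illustration, not Pitowsky's formalism): treating the two open slits as contributing complex amplitudes that add before squaring, the detection intensity with both slits open can exceed the sum of the single-slit intensities at points of constructive interference. The function name and the chosen amplitudes are assumptions of this sketch.

```python
import cmath
import math

def detection_intensities(phase):
    """Toy single-point model of the two-slit experiment: each open slit
    contributes a complex amplitude; 'phase' is the relative phase the
    two paths accumulate (an assumption of this sketch, not of the text)."""
    a1 = 1 / math.sqrt(2)                      # amplitude through slit 1
    a2 = cmath.exp(1j * phase) / math.sqrt(2)  # amplitude through slit 2
    p1 = abs(a1) ** 2            # intensity with only slit 1 open
    p2 = abs(a2) ** 2            # intensity with only slit 2 open
    p_both = abs(a1 + a2) ** 2   # both open: amplitudes add, then square
    return p1, p2, p_both

# At a constructively interfering point (phase = 0) the two-slit
# intensity exceeds p1 + p2, violating the classical union bound.
p1, p2, p_both = detection_intensities(0.0)
print(p_both, p1 + p2)   # ~2.0 versus ~1.0
```

Since the violation shows up only when the single-slit and two-slit intensities come from different experimental arrangements, no outright contradiction arises, exactly as the text notes.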
QM predicts analogous violations of other classical conditions. The most famous
of these violations is that of the Bell inequalities which are violated by QM (and
experiment). Pitowsky showed that from the analogue of the above rule for three
events, it is just a short step to Bell’s inequalities. He therefore linked the inequalities
directly to Boole’s “conditions of possible experience” (the basic conditions of
classical probability) and their violation to the need for an alternative probability
calculus. To appreciate the significance of this move for the debate over the
epistemic interpretation, it is crucial to understand that what makes the violated
conditions classical is their underlying assumption of determinate states occupied
by the entities in question (or properties they obtain) independently of measurement:
Just as balls in an urn have definite properties such as being red or wooden, to
derive Bell’s inequalities it is assumed that particles have a definite polarization
or a definite spin in a certain direction, and so on. The violation of the classical
principles of probability compels us to discard this picture, replacing it with a new
understanding of quantum states and quantum properties. Rather than an analogue
of the classical state, which represents physical entities and their properties prior
to measurement, Pitowsky saw the quantum state function as keeping track of the
probabilities of measurement results, “a device for the bookkeeping of probabilities”
(2006, p. 214).19
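The "short step" from Boole's conditions to Bell's inequalities can be checked numerically. The sketch below uses the standard CHSH form rather than Pitowsky's own derivation: for the singlet state the correlation of spin measurements along angles a and b is E(a, b) = −cos(a − b), and suitable settings push the CHSH combination to 2√2, beyond the bound of 2 that any classical assignment of determinate outcomes must respect.

```python
import math

def E(a, b):
    """Singlet-state spin correlation for analyzer angles a and b."""
    return -math.cos(a - b)

# CHSH combination: every classical model with determinate pre-existing
# outcomes (Boole's "conditions of possible experience") forces |S| <= 2.
a1, a2 = 0.0, math.pi / 2               # Alice's two settings
b1, b2 = math.pi / 4, 3 * math.pi / 4   # Bob's two settings
S = E(a1, b1) - E(a1, b2) + E(a2, b1) + E(a2, b2)
print(abs(S))   # 2*sqrt(2), beyond the classical bound of 2
```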
This understanding of the state function, in turn, led Pitowsky to two further
observations: First, he interpreted the book-keeping picture subjectively, i.e., quan-
19 Pitowsky’s answer is similar to that of Schrödinger, who characterizes quantum states as “the
momentarily-attained sum of theoretically based future expectations, somewhat laid down as in a
catalog. It is the determinacy bridge between measurements and measurements” (1935, p. 158).
Pitowsky only became aware of this similarity around 1990. For more on Schrödinger’s views and
their similarity to those of Pitowsky see note number 25 below and Ben-Menahem (2012).
in QM, we get irreducibility, i.e., (with 0 as the null event and 1 the certain event):
If for some z and for all x, x = (x ∧ z) ∨ (x ∧ z⊥), then z = 0 or z = 1
20 The idea that the collapse can always be understood epistemically has been challenged by Meir
Hemmo; see Hagar and Hemmo (2006).
21 Pitowsky’s formulation is slightly different from that of Birkhoff and von Neumann, but the
difference is immaterial. Pitowsky makes significant progress, however, in his treatment of the
representation theorem for the axiom system, in particular in his discussion of Solèr’s theorem.
The theorem, and the representation problem in general, is crucial for the application of Gleason’s
theorem (Gleason 1957), but will not concern us here.
simply dealing with a different notion of probability, then all space-time considerations
become irrelevant (2006, pp. 231–232).
Recall that in order to countenance nonlocality without breaking with STR, the
no signaling constraint must be observed. As nonlocality is construed by Pitowsky
in formal terms—a manifestation of the quantum mechanical probability calculus,
uncommitted to a particular dynamic—it stands to reason that no signaling will
likewise be derived from probabilistic considerations. Indeed, it turns out that the no
signaling constraint can be understood as an instance of the more general principle
known as the non-contextuality of measurement (Barnum et al. 2000). In the spirit
of the probabilistic approach to QM, Pitowsky and Bub therefore maintain that
No signaling is not specifically a relativistic constraint on superluminal signaling. It is
simply a condition imposed on the marginal probabilities of events for separated systems
requiring that the marginal probability of a B-event is independent of the particular set of
mutually exclusive and collectively exhaustive events selected at A, and conversely (Bub
and Pitowsky 2010, p. 443).22
Pusey, Barrett and Rudolph summarize the no-go argument that has come to be
known as the PBR theorem in the following sentence:
We show that any model in which a quantum state represents mere information about
an underlying physical state of the system, and in which systems that are prepared
independently have independent physical states, must make predictions that contradict those
of quantum theory. (Pusey et al. 2012, p. 475)
Clearly, the theorem purports to target the epistemic interpretation of the wave
function, thereby supporting its realist interpretation. This is indeed the received
reading of the theorem and the source of its appeal. Under the title “Get Real” Scott
Aaronson (2012) announced the new result as follows:
Do quantum states offer a faithful representation of reality or merely encode the partial
knowledge of the experimenter? A new theorem illustrates how the latter can lead to a
contradiction with quantum mechanics. (2012, p. 443)
Since the PBR theorem does indeed answer the said question in the negative, the
conclusion is that the wave function has gained (more) reality.
Why is the negative answer to this question a refutation of the epistemic
interpretation? Here Pusey et al. build on a distinction between Ψ -ontic models
and Ψ-epistemic models introduced by Harrigan and Spekkens (2010): In Ψ-ontic
models the Ψ function corresponds to the physical state of the system; in
Ψ-epistemic models, Ψ represents knowledge about the physical state of the system.

Fig. 5.1 Two types of relation between the physical level and the quantum level. (a) The same
quantum state corresponds to several physical states. (b) The same physical state corresponds to
two quantum states
Consequently, there are also two varieties of incompleteness: Ψ could give us a
partial description of the physical state or a partial representation of our knowledge
about that state. If a Ψ -ontic model is incomplete, it is conceivable that Ψ could be
supplemented with further parameters—‘hidden variables’. In this case, the same Ψ
function could correspond to various physical states of the system, distinguishable
by means of the values of the additional hidden variables. Presumably, Ψ -epistemic
models can also be complete or incomplete but completing them cannot be accom-
plished by hidden variables of the former kind. Note that this analysis presupposes
a definite answer to the question posed at the end of the introduction to this
paper–what, according to the epistemic view, is the knowledge (belief, information)
represented by the quantum state about? The assumption underlying the theorem is
that it is knowledge about the physical state of the system.
So far, Harrigan and Spekkens’ distinction is merely terminological, but they
proceed to offer a criterion that distinguishes Ψ -epistemic from Ψ -ontic models:
If the Ψ function is understood epistemically, they claim, it can stand in a non-
functional relation to the physical state of the system, that is, the same physical state
may correspond to different (non-identical but also not orthogonal) Ψ functions.
Or, when probabilities rather than sharp values are considered, the supports of the
probability distributions corresponding to different Ψ functions can overlap for
some physical states (See Fig. 5.2 for an illustration of this criterion in the case
of two Ψ functions).23
Harrigan and Spekkens maintain that the possibility of overlap only makes
sense under the epistemic interpretation of Ψ , for, they argue, knowledge about
the physical state could be updated without any change to the physical state itself.
Fig. 5.2 The difference between Ψ-ontic and Ψ-epistemic models: (a) Ψ-ontic models. (b)
Ψ-epistemic models
24 Interpretations such as Bohm’s theory and the many worlds interpretation show that the
assumption is not ruled out by QM, but it is, nonetheless, sufficiently contentious to require some
justification.
observer rather than in the external physical world,” (2014, p. 69) and can therefore
vary from one observer to another. Pitowsky, I suggest, did not understand the
epistemic interpretation in that way. QM, he thought, is an algorithm that determines
degrees of rational belief, or rational betting procedures in the quantum context.
Even if the belief is in an individual’s mind, the algorithm is objective. According
to an epistemic interpretation of this kind, an observer using classical probability
theory rather than QM is bound to get results that deviate from QM, or, in the
betting jargon often adopted by epistemic theorists, is bound to lose. The epistemic
interpretation construes quantum probabilities as constraints on what can be known,
or rationally believed, about particular measurement outcomes. By itself, it does not
imply more liberty (about the ascription of probabilities to certain outcomes) than
realist interpretations.
To what extent is an epistemic theorist committed to the Harrigan-Spekkens con-
dition of non-supervenience as characterizing Ψ -epistemic models? Let’s assume
that the condition holds and there are well-determined physical states that ‘corre-
spond’ to different quantum states. Obviously, this correspondence is not meant to
be an identity, for a physical state cannot be identical with different quantum states.
A possible way of unpacking this correspondence is as follows: a well-defined
physical state exists (prior to measurement) and, upon measurement ‘generates’
or ‘gives rise to’ different results. Such an interpretation is at odds with the
received understanding of Bell’s inequalities and, more specifically, with Pitowsky’s
interpretation of QM as a non-classical theory of probability. We have seen that
Pitowsky explicitly denied the classical picture of well-defined physical states
generating quantum states, a picture that leads to Boole’s classical conditions, which,
in turn, are violated by QM. Indeed, the language of a physical state ‘giving rise’
to, or ‘generating,’ the result of a quantum measurement would be rejected by
Pitowsky even if it referred to a physical state corresponding to a single quantum
state, not merely to different ones as is the assumption in the Harrigan-Spekkens
analysis. If, on the other hand, by the said ‘correspondence’ we mean the physical
state of the system at the very instant it is being measured (or thereafter, but before
another measurement is made), we are no longer running into conflict with Bell’s
theorem, but then, there is no reason to accept that this physical state corresponds to
different quantum states. In other words, there is no reason to deny supervenience.
The epistemic theorist can take quantum states to determine the probabilities of
measurement results, and understand probabilities as degrees of rational belief,
and still deny the possibility that the same physical state corresponds to different
quantum states. It would actually be very strange if the epistemic theorist were
to endorse such a possibility. Clearly, the quantum state—a probability amplitude—is
compatible with different measurement outcomes, but the probability itself is fixed
by QM.
Moreover, if quantum probabilities are conceived as the maximal information
QM makes available, then surely we cannot expect different probability amplitudes
to correspond to the same physical state.25 The idea of maximal information plays
an important role in Spekkens (2005), where, on the basis of purely information-
theoretic constraints, a toy model of QM is constructed. The basic principle of this
toy model is that in a state of maximal knowledge, the amount of knowledge is equal
to the amount of uncertainty, that is, the number of questions about the physical state
of a system that can be answered is equal to the number of questions that receive
no answer. (This was also Schrödinger’s intuition cited in the previous footnote).
Spekkens succeeds in deriving from this principle many of the characteristic
features of QM, for example, interference, disturbance, the noncommutative nature
of measurement and no cloning.26 Either way, then—whether the real physical
state allegedly corresponding to different quantum states is taken to exist prior to
measurement or thereafter—the epistemic theorist I have in mind—Pitowsky in
particular—need not accept the Harrigan-Spekkens condition, let alone endorse it
as a criterion for epistemic models.
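Spekkens's knowledge-balance principle can be sketched in a few lines. This is a deliberately simplified rendering of a single elementary system from the toy theory (the set sizes and variable names are my own choices, not notation from Spekkens 2005): maximal knowledge locates the system in exactly two of four ontic states, so one binary question is answered and one is not, and non-disjoint epistemic states overlap on a shared ontic state, which is precisely the Harrigan-Spekkens mark of a Ψ-epistemic model.

```python
from itertools import combinations

# One elementary system of a Spekkens-style toy theory (simplified):
# four ontic states; a state of maximal knowledge pins the system down
# to exactly two of them (knowledge balances uncertainty).
ontic = {1, 2, 3, 4}
epistemic = [frozenset(s) for s in combinations(sorted(ontic), 2)]

# Disjoint epistemic states mimic orthogonal quantum states; the rest
# overlap on a shared ontic state, the Harrigan-Spekkens signature of
# a psi-epistemic model.
overlapping = [(s, t) for s, t in combinations(epistemic, 2) if s & t]
print(len(epistemic), len(overlapping))   # 6 states, 12 overlapping pairs
```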
A comparison with statistical mechanics is again useful. The goal of statistical
mechanics is to derive thermodynamic behavior, described in terms of macro-
properties such as temperature, pressure and entropy, from the description of
the underlying level of micro-states in terms of classical mechanics. For some
macroscopic parameters, the connection with micro-properties is relatively clear—
it is quite intuitive, for instance, to correlate the pressure exerted by a gas on
its container with the average impact (per unit area) of micro-particles on the
container. But this is not the case of other macro-properties; entropy, in particular.
Recovering the notion of entropy is essential for the recovery of the second law
of thermodynamics and constitutes a major challenge of statistical mechanics. The
fundamental insight underlying the connection between entropy and the micro-
25 The idea of maximal information or completeness is explicit in Schrödinger, who took the
Ψ function to represent a maximal catalog of possible measurements. The completeness of the
catalog, he maintained, entails that we cannot have a more complete catalog, that is, we cannot
have two Ψ functions of the same system one of which is included in the other. For Schrödinger
this is the basis for ‘disturbance’: Any additional information, arrived at by measurement, must
change the previous catalog by deleting information from it, meaning that at least some of the
previous values have been destroyed. Entanglement is also explained by Schrödinger on the basis
of the maximality or completeness of Ψ. He argues as follows: A complete catalog for two separate
systems is, ipso facto, also a complete catalog of the combined system, but the reverse does not
follow. “Maximal knowledge of a total system does not necessarily include total knowledge of
all its parts, not even when these are fully separated from each other and at the moment are not
influencing each other at all” (1935, p. 160, italics in original). The reason we cannot infer such
total information is that the maximal catalog of the combined system may contain conditional
statements of the form: if a measurement on the first system yields the value x, a measurement on
the second will yield the value y, and so on. Schrödinger sums up: “Best possible knowledge of
a whole does not necessarily include the same for its parts . . . The whole is in a definite state, the
parts taken individually are not” (1935, p. 161). In other words, separated systems can be correlated
or entangled via the Ψ function of the combined system, but this does not mean that their individual
states are already determined.
26 Spekkens also notes that the model fails to reproduce the Kochen-Specker theorem and the
level was that macrostates are multiply realizable by microstates. The implication is
that in general, since the same macrostate could be realized by numerous different
microstates, the detailed description of the microstate of the system is neither
sufficient nor necessary for understanding the properties of its macrostate. What
is crucial for the characterization of a macrostate in terms of the micro-structure
of the system realizing it, however, is the number of ways (or its measure-theoretic
analogue for continuous variables) in which a macrostate can be realized by the
microstates comprising it. Each macrostate is thus realizable by all the microstates
corresponding to points that belong to the phase-space volume representing this
macrostate—a volume that can vary enormously from one macrostate to another.
Note the nature of the correspondence between microstates and macrostates in this
case: When considering a particular system, at each moment it is both in a specific
microstate and in a specific macrostate. In other words, microstates neither give rise
to macrostates nor generate them! The system is in single well-defined state, but it
can be described either in terms of its micro-properties or in terms of its macro-
properties. For an individual state, the distinction between macro and micro levels
is not a distinction between different states but rather between different descriptions
of the same state. It is only the type, characterized in terms of macro-properties, that
has numerous different microstates belonging to it.27
The insight that macrostates differ in the number of their possible realizations (or
its measure-theoretic analogue) led to the identification of the volume representing
a macrostate in phase space with the probability of this macrostate and to the
definition of entropy in terms of this probability. On this conception, the maximal
entropy of the equilibrium state in thermodynamic terms is a manifestation of
its high probability in statistical-mechanical terms.28 The link between entropy
and probability constituted a crucial step in transforming thermodynamics into
statistical mechanics, but it was not the end of the story. For one thing, the
meaning of probability in this context still needed refinement, for another, the
connection between the defined probability and the dynamic of the system had to be
established.29
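The link between multiplicity, probability, and entropy can be illustrated with a deliberately simple discrete model (my illustration, not the chapter's): count the microstates realizing each macrostate of N two-state "molecules" and take the logarithm. The names W and S below are mine.

```python
from math import comb, log

# Toy Boltzmann counting: N two-state "molecules"; a macrostate is the
# number k in the excited state, its multiplicity W(k) the number of
# microstates realizing it, and S(k) = log W(k) (Boltzmann's constant
# set to 1).
N = 20
W = {k: comb(N, k) for k in range(N + 1)}
S = {k: log(W[k]) for k in W}

# The middling macrostate is realized by vastly more microstates than
# an extreme one, hence is far more probable and has higher entropy.
print(W[N // 2], W[0])   # 184756 versus 1
```

The enormous spread in W across macrostates is the discrete analogue of the phase-space volumes the text describes, and it is why the equilibrium macrostate dominates in probability.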
27 It seems that this is not the way Harrigan and Spekkens or Pusey et al. conceive of the relation
between the physical state and the quantum state. This obscurity might be the source of the
problems raised here regarding their characterization of the epistemic interpretation.
28 The probability of a macrostate can be understood objectively in terms of the ‘size’ of that
macrostate (in phase space) or subjectively, as representing our ignorance about the actual
microstate (as in the above quotation from Jaynes). This difference in interpretation does not
affect the probabilities of macrostates. Note that from the perspective of a single microstate (and
certainly from the perspective of a single atom or molecule), it should not matter whether it belongs
to a macrostate of high entropy, that is, a highly probable macrostate to which numerous other
microstates also belong, or to a low-entropy state—an improbable one, to which only a tiny fraction
of microstates belongs. In so far as only the microlevel is concerned, entropy does not constitute
an observable physical category.
29 A natural account of probability would involve an ensemble of identical systems (possibly an
infinite ensemble), where probabilities of properties are correlated with fractions of the ensemble.
In such an ensemble of systems we should expect to find more systems in probable macrostates
than in improbable ones. And yet, how does this consideration reflect on the individual system
we are looking at? Assuming an actual system that happens to be in an improbable macrostate—it
would certainly not follow from the above probabilistic considerations that it should, in the course
of time, move from its improbable state to a more probable one. If one’s goal is to account for the
evolution of an individual system (in particular, its evolution to equilibrium), there is a gap in the
argument. In addition, there is the formidable problem of deriving thermodynamic irreversibility
from the statistical-mechanical description, which is based on the time-reversal-symmetric laws of
mechanics. Since this problem has no analogue in the present discussion of the interpretation of
quantum states, we can set it aside.
30 As mentioned in Sect. 5.1, the analogy was rejected by the Copenhagen school from its earliest
days.
5.6 Instrumentalism
Admittedly, the instrumentalist will not assume the existence of physical states that
cannot be observed and will thus avoid the first assumption of the PBR theorem.
In this respect instrumentalism and Pitowsky’s radical epistemic interpretation
are alike. But here they part ways. To see why, it is useful to characterize
instrumentalism more precisely as committed to a verificationist (non-realist) theory
of meaning.31 On this theory, the meaning of a sentence depends on the existence
of a verification procedure—where none exists, the sentence is meaningless—and
its truth (falsity) on the existence of a proof (refutation). Unlike the realist, who
accepts bivalence (every sentence is either true or false regardless of whether or
not we can prove it), the verificationist admits a third category of indeterminate
sentences—sentences that are neither provable, nor disprovable. Consider the
difference between realism and verificationism when entertaining the claim that
Julius Caesar, say, could not stand raw fish or that he had a scar on his left toe.
Assuming that there is no evidence to settle the matter, the verificationist will deem
these claims indeterminate, whereas the realist will insist that although we may
never know, they must be either true or false. What if we stipulate that Julius
Caesar detested raw fish, or that he did in fact have a scar on his toe? On the
assumption that there is no evidence, such stipulation cannot get us into conflict
with the facts. We are therefore free to stipulate the truth values of indeterminate
sentences. This freedom, I claim, is constitutive of indeterminacy. The continuum
hypothesis provides a nice example from mathematics: Since it is independent of
the axioms of set theory, it might be true in some models of set theory and false
in others. If by stipulating the truth value of the hypothesis we could generate a
contradiction with set theory, we could no longer regard it as undecidable. The
axiom of parallels provides another example of the legitimacy of stipulation. Since
it is independent of the other axioms, its negation(s) can be added to the other axioms
without generating a contradiction. In QM, questions regarding the meaning of
quantum states were for a long time undecided. If they were indeed undecidable,
that is, undecidable in principle, the verificationist (instrumentalist) could stipulate
the answer by fiat. Schrödinger provided the first clue that this is not the case—the
very stipulation of determinate values would lead to conflict with QM. Later no-go
theorems and experiments confirmed this conclusion. As we saw, Pitowsky argued
that the reason for the violation of Bell’s inequalities is that in order to derive the
inequalities, one makes the classical assumption of determinate states underlying
quantum states. The violation of the inequalities, he thought, indicates that this
classical assumption is not merely indeterminate but actually false—leading to a
classical rather than quantum probability space. The constitutive characteristic of
31 In this section I repeat an argument made in Ben-Menahem (1997). The difference between
instrumentalism and the radical epistemic interpretation stands even for an instrumentalist who
does not explicitly endorse verificationism, for even such an instrumentalist would not envisage
the empirical implications that the radical epistemist calls attention to.
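As context for the point about deriving Bell-type inequalities from determinate underlying states, the classical bound can be checked by brute force: enumerating every local deterministic assignment of outcomes reproduces the CHSH bound of 2, whereas quantum mechanics permits correlations up to 2√2. The sketch below is a standard illustration, not taken from the chapter; the setting labels are generic.

```python
import math
from itertools import product

# Enumerate all local deterministic strategies: Alice's outcomes a0, a1 for
# her two measurement settings and Bob's outcomes b0, b1 for his, each +1 or
# -1. The CHSH expression S = a0*b0 + a0*b1 + a1*b0 - a1*b1 never exceeds 2
# in absolute value -- the classical bound whose violation is at issue here.
classical_bound = max(
    abs(a0 * b0 + a0 * b1 + a1 * b0 - a1 * b1)
    for a0, a1, b0, b1 in product([+1, -1], repeat=4)
)
print(classical_bound)    # 2
print(2 * math.sqrt(2))   # quantum (Tsirelson) bound, about 2.828
```

The exhaustive maximum over all sixteen deterministic strategies makes vivid that the bound of 2 is a consequence of assuming determinate underlying values, which is precisely the classical assumption whose falsity the violations indicate.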
122 Y. Ben-Menahem
References
Pitowsky, I. (1989). Quantum probability, quantum logic (Lecture notes in physics) (Vol. 321).
Heidelberg: Springer.
Pitowsky, I. (2006). Quantum mechanics as a theory of probability. In Demopoulos & Pitowsky
(Eds.), Physical theory and its interpretation (pp. 213–239). Dordrecht: Springer.
Popper, K. R. (1967). Quantum mechanics without ‘The observer’. In M. Bunge (Ed.), Quantum
theory and reality (pp. 7–43). Berlin: Springer.
Popper, K. R. (1968). The logic of scientific discovery (Revised edn). London: Hutchinson.
Pusey, M. F., Barrett, J., & Rudolph, T. (2012). On the reality of the quantum state. Nature Physics,
8, 475–478.
Ramsey, F. P. (1931). Truth and probability. In R. B. Braithwaite (Ed.), The foundations of
mathematics and other logical essays (pp. 156–198). London: Kegan Paul.
Redhead, M. (1987). Incompleteness, nonlocality and realism. Oxford: Clarendon Press.
Savage, L. J. (1954). The foundations of statistics. New York: Wiley.
Schlosshauer, M., & Fine, A. (2012). Implications of the Pusey-Barrett-Rudolph quantum no-go
theorem. Physical Review Letters, 108, 260404.
Schlosshauer, M., & Fine, A. (2014). No-go theorem for the composition of quantum systems.
Physical Review Letters, 112, 070407.
Schrödinger, E. (1935). The present situation in quantum mechanics (trans. J. D. Trimmer).
In Wheeler & Zurek (Eds.), Quantum theory and measurement (pp. 152–167). Princeton:
Princeton University Press.
Spekkens, R. W. (2005). In defense of the epistemic view of quantum states: A toy theory.
arXiv:quant-ph/0401052v2.
Wootters, W., & Zurek, W. H. (1982). A single quantum cannot be cloned. Nature, 299, 802–803.
Chapter 6
On the Mathematical Constitution
and Explanation of Physical Facts
Joseph Berkovitz
J. Berkovitz
Institute for the History and Philosophy of Science and Technology, University of Toronto,
Victoria College, Toronto, ON, Canada
e-mail: joseph.berkovitz@utoronto.ca
Modern physics is highly mathematical, and its enormous success seems to suggest
that mathematics is very effective as a tool for representing the physical realm.
Nobel Laureate Eugene Wigner famously expressed this view in “The unreasonable
effectiveness of mathematics in the natural sciences”:
the mathematical formulation of the physicist’s often crude experience leads in an uncanny
number of cases to an amazingly accurate description of a large class of phenomena (Wigner
1960, p. 8).
The great success of mathematical physics has led many to wonder about the causes
of and reasons for the effectiveness of mathematics in representing the physical
realm. Wigner thought that this success is unreasonable and even miraculous.
The miracle of the appropriateness of the language of mathematics for the formulation of
the laws of physics is a wonderful gift which we neither understand nor deserve. (Ibid.,
p. 11)
1 Steiner (1998, p. 9) argues that there are two problems with Wigner’s formulation of the question of
the “unreasonable effectiveness of mathematics in the natural sciences.” First, Wigner ignores the
cases in which “scientists fail to find appropriate mathematical descriptions of natural phenomena”
and the “mathematical concepts that never have found an application.” Second, Wigner focuses
on individual cases of successful applications of mathematical concepts, and these successes
might have nothing to do with the fact that a mathematical concept was applied. Steiner seeks
to formulate the question of the astonishing success of the applicability of mathematics in the
natural sciences in a way that escapes these objections. He argues that in their discoveries of new
theories, scientists relied on mathematical analogies. Often these analogies were ‘Pythagorean’,
meaning that they were inexpressible in any other language but that of pure mathematics. That
is, often the strategy that physicists pursued to guess the laws of nature was Pythagorean: “they
used the relations between the structures and even the notations of mathematics to frame analogies
and guess according to those analogies” (ibid., pp. 4–5). Steiner argues that although not every
guess, or even a large percentage of the guesses, was correct, this global strategy was astonishingly
successful.
Steiner’s reasoning and examples are intriguing and deserve an in-depth study, which, for want
of space, I need to postpone for another opportunity. I believe, though, that such a study will not
change the main thrust of my analysis of the question of the mathematical constitution of the
physical and its implications for the questions of mathematical explanations of physical facts and
the “unreasonable effectiveness of mathematics in the natural sciences”.
6 On the Mathematical Constitution and Explanation of Physical Facts 127
the nature of explanation in these accounts is unclear and that they are subject to the
objection that the mathematical frameworks, structures, or facts they appeal to play
a representational rather than explanatory role.
The current study was inspired by an exchange that I had with participants of
the workshop The Role of Mathematics in Science at the Institute for the History
and Philosophy of Science and Technology at the University of Toronto in October
2010. To my surprise, the common view was that while mathematics is very
important for representing physical systems/states/properties/laws, it does not play
any constitutive role in physics. Even more surprising, this view was also shared
by supporters of ontic structural realism, for whom such a constitution seems to
be a natural premise. The workshop Mathematical and Geometrical Explanations at
the Universitat Autònoma de Barcelona in March 2012 and subsequent conferences
and colloquia provided me with the opportunity to present the proposed account of
mathematical explanation of physical facts, and the invitation to contribute to the
second volume in honor of Itamar Pitowsky prompted me to write up this paper.
Pitowsky had a strong interest in mathematics and its use in physics. His research
covers various topics in these fields: the philosophical foundations of probability;
the nature of quantum probability and its relation to classical probability and quan-
tum logic; the interpretation of quantum mechanics, in general, and as a theory of
probability, in particular; quantum non-locality; computational complexity and quan-
tum computation; and probability in statistical mechanics. This research has inspired
a new perspective on the non-classical nature of quantum probability, quantum logic,
and quantum non-locality, and influenced our understanding of these issues.2
Pitowsky’s view of the relationship between mathematics and physics is less
known, however. Based on his commentary on Lévy-Leblond’s reflections on the
use and effectiveness of mathematics in physics in the Bar-Hillel colloquium in the
mid-1980s, in the next section I present his thoughts on this topic. Lévy-Leblond’s
and Pitowsky’s reflections will provide a background for the main focus of the
paper: the mathematical constitution and mathematical explanations of physical
facts. Both Lévy-Leblond and Pitowsky embrace the idea of the mathematical
constitution of the physical. Yet, it is not clear from their discussion what the exact
nature of this constitution is. In Sect. 4, I review two traditional conceptions of
mathematical constitution of physical facts – the Pythagorean and the neo-Kantian –
and briefly give an example of mathematical constitution of physical facts. In
Sect. 5, I review the common view of how mathematical models/theories represent
physical facts. The main idea of this view is that a mathematical model/theory
represents the properties of its target physical entities on the basis of a similarity
between the relevant mathematical structure in the model/theory and the structure of
the properties of these entities. More precisely, the idea is that there is an appropriate
structure-preserving mapping between the mathematical system in the model/theory
and the properties and relations of the represented physical entities. I argue that
the mapping account undermines the idea that physical facts are non-mathematical.
Further, in Sect. 6, I argue that a reflection on the concept of the physical in modern
natural science supports the idea that the physical is constituted by the mathematical.
In Sect. 7, I suggest that the philosophical interpretative frameworks of scientific
130 J. Berkovitz
theories/models in natural science determine the scope rather than the existence of
the mathematical constitution of the physical. In Sect. 8, I propose a new account of
mathematical explanation of physical facts. This account is sufficiently general so
as to be integrated into the frameworks of various existing accounts of explanation.
Accordingly, such accounts of explanation can be revised along the lines of the
proposed account to embody the idea that mathematical explanations of physical
facts highlight the mathematical constitution of physical facts and thus make the
explained facts intelligible or deepen and expand the scope of our understanding
of them. In Sect. 9, I consider four existing accounts of mathematical explanation
of physical facts. I argue that the exact nature of explanation in these accounts is
unclear and, moreover, that they are subject to the objection that mathematics only
plays a representational role, representing the physical facts that actually have the
explanatory power. I then show that these accounts can be revised along the lines of
the proposed account so as to circumvent the above objections. I conclude in Sect.
10 by briefly commenting on the implications of the mathematical constitution of
the physical for the question of the unreasonable effectiveness of mathematics in
physics.
Lévy-Leblond (1992, p. 147) notes that the picture of mathematics as the language
of physics is ambiguous and admits two different interpretations: mathematics as
the intrinsic language of Nature, which has to be mastered in order to understand
her workings; and mathematics as a special language into which facts of Nature
need to be translated in order to be comprehended. He refers to Galileo and
Einstein as advocates of the first view, and Heisenberg as an exponent of the second
view. He does not think of these views as contradictory but rather as extreme
positions on a continuous spectrum of opinions. Lévy-Leblond explores the question
of the peculiar relation between mathematics and physics, and he contends that
the view of mathematics as a special language of nature does not provide an
answer to this question. Instead of explaining why mathematics applies so well
to physics, this view is concerned with the supposed adequacy of mathematics
as a language for acquiring knowledge of nature in general. In his view, in other
natural sciences, such as chemistry, biology, and the earth sciences, we could
talk about the relation between mathematics and the particular science as one
of application, with the implication that “we are concerned with an instrumental
relation, in which mathematics appears as a purely technical tool, external to its
of mathematics in physics keeps on growing, it does so much less rapidly than the
development of mathematics that is not utilized by physics (ibid., p. 154).
Lévy-Leblond’s reflections above are intended to probe the nature of the
mathematical constitution of the physical and the special relationship between
physics and mathematics. His main conclusions are that there is no pre-established
harmony between physics and mathematics and that the singularity of physics in
its relationship to mathematics is very difficult to account for on the view that
mathematics is merely a language for representing the physical reality. Lévy-
Leblond holds that if mathematics is conceived as merely a language, it “would
necessarily have to be a universal one, applicable in principle to each and every
natural science”, and the specificity of physics “must then be treated as only a matter
of degree, and criteria must be found to distinguish it from other sciences” (ibid., p.
156). He considers two possible ways to try to explain the specificity of physics as a
matter of degree. One is the common view that physics is more advanced. The idea
here is that “more precise experimental methods, a firmer control of experimental
conditions, permit the quantitative measurement of physical notions, which are
thus transformed into numerical magnitudes” (ibid., p. 157). The other explanation
localizes
the specificity of physics in its objects of inquiry . . . It is commonly stated that physics is
“more fundamental” than the other natural sciences. Dealing with the deepest structures of
the world, it aims to bring to light the most general laws of Nature, which are implicitly
considered as the “simplest” and thus, in an Aristotelian way, as the “most mathematical”
(ibid.).
Lévy-Leblond rejects both explanations. He contends that the first explanation per
se fails to explain the privileged status of physics, and, moreover, it relies on a
very naive view of mathematics. And he claims that the second explanation implies
that “mathematicity assumes a normative character and becomes a criterion of
scientificity”, but the development of other sciences, such as chemistry, geology,
and molecular biology seems to contradict this norm. He argues that these sciences
are characterized by a clear-cut separation between their conceptual equipment and
the mathematical techniques used – a separation that allows them to maintain a
degree of autonomy even in the face of new developments in relevant branches of
physics (ibid., pp. 148, 157–158).
In the end, Lévy-Leblond fails to find an explanation for the singular relationship
between mathematics and physics. He thus opts for a radical solution according to
which physics is defined as any area of the natural sciences where mathematics is
constitutive (ibid., p. 158).
In his commentary on Lévy-Leblond’s reflections, Pitowsky (1992, p. 166)
finds Lévy-Leblond’s characterization of physics very attractive. He points out,
though, that it is incomplete, as the question arises as to what makes a given
use of mathematics constitutive rather than a mere (non-constitutive) application.
Indeed, as one could see from the above review, the questions of the nature of the
mathematical constitution of the physical and the distinction between constitutive
and non-constitutive uses of mathematics remain open. While Pitowsky shares
3 Bolzano’s example is believed to have been produced in the 1830s but the manuscript was only
discovered in 1922 (Kowalewski 1923) and published in 1930 (Bolzano 1930). Neuenschwander
(1978) notes that Weierstrass presented a continuous nowhere differentiable function before the
Royal Academy of Sciences in Berlin on 18 July 1872 and that it was published first, with his
assent, in du Bois-Reymond (1875) and later in Weierstrass (1895).
There are two traditional ways of conceiving the mathematical constitution of the
physical. One is along what I shall call the ‘Pythagorean picture’. There is no
canonical depiction of this picture. In what follows, I will consider three versions of
it, due to Aristotle, Galileo, and Einstein.
Aristotle describes the Pythagorean picture in the Metaphysics, Book 1, Part 5.
[T]he so-called Pythagoreans, who were the first to take up mathematics, not only advanced
this study, but also having been brought up in it they thought its principles were the
principles of all things. Since of these principles numbers are by nature the first, and in
numbers they seemed to see many resemblances to the things that exist and come into
being – more than in fire and earth and water (such and such a modification of numbers
being justice, another being soul and reason, another being opportunity – and similarly
almost all other things being numerically expressible); since, again, they saw that the
modifications and the ratios of the musical scales were expressible in numbers; since, then,
all other things seemed in their whole nature to be modelled on numbers, and numbers
seemed to be the first things in the whole of nature, they supposed the elements of numbers
to be the elements of all things, and the whole heaven to be a musical scale and a number.
(Aristotle 1924)
In his lecture “On the method of theoretical physics”, Einstein also seems to
endorse a version of the Pythagorean picture. He depicts nature as the realization of
mathematical ideas.
Our experience hitherto justifies us in believing that nature is the realization of the simplest
conceivable mathematical ideas. I am convinced that we can discover by means of purely
mathematical constructions the concepts and the laws connecting them with each other,
which furnish the key to the understanding of natural phenomena. (Einstein 1933/1954)
The Pythagorean picture has also been embraced by a number of other notable
natural philosophers and physicists. While the exact nature of this picture varies
from one version to another, all share the idea that mathematics constitutes all the
elements of physical reality. Physical objects, properties, relations, facts, principles,
and laws are mathematical by their very nature, so that ontologically one cannot
separate the physical from the mathematical. This does not mean that physical
things are numbers, as it may be tempting to interpret Aristotle’s characterization
in the Metaphysics and as it is frequently suggested. Rather, the idea is that
mathematics defines the fundamental nature of physical quantities, relations, and
structures. Without their mathematical properties and relations, physical things
would be essentially different from what they are. The Pythagoreans see
the mathematical features of the physical as intrinsic to it, so that mathematics is not
regarded merely as a ‘language’ or conceptual framework within which the physical
is represented and studied.
The main challenge for the Pythagoreans is to identify the mathematical frame-
works that define the nature of the physical, especially given that both mathematics
and natural science continue to develop. The Pythagoreans may argue, though, that
it is the role of mathematics and natural science to discover these frameworks.
The second traditional way of conceiving the mathematical constitution of the
physical is along neo-Kantian lines, especially those of Hermann Cohen and Ernst
Cassirer of the Marburg school. Cassirer (1912/2005, p. 97) comments that Cohen
held that
the most general, fundamental meaning of the concept of object itself, which even
physiology presupposes, cannot be determined rigorously and securely except in the
language of mathematical physics.
Cassirer held that mathematics is crucial for furnishing the fundamental scaf-
folding of physics, the intellectual work of understanding, and the construction of
physical reality.
“Pure” experience, in the sense of a mere inductive collection of isolated observations, can
never furnish the fundamental scaffolding of physics; for it is denied the power of giving
mathematical form. The intellectual work of understanding, which connects the bare fact
systematically with the totality of phenomena, only begins when the fact is represented and
replaced by a mathematical symbol. (Cassirer 1910/1923, p. 147)
The idea is that “the chaos of impressions becomes a system of numbers” and
the “objective value” is obtained in “the transformation of impression into the
mathematical ‘symbol’.” The physical analysis of an object “into the totality of its
numerical constants” is a judgment in which
the concrete impression first changes into the physically determinate object. The sensuous
quality of a thing becomes a physical object, when it is transformed into a serial
determination. The “thing” now changes from a sum of properties into a mathematical
system of values, which are established with reference to some scale of comparison. Each
of the different physical concepts defines such a scale, and thereby renders possible an
increasingly intimate connection and arrangement of the elements of the given. (Ibid., p.
149)
More generally, in Cassirer’s view, the concepts of natural science are “products
of constructive mathematical procedure”. It is only when the values of physical
constants are inserted in the formulae of general laws, “that the manifold of
experiences gains that fixed and definite structure, that makes it ‘nature’.” The
reality that natural science studies is a construction, and the construction is in
mathematical terms.
The scientific construction of reality is only completed when there are found, along with the
general causal equations, definite, empirically established, quantitative values for particular
groups of processes: as, for example, when the general principle of the conservation of
energy is supplemented by giving the fixed equivalence-numbers, in accordance with which
the exchange of energy takes place between two different fields. (Ibid., p. 230)
4 For the arrow paradox and its analysis, see for example Huggett (1999, 2019) and references
therein.
5 Two comments: 1. The introduction of the calculus is often presented as a solution to Zeno’s
arrow paradox. I believe that this presentation is somewhat misleading. The calculus does not
really solve Zeno’s paradox. It just evades it by fiat, i.e., by redefining the concept of instantaneous
velocity. 2. For a recent example of the role that the calculus plays in constituting physical facts,
see Stemeroff’s (2018, Chap. 3) analysis of Norton’s Dome – the thought experiment that is
purported to show that non-deterministic behaviour could exist within the scope of Newtonian
mechanics. Stemeroff argues that: (i) this thought experiment overlooks the constraints that the
calculus imposes on Newton’s theory; and that (ii) when these constraints are taken into account,
Norton’s Dome is ruled out as impossible in Newtonian universes.
where t denotes time and dv/dt denotes the derivative of velocity with respect to
time.
Second law: The net force F on an object is equal to the rate of change of its linear
momentum p, p = mv, where m denotes the object’s mass:
F = dp/dt = d(mv)/dt = m dv/dt.    (6.2)
Third law: For every action there is an equal and opposite reaction. That is, for
every force exerted by object A, F_A, on object B, B exerts a reaction force, F_B,
that is equal in size but opposite in direction:
F_A = −F_B.    (6.3)
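The constitutive role of the calculus in the second law can be made concrete with a small numerical check (the mass and force values below are illustrative, not from the text): for constant mass, a finite-difference approximation to the rate of change of momentum recovers the applied force.

```python
# Numerical check of F = dp/dt for constant mass, using a central finite
# difference as a stand-in for the derivative. Values are illustrative only.
m, F = 2.0, 6.0                      # mass (kg) and constant net force (N)
v = lambda t: (F / m) * t            # velocity under constant force, v(0) = 0
p = lambda t: m * v(t)               # linear momentum p = m v
h = 1e-6
dp_dt = (p(1.0 + h) - p(1.0 - h)) / (2 * h)
print(dp_dt)                         # approximately 6.0, i.e. F
```

The point of the exercise is the one footnote 5 gestures at: without the calculus's definition of an instantaneous rate of change, which the finite difference here approximates, the second law would not even be statable.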
it posits are isomorphic to the structures of properties of, and relations between
these systems. Intuitively, the idea is that a mathematical model and properties
and relations of the physical systems it represents share the same structures. More
precisely, there is a one-to-one mapping between the mathematical structures
in the model and properties of, and relations between the represented systems.
Mathematical models/theories could also represent by means of weaker notions
of structural identity, such as partial isomorphism (French and Ladyman 1998,
Bueno et al. 2002, Bueno and French 2011, 2018), embedding (Redhead 2001),
or homomorphism (Mundy 1986). Some other accounts of how mathematical
models/theories represent take the mapping account to be incomplete. In particular,
Bueno and Colyvan (2011, p. 347) argue that, for the mapping account to get
started, we need “something like a pre-theoretic structure of the world (or at least a
pre-modeling structure of the world).” While it is common to think of mathematical
structure as a set of objects and a set of relations on them (Resnik 1997, Shapiro
1997), Bueno and Colyvan remark that “the world does not come equipped with
a set of objects . . . and sets of relations on those. These are either constructs of
our theories of the world or identified by our theories of the world.” Thus, the
mapping account requires having what they call “an assumed structure” (ibid.). Yet,
Bueno and Colyvan accept the mapping account as a necessary part of mathematical
representation of the physical world.
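The notion of a structure-preserving mapping at work in the mapping account can be illustrated with a toy example (all names and values here are hypothetical): a finite "mathematical" structure, a finite "physical" structure, and a bijection between them that preserves the relevant relation.

```python
# Toy illustration of the mapping account: a map f between a mathematical
# structure (a set with a binary relation) and a represented "physical"
# structure counts as an isomorphism when it is a bijection that preserves
# the relation exactly.
math_objects = {1, 2, 3}
math_relation = {(1, 2), (2, 3)}            # e.g. "successor of"

phys_objects = {"a", "b", "c"}
phys_relation = {("a", "b"), ("b", "c")}    # e.g. some physical ordering

f = {1: "a", 2: "b", 3: "c"}                # candidate representation map

def is_isomorphism(f, dom, rel, cod, rel2):
    bijective = (set(f) == dom and set(f.values()) == cod
                 and len(set(f.values())) == len(f))
    preserving = {(f[x], f[y]) for (x, y) in rel} == rel2
    return bijective and preserving

print(is_isomorphism(f, math_objects, math_relation,
                     phys_objects, phys_relation))  # True
```

Weaker notions such as homomorphism or partial isomorphism relax one or other of these two conditions; the argument in the text is that specifying any such condition precisely already presupposes that the physical side has a mathematical structure to be mapped.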
While the mapping account is supposed to support the common view that
mathematical models could represent physical systems even if the physical is funda-
mentally non-mathematical, it actually undermines it. If the notion of representation
is to be adequate, the notion of identity between a mathematical structure and the
physical structure it represents has to be sufficiently precise. But it is difficult to
see how the notion of identity between mathematical and physical structures could
be sufficiently precise for the purposes of modern physics if physical structures
did not have a mathematical structure. Thus, it seems that any adequate account of
mathematical representation of the physical that is based on the mapping account
(or some similar cousin of it) will have to presuppose that the represented physical
things have mathematical structures.
The above argument for the mathematical constitution of the physical is focused on
accounts of representation which are based on the mapping account of mathematical
representations. There have been various objections to the mapping account (see,
for example, Frigg and Nguyen (2018) for criticism of structuralist accounts of
representation and references therein). I believe that the main idea of the above
argument can be extended to any adequate account of mathematical representation
of the physical, as any such account would have to include some aspects of the
mapping account as a necessary component. Indeed, it is difficult to see how
mathematics could have the potency it has in modern natural science if mathematical
representations did not involve mapping. In any case, there is another weighty
reason for rejecting the idea that the physical is non-mathematical. It is related to the
question of the nature of the physical, and it is applicable to all accounts of scientific
representation, independently of whether they embody the mapping account.
A key motivation for thinking about the physical as non-mathematical is an
equivocation between two notions of the physical:
(i) the physical as determined by the theoretical and experimental frameworks of
modern physics and, more generally, modern natural science; and
(ii) the physical as some kind of stuff out there, the nature of which is not defined
by these frameworks.
The second notion of the physical is too vague and rudimentary to play any
significant role in most contemporary scientific applications. In particular, theories
and models in physics actually refer to the first notion of the physical, where the
physical is constituted by the mathematical in the sense that fundamental features
of the physical are in effect mathematical. Indeed, physical objects, properties,
relations, and laws in modern physics are by their very essence characterized in
mathematical terms. Yet, although the gap between the above notions of the physical
seems unbridgeable, it is common to assume without justification that our theories
and models, which embody the first notion of the physical, are about the second
notion of the physical.
The considerations above suggest that in modern natural science:
(a) The common view – that the physical is ontologically separated from the math-
ematical – fails to explain how mathematical models/theories could represent
the physical.
(b) The mapping account is an essential component of mathematical representa-
tions of the physical.
(c) The mapping account requires that physical facts have mathematical structure.
(d) The mapping account supports the idea of the mathematical constitution of
the physical, and this constitution makes sense of the mapping account as an
essential component of mathematical representations of the physical.
the physical phenomena that current theories/models account for are mathematically
constituted. The raw data that scientists collect are typically quantitative. Further,
these data often lack the systematicity, stability, and unity required for constructing
these phenomena. The phenomena that our theories/models have to answer for are
constructed from raw data in terms of various postulates and statistical inferences
which are mathematically constituted. The scope of the mathematical constitution
of physical facts beyond the phenomena depends on the interpretative framework.
If all the current theories/models in natural science were interpreted as purely
instrumental, so that they are not supposed to represent anything beyond the
phenomena, the mathematical constitution of the physical would only concern
the phenomena; for under such interpretative framework, natural science is not
supposed to represent anything beyond physical phenomena. While there are
reasons to consider parts of modern natural science as purely instrumental, it is
doubtful that there is any adequate understanding of all of it as purely instrumental.
Accordingly, it is plausible to conceive the mathematical constitution of the physical
as pertaining also to various aspects of the physical reality beyond the phenomena.
In any case, in the account of mathematical explanation of physical facts that I
now turn to propose the extent of the mathematical constitution of the physical
has implications for the scope rather than nature of mathematical explanations of
physical facts.
Recently, the literature on the mathematical explanation of physical facts has grown
steadily and various accounts have been proposed. Here I can only consider some
of them, and the choice reflects the way the current study has developed so far.
Analyzing putative examples of mathematical explanations of physical facts, I will
argue that the structure of the explanation in these accounts is unclear and that they
are susceptible to the objection that the mathematical frameworks, structures, or
facts that they appeal to play a representational rather than explanatory role. I will
then show how these accounts could be revised along the lines of the proposed
account to circumvent these challenges.
6 On the Mathematical Constitution and Explanation of Physical Facts 143
There have been various objections to the above explanation, such as that the
choice of time units (e.g. years rather than seasons) is arbitrary, that the explanation
begs the question against nominalism, and that the mathematical facts in the
explanation play a representational rather than explanatory role (Melia 2000, 2002;
Leng 2005; Bangu 2008; Baker 2009; Daly and Langford 2009; Saatsi 2011, 2016,
2018; Koo 2015). Since my main interest here is to identify the nature of Baker’s
proposed explanation, I will focus on the last objection.
Daly and Langford (2009, p. 657) argue that
[i]t is this property of the cicadas’ life-cycle duration – this periodic intersection with the
life-cycles of certain predatory kinds of organism – that plays an explanatory role in why
their life-cycle has the duration it has. Primes have similar properties, and so successfully
index the duration. But then so also do various non-primes: measuring the life-cycle in
seasons produces analogous patterns.
That is, the idea is that the durations of the life-cycles of the cicada and of the
other relevant organisms in its environment (and the forces of evolution) explain
the cicada life-cycle duration, and the number of units these durations amount to
(relative to a given measuring system) only indexes the durations measured. Saatsi
(2011, 2016) expresses a similar objection, proposing that Baker’s explanation
could proceed with an alternative premise to (2) and (3):
(2/3) For periods in the range of 14–18 years the intersection-minimizing period is 17. [fact
about time]
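The intersection-minimizing arithmetic behind (2/3) can be checked directly. The sketch below scores each candidate life-cycle in the 14–18 year range by how often it co-emerges with shorter periodic cycles; the predator range of 2–12 years is an illustrative assumption of this sketch, not a figure from Baker's premises.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple: the co-emergence cycle (in years) of two
    periodic life-cycles of a and b years."""
    return a * b // gcd(a, b)

def intersection_frequency(period, other_periods):
    """Summed coincidence rate of a life-cycle of `period` years with
    each of the given shorter cycles: a p-year cicada and a q-year
    predator co-emerge once every lcm(p, q) years."""
    return sum(1 / lcm(period, q) for q in other_periods)

# Illustrative predator range (an assumption for this sketch; Baker's
# premises fix the relevant ranges empirically).
predators = range(2, 13)
scores = {p: intersection_frequency(p, predators) for p in range(14, 19)}
print(min(scores, key=scores.get))  # 17 -- the unique prime in 14-18
```

Since 17 is prime, lcm(17, q) = 17q for every shorter period q, so a 17-year cycle coincides with predators far less often than the composite alternatives 14, 15, 16, and 18, whose small common factors with shorter cycles produce frequent co-emergences.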
While Daly, Langford and Saatsi have nominalist sympathies, the objection that the
mathematical fact in Baker’s explanation plays only a representational role has also
been submitted by Brown (2012, p. 11), who advocates Platonism.
Baker (2017, p. 198) replies that the above nominalist perspective lacks scope
generality: it is inapplicable to other situations in which the ecological conditions
are different. As he points out, this is not only a hypothetical point, as there are
subspecies of the periodical cicada with 13-year life cycles. Indeed, (2/3) could be
generalized into a schema (Saatsi 2011, p. 152):
(2/3)* There is a unique intersection-minimizing period Tx for periods in the range [T1, . . . , T2] years.
Yet, Baker (ibid., p. 199) notes that while (2/3)* has a more general scope, the
explanation that this schema provides is less unified and has less depth than the
original one. Further, Baker argues that the nominalist perspective also lacks topic
generality (ibid., pp. 200–208). A mathematical explanation of physical facts has
a higher topic generality if the mathematical facts that are supposed to do the
explanation could apply to explanations of other topics. For example, Baker shows
that a revised version of the explanation in (1)–(5), which meets scope generality
and topic generality, could share the same core as an explanation of why in brakeless
fixed-gear bicycles the most popular gear ratios are 46/17, 48/17, 50/17, and 46/15.
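Baker's bicycle case turns on the same coprimality core. On a brakeless fixed-gear bicycle, skid-braking can wear the rear tyre only at a number of distinct "skid patches" equal to the reduced denominator of the gear ratio, so a prime cog such as 17 spreads the wear widely; the following sketch of that arithmetic is my illustration of the shared explanatory core, not Baker's own formulation.

```python
from math import gcd

def skid_patches(chainring, cog):
    """On a fixed gear the rear wheel can lock only at
    cog // gcd(chainring, cog) distinct positions relative to the
    pedals, so skid wear is spread over that many tyre patches."""
    return cog // gcd(chainring, cog)

for a, b in [(46, 17), (48, 17), (50, 17), (46, 15), (48, 16)]:
    print(f"{a}/{b}: {skid_patches(a, b)} skid patches")
# The popular ratios give 17 (or 15) patches; 48/16 reduces to 3/1 and
# gives a single patch, concentrating all the wear in one spot.
```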
Baker argues that: (a) for an explanation to have a high level of scope and topic
generality, mathematics is indispensable; (b) the interpretation of the mathematical
facts in a mathematical explanation of physical facts as representations of physical
facts limits the topic generality of the explanation; and (c) the optimal version of the
mathematical explanation of the cicada life-cycle “has an explanatory core that is
topic general, and is not ‘about’ any designated class of physical facts, such as facts
about time, or facts about durations” (ibid., p. 201). To establish (c), he proposes a
generalized explanation of the cicada life-cycle, which I discuss below.
I agree with Baker’s observation that the nominalist perspective restricts the
level of scope generality and topic generality of explanations. But I don’t think that
his response meets the objection that, in his account, mathematics only represents
the physical facts that actually do the explanation. In what follows, I will present
the challenge for Baker’s account, first considering the original explanation of the
periodical cicada life-cycle and then the generalized version.
Baker’s explanation in (1)–(5) is ambiguous because premise (2) is ambiguous.
It could be interpreted as:
(2i) a fact about time, i.e. a physical fact; or
(2ii) a theorem of number theory, i.e. a purely mathematical fact.
For the above D-N-like explanation to be valid, it requires (2i) rather than (2ii):
since (2ii) is a purely mathematical fact, (1) and (2ii) per se do not imply (3). Thus,
it may be argued that it is the physical fact about time in (2i) that explains why
the cicadas’ life-cycle is likely to be prime, and the mathematical fact in (2ii)
only represents this physical fact; and the impression that (1) and (2ii) entail (3)
is due to an equivocation between (2i) and (2ii). The upshot is that Baker fails
to demonstrate that the explanation of the periodical cicada life-cycle above is a
mathematical explanation of a physical fact. If we interpret premise (2) as a physical
fact about the nature of time, i.e. as (2i), the D-N-like explanation in (1)–(5) is a
physical explanation of a physical fact, and it is not clear what explanatory role the
mathematical fact about prime numbers plays. If, on the other hand, we interpret (2)
as a purely mathematical fact, i.e. as (2ii), the explanation becomes either invalid
or unclear: if the explanation is supposed to be a D-N-like explanation, where
the explanandum follows deductively from the explanans, it is invalid; and if the
explanation is not intended as a D-N-like explanation, it is not clear what kind of
explanation it is.
It may be tempting to reply that the above objection does not apply to Baker’s
generalized explanation of the cicada life-cycle, where the explanatory core is not
‘about’ any designated class of physical facts because of its topic generality. Yet, as
we shall see below, the generalized explanation is still subject to the same challenge.
146 J. Berkovitz
tems – all the physical systems that satisfy the requirement of topic generality under
consideration – and that the mathematical fact in (UC3) is only a representation
of this universal physical fact. Since (UC3) follows from (UC1) and (UC2) (and
(M1) and (M2)), for the explanation to be valid, (UC1) and (UC2) also have to be
interpreted as propositions about physical systems – again, all the physical systems
that satisfy the requirement of topic generality under consideration. Thus, as in the
original explanation, the generalized explanation of the periodical cicada life-cycle
is subject to the challenge that it is either a physical explanation of a physical fact,
invalid, or unclear.
The objection above can be divided into two related objections. (α) For the
conclusion (8G) to follow from the premises in the generalized explanation of
the cicada life-cycle, (UC1) – (UC3) have to be interpreted as propositions about
physical facts. Thus, the explanation is a physical explanation of a physical fact,
and it is not clear in what sense it is a mathematical explanation of a physical fact.
(β) As in the original explanation of the cicada life-cycle, it may be argued that
mathematics plays a representational rather than explanatory role: it represents the
physical facts that actually explain the cicada life-cycle.
The second objection is particularly compelling for those who assume that
the physical can ontologically be separated from the mathematical. It presup-
poses that the facts about physical co-occurrences can ontologically be separated
from the corresponding mathematical facts. Similarly, the objection to the orig-
inal explanation of the cicada life-cycle presupposes that the physical fact that
prime time periods minimize intersection with other (nearby/lower) periods can
ontologically be separated from the corresponding mathematical fact. But these
presuppositions are unwarranted for those who take the mathematical to constitute
the physical. The physical fact that prime time periods minimize intersection with
other (nearby/lower) periods is constituted by number theory, in general, and
its theorems about prime numbers, in particular. Accordingly, the physical fact
that prime time periods minimize intersection with other (nearby/lower) periods
cannot be separated from the corresponding mathematical fact. Likewise, facts
about physical co-occurrences are constituted by number theory, in general, and its
theorems about prime numbers, in particular. Thus, such facts cannot be separated
from the corresponding mathematical facts.
Given the mathematical constitution of the physical, Baker’s D-N-like account of
mathematical explanation of physical facts can be revised as follows to circumvent
the objections in (α) and (β). A mathematical explanation of physical facts is a phys-
ical explanation of physical facts along the D-N model in which the mathematical
constitution of some physical facts in the explanans is highlighted. The idea here
is that by highlighting the mathematical constitution of certain physical facts in
the explanans, the explanation deepens and expands the scope of the understanding
of the physical fact in the explanandum. Focusing on the original explanation of
the periodical cicada life-cycle, the revised account has two parts. First, it derives
deductively the likelihood of the prime-numbered-year life-cycle of the periodical
cicada from general biological facts, particular facts about the periodical cicada and
its eco-system, and facts about physical time. Second, it highlights the mathematical
Dorato and Felline (2011, p. 161) comment that the ongoing controversy concerning
the interpretation of the formalism of quantum mechanics may explain why
philosophers have often contrasted the poor explanatory power of quantum theory
with its unparalleled predictive capacity. Yet, they claim, “quantum theory provides
a kind of mathematical explanation of the physical phenomena it is about”, which
they refer to as structural explanation. To demonstrate their claim, they present two
case studies: one involves the quantum uncertainty relations between position and
momentum, and the other focuses on quantum nonlocality.
Following Clifton (1998, p. 7),6 Dorato and Felline (2011, p. 163) hold that
we explain some feature B of the physical world by displaying a mathematical model of
part of the world and demonstrating that there is a feature A of the model that corresponds
to B, and is not explicit in the definition of the model.
The idea here is that the explanandum B is made intelligible via its structural
similarities with its formal representative, the explanans A. How do these structural
similarities render the explanandum intelligible? Dorato and Felline propose that
“in order for such a representational relation to also be sufficient for a structural
explanation, . . . we have to accept the idea that we understand the physical
phenomenon in terms of its formal representative, by locating the latter in the
appropriate model” (ibid., p. 165).
Consider, for example, a structural explanation of why position and momentum
cannot assume simultaneously definite values. In standard quantum mechanics, sys-
tems’ position and momentum are related through Fourier transforms, and Dorato
and Felline locate the explanans in the mathematical properties of these Fourier
transforms. The formal representation of the momentum (position) of a particle
is a Fourier transform of the formal representation of its position (momentum),
and Dorato and Felline propose that these Fourier transforms are required to
make intelligible the uncertainty relations between position and momentum. These
transformations are supposed to constitute the answer to the question “why do
position and momentum not assume simultaneously sharp magnitudes?”: “because
their formal representatives in the mathematical model have a property that makes
this impossible” (ibid., p. 166).
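The Fourier-transform relation that Dorato and Felline appeal to can be stated explicitly. In standard textbook notation (not their own), the momentum-space wave function is the Fourier transform of the position-space wave function:

```latex
\tilde{\psi}(p) \;=\; \frac{1}{\sqrt{2\pi\hbar}}
\int_{-\infty}^{\infty} \psi(x)\, e^{-ipx/\hbar}\, \mathrm{d}x .
```

The bandwidth theorem for Fourier pairs then entails that the spreads of $\psi$ and $\tilde{\psi}$ cannot both be made arbitrarily small, which is the property of the formal representatives underwriting $\Delta x \, \Delta p \geq \hbar/2$.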
Since ‘because’ here is not supposed to have a causal interpretation, the
question arises as to its exact meaning. Dorato and Felline do not address this
question explicitly. Their structural account of explanation seems to be based
on the following key ideas: (i) A structural explanation makes the explanandum
intelligible; (ii) the assumption that the properties of a physical system exemplify
the relevant parts of the mathematical model that represents it allows one to use the
properties of the latter to make intelligible the properties of the former; (iii) there
is a structure-preserving morphism from the representing mathematical model to
the represented physical fact, and this relation ensures that the represented fact can
be made intelligible by locating its representative in the mathematical model. Yet,
these ideas leave the logical structure of the explanation unclear. In particular, the
extent to which the structural explanation is different from mere representation is
unclear. Indeed, as Dorato and Felline acknowledge, one may raise the objection
that the proposed explanation is a mere translation or redescription of the physical
explanation to be given (ibid., p. 166). They concede that, in a sense, the physical
properties carry the explanatory burden. They consider, for example, a balance with
eight identical apples, five on one pan and three on the other.
6 Clifton (1998) appropriates, with minor modifications, a definition of explanation given by Hughes (1993).
If someone explained the dropping of the pan with five apples (or the rising of the side with
three) by simply saying “5 > 3”, he/she would not have provided a genuine explanation. The
side with five apples drops because it is heavier and because of the role that its gravitational
mass has vis-à-vis the earth, not because 5 > 3!
In reply, Dorato and Felline claim that “structural explanations are not so easily
translatable into non-mathematical terms without loss of explicative power” (ibid.,
p. 167). The problem with this reply is that it does not really address the above
objection. The question under consideration is not whether to replace mathematical
explanations with non-mathematical ones. Rather, it is a question about the nature
of the mathematical explanation on offer. In particular, it is not clear in what sense
the proposed structural explanation is a mathematical explanation of physical facts
rather than a physical explanation of physical facts in which mathematics only plays
a representational role. It is not clear, for example, that, by pointing to Fourier
transforms between the functions that represent position and momentum, we are
providing a mathematical explanation of a physical fact, and not merely a physical
explanation of a physical fact in which mathematics plays a representational role.
Another aspect that is unclear in Dorato and Felline’s structural explanation is
related to the question whether the proposed pattern of explanation requires an
interpretation of the representing mathematical model. Dorato and Felline conceive
the structural explanation as providing “a common ground for understanding
[the uncertainty relations between position and momentum], independently of the
various different ontologies underlying the different interpretations of quantum
theory” (ibid., p. 165). The question arises then: How does this conception
square with the fact that their structural explanation relies on the idea that for
a mathematical model to explain properties of a physical system the physical
system has to exemplify the relevant parts of the model? The problem here is
that the quantum-mechanical formalism represents different things under different
interpretations. Consider, for example, Bohmian mechanics (Goldstein 2017 and
references therein). The functions that represent systems’ position and momentum
in the standard interpretation of the quantum-mechanical formalism do not
represent their position and momentum in Bohmian mechanics. Under this alternative
interpretation, quantum systems always have definite position and momentum, and
systems’ position and momentum do not exemplify the Fourier transform relations.
Thus, the explanation that the quantum-mechanical formal representatives – the
Fourier transforms between position and momentum – have a property that makes
it impossible for systems to have simultaneously definite position and momentum
is not correct in this case. The functions that represent systems’ position and
momentum in the standard interpretation represent in Bohmian mechanics the
range of possible outcomes of measurements of positions and momentum and their
probabilities; and the Fourier transform relations between these functions reflect
the epistemic limitation on the knowledge of systems’ position and momentum.
The upshot is that Dorato and Felline’s structural explanation cannot circumvent the
question of the interpretation of the quantum-mechanical formalism.
Given the mathematical constitution of the physical, Dorato and Felline’s structural
account of explanation can be revised along the lines proposed in Sect. 8 so as
to meet the above challenges. The main idea of the revised account is that a reference
to the mathematical representatives of physical facts is explanatory if it highlights
the mathematical constitution of these facts and thus makes them intelligible. For
example, the structural explanation of why, in standard quantum mechanics, it is
impossible for position and momentum to assume simultaneously definite values
highlights the mathematical constitution of the relations between position and
momentum and thus makes the explanandum intelligible. In Bohmian mechanics,
a reference to the mathematical representatives of position and momentum in the
quantum-mechanical formalism (i.e. to the position and momentum ‘observables’)
also highlights the mathematical constitution of physical facts, but the physical facts
are not the same as in standard quantum mechanics. In Bohmian mechanics, the
highlighting is of the mathematical constitution of the limitation on simultaneous
measurements, and accordingly knowledge, of a system’s position and momentum
at any given time.7
conclusions about minimal tours across bridge systems can be reformulated as the
following theorem in graph theory.8
Euler’s theorem: A connected graph is Eulerian if and only if it contains no odd vertices; where a
vertex is called ‘odd’ just in case it has an odd number of edges leading to it.
8 Since graph theory did not exist at the time, this is obviously an anachronistic account of Euler’s
reasoning. That is not a problem for the current discussion, as the aim is not to reconstruct Euler’s
own analysis but rather to consider Pincock’s account of mathematical explanation of physical
facts. For Euler’s reasoning, see Euler (1736/1956), and for a reconstruction of it, see, for example,
Hopkins and Wilson (2004).
graph not being Eulerian. Similarly, the impossibility of a minimal tour across
this bridge system is abstractly dependent on the corresponding graph having odd
vertices.
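Euler's criterion can be checked mechanically for the Königsberg system. The sketch below encodes the seven bridges as a multigraph over the four land masses (the labels A–D are a conventional modern choice, not Euler's) and counts the odd vertices:

```python
from collections import Counter

# The seven bridges of Königsberg as a multigraph over the four land
# masses: A = Kneiphof island, B and C = the two river banks, D = the
# eastern land mass.
bridges = [
    ("A", "B"), ("A", "B"),  # two bridges: island <-> north bank
    ("A", "C"), ("A", "C"),  # two bridges: island <-> south bank
    ("A", "D"),              # one bridge:  island <-> east
    ("B", "D"), ("C", "D"),  # one bridge from each bank to the east
]

degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

odd = [v for v, d in sorted(degree.items()) if d % 2 == 1]
print(dict(sorted(degree.items())))  # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(odd)  # ['A', 'B', 'C', 'D']: four odd vertices, so not Eulerian
```

Since all four vertices are odd, the graph has neither an Eulerian circuit nor an Eulerian path (which would require at most two odd vertices), so no walk crosses each bridge exactly once.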
Pincock’s motivation for an abstract explanation of the impossibility of a minimal
walk across the Königsberg bridge system is that it seems superior because it gets to
the root cause of why such a walk is impossible by focusing on the abstract structure
of this system. More generally, abstracting away from various physical facts, such
as the bridge materials and dimensions, size of the islands and river banks, etc.,
scientists can often give better explanations of features of physical systems (Pincock
2007, p. 260).
Since abstract dependence is supposed to be objective but not causal, the question
arises as to what kind of dependence it is. Pincock considers a few candidates. First,
Pincock (2015) discusses Woodward’s (2003, p. 221) proposal that
the common element in many forms of explanation, both causal and noncausal, is that they
must answer what-if-things-had-been-different questions. . . . When a theory or derivation
answers a what-if-things-had-been-different question but we cannot interpret this as an
answer to a question about what would happen under an intervention, we may have a
noncausal explanation of some sort.
In the case of abstract explanation, the question that is answered concerns “the
systematic relationship between the more abstract objects and their properties [the
explanans], and the more concrete objects and their properties [the explanandum].”
Pincock comments that Woodward’s proposal could be applicable to abstract
explanations “if the sort of explanatory question here could be suitably clarified”
(ibid., p. 869).
Next, Pincock (ibid., pp. 869–871) discusses Strevens’ (2008) Kairetic account
of explanation. He notes that the Kairetic account “allows important explanatory
contributions from more abstract entities like mathematical objects” (ibid., p. 869).
He considers whether this account – which, like Woodward’s account, is based on
the idea of difference making, and moreover aims to reduce all explanations to
cases, the entities that we appeal to in the explanation are constituents of the fact to be
explained. (Ibid., p. 879)
9 Vineberg (2018) suggests that the objection above does not apply to the structuralist understanding
of mathematics. It is not clear, however, how an appeal to this conception of mathematics could
help here. One may accept the structuralist rejection of mathematical objects yet argue that
mathematical structures only represent the physical structures which actually do the explanation.
For example, the fact that a mother fails repeatedly to distribute evenly 23
strawberries among her 3 children without cutting any strawberry is explained by
the mathematical fact that 23 cannot be divided evenly by 3 (ibid., p. 6). The
explanation shows that the mother’s success is impossible in a stronger sense than
causal considerations underwrite (ibid., p. 30). The explanans in this case consists
of the mathematical fact that 23 cannot be divided evenly by 3 and the contingent
facts presupposed by the why-question under consideration: namely, that there are
23 strawberries and 3 children, and the distribution is of uncut strawberries. And
the explanation shows that, under such contingent conditions, an even distribution
of the 23 strawberries among the 3 children is impossible, where the impossibility
is stronger than natural impossibility.
Saatsi (2018) poses two challenges for Lange’s account of explanation. One
challenge is that “information about the strong degree of necessity involved
[in explanation by constraint] risks being too cheap: the exalted modal aspect
Lange (2018, p. 35) admits that “perhaps Saatsi is correct that the answer ‘Because
1+1=2’ is ‘utterly shallow’.” But he suggests that
that impression may arise from everyone’s knowing that 1+1=2 and that this fact suffices
to necessitate the explanandum, so it is difficult to see what information someone asking
Saatsi’s question might want. Furthermore, “Because 1+1=2” may not explain A+B’s
mass at all, if A and B can chemically interact when “united”. (Ibid.)
(i) just represents the physical fact in (ii). Put another way, one may argue that
the intuitive appeal of Lange’s distinctively mathematical explanation is due to the
above equivocation, taking the explanation to be a physical explanation of a physical
fact – it is an explanation of a particular physical fact in terms of a corresponding
universal physical fact – in which the mathematical fact 1 + 1 = 2 only plays a
representational role.
The above objection is particularly compelling for those who maintain that the
physical is ontologically separated from the mathematical. It can be circumvented
by acknowledging the mathematical constitution of the physical. Distinctively math-
ematical explanations of physical facts could then be understood as explanations
of physical facts in which the mathematical constitution of some physical facts is
highlighted by stating the constraints that this constitution mandates. For example,
in explaining “why the union A+B has the mass of 2 kg” by pointing out that
“1 + 1 = 2”, one appeals to the universal physical fact that for any two possible
physical objects of 1 kg, A and B, which are ‘united’ without being changed, the
total mass of A + B is 2 kg. Yet, one observes that this universal physical fact is
constituted by number theory, in general, and a mathematical fact that follows from
it, 1 + 1 = 2, in particular. This constitution is highlighted by stating the constraint
1+1=2 that it mandates. In this kind of explanation, there is no need to appeal to
counterfactual dependence. Indeed, it is not clear how counterfactual dependence
could be of any help here.
and is then picked up by physicists to yield great success. It may thus be natural
to see the success of the application of the language of mathematics in physics as
surprising, mysterious, and even unreasonable. However, Wigner’s conception is
controversial (see, for example, Lützen 2011, Ferreirós 2017, Islami 2017, and ref-
erences therein). Some have argued that while Wigner’s view of mathematics seems
to have been inspired by the formalist philosophy of mathematics, this philosophy
had lost its credibility by the second half of the twentieth century (Lützen 2011,
Ferreirós 2017). Further, historically, the development of mathematics has been
entangled with that of the natural sciences, in general, and physics, in particular, and
Wigner’s conception of mathematics fails to reflect this fact. Consider, for example,
complex numbers. Wigner singles them out as one of the prime examples of the most
advanced mathematical concepts that mathematicians invented for the sole purpose
of demonstrating their ingenuity and sense of formal beauty without any regard
to possible applications. Indeed, it seems that complex numbers were introduced
by Scipione del Ferro, Niccolò Fontana (also known as Tartaglia, the stammerer),
Gerolamo Cardano, Ludovico Ferrari, and Rafael Bombelli in the sixteenth century
with no intent to apply them. Yet, the context of the introduction was the
attempt to solve the general forms of quadratic and cubic equations, which by that
time had a long history of applications (Katz 2009). Thus, taking into account this
broader context, the claim that the introduction of complex numbers was solely for
the purpose of demonstrating ingenuity and sense of formal beauty is misleading.
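A classic episode illustrates this broader context. In Bombelli's treatment of the cubic (rendered here in modern notation), complex numbers appear as intermediate quantities in computing a perfectly real root:

```latex
x^{3} = 15x + 4
\quad\Longrightarrow\quad
x = \sqrt[3]{\,2 + \sqrt{-121}\,} + \sqrt[3]{\,2 - \sqrt{-121}\,} .
```

Since $(2 \pm i)^{3} = 2 \pm 11i$, the two cube roots are $2 + i$ and $2 - i$, and their sum recovers the real root $x = 4$. Complex numbers thus entered as computational way-stations in an applied problem of equation-solving, not as free-floating inventions.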
That is not to argue that all the developments in mathematics were connected
to physics, nor to deny that in many cases future applications of mathematical
concepts could not be anticipated at the time of their introduction. Yet, attention to
the historical entanglement between the developments of mathematics and physics
casts doubt on the validity and scope of Wigner’s argument for the unreasonable
success of the application of mathematics in physics.
In any case, while Wigner’s view of mathematics may provide some support for
the view that the success of the application of mathematics in physics is unrea-
sonable, things are very different if we conceive the mathematical as constitutive
of the physical. In the context of such a conception, the relationship between
mathematics and physics is intrinsic. The physical is characterized in mathematical
terms. The physicist’s crude experience is formulated in precise mathematical terms,
often in statistical models of the phenomena, and accordingly the gap between the
phenomena and the theoretical models that account for them diminishes. There is
no essential gap between the concepts of elementary mathematics and geometry
and more advanced mathematical concepts. The division between applied and
theoretical mathematics is neither a priori nor fundamental. And the appropriateness
of the use of the language of mathematics in future physics is not in doubt. Thus, it is
reasonable to expect the language of mathematics to be appropriate for representing
physical reality.
Of course, such a conception of the role of mathematics in physics does not
obliterate the sense of wonder that one has with respect to the fact that “in spite
of the baffling complexity of the world, certain regularities in the events could
be discovered” (Wigner, ibid., p. 4). Yet, the reasons to conceive this wonder in
miraculous terms are less compelling than they are for those who deny the mathematical
constitution of physical facts.
Acknowledgments I owe a great debt to Itamar Pitowsky. Itamar’s graduate course in the
philosophy of probability stimulated my interest in the philosophical foundations of probability
and quantum mechanics. Itamar supervised my course essay and MA thesis on the application of
de Finetti’s theory of probability to the interpretation of quantum probabilities. During my work
on this research project, I learned from Itamar a great deal about the curious nature of probabilities
in quantum mechanics. Itamar was very generous with his time and the discussions with him were
always enlightening. Our conversations continued for many years to come, and likewise they were
always helpful in clarifying and developing my thoughts and ideas. Itamar’s untimely death has
been a great loss. Whenever I have an idea I would like to test, I think of him and wish we could
talk about it. I sorely miss the meetings with him and I wish I could discuss with him the questions
and ideas considered above.
I am very grateful to the volume editors, Meir Hemmo and Orly Shenker, for inviting me
to contribute, and to Meir for drawing my attention to Itamar’s commentary on the relationship
between mathematics and physics. The main ideas of the proposed account of mathematical
explanations of physical facts were first presented in the workshop on Mathematical and Geo-
metrical Explanations at the Universitat Autònoma de Barcelona (March 2012), and I thank Laura
Felline for inviting me to participate in the workshop. Earlier versions of this paper were also
presented in the 39th, 40th, 43rd, and 46th Dubrovnik Philosophy of Science conferences, IHPST,
Paris, IHPST, University of Toronto, Philosophy, Università degli Studi Roma Tre, Philosophy,
Università degli Studi di Firenze, CSHPS, Victoria, CPNSS, LSE, Philosophy, Leibniz Universität
Hannover, ISHPS, Jerusalem, Munich Center for Mathematical Philosophy, LMU, Faculty of
Sciences, Universidade de Lisboa. I would like to thank the audiences in these venues for their
helpful comments. For discussions and comments on earlier drafts of the paper, I am very grateful
to Jim Brown, Donald Gillies, Laura Felline, Craig Fraser, Aaron Kenna, Flavia Padovani, Noah
Stemeroff, and an anonymous referee. The research for this paper was supported by SSHRC Insight
and SIG grants as well as Victoria College travel grants.
References
Aristotle. (1924). Metaphysics. A revised text with introduction and commentary by W. D. Ross (2
vols). Oxford: Clarendon Press.
Bachelard, G. (1965). L’ activité rationaliste de la physique contemporaine. Paris: Presses
Universitaires de France.
Baker, A. (2005). Are there genuine mathematical explanations of physical phenomena? Mind,
114(454), 223–238.
Baker, A. (2009). Mathematical explanations in science. British Journal for the Philosophy of
Science, 60(3), 611–633.
Baker, A. (2017). Mathematics and explanatory generality. Philosophia Mathematica, 25(2), 194–
209.
Bangu, S. (2008). Inference to the best explanation and mathematical explanation. Synthese,
160(1), 13–20.
Baron, S., Colyvan, M., & Ripley, D. (2017). How mathematics can make a difference. Philoso-
phers’ Imprint, 17(3), 1–19.
Batterman, R. (2002). Asymptotics and the role of minimal models. British Journal for the
Philosophy of Science, 53(1), 21–38.
Batterman, R. (2010). On the explanatory role of mathematics in empirical science. British Journal
for the Philosophy of Science, 61(1), 1–25.
162 J. Berkovitz
Galileo, G. (1623/1960). The Assayer. In the controversy of the comets of 1618: Galileo Galilei,
Horatio Grassi, Mario Guiducci, Johann Kepler (S. Drake, & C. D. O’Malley, Trans.)
Philadelphia: The University of Pennsylvania Press.
Goldstein, S. (2017). Bohmian mechanics. The Stanford Encyclopedia of Philosophy (Summer
2017 Edition), E. N. Zalta (Ed.). URL=https://plato.stanford.edu/archives/sum2017/entries/
qm-Bohm/
Hopkins, B., & Wilson, R. J. (2004). The truth about Königsberg. The College Mathematics
Journal, 35(3), 198–207.
Huggett, N. (Ed.). (1999). Space from Zeno to Einstein: Classic readings with a contemporary
commentary. Cambridge, MA: MIT Press.
Huggett, N. (2019). Zeno’s paradoxes. The Stanford Encyclopedia of Philosophy (Winter 2019
Edition), E. N. Zalta (Ed.). URL= https://plato.stanford.edu/archives/win2019/entries/paradox-
zeno/
Hughes, R. I. G. (1993). Theoretical explanation. Midwest Studies in Philosophy, XVIII, 132–153.
Islami, A. (2017). A match not made in heaven: On the applicability of mathematics in physics.
Synthese, 194(12), 4839–4861.
Jansson, L., & Saatsi, J. (2019). Explanatory abstraction. British Journal for the Philosophy of
Science, 70(3), 817–844.
Katz, V. J. (2009). A history of mathematics: An introduction (3rd ed.). Reading: Addison-Wesley.
Koo, A. (2015). Mathematical explanation in science. PhD thesis, IHPST, University of Toronto.
Koslicki, K. (2012). Varieties of ontological dependence. In F. Correia & B. Schnieder (Eds.),
Metaphysical grounding: Understanding the structure of reality (pp. 186–213). Cambridge:
Cambridge University Press.
Kowalewski, G. (1923). Über Bolzanos nichtdifferenzierbare stetige Funktion. Acta Mathematica,
44, 315–319.
Lange, M. (2016). Because without cause: Non-causal explanations in science and mathematics.
Oxford: Oxford University Press.
Lange, M. (2018). Reply to my critics: On explanations by constraint. Metascience, 27(1), 27–36.
Lazarovici, D., Oldofredi, A., & Esfeld, M. (2018). Observables and unobservables in quantum
mechanics: How the no-hidden-variables theorems support the Bohmian particle ontology.
Entropy, 20(5), 116–132.
Leng, M. (2005). Mathematical explanation. In C. Cellucci & D. Gillies (Eds.), Mathematical
reasoning and heuristics (pp. 167–189). London: King’s College Publishing.
Lévy-Leblond, J.-M. (1992). Why does physics need mathematics? In E. Ullmann-Margalit (Ed.),
The scientific enterprise (Boston Studies in the Philosophy of Science) (Vol. 146, pp. 145–161).
Dordrecht: Springer.
Lützen, J. (2011). The physical origin of physically useful mathematics. Interdisciplinary Science
Reviews, 36(3), 229–243.
Lyon, A. (2012). Mathematical explanations of empirical facts, and mathematical realism.
Australasian Journal of Philosophy, 90(3), 559–578.
Mancosu, P. (2018). Explanation in Mathematics. The Stanford Encyclopedia of Philosophy
(Summer 2018 Edition), E. N. Zalta (Ed.). URL=https://plato.stanford.edu/archives/sum2018/
entries/mathematics-explanation/
Mandelbrot, B. (1977). Fractals, form, chance and dimension. San Francisco: W.H. Freeman.
Melia, J. (2000). Weaseling away the indispensability argument. Mind, 109(435), 455–479.
Melia, J. (2002). Response to Colyvan. Mind, 111(441), 75–79.
Mundy, B. (1986). On the general theory of meaningful representation. Synthese, 67(3), 391–437.
Neuenschwander, E. (1978). Riemann’s example of a continuous “nondifferentiable” function.
Mathematical Intelligencer, 1, 40–44.
Pincock, C. (2004). A new perspective on the problem of applying mathematics. Philosophia
Mathematica, 12(2), 135–161.
Pincock, C. (2007). A role for mathematics in the physical sciences. Noûs, 41(2), 253–275.
Pincock, C. (2011a). Abstract explanation and difference making. A colloquium presentation in
the Munich Center for Mathematical Philosophy, LMU, December 12, 2011.
Pincock, C. (2011b). Discussion note: Batterman’s “On the explanatory role of mathematics in
empirical science”. British Journal for the Philosophy of Science, 62(1), 211–217.
Pincock, C. (2012). Mathematics and scientific representation. Oxford: Oxford University Press.
Pincock, C. (2015). Abstract explanations in science. British Journal for the Philosophy of Science,
66(4), 857–882.
Pitowsky, I. (1992). Why does physics need mathematics? A comment. In E. Ullmann-Margalit
(Ed.), The scientific enterprise (Boston Studies in the Philosophy of Science) (Vol. 146, pp.
163–167). Dordrecht: Springer.
Poincaré, H. (1913). La valeur de la science. Geneva: Editions du Cheval Ailé.
Redhead, M. (2001). Quests of a realist: Review of Stathis Psillos’s Scientific Realism: How
science tracks truth. Metascience, 10(3), 341–347.
Resnik, M. D. (1997). Mathematics as a science of patterns. Oxford: Clarendon Press.
Saatsi, J. (2011). The enhanced indispensability argument: Representational versus explanatory
role of mathematics in science. British Journal for the Philosophy of Science, 62(1), 143–154.
Saatsi, J. (2016). On the “indispensable explanatory role” of mathematics. Mind, 125(500), 1045–
1070.
Saatsi, J. (2018). A pluralist account of non-causal explanations in science and mathematics:
Review of M. Lange’s Because without cause: Non-causal explanation in science and
mathematics. Metascience, 27(1), 3–9.
Shapiro, S. (1997). Philosophy of mathematics: Structure and ontology. Oxford: Oxford University
Press.
Steiner, M. (1978). Mathematical explanations. Philosophical Studies, 34(2), 135–151.
Steiner, M. (1998). The applicability of mathematics as a philosophical problem. Cambridge, MA:
Harvard University Press.
Stemeroff, N. (2018). Mathematics, structuralism, and the promise of realism: A study of the
ontological and epistemological implications of mathematical representation in the physical
sciences. A PhD thesis submitted to the University of Toronto.
Strevens, M. (2008). Depth: An account of scientific explanation. Cambridge, MA: Harvard
University Press.
Vineberg, S. (2018). Mathematical explanation and indispensability. Theoria, 33(2), 233–247.
Weierstrass, K. (1895). Über continuirliche functionen eines reellen arguments, die für keinen
werth des letzteren einen bestimmten differentialquotienten besitzen (read 1872). In Mathema-
tische werke von Karl Weierstrass (Vol. 2, pp. 71–76). Berlin.
Wiener, N. (1923). Differential space. Journal of Mathematical Physics, 2, 131–174.
Wigner, E. (1960). The unreasonable effectiveness of mathematics in the natural sciences.
Communications on Pure and Applied Mathematics, 13(1), 1–14.
Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford: Oxford
University Press.
Chapter 7
Everettian Probabilities, The
Deutsch-Wallace Theorem and the
Principal Principle
Abstract This paper is concerned with the nature of probability in physics, and in
quantum mechanics in particular. It starts with a brief discussion of the evolution
of Itamar Pitowsky’s thinking about probability in quantum theory from 1994
to 2008, and the role of Gleason’s 1957 theorem in his derivation of the Born
Rule. Pitowsky’s defence of probability therein as a logic of partial belief leads
us into a broader discussion of probability in physics, in which the existence
of objective “chances” is questioned, and the status of David Lewis’s influential
Principal Principle is critically examined. This is followed by a sketch of the work
by David Deutsch and David Wallace which resulted in the Deutsch-Wallace (DW)
theorem in Everettian quantum mechanics. It is noteworthy that the authors of
this important decision-theoretic derivation of the Born Rule have different views
concerning the meaning of probability. The theorem, which was the subject of a
2007 critique by Meir Hemmo and Pitowsky, is critically examined, along with
recent related work by John Earman. Here our main argument is that the DW
theorem does not provide a justification of the Principal Principle, contrary to the
claims by Wallace and Simon Saunders. A final section analyses recent claims to the
effect that the DW theorem is redundant, a conclusion that seems to be reinforced
by consideration of probabilities in “deviant” branches of the Everettian multiverse.

H. R. Brown
Faculty of Philosophy, University of Oxford, Oxford, UK
e-mail: harvey.brown@philosophy.ox.ac.uk

G. Ben Porath
Department of History and Philosophy of Science, University of Pittsburgh, Pittsburgh, PA, USA
7.1 Introduction
Itamar Pitowsky was convinced that the heart of the quantum revolution concerned
the role of probability in physics, and in 2008 defended the notion that quantum
mechanics is essentially a new theory of probability in which the Hilbert space for-
malism is a logic of partial belief (Sect. 7.2 below). In the present paper we likewise
defend a subjectivist interpretation of probability in physics, but not restricted to
quantum theory (Sect. 7.3). We then turn (Sect. 7.4) to the important work by David
Deutsch and David Wallace over the past two decades that has resulted in the so-
called Deutsch-Wallace (DW) theorem, providing a decision-theoretic derivation
of the Born Rule in the context of Everettian quantum mechanics. This derivation
is arguably stronger than that proposed by Pitowsky in 2008 based on the 1957
Gleason theorem and further developed by John Earman, but it is noteworthy that
Deutsch and Wallace have differing views on the meaning of probability. In Sect. 7.5
we attempt to rebut the claim by Wallace and Simon Saunders – not Deutsch
– that the DW theorem provides a justification of Lewis’ influential Principal
Principle. Finally in Sect. 7.6 we analyse the 2007 critique of the DW theorem
by Meir Hemmo and Pitowsky as well as Wallace’s reply. We also consider recent
claims to the effect that the theorem is superfluous, a conclusion that seems to
be reinforced by consideration of probabilities in “deviant” branches within the
Everettian multiverse.
. . . the difference between classical and quantum phenomena is that relative frequencies of
microscopic events, which are measured on distinct samples, often systematically violate
some of Boole’s conditions of possible experience.1
In the same paper, Pitowsky argued that quantum mechanics is essentially a new
theory of probability, in which the Hilbert space formalism is a “logic of partial
belief” in the sense of Frank Ramsey.3 Pitowsky now questioned Boole’s view of
probabilities as weighted averages of truth values, which leads to the erroneous
“metaphysical assumption” that incompatible (in the quantum sense) propositions
have simultaneous truth values. Whereas the 1994 paper expressed puzzlement
over the quantum violations of Boole’s conditions of possible experience by way
4 We will not discuss here Pitowsky’s 2006 resolution of the familiar measurement problem, other
than to say that it relies heavily on his view that the quantum state is nothing more than a device
for the bookkeeping of probabilities, and that it implies that “we cannot consistently maintain that
the proposition ‘the [Schrödinger] cat is alive’ has a truth value”. Op. cit., p. 28. For a recent defence
of the view that the quantum state is ontic, and not just a bookkeeping device, see Brown (2019).
5 The identity is
$$\bigcup_{i=1}^{k}\left(\{B = b_j\} \cap \{A = a_i\}\right) \;=\; \{B = b_j\} \;=\; \bigcup_{i=1}^{l}\left(\{B = b_j\} \cap \{C = c_i\}\right) \qquad (7.1)$$
The argument is, to us, unconvincing. Pitowsky accepted that the non-
contextualism issue is an empirical one, but seems to have resolved it by fiat. (It
is noteworthy that, in his opinion, it is because they do not commit to the lattice of subspaces
as the event structure that advocates of the (Everett) many worlds interpretation
require a separate justification of probabilistic non-contextualism.6) Is making such
a commitment any more natural in this context than was, say, Kochen and Specker’s
ill-fated presumption (Kochen and Specker 1967) that truth values (probabilities
0 or 1) associated with propositions regarding the values of observables in a
deterministic hidden variable theory must be non-contextual?7 Arguably, the fact
that Pitowsky’s interpretation of quantum mechanics makes no appeal to such
theories – which must assign contextual values under pain of contradiction with the
Hilbert space structure of quantum states – may weaken the force of this question.
We shall return to this point in Sect. 7.4.3 below when discussing the extent to
which probabilistic non-contextualism avoids axiomatic status in the Deutsch-
Wallace theorem in the Everett picture.
John Earman (Earman 2018) is one among other commentators who also regard
Gleason’s theorem as the essential link in quantum mechanics between subjective
probabilities and the Born Rule. But he questions Pitowsky’s claim that in the light
of Gleason’s theorem the event structure dictates the quantum probability rule when
he reminds us that Gleason showed only that, for D > 2 (with H separable when
D = ∞), a probability measure on the lattice of subspaces is
represented uniquely by a density operator on H iff it is countably additive (Earman
2018). And countable additivity does not hold in all subjective interpretations
of probability. (de Finetti, for instance, famously restricted probability to finite
additivity.)
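In symbols (our gloss of the standard statement, not a quotation from Earman or Pitowsky): for dim H ≥ 3, with H separable in the infinite-dimensional case, Gleason’s theorem says that every countably additive probability measure μ on the lattice of subspaces (equivalently, on the projections P) has the form

```latex
\mu(P) \;=\; \operatorname{Tr}(\rho\, P), \qquad \rho \ge 0,\ \operatorname{Tr}\rho = 1,
```

for a unique density operator ρ on H; for a pure state ρ = |ψ⟩⟨ψ| and a one-dimensional projector P = |φ⟩⟨φ| this is just the Born Rule, μ(P) = |⟨φ|ψ⟩|².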
In our view, the two main limitations of such Gleason-based derivations of the
Born Rule are the assumption of non-contextualism and the awkward fact that the
Gleason theorem fails in the case of D < 3, and thus can have no probabilistic
implications for qubits.8 One might also question whether probabilities in quantum
mechanics need be subjective. Indeed, Earman holds a dualist view, involving
objective probabilities as well (and which, as we shall see later, have a separate
underpinning in quantum mechanics).
The possibility of contextual probabilities admittedly raises the spectre of a
violation of the no-signalling principle, analogously to the way the (compulsory)
contextuality in hidden variable theories leads to nonlocality.9 It is noteworthy that
assuming no superluminal signalling and the projection postulate (which Pitowsky
POMs (positive operator-valued measures) rather than PVMs (projection-valued measures). More
recent, stronger Gleason-type results are discussed in Wright and Weigert (2019).
9 See for example Brown and Svetlichny (1990).
7.3.1 Chances
10 See Svetlichny (1998). For more recent derivations of the Born Rule based on no-signalling, see
Barnum (2003), and McQueen and Vaidman (2019). A useful critical review of derivations of the
Born Rule – including those of Deutsch and Wallace (see below) – is found in Vaidman (2020).
11 The fact that it is the same parameter in each case makes the phenomenon even more remarkable.
12 There are good reasons, however, for considering the source of the unpredictability in some if
not all classical chance processes as having a quantum origin too; see Albrecht and Phillips (2014).
But whether strict indeterminism is actually in play in the quantum realm is far from clear, as we
see below.
13 Gillies (2000), pp. 177, 178.
14 Gillies (1972), pp. 150, 151.
15 Maudlin (2007).
16 Wallace (2012), p. 137.
In fact, Wallace goes so far as to say that denial of the existence of objective
probabilities is incompatible with a realist stance in the philosophy of science,
unless one entertains a “radical revision of our extant science”.17 At any rate,
Wallace is far from unique in denying that such probabilities can be defined in
terms of frequencies (finite or otherwise),18 so it seems to follow that the quantitative
element of reality he invokes is distinct from the presumably uncontentious,
qualitative one related to stable frequencies, referred to at the beginning of this
section.
Many philosophers have tried to elucidate what this element of reality is; the
range of ideas, which we will not attempt to itemize here, runs from the more-or-less
intuitive notion of “propensity” to the abstract “best system” approach to extracting
probabilities from frequencies in the Humean mosaic (or universal landscape of
events in physics, past, present and future).19 The plethora of interpretations on the
part of philosophers has a curious feature: the existence of chance is widely assumed
before clarification is achieved as to what it is! And the presumption is based on
physics, or what is taken to be the lesson of physics.20
Until relatively recently, the half-life (that is, ln 2 ≈ 0.693 times the mean lifetime) of 14C (which
by definition is a probabilistic notion21) was anomalous from a theoretical point
17 Op. cit., p. 138. It should be noted, however, that although Wallace claims that certain
probabilities in statistical mechanics must be objective, he accepts that such a claim is hard to
make sense of without bringing in quantum mechanics (see Wallace 2014). For further discussion
of Wallace’s views on probability in physics, see (Brown, 2017).
18 For discussion of the problems associated with such a definition, see, e.g., Saunders (2005),
Greaves and Myrvold (2010), Wallace (2012), p. 247, and Myrvold, W. C., Beyond Chance and
Credence [unpublished manuscript], Sect. 3.2.
19 This last approach is due to David Lewis (1980); an interesting analysis within the Everettian
picture of a major difficulty in the best system approach, namely “undermining”, is found in
Saunders (2020). It should also be recognised that the best system approach involves a judicious
amalgam of criteria we impose on our description of the world, such as simplicity and strength;
the physical laws and probabilities in them are not strict representations of the world but rather our
systematization of it. Be that as it may, Lewis stressed that whatever physical chance is, it must
satisfy his Principal Principle linking it to subjective probability; see below.
20 Relatively few philosophers follow Hume and doubt the existence of chances in the physical
world; one is Toby Handfield, whose lucid 2012 book on the subject (Handfield 2012) started out
as a defence and turned into a rejection of chances. Another skeptic is Ismael (1996); her recent
work (Ismael 2019) puts emphasis on the kind of weak objectivity mentioned at the start of this
section. See also in this connection recent work of Bacciagaluppi, whose view on the meaning of
probability is essentially the same as that defended in this paper: (Bacciagaluppi, 2020), Sect. 10.
21 The half-life is not, pace Google, the amount of time needed for a given sample to decay by
one half! (What if the number of nuclei is odd?) It is the amount of time needed for any given
nucleus in the sample to have decayed with a probability of 0.5. Of course this latter definition will
of view. It is many orders of magnitude larger than both the half-lives of isotopes
of other light elements undergoing the same decay process, and what standard
calculations for nucleon-nucleon interactions in Gamow-Teller beta decay would
indicate.22
Measurements of the half-life of 14C have been going on since 1946, involving
a number of laboratories worldwide and two distinct methods. Results up until
1961 are now discarded from the mix; they varied between 4700 and 7200 years.
The supposedly more accurate technique involving mass-spectrometry has led to
the latest estimate of 5700 ± 30 years (Bé and Chechev 2012). This has again
involved compiling frequencies from different laboratories using a weighted average
(the weights depending on estimated systematic errors in the procedures within the
various laboratories).
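The weighted-average procedure just described can be sketched as follows. This is our illustration of inverse-variance weighting (a standard way of letting estimated systematic errors fix the weights); the numbers are made up, not the actual laboratory data.

```python
import math

def inverse_variance_mean(measurements):
    """Combine independent estimates (value, sigma) by inverse-variance weighting:
    each lab's estimate is weighted by 1/sigma^2, so less uncertain labs count more."""
    weights = [1.0 / s**2 for _, s in measurements]
    mean = sum(w * x for (x, _), w in zip(measurements, weights)) / sum(weights)
    sigma = math.sqrt(1.0 / sum(weights))  # uncertainty of the combined estimate
    return mean, sigma

# Hypothetical half-life estimates in years from three labs (value, systematic error)
labs = [(5715, 40), (5690, 50), (5680, 35)]
mean, sigma = inverse_variance_mean(labs)
print(f"combined estimate: {mean:.0f} ± {sigma:.0f} years")
```

Note that the combined uncertainty is smaller than any single lab's error, which is the point of pooling the frequencies.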
In the case of the humble neutron, experimental determination of its half-life
is, as we write, in a curious state. Again, there are two standard techniques, the so-
called ‘beam’ and ‘bottle’ measurements. The latest and most accurate measurement
to date using the bottle approach estimates the lifetime as 877.7 ± 0.7 s (Pattie
2018), but there is a four-standard-deviation disagreement with the half-life for neutron
beta decay determined using the beam technique, which is currently not well
understood.
It should be clear that measurements involve frequencies of decays in all these
cases. The procedures might be compared to the measurement of, say, the mass of
the neutron. As Maudlin says, further experimentation could refine the numbers.
But for those who (correctly) deny that probabilities can be defined in terms of
finite frequencies, the question obviously arises: in what precise sense does taking
measurements of frequencies constitute a measurement of probability (in the present
case, half-life)? Is Gillies right that nothing like a bet is being made?
The difference between measuring mass or atomic weights, say, and half-life
is not that one case involves complicated, on-going interactions between theory and
experiment and the other doesn’t. The essential difference is that for the latter, if it is
thought to be objective, no matter how accurate the decay frequency measurements
are and how many runs are taken, the resulting frequencies could in principle all be a
fluke, and thereby significantly misleading. That is the nature of chance procedures,
whether involving isotopes or coins. Here is de Finetti, taken out of context:
It is often thought that these objections [to the claim that probability theory and other exact
sciences] may be escaped by observing that the impossibility of making the relations between
probabilities and frequencies precise is analogous to the practical impossibility that is
encountered in all the experimental sciences of relating exactly the abstract notions of the
theory and the empirical realities. The analogy is, in my view, illusory . . . in the calculus of
probability it is the theory itself which obliges us to admit the possibility of all frequencies.
In the other sciences the uncertainty flows indeed from the imperfect connection between the
theory and the facts; in our case, on the contrary, it does not have its origin in this link, but in
the body of the theory itself . . . .23
be expected to closely approximate the former when the sample is large, given the law of large
numbers (see below).
22 In 2011, (Maris et al. 2011) showed for the first time, with the aid of a supercomputer, that
23 de Finetti (1964), p. 117. de Finetti’s argument, as Gillies (2000) stresses on pages 103 and
159, was made in the context of the falsifiability of probabilistic predictions, even in the case
of subjective probabilities. But the argument also holds in the case of inferring “objective”
probabilities by way of frequencies.
24 Saunders puts the point succinctly:
(i) Whether we are objectivists, subjectivists or dualists about probability, for such
processes as the tossing of a bent coin, or the decay of Carbon-14, we will need to be
guided by experience in order to arrive at the relevant probability. For the objectivist,
this is letting relative frequencies uncover an initially unknown objective probability
or chance. For the subjectivist, it is using the standard rule of updating a (suitably
restricted) subjective prior probability in the light of new statistical evidence, subject
to a certain constraint on the priors.
Suppose we have a chance set-up, and we are interested in the probability of a
particular outcome (say heads in the case of coin tossing, or decay for the nucleus
of an isotope within a specified time) in the n + 1th repetition of the experiment,
when that outcome has occurred k times in the preceding n repetitions. For the
objectivist, the repetitions (“Bernoulli trials”) are typically assumed to be independent
and identically distributed (iid) with a fixed chance p, which means that
the objective probability in question is p, whatever the value of k is. But when p
is unknown, we expect frequencies to guide us. It is their version of the (weak) law
of large numbers that allows the objectivists to learn from experience.25
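The objectivist picture just sketched can be illustrated with a minimal simulation (ours, not part of the authors' text): iid Bernoulli trials with a fixed but "unknown" chance p, whose relative frequency settles on p in accordance with the weak law of large numbers.

```python
import random

random.seed(0)  # for reproducibility

p = 0.3        # the fixed "unknown" objective chance of the outcome
n = 100_000    # number of iid repetitions (Bernoulli trials)

# Simulate the trials and record the relative frequency of the outcome
successes = sum(1 for _ in range(n) if random.random() < p)
frequency = successes / n

print(f"relative frequency after {n} trials: {frequency:.4f} (p = {p})")
```

The point is only that frequencies guide the objectivist toward p; as stressed in footnote 25, this does not amount to a definition of probability in terms of frequencies.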
The celebrated physicist Richard Feynman defined the probability of an event as
an essentially time-asymmetric notion, viz. our estimate of the most likely relative
frequency of the event in N future trials.26 This reliance on the role of the estimator
already introduces an irreducibly subjective element into the discussion.27 And note
how Feynman describes the experimental determination of probabilities in the case
of tossing a coin or a similar “chancy” process.
We have defined P(H) = N_H/N [where P(H) is the probability of heads, and N_H is
the expected number of heads in N tosses]. How shall we know what to expect for N_H? In
some cases, the best we can do is observe the number of heads obtained in large numbers
of tosses. For want of anything better, we must set N_H = N_H (observed). (How could
we expect anything else?) We must understand, however, that in such a case a different
experiment, or a different observer, might conclude that P(H) was different. We would
expect, however, that the various answers should agree within the deviation 1/(2√N) [if
P(H) is near one-half]. An experimental physicist usually says that an “experimentally
determined” probability has an “error”, and writes

$$P(H) = \frac{N_H}{N} \pm \frac{1}{2\sqrt{N}} \qquad (7.2)$$

due to a fluctuation. There is, however, no way to make such thinking logically consistent.
It is probably better to realize that the probability concept is in a sense subjective, that it
is always based on uncertain knowledge, and that its quantitative evaluation is subject to
change as we obtain more information.28

25 This is not to say that the law of large numbers allows for probabilities to be defined in terms of
frequencies, a claim justly criticised in Gillies (1973), pp. 112–116.
26 Feynman et al. (1965), Sect. 6.1.
27 See Brown (2011). In so far as agents are brought into the picture, who remember the past and
not the future, probability requires the existence of an entropic arrow of time in the cosmos; see in
this connection (Handfield 2012, chapter 11), Myrvold (2016) and (Brown, 2017), Sect. 7.
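Feynman's error estimate can be checked in a toy simulation (our illustration, not Feynman's): many independent "observers" each estimate P(H) from N tosses of a fair coin, and the spread of their estimates comes out close to 1/(2√N), which for p = 1/2 is exactly the binomial standard deviation.

```python
import math
import random
import statistics

random.seed(1)  # for reproducibility

N = 400          # tosses per observer
observers = 1000

# Each observer reports the "experimentally determined" probability N_H / N
estimates = [sum(random.getrandbits(1) for _ in range(N)) / N
             for _ in range(observers)]

spread = statistics.pstdev(estimates)     # empirical spread across observers
predicted = 1 / (2 * math.sqrt(N))        # Feynman's 1/(2*sqrt(N))
print(f"observed spread {spread:.4f} vs predicted {predicted:.4f}")
```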
Now how far this position takes Feynman away from the objectivist stance is perhaps
debatable; note that Feynman does not refer here to quantum indeterminacies. But
other physicists have openly rejected an objective interpretation of probability; we
have in mind examples such as E. T. Jaynes (1963), Frank Tipler (2014), Don Page
(1995) and Jean Bricmont (2001). In 1995, Page memorably compared interpreting
the unconscious quantum world probabilistically to the myth of animism, i.e.
ascribing living properties to inanimate objects.29
(ii) It is indeed worth reminding ourselves of the fact that the brazen subjectivists
can make a good – though far from uncontroversial – case for the claim that their
practice is consistent with standard experimental procedure in physics.
The subjectivist specifies a prior subjective probability for such a probability
(presumably conditional on background knowledge of some kind) and appeals to
Bayes’ rule for updating probability when new evidence (frequencies) accrues.
Following the work principally of de Finetti, the crucial assumption that replaces
the iid condition for objectivists is the constraint of “exchangeability” on the
prior probabilities, which opens the door to learning from the past. de Finetti’s
1937 representation theorem based on exchangeability leads to an analogue of the
Bernoulli law of large numbers.30 In the words of Brian Skyrms,
de Finetti . . . looks at the role that chance plays in standard statistical reasoning, and argues
that role can be fulfilled perfectly well without the metaphysical assumption that chances
exist. (Skyrms 1984)
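The de Finetti-style learning described above can be made concrete with a conjugate Beta-Bernoulli sketch (our illustration, with made-up numbers): a flat, non-dogmatic, exchangeable prior is updated by Bayes' rule as evidence accrues, and the agent's credence concentrates as if there were an objective chance to be learned about.

```python
import random

random.seed(2)  # for reproducibility

# Beta(1, 1) prior: a flat, non-dogmatic credence over the "chance" of the outcome.
# Exchangeability licenses pooling the observed outcomes regardless of their order.
alpha, beta = 1.0, 1.0

p_true = 0.7  # plays the role of the chance the agent acts "as if" learning about
for _ in range(2000):
    outcome = random.random() < p_true
    # Bayesian updating: conjugacy makes the posterior Beta(alpha + heads, beta + tails)
    if outcome:
        alpha += 1
    else:
        beta += 1

posterior_mean = alpha / (alpha + beta)
print(f"posterior mean after 2000 trials: {posterior_mean:.3f}")
```

Any agent starting from a different non-dogmatic exchangeable prior would, with the same data, converge to essentially the same posterior, which is Skyrms's point about chance being dispensable.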
28 Feynman et al. (1965), Sect. 6.3. We are grateful to Jeremy Steeger for bringing this passage to
our attention. For his careful defence of a more objective variant of Feynman’s notion of probabil-
ity, see (Steeger, ’The Sufficient Coherence of Quantum States’ [unpublished manuscript]).
29 See Page (1995). Page has also defended the view that truly testable probabilities are limited to
conditional probabilities associated with events defined at the same time (and perhaps the same
place); see Page (1994). This view is motivated by the fact that strictly speaking we have no direct
access to the past or future, but only present records of the past, including of course memories.
The point is well-taken, but radical skepticism about the veracity of our records/memories would
make science impossible, and as for probabilities concerning future events conditional on the past
(or rather our present memories/records of the past) they will be testable when the time comes,
given the survival of the relevant records/memories. So it is unclear whether recognition of Page’s
concern should make much difference to standard practice.
30 de Finetti (1964); for a review of de Finetti’s work see Galavotti (1989). A helpful introduction
to the representation theorem is given by Gillies (2000, chapter 4); in this book Gillies provides a
sustained criticism of the subjectivist account of probabilities.
agent looking for the optimal betting strategy for sequences of a relevant class of
experimental setups. Then
. . . when he conditionalizes on the results of elements of the sequence, he learns about what
the optimal strategy is, and he is certain that any agent with non-dogmatic priors on which
the sequence of experiments is exchangeable will converge to the same optimal strategy.
If this is not the same as believing that there are objective chances, then it is something
that serves the same purpose. Rather than eliminate the notion of objective chance, we have
uncovered, in [the agent’s] belief state, implicit beliefs about chances, – or, at least, about
something that plays the same role in his epistemic life. . . .
. . . like it or not, an agent with suitable preferences acts as if she believes that there
are objective chances associated with outcomes of the experiments, about which she can
learn, provided she is non-dogmatic. . . . There may be more to be said about the nature and
ontological status of such chances, but, whatever more is said, it should not affect the basic
picture of confirmation we have sketched.31
Note that in explicitly leaving the question of the “nature and ontological status” of
chances open, Greaves and Myrvold’s analysis appeals only to the intersubjectivity
of the probabilistic conclusions drawn by (non-dogmatic) rational betting agents
who learn from experience according to Bayesian updating. Whatever one calls the
resulting probabilities, it is questionable whether they are out there in the world
independent of the existence of rational agents, and playing the role of elements of
reality seemingly demanded by Gillies, Maudlin and Wallace as spelt out in Sect. 7.3.1
above.32,33
(iii) In anticipation of the discussion of the refutability of probabilistic assertions
in science below (Sect. 7.6), it is worth pausing briefly to consider an important
feature of subjectivism in this sense. A subjective probability for some event, either
construed as a prior, or conditional on background information (say relevant past
frequencies), is not open to revision. Such a probability is immutable; it is not
to be “repudiated or corrected”.34 The same holds for the view that conditional
probabilities are rational inferences arising from incomplete information. What the
introduction of relevant new empirical information does is not to correct the initial
probability but replace it (by Bayesian updating) with a new one conditional on the
new (and past, if any) evidence, which in turn is immutable.35
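The de Finetti-style convergence invoked in the quotation above can be sketched in a few lines (a toy illustration of ours, not drawn from Greaves and Myrvold): two agents with different non-dogmatic priors over an exchangeable sequence of coin tosses, each updating by Bayesian conditionalisation, end up with nearly identical predictive probabilities.

```python
def beta_predictive(a, b, heads, tails):
    """Posterior predictive P(next toss = heads) for a Beta(a, b) prior,
    after Bayesian updating on the observed counts."""
    return (a + heads) / (a + b + heads + tails)

# Two agents with different non-dogmatic priors over an exchangeable sequence:
agent1 = beta_predictive(1, 1, heads=702, tails=298)    # flat prior
agent2 = beta_predictive(20, 20, heads=702, tails=298)  # prior peaked at 1/2

# Neither prior is "corrected"; each new datum replaces the old probability
# with a new conditional one, and both agents converge towards the observed
# frequency of 0.702.
print(round(agent1, 3), round(agent2, 3))
```

The point of the sketch is exactly the one made in the text: nothing in the updating process requires the agents to posit chances over and above their converging credences.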
above that objectivists need to introduce the substantive additional assumption that the statistical
data is “typical”. As far as we can see, this is again because the notion of chance they are
considering is ontologically neutral.
34 See de Finetti (1964, pp. 146, 147).
35 Gillies (2000, pp. 74, 83, 84) and Wallace (2012, p. 138), point out that de Finetti’s scheme
of updating by Bayesian conditionalisation yields reasonable results only if the prior probability
function is appropriate. If it isn’t, no amount of future evidence can yield adequate predictions
based on Bayesian updating. This is a serious concern, which will not be dealt with here (see in this
178 H. R. Brown and G. Ben Porath
(iv) We note finally that, as Myrvold has recently stressed, there is a tradition
dating back to the nineteenth century of showing another way initial credences
can “converge”, which does not involve the accumulation of new information. The
“method of arbitrary functions”, propounded in particular by Poincaré, involves
dynamical systems in which somewhat arbitrary prior probability distributions are
“swamped” by the physics, which is to say that owing to the dynamics of the system
a wide range of prior distributions converge over time to a final distribution. It is
noteworthy that Myrvold accepts that the priors can be epistemic in nature, but
regards the final probabilities as “hybrid”, neither epistemic nor objective chances.
He calls them epistemic chances.36
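Poincaré's wheel-of-fortune example gives the flavour. The following toy simulation (our illustration, not taken from Myrvold's manuscript) shows two quite different prior distributions over the initial spin speed being "swamped" by the dynamics: after many revolutions, both assign the "red" outcome a probability close to 1/2.

```python
import random

def prob_red(sample_speed, t, n=100_000, seed=1):
    """Monte Carlo estimate of P('red') for a wheel spun for time t:
    the pointer lands on red when the fractional number of turns is < 1/2."""
    rng = random.Random(seed)
    hits = sum((sample_speed(rng) * t) % 1.0 < 0.5 for _ in range(n))
    return hits / n

# Two quite different ("arbitrary") priors over the initial spin speed:
triangular_prior = lambda rng: rng.triangular(1.0, 2.0, 1.2)
skewed_prior = lambda rng: 1.0 + rng.betavariate(2, 5)

for t in (1, 10, 100):
    print(t, prob_red(triangular_prior, t), prob_red(skewed_prior, t))
# At t = 1 the two priors disagree noticeably; by t = 100 the dynamics has
# swamped both, and each estimate sits close to 1/2.
```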
If one issue has dominated the philosophical literature on chance in recent decades –
and which will be relevant to our discussion of the Deutsch-Wallace theorem below
– it is the Principal Principle (henceforth PP), a catchy term coined and made
prominent by David Lewis in 1980 (Lewis 1980), but referring to a longstanding
notion in the literature.37
This has to do with the second side of what Ian Hacking called in 1975 the Janus-
faced nature of probability,38 or rather its construal by objectivists. The first side
concerns the issue touched on above, i.e. the estimation of probabilities on the basis
of empirical frequencies. Papineau (1996) called this the inferential link between
frequencies and objective probabilities (chances). The second side concerns what
we do with these probabilities, or why we find them useful. Recall the famous
dictum of Bishop Butler that probabilities are a guide to life. Objective probabilities
determine, or should determine, our beliefs, or “credences”, about the future (and
less frequently the past) which if we are rational we will act on. Following Ramsey,
it is widely accepted in the literature that this commitment can be operationalised,
without too much collateral damage, by way of the act of betting.39
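The betting operationalisation has a familiar minimal illustration (a standard Dutch-book sketch of our own, not Papineau's or Ramsey's text): if an agent's betting quotients on an event and on its complement fail to sum to 1, a bookie can lock in a profit whatever happens.

```python
def dutch_book(q_e, q_not_e, stake=1.0):
    """Given an agent's betting quotients on an event E and on not-E, return
    the bookie's guaranteed profit per unit stake, or None if the quotients
    are coherent (sum to 1) and no sure win exists."""
    total = q_e + q_not_e
    if total == 1.0:
        return None
    # If the quotients sum to more than 1 the bookie sells both tickets,
    # collecting total*stake and paying out exactly stake (one event occurs);
    # if less than 1, the bookie buys both instead. Either way the profit
    # is the same whichever of E / not-E comes about.
    return abs(total - 1.0) * stake

# Pricing E at 0.7 and not-E at 0.4 loses the agent 0.1 per unit stake no
# matter what happens; only quotients obeying the probability calculus avoid this.
print(dutch_book(0.7, 0.4))
```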
connection Greaves and Myrvold 2010). Note that it seems to be quite different from the reason
raised by Wallace (see Sect. 2.1 above) as to why the half-life of a radioactive isotope, for example,
must be an objective probability. See also Sect. 7.5.1 below.
36 Myrvold, W. C., Beyond Chance and Credence [unpublished manuscript], Chap. 5.
37 See Strevens (1999). In our view, a refreshing exception to the widespread philosophical interest
in the PP is Gillies’ excellent 2000 book (Gillies 2000), which makes no reference to it.
38 Hacking (1984), though see Gillies (2000, pp. 18, 19) for references to earlier recognition of
the dual nature of probability. Note that Wallace has argued that the PP grounds both sides; see
Footnote 23 above.
39 See Gillies (2000, chapter 4).
Now even modest familiarity with the literature reveals that the status of the PP
is open to dispute. According to Ned Hall (2004), it is an analytic truth (i.e. true
by virtue of the meaning of the terms “chance” and “credence”) though this view is
hardly consensual. The weaker position that chance is implicitly defined by the PP
has been criticised by Carl Hoefer.41 Amongst other philosophers who think the PP
requires justification, Simon Saunders considers that the weakness in the principle
has to do with the notoriously obscure nature of chances:
Failing an account of what objective probabilities are, it is hard to see how the [PP] could
be justified, for it seems that it ought to be facts about the physical world that dictate our
subjective future expectations. (Saunders 2005)
40 This version is essentially the same as that in Wallace (2012), p. 140. Note that Wallace here (i) does
not strictly equate the PP with the decision-theoretic link and (ii) sees PP as underpinning not
just the decision-theoretic link but also the inferential link (see Footnote 24 above). We return to
point (i) below; we are also overlooking here the usual subtleties involved with characterising
the background information as “accessible”. For a more careful discussion of the PP, see
(Bacciagaluppi 2020).
41 Hoefer (2019), Sect. 1.3.4.
42 Wallace (2012), Sects. 4.10, 4.12.
43 Strevens (1999). See also (Hoefer, 2019, Sects. 1.2.3 and 1.3.4).
wonders how chances could possibly induce normative constraints on our credences.
Myrvold cuts the Gordian Knot by claiming that the PP is concerned rather with our
beliefs about chances – even if there are no non-trivial chances in the world.
PP captures all we know about credences about chances, since it is not really about
chances.44
Can there be any doubt that the meaning and justification of the PP are issues that
are far from straightforward?
Subjectivists can of course watch the whole tangled debate with detachment,
if not amusement: if there are no chances, nor credences about chances, the
PP is empty. And Papineau’s two links collapse into one: from frequencies to
credences.45 But now an obvious and old question needs to be addressed: can
credences sometimes be determined without any reference to frequencies?
It has been recognised from the earliest systematic writings on probability that
symmetry considerations can also play an important role in determining credences
in chance processes. In fact, in what is often called the “classical” theory of
probability, which, following Laplace, was dominant amongst mathematicians for
nearly a century,46 the application of the principle of indifference (or insufficient
reason) based on symmetry considerations is the grounding of all credences in
physics and games of chance. The closely related, so-called “objective” Bayesian
approach, associated in particular with the name of E. T. Jaynes, likewise makes
prominent use of the principle of indifference.
David Wallace voices the opinion of many commentators when he describes this
principle as problematic.47 It is certainly stymied in the case of chance processes
lacking symmetry (such as involving a biased coin) and it is widely accepted that
serious ambiguities arise when the sample space is continuous.48 But what can be
wrong with use of the principle when assigning the probability P(H) = 1/2 (see
Sect. 7.3.3) in the case of a seemingly unbiased coin? And are we not obtaining in
Jaynes’ defence of the principle of indifference in physics – and conceding that the principle has
been successfully applied in a number of cases in physics.
But if, in a branching Everettian universe where, loosely speaking, everything that
can happen does happen, a derivation of credences can be given on the basis of
rationality principles that feature the symmetry properties of the wavefunction, no
such symmetry need be broken, and a reductive analysis of probability is possible.
It is precisely this feat that the Deutsch-Wallace theorem achieves.
7.4.2 Deutsch
In 1999, David Deutsch (Deutsch 1999) attempted to demonstrate two things within
the context of Everettian quantum mechanics: (i) the meaning of probabilistic claims
in quantum mechanics, which otherwise are ill-defined, and (ii) the fact that the
Born Rule need not be considered an extra postulate in the theory. In a more recent
paper, Deutsch stresses that (iii) his 1999 derivation of the Born Rule also overcomes
what is commonly called the “incoherence problem” in the Everettian account of
probabilities.50
As for (i), Deutsch interpreted probability as that concept which appears in
a Savage-style representation theorem within the application of rational decision
theory. In the case of quantum mechanics, Deutsch exploited the fragment of
decision theory expunged of any probabilistic notions, applying it to quantum games
in an Everettian universe – or preferences over the future of quantum mechanical
measurement outcomes associated with certain payoffs. The emergent notion of
probability is agent-dependent, in the sense that, as with Feynman’s notion of
probability (recall Sect. 7.3.3 above), probabilities arise from the actions of ideally
rational agents and have no independent existence in the universe – a branching
universe whose fundamental dynamics is deterministic.
. . . the usual probabilistic terminology of quantum theory is justifiable in light of result
of this paper, provided one understands it all as referring ultimately to the behaviour of
7.4.3 Wallace
51 Deutsch (1999), p. 14. As Hemmo and Pitowsky noted, Deutsch’s proof, if successful, would
give “strong support to the subjective approaches to probability in general.” (Hemmo and Pitowsky
2007, p. 340).
This is followed by a proof of the claim that any solution to a quantum decision
problem which is compatible with a state-dependent solution must be non-contextual.53
But it is important to note how Wallace defines this condition. Informally,
an agent’s preferences conform to a probability rule that is non-contextualist
in Wallace’s terms if it assigns the same probabilities to the outcomes of a
measurement of operator X whether or not a compatible operator Y is measured at
the same time.54 After giving a more formal decision-theoretic definition, Wallace
explicitly admits that this is not exactly the principle used in Gleason’s theorem, “but
it embodies essentially the same idea”.55 We disagree. Wallace’s principle, which
for purposes of subsequent discussion we call weak non-contextualism, involves a
single measurement procedure; its violation means that a rational agent prefers “a
given act to the same (knowably the same act, in fact) under a different description, which violates state supervenience (and, I hope, is obviously irrational).”56
But the Gleason-related principle, or strong non-contextualism, involves mutually
incompatible procedures. Now Wallace appears to be referring to this strong non-
contextualism when he writes
It is fair to note, though, that just as a non-primitive approach to measurement allows one
and the same physical process to count as multiple abstractly construed measurements,
it also allows one and the same abstractly construed measurement to be performed by
multiple physical processes. It is then a nontrivial fact, and in a sense a physical analogue
of noncontextuality, that rational agents are indifferent to which particular process realizes
a given measurement.57
52 Similar qualms were voiced by Hemmo and Pitowsky (Hemmo and Pitowsky 2007).
53 Wallace (2012), p. 196.
54 Op. cit. p. 196.
55 Op. cit. p. 214.
56 Op. cit. p. 197.
57 Ibid.
The short answer as to why is that two acts which correspond to the same abstractly
construed measurement can be transformed into the same act via processes to which rational
agents are indifferent.58
Now it seems to us, as it did to Pitowsky (see Sect. 7.2), that to contemplate
a contextual assignment of probabilities in quantum mechanics is prima facie far
from irrational, given the non-commutative property of operators associated with
measurements.59 In our view, the most plausible justification of non-contextualism
in the context of the DW theorem was given by Timpson (Timpson 2011, Sect. 5.1).
It is based on consideration of the details of the dynamics of the measurement
process in unitary quantum mechanics, and shows that nothing in the standard
account of the process supports the possibility of contextualism. However, this
argument presupposes that the standard account is independent of the Born Rule,
a supposition which deserves attention. At any rate, it should not be overlooked that
the DW-theorem applies to systems with Hilbert spaces of arbitrary dimensions,
which is a significant advantage over the proof of the Born Rule for credences using
Gleason’s theorem.
(i) For David Wallace and Simon Saunders, it is of great significance that Everettian
quantum mechanics (EQM) provides an underpinning, unprecedented in single-
world physics, for the Principal Principle.60 For Wallace,
. . . if an Everett-specific derivation of the Principal Principle can be given, then the Everett
interpretation solves an important philosophical problem which could not be solved under
the assumption that we do not live in a branching universe.61
For Saunders, “. . . nothing comparable has been achieved for any other physical
theory of chance.”62
58 Ibid. Note that a different, recent approach to proving that credences should be non-contextual in
quantum mechanics is urged by Steeger (Steeger, J., ’The Sufficient Coherence of Quantum States’
[unpublished manuscript]).
59 It was mentioned in Sect. 7.2 above that contextual probabilities may lead to the possibility of
superluminal signalling. But this does not imply that contextualism is irrational. Indeed, violation
of no-signalling is bound to happen in some “deviant” branches in the Everett multiverse; see
Sect. 7.6 below.
60 Perhaps this should be understood as an ‘emancipated’ PP, given that Lewis himself did not
And according to Wallace, his latest approach (see below) succeeds where
Deutsch’s 1999 fails in providing a vindication of the PP.
Deutsch’s theorem . . . amounts simply to a proof of the decision theoretic link between
objective probabilities and action.63
Clearly, Wallace has a very different reading of the Deutsch theorem to ours; we see
no reference to objective probabilities therein. But Wallace is attempting to make
a substantive point: that the PP in the context of classical decision theory is not
equivalent to Papineau’s decision-theoretic link between chances and credences.
Here, briefly, is the reason.
The utility function U in the Deutsch theorem is, Wallace argues, not obviously
the same as the utility function V in classical decision theory, where the agent’s
degrees of belief may refer to unknown information. U is a feature of what Wallace
calls the Minimal decision-theoretic link between objective probability and action,
which supposedly encompasses the Deutsch theorem and Wallace’s Born Rule
theorem (though see below). In contrast, the standard classical decision theory in
which V is defined makes no mention of objective probabilities, and allows for bets
when the physical state of the system is unknown, and even when there is uncertainty
concerning the truth of the relevant physical theory.
For Wallace, the PP will be upheld in Everettian quantum mechanics only
when U and V can be shown to be equivalent (up to a harmless positive affine
transformation). If the quantum credences in the Minimal decision-theoretic link are
just subjective, or personal probabilities, then the matter is immediately resolved.
But in his 2012 book, Wallace feels the need to provide a “more direct” proof
involving a thought experiment and some further mathematics, resulting in what
he calls the utility equivalence lemma.64 Such considerations are a testament to
Wallace’s exceptional rigour and attention to detail, but for the skeptic about chances
and hence the PP (ourselves included) they seem like a considerable amount of work
for no real gain.
Let’s take another look at the Minimal decision theoretic link. The argument is,
again, that if preferences over quantum games satisfy certain plausible constraints,
the credences defined in the corresponding representation theorem are in turn
constrained to agree with branch weights. The arrow of inference goes from suitably
constrained credences to (real) numbers extracted from the (complex) state of the
system. This is somewhat out of kilter with Papineau’s decision-theoretic link, which
involves an inference from knowledge of objective probabilities to credences. And
this discrepancy is entirely innocuous, we claim, in Deutsch’s own understanding
of his 1999 theorem, where branch weights are not interpreted as objective, agent-
independent chances.
(ii) As far as we can tell, the grip that the PP has on Wallace’s thinking can be
traced back to his conviction that probabilities in physics, both in classical statistical
But subjectivists do not claim that the half-life of ¹⁴C, for example, is purely a matter
of belief. It is belief highly constrained by empirical facts. The half-life is arrived
at through Bayesian updating based on the results of ever more accurate/plentiful
statistical measurements of decay, as we saw in Sect. 7.3.3.66
For Wallace, in the case of quantum mechanics (and hence of classical statistical
mechanics, which he correctly sees as the classical limit of quantum mechanics)
the probabilistic elements of reality – chances – are (relative) branch weights, or
mod-squared amplitudes. Now no one who is a realist about the quantum state
would question whether amplitudes are agent-independent and supervenient on the
ontic universal wavefunction. But are they intrinsically “chances” of the kind that
defenders of the PP would recognise?
This is a hard question to answer, in part because the notion of chance in the
literature is so elusive. Wallace and Saunders adopt the approach of “cautious
functionalism”. Essentially, this means that branch weights act as if they were
chances, according to the PP. Here is Wallace:
. . . the Principal Principle can be used to provide what philosophers call a functional
definition of objective probability: it defines objective probability to be whatever thing fits
the ‘objective probability’ slot in the Principal Principle.67
In Saunders’ words:
[Wallace] shows that branching structures and the squared moduli of the amplitudes, in so
far as they are known, ought to play the same decision theory role [as in Papineau’s decision
theoretic link] that chances play, in so far as they are known, in one-world theories.68
And again:
The [DW] theorem demonstrates that the particular role ordinarily but mysteriously
played by physical probabilities, whatever they are, in our rational lives, is played in a
wholly perspicuous and entirely unmysterious way by branch weights and branching. It is
establishing that this role is played by the branch weights, and establishing that they play
all the other chance roles, that qualifies these quantities as probabilities.69
Saunders calls this reasoning a “derivation” of the PP,70 but in our view it
amounts to presupposing the PP and showing that the elusive notion of chance can
be cashed out in EQM in terms of branch weights. It hardly seems consistent with
Wallace’s express hope
. . . to find some alternative characterization of objective probability, independent of the
Principal Principle, and then prove that the Principal Principle is true for that alternatively
characterized notion.71
Note, however, that Wallace himself stresses that his Born Rule theorem is insuffi-
cient for deriving the PP, since it is “silent on what to do when the quantum state is
not known, or indeed when the agent is uncertain about whether quantum mechanics
is true.” This leads Wallace to think that the theorem, like Deutsch’s, merely
establishes the Minimal decision-theoretic link between objective probability and
action, as we have seen. As a consequence Wallace developed a decision-theoretic
“unified approach” to probabilities in EQM,72 which avoids the awkwardness of
having the concepts of probability and utility derived twice, from his Born Rule
theorem, and from appeal to something like the 2010 Greaves-Myrvold de Finetti-
inspired solution to what they call the “evidential problem” in EQM, although a
defence of the PP can still be gained by appeal (see above) to the utility equivalence
lemma.73
In our view, any such defence relies on the claim that somewhere in the
decision-theoretic reasoning an “agent-independent notion of objective probabil-
ity”74 emerges. We turn to this issue now.
(iii) In his 2010 treatment of EQM, Saunders gives an account of “what
probabilities actually are (branching structures)”.75 It is important to recognise the
role of the distinction between branching structures and amplitudes in this analysis.
The former are ‘emergent’ and non-fundamental; the latter are fundamental and
provide (in terms of their modulus squared) the numerical values of the chances:
Just like other examples of reduction, . . . [probability] can no longer be viewed as
fundamental. It can only have the status of the branching structure itself; it is ‘emergent’ . . . .
Chance, like quasiclassicality, is then an ‘effective’ concept, its meaning at the microscopic
level entirely derivative on the establishment of correlations, natural or man-made, with
macroscopic branching. That doesn’t mean that amplitudes in general . . . have no place in
69 See Saunders (2020); this paper also contains a rebuttal of the claim that the process of
decoherence, so essential to the meaning of branches in EQM, itself depends on probability
assumptions.
70 Saunders (2010).
71 Wallace (2012), p. 144.
72 Op. cit. Chap. 6.
73 Op. cit. p. 234.
74 Op. cit. p. 229.
75 Ibid.
the foundations of EQM – on the contrary, they are part of the fundamental ontology – but
their link to probability is indirect. It is simply a mistake, if this reduction is successful, to
see quantum theory as at bottom a theory of probability.76
It is easy to see the tension between the Author’s replies if one thinks of the
objective probabilities emerging from the DW-theorem as having an existence
independent of betting agents.80 But such is not what the application of the PP yields
here. And note again that the branch weights yielded by the wavefunction are not
objective probabilities (according to the argument) until the branches are effectively
A dollar note, e.g., has no more intrinsic status as money than branch weights have the status of
probability, until humans confer this status on it. This point is nicely brought out by Harari, who
writes that the development of money “involved the creation of a new inter-subjective reality that
exists solely in people’s shared imagination.” See Harari 2014, chapter 10.
Note that the probabilities in the argument are credences; the exercise is at heart
the application of a sophisticated variant of the principle of indifference based on
symmetry. This makes the nature of quantum probabilities essentially something
Laplace would recognise, but with the striking new feature mentioned at the
beginning of this section. To reify the resulting quantitative values of the credences
(branch weights) as chances seems to us both unnecessary and ill-advised; it would
be like telling Laplace that his credence in the outcome ‘heads’ is a result (in part)
of an objective probability inherent in the individual (unbiased) coin.83
81 Wallace describes the Everettian program as interpreting the “bare quantum formalism” – which
itself makes no reference to probability (Wallace 2012, p. 16) – in a “straightforwardly realist way”
without modifying quantum mechanics (op. cit., p. 36).
82 Wallace (2010), pp. 259, 260. Saunders has recently also stressed the role of symmetry in the
and Ismael, in their thoughtful 2015 review of Wallace’s 2012 book (of which more in Sect. 7.6
below):
According to such a view, the ontological content of the theory makes no use of
probabilities. There is a story that relates the ontology to the evolution of observables
along a family of decoherent histories, and probability is something that plays a role in
the cognitive life of an agent whose experience is confined to sampling observables along
such a history. In so doing, one would still be doing EQM . . . (Bacciagaluppi and Ismael
2015, Sect. 3.2).
7.5.2 Earman
Recently, John Earman (2018) has also argued that something like the PP is
a “theorem” of non-relativistic quantum mechanics, but now in a single-world
context. Briefly, the argument goes like this.
Earman starts the argument within the abstract algebraic formulation of the
theory. A “normal” quantum state ω is defined as a normed positive linear functional
on the von Neumann algebra of bounded observables B(H) operating on a separable
Hilbert space H, with a density operator representation. This means that there is a
trace-class operator ρ on H with Tr(ρ) = 1, such that ω(A) = Tr(ρA) for all A ∈
B(H). Normal quantum states induce quantum probability functions Pr_ω on the
lattice of projections P(B(H)): Pr_ω(E) = ω(E) = Tr(ρE) for all E ∈ P(B(H)).
Earman takes the quantum state to be an objective feature of the physical system,
and infers that the probabilities induced by them are objective.84 This is spelt out
by showing how pure state preparation can be understood non-probabilistically
using the von Neumann projection postulate in the case of yes-no experiments, and
using the law of large numbers to relate the subsequent probabilities induced by the
prepared state to frequencies in repeated measurements. All this Earman calls the
“top-down” approach to objective quantum probabilities.
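The trace rule at work here can be illustrated with a two-dimensional toy example (ours, not from Earman's paper): for a qubit density operator ρ, the probability that a normal state induces on a projection E is Tr(ρE).

```python
def tr(m):
    """Trace of a square matrix given as nested lists."""
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Density operator for the pure qubit state |+> = (|0> + |1>)/sqrt(2):
rho = [[0.5, 0.5],
       [0.5, 0.5]]

# Projections onto |0> and |1>, elements of the lattice P(B(H)):
E0 = [[1, 0], [0, 0]]
E1 = [[0, 0], [0, 1]]

# Pr_omega(E) = Tr(rho E); the induced probabilities are additive and normed.
p0, p1 = tr(matmul(rho, E0)), tr(matmul(rho, E1))
print(p0, p1, tr(rho))  # each projection gets probability 1/2, and Tr(rho) = 1
```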
Credences are now introduced into the argument by considering a “bottom-
up” approach in which probability functions on P(B(H)) are construed as the
credence functions of actual or potential Bayesian agents. States are construed as
bookkeeping devices used to keep track of credence functions. By considering
again a (non-probabilistic) state preparation procedure, and the familiar Lüders
updating rule, Earman argues that a rational agent will in this case, in the light
of Gleason’s theorem, adopt a credence function on P(B(H)) which is precisely the
objective probability induced by the prepared pure state (as defined in the previous
paragraph).85 Thus, Earman is intent on showing
that there is a straightforward sense in which no new principle of rationality is needed to
bring rational credences over quantum events into line with the events’ objective chances –
the alignment is guaranteed as a theorem of quantum probability, assuming the credences
satisfy a suitable form of additivity. (Earman 2018)
However, we do not follow these authors in their attempt to define chances consonant with such
probabilities and the PP. (We are grateful to David Wallace for drawing to our attention this review
paper.)
84 Op. cit. p. 16.
85 Recall that in the typical case of the infinite dimensional, separable Hilbert space H, a condition
of Gleason’s theorem is that the probability function must be countably additive; see Sect. 7.2
above.
“is not a supposed law of nature, nor is it any factual claim about what happens
in nature (the explicanda), nor is it derived from one. . . . And one cannot make an
explanation problematic merely by declaring it so.” p. 29.
For Deutsch, it is a triumph of the DW-theorem that in Everettian quantum
mechanics this problem – that the methodological falsifiability rule is mere whim –
is avoided. The theorem shows that
. . . rational gamblers who knew Everettian quantum theory, . . . and have . . . made no
probabilistic assumptions, when playing games in which randomisers were replaced by
quantum measurements, would place their bets as if those were randomisers, i.e. using the
[Born] probabilistic rule . . . according to the methodological [falsifiability] rule [above].
p. 31.
Furthermore, it is argued that “the experimenter – who is now aware of the same
evidence and theories as they are – must agree with them” when one of two
gamblers regards his theory as having been refuted. The argument is subtle, if not
convoluted, and requires appeal on the part of gamblers to a non-probabilistic notion
of expectation that Deutsch introduces early in the 2016 paper.
Perhaps a simpler approach works. Subjectivists, as much as objectivists, aim to
learn from experience; despite the immutability of their conditional probabilities
(see Sect. 7.3.3) they of course update their probabilities conditional on new
information. As we saw, this is the basis of their version of the law of large numbers.
In fact, it might be considered part of the scientific method for subjectivists, or of
the very meaning of probabilities in science, that updating is deemed necessary in
cases where something like the Popper-Gillies-Deutsch falsifiability rule renders the
theory in question problematic. In practice, such behaviour is indistinguishable from
that of objectivists about probability in cases of falsification (even in the tentative
sense advocated by Deutsch and others). But this line of reasoning does not make
special appeal to a branching universe, and thus is not in the spirit of Deutsch’s
argument.
The reader can decide which, if either, of these approaches is right. But now an
interesting issue arises regarding the DW-theorem, as a number of commentators
have noticed.
Given the branching created in (inter alia) repeated measurement processes, it
is inevitable that in some branches statistics of measurement outcomes will be
obtained that fail to adhere to the Born Rule, whatever reasonable preordained
statistical test is chosen. These are sometimes called deviant branches. What will
Everettian-favourable observers in such branches conclude?89 If they adhere to
89 In his 1999 paper, Deutsch assumed that agents involved in his argument were initially adherents
of the non-probabilistic part of Everettian theory. Wallace was influenced by the work of Greaves
and Myrvold (Greaves and Myrvold 2010) who developed a confirmation theory suitable for
branching as well as non-branching universes, but he went on to develop his own confirmation
theory as part of what he called a “unified approach” to probabilities in EQM (see Sect. 7.4.1(ii)
above). However, Deutsch, in his 2016 paper, follows Popperian philosophy and rejects any notion
of theory confirmation, thereby explicitly sidelining the work of Greaves and Myrvold, despite the
fact that it contains a falsifiability element as well as confirmation.
the Popper-Gillies-Deutsch falsifiability rule, they must conclude that their theory is
in trouble. Wallace and Saunders correctly point out that from the point of view of
the DW-theorem, they happen just to be unlucky.90 But such observers will have
no option but to question Everettian theory. To say that statistics trump theory, including principles of rationality ultimately based on symmetries,91 is just to say that the falsifiability rule is embraced; for subjectivists like Deutsch, not to embrace it is to confine the probabilistic component of quantum mechanics to human psychology.
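The inevitability of deviant branches admits a simple quantitative gloss: however many repetitions are performed, the branches whose relative frequencies fail a preordained statistical test always carry non-zero (if shrinking) Born weight. A minimal sketch, with an illustrative toy qubit of Born probability 2/3 and an arbitrary tolerance of 0.1 standing in for the "reasonable preordained statistical test":

```python
from math import comb

# Toy model: N repeated measurements with Born probability p = 2/3 for
# outcome 1. Each length-N outcome sequence labels a branch with Born
# weight p^k (1-p)^(N-k), where k is the number of 1s. Call a branch
# "deviant" if its relative frequency k/N misses p by more than 0.1
# (an illustrative preordained test, not anyone's official criterion).
p = 2 / 3
deviant_weight = {}
for N in (10, 100, 1000):
    deviant_weight[N] = sum(
        comb(N, k) * p**k * (1 - p) ** (N - k)
        for k in range(N + 1)
        if abs(k / N - p) > 0.1
    )
    print(N, deviant_weight[N])
```

The total weight of deviant branches shrinks rapidly with N but never reaches zero, which is all the Wallace-Saunders "unlucky observers" point requires.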
But what part of the theory is to be questioned? Read has recently argued that
it is the non-probabilistic fragment of EQM – the theory an Everettian accepts
before adopting the Born Rule. In this case, observers in deviant branches could
question the physical assumptions involved in the DW-theorem (for example, that
the agent’s probabilities supervene solely on the wavefunction), and thus consider it
inapplicable to their circumstances.
Hemmo and Pitowsky, on the other hand, argued in 2007 (Hemmo and Pitowsky
2007) that such an observer could reasonably question the rationality axioms that
lead to non-contextualism in (either version of) the DW-theorem. Here is Wallace’s
2012 reply.
. . . it should be clear that the justification used in this chapter is not available to operational-
ists, for (stripping away the technical detail) the Everettian defence of non-contextuality is
that two processes are decision-theoretically equivalent if they have the same effect on the
physical state of the system, whatever it is. Since operationalists deny that there is a physical
state of the system, this route is closed to them.92
Now we saw in Sect. 7.2 (footnote 4) that Pitowsky, in particular, denies the physical
reality of the quantum state. But it is unclear to us whether this is relevant to the
matter at hand. Wallace seems to be referring to what we called in Sect. 7.4.3
weak non-contextualism, whereas what Hemmo and Pitowsky have in mind is
strong non-contextualism. Wallace proceeds to give a detailed classical decision
problem (analogous to the quantum decision problem) which contains a condition of
“classical noncontextuality” that again is justified if it is assumed “that the state space
really is a space of physical states”. But in a classical theory, no analogue exists
of non-commutativity and the incompatibility of measurement procedures involved
in strong non-contextualism. It seems to us that the concern raised by Hemmo and
Pitowsky is immune to Wallace’s criticism; it does not depend on adopting a non-
“operationalist” stand in relation to the quantum state, and applies just as much to
non-deviant as to deviant branches.93 But we mentioned in Sect. 7.3.3 above that a
strong plausibility argument for non-contextualism in Wallace’s decision-theoretic
approach has been provided by Timpson.
90 See Wallace (2012, p. 196) and Saunders (2020). Saunders again correctly points out that in an
infinite non-branching universe an analogous situation holds.
91 Read (2018), p. 138.
92 Wallace (2012), p. 226.
194 H. R. Brown and G. Ben Porath
While we have doubts above as to whether the DW-theorem does justify the PP, Read is obviously right that the past-statistics-driven route to the Born Rule is not available to an agent ignorant of the statistics! But such an epistemologically-limited agent would be very hard to find in practice.
93 However, Hemmo and Pitowsky also argue that
. . . in the many worlds theory, not only Born’s rule but any probability rule is meaningless. The only way to solve the problem . . . is by adding to the many worlds theory some stochastic element. (Hemmo and Pitowsky 2007, p. 334)
This follows from the claim that the shared reality of all of the multiple branches in EQM entails that “one cannot claim that the quantum probabilities might be inferred (say, as an empirical conjecture) from the observed frequencies.” (p. 337). This is a variant of the “incoherence” objection to probabilities in EQM (see footnote 45 above), but in our opinion it is not obviously incoherent to bet on quantum games in a branching scenario (see Wallace 2010).
94 Although in their above-mentioned review, Bacciagaluppi and Ismael do not go quite this far, they do not regard the DW-theorem as necessary for “establishing the intelligibility of probabilities” in EQM. By appealing to standard inductive practice based on past frequencies, they argue:
The inputs [frequencies] are solidly objective, but the outputs [probabilities] need not be reified. . . . In either the classical or the Everettian setting, probability emerges uniformly as a secondary or derived notion, and we can tell an intelligible story about the role it plays in setting credences. (Bacciagaluppi and Ismael 2015, section 3.2)
95 Read (2018), p. 140. Note that Read, like Dawid and Thébault, rests the rationality of the statistics-driven route on the Bayesian confirmation analysis given by Greaves and Myrvold op. cit.
7.7 Conclusions
The principal motivation for this paper was the elucidation of the difference between
the views of David Deutsch and David Wallace as regards the significance of the
DW-theorem. Deutsch interprets the probabilities therein solely within the Ramsey-de Finetti-Savage tradition of rational choice, now in the context of quantum games.
Wallace defends a dualistic interpretation involving objective probabilities (chances)
and credences, and, like Simon Saunders, argues that his more rigorous version
of the theorem provides, at long last, a “derivation” or “proof” of David Lewis’
Principal Principle.
We have questioned whether this derivation of the PP is sound. In our view, the
Wallace-Saunders argument presupposes the PP and provides within it a number
to fill the notorious slot representing chances. For proponents of reified chances in
the world, this is, nonetheless, a remarkable result. But it raises awkward questions
about the scope of the ontology in EQM, and for subjectivists about probability like
Deutsch the concerns with the PP should seem misguided in the first place. Our
own inclinations are closer to those of Deutsch, and in key respects mirror those of
Bacciagaluppi and Ismael in their discussion of the DW-theorem.
We have also questioned the doubts expressed by Hemmo and Pitowsky as to
whether all the rationality assumptions in the DW-theorem are compelling, with
particular emphasis on the role of non-contextualism. Finally, by considering the falsifiable nature of probabilities in science, we regard the complaint by Dawid and Thébault, largely endorsed by Read, that the DW-theorem is, for all practical purposes, redundant, as a serious challenge to those who endorse the arguments of the authors of the theorem.
Acknowledgements The authors are grateful to the editors for the invitation to contribute to this
volume. HRB would like to acknowledge the support of the Notre Dame Institute for Advanced
Study; this project was started while he was a Residential Fellow during the spring semester
of 2018. Stimulating discussions over many years with Simon Saunders and David Wallace are
acknowledged. We also thank David Deutsch, Chiara Marletto, James Read and Christopher
Timpson for discussions, and Guido Bacciagaluppi, Donald Gillies and Simon Saunders for helpful
critical comments on the first draft of this paper. We dedicate this article to the memory of Itamar
Pitowsky, who over many years was known to and admired by HRB, and to Donald Gillies, teacher
to HRB and guide to the philosophy of probability.
References
Albrecht, A., & Phillips D. (2014). Origin of probabilities and their application to the multiverse.
Physical Review D, 90, 123514.
Bacciagaluppi, G., & Ismael, J. (2015). Essay review: The emergent multiverse. Philosophy of
Science, 82, 1–20.
Bacciagaluppi, G. (2020). Unscrambling subjective and epistemic probabilities. This volume,
http://philsci-archive.pitt.edu/16393/1/probability%20paper%20v4.pdf
Hemmo, M., & Pitowsky, I. (2007). Quantum probability and many worlds. Studies in History and
Philosophy of Modern Physics, 38, 333–350.
Hoefer, C. (2019). Chance in the world: A Humean guide to objective chance (Oxford Studies in Philosophy of Science). Oxford: Oxford University Press.
Hume, D. (2008). An enquiry concerning human understanding. Oxford: Oxford University Press.
Ismael, J. (1996). What chances could not be. British Journal for the Philosophy of Science, 47(1),
79–91.
Ismael, J. (2019, forthcoming). On Chance (or, Why I am only Half-Humean). In S. Dasgupta
(Ed.), Current controversies in philosophy of science. London: Routledge.
Jaynes, E. T. (1963). Information theory and statistical mechanics. In: G. E. Uhlenbeck et al. (Eds.),
Statistical physics. (1962 Brandeis lectures in theoretical physics, Vol. 3, pp. 181–218). New
York: W.A. Benjamin.
Kochen, S., & Specker, E. P. (1967). The problem of hidden variables in quantum mechanics.
Journal of Mathematics and Mechanics, 17, 59–87.
Lewis, D. (1980). A subjectivist’s guide to objective chance. In R. C. Jeffrey (Ed.), Studies in inductive logic and probability (Vol. 2, pp. 263–293). University of California Press. Reprinted in: Lewis, D. Philosophical papers (Vol. 2, pp. 83–132). Oxford: Oxford University Press.
Maris, J. P., Navrátil, P., Ormand, W. E., Nam, H., & Dean, D. J. (2011). Origin of the anomalous long lifetime of ¹⁴C. Physical Review Letters, 106, 202502.
Maudlin, T. (2007). What could be objective about probabilities? Studies in History and Philosophy
of Modern Physics, 38, 275–291.
McQueen, K. J., & Vaidman, L. (2019). In defence of the self-location uncertainty account of
probability in the many-worlds interpretation. Studies in History and Philosophy of Modern
Physics, 66, 14–23.
Myrvold, W. C. (2016). Probabilities in statistical mechanics, In C. Hitchcock & A. Hájek (Eds.),
Oxford handbook of probability and philosophy. Oxford: Oxford University Press. Available at
http://philsci-archive.pitt.edu/9957/
Page, D. N. (1994). Clock time and entropy. In J. J. Halliwell, J. Pérez-Mercader, & W. H.
Zurek (Eds.), The physical origins of time asymmetry, (pp. 287–298). Cambridge: Cambridge
University Press.
Page, D. N. (1995). Sensible quantum mechanics: Are probabilities only in the mind? arXiv:gr-
qc/950702v1.
Papineau, D. (1996). Many minds are no worse than one. British Journal for the Philosophy of
Science, 47(2), 233–241.
Pattie, R. W. Jr., et al. (2018). Measurement of the neutron lifetime using a magneto-gravitational
trap and in situ detection. Science, 360, 627–632.
Pitowsky, I. (1994). George Boole’s ‘conditions of possible experience’ and the quantum puzzle.
British Journal for the Philosophy of Science, 45(1), 95–125.
Pitowsky, I. (2006). Quantum mechanics as a theory of probability. In W. Demopoulos & I. Pitowsky (Eds.), Physical theory and its interpretation (pp. 213–240). Springer. arXiv:quant-ph/0510095v1.
Read, J. (2018). In defence of Everettian decision theory. Studies in History and Philosophy of
Modern Physics, 63, 136–140.
Saunders, S. (2005). What is probability? In A. Elitzur, S. Dolev, & N. Kolenda (Eds.), Quo vadis
quantum mechanics? (pp. 209–238). Berlin: Springer.
Saunders, S. (2010). Chance in the Everett interpretation. In S. Saunders, J. Barrett, A. Kent, &
D. Wallace (Eds.), Many worlds? Everett, quantum theory, and reality (pp. 181–205). Oxford:
Oxford University Press.
Saunders, S. (2020), The Everett interpretation: Probability. To appear in E. Knox & A. Wilson
(Eds.), Routledge Companion to the Philosophy of Physics. London: Routledge.
Skyrms, B. (1984). Pragmatism and empiricism. New Haven: Yale University Press.
Strevens, M. (1999). Objective probability as a guide to the world. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition, 95(3), 243–275.
‘Two Dogmas’ Redux
Jeffrey Bub
Abstract I revisit the paper ‘Two dogmas about quantum mechanics,’ co-authored
with Itamar Pitowsky, in which we outlined an information-theoretic interpretation
of quantum mechanics as an alternative to the Everett interpretation. Following the
analysis by Frauchiger and Renner of ‘encapsulated’ measurements (where a super-
observer, with unrestricted ability to measure any arbitrary observable of a complex
quantum system, measures the memory of an observer system after that system
measures the spin of a qubit), I show that the Everett interpretation leads to modal
contradictions. In this sense, the Everett interpretation is inconsistent.
8.1 Introduction
About ten years ago, Itamar Pitowsky and I wrote a paper, ‘Two dogmas about quan-
tum mechanics’ (Bub and Pitowsky 2010), in which we outlined an information-
theoretic interpretation of quantum mechanics as an alternative to the Everett
interpretation. Here I revisit the paper and, following Frauchiger and Renner
(2018), I show that the Everett interpretation leads to modal contradictions in
‘Wigner’s-Friend’-type scenarios that involve ‘encapsulated’ measurements, where
a super-observer (which could be a quantum automaton), with unrestricted ability
to measure any arbitrary observable of a complex quantum system, measures the memory of an observer system after that system measures the spin of a qubit.
J. Bub ()
Department of Philosophy, Institute for Physical Science and Technology, Joint Center for
Quantum Information and Computer Science, University of Maryland, College Park, MD, USA
e-mail: jbub@umd.edu
The salient difference between classical and quantum mechanics is the noncom-
mutativity of the algebra of observables of a quantum system, equivalently the
non-Booleanity of the algebra of two-valued observables representing properties
(for example, the property that the energy of the system lies in a certain range of
values, with the two eigenvalues representing ‘yes’ or ‘no’), or propositions (the
proposition asserting that the value of the energy lies in this range, with the two
eigenvalues representing ‘true’ or ‘false’). The two-valued observables of a classical
system form a Boolean algebra, isomorphic to the Borel subsets of the phase space
of the system. The transition from classical to quantum mechanics replaces this
Boolean algebra by a family of ‘intertwined’ Boolean algebras, to use Gleason’s
term (Gleason 1957), one for each set of commuting two-valued observables, repre-
sented by projection operators in a Hilbert space. The intertwinement precludes the
possibility of embedding the whole collection into one inclusive Boolean algebra, so
you can’t assign truth values consistently to the propositions about observable values
in all these Boolean algebras. Putting it differently, there are Boolean algebras in the
family of Boolean algebras of a quantum system, for example the Boolean algebras
for position and momentum, or for spin components in different directions, that
don’t fit together into a single Boolean algebra, unlike the corresponding family for
a classical system. In this non-Boolean theory, probabilities are, as von Neumann
put it, ‘sui generis’ (von Neumann 1937), and ‘uniquely given from the start’ (von
Neumann 1954, p. 245) via Gleason’s theorem, or the Born rule, as a feature of
the geometry of Hilbert space, related to the angle between rays in Hilbert space
representing ‘pure’ quantum states.1
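The Born-rule-as-geometry point can be checked directly: for rank-one projectors, Tr(P_a P_ψ) = |⟨a|ψ⟩|² = cos²θ, with θ the angle between the two rays. A minimal sketch in a real two-dimensional "Hilbert space" (all names illustrative):

```python
from math import cos, sin, pi, isclose

# For unit vectors |a> and |psi> in a real 2-d space, the Born
# probability can be computed three equivalent ways:
#   Tr(P_a P_psi) = |<a|psi>|^2 = cos^2(theta).
def projector(v):
    """Rank-1 projector |v><v| as a 2x2 matrix."""
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def trace_product(P, Q):
    """Tr(P Q) for 2x2 matrices."""
    return sum(P[i][j] * Q[j][i] for i in range(2) for j in range(2))

theta = pi / 5                       # arbitrary angle between the rays
a = (1.0, 0.0)
psi = (cos(theta), sin(theta))

p_trace = trace_product(projector(a), projector(psi))   # Tr(P_a P_psi)
p_inner = (a[0] * psi[0] + a[1] * psi[1]) ** 2          # |<a|psi>|^2
print(p_trace, p_inner, cos(theta) ** 2)
```

All three numbers coincide, which is the sense in which the probability is "a feature of the geometry of Hilbert space".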
The central interpretative question for quantum mechanics as a non-Boolean
theory is how we should understand these ‘sui generis’ probabilities, since they are
not probabilities of spontaneous transitions between quantum states, nor can they be
interpreted as measures of ignorance about quantum properties associated with the
actual values of observables prior to measurement.
The information-theoretic interpretation is the proposal to take Hilbert space
as the kinematic framework for the physics of an indeterministic universe, just
as Minkowski space provides the kinematic framework for the physics of a non-Newtonian, relativistic universe.2
2 See (Janssen 2009) for a defense of this view of special relativity contra Harvey Brown (2006).
202 J. Bub
entangled states that can be distinguished from separable states tends to zero. In this
sense, a macrosystem is effectively a commutative or Boolean system.
On the information-theoretic interpretation, quantum mechanics is a new sort of
non-representational theory for an indeterministic universe, in which the quantum
state is a bookkeeping device for keeping track of probabilities and probabilistic
correlations between intrinsically random events. Probabilities are defined with
respect to a single Boolean perspective, the Boolean algebra generated by the
‘pointer-readings’ of what Bohr referred to as the ‘ultimate measuring instruments,’
which are ‘kept outside the system subject to quantum mechanical treatment’ (Bohr
1939, pp. 23–24):
In the system to which the quantum mechanical formalism is applied, it is of course possible
to include any intermediate auxiliary agency employed in the measuring processes. . . . The
only significant point is that in each case some ultimate measuring instruments, like the
scales and clocks which determine the frame of space-time coordination—on which, in the
last resort, even the definition of momentum and energy quantities rest—must always be
described entirely on classical lines, and consequently be kept outside the system subject to
quantum mechanical treatment.
Bohr did not, of course, refer to Boolean algebras, but the concept is simply a
precise way of codifying a significant aspect of what Bohr meant by a description
‘on classical lines’ or ‘in classical terms’ in his constant insistence that (his
emphasis) (Bohr 1949, p. 209)
however far the phenomena transcend the scope of classical physical explanation, the
account of all evidence must be expressed in classical terms.
3 Aage Petersen (Petersen 1963, p. 12): ‘When asked whether the algorithm of quantum mechanics
could be considered as somehow mirroring an underlying quantum world, Bohr would answer,
“There is no quantum world. There is only an abstract quantum physical description. It is wrong to
think that the task of physics is to find out how nature is. Physics concerns what we can say about
nature.”’
8 ‘Two Dogmas’ Redux 205
where P_a is the projection operator onto the eigenstate |a⟩ and P_ψ is the projection operator onto the quantum state. After the measurement, the state is updated to |a⟩, and the probability of an outcome b in a subsequent measurement is the conditional probability

p_ψ(b|a) = p_ψ(a, b) / Σ_b p_ψ(a, b) = p_ψ(a, b) / p_ψ(a) = Tr(P_b P_a) = |⟨b|a⟩|²   (8.3)

On the relative state interpretation, the measurement is instead represented by a unitary evolution V that correlates S with the observer O’s memory:

V : |ψ⟩ ⟼ |Ψ⟩ = Σ_a ⟨a|ψ⟩ |a⟩_S ⊗ |A_a⟩_O   (8.4)

Baumann and Wolf (2018) formulate the Born rule for the memory state |A_a⟩_O of the observer O having observed a in the relative state interpretation, where there is no commitment to the existence of a physical record of the measurement outcome as classical information. The probability of the observer O observing a in this sense is

q_ψ(a) = Tr((1_S ⊗ P_{A_a}) |Ψ⟩⟨Ψ|)   (8.5)

where P_{A_a} is the projection operator onto the memory state |A_a⟩_O. This is equivalent to the probability p_ψ(a) in the standard theory.
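The asserted equivalence can be checked numerically: the Born weight of the memory state |A_a⟩_O in the post-measurement entangled state of (8.4) reproduces the standard probability p_ψ(a) = |⟨a|ψ⟩|². A sketch with real amplitudes for a qubit S and a two-state memory O (the particular amplitudes are this sketch's own choice):

```python
from math import sqrt, isclose

# After the unitary of (8.4), |Psi> = sum_a c_a |a>_S |A_a>_O.
c = (sqrt(1 / 3), sqrt(2 / 3))      # <0|psi>, <1|psi>

# |Psi> on the product basis |a>_S |A_b>_O, ordered 00, 01, 10, 11:
Psi = [c[0], 0.0, 0.0, c[1]]        # only the correlated terms survive

def q(a):
    """Weight of memory state A_a: <Psi| (1_S x P_{A_a}) |Psi>."""
    return sum(Psi[2 * s + a] ** 2 for s in range(2))

print(q(0), q(1))                   # equals |c_0|^2, |c_1|^2
```

The memory-projection probabilities coincide with the standard Born probabilities, as the text states.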
This equivalence can also be interpreted as showing the movability of the
notorious Heisenberg ‘cut’ between the observed system and the composite system
doing the observing, if the observing system is treated as a classical or Boolean
system. The cut can be placed between the system and the measuring instrument
plus observer, so that only the system is treated quantum mechanically, or between
the measuring instrument and observer, so that the system plus instrument is
treated as a composite quantum system. Similar shifts are possible if the measuring
instrument is subdivided into several component systems, e.g., if a recording device
is regarded as a separate component system, or if the observer is subdivided into
component systems involved in the registration of a measurement outcome.
On the relative state interpretation, the probability that S is in the state |a′⟩ but the observer sees a is

q_ψ(a′, A_a) = Tr((P_{a′} ⊗ P_{A_a}) |Ψ⟩⟨Ψ|) = 0 for a′ ≠ a   (8.6)

So the probability that S is in the state |a⟩ after O observes a is 1, as in the standard theory according to the state update rule.
For a sequence of measurements of observables A and B by two observers, O₁ and O₂, on the same system S, the conditional probability for the outcome b given the outcome a is

q_ψ(b|a) = Tr((1 ⊗ P_{A_a} ⊗ P_{B_b}) V_{O₂}V_{O₁} P_ψ V†_{O₁}V†_{O₂}) / Σ_b Tr((1 ⊗ P_{A_a} ⊗ P_{B_b}) V_{O₂}V_{O₁} P_ψ V†_{O₁}V†_{O₂})   (8.7)

which turns out to be the same as the conditional probability given by the state update rule:

q_ψ(b|a) = p_ψ(b|a)   (8.8)
So Everett’s relative state interpretation and the information-theoretic interpre-
tation make the same predictions, both for the probability of an outcome of an
A-measurement on a system S, and for the conditional probability of an outcome of
a B-measurement, given an outcome of a prior A-measurement on the same system
S by two observers, O1 and O2 .
As Baumann and Wolf show, this is no longer the case for conditional probabil-
ities involving encapsulated measurements at different levels of observation, where
an observer O measures an observable of a system S and a super-observer measures
an arbitrary observable of the joint system S + O. In other words, the movability of
the cut is restricted to measurements that are not encapsulated. Note that both the
observer and the super-observer could be quantum automata, so the analysis is not
restricted to observers as conscious agents.
Suppose an observer O measures an observable with eigenstates |a⟩_S on a system S in a state |ψ⟩_S = Σ_a ⟨a|ψ⟩ |a⟩_S, and a super-observer SO then measures an observable with eigenstates |b⟩_{S+O} in the Hilbert space of the composite system S + O.
On the relative state interpretation, after the unitary evolutions associated with
the observer O’s measurement of the S-observable with eigenstates |a, and the
super-observer SO’s measurement of the (S + O)-observable with eigenstates |b,
the state of the composite system S + O + SO is
|Ψ⟩ = Σ_{a,b} ⟨a|ψ⟩ ⟨b|a ⊗ A_a⟩ |b⟩_{S+O} |B_b⟩_{SO}   (8.9)

where |B_b⟩_{SO} is the memory state of SO correlated with the outcome b. The probability of the super-observer SO finding the outcome b given that the observer O obtained the outcome a is then q_ψ(b|a) as given by (8.7), with |Ψ⟩⟨Ψ| for V_{O₂}V_{O₁} P_ψ V†_{O₁}V†_{O₂}. The numerator of this expression, after taking the trace over the (S + O)-space followed by the trace over the O-space of the projection onto P_{B_b} followed by the projection onto P_{A_a}, becomes

p_ψ(b|a) Σ_{a′,a} ⟨a′|ψ⟩⟨ψ|a⟩ ⟨b|a′ ⊗ A_{a′}⟩ ⟨a ⊗ A_a|b⟩   (8.10)

so

q_ψ(b|a) = p_ψ(b|a) Σ_{a′,a} ⟨a′|ψ⟩⟨ψ|a⟩ ⟨b|a′ ⊗ A_{a′}⟩ ⟨a ⊗ A_a|b⟩ / Σ_b Tr((1 ⊗ P_{A_a} ⊗ P_{B_b}) |Ψ⟩⟨Ψ|)   (8.11)
Baumann and Wolf point out that this is equal to p_ψ(b|a) in (8.8) only if |b⟩_{S+O} = |a⟩_S ⊗ |A_a⟩_O for all b, i.e., if |b⟩_{S+O} is a product state of an S-state |a⟩_S and an O-state |A_a⟩_O, which is not the case for general encapsulated measurements.
Here O’s measurement outcome states |A_a⟩_O are understood as relative to states of SO, and there needn’t be a record of O’s measurement outcome as
classical or Boolean information. If there is a classical record of the outcome
then, as Baumann and Wolf show, SO’s conditional probability is the same as the
conditional probability in the standard theory.
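Why encapsulated measurements are sensitive to where the cut is placed can be illustrated with a deliberately simplified calculation (not Baumann and Wolf's exact conditional probability): if SO measures an entangled basis state |b⟩ of S + O, treating O's measurement as a collapse and treating it unitarily assign b different probabilities, whereas they agree when |b⟩ is a product state |a⟩_S ⊗ |A_a⟩_O:

```python
from math import sqrt, isclose

c = (sqrt(1 / 3), sqrt(2 / 3))      # amplitudes of |psi>_S (this sketch's choice)
Psi = [c[0], 0.0, 0.0, c[1]]        # sum_a c_a |a>_S |A_a>_O, after O's measurement

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def prob_unitary(b):
    """SO measures |b> on the uncollapsed entangled state (cut below SO)."""
    return inner(b, Psi) ** 2

def prob_collapse(b):
    """O's outcome first collapses S+O to |a>|A_a> with weight |c_a|^2."""
    branches = ([1, 0, 0, 0], [0, 0, 0, 1])
    return sum(c[a] ** 2 * inner(b, branches[a]) ** 2 for a in range(2))

bell = [1 / sqrt(2), 0.0, 0.0, 1 / sqrt(2)]     # entangled SO-eigenstate
product = [1.0, 0.0, 0.0, 0.0]                  # |0>_S |A_0>_O

print(prob_unitary(bell), prob_collapse(bell))       # differ: interference survives
print(prob_unitary(product), prob_collapse(product)) # agree
```

The discrepancy in the entangled case is the interference that a classical record of O's outcome would destroy, in line with Baumann and Wolf's observation.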
To see that the relative state interpretation leads to a modal contradiction,
consider the Frauchiger-Renner modification of the ‘Wigner’s Friend’ thought
experiment (Frauchiger and Renner 2018). Frauchiger and Renner add a second Friend (F̄) and a second Wigner (W̄). Friend F̄ measures an observable with eigenvalues ‘heads’ or ‘tails’ and eigenstates |h⟩, |t⟩ on a system R in the state √(1/3)|h⟩ + √(2/3)|t⟩. One could say that F̄ ‘tosses a biased quantum coin’ with probabilities 1/3 for heads and 2/3 for tails. She prepares a qubit S in the state |↓⟩ if the outcome is h, or in the state |→⟩ = (1/√2)(|↓⟩ + |↑⟩) if the outcome is t, and sends S to F. When F receives S, he measures an S-observable with eigenstates |↓⟩, |↑⟩ and corresponding eigenvalues −½, ½. The Friends F̄ and F are in two laboratories, L̄ and L, which are assumed to be completely isolated from each other, except briefly when F̄ sends the qubit S to F.
Now suppose that W̄ is able to measure an L̄-observable with eigenstates

|fail̄⟩_L̄ = (1/√2)(|h⟩_L̄ + |t⟩_L̄)
|ok̄⟩_L̄ = (1/√2)(|h⟩_L̄ − |t⟩_L̄)

and that W is able to measure an L-observable with eigenstates

|fail⟩_L = (1/√2)(|−½⟩_L + |½⟩_L)
|ok⟩_L = (1/√2)(|−½⟩_L − |½⟩_L)
Suppose W measures first and we stop the experiment at that point. On the relative state interpretation, F̄’s memory and all the systems in F̄’s laboratory involved in the measurement of R become entangled by the unitary transformation, and similarly F’s memory and all the systems in F’s laboratory involved in the measurement of S become entangled. If R is in the state |h⟩_R and F̄ measures R, the entire laboratory L̄ evolves to the state |h⟩_L̄ = |h⟩_R |h⟩_F̄, where |h⟩_F̄ represents the state of F̄’s memory plus measuring and recording device plus all the systems in F̄’s lab connected to the measuring and recording device after the registration of outcome h. Similarly for the state |t⟩_L̄, and for the states |−½⟩_L = |−½⟩_S |−½⟩_F and |½⟩_L = |½⟩_S |½⟩_F of F’s lab L with respect to the registration of the outcomes ±½ of the spin measurement. So after the evolutions associated with the measurements by F̄ and F, the state of the two laboratories is
by F and F , the state of the two laboratories is
1 1 2 | − 12 L + | 12 L
|Ψ = √ |hL | − L + |tL √ (8.12)
3 2 3 2
1 1 2
= √ |hL | − L + |t |failL (8.13)
3 2 3 L
According to the state (8.13), the probability is zero that W gets the outcome ‘ok’
for his measurement, given that F obtained the outcome t for her measurement.
Frauchiger and Renner derive this modal contradiction on the basis of three
assumptions, Q, S, and C. Assumption Q (for ‘quantum’) says that an agent can
‘be certain’ of the value of an observable at time t if the quantum state assigns
probability 1 to this value after a measurement that is completed at time t, and
the agent has established that the system is in this state. An agent can also ‘be
certain’ that the outcome does not occur if the probability is zero. One should
also include under Q the assumption that measurements are quantum interactions
that transform the global quantum state unitarily. Assumption S (for ‘single value’)
says, with respect to ‘being certain,’ that an agent can’t ‘be certain’ that the value
of an observable is equal to v at time t and also ‘be certain’ that the value of this
observable is not equal to v at time t.6 Finally, assumption C (for ‘consistency’) is a
transitivity condition for ‘being certain’: if an agent ‘is certain’ that another agent,
reasoning with the same theory, ‘is certain’ of the value of an observable at time t,
then the first agent can also ‘be certain’ of this value at time t.
As used in the Frauchiger-Renner argument, ‘being certain’ is a technical term
entirely specified by the three assumptions. In other words, ‘being certain’ can mean
whatever you like, provided that the only inferences involving the term are those
sanctioned by the three assumptions. In particular, it is not part of the Frauchiger-
Renner argument that if an agent ‘is certain’ that the value of an observable is v at
time t, then the value of the observable is indeed v at time t—the observable might
have multiple values, as in the Everett interpretation. So it does not follow that the
proposition ‘the value of the observable is v at time t’ is true, or that the system has
the property corresponding to the value v of the observable at time t. Also, it is not
part of the Frauchiger-Renner argument that if an agent ‘is certain’ of the value of
an observable, then the agent knows the value, in any sense of ‘know’ that entails
the truth of what the agent knows.
The argument involves measurements, and inferences by the agents via the notion
of ‘being certain,’ that can all be modeled as unitary transformations to the agents’
memories. So picture the agents as quantum automata, evolving unitarily, where
these unitary evolutions result in changes to the global quantum state of the two
friends and the two Wigners that reflect changes in the agents’ memories. What
Frauchiger and Renner show is that these assumptions lead to memory changes that
end up in a modal contradiction.
Obviously, the relative state interpretation is inconsistent with the assumption
S in the sense that different branches of the quantum state can be associated
with different values of an observable after a measurement. So, with respect to a
particular branch, an agent can be certain that an observable has a certain value
and, with respect to a different branch, an agent associated with that branch can be
certain that the observable has a different value. This is not a problem. Rather, the
6 The standard axiom system for the modal logic of the operator ‘I am certain that’ includes the
axiom ‘if I am certain that A, then it is not the case that I am certain that not-A’ (assuming the
accessibility relation is serial and does not have any dead-ends). Thanks to Eric Pacuit for pointing
this out.
problem for the relative state interpretation is that there is a branch of the global
quantum state after W ’s measurement, which has a non-zero probability of 1/12,
with a memory entry that is inconsistent with S: W is certain that the outcome of his
measurement is ‘fail’ and also certain that the outcome is ‘ok.’
Renner gives a full description of the state of the global system as it evolves
unitarily with the measurements and inferential steps at every stage of the experi-
ment, which is repeated over many rounds until W̄ and W both obtain the outcome
‘ok’ (Renner 2019). The measurements are assumed to be performed by the agents
F̄, F, W̄, W in sequence beginning at times 00, 10, 20, and 30 (and ending at times
01, 11, 21, and 31).
So, for example, in round n after F’s measurement the global entangled state is |Ψ^{n:11}⟩ (cf. (8.12)). After the measurements by W̄ and W with outcomes w̄ and w, the global entangled state is |Ψ^{n:31}⟩:

|Ψ^{n:31}⟩ = (1/√12) |ok̄⟩_L̄ |ok̄, so I am certain that w = fail⟩_W̄ |ok⟩_L |I am certain that w = fail; I observe w = ok!⟩_W
 − (1/√12) |ok̄⟩_L̄ |ok̄, so I am certain that w = fail⟩_W̄ |fail⟩_L |I am certain that w = fail; I observe w = fail⟩_W
 + (1/√12) |fail̄⟩_L̄ |fail̄, no conclusion⟩_W̄ |ok⟩_L |no conclusion previously; I observe w = ok⟩_W
 + (√3/2) |fail̄⟩_L̄ |fail̄, no conclusion⟩_W̄ |fail⟩_L |no conclusion previously; I observe w = fail⟩_W   (8.16)
8.4 A Clarification
Baumann and Wolf argue that standard quantum mechanics with the ‘collapse’ rule
for measurements, and quantum mechanics on Everett’s relative state interpretation,
are really two different theories, not different interpretations of the same theory. For
the Frauchiger-Renner scenario, they derive the conditional probability

p_Ψ(ok_W | t_F̄) = 0   (8.17)

for the standard theory by supposing that F̄ updates the quantum state on the basis of the outcome of her measurement of R and corresponding state preparation of S, but F’s measurement is described as a unitary transformation from F̄’s perspective, so the state of the two labs after the measurements by F̄ (assuming F̄ found ‘tails’) and F is |t⟩_L̄ (1/√2)(|−½⟩_L + |+½⟩_L) = |t⟩_L̄ |fail⟩_L. For the relative state theory, they derive the conditional probability

q_Ψ(ok_W | t_F̄) = 1/6   (8.18)
This would seem to be in conflict with the claim in the previous section that the probability is zero that W gets the outcome ‘ok’ for his measurement, given that F̄ obtained the outcome ‘tails’ for her measurement, whether or not W̄ measures before W.
To understand the meaning of the probability q_Ψ(ok_W | t_F̄) as Baumann and Wolf define it, note that after the unitary evolution associated with W̄’s measurement on the lab L̄, the state of the two laboratories and W̄ is (from (8.14))

Φ = (1/√12)|ok̄⟩_L̄ |ok̄⟩_W̄ |ok⟩_L − (1/√12)|ok̄⟩_L̄ |ok̄⟩_W̄ |fail⟩_L
  + (1/√12)|fail̄⟩_L̄ |fail̄⟩_W̄ |ok⟩_L + (√3/2)|fail̄⟩_L̄ |fail̄⟩_W̄ |fail⟩_L   (8.19)

Rewriting the L̄-register in the ‘heads’/‘tails’ basis:

Φ = (1/√2)|h⟩_L̄ [ √(5/6)((3/√10)|fail̄⟩_W̄ − (1/√10)|ok̄⟩_W̄)|fail⟩_L
  + (1/√6)((1/√2)|fail̄⟩_W̄ + (1/√2)|ok̄⟩_W̄)|ok⟩_L ]
  + (1/√2)|t⟩_L̄ [ √(5/6)((3/√10)|fail̄⟩_W̄ + (1/√10)|ok̄⟩_W̄)|fail⟩_L
  + (1/√6)((1/√2)|fail̄⟩_W̄ − (1/√2)|ok̄⟩_W̄)|ok⟩_L ]   (8.20)

so

q_Φ(ok_W | t_L̄) = q_Φ(ok_W, t_L̄) / (q_Φ(ok_W, t_L̄) + q_Φ(fail_W, t_L̄)) = 1/6   (8.21)
The conditional probability q_Φ(ok_W | t_L̄) is derived from the jointly observable statistics at the latest time in the unitary evolution, so at the time immediately after W’s measurement, following W̄’s prior measurement.7 After W̄’s measurement of the observable with eigenvalues ‘ok̄,’ ‘fail̄,’ the L̄-observable with eigenvalues ‘heads,’ ‘tails’ is indefinite. The probability q_Φ(ok_W | t_L̄) refers to a situation in which a super-observer W̄ measures an observable with eigenvalues ‘heads’ or ‘tails’ on L̄ and notes the fraction of cases in which W gets ‘ok’ relative to those in which this measurement results in the outcome ‘tails.’ The ‘tails’ value in this measurement is randomly related to the ‘tails’ outcome of F̄’s previous measurement before W̄’s intervention, so the probability q_Φ(ok_W | t_L̄), which Baumann and Wolf identify with q_Φ(ok_W | t_F̄), is not relevant to F̄’s prediction that W will find ‘fail’ in the cases where she finds ‘tails.’ Certainly, given the disruptive nature of W̄’s measurement, it is not the case that W̄ will find ‘tails’ after W̄’s measurement if and only if F̄’s measurement resulted in the outcome ‘tails’ at the earlier time before W̄’s measurement.8
This is not a critique of Baumann and Wolf, who make no such claim. The
purpose of this section is simply to clarify the difference between the conditional
probability qΦ (okW |tF ), as Baumann and Wolf define it, and F ’s conditional
prediction that W will find ‘fail’ in the cases where she finds ‘tails.’
Frauchiger and Renner present their argument as demonstrating that any interpre-
tation of quantum mechanics that accepts assumptions Q, S, and C is inconsistent,
and they point out which assumptions are rejected by specific interpretations. For
the relative state interpretation and the many-worlds interpretation, they say that
these interpretations conflict with assumption S, because
any quantum measurement results in a branching into different ‘worlds,’ in each of which
one of the possible measurement outcomes occurs.
But this is not the issue. Rather, Frauchiger and Renner show something much more
devastating: that for a particular scenario with encapsulated measurements involving
multiple agents, there is a branch of the global quantum state with a contradictory
memory entry.
An interpretation of quantum mechanics is a proposal to reformulate quantum
mechanics as a Boolean theory in the representational sense, either by introducing
hidden variables, or by proposing that every possible outcome occurs in a mea-
surement, or in some other way. An implicit assumption of the Frauchiger-Renner
argument is that quantum mechanics is understood as a representational theory,
in the minimal sense that observers can be represented as physical systems, with
the possibility that observers can observe other observers. What the Frauchiger-
Renner argument really shows is that quantum mechanics can’t be interpreted as a
representational theory at all.
Acknowledgements Thanks to Veronika Baumann, Michael Dascal, Allen Stairs, and Tony
Sudbery for critical comments on earlier drafts.
References
Baumann, V., & Wolf, S. (2018). On formalisms and interpretations. Quantum, 2, 99–111.
Bell, J. (1987). Quantum mechanics for cosmologists. In J. S. Bell (Ed.), Speakable and
unspeakable in quantum mechanics. Cambridge: Cambridge University Press.
Bohr, N. (1939). The causality problem in atomic physics. In New theories in physics (pp. 11–30).
Warsaw: International Institute of Intellectual Cooperation.
Bohr, N. (1949). Discussions with Einstein on epistemological problems in modern physics. In
P. A. Schilpp (Ed.), Albert Einstein: Philosopher-scientist (The library of living philosophers,
vol. 7, pp. 201–241). Evanston: Open Court.
Brown, H. R. (2006). Physical relativity: Space-time structure from a dynamical perspective.
Oxford: Clarendon Press.
Bub, J. (2016). Bananaworld: Quantum mechanics for primates. Oxford: Oxford University Press.
Bub, J., & Pitowsky, I. (2010). Two dogmas about quantum mechanics. In S. Saunders, J. Barrett,
A. Kent, & D. Wallace (Eds.), Many worlds? Everett, quantum theory, and reality (pp. 431–
456). Oxford: Oxford University Press.
Bub, J., & Stairs, A. (2014). Quantum interactions with closed timelike curves and superluminal
signaling. Physical Review A, 89, 022311.
Frauchiger, D., & Renner, R. (2018). Quantum theory cannot consistently describe the use of itself.
Nature Communications, 9, article number 3711.
Gleason, A. M. (1957). Measures on the closed subspaces of Hilbert space. Journal of Mathematics
and Mechanics, 6, 885–893.
Healey, R. (2018). Quantum theory and the limits of objectivity. Foundations of Physics, 48, 1568–
1589.
Janssen, M. (2009). Drawing the line between kinematics and dynamics in special relativity.
Studies in History and Philosophy of Modern Physics, 40, 26–52.
Pauli, W. (1954). ‘Probability and physics,’ In C. P. Enz & K. von Meyenn (Eds.), Wolfgang Pauli:
Writings on physics and philosophy (p. 46). Berlin: Springer. The article was first published in
(1954) Dialectica, 8, 112–124.
Petersen, A. (1963). The philosophy of Niels Bohr. Bulletin of the Atomic Scientists, 19, 8–14.
Pitowsky, I. (1994). George Boole’s ‘conditions of possible experience’ and the quantum puzzle.
British Journal for the Philosophy of Science, 45, 95–125. ‘These . . . may be termed the
conditions of possible experience. When satisfied they indicate that the data may have, when
not satisfied they indicate that the data cannot have, resulted from actual observation.’ Quoted
by Pitowsky from George Boole, ‘On the theory of probabilities,’ Philosophical Transactions
of the Royal Society of London, 152, 225–252 (1862). The quotation is from p. 229.
Pitowsky, I. (2004). Macroscopic objects in quantum mechanics: A combinatorial approach.
Physical Review A, 70, 022103–1–6.
Pitowsky, I. (2007). Quantum mechanics as a theory of probability. In W. Demopoulos & I.
Pitowsky (Eds.), Festschrift in honor of Jeffrey Bub (Western Ontario Series in Philosophy
of Science). New York: Springer.
Renner, R. (2019). Notes on the discussion of ‘Quantum theory cannot consistently describe the
use of itself.’ Unpublished.
von Neumann, J. (1937). Quantum logics: Strict- and probability-logics. A 1937 unfinished
manuscript published in A. H. Taub (Ed.), Collected works of John von Neumann (vol. 4, pp.
195–197). Oxford/New York: Pergamon Press.
von Neumann, J. (1954). Unsolved problems in mathematics. An address to the international
mathematical congress, Amsterdam, 2 Sept 1954. In M. Rédei & M. Stöltzner (Eds.), John von
Neumann and the foundations of quantum physics (pp. 231–245). Dordrecht: Kluwer Academic
Publishers, 2001.
Chapter 9
Physical Computability Theses
B. J. Copeland
Department of Philosophy, University of Canterbury, Christchurch, New Zealand
e-mail: jack.copeland@canterbury.ac.nz
O. Shagrir
Department of Philosophy, The Hebrew University of Jerusalem, Jerusalem, Israel
9.1 Introduction
The physical Church-Turing Thesis (PCTT) limits the behavior of physical systems
to Turing computability. We will distinguish several versions of PCTT, and will
discuss some possible empirical considerations against these. We give special
emphasis to Itamar Pitowsky’s contributions to the formulation of the physical
Church-Turing Thesis, and to his remarks concerning its validity.
The important distinction between ‘modest’ and ‘bold’ variants of PCTT was
noted by Gualtiero Piccinini (2011). Modest variants concern only computing
systems, while bold variants concern the behavior of physical systems without
restriction. The literature contains numerous examples of both modest and bold
formulations of PCTT; e.g., bold formulations appear in Deutsch (1985) and
Wolfram (1985), and modest formulations in Gandy (1980) and Copeland (2000).
We will distinguish the modest and bold variants from a third variant of PCTT,
which we term “super-bold”. This variant goes beyond the other two in including
decidability questions within its scope, saying, roughly, that every physical aspect
of the behavior of any physical system – e.g., stability, periodicity – is Turing
computable.
Once the distinction between the modest, bold, and super-bold variants is drawn,
we will give three different formulations of PCTT: a modest version PCTT-M, a
bold version PCTT-B, and a super-bold version PCTT-S. We will then review some
potential challenges to these three versions, drawn from relativity theory, quantum
mechanics, and elsewhere. We will conclude that all three are to be viewed as open
empirical hypotheses.
The issue of whether every aspect of the physical world is Turing computable was
raised by several authors in the 1960s and 1970s, and the topic rose to prominence
in the mid-1980s. In 1985, Stephen Wolfram formulated a thesis that he described
as “a physical form of the Church-Turing hypothesis”: this says that the universal
Turing machine can simulate any physical system (1985: 735, 738). Wolfram put it
as follows: “[U]niversal computers are as powerful in their computational capacities
as any physically realizable system can be, so that they can simulate any physical
system” (Wolfram 1985: 735). In the same year David Deutsch (who laid the
foundations of quantum computation) formulated a principle that he also called
“the physical version of the Church-Turing principle” (Deutsch 1985: 99). Other
formulations were advanced by Earman (1986), Pour-El and Richards (1989) and
others.
Pitowsky also formulated a version of PCTT, in his paper “The Physical Church
Thesis and Physical Computational Complexity” (Pitowsky 1990), based on his
We will use the term physical to refer to systems whose operations are in accord
with the actual laws of nature. These include not only actually existing systems, but
also idealized physical systems (systems that operate in some idealized conditions),
and physically possible systems that do not actually exist, but that could exist, or
will exist, or did exist, e.g., in the universe’s first moments. (Of course, there is no
consensus about exactly what counts as an idealized or possible physical system,
but this is not our concern here.)
We start by formulating a modest version of PCTT:
Modest Physical Church-Turing Thesis (PCTT-M) Every function computed by any
physical computing system is Turing computable.
The functions referred to need not be defined over discrete values (e.g.,
integers). Many physical systems presumably operate on real-valued magnitudes;
and the same is true of physical computers, e.g., analog computers. The nervous
system too might have analog computing components. All this requires consid-
eration of computability over the reals. The extension of Turing computability to
real-valued domains (or non-denumerable domains more generally) was initiated
by Turing, who discussed computable real numbers in his (1936). Definitions of
1 Some papers from the Workshop were published in 1990, in a special volume of Iyyun. The
volume also contains papers by Avishai Margalit, Charles Parsons, Warren Goldfarb, William Tait,
and Mark Steiner.
The physical processes involved in these scenarios – the motion and stability
of physical systems – may (so far as we know at present) be Turing computable,
in the sense that the motions of the planets may admit of simulation by a Turing
machine, to any required degree of accuracy. (Another way to put this is that
possibly a Turing machine could be used to predict the locations of the planets at
every specific moment.) Yet the answers to certain physical questions about physical
systems – e.g., whether (under ideal conditions) the system’s motion eventually
terminates – may nevertheless be uncomputable. The situation is similar in the case
of the universal Turing machine itself: the machine’s behavior (consisting of the
physical actions of the read/write head) is always Turing computable in the sense
under discussion, since the behavior is produced by the Turing machine’s program;
yet the answers to some yes/no questions about the behavior, such as whether or
not the machine halts given certain inputs, are not Turing computable. Undecidable
questions also arise concerning the dynamics of cellular automata and many other
idealized physical systems.
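The contrast can be made concrete in a small sketch (the two-state machine below and its Python encoding are illustrative assumptions, not an example from the text): each individual step of a Turing machine is a plainly computable function, and “halts within k steps” is decidable for every fixed k; what no algorithm can decide in general is the unbounded question “does it ever halt?”.

```python
from collections import defaultdict

# Transition table: (state, symbol) -> (write, move, next_state).
# A tiny hypothetical machine: it runs right, flipping 0s to 1s,
# and halts when it reads a 1.
PROG = {
    ('A', 0): (1, +1, 'A'),
    ('A', 1): (1, 0, 'HALT'),
}

def step(state, head, tape):
    """One step of the machine: a plainly computable function."""
    write, move, nxt = PROG[(state, tape[head])]
    tape[head] = write
    return nxt, head + move, tape

def halts_within(k, tape_init):
    """Decidable for each fixed bound k: just run k steps and look."""
    state, head = 'A', 0
    tape = defaultdict(int, enumerate(tape_init))
    for _ in range(k):
        if state == 'HALT':
            return True
        state, head, tape = step(state, head, tape)
    return state == 'HALT'

print(halts_within(5, [0, 0, 1]))   # True: the machine halts on the third step
```

No such bounded check settles the unbounded question: removing the bound k is exactly what turns a decidable question into the halting problem.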
We express this form of the physical thesis as follows:
Super-Bold Physical Church-Turing Thesis (PCTT-S): Every physical aspect of the
behavior of any physical system is Turing computable (decidable).
Are these three physical versions of the Church-Turing Thesis true, or even well-
evidenced? We discuss the modest version first.
There have been several attempts to cook up idealized physical machines able
to compute functions that no Turing machine can compute. Perhaps the most
interesting of these are “supertask” machines—machines that complete infinitely
many computational steps in a finite span of time. Among such machines are
accelerating machines (Copeland 1998, Copeland and Shagrir 2011), shrinking
machines (Davies 2001), and relativistic machines (Pitowsky 1990; Hogarth 1994;
Andréka et al. 2009). Pitowsky proposed a relativistic machine in the 1987 lecture
mentioned earlier. At the same time, István Németi also proposed a relativistic
machine, which we outline below.
The fundamental idea behind relativistic machines is intriguing: these machines
operate in spacetime structures with the property that the entire endless lifetime of
one participant is included in the finite chronological past of a second participant—
sometimes called “the observer”. Thus the first participant could carry out an endless
computation, such as calculating each digit of π, in what is a finite timespan from
the observer’s point of view, say 1 hour. Pitowsky described a setup with extreme
acceleration that nevertheless functions in accordance with Special Relativity. His
example is of a mathematician “dying to know whether Fermat’s conjecture is true
or false” (the conjecture was unproved back then). The mathematician takes a trip
in a satellite orbiting the earth, while his students (and then their students, and then
their students . . . ) “examine Fermat’s conjecture one case after another, that is, they
take quadruples of natural numbers (x,y,z,n), with n≥3, and check on a conventional
computer whether x^n + y^n = z^n ” (Pitowsky 1990: 83).
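The students’ task admits a minimal sketch (the enumeration order by total size s below is one hypothetical choice; any enumeration that eventually reaches every quadruple would serve):

```python
def fermat_counterexample_search(max_s):
    """Check every quadruple (x, y, z, n), n >= 3, with x + y + z + n <= max_s:
    a finite portion of the endless search the orbiting mathematician relies on."""
    for s in range(6, max_s + 1):            # smallest case: 1 + 1 + 1 + 3
        for x in range(1, s):
            for y in range(1, s - x):
                for z in range(1, s - x - y):
                    n = s - x - y - z
                    if n >= 3 and x**n + y**n == z**n:
                        return (x, y, z, n)  # signal the satellite!
    return None                              # no counterexample so far

print(fermat_counterexample_search(40))      # None, as Wiles' theorem guarantees
```

Each generation of students extends max_s; only in the relativistic setup does the whole unbounded search fit into the mathematician’s finite wait.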
Pitowsky suggested that similar set-ups could be replicated by spacetime struc-
tures in General Relativity (now sometimes called Malament-Hogarth spacetimes).
Mark Hogarth (1994) pointed out the non-recursive computational powers of
devices operating in these spacetimes. More recently, Etesi and Németi (2002),
Hogarth (2004), Welch (2008), Button (2009), and Barrett and Aitken (2010) have
further explored the computational powers of such devices, within and beyond the
arithmetical hierarchy. In what follows, we describe a relativistic machine RM that
arguably computes the halting function (we follow Shagrir and Pitowsky (2003)).
RM consists of a pair of communicating Turing machines TA and TB : TA ,
the observer, is in motion relative to TB , a universal machine. When the input
(m,n)—asking whether the mth Turing machine (in some enumeration of the Turing
machines) halts or not, when started on input n—enters TA , TA first prints 0
(meaning “never halts”) in its designated output cell and then transmits (m,n) to TB .
TB simulates the computation performed by the mth Turing machine when started
on input n and sends a signal back to TA if and only if the simulation terminates. If
TA receives a signal from TB , it deletes the 0 it previously wrote in its output cell
and writes 1 there instead (meaning “halts”). After 1 hour, TA ’s output cell shows 1
if the mth Turing machine halts on input n and shows 0 if the mth machine does not
halt on n. Since RM is able to do this for any input pair (m,n), RM will deliver any
desired value of the halting function. There is further discussion of RM in Copeland
and Shagrir (2007).
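RM’s protocol can be sketched classically, with one unavoidable cheat: a finite step budget stands in for the unbounded time available to TB in the relativistic setup (the machine encodings and the budget below are illustrative assumptions). With the budget in place the sketch only semi-decides halting; it is the spacetime geometry, not the protocol, that removes the bound.

```python
def simulate(machine, n, budget):
    """TB's job: run `machine` (a Python function standing in for the m-th
    Turing machine) on input n; return True iff it halts within `budget`."""
    gas = [budget]
    def tick():                   # the machine calls tick() once per step
        gas[0] -= 1
        if gas[0] < 0:
            raise TimeoutError
    try:
        machine(n, tick)
        return True               # simulation terminated: send the signal
    except TimeoutError:
        return False              # budget exhausted (RM would still be waiting)

def rm_output(machine, n, budget=10**5):
    output_cell = 0                      # TA first prints 0 ("never halts")
    if simulate(machine, n, budget):     # signal from TB arrives
        output_cell = 1                  # TA overwrites with 1 ("halts")
    return output_cell

# Two stand-in "machines": one halts, one loops forever.
halter = lambda n, tick: [tick() for _ in range(n)]
def looper(n, tick):
    while True:
        tick()

print(rm_output(halter, 10))   # 1
print(rm_output(looper, 10))   # 0 (within the budget; RM itself needs no budget)
```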
Here we turn to the question of whether RM is a counterexample to PCTT-
M. This depends on whether RM is physical and on whether it really computes
the halting function. First, is RM physical? Németi and his colleagues provide the
most physically realistic construction, locating machines like RM in setups that
include huge slowly rotating Kerr-type black holes (Andréka et al. 2009). They
emphasize that the computation is physical in the sense that “the principles of
quantum mechanics are not violated” and RM is “not in conflict with presently
accepted scientific principles”; and they suggest that humans might “even build”
a relativistic computer “sometime in the future” (Andréka et al. 2009: 501).
Naturally, all this is controversial. John Earman and John Norton pointed out
that communication between the participants is not trivially achieved, due to
extreme blue-shift effects, including the possibility of the signal destroying the
receiving participant (Earman and Norton 1993). Subsequently, several potential
solutions to this signaling problem have been proposed; see Etesi and Németi
(2002), Németi and Dávid (2006) and Andréka et al. (2009: 508–9). An additional
potential objection is that infinitary computation requires infinite memory, and so
requires infinite computation space (Pitowsky 1990: 84). Another way of putting
the objection is that the infinitary computation requires an unbounded amount
of matter-energy, which seems to violate the basic principles of quantum gravity
(Aaronson 2005)—although Németi and Dávid (2006) offer a proposed solution to
this problem. We return to the infinite memory problem in a later section.
Second, does RM compute the halting function? The answer depends on what
is included under the heading “physical computation”. We cannot even summarize
here the diverse array of competing accounts of physical computation found in the
current literature. But we can say that RM computes in the senses of “compute”
staked out by several of these accounts: the semantic account (Shagrir 2006;
Sprevak 2010), the mechanistic account (Copeland 2000; Miłkowski 2013; Fresco
2014; Piccinini 2015), the causal account (Chalmers 2011), and the BCC (broad
conception of computation) account (Copeland 1997). According to all these
accounts, RM is a counterexample to the modest thesis if RM is physical; at the
very least, RM seems to show that non-Turing physical computation is logically
possible. However, if computation is construed as the execution of an algorithm in
the classical sense, then RM does not compute.
PCTT-B says that the behavior of every physical system can be simulated (to any
required degree of precision) by a Turing machine. Speculation that there may be
physical processes whose behavior cannot be calculated by the universal Turing
machine stretches back over several decades (see Copeland 2002 for a survey).
The focus has been not so much on idealized constructions such as RM (which,
if physical, is a counterexample to PCTT-B, as well as PCTT-M, since PCTT-B
entails PCTT-M); rather, the focus has been on whether the mathematical equations
governing the dynamics of physical systems are or are not Turing computable.
Early papers by Scarpellini (1963), Komar (1964), and Kreisel (1965, 1967) raised
this question. Georg Kreisel stated “There is no evidence that even present day
quantum theory is a mechanistic, i.e. recursive theory in the sense that a recursively
described system has recursive behavior” (1967: 270). Roger Penrose (1989, 1994)
conjectured that some mathematical insights are non-recursive. Assuming that this
mathematical thinking is carried out by some physical processes in the brain, the
bold thesis must then be false. But Penrose’s conjecture is highly controversial.
Another challenge to the bold thesis derives from the postulate of genuine
physical randomness (as opposed to quasi-randomness). Church showed in 1940
that any infinite, genuinely random sequence is uncomputable (Church 1940: 134–
135). Some have argued that, under certain conditions relating to unboundedness,
PCTT-B is false in a universe containing a random element (to use Turing’s term
from his 1950: 445; see also Turing 1948: 416). A random element is a system that
generates random sequences of bits. It is argued that if physical systems include
systems capable of producing unboundedly many digits of an infinite random binary
sequence, then PCTT-B is false (Copeland 2000, 2004; Calude and Svozil 2008;
Calude et al. 2010; Piccinini 2011). One of us, Copeland, also argues that (again
under unboundedness conditions) a digital computer using a random element forms
a counterexample to PCTT-M (Copeland 2002). However, the latter claim, unlike
the corresponding claim concerning PCTT-B, depends crucially upon one’s account
Now to functions:
Even if, in our world, the initial conditions envisaged in the Pour-El & Richards
example do not occur, nevertheless these conditions could occur in some other
physically possible world in which the wave equation holds, so showing, as
Pitowsky said, that physical processes do not necessarily preserve recursiveness.
Pitowsky further noted that many other yes/no questions about the system are
computationally undecidable. In fact, it follows from Rice’s theorem that “almost
every conceivable question about the unbounded future of a universal Turing
machine turns out to be computationally undecidable” (Pitowsky 1996: 171; see
also Harel and Feldman 1992). Rice’s theorem says: any nontrivial property of
the language recognized by a Turing machine is undecidable, where a property is
nontrivial if at least one recognizable language has the property and at least one
does not.
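The standard proof idea behind Rice’s theorem can be sketched in miniature (Python functions here stand in for recognizers; `has_property` and `witness` are hypothetical placeholders, not real decision procedures): a decider for any nontrivial language property would yield a halting decider.

```python
def build(m, n, witness):
    """Recognizer whose language equals witness's language if machine m halts
    on input n, and the empty language otherwise. The construction itself is
    perfectly computable -- only deciding what it recognizes is not."""
    def machine(w):
        m(n)                 # may run forever: then nothing is accepted
        return witness(w)    # otherwise, accept exactly what witness does
    return machine

# If some oracle `has_property` decided a nontrivial property P (with
# witness's language in P and the empty language not in P), then:
def halts(m, n, has_property, witness):
    return has_property(build(m, n, witness))
# ...would decide the halting problem -- which is impossible.

print(build(lambda n: None, 0, lambda w: w == "a")("a"))   # True
```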
However, the objection that PCTT-S is straight-out false can hardly be sustained.
It would by no means be facile to implement a Turing machine, with its infinite
tape, in a finite physical object. The assumption that the physical universe is able
to supply an infinite memory tape is controversial. If the number of particles in
the cosmos is finite, and each cell of the tape requires at least one particle for its
implementation, then clearly the assumption of an infinite physical tape must be
false. Against this, it might be pointed out that there are a number of well-known
constructions for shoehorning an infinite amount of memory into a finite object. For
example, a single strip of tape 1 metre long can be used: the first cell occupies the
first half-metre, the second cell the next quarter-metre, the third cell the next eighth
of a metre, and so on. However, this observation does not help matters. Obviously,
this construction can be realized only in a universe whose physics allows for the
infinite divisibility of the tape—again by no means an evidently true assumption.
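The construction can be made explicit: under the idealized assumption of an infinitely divisible tape, cell k occupies the interval [1 − 2^−k, 1 − 2^−(k+1)) metres, so every cell has nonzero length, yet all infinitely many cells fit inside one metre.

```python
from fractions import Fraction

def cell_interval(k):
    """Physical extent (in metres) of tape cell k = 0, 1, 2, ..."""
    left = 1 - Fraction(1, 2**k)
    right = 1 - Fraction(1, 2**(k + 1))
    return left, right

# The lengths form a geometric series summing to at most one metre.
total = sum(r - l for l, r in (cell_interval(k) for k in range(50)))
print(float(total))   # just under 1.0: the first 50 cells use under a metre
```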
Another way to implement the Turing machine with its infinite tape is in the
continuous (or rational) values of a physical magnitude, as described in Moore’s
system. Assume that each (potential) configuration of the Turing machine is
encoded in a (potentially infinite) sequence of 0s and 1s. The goal now is to
efficiently realize each sequence in a unique location of the particle in the (finite)
unit square. Much like the example of the 1 metre strip of tape above, this realization
can be achieved if we use potentially infinitely many different locations (x,y) within
a unit square, where x and y are continuous (or rational) values between 0 and 1.
This realization, however, requires that the (idealized) mirrors have the ability to
bounce the particle, accurately, into potentially infinitely many different positions
within the unit square. Moreover, if Laplace’s Demon wants to predict the future
position of the particle after k rounds, the Demon will have to be able to measure the
differences between arbitrarily close positions. As Pitowsky notes: “[T]o predict the
particle position with accuracy of 1/2 the demon has to measure initial conditions
with accuracy 2^−(k+1). The ratio of final error to initial error is 2^k, and growing
exponentially with time k (measured by the number of rounds)” (1996, 167). This
means that implementing a Turing machine in Moore’s system requires determining
a particle’s position with practically infinite precision (an accuracy of 2^−(k+1) for
unbounded k), and it is questionable whether this implementation is physically
feasible. At any rate, the claim that this is physically feasible is, again, far from
obvious.
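The exponential error growth Pitowsky describes can be illustrated with the doubling map x ↦ 2x mod 1, used here as a simple stand-in for the shift-like dynamics of Moore’s system (the initial point and precision below are arbitrary choices):

```python
from fractions import Fraction

def doubling(x):
    """One round of the map x -> 2x mod 1, in exact rational arithmetic."""
    return (2 * x) % 1

def error_after(k, eps):
    """Final separation of two trajectories started eps apart."""
    x, y = Fraction(1, 7), Fraction(1, 7) + eps
    for _ in range(k):
        x, y = doubling(x), doubling(y)
    return abs(y - x)

# The ratio of final error to initial error is 2**k, as Pitowsky states.
eps = Fraction(1, 2**20)
for k in [1, 5, 10]:
    print(error_after(k, eps) / eps)   # 2, 32, 1024
```

To pin down round k in advance, the Demon must resolve initial conditions to roughly 2^−(k+1), matching the quoted estimate.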
To summarize this discussion, PCTT-S hypothesizes in part that the universe
is physically unable to supply an infinite amount of memory, since if PCTT-S
is true, the resources for constructing a universal computing machine must be
unavailable (the other necessary resources, aside from the infinite memory, being
physically undemanding). This point helps illuminate the relationships between
PCTT-S and PCTT-M. Returning to the discussion of the relativistic machine RM
and the infinite memory problem raised in Sect. 9.3, it is clearly the case that, since
RM requires infinite memory, PCTT-S rules out RM (and this is to be expected,
since PCTT-M rules out RM, and PCTT-S entails PCTT-M). Nevertheless, the
falsity of PCTT-S, and the availability of infinite memory, would be insufficient
to falsify PCTT-M—a universe that consists of nothing but a universal Turing
machine with its infinite tape does not falsify PCTT-M. Thus, counterexamples
to PCTT-M must postulate not only the availability of infinite memory but also
additional physical principles of some sort, such as gravitational time dilation
or unbounded acceleration or unbounded shrinking of components. Relativistic,
accelerating and shrinking machines arguably invoke these principles successfully,
and, hence, provide counterexamples to PCTT-M.
Moving on to challenges to PCTT-S at the quantum level, there are undecidable
questions concerning the behavior of quantum systems. In 1986, Robert Geroch
and James Hartle argued that undecidable physical theories “should be no more
unsettling to physics than has the existence of well-posed problems unsolvable by
any algorithm been to mathematics”; and they suggested such theories may
be “forced upon us” in the quantum domain (Geroch and Hartle 1986: 534, 549).
Arthur Komar raised “the issue of the macroscopic distinguishability of quantum
states” in 1964, claiming there is no effective procedure “for determining whether
two arbitrarily given physical states can be superposed to show interference effects”
(Komar 1964: 543–544). More recently, Jens Eisert, Markus Müller and Christian
Gogolin showed that “the very natural physical problem of determining whether
certain outcome sequences cannot occur in repeated quantum measurements is
undecidable, even though the same problem for classical measurements is readily
9.6 Conclusion
We have distinguished three theses about physical computability, and have discussed
some empirical evidence that might challenge them. Concerning the bold and super-bold
versions, PCTT-B and PCTT-S: both are false if the physical universe permits
infinite memory and genuine randomness. Even assuming that the physical universe
is deterministic, the most that can be said for PCTT-B and PCTT-S is that, to date,
there seems to be no decisive empirical evidence against them. PCTT-B and PCTT-
S are both thoroughly empirical theses; but matters are more complex in the case
of the modest thesis PCTT-M, since a conceptual issue also bears on the truth or
falsity of this thesis (even in a universe containing genuine randomness)—namely,
the difficult issue of what counts as physical computation. Our conclusion is that,
at the present stage of physical enquiry, it is unknown whether any of the theses is
true.
Acknowledgements An early version of this chapter was presented at the Workshop on Physics
and Computation (UCNC2018) in Fontainebleau, at the International Association for Computing
and Philosophy meeting (IACAP2018) in Warsaw, and at the logic seminar at Tel Aviv
University. Thanks to the audiences for helpful discussion. We are also grateful to
Gualtiero Piccinini for
his comments on a draft of this paper. Shagrir’s research was supported by the Israel Science
Foundation grant 830/18.
References
Aaronson, S. (2005). Guest column: NP-complete problems and physical reality. ACM SIGACT
News, 36(1), 30–52.
Andréka, H., Németi, I., & Németi, P. (2009). General relativistic hypercomputing and foundation
of mathematics. Natural Computing, 8, 499–516.
Barrett, J. A., & Aitken, W. (2010). A note on the physical possibility of ordinal computation. The
British Journal for the Philosophy of Science, 61, 867–874.
Button, T. (2009). SAD computers and two versions of the Church-Turing thesis. The British
Journal for the Philosophy of Science, 60, 765–792.
Calude, C. S., Dinneen, M. J., Dumitrescu, M., & Svozil, K. (2010). Experimental evidence of
quantum randomness incomputability. Physical Review A, 82, 022102.
Calude, C. S., & Svozil, K. (2008). Quantum randomness and value indefiniteness. Advanced
Science Letters, 1, 165–168.
Castelvecchi, D. (2015). Paradox at the heart of mathematics makes physics problem unanswer-
able. Nature, 528, 207.
Chalmers, D. J. (2011). A computational foundation for the study of cognition. Journal of
Cognitive Science, 12, 323–357.
Church, A. (1936). An unsolvable problem of elementary number theory. American Journal of
Mathematics, 58, 345–363.
Church, A. (1940). On the concept of a random sequence. American Mathematical Society Bulletin,
46, 130–135.
Copeland, B. J. (1997). The broad conception of computation. American Behavioral Scientist, 40,
690–716.
Copeland, B. J. (1998). Even Turing machines can compute uncomputable functions. In C. Calude,
J. Casti, & M. Dinneen (Eds.), Unconventional models of computation. New York: Springer.
Copeland, B. J. (2000). Narrow versus wide mechanism. Journal of Philosophy 96, 5–32. Reprinted
in M. Scheutz (Ed.) (2002). Computationalism. Cambridge, MA: MIT Press.
Copeland, B. J. (2002). Hypercomputation. Minds and Machines, 12, 461–502.
Copeland, B. J. (2004). Hypercomputation: Philosophical issues. Theoretical Computer Science,
317, 251–267.
Copeland, B. J. (2017). The Church-Turing thesis. In E. N. Zalta (Ed.), The Stanford encyclopedia
of philosophy. https://plato.stanford.edu/archives/win2017/entries/church-turing/
Copeland, B. J., & Shagrir, O. (2007). Physical computation: How general are Gandy’s principles
for mechanisms? Minds and Machines, 17, 217–231.
Copeland, B. J., & Shagrir, O. (2011). Do accelerating Turing machines compute the uncom-
putable? Minds and Machines, 21, 221–239.
Copeland, B. J., & Shagrir, O. (2019). The Church–Turing thesis—Logical limit, or breachable
barrier? Communications of the ACM, 62, 66–74.
Copeland, B. J., Shagrir, O., & Sprevak, M. (2018). Zuse’s thesis, Gandy’s thesis, and Penrose’s
thesis. In M. Cuffaro & S. Fletcher (Eds.), Physical perspectives on computation, computa-
tional perspectives on physics (pp. 39–59). Cambridge: Cambridge University Press.
Cubitt, T. S., Perez-Garcia, D., & Wolf, M. M. (2015). Undecidability of the spectral gap. Nature,
528(7581), 207–211.
Davies, E. B. (2001). Building infinite machines. The British Journal for the Philosophy of Science,
52, 671–682.
Deutsch, D. (1985). Quantum theory, the Church-Turing principle and the universal quantum
computer. Proceedings of the Royal Society of London A: Mathematical, Physical and
Engineering Sciences, 400(1818), 97–117.
Earman, J. (1986). A primer on determinism. Dordrecht: Reidel.
Earman, J., & Norton, J. D. (1993). Forever is a day: Supertasks in Pitowsky and Malament-
Hogarth spacetimes. Philosophy of Science, 60, 22–42.
Eisert, J., Müller, M. P., & Gogolin, C. (2012). Quantum measurement occurrence is undecidable.
Physical Review Letters, 108(26), 260501.
Etesi, G., & Németi, I. (2002). Non-Turing computations via Malament-Hogarth space-times.
International Journal of Theoretical Physics, 41, 341–370.
Fresco, N. (2014). Physical computation and cognitive science. Berlin: Springer.
Gandy, R. O. (1980). Church’s thesis and principles of mechanisms. In S. C. Kleene, J. Barwise,
H. J. Keisler, & K. Kunen (Eds.), The Kleene symposium. Amsterdam: North-Holland.
Geroch, R., & Hartle, J. B. (1986). Computability and physical theories. Foundations of Physics,
16, 533–550.
Grzegorczyk, A. (1955). Computable functionals. Fundamenta Mathematicae, 42, 168–203.
Grzegorczyk, A. (1957). On the definitions of computable real continuous functions. Fundamenta
Mathematicae, 44, 61–71.
Harel, D., & Feldman, Y. A. (1992). Algorithmics: The spirit of computing (3rd ed.). Harlow:
Addison-Wesley.
Hogarth, M. (1994). Non-Turing computers and non-Turing computability. Proceedings of the
Biennial Meeting of the Philosophy of Science Association, 1, 126–138.
Hogarth, M. (2004). Deciding arithmetic using SAD computers. The British Journal for the
Philosophy of Science, 55, 681–691.
Komar, A. (1964). Undecidability of macroscopically distinguishable states in quantum field
theory. Physical Review, second series, 133B, 542–544.
Kreisel, G. (1965). Mathematical logic. In T. L. Saaty (Ed.), Lectures on modern mathematics (Vol.
3). New York: Wiley.
Kreisel, G. (1967). Mathematical logic: What has it done for the philosophy of mathematics? In R.
Schoenman (Ed.), Bertrand Russell: Philosopher of the century. London: Allen and Unwin.
Lacombe, D. (1955). Extension de la notion de fonction récursive aux fonctions d’une ou plusieurs
variables réelles III. Comptes Rendus Académie des Sciences Paris, 241, 151–153.
Mazur, S. (1963). Computational analysis. Warsaw: Razprawy Matematyczne.
Miłkowski, M. (2013). Explaining the computational mind. Cambridge: MIT Press.
Moore, C. (1990). Unpredictability and undecidability in dynamical systems. Physical Review
Letters, 64, 2354–2357.
Németi, I., & Dávid, G. (2006). Relativistic computers and the Turing barrier. Journal of Applied
Mathematics and Computation, 178, 118–142.
9 Physical Computability Theses 231
Penrose, R. (1989). The emperor’s new mind: Concerning computers, minds and the laws of
physics. Oxford: Oxford University Press.
Penrose, R. (1994). Shadows of the mind. Oxford: Oxford University Press.
Piccinini, G. (2011). The physical Church-Turing thesis: Modest or bold? The British Journal for
the Philosophy of Science, 62, 733–769.
Piccinini, G. (2015). Physical computation: A mechanistic account. Oxford: Oxford University
Press.
Pitowsky, I. (1990). The physical Church thesis and physical computational complexity. Iyyun, 39,
81–99.
Pitowsky, I. (1996). Laplace’s demon consults an oracle: The computational complexity of
prediction. Studies in History and Philosophy of Science Part B, 27(2), 161–180.
Pitowsky, I. (2002). Quantum speed-up of computations. Philosophy of Science, 69, S168–S177.
Pour-El, M. B. (1974). Abstract computability and its relation to the general purpose analog
computer (some connections between logic, differential equations and analog computers).
Transactions of the American Mathematical Society, 199, 1–28.
Pour-El, M. B., & Richards, I. J. (1981). The wave equation with computable initial data such that
its unique solution is not computable. Advances in Mathematics, 39, 215–239.
Pour-El, M. B., & Richards, I. J. (1989). Computability in analysis and physics. Berlin: Springer.
Scarpellini, B. (1963). Zwei unentscheidbare Probleme der Analysis. Zeitschrift für mathematische
Logik und Grundlagen der Mathematik, 9, 265–289. English translation in Minds and Machines
12 (2002): 461–502.
Shagrir, O. (2006). Why we view the brain as a computer. Synthese, 153, 393–416.
Shagrir, O., & Pitowsky, I. (2003). Physical hypercomputation and the Church–Turing thesis.
Minds and Machines, 13, 87–101.
Sprevak, M. (2010). Computation, individuation, and the received view on representation. Studies
in History and Philosophy of Science, 41, 260–270.
Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem.
Proceedings of the London Mathematical Society, Series, 2(42), 230–265.
Turing, A. M. (1947). Lecture on the Automatic Computing Engine. In Copeland, B. J. (2004). The
essential Turing: Seminal writings in computing, logic, philosophy, artificial intelligence, and
artificial life, plus the secrets of Enigma. Oxford University Press.
Turing, A. M. (1948). Intelligent machinery. In The essential Turing. Oxford University Press.
Turing, A.M. (1950). Computing machinery and intelligence. In The essential Turing. Oxford
University Press.
Welch, P. D. (2008). The extent of computation in Malament-Hogarth spacetimes. The British
Journal for the Philosophy of Science, 59, 659–674.
Wolfram, S. (1985). Undecidability and intractability in theoretical physics. Physical Review
Letters, 54, 735–738.
Chapter 10
Agents in Healey’s Pragmatist Quantum
Theory: A Comparison with Pitowsky’s
Approach to Quantum Mechanics
Mauro Dorato
10.1 Introduction
In a series of related papers (Healey 2012a, b, 2015, 2017c, 2018a, b) and in a book (Healey 2017b), Richard Healey has proposed and articulated a new pragmatist
approach to quantum theory (PQT henceforth) that so far has not received the
attention it deserves. And yet, Healey’s approach – besides trying to clarify and
solve (or dissolve) well-known problems in the foundations of quantum mechanics
from a technical point of view – also has relevant consequences for philosophy in general, in particular in areas like scientific realism, metaphysics, the philosophy of probability, and the philosophy of language. Given that quantum theory (even in its non-relativistic form) is one of the two pillars of contemporary physics,1 any attempt to offer a coherent and general outlook on the physical world that, like his, is not superficial or merely suggestive, is certainly welcome.

M. Dorato (✉)
Department of Philosophy, Communication and Performing Arts, Università degli Studi Roma Tre, Rome, Italy
e-mail: mauro.dorato@uniroma3.it
The main aim of this paper is to show that, as a consequence of this ambition,
PQT raises important questions related to its compatibility with a narrowly defined
form of physicalism that enables us to clearly specify in what sense agents can
be regarded as quantum systems. It is important to note from the start that this
problem is related to the question whether quantum theory can be applied to all
physical systems – and therefore also to agents who use it (see Frauchiger and
Renner 2018; Healey 2018b) – to the extent that this applicability is a necessary
condition to consider them as physical systems. In what follows, I will not discuss
no-go arguments of this kind, since in the relationist context that Healey at least
initially advocated and that will be the subject of my discussion, he himself grants
that agents can be regarded as quantum systems by other agents.2
In this perspective, let me refer to this particular form of physicalism as “agent
physicalism”, in order to distinguish it from other approaches to physicalism that
Healey does not explicitly reject – among these, supervenience physicalism regarded
as the aim of physics3 - and that therefore will not be the target of my paper. In fact,
Healey grants that a particular form of physicalism – according to which our most
fundamental theory provides a complete ontology on which everything else must
supervene – is incompatible with PQT. From the outset, he claims that quantum
theory “lacking beables of its own, has no physical ontology and states no fact
about physical objects and events”, i.e. is not a fundamental description of the
physical world, but it rather amounts to a series of effective prescriptions yielding
correct attributions of quantum states to physical systems (Healey 2017b, pp. 11–
12). PQT’s normative approach to quantum theory is evident from the terminology that abounds in his papers (“prescribes”, “provides good advice”, etc.) and that presupposes rules or norms that rational agents should follow. In fact, PQT postulates non-representational relations between physically situated agents’ epistemic
states (inter-subjectively shared credences) on the one hand, and mind-independent
probabilistic correlations instantiated by physical systems on the other. As we will
1 The other being, of course, general relativity. In this paper, I will neglect any considerations
emerging from quantum field theory to which, in the intention of his proposer, PQT certainly
applies. This fact is also clear from the examples that he discusses.
2 There are various forms of physicalism in the literature and here I am not interested in discussing
them all.
3 “Quine (1981, p. 98) offered this argument for a nonreductive physicalism: “Nothing happens
in the world, not the flutter of an eyelid, not the flicker of a thought, without some redistribution
of microphysical states. . . If the physicist suspected there was any event that did not consist in a
redistribution of the elementary states allowed for by his physical theory, he would seek a way of
supplementing his theory” (Healey 2017a, 235). Even though Healey never rejected supervenience
physicalism, he adds that: “If anything, the quantum revolution set back progress toward Quine’s
goal” (Healey 2017a, b, p. 236).
see, one of the main reasons for tension between agent physicalism and Healey’s pragmatist approach to quantum mechanics is that, lacking a precise characterization of agents, which Healey does not provide, we don’t know what PQT is about. To the extent that agent physicalism requires an exact notion of agent, PQT is incompatible with it and quantum theory cannot give a quantum description of an agent. In this sense, this particular form of physicalism is incompatible with PQT.4
I will begin by briefly reviewing PQT by putting it into a more general
philosophical frame. In the second section I will discuss Healey’s self-proclaimed quantum realism by arguing that, aside from an implicit endorsement of a contextual entity realism,5 his position is really instrumentalist despite his claims to the contrary (Healey 2018a). This conclusion will be important in order to clarify in more detail the ontological status of Healey’s agents and the related issue of agent physicalism. In the third section, I will discuss the problem of providing some criterion to distinguish quantum systems that are agents from those that are not and cannot be agents (like electrons or quarks). As I will argue later, the fact that Healey’s characterization of agents plausibly relies on the irreducibly epistemic notion of information is a first argument in favor of the incompatibility between agent physicalism and PQT.6 This argument will allow me to compare Healey’s view of quantum theory with Pitowsky’s approach to the measurement problem, by discussing in particular the role of observers and the notion of information in their philosophy.
Before beginning, let me put forth two points to be kept in mind in what follows.
First, in order to be as faithful as possible to Healey’s claims and avoid possible
misunderstandings on my part, it will be necessary to resort to a few quotations from
his texts. The second point is terminological but important nevertheless: “realism about the wave function ψ” does not entail a commitment to the existence of a complex-valued function defined on an abstract mathematical space but entails the view that ψ denotes a real physical state that, following many authors and Healey himself, I will call “the quantum state”.
4 The question whether the failure of agent physicalism also entails the failure of supervenience physicalism will not be an object of this paper.
5 In PQT, also the reality of quantum particles is a contextual matter.
6 Of course, agents can be characterized in terms of supervenience physicalism.
7 See also Glick (2018).
8 Healey’s former student Price (2011) has certainly played a role too.
9 A recent, rather original approach to quantum theory that for various aspects is similar to Healey’s
have typically assumed that their main task is to answer the question: “what can
the world be like if quantum theory is true?”. Thirty years ago, Healey himself
adopted this viewpoint, since the words in scare quotes are taken from his older
book on quantum mechanics (Healey 1989, p. 6). As he sees it now, however,
the problem with the above question is that it is responsible for a long-standing
dissent in the philosophy of quantum mechanics, which should be put to an end.
The question of interpreting the formalism of the theory in a realistic framework
typically presupposes the so-called “beables”, whose postulation is needed as an
attempt to answer what I take to be an inescapable question: “what is quantum
theory about?” For scientific realists, a refusal to answer this question is equivalent to betraying the aim of physics.10
However, according to Healey, in the philosophy of quantum mechanics this is
the wrong question to ask, since it leads to the formulation of theories (Bohm’s,
GRW’s, and many worlds’) underdetermined by their key parameters and by the
evidence.11 For instance, an instrumentalist view of the quantum state and Bohm’s
realism about the latter state in principle cannot make any predictive difference. This
point has been repeatedly insisted upon also by Pitowsky: “I think that Weinberg’s
point and also Bohm’s theory are justified only to the extent that they yield new
discoveries in physics (as Weinberg certainly hoped). So far they haven’t” (Pitowsky
2005, p. 4, my emphasis).12
The new task that Healey assigns to PQT is not to explain the success of the Born
rules by appealing to some constructive theory in the sense of Einstein (1919), but
rather to explain how quantum theory can be so empirically successful even while refusing to grant both representational power to its models and “reality” to the quantum state and the Born rule, in a sense of “reality” to be carefully specified below
(Healey 2018a).13 Somewhat surprisingly, he claims that his pragmatist approach,
somehow strongly suggested by the theory itself, constitutes the new revolutionary
aspect of quantum mechanics (2012a, b, 2017b, p. 121). This claim should strike
the reader as particularly controversial for at least two reasons. Firstly, as this
section illustrates, a pragmatist philosophy is at least in part presupposed by Healey and, like any philosophical approach to QT, cannot be read just from the data and the theory. Secondly, despite some differences that basically involve the idea of complementarity, Healey’s PQT, rather than being new, is quite close to
Bohr’s philosophy of quantum mechanics, especially if the latter is purified from
myths surrounding the so-called “Copenhagen interpretation” (Howard 2004) and
10 “What is the theory about?” presupposes simpleminded answers like: classical mechanics is
about particles acted upon by forces, and classical electromagnetism is about fields generated by
charged particles.
11 For this position, see also Bub (2005).
12 Pitowsky is referring to Weinberg’s attempt to reconstruct gravity on a flat spacetime (ibid.).
13 In particular when discussing a crucial but obscure passage in which he claims that “a quantum
states is not an element of reality (it is not a beable”) since “the wave function does not represent
a physical system or its properties. But is real nonetheless” Healey (2017c, p. 172).
read more charitably in light of more careful historical reconstructions (Faye and
Folse 2017).14
In order to convince ourselves of the proximity of PQT and Bohr’s philosophy,
consider the following first point. It is true that Healey’s empiricist understanding
of quantum theory, unlike Bohr’s, presupposes decoherence, but Howard (1994) and
Camilleri and Schlosshauer (2008, 2015) have put forward the claim that this notion
was in some sense already implicit in Bohr’s philosophy of quantum mechanics,
despite the lack at his time of the relevant physical knowledge. Secondly, even if this
somewhat ahistorical claim were rejected, one must consider PQT’s treatment of the
classical-quantum separation. Without a clear criterion that separates the quantum
from the classical regime, the set of magnitudes that Healey refers to as “non-
quantum” can be typically regarded as those described by classical physics. Healey,
however, explicitly denies that “non-quantum” refers to “classical” since, he claims,
it also refers to the properties of any ordinary physical object. However, the term
‘ordinary object’ is rather vague, unless it is intended to refer to objects with which
we interact in our ordinary life and in our ordinary environment. Consequently, the
reports of experimental results via what he calls NQMCs (non-quantum magnitude claims)15 involve objects that can be described only by classical physics. This
claim resembles Bohr’s appeal to classical physics as the condition of possibility to
describe the quantum world: “indeed, if quantum theory itself provides no resources
for describing or representing physical reality, then all present and future ways of
describing or representing it will be non-quantum” (Healey 2012a, p. 740).
In addition, there are two close analogies between PQT and Bohr’s empiricist
philosophy of quantum mechanics. The first involves Bohr’s pragmatism, whose
importance for his philosophy has been supported by highly plausible historical
evidence (see among others Murdoch 1987 and Folse 1985). Furthermore, in virtue of his holistic view of quantum mechanics, Bohr – analogously to Healey – would not deny that there is a (weak) sense in which a quantum state refers to or describes genuinely physical, mind-independent probabilistic correlations between measurement apparatuses and quantum systems. If Healey denied any representational force to the wave function, he would be in the difficult position of having to justify the claim that only classical models, unlike quantum models, represent or describe physical
entities. Otherwise, whence this epistemic separation if one cannot have recourse
to an ontological distinction (remember that in PQT there are no beables)?
These remarks, which are put forward to show the persisting though often
unrecognized influence of Bohr’s philosophy of quantum mechanics,16 are not
14 For a neo Copenhagen interpretation that here I cannot discuss, see Brukner (2017).
15 NQMCs (or canonical magnitude claims) are of the form “the value of dynamical variable M on
physical system S lies in a certain interval”.
16 After the 1935 confrontation with Einstein, physicists attributed to Bohr a definite victory over
Einstein’s criticism. But since the late 1960s surge of interest in the foundations of physics caused
by Bell’s theorem and by Bell’s own sympathy for alternative formulations of quantum mechanics, Bohr has come to be regarded as responsible – and not just by philosophers – for having “brainwashed
a whole generation of physicists into thinking that the job was done 50 years ago” (Gell-Mann
intended to deny that PQT diverges from the latter and that it therefore significantly
enriches the contemporary debate in the philosophy of quantum mechanics.
P3. The last notable aspect, the relationality of quantum systems, touches more
on the metaphysical aspects of PQT and, even if it was more stressed in the earlier formulation of PQT, it is the most relevant for my purposes and in any case interesting in its own right. In order to summarize its main
thrust, it will be helpful to compare it with Rovelli’s relational quantum mechanics
(Rovelli 1996; Rovelli and Smerlak 2007; Laudisa and Rovelli 2008; Dorato 2016).
Formally, PQT does not accept the eigenstate-eigenvalue link. Philosophically, the
difference involves Healey’s stress on a pragmatist aspect. According to Rovelli’s
view, any interaction of a system with another system (not necessarily an agent)
yields definite results (events). In PQT ascriptions of quantum states are not
relative “to any other system, but only to physically instantiated agents, whether
conscious or not” (Healey 2012a, p. 750, emphasis added). Healey is aware that
his relationism may conflict with agent physicalism but claims to have a way
out: “Since agents are themselves in the physical world, in principle one agent
may apply a model to another agent regarded as part of the physical world. This
second agent then becomes the target of the first, who wields the model. I say
“in principle” since human agents are so complex and so “mixed up” with their
physical environment that to effectively model a human (or a cat!) in quantum theory
is totally impracticable” (Healey 2017b, p. 10). In short, his essential proposal,
to be discussed in what follows, is that “all agents are quantum systems with
respect to other users, but not all quantum systems are agents” (Healey 2012a, b,
p. 750).17 Except for the relational aspect, this statement is analogous to the claim
that every mental event is a physical event, but not all physical events are mental.
Independently of the relational aspect of the claim, the quotation raises another
crucial question to be discussed below: how do we distinguish between quantum
systems that are agents and those that are non-agents? Notice that this question
is essential to any philosophical view that stresses the role of agents in using the
quantum formalism.
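For reference, the eigenstate-eigenvalue link that PQT drops can be stated in its standard textbook form (my formulation, not Healey’s own):

```latex
% Eigenstate-eigenvalue link: a system S has a definite value a of
% observable A iff S's state is the corresponding eigenstate of A.
\hat{A}\,|\psi\rangle = a\,|\psi\rangle
  \iff \text{$S$ possesses the definite value $a$ of $A$.}
```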
I conclude my brief sketch of PQT by assessing its implications for the issue of
scientific realism, which is important also for Healey (2018a). It might be objected
that this is a desperate or useless enterprise, for at least two reasons. Firstly,
1976, p. 29). It is desirable to have a more balanced view of his contribution to the philosophy of
quantum mechanics.
17 While here I will focus on the relational version of PQT as expressed in the 2017 quotation, I will show why the same questions would be relevant also to later non-relational versions of
PQT.
each philosopher has her own brand of realism and, following Fine (1984) and others, we might legitimately be tempted to drop the whole issue in favor of a
natural ontological attitude. Secondly, it is not easy to locate Healey’s position
in the continuous spectrum that goes from full-blown scientific realism to radical
instrumentalism, since he is very careful to steer away from both positions. For my purposes, however, attempting to solve this problem is important in order to find out whether
some kind of scientific realism about quantum mechanics is necessary for agent
physicalism.
Here we should distinguish between entity realism in Hacking’s “interventionist
sense” (1983) and any other sort of scientific realism. As far as the former is
concerned, there is no doubt about the fact that Healey is a scientific realist: PQT describes the unobservable, because the entities (like quarks or muons) to which agents apply quantum theory exist independently of them. However, the following six points ought to convince the reader that, despite Healey’s claims to the contrary (Healey 2017a, p. 253; 2018a), PQT is a typical instrumentalist position in any other possible
sense of “scientific realism”.
1. Healey claims that his PQT is not outright instrumentalist because of the
normative role of the quantum state, which is justified by the fact that its use
yields correct, empirically confirmed, mind-independent probabilistic predictions.
His “realism” about quantum mechanics could at most imply that the quantum state
can be taken to “refer” or “represent” mind-independent, statistical correlations.
However, statistical correlations about what? Since Healey does not explicitly raise
this crucial question, we may want to consider two possible interpretations. The first
is to read these correlations as holding between observations relating what he calls
“backing conditions” (inputs) via “advice conditions” to outputs or predictions that
must be formulated in the language of classical physics via NQMCs (see note 15).
The idea that a theory connects mind-independent empirical correlations between
observations and predictions is one of the features of instrumentalism, so that this
first reading leads to antirealism. In referring to those conditions, the second reading
omits any explicit reference to observations. However, a distinctive feature of PQT
is that these empirical correlations are not to be explained bottom-up by postulating
some beables. On the contrary, it is plausible to attribute to Healey the view that
they ought to be regarded as universally valid, empirical regularities like those that
one encounters in thermodynamics or, more generally, in the so-called “theories of
principle” (Einstein 1919). The idea that quantum mechanics ought to be regarded
as a theory of principle in this sense expresses an empiricist philosophical attitude
that is shared also by Bub and Pitowsky (2010). This attitude, however, is not a mark
of a realist approach to physical theories, at least to the extent that such an approach
tries to build constructive theories to explain those empirical regularities.
2. As in all pragmatist theories of truth, there is no attempt to explain the
effectiveness of the Born rule by the claim that quantum theory is approximately
true, since in the framework of a pragmatist theory there is nothing to explain!
Antirealist philosophers are suspicious of explanations of effectiveness, since if
the quantum algorithm were not effective, it would not be empirically adequate
(a typical refrain of van Fraassen). To the extent that this effectiveness is a brute
could rebut that postulating objective physical probabilities is not incompatible per
se with the claim that these are not reducible or supervenient upon propensities,
frequencies or Lewisian chances. However, these are the most often discussed
objective accounts of probabilities on the market, and postulating objective but
“theoretical probabilities” (a theoretical entity that, according to Healey’s anti-
representationalism, is not “represented” but tracked by the Born rules) is certainly
against a pragmatist spirit. Therefore, until we are told what these objective
probabilities are and what “track” exactly means, we are entitled to conclude that
in PQT the Born rules are just an effective bookkeeping device, given that the
assignment of the quantum state depends entirely on the “algorithm provided by the
Born Rule for generating quantum probabilities” (2012a, b, p. 731). In a word, given
the probabilistic nature of quantum theory, it is safe to assume that Healey’s anti-
representationalism about the Born rule (and the quantum state) essentially depends
on his philosophy of probability, and therefore on the fact that PQT’s probabilities
are objective₁ (or intersubjectively valid for all agents) but they are not objective₂.
The question is whether this epistemic view can be regarded as consistent with agent
physicalism and treating agents as physical systems.
4. Another piece of evidence of Healey’s antirealistic approach is his view
of relations between quantum systems: “By assigning a quantum state one does
not represent any physical relation among its subsystems: entanglement is not a
physical relation” (Healey 2017b, p. 248, my emphasis). Now, even if Schrödinger
was wrong in claiming that entanglement is “the characteristic trait of quan-
tum mechanics” (Schrödinger 1935, p. 555), there is no question that, in more
realistically-inclined approaches to quantum theory, entangled states are considered
to be real physical relations between physical systems that have no separate identity.
Ontic structural realism is an example of a philosophical approach that takes
relations between quantum entities as ontically prior (French 2016).
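As a point of reference (standard textbook material, not specific to either side of this debate), an entangled pure state of two systems A and B is precisely one that cannot be factorized into subsystem states, as for the Bell state:

```latex
% A Bell state of systems A and B: there exist no single-system
% states |phi>_A and |chi>_B that factorize it.
|\Phi^{+}\rangle
  = \frac{1}{\sqrt{2}}\bigl(\,|0\rangle_A|0\rangle_B + |1\rangle_A|1\rangle_B\bigr)
  \neq |\phi\rangle_A \otimes |\chi\rangle_B .
```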
This approach to entanglement creates tension with PQT’s reliance on deco-
herence as a way to solve practically the measurement problem, to explain the
emergence of the classical world and to justify its reliance on the epistemic rock
given by NQMC. In particular, decoherence is needed because quantum theory
does not imply and cannot by itself explain the definiteness of our experience
(2017a, p. 99). To the extent that invoking decoherence does not exclude more
realistic attempts at solving the measurement problem – dynamical collapse models,
Bohmian mechanics, Everett-type interpretations – PQT is more antirealistic than
these positions that grant reality to the quantum state and yet also rely on decoher-
ence. Moreover, decoherence presupposes a spreading of entanglement, which is
typically regarded as a genuine, irreversible physical process. But if “entanglement
is not a physical relation” but a mathematical relation among state vectors (2017a,
p. 248), it is problematic to rely on this relation to claim that our ordinary world has an approximately classical appearance.
5. Despite the author’s claim that PQT does not explicitly
mention “observables” or “measurement” (2012b, p. 44) – the typical vocabulary
of instrumentalist positions – the centrality of NQMC in Healey’s philosophical
project is incompatible with the view that his approach to quantum theory is
18 Dorato (2016).
or, as Healey has it, of “physically situated agents or users”. Some of these conditions
are obviously dictated by the theory itself.
It seems to me that the many senses in which one can be a realist about
quantum mechanics discussed above are sufficient to refute the claim that PQT is
a non-instrumentalist approach to quantum theory. Insisting that PQT is “realist”
about the quantum state while denying the status of a beable is realism only in
name, since PQT’s “realism” about the wave function amounts to the claim that
assigning a quantum state is an effective instrument to predict the outcomes of an
experiment. As hinted above, this conclusion is not just the product of an exegetic
effort, but will be important in the next section, when I will deal with the question
of agent physicalism.19
19 In fairness to Healey, it is very important to qualify these six points: PQT is not incompatible with the idea that realism ought to be the goal of scientific theorizing. I think Healey would agree with the idea that realism, like physicalism, is a stance or attitude (van Fraassen 1985; Ladyman and Ross 2007), that is, a thesis about the aim of science. One way to express this
desideratum is as follows: “a physical theory ought to be about something and therefore describe
physical reality”. If he were right to argue that quantum theory has no beables of its own, he
would be entitled to be against the view that quantum theory, in its present form at least, has the
resources or instruments to achieve this aim, but he could nevertheless argue that realism about
the wave function as a beable may have an important heuristic role, similar to an idea of reason in
the sense of Kant’s transcendental dialectic. Another possibility is that by endorsing some kind of
“piecemeal realism” (Fine 1991), this conclusion may just apply to posits within quantum theory
and not to quantum theory in general or to other physical theories.
everything else supervenes, I will now show that the attempt to rescue PQT from
the charge of incompatibility with agent physicalism is doomed to fail.
Let us begin by noting that in his original (2012a) formulation of PQT, Healey states the following principle:

RPQT: any physical system whatsoever (agents included) can be a quantum system, and therefore be assigned a quantum state, only relative to an agent applying the algorithm of quantum theory. (Healey 2012a, p. 750)
RPQT is a necessary consequence of any agent-centered approach to quantum
theory. The essentially relational aspect of PQT depends in fact on the meaning
of “agent” and cannot be avoided by the kind of pragmatist approach to quantum
theory defended by Healey. Notice in addition that Healey obviously admits that
not all quantum systems can be regarded as agents: “all agents are quantum systems
with respect to other agents, but not all quantum systems are agents” (Healey 2012a,
p. 750, my emphasis).
Consider the following argument, where “physical” means describable in principle by current physical theories:
1. All quantum systems are physical systems;
2. According to RPQT above, all physical systems (electrons, quarks, atoms, agents,
etc.) possess a quantum state and are quantum systems not intrinsically, but only
relationally, that is, only with respect to real (or hypothetical) agents who use
information provided by these systems.
3. Some quantum systems are not agents, but all agents are quantum systems only
relative to other agents.
4. PQT does not provide any intrinsic and exact criterion to distinguish between
quantum systems that are agents from those that are not agents.
5. Agent physicalism requires an intrinsic and exact physical characterization of the
notion of agent or a physically instantiated user of key notions of the theory.
Conclusion: PQT is incompatible with agent physicalism.
Let us discuss these premises in turn. Premise (1) must be granted by any
non-dualistic approach to quantum theory and is accepted by PQT20: if it were
false, some quantum objects (agents are the most plausible candidates) would be
non-physical and my claim would be proved. Premise 2 expresses the essential
20 Also, the converse of (1) holds, since it is reasonable to claim that all physical objects are
quantum objects. Notoriously, size, which looks like the most plausible candidate for an intrinsic
criterion separating the two realms, is a non-starter. Microscopic systems like electrons, neutrons or
photons are quantum systems par excellence, but mesoscopic systems also need quantum
mechanics for their description: C60 fullerene molecules, to which Healey also refers, are visible
with electron microscopes. Very recently, scientists in California managed to put an object visible
to the naked eye into a quantum superposition of moving and not moving (O’Connell et al. 2010). It
is possible to claim that even stars are quantum systems, given that nuclear reactions take place in
their cores.
246 M. Dorato
commitment of PQT in any of its forms21 and therefore must be accepted by any
agent-centered view of quantum theory: the recourse to agents or users is necessary
to PQT, otherwise Healey’s approach would not be pragmatist. Premise 3 is a
quotation from Healey’s text (see above) and must be granted: not all physical
systems (say electrons) can apply the algorithm provided by the Born rule, but only
agents can, whatever an agent is.
As far as I can tell, the truth of premise 4 also relies on textual evidence, and I
will now show why, as a consequence of the vagueness of the notion of ‘a physically
instantiated agent’ – and despite the fact that PQT needs physical agents for its
formulation (see premise 2) – PQT does not tell us what quantum theory is about.
Healey could rebut this claim by relying on a passage from his (2017c, p. 137) and
noting that quantum theory is about the mathematical structures that figure in its
models.22 However, as a proposal for understanding the theory, PQT must of course
refer to the role that agents have in making sense of the functions of the key terms
of the theory (Born rule, quantum state, etc.) by applying them to quantum systems.
Note furthermore that the passage mentioned above justifies the claim that PQT
must refer to agents, not in the simpleminded sense that any physical theory needs
agents for its formulation and implementation. Even in model-theoretical accounts
of scientific theories, a physical theory must be able to interpret its formulas, lest it be
devoid of any physical meaning: classical mechanics is not just about the family of
its mathematical models. And given the refusal to give a representationalist account
of the models of quantum theory, PQT relationalism as expressed by premise 2 implies that
PQT is also about agents relationally regarded as special quantum systems.
Now the problem is that in order to distinguish in an exact way quantum systems
that are agents from those that are not we need an intrinsic criterion, which Healey,
unfortunately, does not provide: “Agents are generic, physically situated systems
in interaction with other physical systems: . . . the term ‘agent’ is being used very
broadly so as to apply to any physically instantiated user of quantum theory, whether
human, merely conscious, or neither” (2012a, p. 752, my emphasis). However,
we do have an intuitive idea of what agents are: they may be conscious beings or
unconscious robots acting like conscious agents (as Healey himself has it in the
same passage, 2012a, p. 752), but atoms, electrons, photons and quarks are certainly
not agents in any sense of the word.
Once again, if we want to know what the theory is about, the distinction between
quantum systems that are agents and quantum systems that are not must be given
in intrinsic terms and must be exact in Bell’s sense (see note 19). The point of
the argument is that the possibility of formulating agent physicalism needs an
exact formulation of the notion of a physically instantiated user of the mathematical
apparatus of the theory which, due to the irremediable vagueness of “agent”, PQT is
not able to give. It follows that, in its current form at least, PQT is incompatible
with agent physicalism, and this conclusion holds independently of the claim that
21 As a response to a criticism, this must hold also in later, less relational versions of 2.
22 This quotation was brought to my attention by one of the referees.
The conclusion of the previous section raises immediate questions: why should
PQT be saddled with the task of describing physically instantiated agents in quantum
terms? Shouldn’t this problem be left to cognitive scientists? “Quantum theory, like
all scientific theories, was developed by (human) agents for the use of agents (not
necessarily human: while insisting that any agent be physically situated, I direct
further inquiry on the constitution of agents to cognitive scientists)” (Healey 2017a,
p. 137). In view of this, don’t alternative formulations or approaches to quantum
theory also suffer from the same problem? Of course, if supervenience physicalism
holds in these latter approaches, it also holds in PQT, since in PQT agents are
quantum systems, even though only relative to other agents.
However, the problem is that – unlike what happens in Everettian, Bohmian
and dynamical-collapse quantum mechanics – PQT cannot account for agents in
quantum terms (and therefore in physical terms) even in principle. In Bohmian
mechanics, agents are physical systems that can in principle be described in
quantum terms: there is no difference between quantum systems that are agents and
quantum systems that are not. An analogous conclusion holds in Everettian quantum
mechanics, whose bizarre ontological consequences (many worlds or many minds)
were put forth to avoid considering agents as different in principle from any other
quantum systems. Dynamical collapse models, in the flash version for instance,
regard agents as constituted by galaxies of “flashes” in the same sense in which a
table is. Analogous considerations hold for the matter-field version of GRW. The
fact that in PQT agent physicalism fails explains why it is different from these
approaches: agents must be treated in a different way from non-agents, but we are not
told in exact physical terms why this should be so. How can we make clearer sense
of the status of agents qua physically instantiated users of a mathematical
model, on the one hand, and of physical electrons and quarks, on the other?
In order to avoid explaining why agents are distinguished from non-agents,
Healey could regard the notion of ‘agent’ as “primitive”. In one sense, “primitive”
as referred to agents would entail that deeper, constructive explanations (possibly
involving beables) of how the physical interactions or information exchange
between agents and quantum systems can occur can be avoided altogether. This
move would also eschew the charge that PQT, which, as we have argued, must
essentially be about agents, is silent about the demarcation line separating them
from systems that are clearly non-agents: the interaction between agents and non-
agents must be accepted as one that does not require any explanation, since it is
presupposed by any application of PQT by agents.
23 See his response to Einstein’s objection, involving the moving screen with the double slit as
reported in Bacciagaluppi and Valentini (2009).
24 This conclusion is somewhat similar to Frauchiger and Renner’s claim that “Quantum theory
cannot consistently describe the use of itself” (2018), an interesting paper which I cannot
discuss here.
25 Compare Bell’s famous questions: “Whose information? Information about what?”
decoherence, to non-quantum magnitude claims. But this way out introduces once
again an unclear distinction between quantum systems and classical systems whose
vagueness the relationality of PQT wanted to avoid.
10.6.1 Realism
As to realism, there has been an evolution in Pitowsky’s view of quantum theory.
In chapter 6 of Quantum Probability and Quantum Logic (Pitowsky 1989), he seems
to defend a form of underdetermination of theories by data, and his preference for
Bohmian mechanics, were one to choose a hidden variable theory, is clearly stated:
“ . . . Can we produce a hidden variable theory which goes beyond quantum theory
and predicts new phenomena? Psychologically this question seems to be connected
to the problem of realism. It seems that realists are more inclined to consider
hidden variable theories. Logically however, the physical merits of hidden variable
theories are independent of the metaphysical debate. One can take an instrumentalist
position regarding hidden variables and still appreciate their potential application in
predicting new phenomena. Alternatively, one can believe in the existence of hidden
causes, while still holding on to the view that these causes lie beyond our powers of
manipulation. Hidden variable theories may very well lead to new predictions, this
is a matter of empirical judgment. My own bias tends to side with non-local theories
such as Bohm (1952). In this and similarly detailed dynamical theories we may be
able to find a way to manipulate and control the hidden variables and arrive at a new
phenomenological level. In any case, I can conceive of no a-priori argument why
this could not occur” (Pitowsky 1989, p. 181).
While this manipulation is today regarded as out of the question, in his early
work Pitowsky’s open-mindedness about quantum theory distinguishes his
position from Healey’s more recent one. It is significant, however, that in order
to break the underdetermination, he does not regard the greater explanatory virtue
of Bohm’s theory as sufficient to rule out instrumentalist approaches, as long as
the latter are regarded as having the same epistemic force. The decisive element
would be the production of new experimental evidence, an aspect that today is
unreasonably regarded with skepticism.
In later works, however, Pitowsky has argued that quantum theory can be char-
acterized structurally as a theory about the logic and probability of quantum events,
that the measurement problem is a pseudo-problem, that the quantum state does not
represent reality but just provides an algorithm to assign probabilities. Unlike the
position defended by Healey, according to Pitowsky probabilities are credences that
reflect objective correlations and that are assigned to certain outcomes. Like PQT,
Pitowsky also holds that decoherence explains the emergence of a classical world
(see, for instance, Bub and Pitowsky 2010, p. 1). These are the typical traits of
instrumentalistic approaches to quantum theory, at least insofar as the quantum state,
denoted by the wavefunction, is regarded as the central notion used by quantum
theory and there is no attempt at giving a dynamic explanation of the definiteness of
the measurement outcomes that does not rely on decoherence. This particular form
of antirealism about the wave function is certainly shared by Healey as well, even if he
does not explicitly argue that the claim that measurement should not appear among
the primitives of the theory is the result of “a dogma”.
10.6.2 Probability
The theory of probability defended by the two authors prima facie seems different.
Healey rejects a subjectivist view of probability, since, modulo the remarks above,
PQT is described as capable of tracking mind-independent probabilistic correla-
tions. According to Bub and Pitowsky (2010), the quantum state is a credence function,
and the objective nature of probability is given by the kinematic constraints
represented by the projective geometry of Hilbert space. Yet Bub and Pitowsky also
argue that angles in such a geometry are associated with “structural probabilistic
constraints on correlations between events”, where these events are to be regarded
as physical, in analogy with events in Minkowski spacetime, also structured by
spatiotemporal relations. Despite the fact that Healey does not explicitly mention
intersubjectively shared credences, the probabilistic approaches of the two
philosophers are strikingly similar.
The real difference between Healey and Pitowsky’s (and Bub’s) philosophies
of quantum theory involves the relational character of PQT, which is called for by
the agent-centered view of Healey’s approach. We have seen why it is the role of
agents that makes PQT into a relational theory and therefore creates troubles vis à
vis agent physicalism. I will now show that there is an important pragmatic, even
if only implicit, component in Pitowsky’s approach as well, a consequence of
the pivotal role that information has in his late production, one that pushes toward
a relational approach involving conscious agents manipulating and transferring
information.
10.6.3 Information
In order to argue in favor of this claim it is important to figure out in what sense
information can be regarded as physical, given that according to Pitowsky and
Bub, quantum theory can be reconstructed as a theory about the transmission and
manipulation of information, “constrained by the possibilities and impossibilities of
information-transfer in our world” (Bub 2005).26 Recall that according to Healey
information has a merely epistemic sense and plays a minor role in his theory.
However, the fact that agents are necessarily presupposed also in information
theoretic approaches to quantum theory is evident from the fact that only users
can manipulate information regarded as a physical property of objects. Think of
questions like: “What is an information source?” What does it mean to “broadcast
the outcome of a source”? Is the “no-signaling principle” understandable without
the presence of an agent, and therefore in purely physical terms?
Let us agree to begin with that the no-cloning principle applies to machines, since
it claims that “there is no cloning machine that can copy the outcome of an arbitrary
information source” (2010, p. 7). The main question therefore involves the notion
of a source of information. A fair coin, which lands either heads or tails, is a source of
information (i.e., a classical bit), but from the physical viewpoint it is merely a short
cylinder with two different faces. Only agents, whether conscious or unconscious,
can use these physical properties with some goal or intention (tossing it to decide who
must go shopping, for example), but without a use of this property by someone or
something, there is no source of information, even if information has no semantic
sense, as pointed out by Shannon in his original paper.
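The linearity argument usually given for the no-cloning principle can be sketched as follows; this is a standard textbook derivation, not drawn from Bub’s text. Suppose a single unitary $U$ copied every state onto a blank register:

```latex
% No-cloning: the standard unitarity argument (textbook sketch).
% Suppose a unitary U cloned arbitrary states onto a blank register:
\[
  U\,(|\psi\rangle \otimes |0\rangle) = |\psi\rangle \otimes |\psi\rangle,
  \qquad
  U\,(|\varphi\rangle \otimes |0\rangle) = |\varphi\rangle \otimes |\varphi\rangle .
\]
% Taking the inner product of the two equations and using the fact that
% U preserves inner products:
\[
  \langle \psi | \varphi \rangle
  = \langle \psi | \varphi \rangle^{2}
  \;\Longrightarrow\;
  \langle \psi | \varphi \rangle \in \{0, 1\},
\]
% so U can copy only states that are identical or orthogonal.
```

Since the constraint fails for nonorthogonal states, no single machine can copy the output of an arbitrary information source, which is the content of the principle quoted above.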
Signaling is a word that has a univocal, intentional sense: only agents can signal
to other agents by using physical instruments with the purpose of communicating
something, but the existence of a probabilistic correlation between two experimental
outcomes by itself can carry no information. Consider now a photon, which is a
qubit with respect to its polarization. In this sense, its purely physical (possibly
dispositional) properties are analogous to those of a coin, despite the fact that a
qubit is associated with “an infinite set of noncommuting two-valued observables
that cannot all have their values simultaneously” (Bub 2016, p. 30). The physical,
agent-independent property of the photon is its elliptical or linear polarization; the
information it carries and the possibility of using it as a source thereof presuppose
the existence of a user, whether conscious or not, with some intention to do
something. In this sense, information-theoretic approaches à la Pitowsky and Bub
are much more similar to Healey’s PQT than might be thought at first sight. This
should come as no surprise, given the shared claim that the wave function has no
representative power and has a purely algorithmic role. There is a sense in which
instrumentalist philosophies are more anthropocentric than their more
realistically inclined rivals.
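The contrast drawn above between a coin and a polarization qubit can be made explicit in standard notation; the following is a textbook sketch, not a quotation from Bub (2016):

```latex
% A photon's polarization qubit in the horizontal/vertical basis:
\[
  |\psi\rangle = \alpha\,|H\rangle + \beta\,|V\rangle,
  \qquad |\alpha|^{2} + |\beta|^{2} = 1 .
\]
% The Born rule gives the probability of passing an H-oriented polarizer:
\[
  \Pr(H) = |\alpha|^{2} .
\]
% Rotating the polarizer by an angle theta defines a different two-valued
% observable for each theta; these observables do not commute, so they
% cannot all possess sharp values at once -- unlike the coin, whose single
% heads/tails property is always definite.
```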
26 “quantum mechanics [ought to be regarded] as a theory about the representation and manipula-
tion of information constrained by the possibilities and impossibilities of information-transfer in
our world (a fundamental change in the aim of physics), rather than a theory about the behavior of
non-classical waves and particles” (Bub 2005, p. 542).
Acknowledgments I thank two anonymous referees for their very precious help on a previous
version of this manuscript. In particular, I owe many of the remarks that I added to the second
version of the paper to the second referee.
References
Bacciagaluppi, G., & Valentini, A. (2009). Quantum theory at the crossroads. Cambridge:
Cambridge University Press.
Bell, J. (1989). Towards an exact quantum mechanics. In S. Deser & R. J. Finkelstein (Eds.),
Themes in contemporary physics II. Essays in honor of Julian Schwinger’s 70th birthday.
Singapore: World Scientific.
Bohm, D. (1952). A suggested interpretation of the quantum theory in terms of ‘hidden’ variables,
I and II. Physical Review, 85(2), 166–193. https://doi.org/10.1103/PhysRev.85.166.
Brandom, R. (2000). Articulating reasons: An introduction to Inferentialism. Cambridge, MA:
Harvard University Press.
Breuer, T. (1993). The impossibility of accurate state self-measurements. Philosophy of Science,
62, 197–214.
Brukner, Č. (2017). On the quantum measurement problem. In R. Bertlmann & A. Zeilinger (Eds.),
Quantum (un)speakables II (pp. 95–117). Cham: Springer.
Bub, J. (2005). Quantum theory is about quantum information. Foundations of Physics, 35(4),
541–560.
Bub, J. (2016). Banana world. Quantum mechanics for primates. Oxford: Oxford University Press.
Bub, J., & Pitowsky, I. (2010). Two dogmas about quantum mechanics. In S. Saunders, J. Barrett,
A. Kent, & D. Wallace (Eds.), Many worlds?: Everett, quantum theory & reality (pp. 433–459).
Oxford: Oxford University Press.
Camilleri, K., & Schlosshauer, M. (2008). The quantum-to-classical transition: Bohr’s doctrine of
classical concepts, emergent classicality, and decoherence. arXiv:0804.1609v1.
Camilleri, K., & Schlosshauer, M. (2015). Niels Bohr as philosopher of experiment: Does
decoherence theory challenge Bohr’s doctrine of classical concepts? Studies in History and
Philosophy of Modern Physics, 49, 73–83.
Churchland, P., & Hooker, C. (Eds.). (1985). Images of science: Essays on realism and empiricism
(with a reply from bas C. van Fraassen). Chicago: University of Chicago Press.
Dalla Chiara, M. L. (1977). Logical self-reference, set theoretical paradoxes and the measurement
problem in quantum mechanics. Journal of Philosophical Logic, 6, 331–347.
Dorato, M. (2016). Rovelli’s relational quantum mechanics, anti-monism and quantum becoming. In
A. Marmodoro & D. Yates (Eds.), The metaphysics of relations (pp. 235–261). Oxford: Oxford
University Press.
Dorato, M. (2017). Bohr’s relational holism and the classical-quantum interaction. In H. Folse &
J. Faye (Eds.), Niels Bohr and philosophy of physics: Twenty first century perspectives (pp.
133–154). London: Bloomsbury Publishing.
Dürr, D., Goldstein, S., Tumulka, R., & Zanghì, N. Bohmian mechanics. https://arxiv.org/abs/
0903.2601v1
Einstein, A. (1919, November 28). Time, space, and gravitation. Times (London), pp. 13–14.
Faye, J., & Folse, H. (Eds.). (2017). Niels Bohr and the philosophy of physics, twenty-first-century
perspectives. London: Bloomsbury Academic.
Fine, A. (1984). The natural ontological attitude. In J. Leplin (Ed.), Scientific realism. Berkeley:
University of California Press.
Fine, A. (1991). Piecemeal realism. Philosophical Studies, 61, 79–96.
Folse, H. (1985). The philosophy of Niels Bohr. The framework of complementarity. Amsterdam:
North Holland.
10 Agents in Healey’s Pragmatist Quantum Theory: A Comparison. . . 255
Folse, H. (2017). Complementarity and pragmatic epistemology. In J. Faye & H. Folse (Eds.), Niels
Bohr and the philosophy of physics. A twenty-first-century perspective (pp. 91–114). London:
Bloomsbury.
French, S. (2016). The structure of the world. Oxford: Oxford University Press.
Frauchiger, D., & Renner, R. (2018). Quantum theory cannot consistently describe the use of
itself. Nature Communications, 9, Article number: 3711. https://doi.org/10.1038/s41467-018-
05739-8.
Friederich, S. (2015). Interpreting quantum theory. A therapeutic approach. New York: Palgrave
Macmillan.
Gell-Mann, M. (1976). What are the building blocks of matter? In D. Huff & O. Prewitt (Eds.),
The nature of the physical universe: Nobel conference (pp. 27–45). New York: Wiley.
Glick, D. (2018). Review of the quantum revolution in philosophy by Richard Healey. philsci-
archive.pitt.edu/14272/2/healey_review.pdf
Godfrey-Smith, P. (2003). Theory and reality. Chicago: University of Chicago Press.
Hacking, I. (1983). Representing and intervening. Cambridge: Cambridge University Press.
Healey, R. (1989). The philosophy of quantum mechanics. Cambridge: Cambridge University
Press.
Healey, R. (2012a). Quantum theory: A pragmatist approach. British Journal for the Philosophy of
Science, 63, 729–771.
Healey R. (2012b). Quantum decoherence in a pragmatist view. Resolving the measurement
problem. arXiv preprint.
Healey, R. (2015). How quantum theory helps us explain. British Journal for Philosophy of
Science, 66, 1–43.
Healey R. (2017a). Quantum-Bayesian and pragmatist views of quantum theory. In E. N.
Zalta (Ed.), The Stanford encyclopedia of philosophy (Spring 2017 edition). URL: https://
plato.stanford.edu/archives/spr2017/entries/quantum-bayesian/
Healey, R. (2017b). The quantum revolution in philosophy. Oxford: Oxford University Press.
Healey, R. (2017c). Quantum states as objective informational bridges. Foundations of Physics, 47,
161–173.
Healey, R. (2018a, unpublished manuscript). Pragmatist quantum realism. British Journal for
Philosophy of Science.
Healey, R. (2018b). Quantum theory and the limits of objectivity. Foundations of Physics, 48(11),
1568–1589.
Howard, D. (1994). What makes a classical concept classical? Toward a reconstruction of Niels
Bohr’s philosophy of physics. In Niels Bohr and contemporary philosophy (Vol. 158 of Boston
studies in the philosophy of science) (pp. 201–229). Dordrecht: Kluwer.
Howard, D. (2004). Who invented the ‘Copenhagen interpretation?’ A study in mythology.
Philosophy of Science, 71, 669–682.
Ladyman, J., & Ross, D. (2007). Every thing must go: Metaphysics naturalized. Oxford: Oxford
University Press.
Laudisa, F., & Rovelli, C. (2008). Relational quantum mechanics. In E. N. Zalta (Ed.), The
Stanford encyclopedia of philosophy (Fall 2008 edition). URL: http://plato.stanford.edu/
archives/fall2008/entries/qm-relational/
MacKinnon, E. (1982). Scientific explanation and atomic physics. Chicago: Chicago University
Press.
Maudlin, T. (2007). The metaphysics within physics. Oxford: Oxford University Press.
McGinn, C. (1993). Problems in philosophy: The limits of inquiry. Cambridge: Basil Blackwell.
McLaughlin, B., & Bennett, K. (2014). Supervenience. In E. N. Zalta (Ed.), The Stanford
encyclopedia of philosophy (Spring 2014 edition). URL: https://plato.stanford.edu/archives/
spr2014/entries/supervenience/
Minkowski, H. (1952). Space and time. In H. A. Lorentz, A. Einstein, H. Minkowski, & H. Weyl
(Eds.), The principle of relativity: A collection of original memoirs on the special and general
theory of relativity (pp. 75–91). New York: Dover.
Murdoch, R. (1987). Niels Bohr’s philosophy of physics. Cambridge: Cambridge University Press.
O’Connell, A. D., Hofheinz, M., Ansmann, M., Bialczak, R. C., Lucero, E., Neeley, M., Sank, D.,
Wang, H., Weides, M., Wenner, J., Martinis, J. M., & Cleland, A. N. (2010). Quantum ground
state and single-phonon control of a mechanical resonator. Nature, 464, 697–703.
Pitowsky, I. (1989). Quantum probability, quantum logic. Berlin: Springer.
Pitowsky, I. (2005). Quantum mechanics as a theory of probability. arXiv:quant-ph/0510095v1.
Price, H. (2011). Naturalism without mirrors. Oxford: Oxford University Press.
Psillos, S. (1999). How science tracks truth. London/New York: Routledge.
Quine, W. V. (1981). Theories and things. Cambridge, MA: The Belknap Press.
Rovelli, C. (1996). Relational quantum mechanics. International Journal of Theoretical Physics,
35, 1637–1678.
Rovelli, C., & Smerlak, M. (2007). Relational EPR. Foundations of Physics, 37, 427–445.
Schrödinger, E. (1935). Discussion of probability relations between separated systems. Proceed-
ings of the Cambridge Philosophical Society, 31, 555–563; 32, 446–451.
Stoljar, D. (2010). Physicalism. New York: Routledge.
Stoljar, D. (2017). Physicalism. The Stanford Encyclopedia of Philosophy (Winter 2017 Edition)
(E. N. Zalta (Ed.)). https://plato.stanford.edu/archives/win2017/entries/physicalism/
Suárez, M. (2004). An inferential conception of scientific representation. Philosophy of Science,
71(Supplement), S767–S779.
Timpson, C. (2013). Quantum information theory. Oxford: Clarendon Press.
Van Fraassen, B. (1985). Empiricism in the philosophy of science. In P. Churchland & C.
Hooker (Eds.), Images of science: Essays on realism and empiricism (pp. 245–308). Chicago:
University of Chicago Press.
Van Fraassen, B. (2002). The empirical stance. Princeton: Princeton University Press.
Wigner, E. P. (1967). Remarks on the mind–body question. In Symmetries and reflections (pp.
171–184). Bloomington: Indiana University Press.
Zinkernagel, H. (2016). Niels Bohr on the wave function and the classical/quantum divide. Studies
in History and Philosophy of Modern Physics, 53, 9–19.
Chapter 11
Quantum Mechanics As a Theory of
Observables and States (And, Thereby,
As a Theory of Probability)
Abstract Itamar Pitowsky contends that quantum states are derived entities,
bookkeeping devices for quantum probabilities, which he understands to reflect the
odds rational agents would accept on the outcomes of quantum gambles. On his
view, quantum probability is subjective, and so are quantum states. We disagree. We
take quantum states, and the probabilities they encode, to be objective matters of
physics. Our disagreement has both technical and conceptual aspects. We advocate
an interpretation of Gleason’s theorem and its generalizations more nuanced—and
less directly supportive of subjectivism—than Itamar’s. And we contend that taking
quantum states to be physical makes available explanatory resources unavailable
to subjectivists, explanatory resources that help make sense of quantum state
preparation.
11.1 Introduction
J. Earman
Department of History and Philosophy of Science, University of Pittsburgh, Pittsburgh, PA, USA
e-mail: jearman@pitt.edu
L. Ruetsche ()
Department of Philosophy, University of Michigan, Ann Arbor, MI, USA
e-mail: ruetsche@umich.edu
We agree very much with the spirit of T1, but we emphasize that T1 needs to be
appropriately generalized to cover the branches of quantum theory that go beyond
ordinary non-relativistic QM, which was Itamar’s focus. And we would underscore,
even more strongly than Itamar did, how the structure that quantum theory gives
to the ‘elements of reality’—or as we would prefer to say, to quantum events or
propositions—shapes quantum probabilities.
Our disagreement with Itamar concerns T2, and has technical and conceptual
dimensions. Our fundamental unhappiness is conceptual. We contend that T2
mischaracterizes the nature of quantum probabilities, and we propose inverting T2
to the counter thesis
T2∗: Not only is the quantum state not a derived entity, and so not merely a device for
bookkeeping quantum probabilities; rather, the quantum state has ontological primacy in
that quantum probabilities are properties of quantum systems that are derived from the
state.
Although T2∗ gainsays T2’s claim that quantum states are derived entities, the
theses differ on a deeper matter. Taking quantum states to codify objective features
of quantum systems, T2∗ implies that quantum probabilities are objective, observer-
independent probabilities—objective chances if you care to use that label. The
territory between T2∗ and a view asserting quantum states to be derived from
objective probabilities may not be worth quibbling over. However, Itamar denies
that quantum probabilities are objective. His T2 claims not only that quantum states
are derived entities but also explains why they are derived entities. On Itamar’s
view, the event structure T1 comes first, and then come the probabilities, in the
form of credences rational agents would assign to events so structured. The states
are bookkeeping devices for these credences. Itamar’s interpretation of quantum
probability is personalist, and an element of a striking and original interpretation
of QM. Itamar’s personalism widens the distance between T2 and T2∗ to include
territory that is worth fighting over.
Let’s briefly explore that territory. Naively viewed, quantum theory describes an
event structure and a set of probability distributions over that event structure. The
sophisticated might question the division of labor this naive picture presupposes.
Which of the elements italicized above does physics supply and which (if any) do
we, qua rational agents, supply? For Itamar, physics supplies only the event structure
T1; the rest is the work of rational agents. Quantum probability distributions reflect
betting odds ideal rational agents would accept on “quantum gambles”; quantum
states are just bookkeeping devices for those probability distributions.2 By contrast,
while we also accept T1, we expect physics to do more work. And so we assert
T2∗: We take it that physics supplies quantum probabilities, which probabilities are
objective, independent of us, and determined by physical quantum states.
On Itamar’s view, rationality and only rationality constrains what quantum proba-
bilities, and so what quantum states, are available. Our view countenances additional
constraints. If probabilities are physical, both physical law and physical contingency
can govern what quantum probabilities are available for physical purposes. It’s clear
that our explanatory resources are richer than Itamar’s. Rationality on its own can’t
explain Kepler’s laws, but Newton’s Law of Universal Gravitation can. And LUG
on its own can’t explain why Halley’s comet appears every 75 years, but celestial
contingencies (in collaboration with physical law) can. What’s not so obvious:
whether there are any explanations essential to making sense of quantum mechanics
to which our resources are adequate but Itamar’s are not.
Enough has been said to introduce and situate our technical disagreement with
T2. The main result Itamar cites in support of T2 is known as Gleason’s theorem.
By itself the original Gleason theorem does not yield T2; by itself the theorem
says nothing about the ontological priority of probability measures over states, nor
does it rule on whether quantum probabilities are personalist or objective. Finally,
Itamar’s claim that the Born Rule follows from Gleason’s theorem rests on a tacit
restriction on admissible probability distributions/physically realizable states.3 As
we will discuss, how one justifies these restrictions depends on one’s interpretation
of quantum probabilities and/or the status assigned to states. Arguably proponents of
T2∗ have better grounds than proponents of T2 for adopting the restrictions, which
do critical work in attempts to make sense of quantum state preparation.
§§2–4 will explore these technical matters in more depth. §5 and beyond
elaborate and adjudicate our conceptual disagreement with Itamar. The standard by
which both views are judged is: do they explain why QM works as well as it does?
That is, do they make sense of our remarkably successful laboratory practice of
preparing systems, attributing them quantum states, extracting statistical predictions
from those states, and performing measurements whose outcomes confirm those
predictions? With respect to this standard, Itamar’s version of T2 holds both promise
and peril. The promise is that it leads to a dissolution of the notorious measurement
problem.4 The peril is that T2 looks shaky in the face of two explanatory demands:
first, the demand to explain the fact that (some) quantum states can be prepared; and,
2 For details of Pitowsky’s account of quantum gambles, see his (2003, Sec. 1).
3 A further technical quibble, readily addressed, is that the version Itamar discusses applies only to
the special setting of non-relativistic QM; §4.2 extends his discussion.
4 Pitowsky (2003) and Bub and Pitowsky (2010) call it the “‘big’ measurement problem,” which
they distinguish from the “small” measurement problem. More on this below.
260 J. Earman and L. Ruetsche
second, the demand to explain the fact that the probabilities induced by prepared
states yield the correct statistics for outcomes of experiments.
The first demand is the topic of §§5–8, which describe how each view fields the
challenge of state preparation, both in ordinary QM and in more general settings
(QFT, thermodynamic limit). An attractive account of state preparation appeals to
physical states—which seems to favor T2∗. In fairness, however, we indicate how
the formal account of state preparation offered by quantum theory can be turned into
an alternative account of state preparation as belief state fixation—a repurposing
congenial to advocates of T2. However, the repurposing requires a restriction on the
space of admissible credences that may not be motivated by demands of rationality
alone. We also confess that we are in danger of being hoisted on our own petard.
Because the quantum account of state preparation seems to falter in QFT, our
insistence that T1 be generalized to cover branches of quantum theory beyond
ordinary QM appears to weaken the support consideration of state preparation lends
T2∗. We review some results that promise to circumvent this roadblock.
Turning to the second demand, to make sense of our capacity to complete
measurements confirming quantum statistical predictions, §9 confesses that T2∗
runs smack into the measurement problem. It contends that Itamar’s view also
founders on this explanatory demand. We argue that the measurement problem is
a real problem that needs to be solved rather than dissolved. A concluding section
recaps the main points of our discussion, and relates them to another foundations
question dear to Itamar’s heart: just how much realism about QM is it possible to
have?
We leave it to the reader to render final judgment on the relative merits of T2
vs. T2∗. All we insist upon is that knocking heads over T2 vs. T2∗ is a good way
of sharpening some of the most fundamental issues in the foundations of quantum
theory.
We begin our discussion with a reformulation of thesis T1 designed to cover all
branches of quantum theory.
5 For the relevant operator algebra theory the reader may consult Bratteli and Robinson (1987) and Kadison and Ringrose (1987).
11 Quantum Mechanics As a Theory of Observables and States (And. . . 261
of Itamar’s T1. We feel confident that he would endorse it. It is hardly a novel or
controversial thesis; indeed, T1∗ is the orthodoxy among mathematical physicists
who use the algebraic approach to quantum theory (see, for example, Hamhalter
2003).6
A quantum probability measure on P(N) is a map Pr : P(N) → [0, 1] satisfying
the quantum analogs of the classical probability axioms:
(i) Pr(I) = 1 (I the identity projection)
(ii) Pr(E1 ∨ E2) = Pr(E1) + Pr(E2) whenever projections E1, E2 ∈ P(N) are mutually
orthogonal.
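By way of illustration (our numerical sketch, not from the text): on a finite-dimensional Hilbert space, the Born prescription Pr(E) = Tr(ρE) satisfies axioms (i) and (ii). The following fragment checks this, with the density operator ρ and the orthogonal events E1, E2 chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(0)

# A density operator on H = C^3: positive, trace one.
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
rho = A @ A.conj().T
rho /= np.trace(rho).real

def proj(v):
    """Rank-one projection onto the ray spanned by v."""
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Mutually orthogonal projections onto two standard basis vectors.
E1, E2 = proj(np.array([1, 0, 0])), proj(np.array([0, 1, 0]))

Pr = lambda E: np.trace(rho @ E).real

# (i) Pr(I) = 1.
assert abs(Pr(np.eye(3)) - 1) < 1e-12
# (ii) additivity on orthogonal projections: here E1 v E2 = E1 + E2.
assert abs(Pr(E1 + E2) - (Pr(E1) + Pr(E2))) < 1e-12
print("axioms (i) and (ii) verified for Pr = Tr(rho .)")
```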
6 This is not to say that T1∗ is uncontroversial among philosophers of physics. Bohmians would
subcollections. The sup exists because the sequence of sums over ever larger finite subcollections
gives a non-decreasing bounded sequence of real numbers, and every such sequence has a least
upper bound.
The different event structures T1∗ extracts from different von Neumann algebras
naturally bring with them differences in the structure of quantum probabilities. In
general once the algebra of observables N is fixed so are the non-measurable events:
the latter consist of the complement of P(N) in the set of all projections on H. In
ordinary QM there is no analog of the phenomenon of non-measurable events in
classical probability. All projections on H are in the ordinary QM event algebra;
therefore all lie in the range of a probability measure defined on the algebra. In
relativistic QFT, by contrast, there are non-measurable events galore. The Type III
local algebras endemic there contain no finite dimensional projections and, thus,
every finite dimensional projection on a Hilbert space H on which a Type III algebra
acts corresponds to a non-measurable event. Furthermore, the theory offers a prima
facie explanation of why some events are non-measurable: they are not assigned
probabilities because they do not correspond to observables.
Different types of algebras of observables have different implications for the
status of additivity requirements for quantum probabilities. In particular, in ordinary
QM where the algebra of observables is a Type I factor, countable additivity entails
complete additivity unless the dimension of the Hilbert space is as great as the least
measurable cardinal (see Drish 1979 and Eilers and Horst 1975). This entailment
does not hold for the Type III algebras encountered in QFT.
In the following sections we will see that differences in the algebras of observ-
ables also make for differences in the possible states, which in turn imply differences
in the possible probability measures.
(b) there is a density operator ρ, i.e., a trace class operator on H with Tr(ρ) = 1, such that
ω(A) = Tr(ρA) for all A ∈ N.8
This result has been generalized to cover non-separable Hilbert spaces and, more
significantly, quite general von Neumann algebras, so that it is applicable to all
branches of quantum theory:
Theorem 3 (Gleason generalized). Let N be a von Neumann algebra acting on a Hilbert
space H, separable or non-separable. Suppose that N does not contain any direct summands
of Type I2 . Then for any quantum probability measure Pr on P (N) there is a unique
extension of Pr to a quantum state ωPr on N. Further, the state ωPr is normal iff Pr is
completely additive.9
8 See Kadison and Ringrose (1987, Vol. 2, Theorem 7.1.12) and Bratteli and Robinson (1987,
Theorem 2.4.21).
9 This Theorem requires proof techniques that are different from the one used for the original
Gleason theorem; see Hamhalter (2003) and Maeda (1990) for detailed treatments of this crucial
theorem. A Type I2 summand is isomorphic to the algebra of bounded operators on a two-
dimensional Hilbert space.
The generalized Gleason’s theorem opens the door to those who want to endorse
T2 and relegate quantum states to bookkeeping devices across the board in quantum
theory. But why go through this door? One motivation comes from those who want
to apply Bruno de Finetti’s personalism to quantum theory and construe quantum
probabilities as the degrees of belief rational agents have about quantum events
P(N). Gleason’s theorem allows the personalists to claim that quantum states have
no independent ontological standing but function only to represent and track the
credences of Bayesian agents. This answer leads directly to Itamar’s T2: quantum
states are bookkeeping devices for quantum probabilities, which reflect rational
responsiveness to quantum event structures limned by T1.
Itamar writes: “The remarkable feature exposed by Gleason’s theorem is that the
event structure dictates the quantum probability rule. It is one of the strongest pieces
of evidence in support of the claim that the Hilbert space formalism is just a new
kind of probability theory” (Pitowsky 2003, p. 222). By “the quantum probability
rule” he means the Born rule. The Gleason theorem by itself does not yield the
Born rule for calculating probabilities and expectation values. The Born rule uses a
density operator representation of expectation values, and from the point of view of
states the validity of this representation comes from the restriction of possible states
to normal states; and from the point of view of probabilities the validity derives
from the restriction to completely additive measures.10 And it is exactly here that a
challenge arises for both of the competing theses T2 and T2∗.
The use of the Born Rule is, we claim, baked into the practice of applying
quantum theory. In the following section we offer evidence for this claim. The
challenge for the proponents of the opposing T2 and T2∗ is to explain and justify
this practice, or explain when and why the practice should be violated.
In this subsection we use ‘state’ in a way that is neutral with respect to T2 vs T2∗;
so if you are a fan of T2, feel free to understand ‘state’ qua representative of a
10 An awkwardness for Itamar’s account is the existence, in the case where dim(H) = 2, of
probability measures on P (B(H)) that don’t extend to quantum states on B(H). In order to
maintain that quantum states are just bookkeeping devices for the credences of rational agents
contemplating bets on quantum event structures, it seems that Itamar must either deny that two
dimensional Hilbert spaces afford quantum event structures, or articulate and motivate rationality
constraints (in addition to the constraint of conformity to the axioms of the probability calculus!)
that prevent agents from adopting “stateless” credences.
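The awkwardness footnote 10 describes can be made concrete. In the Bloch-sphere picture, rank-one projections on a two-dimensional Hilbert space correspond to unit vectors n, and the only non-trivial orthogonal pairs are antipodal; the hemisphere rule below (our hypothetical construction, not one from the text) is additive on every such pair and yet agrees with no quantum state.

```python
import numpy as np

def f(n):
    """A {0,1}-valued assignment on rank-one projections P(n): 1 on the
    'upper' hemisphere, 0 on the lower, ties broken lexicographically.
    Since P(n) and P(-n) are the orthogonal pairs, f(n) + f(-n) = 1."""
    for c in (n[2], n[0], n[1]):
        if c > 0:
            return 1
        if c < 0:
            return 0
    raise ValueError("zero vector")

rng = np.random.default_rng(1)
for _ in range(1000):
    n = rng.standard_normal(3)
    n /= np.linalg.norm(n)
    assert f(n) + f(-n) == 1          # additive on orthogonal pairs

# A quantum state gives Pr(P(n)) = (1 + r.n)/2 with Bloch vector |r| <= 1,
# so Pr = 1 at two *distinct* directions n1, n2 would force r.n1 = r.n2 = 1,
# i.e. r = n1 and r = n2 simultaneously -- impossible.
n1 = np.array([0.1, 0.0, np.sqrt(1 - 0.01)])
n2 = np.array([-0.1, 0.0, np.sqrt(1 - 0.01)])
assert f(n1) == 1 and f(n2) == 1
best_r = (n1 + n2) / np.linalg.norm(n1 + n2)   # best candidate Bloch vector
assert min(best_r @ n1, best_r @ n2) < 1 - 1e-6
print("f is additive on orthogonal pairs but matches no quantum state")
```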
no-go results on the existence of conditional expectations for normal states such
as that of Takesaki (1972). Such considerations do not undermine our support
for T2∗; indeed they strengthen it by indicating that the sorts of considerations
physicists use to decide the admissibility of quantum states are quite different from
the considerations of rationality of credences the proponents of T2 would bring to
bear.
From the point of view of T2 the justification for normality can only be that the
credence functions rational agents put on the projection lattice P(N) must be
completely additive and, thus, by the generalized Gleason theorem the states that
bookkeep these credences are normal.
One way to make good on the claim that the credence functions of rational agents
are completely additive is to mount a Dutch book argument. An agent who uses her
credence function to assign betting quotients to events is subject to Dutch book just
in case there is a family of bets, each of which she accepts, but which collectively have the
property that she loses money in every possible outcome. The goal is to show that
conforming credences to the axioms of probability with complete additivity is both
necessary and sufficient for being immune to Dutch book. The ‘only if’ part of the
goal can be attained if the agent is required to stand ready to accept any bet she
regards as merely fair, but not if she is only required to accept bets that she regards
as strictly favorable (see Skyrms 1992). This may be regarded as a minor glitch in
that for most applications of quantum theory a separable Hilbert space suffices; in
these cases complete additivity reduces to countable additivity, and the Dutch book
argument goes through even if the agent only accepts what she regards as strictly
favorable bets.
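A minimal numerical instance of the Dutch book just described (our toy example): an agent whose credences in E and not-E sum to more than one, and who buys at her fair prices, loses money however things turn out.

```python
# Incoherent credences: cr(E) + cr(not-E) = 1.1 > 1 violates additivity.
cr_E, cr_notE = 0.4, 0.7

# At her fair prices she buys a $1 bet on E and a $1 bet on not-E.
cost = cr_E + cr_notE
for E_occurs in (True, False):
    payoff = (1 if E_occurs else 0) + (0 if E_occurs else 1)
    net = payoff - cost
    print(f"E={E_occurs}: net = {net:+.1f}")
    assert net < 0        # she loses money in every possible outcome
```

Exactly one of the two bets pays off in each outcome, so her payoff is always 1 while her outlay is 1.1: a sure loss of 0.1.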
Nevertheless, personalists are on thin ice here, since other constructions—such
as scoring rule arguments—that personalists have used to try to prove that rational
credences should conform to the axioms of probability do not reach countable,
much less complete, additivity. Additionally, Bruno de Finetti, the patron saint
of the personalist interpretation of probability, adamantly denied that credences
need be countably additive much less completely additive (de Finetti 1972, 1974).
The only restrictions on admissible probability functions to which personalists
may appeal are restrictions justified by considerations of rationality alone. If the
patron saint of personalism is rational, it seems that the status of countable and
complete additivity is something about which rational parties can disagree. And if
countable/complete additivity can’t be rationally compelled then the restriction of
quantum probability measures to countably and completely additive ones—and ergo
11.4.4 Finitizing
We have been pressing the line that advocates of T2∗ have better resources for
justifying normality, as well as for explicating departures from normality, than
advocates of T2 have. But if normality requires no justification, this line evaporates,
taking with it our best grounds for preferring our own objectivist interpretation of
quantum probabilities to Itamar’s personalist interpretation of those probabilities.
And Itamar has at his disposal a maneuver that would eradicate the need to justify
normality. That maneuver is to finitize. In this subsection, we’ll sketch it briefly,
then explain why we think even Itamar should resist it.
Throughout his exposition of quantum mechanics as a theory of probability,
Itamar assumes the von Neumann algebras B(H)—the algebras framing quantum
bets and affording quantum event structures—to be finite dimensional. We can find
motivation for the assumption in his 2006: “A proposition describing a possible
event in a probability space is of a rather special kind. It is constrained by the
requirement that there should be a viable procedure to determine whether the event
occurs” (p. 217). This is because “only a recordable event can be the object of a
gamble” (p. 218). Itamar’s point is that the outcomes subject to quantum bets have
to be the sorts of thing we can settle. Otherwise we wouldn’t be able to resolve
the bets we’ve made. Given certain assumptions about our perceptual limitations
and our attention (and life) spans, this constrains quantum bets to concern events
with only finitely many possible outcomes—and thereby constrains quantum event
structures to be given by the projection lattices of finite dimensional von Neumann
algebras. And, of course, with quantum events structures thus constrained, complete
additivity collapses to finite additivity, and the rock-bottom requirement that
quantum probability functions be finitely additive implies that quantum states are
automatically normal. Normality requires no further justification; it makes no sense
to assess accounts of quantum probability on the basis of how well they justify
normality.
12 Bayesian statisticians are divided on the issue of additivity requirements. Some versions of
decision theory, such as that of Savage (1972), operate with mere finite additivity. Kadane et al.
(1986) argue that various anomalies and paradoxes of Bayesian statistics can be resolved by the
use of so-called improper priors and that such priors can be interpreted in terms of finitely additive
probabilities. For more considerations in favor of mere finite additivity see Seidenfeld (2001).
While we concede the formal merits of this maneuver, we contend that even
Itamar should resist it, because it fundamentally undermines the spirit and interest
of his interpretation of quantum mechanics. Baldly put, we think the finitizing
move is hubris. It amounts to demanding that the quantum event structure—the
topic of T1/T1∗, and the element all hands agree to be supplied by the physics—
conform to packages of gambles mortal human agents can entertain and settle.
That is, the finitizing maneuver sculpts the physical event structure by appeal to
the limitations of agents. We acknowledge and admire the resourcefulness and
integrity of personalist attempts to derive constraints on credences from features of
the agents entertaining them. But we take the present finitizing move to inject agents
where they don’t belong: into the physical event structures themselves. Itamar’s
conjunction of T1 and T2 is brilliant and provocative in no small part due to the
delicate division of labor it posits: physics supplies the event structure; we supply
everything else. Upsetting that division of labor, the finitizing move defuses the
distinctive force of Itamar’s position. Without finitizing, anyone who agrees that
the quantum event structure is physical must take Itamar’s position seriously. With
finitizing, opponents might well complain that Itamar’s pulled a bait and switch, and
offered subjectified quantum event structures in lieu of physical ones.
To repeat, the view we are promoting is that quantum states codify objective features
of quantum systems, and the probabilities they induce are thereby objective. In
support of this view we offer the facts that (some) quantum states can be prepared
and that the probabilities they induce are borne out in the statistics of the outcomes
of experiments. Quantum theory itself contains the ingredients of a formal account
of state preparation.
The account contains two main ingredients, the first of which is the von
Neumann/Lüders projection postulate:
vN/L Postulate: If the pre-measurement state of a system with observable algebra N is ω and
a measurement of F ∈ P(N) returns a Yes answer, then the immediate post-measurement
state is ωF(A) := ω(F AF)/ω(F), A ∈ N, provided that ω(F) ≠ 0.13
Note that if ω is a normal state then so is ωF . The second ingredient is the concept
of a filter for states:
Def. A projection Fϕ ∈ P(N) is a filter for a normal state ϕ on N iff for any normal state ω
on N such that ω(Fϕ) ≠ 0,
13 von Neumann’s preferred version of the projection postulate differs from the version stated here
in the case of observables with degenerate spectra. Preliminary experimental evidence seems to
favor the Lüders version stated here; see Hegerfeldt and Mayato (2012) and Kumar et al. (2016).
ωFϕ(A) := ω(Fϕ A Fϕ)/ω(Fϕ) = ϕ(A) for all A ∈ N.
In ordinary QM with N = B(H) the normal pure states are the vector states, and
as the reader can easily check, any such state ψ has a filter Fψ ∈ P(B(H)), which
is the projection onto the ray spanned by the vector |ψ⟩ representing ψ. So suppose
that Fψ is measured and that the measurement returns a Yes answer. Then by the
vN/L postulate, if the pre-measurement state is ω then the post-measurement state
is ωFψ . And since Fψ is a filter for ψ the post-measurement state is ψ regardless of
what the pre-measurement state ω is, provided only that ω is a normal state and that
ω(Fψ) ≠ 0.
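The filter property just described can be checked numerically. The following sketch (ours, in density-matrix form for ordinary QM) implements the vN/L update and confirms that filtering through Fψ returns ψ whatever the normal pre-measurement state happens to be.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_density(d):
    """An arbitrary density operator on C^d: positive, trace one."""
    A = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

def luders(rho, F):
    """vN/L postulate: rho -> F rho F / Tr(rho F), given Tr(rho F) != 0."""
    p = np.trace(rho @ F).real
    assert p > 1e-12
    return (F @ rho @ F) / p

d = 4
psi = rng.standard_normal(d) + 1j * rng.standard_normal(d)
psi /= np.linalg.norm(psi)
F_psi = np.outer(psi, psi.conj())       # filter: projection onto psi's ray

for _ in range(5):
    rho = random_density(d)             # arbitrary pre-measurement state
    post = luders(rho, F_psi)
    assert np.allclose(post, F_psi)     # post-measurement state is psi
print("filter property verified: post-state is |psi><psi| for any rho")
```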
To repeat, this formal account of state preparation meshes with the statistics of
experiments. E.g. prepare a system in the normal pure state ψ. Immediately after
preparation make a measurement of some E ∈ P(B(H)) and record the result.
Repeat the preparation of ψ on the system or a similar system; then repeat the
measurement of E and record the result. Repeat . . . . Since the trials are independent
and identically distributed the strong law of large numbers implies that as the
number of trials is increased the frequency of Yes answers for E almost surely
approaches its theoretical value ψ(E). This is exactly what happens in actual
cases—at least there is always apparent convergence towards ψ(E).
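The prepare-and-measure protocol can be simulated (our illustration, with an arbitrarily chosen qubit state and yes-no event): i.i.d. trials with success probability ψ(E), with the observed frequency of Yes answers approaching that theoretical value.

```python
import numpy as np

rng = np.random.default_rng(3)

psi = np.array([3, 4j]) / 5.0          # a qubit state
E = np.array([[1, 0], [0, 0]])         # a yes-no event in P(B(H))
p = (psi.conj() @ E @ psi).real        # theoretical value psi(E) = 0.36

outcomes = rng.random(100_000) < p     # i.i.d. Yes/No records
freq = outcomes.mean()
print(f"psi(E) = {p:.2f}, observed frequency = {freq:.3f}")
assert abs(freq - p) < 0.01            # apparent convergence, as claimed
```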
The ansatz that only normal states are physically realizable has played a silent
but crucial role in the above account of state preparation. We have already suggested
that this ansatz fits more comfortably with our T2∗—which countenances physical
constraints on the space of admissible states—than it does with Itamar’s T2—
which countenances only rationality constraints. The restriction to normal states
has played a crucial role because if a system were initially in a non-normal state
it would be inexplicable how it could be prepared in a normal state since neither
unitary evolution nor Lüders/von Neumann projection can cause a tiger to change
its stripes: if ω is non-normal then so is ω′(•) := ω(U • U⁻¹) for any unitary
U : H → H; and if ω is non-normal then so is ω′(•) := ω(F • F)/ω(F) for any
F ∈ P(B(H)) such that ω(F) ≠ 0 (see Ruetsche 2011). Turning this around, the
success in preparing systems in normal states provides a transcendental argument
for the ansatz. Transcendental arguments are a priori, if they are arguments at all.
So perhaps an unholy marriage of Kant and Itamar will avail advocates of T2 of the
resources necessary, on this story, to make sense of state preparation.
Another attitude toward the ansatz that only normal states are physically
realizable might be available to advocates of T2∗ embarrassed by transcendental
arguments but not embarrassed by certain dialectical moves made on behalf of
the Past Hypothesis (see e.g. Loewer 2012). This is the hypothesis that the initial
macrostate of the universe was exceedingly low entropy. Some fans of the Past
Hypothesis claim that (i) it helps explain prevalent time asymmetries integral to our
epistemic practices; that (ii) it thereby figures as an axiom in the simplest, strongest
systematization of physical fact; and that (iii) this qualifies the Past Hypothesis as a
physical law by the lights of the Best Systems Account of laws of nature. Despite
Granted that conditions (a) and (b) capture features we want updating to have, the
uniquely determined updating rule specifies that if Pr is the pre-measurement probability
and ω the state corresponding to it, then

Pr(E//F) = ω(F EF)/ω(F) = ω(F EF)/Pr(F) for all E, F ∈ P(N). (L)

When E and F commute, F EF = E ∧ F, and (L) reduces to

Pr(E//F) = Pr(EF)/Pr(F) = Pr(E ∧ F)/Pr(F), (B)
which is the rule of Bayes updating used in classical probability. However, when
E and F do not commute ω(F EF) cannot be written as Pr(F EF) since F EF ∉
P(N) and is not assigned a probability. It should be embarrassing to the proponent
of T2 that states have to be utilized in the updating rule, but let this pass.
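The contrast between (L) and (B) can be made concrete (our sketch, with arbitrarily chosen projections on C³): for commuting E and F the two ratios coincide, while for a non-commuting pair the operator F GF fails to be idempotent, hence lies outside P(N) and receives no probability.

```python
import numpy as np

rho = np.diag([0.5, 0.3, 0.2])              # a state omega on C^3
Pr = lambda A: np.trace(rho @ A).real

E = np.diag([1.0, 1.0, 0.0])                # commuting projections E, F
F = np.diag([0.0, 1.0, 1.0])
luders = Pr(F @ E @ F) / Pr(F)              # rule (L)
bayes = Pr(E @ F) / Pr(F)                   # rule (B); E ^ F = EF here
assert np.isclose(luders, bayes)            # (L) reduces to (B)

v = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)  # G does not commute with F
G = np.outer(v, v)
FGF = F @ G @ F
assert not np.allclose(FGF @ FGF, FGF)      # FGF is not a projection
print("commuting case: (L) =", luders, "= (B) =", bayes)
```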
Call Xϕ ∈ P(N) a belief filter for a normal state ϕ on N iff for any completely
additive Pr on P(N) such that Pr(Xϕ) ≠ 0, Pr(E//Xϕ) = ϕ(E) for all E ∈ P(N).
When the generalized Gleason theorem applies an application of (L) shows that
a state filter for ϕ is a belief filter for ϕ. The converse is also true: when the
generalized Gleason theorem applies a belief filter for ϕ is also a state filter for
ϕ (see Appendix 4 for a proof). The equivalence of belief filters and state filters
allows the proponents of T2 and a personalist interpretation of probability to claim
various virtues.
They can say that they can account for ‘state preparation’ without having to carry
the ontological baggage of T2∗. What the proponents of T2∗ call a state filter for ϕ,
they call a belief filter for ϕ. And in virtue of Xϕ being a belief filter, all agents
who assign a non-zero prior probability to the belief filter Xϕ and who operate with
completely additive credence functions will, upon updating on the belief filter Xϕ ,
agree that posterior probabilities are equal to those assigned by ϕ. The state ϕ has
been ‘prepared’, but this just means that the bookkeeping device ϕ represents the
mutually shared updated opinion.
But please note that the personalist appropriation of §5’s formal account of state
preparation requires a restriction to completely additive probabilities—a restriction,
we’ve suggested, that fails to follow from considerations of rationality alone, unless
those considerations encompass transcendental arguments of a novel and surprising
nature. And, to review the discussion of §4.4, while the maneuver of finitizing
quantum event structures accomplishes the “restriction” in one fell swoop, it also
revises the division of labor so compellingly effected by Itamar’s T1 and T2. To our
minds, proponents of T2∗, who can invoke the dumb luck of physical contingency,
are better positioned to make sense of state preparation. Insofar as we’re assessing
proposals with respect to their capacity to deliver the goods of making sense of
quantum theory’s empirical success, this tips the balance in favor of our T2∗.
Upsetting the balance is the possibility that our strategy for making sense of
state preparation might not generalize beyond Type I von Neumann algebras. Given
our insistence on such generalization, such a possibility would hoist us on our own
petard. The next section gives life to the possibility, but also explains how we can
avoid being hoisted.
The account of state preparation given in §5 focused on normal pure states. We have
already explained the restriction to normal states. The restriction to pure states is
explained by the fact that mixed states do not have filters and, thus, are outside the
ambit of the formal account of state preparation offered above (see Appendix 3 for a
proof). This fact leads to a puzzle about how and whether state preparation is possible
in QFT.
It is widely assumed that local observers can only make measurements in
bounded regions of spacetime. If this is so it seems that such observers cannot
prepare global states, i.e. states on the quasi-local algebra N(M) for the entirety
of Minkowski spacetime M. If N(M) is Type I, as is typically the case, it admits
normal pure states, and such states have filters. But a state filter is a minimal
projection, and the local Type III algebras N(O) associated with open bounded
regions O ⊂ M do not contain any such projections. This is just as well, you may
say, for if a global state could be prepared by making measurements in a bounded
O ⊂ M some horrible form of non-locality would be in the offing. But things seem
just as bad at the local level. Since a local algebra N(O) is Type III it does not admit
any normal pure states, and filters for mixed states on N(O) do not exist in N(O).
A way out of this seeming conundrum was sketched by Buchholz et al. (1986) by
showing that, under certain conditions on the net O → N(O) of local algebras, local
state preparation is possible. The exposition begins with a definition:
Def. Let ϕ be a normal state on N(M). A projection Eϕ ∈ N(M) is a local filter for the
restriction ϕ|N(O) of ϕ to the local algebra N(O) iff for any normal state ω on N(M)
such that ω(Eϕ) ≠ 0, ωEϕ(A) = ϕ(A) for all A ∈ N(O).
We know that Eϕ cannot belong to N(O) itself. But ideally it can be located in some
N(O′) where O′ is only slightly larger than O. So the hope is that for any normal state
ϕ on N(M) and any O ⊂ M there is a local filter Eϕ ∈ N(O′) for ϕ|N(O) with O′
only slightly larger than O.
The hope for local filters can be fulfilled if the local algebras exhibit a property
called split inclusion:
Def. Let N1 and N2 be von Neumann algebras with N1 ⊆ N2. The pair (N1, N2) is a split
inclusion iff there is a Type I factor F such that N1 ⊆ F ⊆ N2.
The net O → N(O) has the split inclusion property iff for any O and O′ with
O′ ⊃ O, the pair (N(O), N(O′)) is a split inclusion.
Theorem 6 (Buchholz et al. 1986). The split inclusion property of the local algebras is both
necessary and sufficient for the existence of local filters.
In terms of our discussion the moral is that the competing stories the proponents of
T2 and T2∗ told in the preceding section about state preparation in ordinary QM can
be retold in QFT for states on local algebras just in case the split inclusion property
holds.15
So what is the status of the split inclusion property? It is known to hold in some
models of AQFT and fail in others. That the dividing line is being crossed is signaled
by the number of degrees of freedom in the model. It is often assumed that the
vacuum state for N(M) is a vector state whose generating vector is cyclic and
separating. Halvorson (2007) shows
Theorem 7. Let O → N(O) be a net of von Neumann algebras acting on H, and suppose
that there is a vector |Ω⟩ ∈ H that is cyclic and separating for all the local algebras in the
net. If the split inclusion property holds then H is separable.16
Physical models of AQFT feature vacuum states that are cyclic and separating.
Further, the split inclusion property is known to be entailed by the nuclearity
condition, which expresses a precise form of the idea that as the energy of a system
increases the energy level density should not increase too rapidly (see Buchholz and
Wichmann 1986). Perhaps a case can be made that the split inclusion property picks
out “physically reasonable” models of AQFT, but it is clearly a contingent property.
15 One of us (LR) has reservations about the interpretational morals being drawn from the BDL
theorem; see Appendix 6.
16 Let N be a von Neumann algebra acting on H. A vector |Ω⟩ ∈ H is cyclic iff N|Ω⟩ is dense in
H. It is a separating vector iff A|Ω⟩ = 0 implies A = 0 for all A ∈ N.
Of course, its contingency isn’t a challenge for Itamar’s T2. The property restricts
which algebras afford QFT’s quantum event structure; and quantum event structures,
the topic of Itamar’s T1 and of our friendly amendment T1∗, are something
Itamar regards as a matter of physics.
The proponents of T2 might claim to have achieved a standoff with the proponents
of T2∗. In ordinary QM where filters exist for all normal pure states the argument
for T2∗ from state preparation can be rewired to give an alternative account of ‘state
preparation’ in terms of belief fixation that is in accord with T2. In QFT the situation
is a bit more complex. At the global level filters for states on the global algebra are
unavailable to local observers so that the state preparation argument for T2∗ falters.
At the local level the story divides. In models where the split inclusion property
holds there are local filters and, thus, there is a standoff over local state preparation
parallel to the situation in ordinary QM; and in models where the split inclusion
property fails there are no local filters so that the state preparation argument for T2∗
falters.
The proponents of T2∗ may wish to put a different spin on matters. At the global
level in AQFT there is no preparation story for states on N(M) for the personalist
to piggyback on. The proponents of T2∗ can not only admit this but explain why
measurements by local observers cannot be used to prepare global states. And they
will also note that insofar as the practice of physics presupposes that there is a unique
global state, the proponents of T2 are at a loss to explain the practice. Similarly, at
the local level there is no preparation story for local states when the split inclusion
property fails. Again proponents of T2∗ can explain why measurements by local
observers cannot be used to prepare local states; and they will also note that insofar
as the practice of physics presupposes that there is on any local N(O) a unique local
state, the proponents of T2 are at a loss to explain the practice.
Setting aside QFT impediments, we take the proponents of T2∗ to have an
advantage over the proponents of T2. Both the formal T2∗ and the appropriated T2
accounts of state preparation by filtration need to restrict admissible quantum prob-
ability functions to completely additive ones. The expanded explanatory resources
afforded proponents of T2∗ give them the means to do so. Proponents of T2 need
to anchor such restrictions in considerations of rationality alone. It is not at all clear
that such considerations suffice.
Of course, if proponents of T2 could solve the measurement problem, there
would be good reason to set such quibbles aside. The next section contends that
they cannot.
276 J. Earman and L. Ruetsche
The von Neumann/Lüders projection postulate gives the empirically correct pre-
scription for calculating post-measurement probabilities. If, following T2∗, quan-
tum states codify objective features of quantum systems, and if, following John
Bell’s injunction, ‘measurement’ is not to be taken as a primitive term of quantum
theory, then there is a problem in accounting for the success of the vN/L postulate.
For numerous no-go results show that treating a quantum measurement as an
interaction between an apparatus and an object system and describing the composite
apparatus-object system interaction as Schrödinger/Heisenberg evolution does not
yield the desired result.
It is tempting to think that the problem can be dissolved by rejecting Bell’s
injunction and also rejecting T2∗ in favor of T2. For these twin rejections open
the way to say: The change of state postulated in vN/L does take place but this
change is innocuous and calls for no further explanation; for the change is simply
a change in the bookkeeping entries used to track the change in credence functions
when updating on measurement outcomes. In more detail, when the information
that a measurement of F ∈ P(N) gives a Yes result is acquired, a completely
additive credence function Pr on P(N) is updated by Lüders conditionalization on
F to Pr′(•) = Pr(•‖F). Assuming that the generalized Gleason theorem applies,
it follows that there are unique normal states ω and ω′ bookkeeping Pr and Pr′
respectively, and it follows from the definition of Lüders conditionalization that
these bookkeepers are related by

ω′(•) = ω(F • F)/ω(F),

just as vN/L requires. That’s all there is to it.17
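To make the bookkeeping concrete, here is a small finite-dimensional sketch of the Lüders update (our own illustration, not from the text; the toy density matrix and projections are assumptions), representing the credence function Pr by a density matrix ρ so that conditionalization on F becomes ρ ↦ FρF/tr(ρF):

```python
import numpy as np

# Toy 3-level system (the numerical values are assumptions for illustration).
rho = np.diag([0.5, 0.3, 0.2])   # pre-measurement state, "bookkeeper" for Pr
F = np.diag([1.0, 1.0, 0.0])     # projection answering Yes

# Lüders conditionalization: rho' = F rho F / tr(rho F)
rho_post = F @ rho @ F / np.trace(rho @ F)

# Check against Pr'(E) = tr(rho F E F) / tr(rho F) for some event E in P(N).
E = np.diag([1.0, 0.0, 0.0])
lhs = np.trace(rho_post @ E)
rhs = np.trace(rho @ F @ E @ F) / np.trace(rho @ F)
print(np.isclose(lhs, rhs))  # True
```

The updated bookkeeper rho_post is again a unit-trace state, matching the unique normal state the generalized Gleason theorem pairs with the conditionalized credence.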
We demur: there is more, and the more steers us back towards the original
measurement problem. On T2∗ there is a unique but, perhaps, unknown pre-
measurement state, and the vN/L postulate refers to the change in this state. By
contrast, on T2 there are as many different bookkeeping states as there are agents
with different pre-measurement credence functions, and unless the F in question
is a belief filter there are many different post-measurement bookkeeping states.
Only one of these post-measurement bookkeeping states gives correct predictions
about observed frequencies, something easily understood on T2∗ since only one of
these states induces the objective chances of post-measurement results. Thus, on
T2 there is the 0th measurement problem of how to explain uniqueness of states:
how, in the parlance of T2, to explain why exactly one of the myriad admissible
probability functions available accords with subsequent observation. The problem
becomes particularly pressing in cases where there are no belief filters, as in models
of QFT where the split property fails.
17 Or rather that is all there is supposed to be to what Pitowsky (2003) and Bub and Pitowsky (2010)
call the ‘big’ measurement problem. They think that after the big problem is dissolved there still
remains the ‘small’ measurement problem, viz. to explain the emergence of the classical world we
observe.
The 0th measurement problem threatens to spill over into the BIG measurement
problem. It is our experience that after measurements, the credences available to
observers narrow to those corresponding to full belief in one outcome or another.
That is, upon updating on a measurement outcome, I obtain a credence assigning
probability 1 to an element of the set {Ei } of spectral projections of the pointer
observable. I never obtain a credence assigning probability 1 to a member of a set of
projections {Ei } incompatible with the pointer observable. It is not clear how Itamar
can explain why our options are invariably narrowed in these characteristic ways.
And even if he can, he faces a further mystery: why does a measurement deliver me
to full belief in one rather than another of the {Ei }? “Because that outcome—En
(say)—was observed!” he might reply, rehearsing §6’s account of belief fixation via
filtration. But §6’s account posits an outcome on which all believers update—and it
is not clear that Itamar has the wherewithal to deliver the posit. Concerning quantum
event structures, T1 says nothing about which quantum events are realized in any
given physical situation. Thus it’s unclear that Itamar can say what the physical facts
are, much less that they include measurement outcomes. Compare our view, which
can marshal the physical quantum state to characterize the facts. The physical state,
along with some account of what’s true of a system in that state (e.g. the unjustly
maligned eigenvector-eigenvalue link) is the right kind of resource for saying what
the physical facts are. Alas, to date, no one’s articulated a rule that manages, without
legerdemain, to number measurement outcomes among the physical facts.
Additionally, even if ‘measurement’ is taken as a primitive it makes sense to ask
how physically embodied agents acquire information about measurement results.
Are we to take this as an unanalyzed process? Or do we search for an account using
quantum theory to analyze the interaction of agents with the rest of the world? The
former strengthens the suspicion that real problems are being swept under the rug.
The latter points to a rerun of the original problem.
11.10 Conclusion
The key question separating our position from Itamar’s is how to understand
quantum probabilities and quantum states. Are they a matter of physics (our T2∗)
or a matter of rationality (Itamar’s T2)? But there is another question in the vicinity,
illuminating further points of contention. The question is: just how much realism
about QM is possible?
One way to approach the question is to see how well candidate realisms accommodate
the key explanandum that QM works. A stability requirement for a would-be
abductive realist is that what she believes when she believes QM must be consistent
with the evidence she cites to support her belief in QM. That evidence takes the
form of a history of preparing physical systems, attributing them quantum states,
extracting probabilistic predictions from those states, and verifying those predic-
tions by means of subsequent measurements. Interpretations that render preparation
infeasible, statistical prediction unreachable, or measurement impossible fail this
18 This is hardly to deny that Bohmians are realists. They’re just realists about something else—the
Bohm theory. Standard QM stands to the Bohm theory as an effective theory stands to a more
fundamental theory whose coarse-grained implications it mimics. Thus realism about the Bohm
theory induces something like “effective realism” about QM. The point is that this represents less
realism about the theory than Itamar’s position. For more on effective realism, see Williams (2017).
What about the measurement problem? Both views encounter it in some form.
Both are self-undermining. But at least our view does not undermine physics. For
advocates of T2∗, the measurement problem signals that something physical is
eluding our grasp—and that signal is a call to develop new physics. Advocates
of T2 deny that anything is missing from the physics. Indeed, they contend that
the measurement problem results from putting something in the physics—quantum
states that reflect objective probabilities—that doesn’t belong there. Taking the
measurement problem to be thereby dissolved, they predict that no fruitful new
physics is to be had by confronting the problem.
In the end, one’s attitude towards the measurement problem comes down to a bet
on what is the most fruitful way to advance physics. Our bet is that the path starts from
taking the measurement problem as a problem to be solved rather than dissolved;
and further, our bet is that advancement down this path will require new physics,
perhaps something along the lines of a stochastic mechanism to produce the changes
postulated by vN/L or perhaps a radically new theory.
Assume that among the physically realizable states on the quasi-local global algebra
N(M) there is at least one normal state ϕ. Assume also the property of local
definiteness. It follows that any physically realizable state ω is locally normal, i.e.
for any p ∈ M there is a neighborhood O of p such that ω|N(O) is normal. To
see this, consider a sequence {On}∞n=1 of open neighborhoods in M descending
to p. By local definiteness there must be a sufficiently large n0 ∈ N such that
||(ω − ϕ)|N(On0)|| < 2. This implies that ω|N(On0) and ϕ|N(On0) belong to the
same folium, and since ϕ|N(On0) is normal so is ω|N(On0). To complete the proof
set O = On0.
Suppose that Xϕ ∈ P(N) is a belief filter for the normal state ϕ, and let Pr be a
completely additive probability on P(N) such that Pr(Xϕ) ≠ 0. By the generalized
Gleason theorem it follows that ω(Xϕ) ≠ 0 and that

ω(Xϕ E Xϕ)/ω(Xϕ) = ϕ(E) for all E ∈ P(N) (*)

where ω is the normal state that extends Pr. Now use the facts that N is the weak
closure of P(N) and that normal states are continuous in the weak topology to
conclude that

ω(Xϕ A Xϕ)/ω(Xϕ) = ϕ(A) for all A ∈ N (**)

Since Pr was arbitrary and since completely additive probabilities are in one-
one correspondence with normal states, (**) holds for any normal state ω such that
ω(Xϕ) ≠ 0, which is to say that Xϕ is a state filter for ϕ.
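The content of the argument can be checked concretely in ordinary finite-dimensional QM, where the support projection X = |v⟩⟨v| of a pure state ϕ filters any state ω with ω(X) ≠ 0 back to ϕ. A minimal numerical sketch (our own illustration; the random state and event are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

# Pure state phi = |v><v|; its support projection X = |v><v| is the filter.
v = rng.normal(size=d) + 1j * rng.normal(size=d)
v /= np.linalg.norm(v)
X = np.outer(v, v.conj())

# Arbitrary state omega (random density matrix) with omega(X) != 0.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
omega = A @ A.conj().T
omega /= np.trace(omega).real

# Arbitrary event E (projection onto a random unit vector).
w = rng.normal(size=d) + 1j * rng.normal(size=d)
w /= np.linalg.norm(w)
E = np.outer(w, w.conj())

# Filtering: omega(X E X)/omega(X) should equal phi(E) = <v|E|v>,
# since X E X = <v|E|v> X for the rank-one projection X.
filtered = np.trace(omega @ X @ E @ X) / np.trace(omega @ X)
print(np.isclose(filtered, v.conj() @ E @ v))  # True
```

The identity X E X = ⟨v|E|v⟩X is what makes the filter work regardless of which ω one starts from, mirroring the arbitrariness of Pr in the proof above.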
Here we sketch the proof that the split inclusion property is sufficient for the
existence of local filters in AQFT; for a proof of the converse the reader is referred
to Buchholz et al. (1986).
And a fortiori (11.3) holds for all A ∈ N(O) ⊂ F. The upshot is that if ϕ is a
normal state on N(M) whose restriction ϕ|N(O) to the local algebra N(O) extends
to a normal pure state ψ on F then the support projection Eψ for ψ serves as a local
filter for ϕ and, as hoped, Eψ belongs to the algebra N(Ô) associated with a region Ô
only slightly larger than O. It remains only to note that any normal state on N(O)
extends to a vector state on F. Since N(O) is Type III, any normal state ξ on N(O)
is a vector state. So there is a |ξ⟩ ∈ H such that ξ(A) = ⟨ξ|A|ξ⟩ for all A ∈ N(O).
This state obviously extends to the vector state ξ̄ on F ⊆ B(H), where ξ̄(B) = ⟨ξ|B|ξ⟩
for all B ∈ F. The support projection Eξ̄ ∈ F ⊂ N(Ô) for this state is then the
desired local filter.
filters for its normal states. And this short-circuits what might otherwise have been
a comforting physical story about Buchholz-Doplicher-Longo preparations. The
comforting story is that we are preparing a state ϕ on N(O) by preparing a state on
a larger algebra containing N(O); ϕ is just the restriction to N(O) of the prepared
superstate. (The comforting story incidentally explains why, contrary to ordinary
expectations, the states we prepare in QFT are mixed. The explanation is that in the
presence of entanglement, subsystem states are inevitably mixed, and subsystem
states are what BDL preparations prepare.) The comforting story is short-circuited
because in the BDL scenario, we aren’t preparing a state on the larger algebra N(Ô)
/the larger region Ô. We aren’t because we can’t: normal states on N(Ô) lack filters
in Ô.
While such puzzles are no reason to disbelieve the theorem, they leave some
of us wondering, sometimes, what we’re believing when we believe this piece of
mathematics.
Acknowledgements This work reflects a collaboration with the late Aristides Arageorgis extend-
ing over many years. We are indebted to him. And we thank Gordon Belot, Jeff Bub, and the editors
of this volume for valuable guidance and feedback on earlier drafts.
References
Araki, H. (2009). Mathematical theory of quantum fields. Oxford: Oxford University Press.
Bratteli, O., & Robinson, D. W. (1987). Operator algebras and quantum statistical mechanics I
(2nd ed.). Berlin: Springer.
Bub, J., & Pitowsky, I. (2010). Two dogmas of quantum mechanics. In S. Saunders, J. Barrett, A.
Kent, & D. Wallace (Eds.), Many worlds? Everett, quantum theory, & reality (pp. 433–459).
Oxford: Oxford University Press.
Buchholz, D., & Wichmann, E. H. (1986). Causal independence and the energy-level density of
states in local quantum field theory. Communications in Mathematical Physics, 106, 321–344.
Buchholz, D., Doplicher, S., & Longo, R. (1986). On Noether’s theorem in quantum field theory.
Annals of Physics, 170, 1–17.
Cassinelli, G., & Zanghi, N. (1983). Conditional probabilities in quantum mechanics. Nuovo
Cimento B, 73, 237–245.
de Finetti, B. (1972). Probability, induction and statistics. London: Wiley.
de Finetti, B. (1974). Theory of probability (2 Vols). New York: Wiley.
Drisch, T. (1979). Generalizations of Gleason’s theorem. International Journal of Theoretical
Physics, 18, 239–243.
Eilers, M., & Horst, E. (1975). Theorem of Gleason for nonseparable Hilbert spaces. International
Journal of Theoretical Physics, 13, 419–424.
Feintzeig, B., & Weatherall, J. (2019). Why be regular? Part II. Studies in History and Philosophy
of Science Part B: Studies in History and Philosophy of Modern Physics, 65, 133–144.
Haag, R. (1992). Local quantum physics. Berlin: Springer.
Halvorson, H. (2001). On the nature of continuous physical quantities in classical and quantum
mechanics. Journal of Philosophical Logic, 30, 27–50.
Halvorson, H. (2007). Algebraic quantum field theory. In J. Earman & J. Butterfield (Eds.), Hand-
book of the philosophy of science, philosophy of physics part A (pp. 731–864). Amsterdam:
Elsevier.
Hamhalter, J. (2003). Quantum measure theory. Dordrecht: Kluwer Academic.
Hegerfeldt, G. C., & Mayato, R. (2012). Discriminating between von Neumann and Lüders
reduction rule. Physical Review A, 85, 032116.
Kadane, J. B, Schervish, M. J., & Seidenfeld, T. (1986). Statistical implications of finitely additive
probability. In P. K. Goel & A. Zellner (Eds.), Bayesian inference and decision techniques (pp.
59–76). Amsterdam: Elsevier.
Kadison, R. V., & Ringrose, J. R. (1987). Fundamentals of the theory of operator algebras (2 Vols).
Providence: American Mathematical Society.
Kumar, C. S., Shukla, A., & Mahesh, T. S. (2016). Discriminating between Lüders and von
Neumann measuring devices: An NMR investigation. Physics Letters A, 380, 3612–3616.
Loewer, B. (2012). Two accounts of laws and time. Philosophical Studies, 160, 115–137.
Maeda, S. (1990). Probability measures on projections in von Neumann algebras. Reviews in
Mathematical Physics, 1, 235–290.
Petz, D., & Rédei, M. (1996). John von Neumann and the theory of operator algebras. The
Neumann Compendium, 163–185.
Pitowsky, I. (2003). Betting on the outcomes of measurements: A Bayesian theory of quantum
probability. Studies in the History and Philosophy of Modern Physics, 34, 395–414.
Pitowsky, I. (2006). Quantum mechanics as a theory of probability. In: W. Demopoulos & I.
Pitowsky (Eds.), Physical theory and its implications; essays in honor of Jeffrey Bub (pp. 213–
240). Berlin: Springer.
Raggio, G. A. (1988). A remark on Bell’s inequality and decomposable normal states. Letters in
Mathematical Physics, 15, 27–29.
Ruetsche, L. (2011). Why be normal? Studies in History and Philosophy of Modern Physics, 42,
107–115.
Savage, L. J. (1972). Foundations of statistics (2nd ed.). New York: Wiley.
Seidenfeld, T. (2001). Remarks on the theory of conditional probability. In V. F. Hendricks, et al.
(Eds.), Probability theory (pp. 167–178). Dordrecht: Kluwer Academic.
Skyrms, B. (1992). Coherence, probability and induction. Philosophical Issues, 2, 215–226.
Srinivas, M. D. (1980). Collapse postulate for observables with continuous spectra. Communica-
tions in Mathematical Physics, 71, 131–158.
Takesaki, M. (1972). Conditional expectations in von Neumann algebras. Journal of Functional
Analysis, 9, 306–321.
Williams, P. (2017). Scientific realism made effective. The British Journal for the Philosophy of
Science, 70, 209–237.
Chapter 12
The Measurement Problem and Two
Dogmas About Quantum Mechanics
Laura Felline
L. Felline
Department of Philosophy, University of Roma Tre, Rome, Italy
12.1 Introduction
1 See later in this section and especially note 2 for some clarifications about the terminology.
12 The Measurement Problem and Two Dogmas About Quantum Mechanics 287
2 A further clarification: Bub takes this terminology from Wallace (2016) where, however,
probabilistic interpretations are characterized as those according to which observables always have
determinate values, while the quantum state is merely “an economical way of coding a probability
distribution over those observables” (p. 20). I think this characterization is incorrect for the three
approaches analysed here, since they do not take observables to be determinate at all times; therefore
I don’t adopt Wallace’s terminology.
288 L. Felline
In arguing for this general point, I focus on the measurement problem, typically
taken to be explained away in non-representational accounts. As in the example
of non-locality illustrated above, and contrary to what is often claimed in the
literature, in the case of the measurement problem too the majority of non-
representational accounts fail to provide a genuine solution.
One of the most systematic expositions of the non-representational view and of
its relationship with the measurement problem was put forward a few years ago by
Itamar Pitowsky, in a series of papers (Pitowsky 2003, 2004, 2006; Bub and Pitowsky
2010) where he puts forward his own Bayesian theory of quantum probability.
According to his analysis, the theoretical problems of QT originate from
two ‘dogmas’: the first forbidding the use of the notion of measurement in the
fundamental axioms of the theory; the second imposing an interpretation of the
quantum state as representing a system’s objectively possessed properties and
evolution.
In this paper I illustrate and criticise Pitowsky’s analysis, taking the rejection
of the two dogmas as an accurate articulation of the fundamental stances of the
probabilistic approach to the measurement problem. In order to assess such stances,
I will test them with an argument elaborated by Amit Hagar and
Meir Hemmo (2006), showing how some probabilistic interpretations of QT fail
to dictate coherent predictions in Wigner’s Friend situations.
More specifically, I evaluate three different probabilistic approaches: qBism
(more specifically, the version articulated by Christopher Fuchs, e.g. Fuchs 2010;
Fuchs et al. 2014) as a representative of the epistemic subjective interpretation of
the quantum state; Bub’s information-theoretic interpretation of QT (Bub 2016,
2018) as an example of the ontic approach to the quantum state; Pitowsky’s prob-
abilistic interpretation (Pitowsky 2003, 2004, 2006), as an epistemic but objective
interpretation. I argue that qBism succeeds in providing a formal solution to the
problem that does not lead to a self-contradictory picture, although the resulting
interpretation would bring the real subject matter of QT to clash alarmingly with
scientific practice. The other two approaches, instead, strictly fail when confronted
with scenarios where the measurement problem is relevant, showing in this way that
they do not provide a genuine solution to the problem.
The idea that deflating the quantum state’s ontological import is sufficient to explain
away the measurement problem is a leitmotif in the literature about information-
based accounts of QT, still shared by the vast majority of the advocates of the
probabilistic view:
The existence of two laws for the evolution of the state vector becomes problematical only
if it is believed that the state vector is an objective property of the system. If, however,
the state of a system is defined as a list of [experimental] propositions together with their
[probabilities of occurrence], it is not surprising that after a measurement the state must be
changed to be in accord with [any] new information. (Hartle 1968, as quoted in Fuchs 2010)
Itamar Pitowsky, also together with Jeff Bub (Pitowsky 2006; Bub and Pitowsky
2010), has identified two distinct issues behind the measurement problem. The first
one, which he calls the small measurement problem, is the question “why is it hard to
observe macroscopic entanglement, and what are the conditions in which it might
be possible?” (2006, p. 28). The second issue, the big measurement problem, is the
problem of accounting for the determinateness of our experiences:
The ‘big’ measurement problem is the problem of explaining how measurements can have
definite outcomes, given the unitary dynamics of the theory: it is the problem of explaining
how individual measurement outcomes come about dynamically. (Bub and Pitowsky 2010,
p. 5)
In this paper we are exclusively concerned with the big problem; therefore,
from now on, I will refer to the latter simply as the measurement problem.
However, I reject Bub’s and Pitowsky’s a priori demand for a dynamical account;
therefore I characterise the measurement problem as the problem of accounting for
the determinateness of measurement results.
The second most notable contribution that Pitowsky has made to the analysis of
the measurement problem is the identification of two assumptions, labelled
‘dogmas’, that, according to the philosopher, are at the origin of the issues in the
interpretation of QT.
The first dogma is Bell’s assertion (defended in [(Bell 1987)]) that measurement should
never be introduced as a primitive process in a fundamental mechanical theory like classical
or quantum mechanics, but should always be open to a complete analysis, in principle, of
how the individual outcomes come about dynamically. The second dogma is the view that
the quantum state has an ontological significance analogous to the ontological significance
of the classical state as the ‘truthmaker’ for propositions about the occurrence and non-
occurrence of events, i.e., that the quantum state is a representation of physical reality.
(Pitowsky 2006, p. 5)
A couple of observations are in order here. First of all, the two claims:
“the quantum state has an ontological significance analogous to the ontological
significance of the classical state as the ‘truthmaker’ for propositions about the
occurrence and non-occurrence of events” and “the quantum state is a representation
of physical reality”, are not equivalent, although they are here put forward as such.
More specifically, one can deny that the quantum state represents ‘traditionally’
(Bub 2018), in the sense that it does not represent a system’s properties and therefore
does not act as a truthmaker for propositions about the occurrence and non-occurrence
of events, without denying that the quantum state represents physical reality. In fact,
that’s the case for Bub’s information-theoretic interpretation which, as I will argue
below, denies that the quantum state represents a system’s properties and dynamics,
and yet interprets the quantum state ontically, as representing physical reality. In
the following, therefore, I take the second dogma as the view that the quantum state
does not represent traditionally, while the issue whether the quantum state represents
physical reality or a mental state remains open.
Secondly, although probabilistic interpretations reject both dogmas, it is useful
to keep in mind that, at least at first sight, the two are logically distinct, and it still
remains to be seen whether they need to come as a package. The rejection of the first
Therefore, the system composed of Friend and the particle (F + P) is in the state:

|Ψ0⟩P+F = (1/√2 |+z⟩P + 1/√2 |−z⟩P) |Ψ0⟩F (12.2)
Since they agreed on this first part of the protocol, Wigner knows that Friend is
going to perform the measurement; however, because the lab is isolated from Wigner
and the rest of the environment, he does not know the result of her measurement.
Now, consider a second measurement, of an observable O of the system F + P,
with eigenstate

|Ψ⟩P+F = 1/√2 |+z⟩P |+z⟩F + 1/√2 |−z⟩P |−z⟩F (12.3)
or
In the rest of the paper, the application of this argument to three different
theoretical accounts will be used to explore the relationship between the two dogmas
and the measurement problem.
12.4 QBism
I have already said that HH chose qBism as a case study to formulate their argument,
while claiming that the same result applies to any black-box approach to QT. The
ensuing debate around this specific approach, however, has shed some light on
how different black-box approaches lead to different results when faced with Wigner’s
Friend scenarios. In order to formulate my criticism of qBism, in the following I
will build on that debate. I will first isolate the exact feature of qBism that allows
it to formally escape HH’s argument, and then show how this same feature leads to
a philosophically problematic account of QT.
According to qBism, the quantum state represents the rational subjective belief
of an observer, assembled from personal experience. The first direct consequence of
this interpretation is that quantum probabilities are not a measure of anything in real-
ity, but only of the degree of belief of an observer over future experience. Secondly,
different observers might consistently attribute different quantum states to the same
system. In order to solve the dilemma in the Wigner’s Friend scenario, therefore, one
needs to take into consideration each observer’s individual perspective: given that
she has gathered new data, QT prescribes Friend to update P’s quantum state (and
F + P’s state, clearly). From her perspective, then, the prediction of the result in an
O-measurement in the second stage of the experiment will be YES with probability
1/2. On the other hand, Wigner has registered no new data, therefore QT prescribes
him to maintain the entangled state (12.4) for F + P. From his perspective, therefore,
the O-measurement will yield YES with probability 1.
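The divergence between the two perspectives can be checked numerically. The following sketch (our own illustration; it assumes Friend saw, say, the +z outcome) compares the probability each observer assigns to the YES outcome of the O-measurement, i.e. to the projector onto the entangled eigenstate (12.3):

```python
import numpy as np

up = np.array([1.0, 0.0])   # |+z>
dn = np.array([0.0, 1.0])   # |-z>

# Entangled eigenstate (12.3) of O for the system F + P.
psi = (np.kron(up, up) + np.kron(dn, dn)) / np.sqrt(2)
O_yes = np.outer(psi, psi)  # projector for the YES outcome

# Wigner has registered no data, so he retains the entangled state psi.
p_wigner = psi @ O_yes @ psi

# Friend updated on her +z result, so she assigns |+z>_P |+z>_F.
collapsed = np.kron(up, up)
p_friend = collapsed @ O_yes @ collapsed

print(round(p_wigner, 3))   # 1.0
print(round(p_friend, 3))   # 0.5
```

The point, of course, is not the arithmetic but that qBism licenses both assignments at once, each coherent from its owner’s perspective.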
The surprising consequence is that, according to qBism, different observers
can coherently hold incompatible predictions about the world. The subjectivism
embraced by qBism turns QT into “a “single-user theory”: probability assign-
ments express the beliefs of the agent who makes them, and refer to that same
agent’s expectations for her subsequent experiences. The term “single-user” does
not, however, mean that different users cannot each assign their own coherent
probabilities” (Fuchs et al. 2014, p. 2).
According to Amit Hagar (2007) this strategy can’t work: the (repeated) mea-
surement of O will yield a series of results with a certain frequency that will
(hopefully!) confirm one of the predictions and falsify the other. If this is so, at
the end of the day one of the two alternatives (collapse or no-collapse) will be right,
while the other will be wrong.
To Hagar, Timpson (2013) replies that this kind of objection is ineffective
against qBism:
[n]o objection can be successful which takes the form: ‘in such and such a situation, the
quantum Bayesian position will give rise to, or will allow as a possibility, a state assignment
which can be shown not to fit the facts’, simply because the position denies that the requisite
kind of relations between physical facts and probability assignments hold. (Timpson 2013,
p. 209)
3I am very grateful to Meir Hemmo and an anonymous referee for pushing me to provide a more
thorough evaluation of Timpson’s position.
4 By HH themselves first, but see also, just to cite one, (Bacciagaluppi 2013) for an insightful take
is available to all the observers, while the lab is maintained in isolation (see for
instance (Frauchiger and Renner 2018) and (Baumann and Brukner 2019)).
The core of the qBist solution to this problem lies instead in “some kind of
solipsism or radical relativism, in which we care only about single individuals’
credence’s, and not about whether and how they ought to mesh” (Bacciagaluppi
2013, p. 6).
Now, it is difficult to formulate an argument with solid grounds in favour of an
interpretation of QT, if your starting point is solipsism. However, it might be argued
that the situation is not as bad as it might seem. Timpson (2013, §9.3, but see also
Healey 2017, §2), for instance, denies that qBism implies solipsism, and claims
instead that the former avoids the paradoxical or anyway pernicious consequences
of the latter as a general epistemological thesis. In other words, although, according
to qBism, QT fails to include the perspective of more than one observer, it does not
follow that according to Qbism there is only one sentient being in the world, let
alone that the rest of empirical science is affected by these limitations. This same
defence strategy is used by Timpson against the charge of instrumentalism: it is
true that, according to qBism, QT is neither true nor false of the world, but a mere
instrument for the bookkeeping of probabilities. However, in the case of qBism, this
conclusion follows from specific considerations about the structure of QT and only
applies to that theory. Because of these considerations, qBism does not imply the
majority of the controversial features of instrumentalism as a general position about
science.
The upshot of this rebuttal of the criticisms is: QT might forbid inferences about
the experiences of other observers, as solipsism does, and it might be as void of
knowledge about the world as instrumentalism dictates – but this is OK, because
qBism does not lead to the possibly paradoxical consequences of such theses.
I am not going to challenge Timpson’s argument here. Rather, I take for granted
that this defence provides qBism with a way out from the conclusions of HH’s
argument. This being said, it is legitimate to scrutinize the consequences of this
unique status of QT with respect to the rest of empirical science – a uniqueness that,
we just concluded, lies at the core of the qBist success.
I am going to argue that this achievement is a pyrrhic victory, as it still fails to
provide an acceptable picture of the scientific enterprise in the field of QT.
qBism succeeds in the context of Wigner’s Friend scenarios, and also in explain-
ing away other apparent oddities of the quantum world, because it characterizes QT
not as a physical theory, but as a Bayesian theory for the formation of expectations.
As such, it can coherently ignore the constraints imposed on an objective description
of the world when they are in the way of coherence. While a theory about the world
must submit to the typical epistemic and methodological standards of the empirical
sciences, Bayesian epistemology is subordinate to very different standards, for the
simple reason that different subject matters require different methodologies. The
violation of the requirement of intersubjectivity in the qBist explaining away of
non-locality (see Fuchs et al. 2014) and in the account of the Wigner’s Friend
thought experiment is a case in point.
12 The Measurement Problem and Two Dogmas About Quantum Mechanics 295
In other words, QBism plays a different game, with different rules, with respect
to physics and, more generally, with respect to the natural sciences. And it escapes
strict failure when confronted with HH's argument (insofar as self-consistency is
the only requirement needed to avoid strict failure, but see later) only insofar as QT
is not interpreted as natural science.
If this is so, a coherent and truthful endorsement of QBism also implies the
renunciation of constraints of great guiding value for scientific change, like
empirical adequacy, physical salience, or intersubjectivity, because, in a quantum
Bayesian theory about rational belief, such criteria fail to make sense. In fact, the
renunciation of such criteria is the key to the QBist explanation away of the oddities
of QT.
And yet, it can’t be ignored that giving up these criteria means throwing away a
huge piece of scientific practice – a hard bullet to bite for working physicists using
QT as an empirical science every day.
One might reply that this is a made-up problem: scientific practice does not, and
should not, change because of philosophical debates, so there is nothing to worry
about. However, the unproblematic acknowledgment of such a chasm between
scientific practice and the real content of scientific theories is an even harder bullet
to bite (if not a straightforward failure) for philosophers.5
This is what one gets when submitting to a QBist view of QT. As we will see in
the next sections, those who think that this is too high a price to pay for internal
consistency, but still want to deny the representational role of QT, will have to do so
while maintaining in some way the empirical import of the quantum state, which
QBism denies.
5 In his analysis of the virtues and problems of QBism, Chris Timpson formulates a similar
challenge to QBism, but focusing on explanation:
It seems that we do have very many extensive and detailed explanations deriving from
quantum mechanics, yet if the quantum Bayesian view were correct, it is unclear how we
would do so. (Timpson 2013, p. 226)
Timpson's criticism is that QBism suffers from an explanatory deficit because it can't explain why
QT explains; however, I think that the QBist lack of a realist explanation for the explanatory power
of QT is hardly an insurmountable problem.
First of all, the QBist can reply to Timpson's challenge that there is no reason why a theory that is
neither true nor false can't do what false theories have done for centuries: be explanatory. If one
then insists (wrongfully, I think) that only at least approximately true theories have explanatory
power, then the QBist can simply deny that QT has genuine explanatory power: after all, the topic
of explanation in QT is a time-honoured headache in the philosophy of science, and the philosophers
who deny the explanatory power of QT are not few. An objection hinging on the explanatory power
of QT can only go so far.
296 L. Felline
The risk that approaches treating the quantum state as a bookkeeping device run is
that the quantum state turns out to be void of physical content or, even worse,
instrumentalist; part of Pitowsky's conceptual work is therefore devoted to stressing
how his approach avoids this.
One way to mark a distance from instrumentalism is to stress the
explanatory power of quantum Bayesianism. Under an instrumentalist view (e.g. in
the textbook approach to QT), in fact, the question 'why do quantum events not
conform to classical probabilities?' has no answer. The information-theoretic
view, on the other hand, has more tools to provide such an explanation, since
it is realist towards the structure of quantum gambles, i.e. towards the structure of
Hilbert space:
the Hilbert space, or more precisely, the lattice of its closed subspaces, [is] the structure that
represents the “elements of reality” in quantum theory. (Pitowsky 2006, p. 4)
Instrumentalists often take their “raw material” to be the set of space–time events: clicks
in counters, traces in bubble chambers, dots on photographic plates, and so on. Quantum
theory imposes on this set a definite structure. Certain blips in space–time are identified as
instances of the same event. Some families of clicks in counters are assumed to have logical
relations with other families, etc. What we call reality is not just the bare set of events, it is
this set together with its structure, for all that is left without the structure is noise. [ . . . ]
It is one thing to say that the only role of quantum theory is to “predict experimental
outcome” and that different measurements are “complementary.” It is quite another thing to
provide an understanding of what it means for two experiments to be incompatible, and yet
for their possible outcomes to be related; to show how these relations imply the uncertainty
principle; and even, finally, to realize that the structure of events dictates the numerical
values of the probabilities (Gleason’s theorem). (Pitowsky 2003, p. 412)
The fact that such a structure has physical reality allows the above-cited
explanations to be genuine physical explanations, rather than collapsing into the kind
of explanations given by logic or formal epistemology. Said physical explanations
are not dynamical but structural, in the sense of Felline (2018a, b): the explanandum
phenomenon is explained as the instantiation of a physical structure, fundamental
in the sense that it is not inferable from the properties and behaviour of underlying
entities.
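The non-classicality at stake here can be made concrete with a short numerical sketch (an illustration of mine, not the author's): the correlations of the singlet state violate the CHSH inequality, i.e. the correlation vector falls outside the classical correlation polytope whose facets Pitowsky studied.

```python
import numpy as np

# Pauli matrices
Z = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)

def spin(theta):
    """Spin observable in the x-z plane at angle theta."""
    return np.cos(theta) * Z + np.sin(theta) * X

# Singlet state (|01> - |10>)/sqrt(2)
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def corr(a, b):
    """Quantum correlation E(a, b) = <psi| A(a) (x) B(b) |psi>."""
    O = np.kron(spin(a), spin(b))
    return np.real(psi.conj() @ O @ psi)

# Standard CHSH measurement angles
a, a2 = 0.0, np.pi / 2
b, b2 = np.pi / 4, 3 * np.pi / 4
S = corr(a, b) - corr(a, b2) + corr(a2, b) + corr(a2, b2)
print(abs(S))  # 2*sqrt(2) ~ 2.828, beyond the classical bound of 2
```

The classical bound |S| ≤ 2 is precisely a facet of the classical correlation polytope; the quantum value 2√2 lies outside it, which is the sense in which quantum events "do not conform to classical probabilities".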
Pitowsky’s realism goes further than this. I have said before that, according to
Pitowsky’s account, only registered measurement results are facts, about which
the theory can make predictions. This claim, however, admits an exception, in which
the quantum state also describes an objective outside reality and its predictions can be
taken as genuine facts:
In the Bayesian approach what constitutes a possible event is dictated by Nature, and the
probability of the event represents the degree of belief we attach to its occurrence. This
distinction, however, is not sharp; what is possible is also a matter of judgment in the sense
that an event is judged impossible if it gets probability zero in all circumstances. In the
present case we deal with physical events, and what is impossible is therefore dictated by
the best available physical theory. Hence, probability considerations enter into the structure
of the set of possible events. We represent by 0 the equivalence class of all events which
our physical theory declares to be utterly impossible (never occur, and therefore always get
probability zero) and by 1 what is certain (always occur, and therefore get probability one).
(Pitowsky 2006, p. 5)
So, according to this stipulation, not only registered results are facts, but also
results with probability 1. In this case, the quantum state represents something in
the world.
This traces a crucial difference between Pitowsky’s information-theoretic inter-
pretation and QBism. In the latter, given the strict Bayesian reading of probabilities,
probability 1 only means that the person who assigns this probability to an event
E very strongly believes in the occurrence of E (the same holds, mutatis mutandis, for
probability 0).
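The distinction can be made vivid with a minimal sketch (mine, not Pitowsky's): on the Born rule, a probability-1 prediction obtains exactly when the state lies in the eigenspace of the measured outcome, which is what licenses Pitowsky's reading of such predictions as facts, while intermediate probabilities remain degrees of belief about gambles.

```python
import numpy as np

# Projector onto the +1 eigenstate |0> of Pauli Z
P_up = np.array([[1, 0], [0, 0]], dtype=complex)

def born(state, projector):
    """Born-rule probability of the outcome associated with `projector`."""
    return np.real(state.conj() @ projector @ state)

up = np.array([1, 0], dtype=complex)                 # eigenstate of Z
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)  # superposition

print(born(up, P_up))    # 1.0 -> on Pitowsky's reading, an objective fact
print(born(plus, P_up))  # 0.5 -> a mere degree of belief about a gamble
```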
According to Pitowsky’s analysis of the measurement problem, adopting an
epistemic view is sufficient to explain away the measurement problem. In fact,
the claim that the quantum state is a mental state blocks, at first analysis, the
consequences that we have seen in Bub’s ontic approach.
This, however, is not a viable solution for Pitowsky’s approach, due to the realism
he acknowledges towards the quantum state.
To see concretely why, take again HH’s argument and the predictions
for the measurement of O. In a partially objective reading of the state like Pitowsky’s
epistemic view, the fact that Wigner assigns probability 1 to the result YES means
that such a result is an objective fact about reality. But if this is so, then there is a fact
of the matter about what the result of the measurement of O will be, and Friend’s
predictions are objectively wrong.
But this, we know, would mean adopting a non-collapse view of QT.
Recently, Bub (2018) formulated a new argument addressing the problem of the
incompatible predictions of QT in Wigner’s Friend scenarios. Bub does not directly
cite HH’s argument, and focuses instead on the more recent argument formulated in
Frauchiger and Renner (2018). In the following, for simplicity, I will keep using the
argument as formulated by HH.
Bub’s stance on the notion of measurement is very much in line with Pitowsky’s
genuine facts. Accordingly, QT does not apply to, nor make predictions about,
generic events, but only measurement results. Here, as in Pitowsky, the notion
of measurement is substantially revised:
A quantum “measurement” is a bit of a misnomer and not really the same sort of thing as a
measurement of a physical quantity of a classical system. It involves putting a microsystem,
like a photon, in a situation, say a beamsplitter or an analyzing filter, where the photon is
forced to make an intrinsically random transition recorded as one of two macroscopically
distinct alternatives in a device like a photon detector. The registration of the measurement
outcome at the Boolean macrolevel is crucial, because it is only with respect to a suitable
structure of alternative possibilities that it makes sense to talk about an event as definitely
occurring or not occurring, and this structure is a Boolean algebra. (Bub 2018, p. 6)
So, on the one hand Bub provides a physical characterization that partially
explicates the notion of measurement (it involves an intrinsically random transition
for the measured system, whose final state must be registered macroscopically);
on the other hand, the first dogma is still violated, since there is no criterion for
whether or not a process counts as a measurement (we don’t know when a photon is
forced to make an intrinsically random transition).
Bub explains the problems in the Wigner’s Friend scenario as originating in the
structure of quantum events, composed of a “family of “intertwined” Boolean
algebras, one for each set of commuting observables [ . . . ]. The intertwinement
precludes the possibility of embedding the whole collection into one inclusive
Boolean algebra, so you can’t assign truth values consistently to the propositions
about observable values in all these Boolean algebras” (Bub 2018, p. 5).
So, the algebra of observables is non-Boolean, but each observable (each
measurement) picks out a Boolean algebra, and different Boolean algebras may
associate different probabilities with the same observables. In our thought experiment,
Friend’s and Wigner’s measurements pick out different Boolean algebras that can’t
be consistently embedded into one single coherent framework.
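A minimal sketch (mine, not Bub's) of the intertwinement: projectors belonging to a single measurement commute and so generate an ordinary Boolean algebra of events, while projectors of incompatible measurements do not commute, so the two algebras admit no common Boolean refinement.

```python
import numpy as np

# Projectors of two incompatible measurements on a qubit
P_z0 = np.array([[1, 0], [0, 0]], dtype=complex)            # Z-basis outcome 0
P_z1 = np.eye(2, dtype=complex) - P_z0                      # Z-basis outcome 1
P_x_plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)  # X-basis outcome +

# Within one measurement the projectors commute: a Boolean algebra of events
assert np.allclose(P_z0 @ P_z1, P_z1 @ P_z0)  # (both products are in fact 0)

# Across the two measurements they do not commute, so the two Boolean
# algebras cannot be embedded into a single Boolean algebra of events
print(np.allclose(P_z0 @ P_x_plus, P_x_plus @ P_z0))  # False
```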
According to Bub, only one Boolean algebra provides the correct framework
for the description of the state. In order to justify the selection of one single
Boolean algebra, Bub starts by recalling how George Boole introduced Boolean
constraints on probability as “conditions of possible experience”. He then takes
a ‘Bohrian’ perspective and suggests that the choice of a unique Boolean algebra
is imposed by the necessity to “tell others what we have done and what we have
learned” (Bohr, as cited in Bub 2018, p. 9).
The correct quantum state to associate with a system, therefore, is the one
corresponding to what Bub calls the ultimate measurement (and the corresponding
ultimate observer). It is important to notice right from the start, and I will insist on
this point later on, that in order for HH’s objection to be met, all observers (not
only the ultimate observer) must apply the same Boolean algebra, on pain of falling
again into a scenario with inconsistent predictions.
But who is the ultimate observer in HH’s thought experiment?
The only criterion for the identification of the ultimate measurement/observer is
that the result of the measurement must be registered macroscopically.
“In a situation, as in the [Hagar and Hemmo] argument, where there are
multiple candidate observers, there is a question as to whether [Friend is an]
“ultimate observer,” or whether only Wigner [is]. The difference has to do with whether
[Friend] performs measurements of the observables [P] with definite outcomes at the
Boolean macrolevel, or whether they are manipulated by Wigner [ . . . ] in unitary
transformations that entangle [Friend] with systems in their laboratories, with no
definite outcomes for the observables A and B. What actually happens to [Friend]
is different in the two situations” (Bub 2018, p. 11).
It is important to clarify, regarding this passage, that the exclusive character of the
disjunction ‘Friend performs measurements registered at the macrolevel or Wigner
manipulates observables’ is misleading. The two disjuncts are actually not mutually
exclusive: both can be true. Let us assume, in fact, that both Wigner and Friend write
down their results and are therefore ultimate observers. This is an unproblematic
assumption, since, according to Bub’s analysis,6 both Friend and Wigner act as
ultimate observers at different moments of the experiment: in the first part, when
Friend performs her measurement, she is the ultimate observer, while Wigner
becomes the ultimate observer in the second part of the experiment.
Now, Wigner knows (as we know) that Friend will write down her measurement
results, and so that she is the legitimate ultimate observer in the first part of the
experiment. He also knows that he has to take this assignment very seriously, given
that “What actually happens to [Friend] is different” depending on whether or not
Friend is the ultimate observer, and “the difference between the two cases [ . . . ] is
an objective fact at the macrolevel” (Bub 2018, p. 12).
This means that Wigner knows that in the first part of the experiment a legitimate
ultimate measurement is being performed in the lab, and with it the uncontrollable
disturbance that characterizes, in measurement processes, the passage from non-
Booleanity to Booleanity. If we were to use Pitowsky’s Bayesian framework, we
could say that Friend’s measurement is a gamble on which Wigner can bet. He
does not know what the result of the experiment is, but he knows that she
definitely gets a + or a − result, because this is what the ‘ultimate Boolean algebra’
predicts.
The point I am trying to make is that, if QT is not a ‘single user theory’ (as
it is not, according to Bub), and if the selection of a Boolean algebra does not
apply uniquely to the ultimate observer (as it does not, according to Bub), then
after Friend’s measurement both she and Wigner will have to predict that her
measurement ends up either in the state (12.5) or in the state (12.6). The result of
Friend’s experiment is an objective fact about the world, which Wigner can’t ignore.
The correct state that, according to this view, Wigner should associate with F + P
after Friend’s measurement is therefore a proper mixture of (12.5) and (12.6).
But if this is true, when Wigner makes his predictions about the measurement of
O, he should use the probabilities dictated by a proper mixture of (12.5) and (12.6),
rather than those dictated by the entangled state (12.4).
But this, again, “would seem to require a suspension of unitary evolution
in favour of an unexplained “collapse” of the quantum state” (12.3), which the
information-theoretic view clearly rejects.
6 “If there are events at the macrolevel corresponding to definite measurement outcomes for Alice
and Bob, then Alice and Bob represent “ultimate observers” and the final state of the combined
quantum coin and qubit system is |h⟩A |0⟩B or |t⟩A |0⟩B or |t⟩A |1⟩B , depending on the outcomes.
If Wigner and Friend subsequently measure the super-observables X, Y on the whole composite
Alice-Bob system (so they are “ultimate observers” in the subsequent scenario), the probability of
obtaining the pair of outcomes {ok, ok} is 1/4 for any of the product states |h⟩A |0⟩B or |t⟩A |0⟩B or
|t⟩A |1⟩B ” (Bub 2018).
12.8 Conclusions
In this paper I have tested the solutions to the big measurement problem provided
by different interpretations of QT as a theory about information, by analysing how
they behave in a thought experiment à la Wigner’s Friend.
As analytical tools for the examination of such performances, I’ve used the two
claims that, according to Itamar Pitowsky, lie at the basis of the conundrums of QT:
the claim that the concept of measurement should not appear in the fundamental
axioms of a physical theory and the claim that the quantum state represents a
physical system, its properties and its dynamical evolution.
Regarding the first dogma, HH argue that the conclusion of their argument is that
black-box approaches are either incoherent or incomplete. If this is true, their argument
can be seen as a justification of the first dogma, which, as a result, is not a dogma
anymore: besides the already good reasons illustrated by John Bell, a reason to adopt
Bell’s dictum is that black-box theories can’t provide consistent predictions. In the
previous sections I have acknowledged that QBism might have a way out of HH’s
conclusions. Should we conclude that HH’s argument fails to vindicate the first
dogma?
I think this conclusion would be wrong. Bell’s criticism of the notion of
measurement is a reflection on physical theories: measurement can’t appear in
a fundamental physical theory because it is not a physically fundamental notion. It
should already be clear that, since the conclusion of my analysis is that QBism is not
a physical theory, the first dogma does not apply to it.
When seen, as it was intended, as a criterion for fundamental physics, Bell’s
dictum seems quite reasonable. However, it is hardly surprising that the notion of
measurement, and measurement results, appear in the axioms of a theory about
beliefs and about how to update beliefs with new experience.7
The fact that QBism is not affected by HH’s argument, therefore, does not say
much about the status of Bell’s dictum as a dogma or as a justifiable request. Insofar
as HH’s argument is successful against physical black-box theories, it still provides
a valid justification for the first dogma.
From the analysis put forward in this paper, therefore, Bell’s dictum seems in
great shape, and not a dogma at all.
Let’s move to the analysis of the second dogma. The main target of this paper
was the assumption, common to probabilistic and information-based approaches to
QT, that the second dogma is responsible for the measurement problem, and that its
rejection is sufficient to explain the problem away.
The failure of Bub’s and Pitowsky’s information-theoretic interpretations in the
context of HH’s thought experiment, on the one hand, and the problems discussed
in the QBist approach, on the other, show that this assumption is too simplistic: the
claim that QT is about information does not solve the measurement problem, neither
in the physical interpretation of the notion of information (as in Bub), nor in its
epistemic interpretation (as in Pitowsky’s).
7 See (Fuchs et al. 2014, p. 2) for a reflection on the notion of measurement.
Acknowledgement The discussion of this paper at the Birmingham FraMEPhys Seminar series
and at the MCMP-Western Ontario Workshop on Computation in Scientific Theory and Practice
helped me to improve this paper. I am very thankful to the editors of this volume, and especially to
Meir Hemmo for his valuable comments, and to Veronika Baumann for the same reason.
References
Myrvold, W. (2018). Philosophical issues in quantum theory. In E. N. Zalta (Ed.), The Stanford
encyclopedia of philosophy (Fall 2018 edition). URL: https://plato.stanford.edu/archives/
fall2018/entries/qt-issues/
Pitowsky, I. (2003). Betting on the outcomes of measurements: A Bayesian theory of quantum
probability. Studies in History and Philosophy of Science Part B: Studies in History and
Philosophy of Modern Physics, 34(3), 395–414.
Pitowsky, I. (2004). Macroscopic objects in quantum mechanics: A combinatorial approach.
Physical Review A, 70(2), 022103.
Pitowsky, I. (2006). Quantum mechanics as a theory of probability. In Physical theory and its
interpretation (pp. 213–240). Dordrecht: Springer. arXiv:quant-ph/0510095v1.
Ramsey, F. P. (1926). Truth and probability. Reprinted In D. H. Mellor (Ed.) F. P. Ramsey:
Philosophical Papers. Cambridge: Cambridge University Press (1990).
Timpson, C. G. (2013). Quantum information theory and the foundations of quantum mechanics.
Oxford: Oxford University Press.
Wallace, D. (2016). What is orthodox quantum mechanics? arXiv preprint arXiv:1604.05973.
Chapter 13
There Is More Than One Way to Skin
a Cat: Quantum Information Principles
in a Finite World
Amit Hagar
Abstract An analysis of two routes through which one may disentangle a quantum
system from a measuring apparatus reveals how the no–cloning theorem can follow
from an assumption of infrared and ultraviolet cutoffs on the energy of physical
interactions.
13.1 In Memory
Itamar Pitowsky was my teacher and my MA thesis advisor. He was the reason
I became an academic. His classes were a sheer joy, and his deep voice and
enthusiasm gave warmth to us all in those dreary Jerusalem winters. Itamar’s genius
became evident a decade earlier when he published his Ph.D. thesis in the Physical
Review (Pitowsky 1983) and later in a Springer book (Pitowsky 1989) that has
gone missing from so many university libraries because everybody steals it. In it
he showed that one could still maintain a local realistic interpretation of quantum
mechanical correlations if one would allow the existence of non-measurable sets.
In a sense it was the beginning of his view of quantum mechanics as a theory of
(non–Boolean) probability, a view which he advocated forcefully later in his career,
before his untimely death. I dedicate this paper to Itamar and his unique
empiricist view. The calm sunshine of his mind is among the things I will never
forget.
I am thankful to G. Ortiz for discussion, to the audience in the IU Logic Colloquium and the UBC
Physics Colloquium for their comments and questions on earlier versions of this paper, and to
Itamar, the oracle from Givat Ram, whose advice has guided me since I met him for the first time
more than 25 years ago.
A. Hagar ()
HPSC Department, Indiana University, Bloomington, IN, USA
e-mail: hagara@indiana.edu
13.2 Introduction
In the late 1920s von Neumann axiomatized non-relativistic quantum mechan-
ics (QM henceforth) in Hilbert space, and ever since then generations
of physicists have been accustomed to viewing this space as indispensable to the
description of quantum phenomena. The Hilbert space with its inner product and
the accompanying non-commutative algebra of its orthomodular lattice of closed
subspaces has also given rise to a manifold of “interpretations” of non-relativistic
QM, and to a subsequent plethora of puzzles and metaphysical speculations about
the inhabitant of this space, the “state vector” (Ney and Albert 2013), its “branching”
(Saunders et al. 2010), and its “collapse” (Ghirardi et al. 1986).
Such puzzles are not surprising. Faithful as they are to the Galilean imperative
(“The book [of Nature] is written in mathematical language” Galilei 1623/1960),
theoretical physicists are often inclined to interpret the mathematical constructs that
they use in the description of physical reality as designating pieces of that reality.
In classical mechanics such a correspondence is relatively straightforward: the
dynamical variables of the theory can be assigned values, and so can the attributes of
the physical objects they represent. In QM, however, things are more complicated.
To begin with, the mathematical structure can support two types of dynamical
variables, the operators and the vectors they operate on (the historical reason for this
duality can be traced back to the birth of quantum theory; see, e.g., Beller (1999,
pp. 67–78)). Next, this mathematical structure doesn’t lend itself to correspondence
with physical reality as easily as in classical mechanics. One candidate for the
dynamical variables, the operator, can be assigned (eigen)values only in specific
circumstances (namely, when it operates on its eigenvectors), and, furthermore, not
all operators can be assigned (eigen)values simultaneously (Kochen and Specker
1967); the other candidate, the state vector, can be assigned values (amplitudes),
but instead of actual attributes of a physical object, these (or rather their mod
squares) represent the probabilities of the object’s having the attributes represented
by (the eigenvectors of) the respective operator.
Those who insist on interpreting the state vector of a single quantum system
as designating a real physical object can do so by (i) altering QM, namely by
introducing a physical collapse of the state vector as a genuine physical process
(Ghirardi et al. 1986), (ii) replacing QM with a deterministic and non–local
alternative (Goldstein et al. 1992), or (iii) keeping QM intact but committing to
a metaphysical picture of multiple worlds (Saunders et al. 2010). Since none of
these routes is terribly attractive to physicists, a possible alternative is to regard the
state vector as a mathematical object which does not represent pieces of reality, but
rather our knowledge of that reality (Unruh 1994), knowledge which is inherently
13 There Is More Than One Way to Skin a Cat: Quantum Information. . . 307
system so that it will end up in a specific state, i.e., we measure the system in a
certain orthonormal basis of an operator and “collapse” it to one of the eigenvectors
of the operator. Note that the quantum state we prepared becomes “known” to us
with certainty only after the measurement has taken place, and once it is known, we
can repeat the measurement many times without changing it.
That we need to accept this scenario and not analyze the measurement process
any further is the essence of the information–theoretic view, which regards measure-
ment results as primitive and “collapse” as a subjective knowledge update rather than
a physical process. But what if we ask the reverse question: how does one prepare
an operator so that an unknown quantum state becomes its eigenstate?
The obstacle to doing so comes in the form of the no–cloning theorem, which
states that a cloning device acting on |ψ⟩ ⊗ |s⟩, where |ψ⟩ is an unknown quantum
state and |s⟩ is the pure target state onto which |ψ⟩ should be cloned according to the
unitary evolution

U (|ψ⟩ ⊗ |s⟩) = |ψ⟩ ⊗ |ψ⟩, (13.1)

can only work if |s⟩ = |ψ⟩ (in which case |ψ⟩ is known in advance) or if |ψ⟩ and |s⟩
are orthogonal. In other words, a general unknown quantum state cannot be cloned
by unitary evolution (Dieks 1982; Wootters and Zurek 1982).
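The core of the proof can be illustrated with a small numerical sketch (mine, not Hagar's): a unitary must preserve inner products, but the candidate cloning map squares them, which is impossible for two states that are neither identical nor orthogonal.

```python
import numpy as np

# Two non-orthogonal, non-identical states to be cloned, plus a blank target |s>
psi1 = np.array([1, 0], dtype=complex)               # |0>
psi2 = np.array([1, 1], dtype=complex) / np.sqrt(2)  # |+>
s = np.array([1, 0], dtype=complex)                  # blank target state

# Inputs and desired outputs of the candidate map U(|psi>|s>) = |psi>|psi>
in1, in2 = np.kron(psi1, s), np.kron(psi2, s)
out1, out2 = np.kron(psi1, psi1), np.kron(psi2, psi2)

# A unitary must preserve inner products; cloning would square them instead
lhs = in1.conj() @ in2   # <psi1|psi2><s|s> = <psi1|psi2>
rhs = out1.conj() @ out2  # <psi1|psi2>^2
print(lhs.real, rhs.real)  # 0.707... vs 0.5: no unitary can map in_i to out_i
```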
From an information–theoretic perspective, the theorem encapsulates the differ-
ence between quantum and classical information: in the classical case states such
as |0⟩ and |1⟩ are always orthogonal, but QM also allows for non-orthogonal
states such as (|0⟩ + |1⟩)/√2. From an operational perspective, however, the no–
cloning theorem is just another reminder that in QM the measurement process of
an unknown state disturbs the state (Park 1970).1 It is this feature that, or so I
shall argue, the finite nature hypothesis can explain in the two generic measurement
scenarios described below.
The finite nature hypothesis imposes an upper bound on energy in any physical
interaction. It translates into physics a position in the philosophy of mathematics
known as finitism, according to which the physical world is seen as “a large but
finite system; finite in the amount of information in any volume of spacetime, and
finite in the total volume of spacetime” (Fredkin 1990). On this view, space and
time are only finitely extended and divisible, all physical quantities are discrete,
and all physical processes are only finitely complex. What follows is an attempt to
demonstrate how the operational principle behind the no–cloning theorem results
from such a bound on energy in the interaction between the quantum system and its
measuring device.
1 If we have many identical copies of a qubit, then it is possible to measure the mean values of non–
commuting observables and completely determine the density matrix of the qubit. Inherent in the
conclusion that nonorthogonal states cannot be distinguished without disturbing them is thus the
implicit provision that it is not possible to make a perfect copy of a qubit.
One way a measurement interaction that keeps the quantum state intact could be
constructed is via the quantum adiabatic theorem. According to this
theorem (Messiah 1961, pp. 739–746), an eigenstate (and in particular a ground
state) of an initial time–dependent Hamiltonian of a quantum system can remain
unperturbed in the process of deforming that Hamiltonian, if several conditions are
met: (i) the initial and final Hamiltonians, as well as the ones the system is deformed
through, are all non–degenerate, and their eigenvalues are piecewise differentiable
in time; (ii) there is no level–crossing (i.e., there always exists an energy gap between
the ground state and the next excited state); and (iii) the deformation is adiabatic,
i.e., infinitely slow. To take an everyday analogy: if your baby is sound
asleep in her cradle in the living room, then moving the cradle as gently as possible
from that room to the bedroom will not wake her up.
More precisely, consider a quantum system described in a Hilbert space H by a
smoothly time–dependent Hamiltonian that varies between HI and HF , for t ranging
over [t0 , t1 ], with a total evolution time given by T = t1 − t0 , and with the following
convolution:

H = H (t) = (1 − t/T ) HI + (t/T ) HF . (13.2)
When the above conditions (i–iii) of the adiabatic theorem are satisfied, then if the
system is initially (at t0 ) in an eigenstate of the initial Hamiltonian HI , it will end up
at t1 (up to a phase) in an eigenstate of HF . In the special case where the eigenstate is
the ground state, the adiabatic theorem ensures that in the limit T → ∞ the system
will remain in the ground state throughout its time–evolution.
Now, although in practice T is always finite, the better it satisfies the minimum
energy gap condition (condition (ii) above), the smaller will be the probability that
the system will deviate from the ground state. This means that the gap condition
controls the evolution time of the process, in the following exact way:

gmin = min t0 ≤t≤t1 (E1 (t) − E0 (t)), (13.3)

T ≫ 1/g²min , (13.4)

where E0 and E1 are the energies of the ground state and the first excited state,
respectively.
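The role of the gap condition can be made vivid with a small numerical sketch (mine, not the chapter's) of the convolution of Eq. (13.2) for a two-level system: it computes the minimum gap along the sweep and shows the ground-state fidelity approaching 1 as the total time T grows, as Eq. (13.4) demands. The specific Hamiltonians are illustrative choices.

```python
import numpy as np

# Initial and final Hamiltonians (a standard two-level sweep: X field -> Z field)
HI = -np.array([[0, 1], [1, 0]], dtype=complex)   # ground state |+>
HF = -np.array([[1, 0], [0, -1]], dtype=complex)  # ground state |0>

def H(s):
    """Linear convolution H(t), with s = t/T, as in Eq. (13.2)."""
    return (1 - s) * HI + s * HF

def ground(Hm):
    """Eigenvector of the lowest eigenvalue (eigh sorts ascending)."""
    w, v = np.linalg.eigh(Hm)
    return v[:, 0]

def evolve(T, steps=2000):
    """Time-sliced Schrödinger evolution under H(t); returns the final state."""
    dt = T / steps
    psi = ground(H(0.0))
    for k in range(steps):
        w, v = np.linalg.eigh(H((k + 0.5) / steps))  # midpoint Hamiltonian
        psi = v @ (np.exp(-1j * w * dt) * (v.conj().T @ psi))
    return psi

gap = min(np.diff(np.linalg.eigvalsh(H(s)))[0] for s in np.linspace(0, 1, 201))
print(f"minimum gap g_min = {gap:.3f}")  # sqrt(2) ~ 1.414 for this sweep
for T in (1, 10, 100):
    fid = abs(np.vdot(ground(H(1.0)), evolve(T))) ** 2
    print(f"T = {T:5.0f}: ground-state fidelity {fid:.4f}")
```

For large T relative to 1/g²min the final fidelity approaches 1; for short T the system has an appreciable probability of leaving the ground state, as discussed below.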
Prima facie, the adiabatic theorem seems highly suitable for a measurement
interaction that keeps the unknown quantum state intact, as it appears to allow one to
remain agnostic about both the final state vector at t1 and the final Hamiltonian HF ,
while ensuring that the former is an eigenstate (and, in particular, a ground state, in
case one starts at t0 from a state one knows to be the ground state of HI ) of the
latter.
One may already point out that the adiabatic theorem is exact only in the limit
T → ∞. In actual physical scenarios, T is always finite, and so there is always a
finite probability that the system will not remain in the ground state after all. In those
cases, repeated measurements of the same state would send it to a state orthogonal
thereto, and so one would not be able to fully characterize it.
However, here I shall argue that the issue isn't just the impracticality of
the above scenario; the problem is actually much more serious: for even for
a finite T , one can disentangle the system from the apparatus and keep the state
undisturbed during the measurement interaction if and only if one knows in advance
the Hamiltonian, which means that one also knows in advance its ground state.
As we shall see, that T is always finite and that prior knowledge of the
Hamiltonian is required are two facets of the same fact, namely, that the question
whether a state remains undisturbed during a process of the kind described above
is an undecidable question. One could argue that there “exists” such a situation in
which the state is undisturbed by the process, and that one can make the probability
of failure of the adiabatic process as small as one wants by simply enlarging the
finite evolution time T . The problem, however, is that even in the realistic case of
a finite T , without prior knowledge of the Hamiltonian, the question what T is (i.e.,
when to halt the adiabatic process so that the state one ends up with is undisturbed),
is undecidable; it is simply impossible to compute T in advance, unless one has
prior knowledge of the final Hamiltonian (and thus of the ground state that can be
calculated therefrom).
To see why, let us focus on the gap condition (Eq. 13.4). This condition is at the
heart of the adiabatic theorem, as it ensures its validity, and, more important, as it
governs the evolution time of the entire process. This spectral gap g = E1 − E0
may be calculated for HI at t0 , since we ourselves construct this Hamiltonian and
prepare the system to be in its ground state. But how does this gap behave along
the convolution? Note that the question here is not whether the gap exists along
the convolution. Saying that T is finite means that at some point a level–crossing
would occur with a finite probability, and so the process would not keep the state
undisturbed forever. But let’s concede that a gap (hence an undisturbed state) does
“exist”. Conceding this, the question we ask now is how small is this gap in all
instances other than t0 . For in the absence of a detailed spectral analysis, in general
nobody knows what g is, how it behaves, or how to compute it!
This question of the gap behavior is of crucial importance. Recall that the gap
controls the velocity of the process with which one would like to keep the state
vector undisturbed throughout the measurement process. To achieve this, one needs
to deform the initial known Hamiltonian HI slow enough, until one arrives at the
final unknown Hamiltonian HF . But how slow is slow enough? How slow should
one go, and, more important, when should one stop the process if one doesn’t
know in advance either (a) the spectral gap or (b) the final Hamiltonian HF ? The
problem, therefore, is not that the evolution time T is finite, but that since g or HF are
unknown, T (and hence the “speed” T −1 ), are finite alright, but also unboundedly
large (slow).
13 There Is More Than One Way to Skin a Cat: Quantum Information. . . 311
Can we estimate T in advance? For each given “run” of the adiabatic process that
ends with HF we have to come up with a process whose rate of change is ∼ Tf−1 .
How do we know that we are implementing the correct rate of change while H (t)
is evolving? Apparently, by being able to measure differences of order Tf−1 , that is,
having the sensitive “speedometer” at our disposal. But when the going gets tough,
we approach very slow speeds of the order of ∼ Tf−1 , which begs the question,
since we can then compute Tf using our “speedometer”, and hence compute HF
and determine its ground state without even running the adiabatic process.
Now if we don’t have a “speedometer”, then even if we decided to increase the
running–time from the order of Tf to, say, the order of Tf + 7, we will have no clue
that the machine is indeed doing (Tf + 7)−1 and not Tf−1 , and that the state remains
undisturbed. And so in the absence of knowledge in advance of g or HF , we will
never know how slowly we should evolve our physical system. But then we will also
fail to fulfill the adiabatic condition which ensures us that at any given moment the
system remains in its ground state.
It follows that if the projection operator on the initial state ψ is measured after a
short time t, the probability of finding the state unchanged is
p = 1 − (ΔH)² t² ,  (13.7)

where ΔH is the energy spread of ψ. If the measurement is repeated n times at equal intervals during a total time t, the probability becomes

[1 − (ΔH)² (t/n)²]ⁿ ≥ 1 − (ΔH)² t²/n  (13.8)

that in all measurements one will find the state unchanged. In fact, for n → ∞,
the left hand side in (13.8) tends to 1, and so in the limit when the frequency of
measurements tends to infinity, the state will not change at all, and will remain
“frozen”.
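The freezing can be illustrated with a toy calculation (mine, not the chapter's; the Hamiltonian H = ω σx and the numerical values are assumptions): for an initial eigenstate |0⟩ of σz, the survival probability under n equally spaced projective measurements during a total time t is [cos²(ωt/n)]ⁿ, which tends to 1 as n grows.

```python
import numpy as np

omega, t = 1.0, 1.0   # assumed energy scale (H = omega * sigma_x) and total time

def survival(n):
    """Probability that n equally spaced projective measurements onto the
    initial state |0> all find it unchanged: [cos^2(omega*t/n)]^n."""
    return float(np.cos(omega * t / n) ** (2 * n))

probs = [survival(n) for n in (1, 10, 100, 1000)]
print(probs)   # increases monotonically toward 1: the state is "frozen"
```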
From a mathematical perspective, the situation can be explained as a case in
which, given a certain orthonormal basis for H = H0 + V , all the diagonal terms
reside in H0 , and V contains only off–diagonal elements. The eigenspace of H0
is thus a “decoherence free subspace” (Lidar and Whaley 2003), a sector of the
system’s Hilbert space where the system is decoupled from the measuring apparatus
(or more generally, from the environment) and thus its evolution is completely
unitary. Physically this means that the additional term V in the Hamiltonian H
rapidly alters the phases of the eigenvalues, and the decay products (responsible
for the decaying of the state) acquire energies different from those of H , so as to make
the decay process more difficult, or, in the limit, impossible. Now if one could locate
one’s state vector in the eigenspace of H0 , then consecutive measurements would
not change the state vector, and one could measure it without disturbing it, and
violate the no cloning theorem.
The quantum Zeno effect was originally presented in terms of a paradox (Misra
and Sudarshan 1977), the idea being that without any restrictions on the frequency
of measurements (time being what it is, namely, a continuous parameter), QM fails
to supply us with predictions for the decay of a state which is monitored by a continuous
measurement process, and so appears to be incomplete.
But as in the earlier case of the adiabatic theorem, the state can be guaranteed
to remain undisturbed in the context of the quantum Zeno effect only in the limit
n → ∞ (i.e., T → 0). In actual physical scenarios, T ≠ 0, hence the frequency of the
measurement is always finite, and so there is always a possibility that the state
will be disturbed after all. As in the case of the adiabatic theorem, one could just
replace actual infinity with potential unboundedness, and argue that a decoherence
free subspace simply “exists”, and, moreover, that one can make the probability of
decay as small as one wants.
And yet, such a decoherence free subspace – a subspace in which one can
measure a quantum state without disturbing it – can be constructed with certainty
only if one knows in advance the state one wants to protect. Indeed, further analysis
has revealed (Ghirardi et al. 1979) that a correct interpretation of the time–energy
uncertainty relation (i) relates the measurement frequency to the uncertainty in
energy of the measuring apparatus:
T ΔE ≥ 1  (13.9)

where T is the time duration of the measurement and ΔE is the energy uncertainty,
and (ii) controls, via this uncertainty, the rate of decay of the state:

|⟨ψ0 |ψt ⟩|² ≥ cos²(ΔE t),  for 0 ≤ t ≤ π/(2ΔE).  (13.10)
Before we go on to analyze these results, two remarks are in order. First,
inequality (13.10) was first derived by Mandelstam and Tamm in 1945 for ΔE =
ΔH , i.e., for the case where the uncertainty (spread) in energy is given in terms
of the standard deviation, under the assumption that the latter is finite. Recently it
was shown that since ΔH can diverge, a more suitable measure of uncertainty is
required (Uffink 1993). For our purposes here this caveat is immaterial; even after
a correct measure of uncertainty is chosen, the probability of decay of the state
still depends on this measure of uncertainty. Second, while the duration of the
measurement is a parameter and not a quantum operator, the application of the time–
energy uncertainty relation is legitimate here because we are dealing with what may
be called a static experiment (Aharonov and Bohm 1961): in this case T is not a
preassigned or externally imposed parameter of the apparatus, but is on the contrary
determined dynamically through the workings of the Schrödinger equation which
describes the evolution of the wave packet through the measuring system. The wave
packet may perhaps be said to interact with the apparatus during the time interval
T and in this sense the time–energy uncertainty relation is valid, and constrains the
precision of the energy of the interaction. Note that the time–energy uncertainty
relation all by itself doesn’t constrain T to be finite. It is the additional assumption
that in realistic interaction scenarios energy is finite and bounded from above that
does the trick. See, e.g., Hilgevoord (1998, p. 399).
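The bound (13.10) is easy to verify numerically. The following sketch (an illustration of mine, with a randomly generated Hamiltonian, a random initial state, and ħ = 1) checks the Mandelstam–Tamm inequality for a three-level system:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3   # dimension of a toy system; any finite dimension would do

# Random Hermitian Hamiltonian and random normalized initial state.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (A + A.conj().T) / 2
psi0 = rng.normal(size=d) + 1j * rng.normal(size=d)
psi0 /= np.linalg.norm(psi0)

# Energy spread Delta E = Delta H (standard deviation of H in psi0).
mean = np.vdot(psi0, H @ psi0).real
dH = np.sqrt(np.vdot(psi0, H @ (H @ psi0)).real - mean ** 2)

# Check |<psi0|psi_t>|^2 >= cos^2(dH * t) for 0 <= t <= pi/(2*dH), Eq. (13.10).
vals, vecs = np.linalg.eigh(H)
coeffs = vecs.conj().T @ psi0
for t in np.linspace(0.0, np.pi / (2 * dH), 50):
    psit = vecs @ (np.exp(-1j * vals * t) * coeffs)
    assert abs(np.vdot(psi0, psit)) ** 2 >= np.cos(dH * t) ** 2 - 1e-9
print("Mandelstam-Tamm bound (13.10) holds at all sampled times")
```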
With these preliminaries we can now state our negative result, which is similar to
the one we have established in the case of the adiabatic theorem. For even in the
case where the frequency T −1 is finite but unbounded (hence T is not exactly 0
but arbitrarily close to 0), one still needs to know the state in advance in order to
know how fast one needs to repeat the measurements that project the state onto its
protected subspace and thereby disentangle it from the measuring apparatus.
The reason is that also here, for each given “Zeno effect” process k, we have to
come up with a repeated measurement process whose frequency is ∼ Tk−1 . But how
do we know that we are implementing the correct frequency? Apparently, by being
able to measure differences of order Tk−1 , that is, having the sensitive “speedometer”
at our disposal. But when the going gets tough we approach very fast speeds of the
order of ∼ Tk−1 , which begs the question, since we can then compute Tk using our
“speedometer”.
Now if we don’t have a “speedometer”, then even if we decided to increase the
frequency of the repeated measurements from the order of Tk−1 to, say, the order of
(Tk − 7)−1 , we will have no clue that the apparatus is indeed working at (Tk − 7)−1
and not Tk−1 . And so also here, in the absence of knowledge in advance of the
energy spread ΔE or the desired state (from which, along with the initial state, the
energy spread can be calculated), we will never know how fast we should measure
our physical system so that this state is protected. But then we will also fail to fulfill
the conditions for the quantum Zeno effect which ensure us that the system remains
undisturbed in its initial state.
13.4 Moral
In both scenarios presented above, the adiabatic evolution or the Zeno effect, the
issue at stake is not the practical inability to realize actual infinity; in both cases,
the upper and lower bounds on the duration of the measurement, while a practical
necessity, signify a much deeper restriction.
What our analysis in Sect. 13.3 has shown is that for the state to remain
undisturbed, the decision when to halt the adiabatic process, or how fast to measure
the state in the Zeno effect scenario, requires a priori knowledge of that state. But
this means that the imposition of some upper or lower bound on measurement
duration, ultraviolet or infrared cutoffs which signify the inability to resolve energy
differences unboundedly small, or to apply unbounded energy with increasingly
frequent measurements, respectively, is tantamount to knowledge in advance of
the state to be measured. The actual duration or frequency in the two processes,
respectively, yields the precision with which one can describe the desired known
state.
In other words, if one had such unbounded resolution power or unbounded
energy, one would be able to measure an unknown quantum state without disturbing
it and discern between two non–orthogonal quantum states. Conversely, at least in
the two cases discussed here, if one would like to measure an unknown quantum
state without disturbing it, one must avail oneself of such unbounded resolution
power or unbounded energy (this, of course, need not hold in general, as we
only investigated two possible measurement scenarios, and there may be other
scenarios in some domains which may allow one to circumvent the no–cloning
theorem without availing oneself of unbounded energy. The sheer existence of those
scenarios would entail the inapplicability of QM in those domains).2 The upshot, in
short, is that if the finite nature hypothesis holds, so does the no–cloning theorem.
2 Note that even if one had bounded but large resolution one could not measure an unknown
quantum state without disturbing it. Saying that such disturbance could be “very small” misses
the point, because one would still need to know in advance what the result is supposed to be in
order to say how small is “very small” disturbance. Regardless, even a small disturbance will be
enough to validate no cloning. Indeed, Atia and Aharonov (2017) have recently shown – and I
thank a referee for this reference – that only when a Hamiltonian or its eigenvalues are known
is there a tradeoff between accuracy and efficiency of the kind envisioned in the “small
disturbance” remark. Their result, which connects quantum information to fundamental questions
about precision measurements, emphasizes a prior point made by Aharonov et al. (2002), which
is repeated here: once a priori knowledge of the Hamiltonian or the state is allowed, the game has
changed and one can bypass certain structural restrictions imposed by quantum mechanics. The
whole point of this chapter is that these restrictions can arise from fundamental limitations on
spatial resolution.
References
Aharonov, Y., & Bohm, D. (1961). Time in the quantum theory and the uncertainty relation for
time and energy. Physical Review, 122, 1649–1658.
Aharonov, Y., Massar, S., & Popescu, S. (2002). Measuring energy, estimating Hamiltonians, and
the time-energy uncertainty relation. Physical Review A, 66, 052107.
Atia, Y., & Aharonov, D. (2017). Fast-forwarding of Hamiltonians and exponentially precise
measurements. Nature Communications, 8(1), 1572.
Beller, M. (1999). Quantum dialogues. Chicago: University of Chicago Press.
Bub, J., & Pitowsky, I. (2010). Two dogmas about quantum mechanics. In S. Saunders et al. (Eds.),
Many worlds? Oxford: Oxford University Press.
Caves, C., Fuchs, C., & Schack, R. (2002). Quantum probabilities as Bayesian probabilities.
Physical Review A, 65, 022305.
316 A. Hagar
Clifton, R., Bub, J., & Halvorson, H. (2003). Characterizing quantum theory in terms of
information-theoretic constraints. Foundations of Physics, 33, 1561.
Degasperis, A., Fonda, L., & Ghirardi, G. C. (1974). Does the lifetime of an unstable system
depend on the measuring apparatus? Nuovo Cimento, 21A, 471–484.
Dieks, D. (1982). Communication by EPR devices. Physics Letters A, 92, 271–272.
Fredkin, E. (1990). Digital mechanics: An informational process based on reversible universal
cellular automata. Physica D, 45, 254–270.
Fuchs, C. (2003). Quantum mechanics as quantum information, mostly. Journal of Modern Optics,
50(6–7), 987–1023.
Galilei, G. (1623/1960). The assayer (Il Saggiatore). In The controversy on the comets of 1618.
Philadelphia: University of Pennsylvania Press. English trans. S. Drake & C. D. O’Malley.
Ghirardi, G., Omero, C., Weber, T., & Rimini, A. (1979). Small-time behavior of quantum
nondecay probability and Zeno's paradox in quantum mechanics. Nuovo Cimento, 52A, 421–
442.
Ghirardi, G., Rimini, A., & Weber, T. (1986). Unified dynamics for microscopic and macroscopic
systems. Physical Review D, 34(2), 470–491.
Goldstein, S., Dürr, D., & Zanghi, N. (1992). Quantum equilibrium and the origin of absolute
uncertainty. The Journal of Statistical Physics, 67(5/6), 843–907.
Hagar, A. (2014). Discrete or continuous? Cambridge: Cambridge University Press.
Hagar, A. (2017). On the tension between ontology and epistemology in quantum probabilities. In
O. Lombardi et al. (Eds.), What is quantum information? Cambridge: Cambridge University
Press.
Hagar, A., & Sergioli, G. (2014). Counting steps: a new approach to objective probability in
physics. Epistemologia, 37(2), 262–275.
Hemmo, M., & Shenker, O. (2013). Probability zero in Bohm’s theory. Philosophy of Science,
80(5), 1148–1158.
Hilgevoord, J. (1998). The uncertainty principle for energy and time. II. American Journal of
Physics, 66(5), 396–402.
Kochen, S., & Specker, E. (1967). The problem of hidden variables in quantum mechanics. Journal
of Mathematics and Mechanics, 17, 59–87.
Lidar, D., & Whaley, B. (2003). Decoherence–free subspaces and subsystems. In F. Benatti &
R. Floreanini (Eds.), Irreversible quantum dynamics (Lecture notes in physics, Vol. 622, pp. 83–
120). Berlin: Springer.
Messiah, A. (1961). Quantum mechanics (Vol. II). New York: Interscience Publishers.
Misra, B., & Sudarshan, E. C. G. (1977). The Zeno paradox in quantum theory. Journal of
Mathematical Physics, 18(4), 756–763.
Ney, A., & Albert, D. (Eds.). (2013). The wave function. Oxford: Oxford University Press.
Park, J. (1970). The concept of transition in quantum mechanics. Foundations of Physics, 1(1),
23–33.
Peres, A. (1980). Zeno paradox in quantum theory. American Journal of Physics, 48(11), 931–932.
Pitowsky, I. (1983). Deterministic model of spin and statistics. Physical Review D, 27, 2316.
Pitowsky, I. (1989). Quantum probability – quantum logic (Lecture notes in physics, Vol. 321).
Berlin: Springer.
Saunders, S., Barrett, J., Kent, A., & Wallace, D. (Eds.). (2010). Many worlds? Oxford: Oxford
University Press.
Uffink, J. (1993). The rate of evolution of a quantum state. American Journal of Physics, 61(10),
935–936.
Unruh, W. G. (1994). Reality and measurement of the wave function. Physical Review A, 50, 882–
887.
Wootters, W. K., & Zurek, W. (1982). A single quantum cannot be cloned. Nature, 299, 802–803.
Chapter 14
Is Quantum Mechanics a New Theory
of Probability?
Richard Healey
Abstract Some physicists but many philosophers believe that standard Hilbert
space quantum mechanics faces a serious measurement problem, whose solution
requires a new theory or at least a novel interpretation of standard quantum
mechanics. Itamar Pitowsky did not. Instead, he argued in a major paper (Pitowsky,
Quantum mechanics as a theory of probability. In: Demopoulos W, Pitowsky I (eds)
Physical theory and its interpretation. Springer, Dordrecht, pp 213–240, 2006) that
quantum mechanics offers a new theory of probability. In these and other respects
his views paralleled those of QBists (quantum Bayesians): but their views on the
objectivity of measurement outcomes diverged radically. Indeed, Itamar’s view of
quantum probability depended on his subtle treatment of the objectivity of outcomes
as events whose collective structure underlay this new theory of probability. I’ve
always been puzzled by the thesis that quantum mechanics requires a new theory of
probability, as distinct from new ways of calculating probabilities that play the same
role as other probabilities in physics and daily life. In this paper I will try to articulate
the sources of my puzzlement. I’d like to be able to think of this paper as a dialog
between Itamar and me on the nature and application of quantum probabilities.
Sadly, that cannot be: by taking his part in the dialog I will inevitably impose my
own distant, clouded perspective on his profound and carefully crafted thoughts.
R. Healey ()
Philosophy Department, University of Arizona, Tucson, AZ, USA
e-mail: rhealey@email.arizona.edu
14.1 Introduction
Itamar Pitowsky (2003, 2006) developed and defended the thesis that the Hilbert
space formalism of quantum mechanics is just a new kind of probability theory. In
developing this thesis he argued that all features of quantum probability, including
the Born probability rule, can be derived from rational probability assignments to
finite “quantum gambles”. In defense of his view he argued that all experimental
aspects of entanglement, including violations of the Bell inequalities, are explained
as natural outcomes of the resulting probability structure. He further maintained
that regarding the quantum state as a book-keeping device for quantum probabilities
dissolves the BIG measurement problem that afflicts those who believe the quantum
state is a real physical state whose evolution is always governed by a linear
dynamical law; and that a residual small measurement problem may then be resolved
by appropriate attention to the great practical difficulties attending measurement
of any observable on a macroscopic system incompatible with readily executable
measurements of macroscopic observables.
This is a bold view of quantum mechanics. It promises to transcend not only the
frustratingly inconclusive debate among so-called realist interpretations (including
Bohmian, Everettian, and “collapse” theories) but also the realist/instrumentalist
dichotomy itself. In this way it resembles the pragmatist approach I have been
developing myself (Healey 2012, 2017). Our views agree on more substantive
issues, including the functional roles of quantum states and probabilities and
the consequent dissolution of the measurement problem. But I remain puzzled
by Itamar’s central thesis. The best way to explain my puzzlement may be to
show how the points on which we agree mesh with a contrasting conception of
quantum probability. On this rival conception, quantum mechanics requires no new
probability theory, but only new ways of calculating probabilities with the same
formal features and the same role as other probabilities in physics and daily life.1
The rest of this contribution proceeds as follows. In the next section I say
what Pitowsky meant by a probability theory and explain how he took quantum
probability theory to differ formally from classical probability theory. The key
difference arises from the different event structures over which these are defined.
Whereas classical probabilities are defined over a σ-algebra of subsets of a set,
quantum probabilities are defined over the lattice L of closed subspaces of a Hilbert
space. Gleason’s theorem plays a major role here: for a Hilbert space of dimension
greater than 2, it excludes a classical truth-evaluation on L but at the same time
completely characterizes the class of possible quantum probability measures on L.
Pitowsky sought to motivate the formal difference between quantum and classical
probability theory by an extension of a Dutch book argument offered in support of
a coherence requirement on rational degrees of belief (in the tradition of Ramsey
1 That quantum mechanics involves a new way of calculating probabilities is a point made long
ago by Feynman, as Pitowsky himself noted. But Feynman (1951) also says that the concept of
probability is not thereby altered in quantum mechanics. Here I am in agreement with Feynman,
though the understanding he offers of that concept is disappointingly shallow.
Pitowsky maintained that quantum probability theory differs from classical proba-
bility theory formally as well as conceptually. Following Kolmogorov, probability
is usually characterized formally as a unit-normed, countably additive measure Pr
on a σ-algebra Σ of subsets Ei (i ∈ N) of a set Ω: that is,

Pr : Σ −→ [0, 1] satisfies
1. Pr(Ω) = 1 (Probability measure)
2. Pr(⋃ᵢ Eᵢ) = ∑ᵢ Pr(Eᵢ) provided that ∀i ≠ j (Ei ∩ Ej = ∅) (Countable additivity)
This theorem has two important consequences. The first is that there are no
dispersion-free measures on L(H) if dim(H) ≥ 3, where a quantum measure μ
is dispersion-free if and only if its range is the set {0, 1}. This is important since
any truth-valuation on the set of all propositions stating that it is event S or
instead event S⊥ that represents an element of reality would have to give rise to
where the quantum state is represented by density operator Ŵ and Ê() is the
relevant projection operator from the spectral family defined by the unique self-
adjoint operator  corresponding to dynamical variable A.
Cognizant of the first consequence of Gleason’s theorem, Pitowsky did not
defend the radical thesis that quantum theory shows the world obeys a non-classical
logic. But he took the second consequence as “one of the strongest pieces of
evidence in support of the claim that the Hilbert space formalism is just a new kind
of probability theory” (Pitowsky 2006, p.222).
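The second consequence can be made concrete with a small numerical sketch (my illustration, not Pitowsky's): for any density operator W, the map μ(P) = tr(WP) is a unit-normed measure on the projections, additive over orthogonal events, just as Gleason's theorem requires of quantum probability measures.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3   # Gleason's theorem applies for dim(H) >= 3

# A density operator W: positive semi-definite with unit trace.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
W = A @ A.conj().T
W /= np.trace(W).real

def mu(P):
    """Quantum probability of the event (closed subspace) with projection P."""
    return np.trace(W @ P).real

# Projections onto the rays of a random orthonormal basis (one maximal
# Boolean subalgebra of the lattice L(H)).
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
projs = [np.outer(Q[:, i], Q[:, i].conj()) for i in range(d)]

assert abs(mu(np.eye(d)) - 1) < 1e-9                  # mu(1) = 1
assert abs(sum(mu(P) for P in projs) - 1) < 1e-9      # additivity over the basis
assert all(-1e-9 <= mu(P) <= 1 + 1e-9 for P in projs)
print("mu(P) = tr(WP) behaves as a probability measure on each Boolean block")
```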
objective probability at which a rational agent’s credences should aim. Lewis (1980)
disagreed. Instead he located a distinct concept of objective probability he called
chance, of which all we know is that it provides a further rational constraint on an
agent’s credences through what he called the Principal Principle. I defer further
consideration of this principle until Sect. 14.5, since it plays no role in Pitowsky’s
own view of quantum probability theory.
According to Pitowsky (2006), a quantum gamble consists of four steps:
1. A single physical system is prepared by a method known to everybody.
2. A finite set M of incompatible measurements, each with a finite number of possible
outcomes, is announced by the bookie. The agent is asked to place bets on the possible
outcomes of each one of them.
3. One of the measurements in the set M is chosen by the bookie and the money placed on
all other measurements is promptly returned to the agent.
4. The chosen measurement is performed and the agent gains or loses in accordance with
his bet on that measurement.
argument is needed to justify the claim that the full lattice L(H ) is a probability
space.
The argument Pitowsky (2006, p.216) offers in Sect. 2.1 proceeds by analogy to
a classical application of probability theory to games of chance:
Consider two measurements A, B, which can be performed together; and suppose that A has
the possible outcomes a1 , a2 , . . . , ak , and B the possible outcomes b1 , b2 , . . . , br . Denote
by {A = ai } the event “the outcome of the measurement of A is ai ”, and similarly for
{B = bj }. Now consider the identity:

{B = bj } = ⋃ᵢ₌₁ᵏ ({B = bj } ∩ {A = ai })  (14.1)
This is the distributivity rule which holds in this case as it also holds in all classical cases.
This means, for instance, that if A represents the roll of a die with six possible outcomes
and B the flip of a coin with two possible outcomes, then Eq. (14.1) is trivial. Consequently
the probability of the left hand side of Eq. (14.1) equals the probability of the right hand
side, for every probability measure.
⋃ᵢ₌₁ᵏ ({B = bj } ∩ {A = ai }) = {B = bj } = ⋃ᵢ₌₁ˡ ({B = bj } ∩ {C = ci })  (14.2)
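In the lattice L(H), by contrast, the distributivity behind Eq. (14.1) fails for incompatible measurements. A minimal two-dimensional sketch (my own spin example, not from the paper): the meet of {B = b} with each of the incompatible events {A = a1}, {A = a2} is the zero subspace, so the join of the meets is 0 rather than {B = b}.

```python
import numpy as np

def proj(v):
    """Projection onto the ray spanned by v (an event in L(C^2))."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

PB  = proj([1, 0])    # {B = b}:  spin-z up
PA1 = proj([1, 1])    # {A = a1}: spin-x up
PA2 = proj([1, -1])   # {A = a2}: spin-x down

def meet(P, Q, tol=1e-9):
    """Lattice meet: projection onto range(P) intersect range(Q).
    Vectors in the intersection are exactly the eigenvectors of P + Q
    with eigenvalue 2."""
    vals, vecs = np.linalg.eigh(P + Q)
    keep = vecs[:, vals > 2 - tol]
    return keep @ keep.conj().T

zero = np.zeros((2, 2))
assert np.allclose(meet(PB, PA1), zero)
assert np.allclose(meet(PB, PA2), zero)
# The join of the meets is the zero subspace, not PB: distributivity fails.
assert not np.allclose(meet(PB, PA1) + meet(PB, PA2), PB)
print("(14.1) fails in L(H) for incompatible A and B")
```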
In a way, this was Pitowsky’s point. He used the analogy only to motivate his
proposal that quantum probability is different from classical probability in just this
way. As he put it (Pitowsky 2006, pp.216–7)
I assume that the 0 of the algebra of subspaces represents impossibility (zero probability in
all circumstances), 1 represents certainty (probability one in all circumstances), and the
identities such as Eqs. (14.1) and (14.2) represent identity of probability in all circumstances.
This is the sense in which the lattice of closed subspaces of the Hilbert space is taken as
an algebra of events. I take these judgments to be natural extensions of the classical case; a
posteriori, they are all justified empirically.
However, his argument for this proposal was supposedly based on rationality
requirements on the credences of a quantum gambler ignorant of quantum
mechanics, and here empirical considerations are out of place. If probability
just is rational credence, then a requirement of rationality cannot be based on
empirical considerations of the kind that are taken to warrant acceptance of quantum
mechanics and the non-contextuality of probabilities consequent on application of
the Born rule. Immediately after stating Rule 2, Pitowsky (2006) says that this
follows from an identity between events essentially equivalent to Eq. (14.2), and the
principle that identical events in a probability space have equal probabilities. But, as
we have seen, that principle applies automatically only when these events are part of
a single probability space, corresponding to a single trial or measurement context.
A lattice L(H ) is naturally understood as a quantum measure space, but only
its Boolean subspaces are naturally understood as probability spaces. Pitowsky’s
appeal to rationality requirements on a quantum gambler does not justify his thesis
that quantum theory is a new kind of probability theory.
In the language of probability theory, the term ‘event’ may be used to refer either to
a mathematical object (such as a set) or to a physical occurrence. When arguing that
the Hilbert space formalism of quantum mechanics is a new theory of probability,
Pitowsky took that theory to consist of an algebra of events, and the probability
measure defined on it. Events in this sense are mathematical objects—subspaces of a
Hilbert space. But to each such object he also associated a class of actual or possible
physical occurrences—outcomes of an experiment in which a quantum observable
is measured. A token physical event eS occurs just in case in a measurement M of
observable O the outcome is oi , an eigenvalue of Ô with eigenspace S. Here M is a
token physical procedure corresponding to the mathematical object BM , a Boolean
subalgebra of a lattice L(H ) of subspaces including S, where Ô is a self-adjoint
operator on H .
When a quantum gamble is defined over a set M of incompatible measurements,
each of these is characterized by the corresponding Boolean subalgebra of L(H ).
A token event eS settles a gamble if it is the outcome of an actual measurement
procedure M corresponding to the mathematical object BM ∈ M, where the bookie
chose to perform M rather than some other measurement in the gamble. Following
the subjectivist tradition pioneered by Ramsey, Pitowsky stressed that, for a quantum
gamble to reveal an agent’s degrees of belief, the outcome of whatever measurement
is chosen by the bookie must be settleable.
A proposition that describes a possible event in a probability space is of a rather special kind.
It is constrained by the requirement that there should be a viable procedure to determine
whether the event occurs, so that a gamble that involves it can be unambiguously decided.
This means that we exclude many propositions. For example, propositions that describe past
events of which we have only a partial record, or no record at all. (Pitowsky 2003, p.217)
Recall that in section 2.2 we restricted “matters of fact” to include only observable
records. Our notion of “fact” is analytically related to that of “event” in the sense that a
bet can be placed on x1 only if its occurrence, or failure to occur, can be unambiguously
recorded. (op. cit., p.231)
of the Born rule in a pragmatist view of quantum mechanics. But in this view
a proposition to which the Born rule assigns a probability does not describe the
outcome of a measurement but the physical event of a magnitude taking on a value,
whether or not this is precipitated by measurement.
Even when a magnitude (a.k.a. observable) takes on a value recording the
outcome of a quantum measurement, that record may be subsequently lost, at
which point Pitowsky’s view excludes any event described by a proposition about
that outcome from a quantum probability space. While such scruples may not be
out of place in the context of a quantum gamble, only an indefensible form of
verificationism would prohibit retrospective application of the Born rule to that
event. So quantum mechanics permits application of the concept of probability in
circumstances in which it cannot be understood to be defined on a space of events
consistent with Pitowsky’s view.
One might object that no record of a measurement outcome is ever irretrievably
lost. But recent arguments (Healey 2018; Leegwater 2018) challenge such epistemic
optimism in the context of Gedankenexperimenten based on developments of
the famous Wigner’s friend scenario. These arguments threaten the objectivity
of measurement outcomes under the assumption that unitary, no-collapse single-
world quantum mechanics is applicable at all scales, even when applied to one
observer’s measurement on another observer’s lab (including any devices in that
lab recording the outcomes of prior quantum measurements). It is characteristic
of these arguments that an observation by one observer on the lab of another
completely erases all of the latter’s records of the outcome of his or her prior
quantum measurement. As Leegwater (2018, p.13) put it:
such measurements in effect erase the previous measurement’s result; they must get rid of
all traces from which one can infer the outcome of the first measurement.
In this situation, the first observer’s measurement outcome was a real physical
occurrence, observable by and (then) known to that observer. But the second
observer’s measurement on the first observer’s lab erased all records of that
outcome, and neither of these observers, nor any other observer, can subsequently
verify what it was. Indeed, even the first observer’s memory of his own result has
been erased. This is clearly a situation in which Pitowsky would have excluded
the event corresponding to the first observer’s measurement outcome from any
probability space. For in no sense does a true proposition describing that outcome
state (what he called) a “matter of fact”.
I wonder what Pitowsky would have made of these recent arguments. Leegwater
(2018) presents his argument as a “no-go” result for unitary, no-collapse single-
outcome (relativistic) quantum mechanics, while Healey (2018) takes the third
argument he considers to challenge the objectivity of measurement outcomes.
Brukner (2018) formulates a similar argument in a paper entitled “A no-go theorem
for observer-independent facts”, in which he says
We conclude that Wigner, even as he has clear evidence for the occurrence of a definite
outcome in the friend’s laboratory, cannot assume any specific value for the outcome
to coexist together with the directly observed value of his outcome, given that all other
assumptions are respected. Moreover, there is no theoretical framework where one can
assign jointly the truth values to observational propositions of different observers (they
cannot build a single Boolean algebra) under these assumptions. A possible consequence of
the result is that there cannot be facts of the world per se, but only relative to an observer, in
agreement with Rovelli’s relative-state interpretation, quantum Bayesianism, as well as the
(neo)-Copenhagen interpretation.
states. Even if there are such things as degrees of belief, these cannot be reliably
measured by an agent’s betting behavior of the kind that figures in an argument
intended to show that a (prudentially) rational agent’s degrees of belief will be
representable as probabilities.
A Dutch book argument may still be used to justify Bayesian coherence of a set
of degrees of belief as a normative condition on epistemic rationality, analogous to
the condition that a rational agent’s full beliefs be logically compatible. The view
that ideally rational degrees of belief must be representable as probabilities has been
called probabilism (Christensen 2004, p.107): such degrees of belief are known as
credences. The force of a Dutch book argument for probabilism is independent of
whether there are any bookies or whether bets in a book are settleable. Understood
this way, Rule 1 is justified as a condition of epistemic rationality on degrees
of belief in a quantum event, whether or not a gamble that involves it can be
unambiguously decided.
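The coherence condition behind probabilism can be made concrete with a small computation. The following sketch is mine, not Pitowsky's or Christensen's: a bookie prices a bet on A and a bet on not-A at the agent's credence times the stake, and incoherent credences (summing to more than 1) leave the agent with a guaranteed loss in every possible world.

```python
from fractions import Fraction as F

def dutch_book_loss(credences, stake=1):
    """Price a bet on A and a bet on not-A at credence * stake each,
    and return the bettor's net payoff in each possible world."""
    cr_A, cr_notA = credences
    payoffs = {}
    for world in ("A", "not-A"):
        win_A = stake if world == "A" else 0
        win_notA = stake if world == "not-A" else 0
        payoffs[world] = (win_A - cr_A * stake) + (win_notA - cr_notA * stake)
    return payoffs

# Incoherent credences, cr(A) + cr(not-A) = 6/5 > 1: a sure loss of 1/5
# whichever way the world turns out -- a Dutch book.
assert all(v == F(-1, 5) for v in dutch_book_loss((F(3, 5), F(3, 5))).values())

# Coherent credences summing to 1: no sure loss can be engineered this way.
assert all(v == 0 for v in dutch_book_loss((F(3, 5), F(2, 5))).values())
```

Note that nothing in this computation requires an actual bookie or settleable bets: the sure loss is a structural feature of the credences themselves, which is why the argument can be read purely normatively.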
Probabilism places only minimal conditions on the degrees of belief of an indi-
vidual cognitive agent, just as logical compatibility places only minimal conditions
on his or her full beliefs. Pitowsky sought to justify additional conditions, sufficient
to establish the result that a rational agent’s credences in quantum events involving
a system be representable by a quantum measure on the lattice L(H ) of subspaces
of a Hilbert space H associated with that system (provided that dim(H) ≥ 3). This
was the key result he took to justify his claim that the Hilbert space formalism of
quantum mechanics is a new theory of probability.
Achieving this result depended on Rule 2: the further condition that a rational
agent’s credence in a quantum event represented by subspace S ∈ L(H ) be the
same, no matter whether it occurs as an outcome of measurement M1 (represented
by Boolean sub-lattice B1 ⊂ L(H )) or M2 (represented by B2 ). Pitowsky took
Rule 2 to follow from the identity between events, and the principle that identical
events in a probability space have equal probabilities. As embodied in Eq. (14.2),
Pitowsky took these judgments to be natural extensions of the classical case (as
embodied in (14.1)): and he said of these judgments “a posteriori, they are all
justified empirically”.
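Rule 2 can be illustrated numerically with a Gleason-type measure μ(P) = Tr(ρP). In the following sketch (the particular density operator and the two measurement contexts are invented for illustration), each Boolean context yields a classical probability distribution, and the event common to both contexts receives the same value in each, regardless of which other events it is grouped with:

```python
import numpy as np

rng = np.random.default_rng(0)

# A density operator on C^3 (dim >= 3, as Gleason's theorem requires).
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = A @ A.conj().T
rho /= np.trace(rho).real

def measure(P, rho):
    """Quantum measure of the subspace with projector P: mu(P) = Tr(rho P)."""
    return np.trace(rho @ P).real

# Context B1: the Boolean subalgebra generated by the standard basis e1, e2, e3.
B1 = [np.diag([1.0, 0, 0]), np.diag([0, 1.0, 0]), np.diag([0, 0, 1.0])]

# Context B2: the same event (span of e1) together with a rotated basis
# of its orthocomplement -- a different measurement containing that event.
c, s = np.cos(0.7), np.sin(0.7)
v2, v3 = np.array([0, c, s]), np.array([0, -s, c])
B2 = [np.diag([1.0, 0, 0]), np.outer(v2, v2), np.outer(v3, v3)]

# Each context yields a classical (Boolean) probability space ...
for B in (B1, B2):
    assert abs(sum(measure(P, rho) for P in B) - 1.0) < 1e-9

# ... and, as Rule 2 demands, the shared event gets the same probability
# in both contexts: Tr(rho P) depends on P alone, not on the context.
assert np.isclose(measure(B1[0], rho), measure(B2[0], rho))
```

The design point is that a measure of the form Tr(ρP) is automatically noncontextual; Gleason's theorem runs the other way, showing that (for dim(H) ≥ 3) every suitably additive measure on the subspaces must have this form.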
This justification for Rule 2 is quite different from the justification offered
for Rule 1, which was justified not empirically but as a normative requirement
on epistemic rationality. As such, Rule 1 showed why an epistemically rational
agent should adopt degrees of belief representable as probabilities over a classical
probability space of events associated with each Boolean subalgebra B of L(H ):
in that sense, it justified considering B as a probability space. It is because no
analogous normative requirement justifies taking L(H ) itself to be a probability
space that Pitowsky appealed instead to empirical considerations. But while such
empirical considerations may justify the introduction of a quantum measure μ
over L(H ) as a convenient device for generating a posteriori justified objective
probability measures over Boolean subalgebras of L(H ), this does not show that
μ is itself a probability, or L(H ) a probability space. Only by appeal to some
additional normative principle of epistemic rationality could a Bayesian in the
tradition of Ramsey and de Finetti attempt to show that.
There is now a large literature on how a Principal Principle should be stated, and
how, if at all, it may be justified. This includes Ismael’s helpful formulation as an
implicit definition of a notion of chance:
The [modified Lewisian] chance of A at p, conditional on any information Hp about the
contents of p’s past light cone, satisfies: Crp(A/Hp) =df Chp(A).
is accessible to each. This may occur just because the agents do not share a single
spacetime location, in conformity to Ismael’s modified Lewisian chance: Born
probabilities supply many examples of such objective probabilities. After a specific
quantum state is assigned to an individual system, the Born rule yields a probability
measure over events of certain types involving it. By instantiating the general Born
rule, the agent may then derive an objective chance distribution over possible events.
This is a classical probability distribution over an event space with the structure
of a Boolean algebra B, not a quantum measure over a non-Boolean lattice: in
a legitimate application of the Born rule no probability is assigned to events
that are not elements of B. Events assigned a probability in this way are given
canonical descriptions of the form Qs ∈ Δ, where Q is a dynamical variable
(observable), s is a quantum system, and Δ is a Borel set of real numbers: I call
Qs ∈ Δ a magnitude claim. Some of these events are appropriately redescribed as
measurement outcomes: for others, this redescription is less appropriate.
Probability theory had its origins in a dispute concerning dice throws, so it may
be helpful to draw an analogy with applications of probability theory to throws
of a die. The faces of a normal die are marked in such a way that the point total
of opposite faces sums to 7, and the faces marked 4, 6 meet on one edge of the
die. Consider an evenly weighted die that includes a small amount of magnetic
material carefully distributed throughout its bulk. This die may be thrown onto a
flat, level surface in one or other of two different kinds of experiments: a magnetic
field may be switched on or off underneath the surface as the die is thrown. In an
experiment with the magnetic field off, a throw of the die is fair, so there is an equal
chance of 1/6 that each face will land uppermost. (Notice that, though natural, this
use of the term ‘chance’ does not conform to Lewis’s Principal Principle—even
in Ismael’s modified formulation—since it does not presuppose that dice throws
are indeterministic processes.) In an experiment with the magnetic field on, the die
is biased so that some faces are more likely to land uppermost than others. But
because of the careful placement of the magnetic material within the die, the chance
is still 1/3 that a face marked 4 or 6 lands uppermost.
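The arithmetic of the two experiments can be checked directly. In this sketch the particular biased weights for the field-on experiment are invented for illustration; what matters is only that they sum to 1 while preserving the 1/3 chance for the faces marked 4 and 6:

```python
from fractions import Fraction as F

# Hypothetical outcome distributions for the two kinds of experiment.
p_off = {face: F(1, 6) for face in range(1, 7)}          # fair throw, field off
p_on = {1: F(1, 4), 2: F(1, 12), 3: F(1, 6),             # biased throw, field on
        4: F(1, 4), 5: F(1, 6), 6: F(1, 12)}             # (weights are invented)

for p in (p_off, p_on):
    assert sum(p.values()) == 1                          # a genuine distribution

# Chance that a face marked 4 or 6 lands uppermost, in each experiment:
chance_off = p_off[4] + p_off[6]   # 1/6 + 1/6  = 1/3
chance_on = p_on[4] + p_on[6]      # 1/4 + 1/12 = 1/3
assert chance_off == chance_on == F(1, 3)
```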
In each kind of experiment, a probabilistic model of dice-throwing may be tested
against frequency data from repeated dice throws of that kind. This model includes
general probability statements of the form PX (E) = p specifying the probability
of an event of type E describing a possible outcome of a throw of the die in an
experiment of type X. Consider the situation of an actual or hypothetical agent
immediately prior to an individual throw of the die. The information accessible
to this agent does not include the outcome of that throw: nor could it include
all potentially relevant microscopic details of the initial and boundary conditions
present in that throw, even if it were a deterministic process. But it does include
knowledge of whether the magnetic field is on or off: failure to obtain this relevant,
accessible information would be an act of epistemic irresponsibility.
Having accepted a probabilistic model on the evidence provided by relative
frequencies of different outcome types in repeated throws of both kinds, this actual
or hypothetical agent has reason to instantiate the general probability statement
of the form PX (E) = p for each possible event e of type E in experiment x of
14.6 Conclusion
those not everyone can know—but even these outcomes are as objective as science
needs them to be.
In all these ways I have come to disagree with Itamar’s view of probability in
quantum mechanics. I only wish he could now reply to show me why I am wrong:
we could all learn a lot from the ensuing debate. Let me finish by reiterating two
important points of agreement. There is no BIG quantum measurement problem:
there is only the small problem of physically modeling actual measurements within
quantum theory and showing why some are much easier than others. A quantum
state is not an element of physical reality (|ψ is not a beable); it is a book-keeping
device for updating an agent’s credences as time passes or in the light of new
information (even if there is no actual agent or bookie to keep the books)!
References
Bell, J. S. (1966). On the problem of hidden variables in quantum mechanics. Reviews of Modern
Physics, 38, 447–452.
Bell, J. S. (2004). Speakable and unspeakable in quantum mechanics (2nd ed.). Cambridge:
Cambridge University Press.
Brukner, Č. (2018). A no-go theorem for observer-independent facts. Entropy, 20, 350.
Christensen, D. (2004). Putting logic in its place. Oxford: Oxford University Press.
De Finetti, B. (1937). La prévision: ses lois logiques, ses sources subjectives. Annales de l’Institut
Henri Poincaré, 7, 1–68.
Earman, J. (2018). The relation between credence and chance: Lewis’ “Principal Principle” is a
theorem of quantum probability theory. http://philsci-archive.pitt.edu/14822/. Accessed 6 May
2019.
Feynman, R. P. (1951). The concept of probability in quantum mechanics. In Second Berkeley Sym-
posium on Mathematical Statistics and Probability, 1950 (pp. 533–541). Berkeley: University
of California Press.
Fine, A. (1982). Joint distributions, quantum correlations, and commuting observables. Journal of
Mathematical Physics, 23, 1306–1310.
Fuchs, C., & Schack, R. (2013). Quantum-Bayesian coherence. Reviews of Modern Physics, 85,
1693–1715.
Gleason, A. M. (1957). Measures on the closed subspaces of a Hilbert space. Journal of
Mathematics and Mechanics, 6, 885–893.
Healey, R. A. (2012). Quantum theory: a pragmatist approach. British Journal for the Philosophy
of Science, 63, 729–771.
Healey, R. A. (2017). The quantum revolution in philosophy. Oxford: Oxford University Press.
Healey, R. A. (2018). Quantum theory and the limits of objectivity. Foundations of Physics, 48,
1568–1589.
Hemmo, M., & Pitowsky, I. (2007). Quantum probability and many worlds. Studies in History and
Philosophy of Modern Physics, 38, 333–350.
Ismael, J. T. (2008). Raid! Dissolving the big, bad bug. Noûs, 42, 292–307.
Jaynes, E. T. (2003). In G. Larry Bretthorst (Ed.), Probability theory: The logic of science.
Cambridge: Cambridge University Press.
Kochen, S., & Specker, E. (1967). The problem of hidden variables in quantum mechanics. Journal
of Mathematics and Mechanics, 17, 59–87.
Leegwater, G. (2018). When Greenberger, Horne and Zeilinger meet Wigner’s friend.
arXiv:1811.02442v2 [quant-ph] 7 Nov 2018. Accessed 6 May 2019.
M. Hemmo ()
Department of Philosophy, University of Haifa, Haifa, Israel
e-mail: meir@research.haifa.ac.il
O. Shenker
Edelstein Center for History and Philosophy of Science, The Hebrew University of Jerusalem,
Jerusalem, Israel
e-mail: Orly.shenker@mail.huji.ac.il
15.1 Introduction
1 Everett’s (1957) interpretation of quantum mechanics has a similar attempt in this respect.
2 See also: Bub (1977, 2007, 2016, 2020); Pitowsky (2003, 2007).
3 For versions of the Bayesian approach to quantum mechanics, see e.g., Caves et al. (2002a, b,
2007); Fuchs and Schack, (2013); Fuchs et al. (2014); Fuchs and Stacey (2019).
15 Quantum Mechanics as a Theory of Probability 339
In their approach, von Neumann’s projection postulate turns out to stand for
a rule of updating probabilities by conditionalisation on a measurement outcome,
that is, the von Neumann–Lüders rule. The fact that the updating is non-unitary
expresses, on their view, the ‘no cloning’ principle in quantum mechanics, according
to which there is no universal cloning machine capable of copying the outputs of an
arbitrary information source. This principle entails, via Bub and Pitowsky’s (2010,
p. 455) “information loss theorem”, that a measurement necessarily results in the
loss of information (e.g., the phase in the pre-measurement state). Since the ‘no
cloning’ principle is implied by the Hilbert space structure, they argue, the loss of
information in measurement is independent of how the measurement process might
be implemented dynamically. By this they mean that if no structure is added to
Hilbert space, any dynamical quantum theory of matter and fields that will attempt
to account for the occurrence of individual events must satisfy this principle (and
other constraints imposed by the quantum event space, such as ‘no signaling’4 ),
regardless of its details.
4 The ‘no signaling’ principle asserts, roughly, that the marginal probabilities of events associated
with a quantum system are independent of the particular set of mutually exclusive and collectively
exhaustive events associated with any other system.
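The information loss that conditionalisation brings about can be glimpsed in a minimal numerical sketch. This illustrates the von Neumann–Lüders update itself, not Bub and Pitowsky's information loss theorem: two qubit states that differ only in the relative phase between their components become indistinguishable after conditionalising on a z-measurement outcome, so the update is not invertible and the phase information is gone.

```python
import numpy as np

def luders_update(rho, P):
    """von Neumann-Luders rule: conditionalize the state rho on outcome P,
    rho -> P rho P / Tr(P rho). This map is non-unitary and not invertible."""
    post = P @ rho @ P
    return post / np.trace(post).real

# Two qubit states differing only in the relative phase between |0> and |1>.
plus = np.array([1.0, 1.0]) / np.sqrt(2)       # (|0> + |1>)/sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)     # (|0> - |1>)/sqrt(2)
rho_plus = np.outer(plus, plus)
rho_minus = np.outer(minus, minus)

P0 = np.diag([1.0, 0.0])                       # projector onto |0>

# After conditionalizing on the z-outcome 0, both collapse to |0><0|:
# the pre-measurement phase cannot be recovered from the updated state.
assert np.allclose(luders_update(rho_plus, P0), luders_update(rho_minus, P0))
assert np.allclose(luders_update(rho_plus, P0), np.diag([1.0, 0.0]))
```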
340 M. Hemmo and O. Shenker
This idea, that quantum mechanics is a theory of probability in which the event
space is Hilbert space, leads them to reject what they identify as two ‘dogmas’ of
quantum mechanics: that measurement is a process that, as a matter of principle,
should always be open to a complete dynamical analysis; and that a quantum pure
state is analogous to the classical state given by a point in phase space, that is,
the quantum state is a description (or representation) of physical reality. On the
view they propose, the quantum state is a credence function, which keeps track
of the objective chances in the universe; it is not the ‘truthmaker’ of propositions
about the occurrence or non-occurrence of individual events; and measurement is
primitive in the sense that the no-cloning principle follows from the non-Boolean
structure of the event space, and if one does not add structure to Hilbert space (like
Bohmian trajectories, GRW collapses), then there is no further deeper story to be
told about how measurement outcomes are brought about dynamically. In this sense
the measurement problem, the ‘big’ problem as they call it, is a pseudo problem.
But why is there no deeper story to be told? After all, Bub and Pitowsky
themselves realise that the quantum event space imposes constrains on the ‘objec-
tive chances’ of all possible configurations of individual events, and the unitary
dynamics evolves this entire structure: it does not describe the evolution from one
configuration of the universe to another. But if the theory does not describe the
transition from one configuration to the next, what are the ‘objective chances’
chances of? And in what sense is this quantum theory of probability a physical
theory of an indeterministic universe? In fact, in what sense is the universe
indeterministic at all, given the determinism of the unitary dynamics (which is all
there is)?
To these questions, Bub and Pitowsky reply by an analogy from special relativity:
If we take special relativity as a template for the analysis of quantum conditionalization
and the associated measurement problem, the information-theoretic view of quantum
probabilities as ‘uniquely given from the start’ by the structure of Hilbert space as a
kinematic framework for an indeterministic physics is the proposal to interpret Hilbert
space as a constructive theory of information-theoretic structure or probabilistic structure,
part of the kinematics of a full constructive theory of the constitution of matter, where
the corresponding principle theory includes information-theoretic constraints such as ‘no
signaling’ and ‘no cloning.’ Lorentz contraction is a physically real phenomenon explained
relativistically as a kinematic effect of motion in a non-Newtonian spacetime structure.
Analogously, the change arising in quantum conditionalization that involves a real loss of
information is explained quantum mechanically as a kinematic effect of any process of
gaining information of the relevant sort in the non-Boolean probability structure of Hilbert
space (irrespective of the dynamical processes involved in the measurement process).
Given ‘no cloning’ as a fundamental principle, there can be no deeper explanation for the
information loss on conditionalization than that provided by the structure of Hilbert space
as a probability theory or information theory. The definite occurrence of a particular event
is constrained by the kinematic probabilistic correlations encoded in the structure of Hilbert
space, and only by these correlations – it is otherwise ‘free’. (Bub and Pitowsky 2010, p. 448)
6 SeeBrown (2005) for a different view about the role of dynamical analyses in special relativity;
and Brown and Timpson (2007) for a criticism of the analogy with special relativity.
where |↑z⟩ and |↓z⟩ are (respectively) the up- and down-eigenstates of the z-spin of
the electron; and |ψ0⟩ is Alice’s brain state in which she is ready to measure the
z-spin of the electron. The Schrödinger equation (in the non-relativistic case) maps
the state in (1) to the following superposition state at the end of the measurement at
some time t:

|Ψ(t)⟩ = 1/√2 |↑z⟩ ⊗ |ψ↑⟩ + 1/√2 |↓z⟩ ⊗ |ψ↓⟩    (15.2)
where the | ψ↑↓ are the brain states of Alice in which she would believe that the
outcome of the z-spin measurement is up (or down, respectively), if the initial state
of the electron were only the up-state (or only the down-state). We assume here that
Alice’s brain states | ψ↑↓ include all the degrees of freedom that are correlated with
whatever macroscopic changes occurred in Alice’s brain due to this transition. For
simplicity, the description in (2) omits many details of the microphysics, but nothing
in our argument hinges on this. This is the generic case in which the measurement
problem arises: Alice’s brain states and the spin states of the electron are
entangled.
The essential idea of decoherence (for our purposes) is very briefly this. Since
Alice is a relatively massive open system that continuously interacts with a large
number of degrees of freedom in the environment, the quantum state of the universe
is not (2), but rather this state:
|Ψ(t)⟩ = 1/√2 |↑z⟩ ⊗ |ψ↑⟩ ⊗ |E+(t)⟩ + 1/√2 |↓z⟩ ⊗ |ψ↓⟩ ⊗ |E−(t)⟩    (15.3)

in which the |E±(t)⟩ are the collective (relative-)states of all the degrees of freedom
in the environment.9 According to decoherence theory, the Hamiltonian that governs
7 For decoherence theory, and references, see e.g., Joos et al. (2003).
8 Interactions never begin, so state (1) (which is a product state) should be replaced with a state in
which Alice + electron are (weakly) entangled; otherwise the dynamics would not be reversible.
9 In discrete models, both |ψ↑↓⟩ and |E±(t)⟩ are states of the form ⊗_{i=1}^{N} |μi⟩, defined in the
corresponding multi-dimensional Hilbert spaces.
the interaction with the environment commutes (or approximately commutes) with
some observables of the macroscopic system, for example the position of its center
of mass. In our example, Alice’s brain states | ψ↑↓ denote the eigenstates of the
observables on which the interaction Hamiltonian depends. These states are called
in the literature the ‘decoherence basis’. The interaction couples (approximately)
the states of the environment | E± (t) to the states | ψ↑↓ in the decoherence basis.
Since the environment consists of a large number of dynamically independent
degrees of freedom (this is expressed by assuming a suitable ‘random’ probability
distribution over the states of the different degrees of freedom in the environment),
the |E±(t)⟩ are highly likely to become, extremely quickly, almost orthogonal to
each other. Because of the large number of degrees of freedom in the environment,
this behaviour is likely to persist for very long times (in some models, for times
comparable with the age of the universe) for exactly the same reasons.
As long as decoherence persists, that is, as long as the |E±(t)⟩ remain nearly orthogonal
to each other, the interference terms in the superposition (3) in the decoherence
basis are very small. This means that at all these times, the off-diagonal elements of
Alice’s reduced density matrix (calculated by partial tracing over the environmental
degrees of freedom) in the decoherence basis |ψ↑↓⟩ are very small. Since the
interference is small, the branches defined in the decoherence basis |ψ↑↓⟩ evolve in
time, in the course of the Schrödinger evolution of (3), almost separately of each
other. That is, given the Schrödinger evolution of the universal state (3), which
results in decoherence, it turns out that (in the decoherence basis) each of the
branches in (3) evolves in a way that is almost independent of the other branches.
In these senses, the decoherence basis is special: it has the above formal features
when decoherence obtains, whereas in other bases in which state (3) may be written
the expansion is much more cumbersome.
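The near-vanishing of the off-diagonal elements of the reduced density matrix can be illustrated numerically. In this sketch the two-dimensional "pointer" space standing in for the system-plus-Alice degrees of freedom, and the random environment vectors, are simplifying assumptions of mine: two random unit vectors in a high-dimensional environment space are nearly orthogonal, and the interference terms inherit that small overlap.

```python
import numpy as np

rng = np.random.default_rng(1)
d_env = 2000                       # many environmental degrees of freedom

def random_env_state(d):
    """A random unit vector, standing in for a relative state |E(t)> of
    many dynamically independent environmental degrees of freedom."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

E_plus, E_minus = random_env_state(d_env), random_env_state(d_env)

# |Psi> = (|0>|E+> + |1>|E->)/sqrt(2): pointer tensor environment, as in (3).
up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
psi = (np.kron(up, E_plus) + np.kron(down, E_minus)) / np.sqrt(2)

# Reduced state of the pointer: partial trace over the environment.
psi_mat = psi.reshape(2, d_env)
rho_pointer = psi_mat @ psi_mat.conj().T

# Diagonal: the Born weights 1/2, 1/2. Off-diagonal: <E-|E+>/2, which is
# tiny because the two environment vectors are nearly orthogonal.
assert np.allclose(np.diag(rho_pointer).real, [0.5, 0.5])
assert abs(rho_pointer[0, 1]) < 0.05
```

The point mirrors the text: the off-diagonal terms are not strictly zero, only so small (of order 1/√d_env here) that they make no detectable difference.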
Notice that in our outline of decoherence we have defined the branches by Alice’s
brain states; but this is just an artefact of our example. The states of any decohering
system (such as a macroscopic body in a thermal or radiation bath, a Brownian
particle in suspension in a fluid, a pointer of a measuring device, Schrödinger’s cat)
are equally suitable to define the branches. It might even be that Alice’s relevant
brain states are not subject directly to decoherence (whether or not this is the case
is the subject matter of brain science after all). All one needs is that Alice’s relevant
brain states (whatever they are) that give rise to her experience be coupled to the
states of a decohering system. In state (3) we use Alice’s brain states to define the
decoherence basis only for simplicity.
The essential point for Bub and Pitowsky is that, since the interference terms are
very small in the decoherence basis, the dynamical process of decoherence gives rise
to an effectively classical (Boolean) probability space in which the probabilities for
the events are given by the Born probabilities. The macroscopic events in this space
are given by the projections onto the eigenstates of the decohering variables (in our
example, the projections onto the states in the decoherence basis | ψ↑↓ ) and these
turn out to be the macroscopic variables that feature in our experience, and more
generally in classical mechanics. And the probability space (approximately) satisfies
classical probability theory as long as decoherence endures. For example, the
It is true that the decoherence process is basis-independent and that the interaction
Hamiltonian picks out the decoherence basis as dynamically preferred (given a
factorisation of the Hilbert space into system and environment), as we explained
above. In fact, since the process of decoherence is continuous, there is a continuous
infinity of basis states in Alice’s state space other than the | ψ↑↓ which satisfy to
the same degree the conditions of decoherence (namely the near orthogonality of
the relative states of the environment and the fact that this behaviour is stable over
long time periods). It is an objective fact that as long as decoherence endures the
interference between the branches defined in any such basis (and the off-diagonal
elements in the reduced state of the decohering system) are very small and can
hardly be detected by measurements. In many cases in physics it is practically useful
to ignore small terms which make no detectable difference to the calculated empirical
predictions. Here are two examples.
In probability theory it is sometimes assumed that events with very small
probability will not occur with practical (or moral) certainty, or equally, that events
with very high probability which is not strictly one, will occur with practical
certainty (see Kolmogorov 1933). In these cases, the identification of very small
probability with zero probability is practically useful whenever one assumes that
cases with very small probability occur with very small relative frequency. Although
such cases will appear in the long run of repeated trials, they are unlikely to appear
in our relatively short-lived experience. Another example is the Taylor series of
a function, which is used to approximate the value of the function near the expansion
point by means of the function’s derivatives at that point (when the derivatives are defined
and continuous around the point). Although the Taylor series is an infinite sum of
terms, in many cases (though not always) one can approximate the function
to a sufficient degree of accuracy by truncating the Taylor series and keeping only
the first finitely many terms in the series. Effectively, in these cases the
rest of the terms in the series, which typically decay as the order of the derivatives
increases but do not strictly vanish, are set to zero. Taylor’s theorem yields a bound
on the size of the approximation error, depending on the
length of the truncated series, and this bound might be far below the degree of accuracy
that one can measure. Although in general the infinite Taylor series is not equal
to the function (even if the series converges at every point), using the truncated
series gives accurate empirical predictions about the values of the function near the
expansion point and its behavior in various limits. But no claim is made that the
truncated small terms do not exist; the claim is only that we are not able to detect
their existence in our experience.
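The Taylor example can be made concrete. In this sketch (the choice of exp, the expansion point 0, and the crude bound e^x ≤ e for 0 ≤ x ≤ 1 are mine, for illustration), the error of the truncated series respects the Lagrange remainder bound from Taylor's theorem, and that bound is already far below ordinary measurement precision:

```python
import math

def taylor_exp(x, n_terms):
    """Truncated Taylor series of exp around 0: sum of x**k / k! for k < n_terms."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))

x, n = 0.5, 6
approx = taylor_exp(x, n)

# Lagrange remainder bound from Taylor's theorem: |R_n| <= e^xi * x^n / n!
# for some 0 <= xi <= x; here we use e^xi <= e as a crude bound.
bound = math.e * x**n / math.factorial(n)

error = abs(math.exp(x) - approx)
assert error <= bound          # the true error respects the bound
assert bound < 1e-4            # already below most measurement precision
```

As the text says, the discarded terms are not claimed not to exist; the bound only certifies that their total contribution is undetectably small.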
It seems to us that the role of decoherence in the foundations of quantum
mechanics should be understood along similar lines to the above two examples.
Suppose that the Hamiltonian of the universe includes interactions that we call
measurements of certain observables and decoherence interactions (as in our spin
example above). The evolution of the universal state can be written in infinitely
many bases in the Hilbert space of the universe. Different bases pick out different
events, in some bases the interference terms between the components describing
the events are small, and in others they are large. In some intuitive sense, the bases
in which the interference terms are small seem to be esthetically more suitable or
more convenient for us to describe the quantum state. But given the Hilbert space
structure these bases have no other special status, in particular they are not more
physically real (as it were) than the other bases. In this sense the analogy of Hilbert
space bases with Lorentz frames in special relativity is apt. It turns out that
the events that feature in our experience correspond to the projections onto the
decoherence basis. If we take this fact into account without a deeper explanation
of how it comes about, then it is only natural that we write the quantum state
describing our measurements in the decoherence basis. And then for matters of
calculating our empirical predictions the small interference terms in the decoherence
basis make no detectable difference to what we can measure in our experience, and
for this purpose these terms can be set to zero. To detect the superposition in (3), by
means of measurements which are described in the decoherence basis, one needs to
manipulate the relative states of the environment in a way that results in significant
interference. But this is practically impossible, due to the large number of degrees
of freedom in the environment and the fact that they are much beyond our physical
reach. In this practical sense, setting the interference terms in the decoherence basis
to effectively zero is a perfectly useful approximation.
But in order for Bub and Pitowsky’s consistency proof to be analogous to the
dynamical proofs in special relativity, these effective considerations are immaterial.
It is an objective fact about the universe that decoherence obtains (given a factori-
sation). This is our candidate (iii) for the analogy with the breaking of the thread in
special relativity. But what about Bub and Pitowsky’s target (candidate (ii)) for the
analogy, namely the fact that the structure of the probability space of the events in
our experience seems to be classical (Boolean)? The decoherence basis is analogous
only to certain Lorentz frames, and the decoherence interaction explains why the
interference terms of state (3), described in this basis, are small; the decoherence
interaction equally explains why in other bases of the Hilbert space of the universe
(which are analogous to other Lorentz frames) the interference terms of state (3) are
large. Consequently, the decoherence interaction explains why a probability space
of events picked out by the decoherence basis is effectively classical (Boolean);
the decoherence interaction equally explains why the probability spaces of events
picked out by other bases are not effectively classical. But the decoherence
interaction does not explain why events picked out by the decoherence basis, rather
than by these other bases, feature in our experience.
For this reason, the analogy with special relativity breaks down. In special
relativity one can explain why the experiences of certain (physical) observers are
associated with certain Lorentz frames by straightforward physical facts about these
observers, for example, their relative motion.10 But in quantum mechanics, if we
do not add structure to Hilbert space, one cannot explain by appealing to physical
facts why the experiences of certain observers, for example, human observers, are
associated with the events picked out by the decoherence basis. The emergence of
the probability space of these events in our experience is not a basis-independent
10 In statistical mechanics, here is an analogous scenario: the phase space of the universe can
be partitioned into infinitely many sets corresponding to infinitely many macrovariables. Some
of these partitions exhibit certain regularities: for example, the partition into the thermodynamic
sets to which human observers are sensitive exhibits the thermodynamic regularities, while other
partitions, to which human observers are not sensitive although they are equally real, exhibit
other regularities and may even be anomalous (that is, exhibit no regularities at all). We call this
scenario in statistical mechanics Ludwig’s problem (see Hemmo and Shenker 2012, 2016). But
one can explain by straightforward physical facts why (presumably) human observers experience
the thermodynamic sets and regularities but not the equally existing other sets (although it might
be that in our experience some other sets also appear). This case too is dis-analogous to the
basis-symmetry in quantum mechanics, from which the problem of no preferred basis follows.
15 Quantum Mechanics as a Theory of Probability 347
11 A preferred basis in exactly the same sense is also required in all the contemporary versions of
the Everett (1957) interpretation of quantum mechanics, so that one must add a new law of physics
to the effect that worlds emerge relative to the decoherence basis; see (Hemmo and Shenker 2019).
12 Recent variations are, for example, Caves et al. (2002a, b, 2007); Fuchs and Schack (2013);
Fuchs et al. (2014); Fuchs and Stacey (2019).
13 For the role of decoherence in the Bayesian approach and its relation to Dutch-book consistency,
from the start. Our point is that the structure added by the Bayesian approach is no less than
e.g., Bohmian trajectories or GRW flashes.
measurement outcome, where the probability for the choice is dictated by the Born
rule.16
The second consequence of the Bayesian approach is that as a matter of principle,
not of ignorance about the external physical world, mental states (and experience in
general) cannot be accounted for by physics. Here is why. Contemporary science
(brain and cognitive science) is far from explaining how mental experience comes
about. But the standard working hypothesis in these sciences – which is fruitful
and enjoys vast empirical support – is that our mental experience is to be explained
(somehow, by and large) by features of our brains, perhaps together with some other
parts of the body; for simplicity we call all of them ‘the brain’. How and why we
can treat biological systems in effectively classical terms is part of the question of
the classical limit of quantum mechanics. But here our specific question is this. Let
us suppose that quantum mechanics does yield in some appropriate limit classical
behavior of (say) our brains. Can the Bayesian approach accept the standard working
hypothesis that mental states are to be accounted for by brain states (call it the ‘brain
hypothesis’)?
If the brain hypothesis is true, there is some physical state of my brain which
underlies my credence function. Since according to contemporary physics quantum
mechanics is our best physical theory, this physical state (whatever its details are)
should be some quantum state of my brain. But on the Bayesian approach, the
quantum state of my brain is a credence function of some observer, not the physical
state of anything; so a fortiori it can’t be the physical state of my brain. It follows
that, as a matter of principle, mental states cannot be accounted for by brain states:
they are not physical even fundamentally.
In classical physics the Bayesian theory of probability is not meant to replace
the physical theory (i.e., classical relativistic physics) describing how the relative
frequencies of events are brought about. After all, as Pitowsky (2003, p. 400) has
put it: “the theory of probability, even in its most subjective form, associates a
person’s degree of belief with the objective possibilities in the physical world.”
In the classical case the subjective interpretation of probability (de Finetti 1970;
Ramsey 1929) may assume in the background that – as a matter of principle –
not only the frequencies but also the credences of agents and their evolution over
time might be ultimately accounted for by physics. In this sense, the subjectivist
approach to classical probability is consistent with the physicalist thesis in cognition
and philosophy of mind. But the Bayesian approach to quantum mechanics is not
only inconsistent with physicalism: it is a form of idealism.
Acknowledgement We thank Guy Hetzroni and Christoph Lehner for comments on an earlier draft
of this paper. This research was supported by the Israel Science Foundation (ISF), grant number
1114/18.
16 These two roles are carried out by the mind at one shot (as it were), but analytically they are
different.
References
Bell, J. S. (1987a). Are there quantum jumps? In J. Bell (Ed.), Speakable and unspeakable in
quantum mechanics (pp. 201–212). Cambridge: Cambridge University Press.
Bell, J. S. (1987b). How to teach special relativity. In J. Bell (Ed.), Speakable and unspeakable in
quantum mechanics (pp. 67–80). Cambridge: Cambridge University Press.
Bohm, D. (1952). A suggested interpretation of the quantum theory in terms of “hidden” variables,
I and II. Physical Review, 85, 166–179, 180–193.
Brown, H. (2005). Physical relativity: Spacetime structure from a dynamical perspective. Oxford:
Oxford University Press.
Brown, H. R., & Timpson, C. G. (2007). Why special relativity should not be a template for a
fundamental reformulation of quantum mechanics. In W. Demopoulos & I. Pitowsky (Eds.),
Physical theory and its interpretation: Essays in honor of Jeffrey Bub. Berlin: Springer.
Bub, J. (1977). Von Neumann’s projection postulate as a probability conditionalization rule in
quantum mechanics. Journal of Philosophical Logic, 6, 381–390.
Bub, J. (2007). Quantum probabilities as degrees of belief. Studies in History and Philosophy of
Modern Physics, 38, 232–254.
Bub, J. (2016). Bananaworld: Quantum mechanics for primates. Oxford: Oxford University Press.
Bub, J. (2020). ‘Two Dogmas’ Redux. In Hemmo, M., Shenker, O. (eds.) Quantum, probability,
logic: Itamar Pitowsky’s work and influence. Cham: Springer.
Bub, J., & Pitowsky, I. (2010). Two dogmas about quantum mechanics. In S. Saunders, J. Barrett,
A. Kent, & D. Wallace (Eds.), Many worlds? Everett, quantum theory, and reality (pp. 431–
456). Oxford: Oxford University Press.
Caves, C. M., Fuchs, C. A., & Schack, R. (2002a). Unknown quantum states: The quantum de
Finetti representation. Journal of Mathematical Physics, 43(9), 4537–4559.
Caves, C. M., Fuchs, C. A., & Schack, R. (2002b). Quantum probabilities as Bayesian probabilities.
Physical Review A, 65, 022305.
Caves, C. M., Fuchs, C. A., & Schack, R. (2007). Subjective probability and quantum certainty.
Studies in History and Philosophy of Modern Physics, 38, 255–274.
de Finetti, B. (1970). Theory of probability. New York: Wiley.
Everett, H. (1957). ‘Relative state’ formulation of quantum mechanics. Reviews of Modern Physics,
29, 454–462.
Fuchs C. A. & Schack, R. (2012). Bayesian conditioning, the reflection principle, and quantum
decoherence. In Ben-Menahem, Y., & Hemmo, M. (eds.), Probability in physics (pp. 233–247).
The Frontiers Collection. Berlin/Heidelberg: Springer.
Fuchs, C. A., & Schack, R. (2013). Quantum Bayesian coherence. Reviews of Modern Physics, 85,
1693–1715.
Fuchs, C. A., Mermin, N. D., & Schack, R. (2014). An introduction to QBism with an application
to the locality of quantum mechanics. American Journal of Physics, 82, 749–754.
Fuchs, C. A., & Stacey, B. (2019). QBism: Quantum theory as a hero’s handbook. In E. M.
Rasel, W. P. Schleich, & S. Wölk (Eds.), Foundations of quantum theory: Proceedings of the
International School of Physics “Enrico Fermi” course 197 (pp. 133–202). Amsterdam: IOS
Press.
Ghirardi, G., Rimini, A., & Weber, T. (1986). Unified dynamics for microscopic and macroscopic
systems. Physical Review, D, 34, 470–479.
Hagar, A. (2003). A philosopher looks at quantum information theory. Philosophy of Science,
70(4), 752–775.
Hagar, A., & Hemmo, M. (2006). Explaining the unobserved – Why quantum mechanics ain’t only
about information. Foundations of Physics, 36(9), 1295–1324.
Hemmo, M., & Shenker, O. (2012). The road to Maxwell’s demon. Cambridge: Cambridge
University Press.
Hemmo, M., & Shenker, O. (2016). Maxwell’s demon. Oxford University Press: Oxford Hand-
books Online.
Hemmo, M. & Shenker, O. (2019). Why quantum mechanics is not enough to set the framework
for its interpretation, forthcoming.
Joos, E., Zeh, H. D., Giulini, D., Kiefer, C., Kupsch, J., & Stamatescu, I. O. (2003). Decoherence
and the appearance of a classical world in quantum theory. Heidelberg: Springer.
Kolmogorov, A. N. (1933). Foundations of the theory of probability (English translation 1956).
New York: Chelsea Publishing Company.
Pitowsky, I. (2003). Betting on the outcomes of measurements: A Bayesian theory of quantum
probability. Studies in History and Philosophy of Modern Physics, 34, 395–414.
Pitowsky, I. (2007). Quantum mechanics as a theory of probability. In W. Demopoulos & I.
Pitowsky (Eds.), Physical theory and its interpretation: Essays in honor of Jeffrey Bub (pp.
213–240). Berlin: Springer.
Ramsey, F. P. (1990). Truth and probability (1926). Reprinted in D. H. Mellor (Ed.), F. P. Ramsey:
Philosophical papers. Cambridge: Cambridge University Press.
Tumulka, R. (2006). A relativistic version of the Ghirardi–Rimini–Weber model. Journal of
Statistical Physics, 125, 821–840.
von Neumann, J. (2001). Unsolved problems in mathematics. In M. Redei & M. Stoltzner (Eds.),
John von Neumann and the foundations of quantum physics (pp. 231–245). Dordrecht: Kluwer
Academic Publishers.
Chapter 16
On the Three Types of Bell’s Inequalities
Gábor Hofer-Szabó
16.1 Introduction
Ever since its appearance three decades ago, Itamar Pitowsky’s Quantum Probability –
Quantum Logic has been serving as a towering lighthouse showing the way
for many working in the foundations of quantum mechanics. In this wonderful book
Pitowsky provided an elegant geometrical representation of classical and quantum
probabilities: a powerful tool in tackling many difficult formal and conceptual
questions in the foundations of quantum theory. One of them is the meaning of
the violation of Bell’s inequalities.
G. Hofer-Szabó
Research Center for the Humanities, Budapest, Hungary
e-mail: szabo.gabor@btk.mta.hu

However, for someone reading the standard foundations of physics literature on
Bell’s theorems, it is not easy to connect up Bell’s inequalities as they are presented
in Pitowsky’s book with the inequalities presented by Bell, Clauser, Horne, Shimony
etc. The main difference, to make it brief, is that Bell’s inequalities in their original
form are formulated in terms of conditional probabilities, representing certain
measurement outcomes provided that certain measurements are performed, while
Pitowsky’s Bell inequalities are formulated in terms of unconditional probabilities,
representing the distribution of certain underlying properties or events responsible
for the measurement outcomes.
Let us see this difference in more detail. Probabilities enter into quantum
mechanics via the trace formula; on the operational interpretation, these probabilities
are read as conditional probabilities of outcomes on measurements:

p = ps(A|a)    (16.2)
The ontological and the operational definition are connected as follows. The
measurement a is said to measure the physical magnitude a∗ only if the following
holds: the outcome of measurement a performed on the system will be A if and
only if the system is in an ontological state for which the value of a∗ is A∗:

pλ(A|a) = 1 if and only if λ ∈ A∗
As is seen, the two interpretations differ in how they treat quantum probabilities.
On the operational interpretation quantum probabilities are conditional probabilities,
while on the ontological interpretation they are unconditional probabilities. (Note
that in both interpretations the probabilities are also “conditioned” on the preparation,
which will be suppressed from the next section on.)
Now, consider a set of self-adjoint operators {âi} (i ∈ I), each with two spectral
projections {Âi, Âi⊥}. Correspondingly, consider a set of measurements {ai}, each
with two outcomes {Ai, Ai⊥}, and a set of magnitudes {ai∗}, each with two values
{Ai∗, Ai∗⊥}. If Âi and Âj are commuting, then the quantum probabilities

pi = Tr(ρs Âi)    (16.4)
pij = Tr(ρs (Âi ∧ Âj))    (16.5)

can be read operationally:

pi = ps(Ai|ai)    (16.6)
pij = ps(Ai ∧ Aj|ai ∧ aj)    (16.7)

or ontologically:

pi = ps(Ai∗)    (16.8)
pij = ps(Ai∗ ∧ Aj∗)    (16.9)
−1 ≤ pij + pij′ + pi′j − pi′j′ − pi − pj ≤ 0
i, i′ = 1, 2; j, j′ = 3, 4; i ≠ i′; j ≠ j′    (16.10)
The numbers featuring in (16.10) are probabilities. But which type of probabilities?
Are they simply (uninterpreted) quantum probabilities of type (16.4)–(16.5), or
conditional probabilities of type (16.6)–(16.7), or unconditional probabilities of
type (16.8)–(16.9)? Depending on how the probabilities in the Clauser-Horne
inequalities are understood, the inequalities can be written out in three
different forms: in terms of quantum probabilities (where ÂiÂj, ÂiÂj′, Âi′Âj and
Âi′Âj′ are commuting projections); in terms of conditional probabilities; or in terms
of unconditional probabilities.
Consider real numbers pi and pij in [0, 1] such that i = 1 . . . n and (i, j) ∈ S,
where S is a subset of the index pairs {(i, j) | i < j; i, j = 1 . . . n}. When can these
numbers be probabilities of certain events and their conjunctions? More precisely:
given the numbers pi and pij, is there a classical probability space (Ω, Σ, p) with
events Ai and Ai ∧ Aj in Σ such that

pi = p(Ai)
pij = p(Ai ∧ Aj)
For every ε = (ε1, . . . , εn) ∈ {0, 1}^n, define the classical vertex vector u^ε by:

u_i^ε = εi    i = 1 . . . n
u_ij^ε = εi εj    (i, j) ∈ S

Then we define the classical correlation polytope in R(n, S) as the convex hull of
the classical vertex vectors:
c(n, S) := { p ∈ R(n, S) | p = Σ_{ε∈{0,1}^n} λε u^ε;  λε ≥ 0;  Σ_{ε∈{0,1}^n} λε = 1 }
The polytope c(n, S) is a simplex, that is, any correlation vector in c(n, S) has a
unique expansion by classical vertex vectors.
Pitowsky’s theorem (Pitowsky 1989, p. 22) states that p admits a Kolmogorovian
representation if and only if p ∈ c(n, S). That is, numbers can be probabilities of
certain events and their conjunctions if and only if the correlation vector composed
of these numbers is in the classical correlation polytope.
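For the simplest case, n = 2 and S = {(1, 2)}, the simplex structure means membership in c(n, S) can be decided in closed form: the unique expansion coefficients are read off directly from (p1, p2; p12), and Pitowsky's criterion reduces to their non-negativity. A minimal sketch (the helper names are ours, not Pitowsky's):

```python
# Membership test for the simplest classical correlation polytope,
# c(2, {(1, 2)}): solve p = sum_eps lambda_eps u^eps for the unique
# coefficients and check non-negativity.

def classical_coefficients(p1, p2, p12):
    """Unique expansion coefficients (lam_11, lam_10, lam_01, lam_00)
    w.r.t. the vertices (1, 1; 1), (1, 0; 0), (0, 1; 0), (0, 0; 0)."""
    return (p12, p1 - p12, p2 - p12, 1 - p1 - p2 + p12)

def in_classical_polytope(p1, p2, p12, tol=1e-12):
    """p admits a Kolmogorovian representation iff all coefficients are >= 0."""
    return all(lam >= -tol for lam in classical_coefficients(p1, p2, p12))
```

Non-negativity of the four coefficients is exactly the conjunction of the facet inequalities 0 ≤ p12 ≤ p1, p2 and p1 + p2 − p12 ≤ 1; e.g. (2/3, 2/3; 1/5) fails the test because the last coefficient is negative.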
Now, Bell’s inequalities enter the scene as the facet inequalities of the classical
correlation polytopes. The simplest such correlation polytope is the one with n =
2 and S = {(1, 2)}. In this case the vertices are: (0, 0; 0), (1, 0; 0), (0, 1; 0) and
(1, 1; 1) and the facet inequalities (Bell’s inequalities) are the following:
0 ≤ p12 ≤ p1, p2 ≤ 1
p1 + p2 − p12 ≤ 1
Another famous polytope is c(n, S) with n = 4 and S = {(1, 3), (1, 4), (2, 3),
(2, 4)}. The facet inequalities are then the following:
0 ≤ pij ≤ pi, pj ≤ 1    i = 1, 2; j = 3, 4    (16.14)
pi + pj − pij ≤ 1    i = 1, 2; j = 3, 4    (16.15)
−1 ≤ pij + pij′ + pi′j − pi′j′ − pi − pj ≤ 0    i, i′ = 1, 2; j, j′ = 3, 4; i ≠ i′; j ≠ j′
(16.16)
The facet inequalities (16.16) are called the Clauser-Horne inequalities. They
express whether 4 + 4 real numbers can be regarded as the probabilities of four
classical events and certain of their conjunctions.
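The facet conditions are mechanical to check. The sketch below encodes (16.14)–(16.16) directly, with the index convention i, i′ ∈ {1, 2} and j, j′ ∈ {3, 4}; it is our own helper, not code from the chapter:

```python
# Check the facet inequalities (16.14)-(16.16) for a correlation vector
# given as p[i] (i = 1..4) and pij[(i, j)] (i in {1, 2}, j in {3, 4}).

def satisfies_clauser_horne(p, pij, tol=1e-12):
    for i in (1, 2, 3, 4):
        if not (-tol <= p[i] <= 1 + tol):
            return False
    for i in (1, 2):
        for j in (3, 4):
            if not (-tol <= pij[(i, j)] <= min(p[i], p[j]) + tol):  # (16.14)
                return False
            if p[i] + p[j] - pij[(i, j)] > 1 + tol:                 # (16.15)
                return False
    for i, i2 in ((1, 2), (2, 1)):                                  # i != i'
        for j, j2 in ((3, 4), (4, 3)):                              # j != j'
            s = (pij[(i, j)] + pij[(i, j2)] + pij[(i2, j)]
                 - pij[(i2, j2)] - p[i] - p[j])
            if not (-1 - tol <= s <= tol):                          # (16.16)
                return False
    return True
```

A product vector (pi = 1/2, pij = 1/4) passes, while the EPR-Bohm probabilities of Sect. 16.6 fail the lower bound in (16.16).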
A special type of correlation vector is the independence vector, that is, a
correlation vector such that pij = pi pj for all (i, j) ∈ S. It is easy to see that all
independence vectors lie in the classical correlation polytope, with coefficients

λε = Π_{i=1}^{n} p_i*,   where   p_i* = pi if εi = 1, and p_i* = 1 − pi if εi = 0    (16.17)
Classical vertex vectors are independence vectors by definition; they are extremal
points of the classical correlation polytope. For a classical vertex vector, pi ∈
{0, 1} for all i = 1 . . . n. Thus we will sometimes call a classical vertex vector
a deterministic independence vector, and an independence vector which is not a
classical vertex vector an indeterministic independence vector.
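The product formula (16.17) can be verified directly: summing λε over the ε with εi = 1 returns pi, and over εi = εj = 1 returns pi pj. A small sketch of this check (our own code, with 0-based indices):

```python
# Expansion of an independence vector by classical vertex vectors with the
# product coefficients of (16.17).
from itertools import product

def expansion_coefficients(p):
    """p: single-event probabilities p_1..p_n (0-based list).
    Returns {eps: lambda_eps}, lambda_eps = prod_i (p_i if eps_i else 1 - p_i)."""
    coeffs = {}
    for eps in product((0, 1), repeat=len(p)):
        lam = 1.0
        for pi, ei in zip(p, eps):
            lam *= pi if ei else 1.0 - pi
        coeffs[eps] = lam
    return coeffs

def reconstruct(p, S):
    """Recover p_i and p_ij (for (i, j) in S) from the expansion."""
    coeffs = expansion_coefficients(p)
    pi = [sum(l for e, l in coeffs.items() if e[i]) for i in range(len(p))]
    pij = {(i, j): sum(l for e, l in coeffs.items() if e[i] and e[j])
           for i, j in S}
    return pi, pij
```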
Although correlation vectors in c(n, S) have a unique convex expansion by
classical vertex vectors, they (if not classical vertex vectors) can have many convex
expansions by indeterministic independence vectors. Moreover, for a classical
correlation vector p ∈ c(n, S) (which is not a classical vertex vector) there always
exist many sets of indeterministic independence vectors {p^ε} such that

p = Σ_ε λε u^ε = Σ_ε λε p^ε
Now, we ask the following question: When do the correlations in the Kolmogorovian
representation of a proper correlation vector in c(n, S) have a common causal
explanation?
A common cause in the Reichenbachian sense (Reichenbach 1956) is a screener-
off partition of the algebra (or of an extension of the algebra; see Hofer-Szabó et al.
2013, Ch. 3 and 6). In other words, correlations (16.19) are said to have a joint
common causal explanation if there is a partition {Ck} (k ∈ K) of Σ such that for
any (i, j) in S and k ∈ K
2 For example:

p = (2/5, 2/5; 1/5)
  = (1/5)(1, 1; 1) + (1/5)(1, 0; 0) + (1/5)(0, 1; 0) + (2/5)(0, 0; 0)
  = (1/5)(1/4, 1/4; 1/16) + (1/5)((3 + √5)/8, (3 + √5)/8; (7 + 3√5)/32)
  + (1/5)((3 − √5)/8, (3 − √5)/8; (7 − 3√5)/32) + (2/5)(1/2, 1/2; 1/4)    (16.18)
Since the common cause vectors are independence vectors lying in the classical
correlation polytope c(n, S), their convex combination p also lies in the
classical correlation polytope—which, of course, we knew, since we assumed that p
has a Kolmogorovian representation.
We call a common cause vector p^k deterministic if p^k is a deterministic
independence vector (classical vertex vector); otherwise we call it indeterministic.
All classical correlation vectors have 2^n deterministic common cause vectors,
namely the classical vertex vectors. In this case k = ε ∈ {0, 1}^n and the
probabilities are:

p_i^ε = εi
c_ε = λε

where λε is specified in (16.17). Classical correlation vectors which are not classical
vertex vectors can also be expanded as a convex sum of indeterministic common
cause vectors in many different ways.
Conversely, knowing the k common cause vectors and the probabilities ck alone,
one can easily construct a common causal explanation of the correlations (16.19).
First, construct the classical probability space associated to the correlation vector
p (for the details see Pitowsky 1989, p. 23). Then extend this space so that, beyond
the events Ai and Ai ∧ Aj, it also contains the common causes {Ck} (for such an
extension see Hofer-Szabó et al. (1999)).
To sum up, in the case of classical probabilities the fulfillment of Bell’s
inequalities is a necessary and sufficient condition for a set of numbers to be the
probability of certain events and their conjunctions. If these events are correlated,
then the correlations will always have a common causal explanation. In other
words, having a common causal explanation does not put a further constraint on
the correlation vectors. Thus, Bell’s inequalities have a double meaning: they test
whether numbers can represent probabilities of events and at the same time whether
the correlations between these events (provided they exist) have a joint common
cause. These two meanings of Bell’s inequalities will split up in the next section
where we treat Bell’s inequalities with classical conditional probabilities.
Just as in the previous section suppose that we are given real numbers pi and pij in
[0, 1] with i = 1 . . . n and (i, j ) ∈ S. But now we ask: when can these numbers be
conditional probabilities of certain events and their conjunctions? More precisely:
given the numbers pi and pij, do there exist events Ai and ai (i = 1 . . . n) in a
classical probability space (Ω, Σ, p) such that pi and pij are the following classical
conditional probabilities:
pi = p(Ai |ai )
pij = p(Ai ∧ Aj |ai ∧ aj )
3 For example, consider the following correlation vector in R(n, S) with n = 2 and S = {(1, 2)}:

p = (2/3, 2/3; 1/5)

It violates the facet inequality

p1 + p2 − p12 ≤ 1

hence it is not in c(n, S) and, consequently, it does not admit a Kolmogorovian representation.
However, p admits a conditional Kolmogorovian representation with the following atomic events
and probabilities:

p(A1 ∧ A2 ∧ a1 ∧ a2) = 1/25
p(A1⊥ ∧ A2⊥ ∧ a1 ∧ a2) = 4/25
p(A1 ∧ A2⊥ ∧ a1 ∧ a2⊥) = p(A1⊥ ∧ A2 ∧ a1⊥ ∧ a2) = 9/25
p(A1⊥ ∧ A2⊥ ∧ a1 ∧ a2⊥) = p(A1⊥ ∧ A2⊥ ∧ a1⊥ ∧ a2) = 1/25

(The probability of all other atomic events is 0.)
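The atomic probabilities of footnote 3 (as reconstructed above from the extraction) can be checked mechanically; this sketch of ours, using exact fractions, recovers p1 = p2 = 2/3 and p12 = 1/5 as conditional probabilities:

```python
# Verify the conditional Kolmogorovian representation of p = (2/3, 2/3; 1/5).
from fractions import Fraction as F

# atoms: (A1, A2, a1, a2) -> probability; 1 = event occurs, 0 = its complement
atoms = {
    (1, 1, 1, 1): F(1, 25),
    (0, 0, 1, 1): F(4, 25),
    (1, 0, 1, 0): F(9, 25),   # A1 & not-A2 & a1 & not-a2
    (0, 1, 0, 1): F(9, 25),
    (0, 0, 1, 0): F(1, 25),
    (0, 0, 0, 1): F(1, 25),
}

IDX = {'A1': 0, 'A2': 1, 'a1': 2, 'a2': 3}

def prob(**cond):
    """Probability of the event fixed by cond, e.g. prob(A1=1, a1=1)."""
    return sum(pr for atom, pr in atoms.items()
               if all(atom[IDX[k]] == v for k, v in cond.items()))

p1 = prob(A1=1, a1=1) / prob(a1=1)
p2 = prob(A2=1, a2=1) / prob(a2=1)
p12 = prob(A1=1, A2=1, a1=1, a2=1) / prob(a1=1, a2=1)
```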
Contrary to c(n, S), the polytope u(n, S) is not a simplex, hence the expansion of
the correlation vectors in u(n, S) by vertex vectors is typically not unique.4
Now, the set of correlation vectors admitting a conditional Kolmogorovian
representation is dense in u(n, S): all interior points of u(n, S) admit a conditional
Kolmogorovian representation, and also all surface points except those for which
there is a pair (i, j) ∈ S such that either (i) pi = 0 or pj = 0 but pij ≠ 0; or
(ii) pi = pj = 1 but pij ≠ 1. Denote the set of correlation vectors admitting a
conditional Kolmogorovian representation by u (n, S).
Let us now turn to the common causal explanation of conditional correlations.
Let p be a proper correlation vector in u (n, S). Lying in u (n, S), the correlation
vector p represents the conditional probabilities of certain events and their conjunctions,
and the events associated to some pairs (i, j) ∈ S are conditionally correlated.
Interpret now the events Ai and ai in the context of physical experiments: Let ai
represent a possible measurement that an experimenter can perform on an object.
Then the event ai ∧ aj will represent the joint performance of measurements ai
and aj. Let furthermore the event Ai represent an outcome of measurement ai
4 For example:

p = (2/3, 2/3; 1/5)
  = (1/3)(0, 1; 0) + (1/3)(1, 0; 0) + (1/5)(1, 1; 1) + (2/15)(1, 1; 0)
  = (1/3)(0, 0; 0) + (1/5)(1, 1; 1) + (7/15)(1, 1; 0)

etc.
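Both expansions in footnote 4 can again be verified by a short computation (our own check, in exact arithmetic):

```python
# Two distinct convex expansions of (2/3, 2/3; 1/5) by vertex vectors of
# u(n, S) -- the polytope is not a simplex, so the expansion is not unique.
from fractions import Fraction as F

def mix(terms):
    """Convex combination of the listed (coefficient, vector) pairs."""
    return tuple(sum(lam * v[k] for lam, v in terms) for k in range(3))

exp1 = [(F(1, 3), (0, 1, 0)), (F(1, 3), (1, 0, 0)),
        (F(1, 5), (1, 1, 1)), (F(2, 15), (1, 1, 0))]
exp2 = [(F(1, 3), (0, 0, 0)), (F(1, 5), (1, 1, 1)), (F(7, 15), (1, 1, 0))]
```

Note that (1, 1; 0) is a vertex of u(n, S) but not a classical vertex vector, which is exactly why these non-classical expansions exist.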
Equations (16.22) express the Reichenbachian idea that the common cause is to
screen off all correlations. Equations (16.23) express the so-called no-conspiracy
condition, the idea that common causes should be causally, and hence probabilistically,
independent of the measurement choices. The common causal explanation is joint
since all correlations (16.21) have the same common cause.
Now, suppose the correlations in a given conditional Kolmogorovian representation
(16.21) of p in u (n, S) have a non-conspiratorial joint common causal
explanation. Introduce again the notation
and consider the k common cause vectors of the correlation vector p:

p^k = (p_1^k, . . . , p_n^k; . . . , p_i^k p_j^k, . . .)
Since common cause vectors are independence vectors lying in c(n, S), their convex
combination also lies in the classical correlation polytope c(n, S). Thus, p being a
classical correlation vector is a necessary condition for a conditional Kolmogorovian
representation of p to have a non-conspiratorial joint common causal explanation.
Conversely, knowing the k common cause vectors and the probabilities ck alone,
one can construct the classical probability space with the conditionally correlating
events Ai and ai and the common causes {Ck }.
Observe that the situation is now different from the one in the previous section:
the numbers pi and pij can be conditional probabilities of events and their
conjunctions even if they violate the corresponding Bell inequalities. However,
the conditional correlations between these events have a non-conspiratorial joint
common causal explanation if and only if the correlation vector composed of the
numbers pi and pij lies in the classical correlation polytope. In other words, in the
case of classical conditional probabilities Bell’s inequalities do not test conditional
Kolmogorovian representability but whether the correlations can be given a common
causal explanation.
for any (i, j ) ∈ S. Note that non-signaling is not a feature of the correlation vector
itself but of the representation. For the same correlation vector in c(n, S) one can
provide both non-signaling and also signaling representations.
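This representation-dependence can be illustrated with the conditional representation of footnote 3 (the atomic probabilities as reconstructed there): it reproduces the right conditional probabilities, yet p(A1 | a1 ∧ a2) ≠ p(A1 | a1 ∧ a2⊥), so that particular representation is signaling. A verification sketch of ours:

```python
# Signaling check on the footnote-3 representation of p = (2/3, 2/3; 1/5).
from fractions import Fraction as F

# atomic probabilities: (A1, A2, a1, a2) -> probability
atoms = {
    (1, 1, 1, 1): F(1, 25), (0, 0, 1, 1): F(4, 25),
    (1, 0, 1, 0): F(9, 25), (0, 1, 0, 1): F(9, 25),
    (0, 0, 1, 0): F(1, 25), (0, 0, 0, 1): F(1, 25),
}
IDX = {'A1': 0, 'A2': 1, 'a1': 2, 'a2': 3}

def prob(**cond):
    """Probability of the event fixed by cond, e.g. prob(A1=1, a1=1)."""
    return sum(pr for atom, pr in atoms.items()
               if all(atom[IDX[k]] == v for k, v in cond.items()))

both = prob(A1=1, a1=1, a2=1) / prob(a1=1, a2=1)    # p(A1 | a1 & a2)
alone = prob(A1=1, a1=1, a2=0) / prob(a1=1, a2=0)   # p(A1 | a1 & not-a2)
```

Here `both` and `alone` differ (1/5 versus 9/10), so the probability of the outcome A1 depends on whether the distant measurement a2 is performed.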
Now, a conditional Kolmogorovian representation can have a property explanation
only if it satisfies non-signaling. Recall namely that for a property {Ck} the
probabilities p_i^k are either 0 or 1; hence, using (16.22),

p(Ai | ai ∧ aj) = Σ_k p_i^k p(Ck | ai ∧ aj) =∗ Σ_{k: p_i^k = 1} p(Ck)

where the equation =∗ holds due to no-conspiracy (16.23) and the symbol k: p_i^k = 1
means that we sum up for all k for which p_i^k = 1. Since the right-hand side does
not depend on j, the representation is non-signaling. Similarly, using (16.22)–(16.23)
one obtains

pij = Σ_{k: p_i^k = 1, p_j^k = 1} p(Ck)

Introducing the events

Ci = ∨_{k: p_i^k = 1} Ck
Ci ∧ Cj = ∨_{k: p_i^k = 1, p_j^k = 1} Ck

and specializing to properties indexed by ε ∈ {0, 1}^n with p_i^ε = εi and p(Cε) = λε,
one obtains

pi = p(Ci) = Σ_{ε: εi = 1} p(Cε) = Σ_{ε: εi = 1} λε    (16.27)
pij = p(Ci ∧ Cj) = Σ_{ε: εi = 1, εj = 1} p(Cε) = Σ_{ε: εi = 1, εj = 1} λε    (16.28)
Now, consider a non-signaling conditional Kolmogorovian representation of p,
that is, let there be events Ai and ai in Σ such that

pi = p(Ai|ai)
pij = p(Ai ∧ Aj|ai ∧ aj)
Then, extend the algebra Σ to the algebra generated by the atomic events Dε,ε′ defined as
follows:
where the numbers p_i^ε ∈ [0, 1] will be specified below. Now, using (16.29)–(16.33)
one obtains
That is, the partition {Cε} provides a propensity explanation for the conditional
Kolmogorovian representation of p with probabilities specified in (16.27)–(16.28).
Now,

p(Ai | aε) = p(Ai ∧ aε)/p(aε) = Σ_{ε′} p(Ai ∧ Dε′,ε)/p(aε)
           = Σ_{ε′} p(Ai | Dε′,ε) p(Dε′,ε)/p(aε)
           = Σ_{ε′} p_i^{ε′} p(aε) λ_{ε′}/p(aε) = Σ_{ε′} p_i^{ε′} λ_{ε′}

that is

pi = p(Ai|ai) = Σ_ε p_i^ε λε    (16.34)

and similarly

pij = p(Ai ∧ Aj|ai ∧ aj) = Σ_ε p_i^ε p_j^ε λε    (16.35)
This means that the numbers p_i^ε are to be taken from [0, 1] such that the
2^n independence vectors p^ε provide a convex combination for p with the same
coefficients λε as the vertex vectors u^ε do. One such expansion always exists,
namely p^ε = u^ε. In this case p_i^ε = εi for any i = 1 . . . n and ε ∈ {0, 1}^n,
and (16.34)–(16.35) read as follows:

pi = p(Ai|ai) = Σ_ε εi p(Cε) = p(Ci)
pij = p(Ai ∧ Aj|ai ∧ aj) = Σ_ε εi εj p(Cε) = p(Ci ∧ Cj)
This is why we use the term “property” both for a deterministic common cause in the
conditional Kolmogorovian representation and also as a synonym of “event” in the
unconditional Kolmogorovian representation.
Again start with real numbers pi and pij in [0, 1] with i = 1 . . . n and (i, j ) ∈ S.
But now the question is the following: when can these numbers be quantum
probabilities? That is, given the numbers pi and pij , do there exist projections
Ai (i = 1 . . . n) representing quantum events in a quantum probability space
(P(H ), ρ), where P(H ) is the projection lattice of a Hilbert space H and ρ is a
density operator on H representing the quantum state, such that pi and pij are the
following quantum probabilities:
pi = Tr(ρAi )
pij = Tr(ρ(Ai ∧ Aj ))
The quantum vertex vectors are those vectors u^ε for which

u_i^ε = εi    i = 1 . . . n
u_ij^ε ≤ εi εj    (i, j) ∈ S

Obviously, any classical vertex vector is a quantum vertex vector and any quantum
vertex vector is a vertex vector, but the reverse inclusions do not hold. In R(n, S)
with n = 2 and S = {(1, 2)}, for example, the vertex vector (1, 1; 0) is quantum but not
classical, and the vertex vectors (0, 0; 1), (1, 0; 1), (0, 1; 1) are not even quantum.
Denote by q(n, S) the convex hull of quantum vertex vectors. Pitowsky then
shows that almost all correlation vectors in q(n, S) admit a quantum representation.
More precisely, Pitowsky shows that—denoting by q (n, S) the set of quantum
correlation vectors, that is, the set of those correlation vectors which admit a quantum
representation—the following holds:
(i) c(n, S) ⊂ q (n, S) ⊂ q(n, S);
(ii) q (n, S) is convex (but not closed);
(iii) q (n, S) contains the interior of q(n, S).
We can add to this our result in the previous section:
(iv) q(n, S) ⊂ u (n, S) ⊂ u(n, S)
Thus, the set of numbers admitting a quantum representation is strictly larger than
the set of numbers admitting a Kolmogorovian representation but strictly smaller
than the set of numbers admitting a conditional Kolmogorovian representation.
Let us turn now to the question of the common causal explanation. Let p be a
proper quantum correlation vector. p then represents the quantum probabilities of
certain quantum events and their conjunctions and the events associated to some
pairs (i, j ) ∈ S are correlated:
where

ρ^k := (Ck ρ Ck) / Tr(ρ Ck)
p_i^k = Tr(ρ^k Ai)
c_k = Tr(ρ Ck)

and consider the k common cause vectors of the correlation vector p:

p^k = (p_1^k, . . . , p_n^k; . . . , p_i^k p_j^k, . . .)
The common cause vectors are independence vectors. Hence, using (16.37) and the
theorem of total probability, the convex combination of the common cause vectors

p^c = Σ_k c_k p^k    (16.38)

will be in c(n, S). However, p^c is not necessarily identical with the original
correlation vector p. More precisely, p^c = p for any ρ if {Ck} are commuting
common causes. But if {Ck} are noncommuting common causes, then p^c and p
can be different, and hence even if p^c ∈ c(n, S), p might be outside c(n, S).
In short, a quantum correlation vector can have a joint noncommuting common
causal explanation even if it lies outside the classical correlation polytope. The
correlation vector p is confined in c(n, S) only if the common causes are required
to be commuting.5
To sum up, in the case of quantum probabilities we found a scenario different from
both previous cases. Here Bell’s inequalities neither put a constraint on whether
numbers can be quantum probabilities nor on whether the correlation between
events with the prescribed probability can have a common causal explanation.
Bell’s inequalities constrain common causal explanations only if common causes
are understood as commuting common causes.
Perhaps it is worth mentioning that in algebraic quantum field theory (Rédei
and Summers 2007; Hofer-Szabó and Vecsernyés 2013) and quantum information
theory (Bengtsson and Życzkowski 2006) the violation of Bell’s inequalities composed
of quantum probabilities is used for another purpose: it places a bound on
the strength of correlations between certain events. Abstractly, one starts with two
mutually commuting C∗-subalgebras A and B of a C∗-algebra C and defines a Bell
operator R for the pair (A, B) as an element of the following set:

B(A, B) := { (1/2)[A1(B1 + B2) + A2(B1 − B2)] | Ai = Ai∗ ∈ A; Bi = Bi∗ ∈ B; −1 ≤ Ai, Bi ≤ 1 }

where 1 is the unit element of C. Then one can prove that |φ(R)| ≤ √2 for any Bell
operator R and any state φ; but |φ(R)| ≤ 1 for separable states (i.e. for convex
combinations of product states).
In other words, in these disciplines one fixes a Bell operator (a witness operator)
and scans over the different quantum states to see which of them is separable.
In this reading, Bell’s inequalities neither test Kolmogorovian representation nor
common causal explanation but separability of certain normalized positive linear
functionals. Obviously, this role of Bell’s inequalities is completely different from
the one analyzed in this paper.
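The √2 bound can be checked numerically for a concrete Bell operator. The sketch below uses the standard CHSH choice of spin observables (our own choice, not taken from the text) and computes the largest |φ(R)| over all states as the spectral norm:

```python
# Bell operator R = (1/2)[A1(B1 + B2) + A2(B1 - B2)] built from standard
# CHSH spin observables; its spectral norm is sqrt(2).
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)
s2 = 1 / np.sqrt(2)

A1 = np.kron(sx, I2)                 # A's act on the first tensor factor
A2 = np.kron(sz, I2)
B1 = np.kron(I2, s2 * (sx + sz))     # B's act on the second tensor factor
B2 = np.kron(I2, s2 * (sx - sz))

R = 0.5 * (A1 @ (B1 + B2) + A2 @ (B1 - B2))
norm = max(abs(np.linalg.eigvalsh(R)))   # sup over states of |phi(R)|
```

Here `norm` comes out as √2, saturated by the singlet state, while product states stay within 1.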
Bell’s inequalities in the EPR-Bohm scenario are standardly said to be violated. But
which Bell inequalities and what does this violation mean?
5 For the details and for a concrete example see Hofer-Szabó and Vecsernyés (2013, 2018).
Let us start in the reverse order, with the quantum representation. Consider the
EPR-Bohm scenario with pairs of spin-1/2 particles prepared in the singlet state. In
quantum mechanics the state of the system and the quantum events are represented
by matrices on M2 ⊗ M2, where M2 is the algebra of the two-dimensional complex
matrices. The singlet state is represented as:

ρs = (1/4)(1 ⊗ 1 − Σ_{k=1}^{3} σk ⊗ σk)
and the event that the spin of the particle is “up” on the left wing in direction ai
(i = 1, 2); on the right wing in direction bj (i = 3, 4); and on both wings in
directions ai and bj , respectively, are represented as:
1
Ai = (1 + ai · σ ) ⊗ 1
2
1
Aj = 1 ⊗ (1 + bj · σ )
2
1
Aij = (1 + ai · σ ) ⊗ (1 + bj · σ )
4
1
pi = Tr(ρ s Ai ) = (16.39)
2
1
pj = Tr(ρ s Aj ) = (16.40)
2
1 θ ij
pij = Tr(ρ s Aij ) = sin2 (16.41)
2 2
where θij denotes the angle between directions ai and bj. As is well known, for
the measurement directions

a1 = (0, 1, 0)        b3 = (1/√2)(1, 1, 0)
a2 = (1, 0, 0)        b4 = (1/√2)(−1, 1, 0)

the Clauser–Horne inequality

−1 ≤ p13 + p23 + p14 − p24 − p1 − p3 ≤ 0      (16.42)

is violated, since the middle expression equals −(1 + √2)/2 < −1.
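The violating value −(1 + √2)/2 can be reproduced directly from Eqs. (16.39)–(16.41); a minimal numerical check (the helper names are my own):

```python
import numpy as np

def p_joint(theta):
    """Joint 'up-up' probability in the singlet state, Eq. (16.41)."""
    return 0.5 * np.sin(theta / 2) ** 2

def angle(u, v):
    """Angle between two direction vectors."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return np.arccos(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a1, a2 = [0, 1, 0], [1, 0, 0]
b3 = [1 / np.sqrt(2), 1 / np.sqrt(2), 0]
b4 = [-1 / np.sqrt(2), 1 / np.sqrt(2), 0]

p1 = p3 = 0.5                                   # Eqs. (16.39)-(16.40)
lhs = (p_joint(angle(a1, b3)) + p_joint(angle(a2, b3))
       + p_joint(angle(a1, b4)) - p_joint(angle(a2, b4)) - p1 - p3)

print(lhs)      # -(1 + √2)/2 ≈ -1.2071, below the lower bound -1
```

Three of the four angles are 45° and one is 135°, which is what pushes the sum below −1.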
What does the violation of the Clauser-Horne inequality (16.42) mean with
respect to the quantum representation? As clarified in Sect. 16.5, it does not
mean that the numbers (16.39)–(16.41) cannot be given a quantum mechanical
representation. Equations (16.39)–(16.41) just provide one. On the other hand, the
violation of (16.42) does not mean either that the EPR correlations cannot be
interpreted as conditional probabilities:

pi = p(Ai | ai)
pj = p(Bj | bj)
pij = p(Ai ∧ Bj | ai ∧ bj)

where ai denotes the event that the spin of the left particle is measured in direction
ai, and Ai denotes the event that the outcome of this measurement is "up".
(Similarly for bj and Bj.) That is, the probabilities are conditional probabilities of
certain measurement outcomes provided that certain measurements are performed.
However, the violation of the Clauser–Horne inequality (16.42) does exclude that
the conditionally correlating pairs of outcomes can be given an unconditional
(property) representation:

pi = p(Ai)
pj = p(Bj)
pij = p(Ai ∧ Bj)

Hence, the violation of (16.42) again excludes a property explanation for any
non-signaling conditional Kolmogorovian representation.

6 More than that, we have also shown that in an algebraic quantum field theoretic setting the
common causes can even be localized in the common past of the correlating events.
7 The problem with such noncommuting common causes, however, is that it is far from being clear
16.7 Conclusions
What does it mean that a set of numbers violates Bell’s inequalities? One can answer
this question in three different ways depending on whether the numbers are inter-
preted as classical unconditional probabilities, classical conditional probabilities, or
quantum probabilities:
(i) In the first case, the violation of Bell's inequalities excludes interpreting
these numbers as the classical (unconditional) probabilities of certain
events/properties and their conjunctions. The satisfaction of Bell's inequalities
374 G. Hofer-Szabó
not only guarantees the existence of such events, but also the existence
of a set of joint common causes screening off the correlations between these
events.
(ii) In the second case, the violation of Bell’s inequalities does not exclude that
these numbers can be interpreted as the classical conditional probability of
certain events (measurement outcomes) conditioned on other events (measure-
ment settings). However, it does exclude a non-conspiratorial joint common
causal explanation for the conditional correlations between these events.
(iii) Finally, the violation or satisfaction of Bell's inequalities has no bearing either
on whether a set of numbers can be interpreted as quantum probabilities of
certain projections, or on whether a joint common causal explanation can be
given for the correlating projections, as long as noncommuting common
causes are adopted in the explanation.
Acknowledgements This work has been supported by the Hungarian Scientific Research Fund,
OTKA K-115593 and a Research Fellowship of the Institute of Advanced Studies Köszeg. I wish
to thank Márton Gömöri and Balázs Gyenis for reading and commenting on the manuscript.
References
Bengtsson, I., & Życzkowski, K. (2006). Geometry of quantum states: An introduction to quantum
entanglement. Cambridge: Cambridge University Press.
Gömöri, M., & Placek, T. (2017). Small probability space formulation of Bell’s theorem. In G.
Hofer-Szabó & L. Wronski (Eds.), Making it formally explicit – Probability, causality and
indeterminism (European studies in the philosophy of science series, pp. 109–126). Dordrecht:
Springer Verlag.
Hofer-Szabó, G., & Vecsernyés, P. (2012). Noncommuting local common causes for correlations
violating the Clauser–Horne inequality. Journal of Mathematical Physics, 53, 122301.
Hofer-Szabó, G., & Vecsernyés, P. (2013). Bell inequality and common causal explanation in
algebraic quantum field theory. Studies in History and Philosophy of Science Part B – Studies
in History and Philosophy of Modern Physics, 44(4), 404–416.
Hofer-Szabó, G., & Vecsernyés, P. (2018). Quantum theory and local causality (SpringerBriefs in
Philosophy). Cham: Springer.
Hofer-Szabó, G., Rédei, M., & Szabó, L. E. (1999). On Reichenbach’s common cause principle and
on Reichenbach’s notion of common cause. The British Journal for the Philosophy of Science,
50, 377–399.
Hofer-Szabó, G., Rédei, M., & Szabó, L. E. (2013). The principle of the common cause.
Cambridge: Cambridge University Press.
Pitowsky, I. (1989). Quantum probability – Quantum logic. Dordrecht: Springer.
Rédei, M., & Summers, S. J. (2007). Quantum probability theory. Studies in History and
Philosophy of Modern Physics, 38, 390–417.
Reichenbach, H. (1956). The direction of time. Los Angeles: University of California Press.
Szabó, L. E. (2008). The Einstein-Podolsky-Rosen argument and the Bell inequalities. Internet
encyclopedia of philosophy, http://www.iep.utm.edu/epr/.
Chapter 17
On the Descriptive Power of Probability
Logic
Ehud Hrushovski
Logic is often defined as the “systematic study of the form of valid inference”.1 “Let
C be the planets, B to be near, A not to twinkle,” wrote Aristotle; you can then study
the implication from the premises (∀x)(A(x) → B(x)) and (∀x)(C(x) → A(x)),
in modern notation, to the conclusion that the planets are near. It is the pride and
weakness of his method that "no further term is required from without in order to
make the consequence necessary" (Aristotle 1906). By the end of the argument, we
may learn a new fact about B and C; we will not be introduced to any new idea D.

E. Hrushovski, Mathematical Institute, Oxford, UK
Modern logic has, it seems to me, an additional role: the introduction of new
concepts. The primary use of the first-order relational calculus is not for judging the
wisdom of asserting a certain hypothesis, whose meaning is assumed clear. It is first
of all a tool for carving out new ideas and dividing lines, about which assertions
can subsequently be formulated. An ancient example is the notion of parallel lines,
defined from a basic relation of incidence between a point and a line. With the aid of
a quantifier, concerning the (non-) existence of an intersection point, a new relation
– parallelism – has been defined from one taken to be more basic (a point lying on
a line). Taking such definitions further, Descartes shows how to interpret algebraic
operations in geometry, in such a way that one can turn back and define the geometry
in the algebra. Or to move to the nineteenth century, take the definition of continuity
of a function F :
(∀x)(∀ε > 0)(∃δ > 0)(∀y)(d(x, y) < δ → d(F (x), F (y)) < ε)

which relates three variables, x, ε, δ, in a new way, and we can now afford to quantify one
out. By contrast, if we only have unary relations, quantifiers produce nothing new
beyond sentences; there the monadic expressive power ends. We obtain assertions
whose truth we can try to decide, but no new definitions, no new distinctions among
elements populating the given world. This is the fundamental difference between
Aristotle’s logic, and the modern version.
It is easy to multiply examples of the birth of new concepts in first-order
language; mathematics forms a vast web of fields of discourse, whose basic concepts
17 On the Descriptive Power of Probability Logic 377
are introduced and inter-related in this manner. Sets are sometimes used, but set
theory itself is understood via first-order logic; the primary use of the assumption
that Y consists of all subsets of X is the first-order comprehension axiom.
The interpretation of mathematics in set theory does show, in one way among
many, that one binary relation – the smallest non-monadic relational language –
suffices to interpret anything we like. And having properly started, Gödel’s theorem
tells us it would be impossible to set fixed limits to this process.
The mathematics used in physics is no exception. Science is only at the very
end (if that) about checking the truth of universal sentences. There is first the step
of recognizing or creating the notions, in terms of which the theory is formulated
and may yield universal consequences. Regular viewings of a light point in the
night sky are interpreted in terms of a celestial body; an orbital ellipse describes its
movement; a differential equation predicts the ellipse. Newton’s laws are phrased
in terms of derivatives; when fully clarified, they are formulated in terms of tangent
and cotangent spaces, vector fields and flows and symplectic forms. The definition
of all of these, and their connection to the simpler mathematical objects that have
a direct interpretation – the distance a stone will fly – is carried out using the first-
order quantifiers.
It is in this light that I would like to consider probability logic, where the
absolute quantifiers are replaced by numerical ones giving probabilities.2 In the
setting of Mendelian genetics, say, an ordinary quantifier may be used to ask
whether a yellow pea ever descends from two orange ones. A probability quantifier
rather asks: what proportion of such descendents is yellow? If we pick one at
random, what is the likelihood that it is yellow? Robust logical calculi where
such expressions are permitted have been constructed.3 Such systems admit a
completeness theorem, so that the (false) dichotomy between probabilistic and
measure-theoretic language becomes a useful duality, with the measure theory on
the semantic side complementing the random-variable viewpoint in the language.
They are applicable when a natural measure forms part of the structure of the
universe we are trying to describe.
In certain circumstances, probability quantifiers significantly enrich the power of
the logic. A trivial way this may happen is at the very reading of the initial data. A
graph on a large number n of points, obtained by a random choice of edges, looks
the same to first-order eyes regardless of whether the edge probability is 1%, or
99%, provided n is big enough compared to the formulas considered. Probability
quantifiers, by their design, will of course directly and immediately detect the
difference.
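The random-graph point can be made concrete with a small simulation; the sampling scheme below is my own illustration, not from the text. The empirical value of the probability quantifier "a random pair of points is joined by an edge" immediately separates the two regimes:

```python
import random

def edge_quantifier(n, p, seed=0):
    """Sample an Erdős–Rényi graph G(n, p) and return the empirical value of
    the probability quantifier 'a random pair of points is joined by an edge'."""
    rng = random.Random(seed)
    pairs = n * (n - 1) // 2
    edges = sum(rng.random() < p for _ in range(pairs))
    return edges / pairs

# To first-order eyes these two graphs look alike once n is large;
# the probability quantifier separates them at once.
print(edge_quantifier(500, 0.01))    # ≈ 0.01
print(edge_quantifier(500, 0.99))    # ≈ 0.99
```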
But is probability logic of use beyond the simple access to the basic data? Does it
also help in the formulation of new concepts, new relations, and if so, is it possible
to say in what way? This is the question we attempt to answer here. This is quite
different from the obvious importance of probability or measure as mathematical
tools in their own right, studied from the outside using ordinary quantifiers; we are
interested in the creative power coded within that calculus.
Consider this thought-experiment. You are plunged into a world of which nothing
is known; or perhaps you sit outside, but have the possibility of probing it. It is
a Hilbert-style world, of the general kind of his axiomatization of geometry; with
recognizable ‘things’ (perhaps points and lines, ‘material particles’ (Maxwell 1925),
planets, packets of radiation, small colored pixels, …). Some basic relations among
them – we will start with unary and binary relations – can be directly sensed. If
you are unlucky, your world consists of Kolmogorovian noise; no pattern can be
inferred, no means of deducing any additional observations from some given ones.
But if there is a pattern, how much of it you will discern depends not only on your
eyes, but on the logical language you have available. Quite possibly, the laws of the
world in question can only be formulated in terms of concepts and structures that
are not immediately given in the basic relations. If you use first-order logic, starting
with what seem to be points and lines, interpreted in algebra and leading to number
theory, such entities may be discovered and analyzed; there is no limit to the level
of abstraction that may be needed. Is there a similar phenomenon if your logic is
purely stochastic? What features will you be able to make out?
This essay is conceived as a dialogue between a mathematical logician (L), and
a philosopher of physics (P), sometimes joined by P’s brilliant student (S). The
philosopher, over L’s objections that he is engaged in purely technical lemmas, asks
L to present his thoughts on the expressive power of probability logic. This leads
to a number of conversations, in which the interlocutors attempt a philosophical
interpretation of the lemmas in question, and their mathematical background.
To gain insight into this question, they study a purified extract: a logic
with probability quantifiers replacing the usual ones entirely.
Their first conclusion is that the probability logic considered excels in char-
acterizing, and recognizing, space-like features. If the given universe includes a
metric giving a notion of distance, the metric space may be characterized by the
stochastic logical theory (a theorem of Gromov.) Under certain finite-dimensionality
assumptions, the logic is sometimes able to distinguish infinitesimal features, even
from coarse-grained data: if the metric is not directly given at the smallest scale, but
only some mid-scale relation depending on it is directly observable, nevertheless the
probability quantifiers can penetrate into the small scale structure of the world. If
we further assume that the universe (with respect to certain relations) has the same
statistical properties at every point, it can be shown to approximate a Riemannian
homogeneous space, a rather unique kind of mathematical object.
But there it ends. Probability logic can describe a locally compact space-like
substrate of a given relational world; and color this space by the local correlation
functions of some conjunctions of the basic relations. Beyond this, it can go no
further on its own. Conditioned on this spatial structure, all (or almost all) events
appear statistically independent; hidden structures beyond the monadic space may
exist, but cannot be detected with stochastic logic.
In some cases, hints of a deeper structure can be gleaned from the probability-
zero sets where the statistical independence breaks down; they must be analyzed by
other means. The general formulation of concepts in probability logic, with both
ordinary and probability quantifiers present, can be viewed then as an iteration
of a first-order layer, then a purely probabilistic one, and so on. The role of the
probability quantifiers appears restricted to a determination of the space associated
with the current conceptual framework, along with some correlations along it;
certain measure zero sets suggest new phenomena to be explored, once again with
first order logic.
Does the picture change if we use ternary or higher relations? Here a full
mathematical description is not yet available; but it is clear that the expressive power
of pure probability logic remains limited, and perhaps again restricted to specifying
parameters for a somewhat more generalized notion of space; the new phenomena
that occur will be briefly considered.
This essay is dedicated to the memory of my friend Itamar Pitowsky. The first
lines of the dialogue really do represent a meeting with Itamar, in the academic
year 2009/2010, on the Givat Ram campus. I was already thinking about related
questions, and Itamar was – as always – interested. But we first diverged to different
subjects, and the rest could never materialize in reality; these dialogues are purely
imaginary.
Needless to say, P does not stand for Pitowsky; I could not possibly, and did not
try to, guess Itamar's real reactions.
One of the points that arises is the distinction between monadic logic and full
first order logic, and the apparent non-existence of any intermediate (beyond the
relativistic correction that is discussed in the text). There is at least an analogy,
probably a deeper connection, between this and the question of existence of a
mathematical world lying strictly between linear algebra (linear equations) and full
algebraic geometry (all algebraic equations.) In this Itamar was interested over many
years, for reasons stemming (in part) from his deep study of Descartes's geometry,
and the significance of quadratic equations there.4 I was likewise interested for
quite different reasons – such a world flickers briefly, intriguingly, in and out
of existence within Zilber’s proof5 (by contradiction) of the linearity of totally
categorical structures. I am still not aware of such a world, or a framework in which
it may be formulated.
P. L! You are back. What are you thinking about these days?
L. Actually, it is related to probability logic; but purely classical, Kolmogorov-
style probability, I’m afraid. A certain statement I can’t quite put my hands on;
perhaps it will end up requiring some higher-categorical language.
6 See Chang and Keisler (1966), Ben Yaacov et al. (2008), and Ben Yaacov and Usvyatsov (2010).
7 Ben Yaacov (2008). One way to present such structures is to introduce a metric and use it at both
ends, with 0 an ideal limit of increasing resolution, and ∞ an ideal upper limit of distances; so
formulas, when considered up to a given precision, can only access a bounded region, as well as
only distinguish points at some resolution bounded above 0.
Now suppose A is an elementary submodel in the usual 2-valued sense, and let
us use your coding. If R(b) takes a real value α = 0.abc··· and A is an elementary
submodel, then one can find b′ in A with value 0.abc···, agreeing with α for as
many digits as we wish. In particular, R(b′) is arbitrarily close to α; this is indeed
what we would want. But the fact that the specifically chosen binary expansions
agree is very slightly stronger, and not a natural demand.
P. I see. So the right notion of approximation in continuous logic is simply this:
A is an elementary submodel of B if for any formula φ(x) with
parameters from A, if φ(b) has value r ∈ R, then for any ε > 0 one can find b′ in A
with φ(b′) of value between r − ε and r + ε.
S. I think I can guess what connectives must be. Returning to sequences of
formulas in ordinary logic, say R1 , R2 , · · · and S1 , S2 , · · · , we can produce a
sequence T1 , T2 , · · · where each Tn is a Boolean function of finitely many Ri and
Sj ; say T1 = R1 &S1 , T2 = R1 ∨ S3 , and so on. In terms of binary expansions of
numbers, we obtain any function C(x, y) such that the n’th binary digit of C(x, y)
depends only on the first m digits of x and of y, for some m. I suppose the continuous
logic analogue would be any function from R² to R, whose value up to an ε-accuracy
depends only on the inputs up to some δ?
L. Just so; in other words, you are simply describing a continuous function.
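S's digit-wise coding can be made concrete. A small sketch (the helper names are my own) of a connective C(x, y) whose n-th binary digit depends only on the n-th digits of x and y:

```python
def digits(x, m):
    """The first m binary digits of x in [0, 1)."""
    out = []
    for _ in range(m):
        x *= 2
        d = int(x)
        out.append(d)
        x -= d
    return out

def combine(x, y, op, m=20):
    """The real whose n-th binary digit is op applied to the n-th digits of
    x and y: a 'connective' reading only finitely many digits per output digit."""
    z = 0
    for dx, dy in zip(digits(x, m), digits(y, m)):
        z = 2 * z + (1 if op(dx, dy) else 0)
    return z / 2 ** m

AND = lambda a, b: a and b
# 0.625 = 0.101..., 0.375 = 0.011...; digit-wise AND gives 0.001... = 0.125
print(combine(0.625, 0.375, AND))   # 0.125
```

As L observes, the continuous-logic analogue of such digit-wise operators is just a continuous function of the two truth values.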
P. Isn’t it excessive to allow any continuous function as a connective? True, in the
two-valued case, too, we have an exponential number of n-place connectives. But
we can choose a small basis, for instance conjunction and negation, that generate
the rest.
L. In the continuous logic case, one can also approximate any connectives with
arbitrary accuracy using a small basis, for instance just the binary connective giving
the minimum of two values, multiplication by rationals as unary operators, and
rational constants.
P. And what about quantifiers?
L. Here a natural basis consists of sup and inf; see Ben Yaacov and Usvyatsov
(2010). There is an equivalent, perhaps more philosophically satisfying approach in
Chang and Keisler (1966), where a quantifier is seen to be an operator from a set
of values to a single value, continuous in the appropriate sense. Thus the existential
quantifier is viewed as the operator telling us if a set includes 1 or not; (∃y)φ(y) is
true in M if 1 ∈ {[φ(a)] : a ∈ M}.
P. After describing the usual predicate calculus, one steps back into the metalan-
guage to define the notion of logical validity.
L. It looks the same here. One can ask about the set of possible truth values of a
sentence σ in models M; or for simplicity, just whether that set reduces to a single
element {α}. This will be the case if and only if for any given ε > 0, a syntactic
proof can be found (in an appropriate system) that shows the value to be within ε
of α.
In fact all the major theorems of first order logic persist, as if they were made
for this generalization. Compactness, effectiveness statements remain the same;
notions such as model completeness and criteria for quantifier elimination; stability
and beyond. It took substantial work of a number of model theorists over several
generations to get us used to the intrusion of real numbers; but once the right point of
view was found, it became clear that continuous logic is a very gentle generalization
of the old.
P. We have so far paid no attention to equality.
L. The analogue of equality in continuous logic can be taken to be a metric, the
data of the distance between any two points. This leads to completions, which do
not appear in the first-order case, and do sometimes require more care. But issues of
this type will not concern us here.
S. And what is the gain in moving to continuous logic?
L. As you know, unlike the ‘total’ foundations of mathematics envisaged by
Russell and Whitehead, or Hilbert, model theory has to tame one area at a time;
using continuous logic permitted a great expansion of the repertoire. See for instance
Ben Yaacov et al. (2008). Once one has continuous logic, Robinson’s phrase, that
model theory is the metamathematics of algebra, can be extended to meaningful
parts of analysis. From another angle, constructions such as the monadic space of
a theory, while first encountered in the discrete setting (Shelah 1990; Lascar 1982;
Kim and Pillay 1997), are more naturally presented in the continuous logic setting.
P. It also makes mathematical logic look much more like the foundations of
physics. In a structure one takes an atomic formula, and upon specifying parameters,
obtains a value. In classical physics, we have ‘observables’, often depending on
parameters (such as a test function supported in a small region around some point of
space), and we obtain a numerical answer. And a given experiment will confirm a
theory, or measure the value of an observable or a fundamental constant, only to a
given degree of approximation.
L. For many years I was puzzled by the fact that the simplest physical models,
and with them many areas of mathematics, seem not to be approachable by first-
order logic, without the intermediary of a large measure of set theory. The latter
seems to be used only as a scaffolding, yet it makes model-theoretic decidability
results impossible. It is surprising to what extent this impression was due to no
more than a slight discrepancy in the formalization, correctible without any change
of the superstructure it supports.
P. I wonder to what extent the role of Hilbert in the formalization of both
mathematical logic and mathematical physics has to do with this parallelism, later
broken and, I see, mended.8
8 Corry (2004) has an interesting account of the proximity of the two in Hilbert’s own thought.
Hilbert’s foundations of specific subjects look on this account very much like a direct predecessor
of the model-theoretic program of finding ‘tame’ axiomatizations of wide (but circumscribed) areas
of mathematics; while his ‘continuity axiom’ in this context looks like a part of a description of
continuous logic: If a sufficiently small degree of accuracy is prescribed in advance as our condition
for the fulfilment of a certain statement, then an adequate domain may be determined, within which
one can freely choose the arguments [of the function defining the statement], without however
deviating from the statement, more than allowed by the prescribed degree. (Hilbert 1905, cited in
Corry (2004).)
S. But still, in the specific setting of probability logic, can I just ignore this issue
if I pretend that probabilities are meaningful only to a given precision (say to a
single atom in a mole)? Can I then use a finite coding of probabilities by ordinary
first order formulas?
L. Yes, if you make this assumption you can ignore today’s discussion. When
probabilities behave discretely, they can be treated using ordinary predicates.
Conversely, ordinary quantifiers are clearly special cases of probability quantifiers,
assuming there is a gap between 0 and the next highest possible probability for a
given question. Yet the results we will discuss are already meaningful, even under
such restrictions.9
P. But even if we ignore real-valued logic, taking all probabilities to be rational,
this discussion was useful in order to remind us of the significance of the notion of
approximation, which is surely just as important in discrete first order logic.
I think it underlies most achievements of modern logic, in Gödel’s proof of the
consistency of the continuum hypothesis no less than the Ax-Kochen asymptotic
theory of p-adic fields, to choose two random examples.
L. I agree. Is this notion as pervasive in the philosophy of physics?
P. I think physicists never evoke a mathematical structure without being aware
of the meaning of approximations to the same. They may exhibit this awareness in
their practice, rather than in their definitions, which is why Dirac felt comfortable
with his δ-functions long before Schwartz.
It is also the standard model of the history of physics, as opposed to physics
proper. We have a succession of theories, approximating reality with respect to more
and more classes of phenomena, and to better and better accuracy. Until perhaps one
reaches a final theory, and needs only to fine-tune some parameters, approaching
their physical value.
L. Sometimes it seems to go the other way: the perfect mathematical model is
the theory; the universe is the rough approximation; it is understandable precisely
because it approximates a beautiful theory. A hill of sand has the perfect shape
predicted by a differential equation, whose terms are dictated by considerations
of weight and forces of the wind. But we know that the grains are discrete; the
differential equation (taken literally) is wrong. The right mathematical statement
is that the actual beach lies on a sequence of ideal ones, with sand made of ever
smaller grains; it is the limit shape of the various sand hills, that will really satisfy
the differential equation; reality is just a very good approximation to the ideal.
S. What notion of ‘limit shape’ do you have in mind here?
P. Perhaps some variation on Gromov-Hausdorff distance. In a known ambient
space, say 3-dimensional Euclidean space, Hausdorff showed how to view the space
9 There are in fact interesting mathematical worlds where probabilities are always rational, and
with bounded denominator in any given family. Thus for instance Chatzidakis et al. (1992) study
‘nonstandard finite fields’, infinite models of the theory of all finite fields; they carry a notion of
probability deriving from the counting-based measure on finite fields. There, the probability of
being a square is exactly 1/2, as is easy to see; the probability of x and x + 1 both being squares is a
little more complicated to analyze, but still equals exactly 1/4.
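Both constants in this footnote are easy to confirm empirically by counting in a single large finite field F_p (the particular prime below is an arbitrary choice of mine):

```python
p = 10007                                    # an arbitrary large prime
squares = {(x * x) % p for x in range(p)}    # the squares in F_p, including 0

frac_square = len(squares) / p               # exactly (p + 1) / (2p)
frac_both = sum(x in squares and (x + 1) % p in squares
                for x in range(p)) / p

print(frac_square)   # ≈ 1/2
print(frac_both)     # ≈ 1/4, up to an O(1/p) boundary correction
```

In the limit of nonstandard finite fields the counting measure makes these fractions exactly 1/2 and 1/4.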
Probabilities behave formally like a quantifier. Given a formula φ, with free variable
x and possibly some others y, one can consider another statement, asserting that the
probability – if one chooses x at random – that φ(x, y) be true, is α. This can be
viewed as a true/false statement about y and α, but is more conveniently construed
(as discussed above) as a statement about y, whose value is α; we can denote it as
Ex φ(x, y).
P. But how do you iterate quantifiers in this language? After all, the complexity
of first-order logic is created by quantifier alternation. And here you started with a
0/1-valued formula, and obtained a real-valued one.
L. It would be possible, though clumsy, to stick with 0/1-valued statements, and
then iterate them, in expressions such as: if we choose x at random, with probability
>.99, a further random choice of y will have probability <.01 of making φ true.
But it is more convenient to generalize probabilities to expectations. For a 0/1-
valued statement, the expectation of the truth value is the same as the probability.
But expectations make sense for a real-valued formula as well, and one can simply
write Ey (Ex φ(x, y)). These two formalisms convey the same information, one can
translate one into the other.
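The two formalisms are easy to compare on a toy finite structure. In the sketch below (the domain and the formula φ are invented for illustration) the expectation quantifier E_x is just an average over the domain, and the iterated E_y E_x is an average of averages:

```python
n = 30                                  # a toy finite domain {0, ..., n-1}

def phi(x, y):
    """A 0/1-valued formula phi(x, y), chosen arbitrarily for illustration."""
    return (x * y) % n < n // 3

def Ex_phi(y):
    """E_x phi(x, y): a real-valued formula in the free variable y."""
    return sum(phi(x, y) for x in range(n)) / n

# The iterated quantifier E_y E_x phi(x, y): a closed, real-valued sentence.
value = sum(Ex_phi(y) for y in range(n)) / n
print(value)

# The clumsier 0/1-valued alternative: "for more than half the y's, a random x
# makes phi true with probability > 0.3" -- a true/false sentence.
print(sum(Ex_phi(y) > 0.3 for y in range(n)) / n > 0.5)
```

For 0/1-valued φ the iterated expectation is simply the fraction of pairs (x, y) satisfying φ, which is the translation between the two formalisms.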
P. I see; we have a formal operator taking a formula φ(x, y, z) (in free variables
x, y, z say) to a formula Ex φ(x, y, z) with free variables y, z – indeed a kind
of quantifier. We view φ(x, b, c) as describing an event relative to the variable
x; depending on parameters b, c. Then Ex φ(x, b, c) gives the probability. Or
more generally, if φ(x, b, c) denotes a ‘weighted event’, or random variable, then
Ex φ(x, b, c) should be the expectation. The first axioms for “fields of probability”
in Kolmogorov (1950) are entirely algebraic; they already occur in some form in
Hilbert and earlier, cf. Corry (2004). It should be easy to formalize them here.
physics; the problem being not doubts about their existence, dependent on the axiom
of choice, but the fact that this axiom, when used to show existence of one set, tends
to bring with it a plethora of others, too similar to each other; such sets can rarely if
ever be uniquely characterized, so that hopes of a meaningful physical theory using
them seem forlorn in advance. But thinking about measures on Cartesian products
suggests a quite different perspective. There, non-measurability only means, in usual
usage: with respect to the product measure. It can simply indicate a binary relation
that does not reduce to unary ones.
Consider a family of finite structures, Fn approaching a structure F∗ . Each
Fn carries a probability measure induced by counting – a subset X has measure
|X|/|Fn |. As probability logic structures, they approach (F∗ , μ1 ) for some measure
μ1. Likewise, subsets R of the Cartesian plane F∗2 are always measurable for an
appropriate limit measure μ2 on F∗2 . But they are very rarely measurable for the
product measure μ1 × μ1 .
S. Are these exotic sets that are not measurable? Or everyday ones like the
diagonal?
L. The diagonal on F∗ itself, the graph of the equality relation, will be
measurable since it has measure zero. But thickenings of the diagonal – pairs of
elements at distance ≤ ε from each other, for some ε and an appropriate notion of
distance – will usually fail to be measurable. In case Fn is a group, there are natural
neighborhoods of the diagonal, of the form {(x, y) : x⁻¹y ∈ U}, where U is a
‘neighborhood’ of 1, of small but positive measure on Fn. If Fn is a sufficiently
noncommutative group,10 such neighborhoods will not be measurable. But there is
nothing pathological about them.
The specific case of non-measurable spin functions on a sphere may not be
attainable in a pseudo-finite manner, as suggested by results in Farah and Magidor
(2012). But the larger idea of a significant two-dimensional set, not measurable
for the product measure but restricting to a measurable set in every linear one-
dimensional section, attains in this light a different allure.
S. But let us go back to completeness. Is there also an incompleteness theorem?
Of course if one throws in the usual quantifiers, the usual incompleteness theorems
apply. But what if we use exclusively probabilistic quantifiers, replacing “all” by
“almost all” in the appropriate sense? Gödel showed that an alternation of n quantifiers
does not reduce to an alternation of n − 1; is the same true here, alternating “almost
surely” and “almost never”?
P. I would guess not. It is often pointed out (by Laplace as well as Keynes, if
memory serves me) that the knowledge of highly probable events – with probability
very close to 1 – suffices to determine all probabilities; this follows, by the law of
large numbers, provided we can discuss the probability of the outcome of a large
number of choices. The same principle can be used to read out, by measuring a single
probability, whether (for instance) for 99% of elements x there are about 3% of elements
10 I.e.
without a commutative subgroup of bounded index. The more precise statement is that the
map (x, y) → (x, yx) will not be measurable at the limit.
17 On the Descriptive Power of Probability Logic 387
P. You said probability logic is good at discerning spatial structures. Could you give
some examples?
L. Let us start with Gromov’s reconstruction theorem.11 It applies to a language
for probability logic with a distance predicate d(x, y), satisfying the axioms of
a metric space. A model for this is thus a complete metric space M carrying a
measure; we assume moreover that all balls are measurable, with positive measure.
If there are any measure zero balls, we discard them; probability logic will not notice
the difference, since it is blind to measure zero phenomena. Likewise, if M is not
complete, we can just pass to the completion. Then the theorem asserts that any
such model is determined by its theory. It is usually presented in more concrete
terms: if for each n we know the probability distribution for the distances between
n randomly chosen points, then we know the space.
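The concrete reading admits a quick numerical check (my own sketch; the two spaces and the sample size are illustrative assumptions). Already the distribution of the distance between two random points separates the unit interval from the circle of unit circumference:

```python
# Own illustration of distance statistics as a fingerprint of the space.
import math
import random

def sample_distances(space_sampler, dist, k=50000, seed=0):
    rng = random.Random(seed)
    return [dist(space_sampler(rng), space_sampler(rng)) for _ in range(k)]

# Two spaces with uniform measure: the unit interval and the circle of
# circumference 1, a point given by its arc-length coordinate in [0, 1).
interval = lambda rng: rng.random()
circle = lambda rng: rng.random()

d_interval = lambda x, y: abs(x - y)
d_circle = lambda x, y: min(abs(x - y), 1 - abs(x - y))  # geodesic distance

mean_i = sum(sample_distances(interval, d_interval)) / 50000
mean_c = sum(sample_distances(circle, d_circle)) / 50000
# The mean distance is 1/3 on the interval but 1/4 on the circle.
print(mean_i, mean_c)
```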
P. And if M carries additional structure?
L. In continuous logic, all predicates are assumed uniformly continuous with
respect to the structural metric. With this assumption, the theorem remains true.
The case of an additional constant is already interesting; it implies that if the same
geometry is visible when you stand at a or at b inside M – for instance, in the ball
of radius 1 around either of them, there is such-and-such a probability for a triangle
to be obtuse,12 the same probability for a and b; and ditto for all such questions –
then there exists an automorphism of M carrying a to b.
P. So if the geometry looks the same from any vantage point on M, statistically
speaking, then the space is homogeneous; there are automorphisms taking any given
point to any other.
L. Yes. M will be compact then, and a whole body of theorems on the structure
of compact and locally compact groups applies to it; in the present case, the Peter-
Weyl theorem. It means in particular that with respect to any given formula, or finite
number of formulas, M can be quite precisely approximated by finite unions of
compact, finite dimensional, homogeneous Riemannian manifolds. Such manifolds
are very special; with somewhat stronger homogeneity assumptions (on directions),
they can be fully classified. Spheres and tori of various dimensions are good
examples. It is really a vertiginous change of perspective, from the very abstract
situation we started with to these classical geometries.
This phenomenon persists if the measure is infinite, but finite on any ball. In this
case M is locally compact; Euclidean space, say R3 , is the most classical example.
Locally compact groups were the subject of Hilbert’s 5th problem, resolved in
the 1950s; the resolution showed that they are quite close to Lie groups, finite-dimensional
groups with a differentiable structure. The simple Lie groups were classified in the
nineteenth century.
S. Going back – can I summarize Gromov’s theorem by saying that M can be
identified internally – that a physicist in M, using probability logic and able to
sample distances between points, will be able to recognize the geometry of M?
P. Prima facie, the theorem asserts only that the universe is determined by the
answers to all stochastic experiments. So your physicist will get closer and closer to
the full description, which does determine M. But there is no guarantee that at some
given time, inside M, a physicist can guess at M as the right answer. At any point,
there may be many other possibilities.
L. That’s right; there need not exist a finite description of M at all, after all. But
suppose M is in fact homogeneous. Our physicists can see that the formulas they
measure have very close values at different points. They can then make the leap
and assume it is homogeneous. Then I expect there will often (say, when Aut(M) is
semisimple) in fact be a unique homogeneous M where a certain finite number
of stochastic questions have the experimentally found answers.
S. But how can the physicists of M guess that this particular sentence –
expressing homogeneity – once gleaned from the statistics, would have such fruitful
consequences? How to distinguish it from others, that presumably also exist, that
lead to nothing special, or else to a completely wild, Gödelian domain?
L. If we knew that, we’d be able to answer the question: What is model theory?
P. Seriously, it is an interesting topic of mathematical philosophy of physics:
which possible geometries are amenable to discovery by their denizens, in the sense
12 Expressed via the sides, i.e. the square of one side is greater than the sum of squares of the other
two.
that a finite number of statements, if seen to have high probability and assumed to
hold absolutely, will characterize the world uniquely? I think physicists are well
aware of such questions (and for related reasons, were disappointed to find large
numbers of Calabi-Yau manifolds.)
L. This leap, incidentally, from high probability to always, cannot be justified
within the probability calculus; I don’t know what it means to say that it is likely to
be true, or becomes likelier with more observations; a condition like homogeneity
would seem in any reasonable model to have probability zero, if it has one at
all. Nevertheless this seems to be the kind of leap science is based on. And then,
following the passage from probabilistic to definite rules, comes a long mathematical
analysis using classical logic, before going back to probabilistic verification.
S. There is also an issue at the opening that we ignored. We assumed that the
distance between two points is directly observable, and with arbitrary precision. But
what if one only has access to everyday distances, not too small or big? Suppose
these physicists have size on the order of 100 units. They can estimate distances
up to the nearest integer unit; by weighing or other means, like Dalton, they can
measure the number of particles in a ball a few units wide, or at least the
ratio of particle numbers between two such regions. But to describe the geometry
– e.g. to see whether it is Riemannian – requires understanding distances at much
smaller scales. How are they to get this information?
L. This is not unrelated to Hilbert’s fifth problem, mentioned a few moments
ago. The breakthrough on this problem was Gleason’s construction, starting from
a compact neighborhood of the identity in a group, of the tangent space, the
infinitesimal vectors. A modern version with pseudo-finite ramifications is due to
Breuillard et al. (2012) (and in fact this is how I got interested in such questions).
The description of the metric at small distances, given the statistics of multiplication
at a medium range, is a step in this construction.
But this can be done without any homogeneity assumptions; the most direct
description is due to Lovász and Szegedy (2010), though their setting was slightly
different.13 Suppose, in your scenario, that R(x, y) denotes the relation: x, y are
at most a unit distance from each other. The Lovász-Szegedy metric can then be
written in one line as a probability logic formula:

d(x, y) = ∫ | μ{w : R(x, w) ∧ R(z, w)} − μ{w : R(y, w) ∧ R(z, w)} | dμ(z).
The first thought would be that x, y are nearby if the unit balls around them are 99%
equal, in terms of the number of particles they contain; there is of course a grain
of truth in this. But what actually works is at one remove: intersect the unit balls
around x and z, and compare this to the result for y and z. If for the vast majority of
13 The theory of graphons includes ideas very close to those presented here. Lovász and his
collaborators begin with measure-theoretic limits of sequences of graphs, then build a topology.
In the same years, unaware of this, model theorists added measures to their already topological
type and strong type spaces. Many results are represented in the two theories, in different terms;
but a systematic comparison has not been undertaken.
390 E. Hrushovski
particles z, the two numbers are close, then x, y must be close. Even at the pseudo-
finite limit, where the number of particles in a ball is infinite, this metric can be
shown to be compact; and a unit distance ball, in naive units, is still contained in
a finite ball of this metric. This indicates that the new metric must be seeing short-
range distances fairly faithfully; if all distances below 1 were lumped together as in
the original metric, the unit ball would be infinite and discrete, and not compact. The
story goes further than that, but you see that probability quantifiers are a powerful
tool in the recognition of space.
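A finite sketch of this ‘one remove’ (my own illustration; the cycle, its size n and the unit u are assumed for concreteness): on a cycle of n sites with R(x, y) meaning the two sites are within unit distance u, comparing the intersections B(x) ∩ B(z) and B(y) ∩ B(z) over all z recovers distances well below u, which R alone cannot see.

```python
# Own illustration of the derived metric: points on a cycle of n sites,
# with R(x, y) iff x, y are within 'unit' distance u.
n, u = 200, 20

def cdist(x, y):
    return min(abs(x - y), n - abs(x - y))

# Unit balls B(x) = {y : R(x, y)}.
balls = [{y for y in range(n) if cdist(x, y) <= u} for x in range(n)]

def d(x, y):
    """Average over z of the normalized gap between |B(x) ∩ B(z)| and
    |B(y) ∩ B(z)| -- the 'one remove' comparison from the text."""
    return sum(abs(len(balls[x] & bz) - len(balls[y] & bz)) / len(bz)
               for bz in balls) / n

# R itself cannot distinguish distances 1 and u (both pairs satisfy R),
# but the derived metric orders them correctly:
print(d(0, 1) < d(0, u) < d(0, 3 * u))  # True
```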
P. We were able so far to talk without agreeing on the notion of space; just as
mathematicians felt no need for a definition of algorithm, as long as they were
showing many things were algorithmic. But if we are to discuss the capabilities
of probability logic beyond the spatial, we should perhaps agree what we mean by
the latter.
L. The most basic example is a ‘space’ allowing two positions. We could call one
of them Q; the other then is ¬Q. A unary predicate. In, or out.
S. Isn’t it a little artificial to call this a space? Do any mathematical definitions of
‘space’ apply here?
L. Well, you can think of it as a very basic, two-element topological space X0 ,
with points 0 and 1. Given a structure M, the interpretation of Q gives a map M →
X0 . We think of M as spread out over X0 , with Q lying above 1, and the complement
above 0.
S. But X0 is discrete; you are not really using the topology.
L. For one predicate, that is true. With more unary predicates, we can distinguish
more and more regions in this primitive space, positions that elements may take;
we can subdivide the house into rooms, or delineate additional areas outside it. As
long as the set of possible positions is finite, it is discrete and the topology is not
used. But generally there are infinitely many unary formulas to consider, and then
the best way to keep track of the possible positions is a topology. For instance if we
have infinitely many unary predicates P1 , P2 , · · · with no relations among them –
all positions are possible – then X0 would be the Cantor space of 0/1-sequences,
with 01 · · · coding P1 ∩ (¬P2 ) ∩ · · · . If constraints exist, we will have a certain
subspace of that.
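A toy version of these ‘positions’ (the carrier set and predicates below are my own examples): the position of an element is simply its 0/1 vector of predicate values, a point of the Cantor-style position space.

```python
# Own illustration: positions cut out by unary predicates. An element's
# position is its tuple of predicate values, a point of {0,1}^3 here.
M = range(100)
predicates = [
    lambda x: x % 2 == 0,   # P1
    lambda x: x % 3 == 0,   # P2
    lambda x: x < 50,       # P3
]

def position(x):
    return tuple(int(P(x)) for P in predicates)

# Elements sharing a position satisfy exactly the same unary formulas
# built from P1, P2, P3.
print(position(6))   # (1, 1, 1): even, divisible by 3, below 50
print(position(7))   # (0, 0, 1)

positions = {position(x) for x in M}
print(len(positions))  # 8: all positions are occupied in this example
```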
S. And in continuous logic, I suppose, these positions may form a connected
space; the range of values of a unary predicate can be an extended interval.
L. Quite so. In any case it is compact; up to a given resolution, there are finitely
many possibilities.
P. Perhaps this is a good place to disambiguate the word ‘finite’. When a
combinatorialist like Erdős says ‘let n be a number’, n is a finite number; but one
that we view as potentially increasing towards infinity. A theorem on all but finitely
many n would often be satisfactory. In other cases, it is really viewed as fixed.14
L. Well, the space X0 , to any given resolution, is not pseudo-finite; it is not
moving; it’s dead finite. We may well visualize large finite structures, where Q has
n elements and ¬Q some much bigger number m perhaps; at the limit we talk of a
pseudo-finite n. But X0 will still have just two points.
S. So the position of an element a in X0 is what is called in Chang and Keisler
(1990) the type of a; it determines the value in M of all unary formulas concerning
a. But given two elements a, b, and a relation R, the positions of a and of b in X0
will not necessarily determine whether R(a, b) holds. By ‘spatial’ you just mean
‘unary’, then?
P. The spatial configuration of n points should be determined by the position of
each of them; in this sense, we do view space as monadic, or unary.
L. So far we took ‘monadic’ to be synonymous with ‘unary’. But the literal notion
of unary is not robust. Starting with a purely unary structure and forgetting some
information, we can arrive at one that is not quite unary. Let us go back to our unary
predicate Q; we think of it as specifying one side of some border, some distinction
in the world. You might think a single unary predicate is the least possible structure
a relational universe can have, but it is not: you can easily conceive of a language
able to specify less: it gives the demarcation line, without breaking the symmetry by
giving a name to either side. You simply need a relation E(x, y), true if both x, y
are in the extension of the predicate, or if both are out; an equivalence relation with
two classes. Technically, E(x, y) is a binary relation. But it could have been defined
from the unary predicate Q, if we had it: E(x, y) ⇐⇒ (Q(x) ∧ Q(y)) ∨ (¬Q(x) ∧
¬Q(y)). A world with E as a basic relation carries no more structure than one with
Q, and possibly less: there may be no way to tell apart the two classes.
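The point can be made mechanical (my own sketch; the carrier set and Q are illustrative): defining E from Q and from ¬Q gives the very same binary relation, so a world handed only E has no way to name either side.

```python
# Own illustration: the binary relation E defined from a unary predicate.
M = range(20)
Q = lambda x: x % 2 == 0   # some unnamed distinction in the world

def E_from(P):
    # E(x, y) <=> (P(x) ∧ P(y)) ∨ (¬P(x) ∧ ¬P(y))
    return lambda x, y: (P(x) and P(y)) or ((not P(x)) and (not P(y)))

E = E_from(Q)                           # defined from Q ...
E_swapped = E_from(lambda x: not Q(x))  # ... or equally well from ¬Q

# Both definitions yield the very same relation, so E cannot say which
# class was 'Q':
print(all(E(x, y) == E_swapped(x, y) for x in M for y in M))  # True
# E is an equivalence relation with exactly two classes:
print(len({frozenset(y for y in M if E(x, y)) for x in M}))   # 2
```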
S. What happens to X0 in this case?
L. Given a model M, we can define X0 as M/E, the set of classes; even without
having a name for one of them. So we still have the map p : M → X0 , giving what
we take for the spatial location of an element. The elements of X0 are called finite
imaginaries; it was precisely for them, and quite in the present setting, that Shelah
introduced his notion of imaginary elements (or T eq ; Shelah 1990).
S. But as a topological space, X0 is just the same as in the case of unary
predicates?
L. Yes, but there is some further information in it, of a different kind: about what
we are not able to name. If E has n classes (n being a limited number), then X0
has n points; some of them may have names, and some not; if we pick one, we may
have a way of naming another. The right ideas to describe the situation were found
by Galois: there is a finite group acting on the set of classes; all classes, or sets of
14 Jordan (1878), introducing his theorem regarding all but finitely many matrix groups of a given
dimension, attempted to introduce the term ‘limited’ (see p. 114), insisting that “limité
et illimité . . . ne sont pas synonymes de fini et infini” (“limited and unlimited . . . are
not synonyms of finite and infinite”); but unlike the concept and theorem, this linguistic
advice was not fully taken up by later mathematicians.
classes, or even relations on the classes describable in the language are preserved by
the group, and conversely any invariant relation can be described by a formula. It is
exactly the situation with the roots of an irreducible polynomial; the Galois theory
underlying Shelah’s finite imaginaries directly generalizes Galois’s theory for
solutions of algebraic equations.
P. And if we use continuous logic?
L. The need for a more general monadic space actually arose earlier, in the
ordinary, two-valued first-order case, in Lascar (1982) and Kim and Pillay (1997).
In place of maps p : M → X0 with X0 a finite set, we have to take into account
maps into compact spaces; but using the Peter-Weyl theorem, it suffices in fact
to consider maps p : M → Y into finite dimensional Riemannian manifolds.
The Galois theory persists: there is a canonical group H acting on Y , a subgroup
of the orthogonal group of automorphisms of Rn preserving Euclidean distance.
A Riemannian distance ρ(y, y′) on Y is preserved by H. And while the literal
spatial position p(m) in Y is not definable, the distance between two points –
ρ(p(m), p(m′)) – is definable in the sense of continuous logic. The full monadic
space X is the projective limit of all such p; insofar as any finite number
of formulas is concerned, to a given approximation, it suffices to think of one
p : M → Y as above.
P. The move you describe from unary to monadic closely parallels the relativistic
correction of Galileo. The elements of the universe are mapped into a space Y ,
but only relative distances are meaningful. And for any given question, if you fix
a sufficient number of reference points, spatial location becomes unary again; it is
determined by the distances to each of these reference points.
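The last remark – distances to enough fixed reference points pin down the location – is ordinary trilateration. A minimal planar sketch (the anchors and the test point are my own assumptions):

```python
# Own illustration: in the plane, distances to three fixed non-collinear
# reference points determine a point.
import math

anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]  # fixed reference points

def distances(p):
    return [math.dist(p, a) for a in anchors]

def locate(d):
    """Trilateration: subtracting the circle equations |p - a_i|^2 = d_i^2
    pairwise leaves a 2x2 linear system for p."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d[0] ** 2 - d[1] ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d[0] ** 2 - d[2] ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

p = (1.5, 2.0)
print(locate(distances(p)))  # recovers (1.5, 2.0) up to floating-point error
```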
S. But the Galilean group is not compact; it includes the translation group R3 ;
like the space R3 itself, it is only locally compact.
L. We mentioned earlier the ‘unbounded’ generalizations of continuous logic;
there is also a natural generalization of probability logic, where the measures are
finite only locally. This should really be our framework, but as they do add a
technical layer, it seemed better to describe the bounded case. In the full setting
we do obtain locally compact groups and spaces; so R3 is included.
P. It is quite striking how Euclidean spaces, that one thinks of as essential but
possibly contingent features of physics and geometry, arise of themselves in a first-
order logic setting. Do you also see something resembling time?
L. Not so far. Nor Lorentzian metrics.
P. I think we can provisionally accept this as the monadic – and in a sense the
compact or locally compact – spatial part of the theory of M. Now what about
probability logic?
L. If we have probability quantifiers too, we obtain an induced measure on
the space X; and we can ignore any open set of measure zero, as the probability
quantifiers will not see it. This is quite close to the measured metric spaces we
encountered yesterday.
S. Is it time now to repeat my question from day 3? Suppose we have at least one
binary relation R in the language, two-valued if you like. Define
15 The discontinuity points of the correlation functions, also of measure zero, may be another such
hint.
P. But the details appear different – in physics a bundle is induced on the ambient
space, here it appears as still relational and on the particles. This too might be worth
looking into.
L. The second issue is equality, and higher imaginaries; it is probably cognate to
what the higher topos theory of Lurie (2009) is after. We already had to consider
imaginary elements, introduced by Shelah into model theory in precisely this
setting. They are just ordinary elements (or tuples of elements) a, but with a new
notion of equality, provided by a definable equivalence relation. The imaginary is an
ideal element a/E, representing one or any of the elements of a given equivalence
class C. Now suppose the elements a themselves carry some structure Λ(a), and
all the Λ(a) are isomorphic for a ∈ C. We would like to be able to speak of an
ideal Λ, isomorphic to any concrete embodiment Λ(a). This cannot be done with
ordinary imaginaries, but does form a part of the space that probability quantifiers
can discern. A special case, when G = Aut (N/M), N algebraic over a sort M,
R(g, a, b) if g fixes the fibers about (a, b), was treated in Hrushovski (2012).
It is conceivable that these two considerations are the only new ones needed.
Acknowledgements Warm thanks to Dan Drai for his reading and comments on the text; to Martin
Guttman for a conversation, explaining to me inter alia that monads have no windows; to the
editors, Meir and Orly, for their encouragement and patience.
References
Aristotle. (1906). Prior analytics (A. J. Jenkinson, Trans.) and Posterior analytics (G. R. G.
Mure, Trans.). Oxford: Oxford Translation. Available from: http://classics.mit.edu/Aristotle/
prior.html, http://classics.mit.edu/Aristotle/posterior.html.
Ben Yaacov, I. (2008). Continuous first order logic for unbounded metric structures. Journal of
Mathematical Logic, 8(2), 197–223.
Ben Yaacov, I., & Usvyatsov, A. (2010). Continuous first order logic and local stability. Transac-
tions of the American Mathematical Society, 362, 5213–5259.
Ben Yaacov, I., Berenstein, A., Henson, C. W., & Usvyatsov, A. (2008). Model theory for metric
structures. In Model theory with applications to algebra and analysis (Vol. 2, pp. 315–427).
London mathematical society lecture note series (Vol. 350). Cambridge: Cambridge University
Press.
Breuillard, E., Green, B., & Tao, T. (2012). The structure of approximate groups. Publications
Mathématiques de l’Institut des Hautes Études Scientifiques, 116, 115–221.
Chang, C.-C., & Keisler, H. J. (1966). Continuous model theory. Annals of mathematics studies
(Vol. 58, pp. xii+166). Princeton: Princeton University Press.
Chang, C. C., & Keisler, H. J. (1990). Model theory (3rd ed.). Studies in logic and the foundations
of mathematics (Vol. 73). Amsterdam: North-Holland Publishing Co.
Chatzidakis, Z., van den Dries, L., & Macintyre, A. (1992). Definable sets over finite fields. Journal
für Die Reine und Angewandte Mathematik, 427, 107–135.
Corry, L. (2004). David Hilbert and the axiomatization of physics (1898–1918). From Grundlagen
der Geometrie to Grundlagen der Physik. Archimedes: New studies in the history and
philosophy of science and technology (Vol. 10, pp. xvii+513). Dordrecht: Kluwer Academic
Publishers.
Farah, I., & Magidor, M. (2012). Independence of the existence of Pitowsky spin models.
arXiv:1212.0110.
Gromov, M. (1998). Metric structures for Riemannian and non-Riemannian spaces. Boston:
Birkhäuser.
Gromov, M. (2008). Mendelian dynamics and Sturtevant’s paradigm. Geometric and probabilistic
structures in dynamics. Contemporary mathematics (Vol. 469, pp. 227–242). Providence:
American Mathematical Society.
Hrushovski, E. (2012). Groupoids, imaginaries and internal covers. Turkish Journal of Mathemat-
ics, 36(2), 173–198. (Arxiv: math.LO/0603413.)
Jordan, C. (1878). Mémoire sur les équations différentielles linéaires à intégrale algébrique.
Journal für die Reine und Angewandte Mathematik, 84, 89–215.
Keisler, H. J. (1985). Probability quantifiers. In J. Barwise & S. Feferman (Eds.), Model-theoretic
logics (pp. 509–556). Perspectives in mathematical logic. New York: Springer.
Kim, B., & Pillay, A. (1997). Simple theories. Joint AILA-KGS model theory meeting (Florence,
1995). Annals of Pure and Applied Logic, 88(2–3), 149–164.
Kolmogorov, A. N. (1950). Grundbegriffe der Wahrscheinlichkeitsrechnung. Ergebnisse der
Mathematik (1933); English translation: New York: Chelsea.
Lascar, D. (1982). On the category of models of a complete theory. Journal of Symbolic Logic,
47(2), 249–266.
Lovász, L., & Szegedy, B. (2010). Regularity partitions and the topology of graphons. An irregular
mind, Szemerédi is 70 (pp. 415–446). Berlin/Heidelberg: János Bolyai Mathematical Society
and Springer-Verlag.
Lurie, J. (2009). Higher topos theory. Annals of mathematics studies (Vol. 170). Princeton/Oxford:
Princeton University Press.
Maxwell, J. C. (1925). Matter and motion, 1877. London: Sheldon Press.
Mumford, D. (2000). The dawning of the age of stochasticity. Mathematics: Frontiers and
perspectives (pp. 197–218). Providence: American Mathematical Society.
Pillay, A. (1996). Geometric stability theory. Oxford logic guides (Vol. 32). New York: The
Clarendon Press/Oxford University Press. x+361. Oxford Science Publications.
Pitowsky, I. (1982). Resolution of the Einstein-Podolsky-Rosen and Bell paradoxes. Physical
Review Letters, 48(19), 1299–1302.
Pitowsky, I. (1999). The limits of computability and the limits of formalization (Hebrew). Iyyun:
The Jerusalem Philosophical Quarterly, 48, 209–229.
Razborov, A. (2007). Flag algebras. Journal of Symbolic Logic, 72, 1239–1282.
Razborov, A. (2013). What is a flag algebra? Notices of the AMS, 60(10), 1324–1327.
Shelah, S. (1990). Classification theory and the number of nonisomorphic models (2nd ed.). Studies
in logic and the foundations of mathematics (Vol. 92). Amsterdam: North-Holland Publishing
Co.
Simon, P. (2015). A guide to NIP theories. Lecture notes in logic (Vol. 44, pp. vii+156). Cambridge:
Cambridge University Press; Association for Symbolic Logic, Chicago.
Sturtevant, A. H. (1913). The linear arrangement of six sex-linked factors in Drosophila, as shown
by their mode of association. Journal of Experimental Zoology, 14, 43–59.
Towsner, H. (2011). A model theoretic proof of Szemerédi’s theorem. arXiv:1002.4456v3,
January 2011.
Vershik, A. M. (2004). Random and universal metric spaces. In A. Maass, S. Martínez, & J. San
Martín (Eds.), Dynamics and randomness II. Nonlinear phenomena and complex systems (Vol.
10, pp. 199–228). Dordrecht: Kluwer Academic Publishers.
Zilber, B. I. (1984). Strongly minimal countably categorical theories. II. Sibirskii matematicheskii
zhurnal, 25(3), 71–88.
Chapter 18
The Argument Against Quantum
Computers
Gil Kalai
G. Kalai
Einstein Institute of Mathematics, The Hebrew University of Jerusalem, Jerusalem, Israel
Efi Arazy School of Computer Science, Interdisciplinary Center Herzliya, Herzliya, Israel
18.1 Introduction
Noisy quantum systems will not allow building quantum error-correcting codes needed for
quantum computation.
To support this statement we study the relations between abstract ideas about
computation and the performance of actual physical computers, which are referred
to as noisy intermediate-scale quantum computers or NISQ computers for short.
We need to explain why it is the case that quantum error-correcting codes are
needed to gain the computational advantage of quantum computing, and we also
need to explain why quantum error-correcting codes are out of reach. The crux of
the matter lies in a parameter referred to as the rate of noise. We will explain why
reducing the rate of noise to the level needed for good quantum error-correction
will already enable NISQ computers to demonstrate “quantum supremacy.” We
will also explain why NISQ computers represent a very low level computational
complexity class which cannot support quantum supremacy. This provides a good,
in-principle, argument for the infeasibility of both good quantum error-correcting
codes and quantum supremacy.
We will show where the argument breaks down for classical computation.
Rudimentary classical error-correcting codes are supported by the low-level com-
putational complexity class that describes NISQ computers.
Pitowsky (1990) formulates and studies a strong form of the “physical” Church–
Turing thesis (referred to later as the extended Church–Turing thesis (ECTT)),1
namely
whether we can invoke physical laws to reduce a computational problem that is manifestly
or presumably of exponential complexity, and actually complete it in polynomial time.
Pitowsky (1990) described three types of computers where each type is more
restrictive than the previous one: “the Platonist computer,” “the constructivist
computer,” and “the finitist computer.” We will consider versions of the last two
types. Our definitions rely on the notion of Boolean circuits.3
The ‘constructivist computer’ is just a (deterministic) Turing machine, and associated with
this notion is the class of all Turing machine computable, i.e., recursive functions. Can a
physical machine compute a non-recursive function? Itamar Pitowsky (1990)
3 The precise relation between the Turing machine model (considered in Pitowsky 1990) and the
circuit model is interesting and subtle but we will not discuss it in this paper.
402 G. Kalai
Model 1 – Computers (circuits) A computer (or circuit) has n bits, and it can
perform certain logical operations on them. The NOT gate, acting on a single bit, and
the AND gate, acting on two bits, suffice for the full power of classical computing.
The outcome of the computation is the value of certain k output bits. When k = 1
the circuit computes a Boolean function, namely, a function f (x1 , x2 , . . . , xn ) of n
Boolean variables (xi = 1 or xi = 0) so that the value of f is also 0 or 1.
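A minimal sketch of Model 1 (my own code, not from the chapter): circuits over NOT and AND only, with other gates built from them, e.g. OR via De Morgan’s law.

```python
# Own sketch of Model 1: circuits built from the NOT and AND gates only.
def NOT(a):
    return 1 - a

def AND(a, b):
    return a & b

def OR(a, b):                       # OR via De Morgan: ¬(¬a ∧ ¬b)
    return NOT(AND(NOT(a), NOT(b)))

def XOR(a, b):                      # parity, from the gates above
    return AND(OR(a, b), NOT(AND(a, b)))

# A circuit with n = 2 input bits and k = 1 output bit computes a Boolean
# function f(x1, x2); here, the truth table of XOR:
print([XOR(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 1, 1, 0]
```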
While general circuits can compute arbitrary Boolean functions, matters change
when we require that the size of the circuit be at most a polynomial in the number
of input variables. Here the size of the circuits is the total number of gates.
Model 2 – Polynomial size circuit A circuit with n input bits of size at most
A·n^c. Here A and c are positive constants.
The complexity class P refers to problems that can be solved using a polynomial
number of steps in the size of the input. The complexity class NP refers to problems
whose solution can be verified in a polynomial number of steps. Our understanding
of the computational complexity world depends on a whole array of conjectures: NP
≠ P is the most famous one. Many natural computational tasks are NP-complete.
(A task in NP is NP-complete if a computer equipped with a subroutine for solving
this task can solve, in a polynomial number of steps, every problem in NP.)
Let me mention two fundamental examples of computational “easiness” and
“hardness.” The first example is: multiplication is easy, factoring is hard!4 Multiplying
two n-digit numbers can be done by a simple algorithm with a number of steps that
is quadratic in n. On the other hand, the best-known algorithm for factoring an n-digit
number into its prime factors takes a number of steps that is (we think) exponential in
n^(1/3) (Fig. 18.1).
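The asymmetry can be caricatured in a few lines (a rough back-of-the-envelope sketch of my own; trial division is far from the best known factoring algorithm, but both it and the best known ones are super-polynomial):

```python
# Own step-count caricature: schoolbook multiplication of two n-digit
# numbers vs. naive trial-division factoring of one n-digit number.

def schoolbook_steps(n_digits):
    # one digit-by-digit product per pair of digits
    return n_digits ** 2

def trial_division_steps(n_digits):
    # divisors tried up to sqrt(N) ~ 10^(n/2)
    return int(10 ** (n_digits / 2))

for n in (4, 8, 16):
    print(n, schoolbook_steps(n), trial_division_steps(n))
# For 16-digit numbers: 256 steps to multiply, ~10^8 divisions to factor.
```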
The second fundamental example of complexity theory is that “determinants are
easy but permanents are hard.” Recall that the permanent of an n-by-n matrix M
has a similar (but simpler) formula compared to the more famous determinant. The
difference is that the signs are eliminated, and this difference accounts for a huge
computational difference. Gaussian elimination gives a polynomial-time algorithm for the determinant.
5 The hardness of computing permanents defines an important computational class #P (in words,
number-P or sharp-P) that is larger than PH (the entire “polynomial hierarchy”).
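To make the contrast concrete, here is a brute-force evaluation of both quantities from the shared permutation formula (my own sketch): the two differ only in the signs, yet the determinant additionally has a cubic-time algorithm via Gaussian elimination, while the permanent is believed to admit no polynomial-time algorithm.

```python
# Own sketch: permanent and determinant from the common formula
# sum over permutations sigma of prod_i M[i][sigma(i)], signed for det.
from itertools import permutations

def perm_and_det_bruteforce(M):
    n = len(M)
    per = det = 0
    for sigma in permutations(range(n)):
        prod = 1
        for i in range(n):
            prod *= M[i][sigma[i]]
        # the sign of sigma is (-1)^(number of inversions)
        inv = sum(sigma[i] > sigma[j]
                  for i in range(n) for j in range(i + 1, n))
        per += prod
        det += (-1) ** inv * prod
    return per, det   # ~ n! * n steps: fine only for tiny n

M = [[1, 2], [3, 4]]
print(perm_and_det_bruteforce(M))  # (1*4 + 2*3, 1*4 - 2*3) = (10, -2)
```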
6 For simplicity, we use in this paper the term P (rather than BPP) to describe the computational
• A Quantum gate is a unitary transformation. We can put one or two qubits through
gates representing unitary transformations acting on the corresponding two- or
four-dimensional Hilbert spaces. As for classical computers, there is a small list
of gates that are sufficient for the full power of quantum computing.
• Measurement: Measuring the state of k qubits leads to a probability distribution
on 0–1 vectors of length k.
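The two bullet points can be sketched in a few lines (my own stand-alone simulation, not a real device): a one-qubit gate is a 2×2 unitary acting on the amplitude vector, and measurement yields the distribution given by squared absolute amplitudes; the Hadamard gate is a standard example.

```python
# Own sketch: a one-qubit gate as a 2x2 unitary, and measurement as the
# distribution of squared absolute amplitudes.
import math

def apply_gate(U, state):
    return [U[0][0] * state[0] + U[0][1] * state[1],
            U[1][0] * state[0] + U[1][1] * state[1]]

def measurement_distribution(state):
    return [abs(a) ** 2 for a in state]

H = [[1 / math.sqrt(2),  1 / math.sqrt(2)],   # Hadamard gate (unitary)
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

state = [1.0, 0.0]                  # qubit initialized to |0>
state = apply_gate(H, state)
print(measurement_distribution(state))  # [0.5, 0.5] up to rounding
```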
Here and later we assume that the number of gates is at most polynomial in
n. Quantum computers allow sampling from probability distributions well beyond
the capabilities of classical computers (with random bits). Shor’s famous algorithm
shows that quantum computers can factor n-digit integers efficiently, in roughly n^2
steps! It is conjectured that quantum computers cannot solve NP-complete problems
and this implies that they cannot compute permanents.
For the notions of noisy quantum computers and for fault-tolerant quantum
computation that we discuss next, it is necessary to allow measurement throughout
the computation process and to be able to feed the measurement results back into
later gates.
Next we consider noise. Quantum systems are inherently noisy; we cannot accu-
rately control them, and we cannot accurately describe them. In fact, every
interaction of a quantum system with the outside world amounts to noise.
Model 5 – Noisy quantum computers A noisy quantum circuit has the property
that every qubit is corrupted in every “computer cycle” with a small probability t,
and every gate is t-imperfect. Here, t is a small constant called the rate of noise.
Here, in a “computer cycle” we allow several non-overlapping gates to perform
in parallel. We do not specify the precise technical meaning of “corrupt” and “t-
imperfect,” but the following intuitive explanation (for a restricted form of noise
called depolarizing noise) could be useful. When a qubit is corrupted, its state
is replaced by a uniformly distributed random state on the same Hilbert space. A gate
is t-imperfect if with probability t the state of the qubits that are involved in the gate
is replaced by a uniformly random state in the associated Hilbert space.
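A toy Monte Carlo sketch of the depolarizing picture just described (my illustration; the Bloch-sphere representation of a single qubit is standard and not the paper's notation): each cycle the qubit is corrupted with probability t, and a corrupted qubit contributes nothing on average to the overlap with the ideal state.

```python
import math
import random

def random_bloch(rng):
    """A uniformly random point on the Bloch sphere (a random qubit state)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)

def average_overlap(t, cycles, trials=20000, seed=0):
    """Mean z-component after noisy cycles, starting from |0> (Bloch z = 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        v = (0.0, 0.0, 1.0)
        for _ in range(cycles):
            if rng.random() < t:       # the qubit is corrupted this cycle
                v = random_bloch(rng)
        total += v[2]
    return total / trials

# The surviving overlap decays like (1 - t)^cycles.
print(average_overlap(0.1, 5))
```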
Theorem 1 (“The threshold theorem”) If the error rate is small enough, noisy
quantum circuits allow the full power of quantum computing.
The threshold theorem was proved around 1995 by Aharonov and Ben-Or (1997),
Kitaev (1997), and Knill et al. (1998). The proof relies on quantum error-correcting
codes first introduced by Shor (1995) and Steane (1996).
A common interpretation of the threshold theorem is that it shows that large-scale
quantum computers are possible in principle. A more careful interpretation is that
if we can control noisy intermediate-scale quantum systems well enough, then we
can build large-scale universal quantum computers. As we will see, there are good
reasons for why we cannot control the quality of noisy intermediate-scale quantum
systems well enough.
18 The Argument Against Quantum Computers 405
7 John Preskill coined the terms “quantum supremacy” (in 2012) and “NISQ systems” (in 2018).
Implicitly, these notions already play a role in Preskill (1998). The importance of multi-scale
analysis of the quantum computer puzzle is emphasized in Kalai (2016a).
406 G. Kalai
Fig. 18.2 NISQ circuits are computationally very weak and therefore unlikely to create quantum
error-correcting codes needed for quantum computers
The quality of individual qubits and gates is the major factor for the quality of the
quantum circuits built from them. One consequence of our argument is that there is
an upper bound on the quality of qubits and gates, which is quite close, in fact, to
what people can achieve today. This consequence seems counterintuitive to many
researchers. As a matter of fact, the quantum-computing analogue of Moore’s law,
known as “Schoelkopf’s law,” asserts, to the contrary, that roughly every three years,
quantum decoherence can be delayed by a factor of ten. Our argument implies that
Schoelkopf’s law will be broken before we reach the quality needed for quantum
supremacy and quantum fault-tolerance. (Namely, now!) Our first prediction is thus
that
8 A nice example concerns the computation of the Ramsey number R(k, r), which is the smallest
integer n such that in every party with n participants you can find either a set of k participants where
every two shook hands, or a set of r participants where no two shook hands. This is an example
where asymptotic computational complexity insights into the large-scale behavior (namely, when k
and r tend to infinity) explain the small-scale behavior (namely, when k and r are small integers). It
is easy to verify that R(3, 3) = 6, and it was possible with huge computational efforts (and human
ingenuity) to compute the Ramsey number R(4, 5) = 25 (McKay and Radziszowski 1995). It
is commonly believed that computing R(7, 7) and certainly R(10, 10) is and will remain well
beyond our computational ability. Building gates that are two orders of magnitude better than the
best currently existing gates can be seen as analogous to computing R(10, 10): we have good
computational complexity arguments that both these tasks are beyond reach.
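The footnote's smallest case can be checked directly by brute force. A sketch (mine): some 2-coloring of K5's edges avoids a monochromatic triangle, while every 2-coloring of K6 contains one, which is exactly the statement R(3, 3) = 6.

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """Does this 2-coloring of K_n's edges contain a monochromatic triangle?"""
    edges = list(combinations(range(n), 2))
    color = dict(zip(edges, coloring))
    return any(color[(a, b)] == color[(a, c)] == color[(b, c)]
               for a, b, c in combinations(range(n), 3))

def every_coloring_has_triangle(n):
    """Check all 2^(n choose 2) edge 2-colorings of K_n."""
    m = n * (n - 1) // 2
    return all(has_mono_triangle(n, c) for c in product((0, 1), repeat=m))

# K5 admits a triangle-free coloring (pentagon vs. pentagram edges),
# K6 does not -- hence R(3, 3) = 6.
print(every_coloring_has_triangle(5))  # False
print(every_coloring_has_triangle(6))  # True
```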
9 Schoelkopf’s law was formulated by Rob Schoelkopf’s colleagues based on Schoelkopf’s
experimental breakthroughs, and only later was adopted as a prediction for the future. An even
bolder related prediction by Neven asserts that quantum computers are gaining computational
power relative to classical ones at a “doubly exponential” rate. Our argument implies that Neven’s
law does not reflect reality.
(a) The quality of qubits and gates cannot be improved beyond a certain threshold
that is quite close to the best currently existing qubits and gates. This applies
also to topological qubits (Sect. 18.3.5).
Our argument leads to several further predictions on near-term experiments, for
example, on distributions of 0–1 strings based on a random quantum circuit (Goal
2), or on circuits aimed at creating good-quality quantum error-correcting codes
such as distance-5 and even distance-3 surface codes (Goal 3).
(b) For a larger amount of noise, robust experimental outcomes are possible but
they will represent probability distributions that can be expressed in terms of
low-degree polynomials (LDP-distributions, for short) that are far away from
the desired noiseless distributions.
(c) For a wide range of smaller amounts of noise, the experimental outcomes
will not be robust. This means that the resulting probability distributions will
strongly depend on fine properties of the noise and hence the outcomes will be
chaotic.
In other words, in this range of noise rate the probability distributions are
still far away from the desired noiseless distributions and, moreover, running
the experiment twice will lead to very different probability distributions.
Following predictions (b) and (c) we further expect that
(d) The effort required to control k qubits to allow good approximations of the
desired distribution will increase exponentially and will fail at a fairly small
number of qubits. (My guess is that the number will be ≤ 20.)
Fig. 18.3 The huge computational gap (left) between boson sampling (purple) and fermion
sampling (green) vanishes in the noisy versions (right)
NISQ computers account for most of the effort to create stable qubits via quantum
error-correction, to demonstrate quantum supremacy, and to ultimately build uni-
versal quantum computers. We expect (Sect. 18.4.1) that noise stability and the very
low-level computational class LDP that we identified for noisy NISQ systems apply
in greater generality.
It is not easy to define quantum devices that “refrain from using error correction,”
but this property holds for NISQ circuits that simply do not have enough qubits
to use quantum error correction, and appears to hold for all known experimental
processes aimed at building high-quality physical qubits, including topological qubits.
What we see next is that the low-level complexity class of NISQ circuits does allow
building basic forms of classical error-correcting codes.
The study of noisy boson sampling relies on a study of Benjamini et al. (1999) of
noise stability and noise sensitivity of Boolean functions. See Kalai (2018) where
both these theories and their connections to statistical physics, voting rules, and
computation are discussed. A crucial mathematical aspect of the theory is that
the noise has a simple description in terms of Fourier expansion: it reduces “high
frequency” coefficients exponentially and, therefore, noise stability reflects “low
frequency” (low-degree) Fourier expansion.
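A Monte Carlo sketch (my illustration) of this Fourier description of noise: under ρ-correlated noise, where each coordinate is kept with probability ρ and resampled otherwise, a degree-k Fourier character is damped by exactly ρ^k, so "high-frequency" coefficients decay exponentially in the degree.

```python
import math
import random

def damping(k, rho, trials=50000, seed=0):
    """Estimate E[chi_S(x) * chi_S(y)] for |S| = k, y a rho-noisy copy of x."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = [rng.choice((-1, 1)) for _ in range(k)]
        # keep each coordinate with probability rho, resample it otherwise
        y = [xi if rng.random() < rho else rng.choice((-1, 1)) for xi in x]
        total += math.prod(x) * math.prod(y)
    return total / trials

# A degree-1 character survives as ~rho; a degree-3 one is damped to ~rho^3.
print(damping(1, 0.8), damping(3, 0.8))
```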
The theory of noise stability and noise sensitivity posits a clear distinction
between classical information and quantum information. The class LDP of prob-
ability distributions that can be approximated by low-degree polynomials does not
support quantum supremacy and quantum error-correction but it still supports robust
classical information, and with it also classical communication and computation.
The (“majority”) Boolean function, defined by f(x1, x2, . . . , xn) = 1 if more than
half the variables have the value 1, allows for very robust bits based on a large
number of noisy bits or qubits and admits excellent low-degree approximations. It
can be argued that every form of robust information, communication, and compu-
tation in nature is based on a rudimentary form of classical error-correction where
information is encoded by repetition (or simple variants of repetition) and decoded
by the majority function (or a simple variant of it). In addition to this rudimentary
form of classical error-correction, we sometimes witness more sophisticated forms
of classical error-correction.
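The rudimentary encode-by-repetition, decode-by-majority scheme described above can be sketched as follows (my illustration): each copy of the bit is flipped with probability p, yet the majority vote over many copies is almost never wrong.

```python
import random

def send_bit(bit, n_copies, p, rng):
    """Encode by repetition, flip each copy with prob p, decode by majority."""
    noisy = [bit ^ (rng.random() < p) for _ in range(n_copies)]
    return int(sum(noisy) > n_copies / 2)

rng = random.Random(1)
p, n, trials = 0.2, 101, 10000
errors = sum(send_bit(1, n, p, rng) != 1 for _ in range(trials))
# Each individual copy is wrong one time in five, but the majority-decoded
# bit is essentially never wrong.
print(errors / trials)
```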
We argue that NISQ devices are, in fact, low-level classical computing devices,
and we regard this as strong evidence that engineers will not be able to reach the level of
noise required for quantum supremacy. Others argue that the existence of low-level
simulation (in the NISQ regime) for every fixed level of noise does not have any
bearing on the engineering question of reducing the level of noise. These sharply
different views will be tested in the next few years.
In his comments on this paper, physicist and friend Aram Harrow wrote that my
predictions are so sweeping that they say that hundreds of different research teams
will all stop making progress. Indeed, my argument and predictions say that the
goals of these groups (such as Goals 1–3 and other, more ambitious, goals) will
not be achieved. Rather, if I am correct, these research teams will make important
progress in our understanding of the inherent limitations of computation in nature,
and the limitations of human ability to control quantum systems.
Aram Harrow also wrote: “The Extended Church–Turing Thesis (ECTT) is a
major conjecture, implicating the work of far more than quantum computing. Long
before Shor’s algorithm it was observed that quantum systems are hard in practice
to simulate because we do not know algorithms that beat the exponential scaling
and work in all cases, or even all practical cases. Simulating the 3 or so quarks
(and gluons) of a proton is at the edge of what current supercomputers can achieve
despite decades of study, and going to larger nuclei (say atomic weight 6 or more)
is still out of reach. If the extended Church–Turing thesis is actually true then there
is a secret poly-time simulation scheme here that the lattice QCD community has
overlooked. Ditto for chemistry where hundreds of millions of dollars are spent
each year simulating molecules. Thus the ECTT is much stronger than just saying
we cannot build quantum computers.”
I agree with the spirit of Harrow’s comment, if not with the specifics, and the
comment can serve as a nice transition to the next section that discusses a few
underlying principles and consequences that follow from the failure of quantum
computing.
In this section we will propose some general underlying principles for noisy
quantum processes that follow from the failure of quantum computers. (For more,
see Kalai 2016b and Kalai 2018.)
18.4.1.1 Learnability
It is a simple and important insight of machine learning that the approximate value
of a low-degree polynomial can efficiently be learned from examples. This offers
an explanation for our ability to understand natural processes and the parameters
defining them. Rudimentary physical processes with a higher entropy rate may
represent simple extensions of the class of LDP for which efficient learnability is
still possible, and this is an interesting topic for further research.
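The learnability insight can be illustrated in its noiseless form (my sketch; with noisy samples one would use least squares instead): a degree-d polynomial is pinned down by d + 1 evaluations, so very few examples suffice to predict it everywhere.

```python
def interpolate(points):
    """Lagrange interpolation: d+1 samples determine a degree-d polynomial."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

# Three samples of the (hypothetical) quadratic 2x^2 - 3x + 1 suffice:
learned = interpolate([(0, 1), (1, 0), (2, 3)])
print(learned(5))  # 2*25 - 15 + 1 = 36.0
```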
Fig. 18.4 Low-entropy quantum states give probability distributions described by low-degree
polynomials, and very low-entropy quantum states give chaotic behavior. Higher entropy enables
classical information
(g) All experimental attempts to achieve Goals 2 and 3 will lead to a strong effect
of error-synchronization.
We emphasize that our argument against quantum computers in Sect. 18.3 does
not depend on Principle 3 and, moreover, the main trouble with quantum computers
is that the noise rate of qubits and gates cannot be reduced toward the threshold for
fault-tolerance. Correlation is part of this picture since, as is well known, the value
of the threshold becomes smaller when there is a larger positive correlation for 2-qubit
gate errors. We note (Kalai 2016a) that Prediction (f) implies that the quality
of logical qubits via quantum error-correction is limited by the quality of gates. See
also Chubb and Flammia (2018) for an interesting analysis of the effect of correlated
errors on surface codes.
A reader may question how it is possible that entangled states will experience
correlated noise, even long after they have been created and even if the constituent
particles are far apart and not directly interacting. We emphasize that Principle 3
does not reflect any form of “non-local” noise and that no “additional” correlated
noise on far-apart entangled qubits is ever assumed or conjectured. Principle 3
simply reflects two facts. The first fact is that errors for gated qubits are positively
correlated from the time of creation. The second fact is that unless the error rate
is small enough to allow quantum error-correction (and this may be precluded by
the argument of the previous section), cat states created indirectly via a sequence of
computational steps will inherit errors with large positive correlation.
Remark Error-synchronization is related to an interesting possibility (Kalai 2016b)
that may challenge conventional wisdom and can also be checked for near term
NISQ (and other) experiments. A central property of Hamiltonian models of noise
in the study of quantum fault-tolerance (see, e.g., Preskill 2013), is that error
fluctuations are sub-Gaussian. Namely, when there are N qubits, the standard
deviation for the number of qubit errors behaves like √N, and the probability of
more than t√N errors decays as it does for a Gaussian distribution. Many other
models in mathematical physics have a similar property. However, it is possible
that fluctuations for the total amount of errors for noisy quantum circuits and other
quantum systems with interactions are actually super-Gaussian and perhaps even
linear.
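A quick sketch (mine) of the sub-Gaussian baseline against which super-Gaussian fluctuations would be the anomaly: for N independent qubit errors with rate p, the standard deviation of the error count grows like √N.

```python
import math
import random

def error_count_std(n_qubits, p, trials=5000, seed=0):
    """Sample std of the number of qubit errors under independent noise."""
    rng = random.Random(seed)
    counts = [sum(rng.random() < p for _ in range(n_qubits))
              for _ in range(trials)]
    mean = sum(counts) / trials
    return math.sqrt(sum((c - mean) ** 2 for c in counts) / trials)

# Independent errors give std ~ sqrt(N p (1 - p)), i.e. sqrt(N) growth;
# linear-in-N fluctuations would signal strongly correlated (synchronized) errors.
print(error_count_std(400, 0.1))  # ~ sqrt(400 * 0.1 * 0.9) = 6
```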
Fig. 18.5 Human imagination allows for a cat with an unusual geometry. Drawing by Netta Kasher
(a tribute to Louis Wain)
18.5 Conclusion
The intrinsic study of computation transcends human-made artifacts, and its expanding con-
nections and interactions with all sciences, integrating computational modeling, algorithms,
and complexity into theories of nature and society, marks a new scientific revolution! Avi
Wigderson – Mathematics and Computation 2019.
Understanding noisy quantum systems and potentially even the failure of quan-
tum computers is related to the fascinating mathematical theory of noise stability
and noise sensitivity and its connections to the theory of computing. Our study
shows how insights from the computational complexity of very low-level compu-
tational classes support the extended Church–Turing thesis. Exploring this avenue
may have important implications for various areas of quantum physics. These
connections between mathematics, computation, and physics are characterized by
a rich array of conceptual and technical methods and points of view. In this study,
we witness a delightfully thin gap between fundamental and philosophical issues on
the one hand and practical and engineering aspects on the other.
18.6 Itamar
This paper is devoted to the memory of Itamar Pitowsky. Itamar was great. He
was great in science and great in the humanities. He could think and work like a
mathematician, and like a physicist, and like a philosopher, and like a philosopher
of science, and probably in various additional ways. And he enjoyed the academic
business greatly, and took it seriously, with humor. Itamar had an immense human
wisdom and a modest, level-headed way of expressing it.
Itamar’s scientific path interlaced with mine in many ways. Itamar made
important contributions to the foundations of the theory of computation, probability
theory, and quantum mechanics, all related to some of my own scientific endeavors
and to this paper. Itamar’s approach to the foundation of quantum mechanics was
that quantum mechanics is a theory of non-commutative probability that (like
classical probability) can be seen as a mathematical language for the other laws
of physics. (My work, to a large extent, adopts this point of view. But, frankly, I do
not quite understand the other points of view.) (Fig. 18.6)
In the late 1970s when Itamar and I were both graduate students I remember him
enthusiastically sharing his thoughts on the Erdős–Turán problem with me. This is
a mathematical conjecture that asserts that if we have a sequence of integers 0 <
a1 < a2 < · · · < an < · · · such that the infinite series ∑ 1/an diverges, then we can
find among the elements of the sequence an arithmetic progression of length k, for
every k. Over the years both of us spent some time on this conjecture without much
to show for it. This famous conjecture is still open even for k = 3. Some years later
both Itamar and I became interested, for entirely different reasons, in remarkable
geometric objects called cut polytopes. Cut polytopes are obtained by taking the
convex hull of characteristic vectors of edges in all cuts of a graph. Cut polytopes
arise naturally when you try to understand correlations in probability theory and
Itamar wrote several fundamental papers studying them; see Pitowsky (1991). Cut
polytopes came into play, in an unexpected way, in the work of Jeff Kahn and myself
where we disproved Borsuk’s conjecture (Kahn and Kalai 1993). Over the years,
Itamar and I became interested in Arrow’s impossibility theorem (Arrow 1950),
which Itamar regarded as a major twentieth-century intellectual achievement. He
gave a course centered around this theorem and years later so did I. The study of the
noise stability and noise sensitivity of Boolean functions has close ties to Arrow’s
theorem.
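The cut-polytope construction described above can be made concrete. A small sketch (mine): for each subset S of a graph's vertices, the cut vector marks which edges cross between S and its complement; the cut polytope is the convex hull of these 0-1 vectors.

```python
from itertools import combinations

def cut_vectors(n_vertices, edges):
    """0-1 characteristic vectors of the edge sets cut by each vertex subset."""
    vecs = set()
    for size in range(n_vertices + 1):
        for subset in combinations(range(n_vertices), size):
            s = set(subset)
            vecs.add(tuple(int((u in s) != (v in s)) for u, v in edges))
    return sorted(vecs)

# The triangle K3: its four distinct cut vectors are the vertices of CUT(K3).
print(cut_vectors(3, [(0, 1), (0, 2), (1, 2)]))
# [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
```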
The role of skepticism in science and how (and if) skepticism should be
practiced is a fascinating issue. It is, of course, related to this paper that describes
skepticism over quantum supremacy and quantum fault-tolerance, widely believed
to be possible. Itamar and I had long discussions about skepticism in science, and
about the nature and practice of scientific debates, both in general and in various
specific cases. This common interest was well suited to the Center for the Study of
Rationality at the Hebrew University that we were both members of, where we and
other friends enjoyed many discussions, and, at times, heated debates.
A week before Itamar passed away, Itamar, Oron Shagrir, and I sat at our little
CS cafeteria and talked about probability. Where does probability come from? What
does probability mean? Does it just represent human uncertainty? Is it just an
emerging mathematical concept that is convenient for modeling? Do matters change
when we move from classical to quantum mechanics? When we move to quantum
physics the notion of probability itself changes for sure, but is there a change in the
interpretation of what probability is? A few people passed by and listened, and it
felt like this was a direct continuation of conversations we had while we (Itamar and
I; Oron is much younger) were students in the early 1970s. This was to be our last
meeting.
Acknowledgements Work supported by ERC advanced grants 320924 and 834735, NSF grant
DMS-1300120, and BSF grant 2006066. The author thanks Yuri Gurevich, Aram Harrow, and
Greg Kuperberg for helpful comments, and Netta Kasher for drawing Figs. 18.1, 18.2, 18.3, 18.4,
and 18.5.
References
Aaronson, S., & Arkhipov, A. (2013). The computational complexity of linear optics. Theory of
Computing, 9, 143–252.
Aharonov, D., & Ben-Or, M. (1997). Fault-tolerant quantum computation with constant error. In
STOC’97 (pp. 176–188). New York: ACM.
Arrow, K. (1950). A difficulty in the theory of social welfare. Journal of Political Economy, 58,
328–346.
Arute, F., et al. (2019). Quantum supremacy using a programmable superconducting processor.
Nature, 574, 505–510.
Barkai, N., & Leibler, S. (1997). Robustness in simple biochemical networks. Nature, 387, 913.
Benjamini, I., Kalai, G., & Schramm, O. (1999). Noise sensitivity of Boolean functions and
applications to percolation. Publications Mathématiques de l’Institut des Hautes Études
Scientifiques, 90, 5–43.
Chubb, C. T., & Flammia, S. T. (2018). Statistical mechanical models for quantum codes with
correlated noise. arXiv:1809.10704.
Deutsch, D. (1985). Quantum theory, the Church–Turing principle and the universal quantum
computer. Proceedings of the Royal Society of London A, 400, 96–117.
Feynman, R. P. (1982). Simulating physics with computers. International Journal of Theoretical
Physics, 21, 467–488.
Johansson, N., & Larsson, J.-A. Realization of Shor’s algorithm at room temperature.
arXiv:1706.03215.
Kahn, J., & Kalai, G. (1993). A counterexample to Borsuk’s conjecture. Bulletin of the American
Mathematical Society, 29, 60–62.
Kalai, G. (2016a). The quantum computer puzzle. Notices of the American Mathematical Society,
63, 508–516.
Kalai, G. (2016b). The quantum computer puzzle (expanded version), arXiv:1605.00992.
Kalai, G. (2018). Three puzzles on mathematics, computation and games. In Proceedings of the
International Congress of Mathematicians, Rio de Janeiro (Vol. I, pp. 551–606).
Kalai, G., & Kindler, G. Gaussian noise sensitivity and BosonSampling. arXiv:1409.3093.
Kitaev, A. Y. (1997). Quantum error correction with imperfect gates. In Quantum communication,
computing, and measurement (pp. 181–188). New York: Plenum Press.
Knill, E., Laflamme, R., & Zurek, W. H. (1998). Resilient quantum computation: Error models and
thresholds. Proceedings of the Royal Society of London A, 454, 365–384.
McKay, B. D., & Radziszowski, S. P. (1995). R(4,5)=25. Journal of Graph Theory, 19, 309–322.
Pitowsky, I. (1990). The physical Church thesis and physical computational complexity. Iyyun: A
Jerusalem Philosophical Quarterly, 39, 81–99.
Pitowsky, I. (1991). Correlation polytopes: Their geometry and complexity. Mathematical Pro-
gramming, A50, 395–414.
Polterovich, L. (2014). Symplectic geometry of quantum noise. Communications in Mathematical
Physics, 327, 481–519.
Preskill, J. (1998). Quantum computing: Pro and con. Proceedings of the Royal Society of London
A, 454, 469–486.
Preskill, J. (2013). Sufficient condition on noise correlations for scalable quantum computing.
Quantum Information and Computing, 13, 181–194.
Shor, P. W. (1995). Scheme for reducing decoherence in quantum computer memory. Physical
Review A, 52, 2493–2496.
Shor, P. W. (1999). Polynomial-time algorithms for prime factorization and discrete logarithms
on a quantum computer. SIAM Review, 41, 303–332. (Earlier version, Proceedings of the 35th
Annual Symposium on Foundations of Computer Science, 1994.)
Steane, A. M. (1996). Error-correcting codes in quantum theory. Physical Review Letters, 77, 793–
797.
Troyansky, L., & Tishby, N. (1996). Permanent uncertainty: On the quantum evaluation of the
determinant and the permanent of a matrix. In Proceedings of the 4th Workshop on Physics and
Computation.
Wigderson, A. (2019). Mathematics and computation. Princeton, NJ: Princeton University Press.
Wolfram, S. (1985). Undecidability and intractability in theoretical physics. Physical Review
Letters, 54, 735–738.
Chapter 19
Why a Relativistic Quantum Mechanical
World Must Be Indeterministic
19.1 Introduction
ends.1 One of the aims of this paper (see more details below) is to give a general
characterization of contextuality in the context of Quantum Mechanics (QM) and its
relation to other concepts such as locality and determinism.
Empirical Models (EM) (defined in Brandenburger and Yanofsky 2008; Abram-
sky and Brandenburger 2011) are probabilistic frameworks for analyzing the
properties of measurement scenarios that arise in both classical and contextual
experiments. One may view the EM framework as a general mathematical frame-
work in which the full scope of probability theories can be analyzed. An EM consists
of a set of admissible measurement sequences and the space of their corresponding
outcomes together with a probability structure defined on these spaces. In simple
terms, an EM is a “table” of measurements and their outcomes with the relative
frequency at which these outcomes are observed.
The more commonly used Hidden Variables Model (HVM) is an extension of the EM
in which a “state” space is appended to the measurement and outcome spaces. In
an HVM the state of the measured system is often taken to provide an
“explanation” for the frequencies of the measurement outcomes captured in the
corresponding EM. The EM is the phenomenological description of the experimen-
tal results while the HVM describes a deeper structure that realizes the observed
measurement outcomes and their relative frequencies as depicted by the EM. Since
in general there may be several HVMs that differ in their properties, all of which
give rise to the same EM, one may conclude that the EM is the more fundamental
layer of describing the physical reality in the sense that it gives the empirically
compelling constraints on the HVM.
The EM and HVM are very general structures that are not restricted specif-
ically to quantum experiments and can describe any reasonable experimental
setting. As demonstrated beautifully in Brandenburger and Yanofsky (2008) (see
also Abramsky and Brandenburger 2011; Brandenburger and Keisler 2016) the
HVM framework allows defining, in an accurate mathematical way, interesting
properties of physical systems such as Determinism (Strong and Weak), Locality,
No-Signaling and Contextuality and deriving relations between these properties.
(We use these terms now without explicit definitions; we shall give rigorous
mathematical definitions as we proceed.)
Our first aim in this paper is to extend the definitions of the physical properties
of Determinism, Locality and Non-Signaling from the HVM framework to the EM
framework. Such an extension is partially proposed in the above-mentioned references,
but we would like to complete it here. If one looks only at the EM, the relations
between these properties can be attributed to the actual “physical reality” rather than
to a specific HVM theory. Indeed, this is exactly the way in which Bell’s Theorem
(1964) is understood (see e.g. Pitowsky 1986; Albert 1992; Maudlin 1994), namely
that any violation of the Bell inequality (given Bell’s assumptions) entails that the
physical reality itself is nonlocal.

1 The “degree” of contextuality can be measured by the value of certain bounds on inequalities
pertaining to the probability distribution that are violated for contextual (non-classical) probability
theories.

We shall define the above properties in terms of
EM, examine their relations and compare them to the corresponding properties and
relations of the HVM.
The second aim of this paper is to characterize the contextuality property which
as we will show is relevant to EM only. There seems to be some confusion in the
literature about how precisely to understand contextuality and we would like to
clarify it. This discussion will lead us to propose that under reasonable assumptions
Non-Contextuality is in a sense equivalent to Locality and to Strong Determinism,
and Contextuality is equivalent to Non-Locality and to Weak (that is, not Strong)
Determinism.
Our third aim is to consider the EM and HVM framework in the context of
Special Relativity (SR). The No-Signaling property is an implicit requirement
stemming from the principles of SR, which implies that no physical signal can
be transmitted at superluminal speed. Here we shall add explicitly the postulate
according to which a HVM compatible with SR must be genuinely Lorentz invariant,
where we understand this last requirement in a broad sense as follows: the laws of
physics should be the same in every Lorentz frame of reference, and moreover there
is no (absolute) temporal order (and therefore no preferred Lorentz reference frame)
that fixes the temporal direction of the cause and effect2 relation. We shall take it
that these conditions should hold in any relativistic theory regardless of whether
or not a violation of them may be empirically detected. We shall call this condition
from now on Lorentz invariance. We will show (in Sect. 19.5) that requiring Lorentz
invariance in this sense implies the No-Signaling property. In fact, assuming Lorentz
invariance, we will prove the following general result:
A contextual Empirical Model does not have a Strongly Deterministic (Local) or
a Weakly Deterministic (Non-Local) Hidden Variable Model.
Since the quantum mechanical probabilities of measurement outcomes that have
been confirmed by actual experiments match a contextual empirical model (i.e.
the quantum probabilities are non-classical; see Sect. 19.4), our proof of the
above proposition implies that under these assumptions (including our additional
requirement of the existence of no preferred Lorentz frame) the physical reality
must be genuinely stochastic. In other words, God must play dice in a relativistic
quantum world.
Bell (1964) was the first to prove that the EPR setting cannot be realized by
a local and (strongly) deterministic HVM. Results pertaining to non-local (weak)
determinism have been proposed in certain other contexts, e.g., Hardy and Squires
(1992), Clifton et al. (1992), Conway and Kochen (2006, 2009), Gisin (2011).
But our theorem above is obtained in a very general context which is explicitly
independent of quantum theory, and which is constrained only by the contextuality
of measurement outcomes.
The paper is organized as follows. In Sect. 19.2 we briefly give the basic
definitions, examples and the main results in the EM and HVM framework. In
Sect. 19.3 we present the definitions of the physical properties mentioned above
and their logical relations in the EM framework. In Sect. 19.4 we analyze the
property of contextuality. In Sect. 19.5 we extend the discussion of EM and HVM
to the framework of Special Relativity and we prove our general theorem, namely
that there is no deterministic theory that is consistent with both the predictions
of quantum mechanics and with the relativistic postulate of no preferred Lorentz
frame of reference. In Sect. 19.6 we briefly draw some implications of our proof
with regard to current interpretations of quantum mechanics. Section 19.7 is the
conclusion.
This section is rather technical and uses probability theory for the definitions
part. We put it forward for the sake of completeness of the paper. The attached
examples and the results pertaining to the relations between the properties can
be understood without going into the formal details of all the definitions. Unless
otherwise stated, all the definitions and results, but not the examples, are based
on Brandenburger and Yanofsky (2008), Abramsky and Brandenburger (2011) and
Brandenburger and Keisler (2016). However, our definitions of Empirical Model
and Hidden Variable Model are closer to those in Abramsky and Brandenburger
(2011) than to the definitions in the other two references. Hence there are certain
differences in the main formulations which we shall discuss at the end of Sect.
19.2.2. When appropriate we add explicitly in comments the physical motivation
and interpretation of the formal definitions, as we understand them.
Comments
• Given an EM (E, {Qm}), the HVM defined by (E × Λ, {Pm}), where Λ = {λ} is a
singleton state space and PA,B,...(a, b, . . . , λ) = QA,B,...(a, b, . . . ), is
consistent with (E, {Qm}). Hence every EM has a “trivial HVM” that is consistent
with it.
• Given an HVM (E × Λ, {Pm}), one can define an EM (E, {Qm}) where the prob-
ability measures are given by QA,B,...(a, b, . . . ) = ∑λ∈Λ PA,B,...(a, b, . . . , λ).
Then, by definition, the given HVM is consistent with this “derived” EM.
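The two constructions in the comments can be written out explicitly. A sketch in my own dictionary-based encoding (contexts map to distributions over outcomes; hidden states are tagged onto outcomes): the trivial one-state HVM and the EM derived by marginalizing over states are inverse to each other.

```python
def trivial_hvm(em):
    """Attach a one-point state space: P(a, ..., lam) = Q(a, ...)."""
    return {ctx: {("lam", outcome): p for outcome, p in dist.items()}
            for ctx, dist in em.items()}

def derived_em(hvm):
    """Marginalize the hidden state: Q(a, ...) = sum over lam of P(a, ..., lam)."""
    em = {}
    for ctx, dist in hvm.items():
        q = {}
        for (_, outcome), p in dist.items():
            q[outcome] = q.get(outcome, 0.0) + p
        em[ctx] = q
    return em

# Round trip: the EM derived from the trivial HVM is the original EM.
em = {("A", "B"): {(0, 0): 0.5, (1, 1): 0.5}}
print(derived_em(trivial_hvm(em)) == em)  # True
```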
             0,0   0,1   1,0   1,1
A,B | λ=0     1     0     0     0
A,C | λ=0     1     0     0     0
A,B | λ=1     0     0     0     1
A,C | λ=1     0     0     0     1

             0,0   0,1   1,0   1,1
A,B | λ=0     0     1     0     0
A,C | λ=0     0     0     0     1
A,B | λ=1     0     0     1     0
A,C | λ=1     1     0     0     0
The HVM of Example 2.2 is Weakly Deterministic, since given the state λ the outcome
of the two admissible measurements is uniquely determined, but it is not Strongly
Deterministic, since given λ = 0 (and also λ = 1) the outcome of the measurement of
A can be either 0 or 1 depending on whether it is measured together with B or with C.
This counterintuitive property, which characterizes certain quantum systems, is called
contextuality and will be investigated later in great detail.
No-Signaling (Parameter Independence)3 (NS)
An HVM is No-Signaling if for every state λ, the outcome of a measurement does
not depend on the other measurements performed simultaneously with it.
Formally: for every λ: PA,B,...(a | λ) = PA,B′,...(a | λ) = PA(a | λ).
The No-Signaling condition is associated with Special Relativity, since a HVM
that fails NS may allow superluminal signaling if the measurements are performed
in space-like separated locations.
Example 2.1 is No-Signaling but Example 2.2 is not as indicated before.
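The No-Signaling condition can be checked mechanically by comparing the marginals of each measurement across contexts. A sketch (the encoding is ours; the two tables are Examples 2.1 and 2.2 as we read them from the text):

```python
# Hedged sketch: an HVM is No-Signaling iff, for every hidden state, the
# marginal of a shared measurement is the same in every context where it
# appears. Tables encode Examples 2.1 and 2.2 (our reading of the text).

def marginal(dist, pos):
    """Marginal of the measurement at position `pos` in the context."""
    m = {}
    for outcome, p in dist.items():
        m[outcome[pos]] = m.get(outcome[pos], 0.0) + p
    return m

def no_signaling(tables):
    """True iff each measurement's marginal is context-independent."""
    seen = {}  # (measurement, lambda) -> marginal
    for (ctx, lam), dist in tables.items():
        for pos, meas in enumerate(ctx):
            m = marginal(dist, pos)
            if (meas, lam) in seen and seen[(meas, lam)] != m:
                return False
            seen[(meas, lam)] = m
    return True

ex21 = {(('A', 'B'), 0): {(0, 0): 1.0}, (('A', 'C'), 0): {(0, 0): 1.0},
        (('A', 'B'), 1): {(1, 1): 1.0}, (('A', 'C'), 1): {(1, 1): 1.0}}
ex22 = {(('A', 'B'), 0): {(0, 1): 1.0}, (('A', 'C'), 0): {(1, 1): 1.0},
        (('A', 'B'), 1): {(1, 0): 1.0}, (('A', 'C'), 1): {(0, 0): 1.0}}

print(no_signaling(ex21), no_signaling(ex22))  # True False
```

In Example 2.2 the marginal of A given λ = 0 is a point mass at 0 in the (A, B) context but at 1 in the (A, C) context, which is exactly the violation the text describes.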
3 The term ‘Parameter Independence’ is more common in the literature, but due to our focus
on the relation between EM and SR we prefer the term ‘No-Signaling’, which emphasizes the
physical implication of this property. Also note that we use the term ‘signaling’ in a neutral
way, disregarding completely whether or not the signals can be noticed or read out, unless we say
otherwise. See Maudlin (1994) for a broader discussion of signaling.
430 A. Levy and M. Hemmo
Comment
Even when an HVM does not satisfy the No-Signaling property, it may still be
impossible to transmit superluminal signals. This is the case when it is not feasible4
to prepare the system at those states where violation of the NS property takes place.
Bohm’s (1952) theory is an example of this situation (see discussions of this point
and its implications in Pitowsky 1989; Dürr, Goldstein and Zanghi 1992; Albert
1992; Maudlin 1994; Hemmo and Shenker 2013).
Outcome Independence (OI)
An HVM is Outcome Independent if for every state λ, the outcome of a measurement
does not depend on the outcomes of other measurements performed simultaneously
with it.
Formally: for every λ: PA,B,C,...(a, b, c, . . . | λ) =
PA,B,C,...(A = a | λ) PA,B,C,...(B = b | λ) PA,B,C,...(C = c | λ) · · ·
Example 2.3 In tossing two independent binary coins with measurements A, B
respectively, a state parameter λ ∈ {0, 1, 2, 3} characterizes the 4 possible outcomes.
Let P(λ) = 1/4 for every λ; then the HVM described by the following table
satisfies the Outcome Independence property:
              (0,0)  (0,1)  (1,0)  (1,1)
A, B | λ=0      1      0      0      0
A, B | λ=1      0      1      0      0
A, B | λ=2      0      0      1      0
A, B | λ=3      0      0      0      1
Example 2.4 The following “two balls in two jars” scenario is an example of an HVM
that is not Outcome Independent.
A red ball and a blue ball are placed in two jars with equal probabilities. The
measurement of A detects the color of the ball in the first jar and B in the second jar.
Assume a singleton state space and a probability table defined by:

          (r,r)  (r,b)  (b,r)  (b,b)
(A, B)      0     1/2    1/2     0
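Outcome Independence is likewise a factorization check on each table. A sketch (the encoding is ours), applied to the coins of Example 2.3 and the jars of Example 2.4:

```python
# Hedged sketch: a table is Outcome Independent iff every joint
# distribution factorizes into the product of its single-measurement
# marginals. Encodes Example 2.3 (coins) and Example 2.4 (jars).

from itertools import product

def marginal(dist, pos):
    m = {}
    for outcome, p in dist.items():
        m[outcome[pos]] = m.get(outcome[pos], 0.0) + p
    return m

def outcome_independent(tables, tol=1e-12):
    """True iff every table factorizes as a product of its marginals."""
    for dist in tables.values():
        n = len(next(iter(dist)))
        margs = [marginal(dist, pos) for pos in range(n)]
        for outcome in product(*[list(m) for m in margs]):
            prod = 1.0
            for pos, v in enumerate(outcome):
                prod *= margs[pos].get(v, 0.0)
            if abs(dist.get(outcome, 0.0) - prod) > tol:
                return False
    return True

coins = {(('A', 'B'), lam): {out: 1.0}
         for lam, out in enumerate([(0, 0), (0, 1), (1, 0), (1, 1)])}
jars = {(('A', 'B'), 0): {('r', 'b'): 0.5, ('b', 'r'): 0.5}}

print(outcome_independent(coins), outcome_independent(jars))  # True False
```

For the jars, the marginals of A and B are each uniform over {r, b}, so their product assigns 1/4 to (r, r), whereas the joint table assigns it 0: the perfect anticorrelation breaks the factorization.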
4 We don’t address here the question of why in a given theory the preparation of a system in a
certain state λ is not feasible.
19 Why a Relativistic Quantum Mechanical World Must Be Indeterministic 431
Locality
The Locality property is the conjunction of the No-Signaling and Outcome Indepen-
dence properties, as defined by Bell (1964):
PA,B,C,...(a, b, c, . . . | λ) = PA(a | λ) PB(b | λ) PC(c | λ) · · · for every λ.
λ-Independence (LI)
Brandenburger and Yanofsky (2008) define an additional property of HVM, which
they call λ-Independence (LI). This property entails that the state of the measured
system as determined by the value of the hidden variable is independent of the
measurement performed on the system and vice versa. However, in our setting of
HVM this definition is meaningless, since we did not assume a probabilistic structure
on the admissible sets of measurements. In fact, we tacitly assumed that there can
be no dependence between the set of admissible measurements and the state of
the measured system. Assuming the contrary means that there is some sort of a
“universal conspiracy” that correlates the choices of the experimenter and
the state of the measured system. For this reason, the λ-Independence property is
sometimes associated with lack of ‘free will’ (we don’t address this issue here).
As is well known, Bell (1964) assumed λ-Independence. Although retro-
causal approaches to QM reject λ-Independence and admit correlations between the
choices of the experimenter and the state of the measured system (see e.g. Price and
Wharton 2013), we shall adopt, as we said above, the more commonly accepted
assumption that λ-Independence holds.
The following theorems and propositions are stated without their proofs. The
interested reader can find the proofs in Brandenburger and Yanofsky (2008) and
Brandenburger and Keisler (2016).
Existence Theorem
For every Empirical Model there exists a consistent Weakly Deterministic Hidden
Variable Model.
Note that this theorem does not guarantee that there is a consistent Strongly
Deterministic HVM. The most remarkable example of a Weakly Deterministic
HVM theory is Bohm’s interpretation of QM. As we shall see later, its Weakly
Deterministic (but not Strongly Deterministic) nature is intimately related to the fact
that the values assigned to all quantum mechanical observables other than position are contextual.
It follows from the above existence theorem that the Weakly Deterministic property
of Bohm’s theory is not an exception. However, it is still remarkable of course
that Bohm’s theory provides dynamics for the hidden variables that solves the
measurement problem in QM in a way that is compatible with the Schrödinger
dynamics of the quantum wavefunction, while this existence theorem refers only
to measurement outcomes without mentioning dynamics at all.
However, the generality of this theorem may come as a surprise since in the
history of QM it took a few decades to realize that von Neumann’s “no-go proof,”
according to which hidden variables theories consistent with the predictions of QM
do not exist, presupposed that the hidden variables have pre-measurement values
that can be measured as quantum observables (see Bub (2010) for a discussion of
this point). In the terminology of the EM and HVM framework, von Neumann’s
assumption is equivalent to requiring that the HVM satisfies Strong Determinism,
and indeed such an HVM does not exist for a contextual Empirical Model as we
shall explain later.
Comment on Strongly Deterministic Explanation
Brandenburger and Yanofsky (2008) consider another existence theorem:
Existence Theorem
For every Empirical Model there exists a consistent Strongly Deterministic HVM
(but not necessarily λ-independent).
Note that although this theorem guarantees that every Empirical Model has a
Strongly Deterministic HVM, this comes with the price that this model may not
satisfy λ-Independence (described at the end of Sect. 19.2.2). As we said, in
our setting we presuppose independence between the set of admissible measurements
and the state of the measured system. Hence, in this setting λ-Independence is
guaranteed and therefore a Strongly Deterministic explanation may fail. We will see
in Sect. 19.4 that if the Empirical Model is classical (non-contextual; see below)
then it does have a consistent Strongly Deterministic HVM even in our setting (i.e.
an HVM that satisfies λ-Independence), but if it is contextual then a consistent
Strongly Deterministic HVM does not exist.
Relations Between HVM Properties (‘→’ Means Material Implication)
Let us state the following relations between properties of HVM without explicit
proofs:
Let A, A′ denote the spin measurements of particle 1 in two different directions
and B, B′ the spin measurements of particle 2 in the same two directions. The
outcomes of the measurements are a ∈ {±1} for A, A′ and b ∈ {±1} for B, B′.
Finally, let the state space be Λ = {0, 1} with uniform distribution and define the
measurement outcomes deterministically by the table below.
One can notice that this model is Weakly Deterministic since for every
(λ, m1 , m2 ) the tuple (a, b) is determined uniquely, but it is not Strongly Deter-
ministic as for (λ = 0, m1 = A, m2 = B), b = −1 but for (λ = 0, m1 = A′, m2 = B),
b = 1 and hence the outcome of measurement B is not determined by the state
alone. The model is Outcome Independent but is not No-Signaling which can be
verified directly or concluded from relations (1), (4) above, and therefore this model
is not local.
λ    m1   m2    a    b
0    A    B     1   −1
1    A    B    −1    1
0    A    B′    1    1
1    A    B′   −1   −1
0    A′   B     1    1
1    A′   B    −1   −1
0    A′   B′    1   −1
1    A′   B′   −1    1
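These claims about the table can be verified mechanically. A sketch (the encoding is ours, with the primed contexts reconstructed as (A, B), (A, B′), (A′, B), (A′, B′)):

```python
# Hedged sketch: the spin table above, with our reconstruction of the
# primed contexts. We verify that the model is Weakly but not Strongly
# Deterministic.

table = {  # (context, lambda) -> deterministic joint outcome (a, b)
    (('A',  'B'),  0): (1, -1), (('A',  'B'),  1): (-1,  1),
    (('A',  "B'"), 0): (1,  1), (('A',  "B'"), 1): (-1, -1),
    (("A'", 'B'),  0): (1,  1), (("A'", 'B'),  1): (-1, -1),
    (("A'", "B'"), 0): (1, -1), (("A'", "B'"), 1): (-1,  1),
}

# Weakly Deterministic: every entry is a single outcome for each
# (lambda, context), by construction of the table.
weak = True

# Strongly Deterministic: the outcome of each measurement would have to
# be a function of lambda alone, across all contexts containing it.
values = {}
strong = True
for (ctx, lam), outcome in table.items():
    for meas, v in zip(ctx, outcome):
        if values.setdefault((meas, lam), v) != v:
            strong = False  # same (measurement, lambda), different value

print(weak, strong)  # True False
```

Since every row is deterministic, the failure of Strong Determinism found here (b = −1 in the (A, B) context but b = 1 in the (A′, B) context at λ = 0) is at the same time a context-dependent marginal for B, i.e. a No-Signaling violation, in accordance with relations (1) and (4).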
a, b =    (0,0)  (0,1)  (1,0)  (1,1)
A, B        0     1/2    1/2     0
A, C       1/2     0      0     1/2
However, the original HVM defined in this Example (2.2) is not No-Signaling
(since it is WD∗ ). This example demonstrates the fact that the phenomenal-level EM
may have properties that differ from those of a specific physical HVM that realizes it.
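The derived EM above can be recomputed from the Example 2.2 HVM with the uniform prior, confirming that the phenomenal level is No-Signaling even though the HVM is not (the encoding is ours):

```python
# Hedged sketch: average the Example 2.2 HVM (deterministic outcomes,
# uniform prior over lambda in {0, 1}) into its derived EM, and check
# that the marginal of A is the same in both contexts at the EM level.

hvm = {(('A', 'B'), 0): (0, 1), (('A', 'C'), 0): (1, 1),
       (('A', 'B'), 1): (1, 0), (('A', 'C'), 1): (0, 0)}

em = {}
for (ctx, lam), outcome in hvm.items():
    d = em.setdefault(ctx, {})
    d[outcome] = d.get(outcome, 0.0) + 0.5  # prior P(lambda) = 1/2

marg_A = {ctx: {a: sum(p for (x, _), p in d.items() if x == a)
                for a in (0, 1)}
          for ctx, d in em.items()}
print(marg_A)  # A is uniform in both contexts: EM-level No-Signaling
```

The marginal of A comes out as {0: 0.5, 1: 0.5} in both the (A, B) and the (A, C) contexts, matching the derived table above, even though the underlying HVM signals state by state.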
In contrast to the HVM case, the conjunction of Outcome Independence and
No-Signaling for an EM is not equivalent to the Locality property. This is due to the
fact that there may be a consistent HVM that is Outcome Independent and another
consistent HVM that is No-Signaling, while there is no single consistent HVM that
satisfies both OI and NS. Example 3.1 above demonstrates this situation, since it has
a consistent Weakly Deterministic HVM that is Outcome Independent (by relation
(1)) and its trivial HVM is No-Signaling, but it is not local (i.e. it does not have a local
HVM) since it is not classical. We will discuss this last point in Sect. 19.4.
Relations Between Properties of an Empirical Model
The first three relations pertaining to HVM properties (see Sect.
19.2) are satisfied also by EM:
1. Strong Determinism → Weak Determinism → Outcome Independence
2. Strong Determinism → No-Signaling, hence
3. Strong Determinism → Locality
However, relation (4): Weak Determinism and No-Signaling → Strong Deter-
minism, fails because (as we mentioned above) there can be a consistent HVM
which is WD and another consistent HVM which is No-Signaling, but there is no
single consistent HVM with both properties. This situation is demonstrated by the
following example:
Example 3.2 Let there be four measurements A, A′, B, B′ each with outcomes in
{0, 1} and the following conditional probability table:
19.4 Contextuality
6 See Pitowsky (1986) for characterizations of classical and quantum probability spaces as convex
subsets of the space of phenomenal distributions.
and the contextual EM has a consistent Weakly Deterministic, but not Strongly
Deterministic HVM.
One last comment here, which is important for the next section. If one wants
a contextual Empirical Model to be compatible with Special Relativity, the model
must satisfy No-Signaling. Let there be a consistent Weakly Deterministic HVM
(which is guaranteed by the existence theorem). Then, this HVM cannot satisfy
No-Signaling, since if it did, it would be Strongly Deterministic (by property (4)
of Sect. 19.2), which implies that the Empirical Model is not contextual (contrary
to our assumption). We conclude that every deterministic HVM consistent with a
contextual EM must have a state that allows superluminal signaling! Therefore, in
order to comply with Special Relativity, it is either impossible to prepare a system in
such a state, or else a Weakly Deterministic HVM cannot exist. In the next section
we will present a strong argument in support of the second option.
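The situation described here can be illustrated with the PR-box EM mentioned in footnote 13 (our illustration, not an example from the paper). The PR box is No-Signaling and contextual, and because each of its contexts is a perfect (anti)correlation, a λ-independent deterministic HVM for it would have to be a mixture of global assignments to A, A′, B, B′ each satisfying a ⊕ b = x·y in all four contexts. A brute-force scan shows that no such assignment exists:

```python
# Hedged sketch: search all 16 global value assignments (a, a', b, b')
# for one satisfying the PR-box constraints a XOR b = x*y in all four
# contexts (A,B), (A,B'), (A',B), (A',B'). None does, so the PR-box EM
# has no noncontextual (lambda-independent, deterministic) model.

from itertools import product

ok = [(a, a1, b, b1)
      for a, a1, b, b1 in product((0, 1), repeat=4)
      if a ^ b == 0 and a ^ b1 == 0 and a1 ^ b == 0 and a1 ^ b1 == 1]

print(len(ok))  # 0
```

The four XOR constraints sum to 1 mod 2 while their left-hand sides cancel pairwise, which is why the scan necessarily comes up empty.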
7 As we said in Sect. 19.1, by Lorentz invariance we mean that there is no preferred Lorentz frame
of reference and therefore no absolute time order and no absolute direction of causation.
8 This point has been stressed by Yakir Aharonov since (as far as we know) the mid 1980s.
measure directly the value of this state, but when it is known, the probability distri-
bution over the measurement outcomes described by the EM can be explained. This
explanation may be deterministic (Strongly Deterministic or Weakly Deterministic
HVM) or otherwise stochastic. Therefore, the HVM plays the role of the physical
theory that allows making predictions about, or producing explanations for, the
results of measurements performed on the system.
Although the EM and HVM framework is aimed at analyzing physical systems,
it does not refer explicitly to space and time. Yet, the properties of No-Signaling
and Locality originated from physical systems that are positioned in space-time and
hence it is only natural to introduce these aspects to the EM and HVM framework.
A simple way of doing this is to assign to each measurement a space-time label
and then every pair of measurements can be in a space-like, time-like or light-like
relation. Under these relations one can require, for example, that properties
such as No-Signaling or Outcome Independence apply only to measurements
that are in the space-like relation. One can conclude (from Special Relativity) that
an EM that does not satisfy No-Signaling with respect to two space-like related
measurements cannot represent a physical system, since it would allow transmitting
information at superluminal speed.
An HVM, on the other hand (as explained in Sect. 19.2.2), may violate No-
Signaling but nevertheless constitute a deeper physical structure that gives
rise to the EM, since the HVM might also rule out the possibility of preparing a
system in a specific state, so that transmitting superluminal signals is not physically
feasible.
However, as we shall see below, requiring that an HVM be Lorentz Invariant in the
sense that it predicts consistent measurement outcomes in different inertial frames,
implies that the HVM must be No-Signaling, and from this fact we shall prove a no-
go theorem according to which a Weakly Deterministic HVM realizing a contextual
EM does not exist.
We stress that this no-go theorem is strictly stronger than the exclusion of a
strongly deterministic HVM for a contextual EM presented in Sect. 19.4, since
the existence theorem of Sect. 19.2.3, guarantees the existence of a Weakly
Deterministic HVM for every EM, including a contextual one. On the other hand,
our no-go theorem does not contradict this existence theorem, since it is the
additional requirement of Lorentz Invariance that excludes the existence of the Weakly
Deterministic HVM. So, as we shall see below, our no-go theorem does not imply
that there is no Weakly Deterministic HVM consistent with a contextual EM. It
does, however, show that such a HVM will be inconsistent with the fact that the
time order of space-like measurements may be reversed for certain inertial frames.
To establish this no-go theorem, let us start by proving the following Lemma.
Lemma 5.1 An HVM that is consistent with Lorentz Invariance must be No-
Signaling.
By consistency with Lorentz Invariance we mean that given two measurements
in space-like locations the HVM yields identical probabilistic predictions for every
inertial frame of reference.
choice of measurement B or B′ may be taken after the measurement in Alice’s
location.9 Hence P^AB_{X,Y,...}(X = a | λ) = P^AB_X(X = a | λ).
And similarly in the BA frame: P^BA_{X,Y,...}(Y = b | λ) = P^BA_Y(Y = b | λ).
In addition, the two Lorentz frames must agree on the probability distributions
of every event and therefore one concludes that:
9 We assume here λ-independence according to which there are no correlations between the choice
of measurement and the outcome of other measurements through the parameter λ.
10 See e.g., the two Mach–Zehnder type interferometers by Hardy and Squires (1992), the Kochen–
Specker setup by Conway and Kochen (2006, 2009), Bell’s EPR scenario by Gisin (2011), and
the Suarez–Scarani model by Suarez (2010).
Let us now briefly show how using the EM and HVM frameworks may be helpful in
dissolving certain debates that have been raised recently concerning the possibility
of constructing interpretations of QM that are Lorentz invariant. An argument
appearing in Conway and Kochen (2009) claims that for a contextual EM it is
possible to exclude also the existence of a Lorentz invariant consistent probabilistic
HVM. Conway and Kochen (2009) apply an analysis of the probability distribution
functions of a contextual EM which is similar to Gisin’s argument. However, their
conclusion is unsound, as indicated also in Goldstein et al. (2010) for a specific
scenario, since given a contextual, No-Signaling EM, the trivial (singleton Λ)
consistent HVM is a No-Signaling Lorentz invariant probabilistic explanation for
this EM.
Our analysis shows that it is impossible to construct any deterministic Lorentz
invariant (and hence No-Signaling) HVM that yields the quantum mechanical
predictions. Since, for example, Bohm’s (1952) theory is Weakly Deterministic,
it must fail to be Lorentz invariant, and in this sense Bohm’s theory is incompatible
with Special Relativity (SR).11 The attempts to overcome this problem resort to
postulating that there is a preferred Lorentz reference frame which describes the
absolute temporal order of events (see Dürr et al. 2013). This absolute temporal
order circumvents our proof above exactly where we assume that a later event in
some reference frame cannot have an impact on an earlier event in the same frame.
The case of stochastic theories (of the kind originally proposed by Ghirardi et
al. 1986, GRW) is subtler, since prima facie a genuinely stochastic theory may
have a Lorentz invariant relativistic version compatible with QM, as we have shown
above. In Tumulka (2006) it is suggested that the flash theory denoted by rGRWf
is fully compatible with SR, in particular in the sense that it requires no preferred
Lorentz frame. Tumulka’s (2006) proposal has been criticized by Esfeld and Gisin
(2014) who claim that “neither rGRWf, nor any other theory, can account for the
occurrences of the Alice-flash and the occurrences of the Bob-flash in a Lorentz-
invariant manner”. This conclusion is based on the argument presented originally in
Gisin (2011) which we also used in this section to derive our no-deterministic HVM
result. Let us now examine Esfeld and Gisin’s (2014) argument in the context of the
rGRWf theory and the EM and HVM framework.
The core of Esfeld and Gisin’s (2014) argument (on our understanding) is that
the probabilities observed at space-like separation by Alice and Bob are “classical”,
11 For explicit proofs that Bohm’s theory requires an absolute temporal order (i.e., a preferred
Lorentz frame), see Albert (1992); Hemmo and Shenker (2013). Of course, as we said above,
Bohm’s theory does not allow one to exploit its violation of the No-Signaling property, since its
additional dynamical law for the motion of particles together with its statistical postulate concerning
the distribution of particles entails that preparing a system at a position known more precisely than
the system’s prepared wavefunction is physically impossible.
i.e. they can be “computed and simulated on a classical computer.” Hence, they
conclude that there exists a classical variable λ ∈ Λ (that can be stored on a
computer’s hard disc) such that the outcomes on both sides of the EPR experiment
are given by the four deterministic functions that depend on this variable and on the
settings of the measurements as presented in our proof above. From this assumption,
they derive their conclusion according to which the rGRWf (and any other theory)
cannot provide a genuinely relativistic (i.e., Lorentz invariant) account of an EPR
experiment.
However, as we have just seen in this section, there always exists a probabilistic
HVM that is consistent with EPR-type experiments (or every other contextual
experiment) and this fact obviously disagrees with Esfeld and Gisin’s conclusion.
We will now argue that Esfeld and Gisin’s argument implicitly presupposes an
assumption which in fact need not hold in a stochastic theory, and specifically does
not hold in rGRWf.
Esfeld and Gisin (2014) are correct in arguing that the marginal probabilities of
an EPR empirical model, namely those computed by Alice or Bob, are classical.12
However, as explained in Sect. 19.4 above, this does not imply that there is a
common classical probability model for all possible combinations of measurements
on the two wings in the EPR setting. Assuming that the classical marginal
probabilities imply the existence of a common classical probability is equivalent
to assuming that the EM is non-contextual, which is not satisfied by the EM of the
EPR experiment.
So perhaps Esfeld and Gisin (2014) mean that there exists a consistent Weakly
Deterministic (non-classical) HVM for the EPR probabilities, which is indeed
true, as shown by the Existence Theorem in Sect. 19.2. In this case one can derive a
contradiction with Lorentz invariance, as we have just done in this section. However,
even this assumption will not make their argument against the rGRWf theory valid,
unless one can prove that the rGRWf theory is Weakly Deterministic. But we do not
see how this can be done, because both the original GRW theory and the rGRWf
theory are manifestly genuinely stochastic theories. Of course, our point here should
not be taken as a positive proof that rGRWf is a genuinely relativistic interpretation
of QM. Such a proof seems to go beyond the framework of EM and HVM.
19.7 Conclusion
We saw in this paper that using the EM and HVM mathematical framework
allows us to clarify certain relations between significant physical properties such
as Determinism, Locality and Contextuality. Moreover, we were able to gain
insight into the debate concerning the prospects of interpretations of QM that are
compatible with Special Relativity. In particular, we have proved here, using the EM
and HVM framework, that the physical reality of our world (assuming that there is
no absolute time order) must be stochastic.13
12 A non-classical EM may have many classical marginal probabilities; this is the core of Abramsky
Viewed from this framework one can even think of standard QM (in von
Neumann’s formulation) as a stochastic HVM (in which the quantum state plays
the role of the hidden variable) that is consistent with the observed frequencies of
measurement outcomes. As we saw in Sect. 19.5, from a mathematical point of view,
Lorentz invariance (in the sense that there is no preferred Lorentz frame)14 by itself
is consistent with a genuinely stochastic HVM that yields the predictions of QM.
The genuine stochastic nature of QM allows it to have the property of No-Signaling
even at the HVM level.
However, the EM and HVM framework cannot really settle the question of
the compatibility of QM and SR because the dynamical aspect of physics is
missing in this framework. And obviously, one should not expect to find in this
framework a resolution to the measurement problem in QM. But it is nonetheless
encouraging that there is no mathematical barrier to finding a stochastic theory that
is compatible with both QM and SR. Whether or not rGRWf or other attempts
should be considered satisfactory from a physical point of view is still to be settled.
Acknowledgement We thank Orly Shenker and Yemima Ben-Menahem for discussions of issues
related to this paper. We also thank Ori Feiglin, Aharon Gill, Zohar Maliniak and Moti Suess, from
the research group in philosophy of physics at the University of Haifa for valuable comments on
this paper. Finally, we thank Samson Abramsky for very valuable comments on an earlier draft of
this paper. This research has been supported by the Israel Science Foundation (ISF), grant number
1148/18.
13 We did not go into the question of why God’s dice is quantum rather than, say, a PR dice. This
will involve adding extra assumptions that inhibit the appearance of super-quantum theories (see
Rohrlich and Aharonov 2013 for a recent survey).
14 Of course, in a relativistic quantum theory, the Schrödinger dynamics is replaced with a
References
Abramsky, S. (2013). Relational hidden variables and non-locality. Studia Logica, 101(2), 411–452.
Abramsky, S., & Brandenburger, A. (2011). The sheaf-theoretic structure of non-locality and
contextuality. New Journal of Physics, 13, 113036.
Albert, D. (1992). Quantum mechanics and experience. Cambridge, MA: Harvard University Press.
Badziąg, P., Bengtsson, I., Cabello, A., & Pitowsky, I. (2009). Universality of state-independent
violation of correlation inequalities for noncontextual theories. Physical Review Letters, 103, 050401.
Bell, J. (1964). On the Einstein-Podolsky-Rosen paradox. Physics, 1, 195–200.
Ben-Menahem, Y. (2012). Locality and determinism: The odd couple. In Y. Ben-Menahem & M.
Hemmo (Eds.), Probability in physics (pp. 149–166). Berlin/Heidelberg: Springer.
Bohm, D. (1952). A suggested interpretation of the quantum theory in terms of “hidden” variables.
Physical Review, 85, 166–179, 180–193.
Brandenburger, A., & Keisler, H. J. (2016). Fiber products of measures and quantum foundations.
In J. Chubb, A. Eskandarian, & V. Harizanov (Eds.), Logic and algebraic structures in quantum
computing, Lecture notes in logic (Vol. 45, pp. 71–87). Cambridge: Cambridge University Press.
Brandenburger, A., & Yanofsky, N. (2008). A classification of hidden variable properties. Journal
of Physics A: Mathematical and Theoretical, 41, 425302.
Bub, J. (2010). Von Neumann’s ‘no hidden variables’ proof: A re-appraisal. Foundations of
Physics, 40(9), 1333–1340.
Bub, J., & Pitowsky, I. (2010). Two dogmas about quantum mechanics. In S. Saunders, J. Barrett,
A. Kent, & D. Wallace (Eds.), Many worlds? Everett, quantum theory and reality (pp. 433–459).
Oxford: Oxford University Press.
Clifton, R., Pagonis, C., & Pitowsky, I. (1992). Relativity, quantum mechanics and EPR. In D. Hull,
M. Forbes, & K. Okruhlik (Eds.), PSA 1992 (Vol. 1, pp. 114–128). East Lansing: Philosophy
of Science Association.
Conway, J., & Kochen, S. (2006). The free will theorem. Foundations of Physics, 36(10), 1441–1473.
Conway, J., & Kochen, S. (2009). The strong free will theorem. Notices of the American
Mathematical Society, 56(2), 226–232.
Dürr, D., Goldstein, S., & Zanghi, N. (1992). Quantum equilibrium and the origin of absolute
uncertainty. Journal of Statistical Physics, 67(5/6), 843–907.
Dürr, D., Goldstein, S., & Norsen, T. (2013). Can Bohmian mechanics be made relativistic?
Proceedings of the Royal Society A, 470(2162), 20130699.
Esfeld, M., & Gisin, N. (2014). The GRW flash theory: A relativistic quantum ontology of matter
in space-time? Philosophy of Science, 81(2), 248–264.
Fine, A. (1982). Hidden variables, joint probability, and the Bell inequalities. Physical Review
Letters, 48(5), 291.
Ghirardi, G., Rimini, A., & Weber, T. (1986). Unified dynamics for microscopic and macroscopic
systems. Physical Review D, 34, 470–479.
Gisin, N. (2011). Impossibility of covariant deterministic nonlocal hidden-variable extensions of
quantum theory. Physical Review A, 83, 020102(R).
Goldstein, S., Tausk, D. V., Tumulka, R., & Zanghi, N. (2010). What does the free will theorem
actually prove? Notices of the American Mathematical Society, 57, 1451–1453.
Hardy, L., & Squires, E. J. (1992). On the violation of Lorentz-invariance in deterministic hidden-
variable interpretations of quantum theory. Physics Letters A, 168(3), 169–173.
Hemmo, M., & Shenker, O. (2013). Probability zero in Bohm’s theory. Philosophy of Science, 80,
1148–1158.
Kolmogorov, A. N. (1933). Foundations of the theory of probability. New York: Chelsea Publishing
Company. English translation 1956.
Maudlin, T. (1994). Quantum non-locality and relativity: Metaphysical intimations of modern
physics. Oxford: Wiley-Blackwell.
Peres, A. (1995). Quantum theory: Concepts and methods. Dordrecht: Kluwer Academic Publishers.
Pitowsky, I. (1986). The range of quantum probability. Journal of Mathematical Physics, 27, 1556.
Pitowsky, I. (1989). Quantum probability – quantum logic. Berlin: Springer-Verlag.
Pitowsky, I. (2007). Quantum mechanics as a theory of probability. In W. Demopoulos & I.
Pitowsky (Eds.), Physical theory and its interpretation: Essays in honor of Jeffrey Bub, The
Western Ontario series in philosophy of science (Vol. 72, pp. 213–240). Springer.
Popescu, S., & Rohrlich, D. (1994). Quantum nonlocality as an axiom. Foundations of Physics,
24(3), 379–385.
Price, H., & Wharton, K. (2013). Dispelling the quantum spooks – a clue that Einstein missed?
arXiv:1307.7744 [physics.hist-ph].
Redhead, M. (1987). Incompleteness, nonlocality and realism. Oxford: Clarendon.
Rohrlich, D., & Aharonov, Y. (2013). PR-box correlations have no classical limit. In D. C.
Struppa & J. M. Tollaksen (Eds.), Quantum theory: A two-time success story (Yakir Aharonov
festschrift) (pp. 205–211). New York: Springer.
Suarez, A. (2010). The general free will theorem. arXiv:1006.2485 [quant-ph].
Tumulka, R. (2006). A relativistic version of the Ghirardi–Rimini–Weber model. Journal of
Statistical Physics, 125, 821–840.
Chapter 20
Subjectivists About Quantum
Probabilities Should Be Realists About
Quantum States
Wayne C. Myrvold
20.1 Introduction
W. C. Myrvold
Department of Philosophy, The University of Western Ontario, London, ON, Canada
e-mail: wmyrvold@uwo.ca
decision theory, pioneered by Ramsey and de Finetti, and developed more fully
by Savage. On this view, an agent’s preferences between acts (often illustrated by
choices of bets to make or accept) are taken to be indications of the agent’s degrees
of belief in various propositions.
It is an attractive idea to extend this conception to quantum probabilities. One
systematic development of this idea can be found in Pitowsky’s “Betting on the
outcomes of measurements: a Bayesian theory of quantum probability” (Pitowsky
2003). The idea also forms the basis of the position (or family of positions) known as
QBism (Caves et al. 2002, 2007; Fuchs and Schack 2013; Fuchs et al. 2014; Fuchs
2017; Fuchs and Stacey 2019). Proponents of such views tend to also take quantum
states to be nothing more than codifications of an agent’s beliefs. But the argument
for this conclusion is never spelled out with sufficient clarity.
On the other hand, we find arguments—of which the theorem of Pusey et al.
(2012) is the best known—for the opposite conclusion, namely, that quantum
states should be taken to represent elements of physical reality. If interpreting
quantum probabilities as subjective leads one inexorably to rejecting an ontic view
of quantum states, then there must be something about the reasoning behind these
arguments that doesn’t go through on a subjective reading of probability. In this
chapter, I will argue that this is not the case. Suitably interpreted, a version of the
PBR argument leads to the conclusion that an agent who takes quantum mechanics
as a guide for setting her subjective degrees of belief about the outcomes of future
experiments ought to believe that preparations with which she associates distinct
pure states result in ontically distinct physical states of affairs (this terminology will
be explained in Sect. 20.3, below). Note that this conclusion is entirely about the
agent’s beliefs. We are not presuming, as part of the argument, that the agent’s
beliefs in any way reflect physical reality. That is a matter for the agent’s own
judgment (though, of course, there is an implication that if you take quantum
mechanics as a guide for setting your degrees of belief about the outcomes of future
experiments, you should take distinct pure states to be ontically distinct).
The argument, of course, involves assumptions about the agent’s credences. It is
also assumed that the agent may entertain theories about the way the world is, and
have degrees of belief in such theories. It is, of course, logically possible to reject
all such theorizing. If such a move is proposed in connection with quantum theory,
then the question arises whether the empirical evidence that led the community of
physicists to accept quantum theory over classical theory provides any incentive
for the move. I will argue that we have no reason whatsoever to make such a
move. Anyone is, of course, free to make such a move without reason, as a free
choice. Dissuasion of someone so inclined will not be the topic of this chapter; what
concerns us here is the evidential situation.
In this section I would like to bring to the reader’s mind some considerations of
the sort that may be called “pre-theoretical”—the sort of considerations that tend
20 Quantum Probabilities 451
to be taken for granted, prior to any physical theorizing. They are, for that reason, to
be thought of as neither particularly classical nor quantum. I do not take myself to
be saying anything new, radical, or contentious. Of course, one may start a scientific
investigation with certain presuppositions about the way the world is, and end up
concluding, as a result of one’s investigations, that some of those presuppositions
are false. If any reader is inclined to dispute some assertion made in this section,
then I will ask that reader to provide evidence for the falsity of the assertion. I
hope that it goes without saying that the mere fact that some proposition leads to an
unwelcome conclusion does not suffice as evidence that the proposition is false.
Consider the following situation. You are attending a lecture, and you are taking
notes on a laptop. This process involves a sequence of choices: you are choosing, at
every moment, which, if any, keys to hit. Why are you doing this? If you are taking
notes for your own benefit, as an aid to memory, then, presumably, you believe that,
at some later date, you will be able to see or hear or feel (depending on the device
used for readout) something that will be informative about your prior choices of
key-strokes. If you are taking notes for someone else’s benefit, then, presumably,
you believe that, at a later date, that other agent will be able to see or hear or feel
something that will be informative about your earlier choices of key-strokes.1
Why are you using a laptop, rather than, say, a kumquat? Presumably, you believe
that there is something about the internal structure of the laptop that makes it suitable
for the purpose. Suppose that you can use the readout to distinguish between two
alternative choices of things to write today. For concreteness, consider a choice
between writing
A ‘Twas brillig, and the slithy toves did gyre and gimble in the wabe.
B Now is the winter of our discontent made glorious summer by this sun of York.
If you believe that the readout tomorrow will allow you to distinguish between
having made choice A and choice B today (in case you’re forgetful)—that is, if
you believe that the laptop can serve as an auxiliary memory—this is to attribute
to the laptop a certain role in a causal process linking your choice today to your
experience tomorrow. Even if you have little or no idea of the internal workings of
the laptop, attribution of this causal role to the laptop involves an assumption that
your action will have an effect on the internal state of the laptop, and that the effect
your action had on the internal state of the laptop will, in turn, have an effect on
what it is that you experience tomorrow.
Now, an extreme version of operationalism might forbid you to entertain
conjectures about the internal state of the laptop, and speak only of the relations
between your act today and the readout tomorrow, with no mention of intervening
variables. But the fact is that you are using a laptop rather than a kumquat, and,
in fact, would not attempt to use a kumquat for this purpose, and that means that
1 One might also consider the following situation. You are reading a text of which you are not the
author. Presumably, you take what you are reading as informative about what choices of keystrokes
the author made while composing it. If not, then why are you still reading?
452 W. C. Myrvold
you believe that there is something about the physical constitution of the laptop that
makes it suitable for the role you wish it to play in the causal chain between your act
today and the readout tomorrow. Caution should be exercised in theorizing about the
internal workings of the laptop, but, as long as we are careful not to be committed
to anything more than what we have good evidence for, there is no harm, and some
potential benefit, in theorizing about the processes that mediate your actions today
and experiences tomorrow.
Suppose, then, that you begin—tentatively, and with all due caution—to form
a theory about the inner workings of the laptop. What constraints do your beliefs
about the input-output relations of the laptop place on the sorts of theories you
should entertain about its inner workings?
It seems that there are a few general things that can be said. If you believe
that the laptop can serve as a means for discriminating (with certainty) between
act A and act B, you should believe that there are distinct sets of internal states
corresponding to these two acts. That is, the set of states that the laptop could end
up in, as a result of your performing act A, has no overlap with the set of states that
it could end up in as a result of your performing act B. Moreover, these two sets of
states are distinguishable upon readout—the readout mechanism produces outputs,
corresponding to the two sets of internal states associated with acts A and B, that
are perceptually distinguishable by you. The only alternative to believing this is to
believe in a direct causal effect of your choice today on your experience tomorrow,
unmediated by any effect on the state of the world in the interim.
Of course, discrimination with absolute certainty is too much to ask. You might
not regard your laptop as a completely reliable means for obtaining information
about your past choices of key-strokes. That’s okay; it can still be informative, as
long as you take some perceived outputs to be more likely, given some choices of
keystrokes, than others. We can cash this out in terms of your conditional degrees
of belief.
All of this will be formalized in the next section.
2 The framework in this section is based on the ontological models framework of Harrigan and
Spekkens (2010).
We consider some physical system (you may think of the laptop as a running
example), with which our agent will engage. She has some set of operations
{φ_1, . . . , φ_n} that can be performed on the system at time t_0 (think choices of
key-strokes). We will also consider situations in which an act is chosen without Alice’s
knowledge of which one; she has an n-sided die that she can toss (and when she
does, her credence is equally divided among the possible outcomes), and a gadget
that will choose among the operations {φ_1, . . . , φ_n}, depending on the outcome of
the die-throw.
At some later time t_1, she performs an operation R (a “readout” operation), which
will result in one of a set of perceptually distinguishable results {r_1, . . . , r_m}. (We
can, of course, generalize, and give her a choice of operations to perform, but for
our purposes, one will suffice.)
We assume that our agent has conditional credences, cr_A(r_k | φ_i), representing
credence in result r_k on the supposition of act φ_i. Upon learning the result of the
operation R, she uses these to update, via conditionalization, her credences about
which act was performed.
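As a toy illustration (with hypothetical numbers, not anything drawn from the chapter), conditionalization on a readout result works as follows: Alice multiplies her prior credence in each act by the likelihood cr_A(r_k | φ_i) and renormalizes.

```python
# Hypothetical conditional credences cr_A(r_k | phi_i) for two acts and
# two readout results, and a uniform prior over acts from a fair die-throw.
likelihood = {
    ("r1", "phi1"): 0.9, ("r2", "phi1"): 0.1,
    ("r1", "phi2"): 0.2, ("r2", "phi2"): 0.8,
}
prior = {"phi1": 0.5, "phi2": 0.5}  # uniform credence over the acts

def conditionalize(result, likelihood, prior):
    """Bayes' rule: cr_A(phi_i | r_k) is proportional to
    cr_A(r_k | phi_i) * cr_A(phi_i)."""
    unnorm = {act: likelihood[(result, act)] * p for act, p in prior.items()}
    total = sum(unnorm.values())
    return {act: w / total for act, w in unnorm.items()}

posterior = conditionalize("r1", likelihood, prior)
# Seeing r1 shifts credence toward phi1: 0.45 / (0.45 + 0.10) = 9/11
```

The particular numbers are illustrative only; what matters is the shape of the update, which is the conditionalization the text describes.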
Alice does not believe that the acts {φ_i} have a direct influence on the outcomes
{r_k}, unmediated by any effect of those acts on the system S. We will permit her
to entertain various theories about the workings of the system. Any theory T of
that sort will involve a set Ω_T of possible physical states. At the level of generality
at which we are operating, we will assume nothing about the structure of these
state-spaces Ω_T, and, in particular, do not assume that they are either classical or
quantum in structure. We do assume that, on the supposition of theory T, Alice can
form credences in propositions about the state of the system S to the effect that the
system’s state is in a subset Δ of Ω_T, and that, for each theory T, there is an algebra
A_T of subsets of Ω_T deemed suitable for credences of that sort.
Suppose, now, that Alice regards the state of the system to be relevant to the
outcome of the readout operation. What she thinks about that relevance may, of
course, depend on the theory T. To represent these judgments, we suppose her to
have conditional credences of the form cr_A(r_k | φ_i, T, ω). This is her conditional
credence, on the supposition of theory T and the supposition that the state of
the system is ω ∈ Ω_T, that the result of the readout will be r_k, if act φ_i was
performed. It should be stressed that these are Alice’s conditional credences; there
is no assumption that these conditional credences mirror any causal structure out
there in the rest of the world. Alice may hope that her conditional credences reflect
something of the causal structure of the world (or, rather, the causal structure
according to theory T), but no assumption to the effect that they do reflect the causal
structure of the world will form any part of our argument, because our conclusions
will only be about what Alice should believe.
We assume, also, that, on the supposition of T, Alice has credences about
the states produced by the acts φ_i. For any subset Δ of Ω_T that is in A_T, let
cr_A(Δ | T, φ_i) be Alice’s conditional credence, on the supposition of theory T, that
performing act φ_i would put the system into a state in Δ. Her credences about the
state should mesh with her conditional credences cr_A(r_k | φ_i, T, ω) to yield her act-
outcome conditional credences, on the supposition of theory T:

cr_A(r_k | φ_i, T) = ⟨cr_A(r_k | φ_i, T, ω)⟩,   (20.1)

where the angle-brackets ⟨·⟩ indicate expectation value taken with respect to her
credences cr_A(Δ | T, φ_i) about the state, given act φ_i. If {T_j} is the set of theories
that Alice takes seriously enough to endow with nonzero credence, then her act-
outcome conditional credences cr_A(r_k | φ_i) should satisfy

cr_A(r_k | φ_i) = Σ_j cr_A(r_k | φ_i, T_j) cr_A(T_j).   (20.2)
As mentioned, Alice does not believe that the acts {φ_i} have a direct influence on
the outcomes {r_k}, unmediated by any effect of those acts on the system S. This will
be reflected in her conditional credences; her belief can be captured by the condition
that cr_A(r_k | φ_i, T, ω) be independent of which act is performed. That is, for all i, j,

cr_A(r_k | φ_i, T, ω) = cr_A(r_k | φ_j, T, ω),   (20.3)

for all states ω ∈ Ω_T. This is the condition that her choice of act has an influence
on the later outcome only via the influence of this choice on the state of the system.
Now, part of the point of taking notes on a laptop is that what you see later will
permit you to distinguish between choices of key-strokes you made earlier. There
may be a limit to such discrimination; if you type something and then delete it,
the later readout might fail to distinguish between alternatives for the deleted text.
Some pairs of acts will be readout-distinguishable; others will not. We will say that
Alice takes two acts {φ_i, φ_j} to be R-distinguishable if, for every outcome r_k of the
operation R, either cr_A(r_k | φ_i) or cr_A(r_k | φ_j) is zero.
If Alice takes two acts to be R-distinguishable, then, provided she takes the
influence of acts performed on the system on the later readout to be mediated by
a change in the state of the system, she should also take it that the sets of states
that can result from the two acts are distinct. We can formalize this: say that Alice
takes the effects of acts {φ_i, φ_j} to be ontically distinct, on the supposition of theory
T, if there exists Δ ∈ A_T such that cr_A(Δ | T, φ_i) = 1 and cr_A(Δ | T, φ_j) = 0. We
have the following simple theorem.
Theorem 1 If an agent’s credences satisfy (20.1), (20.2), and (20.3), then, if she
takes two acts to be R-distinguishable for some operation R, she also takes
the effects of those acts to be ontically distinct on any theory T in which she places
nonzero credence.
We emphasize: this is a theorem about Alice’s credences, not about actual states of
the system. Alice could be wrong in her judgment that the effects on her laptop of
different choices of keystrokes are ontically distinct. It might be that her keystrokes
have no effect whatsoever on the internal state of the laptop (in which case she
would be misguided in taking those acts to be readout-distinguishable).
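Theorem 1 can be illustrated with a small toy ontological model (all numbers hypothetical). The response credences depend only on the ontic state, as condition (20.3) requires; outcome credences are computed as the expectation in (20.1); and the two acts come out both R-distinguishable and ontically distinct.

```python
# Toy ontological model with four ontic states.  The response credences
# cr_A(r_k | omega) do not depend on the act (condition (20.3)).
states = ["w1", "w2", "w3", "w4"]
response = {"w1": {"r1": 1.0, "r2": 0.0},
            "w2": {"r1": 1.0, "r2": 0.0},
            "w3": {"r1": 0.0, "r2": 1.0},
            "w4": {"r1": 0.0, "r2": 1.0}}
# Alice's credences cr_A(omega | phi_i) about the state produced by each act.
state_cred = {"phi1": {"w1": 0.5, "w2": 0.5, "w3": 0.0, "w4": 0.0},
              "phi2": {"w1": 0.0, "w2": 0.0, "w3": 0.6, "w4": 0.4}}

def outcome_cred(act):
    """Equation (20.1): expectation of cr_A(r_k | omega) with respect to
    Alice's credences about the state, given the act."""
    return {r: sum(state_cred[act][w] * response[w][r] for w in states)
            for r in ("r1", "r2")}

def r_distinguishable(a, b):
    ca, cb = outcome_cred(a), outcome_cred(b)
    return all(ca[r] == 0.0 or cb[r] == 0.0 for r in ca)

def ontically_distinct(a, b):
    """Disjoint supports: a set Delta with credence 1 under one act, 0 under the other."""
    support = lambda act: {w for w in states if state_cred[act][w] > 0}
    return support(a).isdisjoint(support(b))
```

In this model the two acts are R-distinguishable and their supports in the ontic state space are disjoint, exactly the pattern the theorem says must hold together.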
All of this is very general, and no assumptions have been made about the
sorts of theories that Alice might entertain. Now, suppose we apply it to quantum
mechanics.

We have come to the conclusion that Alice should take the effects of preparation-acts
that she regards as R-distinguishable for some operation R to be ontically distinct.
This includes acts with which she associates orthogonal quantum states. But what
about non-orthogonal quantum states?
Let {φ_1, . . . , φ_n} be a set of preparation-acts that are such that Alice’s credences
about the results of future experiments, upon performance of these acts, can be
represented by pure quantum states {|φ_1⟩, . . . , |φ_n⟩}. If Alice takes the effects of
these acts to be ontically distinct, this means that she believes that the physical
state carries a trace of which preparation was performed and that, if she knew
the physical state, this would be enough to specify which of the quantum states
{|φ_1⟩, . . . , |φ_n⟩} she deems appropriate to use to set her credences about the
results of future experiments. That is, she believes that, for each physical state that
could arise from one of these preparations, there is a unique quantum state that
“corresponds” to it as the state to use in setting her credences. The correspondence
might not be one-one, as there might be a multiplicity of physical states that
correspond to the same quantum state.
If Alice believes that the effects of any pair of preparation-acts with which she
associates distinct pure quantum states are ontically distinct, then Alice believes that
there is something in physical reality corresponding to these quantum states. We will
say, in such a case, that Alice is a realist about pure quantum states.
The PBR theorem can be adapted to provide a set of conditions on Alice’s
credences sufficient for her to be a realist about pure quantum states.
Following the terminology of Leifer (2014), we will say that Alice takes a set
{φ_1, . . . , φ_n} of preparation-acts to be antidistinguishable if there is an operation R
such that, for each result r_k, cr_A(r_k | φ_i) is zero for at least one φ_i. Though Alice
might not be able to uniquely decide which of the acts was performed, she will
always be able to rule at least one out, whatever the result of operation R is.
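The definition can be put algorithmically. Here is a minimal sketch, with hypothetical credence values, that checks a table of credences cr_A(r_k | φ_i) for antidistinguishability.

```python
def antidistinguishable(cred):
    """cred[act][result] = cr_A(r_k | phi_i).  The set of acts is
    antidistinguishable under R if every possible result assigns zero
    credence under at least one act, i.e. rules at least one act out."""
    acts = list(cred)
    results = cred[acts[0]]
    return all(any(cred[a][r] == 0.0 for a in acts) for r in results)

# Hypothetical credences for three acts and three results: whatever
# result occurs, exactly one act is ruled out.
cred = {"phi1": {"r1": 0.0, "r2": 0.5, "r3": 0.5},
        "phi2": {"r1": 0.5, "r2": 0.0, "r3": 0.5},
        "phi3": {"r1": 0.5, "r2": 0.5, "r3": 0.0}}
```

Note that this is weaker than R-distinguishability, which would require each result to rule out all acts but one.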
The setup of the PBR theorem is as follows. Suppose that we have two systems,
S_1 and S_2, of the same type, and suppose that each of them can be subjected to
either of two preparation-acts, {φ, ψ}. We assume that each choice of preparation-act
performed on one system is compatible with both choices of act performed on
the other. This gives us four possibilities of preparation-acts for the joint system,
which we will designate by {φ_1 ⊗ φ_2, φ_1 ⊗ ψ_2, ψ_1 ⊗ φ_2, ψ_1 ⊗ ψ_2}.
Suppose, now, that Alice regards these four preparation-acts as antidistinguishable.
Under certain further conditions, which we will now specify, she should then
take the effects of the preparation-acts φ and ψ on the two systems to be ontically
distinct.
A theory about the physical states of the two systems, presumably, would be able
to treat S_1 and S_2 on their own, and also to regard the composite system that has S_1
and S_2 as parts as a system in its own right. It will, therefore, include state-spaces
Ω_1 and Ω_2 for the individual systems, and a state-space Ω_12 for the composite
system. In a classical theory, to specify the physical state of the composite system
S_1 + S_2, it suffices to specify the states of the component parts, and so the state
of the composite can be represented as an ordered pair ⟨ω_1, ω_2⟩ of states of the
components, and Ω_12 can be taken to be the set of all such ordered pairs. That is,
Ω_12 can be taken to be the cartesian product of Ω_1 and Ω_2. For other theories, such
as quantum-mechanical theories, the cartesian product Ω_1 × Ω_2 might be included
in Ω_12 without exhausting it. In quantum mechanics, for any states ψ_1, ψ_2 of two
disjoint systems, there is a corresponding product state ψ_1 ⊗ ψ_2 of the composite,
but not all pure states of the composite system are of this form, as there are also pure
entangled states.
We will say that a theory T satisfies the Cartesian Product Assumption (CPA)
if Ω_1 × Ω_2 is a subset of Ω_12. Given a theory satisfying the CPA, Alice’s
credences regarding a pair of preparations φ_1, φ_2 that can be performed on the
two subsystems will be said to satisfy the No Correlations Assumption (NCA) if
her credences about the joint state of the two systems, on the supposition of T
and the preparations φ_1, φ_2, are concentrated on the cartesian product Ω_1 × Ω_2,
and if her credences are such that information about the state of one system would
be completely uninformative about the state of the other. The conjunction of the
Cartesian Product Assumption and the No Correlations Assumption is called the
Preparation Independence Postulate (PIP). Note that this is a combination of an
assumption about the structure of the state space of a theory T and an assumption
about Alice’s credences, conditional on the supposition of T and on the preparations
φ_1, φ_2.
Now suppose that we have a pair of systems S_1, S_2, a theory T about them that
satisfies the CPA, and, for each system, a pair of preparation-acts φ_i, ψ_i, which are
such that:

1. There is an operation R that can be performed on the pair of systems, such that,
for each outcome r_k of the operation, at least one of the credences cr_A(r_k | φ_1 ⊗ φ_2),
cr_A(r_k | φ_1 ⊗ ψ_2), cr_A(r_k | ψ_1 ⊗ φ_2), cr_A(r_k | ψ_1 ⊗ ψ_2) is zero. That is, Alice
regards the four joint preparation-acts as antidistinguishable.
2. Alice’s credences satisfy the NCA, on the supposition of T, for each of these four
joint preparation-acts.

It then follows, by the argument of Pusey et al. (2012), that Alice should regard the
effects of these preparations to be ontically distinct, conditional on the supposition
of theory T.
Now, of course, Alice might entertain theories for which the CPA does not hold,
and, among those for which it does, her conditional credences might satisfy the
NCA for some but not for others. In this case, she should not be certain that the
effects of these preparations are ontically distinct. However, her credence that they
are must be at least as high as her total credence in the class of theories for which
the above-listed conditions are true. Another way to put this is: in order to be certain
that the effects of these preparations are not ontically distinct—that is, in order to
categorically deny onticity of quantum states—Alice’s credences in all such theories
must be strictly zero.
Though it seems natural from a classical standpoint, quantum mechanics gives us
incentive to question the Cartesian Product Assumption. In the set of pure states of
a bipartite system, product states are rather special, and every neighbourhood of any
product state contains infinitely many entangled states. If both systems have Hilbert
spaces of infinite dimension, then the entangled states are dense in the set of all
states, pure or mixed (see Clifton and Halvorson 2000, 2001). We should, therefore,
take seriously the cases of theories whose state spaces either lack a Cartesian product
component, or are such that product states cannot reliably be prepared exactly,
although we may be able to approximate them arbitrarily closely. A modification
of the PBR argument that does not employ the Cartesian Product Assumption is
therefore desirable.
In Myrvold (2018, 2020), a substitute for the PIP is proposed, which is strictly
weaker than it and dispenses with the Cartesian Product Assumption. This substitute
is called the Preparation Uninformativeness Condition (PUC).
To state the assumption, we consider the following set-up. Suppose that, for
systems S1 , S2 , we have some set of possible preparation-acts that can be performed
on the individual systems. Suppose that the choice of preparation for each of the
subsystems is made independently, say, by rolling two separate dice. Following
the preparation of the joint system, which consists of individual preparations on
the subsystems, you are not told which preparations have been performed, but you
are given a specification of the ontic state of the joint system. On the basis of this
information, you form credences about which preparations were performed. In the
case of preparations whose effects you take to be ontically distinct, you will be
certain about what preparation has been performed; otherwise, you may have less
than total information about which preparations were performed.
We ask: under these conditions, if you are now given information about which
preparation was performed on one system, is this informative about which prepa-
ration was performed on the other? The Preparation Uninformativeness Condition
is the condition that it is not. This condition is satisfied whenever the PIP is. It is
also satisfied if you regard the effects of the preparations to be ontically distinct: in
such a case, given the ontic state of the joint system, you are already certain about
which preparations have been performed, and being told about the preparation on
one system will not shift your credences.
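In a toy model (hypothetical numbers) the PUC can be checked directly: given the ontic state of the joint system, the posterior credence over preparation pairs should factorize, so that learning one preparation does not shift credence about the other. The sketch below uses a model in which the effects of the preparations are ontically distinct, which, as noted just above, guarantees that the PUC holds.

```python
from itertools import product

preps = ["a", "b"]
# cr[(p1, p2)][lam]: credence that joint preparation (p1, p2) leaves the
# joint system in ontic state lam.  Here the effects are taken to be
# ontically distinct: each preparation pair has its own ontic state.
cr = {("a", "a"): {"L1": 1.0, "L2": 0.0, "L3": 0.0, "L4": 0.0},
      ("a", "b"): {"L1": 0.0, "L2": 1.0, "L3": 0.0, "L4": 0.0},
      ("b", "a"): {"L1": 0.0, "L2": 0.0, "L3": 1.0, "L4": 0.0},
      ("b", "b"): {"L1": 0.0, "L2": 0.0, "L3": 0.0, "L4": 1.0}}

def posterior(lam):
    """Credence over preparation pairs after learning the ontic state lam;
    the preparations are chosen independently and uniformly (two fair dice)."""
    unnorm = {pair: cr[pair][lam] * 0.25 for pair in cr}
    z = sum(unnorm.values())
    return {pair: w / z for pair, w in unnorm.items()}

def puc_holds():
    """PUC: for every ontic state, the posterior over pairs factorizes,
    so being told one preparation is uninformative about the other."""
    for lam in ("L1", "L2", "L3", "L4"):
        post = posterior(lam)
        m1 = {p: sum(post[(p, q)] for q in preps) for p in preps}
        m2 = {q: sum(post[(p, q)] for p in preps) for q in preps}
        if any(abs(post[(p, q)] - m1[p] * m2[q]) > 1e-12
               for p, q in product(preps, preps)):
            return False
    return True
```

In this deterministic model each posterior is a point mass, so the factorization holds trivially; models satisfying the PUC without the PIP require non-product state spaces and are more intricate (see Myrvold 2018).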
The PUC is implied by the PIP, but it is strictly weaker. Even if the CPA is
assumed, it is possible to construct models for the PBR setup, outlined in the
previous section, in which the PUC is satisfied but the PIP is not. See Myrvold
(2018) for one such construction.
On the assumption of the Preparation Uninformativeness Condition—which,
in the current context, is a condition on the agent’s credences, independent of
the structure of the state space of the theory considered—one can get a ψ-
ontology proof for a class of nonorthogonal states. Let {φ_i, ψ_i} (i = 1, 2) be
preparation-acts such that Alice’s credences about the results of future experiments
can be represented by quantum states {|φ_i⟩, |ψ_i⟩}. If |⟨φ_i|ψ_i⟩| ≤ 1/√2,
then {|φ_1⟩|φ_2⟩, |φ_1⟩|ψ_2⟩, |ψ_1⟩|φ_2⟩, |ψ_1⟩|ψ_2⟩} is an antidistinguishable set. From
these assumptions, together with a mild side-assumption called the principle of
extendibility,3 it follows that, for any theory T such that Alice’s credences satisfy
the PUC for each of the preparation-acts {φ_1 ⊗ φ_2, φ_1 ⊗ ψ_2, ψ_1 ⊗ φ_2, ψ_1 ⊗ ψ_2},
Alice takes the effects of the preparation-acts {φ_i, ψ_i} to be ontically distinct.
A key feature of quantum mechanics used in these proofs is the fact
that, if {|φ_i⟩, |ψ_i⟩} are any state vectors with |⟨φ_i|ψ_i⟩| ≤ 1/√2, then
{|φ_1⟩|φ_2⟩, |φ_1⟩|ψ_2⟩, |ψ_1⟩|φ_2⟩, |ψ_1⟩|ψ_2⟩} is an antidistinguishable set. What is
needed for the argument is that there exist some preparations and an operation
such that Alice’s preparation-response conditional credences match the Born-rule
probabilities yielded by these states for the outcomes of the experiment envisaged.
This is not a necessity of probabilistic coherence. What we will assume is that Alice
accepts quantum mechanics, in the following sense: for any quantum state ρ of a
system, and any complete set of mutually orthogonal projections {P_1, . . . , P_n} on
the Hilbert space of the system, there is some combination of preparation-act and
readout-operation such that Alice’s credences in the possible results {r_1, . . . , r_n}
match (or at least closely approximate) the Born-rule probabilities Tr(ρP_k).4
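This fact can be checked numerically in the simplest case. The following sketch takes |φ⟩ = |0⟩ and |ψ⟩ = |+⟩ (overlap exactly 1/√2) and the entangled four-outcome measurement basis used by Pusey et al. (2012); each outcome has Born probability zero on one of the four product preparations, so the set is antidistinguishable.

```python
import numpy as np

# Computational basis and Hadamard basis states.
z0 = np.array([1.0, 0.0]); z1 = np.array([0.0, 1.0])
plus = (z0 + z1) / np.sqrt(2); minus = (z0 - z1) / np.sqrt(2)

# The four product preparations: |0>|0>, |0>|+>, |+>|0>, |+>|+>.
preps = [np.kron(a, b) for a in (z0, plus) for b in (z0, plus)]

# PBR measurement basis; xi[k] is orthogonal to the k-th product state.
xi = [(np.kron(z0, z1) + np.kron(z1, z0)) / np.sqrt(2),
      (np.kron(z0, minus) + np.kron(z1, plus)) / np.sqrt(2),
      (np.kron(plus, z1) + np.kron(minus, z0)) / np.sqrt(2),
      (np.kron(plus, minus) + np.kron(minus, plus)) / np.sqrt(2)]

# Born probabilities: born[k][j] = |<xi_k | prep_j>|^2.
born = np.array([[abs(np.dot(x, p)) ** 2 for p in preps] for x in xi])
# The diagonal vanishes: every outcome rules out one preparation, so the
# four preparations form an antidistinguishable set.
```

The states and the measurement basis here are the standard ones from the PBR paper for this special case; for general pairs with overlap below 1/√2 the construction is more involved.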
The upshot of these theorems is that, for any theory T for which the assumptions
hold (which may include both assumptions about the theory and assumptions about
Alice’s credences conditional on the supposition of the theory), Alice should,
conditional on T, take the effects of the preparation-acts considered to be ontically distinct.
Thus, her credence in the ontic view of quantum states should be at least as high
as her total credence in the class of all such theories. The only way that she can
categorically deny that preparation-acts corresponding to distinct quantum states
yield ontically distinct states of reality is to attach credence zero to the class of all
such theories.
3 This says that any system composed of N subsystems of the same type can be regarded as a part
of a larger system consisting of a greater number of systems of the same type.
4 Of course, we could reasonably expect something stronger, that this holds, not just for projections,
but also for any positive-operator valued measure (POVM). But the weaker condition is all that we
need.
Advocates of QBism will not accept the conclusion that a subjectivist about quantum
probabilities should be a realist about quantum states. Yet the argument so far has
not violated any of the core tenets of QBism. To avoid the conclusion, QBists must
add to its main tenets a further prohibition, logically independent of the explicitly
stated core tenets.
In Sect. 20.2, we first argued for what, in my opinion, ought to be an uncontentious
claim: that an agent who regards a pair of preparation-acts to be distinguishable
ought to regard the effects of these acts to be ontically distinct. This has the
consequence that, if an agent associates orthogonal quantum states with a pair of
preparation-acts, she ought to take their effects to be ontically distinct. The extension
of this conclusion to nonorthogonal quantum states requires additional assumptions,
which might be questioned. But the QBist will want to resist the very first step,
having to do with orthogonal quantum states.
Nonetheless, the reasoning involved does not violate what proponents of QBism
have identified as its fundamental tenets. These are given in different versions in
different venues, but the point holds for any of these versions.
Fuchs, Schack, and Mermin (2014) present the following as the “three funda-
mental precepts” of QBism.
1. A measurement outcome does not preexist the measurement. An outcome is
created for the agent who takes the measurement action only when it enters the
experience of that agent. The outcome of the measurement is that experience.
Experiences do not exist prior to being experienced.
2. An agent’s assignment of probability 1 to an event expresses that agent’s personal
belief that the event is certain to happen. It does not imply the existence of an
objective mechanism that brings about the event. Even probability-1 judgments
are judgments. They are judgments in which the judging agent is supremely
confident.
3. Parameters that do not appear in the quantum theory and correspond to nothing
in the experience of any potential agent can play no role in the interpretation of
quantum mechanics.
Regarding Precept 1: We have not assumed that the result of a readout-operation
(we have refrained from using the misleading term “measurement”) preexists
the operation, or that experiences exist prior to being experienced. Nor have we
presupposed that the result of such an operation merely indicates a preexisting
element of reality.
Regarding Precept 2: What we have concluded is that, if an agent takes two
preparation-acts to be distinguishable with certainty, the agent should take the
effects of those acts to be ontically distinct. The agent may be mistaken about this,
of course. For example: I may be convinced that I will be able to learn tomorrow
about my choices of key-strokes today, because I regard different choices of what
to write today as readout-distinguishable tomorrow. Accompanying this belief is a
belief that distinct choices of keystrokes today, if I save the file and don’t erase it,
have the effect of putting my laptop’s internal mechanism into physically distinct
states, which will have observationally distinguishable effects tomorrow. I could be
wrong about this, of course. It might be that my laptop is malfunctioning and that
there are, in fact, no lasting effects of my choices of keystrokes today. But—and
this is the crucial point—if I were to revise my judgment about the ontic effects of
my keystrokes, I would also revise my assessment of the readout-distinguishability
of my choices tomorrow. What I would not do is simultaneously maintain that my
choices today have no effect on the internal state of the laptop and that the result of
the readout operation tomorrow will be informative about those choices.
Regarding Precept 3: The conclusion that an agent who takes two preparation-
acts to be distinguishable should regard the effects of these acts to be ontically
distinct is quite general, and is not specific to quantum mechanics. It invokes
the sort of considerations that can be brought to bear on any sort of theorizing
about the world. It should also be emphasized that the mere introduction of some
sort of physical state-space Ω does not automatically consign us to the realm
of hidden-variables interpretations of quantum mechanics. The structure of the
physical state-space is left completely open, and our conclusion that one should
take there to be distinctions in physical reality corresponding to distinctions between
pure quantum state-preparations does not, by itself, commit us to anything else.
Fuchs (2017) presents the “three tenets” of QBism as
1. The Born rule is a normative statement.
2. All probabilities, including all quantum probabilities, are so subjective they never
tell nature what to do. This includes probability-1 assignments. Quantum states
thus have no “ontic hold” on the world.
3. Quantum measurement outcomes just are personal experiences for the agent
gambling on them.
Regarding Tenet 1: Neither the Born rule, nor anything at all quantum-mechanical,
was invoked in the argument for the conclusion that an agent should take the
effects of preparation-acts that she regards as readout-distinguishable to be ontically
distinct. A special case of this is a pair of preparations that are such that her
credences can be represented by orthogonal quantum states. In the extension
to non-orthogonal quantum states, Born-rule probabilities were invoked only in
consideration of what sorts of credences an agent who accepts quantum mechanics
should have.
Regarding Tenet 2: Again, we have not assumed anything at all about what nature
does. Our conclusion has only to do with the meshing of an agent’s act-outcome
conditional credences with her credences about the states of the physical system
that mediates between the acts and the outcomes. As we have already mentioned
in connection with the malfunctioning laptop, that system is under no obligation to
conform to her credences, and, she may, indeed, find her expectations disappointed.
If the last sentence in the statement of Tenet 2 is supposed to indicate a denial
of an ontic construal of quantum states, then, despite the “thus,” it does not follow
from the first. As we have seen, one can begin with the assumption that quantum
probabilities are subjective and end with a conclusion that an agent whose credences
conform to quantum mechanics ought to take quantum states to be ontic.
Regarding Tenet 3: Nothing in the argument depends on whether the results {rk }
of the operation are taken to be things like states of the laptop screen, or states of
the agent’s nervous system, or her conscious experience upon looking at the screen.
Given that we have not, in our arguments, violated the fundamental precepts or
tenets of QBism, why would, and how could, a QBist resist the conclusion? The only
option open is to reject the claim that there is a causal relation between my choices
of preparation-acts today and my experiences of the results of readout operations
tomorrow that is mediated by the effects of my actions on the state of the physical
system on which I act. There are two ways to do this. One is to maintain that there is
a direct causal link between my present actions and future experiences, unmediated
by any effect of my actions on the physical system on which I act. The other is to
reject all talk of causal links between my present actions and future experiences.
There are passages in some writings by QBists that suggest the latter.
There is a sense in which this unhinging of the Born Rule from being a “law of nature” in
the usual conception . . . i.e., treating it as a normative statement, rather than a descriptive
one — makes the QBist notion of quantum indeterminism a far more radical variety than
anything proposed in the quantum debate before. It says in the end that nature does what
it wants, without a mechanism underneath, and without any “hidden hand” (Fine 1989) of
the likes of Richard von Mises’s Kollektiv or Karl Popper’s propensities or David Lewis’s
objective chances, or indeed any conception that would diminish the autonomy of nature’s
events. Nature and its parts do what they want, and we as free-willed agents do what we can
get away with. Quantum theory, on this account, is our best means yet for hitching a ride
with the universe’s pervasive creativity and doing what we can to contribute to it (Fuchs
2017, 272).
The suggestion is that the physical world obeys neither deterministic nor stochastic
laws.
What sort of physical laws, if any, nature conforms to is not something that can
be known a priori. If someone proposes a physical theory according to which one
should expect to observe certain statistical regularities, that theory can be subjected
to empirical test according to the usual Bayesian methods. This will never produce
certainty that the theory is correct, but can result in high credence that the theory,
or something like it, is at least approximately correct within a certain domain of
application. In particular, if someone were to propose a theory according to which
a given sort of preparation-act (say, passing a particle of a certain type through a
Stern-Gerlach apparatus oriented vertically, and selecting the + beam) bestowed
determinate chances on certain sorts of subsequent experiments, we could test that
assertion by repeating the preparation-experiment sequence, and looking to see
whether the relative frequencies converged to a stable value, in accordance with the
weak law of large numbers. If we routinely find the theoretical expectation fulfilled,
this should boost credence in the theory; if the outcomes of the experiment are not
at all like what the theory leads us to expect, this counts as evidence against the
theory. One could imagine that persistent failure to find any sort of regularity at all
might boost credence in the proposition that there are no regularities to be found—
462 W. C. Myrvold
if the agent could survive that long! (It may be a requirement for the existence of
agents that could have credences that there be some modicum of predictability in
the world.)
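The testing procedure just described, namely repeating a preparation-measurement sequence, watching whether relative frequencies stabilize, and updating credence by Bayesian conditioning, can be sketched in a few lines of Python. This is a toy illustration only, not anything from the text: the hypothesized chance of 0.5 and the rival hypothesis of 0.7 are arbitrary choices.

```python
import random

def run_trials(p_plus, n, seed=0):
    """Simulate n repetitions of a fixed preparation followed by a
    two-outcome readout whose hypothesized chance of '+' is p_plus."""
    rng = random.Random(seed)
    return [rng.random() < p_plus for _ in range(n)]

def relative_frequency(outcomes):
    """Relative frequency of '+' outcomes, which the weak law of large
    numbers says should converge on the true chance, if there is one."""
    return sum(outcomes) / len(outcomes)

def bayes_update(prior, p_theory, p_alt, outcomes):
    """Update credence in the theory against a rival chance hypothesis,
    outcome by outcome, via Bayes' theorem."""
    cred = prior
    for plus in outcomes:
        like_t = p_theory if plus else 1 - p_theory
        like_a = p_alt if plus else 1 - p_alt
        cred = cred * like_t / (cred * like_t + (1 - cred) * like_a)
    return cred

# A world in which the '+' chance really is 0.5.
data = run_trials(0.5, 10_000)
freq = relative_frequency(data)           # hovers near 0.5
cred = bayes_update(0.5, 0.5, 0.7, data)  # credence in p=0.5 over p=0.7
```

As the passage notes, this never yields certainty: the Bayesian update drives credence toward the better-supported chance hypothesis, but a finite run of outcomes is always compatible with either.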
Despite passages like the one quoted above, I am skeptical that any QBist actually
rejects the claim that there is a causal link between their present actions and their
future experiences, mediated by an effect of their actions on the physical systems
with which they interact. I find it difficult to imagine how someone whose beliefs
were like that would act. I am certain, however, that someone with beliefs like that
would not judge a laptop that was advertised as having more memory—that is, more
readout-distinguishable internal states that the user could choose between via choice
of actions performed on the machine—to be worth more money than one with less
memory. Indeed, it is hard to see why an agent like that could prefer any laptop at
all to a kumquat as an implement for taking notes.5
Fuchs does not explain what, if anything, he takes to be evidence for the radical
indeterminism of which he speaks. Such evidence cannot have come from the
results of quantum experiments, which, on the contrary, seem to indicate law-like
connections, best expressed in terms of probabilistic laws, between preparation-acts
and experimental outcomes. At any rate, the evidence against the claim that the
world is so completely devoid of law-like regularities that theorizing about it is
pointless seems to be overwhelming.
There is, it seems, one possible response for the QBist. This would be to say
that he accepts the thesis of radical indeterminism, of a world subject neither to
deterministic nor stochastic laws, not on the basis of evidence, but as a leap of faith,
or an expression of temperament (see Fuchs 2017, 251–253). Though he believes in
no causal connection between his actions on the world and future experiences, as a
psychological weakness he is unable to set his credences about the future at will, and
can only affect his credences about the future by undergoing a ritual manipulation
of the physical objects around him. He finds himself in the position of someone who
does not believe that a horseshoe hung on the door will bring good luck, but hangs it
anyway because he finds that doing so makes him more optimistic about the future.
If that is the view—namely, that a QBist can offer no reason whatsoever to accept
the radical renunciation of physical theorizing about the world that acceptance of
the view entails—then we are in full agreement on that point.
5 It is worth re-emphasizing: I am not claiming that anything about the laptop or the kumquat
follows from assertions about the agent’s credences about these systems. The agent may prefer
the laptop to the kumquat as a note-taking device because of a false belief that the laptop is more
suitable to play that role in a causal chain between the agent’s actions and future experiences. The
point is merely that the agent’s belief, true or false, is a belief about the physical workings of the
laptop.
In addition to the tenets and precepts listed above, QBism also involves a severe
restriction on the application of quantum mechanics. An agent is to apply quantum
probabilities only to her own future experiences, and not to events
(including the experiences of other agents) that she herself will not experience.
The severity of this restriction of scope is not always sufficiently emphasized.
Though some of the ideas adopted by QBists are inspired by quantum information
theory, the restriction of credences to one’s own experiences eliminates vast swaths
of quantum information theory, including everything to do with communication and
cryptography. The point of a communication is to influence the credences of another
agent, and, in designing a communication protocol, one must consider probabilities
of changes to the recipient’s belief-state.6
There may seem to be an advantage to this restriction: one thereby avoids
quantum nonlocality. If an agent uses quantum mechanics to assign probabilities
to events along her own worldline, and never to events at a distance from herself, all
the events to which she assigns probabilities are timelike-related. There can be no
question of action-at-a-distance because there are no distances between the events
she assigns credences to.
This advantage obtains only if one refrains from doing any theorizing, quantum
or otherwise, about events at a distance. Any theory, quantum or otherwise, that
attempts to account for outcomes of tests of Bell inequalities—if we mean by
‘outcomes’ what is usually meant, namely, detector-registrations that occur at
spacelike separation from each other—will have to violate at least one of any set
of conditions sufficient to yield Bell inequalities.
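The quantum violation at issue can be checked with a short computation. A minimal sketch, using the standard singlet correlation function E(a, b) = -cos(a - b) and a textbook choice of CHSH measurement angles; nothing here is specific to this chapter's argument:

```python
import math

def E(a, b):
    """Quantum-mechanical correlation for spin measurements along
    angles a and b on the two wings of a singlet pair."""
    return -math.cos(a - b)

def chsh(a, a2, b, b2):
    """CHSH combination of correlations. Any model satisfying a set of
    conditions sufficient for the Bell inequalities bounds |S| by 2."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)

# Standard angle choice: |S| = 2*sqrt(2) ~ 2.828, exceeding the bound of 2.
S = chsh(0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
```

The point of the computation is only that the quantum correlations themselves, however interpreted, exceed the bound that any such set of conditions imposes.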
The seeming advantage is illusory. If one refrains from giving an account of
goings-on in the world that occur at a distance from each other in order to avoid
giving an account of nonlocal goings-on, and if it is held that such a renunciation
is necessary to avoid some sort of nonlocality that is thought to be objectionable,
this is tantamount to saying that if one were to give an account of goings-on in
the world that occur at a distance from each other, such an account would involve
the objectionable sort of nonlocality. If this differs from acknowledging that the
supposedly objectionable nonlocality is a feature of the world, I don’t know how. It
is as if one declined a request to evaluate a co-worker by saying that one is declining
6 It might be claimed, in response, that Alice, in sending a signal to Bob, is concerned about the
impact on Bob only insofar as it will later impinge on her. This, as everyone knows, is false. Some
of the intended results of a communication are ones that the sender will never be aware of. As a
particularly vivid example, consider a spy whose cover has been blown, who sends out one last
encrypted message before taking a cyanide pill to avoid capture. A QBist must regard this as a
misuse of quantum mechanics if the encryption scheme is a quantum scheme. It is not! Nor is it
a misuse of quantum mechanics for an engineer designing a nuclear waste storage facility to use
quantum mechanics to compute half-lives with the aim of constructing a facility that will be safe
for generations to come.
the request in order to avoid saying that the co-worker is incompetent. In giving such
a response, one has not refrained from conveying one’s judgment of the co-worker’s
competence.
20.7 Conclusion
References
Caves, C. M., Fuchs, C. A., & Schack, R. (2002). Quantum probabilities as Bayesian probabilities.
Physical Review A, 65, 022305.
Caves, C. M., Fuchs, C. A., & Schack, R. (2007). Subjective probability and quantum certainty.
Studies in History and Philosophy of Modern Physics, 38, 255–274.
Clifton, R., & Halvorson, H. (2000). Bipartite-mixed-states of infinite-dimensional systems are
generically nonseparable. Physical Review A, 61, 012108.
Clifton, R., & Halvorson, H. (2001). Entanglement and open systems in algebraic quantum field
theory. Studies in History and Philosophy of Modern Physics, 32, 1–13.
Fine, A. (1989). Do correlations need to be explained? In J. T. Cushing & E. McMullin (Eds.),
Philosophical consequences of quantum theory: Reflections on Bell's theorem (pp. 175–194).
Notre Dame, IN: University of Notre Dame Press.
Fuchs, C. A. (2017). Notwithstanding Bohr, the reasons for QBism. Mind & Matter, 15, 245–300.
Fuchs, C. A., & Schack, R. (2013). Quantum Bayesian coherence. Reviews of Modern Physics, 85,
1693–1715.
Fuchs, C. A., & Stacey, B. (2019). QBism: Quantum theory as a hero’s handbook. In E. M.
Rasel, W. P. Schleich, & S. Wölk (Eds.) Foundations of quantum theory: Proceedings of the
international school of physics “Enrico Fermi” course 197, pp. 133–202. Amsterdam: IOS
Press.
Fuchs, C. A., Mermin, N. D., & Schack, R. (2014). An introduction to QBism with an application
to the locality of quantum mechanics. American Journal of Physics, 82, 749–754.
Harrigan, N., & Spekkens, R. W. (2010). Einstein, incompleteness, and the epistemic view of
quantum states. Foundations of Physics, 40, 125–157.
Leifer, M. S. (2014). Is the quantum state real? An extended review of ψ-ontology theorems.
Quanta, 3, 67–155.
Myrvold, W. C. (2018). ψ-ontology result without the Cartesian product assumption. Physical
Review A, 97, 052109.
Myrvold, W. C. (2020). On the status of quantum state realism. In S. French & J. Saatsi (Eds.)
Scientific realism and the quantum. Oxford: Oxford University Press, 229–251.
Pitowsky, I. (2003). Betting on the outcomes of measurements: A Bayesian theory of quantum
probability. Studies in History and Philosophy of Modern Physics, 34, 395–414.
Pusey, M. F., Barrett, J., & Rudolph, T. (2012). On the reality of the quantum state. Nature Physics,
8, 475–478.
Chapter 21
The Relativistic Einstein-Podolsky-Rosen
Argument
Michael Redhead
Keywords New critique of Ghirardi and Grassi’s paper assuming locality · New
refutation of Redhead and Rivière’s paper · Nonlocality issues revisited
21.1 Preliminaries
M. Redhead ()
Centre for Philosophy of Natural and Social Science, London School of Economics and Political
Science, London, UK
e-mail: mlr1000@cam.ac.uk
While the Bell argument that establishes nonlocality for realistic interpretations
such as Bohm's has been formulated in an experimental relativistic context (see
Bohm and Hiley 1993, Chap. 12), there is no well-established relativistic
formulation of the EPR argument. In the absence of such a formulation, it seems hasty to
conclude that the tension between the standard interpretations and relativity theory
is just as great as that between detailed Bohmian interpretations and relativity.
Clearly, if a relativistic formulation of EPR could be given that did not entail
nonlocality, antirealist interpretations would have an advantage over the Bohmian
interpretation. (To be sure, anti-realist interpretations that do not require nonlocality
include Healey's pragmatism (see Healey 2017), QBism (see Caves et al. 2002), and
epistemicism (see Williamson 1994).)
Let us now investigate the possibility of a relativistic formulation of the
EPR argument. First, the standard nonrelativistic version of the EPR argument
is reviewed and the problematics of translating it into a relativistic context are
considered, paying particular attention to the need for a reformulation of the so-
called reality criterion. Then, we introduce one such reformulated reality criterion,
due to GianCarlo Ghirardi and Renata Grassi (see Ghirardi and Grassi 1994) and
show how it is applied to the nonrelativistic EPR argument. Next, we discuss the
application of the new reality criterion in a relativistic context.
A relativistic version of the EPR argument must differ from the nonrelativistic
version in two principal ways. First, the particle states must be described by a
relativistic wave-function. The details don’t concern us here; we need only require
that the wave-function preserve the maximal, mirror-image correlations of the
nonrelativistic singlet state. And indeed, the existence of maximal correlations in
the vacuum state of relativistic algebraic quantum field theory has recently been
demonstrated (see Redhead 1995). Second, the argument must not depend on the
existence of absolute time ordering between the measurement events on the left and
right wings of the system, for in the relativistic argument these may be spacelike
separated. As it turns out, the nonrelativistic version of the argument does invoke
absolute time ordering. To see how to get around this problem, we must briefly
review the standard formulation of the incompleteness argument.
For EPR, a necessary condition for the completeness of a theory is that every
element of physical reality must have a counterpart in the theory. To demonstrate
that quantum mechanics is incomplete, EPR need simply point to an element of
physical reality that does not have a counterpart in the theory. In this vein, they
consider measurements on a pair of scattered particles with correlated position
and momentum, but the formulation of the argument in Bohm (see Bohm 1951)
in terms of a pair of oppositely moving, singlet-state, spin-1/2 decay products of
a spin-0 particle, is conceptually simpler. In this case, the formalism of quantum
mechanics demands a strict correlation between the spin components of the two
spatially separated particles, such that a measurement of, say, the z-component of
spin of one particle allows one to predict with certainty the outcome of the same
measurement on the distant particle. This ability to predict with certainty, or at least
probability one, the outcome of a measurement is precisely the EPR criterion for the
existence of an element of reality at the as-yet-unmeasured particle. By invoking one
be fixed in all possible worlds (specifically, the laws require that a = −a). Thus the
counterfactual is true, and an element of reality can be said to exist at the left-hand
particle. From here, the argument unrolls in the usual way, and by supplementing
this reality criterion with a locality assumption (they call it G-Loc, after Galileo),
Ghirardi and Grassi can deduce that quantum mechanics is incomplete. Once again,
we can represent their argument schematically by

Quantum Mechanics ∧ G-Loc ➔ Incompleteness

or

Quantum Mechanics ∧ Completeness ➔ ~G-Loc
righthand measurement from the occurrence of the measurement at the left. This
done, Ghirardi and Grassi then demonstrate the existence of an element of reality at
the left-hand side following the same reasoning as above. From here, the argument
unrolls once again in the usual way and locality makes a second appearance in its
familiar place at the end of the argument. In this way, Ghirardi and Grassi can again
prove that standard quantum mechanics plus ‘locality’ implies incompleteness.
But there are two quite distinct cases of L-Loc that are actually being employed,
one used in getting the argument started and the other appearing in the conclusion.
Ghirardi and Grassi define L-Loc as the following: ‘An event cannot be influenced
by events in spacelike separated regions. In particular, the outcome obtained in
a measurement cannot be influenced by measurements performed in spacelike
separated regions; and analogously, possessed elements of physical reality referring
to a system cannot be changed by actions taking place in spacelike separated
regions.’ As in the nonrelativistic case, it is not the general principle but rather the
two special cases, call them OM-Loc (for outcome of measurement) and ER-Loc
(again for element of reality), that are doing the logical work in their argument. OM-
Loc affirms that the outcome of a measurement cannot be influenced by performing
another measurement at a spacelike separation, while ER-Loc affirms that elements
of reality cannot be created by performing a measurement at spacelike separation.
Ghirardi and Grassi invoke OM-Loc at the beginning of the argument while applying
the counterfactual reality criterion, as discussed above, and they invoke ER-Loc
at the end of the argument, as they did in the nonrelativistic case. So if we write
L-Loc = OM-Loc ∧ ER-Loc, then, schematically, their argument looks like this:

Quantum Mechanics ∧ L-Loc ➔ Incompleteness

or

Quantum Mechanics ∧ Completeness ➔ ~L-Loc
Ghirardi and Grassi now argue, in effect, as follows. Assuming OM-Loc we can
again demonstrate from Completeness a violation of ER-Loc, i.e. Einstein’s spooky
action-at-a-distance creating elements of reality at a distance. But if we don't assume
OM-Loc, then we cannot deduce a violation of ER-Loc. All this is quite correct, but
the price we have to pay for not being able to demonstrate a violation of ER-Loc is
precisely that we have to accept a violation of OM-Loc!
In other words, the relativistic formulation of the EPR argument does not
help with the thesis of peaceful coexistence between quantum mechanics and
special relativity, unless one argues that violating ER-Loc is more serious than
violating OM-Loc from a relativistic point of view. This is hard to maintain since
violating OM-Loc involves a case-by-case version of what Shimony (see Shimony
1984) refers to as violating parameter independence, i.e. the independence of the
probability of the outcome on one wing of the EPR experiment with respect to the
parameters controlling the type of experiment being performed on the other wing.
By analogy, violating ER-Loc is also a form of parameter dependence.
So, to return to the main point: Ghirardi and Grassi's claim of peaceful coexistence is already
justified, though with a novel idea of how it works out, namely a violation of OM-Loc.
However, they say they need to identify an additional assumption omitted from
(21.5) and (21.6), which, if challenged, could undermine the inference. Recall that
to run the argument in either the nonrelativistic or relativistic case, Ghirardi and
Grassi must establish that the outcome of, say, the right-hand measurement is the
same in all accessible worlds. With this established, the correlation laws of quantum
mechanics imply that the outcome of the left-hand measurement is the same in all
accessible worlds, and hence establish the truth of the counterfactual assertion about
the left-hand measurement result that permits the attribution of an element of reality
to the left-hand particle. In the nonrelativistic case, the constancy of the right-hand
result is a natural consequence of the absolute time ordering, as discussed above; in
the relativistic case, a premise akin to one that Michael Redhead labels the Principle
of Local Counterfactual Definiteness (PLCD) (see Redhead 1989, p. 92) is needed
to do this sort of work.
In the present case, PLCD may be taken to assert that the result of an experiment
that could be performed on a microscopic system has a definite value that does
not depend on the occurrence of a measurement at a distant apparatus. Ghirardi
and Grassi implicitly assume that PLCD is licensed by their locality principle, for
they invoke only OM-Loc to establish the constancy of the right-hand outcome in
all accessible worlds. But we argue that PLCD does not follow directly from any
typical locality principle, certainly not from one like OM-Loc, which asserts that
the outcome obtained in a measurement cannot be influenced by measurements in
spacelike separated regions. The reason is quite simple: while invoking locality may
prevent measurements on the left hand particle from influencing the result at the
right and from breaking the constancy of the accessible worlds as far as the right-
hand result is concerned, it does not prevent indeterminism from wreaking that sort
of havoc. Intuitively, we can imagine that we run the world over again, this time
performing the measurement on the left-hand particle. If we consider this left-hand
measurement schematically as a point event with a backward light cone identical to
that in the actual world, we are concerned with what will happen in the complement
of the forward and backward light cones. Under indeterminism the events in this
complement (the absolute elsewhere) simply cannot be assumed to remain the same.
Thus it is maintained that Ghirardi and Grassi need both OM-Loc and an
assumption of determinism to get their argument off the ground.
Thus,

Determinism ∧ OM-Loc ➔ PLCD
But there is a catch. We note first that Determinism ➔ Bell Inequality (assuming
a randomness assumption, for which independent arguments can be given). But this
contradicts the conclusion that the Bell Inequality is false. So we cannot infer ~ER-Loc.
But, according to Lewis (private communication), OM-Loc ➔ PLCD, and hence
the Bell Inequality follows, independently of Determinism. The argument here
appears to be the following. Among possible worlds we allow so-called quasi-miracles
to happen, but they also might not happen. To prove the counterfactual we need to
close off the worlds where such quasi-miracles happen, leaving the counterfactual to succeed.
We disagree with this claim since it involves discussion of possible worlds,
whereas in our world the counterfactual comes out as false.
Put succinctly, the argument is the following. A counterfactual supposition about
an event can, and indeed should, be taken to preserve as much as possible of the
actual facts, compatible with the truth of the supposition. In a relativistic
setting, a counterfactual supposition about an event is compatible with preserving
the actual facts that are spacelike separated from it. It seems to us, however, that
under indeterminism one's treatment of a counterfactual supposition should 'respect'
the indeterminism by allowing all the ways that history could play out, compatibly
with the counterfactual supposition, to be considered.
But if we follow Lewis, we again obtain the Bell Inequality, so on both counts the
attempt to rescue the justification of ~ER-Loc fails.
We close with a remark on Ghirardi and Grassi's proof of peaceful coexistence. For
Ghirardi and Grassi there are two sorts of argument: (1) the EPR argument assuming
locality; (2) the argument showing, correctly, how nonlocality works in violating the
Bell Inequality. I confused the two sorts of argument, starting with (1) and trying to
prove peaceful coexistence according to (2)! I learned from my mistakes, as Popper
would have said! But notice how we have shown, using (1), that peaceful coexistence
actually works out.
21.5 Conclusion
So what conclusion concerning nonlocality can we draw? Taking first the realist
option, in which all observables have sharp values at all times, if we assume
determinism or restrict the discussion to cases of strict correlation or anticorrelation
where we can derive determinism on plausible assumptions, there seems very little
scope for avoiding the conclusion of nonlocality for any ‘realist’ reconstruction
of quantum mechanics. But experimentally we never observe absolutely strict
correlations, so it may be argued that discussions of nonlocality should not deal
with this ideal case, but should be based upon the experimentally realistic nonideal
case. We are not convinced by this argument, since idealization is an essential aspect
of any scientific theorizing. Be that as it may, the assumption of determinism for
the nonideal case, where correlations are less than perfect, may well be suspect. In
such cases one is forced, as we have seen, to go to the stochastic hidden-variable
framework, and much of the recent discussion has focused on the significance of
violating the locality assumptions involved in this framework. Following Jon Jarrett
(Jarrett 1984), these break down into two classes: (1) parameter independence, and
(2) outcome independence.
Violation of (1) has been generally held to mark probabilistic causal dependence
between the outcome of a local measurement and the ‘setting’ of a remote piece of
apparatus, in prima facie conflict with standard interpretations of special relativity
as prohibiting causal links between spacelike separated events. Violation of (2) has
led to more controversy. It seems to demonstrate that the precise specification of the
state of the source, or more generally of the particles before the measurements are
undertaken, cannot be regarded as a common cause of the measurement outcomes.
Indeed a direct causal connection exists between the events consisting of the
measurement results on distant particles obtaining in the way they actually do. We
have challenged this view (Redhead 1989, pp. 90–96, 98–107) by proposing to
interpret violation of (2) not in that way, but in terms of a noncausal direct
dependence between measurement outcomes, one that lacks the necessary 'robustness',
in respect of how the measurement results are brought about, to merit description as a
causal connection. This approach gives up the stochastic hidden-variable framework
in favour of trying to understand the correlations between distant events in terms of a
harmony-at-a-distance or, as Shimony describes it, a passion-at-a-distance (Shimony
1984).
This idea ties in with, but is really quite distinct from, the fact that violation
of (2) cannot be used to transmit signals between distant locations (Redhead,
Acknowledgements I acknowledge Jeremy Butterfield and Bryan Roberts for reading the
manuscript, and for helpful comments.
A.1 Appendix
References
Einstein, A., Podolsky, B., & Rosen, N. (1935). Can quantum-mechanical description of physical
reality be considered complete? Physical Review, 47, 777–780.
Ghirardi, G., & Grassi, R. (1994). Outcome predictions and property attribution: The EPR
argument reconsidered. Studies in History and Philosophy of Science, 25, 397–423.
Healey, R. (2017). The quantum revolution in philosophy. Oxford: Oxford University Press.
Jarrett, J. (1984). On the physical significance of the locality conditions in the Bell arguments.
Noûs, 18, 569–589.
Lewis, D. (1986). Philosophical papers (Vol. 2). Oxford: Oxford University Press.
Redhead, M. L. G. (1989). Incompleteness, nonlocality and realism. Oxford: Clarendon Press.
Redhead, M. L. G. (1995). More ado about nothing. Foundations of Physics, 25, 123–137.
Redhead, M. L. G., & La Rivière, P. (1997). The EPR relativistic argument. In R. Cohen, M. Horne,
& J. Stachel (Eds.), Potentiality, entanglement and passion-at-a-distance. Quantum Mechanical
Studies for Abner Shimony (Vol. 2, pp. 207–215). Dordrecht: Kluwer.
Shimony, A. (1984). Controllable and uncontrollable non-locality. In S. Kamafuchi (Ed.), Pro-
ceedings of the international symposium: Foundations of quantum mechanics in the light of
new technology (pp. 225–230). Tokyo: Physical Society of Japan.
Williamson, T. (1994). Vagueness. London: Routledge.
Chapter 22
What Price Statistical Independence?
How Einstein Missed the Photon
Simon Saunders
Abstract Einstein’s celebrated 1905 argument for light quanta was phrased in
terms of the concept of ‘statistical independence’– wrongly. It was in fact based
on the notion of local, non-interacting entities, and as such was neutral on the
question of statistical independence. A specific kind of statistical dependence had
already been introduced by Gibbs, in connection with the Gibbs paradox, but went
unrecognised by Einstein. A modification of Einstein’s argument favours the latter.
This kind of statistical dependence, as eventually attributed by Einstein to Bose,
although expressed, in quantum mechanics, by the (anti)symmetrisation of the
wave-function, does not require ‘genuine’ entanglement. This chapter demystifies
quantum indistinguishability; its novel features derive from the finiteness of state-space
in quantum mechanics, not from indistinguishability.1
22.1 Overview
According to the physicist and historian Abraham Pais, the photon was the first
particle to be predicted theoretically. It was also, he noted, met with scepticism on
an epic scale: ‘[i]ts assimilation came after a struggle more intense and prolonged
than for any other particle ever postulated’.2 Yet it was also, in a certain sense,
derived from empirical data – specifically, derived by Einstein from the Wien limit
of Planck’s black-body spectral distribution, taken as a phenomenological given
1 This chapter is based on Saunders (2018, 2020), but its focus is different from either.
2 Pais (1982, p. 361).
S. Saunders ()
Faculty of Philosophy, University of Oxford, Oxford, UK
e-mail: simon.saunders@philosophy.ox.ac.uk
(it was by then accurately confirmed at high frequencies and low temperatures).
The derivation was what John Norton has called Einstein’s ‘miraculous argument’,
contained in the most revolutionary of all of Einstein’s great papers of 1905.3
If it was derived from the data, how was it opposed? The answer in part is that it
was incomplete: whatever light was really made of, it could not be made only of
Einstein's light quanta, because if it were, it would contravene the Planck distribution
in the low-frequency, high-temperature regime (the Rayleigh-Jeans regime, also by then
accurately confirmed). What was the nature of the struggle that was so intense
and prolonged? The answer in part was the difficulty of reconciling a particulate
structure of radiation with Maxwell's equations (or any wave equation, needed to
explain interference effects), but another part was down to Einstein’s insistence
on statistical independence. It was Einstein’s struggle; and it was only concluded
(and then only temporarily) with his quantum theory of the ideal monatomic gas,
published in three instalments in the period 1924–1925.
To see the nature of the difficulty, consider that radiation in thermal equilibrium
behaves as a gas of indistinguishable particles, photons, satisfying Bose-Einstein
statistics; and that the latter, as Einstein admitted in the second paper of his gas
trilogy, implies that particles are not statistically independent of each other; yet
Einstein’s ‘miraculous argument’ of 1905 had shown that thermal radiation, in the
Wien regime, is made of statistically independent particles. Contradiction.
An easy way out is to suppose that in the Wien regime photons just are
statistically independent of each other; and it is true that in this regime they satisfy
Maxwell-Boltzmann statistics. However, the latter is not the signature of statistical
independence: classical indistinguishable particles, in Gibbs' sense, obey Maxwell-
Boltzmann statistics, but they are not statistically independent in the requisite sense.
The better answer is that Einstein did not show that radiation, in the Wien
regime, is composed of statistically independent particles. His argument showed
that in the Wien regime thermal radiation is made of non-interacting, localised
entities; it left open the question of whether they were indistinguishable, in Gibbs’
sense, or statistically independent, in Einstein’s. However, as we shall see, a slightly
different kind of fluctuation than the one Einstein considered resolves the under-
determination in favour of the Gibbs concept – for roughly the reasons Gibbs
himself advocated it, namely, as a solution to what is now called the Gibbs paradox.
Had Einstein in 1905 understood this point, the history of the development of
quantum theory would have been significantly different, for it would have been a
relatively short step to Einstein’s quantum gas theory. It is therefore something of
an embarrassment that Gibbs’ concept of indistinguishability, introduced in terms
of the notion of ‘generic phase’ (as opposed to ‘specific phase’, as had hitherto been
implicitly assumed), was to be found in his Elementary Principles in Statistical
Mechanics, published in 1902, and indeed already published in German translation
in 1904.
3 Norton (2006).
22 What Price Statistical Independence? How Einstein Missed the Photon 481
It seems that Einstein knew nothing of Gibbs’ work at this time, but we know
that he had read at least parts of the Principles by 1910 (and that it impressed him
greatly);4 however, we also know that he was sceptical of Planck's claims, in the
period 1912–1918, on the extensivity of the entropy function of a material gas, that
cited Gibbs5 (the failure of extensivity is one formulation of the Gibbs paradox).
When Einstein embraced essentially the same solution to the extensivity puzzle in
the second of his papers on gas theory, it was in the name of what he called ‘Bose’s
statistical approach’, not Gibbs’.
Bose himself, by his own admission, was unaware that his approach had differed
from the usual one – and, indeed, his procedure so obscured the underlying
statistical assumptions that Einstein, when he took it over in his first paper on gas
theory, did not realise (or at any rate did not mention) that it implied the failure of
statistical independence. In his second paper, published at the beginning of 1925,
Einstein presented the expression for the entropy in a different form, from which
this failure was ‘easily recognised’; but he still attributed the approach to Bose.
That expression for the entropy, I am claiming, is the natural transposition of Gibbs’
notion of generic phase to the case where the one-particle state space has a finite
number of states (when it is divided into ‘elementary cells’, of size h3 ) – the central
assumption of the ‘old’ quantum theory.
No more did Schrödinger identify Gibbs as the progenitor of ‘Bose’s approach’,
although he came close. He had publicly criticised Planck's arguments for an
extensive entropy, that drew on Gibbs’, and now found fault with Einstein’s new gas
theory precisely on the point that, following Bose’s method, particles could not be
statistically independent of each other. Einstein, by then, was prepared to grant that
it was for nature to decide; not so Schrödinger, who went on (in a paper immediately
preceding the first of his historic papers on wave mechanics) to propose a new
quantisation method for the quantum gas that maintained statistical independence.
Neither Einstein nor Schrödinger mentioned Gibbs by name, nor the concept of
generic phase. The following year, in 1926, Dirac laid down the definitive treatment
of particle indistinguishability in quantum mechanics, again with no reference to
Gibbs (or to Schrödinger on gas theory) – and few physicists have looked back to its
precursors since.
Historians, meanwhile, seem to have taken Einstein, Schrödinger, and Dirac
rather too much at face value; since none of them mentioned Gibbs or the generic
phase, the equivalence of the latter with ‘Bose’s approach’ has not been recognised,
or even seriously entertained.
There are other reasons why the equivalence was not recognised. One is because
Bose-Einstein statistics is directly related to the symmetrisation of the state in
quantum mechanics (Dirac cited Bose and Einstein). But the latter, by wide
consensus, is thought to involve entanglement, a quintessentially quantum concept.
That it is in fact entanglement of a particularly trivial kind – that does not, for
example, support the violation of any Bell inequality – has only recently been
shown. Another reason the equivalence has not been recognized comes from the
classical side, where it is thought that the concept of particle indistinguishability
cannot apply to classical mechanics ‘because classical particles can always be
distinguished by their trajectories’. That phrase, or one like it, recurs again and
again in the subsequent literature.6 But it meets the counter-argument that quantum
particles can also have trajectories – either approximately, for sufficiently massive
and weakly-interacting particles in well-localised states, or exactly, in the case of
non-interacting particles, taking ‘trajectories’ as orbits of orthogonal one-particle
states. In neither case do the particles cease to be indistinguishable, in the sense
that their total state remains symmetrised.7 We shall see a simple example in the
concluding section.
It appears that the struggle initiated by Einstein with the photon concept is so
intense, and so prolonged, that it continues unabated to this day. I can offer up
nothing quite as remarkable as Itamar Pitowsky’s discovery, in George Boole’s
‘impossibilities of experience’, of Bell’s inequalities. His beautiful theory of corre-
lation polytopes has long been a source of inspiration to me, as has his independence
of thought and temperament. But it is my hope that he would have found pleasing
the discovery in Gibbs’ paradox of the kind of failure of statistical independence
on which quantum indistinguishability is based – as within the realm of possible
experience. Jon Dorling, in his acute historical and theoretical observations, was a
similar inspiration, and himself wrote on my topic, arguing that in the limit of the
Wien regime Einstein's argument showed light quanta were truly point-like. Their
friendship and support have always meant much to me. This essay is dedicated to
Itamar and to Jon.
Boltzmann’s equation
S = k ln W + const
was first written down in this notation by Planck, but it was introduced by Boltzmann
in his 1877 memoir, often accounted his most important contribution to statistical
mechanics. It was derived anew by Einstein in 1905 on the basis of what he
called ‘Boltzmann’s principle’, namely the principle that the entropy of a system
6 See, for example, Epstein (1936a, p. 557), Darrigol (1986, pp. 205–9), Bach (1997, p. 8), Dieks
(2014). (Bach’s defence of the concept of classical particle indistinguishability extends only to the
notion of probability distributions that give no information on individual trajectories.)
7 Saunders (2006, pp. 199–200, 207).
$$S_A = \varphi_A(W_A), \qquad S_B = \varphi_B(W_B).$$
If these two systems are viewed as a single system of entropy S and probability W, we
have
$$S = S_A + S_B = \varphi(W) \qquad (22.1)$$
and
$$W = W_A \cdot W_B.$$
This last relation tells us that the states of the two systems are mutually independent
events.
From these equations it follows that
$$\varphi(W) = C \ln(W) + \mathrm{const}.$$
The quantity C is thus a universal constant; it follows from the kinetic theory of gases
that its value is R/N [Boltzmann's constant k, in Planck's notation].
The argument has power and beauty, but we must look to its premises. The
left-hand equality of (22.1) is nowadays called the additivity of the entropy. In
thermodynamics, it applies to non-interacting systems A and B each in thermal
equilibrium (not necessarily with each other), but Einstein went on to use it to
broader effect. Let there be z one-particle states, all equiprobable, so that the
probability that a particle will be found in a given state is 1/z, and suppose there are n
non-interacting particles. They will be ‘mutually independent’ in Einstein’s sense if
the probability that they will all be found in that state is $1/z^n$. The joint probability, in
8 Einstein (1905, pp. 94–95); I follow Einstein's notation, save for replacing '1' and '2' by 'A' and
‘B’.
other words, should factorise, as the product of probabilities for individual particles
considered separately. In this sense, the probability of a set of outcomes on throwing
of n dice should be the same, whether the dice are thrown sequentially or all at once,
so long as they are thrown independently and do not influence each other.
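Einstein's factorisation criterion can be spelled out in a few lines. The following is a minimal Python sketch (not part of the original text; the values of z and n are purely illustrative):

```python
from fractions import Fraction

# Einstein's criterion: for n non-interacting particles, each equally likely
# to occupy any of z one-particle states, the joint probability of a given
# assignment should factorise into the product of single-particle probabilities.
z, n = 6, 4                      # e.g. four dice with six faces each (illustrative)
p_single = Fraction(1, z)        # probability of a given state for one particle

p_joint = p_single ** n          # product of the individual probabilities
assert p_joint == Fraction(1, z ** n)   # i.e. 1/z^n, as in Einstein's argument

# Thrown sequentially or all at once makes no difference: splitting the n
# particles into two groups and multiplying the group probabilities gives
# the same joint probability.
n_a, n_b = 1, 3
assert (p_single ** n_a) * (p_single ** n_b) == p_joint
```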
If thermodynamic probabilities are to factorise in this way when particles are
non-interacting, and if they are to connect in a simple way with the available phase
space volume,9 then the volume measure on the many-particle phase space had
better factorise; and that in turn requires that the phase space itself factorises, i.e.
that it is the Cartesian product of 1-particle phase-spaces γ:
$$\Gamma = \gamma \times \cdots \times \gamma \qquad (22.2)$$
$$W_V = V^n.$$
$$S_V = k \ln V^n. \qquad (22.3)$$
Let the two volumes be contiguous and let the two gases subsequently diffuse
into each other. Then the final entropy will be:
9 I shall use the terms 'probability' and 'phase space volume' more or less interchangeably, and
lazily use the same symbol V for both spatial volume and volume of γ; the context should make
clear which is intended.
$$\widetilde{W}_V = \frac{V^n}{n!}. \qquad (22.6)$$
The latter does not factorise, and particles described in this way fail Einstein’s
test of statistical independence. For suppose $n_A$ particles are statistically independent
from $n_B$ particles, both occupying the same region V. Then the probability
of the instantaneous state of the composite, $\widetilde{W}_{A\cup B}$, should be the product of the
probabilities of the state of each, $\widetilde{W}_A \cdot \widetilde{W}_B$. However, by inspection:
$$\widetilde{W}_{A\cup B} = \frac{V^{(n_A+n_B)}}{(n_A+n_B)!} \neq \frac{V^{n_A}}{n_A!} \cdot \frac{V^{n_B}}{n_B!} = \widetilde{W}_A \cdot \widetilde{W}_B \qquad (22.7)$$
and the probability does not factorise. Contrast the specific probabilities, that do:
Returning to the generic phase, our argument used ‘A’ and ‘B’ as names for
the two collections of particles, dispersed through the same volume, when they
are supposed to be indistinguishable; is this move legitimate? But surely it is if
10 Gibbs did not of course use the concept ‘group’, and gave no independent motivation for the
generic phase other than that it gave no entropy of mixing. It was seen as ad hoc by Lorentz and
Ehrenfest; see Saunders (2020), Darrigol (1991) for further discussion.
where $q_a, p_a \in \gamma$, etc., is a specific phase (a specific phase-space point). The
identification of all such points related by a permutation of entries in sequence
position is to pass to a space the points of which are represented by unordered
collections:
$$\{q_a, p_a, q_b, p_b, \ldots, q_d, p_d\} \in \widetilde{\Gamma}.$$
If the two gases are allowed to mix, they are no longer statistically independent
of each other, as we have seen, and the joint probability is given by the left-hand
side of Eq. (22.7). Using the Stirling approximation:
11 The assumption is innocent in a continuum theory of point-like particles, or for extended but
impenetrable molecules, but for a finite state space it is the difference between fermions (no
repetitions) and bosons (repetitions).
12 Gibbs (1902, p. 188). Whether my realist reading of the generic phase should be attributed to
Gibbs is a delicate question, for Gibbs' writings were so brief. For further discussion, see my (2020).
13 If this is in doubt, consider the analogous question using a discrete state-space, and apply Eq.
(22.16) to find the number of ways of distributing nA indistinguishable particles over zA states, and
likewise for nB particles over zB disjoint states; the product of the two numbers is the same as the
number of ways of distributing n = nA + nB indistinguishable particles over z = zA + zB states,
under the constraint that nA are in VA and nB are in VB (with no regard to which is in which).
$$\ln n! \approx n \ln n - n$$
and bearing in mind that since the two gases prior to mixing had the same
temperature and pressure, their densities are equal as well, $n_A/V_A = n_B/V_B$; it follows
after a short calculation:
$$\widetilde{W}_{A+B} \approx \widetilde{W}_{A\cup B}. \qquad (22.8)$$
The reason the equality is only approximate in (22.8) is because after the gases
mix the states available include those in which numbers of particles different from
nA and nB may be found in the regions A and B respectively (but not of course
by virtue of which particles are found in each of the two regions). After mixing,
the available number of microstates is thus increased, but by a number that can be
neglected in the Stirling approximation.
The lesson of the Gibbs paradox is that indistinguishable particles, in Gibbs’
sense, even if non-interacting, are not statistically independent in Einstein’s sense.
To put the matter vividly: if there are nA particles alone, initially confined to volume
VA , expanding into volume V, the entropy increases. If there are nB particles alone,
initially confined to volume VB , with the same density, expanding into volume
V, the entropy increases. Yet let $n_A$ and $n_B$ particles be present in the two
volumes respectively, of the same density, and allowed to expand into volume V; then there is no
entropy increase, even though the particles are non-interacting, and even though the
trajectory of every particle is the same as if all the others were not there. The entropy
of the joint system is not the sum of the entropies of the two subsystems, and the
entropy change of the joint system is not the sum of the entropy changes of the two
subsystems. Einstein’s criterion fails for both, and with it mutual independence.
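This failure of additivity is easy to verify numerically. The following Python sketch (my addition; the particle numbers and volumes are illustrative) compares the entropy change on mixing under the specific and generic conventions, using log-factorials in place of the Stirling approximation:

```python
from math import lgamma, log

def log_W_generic(V, n):
    # log of V^n / n!  (Gibbs' generic phase; cf. Eq. (22.6))
    return n * log(V) - lgamma(n + 1)

def log_W_specific(V, n):
    # log of V^n  (specific phase: particles treated as distinguishable)
    return n * log(V)

nA = nB = 10**6
VA = VB = 1.0
V = VA + VB                      # equal densities before and after mixing

# Entropy change on mixing, in units of k, for each convention:
dS_specific = log_W_specific(V, nA + nB) - (log_W_specific(VA, nA) + log_W_specific(VB, nB))
dS_generic  = log_W_generic(V, nA + nB)  - (log_W_generic(VA, nA)  + log_W_generic(VB, nB))

assert dS_specific > 10**6       # spurious entropy of mixing, ~ 2n ln 2
assert 0 < dS_generic < 10       # ~ (1/2) ln(pi n): negligible in the Stirling approximation
```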
Consider now Einstein’s celebrated argument from the Wien black-body distribution
formula. This claimed to show that insofar as that equation is obeyed, light behaves
as if made of elements that were ‘mutually independent’ and, moreover, that were
‘moveable points’.
First ‘mutually independent’. We must take Einstein to mean what he meant
earlier in the same paper, in the argument for the universality of the entropy as a
function of W just reviewed. Mutual independence, then, means the factorising of
probabilities. Is it the same as the condition that the particles are non-interacting? It
has been interpreted in this way,14 but that would be a mistake: even if Einstein
did ultimately believe that statistical correlations must be explained by some
14 Most recently by Norton (2006), according to whom ‘independence is just the simplest case of
no inter-molecular forces’ (p. 93).
The omission of locality may have been no more than a slip; does he commit the
fallacy of affirming the consequent as well? He treads a fine line: light behaves as
if it is so constituted, granted; but it so behaves if it is constituted in other ways as
well – in particular, if the energy quanta are non-interacting and localised but lack
statistical independence, in the very particular way that they are correlated when
described by the generic phase.
We clearly need to see the details. From the first law of thermodynamics:
$$dU = T\,dS - p\,dV$$
it follows:
$$\left(\frac{\partial S}{\partial U}\right)_{V=\mathrm{const}} = \frac{1}{T}.$$
The Wien spectral distribution law, as first proposed by Wilhelm Wien in 1896,
is
$$\rho = \alpha \nu^3 e^{-\beta \nu / T} \qquad (22.10)$$
Planck had found this equation in 1898, likewise working from the Wien
distribution (and he had used the same method again in 1900, to obtain an equation
for the entropy as a function of ρ, working from the Planck distribution). Einstein’s
originality lay in what came next.
Let S0 = V0 ϕ be the entropy of the radiation in volume V0 per unit frequency
range and let E be the energy V0 ρ. Multiplying both sides of (22.11) by V0 , it
follows, writing ρ in terms of E and V0 :
$$S_0 = -\frac{E}{\beta \nu}\left[\ln\frac{E}{\alpha \nu^3 V_0} - 1\right].$$
Suppose now that this volume of radiation (in the given frequency range)
undergoes a spontaneous fluctuation, and is contained instead in the sub-volume
V; then it will have the entropy:
$$S = -\frac{E}{\beta \nu}\left[\ln\frac{E}{\alpha \nu^3 V} - 1\right],$$
$$S - S_0 = \frac{E}{\beta \nu} \ln\frac{V}{V_0}$$
or more suggestively:
$$S - S_0 = k \ln\left(\frac{V}{V_0}\right)^{E/k\beta\nu}. \qquad (22.12)$$
We ask: How great is the probability of the last-mentioned state relative to the original one?
Or: How great is the probability that at a randomly chosen instant of time all n independently
movable points in a given volume V0 will be contained (by chance) in volume V?
Obviously, for this probability, which is a 'statistical probability',15 one obtains the value:16
$$W^{\mathrm{rel}} = \left(\frac{V}{V_0}\right)^n; \qquad (22.13)$$
$$W = V^n.$$
It is the same as the ratio of probabilities for Gibbs’ specific phase. But the ratio
of probabilities for Gibbs’ generic phases is exactly the same:
$$\widetilde{W}^{\mathrm{rel}} = \frac{\widetilde{W}}{\widetilde{W}_0} = \left(\frac{V}{V_0}\right)^n$$
even though $\widetilde{W}$, as given by Eq. (22.6), cannot be factorised. Non-interacting move-
able points that are indistinguishable in Gibbs' sense are not mutually independent in
15 The reference is to his preference for an ergodic interpretation of probabilities, rather than
Boltzmann’s combinatorial method, that he had emphasised in his early papers of 1903 and 1904.
16 To avoid confusion, we write Wrel where Einstein wrote W.
17 This is consonant with e.g. Pais (1982, p. 72); I am not aware of any alternative reading.
Einstein’s sense, yet yield the same relative probability and have the same volume
dependence of their entropy as do mutually independent movable points. The Wien
law, concerning this kind of fluctuation, does not discriminate between them.
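That the generic phase yields the same relative probability, although $\widetilde{W}$ itself does not factorise, can be checked exactly. A Python sketch with illustrative values (my addition):

```python
from fractions import Fraction
from math import factorial

n = 5
V, V0 = Fraction(1, 3), Fraction(1, 1)   # sub-volume and initial volume (illustrative)

W_spec, W0_spec = V**n, V0**n            # specific phase: W = V^n
W_gen  = V**n  / factorial(n)            # generic phase: Eq. (22.6)
W0_gen = V0**n / factorial(n)

# The relative probabilities coincide, though W_gen does not factorise:
assert W_spec / W0_spec == (V / V0)**n
assert W_gen  / W0_gen  == (V / V0)**n
```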
Would it be better to read Einstein’s notion of independence as requiring the
factorising only of relative probabilities? In that case indistinguishable particles
in Gibbs’ sense may come out as statistically independent after all. But that
contravenes what we have learned from Gibbs’ paradox in Sect. 22.2. Indeed,
consider a fluctuation of a rather different sort than the one Einstein imagined,
in which nA particles initially occupying volume V are subsequently found in
the sub-volume VA , and the remaining nB particles, initially in volume V, are
subsequently found in the disjoint sub-volume VB , where V = VA + VB . Suppose
first that the nA particles are statistically independent of the nB particles. Then the
relative probability of the fluctuation should factorise into the product of the relative
probability of each considered independently:
$$W^{\mathrm{rel}} = W_A^{\mathrm{rel}} \cdot W_B^{\mathrm{rel}} = \left(\frac{V_A}{V}\right)^{n_A} \cdot \left(\frac{V_B}{V}\right)^{n_B}.$$
It is the same, whether calculated as the ratio of the probability of the joint final
state to the initial, or as the product of the ratios of final and initial states of each
separately. Suppose now that the fluctuation does not change the density, so that:
$$\frac{n_A + n_B}{V_A + V_B} \approx \frac{n_A}{V_A} \approx \frac{n_B}{V_B}.$$
$$\widetilde{W}^{\mathrm{rel}} = \frac{V_A^{n_A}/n_A! \cdot V_B^{n_B}/n_B!}{V^n/n!}$$
There are of course other solutions, or purported solutions, to the Gibbs paradox, or
rather paradoxes (for it has been taken to mean many different things), than the one
we have considered; and there are other criticisms, or purported criticisms, of Gibbs’
concept of generic phase. But there is a clear structural correspondence between
statistics as made out in terms of the generic phase, vs the specific phase, and
statistics following what Einstein in 1924 called ‘Bose’s statistical approach’, vs that
for ‘statistically independent molecules’, following Boltzmann’s. Bose’s method, in
the Wien limit, is the same as Gibbs’, using the generic phase.
Alas, neither this structural correspondence, nor the connection with Gibbs
paradox, was so much as mentioned by any of the key players, nor were they for
some years.18 The first to consider the matter in any detail appears to have been
Paul Epstein, in a contribution to the Commentary on the Scientific Writings of J.
Willard Gibbs published by Yale University, Gibbs’ alma mater, in 1936. Epstein
highlighted the relationship between specific vs generic phase, in Gibbs’ sense, and
statistical independence vs Bose’s method, in Einstein’s; but he also maintained,
as did virtually everyone following Dirac’s 1926 intervention, that the absence of
trajectories plays an essential role in quantum indistinguishability, which therefore
has no analogue in classical statistical mechanics (whether using specific or generic
phases)19 – precisely the opposite conclusion to the one advanced here.
Epstein’s intervention, whatever its merits, came too late for historians of the
indistinguishability concept. It came too late for Gibbs, for by then photons were
a species of bosons; gibbsons were not to be. Yet the connection was there for
the taking in Einstein’s discussion. The key expression for the thermodynamic
probability, the signature of the new statistics, had been at the heart of the quantum
enigma since it first appeared in Planck’s publication of 1900 (where Planck’s
symbol ‘h’ appeared for the first time as well):
18 As remarked, Schrödinger (1925) came the closest, but did not mention Gibbs or the generic
phase. For further reading, see Darrigol (1991, pp. 293–8), Mehra and Rechenberg (1987, pp.
361–66), Saunders (2020).
19 Epstein (1936a, p. 557). Epstein also argued that a discrete state space was inconsistent with
a continuous dynamics, in the case of generic phase, in (1936a, p. 564, p. 527) and (1936b, pp.
485–91).
$$\frac{(n+z-1)!}{(z-1)!\,n!}. \qquad (22.15)$$
$$\frac{(z_s+n_s-1)!}{(z_s-1)!\,n_s!} \qquad (22.16)$$
$$\widetilde{Z} = \prod_s \frac{(z_s+n_s-1)!}{(z_s-1)!\,n_s!}. \qquad (22.17)$$
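The counting formula (22.15) can be checked against direct enumeration. A Python sketch with illustrative values of n and z (my addition):

```python
from itertools import combinations_with_replacement
from math import comb, factorial

def bose_count(n, z):
    # Eq. (22.15): ways of placing n indistinguishable particles in z cells
    return factorial(n + z - 1) // (factorial(z - 1) * factorial(n))

n, z = 4, 3
# Direct enumeration: a microstate is an unordered assignment of particles to cells
enumerated = sum(1 for _ in combinations_with_replacement(range(z), n))
assert bose_count(n, z) == enumerated == comb(n + z - 1, n)   # 15 for n = 4, z = 3
```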
20 The probability factorises for the elementary regions, in the same way and for the same reasons,
as it does for the subsystems defined by VA and VB in the Gibbs set-up (see fn.13).
Einstein contrasted this expression with that following from ‘the hypothesis of
statistical independence of the molecules’. There ‘a state is defined microscopically
by the fact that for each molecule one indicates in which cell it is sitting (complex-
ion)’,21 where ‘complexion’ was Boltzmann’s 1877 term for microstates. In that
case the number of microstates of $n_s$ particles in region s is:
$$z_s^{n_s},$$
denoted $Z_s$, and across all the regions for given occupation numbers $n_1, n_2, \ldots$ it is
$$\frac{n!}{n_1!\,n_2!\cdots} \prod_s z_s^{n_s}. \qquad (22.18)$$
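The Boltzmann counting (22.18) likewise admits a direct combinatorial check. A Python sketch (my addition; the region sizes and occupation numbers are illustrative):

```python
from itertools import product
from math import factorial

# Regions s = 0, 1 with z_0 = 2 and z_1 = 3 cells; occupation numbers n_0 = 1, n_1 = 3
zs, ns = [2, 3], [1, 3]
n = sum(ns)
cells = list(range(sum(zs)))                 # cells 0-1 in region 0, cells 2-4 in region 1

def region(cell):
    return 0 if cell < zs[0] else 1

# Eq. (22.18): n!/(n_0! n_1! ...) * prod_s z_s^{n_s}
formula = factorial(n)
for z, m in zip(zs, ns):
    formula = formula * z**m // factorial(m)

# Direct enumeration over assignments of n labelled molecules to cells (complexions)
count = sum(
    1 for assignment in product(cells, repeat=n)
    if [sum(1 for c in assignment if region(c) == s) for s in (0, 1)] == ns
)
assert formula == count == 216
```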
The Planck distribution is:
$$\rho = \frac{8\pi\nu^2}{c^3}\,\frac{h\nu}{e^{h\nu/kT} - 1}.$$
It goes over to the Wien distribution (22.10) in the limit $h\nu \gg kT$. But that is
the limit in which very few modes of the electromagnetic field are excited, for the
minimum required is much greater than the thermal energy available (of order kT),
and those that are excited are at the lowest excitation (so the occupation numbers
are all 0's and 1's). Since $z_s$ is the number of modes, it follows that the Wien limit is the
limit $z_s \gg n_s$. In that limit it can easily be seen:
21 Einstein (1925); 'each molecule' appears to be functioning as a proper name. See Saunders
(2006, 2013, 2016) for more on philosophy of language and indistinguishability.
$$\frac{(z_s+n_s-1)!}{(z_s-1)!\,n_s!} \approx \frac{z_s^{n_s}}{n_s!}. \qquad (22.19)$$
The total number of microstates for all elementary regions s for which $z_s \gg n_s$ is
then approximately:
$$\prod_{s:\,z_s \gg n_s} \frac{z_s^{n_s}}{n_s!} \qquad (22.20)$$
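The approximation (22.19), and its failure outside the Wien regime, can be checked numerically. A Python sketch with illustrative values (my addition):

```python
from math import comb, factorial

def exact(z, n):
    # (z + n - 1)! / ((z - 1)! n!) = C(z + n - 1, n): Eq. (22.16)
    return comb(z + n - 1, n)

def wien_approx(z, n):
    # z^n / n!: Eq. (22.19)
    return z**n / factorial(n)

# In the Wien regime, z_s >> n_s, the approximation is excellent:
assert abs(exact(10**6, 5) / wien_approx(10**6, 5) - 1) < 1e-4

# For n_s comparable to z_s it fails badly:
assert abs(exact(10, 10) / wien_approx(10, 10) - 1) > 0.5
```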
recovering the Maxwell-Boltzmann probability Eq. (22.18), save for the overall
factor n! (where now n is the total number of particles in the Wien regime). In
terms of the continuous phase-space measure $\widetilde{W}_s$ for $n_s$ particles in region s, $\widetilde{Z}_s$ and
$\widetilde{W}_s$ are proportional only in the Wien regime. In contrast the equation
$$Z_s = \frac{1}{\tau^{n_s}} W_s$$
holds for every region s and numbers ns , where τ is the unit of phase space volume
γ . Classically τ can always be chosen as sufficiently small to ensure the Wien limit
is enforced (only the size of the elementary regions must be kept large, so as to
contain many particles, so that the Stirling approximation applies). Boltzmann's
combinatorial method then goes over unchanged (apart from the overall factor
n!), using Gibbs’ generic phase, on replacing ‘cells’ by ‘elementary regions’, the
same as using the specific phase. In the quantum case τ = h3 , where h is Planck’s
constant, whereupon the number of cells zs cannot be taken as arbitrarily large, and
the approximation (22.19) fails for sufficiently large ns .
None of this was mentioned by Einstein. The correspondence between (22.18)
and (22.20) should have been obvious much earlier, for example, to the Ehrenfests,
at the time of writing their famous Encyclopaedia article in 1912, for they knew of
the distinction between elementary regions and cells (which indeed goes back to
Boltzmann in 1877). They too did not mention it.22
Fast-forwarding to Dirac, all of the foregoing applies to equilibrium distributions
of photons. In the Wien regime, the total number of microstates in each elementary
region is well-approximated by (22.19), so photons are not statistically independent,
in Einstein’s sense. The exact expression for every frequency and temperature (so
long as the temperature is not so high that pair-creation of electrons becomes
possible) is $\widetilde{Z}_s$, as given by (22.16). Indeed, $\widetilde{Z}_s$ is exactly the dimension of the
subspace of the Hilbert space spanned by symmetrised wave-functions in the $n_s$-fold
tensor product of the $z_s$-dimensional one-particle Hilbert space.
If Einstein was wrong to think that in the Wien limit light was made up of
statistically independent entities, his argument surely did show that in that regime
it has a discrete structure, and consists of non-interacting localised entities23 – in
that nobody has been able to come up with another interpretation of the fluctuation
formula (22.12) that says any different. He was right in another respect as well:
in recommending that light quanta be thought of as movable. That might seem
an obvious presupposition of his thought-experiment (how else was the fluctuation
in volume supposed to happen unless the particles were ‘movable’?), but it was a
first step towards investing the photon, as the concept was eventually named, with
different properties at different times – in short, to considering it as the sort of
entity that might have different states at different times, and hence that might have a
state-space at all; whereas ‘energy elements’, of the sort that had been discussed by
Boltzmann, Planck, and Natanson, were never thought of like that. The latter were
alike in all of their properties; they were hardly thought of as capable of change
at all, unless it be the transfer of energy, emission and absorption, creation and
annihilation.
Did it matter, historically, that light quanta were wrongly thought to be statisti-
cally independent in the Wien regime? For those in the critical period 1905–1925
prepared to extend the concept of light quanta to the Rayleigh-Jeans regime, it led
to the idea that correlations only came into play in that regime. This encouraged the
picture of a cooperative behaviour (‘the light molecule’) of quanta all in the same
cell, or all in the same normal mode of cavity radiation. Einstein was sympathetic
to these ideas. They were also entertained, in one of his early papers, by Louis de
Broglie. Einstein warmed to this theme in his second paper on gas theory in 1925.
He recommended de Broglie’s newly published PhD thesis, as taken up, among
others, by Schrödinger. In these respects the confusion was positively beneficial,
for it led Schrödinger to de Broglie’s concept of matter wave and to his concept of
quantisation conditions as defined by solutions to wave equations; but confusion it
was.
The story did not stop there, however. Einstein called the failure of statistical
independence ‘an initially completely puzzling mutual influence of the molecules’
(ibid, p.374). He also spoke of ‘spooky’ action-at-a-distance, a line of inquiry that
eventually led to the EPR paradox. The latter, we know, exploits entanglement.
23 Whether the ‘movable points’ must really be points will depend, presumably, on the precision
with which the probability of the fluctuation is defined (perhaps only in the limit v → ∞). I refer
to Dorling (1971) for discussion, already mentioned.
$$\frac{n!}{n_1!\,n_2!\cdots n_z!}$$
different possible temporal sequences, all with the same final set of occupation
numbers n1 , . . nz , with each sequence equiprobable – and we are back to Maxwell-
Boltzmann statistics.
There is more to learn from the failure of statistical independence apparent in the
Gibbs paradox. The Gibbs paradox is properly speaking a cluster of puzzles, where
the one so-far considered made direct contact with Einstein’s idea of statistical
independence. Here is another formulation.
24 Howard (1990), Norton (2006). Howard – and Norton – may well be right that the two were
connected in Einstein’s thinking; whether Einstein was right to connect them is the point I am
contesting.
25 For triviality, see Ghirardi et al. (2002), Ghirardi and Marinatto (2004), Penrose (2004); for
quantum trajectories, see Saunders (2006, pp. 199–200, 2016, pp. 169–70, 2018, pp. 9–10); for Bell
inequalities, see Caulton (2020).
26 Versteegh and Dieks (2011, p. 742) (I have made minor changes of notation).
27 Dieks (2014, p. 1308).
Fig. 22.1 Space-time diagram for the mixing of gases. In (a) the partition remains in place, whilst
in (b) it is removed at time t
partition is not removed, and Fig. 22.2b shows the kind of new trajectory possible
when it is, chosen to yield the same final positions and momenta. The intuition to
which Versteegh and Dieks appeal suggests that the two final microstates in Fig.
22.2a, b are different because they differ as to which particle has which position
and momentum. This is, fairly obviously, consistent with ordinary notions of
macroscopic objects, for macroscopic objects differ in their intrinsic, position- and
momentum-independent properties; but what of microscopic particles, whose intrinsic
properties are exactly the same?
To insist that there is a difference even in this case is to insist on the use of the
specific phase, in Gibbs' sense. In the space of specific phases of the two particles, a
continuous curve connecting two points related by particle interchange is a physical
process by which each particle acquires the position and velocity of the other. In the
space of specific states, it is an open curve whose end points are distinct. In the space
of generic states, the initial and end points are one and the same. But the continuous
curve connecting them is not obliterated – the exchange has still taken place – it
is merely that it is now a closed curve in (reduced) phase space, with beginning
and end points numerically the same. (Of course these states obtain at different
times; as time slices they are numerically distinct; as such each may have a different
past, but that is accounted for by a change in the Hamiltonian, as time-dependent,
not the state.) On using the generic phase, it remains true that new trajectories are
possible, on removing the partition, but not new states; it is only that the same states
(microstates) can be approached in new ways.28
The Gibbs paradox, if it is a paradox, now reduces to this: are the final states of
Fig. 22.2a, b the same? Can the state of two particles be unchanged, when each of
the two particles has changed? No logical contradiction is involved. Imagine two
things, each of a definite shape, say that one is round and one is square; we can
imagine the square slowly turning into a circle, and the circle into a square, in such
a way that each occupies the same position as the other at the end.
It points, metaphysically, to the idea that the right notion of how the instantaneous
state of a joint system depends on the instantaneous state of its subsystems is global.
The state of the joint system supervenes on both – it is not built up from the state
of each subsystem considered individually. It may be this idea is more at home in
quantum theory, and perhaps there it is forced; but it may be the right metaphysics
more generally.
To see that it is forced in (conventional) quantum mechanics, consider a totally
symmetrised state of two identical particles, of the form:
$$|\Psi\rangle = \frac{1}{\sqrt{2}}\big(|\phi\rangle \otimes |\psi\rangle + |\psi\rangle \otimes |\phi\rangle\big)$$
where $|\phi\rangle \in \mathcal{H}$ is a one-particle state (and similarly $|\psi\rangle$), with $|\phi\rangle$ and $|\psi\rangle$ orthogonal. Let the particles be non-interacting, and suppose the unitary evolution smoothly switches the two:
$$\hat{U}_t \otimes \hat{U}_t\, |\Psi\rangle = |\Psi\rangle.$$
Each of $|\phi\rangle$ and $|\psi\rangle$ is (continuously) changed, yet the initial and final state is the same.
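This invariance can be checked concretely in a finite-dimensional sketch (not from the text; a minimal numpy illustration, with the switch taken as the endpoint of a smooth one-parameter unitary family, global phase ignored):

```python
import numpy as np

# Orthogonal one-particle states |phi>, |psi> in a 2-dimensional toy space
phi = np.array([1.0, 0.0])
psi = np.array([0.0, 1.0])

# Symmetrised state |Psi> = (|phi>|psi> + |psi>|phi>) / sqrt(2)
Psi = (np.kron(phi, psi) + np.kron(psi, phi)) / np.sqrt(2)

# A unitary switching |phi> and |psi>; each particle's state changes...
U = np.array([[0.0, 1.0], [1.0, 0.0]])
assert not np.allclose(U @ phi, phi)

# ...yet the joint symmetrised state is exactly the same
assert np.allclose(np.kron(U, U) @ Psi, Psi)
```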
The same question can be posed in terms of the Feynman sum-over-paths
formalism, using the same Fig. 22.2. In computing the transition amplitude between
28 Discounting the new states in which particle numbers on each side are changed, as already
mentioned.
initial and final states: do we sum the amplitudes for 2a and 2b coherently? We
should if the final state in the two cases is the same.
It was the same question posed, and answered, by Dirac in 1926, in terms of
the matrix mechanics, supplemented with Schrödinger’s new concept of state. He
concluded
If the positions of two of the electrons are interchanged, the new state of the atom
is physically indistinguishable from the original one. In such a case one would expect
only symmetrical functions of the coordinates of all the electrons to be capable of being
represented by matrices. It is found that this allows one to obtain two solutions of the
problem satisfying all the necessary conditions, and the theory is incapable of deciding
which is the correct one. One of the solutions leads to Pauli’s principle that not more than
one electron can be in any given orbit, and the other, when applied to the analogous problem of
the ideal gas, leads to the Einstein-Bose statistical mechanics.29
29 Dirac (1926, p. 662). These are the two possibilities already remarked on in fn.11. Note that
parastatistics do not arise, unless we admit microstates that differ by particle permutation alone.
temperature, and not just in the Wien regime. Had light quanta already been
recognised as non-independent in the Wien limit, the Rayleigh-Jeans regime might
better have been recognised for what it was: the regime in which the discrete
structure to radiation was no longer point-like, rather than non-independent; but
still movable.
References

Chapter 23
How (Maximally) Contextual Is Quantum Mechanics?

Andrew W. Simmons
23.1 Introduction
The Bell-Kochen-Specker
theorem tells us that there exist sets of projectors onto
states $\{|\psi_i\rangle\}$, such that it is impossible to assign to each of them a valuation
of 0 or 1 in a context-independent way, and have there be exactly one 1-valued
projector in every PVM context. We might ask the following question: under
an assumption of outcome-definiteness, how many such projectors can we give a noncontextual valuation to, and how many projectors (or, more robustly, what fraction of projectors) must be considered to have context-dependent valuations?

A. W. Simmons
Department of Physics, Imperial College London, London, UK
e-mail: andrew@simmons.co.uk
Historically, Kochen-Specker style proofs of contextuality, also known as strong
or maximal contextuality, have tended to have a "linchpin" quality: removal of
any part of their structure causes the proof to fall apart in its entirety. Indeed, this
has been exploited in order to tame the maximal contextuality demonstrated
by the Peres-Mermin magic square for all qubit states (Bermejo-Vega et al. 2016),
as we need only remove the ability to measure one context before it fails to be
a proof of contextuality at all. For such proofs, then, since removing one context
is sufficient to remove all contextuality, certainly allowing a projector to vary in
its value assignment by context (by assigning it a C) will be. This minimalism is
partially motivated by the fact that since the original paper by Kochen and Specker,
there has been interest in trying to find examples of small Kochen-Specker sets
(Arends et al. 2011). When only a single projector need be considered to take
a contextual valuation, finding a small such set is the only way of attaining a
higher fraction of contextual projectors. We will be looking for an assignment
to all projection operators onto a Kochen-Specker set, $\{|\psi_i\rangle\langle\psi_i|\} \to \{0, C, 1\}$, such
that the following rules hold: in any basis formed from the projectors onto
$\{|\psi_i\rangle\langle\psi_i|\}$, there must be at least one projector with a C- or 1-valuation, and no more than
one with a 1-valuation. The fraction which must be given a C-valuation is the key
figure-of-merit that will be explored in this paper. This figure of merit, then, can
be seen as a generalisation or successor to the search for small Kochen-Specker
sets. Pitowsky (1998, 2005) explored the idea that Gleason’s theorem, viewed as a
proof of the Bell-Kochen-Specker theorem, can be thought of as providing a more
quantitative, rather than qualitative, view of the Kochen-Specker theorem when
applied to quantum subtheories. Given any set of measurements, it is possible to
calculate the maximum deviation from the Born rule that the scenario permits,
thinking of a noncontextual $\{0, 1\}$-valuation as representing the maximum possible
deviation from the continuous Born rule probabilities. From this perspective, the
problem approached in this paper can be viewed as a complementary approach
to that of Pitowsky; we find our quantitative figure of merit by seeing how much
we have to relax the noncontextuality assumption to permit a classical model for a
certain compatibility structure, rather than how much the event structure can tolerate
deviations from quantum-like probabilities given by the Born rule.
In this paper, following Pitowsky’s approach (Pitowsky 1998), we will use
graph-theoretic methods as a computational tool to investigate how robust, in this
sense, quantum-mechanically accessible Kochen-Specker proofs can be. We will
demonstrate that it is possible that other physical theories can be more contextual,
given this definition, than quantum mechanics. We can consider this as a stronger
analogue of the observation that a PR-box is more nonlocal with respect to the
CHSH inequality than any quantum-mechanically accessible scenario. The figure
of merit in this paper can be defined referring only to this orthogonality graph,
which agrees with Pitowsky’s suggestion that all quantum behaviour should be
viewed as a product of the compatibility structure of quantum measurements, via the
restrictions imposed on possible probability rules by Gleason’s theorem. The central
23 How (Maximally) Contextual Is Quantum Mechanics? 507
question we ask and partially resolve in this paper explores the limits of Pitowsky’s
ideas and work on logical indeterminacy (Pitowsky 1998); unlike in the situation
Pitowsky considered, we see that allowing some observables to have valuations that
contextually vary allows many others to be deterministic. We are then able to query
to what extent the logical indeterminacy principle remains true, even in the presence
of explicit contextuality.
One metric for contextuality is the contextual fraction (Abramsky et al. 2017),
in which we consider our probability distribution to be a convex mixture between
a noncontextual theory and a contextual one. This notion is what prompts the
term "maximal contextuality"; if a scenario is maximally contextual then it has
a contextual fraction that is the highest possible, viz. 1. It is invariant under
a re-labelling of the measurement effects, but the contextual fraction is hard to
calculate when the number of corners of the noncontextual polytope is large, which
is true in many natural situations, such as in nonlocality scenarios in which one
party has access to measurements with three or more outcomes (Simmons 2018).
The figure-of-merit explored in this paper acts as a complementary measure of
contextuality that can distinguish different scenarios which have the maximum
possible contextual fraction of 1. However, it is likely to be hard to calculate in many
circumstances, and in the case in which one is using the entirety of Hilbert space as
one’s vector set, we shall explore the connection to open problems in mathematics
known as Witsenhausen’s problem (Witsenhausen 1974) and the Hadwiger-Nelson
problem (Soifer 2008).
$$q(G) = \frac{q_s(G)}{|V(G)|}. \qquad (23.2)$$
Fig. 23.1 Three images illustrating the calculation of the figure of merit. The first is the orthogo-
nality graph for the 33-ray proof of the Kochen-Specker theorem by Peres (1991). Highlighted
in the second in red is a maximal independent set, that represents a maximal assignment of
noncontextual 1-valuations. In fact, only a single context lacks a 1-valued vector, visible in the
centre of the graph. Highlighted in the third in blue is a projector that can be given a contextual
valuation in order that every context contain a ray assigned either a 1 or a contextual valuation; it
will take a 1 valuation in the “central” context, and a 0 in all others
necessarily capture at least one element from the basis, and so any set of vectors
within such a hyperspherical cap C constitutes a maximum-clique hitting-set, $T_C$.
Since columns of a Haar-random unitary matrix over Cd are uniformly random
unit vectors, the volume V of this hyperspherical cap as a proportion of that of the
sphere is given by 1 − F (t), where F (t) is the cumulative distribution function of
the squared modulus of an entry in such a Haar-random unitary. This quantity is
well known in the field of random matrix theory and it can be found, for example,
as a special case of a result by Życzkowski and Sommers (2000), and will be
demonstrated in Sect. 23.3.
$$\frac{V}{V_S} = (1 - t)^{d-1}. \qquad (23.5)$$
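Equation (23.5) can be spot-checked by Monte Carlo, using the fact quoted above that columns of Haar-random unitaries are uniformly random unit vectors (an illustrative sketch, not part of the original derivation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, t, n = 4, 0.3, 200_000

# Haar-uniform unit vectors in C^d: normalised complex Gaussian samples
z = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
phi = z / np.linalg.norm(z, axis=1, keepdims=True)

# Fraction of the sphere with |<psi|phi>|^2 > t (psi taken as a basis vector),
# against the closed form (1 - t)^(d - 1) of Eq. (23.5)
frac = np.mean(np.abs(phi[:, 0]) ** 2 > t)
assert abs(frac - (1 - t) ** (d - 1)) < 0.01
```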
Clearly, the value of q(G) for a quantum graph is upper bounded by the size of any
maximum-clique hitting set, so if we can bound the minimum number of vectors
we must intersect with such a hyperspherical cap, this also constitutes a bound on
the value q(G). However, we would not expect this bound to be tight, since we
have ignored the independent set subtrahend entirely. We would like to demonstrate
an independent set within our maximum-clique hitting-set TC , and therefore give
a better upper bound on the quantity $q_s(G) = \min_T \min_{A \subset T} |T| - |A|$, giving an
explicit construction procedure for a T and A ⊂ T .
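For a small graph, $q_s(G)$ can be computed by brute force directly from this definition (a sketch; the 5-cycle is an assumed toy example, not a graph from the text, and its maximum cliques are simply its edges):

```python
from itertools import combinations

# Brute-force q_s(G): minimise |T| - |A| over maximum-clique hitting-sets T
# and independent subsets A of T, on the 5-cycle C5 (a toy triangle-free graph)
V = set(range(5))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

def hits_all_edges(T):
    return all(u in T or v in T for u, v in edges)

def is_independent(A):
    return all(not (u in A and v in A) for u, v in edges)

def qs():
    best = len(V)
    for r in range(len(V) + 1):
        for T in map(set, combinations(V, r)):
            if hits_all_edges(T):
                a = max(len(A) for k in range(len(T) + 1)
                        for A in map(set, combinations(T, k)) if is_independent(A))
                best = min(best, len(T) - a)
    return best

assert qs() == 1   # so q(C5) = q_s / |V| = 1/5
```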
We note that a hyperspherical cap containing the vectors $|\phi\rangle$ such that $|\langle\psi|\phi\rangle|^2 > 1/2$ cannot contain two orthogonal vectors, by a simple application of the triangle
inequality. We can place this smaller hyperspherical cap anywhere within the larger
hyperspherical cap, and this will form an independent set A that is a subset of TC .
For the rest of this proof, we will assume that the two hyperspherical caps are centred
on the same point; this will make no difference to the worst case scenario which
forms the bound. The region of Hilbert space, then, in which vectors count towards
the figure of merit, is a member of a two-parameter family of annuli on the complex
hypersphere, illustrated in Fig. 23.2 and defined by
$$t_1 \leq |\langle\psi|\phi\rangle|^2 \leq t_2. \qquad (23.6)$$
of the Hilbert space. Our construction has $t_1 = 1/d$ and $t_2 = 1/2$, giving us a
hyperspherical annulus defined by $1/d \leq |\langle\psi|\phi\rangle|^2 \leq 1/2$. It now remains to be
proved that no matter the positions of the set of vectors $\{|\psi_i\rangle\}$, there must be some
position in which we can place this annulus such that we intersect strictly less
than a $p(t_1, t_2)$ fraction of them.
Consider the canonical action of $SU(d)$ on the unit hypersphere in $\mathbb{C}^d$, which we
shall denote $S_{\mathbb{C}}^{d-1}$. We define a space $SU(d) \times S_{\mathbb{C}}^{d-1}$, where $SU(d)$ is equipped with
the Haar measure, and $S_{\mathbb{C}}^{d-1}$ is equipped with the uniform measure. Equivalently,
we could have taken $\mathbb{H}^d$ rather than $S_{\mathbb{C}}^{d-1}$, in which case it would be equipped with
the Fubini-Study metric.
At each $|\psi_i\rangle$, we place a bump function $f_{|\psi_i\rangle}$ with unit integral and support only
within a disc of radius $\epsilon$ around $|\psi_i\rangle$. Let $A_{t_1,t_2}$ denote the hyperspherical annulus
$\{|\phi\rangle \mid t_1 \leq |\langle 0|\phi\rangle|^2 \leq t_2\}$. By Fubini's theorem, we have:
$$\int_{SU(d)} \int_{S_{\mathbb{C}}^{d-1}} \mathbb{1}_{g.z \in A_{t_1,t_2}} \sum_i f_{|\psi_i\rangle}(z)\, dz\, dg = \int_{S_{\mathbb{C}}^{d-1}} \int_{SU(d)} \mathbb{1}_{g.z \in A_{t_1,t_2}} \sum_i f_{|\psi_i\rangle}(z)\, dg\, dz. \qquad (23.8)$$
We note that if we extend our annulus's extent by $\epsilon$, so that we have $t_1 \to t_1 - \epsilon$ and $t_2 \to t_2 + \epsilon$, we capture the entirety of the measure of the bump functions centred inside the original annulus. Hence, for all $g \in SU(d)$ we have $\left|\left\{|\psi_i\rangle\langle\psi_i| \in g^{-1}A_{t_1,t_2}\right\}\right| \leq \int_{S_{\mathbb{C}}^{d-1}} \mathbb{1}_{g.z \in A_{t_1-\epsilon,t_2+\epsilon}} \sum_i f_{|\psi_i\rangle}(z)\, dz$, and so we have
$$\int_{SU(d)} \left|\left\{|\psi_i\rangle\langle\psi_i| \in g^{-1}A_{t_1,t_2}\right\}\right| dg \leq \int_{S_{\mathbb{C}}^{d-1}} \int_{SU(d)} \mathbb{1}_{g.z \in A_{t_1-\epsilon,t_2+\epsilon}} \sum_i f_{|\psi_i\rangle}(z)\, dg\, dz. \qquad (23.9)$$
Rearranging:
$$\int_{SU(d)} \left|\left\{|\psi_i\rangle\langle\psi_i| \in g^{-1}A_{t_1,t_2}\right\}\right| dg \leq \int_{S_{\mathbb{C}}^{d-1}} \sum_i f_{|\psi_i\rangle}(z) \int_{SU(d)} \mathbb{1}_{g.z \in A_{t_1-\epsilon,t_2+\epsilon}}\, dg\, dz. \qquad (23.10)$$
Twirling the indicator function $\mathbb{1}_{g.z \in A_{t_1-\epsilon,t_2+\epsilon}}$ with respect to the Haar measure
over $SU(d)$ reduces it to a constant function with value $p(t_1 - \epsilon, t_2 + \epsilon)$, the
proportion of the Hilbert space taken up by the annulus.
$$\int_{SU(d)} \left|\left\{|\psi_i\rangle\langle\psi_i| \in g^{-1}A_{t_1,t_2}\right\}\right| dg \leq p(t_1 - \epsilon, t_2 + \epsilon) \int_{S_{\mathbb{C}}^{d-1}} \sum_i f_{|\psi_i\rangle}(z)\, dz \qquad (23.11)$$
$$\leq p(t_1 - \epsilon, t_2 + \epsilon)\, \left|\{|\psi_i\rangle\}\right|. \qquad (23.12)$$
512 A. W. Simmons
Since the left-hand side of Equation 23.11 represents an average over the rotation
group, there must be some $g \in SU(d)$ such that $\left|\left\{|\psi_i\rangle\langle\psi_i| \in g^{-1}A_{t_1,t_2}\right\}\right| \leq p(t_1 - \epsilon, t_2 + \epsilon)\,\left|\{|\psi_i\rangle\}\right|$, and this is true for all $\epsilon$. The result follows. ⊔⊓
Corollary 1 Any quantum mechanically accessible $G$ has $q(G) \leq 4251920575/11019960576 \approx 0.385838$.

Proof Calculating the derivative of $(1 - \frac{1}{d})^{d-1} - \frac{1}{2^{d-1}}$ with respect to $d$ shows
that it takes a maximum between $d = 9$ and $d = 10$. Evaluation of the quantity
at these points reveals the maximum to be at $d = 9$, when we get $(1 - 1/9)^8 - 1/2^8 = 4251920575/11019960576$. This, then, forms a hard limit of $q(G)$ for any quantum
mechanically achievable $G$. ⊔⊓
We note that as the quantum dimension approaches infinity, we achieve a limiting
value of 1/e, so quantum systems with very high dimension are viable candidates
for providing robust contextuality scenarios. In Fig. 23.3, we see the bound on q(G)
plotted as a function of d.
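The arithmetic behind Corollary 1 and the 1/e limit is easy to reproduce exactly (an illustrative check, not part of the text):

```python
from fractions import Fraction

# Bound from Corollary 1: (1 - 1/d)^(d-1) - 1/2^(d-1), maximised over integer d
def bound(d):
    return Fraction(d - 1, d) ** (d - 1) - Fraction(1, 2 ** (d - 1))

values = {d: bound(d) for d in range(2, 30)}
d_max = max(values, key=values.get)

# The maximum sits at d = 9, with the exact rational value from the corollary
assert d_max == 9
assert values[9] == Fraction(4251920575, 11019960576)
assert abs(float(values[9]) - 0.385838) < 1e-5
```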
Above, it was shown that a spherical cap taking up a proportion of $(1 - 1/d)^{d-1}$
of Hilbert space must capture at least one vector from every orthonormal basis.
However, if for many bases this spherical cap captures more than one basis element,
this might suggest that this spherical cap was not the most efficient way of choosing
a vector from each basis. We note, though, that the exact geometry of the set
Fig. 23.3 A chart showing the value of the bound on q(G) for quantum-mechanically accessible
G as a function of the Hilbert space dimension d; the values rise to a maximum near d = 9 and
then decay towards 1/e
of projectors will have a large effect on the efficiency of this construction. On
one hand, an increase in the quantum dimension allows for more complicated
structures and interplay of the permitted projectors that, as we shall demonstrate,
seems to allow for the construction of quantum graphs with very small independence
number. However, it also has a dampening effect on the quantity |T| since the bases
themselves have more geometrical constraints.
In fact, such an annulus can in the worst case contain every vector from
a basis. Below, we will see a specific example of a set of vectors providing a
nonlocality scenario, in which every projector is captured by a pessimally-chosen
spherical cap position. In d-dimensional Hilbert space, a spherical cap of proportion
$(1 - 1/n)^{d-1}$ can capture $n$ vectors out of a basis. So for any chosen $n$, as $d$ increases,
we see that the fraction of the spherical cap in the construction that cannot capture
$n$ of a basis tends to 0. However, if we consider the fraction of the spherical cap
that can capture a constant proportion of basis elements $c$, we get the following
behaviour under increasing $d$:
$$\lim_{d \to \infty} \left(1 - \frac{1}{cd}\right)^{d-1} = e^{-\frac{1}{c}}. \qquad (23.15)$$
This might imply that the quality of the bound does not degrade as the dimension
increases (Fig. 23.4).
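The limit (23.15) can be confirmed numerically (a quick check, not in the original):

```python
import math

# Numerical check of Eq. (23.15): (1 - 1/(c d))^(d-1) -> e^(-1/c) as d grows
c = 3.0
vals = [(1 - 1 / (c * d)) ** (d - 1) for d in (10, 100, 10_000)]

# The sequence decreases towards e^(-1/c) ~ 0.7165 for c = 3
assert vals[0] > vals[-1]
assert abs(vals[-1] - math.exp(-1 / c)) < 1e-3
```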
Construction                               d    Traversal |T|          Independent set    q
Quantum upper bound                        d    ≤ n(1 − 1/d)^(d−1)     ≥ n/2^(d−1)        ≤ 0.385838
E8 root system                             8    11n/60                 n/15               ≤ 0.116̇
Two-qubit stabiliser quantum mechanics     4    3n/10                  n/5                0.1
Peres-Mermin magic square (Peres 1991)     4    7n/24                  5n/24              0.083̇
Cabello’s 18-ray proof (Cabello 1997)      4    5n/18                  4n/18              0.05̇
Peres’s 33-ray proof (Peres 1991)          3    9n/33                  12n/33             0.0̇3̇

Fig. 23.4 A table comparing the graph constructions in this paper; each concrete example
represents the graph union of multiple noninteracting copies of the KS proof (n denotes the
number of vertices). Two-qubit stabiliser quantum mechanics forms a KS proof using 60 rays and
105 contexts; the E8 root system forms a KS proof using 120 rays and 2025 contexts. The
quantities related to each were calculated using a proof by exhaustion, or are the best found
during a non-exhaustive search, such as is the case for the E8 system. It is perhaps of note that
the lower bound of the size of a traversal less the independence number forms a trivial bound
only for Peres’s 33-ray proof, and in fact is optimal for Cabello’s 18-ray proof and the
Peres-Mermin magic square
In triangle-free graphs, the maximum cliques are merely edges, and so each context
consists of two measurement objects. Quantum mechanically, the only such graphs
are uninteresting: the disjoint graph union of K2 graphs. These do not have enough
structure to form a proof of the Bell-Kochen-Specker theorem; this reflects the fact
that each projector appears in only a single context. Many such graphs, then, are not
quantum-mechanically accessible. However, they do allow us to prove a concrete
bound on what values of q(G) are achievable within the framework of generalised
probabilistic theories. In such graphs, the concept of a maximum-clique hitting-set
reduces to that of a covering of edges by vertices, known as a vertex cover. The size
of the minimal vertex cover of G is denoted τ(G). This reduction of our figure-of-merit
allows us to make use of a bound derived for Ramsey-type problems.
In the triangle-free case, we have a powerful relation between maximum-clique
hitting-sets and independent sets: namely, that they are complements of each
other within the vertex set. A maximum-clique hitting-set, or vertex cover, is a set of vertices T such that
for every edge, at least one of that edge’s vertices is in T . We can see, then, that the
set V − T must be independent, since if there were two vertices in V − T connected
by an edge, then neither of those vertices would be in T , a contradiction. Conversely,
if we have an independent set A, then V − A must be a vertex cover; if there were
an edge with neither vertex in V − A, then both vertices are in A, a contradiction.
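The complement duality argued above can be verified exhaustively on a small graph (a sketch; the complete bipartite graph K_{2,3} is an assumed toy example):

```python
from itertools import combinations

# Exhaustive check: S is a vertex cover exactly when V - S is independent.
# Toy graph: complete bipartite K_{2,3}, parts {0, 1} and {2, 3, 4}
V = set(range(5))
edges = [(u, v) for u in (0, 1) for v in (2, 3, 4)]

def is_vertex_cover(S):
    return all(u in S or v in S for u, v in edges)

def is_independent(S):
    return all(not (u in S and v in S) for u, v in edges)

for r in range(len(V) + 1):
    for S in map(set, combinations(V, r)):
        assert is_vertex_cover(S) == is_independent(V - S)
```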
This means that we can bound $q_s(G)$ as
$$q_s(G) \geq |V(G)| - 2\alpha(G). \qquad (23.16)$$
In other words, to be able to lower bound qs (G), we need only consider the
independence number, α(G). We seek, therefore, triangle-free graphs with low
independence number. We can invoke here a theorem due to Alon et al. (2010).

Theorem 2 There exists a constant $c$ such that for all $n \in \mathbb{N}$ there exists a regular
triangle-free graph $G_n$, with $|V(G_n)| = n$, and $\alpha(G_n) \leq c\sqrt{n \log n}$.
Hence, we have
$$\lim_{n \to \infty} q(G_n) \geq \lim_{n \to \infty} \frac{n - 2c\sqrt{n \log n}}{n} \qquad (23.17)$$
$$\geq 1 - 2c \lim_{n \to \infty} \sqrt{\frac{\log n}{n}} = 1. \qquad (23.18)$$
We now extend the result to apply to projection-valued measures including projectors
of rank greater than 1, although we are still considering the case in which contexts
are made up of a set of projectors which sum to the identity, rather than the more
general case of a set of commuting projectors. We will allow a rank-$k$ projector $P$ to
"inherit" a labelling as a function of the assignments to some $\{|\psi_P^{(i)}\rangle\}$, which form
a specially chosen decomposition of $P$ as $P = \sum_i |\psi_P^{(i)}\rangle\langle\psi_P^{(i)}|$.
Given a variable assignment to states $f : \mathcal{H}_d \to \{0, C, 1\}$, we can define a
variable assignment to all projection operators on $\mathcal{H}_d$, $g : \mathcal{P}(\mathcal{H}_d) \to \{0, C, 1\}$, by
the following:
$$g(P) = \begin{cases} 0 & \text{if } \exists \{|\psi_P^{(i)}\rangle\},\ P = \sum_i |\psi_P^{(i)}\rangle\langle\psi_P^{(i)}|,\ f(|\psi_P^{(i)}\rangle) = 0\ \forall i, \\ 1 & \text{elif } \exists \{|\psi_P^{(i)}\rangle\},\ P = \sum_i |\psi_P^{(i)}\rangle\langle\psi_P^{(i)}|,\ f(|\psi_P^{(i)}\rangle) \in \{0, 1\}\ \forall i, \\ C & \text{otherwise.} \end{cases} \qquad (23.19)$$
The motivation for this notion of value inheritance is as follows: in any context
containing P , we can consider replacing it with one of its maximally fine-grained
decompositions, which then inherits value assignments from f derived via the
annulus method. Since the assignment of values to this rank-1 decomposition of
P is necessarily consistent with the restrictions of the scenario, we can consider
a post-processing that consists of a coarse-graining of those fine-grained results,
“forgetting” which one of them occurred. This process must also be consistent.
Hence, as long as there exists some decomposition of P into rank-1 projectors, each
of which is assigned a noncontextual valuation, then P can inherit a noncontextual
valuation. However if there must be a contextual vector in any such decomposition,
we must treat P as having a contextual valuation.
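The inheritance rule (23.19) can be sketched as a small function; here the candidate rank-1 decompositions are simply supplied as data, and the state labels are hypothetical (enumerating the decompositions of an actual projector is a separate problem not addressed by this sketch):

```python
def inherit(decompositions, f):
    """Value inheritance for a projector P, following Eq. (23.19).

    `decompositions` is an iterable of candidate rank-1 decompositions of P,
    each a list of state labels; `f` maps a state label to 0, 'C', or 1.
    """
    # 0 if some decomposition is all-0 under f
    if any(all(f(s) == 0 for s in dec) for dec in decompositions):
        return 0
    # 1 if some decomposition is entirely noncontextual (values in {0, 1})
    if any(all(f(s) in (0, 1) for s in dec) for dec in decompositions):
        return 1
    # otherwise every decomposition contains a contextual vector
    return 'C'

# Toy usage with hypothetical state labels a, b, c
f = {'a': 0, 'b': 1, 'c': 'C'}.get
assert inherit([['a', 'b']], f) == 1    # all values noncontextual
assert inherit([['a', 'a']], f) == 0    # an all-0 decomposition exists
assert inherit([['a', 'c']], f) == 'C'  # every decomposition needs a C
```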
The challenge, then, is to characterise which projectors P must be given a
contextual valuation under this schema, and then prove a bound analogous to that of
Theorem 1.
Theorem 3 In a scenario in which the available measurements are rank-$r$ projectors
$\{P_i\}$, performed on a quantum system of dimension $d$, at most $\left(I_{1/2}(r, d-r) - I_{r/d}(r, d-r)\right)|\{P_i\}|$ are given a contextual valuation, where $I_x(\alpha, \beta)$ is the
regularised incomplete beta function.
Proof The proof proceeds similarly to the proof of Theorem 1. We will identify
a generalised annulus within the space of rank-r projectors on Hd which can be
associated with contextual valuations, and then use its volume in proportion to that
of the overall Hilbert space to form a bound on how many must be given contextual
valuations.
$$P^Y_{d,r}(y) = c_{d,r}\, y^{2r-1} (1 - y^2)^{d-r-1}, \qquad (23.21)$$
where $P^Y_{d,r}(y)$ is the probability density function for the random variable $Y$, and
$$c_{d,r} = \frac{2}{B(r, d-r)} = \frac{2\,\Gamma(d)}{\Gamma(r)\,\Gamma(d-r)}, \qquad (23.22)$$
in which $B$ is the Euler beta function and $\Gamma$ is the gamma function. Performing a
change of variables, then, in which $T = Y^2$, we get:
$$P^T_{d,r}(t) = \frac{d\sqrt{t}}{dt}\, P^Y_{d,r}\!\left(\sqrt{t}\right) \qquad (23.23)$$
$$= \frac{t^{r-1}(1-t)^{d-r-1}}{B(r, d-r)}. \qquad (23.24)$$
To complete the proof we once again apply the argument from Fubini's theorem
used in the proof of Theorem 1. ⊔⊓
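The Beta(r, d − r) density in (23.24) can be checked by sampling (a sketch, not part of the proof; it compares the squared norm of a Haar-random unit vector's projection onto a fixed rank-r subspace against the closed-form Beta CDF for integer parameters):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)
d, r, n = 6, 2, 200_000

# Haar-uniform unit vectors in C^d, via normalised complex Gaussians
z = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
phi = z / np.linalg.norm(z, axis=1, keepdims=True)

# T = squared norm of the projection onto the first r coordinates
T = np.sum(np.abs(phi[:, :r]) ** 2, axis=1)

# Beta(r, d - r) CDF in closed form for integer parameters
def beta_cdf(x):
    return sum(comb(d - 1, j) * x**j * (1 - x) ** (d - 1 - j) for j in range(r, d))

for x in (0.25, 0.5, 0.75):
    assert abs(np.mean(T <= x) - beta_cdf(x)) < 0.01
```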
Corollary 2 For any choice of $d, r$, the proportion of projectors requiring a
contextual valuation is always less than 1/2.

Proof We have, for any $d, r$, that the proportion of projectors requiring a contextual
valuation is bounded above by $I_{1/2}(r, d-r) - I_{r/d}(r, d-r)$. We wish, then, to bound
this quantity above by 1/2.

Since $I_x(\alpha, \beta)$ is the cumulative distribution function for a random variable with
distribution $\mathrm{Beta}(\alpha, \beta)$, a trivial bound for the median $m(\alpha, \beta)$ of such a variable is given by
$$\frac{\alpha - 1}{\alpha + \beta - 2} \leq m(\alpha, \beta) \leq \frac{\alpha}{\alpha + \beta}. \qquad (23.27)$$
Hence, by the monotonicity of $I_x(\alpha, \beta)$, we have $I_{r/d}(r, d-r) \geq I_{m(r,d-r)}(r, d-r) = 1/2$; since $I_{1/2}(r, d-r) \leq 1$, the result follows. ⊔⊓
Acknowledgements Special thanks to Mordecai Waegell, who suggested using the E8 root
system and two-qubit stabiliser quantum mechanics to demonstrate lower bounds on what values
of the figure of merit in Sect. 23.2 were quantum-mechanically accessible, as well
as for his help in calculating this lower bound. Special thanks also to Terry Rudolph for asking this question in
the first place. I am grateful to Angela Xu, Angela Karanjai and Baskaran Sripathmanathan for
their helpful discussions which have guided this project. I acknowledge support from EPSRC, via
the CDT in Controlled Quantum Dynamics at Imperial College London; and from Cambridge
Quantum Computing Limited. This research was supported in part by Perimeter Institute for
Theoretical Physics. Research at Perimeter Institute is supported by the Government of Canada
through the Department of Innovation, Science and Economic Development and by the Province
of Ontario through the Ministry of Research and Innovation.
References
Abramsky, S., Barbosa, R. S., & Mansfield, S. (2017). The contextual fraction as a measure of
contextuality. arXiv:1705.07918.
Alon, N., Ben-Shimon, S., & Krivelevich, M. (2010). A note on regular Ramsey graphs. Journal
of Graph Theory, 64(3), 244–249.
Arends, F., Ouaknine, J., & Wampler, C. W. (2011). On searching for small Kochen-Specker vector
systems (pp. 23–34). Berlin/Heidelberg: Springer.
Bermejo-Vega, J., Delfosse, N., Browne, D. E., Okay, C., & Raussendorf, R. (2016). Contextuality
as a resource for qubit quantum computation. arXiv:1610.08529.
Cabello, A. (1997). A proof with 18 vectors of the Bell-Kochen-Specker theorem (pp. 59–62).
Dordrecht: Springer.
DeCorte, E., & Pikhurko, O. (2016). Spherical sets avoiding a prescribed set of angles.
International Mathematics Research Notices, 2016(20), 6096–6117.
Garey, M. R., & Johnson, D. S. (1979). Computers and intractability: A guide to the theory of
NP-completeness. New York: W. H. Freeman and Company.
Karanjai, A., Wallman, J., & Bartlett, S. (2018). Contextuality bounds the efficiency of classical
simulation of quantum processes. arXiv:1802.07744.
Kerman, J. (2011). A closed-form approximation for the median of the beta distribution.
arXiv:1111.0433.
Peres, A. (1991). Two simple proofs of the Kochen-Specker theorem. Journal of Physics A:
Mathematical and General, 24, 175–178.
Pitowsky, I. (1998). Infinite and finite Gleason’s theorems and the logic of indeterminacy. Journal
of Mathematical Physics, 39, 218.
Pitowsky, I. (2005). Quantum mechanics as a theory of probability. In W. Demopoulos &
I. Pitowsky (Eds.) Physical theory and its interpretation (The western Ontario series in
philosophy of science, Vol. 72). Dordrecht: Springer.
Poljak, S. (1974). A note on stable sets and colorings of graphs. Commentationes Mathematicae
Universitatis Carolinae, 15(2), 307–309.
Renner, R., & Wolf, S. (2004). Quantum pseudo-telepathy and the Kochen-Specker theorem. In
International Symposium on Information Theory, 2004. ISIT 2004. Proceedings (pp. 322–322).
Simmons, A. W. (2018, February). On the computational complexity of detecting possibilistic
locality. Journal of Logic and Computation, 28(1), 203–217. https://doi.org/10.1093/logcom/
exx045.
Soifer, A. (2008). The mathematical coloring book: Mathematics of coloring and the colorful life
of its creators. New York/London: Springer.
Witsenhausen, H. S. (1974). Spherical sets without orthogonal point pairs. American Mathematical
Monthly, 81, 1101–1102.
Życzkowski, K., & Sommers, H.-J. (2000). Truncations of random unitary matrices. Journal of
Physics A: Mathematical and General, 33(10), 2045.
Chapter 24
Roots and (Re)sources of Value
(In)definiteness Versus Contextuality
Karl Svozil
24.1 Introduction
K. Svozil ()
Institute for Theoretical Physics, Vienna University of Technology, Vienna, Austria
e-mail: svozil@tuwien.ac.at; http://tph.tuwien.ac.at/~svozil
as taken from his publications. In what follows classical value indefiniteness on col-
lections of (intertwined) quantum observables will be considered a consequence, or
even a synonym, of what he called indeterminacy. Whether or not this identification
is justified is certainly negotiable; but in what follows this is taken for granted.
The term value indefiniteness has been stimulated by recursion theory (Rogers,
Jr. 1967; Odifreddi 1989; Smullyan 1993), and in particular by partial func-
tions (Kleene 1936) – indeed the notion of partiality has not diffused into physical
theory formation, and might even appear alien to the very notion of functional value
assignments – and yet it appears to be necessary (Abbott et al. 2012, 2014, 2015)
if one insists (somewhat superficially) on classical interpretations of quantized
systems.
Value indefiniteness/indeterminacy will be contrasted with some related inter-
pretations and approaches, in particular, with contextuality. Indeed, I believe that
contextuality was rather foreign to Itamar Pitowsky’s thinking: the term “contex-
tuality” appears marginally – as in “a different context” – in his book Quantum
Probability – Quantum Logic (Pitowsky 1989b), nowhere in his reviews on Boole-
Bell type inequalities (Pitowsky 1989a, 1994), and mostly with reference to
contextual quantum probabilities in his late writings (Pitowsky 2006). The emphasis
on value indefiniteness/indeterminacy was, I believe, independently shared by Asher
Peres as well as Ernst Specker.
I met Itamar Pitowsky (Bub and Demopoulos 2010) personally rather late; after
he gave a lecture entitled “All Bell Inequalities” in Vienna (ESI – The Erwin
Schrödinger International Institute for Mathematical Physics 2001) on September
6th, 2000. Subsequent discussions resulted in a joint paper (Pitowsky and Svozil
2001) (stimulating further research (Sliwa 2003; Collins and Gisin 2004)). It presents
an application of his correlation polytope method (Pitowsky 1986, 1989a,b, 1991,
1994) to more general configurations than had been studied before. Thereby semi-
automated symbolic as well as numeric computations have been used.
Nevertheless, the violation of what Boole called (Boole 1862, p. 229) "conditions
of possible experience," obtained through solving the hull problem of classical
correlation polytopes, was just one route to quantum indeterminacy pursued by
Itamar Pitowsky. One could identify at least two more passages he contributed
to: One approach (Pitowsky 2003, 2006) compares differences of classical with
quantum predictions through conditions and constraints imposed by certain inter-
twined configurations of observables which I like to call quantum clouds (Svozil
2017b). And another approach (Pitowsky 1998; Hrushovski and Pitowsky 2004)
pushes these predictions to the limit of logical inconsistency; such that any attempt
of a classical description fails relative to the assumptions. In what follows we shall
follow all three pursuits and relate them to new findings.
24 Roots and (Re)sources of Value (In)definiteness Versus Contextuality 523
The basic idea to obtain all classical predictions – including classical probabilities,
expectations as well as consistency constraints thereof – associated with (mostly
complementary; that is, non-simultaneously measurable) collections of observables
is quite straightforward: Figure out all “extreme” cases or states which would be
classically allowed. Then construct all classically conceivable situations by forming
suitable combinations of the former.
Formally this amounts to performing the following steps (Pitowsky 1986,
1989a,b, 1991, 1994):
• Contemplate about some concrete structure of observables and their interconnec-
tions in intertwining observables – the quantum cloud.
• Find all two-valued states of that quantum cloud. (In the case of “contextual
inequalities” (Cabello 2008) include all variations of true/1 and false/0, irre-
spective of exclusivity; thereby often violating the Kolmogorovian axioms of
probability theory even within a single context.)
• Depending on one’s preferences, form all (joint) probabilities and expectations.
• For each of these two-valued states, evaluate the joint probabilities and expec-
tations as products of the single particle probabilities and expectations they are
formed of (this reflects statistical independence of the constituent observables).
• For each of the two-valued states, form a tuple containing these relevant (joint)
probabilities and expectations.
• Interpret this tuple as a vector.
• Consider the set of all such vectors – there are as many as there are two-
valued states, and their dimension depends on the number of (joint) probabilities
and expectations considered – and interpret them as vertices forming a convex
polytope.
• The convex combination of all conceivable two-valued states yields the surface
of this polytope; such that every point inside its convex hull corresponds to a
classical probability distribution.
• Determine the conditions of possible experience by solving the hull problem
– that is, by computing the hyperplanes which determine the inside–versus–
outside criteria for that polytope. These then can serve as necessary criteria for
all classical probabilities and expectations considered.
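The steps above can be sketched for the simplest nontrivial case, the CHSH configuration of four ±1-valued observables. The following is an illustrative reconstruction of the procedure, not code from the cited papers:

```python
from itertools import product

# Sketch: vertices of the CHSH correlation polytope. Each two-valued
# state fixes the outcomes a, a', b, b' in {+1, -1}; the joint
# expectations are formed as products of the single-particle values
# (reflecting statistical independence of the constituents).
vertices = {(a * b, a * b2, a2 * b, a2 * b2)
            for a, a2, b, b2 in product((+1, -1), repeat=4)}

# One facet of the convex hull of these vertices is the CHSH inequality
# |E(ab) + E(ab') + E(a'b) - E(a'b')| <= 2, a necessary condition of
# possible experience for all classical expectations:
classical_bound = max(abs(e1 + e2 + e3 - e4)
                      for e1, e2, e3, e4 in vertices)
print(len(vertices), classical_bound)  # 8 distinct vertices, bound 2
```

Quantum expectations violate this facet up to 2√2, which is the Tsirelson bound mentioned further on.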
The systematic application of this method yields necessary criteria for classi-
cal probabilities and expectations which are violated by the quantum probabil-
ities and expectations. Since I have reviewed this subject exhaustively (Svozil
2018c, Sect. 12.9) (see also Svozil 2017a), I have only sketched it here, to convey
its relevance for quantum indeterminacy. As is often the case in mathematical
physics, the method seems to have been envisioned independently several times.
From its (to the best of my knowledge) inception by Boole (1862) it has been
discussed in the measure-theoretic context of Choquet theory (Bishop and Leeuw
1959) and by Vorob’ev (1962). Froissart (Froissart 1981;
Cirel’son (=Tsirel’son) 1993) might have been the first to explicitly propose it
as a method for deriving generalized Bell-type inequalities. I suggested its usefulness
for non-Boolean cases (Svozil 2001) with “enough” two-valued states; preferably
sufficiently many to allow a proper distinction/separation of all observables (cf.
Kochen and Specker’s Theorem 0 (Kochen and Specker 1967, p. 67)). Consideration
of the pentagon/pentagram logic – that is, five cyclically intertwined con-
texts/blocks/Boolean subalgebras/cliques/orthonormal bases – popularized the sub-
ject and also yielded new predictions which could be used to differentiate classical
from quantized systems (Klyachko 2002; Klyachko et al. 2008; Bub and Stairs 2009,
2010; Badzia̧g et al. 2011).
A caveat: the obtained criteria involve multiple mutually complementary sum-
mands which are not all simultaneously measurable. Therefore, different terms,
when evaluated experimentally, correspond to different, complementary measure-
ment configurations. They are obtained at different times and on different particles
and samples.
Explicit, worked examples can, for instance, be found in Pitowsky’s
book (Pitowsky 1989b, Section 2.1), or papers (Pitowsky 1994) (see also Froissart’s
example (Froissart 1981)). Empirical findings are too numerous to even attempt
a just appreciation of all the efforts that went into testing classicality. There is
overwhelming evidence that the quantum predictions are correct; and that they
violate Boole’s conditions of possible classical experience (Clauser 2002) relative
to the assumptions (basically non-contextual realism and locality).
So, if Boole’s conditions of possible experience are violated, then they can no
longer be considered appropriate for any reasonable ontology forcing “reality”
upon them. This includes the realistic (Stace 1934) existence of hypothetical
counterfactual observables: “unperformed experiments seem to have no consis-
tent outcomes” (Peres 1978). The inconsistency of counterfactuals (in Specker’s
scholastic terminology infuturabilities (Specker 1960, 2009)) provides a connection
to value indefiniteness/indeterminacy – at least, and let me again repeat earlier
provisos, relative to the assumptions. More of this, piled higher and deeper, has
been supplied by Itamar Pitowsky, as will be discussed later.
Quantum probabilities are vector-based. At the same time those probabilities mimic
“classical” ones whenever they must be classical; that is, among mutually commut-
ing observables which can be measured simultaneously/concurrently on the same
particle(s) or samples – in particular, whenever those observables correspond to
projection operators which are either orthogonal (exclusive) or identical (inclusive).
1995; Länger and Maçzyński 1995; Dvurečenskij and Länger 1995a,b; Beltrametti
and Bugajski 1996; Pulmannová 2002). And because linear combinations of linear
operators remain linear, one can identify the terms occurring in conditions of possi-
ble experience with linear self-adjoint operators, whose sum yields a self-adjoint
operator which stands for the “quantum version” of the respective conditions
of possible experience. This operator has a spectral decomposition whose min-
max eigenvalues correspond to the quantum bounds (Filipp and Svozil 2004a,b),
which thereby generalize the Tsirelson bound (Cirel’son (=Tsirel’son) 1980). In
that way, every condition of possible experience which is violated by the quantum
probabilities provides a direct criterion for non-classicality.
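A numerically worked instance of this min-max computation, as a hedged sketch for the standard CHSH case (the operator and optimal measurement settings below are textbook choices, not taken from the cited papers):

```python
import numpy as np

# Sketch: the CHSH sum of joint observables as a self-adjoint "Bell
# operator"; its largest eigenvalue is the quantum bound.
X = np.array([[0, 1], [1, 0]])   # Pauli sigma_x
Z = np.array([[1, 0], [0, -1]])  # Pauli sigma_z
B1 = (Z + X) / np.sqrt(2)        # optimally rotated settings for party B
B2 = (Z - X) / np.sqrt(2)
bell = (np.kron(Z, B1) + np.kron(Z, B2)
        + np.kron(X, B1) - np.kron(X, B2))

# The maximal eigenvalue reproduces the Tsirelson bound 2*sqrt(2),
# which exceeds the classical bound 2 of the corresponding inequality:
print(np.linalg.eigvalsh(bell).max())  # 2.828... = 2*sqrt(2)
```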
Fig. 24.1 The convex structure of classical probabilities in this (Greechie) orthogonality diagram
representation of the Specker bug quantum logic is reflected in its partition logic,
obtained through indexing all 14 two-valued measures and adding an index 1 ≤ i ≤ 14 to an atom
if the ith two-valued measure is 1 on that atom. Concentrate on the outermost left and right
observables, depicted by squares: positivity and convexity require that 0 ≤ λi ≤ 1 and λ1 + λ2 +
λ3 + λ7 + λ10 + λ13 ≤ ∑_{i=1}^{14} λi = 1. Therefore, if a classical system is prepared (a generalized
urn model/automaton logic is “loaded”) such that λ1 + λ2 + λ3 = 1, then λ7 + λ10 + λ13 = 0,
which results in a TIFS (true-implies-false set of two-valued states): the classical prediction is that the latter outcome never occurs if the former
preparation is certain
Very similar arguments against classical value definiteness can be inferred from
quantum clouds with true-implies-true sets of two-valued states (TITS) (Stairs 1983;
Clifton 1993; Johansen 1994; Vermaas 1994; Belinfante 1973; Pitowsky 1982;
Hardy 1992, 1993; Boschi et al. 1997; Cabello and García-Alcaine 1995; Cabello
et al. 1996, 2013, 2018; Cabello 1997; Badzia̧g et al. 2011; Chen et al. 2013).
There the quantum advantage lies in the non-occurrence of outcomes which classical
predictions mandate.
[Fig. 24.2, panels (a)–(c): Greechie orthogonality diagrams of the quantum clouds, with atoms labeled 1–37; panel (a) shows, among others, the atoms {1} and {2, 3, 4, 5, 6, 7}, panel (b) the atoms {1′} and {1′, 2′, 3′, 4′, 5′}; atom 36 corresponds to {} and atom 37 to {1″, 2″, 3″, 4″}]
Fig. 24.2 (a) TIFS cloud, and (b) TITS cloud, with only a single overlaid classical value
assignment if the system is prepared in state |1⟩ (Svozil 2018b). (c) The combined cloud from
(a) and (b) has no value assignment allowing 36 ≡ {} to be true/1, but still allows 8 classical value
assignments, enumerated in Table 24.1, with overlaid partial coverage common to all of them. A
faithful orthogonal realization is enumerated in Abbott et al. (2015, Table 1, p. 102201-7)
the latter ones (that is, {2, 3, 4, 5, 6, 7} and {1′, 2′, 3′, 4′, 5′}) to be false/0 and true/1,
respectively, if the former ones (that is, {1} ≡ {1′}) are true/1.
Formally, the only two-valued states on the logics depicted in Fig. 24.2a and
b which allow v({1}) = v′({1′}) = 1 require that v({2, 3, 4, 5, 6, 7}) = 0
but v′({1′, 2′, 3′, 4′, 5′}) = 1 − v({2, 3, 4, 5, 6, 7}), respectively. However, both
these logics have a faithful orthogonal representation (Abbott et al. 2015, Table 1,
p. 102201-7) in terms of vectors which coincide in |{1}⟩ = |{1′}⟩, as well as
in |{2, 3, 4, 5, 6, 7}⟩ = |{1′, 2′, 3′, 4′, 5′}⟩, and even in all of the other adjacent
observables.
The combined logic, which features 37 binary observables (propositions) in 26
contexts, no longer has a classical interpretation in terms of a partition logic, as the
8 two-valued states enumerated in Table 24.1 cannot mutually separate (Kochen
and Specker 1967, Theorem 0) the observables 2, 13, 15, 16, 17, 25, 27 and 36.
It might be amusing to keep in mind that, because of the non-separability (Kochen
and Specker 1967, Theorem 0) of some of the binary observables (propositions),
there does not exist a proper partition logic. However, there exist generalized
urn (Wright 1978, 1990) and finite automata (Moore 1956; Schaller and Svozil
1995, 1996) model realisations thereof: just consider urns “loaded” with balls which
carry no colored symbols for the binary observables (propositions) 2, 13, 15, 16, 17,
25, 27 and 36; or no such balls at all. In such cases it is no longer possible to
empirically reconstruct the underlying logic; yet if an underlying logic is assumed,
then – at least as long as there still are truth assignments/two-valued states on
the logic – “reduced” probability distributions can be defined, urns can be loaded,
and automata prepared, which conform to the classical predictions obtained from a
convex combination of these truth assignments/two-valued states – thereby giving
rise to “reduced” conditions of possible experience via hull computations.
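Such a “reduced” classical distribution is nothing but a convex combination of the remaining truth assignments. A minimal toy sketch (the observables a, b, c and the weights here are hypothetical, not those of Table 24.1):

```python
# Hypothetical toy example: a classical ("urn") probability distribution
# as a convex combination of admissible truth assignments/two-valued
# states; each state maps observables to 0/1.
truth_assignments = [
    {"a": 1, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 1},
]
weights = [0.5, 0.3, 0.2]  # lambda_i >= 0 with sum(lambda_i) = 1

def prob(observable):
    """Classical prediction p(o) = sum_i lambda_i * v_i(o)."""
    return sum(l * v[observable] for l, v in zip(weights, truth_assignments))

print(prob("a"), prob("b"), prob("c"))  # 0.5 0.3 0.2
```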
For global/total truth assignments (Pitowsky 1998; Hrushovski and Pitowsky
2004) as well as for local admissibility rules allowing partial (as opposed to
total, global) truth assignments (Abbott et al. 2012, 2015), such arguments can be
extended to cover all terminal states which are neither collinear nor orthogonal.
One could point out that, insofar as a fixed state has to be prepared, the resulting
value indefiniteness/indeterminacy is state dependent. One may indeed hold that
the strongest indication for quantum value indefiniteness/indeterminacy is the total
absence/non-existence of two-valued states, as exposed in the Kochen-Specker
theorem (Kochen and Specker 1967). But this is rather a question of nominalistic
taste, as neither case has direct empirical testability; and as Clifton already pointed
out in a private conversation in 1995: “how can you measure a contradiction?”
Table 24.1 Enumeration of the 8 two-valued states on the 37 binary observables (propositions) of the combined quantum clouds/logics depicted in Fig. 24.2a
and b. Row vectors indicate the state values on the observables; column vectors the values of all states on the respective observable
# 1 2 3 4 ··· ··· ··· 34 35 36 37
1 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 1 0 0 1 0 1 0 0 1 1 0 1 1 1 1 0 1
2 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 1 1 1 1 1 1 0 1
3 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 1 0 0 1 0 1 0 1 0 0 1 1 1 1 1 0 1
4 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 1 0 0 0 1 0 0 0 1 1 1 1 1 1 0 1
5 1 0 1 1 0 1 0 1 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 1 0 1 0 0 1 1 0 0 0 1 1 0 0
6 1 0 1 1 0 1 0 0 1 0 0 1 0 1 0 0 0 1 0 1 0 0 0 1 0 1 0 0 1 1 0 1 0 1 0 0 0
7 1 0 1 0 1 1 0 1 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 1 0 1 0 0 1 0 0 1 1 0 0
8 1 0 1 0 1 0 1 1 0 0 0 1 0 1 0 0 0 1 0 1 0 0 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0
as functions of the initial state and the context measured (Svozil 2009a, 2012;
Dzhafarov et al. 2017); and that they actually “are real” and not just “idealistically
occur in our imagination;” that is, being “mental through-and-through” (Segal
and Goldschmidt 2017/2018). Early conceptualizations of context-dependence aka
contextuality can be found in Bohr’s remark (in his typical Nostradamus-like
style) (Bohr 1949) on “the impossibility of any sharp separation between the
behavior of atomic objects and the interaction with the measuring instruments which
serve to define the conditions under which the phenomena appear.” Bell, referring
to Bohr, suggested (Bell 1966, Sec. 5) that “the result of an observation may
reasonably depend not only on the state of the system (including hidden variables)
but also on the complete disposition of the apparatus.”
However, the common, prevalent use of the term “contextuality” does not refer to an
explicit context-dependent form, as suggested by the realist Bell in the quote above,
but rather to a situation where the classical predictions of quantum clouds are violated.
More concretely, if experiments on quantized systems violate certain Boole-Bell
type classical bounds or direct classical predictions, the narratives claim to have
thereby “proven contextuality” (e.g., see Hasegawa et al. (2006), Cabello et al.
(2008), Cabello (2008), Bartosik et al. (2009), Amselem et al. (2009), Bub and Stairs
(2010) and Cabello et al. (2013) for a “direct proof of quantum contextuality”).
What if we take Bell’s proposal of a context dependence of valuations –
and consequently, “classical” contextual probability theory – seriously? One of
the consequences would be the introduction of an uncountable multiplicity of
counterfactual observables. An example to illustrate this multiplicity – comparable
to de Witt’s view of Everett’s relative state interpretation (Everett III 1973) – is
the uncountable set of orthonormal bases of R3 which are all interconnected at the
same single intertwining element. A continuous angular parameter characterizes the
angles between the other elements of the bases, located in the plane orthogonal to
that common intertwining element. Contextuality suggests that the value assignment
of an observable (proposition) corresponding to this common intertwining element
needs to be both true/1 and false/0, depending on the context involved, or whenever
some quantum cloud (collection of intertwining observables) demands this through
consistency requirements.
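This one-parameter family of intertwined contexts can be written down explicitly. In the following sketch (my construction, with the common intertwining element chosen as the z-axis), every angle yields a valid orthonormal basis of R3 containing e:

```python
import numpy as np

# Sketch: an uncountable, one-parameter family of orthonormal bases of
# R^3, all intertwined in the single common element e; the angular
# parameter theta rotates the other two basis vectors in the plane
# orthogonal to e.
e = np.array([0.0, 0.0, 1.0])

def context(theta):
    """Orthonormal basis {e, f(theta), g(theta)} containing e."""
    return np.array([e,
                     [np.cos(theta), np.sin(theta), 0.0],
                     [-np.sin(theta), np.cos(theta), 0.0]])

# every theta yields a valid context sharing the common element e:
ok = all(np.allclose(context(t) @ context(t).T, np.eye(3))
         for t in np.linspace(0.0, 2 * np.pi, 7))
print(ok)  # True
```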
Indeed, the introduction of multiple quantum clouds would force any context
dependence to also implicitly depend on this general perspective – that is, on the
respective quantum cloud and its faithful orthogonal realization, which in turn
determines the quantum probabilities via the Born-Gleason rule: Because there exist
various different quantum clouds as “pathways interconnecting” two observables,
context dependence needs to vary according to any concrete connection between
the prepared and the measured state.
A single context participates in an arbitrary, potentially infinite, multiplicity of
quantum clouds. This requires this one context to “behave very differently” when
it comes to contextual value assignments. Alas, as quantum clouds are hypothetical
constructions of our mind and therefore “mental through-and-through” (Segal and
Goldschmidt 2017/2018), so appears context dependence: as an idealistic concept,
Nietzsche once speculated (Nietzsche 1887) that what he called
“slave morality” originated from superficially pretending that – in what Blair
(aka Orwell 1949) later called “doublespeak” – weakness means strength. In a rather similar
sense the lack of comprehension – Planck’s “sign-post” – and even the resulting
inconsistencies tended to become reinterpreted as an asset: nowadays consequences
of the vector-based quantum probability law are marketed as “quantum supremacy”
– a “quantum magic” or “hocus-pocus” (Svozil 2016) of sorts.
Indeed, future centuries may look back at our period and may even call it a
second “renaissance” of scholasticism (Specker 1960). Years from now,
historians of science will be amused by our ongoing queer efforts, the calamities
and “magic” experienced through our painful incapacity to recognize the obvious –
that is, the non-existence, and therefore value indefiniteness/indeterminacy, of certain
counterfactual observables – namely exactly those mentioned in Itamar Pitowsky’s
indeterminacy principle.
This principle has a positive interpretation of a quantum state, defined as the
maximal knowledge obtainable by simultaneous measurements of a quantized
system; or, conversely, as the maximal information content encodable therein. This
can be formalized in terms of the value definiteness of a single (Zeilinger 1999;
Svozil 2002, 2004, 2018b; Grangier 2002) context – or, in a broader (non-
Where might this aforementioned type of stochasticism arise from? It could well
be introduced by interactions with the environment, and through the many
uncontrollable and, for all practical purposes (Bell 1990), huge number of degrees
of freedom in unknown states.
The finiteness of physical resources need not prevent the specification of a
particular vector or context, because any other context needs to be operationalized
within the physically feasible means available to the respective experiment: it is the
measurable coordinate differences which count, not the absolute location relative
to a hypothetical, idealistic absolute frame of reference which cannot be accessed
operationally.
Finally, as the type of context envisioned to be value definite can be expressed in
terms of vector spaces equipped with a scalar product – in particular, by identifying
a context with the corresponding orthonormal basis or (the spectral decomposition
of) the associated maximal observable(s) – one may ask how one could imagine
the origin of such entities. Abstractly, vectors and vector spaces could originate
from a great variety of very different forms, such as from systems of solutions
of ordinary linear differential equations. Any investigation into the origins of the
quantum mechanical Hilbert space formalism itself might, if this turns out to be a
progressive research program (Lakatos 1978), eventually yield a theory indicating
operational physical capacities beyond quantum mechanics.
References
Abbott, A. A., Calude, C. S., Conder, J., & Svozil, K. (2012). Strong Kochen-Specker theorem
and incomputability of quantum randomness. Physical Review A, 86, 062109. https://doi.org/
10.1103/PhysRevA.86.062109, arXiv:1207.2029
Abbott, A. A., Calude, C. S., & Svozil, K. (2014) Value-indefinite observables are almost
everywhere. Physical Review A, 89, 032109. https://doi.org/10.1103/PhysRevA.89.032109,
arXiv:1309.7188
Abbott, A. A., Calude, C. S., & Svozil, K. (2015). A variant of the Kochen-Specker theorem
localising value indefiniteness. Journal of Mathematical Physics, 56(10), 102201. https://doi.
org/10.1063/1.4931658, arXiv:1503.01985
Amselem, E., Rådmark, M., Bourennane, M., & Cabello, A. (2009). State-independent quantum
contextuality with single photons. Physical Review Letters, 103(16), 160405. https://doi.org/
10.1103/PhysRevLett.103.160405
Aufféves, A., & Grangier, P. (2017). Recovering the quantum formalism from physically
realist axioms. Scientific Reports, 7(2123), 43365 (1–9). https://doi.org/10.1038/srep43365,
arXiv:1610.06164
Aufféves, A., & Grangier, P. (2018). Extracontextuality and extravalence in quantum mechanics.
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering
Sciences, 376(2123), 20170311. https://doi.org/10.1098/rsta.2017.0311, arXiv:1801.01398
Badzia̧g, P., Bengtsson, I., Cabello, A., Granström, H., & Larsson, J. A. (2011). Pentagrams and
paradoxes. Foundations of Physics, 41. https://doi.org/10.1007/s10701-010-9433-3.
Bartosik, H., Klepp, J., Schmitzer, C., Sponar, S., Cabello, A., Rauch, H., & Hasegawa, Y. (2009).
Experimental test of quantum contextuality in neutron interferometry. Physical Review Letters,
103(4), 040403. https://doi.org/10.1103/PhysRevLett.103.040403, arXiv:0904.4576.
Belinfante, F. J. (1973). A survey of hidden-variables theories (International series of monographs
in natural philosophy, Vol. 55). Oxford/New York: Pergamon Press/Elsevier. https://doi.org/10.
1016/B978-0-08-017032-9.50001-7.
Bell, J. S. (1966). On the problem of hidden variables in quantum mechanics. Reviews of Modern
Physics, 38, 447–452. https://doi.org/10.1103/RevModPhys.38.447.
Bell, J. S. (1990). Against ‘measurement’. Physics World, 3, 33–41. https://doi.org/10.1088/2058-
7058/3/8/26.
Beltrametti, E. G., & Bugajski, S. (1996). The Bell phenomenon in classical frameworks. Journal
of Physics A: Mathematical and General Physics, 29. https://doi.org/10.1088/0305-4470/29/2/
005.
Beltrametti, E. G., & Maçzyński, M. J. (1991). On a characterization of classical and nonclassical
probabilities. Journal of Mathematical Physics, 32. https://doi.org/10.1063/1.529326.
Beltrametti, E. G., & Maçzyński, M. J. (1993). On the characterization of probabilities: A
generalization of Bell’s inequalities. Journal of Mathematical Physics, 34. https://doi.org/10.
1063/1.530333.
Cabello, A., Severini, S., & Winter, A. (2014). Graph-theoretic approach to quantum correla-
tions. Physical Review Letters, 112, 040401. https://doi.org/10.1103/PhysRevLett.112.040401,
arXiv:1401.7081.
Cabello, A., Portillo, J. R., Solís, A., & Svozil, K. (2018). Minimal true-implies-false and true-
implies-true sets of propositions in noncontextual hidden-variable theories. Physical Review A,
98, 012106. https://doi.org/10.1103/PhysRevA.98.012106, arXiv:1805.00796.
Chen, J. L., Cabello, A., Xu, Z. P., Su, H. Y., Wu, C., & Kwek, L. C. (2013) Hardy’s paradox for
high-dimensional systems. Physical Review A, 88, 062116. https://doi.org/10.1103/PhysRevA.
88.062116.
Cirel’son (=Tsirel’son), B. S. (1980). Quantum generalizations of Bell’s inequality. Letters in
Mathematical Physics, 4(2), 93–100. https://doi.org/10.1007/BF00417500.
Cirel’son (=Tsirel’son), B. S. (1993). Some results and problems on quantum Bell-type inequali-
ties. Hadronic Journal Supplement, 8, 329–345. http://www.tau.ac.il/~tsirel/download/hadron.
pdf.
Clauser, J. (2002). Early history of Bell’s theorem. In R. Bertlmann & A. Zeilinger (Eds.), Quantum
(un)speakables: From Bell to quantum information (pp. 61–96). Berlin: Springer. https://doi.
org/10.1007/978-3-662-05032-3_6.
Clifton, R. K. (1993). Getting contextual and nonlocal elements-of-reality the easy way. American
Journal of Physics, 61(5), 443–447. https://doi.org/10.1119/1.17239.
Collins, D., & Gisin, N. (2004). A relevant two qubit Bell inequality inequivalent to the CHSH
inequality. Journal of Physics A: Mathematical and General, 37, 1775–1787. https://doi.org/
10.1088/0305-4470/37/5/021, arXiv:quant-ph/0306129.
Del Noce, C. (1995). An algorithm for finding Bell-type inequalities. Foundations of Physics
Letters, 8. https://doi.org/10.1007/bf02187346.
Dvurečenskij, A., & Länger, H. (1994). Bell-type inequalities in horizontal sums of boolean
algebras. Foundations of Physics, 24. https://doi.org/10.1007/bf02057864.
Dvurečenskij, A., & Länger, H. (1995a). Bell-type inequalities in orthomodular lattices. I.
Inequalities of order 2. International Journal of Theoretical Physics, 34. https://doi.org/10.
1007/bf00671363.
Dvurečenskij, A., & Länger, H. (1995b). Bell-type inequalities in orthomodular lattices. II.
Inequalities of higher order. International Journal of Theoretical Physics, 34. https://doi.org/
10.1007/bf00671364.
Dzhafarov, E. N., Cervantes, V. H., & Kujala, J. V. (2017). Contextuality in canonical systems
of random variables. Philosophical Transactions of the Royal Society A: Mathematical,
Physical and Engineering Sciences, 375(2106), 20160389. https://doi.org/10.1098/rsta.2016.
0389, arXiv:1703.01252.
Earman, J. (2007). Aspects of determinism in modern physics. Part B. In J. Butterfield & J.
Earman (Eds.), Philosophy of physics, handbook of the philosophy of science (pp. 1369–1434).
Amsterdam: North-Holland. https://doi.org/10.1016/B978-044451560-5/50017-8.
ESI – The Erwin Schrödinger International Institute for Mathematical Physics. (2001). Scientific
report for the year 2000. https://www.esi.ac.at/material/scientific-reports-1/2000.pdf, ESI-
Report 2000.
Everett III, H. (1973). The many-worlds interpretation of quantum mechanics (pp. 3–140).
Princeton: Princeton University Press.
Filipp, S., & Svozil, K. (2004a). Generalizing Tsirelson’s bound on Bell inequalities using a min-
max principle. Physical Review Letters, 93, 130407. https://doi.org/10.1103/PhysRevLett.93.
130407, arXiv:quant-ph/0403175.
Filipp, S., & Svozil, K. (2004b). Testing the bounds on quantum probabilities. Physical Review A,
69, 032101. https://doi.org/10.1103/PhysRevA.69.032101, arXiv:quant-ph/0306092.
Pitowsky, I. (1998). Infinite and finite Gleason’s theorems and the logic of indeterminacy. Journal
of Mathematical Physics, 39(1), 218–228. https://doi.org/10.1063/1.532334.
Pitowsky, I. (2003). Betting on the outcomes of measurements: A Bayesian theory of quantum
probability. Studies in History and Philosophy of Science Part B: Studies in History and Phi-
losophy of Modern Physics, 34(3), 395–414. https://doi.org/10.1016/S1355-2198(03)00035-2,
quantum Information and Computation, arXiv:quant-ph/0208121.
Pitowsky, I. (2006). Quantum mechanics as a theory of probability. In W. Demopoulos &
I. Pitowsky (Eds.), Physical theory and its interpretation (The western Ontario series in
philosophy of science, Vol. 72, pp. 213–240). Springer Netherlands. https://doi.org/10.1007/
1-4020-4876-9_10, arXiv:quant-ph/0510095.
Pitowsky, I., & Svozil, K. (2001). New optimal tests of quantum nonlocality. Physical Review A,
64, 014102. https://doi.org/10.1103/PhysRevA.64.014102, arXiv:quant-ph/0011060.
Planck, M. (1932). The concept of causality. Proceedings of the Physical Society, 44(5), 529–539.
https://doi.org/10.1088/0959-5309/44/5/301.
Pták, P., & Pulmannová, S. (1991). Orthomodular structures as quantum logics. Intrinsic properties,
state space and probabilistic topics (Fundamental theories of physics, Vol. 44). Dordrecht:
Kluwer Academic Publishers/Springer.
Pulmannová, S. (2002). Hidden variables and Bell inequalities on quantum logics. Foundations of
Physics, 32, https://doi.org/10.1023/a:1014424425657.
Pykacz, J., & Santos, E. (1991). Hidden variables in quantum logic approach reexamined. Journal
of Mathematical Physics, 32. https://doi.org/10.1063/1.529327.
Ramanathan, R., Rosicka, M., Horodecki, K., Pironio, S., Horodecki, M., & Horodecki, P. (2018).
Gadget structures in proofs of the Kochen-Specker theorem. https://arxiv.org/abs/1807.00113,
arXiv:1807.00113
Rogers, H. Jr. (1967). Theory of recursive functions and effective computability. New
York/Cambridge, MA: McGraw-Hill/The MIT Press.
Schaller, M., & Svozil, K. (1995). Automaton partition logic versus quantum logic. International
Journal of Theoretical Physics, 34(8), 1741–1750. https://doi.org/10.1007/BF00676288.
Schaller, M., & Svozil, K. (1996). Automaton logic. International Journal of Theoretical Physics,
35, https://doi.org/10.1007/BF02302381.
Segal, A., & Goldschmidt, T. (2017/2018). The necessity of idealism. In Idealism: New essays
in metaphysics (pp. 34–49). Oxford: Oxford University Press. https://doi.org/10.1093/oso/
9780198746973.003.0003.
Simmons, A. W. (2020). How (maximally) contextual is quantum mechanics? This volume,
pp. 505–520.
Sliwa, C. (2003). Symmetries of the Bell correlation inequalities. Physics Letters A, 317, 165–168.
https://doi.org/10.1016/S0375-9601(03)01115-0, arXiv:quant-ph/0305190.
Smullyan, R. M. (1993). Recursion theory for metamathematics (Oxford logic guides, Vol. 22).
New York/Oxford: Oxford University Press.
Solís-Encina, A., & Portillo, J. R. (2015). Orthogonal representation of graphs. https://arxiv.org/
abs/1504.03662, arXiv:1504.03662.
Specker, E. (1960). Die Logik nicht gleichzeitig entscheidbarer Aussagen. Dialectica, 14(2–3),
239–246. https://doi.org/10.1111/j.1746-8361.1960.tb00422.x, arXiv:1103.4537.
Specker, E. (1990). Selecta. Basel: Birkhäuser Verlag. https://doi.org/10.1007/978-3-0348-9259-
9.
Specker, E. (2009). Ernst Specker and the fundamental theorem of quantum mechanics. https://
vimeo.com/52923835, video by Adán Cabello, recorded on June 17, 2009.
Stace, W. T. (1934). The refutation of realism. Mind, 43(170), 145–155. https://doi.org/10.1093/
mind/XLIII.170.145.
Stairs, A. (1983). Quantum logic, realism, and value definiteness. Philosophy of Science, 50, 578–
602. https://doi.org/10.1086/289140.
Svozil, K. (2001). On generalized probabilities: Correlation polytopes for automaton logic and
generalized urn models, extensions of quantum mechanics and parameter cheats. https://arxiv.
org/abs/quant-ph/0012066, arXiv:quant-ph/0012066.
Svozil, K. (2002). Quantum information in base n defined by state partitions. Physical Review A,
66, 044306. https://doi.org/10.1103/PhysRevA.66.044306, arXiv:quant-ph/0205031.
Svozil, K. (2004). Quantum information via state partitions and the context translation principle.
Journal of Modern Optics, 51, 811–819. https://doi.org/10.1080/09500340410001664179,
arXiv:quant-ph/0308110.
Svozil, K. (2005). Logical equivalence between generalized urn models and finite automata.
International Journal of Theoretical Physics, 44, 745–754. https://doi.org/10.1007/s10773-
005-7052-0, arXiv:quant-ph/0209136.
Svozil, K. (2009a). Proposed direct test of a certain type of noncontextuality in quantum
mechanics. Physical Review A, 80(4), 040102. https://doi.org/10.1103/PhysRevA.80.040102.
Svozil, K. (2009b). Quantum scholasticism: On quantum contexts, counterfactuals, and the
absurdities of quantum omniscience. Information Sciences, 179, 535–541. https://doi.org/10.
1016/j.ins.2008.06.012.
Svozil, K. (2012). How much contextuality? Natural Computing, 11(2), 261–265. https://doi.org/
10.1007/s11047-012-9318-9, arXiv:1103.3980.
Svozil, K. (2016). Quantum hocus-pocus. Ethics in Science and Environmental Politics (ESEP),
16(1), 25–30. https://doi.org/10.3354/esep00171, arXiv:1605.08569.
Svozil, K. (2017a). Classical versus quantum probabilities and correlations. https://arxiv.org/abs/
1707.08915, arXiv:1707.08915.
Svozil, K. (2017b). Quantum clouds. https://arxiv.org/abs/1808.00813, arXiv:1808.00813.
Svozil, K. (2018a). Kolmogorov-type conditional probabilities among distinct contexts. https://
arxiv.org/abs/1903.10424, arXiv:1903.10424.
Svozil, K. (2018b). New forms of quantum value indefiniteness suggest that incompatible
Schrödinger's Reaction to the EPR Paper

Jos Uffink
25.1 Introduction
On March 25, 1935, the Physical Review received a paper by Einstein, Podolsky
and Rosen (EPR), arguing that the quantum description of physical reality was
incomplete. It was published on May 15 of the same year. It will be superfluous, on
this occasion, to stress the importance of this paper for the subsequent discussions
of the foundations of quantum theory in the 1930s and beyond, in particular
by directing these discussions to focus on the phenomenon of entanglement,
eventually opening up, in recent years, investigations into quantum information
and computation. It is a pleasure to dedicate this paper to the memory of Itamar Pitowsky.
J. Uffink
Philosophy Department and Program in History of Science, Technology and Medicine,
University of Minnesota, Minneapolis, MN, USA
e-mail: jbuffink@umn.edu
The points that Schrödinger worried about are subtle. They concern both the role of
biorthogonal decompositions of bipartite wave functions and the crucial dependence
on the measurement outcomes in the EPR argument. To understand his worries, it
might be well to first recall the structure of the argument in the EPR paper.
2 “[E]very element of the physical reality must have a counterpart in the physical theory.” (Einstein
et al. 1935, p. 777)
3 “If, without in any way disturbing a system, we can predict with certainty (i.e., with probability
equal to unity) the value of a physical quantity, then there exists an element of reality corresponding
to that quantity.” (Einstein et al. 1935, p. 777)
4 Incidentally, the EPR paper never refers to spatial separation between the two systems as a
condition that would guarantee the absence of such interaction, unlike Schrödinger’s argument
in 1931.
5 Here and in further quotations, I have not followed the original notation. In particular, wave functions are presented here in the now common Dirac notation for perspicuity, even though, strictly speaking, symbols like |Ψ⟩ are vectors in a Hilbert space, and the corresponding wave function Ψ(x₁, x₂) = ⟨x₁, x₂|Ψ⟩ is a representation of this vector in the position eigenbasis. Also, I abstain from referring to such entities as 'states', because it is an issue of contention between Einstein and Schrödinger whether wave functions represent states of a system.
After reaching this general conclusion, EPR proceed to discuss a special case, in
which it happens that ψ k and φ r above are eigenfunctions of the non-commuting
quantities P and Q, respectively, of system 2. For this purpose, they employ the
wave function

Ψ(q₁, q₂) = (1/√(2π)) ∫_{−∞}^{+∞} e^{i(q₁ − q₂ + q₀)p} dp    (25.2)
And this conclusion is then shown to lead to a contradiction with the assumption
that the wave function gives a complete description of reality, because quantum
mechanics does not allow any wave function for system 2 in which P₂ and Q₂ are both elements of reality.
It is worth emphasizing that the argument for the special conclusion, unlike the
general conclusion, makes essential use of the circumstance that the special wave
function (25.3) possesses multiple biorthogonal decompositions.
The notes in the undated folder Übungen relating to the EPR paper start off with
2 pages entitled “Entwicklung e. Funktion zweier Variablen in eine biorthogonale
Reihe” (Expansion of a function of two variables in a biorthogonal series). This is a
rehearsal of a proof of the now well-known biorthogonal decomposition theorem, a proof Schrödinger had in fact already worked out earlier, in the folder Supraleitung 1931.6 He included a proof of the theorem along the lines of these notes in his June 7 letter to Einstein, and also in Schrödinger (1935a). In Übungen, he did not work out all the details of the theorem, but only went far enough to refresh his knowledge of the topic and continue. He writes:
The main issue established. To be worked out later.
Schrödinger then immediately turns to the EPR paper (ignoring Einstein’s coau-
thors) with the remark:
Einstein does not write of a complete biorthogonal decomposition.
Even before we go into the details of his argument, these remarks show two points. First, he is contemplating what I called the general conclusion of the EPR paper, based on the non-biorthogonal decompositions (25.1), rather than the special conclusion, where (25.3) of course does provide biorthogonal decompositions. Secondly, Schrödinger thought that this general conclusion was problematic precisely because it did not assume a biorthogonal decomposition. In more detail, his argument in Übungen initially worries about the question of what the non-biorthogonal
decompositions in (25.1) imply for the application of the projection postulate after
a measurement on system 1. But these initial worries do not materialize, the text
breaks off, and he then switches to a different problem:
However, the problem is different. From the mere fact that after one measurement [i.e. of A on system 1] we are stuck with one [wave function ψₖ], but after another measurement [of B on system 1] we are stuck with another [wave] function [φᵣ], where these functions correspond to contradictory properties of "the other" system [2], one cannot fabricate a noose by which to hang quantum mechanics (läßt sich die Qu. mech. noch kein Strick drehen). Because even one and the same method of measurement can and must, for identically prepared systems, provide sometimes this, sometimes another outcome, and these lead to inferences about the "other" system that will likewise be contradictory.
Clearly, Schrödinger sees the general conclusion in the EPR paper as insufficient to
produce a paradox. He continues:
It is therefore also not a contradiction when the function [Ψ(q₁, q₂)] includes the possibility that on one occasion a measurement delivers a determinate value of position, and on another
6 https://fedora.phaidra.univie.ac.at/fedora/objects/o:259940/methods/bdef:Book/view, pp.95–6.
What was he thinking here? Let me try to reconstruct the argument by breaking
the problem down in a piecemeal manner, proceeding from simple cases to more
general ones. There are two main themes throughout this reconstruction attempt: the
first is the distinction between biorthogonal and non-biorthogonal decompositions
of the joint wave function |Ψ⟩, as already indicated. The second theme is the
distinction between inferences based on just a single performance of a measurement
(Einzelmessung), which will naturally be conditional on the particular outcome
obtained in this single measurement, versus conclusions that can be stated by
quantifying over all possible outcomes of the measurement procedure, and therefore
are not conditional on any particular outcome.
The simplest case is when the state vector |Ψ⟩ for the pair of systems has a unique biorthogonal decomposition, say:

|Ψ⟩ = Σᵢ cᵢ |aᵢ⟩|bᵢ⟩    (25.5)
1.⁷ And the answer to that question is generally negative, even in classical physics,
because two systems, even when they do not interact, might be correlated, so that
the outcome obtained by a measurement on 1 can still be informative about what is
real for system 2. Indeed, one of the main themes that surface again and again in
Schrödinger’s writings on entanglement and the EPR problem is that they express
an intermixture of ‘mental’ (where today we would perhaps say ‘epistemic’) and
‘physical’ (where today we would perhaps say ‘ontic’) aspects of the wave function.
Therefore, in the special case of (25.5), the fact that, when the same measurement yields different outcomes on system 1, different wave functions are assigned to system 2, depending on those outcomes, does not justify EPR's general conclusion that those wave functions represent the same physical reality for system 2.
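The unique-biorthogonal-decomposition case (25.5) can be made concrete numerically: the biorthogonal (Schmidt) decomposition of a bipartite vector is just the singular value decomposition of its coefficient matrix. The following is a minimal illustrative sketch; the dimension, random state, and variable names are my own choices, not anything from the correspondence:

```python
import numpy as np

rng = np.random.default_rng(0)

# A generic bipartite state on C^3 (x) C^3, written as a 3x3 coefficient
# matrix C, so that |Psi> = sum_{m,n} C[m, n] |m>|n>.
C = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
C /= np.linalg.norm(C)

# The singular value decomposition C = U diag(c) Vh is exactly the
# biorthogonal (Schmidt) decomposition |Psi> = sum_i c_i |a_i>|b_i>:
# the |a_i> are the columns of U, the |b_i> have components given by
# the rows of Vh.
U, c, Vh = np.linalg.svd(C)

# Reconstruct the coefficient matrix from the biorthogonal form.
C_rec = sum(c[i] * np.outer(U[:, i], Vh[i, :]) for i in range(3))
assert np.allclose(C, C_rec)

# For a generic state all Schmidt coefficients are distinct -- the
# condition under which the decomposition is unique (up to phases).
assert all(c[i] - c[i + 1] > 1e-10 for i in range(2))
```

For a generic state the singular values are pairwise distinct, and the decomposition is then unique up to phases; the equal-coefficient case, discussed below in connection with the EPR example, is the exception.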
Let us proceed to a more general case, where the decomposition is not biorthogonal, i.e.:

|Ψ⟩ = Σₙ cₙ |aₙ⟩|ψₙ⟩    (25.6)
obtained from system 1. If these outcomes are held to be informative about system
2, they might represent mutually exclusive pieces of information about system 2.
But then, how could one argue that the inferences about system 2, i.e. that either
of the quantities P k or P r represents an element of reality by having the value 1,
refer to the same reality? Accordingly, Schrödinger does not regard it as paradoxical
by itself that, conditional on different outcomes obtained in measurements on the
first system, one should be able to predict (and infer the reality of) values of non-
commuting quantities of the second system.
He expresses this worry in his June 7 letter to Einstein:
In order to construct a contradiction, it is in my view not sufficient that for the same preparation of the system pair the following can happen: one concrete single measurement on the first system leads to a determinate value of a quantity [B] of the second system, any other value of [B] or [B′] is excluded for general reasons. I believe this is not sufficient. Indeed, the fact that same preparations do not always give the same measurement outcomes is something we have to concede anyway. So when on one occasion the measurement gives the value [bₖ], and on another occasion the value [bᵣ] to quantity B, why would it not give on a third time, not the quantity B but the quantity B′ reality, and provide it with a value [b′ₛ]?
The point that Schrödinger raises in the last sentence, namely that the
measurement on system 1 could lead to different (in particular non-commuting)
quantities obtaining values, would be hard to appreciate without the context of his
notes in the folder Übungen, because Schrödinger does not explicitly mention the
issue of biorthogonality in this passage. But one can infer from these notes that
the non-biorthogonality of the EPR wave function (25.1) was his main concern,
and since my reconstruction argues that for a non-biorthogonal decomposition,
like (25.6), one would indeed end up assigning reality to values of non-commuting
quantities like P k and P r , depending on the outcome obtained in the measurement
on system 1, this last sentence just quoted makes sense.
But of course, the two cases just reviewed still do not cover the case treated by
EPR’s general conclusion, i.e. the state (25.1). Here, we consider a measurement
of A, or a measurement of B, on system 1 with outcome ak or br respectively, and
contemplate their implications for system 2. Both EPR and Schrödinger agree that
this choice of what is done to system 1 should not affect what is real on system 2.
(As he said in 1931: what we choose to measure depends on an arbitrary choice (Willensentscheidung).) But what we infer about system 2 depends not only on what
we do to system 1, it depends also on the outcome we obtain. Now, it would be hard to deny that, just as a₁ and a₂ are mutually incompatible outcomes for the measurement of A on system 1 in example (25.6), which allow different inferences about what is real for system 2, so the outcomes aₖ and bᵣ obtained in incompatible measurements on system 1 also provide incompatible data that enable different inferences about what is real for system 2. And so one may well ask whether this should be in any way more paradoxical than the case already discussed under (25.6).
Schrödinger’s answer seems to be negative.
To summarize, what I have been arguing so far is that Schrödinger’s misgivings
about the EPR argument focus on their general conclusion, and particularly on the
fact that, at this stage of their argument, biorthogonality of the decomposition of the
wave function is not assumed. He argues that any inference about the second system,
based on a measurement of the first, depends not only on the quantity measured on
the first system, but also on the outcome obtained in such a measurement. In the
case of a non-biorthogonal decomposition of the wave function, he does not regard
it as paradoxical that, depending on the outcome of a measurement on system 1,
we should infer that the values of one or another of non-commuting quantities for
system 2 constitute an element of reality.
Schrödinger also describes in his June 7 letter to Einstein what he thinks is a
satisfactory way to argue for a contradiction. The argument is very similar to what
he wrote in his Übungen notebook:
In order to construct a contradiction, the following seems necessary to me. There should
be a wave function for the pair of systems and there should be two quantities A and [A ],
whose simultaneous reality is excluded for general reasons, and for those, the following
should hold:
1. There exists a method of measuring, which for this wave function always gives the
quantity A a sharp value (even if not always the same value), so that I can say, even
without actually performing the measurement: in the case of such a wave function, A
possesses reality — I don’t care about which value it has. (welchen Wert es hat ist mir
Wurst.)
2. Another method of measurement must provide, at least sometimes, a sharp value for the
quantity [A ] (always for the same wave function, of course).
Schrödinger then points out that the maximally entangled wave function used in the example by EPR in their special conclusion does indeed satisfy these conditions (and in fact always provides a sharp value in both measurements), but that this is due to the fact that their example wave function allows multiple biorthogonal decompositions. In fact, this example wave function is exceptionally special in the sense that the absolute values of the coefficients (e^{ipx₀}) are all the same, and this implies that by performing an arbitrary unitary basis transformation on both bases in the Hilbert spaces of the two systems, one can get another biorthogonal decomposition. He then expresses a secondary worry that a conclusion as far-reaching as EPR's should not depend on such an exceptional example. But he also points out that if one does not share this secondary worry, the example is just fine
in demonstrating the argument. His letter ends with an outline of a proof of the
biorthogonal decomposition theorem.
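Schrödinger's observation about the exceptional character of the EPR state can be checked directly: when all Schmidt coefficients are equal, rotating the basis of system 1 by any unitary U, and that of system 2 by the complex conjugate of U, recovers the very same vector, so every such U yields another biorthogonal decomposition. A small numerical sketch (the dimension and the random unitary are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3

# Maximally entangled state |Psi> = (1/sqrt d) sum_i |i>|i>:
# all Schmidt coefficients equal 1/sqrt(d).
psi = np.eye(d).reshape(d * d) / np.sqrt(d)

# An arbitrary unitary U (QR factorization of a random complex matrix).
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
U, _ = np.linalg.qr(M)

# Rotate system 1's basis by U and system 2's by conj(U): the same
# vector is recovered, i.e. the bases {U|i>} and {conj(U)|i>} give
# another biorthogonal decomposition with the same coefficients.
psi2 = sum(np.kron(U[:, i], U.conj()[:, i]) for i in range(d)) / np.sqrt(d)
assert np.allclose(psi, psi2)
```

The identity behind the check is Σᵢ U[m,i] Ū[n,i] = (UU†)[m,n] = δ_{mn}, which is exactly why the equal-coefficient state, and only it, admits this continuum of biorthogonal decompositions.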
In this attempt at reconstructing Schrödinger's thinking, the theme of the distinction between conclusions based on single measurements versus general conclusions that are not conditional on a particular, given, measurement outcome plays center stage, whereas the distinction between biorthogonal versus non-biorthogonal decompositions has dropped to an auxiliary role.
So like Schrödinger, one could argue as follows: suppose we have the wave function (25.1), considered in EPR's general argument. Suppose we make a measurement of quantity A on system 1, whose eigenfunctions are |aᵢ⟩, with eigenvalues aᵢ. If this measurement yields the outcome aₖ we infer that the corresponding wave function for system 2 is |ψₖ⟩. This leads us to predict with 100% probability, before making any measurement on system 2, that the physical quantity Pₖ has the value 1. By the "criterion of reality", this entails that in this case Pₖ is an element of reality. However, this conclusion is conditional on the outcome obtained on system 1, i.e. the value aₖ obtained in a measurement of A on system 1. And, even if system 2 is non-interacting with, or even spatially separated from, system 1, we cannot conclude that the entire variety (for k = 1, 2, …) of wave functions |ψₖ⟩ or associated quantities Pₖ refers to the same physical reality of system 2, precisely because they do not depend only on what measurement is performed on system 1 but also on the outcome obtained.
However, contra Schrödinger, one could continue as follows: suppose the
procedure mentioned in the previous paragraph is in place.
We perform the measurement A on system 1 and obtain one of its possible outcomes aₖ. For each of those specific outcomes one of the quantities Pₖ represents an element of reality for system 2. So, if we do not care what this outcome might be, we can infer that at least one of the quantities P₁, P₂, … represents an element of physical reality.
[P₁ = 1] ∨ [P₂ = 1] ∨ ⋯ ∨ [Pₙ = 1]    (25.7)

and one could regard this description as representing an element of reality for the second system. Now compare this to another measurement on system 1, of the quantity A′. This would lead similarly to a logical disjunction

[Q₁ = 1] ∨ [Q₂ = 1] ∨ ⋯ ∨ [Qₙ = 1]    (25.8)

where Qᵢ = |φᵢ⟩⟨φᵢ|.
But in general, of course, these two disjunctions will be mutually exclusive, and
indeed, there will be no description within quantum mechanics for a state in which
both disjunctions are correct descriptions. So, assuming with EPR that what is real for system 2 cannot depend on the choice of measurement on system 1, we do reach their conclusion that QM is incomplete, even for non-biorthogonal wave functions.
What emerges from this discussion is thus that Schrödinger’s insistence that the
reliance on single measurements in EPR’s general argument is not sufficient to
establish their conclusion is well-founded. However, his insistence that biorthogonal
decompositions are needed for this purpose seems too strong. What I have tried to argue is that, while they of course simplify the argument quite considerably, they are not essential to it.
Einstein’s June 19 1935 reply to Schrödinger’s letter has become rather well-
known, since it has been quoted and discussed by several commentators (Fine 1986,
2017; Howard 1985, 1990; Bacciagaluppi and Crull 2020). In this letter, Einstein
expresses his dissatisfaction with the formulation of the EPR paper, blaming
Podolsky for “smothering it in erudition”. It is also the first time he explicitly puts
forward a ‘Separation Principle’ (Trennungsprinzip), which might be paraphrased
by saying that for two spatially separated systems, each has its own physical state,
independently of what is done to the other system.
Einstein first presents a simple classical illustration: a ball is always found in
either of two boxes, and we describe this situation by saying that the probability of finding it in the first or second box is ½. He asks whether this description could be
considered as a complete specification, and contemplates two answers:
No: A complete specification is: the ball is in the first box (or not). This is how the
characterization of the state should look in the case of a complete description.
Yes: Before I open the lid of a box, the ball is not at all in one of the two boxes. Its being
in one determinate box only comes about because I open the lid. Only that brings
about the statistical character of the world of experience, [. . . ]
We have analogous alternatives when we want to interpret the relation between
quantum mechanics and reality. In the ball-system, the second “spiritist” or Schrödinger
interpretation is, so to say, silly, and only the first “Born” interpretation would be taken
seriously.8 But the talmudic philosopher whistles at “reality” as a sock puppet of naivety
and declares both views as only different in their ways of expression. [. . . ]
In quantum theory, one describes the real state of a system by means of a normalized
function ψ of its coordinates (in configuration space).[. . . ] One would now like to say the
following: ψ is correlated one-to-one with the real state of the real system. [. . . ] If this
works, I speak of a complete description of reality by the theory. But if such an interpretation
fails I call the theoretical description ‘incomplete’. [. . . ]
Let us describe the joint system consisting of subsystems [1] and [2] by a ψ-function [|Ψ⟩]. This description refers to a moment in time when the interaction has practically ceased. This ψ of the joint system can be built up from the normalized eigen-ψ |aᵢ⟩, |bⱼ⟩ that belong to the eigenvalues of the "observables" (respectively sets of commuting observables) [A] and [B] respectively. We can write:

|Ψ⟩ = Σᵢⱼ cᵢⱼ |aᵢ⟩|bⱼ⟩    (25.9)

[and correspondingly:]

|ψ⟩ = Σⱼ cᵢⱼ |bⱼ⟩    (25.10)

This is the ψ-function of the subsystem [2] in the case I have made an [A]-measurement.
Now instead of the eigenfunctions of the observables [A] and [B], I can also decompose in the eigenfunctions [of] [A′] and [B], where [A′] is another set of commuting observables:

|Ψ⟩ = Σᵢⱼ c′ᵢⱼ |a′ᵢ⟩|bⱼ⟩    (25.11)

[and correspondingly:]

|ψ′⟩ = Σⱼ c′ᵢⱼ |bⱼ⟩    (25.12)
8 Here, the references to the "Schrödinger" and "Born" interpretations refer back to the year 1926,
when Schrödinger originally interpreted the wave function as corresponding to the physical state
of the system, whereas Born proposed a statistical interpretation.
Ultimately, it is only essential that |ψ⟩ and |ψ′⟩ are at all different. I hold that this variety is
incompatible with the hypothesis that the ψ-description is correlated in a one-to-one fashion
with physical reality.
This reply is striking, in several ways. First of all, as has been noted by previous
authors, because the argument is significantly different from the EPR argument:
Einstein does not mention the Criterion of Reality, and employs a very different notion
of completeness. The Separability Principle, although appearing here for the first
time, is arguably implicitly present in the EPR paper (Fine 1986, 2017; Howard
1985, 1990). A second reason this reply is striking is that it seems to ignore the
remarks that Schrödinger put forward in his previous letter, and with regard to
Schrödinger’s contention that biorthogonal decompositions are essential, Einstein’s
only remark is “Das ist mir wurst” (I could not care less).
Let me try to spell this out in more detail. The conclusion of Einstein’s argument
is that depending on which measurement we perform on system 1, one should assign
different wave functions to system 2, even though the real physical state of system
2 should —according to the Separation Principle— be the same, regardless of what
we choose to measure on system 1. This conclusion just reiterates the general
conclusion of the EPR paper, which was the focus of Schrödinger’s worries in the
first place. But where, for EPR, this conclusion was only an initial step —and did not
by itself imply the incompleteness of the quantum description of reality—, Einstein
argues that it is all that is ultimately essential for incompleteness.
This contrast between Einstein’s June 19 argument and the EPR argument is
due, of course, to their different construals of the notion of completeness. EPR
provide only a necessary condition for when a theoretical description is complete.
On the other hand, Einstein, in the quote above, describes a necessary and sufficient
condition in terms of a one-to-one correspondence between real states and their
quantum description by means of a wave function.
Lehner (2014) has proposed an apt terminology for dissecting the notions of completeness employed by EPR and in Einstein's letter to Schrödinger. Failures of a one-to-one correspondence between "real states" and a "theoretical description" may come in at least two ways: incompleteness, where several distinct real states correspond to one and the same theoretical description (many-to-one); and overcompleteness, where one and the same real state corresponds to several distinct theoretical descriptions (one-to-many).
Indeed, one main reason why EPR need to continue their argument to establish
their special conclusion is just that they aim to establish incompleteness rather than
overcompleteness. (Another reason is of course their reliance on the Criterion of
Reality— I will come back to that below.)
The surprising point here is that cases of overcompleteness are ubiquitous within physics. Indeed, even in quantum theory, it is well known that one can multiply a wave function ψ by an arbitrary phase factor e^{iα} (α ∈ ℝ) to obtain another wave function ψ′ = e^{iα}ψ, where it is generally assumed that ψ and ψ′ describe the same physical situation. More general examples of overcompleteness can be found in gauge theories. Indeed, one might wonder whether overcompleteness is a worrisome issue at all in theoretical physics. Now of course, in the examples just alluded to, the gauge freedom in the theoretical description of a physical state can usually be expressed in terms of equivalence relations and be expelled by 'dividing out' the corresponding equivalence classes in the theoretical descriptions. No such equivalence relations exist in the case discussed by Einstein in the June 19 letter, and so the case of overcompleteness presented there in his letter is not so simple as in the examples of gauge freedom (cf. Howard 1985).
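The phase-factor example of overcompleteness is easy to verify concretely: ψ and ψ′ = e^{iα}ψ agree on every expectation value and every Born-rule probability. A quick numerical check, with state, phase, and observable chosen at random purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# A normalized state vector and a global-phase copy of it.
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
psi_prime = np.exp(1j * 0.7) * psi

# A random observable (Hermitian matrix).
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
M = (M + M.conj().T) / 2

# Identical expectation values and identical Born-rule probabilities:
# the two vectors describe the same physical situation -- Lehner's
# one-to-many ('overcomplete') case.
assert np.isclose(psi.conj() @ M @ psi, psi_prime.conj() @ M @ psi_prime)
assert np.allclose(np.abs(psi) ** 2, np.abs(psi_prime) ** 2)
```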
But apart from the distinctions between Einstein (on June 19) and EPR, let me
return to Schrödinger. As we have seen, Schrödinger had misgivings precisely about
the EPR general conclusion, and tried to point out to Einstein that the wave function
assigned to system 2 depends not only on what measurement is performed on system
1, but also on its outcome.
Einstein’s letter, however, does not contain any acknowledgement of this issue.
For example, he argues that the wave function (25.10) “is the ψ function of the
subsystem [2] in the case I have made an [A]-measurement”. But it is not, or at the
very least, Einstein’s notation is ambiguous here. Inspection of the right-hand side
of (25.10) reveals that this wave function depends on i, which labels the outcome
obtained in the measurement of A. So a proper formulation, instead of (25.10)
should have read something like

|ψᵢ⟩ = Σⱼ cᵢⱼ |bⱼ⟩    (25.13)

This is the ψ-function of subsystem 2 in case I have made an A-measurement and obtained the outcome aᵢ.
But this notation makes it transparent that there will be different wave functions for
system 2 for different outcomes ak and ar , and, as Schrödinger argued, this need not
be paradoxical. A similar remark holds for the wave function (25.12).
Now, one might argue for a more charitable reading of Einstein's response of June 19 to Schrödinger. One such reading⁹ is that Einstein may have wanted to point out that if we perform measurement A on system 1 the wave function of system 2 would be one of the form

|ψᵢ⟩ = Σⱼ cᵢⱼ |bⱼ⟩    (25.14)

or, more precisely, that it would be one of the wave functions in the set

{|ψ₁⟩, …, |ψₙ⟩}    (25.15)

whereas, if we perform measurement A′ instead, it would be one of the wave functions in the set

{|ψ′₁⟩, …, |ψ′ₙ⟩}    (25.16)
and in general, these two sets will be disjoint. So, if there were a wave function to
describe the real physical state of system 2, independently of what measurement we
perform on system 1, it would have to be a member of two disjoint sets. In other
words, there would not even be any wave function to assign to system 2. And so, we
would recover Einstein's conclusion, i.e. a failure of a one-to-one correspondence between physical states and wave functions, not in terms of Lehner's incompleteness (many-to-one) nor overcompleteness (one-to-many) but rather as a one-to-none correspondence. Be that as it may, it would of course still be a failure of one-to-one correspondence.
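This 'one-to-none' reading can also be illustrated numerically: for a generic entangled state, the set of conditional wave functions for system 2 generated by an A-measurement on system 1 and the set generated by an A′-measurement are disjoint, so no single wave function for system 2 belongs to both. A sketch under arbitrary illustrative choices (here A is the computational basis and A′ a Fourier basis; neither is taken from the text):

```python
import numpy as np

rng = np.random.default_rng(3)
d = 3

# Generic pure state of a pair of d-level systems, as coefficient matrix C.
C = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
C /= np.linalg.norm(C)

def conditional_states(C, basis):
    """Normalized states of system 2, conditional on each outcome of a
    measurement of the given orthonormal basis on system 1."""
    states = []
    for i in range(d):
        v = basis[:, i].conj() @ C   # <a_i| applied to system 1
        states.append(v / np.linalg.norm(v))
    return states

A_basis = np.eye(d)                                   # measurement A
F = np.array([[np.exp(2j * np.pi * m * n / d) for n in range(d)]
              for m in range(d)]) / np.sqrt(d)        # measurement A'

psis = conditional_states(C, A_basis)
psis_prime = conditional_states(C, F)

# For a generic C the two sets are disjoint: no state in one coincides
# (up to phase) with any state in the other.
overlaps = [abs(np.vdot(p, q)) for p in psis for q in psis_prime]
assert all(o < 1 - 1e-6 for o in overlaps)
```

An overlap of modulus 1 would mean two conditional states coincide up to a phase; for a generic state none do, so a hypothetical outcome-independent wave function for system 2 would have to lie in two disjoint sets, i.e. it does not exist.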
However, this still leaves another vulnerability in Einstein’s argument to be
discussed. This vulnerability is that the argument will only convince those who
share Einstein’s premise that the wave function is to be interpreted as a direct
description of the real state of a system. In the June 19 letter, he ascribes this view
to Schrödinger, but Schrödinger sets him right in a response of August 19:
I have long passed the stage when I thought that one could see the ψ-function as a direct
description of reality.
Indeed, there are numerous passages in Schrödinger’s post-1927 work that indicate
he came to acknowledge that the wave function contained epistemic factors,
pertaining to our knowledge about the system, as well as objective factors pertaining
to its physical state. But perhaps worse than Schrödinger’s rejection of this premise
is that adherents of the Copenhagen interpretation, and in particular the ‘talmudic
philosopher’ (Bohr), i.e. the main target of the argument, will also reject it.
In this regard, the EPR paper itself fares much better. The reason is that it does
not attempt to establish a link between physical reality and a wave function, but
only between elements of physical reality and measurement outcomes of physical
quantities, at least when they are fully predictable. How does this change the
argument? In the case of their special conclusion, one could argue as follows:
assume the Separability Principle. But depending on what quantity we measure on
system 1 (either P or Q), we can infer with certainty the values of the corresponding
measurements on system 2. They should therefore represent elements of physical
reality, indeed simultaneously so, because what is real on system 2 should, by
Separability, not depend on what measurement we perform on system 1. But there
is no quantum description of a state with definite real values for both position and
momentum for system 2. Therefore the quantum description of physical reality is
incomplete (in the sense of a one-to-none correspondence).
The reason why the EPR special conclusion is more robust against objections,
compared to Einstein’s June 19 argument, is precisely because it avoids the premise
that wave functions directly describe physical reality. But this comes at a price:
the EPR special conclusion employs the maximally entangled EPR state with its multiple biorthogonal decompositions, i.e. the assumption that Einstein said he could not care less about. However, as I have argued in the previous section, this conclusion can be generalized to a case without such biorthogonal decompositions.
Einstein, in his June 19 letter, ascribes the view that the wave function presents a
complete description of physical reality to Schrödinger. This view is indeed one
that Schrödinger used in his original works on wave mechanics in the first half of
1926 (Joas and Lehner 2009). However, he abandoned this view in several letters
written in November 1926, precisely because of the phenomenon we would now
call entanglement.
Indeed, in his (1935b), he writes that the wave function is only a “representative”
of the state of the system, and this under the proviso that he is describing the current view (Lehrmeinung) of quantum mechanics. More specifically, he characterizes a
wave function as a mere catalogue of expectation values, and compares it to a travel
guide (Baedeker). All this underlines that Schrödinger did not commit to a view that
the wave function was in one-to-one correspondence with the real state of a system.
Einstein persistently and consistently saw the EPR problem as an objection to the
claim that the quantum description of physical systems by means of a wave function
was complete, and also persistently argued that this problem would disappear if
only one would take the view of the wave function as a statistical ensemble (thus as describing an ensemble of systems rather than a single individual system).
Schrödinger, by contrast, did not harbour any such hopes. For Schrödinger,
the argument pointed to a deeper problem, that would not be resolved by merely
adopting an ensemble interpretation. Thus, in response to Einstein on August 19, he
points out that Einstein's solution would not work. He provides an argument, using the example of the operator P²/a² + a²Q², which had been presented to Schrödinger for a similar purpose by Pauli in a letter of July 9 (von Meyenn et al. 1985). His
(1935b) contains a similar argument. Einstein’s response (on September 4) to this
argument is
If I claimed that I have understood your last letter, that would be a fat lie (von Meyenn 2011,
p. 569).
In reply, (October 4) Schrödinger apologizes and uses the example of the operator
for angular momentum, also suggested to him by Pauli on July 9, and also discussed
in (1935b). Both arguments are actually versions of Von Neumann’s argument
against a hidden variables interpretation.
But if Schrödinger did not believe in Einstein’s favorite way to resolve the
paradox, what alternative did he have? It seems to me that Schrödinger did not
see a clear way out. For example, in his letter to Sommerfeld (December 1931), Schrödinger concluded:
One can hardly help oneself in any other way than by the following ‘emergency decree’: in
quantum mechanics it is forbidden to make statements about what is ‘real’, about the object,
such statements deal only with the relation object–subject, and apparently in a much more
restrictive way than in any other description of nature. (von Meyenn 2011, Vol. I, p. 490)
This suggests that Schrödinger saw the only rescue in a non-realist interpretation of
quantum mechanics. In the Übungen notebook, his conclusion was a rejection of the
Copenhagen interpretation:
Therefore: This interpretational scheme does not work. One has to look for another one.
But in his June 7 letter to Einstein, he put the blame on the non-relativistic nature of QM:
The lesson (Der Vers) that I have drawn for myself until now about this whole business
is this. We do not possess a quantum mechanics compatible with relativity theory, that
is, amongst others, a theory that takes account of the finite speed of propagation of all
interactions. [. . . ] When we want to separate two systems, their interaction will, long before
it vanishes, fail to approximate the absolute Coulomb or similar force. And that is where
our understanding ends. The process of separation cannot at all be grasped in the orthodox
scheme.
Similarly, his final remarks in (1935b) point out the non-relativistic nature of
QM, and the obstacles towards a fully relativistic theory, and conjecture that non-
relativistic QM might merely be a convenient calculational artifice. For Schrödinger,
the conclusions to be drawn from the EPR argument were therefore not so clear-cut.
On some occasions he would be satisfied by just claiming that it showed
an inadequacy of the Copenhagen interpretation of quantum mechanics, on other
occasions he hoped that a future relativistic version of quantum theory would make
the problem disappear, but without having concrete proposals about how this would
work.
10 As von Meyenn (p. 548) notes, this issue concerns the fact that under the Nazi regime, the
mention of Einstein’s name in German scientific literature had been forbidden.
562 J. Uffink
The letter continues by describing some of the themes that did indeed end up in the
published paper (Schrödinger 1935b).
On July 25, Schrödinger wrote to Berliner that his manuscript was finished in
handwriting, and that the first citation of the Physical Review paper would only
appear after three quarters of his manuscript, at the beginning of the final quarter.
He mentions that it would take him some more time to produce a type-written version.
He writes to Berliner again on August 11, saying he did not reply earlier:
because the manuscript, which you so kindly agreed to publish, was doubly burning under
my fingers. I have now put it all together, and will send it off tomorrow together with this
letter.
But Fine’s argument differs in detail from that of Moore and Gribbin. He suggests that
section 4 and also section 5 (containing the Cat) of (1935b) were not part of the
paper sent to Berliner on August 12, but later additions:
The conclusion seems inescapable that section 4 was put together only after the August 19
letter [by Schrödinger], and was not part of the original manuscript [Schrödinger] sent to
Berliner. (p. 81)
25 Schrödinger’s Reaction to the EPR Paper 563
But if section 4 was only written after August 19, then most likely section 5 is a later
addition as well. (p. 82)
There are several problems with this hypothesis. I will deal with its first claim
(i.e. that section 4 of Schrödinger’s (1935b) could only have been written after
Schrödinger’s August 19 letter) below. Let me first mention problems with the
conjecture that section 5, containing the Cat Paradox, was composed later as
well. The first problem is that it would imply that Schrödinger was literally lying
to Einstein when he wrote on August 19, in response to Einstein’s exploding
gunpowder example, that he had already presented a quite similar example in the
paper he just sent off a week earlier, and described the cat example. This is all the
more improbable given that Schrödinger seems to be generally generous in issues
of priority, and did not even emphasize in his June 7 letter that the EPR paper
“about that which we discussed so much in Berlin” is in fact indebted to his own
formulation of the EPR problem in 1931.
Of course, Fine’s hypothesis could be tested, e.g. if a draft version of the Natur-
wissenschaften paper surfaces. I could not find any in the Schrödinger Nachlass
(but who knows what might emerge from the archives of Die Naturwissenschaften, if
they survived the war). However, one can rely on what is known from Schrödinger’s
correspondence in this period with Berliner.
Schrödinger wrote to Berliner on July 25 that he had finished a handwritten draft
of the paper and would still need time to type it. He reported a word count
of 10230 words, and estimated that this would fill 13.3 to 14 pages of Die
Naturwissenschaften, not counting the paragraph headings (von Meyenn 2011, p.
555). He suggested that this “extraordinary length” probably required it to be cut
into two or three installments (advising against the latter option).
On July 29, Berliner replied that he was looking forward to the submission, and
that he would happily allot all the space Schrödinger needed, and promised that it
would appear in two subsequent issues of the journal.
On August 11, Schrödinger wrote to Berliner that his typescript was finished and
to be mailed the next day. On August 14, Berliner responded that he had suddenly
been removed from his position as editor of Die Naturwissenschaften, but implored
Schrödinger that his paper should still be published in that journal, and that he had
proposed Walther Matthée as his successor.11
On August 19, Schrödinger wrote Einstein, as mentioned before, and told him
also that his first impulse had been to withdraw the paper, but that he didn’t because
Berliner himself had insisted on its publication.
On August 24, he received a letter from Matthée, telling Schrödinger that he was
the new editor of Die Naturwissenschaften, thanking him for the manuscript,
and mentioning that the proofs of the paper were enclosed. Matthée informed him that the
11 Berliner was removed from the journal because he was Jewish (von Meyenn 2011, Vol. 2,
pp. 546–7).
manuscript was to be split into three parts, to appear in consecutive issues, and
asked Schrödinger to indicate in these proofs where the cuts were to be made.12
Schrödinger must have replied to Matthée soon afterwards, and there is a draft of
this letter in the selection by Von Meyenn, but erroneously appended to his letter
to Berliner from July 25 (von Meyenn 2011, Vol. 2, p. 557). Here, Schrödinger
mentions that he had just sent a telegram with the message “Sorry, tripartition
impossible” and maintains that the manuscript should only be split into two parts,
or otherwise returned to him. He mentions that this division could well take place
after section 9, which is “very exactly half of the manuscript”. He ends, however, on
a more conciliatory note, saying that he trusts Matthée’s judgment, who should not
feel bound by an earlier promise, and that Schrödinger would feel the same way if
Berliner had told him that the promise of a bipartition was not possible. There is no
mention at all of a revised version with later additions to the paper at this stage, and
clearly there would have been very little time to do so, if this revision is supposed
to have occurred after August 19.
In fact, the paper was published in three installments in Schrödinger (1935b). In
total, they comprise about 12270 words (not counting footnotes, references, figure
captions, tables of contents and section titles)13 and span about 15.5 pages. This
implies that Schrödinger did expand his manuscript by about 2000 extra words,
or about 2 pages, after July 25. This could actually fit Fine’s hypothesis that
its sections 4 and 5 (which together comprise just less than 2 pages, and about
1600 words) could be later additions — but not necessarily additions from after
August 19.
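The expansion estimate above can be checked with simple arithmetic; a quick sketch using only the word and page counts quoted in this section (the page conversion is approximate, since footnotes and headings are excluded from the word counts):

```python
# Counts quoted above: Schrödinger's July 25 draft vs. the published installments
draft_words = 10230      # his own count in the letter to Berliner
published_words = 12270  # total of the three published installments
published_pages = 15.5

extra_words = published_words - draft_words
extra_pages = extra_words / (published_words / published_pages)
print(extra_words, round(extra_pages, 1))
```

At the published density of roughly 790 words per page, the roughly 2000 extra words come to between two and three pages, in line with the rough estimate above.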
The second problem for Fine’s hypothesis concerns Schrödinger’s estimate on
July 25 that the first mention of Einstein’s name would occur after three quarters of
the paper, as well as his estimate, presumably made a few days after August 24, that the
end of section 9 would “very exactly” divide the manuscript into two equal halves.
Both these estimates are actually quite accurate for the published version as well. This
suggests that, more likely, he made additions while typing his draft of July 25, more
or less equally spaced throughout the manuscript. In any case, if Schrödinger added
two whole new sections before section 9 in Fine’s hypothetical revision, one must
surmise that he also made additions of the same size to the second half of his
manuscript, in order to keep the end of section 9 at exactly half of the paper.
But that clearly did not happen, because that would take the size of the manuscript
beyond what was actually published.
Given this evidence, it seems quite unlikely that Schrödinger only thought of
the cat paradox after he read Einstein’s exploding device example. The simpler
and more straightforward explanation, as stated earlier, is that the two authors
independently thought of very similar examples at roughly the same time.
Next, let me discuss Fine’s argument for the “inescapable” claim that section 4
of Schrödinger (1935b) is a later addition to the manuscript Schrödinger sent to
Berliner on August 12. This claim is based on the fact that Einstein writes, in a
letter to Schrödinger on August 9, that he believes a mere ensemble interpretation
of the wave function (which would, of course, concede that the description by a
wave function is incomplete, in Einstein’s sense as well as the EPR sense of the
term) would suffice to solve the paradox. Schrödinger responds on August 19 that
this argument will not work. He proceeds to provide a counter-argument, using the
Hamiltonian of the harmonic oscillator as an example. Section 4 of Schrödinger
(1935b) contains a very similar argument.
Fine assumes that Schrödinger only could have developed this argument in
response to Einstein’s letter, and thus after August 19. However, it seems to me
that this assumption overlooks Schrödinger’s wider correspondence in this period.
The very idea that an interpretation of the wave function as incomplete, i.e., as
merely describing a statistical, inhomogeneous ensemble, is irreconcilable with the
quantum predictions was mentioned to Schrödinger by Pauli in a letter of July 9,
1935, specifically pointing out that the discrete spectrum that quantum mechanics
requires for operators like the Hamiltonian $P^2 + \omega^2 Q^2$ of a harmonic oscillator
or the angular momentum (in the 3-dimensional case) $P_x Q_y - P_y Q_x$ would be
incompatible with this view. Schrödinger apparently took Pauli’s remarks to heart.
Both the harmonic oscillator Hamiltonian and the angular momentum examples
are discussed in Schrödinger (1935b), with his own twist, adding an arbitrary
multiplicative constant to the last term of the harmonic oscillator Hamiltonian.
Moreover, he presents this argument in a version adapted to the EPR case in a
letter to von Laue (July 25, 1935). Again, there seems to me to be little ground for
the hypothesis that Schrödinger could only have composed this argument after he
responded to Einstein on August 19, and even less for the claim that, therefore,
section 4 of (1935b) must have been a later addition to the manuscript he submitted
on August 12 of that year.
Indeed, it seems to me that Fine’s analysis of Schrödinger’s Naturwissenschaften
paper, vis-à-vis his correspondence with Einstein in 1935, paints him as a passive
receiver of Einstein’s ideas, and as composing his own paper (Schrödinger 1935b)
merely in response to those ideas. I believe that we should redress this
picture. Schrödinger knew about the essential aspects of the EPR argument 3½ years
before it was published, and might well have influenced Einstein in his transition
from the photon box argument to the clearer and simpler version that is EPR. There
was no need for Schrödinger to learn about EPR from Einstein. Instead, as I have
argued, he had a lesson to teach Einstein, although he failed to get it across.
Acknowledgements I am indebted to The Max Planck Institute for History of Science in Berlin
for enabling this research, as well as the Vossius Institute in Amsterdam and the Descartes Centre
in Utrecht for providing the opportunity to finish it. I also thank Christoph Lehner for many
valuable discussions, and deciphering Schrödinger’s handwriting, and Guido Bacciagaluppi for
many helpful and insightful comments. I am also grateful to Alexander Zartl for assistance in
browsing through the Schrödinger Nachlass in the Zentralbibliothek für Physik at the University
of Vienna.
References
Bacciagaluppi, G., & Crull E. (2020). The Einstein paradox: The debate on nonlocality and
incompleteness in 1935. Cambridge: Cambridge University Press.
Einstein, A., Podolsky, B., & Rosen, N. (1935). Can quantum mechanical description of reality be
considered complete? Physical Review, 47, 777–780.
Fine, A. (1986). The Shaky game: Einstein, realism and the quantum theory. Chicago: University
of Chicago Press.
Fine, A. (2017). The Einstein-Podolsky-Rosen argument in quantum theory. In E. N. Zalta (Ed) The
Stanford encyclopedia of philosophy (Winter 2017 Edition). https://plato.stanford.edu/archives/
win2017/entries/qt-epr/.
Gilder, L. (2008). The age of entanglement. New York: Knopf.
Gribbin, J. (2012). Erwin Schrödinger and the quantum revolution. London: Bantam Press.
Howard, D. (1985). Einstein on locality and separability. Studies in History and Philosophy of
Science, 16, 171–201.
Howard, D. (1990). Nicht sein kann was nicht sein darf, or the Prehistory of EPR, 1909–1935:
Einstein’s early worries about the Quantum Mechanics of Composite Systems. In A. I. Miller
(Ed.) Sixty-two years of uncertainty: Historical, philosophical and physical inquiries in the
foundations of quantum mechanics (NATO ASI series, Vol. 226, pp. 61–111). New York:
Plenum Press.
Joas, C. & Lehner, C. (2009). The classical roots of wave mechanics: Schrödinger’s transfor-
mations of the optical-mechanical analogy. Studies in the History and Philosophy of Modern
Physics, 40, 338–351.
Kaiser, D. (2016). How Einstein and Schrödinger conspired to kill a cat. Nautilus, Issue 041.
Lehner, C. (2014). Einstein’s realism and his critique of quantum mechanics. In M. Janssen & C.
Lehner (Eds) The Cambridge companion to Einstein (pp. 306–353). Cambridge: Cambridge
University Press.
Moore, W. (1992). Schrödinger, life and thought. Cambridge: Cambridge University Press.
Ryckman, T. (2017). Einstein. Milton Park: Routledge.
Schrödinger, E. (1935a). Discussion of probability relations between separated systems. Mathe-
matical Proceedings of the Cambridge Philosophical Society, 31, 555–563.
Schrödinger, E. (1935b). Die gegenwärtige Situation in der Quantenmechanik. Die Naturwis-
senschaften, 23, 807–812, 823–828, 844–848.
von Meyenn, K. (2011). Eine entdeckung von ganz außerordentliche Tragweite; Schrödinger’s
Briefwechsel zur Wellenmechanik und zum Katzenparadoxon (Vols. 1 and 2). Heidelberg:
Springer.
von Meyenn, K., Hermann, A., & Weiskopf, V. F. (1985). Wolfgang Pauli: Wissenschaftlicher
Briefwechsel mit Bohr, Einstein, Heisenberg u.a. Vol II: 1930–1939. Heidelberg: Springer.
Chapter 26
Derivations of the Born Rule
Lev Vaidman
26.1 Introduction
My attempt to derive the Born rule appeared in the first memorial book for Itamar
Pitowsky (Vaidman 2012). I can only guess Itamar’s view on my derivation from
reading his papers (Pitowsky 1989, 2003, 2006; Hemmo and Pitowsky 2007).
It seems that we agree which quantum features are important, although our
conclusions might be different. In this paper I provide an overview of various
derivations of the Born rule. In numerous papers on the subject I find in-depth
analyses of particular approaches; here I try to consider a wider context that
should clarify the status of the derivation of the Born rule in quantum theory. I
hope that it will trigger more general analyses which will finally lead to the consensus
needed for putting the foundations of quantum theory on solid ground.
The Born rule was born at the birth of quantum mechanics. It plays a crucial
role in explaining experimental results which could not be explained by classical
physics. The Born rule is known as a statement about the probability of an outcome
of a quantum measurement. This is an operational meaning which corresponds to
L. Vaidman ()
Raymond and Beverly Sackler School of Physics and Astronomy, Tel-Aviv University,
Tel-Aviv, Israel
e-mail: vaidman@post.tau.ac.il
$$\bar{A} \equiv \frac{1}{N}\sum_{n=1}^{N} A_n,$$

$$\bar{A} \bigotimes_{n=1}^{N} |\Psi\rangle_n = \langle\Psi|A|\Psi\rangle \bigotimes_{n=1}^{N} |\Psi\rangle_n + \frac{\Delta A}{N} \sum_{k=1}^{N} |\Psi_\perp\rangle_k \bigotimes_{n\neq k} |\Psi\rangle_n. \quad (26.3)$$
The amplitude of the first term on the right-hand side of the equation is of order
1, while the amplitude of the second term (the sum) is proportional to $1/\sqrt{N}$, so in
the limit as $N$ tends to infinity the second term can be neglected and the product
state $\bigotimes_{n=1}^{N} |\Psi\rangle_n$ can be considered an eigenstate of the variable $\bar{A}$ with eigenvalue
$\langle\Psi|A|\Psi\rangle$.
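The $1/\sqrt{N}$ suppression invoked here can be illustrated numerically. A minimal sketch (not the derivation itself), assuming a spin-1/2 system with $A = \sigma_z$; the state and numbers are illustrative choices of mine:

```python
import numpy as np

# Spin-1/2 illustration: A = sigma_z, |Psi> = sqrt(0.4)|up> + sqrt(0.6)|down>
A = np.diag([1.0, -1.0])
psi = np.array([np.sqrt(0.4), np.sqrt(0.6)])

mean_A = psi @ A @ psi                            # <Psi|A|Psi> = 0.4 - 0.6 = -0.2
delta_A = np.sqrt(psi @ A @ A @ psi - mean_A**2)  # uncertainty Delta A in |Psi>

# The correction term of (26.3) consists of N orthogonal terms of amplitude
# Delta A / N, so its norm is Delta A / sqrt(N) and vanishes as N grows:
for N in [10, 1000, 100000]:
    print(N, delta_A / np.sqrt(N))
```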
Now consider the measurement of $\bar{A}$ followed by measurements of $A$ of each
of the individual systems. Let $N_i$ be the number of outcomes $A = a_i$. The probability
of outcome $a_i$ is defined as the limit $p_i \equiv \lim_{N\to\infty} N_i/N$. To derive the Born rule
we consider the shift of the pointer of the measuring device measuring $\bar{A}$, in two
ways. First, since in the limit the state is an eigenstate with eigenvalue $\langle\Psi|A|\Psi\rangle$,
the pointer is shifted by this value. Second, consider the evolution backward in time,
given that we have the results of the individual measurements of the variable $A$ of each
system. Then the shift has to be $\sum_i a_i N_i/N$. In the limit we obtain
$\langle\Psi|A|\Psi\rangle = \sum_i |\alpha_i|^2 a_i = \sum_i a_i p_i$. This equation can be generally true only if
$p_i = |\alpha_i|^2$ for all $i$.
The legitimacy of going to the limit N → ∞ in the earlier proofs was questioned
in Squires (1990), Buniy et al. (2006) and Aharonov’s approach was analyzed in
Finkelstein (2003). I am also skeptical about the possibility that arguments relying on
the existence of infinities can shed light on Nature. Surely, infinitesimal analysis
is very helpful, but infinities lead to numerous very peculiar, sophisticated features
which we do not observe. I see no need for infinities to explain our experience.
Very large numbers can mimic everything and are infinitely simpler than infinity.
The human eye cannot distinguish between 24 pictures per second and continuous
motion, but infinite information is required to describe the latter. There is no need
for infinities to explain all that we see around us.
Another reason for my skepticism about the possibility of understanding Nature by
neglecting vanishing terms in the infinite limit is the following example, in which
these terms are crucial for providing a common-sense explanation. In the modification
of the interaction-free measurement (Elitzur and Vaidman 1993) based on the Zeno
effect (Kwiat et al. 1995) we get information about the presence of an object without
being near it. The probability of success can be made as close to 1 as we wish
by changing the parameters. Together with this, there is an increasing number of
moments of time at which the particle can be absorbed by the object with decreasing
probability of each absorption. In the limit, the sum of the probabilities of absorption
at all different moments goes to zero, but without these cases the success of the
interaction-free measurement seems to contradict common sense. These are the
cases in which there is an interaction. Taking the limit in proving the Born rule
is analogous to neglecting these cases.
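The limit described here can be made concrete with a short sketch of the Zeno scheme, assuming the standard per-cycle rotation of $\pi/2N$; the function name is mine:

```python
import numpy as np

def zeno_ifm(N):
    """Zeno-type interaction-free measurement with N cycles.

    Each cycle rotates the photon polarization by pi/(2N); with the object
    present, the rotated component is absorbed with probability sin^2(pi/(2N)).
    """
    theta = np.pi / (2 * N)
    p_success = np.cos(theta) ** (2 * N)  # object detected without any absorption
    p_absorbed = 1.0 - p_success          # total absorption probability over all cycles
    return p_success, p_absorbed

# The success probability approaches 1 while the summed absorption
# probability falls off roughly as pi^2/(4N):
for N in [10, 100, 10000]:
    print(N, *zeno_ifm(N))
```

Taking the limit keeps only the success branch, yet it is exactly the vanishing absorption branches that carry the interaction.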
The main reason why I think that this approach cannot be the solution is that I
do not see what the additional assumption is from which we derive the Born rule.
Consider a counterexample. Instead of the Born rule, suppose Nature has an “Equal
rule”: every outcome $a_i$ of a measurement of $A$ has the same probability given that it
is possible, i.e., $\alpha_i \neq 0$. Of course, this model contradicts experimental results,
but it does not contradict the unitary evolution part of the formalism of quantum
mechanics. I do not see how making the number of experiments infinite can rule
out the Equal rule. Note that an additional assumption ruling out this model is hinted at in
Aharonov and Reznik (2002) “the results of physical experiments are stable against
small perturbations”. A very small change of the amplitude can make a finite change
in the probability in the proposed model. (This type of continuity assumption is
present in some other approaches too.)
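The discontinuity that such a perturbation-stability assumption forbids is easy to exhibit; a toy sketch (function names mine) comparing the hypothetical Equal rule with the Born rule as one amplitude goes to zero:

```python
import numpy as np

def born_rule(amps):
    p = np.abs(amps) ** 2
    return p / p.sum()

def equal_rule(amps):
    # hypothetical "Equal rule": every outcome with nonzero amplitude is equiprobable
    possible = (np.abs(amps) > 0).astype(float)
    return possible / possible.sum()

# An arbitrarily small amplitude changes the Equal-rule probabilities by a
# finite jump (1/2 -> 1), while the Born probabilities change continuously:
for eps in [0.1, 1e-8, 0.0]:
    amps = np.array([np.sqrt(1 - eps**2), eps])
    print(eps, equal_rule(amps), born_rule(amps))
```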
provides the unique outcome, so it seems hopeless to look for an explanation of the
Born rule based on the Schrödinger equation.
What might be possible are consistency arguments. If we accept the Hilbert
space structure of quantum mechanics and we accept that there is a probability for
an outcome of a quantum measurement, what might this probability measure be?
Itamar Pitowsky suggested taking it as the basis and showed how Gleason’s theorem
(Gleason 1957) (which has its own assumptions) leads to the Born rule (Pitowsky
1998). He was aware of “two conceptual assumptions, or perhaps dogmas. The
first is J. S. Bell’s dictum that the concept of measurement should not be taken
as fundamental, but should rather be defined in terms of more basic processes.
The second assumption is that the quantum state is a real physical entity, and that
denying its reality turns quantum theory into a mere instrument for predictions”
(Bub and Pitowsky 2010). In what followed, he recognized the problem as I do:
“This last assumption runs very quickly into the measurement problem. Hence,
one is forced either to adopt an essentially non-relativistic alternative to quantum
mechanics (e.g. Bohm without collapse, GRW with it); or to adopt the baroque
many worlds interpretation which has no collapse and assumes that all measurement
outcomes are realized.” The difference between us is that he viewed “the baroque
many worlds interpretation” as unacceptable (Hemmo and Pitowsky 2007), while I
learned to live with it (Vaidman 2002).
Maybe more importantly, we disagree about the first dogma. I am not ready to
accept “measurement” as a primitive. Physics has to explain all our experiences,
from observing results of quantum measurements to observing the color of the sun
and the sky at sunset. I do believe in the ontology of the wave function (Vaidman
2016, 2019) and I am looking for a direct correspondence between the wave function
and my experience considering quantum observables only as tools for helping to find
this correspondence. I avoid attaching ontological meaning to the values of these
observables. It does not mean that I cannot discuss the Born rule. The measurement
situation is a well defined procedure and our experiences of this procedure (results
of measurements) have to be explained.
The basic requirement of the measurement procedure is that if the initial state
is an eigenstate of the measured variable, it should provide the corresponding
eigenvalue with certainty. Any procedure fulfilling this property is a legitimate
measuring procedure. The Born rule states that the probability it provides should
be correct for any legitimate procedure, and this is a part of what has to be proved,
but let us assume that the fact that all legitimate procedures provide the same
probabilities is given. I will then construct a particular measurement procedure
(which fulfills the property of probability 1 for eigenstates) which will allow me
to explain the probability formula of the Born rule.
Consider a measurement of variable A performed on a system prepared in the
state (26.1). The measurement procedure has to include coupling to the measuring
device and the amplification part in which the result is written in numerous quantum
systems providing a robust record. Until this happens, there is no point in discussing
the probability, since the outcomes have not yet been created. So the measurement
process is:
$$|\Psi\rangle \prod_m |r\rangle^{MD}_m \rightarrow \sum_i \alpha_i\, |a_i\rangle \prod_{m\in S_i} |{``i"}\rangle^{MD}_m \prod_{m\notin S_i} |r\rangle^{MD}_m, \qquad \langle r|{``i"}\rangle^{MD}_m = 0 \;\;\forall m, \quad (26.4)$$
where the “ready” states $|r\rangle^{MD}_m$ of the numerous parts $m$ of the measuring device,
$m \in S_i$, are changed to macroscopically different (and thus orthogonal) states
$|{``i"}\rangle^{MD}_m$, in correspondence with the eigenvalue $a_i$. For all possible outcomes $a_i$, the
set $S_i$ of subsystems of the measuring device which change their states to orthogonal
states has to be large enough.
interpretations. In collapse interpretations, at some stage, the state collapses to one
term in the sum.
This schematic description is not too far from reality. Such a situation appears
in a Stern-Gerlach experiment in which the atom, the spin component of which is
measured, leaves a mark on a screen by exciting numerous atoms of the screen.
But I want to consider a modified measurement procedure. Instead of a screen in
which the hitting atom excites many atoms, we put arrays of single-atom detectors
in the places corresponding to particular outcomes. The arrays cover the areas of
the quantum uncertainty of the hitting atom. The arrays differ in size: array $i$ has
a number $N_i$ of single-atom detectors, which we arrange according to the
equation $N_i = |\alpha_i|^2 N$. (We assume that we know the initial state of the system.)
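The arrangement $N_i = |\alpha_i|^2 N$ can be sketched directly for the two-outcome example of Fig. 26.1; a minimal illustration (variable names mine) of how counting equal-amplitude branches reproduces the Born weights:

```python
import numpy as np
from math import isclose

# State sqrt(0.4)|up> + sqrt(0.6)|down>, measured with N = 5 single-atom detectors
amplitudes = {"up": np.sqrt(0.4), "down": np.sqrt(0.6)}
N = 5

# Allot N_i = |alpha_i|^2 * N detectors to outcome a_i
detectors = {a: round(abs(amp) ** 2 * N) for a, amp in amplitudes.items()}
print(detectors)  # {'up': 2, 'down': 3}

# Each firing detector defines one equal-amplitude branch, so counting
# branches gives the Born probabilities:
probs = {a: n / N for a, n in detectors.items()}
assert isclose(probs["up"], 0.4) and isclose(probs["down"], 0.6)
```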
In the first stage of the modified measuring procedure an entangled state of the
atom and sensors of the single-atom detectors is created:
$$|\Psi\rangle \prod_n |r\rangle^{sen}_n \rightarrow \sum_i \frac{\alpha_i}{\sqrt{N_i}}\, |a_i\rangle \sum_{k_i} |{*}\rangle^{sen}_{k_i} \prod_{n\neq k_i} |r\rangle^{sen}_n, \quad (26.5)$$
where $|r\rangle^{sen}_n$ represents an unexcited state of the sensor with label $n$ running
over the sensors of all the arrays of single-atom detectors, and $N$ is the total number of
detectors. For each eigenvalue $a_i$ there is one array of $N_i$ detectors with sensors
in a superposition of entangled states in which one of the sensors, $k_i$, is in an excited
state $|{*}\rangle^{sen}_{k_i}$. At this stage the measurement has not yet taken place. The number of
sensors with a changed quantum state might be large, but no “branches” with many
systems in excited states have been created. We also need the amplification process,
which consists of the excitation of a large number of subsystems of the individual detectors.
In the modified measurement, instead of a multiple recording of an event specified
by the detection of $a_i$, we record the activation of every sensor $k_i$ by the excitation of a
large (not necessarily the same) number of quantum subsystems $m$ belonging to
the set $S_{ik}$. Including these subsystems in our description, the measurement
process is:
$$|\Psi\rangle \prod_n |r\rangle^{sen}_n \prod_m |r\rangle^{MD}_m \rightarrow \frac{1}{\sqrt{N}} \sum_i |a_i\rangle \sum_{k_i} |{*}\rangle^{sen}_{k_i} \prod_{n\neq k_i} |r\rangle^{sen}_n \prod_{m\in S_{ik}} |{``i"}\rangle^{MD}_m \prod_{m\notin S_{ik}} |r\rangle^{MD}_m. \quad (26.6)$$
Here we also redefined the states $|{*}\rangle^{sen}_{k_i}$ to absorb the phase of $\alpha_i$, to see explicitly
that all terms in the superposition have the same amplitude. Every term in the
(26.7)

where the abbreviated factor stands for

$$\prod_{n\neq k_i} |r\rangle^{sen}_n \prod_{m\in S_i} |{``i"}\rangle^{MD}_m \prod_{m\notin S_i} |r\rangle^{MD}_m.$$
Fig. 26.1 Modified Stern-Gerlach experiment specially tailored for the state $\sqrt{0.4}\,|{\uparrow}\rangle + \sqrt{0.6}\,|{\downarrow}\rangle$.
There are two detectors in the location corresponding to the result ‘up’ and three detectors in the
location corresponding to the result ‘down’
experimental procedures provide the same probabilities for outcomes. The modified
procedure has a very natural combinatorial counting meaning of probability. It can
be applied to the collapse interpretations when we count possible outcomes and in
the MWI where we count worlds. The objection that the number of worlds is not a
well defined concept (Wallace 2010) is answered when we put weights (measures
of existence) on the worlds (Vaidman 1998; Greaves 2004; Groisman et al. 2013).
In various derivations of the Born rule, the statement that equal amplitudes lead to
equal probabilities relies on symmetry arguments. The starting point is the simplest
(sometimes named pivotal) case
$$|\Psi\rangle = \frac{1}{\sqrt{2}}\,(|a_1\rangle + |a_2\rangle). \quad (26.8)$$
The pioneer in attempting to solve this problem was Deutsch (1999) whose work
was followed by extensive development by Wallace (2007), a derivation by Zurek
(2005), and some other attempts such as Sebens and Carroll (2016) and also my
contribution with McQueen (Vaidman 2012; McQueen and Vaidman 2018). The
key element of these derivations is the symmetry under permutation of $|a_1\rangle$
and $|a_2\rangle$. It is a very controversial topic, with numerous accusations of circularity
for some of the proofs (Hemmo and Pitowsky 2007; Barnum et al. 2000; Saunders
2004; Gill 2005; Schlosshauer and Fine 2005; Lewis 2010; Rae 2009; Dawid and
Thébault 2014).
In all these approaches there is a tacit assumption (which I also used above)
that the probability of an outcome of a measurement of A does not depend on
the procedure we use to perform this measurement. Another assumption is that
probability depends only on the quantum state. In the Deutsch-Wallace approach
some manipulations, swaps, and erasures are performed to eliminate the difference
between $|a_1\rangle$ and $|a_2\rangle$, leading to probability one half due to symmetry. If the
eigenstates do not have internal structure except for being orthogonal states, then
symmetry can be established, but it seems to me that these manipulations do not
provide the proof for important realistic cases in which the states are different in
many respects. It seems that what we need is a proof that all properties, except
for amplitudes, are irrelevant. I am not optimistic about the existence of such a
proof without adding some assumptions. Indeed, what might rule out the “Equal
rule”, the naive rule introduced above according to which the probabilities of all outcomes
corresponding to nonzero amplitudes are equal?
The assumption of continuity of probabilities as functions of time rules out the
Equal rule, but this is an additional assumption. The Deutsch-Wallace proof is in
the framework of the MWI, i.e. that the physical theory is just unitary evolution
which is, of course, continuous, but it is about amplitudes as functions of time. The
experience, including the probability of self-location of an observer, supervenes
on the quantum state specified by these amplitudes, but the continuity of this
supervenience rule is not guaranteed.
Zurek made a new twist in the derivation of the Born rule (Zurek 2005). His key
idea is to consider entangled systems and rely on “envariance” symmetry: a unitary
evolution of a system that can be undone by a unitary evolution of the system it
is entangled with. For the pivotal case, the state is
$$|\Psi\rangle = \frac{1}{\sqrt{2}}\,(|a_1\rangle|1\rangle + |a_2\rangle|2\rangle), \quad (26.9)$$
where $|1\rangle, |2\rangle$ are orthogonal states of the environment. The unitary swap
$|a_1\rangle \leftrightarrow |a_2\rangle$ followed by the unitary swap of the entangled system $|1\rangle \leftrightarrow |2\rangle$ brings
us back to the original state which, by assumption, corresponds to the original
probabilities. Zurek’s other assumptions are that a manipulation of the second
system does not change the probability of the measurement on the system, while
the swap of the states of the systems swaps the probabilities for the two outcomes.
This proves that the probabilities for the outcomes in the pivotal example must be
equal.
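The envariance argument can be checked mechanically for the pivotal state; a small numpy sketch (basis choices mine), verifying that the environment swap undoes the system swap:

```python
import numpy as np

I = np.eye(2)
SWAP = np.array([[0.0, 1.0], [1.0, 0.0]])  # |a1><a2| + |a2><a1| (same matrix for |1>,|2>)

# |Psi> = (|a1>|1> + |a2>|2>) / sqrt(2), with |a1> = |1> = (1,0), |a2> = |2> = (0,1)
psi = (np.kron([1.0, 0.0], [1.0, 0.0]) + np.kron([0.0, 1.0], [0.0, 1.0])) / np.sqrt(2)

swapped = np.kron(SWAP, I) @ psi       # swap on the system alone changes the state
restored = np.kron(I, SWAP) @ swapped  # ...and the environment swap undoes it
assert not np.allclose(swapped, psi)
assert np.allclose(restored, psi)
```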
In my view, the weak point is the claim that swapping the states of the system
swaps the probabilities of the outcomes. This property follows from the quantum
formalism when the initial state is an eigenstate, but in our case, when the
mechanism for the choice of the outcome is unknown, we also do not know how
it is affected by unitary operations. Note that this is not true for the Stern-Gerlach
experiment in the framework of Bohmian mechanics.
Zurek (see also Wallace 2010; Baker 2007; Boge 2019) emphasises the
importance of decoherence: entanglement of environment with eigenstates of the
system. Indeed, decoherence is almost always present in quantum measurements
and its presence might advance the moment at which we can declare that the measurement
has been completed, but, as far as I understand, decoherence is neither necessary
nor sufficient for completing a quantum measurement. It is not necessary, because it
is not practically possible to perform an interference experiment with a macroscopic
detector in macroscopically different states even if it is isolated from the environ-
ment. It is not sufficient, because decoherence does not ensure collapse and does not
ensure splitting of a world. For a proper measurement, the measuring device must
be macroscopic. It is true that an interaction of the system with an environment,
instead of a macroscopic measuring device, might lead to a state similar to (26.4)
with a macroscopic number of microscopic environment systems "recording" the
eigenvalue of the observable. This, however, does not ensure that the measurement
happens. It is not clear that a macroscopic number of excited microsystems causes a
collapse; see the analysis of such a situation in the framework of the physical collapse
model (Ghirardi et al. 1986) in Albert and Vaidman (1989) and Aicardi et al. (1991).
In the framework of the many-worlds interpretation we need splitting of worlds. The
moment of splitting does not have a rigorous definition, but a standard definition
(Vaidman 2002) is that macroscopic objects must have macroscopically different
states. Decoherence might well happen due to a change of states of air molecules
which do not represent any macroscopic object.
What I view as the most problematic “symmetry argument proof” of probability
half for the pivotal example is the analysis of Sebens and Carroll (2016); see also
Kent (2015). Sebens and Carroll considered the measurement in the framework of the
MWI and applied the uncertainty of self-location in a particular world as the meaning
of probability (Vaidman 1998). However, in my understanding of the example
they consider, this uncertainty does not exist (McQueen and Vaidman 2018). In
their scenario, a measurement of A on a system in state (26.8) is performed on a
remote planet. Sebens and Carroll consider the question: What is the probability for
an observer who is here, i.e., far away from the planet, to be in a world with a
particular outcome? This question is illegitimate, because he is certainly present in
both worlds; there is no uncertainty here. This conclusion is unavoidable in the MWI
as I understand it (Vaidman 2002), which is a fully deterministic theory without any
room for uncertainty. However, uncertainty in the MWI is considered in Saunders
(2004) and Saunders and Wallace (2008), so if this program succeeds (see, however,
Lewis 2007), then the Sebens-Carroll proof might make sense. Another way to make
26 Derivations of the Born Rule 577
Itamar Pitowsky’s analysis of the Born rule on the basis of Gleason’s theorem
(Pitowsky 1998) was taken further to the case of generalized measurements (Caves
et al. 2004). Galley and Masanes (2017) continued this line of research, which singles out
the Born rule from among alternatives. Note that they also used symmetry ("bit
symmetry") to single out the Born rule. Together with Müller, they extended their
analysis (Masanes et al. 2019) and claimed to prove everything just from some
"natural" properties of measurements, which are primitive elements in their theory.
So people have walked very far along the road paved by Itamar's pioneering works. I
have to admit that I am not sympathetic to this direction. The authors of Masanes
et al. (2019) conclude “Finally, having cleared up unnecessary postulates in the
formulation of QM, we find ourselves closer to its core message." To me it seems
that they are moving away from physics. Quantum mechanics was born to explain physical
phenomena that classical physics could not. It was not a probability theory. It was
not a theory of measurements, and I hope it will not end as such. “Measurements”
should not be primitives, they are physical processes as any other, and physics
should explain all of them.
Similarly, I cannot make much sense of claims that the Born rule appears even
in classical systems presented in the Hilbert space formalism (Brumer and Gong
2006; Deumens 2019). Note that in the quantum domain, the Born rule appears
even outside the framework of Hilbert spaces in the work of Saunders (2004),
who strongly relies on operational assumptions such as a continuity assumption:
“sufficiently small variations in the state-preparation device, and hence of the initial
state, should yield small variations in expectation value.” This assumption is much
more physical than postulates of general probabilistic theories.
A dynamical derivation in the framework of the Bohmian interpretation was
championed by Valentini (Valentini and Westman 2005; Towler et al. 2011), who argued
that under some (not too strong) requirements of complexity, the Born distribution
arises similarly to thermal probabilities in ordinary statistical mechanics. See the
extensive discussion in Callender (2007) and the recent analysis in Norsen (2018), which
also brings in similar ideas from Dürr et al. (1992). The fact that for some initial
conditions of some systems relaxation to Born statistics does not happen is a serious
weakness of this approach. What I find more to the point as a proof of the Born rule
is that the Born statistical distribution remains invariant under time evolution in all
situations. And that, under some very natural assumptions, is the only distribution
with this strong property (Goldstein and Struyve 2007).
Wallace (2010) and Saunders (2010) advocate analyzing the issue of probability
in the framework of the consistent histories approach. It provides formal expressions
which fit the probability calculus axioms. However, I have difficulty seeing what
these expressions might mean. I failed to see any ontological meaning for the main
concept of the approach “the probability of a history”, and it also has no operational
meaning apart from the conditional probability of an actually performed experiment
(Aharonov et al. 1964), while the approach is supposed to be general enough to
describe evolution of systems which were not measured at the intermediate time.
I feel that there is a lot of confusion in the discussions of the subject and it is
important to make the picture much clearer. Even if definite answers are not
available now, the question of what the open problems are can be clarified. First,
it is important to specify the framework: collapse theory, hidden variables approach
or noncollapse theory. Although in many cases the “derivation of the Born rule”
uses similar structure and arguments in all frameworks, the conceptual task is very
different. I believe that in all frameworks there is no way to prove the Born rule
from other axioms of standard quantum mechanics. The correctly posed question is:
What are the additional assumptions needed to derive the Born rule?
Standard quantum mechanics tells us that the evolution is unitary, until it leads
to a superposition of quantum states corresponding to macroscopically different
classical pictures. There is no precise definition of “macroscopically different
classical pictures” and this is a very important part of the measurement problem,
but discussions of the Born rule assume that this ambiguity is somehow solved, or
proven irrelevant. The discussions analyze a quantitative property of the nonunitary
process that happens when we reach this stage, taking the fact that it
happens as given. I see no way to derive the quantitative law of this nonunitary
process from laws of unitary evolution. It is usually assumed that the process
depends solely on the quantum state, i.e. that the probability of an outcome of
|a2⟩ → (1/10) Σ_{k=1}^{100} |bk⟩, (26.10)
and immediately measures the operator B, which tells him the eigenvalue bk. This
operation splits the initial single world, with the particle in a superposition, into
a hundred and one worlds: a hundred worlds in which one of Bob's detectors finds the
particle, and one world in which the particle was not found by Bob's detectors.
Prior to her measurement, Alice is present in all these worlds. Her measurement
tests whether she is in one particular world, so she has a probability of only 1/101
to find the particle at her site.
Bob’s unitary operation and measurement change the probability of Alice’s
outcome. With measurements on a single particle, the communication is not very
reliable, but using an ensemble will lead to only very rare cases of error. The
Equal rule will ensure that Alice and Bob, meeting in the future, will (most probably)
verify the correctness of the bit Bob has sent.
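A toy branch-counting calculation (my own hypothetical numbers, following the scenario above) makes the signalling explicit: the Born weights are unchanged by Bob's unitary, while an equal-probability-per-world rule is not.

```python
# Branch counting vs. the Born rule in the communication scenario above;
# the numbers are my own toy choices for an equal superposition.
amp_alice = amp_bob = 1 / 2**0.5   # particle split between the two sites

born_before = amp_alice**2         # Born probability that Alice finds it
equal_before = 1 / 2               # "Equal rule": two worlds, each 1/2

# Bob splits his branch into 100 detector outcomes, eq. (26.10): there are
# now 101 worlds, while the Born weight of Alice's branch is untouched.
n_worlds = 100 + 1
equal_after = 1 / n_worlds         # Alice's branch-counting probability drops
born_after = amp_alice**2

assert born_before == born_after   # unitarity alone: no signalling
assert equal_before != equal_after # Equal rule: Bob's remote action signals
```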
We know that unitary evolution does not allow superluminal communication
(when we consider a relativistic generalisation of the Schrödinger equation). Can,
Acknowledgements This work has been supported in part by the Israel Science Foundation Grant
No. 2064/19.
References
Aharonov, Y., & Reznik, B. (2002). How macroscopic properties dictate microscopic probabilities.
Physical Review A, 65, 052116.
Aharonov, Y., & Vaidman, L. (1990). Properties of a quantum system during the time interval
between two measurements. Physical Review A, 41, 11–20.
Aharonov, Y., Bergmann, P. G., & Lebowitz, J. L. (1964). Time symmetry in the quantum process
of measurement. Physical Review, 134, B1410–B1416.
Aharonov, Y., Cohen, E., Gruss, E., & Landsberger, T. (2014). Measurement and collapse within
the two-state vector formalism. Quantum Studies: Mathematics and Foundations, 1, 133–146.
Aharonov, Y., Cohen, E., & Landsberger, T. (2017). The two-time interpretation and macroscopic
time-reversibility. Entropy, 19, 111.
Aicardi, F., Borsellino, A., Ghirardi, G. C., & Grassi, R. (1991). Dynamical models for state-vector
reduction: Do they ensure that measurements have outcomes? Foundations of Physics Letters,
4, 109–128.
Albert, D. Z., & Vaidman, L. (1989). On a proposed postulate of state-reduction. Physics Letters A,
139, 1–4.
Baker, D. J. (2007). Measurement outcomes and probability in Everettian quantum mechanics.
Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of
Modern Physics, 38, 153–169.
Barnum, H., Caves, C. M., Finkelstein, J., Fuchs, C. A., & Schack, R. (2000). Quantum probability
from decision theory? Proceedings of the Royal Society of London. Series A: Mathematical,
Physical and Engineering Sciences, 456, 1175–1182.
Barrett, J. A. (2017). Typical worlds. Studies in History and Philosophy of Science Part B: Studies
in History and Philosophy of Modern Physics, 58, 31–40.
Boge, F. J. (2019). The best of many worlds, or, is quantum decoherence the manifestation of
a disposition? Studies in History and Philosophy of Science Part B: Studies in History and
Philosophy of Modern Physics, 66, 135–144.
Bohm, D. (1952). A suggested interpretation of the quantum theory in terms of “hidden” variables.
I. Physical Review, 85, 166.
Brumer, P., & Gong, J. (2006). Born rule in quantum and classical mechanics. Physical Review A,
73, 052109.
Bub, J., & Pitowsky, I. (2010). Two dogmas about quantum mechanics. In S. Saunders, J. Barrett,
A. Kent, & D. Wallace (Eds.), Many Worlds?: Everett, quantum theory, & reality (pp. 433–459).
Oxford: Oxford University Press.
Buniy, R. V., Hsu, S. D., & Zee, A. (2006). Discreteness and the origin of probability in quantum
mechanics. Physics Letters B, 640, 219–223.
Callender, C. (2007). The emergence and interpretation of probability in Bohmian mechanics.
Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of
Modern Physics, 38, 351–370.
Caves, C.M., Fuchs, C.A., & Schack, R. (2002). Quantum probabilities as Bayesian probabilities.
Physical Review A, 65, 022305.
Caves, C. M., Fuchs, C. A., Manne, K. K., & Renes, J. M. (2004). Gleason-type derivations of the
quantum probability rule for generalized measurements. Foundations of Physics, 34, 193–209.
Chiribella, G., D’Ariano, G. M., & Perinotti, P. (2011). Informational derivation of quantum theory.
Physical Review A, 84, 012311.
Dawid, R., & Thébault, K. P. (2014). Against the empirical viability of the Deutsch–Wallace–
Everett approach to quantum mechanics. Studies in History and Philosophy of Science Part B:
Studies in History and Philosophy of Modern Physics, 47, 55–61.
Deumens, E. (2019). On classical systems and measurements in quantum mechanics. Quantum
Studies: Mathematics and Foundations, 6, 481–517.
Deutsch, D. (1999). Quantum theory of probability and decisions. Proceedings of the Royal Society
of London. Series A: Mathematical, Physical and Engineering Sciences, 455, 3129–3137.
Dürr, D., Goldstein, S., & Zanghi, N. (1992). Quantum equilibrium and the origin of absolute
uncertainty. Journal of Statistical Physics, 67, 843–907.
Elitzur, A. C., & Vaidman, L. (1993). Quantum mechanical interaction-free measurements.
Foundations of Physics, 23, 987–997.
Everett III, H. (1957). “Relative state” formulation of quantum mechanics. Reviews of Modern
Physics, 29, 454–462.
Farhi, E., Goldstone, J., & Gutmann, S. (1989). How probability arises in quantum mechanics.
Annals of Physics, 192, 368–382.
Pitowsky, I. (2006). Quantum mechanics as a theory of probability. In Physical theory and its
interpretation (pp. 213–240). Berlin/Heidelberg: Springer.
Popescu, S., & Rohrlich, D. (1994). Quantum nonlocality as an axiom. Foundations of Physics,
24, 379–385.
Rae, A. I. (2009). Everett and the Born rule. Studies in History and Philosophy of Science Part B:
Studies in History and Philosophy of Modern Physics, 40, 243–250.
Saunders, S. (2004). Derivation of the Born rule from operational assumptions. Proceedings of
the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 460,
1771–1788.
Saunders, S. (2010). Chance in the Everett interpretation. In S. Saunders, J. Barrett, A. Kent, &
D. Wallace (Eds.), Many worlds?: Everett, quantum theory, & reality (pp. 181–205). Oxford:
Oxford University Press.
Saunders, S., & Wallace, D. (2008). Branching and uncertainty. The British Journal for the
Philosophy of Science, 59, 293–305.
Schlosshauer, M., & Fine, A. (2005). On Zurek’s derivation of the Born rule. Foundations of
Physics, 35, 197–213.
Sebens, C. T., & Carroll, S. M. (2016). Self-locating uncertainty and the origin of probability in
Everettian quantum mechanics. The British Journal for the Philosophy of Science, 69, 25–74.
Squires, E. J. (1990). On an alleged “proof” of the quantum probability law. Physics Letters A,
145, 67–68.
Tappenden, P. (2010). Evidence and uncertainty in Everett’s multiverse. British Journal for the
Philosophy of Science, 62, 99–123.
Tappenden, P. (2017). Objective probability and the mind-body relation. Studies in History and
Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 57, 8–16.
Towler, M., Russell, N., & Valentini, A. (2011). Time scales for dynamical relaxation to the Born
rule. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences,
468, 990–1013.
Vaidman, L. (1998). On schizophrenic experiences of the neutron or why we should believe in
the many-worlds interpretation of quantum theory. International Studies in the Philosophy of
Science, 12, 245–261.
Vaidman, L. (2002). Many-Worlds interpretation of Quantum mechanics. In E. N. Zalta (Ed.), The
Stanford encyclopedia of philosophy. Metaphysics Research Lab, Stanford University. https://
plato.stanford.edu/cgi-bin/encyclopedia/archinfo.cgi?entry=qm-manyworlds
Vaidman, L. (2012). Probability in the many-worlds interpretation of quantum mechanics. In
Y. Ben-Menahem & M. Hemmo (Eds.), Probability in physics (pp. 299–311). Berlin: Springer.
Vaidman, L. (2016). All is ψ. Journal of Physics: Conference Series, 701, 012020.
Vaidman, L. (2019). Ontology of the wave function and the many-worlds interpretation. In
O. Lombardi, S. Fortin, C. López, & F. Holik (Eds.), Quantum worlds: Perspectives on the
ontology of quantum mechanics. (pp. 93–106). Cambridge: Cambridge University Press.
Valentini, A., & Westman, H. (2005). Dynamical origin of quantum probabilities. Proceedings of
the Royal Society A: Mathematical, Physical and Engineering Sciences, 461, 253–272.
Van Wesep, R. A. (2006). Many worlds and the appearance of probability in quantum mechanics.
Annals of Physics, 321, 2438–2452.
Wallace, D. (2007). Quantum probability from subjective likelihood: Improving on Deutsch’s proof
of the probability rule. Studies In History and Philosophy of Science Part B: Studies In History
and Philosophy of Modern Physics, 38, 311–332.
Wallace, D. (2010). How to prove the Born rule. In S. Saunders, J. Barrett, A. Kent, D. Wallace
(Eds.) Many worlds?: Everett, quantum theory, & reality (pp. 227–263). Oxford University
Press.
Zurek, W. H. (2005). Probabilities from entanglement, Born's rule pk = |ψk|2 from envariance.
Physical Review A, 71, 052105.
Chapter 27
Dynamical States and the
Conventionality of (Non-) Classicality
Alexander Wilce
Abstract Itamar Pitowsky, along with Jeff Bub and others, long championed the
view that quantum mechanics (QM) is best understood as a non-classical probability
theory. This idea has several attractions, not least that it allows us even to pose the
question of why quantum probability theory has the rather special mathematical
form that it has—a problem that has been very fruitful, and to which we now
have several rather compelling answers, at least for finite-dimensional QM. Here,
however, I want to offer some modest caveats. One is that QM is best seen, not as a
new probability theory, but as something narrower, namely, a class of probabilistic
models, selected from within a much more general framework. It is this general
framework that, if anything, deserves to be regarded as a “non-classical” probability
theory.
Secondly, however, I will argue that, for individual probabilistic models, and even
for probabilistic theories (roughly, classes of such models), the distinction between
"classical" and "non-classical" is largely a conventional one, bound up with the
question of what one means by the state of a system. In fact, for systems with a high
degree of symmetry, including quantum mechanics, it is possible to interpret general
probabilistic models as having a perfectly classical probabilistic structure, but an
additional dynamical structure: states, rather than corresponding simply to prob-
ability measures, are represented as certain probability measure-valued functions
on the system’s symmetry group, and thus, as fundamentally dynamical objects.
Conversely, a classical probability space equipped with a reasonably well-behaved
family of such "dynamical states" can be interpreted as a generalized probabilistic
model in a canonical way. It is noteworthy that this “dynamical” representation is
not a conventional hidden-variables representation, and the question of what one
means by “non-locality” in this setting is not entirely straightforward.
A. Wilce ()
Department of Mathematical Sciences, Susquehanna University, Selinsgrove, PA, USA
e-mail: wilce@susqu.edu
27.1 Introduction
Itamar Pitowsky long championed the idea (also associated with Jeff Bub, among
others) that quantum theory is best viewed as a non-classical probability theory
(Pitowsky 2005). This position is strongly motivated by the fact that the mathe-
matical apparatus of quantum mechanics can largely be reduced to the statement
that the binary observables—the physically measurable, {0, 1}-valued quantities—
associated with a quantum system are in one-to-one correspondence with the
projections in a von Neumann algebra, in such a way that commuting families
of projections can be jointly measured and mutually orthogonal projections are
mutually exclusive.1 In other words, mathematically, quantum theory is what one
gets when one replaces the Boolean algebras of classical probability theory with
projection lattices. It is very natural to conclude from this that quantum theory
simply is a “non-classical” or “non-commutative” probability theory.
This point of view has many attractions, not least that it more or less dissolves
what Pitowsky called the “big” measurement problem (Bub and Pitowsky 2012;
Pitowsky 2005). It also raises the question of why quantum mechanics has the
particular non-classical structure that it does—that is, why projection lattices of von
Neumann algebras, rather than something more general? The past decade has seen
a good deal of progress on this question, at least as it pertains to finite-dimensional
quantum systems (Hardy 2001; Masanes and Müller 2011; D’Ariano et al. 2011;
Mueller et al. 2014; Wilce 2012).
Here, I want to offer some modest caveats. The first is that how we understand
Pitowsky's thesis depends on how we understand probability theory itself. On
the view that I favor, quantum mechanics (henceforth, QM) is best seen, not as a
replacement for general classical probability theory, but rather as a specific class
of probabilistic models—a probabilistic theory—defined within a much broader
framework, in which individual models are defined by essentially arbitrary sets
of basic measurements and states, with the latter assigning probabilities (in an
elementary sense) to outcomes of the former. It is this framework that, if anything,
counts as a "non-classical" probability theory—but in fact, I think we should regard,
and refer to, this general framework simply as probability theory, without any
adjective. Since it outruns what's usually studied under that heading, I settle here
1 Given this, Gleason’s Theorem, as extended by Christensen and Yeadon, identifies the system’s
states with states (positive, normalized linear functionals) on that same von Neumann algebra,
and Wigner’s Theorem then identifies the possible dynamics with one-parameter groups of unitary
elements of that algebra. For the details, see Varadarajan (1985).
for the term general probability theory.2 Within this framework, in addition to QM,
we can identify classical probability theory, in the strict Kolmogorovian sense, as a
special, and, I will argue, equally contingent, probabilistic theory.
However—and this is my second point—even as applied to individual proba-
bilistic models, or, indeed, individual probabilistic theories, the distinction between
“classical” and “non-classical” is not so clear cut. One standard answer to the
question of what makes the probabilistic apparatus of QM “non-classical” is that
its state space (the set of density operators on a Hilbert space, or the set of states
on a non-commutative von Neumann algebra, depending on how general we want
to be) is not a simplex: the same state can be realized as a mixture—a convex
combination—of pure states in many different ways. Another, related, answer is
that, whereas in classical probability theory any two measurements (or, if you prefer,
experiments) can effectively be performed jointly, in QM this is not the case.
Neither of these answers is really satisfactory. In the first place, it is perfectly
within the scope of standard classical probability theory to consider restricted
sets of probability measures and random variables on a given measurable space:
the former need not be simplices, and the latter need not be closed under the
formation of joint random variables. We might also wish to allow “unsharp”
random variables having some intrinsic randomness (represented, depending on
one’s taste, by Markov kernels or by response-function valued measures). We might
try to identify as “broadly classical” just those probabilistic models arising in this
way, from a measurable space plus some selection of probability measures and
(possibly unsharp) random variables. However, once we open the door this far,
it’s difficult to keep it on its hinges. This is because there exist straightforward,
well known, and in a sense even canonical, recipes for representing essentially any
“non-classical” probabilistic model in terms of such a broadly classical one, either
by treating the state space as a suitable quotient of a simplex (in the manner of
the Beltrametti-Bugajski representation of quantum mechanics), or by introducing
contextual “hidden variables”. In fact, for systems with a high degree of symmetry,
including finite-dimensional quantum systems, there is another, less well known,
but equally straightforward classical representation: the probabilistic structure is
simply that of a single reference observable, and hence, entirely classical; states,
however, rather than corresponding simply to probability measures, are represented
by probability measure-valued functions on the system’s symmetry group, and
thus, as fundamentally dynamical objects. Conversely, a classical probability space
equipped with a family of suitably covariant such “dynamical states”—what I call a
dynamical classical model—can be interpreted as a generalized probabilistic model
in a canonical way.
At the level of individual probabilistic models, then, it seems that the distinction
between classical and non-classical is, at least mathematically, more or less one
of convention (and, pragmatically, of convenience). There is, of course, another
M̃ = {Ẽ | E ∈ M},
One of the most obvious distinctions between the probabilistic models associated
with "classical" systems and those associated with quantum-mechanical systems
is that the former have an abundance of dispersion-free states, while the latter,
absent superselection rules, have none at all. Notice, however, that M̃ always has
an abundance of dispersion-free probability weights (just choose an outcome from
every test).
Definition 2.5 (Locally finite test spaces) A test space M is locally finite iff every
test E ∈ M is a finite set. Every test space is associated with a locally finite test
space M# , consisting of finite partitions of tests in M. That is, writing D(E) for
the set of finite partitions of a set E,
M# = ⋃_{E∈M} D(E).
If M is a test space, an event for M is a subset of a test (that is, an event in the
standard sense pertaining to that test). We write Ev(M) for the set of all events of
M, so that Ev(M) = ⋃_{E∈M} P(E). Thus, the outcome-space for M# is precisely
the set Ev(M) \ {∅} of non-empty events.
If α is a probability weight on M, then we define the probability of an event
a ∈ Ev(M) by α(a) = Σ_{x∈a} α(x). This evidently defines a probability weight on
M# . However, unless M is locally finite to begin with, M# will generally support
many additional “singular” probability weights. (Indeed, if M is semiclassical, one
can simply choose a non-principal ultrafilter on each E ∈ M.)
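The definitions above can be made concrete in a short sketch (mine; the helper names are hypothetical), using three overlapping three-outcome tests pasted in a loop as in Examples 2.11 (the exact labelling here is my assumption).

```python
from itertools import combinations

# A small test space: three overlapping three-outcome tests in a loop.
M = [frozenset({'a', 'x', 'b'}),
     frozenset({'b', 'y', 'c'}),
     frozenset({'c', 'z', 'a'})]

def is_probability_weight(alpha, tests):
    # a probability weight sums to 1 over every test
    return all(abs(sum(alpha[o] for o in E) - 1) < 1e-9 for E in tests)

# A dispersion-free weight: alpha(a) = alpha(y) = 1, zero elsewhere.
alpha = {'a': 1, 'y': 1, 'b': 0, 'c': 0, 'x': 0, 'z': 0}
assert is_probability_weight(alpha, M)

def events(tests):
    # Ev(M): all subsets of tests, the empty event included
    evs = set()
    for E in tests:
        for r in range(len(E) + 1):
            evs.update(frozenset(c) for c in combinations(sorted(E), r))
    return evs

def event_prob(alpha, a):
    # alpha(a) = sum of alpha(x) over the outcomes x in the event a
    return sum(alpha[x] for x in a)

assert event_prob(alpha, frozenset({'a', 'x'})) == 1
```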
Lemma 2.6 Let M be locally finite with outcome-set X. Then the convex set
Pr(M) of probability weights on M is compact as a subset of [0, 1]^X (with the
product topology).
Proof Since each E ∈ M(A) is finite, the pointwise limit of a net of probability
weights will continue to sum to unity over E. Thus, Pr(M) is closed, and hence
compact, in [0, 1]^X. ⊓⊔
It follows that if A is a probabilistic model with M(A) locally finite, the closure
of Ω(A) in Pr(M(A)) will also be compact. Because it is both mathematically
convenient and operationally reasonable to do so, I shall assume from now on that,
unless otherwise indicated, all probabilistic models A under discussion have a
locally finite test space M(A) and a closed, hence compact, state space Ω(A).
While I will make no direct use of it, it's worth noting that the Krein-Milman
Theorem now tells us that every state in Ω(A) is a limit of convex combinations
of pure states.
It is often useful to consider sub-normalized positive weights on a test space
M, that is, functions α : X := ⋃M → R+ such that Σ_{x∈E} α(x) =: r ≤ 1,
independently of E. In fact, these are simply probability weights on a slightly larger
test space:
M+ = {E ∪ {∗}|E ∈ M}.
3 This is one setting in which we would often not require Ω to be closed. For instance, if Ω =
quantum-mechanical system one associates with H, but in a manner that puts its
probabilistic structure in the foreground. Note that only if H is finite-dimensional,
is (H) compact.
While the model above is mathematically attractive, it is more usual to identify
measurement outcomes with projection operators. The projective quantum test
space consists of sets of projections pi on H summing to 1. If ρ is a density operator
on H, we obtain a probability weight (also denoted by ρ) given by ρ(p) = Tr(ρp).
Letting Ω consist of all such probability weights, we obtain what we can call the
projective quantum model. More generally still, if A is a von Neumann algebra (or,
for that matter, any ring with identity), the collection M(A) of sets E of projections
p ∈ A with Σ_{p∈E} p = 1 is a test space, on which every state on A gives rise to
a probability weight by evaluation. In this case, we might identify Ω with the state
space (in the usual sense) of A, which is weak-∗ compact, but which contains a host
of non-normal states.
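A minimal numerical sketch (mine, for a single qubit) of how a density operator induces a probability weight on the projective quantum test space via ρ(p) = Tr(ρp); the particular tests chosen are my own.

```python
import numpy as np

# A test is a set of projections summing to the identity; a density
# operator rho gives the probability weight p |-> Tr(rho p).
def projector(v):
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

z_test = [projector([1, 0]), projector([0, 1])]    # Z-basis test
x_test = [projector([1, 1]), projector([1, -1])]   # X-basis test

for test in (z_test, x_test):
    assert np.allclose(sum(test), np.eye(2))       # projections sum to 1

rho = projector([1, 1])                            # the pure state |+><+|

def weight(p):
    return np.trace(rho @ p).real                  # rho(p) = Tr(rho p)

for test in (z_test, x_test):                      # a probability weight:
    assert abs(sum(weight(p) for p in test) - 1) < 1e-12
assert abs(weight(x_test[0]) - 1) < 1e-12          # dispersion-free on X ...
assert abs(weight(z_test[0]) - 0.5) < 1e-12        # ... but not on Z
```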
Two pathological examples The following simple examples are useful as
illustrations of the range of possibilities comprehended by the formalism sketched
above.
Example 2.10 (Two bits) Let M = {{a, a′}, {b, b′}} be a test space consisting of
two yes-no tests, one with outcomes a and a′, the other with outcomes b and b′.
The space of all probability weights on M is essentially the unit square, since such
a weight α is determined by the pair (α(a), α(b)). While the state space is not a
simplex, pure states are dispersion-free. The square bit B and diamond bit B′ are
the probabilistic models having the same test space, namely
M(B) = M(B′) = M,
but the two different state spaces pictured below in Fig. 27.1.
The square bit figures, directly or indirectly, in the large literature on "PR boxes"
inaugurated by Popescu and Rohrlich (1994). While the state space of B′ is affinely
isomorphic to that of B, it interacts with M differently. While both models are
unital, in B′, for each outcome x there is a unique state δx with δx(x) = 1.
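The square bit can be sketched concretely (my illustration; the helper names are hypothetical):

```python
# Probability weights on M = {{a, a'}, {b, b'}} are pairs (alpha(a), alpha(b))
# in the unit square.
def weight(pa, pb):
    return {'a': pa, "a'": 1 - pa, 'b': pb, "b'": 1 - pb}

# The four corners of the square are the pure states of the square bit B;
# each is dispersion-free (every outcome gets probability 0 or 1).
corners = [weight(pa, pb) for pa in (0, 1) for pb in (0, 1)]
for w in corners:
    assert all(v in (0, 1) for v in w.values())

# The square is not a simplex: its center decomposes as an equal mixture
# of either diagonal pair of corners.
mix1 = {o: (corners[0][o] + corners[3][o]) / 2 for o in corners[0]}
mix2 = {o: (corners[1][o] + corners[2][o]) / 2 for o in corners[0]}
assert mix1 == mix2 == weight(0.5, 0.5)
```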
Examples 2.11 (The Wright triangle5 ) A more interesting example consists of three
overlapping, three-outcome tests pasted together in a loop, as in Fig. 27.2:
α(a) = α(y) = 1, β(b) = β(z) = 1, γ(c) = γ(x) = 1 and δ(x) = δ(y) = δ(z) = 1,
In particular, Ω is not a simplex, and not all pure states are dispersion-free.
It is often convenient to consider the vector space V(A) ≤ R^{X(A)} spanned by the
state space Ω(A). This carries an obvious pointwise order. Writing V(A)+ for the
cone of all α ∈ V(A) with α(x) ≥ 0 for all x ∈ X(A), it is not difficult to see that
5 So well known in the quantum-logic community as to be almost a cliche, this gives rise to the
simplest quantum logic that is an orthoalgebra but not an orthomodular poset. See Wilce (2017)
for details. The example is named for R. Wright, one of Randall and Foulis’s students.
each α ∈ V(A)+ has the form α = rβ where β ∈ Ω(A) and r ≥ 0. Moreover, the
coefficient r is unique, so Ω(A) is a base for the cone V(A)+.
An effect on A is a functional f : V(A) → R with 0 ≤ f(α) ≤ 1 for α ∈ Ω(A).
Every outcome x ∈ X(A) determines an effect x̂ ∈ V(A)∗ by evaluation, that is, by
x̂(α) = α(x) for all α ∈ V(A). We can regard an arbitrary effect f as representing an
“in-principle” measurement outcome, not necessarily included among the outcomes in
X(A), but consistent with the convex structure of Ω(A): f(α) is the probability
of the effect f occurring (if measured) in the state α. There is a unique unit
effect uA, given by uA(α) = 1 for all α ∈ Ω(A), representing the trivial outcome
“something happened”. If f is an effect, so is uA − f =: f′; we interpret f′
as representing the event that f did not occur. Thus, the pair {f, f′} is an
in-principle test (measurement, experiment) not necessarily included in M(A). More
generally, if f1, . . . , fn is a set of effects with Σi fi = uA, we regard {f1, . . . , fn}
as an “in principle” measurement or experiment—the standard term is “discrete
observable”—not necessarily included in the catalogue M(A), but consistent with
the structure of Ω(A). Each test E ∈ M(A) gives rise to a discrete observable
{x̂ | x ∈ E} in this sense.
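The effect calculus above can be illustrated with a minimal sketch (a toy model of my own, not the chapter's): a single three-outcome test, so that states are probability vectors and effects are linear functionals encoded by their values on the outcomes.

```python
# Minimal sketch of the effect calculus (toy model, not from the chapter):
# one three-outcome test E = {x, y, z}, states as probability vectors of
# length 3, effects as linear functionals given by their outcome values.

def apply_effect(f, alpha):
    """f(alpha) = sum over outcomes of f[x] * alpha[x] (linearity on V(A))."""
    return sum(fx * ax for fx, ax in zip(f, alpha))

def complement(f):
    """The complementary effect f' := uA - f ('f did not occur')."""
    return [1.0 - fx for fx in f]

alpha = [0.5, 0.3, 0.2]            # a state
f     = [1.0, 0.5, 0.0]            # a 'fuzzy' effect
p = apply_effect(f, alpha)
q = apply_effect(complement(f), alpha)

# {f, f'} behaves as an in-principle two-outcome test in every state:
assert abs(p + q - 1.0) < 1e-12
print(round(p, 3), round(q, 3))    # 0.65 0.35
```

The check that p + q = 1 is exactly the statement that {f, f′} partitions the unit effect.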
Examples 2.12
(a) As an illustration, let (S, Σ) be a measurable space, and let Δ = Δ(S, Σ)
be the space of finitely additive probability measures on Σ. Then V(Δ) is the
space of all signed measures μ = μ+ − μ− on Σ, where μ+ and μ− are
positive measures. Any measurable function ζ : S → [0, 1] defines an effect
a through the recipe a(μ) = ∫S ζ(s) dμ(s). It is common to think of such an
effect as a “fuzzy” or “unsharp” version of an indicator function (modelling,
perhaps, a detector that responds with probability ζ (s) when the system is in
a state s). The unit effect is the constant function 1. A discrete observable,
accordingly, represents a generalization of a “fuzzy” or unsharp version of a
finite measurable partition.
(b) In the case of a quantum-mechanical model, say A = A(H), if we identify
Ω(A) with the set of density operators on H, then V(A) can be identified with
the space of all trace-class self-adjoint operators on H, ordered in the usual way;
V(A)∗ is then naturally isomorphic, as an ordered vector space, to the space
of all bounded self-adjoint operators on H, and an effect is an effect in the
usual quantum-mechanical sense, that is, a positive self-adjoint operator a with
0 ≤ a ≤ 1, where 1 is the identity operator on H, which serves as the unit effect
in this case. A discrete observable corresponds to a discrete positive-operator
valued measure (POVM) with values in N.
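Example 2.12(b) can be checked numerically in finite dimension (the matrices below are my own choices, not the chapter's): a density operator, an unsharp two-outcome POVM summing to the identity, and the resulting probabilities tr(ρa).

```python
import numpy as np

# Numerical sketch of Example 2.12(b) in finite dimension (matrices mine):
# a density operator, an unsharp two-outcome POVM, and probabilities tr(rho a).

rho = np.array([[0.75, 0.25],
                [0.25, 0.25]])            # a density operator on C^2

a = np.array([[0.9, 0.0],
              [0.0, 0.2]])                # an effect: 0 <= a <= 1
povm = [a, np.eye(2) - a]                 # a discrete observable {a, 1 - a}

assert np.allclose(sum(povm), np.eye(2))  # partition of the unit effect
probs = [float(np.trace(rho @ e)) for e in povm]
assert abs(sum(probs) - 1.0) < 1e-12
print([round(p, 3) for p in probs])       # [0.725, 0.275]
```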
Unlike a test in M(A), a general discrete observable on A can have repeated
values. Nevertheless, by considering their graphs, such observables can be organized
into a test space, as follows. Here, [0, uA ] denotes the set of all effects on A, and
(0, uA ] denotes the set of non-zero effects.
Definition 2.13 (Ordered tests) The space of ordered tests over the convex set Δ
is the test space Mo(Δ) with
596 A. Wilce
6 It can be shown that, conversely, every probability weight on Mo(Δ) is determined by a finitely-
additive normalized measure on the effect algebra [0, u], and hence, by a positive linear functional
in V(A)∗∗ . Hence, probability weights on Mo(Δ) that are continuous on (0, u] in the latter’s
relative weak-∗ topology arise from elements of Δ.
Before we can discuss these representations, we need to address head on the central
concern of this chapter: what does it mean (what ought it to mean) for a probabilistic
model A to be “classical”? Whatever our final answer, presumably we would at least
wish to consider models A(S, Σ) := (D(S, Σ), Δσ(S, Σ)) as classical. However,
as discussed earlier, even classically it won’t do to restrict attention to such models:
any model of the form (D(S, Σ), Δ), where Δ is any subset of Δσ(S, Σ), should
almost certainly count as “classical”, in the sense of falling within the scope of
classical probability theory (Lindsay 1995).
We should also pause to ask whether or not the restriction to σ -additive measures
should be regarded as one of the defining features of classicality. If μ is a finitely
additive measure on an algebra Σ of sets, then letting T = β(Σ) be the Stone space
of Σ, regarded as a Boolean algebra, one finds that, owing to the fact that elements
of Σ are clopen in T and T is compact, no countable disjoint family of sets in Σ has
union equal to T. Hence, every measure on (T, Σ) is countably additive by default.
In different language, whether or not a measure on a Boolean algebra Σ is countably
additive depends on the particular representation of Σ as an algebra of sets.
For this reason, it seems prudent to allow any Kolmogorovian model, in the sense
of Example 2.8—that is, any probabilistic model of the form (D(S, Σ), Δ), where
Σ is any algebra of subsets of S and Δ is any convex set of probability measures
on Σ, countably additive or otherwise—to count as “classical”. But if this is so, it
seems equally reasonable to admit as classical models of the form (M, Δ) as long
as M ⊆ D(S, Σ) and Δ ⊆ Δ(S, Σ). That is, models admitting embeddings, in the
sense of Definition 2.16, into Kolmogorovian models in the sense of Example 2.8,
should themselves be considered “classical” in a broad sense.
When does an abstract model (M, Δ) admit such a classical embedding? If
φ : M(A) → D(S, Σ) is an interpretation, then every point-mass δs in Δ(S, Σ)
pulls back along φ to a dispersion-free probability weight on M(A). If φ is to be
an embedding of test spaces, there must be enough of these dispersion-free states to
separate outcomes in X(A). Moreover, if φ is to be an embedding of probabilistic
models, every state α ∈ Ω(A) must belong to the closed convex hull of these
dispersion-free weights.
ax := { s ∈ S | s(x) = 1 } ⊆ S.
G = { a ∩ b | a ∈ E, b ∈ F, a ∩ b ≠ ∅ }
An embedding is not the only way of explaining one mathematical object in terms
of another. One can also consider representing the object to be explained as a
quotient of a more familiar object. In connection with probabilistic models, this
line of thinking was explored by Holevo (1982) and, slightly later, by Bugajski
and Beltrametti (Beltrametti and Bugajski 1995; see also Hellwig and Singer 1991;
Hellwig et al. 1993). In the literature, this sort of representation of a quantum model
is usually called a classical extension. The basic idea is that any convex set can arise
as a quotient of another under an affine surjection. As a simple example, a square
can arise as the projection of a regular tetrahedron (a 3-simplex) onto a plane.
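The tetrahedron-to-square projection can be made completely explicit. In the sketch below (coordinates my own), the simplex of barycentric weights on the four vertices of a regular tetrahedron maps affinely onto a square in the plane:

```python
import numpy as np

# Sketch (coordinates mine): a regular tetrahedron whose shadow in the
# xy-plane is a square, so the simplex of barycentric weights on its four
# vertices maps affinely *onto* a non-simplicial convex set.

verts = np.array([[ 1,  1,  1],
                  [ 1, -1, -1],
                  [-1,  1, -1],
                  [-1, -1,  1]], dtype=float)   # a regular tetrahedron

def project(weights):
    """Barycentric weights on the four vertices -> barycenter, dropped to z = 0."""
    w = np.asarray(weights, dtype=float)
    assert np.all(w >= 0) and abs(w.sum() - 1.0) < 1e-12
    return (w @ verts)[:2]

# The four vertices map onto the four corners of a square:
corners = [project(np.eye(4)[i]) for i in range(4)]
print(sorted((float(x), float(y)) for x, y in corners))
# [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
```

The map `project` is exactly a barycenter map of the kind discussed next, restricted to a finite-dimensional case.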
To develop this idea further, we will need a bit of background on Choquet theory
(Alfsen 1971). Recall that the σ -field of Baire sets of a compact Hausdorff space
S is the smallest σ-algebra containing all sets of the form f−1(0), where f ranges
over continuous real-valued functions on S. Let K be a compact convex set, that
is, a compact, convex subset of a locally convex topological vector space V, and
let Δo(K) be the set of Baire probability measures thereon. This is an infinite-
dimensional simplex, in the sense that V(Δo(K)) is a vector lattice (Alfsen 1971).
For each μ ∈ Δo(K), define the barycenter of μ, b(μ) ∈ V∗∗, by
b(μ)(f) = ∫K f dμ
for all functionals f ∈ V∗. Using the Hahn-Banach Theorem, one can show that
b(μ) ∈ K (Alfsen 1971, I.2.1). Thus, we have an affine mapping μ → b(μ) from Δo(K)
to K. The barycenter of the point-mass δα at α ∈ K is evidently α, so this mapping
is surjective. In this sense, every compact convex set is the image of a simplex under
an affine mapping.
Definition 3.5 (Classical extensions) A classical extension of a probabilistic
model A consists of a measurable space (S, Σ) and an affine surjection
q : Δσ(S, Σ) → Ω(A).
This makes no reference to M(A). However, the affine surjection q : Δo(S) →
Ω(A) can be dualized to yield a mapping q∗ : X(A) → Aff(Δo(S)) taking every
outcome x ∈ X(A) to the functional q∗(x)(μ) = q(μ)(x), which is evidently an
effect on Δo(S). Now, such an effect can be understood as an “unsharp” version
of an indicator function, as discussed in Example 2.12(a). Accordingly, since Σx∈E q∗(x)
is identically 1, q∗ is an effect-valued weight on M(A), representing each test as
a partition of 1 by such fuzzy indicator functions; that is, q∗(E) is a (discrete)
“unsharp random variable” on S. For more on this, see Bacciagaluppi (2004).
In such a representation, in other words, the state space is essentially classical
(it’s simply a set of probability measures), while outcomes and tests become
“unsharp”. While this may represent a slight extension of the apparatus of standard,
Kolmogorovian probability theory, it is certainly within the scope of classical
probability theory in the somewhat wider sense that concerns us here.
Another way to put all this is that a classical extension comes along with
(and determines) an interpretation M(A) → Mo(Δ(S, Σ)), where the latter
is the space of ordered tests on Δ(S, Σ), in the sense of Definition 2.13. This
then gives us an interpretation of models A → (Mo(Δ(S, Σ)), Δ(S, Σ)). In
this sense, a classical extension is an “unsharp” version of a classical embedding.
However, as the discussion above shows, every probabilistic model has a canonical
classical extension obtained by taking S = Ω(A) and Σ the Baire field of Ω(A).
This construction is even functorial, since the construction A → Ω(A) is (the
object part of) a contravariant functor from the category of probabilistic models
to the category of compact convex sets, while K → Δo(K) and K → Mo(K)
are, respectively, a covariant endofunctor on the category of compact convex sets
and continuous affine maps, and a functor from this category to the category of test
spaces and interpretations. Putting these together gives us a covariant functor A →
(Mo(Δo(Ω(A))), Δo(Ω(A))) from arbitrary (locally finite) probabilistic models
to unsharp Kolmogorovian models.
One can also construct a classical extension in which the carrier space is the set
Kext of extreme points of K. This need not be a Borel subset of K; however, it
becomes a measurable space if we let Σext be the trace of K’s Baire field Σo on
Kext, i.e., Σext = { b ∩ Kext | b ∈ Σo }. If μ is a probability measure on Σext, then
we can pull this back along the Boolean homomorphism φ : b → b ∩ Kext to a
[Display garbled in extraction: the definition of the test space { Ê | E ∈ M } and a commutative diagram relating D(S(A), Σ), Δ(S(A)), M(A), Ω(A) and Pr(M) via the maps φ, ψ = φ∗, π and π∗.]
ax = { s ∈ S(A) | s(x, E) = 1 },
and { ax | x ∈ E } is a partition of S(A).
This gives us a very simple representation of an arbitrary model as a quotient of a
sub-model of a Kolmogorovian model. To the extent that a representation of a model
A as a quotient of a model B allows us to “explain” the model A in terms of the
model B, and to the extent that representing a model B as a submodel of a model
C also allows us to explain B in terms of C, then—to the extent that “explains” is
transitive—we see that every model can be explained in terms of a classical one.
This construction is even functorial: if φ : A → B is an interpretation from a model
A to a model B, there is a canonical extension of φ to an interpretation between the
corresponding semiclassical covers, given by (x, E) → (φ(x), φ(E)). Hence, an entire probabilistic theory can be
rendered essentially classical, if we are willing to embrace contextuality.
Hidden Variables This is as good a place as any to clarify how the foregoing
discussion connects to traditional notions of “hidden variables” (or “ontological
representations” (Spekkens 2005)). Briefly, a (not necessarily deterministic, not
necessarily contextual) hidden-variables representation of a probabilistic model
A consists of a measurable space Λ and, for every test E ∈ M, a conditional
probability distribution p( · |λ, E) on E—that is, p(x|λ, E) is a non-negative real
number for each x ∈ E, and Σx∈E p(x | λ, E) = 1. It is also required that, for
every state α ∈ Ω(A), there exist a probability measure μ on Λ such that for every
x ∈ X(A) and every test E with x ∈ E,

∫Λ p(x|λ, E) dμ(λ) = α(x). (27.1)
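Condition (27.1) can be illustrated with a toy discrete model (the model, outcomes and numbers below are entirely my own): the hidden variables are two deterministic response functions, and a measure μ over them induces a genuine probability weight.

```python
# Toy sketch of a discrete hidden-variables representation in the sense of
# (27.1) (model and numbers mine): the hidden variables are deterministic
# response functions; mixing them with mu yields a probability weight alpha.

tests = [("u", "w1", "v"), ("u", "w2", "v")]   # two tests sharing u and v

lambdas = {
    "L1": lambda x, E: 1.0 if x == "u" else 0.0,
    "L2": lambda x, E: 1.0 if x == "v" else 0.0,
}
mu = {"L1": 0.3, "L2": 0.7}                    # probability measure on Lambda

def alpha(x):
    """(27.1) with discrete mu: alpha(x) = sum over lam of p(x|lam,E) mu(lam)."""
    E = next(E for E in tests if x in E)
    return sum(mu[l] * lambdas[l](x, E) for l in lambdas)

for E in tests:                                # alpha sums to 1 on every test
    assert abs(sum(alpha(x) for x in E) - 1.0) < 1e-12
print(alpha("u"), alpha("v"))                  # 0.3 0.7
```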
27.3.4 Discussion
representation, available for locally finite models whenever a great deal of
symmetry is in play. This is mathematically similar to the representation in terms
of the semiclassical cover; conceptually, however, it is leaner, in that it privileges
a single observable, and encodes the structure of the model in the ways in which
probabilities on this observable can change.
In this section, we will consider probabilistic models A that, in addition to the test
space M(A) and state space Ω(A), are equipped with a distinguished symmetry
group G(A) acting on X(A), M(A) and Ω(A). In most applications, G(A) will be
a Lie group and, in our finite-dimensional setting, compact.
Definition 4.1 (Symmetries of test spaces) A symmetry of a test space M with
outcome-set X = ⋃M is an isomorphism g : M → M, or, equivalently, a
bijection g : X → X such that ∀E ⊆ X, g(E) ∈ M iff E ∈ M. We write
Aut(M) for the group of all symmetries of M under composition.
Note that Aut(M) also has a natural right action on Pr(M), since if g ∈
Aut(M) and α ∈ Pr(M), then α ◦g is again a probability weight, and the mapping
g ∗ : α → α ◦ g is an affine bijection Pr(M) → Pr(M).
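Definition 4.1 and the right action on Pr(M) can be sketched for a small semiclassical test space (a toy example of my own):

```python
from itertools import combinations, permutations

# Sketch (toy example mine): the test space on X = {0,1,2,3} with two
# disjoint tests, its symmetry group, and the induced right action
# alpha -> alpha o g on probability weights.

X = (0, 1, 2, 3)
M = [frozenset({0, 1}), frozenset({2, 3})]
subsets = [frozenset(s) for r in range(len(X) + 1) for s in combinations(X, r)]

def is_symmetry(g):
    """g : X -> X is a symmetry iff g(E) is a test exactly when E is."""
    return all((frozenset(g[x] for x in E) in M) == (E in M) for E in subsets)

syms = [dict(zip(X, p)) for p in permutations(X) if is_symmetry(dict(zip(X, p)))]
assert len(syms) == 8          # permute within each test, or swap the tests

alpha = {0: 0.2, 1: 0.8, 2: 0.3, 3: 0.7}   # a probability weight
g = {0: 2, 1: 3, 2: 0, 3: 1}               # a symmetry swapping the two tests
alpha_g = {x: alpha[g[x]] for x in X}      # the right action alpha o g
assert all(abs(sum(alpha_g[x] for x in E) - 1.0) < 1e-12 for E in M)
```

The final assertion is the point of the remark above: composing a weight with a symmetry yields another weight.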
Definition 4.2 (Dynamical models) A dynamical probabilistic model is a proba-
bilistic model A with a distinguished dynamical group G(A) ≤ Aut(M(A)) under
which (A) is invariant (Barnum and Wilce 2016).
I use the terms dynamical model and dynamical group because, in situations in
which G(A) is a Lie group, a dynamics for (or consistent with) the model will
be a choice of a continuous one-parameter group g ∈ Hom(R, G(A)), that is, a
mapping t → g_t with g_{t+s} = g_t g_s, which we interpret as tracking the system’s
evolution over time: if α is the state at some initial time, g_t(α) = α ◦ g_{−t} is the state
after the elapse of t units of time. That g is a homomorphism encodes a Markovian
assumption about the dynamics, namely, that a system’s later state depends only on
its initial state and the amount of time that has passed, rather than on the system’s
entire history.
Definition 4.3 (Symmetric and fully symmetric models) A dynamical proba-
bilistic model A is symmetric iff G(A) acts transitively on the set M(A) of tests,
and the stabilizer, G(A)E , of a (hence, any) test E ∈ M acts transitively on E. Note
that this implies that all tests have a common size. We say that A is fully symmetric
iff, additionally, for any test E ∈ M(A) and any permutation σ : E → E, there
exists at least one g ∈ G(A) with gx = σ (x) for every x ∈ E.
An equivalent way to express full symmetry is to say that all tests E, F ∈ M(A)
have the same cardinality, and, for every bijection f : E → F , there exists some
g ∈ G(A) with gx = f (x) for every x ∈ E.
The quantum probabilistic model A(H) is fully symmetric under the unitary
group U (H), since any bijection between two (projective) orthonormal bases for H
extends to a unitary operator.
We can reconstruct a symmetric test space M(A) from any one of its tests, plus
information about its symmetries (Wilce 2009). Suppose we are given, in addition
to the group G(A), a single test E ∈ M(A), an outcome xo ∈ E, and the two
subgroups
We now reformulate these ideas in a way that gives more prominence to the chosen
test E ∈ M(A). In brief: if we hold the test E fixed, but retain control over both
the state space (A) and the dynamical group G(A), we obtain a mathematically
equivalent picture of the model, but one in which the probabilistic structure is, from
a certain point of view, essentially classical.
Consider the following situation: one has some laboratory apparatus, defining
an experiment with outcome-set E. This is somehow coupled to a physical system
having a state-space Ω, governed by a Lie group G. That is, G acts on Ω in such
a way that all possible evolutions of this system are described by an initial state
α_o ∈ Ω and a one-parameter subgroup (g_t)_{t∈R} of G. It is harmless, and will be
convenient, to take Ω to be a right G-space, that is, to denote the image of state
α ∈ Ω under the action of g ∈ G by αg. We may suppose that the way in which the
apparatus is coupled with the system manifests itself probabilistically, by a function

p : Ω × E → [0, 1]
(If not, we can factor out the obvious equivalence relation on E.) I will also suppose
that the experiment E and the group G together separate states, that is, knowing
how the probabilities p(αg, x) vary with g ∈ G, for all outcomes x ∈ E, is enough
to determine the state α uniquely. In other words,

α̂(g)(x) = p(αg, x).

Alternatively, we can treat α̂ as a mapping G × E → [0, 1] with Σx∈E α̂(g, x) = 1,
by writing α̂(g, x) := α̂(g)(x). I will leave it to context to determine which of
these representations is intended. In either case, I will refer to α̂ as the dynamical
state associated with the state α ∈ Ω. The state-separation condition (27.4) tells us
that α → α̂ ∈ Δ(E)^G is injective, so if we wish, we can identify states with the
corresponding dynamical states.
The mapping α̂ ∈ Δ(E)^G is a (discrete) random probability measure on G, while
α̂ : G × E → [0, 1] is a discrete Markov kernel on G × E. Thus, nothing we have
done so far takes us outside the range of classical probability theory. In fact, if G is
compact, there is a natural measure on G × E, namely the product of the normalized
Haar measure on G and the counting measure on E, such that
∫G×E α̂(g, x) d(g, x) = ∫G ( Σx∈E α̂(g, x) ) dg = ∫G 1 dg = 1.
7 I prefer to write this as p(α, x) rather than as p(x|α) in order to make the covariance conditions
p(αg−1, x) = p(αg−1, σ_g σ_g−1 x) = p(αg−1 g, σ_g−1 x) = p(α, σ_g−1 x),

so g−1 ∈ H. ⊓⊔
Thus, E carries a natural H-action. In what follows, I will simplify the notation
by writing hx for σ_h x, where h ∈ H and x ∈ E. It is natural to consider cases in
which E is transitive as an H-set, meaning that for every x, y ∈ E, there exists at
least one h ∈ H with hx = y. This motivates the following definition.
Definition 4.6 (Dynamical classical models) A dynamical classical model
(DCM) consists of groups H ≤ G, a transitive left H-set E, a convex right G-
space Ω, and a Markov kernel p : Ω × E → [0, 1] such that p(α, hx) = p(αh, x)
for all α ∈ Ω, h ∈ H and x ∈ E.

Given a DCM as defined above, for each state α ∈ Ω define α̂ : G → Δ(E) by

α̂(g)(x) = p(αg, x).
K := { g ∈ G | p(αg, x_o) = p(α, x_o) ∀α ∈ Ω }.
Arguing in much the same way as above, it is easy to see that K, too, is a subgroup
of G. The following slightly extends a result from Wilce (2005); see also Wilce
(2009):
Theorem 4.7 Let E be a set acted upon transitively by a group H, let G be any
group with H ≤ G, and let K′ be any subgroup of G with K′ ≤ K and K′ ∩ H =
H_{x_o}. Then there is a well-defined H-equivariant injection φ : E → X := G/K′ given by
φ(h x_o) = hK′ for all h ∈ H. Identifying E with φ(E) ⊆ X,
(a) M := { gφ(E) | g ∈ G } is a fully G-symmetric test space;
(b) for every α ∈ Ω,

α̃(g x_o) := α̂(g, x_o)

is a well-defined probability weight on M; and
(c) the mapping α → α̃ is a G-equivariant affine injection.
The proof is given in Appendix C.
Thus, the essentially classical picture presented above—a single test (or observ-
able) E, interacting probabilistically with a system with a state-space and
symmetry group G by means of the function p : × E → [0, 1]—can be
reinterpreted as a “non-classical” probabilistic model in which E appears as one
of many possible tests, provided that a sufficiently large set of permutations of E
can be implemented physically by the group G. What is more important for our
purposes, however, is the (even more) trivial converse: Any symmetric model A, and
in particular, any (finite-dimensional) quantum model, arises from this construction:
simply take H to be the stabilizer in G(A) of a chosen test E ∈ M(A) and K′
to be the stabilizer in G(A) of some chosen outcome x_o ∈ E, and we recover
(M(A), Ω(A)) from Theorem 4.7.
To summarize: in the representation above, we have reinterpreted the states as, in
effect, random probability measures (or rather, weights) on a fixed test E, indexed
by the dynamical group G(A). Another way to view this is that each state determines
a family of trajectories in the simplex Δ(E) of classical probability weights on E.
Given a state α ∈ Ω plus the system’s actual dynamics, as specified by a choice of
one-parameter group g : R → G(A), we obtain a path α_t := α̂(g_t) in Δ(E). In this
representation, notice,
(a) States are not viewed as probability weights on a non-classical test space, but
rather, specify how classical probabilities change over time, given the dynamics.
(b) In general, the trajectories in Δ(E) arising from states and one-parameter
subgroups of G(A) are not governed by flows; that is, there is no one-parameter
group of affine mappings T_s : Δ(E) → Δ(E) such that α_{t+s} = T_s(α_t). In
particular, the observed evolution of probabilities on E is not Markov. In this
respect, it is the dynamical, rather than the probabilistic, structure of the DCM
that can be regarded as non-classical.
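Point (b) can be seen concretely in the simplest quantum case. In the sketch below (a real-amplitude qubit, with states of my own choosing), two states with identical probabilities on the fixed test E at t = 0 follow different trajectories in Δ(E), so no flow on Δ(E) could reproduce the observed evolution:

```python
import numpy as np

# Sketch of point (b) for a real-amplitude qubit (states mine): the fixed
# test E = {0, 1} is the computational basis, and the dynamics is
# g_t = rotation by angle t.  Two states with identical outcome
# probabilities at t = 0 trace out different paths in Delta(E).

def R(t):
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

def dyn_state(psi, t):
    """alpha_t: outcome probabilities of E in the evolved state R(t) psi."""
    v = R(t) @ psi
    return v ** 2                          # Born rule for real amplitudes

psi1 = np.array([1.0,  1.0]) / np.sqrt(2)
psi2 = np.array([1.0, -1.0]) / np.sqrt(2)

assert np.allclose(dyn_state(psi1, 0.0), dyn_state(psi2, 0.0))   # same start
t = np.pi / 8
assert not np.allclose(dyn_state(psi1, t), dyn_state(psi2, t))   # paths split
print(dyn_state(psi1, t)[0].round(3), dyn_state(psi2, t)[0].round(3))
# 0.146 0.854
```

Since the two trajectories agree at t = 0 but disagree later, the probabilities on E cannot evolve by any state-independent map T_s.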
Thus far, we have been discussing probabilistic models largely in isolation, and
largely in abstraction from physics. Once we start to consider probabilistic models
in relation to one another, and as representing actual physical systems localized in
spacetime, e.g., particles (or laboratories), we encounter a host of new questions
regarding compound systems in which two or more component systems occupy
causally separate regions of spacetime. In particular, we unavoidably encounter the
concepts of entanglement and locality.
In this section, I briefly review how these notions unfold in the context of general
probabilistic theories (following Barnum and Wilce (2016), Foulis and Randall
(1981)), and how the various classical representations discussed above interact with
such composites. As we’ll see, entangled states arise naturally in this context. There
is a sense in which the representations in terms of classical extensions and in terms
of semiclassical covers are both obviously non-local, simply because in each case the
“classical” state space associated with a composite system AB is typically much
larger than the Cartesian product of those associated with A and B separately. Thus,
for instance, if S(A) stands for the set of dispersion-free probability weights on
M(A), then S(AB) is going to be much larger than S(A) × S(B). Similarly, unless
Ω(A) or Ω(B) is a simplex, Ω(AB) will generally allow for entangled pure states,
which essentially just means that Ω(AB)ext will be larger than Ω(A)ext × Ω(B)ext.
A more technical notion of “non-locality” in terms of hidden variables is discussed
below in Sect. 27.5.2, while the more delicate question of whether composites of
DCMs should be regarded as local or not is discussed in Sect. 27.5.4.
Suppose A and B are probabilistic models. A joint probability weight on the test
spaces M(A) and M(B) is a function ω : X(A) × X(B) → [0, 1] that sums to
1 on each product test E × F , where E ∈ M(A) and F ∈ M(B). Certainly if α
and β are probability weights on A and B, we can form a joint probability weight
ω_{2|x}(y) := ω(x, y)/ω_1(x) and ω_{1|y}(x) := ω(x, y)/ω_2(y)
(with, say, the convention that both are zero if their denominators are). The marginal
states can be recovered as convex combinations of these conditional states, in a
version of the law of total probability:
ω_1 = Σ_{y∈F} ω_2(y) ω_{1|y} and ω_2 = Σ_{x∈E} ω_1(x) ω_{2|x}.
We should certainly want these conditional and (hence) marginal states to belong to
the designated state-spaces of each component of a reasonable composite model.
Definition 5.2 Given probabilistic models A and B, let A ⊗ B denote the prob-
abilistic model with test space M(A ⊗ B) := M(A) × M(B) and state-space
Ω(A ⊗ B) consisting of all non-signaling joint states ω on M(A) × M(B) having
conditional (and hence, marginal) states belonging to Ω(A) and Ω(B).
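The conditional and marginal states, and the law of total probability above, can be checked on a small example (the numbers are mine):

```python
import numpy as np

# Sketch (numbers mine): a joint probability weight on a product test E x F
# with |E| = |F| = 2, its marginal and conditional states, and the law of
# total probability quoted above.

omega = np.array([[0.5, 0.0],
                  [0.0, 0.5]])            # omega[x, y]; perfectly correlated

omega1 = omega.sum(axis=1)                # marginal state on A
omega2 = omega.sum(axis=0)                # marginal state on B

# Conditional states omega_{2|x}(y) = omega(x, y) / omega_1(x):
cond2 = np.array([omega[x] / omega1[x] for x in range(2)])

# Law of total probability: omega_2 = sum_x omega_1(x) * omega_{2|x}
assert np.allclose(omega2, sum(omega1[x] * cond2[x] for x in range(2)))
assert abs(float(omega.sum()) - 1.0) < 1e-12
print(omega1, omega2)                     # [0.5 0.5] [0.5 0.5]
```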
This is essentially the maximal (or injective) tensor product of Ω(A) and Ω(B)
(Barnum and Wilce 2016; Namioka and Phelps 1969). Since it does not allow for
8 A more general definition would drop the requirement that φ be outcome-preserving, thus
allowing for the possibility that for some outcomes x and y, φ(x, y) =: xy might be a non-trivial
(and even possibly empty) event of AB. We will not need this generality here.
27.5.2 Entanglement
If AB is a composite of models A and B, then any convex combination of product
states, say Σ_i t_i α_i ⊗ β_i or, more generally,

∫Λ (α_λ ⊗ β_λ) dμ(λ)⁹
9 Where Λ is a measurable space and μ is a probability measure thereon, and where the integral
exists in the obvious sense that

( ∫Λ (α_λ ⊗ β_λ) dμ(λ) )(z) = ∫Λ (α_λ ⊗ β_λ)(z) dμ(λ).
10 As an historical note, both of these points were first noted in this generality, but without any
reference to entanglement, in a pioneering paper of Namioka and Phelps (1969) on tensor products
of compact convex sets. They were rediscovered, and connected with entanglement, by Kläy
(1988).
In the literature, a (contextual) local hidden-variables model for a bipartite probability
weight ω is usually taken to consist of a measurable space Λ, a probability measure
μ thereon, and a pair of response functions pA and pB such that, for
all tests E ∈ M(A), F ∈ M(B) and all outcomes x ∈ E and y ∈ F,
pA(x|E, λ) is the probability of x given the parameter λ and the choice of
measurement E, and similarly for pB(y|F, λ). These are required to satisfy

ω(x, y) = ∫Λ pA(x|E, λ) pB(y|F, λ) dμ(λ)

so that we can interpret the joint probability ω(x, y) as resulting from a classical
correlation (given by μ) together with the local response functions pA and pB.
As discussed in Sect. 27.3.3, it is straightforward to reinterpret the functions pA
and pB in terms of probability weights on the semiclassical covers of M(A)
and M(B), respectively, by
writing pA(x|E, λ) as pA,λ(x, E) and pB(y|F, λ) as pB,λ(y, F) (emphasizing that
λ can be treated as merely an index). We then have
ω(x, y) = ( ∫Λ pA,λ ⊗ pB,λ dμ(λ) )((x, E), (y, F)),

independently of E and F. In other words, ω has a local HV model if, and only if,
it is separable, in the integral sense, when regarded as a joint probability weight on
the composite of the semiclassical covers of A and B (using the composite of Definition 5.2).
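The connection between local HV models and separability can be probed numerically. The sketch below (my own construction) builds a bipartite weight as a mixture of products of deterministic response functions, i.e. a local HV model, and confirms that its CHSH score respects the classical bound of 2:

```python
import itertools, random

# Sketch (construction mine): omega as a mixture of products of
# deterministic response functions (a local HV model), together with a
# check that its CHSH score cannot exceed the classical bound of 2.

random.seed(0)
lams = list(itertools.product([+1, -1], repeat=4))   # lam = (a0, a1, b0, b1)
mu = [random.random() for _ in lams]
s = sum(mu)
mu = [w / s for w in mu]                             # a probability measure

def corr(E, F):
    """Expectation of a_E * b_F under the mixture mu."""
    return sum(w * lam[E] * lam[2 + F] for w, lam in zip(mu, lams))

chsh = corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1)
assert abs(chsh) <= 2.0 + 1e-12   # each deterministic lam scores exactly +-2
print(round(chsh, 3))
```

The bound holds because every deterministic λ contributes exactly ±2 to the CHSH combination, and mixing cannot increase the absolute value.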
Remark A first reaction to this observation might be that it must be wrong, as it’s
well-known that there exist entangled Werner states that nevertheless have local
HV representations (Werner 1989). The subtlety here (such as it is) resides in the
fact that whether a state is separable or entangled depends on what “local” state-
spaces are in play. Here, we are expressing ω as a weighted average of products
of “contextual” states, whereas in standard discussions of quantum entanglement,
a state is separable iff it can be expressed as a weighted average of products of
quantum (in particular, non-contextual) states of the component systems.
Since dynamical classical models are simply another way of looking at symmetric
probabilistic models, in principle they can be composed in the same way as the latter.
I have discussed this in some technical detail elsewhere (Wilce 2009). Without going
into such detail, I want to make a few remarks on the sense in which composites of
DCMs can be regarded as local. Suppose, then, that
are two DCMs as defined in Sect. 27.4.2, and suppose that AB is a DCM serving as
a composite of these. This will require, at a minimum, that E(AB) = E(A) × E(B),
and that we have functions
with the former bi-affine and the latter a group homomorphism, such that
and
(α ⊗ β)(g ⊗ h) = αg ⊗ βh
for all (α, β) ∈ Ω(A) × Ω(B) and all (g, h) ∈ G(A) × G(B). Additional conditions
are necessary to ensure that a non-signaling condition is satisfied, but this is enough
for present purposes.
Examples 5.8
(a) By the minimal classical dynamical model associated with a finite set E, I mean
the DCM A(E) with Ω(A(E)) = Δ(E), G(A(E)) = H(A(E)) = S(E) (the
symmetric group of all bijections E → E) and pA(E)(α, x) = α(x). Now given
two finite sets E and F, the minimal DCM A(E × F) is a composite of A(E) and
A(F), with the maps ⊗ : Δ(E) × Δ(F) → Δ(E × F) and ⊗ : S(E) × S(F) →
S(E × F) the obvious ones, that is,
for all (x, y) ∈ E × F, (α, β) ∈ Δ(E) × Δ(F) and (g, h) ∈ S(E) × S(F).
(b) Let A and B be quantum-mechanical systems associated with Hilbert spaces
HA and HB. Fixing orthonormal bases E ∈ M(HA) and F ∈ M(HB),
we can regard A and B as DCMs with Ω(A) and Ω(B) the spaces of density
operators on HA and HB, respectively, G(A) = U(HA) and G(B) = U(HB),
and with
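Example 5.8(a) can be sketched in a few lines (the encodings are mine): the minimal DCM on a two-outcome set, the product of states and of permutations, and a check of the compatibility condition (α ⊗ β)(g ⊗ h) = αg ⊗ βh.

```python
# Sketch of Example 5.8(a) (encodings mine): the minimal DCM A(E) on a
# two-outcome set, the product maps for A(E x F), and the condition
# (alpha x beta)(g x h) = (alpha g) x (beta h).

E, F = ("x0", "x1"), ("y0", "y1")

def act(alpha, g):
    """Right action of a permutation g (a dict) on a weight alpha."""
    return {x: alpha[g[x]] for x in alpha}

def p(alpha, x):                        # the Markov kernel of a minimal DCM
    return alpha[x]

alpha = {"x0": 0.25, "x1": 0.75}
beta  = {"y0": 0.5,  "y1": 0.5}
ab = {(x, y): alpha[x] * beta[y] for x in E for y in F}   # product state

g = {"x0": "x1", "x1": "x0"}            # a permutation of E
h = {"y0": "y0", "y1": "y1"}            # the identity on F
gh = {(x, y): (g[x], h[y]) for x in E for y in F}         # g x h on E x F

lhs = act(ab, gh)
rhs = {(x, y): act(alpha, g)[x] * act(beta, h)[y] for x in E for y in F}
assert lhs == rhs                       # (alpha x beta)(g x h) = alpha g x beta h
assert abs(sum(ab.values()) - 1.0) < 1e-12
print(p(ab, ("x0", "y0")))              # 0.125
```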
27.6 Conclusion
Many of the observations collected above are well known, at least folklorically. My
aim in bringing them together, in the particular way that I have, has been to draw
attention to some points that I believe follow from these observations collectively,
and that I believe are somewhat less widely appreciated than they might be:
(a) Classicality in the strict (that is, unrestricted) Kolmogorovian sense is man-
ifestly a contingent matter, since it is not a point of logical necessity that
all experiments should be compatible. To the extent that probability theory is
meant to be a completely general and a priori study of reasoning in the face of
uncertainty or chance, it isn’t Kolmogorovian, nor is it quantum-mechanical:
it is, rather, the study of what I’ve here called probabilistic theories generally
(whether formulated as I have done here, or in some roughly similar way).
(b) Within such a framework, what we (ought to) mean by saying that a probabilis-
tic model or probabilistic theory is “classical” is not entirely obvious. For some
values of “classical”, every probabilistic theory is, or can be represented as, a
classical one, at the cost of contextuality and, where we are dealing with a non-
signaling theory, also of locality. Moreover, for certain kinds of theories (fully
symmetric ones), there is a broad sense of “classical” in which such a theory
simply is classical, without any need to invoke hidden variables, contextual,
non-local or otherwise. Or at least, this is true for the theory’s probabilistic
structure: any departure from “classicality” is dynamical. Finite-dimensional
quantum mechanics is such a theory.
I have no issue to take with this, as far as it goes: we are always free to reject
a classical representation of a probabilistic theory by objecting to the additional
“classical” structure as “non-physical”. But I would also want to say that doing
so makes the theory a physical theory, and not a theory of probability. Or, to put
it differently, whether one wants to take, for example, a set of contextual hidden
variables, or a privileged observable, seriously in formulating a given physical
theory is a pragmatic question about the best (the most elegant, the most convenient,
the most fruitful) way to formulate mathematical models of a particular sort of
physical situation. It is also (in particular!) a metaphysical question about what
ontology one wants to admit in framing such a model. Such decisions are important;
but they are not decisions about what kind of probability theory to use.
Pitowsky also says (emphasis mine):
In this paper, all we have discussed is the Hilbert space formalism. I have argued that it is
a new kind of probability theory that is quite devoid of physical content, save perhaps the
indeterminacy principle which is built into axiom H4. Within this formal context there is no
explication of what a measurement is, only the identification of “observables” as Hermitian
operators. In this respect the Hilbert space formalism is really just a syntax which represents
the set of all possible outcomes, of all possible measurements.
The final lines of Pitowsky’s paper crystallize what I find problematic in his
proposal. (Again, the emphasis is mine.)
However, there is a structure to the set of events. Not only does each and every type of
measurement yield a systematic outcome; but also the set of all possible outcomes of all
measurements — including those that have been realized by an actual recording — hang
together tightly in the structure of L(H). This is the quantum mechanical structure of
reality.
φ ∗ (α) := α ◦ φ −1 .
i.e., the inverse limit of M, regarded as a small category under coarsening maps.
As long as M is locally finite, one can show that this is non-empty (a consequence
[x_0] := { x ∈ S | ∀E ∈ M(A), x_0 ∈ E ⇒ x_E = x_0 }.
Thus, every cell in the partition [[F]] is a union of cells of [[E]], i.e., [[E]] refines [[F]]. Since
every pair of tests in M(A) have a common refinement with respect to Ω(A), it
follows that [[M(A)]] is a refinement ideal of partitions. It is clear that φ : x →
[x] is an outcome-preserving, positive interpretation from M(A) onto [[M(A)]].
It remains to show it is injective. Let x, y ∈ X(A) with [x] = [y]. Supposing that
x ∈ E ∈ M(A) and y ∈ F ∈ M(A), let G be a common refinement of E and F.
Then f_{G,E}−1(x) = f_{G,F}−1(y), so for all α ∈ Ω(A), we have

α(x) = α(f_{G,E}−1(x)) = α(f_{G,F}−1(y)) = α(y).
Σ := { â | a ∈ Ev(M) }.
27 Dynamical States and the Conventionality of (Non-) Classicality 621
α̂(â) = α(a).

Lemma A.6 α̂ is well-defined, and a finitely-additive probability measure on Σ.
Proof Let a ⊆ E ∈ M and b ⊆ F ∈ M. We want to show that if â = b̂, then α(a) = α(b). Let G refine both E and F. There are canonical surjections e : G → E and f : G → F, namely e(z) = x where x is the unique cell of E containing z ∈ G, and similarly for f. Let a₁ = e⁻¹(a) and b₁ = f⁻¹(b). Then â₁ = â = b̂ = b̂₁. Since the map a ↦ â is injective on P(G), a₁ = b₁. Since A is a refinement ideal with respect to Ω(A), α(a) = α(a₁) = α(b₁) = α(b).

This shows that α̂ is well defined. To see that it is additive, let a ⊆ E ∈ M(A) and b ⊆ F ∈ M(A), with â ∩ b̂ = ∅. Choosing G a common refinement of E and F, we have a₁, b₁ ⊆ G with â₁ = â and b̂₁ = b̂, so â₁ ∩ b̂₁ = ∅, whence a₁ ∩ b₁ = ∅. It follows that

α̂(â ∪ b̂) = α̂(â₁ ∪ b̂₁) = α(a₁ ∪ b₁) = α(a₁) + α(b₁) = α̂(â₁) + α̂(b̂₁) = α̂(â) + α̂(b̂). □
Proposition A.7 If A is a locally finite refinement ideal and Ω(A) separates compatible events, then there is a measurable space (S, Σ) and an embedding M(A) → D(S, Σ) such that every probability weight in Ω(A) extends to a finitely-additive probability measure on (S, Σ). Conversely, if A is unital and locally finite and admits such an embedding, then A embeds in a refinement ideal without loss of states.

Proof The forward implication follows from Lemmas A.4, A.5 and A.6, while the reverse implication is more or less obvious. □
To this extent, then, the existence of common refinements is the key classical
postulate.
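The role of the common-refinement step can be made concrete in miniature. The sketch below is my own illustration (the six-point set S, the helper names, and the specific partitions are assumptions, not the author's): it builds the common refinement of two partitions and the canonical surjection used in the proof of Lemma A.6, and checks that an event and its pullback cover the same points.

```python
# Illustrative sketch only: the common-refinement step of Lemma A.6.
# Cells of the common refinement G of partitions E and F are the nonempty
# intersections of an E-cell with an F-cell; the canonical surjection
# e : G -> E sends a cell of G to the unique E-cell containing it, and
# events pull back along e.

from itertools import product

def common_refinement(E, F):
    """Partitions as lists of frozensets; return the common refinement."""
    return [a & b for a, b in product(E, F) if a & b]

def surjection(G, E):
    """e(z) = the unique cell of E containing the cell z of G."""
    return {z: next(x for x in E if z <= x) for z in G}

S = frozenset(range(6))
E = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]                # one test
F = [frozenset({0, 3}), frozenset({1, 4}), frozenset({2, 5})]   # another

G = common_refinement(E, F)
e = surjection(G, E)

# An event a of E pulls back to a1 = e^{-1}(a), a union of G-cells covering
# exactly the same points of S; this is why a measure defined via a
# refinement agrees with the original one.
a = {E[0]}
a1 = {z for z in G if e[z] in a}
assert frozenset().union(*a) == frozenset().union(*a1)
print(len(G))  # the refinement here has 6 singleton cells
```

In this toy case the common refinement is the discrete partition, so every event of E or F is recoverable from it, mirroring the embedding of Proposition A.7.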
Recall that M is semiclassical iff distinct tests in M are disjoint. Evidently, such a test space has a wealth of dispersion-free states, as we can simply choose an element x_E from each test E ∈ M and set δ(x) = 1 if x = x_E for the unique test containing x, and 0 otherwise. In fact, the dispersion-free states are exactly the pure states:
Lemma B.1 Let (K_i)_{i∈I} be an indexed family of convex sets, and let α = (α_i) ∈ K = ∏_{i∈I} K_i. Then α is pure iff each α_i is pure.
Proof Suppose that for some j ∈ I, α_j is not pure. Then there exist distinct points β_j, γ_j ∈ K_j with α_j = tβ_j + (1 − t)γ_j for some t ∈ (0, 1). Define β̃, γ̃ ∈ K by setting

β̃_i = α_i and γ̃_i = α_i for i ≠ j, while β̃_j = β_j and γ̃_j = γ_j.

Then tβ̃ + (1 − t)γ̃ = α, so α is not pure either. The converse is clear. □
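The patching step in the proof of Lemma B.1 can be checked numerically. The sketch below is mine (the specific numbers and the two-factor product are assumptions chosen for illustration): it takes K to be a product of two 1-simplices and verifies that patching the ingredients of a mixed coordinate into α exhibits α itself as a proper mixture.

```python
# A numeric illustration of the patching step in Lemma B.1's proof.
# K = K_1 x K_2, each factor the 1-simplex of distributions on two
# outcomes. The second coordinate of alpha is a proper mixture; patching
# its two ingredients into alpha shows alpha is not pure.

t = 0.25
alpha = ((1.0, 0.0), (0.5, 0.5))   # alpha_2 = (0.5, 0.5) is not pure
beta_j = (1.0, 0.0)                # distinct points of K_2 with
gamma_j = (1/3, 2/3)               # t*beta_j + (1 - t)*gamma_j = alpha_2

# beta~ and gamma~ agree with alpha except in the mixed (second) slot.
beta = (alpha[0], beta_j)
gamma = (alpha[0], gamma_j)

mix = tuple(
    tuple(t * b + (1 - t) * g for b, g in zip(bi, gi))
    for bi, gi in zip(beta, gamma)
)

# t*beta~ + (1 - t)*gamma~ = alpha, so alpha is not pure.
assert all(
    abs(m - a) < 1e-12
    for mi, ai in zip(mix, alpha)
    for m, a in zip(mi, ai)
)
assert beta != gamma
```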
Corollary B.2 If M is semiclassical, then Pr(M)_ext = Pr(M)_df.

Proof Since M is semiclassical, Pr(M) ≅ ∏_{E∈M} Δ(E). □
Let S(M) denote the set of dispersion-free states on M. It is easy to see that S(M) is closed, and hence compact, as a subset of [0, 1]^X. By Δ(S), I mean the simplex of Borel probability measures on S with respect to this topology.
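The recipe above, picking one outcome per test of a semiclassical test space, can be run in full for a finite example. The sketch below is my own (the two-test space and the names are assumptions, not from the text): it enumerates all such choice functions and checks that each yields a dispersion-free probability weight.

```python
# A finite instance of the dispersion-free-state recipe for a
# semiclassical test space M (distinct tests disjoint): each choice of
# one outcome x_E per test E yields a dispersion-free probability weight.

from itertools import product

M = [("a1", "a2"), ("b1", "b2", "b3")]   # two disjoint tests
X = [x for E in M for x in E]            # outcome set

def dispersion_free_states(M):
    """delta(x) = 1 iff x is the chosen outcome of the test containing x."""
    for choice in product(*M):
        yield {x: 1.0 if x in choice else 0.0 for x in X}

states = list(dispersion_free_states(M))

# Each such delta is a probability weight: it sums to 1 on every test.
for d in states:
    for E in M:
        assert sum(d[x] for x in E) == 1.0

print(len(states))  # 2 * 3 = 6 choice functions, hence 6 such states
```

The count 6 = 2 × 3 reflects the product decomposition Pr(M) ≅ ∏_E Δ(E) of Corollary B.2: the dispersion-free states are exactly the tuples of extreme points.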
This appendix collects more technical material on the construction of fully symmet-
ric models. In particular, we prove Theorem 4.7, which for convenience we restate
below as Proposition C.2.
As discussed in Sect. 27.4.2, we are given a set E (which we wish to regard as
the outcome-set of an experiment), and a group H acting transitively on E. We are
also given a set Ω of physical states (those of the system to which the experiment pertains), and a function

p : Ω × E → [0, 1]

for all h ∈ H. For each α ∈ Ω, we write α̂ both for the mapping G → Δ(E) given by α̂(g)(x) = p(αg, x), and for the mapping G × E → [0, 1] given by α̂(g, x) = α̂(g)(x), whichever is more convenient. Thus, (αg₁)ˆ(g₂) = α̂(g₁g₂) for all g₁, g₂ ∈ G, and α̂(gh)(x) = α̂(g)(hx) for all g ∈ G, h ∈ H and x ∈ E. We assume that the mapping α ↦ α̂ is injective, so that α is determined by the function α̂.
Given this data, choose and fix an outcome x_o ∈ E, and let K be defined as in Equation (27.6); equivalently, k ∈ K iff α̂(gk)(x_o) = α̂(g)(x_o) for all g ∈ G and α ∈ Ω. Notice that the stabilizer H_o of x_o in H is a subset of K; in particular, K is nonempty.
Lemma C.1 K ≤ G, and K ∩ H = H_o.

Proof Let k, k′ ∈ K. Then for all g ∈ G and α ∈ Ω, we have α̂(gkk′, x_o) = α̂(gk, x_o) = α̂(g, x_o), so kk′ ∈ K; also

α̂(gk⁻¹, x_o) = α̂((gk⁻¹)k, x_o) = α̂(g, x_o),

so k⁻¹ ∈ K, whence K ≤ G. Finally, for h ∈ K ∩ H and all g ∈ G, α ∈ Ω,

α̂(g, x_o) = α̂(gh, x_o) = α̂(g, hx_o),

whence K ∩ H = H_o. □
φ : hx_o ↦ x_h.

α̂(x_{gk}) = α̂(gk, x_o) = α̂(g, x_o) = α̂(x_g).
To see that α̂ is a probability weight on M, for each x ∈ E choose h_x ∈ H with h_x x_o = x (recalling here that H acts transitively on E). Then

α̂(gφ(x)) = α̂(x_{gh_x}) = α̂(gh_x, x_o) = α̂(g, h_x x_o) = α̂(g, x).

Thus,

∑_{x∈E} α̂(gφ(x)) = ∑_{x∈E} α̂(g, x) = 1.

Finally,

(αg)ˆ(x_l) = (αg)ˆ(l, x_o) = α̂(gl, x_o) = α̂(x_{gl}) = (α̂g)(x_l),

so the map α ↦ α̂ is G-equivariant. □
Thus, (M, Ω̂), where Ω̂ = { α̂ | α ∈ Ω }, is a G-symmetric model with state space isomorphic to Ω.
Remarks
(1) If H_o ≤ K′ ≤ K″ ≤ K, we obtain a G-equivariant, surjective outcome-preserving interpretation

φ_{K′,K″} : M_{K′} → M_{K″}
Given the H -set E, we can obtain the initial data for this construction as follows.
Let G be any group containing H as a subgroup, and let
G ×H E = (G × E)/ ∼
where ∼ is the equivalence relation defined by (g, x) ∼ (h, y) iff there exists some
s ∈ H with h = gs and sy = x. Letting [g, x] denote the equivalence class of
(g, x), we have
g[h, x] = [gh, x]
Now if [h, x] ∈ [g, E], we have [h, x] = [g, y] for some y ∈ E, whence there is some s ∈ H with h = gs and sy = x. Then for all z ∈ E, say z = s′x, we have [h, z] = [gs, s′x] = [g, ss′x] ∈ [g, E]. That is, [h, E] ⊆ [g, E]. By the same token, [g, E] ⊆ [h, E]. In other words, the sets [g, E] partition G ×_H E. With this observation, it is easy to prove the following
Lemma C.3 With notation as in Proposition C.2, X_{H_o} ≅ G ×_{H_o} E, and M_{H_o} ≅ {[g, E] | g ∈ G}, a semiclassical test space, independent of Ω.
Proof Let φ : G × E → G/H_o = X_{H_o} be given by

φ(g, sx_o) = gsH_o

for all g ∈ G, s ∈ H. To see that this is well-defined, note that if s₁x_o = s₂x_o, then s₂⁻¹s₁ =: s ∈ H_o, and gs₁H_o = gs₂sH_o = gs₂H_o. Next, observe that φ(g₁, s₁x_o) = φ(g₂, s₂x_o) iff g₁s₁H_o = g₂s₂H_o iff g₂s₂ = g₁s₁s for some s ∈ H_o, whence [g₂, s₂x_o] = [g₂s₂, x_o] = [g₁s₁s, x_o] = [g₁, s₁x_o]. Thus, passing to the quotient set G ×_{H_o} E gives us a bijection. Since {[g, E] | g ∈ G} is a partition of G ×_{H_o} E, we have a semiclassical test space, as advertised. □
Notice that the stabilizer of [e, E] in G is exactly H_o. Now choosing any G-invariant set Ω of probability weights on M_{H_o} (say, the orbit of any given probability weight), we have a natural G-equivariant injection Ω → Δ(E)^G given by α ↦ α̂, where

α̂(g)(x) = α[g, x]
for all α ∈ Ω. The construction now proceeds as above for any choice of subgroup K′ of G with H_o ≤ K′ ≤ K, where K is determined by Ω as in Equation (27.6). We can regard K′ as parameterizing the possible fully G-symmetric models containing E as a test, and having dynamical group G. If K′ = H_o, the stabilizer of x_o in H, then M is semiclassical. Larger choices of K′ further constrain the structure of M, enforcing outcome-identifications between tests that, in turn, constrain any further enlargement of the state space that we might contemplate.
References
Mueller, M., Barnum, H., & Ududec, C. (2014). Higher-order interference and single-system
postulates characterizing quantum theory. New Journal of Physics, 16. arXiv:1403.4147.
Namioka, I., & Phelps, R. (1969). Tensor products of compact convex sets. Pacific Journal of
Mathematics, 31, 469–480.
Pitowsky, I. (2005). Quantum mechanics as a theory of probability. In W. Demopoulos (Ed.), Physical theory and its interpretation: Essays in honor of Jeffrey Bub (Western Ontario series in philosophy of science). Kluwer. arXiv:quant-ph/0510095.
Popescu, S., & Rohrlich, D. (1994). Quantum nonlocality as an axiom. Foundations of Physics, 24, 379–385.
Spekkens, R. (2005). Contextuality for preparations, transformations and unsharp measurements.
Physical Review A, 71. arXiv:quant-ph/0406166.
Varadarajan, V. S. (1985). The geometry of quantum theory (2nd ed.). New York: Springer.
Werner, R. (1989). Quantum states with Einstein-Podolsky-Rosen correlations admitting a hidden-variable model. Physical Review A, 40, 4277.
Wilce, A. (2005). Symmetry and topology in quantum logic. International Journal of Theoretical
Physics, 44, 2303–2316.
Wilce, A. (2009). Symmetry and composition in generalized probabilistic theories.
arXiv:0910.1527.
Wilce, A. (2012). Conjugates, filters and quantum mechanics. arXiv:1206.2897.
Wilce, A. (2017). Quantum logic and probability theory. In Stanford encyclopedia of philosophy.
https://plato.stanford.edu/archives/spr2017/entries/qt-quantlog/.